Machine Learning on the Java Platform

Machine learning on the Java Platform David Ström [email protected] #DevSum19 Agenda • The incredible power of machine learning • Common machine learning challenges • How Java faces up to those challenges • How to implement a machine learning system in Java • Questions The incredible power of machine learning What problems can ML solve? • Some things are almost impossible to solve without ML, e.g. • Audio and image recognition • While other things can get (a lot) better • Interpreting human language • Recommendations • Predictions • Anomaly detection • Patterns are everywhere • What can your data tell you? Is this really for me? Data Softare scientist engineer Is this really for me? Data Softare scientist engineer Is this really for me? Data You? Softare scientist engineer Common machine learning challenges Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #1: Complexity Backpropagation, naive Bayes, convolutional neural networks, decision trees, autoencoders, generative adversial networks, linear regression, eigenvectors, activation function, hyperparameters, classification, LSTM, random forest, support vector machines, restricted Boltzmann machine, interpolation, K- nearest, logistic regression, gradient descent, transpose, sequence padding, vectorization, polynomials, inference, hyperspace, n-dimensional arrays, learning rate, loss function... Challenge #2: Data, data, data Challenge #3: Model selection Challenge #4: Bias vs Variance VS. VARIANCE Challenge #5: Vectorization How does Java face up to those challenges? Why Java? • Date.from(Instant.now()): {Python} > {Java} < {R} • Java is versatile with huge ecosystem of tools • My thesis: As ML moves toward more and more practical implementation instead of research, a system development approach is needed Machine learning in Java • Deeplearning4J • Weka (incl. WekaDeeplearning4J) • Apache Mahout • JavaML • Etc. Deeplearning4J • Open source • Includes various tools for ML • ND4J • DataVec • Arbiter • Some visualisation tools • Import of Keras models • Supports dataprocessing on CUDA* enabled GPUs (Nvidia) *CUDA: Compute Unified Device Architecture Facing challenge 1: Complexity • Ease-of-use • Tutorials and documentation* • Pre-packaged algorithms (and even ML-models) • But no, a framework doesn't solve the whole complexity *Disclaimer: Also be aware that in order to be successful in implementing machine learning algorithms you probably need more understanding than what a simple tutorial or even a wiki can provide. problem I strongly suggest spending some time studying the subject before running head-long into implementing intricate machine learning models. Facing challenge 2: Data FileSplit fileSplit = new FileSplit(directory, {".png"}); ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator(); ImageRecordReader recordReader = new ImageRecordReader(28,28,1, labelMaker); recordReader.initialize(fileSplit); Facing challenge 2: Data FileSplit fileSplit = new FileSplit(directory, {".png"}); ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator(); ImageRecordReader recordReader = new ImageRecordReader(28,28,1, labelMaker); recordReader.initialize(fileSplit); Facing challenge 2: Data FileSplit fileSplit = new FileSplit(directory, {".png"}); ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator(); ImageRecordReader recordReader = new ImageRecordReader(28,28,1, labelMaker); recordReader.initialize(fileSplit); Facing challenge 3: Model selection • Three common types: • MLP (Multilayer perceptron) • CNN (Convolutional neural network) • RNN (Recurrent neural network) • Hybrid networks: use layers or subnets of different types Facing challenge 3: Model selection (cont.) • Multilayer Perceptron • General purpose type of network • Very useful for data in tabular format (csv, sql etc.) • Implemented in Deeplearning4j by adding DenseLayer objects to a MultiLayerConfiguration new NeuralNetConfiguration.Builder() // configuration... .layer(0, new DenseLayer.Builder().nIn(100).nOut(50) .activation(RELU).build()) Facing challenge 3: Model selection (cont.) • Multilayer Perceptron • General purpose type of network • Very useful for data in tabular format (csv, sql etc.) • Implemented in Deeplearning4j by adding DenseLayer objects to a MultiLayerConfiguration new NeuralNetConfiguration.Builder() // configuration... .layer(0, new DenseLayer.Builder().nIn(100).nOut(50) .activation(RELU).build()) Facing challenge 3: Model selection (cont.) • Convolutional neural network • Useful to make generalisations of the input (has/has not) • Particularly useful for identifying patterns in images • Not good for anomaly detection IMAGES Facing challenge 3: Model selection (cont.) • Recurrent neural network • Complex, can be difficult to train • Useful for time-series such as sound, text or video • LSTM (Long-short term memory) successful exception • Implemented in Deeplearning4j by adding e.g. LSTM layers to MultilayerConfiguration new NeuralNetConfiguration.Builder() // configuration... .layer(0, new LSTM.Builder().nIn(50).nOut(100) .activation(SOFTSIGN).updater(ADAGRAD).build()) SOUND Facing challenge 3: Model selection (cont.) • Recurrent neural network • Complex, can be difficult to train • Useful for time-series such as sound, text or video • LSTM (Long-short term memory) successful exception • Implemented in Deeplearning4j by adding e.g. LSTM layers to MultilayerConfiguration new NeuralNetConfiguration.Builder() // configuration... .layer(0, new LSTM.Builder().nIn(50).nOut(100) .activation(SOFTSIGN).updater(ADAGRAD).build())

Machine Learning on the Java Platform

Comparative Study of Deep Learning Software Frameworks

Reinforcement Learning in Videogames

Distributed Negative Sampling for Word Embeddings Stergios Stergiou Zygimantas Straznickas Yahoo Research MIT [email protected] [email protected]

Comparative Study of Caffe, Neon, Theano, and Torch

BUSEM at Semeval-2017 Task 4A Sentiment Analysis with Word

Using Long Short-Term Memory Recurrent Neural Network in Land Cover Classification on Landsat and Cropland Data Layer Time Series

Learning Options in Deep Reinforcement Learning Jean

Deep Learning: Concepts and Implementation Tools

An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures

Using Deep Learning for Optimized Personalized Card Linked Offer

Long Short-Term Memory Recurrent Neural Network Architectures for Generating Music and Japanese Lyrics

Chained Anomaly Detection Models for Federated Learning: an Intrusion Detection Case Study