Deep Learning Meena Arunachalam, Ph.D Principal Engineer, Intel Corp

An Intro to Machine Learning & Deep Learning

Agenda
• AI is transformative across all domains
• Machine Learning: Usages, Spark MLlib, MKL
• Deep Learning: Usages, Topologies, Convolution, MKL
• Software stacks, frameworks and segments
• TensorFlow and Neon: How to get started

History of Machine Learning (ML)
• <1950: Statistical Methods
  - Bayes's Theorem
  - Markov chains
• 1950-1960: Pioneering Machine Learning
  - Suggested simple machine learning algorithms
  - Perceptron, nearest neighbor
• 1970: "AI winter"
  - Pessimism about ML
  - ML algorithms were not as effective as expected
• 1980: Rediscovery of Machine Learning
  - David Rumelhart, Geoff Hinton and Ronald J. Williams discovered the "Backpropagation" algorithm
  - Causes rethinking of ML
• 1990: Data-Driven Approach
  - Analyze large amounts of data and draw conclusions from the data
  - Support Vector Machine (SVM), Recurrent Neural Networks (RNN)
• 2000: Machine Learning becomes Feasible
  - Inspired by deep learning and with the help of powerful processors, ML becomes a popular method
• 2010: Wide Spread
  - ML is widely used
  - Google, Apple, Amazon, …

Artificial intelligence

AI is transformative
• Consumer: Smart Assistants, Chatbots, Search, Personalization, Augmented Reality, Robots
• Health: Enhanced Diagnostics, Drug Discovery, Patient Care, Research, Sensory Aids
• Finance: Algorithmic Trading, Fraud Detection, Research, Personal Finance, Risk Mitigation
• Retail: Support, Experience, Marketing, Merchandising, Loyalty, Supply Chain, Security
• Government: Defense, Data Insights, Safety & Security, Resident Engagement, Smarter Cities
• Energy: Oil & Gas Exploration, Smart Grid, Operational Improvement, Conservation
• Transport: Autonomous Cars, Automated Trucking, Aerospace, Shipping, Search & Rescue
• Industrial: Factory Automation, Predictive Maintenance, Precision Agriculture, Field Automation
• Other: Advertising, Education, Gaming, Professional & IT Services, Telco/Media, Sports
Source: Intel forecast

The next big wave
• Data deluge, compute breakthrough, innovation surge
• Mainframes, then standards-based servers, then cloud computing, then artificial intelligence
• AI compute cycles will grow 12X by 2020
Source: Intel forecast

Inside AI
• Experiences
• Platforms: Intel® Nervana™ Cloud & Appliance, Intel® Nervana™ DL Studio, Intel® Computer Vision SDK, Movidius (VPU)
• Frameworks: MLlib, BigDL
• Libraries: Intel® Data Analytics Acceleration Library (DAAL), Intel® Nervana™ Graph*, Intel Python Distribution, Intel® Math Kernel Library (MKL, MKL-DNN)
• Hardware: Compute, Memory & Storage, Networking
*Future
Other names and brands may be claimed as the property of others.

End-to-end AI compute
• Cloud/appliance: many-to-many hyperscale for stream and massive batch data processing
• Gateway: 1-to-many with majority streaming data from devices
• Edge: 1-to-1 devices with lower power and often UX requirements
• Links: Ethernet & wireless; wireless and non-IP wired protocols (secure, high throughput, real-time)
• Hardware: Intel® Xeon® Processors, Intel® Xeon Phi™ Processors*, Intel® Core™ & Atom™ Processors
• CPU+: Intel® Processor Graphics, Intel® FPGA, Intel® Nervana™ Neural Network Processor (NNP)✝
• Vision: Movidius VPU
• Speech: Intel® GNA (IP)*
DC = datacenter, WS = workstation
*Formerly codenamed as the Crest Family
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Modern AI: Theorists vs Practitioners
• Modern AI research has largely split into 2 camps
  - Theorists: Work on mathematical and statistical problems related to algorithms that learn (Classical ML)
    - E.g. is an email message spam or not spam?
    - Could use a Bayesian algorithm to build a model from a large dataset
    - Use the model to filter spam
  - Practitioners: Apply ML to various real-world problems, guided more by experimentation than by mathematical theory, while still using algorithms in their work (DL)
    - E.g. is an email message spam or not spam?
    - One could "tag" what is spam and what is not spam
    - Could process millions of emails that are spam and billions that are not
    - Teach a machine to filter spam

Classic ML vs Deep Learning
• Classic ML: Using functions or algorithms to extract insights from new data
  - Functions f1, f2, …, fK applied to training data* and new data* for inference or classification
  - Random Forest
  - Naïve Bayes
  - Decision Trees
  - Graph Analytics
  - Regression
  - Ensemble methods
  - Support Vector Machines (SVM)
  - More…
• Deep Learning: Using massive data sets to train deep (neural) graphs that can extract insights from new data
  - Networks: CNN, RNN, RBM, etc. (untrained, then trained)
  - Step 1: Training (hours to days, in cloud). Use a massive "known" dataset (e.g. 10M tagged images) to iteratively adjust the weighting of neural network connections
  - Step 2: Inference (real-time, at edge/cloud). Form an inference about new input data (e.g. a photo) using the trained neural network
*Not all classical machine learning algorithms require separate training and new data sets

End-To-End Analytics Flow
• The data analysis process and analytics building blocks are the same despite the variety of data formats, usages and domains
• Pipeline: Data Source, Extract/Transform/Load (ETL), ML Compute / DL Compute, Scoring/Prediction
• Stages:
  - Data source: Business, Web/Social, Scientific/Engineering
  - Pre-processing: Decompression, Filtering, Normalization
  - Transformation: Aggregation, Dimension Reduction
  - Analysis: Summary Statistics, Clustering, etc.
  - Modeling: Machine Learning (Training), Parameter Estimation, Simulation
  - Validation: Hypothesis testing, Model errors
  - Decision Making: Inference

Types of Learning
• Supervised (inductive) learning
  - Training data includes desired outputs
  - Prediction / Classification (discrete labels), Regression (real values)
• Unsupervised learning
  - Training data does not include desired outputs
  - Clustering / probability distribution estimation
  - Finding association (in features)
  - Dimension reduction
• Semi-supervised learning
  - Training data includes a few desired outputs
• Reinforcement learning
  - Rewards from sequence of actions
  - Decision making (robot, chess machine)

Unsupervised, "weakly" supervised, fully supervised: the definition depends on the task.
Slide credit: L. Lazebnik

Visualizing Types of Learning
[Figure: three panels labeled Supervised learning, Unsupervised learning and Semi-supervised learning]

First, let there be data!
• Universal set (unobserved)
• Training set (observed), obtained through data acquisition
• Testing set (unobserved), standing in for practical usage

Training and testing
• Training = the process of making the system able to learn
• Testing = the process of seeing how well the system learned
  - Simulates "real world" usage
  - Training set and testing set come from the same distribution
  - Need to make some assumptions or bias
• Deployment = actually using the learned system in practice

What is Deep Learning?
• Artificial Intelligence: "I propose to consider the question, 'Can machines think?'" (Alan Turing, 1950)
• Machine Learning: Study to build an algorithm that can "learn" and "draw" a prediction without specific direction
• Brain Inspired: Create a program that mimics the work of the human brain

Brain Inspired
• Human brain
  - Neuron: basic computational unit of the brain
    - 86 billion neurons in the brain
  - Neurons are connected by synapses
    - There are roughly 10^14 to 10^15 synapses
  - Neurons receive input signals through dendrites and send output signals along the axon to other neurons via synapses

What is Deep Learning?
• Artificial Intelligence: "I propose to consider the question, 'Can machines think?'" (Alan Turing, 1950)
• Machine Learning: Study to build an algorithm that can "learn" and "draw" a prediction without specific direction
• Brain Inspired: Create a program that mimics the work of the human brain
• Neural Network: The network that mimics the work of the human brain

Machine learning
• Classic ML: N × N image, features (f1, f2, …, fK), then a classifier, producing the label "Arjun"
  - Classifiers: SVM, Random Forest, Naïve Bayes, Decision Trees, Logistic Regression, Ensemble Methods
• Deep learning: N × N image, a network with ~60 million parameters, producing the label "Arjun"

Deep learning
• Features are discovered from data
• Extract features at multiple levels of abstraction
• Performance improves with more data
• High degree of representational power

Deep learning breakthroughs
[Figure: two error-rate charts with the error axis running from 0% to 30% and human-level error marked. Image recognition, 2010 to present, with the example label "person" at 97%. Speech recognition, 2000 to present, with the example "play song" at 99%.]

Deep learning in practice
• Healthcare: Tumor detection (normal vs tumor)
• Industry: Agricultural Robotics (plant vs weed)
• Energy: Oil & Gas
• Automotive: Speech interfaces
• Finance: Document Classification
• Genomics: Sequence analysis

Biological Neuron vs Artificial Neuron
[Figure labels: Dendrite, Cell Body, Axon]

Biology vs Artificial Neural Networks

Neural Networks: Weighted Sum
Image Source: Stanford

Topologies: Network Depth is of importance
Ordered from faster and less accurate to more accurate:
• AlexNet, 8 layers (2012), U Toronto
• VGG, 19 layers (2014), Visual Geom. Group (VGG)
• GoogleNet, 22 layers (2014), Google
• ResNet, 152 layers (2015), Microsoft (the figure shows the first 35 layers of 152; ResNet-50 is also labeled)

Deep Learning Networks
• Deep Neural Networks (DNN)
  - Billions of parameters
• Convolutional Neural Networks (CNN)
  - Millions of parameters
• Recurrent Neural Networks (RNN)
  - Used for time-series, e.g. speech etc.

What is Deep Learning?
[Figure: an input image is mapped to the label "Volvo XC90"]
Image Source: [Lee et al., Comm. ACM 2011]

Deep Neural Networks Example: Image Recognition
• How the features are extracted in a DNN
[Figure labels: Forward-Pass, Backward-Pass, Filters, Fully Connected Layers]

Deep Learning in Action: Image Recognition
• AlexNet (Krizhevsky et al., 2012)
  - 5 convolution layers, 3 fully connected layers + Soft-Max, 650K neurons, 60 million weights
[Figure labels: Fully Connected Layers, Forward-Pass, Forward-Iter i+1, Backward-Iter i]
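
The "60 million weights" quoted for AlexNet can be checked with a little arithmetic. The sketch below is only an illustration. The layer shapes are an assumption taken from the published AlexNet architecture (including its two-group convolutions) and are not stated on the slides, and the helper names `conv_params` and `fc_params` are made up for this example.

```python
# Rough parameter count for AlexNet, layer by layer.
# ASSUMPTION: layer shapes follow the published AlexNet design
# (Krizhevsky et al., 2012) with two-group convolutions in conv2,
# conv4 and conv5; the slides only give the ~60 million total.


def conv_params(out_channels, kernel, in_channels, groups=1):
    """Weights plus biases of a (possibly grouped) convolution layer."""
    weights = out_channels * kernel * kernel * (in_channels // groups)
    return weights + out_channels


def fc_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


layers = {
    "conv1": conv_params(96, 11, 3),
    "conv2": conv_params(256, 5, 96, groups=2),
    "conv3": conv_params(384, 3, 256),
    "conv4": conv_params(384, 3, 384, groups=2),
    "conv5": conv_params(256, 3, 384, groups=2),
    "fc6": fc_params(6 * 6 * 256, 4096),
    "fc7": fc_params(4096, 4096),
    "fc8": fc_params(4096, 1000),
}

total = sum(layers.values())
fc_total = sum(n for name, n in layers.items() if name.startswith("fc"))

for name, n in layers.items():
    print(f"{name:>5}: {n:>12,}")
print(f"total: {total:>12,}")
print(f"share in fully connected layers: {fc_total / total:.1%}")
```

Under these assumptions the total comes to about 61 million, in line with the figure on the slide, and nearly all of it sits in the three fully connected layers rather than in the convolutions.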