Machine Learning on TensorFlow
Mar 2018

Outline
● Machine learning introduction
● What is TensorFlow?
● TensorFlow in China
● TensorFlow in Google
● TensorFlow Basics
  ○ Distributed TensorFlow
● New Features
  ○ Eager execution
  ○ TF Lite
  ○ XLA
  ○ Performance
  ○ Dataset (tf.data)
  ○ Cloud TPU

[Figure: example applications: NPC control; camera effects (input, saturation, defocus)]

[Figure: style transfer, content image + style image = stylized image. Image source: Wikimedia. A Neural Algorithm of Artistic Style, http://arxiv.org/abs/1508.06576]

[Figures: application examples. Source: Instacart. Source: Google blog]

Ophthalmology
● Algorithm: 0.95
● Ophthalmologist (median): 0.91

Radiology
“The network performed similarly to senior orthopedic surgeons when presented with images at the same resolution as the network.”
www.tandfonline.com/doi/full/10.1080/17453674.2017.1344459

A shared goal: machine learning makes the future better

What is TensorFlow
● Tensor: a multidimensional array
● Flow: the flow of data

TensorFlow: Computation Graph
● Computation is defined as a graph
● Nodes represent computation or state
  ○ Can be run on any device
● Data flows along edges
● The graph can be defined in any language
● The graph is compiled and optimized

TensorFlow: ML for Everyone
● An open-source machine learning platform for everyone
● Fast, flexible, and production-ready
● Scales from research to production

Machine learning gets complex quickly
● Deep learning: just like regular learning, but with more layers
● Inception v3 has ~25 M parameters

TensorFlow Handles Complexity
● Modeling complexity
● Distributed system
● Heterogeneous system

[Figure: TensorFlow stack, top to bottom]
● Canned Estimators
● Estimator, Keras Model
● Layers, Datasets
● Frontends: Python, C++, Java, Go, ...
● TensorFlow Distributed Execution Engine
● CPU, GPU, Android, iOS, XLA (CPU, GPU, TPU), ...
TensorFlow provides great tools like TensorBoard
[Figure: TensorBoard]

TensorFlow supports many platforms
CPU, GPU, iOS, Android, Raspberry Pi, 1st-gen TPU, Cloud TPU

TensorFlow supports many languages
Java

An active open-source community
● Positive reviews
  ○ 81,000+ GitHub stars
  ○ 23,000+ GitHub repositories with ‘TensorFlow’ in the title
● Rapid development
  ○ 1,100+ contributors
  ○ 21,000+ commits in 21 months
● Direct engagement
  ○ 8,000+ Stack Overflow questions answered
  ○ 100+ community-submitted GitHub issues responded to weekly

[Figure: GitHub star count, 2013 to 2017 (axis 0 to 50,000); TensorFlow at 81K+]
Confidential + Proprietary

TensorFlow appears in courses
● University of California, Berkeley
● Udacity
● Coursera
● Stanford University
● deeplearning.ai
● University of Toronto
● Andreessen Horowitz

TensorFlow in China
● TensorFlow Chinese website: tensorflow.google.cn
● Follow TensorFlow on WeChat
● JD, Xiaomi

Document Scanning and Text Detection
With convolutional neural networks running on TensorFlow, we bring powerful features to Youdao Translate and Youdao Note.
● Document scanning: real-time, stable, applicable to multiple scenarios
● Text detection: accurate, fast, fewer computing resources needed

More research examples
● “SENSING URBAN LAND-USE PATTERNS BY INTEGRATING GOOGLE TENSORFLOW AND SCENE-CLASSIFICATION MODELS” -- Sun Yat-sen University
● “Prediction of Sea Surface Temperature using Long Short-Term Memory” -- Ocean University of China

The iotAI-one AI sorting machine is a second-generation AI sorting system prototype built by the team of Chen Wanjun (陈万钧), a teacher at Jiangxi Environmental Engineering Vocational College and the applicant for the "iotAI" trademark. It is based on Google's open-source AI framework TensorFlow and is applied to intelligent sorting.
This machine is the prototype for a product about to be commercialized: "machine-vision-based intelligent appearance inspection and sorting equipment for NdFeB (neodymium-iron-boron) blanks".

More than 10 million SDK downloads, across 180 countries and regions.
TensorFlow in China

Supporting AI education
● Supporting all levels, from universities and vocational schools to K-12 schools
● Investing millions of RMB in AI education in 2018, following the 5-year MoU signed with China's Ministry of Education
● Externalizing machine learning courses to train more teachers

Developing strong communities
● TensorFlow DevSummit viewing parties in February
● TensorFlow symposiums in Beijing and Shanghai
● Nationwide GDG DevFests in 27 cities
● More online and offline community activities coming in 2018!

TensorFlow in Google

Neural Machine Translation Reduces Errors By 55%-85%
● Sutskever et al. NIPS, Dec 2014
● Google Research Blog, Sept 2016
● Wu et al. arXiv, Sept 2016

Google Translate
Google Translate, a truly global product...
● 1B+ translations every single day, equivalent to 1M books
● 1B+ monthly active users
● 103 languages, covering 99% of the online population

Machine Translation: A Brief History
● Statistical machine learning -> Statistical MT
● Deep learning -> Neural MT
[Figure: timeline with marks at 1950's, 1960's, 1990's, 2000's, 2014, and 2016. Milestones shown: early research; SYSTRAN founded; ALPAC report; rule-based MT; example-based MT; IBM models; word-based MT; phrase-based MT; syntax-based MT; semantics-based MT; Neural MT; Google's Multilingual Neural MT]

NMT: A Brief History
[Figure credit - Orhan Firat]

Old Tech: PBMT
[Figure: translation pyramid]

Neural Machine Translation: The Game Changer
● Phrase-based machine translation: discrete, local decisions
● Neural Machine Translation (NMT): continuous, global decisions

Sequence Modeling
● What does this mean?
● Predict the likelihood of a sequence:
  ○ What is the probability P(X1, X2, ..., XN)?
● A sequence (X1, X2, ..., XN) can be a piece of text.
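To make "the probability of a sequence" concrete, here is a minimal sketch in plain Python. It is not from the slides: the three-sentence corpus, the `<s>` start marker, and the function name `sequence_probability` are all hypothetical, and it uses a first-order (bigram) approximation purely to stay small. Each token's probability is estimated from counts and the per-token probabilities are multiplied together.

```python
from collections import Counter

# Hypothetical toy corpus; any tokenized text would do.
corpus = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["the", "dog", "sat"],
]
BOS = "<s>"  # assumed start-of-sequence marker

# Count how often each token follows each previous token.
bigram_counts = Counter()
context_counts = Counter()
for sentence in corpus:
    prev = BOS
    for token in sentence:
        bigram_counts[(prev, token)] += 1
        context_counts[prev] += 1
        prev = token


def sequence_probability(tokens):
    """Estimate P(X1, ..., XN) as a product of P(Xt | Xt-1) from counts."""
    probability = 1.0
    prev = BOS
    for token in tokens:
        if context_counts[prev] == 0:
            return 0.0  # context never seen in the corpus
        probability *= bigram_counts[(prev, token)] / context_counts[prev]
        prev = token
    return probability


p_seen = sequence_probability(["the", "cat", "sat"])
p_unseen = sequence_probability(["the", "dog", "ran"])
print(p_seen, p_unseen)
```

Because each step conditions only on the previous token, this sketch cannot capture longer-range dependencies; a model that conditions on the full history would replace the count lookup with a learned function of all earlier tokens.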
Figure(s) credit - Orhan Firat

Sequence Modeling
● Non-Markovian sequence modelling
  ○ Markovian modelling ignores dependencies beyond a fixed context
  ○ Directly model the original conditional probabilities:
    $P(X_1, X_2, \ldots, X_N) = \prod_{t=1}^{N} P(X_t \mid X_1, \ldots, X_{t-1})$
● Input sequences can be variable length

Sequence Modeling
● How can we handle variable-length sequences?
  ○ Recurrence
Credits - Orhan Firat, Martin Görner

Recurrent Neural Networks
● Modelling P(the, cat, sat, on) with RNNs:
  ○ an input sequence (X1, X2, ..., XN)
  ○ an internal memory state, which tracks the state so far
  ○ a function that recurses over the input Xi and the memory
[Figure: unrolled RNN; inputs: the, cat, sat; outputs: P(the), P(cat|...), P(sat|...), P(on|...)]

Recurrent Neural Networks
● Deepest neural networks possible
  ○ Unlimited depth when unrolled over time
● Most general neural networks
  ○ They are general computers
● Universal function approximators
  ○ In principle, can represent any program
Credits - Orhan Firat, Chris Olah

Training RNNs
● Unroll the loop
● Apply back-propagation through time (BPTT) (Mozer et al.’95)
● Problems: vanishing or exploding gradients (Hochreiter et al.’91, Bengio et al.’94)
  ○ Remedy: Long Short-Term Memory units (LSTM) (Hochreiter & Schmidhuber’95)
Credits - Chris Olah

LSTMs
[Figure: vanilla RNN cell compared with LSTM RNN cell, followed by LSTM cell diagrams on several slides]
Credits - Chris Olah

Sequence to Sequence Modelling
● Learn to map: X1, X2,...,XN -> Y1, Y2,...,YN
● Encoder/decoder framework
● In theory, any input and output sequence length works
[Figure: the encoder reads "the cat sat"; its final state is the bottleneck; the decoder is fed "Die Katze saß" and emits "Die Katze saß EOS"]

Deep Sequence to Sequence
[Figure: stacked encoder LSTMs read X3, X2, </s>; stacked decoder LSTMs read <s>, Y1, Y3 and feed a SoftMax that emits Y1, Y2, </s>]

Attention Mechanism
● Addresses the information bottleneck problem

Google Neural Machine Translation Model
● GNMT GitHub: https://github.com/tensorflow/nmt
● Google open models: https://github.com/tensorflow

Learn2Learn & Evolutionary Algorithms
AM!!!
[Figure: architecture search loop]
● The controller (RNN) samples an architecture A with probability p
● A child network with architecture A is trained to obtain accuracy R
● The gradient of p is computed and scaled by R to update the controller

Why Evolution?
Worker
● Pick 2 at random
● Kill worst
● Select best as parent
● Copy-mutate parent
● Train, evaluate child
[Figure: many workers running this loop in parallel]

Mutations
● Insert convolution layer
● Remove convolution layer
● Insert nonlinearity layer
● Remove nonlinearity layer
● Insert skip connection
● Remove skip connection
● Alter stride
● Alter number of channels
● Alter horizontal filter size
● Alter vertical filter size
● Alter learning rate
● Identity (no change)
● Reset weights

Parsey McParseface
https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html
[Figure: recurrent cell A with input Xt, hidden state ht, and a prediction output]

Google's Project Magenta
https://magenta.tensorflow.org/

[Figure: retinal images, healthy vs. diseased (hemorrhages); diabetic retinopathy (DR) grades: No DR, Mild DR, Moderate DR, Severe DR, Proliferative DR]

Robotics

Data center optimization
[Figure: data center PUE plot ranging from low PUE to high PUE, annotated "ML control on" and "ML control off"]

TensorFlow Basics

TensorFlow: Computation Graph
● Computation is defined as a graph
● Nodes represent computation or state
  ○ Can be run on any device
● Data flows along edges
● The graph can be defined in any language
● The graph is compiled and optimized

Simple ML Model: Linear Regression
Core TF code without using the high-level API
● Model: y = Wx + b
● Training data: pairs (x, y’)
● Loss: loss = ∑ (yi - y’i)^2

    import tensorflow as tf  # TensorFlow 1.x graph-mode API

    # Model parameters
    W = tf.Variable([0.3], dtype=tf.float32)
    b = tf.Variable([0.1], dtype=tf.float32)

    # Model input and prediction
    x = tf.placeholder(tf.float32)
    y = W * x + b

    # Target values y'
    y_prime = tf.placeholder(tf.float32)

    # Loss to minimize: sum of squared errors
    loss = tf.reduce_sum(tf.square(y - y_prime))

Dataflow-based computation
[Figure: a Python program and the TensorFlow graph it builds]
https://www.tensorflow.org/get_started/get_started

Build a graph; then run it.
[Figure: inputs a and b feed an add node that produces c]

    ...
    c = tf.add(a, b)
    ...
    session = tf.Session()
    # The feed dictionary maps tensors to values, so it uses ":" rather than "="
    value_of_c = session.run(c, {a: 1, b: 2})

Any Computation is a TensorFlow Graph
[Figure: graph with nodes examples, weights, MatMul, biases, Add, Relu, labels, Xent]

Any Computation is a TensorFlow Graph (with state)
[Figure: the same graph; biases and weights are variables that hold state]

Automatic Differentiation
Automatically add ops which compute gradients for variables
[Figure: biases ... Xent, with an added grad node]

Any Computation is a TensorFlow Graph (with state)
[Figure: biases ... Xent, grad; the gradient is multiplied by the learning rate (Mul) and subtracted from biases (−=)]

Any Computation is a TensorFlow Graph (distributed)
[Figure: the graph split across Device A and Device B: biases, Add, ..., Mul, −=, ...]
Devices: processes, machines, CPUs, GPUs, TPUs, etc.

Send and Receive Nodes (distributed)
[Figure: the graph split across Device A and Device B (biases, Add, ..., Mul, −=, learning rate); Send and Recv node pairs are inserted on each edge that crosses between the two devices]
Devices: processes, machines, CPUs, GPUs, TPUs, etc.

TensorFlow APIs
[Figure: TensorFlow stack, top to bottom]
● Canned Estimators
● Estimator, Keras Model
● Layers, Datasets
● Frontends: Python, C++, Java, Go, ...
● TensorFlow Distributed Execution Engine
● CPU, GPU, Android, iOS, XLA (CPU, GPU, TPU), ...

API: Layers
[Figure: TensorFlow stack, top to bottom]
● Canned Estimators
● Estimator, Keras Model
● Layers, Datasets
● Python Frontend, C++ Frontend, ...
● TensorFlow Distributed Execution Engine
● CPU, GPU, Android, iOS, ...

Example layer stack
● conv 5x5 (relu)
● max pool 2x2
● conv 5x5 (relu)
● max pool 2x2
● dense (relu)
● dropout
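The layer stack on the API: Layers slide can be traced by hand to see what each layer does to the tensor shape and how many parameters it adds. The sketch below does that arithmetic in plain Python rather than with TensorFlow itself. The slide gives only the layer types and kernel sizes, so every number here is an assumption for illustration: a 28x28x1 input, 32 and 64 filters, stride-1 "same" padding, and 1024 dense units. The helper names (`conv2d_same`, `max_pool`, `dense`, `trace`) are hypothetical as well.

```python
# All sizes below are illustrative assumptions, not taken from the slides.

def conv2d_same(shape, filters, kernel):
    """Shape and parameter count of a stride-1, 'same'-padded convolution."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, size):
    """Shape after non-overlapping size x size max pooling (no parameters)."""
    height, width, channels = shape
    return (height // size, width // size, channels), 0


def dense(shape, units):
    """Flatten the input, then apply a fully connected layer."""
    inputs = 1
    for dim in shape:
        inputs *= dim
    return (units,), inputs * units + units  # weights + biases


def trace(input_shape):
    """Walk the slide's layer stack; record (name, output shape, params)."""
    steps = [
        ("conv 5x5 (relu)", lambda s: conv2d_same(s, filters=32, kernel=5)),
        ("max pool 2x2", lambda s: max_pool(s, size=2)),
        ("conv 5x5 (relu)", lambda s: conv2d_same(s, filters=64, kernel=5)),
        ("max pool 2x2", lambda s: max_pool(s, size=2)),
        ("dense (relu)", lambda s: dense(s, units=1024)),
        ("dropout", lambda s: (s, 0)),  # dropout changes neither shape nor params
    ]
    shape, rows = input_shape, []
    for name, layer in steps:
        shape, params = layer(shape)
        rows.append((name, shape, params))
    return rows


rows = trace((28, 28, 1))
for name, shape, params in rows:
    print(f"{name:<16} output={shape} params={params}")
```

Under these assumptions almost all of the parameters sit in the dense layer, which is one reason dropout is commonly placed right after it.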