Building Intelligent Systems with Large Scale Deep Learning
Jeff Dean, Google Brain Team
g.co/brain
Presenting the work of many people at Google

Google Brain Team Mission: Make Machines Intelligent. Improve People’s Lives.

How do we do this?
● Conduct long-term research (>200 papers, see g.co/brain & g.co/brain/papers)
  ○ Unsupervised learning of cats, Inception, word2vec, seq2seq, DeepDream, image captioning, neural translation, Magenta, ML for robotics control, healthcare, …
● Build and open-source systems like TensorFlow (see tensorflow.org and https://github.com/tensorflow/tensorflow)
● Collaborate with others at Google and Alphabet to get our work into the hands of billions of people (e.g., RankBrain for Google Search, GMail Smart Reply, Google Photos, Google speech recognition, Google Translate, Waymo, …)
● Train new researchers through internships and the Google Brain Residency program

Main Research Areas
● General Machine Learning Algorithms and Techniques
● Computer Systems for Machine Learning
● Natural Language Understanding
● Perception
● Healthcare
● Robotics
● Music and Art Generation
research.googleblog.com/2017/01/the-google-brain-team-looking-back-on.html

[Figure, three panels (“1980s and 1990s”; “1980s and 1990s” with “more compute” added; “Now”, with “more compute”): Accuracy vs. Scale (data size, model size) for neural networks and other approaches]

Growth of Deep Learning at Google
[Figure: Directories containing model description files, with labels ending “and many more”]

Experiment Turnaround Time and Research Productivity
● Minutes, Hours:
  ○ Interactive research! Instant gratification!
● 1-4 days
  ○ Tolerable
  ○ Interactivity replaced by running many experiments in parallel
● 1-4 weeks
  ○ High value experiments only
  ○ Progress stalls
● >1 month
  ○ Don’t even try

Build the right tools
Google Confidential + Proprietary (permission granted to share within NIST)

Open, standard software for general machine learning
Great for Deep Learning in particular
http://tensorflow.org/ and https://github.com/tensorflow/tensorflow
First released Nov 2015
Apache 2.0 license

TensorFlow Goals
Establish common platform for expressing machine learning ideas and systems
Make this platform the best in the world for both research and production use
Open source it so that it becomes a platform for everyone, not just Google

TensorFlow Scaling
Near-linear performance gains with each additional 8x NVIDIA® Tesla® K80 server added to the cluster

TensorFlow supports many platforms
CPU, GPU, iOS, Android, Raspberry Pi, 1st-gen TPU, Cloud TPU

TensorFlow supports many languages
Java

[Figure: year labels 2013, 2011, 2013, 2013, 2010, late 2015]

ML is done in many places
[Figure: TensorFlow GitHub stars by GitHub user profiles w/ public locations]
Source: http://jrvis.com/red-dwarf/?user=tensorflow&repo=tensorflow

TensorFlow: A Vibrant Open-Source Community
● Rapid development, many outside contributors
  ○ 475+ non-Google contributors to TensorFlow 1.0
  ○ 15,000+ commits in 15 months
  ○ Many community created tutorials, models, translations, and projects
    ■ ~7,000 GitHub repositories with ‘TensorFlow’ in the title
● Direct engagement between community and TensorFlow team
  ○ 5000+ Stack Overflow questions answered
  ○ 80+ community-submitted GitHub issues responded to weekly
● Growing use in ML classes: Toronto, Berkeley, Stanford, ...
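One way to make the “near-linear” claim on the TensorFlow Scaling slide concrete is to compare measured throughput against ideal linear scaling. The sketch below is illustrative only: the function name `scaling_efficiency` and every throughput figure are hypothetical and do not come from the talk.

```python
def scaling_efficiency(throughput_by_servers):
    """Return {servers: efficiency} relative to ideal linear scaling.

    `throughput_by_servers` maps a server count to a measured throughput
    (for example, images/sec). An efficiency of 1.0 means throughput grew
    exactly in proportion to the number of servers, starting from the
    smallest measured configuration.
    """
    if not throughput_by_servers:
        raise ValueError("need at least one measurement")
    base_servers = min(throughput_by_servers)
    # Per-server throughput of the smallest configuration is the baseline.
    base_rate = throughput_by_servers[base_servers] / base_servers
    return {
        n: rate / (n * base_rate)
        for n, rate in sorted(throughput_by_servers.items())
    }


# Hypothetical measurements (NOT from the talk): servers -> images/sec.
measurements = {1: 1000.0, 2: 1960.0, 4: 3800.0, 8: 7200.0}
efficiency = scaling_efficiency(measurements)
for servers, eff in efficiency.items():
    print(f"{servers} server(s): {eff:.0%} of linear")
```

With made-up numbers like these, an efficiency that stays close to 1.0 as servers are added is what “near-linear” would mean in practice.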
Google Photos [glacier]

Reuse same model for completely different problems
Same basic model structure trained on different data, useful in completely different contexts
Example: given image → predict interesting pixels
www.google.com/sunroof

We have tons of vision problems
Image search, StreetView, Satellite Imagery, Translation, Robotics, Self-driving Cars, …

Computers can now see
Large implications for healthcare

MEDICAL IMAGING
Using similar model for detecting diabetic retinopathy in retinal images
Performance on par or slightly better than the median of 8 U.S. board-certified ophthalmologists (F-score of 0.95 vs. 0.91).
http://research.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html

Computers can now see
Large implications for robotics

Combining Vision with Robotics
“Deep Learning for Robots: Learning from Large-Scale Interaction”, Google Research Blog, March, 2016
“Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection”, Sergey Levine, Peter Pastor, Alex Krizhevsky, & Deirdre Quillen, Arxiv, arxiv.org/abs/1603.02199

Self-Supervised and End-to-end Pose Estimation

TCN + Self-Supervision (No Labels!)

Scientific Applications of ML

Predicting Properties of Molecules
[Figure: Aspirin → Message Passing Neural Net → Toxic? Bind with a given protein? Quantum properties.]
● Chemical space is too big, so chemists often rely on virtual screening.
● Machine Learning can help search this large space.
● Molecules are graphs, nodes=atoms and edges=bonds (and other stuff)
● Message Passing Neural Nets unify and extend many neural net models that are invariant to graph symmetries
● State of the art results predicting output of expensive quantum chemistry calculations, but ~300,000 times faster
https://research.googleblog.com/2017/04/predicting-properties-of-molecules-with.html and https://arxiv.org/abs/1702.05532 and https://arxiv.org/abs/1704.01212 (latter to appear in ICML 2017)

Measuring live cells with image to image regression

“Seeing More”
Enabling technology: Image to image regression
[Figure: Depth prediction on portrait data; panels: Input, True Depth, Predicted Depth]
[Figure: Applications for camera effects; panels: Input, Saturation, Defocus]

Predict cellular markers from transmission microscopy?
[Figure: Human cancer cells / DIC / nuclei (blue) and cell mask (green)]
[Figure: Human iPSC neurons / phase contrast / nuclei (blue), dendrites (green), and axons (red)]

Scaling language understanding models

Sequence-to-Sequence Model [Sutskever & Vinyals & Le NIPS 2014]
[Figure: a Deep LSTM reads the input sequence “A B C D __ X Y Z” and emits the target sequence “X Y Z Q”]

Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014]
[Figure, four animation frames: given the input sentence “Quelle est votre taille? <EOS>”, the model emits the target sentence one word per frame (“How”, “How tall”, “How tall are”, “How tall are you?”), appending each emitted word to the input for the next step]
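The translation frames above show the decoder emitting one target word per step while its earlier outputs are fed back in. A minimal sketch of that loop follows, using greedy selection for simplicity. Everything in it is a stand-in: `toy_next_word_scores` is a hypothetical lookup table that takes the place of the trained Deep LSTM, and `greedy_decode` is an illustrative name, not code from the talk or from any library.

```python
EOS = "<EOS>"


def toy_next_word_scores(source, prefix):
    """Hypothetical stand-in for a trained decoder.

    Returns candidate next words with scores, given the source sentence
    and the target words emitted so far. A real model would condition on
    `source`; this toy table ignores it.
    """
    table = {
        (): {"How": 0.9, "What": 0.1},
        ("How",): {"tall": 0.8, "big": 0.2},
        ("How", "tall"): {"are": 0.95, "is": 0.05},
        ("How", "tall", "are"): {"you?": 0.9, "they?": 0.1},
        ("How", "tall", "are", "you?"): {EOS: 1.0},
    }
    return table.get(tuple(prefix), {EOS: 1.0})


def greedy_decode(source, next_word_scores, max_len=20):
    """Emit the highest-scoring word at each step until EOS or max_len."""
    prefix = []
    for _ in range(max_len):
        scores = next_word_scores(source, prefix)
        word = max(scores, key=scores.get)
        if word == EOS:
            break
        prefix.append(word)  # fed back as input to the next step
    return prefix


source = ["Quelle", "est", "votre", "taille?", EOS]
translation = greedy_decode(source, toy_next_word_scores)
print(" ".join(translation))
```

Greedy selection keeps only one hypothesis per step; beam search instead keeps several partial outputs alive and returns the most probable complete sequence.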
Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014]
At inference time: Beam search to choose most probable over possible output sequences
[Figure: input sentence “Quelle est votre taille? <EOS>”]

Smart Reply
Google Research Blog - Nov 2015
[Figure: Incoming Email → Small Feed-Forward Neural Network (Activate Smart Reply? yes/no) → Deep Recurrent Neural Network → Generated Replies]

Smart Reply
April 1, 2009: April Fool’s Day joke
Nov 5, 2015: Launched Real Product
Feb 1, 2016: >10% of mobile Inbox replies

Sequence to Sequence model applied to Google Translate
https://arxiv.org/abs/1609.08144

Google Neural Machine Translation Model
[Figure: Encoder LSTMs and Decoder LSTMs (8 Layers, with “+” connections between layers) placed on Gpu1, Gpu2, Gpu3, ..., Gpu8, joined by Attention and feeding a SoftMax; encoder inputs X2, X3, </s>; decoder inputs <s>, Y1, Y3; outputs Y1, Y2, </s>. One model replica: one machine w/ 8 GPUs]

Model + Data Parallelism
[Figure: Parameters distributed across many parameter server machines (Params, Params, Params, ...); Many replicas]

Neural Machine Translation
[Figure: Translation quality (0 to 6, where 6 is perfect translation) by Translation model: human, neural (GNMT), and phrase-based (PBMT), for English > Spanish, English > French, English > Chinese, Spanish > English, French > English, Chinese > English]
Closes gap between old system and human-quality translation by 58% to 87%
Enables better communication across the world
research.googleblog.com/2016/09/a-neural-network-for-machine.html

BACKTRANSLATION FROM JAPANESE (en->ja->en)
Phrase-Based Machine Translation (old system):
Kilimanjaro is 19,710 feet of the mountain covered with snow, and it is said that the highest mountain in Africa. Top of the west, “Ngaje Ngai” in the Maasai language, has been referred to as the house of God.
The top close to the west, there is a dry, frozen carcass of a leopard. Whether the leopard had what the demand at that altitude, there is no that nobody explained.

Google Neural Machine Translation (new system):
Kilimanjaro is a mountain of 19,710 feet covered with snow, which is said to be the highest mountain in Africa. The summit of the west is called “Ngaje Ngai” God ‘s house in Masai language. There is a dried and frozen carcass of a leopard near the summit of