
“…” and Data

statistics and lots of data

Brian D Goodwin, PhD · January 13, 2020 · Milwaukee School of Engineering

Image: https://bernardmarr.com/

Brian D Goodwin, PhD

Marquette University

BD Goodwin, CR Butson. (2015). J. Neuromodulation.

Medical College of Wisconsin

BD Goodwin, FA Pintar, NA Yoganandan. (2017). Annals Biomed. Eng.

There are many paths, but here is mine
• Undergrad: BS Mech. Eng. at MSOE – 2009
  • Coding/programming
  • Control
  • Finite element modeling (or FEM; also, FEA)
• Started a Masters in Biomedical Eng. at MU …
  • Decided to just do a PhD about 6 months into my Masters
  • PhD took 5.5 years
  • Specialized in neuroscience and computational science
• Stayed in research to do a postdoctoral fellowship at MCW
  • Fellowship was about 2 years (could have been longer)
  • Specialized in biomechanics, signal processing, and machine-learning
• Transitioned into industry at an IT consulting firm in the “Data & AI” space
  • 3 years
  • Specialized in cloud solution architecture, machine-learning, and optimization
• Recently accepted an offer from an AI company, Synthetaic
  • Using AI to generate synthetic datasets: large datasets that represent and characterize rare events

A few thoughts on mechanical engineering

• (I'm obviously biased, but) I think a BSME sets you up for success
  • Examples: the PIs I worked under and the CTO at Synthetaic
• Problems with "condensed concentrations/majors" (in my opinion)
  • Biomedical Engineering, for example, spans:
    • Mechanical Eng. • Electrical Eng. • Systems • Materials Science & Biocompatibility • Science • Statistics • Signal processing • Mathematics and Lin. Alg. • General Engineering • Data Engineering

What is potentially ahead for you

• Industry
• Academia
• How do I get the position I want?
  • Competence, but many of the competencies needed for a given job (especially your first one) are learned on the job.
  • At the end of the day, stories and relevant/novel contributions will land you the type of engineering job you want
  • Craft your stories accordingly
  • Question asking!

Transcranial Magnetic Stimulation and Magnetism in Medicine
• (Nuclear) Magnetic Resonance Imaging
• Magnetoencephalography (MEG)
• Transcranial Magnetic Stimulation

ECS DBS TMS

Idris Z, et al. 2013 (Chapter 2, “Clinical Management and Evolving Novel…”, Lichtor T)

… within Nissl stain layers I and II. From the soma to the tip of the highest dendrite measures 800 µm.

Axon tractography was performed using SCIRun's (SCI Institute, University of Utah, Salt Lake City) interface with Tend Fiber (TEEM [49]), which uses Westin's linear tensor-line algorithm (Westin et al., 2002). A point cloud was generated around the neuron cell bodies to provide seeds for the axon tractography algorithm. Fiber tracts were generated with the termination criteria that fibers must have a minimum length of 20 mm and FA must be > 0.5. Connections were formed between the cell bodies and fiber tracts via a custom algorithm in m-script that employs Hermite splines [50]. This algorithm ensures that axon trajectories are void of sharp curves (Figure 31). All pyramidal cells in the population had axons 20 mm in length.

(Figure residue: panels showing monophasic and biphasic electric fields (V/m) and axon tractography near the cortex; flow diagram: Stimulus (magnetic field, 0–0.5 ms) → Compute the DFT → Discrete Fourier series (0–20 kHz) → Compute the system of equations at each Fourier component → Electromagnetic solution → Compute the inverse DFT → Time-dependent electromagnetic solution.)

Figure 6 Fourier Solver Calculation Steps. The numerical process to calculate the time-dependent electromagnetic solution from a transient magnetic pulse has three steps indicated by the black arrows. In the case of TMS, the magnetic source is the electric current flowing through the TMS coil. The Fourier Solver was developed for use in finite element methods (middle black arrow).

Figure 31 Use of DTI for Anisotropic Conductivity and Neuron Axon Tractography. (Left) Acquired DTI was applied to the 3D head model for the inclusion of anisotropic conductivity. (Right) Diffusion tensors were employed for axon tractography to construct descending axons from the pyramidal cell bodies. Four out of 6111 neurons are shown for the purpose of example.
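The three-step Fourier approach lends itself to a compact numerical sketch. The snippet below is a minimal illustration only: a toy per-frequency transfer function H(f) stands in for the frequency-domain FEM solve, and the stimulus, sampling rate, and corner frequency are assumptions, not values from the thesis.

```python
import numpy as np

# --- Fourier Solver sketch: DFT -> per-frequency solve -> inverse DFT ---
fs = 200e3                       # assumed sampling rate (Hz), resolves a 0.5 ms pulse
t = np.arange(0, 0.5e-3, 1 / fs) # 0.5 ms time base

# 1) Transient magnetic stimulus (toy biphasic cosine pulse)
stimulus = np.cos(2 * np.pi * 2.5e3 * t) * (t < 0.4e-3)

# 2) Compute the DFT of the stimulus
spectrum = np.fft.rfft(stimulus)
freqs = np.fft.rfftfreq(len(stimulus), d=1 / fs)

# 3) "Solve" the system at each Fourier component.
#    A hypothetical low-pass transfer function replaces the FEM solve here.
f_c = 10e3                                   # toy corner frequency (Hz)
H = 1.0 / (1.0 + 1j * freqs / f_c)
solution_spectrum = H * spectrum

# 4) Inverse DFT -> time-dependent electromagnetic solution
e_field_t = np.fft.irfft(solution_spectrum, n=len(stimulus))

print(f"peak of time-dependent solution: {e_field_t.max():.3f} (arbitrary units)")
```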

[49] http://teem.sourceforge.net/
[50] A Hermite spline is a 3D spline generated from Hermite's polynomials, which requires the location and spline trajectory (vector) of two points.

Modeling Pipeline for Prescriptive Therapy

Figure 35 E-Field Within the Cortex for Three Coil Orientations. (Top left) The location of the data plane (red) relative to the pial surface is shown with the coordinate frame. E-field magnitude maps are shown at peak E-field during the stimulus pulse. Conductivity tensors are superimposed over E-field maps within the data plane. FEM results are shown for three coil orientations (biphasic stimulus @ 130% RMT). Coil placements are shown in the left panes with the primary direction of the induced E-field.

Goodwin, Butson. “Subject-Specific Multiscale Modeling...” Neuromodulation (2015).

E-field and Brain Stimulation

(Figure residue: E-field color scale spanning 50–350 V/m; 50 mm scale bar; coil-orientation panel labels.)

Figure 11 Induced E-field Vectors and E-field Magnitude Contour Map. Induced E-field vectors and relative E-field magnitude contour lines on a plane 1 cm below the plane of the figure-8 TMS coil. Dark red line indicates 0.9 of the max E-field magnitude.

The measured EMF happened to match the waveform characteristics (Figure 12) specified by the manufacturer in terms of the rise time, duration, and shape (Jalinous, 1998; Walsh & Pascual-Leone, 2005). Unfortunately, I was unable to either record or calculate the actual current through the TMS coil. For this reason, I elected to model the actual current by normalizing the measured waveforms and scaling them according to manufacturer specifications to enable prediction of the absolute induced fields. The shape of the current waveform through the coil was calculated post hoc by integrating the EMF record (Figure 12).
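Recovering a current-waveform shape by integrating an EMF record is a short calculation. The sketch below is only illustrative: the sampling rate, synthetic EMF trace, and manufacturer peak-current value are all assumed placeholders.

```python
import numpy as np

# Recover the coil-current shape by cumulatively integrating a measured EMF,
# then normalize and rescale to a manufacturer-specified peak current.
fs = 5e6                                   # assumed sample rate of the EMF record (Hz)
t = np.arange(0, 0.5e-3, 1 / fs)
emf = np.sin(2 * np.pi * 2.5e3 * t)        # stand-in for the measured EMF trace

current_shape = np.cumsum(emf) / fs        # cumulative (rectangle-rule) integral
current_shape /= np.abs(current_shape).max()   # normalize the waveform shape

peak_current_spec = 5e3                    # hypothetical manufacturer spec (A)
coil_current = peak_current_spec * current_shape
print(f"peak modeled coil current: {np.abs(coil_current).max():.0f} A")
```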

Computational Neuroscience and HPC
Patient-specific Modeling Pipeline
Results
Human Subject Comparison
Goodwin, Marquette University (2015)

Biomechanics and Multi-channel HF Signals
Acoustic Emissions from Bone Fracture Events

Goodwin, et al. “Acoustic Emission Signatures During Failure of Vertebra and Long Bone.” Annals of Biomedical Engineering, 2017.

Business, Tech, and Science News — Text Sources: The Economist; Wall Street Journal; Science (AAAS); Reuters; BBC

What industry thinks about Data Science (word cloud): Data, Business Intelligence, Predictive Analytics, Internet of Things, Statistical Measures, Big Data, Predictive Modeling, Key Performance Indicators, Statistical Learning, Map Reduce, Data Factory,

Probabilistic Modeling, Risk Estimation, Clustering, Neural Networks, Bayesian, Frequentist, Classification, Prescriptive Analytics. (Color key: Data Science, Artificial Intelligence, Development & Data Analytics.)

Data science is perceived as many things…

Machine-learning · Business Intelligence · Statistics · “Big Data” and Cloud

Cloud Solution Architecture

(Reference architecture diagram)
• Data Sources: people, sensors and devices, data
• Information Management: Data Factory, Data Catalog, Event Hubs
• Big Data Stores: Data Lake Store, SQL Data Warehouse
• Machine Learning and Analytics: Machine Learning, Data Lake Analytics, HDInsight (Hadoop and Spark), Stream Analytics
• Intelligence: Cognitive Services, Bot Framework, Cortana
• Dashboards & Visualizations: Power BI
• People: web apps, mobile apps, bots, automated systems

Data → Intelligence → Action

Statistics! Question...
• 1 in 1000 products are defective
• QC performs a defect test that has a 95% accuracy
  • i.e., a 95% true negative rate and a 5% false positive rate
• The test never fails to identify a defective product (false negatives are impossible)

• QC tests a product at random, and the test is positive (indicating defective product)

What is the probability that the product is actually defective?

We are poor intuitive statisticians

An answer in plain words:
• If 1000 randomly selected billets were tested...
• Then about 50 parts will test positive (5% false positive rate)
• But only 1 of them is actually defective.

• Therefore, ≈ 0.02 or 2%

Answer...
D: positive defect test
defect: part is actually defective
P(defect) = 0.001
P(normal) = 0.999
P(D | defect) = 1.00
P(D | normal) = 0.05

Human intuition is really expensive, yet it's error prone in 2 delicate environments:
1) one requiring use of statistics and
2) one where many variables are interacting with each other.

P(D) = P(D | defect)·P(defect) + P(D | ~defect)·P(~defect)
P(D) = 1.00 × 0.001 + 0.05 × 0.999 = 0.05095

P(defect | D) = P(D | defect)·P(defect) / P(D) = (1.00 × 0.001) / 0.05095 = 0.0196 ≈ 2%
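As a quick check, the same numbers drop straight into a few lines of Python; this is a minimal sketch of the Bayes calculation above, not production QC code.

```python
# Bayes' rule for the QC example: P(defect | positive test)
p_defect = 0.001           # prior: 1 in 1000 products are defective
p_normal = 1 - p_defect
p_pos_given_defect = 1.00  # the test never misses a defective product
p_pos_given_normal = 0.05  # 5% false positive rate

# Total probability of a positive test
p_pos = p_pos_given_defect * p_defect + p_pos_given_normal * p_normal

# Posterior probability that a positive-testing product is actually defective
p_defect_given_pos = p_pos_given_defect * p_defect / p_pos
print(f"P(positive) = {p_pos:.5f}")                        # 0.05095
print(f"P(defect | positive) = {p_defect_given_pos:.4f}")  # ~0.0196, i.e., about 2%
```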

A Caveat...

"But the cleverest algorithms are no substitute for human intelligence and knowledge of the data in the problem."
— Breiman and Cutler, UC Berkeley, inventors of the Random Forest ML algorithm. https://www.stat.berkeley.edu/~breiman/RandomForests/cc_philosophy.htm

“This... is a Football”

(Diagram) Data Acquisition or “Ingestion” → Data or Information (intelligible symbols, e.g., a number) → Model or Intelligent Activity (predictive model, f(X)) → Insight or Statistic

“You will need a computer to do this...” · “You will need a brain to do this...” · “This stuff is what drives decisions...”

At the end of the day: it's a function

Predictors (Inputs) X → Prediction Model f → Prediction ŷ

Can be a “black box”

Why Estimate f?
• The system, f, has predictive power (prediction).
• The system, f, has explanatory power (inference).
Say you have some system that produces an output Y = f(X) + ε. But it's impossible to know all of the factors (X, the features) that influence Y, let alone the exact function, f. So, we try to estimate this:

ŷ = f̂(X)

A more tangible (supervised learning) example…
Predictors (Inputs) X → Prediction Model f̂ → Prediction ŷ
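To make “estimating f̂” concrete, here is a minimal supervised-learning sketch: a made-up system y = f(x) + ε is fit by ordinary least squares on polynomial features. The data and function are illustrative assumptions, not anything from the talk.

```python
import numpy as np

# A made-up "system" we pretend not to know: y = f(x) + noise
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + 0.5 * x + rng.normal(scale=0.2, size=x.shape)  # f(x) + epsilon

# Estimate f_hat by least squares on polynomial features of x
degree = 5
X = np.vander(x, degree + 1)                  # columns: x^5, x^4, ..., 1
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def f_hat(x_new):
    """Estimated model: maps predictors (inputs) to a prediction y_hat."""
    return np.vander(np.atleast_1d(x_new), degree + 1) @ coef

print("prediction at x = 1.0:", f_hat(1.0)[0])
print("true f(1.0) without noise:", np.sin(1.0) + 0.5)
```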

See: https://playground.tensorflow.org/

A more tangible (supervised learning) example… Neural Network
(Diagram: hidden-layer nodes a1, a2 and b1, b2.)
Learning or “Training” set

Activation function

Validation or “Testing” set
Label: 1 or 0
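The playground-style setup above (two input features, 0/1 labels, a training set, a testing set, and an activation function) can be sketched in a few lines. This is a toy example using scikit-learn on assumed synthetic data, not the TensorFlow Playground itself.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy 2D dataset: features X_1, X_2 with labels 1 or 0
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

# Learning ("training") set vs. validation ("testing") set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Small neural network: two hidden layers with a tanh activation function
clf = MLPClassifier(hidden_layer_sizes=(4, 2), activation="tanh",
                    max_iter=2000, random_state=0)
clf.fit(X_train, y_train)

print("training accuracy:", clf.score(X_train, y_train))
print("testing accuracy: ", clf.score(X_test, y_test))
```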

(Scatter plot from https://playground.tensorflow.org/: points labeled 1 and 0 on axes X_1 and X_2.)

Model Performance

SVM confusion matrix (predicted vs. actual):

                   Actual FALSE   Actual TRUE
Predicted FALSE        216             1
Predicted TRUE          23             9

Precision: 9 / (9 + 23) = 28.1%    Recall: 9 / (9 + 1) = 90%
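For reference, a short sketch of how these precision and recall numbers follow from the confusion matrix (pure Python, numbers taken from the table above).

```python
# Confusion matrix entries from the table above
tp = 9    # predicted TRUE, actually TRUE
fp = 23   # predicted TRUE, actually FALSE
fn = 1    # predicted FALSE, actually TRUE
tn = 216  # predicted FALSE, actually FALSE

precision = tp / (tp + fp)   # of everything flagged TRUE, how much was right?
recall = tp / (tp + fn)      # of everything actually TRUE, how much was found?

print(f"precision = {precision:.1%}")  # 28.1%
print(f"recall    = {recall:.1%}")     # 90.0%
```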

Machine Learning & Estimating f̂(x)
• Supervised Learning
  • Classification problems; e.g., classifying images, anomaly detection, etc.
  • Features have a known class and you want to predict the class for a new set of features.
• Unsupervised Learning
  • Data clustering
  • No known labels
  • Grouping data points with similarities among their features.
• Reinforcement Learning
  • Optimizing the performance of some action
  • Common in Robotics
• Neural Network or Deep Learning

Pillars of Data Science

Models · Compute · Data

Published as a conference paper at ICLR 2018


PROGRESSIVE GROWING OF GANS FOR IMPROVED QUALITY, STABILITY, AND VARIATION

Tero Karras (NVIDIA), Timo Aila (NVIDIA), Samuli Laine (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University)
{tkarras,taila,slaine,jlehtinen}@nvidia.com

ABSTRACT

We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CELEBA images at 1024². We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CELEBA dataset.

Karras, T. et al., 2017. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In ICLR 2018. pp. 1–26.

(Remaining page content shown on the slide: the paper's Introduction, which surveys generative methods — autoregressive models, VAEs, and GANs — and explains that a GAN pairs a generator, which maps a latent code to a sample, with a discriminator trained to distinguish generated samples from training data, providing gradients that steer both networks; and the caption of Figure 10: “Top: Our CELEBA-HQ results. Next five rows: nearest neighbors found from the training data, based on feature-space distance,” using activations from five VGG layers.)
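As a companion to the generator/discriminator description, here is a deliberately tiny GAN training loop on synthetic 2D data, assuming PyTorch; it illustrates the adversarial setup only and is unrelated to the progressive-growing method of the paper.

```python
import torch
import torch.nn as nn

# Toy "real" data distribution: points on a noisy half circle
def real_batch(n):
    angle = torch.rand(n, 1) * 3.14
    return torch.cat([torch.cos(angle), torch.sin(angle)], dim=1) + 0.05 * torch.randn(n, 2)

latent_dim = 8
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))       # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())  # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Train the discriminator: real -> 1, fake -> 0
    x_real = real_batch(64)
    x_fake = G(torch.randn(64, latent_dim)).detach()
    loss_d = bce(D(x_real), torch.ones(64, 1)) + bce(D(x_fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Train the generator: try to fool the discriminator into outputting 1
    x_fake = G(torch.randn(64, latent_dim))
    loss_g = bce(D(x_fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print("final discriminator loss:", float(loss_d), "generator loss:", float(loss_g))
```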

Improving intra-operative process in neurosurgery — Hollon, et al. 2020, Nature.

ARTICLE doi:10.1038/nature24270

Mastering the game of Go without human knowledge
David Silver*, Julian Schrittwieser*, Karen Simonyan*, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis (DeepMind; *these authors contributed equally to this work)

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Silver, D. et al., 2017. Mastering the game of Go without human knowledge. Nature 550, p.354.

(Remaining page content shown on the slide: Figure 6, “Performance of AlphaGo Zero” — an Elo-rating learning curve over 40 days of self-play training and a final tournament comparing AlphaGo Zero, AlphaGo Master, AlphaGo Lee, AlphaGo Fan, the raw network, Crazy Stone, Pachi and GnuGo — together with the article body and reference list.)

Design Optimization

• Constrained optimization (see the sketch after this list)
  • Design constraints
  • Optimizing for specific circumstance (e.g., lap speed on a specific course)
• Structural engineering and optimization
• Reinforcement learning
  • Massive parameter space
  • Environment for quantifying outcomes
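A minimal constrained-design sketch, assuming SciPy and a made-up objective: minimize the mass of a rectangular cantilever beam subject to a stiffness requirement and bounds on the design variables. The material properties, stiffness target, and formulas are illustrative placeholders, not a real design problem from the slide.

```python
from scipy.optimize import minimize

# Design variables: beam width b and height h (meters)
rho, L = 7850.0, 2.0        # assumed density (kg/m^3) and beam length (m)
E = 200e9                   # assumed Young's modulus (Pa)
k_required = 1e6            # required tip stiffness (N/m), made up for the example

def mass(x):
    b, h = x
    return rho * L * b * h                  # objective: minimize mass

def stiffness(x):
    b, h = x
    I = b * h**3 / 12.0                     # second moment of area
    return 3 * E * I / L**3                 # cantilever tip stiffness

constraints = [{"type": "ineq", "fun": lambda x: stiffness(x) - k_required}]
bounds = [(0.01, 0.2), (0.01, 0.4)]         # design constraints on b and h

result = minimize(mass, x0=[0.05, 0.1], bounds=bounds,
                  constraints=constraints, method="SLSQP")
print("optimal (b, h):", result.x, "mass (kg):", result.fun)
```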

https://www.fastcompany.com/3054028/inside-the-hack-rod-the-worlds-first-ai-designed-car