
PERSPECTIVES

INNOVATION

Automating drug discovery

Gisbert Schneider

Abstract | Small-molecule drug discovery can be viewed as a challenging multidimensional problem in which various characteristics of compounds, including efficacy, pharmacokinetics and safety, need to be optimized in parallel to provide drug candidates. Recent advances in areas such as microfluidics-assisted chemical synthesis and biological testing, as well as artificial intelligence systems that improve a design hypothesis through feedback analysis, are now providing a basis for the introduction of greater automation into aspects of this process. This could potentially accelerate time frames for compound discovery and optimization and enable more effective searches of chemical space. However, such approaches also raise considerable conceptual, technical and organizational challenges, as well as scepticism about the current hype around them. This article aims to identify the approaches and technologies that could be implemented robustly by medicinal chemists in the near future and to critically analyse the opportunities and challenges for their more widespread application.

‘Automation of science’ bears the promise of making better decisions faster [1]. In drug discovery, automated systems already have a long and fruitful history [2] (FIG. 1). Medium-throughput to high-throughput robotic screening in specialized assays has become standard in the pharmaceutical industry (FIG. 2). The breadth of other applications of automated systems extends from decision-support systems, through computational molecular design, to fully fledged robotic synthesis and hit finding [3]. Prominent examples include traditional rule-based and model-based approaches (for example, the archetypal DENDRAL system for analysing mass spectra [4], LHASA [5] software for synthesis planning and various in-house tools for accessing and analysing chemical and biological data similar to Amgen’s AADAPT system [6]), various software tools for de novo molecular design [7] and prototypical robotic systems such as ADAM and EVE for automated target and hit finding [1,8].

Nevertheless, the full integration of all aspects of compound design, synthesis, testing and automated iteration throughout the molecular design cycle (FIG. 1) has not yet been productively applied on a broader scale, although there have been a few isolated proof-of-concept studies. For example, MacConnell et al. [9] recently disclosed a microfluidics-based, miniaturized discovery platform for ultra-high-throughput hit deconvolution by sequencing. The device distributes DNA-encoded compound beads into picolitre-scale droplets, cleaves off the compounds from the beads by ultraviolet (UV) irradiation and performs a fluorescence-based binding assay, hit detection and subsequent hit identification by DNA barcode sequencing. By replicate analysis, the authors were able to reduce the false-positive hit rate to below 3%. This proof-of-concept study highlights the use of integrated microfluidics systems for large-scale screening within short, hour-scale time frames and with very low material consumption. Another example is provided by researchers at AbbVie, who have developed an integrated robotic platform for the automated parallel synthesis of small, focused compound libraries, built mainly from commercially available components [10]. Their system is able to perform liquid handling and evaporation for in-line analytics, purification and activity testing. Turnaround times of 24–36 hours were reported, which allow the project teams involved to obtain results from hypothesis testing within a day or two [11]. Similar robotic systems have been installed or are under construction in several pharmaceutical companies (for an example, see FIG. 2, right panel).

Now, advances in areas such as ‘organ-on-a-chip’ technologies and artificial intelligence are increasingly providing the basis for more widespread application of semi-autonomous or even fully autonomous processes to support project teams in identifying and optimizing tool and hit compounds in drug discovery. The benefits of automation include: diminished measurement errors and reduced material consumption by the application of standardized procedures with robotic support; shortened synthesize-and-test cycle times, enabling fast feedback loops and compound optimization; and ‘objectified’ molecular design towards multiple relevant biochemical and biological end points without personal bias. Furthermore, given the increased interest in the application of sophisticated cell-based assays [12] (in an effort to more effectively recapitulate disease biology and thereby improve the likelihood of identifying compounds that show efficacy in humans), more rigorous compound prioritization aided by automated approaches could be particularly important because these assays are not always suitable for high-throughput compound testing [13,14].

The potential value of more fully integrated automated systems in drug discovery is substantial. However, as with past technological advances that have raised hopes of revolutionizing drug discovery (but often not lived up to expectations), it is important to look beyond the hype, for example, around automated high-throughput combinatorial synthesis, ‘big data’ and artificial intelligence. This article aims to identify the key approaches and technologies that could be implemented robustly by medicinal chemists in the near future and to critically analyse the technological and conceptual challenges of doing so in the context of workflows in industry. It first summarizes the state of the art in the application of automated systems in separate aspects of the ‘design–synthesize–test–analyse’ cycle and then

NATURE REVIEWS | DRUG DISCOVERY VOLUME 17 | FEBRUARY 2018 | 97 ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


discusses progress in the integration of these aspects to fully harness the potential of automation in drug discovery.

[Figure 1 diagram labels: Start, Hypothesis, Design, Synthesis, Test, Assay, Learning, End. Deduction: chemical intuition; combinational approaches; molecular modelling; de novo methods. Induction: chemical intuition; team intelligence; (Q)SAR modelling; machine learning.]

Figure 1 | The molecular design cycle. Starting off from results obtained by high-throughput compound screening, fragment screening, computational modelling or data from the literature, this feedback-driven discovery process alternates between deduction and induction, eventually leading to optimized hit and lead compounds. Smart automation of the individual parts of the cycle can help to reduce randomness and error, thereby supporting less wasteful, more productive and efficient drug discovery. Miniaturization and advanced lab-on-a-chip technology, together with machine learning methods, represent enabling technologies. The whole design cycle can also be performed completely inside a software program. These adaptive de novo design methods are equipped with both chemical knowledge for in silico compound synthesis and meaningful virtual screening models as surrogates for biochemical and biological tests, while active learning algorithms enable chemical space navigation towards compounds with promising properties. Note that the terms ‘deduction’ and ‘induction’ in the context of drug discovery are not always used in a strictly logical sense. Induction refers to explanatory reasoning in generating hypotheses. Deductive inference necessarily results in a true statement if the underlying hypothesis is true. Because a hypothesis in drug discovery is based on incomplete, error-prone experimental data, the term ‘abduction’ may be formally better suited. (Q)SAR, (quantitative) structure–activity relationship.

Automation in molecule design

Medicinal chemists select, design and prioritize molecular structures on the basis of factors including the desired biological activity of the compounds, other characteristics important for drugs (such as absorption, distribution, metabolism, excretion and toxicity (ADMET) properties), the availability of compounds and synthetic feasibility (if the compounds are being synthesized rather than being sourced from existing libraries or commercial suppliers). Consequently, medicinal chemists routinely face complex multidimensional optimization problems, with the importance of different parameters changing as the drug discovery process progresses from the identification of initial screening hits (when identifying compounds with the relevant biological activity is crucial) via hit-to-lead expansion (which often requires massive synthetic effort to improve compound activity and developability) towards the selection of clinical candidates (when there may be a need to compromise to achieve the best possible mix between desirable biological activity and desirable ADMET properties). Given the vast size (cardinality) of the relevant ‘chemical space’, which is estimated to be in the range of 10^30–10^60 drug-like molecules, the key challenge for medicinal chemists could be summed up as ‘what to make and test next?’ Automated drug discovery platforms must be able to provide the right answers to this question.

Chemical design concepts. Traditionally, compound selection and/or design was the sole domain of medicinal chemists, drawing on their expert knowledge and providing a substantial role for intuitive decision making. Over the past two decades, various broad concepts have emerged to help guide compound library design, hit-to-lead expansion and the enrichment of compound collections with new chemical entities. For example, diversity-oriented synthesis (DOS) provides a rationale for generating collections of small molecules with diverse functional groups, stereochemistry and frameworks in a controlled fashion [15,16]. Following this concept, Maurya and Rana [17] recently reported on the diversification of macrocycles by carbohydrate-derived building blocks. As a complement to DOS, biology-oriented synthesis (BIOS) takes natural products as templates for generating synthetically accessible derivatives and mimetics [18,19], often relying on natural product-derived scaffolds [20]. Finally, so-called function-oriented synthesis (FOS) [21] strategies take the BIOS concept to the next level by aiming to recapitulate or tune the function of a biologically active lead structure to obtain simpler scaffolds, increase their ease of synthesis and achieve synthetic innovation [22]. A recent example of the FOS approach is the successful design of oxazolidine derivatives with antibiotic activities as simplified analogues of the structurally intricate caprazamycin from Streptomyces [23].

A wide range of guidelines that aim to improve the lead-likeness or drug-likeness of compounds have also been introduced, beginning with Lipinski’s recommendations (often referred to as the ‘rule of 5’) [24,25] and combined ligand efficiency (LE) and lipophilic ligand efficiency (LLE) values, which can be applied automatically or semi-automatically as computational filters for existing compound libraries or candidates for synthesis (see REFS 26–28 for reviews). Early applications of artificial neural networks have contributed to rationalization of the drug-likeness concept in more sophisticated abstract terms and enabled on-the-fly computational compound profiling [29,30]. Importantly, it has been realized that compound quality can be controlled by appropriate lead selection and optimization based on informed decisions rather than by the naive application of empirical rules [31]. Today, fully fledged in silico decision-support systems that greatly extend and augment such concepts and guidelines can assist medicinal chemists in multi-objective compound design, selection and prioritization [32,33]. A consequent ‘predict first’ mindset has recently been advocated by researchers at Merck, drawing from positive experiences with their own integrated design–make–test activities [34]. The concepts and guidelines have been reviewed comprehensively in the articles cited above, and thus this article focuses on some selected illustrative examples, as well as the limitations and challenges of autonomous computational selection and design of compounds.

Automated de novo design. Importantly, the probabilities of the underlying research hypotheses are recorded as experimental metadata and stored in databases, which enables automated semantic analysis, generating both revised design hypotheses and deriving new examples (that is, chemical entities) for testing [35,36]. Numerous automated compound generators and selection operators have been conceived for this purpose, some of which use certain classes of ‘deep’ machine learning methods; for example, generative and recurrent neural networks [37,38], inverse quantitative structure–activity relationship models [39–41] and reaction-based compound assembly techniques [42].
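The guideline-based filters mentioned in this section (Lipinski's 'rule of 5', ligand efficiency and lipophilic ligand efficiency) are simple enough to apply automatically. The following Python sketch is illustrative only and is not taken from any of the cited tools. It assumes that molecular descriptors have already been computed elsewhere; the `Compound` record, the function names and the cut-offs `max_violations=1` and `min_le=0.3` are hypothetical choices for the example. LE uses the common approximation LE ≈ 1.37 × pKi / (number of heavy atoms), in kcal/mol per heavy atom at about room temperature, and LLE is taken as pKi minus logP.

```python
import math
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record; descriptor values are assumed to be precomputed."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors
    heavy_atoms: int    # number of non-hydrogen atoms
    ki_molar: float     # inhibition constant Ki in mol/L


def rule_of_5_violations(c: Compound) -> int:
    """Count how many of Lipinski's four limits the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def p_ki(c: Compound) -> float:
    """Negative decadic logarithm of Ki (Ki given in mol/L)."""
    return -math.log10(c.ki_molar)


def ligand_efficiency(c: Compound) -> float:
    """Approximate LE in kcal/mol per heavy atom (1.37 ~ 2.303 * R * T)."""
    return 1.37 * p_ki(c) / c.heavy_atoms


def lipophilic_ligand_efficiency(c: Compound) -> float:
    """LLE as potency minus lipophilicity."""
    return p_ki(c) - c.logp


def passes_filter(c: Compound, max_violations: int = 1, min_le: float = 0.3) -> bool:
    """Illustrative cut-offs only; real projects tune these per target."""
    return rule_of_5_violations(c) <= max_violations and ligand_efficiency(c) >= min_le


if __name__ == "__main__":
    # Made-up candidates, not compounds from the article.
    candidates = [
        Compound("candidate_1", 350.0, 3.0, 2, 5, 25, 1e-8),
        Compound("candidate_2", 650.0, 6.2, 6, 12, 46, 1e-6),
    ]
    for c in candidates:
        print(c.name, rule_of_5_violations(c),
              round(ligand_efficiency(c), 2),
              round(lipophilic_ligand_efficiency(c), 2),
              passes_filter(c))
```

As the text notes, such rules are best treated as one input to an informed decision rather than applied naively; the sketch only shows how cheaply they can be evaluated over a library.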


Figure 2 | Automated drug discovery facilities. a | Millions of compound samples are stored in compact high-capacity facilities and handled by robots. b | Robot systems perform both high-throughput and medium-throughput screening of up to ten thousand samples per day to determine the activity against the biological target of interest. Multiple arms and flexible workstations enable fully automated liquid dispensing, compound preparation and testing. These storage and screening systems have become cornerstones of contemporary drug discovery. c | A prototype of a novel miniaturized design–synthesize–test–analyse facility for rapid automated drug discovery at AstraZeneca is shown. Images a and b courtesy of Jan Kriegl, Boehringer–Ingelheim Pharma; image c courtesy of Michael Kossenjans, AstraZeneca.

De novo molecular design methods in particular have matured enough to be applicable in prospective settings and are now receiving increasing attention. FIGURE 3 presents examples of recent compounds that were obtained by fully autonomous or semi-autonomous de novo computational design. In each of these cases, a computer-generated molecular design hypothesis guided the decision of which compound to make next. The first example (FIG. 3a) demonstrates how computational target prediction can prioritize combinatorial compound assays. A focused imidazopyridine (compound 1) library was obtained by linear microfluidic synthesis on a chip, with the building block selection performed by an ant colony algorithm and multi-target activity predictions [43]. Several active molecules, such as compound 2, were obtained within minutes. The results of this study provide support for the close integration of microfluidics-assisted synthesis with computer-based target prediction as a viable approach to rapidly generate bioactivity-focused combinatorial compound libraries with high success rates. We revisit this design concept in more detail in the subsequent sections of this article.

The second example (FIG. 3b) showcases the benefits of using virtual library enumeration in concert with target-panel prediction for focused library design and building block selection. Compounds 3–6 originated from the same chemical space accessible by reductive amination reaction products but possess different target preferences, validating the computational selection strategies employed. Compounds 3 and 4 were identified as potent and target-subtype selective ligands and synthesized in flow on a microfluidics chip [44]. Compound 5 was obtained as a target-subtype selective serotonin receptor 5-HT2B antagonist based on computational prediction, with no activities towards a large panel of off-targets [45]. By contrast, compound 6 was deliberately designed as an ‘ultimately promiscuous’ ligand, without showing aggregation in solution or possessing undesired frequent-hitter properties [46]. Importantly, very few compounds had to be synthesized to reach the design objectives.

The example shown in FIG. 3c demonstrates the advantageous interplay between ligand-based and structure-based hypothesis generation for scaffold hopping. With the known drug fasudil (a vasodilator, potent Rho kinase inhibitor and moderate inhibitor of death-associated protein kinase 3 (DAPK3)) as a template, computational de novo design suggested several scaffold hops [47]. A target prediction method relying on self-organizing neural networks prioritized these frameworks to obtain a novel DAPK3 inhibitor, compound 8. Subsequent crystallographic studies confirmed the binding of inhibitor 8 in the ATP–substrate pocket of the kinase (Protein Data Bank identifier: 5a6n). On the basis of the known binding mode of the de novo generated ligand, the diuretic drug azosemide (compound 9) could be identified as a DAPK3 inhibitor. This particular study succeeded in lead identification through the combination of automated scaffold hopping and experimental structure determination.

Compounds 10 and 11 are examples of computationally optimized ligand structures, starting from weaker or less selective precursors [48,49] (FIG. 3d). In both cases, the design–synthesize–test cycles were guided by computational design methods trained on publicly available activity data, epitomizing the aforementioned ‘predict first’ philosophy.

The last de novo design example shown in FIG. 3e highlights the concept of automated morphing of natural products into synthetically accessible, isofunctional compounds, and illustrates the FOS design concept introduced previously. The natural anticancer compound (−)-englerin A (compound 12) [50], which is synthetically accessible in a 14-step process [51], was computationally (and by subsequent manual refinement) converted into compound 13, which could be afforded in only three synthetic steps [52]. Both compounds potently block transient receptor potential cation channel subfamily M member 8 (TRPM8) calcium channels, as correctly predicted by the software.

These selected examples of computer-assisted molecular design illustrate some of the potential of contemporary in silico methods for hypothesis generation. There is no doubt that state-of-the-art computational de novo design delivers new synthesizable chemical entities with desired properties. Multi-objective compound selection strategies have shown their applicability to de novo design, which is not only useful for prioritizing chemically attractive lead-like and drug-like molecular structures but also relevant in light of ligand–target promiscuity (estimates range between up to


[Figure 3 panel labels; chemical structure drawings not reproduced.]
a | Building blocks for multi-component reaction; on-chip synthesis; in silico target prediction. Compound 1: imidazopyridine scaffold. Compound 2: α1BR, Ki = 2–3 μM (LE = 0.4).
b | Building blocks for reductive amination; in silico enumeration (2,469,832 virtual products); in silico target prediction; on-chip and batch synthesis. Compounds 3–6, labels in the order shown: σ1R, Ki = 2 nM (LE = 0.61); D4R, Ki = 12 nM (LE = 0.46); 5-HT2BR, Ki = 251 nM (LE = 0.42); multi-target ligand. Further labels: 5-HT2AR, Ki = 3.4 μM; 5-HT2CR, Ki = 18 μM.
c | In silico de novo design; in silico target prediction; synthesis. Compounds 7–9, labels in the order shown: fasudil; DAPK3, IC50 = 52 μM; azosemide; DAPK3, IC50 = 5 μM (LE = 0.48); DAPK3, IC50 = 2 μM.
d, e | In silico de novo design, target prediction; synthesis. Compounds 10–13, labels in the order shown: VEGFR1, Ki = 2,400 nM; D3R, Ki = 867 nM; (−)-englerin A; TRPM8, Ki = 0.2 μM; VEGFR2, Ki = 64 nM; σ1AR, Ki = 3 nM; TRPM8, Ki = 0.4 μM.

Figure 3 | Examples of automated computer-assisted de novo design as an enabling technology. a | A focused library of compounds with an imidazopyridine scaffold (compound 1) was synthesized on a microfluidics chip, based on the Ugi three-component reaction [43]. Coupling building block prioritization to a computational method for predicting ligand–target association led to the rapid identification of several ligands for G protein-coupled receptors (GPCRs), such as the α1A and α1B adrenoceptor antagonist shown (compound 2). b | Integration of computational activity prediction at GPCRs with microfluidics-assisted synthesis based on a reductive amination reaction enabled the identification of ligands with various binding profiles (compounds 3–6) [44–46]. c | Automated scaffold hopping from the drug fasudil (known to be a moderate inhibitor of death-associated protein kinase 3 (DAPK3); compound 7) and structure determination enabled the identification of a novel DAPK3 inhibitor (compound 8). On the basis of its binding mode determined by crystallographic studies, the diuretic drug azosemide (compound 9) was identified as a DAPK3 inhibitor [47]. d | Compounds 10 and 11 are examples of ligand structures that were computationally optimized from weaker or less selective precursors by using design methods trained on publicly available activity data [48,49]. e | The natural product (−)-englerin A (compound 12) was computationally morphed [52] into the synthetically accessible compound 13; both compounds inhibit the transient receptor potential cation channel subfamily M member 8 (TRPM8). IC50, half-maximal inhibitory concentration; Ki, inhibition constant; LE, ligand efficiency; VEGFR, vascular endothelial growth factor receptor.
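The workflow behind panel b (enumerate a virtual reductive amination library, score every virtual product with a target-prediction model, then select building blocks) can be sketched in a few lines. This is a toy illustration and not the method used in the cited studies: the building-block names are invented, and `predicted_activity` is a deterministic placeholder standing in for a trained target-prediction model.

```python
from itertools import product

# Hypothetical building blocks; a real library would hold chemical structures.
AMINES = ["amine_A", "amine_B", "amine_C"]
CARBONYLS = ["aldehyde_X", "aldehyde_Y"]


def enumerate_library(amines, carbonyls):
    """Every amine/carbonyl pair yields one virtual reductive amination product."""
    return list(product(amines, carbonyls))


def predicted_activity(amine, carbonyl):
    """Placeholder score in [0, 1); stands in for a trained prediction model."""
    return (sum(map(ord, amine + carbonyl)) % 100) / 100.0


def prioritize(library, top_n):
    """Rank virtual products by predicted activity and keep the best top_n."""
    scored = [(predicted_activity(a, c), a, c) for a, c in library]
    scored.sort(reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    library = enumerate_library(AMINES, CARBONYLS)
    print(len(library), "virtual products")
    for score, amine, carbonyl in prioritize(library, top_n=2):
        print(f"{amine} + {carbonyl}: predicted score {score:.2f}")
```

The library size grows as the product of the building-block counts, which is why a few thousand reagents can yield millions of virtual products and why only a small, prioritized subset is sent to synthesis.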


[Figure 4 scheme; chemical structure drawings not reproduced. Step labels shown in the scheme: deprotection, coupling, purification.]

Figure 4 | Automated formation of C–C bonds to yield structurally diverse products. The example demonstrates the concept of sequential boronate building block assembly. Four building blocks (coloured circles) are combined in a standardized deprotection, coupling and purification process. Synthesizers implementing this and other combinatorial reaction schemes can serve as modules in automated drug discovery platforms. Adapted with permission from REF. 74, Science/AAAS.

5 and 11 pharmacologically relevant targets per drug) [53–56]. The logical next step is to combine these and related techniques with synthesis and compound testing in an integrated discovery platform.

Automation in compound synthesis

The automation and parallelization of chemical synthesis offer benefits such as increased speed and throughput, greater reproducibility, lower consumption of materials and, consequently, the possibility to explore wider areas of chemical space within a given time frame compared with manual, serial compound synthesis [57]. Historically, the first automated synthetic processes and robots were conceived for peptides [58,59] (Merrifield’s method for bond formation), oligonucleotides [60,61] (solid-phase phosphoramidite method for internucleotide linkage) and later for oligosaccharides [62] (for example, the trichloroacetimidate method for glycosidic bond formation).

A key element in each of these processes is the use of a small set of building blocks (including larger fragments) and a well-defined, robust chemical reaction to afford large sets of diverse products in high yields by iterative building block assembly, orthogonal protection group chemistry and purification. Various methodological and technical improvements, including stereoselective synthesis, parallelization of subprocesses and preparatory steps, miniaturization (small volumes and compact synthesis arrays) and automated in-line purification, have resulted in highly reliable synthesis machines for increasingly complex oligomeric structures. Their underlying general design concept mimics the biosynthesis of most natural products. Furthermore, combinatorial thinking has led to methods for the massively parallelized scaffold-centric synthesis of structurally diverse compound libraries [63]. Many of these approaches are readily amenable to miniaturization and inclusion in automated design cycles [64]. Researchers at Eli Lilly have established a superb example of such a fully automated robotic synthesis laboratory that can be remotely controlled, which is a major step towards advancing the efficiency and effectiveness of chemical synthesis for drug discovery [65,66].

Some reaction schemes have been shown to be more agreeable than others for straightforward automation and parallelization [67,68]. Typically, these reactions do not require exotic reaction conditions, can be standardized, are amenable to a wide variety of (readily available or obtainable) educts and can be optimized for maximum yield. Prominent examples include scaffold-forming reactions (for example, the Pictet–Spengler reaction and metathesis reactions) [69,70]. Other desirable linkage reactions (for example, palladium-free C–C bond forming reactions) have scarcely been used in automated synthesis set-ups [71,72].

However, automated discovery processes may be crucial for exploring new chemistry [73]. One of the most versatile automated synthesis platforms for drug-like small molecules to date was developed by Burke and co-workers [74]. The synthesis of Csp3-rich macrocyclic and polycyclic natural products, pharmaceuticals and natural product-like cores was achieved by iterative building block assembly via automated C–C bond formation and cyclization reactions [75] (FIG. 4). Cartridged bifunctional N-methyliminodiacetic acid (MIDA) boronate building blocks were prepared for this purpose, complementing the commercially available samples. Importantly, a small set of building blocks was sufficient for generating remarkable structural core diversity in the final products. The authors developed an in-line catch-and-release purification protocol for realizing a seamless three-step reaction cycle. Similarly to the automated synthesis of oligomers, this important advancement in automated synthesis was enabled by standardizing the synthesis and purification processes involved.

Microfluidics-based synthesis. ‘From batch to continuous’ is a general trend in industry and not limited to chemical production processes [76,77]. Evidently, miniaturized microfluidic synthetic and analytical devices will play a central role in drug discovery automation. Microfluidic reactors integrated with real-time product detection and a command-and-control system can, in theory, perform and analyse thousands of reactions on timescales that are not possible with conventional macroscale technologies. Embracing such advantages demands the substitution of widespread, but inefficient, one-parameter-at-a-time methods with more sophisticated and specialized algorithms. For example, trial-and-error scanning of the experimental parameter space can identify local optima but often fails to find global optima. In the field of medicinal chemistry, reagents and products are often expensive. Furthermore, many reagents and intermediates have unknown hazards and must be treated with extreme caution. Microfluidics can offer an advantage by decreasing opportunities for human exposure and minimizing material usage [78]. There are also several other technologies that can be used for this purpose. For instance, acoustic liquid handling systems for precision droplet dispensing are well-accepted tools in chemical synthesis that increase the reproducibility of experiments and reduce the amount of consumables needed, thereby cutting costs [79,80]. Exceptionally high precision has been reported for transferring microlitre droplets into well plates [81]. Nevertheless, each automation process requires skilled chemists and solid chemical engineering, as the


individual usage of acoustic droplet ejection and its applicability depend on the types of liquids and mixtures handled [82].

As a distinct feature of microfluidics systems, converging streams of fluids flow in parallel without turbulence (that is, the conditions of laminar flow are fulfilled), with characteristically low Reynolds numbers (the ratio of inertial forces to viscous forces, a dimensionless parameter indicating whether a flow condition will be laminar or turbulent) [83]. In addition to allowing miniaturized bioassays in flow, this property of microfluidics systems enables fine-tuned, diffusion-controlled synthetic reactions [84]. The short distances in microfluidic channels guarantee the desired rapid and controlled transport of heat and mass. Complex channel geometries, pulsed flow conditions and the high surface-to-volume ratio of miniaturized reactors can result in a dramatic increase in throughput and yield in microreactors [85].

Ley and colleagues pioneered the field of flow chemistry, which has numerous practical applications in drug discovery; for example, the synthesis of imatinib in flow [86], the translation of four sequential steps into a continuous-flow system to generate (E/Z)-tamoxifen with 100% conversion and 84% yield [87] and numerous natural product syntheses [88]. Their seminal work has introduced single-step and multistep microscale and mesoscale flow systems, which enable otherwise difficult reactions with low yields or reactions that require special safety measures to be performed, such as ozonolysis [89–91]. Warrington and co-workers have explored numerous reactions and microreactor designs, which have paved the way for advanced applications [92–95]. The technical capability of multistep continuous-flow synthesis was demonstrated by the Ley group in the generation of key intermediates for the synthesis of the polyketide spirangien A [96]. This high-yielding system consists of heterogeneous reactor coils and microfluidics components, requiring minimal downstream processing.

Some of these techniques are already being applied in the pharmaceutical industry. For example, researchers at the Novartis–Massachusetts Institute of Technology (MIT) Center for Continuous Manufacturing succeeded in assembling a compact system for the continuous end-to-end synthesis of diphenhydramine hydrochloride, lidocaine hydrochloride, diazepam and fluoxetine hydrochloride in qualities that meet US Pharmacopeia standards [97]. Continuous-flow syntheses have also been used early on to obtain drug-like combinatorial compound libraries with heterocyclic scaffolds [98,99].

Nagaki and co-workers noted the specific advantage of flow microreactors to enable ‘flash’ chemistry reactions that cannot be performed in batch [100]. The high-resolution reaction time control possible in microreactors allows access to a multitude of otherwise difficult synthetic procedures [101]. One such prominent example is the sequential synthesis of the subtype-selective retinoic acid receptor-α (RARα) ligand TAC-101 with a total on-chip residence time of 13 seconds and a productivity of 100–200 mg min⁻¹ (REF. 102). Another example is the high-temperature, high-pressure continuous-flow synthesis of 1H-4-substituted imidazoles [103]. The use of microfluidics technology to simulate the cytochrome P450-catalysed oxidation of drug molecules bears the promise of substituting in vitro metabolite identification by on-chip chemotransformations of compounds in the near future (for example, aromatic hydroxylation, C–H oxidation, glutathione conjugation and sulfoxidation) [104,105]. For further instances of advanced continuous-flow applications in chemical synthesis, see the topical review by Britton and Raston [106].

Automated optimization of reaction conditions. Single-step and multistep syntheses can be optimized by feedback control [107]. Jensen and co-workers [108] pioneered self-optimizing microscale and mesoscale reactor systems, for example, for C–C bond forming reactions. A recent example of such reaction optimization by suitable algorithms to achieve the maximum product yield, highest throughput and lowest production cost is the palladium-catalysed Heck–Matsuda arylation reaction [109]. Our group used microfluidic synthesis with in-line analytics to determine the optimal flow rate, temperature range, catalyst loading and reagent concentrations for continuous imidazopyridine formation on a chip [43]. Comparable conversion rates were obtained in a microwave procedure, albeit with much longer reaction times (15 min in the microwave reactor versus 0.3 s in flow). In-line mass spectrometry has enabled the optimization of atropine synthesis in microdroplets obtained by preparative electrospray (ES), as recently demonstrated by researchers from Purdue University [110]. They devised several continuous-flow set-ups with multistep or telescoped preparative ES, yielding up to 47% conversion of the starting material to atropine in residence times of a few minutes. Microfluidics techniques have also simplified the set-up and improved the functions of ambient mass spectrometry by integrating probe sampling and ES on a single glass microchip [111].

Nevertheless, there are limitations to continuous-flow systems, including the (in)stability of the fluidic interfaces between microscopic and macroscopic fluid handling and the deposition of reactive by-products, and automated batch synthesis and fast parallel synthetic strategies have been suggested as alternatives [112]. For example, researchers at Merck recently presented their ‘chemical high-throughput experimentation’ (HTE) platform in 3,456-well microtitre plates, aiming to optimize a key synthetic step in a drug discovery programme. HTE successfully identified the preferred catalyst, reaction conditions, reagents and solvents for the given transformation. The authors conclude that hypothesis-driven HTE allows a scientist to ‘go fast’ and may be considered the logical extension of traditional chemical experimentation [113]. Chow and Nelson [114] have argued that automated HTE discovery workflows may enable expansion of the synthetic chemistry toolkit and increase innovation in medicinal chemistry.

An advantage of batch approaches, namely the ability to collect data from many time points in a single experiment, and a limitation of one-at-a-time flow experiments, has been addressed by recording time-series reaction and interaction data in-flow for kinetic analysis [115]. Similarly, microfluidics systems are no longer restricted to single-step reactions. For all these applications, in-line spectroscopy and purification of intermediates are vital to ensure maximal yields. Various fluorescence-based and infrared-based detectors, as well as Raman, NMR and mass-spectrometric analytical devices, have been integrated into continuous mix and flow systems [116–118]. Steady progress in miniaturized manufacturing of analytical devices facilitates system integration. In particular, 3D printing provides opportunities for building versatile multifunctional microfluidics modules with embedded in-line reaction monitoring and analytical capability [119].

Droplet reactors. Although there are several off-the-shelf instruments available (for example, for hydrogenation reactions), the majority of current microfluidics platforms require a custom set-up, and one

102 | FEBRUARY 2018 | VOLUME 17 www.nature.com/nrd ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


should carefully weigh the pros and cons of microfluidic versus batch technologies before deciding on a particular technology.

Coupling the individual components is an engineering challenge. The majority of platforms currently being introduced in industry for the automated parallel synthesis of small, focused compound libraries seem to operate without making extensive use of microfluidics-assisted chemical synthesis, probably because for certain microfluidic reactors, clogging of the reactor channels and leakage due to back-pressure issues or incompatibility of the solvents and materials remain a major problem. Performing chemical flow reactions in droplet environments offers a potential solution to several of these problems. Droplets may be considered isolated mini-reactors with volumes reduced to the femtolitre scale120,121, facilitating sorting and process control122. DeMello and co-workers123,124 have demonstrated that droplet-based microfluidics systems are precise tools for studying and optimizing the synthetic parameters of chemical reactions, leading to the production of materials with superior characteristics (FIG. 5).

Figure 5 | Chemical synthesis in microfluidics droplet reactors. The image shows a microreactor channel with droplets containing multinary (Cs/FA)Pb(Br/I)3 perovskite nanocrystals123. Each droplet exhibits different, composition-dependent emission under ultraviolet excitation, revealing the compositional gradient along the reactor. The flow rates of the individual precursor streams provided control over reaction times as well as precursor concentration ratios. This example from the field of nanomaterials demonstrates the unique capabilities of droplet-based synthesis for the production of chemical matter. Image courtesy of Andrew J. deMello and Richard Maceiczyk, ETH Zürich.

A challenge for drug discovery is the slow reaction time of many chemical transformations. Furthermore, any realistic application of such high-throughput miniaturized synthetic devices in drug discovery requires rapid in-line analytics of the generated products. Belder and co-workers125 have recently presented a droplet-based microfluidics system with seamless coupling to ES–mass spectrometry. In a proof-of-concept study, they applied the device to an amino-catalysed domino reaction in nanolitre droplets (Knoevenagel condensation followed by an intramolecular hetero-Diels–Alder reaction), with only picomolar amounts of catalyst needed. The greatly increasing numbers of applications and technological advances in the field of continuous microfluidic synthesis showcase the potential of these platforms for the high-throughput generation of diverse chemical entities for subsequent testing. The concept of continuous microfluidic reactors, which were originally designed for the continuous production of single compounds, has been augmented by their suitability for producing many compounds within very short time frames.

Microfluidics technologies for screening

The use of miniaturized microfluidics devices not only supports chemistry but also enables the use of human cell lines, biopsy material and organ models for screening, thereby helping to address the well-known issues with species-specific variations and poorly predictive animal models126,127. For example, liver-on-a-chip technology based on human hepatocytes can be used to swiftly screen compounds for cytochrome P450 binding to substrates and inhibitors, as well as subsequent high-performance liquid chromatography (HPLC)–mass spectrometry for metabolite identification128. Combined with computational predictive models, this technology is ready for prospective practical application129. Cancer-on-a-chip systems that use single cells or 3D cancer models bear the promise of replicating the pathophysiology of human tumours and tumour environments in vitro130,131. Again, as with the many other organ-on-a-chip models, this technology has the potential to produce relevant readouts within short time frames and to enable informed hit and lead prioritization and optimization.

Physiologically relevant microfluidic environments are stable over weeks and have a footprint of a few square millimetres. For example, Loskill et al.132 recently presented a white adipose tissue (WAT)-on-a-chip system, allowing drug–WAT interactions to be studied by convective transport. Cao et al.133 reported a microfluidics system for rapid epigenetic DNA scanning to monitor drug effects on stem cells, using as few as 100 cells. Microfluidics platforms have been developed for the high-throughput (thousands of samples) analysis of DNA methylation patterns in low volumes on a chip, greatly extending chemical base modification studies for epigenetics-related drug effects134. Dittrich and co-workers135 demonstrated the possibility of determining the concentration of intracellular cAMP in response to extracellular stimuli in single cells, thereby greatly extending the capabilities of continuous chip-based assay systems for measuring relevant biochemical parameters for drug discovery. In addition, 3D triple co-culture microfluidics devices have been established as functional surrogates for the blood–brain barrier136.

Advanced nanotechnology offers even farther-reaching opportunities such as micromachines (nanobots) for drug delivery137. In fact, the prospect of combining nanotechnological devices with on-chip testing of computationally designed compounds does not seem far-fetched. Advances in chemical imaging further augment the capabilities of on-chip monitoring, for example, by miniature arrays for high-resolution peak analysis138. ‘Plug-and-play’ microfluidics modules are the next step towards fully integrated on-chip drug discovery. Miled and co-workers developed such a modular lab-on-a-chip device for automated monitoring and modulating of the concentrations of neurotransmitters such as dopamine and serotonin, thereby opening new possibilities for functional drug screening with feedback control139.

Integration for automated design cycles

Coupling synthesis and testing. The Automated Lead Optimization Equipment (ALOE) platform is a prototypical example of an adaptive molecular design process140. Its software control contains an algorithm for building predictive bioactivity models and prioritizing the selection of starting materials for subsequent rounds of on-chip compound generation. The system can adapt to the underlying structure–activity relationship (SAR) and rapidly find optima in chemical space, with low reagent consumption. Basic schematics of integrated microfluidics synthesize-and-test platforms are shown in FIG. 6, and a selection of applications is listed in TABLE 1. These methods operate on small volumes of fluids in geometrically

NATURE REVIEWS | DRUG DISCOVERY VOLUME 17 | FEBRUARY 2018 | 103 ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


well-controlled environments composed of different functional units, for example, dispensers, mixers, reactors and detectors. Solvent exchange may be required when transferring newly synthesized compounds to biochemical or biological testing, which is typically performed in aqueous media. Some of the integrated flow systems allow for slow solvent mixing and direct in-line testing. Fast evaporation and reformatting have also proved suitable and may represent an alternative working solution, especially in combination with batch synthesis. For example, researchers at Cyclofluidics developed a flow technology platform integrating the key elements of adaptive SAR modelling to the discovery of novel ABL1 kinase inhibitors141. Similarly, Tseng and co-workers142 devised a complex microfluidics chip for ‘click’ chemistry and subsequent hit identification. In their proof-of-concept study, throughput was limited by the employment of an eight-channel mass spectrometer for reaction monitoring, but the authors argue that substantially higher throughput could be achieved by expanding the instrumentation.

Figure 6 | Schematics of integrated microfluidics-assisted synthesize-and-test platforms. The classic linear layout shown in part a does not contain automated feedback from the assay to the reagent selection, whereas the cyclic layout shown in part b includes an adaptive computer model for reactant prioritization based on the assay readout. LC, liquid chromatography; MS, mass spectrometry; UV, ultraviolet light.

For biological experimentation and integration with chemical synthesis devices, droplet microfluidics systems and biological readouts from single cells seem to be reasonable choices143,144 (FIG. 7). These systems are suitable for creating concentration gradients and generating microdroplets of varying compositions for biochemical and cell-based screening applications. Similar to chemical microreactors, compared to single-layer microfluidics systems, 3D droplet-based systems have been shown to be more efficient and amenable to ultra-high-throughput analysis145. Droplets are especially suitable for performing enzyme-controlled processes146,147 and may contain cells for probing drug effects in continuous flow148. In this way, single cells may be addressed, thereby eliminating potential issues of readout interpretability caused by cell heterogeneity, for example, for studying cancer cells149. Often, a fluorescence-based readout of phenotypic drug effects is obtained for further analysis150. The rapidly developing and progressing field of microfluidics-assisted lab-on-a-chip platforms has recently been reviewed by Nakajima and co-workers151.

The full automation of compound synthesis also requires reliable planning tools for synthesis and retrosynthesis. In fact, numerous such programmes have been conceived, dating back to Corey’s pioneering work from the 1960s152, employing rigorous physical models (for example, reactivity prediction), rule-based approaches (for example, synthons and reaction schemes) or empirical models (for example, precedent-based database searching). Classic approaches have been reviewed elsewhere153–155. Their main drawbacks are their limited scope and often inaccurate results caused by insufficient chemical background knowledge captured by the software tools, paired with low execution speed.

Current computational tools are largely data driven. For example, ReactionExplorer is based on thousands of manually curated rules (electron-transfer steps) that represent basic chemical transformations to devise a mechanistic interpretation of a plausible reaction pathway156. More recently, machine learning models have been developed for automated synthesis planning, enabled by large curated reaction databases. ReactionPredictor is such a method and automatically identifies and ranks electron-transfer steps by use of a simplified molecular orbital description157. The number of prospective applications of these and other tools is still limited, and there is not much experience, if any, with integrating such tools in automated synthesis platforms. However, the continuously growing ‘Network of Organic Chemistry’ (NOC) contains approximately ten million reactions and reactants for synthesis planning158. One may consider such a collection of facts ‘big data’ in chemistry. Szymkuc et al.159 presented an innovative approach to reaction pathway construction based on NOC, using fast graph-analysis methods borrowed from bioinformatics. These algorithms are able to efficiently navigate through the entire breadth of chemical synthesis knowledge to identify optimal synthetic pathways. Alternative synthetic routes leading from the reactants to the products are compared using a function that includes the number of steps and the cost of synthesis. Finally, algorithmically identified optimal syntheses are obtained.

These and related data-driven machine learning approaches, with continuously increasing accuracy and chemical reaction space coverage, are no longer science fiction and will enable fully integrated drug discovery platforms to be built. One such straightforward approach implements a combination of forward reaction templates for generating a set of chemically plausible candidate products and a machine learning classifier for virtual product scoring160. This system is based on more than one million reactions compiled from United States patent literature. Importantly, the model does not predict quantitative yields but merely spots plausible true reaction products in the pool of potential solutions. Although this overall concept may not be entirely new, the availability of suitable reaction databases and advanced machine learning models has enabled the development of robust classifiers.

Artificial intelligence in molecular design. Aside from the required robotic hardware and synthesize-and-test machinery, the learning aspect probably represents the most crucial part of the automated design cycle. If the design hypothesis is wrong, then even


Table 1 | Selected examples of microfluidics-assisted synthesize-and-test platforms for hit identification and optimization

Synthetic reaction | Biological assay | Description | Refs
Reaction scheme (reagents shown: DMF; MeOH, NaOH) | T cell tyrosine phosphatase assay | Pioneering microfluidics synthesize-and-test system using a Caliper chip | 246
Huisgen cycloaddition | Bovine carbonic anhydrase II (bCAII) inhibition | Integrated in situ microfluidic ‘click’ synthesis, reaction monitoring, purification and binding assay | 142
Six steps to TAC-101 (reagents shown: BuLi, Me3SiCl, aq. NaOH) | Retinoic acid receptor-α (RARα) binding | Multistep ‘flash’ chemistry in an integrated microfluidic reactor system consisting of six mixers and six tube reactors | 110
Sonogashira coupling | ABL1 kinase inhibition | Integrated microfluidic synthesis platform with active machine learning for reagent selection | 141
Reaction scheme (reagent shown: DMTMMT) | β-Secretase 1 (BACE1) inhibition | Continuous hit optimization by microfluidic synthesis with in-line purification, fractionation and chip-based bioassay | 247
Reaction scheme (conditions shown: MsOH, 90 °C; 150 °C) | Dipeptidyl peptidase 4 (DPP4) | Integrated compound design, synthesis, purification, quantification, dilution and bioassay for rapid generation of information on the structure–activity relationship | 248
Two-step reaction scheme (conditions shown: DMF, 150 °C; DMF, 200 °C) | HIV replication (phenotypic assay, off-line) | Continuous microfluidic two-step synthesis and ligand optimization | 249
Buchwald transformation | Polycomb protein EED | Integrated robotic library synthesis–test platform using parallel chemistry | 10


the most advanced synthesize-and-test approach will fail to deliver, irrespective of the technology used. It is important to note that if we can achieve partial predictability of SAR models in this situation and build on iterative adjustments of our underlying molecular design hypothesis, we can gradually approximate the underlying function. This process is referred to as ‘adaptive design’ or ‘active learning’ (REFS 161,162). The key requirement for active learning is rapid feedback, and for hit and lead discovery, rapid feedback can be achieved by fast synthesize-and-test cycles.

Considering this situation from an information-theoretical viewpoint, the full-deck screening of hundreds of thousands of compounds by contemporary technology (for example, as shown in FIG. 2) may be not only cost intensive but also inefficient. Such an approach does not include feedback but relies on a single library design step before brute-force compound testing. The necessary continuous adjustment of the molecular design hypothesis is performed only in the later stages of hit optimization and lead expansion. This design concept is prone to fail when relying on noisy data, personal bias and poor intuitive choices (‘gut feeling’).

Figure 7 | Microfluidic single-cell screening device. A microfluidics system for the continuous screening of compound effects on single cells is shown. It consists of a double-layer device containing an array of chambers. Each chamber has a central trap for capturing cells or vesicles (individual traps are visible in the enlarged illustration of a section of the device) and a round valve that can be opened and closed for fluid exchange. For analysis, the valve is usually closed. The volume of the chambers depends on the particular chip design and is typically 150–500 picolitres. Labelled parts: supporting column, microfluidic channel, cell trap and valve. Reproduced with permission from Lucas Armbrecht and Petra S. Dittrich, Bioanalytics Group, ETH Zürich.

The active learning concept is central to automated drug discovery. This concept is based on iteratively adapting a design hypothesis (for example, a quantitative SAR model) by adjusting its free variables on the basis of newly acquired compound activity data. The modified design hypothesis is then used to select new compound sets for synthesis and testing. Dating back to the early 1990s, there have been several attempts to use adaptive de novo drug design guided by artificial neural networks and other machine learning techniques (see REFS 163–165 for reviews), although these attempts have been isolated. In a recent article, Hunter166 advanced the view that adopting and exploiting the full potential of artificial intelligence methods for pharmaceutical research might be essential to creating a sustainable drug discovery process.

A specific advantage of machine-driven hypothesis generation is that new compounds may be designed according to numerous criteria in parallel, for example, activity, synthesizability, predicted off-target effects and so on. Importantly, these models are able to capture essential non-additive (nonlinear) feature contributions to the design objectives, which cannot be appropriately considered by linear substituent contribution models (for example, Free–Wilson analysis and matched-molecular-pair analysis)167,168. Non-additive models of protein–ligand binding are a basic prerequisite for rational drug design169.

While explorative selection by active learning aims to add new information to the model with each iteration through the design cycle, exploitive selection maximizes compound quality with regard to certain design criteria, such as activity and selectivity. Balanced selection strategies compromising between these two extremes seem to be particularly suitable for both finding potent compounds (exploitive selection) with novel scaffolds (explorative selection) and optimal SAR model building170,171. This principle of model adaptation by active learning offers the additional advantage of limiting both the number of iterations that are required to find compounds with the desired properties and the number of compounds to be synthesized and tested in each iteration of the design cycle172. Visualization of the fitness landscape (‘activity landscape’) modelled during each iteration can additionally help to navigate the chemical space173 (FIG. 8). Compound 14 is a new subtype-selective antagonist of the dopamine D4 receptor found by active learning with an ant colony algorithm (MAntA, Molecular Ant Algorithm)174 for compound selection44. Similarly, new CXC-chemokine receptor 4 (CXCR4) antagonists have been identified by active learning with a random forest model175.

‘Deep learning’ from ‘big data’. The possibilities of computational molecular structure generation and property–activity prediction seem virtually unlimited. A particular appeal of automated structure generators lies in their trainability on complex chemical data, extreme speed and consideration of several design objectives in parallel. The young research field of constructive machine learning offers innovative methods for learning multidimensional SARs and iteratively navigating in very large chemical spaces to suggest chemical entities for testing that optimally fit the design hypothesis.

Based on the body of assay data stored in public and proprietary databases, it is now possible to train learning machines on arbitrary target–target, ligand–target and ligand–effect associations. Algorithms are able to recognize hidden patterns in molecules that escape medicinal chemical rationales and intuition because of the large set of variables and drug design objectives that should be considered in parallel. Suitable molecular structures that fit these patterns can then be computationally generated and forwarded to chemical synthesis and analytics and subsequent biophysical, biochemical and biological testing. A new design hypothesis is formed after updating the machine learning model with the newly obtained assay data (feedback loop), and swift compound optimization can take place. With such a set-up, one can expect to make informed choices of starting points for lead optimization.

Drug design can be regarded as a pattern recognition process. Medicinal chemists are skilled in visual chemical structure recognition and their association with retrosynthetic routes and pharmacological properties.
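The active learning cycle described above (adapt the design hypothesis, select compounds, test, feed the results back) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the cited studies: the one-dimensional descriptor, the hidden `true_activity` function that stands in for synthesis and assay, the nearest-neighbour surrogate model, and the `explore_weight` parameter that trades exploitive selection (high predicted activity) against explorative selection (large distance to already tested compounds).

```python
import math


def true_activity(x):
    """Hidden assay response; a toy stand-in for one synthesize-and-test step."""
    return math.exp(-((x - 0.7) ** 2) / 0.02)


def predict(x, tested):
    """Surrogate SAR model: distance-weighted mean of measured activities.

    `tested` maps descriptor value -> measured activity. The uncertainty is
    taken to be the distance to the closest tested compound (an assumption).
    """
    weights = {xi: 1.0 / (abs(x - xi) + 1e-6) for xi in tested}
    total = sum(weights.values())
    mean = sum(w * tested[xi] for xi, w in weights.items()) / total
    uncertainty = min(abs(x - xi) for xi in tested)
    return mean, uncertainty


def select(candidates, tested, batch_size, explore_weight):
    """Pick untested candidates by predicted activity plus weighted uncertainty."""
    def score(x):
        mean, uncertainty = predict(x, tested)
        return mean + explore_weight * uncertainty

    pool = [x for x in candidates if x not in tested]
    return sorted(pool, key=score, reverse=True)[:batch_size]


def active_learning(candidates, seed, cycles=5, batch_size=3, explore_weight=1.0):
    """Run a fixed number of design cycles and return all measured compounds."""
    tested = {x: true_activity(x) for x in seed}
    for _ in range(cycles):
        for x in select(candidates, tested, batch_size, explore_weight):
            tested[x] = true_activity(x)  # feedback: new data update the model
    return tested


candidates = [i / 100 for i in range(101)]  # toy "chemical space"
tested = active_learning(candidates, seed=[0.0, 1.0])
best = max(tested, key=tested.get)
print(f"tested {len(tested)} compounds; best descriptor {best}, activity {tested[best]:.2f}")
```

Setting `explore_weight` to zero gives purely exploitive selection, whereas large values make the selection almost purely explorative; intermediate values correspond to the balanced strategies discussed in the text. A real application would replace the toy surrogate with a trained SAR model and its own uncertainty estimate.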


In this context, various ‘deep-learning’ concepts are currently being evaluated as potentially enabling technology for drug discovery and automation because these systems aim to mimic the chemist’s pattern recognition process and to take it to the next level by considering all available domain-specific data and associations during model development. While acknowledging their usefulness, we should not fool ourselves with the term ‘deep learning’ or consider these methods ‘magic wands’. These systems are reincarnations of artificial neural network prototypes for automated molecular design from the 1990s176–179 that, in augmented and expanded form, can now be trained and optimized on complex pattern recognition tasks, largely owing to substantial improvements in available hardware and software180,181. One of the prominent machine learning toolkits harnessing the computational power of specifically developed tensor processing units (TPUs; application-specific integrated circuits developed by Google)182 is the TensorFlow open-source software library for numerical computation183,184. This software library provides access to contemporary machine learning methods and has found widespread use for [...] and bioinformatics modelling and medicinal informatics185–188. For a review on toolkits and software libraries for deep learning, see REF. 189.

Figure 8 | Active learning in drug design. Knowledge of the underlying structure–activity relationship (SAR) captured by a machine learning model is very limited in the beginning of a discovery project but grows over time with each active learning step. The ‘fitness landscapes’ visualize the areas of chemical space that are associated with low (transparent) and high (strong colour intensity) predictive confidence (part a). In the example, d1 and d2 denote meaningful coordinates of chemical space, which can be obtained, for example, by projection or dimensionality-reduction techniques245. The distributions shown in part b illustrate four stages of a SAR model during active learning. The average predictive confidence increases (and the margin of error decreases) with each iteration (models 1–4). The initial model 1 was trained on literature data (in this case, CXC-chemokine receptor 4 (CXCR4) ligands)175. Models 2 and 3 were obtained after testing 30 additional compounds per learning step. Model 4 was trained with all tested compounds taken together. The small discrepancy of predictive confidence between models 3 and 4 demonstrates the efficiency of the active learning process. Structure label: compound 14, D4R: Ki = 12 nM. D4R, dopamine D4 receptor; KD, dissociation constant; Ki, inhibition constant; P, pseudo-probability density function.

To date, most machine learning applications in the field have been ‘shallow’, that is, using a single layer of feature transformation to achieve their goals. This class of algorithms includes various clustering and regression methods (for example, nearest neighbour approaches, support vector machines, standard neural networks and decision trees). The successes of these methods in activity prediction and lead suggestion are, in part, due to the development of useful, often domain-specific, molecular representations, which enable comparably simple machine learning architectures to make reasonable predictions. In the process of engineering and applying these descriptor systems, we include a measure of our chemical knowledge and understanding in the depiction of the actuality of these molecules. Now, ‘deep’ methods based on learning directly from molecular graphs and other physically oriented models of complex molecular objects have been proposed that remove some of this input-level abstraction190–192. This more general approach, however, benefits from a more sophisticated machine learning methodology for pattern recognition, as the input data are much less amenable to producing useful output with ‘shallow’ transformation methods.

Essentially, deep-learning models are hypothesis generators. Their secret lies in a cascaded feature extraction and transformation process from the training data representation and in nonlinear function estimation based on these features (FIG. 9). While passing information from the input to the output layer, increasingly intricate features are formed in the subsequent layers of such models. Each network layer may contain heterogeneous processing units that select and refine features in different ways. Such a learning process often results in models that elude our immediate interpretation in chemical terms193,194. Nonetheless, such models can be extremely useful195,196.

From a [...] viewpoint197,198, deep-learning methods for model building may indeed represent a breakthrough199–201. Currently, there are approximately 70 million SAR data points stored in public databases, not accounting for the very large volumes of proprietary data from deep sequencing and other massively parallel and ultra-high-throughput assays. Deep-learning networks provide appropriate technology for analysing such large amounts of data to find meaningful relationships between ligands, proteins, genotypes and phenotypes202–205. Several heterogeneous deep-learning systems with high prediction accuracies have been developed for drug–target association, drug repurposing opportunities and target identification, among other tasks202,206,207,208.
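The cascaded feature transformation that distinguishes ‘deep’ from ‘shallow’ models can be made concrete with a minimal Python sketch of a forward pass through a small fully connected network. The layer sizes, the random untrained weights, the tanh and sigmoid activations and the eight-bit input vector are all assumptions chosen for illustration; the sketch is not the architecture of any system cited above and uses only the Python standard library rather than a toolkit such as TensorFlow.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One fully connected layer: random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def apply_layer(layer, x, activation):
    """Weighted sum of the inputs for each unit, followed by a nonlinearity."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, x):
    """Cascade: each hidden layer re-represents the output of the previous one."""
    for layer in layers[:-1]:
        x = apply_layer(layer, x, math.tanh)
    return apply_layer(layers[-1], x, sigmoid)


rng = random.Random(0)
sizes = [8, 6, 4, 1]  # hypothetical: 8 input descriptors, two hidden layers, 1 output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
descriptor = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]  # made-up fingerprint bits
score = forward(layers, descriptor)[0]
print(f"f(x) = {score:.3f}")
```

With `sizes = [8, 1]` the same code reduces to a single transformation of the input, which corresponds to the ‘shallow’ case in the text. Training, that is, adjusting the weights to fit measured activity data, is deliberately left out of this sketch.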


Deep network models have also been shown to improve conventional virtual screening methods, such as automated ligand docking209, and to accelerate otherwise computationally costly chemical computing tasks210. Various applications of deep learning in biomedicine have been comprehensively reviewed211.

Figure 9 | Schematic of a deep-learning network. Deep neural networks transform the input data (for example, molecular structures or microscopic images) by cascaded feature extraction and compute a nonlinear function of the input, f(x). They essentially represent universal function estimators. Each network layer can vary in size and architecture, can have alternating functionality and can contain different types of processing units. When trained on compound activity data, the overall network function adapts to the underlying structure–activity relationship and, after successful training, can be used for automated compound design. Essentially, such learning systems are able to incorporate new data (for example, new compound–target activities or chemogenomics data) and continuously adjust their internal model of the input–output relationship. The depicted network architecture highlights only one of several related deep-learning concepts. Diagram labels, from bottom to top: input data (x); layer 1; layer 2; ...; layer n; output f(x).

Curated consistent data are a prerequisite for improved model building. A consortium of industrial and academic partners has recently published a new comprehensive database of standardized chemical and biological data for chemogenomics data analysis (ExCAPE-DB212, Exascale Compound Activity Prediction Engine)213. Although the number of compound structures and activity values stored in these databases may appear impressive from a chemistry-oriented viewpoint, they are vanishingly small in comparison with other fields, such as computer vision214. With the exception of virtual chemical space, one may indeed wonder if big experimental data exist in chemistry215. In this context, Tetko et al.216 suggested the definition of big data as “out of the scale of traditional applications, which require efforts beyond the traditional analysis”. Data sharing and open software between research organizations will further expedite successful model building for automated drug discovery217. Importantly, big data as such are not a prerequisite or guarantee for obtaining good predictive models. Similarly, it is advisable not to simply try and apply deep models to any given classification or regression task in drug discovery, but to carefully evaluate the required model complexity and its applicability domain beforehand210,218,219.

Conceptual and practical challenges

Judging from successful proof-of-concept studies and pilot applications, potentially major benefits for drug design from the integration of automated discovery processes can be anticipated. These include low error rates (for example, reduced risk of false positives), high speed of execution (for example, faster hit and lead identification), low consumption of materials (advancing green chemistry), straightforward synthetic schemes for ease of compound production, potentially patentable compound structures (in combination with scaffold hopping), ease of instrument handling (low maintenance) and, ultimately, improved decision making for hit and lead candidate selection.

Nevertheless, molecular design is governed by nonlinear relationships between the chemical structures and their biological activities, random events (serendipity), measurement and judgement errors and the incompleteness of available drug discovery data. In addition, erroneous assay readouts hamper accurate model building, and poor data curation can easily be a limiting factor for machine learning. Reducing errors in data annotation and relying on suitable assays will therefore be mandatory for future success. Progress in automatically detecting and recovering false negatives (that is, active compounds misidentified as inactive by the test) points to new means of hit selection besides relying on primary activity alone220. Automated retesting of suspicious compounds could be performed by autonomous robots. Researchers at Pfizer recently disclosed success rates of 13–51% of true false negatives from HTS that were rescued based on computational prediction221.

Although the required flexibility and adaptability of the design hypothesis have long been adopted in software solutions for de novo molecular design and model building, real-life applications have only recently been demonstrated. Minimizing the time gap between synthesis and testing may be the vital factor for increased productivity of drug discovery projects. A high program speed increases the number of design loops that can be made and limits the risk of generating new compounds agnostically, without full integration of the test results into the design hypothesis. There is no learning without reflection and feedback.

Lab-on-a-chip and other miniaturized and/or mobile platforms with a small footprint seem to be suited to address this bottleneck in hit expansion. As appealing as this technology may be, however, seamless integration of the heterogeneous instrumentation faces technical challenges. New continuous-flow platforms may provide a complement or even an alternative to these mixed-method systems. Similar to conventional robot-assisted systems, in continuous-flow devices, the lack of direct in-line methods for compound profiling in dose–response format has prevented the emergence of fully automated hit discovery and optimization in the past.

Another limiting factor is the currently restricted versatility of automated synthesis platforms. Each chemical reaction requires optimization and often hardware modifications (for example, seals, reactors and piping); the reagents must be prepared for handling; detection and purification protocols must be adjusted; and so on. On-the-fly switching from one chemical transformation scheme to another and sequentially performing multiple steps automatically may be straightforward in silico, but remains challenging in real life. Although one-step syntheses of individual compounds or focused libraries can be robustly performed in parallel batches or in flow, we still need to identify the sweet spots of such platforms for seamless integration in drug discovery. The elegant automated synthetic strategy devised by Burke and co-workers74, which enabled the generation of structurally diverse compounds from a limited set of simple building blocks (FIG. 4), points to a direction of future research to address this issue.

With all the current excitement about sophisticated artificial intelligence systems and the maturation of rapid automation, it is crucial to identify approaches and

108 | FEBRUARY 2018 | VOLUME 17 www.nature.com/nrd ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


technologies that could be implemented robustly by medicinal chemists in the near future and to discuss the challenges of doing so in the context of industrial workflows. Computational molecular design has always raised hopes that some computer wizardry might come to the rescue of stalled discovery projects. The prospect of process automation in the age of ‘big data’ further stimulates a drug designer’s fantasies. What will the laboratory of the future look like? Are we facing the automation of drug discovery with autonomous molecular design robots replacing medicinal chemists?

There is no doubt that the automation of science has already begun. The use of robotic devices is not limited to improving the reproducibility of experiments; a particular feature of ‘robot scientists’ is their explicit foundation of scientific reasoning, which contrasts with the more polymorphic, generalized human mind222. The key technology drivers are hardware and software improvements and data availability. However, there may be limitations to the applicability of machine learning in chemistry, as recently noted by Gambin and co-workers223. According to their study, fundamental mathematical theorems impose upper bounds on the accuracy with which reaction yields and times can be predicted, which in turn will limit the scope of autonomous drug discovery platforms. Furthermore, the hundreds of thousands (or more) data points required for deep learning will be unavailable in many drug discovery projects. Alternative methods for equally robust feature extraction and hypothesis generation from ‘small data’ sets need to be identified. Pande and co-workers recently suggested ‘one-shot’ learning for such instances224.

More conventional modelling techniques are not expected to become outdated. The combination of ‘big data’ and ‘deep learning’ per se does not solve problems; it is the ability of the researchers involved who devise appropriate representations of chemistry and biology for computational analysis. Their scientific skills will be needed even more in future drug discovery settings. This notion becomes especially relevant when contemplating the fragility of autonomous discovery platforms. Although there have been reports about robots that can adapt to damage and show outwardly ‘intelligent’ behaviour225,226, at least in the foreseeable future, it will remain the task of the skilled scientists, technicians and engineers who design, run and maintain these discovery platforms.

Irrespective of the success or failure of individual technologies, this fresh view on drug discovery goes far beyond traditional approaches and will deliver innovative methodologies and potentially ground-breaking solutions that may have a substantial impact on future discovery concepts. One could envisage the future development of benchtop instruments equipped with building block cartridges for chemical synthesis and cassette-like bespoke assay panels for in-line screening, opening up great opportunities for small and medium-sized technology companies; for example, such a mobile instrument could be made available for project teams in many laboratories. Certainly, this concept does not make medicinal chemistry obsolete, as one might mistakenly deduce from some published comments on this topic227,228; in reality, the opposite expectation is probably closer to the truth. However, medicinal chemistry training needs to adapt to this new situation and to prepare chemists accordingly229–231.

The well-controlled conditions possible using microfluidic synthesis technology enable otherwise strongly exothermic, dangerous or difficult reactions to be performed safely, potentially making novel molecular scaffolds more accessible. However, chemists will still have to design these experiments to be performed by a machine, and the tool compounds obtained will not represent perfect lead compounds for immediate expansion and development. Furthermore, because the design machine will be able to produce chemical starting points very quickly, future hit-to-lead optimization and scaffold morphing will require strong chemical expertise and will probably generate demand for increased conventional synthesis capacity.

The possibilities of bioinspired molecular machines allow for even farther-reaching goals: for example, in the performance of diverse operations in response to chemical triggers. A recent example is provided by a DNA nanomachine that uses DNA origami command tracks to control a microfluidics device232. One may also envisage automated drug discovery platforms that include modules for dynamic combinatorial chemistry with biocompatible reactions; that is, the in situ generation of drugs binding to a protein target233,234. In light of the rather limited compound library sizes used in such projects to date, automated adaptive feedback control offers opportunities for the optimal exploration of chemical space for dynamic .

There is no doubt that drug discovery demands the right mix of human mind, automation and machine intelligence. In the future, the ‘intranet/internet of things’ may enable fully autonomous cross-platform drug discovery. In combination with the appropriate test systems and metrics of success, such integrated environments bear the promise not only of stable system performance but also of increasing the competitiveness and efficiency of drug discovery processes by sharing resources and data intramurally and extramurally235,236.

Conclusions and future perspectives

The drug discovery process has characteristics of chaotic systems, including nonlinear behaviour, error, incompleteness, random serendipitous events and partial predictability237. Not surprisingly, good compounds may be overlooked for various reasons. Clearly, drug discovery is a challenging endeavour that requires skilful navigation in a multidimensional, multimodal search space. For example, ‘activity cliffs’ may affect lead optimization238, and unexpected biochemical and pharmacological effects can derail expansion and development.

The three challenges for automated drug design are the assembly of synthetically accessible structures, scoring and property prediction, and the systematic optimization of promising molecules in adaptive learning cycles. Over the past three decades, numerous guidelines, methods, algorithms and heuristics have been proposed to address each of these problems. Although the generation of new chemical entities with attractive chemical scaffolds has become feasible and although the algorithmic optimization problem can also be considered largely solved, the persisting issue of compound scoring (that is, picking the best compounds from a large pool of accessible possibilities) remains difficult. While compound elimination by appropriate scoring models discards the bulk of the designs (‘negative design’) with acceptable accuracy, the selection of the best or most promising (‘positive design’) remains prone to error. More accurate activity prediction models that extend the capabilities of existing approaches could originate from advanced machine learning methods.

Prognoses of the sustainability of customary pharmaceutical discovery and development practices imply the need for adjusted strategies for the future239–242. In such a situation, one can and must be creative.
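The distinction drawn above between ‘negative design’ (discarding the bulk of the designs) and ‘positive design’ (picking the most promising ones) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the `Candidate` type, the compound names, the scores in [0, 1] and the cutoff are hypothetical and do not come from the article or from any particular scoring model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str               # hypothetical compound identifier
    predicted_score: float  # hypothetical model output in [0, 1]; higher is better


def negative_design(candidates, cutoff):
    """Discard designs whose predicted score falls below the cutoff."""
    return [c for c in candidates if c.predicted_score >= cutoff]


def positive_design(candidates, k):
    """Return the k highest-scoring candidates (the error-prone selection step)."""
    return sorted(candidates, key=lambda c: c.predicted_score, reverse=True)[:k]


# Illustrative pool of designs with made-up scores.
pool = [
    Candidate("cpd-A", 0.12),
    Candidate("cpd-B", 0.81),
    Candidate("cpd-C", 0.79),
    Candidate("cpd-D", 0.35),
    Candidate("cpd-E", 0.05),
]

survivors = negative_design(pool, cutoff=0.30)
selected = positive_design(survivors, k=2)

print([c.name for c in survivors])
print([c.name for c in selected])
```

In this toy pool the cutoff removes the clearly poor designs even if the scores are only roughly right, whereas the ranking of cpd-B over cpd-C rests on a score difference of 0.02, so a small prediction error would change which compound is ranked first. That is one way to read why elimination can be acceptably accurate while selection remains prone to error.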


Given the prospects of labs-on-a-chip, human organoid assay systems, automated synthesis and intelligent learning software, we are currently witnessing a new wave of excitement about the changes in pharmaceutical research and development243,244. The concept of automated drug discovery could help to considerably reduce the number of compounds to be tested in a medicinal chemistry project and, at the same time, establish a rational unbiased foundation of adaptive molecular design. Recent advances in both lab-on-a-chip and computer technology, as well as the development of self-teaching artificial intelligence systems, could allow bottlenecks in the molecular design cycle to be addressed, thereby enabling better decision making in the future. Automation will play a central role in this process.

The envisaged drug discovery engine imitates human decision making by transferring responsibility to an objective machine learning system as a core aspect of the discovery process. If successful in the long run, the approach will amalgamate a continuously learning machine intelligence with the synthesis of pharmacologically relevant chemical matter. Thus, the medicinal chemist will gain the freedom to draw inspiration from potentially surprising solutions delivered by computational models, have fast access to initial tool compounds for a given discovery project and save precious material.

Rapid feedback cycles require the customization of instrumentation and the adjustment of work processes. Establishing this concept in pharmaceutical discovery may require considerable investment in terms of money and the reorganization of laboratory structures and processes. It will be necessary to evaluate the feasibility of fully autonomous molecular design with the aid of computers and robotic devices and, at the same time, to analyse which aspects of compound generation are best left to a chemically savvy artificial intelligence or a skilled human mind. The answers to these questions may vary depending on the particular discovery context, and keeping an open mind to many different viewpoints is advisable. Medicinal chemistry has always borrowed methodological thinking from engineering and experimental design so that tailored solutions could be implemented to meet challenges in chemistry, and continuing to do so would be wise. While keeping a healthy scepticism of automation for its own sake, embracing new technologies for planning and performing compound design, synthesis and testing, without fearing a loss of control, could enable substantial improvements in the effectiveness of drug discovery.

Gisbert Schneider is at the Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland.
[email protected]
doi:10.1038/nrd.2017.232
Published online 15 Dec 2017

1. King, R. D. et al. The automation of science. Science 324, 85–89 (2009).
2. Chapman, T. Lab automation and robotics: automation on the move. Nature 421, 661–666 (2003).
3. Sanderson, K. March of the synthesis machines. Nat. Rev. Drug Discov. 14, 299–300 (2015).
4. Lindsay, R. K., Buchanan, B. G., Feigenbaum, E. A. & Lederberg, J. DENDRAL: a case study of the first expert system for scientific hypothesis formation. Artif. Intell. 61, 209–261 (1993).
5. Johnson, A. P. & Marshall, C. Starting material oriented retrosynthetic analysis in the LHASA program. 3. Heuristic estimation of synthetic proximity. J. Chem. Inf. Comput. Sci. 32, 426–429 (1992).
6. Cho, S. J., Sun, Y. & Harte, W. ADAAPT: Amgen’s data access, analysis, and prediction tools. J. Comput. Aided Mol. Des. 20, 249–261 (2006).
7. Schneider, G. De novo Molecular Design (Wiley–VCH, 2013).
8. Sparkes, A. et al. Towards robot scientists for autonomous scientific discovery. Autom. Exp. 4, 1 (2010).
9. MacConnell, A. B., Price, A. K. & Paegel, B. M. An integrated microfluidic processor for DNA-encoded combinatorial library functional screening. ACS Comb. Sci. 19, 181–192 (2017).
10. Baranczak, A. et al. Integrated platform for expedited synthesis-purification-testing of small molecule libraries. ACS Med. Chem. Lett. 8, 461–465 (2017).
11. Vasudevan, A., Bogdan, A. R., Koolman, H. F., Wang, Y. & Djuric, S. W. Enabling chemistry technologies and parallel synthesis-accelerators of drug discovery programmes. Prog. Med. Chem. 56, 1–35 (2017).
12. Esch, E. W., Bahinski, A. & Huh, D. Organs-on-chips at the frontiers of drug discovery. Nat. Rev. Drug Discov. 14, 248–260 (2015).
13. Eglen, R. M. & Randle, D. H. Drug discovery goes three-dimensional: goodbye to flat high-throughput screening? Assay Drug Dev. Technol. 13, 262–265 (2015).
14. Jones, L. H. & Bunnage, M. E. Applications of chemogenomic library screening in drug discovery. Nat. Rev. Drug Discov. 16, 285–296 (2017).
15. Schreiber, S. L. Target-oriented and diversity-oriented in drug discovery. Science 287, 1964–1969 (2000).
16. O’Connor, C. J., Beckmann, H. S. & Spring, D. R. Diversity-oriented synthesis: producing chemical tools for dissecting biology. Chem. Soc. Rev. 41, 4444–4456 (2012).
17. Maurya, S. K. & Rana, R. An eco-compatible strategy for the diversity-oriented synthesis of macrocycles exploiting carbohydrate-derived building blocks. Beilstein J. Org. Chem. 13, 1106–1118 (2017).
18. Maier, M. E. Design and synthesis of analogues of natural products. Org. Biomol. Chem. 13, 5302–5343 (2015).
19. Wetzel, S., Bon, R. S., Kumar, K. & Waldmann, H. Biology-oriented synthesis. Angew. Chem. Int. Ed. 50, 10800–10826 (2011).
20. Wilk, W., Zimmermann, T. J., Kaiser, M. & Waldmann, H. Principles, implementation, and application of biology-oriented synthesis (BIOS). Biol. Chem. 391, 491–497 (2010).
21. Wender, P. A., Verma, V. A., Paxton, T. J. & Pillow, T. H. Function-oriented synthesis, step economy, and drug design. Acc. Chem. Res. 41, 40–49 (2008).
22. Wender, P. A., Quiroz, R. V. & Stevens, M. C. Function through synthesis-informed design. Acc. Chem. Res. 48, 752–760 (2015).
23. Ichikawa, S. Function-oriented synthesis: how to design simplified analogues of antibacterial nucleoside natural products? Chem. Rec. 16, 1106–1115 (2016).
24. Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev. 46, 3–26 (2001).
25. Walters, W. P., Ajay & Murcko, M. A. Recognizing molecules with drug-like properties. Curr. Opin. Chem. Biol. 3, 384–387 (1999).
26. Leeson, P. D. & Springthorpe, B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat. Rev. Drug Discov. 6, 881–890 (2007).
27. Bickerton, G. R., Paolini, G. V., Besnard, J., Muresan, S. & Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90–98 (2012).
28. Yusof, I. & Segall, M. D. Considering the impact drug-like properties have on the chance of success. Drug Discov. Today 18, 659–666 (2013).
29. Ajay, A., Walters, W. P. & Murcko, M. A. Can we learn to distinguish between “drug-like” and “nondrug-like” molecules? J. Med. Chem. 41, 3314–3324 (1998).
30. Sadowski, J. & Kubinyi, H. A scoring scheme for discriminating between drugs and nondrugs. J. Med. Chem. 41, 3325–3329 (1998).
31. Leeson, P. D. Molecular inflation, attrition and the rule of five. Adv. Drug Deliv. Rev. 101, 22–33 (2016).
32. Leahy, D. E. & Sykora, V. Automation of decision making in drug design. Drug Discov. Today Technol. 10, e437–e441 (2013).
33. Nicolaou, C. A. & Brown, N. Multi-objective optimization methods in drug design. Drug Discov. Today Technol. 10, e427–e435 (2013).
34. Harrison, S. et al. Extending ‘predict first’ to the design-make-test cycle in small-molecule drug discovery. Future Med. Chem. 9, 533–536 (2017).
35. Soldatova, L. N., Rzhetsky, A., De Grave, K. & King, R. D. Representation of probabilistic scientific knowledge. J. Biomed. Semantics 4 (Suppl. 1), S7 (2013).
36. Zhu, Q. et al. Semantic inference using chemogenomics data for drug discovery. BMC Bioinformatics 12, 256 (2011).
37. White, D. & Wilson, R. C. Generative models for chemical structures. J. Chem. Inf. Model. 50, 1257–1274 (2010).
38. Gupta, A. et al. Generative recurrent networks for de novo design. Mol. Inf. 36, 1700111 (2017).
39. Miyao, T., Arakawa, M. & Funatsu, K. Exhaustive structure generation for inverse-QSPR/QSAR. Mol. Inf. 29, 111–125 (2010).
40. Miyao, T., Kaneko, H. & Funatsu, K. Inverse QSPR/QSAR analysis for chemical structure generation (from y to x). J. Chem. Inf. Model. 56, 286–299 (2016).
41. Gaspar, H. A., Baskin, I. I., Marcou, G., Horvath, D. & Varnek, A. Stargate GTM: bridging descriptor and activity spaces. J. Chem. Inf. Model. 55, 2403–2410 (2015).
42. Schneider, G., Funatsu, K., Okuno, J. & Winkler, D. De novo drug design – ye olde scoring problem revisited. Mol. Inf. 36, 1681031 (2017).
43. Reutlinger, M., Rodrigues, T., Schneider, P. & Schneider, G. Combining on-chip synthesis of a focused combinatorial library with computational target prediction reveals imidazopyridine GPCR ligands. Angew. Chem. Int. Ed. 53, 582–585 (2014).
44. Reutlinger, M., Rodrigues, T., Schneider, P. & Schneider, G. Multi-objective molecular de novo design by adaptive fragment prioritization. Angew. Chem. Int. Ed. 53, 4244–4248 (2014).
45. Rodrigues, T. et al. Multidimensional de novo design reveals 5-HT2B receptor-selective ligands. Angew. Chem. Int. Ed. 54, 1551–1555 (2015).
46. Schneider, P., Röthlisberger, M., Reker, D. & Schneider, G. Spotting and designing promiscuous ligands for drug discovery. Chem. Commun. 52, 1135–1138 (2016).
47. Rodrigues, T. et al. De novo fragment design for drug discovery and chemical biology. Angew. Chem. Int. Ed. 54, 15079–15083 (2015).
48. Rodrigues, T. et al. Steering target selectivity and potency by fragment-based de novo drug design. Angew. Chem. Int. Ed. 52, 10006–10009 (2013).
49. Besnard, J. et al. Automated design of ligands to polypharmacological profiles. Nature 492, 215–220 (2012).
50. Willot, M. et al. Total synthesis and absolute configuration of the guaiane sesquiterpene Englerin A. Angew. Chem. Int. Ed. 48, 9105–9108 (2009).


51. Kusama, H., Tazawa, A., Ishida, K. & Iwasawa, N. Total synthesis of (±)-Englerin A using an intermolecular [3 + 2] cycloaddition reaction of -containing carbonyl ylide. Chem. Asian J. 11, 64–67 (2016).
52. Friedrich, L., Rodrigues, T., Neuhaus, C. S., Schneider, P. & Schneider, G. From complex natural products to simple synthetic mimetics by computational de novo design. Angew. Chem. Int. Ed. 55, 6789–6792 (2016).
53. Antolín, A. A. & Mestres, J. Distant polypharmacology among MLP chemical probes. ACS Chem. Biol. 10, 395–400 (2015).
54. Reker, D., Rodrigues, T., Schneider, P. & Schneider, G. Identifying the macromolecular targets of de novo-designed chemical entities through self-organizing map consensus. Proc. Natl Acad. Sci. USA 111, 4067–4072 (2014).
55. Schneider, P. & Schneider, G. Privileged structures revisited. Angew. Chem. Int. Ed. 56, 7971–7974 (2017).
56. Schneider, P. & Schneider, G. A computational method for unveiling the target promiscuity of pharmacologically active compounds. Angew. Chem. Int. Ed. 56, 11520–11524 (2017).
57. Ley, S. V., Fitzpatrick, D. E., Ingham, R. J. & Myers, R. M. Organic synthesis: march of the machines. Angew. Chem. Int. Ed. 54, 3449–3464 (2015).
58. Merrifield, R. B. Solid phase . I. The synthesis of a tetrapeptide. J. Am. Chem. Soc. 85, 2149–2154 (1963).
59. Palomo, J. M. Solid–phase peptide synthesis: an overview focused on the preparation of biologically relevant peptides. RSC Adv. 4, 32658–32672 (2014).
60. Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nat. Methods 11, 499–507 (2014).
61. Wan, W. B. & Seth, P. P. The medicinal chemistry of therapeutic oligonucleotides. J. Med. Chem. 59, 9645–9667 (2016).
62. Seeberger, P. H. & Werz, D. B. Synthesis and medical applications of oligosaccharides. Nature 446, 1046–1051 (2007).
63. Koppitz, M. & Eis, K. Automated medicinal chemistry. Drug Discov. Today 11, 561–568 (2006).
64. Liu, R., Li, X. & Lam, K. S. Combinatorial chemistry in drug discovery. Curr. Opin. Chem. Biol. 38, 117–126 (2017).
65. Godfrey, A. G., Masquelin, T. & Hemmerle, H. A remote-controlled adaptive medchem lab: an innovative approach to enable drug discovery in the 21st Century. Drug Discov. Today 18, 795–802 (2013).
66. Nicolaou, C. A., Watson, I. A., Hu, H. & Wang, J. The Proximal Lilly Collection: mapping, exploring and exploiting feasible chemical space. J. Chem. Inf. Model. 56, 1253–1266 (2016).
67. Crooks, S. L. & Charles, L. J. Overview of combinatorial chemistry. Curr. Protoc. Pharmacol. 9, Unit 9.3 (2001).
68. Long, A. Parallel chemistry in the 21st century. Curr. Protoc. Pharmacol. 9, Unit 9.16 (2012).
69. Ingallina, C. et al. The Pictet-Spengler reaction still on stage. Curr. Pharm. Des. 22, 1808–1850 (2016).
70. Pirrung, M. C. Molecular Diversity and Combinatorial Chemistry (Elsevier, 2004).
71. Roughley, S. D. & Jordan, A. M. The medicinal chemist’s toolbox: an analysis of reactions used in the pursuit of drug candidates. J. Med. Chem. 54, 3451–3479 (2011).
72. Brown, D. G. & Boström, J. Analysis of past and present synthetic methodologies on medicinal chemistry: where have all the new reactions gone? J. Med. Chem. 59, 4443–4458 (2016).
73. Collins, K. D., Gensch, T. & Glorius, F. Contemporary screening approaches to reaction discovery and development. Nat. Chem. 6, 859–871 (2014).
74. Li, J. et al. Synthesis of many different types of organic small molecules using one automated process. Science 347, 1221–1226 (2015).
75. Li, J., Grillo, A. S. & Burke, M. D. From synthesis to function via iterative assembly of N-methyliminodiacetic acid boronate building blocks. Acc. Chem. Res. 48, 2297–2307 (2015).
76. LaPorte, T. L. & Wang, C. Continuous processes for the production of pharmaceutical intermediates and active pharmaceutical ingredients. Curr. Opin. Drug Discov. Devel. 10, 738–745 (2007).
77. Chin, P., Barney, W. S. & Pindzola, B. A. Microstructured reactors as tools for the intensification of pharmaceutical reactions and processes. Curr. Opin. Drug Discov. Devel. 12, 848–861 (2009).
78. Dressler, O. J., Maceiczyk, R. M., Chang, S. I. & deMello, A. J. Droplet-based microfluidics: enabling impact on drug discovery. J. Biomol. Screen. 19, 483–496 (2014).
79. Shultz, S. et al. Miniaturized GPCR signaling studies in 1536-well format. J. Biomol. Tech. 19, 267–274 (2008).
80. Kanigowska, P., Shen, Y., Zheng, Y., Rosser, S. & Cai, Y. Smart DNA fabrication using sound waves: applying acoustic dispensing technologies to synthetic biology. J. Lab. Autom. 21, 49–56 (2016).
81. Sackmann, E. K. et al. Technologies that enable accurate and precise nano- to milliliter-scale liquid dispensing of aqueous reagents using acoustic droplet ejection. J. Lab. Autom. 21, 166–177 (2016).
82. Hadimioglu, B., Stearns, R. & Ellson, R. Moving liquids with sound: the physics of acoustic droplet ejection for robust laboratory automation in life sciences. J. Lab. Autom. 21, 4–18 (2016).
83. Squires, T. M. & Quake, S. R. Microfluidics: fluid physics at the nanoliter scale. Rev. Mod. Phys. 77, 977–1026 (2005).
84. Yoshida, J., Nagaki, A. & Yamada, D. Continuous flow synthesis. Drug Discov. Today Technol. 10, e53–e59 (2013).
85. Rodrigues, T., Schneider, P. & Schneider, G. Accessing new chemical entities through microfluidic systems. Angew. Chem. Int. Ed. 53, 5750–5758 (2014).
86. Hopkin, M. D., Baxendale, I. R. & Ley, S. V. A flow-based synthesis of imatinib: the API of Gleevec. Chem. Commun. 46, 2450–2452 (2010).
87. Murray, P. R. D. et al. Continuous flow-processing of organometallic reagents using an advanced peristaltic pumping system and the telescoped flow synthesis of (E/Z)-tamoxifen. Org. Process Res. Dev. 17, 1192–1208 (2013).
88. Pastre, J. C., Browne, D. L. & Ley, S. V. Flow chemistry syntheses of natural products. Chem. Soc. Rev. 42, 8849–8869 (2013).
89. Saaby, S., Knudsen, K. R., Ladlow, M. & Ley, S. V. The use of a continuous flow-reactor employing a mixed -liquid flow stream for the efficient reduction of imines to . Chem. Commun. 23, 2909–2911 (2005).
90. Baxendale, I. R., Hayward, J. J. & Ley, S. V. Microwave reactions under continuous flow conditions. Comb. Chem. High Throughput Screen. 10, 802–836 (2007).
91. Brzozowski, M., O’Brien, M., Ley, S. V. & Polyzos, A. Flow chemistry: intelligent processing of gas-liquid transformations using a tube-in-tube reactor. Acc. Chem. Res. 48, 349–362 (2015).
92. Wong-Hawkes, S. Y., Matteo, J. C., Warrington, B. H. & White, J. D. in New Avenues to Efficient Chemical Synthesis Vol. 2006 (eds Seeberger, P. H. & Blume, T.) 39–55 (2007).
93. Fernandez-Suarez, M., Wong, S. Y. & Warrington, B. H. Synthesis of a three-member array of cycloadducts in a glass microchip under pressure driven flow. Lab Chip 2, 170–174 (2002).
94. Jönsson, D., Warrington, B. H. & Ladlow, M. Automated flow-through synthesis of heterocyclic thioethers. J. Comb. Chem. 6, 584–595 (2004).
95. Garcia-Egido, E., Spikmans, V., Wong, S. Y. & Warrington, B. H. Synthesis and analysis of combinatorial libraries performed in an automated micro reactor system. Lab Chip 3, 73–76 (2003).
96. Newton, S. et al. Accelerating spirocyclic polyketide synthesis using flow chemistry. Angew. Chem. Int. Ed. 53, 4915–4920 (2014).
97. Adamo, A. et al. On-demand continuous-flow production of pharmaceuticals in a compact, reconfigurable system. Science 352, 61–67 (2016).
98. Hochlowski, J. E. et al. An integrated synthesis-purification system to accelerate the generation of compounds in pharmaceutical discovery. J. Flow Chem. 2, 56–61 (2011).
99. Lange, P. P. & James, K. Rapid access to compound libraries through flow technology: fully automated synthesis of a 3-aminoindolizine library via orthogonal diversification. ACS Comb. Sci. 14, 570–578 (2012).
100. Yoshida, J., Nagaki, A. & Yamada, T. Flash chemistry: fast chemical synthesis by using microreactors. Chemistry 14, 7450–7459 (2008).
101. Yoshida, J., Takahashi, Y. & Nagaki, A. Flash chemistry: flow chemistry that cannot be done in batch. Chem. Commun. 49, 9896–9904 (2013).
102. Nagaki, A., Imai, K., Kim, H. & Yoshida, J. Flash synthesis of TAC-101 and its analogues from 1,3,5-tribromobenzene using integrated flow microreactor systems. RSC Adv. 1, 758–760 (2011).
103. Carneiro, P. F., Gutmann, B., de Souza, R. O. M. A. & Kappe, O. Process intensified flow synthesis of 1H-4-substituted imidazoles: toward the continuous production of Daclatasvir. ACS Sustain. Chem. Eng. 3, 3445–3453 (2015).
104. Stalder, R. & Roth, G. P. Preparative microfluidic of drug metabolites. ACS Med. Chem. Lett. 4, 1119–1123 (2013).
105. Genovino, J., Sames, D., Hamann, L. G. & Touré, B. B. Accessing drug metabolites via transition-metal catalyzed C-H oxidation: the liver as synthetic inspiration. Angew. Chem. Int. Ed. 55, 14218–14238 (2016).
106. Britton, J. & Raston, C. L. Multi-step continuous-flow synthesis. Chem. Soc. Rev. 46, 1250–1271 (2017).
107. Reizman, B. J. & Jensen, K. F. Feedback in flow for accelerated reaction development. Acc. Chem. Res. 49, 1786–1796 (2016).
108. McMullen, J. P., Stone, M. T., Buchwald, S. L. & Jensen, K. F. An integrated microreactor system for self-optimization of a Heck reaction: from micro- to mesoscale flow systems. Angew. Chem. Int. Ed. 49, 7076–7080 (2010).
109. Cortés–Borda, D. et al. Optimizing the Heck-Matsuda reaction in flow with a constraint-adapted direct search algorithm. Org. Process Res. Dev. 20, 1979–1987 (2016).
110. Falcone, C. E. et al. Reaction screening and optimization of continuous-flow atropine synthesis by preparative electrospray mass spectrometry. Analyst 142, 2836–2845 (2017).
111. Huang, C. M., Zhu, Y., Jin, D. Q., Kelly, R. T. & Fang, Q. Direct surface and droplet microsampling for electrospray ionization mass spectrometry analysis with an integrated dual-probe microfluidic chip. Anal. Chem. 89, 9009–9016 (2017).
112. Hartman, R. L., McMullen, J. P. & Jensen, K. F. Deciding whether to go with the flow: evaluating the merits of flow reactors for synthesis. Angew. Chem. Int. Ed. 50, 7502–7519 (2011).
113. Shevlin, M. Practical high-throughput experimentation for chemists. ACS Med. Chem. Lett. 8, 601–607 (2017).
114. Chow, S. Y. & Nelson, A. Embarking on a chemical space odyssey. J. Med. Chem. 60, 3591–3593 (2017).
115. Moore, J. S. & Jensen, K. F. “Batch” kinetics in flow: online IR analysis and continuous control. Angew. Chem. Int. Ed. 53, 470–473 (2014).
116. Haeberle, S. & Zengerle, R. Microfluidic platforms for lab-on-a-chip applications. Lab Chip 7, 1094–1110 (2007).
117. Jeong, G. S., Chung, S., Kim, C. B. & Lee, S. H. Applications of micromixing technology. Analyst 135, 460–473 (2010).
118. Fratila, R. M. & Velders, A. H. Small-volume nuclear magnetic resonance spectroscopy. Annu. Rev. Anal. Chem. 4, 227–249 (2011).
119. Capel, A. J. et al. 3D printed fluidics with embedded analytic functionality for automated reaction optimisation. Beilstein J. Org. Chem. 13, 111–119 (2017).
120. Chiu, D. T. & Lorenz, R. M. Chemistry and biology in femtoliter and picoliter volume droplets. Acc. Chem. Res. 42, 649–658 (2009).
121. He, M. et al. Selective encapsulation of single cells and subcellular organelles into picoliter- and femtoliter-volume droplets. Anal. Chem. 77, 1539–1544 (2005).
122. Theberge, A. B. et al. Microdroplets in microfluidics: an evolving platform for discoveries in chemistry and biology. Angew. Chem. Int. Ed. 49, 5846–5868 (2010).
123. Lignos, I. et al. Synthesis of Cesium lead halide Perovskite nanocrystals in a droplet-based microfluidic platform: fast parametric space mapping. Nano Lett. 16, 1869–1877 (2016).
124. Krishnadasan, S., Brown, R. J., deMello, A. J. & deMello, J. C. Intelligent routes to the controlled synthesis of nanoparticles. Lab Chip 7, 1434–1441 (2007).
125. Beulig, R. J. et al. A droplet-chip/mass spectrometry approach to study organic synthesis at nanoliter scale. Lab Chip 17, 1996–2002 (2017).
126. Dittrich, P. S. & Manz, A. Lab-on-a-chip: microfluidics in drug discovery. Nat. Rev. Drug Discov. 5, 210–218 (2006).
127. Skardal, A., Shupe, T. & Atala, A. Organoid-on-a-chip and body-on-a-chip systems for drug screening and disease modeling. Drug Discov. Today 21, 1399–1411 (2016).

NATURE REVIEWS | DRUG DISCOVERY VOLUME 17 | FEBRUARY 2018 | 111 ©2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


128. Zakhariants, A. A., Burmistrova, O. A., Shkurnikov, M. Y., Poloznikov, A. A. & Sakharov, D. A. of a specific substrate-inhibitor panel (liver-on-a-chip) for evaluation of cytochrome P450 activity. Bull. Exp. Biol. Med. 162, 170–174 (2016).
129. Kirchmair, J. et al. Predicting : experiment and/or computation? Nat. Rev. Drug Discov. 14, 387–404 (2015).
130. Zhang, Y. S., Zhang, Y. N. & Zhang, W. Cancer-on-a-chip systems at the frontier of nanomedicine. Drug Discov. Today 22, 1392–1399 (2017).
131. Galler, K., Bräutigam, K., Große, C., Popp, J. & Neugebauer, U. Making a big thing of a small cell – recent advances in single cell analysis. Analyst 139, 1237–1273 (2014).
132. Loskill, P. et al. WAT-on-a-chip: a physiologically relevant microfluidic system incorporating white adipose tissue. Lab Chip 17, 1645–1654 (2017).
133. Cao, Z., Chen, C., He, B., Tan, K. & Lu, C. A microfluidic device for epigenomic profiling using 100 cells. Nat. Methods 12, 959–962 (2015).
134. Kurita, R. & Niwa, O. Microfluidic platforms for DNA methylation analysis. Lab Chip 16, 3631–3644 (2016).
135. Eyer, K., Stratz, S., Kuhn, P., Küster, S. K. & Dittrich, P. S. Implementing enzyme-linked immunosorbent assays on a microfluidic chip to quantify intracellular molecules in single cells. Anal. Chem. 85, 3280–3287 (2013).
136. Adriani, G., Ma, D., Pavesi, A., Gohm, E. L. & Kamm, R. D. Modeling the blood-brain barrier in a 3D triple co-culture microfluidic system. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2015, 338–341 (2015).
137. Huang, T. Y. et al. 3D printed microtransporters: compound micromachines for spatiotemporally controlled delivery of therapeutic agents. Adv. Mater. 42, 6644–6650 (2015).
138. Kara, A. et al. Electrochemical imaging for microfluidics: a full-system approach. Lab Chip 16, 1081–1087 (2016).
139. Kara, A. et al. Towards a multifunctional electrochemical sensing and niosome generation lab-on-chip platform based on a plug-and-play concept. Sensors 16, 778 (2016).
140. Hartmann, D. M. et al. Microfluidic chip apparatuses, systems and methods having fluidic and fiber optic interconnections. US Patent 20090147253 A1 (2007).
141. Desai, B. et al. Rapid discovery of a novel series of Abl kinase inhibitors by application of an integrated microfluidic synthesis and screening platform. J. Med. Chem. 56, 3033–3047 (2013).
142. Wang, Y. et al. An integrated microfluidic device for large-scale in situ click chemistry screening. Lab Chip 9, 2281–2285 (2009).
143. Lombardi, D. & Dittrich, P. S. Advances in microfluidics for drug discovery. Expert Opin. Drug Discov. 5, 1081–1094 (2010).
144. Wen, N. et al. Development of droplet microfluidics enabling high-throughput single-cell analysis. Molecules 21, 881 (2016).
145. Kang, D. K. et al. 3D droplet microfluidic systems for high-throughput biological experimentation. Anal. Chem. 87, 10770–10778 (2015).
146. Agresti, J. J. et al. Ultrahigh-throughput screening in drop-based microfluidics for directed evolution. Proc. Natl Acad. Sci. USA 107, 4004–4009 (2010).
147. Obexer, R. et al. Emergence of a catalytic tetrad during evolution of a highly active artificial aldolase. Nat. Chem. 9, 50–56 (2017).
148. Du, G., Fang, Q. & den Toonder, J. M. Microfluidics for cell-based high throughput screening platforms – a review. Anal. Chim. Acta 903, 36–50 (2016).
149. Zhu, Z. & Yang, C. J. Hydrogel droplet microfluidics for high-throughput single molecule/cell analysis. Acc. Chem. Res. 50, 22–31 (2017).
150. Fenneteau, J., Chauvin, D., Griffiths, A. D., Nizak, C. & Cossy, J. Synthesis of new hydrophilic rhodamine based enzymatic substrates compatible with droplet-based microfluidic assays. Chem. Commun. 53, 5437–5440 (2017).
151. Khalid, N., Kobayashi, I. & Nakajima, M. Recent lab-on-chip developments for novel drug discovery. Wiley Interdiscip. Rev. Syst. Biol. Med. 6, e1381 (2017).
152. Corey, E. J. General methods for the construction of complex molecules. Pure Appl. Chem. 14, 19–38 (1967).
153. Ihlenfeldt, W. D. & Gasteiger, J. Computer-assisted planning of organic syntheses: the second generation of programs. Angew. Chem. Int. Ed. 34, 2613–2633 (1996).
154. Cook, A. et al. Computer-aided synthesis design: 40 years on. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 79–107 (2011).
155. Ravitz, O. Data-driven computer aided synthesis design. Drug Discov. Today Technol. 10, e443–e449 (2013).
156. Chen, J. H. & Baldi, P. No electron left behind: a rule-based expert system to predict chemical reactions and reaction mechanisms. J. Chem. Inf. Model. 49, 2034–2043 (2009).
157. Kayala, M. A. et al. Learning to predict chemical reactions. J. Chem. Inf. Model. 51, 2209–2222 (2011).
158. Kowalik, M. et al. Parallel optimization of synthetic pathways within the Network of Organic Chemistry. Angew. Chem. Int. Ed. 51, 7928–7932 (2012).
159. Szymkuc, S. et al. Computer-assisted synthetic planning: the end of the beginning. Angew. Chem. Int. Ed. 55, 5904–5937 (2016).
160. Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H. & Jensen, K. F. Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci. 3, 434–443 (2017).
161. Whelan, K. E. & King, R. D. Intelligent software for laboratory automation. Trends Biotechnol. 22, 440–445 (2004).
162. Reker, D. & Schneider, G. Active learning strategies in computer-assisted drug discovery. Drug Discov. Today 20, 458–465 (2015).
163. Schneider, G. & Fechner, U. Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4, 649–663 (2005).
164. Hartenfeller, M. & Schneider, G. Enabling future drug discovery by de novo design. Wiley Interdiscip. Rev. Comput. Mol. Sci. 1, 742–759 (2011).
165. Rodrigues, T. & Schneider, G. Flashback forward: reaction-driven de novo design of bioactive compounds. Synlett 25, 170–178 (2014).
166. Hunter, J. Adopting AI is essential for a sustainable pharma industry. Drug Discov. World Winter 2016/2017, 69–71 (2017).
167. Kramer, C., Fuchs, J. E. & Liedl, K. R. Strong nonadditivity as a key structure-activity relationship feature: distinguishing structural changes from assay artifacts. J. Chem. Inf. Model. 55, 483–494 (2015).
168. Schonherr, H. & Cernak, T. Profound methyl effects in drug discovery and a call for new C-H methylation reactions. Angew. Chem. Int. Ed. 52, 12256–12267 (2013).
169. Kuhn, B., Fuchs, J. E., Reutlinger, M., Stahl, M. & Taylor, N. R. Rationalizing tight ligand binding through cooperative interaction networks. J. Chem. Inf. Model. 51, 3180–3198 (2011).
170. Reker, D., Schneider, P., Schneider, G. & Brown, J. B. Active learning for computational chemogenomics. Future Med. Chem. 9, 381–402 (2017).
171. Lang, T., Flachsenberg, F., von Luxburg, U. & Rarey, M. Feasibility of active machine learning for multiclass compound classification. J. Chem. Inf. Model. 56, 12–20 (2016).
172. Schüller, A. & Schneider, G. Identification of hits and lead structure candidates with limited resources by adaptive optimization. J. Chem. Inf. Model. 48, 1473–1491 (2008).
173. Reutlinger, M. et al. Neighborhood-preserving visualization of adaptive structure-activity landscapes: application to drug discovery. Angew. Chem. Int. Ed. 50, 11633–11636 (2011).
174. Hiss, J. A. et al. Combinatorial chemistry by ant colony optimization. Future Med. Chem. 6, 267–280 (2014).
175. Reker, D., Schneider, P. & Schneider, G. Multi-objective active machine learning rapidly improves structure-activity models and reveals new protein-protein interaction inhibitors. Chem. Sci. 7, 3919–3927 (2016).
176. Schneider, G., Schuchhardt, J. & Wrede, P. Artificial neural networks and simulated molecular evolution are potential tools for sequence-oriented protein design. Comput. Appl. Biosci. 10, 635–645 (1994).
177. Schneider, G. et al. Peptide design by artificial neural networks and computer-based evolutionary search. Proc. Natl Acad. Sci. USA 95, 12179–12184 (1998).
178. Schneider, G. & Wrede, P. Artificial neural networks for computer-based molecular design. Prog. Biophys. Mol. Biol. 70, 175–222 (1998).
179. Zupan, J. & Gasteiger, J. Neural networks: a new method for solving chemical problems or just a passing phase? Anal. Chim. Acta 248, 1–30 (1991).
180. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
181. Baskin, I. I., Winkler, D. & Tetko, I. V. A renaissance of neural networks in drug discovery. Expert Opin. Discov. 11, 785–795 (2016).
182. Jouppi, N. P. et al. in Proceedings of the 44th International Symposium on Computer Architecture (ISCA) http://dx.doi.org/10.1145/3079856.3080246 (Toronto, 2017).
183. Sato, K., Young, C. & Patterson, D. An in-depth look at Google’s first Tensor Processing Unit (TPU). Google Cloud Platform https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu (2017).
184. Google. TensorFlow™ www.tensorflow.org (2017).
185. Rampasek, L. & Goldenberg, A. TensorFlow: biology’s gateway to deep learning? Cell Syst. 2, 12–14 (2016).
186. Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).
187. Holder, L. B., Haque, M. M. & Skinner, M. K. Machine learning for epigenetics and future medical applications. Epigenetics 19, 1–10 (2017).
188. Li, Y., Chen, C. Y. & Wasserman, W. W. Deep feature selection: theory and application to identify enhancers and promoters. J. Comput. Biol. 23, 322–336 (2016).
189. Erickson, B. J., Korfiatis, P., Akkus, Z., Kline, T. & Philbrick, K. Toolkits and libraries for deep learning. J. Digit. Imag. 30, 400–405 (2017).
190. Gasteiger, J. Physicochemical effects in the representation of molecular structures for drug designing. Mini Rev. Med. Chem. 3, 789–796 (2003).
191. Sawada, R., Kotera, M. & Yamanishi, Y. Benchmarking a wide range of chemical descriptors for drug–target interaction prediction using a chemogenomic approach. Mol. Inf. 33, 719–731 (2014).
192. Goh, G. B., Siegel, C., Vishnu, A., Hodas, N. O. & Baker, N. Chemception: a deep neural network with minimal chemistry knowledge matches the performance of expert-developed QSAR/QSPR models. arXiv, 1706.06689 (2017).
193. Castelvecchi, D. Can we open the black box of AI? Nature 538, 20–23 (2016).
194. Albrecht, T., Slabaugh, G., Alonso, E. & Al‑Arif, M. R. Deep learning for single-molecule science. Nanotechnology 28, 423001 (2017).
195. Schneider, G. Neural networks are useful tools for drug design. Neural Netw. 13, 15–16 (2000).
196. Winkler, D. A. & Le, T. C. Performance of deep and shallow neural networks, the universal approximation theorem, activity cliffs, and QSAR. Mol. Inf. 36, 1600118 (2017).
197. Xie, L., Draizen, E. J. & Bourne, P. E. Harnessing big data for systems pharmacology. Annu. Rev. Pharmacol. Toxicol. 57, 157–160 (2017).
198. Del Sol, A., Thiesen, H. J., Imitola, J. & Carazo Salas, R. E. Big-data-driven stem cell science and tissue engineering: vision and unique opportunities. Cell Stem Cell 20, 157–160 (2017).
199. Schmid, M. & Lipson, H. Distilling free-form natural laws from experimental data. Science 324, 81–85 (2009).
200. Ekins, S. The next era: deep learning in pharmaceutical research. Pharm. Res. 33, 2594–2603 (2016).
201. Gawehn, E., Hiss, J. A. & Schneider, G. Deep learning in drug discovery. Mol. Inf. 35, 3–14 (2016).
202. Aliper, A. et al. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm. 13, 2524–2530 (2016).
203. Tian, K., Shao, M., Wang, Y., Guan, J. & Zhou, S. Boosting compound-protein interaction prediction by deep learning. Methods 110, 64–72 (2016).
204. Schneider, G. & Schneider, P. Macromolecular target prediction by self-organizing feature maps. Expert Opin. Drug Discov. 12, 271–277 (2017).
205. Filzen, T. M., Kutchukian, P. S., Hermes, J. D., Li, J. & Tudor, M. Representing high throughput expression profiles via perturbation barcodes reveals compound targets. PLoS Comput. Biol. 13, e1005335 (2017).
206. Zhang, L., Tan, J., Han, D. & Zhu, H. From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discov. Today http://dx.doi.org/10.1016/j.drudis.2017.08.010 (2017).
207. Zong, N., Kim, H., Ngo, V. & Harismendy, O. Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations. Bioinformatics 33, 2337–2344 (2017).
208. Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J. & Koes, D. R. Protein-ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57, 942–957 (2017).


209. Pereira, J. C., Caffarena, E. R. & Dos Santos, C. N. Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model. 56, 2495–2506 (2016).
210. Goh, G. B., Hodas, N. O. & Vishnu, A. Deep learning for computational chemistry. J. Comput. Chem. 38, 1291–1307 (2017).
211. Mamoshina, P., Vieira, A., Putin, E. & Zhavoronkov, A. Applications of deep learning in biomedicine. Mol. Pharm. 13, 1445–1454 (2016).
212. ExCAPE-DB: ExCAPE chemogenomics database. https://solr.ideaconsult.net/search/excape/ (2017).
213. Sun, J. et al. ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J. Cheminform. 9, 17 (2017).
214. Mondal, K. Design issues of Big Data parallelisms. Adv. Intell. Syst. Comput. 434, 209–217 (2016).
215. Tetko, I. V., Engkvist, O. & Chen, H. Does ‘Big Data’ exist in medicinal chemistry, and if so, how can it be harnessed? Future Med. Chem. 8, 1801–1806 (2016).
216. Tetko, I. V., Engkvist, O., Koch, U., Reymond, J. L. & Chen, H. BIGCHEM: challenges and opportunities for big data analysis in chemistry. Mol. Inf. 35, 615–621 (2016).
217. Ramsundar, B. et al. Is multitask deep learning practical for pharma? J. Chem. Inf. Model. 57, 2068–2076 (2017).
218. Mathea, M., Klingspohn, W. & Baumann, K. Chemoinformatic classification methods and their applicability domain. Mol. Inf. 35, 160–180 (2016).
219. Ochi, S., Miyao, T. & Funatsu, K. Structure modification toward applicability domain of a QSAR/QSPR model considering activity/property. Mol. Inf. http://dx.doi.org/10.1002/minf.201700076 (2017).
220. Posner, B. A., Xi, H. & Mills, J. E. Enhanced HTS hit selection via a local hit rate analysis. J. Chem. Inf. Model. 49, 2202–2210 (2009).
221. Zhang, L., Boehm, M. & Lovering, F. in ACS National Meeting & Exposition CINF82 (San Francisco, 2017).
222. Sparkes, A. et al. Towards robot scientists for autonomous scientific discovery. Autom. Exp. 2, 1 (2010).
223. Skoraczynski, G. et al. Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient? Sci. Rep. 7, 3582 (2017).
224. Altae-Tran, H., Ramsundar, B., Pappu, A. S. & Pande, V. Low data drug discovery with one-shot learning. ACS Cent. Sci. 3, 283–293 (2017).
225. Cully, A., Clune, J., Tarapore, D. & Mouret, J. B. Robots that can adapt like animals. Nature 521, 503–507 (2015).
226. Adami, C. Artificial intelligence: robots with instincts. Nature 521, 426–427 (2015).
227. [No authors listed.] Blogroll: Robot wars. Nat. Chem. 1, 173 (2009).
228. Peplow, M. Organic synthesis: the robo-chemist. Nature 512, 20–22 (2014).
229. Satyanarayanajois, S. D. & Hill, R. A. Medicinal chemistry for 2020. Future Med. Chem. 3, 1765–1786 (2011).
230. Rafferty, M. F. No denying it: medicinal chemistry training is in big trouble. J. Med. Chem. 59, 10859–10864 (2016).
231. Allen, D. Where will we get the next generation of medicinal chemists? Drug Discov. Today 21, 704–706 (2016).
232. Tomov, T. E. et al. DNA bipedal motor achieves a large number of steps due to operation using microfluidics-based interface. ACS Nano 11, 4002–4008 (2017).
233. Lehn, J. M. & Eliseev, A. V. Dynamic combinatorial chemistry: evolutionary formation and screening of molecular libraries. Science 291, 2331–2332 (2001).
234. Mondal, M. & Hirsch, A. K. Dynamic combinatorial chemistry. Chem. Soc. Rev. 44, 2455–2488 (2015).
235. Vermesan, O. & Friess, P. Internet of Things – Converging Technologies for Smart Environments and Integrated Ecosystems (River Publishers, 2013).
236. Carroll, G. P., Srivastava, S., Volini, A. S., Piñeiro-Núñez, M. M. & Vetman, T. Measuring the effectiveness and impact of an open innovation platform. Drug Discov. Today 22, 776–785 (2017).
237. Schneider, P. & Schneider, G. De novo design at the edge of chaos. J. Med. Chem. 59, 4077–4086 (2016).
238. Dimova, D., Heikamp, K., Stumpfe, D. & Bajorath, J. Do medicinal chemists learn from activity cliffs? A systematic evaluation of cliff progression in evolving compound data sets. J. Med. Chem. 56, 3339–3345 (2013).
239. Munos, B. Lessons from 60 years of pharmaceutical innovation. Nat. Rev. Drug Discov. 8, 959–968 (2009).
240. Sneddon, H. Embedding sustainable practices into pharmaceutical R&D: what are the challenges? Future Med. Chem. 6, 1373–1376 (2014).
241. Djuric, S. W., Hutchins, C. W. & Talaty, N. N. Current status and future prospects for enabling chemistry technology in the drug discovery process. F1000Res 5, 2426 (2016).
242. Scannell, J. W., Blanckley, A., Boldon, H. & Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discov. 11, 191–200 (2012).
243. Mignani, S., Huber, S., Tomás, H., Rodrigues, J. & Majoral, J. P. Why and how have drug discovery strategies in pharma changed? What are the new mindsets? Drug Discov. Today 21, 239–249 (2016).
244. Gautam, A. & Pan, X. The changing model of big pharma: impact of key trends. Drug Discov. Today 21, 379–384 (2016).
245. Reutlinger, M. & Schneider, G. Nonlinear dimensionality reduction and mapping of compound libraries for drug discovery. J. Mol. Graph. Model. 34, 108–117 (2012).
246. Hawkes, S. Y. F. W., Chapela, M. J. V. & Montembault, M. Leveraging the advantages offered by microfluidics to enhance the drug discovery process. QSAR Comb. Sci. 24, 712–721 (2005).
247. Werner, M. et al. Seamless integration of dose–response screening and flow chemistry: efficient generation of structure–activity relationship data of β-secretase (BACE1) inhibitors. Angew. Chem. Int. Ed. 53, 1704–1708 (2014).
248. Czechtizky, W. et al. Integrated synthesis and testing of substituted xanthine based DPP4 inhibitors: application to drug discovery. ACS Med. Chem. Lett. 4, 768–772 (2013).
249. Pagano, N. et al. An integrated chemical biology approach reveals the of HIV replication inhibitors. Bioorg. Med. Chem. http://dx.doi.org/10.1016/j.bmc.2017.03.061 (2017).

Acknowledgements
P. Dittrich, A. deMello, Boehringer-Ingelheim Pharma and AstraZeneca contributed photographs of automated discovery devices. The author thanks M. Kossenjans, J. Hiss, P. Schneider, J. B. Brown, J. Kriegl and R. King for stimulating discussions on the future of drug discovery and process automation. The author was financially supported by the Swiss Federal Institute of Technology (ETH) Zurich, the Swiss National Science Foundation (grant numbers: 200021_157190, CR32I2_159737), the European Union Framework Programme for Research and Innovation (Horizon 2020, Marie Skłodowska–Curie ITN grant numbers: 676434 ‘BIGCHEM’, 675555 ‘AEGIS’) and the OPO-Foundation Zurich.

Competing interests statement
The author declares competing interests: see Web version for details.

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
