Using Computational Chemistry to Understand & Discover Chemical Reactions

K. N. Houk & Peng Liu

Downloaded from http://direct.mit.edu/daed/article-pdf/143/4/49/1830827/daed_a_00305.pdf by guest on 28 September 2021

Abstract: Chemistry, the “science of matter,” is the investigation of the fabulously complex interchanges of atoms and bonds that happen constantly throughout our universe and within all living things. Computational chemistry is the modeling of chemistry using mathematical equations that come from physics. The field was made possible by advances in computer algorithms and computer power and continues to flourish in step with developments in those areas. Computational chemistry can be thought of as both a time-lapse video that slows down processes by a quadrillion-fold and an ultramicroscope that provides a billion-fold magnification. Computational chemistry can quantitatively simulate simple chemistry, such as the chemical reactions between molecules in interstellar space. The chemistry inside a living organism is dramatically more complicated and cannot be simulated exactly, but even here computational chemistry enables understanding and leads to discovery of previously unrecognized phenomena. This essay describes how computational chemistry has evolved into a potent force for progress in chemistry in the twenty-first century.
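
To give a feel for the two scale factors in the abstract, the short sketch below applies them to order-of-magnitude figures the essay itself quotes for transition states (a lifetime under 10^-13 seconds, a diameter of about 10^-9 meters, and a one-femtosecond "shutter speed"). The function and variable names are illustrative choices made for this sketch, not anything taken from a chemistry package.

```python
# Illustrative arithmetic only: the two factors come from the abstract, and the
# sample durations and sizes are order-of-magnitude values quoted in the essay.

SLOWDOWN = 1e15        # "quadrillion-fold" time-lapse factor
MAGNIFICATION = 1e9    # "billion-fold" magnification factor


def slowed_duration(seconds: float, factor: float = SLOWDOWN) -> float:
    """Apparent duration of a molecular event after the slow-down is applied."""
    return seconds * factor


def magnified_size(meters: float, factor: float = MAGNIFICATION) -> float:
    """Apparent size of a molecular object after the magnification is applied."""
    return meters * factor


if __name__ == "__main__":
    femtosecond = 1e-15   # one simulated "snapshot" interval, in seconds
    ts_lifetime = 1e-13   # upper bound on a transition-state lifetime, in seconds
    nanometer = 1e-9      # approximate transition-state diameter, in meters

    print(f"1 fs, slowed down:       {slowed_duration(femtosecond):.0f} s")
    print(f"1e-13 s, slowed down:    {slowed_duration(ts_lifetime):.0f} s")
    print(f"1 nm, magnified:         {magnified_size(nanometer):.0f} m")
```

On these figures, a femtosecond stretches to about a second and a nanometer-sized structure appears about a meter across, which is the sense in which a simulation acts as both a slow-motion camera and a microscope.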

In chemistry class, we learn that chemists study matter and its properties; they wear lab coats and safety glasses and mix chemicals together and observe the amazing things that happen. But there is no need to go into a chemical laboratory to find chemistry. In fact, chemistry is literally everywhere: it is the thousands of chemical processes that result in the emergence of a growing plant from a seed, the transformation of flower nectar into the flight of a hummingbird, or the conversion in chemical factories of oil from decayed ancient life into polymers that are made into stylish fabrics or spacesuits. How do these things happen? Chemists learn how chemical reactions occur and how to control them for human purposes. In the twenty-first century, computational chemistry plays a major role in chemical discovery.

K. N. HOUK, a Fellow of the American Academy since 2002, is the Saul Winstein Chair in Organic Chemistry in the Department of Chemistry and Biochemistry at the University of California, Los Angeles.

PENG LIU is an Assistant Professor of Chemistry at the University of Pittsburgh.

(*See endnotes for complete contributor biographies.)

Dædalus, the Journal of the American Academy of Arts & Sciences. © 2014 by the American Academy of Arts & Sciences. doi:10.1162/DAED_a_00305

Before the twentieth century, knowledge about the properties and transformations of matter was gained through experimentation. Early chemical theories and rules, such as Mendeleev’s periodic table, were empirically derived from observations of chemical phenomena. Some theories were wrong (for example, the phlogiston theory, which posited the existence of an element called phlogiston in order to explain combustion), while others were very crude models. The discovery of quantum mechanics in the 1920s revolutionized science. Heisenberg, Schrödinger, and other physicists developed a theory based on pure mathematics that explains how chemistry arises from the interactions of nuclei and electrons.1 Paul Dirac, one of the Nobel Laureates for quantum mechanics, noted in 1929:

The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation.2

Exactly as Dirac envisioned, a hierarchy of mathematical models, with different levels of approximation, has been developed over the last century.3 But Dirac could not foresee the discovery and development of powerful computers with which we can solve some of these highly complex problems of applied mathematics. While we still cannot obtain exact solutions to the quantum mechanical equations for chemical systems with very large numbers of atoms, we can calculate answers as close as desired to the exact mathematical solution, given enough computer time. When more powerful computers become available, computational chemists will set out to solve bigger and bigger problems and try to obtain ever more accurate solutions to small problems.

Experiments yield facts, such as which products are formed when various chemicals come into contact or how much electricity is generated when sunlight shines on a chunk of silicon or sandwich of organic polymers. However, experiments do not tell us why such results occur. For example, why are certain products formed and not others, or why is only a few percent of the energy in sunlight converted to electricity? Both theory and computation are needed to answer these questions: theory to provide the general framework and simple models for a qualitative conceptual underpinning of experimental phenomena, and computation to flesh out an accurate microscopic account of them.

Today’s chemists attempt to employ computations to explain phenomena and guide new experiments, but quantitative modeling of chemical reactions is very challenging due to problems of scale. The chemical phenomena that we observe are the outcomes of rearrangements of the atomic structures of a huge number of very small molecules. A water droplet contains around one sextillion (10^21) molecules, each with a slightly different shape, velocity, and energy at any given moment. The atoms in each water molecule are moving rapidly inside the droplet: the atoms change to a new arrangement 10^14 times per second. To completely reproduce the properties of that droplet and predict how it will change upon heating or mixing with other chemicals would require simulating all sextillion of the droplet’s fast-moving molecules, were we to compute everything from exact quantum mechanical equations (or “first principles”). Modern computers can calculate how one molecule changes over time, but to calculate all sextillion or even a significant fraction of them is not practical, nor will it be anytime in the foreseeable future. However, approximate equations (model systems calibrated with empirical data to capture the average properties of a water molecule and its interactions with other molecules) can be computed to allow us to understand what occurs in the drop of water and to estimate its properties: density, surface tension, viscosity, and even chemical reactivity.

Aside from the daunting numbers of calculations that must be performed to mimic reality, there is also the issue of the size of some important molecules: smaller molecules are, of course, much simpler to model. As the number of electrons in a molecule increases, the time needed to perform calculations on it goes up rapidly. A hydrogen molecule (H2) consists of two of the lightest atoms bonded together and only two electrons; natural gas consists primarily of methane (CH4), which has only five light atoms and ten electrons. Everything about individual hydrogen and methane molecules can be computed nearly exactly in a short time. However, many molecules of crucial importance for life, and those that make up common materials, are much larger. Consider a nucleic acid molecule, such as a strand of DNA, or a protein that controls so many of the processes of life, or a polymer molecule in a polystyrene cup: each molecule contains thousands of atoms and can exist in many different three-dimensional arrangements that interconvert very quickly. To simulate the behavior of chemicals with so many atoms takes many computer resources. Depending on how accurate the calculations need to be, the number of hours needed to perform computations on molecules scales between the third and the seventh power of the number of electrons contained within them! This means, for example, a calculation involving a benzylpenicillin molecule with forty-one atoms can be up to two million times slower than the same calculation done with a methane molecule. Because of this, calculations on such large molecules must involve shortcuts that make the calculations faster but less accurate.

Computational modeling is the simulation of chemical structures, properties, and reactions with a computer. Simulation is sometimes described as the third form of science. The first form, experimental science, starts with empirical observations and models created from inductive logic. The second form is theoretical science, formulated in equations that describe the phenomena of the natural world. Simulation is a third form where mathematical equations are coded into computer programs to predict what happens in various hypothetical chemical situations.

The fundamental theories used in these computer programs are based on classical and quantum mechanics. Galileo, Kepler, Newton, and other scientific revolutionaries of the seventeenth century developed what we now call classical mechanics, which describes the physics of relatively large objects moving on a human timescale. Newton’s equations of motion (as these equations are often called) are used for molecular dynamics to derive the motion of molecules or larger objects over time. Classical mechanics can also be used to study structures of molecules by fitting equations to empirical data, which chemists call “molecular mechanics.” However, classical mechanics cannot predict chemical reactions and reactivity, because the motions of electrons are wavelike, quantized, and described correctly only by quantum mechanics. For more massive and slowly moving systems, classical and quantum mechanics converge, but quantum mechanics is uniquely capable of describing the electronic structure of atoms and molecules and thus chemical properties and reactions. Multiscale computational methods, which employ both classical and
quantum mechanics, have been developed for calculations of complex chemical and biological systems, such as proteins. Here, quantum mechanics is applied to study the central part of the system: for example, atoms that are close to the forming or breaking bonds in a reaction. The remaining atoms are treated with classical mechanics so that such calculations can be applied to very large systems. In 2013, three of the pioneers in this field, Martin Karplus, Michael Levitt, and Arieh Warshel, were awarded the Nobel Prize in Chemistry for their studies in the early 1970s that established what is now called the QM/MM method.

143 (4) Fall 2014

Based on these underlying theories, many computer algorithms to calculate properties and reactions were written in the last century, and these developments continue to this day. While the fundamental equations of quantum mechanics are deceptively simple, the computer programs written in order to use them in simulations are extremely complicated. In 1998, the Nobel Prize in Chemistry was awarded to John Pople, a mathematician and chemist whose research group developed many of these algorithms and computer programs, and Walter Kohn, a physicist who, with his coworkers, developed an alternative method of solving the Schrödinger equation, now known as density functional theory (DFT). Pople and Kohn were at the forefront of using computational methods in chemistry, inspiring many mathematicians, mathematical chemists, and physicists to devise algorithms and computer methods for studying chemical phenomena. For example, one of the main programs now used for these calculations has seventy-four authors from all over the world!4 These authors and many other scientists worked over the last fifty years on various computational chemistry programs that are now in general use by chemists. The programs have become very useful for exploring real chemistry, and software companies have been formed to further develop and market these programs commercially.5

The progress described here was stimulated by the development of computers. The ENIAC (Electronic Numerical Integrator and Computer) and other general-purpose computers in the 1940s occupied space equal to a comfortable house for four people. Now, the computers that reside in our smartphones are about a trillion times more powerful. Furthermore, computational chemists have access to computers all over the world, and rapid Internet connections give computational research groups in the United States access to a whole network of powerful computers supported by the National Science Foundation and other federal research agencies. Other countries have similar networks, and there is international competition to produce the most powerful computer.

The impact of these developments on the capabilities of computational chemistry has been profound, and the field has become an increasingly important aspect of science. In the flagship journal of chemistry, the Journal of the American Chemical Society, the number of computational papers has risen from very few in the 1960s to over three hundred papers per year. Along with the growth of computational chemistry in mainstream journals, there has been a proliferation of journals specifically devoted to the subject. There is the Journal of Computational Chemistry, the Journal of Chemical Theory and Computation, and at least two dozen other journals that concentrate on the study of chemistry using computation. Chemistry is not unique in this regard: physics has a dozen such journals, and biology has many publications that emphasize computation.

The most successful computations also lead to the development of general concepts that can be used to guide future experiments and make predictions. This is, of course, helpful in the field of organic chemistry, the area of expertise of the authors of this article. Organic chemistry may be notorious as a gatekeeper for future doctors, but it is really an intellectually rich and challenging branch of chemistry that involves chemical compounds containing carbon atoms, along with any of the other atoms of the periodic table. Organic chemistry touches all of our existence, from life-saving and life-enhancing pharmaceuticals to fuels, insecticides, and organic electronic materials. We describe in the following pages how computations are used to explore and understand organic chemistry.

In 1965, R. B. Woodward and Roald Hoffmann published one of the most influential conceptual developments that thrust theory, and eventually computation, into the forefront of organic chemistry.6 Although these concepts were grounded in previous developments by many scientists, they came to be known by chemists as the Woodward-Hoffmann rules. Based on quantum mechanical principles, these rules give predictions about a particular class of organic chemical reactions in which bond formation and bond breakage occur simultaneously in a ring of atoms. One example of such a reaction is the ring opening of a molecule known as cis-3,4-dimethylcyclobutene, shown in Figure 1.

With these images, we launch into real organic chemistry and hope to introduce the reader to the visual world that organic chemists occupy and that computational organic chemists study. The first two pictures shown in Figure 1 are computer drawings of the three-dimensional structure of the cis-3,4-dimethylcyclobutene molecule. While it can also be represented by its formula, C6H10, there are many different molecules with that same formula, each of which has unique properties. The exact way that the atoms are arranged in this figure was predicted by a quantum mechanical calculation. The calculation takes ten minutes today using a powerful desktop computer, but thirty years ago when the study began, the same calculations took one week and involved a small roomful of equipment. Chemists have developed these types of pictures for the rapid visual representation of what is actually a very complicated mathematical result in a computer.

Experiments had already shown that these reactions typically occur with the rotation of both termini in the same direction (this motion was called “conrotatory” by Woodward and Hoffmann), rather than opposite directions (called “disrotatory”). As shown in Figure 1, each of the two motions leads to a distinct product called a stereoisomer. Different stereoisomers have the same atoms and bonds, but they are connected together in different three-dimensional spatial arrangements. This difference in shape gives stereoisomers distinct chemical properties: molecules that have the same “structure” but a different shape may turn out to be a life-saving drug or a poison, depending on their three-dimensional shape. It is therefore important to understand and control which stereoisomer is formed in a reaction used to make the molecule.

Using qualitative reasoning and supported by the very approximate calculations possible at the time, Woodward and Hoffmann provided an elegant quantum mechanical interpretation of the selectivity shown in Figure 1. The conrotatory process maximizes bonding all along the reaction pathway (it is “allowed,” or occurs rapidly), while the disrotatory opening involves a motion that would require a very high energy to occur (it is therefore “forbidden”). They described these principles in terms of the symmetries of orbitals, the regions in which electrons are located according to the usual form of quantum mechanics used to describe molecules. The insights led to
new understanding of a broad segment of organic chemistry and to predictions of new reactions that were subsequently discovered experimentally.

Figure 1
Three-Dimensional “Space-Filling,” “Ball-and-Stick,” and Schematic Representations of cis-3,4-dimethylcyclobutene and the Conrotatory and Disrotatory Reaction Products

[Figure 1 image: cis-3,4-dimethylcyclobutene gives cis,trans-2,4-hexadiene by conrotatory (allowed) opening and trans,trans-2,4-hexadiene by disrotatory (forbidden) opening.]

In the two structures on the far left, spheres represent the positions of the atoms in the reactant molecule. The larger spheres show the carbon atoms, and the smaller spheres show the hydrogen atoms. The size of the atoms in the “space-filling” picture represents their van der Waals radii (which measure how close two atoms can approach). Smaller spheres were used for the “ball-and-stick” picture to illustrate the chemical bonds (indicated by bold lines) that are formed by a buildup of electrons between the nuclei. The third picture is a sketch of the same molecule, with the atoms represented by letters and the large balls representing the larger size of the “substituent” methyl groups (CH3; the bold lines indicate that the H atoms are in the foreground, and the dotted lines indicate that the CH3 groups recede backward). The sketches in brackets show the changes occurring in the reactions. The dashed lines indicate bonds that are breaking in the reaction. The products of the reactions are shown in the “ball-and-stick” and sketch renditions to the right of the arrows. Source: Figure prepared by the authors using data in R. Hoffmann and R. B. Woodward, “Stereochemistry of Electrocyclic Reactions,” Journal of the American Chemical Society 87 (1965): 395–397.

Chemists found through experiments that their understanding was incomplete. In examples such as that shown in Figure 2, there are two different “allowed” conrotatory processes: namely, rotation in a clockwise or in a counterclockwise fashion. The Woodward-Hoffmann rules did not differentiate between the two, but researchers found a huge preference for one direction of rotation in several cases studied experimentally.7 The difference between the activation energies required for these processes to occur (30.5 kcal/mol for the inward conrotatory rotation and 49.7 kcal/mol for the outward conrotatory rotation) is enough to make the observed inward-rotating reaction ten billion times faster than the non-observed outward-rotating reaction. The result was very puzzling, since the larger CF3 group bumps into the other CF3 in the faster reaction, and nature usually minimizes such bumping (which we call steric clashes), but not here. Our group used quantum mechanics and computational chemistry to understand why.

Quantum mechanical simulations of these reactions showed the motions of the nuclei and electrons in these molecules as they change from reactants to either of the two possible products. These calculations also determine how much energy it takes for the bonds of one molecule (the reactant) to be reorganized through nuclear and electronic rearrangements to form another molecule (the product). The “transition state” is the highest-energy point along the best path from reactant to product as marked in Figure 2,8 and it is the energy of this transition state relative to the reactants (activation energy, Ea) that determines how fast or slow the reaction will occur. By quantum mechanical calculations, we found out what the transition states look like (rough sketches are given in Figure 2) so that we could analyze them to understand why one transition state is much lower in energy than the other.

Figure 2
Two “Allowed” Reactions and the Large Activation Energy Differences in the Two Modes of Ring Opening of trans-perfluoro-3,4-dimethylcyclobutene

The horizontal lines represent the relative energies of the reactants (left), transition states (“TS,” center), and products (right), and the lines show how these are interrelated. The experimentally measured activation energies (Ea) are shown next to the corresponding transition states. Lower activation energy means a faster reaction, and so the reaction follows the path with the lowest activation energy (indicated by the solid lines), even though the product of that path is less stable. Source: Figure prepared by Peng Liu using data from W. R. Dolbier, Jr., H. Koroniak, D. J. Burton, A. R. Bailey, G. S. Shaw, and S. W. Hansen, “Remarkable, Contrasteric, Electrocyclic Ring Opening of a Cyclobutene,” Journal of the American Chemical Society 106 (1984): 1871–1872.

Calculations of this type could also be performed for other atoms and groups besides F and CF3, and we eventually learned that certain types of substituents–those
we call electron-donors (D in Figure 3)–always rotate outward away from the breaking bond, but strong electron-acceptors (A in Figure 3) rotate inward toward the breaking bond. Donors are already surrounded by electrons and thus avoid interacting with the electrons of the bond that breaks, but acceptors seek electrons and tend to move toward the breaking bond (see Figure 3). The shaded shapes are meant to represent the regions where electrons are localized and would have repulsive interactions with other filled orbitals nearby. The empty shapes represent regions that do not have electrons but would like to; these empty orbitals cause an acceptor to be electron-loving, or “electrophilic.” Quantum mechanics shows that the interaction of a filled orbital with a vacant orbital is favorable, while the interaction of two filled orbitals is repulsive. This is the basis of bonding and steric effects, respectively.

Figure 3
Orbital Interactions in Two Conrotatory Transition States in the Electrocyclic Ring Opening of a cis-3-donor-4-acceptor-cyclobutene

The torquoselectivity model developed from calculations predicts (correctly) that the counterclockwise motion shown on the top line of the figure is highly preferred. Source: Figure prepared by the authors using data from N. G. Rondan and K. N. Houk, “Theory of Stereoselection in Conrotatory Electrocyclic Reactions of Substituted Cyclobutenes,” Journal of the American Chemical Society 107 (1985): 2099–2111.

Since the bond twists, or torques, as it breaks, we describe this selective twisting as “torquoselectivity.”9 Computations were used first to reproduce the initial experiment, then to analyze the results, and then to develop concepts and rules to predict the results of similar future experiments. The principle here has been applied to predict the course of reactions of new substances synthesized for the first time. For example, Figure 4 shows a simple molecule, 3-formylcyclobutene, which was made in our laboratory in 1987 for the first time to test the prediction made about the unexpected stereochemistry of this reaction.10 Based upon our torquoselectivity theory, we expected that the formyl group (labeled “CHO” in Figure 4), an acceptor-type substituent, would rotate inward, and we used a quantum mechanical simulation to predict exactly how much this was preferred over an outward rotation. Then we did the experiment, and it worked! Only the less stable product (shown on the top line of Figure 4) is formed.

Figure 4
Computations Predicted the Torquoselectivity of Electrocyclic Ring Opening of 3-Formylcyclobutene

Source: Figure prepared by the authors using data from K. Rudolf, D. C. Spellmeyer, and K. N. Houk, “Prediction and Experimental Verification of the Stereoselective Electrocyclization of 3-Formylcyclobutene,” Journal of Organic Chemistry 52 (1987): 3708–3710.

This example illustrates that when quantum mechanics is applied to a small enough molecule, an accurate prediction of the product of a new reaction is possible. These calculations also led to the development of the theory of torquoselectivity that is applicable to every other reaction of this type.

Another significant application for computational chemistry is the development of catalysts (substances that speed up a reaction but are not consumed during the reaction). Catalysts cause reactions that normally do not occur at all to take place under conditions that are easily achievable. Chemists aim to develop catalysts to achieve new chemical transformations or to increase the efficiency of valuable reactions.

Many reactions used in the chemical industry involve transition metal catalysts (transition metals are so called because they have partially occupied d orbitals and therefore special properties). Three Nobel Prizes in Chemistry have been awarded in the last fifteen years (in 2001, 2005, and 2010) for discoveries about organic reactions using these catalysts.11 The study of chemicals containing carbon and metal atoms in the same molecule is called organometallic chemistry and is now a prominent field of chemistry. The carbon atoms are part of the ligands attached to the metal.

Catalytic reactions generally involve many steps and intermediates that are usually difficult to detect or identify experimentally, although new spectroscopic and imaging tools are being developed to try to achieve this. In the glory days of mechanistic physical organic chemistry, the masters of the field, such as Saul Winstein at UCLA and Paul D. Bartlett at Harvard, devised many clever experiments using the instrumentation available at the time to try to deduce how reactions occurred in solutions. Major controversies often developed about the interpretation of the experiments. Nowadays we try to gain this mechanistic information about catalytic reactions by looking directly at molecules as they react using computer simulations.

Shown in Figure 5 is olefin metathesis, a very important reaction in industry and laboratory synthesis. Metathesis means transposition, in this case of the atoms from one olefin to another. An olefin has two carbons joined by a double bond; each single bond is made of one pair of electrons (a double bond, comprising two pairs of electrons, is represented by two lines). The metathesis reaction shown in Figure 5 swaps the atoms making up the ends of each double-bonded olefin. This reaction provides one of the most powerful strategies for making new carbon-carbon double bonds and is important for the synthesis of many complex organic compounds and polymeric materials like those used for many familiar objects, from Norsorex pants for motorcyclists to gigantic wind turbine (modern windmill) blades.

Figure 5
The Olefin Metathesis Reaction Swaps the Ends of Olefins

The usual products of these reactions, called E-olefins, have the X and Y groups on opposite sides of the double bond. Source: Figure prepared by the authors using data from R. H. Grubbs and S. Chang, “Recent Advances in Olefin Metathesis and Its Application in Organic Synthesis,” Tetrahedron 54 (1998): 4413–4450.

A major challenge that developed during the study of olefin metathesis was to find catalysts that form “Z-olefins,” in which the two substituents (X and Y) in the product are on the same side of the double bond. Many years after the discovery of olefin metathesis, chemists were still trying to learn how to make Z-olefins this way so that new compounds and materials could be synthesized via the olefin metathesis process. In 2009, chemists Amir Hoveyda and Richard R. Schrock found the first molybdenum- and tungsten-based olefin metathesis catalysts that selectively produce Z-olefins,12 and two years later, chemist Robert H. Grubbs, one of the Nobel Laureates in this field (along with Richard R. Schrock and Yves Chauvin), discovered a new type of ruthenium catalyst that performed the same function (Figure 6). Why this catalyst produced Z-olefins, however, was not known.13

Figure 6
The Olefin Metathesis to Give Z-Olefins

Source: Figure prepared by the authors using data from K. Endo and R. H. Grubbs, “Chelated Ruthenium Catalysts for Z-Selective Olefin Metathesis,” Journal of the American Chemical Society 133 (2011): 8525–8527.

Determining why this ruthenium catalyst produced Z-olefins was a difficult challenge to computational chemistry. The molecules involved in this reaction are large and contain metals, which have high atomic numbers and dozens of electrons. Consequently, hundreds of computer hours are needed for each computation. In addition, there are many structures to compute due to the great number of ways the reaction could occur and the multiple structures involved in each case. Extensive experimental and computational studies of olefin metathesis with previously reported ruthenium catalysts have been carried out all over the world in the last few decades. Based on those studies, our group had many clues about what types of reactions we needed to investigate. We were able to limit the number of structures to evaluate, rather than having to compute every possibility for this complex reaction.14 Given previous results, there were really only two plausible pathways, distinguished from one another by the direction from which the olefin molecule approaches the catalyst. The approach can be either adjacent or opposite to the ligand. These two pathways are shown in Figure 7 and are called “side” and “bottom” approaches.

Our computations revealed a major surprise and a crucial discovery: in contrast to everything known before, this reaction with the new ruthenium catalysts involves side approach of the olefin to the catalyst.15 Computational technology allowed us to render visualizations of the three-dimensional structures of the transition states (Figure 7). Scientists use microscopes to see microbes and employ atomic force microscopes (AFM) to study materials at the atomic level (for example, in the burgeoning field of nanotechnology). By contrast, there are currently no established experimental tools to visualize transition states, since they are only about 10^-9 meters (1 nanometer) in diameter and exist for less than 10^-13 seconds! (Chemists like Ahmed Zewail at Caltech are working to develop such experimental tools.) Computations, however, are able to bring the transition state to life by taking snapshots of simulated reactions as they happen (imagine a camera with a shutter speed of one femtosecond, or 10^-15 seconds!) and by functioning like a super-high-power microscope (with 10^9 times magnification!). Although it is a prediction that cannot currently be verified directly, this picture enables us to interpret important occurrences in this reaction, such as how individual atoms attract or repel each other, that would otherwise be impossible to observe. Such an analysis revealed that the Z-olefin is selectively formed due to the “side” approach of the olefin molecule in the catalyst complex, as shown in Figure 7. The olefin approaching in this way clashes with the ligand and places the substituents on the olefin on the same side of the newly formed double bond to form the Z-olefin. This is very different from previous reactions using other ruthenium catalysts, in which the olefin approaches from the bottom, far away from the ligand, causing the formation of more stable E-olefins rather than Z-olefins.

These computations provided important insights for further catalyst development. Armed with the knowledge that Z-selectivity in the new catalysts arises from the repulsions with the ligand on the catalyst, researchers began experimental studies of catalysts with even larger ligands. This led to the discovery of an improved Z-selective

catalyst by the Grubbs group.16 Computational investigations of this type have become a standard way to accelerate understanding and discovery, and many experimental groups have become involved in computational work to complement their experiments.

Figure 7. Three-Dimensional Renditions of the Computed Transition State Structures of the Possible Approaches of the Olefin Molecule
a) Shows the olefin approaching the ruthenium catalyst adjacent to the ligand (the “side” approach); b) shows the olefin approaching opposite the ligand on the ruthenium catalyst (the “bottom” approach). A qualitative rendering is shown at the right. Source: Figure prepared by the authors using data from P. Liu, X. Xu, X. Dong, B. K. Keitz, M. B. Herbert, R. H. Grubbs, and K. N. Houk, “Z-Selectivity in Olefin Metathesis with Chelated Ru Catalysts: Computational Studies of Mechanism and Selectivity,” Journal of the American Chemical Society 134 (2012): 1464–1467.

The Z-selective catalyst was discovered accidentally through experiments; now computations have helped to determine precisely which experiments might improve such catalysts. Similar computational approaches are also being used to predict new catalysts for pharmaceuticals, fuels, and materials. For example, computational materials scientists calculate how different combinations of metals produce useful metal alloys as catalysts.17

Nature generally uses proteins, and sometimes ribonucleic acids (RNA), to catalyze the reactions necessary for metabolism at the rates required to sustain life. Proteins are poly-amino acids of the general structure shown in Figure 8 (a), where “R” can be any of the twenty different side chains of natural amino acids. The amino acid fragments are connected in a specific sequence that determines the structure and properties of a protein.

Our research group at UCLA collaborates with David Baker’s group at the University of Washington and Stephen Mayo’s group at Caltech to design new
enzymes that catalyze “non-natural reactions”: the many reactions not catalyzed by naturally occurring enzymes. We do this by using computer calculations to predict which protein structures will fold into a specific three-dimensional structure like those shown in Figure 8 (b) and (c). We can then try to create a fold that will align the catalytic groups from the protein in order to catalyze a desired reaction, perhaps one that has known practical or commercial value or perhaps simply one we dreamed up. If we can predict the amino acid sequence (the list of individual amino acids and the order in which they are connected) needed, it is a simple matter for chemists to use automatic machines to synthesize the desired DNA and for molecular biologists to incorporate this DNA into a microorganism and induce it to produce these new proteins.

Figure 8. The General Structure or “Primary Sequence” of a Protein (a); the Three-Dimensional Representations of a Protein, Cytochrome P450cam, Showing All Atoms in a Space-filling Display (b); and a “Ribbon Diagram” of Protein Architecture (c)
Source: The three-dimensional protein structures are illustrated using PyMol (Version 1.3 Schrödinger, LLC) with structure obtained from the Protein Data Bank (PDB ID: 2ZWT); and K. Sakurai, H. Shimada, T. Hayashi, and T. Tsukihara, “Substrate Binding Induces Structural Changes in Cytochrome P450cam,” Acta Crystallographica Section F 65 (2009): 80–83.

We use quantum mechanical calculations to design optimal arrangements of protein active site components that are predicted to catalyze specific reactions. David Baker’s group has developed a computer program called Rosetta that predicts the amino acid sequences of proteins that will fold up into a specific three-dimensional structure. The program is based on classical mechanics and empirical information; because the proteins that are studied are large and can adopt many shapes, accurate quantum mechanical calculations would take too long to be useful. Although they only produce approximate models, Rosetta and other programs can nevertheless offer valuable information about which arrangement of atoms in proteins is most stable. Realizing the potential of these tools, our group and David Baker’s began about ten years ago to create what we call the inside-out approach to enzyme design (outlined in Figure 9).18 We start with quantum mechanical models of the transition states of the chemical reaction we wish to catalyze and calculate which protein side chains will stabilize the transition state. This becomes a model for the core of the enzyme where catalytic reactions occur. We call this computational model a theoretical enzyme, or “theozyme.” Then, using Rosetta, we find a stable protein structure in the Protein Data Bank (a database of the three-dimensional structures of all known proteins) that can be modified to achieve the designed structure with the necessary catalytic groups aligned in the perfect positions for catalysis.19 After extensive computational tests using both quantum mechanics and classical molecular dynamics, the best computationally designed enzymes are selected for experimental testing. The actual proteins are produced by modified microorganisms such as E. coli and are then tested for activity. Using this procedure, we have successfully produced new effective catalytic proteins for three different types of reactions.20 The entire process of designing new enzymes through computation currently takes years; however, it takes much longer (billions of years) for nature to evolve enzymes for metabolism. Right now we must do many thousands of calculations to make predictions of new enzymes, and even then only a small fraction of the computationally designed enzymes are active in the experiment. Nevertheless, designing protein catalysts from scratch using only computer calculations is a major development, and we envision that this technology will lead to catalysts for the synthesis of many important compounds and for therapeutic purposes as well.

These examples illustrate a very small portion of the field of computational chemistry. Computer programs that calculate and predict properties of chemical systems using a combination of theoretical methods have been developed for use in many areas of chemistry. One well-established example of the integration of multiple computational tools to solve important problems is the field of computational drug design.21 Calculations in this enterprise range from structural evaluations that quickly screen thousands of candidate molecules for use in drugs to elaborate simulations of substrate-protein binding that can predict whether a molecule will act as a good inhibitor for a target protein involved in a disease. Such calculations have proven their worth in developing new enzyme inhibitors, although the path from effective inhibitors to commercial drugs is still long, expensive, and mostly empirical.

Yet another innovative use of computational chemistry is in developing new materials for many different industries. Computational chemists are developing programs based on a combination of quantum and classical mechanics to compute the properties of structural materials, solar-energy conversion devices, and new chemical batteries. Aiming to aid the development of computational architecture and methodology for materials chemistry, the White House approved the Materials Genome Initiative in 2011.22 The name evokes the remarkably successful Human Genome
Project and extends the idea to the world of materials. It has therefore been recognized at the highest policy level that computational methods can accelerate the discovery of advanced materials and shorten the process of deploying them to the commercial market.

Figure 9. Overview of the “Inside-Out” Computational Approach to Designing Unnatural Enzymes
Source: G. Kiss, N. Çelebi-Ölçüm, R. Moretti, D. Baker, and K. N. Houk, “Computational Enzyme Design,” Angewandte Chemie International Edition 52 (2013): 5700–5725.

Innovations in computer hardware design will continue to enhance the scope of computational chemistry. For example, the development of graphics processing units (GPUs) by the computer industry has energized the entertainment industry. The success of computer gaming has made these devices inexpensive, and computational chemists are rushing to adapt their programs to these commodity devices in order to enhance their modeling capabilities.23

What will we be able to do with these computers of the future? We have discussed in this article how computations are applied to study the way chemical reactions occur and to improve and extend them. In the future, this will be done more accurately, more quickly, and on much larger systems, producing more realistic models of chemistry. Predicting completely new chemical transformations is likely to remain challenging, because so many different bonds may be made or broken in a chemical reaction and, as we stated above, the combinations of even relatively simple pure chemicals can lead to a huge number of reaction pathways that grows exponentially with the number of atoms involved. Experimental knowledge about existing reactions may help chemists guess the outcome of an unknown reaction, but important discoveries in chemistry often result from discoveries of new types of transformations that occur in unexpected ways, not from simple extensions of known phenomena. Quantum mechanics can predict things that have never been observed. To predict what reaction happens when a new combination of chemicals is tested requires the evaluation of every possibility. Computational chemists are working on methods to predict reactions and their rates based solely on the information about the separated reactants, catalysts, solvents, and reaction conditions, essentially forcing molecules together and seeing what happens in the computer.24

We have described how computational chemists go about exploring chemistry and developing new models and theories to understand nature and predict useful things. Better algorithms and increasing computer power make ever larger and more accurate calculations possible, and this challenges computational chemists to take on larger and more complex problems. Computational chemistry has grown from the breakthrough theory of the early twentieth century into a ubiquitous and powerful engine for chemical discovery in the twenty-first.
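
The growth in candidate reaction pathways described above can be made concrete with a deliberately crude count. The sketch below is only an illustration under stated assumptions, not a method from this article: it treats every pair of atoms as a bond that either changes or does not, ignores valence rules and energetics entirely, and uses hypothetical function names.

```python
from math import comb


def candidate_atom_pairs(n_atoms: int) -> int:
    """Count distinct atom pairs; each is a bond that could, in principle, form or break."""
    return comb(n_atoms, 2)


def upper_bound_change_sets(n_atoms: int) -> int:
    """Toy upper bound (an assumption for illustration only):
    every non-empty subset of atom pairs is one candidate set of bond changes."""
    return 2 ** candidate_atom_pairs(n_atoms) - 1


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(f"{n:>2} atoms: {candidate_atom_pairs(n):>3} pairs, "
              f"up to {upper_bound_change_sets(n):.3e} change sets")
```

Even in this toy model the count rises from 7 possibilities for three atoms to more than a thousand for five, which is why practical searches must discard chemically unreasonable candidates early rather than enumerate everything.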

Endnotes

* Contributor Biographies: K. N. HOUK, a Fellow of the American Academy since 2002, is the Saul Winstein Chair in Organic Chemistry in the Department of Chemistry and Biochemistry at the University of California, Los Angeles. He taught earlier at Louisiana State University and the University of Pittsburgh, and was Director of the Chemistry Division of the National Science Foundation. Over his career, he has “evolved” from an experimental physical organic chemist to a computational chemist, parallel to the developments of the research field described in this article. PENG LIU is an Assistant Professor of Chemistry at the University of Pittsburgh. He obtained his Ph.D. in Chemistry at the University of California, Los Angeles, where he was a Postdoctoral Scholar in Professor K. N. Houk’s research group. His research interests include computational studies of organometallic and organic reactions.
1 Graham Farmelo, The Strangest Man: The Hidden Life of Paul Dirac, Mystic of the Atom (New York: Basic Books, 2009).
2 Paul A. M. Dirac, “Quantum Mechanics of Many-Electron Systems,” Proceedings of the Royal Society of London A 123 (1929): 714–733.
3 Christopher J. Cramer, Essentials of Computational Chemistry: Theories and Models, 2nd ed. (Malden, Mass.: John Wiley & Sons, 2004).
4 M. J. Frisch et al., Gaussian 09, Revision D.01 [electronic structure modeling program] (Wallingford, Conn.: Gaussian, Inc., 2009).
5 A. B. Richon, “An Early History of the Molecular Modeling Industry,” Drug Discovery Today 13 (2008): 659–664.
6 See R. Hoffmann and R. B. Woodward, “Stereochemistry of Electrocyclic Reactions,” Journal of the American Chemical Society 87 (1965): 395–397; R. Hoffmann and R. B. Woodward, “Selection Rules for Concerted Cycloaddition Reactions,” Journal of the American Chemical Society 87 (1965): 2046–2048; R. Hoffmann and R. B. Woodward, “Selection Rules for Sigmatropic Reactions,” Journal of the American Chemical Society 87 (1965): 2511–2513; and R. B. Woodward and R. Hoffmann, The Conservation of Orbital Symmetry (New York: Academic Press, 1970).
7 W. R. Dolbier, Jr., H. Koroniak, D. J. Burton, A. R. Bailey, G. S. Shaw, and S. W. Hansen, “Remarkable, Contrasteric, Electrocyclic Ring Opening of a Cyclobutene,” Journal of the American Chemical Society 106 (1984): 1871–1872.
8 Henry Eyring, “The Activated Complex in Chemical Reactions,” The Journal of Chemical Physics 3 (1935): 107–115.
9 W. Kirmse, N. G. Rondan, and K. N. Houk, “Stereoselective Substituent Effects on Conrotatory Electrocyclic Reactions of Cyclobutenes,” Journal of the American Chemical Society 106 (1984): 7989–7991. See also N. G. Rondan and K. N. Houk, “Theory of Stereoselection in Conrotatory Electrocyclic Reactions of Substituted Cyclobutenes,” Journal of the American Chemical Society 107 (1985): 2099–2111.
10 K. Rudolf, D. C. Spellmeyer, and K. N. Houk, “Prediction and Experimental Verification of the Stereoselective Electrocyclization of 3-Formylcyclobutene,” The Journal of Organic Chemistry 52 (1987): 3708–3710.
11 These Nobel Prizes in Chemistry were awarded for asymmetric hydrogenations and oxidations (William S. Knowles, Ryoji Noyori, and K. Barry Sharpless; 2001), olefin metathesis (Yves Chauvin, Robert H. Grubbs, and Richard R. Schrock; 2005), and palladium-catalyzed cross couplings (Richard F. Heck, Ei-ichi Negishi, and Akira Suzuki; 2010).
12 A. J. Jiang, Y. Zhao, R. R. Schrock, and A. H. Hoveyda, “Highly Z-Selective Metathesis Homocoupling of Terminal Olefins,” Journal of the American Chemical Society 131 (2009): 16630–16631; and S. J. Meek, R. V. O’Brien, J. Llaveria, R. R. Schrock, and A. H. Hoveyda, “Catalytic Z-Selective Olefin Cross-Metathesis for Natural Product Synthesis,” Nature 471 (2011): 461–466.
13 K. Endo and R. H. Grubbs, “Chelated Ruthenium Catalysts for Z-Selective Olefin Metathesis,” Journal of the American Chemical Society 133 (2011): 8525–8527; and B. K. Keitz, K. Endo, M. B. Herbert, and R. H. Grubbs, “Z-Selective Homodimerization of Terminal Olefins with a Ruthenium Metathesis Catalyst,” Journal of the American Chemical Society 133 (2011): 9686–9688.
14 Even though we evaluate only the reasonable possibilities, we use a lot of computer time. Last year we used about ten million hours of fast computer time, equivalent to one thousand years on one fast computer.
15 P. Liu, X. Xu, X. Dong, B. K. Keitz, M. B. Herbert, R. H. Grubbs, and K. N. Houk, “Z-Selectivity in Olefin Metathesis with Chelated Ru Catalysts: Computational Studies of Mechanism and Selectivity,” Journal of the American Chemical Society 134 (2012): 1464–1467.
16 L. E. Rosebrugh, M. B. Herbert, V. M. Marx, B. K. Keitz, and R. H. Grubbs, “Highly Active Ruthenium Metathesis Catalysts Exhibiting Unprecedented Activity and Z-Selectivity,” Journal of the American Chemical Society 135 (2013): 1276–1279.
17 J. K. Nørskov, T. Bligaard, J. Rossmeisl, and C. H. Christensen, “Towards the Computational Design of Solid Catalysts,” Nature Chemistry 1 (2009): 37–46.
18 For a review of this procedure, see G. Kiss, N. Çelebi-Ölçüm, R. Moretti, D. Baker, and K. N. Houk, “Computational Enzyme Design,” Angewandte Chemie International Edition 52 (2013): 5700–5725.
19 See http://www.pdb.org/; and H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne, “The Protein Data Bank,” Nucleic Acids Research 28 (2000): 235–242.
20 See D. Rothlisberger, O. Khersonsky, A. M. Wollacott, L. Jiang, J. DeChancie, J. Betker, J. L. Gallaher, E. A. Althoff, A. Zanghellini, O. Dym, S. Albeck, K. N. Houk, D. S. Tawfik, and D. Baker, “Kemp Elimination Catalysts by Computational Enzyme Design,” Nature 453 (2008): 190–195; L. Jiang, E. A. Althoff, F. R. Clemente, L. Doyle, D. Rothlisberger, A. Zanghellini, J. L. Gallaher, J. L. Betker, F. Tanaka, C. F. Barbas III, D. Hilvert, K. N. Houk, B. L. Stoddard, and D. Baker, “De Novo Computational Design of Retro-Aldol Enzymes,” Science 319 (2008): 1387–1391; and J. B. Siegel, A. Zanghellini, H. M. Lovick, G. Kiss, A. R. Lambert, J. L. St. Clair, J. L. Gallaher, D. Hilvert, M. H. Gelb, B. L. Stoddard, K. N. Houk, F. E. Michael, and D. Baker, “Computational Design of an Enzyme Catalyst for a Stereoselective Bimolecular Diels-Alder Reaction,” Science 329 (2010): 309–313.
21 William L. Jorgensen, “The Many Roles of Computation in Drug Discovery,” Science 303 (2004): 1813–1818.
22 National Science and Technology Council, “Materials Genome Initiative for Global Competitiveness” (Washington, D.C.: Executive Office of the President of the United States, 2011).
23 Andreas W. Götz, Mark J. Williamson, Dong Xu, Duncan Poole, Scott Le Grand, and Ross C. Walker, “Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1. Generalized Born,” The Journal of Chemical Theory and Computation 8 (2012): 1542–1555.
24 Satoshi Maeda and Keiji Morokuma, “Communications: A Systematic Method for Locating Transition Structures of A + B → X Type Reactions,” The Journal of Chemical Physics 132 (24) (2010): 241102.