Feature

CORONAVIRUS PIECE BY PIECE

Biologists are working at breakneck speed to solve the structures of key SARS-CoV-2 proteins and use them against the pandemic. By Megan Scudellari

Illustration: Cognition Studio Inc.

Lying in bed on the night of 10 January, scrolling through news on his smartphone, Andrew Mesecar got an alert. He sat up. It was here. The complete genome of a coronavirus causing a cluster of pneumonia-like cases in Wuhan, China, had just been posted online.

Around the world, similar notifications appeared on the devices of scientists who first crossed swords with coronaviruses in the 2003 outbreak of SARS (severe acute respiratory syndrome) and then again with MERS (Middle East respiratory syndrome) in 2012. Instantly, the researchers mobilized against a new adversary. “We always knew that this was going to come back,” says Mesecar, head of biochemistry at Purdue University in West Lafayette, Indiana. “It’s what history has shown us.”

In Lübeck, Germany, Rolf Hilgenfeld stopped packing boxes for his retirement and started preparing buffers for crystallography. In Minnesota, Fang Li stayed up all night analysing the new genome and drafting a manuscript. In Shanghai, China, Haitao Yang rallied a dozen graduate students to clear their schedules. In Texas, Jason McLellan instructed laboratory members to start assembling gene sequences from the viral genome.

Within 24 hours, a network of structural biologists around the world had redirected their labs towards a single goal: solving the protein structures of a deadly, rapidly spreading new contagion. To do so, they would need to sift through the 29,811 RNA bases in the virus’s genome, seeking out the instructions for each of its estimated 25–29 proteins. With those instructions in hand, the scientists could recreate the proteins in the lab, visualize them and then, hopefully, identify drug compounds to block them or develop vaccines to incite the immune system against them.

“Here we go,” thought Mesecar. “I’d better get some sleep.”

11 January: 41 confirmed cases of COVID-19 worldwide

Mesecar woke at 6 a.m. the next day, turned on the coffee pot and began blasting through the new genome looking for recognizable protein sequences. It didn’t take long. He had spent 17 years studying coronaviruses, and the new virus’s genome looked very familiar.

252 | Nature | Vol 581 | 21 May 2020. ©2020 Springer Nature Limited. All rights reserved.

“Holy shit,” he thought. “This is the same thing as SARS.”

Right away, Mesecar contacted Karla Satchell, a microbiologist at Northwestern University in Chicago, Illinois. Satchell is co-director of the Center for Structural Genomics of Infectious Diseases (CSGID), a consortium of eight institutions set up exactly for moments like this: to rapidly investigate the structures of emerging infectious agents.

To solve the 3D structure of a protein at high resolution, scientists first design a gene construct: a circle of DNA containing the instructions for the protein, together with regulatory sequences to control where and how it is expressed. They then insert the construct into living cells, often the bacterium Escherichia coli, using the cells’ own machinery to churn out the desired protein. Next, they purify the protein so that they can visualize its structure using either of two methods. One is X-ray crystallography, which involves growing tiny crystals of pure protein and revealing their internal structure by bombarding them with X-rays from a high-energy electron beam. The other is cryo-electron microscopy (cryo-EM), a process of scanning flash-frozen proteins using a high-powered electron microscope.

Either process can take months, even years, for an unfamiliar protein. Luckily, many of the new coronavirus proteins were familiar, with 70–80% sequence similarity to SARS-CoV, the virus that caused the 2003 SARS outbreak. By 7:30 a.m., Mesecar and his team had begun designing gene constructs for the new viral proteins, and even predicted which of their existing coronavirus inhibitors might block these proteins.

Satchell, who had been following early news reports about the virus, organized a virtual meeting of consortium members to start solving the virus’s proteins. “We’ve thrown the weight of every investigator at every site behind COVID,” says Satchell. Mesecar, a CSGID investigator, started with Mpro, the virus’s main protease, an enzyme that cuts out proteins from a long strand that the virus produces when it invades a cell, like a tailor cutting out pattern pieces. Without Mpro, there is no viral replication. Humans do not have a similar protease, so drugs targeting this protein are less likely to cause side effects.

13 January: 42 confirmed cases

In McLellan’s molecular biosciences lab at the University of Texas at Austin, graduate student Daniel Wrapp spent the weekend designing a gene construct for another key protein: the outer, three-pronged spike that gives the coronavirus its crown-like appearance and name. Wrapp placed an order for the constructs with a commercial firm that Monday, 13 January.

McLellan had been involved in determining the structures of two other coronavirus spikes: from HKU1, a cause of common colds¹, and from the MERS virus². The work was done in collaboration with structural biologist Andrew Ward at the Scripps Research Institute in La Jolla, California, and virologist Barney Graham at the US National Institute of Allergy and Infectious Diseases’ Vaccine Research Center in Bethesda, Maryland. So, the group knew how to tweak the spike protein’s genetic sequence so that it would stabilize in a pre-fusion shape, the form it adopts before it docks onto a host cell. “Our ability to get this particular structure was based upon all our prior knowledge from working on HKU1 and MERS and SARS,” says McLellan.

While McLellan’s team waited for the construct to arrive, Graham called Moderna Therapeutics, a drug-discovery company in Cambridge, Massachusetts, with which the Vaccine Research Center had been working on a pandemic-preparedness project. On 13 January, before any spike protein had been made, Moderna began preparing its manufacturing facilities to make a coronavirus vaccine based on that protein.

[Figure: THE KEY CORONAVIRUS PROTEINS. Researchers are racing to visualize and understand the proteins used by SARS-CoV-2 to enter cells and replicate. That information could be crucial for making drugs and vaccines to stop the virus. Panel: the spike protein (closed), with S1 and S2 subunits labelled. Carbohydrates cover the surface of the spike, disguising it from the immune system. The virus shell is covered in spikes, each made of three identical proteins. At the end of each spike is a small binding region that locks onto human cells. Each spike carries three identical binding domains, all of which must bind the host cell. The new virus binds more strongly than SARS does to cells, even though the spike shapes are similar. Source: A. C. Walls et al. Cell 181, 281 (2020). Graphics: Nik Spencer.]

26 January: 2,014 confirmed cases

At ShanghaiTech University in China, Zihe Rao, Haitao Yang and their colleagues worked day and night, sacrificing their week-long Chinese Lunar New Year holiday, to solve the Mpro structure and those of another trio of proteins that the coronavirus uses to replicate.

Using X-ray data acquired at the Shanghai Synchrotron Radiation Facility and the National Center for Protein Science Shanghai, which both allocated special beam time for the project, the team solved the crystal structure of Mpro bound to an inhibitor³. In 2003, it had taken them two months to solve the structure of the SARS-CoV main protease. This time, it took one week.

Mpro in coronaviruses is made up of two identical subunits and looks like a moth-eaten heart, with an active enzyme site on each side of the structure. On 26 January, Rao and Yang submitted the Mpro structural data to the Protein Data Bank (PDB), an open-access digital resource for 3D structures of biological molecules. By 5 February, the data had been processed and the final structure was released online, not a moment too soon, says Yang. The laboratory had already received an overwhelming 300 requests for the structure.

While working on Mpro, Rao contacted a former co-worker, David Stuart, a structural biologist at the University of Oxford, UK, who is life-sciences director at Diamond Light Source, the United Kingdom’s synchrotron facility. The UK and Shanghai groups began collaborating closely to share advice and avoid overlap, says Martin Walsh, deputy life-sciences director at Diamond. “We keep each other up to date on things, and try to benefit from the different approaches they’re using and we’re using.”

Because the Shanghai team solved Mpro in complex with an inhibitor, the Diamond team decided to focus on crystallizing the protein with no molecule attached, hoping to identify active sites to which potential drug compounds might bind. Over two weeks, Walsh’s group ran 17,000 experiments to hit on the best recipe for precipitating the unbound protein into a crystal.

1 February: 11,953 confirmed cases

In Hilgenfeld’s lab at the University of Lübeck, researcher Linlin Zhang had taken to phoning the company making the Mpro gene construct daily until it finally arrived. Thanks to the lab’s experience crystallizing other coronavirus proteases, Zhang grew Mpro crystals in 10 days, and on 1 February, she took the precious samples to the BESSY II synchrotron in Berlin, which opened up a beamline especially for the project. In addition to focusing on the unbound Mpro structure, Hilgenfeld docked a small-molecule inhibitor called 13a, which he had designed to inhibit the MERS virus, into the protein’s active site. It wasn’t a perfect fit, so the team altered a residue on the compound and named it 13b. This one “fit nicely”, says Hilgenfeld, and in ten more days his team had solved the structure of Mpro bound to the inhibitor⁴.

McLellan’s group in Texas was solving the spike protein structure at similar speed. As soon as the group had finished gathering high-resolution electron-microscopy data of the stabilized spike, thanks to a multimillion-dollar cryo-EM facility at the university, McLellan sent the data to Graham at the Vaccine Research Center.

Vaccines are often based on presenting parts of a virus to the human immune system to provoke a response, and the spike protein is an obvious candidate because it has a crucial role in infection.

The spike is formed of three identical molecules stuck together in the shape of a pyramid, with a hinge-like trapdoor. This opens to expose a portion that grabs onto a receptor on a human cell (see ‘The key coronavirus proteins’). Graham and McLellan’s past work on a similar protein⁵ suggested that presenting the spike protein in its pre-grab state would provoke the human immune system. From the complete structure, Graham could see that McLellan’s gene construct made a high-quality protein arranged in the right conformation. “It was really, really important to have that electron-microscopy information,” says Graham.

Graham tested the spike protein in mice, working to improve its expression levels and the strength of its effect on the immune system, and sent the sequence to Moderna, where the production line was ready and waiting. On 7 February, Moderna completed its first batch of the vaccine based on that protein.

Meanwhile, on 10 February, just 12 days after harvesting the protein, McLellan and his group submitted its cryo-EM structure⁶ to the PDB. By studying the spike in detail, they found that it binds to its human cell receptor, a protein called ACE2, at least ten times more tightly than SARS-CoV does.

At the University of Minnesota in Saint Paul, Li’s team was on its way to working out why. On 11 February, Li and his colleagues began collecting X-ray data from the spike protein using the Advanced Photon Source (APS), the synchrotron facility at the US Department of Energy’s Argonne National Laboratory near Chicago, Illinois. By 13 February, the researchers had defined the small, important spot where the spike protein locks on to the ACE2 receptor⁷ (see ‘The spike locks on’). They found that the new coronavirus spike protein has small molecular differences in its binding region compared with that of SARS-CoV, which might be why the new virus attaches to ACE2 more strongly. These changes could also explain why it seems to infect cells better and spreads faster than the SARS virus. That same week, the virus also got a name: SARS-CoV-2.

[Figure: THE SPIKE LOCKS ON. Binding regions on the tip of the spike open out to attach to the human ACE2 receptor, found on lung cells and elsewhere in the body. Labels: spike protein; spike opens up; receptor-binding domain; ACE2 receptor; human cell; antibody binding. Targeting the spike: researchers are finding antibodies (molecules made by the immune system to fight infection) that might interfere with the spike as a way to prevent or treat infection. Antibodies can stick to the top or side of the binding prong. Sources: open spike: ref. 6; ACE2 binding: ref. 7; antibody binding: M. Yuan et al. Science 368, 630–633 (2020).]

18 February: 73,332 confirmed cases

By mid-February, protein structures were pouring out. On 18 February, Hilgenfeld, Zhang and their colleagues submitted a paper⁴ on the Mpro structure alone and bound to 13b, and posted the preprint on the bioRxiv server on 20 February. “It was pretty fast,” Hilgenfeld admits. “The longest time period was just getting it published.” That same day, the Diamond team released the high-resolution crystal structure of unbound Mpro on its website (go.nature.com/2z41uwj).

To support US teams, the APS and other national synchrotrons coordinated their schedules to ensure there would be no interruption in beamtime if one facility had to close for maintenance or because of a local outbreak. “Our goal is just to keep the research going,” says Stephen Streiffer, director of the APS. “The rate at which people are working at this is an order of magnitude faster than they’ve been able to work on other problems.”

So far, the CSGID consortium has solved 12 unique SARS-CoV-2 protein structures, which are kept in a new online database with their accompanying genomic information (see https://coronavirus3d.org). “We’ve been part of projects like this on cancer, but it took five years to set that all up,” says Adam Godzik, a bioinformatician at the University of California, Riverside, and a CSGID investigator. “This happened spontaneously in the course of months.”

16 March: 167,515 confirmed cases

With 3D structures in hand, structural-biology teams moved straight to next steps. “Structures aren’t everything,” says Mesecar. “You want to get to compounds, to antivirals and vaccines.”

On 16 March, just 65 days after the viral genome was released, clinicians gave the first dose of Moderna’s vaccine candidate to a patient in a clinical trial funded by the US National Institutes of Health. “It was a lot faster than even the fastest one we’d previously done,” says Graham. Because of research on SARS and MERS, coronaviruses were probably the only viral family for which that was possible, he adds. “If it was a bunyavirus or an arenavirus, we would have been lost for two to three years.”

But even a vaccine developed at record-breaking speed is likely to be a slower solution than repurposing an approved drug, or at least finding one for which safety testing has begun. “That’s absolutely going to be the fastest way to help patients sick in the hospital today,” says Satchell.

That was exactly what Andrew Hopkins was planning. On 19 March, Hopkins, the chief executive of Exscientia, an artificial-intelligence drug-discovery company in Oxford, UK, took delivery of a large styrofoam cooler packed with dry ice. Inside was a library of 12,000 drug compounds known to be safe and ready for human use, sent from Scripps Research in California. The Exscientia team, working closely with Diamond, immediately began screening the collection against four of Diamond’s structures: Mpro, the spike protein, a second protease and the replication-machinery complex. Exscientia is currently preparing to test compounds that bind to the first two proteins for antiviral activity, says Hopkins.


[Figure: BREAKING THE CYCLE. Once the virus has fused with a host cell, the virus injects genetic material and uses the host machinery to make copies of itself. Many teams are studying viral proteins involved in replication. Steps shown: viral RNA enters the human cell; once inside, ribosomes from the host translate the viral RNA into long strands of protein (polyproteins); the virus makes and uses a protease enzyme to chop the long strands into functional proteins; non-structural viral proteins involved in replication and transcription form a complex within a host vesicle; RdRp produces viral RNA and RNA that is translated into new virus proteins by the host; new virus is packaged and released. Targeting Mpro (main protease): if the virus can’t build its main components, it can’t replicate. Researchers are trying to find drugs that block the cutting action of Mpro by fitting into its two active sites, little dimples on the surface. Targeting RdRp (RNA-dependent RNA polymerase): this enzyme makes copies of the full viral genome. Several novel compounds and approved drugs, such as remdesivir, bind to its active site. Sources: Mpro: C. D. Owen et al. https://doi.org/10.2210/pdb6YB7/pdb (2020); RdRp: ref. 8.]

Similarly, the ShanghaiTech team conducted virtual and high-throughput screening of a library of more than 10,000 approved drugs and compounds already in clinical trials, to see whether any would disable Mpro. They identified six promising candidates³. One of them, ebselen, is already in clinical trials for the treatment of bipolar disorder and hearing loss, and the group is preparing animal tests to study its activity in vivo, says Yang.

On 10 April, Rao, Yang and their collaborators published⁸ the structure of the virus’s replication complex: a large protein called RNA-dependent RNA polymerase (RdRp, or nsp12) that forms a complex with two others, nsp7 and nsp8. They also modelled how it binds to the antiviral drug remdesivir, originally developed to treat Ebola and now in phase III trials for coronavirus. Another recently completed structure of the protein in complex with the drug⁹ could provide a template to help model and modify other existing antivirals.

22 April: 2,471,136 confirmed cases

The hard-core biochemistry of designing brand-new, custom drugs to inhibit SARS-CoV-2 proteins will take months, even years, but could eventually lead to the best-performing drugs against the infection (see ‘Breaking the cycle’).

The ShanghaiTech team and collaborators have designed and synthesized a series of compounds targeting the active site of Mpro. On 22 April, after much chemical tweaking, they published details of one that inhibits viral replication in cells and was not toxic when tested in rats and dogs¹⁰. The team will continue developing that compound as a drug candidate, says Yang.

The Diamond team has identified 91 chemical fragments (bits of molecules that are less than one-third the size of a normal drug) that bind to Mpro. Those fragments inspired the launch of a non-profit crowdsourced initiative, the COVID Moonshot, to engage chemists around the world to use the fragments to design antiviral drug candidates. The initiative has received more than 4,600 design submissions, and several therapeutic possibilities are already emerging.

In Germany, researcher Katharina Rox at the Helmholtz Centre for Infection Research in Braunschweig tested Hilgenfeld’s 13b compound in mice, showing that it was safe and accumulated well in the lungs⁴, a key infection site. Meanwhile, a compound that Mesecar developed to inhibit SARS-CoV, compound 77, has been shown in unpublished work to have antiviral activity against SARS-CoV-2 in cells, and he hopes to complete animal studies by the end of the summer.

14 May: 4,248,389 confirmed cases

Structural biologists continue to plug away at the remaining unsolved proteins in the coronavirus genome. These include ORF8, a protein whose function remains mysterious. “We predict it should be crystallizable, but nobody has done it, so we’re trying,” says Godzik.

In the United Kingdom, the Diamond team is screening various compounds against a second coronavirus protease. In Texas, McLellan has shipped spike constructs to more than 100 labs worldwide. Many are looking for treatments, using the protein to fish antibodies out of the blood of people who have had COVID-19, and McLellan’s team is now characterizing the first of these potentially therapeutic antibodies.

Hilgenfeld, who was officially scheduled to retire on 1 April as a result of a mandatory retirement policy, has packed up his office but continues to work. “I’ve been working on coronaviruses for 20 years, and most of the time it was neglected and not taken seriously,” he says. “Now that it’s happened, how can I leave?” His team is investigating other SARS-CoV-2 structures, including nsp3, a large protein that the virus uses to shut down host-cell defences.

The race against the virus can’t afford to slow down anytime soon. As soon as countries start lifting restrictions on people’s movement, the virus will return and “flip around the world again”, says Satchell. “When that happens, it would be really great to have beautiful drugs that were designed specifically to target this coronavirus,” she says. “But we need to do it fast.”

Megan Scudellari is a science journalist in Boston, Massachusetts.

1. Kirchdoerfer, R. N. et al. Nature 531, 118–121 (2016).
2. Pallesen, J. et al. Proc. Natl Acad. Sci. USA 114, E7348–E7357 (2017).
3. Jin, Z. et al. Nature https://doi.org/10.1038/s41586-020-2223-y (2020).
4. Zhang, L. et al. Science 368, 409–412 (2020).
5. McLellan, J. S. et al. Science 342, 592–598 (2013).
6. Wrapp, D. et al. Science 367, 1260–1263 (2020).
7. Shang, J. et al. Nature 581, 221–224 (2020).
8. Gao, Y. et al. Science 368, 779–782 (2020).
9. Yin, W. et al. Science https://doi.org/10.1126/science.abc1560 (2020).
10. Dai, W. et al. Science https://doi.org/10.1126/science.abb4489 (2020).
