The world this week | News in focus

‘IT WILL CHANGE EVERYTHING’: AI MAKES GIGANTIC LEAP IN SOLVING PROTEIN STRUCTURES

DeepMind’s program for determining the 3D shapes of proteins stands to transform biology, say scientists.

By Ewen Callaway

Nature | Vol 588 | 10 December 2020 | ©2020 Springer Nature Limited. All rights reserved.

[Image: A protein’s function is determined by its 3D shape. Credit: EDWARD KINSMAN/SPL]

An artificial intelligence (AI) network developed by Google AI offshoot DeepMind has made a gargantuan leap in solving one of biology’s grandest challenges: determining a protein’s 3D shape from its amino-acid sequence.

DeepMind’s program, called AlphaFold, outperformed around 100 other teams in a biennial protein-structure prediction challenge called CASP, short for Critical Assessment of Structure Prediction. The results were announced on 30 November, at the start of the conference (held virtually this year) that takes stock of the exercise.

“This is a big deal,” says John Moult, a computational biologist at the University of Maryland in College Park, who co-founded CASP in 1994 to improve computational methods for accurately predicting protein structures. “In some sense the problem is solved.”

The ability to accurately predict proteins’ structures from their amino-acid sequences would be a huge boon to life sciences and medicine. It would vastly accelerate efforts to understand the building blocks of cells and aid more advanced drug discovery.

AlphaFold came top of the table at the last CASP, in 2018, the first year that London-based DeepMind participated. But, this year, the outfit’s deep-learning network was head-and-shoulders above other teams and, say scientists, performed so mind-bogglingly well that it could herald a revolution in biology.

“It’s a game changer,” says Andrei Lupas, an evolutionary biologist at the Max Planck Institute for Developmental Biology in Tübingen, Germany, who assessed the performance of different teams in CASP. AlphaFold has helped him to find the structure of a protein that has vexed his laboratory for a decade. “This will change medicine. It will change research. It will change bioengineering. It will change everything,” Lupas adds.

In some cases, AlphaFold’s structure predictions were indistinguishable from those determined using ‘gold standard’ experimental methods such as X-ray crystallography and, in recent years, cryo-electron microscopy (cryo-EM). AlphaFold might not yet obviate the need for these laborious and expensive methods, say scientists, but the AI will make it possible to study living things in new ways.

The structure problem

Proteins are the building blocks of life, responsible for most of what happens inside cells. How a protein works and what it does is determined by its 3D shape. Proteins tend to adopt their shape without help, guided only by the laws of physics.

For decades, laboratory experiments have been the main way to obtain good protein structures. The first complete structures of proteins were determined, starting in the 1950s, using a technique in which X-ray beams are fired at crystallized proteins and the diffracted light translated into a protein’s atomic coordinates. X-ray crystallography has produced the lion’s share of protein structures. But, over the past decade, cryo-EM has become the favoured tool of many structural-biology labs.

Scientists have long wondered how a protein’s constituent amino acids map out the twists and folds of its eventual shape. Early attempts to use computers to predict protein structures in the 1980s and 1990s performed poorly. Lofty claims for methods in published papers tended to disintegrate when other scientists applied them to other proteins.

Moult started CASP to bring rigour to these efforts. The event challenges teams to predict the structures of proteins that have been solved using experimental methods, but for which the structures are not public.

DeepMind’s 2018 performance at CASP13 startled many scientists in the field, which has long been the bastion of small academic groups. But its approach was broadly similar to those of other teams that were applying AI, says Jinbo Xu, a computational biologist at the University of Chicago, Illinois.

The first iteration of AlphaFold applied the AI method known as deep learning to structural and genetic data to predict the distance between pairs of amino acids in a protein. In a second step, which does not invoke AI, AlphaFold uses this information to come up with a ‘consensus’ model of what the protein should look like, says John Jumper at DeepMind, who is leading the project.

The team tried to build on that approach but eventually hit the wall. So it changed tack, says Jumper, and developed an AI network that incorporated additional information about the physical and geometric constraints that determine how a protein folds. The team also set it a more difficult task: instead of predicting relationships between amino acids, the network predicts the final structure of a target protein sequence. “It’s a more complex system by quite a bit,” Jumper says.

Startling accuracy

CASP takes place over several months. Target proteins or portions of proteins called domains (about 100 in total) are released on a regular basis, and teams have several weeks to submit their structure predictions. A team of independent scientists assesses the predictions using metrics that gauge how similar a predicted protein is to the experimentally determined structure. The assessors don’t know who is making a prediction.

AlphaFold’s predictions arrived under the name ‘group 427’, but the startling accuracy of many of its entries made them stand out, says Lupas. “I had guessed it was AlphaFold. Most people had,” he says.

Some predictions were better than others, but nearly two-thirds were comparable in quality to experimental structures. In some cases, says Moult, it was not clear whether the discrepancy between AlphaFold’s predictions and the experimental results was a prediction error or an artefact of the experiment. AlphaFold also struggled to model individual structures in protein complexes.

Faster structures

An AlphaFold prediction helped to determine the structure of a bacterial protein that Lupas’s lab has been trying to crack for years. Lupas’s team had previously collected raw X-ray diffraction data, but transforming these patterns into a structure requires some information about the shape of the protein. Tricks for getting this information, as well as other prediction tools, had failed. “The model from group 427 gave us our structure in half an hour,” Lupas says.

Demis Hassabis, DeepMind’s co-founder and chief executive, says that the company plans to make AlphaFold useful to other scientists. (It previously published enough details about the first version of AlphaFold for other researchers to replicate the approach.) It can take AlphaFold days to come up with a predicted structure, which includes estimates on the reliability of different regions of the protein. “We’re just starting to understand what biologists would want,” adds Hassabis.

In early 2020, the company released predictions for a handful of SARS-CoV-2 protein structures that hadn’t been determined experimentally. DeepMind’s predictions for a protein called Orf3a ended up being similar to one later determined through cryo-EM, says Stephen Brohawn, a molecular neurobiologist at the University of California, Berkeley, whose team released the structure in June. “What they have been able to do is very impressive,” he adds.

AlphaFold is unlikely to remove the need for labs, such as Brohawn’s, that use experimental methods to solve protein structures. But it could mean that lower-quality experimental data would be all that’s needed to get a good structure. Some applications, such as the evolutionary analysis of proteins, are set to flourish. “This is going to empower a new generation of molecular biologists to ask more advanced questions,” says Lupas. “It’s going to require more thinking and less pipetting.”

[Figure: STRUCTURE SOLVER. DeepMind’s AlphaFold 2 algorithm significantly outperformed other teams at the CASP14 protein-folding contest, and its previous version’s performance at the last CASP. Vertical axis: global distance test (GDT_TS; average), 0 to 100. Horizontal axis: contest year, 2006 to 2020. Entries labelled AlphaFold and AlphaFold 2. A score above 90 is considered roughly equivalent to the experimentally determined structure. Source: DeepMind]
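The article does not spell out how the GDT_TS score in the ‘Structure solver’ figure is calculated. As a rough illustration only, the Python sketch below follows the commonly cited definition: the average, over distance cutoffs of 1, 2, 4 and 8 ångströms, of the percentage of residues whose predicted position lies within that cutoff of the experimental one. It assumes that the two structures have already been superimposed and are given as equal-length lists of alpha-carbon coordinates; the full metric also searches over superpositions, which is omitted here. The function name `gdt_ts` and the four-residue coordinates are hypothetical, and this is not the CASP assessors’ software.

```python
import math

# Distance cutoffs in angstroms used in the commonly cited GDT_TS definition.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, reference):
    """Simplified GDT_TS for two already-superimposed alpha-carbon traces.

    Assumption: both inputs are equal-length lists of (x, y, z) tuples in
    the same coordinate frame. The real metric also optimises the
    superposition for each cutoff; that search is left out of this sketch.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty traces of equal length")
    # Per-residue deviation between predicted and experimental positions.
    deviations = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues falling within each cutoff.
    fractions = [
        sum(d <= cutoff for d in deviations) / len(deviations)
        for cutoff in CUTOFFS
    ]
    # Average the four fractions and express the result on a 0-100 scale.
    return 100.0 * sum(fractions) / len(CUTOFFS)


# Hypothetical four-residue example with deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts(predicted, reference)
print(score)  # 56.25
```

In this toy case one residue in four is within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, giving (25 + 50 + 75 + 75) / 4 = 56.25. A prediction that matches the reference exactly scores 100, and the figure treats anything above about 90 as roughly equivalent to an experimental structure.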
