
www.nature.com/nature
Vol 466 | Issue no. 7307 | 5 August 2010

Domestic science

Although China is a world leader in renewable-energy technology, it is missing the chance to deploy this equipment on a suitably grand scale at home.

The United States’ lengthy reign as the world’s number-one energy consumer came to an end last year, according to the Paris-based International Energy Agency on 20 July. But the agency’s revelation that China had finally taken the top slot swiftly drew denials from officials in Beijing.

China’s protests are perhaps understandable given the huge international sensitivities over which nations have been – and will be – responsible for most carbon emissions. But even if China is not yet number one, its population of 1.3 billion and its fast-growing economy mean that it will very soon be consuming far more energy than the United States. The only real questions are by how much will that usage grow, and how much environmental damage will it do in the process?

One fact China is not disputing is that it extended its lead in sustainable energy last year, adding 37 gigawatts of renewable capacity, nearly half of the 80 gigawatts added globally in 2009. That brought its total renewable capacity to 226 gigawatts, dwarfing the 144 gigawatts of its nearest rival, the United States (see go.nature.com/vgU3mn). China’s sustainable future has solid support from the government and the industrial and financial sectors. For example, investment in China’s clean-energy companies by the financial sector hit US$33.7 billion last year – a 53% increase over 2008 and more than the $32.3 billion invested in North and South America combined. And last month, China’s National Energy Administration announced a ten-year, 5-trillion-yuan (US$738-billion) plan that will help China realize its stated target of 500 gigawatts of renewable-energy capacity by 2020 – nearly one-third of the nation’s projected power capacity for that year.

Yet the reality of China’s sustainable energy falls considerably short of the promise. For example, the installation of wind turbines continues to outstrip China’s ability to hook them up to the power grid, and the sites chosen are not always where the best winds blow. The upshot is that the capacity factor, a measure of a turbine’s efficiency and ultimately its profitability, is estimated to be at least 10% lower in China than in the best countries.

There are problems with solar power too. China may be the world’s leading producer of photovoltaic cells, with more than 40% of the global market, but it is not even among the top five countries for installing those cells domestically. And when the Chinese government and utilities do deploy photovoltaics, they prefer big, centralized, easily managed installations, which limits a technology that is ideally suited to broad but small-scale use in places such as farms and villages. China has likewise done little to encourage the use of concentrated solar thermal energy, a low-tech but effective approach that uses lenses and mirrors to focus sunlight to run conventional steam turbines.

China’s success with wind and solar manufacturing has given it good credentials in sustainable energy. But its focus on green technologies that are also immediately lucrative for export needs to give way to a more comprehensive effort to ensure that its ambitious investments in domestic green energy are as effective as possible. ■

Slow progress

US cap-and-trade legislation has fallen victim to politics. But all is not lost.

As China surges ahead with renewable energy (see above), all forward motion seems to have stalled in the US Senate. Two weeks ago, with the November elections in mind and the Republican minority in no mood to compromise, the Senate’s Democratic leaders admitted that they would not have the votes this year to pass any kind of cap-and-trade system to curb carbon emissions. Instead, they opted for a scaled-back energy bill that addresses issues such as the Gulf of Mexico oil spill without doing anything to deal with global warming. As Nature went to press, it was unclear whether even that bill would pass. And with the midterm election almost certain to shift a substantial number of seats to the Republicans, who have so far been united in their opposition to what they call ‘cap-and-tax’, the prospects for more substantive climate legislation next year seem dim.

But behind the scenes, an informal group of energy-industry officials and environmentalists is quietly working on a proposal for compromise legislation that would impose a cap-and-trade regimen on just the electric utility companies – not least because many of the utilities are keen to end years of regulatory and economic doubt. It is unlikely that the group’s discussions will bear fruit this year, but lawmakers are paying attention. And it is clear that a solid majority of senators has become convinced that something needs to be done about carbon emissions.

Meanwhile, political pressure for action continues to come from states, communities, environmentalists and many businesses. And everyone on Capitol Hill knows that if Congress fails to move, President Barack Obama’s administration will regulate industrial greenhouse-gas emissions using the Environmental Protection Agency’s existing authority under the Clean Air Act – a process that is the first choice of no one, including Obama.

Although the political discussion has stalled at the top, there is reason to believe that momentum is gradually building from below – to the extent that at least some Republicans might be more willing to strike a deal next year. They should do so. But if the result is not the comprehensive attack on global warming that many had wished for, perhaps that is inevitable: with an issue as big and complex as climate change, there may be no way to reach consensus on a single piece of legislation that solves every problem for everybody. Instead, policy-makers both in the United States and at the international level will have to keep putting the solutions together one step at a time. ■

© 2010 Macmillan Publishers Limited. All rights reserved

RESEARCH HIGHLIGHTS

ECOLOGY

Life after logging
Proc. R. Soc. B doi:10.1098/rspb.2010.1062 (2010)

Take a biodiverse rainforest in Southeast Asia. Log it, let the area regrow, then repeat. What do you have? Not much of ecological value, many scientists would say. As a result, such ‘degraded’ lands have often been turned into oil palm plantations. But Edwards at the University of Leeds, UK, and his co-workers have now found that such twice-logged forests retain a surprising amount of biodiversity.

Using birds and dung beetles as proxies for biodiversity, the researchers surveyed 18 sites in Borneo – some never logged, some logged once, some twice. Using nets, traps and binoculars, the authors determined that more than 75% of species found in unlogged forests continued to live in doubly logged forests.

BIOTECHNOLOGY

Fuel from microbes
Science 329, 559–562 (2010)

Many plants, insects and microbes naturally produce small quantities of alkanes and alkenes – long-chain carbon and hydrogen molecules that are major components of fossil fuels. The biotechnology company LS9, based in South San Francisco, California, has pinpointed the biochemical pathway that bacteria use to do this.

Andreas Schirmer and his colleagues have discovered and patented two genes in cyanobacteria that encode enzymes that convert fatty-acid metabolites into fuel-grade alkanes and alkenes. They expressed these genes in the bacterium Escherichia coli, fed it glucose, and showed that it secreted diesel-like fuel that did not need any further chemical conversions. The company is currently scaling up this process.

NEUROSCIENCE

Tunnelling brain cells
Neuron 67, 213–223 (2010)

Developing neurons must migrate relatively long distances in the brain to reach their destinations. To do so, they move through tubes made up of support cells called astrocytes. Researchers now report that the neurons release a signalling molecule to control the formation and maintenance of these ‘tunnels’.

Kazunobu Sawamoto at Nagoya City University in Japan and his colleagues studied neuronal movement in the brain tissue of mice in which the gene for a protein called SLIT1 had been deleted. They noticed slowed neuronal migration. The team also found that the receptors for SLIT1 were expressed in astrocytes and were also required for proper movement. The interaction of SLIT1 with its receptors resulted in a change in the astrocytes’ shape and organization, leading to faster neuronal migration.

EVOLUTION

Ear roots
Proc. R. Soc. B doi:10.1098/rspb.2010.1148 (2010)

Some of the best hearing in the animal kingdom belongs to mammals, including humans and bats, thanks to the snail-shaped coiling of the cochlea, a key part of the inner ear. A 150-million-year-old fossil of a mammal, Dryolestes leiriensis, has revealed how this key innovation evolved.

The fossil has a bony inner ear structure (pictured) containing auditory nerves, similar to that of its contemporary relatives. But this structure is curved rather than coiled. Zhe-Xi Luo at the Carnegie Museum of Natural History in Pittsburgh, Pennsylvania, and his colleagues suggest that the cochlea was innervated before it evolved into today’s curved shape.

ASTRONOMY

Powerful space lens
Astron. Astrophys. doi:10.1051/0004-6361/201014376 (2010)

Brighter than a hundred billion stars combined, quasars – extremely energetic galactic nuclei – typically outshine and obscure everything in their vicinity. Now astronomers have spotted a quasar that acts as a gravitational lens, and have used this property to uncover information about the galaxy that it inhabits.

Malte Tewes at the Swiss Federal Institute of Technology (EPFL) in Lausanne and his colleagues sifted through more than 22,000 potential candidates to find the exotic object, which is about 490 megaparsecs away. With its strong gravitational pull, the quasar redirects and magnifies the light of a galaxy located almost exactly behind it, more than 2,300 megaparsecs away. By measuring the effect of the quasar on the distant galaxy’s light, the researchers estimated that the quasar’s host galaxy has about 22 billion times the mass of the Sun – a more precise number than could be obtained using previous methods. The technique could help to determine how galaxies form and evolve.


CHEMISTRY

Splitting with sunlight
Angew. Chem. Int. Edn doi:10.1002/anie.201003110 (2010)

The ability to split water into hydrogen and oxygen could be an important step in the development of renewable fuels. The use of haematite, a form of iron oxide, as an electrode to drive this reaction with the help of light is well established. This material is stable in water and can be made from low-cost abundant elements. However, it has not been as effective as titanium or tungsten oxide electrodes.

Now Kevin Sivula, Michael Grätzel and their colleagues at the Swiss Federal Institute of Technology (EPFL) in Lausanne have made two alterations to boost haematite’s water-splitting ability. They altered the compound’s nanostructure to improve its electronic properties and deposited nanoparticles of iridium oxide – a water oxidation catalyst – on its surface. The performance achieved is not at a commercially useful level, but is superior to that previously described for other oxide-based materials.

CANCER BIOLOGY

Blood vessel regulator
Nature Med. doi:10.1038/nm.2186 (2010)

Growing tumours rely on a good blood supply to feed them, so the identification of a small RNA molecule that switches on blood-vessel growth in tumours provides a potential target for anti-cancer drugs.

Small regulatory RNAs called microRNAs are known to regulate vascular development. To find microRNAs that initiate this process in tumours, David Cheresh at the University of California, San Diego, and his colleagues looked for microRNAs in human embryonic stem-cell models of blood vessel development. One, called miR-132, was highly expressed in human tumour vasculature, but not in normal tissue.

The microRNA boosted the growth of human blood vessel cells in culture, whereas reducing miR-132 expression in mice stunted blood vessel growth and shrank transplanted human breast tumours. The molecule turns on vascularization by suppressing RASA1, a protein that inhibits blood vessel development.

CLIMATE SCIENCE

Hotter heatwaves
Geophys. Res. Lett. doi:10.1029/2010GL043898 (2010)

Regional changes in extreme summer temperatures could exceed average global warming by several degrees, according to Robin Clark and his colleagues at the Met Office Hadley Centre in Exeter, UK.

They ran 224 simulations of climate responses to an atmospheric carbon dioxide level double that of today’s to determine the corresponding changes in regional heat extremes. They found that in many geographical areas in the Northern Hemisphere, even the lower estimates of changes in heat extremes exceeded the global average increase in temperature.

Furthermore, 44 simulations that produced an average of 2 °C of global warming predicted that single-day extreme temperatures could increase by 6 °C or more in large parts of Europe, North America and Asia. Regional changes in excessively hot days and heatwaves are related to variability in reductions in soil moisture, the authors suggest.

DEVELOPMENTAL BIOLOGY

New hearts need jolts
Proc. Natl Acad. Sci. USA doi:10.1073/pnas.0909432107 (2010)

A developing heart, at least in zebrafish, needs electrical conduction to grow into a functional organ.

Neil Chi at the University of California, San Diego, Didier Stainier at the University of California, San Francisco, and their colleagues focused on a zebrafish mutant (pictured bottom) in which the heart contracts asynchronously and eventually fails. Genetic analysis revealed that the mutated gene is cx46, which codes for a protein that connects adjacent cardiac cells, allowing electrical impulses to move from cell to cell and thus coordinate heart contraction.

Mutant hearts had abnormal conduction and slower transmission of electrical signals than hearts in normal zebrafish (top). The cardiac cells also had deformed shapes. Mice lacking the same gene had similar conduction defects, some of which have been linked to human heart malfunction and failure.

The authors suggest that electrical stimulation may improve the effectiveness of experimental tissue-repair techniques that transplant cells into damaged hearts.

JOURNAL CLUB

Dov Sax
Brown University, Providence, Rhode Island

A conservation biologist considers the role of nature reserves in a warming world.

Over the next 100 years, climate change is expected to extirpate many species from their current locations. As a scientist who studies these effects, I was surprised by the magnitude of a recent projection. Of the nearly 500 protected reserves in the San Francisco Bay area of California, more than 98% are expected to have entirely different summer temperatures going forwards, with no overlap between the warmest conditions found within these areas now and the coolest conditions in the future.

David Ackerly at the University of California, Berkeley, and his team studied the pace of climate change in the western United States (D. D. Ackerly et al. Divers. Distrib. 16, 476–487; 2010). By mapping current temperatures and those projected by a moderate warming scenario, they found that the geographical locations of specific temperatures will move by as much as 4.9 kilometres per year. This means that conditions currently experienced at a particular location could shift by hundreds of kilometres in just 50 years.

These findings have important implications for the design and management of protected areas. With climate change, most reserves will not maintain conditions that are suitable for the set of species that exists there at present. To survive, many species will need to move, either on their own or with human assistance. Accommodating this will require a major change in the perceived role of nature reserves. Traditionally, these have been managed as ‘museums’ that maintain historically accurate compositions of species and ecosystems. In the future, we may need some reserves to function as ‘way stations’, with transient compositions of species. This may be the only way to promote the long-term conservation of species that can no longer survive in their present locales.

View the archive at http://blogs.nature.com/nature/journalclub

NEWS BRIEFING

GREENLAND DRILLERS HIT BEDROCK
The North Greenland Eemian Ice Drilling (NEEM) project, which is analysing gas and particles trapped inside ice cores to describe Earth’s past climate, has reached the bedrock, at a depth of 2,537.36 metres. The drilling, carried out by a 14-nation consortium under Danish leadership, began in 2007; since then, more than 300 ice-core researchers have worked at the NEEM camp. On 27 July, lead scientist Dorthe Dahl-Jensen of the University of Copenhagen lifted the last ice core (pictured), which is more than 130,000 years old. Researchers can now make a detailed study of the climate of the Eemian interglacial period (130,000–115,000 years ago), when the average global temperature was roughly 5 °C warmer than it is now. See go.nature.com/tU35ut for more.

● BUSINESS

Genomics offering: Complete Genomics, one of a number of young companies offering fast, cheap genome sequencing, will seek up to US$86 million in an initial public offering, according to plans filed with the US Securities and Exchange Commission on 30 July. The company, based in Mountain View, California, said in the filing that, as of 20 July, it had sequenced more than 200 complete human genomes this year – more than 100 of those in the first three weeks of July – and had an order backlog of more than 500 genomes.

Clean-tech buyout: Electronics group Panasonic, based in Kadoma, Japan, will spend up to ¥818 billion (US$9.4 billion) to buy the remaining shares in two subsidiaries: Sanyo – an electronics maker based in Moriguchi, Japan, that it part-acquired last year – and Panasonic Electric Works, headquartered in Kadoma. Panasonic already owns just over 50% of shares in both of these companies. In a 29 July announcement, it said that the buyout would continue a corporate push towards ‘green innovation’. Sanyo is the world’s largest supplier of rechargeable batteries, and also makes solar cells. Panasonic Electric Works makes energy-efficient lighting.

Solar incentives cut: Spain is the latest European government to reduce state incentives for solar power, after its industry ministry on 1 August confirmed cuts to feed-in tariffs – the price an electricity utility must pay to generators of solar energy. A draft law, now under review with the national energy regulator CNE, would cut subsidies by 45% for new large, ground-based photovoltaic plants, and by 25% and 5% for large or small roof-top panels, respectively. Existing plants might also have their subsidies cut, once the law’s details are clarified later this year. Germany and Italy have also announced solar subsidy cuts this year.

Contract-research deal undone: Charles River Laboratories said on 29 July that it would terminate its US$1.6-billion acquisition of WuXi PharmaTech, a drug research company based in Shanghai, because shareholders said the investment was too expensive. The deal, first signed in April, would have created a global company providing outsourcing services to pharmaceutical, biotechnology and medical-device firms. Charles River, which is based in Wilmington, Massachusetts, and is one of the world’s largest providers of animals for laboratory testing, will pay $30 million to dissolve the agreement.

Rare-earth offering: Molycorp, a US company that owns one of the largest deposits of rare-earth minerals outside China, raised US$394 million at $14 a share in an initial public offering on 29 July, although its share price dropped by 8% on its first day’s trading. The company, based in Greenwood Village, Colorado, wants to reopen and expand its Mountain Pass rare-earth oxides mine in California. The United States is heavily dependent on Chinese imports for its rare-earth elements, which are used as catalysts and in high-tech magnets, hybrid car batteries, wind turbines and mobile phones.

SOUND BITES
“So you don’t consider [Francis] Collins to be a true scientist?”
“Let’s just say he’s a government administrator.”
Craig Venter opines on the current director of the US National Institutes of Health, Francis Collins, to Der Spiegel – ten years after both researchers announced that their groups had sequenced the draft human genome.

● RESEARCH

ISS glitch: The crew of the International Space Station is not in danger, NASA says, despite an electrical spike that shut down a pump module feeding ammonia coolant into the starboard cooling system on 31 July. The port-side cooling system immediately began providing coolant for critical systems such as support, and the crew quickly installed jumper cables from the Destiny Lab to power other redundancy systems.

NASA has scheduled spacewalks for 6 and 9 August to change the failed component for a spare. The walks may also shed light on the cause of the malfunction.

● POLICY

ITER baseline: Japanese physicist Osamu Motojima has been appointed as the new director-general of ITER, the multibillion-euro fusion experiment based in the south of France. Motojima replaces Kaname Ikeda, a former Japanese diplomat and nuclear engineer. The changeover was announced at a 28 July ITER council meeting, at which delegates approved a baseline schedule and financing for the project, which has suffered from repeated delays and overruns in costs – now estimated at €16 billion (US$20.9 billion). The machine is scheduled to be turned on in 2019; the first experiments to validate atomic fusion as a means of power generation are not expected before 2027.

Carbon offsets: A United Nations panel in charge of carbon offsets asked for further investigation into projects that reduce emissions of hydrofluorocarbon gases (HFCs) last week – but was criticized for not immediately suspending the work, amid accusations that the offsetting system is being abused. Companies in developing countries that destroy HFC-23, a by-product of the manufacture of refrigerant gas, can sell carbon credits under the Kyoto Protocol’s Clean Development Mechanism (CDM) for 65–75 times more than their costs, says CDM Watch, a Brussels-based watchdog. The group reports that many companies are actually increasing production of HFC-23 to sell more carbon credits. Critics argue that it would be cheaper to simply reimburse the costs of HFC destruction.

Endangered sites: Everglades National Park in Florida and a swathe of Madagascan rainforest have been added to a list of World Heritage sites in danger, but the Galapagos Islands were controversially removed last week. At a meeting in Brasilia, the World Heritage Committee of the United Nations Educational, Scientific and Cultural Organization (UNESCO) cited pollution and poor water flows in the Everglades and logging and lemur hunting in the rainforests of Atsinanana, Madagascar. But conservationists criticized the decision to remove the Galapagos from the list as “premature”, saying that the unique islands – with iconic native fauna such as the giant tortoise (pictured) – face threats from excessive tourism and overfishing.

US forensic doubts: The United States risks losing its ability to trace the source of a nuclear weapon on the basis of seized nuclear materials or debris after a detonation, says a National Academies report released on 29 July. The Department of Homeland Security is nominally in charge of nuclear forensics, but shares responsibilities with other agencies in an organizational environment that is too complex, according to the report. The forensics programme needs explicit requirements and goals, and the agencies should also undertake more realistic, unannounced training exercises, the report says.

Bat caves closed: Because of concerns that recreational caving (or spelunking) could be helping to spread a fungus that is obliterating North American bats, caves on government-owned land in the United States have been closed. So far, white-nose syndrome, originally detected in 2006 in the northeast United States, has killed at least one million bats in the course of its spread south- and westwards (see Nature 463, 144–145; 2010). On 27 July, the US Forest Service ordered a one-year closure of caves in Colorado, Wyoming, Nebraska, Kansas and South Dakota.

NUMBER CRUNCH
4.9 million
Barrels of oil (780 million litres) that leaked into the Gulf of Mexico from the broken well operated by BP, according to estimates (±10%) by scientists working for the US federal government. It is the world’s largest accidental oil spill.
Source: National Incident Command’s Flow Rate Technical Group

THE WEEK AHEAD

8–12 AUGUST
The impact of climate change on South America will be discussed at the American Geophysical Union’s ‘Meeting of the Americas’, held in Foz do Iguaçu, Brazil.
➧ www.agu.org/meetings/ja10

11–14 AUGUST
The Cognitive Science Society holds its annual meeting in Portland, Oregon. Discussions include how to accelerate students’ learning using knowledge of cognitive science.
➧ go.nature.com/

BUSINESS WATCH

Treatments based on human embryonic stem cells came one step closer to clinical trials on 30 July, when the US Food and Drug Administration (FDA) lifted its hold on a study of a stem-cell-derived therapy for severe spinal-cord injuries.

Although promising, it was not the first time that such news had come for the therapy’s developer, Geron, a biotechnology company headquartered in Menlo Park, California. The FDA first put the trial on hold in May 2008, after Geron filed a 21,000-page application. The agency then told Geron that the trial was free to proceed in January 2009, only to freeze it again in August after learning that animals treated with the therapy had developed cysts near the site of injury. Geron says that the cysts did not harm the animals, and were only observed in one of many animal studies.

The FDA’s announcement that the trial would continue drove Geron’s stock up 18%, after it fell 35% over the previous year. The proposed trial is a safety study in up to ten patients paralysed by spinal-cord injuries. Glial cells – cells that can generate neurons – derived from human embryonic stem cells will be injected into the patients’ spinal cords within two weeks of the injury. The company is also testing the therapy in animal models of Alzheimer’s disease and multiple sclerosis.

[Chart: THE STEM-CELL SHUFFLE. Geron has again received clearance for its clinical trial of human embryonic stem cells. Share price (US$), January 2009 to July 2010. Annotations: January 23, 2009: Geron receives FDA clearance; August 18, 2009: Trial put on hold; July 30, 2010: Hold released.]

Source: NASDAQ

NEWS

Demand for malaria drug soars

Farmers and scientists struggle to keep up with needs of ambitious medicine-subsidy programme.

From bust to boom to bust again: artemisinin, the key ingredient of front-line antimalarial drugs, is entering the third chapter of its turbulent history. A decade ago, the compound – available only from the sweet wormwood plant Artemisia annua – was scarce and expensive. But by 2007, the market was wallowing in a surfeit of the drug as farmers flocked to grow the crop. Now, as a US$343-million initiative starts to battle malaria through hugely subsidized medicines, suppliers are again worried that there will not be enough artemisinin to go around, while farmers, plant breeders and synthetic biologists are hoping that they can snap the drug out of its roller-coaster supply cycle.

[Photo: Hybrid plants could boost artemisinin supplies from farms in Tanzania (above) and elsewhere. Credit: W. Daniels/Panos]

This year’s problems began with what should be a malaria success story. The Global Fund to Fight AIDS, Tuberculosis and Malaria last month saw its first orders for cheap drugs under its Affordable Medicines Facility – Malaria (AMFm) initiative. Using subsidies, it plans to cut the price of artemisinin-based combination therapies (ACTs), which partner artemisinin with another drug to reduce the chance of malaria parasites developing resistance, as they have done to treatments such as chloroquine.

Costly convenience

Governmental public-health clinics already offer ACTs at just $1 per dose, but roughly 60% of patients with malaria opt for convenience and buy the drugs from local market stalls and private pharmacies – even though they cost many times more. The AMFm initiative, running as a two-year trial in seven African countries and in Cambodia, hopes to ensure that even the private sector will sell ACTs at $0.20–0.50 per dose. That should improve access to the drugs, and may stop patients buying cheap but ineffective chloroquine or the single artemisinin therapies that are promoting resistance. It will also drive up demand for artemisinin.

Artemisinin suppliers have seen this all before. In 2005, the World Health Organization declared that much more of the drug was needed to increase the production of ACTs. At the time, researchers led by Jay Keasling at the University of California, Berkeley, thought that synthetic biology could solve the supply problem. They hoped to modify the genomes of bacteria and yeast to produce a precursor of artemisinin. Fermenting the organisms in huge vats could yield a plentiful and inexpensive drug supply.

Keasling’s semi-synthetic artemisinin project received $42.6 million over five years from the Bill & Melinda Gates Foundation, and became a focus for biotech firm Amyris of Emeryville, California, which spun out from Keasling’s lab. It successfully added or tweaked a dozen genes in yeast to make artemisinic acid (D.-K. Ro et al. Nature 440, 940–943; 2006), and gave a royalty-free licence to drug firm Sanofi-aventis, headquartered in Paris, to make semi-synthetic artemisinin on a commercial scale. Four years on, the product is still two years away, says the drug company, which is scaling up production to 100,000-litre vats, financed by another $10.7-million grant from the Gates foundation, and with assistance from the Institute for OneWorld Health, a non-profit organization in San Francisco, California.

While this technology was being developed, farmers in China and Vietnam planted tens of thousands of hectares of Artemisia, and by 2007 the market was swamped. The price of artemisinin plummeted from more than $1,100 per kilogram to around $200 per kilogram (see ‘Boom and bust for Artemisia farmers’), putting some 80 processing companies – and untold numbers of farmers – out of business.

[Chart: BOOM AND BUST FOR ARTEMISIA FARMERS. Today’s demand for the drug artemisinin is being met by the surplus from a past production boom, but a larger and more reliable supply will be required to meet future needs. Upper panel: estimated price (US$ per kg). Lower panel: tonnes of artemisinin, production and demand*, 2003–2015. *One tonne gives about two million doses of artemisinin-based combination therapy. Source: Boston Consulting Group]

Even though artemisinin was being sold cheaply (compared with the price that Sanofi-aventis will set in 2012), millions of people in sub-Saharan Africa were still not getting access to the ACTs. “We have learned there is a lot more to it than cost,” says Jack Newman, co-founder and senior vice-president for research at Amyris. Improving access to the medicines is as important as driving down their price – hence the idea for the AMFm, and its focus on the local businesses that sell treatments.

If farmers grew enough Artemisia before, why not again? With food prices rising, the incentive to plant the crop this time is low, notes Malcolm Cutler, an artemisinin-industry expert and director of the consultancy FSC Development Services, near Gloucester, UK. His priority is to improve communication between growers, processors and drug companies, and to help farmers who must decide to plant Artemisia 14 months before that crop’s drug will be produced.

This year, the Assured Artemisinin Supply System (A2S2) initiative, supported by the international drug-purchasing facility UNITAID, began to give advance loans to the companies that extract artemisinin from plants, and to encourage drug firms to sign long-term contracts with them. About 10,000 hectares of Artemisia was planted this year, twice as much as in 2009. But recent floods in China and Vietnam, and a drought in East Africa, mean that yields of artemisinin for use in 2011 may be only two-thirds of what has been planted, says Cutler.

Feeling up molecules
The structure of an organic molecule revealed by atomic force microscopy.
go.nature.com/MS8XAP

Sponge genome goes deep

With a simple body plan lacking organs, muscles and nerve cells, the sea sponge hardly seems a rich avenue for study. Yet this humble organism squats firmly at the doorway to one of life’s great mysteries: the leap to multicellularity.

Telltale molecular fragments teased out of ancient sediment¹ show that sponges existed some 635 million years ago – the oldest evidence for metazoans (multicellular animals) on Earth. Now, a draft genome sequence of the Great Barrier Reef demosponge (Amphimedon queenslandica), …

… window, some 150 to 200 million years in duration, when the basics of multicellular life emerged. Nearly one-third of the genetic alterations that distinguish humans from their last common ancestor with single-celled organisms took place during this period. These changes would have occurred within our sponge-like forebears.

The researchers also identified parts of the genome devoted to suppressing individual cells that multiply at the expense of the collective. The presence of such genes indicates that the battle to stop rogue …

Breeding boost

Yields could be vastly improved by planting new Artemisia strains.
On average, one kilo- published in this issue2 (see page 720), cells — in other words, cancer — is as old gram of its dried leaves yields some 8 grams offers a comprehensive look at the genetic as multicellularity itself. Such a link was of artemisinin. But at the National Institute mechanisms that first allowed individual recently hinted at by work showing that of Agricultural Botany in Cambridge, UK, cells to work together as parts of a larger certain ‘founder genes’ that are associated researchers have used selective breeding whole. As an added benefit, this genome to create hybrid plants that produce up may shed light on how primitive animal

to 24 grams, says Colin Hill, chair of an cells first learned to cope with the enduring authier g

Artemisia breeding consortium supported hazard of collective existence: cancer. M. by the UK Department for the Environ- “As the earliest branching lineage from ment, Food and Rural Affairs. These our last common ancestor, sponges can tell plants are now being grown and harvested us a lot about what is needed to make an commercially in Madagascar, and trialled in animal,” says geneticist Mansi Srivastava, South Africa, Uganda, Zimbabwe and the the paper’s lead author, now a postdoc at the United States, as well as in Britain. Massachusetts Institute of Technology in In an alternative approach, Ian Graham Cambridge. and colleagues at the University of York, UK, With more than 18,000 individual genes, The genome of the demosponge A. queenslandica identified key Artemisia genes that could the sponge genome represents a diverse offers a glimpse at the dawn of multicellular life. optimize agricultural yields, robustness or toolkit, coding for many processes that other desirable traits when the plant is grown lay the foundations for more complex with human cancers first arose at about in different areas of the globe (I. A. Graham creatures. These include mechanisms for the same time as metazoans appeared3. et al. Science 327, 328–331; 2010). Graham telling cells how to adhere to one another, The demosponge genome shows that genes says that the work has helped to create plants grow in an organized fashion and recognize for cell suicide — those activated within that produce up to 50% more artemisinin per interlopers. The genome also includes an individual cell when something goes kilogram of leaves than the best commer- analogues of genes that, in organisms with wrong — evolved before pathways that are cial variety. They expect to release seed to a neuromuscular system, code for muscle activated by adjacent cells to dispatch a commercial growers in mid-2012. tissue and neurons. cancerous neighbour. 
Despite the advances in plant bio- According to Douglas Erwin, a “Cell suicide predated cell homicide,” says technology, Keasling says that semi-synthetic palaeobiologist at the Smithsonian Carlo Maley, an oncologist at the Wistar artemisinin is still sorely needed. Although Institution in Washington DC, such Institute in Philadelphia, Pennsylvania. it began as a way to make the drug more complexity indicates that sponges must have This suggests that the single-celled colonial cheaply, the mass-produced semi-synthetic descended from a more advanced ancestor organisms that gave rise to our ancestors will be no cheaper than the plant-derived than previously suspected. “This flies in had already evolved mechanisms to kill version — partly because Sanofi-aventis does the face of what we think of early metazoan themselves, which multicellular creatures not want to undercut farmers. Instead, it will evolution,” says Erwin. later exploited as a cancer defence. be used to smooth out the cycle of boom and Charles Marshall, director of the “Cancer was not the original motivation bust in crop-based artemisinin supply. “A University of California Museum of for this work,” says Srivastava. “But now stable and adequate source of artemisinin Paleontology in Berkeley, agrees. “It means we can learn about the ways in which would be fundamentally important,” says there was an elaborate machinery in place multicellular animals have to regulate Silvia Schwarte of the World Health Organi- that already had some function,” he says. themselves and the original function of zation’s malaria programme. “What I want to know now is what were these genes.” ■ As Newman says: “If you suddenly need all these genes doing prior to the advent of adam Mann twice as much artemisinin, you just fire up sponge.” 1. love, g. D. et al. Nature 457, 718–721 (2009). ■ another fermenter.” The analyses of Srivastava and her 2. srivastava, M. et al. Nature 466, 720–726 (2010). 
Richard Van Noorden colleagues suggest that there was a crucial 3. Domazet-lošo, t. & tautz, D. BMC Biol. 8, 66 (2010).


UK embryo agency faces the axe

Coalition government promises to abolish respected regulator in effort to cut back on quangos.

In the ethically fraught field of human-embryo research, Britain’s Human Fertilisation and Embryology Authority (HFEA) has long been regarded as a world leader in regulating and advising scientists.

But now the HFEA faces the axe, and researchers and politicians are chorusing their discontent. “I’m absolutely astonished at this,” says Ruth Deech, an independent member of the House of Lords and former chair of the HFEA. “I think our standing in the world will be reduced.”

Since it was created by the Human Fertilisation and Embryology Act in 1990, the HFEA has regulated fertility treatment and research involving human embryos in the United Kingdom (see ‘The development of an embryo agency’). Its work involves inspecting and licensing centres, as well as providing ethical and legal advice to scientists and the public. Scientists have generally applauded the HFEA for providing a clear set of boundaries for what research is permissible. “It’s looked upon as an organization that is often the first to make decisions that define scientific and clinical barriers,” says Justin St. John, director of the Centre for Reproduction and Development at the Monash Institute of Medical Research in Melbourne, Australia. Countries such as Australia and Canada have established similar agencies using the HFEA as a model. “It is the envy of American researchers and biotech companies,” adds Paul Wolpe, director of the Emory Centre for Ethics in Atlanta, Georgia.

The HFEA is threatened because Britain’s new coalition government has pledged itself to a “radical simplification” of the regulatory landscape for public health and medical research. Health secretary Andrew Lansley announced last week that the number of health agencies will be reduced from 18 to “between eight and ten”, to reduce overlap between the bodies and save £180 million (US$285 million). The move is part of a bigger push to make public spending cuts by closing ‘quangos’ — quasi-autonomous non-governmental organizations — many of which perform regulatory functions on behalf of the government.

The government says that the HFEA’s regulation of fertility treatments will move to the Care Quality Commission, one of the health quangos to survive the cull. But its research licensing work will probably move to a new super-regulator that would also absorb the functions of the Human Tissue Authority (HTA), which oversees organ donation and the use of human tissues in research and teaching. The government says the HFEA and the HTA will be abolished by April 2013.

“I think amalgamation would be a loss,” says St. John, who held an HFEA licence for research using ‘cybrid’ embryos — created by putting human DNA into an empty animal egg — when he worked at the University of Warwick, UK. “You might lose expertise and considerable knowledge.”

If the clinical and research aspects of embryology were divided, “it would risk spreading the available expert advice thinly across two bodies”, says Martin Bobrow, emeritus professor of medical genetics at the University of Cambridge, UK.

After Britain’s previous government mooted similar reforms in 2004, research charities and academics queued up to decry plans to merge the HFEA, the HTA and parts of the Medicines and Healthcare Products Regulatory Agency into a new Regulatory Authority for Tissue and Embryos. Some said the bodies’ functions were too different for a merger to work; others feared the loss of specialist expertise that might result. The plans were abandoned after a cross-party parliamentary inquiry held in 2007 concluded that the case against a merger was “overwhelming and convincing”. The inquiry heard evidence that the HFEA’s remit was fundamentally different from, and more ethically complex than, the HTA’s. Many of those consulted warned that losing the HFEA as a discrete body could undermine public confidence in the regulations it enforced; and that even as part of a larger organization, it would still need the same resources to operate effectively, limiting any cost savings.

Deech, who was on that inquiry committee, says that the merger plan “was comprehensively demolished three years ago for very good reasons, which are just as good today”.

Health hazard

A more immediate casualty of the ‘bonfire of the quangos’ is likely to be the Health Protection Agency (HPA), which provides advice and guidance on infectious diseases and environmental hazards. The government says that the HPA’s work will migrate into the Department of Health by April 2012. This has raised alarm bells with some scientists. “Will the new service be able to give advice that is in the best interests of public health, whether or not it conflicts with policy and interests of whatever government is in power?” asks Paul Hunter, a professor of health protection at the University of East Anglia in Norwich, UK. “The HPA currently does a lot of good research that ultimately benefits the public health,” he adds. “Will this still continue in the new service, or if not, how will the gap be filled?”

Further details on the reforms are expected after a wide-ranging review of medical-research regulation by the Academy of Medical Sciences, commissioned by the previous government, is completed this autumn. ■
Daniel Cressey

THE DEVELOPMENT OF AN EMBRYO AGENCY
[Timeline figure, axis 1990–2008. Entries:]
The Human Fertilisation and Embryology Act 1990 approves the creation of the Human Fertilisation and Embryology Authority (HFEA).
UK parliament allows cell nuclear replacement to study serious disease.
The HFEA gives permission for parents to create ‘saviour siblings’ using in vitro fertilization.
First human embryonic stem-cell lines derived in United Kingdom.
Licences granted for therapeutic cloning with human embryos.
Research involving embryo with genetic information from two mothers approved.
Licences for ‘cybrid’ research granted.
Image credit: R. Rawlins/Custom Medical Stock Photo/SPL


Sewer studies based on leaky science
Sampling techniques could skew results. go.nature.com/gU9UV7

‘Tough’ chief to defend MRC

Britain’s biomedical establishment has given an enthusiastic welcome to the incoming chief executive of the Medical Research Council (MRC). John Savill was named last week as the man who will steer the agency through a round of public-spending cuts expected this autumn. In 2008–09, the government-funded agency spent £704.2 million (US$1.1 billion) on research, making it one of Europe’s largest national supporters of biomedical research. But all of Britain’s research councils recently drew up strategies to deal with cuts of up to 20% over four years (see Nature 466, 420–421; 2010).

“I find it hard to think of anybody else who is better able to defend the MRC,” says Keith Peters, former president of learned society the Academy of Medical Sciences.

Savill, who is currently head of the College of Medicine and Veterinary Medicine at the University of Edinburgh, UK, is also chief scientific adviser for health to the Scottish government. He will replace Leszek Borysiewicz, who is leaving the MRC to become vice-chancellor at the University of Cambridge, UK. Borysiewicz’s departure, one year before his four-year term was due to expire, had prompted concerns that a power vacuum at the top of the agency might allow the government’s Department of Health to steer it away from basic science and into more applied biomedical work (see Nature 462, 553; 2009). The appointment may assuage those concerns — colleagues say that Savill commands the respect of basic researchers and of clinicians, and knows the importance of both. “That is essential to secure the future of the MRC,” says Colin Blakemore, a neuroscientist at the University of Oxford, UK, and Borysiewicz’s predecessor at the MRC.

Savill trained as a medical doctor and was formerly head of Edinburgh’s MRC Centre for Inflammation Research. “John is very experienced at working at the clinical–basic science interface,” says Kay Davies, honorary director of the MRC Functional Genomics Unit and head of the Department of Physiology, Anatomy and Genetics at the University of Oxford. “He is tough and dedicated to the scientific enterprise.”

Savill, who was not available for interview, will start at the MRC on 1 October. ■
Daniel Cressey

Salamander’s egg surprise

Image caption: Algae cohabit with salamander embryos in their eggs — and inside their cells.

PUNTA DEL ESTE, URUGUAY
Scientists have stumbled across the first example of a photosynthetic organism living inside a vertebrate’s cells. The discovery is a surprise because the adaptive immune systems of vertebrates generally destroy foreign biological material. In this case, however, a symbiotic alga seems to be surviving unchallenged — and might be giving its host a solar-powered metabolic boost.

The embryos of the spotted salamander (Ambystoma maculatum) have long been known to enjoy a mutualistic relationship with the single-celled alga Oophila amblystomatis. The salamanders’ viridescent eggs are coloured by algae living in the jelly-like material that surrounds the embryo. The embryos produce nitrogen-rich waste that is useful to the algae, which, in turn, supply the developing embryos with extra oxygen. The algae clearly benefit their salamander hosts: Lynda Goff, a molecular marine biologist at the University of California, Santa Cruz, showed 30 years ago that salamander embryos lacking algae in their surrounding jelly are slower to hatch.

Ryan Kerney of Dalhousie University in Halifax, Nova Scotia, Canada, has now found that these algae also live inside the embryo’s cells. Such a close coexistence with a photosynthetic organism has previously been found only in invertebrates, such as corals. Kerney took long-exposure fluorescent images of pre-hatchling salamander embryos, and saw scattered dots in the unstained tissue — an indicator that it might contain chlorophyll. Transmission electron microscopy (TEM) images showed mitochondria in the salamander cells clustering close to the algae. Reporting the discovery on 28 July at the Ninth International Congress of Vertebrate Morphology in Punta del Este, Kerney suggested that the mitochondria might be taking advantage of both oxygen and carbohydrate generated by the alga’s photosynthesis.

So when do the algae enter the embryos’ cells? A time-lapse video made by Roger Hangarter at Indiana University in Bloomington, and presented by Kerney at the meeting, reveals a fluorescent green flash — an algal bloom — next to each embryo just as its nervous system begins to form. Most research on spotted salamander embryos has focused on earlier periods of development, which might explain why algae have not been seen inside the cells before.

One of Kerney’s most curious discoveries suggests that the algae may be a maternal gift. He has found the same algae in the oviducts of adult female spotted salamanders, where the embryo-encompassing jelly sacs first form.

David Wake, an emeritus professor at the University of California, Berkeley, who watched Kerney’s presentation, wonders whether algae could be getting into the reproductive cells. This would “really challenge the dogma” that vertebrates’ immune systems ban such close relationships, he says. Both Wake and David Buckley, who studies salamander development at the National Museum of Natural Sciences in Madrid, agree that the work might tell us more about how vertebrate cells learn to identify intruders.

“It makes me wonder if other species of salamander that have known symbiotic relationships with algae also harbour algae inside their cells,” adds Daniel Buchholz, a developmental biologist at the University of Cincinnati in Ohio. “I think that if people start looking we may see many more examples.” ■
Anna Petherick

For a longer version of this story, see http://go.nature.com/l2drrP


Cellular suicide spurs cancer
Unexpected role found for protein kill switch. go.nature.com/t9kcP5

Drug safety crackdown revs up

FDA’s strengthened powers to assess drugs already on the market will soon be put to the test.

It’s not easy to quit smoking, but when some people became violent while taking a drug to help them beat their addiction, the US Food and Drug Administration (FDA) took notice. Since its approval in 2006, a popular prescription drug called Chantix (varenicline) has helped smokers curb their cigarette cravings. But the drug has also produced more reports of psychiatric side effects than any other drug on the market, according to the Institute for Safe Medication Practices in Horsham, Pennsylvania.

A paper published on 20 July in The Annals of Pharmacotherapy highlighted 26 instances in which Chantix users became violent in thought or action¹. In one case, a woman hit her 17-year-old daughter while the girl was driving a car. In another, a man on the drug had his front teeth knocked out after punching a stranger at a bowling alley.

Although every drug bears some risk, those risks can lurk undetected until the drug hits pharmacy shelves — an uncomfortable truth highlighted by the once-popular diabetes drug Avandia (rosiglitazone), which has been linked to heart attacks. In a close decision, an FDA-appointed committee last month voted to allow Avandia to remain on the market with tighter guidelines for how it is prescribed. “Avandia is not an exception or an aberration,” says Brian Strom, an epidemiologist at the University of Pennsylvania in Philadelphia. “We’ll certainly see more cases like it.”

Chantix is one of a growing number of drugs that could soon face similar attention. In 2008, the FDA required Pfizer, the New York-based manufacturer of Chantix, to perform additional safety studies on the drug. In doing so, the agency was flexing new muscle. The Food and Drug Administration Amendments Act, passed in 2007, gives the agency authority to demand clinical trials even after a drug is on the market. Since then, the FDA has ordered further studies on 132 approved drug applications, some of them well-known pharmaceuticals (see table). The agency also has the power to issue fines if those trials are not completed within an agreed time. All of this suggests that further approved drugs will be thrust into the spotlight over adverse reactions.

But some remain sceptical about whether the act will have its intended effect. “The FDA has a little bit more power, but it has to develop the courage to use that power,” says Curt Furberg, a physician with the Division of Public Health Sciences at the Wake Forest University School of Medicine in Winston-Salem, North Carolina, and a co-author on the latest Chantix paper¹. Furberg and others have expressed concern that the FDA’s post-marketing watchdogs work under regulators who may be reluctant to question a drug’s safety, having approved the drug in the first place.

Hidden risks

By the time a drug such as Chantix hits the market, it has typically been through clinical trials that can involve several hundred to a few thousand subjects. That’s enough to uncover common side effects, or even infrequent events, such as liver failure, that are uncommon in the population at large. But small increases in relatively common ailments can easily slip through. For example, some studies found that Avandia increased the risk of heart attack by 40% — considered a relatively subtle effect. But given that diabetes patients are already at high risk of cardiovascular disease, that subtle effect could translate into thousands of heart attacks.

To pick up on those effects, the FDA is also constructing a network called the Sentinel Initiative. Still in early development, the network will eventually link together the nation’s largest insurance databases. With those data to hand, FDA watchdogs will increasingly be able to perform studies such as the recent analysis of more than 200,000 Avandia users, performed by FDA epidemiologist David Graham with data collected from patients on Medicare, a government-sponsored health-insurance programme². Graham’s analysis figured prominently in the decision to further restrict Avandia. Large data sets will give regulators more power to identify smaller increases in risk than they have in the past, says Strom. This could, in turn, mark out further drugs for investigation.

But inclusion on the FDA’s watch list should not automatically condemn a drug, cautions Steven Nissen, a cardiologist at the Cleveland Clinic in Ohio who published the first meta-analysis of Avandia clinical trials³. “Not every drug that raises concern turns out to be a bad drug,” says Nissen.

In fact, most drugs are not pulled from the market because they are too risky to be of any use, but rather because they are misused. The painkiller Vioxx (rofecoxib), which was withdrawn from the market in 2004 for causing heart attacks and strokes, was a valuable drug for the few patients who failed to respond to other anti-inflammatory drugs. But thanks to aggressive marketing, the drug was over-prescribed, says Strom, and reached patients with a relatively high risk of heart attack.

Such advertising-fuelled haste to embrace new drugs will ensure a steady supply of post-marketing scandals, he says. “It isn’t necessarily that the drugs don’t have a benefit or have too many risks,” he adds. “The issue is that we haven’t precisely quantified those risks, and then we start using the drugs too widely.” ■
Heidi Ledford

FIVE EXAMPLES OF PRESCRIBED DRUGS NOW UNDER FDA SCRUTINY
Drug | Indication | Safety concern | 2009 sales | Manufacturer | Year approved
Herceptin (trastuzumab) | Breast cancer | Cardiotoxicity | US$5 billion | Roche | 1998
Cymbalta (duloxetine) | Depression and anxiety | Birth defects | $3 billion | Eli Lilly | 2007
Provigil (modafinil) | Excessive sleepiness | Severe skin reaction | $1 billion | Cephalon | 1998
Victoza (liraglutide) | Type 2 diabetes | Thyroid cancer | $1 billion (2015 projected) | Novo Nordisk | 2010
Rotarix (rotavirus vaccine) | Vaccination | Include infections and convulsions | $450 million | GlaxoSmithKline | 2008

1. Moore, T. J. et al. Ann. Pharmacother. doi:10.1345/aph.1P172 (2010).
2. Graham, D. J. et al. J. Am. Med. Assoc. 304, 411–418 (2010).
3. Nissen, S. E. & Wolski, K. N. Engl. J. Med. 356, 2457–2471 (2007).
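The ‘Hidden risks’ point in the drug-safety story can be illustrated with a little arithmetic. The sketch below is illustrative only: the baseline risk and the patient counts are hypothetical placeholders chosen for round numbers, not figures from the studies cited; only the 40% relative increase comes from the text. The function name `excess_events` is likewise an assumption made for this example.

```python
def excess_events(patients: int, baseline_risk: float, relative_increase: float) -> float:
    """Expected number of extra events when a baseline risk rises by a relative fraction.

    patients          -- number of people exposed to the drug
    baseline_risk     -- probability of the event without the drug (0 to 1)
    relative_increase -- fractional rise in that risk, e.g. 0.40 for 40%
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability between 0 and 1")
    return patients * baseline_risk * relative_increase


# Hypothetical inputs: a 1% baseline risk, a pre-approval trial of 2,000
# subjects, and 1,000,000 patients once the drug is on the market.
trial = excess_events(patients=2_000, baseline_risk=0.01, relative_increase=0.40)
market = excess_events(patients=1_000_000, baseline_risk=0.01, relative_increase=0.40)

print(f"extra events expected in the trial:   {trial:.0f}")
print(f"extra events expected on the market:  {market:.0f}")
```

Under these assumed inputs, the same relative increase yields only a handful of extra events in a trial-sized group, which is easy to miss against normal variation, but a far larger absolute number once the exposed population grows. That is the sense in which a ‘subtle’ effect can still matter at market scale.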


US report pins down future biosecurity

Committee recommends a sequence-based system for identifying pathogens.

Can the disease-causing capabilities of an organism be predicted from its DNA? This was a key question faced by a 13-member committee of the US National Research Council (NRC). It was trying to determine what it would take to develop a government system that spots bioweapons in the making by screening the genetic sequences routinely ordered from commercial suppliers of synthetic DNA.

This week, the committee offered its answer in a 187-page report commissioned by the National Institutes of Health (NIH). The verdict: a biosecurity system that can predict the potential for harm lurking within a snippet of DNA is so technologically distant that the concept is useless for practical purposes. “This is a prediction problem that can’t be solved, now or in the foreseeable future,” says Sean Eddy, one of the report’s authors and a computational biologist at the Howard Hughes Medical Institute’s Janelia Farm Research Campus in Ashburn, Virginia.

The committee also declined to describe a detailed scientific road map leading to such a predictive capability, because it felt that the information could be misused. “We were very hesitant to go down that path, because we felt that the ability to predict something relied on the same skill set that would be needed to design a pathogen,” says James Leduc, chair of the committee and director of the Galveston National Laboratory at the University of Texas Medical Branch in Galveston.

The NRC report comes less than three months after Craig Venter and his colleagues at the J. Craig Venter Institute in Rockville, Maryland, published their manufacture and insertion of a synthetic bacterial genome into a closely related bacterial cell which was then able to self-replicate (D. G. Gibson et al. Science 329, 52–56; 2010). This milestone lifted the profile of synthetic biology, including its potential for misuse.

Prompted by advances such as this, the committee did, however, identify a key change that would be possible with current technology: moving to a sequence-based classification system for the regulation of dangerous pathogens. The United States regulates a list of 82 pathogens and toxins, called ‘select agents’, deemed to pose a biosecurity threat and so subject to restricted access. But currently, nothing identifies them beyond taxonomic labels, such as Bacillus anthracis for anthrax.

“A sequence-based classification system,” says the report, “could be used to create a pragmatic ‘brighter line’ for deciding when a new genome sequence should be regarded as one of the existing select agents or not.” This would help, it suggests, to tackle potential confusion raised by variants of existing pathogens. The report states that DNA synthesis companies, and the scientists they serve, should be able to quickly and unambiguously determine whether a given sequence is on the select-agent list.

The report also describes a “yellow flag” biosafety system that would address sequences of concern — snippets of DNA that are not in themselves select agents, but could be part of one or otherwise used to produce a bioweapon. The yellow-flag system would consist of a centralized biosafety sequence database that would be annotated as evidence of the function of suspect genes comes to light. If a sequence received a yellow-flag designation, that should not trigger regulatory action, the authors write, but “common sense follow-up”, such as a telephone call from a synthetic DNA company to make sure that a customer is legitimate.

The idea of relying on sequences to define select agents drew some praise. “It’s very good because it places a lot of emphasis on really using sequences to screen, rather than worrying about taxonomy,” says Jeremy Minshull, president of DNA 2.0, a synthetic-genomics company in Menlo Park, California. Currently, he says, to comply with select-agent regulations, companies like his must laboriously comb GenBank — an annotated database of publicly available DNA sequences maintained by the NIH’s National Center for Biotechnology Information — for sequences that could correspond to a select agent on the list maintained by the Centers for Disease Control and Prevention in Atlanta, Georgia, and the US Department of Agriculture. “It’s not a trivial task,” says Minshull, who adds that he would welcome the efficiency of a comprehensive government curated database of pathogenic sequences.

But some critics say that moving to sequence-based classification would introduce complexity. “It would actually decrease regulatory clarity,” says Gigi Kwik Gronvall, a senior associate at the Center for Biosecurity of UPMC, a major hospital network in Pittsburgh, Pennsylvania. “It exchanges a functional definition of a pathogen — that is, a microorganism that can do harm — for a very complicated approach that says: ‘Maybe it’s a pathogen and maybe it’s not, and there are infinite numbers of possibilities’.”

Eddy calls Gronvall’s criticism “totally fair”. But, he adds, it does not answer the question: “What do we do with the select agent list in an era when you can synthesize things for which there are no experimental data?”

The NIH did not have immediate comment on the report, which stops short of saying that the government should press ahead with building the suggested classification system. Committee members wrote that they chose to leave that decision — including the risk–benefit analysis that should precede it — to policy-makers. ■
Meredith Wadman

Image caption: All in the code: is it possible to predict the pathogenicity of a toxin by looking for particular genes? Credit: D. Schwartz/iStockphoto

NEWS FEATURE | NATURE | Vol 466 | 5 August 2010 | © 2010 Macmillan Publishers Limited. All rights reserved

A scientist at the centre of the spill

Vernon Asper was one of the first researchers in the Gulf of Mexico to study the oil gushing out from the BP well. But it has not all been smooth sailing, reports Mark Schrope.

On 12 May, Vernon Asper was cruising through the Gulf of Mexico, just a few kilometres south of where the Macondo well was gushing tens of thousands of barrels of oil a day into the ocean. Asper, an oceanographer at the University of Southern Mississippi near Diamondhead, wasn’t there to see the carnival of response ships and drilling rigs at the site, or to look for oil slicks on the surface. He and his colleagues were hunting for something more elusive — an answer to what might be happening to the unseen oil and natural gas billowing into the bottom of the Gulf.

As the first group of academic scientists on the scene, having arrived less than two weeks after the well blowout, Asper (pictured) and his team knew that valuable information about the spill was being lost and that they were the only ones in a position to capture the disappearing data. The researchers, funded by the US National Oceanic and Atmospheric Administration (NOAA), lowered a constellation of instruments into the Gulf that would beam up data in real time to their ship, the RV Pelican. A fluorometer scanned the water with a narrow beam of light that would cause any dissolved oil to fluoresce at a telltale wavelength. A transmissometer measured how particles or cloudiness in the water blocked the transmission of light. And another sensor gauged levels of dissolved oxygen.

For most of the day, the monitors showed little of interest. But towards the afternoon, when the instruments were passing through water about 1,000 metres deep, the fluorometer and transmissometer readings spiked. The team went on to track remnants of that signal for some 45 kilometres southwest of the wellhead, in a layer between about 1,000 metres and 1,400 metres deep. It took some time for researchers to make sense of the data, but all the signs suggested that a deep, hidden plume of oily water was spreading away from the gusher.

The news came as a shock, because oil is supposed to float on water. “That was a very, very disturbing and fascinating development,” says Jeffrey Short, an environmental chemist based in Juneau, Alaska, who works with the conservation-advocacy group Oceana, and was a leader in the damage assessment of the 1989 Exxon Valdez oil spill before he retired from NOAA.

If oil was spreading in the deeper parts of the Gulf, there could be major consequences. That oil might harm a host of organisms, ranging from delicate deep water corals to migrating plankton, that help support the Gulf’s food web. It might expose BP, the company responsible for the leaking oil well, to a new area of liability for environmental damages. And it would raise a question about whether the use of oil dispersants at the wellhead had contributed to the deep plume — something that scientists would need to answer quickly.

Oil at the surface, said Asper, “could be contained or monitored or defended against”. The deep plume was something entirely different. “It’s far more complicated than I expected,” he said on the boat.

Although he was talking about the oil, Asper might as well have been forecasting his life. Following the Pelican cruise, he would find himself in the middle of a political and scientific maelstrom that he could never have anticipated.

Several teams of researchers would later confirm the Pelican’s discovery. But throughout the agonizing three months that the well spouted, Asper was surprised to find his team’s work alternately ignored and challenged by NOAA. The agency temporarily requested that Asper and his colleagues stop talking to the media. And BP is now trying to hire Asper and other scientists, in what some view as an attempt to silence them¹. It has all been more than enough to probe the limits of his otherwise composed and good-natured disposition.

“The whole experience has been both exciting — to be involved in cutting-edge research into an incredibly important event — and frustrating,” says Asper.

It was chance that brought Asper and his colleagues to the centre of the oil spill just days after it started. When the Deepwater Horizon rig, operated by a contractor for BP, suffered a catastrophic blowout on 20 April, Asper’s team was making final
plans for a research cruise to study natural methane seeps and shipwrecks at the bottom of the Gulf of Mexico. His group, including the cruise’s chief scientist, Arne Diercks from the University of Mississippi in Oxford, was part of the National Institute for Undersea Science and Technology (NIUST), a NOAA-funded multi-university cooperative effort to apply new technology to undersea research. After the blowout, the researchers requested approval from NOAA to switch plans and investigate the spill instead.

The NIUST team was still working out its research strategy when it departed from shore on 2 May, with a general mission to track where the oil was going and to collect samples of sediments in areas not yet affected by the spreading oil. Two weeks into the voyage, the instruments picked up signs that were consistent with the presence of oily water at depth. The researchers started to wonder whether the oil was getting trapped in a relatively stable layer, rather than rising to the surface as expected. As they cruised southwest of the well site, they kept encountering hints of oil at depths below 1,000 metres (see map overleaf).

None of their data was conclusive. The researchers knew, for example, that natural seeps emit more than 400,000 litres of oil and gas into the Gulf each day². And when the Pelican team pulled up water from the region where the instruments pointed to oil, the initial samples looked clear and had no oily scent. “I don’t know what to think,” Asper said at the time, “but that’s why we’re here.”

[Photo: Evidence of deep oil plumes, gathered by the crew of the RV Pelican, has caused controversy. Credit: NIUST/NOAA]

Over the course of the cruise, the researchers collected enough evidence to build what they considered a strong circumstantial case for the existence of a deep plume of some form of oil. They also found unusually low concentrations of oxygen in the water, which they suspected could be caused by bacteria metabolizing oil and methane at depth. Many oil wells produce substantial amounts of methane and later measurements of samples collected by the Pelican team would find methane levels 100–100,000 times higher than normal in the plume.

Towards the end of the trip the group was asked by NOAA to take a detailed inventory of all the water samples it had collected, in case they ended up as evidence in legal proceedings. The agency needed the data to fulfil its challenging roles in responding to oil spills. NOAA must work closely with BP to guide response efforts, but it must also lead the environmental assessment that will ultimately determine BP’s liability.

Assembling the inventory was a time-consuming task at a point when the group was frantically trying to finish its work. At first the scientists feared that they might have to halt their research early to get it done. “I’m a scientist,” said Asper soon after the directive came through. “This legal crap is not what I got into the business for.”

A media storm

But Asper and his colleagues could not avoid the issue of liability. Shortly before the Pelican crew found the plume, BP had begun to apply dispersants at the wellhead — the first time these chemical brews designed to break up oil had ever been used underwater. Asper and his colleagues discussed from the outset the possibility that the dispersants might explain why they were seeing oil forming a plume 1,000 metres down, rather than rising to the surface. “I really wonder if the plumes are a result of that or if they would have been there without it,” Asper said at the time. The question had some urgency because the US Environmental Protection Agency would require BP to stop using the dispersants if they were shown to be creating a hazardous situation — for example by depressing oxygen concentrations enough to harm life. The team felt that it was important to get its findings out quickly so that scientists could mobilize to collect more information as fast as possible.

But once he was on shore, Asper found himself at the centre of a mess that would in some ways prove even more challenging than understanding the deep oil. He had agreed to be the Pelican team’s media face, and he did interviews from just after dawn until the evening on the day they returned. “It was just a crazy, crazy, crazy day,” says Asper. “It was a twilight zone.”

During the interviews, he described the evidence for a hidden plume of deep oil that was spreading an untold amount of hydrocarbons into the Gulf. Asper believes he was careful to note that more analyses were needed before anything could be said for sure. Still, some media reports gave the impression that huge lakes of crude oil were hiding in the deep — a view not supported by the data.

“It was a surprise to us that we had been misinterpreted,” says Asper, who admits that he entered the fray with little media experience. But he says that he did what he could to keep the record straight, and doesn’t know how he could have better controlled the picture that the media painted.

Other researchers were also unprepared for the crush of attention, which might have caught even the most media-savvy scientists off guard. Samantha Joye, a biogeochemist at the University of Georgia in Athens who collaborated with the Pelican crew and has grant funding through NIUST, was quoted by The New York Times as saying “There’s a shocking amount of oil in the deep water, relative to what you see in the surface water.” Joye says that she now chooses her words more carefully and makes a concerted effort to be
“less excited” when giving interviews, but that she does not regret spreading the news about the plumes because the opportunity to study them might have been missed if the press had not learned about the Pelican data.

The deep-oil discovery was not good news for BP. At the time, efforts to contain the oil and study its effects were focused on the surface, where the battle against all previous oil spills had been fought. Executives and spokespersons at the oil company questioned the existence of any deep plume of oil or gas, arguing simply that oil floats. (When Nature contacted BP, the company provided information already publicly available but did not give specific responses to several questions.)

What baffled Asper and his colleagues, however, was NOAA’s cool response to the Pelican data. The day after the ship returned to shore, NOAA asked the researchers to postpone talking to the press to allow time for regrouping. On the same day, the agency issued a statement about the plumes calling media reports on the team’s work “misleading, premature and, in some cases, inaccurate”.

[Photo: Jane Lubchenco urged caution in discussions about deep-sea oil. Credit: NOAA]

The researchers were taken aback. “We took it personally,” says Asper. “We thought it was talking about us.” His team was proud of the work it had accomplished under difficult circumstances. “We expected NOAA to be as proud of it as we were,” he says. “To instead have NOAA basically say that our results were invalid was quite a surprise.”

The scientists were further surprised by the rest of NOAA’s statement, which said that the scientists wished to clarify that they had not yet reached definitive conclusions. It also said that the team’s findings showed that oxygen levels were not low enough to be of concern, and that any connection to subsea dispersant use was only speculative.

Asper says that they fully agreed with the statements attributed to them, but that the Pelican researchers had not seen the text before NOAA released it. “They were doing damage control and trying to make sure people didn’t panic,” says Asper. “I kind of understand that, but I do wish they would have communicated with us a little bit better.”

Short saw the statement as a way to divert attention. “I think the agency probably felt like they should have been the ones to catch this and they weren’t,” he says.

Although NOAA never completely denied the possibility that the oil might spread far below the surface, it consistently backed away from confirming the Pelican findings, and pointed to a need for definitive confirmation. Weeks after the cruise, Jane Lubchenco, head of NOAA, still seemed uncertain about the evidence for a significant plume of oil at depth. “Obviously it would be highly unusual if we didn’t find oil right close to the well; the question is what’s happening farther afield,” she said. “I think the bottom line is that there is a lot of potential out there for jumping to conclusions that may not be warranted and that we are all served best by proceeding in a careful, thoughtful and quantifiable manner.”

Justin Kenney, NOAA’s communications chief, told Nature last week: “Throughout this event, all researchers have been committed to providing scientifically accurate information as soon as possible. Specifically in the case of the Pelican, all of us agreed that laboratory analyses of water samples collected on site had to be completed before definitive statements could be made about the presence of oil.”

In hindsight, the Pelican discovery should not have been much of a surprise. Ten years earlier, the US Minerals Management Service in collaboration with 23 oil companies, including BP, released some 120,000 litres of oil at a depth of 844 metres off the coast of Norway, as part of an experiment aimed at simulating a deepwater blowout. They found that a small but significant amount of the oil was confined to lower levels and did not rise quickly to the surface³. But few people seemed to recall the Norwegian experiment as NOAA set about coordinating the response to the blowout in the Gulf.

A few days after asking the Pelican scientists to stop speaking to the press, NOAA rescinded its request. Since then, Asper has been interviewed regularly. “It’s extremely time consuming,” he says. “There are so many phone calls and inquiries, but it’s hard to say no. We’re paid to collect data and obtain information, so you don’t want to withhold anything when someone asks about your findings.”

Corroborating evidence

Within weeks of the Pelican’s return, other researchers were finding corroborating evidence for the deep oil plume. Researchers at the University of South Florida in Saint Petersburg went out to study the spill area on the Weatherbird II twice in May, and NOAA presented data collected by the Florida researchers on 8 June. Lubchenco announced that NOAA had confirmed the presence of low concentrations of oil from the Deepwater Horizon well in deep plumes: specifically hydrocarbons in the parts per million range, and polycyclic aromatic hydrocarbons — carcinogenic oil-breakdown products — in the parts per trillion range. “It was gratifying,” says Asper, “I thought, ‘Great, at last now we’re vindicated’.”

But the Weatherbird II team had its own challenges with NOAA. Representatives from the agency and from BP travelled with the scientists on their first boat trip, and much of the work was carried out as part of the government’s Natural Resource Damage Assessment (NRDA) process for gathering evidence that might be used in future spill liability cases. The NRDA process is a foreign one to many scientists because there are restrictions on how samples and data are handled.

“Everything was kept under a very, very strict chain of custody,” says Ernst Peebles, a biological oceanographer from the University of South Florida and one of the lead researchers on the vessel. His group relinquished its
samples to NOAA and has not been given the opportunity to analyse them or most of those collected during the second cruise. The Florida team is scheduled to head out on another research cruise this week, but university administrators arranged funding for the trip independent of NOAA and BP.

[Map: OIL ON THE MOVE. Since the RV Pelican cruise in May, surface oil slicks have shifted from near the wellhead towards shore. Deep plumes of dispersed oil have also spread more than 1,000 metres below the surface, but the fate of this oil is not known. Legend: surface oil extent in mid-May*; surface oil extent in late July*; potential beached oil in late July*; approximate zone within which the RV Pelican crew detected signs of deep oil. *NOAA forecasts based on models initialized with aircraft and satellite observations.]

Despite the corroborating data collected by the Weatherbird II and other cruises that found multiple shifting plumes, NOAA continued to publicly criticize parts of the Pelican team’s work. Speaking at a conference in Baton Rouge in early June, Lubchenco said: “Unfortunately, some data collected have not been usable because the protocols that have been well identified have not always been followed.”

Lubchenco was referring to samples taken by the Pelican crew during its cruise. At the time, the scientists had followed established protocols by collecting water in glass containers for oil analysis and in plastic bottles for methane measurements. By a prior arrangement, the Pelican crew sent the glass containers off to researchers in Texas who were scheduled to do the oil tests.

After the ship returned, NOAA requested water samples so that the agency could conduct its own oil analyses. The Pelican researchers provided some of the remaining samples, which had been collected in plastic and were less ideal for oil analysis because of potential interactions with the plastic. But Lubchenco blamed the Pelican crew for failing to follow protocol.

“Did you hear what she was saying up there?” asked an incensed Asper, after hearing Lubchenco’s accusations in Baton Rouge.

He felt better when he spoke directly to Lubchenco afterwards. “I got the impression that she really did not understand what our situation had been and that the information she had been provided was very incomplete,” says Asper. “I don’t blame her personally.”

But Asper and his colleagues were again dismayed when NOAA issued a research update on 13 June that criticized the sample collection on the Pelican. “It’s like they were saying, ‘Shame on you for not being psychic and reading our minds and knowing what we wanted’,” says Joye, who maintains that the samples were collected properly for their intended use.

Offending words

Steve Murawski, director of scientific programmes and chief science adviser for NOAA Fisheries, says that no criticism was intended and he considers the sample issue a mix-up. “Those guys jumped into the breach,” he says. “I think they did the nation an incredible service and they should be congratulated.” However, NOAA has not yet acted on the team’s request to take down the press statement.

“That whole issue to my nose had a bad odour,” says Short, who feels that the protocol issue was part of an overall tendency by NOAA to be overcautious in its response to the spill. “I don’t feel it’s responsible to hide behind excuses like, ‘Oh, you used the wrong sample bottle.’ That’s the kind of behaviour you expect out of an oil company trying to minimize liability. It’s not what you expect out of a government that is supposed to be telling us what is happening.”

It is not yet certain whether oil and gas in the plumes are harming life in midwaters, or the delicate deepwater corals and other bottom dwellers found in fertile areas on the ocean floor throughout the Gulf. Scientifically, it is all new territory. “I suspect that the concentrations found may pose threats to plankton and larvae of many species,” says Tom Shirley, a marine biologist at Texas A&M University in Corpus Christi. Murawski has similar fears. “Personally, I’m concerned about that much oil in a community that is long lived and slow growing, although we don’t have any indications of mortalities.”

In addition to toxicity, a key concern is the oxygen depletion caused by microbes consuming oil and methane in deep plumes. On their cruises, Asper and his colleagues found that oxygen concentrations had dropped by 30–55% within the plumes. In June, a team led by John Kessler, an oceanographer at Texas A&M University in College Station, found reductions of up to 30%, in patterns similar to those seen by the Pelican team. Neither group detected oxygen levels that would be considered hypoxic — too low to support aerobic organisms — which is the level set by the Environmental Protection Agency to cease use of dispersants.

NOAA has been slow to compile and assess the oxygen data. Although several scientific groups found signs of oxygen depletion, an initial report in late June by NOAA and various federal agencies and BP did not describe any concerns⁴. “We don’t see significant oxygen depletion,” said Murawski soon after the report’s release.

A report on 23 July by the same group acknowledged that oxygen depletion has been observed but suggested that available data are inconclusive⁵. The report did not include data from Kessler’s group or from the Pelican, even though later work could have been compared with the first glimpse of the plumes to help assess their evolution. “It’s still a mystery to us why NOAA is not recognizing our data set,” says Asper.

The recent report is hesitant about the oxygen-depletion data, suggesting that backup measurements are needed because oil could potentially cause problems with the
standard oxygen sensor used in most of the studies. But Kessler’s team and others performed further analyses and found no such problems⁶. Kessler has tried several times to make his data available to NOAA and is working with the agency so it can incorporate his results into the ongoing analysis of the plumes. “They seem as eager to understand this as we are,” he says, although he is not sure why it is taking NOAA so long to receive the team’s data.

Short says that NOAA may be reluctant to acknowledge the oxygen impacts because it will make it harder for the agency to carry out its damage assessment. The effects of a diffuse oil plume are not clear, but there are more studies of how lowered oxygen concentrations can harm marine ecosystems. The news of oxygen depletion, he says, “complicates NOAA’s relationship with BP because these assessment studies would also create the potential for liability on BP’s part”.

Kessler thinks the answer may be simpler. Although he is not privy to the inner workings of NOAA, he surmises that the agency is simply overwhelmed. “I know these guys are unbelievably busy. They’ve got to be running 12 different directions at once.”

Some scientists suggest there are broad problems in the way the US government has handled research into the oil spill. “One critical shortfall is that there is no overall coordination of the many types of research that are being undertaken, so that important issues are not missed,” says Nancy Rabalais, executive director of the Louisiana Universities Marine Consortium in Cocodrie.

Scientists for hire

Given the scale of the environmental problem, many researchers worry that the government and BP are not doing enough to understand the gusher and its consequences. Ira Leifer, an oil-spill specialist at the University of California, Santa Barbara, and about a dozen collaborators, including Asper, have been pushing BP to fund a comprehensive series of studies examining how the gas, oil and dispersants behave as they enter the water at the wellhead and spread up the water column.

“What no one really knows is why the oil is going where people are finding it,” says Leifer. “And because the underlying science is unknown we have no predictive capability.” Researchers can’t answer basic questions such as whether deep oil and gas will be exported from the Gulf into the Atlantic, and what is the most effective ratio of dispersant to oil.

Leifer’s funding requests have been rebuffed for several weeks, despite being championed by Congressman Ed Markey (Democrat, Massachusetts). Early on, BP said that it was very close to reducing the flow, so the study would be moot. Although the oil flowed for several more weeks, the capping of the well in July prevented the studies that Leifer has in mind.

[Photo: The oil on the surface of the Gulf of Mexico is obvious — but what lurks at lower depths? Credit: NIUST/NOAA]

For now, Asper is planning his group’s next cruise to the spill zone and trying to keep up with interview requests. Previously, he was pondering whether to accept an offer to work as a consultant for BP to help guide its response to the spill. When first contacted by a BP lawyer about the possibility of working on retainer, Asper was sceptical. “I think he wants to make sure I don’t testify against BP,” he said.

More than a dozen scientists have already signed contracts with BP, according to the company¹. Some academics found the offers by BP appealing because they said it would provide a way to bring strong science into efforts to respond to the spill. After he learned more about what would be involved, Asper found the offer more tempting, both scientifically and financially. But he recognized that there was a potential conflict of interest and eventually decided not to accept the offer.

Despite his frustrations with certain aspects of the oil-spill experience, Asper says that it is by far the most interesting scientific problem of his career. He recognizes the benefits of scientific stardom, which have helped to focus international attention on the needs of the Gulf Coast. “I’m also pleased that people are interested in the things we’ve been interested in for so long,” he says. “It is flattering to have people call up and ask what you do. I say, ‘Well, I’ve been doing this for 25 years and nobody ever asked before but I’m happy to tell you about it.’”

On a recent weekday afternoon the wind was blowing the oil stench from the Gulf to Asper’s office building at the university. He had spent the morning with the journalist Dan Rather, doing an interview for a cable show. After work, Asper decided to take some time off to fly an experimental plane that he had finished building after 24 years of effort. He flew up through the clouds and above trucks loaded with dispersants under military guard. “It is gorgeous up there,” says Asper. “It’s nice to get above it all.” ■

Mark Schrope is a freelance writer in Melbourne, Florida.

1. Mascarelli, A. Nature 466, 538 (2010).
2. Oil in the Sea III: Inputs, Fates, and Effects 70 (National Academies, 2003).
3. Johansen, O., Rye, H. & Cooper, C. Spill Sci. Technol. Bull. 8, 433–443 (2003).
4. Joint Analysis Group Review of R/V McCall Data to Examine Subsurface Oil (NOAA, 2010); available at http://www.noaa.gov/sciencemissions/pDFs/JAG_report_1_BrooksMccall_Final_June20.pdf.
5. Joint Analysis Group Review of Preliminary Data to Examine Subsurface Oil In the Vicinity of MC252#1 (NOAA, 2010); available at http://www.noaa.gov/sciencemissions/pDFs/JAG_Data_report_Subsurface%20oil_Final.pdf.
6. Mascarelli, A. Nature doi:10.1038/news.2010.378 (2010).

People power

Networks of human minds are taking citizen science to a new level, reports Eric Hand.

The whole thing began by accident, says David Baker, a biochemist at the University of Washington in Seattle. It was 2005, and he and his colleagues had just unveiled Rosetta@home — one of those distributed-computing projects in which volunteers download a small piece of software and let their home computers do some extracurricular work when the machines would otherwise be idle. The downloaded program was devoted to the notoriously difficult problem of protein folding: determining how a linear chain of amino acids curls up into a three-dimensional shape that minimizes the internal stresses and strains — presumably the protein’s natural shape. If the users wanted, they could watch on a screen saver as their computer methodically tugged and twisted the protein in search of a more favourable configuration.

Thousands of people were signing up for Rosetta@home, says Baker, which was gratifying, but not entirely surprising; this kind of digital citizen science had become almost routine by then. It was first popularized in 1999 by the SETI@home project at the University of California, Berkeley (UCB), which harnessed volunteers’ computers to sift through radio telescope data in search of alien signals. And in 2002, UCB engineers had released a generalized version of the software known as the Berkeley Open Infrastructure for Network Computing (BOINC). By 2005, there were dozens of active BOINC projects — Rosetta@home among them — and hundreds of thousands of users worldwide.

But what was surprising, says Baker, was that the Rosetta@home volunteers quickly began to chafe at the painfully slow progress of their screen saver. “People started writing in saying, ‘I can see where it would fit better this way’,” he says.

In retrospect, this should have been obvious: even a small protein can have several hundred amino acids, so computers have to plod through thousands of degrees of freedom to arrive at an optimum energy state. But humans, blessed with a highly evolved talent for spatial manipulation, can often see the solution intuitively.

Recognizing an unexpected opportunity, Baker enlisted the help of computer-scientist colleagues. By mid-2008, they had created an interface for Rosetta@home that not only allows users to assist in the computation, but gives them an incentive to do so by turning it into an online game. In the game Foldit, players compete, collaborate, develop strategies, accumulate game points and move to different playing levels — all while folding proteins.

And it works. This week, Baker and his colleagues publish evidence that top-ranked Foldit players can fold proteins better than a computer (see page 756). By collaborating, these top players often come up with entirely new folding strategies. “There’s this incredible amount of human computing power out there that we’re starting to capitalize on,” says Baker, who is feeding some of the best human tactics back into his Rosetta algorithms.

By harnessing human brains for problem solving, Foldit takes BOINC’s distributed-computing concept to a whole new level. And it is not alone: several projects are emerging in this field, sometimes called distributed thinking, and the number of publications based on the approach is increasing.

[Illustration credits: Foldit image: Univ. Washington Center for Game Science; artwork by W. Fernandes]

“We’re at the dawn of a new era, in which computation between humans and machines is being mixed,” says Michael Kearns, a computer scientist at the University of Pennsylvania in Philadelphia, who evaluated the concept of distributed thinking as part of an unpublished 2008 study funded by the US Defense Advanced Research Projects Agency. Kearns says that the approach has the most promise
685 © 2010 Macmillan Publishers Limited. All rights reserved NEWS FEATURE NATURE|Vol 466|5 August 2010

in areas such as vision, language and complex logic puzzles — territories in which humans are expected to retain an edge on computers for some time to come.

David Anderson, a UCB computer scientist and the founder of BOINC, admits that the approach is still a long way from becoming mainstream. For many sceptical scientists, he says, “there’s this idea that they’re giving up control somehow, and that their importance would be diminished”.

But advocates of distributed thinking, such as François Grey, a physicist at CERN, Europe’s particle-physics centre near Geneva, have few doubts. Last July, Grey helped to establish the Citizen Cyberscience Centre in Geneva, which aims to promote distributed-thinking projects, especially in the developing world. Grey is currently setting up distributed-computing projects in China. And he has helped to organize a workshop to be held in London this September to encourage scientists to adopt the new approaches.

“The whole field has a funny image, that it is just for fun or for PR,” says Grey. “That’s what we have to break through.”

[Image: David Baker’s online game Foldit uses the basic problem-solving skills of volunteers to help solve three-dimensional protein structures.]

The eye of the beholder

Andrew Westphal, a UCB physicist, started on the road to distributed thinking almost two decades ago, when he was a lead investigator on a cosmic-ray experiment called TREK. TREK consisted of specially designed glass plates mounted on the outside of the Russian space station Mir in 1991. Cosmic-ray particles pelting the glass left microscopic traces that were revealed by chemical etching after the TREK detector had returned to Earth in 1995. To find those traces, Westphal automatically scanned and recorded images of the plates using a microscope. But image-recognition software wasn’t good enough to identify the tracks, so Westphal found himself staring at image after image, counting the tracks by eye. It was excruciating, he recalls. “Despite your best efforts, your mind wanders. You start to think about lunch or whatever.”

That reality was on Westphal’s mind when he joined NASA’s Stardust mission, which was launched in 1999 to collect samples of a comet and return them to Earth. Westphal’s focus was not on the comet itself, but on a collecting tray that was exposed to space during the years of cruising required to get there. He and his team were confident that 100 or so microscopic pieces of interstellar dust would burrow into the tray’s aerogel, a wispy material designed to decelerate and capture the dust without damaging it. But again, the challenge was to find those particles.

Unfortunately, that task made TREK look easy. After the spacecraft’s sample-return capsule fell to Earth in January 2006, Westphal reused the automatic imaging microscope from TREK to create 1.6 million images of the aerogel. He estimated that it would take a century for one person to peruse them all. So the following August, Westphal and his team launched Stardust@home, a continuing project that enlists the pattern-recognition abilities of thousands of volunteer ‘dusters’.

Although the ‘@home’ name pays homage to BOINC volunteer computing programs, Stardust@home is one of the pioneering distributed-thinking projects. As such, it faced plenty of early hurdles. For example, the dusters had to be given lessons on how to avoid being fooled by cracks in the brittle aerogel or by particles of Earth dust that had embedded in the aerogel from the start.

Only some of the volunteers worked diligently. Others quickly slacked off. And still others tried to cheat, just flipping through as many images as possible to rise to the top of a scorecard put in place as an incentive.

Westphal, working together with Anderson, realized that they would have to calibrate their volunteers just as they would any instrument. They had to find ways to assign a skill level to each volunteer; to assess how that skill level changes with time; and to determine how many volunteers had to reach the same conclusions about an image before a result could be believed.

Cosmic stardust agents

For Bruce Hudson, a resident of Midland, Ontario, Stardust@home was a perfect way to fill the long days. In 2003, he had a stroke that rendered the right side of his body mostly useless. Even computer games weren’t much fun. But somehow, the endless microscope photos of aerogel were enthralling. “I’ve always liked the stars and the and all that kind of stuff,” says Hudson, who previously worked as a groundskeeper for a Catholic shrine. He estimates that he spent as much as 15 hours a day on the project.

His hard work paid off. In 2010, at the Lunar and Planetary Science Conference in Houston, Texas, Westphal announced that Hudson had found the first probable piece of stardust — actually a pair of particles in the same track (see Nature doi:10.1038/news.2010.106; 2010). “I still can’t believe it,” says Hudson, who named the particles Orion and Sirius. Westphal is already using the unique characteristics of Orion and Sirius to calibrate the expectations of a new generation of Stardust volunteers.

[Image: Non-scientist Bruce Hudson found the first dust particles from NASA’s Stardust mission, naming them Orion (inset) and Sirius.]

Meanwhile, Anderson is reprising what he did with BOINC by generalizing the Stardust@home software, so that it can be used by scientists for other distributed-thinking projects. He calls the result Bossa, which doesn’t stand for anything. “At some point, I got tired of the idea that everything has to have an acronym,” he says laconically.

Anderson’s intention is that Bossa will always be open source and free, so that any scientist can use the software and adapt it to the task at hand. He foresees applications as diverse as in the BOINC ecosystem, where 68 active projects are engaging nearly two million users worldwide.

An early recruit is Tim White, a UCB palaeontologist, whose research involves searching for early hominid fossils in the Great Rift Valley of east Africa. For decades, his teams have searched in the same way: slowly. Bent over. Crawling across the dry desert soil in temperatures of up to 50 °C. But with their planned Bossa-based project, which they intend to call Hominids@home, much of that work could be taken over by volunteers who would look for the white gleam of bone in pictures. “A kid in front of a monitor isn’t going to know the difference between the tooth of a colobus monkey and a baboon,” says White, “but they’re going to know it’s a tooth.”

Galaxy Zoo

The online astronomy project Galaxy Zoo, which launched in 2007 at the University of Oxford, UK, is taking a rather different tack. Co-founder Chris Lintott says that the project was directly inspired by Stardust@home. “If people would look at dust grains,” he says, “then surely they’d look at our beautiful images of galaxies” — images that have been collected in the millions by the international Sloan Digital Sky Survey consortium.

The idea is for volunteers to determine by eye whether the galaxies are spiral or elliptical — a task for which computers are almost worthless. Galaxy Zoo has already published 17 papers after classifying 1.25 million different galaxies, and has just begun another stage of galaxy classification with data from the Hubble Space Telescope.

But as Lintott expands his domain to a ‘Zooniverse’ of projects — not just for galaxy classification, but for galactic mergers, supernovae, solar storms and lunar craters — he has been much pickier than Anderson is being with Bossa, where anyone can try anything. Lintott worries that Bossa projects might be hasty affairs that end up wasting the goodwill of citizen scientists. “Rather than letting anyone pitch for volunteers, we’d like to be a place where people can come and expect a certain level of commitment,” he says.

Anderson, not surprisingly, disagrees. He says he likes the commitment of Galaxy Zoo to distributed thinking, but not its ‘walled garden’ approach. Galaxy Zoo “doesn’t provide flexibility to the individual scientist”, he says.

Baker says that he also drew his inspiration for Foldit from Stardust@home. But any similarities to that program or to Galaxy Zoo end there. For one thing, Foldit players aren’t just engaged in basic image recognition and classification tasks — they are intuitively solving much harder optimization problems. Baker argues that the program is exploiting three uniquely human talents: a superior spatial awareness; an ability to take short-term risks for long-term gain; and the converse, recognizing a dead-end early and knowing when to quit.

The other important difference is that the Foldit designers take the gaming element more seriously. Neither Galaxy Zoo nor Stardust has the immersive qualities of Foldit, with its chat rooms, wikis and increasingly difficult levels of play. Zoran Popović, Baker’s computer-science collaborator at the University of Washington, points out that holding the volunteers’ interest is necessary if they are to learn quickly the skills required to make a real contribution. “It needs to be an exciting, compelling experience that’s not always the same,” says Popović.

There are also limits to games. If nothing else, says Kearns, as human computing becomes ubiquitous, “people will no longer marvel at being a part of these networks and may start to feel exploited by them”. The day may come when scientists have to seduce volunteers by doing what many consider anathema at present: paying them. “There will be a whole economics of this field,” says Kearns.

For now, there are still plenty of volunteers who are not jaded. Scott ‘Boots’ Zaccanelli is one of them. A resident of McKinney, Texas, he splits his time between a day job as a buyer for a valve factory and a personal business — Good For You Massage Therapy — that takes him and his massage chair to rodeos, county fairs and flea markets. But he has also been hooked on Foldit since 2008. “I’m pretty much there every night,” says Zaccanelli, who has used his undergraduate biology degree to help him rise to a number-6 global Foldit ranking. “I can look at something and see that it’s not right.”

The skills of players such as Zaccanelli are so impressive that Baker has moved past protein folding and is now offering them chances to design completely new proteins. Tasks include searches for new catalysts for photosynthesis, and for proteins that can bind to pathogens such as HIV or the H1N1 influenza virus.

One puzzle asked players to create a more stable variant of fibronectin, a protein scaffold that is useful for creating antibody-like compounds. Last October, Baker thought Zaccanelli’s design was promising enough to be synthesized in the lab — the first time a player’s recipe had been tested. It turned out that Zaccanelli’s fibronectin wasn’t any more stable, but Baker says it is just a matter of time before a player designs something that is.

[Image: Scott Zaccanelli designed a fibronectin variant that was synthesized in the lab.]

And that is a good enough motivation for Zaccanelli. “Maybe something I do will help contribute an answer to curing cancer or AIDS or the common cold,” he says. ■

Eric Hand is a reporter for Nature in Washington DC.

COLUMN

Not by experts alone

More and earlier public involvement is required to steer powerful new technologies wisely, says Daniel Sarewitz.

These are the days of miracles and horrors and hubris. The unveiling of the first synthetic living cell in May signalled that synthetic biology had emerged as a new technological frontier. Meanwhile, the Faustian bargain of a past frontier — using fossil fuels to provide energy — has come home to roost in the oil-ruined Gulf of Mexico, and in calls to geoengineer the climate.

We are an innovating species, engaged in a balancing act. In the decades after the Second World War, innovation fuelled an unprecedented era of wealth creation while keeping us on the brink of nuclear annihilation. The green revolution fed billions while poisoning soil and water and destroying agrarian cultures. Today, synthetic biology and geoengineering portend a future in which managing socio-technical complexity will be every bit as challenging, if not more so. Is there a better way forward?

Maybe — if we act fast, embrace our ignorance, and keep experts from taking over.

Once a complex technology is widely used — like the automobile or the coal-fired power plant — restricting, reorienting or replacing it becomes incredibly difficult. So the key to making better choices is to start early, when uncertainty about a technology’s future is high, by maximizing the diversity of perspectives and interests involved in the discussion.

The goal is not to convince the hoi polloi that they have nothing to fear, but to improve social outcomes of emerging technologies. Scientists may be inclined to ignore or dismiss the efforts of non-experts to influence complex technical discussions — for example, in discounting the views of English sheep farmers during the response to the Chernobyl nuclear reactor disaster, or belittling the critiques of AIDS patients in early efforts to develop treatments. But when it comes to the future of an emerging technology, no one (or everyone) is an expert.

Slouching towards governance

Most industrialized nations have made, at best, halting and politically painful progress towards more democratic technological decision-making. In the United States, for example, government and industry embraced nuclear power largely uninfluenced by serious democratic deliberation. In the rush to deploy new reactors in the mid-1960s, poor technological choices were made, costs skyrocketed and public backlash led to regulatory regimes too inflexible and politically volatile to make progress on the safe disposal of nuclear waste or the construction of a new generation of reactors.

A stark contrast comes from the controversy over human embryonic stem-cell research. Although the applications of stem cells remain speculative, over the past decade a vicious debate has played out in US politics over the morality of destroying embryos for research. Many scientists portray the struggle as one of rationality versus the forces of darkness, but this is far too simple. President George W. Bush radically restricted — but did not prohibit — public funding of stem-cell research. President Barack Obama has greatly expanded — but maintained limits on — the work. And in the process of the debate, a wider range of scientific approaches to stem-cell research has opened up (not only in the United States but also in other countries that have wrestled with bioethics, such as Germany), creating more paths for innovation and options for steering the science towards social benefits. All before stem-cell therapy has cured a single patient.

Stem-cell research is an ethical hot button, but thinking technology through doesn’t have to be so painful. In the early 2000s, talk of a nanotechnology revolution prompted the US Congress to require that investigations into, and public discussions on, the social implications of technological change be integrated into government research programmes on nanotechnology (for example, see http://cns.asu.edu). The effort is minuscule compared to the scale of the whole nanotechnology programme. Yet it shows an awareness among US policymakers that areas of research with the potential to transform society should not proceed in isolation from public deliberation.

More than just panels

The general movement seems to be in the right direction — towards earlier, more inclusive discussions. For the big things now on the horizon — synthetic biology and geoengineering — vigorous dialogues are starting up in Europe (see Nature 465, 867; 2010 and http://royalsociety.org/Geoengineering-the-climate/) and the United States. In July, President Obama’s bioethics panel devoted its first meeting to synthetic biology (see www.bioethics.gov/meetings). Scientists appearing in front of the panel trotted out the standard hype (vaccines that could be developed the day after a new disease is identified; synthetic biofuels to completely replace fossil fuels), and other speakers talked about the potential downsides, including the possibility of escaped designer pathogens (the synthetic-biology equivalent of the Gulf oil spill). Ignorance about the future was rampant, as might be expected. But using an ethics panel to launch a discussion about a new technology sent a good signal: the government’s role is not just to throw money at the next big thing, but to encourage open talks about social implications and options.

Geoengineering is undergoing similar treatment. The US Congress, the Government Accountability Office and the non-governmental National Commission on Energy Policy are among the most conspicuous groups starting to think through the troubling question of how — if at all — humans ought to directly intervene in the climate to try to mitigate the worst effects of global warming.

But wise democratic guidance of technological decision-making will take more than ad hoc panels. A commitment to reflecting on technological futures needs to be integrated into the research and development enterprise — much as, starting in the 1960s, the process of reflecting on the ethics of research involving human subjects became formally integrated into all biomedical research programmes. Relative to the cost of research and development, increasing this capacity would be cheap. It could be paid for by a small tithe on the federal research budget, and coordinated by one or more loose networks of non-governmental groups, research universities, and government laboratories (for example, see www.ecastnetwork.org). New social networking technologies could permit such discussions on scales from local to international, in venues ranging from science museums and research laboratories to presidential commissions and nationwide virtual conferences.

This is the momentum of democracy. In the long run, it will also be the best thing for science. ■

Daniel Sarewitz, co-director of the Consortium for Science, Policy and Outcomes at Arizona State University, is based in Washington DC.
e-mail: [email protected].
See go.nature.com/ILx8PC for more columns.

OPINION CORRESPONDENCE

UK coalition’s funding plan accelerates trends Labour started

You seem to view the UK coalition government’s approach to funding research in universities as a change of direction, even harking back to that of the 1980s (Nature 466, 296; 2010). This is not the case.

The emphasis on economic returns by minister David Willetts is couched in language such as “absorptive capacity”, which was part and parcel of the former Labour government’s approach. For example, in addition to the paper by Jonathan Haskel and Gavin Wallis that you mention, Willetts cites one by Rachel Griffith and colleagues at London’s Institute of Fiscal Studies, a mainstay of thinking on this for the Treasury under Labour (see go.nature.com/orVDfF). The significance of absorptive capacity is that it leads us away from pure excellence, as measured by the citation metrics you describe, and towards a concept of ‘good-enough’ science, which does just enough to improve a firm’s absorptive capacity but not obviously anything more.

I thus believe it is more accurate to see the coalition government’s approach as a potentially powerful acceleration of the trends instigated under Labour.

William Cullerne Bown
Research Fortnight, 134–146 Curtain Road, London EC2A 3AR, UK
e-mail: [email protected]

The long story of how the boson got only Higgs’s name

In his review of my book Massive, Frank Close wonders how physicist Peter Higgs alone came to be associated with the elusive boson that bears his name (Nature 465, 873–874; 2010).

The story, as recalled by Higgs and by colleagues of the late Korean-born physicist Benjamin Lee, is as follows. Higgs discussed his work on what became known as the Higgs mechanism with Lee over a glass of wine at a conference reception in 1967. Higgs’s famous 1964 paper had been the first to draw attention to the existence of the massive boson that would become the signature particle of the mass-giving mechanism. In his discussion with Lee, Higgs did not enter into a full history of those on whose work he had built, given the informal nature of their chat.

Fast-forward to 1972, when Lee was rapporteur for the International Conference on High-Energy Physics at the National Accelerator Laboratory (now Fermilab) in Batavia, Illinois. Recalling their earlier conversation, he used Higgs’s name as a shorthand to describe work based on his kind of theory. From there, the name stuck and the Higgs boson was born.

Higgs openly acknowledges the contributions of others in this defining work. He opened one conference by suggesting that the Higgs mechanism should be renamed the “ABEGHHK’tH mechanism” after all of the people (Phil Anderson, Robert Brout, François Englert, Gerry Guralnik, Dick Hagen, Peter Higgs, Tom Kibble and Gerard ’t Hooft) who discovered it, or rediscovered it.

Ian Sample
31 Elm Park, London SW2 2TX, UK
e-mail: [email protected]

OPINION

Harnessing telecoms cables for science

Telecommunications companies and oceanographers should work together to plug old and new submarine cables into research projects, says Yuzhu You. A global network could monitor climate change.

Since the first submarine communication cable bis was laid across the English Channel in 1850, OR s/C more than a million kilometres of telecommu- R nications cables have been laid on the ocean floor. The result is a valuable network that can provide information about the world’s oceans:

electrical signals from the cables can yield Okanga/Reute J. information about the water they run through, and cables can be used to provide power to and transmit data from observatories on the sea floor. Yet only a tiny fraction of the existing undersea cabling is used for scientific purposes (see map). This is a missed opportunity. If the full potential of undersea cables could be har- nessed, they would be hugely useful in moni- toring the potential effects of climate change on ocean currents, temperature, and salinity as changed by melting ice, and in extending the global monitoring of seismicity. Oceanographers have a wide range of tools at their disposal, but each has its limita- More than a million kilometres of telecommunications cables lie beneath the sea. tions. Satellites can monitor only surface pat- terns and temperatures. Research vessels can it is expensive. NEPTUNE Canada’s recently should be urged to work together to overcome make detailed measurements of water tem- activated network, which relies on 800 kilo- these difficulties, to create a real-time global perature and composition at depth, but only metres of new cable, has cost Can$145 million network of undersea observation. from a tiny portion of the sea and rarely on (US$140 million) for the cables and instru- a regular schedule. The Argo array of 3,000 ments so far. Second life free-drifting profiling floats that measure the Other projects can make use of existing Projects that have successfully used old tele- temperature and salinity of the ocean are cables on a much smaller budget. A huge coms cables show how valuable they can be. limited in where they can go. 
They cannot go quantity of first-generation fibre-optic cable The Incorporated Research Institutions for below 2,000 metres, and generally are not used has been retired long before the end of its Seismology (IRIS), a university consortium in areas shallower than 2,000 metres in case useful life, thanks to rapid advances in cable headquartered in Washington DC, used two they hit the bottom. technology. A seismometer can be attached to retired coaxial telephone cables in the Pacific Another option is to build moored sea- a renovated cable relatively easily. And for cli- Ocean for part of the Global Seismographic floor observatories that use mate monitoring, a simple Network, a network of seismic monitors that cables to provide power voltmeter and computer can provides open-access data. The TPC-1 cable, and transmit data. These “The donation of cables turn a retired or in-service running from Guam to Japan, was donated by have an obvious advan- comes with legal and cable into an ocean-current telecommunications companies AT&T, based tage for long-term scien- practical difficulties.” data-set generator for as lit- in Dallas, Texas, and Tokyo-based KDDI, in tific monitoring, in that tle as a few thousand dol- collaboration with the University of Tokyo. they create a constant data lars. Retired cables can also AT&T also donated the second cable, HAW-2, stream from set locations, often at great depth be moved to scientifically important locations which runs from Hawaii to California. HAW-2 — something that cannot be achieved by any such as the Southern Ocean, where cabling is helped to run the Hawaii-2 Observatory2, a sea- other means. sparse. Relocation costs about $2,000 per kilo- floor seismometer sitting at a depth of nearly Some cabled ocean-floor observatories — metre of cable1, compared with $50,000 per 5,000 metres halfway between Hawaii and Cali- including the North-East Pacific Time-series kilometre for new cable. 
fornia, which sent real-time data to Hawaii from Undersea Networked Experiments (NEPTUNE), The difficulty in harnessing the potential 1999 to 2003. More recently, the retired HAW-4 the European Sea Floor Observatory Network of undersea cables is that the scientific com- cable between Hawaii and California was trans- (ESONET) and the Japanese Advanced Real- munity is not sufficiently organized to take ferred to the University of Hawaii for its Aloha time Earth Monitoring Network in the Area advantage of this resource, to navigate the legal Cabled Observatory, which is aiming to use a (ARENA) — are based on new cables. New, and practical difficulties of transferring the broader range of sensors for deep-water oceano- purpose-built cabling can increase the reliabil- ownership of retired cables, or to negotiate the graphic measurements in the coming decade3. ity of the system, and can be more appropriate multiple use of active cables. Telecommunica- Telecoms cables can themselves provide for local projects with specific design needs; but tions companies and the scientific community valuable data — without the need for additional

690 © 2010 Macmillan Publishers Limited. All rights reserved NATURE|Vol 466|5 August 2010 OPINION

instruments plugged into them. An electromag- In principle, it should be possible to design consultation with the US National Science netic current is induced in a cable by the motion repeaters for telecoms cables that are better Foundation, to help facilitate the acquisition of ocean currents, tides and tsunami waves. So able to provide a wide range of geophysical and of telecoms cables for science. But this was submarine cables can be used to measure water biogeochemical measurements. Ideally, they only partly successful. Over the past decade, flow — and have been used in this way around could be altered to directly discussions about trans- the world (see map). In particular, an undersea measure factors such as ferring the ownership of cable has taken daily measurements of the vol- pressure and temperature “The scientific community many retired transatlantic ume of water transported by the Florida Cur- at the ocean floor. They is not sufficiently organized optical cables6 — TAT-8, rent for the past 25 years, generating one of the could also provide ‘nodes’ to take advantage of this -9, -10, -11 — were brought longest time series of ocean water transport at which scientific instru- to the scientific commu- available. This cable record provides one of ments, such as seismom- resource.” nity but ultimately came to the essential data sets for calculating the North eters, could be plugged nothing as a result of a lack Atlantic meridional overturning circulation — a in — either directly or, indirectly, by using an of consensus on how to proceed. The IOC was major driver of global deep-ocean circulation, acoustic modem. It should even be possible to dissolved in 2007. Reports have been written and a phenomenon of importance to climate do this with an active telecoms cable without about the technical issues of cable reuse, and researchers. disturbing the communication signal. workshops held. 
But this has not resulted in a Recently, a group of oceanographers from functional global effort. more than ten countries, including myself, Making it work The first step to forming a global network has proposed using submarine cables to Telecommunications companies have been should be to evaluate the scientific potential monitor the Indonesian Throughflow — a generous in allowing the scientific community of all in-service and out-of-service telecom- major Indo-Pacific inter-ocean current — as access to cables and shore stations. However, the munications cables. This might require a new, part of the Pacific Source Water Investigation donation and transfer of ownership of cables international consortium of administrators, programme4,5. comes with legal and practical difficulties, espe- scientists and engineers. The next step would The electrical signal is also affected by chem- cially for transoceanic cables with multinational be to ensure coordination between governmen- ical and temperature changes in the water at the ownership. Fishing boats can snag and damage tal agencies, telecommunications companies sites of ‘repeaters’ — devices typically installed trawling equipment on undersea cables, leaving and the scientific community. An international 50–150 kilometres apart to amplify the com- the cable owner liable for damages. There are organization should perhaps be established munications signal in a powered telecoms no standard procedures for transferring this under the United Nations framework to do so, cable. These qualities can be measured, with liability from companies to academic institu- given that cables cover the global ocean floor a varying degree of success, by cleaning up the tions. In some regions, cable owners must also and cross many boundaries of national juris- electrical signal reaching shore. promise to remove a cable after its retirement, diction. 
This group, which should include legal Techniques used at present to estimate ocean which can be an expensive commitment. experts, would share knowledge and experi- bottom temperature at repeater sites have a Even if ownership can be successfully trans- ence about transferring rights and resolving high noise-to-signal ratio, but this problem ferred, projects can still be stymied by lack of the liability, security and legal issues. could be overcome by developing better tech- funding or technical support to maintain the In the future, as Internet use increases, nologies and methodologies. Cabled tempera- cable system. Simply storing donated spare demand for new cables will only grow. More ture monitoring, which could provide critical parts can be expensive. than 95% of telecommunications still goes information for climate-change studies, has In 1990, IRIS established the not-for-profit through submarine fibre-optic cables today, not yet attracted enough attention. corporation IRIS Ocean Cable (IOC), in with less than 5% going by the more expen- sive satellite route. We should take steps now GLOBAL NETWORK OF SEA-FLOOR CABLES to ensure that this expanding resource is put to Only a tiny fraction of telecommunications cables have been used for science so far. best use. ■ Yuzhu You is a senior research associate at the Institute of Marine Science, University of Sydney, Sydney, New South Wales 2006, Australia. e-mail: [email protected]

Map labels: TAT-8, 9, 10, 11; TPC-1; TPC-2; TPC-3; TPC-4; HAW-2; HAW-4.
Map legend: ocean-current monitoring station; seabed observatory project; out-of-service cables; submarine cable donated for scientific reuse; cable systems for which ownership transfer to science has been discussed.

1. DEOS Cable Re-use Committee Report (2003); available online at www.soest.hawaii.edu/soest/facilities/esf/Projects/DeOsCableRe-useReport.pdf
2. Butler, R. et al. EOS Trans. Am. Geophys. Union 81, 157, 162–163 (2000).
3. Duennebier, F., Harris, D. & Jolly, J. Sea Technol. 49, 51–54 (2008).
4. You, Y. et al. CLIVAR Exchanges 51 14, 11–13 (2009); available online at www.clivar.org/publications/exchanges/exchanges.php
5. You, Y., Sanford, T. & Liu, C.-T. EOS Trans. Am. Geophys. Union 91, 13–15 (2010).
6. Butler, R. Proc. 3rd Int. Workshop on Scientific Use of Submarine Cables and Related Technologies IEEE Catalog no. 03EX660, 3248–3249 (2003).

Acknowledgements I am indebted to Rhett Butler, former director of IRIS Ocean Cable, for his encouragement, contribution and advice on this article.

© 2010 Macmillan Publishers Limited. All rights reserved
Vol 466 | 5 August 2010

BOOKS & ARTS

Photo credit: N. Nanu/AFP/Getty

Hindu pilgrims pierce their tongues with metal skewers in a sacred ritual, apparently without feeling pain.

Overcoming agony

A broad account of the science of pain offers hope to patients but highlights how the culture of medicine needs to change, explains Lucy Odling-Smee.

The Pain Chronicles: Cures, Myths, Mysteries, Prayers, Diaries, Brain Scans, Healing, and the Science of Suffering
by Melanie Thernstrom
Farrar, Straus and Giroux: 2010. 384 pp. $27

We all experience pain, yet it is surprisingly hard to describe. This inability to share the feeling makes chronic pain a double burden. To be in pain is, as journalist Melanie Thernstrom describes, “to imagine that no one else can imagine the world you inhabit”. That is, until you read her book.

Thernstrom writes for The New York Times Magazine and has covered war, murder, matchmaking and divorce. Years of unremitting pain in her neck, shoulder and arm, after merely swimming across a lake, took her into the top US pain laboratories and clinics. Weaving her own story with ancient myths, the history of anaesthesia, patient accounts and recent research, she delivers a more complete picture of pain in The Pain Chronicles than most specialists would do.

Three paradigms have shaped attitudes to pain. Before modern medicine, pain was suffused with metaphysical meaning. Trial by ordeal, for instance, was practised for thousands of years: the accused were made to plunge a hand in boiling water or to walk on hot coals to see whether God would protect them from injury, and so prove their innocence.

The concept of pain as an evolutionary adaptation arose in the nineteenth century, when Charles Darwin’s theory of natural selection flourished, and still dominates. This second view sees pain as a signal of tissue damage, implying that if you treat the underlying disease or injury, the agony should go away. Yet such a reprieve eludes roughly 70 million people in the United States alone.

A third picture of pain has recently emerged: the neuropathic model. It may explain why, for some, the stabbing, aching or burning stubbornly refuses to resolve itself long after the damage that caused it has healed. Much chronic pain (the kind that can worsen over months and years, and assume a life of its own) is now thought to be caused by changes in the brain and spinal cord, or in peripheral sensory nerves. Neurons transmitting pain become ‘hyperexcitable’; they begin firing spontaneously. And other nerves are recruited to help sound the alarm. Meanwhile, this state of excitability kills neurons that would normally dampen the pain signal.

Tricks in perception also play a part. A person’s genes, thoughts and culture all shape how these nerve impulses are interpreted. Neuroscientists now know that simply imagining that your pain will increase or persist for weeks revs up the activity of the central nervous system in a way that creates more pain. Likewise, by activating the parts of the brain that modulate pain, a sugar pill can be as effective in some people as an opioid analgesic.

This latest picture may explain why different people have varied responses to pain and injury. For example, Thernstrom witnesses Hindu pilgrims piercing their cheeks and tongues with metal skewers in a sacred ritual, apparently without feeling pain. It also hints at why non-traditional approaches, such as hypnotism, can be successful in treating pain. Yet despite this wider understanding, the treatment of chronic pain remains woefully inadequate.

Anyone experiencing what Thernstrom calls the “misunderstood, misdiagnosed and under-treated disease” of chronic pain will be familiar with her years of bafflement, pointless expense on treatments that don’t work, and consultations with physicians who don’t understand. (I am among those whose pain did not go away, after a disc herniation three years ago; only in the past few months did I stumble across a specialist who could explain why my pain persists.) Treatment is improving, but, even now, there is roughly only one pain specialist for every 25,000 patients in the United States. Meanwhile, chronic pain costs the country more than US$100 billion a year.

Thernstrom argues that the problem is mainly one of disseminating new understanding. I disagree. Although she focuses on recent studies, the idea that pain can result from a damaged nervous system is not new. Many of the core concepts were laid out by psychologists Ronald Melzack and Patrick Wall in their 1982 book The Challenge of Pain (Penguin). Anaesthetics were not widely used to treat pain until a century after their discovery because cultural preconceptions (for instance, about the virtue of suffering) needed to be overcome first. A similar shift in attitudes is necessary for medicine to properly embrace the science of pain. The belief that pain should be borne as an unavoidable fact of life persists among some physicians. For many others, the possibility of curing a life-threatening disease probably holds more appeal than showing patients how to manage a non-threatening but incurable one.

Regardless of what sustains the divide between the lab and the clinic, The Pain Chronicles should narrow it. I hope that Thernstrom’s industrious survey of the latest research, treatment biases and patient and physician perspectives will prompt some health-care givers to rethink their approach. And patients who read her book will no longer have to remain in the dark or feel so alone. Thernstrom’s descriptions of what it is like to live with “a broken alarm that rings continuously, signaling only its own brokenness” give a voice to millions of people whose [lives] are blackened by something that no one else can see. ■

Lucy Odling-Smee is an Opinion editor at Nature.

An embellished tale of Pluto’s discovery

Percival’s Planet: A Novel
by Michael Byers
Henry Holt: 2010. 432 pp. $27

Eighty years ago, the hunt for a mysterious Planet X culminated in the discovery of Pluto. Lying beyond Neptune in the Solar System, the orb was named after the Greek god of the underworld and became part of our family of nine planets. In 2006, having completed just one-third of an orbit around the Sun since its detection, Pluto was declared a mere ‘dwarf planet’, an interloper from deep space, and was cast out of that family.

Percival’s Planet is novelist Michael Byers’s fictionalized history of the discovery. Interweaving real people and events with imagined ones in a pastiche of the United States in the late 1920s, he tells the (real) tale of Planet X, an unseen planet at the Solar System’s edge that had been predicted by wealthy businessman-turned-astronomer Percival Lowell to explain anomalies in the orbits of Neptune and Uranus. Into the search for the putative planet steps Clyde Tombaugh, working at the Lowell Observatory in Flagstaff, Arizona, which Lowell had endowed. After Tombaugh’s efforts pay off, Planet X is renamed Pluto by a schoolgirl who wins a competition.

Tombaugh’s story of serendipity frames the novel. A farmer’s son from Kansas, he was a skilled technician and a self-taught maker of telescopes. In the novel, he shares the stage with a bewildering array of characters, from Harvard astronomers to fading boxers, confused heirs and beautiful women, both sane and insane. All these actors orbit ever closer to the Lowell Observatory, perturbing each other more and more until they collide. Byers’s portrayal of the United States on the cusp of the Great Depression is meticulous; glimpses of inner dialogue and period details help build an immersive world.

Clyde Tombaugh rose from humble origins to discover Pluto. Photo credit: AP Photo

The true story of Pluto’s discovery is here, but it is slow to come to the fore. And the mix of fact and fiction can be unsettling. Fabricated characters crucial to the story coexist with real people such as Tombaugh, his fellow astronomer Vesto Slipher and Lowell’s widow. Many readers will find themselves turning to Google for information about what is true and what is not.

Byers’s choice of embellishment over historic accuracy is understandable because the search for Planet X involved countless hours of drudgery: meticulous astronomical observation, long periods staring at photographic plates and uneventful bookkeeping. In real life, cool discoveries are often made by normal people who then just go home and have tea. By adding flesh and blood to the true story of Pluto, Byers reinvigorates its history.

Like witnessing men walking on the Moon 40 years later, finding Pluto was a milestone in the lives of a generation. It came at a pivotal moment when human horizons expanded. The discovery coincided with the birth of modern cosmology, when the vast scale of the Universe was revealed, and when improved global communications meant that a crash on Wall Street reverberated around the world. Science was changing too, becoming more professional. Yet the pace of investigation remained genteel: researchers could take their time on speculative projects such as the hunt for Planet X. This wider context makes Byers’s imagined tale of real science more compelling.

Percival’s Planet ends before Tombaugh’s death, a graceful effort to avoid Pluto’s miserable, albeit scientifically justified, demotion. But there is a neat coda to the tale. Some of Tombaugh’s ashes are now on board the New Horizons mission to that distant, frigid dwarf planet. Speeding through the outer planets, his ashes will arrive in 2015 at the speck caught by his sharp eye a lifetime ago. Not bad for a farmer’s boy from Kansas. ■

Caleb Scharf is director of astrobiology at Columbia University, New York 10027, USA, and author of Extrasolar Planets and Astrobiology.
e-mail: [email protected]

Behind the Mona Lisa’s smile

X-ray scans reveal Leonardo’s remarkable control of glaze thickness, explains Philip Ball.

Leonardo da Vinci was renowned as a prevaricating genius, apt to undertake too much, to experiment open-endedly and to stall over details. “This man will never do anything!”, Pope Leo X is said to have complained after finding the artist concocting a new kind of varnish rather than beginning a commissioned work. Even Leonardo’s Mona Lisa portrait was never formally completed, although he laboured on it for four years beginning in 1503, and returned to it many times throughout his life.

A study of the Mona Lisa’s paint layers, published in Angewandte Chemie International Edition last month (L. de Viguerie et al. Angew. Chem. Int. Ed. doi:10.1002/anie.201001116; 2010), gives insight into the techniques over which Leonardo obsessed. Philippe Walter and his colleagues at the Centre for Research and Restoration of French Museums (based, like the painting, in the Louvre in Paris) found that the smooth shading of the iconic face is a product of astonishingly fine control of glaze thickness, and that Leonardo experimented widely with painting methods and materials on other portraits.

Rare artworks can be analysed on site using X-ray spectroscopy, without removing paint samples. Photo credit: V. A. Solé, ESRF

Rather than extract paint samples from the sacrosanct flesh tones of the Mona Lisa’s face, Walter and colleagues exploited a non-invasive technique that has only recently been applied to art analysis: X-ray fluorescence spectroscopy. Bombardment of the material with X-rays excites an electronic transition from an atom’s inner shell. The excited electron then decays by emitting another X-ray, the energy of which reveals the atom’s elemental identity. Thanks to improvements in instrumentation and in software (developed when the team worked with other artworks using the bright X-ray source at the European Synchrotron Radiation Facility in Grenoble, France), the technique can now be used on site to map out elements horizontally across the paint surface and vertically through the layers.

The researchers have traced how the composition and thickness of the layers vary from light to shadow on the face of the Mona Lisa, and in the flesh tones of six other paintings by Leonardo in the Louvre. Walter and his team were particularly keen to establish how Leonardo achieved his trademark sfumato (‘smoky’) shading, which is devoid of evident brush marks.

It was known that this style exploits a glazing technique developed by fifteenth-century northern European oil painters such as Jan van Eyck, in which a translucent paint is laid over an opaque one. But the details of how Leonardo used it to such great effect were obscure. Walter et al. find that the thickness of a brown glaze placed over the pink base of the Mona Lisa’s cheek grades smoothly from just 2–5 micrometres to around 30 micrometres in the deepest shadow, and that it is made up of an iron-oxide earth pigment darkened with manganese oxide. Although these materials were widely used, Leonardo’s control of glaze thickness is remarkable. He probably used his fingertips, as did […], rather than a brush.

This finding confirms Leonardo as an innovative artist. He trained in the studio of Andrea del Verrocchio in Florence, but was apparently ready to abandon the pigment-mixing method preferred by Florentine artists in favour of experiments with glazing, similar to those conducted in Venice after the northern European painting techniques reached Italy.

That enthusiasm for experimentation is confirmed by analyses of the other Leonardo paintings, in which he used a variety of materials and techniques, including direct mixing, in the flesh tones. It also tallies with his reputed interest in chemistry, which seemingly provoked Pope Leo’s impatience at his tinkering and distilling. It is possible that analyses of other Leonardo paintings might allow his experimental methods to act as chronological markers, a valuable goal, given that the current dating of his works is sketchy.

Prospects for further analyses of the Louvre’s paintings on site are challenged by the decision of France’s culture ministry to relocate Walter’s research laboratories to the town of Neuville-sur-Oise in Cergy-Pontoise, some 30 kilometres outside Paris. The new centre, which is scheduled to operate from 2013, will house conservation laboratories and accommodate 250,000 artworks now stored in the Louvre and other Parisian museums. But the upheaval is taking its toll on staff morale, casting a shadow over the future of scientific research at the Louvre. ■

Philip Ball is a writer based in London.
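The principle behind X-ray fluorescence described in the Mona Lisa article (each element emits X-rays at characteristic energies) can be illustrated with a small lookup. This is only a sketch, not the software used by Walter and colleagues: the function name `identify_line`, the tolerance and the choice of elements are assumptions made for illustration, and the energies are rounded textbook values for a few emission lines, including iron and manganese, the two elements named in the glaze.

```python
# Approximate characteristic emission energies in keV (rounded tabulated values).
# The selection of elements is illustrative, not taken from the study.
CHARACTERISTIC_LINES_KEV = {
    "Ca K-alpha": 3.69,
    "Mn K-alpha": 5.90,
    "Fe K-alpha": 6.40,
    "Cu K-alpha": 8.05,
    "Pb L-alpha": 10.55,
}


def identify_line(peak_energy_kev, tolerance_kev=0.1):
    """Return the tabulated line closest to a measured peak energy.

    Returns None when no tabulated line lies within the tolerance,
    which is a hypothetical matching window chosen for this sketch.
    """
    name, energy = min(
        CHARACTERISTIC_LINES_KEV.items(),
        key=lambda item: abs(item[1] - peak_energy_kev),
    )
    if abs(energy - peak_energy_kev) <= tolerance_kev:
        return name
    return None
```

A peak measured near 6.4 keV would be attributed to iron and one near 5.9 keV to manganese; real spectra need background subtraction and peak fitting before any such matching.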


Serious fun with computer games

Sophisticated multimedia experiments offer platforms for learning about science through play, Aleks Krotoski finds.

It is the year 2110, and the level of methane in the atmosphere has reached a critical threshold. You have a decision to make: accept the risk of global catastrophe and continue extracting the gas to meet the energy needs of an increasing population; implement a one-child-per-household rule to reduce future energy demand; or fund a decade-long research programme to deliver technological solutions.

These judgements on climate-change policy are central to Fate of the World, a computer game due for release in October. Through “a nail biting set of global warming scenarios”, players will explore geoengineering, alternative energy sources and other options for protecting the planet over the next 200 years. By incorporating realistic predictions from climate models and advice from scientists, the game’s developers, Red Redemption, hope to encourage players to engage with climate-change issues and to influence their attitudes and behaviours. Fate of the World is the latest commercial title in a growing library of computer games that seek to convey serious messages through play. But how effective are they?

Interactive games such as Fate of the World foster learning by simulating climate-change decisions. Photo credit: www.fateoftheworld.net

Over the past decade, evidence has grown that computer-based play can support learning in schools. Pedagogical studies and evaluations, summarized in a 2006 joint report titled ‘Unlimited Learning’, by the UK government’s education department and a software publishers’ association, found that students whose lessons included interactive games were more engaged in curriculum content and demonstrated deeper understanding of concepts than those who did not use games. Better exam scores and teacher ratings resulted when computer games, both commercial and bespoke, were used as support materials. A plethora of organizations have sprung up to explore computer-based learning; in the United Kingdom, these include Futurelab in Bristol and the Serious Games Institute at Coventry University.

The success of computer games in engaging students lies in the mechanics of how they are designed. Jane McGonigal, a games researcher and designer at the Institute for the Future in Palo Alto, California, and Raph Koster, author of A Theory of Fun for Game Design (Paraglyph, 2004), have each described several effective systems in games that support learning. These include rewards, options that allow the user to navigate obstacles in a personalized way, opportunities to try out hypotheses and to fail in a safe space, iterative advance based on prior decisions and consecutive challenges that unfold logically. These rules echo many characteristics of scientific inquiry.

With good game mechanics, learning may result even when the educational material underlying the game has flaws. The 2008 game Spore, released by Electronic Arts, was criticized by educators for its unrealistic portrayal of evolution. Yet it was a critical and commercial success, proving that there is a market for mainstream games that tackle scientific theory. It encouraged the gaming community to engage in debates about evolution.

The potential audience for such games is vast: two-thirds of US households play computer and video games, and one-third of UK residents consider themselves gamers. Yet few serious educational games make a profit: most rely on funding from government departments, media organizations and science-promoting charities. Attracting non-gamers is difficult: the word ‘game’ may alienate would-be players and devalue the serious messages within. As a result, some developers use alternative names for their products, such as ‘behaviour-change platform’.

Broadening the appeal and effectiveness of educational games will require better products and targeted marketing. The playing experience must be immersive, coherent and believable. Games must not patronize or outpace the participant. They should go beyond ‘the dancing clown problem’, in which students are dazzled by digitized versions of pop quizzes but don’t retain knowledge.

An increasing number of computer games are achieving the desired balance. One example, described as effective and fun by educators and those in the trade, is McGonigal’s World Without Oil, co-designed and written by Ken Eklund. An interactive online experiment that ran over several weeks in 2007, it used realistic forecasts of our planet and asked people to collaborate on responses and descriptions of future life experiences that stemmed from their beliefs about fuel scarcity and energy shortage. Another is Red Redemption’s 2007 PC game Climate Challenge, the proof-of-concept title preceding Fate of the World, into which research-based scenarios were also integrated. It attracted more than one million players.

Multiple media can enhance the mechanics of game playing. The alternate-reality games community keeps audiences enthralled outside a game’s boundaries by sending players e-mails, instant messages and texts. They infiltrate the physical environment with city-wide billboard campaigns, live events and newspaper and magazine advertisements. They fabricate blogs, video diaries and websites to further increase users’ sense of immersion.

This multimedia approach was used in Oil Productions’ award-winning ROUTES, which explored genetics through videos, stories and traditional web games between January and March 2009. With more than 500,000 visitors and 4 million games played during the 3-month period it was live, and 675,000 viewers and 21 million players engaging with the website since, its successful formula has been used by its sponsors Channel 4 Education and the Wellcome Trust for other public-service campaigns aimed at teenagers. These include Ada, an explorer puzzle game still in development but similar to Tomb Raider, that encourages girls to choose careers in science, and SuperMe, a game released last month that seeks to boost young people’s self-esteem.

Games provide an alternative platform for communicating science. If their mechanics are well designed, game play could help us to make better decisions about our future. ■

Aleks Krotoski is a researcher and journalist who specializes in the social applications of technology. She is currently Researcher in Residence at the British Library, London.
e-mail: [email protected]

NEWS & VIEWS

EARTH SCIENCE

An inner core slip-sliding away

Michael I. Bergman

An ingenious proposal holds that Earth’s inner core is solidifying in the western hemisphere and melting in the east. The process is consequent on, and reinforces, its easterly slippage, or translation.

Over the past couple of decades, seismologists have revealed some bewildering features of Earth’s solid inner core and liquid outer core. On page 744 of this issue, Alboussière et al.¹ offer explanations based on theory and experiment that may help to account for two of those features.

Perhaps the most fundamental of the puzzles is that seismic waves in the inner core travel faster in the north–south direction than they do east–west²,³, a property known as elastic anisotropy. The exact cause remains unknown, although it probably involves alignment of iron crystals. Even stranger, the inner core exhibits a range of east–west seismic asymmetries, with the eastern hemisphere being less elastically anisotropic⁴, having a faster direction-averaged velocity⁵ and greater wave attenuation⁶, and showing less attenuation anisotropy⁷. Another surprising feature is the presence at the base of the overlying outer core of a layer of reduced seismic-velocity gradient⁸, which may result from a denser fluid layer⁹. Alboussière et al.¹ suggest a simple unified explanation for the denser fluid layer and, in a somewhat more speculative way, for the east–west asymmetries.

Figure 1 | Convective translation. Alboussière et al.¹ argue that the inner core is translating eastwards, driven by convection. As it does so, it solidifies in the west, releasing plumes enriched in lighter elements, and melts in the east, resulting in an iron-enriched layer that is denser than the overlying outer core. The process of translation could also help to explain the differing seismic properties between the hemispheres in the inner core. (See also Fig. 3 of the paper¹ on page 745.) Diagram labels: outer core; plumes enriched in lighter elements; inner core; solidification; melting; translation; stably stratified denser fluid layer.

Earth has been cooling since it formed 4.5 billion years ago. At some unknown time, probably about 1 billion years ago, it cooled sufficiently for the iron alloy in the core to begin to solidify. Because pressure increases with depth, and pressure increases the melting temperature of iron, the core solidifies outwards from the centre, in spite of the temperature increasing with depth. As when pure water ice begins to freeze from salt water and the remaining liquid becomes saltier, the liquid outer core becomes richer in the alloying elements as the inner core solidifies enriched in iron. The alloying elements are unknown, but they are known from seismology to be lighter than iron. Under the action of gravity, these lighter elements rise to the top of the outer core, driving compositional convection.

This makes the presence of a denser layer at the base of the outer core puzzling. Solidification of an iron-rich inner core should result in the fluid adjacent to the inner core having a higher fraction of lighter elements than the overlying fluid, not less. Moreover, compositional convection is probably important in the generation of the present magnetic field by the geodynamo¹⁰. But the presence of a convectively stable layer would seem to argue against this, because diffusion does not move fluid and cannot contribute to the geodynamo.

The new theoretical model that Alboussière et al.¹ suggest is that the inner core can ‘convectively translate’ (Fig. 1). In this translational mode, a cold, dense perturbation on one side of the inner core, say the west, moves the inner core eastwards under the action of gravity, with the return flow occurring in the outer core. The authors argue that the translation is favoured in the equatorial plane because the translation velocity is proportional to the inner-core radius, which is slightly longer in the equatorial plane as a result of Earth’s rotation. That the translation occurs from west to east appears to be arbitrary, although it could be influenced by slight, asymmetric, gravitational effects from the mantle. As a result of the eastward translation, and because the melting temperature increases with depth (again, as a result of pressure) faster than does the temperature of the core itself, extra solidification, above that simply due to inner-core growth, occurs in the west, with concomitant melting in the east. Because the solid is denser than the liquid, this reinforces the original motion.

By balancing the production of latent heat, which is a function of the translation velocity, with the ability of the outer core to remove heat, Alboussière et al. estimate a current translational velocity of about 5 × 10⁻¹⁰ m s⁻¹ (about 1.5 cm yr⁻¹). This means it would take roughly 100 million years for newly solidified iron to move across the inner core and melt; that is, for the inner core to renew. One can compare this with an estimated overall growth rate of the inner core of order 10⁻¹¹ m s⁻¹, based on a 1-billion-year-old inner core with a present radius of 1,200 km.

Because the translational velocity exceeds the growth rate by a factor of several tens, there is significant melting occurring on the eastern side of the inner-core boundary, sufficient to be the source of a global denser layer at the base of the outer core as the fluid spreads out atop the inner core. In accompanying experiments, Alboussière et al. show that it is possible to maintain a denser layer against the mixing caused by the overall release of lighter fluid. Although the denser layer is stable, plumes of lighter fluid penetrate the layer to drive compositional convection above it. The authors are also able to demonstrate that the layer thickness of 200 km inferred from seismic measurements is consistent with their experiments, given the concentration of lighter elements in the core. A physically plausible origin for a denser layer builds support for the case that a denser layer is the cause of the reduced seismic-velocity gradient above the inner core⁸,⁹.

One difficulty with the idea of convective translation, which Alboussière et al. recognize, is that the finite inner-core viscosity will allow deformation that will modify their simple model of translation. Moreover, although it is clear that convective translation with solidification in the west and melting in the east will lead in a general way to east–west asymmetry, and provides an alternative idea to long-term control by the mantle for understanding inner-core asymmetry¹¹, much work needs to be done to understand the origin of the inferred seismic properties of the inner core.

In a start on that task, Monnereau et al.¹² suggest that the east–west asymmetry in direction-averaged seismic velocity and attenuation is due to growth of already solidified grains as inner-core material moves eastwards (although they are unclear as to why this would lead to east–west variations in elastic anisotropy). This serves as an example of how, by offering an ingenious proposal, the study by Alboussière et al. opens up new avenues to investigate the strange centre of our planet. ■

Michael I. Bergman is in the Department of Natural Sciences, Mathematics and Computing, Bard College at Simon’s Rock, Simon’s Rock, Great Barrington, Massachusetts 01230, USA.
e-mail: [email protected]

1. Alboussière, T., Deguen, R. & Melzani, M. Nature 466, 744–747 (2010).
2. Morelli, A., Dziewonski, A. M. & Woodhouse, J. H. Geophys. Res. Lett. 13, 1545–1548 (1986).
3. Woodhouse, J. H., Giardini, D. & Li, X.-D. Geophys. Res. Lett. 13, 1549–1552 (1986).
4. Tanaka, S. & Hamaguchi, H. J. Geophys. Res. 102, 2925–2938 (1997).
5. Niu, F. & Wen, L. Nature 410, 1081–1084 (2001).
6. Cao, A. & Romanowicz, B. Earth Planet. Sci. Lett. 228, 243–253 (2004).
7. Yu, W. & Wen, L. Earth Planet. Sci. Lett. 245, 581–594 (2006).
8. Souriau, A. & Poupinet, G. Geophys. Res. Lett. 18, 2023–2026 (1991).
9. Gubbins, D., Masters, G. & Nimmo, F. Geophys. J. Int. 174, 1007–1018 (2008).
10. Lister, J. R. & Buffett, B. A. Phys. Earth Planet. Inter. 91, 17–30 (1995).
11. Aubert, J., Amit, H., Hulot, G. & Olson, P. Nature 454, 758–761 (2008).
12. Monnereau, M., Calvet, M., Margerin, L. & Souriau, A. Science 328, 1014–1017 (2010).
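The figures quoted in the inner-core article (a translation velocity of about 5 × 10⁻¹⁰ m s⁻¹, a radius of 1,200 km and an age of about 1 billion years) can be checked with a few lines of arithmetic. The sketch below is a back-of-envelope check, not the authors' calculation. It assumes that material crosses a full diameter in a straight line and that the radius grew linearly over the core's lifetime; the variable names are introduced here for illustration.

```python
# Back-of-envelope check of the inner-core figures quoted in the article.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.156e7 s

translation_velocity = 5e-10  # m/s, quoted translational velocity
inner_core_radius = 1.2e6     # m, quoted present radius (1,200 km)
inner_core_age_years = 1e9    # yr, quoted approximate age

# Convert the translation velocity to centimetres per year.
velocity_cm_per_year = translation_velocity * SECONDS_PER_YEAR * 100

# Assumption: material travels one full diameter before melting.
crossing_time_years = (
    2 * inner_core_radius / translation_velocity / SECONDS_PER_YEAR
)

# Assumption: the radius grew at a constant rate over the core's age.
growth_rate = inner_core_radius / (inner_core_age_years * SECONDS_PER_YEAR)

# How much faster the translation is than the mean growth.
velocity_ratio = translation_velocity / growth_rate

print(f"translation velocity: {velocity_cm_per_year:.2f} cm/yr")
print(f"crossing time: {crossing_time_years:.2e} yr")
print(f"mean growth rate: {growth_rate:.1e} m/s")
print(f"velocity / growth rate: {velocity_ratio:.0f}")
```

Under these assumptions the velocity comes out near 1.6 cm per year and the crossing time near 150 million years, in line with the rounded values in the text. The linear-growth estimate gives a few times 10⁻¹¹ m s⁻¹, so the ratio is somewhat over ten; taking the growth rate as exactly 10⁻¹¹ m s⁻¹ instead gives the factor of several tens mentioned in the article.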

ECOLOGY

Close relatives are bad news

Owen T. Lewis

In tropical rainforests, tree seedlings growing close to their parent are more likely to die. This mortality, caused by soil organisms, helps to explain the coexistence and relative abundance of species.

Simple models of competition among species suggest that a few tree species, those that are best at exploiting limiting resources such as light and nutrients, should dominate ecosystems such as tropical rainforests1. However, rainforests support hundreds of apparently very similar tree species, typically a small number of abundant species and many rare ones. How do these species coexist? Why are some of them rare and others common? Complementary studies in Panama by Comita et al.2 (published in Science) and Mangan et al.3 (page 752 of this issue) show that a form of negative feedback driven by soil organisms can explain the relative abundance of tropical tree species, as well as promoting their coexistence.

The theory that pests and diseases can be good for diversity was formalized independently by Dan Janzen and Joseph Connell 40 years ago4,5. Under the Janzen–Connell hypothesis, seeds and seedlings close to members of the same species will suffer particularly high mortality from specialized enemies such as herbivores and pathogens, a pattern referred to as ‘negative density-dependence’. The Janzen–Connell mechanism of negative feedback can enhance tree diversity because it prevents any one species from becoming locally dominant. Plants form the foundations of ecological communities, so if we can explain the high diversity of tropical forest trees then we may also explain the high diversity of species further up the food chain6.

Negative density-dependent seedling and sapling survivorship consistent with the Janzen–Connell mechanism has been widely documented7,8. However, several uncertainties have remained about the biological mechanisms driving these patterns. In particular, variation among species in terms of the strength of negative density-dependence has been overlooked. Many ecologists have assumed that the most abundant tree species in a local community suffer most from the proximity of neighbours of the same species. Such a pattern could arise if, for example, abundant species support large populations of specialized pathogens or herbivores. But both Comita et al.2 and Mangan et al.3 show that the reverse is true: it is locally rare species that suffer most from the proximity of relatives, suggesting that variations in abundance are a consequence rather than a cause of negative density-dependence.

The two investigations took place in the well-studied forests of central Panama (Fig. 1), but applied contrasting approaches. Comita et al.2 analysed an exceptionally complete data set on 31,000 seedlings representing 180 tree species, identified and tagged in 2001 and re-surveyed five years later. They investigated seedling survival as a function of the abundance of trees of the same (conspecific) or different (heterospecific) species growing within a 30-metre radius, and the number of conspecific or heterospecific seedlings growing in the same 1-m^2 plot. Neighbouring trees and seedlings of different species had little effect on seedling survivorship. However, seedlings were much more likely to die close to neighbours of the same species. Strikingly, the extent to which conspecific neighbours affected seedling mortality correlates with species’ abundances at the community level: rare species suffer more than common species from the presence of same-species neighbours.

Figure 1 | Tree diversity in Panama. Seedlings of rare species suffer more than those of common species from the presence of same-species neighbours, soil-dwelling organisms being implicated in the process. (Photo credit: Christian Ziegler.)

Mangan and colleagues3 isolated the feedback mechanism underlying these patterns, firmly implicating soil-dwelling organisms. In an elegant reciprocal experiment, Mangan et al. grew seedlings of six tree species in pots filled with ‘home’ soil (collected under trees of the same species) or ‘away’ soil (collected under other tree species). Relative to other species, seedling growth and survival were significantly lower on ‘home’ soil, which is more likely to harbour species-specific pests and diseases. Seedlings transplanted into the field near conspecific or heterospecific trees showed similar effects, with little evidence that above-ground pests such as leaf-feeding insects contributed to the results. Again, species experiencing stronger negative feedback were rarer in the community.

Understanding the factors that determine the commonness and rarity of species is a major preoccupation for ecologists, and is also

relevant to conservation because rare species are most at risk of extinction. The results will encourage researchers working in a variety of ecosystems to look more closely at the ‘self-limiting’ effects of species interactions. Correlations between the strength of feedback effects and relative abundance have also been documented for plants invading temperate grasslands9, and it seems likely that similar processes operate in other habitats, including temperate forests.

In the future, one challenge will be to pinpoint more precisely the organisms causing feedback effects, and to assess their specificity10. Fungi and bacteria seem the most likely culprits, and Mangan and colleagues plan to use the latest genomics approaches to compare soil microbial communities associated with different tree species. All else being equal, feedback caused by highly specific pathogens will generate the strongest patterns of negative density-dependence. However, pests or diseases that affect several tree species could alter the probability that individuals of closely related species (which are more likely to share natural enemies) will survive in close proximity. If this happens, its signal should be apparent in the seedling data sets studied by Comita and colleagues.

Further experimental research (for example, through the application of pesticides) may help to unravel the detailed ecological consequences of altering the soil biota. The ‘unseen majority’ of soil organisms are not a conventional target of conservation efforts, but the new results show that they may be essential to maintaining diverse tropical forest ecosystems. By altering the composition and activity of these communities, human actions such as forest exploitation and anthropogenic climate change may have long-term repercussions for the processes structuring and maintaining rainforest biodiversity. ■

Owen T. Lewis is in the Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK.
e-mail: [email protected]

1. Tilman, D. Resource Competition and Community Structure (Princeton Univ. Press, 1982).
2. Comita, L. S., Muller-Landau, H. C., Aguilar, S. & Hubbell, S. P. Science 329, 330–332 (2010).
3. Mangan, S. A. et al. Nature 466, 752–755 (2010).
4. Janzen, D. H. Am. Nat. 104, 501–528 (1970).
5. Connell, J. H. in Dynamics of Populations (eds Den Boer, P. J. & Gradwell, G. R.) 298–312 (PUDOC, Wageningen, 1971).
6. Novotny, V. et al. Science 313, 1115–1118 (2006).
7. Harms, K. E., Wright, S. J., Calderón, O., Hernández, A. & Herre, E. A. Nature 404, 493–495 (2000).
8. Webb, C. O. & Peart, D. R. Ecology 80, 2006–2017 (1999).
9. Klironomos, J. N. Nature 417, 67–70 (2002).
10. Freckleton, R. P. & Lewis, O. T. Proc. R. Soc. B 273, 2909–2916 (2006).

CROHN’S DISEASE

Genes, viruses and microbes

Alison Simmons

Variations in several genes can increase an individual’s susceptibility to complex disorders. But what tips the balance to cause the full-blown disease? For Crohn’s disease, viruses could provide part of the answer.

Crohn’s disease, a common inflammatory bowel disorder, is debilitating. Yet, other than the fact that variations in several genes as well as unknown environmental factors contribute to it, little is known about its cause. Writing in Cell, Cadwell et al.1 show that, against the right genetic background, a viral infection can make the difference between health and inflammation in mice with a condition that mimics Crohn’s disease.

That the host’s genetic make-up can account for around 50% of the risk of Crohn’s disease2 is well known. Genome-wide association studies3 have pinpointed more than 30 genomic regions (loci), variations in which are associated with an increased risk of developing the disease. The question is what contributes to the remaining 50% of risk. Environmental factors, and in particular the host’s resident microorganisms (the microbiome), are strong contenders, although direct evidence for an environmental contribution in either humans with Crohn’s disease or animal models has not been forthcoming.

Cadwell et al.1 focus on Atg16L1, a Crohn’s disease susceptibility gene. The authors had previously generated mice expressing low levels of the ATG16L1 protein (ATG16L1^HM) and observed two abnormalities in these animals4. First, their cells showed reduced levels of autophagy, the homeostatic process by which cells break down their own components. This finding was consistent with a study from another group5 in which mice expressing a mutant version of ATG16L1, lacking a domain required for starvation-induced autophagy, showed enhanced inflammatory immune responses on exposure to bacterial components.

Second, Cadwell and co-workers found abnormalities in the Paneth cells, a subset of intestinal epithelial cells that secrete antibacterial peptides. Indeed, Paneth cells of ATG16L1^HM mice show defects in the packaging and extrusion of antimicrobial granules and express higher levels of genes that influence the response to intestinal injury4. Consistent with this, Paneth cells of patients with Crohn’s disease who express ATG16L1 T300A (an ATG16L1 variant associated with increased disease risk) show similar changes5.

As part of their new study1, Cadwell et al. find that ATG16L1^HM mice raised in an enhanced barrier facility (one that is free from pathogens) did not show the inflammatory trait. So, while considering which environmental factors might trigger Crohn’s disease, Cadwell et al. investigated whether murine norovirus (MNV), which is often found in conventional, but not in enhanced, barrier facilities, could be a contributor. To do this, they infected normal and ATG16L1^HM mice (both raised in enhanced barrier conditions) with an MNV strain that causes a persistent infection6.

A week after infection, Paneth-cell abnormalities were seen in the ATG16L1^HM mice, whereas the normal mice showed a normal Paneth-cell response to infection. Furthermore, only Paneth cells from infected ATG16L1^HM mice showed the pro-inflammatory gene-expression profile and developed intestinal ulceration in response to the toxic substance dextran sodium sulphate. This mucosal injury depended on the presence of the mouse microbiome and was mediated by the pro-inflammatory cytokines interferon-γ and tumour-necrosis factor-α.

These results1 demonstrate that, in mice, several factors (variations in the host’s genetic make-up, exposure to a specific virus, toxin-mediated mucosal-barrier injury and the microbiome) can act together to trigger inflammation with similarities to Crohn’s disease in humans. Such models will be of growing importance in understanding both inflammatory and complex diseases generally and how the immune system communicates with the environment.

Immune responses to viruses rely on sensors of infection in the shape of ‘pattern-recognition receptors’, which recognize evolutionarily conserved motifs present in microorganisms; in the case of viruses, these are often viral nucleic acids. Signalling through these receptors affects the induction of autophagy, cytokine secretion and timely antigen presentation to immune cells. Of the known susceptibility genes associated with Crohn’s disease, the strongest candidate encodes NOD2, a possible intracellular sensor of nucleic acid7 that acts upstream of ATG16L1 in the induction of autophagy8,9. NOD2 is expressed exclusively in monocyte-derived cells, in which MNV replicates10, and in Paneth cells. ATG16L1^HM mice infected with MNV would therefore be an attractive model for studying the combined effects of genetic risk factors and specific autophagy-mediated immune processes such as antigen presentation in response to inflammation.

A notable difference between Cadwell and co-workers’ mouse model and Crohn’s disease in humans is that, in the mouse, the genetic risk is mimicked by reducing ATG16L1 expression rather than by expressing the actual human-gene variant. In the human disease, ATG16L1


T300A, the risk-associated variant of ATG16L1, carries a single amino-acid change in a region of the protein called the WD repeat. This repeat is a recent evolutionary acquisition, absent from the equivalent autophagy gene of organisms such as yeast, and its function is unknown. As the ATG16L1 T300A variant is also common in the healthy human population, it is likely to have been maintained for a good reason. Understanding the interaction between viruses and the autophagy pathway in cells expressing ATG16L1 T300A should be informative.

Aside from the immunological facets of their model, Cadwell and colleagues’ work is noteworthy for providing much-needed evidence for the concerted actions of several environmental and genetic factors in triggering a disease. Before this study, a viral contribution to Crohn’s disease would not have featured high in a sweepstake of risk factors, as the incidence of clinical flare-ups of the disease does not correlate with outbreaks of viral infection. But viruses could be operating ‘subclinically’. For example, it is possible that defects in autophagy, or in nucleic-acid sensing mediated by pattern-recognition receptors, lead to abnormal persistence of viruses normally cleared by the immune system. Alternatively, viral infection of the host microbiome could play a part. Yet another possibility is abnormal handling of endogenous retroviruses (those that have integrated into the host genome); this has been implicated in another inflammatory disease, Aicardi–Goutières syndrome11. Mutations in the TREX1 gene that predispose humans to Aicardi–Goutières syndrome cause an accumulation of nucleic acids derived from endogenous retroelements and an ensuing increase in the interferon response to DNA11.

On the basis of these new results, it seems the Crohn’s disease research community has a formidable challenge on its hands. Not only do the remaining genetic risk factors for this disease need to be determined and their functions elucidated, but the contribution of the host’s microbiome (and, now, of the host’s intestinal viral repertoire) must also be taken into account in subsets of patients with Crohn’s disease who have specific disease traits. That task seems Herculean, but efforts at addressing it will undoubtedly throw up new paradigms that are likely to have ramifications for inflammatory disease in general and for Crohn’s disease in particular. ■

Alison Simmons is in the Translational Gastroenterology Unit and MRC Human Immunology Unit, Nuffield Department of Experimental Medicine, University of Oxford, Oxford OX3 9DU, UK.
e-mail: [email protected]

1. Cadwell, K. et al. Cell 141, 1135–1145 (2010).
2. Van Limbergen, J., Russell, R. K., Nimmo, E. R. & Satsangi, J. Am. J. Gastroenterol. 102, 2820–2831 (2007).
3. Barrett, J. C. et al. Nature Genet. 40, 955–962 (2008).
4. Cadwell, K. et al. Nature 456, 259–263 (2008).
5. Saitoh, T. et al. Nature 456, 264–268 (2008).
6. Thackray, L. B. et al. J. Virol. 81, 10460–10473 (2007).
7. Sabbah, A. et al. Nature Immunol. 10, 1073–1080 (2009).
8. Cooney, R. et al. Nature Med. 16, 90–97 (2010).
9. Travassos, L. H. et al. Nature Immunol. 11, 55–62 (2010).
10. Wobus, C. E. et al. PLoS Biol. 2, e432 (2004).
11. Stetson, D. B., Ko, J. S., Heidmann, T. & Medzhitov, R. Cell 134, 587–598 (2008).

SPECTROSCOPY

Attosecond prints of electrons

Olga Smirnova

Attosecond spectroscopy has been used to track the real-time motion of electrons in a krypton ion, and to probe the entanglement between an electron removed from the atom and the ion left behind.

Snapshots of ultrafast dynamics in the microworld are traditionally made in a ‘pump–probe’ set-up. A first (pump) pulse of light plays the role of a starter gun, initiating the dynamics. A second, delayed (probe) pulse plays the role of a fast camera, taking snapshots of the moving object at different times. To take snapshots of electrons moving in atoms, the camera shutter must open and close in a fraction of a femtosecond (1 femtosecond = 10^-15 seconds). On page 739 of this issue, Goulielmakis et al.1 report the first images of electronic motion in atoms taken with an attosecond probe pulse (1 attosecond = 10^-18 s).

In their experiment, Goulielmakis and colleagues used an intense infrared laser pulse to quickly remove an electron from the outermost (4p) shell of a krypton atom. A krypton ion (Kr+) was thus created in a superposition of its two lowest-energy states (Fig. 1). These states differ in the way that the spin and the orbital momentum of the created hole (the absence of the electron removed from the atom) add together: the total angular momentum (J) of one state is 1/2, whereas that of the other is 3/2.

As a result of this spin–orbit interaction, the energies of the two states are different. Just as two different notes struck at the same time produce a beat sound, two simultaneously excited states in Kr+ that have different energies create a beat in the wavefunction of the hole. The phase of the beat (φ) is proportional to the energy difference (ΔE) between the states, and changes with time (τ). Goulielmakis et al. took a ‘picture’ of the evolving hole using a 150-attosecond extreme-ultraviolet (EUV) probe pulse to excite another electron from the deeper-lying 3d shell of the ion into the hole, while monitoring light absorption.

The way the attosecond ‘camera’ operates is reminiscent of Young’s interferometer, in which light waves travelling along two different pathways add constructively or destructively depending on their relative phase. In Goulielmakis and colleagues’ study1, two excitation pathways running through the two Kr+ states add together in an excited state of Kr+ known as the 3d^-1 state (Fig. 1). If the two pathways are in phase, they add constructively to set up a large population of ions in the 3d^-1 state, absorbing a large amount of EUV light. If the pathways are out of phase, they add destructively to give a small 3d^-1 population and weak absorption. In the authors’ experiments, the amount of absorbed light varied with the pump–probe delay τ, reflecting the evolving phase (φ = ΔEτ/ħ, where ħ is the reduced Planck constant) and hence the hole dynamics. Because all light frequencies interact with the ion simultaneously, the authors could observe modulation of the absorption signal even when monitoring only one of the absorption lines.

So far, the measurement might look like a typical, albeit technically very challenging, pump–probe experiment. Its other important aspect becomes apparent when we recall that the electron removed by the pump pulse en route to making Kr+ is lost from the interferometer. Both pathways to the 3d^-1 state share this loss. Because of this, the experiment deals with an open system, one that has not been completely measured. Incomplete measurement is a source of decoherence (the loss of a phase relationship) in the measured part of the system, here the Kr+ ion. So what does this mean in Goulielmakis and colleagues’ study?

Let us recall that, in optics, interferometers are also used to characterize the coherence of optical beams. In the same way, the visibility of the interference fringes in the authors’ study measures the coherence of the two interfering pathways in the ion subsystem, that is, the coherence of the spin–orbit dynamics in the Kr+ ion. These dynamics are coherent if the quantum states of the removed electron are common to both pathways. In contrast, the coherence is lost if these states are orthogonal, that is, not common to both pathways (Fig. 1a). Then, the two subsystems cannot be treated independently: the removed electron and the ion are entangled. Lack of information about the removed electron therefore results in the loss of information about the phase between the two states of the ion. In other words, a high degree of entanglement results in low coherence of the hole motion.
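The beat relation quoted above, φ = ΔEτ/ħ, and the two-pathway interference picture can be sketched numerically. What follows is a minimal Python illustration under stated assumptions, not the authors’ analysis: the 0.67 eV splitting is an assumed illustrative value that does not come from this article, the function names are hypothetical, and the `coherence` parameter (a number between 0 and 1) is a stand-in for the overlap of the removed electron’s two quantum states in a textbook two-path interference formula.

```python
import math

# Planck constant in eV*s (CODATA 2018) and the reduced constant hbar.
H_EV_S = 4.135667696e-15
HBAR_EV_S = H_EV_S / (2 * math.pi)


def beat_phase(delta_e_ev, delay_s):
    """Beat phase phi = dE * tau / hbar, in radians."""
    return delta_e_ev * delay_s / HBAR_EV_S


def beat_period(delta_e_ev):
    """Delay after which the phase has advanced by 2*pi, in seconds."""
    return H_EV_S / delta_e_ev


def absorption_signal(phase, a1=1.0, a2=1.0, coherence=1.0):
    """Toy two-pathway signal: a1^2 + a2^2 + 2*g*a1*a2*cos(phi).

    a1 and a2 are real pathway amplitudes; g (coherence) scales the
    interference term and is an assumption of this sketch.
    """
    return a1 ** 2 + a2 ** 2 + 2 * coherence * a1 * a2 * math.cos(phase)


def fringe_visibility(a1=1.0, a2=1.0, coherence=1.0):
    """(S_max - S_min) / (S_max + S_min) over one beat cycle."""
    s_max = absorption_signal(0.0, a1, a2, coherence)
    s_min = absorption_signal(math.pi, a1, a2, coherence)
    return (s_max - s_min) / (s_max + s_min)


if __name__ == "__main__":
    delta_e = 0.67  # eV, assumed illustrative splitting (not from the article)
    period = beat_period(delta_e)
    print(f"beat period: {period * 1e15:.2f} fs")
    for g in (1.0, 0.5, 0.0):
        print(f"coherence {g:.1f} -> visibility {fringe_visibility(coherence=g):.2f}")
```

With equal pathway amplitudes, the fringe visibility in this toy model equals the coherence factor, which is the sense in which the visibility of the absorption modulation can be read as a measure of coherence: full overlap of the electron states gives full-contrast fringes, and orthogonal states give none.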


Goulielmakis and colleagues1 characterized the coherence, and thus the entanglement, of Kr+ and the lost electron. In their experiments, the intense, ultrashort pump pulse ensures significant overlap of the two quantum states of the removed electron that correlate with two different pathways in the ion’s subsystem (Fig. 1b), resulting in a low electron–ion entanglement, a high coherence of the hole’s wave packet and high visibility of the interference fringes. The ability to probe decoherence is a very important aspect of the experiment.

Figure 1 | The first attosecond probe experiments. Goulielmakis et al.1 report a technique for observing electron motion in real time. They irradiated krypton atoms (Kr) with a ‘pump’ pulse of infrared light lasting a few femtoseconds, liberating electrons to generate Kr+ ions in a superposition of two states, 4p^-1(J = 1/2) and 4p^-1(J = 3/2), where J is total angular momentum. Black arrows indicate the two ionization pathways. The authors then irradiated the ions with attosecond ‘probe’ pulses of extreme-ultraviolet light, exciting them to a higher-energy 3d^-1 state; red and green arrows indicate the two possible excitation pathways. The complete system constitutes an entangled electron–ion pair. a, The different excitation pathways taken by the ion to reach the 3d^-1 state may cause the liberated electrons to adopt orthogonal quantum states. The spheres represent two states of the same electron, in which the red sphere correlates with the J = 1/2 state of Kr+, and the green sphere correlates with the J = 3/2 state. The ‘hands’ on the spheres don’t touch, indicating that the states don’t overlap. b, In Goulielmakis and colleagues’ experiments, strong overlap of the two quantum states (indicated by the held hands of the spheres) lowers entanglement and allows the two possible excitation pathways of the ion to interfere. By measuring the interference, the authors tracked the motion of the hole (the absence of the liberated electron) in Kr+ in real time, characterizing its coherence and the degree of electron–ion entanglement.

The authors’ experiment is reminiscent of a two-colour coherent-control scheme2. In such schemes, population of a final state is controlled by the relative phase between the two colours of light needed to promote a system from two intermediate states (J = 1/2, 3/2) to a final state. One might thus conclude that Goulielmakis et al. could have made their measurements without resorting to attosecond pulses: two ‘phase-locked’ colours, with controlled phase between them, would have been enough. From this perspective, the use of a time-delayed attosecond probe can be viewed merely as a convenient way to achieve this goal. Indeed, both colours needed to promote the system to the final 3d^-1 state are naturally present in the ultrashort probe pulse used by Goulielmakis et al., and the relative phase between them changes with the pump–probe delay. Are attosecond probe pulses really needed?

The answer is generally yes, if one deals with open systems. In the authors’ study, conducted in the gas phase, decoherence arises only during the preparation of the hole wave packet and does not evolve afterwards. But a notable strength of Goulielmakis and colleagues’ technique is that it can also be used in condensed phases (such as liquids and solids). Here, decoherence may quickly evolve during the time delay between the pump and the probe pulses, and so direct time-domain measurements, such as in the experiment1, become indispensable: a two-colour coherent-control scheme operating with long pulses may not catch ultrafast changes in electron coherence.

Subfemtosecond hole migration across many ångströms has been predicted to occur in large molecules3,4. Such motion may have important implications5 for subsequent, femtosecond-scale nuclear dynamics in these molecules. Thus, early-stage hole dynamics may be connected5 to the concept of charge-directed reactivity6, the idea that molecular bonds break in places to which a hole has migrated. This idea assumes coherence of the hole wave packet when it is prepared. The technique of attosecond transient absorption spectroscopy, introduced by Goulielmakis and colleagues1, is well suited to check this key assumption.

The coupling of hole motion to other electronic and vibrational modes in molecules, which is responsible for charge-directed

NEUROANATOMY

From fin to forelimb

The vertebrate invasion of land was made possible in part by evolution of the tetrapod forelimb from the pectoral fin. But what changes occurred in neural control during this transition?

Robert Baker and colleagues have tackled this question using a thorough application of comparative neuroanatomy (L.-H. Ma et al. Nature Commun. 1, 49, doi:10.1038/ncomms1045; 2010). Their study centred on the developmental biology of several species of ray-finned fish, which are by far the largest group of extant fish. But it also included lobe-finned fish (a lineage that led to tetrapods) and cartilaginous fish such as sharks.

Motor-neuron innervation in tetrapods (forelimb) and fish (pectoral fin) arises from the spinal cord. But for ray- and lobe-finned fish, there is evidence that these nerves also originate in the hindbrain. In following up that evidence, the authors looked at the gross anatomy of the developing pectoral fin buds of various ray-finned fish. They found that they all have a similar organization of the buds themselves, of the myotomes that give rise to muscles, and of the neuroepithelium that generates pectoral motor neurons. Using dye-labelled fin buds (pictured here in a species called the plainfin midshipman fish, attached to its egg yolk), the authors also demonstrated that the motor neurons project from both the hindbrain and the spinal cord.

Studies with transgenic zebrafish, containing a fluorescently tagged enhancer that reports the activity of the developmental gene hoxb4a in motor neurons, confirmed the mapping of pectoral-fin neurons. Further work involved injection of the RNA for a photoactive fluorescent protein, kaede, into zebrafish embryos. The labelled neurons could then be followed during development, crucially showing that they develop in situ rather than migrating to their final location.

Baker and colleagues’ extension of their study to lobe-finned fish and cartilaginous fish provided evidence that, in these groups too, pectoral-fin motor-neuron control is exercised from the hindbrain as well as the spinal cord. Overall, the authors conclude that this dual contribution is the ancestral condition in vertebrates. As to the functional context, they speculate that the advent of spinal-only motor innervation of the forelimb allowed another notable characteristic of tetrapods compared with fish: their greater freedom of head movement.

Katie Ridd

reactivity, can be viewed as decoherence evolving over time. This could also be investigated using the authors’ approach. Such experiments may address the role of electronic coherence between different potential-energy surfaces at points where such surfaces intersect (known as conical intersections).

Returning to our optical analogy, conical intersections can be thought of as beam splitters. In optics, when a single beam of light strikes a beam splitter, two beams that have a well-defined phase relationship are produced as output. Now imagine reversing the process, so that two rays are sent to a beam splitter. It takes precise control of the rays’ relative phases and amplitudes to obtain a single beam as output, or a well-defined product of a chemical reaction in the molecular context. Similarly, one should expect that the amplitudes and phases of the electronic states that make up a hole’s wave packet will affect how this wave packet passes through the conical intersection, and what will appear at the output of such a molecular beam splitter. With charge transfer playing a vital role in many biological and chemical systems, the ability of attosecond transient absorption spectroscopy1 to characterize attosecond-scale preparation of electronic coherence and its subsequent evolution over tens of femtoseconds opens a route to discovering and characterizing new mechanisms of chemical reactivity. ■

Olga Smirnova is at the Max-Born-Institut für Nichtlineare Optik und Kurzzeitspektroskopie, D-12489 Berlin, Germany.
e-mail: [email protected]

1. Goulielmakis, E. et al. Nature 466, 739–743 (2010).
2. Brumer, P. & Shapiro, M. Principles of the Quantum Control of Molecular Processes (Wiley, 2003).
3. Breidbach, J. & Cederbaum, L. S. Phys. Rev. Lett. 94, 033901 (2005).
4. Hennig, H., Breidbach, J. & Cederbaum, L. S. J. Phys. Chem. A 109, 409–414 (2005).
5. Remacle, F. & Levine, R. D. Z. Phys. Chem. 221, 647–661 (2007).
6. Weinkauf, R. et al. J. Phys. Chem. A 101, 7702–7710 (1997).

METABOLISM

Malaria parasite stands out

Hagai Ginsburg

One of the hallmarks of cellular biochemistry is the ability to extract energy efficiently from available substrates. The malaria parasite, however, deviates from the norm, and has come up with its own solution.

All living organisms require energy for growth, maintenance and reproduction. At the cellular level, chemical reactions transform energy from one type to another: the energy stored in chemical bonds is turned into ATP, the cell’s energy currency, when certain complex molecules are broken down to simpler ones. One such molecule is glucose, which is broken down through a sequence of pathways that lead to the generation of ATP. In this issue (page 774), Olszewski et al.1 show that the malaria-causing parasite Plasmodium falciparum does not follow the usual route to generate ATP.

After glucose has been taken up by a cell, it is broken down to two molecules of pyruvate through the cytoplasmic process of glycolysis (Fig. 1). In eukaryotes (organisms whose cells contain membrane-bounded organelles), pyruvate then moves into the mitochondria, the cell’s powerhouses, where it is converted to acetyl-CoA and carbon dioxide. There, acetyl-CoA enters the tricarboxylic-acid (TCA) cycle (also called the citric-acid cycle or the Krebs cycle), the central crossroads of intracellular metabolic pathways.

Indeed, the TCA cycle is involved not only in the production of energy, but also in the synthesis and degradation of biomolecules. For its energy-generating activity, acetyl-CoA reacts with oxaloacetate to form citrate. In a series of enzyme-mediated reactions, citrate is then reconverted to oxaloacetate, while two molecules of CO2 are produced, completing the cycle. Along the way, the movement of protons and electrons (provided by the cofactor NADH and H+) across the inner mitochondrial membrane and the electron-transport chain, respectively, drives the multiprotein enzyme complex FOF1-ATP synthase to produce ATP, a process known as oxidative phosphorylation. In total, for each molecule of glucose that is broken down, 36 molecules of ATP are produced (Fig. 1).

Figure 1 | Canonical intracellular metabolic pathways and those of Plasmodium falciparum. A red blood cell infected with a malaria parasite is depicted. The normal tricarboxylic-acid (TCA) cycle is denoted by green arrows, and the branched pathway used by P. falciparum is shown in red (with bifurcation starting at 2-oxoglutarate). The main cellular processes that mediate the supply or consumption of metabolites are depicted in black. The host-cell contributions are in blue. Dashed arrows indicate multiple steps.

When pyruvate cannot be processed through the TCA cycle, ATP production is much less efficient. Under certain conditions (such as oxygen shortage, an incomplete ATP-synthase complex or lack of mitochondria) pyruvate is fermented into lactate or ethanol, yielding just two ATP molecules for each glucose molecule.

The biochemical properties of the enzymes that participate in the various stages of ATP production have long been known, and the overall activity of the TCA cycle has been extensively studied. Moreover, recent progress in metabolomics (the study of small-molecule metabolite profiles produced by distinct cellular processes) has enabled specific details of metabolism to be elucidated. And recent advances in liquid chromatography and mass spectrometry allow the detection and quantification of metabolites in ever-smaller amounts of biological material. Furthermore, when

702 © 2010 Macmillan Publishers Limited. All rights reserved NATURE|Vol 466|5 August 2010 NEWS & VIEWS

substrates containing carbon or nitrogen isotopes are fed to cells and the data are collected over time, both the metabolic kinetics and the relative rates of different pathways can be determined accurately.

Olszewski and colleagues1 use a metabolomic approach to analyse the TCA cycle of blood-stage P. falciparum, the most lethal of the four parasite species that cause malaria in humans. Their Herculean effort seems to have been well rewarded. In agreement with previous suggestions2, they find that — unlike the canonical TCA cycle, which is unidirectional and oxidative — the cycle used by the parasite (which contains the genes encoding all of the known TCA-cycle enzymes) is bifurcated, part of it being oxidative and the other part being reductive (Fig. 1). What’s more, as the parasite cannot convert pyruvate into acetyl-CoA in its mitochondrion3, depriving it of this substrate for the TCA cycle, it instead feeds the amino acids glutamic acid and/or glutamine into the cycle.

The main metabolic roles of the TCA cycle in the malaria parasite seem to be production of succinyl-CoA for haem biosynthesis through the oxidative branch and the synthesis of citrate through the reductive branch. The two branches converge on malate, which, together with citrate, is exported from the mitochondrion, thus driving the process in both directions. It is noteworthy that P. falciparum cannot convert citrate to acetyl-CoA1. This may be carried out by the appropriate enzyme (ATP-dependent citrate lyase) of the red blood cell in which the parasite resides, with the acetyl-CoA that is generated being shuttled back to the parasite.

Two distinct, compartmentalized routes mediate the synthesis and use of acetyl-CoA in the parasite. The mitochondrial source (derived from the exported citrate) is targeted to the nucleus, where it is involved in the acetylation of histone proteins, which determine the spatial organization of DNA and control the processes of DNA replication and transcription. In addition, acetyl-CoA can be synthesized, in a parasitic and alga-derived organelle called the apicoplast, from phosphoenolpyruvate — an intermediate of the glycolytic pathway3. Apicoplast-derived acetyl-CoA is used to synthesize amino sugars, a process that occurs in yet another organelle, the endoplasmic reticulum.

In nature, there is no known precedent either for a non-standard TCA pathway such as that of P. falciparum or for the exclusive use of acetyl-CoA originating from different cellular compartments (and then probably mixing in the cytoplasm) in two other distinct compartments.

Although P. falciparum can be admired for the ingenuity of its altered metabolic architecture, the physiological rationale for this is far from clear. Why, for example, would the parasite relinquish oxidative phosphorylation for lactate production, which is fermentative, given that the former is 18 times more efficient at producing ATP than the latter? The physiological consequences of this choice are costly for the infected host. In individuals with severe malaria, the increase in glucose consumption and lactate production that is driven by the parasites causes life-threatening hypoglycaemia (low levels of blood glucose) and lactic acidosis (low blood and tissue pH)4. One could argue that if the activity of the electron-transport chain were instead enhanced, this would exacerbate the oxidative stress that the parasite already places on its host cell (through digesting the oxygen carrier haemoglobin), thereby disrupting the infected red blood cells before the parasite could mature5 — killing the host would obviously destroy the parasite’s habitat. By contrast, mice infected with Plasmodium berghei or Plasmodium yoelii, two other species of the parasite that cause malaria in rodents, carry out active oxidative phosphorylation6. Another puzzle is why the parasite’s genome contains the genes encoding eight of the subunits of FOF1-ATP synthase but the complex is biochemically inactive in these organisms. It is hoped that future investigations will solve these mysteries.

What’s clear is that Olszewski et al.1 seem to have resolved a long-standing problem by providing a functional answer to why the TCA-cycle enzymes are present in P. falciparum when oxidative phosphorylation is absent. Demonstrating even that the cycle is essential for the parasite remains an obvious challenge. ■

Hagai Ginsburg is in the Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.

1. Olszewski, K. L. et al. Nature 466, 774–778 (2010).
2. Vaidya, A. B. & Mather, M. W. Annu. Rev. Microbiol. 63, 249–267 (2009).
3. Foth, B. J. et al. Mol. Microbiol. 55, 39–53 (2005).
4. Planche, T. & Krishna, S. Curr. Mol. Med. 6, 141–153 (2006).
5. Müller, S. Mol. Microbiol. 53, 1291–1305 (2004).
6. Uyemura, S. A., Luo, S., Moreno, S. N. & Docampo, R. J. Biol. Chem. 275, 9709–9715 (2000).

GENOMICS

Variations in blood lipids
Alan R. Shuldiner and Toni I. Pollin

What is the new gold standard for genome-wide association studies? As exemplified by analyses of blood lipids, it is collaboration to amass huge sample sizes and functional studies of the genes identified.

Cardiovascular disease is a leading cause of death. In the United States, for example, it accounted for 1 in every 2.8 deaths in 2005 (ref. 1). Disruptions in the amounts of blood lipids greatly increase the risk of this disease. On page 707 of this issue, Teslovich et al.2 report on one of the largest meta-analyses of genome-wide association studies so far, involving 46 cohorts and more than 100,000 human subjects. They identify 95 distinct gene variants and/or chromosomal locations — 59 of them new — associated with lipid traits in the blood. What’s more, the authors go a step further to validate the biological relevance of three of the novel genes in mice.

Cells need cholesterol and triglycerides — derived from dietary sources and the liver — for membrane synthesis and energy. These lipids circulate in the blood as part of lipoprotein particles, which are made of various proportions of cholesterol, triglycerides, phospholipids and proteins. Low-density lipoproteins (LDLs), for example, shuttle cholesterol from the liver to other tissues, whereas high-density lipoproteins (HDLs) scavenge cholesterol from blood vessels and other tissues, returning it to the liver.

Much of what we know about lipid metabolism and the treatment of dyslipidaemia (disruptions in blood-lipid levels) comes from the ongoing discovery and characterization of numerous, relatively rare genetic variants with large effects on lipid levels (reviewed in ref. 3). Brown and Goldstein’s landmark studies identifying mutations in the LDL receptor pioneered such work4. More recently, however, researchers have focused on identifying genetic variants that influence the more common causes of increased blood-lipid levels, apparently resulting from the interaction of multiple small genetic effects with environmental and life-style factors such as diet and physical activity3.

An earlier meta-analysis of genome-wide association studies (GWAS) involving more than 8,000 individuals5, for instance, implicated 36 genes and chromosomal loci in common variation in the levels of blood lipids. But to detect smaller effects of genetic variants, even larger sample sizes are needed in GWAS that evaluate the association of lipid levels with millions of single nucleotide polymorphisms (SNPs) distributed throughout the genome. Pooling several GWAS in a larger meta-analysis enables detection of yet smaller effects.

This is exactly the approach Teslovich et al.2 took — a strategy that led to several insights. Many of the variants the authors discovered are in, or near, genes known to mediate lipid metabolism. They include common variants in

genes previously known to harbour rare variants that cause extremely low or extremely high blood-lipid levels; variants identified through ‘candidate-gene’ studies; and variants in genes that are targets of lipid-lowering drugs. Among the remaining variants, many are in, or near, genes with no known function in lipid metabolism. Identifying these genes and elucidating their functions could lead to information about lipid metabolism and, potentially, to new drug targets. Notably, the effects of individual variants on lipid levels tend to be additive, with people carrying more ‘risk’ variants being more likely to have dyslipidaemia than those carrying fewer risk variants.

Teslovich and colleagues’ results also begin to unravel the complex genetic architecture of lipid homeostasis. Some gene variants or loci seem to have gender-specific effects. Moreover, most — but not all — of the variants discovered in this large sample of individuals of European ancestry apparently affect blood-lipid levels in Asians and African Americans too.

The authors find that many of the genetic variants affecting blood-lipid levels also associate with coronary artery disease. This is especially true for variants associated with increased LDL-cholesterol, and to a lesser extent for SNPs associated with decreased HDL-cholesterol or increased triglycerides. These observations support the notion that alterations in blood lipids pave the way for coronary artery disease6.

Despite the outstanding power of this study to detect common variants of very small effect, the 95 loci identified explain only about 10–12% of the total variance in blood-lipid levels, which corresponds to about 25–30% of the genetic variance. Thus, as with other large-scale GWAS for complex diseases and traits, most of the genetic variance remains unexplained7. Nonetheless, the huge sample size and excellent genomic coverage of this work2 hint that additional modest-effect common variants are unlikely to contribute significantly to the ‘missing’ heritability. Further genomic sequencing could lead to the discovery of many rare (large-effect) variants or other types of variants (microdeletions, duplications and inversions) not tagged — or imprecisely tagged — by current genotyping platforms used in GWAS.

A highlight of Teslovich and co-workers’ paper is their analysis of the biological significance of several of the genes and loci they identified, including a systematic evaluation of the effects of the associated SNPs on gene expression in the liver and in fat tissue. Of the three genes they investigated further, one, GALNT2, which encodes a member of the N-acetylgalactosamine-transferase enzyme family, was not previously known to be involved in lipid metabolism. Decreasing the expression of Galnt2 in mouse liver significantly decreased levels of HDL-cholesterol. Similarly, reduced expression of Ttc39b (function unknown) increased HDL-cholesterol levels, and increased expression of Ppp1r3b (which encodes an inhibitory subunit of protein phosphatase 1) decreased HDL-cholesterol levels.

An accompanying paper on page 714 of this issue8 also takes functional validation as its focus. It dissects the biological consequences of genetic variation in a locus on chromosome 1p13 that has been strongly associated, by previous GWAS, with elevated LDL-cholesterol levels in the blood and myocardial infarction in humans. Through studies of human subjects and human-derived liver cells, this study shows that rs12740374 — a common non-protein-coding variant located within this locus — creates a binding site for the transcription factor C/EBP, altering the expression of the SORT1 gene in the liver. In mice, Sort1 alters secretion of very low-density lipoproteins (VLDLs) by liver cells, thus affecting blood levels of LDL-cholesterol and VLDL particles. This work8 is yet another example of how information from GWAS can be used to unravel new regulatory pathways that alter the risk of human disease, in this case myocardial infarction.

Teslovich and colleagues’ analysis leads to many clinically relevant questions. For example, would it be of diagnostic value to include a panel of 95 genetic tests, based on their results, beyond conventional measurements of blood-lipid levels? Which of the gene variants they identify affect other disease-related features of lipoprotein particles such as their size and lipid–protein composition? For the novel genes, what mechanisms underlie their effect? Are any of them drug-responsive targets? This work2 was made possible thanks to the Human Genome Project and an unprecedentedly large collaborative effort among an international multidisciplinary genomics team. Now, scientists interested in translational aspects — disease-related mechanisms and clinical relevance — can roll up their sleeves. ■

Alan R. Shuldiner and Toni I. Pollin are at the Division of Endocrinology, Diabetes and Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA. Alan R. Shuldiner is also at the Geriatric Research and Education Clinical Center, Veterans Administration Medical Center, Baltimore.

1. Rosamond, W. et al. Circulation 117, e25–e146 (2008).
2. Teslovich, M. et al. Nature 466, 707–713 (2010).
3. Hegele, R. A. Nature Rev. Genet. 10, 109–121 (2009).
4. Brown, M. S., Hobbs, H. H. & Goldstein, J. L. in Metabolic and Molecular Bases of Inherited Disease 8th edn (eds Scriver, C. R. et al.) Ch. 120.I (McGraw-Hill, 2000).
5. Willer, C. J. et al. Nature Genet. 40, 161–169 (2008).
6. Davey Smith, G. & Ebrahim, S. Int. J. Epidemiol. 32, 1–22 (2003).
7. Goldstein, D. B. N. Engl. J. Med. 360, 1696–1698 (2009).
8. Musunuru, K. et al. Nature 466, 714–719 (2010).

The authors declare competing financial interests. See online article for details.

INORGANIC CHEMISTRY

Cation o’ nine tails
Polly L. Arnold

The field of actinide chemistry is still young, not least because the radioactivity of these elements makes them difficult to work with. A study now reveals details of how actinide compounds might behave in water.

Reporting in Angewandte Chemie, Apostolidis et al.1 describe how they combined synthesis, spectroscopy and computational modelling to identify, for the first time, a series of complexes in which water molecules bind to ions of the actinide elements — which include some of the heaviest and least-stable elements known. Overcoming the difficulties associated with the intense radioactivity and complex chemistry of these metals, the authors found that one actinide cation binds directly to nine water molecules to form complex ions of the form [An(H₂O)₉]³⁺, where An is an actinide. Gaining such detailed knowledge of hydrated actinide cations is fundamental for understanding how the most radioactive components of nuclear waste behave in water. It will also allow chemists to optimize procedures to separate and extract actinides in the laboratory, as well as to understand and prevent actinide-ion migration in the environment.

The actinide series contains the elements thorium to lawrencium and is usually displayed as the bottom row of the periodic table (Fig. 1a). Together with the lanthanides (the row of elements directly above the actinides in the periodic table), the actinides form a family known as the f-block elements, named after their outermost, incompletely filled electron orbitals, the f orbitals. The f-block elements are often insultingly referred to as the footnotes of the table, but it has to be admitted that the chemistry of the lanthanides in water, at least, is relatively straightforward. All lanthanide ions (generically abbreviated as Ln³⁺) react with simple anions (X⁻, typically the salts of strong acids), generating complexes of the form LnX₃. And that’s pretty much it. Compare this with the rich chemistry of the transition elements, which take part in redox reactions and form all sorts of different complexes.

So why is lanthanide chemistry so much less diverse than transition-metal chemistry? The answer is electronic — the number


of valence (outermost) electrons that can be removed from transition-metal atoms varies, whereas in general only three valence electrons are removable from f-block atoms. The remaining valence electrons sit in multi-lobed orbitals (shaped like flower petals) that do not extend far enough from the atom to overlap the orbitals of the atoms in the ligand, and so do not participate in bonding. To use our flower analogy, the lobes of f orbitals are tiny daisy petals compared with the giant poppy petals of transition-metal valence orbitals. Nevertheless, the ‘hidden’ f-orbital electrons of lanthanides give their ions interesting spectroscopic and magnetic properties.

Figure 1 | The structures of hydrated actinide ions. a, The actinides are the elements thorium to lawrencium. They follow on from actinium (Ac) in the periodic table, but are usually depicted as a separate row at the foot of the table. b, Apostolidis et al.1 have characterized the cations — [An(H₂O)₉]³⁺, where An is an actinide — found in hydrated actinide triflate compounds. The structure shown here is for the californium compound, but all of the members of the series studied by the authors (uranium to curium, and californium) adopt the same symmetrical structure. The complete crystal structure includes triflate counterions, CF₃SO₃⁻, but these have been omitted here for clarity. The actinide cation is depicted in green; the red and grey spheres represent the oxygen and hydrogen atoms, respectively, of the nine bound water molecules.

But what of the actinides, the lanthanides’ radioactive siblings in the f-block family of elements? The actinides have unique properties and, in an increasingly nuclear age, it is crucial to understand their behaviour — especially in water, if we are to gauge the risks of actinide-containing nuclear waste in the environment. In this respect, ‘triflate’ salts of actinides (An³⁺(CF₃SO₃⁻)₃, where CF₃SO₃⁻ is a triflate anion) are useful model compounds. Triflate anions can bind directly to f-block cations, but they are readily displaced by other ligand molecules, such as water, that bind more strongly. For lanthanides, the resultant complexes are known to consist of a central f-block cation surrounded by nine water molecules (known as the primary coordination sphere), to which the triflate ions bind in turn (forming the secondary coordination sphere). These water-stable, water-soluble lanthanide triflates have been used for 20 years as catalysts for a range of organic transformations2. But making and studying the analogous actinide triflates presents a series of challenges.

Apostolidis et al.1 have overcome these challenges in their syntheses of hydrated triflates for the block of actinides from uranium to curium, as well as for californium. The lighter, naturally occurring, actinides (those up to uranium in the periodic table) are only mildly radioactive, which simplifies handling. The main practical problem associated with these elements is that they can take part in unwanted oxidation reactions, and so the authors had to use careful synthetic techniques to avoid oxidation when they made uranium triflate. Similar techniques were also required when the authors made the triflates of the heavier elements neptunium and plutonium.

After uranium on the periodic table are the ‘transuranic’ metals, which are man-made. These elements are at least 100,000-fold more radioactive than uranium, and their half-lives are markedly shorter. Pure samples of transuranic salts and complexes quickly undergo radioactive decay to produce daughter products, and the energy released in the process damages any bound ligands. The authors therefore needed to rapidly synthesize and identify the triflates made from transuranic starting materials, before the samples became too contaminated with decay products. When making californium triflate, Apostolidis et al. chose an isotope of the element (²⁴⁹Cf) with a long enough half-life that they did not have to rush the synthesis. In this case, the main practical challenge was californium’s high level of radioactivity, a problem that the authors overcame by working at the microgram scale.

Perhaps surprisingly, Apostolidis et al. found that the same highly symmetrical cation, in which nine water molecules make up the primary coordination sphere, is formed for all of the actinides studied, despite the changes in cation size that occur throughout the series (Fig. 1b). What’s more, this behaviour mirrors that of the lanthanide ions — a similarity that was not necessarily a given. These findings will help to guide predictions of the further reactions of the actinides.

Triflate anions in the secondary coordination sphere of the actinide complexes1 help to hold the water molecules in fixed positions through a large number of weak hydrogen-bonding interactions, in the same way that water molecules in ice crystals are held together by a symmetrical network of hydrogen bonds. The networks of bonds within the complexes will also have allowed the growth of the high-quality single crystals used by the authors in their X-ray studies. Single-crystal X-ray studies on transuranic compounds are rare, because the crystals are so ‘hot’ that they tend to self-destruct during analysis, owing to internal irradiation. Californium triflate is therefore one of just a handful of transuranic compounds3,4 to be analysed by this method. Of the previously published compounds, one4 has the same nine-fold geometry as Apostolidis and colleagues’ triflates1.

The f-orbital electrons in the triflate complexes undergo transitions that allow detailed spectroscopic and magnetic analysis. This provides fundamental information about the ordering of the f-electron energy levels in the complexes and shows that the early actinide triflates are strongly ionic, and behave remarkably like lanthanides in water. The electronic transitions also give the compounds their beautiful colours, ranging from smoky blues and olive greens to vibrant pinks.

The stunning colours are not, in themselves, enough to encourage more synthetic chemists to study actinide cations. But other factors are certainly promoting a renaissance in uranium chemistry5–7 — including the ready availability of uranium triiodide, which can be used as a precursor to uranium complexes. Replacing triflate anions in lanthanide complexes with anionic groups that bind more strongly to the cation has also been effective for synthesizing new complexes and catalysts8–10. Exciting new actinide chemistry is therefore sure to follow from studies such as those of Apostolidis and colleagues. ■

Polly L. Arnold is in the School of Chemistry, University of Edinburgh, Edinburgh EH9 3JJ, UK.

1. Apostolidis, C. et al. Angew. Chem. Int. Edn doi:10.1002/anie.201001077 (2010).
2. Kobayashi, S. in Aqueous-phase Organometallic Catalysis 2nd edn (eds Cornils, B. & Herrmann, W. A.) Ch. 3, 88–100 (Wiley-VCH, 2004).
3. Laubereau, P. G. & , J. H. Inorg. Chem. 9, 1091–1095 (1970).
4. Sykora, R. E., Assefa, Z., Haire, R. G. & Albrecht-Schmitt, T. E. Inorg. Chem. 45, 475–477 (2005).
5. Avens, L. R. et al. Inorg. Chem. 33, 2248–2256 (1994).
6. Carmichael, C. D., Jones, N. A. & Arnold, P. L. Inorg. Chem. 47, 8577–8579 (2008).
7. Fox, A. R., Bart, S. C., Meyer, K. & Cummins, C. C. Nature 455, 341–349 (2008).
8. Hamidi, M. E. M. & Pascal, J.-L. Polyhedron 13, 1787–1792 (1994).
9. Schuetz, S. A., Day, V. W., Sommer, R. D., Rheingold, A. L. & Belot, J. A. Inorg. Chem. 40, 5292–5295 (2001).
10. Arnold, P. L., Casely, I. J., Zlatogorsky, S. & Wilson, C. Helv. Chim. Acta 92, 2291–2303 (2009).
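
The article's point that shorter half-lives go hand in hand with higher radioactivity can be made concrete with a short calculation. The sketch below compares the specific activity (decays per second per gram) of uranium-238 with that of californium-249, the isotope the article mentions. The half-lives (about 4.468 billion years and 351 years) and the use of mass numbers as approximate molar masses are standard reference values assumed here, not figures taken from the article, and the function name is purely illustrative.

```python
import math

AVOGADRO = 6.02214076e23               # atoms per mole
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def specific_activity_bq_per_g(half_life_years, molar_mass_g_per_mol):
    """Decays per second per gram of a pure isotope: ln(2) * N_A / (T_half * M)."""
    half_life_s = half_life_years * SECONDS_PER_YEAR
    return math.log(2) * AVOGADRO / (half_life_s * molar_mass_g_per_mol)


# Assumed reference half-lives; mass numbers stand in for molar masses.
uranium_238 = specific_activity_bq_per_g(4.468e9, 238.0)
californium_249 = specific_activity_bq_per_g(351.0, 249.0)

ratio = californium_249 / uranium_238
print(f"U-238:  {uranium_238:.3g} Bq/g")
print(f"Cf-249: {californium_249:.3g} Bq/g")
print(f"Cf-249 / U-238: {ratio:.3g}")
```

Under these assumptions the ratio comes out at roughly ten million, which is consistent with the article's description of californium as intensely radioactive even when a comparatively long-lived isotope is chosen, and with the decision to work at the microgram scale.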


OBITUARY

Harry Whittington (1916–2010)
Palaeontologist who revealed the extraordinary animals of the Burgess Shale.

Harry B. Whittington, who died on 20 June aged 94, led research on the fossils of the Burgess Shale in British Columbia, Canada. His team’s studies of these fossils, which date to about 505 million years ago, revolutionized our understanding of the ‘Cambrian Explosion’, the origin of all the major animal body plans. Whittington was also the world’s leading authority on trilobites, diverse marine arthropods of Palaeozoic age — the era between 542 million and 251 million years ago — that fascinate professionals and collectors alike. Whittington continued to publish on them into his nineties.

Whittington grew up in Birmingham, UK, and developed a lifelong interest in Lower Palaeozoic rocks and fossils as a PhD student at Birmingham University. He mapped the geology of the Berwyn Hills in north Wales, determining the age of the rocks using brachiopods and trilobites. In 1938, a Commonwealth Fellowship took him to the Peabody Museum at Yale University in New Haven, Connecticut, where he focused on trilobites, including blind forms called trinucleids, which are characteristic of rocks of Ordovician age (dating to about 488 million to 444 million years ago). Whittington became enthralled by the extra morphological detail afforded by trilobites that have been replaced by silica during fossilization. Silicified specimens do not have to be dug out of the rock; they can be isolated by dissolving limestone in acid — leaving the fossils intact.

Whittington married an American — Dorothy Arnold, who would be his constant companion for more than 50 years — before he left Yale in 1940 and took up a lectureship in Rangoon. The ensuing invasion of the Japanese army prompted a remarkable journey out of Burma (now Myanmar) to China, where Whittington taught at Ginling Women’s College in Chengdu, Sichuan Province, until the Second World War ended in Europe.

He returned to Birmingham as a lecturer in 1945. So began further fieldwork in north Wales on the stratigraphy and fossils of the classic Ordovician rocks around Bala. But Whittington continued to be fascinated by silicified trilobites and the remarkable complexities of the trilobite skeleton that they reveal. In 1947, he spent three months in Washington DC studying examples from the Ordovician rocks of Virginia, and in 1949 seized the chance to work in the United States when he was offered a post at the Museum of Comparative Zoology at Harvard in Cambridge, Massachusetts.

[Photograph of Harry Whittington. Credit: Archives, Museum of Comparative Zoology, Ernst Mayr Library, Harvard Univ.]

During Whittington’s 17 years at Harvard, he became established as the international authority on trilobites. In addition to monographs on faunas from Virginia, Newfoundland and north Wales, he studied the development of trilobites from larva to adult using silicified specimens. In those days — before the advent of scanning electron microscopy — he had to develop photographic techniques to illustrate tiny specimens, sometimes less than 1 millimetre in dimension. Whittington made major contributions on the morphology, biology and evolution of trilobites, including some of the earliest identifications of ancient faunal provinces on the basis of trilobite distributions and the former positions of tectonic plates.

Whittington’s research shifted dramatically in 1966, when he was invited to head a Geological Survey of Canada investigation of the Burgess Shale, including fieldwork in the Rockies. The Burgess Shale had been discovered by Charles Walcott of the Smithsonian Institution in Washington DC in the decade before Whittington was born. The deposit is unusual in preserving a remarkable diversity of soft-bodied creatures, which are normally lost to decay. After Walcott’s preliminary descriptions, the fauna was largely ignored, until the Canadian Survey set out to study the geology of the region and to make a new — Canadian — collection from the nation’s most famous fossil locality (not least because Walcott’s enormous collection is at the Smithsonian).

As he started the Burgess Shale project, Whittington moved from Cambridge, Massachusetts, to the University of Cambridge, UK. Arthropods were the most diverse group of animals in the Cambrian, as they are today, and it was no surprise that his first target was Marrella, the most common arthropod in the fauna. Whittington selected the most informative specimens from among thousands collected by the Geological Survey of Canada and previously by Walcott. He and his team achieved remarkable results in their studies of Burgess Shale fossils by the painstaking application of traditional methods: a modified dental drill for removing the matrix that concealed parts of specimens, a camera lucida attached to a binocular microscope for preparing drawings, and various photographic techniques, including the use of ultraviolet light. Examining specimens preserved in different attitudes helped in restoring the three-dimensional appearance of these flattened fossils. Whittington expected his students to work independently, but he was supportive, tolerant, kind and generous — an avuncular figure to many of them, including myself.

The Burgess Shale provides a much more complete picture of Cambrian life than the fossil record of shells alone. Although Walcott identified the Burgess Shale animals as early examples of modern groups, Whittington found it difficult to place some of the more unusual forms, such as Opabinia and Anomalocaris, in living taxa. When he presented his preliminary restoration of Opabinia — with its anterior proboscis, five eyes, flap-like appendages and rudder-like tail — at a conference in 1972, the audience laughed. Such a reaction is unthinkable today: we have become used to the oddities thrown up by the Cambrian radiation, and now know that Opabinia is an early offshoot of the line leading to the modern arthropods. More recently, exceptionally well preserved Cambrian animals have turned up in other parts of the world, notably at Chengjiang in China and Sirius Passet in Greenland. But it was the Burgess Shale that pushed the creatures of the Cambrian into the limelight.

Ironically, it was not Whittington’s 1985 book The Burgess Shale that generated the excitement (Whittington was a remarkably modest individual), but Stephen Jay Gould’s 1989 laudatory best-seller Wonderful Life. Gould dubbed some of the more unusual Burgess Shale creatures “weird wonders”. However, it was Whittington who had discovered the ‘weirdness’. His major contribution was in confirming the explosive nature of the Cambrian radiation and establishing a platform for interpreting the early evolution of the major invertebrate groups — now a central concern of biologists who are delving into evolutionary development and gene sequences to resolve the tree of life.

Derek E. G. Briggs

Derek E. G. Briggs is in the Department of Geology and Geophysics, and the Peabody Museum of Natural History, Yale University, New Haven, Connecticut 06520, USA.

ARTICLES
Vol 466 | 5 August 2010 | doi:10.1038/nature09270

Biological, clinical and population relevance of 95 loci for blood lipids

A list of authors and their affiliations appears at the end of the paper.

Plasma concentrations of total cholesterol, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides are among the most important risk factors for coronary artery disease (CAD) and are targets for therapeutic intervention. We screened the genome for common variants associated with plasma lipids in >100,000 individuals of European ancestry. Here we report 95 significantly associated loci (P < 5 × 10⁻⁸), with 59 showing genome-wide significant association with lipid traits for the first time. The newly reported associations include single nucleotide polymorphisms (SNPs) near known lipid regulators (for example, CYP7A1, NPC1L1 and SCARB1) as well as in scores of loci not previously implicated in lipoprotein metabolism. The 95 loci contribute not only to normal variation in lipid traits but also to extreme lipid phenotypes and have an impact on lipid traits in three non-European populations (East Asians, South Asians and African Americans). Our results identify several novel loci associated with plasma lipids that are also associated with CAD. Finally, we validated three of the novel genes—GALNT2, PPP1R3B and TTC39B—with experiments in mouse models. Taken together, our findings provide the foundation to develop a broader biological understanding of lipoprotein metabolism and to identify new therapeutic opportunities for the prevention of CAD.
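
The main text states that per-SNP evidence was combined across studies with a fixed-effects meta-analysis and judged against a genome-wide threshold of 5 × 10⁻⁸. The sketch below illustrates one standard fixed-effects scheme, inverse-variance weighting; the source does not say in this passage which weighting was used, so that choice is an assumption, as are the function name and the two hypothetical per-study effect sizes.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the abstract


def fixed_effects_meta(betas, std_errors):
    """Inverse-variance-weighted fixed-effects combination of per-study estimates.

    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    if len(betas) != len(std_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in std_errors]  # precision of each study
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


# Hypothetical effect sizes for one SNP in two studies (illustrative values only).
study_betas = [0.10, 0.10]
study_ses = [0.02, 0.02]

_, _, p_single = fixed_effects_meta(study_betas[:1], study_ses[:1])
pooled_beta, pooled_se, p_pooled = fixed_effects_meta(study_betas, study_ses)

print("single study significant:", p_single < GENOME_WIDE_ALPHA)
print("pooled estimate significant:", p_pooled < GENOME_WIDE_ALPHA)
```

With these made-up numbers, one study alone falls short of the threshold while the pooled estimate passes it, which is the motivation for combining many cohorts to detect variants of small effect.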

Plasma concentrations of total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C) and triglycerides (TG) are heritable risk factors for cardiovascular disease and targets for therapeutic intervention¹. Genome-wide association studies (GWASs) involving up to 20,000 individuals of European ancestry have identified >30 genetic loci contributing to inter-individual variation in plasma lipid concentrations²–¹⁰. Half of these loci harboured genes previously known to influence plasma lipid concentrations, establishing the technical validity of the lipid GWAS. Nevertheless, the practical value of the GWAS approach remains a subject of debate¹¹–¹⁴.

Here we focus on three key questions motivated by recent progress in genetic mapping: (1) are loci identified in populations of European descent important in non-European groups, suggesting relevance in different global populations; (2) are these loci of clinical relevance, providing the framework to identify potential novel drug targets for the treatment of extreme lipid phenotypes and prevention of CAD; and (3) do these loci harbour genes with biological relevance, that is, which are directly involved in lipid regulation and metabolism?

We address these questions using several approaches: a genome-wide association screen for plasma lipids in >100,000 individuals of European ancestry; evaluation of mapped variants in East Asians, South Asians and African Americans; association testing in individuals with and without CAD; evaluation of genetic variants in patients with extreme plasma lipid concentrations; and genetic manipulation in mouse models.

GWAS in >100,000 individuals

To identify additional common variants associated with plasma TC, LDL-C, HDL-C and TG concentrations, we performed a meta-analysis of 46 lipid GWASs (Supplementary Tables 1–4). These studies together comprise >100,000 individuals of European descent (maximum sample size 100,184 for TC, 95,454 for LDL-C, 99,900 for HDL-C and 96,598 for TG), ascertained in the United States, Europe or Australia. In each study, we used genotyped single nucleotide polymorphisms (SNPs) and phased chromosomes from the HapMap CEU (Utah residents with ancestry from northern and western Europe) sample to impute autosomal SNPs catalogued in the HapMap; SNPs with minor allele frequency (MAF) >1% and good imputation quality (see Methods in Supplementary Information) were analysed. A total of ~2.6 million directly genotyped or imputed SNPs were tested for association with each of the four lipid traits in each study. For each SNP, evidence of association was combined across studies using a fixed-effects meta-analysis.

We identified 95 loci that showed genome-wide significant association (P < 5 × 10⁻⁸) with at least one of the four traits tested (Table 1; Supplementary Fig. 1 and Supplementary Table 2). These include all of the 36 loci previously reported by GWAS at genome-wide significance²–¹⁰ and 59 loci reported here in a GWAS for the first time. Among these 59 novel loci, 39 demonstrated genome-wide significant association with TC, 22 with LDL-C, 31 with HDL-C, and 16 with TG. Among the 36 known loci, 21 demonstrated genome-wide significant association with another lipid phenotype in addition to that previously described. To rule out spurious associations arising as a result of imputation artefact, at nearly all loci we were able to identify proxy SNPs that had been directly genotyped on Illumina and/or Affymetrix arrays and confirm each of the associations (Supplementary Table 5). The full association results for each of the four traits are available at http://www.broadinstitute.org/mpg/pubs/lipids2010/ or http://www.sph.umich.edu/csg/abecasis/public/lipids2010.

To evaluate whether additional independent association signals existed at each locus, we performed conditional association analyses for each of the four lipid traits including genotypes at the lead SNPs for each of the 95 loci as covariates in the association analyses (see Supplementary Methods). These analyses identified secondary signals in 26 loci (Supplementary Table 6); when these additional SNPs are combined with the lead SNPs, the total set of mapped variants explains 12.4% (TC), 12.2% (LDL-C), 12.1% (HDL-C), and 9.6% (TG) of the total variance in each lipid trait in the Framingham Heart Study, corresponding to ~25–30% of the genetic variance for each trait.
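The per-SNP combination of evidence across studies can be illustrated with a minimal sketch. It assumes inverse-variance weighting of per-study effect estimates, which is one common form of fixed-effects meta-analysis; the weighting actually used is defined in the Supplementary Methods and may differ. The function name `fixed_effects_meta` and the per-study numbers are hypothetical; only the genome-wide threshold of 5 × 10⁻⁸ is taken from the text.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the text


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects estimate for one SNP.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled beta, pooled SE, z score, two-sided P value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-study estimates (mg/dl per minor allele) for a single SNP.
study_betas = [-1.0, -1.4, -1.2]
study_ses = [0.30, 0.40, 0.35]
beta, se, z, p = fixed_effects_meta(study_betas, study_ses)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.2e}")
print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
```

Under this weighting, larger and more precise studies dominate the pooled estimate, and the pooled standard error shrinks as studies are added, which is why small per-allele effects become detectable at this sample size.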

Table 1 | Meta-analysis of plasma lipid concentrations in >100,000 individuals of European descent.

| Locus | Chr | Lead SNP | Lead trait | Other traits | Alleles/MAF | Effect size | P | eQTL/CAD | Ethnic |
|---|---|---|---|---|---|---|---|---|---|
| LDLRAP1 | 1 | rs12027135 | TC | LDL | T/A/0.45 | −1.22 | 4 × 10⁻¹¹ | Y | +++? |
| PABPC4 | 1 | rs4660293 | HDL | | A/G/0.23 | −0.48 | 4 × 10⁻¹⁰ | Y | ++++ |
| PCSK9 | 1 | rs2479409 | LDL | TC | A/G/0.30 | +2.01 | 2 × 10⁻²⁸ | | ++++ |
| ANGPTL3 | 1 | rs2131925 | TG | TC, LDL | T/G/0.32 | −4.94 | 9 × 10⁻⁴³ | Y | ++++ |
| EVI5 | 1 | rs7515577 | TC | | A/C/0.21 | −1.18 | 3 × 10⁻⁸ | | +++? |
| SORT1 | 1 | rs629301 | LDL | TC | T/G/0.22 | −5.65 | 1 × 10⁻¹⁷⁰ | YY | ++++ |
| ZNF648 | 1 | rs1689800 | HDL | | A/G/0.35 | −0.47 | 3 × 10⁻¹⁰ | | +++− |
| MOSC1 | 1 | rs2642442 | TC | LDL | T/C/0.32 | −1.39 | 6 × 10⁻¹³ | | +++? |
| GALNT2 | 1 | rs4846914 | HDL | TG | A/G/0.40 | −0.61 | 4 × 10⁻²¹ | | ++++ |
| IRF2BP2 | 1 | rs514230 | TC | LDL | T/A/0.48 | −1.36 | 5 × 10⁻¹⁴ | | +++? |
| APOB | 2 | rs1367117 | LDL | TC | G/A/0.30 | +4.05 | 4 × 10⁻¹¹⁴ | | ++++ |
| | | rs1042034 | TG | HDL | T/C/0.22 | −5.99 | 1 × 10⁻⁴⁵ | | +−++ |
| GCKR | 2 | rs1260326 | TG | TC | C/T/0.41 | +8.76 | 6 × 10⁻¹³³ | Y | ++++ |
| ABCG5/8 | 2 | rs4299376 | LDL | TC | T/G/0.30 | +2.75 | 2 × 10⁻⁴⁷ | | ++++ |
| RAB3GAP1 | 2 | rs7570971 | TC | | C/A/0.34 | +1.25 | 2 × 10⁻⁸ | | +−?? |
| COBLL1 | 2 | rs10195252 | TG | | T/C/0.40 | −2.01 | 2 × 10⁻¹⁰ | Y | ++++ |
| | | rs12328675 | HDL | | T/C/0.13 | +0.68 | 3 × 10⁻¹⁰ | | ++?+ |
| IRS1 | 2 | rs2972146 | HDL | TG | T/G/0.37 | +0.46 | 3 × 10⁻⁹ | YY | ++++ |
| RAF1 | 3 | rs2290159 | TC | | G/C/0.22 | −1.42 | 4 × 10⁻⁹ | | +++? |
| MSL2L1 | 3 | rs645040 | TG | | T/G/0.22 | −2.22 | 3 × 10⁻⁸ | | ++−+ |
| KLHL8 | 4 | rs442177 | TG | | T/G/0.41 | −2.25 | 9 × 10⁻¹² | | ++++ |
| SLC39A8 | 4 | rs13107325 | HDL | | C/T/0.07 | −0.84 | 7 × 10⁻¹¹ | Y | +−?− |
| ARL15 | 5 | rs6450176 | HDL | | G/A/0.26 | −0.49 | 5 × 10⁻⁸ | | −??+ |
| MAP3K1 | 5 | rs9686661 | TG | | C/T/0.20 | +2.57 | 1 × 10⁻¹⁰ | | ++++ |
| HMGCR | 5 | rs12916 | TC | LDL | T/C/0.39 | +2.84 | 9 × 10⁻⁴⁷ | | +++? |
| TIMD4 | 5 | rs6882076 | TC | LDL, TG | C/T/0.35 | −1.98 | 7 × 10⁻²⁸ | | +++? |
| MYLIP | 6 | rs3757354 | LDL | TC | C/T/0.22 | −1.43 | 1 × 10⁻¹¹ | | +−−+ |
| HFE | 6 | rs1800562 | LDL | TC | G/A/0.06 | −2.22 | 6 × 10⁻¹⁰ | | ++?+ |
| HLA | 6 | rs3177928 | TC | LDL | G/A/0.16 | +2.31 | 4 × 10⁻¹⁹ | Y | +++? |
| | | rs2247056 | TG | | C/T/0.25 | −2.99 | 2 × 10⁻¹⁵ | | +++− |
| C6orf106 | 6 | rs2814944 | HDL | | G/A/0.16 | −0.49 | 4 × 10⁻⁹ | Y | +++− |
| | | rs2814982 | TC | | C/T/0.11 | −1.86 | 5 × 10⁻¹¹ | Y | −−+? |
| FRK | 6 | rs9488822 | TC | LDL | A/T/0.35 | −1.18 | 2 × 10⁻¹⁰ | Y | +++? |
| CITED2 | 6 | rs605066 | HDL | | T/C/0.42 | −0.39 | 3 × 10⁻⁸ | | ++−+ |
| LPA | 6 | rs1564348 | LDL | TC | T/C/0.17 | −0.56 | 2 × 10⁻¹⁷ | Y | ++?+ |
| | | rs1084651 | HDL | | G/A/0.16 | +1.95 | 3 × 10⁻⁸ | | ++?+ |
| DNAH11 | 7 | rs12670798 | TC | LDL | T/C/0.23 | +1.43 | 9 × 10⁻¹⁰ | | +++? |
| NPC1L1 | 7 | rs2072183 | TC | LDL | G/C/0.25 | +2.01 | 3 × 10⁻¹¹ | | +−+? |
| TYW1B | 7 | rs13238203 | TG | | C/T/0.04 | −7.91 | 1 × 10⁻⁹ | | +??? |
| MLXIPL | 7 | rs17145738 | TG | HDL | C/T/0.12 | −9.32 | 6 × 10⁻⁵⁸ | Y | ++++ |
| KLF14 | 7 | rs4731702 | HDL | | C/T/0.48 | +0.59 | 1 × 10⁻¹⁵ | Y | ++++ |
| PPP1R3B | 8 | rs9987289 | HDL | TC, LDL | G/A/0.09 | −1.21 | 6 × 10⁻²⁵ | Y | ++++ |
| PINX1 | 8 | rs11776767 | TG | | G/C/0.37 | +2.01 | 1 × 10⁻⁸ | | −+++ |
| NAT2 | 8 | rs1495741 | TG | TC | A/G/0.22 | +2.85 | 5 × 10⁻¹⁴ | Y | −+++ |
| LPL | 8 | rs12678919 | TG | HDL | A/G/0.12 | −13.64 | 2 × 10⁻¹¹⁵ | Y | ++++ |
| CYP7A1 | 8 | rs2081687 | TC | LDL | C/T/0.35 | +1.23 | 2 × 10⁻¹² | | +++? |
| TRPS1 | 8 | rs2293889 | HDL | | G/T/0.41 | −0.44 | 6 × 10⁻¹¹ | | ++++ |
| | | rs2737229 | TC | | A/C/0.30 | −1.11 | 2 × 10⁻⁸ | | ++−? |
| TRIB1 | 8 | rs2954029 | TG | TC, LDL, HDL | A/T/0.47 | −5.64 | 3 × 10⁻⁵⁵ | Y | ++++ |
| PLEC1 | 8 | rs11136341 | LDL | TC | A/G/0.40 | +1.40 | 4 × 10⁻¹³ | | ++++ |
| TTC39B | 9 | rs581080 | HDL | TC | C/G/0.18 | −0.65 | 3 × 10⁻¹² | | +−++ |

Previous studies have suggested sex-specific heritability of lipid traits¹⁵. A key challenge in addressing this issue is evaluating enough men and women to achieve adequate statistical power for each sex. We re-analysed the GWAS for the four lipid traits separately in women (n = 63,274) and in men (n = 38,514). Four of the 95 loci identified in the primary analysis showed significant heterogeneity of effect size (P < 0.0005) between men and women (Supplementary Table 7). Moreover, an additional five loci had significant association in only one sex and not in the sex-combined analysis. Two loci associated with HDL-C in the sex-combined analysis (KLF14 and ABCA8) showed female-specific association with TG and LDL-C, respectively. The KLF14 locus is a striking example, with rs1562398 significantly associated with TG in women (effect size = −0.046 for the C allele, P = 2 × 10⁻¹²), but not in men (effect size = −0.012, P = 0.05) (Supplementary Fig. 2 and Supplementary Table 7).

To gain insight into how DNA variants in associated loci might influence plasma lipid concentrations, we tested whether the mapped DNA sequence variants regulate the expression levels of nearby genes (expression quantitative trait loci, or eQTLs) in human tissues relevant to lipoprotein metabolism (liver and fat)¹⁶. We carried out genotyping and RNA expression profiling of >39,000 transcripts in three types of human tissue samples from liver (960 samples), omental fat (741 samples) and subcutaneous fat (609 samples). We examined the correlations between each of the lead SNPs at the 95 loci and the expression levels of transcripts located within 500 kilobases of the SNP. We pre-specified a conservative threshold of statistical significance at P < 5 × 10⁻⁸. At this threshold, we identified 38 SNP-to-gene eQTLs in liver, 28 in omental fat, and 19 in subcutaneous fat (Table 1; Supplementary Tables 8–10). Some lead SNPs are quite remote from the associated gene transcripts. For example, rs9987289 (associated with both LDL-C and HDL-C) correlates with a twofold change in liver expression of PPP1R3B, yet is 174 kb away from the gene, which as demonstrated below is likely to be a causal gene. Similarly, rs2972146 (associated with both HDL-C and TG in this study, as well as with insulin resistance and type 2 diabetes mellitus in a previous study¹⁷) correlates with IRS1 expression in omental fat, despite being located 495 kb away from the gene.

Relevance of GWAS loci in non-Europeans

As all of the individuals studied in our primary GWAS were of European ancestry, it remained unclear if the loci we identified in Europeans are relevant in non-European individuals. To address this

Table 1 | Continued.

| Locus | Chr | Lead SNP | Lead trait | Other traits | Alleles/MAF | Effect size | P | eQTL/CAD | Ethnic |
|---|---|---|---|---|---|---|---|---|---|
| ABCA1 | 9 | rs1883025 | HDL | TC | C/T/0.25 | −0.94 | 2 × 10⁻³³ | | ++++ |
| ABO | 9 | rs9411489 | LDL | TC | C/T/0.20 | +2.24 | 6 × 10⁻¹³ | Y | ???? |
| JMJD1C | 10 | rs10761731 | TG | | A/T/0.43 | −2.38 | 3 × 10⁻¹² | | ++++ |
| CYP26A1 | 10 | rs2068888 | TG | | G/A/0.46 | −2.28 | 2 × 10⁻⁸ | | ++++ |
| GPAM | 10 | rs2255141 | TC | LDL | G/A/0.30 | +1.14 | 2 × 10⁻¹⁰ | | +++? |
| AMPD3 | 11 | rs2923084 | HDL | | A/G/0.17 | −0.41 | 5 × 10⁻⁸ | | ++−+ |
| SPTY2D1 | 11 | rs10128711 | TC | | C/T/0.28 | −1.04 | 3 × 10⁻⁸ | Y | +−+? |
| LRP4 | 11 | rs3136441 | HDL | | T/C/0.15 | +0.78 | 3 × 10⁻¹⁸ | Y | +++? |
| FADS1-2-3 | 11 | rs174546 | TG | HDL, TC, LDL | C/T/0.34 | +3.82 | 5 × 10⁻²⁴ | Y | ++++ |
| APOA1 | 11 | rs964184 | TG | TC, HDL, LDL | C/G/0.13 | +16.95 | 7 × 10⁻²⁴⁰ | Y | ++++ |
| UBASH3B | 11 | rs7941030 | TC | HDL | T/C/0.38 | +0.97 | 2 × 10⁻¹⁰ | | +++? |
| ST3GAL4 | 11 | rs11220462 | LDL | TC | G/A/0.14 | +1.95 | 1 × 10⁻¹⁵ | Y | ++++ |
| PDE3A | 12 | rs7134375 | HDL | | C/A/0.42 | +0.40 | 4 × 10⁻⁸ | | ++++ |
| LRP1 | 12 | rs11613352 | TG | HDL | C/T/0.23 | −2.70 | 4 × 10⁻¹⁰ | | ++?+ |
| MVK | 12 | rs7134594 | HDL | | T/C/0.47 | −0.44 | 7 × 10⁻¹⁵ | Y | ++?+ |
| BRAP | 12 | rs11065987 | TC | LDL | A/G/0.42 | −0.96 | 7 × 10⁻¹² | | ++?? |
| HNF1A | 12 | rs1169288 | TC | LDL | A/C/0.33 | +1.42 | 1 × 10⁻¹⁴ | Y | +++? |
| SBNO1 | 12 | rs4759375 | HDL | | C/T/0.06 | +0.86 | 7 × 10⁻⁹ | | +??+ |
| ZNF664 | 12 | rs4765127 | HDL | TG | G/T/0.34 | +0.44 | 3 × 10⁻¹⁰ | | −+−+ |
| SCARB1 | 12 | rs838880 | HDL | | T/C/0.31 | +0.61 | 3 × 10⁻¹⁴ | | ++−? |
| NYNRIN | 14 | rs8017377 | LDL | | G/A/0.47 | +1.14 | 5 × 10⁻¹¹ | | +−++ |
| CAPN3 | 15 | rs2412710 | TG | | G/A/0.02 | +7.00 | 2 × 10⁻⁸ | | +??− |
| FRMD5 | 15 | rs2929282 | TG | | A/T/0.05 | +5.13 | 2 × 10⁻¹¹ | Y | +−−− |
| LIPC | 15 | rs1532085 | HDL | TC, TG | G/A/0.39 | +1.45 | 3 × 10⁻⁹⁶ | Y | ++++ |
| LACTB | 15 | rs2652834 | HDL | | G/A/0.20 | −0.39 | 9 × 10⁻⁹ | Y | +??? |
| CTF1 | 16 | rs11649653 | TG | | C/G/0.40 | −2.13 | 3 × 10⁻⁸ | Y | +??− |
| CETP | 16 | rs3764261 | HDL | TC, LDL, TG | C/A/0.32 | +3.39 | 7 × 10⁻³⁸⁰ | | ++++ |
| LCAT | 16 | rs16942887 | HDL | | G/A/0.12 | +1.27 | 8 × 10⁻³³ | Y | ++++ |
| HPR | 16 | rs2000999 | TC | LDL | G/A/0.20 | +2.34 | 3 × 10⁻²⁴ | | +++? |
| CMIP | 16 | rs2925979 | HDL | | C/T/0.30 | −0.45 | 2 × 10⁻¹¹ | | ++++ |
| STARD3 | 17 | rs11869286 | HDL | | C/G/0.34 | −0.48 | 1 × 10⁻¹³ | Y | ++++ |
| OSBPL7 | 17 | rs7206971 | LDL | TC | G/A/0.49 | +0.78 | 2 × 10⁻⁸ | Y | ++−+ |
| ABCA8 | 17 | rs4148008 | HDL | | C/G/0.32 | −0.42 | 2 × 10⁻¹⁰ | | ++++ |
| PGS1 | 17 | rs4129767 | HDL | | A/G/0.49 | −0.39 | 8 × 10⁻⁹ | | ++++ |
| LIPG | 18 | rs7241918 | HDL | TC | T/G/0.17 | −1.31 | 3 × 10⁻⁴⁹ | Y | ++++ |
| MC4R | 18 | rs12967135 | HDL | | G/A/0.23 | −0.42 | 7 × 10⁻⁹ | | ++++ |
| ANGPTL4 | 19 | rs7255436 | HDL | | A/C/0.47 | −0.45 | 3 × 10⁻⁸ | Y | ++++ |
| LDLR | 19 | rs6511720 | LDL | TC | G/T/0.11 | −6.99 | 4 × 10⁻¹¹⁷ | Y | ++?+ |
| LOC55908 | 19 | rs737337 | HDL | | T/C/0.08 | −0.64 | 3 × 10⁻⁹ | | ++++ |
| CILP2 | 19 | rs10401969 | TC | TG, LDL | T/C/0.07 | −4.74 | 3 × 10⁻³⁸ | Y | +++? |
| APOE | 19 | rs4420638 | LDL | TC, HDL | A/G/0.17 | +7.14 | 9 × 10⁻¹⁴⁷ | Y | ++++ |
| | | rs439401 | TG | | C/T/0.36 | −5.50 | 1 × 10⁻³⁰ | Y | +++? |
| FLJ36070 | 19 | rs492602 | TC | | A/G/0.49 | +1.27 | 2 × 10⁻¹⁰ | | +−+? |
| LILRA3 | 19 | rs386000 | HDL | | G/C/0.20 | +0.83 | 4 × 10⁻¹⁶ | Y | +−+− |
| ERGIC3 | 20 | rs2277862 | TC | | C/T/0.15 | −1.19 | 4 × 10⁻¹⁰ | Y | +++? |
| MAFB | 20 | rs2902940 | TC | LDL | A/G/0.29 | −1.38 | 6 × 10⁻¹¹ | | −−+? |
| TOP1 | 20 | rs6029526 | LDL | TC | T/A/0.47 | +1.39 | 4 × 10⁻¹⁹ | Y | ++++ |
| HNF4A | 20 | rs1800961 | HDL | TC | C/T/0.03 | −1.88 | 1 × 10⁻¹⁵ | | +++− |
| PLTP | 20 | rs6065906 | HDL | TG | T/C/0.18 | −0.93 | 2 × 10⁻²² | | +−++ |
| UBE2L3 | 22 | rs181362 | HDL | | C/T/0.20 | −0.46 | 1 × 10⁻⁸ | Y | ++++ |
| PLA2G6 | 22 | rs5756931 | TG | | T/C/0.40 | −1.54 | 4 × 10⁻⁸ | | +??+ |

The gene name listed in ‘Locus’ column is either a plausible biological candidate gene in the locus or the nearest annotated gene to the lead SNP. Listed in ‘Lead trait’ column is the lipid trait with best P-value among all four traits. Listed in ‘Other traits’ are additional lipid traits with P < 5 × 10⁻⁸. Listed in ‘Alleles/MAF’ column are: major allele, minor allele and minor allele frequency (MAF) within the combined cohorts included in this meta-analysis (alleles designated with respect to the ‘+’ strand; Supplementary Table 2). Numbers in ‘Effect size’ column are in mg dl⁻¹ for the lead trait, modelled as an additive effect of the minor allele. P-values are listed for the lead traits. In the ‘eQTL’ column, ‘Y’ indicates that the lead SNP has an eQTL with at least one gene within 500 kb with P < 5 × 10⁻⁸ in at least one of the three tissues tested (liver, omental fat, subcutaneous fat). In the ‘CAD’ column, ‘Y’ indicates that the lead SNP meets the pre-specified statistical significance threshold of P < 0.001 for association with CAD and is concordant between the direction of lipid effect and the change in CAD risk. In the ‘Ethnic’ column, ‘+’ indicates concordant effect on lead trait of the variant between the primary meta-analysis cohort and the European or non-European group, ‘−’ indicates discordant effect on lead trait, and ‘?’ indicates data not available for the group; in order, the ethnic groups are European, East Asian, South Asian and African American (Supplementary Table 11). Chr, chromosome.
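Because the ‘Effect size’ column is modelled as an additive effect of the minor allele, the expected shift in a trait for a given genotype can be sketched as a sum of effect sizes weighted by minor-allele counts. The example below uses four LDL-C lead SNPs and their effect sizes from Table 1; the genotype, the function name `additive_shift` and the dictionary layout are hypothetical, and the sketch ignores secondary signals, covariates and any interaction between loci.

```python
# Effect sizes in mg/dl per copy of the minor allele, for LDL-C lead SNPs in Table 1.
LDL_EFFECTS = {
    "rs629301": -5.65,   # SORT1
    "rs1367117": 4.05,   # APOB
    "rs6511720": -6.99,  # LDLR
    "rs2479409": 2.01,   # PCSK9
}


def additive_shift(minor_allele_counts, effects=LDL_EFFECTS):
    """Sum per-allele effects over SNPs, weighting each by its minor-allele count (0, 1 or 2)."""
    total = 0.0
    for snp, count in minor_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: minor-allele count must be 0, 1 or 2")
        total += effects[snp] * count
    return total


# Hypothetical individual: one SORT1 minor allele, two APOB, none at LDLR, one PCSK9.
genotype = {"rs629301": 1, "rs1367117": 2, "rs6511720": 0, "rs2479409": 1}
print(round(additive_shift(genotype), 2))  # 4.46 (mg/dl relative to carrying no minor alleles)
```

The same weighted-count idea underlies allelic risk scores, although a score built to separate cases from controls would typically count trait-raising alleles rather than minor alleles.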

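Directional concordance of the kind coded in the ‘Ethnic’ column can be summarised with a sign test: if replication directions were random, each SNP would agree with the primary analysis with probability 0.5. The sketch below assumes a one-sided exact binomial test, which gives values of the same order as those reported in the text for the European replication group (for example, 35 of 36 LDL-C SNPs, reported P = 5 × 10⁻¹⁰). Whether this is exactly the test the authors used is an assumption, and the function name `sign_test_p` is hypothetical.

```python
from math import comb


def sign_test_p(concordant, tested):
    """One-sided exact binomial P value: P(X >= concordant) with X ~ Binomial(tested, 0.5)."""
    if not 0 <= concordant <= tested:
        raise ValueError("concordant must lie between 0 and tested")
    upper_tail = sum(comb(tested, k) for k in range(concordant, tested + 1))
    return upper_tail / 2 ** tested


# Counts quoted in the text for the European replication group.
for trait, concordant, tested in [("LDL-C", 35, 36), ("HDL-C", 44, 47), ("TG", 29, 32)]:
    print(f"{trait}: {concordant}/{tested} concordant, P = {sign_test_p(concordant, tested):.1e}")
```

A test of this kind uses only the direction of each effect, so it stays informative even when a replication cohort is too small for individual SNPs to reach significance.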
question, we performed additional analyses in cohorts comprising >15,000 East Asians (Chinese, Koreans and Filipinos), >9,000 South Asians and >8,000 African Americans (Table 1; Supplementary Table 11). As a similarly sized control, we also performed genotyping in a cohort of 7,000 additional Europeans.

In the European group, we found that 35 of 36 lead SNPs tested against LDL-C had the same direction of association as seen in the primary (>100,000 person) analysis (see Supplementary Table 12 for explanation), 44 of 47 SNPs for HDL-C, and 29 of 32 SNPs for TG. Such directional consistency for the three traits is unlikely to be due to chance (P = 5 × 10⁻¹⁰ for LDL-C, P = 1 × 10⁻¹⁰ for HDL-C, and P = 1 × 10⁻⁶ for TG). For further replication evidence, we performed direct genotyping of a subset of the lead SNPs in two European cohorts together totalling 12,000 individuals and found that 24 of 26 tested SNPs had the same direction of association (Supplementary Table 13).

We observed similar proportions in South Asians, with 29 of 32 lead SNPs tested against LDL-C having the same direction of association as in the primary analysis (P = 1 × 10⁻⁶), 35 of 39 SNPs for HDL-C (P = 2 × 10⁻⁷), and 24 of 27 SNPs for TG (P = 3 × 10⁻⁵). We also had consistent results with East Asians (LDL-C: 29 of 36, P = 2 × 10⁻⁴; HDL-C: 38 of 44, P = 5 × 10⁻⁷; TG: 26 of 28, P = 2 × 10⁻⁶), with more modest evidence for replication in African Americans (LDL-C: 33 of 36, P = 1 × 10⁻⁷; HDL-C: 37 of 44, P = 3 × 10⁻⁶; TG: 24 of 30, P = 7 × 10⁻⁴). Furthermore, we found that the proportions of SNPs that had the same direction of association and P < 0.05 were similar in the European, South Asian and East Asian replication groups, with smaller proportions in

African Americans (Supplementary Table 12). Of note, for a majority of the loci, there was no evidence of heterogeneity of effects between the primary European groups and each of the non-European groups (Supplementary Table 11).

These observations indicate that most (but probably not all) of the 95 lipid loci identified in this study contribute to the genetic architecture of lipid traits widely across global populations. They also suggest future studies to localize causal DNA variants by leveraging differences in linkage disequilibrium (LD) patterns among populations. We evaluated the potential for fine mapping by comparing the number of SNPs in LD with lead SNPs in three HapMap populations (Supplementary Table 14). At many loci, only a subset of SNPs in high LD (r² ≥ 0.8) with the lead SNP in HapMap CEU are also in high LD with the lead SNP in HapMap YRI (Yoruba in Ibadan, Nigeria) individuals or in the joint JPT+CHB (Japanese in Tokyo, Japan and Han Chinese in Beijing, China) cohort. Such differential LD patterns can prove useful to refine association boundaries and prioritize SNPs for functional evaluation, as demonstrated for the LDL-C-associated locus on chromosome 1p13 (reported in the accompanying paper ref. 18).

Clinical relevance of GWAS loci

To assess whether the GWAS approach yields clinical insights of potential therapeutic relevance, we sought to determine which of the lipid-associated lead SNPs are also associated with CAD in a manner consistent with established epidemiological relationships (that is, SNP alleles which increase TC, LDL-C or TG or that decrease HDL-C should be associated with increased risk of CAD). Whereas LDL-C is an accepted causal risk factor for CAD, it is unclear whether HDL-C and/or TG are also causal risk factors. This uncertainty was reinforced by the failure of a drug that raised HDL-C via cholesteryl ester transfer protein (CETP) inhibition to reduce the risk of cardiovascular disease¹⁹. Whether other drugs that specifically raise HDL-C or lower TG can reduce CAD risk remains an open question. In contrast, the most widely marketed drugs for lowering of LDL-C, statins, have been demonstrated in numerous clinical trials to reduce risk of CAD. Statins inhibit hydroxy-3-methylglutaryl coenzyme A reductase (the protein product of HMGCR) and thereby reduce LDL-C and TC levels. We observed that the variant of our lead SNP in the HMGCR locus that is associated with lower LDL-C levels is also associated with lower CAD risk (P = 0.004), consistent with the clinical effects of statins. Analogously, common variants in other lipid-associated loci that are also associated with CAD may implicate genes at these loci as possible therapeutic targets.

We performed association testing for each of the lead SNPs from this study in 24,607 individuals of European descent with CAD and 66,197 without CAD, with a pre-specified one-sided significance threshold of P < 0.001 requiring directionality consistent with the relevant lipid–CAD epidemiological relationship. A limited number of loci met this criterion (Table 1; Supplementary Table 15), with most of them being associated with LDL-C, consistent with LDL-C being a causal risk factor for CAD.

Four novel CAD-associated loci related specifically to HDL-C or TG, but not LDL-C: IRS1 (HDL-C, TG), C6orf106 (HDL-C), KLF14 (HDL-C) and NAT2 (TG). That these loci were associated with CAD shows that there may be selective mechanisms by which HDL-C or TG can be altered in ways that also modulate CAD risk. However, it is also possible that causal genes in these loci may have pleiotropic effects on non-lipid parameters that are causal for CAD risk reduction. For example, the major allele of the lead SNP in the IRS1 locus is associated with increased risk of type 2 diabetes mellitus, insulin resistance and hyperinsulinemia¹⁷, along with decreased HDL-C, increased TG and increased risk of CAD; it remains unclear which of the metabolic risk factors are responsible for the increased CAD risk.

Besides CAD, a second clinically relevant phenotype is hyperlipidaemia. We asked whether the common variants in the 95 associated loci, each with individually small effects on plasma lipids, combine to contribute to extreme lipid phenotypes. We genotyped individuals identified in three independent studies as having high LDL-C (n = 532, mean 219 mg dl⁻¹), high HDL-C (n = 652, mean 90 mg dl⁻¹) or high TG (n = 344, mean 1,079 mg dl⁻¹). For each extreme case group, individuals with low plasma LDL-C (n = 532, mean 110 mg dl⁻¹), HDL-C (n = 784, mean 36.2 mg dl⁻¹) or TG (n = 144, mean 106 mg dl⁻¹) served as control groups. In each case-control sample set, we calculated risk scores summarizing the number of LDL-C-, HDL-C-, or TG-raising alleles weighted by effect size.

For LDL-C, we found that individuals with an LDL-C allelic dosage score in the top quartile were 13 times as likely to have high LDL-C as individuals in the bottom quartile (P = 1 × 10⁻¹⁴) (Supplementary Fig. 3; Supplementary Tables 16 and 17). For HDL-C, individuals in the top quartile of the HDL-C risk score were four times as likely to have high HDL-C as individuals in the bottom quartile (P = 2 × 10⁻¹⁶). For TG, individuals in the top quartile of the TG risk score were 44 times as likely to be hypertriglyceridaemic as individuals in the bottom quartile (P = 4 × 10⁻²⁸). These results indicate that the additive effects of multiple common variants contribute to determining membership in the extremes of a quantitative trait distribution.

Biological relevance of GWAS loci

Whether the GWAS approach can yield biological insights that improve our understanding of the mechanisms underlying phenotypes such as plasma lipid concentrations remains an open question. Loci identified through GWAS may explain a very small proportion of the variance in a phenotype through naturally occurring common variants in humans, but they may have a greater impact through rare variants or when targeted by pharmacological or genetic intervention.

We surveyed our 95 GWAS loci and asked whether any nearby genes are linked to known Mendelian lipid disorders. There is remarkable overlap between the loci identified here and 18 genes previously implicated in Mendelian lipid disorders (Supplementary Table 18). Fifteen of the genes underlying these Mendelian disorders lie within 100 kb of one of our lead SNPs, including eight that lie within 10 kb of the nearest lead SNP. In 1,000,000 simulations of 95 randomly drawn SNPs, selected to match our lead SNPs with respect to MAF and the number of nearby genes, the average simulation showed no overlapping loci and none showed more than eight overlapping loci.

An additional two loci represent well-established drug targets for the treatment of hyperlipidaemia: HMGCR (statins) and NPC1L1 (ezetimibe). Several other loci harbour genes that were already known to influence lipid metabolism before this study: LPA, which encodes lipoprotein(a); PLTP, which encodes phospholipid transfer protein; ANGPTL3 and ANGPTL4, lipoprotein lipase inhibitors; SCARB1, a HDL receptor that mediates selective uptake of cholesteryl ester; CYP7A1, which encodes cholesterol 7-alpha-hydroxylase; STARD3, a cholesterol transport gene; and LRP1 and LRP4, members of the LDL receptor-related protein family. Notably, the protein product of one of the genes implicated by our study, MYLIP, is a ubiquitin ligase that had no recognized role in lipid metabolism before our study’s inception, but has since been independently demonstrated to be a regulator of cellular LDL receptor levels and is now termed Idol (inducible degrader of the LDL receptor)²⁰.

GALNT2 (encoding UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyl transferase 2) is a member of a family of GalNAc-transferases, which transfer an N-acetyl galactosamine to the hydroxyl group of a serine/threonine residue in the first step of O-linked oligosaccharide biosynthesis. It is the only gene in the mapped locus on chromosome 1q42 within 150 kb of the lead SNP (rs4846914), which is located in an intron of the gene. We therefore reasoned that GALNT2 would be an ideal candidate for functional validation in a mouse model. We introduced the mouse orthologue Galnt2 into mouse liver via a viral vector. Liver-specific overexpression of Galnt2 resulted in significantly lower plasma HDL-C (24%

compared to control mice) by 4 weeks (Fig. 1a). We also performed knockdown of endogenous liver Galnt2 through delivery of a short hairpin RNA via a viral vector. Reduction of the transcript level (~95% knockdown as determined by qRT–PCR) resulted in higher HDL-C levels by 4 weeks (71% compared to control mice) (Fig. 1b). These observations validate GALNT2 as a biological mediator of HDL-C levels.

[Figure 1, panels a–d: plasma HDL-C (mg dl⁻¹) plotted against time after injection for control and treated mice.]

Figure 1 | Effects of altered Galnt2, Ppp1r3b or Ttc39b expression in mouse liver on plasma lipid levels. a, b, Overexpression and knockdown of Galnt2. Plasma HDL-C levels at baseline, 2 weeks or 4 weeks after injection of viral vectors are shown. n = 6 mice per group. c, Overexpression of Ppp1r3b. Plasma HDL-C levels at baseline, 2 weeks or 4 weeks after injection of viral vectors are shown. n = 7 mice per group. d, Knockdown of Ttc39b. Plasma HDL-C levels at baseline, 4 days or 7 days after injection of viral vectors are shown. n = 6 mice per group. Error bars show s.d. Because independent experiments were performed at different times and/or sites, there is variability in baseline HDL-C levels.

We further asked whether eQTL studies could facilitate the identification of causal genes in loci with multiple genes. Out of several genes surrounding a locus on chromosome 8p23 found to be associated with HDL-C, LDL-C and TC (Table 1), only PPP1R3B (encoding protein phosphatase 1, regulatory (inhibitor) subunit 3B) was found to have an eQTL in liver (Supplementary Table 7). The allele associated with increased expression correlated with lower levels of each of the lipid traits. This eQTL relationship indicates that higher expression of PPP1R3B will lower plasma lipids. Consistent with this prediction, overexpression of the mouse orthologue Ppp1r3b in mouse liver via a viral vector resulted in significantly lower plasma HDL-C levels at 2 weeks (25%) and 4 weeks (18%) (Fig. 1c), as well as lower TC levels at 2 weeks (21%) and 4 weeks (14%) (data not shown).

Similarly, on a locus on chromosome 9p22 found to be associated with HDL-C, TTC39B (encoding tetratricopeptide repeat domain 39B) was the only one of several genes in the locus to have an eQTL in liver (Supplementary Table 7), with the allele associated with decreased expression correlating with increased HDL-C. Consistent with this eQTL, knockdown of the mouse orthologue Ttc39b via a viral vector, with 50% knockdown of transcript as determined by qRT–PCR, resulted in significantly higher plasma HDL-C levels at 4 days (19%) and 7 days (14%) (Fig. 1d). These data indicate PPP1R3B and TTC39B as causal genes for lipid regulation. These findings, combined with the demonstration that SORT1 is a causal gene for LDL-C and is regulated in its expression by a GWAS SNP (reported in the accompanying paper ref. 18), support the use of eQTL studies to prioritize functional validation of GWAS-nominated genes.

Together, these observations establish that some of the identified 95 loci harbour novel bona fide lipid regulatory genes and show that with additional functional studies many, if not all, of the loci will yield insights into the biological underpinnings of lipid metabolism.

New biological, clinical and genetic insights

Through a series of studies, we demonstrate that (1) at least 95 loci across the human genome harbour common variants associated with plasma lipid traits in Europeans, (2) the loci contribute to lipid traits in multiple non-European populations, (3) some of these loci are associated not only with lipids but also with risk for CAD, (4) common variants in the loci combine to contribute to extreme lipid phenotypes and (5) many of the identified loci harbour genes that contribute to lipid metabolism, including the novel lipid genes GALNT2, PPP1R3B and TTC39B that we validated in mouse models.

It has recently been suggested that conducting genetic studies with increasingly larger cohorts will be relatively uninformative for the biology of complex human disease, particularly if initial studies have failed to explain a sizable fraction of the heritability of the disease in question¹¹. As the reasoning goes, analysis of a few thousand individuals will uncover the common variants with the strongest effect on phenotype. Larger studies will suffer from a plateau phenomenon in which either no additional common variants will be found or any common variants that are identified will have too small an effect to be of biological interest.

Our study provides strong empirical evidence against this assertion. We extended a GWAS for plasma lipids from ~20,000 to ~100,000 individuals and identified 95 loci (of which 59 are novel) that, in aggregate, explain 10–12% of the total variance (representing ~25–30% of the genetic variance). Even though the lipid-associated SNPs we identified have relatively small effect sizes, some of the 59 new loci contain genes of clear biological and clinical importance, among them LDLRAP1 (responsible for autosomal recessive hypercholesterolemia), SCARB1 (receptor for selective uptake of HDL-C), NPC1L1 (established drug target), MYLIP (recently characterized regulator of LDL-C), and PPP1R3B (newly characterized regulator of HDL-C). We expect that future investigations of the new loci (for example, resequencing efforts to identify low-frequency and rare variants, or functional experiments in cells and animal models, as demonstrated for SORT1 in a separate study reported in the accompanying paper ref. 18) will uncover additional important new genes. Thus, the data presented in this study provide a foundation from which to develop a broader biological understanding of lipoprotein metabolism and to identify potential new therapeutic opportunities.

METHODS SUMMARY

The full Methods are in Supplementary Information and provide information about (1) study samples and phenotypes; (2) genotyping and imputation; (3) genome-wide association analyses; (4) meta-analyses of directly typed and imputed SNPs; (5) estimation of effect sizes; (6) conditional analyses of top signals; (7) sex-specific analyses; (8) cis-expression quantitative trait locus analyses; (9) analyses of lipid-associated SNPs in European and non-European samples; (10) analyses of lipid-associated SNPs in individuals with and without CAD; (11) analyses of associated SNPs in patients with extreme LDL-C, HDL-C or TG levels; (12) simulation studies to assess overlap between GWAS signals and Mendelian disease loci; and (13) details of mouse studies.

Received 25 February; accepted 11 June 2010.

1. Kathiresan, S. et al. A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC Med. Genet. 8 (Suppl 1), S17 (2007).
2. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes of BioMedical Research. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316, 1331–1336 (2007).

3. Willer, C. J. et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nature Genet. 40, 161–169 (2008).
4. Kathiresan, S. et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nature Genet. 40, 189–197 (2008).
5. Kooner, J. S. et al. Genome-wide scan identifies variation in MLXIPL associated with plasma triglycerides. Nature Genet. 40, 149–151 (2008).
6. Wallace, C. et al. Genome-wide association study identifies genes for biomarkers of cardiovascular disease: serum urate and dyslipidemia. Am. J. Hum. Genet. 82, 139–149 (2008).
7. Sabatti, C. et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nature Genet. 41, 35–46 (2009).
8. Aulchenko, Y. S. et al. Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nature Genet. 41, 47–55 (2009).
9. Kathiresan, S. et al. Common variants at 30 loci contribute to polygenic dyslipidemia. Nature Genet. 41, 56–65 (2009).
10. Chasman, D. I. et al. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet. 5, e1000730 (2009).
11. Goldstein, D. B. Common genetic variation and human traits. N. Engl. J. Med. 360, 1696–1698 (2009).
12. Hirschhorn, J. N. Genomewide association studies—illuminating biologic pathways. N. Engl. J. Med. 360, 1699–1701 (2009).
13. Kraft, P. & Hunter, D. J. Genetic risk prediction—are we there yet? N. Engl. J. Med. 360, 1701–1703 (2009).
14. Hardy, J. & Singleton, A. Genomewide association studies and human disease. N. Engl. J. Med. 360, 1759–1768 (2009).
15. Weiss, L. A., Pan, L., Abney, M. & Ober, C. The sex-specific genetic architecture of quantitative traits in humans. Nature Genet. 38, 218–222 (2006).
16. Schadt, E. E. et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 6, e107 (2008).
17. Rung, J. et al. Genetic variant near IRS1 is associated with type 2 diabetes, insulin resistance and hyperinsulinemia. Nature Genet. 41, 1110–1115 (2009).
18. Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature xxx, xxx–xxx (2010).
19. Barter, P. J. et al. Effects of torcetrapib in patients at high risk for coronary events. N. Engl. J. Med. 357, 2109–2122 (2007).
20. Zelcer, N., Hong, C., Boyadjian, R. & Tontonoz, P. LXR regulates cholesterol uptake through Idol-dependent ubiquitination of the LDL receptor. Science 325, 100–104 (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We wish to dedicate this paper to the memory of Dr Leena Peltonen, who passed away on 11 March 2010. A full listing of acknowledgements is provided in Supplementary Information.

Author Contributions T.M.T., K.M., A.V.S., A.C.E., I.M.S., M.K. and J.P.P. carried out the primary data analyses and/or experimental work. All other authors contributed to additional analyses. L.A.C., M.S.S., P.M.R., D.J.R., C.M.v.D., L.P., G.R.A., M.B. and S.K. conceived, designed, and supervised the study. K.M. wrote the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to S.K. ([email protected]).

Tanya M. Teslovich1*, Kiran Musunuru2,3,4,5,6*, Albert V. Smith7,8, Andrew C. Edmondson9,10, Ioannis M. Stylianou10, Masahiro Koseki11, James P. Pirruccello2,5,6, Samuli Ripatti12,13, Daniel I. Chasman4,14, Cristen J. Willer1, Christopher T. Johansen15, Sigrid W. Fouchier16, Aaron Isaacs17, Gina M. Peloso18,19, Maja Barbalic20, Sally L. Ricketts21, Joshua C. Bis22, Yurii S. Aulchenko17, Gudmar Thorleifsson23, Mary F. Feitosa24, John Chambers25, Marju Orho-Melander26, Olle Melander26, Toby Johnson27, Xiaohui Li28, Xiuqing Guo28, Mingyao Li9,10, Yoon Shin Cho29, Min Jin Go29, Young Jin Kim29, Jong-Young Lee29, Taesung Park30,31, Kyunga Kim32, Xueling Sim33, Rick Twee-Hee Ong34, Damien C. Croteau-Chonka35, Leslie A. Lange35, Joshua D. Smith36, Kijoung Song37, Jing Hua Zhao38, Xin Yuan37, Jian’an Luan38, Claudia Lamina39, Andreas Ziegler40, Weihua Zhang25, Robert Y. L. Zee4,14, Alan F. Wright41, Jacqueline C. M. Witteman17,42, James F. Wilson43, Gonneke Willemsen44, H.-Erich Wichmann45, John B. Whitfield46, Dawn M. Waterworth37, Nicholas J. Wareham38, Gérard Waeber47, Peter Vollenweider47, Benjamin F. Voight2,5, Veronique Vitart41, Andre G. Uitterlinden17,42,48, Manuela Uda49, Jaakko Tuomilehto50, John R. Thompson51, Toshiko Tanaka52,53, Ida Surakka12,13, Heather M. Stringham1, Tim D. Spector54, Nicole Soranzo54,55, Johannes H. Smit56, Juha Sinisalo57, Kaisa Silander12,13, Eric J. G. Sijbrands17,48, Angelo Scuteri58, James Scott59, David Schlessinger60, Serena Sanna49, Veikko Salomaa50, Juha Saharinen12, Chiara Sabatti61, Aimo Ruokonen62, Igor Rudan43, Lynda M. Rose14, Robert Roberts63, Mark Rieder36, Bruce M. Psaty64, Peter P. Pramstaller65, Irene Pichler65, Markus Perola12,13, Brenda W. J. H. Penninx56, Nancy L. Pedersen66, Cristian Pattaro65, Alex N. Parker67, Guillaume Pare68, Ben A. Oostra69, Christopher J. O’Donnell4,19, Markku S. Nieminen57, Deborah A. Nickerson36, Grant W. Montgomery46, Thomas Meitinger70,71, Ruth McPherson63, Mark I. McCarthy72,73,74, Wendy McArdle75, David Masson11, Nicholas G. Martin46, Fabio Marroni76, Massimo Mangino54, Patrik K. E. Magnusson66, Gavin Lucas77, Robert Luben21, Ruth J. F. Loos38, Marja-Liisa Lokki78, Guillaume Lettre79, Claudia Langenberg38, Lenore J. Launer80, Edward G. Lakatta60, Reijo Laaksonen81, Kirsten O. Kyvik82, Florian Kronenberg39, Inke R. König40, Kay-Tee Khaw21, Jaakko Kaprio12,13,83, Lee M. Kaplan84, Åsa Johansson85, Marjo-Riitta Jarvelin86,87, A. Cecile J. W. Janssens17, Erik Ingelsson66, Wilmar Igl85, G. Kees Hovingh16, Jouke-Jan Hottenga44, Albert Hofman17,42, Andrew A. Hicks65, Christian Hengstenberg88, Iris M. Heid45,89, Caroline Hayward41, Aki S. Havulinna50,90, Nicholas D. Hastie41, Tamara B. Harris80, Talin Haritunians28, Alistair S. Hall91, Ulf Gyllensten85, Candace Guiducci5, Leif C. Groop26,92, Elena Gonzalez5, Christian Gieger45, Nelson B. Freimer93, Luigi Ferrucci94, Jeanette Erdmann95, Paul Elliott86,96, Kenechi G. Ejebe5, Angela Döring45, Anna F. Dominiczak97, Serkalem Demissie18,19, Panagiotis Deloukas55, Eco J. C. de Geus44, Ulf de Faire98, Gabriel Crawford5, Francis S. Collins99, Yii-der I. Chen28, Mark J. Caulfield27, Harry Campbell43, Noel P. Burtt5, Lori L. Bonnycastle99, Dorret I. Boomsma44, S. Matthijs Boekholdt100, Richard N. Bergman101, Inês Barroso55, Stefania Bandinelli102, Christie M. Ballantyne103, Themistocles L. Assimes104, Thomas Quertermous104, David Altshuler2,4,5, Mark Seielstad34, Tien Y. Wong105, E-Shyong Tai106, Alan B. Feranil107, Christopher W. Kuzawa108, Linda S. Adair109, Herman A. Taylor Jr110, Ingrid B. Borecki24, Stacey B. Gabriel5, James G. Wilson110, Hilma Holm23, Unnur Thorsteinsdottir8,23, Vilmundur Gudnason7,8, Ronald M. Krauss111, Karen L. Mohlke35, Jose M. Ordovas112,113, Patricia B. Munroe114, Jaspal S. Kooner59, Alan R. Tall11, Robert A. Hegele15, John J.P. Kastelein16, Eric E. Schadt115, Jerome I. Rotter28, Eric Boerwinkle20, David P. Strachan116, Vincent Mooser37, Kari Stefansson8,23, Muredach P. Reilly9,10, Nilesh J Samani117, Heribert Schunkert95, L. Adrienne Cupples18,19*, Manjinder S. Sandhu21,38,55*, Paul M Ridker4,14*, Daniel J. Rader9,10*, Cornelia M. van Duijn17,42*, Leena Peltonen†, Gonçalo R. Abecasis1*, Michael Boehnke1* & Sekar Kathiresan2,3,4,5*

1Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA. 2Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 3Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 4Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA. 5Broad Institute, Cambridge, Massachusetts 02142, USA. 6Johns Hopkins University School of Medicine, Baltimore, Maryland 21287, USA. 7Icelandic Heart Association, Heart Preventive Clinic and Research Institute, 201 Kopavogur, Iceland. 8University of Iceland, 101 Reykjavik, Iceland. 9Cardiovascular Institute, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA. 10Institute for Translational Medicine and Therapeutics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA. 11Division of Molecular Medicine, Department of Medicine, Columbia University, New York, New York 10032, USA. 12Institute for Molecular Medicine Finland FIMM, University of Helsinki, FI-00014 Helsinki, Finland. 13National Institute for Health and Welfare, P.O. Box 104, FI-00251 Helsinki, Finland. 14Division of Preventive Medicine, Brigham and Women’s Hospital, Boston Massachusetts 02215, USA. 15Robarts Research Institute, University of Western Ontario, London, Ontario N6A 5K8, Canada. 16Department of Vascular Medicine, Academic Medical Centre at the University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. 17Department of Epidemiology, Erasmus University Medical Center, P.O. Box 2040, 3000 CA Rotterdam, The Netherlands. 18Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts 02118, USA. 19National Heart, Lung and Blood Institute’s Framingham Heart Study, Framingham, Massachusetts 01702, USA. 20Human Genetics Center, University of Texas Health Science Center at Houston, Houston, Texas 77030, USA. 21Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, Cambridge CB1 8RN, UK. 22Cardiovascular Health Research Unit and Department of Medicine, University of Washington, Seattle, Washington 98101, USA. 23deCODE Genetics, 101 Reykjavik, Iceland. 24Division of Statistical Genomics in the Center for Genome Sciences, Washington University School of Medicine, St Louis, Missouri 63108, USA. 25Department of Epidemiology and Public Health, Imperial College London, London W2 1PG, UK. 26Department of Clinical Sciences, Lund University, SE-20502, Malmö, Sweden. 27Clinical Pharmacology and Barts and the London Genome Centre, William Harvey Research Institute, Barts and the London School of Medicine, Queen Mary University of London, London EC1M 6BQ, UK. 28Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, USA. 29Center for Genome Science, National Institute of Health, Seoul 122-701, Republic of Korea. 30Interdisciplinary Program in Bioinformatics, College of Natural Sciences, Seoul National University, Seoul 151-742, Republic of Korea. 31Department of Statistics, College of Natural Sciences, Seoul National University, Seoul 151-742, Republic of Korea. 32Department of Statistics, Sookmyung Women’s University, Seoul 140-742, Republic of Korea. 33Centre for Molecular Epidemiology, National University of Singapore, Singapore 117597, Republic of Singapore. 34Genome Institute of Singapore, Singapore 138672, Republic of Singapore. 35Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. 36Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA. 37Genetics Division, GlaxoSmithKline R&D, King of Prussia, Pennsylvania 19406, USA. 38MRC Epidemiology Unit, Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK. 39Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Innsbruck Medical University, Schoepfstrasse 41, A-6020 Innsbruck, Austria. 40Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, 23562 Lübeck, Germany. 41MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Edinburgh EH4 2XU, UK. 42Netherlands Genomics Initiative (NGI)-sponsored Netherlands Consortium for Healthy Aging (NCHA) and Center of Medical Systems Biology (CMSB), 2300 RC Leiden, The Netherlands. 43Centre for Population Health Sciences, University of Edinburgh, Edinburgh EH8 9AG, UK. 44Department of Biological Psychology, VU University Amsterdam, Van der Boechorststraat 1, 1081 BT Amsterdam, The Netherlands. 45Institute of Epidemiology, Helmholtz Zentrum Munchen – German Research Center for Environmental Health, 85764 Neuherberg, Germany. 46Genetic Epidemiology Unit, Queensland Institute of Medical Research, PO Royal Brisbane Hospital, Queensland 4029, Australia. 47Department of Internal Medicine, Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland. 48Department of Internal Medicine, Erasmus University Medical Center, PO Box 2040, 3000 CA Rotterdam, The Netherlands. 49Istituto di Neurogenetica e Neurofarmacologia (INN), Consiglio Nazionale delle Ricerche, c/o Cittadella Universitaria di Monserrato, Monserrato, Cagliari 09042, Italy. 50Department of Chronic Disease Prevention, National Institute for Health and Welfare, FI-00271 Helsinki, Finland. 51Department of Health Sciences, University of Leicester, Leicester LE1 6TP, UK. 52Clinical Research Branch, National Institute on Aging, National Institutes of Health, Baltimore, Maryland 21225, USA. 53Medstar Research Institute, Baltimore, Maryland 21218, USA. 54Department of Twin Research and Genetic Epidemiology, King’s College London, London SE1 7EH, UK. 55Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK. 56Department of Psychiatry, EMGO Institute, Neuroscience Campus Amsterdam, VU University Medical Center, 1007 MB Amsterdam, The Netherlands. 57Division of Cardiology, Department of Medicine, Helsinki University Central Hospital (HUCH), FI-00029 Helsinki, Finland. 58Unita Operativa Geriatria, Istituto Nazionale Ricovero e Cura Anziani (INRCA), Istituto Ricovero e Cura a Carattere Scientifico (IRCCS), Via Cassia 1167, 00189 Rome, Italy. 59Hammersmith Hospital, National Heart and Lung Institute, Imperial College London, London W12 0NN, UK. 60Gerontology Research Center, National Institute on Aging, 5600 Nathan Shock Drive, Baltimore, Maryland 21224, USA. 61Department of Health Research and Policy, Stanford University, Stanford, California 94305, USA. 62Department of Clinical Chemistry, University of Oulu, FI-90220 Oulu, Finland. 63The John & Jennifer Ruddy Canadian Cardiovascular Genetics Centre, University of Ottawa, Ottawa K1Y 4W7, Canada. 64Departments of Medicine, Epidemiology, and Health Services, University of Washington, Group Health Research Institute, Group Health Cooperative, Seattle, Washington 98101, USA. 65Institute of Genetic Medicine, European Academy Bozen/Bolzano (EURAC), Viale Druso 1, 39100 Bolzano, Italy – affiliated institute of the University of Lübeck, Germany. 66Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, SE-17177 Stockholm, Sweden. 67Amgen, Thousand Oaks, California 91320, USA. 68Genetic and Molecular Epidemiology Laboratory, McMaster University, Hamilton, Ontario L8N3Z5, Canada. 69Department of Clinical Genetics, Erasmus University Medical Center, 3000 CA Rotterdam, The Netherlands. 70Institut fur Humangenetik, Helmholtz Zentrum Munchen, Deutsches Forschungszentrum fur Umwelt und Gesundheit, 85764 Neuherberg, Germany. 71Institute of Human Genetics, Klinikum rechts der Isar, Technische Universität München, 81675 Muenchen, Germany. 72Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK. 73Oxford Centre for Diabetes, Endocrinology and Medicine, University of Oxford, Churchill Hospital, Oxford OX3 7LJ, UK. 74Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford OX3 7LJ, UK. 75Avon Longitudinal Study of Parents and Children, University of Bristol, Bristol BS8 2BN, UK. 76Institute of Applied Genomics, via Linussio 51, 33100 Udine, Italy. 77Cardiovascular Epidemiology and Genetics, Institut Municipal d’Investigacio Medica, 08003 Barcelona, Spain. 78Transplantation Laboratory, Haartman Institute, University of Helsinki, FI-00014 Helsinki, Finland. 79Montreal Heart Institute (Research Center), Université de Montréal, Montréal, Québec H1T 1C8, Canada. 80Laboratory of Epidemiology, Demography, and Biometry, National Institute on Aging, National Institutes of Health, Bethesda, Maryland 20892, USA. 81Science Center, Tampere University Hospital, FI-33521 Tampere, Finland. 82Institute of Regional Health Research and the Danish Twin Registry, Institute of Public Health, University of Southern Denmark, JB Winsløws Vej 9B, DK-5000, Odense, Denmark. 83Faculty of Medicine, Department of Public Health, University of Helsinki, FI-00014 Helsinki, Finland. 84Massachusetts General Hospital Weight Center, Boston, Massachusetts 02114, USA. 85Department of Genetics and Pathology, Rudbeck Laboratory, University of Uppsala, SE-75185 Uppsala, Sweden. 86Department of Epidemiology & Biostatistics, Imperial College London, St Mary’s Campus, Norfolk Place, London W2 1PG, UK. 87Department of Public Health Science and General Practice, University of Oulu, FI-90220 Oulu, Finland. 88Klinik und Poliklinik für Innere Medizin II, Universität Regensburg, 93053 Regensburg, Germany. 89Department of Epidemiology and Preventive Medicine Regensburg University Medical Center Franz-Josef-Strauss-Allee 11, 93053 Regensburg, Germany. 90Department of Biomedical Engineering and Computational Science, Aalto University School of Science and Technology, FI-00076 Aalto, Finland. 91LIGHT Research Institute, Faculty of Medicine and Health, University of Leeds, Leeds LS2 9JT, UK. 92Department of Medicine, Helsinki University Hospital, FI-00029 Helsinki, Finland. 93Department of Psychiatry, Center for Neurobehavioral Genetics, The Jane and Terry Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA. 94Clinical Research Branch, National Institute on Aging, National Institutes of Health, Baltimore, Maryland 21225, USA. 95Medizinische Klinik II, Universität zu Lübeck, 23538 Lübeck, Germany. 96MRC-HPA Centre for Environment and Health, Imperial College London, London W2 1PG, UK. 97BHF Glasgow Cardiovascular Research Centre, University of Glasgow, 126 University Place, Glasgow G12 8TA, UK. 98Division of Cardiovascular Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, SE-17177 Stockholm, Sweden. 99National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA. 100Departments of Vascular Medicine & Cardiology, Academic Medical Centre, 1105 AZ Amsterdam, The Netherlands. 101Department of Physiology and Biophysics, University of Southern California, Los Angeles, California 90033, USA. 102Geriatric Unit, Azienda Sanitaria Firenze (ASF), 50125 Florence, Italy. 103Department of Medicine, Baylor College of Medicine, Houston, Texas 77030, USA. 104Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, USA. 105Singapore Eye Research Institute, National University of Singapore, Singapore 168751, Republic of Singapore. 106Departments of Medicine/Epidemiology and Public Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117597, Republic of Singapore. 107Office of Population Studies Foundation, University of San Carlos, Cebu City 6000, Philippines. 108Department of Anthropology, Northwestern University, Evanston, Illinois 60208, USA. 109Department of Nutrition, Carolina Population Center, University of North Carolina, Chapel Hill, North Carolina 27516, USA. 110Department of Medicine, University of Mississippi Medical Center, Jackson, Mississippi 39126, USA. 111Children’s Hospital Oakland Research Institute, Oakland, California 94609, USA. 112Department of Cardiovascular Epidemiology and Population Genetics, Centro Nacional de Investigaciones Cardiovasculares, 28029 Madrid, Spain. 113Nutrition and Genomics Laboratory, Jean Mayer United States Department of Agriculture Human Nutrition Research Center on Aging at Tufts University, Boston, Massachusetts 02111, USA. 114Clinical Pharmacology and Barts and The London Genome Centre, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK. 115Sage Bionetworks, Seattle, Washington 98109, USA. 116Division of Community Health Sciences, St George’s University of London, London SW17 0RE, UK. 117Department of Cardiovascular Sciences, University of Leicester, NIHR Biomedical Research Unit in Cardiovascular Disease, Glenfield Hospital, Leicester LE3 9QP, UK.
*These authors contributed equally to this work.
†Deceased.

Vol 466 | 5 August 2010 | doi:10.1038/nature09266 ARTICLES

From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus

Kiran Musunuru1,2,3*, Alanna Strong4*, Maria Frank-Kamenetsky5, Noemi E. Lee1, Tim Ahfeldt1,6, Katherine V. Sachs4, Xiaoyu Li4, Hui Li4, Nicolas Kuperwasser1, Vera M. Ruda1, James P. Pirruccello1,2, Brian Muchmore7, Ludmila Prokunina-Olsson7, Jennifer L. Hall2,8, Eric E. Schadt9, Carlos R. Morales10, Sissel Lund-Katz11, Michael C. Phillips11, Jamie Wong5, William Cantley5, Timothy Racie5, Kenechi G. Ejebe1,2, Marju Orho-Melander12, Olle Melander12, Victor Koteliansky5, Kevin Fitzgerald5, Ronald M. Krauss13, Chad A. Cowan1,2, Sekar Kathiresan1,2* & Daniel J. Rader4*

Recent genome-wide association studies (GWASs) have identified a locus on chromosome 1p13 strongly associated with both plasma low-density lipoprotein cholesterol (LDL-C) and myocardial infarction (MI) in humans. Here we show through a series of studies in human cohorts and human-derived hepatocytes that a common noncoding polymorphism at the 1p13 locus, rs12740374, creates a C/EBP (CCAAT/enhancer binding protein) transcription factor binding site and alters the hepatic expression of the SORT1 gene. With small interfering RNA (siRNA) knockdown and viral overexpression in mouse liver, we demonstrate that Sort1 alters plasma LDL-C and very low-density lipoprotein (VLDL) particle levels by modulating hepatic VLDL secretion. Thus, we provide functional evidence for a novel regulatory pathway for lipoprotein metabolism and suggest that modulation of this pathway may alter risk for MI in humans. We also demonstrate that common noncoding DNA variants identified by GWASs can directly contribute to clinical phenotypes.

MI is the leading cause of death in the developed world. LDL-C is a causal risk factor for the disease, as demonstrated by the increased and early burden of MI in individuals with the Mendelian disorder of familial hypercholesterolemia¹ and the success of LDL-C-lowering medications in reducing the incidence of MI in clinical trials in many populations². Despite aggressive use of statin drugs, many individuals do not achieve the LDL-C levels recommended by clinical guidelines³. There remains a need for additional methods of reducing LDL-C.

GWASs for plasma lipoprotein traits have identified a number of common single nucleotide polymorphism (SNP) variants that are strongly associated with plasma LDL-C⁴⁻¹⁰. Many of these SNPs are in or near genes known to cause Mendelian dyslipidaemias (LDLR, APOB and PCSK9) or established molecular targets for LDL-C-lowering therapies (HMGCR). However, several of the LDL-C loci contain genes not previously implicated in lipoprotein metabolism. Of the newly mapped loci, the novel SNPs most strongly associated with LDL-C all lie on chromosome 1p13; indeed, in a meta-analysis of ~100,000 individuals (reported in the accompanying paper¹⁰) this locus has the strongest association with LDL-C of any locus in the genome (P = 1 × 10⁻¹⁷⁰). The same 1p13 SNPs have also been independently linked to coronary artery disease and MI in GWASs¹⁰⁻¹². Individuals of European descent who are homozygous for the major alleles of these SNPs have up to 16 mg dl⁻¹ higher LDL-C as well as ~40% increased risk of MI¹¹,¹² when compared with minor allele homozygotes. Thus, the same genetic locus is linked to both a recognized intermediate phenotype as well as a hard clinical disease endpoint.

As compelling as these associations are, they do not explain how human genetic variation at the 1p13 locus confers change in plasma LDL-C and thereby alters risk of MI. We therefore sought to identify (1) the causal DNA variant in the 1p13 locus, (2) the gene regulated by the locus, (3) the mechanism by which the DNA variant affects the gene, and (4) the mechanism by which the gene influences lipoprotein metabolism.

1p13 SNPs associated with LDL particles
LDL-C comprises a variety of lipoprotein particles that range in size and density, and it has been hypothesized that smaller LDL particles are more atherogenic than larger LDL particles¹³. To determine whether the 1p13 locus selectively affects certain LDL subclasses, we used different methodologies (ion mobility and gradient gel electrophoresis) to measure lipoprotein subclasses in two different cohorts: the Malmö Diet and Cancer Study – Cardiovascular Cohort (MDC-CC)¹⁴ and the Pharmacogenomics and Risk of Cardiovascular Disease (PARC) study¹⁵. We found that an index SNP in the 1p13 locus, rs646776, was most highly associated with changes in the very small LDL (LDL-VS) lipoprotein subclass (20% increase in major allele homozygotes versus minor allele homozygotes with P = 1.1 × 10⁻¹¹ in MDC-CC; 37% increase with P = 8.0 × 10⁻¹¹ in PARC); progressively smaller changes were seen with larger LDL subclasses (Fig. 1a; Supplementary Fig. 1a, b).

1Cardiovascular Research Center and Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02114, USA. 2Broad Institute, Cambridge, Massachusetts 02142, USA. 3Division of Cardiology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21287, USA. 4Institute for Translational Medicine and Therapeutics, Institute for Diabetes, Obesity and Metabolism, and Cardiovascular Institute, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA. 5Alnylam Pharmaceuticals, Inc., Cambridge, Massachusetts 02142, USA. 6Department of Biochemistry and Molecular Biology II: Molecular Cell Biology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany. 7Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA. 8Program in Cardiovascular Translational Genomics, Lillehei Heart Institute, University of Minnesota, Minneapolis, Minnesota 55455, USA. 9Sage Bionetworks, Seattle, Washington 98109, USA. 10Department of Anatomy and Cell Biology, McGill University, Montreal, H3A 2B2, Canada. 11The Children’s Hospital of Philadelphia, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA. 12Department of Clinical Sciences, Skania University Hospital, Lund University, SE-20502 Malmö, Sweden. 13Children’s Hospital Oakland Research Institute, Oakland, California 94609, USA.
*These authors contributed equally to this work.

1p13 SNPs and liver-specific expression
The SNPs in the 1p13 locus reported previously to be most highly associated with LDL-C (rs646776, rs599839, rs12740374 and rs629301) lie in a noncoding DNA region between two genes, CELSR2 and PSRC1, whose functions are unknown (Figs 1b and 2a)⁴⁻¹⁰. As noncoding DNA variants may alter gene expression, we previously used expression quantitative trait locus (eQTL) analyses to explore whether 1p13 SNPs are cis-acting regulators of nearby genes in human liver⁴,⁷. We have now extended these studies by measuring expression of genes in or near the 1p13 locus in three types of human tissue samples: liver (960 samples), subcutaneous fat (433 samples) and omental fat (520 samples).

In liver, presence of the minor allele of rs646776 was highly associated with elevated transcript levels of three genes: CELSR2, PSRC1 and SORT1 (Fig. 1c). SORT1 displayed the largest expression change. We replicated these liver eQTL findings in an independent cohort of 62 human liver samples, from which rs12740374 (the putative causal 1p13 SNP, see below) was directly genotyped and SORT1, PSRC1 and CELSR2 expression were individually measured. Minor allele homozygotes displayed more than 12-fold higher SORT1 and PSRC1 expression than major allele homozygotes, with no significant change for CELSR2 (Fig. 1d). Immunoblot analysis of liver lysates demonstrated a significant increase in abundance of the SORT1 protein product (sortilin) in heterozygotes compared to major allele homozygotes (Fig. 1d and Supplementary Fig. 1c).

Notably, none of the gene expression changes in liver were seen in the two adipose tissue types (Fig. 1c), and minimal changes were reported in lymphocytes¹⁶, suggesting that the regulatory mechanism underlying the allele-specific gene expression is liver-specific.
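The liver eQTL comparison described above can be illustrated with a small regression sketch. It assumes an additive model in which expression is regressed on minor-allele dosage (0, 1 or 2 copies). The function name `additive_eqtl`, the simulated effect size and the sample size are hypothetical choices made for illustration; they are not taken from the study, whose actual procedures are described in its Supplementary Information.

```python
import numpy as np


def additive_eqtl(dosage, expression):
    """Ordinary least-squares fit of expression on minor-allele dosage.

    Returns (slope, intercept, t_statistic). The slope is the estimated
    change in expression per copy of the minor allele.
    """
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(expression, dtype=float)
    x_c = x - x.mean()
    slope = float(x_c @ (y - y.mean()) / (x_c @ x_c))
    intercept = float(y.mean() - slope * x.mean())
    resid = y - (intercept + slope * x)
    # Standard error of the slope with n - 2 residual degrees of freedom.
    se = float(np.sqrt((resid @ resid) / (len(x) - 2) / (x_c @ x_c)))
    return slope, intercept, slope / se


# Simulated data only (assumption): 500 samples, true effect 0.8 per allele.
rng = np.random.default_rng(0)
dosage = rng.choice([0, 1, 2], size=500, p=[0.6, 0.3, 0.1])
expression = 1.0 + 0.8 * dosage + rng.normal(0.0, 0.5, size=500)

slope, intercept, t_stat = additive_eqtl(dosage, expression)
print(f"per-allele effect = {slope:.2f}, t = {t_stat:.1f}")
```

A positive slope in this toy setting corresponds to the pattern reported for SORT1 in liver, where the minor allele goes with higher transcript levels. Running the same fit on a tissue where the slope is near zero would mirror the adipose result.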

Figure 1 | Human chromosome 1p13 locus is preferentially associated with very small LDL and liver gene expression. a, Mean plasma lipid and lipoprotein particle levels in homozygotes for the minor haplotype of the 1p13 locus (minor allele of rs646776) versus homozygotes for the major haplotype (major allele of rs646776), normalized to the mean level in minor haplotype homozygotes, in the MDC-CC cohort (measured by ion mobility) and the PARC cohort (measured by gradient gel electrophoresis). LDL-L, large LDL; LDL-M, medium LDL; LDL-S, small LDL; LDL-VS, very small LDL. b, Relative gene positions in and around the 1p13 locus; * indicates position of rs646776. c, Mean expression of local genes in homozygotes for the major 1p13 haplotype (major allele of rs646776) versus heterozygotes versus homozygotes for the minor 1p13 haplotype (minor allele of rs646776), normalized to the mean level in major haplotype homozygotes, in samples of human liver, human subcutaneous adipose and human omental adipose. d, Mean expression of PSRC1, CELSR2, SORT1 and TCF7L2 (negative control) mRNA, standardized to B2M expression, and sortilin protein, standardized to α-tubulin, in samples of human liver from homozygotes for the major 1p13 haplotype (major allele of rs12740374) versus heterozygotes versus homozygotes for the minor 1p13 haplotype (minor allele of rs12740374) if available, normalized to the mean level in major haplotype homozygotes. P values derived from linear regression analyses or unpaired t-test. Error bars show s.e.m.

[Figure 1 graphic not reproduced. Panel a: normalized levels of LDL-C, LDL-L, LDL-M, LDL-S and LDL-VS in MDC-CC (ion mobility) and PARC (gradient gel electrophoresis). Panel b: RefSeq genes on chr1, 109,600,000 to 109,800,000 (SARS, CELSR2, PSRC1, MYBPHL, SORT1, PSMA5, SYPL2). Panel c: normalized gene expression by rs646776 genotype in liver (n = 960), subcutaneous adipose (n = 433) and omental adipose (n = 520). Panel d: liver qPCR and western blot by rs12740374 genotype.]
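The normalization and summary statistics named in the Figure 1 caption can be illustrated with a minimal sketch. It assumes an unadjusted additive model (expression regressed on minor-allele count of 0, 1 or 2). The expression values are invented for illustration and the function names are hypothetical. This is not the authors' analysis pipeline.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical expression values (arbitrary units) grouped by genotype.
# These numbers are invented for illustration; they are not data from the study.
expression = {
    "homozygote_major": [1.0, 1.2, 0.8, 1.0],
    "heterozygote": [2.0, 2.4, 1.6, 2.0],
    "homozygote_minor": [3.0, 3.3, 2.7, 3.0],
}
MINOR_ALLELE_COUNT = {"homozygote_major": 0, "heterozygote": 1, "homozygote_minor": 2}


def normalized_summary(groups, reference):
    """Return {group: (mean, s.e.m.)} after dividing every value by the reference-group mean."""
    ref_mean = mean(groups[reference])
    summary = {}
    for name, values in groups.items():
        scaled = [v / ref_mean for v in values]
        sem = stdev(scaled) / sqrt(len(scaled))  # sample s.d. divided by sqrt(n)
        summary[name] = (mean(scaled), sem)
    return summary


def additive_slope(groups, allele_count):
    """Least-squares slope of expression on minor-allele count (0, 1 or 2)."""
    xs, ys = [], []
    for name, values in groups.items():
        xs.extend([allele_count[name]] * len(values))
        ys.extend(values)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Normalize to the major-haplotype homozygotes, as in panels c and d.
summary = normalized_summary(expression, reference="homozygote_major")
# Change in expression per additional copy of the minor allele.
slope = additive_slope(expression, MINOR_ALLELE_COUNT)
```

Dividing by the reference-group mean puts that group at 1.0 by construction, so the other bars read directly as fold differences. A full analysis would also adjust for covariates and report a P value for the slope, which this sketch omits.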


A causal 1p13 noncoding DNA variant

We performed fine mapping of the 1p13 locus to define the minimal DNA region responsible for the LDL-C association. Because rs646776, rs599839, rs12740374 and rs629301 lie between CELSR2 and PSRC1, we used data from a recent GWAS of ~20,000 individuals of European descent⁷ to perform association analyses with LDL-C on these and other SNPs spanning the two genes. Out of 18 other SNPs, we identified two SNPs with P values comparable to rs646776, rs599839, rs12740374 and rs629301 (P values ranging from 1.8 × 10⁻⁴² to 8.3 × 10⁻⁴¹) and no SNPs with lower P values (Supplementary Fig. 2a). Together these six best SNPs cluster in a noncoding DNA region that is 6.1 kilobases in size, spanning the 3′ untranslated region (3′ UTR) of CELSR2, the intergenic region, and the PSRC1 3′ UTR oriented in the opposite direction (Fig. 2a). The six SNPs are in high linkage disequilibrium (LD) and comprise two predominant haplotypes in HapMap Europeans (CEU), with the ‘major’ haplotype present on 68% of chromosomes 1 and the ‘minor’ haplotype on 29% (Supplementary Fig. 3).

We identified two human bacterial artificial chromosomes (BACs) harbouring the major and minor haplotypes of the 6.1 kb region. We sequenced the region on each of the BACs in full and identified 16 polymorphisms (Supplementary Fig. 2b). From each BAC, the region spanning precisely between the stop codon of CELSR2 and the stop codon of PSRC1 was subcloned into firefly luciferase expression constructs just distal to the stop codon of the luciferase gene in either the ‘forward’ (CELSR2) or ‘reverse’ (PSRC1) orientation. On transfection of the constructs into Hep3B cultured human hepatocellular carcinoma cells, we found that in both orientations, the minor haplotype produced significantly greater luciferase expression than the major haplotype, consistent with the human liver eQTL analyses (Fig. 2b, compare to Fig. 1c). After localizing the haplotype-specific effect to the proximal 2.1 kb of the region (Supplementary Fig. 4), we tested an array of constructs in which single polymorphisms in the minor haplotype were switched to major alleles. We identified the SNP rs12740374 as being sufficient to confer the haplotype-specific effect (Figs 2c and 3c).

Figure 2 | rs12740374 is responsible for haplotype-specific difference in transcriptional activity. a, Map of 1p13 SNPs genotyped in ~20,000 individuals of European descent relative to CELSR2 and PSRC1 genes. The six SNPs with strongest association with LDL-C (indicated with boxes), comprising a single haplotype, define the 6.1 kb region between the stop codons of the two genes. b, Firefly luciferase expression from constructs transfected into Hep3B human hepatoma cells. Both the major (darker colours) and minor (lighter colours) haplotypes of the 6.1 kb region were subcloned in forward and reverse orientations into a basal firefly luciferase construct with the SV40 promoter. Shown are ratios of firefly luciferase expression to Renilla luciferase expression (expressed from cotransfected plasmid), measured 48 h after transfection, normalized to the mean ratio from the major haplotype, forward orientation construct. Error bars show s.e.m., n = 2. c, Both the major and minor haplotypes of a minimal 2.1 kb region were subcloned into the basal construct. Single nucleotide alterations were introduced individually into the minor haplotype, changing minor alleles of SNPs into major alleles. Shown are ratios of firefly luciferase expression to Renilla luciferase expression normalized to the mean ratio from the major haplotype construct. Error bars show s.e.m., n = 4.

[Figure 2 graphic not reproduced. Panel a: chr1, 109,610,000 to 109,625,000, showing the CELSR2 and PSRC1 3′ UTRs and the SNPs rs4970833, rs653635, rs6689614, rs6657811, rs2281894, rs17035630, rs17035665, rs4970834, rs611917, rs12740374, rs660240, rs658435, rs629301, rs646776, rs17035949, rs602633, rs599839, rs10410, rs14000, rs657420, rs672569 and rs608196. Panel b: luciferase expression for Hap1 (major) and Hap2 (minor) in forward and reverse orientations, with the SV40 basal construct. Panel c: luciferase expression for Hap1 (major) 2.1 kb, Hap2 (minor) 2.1 kb, and Hap2 carrying the major allele of rs11102967, rs12740374, rs660240, rs3832016, rs629301 or rs646776.]

We genotyped rs12740374 and 15 other SNPs in or near the 6.1 kb noncoding region in ~9,000 African Americans. Whereas six SNPs have indistinguishable evidence for association with LDL-C in Europeans, we found that, in African Americans, rs12740374 alone had the strongest evidence for association (P = 2.3 × 10⁻²⁰ for rs12740374 versus 9.2 × 10⁻¹⁵ at the next best 1p13 SNP) (Supplementary Fig. 2a). This is consistent with rs12740374 being in high LD with nearby SNPs in HapMap Europeans (CEU), but not so in HapMap Africans (YRI) (Supplementary Fig. 3).

We observed that rs12740374 alters a predicted binding site for C/EBP transcription factors, with the minor allele creating the site and the major allele disrupting it; the binding site is not present in the orthologous DNA region in mice (Fig. 3a). C/EBPα (also known as CEBPA) is a liver-enriched transcriptional factor that regulates the expression of numerous hepatic genes involved in a variety of metabolic processes¹⁷. We tested binding of the rs12740374 minor and major allele sequences by C/EBP with electrophoretic mobility shift assays and found the minor allele sequence to be shifted as much as a classic C/EBP binding sequence¹⁸, with minimal shifting of the major allele sequence; addition of either of two C/EBPα antibodies impaired the binding (Fig. 3b).

We tested luciferase constructs in Hep3B cells expressing a dominant negative C/EBP protein (A-C/EBP)¹⁹,²⁰ and found significantly reduced differences in haplotype-specific expression (Fig. 3d and Supplementary Fig. 5b). We also tested luciferase constructs in NIH 3T3 cultured mouse fibroblast cells (Fig. 3e) and found no haplotype-specific expression difference, consistent with liver specificity. Addition of C/EBPα to the 3T3 cells restored the haplotype-specific effect (Fig. 3e). We altered other nucleotides besides rs12740374 in the consensus binding site predicted to be critical for C/EBP protein–DNA interactions²¹ and found that they were needed for transcriptional activation by the minor

haplotype (Fig. 3c). Furthermore, we determined that C/EBPα binds to the site of rs12740374 in homozygous minor allele cells by chromatin immunoprecipitation (Supplementary Fig. 5c).

We tested whether C/EBP proteins can influence SORT1 expression via rs12740374. When we added A-C/EBP to Hep3B cells that are homozygous for the major allele, there was no difference in SORT1 expression (Fig. 3f). In contrast, when we added A-C/EBP to SK-HEP-1 cultured human hepatoma cells that are heterozygous (one minor allele), we observed a threefold reduction in SORT1 expression (Fig. 3f). When we added C/EBPα to human embryonic stem (ES) cells that are homozygous for the minor allele (HUES-1), there was no change in SORT1 expression, presumably because ES cells do not harbour cofactors needed for transcriptional activation (Supplementary Fig. 5d). When we differentiated HUES-1 cells into endoderm, the first step towards hepatocyte differentiation²², addition of C/EBPα resulted in significantly increased SORT1 expression; in contrast, human ES cells homozygous for the major allele (HUES-9), when differentiated into endoderm, showed no expression difference (Supplementary Fig. 5d).

Together, these findings indicate that rs12740374 is the causal variant responsible for the liver-specific association between the 1p13 locus and gene expression and, by extension, the associations with LDL-C and MI risk.

Figure 3 | rs12740374 alters a C/EBP transcription factor binding site. a, The human DNA sequence surrounding rs12740374, major and minor alleles, and orthologous DNA sequence in mouse. The major allele of rs12740374 disrupts one of two core elements (position 2, 3 and 8, 9) in the predicted consensus binding site on which a C/EBP dimer binds²¹. b, Electrophoretic mobility shift assays (EMSA) with labelled probes matching the C/EBP consensus binding site¹⁸, the rs12740374 minor allele (T) sequence, and the rs12740374 major allele (G) sequence. Competition assays were performed with 100-fold excess of cold probe. Either of two C/EBPα antibodies was used to compete for binding and/or shift the protein–DNA complex. c, Relative firefly luciferase expression from constructs with haplotypes of 2.1 kb region transfected into Hep3B cells. Single nucleotide alterations were introduced into constructs as indicated, altering rs12740374 and the three other core recognition nucleotides in the predicted C/EBP binding site. d, e, Relative firefly luciferase expression from constructs with haplotypes of 6.1 kb region transfected into (d) Hep3B human hepatoma cells with or without concomitant transduction with A-C/EBP (dominant negative C/EBP) cDNA via lentivirus and (e) NIH 3T3 fibroblasts with or without concomitant transduction with C/EBPα cDNA via lentivirus. f, Relative SORT1 expression, determined as a ratio with B2M expression by qRT–PCR, in Hep3B cells (homozygous major (GG) at rs12740374) or SK-HEP-1 human hepatoma cells (heterozygous (GT) at rs12740374) with or without concomitant transduction with A-C/EBP cDNA via lentivirus. Error bars show s.e.m., n = 3 for each experiment.

[Figure 3 graphic not reproduced. Panel a sequences: minor, TGGCTCGGCTGCCCTGAGGTTGCTCAATCAAGCACAGGT; major, TGGCTCGGCTGCCCTGAGGGTGCTCAATCAAGCACAGGT; mouse, TGGCATGGTGGCCCTGAGGGGGC____CCCAGCACAGGT. Panel b: EMSA with HepG2 (liver cell) nuclear extract; T = minor allele, G = major allele. Panels c to e: luciferase expression. Panel f: SORT1 expression in control cells versus cells with A-C/EBP.]

Sort1 in mouse liver alters plasma lipids

Of the genes differentially expressed in human liver by 1p13 genotype, the SORT1 gene showed the largest difference (Fig. 1c). SORT1 encodes the sortilin protein²³, also known as neurotensin receptor 3, a protein that functions as a multiligand sorting receptor. Sortilin localizes to various intracellular compartments including the Golgi apparatus and has roles in both endocytosis and intracellular trafficking of other proteins²⁴. To model the functional effects of altered SORT1 expression on lipids and lipoproteins, we performed knockdown and overexpression studies of Sort1 in the livers of mice. Importantly, we chose approaches to specifically alter gene expression in liver, because variation at the 1p13 locus results in a large SORT1 expression change in liver, but no change in adipose tissues (Fig. 1c). Because sortilin is known to be highly expressed and have important physiological roles in adipocytes and neurons²⁵,²⁶, we felt it was most appropriate to restrict knockdown and overexpression to liver to model the effects of the 1p13 locus on phenotype. Because wild-type mice have very low levels of plasma LDL-C compared to humans, we used ‘humanized’ mice of various genetic backgrounds for our studies (Supplementary Fig. 6).

Adeno-associated virus serotype 8 (AAV8) has been demonstrated to appropriately target genes for specific expression in liver²⁷,²⁸. AAV8 vector encoding the murine Sort1 gene driven by a liver-specific promoter (thyroglobulin) was delivered to mouse liver via intraperitoneal injection. A null AAV8 vector was used as a control. The Sort1 AAV resulted in increased sortilin levels in liver with no change in adipose tissue (Supplementary Fig. 7a). Use of these viral vectors did not result in elevated alanine aminotransferase (ALT) levels (Supplementary Fig. 7b).

When compared with mice injected with null virus, Sort1-overexpressing Apobec1–/–; APOB Tg mice showed a marked decrease in total plasma cholesterol (70% reduction at 2 weeks, 46% reduction at 6 weeks) and LDL-C (73% reduction at 2 weeks) (Fig. 4a, d); consistent results were seen in three other mouse backgrounds (Supplementary Figs 6 and 7c–f). At 6 weeks the mice had a 73% reduction in very small LDL particles and an 88% reduction in medium small LDL particles (Fig. 4b), resulting in increased LDL peak particle size (22.0 nm versus 20.9 nm, P = 0.05). These gain-of-function studies in mice were concordant with the genetic findings in human cohorts, in whom the 1p13 minor haplotype was associated with increased liver SORT1 expression as well as decreased LDL-C and, especially, very small LDL particles (Fig. 1a, c).

To study hepatic VLDL secretion, we administered Pluronic F-127 detergent to the AAV-injected mice and measured lipoproteins at serial time points. We found a 57% decrease in the rate of VLDL secretion (Fig. 4c) and a similar decrease in the rate of triglyceride secretion (data not shown) in Sort1-overexpressing mice.

Chemically synthesized small interfering RNA (siRNA)-mediated knockdown of Apob or Pcsk9 in liver has been successful in determining the effects of these genes on plasma lipid levels²⁹,³⁰. We used a similar approach to reduce Sort1 expression in mouse liver. We identified siRNA duplexes that effected >90% knockdown of Sort1 expression in cells (Supplementary Fig. 8a). We selected one chemically modified duplex with a low half-maximal inhibitory concentration (IC₅₀) that did not induce cytokines in a human peripheral blood mononuclear cell assay (Supplementary Fig. 8b; data not shown) for large-scale preparation in a lipidoid formulation optimized for liver-specific delivery³¹ and injection into mouse tail veins. As a negative control in some experiments, we used a chemically modified, non-immunostimulatory siRNA duplex specific for the firefly luciferase gene. Sort1 siRNA achieved 70–80% reduction in Sort1 expression in liver, confirmed to be due to siRNA-mediated cleavage, as well as reduced sortilin levels in liver with no change in adipose tissue (Supplementary Fig. 9a–c).

Sort1 knockdown in Apobec1–/–; APOB Tg mice resulted in a 46% increase in total cholesterol compared to control mice at 2 weeks, with a more than twofold increase in LDL-C (Fig. 4e, f). Consistent results were seen in two other mouse backgrounds, as well as a significant increase in the plasma VLDL level (Supplementary Figs 6 and 9e–g). We also compared plasma lipid levels in Sort1 knockout mice³² and wild-type mice and observed significantly higher total cholesterol and LDL-C levels in the knockout mice (Supplementary Fig. 9h), consistent with the results of liver-specific knockdown.

To confirm that the altered plasma VLDL levels in the overexpression and knockdown mice were due specifically to altered VLDL secretion from hepatocytes, we performed labelling experiments using primary hepatocytes isolated from these mice. With Sort1 knockdown, we observed a significant increase in labelled apoB-100 secretion; with Sort1 overexpression, there was decreased apoB-100 secretion (Supplementary Fig. 10).

Besides SORT1, PSRC1 displayed the greatest differential expression in human liver by 1p13 genotype (Fig. 1c). We used an AAV8 vector encoding the murine Psrc1 gene for mouse liver overexpression and did not observe any significant changes in total cholesterol or LDL-C levels (Fig. 4g, h).

Figure 4 | Overexpression or knockdown of Sort1 in mouse liver alters plasma lipids and lipoproteins. Adeno-associated virus 8 (AAV8) vectors either containing no gene, murine Sort1 cDNA or murine Psrc1 cDNA were administered via intraperitoneal injection; phosphate-buffered saline or siRNA duplex targeting firefly luciferase or mouse Sort1 and prepared in lipidoid formulation was administered weekly via tail vein injection at 2.0 mg kg⁻¹. Plasma samples were collected before injection and at various time points after injection, and were subjected: individually to analytical chemistry (Mira autoanalyser) to measure total cholesterol (a, e, g); as pooled samples to FPLC (d, f, h), from which fractions 10 to 26 were used to calculate LDL-C levels (a, e, g); individually to NMR to measure LDL particle concentrations (b). P values calculated with unpaired t-test, shown if P < 0.05. Error bars show s.e.m. a–d, Apobec1–/–; APOB Tg mice (five mice per group). b, NMR measurements at 6 weeks. c, Mice were injected intraperitoneally with Pluronic F-127 detergent to block VLDL triglyceride lipolysis and permit assessment of the rate of VLDL secretion. Plasma samples were collected at baseline, 1 h, 2 h and 4 h after injection. VLDL particle concentrations were measured from pooled samples with NMR. e, f, Apobec1–/–; APOB Tg mice (five mice per group). g, h, Ldlr–/– mice (five mice per group).

[Figure 4 graphic not reproduced. Panels a, e, g: total cholesterol (Mira, mg dl⁻¹) and normalized LDL-C (pooled FPLC) for AAV Sort1 versus AAV null, Sort1 siRNA versus control, and AAV Psrc1 versus AAV null, n = 5 per group. Panel b: LDL-L, LDL-M/S and LDL-VS particle concentrations (NMR, nmol l⁻¹). Panel c: pooled VLDL (nmol l⁻¹) against time (h), with annotated rates of 217 nmol l⁻¹ h⁻¹ and 94 nmol l⁻¹ h⁻¹. Panels d, f, h: cholesterol (pooled FPLC, μg μl⁻¹) against fraction number.]

A novel lipoprotein regulatory pathway

Through a series of studies in human cohorts, mice and hepatocytes, we provide evidence that a single noncoding DNA variant at the chromosome 1p13 locus, rs12740374, influences LDL-C and MI risk via liver-specific transcriptional regulation of the SORT1 gene by C/EBP transcription factors. The clinical importance of this novel

pathway is defined by the ~40% difference in MI risk between alternative 1p13 homozygotes, an effect comparable to those of common variants of LDLR and PCSK9 and larger than the effects of common variants in HMGCR (the target of statin drugs)¹¹,¹². As the 1p13 minor allele frequency is about 30% in Europeans and is also common in other ethnicities including African Americans, Hispanics, Asian Indians and Chinese⁸,³³, this locus is an important global genetic determinant of MI risk. We note that among lipid-regulating genes related to MI, SORT1 is unique for having been identified by GWAS mapping of common DNA variants, rather than by discovery of rare gene variants underlying Mendelian disorders.

In conclusion, our results nominate SORT1 as the causal gene at the 1p13 locus for LDL-C and MI and the sortilin pathway as a promising new target for therapeutic intervention in the reduction of LDL-C and prevention of MI. They also provide insights into mechanisms by which common noncoding genetic variants can lead to clinical phenotypes, rather than simply being markers for disease.

METHODS SUMMARY

The full Methods provides information about all experimental procedures: (1) description of association analyses in the population cohorts; (2) description of genotype-expression analyses in human liver, subcutaneous adipose, and omental adipose samples; (3) details for generation of luciferase expression constructs; (4) details for conducting luciferase expression assays; (5) details for conducting SORT1 expression assays; (6) details for performing electrophoretic mobility shift assays; (7) details for performing chromatin immunoprecipitation assays; (8) description of siRNA screening and validation; (9) details for performing gene knockdown studies in mouse liver; (10) details for performing gene overexpression studies in mouse liver; (11) details for measuring lipids and lipoproteins by analytic chemistry, fast protein liquid chromatography, and NMR; (12) details for performing VLDL secretion studies; and (13) details for performing hepatocyte apoB studies.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 3 August 2009; accepted 9 June 2010.

1. Rader, D. J. et al. Monogenic hypercholesterolemia: new insights in pathogenesis and treatment. J. Clin. Invest. 111, 1795–1803 (2003).
2. Brown, M. S. & Goldstein, J. L. Heart attacks: gone with the century? Science 272, 629 (1996).
3. , D. D. et al. Lipid treatment assessment project 2: a multinational survey to evaluate the proportion of patients achieving low-density lipoprotein cholesterol goals. Circulation 120, 28–34 (2009).
4. Kathiresan, S. et al. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nature Genet. 40, 189–197 (2008).
5. Willer, C. J. et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nature Genet. 40, 161–169 (2008).
6. Wallace, C. et al. Genome-wide association study identifies genes for biomarkers of cardiovascular disease: serum urate and dyslipidemia. Am. J. Hum. Genet. 82, 139–149 (2008).
7. Kathiresan, S. et al. Common variants at 30 loci contribute to polygenic dyslipidemia. Nature Genet. 41, 56–65 (2009).
8. Aulchenko, Y. S. et al. Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nature Genet. 41, 47–55 (2009).
9. Sabatti, C. et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nature Genet. 41, 35–46 (2009).
10. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature doi:10.1038/nature09270 (this issue).
11. Samani, N. J. et al. Genomewide association analysis of coronary artery disease. N. Engl. J. Med. 357, 443–453 (2007).
12. Myocardial Infarction Genetics Consortium. Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nature Genet. 41, 334–341 (2009).
13. Berneis, K. K. & Krauss, R. M. Metabolic origins and clinical significance of LDL heterogeneity. J. Lipid Res. 43, 1363–1379 (2002).
14. Musunuru, K. et al. Ion mobility analysis of lipoprotein subfractions identifies three independent axes of cardiovascular risk. Arterioscler. Thromb. Vasc. Biol. 29, 1975–1980 (2009).
15. Siri-Tarino, P. W., Williams, P. T., Fernstrom, H. S., Rawlings, R. S. & Krauss, R. M. Reversal of small, dense LDL subclass phenotype by normalization of adiposity. Obesity (Silver Spring) 17, 1768–1775 (2009).
16. Linsel-Nitschke, P. et al. Genetic variation at chromosome 1p13.3 affects sortilin mRNA expression, cellular LDL-uptake and serum LDL levels which translates to the risk of coronary artery disease. Atherosclerosis 208, 183–189 (2010).
17. Darlington, G. J., Wang, N. & Hanson, R. W. C/EBP alpha: a critical regulator of genes governing integrative metabolic processes. Curr. Opin. Genet. Dev. 5, 565–570 (1995).
18. Osada, S., Yamamoto, H., Nishihara, T. & Imagawa, M. DNA binding specificity of the CCAAT/enhancer-binding protein transcription factor family. J. Biol. Chem. 271, 3891–3896 (1996).
19. Olive, M., Williams, S. C., Dezan, C., Johnson, P. F. & Vinson, C. Design of a C/EBP-specific, dominant-negative bZIP protein with both inhibitory and gain-of-function properties. J. Biol. Chem. 271, 2040–2047 (1996).
20. Ahn, S. et al. A dominant-negative inhibitor of CREB reveals that it is a general mediator of stimulus-dependent transcription of c-fos. Mol. Cell. Biol. 18, 967–977 (1998).
21. Miller, M., Shuman, J. D., Sebastian, T., Dauter, Z. & Johnson, P. F. Structural basis for DNA recognition by the basic region leucine zipper transcription factor CCAAT/enhancer-binding protein alpha. J. Biol. Chem. 278, 15178–15184 (2003).
22. Si-Tayeb, K. et al. Highly efficient generation of human hepatocyte-like cells from induced pluripotent stem cells. Hepatology 51, 297–305 (2010).
23. Petersen, C. M. et al. Molecular identification of a novel candidate sorting receptor purified from human brain by receptor-associated protein affinity chromatography. J. Biol. Chem. 272, 3599–3605 (1997).
24. Nielsen, M. S. et al. The sortilin cytoplasmic tail conveys Golgi-endosome transport and binds the VHS domain of the GGA2 sorting protein. EMBO J. 20, 2180–2190 (2001).
25. Nykjaer, A. et al. Sortilin is essential for proNGF-induced neuronal cell death. Nature 427, 843–848 (2004).
26. Shi, J. & Kandror, K. V. Sortilin is essential and sufficient for the formation of Glut4 storage vesicles in 3T3–L1 adipocytes. Dev. Cell 9, 99–108 (2005).
27. Kitajima, K. et al. Complete prevention of atherosclerosis in apoE-deficient mice by hepatic human apoE gene transfer with adeno-associated virus serotypes 7 and 8. Arterioscler. Thromb. Vasc. Biol. 26, 1852–1857 (2006).
28. Tanigawa, H. et al. Expression of cholesteryl ester transfer protein in mice promotes macrophage reverse cholesterol transport. Circulation 116, 1267–1273 (2007).
29. Soutschek, J. et al. Therapeutic silencing of an endogenous gene by systemic administration of modified siRNAs. Nature 432, 173–178 (2004).
30. Frank-Kamenetsky, M. et al. Therapeutic RNAi targeting PCSK9 acutely lowers plasma cholesterol in rodents and LDL cholesterol in nonhuman primates. Proc. Natl Acad. Sci. USA 105, 11915–11920 (2008).
31. Akinc, A. et al. A combinatorial library of lipid-like materials for delivery of RNAi therapeutics. Nature Biotechnol. 26, 561–569 (2008).
32. Zeng, J., Racicott, J. & Morales, C. R. The inactivation of the sortilin gene leads to a partial disruption of prosaposin trafficking to the lysosomes. Exp. Cell Res. 315, 3112–3124 (2009).
33. Keebler, M. E. et al. Association of blood lipids with common DNA sequence variants at 19 genetic loci in the multiethnic United States National Health and Nutrition Examination Survey III. Circ. Cardiovasc. Genet. 2, 238–243 (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank D. Altshuler, E. Fisher and J. Maraganore for advice and guidance, and A. Akinc, J. Billheimer, R. Brown, R. Camahort, D. Cromley, E. Eduoard, I. Fuki, C. Geaney, G. Hinkle, I. Kohaar, S. Kuchimanchi, W. Lagor, F. Lau, D. Lum, M. Maier, D. Marchadier, R. Meyers, J. Millar, S. Milstein, D. Nguyen, D. Perez, D. Peters, V. Redon, A. Rigamonti, R. Schinzel, M.-S. Sun, S.-A. Toh, A. Wilson and K. Wojnoonski for assistance and suggestions. We acknowledge the National Heart, Lung, and Blood Institute (NHLBI) Gene Therapy Resource Program for providing support for viral vector production as well as the Vector Core laboratory of the University of Pennsylvania for producing the vectors. We acknowledge the members of the NHLBI Candidate Gene Association Resource (CARe) lipids working group for the contribution of association data in African Americans. This work was supported in part by a T32 grant in Cell and Molecular Training for Cardiovascular Biology from the United States National Institutes of Health (NIH), K99-HL098364 from the NIH, and the Clinician Scientist Program of the Harvard Stem Cell Institute (K.M.); a Medical Scientist Training Program grant from the NIH (A.S.); the intramural research program of the Division of Cancer Epidemiology & Genetics, National Cancer Institute, NIH (L.P.-O.); the Swedish Medical Research Council, Heart-Lung Foundation, and Påhlsson Foundation (M.O.-M., O.M.); U01-HL069757 from the NIH and research support from Quest Diagnostics, Inc. (R.M.K.); RC2-HL101864 from the NIH (S.K.); and P01-HL059407 and RC2-HL101864 from the NIH and a “Freedom to Discover” Unrestricted Cardiovascular Research Grant from Bristol-Myers Squibb (D.J.R.).

Author Contributions K.M., A.S., M.F.-K., N.E.L., T.A., K.V.S., X.L., H.L., N.K., V.M.R., J.J.P., B.M., L.P.-O., J.L.H., E.E.S., C.R.M., S.L.-K., M.C.P., J.W., W.C., T.R., K.G.E., M.O.-M., O.M. and R.M.K. carried out experimental work and/or performed data analysis. V.K., K.F., C.A.C., S.K. and D.J.R. supervised the study. K.M., A.S., S.K. and D.J.R. conceived and designed the study. K.M. wrote the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to D.J.R. ([email protected]) or S.K. ([email protected]).

doi:10.1038/nature09266

METHODS

Association analyses. The Malmö Diet and Cancer Study – Cardiovascular Cohort (MDC-CC) is a prospective, community-based epidemiological cohort of 6,103 residents of Malmö, Sweden, for whom a comprehensive analysis of cardiovascular risk factors has been performed. The 1p13 SNP rs646776 was genotyped as described previously14. The ion mobility method of lipoprotein measurement was applied to archived baseline blood samples from these individuals to quantify directly the full spectrum of lipoprotein particles, as described previously14. Multivariable linear regression analyses were used to test whether each of the lipid or lipoprotein measures differed according to an increasing copy number of the SNP minor allele, adjusted for age, gender and diabetes status. SPSS (version 16.0) was used for the analyses.

The Pharmacogenomics and Risk of Cardiovascular Disease (PARC) study is a two-stage genome-wide association (GWA) study34. In stage 1, 980 subjects were typed for 317,000 SNPs with the Illumina Human-1 BeadChip. In stage 2, 930 additional subjects were typed for a subset of 13,680 SNPs with the Illumina iSelect platform. All subjects were of self-reported European ancestry. The gradient gel electrophoresis method of lipoprotein measurement was applied to blood samples from these individuals to quantify directly the full spectrum of lipoprotein particles, as described previously15. Multivariable linear regression analyses were used to test whether each of the lipid or lipoprotein measures differed according to an increasing copy number of the SNP minor allele. JMP (SAS Institute) was used for the analyses.

Roughly 20,000 individuals of European descent were genotyped on various array platforms, and roughly 9,000 African American individuals were genotyped on the ITMAT-Broad-CARe Array (Illumina). Association analyses for LDL-C and meta-analyses were performed as described previously7,35.

Genotype-expression analyses. To evaluate whether SNPs serve as eQTLs with putative cis regulatory effects on liver and adipose gene expression traits, 782,476 SNPs had been genotyped and expression levels of 39,280 transcripts profiled in 960 human liver samples, 433 human subcutaneous adipose samples, and 520 human omental adipose samples. Tissue samples were either post-mortem or surgical resections from organ donors or elective cases. Methods for tissue collection, RNA and DNA isolation, expression profiling and DNA genotyping have been described previously36. The correlation of rs646776 minor allele count with each of the profiled transcripts was determined using linear regression analysis.

For the replication study in 62 liver samples, de-identified histopathologically normal human liver samples were provided by the University of Minnesota Academic Health Center’s Biological Materials Procurement Facility (BioNet; www.bionet.umn.edu). For each sample, 1 µg of DNAase-treated total RNA was converted into cDNA with random hexamers and SuperScript III reverse transcriptase (Invitrogen). cDNA samples were diluted with water and 2 ng of total RNA was used for each quantitative reverse transcriptase-polymerase chain reaction (qRT–PCR), performed with TaqMan Gene Expression Assays for SORT1, PSRC1, CELSR2, TCF7L2 and B2M (beta-2-microglobulin) and associated reagents (Applied Biosystems) according to the manufacturer’s protocol. Expression of all assays was measured in technical duplicates and average values of the duplicates were used for the analysis. The SORT1, PSRC1, CELSR2 and TCF7L2 expression values for each target were normalized by B2M expression values (ΔCt method) and were tested for normality of distribution before analysis. A pre-developed TaqMan genotyping allelic discrimination assay for SNP rs12740374 was used according to the manufacturer’s protocol (Applied Biosystems). A univariate linear regression analysis was used to test the associations between mRNA expression and the SNP coded by the number of major alleles and was performed with SPSS 16.0. Information on age and sex was tested as a covariate but was not included in the final analysis as it was not available for all samples.

Protein extracts from liver tissue samples were prepared by homogenization of ~30 mg of tissue with Tissue Lyser (Qiagen) in RIPA buffer (Invitrogen) in the presence of complete cocktail of proteinase inhibitors (Roche). Samples were subjected to immunoblotting and probed with anti-sortilin antibody (AF2934, R&D Systems) or anti-α-tubulin antibody as loading control (ab-7291-100, Abcam).

Luciferase expression constructs. To characterize the intergenic region between CELSR2 and PSRC1, the major (Hap1) and minor (Hap2) haplotypes from two bacterial artificial chromosomes (CTD-2068B15 and RP11-463O24, respectively; Invitrogen) were cloned into the pGL3-Promoter vector (Promega) in both the 5′-to-3′ and 3′-to-5′ orientations just downstream of the stop codon of the firefly luciferase gene. A naturally occurring BamHI site was used to generate constructs with truncations and composites of the two haplotypes. PCR was used to generate smaller truncations. The QuikChange Site-Directed Mutagenesis Kit (Stratagene) was used to alter single nucleotides (that is, SNP alleles). All constructs were verified by DNA sequencing.

Luciferase expression assays. Hep3B cultured human hepatoma cells, BNL CL.2 cultured mouse embryonic liver cells or NIH 3T3 cultured mouse fibroblast cells were transfected at roughly 50% confluence and maintained in DMEM with 10% FBS. In some experiments, cells were infected with a lentivirus encoding the C/EBPα cDNA or a lentivirus encoding the A-C/EBP19,20 (dominant negative C/EBP) cDNA 24 h before transfection. The firefly luciferase constructs were co-transfected with the Renilla luciferase pRL-CMV Vector (Promega) using the FuGENE 6 transfection reagent (Roche) in the ratio 1 µg:100 ng:3 µl mixed with Opti-MEM I Reduced Serum Medium (Invitrogen) for a 100 µl mix, of which 20 µl was used for each well of 24-well plates. Forty-eight hours after transfection, firefly and Renilla luciferase activities were measured using the Dual-Luciferase Reporter Assay System (Promega) according to the manufacturer’s protocol, using untransfected cells to adjust for background activity.

SORT1 expression assays. Hep3B cultured human hepatoma cells or SK-HEP-1 cultured human hepatoma cells were seeded at roughly 50% confluence in 24-well plates and infected with a lentivirus encoding the A-C/EBP (dominant negative C/EBP) cDNA or mock-infected; virus was removed after 24 h. The cells were maintained in DMEM with 10% FBS. Total RNA was isolated with the RNeasy Mini Kit (Qiagen) 72 h after infection.

HUES-1 or HUES-9 human embryonic stem cells were seeded at roughly 50% confluence on Geltrex matrix (Invitrogen) and initially maintained on mTeSR1 medium (StemCell Technologies). The cells were then switched to and maintained on EndoMedia (RPMI-B27 medium, supplemented with 100 ng ml⁻¹ human recombinant activin A, Invitrogen) for 7 days to induce differentiation into definitive endoderm. Successful differentiation was confirmed in parallel experiments by monitoring morphological changes and detecting expression of endoderm-specific markers SOX17 and GATA4. C/EBPα was expressed through lentiviral infection during the last 2 days of differentiation, followed by isolation of total RNA with the RNeasy Mini Kit (Qiagen).

For each sample, 2 µg of total RNA was converted into cDNA with the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems). qRT–PCR was performed with TaqMan Gene Expression Assays for SORT1 and B2M and associated reagents (Applied Biosystems) according to the manufacturer’s protocol. The SORT1 expression values for each target were normalized by B2M expression values (ΔCt method).

Electrophoretic mobility shift assays (EMSA). Primers with the consensus C/EBPα binding site were described previously18: C/EBPα-F, 5′-CTAGGCATATTGCGCAATATGC-3′; C/EBPα-R, 5′-GCATATTGCGCAATATGCCTAG-3′. Primers for rs12740374 were designed based on genomic sequences surrounding the SNP (http://www.ncbi.nlm.nih.gov/projects/SNP/): rs12740374_T-F, 5′-TGCCCTGAGGTTGCTCAATCA-3′; rs12740374_T-R, 5′-TGATTGAGCAACCTCAGGGCA-3′; rs12740374_G-F, 5′-TGCCCTGAGGGTGCTCAATCA-3′; rs12740374_G-R: 5′-TGATTGAGCACCCTCAGGGCA-3′. The variable nucleotide is shown in bold. All primers were ordered from Invitrogen. Individual primers were labelled with a biotin 3′ end DNA labelling kit (Pierce) according to instructions, and the efficiency of labelling was tested by a dot-test that confirmed that all the primers were labelled similarly. Corresponding forward and reverse primers were annealed to create 3′-end biotin-labelled double-stranded probes. EMSA reactions were performed with the biotin 3′-end DNA labelling kit (Pierce) according to instructions, with 8 µg of nuclear extract from HepG2 cultured human hepatoma cells per reaction (Active Motif). For competition assays, we used 100-fold excess of unlabelled probe. To test for involvement of CEBP/α in interaction with the probes, we preincubated the HepG2 nuclear extract for 15 min at room temperature with either of two antibodies for CEBP/α (39306, Active Motif; 2295, Cell Signaling). The protein complexes were resolved on 6% DNA retardation gels (Invitrogen) for 1 h at 100 V, transferred to Biodyne B Nylon Membranes (Pierce), crosslinked, and processed with the Chemiluminescent Nucleic Acid Detection Module (Pierce).

Chromatin immunoprecipitation assays. HUES-1 human embryonic stem cells were seeded at roughly 50% confluence on Geltrex matrix (Invitrogen) and maintained on mTeSR1 medium (StemCell Technologies). The cells were infected with a lentivirus encoding C/EBPα; virus was removed after 24 h. Seventy-two hours after infection, the cells were harvested and cross-linked with 4% paraformaldehyde at 37 °C for 10 min followed by quenching with glycine and flash freezing. After thawing, the lysates were sonicated in RIPA buffer 25 times for 10 s at 4 °C. The lysates were precipitated with anti-C/EBPα antibody (2295, Cell Signaling) at 1:40 dilution overnight versus no antibody. After incubation with Protein G Sepharose beads (GE Healthcare) for 2 h at room temperature, serial washes, and elution, DNA was recovered by addition of sodium chloride and incubation overnight at 65 °C, followed by treatment with proteinase K and RNase A for 2 h at 42 °C. DNA was purified with the QIAquick PCR Purification Kit (Qiagen). The presence of immunoprecipitated DNA sequence around rs12740374 was assayed by quantitative PCR using the primers 5′-CTGAGGTTGCTCAATCAAGCGCTTGATTGAGCAACCTCAG-3′


and 5′-CTGAGGGTGCTCAATCAAGCGCTTGATTGAGCACCCTCAG-3′ and the probe 5′-FAM-AGCCAGCACTGTGTTTACTCTTCCTC-Iowa Black-3′ (Integrated DNA Technologies). The values for the immunoprecipitated target were normalized by values for the target from 1:30 dilution of input chromatin.

siRNA screening and validation. siRNA design was carried out to identify siRNAs targeting both homologues of the gene SORT1 from human (symbol SORT1) and mouse (symbol Sort1). The design used the SORT1 transcripts NM_002959.4 (human) and NM_019972.2 (mouse) from the NCBI RefSeq collection. siRNA duplexes were designed with 100% identity to both respective SORT1 genes. To select appropriate candidate target sequences and their corresponding siRNAs, their predicted potentials for interacting with irrelevant targets (off-target potentials) were used as a ranking parameter. siRNAs with low off-target potentials were defined as preferable and assumed to be more specific in vivo. To identify potential off-target genes, 19-mer candidate sequences were subjected to a homology search against the human and mouse RefSeq mRNA databases. The following off-target properties for each 19-mer input sequence were extracted for each off-target gene to calculate the off-target score: number of mismatches in non-seed region, number of mismatches in seed region, and number of mismatches in cleavage site region. The 29 siRNAs with best off-target scores were selected for synthesis and screening.

Single-stranded RNAs were produced at Alnylam Pharmaceuticals as described previously31,30. Deprotection and purification of the crude oligoribonucleotides by anion exchange high performance liquid chromatography (HPLC) were carried out according to established procedures. siRNAs were generated by annealing equimolar amounts of complementary sense and antisense strands.

For screening transfection experiments, BNL CL.2 cultured mouse embryonic liver cells were seeded at 4 × 10⁴ cells per well in 24-well plates and reverse transfected with the siRNAs using Lipofectamine RNAiMAX (Invitrogen) according to the manufacturer’s protocol. Total RNA was isolated with the RNeasy Mini Kit (Qiagen) 48 h after transfection. For each sample, 2 µg of total RNA was converted into cDNA with the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems). qRT–PCR was performed with TaqMan Gene Expression Assays for Sort1 and 18S rRNA and associated reagents (Applied Biosystems) according to the manufacturer’s protocol. The Sort1 expression values for each target were normalized by 18S rRNA expression values (ΔCt method). A half-maximal inhibitory concentration (IC50) curve was determined for the Sort1 duplex yielding the greatest degree of knockdown. The sequences for this duplex were: 5′-uGucAGAAuGGucGAGAcudTsdT-3′ and 5′-AGUCUCGACcAUUCUGAcAdTsdT-3′ (2′-OMe modified nucleotides are in lower case, and phosphorothioate linkages are indicated by ‘s’). A previously validated siRNA duplex targeting the luciferase gene was used31.

Gene knockdown studies in mouse liver. Lipidoid formulations of siRNAs were prepared as described previously31. Mice received either phosphate-buffered saline (PBS) or formulated siRNAs via weekly tail vein injection at dosages of 2.0 mg kg⁻¹. At various time points (including before injection), animals were anaesthetized by isoflurane inhalation, and blood was collected by retro-orbital bleed followed by centrifugation to isolate plasma. Mice were killed at 5 days or at 2 weeks after 4 h of fasting. After killing the mice, terminal bleeds were collected, and livers and adipose were collected and snap frozen in liquid nitrogen. Frozen tissue was ground, and tissue lysates were prepared. Sort1 mRNA levels relative to those of GAPDH mRNA were determined in the liver lysates by using the branched-DNA-technology-based QuantiGene Reagent System (Panomics), according to the manufacturer’s protocols. Sortilin and actin expression in liver and adipose was determined by immunoblotting (612100, BD Transduction Laboratories; ab20272, Abcam).

5′-RACE (rapid amplification of cloned/cDNA ends) was conducted as described previously30. In brief, an oligonucleotide adaptor was ligated to total liver RNA, and the ligation mixture was reverse transcribed using the Sort1-specific oligonucleotide 5′-TATTCCAGGAGGTCCTCATCTGAGTCGTC-3′, followed by cDNA amplification with the oligonucleotides 5′-CGACTGGAGCACGAGGACACTGACATGG-3′ and 5′-GGATTCATCCCACCTTGGCATTTGTCTC-3′. Nested PCR was performed using the oligonucleotides 5′-GGACACTGACATGGACTGAAGGAGTAG-3′ and 5′-GAAGTAGCCAAAGTCACAGAGGAAGTC-3′. PCR products were examined by gel electrophoresis, purified, and subcloned for sequencing.

Sort1−/− mice were generated as described previously32 and outbred to the C57BL/6 strain. Matched wild-type C57BL/6 mice were used as controls.

All mice were fed ad libitum with regular rodent chow. All procedures used in animal studies were approved by the pertinent Institutional Animal Care and Use Committee and were consistent with local, state and federal regulations as applicable.

Gene overexpression studies in mouse liver. The murine Sort1 cDNA (Origene, MR210834) was subcloned into a specialized vector for use by the University of Pennsylvania’s Penn Vector Core for production of AAV8 viral particles expressing Sort1. Viruses were produced with a chimaeric packaging construct in which the AAV2 rep gene was fused with the cap gene of AAV serotype 8 (ref. 27). Empty AAV8 viral particles were also provided by the Penn Vector Core.

Mice received either 1 × 10¹² viral particles of null AAV or 1 × 10¹² viral particles of AAV-encoding Sort1 in PBS via intraperitoneal injection. At various time points (including before injection), animals were anaesthetized by isoflurane inhalation and blood was collected by retro-orbital bleed followed by centrifugation to isolate plasma. Mice were killed at 6 weeks after 4 h of fasting. After killing the mice, terminal bleeds were collected and livers and adipose were collected and analysed for protein and gene expression as described above.

Measurement of mouse plasma lipids and lipoproteins. Collected mouse plasma samples were analysed for lipids by analytical chemistry and fast protein liquid chromatography (FPLC) and for lipoproteins by nuclear magnetic resonance (NMR). Total plasma cholesterol and alanine aminotransferase (ALT) were measured enzymatically on a Cobas Mira autoanalyser (Roche Diagnostic Systems). Pooled plasma from each experimental group (140 µl) was separated by FPLC gel filtration. Cholesterol and triglyceride plate assays were performed on FPLC fractions using the Infinity cholesterol and triglyceride reagents, respectively. Individual plasma samples were sent for NMR lipoprotein measurement (LipoScience).

VLDL secretion studies. To study hepatic VLDL secretion, mice were prebled by retro-orbital bleeding followed by intraperitoneal injection of 400 µl of 1 mg g⁻¹ Pluronic F-127 detergent resuspended in PBS. The mice were fasted for 4 h before injection and through the study. We performed serial retro-orbital bleeds at 1, 2, and 4 h after injection of the detergent. Plasma samples were individually subjected to triglyceride measurements by analytical chemistry (plate assays with the Infinity triglyceride reagent) and pooled together by experimental condition and sent for NMR analysis for VLDL measurement (LipoScience).

Primary hepatocyte apoB studies. Mice of the Apobec−/−; APOB Tg; Ldlr+/− or Apobec−/−; Ldlr−/− background that had been administered AAV vectors or siRNAs were used as the source of primary hepatocytes for all experiments. Mice were anesthetized with 2,2,2-tribromoethanol and then dissected to expose the liver, portal vein, and inferior vena cava. A catheter was inserted into the portal vein and sutured in place. The livers were perfused with buffer for 5 min to remove all red blood cells, followed by digestion in situ by running digestion media through the catheter for 15 min. The livers were transferred to 10 mm dishes with 15 ml of hepatocyte wash media and run through a mesh into 50 ml conical tubes to separate the cells. The cells were centrifuged at 50g at 4 °C to remove Kupffer cells. The hepatocyte pellets were washed twice with hepatocyte wash media and resuspended in 25 ml PBS + 25 ml of Percoll solution (45 ml Percoll + 5 ml 10× PBS + 100 µl of 1 M HEPES). The cells were then centrifuged at 115g for 5 min at 4 °C to pellet the viable hepatocytes. The hepatocytes were resuspended in Hepatozyme medium + 10% FBS + 1% amino acids and plated at one million cells per well. A subset of the cells was analysed for sortilin and actin protein expression as described above.

For labelling experiments, cells were switched to cystine/methionine-free DMEM with 1% FBS, 1% antibiotics/antimycotics, and 0.4 mM oleic acid for 1 h, followed by addition of 200 µCi per well of ³⁵S-methionine/cysteine. After 3 h, media from the cells were harvested, and apoB was immunoprecipitated with the antibody ab20737 (Abcam). The immunoprecipitate was subjected to SDS–PAGE, and the gel was exposed to film at –80 °C for 3 days to 2 weeks. Relative secreted apoB-100 levels were determined by quantification of appropriately sized bands by densitometry.

To determine relative total secreted protein levels, 50 µl of 2 mg ml⁻¹ BSA and 25 µl of 50% trichloroacetic acid (TCA) were added to 50 µl of harvested media, followed by incubation on ice for 20 min. The samples were centrifuged for 15 min, and the pellets were washed with 1 ml of 50% TCA and resuspended by boiling in 1 ml of 0.2 M NaOH. The NaOH suspension (200 µl) was analysed in a scintillation counter for ³⁵S counts.

34. Reiner, A. P. et al. Polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1α are associated with C-reactive protein. Am. J. Hum. Genet. 82, 1193–1201 (2008).
35. Musunuru, K. et al. Candidate Gene Association Resource (CARe): design, methods, and proof of concept. Circ. Cardiovasc. Genet. 3, 267–275 (2010).
36. Schadt, E. E. et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 6, e107 (2008).
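The Methods above normalize target-gene qRT–PCR values to a reference gene (B2M or 18S rRNA) by the ΔCt method, using the average of technical duplicates. The source does not spell out the arithmetic, so the following is only a minimal sketch of the conventional calculation. The function names are hypothetical, and the conversion to a linear scale assumes perfect doubling of product per cycle, which the source does not state.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Return mean Ct(target) minus mean Ct(reference).

    Each argument is a list of technical-replicate Ct values
    (for example, the duplicates described in the Methods).
    """
    return mean(target_cts) - mean(reference_cts)


def relative_expression(target_cts, reference_cts):
    """Convert a delta-Ct to a linear-scale expression value, 2 ** (-delta_ct).

    Assumption (not from the source): amplification efficiency is 100%,
    i.e. product doubles every cycle.
    """
    return 2.0 ** (-delta_ct(target_cts, reference_cts))


# Hypothetical duplicate Ct values for a target gene and the reference gene.
example = relative_expression([24.1, 23.9], [20.0, 20.0])
```

A larger ΔCt means the target crosses the detection threshold later than the reference, so it corresponds to lower relative expression.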

Vol 466 | 5 August 2010 | doi:10.1038/nature09201 ARTICLES

The Amphimedon queenslandica genome and the evolution of animal complexity

Mansi Srivastava1†, Oleg Simakov2†, Jarrod Chapman3, Bryony Fahey4, Marie E. A. Gauthier4†, Therese Mitros1, Gemma S. Richards4†, Cecilia Conaco5, Michael Dacre6, Uffe Hellsten3, Claire Larroux4†, Nicholas H. Putnam7, Mario Stanke8, Maja Adamska4†, Aaron Darling9, Sandie M. Degnan4, Todd H. Oakley10, David C. Plachetzki10, Yufeng Zhai6, Marcin Adamski4†, Andrew Calcino4, Scott F. Cummins4, David M. Goodstein3, Christina Harris4, Daniel J. Jackson4†, Sally P. Leys11, Shengqiang Shu3, Ben J. Woodcroft4, Michel Vervoort12, Kenneth S. Kosik5, Gerard Manning6, Bernard M. Degnan4 & Daniel S. Rokhsar1,3

Sponges are an ancient group of animals that diverged from other metazoans over 600 million years ago. Here we present the draft genome sequence of Amphimedon queenslandica, a demosponge from the Great Barrier Reef, and show that it is remarkably similar to other animal genomes in content, structure and organization. Comparative analysis enabled by the sequencing of the sponge genome reveals genomic events linked to the origin and early evolution of animals, including the appearance, expansion and diversification of pan-metazoan transcription factor, signalling pathway and structural genes. This diverse ‘toolkit’ of genes correlates with critical aspects of all metazoan body plans, and comprises cell cycle control and growth, development, somatic- and germ-cell specification, cell adhesion, innate immunity and allorecognition. Notably, many of the genes associated with the emergence of animals are also implicated in cancer, which arises from defects in basic processes associated with metazoan multicellularity.

The emergence of multicellular animals from single-celled ancestors over 600 million years ago required the evolution of mechanisms for coordinating cell division, growth, specialization, adhesion and death. Dysfunction of these mechanisms drives diseases such as cancers, in which social controls on multicellularity fail, and autoimmune disorders, in which distinctions between self and non-self are disrupted. The hallmarks of metazoan multicellularity are therefore intimately related to those of cancer1 and immunity2.

Sponges have a critical role in the search for the origins of metazoan multicellular processes3, as they are generally recognized as the oldest surviving metazoan phyletic lineage. Although the kinship of sponges to other animals was recognized by the nineteenth century4, the absence of a gut and nervous system had relegated sponges to the ‘Parazoa’5, a grade below the ‘Eumetazoa’ or ‘true animals’ (that is, cnidarians, ctenophores and bilaterians)6. Nevertheless, sponges share key adhesion and signalling genes7–11 with eumetazoans, as well as other genes important in body plan patterning such as developmental transcription factors12–15; sponge embryos and larvae (Fig. 1) are readily comparable to those of other animals12,16. Sponges are diverse and their phylogeny is poorly resolved17–19, allowing for the possibility that sponges are paraphyletic20, which implies that other animals evolved from sponge-like ancestors.

Here we report on the genome of Amphimedon queenslandica, a haplosclerid demosponge, the adult organization and lifestyle of which is typical for sponges, feeding on microbes and particulate organic matter filtered by flagellated collar cells that resemble choanoflagellates. Although the diversity of sponges and their uncertain phylogeny make it doubtful that any single species can reveal the intricacies of early animal evolution, comparison of the A. queenslandica draft genome with sequences from other species can provide a conservative estimate of the genome of the common ancestor of all animals and the timing and nature of the genomic events that led to the origin and early evolution of animal lineages.

The A. queenslandica genome harbours an extensive repertoire of developmental signalling and transcription factor genes, indicating that the metazoan ancestor had a developmental ‘toolkit’ similar to that of modern complex bilaterians. The origins of many of these and other genes specific to animal processes such as cell adhesion, and social control of cell proliferation, death and differentiation can be traced to genomic events (gene birth, subfamily expansions, intron gain/loss, and so on) that occurred in the lineage that led to the metazoan ancestor, after animals diverged from their unicellular ‘cousins’. In addition to possessing a wide range of metazoan-specific genes, the Amphimedon draft genome is missing some genes that are conserved in other animals, indicative of gene origin and expansion in eumetazoans after their divergence from the demosponge lineage and/or gene loss in Amphimedon.

1Center for Integrative Genomics and Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA. 2Molecular Evolution Genomics, University of Heidelberg, 69117 Heidelberg, Germany. 3Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA. 4School of Biological Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia. 5Neuroscience Research Institute, University of California Santa Barbara, Santa Barbara, California 93106, USA. 6Razavi Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, California 92037, USA. 7Department of Ecology and , Rice University, 6100 Main Street, Houston, Texas 77005, USA. 8Institut für Mikrobiologie und Genetik, Abteilung für Bioinformatik, Goldschmidtstr. 1, 37077 Göttingen, Germany. 9Genome Center, University of California-Davis, Davis, California 95616, USA. 10Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, California 93106, USA. 11Department of Biological Sciences, University of Alberta, Edmonton, Alberta T6G 2E9, Canada. 12Development and Neurobiology program Institut Jacques Monod, UMR 7592 CNRS/Université Paris Diderot-Paris 7, 75205 Paris Cedex 13, France. †Present addresses: Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02138, USA (M.Sr.); EMBL Heidelberg, Meyerhofstr. 1, 69117 Heidelberg, Germany (O.S.); Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstr. 190, CH-8057 Zurich, Switzerland (M.E.A.G.); Sars International Centre for Marine Molecular Biology, N-5008 Bergen, Norway (G.S.R., Maj.A., Mar.A.); Department of Earth and Environmental Sciences, Palaeontology and Geobiology, Ludwig-Maximilians-University, 80333 Munich, Germany (C.L.); Courant Research Centre Geobiology, Georg-August University of Göttingen, Goldschmidtstr. 3, 37077 Göttingen, Germany (D.J.J.).

a d families, sufficient for synteny to be assessed) show segments of con- Homo Bilateria (bilateral symmetry) Branchiostoma served synteny with other animals (Supplementary Note 6). This indi- Eumetazoa (‘true’ animals) Strongylocentrotus cates that portions of the 15 ancestral linkage groups inferred for the 22,24 Caenorhabditis Metazoa (animals) cnidarian–bilaterian ancestor were already in place in the demos- Pristionchus Opisthokonta ponge–eumetazoan ancestor. No such conserved synteny was detected Drosophila Holozoa b Capitella between animals and the choanoflagellate Monosiga brevicollis. Eukaryota Helobdella Lottia Animal relationships Hydra Cnidaria Nematostella We addressed the controversial phyletic branching of early animal Trichoplax lineages by comparing sets of orthologous genes in A. queenslandica Amphimedon and a diverse sampling of 18 complete genomes (Supplementary Note c Monosiga Neurospora 7). Our analyses support the grouping of placozoans, cnidarians and Arabidopsis bilaterians into a eumetazoan clade, with demosponges as an earlier- Dictyostelium branching lineage25, and reject the diploblast–triploblast phylogeny17 Paramecium in favour of a more conventional ‘sponges first’ tree19,20 (Fig. 1d). 0.1 changes per site In our discussion below we therefore refer to descendants of the Figure 1 | Amphimedonlifehistoryandmetazoanphylogeny. a,Amphimedon placozoan–cnidarian–bilaterian last common ancestor as Eumetazoa, queenslandica adult. Scale bar, 5 cm. b, Embryos in a brood chamber. Scale bar, and reserve ‘Eumetazoa sensu stricto’ for the more limited clade defined 1mm.c, Larva. Scale bar, 100 mm. d, Animal phylogeny based on whole- by descendants of the cnidarian–bilaterian ancestor. genome data. This unrooted tree is inferred from 229 concatenated nuclear Our analysis emphasizes the quantitative divergence between protein-coding genes with 44,616 amino acids using Bayesian inference. All metazoans and their closest living unicellular relatives. 
For example, clades are supported with a posterior probability of 1. Coloured boxes mark 28% of the amino acid substitutions between humans and their last the nodes for which origins of genes are inferred in Figs 3 and 4. The same common ancestor with choanoflagellates occurred on the metazoan topology is supported by the nuclear gene data sets generated by alternative stem lineage (bold line in Fig. 1d), before the divergence of sponges methods as well as by other inference methods (Supplementary Note 7). The from other animals. This pre-metazoan period can be crudely esti- metazoan stem leading to the animal radiation is shown in bold. Contrary to the current consensus of eukaryotic relationships, Amoebozoa are not a sister- mated to be ,150–200 million years (Supplementary Note 7.6). group to Opisthokonta in this tree (Supplementary Note 7). The zootype and origin of metazoan genes Genome sequencing and annotation With multiple animal genomes now in hand, we can extend the ‘zoo- type’ concept26 to include other shared derived genomic characteristics Amphimedon queenslandica is a hermaphroditic spermcast spawner, of animals. Out of 4,670 pan-metazoan gene families defined by clus- and cannot be readily inbred in the laboratory (Fig. 1a–c and tering sponge and eumetazoan peptides, 1,286 (27%) seem to be meta- Supplementary Note 1)21. Adult sponges also harbour many com- zoan-specific (see Supplementary Note 9.2). Similarly, there are mensal microbes. To minimize allelic variation and microbial con- eumetazoan, eumetazoan sensu stricto and bilaterian genomic synapo- tamination we sequenced genomic DNA from multiple embryos and morphies, as well as sponge-specific gene families (for example, larvae from a single mother. This DNA contains four dominant kinases, see Supplementary Note 8). 
parental haplotypes (~3% polymorphism), although a single brood may have multiple fathers (Supplementary Notes 2.1 and 3). We used ~9-fold whole-genome Sanger shotgun coverage to produce a ~167-megabase-pair assembly that typically represents each locus once rather than splitting alleles (Supplementary Notes 2 and 3) and captures ~97% of the protein-coding gene content (Supplementary Note 2.5). We also recovered an alpha-proteobacterial genome that is probably a vertically transmitted commensal microbe of Amphimedon embryos (Supplementary Note 2.7).

The assembled A. queenslandica genome encodes ~30,000 predicted protein-coding loci (Supplementary Note 4). This is an overestimate of the true gene number due to overprediction, unrecognized transposable elements and gene fragmentation at contig or scaffold boundaries. Nevertheless, 18,693 (63%) have identifiable homologues in other organisms in the Swiss-Prot database; there are no doubt novel or rapidly evolving sponge genes unknown in other species. CpG dinucleotides are depleted, and TpG and CpA dinucleotides augmented, relative to overall G+C composition, which is indicative of germline cytosine methylation in the Amphimedon genome. This is consistent with the presence of a DNMT3-related putative de novo methyltransferase as well as proteins with predicted methyl CpG binding domains.

Analysis of the Amphimedon gene set reveals marked conservation of gene structure (intron phase and position) and genome organization (synteny) relative to other animals (Supplementary Notes 5 and 6). In Amphimedon, intragenic position and phase are retained for 84% of the introns inferred for the metazoan ancestor, comparable to the 76% and 88% retention in human and sea anemone, respectively22,23. The organization of genes shows conserved synteny (that is, conserved linkage without necessarily requiring colinearity) relative to other animals. In particular, 83 of the 153 longest Amphimedon scaffolds (those that contain genes from more than ten distinct metazoan gene

Owing to residual incompleteness of the sponge genome draft, and possible gene losses in the Amphimedon lineage, this analysis provides a conservative estimate.

Nearly three-quarters of the 1,286 animal-specific gene families arose by gene duplication on the metazoan stem (Supplementary Note 9). These include the early duplication of transcription factor families such as homeodomains and basic helix–loop–helix domains13,14,27. Additional gene duplication and divergence in eumetazoans further increased transcription factor gene family number, which in general are 2 to 34 times larger in eumetazoans than in Amphimedon. In contrast, substantial diversification of kinase gene families occurred before the divergence of the sponge and eumetazoan lineages (see below)28. We can assess the role of tandem duplication in the creation of these families by seeking evidence for linkages among anciently diverged paralogues (Supplementary Note 10). A significant fraction remain linked (up to 30%, as found in Trichoplax, P < 0.0001, with lower levels in other contemporary metazoan genomes), indicating that many gene family expansions originally occurred as tandem or proximal duplications, and that these genomically local duplications have remained linked over time. This is consistent with the overall preservation of relict linkages observed here and in other basal metazoan genomes22,24,25.

We find 235 animal-specific protein domains and 769 animal-specific domain combinations that evolved along the metazoan stem (Supplementary Note 9). Additionally, lineage-specific changes to these animal domain architectures occurred in early metazoan evolution16,29,30. For example, new combinations of domains in death-fold domain proteins and laminins possibly allow for the modification of protein interactions and pathways involved in programmed cell death and cell adhesion, respectively (Supplementary Note 9.3), and the co-option of sponge-, eumetazoan- or bilaterian-specific architectures into novel functions.

721 ©2010 Macmillan Publishers Limited. All rights reserved ARTICLES NATURE | Vol 466 | 5 August 2010
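The CpG depletion reported on this page is usually summarized as an observed/expected (o/e) dinucleotide ratio. The following is a minimal sketch of that calculation, assuming the simple normalization in which the expected count is f(x) · f(y) · (L − 1) for base frequencies f and sequence length L. The function name `dinucleotide_oe` and the toy sequences are illustrative only; the procedure actually used for the Amphimedon genome is described in the Supplementary Notes and may differ.

```python
def dinucleotide_oe(seq: str, dinuc: str) -> float:
    """Observed/expected ratio of a dinucleotide in a DNA sequence.

    Assumption: the expected count treats neighbouring bases as independent,
    i.e. expected = f(x) * f(y) * (L - 1), where f is the base frequency.
    """
    seq = seq.upper()
    if len(dinuc) != 2:
        raise ValueError("dinuc must have length 2")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence too short")
    x, y = dinuc.upper()
    # Count overlapping occurrences of the dinucleotide.
    observed = sum(1 for i in range(n - 1) if seq[i] == x and seq[i + 1] == y)
    expected = (seq.count(x) / n) * (seq.count(y) / n) * (n - 1)
    if expected == 0:
        raise ValueError("expected count is zero")
    return observed / expected


# Toy sequences (hypothetical, not genome data).
cpg_rich = "ACGT" * 25   # every repeat contains one CpG
cpg_free = "CATG" * 25   # contains C and G but no CpG
print(dinucleotide_oe(cpg_rich, "CG"))
print(dinucleotide_oe(cpg_free, "CG"))
print(dinucleotide_oe(cpg_free, "TG"))
```

A ratio well below 1 for CpG, together with ratios above 1 for TpG and CpA, is the pattern the text interprets as a signature of germline cytosine methylation.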

The 705 Amphimedon kinases represent the largest reported metazoan kinome, and include members of >70% of human kinase classes (compared with 59% in choanoflagellate, 83% in sea anemone, 70% in Caenorhabditis elegans and 77% in fruitfly; see Supplementary Note 8.7). Amphimedon has single copies of most metazoan kinase classes, but has several expansions of over 50 genes per class. The largest expansions are in the tyrosine kinase and tyrosine-kinase-like groups, and include over 150 likely receptor tyrosine kinases (RTKs). Unlike Monosiga, where RTKs could not be classified into metazoan families28, Amphimedon has kinase domains from six known animal families (epidermal growth factor receptor (EGFR), Met, discoidin domain receptor (DDR), regeneron orphan receptor (ROR), Eph and Sevenless). The EGFR and some Eph extracellular domain architectures are as in their eumetazoan counterparts, but many other RTKs have unique extracellular domains. For instance, DDRs have immunoglobulin repeats, and sushi domains are found in some members of the expanded Eph and Met families. This indicates that the activating ligands, presumably found largely in the external environment, may be distinct from those of eumetazoans.

Six hallmarks of animal multicellularity
The A. queenslandica genome allows us to assess systematically the origin of the six hallmarks of metazoan multicellularity: (1) regulated cell cycling and growth; (2) programmed cell death; (3) cell–cell and cell–matrix adhesion; (4) developmental signalling and gene regulation; (5) allorecognition and innate immunity; and (6) specialization of cell types. These cardinal features of metazoan multicellularity have their origins on the metazoan stem and often are the result of metazoan gene novelties combining with more ancient factors. A recurring theme is the overlap of these core ‘multicellularity’ genes with genes perturbed in cancer, a disease of aberrant multicellularity (see oncogenes and tumour suppressors in Figs 2 and 3).

Regulated cell cycling and growth. Although the core machinery of the animal cell cycle traces back to early eukaryotes (Fig. 2a and Supplementary Note 8.2), some critical metazoan regulatory mechanisms emerged more recently. For example, whereas the p53/p63/p73 tumour suppressor family is holozoan-specific31, the HIPK kinase that phosphorylates p53 in the presence of DNA breaks is metazoan-specific, and

[Figure 2 graphic not reproduced. Panels: a, Cell cycle; b, Growth signalling; c, Apoptosis (extrinsic and intrinsic pathways). Colour key for node of origin: ancient eukaryotic; opisthokont origin; holozoan origin; animal origin; eumetazoan origin; bilaterian/vertebrate origin.]

Figure 2 | Origins of vertebrate/bilaterian pathways. Reference pathways from human and other vertebrates are depicted here for comparative purposes. Gene products are coloured by their node of origin as per Fig. 1. White text denotes known oncogenes or tumour suppressor genes. Genes with eumetazoan origin are found in either Nematostella or Trichoplax or both. a, Cell cycle; b, growth signalling; c, apoptosis. Dashed outlines indicate cases where proteins could not be affiliated to a subtype (see Supplementary Note 8.3).
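Colouring a gene product by its node of origin amounts to asking which is the most distantly related species that still carries a recognizable homologue. The sketch below illustrates that idea under strong simplifying assumptions: it ignores gene loss and horizontal transfer, uses only a handful of the species named in this paper, and assigns each species to a node by hand. The names `NODES`, `SPECIES_NODE` and `node_of_origin` are hypothetical, the example presence sets are invented for illustration, and the authors' actual procedure (Supplementary Notes) is likely to be more involved.

```python
# Nodes on the lineage leading to bilaterians, ordered from oldest to youngest
# and named after the colour key of Figs 2 and 3.
NODES = [
    "ancient eukaryotic",
    "opisthokont",
    "holozoan",
    "animal",
    "eumetazoan",
    "bilaterian/vertebrate",
]

# Youngest node that each species shares with bilaterians (illustrative mapping).
SPECIES_NODE = {
    "Arabidopsis thaliana": 0,
    "Neurospora crassa": 1,
    "Monosiga brevicollis": 2,
    "Amphimedon queenslandica": 3,
    "Nematostella vectensis": 4,
    "Trichoplax adhaerens": 4,
    "Drosophila melanogaster": 5,
    "Homo sapiens": 5,
}


def node_of_origin(species_with_homologue):
    """Return the oldest node implied by the species that carry a homologue."""
    ranks = [SPECIES_NODE[s] for s in species_with_homologue]
    if not ranks:
        raise ValueError("no species given")
    return NODES[min(ranks)]


# Hypothetical presence sets, not results from the paper.
print(node_of_origin(["Homo sapiens", "Nematostella vectensis"]))
print(node_of_origin(["Homo sapiens", "Amphimedon queenslandica"]))
print(node_of_origin(["Homo sapiens", "Monosiga brevicollis"]))
```

Grouping Nematostella and Trichoplax at the same node mirrors the caption's statement that genes with eumetazoan origin are found in either species or both.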

[Figure 3 graphic not reproduced. Panels: a, Epithelial (apical surface, apical region, zonula adherens region, basolateral region, basal surface); b, Neuronal (pre-synapse, synaptic vesicle, post-synapse).]

Colour key (node of origin): ancient eukaryotic; opisthokont origin; holozoan origin; animal origin; eumetazoan origin; bilaterian/vertebrate origin.

Figure 3 | Origins of complexes and pathways of bilaterian cell types. Reference cellular structures from human and other vertebrates are depicted here for comparative purposes. Gene products are coloured by their node of origin as per Fig. 1. White text denotes known oncogenes or tumour suppressor genes. Genes with eumetazoan origin are found in either Nematostella or Trichoplax or both. a, Cell adhesion and polarity in epithelia. The asterisk indicates that collagen IV genes have not been found in the Amphimedon genome but have been reported as present in the homoscleromorph sponge Pseudocorticium jarrei48. The ancient origins of integrins reflect the recent findings of ref. 47. b, Synaptic and signalling elements in neurons.

the MDM2 ubiquitin ligase that regulates p53 appears as a eumetazoan feature. Thus, the p53-mediated response to DNA damage may have emerged before the divergence of eumetazoans. The Myc oncogene illustrates how intramolecular regulation has also evolved. Although Amphimedon shares the four-amino-acid N-terminal DCMW motif present in other animal Myc proteins, this motif is missing in the Myc orthologue found in the unicellular Monosiga31. Because mutation of this motif disrupts Myc function in vertebrates, it may have an important role in all animals.

Tumour suppressors encoded by two classes of cyclin-dependent kinase (CDK) inhibitors mediate growth-factor-dependent regulation of the cell cycle. Although the INK4/CDKN2 class (p15/p16/p18/p19) regulates the eumetazoan-specific CDK4/6-cyclin D kinase and is chordate-specific, the Cip/Kip/CDKN1 class (p21/p27/p57) is more general, regulating many CDKs, and seems to have arisen on the eumetazoan stem. In bilaterians, Cip/Kip genes integrate external growth signals, and are regulated transcriptionally and post-transcriptionally by the major growth pathways (see below). The emergence of this class of CDK inhibitors on the eumetazoan stem suggests a central regulatory role even in early animals.

Although cell growth and cell division are tightly coupled in unicellular species, they can be separately regulated in multicellular organisms. In bilaterians, growth is regulated by six major signalling pathways (RTK signalling via Ras, insulin signalling via the phosphatidylinositol-3-OH kinase (PI(3)K) pathway, Rheb/Tor, cytokine-JAK/STAT, Warts/Hippo, and the Myc oncogene) that also modulate the cell cycle (Supplementary Note 8.2). Whereas the Rheb/Tor pathway dates back to early eukaryotes, the other pathways contain several genes that are holozoan and metazoan innovations. For example, the insulin receptor substrate and phosphotyrosine binding proteins GAB1/GAB2 emerged on the metazoan stem after the divergence of choanoflagellates, indicating that an insulin-signalling-like pathway may have been a key regulator of growth in early animals by tying into the ancient PDK1 and Akt kinases (Fig. 2b). However, because p21, p27 and MDM2 are all eumetazoan novelties, this pathway may not have acquired the ability to regulate cell proliferation until after the divergence of sponges from eumetazoans.

Programmed cell death. In contrast to the cell cycle machinery, most of the apoptotic circuitry is unique to animals, increasing in complexity along metazoan, eumetazoan and bilaterian stems (Fig. 2c and Supplementary Note 8.3). Both intrinsic and extrinsic programmed cell death pathways require caspases, a metazoan-specific family of cysteine aspartyl proteases. Amphimedon encodes initiator caspases with the characteristic caspase recruitment and death effector domains, as well as an expanded repertoire of effector caspases.

The intrinsic pathway drives cell death by permeabilization of the outer mitochondrial membrane and is regulated by the Bcl-2 oncogene family of pro- and antiapoptotic factors. The pro-apoptotic protein Bak arose in the metazoan lineage, whereas Bax and Bok seem to be eumetazoan-specific. Bcl-2/Bcl-X are antiapoptotic and metazoan-specific. Mitochondrial permeabilization releases proteins of varying evolutionary origin, including the ancient apoptosis-inducing factor (AIF) that contributes to caspase-independent apoptosis, metazoan-specific apoptotic protease activating factor 1 (Apaf-1), and eumetazoan sensu stricto-specific caspase-activated DNase (CAD) and its regulator ICAD.

The extrinsic apoptotic pathway is activated by external signals through transmembrane tumour necrosis factor receptors (TNFRs) whose intracellular death domain interacts with downstream adaptors. Amphimedon encodes a nerve growth factor receptor (NGFR) p75-like protein, although it lacks the crucial death domain that is seen in Nematostella and bilaterians (see ref. 32); other death TNFRs (that is, Fas, DR4, DR5 and TNFR1) are vertebrate-specific32,33. Because the intrinsic cascade is composed of components that pre-date metazoans, it is likely to be the original mechanism for inducing apoptosis.

Cell–cell and cell–matrix adhesion. The diagnostic domains of two major cell–cell adhesion superfamilies, the cadherins and the immunoglobulins, are present in Monosiga within the extracellular region of putative transmembrane proteins31,34 (Supplementary Note 8.8). Amphimedon cadherins differ from those of Monosiga in having proteins with domain architectures diagnostic for the metazoan-specific classical cadherin and seven pass transmembrane cadherin subfamilies31,35. A considerable expansion of immunoglobulin-like domain-containing proteins occurred on the metazoan stem, with 218 predicted in Amphimedon versus 5 in Monosiga31. The combination of N-terminal immunoglobulin domains with C-terminal FN3 repeats is found only in metazoans.

Similarly, metazoan extracellular matrix (ECM) proteins use domains that evolved on the holozoan stem. For example, Monosiga encodes proteins with collagen triple helix repeats and other genes with fibrillar collagen C-terminal domains, but these domains only appear together in metazoans30,31. Thrombospondin domain architectures are found in Amphimedon; however, agrin, netrin and perlecan seem to be eumetazoan innovations. The extracellular matrix receptors, α and β integrin (Int), are present in Amphimedon and other metazoans, but absent from the Monosiga and the other non-metazoan eukaryotic genomes we considered (Fig. 3a; see note added in proof).

Developmental signalling and transcription. Components of the major metazoan developmental signalling pathways, as well as classes of developmental transcription factors, are mostly present in Amphimedon and absent from Monosiga and other non-metazoan genomes13,14,16,27,29, suggesting that ontogenetic development, including primary germ cell formation (Supplementary Note 8.4), originated on the metazoan stem3,11,12. Although Amphimedon possesses a characteristically metazoan repertoire of transcription factor families (Supplementary Note 8.6)13,14,27,31, in general these families are further expanded in eumetazoans13. Some differences between sponges and eumetazoans correlate with morphological complexity. For example, sponges do not seem to have a mesoderm and accordingly Amphimedon lacks transcription factors involved in mesoderm development (Fkh, Gsc, Twist, Snail). In contrast, sponges possess several transcription factors involved in determination or differentiation of muscles and nerves despite lacking a neuromuscular system (PaxB, Lhx genes, SoxB, Msx, Mef2, Irx and bHLH neurogenic factors)13,14,27. Amphimedon lacks Hox genes and some other transcription factor subfamilies that are involved in specifying and patterning bilaterian nervous systems and body plans13,14,27,36,37.

Signalling cascades, such as the Wnt, TGF-β, Notch and Hedgehog pathways, pattern embryos by specifying cellular identity and coordinating morphogenetic events. The ligands and receptors of all of these cascades are metazoan innovations at the cell surface (Supplementary Note 8.5), except the eumetazoan sensu stricto-specific Hedgehog ligand29. The transcription factors specific to these pathways are also metazoan-specific (Tcf/Lef, Smads, CSL, Gli), whereas the cytosolic signal transducers generally have more ancient origins. This pattern suggests that these pathways arose by the engagement of novel ligands and receptors with already active signalling mechanisms, enabling multicellular communication.

Amphimedon also has fewer ligands and receptors in each pathway compared to eumetazoans (three Wnt and two Fzd, eight TGF-β ligands and five TGF-β receptors, one Notch and five Deltas) (Supplementary Note 8.5), as observed for many transcription factor families. In contrast to transcription factors13,14,27, however, these proteins generally cannot be assigned to eumetazoan subfamilies or are obvious recent sponge-specific duplications. This lack of phylogenetic resolution may reflect a period of rapid evolution and diversification of ligand/receptor molecules in sponge and eumetazoan lineages. Perhaps as a consequence, the inhibitors that interact with ligands and receptors to modulate pathway activity also appear to be lineage-specific. In particular, inhibitors described from bilaterians were not found in Amphimedon (for example, Chordin, Numb, I-Smads, Wif).

Allorecognition and innate immunity. The transition to multicellularity was accompanied by mechanisms to defend against invading pathogens and to prevent the fusion of genetically distinct conspecifics2. Although some metazoan immunity genes originated early in eukaryotic evolution, many are restricted to animals, as illustrated by the signalling cascades shared by the Toll-like receptor (TLR) and the interleukin 1 receptor (IL-1R) (Supplementary Note 8.10). An ancestral form belonging to this receptor superfamily was probably present in the last common metazoan ancestor and independently diversified in poriferan and cnidarian lineages. Nuclear factor κB (NF-κB), Tollip and ECSIT genes are present in holozoans; however, most TLR/IL-1R pathway proteins are either composed of metazoan-specific domains (for example, Pellino) or architectures (for example, the death domain with TIR and protein kinase domains in MyD88 and IRAKs, respectively). Immune effector systems also consist largely of metazoan innovations, such as the macrophage-expressed gene 1 (MPEG1) that participates directly in pathogen elimination38. Likewise all animals share specific antiviral defence factors such as MDA5-like RNA helicases, and interferon regulatory factor-like proteins, although other systems (for example, RNAi) have more ancient origins39. A primordial complement pathway appears to have evolved exclusively on the eumetazoan sensu stricto stem and further diversified in bilaterians40.

Amphimedon and other demosponges encode unique extracellular Calx-β domain-containing proteoglycans called aggregation factors, which promote cell adhesion and may also be involved in allorecognition41. The presence of a cluster of aggregation-factor-related genes in the Amphimedon genome indicates that allorecognition could be under the control of a multigene family.

Specialized cell types
Polarized epithelia. Sponge cells adhere to form tissue-like layers, but a true epithelial cell layer, characterized by aligned cell polarity, belt-form junctions and underlying basal lamina, is thought to be a eumetazoan innovation. Amphimedon possesses all the main components of the Par, Crumbs and Discs Large (Dlg) complexes, a set of interacting proteins that are largely metazoan-specific and determine polarity in epithelial cells (Fig. 3a and Supplementary Note 8.8). The main proteins comprising bilaterian spot-form and zonula adherens junctions are also present in Amphimedon and appear to be metazoan-specific34,42. By contrast, septate junction and basal lamina proteins appear to be largely eumetazoan innovations (Fig. 3a); Amphimedon does possess several genes with laminin-like domain architectures (Supplementary Note 9.3).

Sensory systems and the neuron. Sponges can sense and respond to their environment, although nerve cells seem to be restricted to eumetazoans sensu stricto43,44. However, the expression of orthologues of post-synaptic structural and proneural regulatory proteins in Amphimedon larval globular cells suggests an evolutionary connection with an ancestral protoneuron36,42. Amphimedon possesses homologues of bilaterian proteins involved in nervous system development (for example, elav- and musashi-like RNA-binding proteins, neural transcription factors), pre- and post-synaptic organization (for example, Discs large)42, endogenous and exogenous signalling (for example, G-protein-coupled receptors (GPCRs)), and neuroendocrine secretion, although bilaterian peptide hormones are not detected (Supplementary Note 8.9). Some key synaptic genes are conspicuously missing from Amphimedon (Fig. 3b and Supplementary Note 8.9), including the ionotropic glutamate receptor family42, whereas neuronal-type metabotropic glutamate, dopamine and serotonin receptors are present. Amphimedon has a homologue of the ephrin receptor, an axon guidance protein, although the ephrin ligand and developmental genes involved in axon guidance (for example, slit, netrin, unc-5 and robo) are not present. Amphimedon also possesses over 200 GPCRs, which includes a large lineage-specific expansion of rhodopsin-related GPCRs (Rh-GPCRs) that are encoded largely by clusters of single exon genes as observed in other metazoans (Supplementary Note 8.9). From these observations we infer that the metazoan ancestor possessed a complex sensory system, and many of the molecular requirements for neural development and nerve cell function. This suggests that exaptation was critical for the genesis of the first nerve cell, with eumetazoan-specific gene innovations providing the regulatory and structural requirements to connect these protoneural components into a functional neuron (Fig. 3b).

Molecular correlates of morphological complexity
With a diverse sample of genomes in hand, we sought differences in gene repertoire that are associated with gross morphological complexity. Figure 4 shows molecular function categories that are significantly enriched (P < 1 × 10^−10) in one or more metazoan complexity group, with the relative frequencies of genes with these functions in each species shown by colour code. Here we have defined broad groupings representing three grades of morphological complexity, guided by

[Figure 4 heatmap not reproduced. Columns to the left of the heatmap give enrichment P values for three groups (basal metazoa; invertebrate bilaterian; vertebrate (human)); rows are PANTHER molecular function categories; the colour scale is normalized gene count, 0 to 0.8.]

Figure 4 | Molecular functions enriched in various complexity groups. Molecular function categories that show significant enrichment (P < 1 × 10^−10) in Fisher’s exact tests were selected (Supplementary Note 11). Significance of enrichments (grey background) and depletions (white background) for the three ‘metazoan complexity groups’ (non-bilaterian (basal) metazoans; invertebrate bilaterians; vertebrates) are indicated in the columns to the left of the heatmap. The heatmap shows normalized gene counts of PANTHER molecular function categories for the species in the analysis. Aqu, Amphimedon queenslandica; Ath, Arabidopsis thaliana; Cel, Caenorhabditis elegans; Ddi, Dictyostelium discoideum; Dme, Drosophila melanogaster; Hma, Hydra magnipapillata; Hsa, Homo sapiens; Mbr, Monosiga brevicollis; Ncr, Neurospora crassa; Nve, Nematostella vectensis; Pte, Paramecium tetraurelia; Spu, Strongylocentrotus purpuratus; Tad, Trichoplax adhaerens.

the number of described cell types45, including non-bilaterian (or ‘basal’) metazoans (Nematostella, Trichoplax, Amphimedon; ~5–15 cell types), invertebrate bilaterians (Drosophila, C. elegans, sea urchin; ~50–100 cell types), and vertebrates (~225 cell types, represented by the human genome), with a selection of non-animals as an outgroup (Supplementary Note 11). Similarly, using a principal component analysis, we also identified suites of molecular functions that are associated with complexity (Supplementary Figure 11.2). The first component differentiates between metazoans and non-metazoans; the second component partly differentiates between metazoan complexity groups.

Included among the functional categories that correlate with increase in metazoan morphological complexity are (Fig. 4 and Supplementary Table 11.1.1): GPCRs, ion channels, cell adhesion proteins, and defence and immunity proteins, which are enriched in basal metazoans relative to non-animals; homeobox transcription factors and gap junction proteins, which are enriched in bilaterians relative to non-bilaterian animals; and immunoglobulin receptor family members, immunoglobulins, MHC antigens, and cytokine receptors, which are enriched in vertebrates relative to invertebrate bilaterians. These broad associations with complexity are evidently superimposed on notable lineage-specific variation as seen in Fig. 4 (for example, serine protease gene loss in C. elegans, and voltage-gated ion channel expansion in Paramecium). Similar functional categories contribute to principal components (Supplementary Table 11.2.1).

Conclusions
The Amphimedon genome, combined with recently sequenced genomes of diverse invertebrates and a choanoflagellate, identifies innovations that underlie the emergence and early diversification of the Metazoa. These genomic comparisons reconstruct a common animal ancestor of remarkable complexity. Metazoans can now be defined by a long list of genomic synapomorphies (gene content, intron–exon structure and syntenies) as well as characteristics common to all animal life such as sex, development, controlled cellular proliferation, differentiation and growth, and immunity. To what extent the ancestral functioning of this gene set is reflected in modern poriferans is unclear, although studies of both sponge development, which yields a highly patterned larva with axial polarity12, and sponge immunity provide points of direct comparison with the eumetazoan condition.

Whereas the eumetazoan lineage produced a wide diversity of body forms, the sponge body plan has been stable for over 600 million years. What can explain this disparity in evolved morphological complexity? Although we have seen that sponges and eumetazoans share many common pathways related to morphogenesis and cell-type specification, there are notable genomic differences, including different microRNA assemblages46, lineage-specific domains and domain architectures, and the differential expansions of gene families. Although there has been minimal characterization of cis-regulatory architectures in non-bilaterians, we note that as most classes of bilaterian transcription factors are also present in sponges, cnidarians and placozoans, it may be that quantitative rather than qualitative differences in cis-regulatory mechanisms were needed to produce more diverse body plans.

The sexually-reproducing, heterotrophic metazoan ancestor had the capacity to sense, respond to, and exploit the surrounding environment while maintaining multicellular homeostasis. Although sponges lack some of the cell types found in eumetazoans, including neurons and muscles, they share with all other animals genes that are essential for the form and function of integrated multicellular organisms. With these genomic innovations enabling the regulation of cellular proliferation, death, differentiation and cohesion, metazoans transcended their microbial ancestry.

Note added in proof: After completing our analysis, integrins and other cell-adhesion-related genes were discovered outside metazoa47. The presumed earlier origin of integrins has been incorporated in Fig. 3a.

METHODS SUMMARY
Detailed methods are described in Supplementary Information. The genome assembly, gene model sequences, predicted proteins, EST clusters and sequences have been deposited with DDBJ/EMBL/GenBank as project accession ACUQ00000000 and can be accessed from http://www.metazome.net/amphimedon.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 31 December 2009; accepted 24 May 2010.

1. Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57–70 (2000).
2. Muller, W. E. & Muller, I. M. Origin of the metazoan immune system: identification of the molecules and their functions in sponges. Integr. Comp. Biol. 43, 281–292 (2003).

3. Muller, W. E. et al. Bauplan of Urmetazoa: basis for genetic complexity of metazoa. Int. Rev. Cytol. 235, 53–92 (2004).
4. Grant, R. E. Animal Kingdom. in The Cyclopaedia of Anatomy and Physiology, Vol. 1 (ed. Todd, R. B.) (Sherwood-Gilbert-Piper, 1836).
5. Sollas, W. J. Report on the Tetractinellida collected by H.M.S. Challenger, during the years 1873–1876. in Report on the Scientific Results of the Voyage of H.M.S. Challenger During the Years 1873–76 (Neill and Company, 1888).
6. Hyman, L. H. The Invertebrates, Vol. 1 Protozoa through Ctenophora (McGraw-Hill, 1940).
7. Müller, W. E. Origin of metazoan adhesion molecules and adhesion receptors as deduced from cDNA analyses in the marine sponge Geodia cydonium: a review. Cell Tissue Res. 289, 383–395 (1997).
8. Muller, W. E. & Schacke, H. Characterization of the receptor protein-tyrosine kinase gene from the marine sponge Geodia cydonium. Prog. Mol. Subcell. Biol. 17, 183–208 (1996).
9. Suga, H., Katoh, K. & Miyata, T. Sponge homologs of vertebrate protein tyrosine kinases and frequent domain shufflings in the early evolution of animals before the parazoan-eumetazoan split. Gene 280, 195–201 (2001).
10. Skorokhod, A. et al. Origin of insulin receptor-like tyrosine kinases in marine sponges. Biol. Bull. 197, 198–206 (1999).
11. Nichols, S. A., Dirks, W., Pearse, J. S. & King, N. Early evolution of animal cell signaling and adhesion genes. Proc. Natl Acad. Sci. USA 103, 12451–12456 (2006).
12. Larroux, C. et al. Developmental expression of transcription factor genes in a demosponge: insights into the origin of metazoan multicellularity. Evol. Dev. 8, 150–173 (2006).
13. Larroux, C. et al. Genesis and expansion of metazoan transcription factor gene classes. Mol. Biol. Evol. 25, 980–996 (2008).
14. Simionato, E. et al. Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol. Biol. 7, 33 (2007).
15. Gazave, E. et al. NK homeobox genes with choanocyte-specific expression in homoscleromorph sponges. Dev. Genes Evol. 218, 479–489 (2008).
16. Adamska, M. et al. Wnt and TGF-β expression in the sponge Amphimedon queenslandica and the origin of metazoan embryonic patterning. PLoS ONE 2, e1031 (2007).
17. Schierwater, B. et al. Concatenated analysis sheds light on early metazoan evolution and fuels a modern “urmetazoon” hypothesis. PLoS Biol. 7, e20 (2009).
18. Dunn, C. W. et al. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452, 745–749 (2008).
19. Pick, K. S. et al. Improved phylogenomic taxon sampling noticeably affects non-bilaterian relationships. Mol. Biol. Evol. doi:10.1093/molbev/msq089 (2010).
20. Sperling, E. A., Peterson, K. J. & Pisani, D. Phylogenetic-signal dissection of nuclear housekeeping genes supports the paraphyly of sponges and the monophyly of Eumetazoa. Mol. Biol. Evol. 26, 2261–2274 (2009).
21. Degnan, B. et al. The demosponge Amphimedon queenslandica: Reconstructing the ancestral metazoan genome and deciphering the origin of animal multicellularity. in Emerging Model Organisms: A Laboratory Manual, Vol. 1 (Cold Spring Harbor Laboratory Press, 2009).
22. Putnam, N. H. et al. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317, 86–94 (2007).
23. Sullivan, J. C., Reitzel, A. M. & Finnerty, J. R. A high percentage of introns in human genes were present early in animal evolution: evidence from the basal metazoan Nematostella vectensis. Genome Inform 17, 219–229 (2006).
24. Putnam, N. H. et al. The amphioxus genome and the evolution of the chordate karyotype. Nature 453, 1064–1071 (2008).
25. Srivastava, M. et al. The Trichoplax genome and the nature of placozoans. Nature 454, 955–960 (2008).
26. Slack, J. M., Holland, P. W. & Graham, C. F. The zootype and the phylotypic stage. Nature 361, 490–492 (1993).
27. Larroux, C. et al. The NK homeobox gene cluster predates the origin of Hox genes.
32. Robertson, A. J. et al. The genomic underpinnings of apoptosis in Strongylocentrotus purpuratus. Dev. Biol. 300, 321–334 (2006).
33. Huang, S. et al. Genomic analysis of the immune gene repertoire of amphioxus reveals extraordinary innate complexity and diversity. Genome Res. 18, 1112–1126 (2008).
34. Abedin, M. & King, N. The premetazoan ancestry of cadherins. Science 319, 946–948 (2008).
35. Tepass, U., Truong, K., Godt, D., Ikura, M. & Peifer, M. Cadherins in embryonic and neural morphogenesis. Nature Rev. Mol. Cell Biol. 1, 91–100 (2000).
36. Richards, G. S. et al. Sponge genes provide new insight into the evolutionary origin of the neurogenic circuit. Curr. Biol. 18, 1156–1161 (2008).
37. Srivastava, M. et al. Evolution of the LIM homeobox gene family in basal metazoans. BMC Biol. 8, 4 (2010).
38. Wiens, M. et al. Innate immune defense of the sponge Suberites domuncula against bacteria involves a MyD88-dependent signaling pathway. Induction of a perforin-like molecule. J. Biol. Chem. 280, 27949–27959 (2005).
39. de Jong, D. et al. Multiple dicer genes in the early-diverging metazoa. Mol. Biol. Evol. 26, 1333–1340 (2009).
40. Kimura, A., Sakaguchi, E. & Nonaka, M. Multi-component complement system of Cnidaria: C3, Bf, and MASP genes expressed in the endodermal tissues of a sea anemone, Nematostella vectensis. Immunobiology 214, 165–178 (2009).
41. Fernandez-Busquets, X. & Burger, M. M. Circular proteoglycans from sponges: first members of the spongican family. Cell. Mol. Life Sci. 60, 88–112 (2003).
42. Sakarya, O. et al. A post-synaptic scaffold at the origin of the animal kingdom. PLoS ONE 2, e506 (2007).
43. Pavans de Ceccatty, M. Coordination in sponges. The foundations of integration. Am. Zool. 14, 895–903 (1974).
44. Leys, S. P. & Degnan, B. M. The cytological basis of photoresponsive behavior in a sponge larva. Biol. Bull. 201, 323–338 (2001).
45. Valentine, J. W. Late Precambrian bilaterians: grades and clades. Proc. Natl Acad. Sci. USA 91, 6751–6757 (1994).
46. Grimson, A. et al. Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature 455, 1193–1197 (2008).
47. Sebe-Pedros, A., Roger, A. J., Lang, F. B., King, N. & Ruiz-Trillo, I. Ancient origin of the integrin-mediated adhesion and signaling machinery. Proc. Natl Acad. Sci. USA 107, 10142–10147 (2010).
48. Boute, N. et al. Type IV collagen in sponges, the missing link in basement membrane ubiquity. Biol. Cell 88, 37–44 (1996).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This study was supported by funds from the Australian Research Council (B.M.D., Maj.A.), US Department of Energy Joint Genome Institute (B.M.D., D.S.R., S.P.L.), Harvey Karp (K.S.K.), NSF (T.H.O.), NIH/NHGRI (G.M.), University of Queensland Postdoctoral Fellowship (Maj.A., S.F.C.), Sars International Centre for Marine Molecular Biology (Maj.A.), DFG (M.St.), ANR (M.V.), CNRS (M.V.), Gordon and Betty Moore Foundation (D.S.R.) and Richard Melmon (D.S.R.). We thank J. Huelsenbeck and I. Hariharan for help with phylogenetic analyses and growth pathways, respectively. The work conducted by the US Department of Energy Joint Genome Institute was supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231.

Author Contributions Genome and EST sequencing, assembly, annotation and analysis: J.C., T.M., U.H., N.H.P., M.St., A.D., Y.Z., Mar.A., A.C., D.M.G., D.J.J., S.S., B.J.W. and D.S.R. Phylogenetics: M.Sr. and D.S.R. Gene family and biological process analyses: M.Sr., B.F., M.E.A.G., G.S.R., C.C., M.D., C.L., Maj.A., S.M.D., T.H.O., D.C.P., S.F.C., C.H., M.V., K.S.K., G.M., B.M.D. and D.S.R. Clustering, novelty, domain content and complexity analyses: O.S. and D.S.R. Gene family expansion analyses: M.Sr., O.S., D.S.R.
Writing: M.Sr., B.M.D., D.S.R., O.S., J.C., B.F., M.G., Curr. Biol. 17, 706–710 (2007). G.S.R., G.M., K.S.K., M.V., C.L., S.M.D., N.H.P., A.D., C.C., M.A., T.H.O. and S.P.L. 28. Manning, G., Young, S. L., Miller, W. T. & Zhai, Y. The protist, Monosiga brevicollis, Project design and coordination: B.M.D and D.S.R. has a tyrosine kinase signaling network more elaborate and diverse than found in any known metazoan. Proc. Natl Acad. Sci. USA 105, 9674–9679 (2008). Author Information The genome sequence data can be accessed from DDBJ/ 29. Adamska, M. et al. The evolutionary origin of hedgehog proteins. Curr. Biol. 17, EMBL/GenBank as project accession ACUQ00000000. This paper is distributed R836–R837 (2007). under the terms of the Creative Commons Attribution-Non-Commercial-Share 30. Exposito, J. Y. et al. Demosponge and sea anemone fibrillar collagen diversity Alike licence, and is freely available to all readers at www.nature.com/nature. reveals the early emergence of A/C clades and the maintenance of the modular Reprints and permissions information is available at www.nature.com/reprints. The structure of type V/XI collagens from sponge to human. J. Biol. Chem. 283, authors declare no competing financial interests. Readers are welcome to comment 28226–28235 (2008). on the online version of this article at www.nature.com/nature. Correspondence 31. King, N. et al. The genome of the choanoflagellate Monosiga brevicollis and the and requests for materials should be addressed to M.Sr. ([email protected]), origin of metazoans. Nature 451, 783–788 (2008). B.M.D. ([email protected]) or D.S.R. ([email protected]).

©2010 Macmillan Publishers Limited. All rights reserved doi:10.1038/nature09201

49. Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301–1310 (2002).
50. Stanke, M., Tzvetkova, A. & Morgenstern, B. AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol. 7 (Suppl. 1), S11.1–S11.8 (2006).
51. Yeh, R. F., Lim, L. P. & Burge, C. B. Computational inference of homologous gene structures in the human genome. Genome Res. 11, 803–816 (2001).
52. Korf, I. Gene finding in novel genomes. BMC Bioinform. 5, 59 (2004).
53. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994).
54. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).
55. Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574 (2003).
56. Huelsenbeck, J. P. & Ronquist, F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755 (2001).
57. Guindon, S. & Gascuel, O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704 (2003).
58. Schmidt, H. A., Strimmer, K., Vingron, M. & von Haeseler, A. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–504 (2002).
59. Shimodaira, H. & Hasegawa, M. CONSEL: for assessing the confidence of selection. Bioinformatics 17, 1246–1247 (2001).
60. Lartillot, N., Brinkmann, H. & Philippe, H. Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7 (Suppl. 1), S4 (2007).
61. Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004).
62. Bateman, A. et al. The Pfam protein families database. Nucleic Acids Res. 32, D138–D141 (2004).
63. Thomas, P. D. et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 13, 2129–2141 (2003).
64. Thomas, P. D. et al. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res. 31, 334–341 (2003).
65. Felsenstein, J. PHYLIP–Phylogeny Inference Package (Version 3.2). Cladistics 5, 164–166 (1989).
66. Team, R. D. C. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2009).

METHODS
A detailed description of methods used in this study can be found in the Supplementary Information.

Genome sequencing. Genomic DNA was sheared and cloned into plasmid and fosmid vectors for whole genome shotgun sequencing as described49. The data were assembled using a custom approach described in the Supplementary Information. The Amphimedon 9X assembly and the preliminary data analysis have been deposited at DDBJ/EMBL/GenBank as project accession ACUQ00000000.

Gene prediction and annotation. Protein-coding genes were annotated using homology-based methods (Augustus50, Genomescan51) and one ab initio method (SNAP52). Protein-coding gene predictions can be accessed from http://www.metazome.net/amphimedon.

Phylogenetic methods. Three data sets of orthologous genes from eighteen genomes were aligned with CLUSTALW53 using default parameters, and poorly aligned regions were excluded using Gblocks54. Phylogenetic analyses were conducted using Bayesian inference and maximum likelihood with bootstrap using MrBayes55,56 and PHYML57, respectively. Alternative likelihood topologies were tested using TREEPUZZLE58 and CONSEL59. Bayesian analyses using site-heterogeneous models were done using aamodel (J. Huelsenbeck, unpublished) and PhyloBayes60,61.

Identification of Amphimedon orthologues of specific bilaterian genes. Putative orthologues of genes involved in various processes in bilaterians were identified by reciprocal BLAST of human, mouse, or Drosophila genes against the Amphimedon gene models (blastp) or the assembly (tblastn). PFAM62 domain composition, assignment of PANTHER63,64 HMMs and phylogenetic trees were used to determine orthology. Trees were built using the neighbour-joining method in Phylip65 with one-hundred bootstrap replicates.

Molecular function enrichments and correlation of complexity. Metazoan gene families were assigned molecular functions using PANTHER63 annotations. Fisher's exact test as implemented in R66 was run to test for enrichment or depletion of numbers of gene families for each molecular function category in the novel versus ancestral gene sets. Numbers of genes (not gene families) for various molecular function categories were tested for enrichment between different pairs of four eukaryotic complexity groups (vertebrate, non-vertebrate bilaterian, basal metazoan, non-animal) to identify molecular function families that correlate with the differences in complexity. Principal components analysis was used to identify the contribution of each molecular function category to a eukaryotic complexity group.
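The enrichment test described in the Methods can be illustrated with a minimal sketch. The study ran Fisher's exact test in R; the Python function below is a stand-in written only for illustration, and its name, the two-sided convention (summing every table no more probable than the observed one) and the example counts are assumptions, not values or code from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Illustrative layout (hypothetical): rows are novel / ancestral gene
    families, columns are in-category / not-in-category.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical counts, chosen only to show the call:
# 30 of 200 novel families vs 60 of 1,000 ancestral families in one category.
p_value = fisher_exact_two_sided(30, 170, 60, 940)
print(f"two-sided p = {p_value:.3g}")
```

In practice one such test would be run per molecular function category, so some correction for multiple testing would normally be considered; the source does not say which, if any, was applied.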

Vol 466 | 5 August 2010 | doi:10.1038/nature09311 LETTERS

A ground-layer adaptive optics system with multiple laser guide stars

M. Hart1, N. M. Milton1, C. Baranec2, K. Powell1, T. Stalcup3, D. McCarthy1, C. Kulesa1 & E. Bendek1

To determine the influence of the environment on star formation, we need to study the process in the extreme conditions of massive young star clusters (∼10^4 solar masses) near the centre of our own Galaxy1,2. Observations must be carried out in the near infrared because of very high extinction in visible light within the Galactic plane. We need high resolution to identify cluster members from their peculiar motions3, and because most such clusters span more than 1′, efficient observation demands a wide field of view. There is at present no space-based facility that meets all these criteria. Ground-based telescopes can in principle make such observations when fitted with ground-layer adaptive optics (GLAO)4–6, which removes the optical aberration caused by atmospheric turbulence up to an altitude of 500 m (refs 7–10). A GLAO system that uses multiple laser guide stars11–13 has been developed at the 6.5-m MMT telescope, in Arizona. In previous tests13, the system improved the resolution of the telescope by 30–50%, limited by wavefront error in the optics, but that was insufficient to allow rapid determination of cluster membership. Here we report observations of the core of the globular cluster M3 made after commissioning a sensor to monitor and remove slowly varying aberration in the optics. In natural seeing of 0.7″, the point spread function at 2.2-µm wavelength was sharpened uniformly to 0.3″ over a field of at least 2′. The wide-field resolution was enhanced by a factor of two to three over previous work13, with better uniformity, and extends to a wavelength of 1.2 µm. Entire stellar clusters may be examined in a single pointing, and cluster membership can be determined from two such observations separated by just one year14–17.

The potential of GLAO to meet all these observational requirements has been predicted by analytic studies8 and numerical simulations18,19. Preliminary results13 from the optical system built at the MMT to implement GLAO also hinted at the ability of the technique to offer wide-field near-infrared image sharpening, but that work was limited by aberration in the optical train of the laser-guide-star wavefront sensor that was not seen by the science camera.

The MMT system projects five pulsed laser beams, each of power ∼4 W and wavelength 532 nm, from a single telescope of 50-cm diameter positioned behind the MMT's secondary mirror. They are arranged in a regular pentagon spanning a field of 2′. A Shack–Hartmann wavefront sensor records the light returned to the telescope by Rayleigh backscattering of each laser pulse over a range from 20 to 29 km (ref. 12). The beacon light is maintained in sharp focus over this range by means of a mirror in the optical train that oscillates longitudinally at a frequency equal to the laser pulse rate11. The signals from the five beacons are sensed separately and then averaged to obtain the mean wavefront, representing our estimate of the ground-layer aberration. Because of unknown jitter in the outgoing laser beam paths, overall image motion in the MMT's focal plane is not measurable by the Shack–Hartmann sensor. Instead, a separate tilt sensor looking at a nearby natural star is used to recover that information. Both sensors run at 400 frames per second. A beam splitter ahead of the tilt sensor directs 10% of the light from the natural star to a third sensor of high spatial order. This sensor is read only once every 10–30 s; its output, calibrated against an unresolved source in the optical system, is used to determine and correct the aberrations that are not common to the optical paths of the wavefront sensor and the science camera. This use of the third sensor has proved critical in achieving the system's predicted performance.

The MMT's unique adaptive secondary mirror20 is used to correct the aberrations measured by all three sensors. Because the mirror is large and conjugates to the low atmospheric layers to be corrected, it naturally provides correction over a wide field while eliminating the losses and added thermal emission of conventional adaptive optics21. Updates to the mirror actuators are synchronized with the fast sensor read-outs. The resulting image in the near infrared is recorded by an imaging camera, called PISCES, that is sensitive from 1.2 to 2.5 µm and has a 110″ field of view and a 0.1″ pixel scale22.

Images of the globular cluster M3 were recorded in wavebands centred on 1.25 µm (J), 1.65 µm (H) and 2.15 µm (K). To illustrate, in Fig. 1 we present details of two observations of M3 taken in the K band. Each is a 60-s exposure comprising the sum of 60 individual 1-s images taken over a 4-min period. The first observation (Fig. 1a) was recorded with no adaptive optics correction but with the adaptive secondary mirror set to a fixed position that removed as far as possible the static wavefront aberration introduced by the telescope and instrument optics. In this case, the stellar images reflect the native seeing of 0.7″, which is slightly worse than the median at the MMT at this wavelength, 0.60″. The second observation shows the image quality obtained with GLAO: the average image width across the entire field is reduced to 0.30″. Figure 1b–e shows the results, with and without adaptive correction, in two 27″ × 27″ regions of the field, one centred on the tip–tilt star and the other centred near the edge of the camera's field. Each subfield is about the size of the isoplanatic patch for conventional adaptive optics correction at this wavelength. We note that the point spread function (PSF) is nearly identical in the two subfields. Furthermore, the peak intensity of the stellar images was improved over the full field by an average factor of 3.4, which for a detection at a given signal-to-noise ratio leads to an improvement of 2 mag in this very crowded region. Although the correction does not reach the diffraction limit, which at this wavelength is 0.07″, the image quality is essentially constant across the field of view; the standard deviation of the full-width at half-maximum (FWHM) is 0.009″.

We have examined the behaviour of the PSF from observations in the J, H and K bands of the open star cluster M34, which is less crowded than M3 and allows individual stellar images to be well isolated for this purpose. Properties of the PSF are summarized in

1Steward Observatory, The University of Arizona, Tucson, Arizona 85721, USA. 2Caltech Optical Observatories, California Institute of Technology, Pasadena, California 91125, USA. 3W. M. Keck Observatory, 65-1120 Mamalahoa Highway, Kamuela, Hawaii 96743, USA.

Figure 1 panels: a, Uncorrected full field (110″); b–e, Central and edge subfields (27″), Uncorrected (b, d) and GLAO corrected (c, e); marked star, mK = 16.5.

Figure 1 | The core of M3 imaged in the K band in two 60-s exposures in May 2009. a, The full 110″ field of our infrared camera in the native seeing limit of 0.7″, on a logarithmic intensity scale. b, d, Two smaller 27″ regions of the same image, indicated by the boxes in a, shown on a truncated linear scale in which bright stars appear saturated but which reaches the noise floor and brings out the faintest observable stars: one (b) is centred on the tip–tilt star, indicated by the arrow, and the other (d) is positioned to show the edge of the field. c, e, In a second 60-s exposure of the same two regions, taken with GLAO running at 400 Hz, and shown on the same linear scale as b and d, the stellar image width is reduced to 0.3″ and the PSF morphology is very similar across the whole field of view. For reference, we highlight a star in the corrected image with K-band magnitude mK = 16.5, detected at a signal-to-noise ratio of 26. In the uncorrected image, stars must be 2 mag brighter to be seen at the same signal-to-noise ratio.

Table 1. In Fig. 2a, we show the FWHM of stellar images as a function of angular separation from the tilt star, measured from 60-s exposures recorded in the K band with and without GLAO correction. The mean uncorrected FWHM in this case was 0.61″, and this improved to 0.22″ with correction. We note that no trend in corrected image width across the field is apparent. The standard deviation is just 0.016″, attributable to the 0.015″ mean estimated uncertainty, arising largely from sky background noise, in the measurement of individual FWHM values. The reduction in FWHM represents an improvement in seeing from average to better than the fifth percentile for the site. Of particular importance to spectroscopy, which is improved with higher energy concentration, is that the encircled energy flux within a 0.2″ circular aperture increased substantially in all wavebands. So did the peak intensity, which in the case of the K band improved from 1.2% to 6.7% of the value of the diffraction limit. No statistically significant trend with field angle is distinguishable for any of these metrics. Furthermore, observed ellipticity in the PSFs seems to be randomly distributed in position angle, with magnitudes between 0.0 and 0.3 that are consistent in each case with a true value of zero. In short, we do not see any evidence for PSF variation across the field of view. Rather, GLAO correction was fully effective over at least 2′, suggesting that the improvement will be significant over substantially larger fields.

Radial stellar image profiles were computed in all three wavebands by averaging the images of the same 25 stars across the field that were used to characterize the GLAO performance in Fig. 2a. The positions of the stars in the field are plotted in Fig. 2b and the profiles are shown in Fig. 2c. The best image sharpening was as expected in the K band, but, remarkably, the system is still effective at correcting in the J band. The performance in the H and K bands is in line with the results predicted by numerical modelling: approximately 0.2″ ± 0.05″ for the K band18,19 and 0.3″ ± 0.05″ for the H band19, depending on assumed conditions. But the best performance expected at the shorter wavelength for the approximately median conditions actually observed in this experiment is around 0.4″. The result is arguably attributable to an unusually low and thin ground layer during this observation, or a high ratio of ground-layer to free-atmosphere turbulence, but performance in the J band that exceeds the modelling predictions has now been observed during several telescope runs under a variety of seeing conditions and at different times of year.

Examination of the individual 1-s exposures of M3 shows that for the bright stars, relative astrometric accuracy is proportional to 1/√t_int, where t_int is the total integration time, as expected from differential tilt jitter along the different lines of sight23. The standard deviation for the measured positions of stars of K-band magnitude mK = 16 is 7.0 mas over the full field in the 1-s exposures, which reduces to 1.3 mas in co-additions of 30-s total exposure. Provided that care is taken to avoid systematic effects from optical distortion in the instrument and differential atmospheric refraction3, the error will continue to scale as 1/√t_int and also as 1/√B, where B is the brightness of the star. Extrapolating from the M3 observations, we find that the required accuracy of 0.2 mas for stars of mK = 18 will be achieved in an integration time of 7,700 s.

The MMT's GLAO system is the first of its kind, designed as a prototype for more capable systems on larger telescopes. Nonetheless, performance, even at this early stage in the development of the technique, compares favourably with space-based instruments intended to address similar scientific goals. The Wide Field Camera 3 on the Hubble Space Telescope is the latest such instrument. It offers imaging and low-resolution slitless spectroscopy over a field of view of 135″ × 123″, sampled with rectangular 0.135″ × 0.121″ pixels, and wavelength coverage from 800 to 1,700 nm in its infrared channel. The field of view and spatial resolution are similar to those of the PISCES camera at the MMT with GLAO, and because the MMT has a collecting area 7.8-fold bigger than that of the Hubble Space Telescope, energy concentration in the core of the PSF is also similar. But the great versatility of GLAO lies in its broad applicability: it is not tied to any particular instrument. Building on the work at the MMT, a similar system is now under construction for use with the Large Binocular Telescope, in Arizona24, which also deploys adaptive secondary mirrors. Two instruments, collectively called LUCIFER25, one on each half of the Large Binocular Telescope and each with a field of view of 4′, offer both imaging and multi-object spectroscopy in the J, H and K bands. Supplied with GLAO-corrected images, they will undertake rapid spectroscopic and astrometric surveys of large areas of the sky, with

Table 1 | Properties of the compensated point spread function
Waveband       FWHM     Encircled energy enhancement*   Peak intensity enhancement
K (2.2 µm)     0.22″    3.8                             5.5
H (1.65 µm)    0.29″    2.7                             3.6
J (1.25 µm)    0.29″    2.3                             3.0
* The factor by which the energy within a 0.2″ circular aperture is increased by GLAO.
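The astrometric scaling quoted in the text (error proportional to 1/√t_int and 1/√B) can be checked with a short worked sketch. The function names are hypothetical, and the conversion from a magnitude difference to a brightness ratio uses the standard relation B ∝ 10^(−0.4 m), which the text does not spell out and is assumed here.

```python
import math


def brightness_ratio(delta_mag):
    """B / B_ref for a star delta_mag magnitudes fainter than the reference
    (assumed standard magnitude relation, B proportional to 10**(-0.4 * m))."""
    return 10 ** (-0.4 * delta_mag)


def scaled_error_mas(sigma_ref_mas, t_ref_s, t_s, delta_mag=0.0):
    """Astrometric error after t_s seconds, assuming
    sigma proportional to 1/sqrt(t_int) and 1/sqrt(B)."""
    return sigma_ref_mas * math.sqrt(t_ref_s / t_s) / math.sqrt(brightness_ratio(delta_mag))


def required_time_s(sigma_ref_mas, t_ref_s, target_mas, delta_mag=0.0):
    """Integration time needed to reach target_mas under the same scaling."""
    return t_ref_s * (sigma_ref_mas / target_mas) ** 2 / brightness_ratio(delta_mag)


# Reference point from the text: 7.0 mas in 1 s for mK = 16.
print(scaled_error_mas(7.0, 1.0, 30.0))           # about 1.28 mas after 30 s
print(required_time_s(7.0, 1.0, 0.2, delta_mag=2))  # about 7.7e3 s for mK = 18
```

Under these assumptions the sketch reproduces both figures in the text: roughly 1.3 mas for the 30-s co-addition and roughly 7,700 s to reach 0.2 mas at mK = 18.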

dozens of objects examined in a single exposure. GLAO effectively improves the site seeing by a factor of two to three over a wide field, thereby increasing the effective telescope aperture by a similar factor in regard to its sensitivity to unresolved sources. It therefore allows the study of many important topics, ranging from the formation and aggregation of galaxies in the early Universe and the origin of the Hubble sequence to determining the initial mass function of massive young star clusters.

Figure 2 panels: a, FWHM (″) versus field angle (″), GLAO corrected and uncorrected; b, star positions in the field, axes from –60″ to 60″, with N and E marked; c, intensity versus field angle (″) for K, H and J with GLAO, K uncorrected and the K diffraction limit.

Figure 2 | Comparison of open-loop and closed-loop near-infrared image widths. a, In the K band, the corrected stellar images in M34 show no more significant variation in FWHM versus separation from the tip–tilt star (spectral type A1, mK = 10.0) across the PISCES field than they do in the seeing limit. The horizontal dotted lines represent the average corrected FWHM (red; 0.22″) and uncorrected FWHM (blue; 0.61″, about median for the site). In both cases, the FWHM was measured for all stars detected with a signal-to-noise ratio greater than 20, which yielded sample sizes of 13 and 25 in the open- and closed-loop cases, respectively. b, The stars' placement in the field. Red points indicate stars measured in closed loop, and blue points indicate those measured in both open loop and closed loop. c, Radial profiles of the GLAO-corrected images, normalized to unit peak intensity, in the J, H and K wavebands have FWHMs of 0.29″, 0.29″ and 0.22″, respectively. The remarkable degree of similarity between the J and H profiles is, we believe, attributable to statistical fluctuations in the seeing. Also shown, for comparison, are the seeing-limited K-band image profile, normalized to the same total energy as the corrected K-band image, and the profile expected of a diffraction-limited source in the K band.

Received 19 March; accepted 24 June 2010.

1. McCrady, N., Graham, J. R. & Vacca, W. D. Mass segregation and the initial mass function of super star cluster M82-F. Astrophys. J. 621, 278–284 (2005).
2. Stolte, A., Brandner, W., Grebel, E. K., Lenzen, R. & Lagrange, A.-M. The Arches cluster: evidence for a truncated mass function? Astrophys. J. 628, L113–L117 (2005).
3. Cameron, P. B., Britton, M. C. & Kulkarni, S. R. Precision astrometry with adaptive optics. Astron. J. 137, 83–93 (2009).
4. Angel, J. R. P. & Lloyd-Hart, M. Atmospheric tomography with Rayleigh laser beacons for correction of wide fields and 30 m class telescopes. Proc. SPIE 4007, 270–276 (2000).
5. Rigaut, F. in Beyond Conventional Adaptive Optics (eds Vernet, E., Ragazzoni, R., Esposito, S. & Hubin, N.) 11–16 (Proc. ESO 58, European Southern Observatory, 2002).
6. Lloyd-Hart, M. et al. Experimental results of ground-layer and tomographic wavefront reconstruction from multiple laser guide stars. Opt. Exp. 14, 7541–7551 (2006).
7. Marchetti, E. et al. On-sky testing of the Multi-Conjugate Adaptive Optics Demonstrator. Messenger 129, 8–13 (2007).
8. Tokovinin, A. Seeing improvement with ground-layer adaptive optics. Publ. Astron. Soc. Pacif. 116, 941–951 (2004).
9. Martin, O. et al. Opto-mechanical commissioning of the GLAS Rayleigh laser guide star for the WHT. Proc. SPIE 7015, 70154N (2008).
10. Tokovinin, A. et al. SAM: a facility GLAO instrument. Proc. SPIE 7015, 70154C (2008).
11. Stalcup, T. et al. Field tests of wavefront sensing with multiple Rayleigh laser guide stars and dynamic refocus. Proc. SPIE 5490, 1021–1032 (2004).
12. Lloyd-Hart, M. et al. First tests of wavefront sensing with a constellation of laser guide beacons. Astrophys. J. 634, 679–686 (2005).
13. Baranec, C. et al. On-sky wide-field adaptive optics correction using multiple laser guide stars at the MMT. Astrophys. J. 693, 1814–1820 (2009).
14. Reid, M. J. et al. Trigonometric parallaxes of massive star-forming regions. VI. Galactic structure, fundamental parameters, and noncircular motions. Astrophys. J. 700, 137–148 (2009).
15. Stolte, A. et al. The proper motion of the Arches cluster with Keck laser-guide star adaptive optics. Astrophys. J. 675, 1278–1292 (2008).
16. Hußmann, B., Stolte, A. & Brandner, W. in Star Clusters: Basic Galactic Building Blocks throughout Time and Space (eds de Grijs, R. & Lépine, J. R. D.) 422 (Proc. Int. Astron. Union 5 Symp. S266, International Astronomical Union, 2010).
17. Schödel, R., Merritt, D. & Eckart, A. The nuclear star cluster of the Milky Way: proper motions and mass. Astron. Astrophys. 502, 91–111 (2009).
18. Andersen, D. et al. Performance modeling of a wide-field ground-layer adaptive optics system. Publ. Astron. Soc. Pacif. 118, 1574–1590 (2006).
19. Le Louarn, M. & Hubin, N. Improving the seeing with wide-field adaptive optics in the near-infrared. Mon. Not. R. Astron. Soc. 365, 1324–1332 (2006).
20. Wildi, F., Brusa, G., Lloyd-Hart, M., Close, L. & Riccardi, A. First light of the 6.5-m MMT adaptive optics system. Proc. SPIE 5169, 17–25 (2003).
21. Lloyd-Hart, M. Thermal performance enhancement of adaptive optics by use of a deformable secondary mirror. Publ. Astron. Soc. Pacif. 112, 264–272 (2000).
22. McCarthy, D., Ge, J., Hinz, J., Finn, R. & de Jong, R. PISCES: a wide field 1–2.5 micron camera for large aperture telescopes. Publ. Astron. Soc. Pacif. 113, 353–361 (2001).
23. Sasiela, R. J. Electromagnetic Wave Propagation in Turbulence: Evaluation and Application of Mellin Transforms 2nd edn, 164–172 (Springer Ser. Wave Phenomena, Springer, 1994).
24. Rabien, S. et al. The laser guide star program for the LBT. Proc. SPIE 7015, 701515 (2008).
25. Mandel, H. et al. LUCIFER: a NIR spectrograph and imager for the LBT. Astron. Nachr. 328, 626–627 (2007).

Acknowledgements We thank the staff of the Steward Observatory Engineering and Technical Services division and the staff of the MMT Observatory for their support in the development and deployment of the MMT adaptive optics system. We are grateful to P. Strittmatter and R. Angel for reading the manuscript. The observations reported here were made at the MMT Observatory, a joint facility of The University of Arizona and the Smithsonian Institution. The work has been supported by the National Science Foundation.

Author Contributions M.H. wrote the paper. N.M.M., C.B., M.H. and E.B. carried out the data reduction. N.M.M. designed the adaptive optics reconstructor matrices. K.P. analysed the real-time system performance. T.S. designed and built the laser launch optics and wrote the system's operating software. D.M. and C.K. operated the infrared camera that recorded the cluster images at the telescope. All authors took part in the telescope runs during which the data presented here were acquired.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.H. ([email protected]).

Vol 466 | 5 August 2010 | doi:10.1038/nature09256 LETTERS

Quantum entanglement between an optical photon and a solid-state spin qubit

E. Togan1*, Y. Chu1*, A. S. Trifonov1, L. Jiang1,2,3, J. Maze1, L. Childress1,4, M. V. G. Dutt1,5, A. S. Sørensen6, P. R. Hemmer7, A. S. Zibrov1 & M. D. Lukin1

Quantum entanglement is among the most fascinating aspects of quantum theory1. Entangled optical photons are now widely used for fundamental tests of quantum mechanics2 and applications such as quantum cryptography1. Several recent experiments demonstrated entanglement of optical photons with trapped ions3, atoms4,5 and atomic ensembles6–8, which are then used to connect remote long-term memory nodes in distributed quantum networks9–11. Here we realize quantum entanglement between the polarization of a single optical photon and a solid-state qubit associated with the single electronic spin of a nitrogen vacancy centre in diamond. Our experimental entanglement verification uses the quantum eraser technique5,12, and demonstrates that a high degree of control over interactions between a solid-state qubit and the quantum light field can be achieved. The reported entanglement source can be used in studies of fundamental quantum phenomena and provides a key building block for the solid-state realization of quantum optical networks13,14.

A quantum network13 consists of several nodes, each containing a long-lived quantum memory and a small quantum processor, that are connected via entanglement. Its potential applications include long-distance quantum communication and distributed quantum computation15. Several recent experiments demonstrated on-chip entanglement of solid-state qubits separated by nanometre16 to millimetre17,18 length scales. However, realization of long-distance entanglement based on solid-state systems coupled to single optical photons19 is an outstanding challenge. The nitrogen-vacancy (NV) centre, a defect in diamond consisting of a substitutional nitrogen atom and an adjacent vacancy, is a promising candidate for implementing a quantum node. The ground state of the negatively charged NV centre is an electronic spin triplet with a 2.88-GHz zero-field splitting between the magnetic sublevels |ms = 0⟩ and |ms = ±1⟩ states (from here on denoted |0⟩ and |±1⟩). With long coherence times20, fast microwave manipulation, and optical preparation and detection21, the NV electronic spin presents a promising qubit candidate. Moreover, it can be coupled to nearby nuclear spins that provide exceptional quantum memories; such coupling allows for the robust implementation of few-qubit quantum registers16,22. In this work, we demonstrate the preparation of quantum entangled states between a single photon and the electronic spin of an NV centre:

$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|\sigma_-\rangle|{+1}\rangle + |\sigma_+\rangle|{-1}\rangle\right) \qquad (1)$$

where |σ+⟩ and |σ−⟩ are orthogonal circularly polarized single photon states.

The key idea of our experiment is illustrated in Fig. 1a. The NV centre is prepared in a specific excited state (|A2⟩ in Fig. 1a) that decays with equal probability into two different long lived spin states (|±1⟩) by the emission of orthogonally polarized optical photons at 637 nm. The entangled state given by equation (1) is created because photon polarization is uniquely correlated with the final spin state. This entanglement is verified by spin state measurement using a cycling optical transition following the detection of a 637-nm photon of chosen polarization.

Understanding and controlling excited state properties is a central challenge for achieving such a coherent interface between spin memory and optical photons. In contrast to isolated atoms and ions, solid state systems possess complex excited state properties that depend sensitively on their local environment23. Non-axial crystal strain is particularly important to the present realization because it affects the optical transitions' selection rules and polarization properties24. In the absence of external strain and electric or magnetic fields, properties of the six electronic excited states are determined by the NV centre's C3v symmetry and spin–orbit and spin–spin interactions (shown in Fig. 2a)24. Optical transitions between the ground and excited states are spin preserving, but could change electronic orbital angular momentum depending on the photon polarization. Two of the excited states, labelled |Ex⟩ and |Ey⟩ according to their orbital symmetry, correspond to the ms = 0 spin projection. Therefore they couple only to the |0⟩ ground state and provide good cycling transitions, suitable for readout of the |0⟩ state population through fluorescence detection. The other four excited states are entangled states of spin and orbital angular momentum. Specifically, the |A2⟩ state has the form

$$|A_2\rangle = \frac{1}{\sqrt{2}}\left(|E_-\rangle|{+1}\rangle + |E_+\rangle|{-1}\rangle\right) \qquad (2)$$

where |E±⟩ are orbital states with angular momentum projection ±1 along the NV axis. At the same time, the ground states (|0⟩, |±1⟩) are associated with the orbital state |E0⟩ with zero projection of angular momentum (for simplicity, the spatial part of the wavefunction is not explicitly written). Hence, owing to total angular momentum conservation, the |A2⟩ state decays with equal probability to the |−1⟩ ground state through σ+ polarized radiation and to |+1⟩ through σ− polarized radiation.

The inevitable presence of a small strain field, characterized by the strain splitting (Δs) of |Ex,y⟩, reduces the NV centre's symmetry and shifts the energies of the excited state levels according to their orbital wavefunctions. For moderate and high strain, the excited states are separated into two branches and there is mixing between levels25. In the upper branch, an energy gap protects |A2⟩ against low strain and

1Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA. 2Department of Physics, California Institute of Technology, Pasadena, California 91125, USA. 3Institute for Quantum Information, California Institute of Technology, Pasadena, California 91125, USA. 4Department of Physics and Astronomy, Bates College, Lewiston, Maine 04240, USA. 5Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA. 6QUANTOP, The Niels Bohr Institute, University of Copenhagen, DK2100 Copenhagen, Denmark. 7Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843, USA. *These authors contributed equally to this work.

Figure 1 | Scheme for spin-photon entanglement. a, Following selective excitation to the |A2⟩ state, the Λ-type three level system decays to two different spin states through the emission of orthogonally polarized photons, resulting in spin–photon entanglement. b, Schematic of the optical set-up. Individual NV centres are isolated and addressed optically using a microscope objective. Two resonant lasers at 637 nm and an off-resonant laser at 532 nm address various optical transitions. Fluorescence emitted from the NV centre passes through a quarter-wave plate (QWP) and is spectrally separated into PSB and ZPL channels, and detected with avalanche photodiodes (APDs). The latter channel contains entangled photons and is sent using a beam splitter (BS) through a polarization analysis stage consisting of a half-wave plate (HWP) and a polarizing beam splitter (PBS). See text for details.

magnetic fields, preserving the polarization properties of its optical transitions. A group theoretical analysis of the excited states and polarization properties of the transitions is given in the Supplementary Information.

To ensure that |Ey⟩ is a good cycling transition and |A2⟩ acts as an entanglement generation transition as required for the current study, we select an NV centre with relatively small strain splitting (Δs ≈ 2 × 1.28 GHz). Figure 2b presents its excitation spectrum,

[Figure 2 panels. a: excited state energies (GHz) versus strain perpendicular to NV axis, Δs (GHz), with levels |A2⟩, |A1⟩, |Ex⟩, |Ey⟩ and |E1,2⟩. b: PSB fluorescence (a.u.) versus frequency (GHz). c: PSB fluorescence (a.u.) versus QWP angle (degrees).]

Figure 2 | Characterization of NV centres. a, Energy levels of the NV centre under strain. Solid lines are based on a theoretical model23 and dots are data from seven NV centres. The dashed line indicates the NV centre used in this paper. b, Excitation spectrum of the NV centre under continuous wave (c.w.) microwave radiation. c, Polarization properties of the |±1⟩ → |A2⟩ transition in absorption. The system is initially prepared in |+1⟩ (blue) or |−1⟩ (red). We then apply a laser pulse of varying polarization to the |A2⟩ state while collecting fluorescence. Oscillations with visibility 77 ± 10% indicate that the transitions linking |±1⟩ to |A2⟩ are circularly polarized and mutually orthogonal (see Supplementary Information for details).

while Fig. 2c demonstrates the desired polarization properties of the |±1⟩ ↔ |A2⟩ transitions via resonant excitation.

We now turn to the experimental demonstration of spin–photon entanglement. Our experimental set-up is outlined in Fig. 1b and described in the Supplementary Information. To create the entangled state, we use coherent emission within the narrow-band zero phonon line (ZPL), which includes only 4% of the NV centre's total emission. The remaining optical radiation occurs in the frequency shifted phonon side band (PSB), which is accompanied by phonon emission that causes deterioration of the spin–photon entanglement26. Isolating the weak ZPL emission presents a significant experimental challenge owing to strong reflections of the resonant excitation pulse reaching the detector. By exciting the NV centre with a circularly polarized 2-ns π-pulse that is shorter than the emission timescale, we can use detection timing to separate reflection from fluorescence photons. A combination of confocal rejection, modulators and finite transmittivity of our optics suppresses the reflections sufficiently to clearly detect the NV centre's ZPL emission in a 20-ns region (Fig. 3).

For photon state determination, ZPL photons in either the |σ±⟩ or the |H⟩ = (|σ+⟩ + |σ−⟩)/√2, |V⟩ = (|σ+⟩ − |σ−⟩)/√2 basis are selected by a polarization analysis stage and detected after an optical path of ~2 m. Spin readout then occurs after a 0.5-μs spin memory interval following photon detection by transferring population from either the |±1⟩ states or from their appropriately chosen superposition into the |0⟩ state using microwave pulses, Ω±1. The pulses selectively address the |0⟩ ↔ |±1⟩ transitions with resonant frequencies ω± that differ by δω = ω+ − ω− = 122 MHz due to an applied magnetic field. For superpositions of |±1⟩ states, an echo sequence is applied before the state transfer to extend the spin coherence time (see Supplementary Information). The transfer is followed by resonant excitation of the |0⟩ ↔ |Ey⟩ transition and collection of the PSB fluorescence. We carefully calibrate the transferred population measured in the |0⟩ state using the procedure detailed in Methods.

Figure 4a shows the populations in the |±1⟩ states, measured conditionally on the detection of a single circularly polarized ZPL photon. Excellent correlations between the photon polarization and NV spin states are observed.

To complete the verification of entanglement, we now show that correlations persist when ZPL photons are detected in a rotated polarization basis. On detection of a linearly polarized |H⟩ or |V⟩ photon at time td, the entangled state in equation (1) is projected to |±⟩ = (|+1⟩ ± |−1⟩)/√2, respectively. These states subsequently evolve in time (t) according to:

$$|\pm\rangle_t = \frac{1}{\sqrt{2}}\left(e^{-i\omega_+(t-t_d)}|{+1}\rangle \pm e^{-i\omega_-(t-t_d)}|{-1}\rangle\right) \qquad (3)$$

In order to read out the relative phase of superposition states between |+1⟩ and |−1⟩, we use two resonant microwave fields with frequencies ω+ and ω− to coherently transfer the state

$$|M\rangle = \frac{1}{\sqrt{2}}\left(e^{-i\omega_+ t}|{+1}\rangle + e^{-i\left(\omega_- t-(\varphi_+-\varphi_-)\right)}|{-1}\rangle\right)$$

to |0⟩ (see Fig. 3b), where the initial relative phase φ+ − φ− is set to the same value for each round of the experiment. Thus, the conditional probability of measuring the state |M⟩ is

$$p_{M|H,V}(t_d) = \frac{1 \pm \cos\alpha(t_d)}{2} \qquad (4)$$

where α(td) = (ω+ − ω−)td + (φ+ − φ−). Equation (4) indicates that the two conditional probabilities should oscillate with a π phase difference as a function of the photon detection time, td. This can be understood as follows. In the presence of Zeeman splitting (δω ≠ 0), the NV centre's spin state is entangled with both the polarization and frequency of the emitted photon. The photon's frequency provides which-path information about its decay. In the spirit of quantum eraser techniques, the detection of |H⟩ or |V⟩ at td with high time resolution (~300 ps ≪ 1/δω) erases the frequency information5,12. When the initial relative phase between the microwave fields Ω±1

Figure 3 | Experimental procedure for entanglement generation. a, After spin polarization into |0⟩, population is transferred to |+1⟩ by a microwave π-pulse (Ω+1). The NV is excited to |A2⟩ with a 637.19-nm π-pulse and the ZPL emission is collected. b, If a σ+ or σ− photon is detected, the population in |+1⟩ or |−1⟩ is transferred to |0⟩. If an |H⟩ or |V⟩ photon is detected, a τ–2π–τ echo sequence (see Supplementary Information) is applied with Ω+1 and Ω−1, followed by a π-pulse which transfers the population in |M⟩ (see text) to |0⟩. c, The population in |0⟩ is measured using the 637.20-nm optical readout transition. d, Pulse sequence for the case where an |H⟩ or |V⟩ ZPL photon is detected (time axis not to scale). If a σ± photon is detected instead, only a π-pulse on either Ω+1 or Ω−1 is used for spin readout. Inset, detection time of ZPL channel photons (cumulative ZPL counts versus time of detection in ns), showing reflection from diamond surface and subsequent NV emission (blue) and background counts (purple).

Figure 4 | Measurement of spin-photon correlations in two bases. a, Conditional probability of measuring |±1⟩ after the detection of a σ+ or σ− photon. b, Conditional probability of measuring |±⟩ after the detection of an H or V photon, extracted from a fit to data shown in c and d. c, d, Measured conditional probability of finding the electronic spin in the state |M⟩ after detection of a V (c) or H (d) photon at time td (in ns). Blue shaded region is the 68% confidence interval for the fit (solid line) to the time-binned data (Supplementary Information). Error bars on data points show ±1 s.d. Combined with the data shown in a, oscillations with amplitude outside of the yellow regions result in fidelities greater than 0.5. The visibilities of the measured oscillations are 0.59 ± 0.18 (c) and 0.60 ± 0.11 (d).

is kept constant, the acquired phase difference (ω+ − ω−)td gives rise to oscillations in the conditional probability and produces an effect equivalent to varying the relative phase in the measured superposition; this allows us to verify the coherence of the spin–photon entangled state.

The detection times of ZPL photons are recorded during the experiment without any time gating, which allows us to study spin–photon correlations without reducing the count rate. The resulting data are analysed in two different ways. First, we time-bin the data and use it to evaluate the conditional probabilities of measuring spin state |M⟩ as a function of |H⟩ or |V⟩ photon detection time (Fig. 4c, d). Off-diagonal elements of the spin–photon density matrix are evaluated from a simultaneous fit to the binned data (see Methods). The time bins are chosen to minimize fit uncertainty, as described in Supplementary Information. The resulting conditional probabilities are used to evaluate a lower bound on the entanglement fidelity of F ≥ 0.69 ± 0.068, above the classical limit of 0.5, indicating the preparation of an entangled state.

We further reinforce our analysis using the method of maximum likelihood estimate. As described in Supplementary Information, this method is applied to raw, un-binned ZPL photon detection and spin measurement data and yields a probability distribution of a lower bound on the fidelity. Consistent with the time-binned approach, we find that our data are described by a near Gaussian probability distribution associated with a fidelity of F ≥ 0.70 ± 0.070 (see Supplementary Fig. 7). Significantly, the cumulative probability distribution directly shows that the measured lower bound on the fidelity is above the classical limit with a probability of 99.7%.

Several experimental imperfections reduce the observed entanglement fidelity. First, the measured strain and magnetic field slightly mix the |A2⟩ state with the other excited states. On the basis of Fig. 2b, we estimate that the |A2⟩ state imperfection and photon depolarization in the set-up together reduce the fidelity by 12%, the latter being the dominant effect. Imperfections in readout and echo microwave pulses decrease the fidelity by 3%. Other error sources include finite signal to noise ratio in the ZPL channel (fidelity decrease 11%), as well as timing jitter (another 4%). The resulting expected fidelity (75%) is consistent with our experimental observations. Finally, the entanglement generation succeeds with probability p ≈ 10^−6, which is limited by low collection and detection efficiency as well as the small probability of ZPL emission.

Entanglement of pairs of remote quantum registers is one important potential application of the technique described here11. This can be done by coincidence measurements on a pair of photons emitted by two remote NV centres. The key figure of merit for such an entanglement operation over a distance L is proportional to $p^2\,\gamma T/(1+\gamma\tau)$, where γ ≈ 2π × 15 MHz is the spontaneous decay rate of the NV centre, τ = L/c is the photon travel time, c is the velocity of light, and T is the memory lifetime. A large figure of merit is critical for applications such as quantum repeaters and entanglement purification protocols. The 0.5-μs spin memory interval in our experiments can be extended to several hundred microseconds using spin echo techniques. Furthermore, by mapping the electronic spin state onto proximal nuclei, T can be extended to hundreds of milliseconds22. The key limitation in attaining a large figure of merit is low p. It can be circumvented if optical cavities are used, which simultaneously enhance emission into the ZPL and improve collection efficiency through integration with appropriate waveguides. For example, by using a photonic crystal nanocavity27–29, the potential rate for spin–spin entanglement generation can be about 1 MHz for τ ~ 1/γ and a few hertz for τ corresponding to L ≈ 100 km, resulting in $p^2\,\gamma T/(1+\gamma\tau) \geq 1$. Beyond this specific application, our ability to control interactions between NV centres and quantum light fields demonstrates that quantum optical techniques, such as all-optical spin control, non-local entanglement11 and photon storage30, can be implemented using long-lived solid-state qubits, paving the way for a wide variety of potential applications in quantum optics and quantum information science.
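The oscillation predicted by equation (4) can be checked numerically from equations (1) and (3). The following is a minimal sketch, not the authors' analysis code: it assumes NumPy, reads the quoted 122 MHz splitting as an ordinary frequency (so ω+ − ω− = 2π × 122 MHz), and uses illustrative values for ω±, the readout time and the microwave phase. The names `project_photon` and `p_M` are hypothetical.

```python
import numpy as np

# Spin basis: index 0 -> |+1>, index 1 -> |-1>.
# Photon basis: index 0 -> |sigma+>, index 1 -> |sigma->.
sig_p = np.array([1, 0], dtype=complex)
sig_m = np.array([0, 1], dtype=complex)
spin_p1 = np.array([1, 0], dtype=complex)
spin_m1 = np.array([0, 1], dtype=complex)

# Equation (1), stored as (photon) x (spin).
psi = (np.kron(sig_m, spin_p1) + np.kron(sig_p, spin_m1)) / np.sqrt(2)

# Linear polarization states as defined in the text.
H = (sig_p + sig_m) / np.sqrt(2)
V = (sig_p - sig_m) / np.sqrt(2)


def project_photon(state, pol):
    """Normalized spin state left behind when the photon is detected in `pol`."""
    spin = pol.conj() @ state.reshape(2, 2)  # contract the photon index
    return spin / np.linalg.norm(spin)


def p_M(pol, t_d, t, w_plus, w_minus, dphi):
    """Probability of finding the spin in |M> at time t after a click at t_d."""
    spin = project_photon(psi, pol)
    # Equation (3): each Zeeman sublevel acquires its own phase after detection.
    evolved = spin * np.exp(-1j * np.array([w_plus, w_minus]) * (t - t_d))
    # |M> as defined in the text, with dphi = phi_plus - phi_minus.
    M = np.array([np.exp(-1j * w_plus * t),
                  np.exp(-1j * (w_minus * t - dphi))]) / np.sqrt(2)
    return abs(M.conj() @ evolved) ** 2


# Illustrative parameters (assumptions, not values taken from the experiment
# except for the 122 MHz splitting).
d_omega = 2 * np.pi * 122e6
w_plus = 2 * np.pi * 2.941e9
w_minus = w_plus - d_omega
dphi = 0.0
t_readout = 0.5e-6

for t_d in np.linspace(0.0, 15e-9, 6):
    pH = p_M(H, t_d, t_readout, w_plus, w_minus, dphi)
    pV = p_M(V, t_d, t_readout, w_plus, w_minus, dphi)
    print(f"t_d = {t_d * 1e9:5.1f} ns   p(M|H) = {pH:.3f}   p(M|V) = {pV:.3f}")
```

Under these assumptions the two probabilities sum to one at every detection time and depend only on td, not on the readout time t, which is the π phase difference and the quantum-eraser behaviour described in the text.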

METHODS SUMMARY

Spin readout. We determine the spin of the NV centre by resonantly exciting the |0⟩ ↔ |Ey⟩ transition and collecting emission on the PSB within a 10-ms window. To obtain accurate readout levels relevant for calibration of our experimental data, we effectively project the state of the NV centre into |0⟩ or |±1⟩ before spin measurement by detecting a PSB photon after exciting to the |Ey⟩ or |A2⟩ state, respectively. Subsequent readout produces a maximum number of 0.11 ± 0.0022 counts per shot when the NV is initially in the |0⟩ state and a minimum number (consistent with background) when it is in the |±1⟩ states. These levels are then used to calculate the populations measured for entanglement verification. Compared to conventional spin measurements in NV centres, this method is less sensitive to effects such as imperfect initial spin polarization, NV photo-ionization, and spectral or spatial instabilities. A detailed description of the spin readout measurement and calibration is given in the Supplementary Information.

Calculation of entanglement fidelity. To estimate the entanglement fidelity, we first use the conditional measurement shown in Fig. 4a, b to determine diagonal elements of the spin–photon density matrix in the |σ±⟩, |±1⟩ basis. As σ± photons are emitted with equal probability (see Supplementary Information), we find

$$\rho_{\sigma_+-1,\sigma_+-1} = \tfrac{1}{2}\,p_{-1|\sigma_+} = \tfrac{1}{2}(0.96 \pm 0.12),\quad \rho_{\sigma_++1,\sigma_++1} = \tfrac{1}{2}(0.07 \pm 0.04),$$
$$\rho_{\sigma_--1,\sigma_--1} = \tfrac{1}{2}(0.10 \pm 0.05),\quad \rho_{\sigma_-+1,\sigma_-+1} = \tfrac{1}{2}(0.87 \pm 0.14).$$

To evaluate the off-diagonal elements, we rotate the measurement basis by projecting the photon to the |H⟩ or |V⟩ states and measuring the conditional probability of being in state |M⟩, which is equal to |±⟩ for particular choices of α (for example, |+⟩ = |M⟩ at α = 0). The required diagonal matrix elements in the |H⟩, |V⟩, |±⟩ basis are then given by $\rho_{V+,V+} = \tfrac{1}{2}\,p_{M|V}(\alpha = 0)$, and similarly for $\rho_{H+,H+}$, $\rho_{H-,H-}$ and $\rho_{V-,V-}$. We model the experimentally measured conditional probabilities with the forms $p_{M|H} = (b_H + a_H\cos\alpha)/2$ and $p_{M|V} = (b_V - a_V\cos\alpha)/2$, where $b_{H,V}$ are the offsets of the oscillations and $a_{H,V}$ are their amplitudes. Using a simultaneous fit to the data in Fig. 4c, d that constrains the frequency to be the Zeeman splitting, we obtain the values

$$\rho_{V+,V+} - \rho_{V-,V-} = a_V/2 = \tfrac{1}{2}(0.53 \pm 0.16),\quad \rho_{H-,H-} - \rho_{H+,H+} = a_H/2 = \tfrac{1}{2}(0.58 \pm 0.10).$$

The information obtained is sufficient to provide a lower bound for the entanglement fidelity. Using the analysis in ref. 3,

$$F \geq \tfrac{1}{2}\left(\rho_{\sigma_+-1,\sigma_+-1} + \rho_{\sigma_-+1,\sigma_-+1} - 2\sqrt{\rho_{\sigma_++1,\sigma_++1}\,\rho_{\sigma_--1,\sigma_--1}} + \rho_{V+,V+} - \rho_{V-,V-} + \rho_{H-,H-} - \rho_{H+,H+}\right),$$

we find F ≥ 0.69 ± 0.068. This analysis agrees with the results of an independent maximum likelihood analysis described in the Supplementary Information, which yields a near Gaussian probability distribution for a lower bound on the fidelity with F ≥ 0.70 ± 0.070.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 8 February; accepted 8 June 2010.

1. Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information (Cambridge Univ. Press, 2000).
2. Aspect, A., Grangier, P. & Roger, G. Experimental realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: a new violation of Bell's inequalities. Phys. Rev. Lett. 49, 91–94 (1982).
3. Blinov, B. B., Moehring, D. L., Duan, L. M. & Monroe, C. Observation of entanglement between a single trapped atom and a single photon. Nature 428, 153–157 (2004).
4. Volz, J. et al. Observation of entanglement of a single photon with a trapped atom. Phys. Rev. Lett. 96, 030404 (2006).
5. Wilk, T., Webster, S. C., Kuhn, A. & Rempe, G. Single-atom single-photon quantum interface. Science 317, 488–490 (2007).
6. Yuan, Z.-S. et al. Experimental demonstration of a BDCZ quantum repeater node. Nature 454, 1098–1101 (2008).
7. Matsukevich, D. et al. Entanglement of a photon and a collective atomic excitation. Phys. Rev. Lett. 95, 040405 (2005).
8. Sherson, J. F. et al. Quantum teleportation between light and matter. Nature 443, 557–560 (2006).
9. Cabrillo, C., Cirac, J. I., Garcia-Fernandez, P. & Zoller, P. Creation of entangled states of distant atoms by interference. Phys. Rev. A 59, 1025–1033 (1999).
10. Chou, C. W. et al. Measurement-induced entanglement for excitation stored in remote atomic ensembles. Nature 438, 828–832 (2005).
11. Moehring, D. L. et al. Entanglement of single-atom quantum bits at a distance. Nature 449, 68–71 (2007).
12. Scully, M. O. & Drühl, K. Quantum eraser: a proposed photon correlation experiment concerning observation and "delayed choice" in quantum mechanics. Phys. Rev. A 25, 2208–2213 (1982).
13. Kimble, H. J. The quantum internet. Nature 453, 1023–1030 (2008).
14. Childress, L., Taylor, J. M., Sørensen, A. S. & Lukin, M. D. Fault-tolerant quantum communication based on solid-state photon emitters. Phys. Rev. Lett. 96, 070504 (2006).
15. Duan, L.-M. & Monroe, C. Robust quantum information processing with atoms, photons, and atomic ensembles. Adv. At. Mol. Opt. Phys. 55, 419–464 (2008).
16. , P. et al. Multipartite entanglement among single spins in diamond. Science 320, 1326–1329 (2008).
17. Ansmann, M. et al. Violation of Bell's inequality in Josephson phase qubits. Nature 461, 504–506 (2009).
18. DiCarlo, L. et al. Demonstration of two-qubit algorithms with a superconducting quantum processor. Nature 460, 240–244 (2009).
19. de Riedmatten, H., Afzelius, M., Staudt, M. U., Simon, C. & Gisin, N. A solid-state light-matter interface at the single-photon level. Nature 456, 773 (2008).
20. Balasubramanian, G. et al. Ultralong spin coherence time in isotopically engineered diamond. Nature Mater. 8, 383–387 (2009).
21. Fuchs, G. D., Dobrovitski, V. V., Toyli, D. M., Heremans, F. J. & Awschalom, D. D. Gigahertz dynamics of a strongly driven single quantum spin. Science 326, 1520–1522 (2009).
22. Dutt, M. V. G. et al. Quantum register based on individual electronic and nuclear spin qubits in diamond. Science 316, 1312–1316 (2007).
23. Tamarat, P. et al. Spin-flip and spin-conserving optical transitions of the nitrogen-vacancy centre in diamond. N. J. Phys. 10, 045004 (2008).
24. Manson, N., Harrison, J. & Sellars, M. Nitrogen-vacancy center in diamond: model of the electronic structure and associated dynamics. Phys. Rev. B 74, 104303 (2006).
25. Santori, C. et al. Coherent population trapping of single spins in diamond under optical excitation. Phys. Rev. Lett. 97, 247401 (2006).
26. Kaiser, F. et al. Polarization properties of single photons emitted by nitrogen-vacancy defect in diamond at low temperature. ⟨http://arXiv.org/abs/0906.3426⟩ (2009).
27. Englund, D., Faraon, A., Fushman, I., Stoltz, N. & Petroff, P. Controlling cavity reflectivity with a single quantum dot. Nature 450, 857–861 (2007).
28. Schietinger, S., Schröder, T. & Benson, O. One-by-one coupling of single defect centers in nanodiamonds to high-Q modes of an optical microresonator. Nano Lett. 8, 3911–3915 (2008).
29. Wang, C. F. et al. Fabrication and characterization of two-dimensional photonic crystal microcavities in nanocrystalline diamond. Appl. Phys. Lett. 91, 201112 (2007).
30. Fleischhauer, M., Imamoğlu, A. & Marangos, J. P. Electromagnetically induced transparency: optics in coherent media. Rev. Mod. Phys. 77, 633–673 (2005).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank F. Jelezko, J. Wrachtrup, V. Jacques, N. Manson, J. Taylor and J. MacArthur for discussions and experimental help. This work was supported by the Defense Advanced Research Projects Agency, NSF, Harvard-MIT CUA, the NDSEG Fellowship and the Packard Foundation. The content of the information does not necessarily reflect the position or the policy of the US Government, and no official endorsements should be inferred.

Author Contributions All authors contributed extensively to the work presented in this paper.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.D.L. ([email protected]).
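The lower bound on the fidelity quoted in the Methods Summary can be reproduced from the reported central values. The sketch below is only an arithmetic check under the assumption that the bound is the half-sum given there; it ignores the quoted uncertainties, and the variable and function names are hypothetical.

```python
import math

# Diagonal elements in the circular basis: each is one half of a reported
# conditional probability (central values only, uncertainties ignored).
rho_sp_m1 = 0.5 * 0.96  # photon sigma+, spin -1
rho_sp_p1 = 0.5 * 0.07  # photon sigma+, spin +1
rho_sm_m1 = 0.5 * 0.10  # photon sigma-, spin -1
rho_sm_p1 = 0.5 * 0.87  # photon sigma-, spin +1

# Rotated-basis differences, equal to half the fitted oscillation amplitudes.
diff_V = 0.5 * 0.53  # a_V / 2
diff_H = 0.5 * 0.58  # a_H / 2


def fidelity_lower_bound(r_sp_m1, r_sm_p1, r_sp_p1, r_sm_m1, d_V, d_H):
    """Half-sum bound from the Methods Summary, evaluated on central values."""
    return 0.5 * (r_sp_m1 + r_sm_p1
                  - 2.0 * math.sqrt(r_sp_p1 * r_sm_m1)
                  + d_V + d_H)


F_bound = fidelity_lower_bound(rho_sp_m1, rho_sm_p1, rho_sp_p1, rho_sm_m1,
                               diff_V, diff_H)
print(f"F >= {F_bound:.3f}")
```

The central values give a bound of about 0.69, in line with the quoted F ≥ 0.69 ± 0.068 and above the classical limit of 0.5.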

doi:10.1038/nature09256

METHODS

Experimental set-up. Our experiments are performed using a natural bulk diamond sample kept below 7 K. A Nikon 0.95 NA microscope objective is used in our confocal set-up to address individual NV centres. Resonant excitation of the readout and entanglement generation transitions is done using two external cavity diode lasers. To overcome the main experimental challenge of ensuring sufficient signal to noise of the detected ZPL emission, we eliminate background from laser light reflected off the diamond surface by creating an isolated excitation π-pulse using two cascaded waveguide modulators. This excitation pulse is sent through a quarter-wave plate that is fixed during all experimental runs to produce the circular polarization that most efficiently excites the NV to the |A2⟩ state. We note that, because our measurements are conditioned on the detection of an emitted photon, optical π-pulse imperfections only affect the efficiency of the entanglement generation and not the measured fidelity. In the collection path, the ZPL is sent to a polarization analysis set-up consisting of an HWP and a PBS for photon state selection. It then passes through a narrow frequency filter before being detected by a low dark count APD. We use a waveguide based electro-optical modulator before the APD to further reduce reflections of the excitation pulse and suppress detector afterpulsing. Special care is taken to minimize reflections during the measurement window to around the dark count level of the detector.

Addressing of the |0⟩ ↔ |±1⟩ microwave transitions is carried out using a 15-mm copper wire attached to the diamond. For simultaneous addressing, two microwave fields (generated by mixing the difference frequency of the two transitions with their average frequency) are separated using bandpass filters, individually attenuated, and recombined to balance their power. Low shot-to-shot noise in the microwave fields' relative phase is crucial. This is achieved by triggering all timing-sensitive channels from one output event of a controller device that produces the entanglement generation and conditioned readout sequences. Timing information of both ZPL and PSB photons is collected by combining them at the input of a time-tagged-single-photon-counting device.

Given an experiment repetition rate of ~100 kHz and an entanglement generation success probability of p ≈ 10^−6, we then detect on average one signal photon every few seconds. As the microwave π-pulses used for population transfer to the |0⟩ state are nearly perfect and about 100 repetitions are required for reliable spin state determination, roughly 24 h of data taking were required for each of the four photon polarizations measured. Overall, characterization, calibration, and data acquisition for a given NV centre were performed over a roughly two month period. The overall measurement time for each individual NV centre is limited by the long term mechanical stability of the set-up.
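The data-taking budget described above follows from simple rate arithmetic. The sketch below is an order-of-magnitude estimate only: the repetition rate and success probability are the approximate values quoted in the text, and the variable names are hypothetical.

```python
# Order-of-magnitude inputs quoted in the Methods (both are approximate).
repetition_rate_hz = 1e5      # ~100 kHz experiment repetition rate
success_probability = 1e-6    # p ~ 10^-6 per attempt

# Mean rate of heralded (signal) photons and the mean wait between them.
heralded_rate_hz = repetition_rate_hz * success_probability
mean_wait_s = 1.0 / heralded_rate_hz

# Heralded events accumulated in one 24 h run for a single polarization setting.
events_per_24h = heralded_rate_hz * 24 * 3600

print(f"heralded rate  ~ {heralded_rate_hz:.2f} per second")
print(f"mean wait      ~ {mean_wait_s:.0f} s between signal photons")
print(f"events in 24 h ~ {events_per_24h:.0f}")
```

With these rounded inputs the mean wait comes out at roughly ten seconds, the same order as the "every few seconds" stated in the text, and a 24 h run yields several thousand heralded events per polarization setting.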

©2010 Macmillan Publishers Limited. All rights reserved Vol 466 | 5 August 2010 | doi:10.1038/nature09278 LETTERS

Loss-free and active optical negative-index metamaterials

Shumin Xiao1, Vladimir P. Drachev1, Alexander V. Kildishev1, Xingjie Ni1, Uday K. Chettiar1†, Hsiao-Kuan Yuan1† & Vladimir M. Shalaev1

The recently emerged fields of metamaterials and transformation optics promise a family of exciting applications such as invisibility, optical imaging with deeply subwavelength resolution and nanophotonics with the potential for much faster information processing. The possibility of creating optical negative-index metamaterials (NIMs) using nanostructured metal–dielectric composites has triggered intense basic and applied research over the past several years1–10. However, the performance of all NIM applications is significantly limited by the inherent and strong energy dissipation in metals, especially in the near-infrared and visible wavelength ranges11,12. Generally the losses are orders of magnitude too large for the proposed applications, and the reduction of losses with optimized designs seems to be out of reach. One way of addressing this issue is to incorporate gain media into NIM designs13–16. However, whether NIMs with low loss can be achieved has been the subject of theoretical debate17,18. Here we experimentally demonstrate that the incorporation of gain material in the high-local-field areas of a metamaterial makes it possible to fabricate an extremely low-loss and active optical NIM. The original loss-limited negative refractive index and the figure of merit (FOM) of the device have been drastically improved with loss compensation in the visible wavelength range between 722 and 738 nm. In this range, the NIM becomes active such that the sum of the light intensities in transmission and reflection exceeds the intensity of the incident beam. At a wavelength of 737 nm, the negative refractive index improves from −0.66 to −1.017 and the FOM increases from 1 to 26. At 738 nm, the FOM is expected to become macroscopically large, of the order of 10⁶. This study demonstrates the possibility of fabricating an optical negative-index metamaterial that is not limited by the inherent loss in its metal constituent.

Optical NIMs are artificially tailored composites where a counter-intuitive negative refractive index arises from the nanoscale 'meta-atoms' designed into the material. These optical NIM building blocks typically require a plasmonic material such as silver or gold in addition to dielectric constituents. Losses inherent in these noble metals at optical frequencies plague the entire field of metamaterials and are one of the major restrictions preventing metamaterials from leaving the domain of academic research and entering industrial applications. Recently, the incorporation of an active material has been suggested as a viable and effective method of minimizing or eliminating loss in NIMs13–16. This method has been theoretically discussed in a variety of gain models, resulting in predictions of drastic performance improvements19–22. However, the high levels of gain required for this method were previously considered impossible to obtain in experiments23,24. The impairment of low gain levels can be overcome by using a thick active host layer, as was shown in the recent demonstration of the spaser21,25,26. However, in a NIM-based device the thickness of the active material must necessarily be kept small to preserve the negative refractive index. The incorporation of gain in an optical NIM design has been hampered by these difficulties, but such an achievement would lead to the production of low- or no-loss optical NIMs for use in a large number of breakthrough applications.

We have overcome this limitation by an approach in which the active medium within the NIM gives rise to an effective gain much higher than its bulk counterpart. The large value of gain is due to the local-field enhancement inherent in the plasmonic response of NIMs20,22, a phenomenon that provides a new direction for the compensation of losses in NIMs. In our experiments, the transmission through the optical NIM sample is amplified by pumping the active medium within it, and the structure is carefully designed such that the active medium experiences the highest local field while preserving the negative-index property of the metamaterial. Our experimental results, along with our numerical simulations, directly demonstrate that our NIM sample is lossless and active.

The NIM structure in our experiment is the fishnet structure, which was also used in some of the earliest demonstrations of optical NIMs7,27. Epoxy doped with rhodamine 800 (Rh800) dye is used as the gain medium. The fabrication process for creating the gain-assisted fishnet sample is schematically shown in Fig. 1 (see Methods for details). In this proposed structure, the typical alumina spacer of the initial fishnet structure is replaced by the gain medium. Successful fabrication is extremely challenging because the nanostructure can be easily destroyed during this critical replacement process. The scanning electron microscope (SEM) images of the structure at different fabrication steps (Fig. 2; see Methods for details) indicate that no damage to our fishnet structure occurred during this process.

The fabricated sample was first optically characterized by means of far-field transmission and reflection measurements, using normally incident light at the primary polarization with the electric field vector of the light along the x axis in Fig. 2a (Methods). The blue, red and green solid lines in Fig. 3a represent the transmission, reflection and absorptance spectra of the sample, respectively. A clear resonance around 725 nm can be observed, which matches very well with the fluorescence peak of Rh800. Here the low transmission is solely due to high absorption because the impedance of the sample is nearly matched and, hence, the reflection is low around the resonance. By applying a Lorentz-oscillator absorption model for the material properties of the Rh800–epoxy combination, the experimental spectra were matched well with numerical simulations (dotted lines in Fig. 3a), and the effective permittivity and permeability of the sample were determined following the bianisotropic parameter retrieval

1Birck Nanotechnology Center and School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907, USA. †Present addresses: Department of Electrical and Systems Engineering, University of Pennsylvania, 200 South 33rd St, Philadelphia, Pennsylvania 19104, USA (U.K.C.); Intel, 2501 NW 229th Avenue, RA2-283, Hillsboro, Oregon 97124, USA (H.-K.Y.).


method described in ref. 28. A negative refractive index was obtained from 720 to 760 nm, with the strongest negative index, of n′ = −0.86, occurring at 740 nm. The maximum FOM (FOM = −n′/|n″|, for n′ < 0, where n′ and n″ are respectively the real and imaginary parts of the refractive index) was 1 at 737 nm.

Figure 1 | Schematic of the fabrication process. a, Unit cell of the fishnet structure with alumina as the spacer material between two silver layers. b, One-quarter of the fishnet structure with an alumina spacer. c, After etching the alumina, the fishnet structure has air or solvent as the spacer with alumina pillars as support. d, After coating with Rh800–epoxy, the fishnet structure has the dye–epoxy material in the spacer region and above the fishnet structure. Legend: substrate, silver, dye, alumina, air.

Figure 2 | SEM images of the fishnet structure at different fabrication stages. a, Fishnet structure with alumina spacer. b, After etching the alumina and coating with Rh800–epoxy, the fishnet structure has the dye–polymer as the spacer and on the top. c, Tilt-view SEM image of the structure after coating with Rh800–epoxy and after a part of the top layer of silver has been removed by focused ion-beam milling. The scale is the same in all of the SEM images (scale bar, 500 nm). d, The pump–probe experimental set-up. CCD, charge-coupled device; OPA, optical parametric amplifier.

Figure 3 | Experimental results and simulation. a, Experimental far-field transmission (Te), reflection (Re) and absorptance (Ae) spectra of the sample, along with simulated results (Ts, Rs, and As) at the primary linear polarization shown in Fig. 2a. b, The transmission spectra without pumping (line 1), with the optimized delay between pump and probe (probe pulse is 54 ps later than the pump) and 1-mW pumping power (line 5), with the optimized delay and 0.12-mW pumping power (line 3), with the optimized delay and 0.16-mW pumping power (line 4), and with the pump preceding the probe by 6 ps and 1-mW pumping power (line 2). The wavelength-dependent relative transmission change from the pump–probe experiment is shown by the red solid line.

The compensation of loss in the structure was investigated with a pump–probe experiment. The experimental set-up is shown in Fig. 2d and explained in detail in Methods. The polarizations of the pump and probe pulses were both along the primary polarization. The results of the pump–probe measurements are shown in Fig. 3b. We first measured the transmission spectrum without pumping (blue line 1 in Fig. 3b). The unpumped transmission value is nearly the same as that given by the blue solid line in Fig. 3a in the same wavelength region, and confirms the validity of the measurement. We then measured the transmission of the sample with the pump laser turned on and the average power fixed at 1 mW, which is below the damage threshold for the sample but five times larger than the gain saturation power for Rh800–epoxy. We optimized the delay time between the pump and probe pulses to ensure that the dye offered the maximum gain. The transmission spectrum increased significantly when the gain medium was pumped (blue line 5 in Fig. 3b). The relative change between the transmission values with and without pumping (ΔT/Tnopump, where ΔT = Tpump − Tnopump) is plotted as the red line in Fig. 3b. The transmission was enhanced by about 100% within a wide wavelength range from 712 to 736 nm. These measurements far exceed the experimental error, which was less than 10%. To exclude the possibility of damage-induced transparency, the measurements with and without pumping were repeated ten times. We note that, under the conditions of our experiment (average power, 1 mW; repetition rate, 1 kHz), photobleaching was observed after 5 min of illumination. We also measured the transmission spectrum of the sample with a different (non-optimal) delay by tuning the probe beam pulse to be 6 ps earlier than the pump pulse (blue line 2 in Fig. 3b), and compared this result with the unpumped spectrum. This detuned spectrum matches the spectrum without pumping very well, indicating that influences from the set-up and local heating can be excluded from our results.

Another control experiment on a similar sample with pure epoxy as the spacer shows no changes in the transmission in any measurements. Therefore, we conclude that the changes in transmission observed from the Rh800–epoxy sample are caused by the compensation of loss in the metamaterial. We also performed the pump–probe experiment with an optimized delay time between the pump and probe pulses at lower pump intensities. The results are plotted as blue lines 3 and 4 in Fig. 3b. Under the much lower pumping intensity, the pumping does not result in the required population inversion in the dye molecules and thus does not offer the needed gain; instead, it only reduces the absorption of the dye, such that the transmission change is much smaller in this case.

The loss compensation mechanism in the sample is relatively straightforward to understand. When Rh800 is excited by a pump pulse with sufficiently high power, a population inversion is formed inside the dye molecules. This provides amplification for a properly delayed probe pulse whose wavelength is coincident with the stimulated emission wavelength of the dye molecules. As a result of this amplification, the transmission of the probe light through the sample increases. Once the pump is turned off or the probe pulse is detuned away from the optimal delay value, the probe light no longer experiences amplification during its propagation through the device, and the transmission drops back down to its initial (unpumped) value.

We performed numerical simulations to understand the effects of gain on the corresponding transmission spectra and to determine the refractive index and FOM of the gain-assisted sample (Methods). In Supplementary Fig. 2, we show how the relative transmission through the sample depends on the effective gain value, gsim. When gsim increases, the transmission through the sample also increases. For gsim = 2,800 cm⁻¹ around 725 nm (the centre wavelength of the fluorescence of Rh800), the simulated relative change between transmission with and without gain matches our experimental result well (Fig. 3b). The value of gsim in our simulations is about seven times larger than that measured for a homogeneous slab of dye-doped epoxy, which could be due to a chemical enhancement mechanism known to have an important role, for example, in surface-enhanced Raman scattering29. Another possible reason for such an increase in gain could be related to a feedback mechanism analogous to that occurring in a spaser25. We also note that similar gain values for dye molecules were reported by other groups in experiments on the compensation of losses in surface plasmons23.

With a change in the refractive index accompanying the loss compensation, the impedances at the interface between the sample and air become mismatched and, thus, the spectral reflection also increases with gain, reaching 107.6% at 730 nm. The sum of the field intensities in transmission and reflection is nearly 1.23 times larger than the intensity of the incident beam at this wavelength. This confirms that the incident light is indeed amplified in the sample such that the absorptance, A, is negative and the sample is active. According to Fig. 4a, the NIM remains active within a spectral range between 722 and 738 nm, and the refractive index is negative in a broader range, between 720 and 760 nm.

The effective refractive index and FOM were determined from the simulated spectra and are shown in Fig. 4 (for details of our bianisotropic parameter retrieval, see Methods). The real part of the refractive index, n′, becomes more negative after gain is applied, and the imaginary part, n″, also drops significantly near the resonance. At 737 nm, n′ changes with the addition of gain from −0.66 to −1.017, whereas n″ decreases from 0.66 to 0.039 (Fig. 4b). This corresponds to an increase in FOM from 1 to 26, as seen in Fig. 4c. This value is so far the largest reported FOM achieved in any NIM in the optical region; it is much larger even than the values reported for two-dimensional metamaterial waveguides30. An even larger FOM is expected to be achieved at 738 nm, where it should be of the order of 10⁶ (n′ = −1.26, n″ = 1 × 10⁻⁶), and the structure is practically loss free (A ≤ 0), even at macroscopically large sizes. Although experimentally it is hard to tune the gain and other parameters exactly to the optimized performance point, it is still easy to achieve FOM values close to 10² within about a 3-nm wavelength range near the resonance. We note that with a further increase in gain, n″ for our sample does not become negative but instead starts to increase. This occurs because the impedance mismatch grows at this level of gain, leading to increased front-side reflection and asymmetry in the front- and back-side reflections. The NIM remains active with A < 0 in a relatively broad range between 722 and 738 nm. We believe that by further improving the structure's design, it is possible to obtain a NIM with a macroscopically large FOM in a much broader spectral range. We also note that the use of the standard definition of FOM (−n′/|n″|, for n′ < 0) is normally reserved for non-bianisotropic metamaterials, because in bianisotropic ones the losses depend on the bianisotropy parameters; hence, the ratio −n′/|n″| becomes less useful in quantifying the loss in bianisotropic NIMs. In that sense, the most important finding here is related to the presence of a region where A < 0 and n′ < 0.

The loss compensation also produces drastic changes in the effective dielectric permittivity and magnetic permeability of the sample. Figure 4d shows the effective real parts of the permittivity, ε′, and the permeability, μ′, for the sample with and without gain. Owing to a stronger electric resonance along with a stronger magnetic anti-resonance achieved with gain, ε′ and μ′ both become narrower, in agreement with theoretical considerations22.

The effectiveness of the loss compensation in our sample arises from the local-field enhancement of the structure when a gain medium is used as the spacer layer20,22. Because the effective extinction coefficients at 737 nm are α ≈ 6.75 × 10³ cm⁻¹ and α ≈ 1.13 × 10⁵ cm⁻¹ for the device with and without gain, respectively, the effective amplification is α ≈ −1.07 × 10⁵ cm⁻¹, which is 46 times larger at this wavelength than the 'seed' value (without the local-field factor) that was used in simulations. According to our numerical modelling, the high local fields in the fishnet structure result in a total (spatially integrated) energy produced by the gain medium that is about 45 times larger than that produced by a homogeneous gain material of the same volume, in good agreement with the factor of 46 found above (Methods).

Figure 4 | Simulation and determined parameters. a, The simulated refractive index, n′ (real part), and absorptance, A (in the forward direction), as functions of wavelength with (solid) and without (dashed) gain. b, The effective refractive index, n = n′ + in″, determined with (solid) and without (dashed) gain. c, The effective FOM determined with (solid) and without (dashed) gain (the FOM is set to zero when the real part of the refractive index is positive). d, The effective permittivity, ε′ (real part), and permeability, μ′ (real part), determined with (solid) and without (dashed) gain.

METHODS SUMMARY

Fabrication of the fishnet structure with a gain medium as the spacer is accomplished by a post-processing method. First we fabricate the fishnet sample with Al2O3 as a spacer using standard electron-beam lithography with a Leica VB6 writer and lift-off processes. Then chemical etching is used to remove the Al2O3 spacer and, finally, epoxy with dye molecules is used to fill the vacated space.

Far-field measurements for transmission and reflection of the fishnet with Rh800–epoxy as the spacer are performed and used to determine the effective parameters without pumping. We measure the transmission for the gain-assisted fishnet by using a pump–probe set-up at three different pump levels; the gain line shape of the Rh800–epoxy is obtained with a pump energy above the saturation energy.

In the numerical studies of the gain-assisted fishnet metamaterial, the intrinsic losses of the metal are included. The loss is increased by a factor of about three relative to that of bulk silver to account for the additional loss resulting from electron scattering in the fabricated nanostructures. Without the pump, the Rh800–epoxy material is described by the dielectric function εp(λ) = 1.65² + iε″(λ), where ε″(λ) is the loss line shape obtained from experiments with no pumping.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 14 December 2009; accepted 14 June 2010.

1. Veselago, V. G. The electrodynamics of substances with simultaneously negative value of ε and μ. Sov. Phys. Usp. 10, 509–514 (1968).
2. Pendry, J. B. Negative refraction makes a perfect lens. Phys. Rev. Lett. 85, 3966–3969 (2000).
3. Shalaev, V. M. et al. Negative index of refraction in optical metamaterials. Opt. Lett. 30, 3356–3358 (2005).
4. Pendry, J. B., Schurig, D. & Smith, D. R. Controlling electromagnetic fields. Science 312, 1780–1782 (2006).
5. Soukoulis, C. M., Linden, S. & Wegener, M. Negative refractive index at optical wavelengths. Science 315, 47–49 (2007).
6. Valentine, J. et al. Three-dimensional optical metamaterial with a negative refractive index. Nature 455, 376–379 (2008).
7. Zhang, S. et al. Experimental demonstration of near-infrared negative-index metamaterials. Phys. Rev. Lett. 95, 137404 (2005).
8. Shalaev, V. M. Optical negative-index metamaterials. Nature Photon. 1, 41–48 (2006).
9. Tsakmakidis, K. L., Boardman, A. D. & Hess, O. 'Trapped rainbow' storage of light in metamaterials. Nature 450, 397–401 (2007).
10. Kildishev, A. V. & Shalaev, V. M. Engineering space for light via transformation optics. Opt. Lett. 33, 43–45 (2008).
11. Pinchuk, A., Kreibig, U. & Hilger, A. Optical properties of metallic nanoparticles: influence of interface effects and interband transitions. Surf. Sci. 557, 269–280 (2004).
12. Drachev, V. P. et al. The Ag dielectric function in plasmonic metamaterials. Opt. Exp. 16, 1186–1195 (2008).
13. Noginov, M. A. et al. Enhancement of surface plasmons in an Ag aggregate by optical gain in a dielectric medium. Opt. Lett. 31, 3022–3024 (2006).
14. Ramakrishna, S. A. & Pendry, J. B. Removal of absorption and increase in resolution in a near-field lens via optical gain. Phys. Rev. B 67, 201101 (2003).
15. Klar, T. A., Kildishev, A. V., Drachev, V. P. & Shalaev, V. M. Negative-index metamaterial: going optical. IEEE J. Sel. Top. Quantum Electron. 12, 1106–1115 (2006).
16. Sarychev, A. K. & Tartakovsky, G. Magnetic plasmonic metamaterials in actively pumped host medium and plasmonic nanolaser. Phys. Rev. B 75, 085436 (2007).
17. Stockman, M. I. Criterion for negative refraction with low optical losses from a fundamental principle of causality. Phys. Rev. Lett. 98, 177404 (2007).
18. Kinsler, P. & McCall, M. W. Causality-based criteria for a negative refractive index must be used with care. Phys. Rev. Lett. 101, 167401 (2008).
19. Wegener, M. et al. Toy model for plasmonic metamaterial resonances coupled to two-level system gain. Opt. Exp. 16, 19785–19788 (2008).
20. Fang, A., Koschny, Th., Wegener, M. & Soukoulis, C. M. Self-consistent calculation of metamaterial with gain. Phys. Rev. B 79, 241104 (2009).
21. Zheludev, N. I., Prosvirnin, S. L., Papasimakis, N. & Fedotov, V. A. Lasing spaser. Nature Photon. 2, 351–354 (2008).
22. Sivan, Y., Xiao, S., Chettiar, U. K., Kildishev, A. V. & Shalaev, V. M. Frequency-domain simulations of a negative-index material with embedded gain. Opt. Exp. 26, 24060–24074 (2009).
23. Noginov, M. A. et al. Compensation of loss in propagating surface plasmon polariton by gain in adjacent dielectric media. Opt. Exp. 16, 1385–1392 (2008).
24. Klimov, V. I. et al. Optical gain and stimulated emission in nanocrystal quantum dots. Science 290, 314–317 (2000).
25. Stockman, M. I. The spaser as a nanoscale quantum generator and ultrafast amplifier. J. Opt. 12, 024004 (2010).
26. Plum, E., Fedotov, V. A., Kuo, P., Tsai, D. P. & Zheludev, N. I. Towards the lasing spaser: controlling metamaterial optical response with semiconductor quantum dots. Opt. Exp. 17, 8548–8551 (2009).
27. Xiao, S. et al. Yellow-light negative-index metamaterials. Opt. Lett. 34, 3478–3480 (2009).
28. Kriegler, C. E., Rill, M. S., Linden, S. & Wegener, M. Bianisotropic photonic metamaterials. IEEE J. Sel. Top. Quantum Electron. 16, 367–375 (2010).
29. Fromm, D. P. et al. Exploring the chemical enhancement for surface-enhanced Raman scattering with Au bowtie nanoantennas. J. Chem. Phys. 124, 061101 (2006).
30. Lezec, H. J., Dionne, J. A. & Atwater, H. A. Negative refraction at visible frequencies. Science 316, 430–432 (2007).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This work was supported in part by ARO-MURI awards 50342-PH-MUR and W911NF-09-1-0539 and by NSF PREM grant no. DMR 0611430. The authors acknowledge valuable discussions with T. Klar. V.M.S. is grateful to Y. Sivan and Z. Jacob for their comments.

Author Contributions S.X. fabricated the samples and conducted optical characterization and part of the numerical simulations; S.X. and V.P.D. assembled the set-up; V.P.D. guided the optical experiments and partly the numerical simulations and fabrication; A.V.K. guided the numerical simulations and developed a sample-specific analytical technique for retrieving the bianisotropic parameters; A.V.K. and X.N. performed numerical simulations; U.K.C. performed part of the numerical simulations and implemented parallelism in the design and retrieval optimization; H.-K.Y. suggested and developed the original fabrication approach; S.X., V.P.D., A.V.K., X.N. and V.M.S. wrote the manuscript; V.M.S. led the project and discussed the fabrication, optical characterization and numerical modelling.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to V.M.S. ([email protected]).


METHODS

Fabrication methods. To obtain the maximum usefulness of our gain medium, the alumina spacer of our fishnet structure was replaced with the selected dye–epoxy material. This places the dye–epoxy in the region of highest local fields and thus maximizes the overall gain. The main restriction in fabrication is the possibility of the fishnet structure collapsing and being damaged during the etching process to remove the alumina spacer. Once such damage occurs, the negative index of the metamaterial disappears and is unrecoverable. In our experiment, we first fabricated an 80 × 80 µm² fishnet sample on a 15-nm-thick indium tin oxide (ITO)-coated glass substrate. The vertical structure consists of a 50-nm alumina layer sandwiched by two 50-nm perforated silver layers, which are protected by 10-nm alumina layers both at the top and bottom surfaces. The in-plane fishnet structure is shown in Fig. 2a, where the geometry is defined by 280-nm periodicity in both lateral directions. The widths of the fishnet nanostrips are 163 nm in the x direction and 207 nm in the y direction.

Next the alumina is etched in tetramethylammonium hydroxide solution. The etching time is precisely controlled to create thin alumina pillars between the perforated metal layers. The pillars are significantly smaller than the fishnet nanostrips. Therefore, collapse of, and damage to, the fishnet structure can be avoided and enough space is left for the gain material. Then an 800-nm epoxy film doped with Rh800 at a concentration of 2 × 10⁻² M is spin-coated onto the sample. Rh800 is selected here because of its relatively high quantum efficiency and high solubility in the organic host. The concentration of Rh800 molecules in the epoxy film is 1.2 × 10¹⁹ cm⁻³, and the measured spectral absorption and emission peaks of Rh800 in epoxy are at 690 nm and 724 nm, respectively. During the spin-coating process, the Rh800–epoxy solution penetrates into the fishnet holes and fills the voids left by the etched spacer layer. Finally, reactive ion etching is used to etch the thickness of the Rh800–epoxy layer down to 220 nm, leaving only about 60 nm on top of the fishnet structure. Figure 2b shows the SEM image of the polymer-coated fishnet structure. Although the structure is almost indiscernible owing to the Rh800–epoxy layer covering the fishnet surface, the profile of the fishnet structure can still be seen. To confirm that our fishnet structures are damage free after the fabrication process, we removed the epoxy layer and the top layer of silver by focused ion-beam milling. Figure 2c in the paper shows the tilt-view SEM image of the top layer of the structure after such milling. The clean and undamaged silver fishnet can be observed, showing no cracks, collapse or other defects.

Far-field measurement. The set-up used for far-field measurement is shown and described in detail in our previous paper31. As our measurements showed, the secondary polarization (with the electric field vector of the incident light along the y axis in Fig. 2a) shows a weak resonance around 780 nm. This polarization was not used in our pump–probe experiments because, as follows from the numerical simulation and retrieved results, it does not exhibit a negative refractive index at the resonance, and the resonance is far away from the strong-emission region of the gain. Thus, all the pump–probe results are shown for the primary polarization.

Pump–probe measurement. A 690-nm incident beam from an optical parametric amplifier pumped by an 800-nm, picosecond Ti:sapphire laser is focused onto the sample and acts as the pump beam. The spot size is 200 µm. A supercontinuum, white-light source generated by pumping water with an 800-nm pulse from the laser is normally focused to a 70-µm spot on the sample as a probe beam. The probe beam is sent at a very small angle with respect to the pump beam, providing a good overlap of the two beams on the sample surface. The pulse duration and repetition rate of both beams are 2 ps and 1 kHz, respectively. The time delay between the two beams is adjusted by an optical delay line, which gives us a time resolution of 6 ps. The optimum time delay between the probe and pump pulses is about 54 ps; this delay provides the maximum amplification of the probe beam. The transmission of the probe light is then collected from a 30–40-µm spot at the centre of the fishnet structure and analysed with a spectrometer with an acquisition time of 15 s. We focused our study on the spectral region near the luminescence peak wavelength of the Rh800 dye by exploiting the full spectral range of the grating/CCD detection. The angular position of the grating remained unchanged.

Numerical simulations. The numerical simulations of the sample were performed with the COMSOL MULTIPHYSICS software package. In simulations, we used the dispersive dielectric function for ITO,

$$\varepsilon_{\mathrm{ITO}} = 4 - \left(\frac{\lambda}{597.6}\right)^{2}\left(1 + \frac{i\lambda}{16067.6}\right)^{-1}$$

For the active gain medium, the gain is taken into account in the imaginary part of the refractive index of the Rh800–epoxy material; hence, the refractive index of the Rh800–epoxy material is defined as $n_a(\lambda) = 1.65 - i(100/4\pi)\lambda g_{\mathrm{sim}}(\lambda)$, where $g_{\mathrm{sim}}(\lambda)$ is the gain line shape obtained from the experiment with a constant pump (Supplementary Fig. 1). In contrast to the active gain medium, the passive regime is implemented by using the dielectric function $\varepsilon_p(\lambda) = 1.65^2 + i\varepsilon''(\lambda)$, where $\varepsilon''(\lambda)$ is the loss line shape obtained from the experiment with no pump.

In retrieving the effective parameters of a sample, the trapezoidal cross-section and the substrate effect in fishnet-type NIMs can induce non-zero effective bianisotropy parameters32. To retrieve the effective parameters accurately, a bianisotropic retrieval method is used. Similar to the approach shown in ref. 28, we use a technique built on the transfer matrix

$$t = \begin{pmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{pmatrix} = t_3^{-1} z_4^{-1} \begin{pmatrix} t_+ & r_- + 1 \\ t_+ & r_- - 1 \end{pmatrix} \begin{pmatrix} 1 + r_+ & t_- \\ 1 - r_+ & -t_- \end{pmatrix}^{-1} \qquad (1)$$

where $r_+$ is the complex reflection coefficient computed at the air–epoxy interface and $t_+$ is the complex transmission coefficient obtained at the ITO–glass interface upon structure-side illumination. The component $r_-$ is the complex reflection coefficient computed at the ITO–glass interface and $t_-$ is the complex transmission coefficient obtained at the air–epoxy interface on substrate-side illumination. Also, $z_4 = \mathrm{diag}(1, n_{\mathrm{SUB}}^{-1})$, $n_{\mathrm{SUB}} = 1.52$, $t_3 = (1/2)\, z_3^{-1} u\, s_3(d_{\mathrm{ITO}})\, u\, z_3$, $d_{\mathrm{ITO}} = 15$ nm, $z_3 = \mathrm{diag}(1, n_{\mathrm{ITO}}^{-1})$, $n_{\mathrm{ITO}}$ is the wavelength-dependent refractive index of the ITO layer, $s_3(x) = \mathrm{diag}(e^{ik_0 n_{\mathrm{ITO}} x}, e^{-ik_0 n_{\mathrm{ITO}} x})$, $k_0$ is the vacuum wavevector and

$$u = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

In equation (1) we have used the identity $t_- = t_+ n_{\mathrm{SUB}}$, because the structure is reciprocal.

Then, using the factorization of $t$ and the substitutions $\sigma = t_{11} + t_{22}$, $\delta = t_{11} - t_{22}$, $\Delta = \pm\sqrt{\delta^2 + \tau^2}$, $\tau^2 = 4t_{12}t_{21}$ and $\sigma_\pm = (\sigma \pm \Delta)/2$, we finally obtain the effective parameters of the bianisotropic slab as

$$n = (k_0 x_0)^{-1} \cos^{-1}(\sigma/2)$$

$$z_\pm = (\pm\delta - \Delta)/2t_{21}$$

where $x_0$ is the thickness of the slab.

Once the values of $n$, $z_+$ and $z_-$ are known, the effective parameters $\varepsilon$, $\mu$ and $\xi$ are retrieved using the formulae

$$\varepsilon = 2n/(z_+ + z_-)$$

$$\xi = i\varepsilon(z_- - z_+)/2 = in\delta/\Delta$$

$$\mu = (n^2 - \xi^2)/\varepsilon$$

which are consistent with ref. 28.

An alternative method of validating our retrieval procedure, which is built on the classical Kramers–Kronig relation33,34, has been applied to the available (truncated) spectral range. The Kramers–Kronig relation for the refractive index links the frequency-dependent $n'$ and $n''$ expressions through the principal value of the integral

$$n'(\nu) = 1 - \frac{2}{\pi}\,\mathrm{P}\!\int_{\omega_1}^{\omega_2} \omega\, n''(\omega)\,(\omega^2 - \nu^2)^{-1}\, d\omega$$

Here $\omega_1$ and $\omega_2$ are the integral's low- and high-frequency limits. We have directly used a numerical integration scheme using the subgrid midpoints $\nu_l = (\omega_l + \omega_{l+1})/2$ obtained from the initially non-uniform spectral grid $\omega$. Thus, the values of $n'_l = n'(\nu_l)$ are calculated using

$$n'_l = 1 - \frac{1}{\pi}\sum_{p=1}^{p_{\max}-1} \Delta\omega_p \left[\frac{\omega_{p+1}\, n''_{p+1}}{\omega_{p+1}^2 - \nu_l^2} + \frac{\omega_p\, n''_p}{\omega_p^2 - \nu_l^2}\right]$$

with $\Delta\omega_p = \omega_{p+1} - \omega_p$ and $n''_p = n''(\omega_p)$. Supplementary Fig. 3 compares the values for $n''$ obtained with the truncated Kramers–Kronig numerical convolution (red line with squares) with the $n''$ values retrieved with our general scheme based on refs 28, 35 (black line). Although there is some expected mismatch at the edges, which is due to the truncation of the spectral range, the figure indicates sufficient qualitative and quantitative consistency in the position, magnitude and width of the major resonant features of $n''$, including the important spectral band of negative refraction. Therefore, this result ensures the correct choice of the branch in the general retrieval scheme.

For the calculation of the local-field enhancement, two different numerical simulations were performed. First we calculated the process of stimulated emission in a bulk layer of dye without metal and the electric field, $E_b$, in a unit cell with the same volume, $V_S$, as the volume of the unit cell in our fishnet structure (air and the substrate are excluded). The total power radiated from the bulk layer was calculated as $Q_b = (1/2)\varepsilon_0\omega\int_{V_S}\varepsilon''|E_b|^2\,dV_S$. In the second calculation, the electric field distribution, $E_f$, was obtained in the fishnet structure using the same conditions as in our pump–probe experiment. The total generated/absorbed power in the fishnet layer is then calculated as $Q_f = (1/2)\varepsilon_0\omega\int_{V_S}\varepsilon''|E_f|^2\,dV_S$. Finally, the enhancement factor was calculated as $r = Q_f/Q_b$. It turned out that

©2010 Macmillan Publishers Limited. All rights reserved doi:10.1038/nature09278

the enhancement factor was r = 45; that is, the total power produced by the gain medium (and absorbed by the nanostructured metal) in the fishnet unit cell was about 45 times larger than that produced by a homogeneous gain material of the same volume.

31. Cai, W. et al. Metamagnetics with rainbow colors. Opt. Exp. 15, 3333–3341 (2007).
32. Ku, Z., Dani, K. M., Upadhya, P. C. & Brueck, S. R. Bianisotropic negative-index metamaterial embedded in a symmetric medium. J. Opt. Soc. Am. B 26, B34–B38 (2009).
33. Jackson, J. D. Classical Electrodynamics Ch. 7.10 (Wiley, 1975).
34. Cook, J. J. H., Tsakmakidis, K. L. & Hess, O. Ultralow-loss optical diamagnetism in silver nanoforests. J. Opt. A 11, 114026 (2009).
35. Kildishev, A. V. et al. Negative refractive index in optics of metal-dielectric composites. J. Opt. Soc. Am. B 23, 423–433 (2006).

Vol 466 | 5 August 2010 | doi:10.1038/nature09212 LETTERS

Real-time observation of valence electron motion

Eleftherios Goulielmakis1*, Zhi-Heng Loh2,3*, Adrian Wirth1, Robin Santra4,5, Nina Rohringer6, Vladislav S. Yakovlev1,7, Sergey Zherebtsov1, Thomas Pfeifer2,3†, Abdallah M. Azzeer8, Matthias F. Kling1, Stephen R. Leone2,3 & Ferenc Krausz1,7

The superposition of quantum states drives motion on the atomic and subatomic scales, with the energy spacing of the states dictating the speed of the motion. In the case of electrons residing in the outer (valence) shells of atoms and molecules which are separated by electronvolt energies, this means that valence electron motion occurs on a subfemtosecond to few-femtosecond timescale (1 fs = 10⁻¹⁵ s). In the absence of complete measurements, the motion can be characterized in terms of a complex quantity, the density matrix. Here we report an attosecond pump–probe measurement of the density matrix of valence electrons in atomic krypton ions1. We generate the ions with a controlled few-cycle laser field2 and then probe them through the spectrally resolved absorption of an attosecond extreme-ultraviolet pulse3, which allows us to observe in real time the subfemtosecond motion of valence electrons over a multifemtosecond time span. We are able to completely characterize the quantum mechanical electron motion and determine its degree of coherence in the specimen of the ensemble. Although the present study uses a simple, prototypical open system, attosecond transient absorption spectroscopy should be applicable to molecules and solid-state materials to reveal the elementary electron motions that control physical, chemical and biological properties and processes.

The millielectronvolt-scale spacing of vibrational energy levels implies that changes in molecular structure occur on a multifemtosecond timescale and can be accessed by femtosecond pump–probe spectroscopy4. Electronic phenomena in the valence band are one hundred to one thousand times faster and have remained elusive so far. Electronic coherence is the key to accessing them in real time. It has been studied in kinematically complete experiments5, from which the motion can be inferred but not directly observed. Apart from in the simplest systems, experimental techniques are unable to probe all degrees of freedom. In these cases, we have to deal with an open system, which can only be characterized in terms of ensemble-averaged quantities (observables) predicted by the system’s density matrix. Under these circumstances, the synchrony of wave-packet dynamics in the specimens of the ensemble (that is, coherence) is indispensable and only time-resolved measurements can provide direct access to the observables of the motion. Combination of the powerful concepts of correlated measurement and high-harmonic spectroscopy6–9 has recently uncovered signatures of electronic coherence and resultant dynamics in an ensemble of ionizing molecules within a temporal window of ~1 fs following ionization10. The degree and the persistence of coherence have not been measured and the method is limited to the scrutiny of systems with large (≥10 eV) ionization potentials and of processes under strong-field influence. Isolated attosecond extreme-ultraviolet (EUV) pulses3,11–13 lend themselves as a probe to overcome these limitations.

In this work, we introduce attosecond probe spectroscopy to study, as one of the simplest open quantum systems, krypton atoms ionized by a strong field. Strong-field ionization has been studied extensively14, without answering the question of whether it is able to create long-lived coherences. Insufficient temporal confinement of ionization, recollision and/or electron correlations under strong-field influence may affect electronic coherence in the emerging ions. Our study reveals that strong-field ionization by a waveform-controlled2, near-single-cycle13 laser pulse is capable of launching a broadband (~0.7-eV splitting) valence electron wave packet with a high degree of coherence (g ≈ 0.6) that persists without notable decay for much longer than 10 fs.

Attosecond probe spectroscopy can rely on the momentum or energy distribution of the liberated photoelectrons15 or transmitted photons as observables. Here we choose the latter option, transient absorption spectroscopy16–19, which can be extended to the scrutiny of condensed-matter phenomena whereas photoelectron spectroscopy is restricted to surfaces and gas-phase samples. Figure 1a illustrates our attosecond probing concept applied to the multiple ionization of krypton. Conventional spectroscopy20 shows that the ions are created in manifolds of states depicted by green boxes and denoted by $nl^{-i}$, indicating that, relative to the ground-state configuration of the atom, i electrons are missing from the nl subshell. An attosecond pulse carried at a photon energy of ~80 eV promotes the krypton ions created in their $4p^{-i}$ manifolds to the $3d^{-1}4p^{-(i-1)}$ core-hole states. The transitions lead to dips in the spectrum of the broadband EUV radiation transmitted through the ionized atomic ensemble (Fig. 1b), revealing populations and coherences on an attosecond–femtosecond timescale.

In our experiments, we ionized krypton atoms with sub-4-fs, waveform-controlled NIR laser pulses carried at a wavelength of ~750 nm (ref. 13; wave period, $T_L$ ≈ 2.5 fs) and probed the emergence of ions and electronic dynamics within their $4p^{-i}$ valence subshells with isolated sub-150-as EUV pulses carried at ~80 eV (Fig. 1c; for details, see Methods Summary). The target of the NIR pump/EUV probe exposure is a quasistatic cell filled with krypton atoms at densities on the order of ~10¹⁸ cm⁻³. The EUV beam size at the focus is a small fraction of that of the NIR beam and, hence, ionization and accompanying electron dynamics are probed near the optical axis, where the radial dependence of the laser intensity can be neglected.

EUV absorption spectra are recorded as functions of the pump–probe delay (negative delay means that the EUV probe precedes the NIR excitation) by a spectrometer downstream of the target. No

1Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Strasse 1, D-85748 Garching, Germany. 2Departments of Chemistry and Physics, University of California, Berkeley, California 94720, USA. 3Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. 4Argonne National Laboratory, Argonne, Illinois 60439, USA. 5Department of Physics, University of Chicago, Chicago, Illinois 60637, USA. 6Lawrence Livermore National Laboratory, Livermore, California 94551, USA. 7Department für Physik, Ludwig-Maximilians-Universität München, Am Coulombwall 1, D-85748 Garching, Germany. 8Physics and Astronomy Department, King Saud University, Riyadh, 11451, Kingdom of Saudi Arabia. †Present address: Max-Planck-Institut für Kernphysik, Saupfercheckweg 1, D-69117 Heidelberg, Germany. *These authors contributed equally to this work.

[Figure 1 graphic. a, Energy-level diagram (energy in eV) of Kr, Kr⁺, Kr²⁺ and Kr³⁺ with the few-cycle NIR pump and the attosecond EUV probe. b, Intensity (arb.u.) versus photon energy (eV), before and after ionization. c, Experimental set-up: few-cycle laser pulse (λ ≈ 750 nm, τL < 4 fs), neon gas target, zirconium filter on pellicle, two-component EUV mirror assembly, krypton atoms (10¹⁸ cm⁻³), zirconium filter, molybdenum–silicon imaging mirror, spectrometer.]

[Figure 2 graphic. Absorbance versus photon energy (eV) and delay (fs).]

Figure 1 | Probing intra-atomic electron motion by attosecond absorption spectroscopy. a, The strong electric field of a near-infrared (NIR) laser pulse (in red) with a duration of τL < 4 fs liberates electrons from the 4p valence subshell of krypton atoms to generate singly charged $4p^{-1}$, doubly charged $4p^{-2}$ or triply charged $4p^{-3}$ ions in the $4p^{-1}$, $4p^{-2}$ and $4p^{-3}$ manifolds of quantum states, respectively, by means of optical field ionization (indicated by red arrows). A subfemtosecond EUV pulse (in violet) with a carrier photon energy of ~80 eV is passed through the ions and promotes them to core-hole excited-state manifolds $3d^{-1}$, $3d^{-1}4p^{-1}$ and $3d^{-1}4p^{-2}$ (as indicated by the violet arrows). Transient EUV absorption spectra are acquired by recording the attosecond EUV pulse spectrum transmitted through the ionized gas target as a function of pump–probe delay with an EUV spectrometer. b, c, Spectral intensity distribution of the relevant part of the broadband attosecond probe pulse (b) and schematic of the experimental set-up (c). The pulse is transmitted through an ensemble of krypton atoms (~10¹⁸ cm⁻³; interaction length, ~1 mm) before their ionization (blue curve) and after their ionization (red curve) by the laser pulse with a peak intensity of ~7 × 10¹⁴ W cm⁻², as well as by several optical elements shown in c and discussed in detail in Methods. arb.u., arbitrary units.

Figure 2 | Transient absorption spectra of krypton ions. Absorbance is defined as A(E, t) = ln(Itrans(E, t)/I0(E)), where I0(E) is the spectral density recorded at a negative delay of −10 fs, that is, the attosecond probe precedes the ionizing laser pulse by 10 fs in the atomic sample, and Itrans(E, t) is the spectral density recorded at a pump–probe delay t. The delay is varied in steps of 200 as. Error bars indicate the standard error of the mean values acquired from several spectra recorded at the same delay. The EUV probe pulse shows the formation of charge states up to Kr³⁺ as indicated in the spectrum shown in the background, which is recorded at t ≈ 10 fs. Disregarding a forerunner in the main Kr⁺ line, the origin of which is unclear, the Kr²⁺ lines appear with a significant, well-resolved delay of about one-half the laser period ($T_L$/2 ≈ 1.25 fs) after the Kr⁺ lines, and the Kr³⁺ lines appear with approximately the same delay after the Kr²⁺ lines. The decrease in the neutral krypton population in the atomic sample manifests itself as a reduction of the absorption in the range 91–93 eV, where neutral krypton atoms absorb resonantly. Relative occupations between the ionic states Kr⁺, Kr²⁺ and Kr³⁺ are estimated as N(Kr⁺) : N(Kr²⁺) : N(Kr³⁺) = 1:0.875:0.25.

high-order harmonic radiation emerging from the krypton target was detected in the spectral range of interest (>60 eV). In Fig. 2, we plot a series of transient EUV absorption spectra for an on-axis peak laser intensity of ~7 × 10¹⁴ W cm⁻², which is sufficiently high to produce singly, doubly and triply charged ions, as inferred from the appearance of absorption lines associated with electronic transitions in these ions. Our spectrometer has a resolution of ~0.45 eV and hence is unable to resolve the ~88-meV width21 of the observed lines resulting from the ~8-fs core-hole decay of the $3d^{-1}4p^{-(i-1)}$ states22.

The appearance of different charge states delayed by approximately half the laser cycle with respect to each other is consistent with our understanding that the primary process in strong-field ionization is electron release near the field oscillation peaks by means of tunnelling or above-barrier escape. Attosecond transient absorption spectroscopy offers unprecedented insight into the dynamics of multiple ionization but theory must be further developed before this potential can be exploited. This is because the narrow natural linewidths of the absorption lines in the transmitted EUV light imply that they result from a coherent polarization response of the atomic ensemble that extends over several femtoseconds after the EUV pulse hits the sample. During this time, the strong laser field is still present and may affect the atomic polarization response. Modelling of this interaction is beyond the scope of this work; here we focus on dynamics occurring in Kr⁺ ions after the ionizing field has vanished, where this complication does not arise.

It is known from photoelectron23,24 and photoabsorption18,25 spectroscopy that strong-field ionization populates not only the ground state but also excited electronic states of the emerging ions. Exposure of krypton atoms to a strong, low-frequency laser field is expected to generate Kr⁺ ions predominantly in their $4p^{-1}_{j=3/2}$ ground-state manifold and the $4p^{-1}_{j=1/2}$ excited-state manifold, comprising four ($m_j$ = −3/2, −1/2, 1/2, 3/2) and two ($m_j$ = −1/2, 1/2) states, respectively (j denotes the total angular momentum and $m_j$ denotes its projection on the z axis). The fine-structure (or spin–orbit) energy splitting between the two manifolds is $\Delta E_{SO}$ = 0.67 eV (ref. 26). However, the critical question is whether they can be populated coherently during strong-field ionization so as to allow the creation of subsequent wave-packet motion in the valence shell27.
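The timescales quoted in this passage follow from elementary relations: the optical period T = λ/c, the beat period T = h/ΔE of two states split by ΔE, and the lifetime τ = ħ/Γ associated with a natural linewidth Γ. The Python sketch below checks the quoted numbers. The function names are illustrative and not taken from the paper, and Γ is assumed here to be the full width at half-maximum of a Lorentzian line.

```python
import math

# Physical constants in units convenient for attosecond physics.
H_EV_FS = 4.135667696          # Planck constant h in eV*fs
HBAR_EV_FS = H_EV_FS / (2.0 * math.pi)
C_NM_PER_FS = 299.792458       # speed of light in nm/fs


def optical_period_fs(wavelength_nm):
    """Period of one optical cycle, T = lambda / c."""
    return wavelength_nm / C_NM_PER_FS


def beat_period_fs(splitting_ev):
    """Oscillation period of a superposition of two states, T = h / dE."""
    return H_EV_FS / splitting_ev


def lifetime_fs(linewidth_ev):
    """Decay time for a natural linewidth (assumed Lorentzian FWHM), tau = hbar / Gamma."""
    return HBAR_EV_FS / linewidth_ev


t_laser = optical_period_fs(750.0)   # NIR carrier wavelength from the text
t_half = t_laser / 2.0               # spacing between successive wave crests
t_so = beat_period_fs(0.67)          # spin-orbit splitting from the text
tau_core = lifetime_fs(0.088)        # 3d core-hole linewidth from the text

print(f"laser period      {t_laser:.2f} fs (half period {t_half:.2f} fs)")
print(f"spin-orbit period {t_so:.2f} fs")
print(f"core-hole decay   {tau_core:.1f} fs")
```

Under these assumptions the values come out at roughly 2.5 fs, 6.2 fs and 7.5 fs, in line with the quoted laser period, the spin–orbit period implied by the 0.67-eV splitting and the ~8-fs core-hole decay.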

We have numerically modelled the interaction of krypton atoms with a strong laser field, as described in ref. 1. The populations of the six relevant quantum states of Kr⁺ and the coherence between these states are respectively given by the diagonal elements $\rho^{(m_j)}_{j,j'=j}$ and the corresponding off-diagonal elements of the reduced density matrix of the atomic ensemble (Supplementary Information, section I). Solving the time-dependent Schrödinger equation for a few-cycle ionizing NIR field, linearly polarized along the quantization axis, yields the evolution of the density matrix elements as depicted in Fig. 3. By the end of the laser pulse (t ≥ 3 fs), ~30% of the atoms are ionized, with a hole emerging in the $(4p_j, m_j)$ or $(4p_j, -m_j)$ orbital with a relative population of $\rho^{(1/2)}_{3/2,3/2} + \rho^{(-1/2)}_{3/2,3/2} = 2\rho^{(1/2)}_{3/2,3/2} = 0.69$, $\rho^{(1/2)}_{1/2,1/2} + \rho^{(-1/2)}_{1/2,1/2} = 2\rho^{(1/2)}_{1/2,1/2} = 0.26$ and $\rho^{(3/2)}_{3/2,3/2} + \rho^{(-3/2)}_{3/2,3/2} = 2\rho^{(3/2)}_{3/2,3/2} = 0.05$ (Fig. 3a). The hole populations increase in subfemtosecond steps near the oscillation peaks of the ionizing laser field depicted by the dashed line. Only states with the same value of $m_j$ can form a coherent superposition state characterized by a non-zero value of the corresponding off-diagonal element of the density matrix. In our case, this condition is fulfilled by the ($4p^{-1}_{3/2}$, $m_j$ = ±1/2) and ($4p^{-1}_{1/2}$, $m_j$ = ±1/2) states1. The degree of coherence between these states can be characterized by the parameter

$$g(t) = \frac{\left|\rho^{(1/2)}_{3/2,1/2}\right|}{\sqrt{\rho^{(1/2)}_{3/2,3/2}\,\rho^{(1/2)}_{1/2,1/2}}} \qquad (1)$$

Perfect coherence corresponds to g = 1, whereas incoherent superpositions yield g = 0. In Fig. 3b, we plot the half-cycle-averaged g(t) for pulse durations of 3.8 fs and 7.6 fs (full-width at half-maximum of the intensity envelope).

Figure 3 | Build-up of electronic coherence in Kr⁺ produced by optical field ionization (theory). a, Temporal evolution of the diagonal elements of the reduced density matrix of Kr⁺ in the presence of a 750-nm, 3.8-fs laser pulse with a peak intensity of ~3 × 10¹⁴ W cm⁻² and a sinusoidal waveform. The notation of the matrix elements is explained in the text. The populations have been normalized such that the trace of the reduced density matrix at t = 10 fs equals one. The dashed line shows the electric field of the laser pulse (in atomic units (a.u.)) used in the simulations. The hole density distributions of the corresponding orbitals of the 4p subshell are also depicted. b, The degree of coherence, g(t) (see equation (1)), averaged over time intervals of the laser half-cycle (~1.25 fs), is shown with red dots. Squares show the half-cycle-averaged degree of coherence for an ionizing laser pulse with a duration of 7.6 fs at the same peak intensity. The degree of coherence is defined such that, after the NIR pump pulse, it equals one for a perfectly coherent hole wave packet. The detailed behaviour of the degree of coherence during ion formation is not yet understood.

The evolution of coherence shown in Fig. 3b can be understood intuitively. At every intense wave crest, a fraction of the atoms is ionized and a hole spin–orbit wave packet is launched in the 4p valence subshell by populating the $4p^{-1}$ manifold. Spin–orbit wave packets launched in ions that are generated at a given wave crest have a fairly well-defined phase with respect to the wave crest and evolve with a half-period of 3.1 fs. As a consequence, their phase has changed by 0.4π by the time the next wave crest arrives (after one half of the laser period, $T_L$/2 = 1.25 fs), such that ions produced at this wave crest are phase-shifted with respect to the ions generated previously. In the 3.8-fs pulse, ionization is confined to two to three wave crests, that is, to within a single laser period. Therefore, the final population of the $4p^{-1}$ manifold builds up within less than the 3.1-fs half-period, ensuring that the ions produced at the individual wave crests are substantially in phase. This gives rise to a final ion ensemble with a high degree of coherence (g ≈ 0.6). By contrast, in the 7.6-fs pulse, the build-up of $4p^{-1}$ population takes longer, leading to a wider distribution of relative phases among the ions created and, hence, to a strongly reduced degree of coherence of the ion ensemble (g ≈ 0.13). Our simulations reveal that, in the absence of a tunable excitation period, the creation of a persisting electronic coherence critically relies on the confinement of excitation to a time interval that is comparable to the characteristic timescale of the resultant wave-packet motion. These simulations also reveal that field ionization by multicycle pulses generally creates incoherent ion ensembles10.

When the coherent broadband EUV probe promotes the system from the components of its superposition state into a common final state, quantum interference leads to a temporal modulation of the transition probability: its depth is indicative of the degree of coherence of the probed superposition state, and its variation with pump–probe delay reveals the temporal evolution of the system. In our case, strong-field ionization is predicted to create a hole superposition in the $4p^{-1}_{j=3/2}$ ground-state manifold and the $4p^{-1}_{j=1/2}$ excited-state manifold of Kr⁺ ions (Fig. 4a). A broadband pulse carried at ~80 eV promotes the ions from these states, by means of dipole-allowed transitions, into the $3d^{-1}_{3/2}$ manifold by creating a core-level vacancy. A transition from the $4p^{-1}_{3/2}$ states to the $3d^{-1}_{5/2}$ states is also possible, but the $4p^{-1}_{1/2} \rightarrow 3d^{-1}_{5/2}$ transition is forbidden by electric dipole selection rules. Figure 4b shows the simulated absorption cross-section of Kr⁺ for the above transitions (see Supplementary Information, section I, for details) as a function of pump–probe delay, for the simulation parameters given in Fig. 3.

The high degree of coherence between the $4p^{-1}_{1/2}$ and $4p^{-1}_{3/2}$ states (Fig. 3b) is predicted to translate into modulation of the absorption cross-sections as functions of pump–probe delay for the $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{3/2}$ and $4p^{-1}_{1/2} \rightarrow 3d^{-1}_{3/2}$ transitions (Fig. 4b), owing to a temporal variation of the relative phase, φ, between the $4p^{-1}_{1/2}$ and $4p^{-1}_{3/2}$ states

$$\varphi(t) = \frac{E_{|4p,\,j=1/2\rangle} - E_{|4p,\,j=3/2\rangle}}{\hbar}\,t + \varphi_0 = \frac{\Delta E_{SO}}{\hbar}\,t + \varphi_0 = \frac{2\pi}{T_{SO}}\,t + \varphi_0 \qquad (2)$$

where $E_{|4p,\,j=1/2\rangle}$ is the energy of the $4p^{-1}_{1/2}$ state, $E_{|4p,\,j=3/2\rangle}$ is the energy of the $4p^{-1}_{3/2}$ state and ħ is Planck’s constant divided by 2π. Here $\Delta E_{SO}$ = 0.67 eV implies a wave-packet oscillation period of $T_{SO}$ = 6.2 fs. The modulation is most pronounced in the weaker $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{3/2}$ transition and is absent from the $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{5/2}$ transition, which is insensitive to the quantum superposition. In equation (2), φ(t) describes the relative phase between the two states after the NIR pulse. For a given NIR waveform, $\varphi_0$ is a nontrivial, well-defined quantity, which will become experimentally measurable by combining attosecond absorption spectroscopy with attosecond streaking to determine the NIR waveform and its timing with respect to the attosecond probe pulse with attosecond accuracy. Access to the phase during the laser pulse has been possible for short-lived coherences10. The experimental data (Fig. 4c) clearly show the predicted consequences of the

wave-packet motion described by equation (2), including the deep amplitude modulation of the $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{3/2}$ absorption line as well as the energy modulation of the lines, with the latter resulting from absorptive and dispersive terms in the transient absorption cross-section16 (see equation (1) in Supplementary Information).

Components of the reduced density matrix of the system are determined by adjusting them along with $T_{SO}$ such that the model’s prediction agrees best with spectra recorded over the photon energy range 78.5–82 eV and the pump–probe delay range 3–33 fs (Supplementary Information, section III). This procedure has yielded (for t ≥ 3 fs) $\rho^{(1/2)}_{3/2,3/2} + \rho^{(-1/2)}_{3/2,3/2}$ = 0.42 ± 0.10, $\rho^{(1/2)}_{1/2,1/2} + \rho^{(-1/2)}_{1/2,1/2}$ = 0.35 ± 0.03, $\rho^{(3/2)}_{3/2,3/2} + \rho^{(-3/2)}_{3/2,3/2}$ = 0.23 ± 0.08, g = 0.63 ± 0.17 and $T_{SO}$ = 6.3 ± 0.1 fs. This value for $T_{SO}$ is in excellent agreement with the one derived from $\Delta E_{SO}$ (ref. 26). The degree of coherence agrees well with the predictions in Fig. 3, verifying that few-cycle ionization is capable of creating robust electronic coherence. It survives over a timescale of multiple tens of femtoseconds, allowing electron wave packets to affect structural dynamics once launched in molecular orbitals.

Fig. 5a displays the mean optical density of the $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{3/2}$ absorption line as a function of pump–probe delay over the energy range 81.20–81.45 eV (black dots) in comparison with the prediction of our model with the parameters given above (red line). To reconstruct the quantum mechanical motion of the electron wave packet, the temporal evolution of the quantum phase φ(t) defined by equation (2) must also be retrieved. Attosecond absorption spectroscopy allows this retrieval over a multifemtosecond time interval with a resolution matched to the electronic timescale (Fig. 5b). The uncertainty in the obtained values of the quantum phase (of ~π/5) translates, using equation (2), into a temporal resolution of ~0.6 fs. In Fig. 5b, we plot the evolution of the quantum phase versus the time elapsed since the main absorption line of Kr⁺ reached 95% of its stationary value. In this frame of reference, the electron (or hole) density distribution can be reconstructed at any instant after the ionizing laser pulse with subfemtosecond accuracy. We plot the result of this reconstruction at a few selected instants separated by 1 fs (Supplementary Information, section II). Measuring $\varphi_0$ along with the amplitude of the off-diagonal matrix element as a function of the laser waveform, which can be determined using attosecond streaking11, will provide insight into the dynamics of optical field ionization and the concomitant formation of intra-atomic electron wave packets. Through comparisons with model predictions, this measured phase will provide a new, sensitive test of the modelling of strong-field ionization of multi-electron atoms, a process that is far from being well understood.

Attosecond transient absorption spectroscopy extends real-time insight into microscopic motion, from nuclear wave-packet4 and Rydberg wave-packet dynamics28 to electrons in the valence shell. Its unique features, that is, gentle probing owing to the absence of strong fields, the capacity to study processes within condensed matter, and wide applicability to materials with arbitrary ionization potential, render this approach ideal for attosecond real-time observation of electronic and concomitant processes in atoms, molecules and solids29,30.

Figure 4 | Attosecond absorption spectroscopy reveals intra-atomic electron wave-packet motion in Kr⁺. a, Energy-level diagram showing the spin–orbit splitting of the 4p and 3d subshells in Kr⁺. A sub-4-fs NIR laser pulse (red wave) liberates an electron from the 4p subshell and leaves the ensemble of ions in a coherent superposition of $4p^{-1}_{3/2}$ and $4p^{-1}_{1/2}$ states. Single-photon EUV absorption promotes the ions from these states to the core-excited $3d^{-1}_{3/2}$ state. b, Simulated transient EUV absorption spectra reveal characteristic modulations present in the $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{3/2}$ and $4p^{-1}_{1/2} \rightarrow 3d^{-1}_{3/2}$ transitions as functions of pump–probe delay. The modulation depth is highly sensitive to the degree of coherence, g(t). Mb, megabarn. c, False-colour plot of an attosecond absorption spectrogram comprising 40 transient absorption spectra recorded at delays increased in steps of 1 fs with a sub-4-fs, ~750-nm laser pump and a sub-150-as, ~80-eV EUV probe. The reference spectrum was recorded at −6 fs. The absorption spectrum plotted in the background is taken at a delay of 30 fs. The linewidths are determined by the ~0.45-eV resolution of our EUV spectrometer. The lower modulation depth of the $4p^{-1}_{1/2} \rightarrow 3d^{-1}_{3/2}$ transition relative to the calculations shown in b is a result of the spectral resolution being limited in comparison with the ~88-meV natural linewidth of the studied transitions. The zero of the delay scale is set to coincide with the instant when the main $4p^{-1}_{3/2} \rightarrow 3d^{-1}_{5/2}$ absorption line reaches 95% of its quasistationary value.
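As a rough numerical check of equations (1) and (2), the Python sketch below evaluates the degree of coherence from a set of density-matrix elements and converts between phase and time. Several things here are assumptions rather than material from the paper: the function names, the modulus taken on the off-diagonal element, the equal sharing of each fitted population between the $m_j$ = +1/2 and −1/2 states, and an off-diagonal magnitude that is back-calculated from the reported g = 0.63. The first check is therefore a consistency round trip, not an independent result.

```python
import cmath
import math


def degree_of_coherence(rho_33, rho_11, rho_31):
    """Equation (1): |off-diagonal| over the geometric mean of the populations."""
    return abs(rho_31) / math.sqrt(rho_33 * rho_11)


def relative_phase(t_fs, t_so_fs, phi0=0.0):
    """Equation (2) in the form phi(t) = 2*pi*t/T_SO + phi0."""
    return 2.0 * math.pi * t_fs / t_so_fs + phi0


def time_resolution_fs(phase_uncertainty, t_so_fs):
    """Invert equation (2): a phase error d_phi maps to d_t = d_phi * T_SO / (2*pi)."""
    return phase_uncertainty * t_so_fs / (2.0 * math.pi)


# Fitted populations are quoted as sums over +m_j and -m_j;
# assume (hypothetically) that each sum splits into two equal halves.
rho_33 = 0.42 / 2.0   # 4p^-1 (j = 3/2, m_j = 1/2)
rho_11 = 0.35 / 2.0   # 4p^-1 (j = 1/2, m_j = 1/2)

# Off-diagonal element: magnitude implied by the reported g = 0.63,
# with an arbitrary phase taken at t = 5 fs for illustration.
magnitude = 0.63 * math.sqrt(rho_33 * rho_11)
rho_31 = magnitude * cmath.exp(1j * relative_phase(5.0, 6.3))

g = degree_of_coherence(rho_33, rho_11, rho_31)

# Phase advance between successive wave crests (T_L/2 = 1.25 fs, T_SO = 6.2 fs),
# expressed in units of pi.
phase_step_in_pi = relative_phase(1.25, 6.2) / math.pi

# Temporal resolution implied by a pi/5 phase uncertainty with T_SO = 6.3 fs.
dt = time_resolution_fs(math.pi / 5.0, 6.3)

print(f"g = {g:.2f}, phase step = {phase_step_in_pi:.2f} pi, resolution = {dt:.2f} fs")
```

Because g depends only on the modulus of the off-diagonal element, the evolving phase of equation (2) leaves it unchanged. The phase step of about 0.4π and the resolution of about 0.6 fs match the values quoted in the text.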

[Figure 5: panel a, absorbance versus elapsed time (fs); panel b, phase φ versus elapsed time (fs), with hole density distributions shown below.]

Figure 5 | Reconstruction of valence-shell electron wave-packet motion. a, Absorbance (dots) averaged over the photon energy range 81.20–81.45 eV corresponding to the $4p_{3/2}^{-1} \rightarrow 3d_{3/2}^{-1}$ transition (see Fig. 4c), as a function of time elapsed since zero as defined in the text. The full line shows the result of our modelling for the values of the fit parameters given in the text. The modulation occurs with a period of 6.3 ± 0.1 fs. Error bars depict the standard error of the values extracted from several data sets recorded under identical experimental conditions. b, Quantum phase of the 4p superposition state φ(t) (see equation (2)), as retrieved from the measured attosecond absorption spectrogram shown in Fig. 4c. Uncertainty in the values, resulting from our measurement and modelling, indicates accuracy of reconstruction of the superposition of ~π/5. The lower diagram shows ensemble-averaged hole density distributions in the 4p subshell of Kr⁺ reconstructed from the measured φ(t) and the measured components of the density matrix, at instants separated by 1 fs, within an interval of 17–25 fs following ionization.

METHODS SUMMARY
We sent sub-4-fs, 0.3-mJ NIR laser pulses at a wavelength of ~750 nm into a neon-filled tube to generate EUV pulses by means of high-harmonic generation. The collinear NIR and EUV beams were then passed through a filter assembly and focused in a quasistatic gas cell containing the krypton gas at a pressure of about 80 mbar. The EUV pulse hitting the target was delayed with respect to the NIR laser pulse and had a duration of less than 150 as. A broadband molybdenum–silicon multilayer mirror imaged, in one transverse dimension, the attosecond EUV beam transmitted through the krypton gas target to the entrance slit of an EUV spectrometer used for measuring the spectral intensity distribution of the transmitted beam.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 25 September 2009; accepted 24 May 2010.

1. Rohringer, N. & Santra, R. Multichannel coherence in strong-field ionization. Phys. Rev. A 79, 053402 (2009).
2. Baltuska, A. et al. Attosecond control of electronic processes by intense light fields. Nature 421, 611–615 (2003).
3. Hentschel, M. et al. Attosecond metrology. Nature 414, 509–513 (2001).
4. Zewail, A. H. Femtochemistry: atomic-scale dynamics of the chemical bond. J. Phys. Chem. A 104, 5660–5694 (2000).
5. Schöffler, M. S. et al. Ultrafast probing of core hole localization in N2. Science 320, 920–923 (2008).
6. Niikura, H. et al. Sub-laser-cycle electron pulses for probing molecular dynamics. Nature 417, 917–922 (2002).
7. Niikura, H. et al. Probing molecular dynamics with attosecond resolution using correlated wave packet pairs. Nature 421, 826–829 (2002).
8. Niikura, H., Villeneuve, D. M. & Corkum, P. B. Mapping attosecond electron wave packet motion. Phys. Rev. Lett. 94, 083003 (2005).
9. Baker, S. et al. Probing proton dynamics in molecules on an attosecond time scale. Science 312, 424–427 (2006).
10. Smirnova, O. et al. High harmonic interferometry of multi-electron dynamics in molecules. Nature 460, 972–977 (2009).
11. Kienberger, R. et al. Atomic transient recorder. Nature 427, 817–821 (2004).
12. Sansone, G. et al. Isolated single-cycle attosecond pulses. Science 314, 443–446 (2006).
13. Goulielmakis, E. et al. Single-cycle nonlinear optics. Science 320, 1614–1617 (2008).
14. Brabec, T. Strong Field Laser Physics (Springer, 2008).
15. Yudin, G. L. et al. Attosecond photoionization of coherently coupled electronic states. Phys. Rev. A 72, 051401 (2005).
16. Pollard, W. T., Lee, S.-Y. & Mathies, R. A. Wave packet theory of dynamic absorption spectra in femtosecond pump-probe experiments. J. Chem. Phys. 92, 4012–4029 (1990).
17. Mathies, R. A. et al. Direct observation of the femtosecond excited-state cis-trans isomerization in bacteriorhodopsin. Science 240, 777–779 (1988).
18. Loh, Z.-H. et al. Quantum state-resolved probing of strong-field-ionized xenon atoms using femtosecond high-order harmonic transient absorption spectroscopy. Phys. Rev. Lett. 98, 143601 (2007).
19. Loh, Z.-H. & Leone, S. R. Ultrafast strong-field dissociative ionization dynamics of CH2Br2 probed by femtosecond soft X-ray transient absorption spectroscopy. J. Chem. Phys. 128, 204302 (2008).
20. Southworth, S. H. et al. K-edge X-ray-absorption spectroscopy of laser-generated Kr⁺ and Kr²⁺. Phys. Rev. A 76, 043421 (2007).
21. Jurvansuu, M., Kivimäki, A. & Aksela, S. Inherent lifetime widths of Ar 2p⁻¹, Kr 3d⁻¹, Xe 3d⁻¹, and Xe 4d⁻¹ states. Phys. Rev. A 64, 012502 (2001).
22. Drescher, M. et al. Time-resolved atomic inner-shell spectroscopy. Nature 419, 803–807 (2002).
23. Rottke, H., Ludwig, J. & Sandner, W. 'Short' pulse MPI of xenon: the $^2P_{1/2}$ ionization channel. J. Phys. B 29, 1479 (1996).
24. Gubbini, E. et al. Core relaxation in atomic ultrastrong laser field ionization. Phys. Rev. Lett. 94, 053602 (2005).
25. Young, L. et al. X-ray microprobe of orbital alignment in strong-field ionized atoms. Phys. Rev. Lett. 97, 083601 (2006).
26. Saloman, E. B. Energy levels and observed spectral lines of krypton, Kr I through Kr XXXVI. J. Phys. Chem. Ref. Data 36, 215–386 (2007).
27. Santra, R., Dunford, R. W. & Young, L. Spin-orbit effect on strong-field ionization of krypton. Phys. Rev. A 74, 043403 (2006).
28. Jones, R. R. & Noordam, L. D. Electronic wavepackets. Adv. At. Mol. Opt. Phys. 38, 1–38 (1998).
29. Bucksbaum, P. H. The future of attosecond spectroscopy. Science 317, 766–769 (2007).
30. Krausz, F. & Ivanov, M. Y. Attosecond physics. Rev. Mod. Phys. 81, 163–234 (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank U. Kleineberg, M. Hofstetter and M. Fiess for invaluable contributions. This work was supported by the Max Planck Society, the Nobel Program of King Saud University and the DFG Cluster of Excellence: Munich Centre for Advanced Photonics (http://www.munich-photonics.de). E.G. acknowledges a Marie-Curie Reintegration grant (MERG-CT-2007-208643). A.W., S.Z. and M.F.K. acknowledge support by the Emmy Noether programme of the DFG. Z.-H.L., T.P. and S.R.L. acknowledge support from the Air Force Office of Scientific Research (FA9550-04-1-0242), the National Science Foundation (CHE-0742662 and EEC-0310717) and the Director, Office of Science, Office of Basic Energy Sciences, US Department of Energy (DE-AC02-05-CH11231). T.P. acknowledges support from the MPRG program of the MPG. R.S. is supported by the Office of Basic Energy Sciences, Office of Science, US Department of Energy (DE-AC02-06CH11357). Part of this work was performed under the auspices of the US Department of Energy by Lawrence Livermore National Laboratory (DE-AC52-07NA27344). S.R.L. gratefully acknowledges appointment as a Miller Research Professor in the Miller Institute for Basic Research in Science.

Author Contributions E.G., Z.-H.L. and A.W. conceived and designed the experiments; E.G., A.W. and Z.-H.L. performed the measurements; A.W., Z.-H.L., E.G., T.P., S.Z., A.M.A., M.F.K., S.R.L and F.K. evaluated, analysed and interpreted the experimental data; and R.S., N.R. and V.S.Y. performed the theoretical modelling. All authors discussed the results and contributed to the final manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to E.G. ([email protected]), S.R.L. ([email protected]) or F.K. ([email protected]).

doi:10.1038/nature09212

METHODS
Carrier-envelope-phase-controlled, sub-4-fs, 0.3-mJ NIR laser pulses carried at a wavelength of ~750 nm and delivered at a repetition rate of 3 kHz are focused into a neon-filled tube to generate EUV pulses by means of high-harmonic generation (Fig. 1c). On their way towards a two-component, concentric mirror module, the collinear NIR and EUV beams pass through a filter assembly consisting of a 150-nm-thick zirconium foil and an ultrathin pellicle. The small-divergence EUV beam is transmitted through the zirconium filter covering a circular spot ~3 mm in diameter, whereas the NIR beam is efficiently blocked by this filter. The outer part of the (more divergent) NIR beam is transmitted by the pellicle carrying the zirconium foil.

The EUV beam transmitted through the circular zirconium filter has a diameter of ~3 mm when hitting the internal part of the double mirror assembly, which also has a diameter of ~3 mm. This inner mirror is covered with a molybdenum–silicon multilayer with a reflectance of ~2.5% over a 28-eV (full-width at half-maximum) band centred at ~80 eV. In these experiments, spectral filtering and the intensity of the driving field yield pulses with ~15-eV bandwidth. The NIR beam is reflected by the outer part of the mirror assembly, which is coated with silver. Both mirrors are deposited on a super-polished substrate with a radius of curvature of 25 cm. Both beams are focused in a quasistatic gas cell formed by a nickel tube, containing the krypton gas at a pressure of about 80 mbar. The effective interaction length in the krypton gas target is ~1 mm. The EUV pulses hitting the target have a duration of less than 150 as as verified by attosecond streak-camera measurements. It is delayed with respect to the laser pulse by the focusing molybdenum–silicon mirror mounted on a piezo-controlled translation stage. A motorized aperture installed downstream of the source adjusts the on-axis NIR intensity on target (not shown) between zero and ~7 × 10¹⁴ W cm⁻². A second zirconium foil, installed behind the krypton gas cell, prevents the NIR light from entering the EUV spectrometer. A broadband molybdenum–silicon multilayer mirror images, in one transverse dimension, the transmitted EUV beam from the krypton gas target to the entrance slit of an EUV spectrometer used for measuring the spectral intensity distribution of the attosecond EUV beam transmitted through the ionized target.
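The quoted ~15-eV bandwidth and the sub-150-as pulse duration can be cross-checked with the Fourier limit. The sketch below is illustrative only and is not part of the authors' analysis. It assumes a Gaussian pulse shape and reads the 15-eV figure as an intensity full-width at half-maximum; the function name is hypothetical.

```python
import math

# Planck constant in eV s (exact CODATA value).
PLANCK_EV_S = 4.135667696e-15

# Time-bandwidth product of a transform-limited Gaussian pulse,
# with both widths taken as intensity FWHM: 2 ln(2) / pi ~ 0.441.
# The Gaussian shape is an assumption, not a statement from the paper.
GAUSSIAN_TBP = 2.0 * math.log(2.0) / math.pi


def transform_limited_duration_as(bandwidth_ev: float) -> float:
    """Shortest FWHM duration (in attoseconds) supported by a given bandwidth."""
    bandwidth_hz = bandwidth_ev / PLANCK_EV_S
    duration_s = GAUSSIAN_TBP / bandwidth_hz
    return duration_s * 1e18


duration_as = transform_limited_duration_as(15.0)
print(f"Fourier limit for 15 eV: {duration_as:.0f} as")
```

Under these assumptions the limit comes out at roughly 120 as, which is compatible with the stated upper bound of 150 as.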

Vol 466 | 5 August 2010 | doi:10.1038/nature09257 LETTERS

Melting-induced stratification above the Earth’s inner core due to convective translation

Thierry Alboussière1,2, Renaud Deguen1,3 & Mickaël Melzani1

In addition to its global North–South anisotropy1, there are two other enigmatic seismological observations related to the Earth's inner core: asymmetry between its eastern and western hemispheres2–6 and the presence of a layer of reduced seismic velocity at the base of the outer core6–12. This 250-km-thick layer has been interpreted as a stably stratified region of reduced composition in light elements13. Here we show that this layer can be generated by simultaneous crystallization and melting at the surface of the inner core, and that a translational mode of thermal convection in the inner core can produce enough melting and crystallization on each hemisphere respectively for the dense layer to develop. The dynamical model we propose introduces a clear asymmetry between a melting and a crystallizing hemisphere which forms a basis for also explaining the East–West asymmetry. The present translation rate is found to be typically 100 million years for the inner core to be entirely renewed, which is one to two orders of magnitude faster than the growth rate of the inner core's radius. The resulting strong asymmetry of buoyancy flux caused by light elements is anticipated to have an impact on the dynamics of the outer core and on the geodynamo.

The original observation7 of seismic compressional (P)-wave velocities slower than the adiabatic PREM14 model in the lower outer core has since been confirmed and incorporated in one-dimensional global models AK135 (ref. 10) and PREM2 (ref. 11). That discrepancy from the adiabatic profile could result from a wrong interpretation caused by the nearby complex inner core, because sensitivity kernels have a width of several hundred kilometres at body-wave frequencies15, or might also be attributed to floating crystals12,16. Gubbins et al.13 show that this last explanation is not possible but that the observed seismic velocities can be explained by a stratification in light elements (and temperature). However, the stratification mechanism by crystallization and melting of crystals at different depths has not been completely elucidated.

We propose that a dense layer can develop when melting and crystallization occur only at the inner-core boundary (ICB). Where crystallization takes place, light elements are released, providing light fluid; where melting takes place, dense fluid is produced. It is possible to quantify these effects in terms of flux of buoyancy. Let us denote $\Delta\rho$ as the fraction of density jump across the ICB that is due to composition partition between solid and liquid phases. For a rate of crystallization $V$, the buoyancy flux is $\Delta\rho\, g_c V$, where $g_c$ is the magnitude of gravity17 on the ICB (subscript c is for 'core'). For melting, the buoyancy flux is $-\Delta\rho\, g_c V$. The idea is that part of the heavy fluid would remain at the bottom, while the rest would be entrained by the light fluid. Conversely, part of the light fluid would mix with the dense fluid in the dense layer while the rest would cross the dense layer and contribute to convection within the main part of the outer core. This idea has been validated experimentally as follows.

The experiments consist of simultaneously injecting constant fluxes of light and dense fluids at the bottom of a fluid cavity. The cavity is a box of perspex 20 cm high and with a 15 cm × 15 cm horizontal cross-section. It is initially filled with salted water (initial concentration $x_0$, in wt% NaCl). At the bottom of the cavity, there is a porous layer (sponge) below which the cross-section is divided into two disconnected parts: on one side light fluid is injected ($x_l < x_0$) and on the other side heavy fluid is injected ($x_h > x_0$), where $x_l$ and $x_h$ are the salt concentrations of the light and heavy fluids in wt% NaCl. Both density differences $x_0 - x_l$ and $x_h - x_0$ and both flow rates are controlled and set to be constant during the experiment. The injections of fluids start simultaneously through pipes from reservoirs with the desired concentration. The excess fluid is removed through an overflow at the top of the cavity.

The geophysically relevant case is when the positive buoyancy flux exceeds the negative one, because on average the inner core is growing. When the negative buoyancy flux induced by the heavy fluid is less than 80% of the amplitude of the light fluid, no dense layer is observed: the entrainment caused by the rise of light plumes is sufficient to mix the heavy fluid as it is released by the bottom boundary. However, when the heavy buoyancy flux is more than 80% of the light buoyancy flux, a dense layer grows at the bottom of the cavity. It has been observed experimentally that the condition for the existence of the dense layer is really a condition for the buoyancy fluxes, as described above; it does not specifically depend either on the volume flow rates or on the density differences between the fluids. This justifies our convection experiment as an appropriate model of a melting/crystallization process for the inner core.

On Fig. 1, an experimental run is shown. This experiment corresponds to a case in which the heavy fluid buoyancy flux was 83% that of the light fluid. The initial concentration and concentrations of the dense and light injected fluids were $x_0 = 4$ wt%, $x_h = 6$ wt% and $x_l = 1.65$ wt% NaCl respectively. The volume flow rate of the dense fluid was $Q_h = 3.9 \times 10^{-7}\ \mathrm{m^3\,s^{-1}}$ and that of the light fluid was $Q_l = 4.0 \times 10^{-7}\ \mathrm{m^3\,s^{-1}}$. The experiment was run twice under the same conditions: in the first instance, the injected dense fluid was coloured with potassium permanganate and photographs of the set-up were taken at different times after the beginning of the injections. A dense coloured layer forms at the bottom and its thickness grows linearly with time. It is also possible to see convection plumes going up on the right-hand side, carrying along some of the heavy coloured fluid in the upper part of the cavity. In the second instance, the synthetic schlieren method has been used18,19, providing a quantitative two-dimensional field of refraction index with which to visualize the concentration gradients: their horizontal components are shown on the middle row of Fig. 1, showing convection plumes of light fluid on the right-hand side of the cavity, while their vertical components are shown on the bottom row, visualizing the dense layer and its

1Laboratoire de Géophysique Interne et Tectonophysique, CNRS, Observatoire de Grenoble, Université Joseph Fourier, Maison des Géosciences, BP 53, 38041 Grenoble Cedex 9, France. 2Université de Lyon, CNRS UMR5570, site UCB Lyon 1, 2 rue Raphaël Dubois, bâtiment Géode, 69622 Villeurbanne, Université Lyon 1, ENS de Lyon, France. 3Department of Earth and Planetary Sciences, Johns Hopkins University, Baltimore, Maryland 21218, USA.
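The 83% buoyancy-flux ratio quoted for the experimental run of Fig. 1 can be recovered from the stated concentrations and flow rates. The following is a minimal sketch, not the authors' procedure. It assumes that density varies linearly with salt concentration over this range, so that each buoyancy flux is proportional to the concentration difference times the volume flow rate; the function name is hypothetical.

```python
def buoyancy_flux_ratio(x0, x_light, x_heavy, q_light, q_heavy):
    """Ratio of heavy (negative) to light (positive) buoyancy flux.

    Assumes density is linear in NaCl concentration, so the common
    proportionality factor (and gravity) cancels in the ratio.
    """
    light_flux = (x0 - x_light) * q_light   # positive buoyancy source
    heavy_flux = (x_heavy - x0) * q_heavy   # negative buoyancy source
    return heavy_flux / light_flux


# Values quoted in the text for the run shown in Fig. 1.
ratio = buoyancy_flux_ratio(
    x0=4.0,          # initial concentration, wt% NaCl
    x_light=1.65,    # light injected fluid, wt% NaCl
    x_heavy=6.0,     # dense injected fluid, wt% NaCl
    q_light=4.0e-7,  # m^3 s^-1
    q_heavy=3.9e-7,  # m^3 s^-1
)
print(f"heavy/light buoyancy flux ratio: {ratio:.2f}")
```

With these inputs the ratio sits just above the 80% threshold that the text gives for a dense layer to form.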

growth. The concentration field is computed from its gradient, and averaged along the horizontal direction: the resulting stratification profile is shown in Fig. 2. There is clearly a region of stratified fluid, above which density is nearly uniform. The thickness of this layer grows linearly with time, its volume being 50% to 90% that of the total volume generated by the light and heavy fluxes.

[Figure 1: panels a, b and c at t = 24, 48, 72 and 96 min; colour scales for b and c: concentration gradient (wt% m⁻¹).]

Figure 1 | Visualization of the growth of a dense layer in an experimental run. We used dye injection (a) and measurement of horizontal (b) and vertical (c) density gradients. The experimental cavity is initially filled with a 4 wt% NaCl water solution. From t = 0, a constant flux of 1.65 wt% NaCl solution is injected at the bottom on the right-hand side of the cavity while a 6 wt% NaCl solution is injected on the left-hand side. The dense fluid is coloured with potassium permanganate (a), visualizing a growing dense layer at the bottom, at four different times after the injection of the dye. The synthetic schlieren method is used in a second identical experiment: the horizontal gradient of refraction index in b highlights the convective plumes and the vertical gradient in c reveals the dense layer.

[Figure 2: concentration (wt%) versus altitude measured from the bottom of the cavity (cm), with profiles at 24, 48, 72, 96 and 120 min; the dense liquid layer and the light plumes are labelled.]

Figure 2 | Evolution of the concentration profile during the growth of a dense layer. The concentration field is extracted from the gradient of refraction index. It is averaged along the horizontal direction and shows the time evolution of the dense layer since injection of the dye.

Melting part of the inner core at a significant rate is difficult while it is crystallizing (on average over its surface) as a result of secular cooling. The most plausible mechanism is that a topography is formed dynamically on the ICB so that the temperature of the adjacent fluid of the outer core exceeds the melting temperature. That excess temperature is then responsible for heat transfer from the outer core to the ICB, providing latent heat for fusion: in this way topography can be related to the rate of melting.

[Figure 3: schematic of the inner core with crystallization on the west side and melting on the east side, showing C, O, M, δ, V, θ, x, r, R and the ICB at r = c.]

Figure 3 | A schematic representation of the translational convective mode. The centre of the inner core O is shifted by a distance $\delta$ away from the centre of the Earth C, which would be its equilibrium position if its density were uniform. That shift causes a thermal departure from the adiabat at the ICB, generating melting on one side and crystallization on the other side. Hence a uniform flow exists in the inner core (arrow labelled V): in the case of a superadiabatic regime, a gradient of temperature develops, as represented by greyscale shading. Its associated changes in density and gravitational potential lead to a new mechanical equilibrium for the inner core, corresponding to a shift in position in the same direction as initially assumed. $r$ is the distance from the centre O of the inner core and $\theta$ is the angle between the x axis and the direction of the point where $\Theta$ is evaluated. $V$ is the rate of crystallization, and $c$ is the radius of the core. $R$ is the distance between the point at which the gravitational potential $U$ is calculated and the centre of the Earth C. M is a dummy point, used to define $r$, $\theta$ and $R$. The dotted circle is the position of the ICB in the absence of density gradient (centred on C). See Methods.

The dynamical model we put forward to account for significant melting on the ICB results from the combination of three physical elements: the thermal state of a superadiabatic inner core, gravitational equilibrium and finite heat exchange of latent heat with the outer core. In superadiabatic conditions, a uniform velocity in the inner core $V$, say from west to east along the x-axis (see Fig. 3), generates a global superadiabatic temperature gradient in the same direction proportional to the residence time in the inner core; such a gradient would hence be inversely proportional to $V$, and proportional to a positive source term $S \approx 10^{-15}\ \mathrm{K\,s^{-1}}$ defined by secular cooling and thermal conduction along the adiabat (see Methods and ref. 20):

$$\frac{\partial \Theta}{\partial x} = \frac{S}{V} \qquad (1)$$

where $\Theta$ is the temperature relative to the adiabat $T_{ad}$ in the inner core anchored to the ICB17. It follows from the volume expansion coefficient21 $\alpha = 1.1 \times 10^{-5}\ \mathrm{K^{-1}}$ and inner-core density (on the ICB17) $\rho_s = 1.28 \times 10^{4}\ \mathrm{kg\,m^{-3}}$ (subscript s is for 'solid') that there exists a density gradient $-\alpha \rho_s\, \partial\Theta/\partial x$. The resulting gravity field and density distribution generate unbalanced forces on the inner core, so that it is displaced a distance $\delta$ in the x direction. In the Methods, we derive the gravitational field and potential associated with this mass distribution, from which it is possible to calculate the net gravitational force $\mathbf{F}_G$ exerted on the inner core and the net pressure force $\mathbf{F}_P$ exerted by the outer core on the inner core

$$\mathbf{F}_G + \mathbf{F}_P = \frac{16\pi^2}{9}\, G \rho_l c^3 \left[ \alpha \frac{\partial \Theta}{\partial x} \frac{\rho_s c^2}{5} - (\rho_s - \rho_l)\,\delta \right] \mathbf{e}_x \qquad (2)$$

where $G$ is the universal gravitational constant, $c = 1{,}220$ km is the radius of the inner core17, $\rho_l = 1.22 \times 10^{4}\ \mathrm{kg\,m^{-3}}$ is the outer core density on the ICB17 and $\mathbf{e}_x$ is the unit vector in the direction of the

temperature gradient. The equilibrium condition that both forces balance provides the shift $\delta$ as a function of the thermal gradient $\partial\Theta/\partial x$

$$\delta = \frac{\alpha\, \dfrac{\partial \Theta}{\partial x}\, \rho_s c^2}{5(\rho_s - \rho_l)} \qquad (3)$$

Then, the displacement $\delta$ is associated with a non-uniform pressure distribution on the ICB (see Methods), yielding a small temperature departure $\delta T$ from the adiabat (see Fig. 4)

$$\delta T = \rho_l\, g_c\, \delta \cos\theta\, (m_P - m_{ad}) \qquad (4)$$

where $m_P = 8.5 \times 10^{-9}\ \mathrm{K\,Pa^{-1}}$ is the Clapeyron slope22, $m_{ad} = (\alpha T_{ad})/(\rho c_p) = 6 \times 10^{-9}\ \mathrm{K\,Pa^{-1}}$ is the adiabatic gradient, $g_c = G \frac{4\pi}{3} \rho_s c$ is the gravity on the ICB and $c_p = 850\ \mathrm{J\,kg^{-1}\,K^{-1}}$ is the specific heat capacity23. That departure is accommodated by a thermal boundary layer in the outer core, with a corresponding heat transfer of typical magnitude $u' c_p\, \delta T$, where $u' = 10^{-4}\ \mathrm{m\,s^{-1}}$ is a typical velocity scale in the outer core. That heat transfer must be balanced by the release or absorption of latent heat

$$L V \cos\theta = u' c_p\, \delta T \qquad (5)$$

where $L = 900\ \mathrm{kJ\,kg^{-1}}$ is the latent heat coefficient24,25. Finally, combining equations (1), (3), (4) and (5), we can express the translational velocity as

$$V^2 = \frac{4\pi G}{15}\, \frac{u' c_p\, \rho_s^2\, \rho_l\, \alpha\, (m_P - m_{ad})\, S}{L\, (\rho_s - \rho_l)}\, c^3 \qquad (6)$$

Depending on the heat flux at the core–mantle boundary, the history of the inner core shows a first phase dominated by growth ($\dot{c} \propto c^{-1}$), followed by the development of the translational instability (see Supplementary Information), when its radius was around 400 km, leading to the dominant present translation $V \propto c^{3/2}$ of the order of $5 \times 10^{-10}\ \mathrm{m\,s^{-1}}$, while the growth rate is of the order of $10^{-11}\ \mathrm{m\,s^{-1}}$ (Fig. 5).

[Figure 5: $\dot{c}$ ($10^{-10}\ \mathrm{m\,s^{-1}}$) and $V$ ($10^{-10}\ \mathrm{m\,s^{-1}}$) versus inner-core radius $c$ (km), for Q = 7, 8, 9, 10 and 11 TW.]

Figure 5 | Growth rate of the radius of the inner core and uniform convective velocity as functions of the inner-core radius. They are plotted for different values of the heat flux $Q$ at the core–mantle boundary. Thin solid lines show the mean solidification (crystallization) rates $\dot{c}$ of the inner core. Dash-dotted lines show the translation velocities $V$, calculated with the assumption of a constant $S$. Thick solid lines show the translation velocities $V$, with $S(t)$ calculated (see Supplementary Information) from the core thermal evolution model of ref. 30.

The latter scaling law implies that the translational convection is faster along a long axis of the inner-core oblate spheroid (see Supplementary Information), that is, perpendicular to the rotation axis. It follows that the temperature gradient is preferentially aligned with such a long axis, which again reinforces convection in that direction. Moreover, the Earth's aspherical mass distribution (which has essentially a degree 2, order 2 geometry26) is responsible for elongating the inner core slightly along an east–west axis and induces a degree 1 translational convection in the inner core through a bifurcation produced by instability (see Supplementary Information). We propose that the translational flow has a west to east orientation, which is responsible for the observed hemispherical asymmetry of the inner core: grain growth during the transit from the western hemisphere to the eastern hemisphere may explain the difference in seismic properties27. The temperature difference of a few kelvin between the hemispheres is another source of asymmetry.

According to our experiments, a melting rate above 80% of the crystallization rate is necessary for a dense layer to form, which geometrically implies that the translation velocity $V$ is more than 20 times that of the inner-core growth rate. From Fig. 5, we see that this happens only when the core–mantle boundary heat flux exceeds 10 TW, and only since the inner-core radius was 1,100 km, some 200 million years ago. Extrapolating from our experiments, 50% of the volume of melt produced since then would correspond to a layer 250 km thick. The experimental excess concentration is found to be 10% of the concentration difference between light and heavy injected fluids. In the Earth's core, where the concentration of light elements is about 10%, a difference in concentration of around 1% across the dense layer is expected. This is indeed coherent with the observed seismic velocities13.

Our convection mechanism ignores deformation in the inner core and compositional buoyancy. With a finite effective viscosity, temperature variations along gravity isopotentials induce an internal flow with deformation that affects the translational mode. We have estimated that the internal flow is weak compared to translation for an effective viscosity above $10^{18}$ Pa s. Enrichment in light elements of the outer core (a few per cent) has been invoked28,29 to imagine a stabilizing mechanism for convection in the inner core. This is speculative, however, because the fraction of light elements incorporated in the inner core may have decreased more rapidly than the fraction of light elements incorporated into the outer core increased, given that gravity on the ICB is getting larger, reinforcing convection and compaction in the mush.

Invoking an excessively asymmetric buoyancy flux on the ICB calls for further study of the dynamics of the outer core and the geodynamo. The stratified layer is expected to be dynamically isolated and to act as a filter between the inner core and the rest of the outer core, but there might subsist some hemispherical asymmetry in the outer-core dynamics.
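As a rough numerical check of the translation-velocity scaling, the parameter values quoted in the text can be inserted into equations (1) and (3) to (6). The sketch below is an order-of-magnitude illustration, not the authors' calculation. The value of G is the standard CODATA figure and is not taken from the paper, S is set to its quoted approximate value of 10⁻¹⁵ K s⁻¹, and the variable and function names are hypothetical.

```python
import math

# Parameter values as quoted in the text (SI units).
G = 6.674e-11        # gravitational constant (CODATA; not given in the paper)
U_PRIME = 1e-4       # typical outer-core velocity scale, m/s
C_P = 850.0          # specific heat capacity, J/(kg K)
RHO_S = 1.28e4       # inner-core density at the ICB, kg/m^3
RHO_L = 1.22e4       # outer-core density at the ICB, kg/m^3
ALPHA = 1.1e-5       # volume expansion coefficient, 1/K
M_P = 8.5e-9         # Clapeyron slope, K/Pa
M_AD = 6.0e-9        # adiabatic gradient, K/Pa
LATENT = 900e3       # latent heat, J/kg
S = 1e-15            # superadiabatic source term, K/s (approximate)
C = 1.22e6           # inner-core radius, m
SECONDS_PER_MYR = 3.156e13


def translation_velocity(radius, source):
    """Translation velocity V from equation (6)."""
    v_squared = (
        (4.0 * math.pi * G / 15.0)
        * U_PRIME * C_P * RHO_S**2 * RHO_L * ALPHA * (M_P - M_AD) * source
        / (LATENT * (RHO_S - RHO_L))
        * radius**3
    )
    return math.sqrt(v_squared)


V = translation_velocity(C, S)
grad_theta = S / V                                            # equation (1)
delta = ALPHA * grad_theta * RHO_S * C**2 / (5.0 * (RHO_S - RHO_L))  # equation (3)
g_c = G * 4.0 * math.pi / 3.0 * RHO_S * C                     # gravity on the ICB
delta_T = RHO_L * g_c * delta * (M_P - M_AD)                  # equation (4), cos(theta) = 1
renewal_myr = 2.0 * C / V / SECONDS_PER_MYR                   # time to cross one diameter

print(f"V           = {V:.2e} m/s")
print(f"delta       = {delta:.0f} m")
print(f"g_c         = {g_c:.2f} m/s^2")
print(f"delta_T     = {delta_T * 1e3:.1f} mK")
print(f"renewal     = {renewal_myr:.0f} Myr")
print(f"east-west T = {grad_theta * 2.0 * C:.1f} K")
```

With S fixed at 10⁻¹⁵ K s⁻¹ the velocity comes out at several times 10⁻¹⁰ m s⁻¹, the same order as the value stated in the text, and the implied renewal time is of the order of 100 million years. The east–west temperature difference is a few kelvin, as the text indicates. The check that equation (5) is recovered from the computed $\delta T$ confirms that equation (6) is algebraically consistent with equations (1), (3) and (4).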

Figure 4 | Thermal departure from the adiabat due to the displacement of the inner core and heat transfer at the ICB. A thermal boundary layer forms in the outer core to adjust to the different radii of the ICB on the melting and crystallization sides.

METHODS SUMMARY
The mode of convection associated with the translation of the inner core is not standard. Therefore, it is presented in the Methods. Thermal buoyancy is the driving force; however, unlike classical convection, the damping is not due to viscous and/or thermal diffusion. Damping is set by the capacity of the outer core to extract or supply latent heat on the ICB.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 7 December 2009; accepted 7 June 2010.

1. Poupinet, G., Pillet, R. & Souriau, A. Possible heterogeneity of the Earth's core deduced from PKIKP travel times. Nature 305, 204–206 (1983).
2. Tanaka, S. & Hamaguchi, H. Degree one heterogeneity and hemispherical variation of anisotropy in the inner core from PKP(BC)-PKP(DF) times. J. Geophys. Res. 102, 2925–2938 (1997).
3. Creager, K. C. Large-scale variations in inner core anisotropy. J. Geophys. Res. 104, 309–314 (1999).
4. Garcia, R. & Souriau, A. Inner core anisotropy and heterogeneity level. Geophys. Res. Lett. 27, 3121–3124 (2000).
5. Niu, F. & Wen, L. Hemispherical variations in seismic velocity at the top of the Earth's inner core. Nature 410, 1081–1084 (2001).
6. Yu, W.-c., Wen, L. & Niu, F. Seismic velocity structure in the earth's outer core. J. Geophys. Res. 110, B02302, doi 10.1029/2003JB002928 (2005).
7. Souriau, A. & Poupinet, G. The velocity profile at the base of the liquid core from PKP(BC+Cdiff) data: an argument in favor of radial inhomogeneity. Geophys. Res. Lett. 18, 2023–2026 (1991).
8. Kennett, B. L. N. & Engdahl, E. R. Traveltimes for global earthquake location and phase identification. Geophys. J. Int. 105, 429–465 (1991).
9. Souriau, A. & Roudil, P. Attenuation in the uppermost inner core from broad-band GEOSCOPE PKP data. Geophys. J. Int. 123, 572–587 (1995).
10. Kennett, B. L. N., Engdahl, E. R. & Buland, R. Constraints on seismic velocities in the earth from traveltimes. Geophys. J. Int. 122, 108–124 (1995).
11. Song, X. & Helmberger, D. V. A P wave velocity model of Earth's core. J. Geophys. Res. 100, 9817–9830 (1995).
12. Zou, Z., Koper, K. D. & Cormier, V. F. The structure of the base of the outer core inferred from seismic waves diffracted around the inner core. J. Geophys. Res. 113, B05314, doi 10.1029/2007JB005316 (2008).
13. Gubbins, D., Masters, G. & Nimmo, F. A thermochemical boundary layer at the base of Earth's outer core and independent estimate of core heat flux. Geophys. J. Int. 174, 1007–1018 (2008).
14. Dziewonski, A. M. & Anderson, D. L. Preliminary reference Earth model. Phys. Earth Planet. Inter. 25, 297–356 (1981).
15. Calvet, M., Chevrot, S. & Souriau, A. P-wave propagation in transversely isotropic media: II. Application to inner core anisotropy: effect of data averaging, parametrization and a priori information. Phys. Earth Planet. Inter. 156, 21–40 (2006).
16. Loper, D. & Roberts, P. A study of conditions at the inner core boundary of the Earth. Phys. Earth Planet. Inter. 24, 302–307 (1981).
17. Dziewonski, A. M. & Anderson, D. L. Preliminary reference Earth model. Phys. Earth Planet. Inter. 25, 297–356 (1981).
18. Dalziel, S. B., Hughes, G. O. & Sutherland, B. R. Whole-field density measurements by 'synthetic schlieren'. Exp. Fluids 28, 322–335 (2000).
19. Gostiaux, L. & Dauxois, T. Laboratory experiments on the generation of internal tidal beams over steep slopes. Phys. Fluids 19, 028102, doi 10.1063/1.2472511 (2007).
20. Stacey, F. D. & Davis, P. M. Physics of the Earth Ch. 19 (Cambridge University Press, 2008).
21. Vočadlo, L. in Treatise on Geophysics (ed. Schubert, G.) Vol. 2, 91–120 (2007).
22. Alfè, D., Price, G. D. & Gillan, M. J. Iron under Earth's core conditions: liquid-state thermodynamics and high-pressure melting curve from ab initio calculations. Phys. Rev. B 65, 165118, doi 10.1103/PhysRevB.65.165118 (2002).
23. Poirier, J.-P. Physical properties of the Earth's core. C. R. Acad. Sci. 318, 341–350 (1994).
24. Poirier, J.-P. & Shankland, T. J. Dislocation melting of iron and the temperature of the inner core boundary, revisited. Geophys. J. Int. 115, 147–151 (1993).
25. Anderson, O. L. & Duba, A. Experimental melting curve of iron revisited. J. Geophys. Res. 102, 22659–22670 (1997).
26. Masters, G., Jordan, T. H., Silver, P. G. & Gilbert, F. Aspherical Earth structure from fundamental spheroidal-mode data. Nature 298, 609–613 (1982).
27. Calvet, M. & Margerin, L. Constraints on grain size and stable iron phases in the uppermost inner core from multiple scattering modeling of seismic velocity and attenuation. Earth Planet. Sci. Lett. 267, 200–212 (2008).
28. Buffett, B. A. Onset and orientation of convection in the inner core. Geophys. J. Int. 179, 711–719 (2009).
29. Deguen, R. & Cardin, P. Tectonic history of the Earth's inner core preserved in its seismic structure. Nature Geosci. 2, 419–422 (2009).
30. Labrosse, S. Thermal and magnetic evolution of the earth's core. Phys. Earth Planet. Inter. 140, 127–143 (2003).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This work has benefited from discussions during the CNRS-INSU SEDIT meetings. We thank M. Bergman for discussions regarding inner-core crystallization. The LGIT and the ANR (Agence Nationale de la Recherche) (ANR-08-BLAN-0234-01) have provided financial support for the experiments.

Author Contributions M.M., R.D. and T.A. ran and analysed the experiments. T.A. designed the experimental study and built the dynamical model. R.D. and T.A. worked out the thermal conditions on the ICB and assessed the geophysical relevance of the dynamical model. R.D. computed the different scenarios of thermal history. R.D., T.A. and M.M. applied the experimental results to the geophysical context. T.A. and R.D. wrote the paper.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to T.A. ([email protected]).

747 ©2010 Macmillan Publishers Limited. All rights reserved doi:10.1038/nature09257

METHODS

We present here in some detail an analytical model of inner-core translation. The model results from the combination of three physical phenomena: the thermal state of the inner core, gravitational equilibrium and phase change restrictions due to finite heat exchange with the outer core.

Thermal evolution of the inner core. In the inner core, owing to secular cooling, any parcel of matter experiences a decrease in temperature with respect to its initial curve of constant entropy. However, as the inner core grows, newly solidified material is set to a lower and lower entropy value. Hence the inner core is thermally stable if formerly solidified matter is colder than the current adiabatic profile attached to the liquidus temperature at the ICB, as a result of diffusion. If not, it is unstable to thermal convection.

It is convenient to introduce a potential temperature $\Theta(r, t) = T - T_{ad}(r, t)$, where the adiabat $T_{ad}(r, t)$ is anchored at the ICB (that is, $\Theta = 0$ at the ICB). At inner-core conditions, the equation of conservation of entropy can be simplified (see Supplementary Information) and written as

$$\frac{\partial \Theta}{\partial t} + (\mathbf{v}\cdot\nabla)\Theta = \kappa\nabla^2\Theta + S(t) \tag{7}$$

where $\kappa$ is the thermal diffusivity of solid iron. This form of the entropy equation captures first-order effects of compressibility by retaining the contribution of adiabatic heating or cooling during vertical advection31. The source term is

$$S(t) = \kappa\nabla^2 T_{ad} - \dot{T}_{ad} \tag{8}$$

where $\dot{T}_{ad} = \partial T_{ad}/\partial t$. The source term $S$ is thus the difference between thermal diffusion along the adiabat and secular cooling, and is independent of space. The sign of $S$ determines whether or not the inner core is superadiabatic and likely to convect. It is uncertain because $S$ is the difference between two poorly constrained quantities of comparable magnitude. A young inner core (large secular cooling) and small thermal diffusivity favour a superadiabatic temperature regime (positive $S$) and instability. The low estimate of thermal conductivity given recently by Stacey and Davis20 together with the young inner core age favoured by recent core thermal models30,32,33 make it plausible: with a conductivity $k = 36$ W m$^{-1}$ K$^{-1}$ as suggested by ref. 20, the inner core would be superadiabatic if its age is of the order of a billion years or less. $S(t)$ can be calculated for any given thermal history of the core (see Supplementary Information); it is a decreasing function of time, with typical values of 10–100 K per billion years. In what follows, we assume that $S$ is indeed positive, and will use a nominal value of $S = 10^{-15}$ K s$^{-1}$ $\approx$ 30 K per billion years.

If it is superadiabatic, the inner core is mechanically unstable. Classical thermal convection will develop if the inner-core viscosity is not too large34–36, but the fact that the ICB is not fixed allows for a new instability consisting of a translation (see Supplementary Information for a linear stability analysis). Under the assumption that the viscosity of the inner core is large enough, this mode becomes dominant and the motion is effectively restricted to be a translation, with velocity $V$. Assuming that the Péclet number ($Pe = Vc/\kappa$, where $c$ is the radius of the inner core) is very large, the terms $\partial\Theta/\partial t$ and $\kappa\nabla^2\Theta$ can be neglected in equation (7), which now takes the simple form

$$\frac{\partial \Theta}{\partial x} = \frac{S}{V} \tag{9}$$

for a uniform velocity $V$ in the direction of the $x$ axis (see Fig. 3). With a boundary condition of $\Theta = 0$ on the crystallization side, the solution is

$$\Theta = \frac{S}{V}\left[ r\cos\theta + \sqrt{c^2 - r^2\sin^2\theta}\, \right] \tag{10}$$

where $r$ is the distance from the centre O of the inner core and $\theta$ is the angle between the $x$ axis and the direction of the point where $\Theta$ is evaluated (see Fig. 3). The component $\Theta$ goes back to zero on the melting side within a thin boundary layer (not visible on the schematic Fig. 3) of thickness $\kappa/V \ll c$, which can be resolved when thermal diffusion is considered. The maximal temperature deviation from spherical symmetry is thus $\Delta T = 2cS/V$.

Mechanical equilibrium. The thermal asymmetry induced by a translation of the inner core is accompanied by a density asymmetry and it is anticipated that the inner core as a whole will be shifted in the direction of the thermal gradient in an attempt to move the centre of mass of the inner core towards the centre of the Earth: the light part is emerging while the dense part is sinking. We show here that a new equilibrium state with the inner core translated by a distance $\delta$ in the $x$ direction results from a balance between the gravitational forces applied on the inner core and the pressure forces on the ICB. A correct estimate of the position of the inner core requires the evaluation of the change in self-gravitational potential resulting from the change in mass distribution.

For the sake of simplicity and tractability, density in the outer core is supposed to be uniform. In the inner core, density variations in the direction of the uniform velocity are kept in the analysis, but perpendicular variations are ignored. They would lead to degree 2 spherical harmonic contributions with little contribution to the displacement $\delta$. Adiabatic spherical symmetric density variations are ignored because they contribute to $\delta$ only by slightly changing the average density of the inner core. Density in the inner core is thus expressed as

$$\rho = \rho_l + (\rho_s - \rho_l) - \alpha\frac{S}{V}\rho_s\, r\cos\theta \tag{11}$$

where $\alpha$ is the volume thermal expansion coefficient of the inner core, and $\rho_l$ and $\rho_s$ are the density of the liquid outer core and solid inner core. In equation (11), the first term on the right-hand side (that is, $\rho_l$) is the contribution of the liquid core, centred on C, and the other two terms on the right-hand side are the contributions of the inner core, centred on O, separated from C by a distance $\delta$. Let us introduce the gravitational potential $U$, such that gravity is $\mathbf{g} = -\nabla U$, obeying the Poisson equation $\nabla^2 U = 4\pi G\rho$, with $G$ the universal gravitational constant. From equation (11), the corresponding gravitational potential is found to be

$$\frac{U}{4\pi G} = \rho_l\frac{R^2}{6} + (\rho_s - \rho_l)\frac{r^2}{6} - \alpha\frac{S}{V}\rho_s\left[\frac{r^3}{10} - \frac{c^2 r}{6}\right]\cos\theta \tag{12}$$

where $r$ denotes the distance between the point at which $U$ is calculated and the centre of the inner core O and $R$ the distance between the same point and the centre of the Earth C (see Fig. 3). In the derivation of equation (12), the potential had to be determined inside and outside the inner core, whereas potential and gravity are continuous across the ICB. The formula (12) is the gravitational potential within the inner core. Noting that $R^2 = r^2 + 2\delta r\cos\theta + \delta^2$, equation (12) becomes

$$\frac{U}{4\pi G} = \rho_s\frac{r^2}{6} + \rho_l\frac{\delta^2}{6} - \alpha\frac{S}{V}\rho_s\left[\frac{r^3}{10} - \frac{c^2 r}{6}\right]\cos\theta + \delta\rho_l\frac{r}{3}\cos\theta \tag{13}$$

The total gravitational forces exerted on the inner core can be readily evaluated as

$$\mathbf{F}_G = -\int_{\text{inner core}} \rho\,\nabla U\, dV = -4\pi G\,\delta\rho_l\rho_s \int_{\text{inner core}} \nabla\!\left(\frac{r}{3}\cos\theta\right) dV \tag{14}$$

Only the contribution from the last term in equation (13) remains, because the other terms cancel out or have no contribution. Indeed, the distribution of masses within the inner core exerts no net gravity force on the inner core itself and only the outer core has a non-zero contribution when the inner core is not centred. We finally obtain

$$\mathbf{F}_G = -\frac{16\pi^2}{9} G\,\delta\rho_l\rho_s c^3\, \mathbf{e}_x \tag{15}$$

Within the liquid outer core, we assume hydrostatic equilibrium $-\nabla P - \rho_l\nabla U = 0$, which provides a simple relationship between pressure $P$ and the potential $U$ evaluated in equation (13)

$$P = -\rho_l U \tag{16}$$

up to an additive constant. It is then possible to evaluate the net pressure force exerted by the outer core on the ICB

$$\mathbf{F}_P = -\oint_{ICB} P\,\mathbf{e}_r\, dS = \frac{16\pi^2}{9} G\rho_l c^3\left[\alpha\rho_s\frac{S}{V}\frac{c^2}{5} + \rho_l\delta\right]\mathbf{e}_x \tag{17}$$

The net force exerted on the inner core is then

$$\mathbf{F} = \mathbf{F}_P + \mathbf{F}_G = \frac{16\pi^2}{9} G\rho_l c^3\left[\alpha\rho_s\frac{S}{V}\frac{c^2}{5} - (\rho_s - \rho_l)\delta\right]\mathbf{e}_x \tag{18}$$

Static equilibrium of the inner core ($\mathbf{F} = 0$) is reached when the inner core is translated by a distance equal to

$$\delta = \frac{\alpha\frac{S}{V}\rho_s c^2}{5(\rho_s - \rho_l)} \tag{19}$$

Kinetics of phase change at the ICB. The displacement of the inner core implies that pressure is no longer uniform on the ICB. This corresponds to a temperature difference $\delta T$ between the adiabat and the liquidus temperature along the interface:

$$\delta T = -\delta P\,(m_P - m_{ad}) \tag{20}$$

where $\delta P$ denotes the pressure variation on the ICB and $m_P$ and $m_{ad}$ are the Clapeyron slope and adiabatic gradient (in the liquid phase) respectively. Pressure variations on the ICB can readily be determined from the previous calculations on gravitational equilibrium. Pressure in the liquid is related to the gravitational potential (through equation (16)). The gravitational potential (13) is evaluated on the ICB $r = c$, with equation (19) taken into account


$$\frac{U}{4\pi G} = \frac{\rho_s}{6}\left[c^2 + \delta^2 + 2\delta c\cos\theta\right] \tag{21}$$

Hence, the pressure variation on the ICB follows from equations (16) and (21)

$$\delta P = -\rho_l g_c\,\delta\cos\theta \tag{22}$$

where $g_c = \frac{4\pi}{3} G\rho_s c$ is the gravitational acceleration in $r = c$. From equation (20), the temperature departure from the adiabat is

$$\delta T = \rho_l g_c\,\delta\cos\theta\,(m_P - m_{ad}) \tag{23}$$

The adiabat is thus higher than thermodynamic equilibrium on the eastern side and lower by the same amount on the western side. We do not assume that the actual temperature of the solid–liquid interface is dependent on the rate of melting or crystallization, dynamic undercooling being very small for metals. We consider instead that a thermal boundary layer develops in the outer core, which is the cause of heat exchange, that is, supply or extraction of latent heat (see Fig. 4). Heat conduction in the solid and in the liquid are smaller contributions and are fairly equal and opposite. Moreover, it is assumed that the rate of crystallization (and melting on the other side) is much bigger than the growth rate of the inner-core $dc/dt$, where $c(t)$ is the radius of the inner core. Fusion and crystallization are thus supposed to be of equal magnitude: this can be expressed in a single form $V\cos\theta$, where $V$ is the assumed uniform velocity of the inner core. Heat transfer in the liquid outer core is related to the amplitude of velocity fluctuations $u'$ in the outer core: we have little knowledge regarding $u'$ near the ICB, so we take them to be of the same order of magnitude as the velocity at the core–mantle boundary estimated from the secular variation of the magnetic field, which is $10^{-4}$ m s$^{-1}$. The simplest estimate for the heat transfer coefficient is $c_p u'$. Hence, the heat budget at the ICB is

$$L V\cos\theta = u' c_p\,\delta T \tag{24}$$

where $L$ is the latent heat. This equation relates the velocity $V$ (rate of crystallization on one side, melting on the other side) to the thermal departure from the adiabat $\delta T$ at the interface, which is itself related to the displacement $\delta$ of the inner core by equation (23).

Results of the model. All elements of the model have been analysed and now we put them together. Equations (23) and (24) provide a relationship between $\delta$ and $V$

$$L V = u' c_p \rho_l g_c (m_P - m_{ad})\,\delta \tag{25}$$

corresponding to the thermal aspect of the problem. Using the independent mechanical equation (19), the displacement $\delta$ can be eliminated and the solution for the translation velocity $V$ can be obtained as

$$V^2 = \frac{4\pi}{15}\,\frac{G\,u' c_p\,\rho_s^2\rho_l\,\alpha\,(m_P - m_{ad})\,S}{L(\rho_s - \rho_l)}\,c^3 \tag{26}$$

Representative values of the parameters involved are $G = 6.67 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$, $u' = 10^{-4}$ m s$^{-1}$, $c_p = 850$ J kg$^{-1}$ K$^{-1}$ (ref. 23), $\rho_s = 12{,}800$ kg m$^{-3}$ and $\rho_l = 12{,}200$ kg m$^{-3}$ (ref. 17), $\alpha = 1.1 \times 10^{-5}$ K$^{-1}$ (ref. 21), $m_P = 8.5 \times 10^{-9}$ K Pa$^{-1}$ (ref. 22), $m_{ad} = (\alpha T)/(\rho_l c_p) = 6 \times 10^{-9}$ K Pa$^{-1}$, $L = 900$ kJ kg$^{-1}$ (refs 24, 25), and $c = 1{,}221$ km (ref. 17). With $S = 10^{-15}$ K s$^{-1}$, the translation velocity obtained for the present state of the inner core is found to be $V \approx 7.7 \times 10^{-10}$ m s$^{-1}$, which is faster than the growth rate of the radius of the inner core by a factor of around 70. This is a justification for neglecting the growth of the inner core in the analysis. The associated displacement is derived from equations (25) or (19). Its value is $\delta \approx 95$ m. The maximal temperature disequilibrium is $\delta T \approx 0.01$ K, while the non-adiabatic temperature difference across the inner core is $\Delta T = 2cS/V \approx 3.2$ K. Because $\delta T$ is very small compared to $\Delta T$, the boundary condition $\Theta = 0$ is justified to a good approximation from the point of view of the inner core. It is also possible to determine the maximal time of residence in the solid inner core, which is $2c/V \approx 100$ million years.

31. Tritton, D. J. Physical Fluid Dynamics 1–536 (Oxford, Clarendon Press, 1988).
32. Gubbins, D., Alfè, D., Masters, G., Price, G. D. & Gillan, M. Gross thermodynamics of two-component core convection. Geophys. J. Int. 157, 1407–1414 (2004).
33. Nimmo, F. in Treatise on Geophysics (ed. Schubert, G.) Vol. 2, 31–65 (2007).
34. Jeanloz, R. & Wenk, H.-R. Convection and anisotropy of the inner core. Geophys. Res. Lett. 15, 72–75 (1988).
35. Weber, P. & Machetel, P. Convection within the inner-core and thermal implications. Geophys. Res. Lett. 19, 2107–2110 (1992).
36. Wenk, H.-R., Baumgardner, J. R., Lebensohn, R. A. & Tomé, C. N. A convection model to explain anisotropy of the inner core. J. Geophys. Res. 105, 5663–5678 (2000).
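The closing estimates of the Methods can be checked numerically. The sketch below is not part of the original analysis: it simply evaluates equations (19), (23), (25) and (26) with the representative parameter values listed in the text. The function names and the seconds-per-year conversion are assumptions introduced for illustration. With these rounded inputs, the direct evaluation of equation (26) should be expected to land at the same order of magnitude as the quoted velocity rather than to reproduce the quoted figures digit for digit.

```python
import math

# Representative values quoted in the Methods (SI units).
G = 6.67e-11        # gravitational constant, m^3 kg^-1 s^-2
U_PRIME = 1.0e-4    # velocity fluctuations u' in the outer core, m s^-1
C_P = 850.0         # specific heat capacity c_p, J kg^-1 K^-1
RHO_S = 12_800.0    # solid inner-core density, kg m^-3
RHO_L = 12_200.0    # liquid outer-core density, kg m^-3
ALPHA = 1.1e-5      # volume thermal expansion coefficient, K^-1
M_P = 8.5e-9        # Clapeyron slope, K Pa^-1
M_AD = 6.0e-9       # adiabatic gradient in the liquid, K Pa^-1
LATENT = 900.0e3    # latent heat L, J kg^-1
C = 1.221e6         # inner-core radius c, m
S = 1.0e-15         # nominal source term S, K s^-1

SECONDS_PER_YEAR = 3.156e7  # assumed conversion factor, not from the text


def gravity_at_icb():
    """g_c = (4*pi/3) G rho_s c, the gravitational acceleration at r = c."""
    return 4.0 * math.pi / 3.0 * G * RHO_S * C


def translation_velocity():
    """Translation velocity V from equation (26), in m s^-1."""
    numerator = (4.0 * math.pi / 15.0) * G * U_PRIME * C_P
    numerator *= RHO_S**2 * RHO_L * ALPHA * (M_P - M_AD) * S * C**3
    return math.sqrt(numerator / (LATENT * (RHO_S - RHO_L)))


def displacement_mechanical(v):
    """Equilibrium displacement delta from the force balance, equation (19)."""
    return ALPHA * S * RHO_S * C**2 / (5.0 * v * (RHO_S - RHO_L))


def displacement_thermal(v):
    """Displacement delta from the heat budget, equation (25) solved for delta."""
    return LATENT * v / (U_PRIME * C_P * RHO_L * gravity_at_icb() * (M_P - M_AD))


def interface_disequilibrium(delta):
    """Maximal delta T from equation (23), taken at cos(theta) = 1, in K."""
    return RHO_L * gravity_at_icb() * delta * (M_P - M_AD)


def temperature_contrast(v):
    """Non-adiabatic temperature difference across the inner core, 2cS/V, in K."""
    return 2.0 * C * S / v


def residence_time_years(v):
    """Maximal residence time in the solid inner core, 2c/V, in years."""
    return 2.0 * C / v / SECONDS_PER_YEAR


if __name__ == "__main__":
    v = translation_velocity()
    delta = displacement_mechanical(v)
    print(f"V        = {v:.2e} m/s")
    print(f"delta    = {delta:.0f} m")
    print(f"delta T  = {interface_disequilibrium(delta):.3f} K")
    print(f"Delta T  = {temperature_contrast(v):.1f} K")
    print(f"2c/V     = {residence_time_years(v):.2e} yr")
```

Because equation (26) is obtained by eliminating the displacement between equations (19) and (25), `displacement_mechanical` and `displacement_thermal` must agree at the computed velocity, which is a convenient internal consistency check. Passing the quoted velocity of 7.7e-10 m/s to `temperature_contrast` and `residence_time_years` recovers the quoted contrast of about 3.2 K and residence time of about 100 million years.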

Vol 466 | 5 August 2010 | doi:10.1038/nature09061 LETTERS

The evolution of mammal-like crocodyliforms in the Cretaceous Period of Gondwana

Patrick M. O’Connor1,2, Joseph J. W. Sertich3, Nancy J. Stevens1,2, Eric M. Roberts4,5, Michael D. Gottfried6, Tobin L. Hieronymus7, Zubair A. Jinnah8, Ryan Ridgely1, Sifa E. Ngasala6,9 & Jesuit Temba10

Fossil crocodyliforms discovered in recent years1–5 have revealed a level of morphological and ecological diversity not exhibited by extant members of the group. This diversity is particularly notable among taxa of the Cretaceous Period (144–65 million years ago) recovered from former Gondwanan landmasses. Here we report the discovery of a new species of Cretaceous notosuchian crocodyliform from the Rukwa Rift Basin6 of southwestern Tanzania. This small-bodied form deviates significantly from more typical crocodyliform craniodental morphologies, having a short, broad skull, robust lower jaw, and a dentition with relatively few teeth that nonetheless show marked heterodonty. The presence of morphologically complex, complementary upper and lower molariform teeth suggests a degree of crown–crown contact during jaw adduction that is unmatched among known crocodyliforms, paralleling the level of occlusal complexity seen in mammals and their extinct relatives7–12. The presence of another small-bodied mammal-like crocodyliform in the Cretaceous of Gondwana indicates that notosuchians probably filled niches and inhabited ecomorphospace that were otherwise occupied by mammals on northern continents.

Specimens of the new notosuchian crocodyliform were recovered from several locations in the middle Cretaceous Galula Formation6 exposed in the Mbeya Region of southwestern Tanzania. The new crocodyliform adds a small-bodied constituent to the terrestrial vertebrate fauna of continental Africa, and exemplifies the extreme mammal-like heterodonty realized by Gondwanan notosuchians during the Cretaceous. Given the scarcity of Cretaceous-age vertebrate assemblages from subequatorial Africa6,13, and indeed much of Gondwana, the new form is crucial for exploring the evolutionary dynamics of terrestrial faunas on southern landmasses.

Archosauria Cope, 1869
Crocodyliformes Hay, 1930 (sensu Clark in Benton and Clark, 1988)
Mesoeucrocodylia Whetstone and Whybrow, 1983
Notosuchia Gasparini, 1971
Pakasuchus kapilimai gen. et sp. nov.

Etymology. From Paka, Kiswahili for ‘cat’ in reference to the short, low skull with molariform teeth reminiscent of carnassials in mammalian carnivores, and souchos (Gr.), crocodile; and kapilimai, in honour of the late Professor Saidi Kapilima (University of Dar es Salaam), a key contributor to the Rukwa Rift Basin Project.

Holotype. RRBP (Rukwa Rift Basin Project (Tanzanian Antiquities Unit)) 08631 is an articulated skull and skeleton (Figs 1, 2).

Referred material. RRBP 05103, partial skull preserving left maxilla, lower jaw and eight postcaniniform teeth (Fig. 2).

Type locality and horizon. Locality RRBP 2007-04, ~20 km south of Lake Rukwa, Rukwa Rift Basin, Tanzania (see Supplementary Information); Galula Formation, mid-Cretaceous6.

Diagnosis. Pakasuchus differs from other crocodyliforms in possessing the following unique combination of characters: extreme heterodonty with reduced tooth count (8 lower quadrant, 5 upper quadrant); trenchant molariform cheek-teeth with paired rostrocaudally-oriented crests; rostroventrolaterally projecting pterygoid flanges; dorsally flared squamosal at contact with parietal; biplanar articular–quadrate articulation bracketed laterally by an expanded surangular; reduced osteoderms in thorax, with a normal complement of osteoderms surrounding the tail.

Description and comparison. The holotype (RRBP 08631) of Pakasuchus is represented by a virtually complete, exquisitely preserved articulated skull and skeleton (skull length, 7 cm; snout-vent length, 30 cm; Fig. 1). The tapered skull is low and broad and generally similar to those of many notosuchians5,14,15. External sculpture on the facial elements is reduced or absent altogether, whereas the dorsal surface of the cranium shows moderate sculpturing. The maxilla is vertical, with limited exposure on the dorsal surface of the rostrum (Fig. 1b). The external nares face rostrally, as is typical of many terrestrial crocodyliforms1–3,14–16.

The maxilla preserves an alveolar trough rather than individual alveoli. A complete palatine–pterygoid secondary palate is present. The fused pterygoids form the roof and caudal margins of the choanal groove and include a rostrolaterally directed pterygoid flange. There is no antorbital fenestra (Fig. 1b), similar to the condition in the South American notosuchians Mariliasuchus2 and Adamantinasuchus3. The lower jaw is deep and has a laterally expanded para-alveolar shelf, an enlarged mandibular fenestra and a rostrally extended splenial that forms one-third of the symphysis (Figs 1, 2). The hypertrophied lower caniniform tooth has a distinct alveolus, whereas the postcaniniform teeth are situated in an undulating alveolar trough. The quadrate–articular joint is bi-planar, with both horizontal and near-vertical articular surfaces (Fig. 2k, l). The articular is flat rather than concave, indicating the potential for substantial rostrocaudal translation of the lower jaw. The dorsally expanded surangular forms an enhanced lateral buttress for the jaw joint (Fig. 2l), further constraining movements of the lower jaw.

Pakasuchus shows extreme variation in dental size and shape, with distinct caniniform, premolariform and molariform teeth (Fig. 2). Although the specimen was preserved with the jaws closed, X-ray

1Department of Biomedical Sciences, Ohio University College of Osteopathic Medicine, 228 Irvine Hall, Athens, Ohio 45701, USA. 2Ohio Center for Ecology and Evolutionary Studies, Irvine Hall, Ohio University, Athens, Ohio 45701, USA. 3Department of Anatomical Sciences, Stony Brook University, Stony Brook, New York 11794, USA. 4Department of Physical Sciences, Southern Utah University, Cedar City, Utah 84720, USA. 5School of Earth and Environmental Sciences, James Cook University, Townsville Qld 4811, Australia. 6Department of Geological Sciences, Michigan State University, East Lansing, Michigan 48824, USA. 7Department of Anatomy and Neurobiology, Northeastern Ohio Universities Colleges of Medicine and Pharmacy, Rootstown, Ohio 44272, USA. 8School of Geosciences, University of the Witwatersrand, Private Bag 3, Wits, Johannesburg, South Africa. 9Geology Department, University of Dar es Salaam, PO Box 35052, Dar es Salaam, Tanzania. 10Tanzania Antiquities Unit, PO Box 2280, Dar es Salaam, Tanzania.


Figure 1 | Pakasuchus kapilimai. a–g, RRBP (Rukwa Rift Basin Project) 08631, holotype specimen. a, Skeleton in dorsal view. b, Skull in left lateral view. c, Vertebral and sternal ribs. d, Reduced trunk osteoderms. e, Transition from reduced trunk osteoderms to normal tail osteoderms. f, Reconstructed micro-CT scan of distal hind limb (extracted from within matrix). g, Reconstructed micro-CT scan of cervical vertebrae in right lateral view. Dashed boxes (white) in a indicate the positions of c and d. Scale bars: 5 cm in a; 1 cm in b–g. Abbreviations: An, angular; CaV, caudal vertebra; CeV, cervical vertebra; De, dentary; C1–C9, cervical vertebrae and position; DoV, dorsal vertebrae; Fr, frontal; Ga?, gastralia; Ju, jugal; Lc, lacrimal; LFe, left femur; Lil, ilium; MaF, mandibular fenestra; Mt, metatarsal; Mx, maxilla; Ost, caudal osteoderm; Pa, parietal; Pb, palpebral; Ph, phalanges; Qd, quadrate; Qj, quadratojugal; RRa, right radius; RFe, right femur; RFi, right fibula; RH, right humerus; Ros, reduced osteoderms; RTi, right tibia; RUl, right ulna; Sa, surangular; Sc, scapula; Sq, squamosal; Sr, sternal rib; Vr, vertebral rib.

computed tomography of the holotype and referred specimen reveals a dental formula of five (5) maxillary and eight (8) mandibular teeth (Figs 1, 2). Incisiform teeth as described in other notosuchians2,3,17–19 either are not preserved or were absent in Pakasuchus. All teeth in the post-caniniform series show a distinct constriction between the crown and root (Fig. 2). The maxillary dentition is characterized by an enlarged caniniform tooth in position one that is immediately followed by a small conical tooth (Fig. 2a). Two large molariforms occupy positions three and four, and the ultimate tooth is a small molariform tooth at the extreme caudal end of the alveolar trough. The lower dentition consists of a large caniniform in position one, followed immediately by five small ‘premolariform’ teeth in positions two through six. Positions seven and eight accommodate two molariform teeth situated opposite the two large maxillary molariforms.

Pakasuchus is unique among crocodyliforms, including the dentally diverse notosuchians, in having fully complementary upper and lower molariform cheek-teeth (Fig. 2c–j). The crown on both upper and lower molariforms has two parallel, rostrocaudally oriented crests separated by a longitudinal trough (Fig. 2d, g). This degree of complementary occlusal morphology maximizes crown–crown contact, providing two shearing edges separated by a trough for processing food. Importantly, microcomputed tomography (microCT) of unerupted molariform tooth crowns reveals that the complex trough-crest morphology observed in the working dentition is primary in nature and not the result of wear (Supplementary Fig. 3). The complex morphology and high degree of occlusal precision of the cheek teeth in Pakasuchus shows a level of sophistication otherwise seen only in mammals. Moreover, the morphology of the elongate quadrate–articular joint (Fig. 2k, l) provides additional evidence for the derived nature of jaw mobility in this form, with potential movements limited to rotation and rostrocaudal translation of the lower jaw. Although it has been hypothesised that other notosuchians possessed proal kinematics (anterior displacement during the power stroke)2,4,14,15, no other form has such highly corresponding molariform occlusal morphology. The organization of opposing molariform occlusal surfaces (for example, canted occlusal surfaces) indicates that maximum crown–crown contact would probably have occurred during upward (orthal) and anterior (proal) displacement of the lower jaw during adduction.

The postcranial skeleton of Pakasuchus is characterized by long, gracile limbs and an elongated and relatively mobile thorax (Fig. 1). Pakasuchus is unique among crocodyliforms in having extremely reduced osteoderms in the trunk region (Fig. 1d), in contrast to the heavily armoured condition characterizing virtually all living and


extinct members of the clade. A bizarre exception to the otherwise reduced armour in Pakasuchus is found in the tail, which is encased in osteoderms. Trunk osteoderms consist of bilaterally symmetrical, longitudinally oriented ossifications positioned dorsal to the vertebral column, vertebral ribs and limb girdles (Fig. 1d). Together, the forward-facing external nares and long, gracile limbs indicate that Pakasuchus probably occupied a primarily terrestrial, rather than aquatic, niche. The reduced dorsal body armour further enhances this ecomorphological model in that it would have permitted a more active foraging mode for an organism in a terrestrial environment by allowing reduced weight and increased mobility.

Figure 2 | Reconstruction of the dentition of Pakasuchus kapilimai derived from X-ray computed tomography scans. a, Composite reconstruction of left dentition (RRBP 08631, RRBP 05103) to illustrate size and shape heterodonty in dental series. b, Dental classification (above) by tooth position and quadrant-specific (below) dental formula. c–h, Lateral (c, f), occlusal (d, g), and lingual (e, h) views of left upper (c–e) and left lower (f–h) molariform teeth (RRBP 05103). i, Occlusal view of left first and second lower molariform teeth (RRBP 05103). j, Oblique caudodorsal view of left upper and lower molariform teeth to illustrate complementarity of occlusal surfaces (RRBP 05103). k, l, Right jaw (quadrate–articular) joint in medial (k) and caudal (l) views (RRBP 08631). Colour coding: red, caniniform; green, premolariform; blue, molariform teeth. Arrow (k) indicates potential sliding movement at quadrate-articular articulation. Scale bars = 0.5 cm. Abbreviations: An, angular; Ar, articular; CN/cn, caniniform; inc, incisiform; LCr/lcr, lateral crest on molariform; MCr/mcr, medial crest on molariform; MF/mf, molariform; PRM/prm, premolariform; PMX, premaxillary dentition; Qd, quadrate; Rp, retroarticular process; Sa, surangular; Tr/tr, molariform trough. Upper case indicates upper dentition, lower case indicates lower dentition.

Figure 3 | Phylogenetic relationships of Pakasuchus kapilimai within crocodyliforms. Stratigraphically calibrated phylogeny of the restricted notosuchian data set with geography indicated by silhouettes: Africa, South America, Madagascar and Asia (China). See Supplementary Information for the analysis protocol, data matrix, character list and discussion. Abbreviations: EJ, Early Jurassic; EK, Early Cretaceous; Eo, Eocene; LJ, Late Jurassic; LK, Late Cretaceous; MJ, Middle Jurassic; Pal, Palaeocene.

A phylogenetic analysis of representatives of all major mesoeucrocodyliform groups positions Pakasuchus within Notosuchia (Fig. 3). Characters in support of this placement include only moderate sculpturing on the dorsal aspect of the skull, smooth alveolar margins and regional differentiation of the dentition. Pakasuchus in turn shares a number of features (for example, a jugal that does not extend rostral to the orbit, an ultimate maxillary tooth that is less than or equal to half the size of the penultimate maxillary tooth and molarization of cheek-teeth) with a less inclusive clade that comprises Mariliasuchus2, Adamantinasuchus3, Malawisuchus15 and Candidodon20. Significantly, these taxa are all small-bodied ‘middle’ Cretaceous (Aptian–Turonian) forms known exclusively from South America and Africa, each interpreted as an atypical crocodyliform with respect to both anatomical (for example, regionally differentiated dentition) and ecological (for example, terrestrial rather than aquatic) characteristics.

The most distinctive features of the group are craniodental specializations related to divergent feeding strategies, which are distinctly different from the condition found in extant crocodylians or inferred from other extinct crocodyliforms. In conjunction with extremely small body size, many notosuchians express marked heterodonty, including postcaniniform dentitions with multi-cusped teeth15 and/or complete molarization2,14,21 of cheek-teeth, convergent with patterns in various non-mammalian cynodont7,8 and mammalian lineages9–11. Further exemplifying such trends within Notosuchia, Pakasuchus kapilimai shows an additional reduction in the number of postcaniniform teeth combined with precise complementarity between upper and lower molariform teeth (Fig. 2). Pakasuchus parallels the level of occlusal complexity found in adaptations that are considered integral during the radiation of mammals.

The clade of notosuchians that shows the highest degree of heterodonty is restricted to the late Early and early Late Cretaceous of Africa and South America (Fig. 3), still united as a single, large landmass (West Gondwana) until near the Early–Late Cretaceous boundary22. The diversity of West Gondwanan mammal-like crocodyliforms during this temporal span is interesting in light of the paucity of mammalian taxa recovered from these areas relative to those known from contemporaneous Laurasian terrestrial faunas11. Gondwanan mammals that are known from this timeframe are typically either relictual representatives of cosmopolitan archaic therian groups (for example, eutriconodontans)11 or members of Gondwanan-restricted clades that show highly derived morphologies (for example, extreme hypsodonty in gondwanatherians)23–25. By contrast, Cretaceous Laurasian mammalian assemblages consist of multituberculates, metatherians and basal eutherians, groups that appear to be restricted or absent altogether from contemporaneous Gondwanan assemblages11. Crocodyliform and mammalian diversification patterns suggest that faunal dynamics were different on the northern and southern landmasses, perhaps related to the differential radiation of small-bodied terrestrial forms. Fossil evidence recovered so far indicates that multituberculate, metatherian and eutherian mammals radiated broadly in Laurasia, with notosuchian crocodyliforms and gondwanatherian and ‘archaic’ mammalian lineages occupying similar niches (for example, small-bodied, terrestrial faunivores) throughout Gondwana.

Notosuchian crocodyliforms have been interpreted to exhibit either mammal-like4,15 or herbivorous reptile-like1 dental morphologies, no doubt related to alteration in the fundamental signalling pathways that underlie both individual tooth development26–28 and global organization of the dental arcade29,30. Notosuchian craniodental novelty probably represents an example of evolutionary-developmental experimentation by a clade in the absence of potentially competitive ecomorphs from other major tetrapod groups (that is, mammals). After the Mesozoic, a number of crocodyliform lineages, including notosuchians, either became extinct, or experienced a marked canalization in morphology as reflected by the restricted bauplan of extant crocodylians. The disappearance of so many intriguing Cretaceous forms might therefore reflect a reduction in ecomorphospace owing to environmental change, the arrival or emergence of new forms, or both.

METHODS SUMMARY
The small size and state of preservation of the specimens (that is, upper and lower jaws recovered in a closed position) prompted the use of high-resolution X-ray microCT to elucidate details of morphology related to the teeth and jaws. X-ray microCT was conducted at the Ohio University μCT Facility (GE eXplore Locus in-vivo microCT scanner) using the following protocol: 85 kVp, 400 μA and a slice thickness of 0.045 mm. VFF and DICOM files were compiled into three-dimensional reconstructions with visualizations obtained using the AMIRA 4.1 Advanced Graphics Package.

Received 22 December 2009; accepted 25 March 2010.

1. Buckley, G. A. et al. A pug-nosed crocodyliform from the Late Cretaceous of Madagascar. Nature 405, 941–944 (2000).
2. Zaher, H. et al. Redescription of the cranial morphology of Mariliasuchus amarali, and its phylogenetic affinities (Crocodyliformes, Notosuchia). Amer. Mus. Nov. 3512, 1–40 (2006).
3. Nobre, P. H. & Carvalho, I. S. Adamantinasuchus navae: A new Gondwanan Crocodylomorpha (Mesoeucrocodylia) from the Late Cretaceous of Brazil. Gond. Res. 10, 370–378 (2006).
4. Pol, D. New remains of Sphagesaurus huenei (Crocodylomorpha: Mesoeucrocodylia) from the Late Cretaceous of Brazil. J. Vertebr. Paleontol. 23, 817–831 (2003).
5. Pol, D. & Apesteguía, S.
13. O’Connor, P. M. et al. A new vertebrate fauna from the Cretaceous Red Sandstone Group, Rukwa Rift Basin, Southwestern Tanzania. J. Afr. Earth Sci. 44, 277–288 (2006).
14. Andrade, M. B. & Bertini, R. J. Morphological and anatomical observations about Mariliasuchus amarali and Notosuchus terrestris (Mesoeucrocodylia) and their relationships with other South American notosuchians. Arq. Mus. Nac., Rio de Jan. 66, 5–62 (2008).
15. Gomani, E. M. A crocodyliform from the Early Cretaceous Dinosaur Beds, Northern Malawi. J. Vertebr. Paleontol. 17, 280–294 (1997).
16. Turner, A. H. Osteology and phylogeny of a new species of Araripesuchus (Crocodyliformes: Mesoeucrocodylia) from the Late Cretaceous of Madagascar. Hist. Biol. 18, 255–369 (2006).
17. Fiorelli, L. & Calvo, J. O. New remains of Notosuchus terrestris Woodward, 1896 (Crocodyliformes: Mesoeucrocodylia) from the Late Cretaceous of Neuquén, Patagonia, Argentina. Arq. Mus. Nac., Rio de Jan. 66, 83–124 (2008).
18. Lecuona, A. & Pol, D. Tooth morphology of Notosuchus terrestris (Notosuchia: Mesoeucrocodylia): New evidence and implications. C. R. Palevol 7, 407–417 (2008).
19. Novas, F. E. et al. Bizarre notosuchian crocodyliform with associated eggs from the Upper Cretaceous of Bolivia. J. Vertebr. Paleontol. 29, 1316–1320 (2009).
20. Carvalho, I. S. Candidodon: Um crocodile com heterodontia (Notosuchia, Cretáceo Inferior – Brasil). An. Acad. Bras. Cienc. 66, 331–346 (1994).
21. Nobre, P. H. et al. Feeding behavior of the Gondwanic Crocodylomorpha Mariliasuchus amarali from the Upper Cretaceous Bauru Basin, Brazil. Gond. Res. 13, 139–145 (2008).
22. Pletsch, T. et al. Cretaceous separation of Africa and South America: The view from the West African margin (ODP Leg 159). J. S. Am. Earth Sci. 14, 147–174 (2001).
23. Krause, D. W. et al. Cosmopolitanism among Gondwanan Late Cretaceous mammals. Nature 390, 504–507 (1997).
24. Bonaparte, J. F. A new and unusual Late Cretaceous mammal from Patagonia. J. Vertebr. Paleontol. 6, 264–270 (1986).
25. Krause, D. W. et al. A Cretaceous mammal from Tanzania. Acta Palaeontol. Pol. 48, 321–330 (2003).
26. Jernvall, J., Keränen, S. V. E. & Thesleff, I. Evolutionary modification of development in mammalian teeth: Quantifying gene expression patterns and topography. Proc. Natl Acad. Sci. USA 97, 14444–14448 (2000).
27. Kangas, A. T. et al. Nonindependence of mammalian dental characters. Nature 432, 211–214 (2004).
28. Tummers, M. & Thesleff, I. The importance of signal pathway modulation in all aspects of tooth development. J. Exp. Biol. 312, 309–319 (2009).
29. Zhao, Z., Weiss, K. M. & Stock, D. W. in Development, Function and Evolution of Teeth (eds Teaford, M. F., Meredith Smith, M. & Ferguson, M. W. J.) 152–172 (Cambridge Univ. Press, 2007).
30. Osborn, J. W. Relationship between growth and the pattern of tooth initiation in alligator embryos. J. Dent. Res. 77, 1730–1738 (1998).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank: D. Kamamba, F. Ndunguru (Tanzania Antiquities Unit), P. Msemwa (Tanzania Museum of House of Culture), I. Marobhe (University of Dar es Salaam), and the Tanzania Commission for Science and Technology for
New Araripesuchus remains from the early Late support; J.P. Cavigelli and V. Heisey for specimen preparation; M. Getty, E. Lund, Cretaceous (Cenomanian-Turonian) of Patagonia. Amer. Mus. Nov. 3490, 1–38 S. Burch, V. Simons, E. Simons, J. Garcia-Massini, G. Masai, and A. Mussa for field (2005). assistance; P. Sereno, E. Gomani, and C. Chiumia for specimen access; J. Sidote for 6. Roberts, E. M. et al. Sedimentology and depositional environments of the Red digital processing assistance. This research was supported by the US National Sandstone Group, Rukwa Rift basin, southwestern Tanzania: new insight into Science Foundation (NSF EAR-0617561, EAR-0854218), the National Geographic Cretaceous and Paleogene terrestrial ecosystems and tectonics in sub-equatorial Society (CRE), the University of the Witwatersrand, the Michigan State University Africa. J. Afr. Earth Sci. 57, 179–212 (2010). Office of Research and Graduate Studies, and the Ohio University College of 7. Kemp, T. S. The Origin and Evolution of Mammals (Oxford Univ. Press, 2005). Osteopathic Medicine and Ohio University Office of Research and Sponsored 8. Angielczyk, K. D. Phylogenetic evidence for and implications of a dual origin of Programs. propaliny in anomodont therapsids (Synapsida). Paleobiol. 30, 268–296 (2004). Author Contributions P.M.O., N.J.S., E.M.R. and M.D.G. developed the field project. 9. Crompton, A. W. & Jenkins, F. A. Molar occlusion in Late Triassic Mammals. Biol. P.M.O., J.J.W.S., N.J.S., T.L.H. and R.R. conducted the research. P.M.O., J.J.W.S., Rev. Camb. Philos. Soc. 43, 427–458 (1968). N.J.S., E.M.R., M.D.G. and S.E.N. wrote the manuscript. P.M.O., N.J.S., E.M.R., 10. Luo, Z.-X. Transformation and diversification in early mammal evolution. Nature M.D.G., Z.A.J., and J.T. excavated the specimens. 450, 1011–1019 (2007). 11. Kielan-Jaworowski, Z. et al. 
Mammals from the Age of Dinosaurs Origins, Evolution, Author Information Reprints and permissions information is available at and Structure. (Columbia Univ. Press, New York, 2004). www.nature.com/reprints. The authors declare no competing financial interests. 12. Luo, Z.-X., Crompton, A. W. & Sun, A.-L. A new mammaliaform from the Early Readers are welcome to comment on the online version of this article at Jurassic and evolution of mammalian characteristics. Science 292, 1535–1540 www.nature.com/nature. Correspondence and requests for materials should be (2001). addressed to P.M.O. ([email protected]).

Vol 466 | 5 August 2010 | doi:10.1038/nature09273 LETTERS

Negative plant–soil feedback predicts tree-species relative abundance in a tropical forest

Scott A. Mangan1,2, Stefan A. Schnitzer1,2, Edward A. Herre2, Keenan M. L. Mack3, Mariana C. Valencia4, Evelyn I. Sanchez2 & James D. Bever3

The accumulation of species-specific enemies around adults is hypothesized to maintain plant diversity by limiting the recruitment of conspecific seedlings relative to heterospecific seedlings1–6. Although previous studies in forested ecosystems have documented patterns consistent with the process of negative feedback7–16, these studies are unable to address which classes of enemies (for example, pathogens, invertebrates, mammals) exhibit species-specific effects strong enough to generate negative feedback17, and whether negative feedback at the level of the individual tree is sufficient to influence community-wide forest composition. Here we use fully reciprocal shade-house and field experiments to test whether the performance of conspecific tree seedlings (relative to heterospecific seedlings) is reduced when grown in the presence of enemies associated with adult trees. Both experiments provide strong evidence for negative plant–soil feedback mediated by soil biota. In contrast, above-ground enemies (mammals, foliar herbivores and foliar pathogens) contributed little to negative feedback observed in the field. In both experiments, we found that tree species that showed stronger negative feedback were less common as adults in the forest community, indicating that susceptibility to soil biota may determine species relative abundance in these tropical forests. Finally, our simulation models confirm that the strength of local negative feedback that we measured is sufficient to produce the observed community-wide patterns in tree-species relative abundance. Our findings indicate that plant–soil feedback is an important mechanism that can maintain species diversity and explain patterns of tree-species relative abundance in tropical forests.

Negative feedbacks occur when detrimental effects of enemies that accumulate in the vicinity of a given adult are expressed more strongly on conspecific relative to heterospecific juveniles. As a result, enemy-mediated reduction of growth and survival of conspecific juveniles near a given adult can provide a localized recruitment advantage for juveniles of other species1–3. This process can maintain species richness by preventing any one species from dominating the plant community18–20.

In forests, the strongest evidence that negative feedback processes influence plant species composition comes from demographic analyses of spatial and temporal patterns of tree growth and survival. These demographic studies often reveal that seedlings and saplings perform more poorly when in high densities or near conspecific adults7–12. Such patterns of density and distance dependence are expected to emerge if the process of enemy-mediated negative feedback is operating in the plant community. Demographic analyses, however, are not able to identify the principal classes of enemies (for example, pathogens, invertebrates, mammals) that drive negative feedbacks, nor are they able to distinguish between enemy-mediated feedback and other possible mechanisms that could lead to similar demographic patterns, such as higher intraspecific competition for abiotic resources near parent trees.

Experimental studies in both temperate and tropical forests have attempted to demonstrate the process of negative feedback and to identify the causal agents. These studies often find that detrimental effects of enemies on seed or seedlings are greater near than away from conspecific trees13–16. However, with few exceptions21, these studies restrict their analyses to within single tree species, and thus fail to examine whether the effects of enemies are species-specific, which is an essential requirement to provide a recruitment advantage to heterospecific seedlings. Instead, experimental studies that examine performance of conspecific relative to heterospecific juveniles near a host tree are necessary17. Furthermore, simulation models are needed to determine whether empirically based estimates of negative feedback occurring at the local scale of the tree are sufficient to influence community-wide patterns in species diversity and relative abundances.

We first conducted a shade-house experiment designed to assess the importance of soil biota (for example, fungi, bacteria, fauna) in generating negative feedback, while controlling for nutrients and light. We chose six shade-tolerant tree species, the adult relative abundances of which in the 50-ha plot on Barro Colorado Island (BCI) ranged over roughly two orders of magnitude, which allowed us to examine whether variation in the strength of feedback among species was correlated with their adult abundance. We filled all pots with an identical mixture (3:1) of sterilized field soil and sand. To each pot, we added a single seedling along with a small quantity (6% total volume) of either live or sterilized soil inoculum collected from under either conspecific or heterospecific adult trees in a fully reciprocal design. This experimental design allowed us to control for abiotic soil effects, while introducing soil biota. We measured total seedling biomass after 5 months.

We found strong evidence for negative plant–soil feedback based on growth when averaged across all species, with four of the six being significant (Fig. 1a). For these four species, seedling growth was reduced relative to heterospecific seedlings when grown with conspecific versus heterospecific inoculum (see Supplementary Fig. 1). Overall seedling growth (averaged across species) did not differ across sterilized inocula from the different adult species; however, growth did differ significantly across different sources of inocula containing live biota (Fig. 1b). This finding confirms that differences in seedling response were due to differences in live soil biota and not due to differences in abiotic properties of the soil. Furthermore, the strength of negative feedback was correlated with the relative abundance of those adult trees found on the BCI 50-ha plot (P = 0.058). Tree species showing strong negative feedback were less

1Department of Biological Sciences, University of Wisconsin–Milwaukee, Wisconsin 53201, USA. 2Smithsonian Tropical Research Institute, MRC 0580-06, Unit 9100 Box 0948, DPO AA 34002-9998, USA. 3Department of Biology, Indiana University, Bloomington, Indiana 47405, USA. 4Department of Biological Sciences, University of Illinois–Chicago, Chicago, Illinois 60607, USA.

common as adults than were species exhibiting weaker or no significant negative feedback (Fig. 1c).

Figure 1 | Strengths of negative plant–soil feedback measured in the shade-house experiment are correlated with adult tree species abundance of the BCI forest. a, Variation in the strength of negative feedback mediated by soil biota among the six seedling species. Ba, Brosimum alicastrum; Bp, Beilschmiedia pendula; En, Eugenia nesiotica; Lp, Lacmellea panamensis; Tp, Tetragastris panamensis; Vs, Virola surinamensis. Bars indicate standard errors, and means that differ from zero are indicated by asterisks (*P < 0.05; **P < 0.01; ***P < 0.001). Number of seedlings analysed = 349. b, Seedling response (averaged across seedling species) varied across live inocula (ANCOVA: F5,408 = 5.31, P < 0.0001) but not across sterile inocula (ANCOVA: F5,408 = 0.15, P = 0.981). Bars indicate standard errors. c, Tree species that exhibited stronger negative feedback were less common as adults in the BCI 50-ha plot. DBH, diameter at breast height. [Panel axes: a, strength of feedback by tree species; b, total biomass (log10) by soil inoculum source, live versus sterile inoculum; c, log abundance of trees >10 cm DBH against strength of feedback, r2 = 0.63, n = 6, P = 0.058.]

We then conducted a reciprocal field experiment in a mainland forest (Gigante Peninsula, Panama) located adjacent to BCI to determine whether patterns of negative plant–soil feedback observed in the shade-house experiment were also found in the forest in the presence of above-ground enemies and other potentially confounding processes. We selected five tree species that differed in adult relative abundance (three of which were used in the shade-house experiment). We grew seedlings in sterilized soil for 1 month and then transplanted them into plots containing all five species in July 2008. A single mixed-species seedling plot was established under each replicate adult tree of each species. We measured growth and survival at the end of the first wet season (January 2009) and survival at both the end of the first dry season (May 2009) and after 16 months, near the end of the second wet season (November 2009).

We found that of the 1,270 seedlings planted into the forest, 945 (74%) survived after 6 months in the forest. All five species exhibited significant negative feedback based on growth of surviving seedlings (Fig. 2a). Consistent with the shade-house experiment, tree species exhibiting strong negative feedback were less common as adults in this mainland forest than those species exhibiting weaker negative feedback (Fig. 2b). Foliar enemies contributed little to the strength of feedback. Leaf damage caused by insect herbivores and foliar pathogens explained an average of only 14% of the overall strength of negative feedback (averaged across all species), with the maximum contribution (28%) occurring in Beilschmiedia pendula due to foliar fungi (Fig. 2c and Supplementary Fig. 2). These findings, combined with the strong effect of soil biota and no effect of nutrients on feedback in the shade-house experiment, indicate that below-ground biota contributed to the majority of growth-based negative feedback measured in the forest (see Supplementary Discussion).

Figure 2 | Strengths of negative feedback measured in the field experiment are correlated with adult tree species abundance of the Gigante forest. a, Variation in the strength of negative feedback among the five seedling species. Species abbreviations are the same as those in Fig. 1, except for: Aa, Apeiba aspera; Sa, Simarouba amara. Bars indicate standard errors, and means that differ from zero are indicated by asterisks (*P < 0.05; **P < 0.01; ***P < 0.001). Number of seedlings analysed = 945. b, Tree species that exhibited stronger negative feedback were less common as adults in the forest of the Gigante Peninsula than those species exhibiting weaker negative feedback. c, Proportional contribution of foliar insect herbivory, foliar pathogens and other causes to observed patterns of negative feedback. [Panel axes: a, strength of feedback by tree species; b, log abundance of trees >10 cm DBH against strength of feedback, r2 = 0.87, n = 5, P = 0.021; c, proportional contribution by tree species.]

Mortality during the first 6 months was primarily due to above-ground enemies (uprooting by vertebrates or clipping of stems by vertebrates or insects) and occurred shortly after we transplanted the seedlings. Seedling death caused by these above-ground enemies did not lead to mortality-based feedback when measured in January

2009. However, in a subsequent mortality census conducted in May 2009, faster growing seedlings (as measured in January 2009) had a higher probability of surviving the first dry season than did slower growing seedlings (Supplementary Table 2). By November 2009, estimates of feedback through mortality became increasingly negative (Supplementary Table 3), demonstrating that growth differences among seedlings emerge quickly and develop into mortality-based negative feedback over longer periods of time, as slower growing seedlings are increasingly likely to die.

Analytical models and simulations indicate that negative feedback can maintain plant diversity18–20; however, theory on the expected response in tree relative abundance to variation in feedback strength is lacking. In addition, it has been argued that localized processes such as negative feedback may not be sufficient to influence community-wide patterns in tree composition22. We addressed these issues by simulating community dynamics using a stochastic spatially explicit cellular automata model. We found that simulations that included the strength of plant–soil feedback between species pairs measured in our experiments generated community-wide species abundances of similar rank order as those found on BCI and the mainland forest (Fig. 3a, b). Moreover, this pattern holds for simulations of more species-rich communities (Fig. 3c). The correlation between abundance and average feedback was robust when we relaxed the assumption that species had equivalent growth and mortality rates (t = 17.19, P < 0.0001; see Supplementary Methods). When we included variation in species-specific life history traits, a single highly competitive species dominated the simulations in the absence of feedback. Species coexistence occurred only when we also included the empirically measured feedback responses. These simulations demonstrate that the relationship between feedback and abundance is expected only when negative plant–soil feedback is the major force driving plant species coexistence.

[Figure 3 panel annotations: a, mean r = 0.91, d.f. = 9, P < 0.0001; b, mean r = 0.87, d.f. = 9, P < 0.0001.]

Our study is consistent with other findings that soil biota (for example, soil-borne fungi, bacteria, fauna) mediate negative plant–soil feedback in temperate grasslands3,4,23–26. In a greenhouse experiment, rare temperate grassland species exhibited stronger negative plant–soil feedback due to soil pathogens than more common plant species4. Notably, our field experiment suggests that the correlation between the strength of negative feedback among species and their relative abundance occurs even in the presence of plant competition and other naturally occurring processes in tropical forests. Furthermore, this relationship does not seem to be restricted to just those tree species that we examined. A recent, demographic analysis found a positive correlation between patterns of density-dependent seedling mortality and abundance when 180 tree species on BCI were examined27. Our simulations confirm theoretically that variation in the strength of plant–soil feedback can drive this relationship.

For decades, resource partitioning, above-ground herbivory6 and neutral processes28 have received considerable attention as mechanisms for the maintenance of plant species diversity. However, much of this work has overlooked the effects of soil biota, particularly in species-rich tropical forests. Soil communities are characterized by a great diversity of microbes and fauna26,29, but the extent to which these organisms contribute to the functioning of plant communities is only now beginning to be discovered. By using fully reciprocal experiments, we were able to demonstrate that species-specific interactions between tropical trees and their soil biota are sufficiently strong to maintain tree diversity through negative feedback. Self-limiting processes such as negative plant–soil feedback have been assumed previously to occur more strongly in tree species of high abundance30. However, empirically we found the opposite result: more abundant tree species exhibited the weakest negative feedback. Our simulations reinforce the conclusion that trees are abundant because they are less susceptible to the detrimental effects of their associated soil communities than are rarer tree species. Thus, localized negative plant–soil feedback occurring between plants and below-ground organisms may be a general mechanism for the maintenance of plant species diversity and patterns of relative abundance across ecosystems ranging from temperate grasslands to tropical forests.
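The spatially explicit simulation referred to in the text can be illustrated with a toy model. The sketch below is a minimal illustration under stated assumptions, not the authors' model: the replacement rule (a dying adult is replaced by one of its eight neighbours, chosen with probability proportional to that species' performance in the previous occupant's soil), the grid size, the number of replacements and the `response` matrix are all hypothetical, and the names `simulate_community` and `response` are introduced here for illustration only.

```python
import random


def simulate_community(response, size=20, replacements=20_000, seed=0):
    """Run a toy stochastic cellular automaton and return final abundances.

    response[i][j] is a hypothetical relative performance of a seedling of
    species i growing in soil conditioned by an adult of species j.
    """
    rng = random.Random(seed)
    n_species = len(response)
    n_cells = size * size

    # Start with every cell occupied and species (near-)equally represented.
    grid = [k % n_species for k in range(n_cells)]
    rng.shuffle(grid)

    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]

    for _ in range(replacements):
        cell = rng.randrange(n_cells)   # the adult in this cell dies
        row, col = divmod(cell, size)
        previous = grid[cell]           # its soil biota are assumed to remain

        # Candidate colonists: the eight neighbours on a wrapped (torus) grid.
        candidates = [grid[((row + dr) % size) * size + (col + dc) % size]
                      for dr, dc in offsets]
        # Assumed rule: weight each candidate by its performance in the
        # previous occupant's soil.
        weights = [response[sp][previous] for sp in candidates]
        grid[cell] = rng.choices(candidates, weights=weights, k=1)[0]

    return [grid.count(sp) for sp in range(n_species)]


# Hypothetical responses: each species does worst in its own soil
# (diagonal < off-diagonal), with species 0 suffering most.
response = [
    [0.60, 1.00, 1.00],
    [1.00, 0.80, 1.00],
    [1.00, 1.00, 0.95],
]
print(simulate_community(response))
```

The defaults are kept small so the sketch runs quickly; the paper reports abundances after 20 million replacements, and a serious comparison would also need replicate runs with different seeds.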

Figure 3 | Simulations indicate that variation in feedback strength predicts tree species abundance. a–c, Species abundance generated using simulations including shade-house plant response data (a), field-collected plant response data (b) and randomly generated feedback data (c). All simulations demonstrate that stronger negative feedback leads to lower species abundance. Each circle falling at the same location of the x axis in panels a and b indicates simulated abundance for each of 10 runs. Regression lines per run are plotted in panel c. [Panel axes: simulated tree abundance (log) against strength of negative feedback; c, mean r = 0.61, d.f. = 9, P < 0.0001.]

METHODS SUMMARY
Study species. We selected shade-tolerant tree species from different families that produced sufficient amounts of seeds at the onset of each experiment. We used Beilschmiedia pendula, Brosimum alicastrum and Lacmellea panamensis in both experiments; Eugenia nesiotica, Tetragastris panamensis and Virola surinamensis in the shade-house experiment; and Apeiba aspera and Simarouba amara in the field experiment. We were unable to use identical species sets for each experiment because seed availability varied between the two years in which each experiment was conducted. For each experiment, we collected seeds of all species from their respective forests (shade-house experiment: Barro Colorado Island; field experiment: Gigante Peninsula, Panama), surface sterilized the seeds, and germinated them in sterile soil (see Methods).
Feedback measure. For each experiment, feedback was measured using a priori contrasts within the ‘seedling species × soil-biota source’ interaction term in our mixed-model analysis of covariance (ANCOVA) tests (see Methods). These contrasts isolated the strength and direction of the interaction between seedling species and adult biota source for each possible species pair (that is, pairwise feedbacks18). The strength of average feedback per species was determined by averaging all pairwise feedbacks involving that species18 (see Supplementary Fig. 1).
Stochastic cellular automata simulation. Initially, all cells were occupied and species were equally represented. A cell was then chosen at random and the species identity was reassigned based on the pairwise plant–soil responses measured in each experiment. Simulations in Fig. 3 assumed species equivalence in

growth and mortality. This assumption was then relaxed for subsequent runs of the model (see Methods and Supplementary Equations). The abundance of each species after 20 million replacements was recorded and the correlation with the average strength of feedback per species was tested.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 15 April; accepted 15 June 2010.
Published online 25 June 2010.

1. Janzen, D. H. Herbivores and the number of tree species in tropical forests. Am. Nat. 104, 501–528 (1970).
2. Connell, J. H. in Dynamics of Populations (eds den Boer, P. J. & Gradwell, G. R.) 298–312 (Center for Agricultural Publication and Documentation, 1971).
3. Bever, J. D. Feedback between plants and their soil communities in an old field community. Ecology 75, 1965–1977 (1994).
4. Klironomos, J. N. Feedback with soil biota contributes to plant rarity and invasiveness in communities. Nature 417, 67–70 (2002).
5. Kulmatiski, A., Beard, K. H., Stevens, J. R. & Cobbold, S. M. Plant-soil feedback: a meta-analytical review. Ecol. Lett. 11, 980–992 (2008).
6. Carson, W. P., Anderson, J. T., Leigh, E. G. & Schnitzer, S. A. in Tropical Forest Community Ecology (eds Carson, W. P. & Schnitzer, S. A.) 210–241 (Wiley-Blackwell, 2008).
7. Webb, C. O. & Peart, D. R. Seedling density dependence promotes coexistence of Bornean rain forest trees. Ecology 80, 2006–2017 (1999).
8. Harms, K. E., Wright, S. J., Calderón, O., Hernández, A. & Herre, A. E. Pervasive density-dependent recruitment enhances seedling diversity in a tropical forest. Nature 404, 493–495 (2000).
9. Lambers, H. R. L., Clark, J. S. & Beckage, B. Density-dependent mortality and the latitude gradient in species diversity. Nature 417, 732–735 (2002).
10. Peters, H. A. Neighbour-regulated mortality: the influence of positive and negative density dependence on tree populations in species-rich tropical forests. Ecol. Lett. 6, 757–765 (2003).
11. Wills, C. et al. Nonrandom processes maintain diversity in tropical forests. Science 311, 527–531 (2006).
12. Comita, L. S. & Hubbell, S. P. Local neighborhood and species’ shade tolerance influence survival in a diverse seedling bank. Ecology 90, 328–334 (2009).
13. Packer, A. & Clay, K. Soil pathogens and spatial patterns of seedling mortality in a temperate tree. Nature 404, 278–281 (2000).
14. Augspurger, C. K. & Kelly, C. K. Pathogen mortality of tropical tree seedlings: experimental studies of the effects of dispersal distance, seedling density, and light conditions. Oecologia 61, 211–217 (1984).
15. Hood, L. A., Swaine, M. D. & Mason, P. A. The influence of spatial patterns of damping-off disease and arbuscular mycorrhizal colonization on tree seedling establishment in Ghanaian tropical forest soil. J. Ecol. 92, 816–823 (2004).
16. Bell, T., Freckleton, R. P. & Lewis, O. T. Plant pathogens drive density-dependent seedling mortality in a tropical tree. Ecol. Lett. 9, 569–574 (2006).
17. Bever, J. D., Kristi, M. W. & Antonovics, J. Incorporating the soil community into plant population dynamics: the utility of the feedback approach. J. Ecol. 85, 561–573 (1997).
18. Bever, J. D. Soil community feedback and the coexistence of competitors: conceptual frameworks and empirical tests. New Phytol. 157, 465–473 (2003).
19. Adler, R. A. & Muller-Landau, H. C. When do localized natural enemies increase species richness? Ecol. Lett. 8, 438–447 (2005).
20. Petermann, J. S., Fergus, A. J. F., Turnbull, L. A. & Schmid, B. Janzen-Connell effects are widespread and strong enough to maintain diversity in grasslands. Ecology 89, 2399–2406 (2008).
21. McCarthy-Neumann, S. & Kobe, R. K. Conspecific plant-soil feedbacks reduce survivorship and growth of tropical tree seedlings. J. Ecol. 98, 396–407 (2010).
22. Hubbell, S. P., Ahumada, J. A., Condit, R. & Foster, R. B. Local neighborhood effects on long-term survival of individual trees in a neotropical forest. Ecol. Res. 16, 859–875 (2001).
23. Mills, K. M. & Bever, J. D. Maintenance of diversity within plant communities: soil pathogens as agents of negative feedback. Ecology 79, 1595–1601 (1998).
24. Kardol, P., Cornips, N. J., van Kempen, M. M. L., Bakx-Shotman, J. M. T. & van der Putten, W. H. Microbe-mediated plant-soil feedback causes historical contingency effects in plant community assembly. Ecol. Monogr. 77, 147–162 (2007).
25. Bever, J. D. Negative feedback within a mutualism: host-specific growth of mycorrhizal fungi reduces plant benefit. Proc. R. Soc. Lond. B 269, 2595–2601 (2002).
26. De Deyn, G. B. et al. Soil invertebrate fauna enhances grassland succession and diversity. Nature 442, 711–713 (2003).
27. Comita, L. S., Muller-Landau, H. C., Aguilar, S. & Hubbell, S. P. Asymmetric density dependence shapes species abundances in a tropical tree community. Science doi:10.1126/science.1190772 (in the press).
28. Hubbell, S. P. The Unified Neutral Theory of Biodiversity and Biogeography (Princeton Univ. Press, 2001).
29. Roesch, L. F. W. et al. Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J. 1, 283–290 (2007).
30. Connell, J. H., Tracey, J. G. & Webb, L. J. Compensatory recruitment, growth, and mortality as factors maintaining rain forest tree diversity. Ecol. Monogr. 54, 141–164 (1984).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank G. Adler, M. Kaspari, E. Leigh, T. Lambert, I. Rubinoff, E. Tanner, M. Tobin, B. Turner, S. Van Bael and N. Wurzburger for providing discussions and comments on the manuscript. R. Kolodziej, K. Meyer, K. McElligott and T. Shirshac provided greenhouse and field assistance. Logistical support was provided by the Smithsonian Tropical Research Institute. The Center of Tropical Forest Science provided BCI tree abundance data published online at https://ctfs.arnarb.harvard.edu/webatlas/datasets/bci/abundance. This study was supported by a Smithsonian Tropical Research Institute (STRI) postdoctoral fellowship to S.A.M., a University of Wisconsin–Milwaukee (UWM) Research Growth Initiative grant to S.A.S., a fellowship from the UWM Research Foundation, and a grant from the National Science Foundation to J.D.B. We thank I. Rubinoff for his support of the STRI Soil Initiative.

Author Contributions S.A.M. designed and conducted the experiments, analysed the data and wrote the first draft. S.A.S., E.A.H. and J.D.B. provided important revisions. J.D.B. and K.M.L.M. developed the simulation. M.C.V. and E.I.S. provided essential field support.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to S.A.M. ([email protected]).
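The pairwise feedback measure named in the Methods Summary can be illustrated with a small calculation. This is a hedged sketch, not the authors' analysis: the paper estimates feedback from a priori contrasts in a mixed-model ANCOVA with covariates and random effects, whereas the code below applies an interaction contrast of the kind used in the pairwise feedback approach (ref. 18) directly to a table of cell means. The exact contrast coding is an assumption, and the function names and the `growth` values are hypothetical.

```python
def pairwise_feedback(growth, i, j):
    """Interaction contrast between species i and j.

    growth[a][b] is the mean (log) biomass of seedling species a grown with
    soil inoculum from adult species b. A negative value means each species
    tends to do worse with its own biota than with the other's.
    """
    return growth[i][i] - growth[i][j] - growth[j][i] + growth[j][j]


def average_feedback(growth):
    """Average, per species, of all pairwise feedbacks involving it."""
    n = len(growth)
    return [
        sum(pairwise_feedback(growth, i, j) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


# Hypothetical cell means for three species (rows: seedling species,
# columns: inoculum source species); not data from the paper.
growth = [
    [0.50, 0.70, 0.65],
    [0.80, 0.60, 0.75],
    [0.70, 0.72, 0.68],
]
for species, value in enumerate(average_feedback(growth)):
    print(species, round(value, 3))
```

Working from cell means ignores the covariates (initial biomass, days between censuses) and the random site effects that the mixed models account for, so the numbers from a sketch like this would only approximate contrasts estimated from the fitted model.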

doi:10.1038/nature09273

METHODS

Reciprocal shade-house experiment. We collected seeds from three adults of Beilschmiedia pendula, Brosimum alicastrum, Eugenia nesiotica, Lacmellea panamensis, Tetragastris panamensis and Virola surinamensis located near the 50-ha plot on Barro Colorado Island (BCI). For each adult, we collected and homogenized soil samples from three locations 2 m away from the base of the tree to be used as inoculum. To separate the effects of soil biota from that of potential variation in abiotic properties, we filled all 4-l pots with an identical steam-pasteurized 3:1 sand–field soil mixture. To each pot, we also added a small quantity of live soil inoculum (6% total soil volume) collected from one of the six target species. We planted a single one-month-old seedling of each tree species into pots containing their own live inoculum (conspecific combinations) and pots containing inoculum from each of the five other species (heterospecific combinations). For each tree species, we replicated the conspecific plant–biota combination fifteen times and each heterospecific combination eight times. To confirm that our dilution technique adequately controlled for potential variation in abiotic properties introduced by the small volume of soil inoculum, we assessed seedling growth in the same plant–inoculum combinations, but using sterilized inoculum. Each plant–sterile inoculum combination was replicated twice. We divided all treatment combinations equally across four shade-houses, which were included in the analysis as blocks. Seedlings were well watered and allowed to grow for 5 months, after which seedlings were harvested and total dry weight was determined.

We used mixed-model ANCOVA to examine the main effects of seedling species and soil biota source and their interaction on log-transformed seedling biomass using the SAS procedure PROC MIXED. In this model, we included seedling species and block (and all interactions with block) as random effects, with log-transformed initial biomass as a covariate. We estimated initial biomass per species using regression equations obtained from extra harvested seedlings at the onset of the experiment, where the product of leaf area and stem height was regressed with total seedling dry weights. Within the ‘seedlings species × soil-biota source’ interaction, we used a priori contrasts that isolated the strength and direction of the interaction between seedling species and adult biota source for each possible species pair (that is, pairwise feedbacks¹⁸). These contrasts compared the relative growth response of seedlings when associated with soil biota from their own adults versus from under heterospecific adults, relative to how heterospecific seedlings responded across these same soil biota sources (see Supplementary Fig. 1).

Reciprocal field experiment. In July 2008, we transplanted ten seedlings of the same species as the adult (conspecifics) and five seedlings of each of the heterospecific species (30 seedlings total) into a single 1 × 0.8 m grid ~2.5 m from the base of each adult. Seedlings were randomized and planted 20 cm apart. Ten adult trees of Apeiba aspera, Brosimum alicastrum and Lacmellea panamensis, nine of Simarouba amara and four of Beilschmiedia pendula were haphazardly located in the forest of the mainland Gigante Peninsula, adjacent to BCI. We monitored seedling survival and estimated levels of visible damage (for example, insect herbivory, stem clipping, foliar pathogen infection) biweekly for the first four months, and monthly until the final census. In January 2009, we measured stem height and leaf lengths and widths, and estimated total above-ground biomass using regression equations. These equations were obtained per species by regressing the product of the growth measurements with total above-ground biomass of extra harvested seedlings (4 of 5 species: r² > 0.91; B. alicastrum: r² = 0.84). Initial seedling above-ground biomass was obtained in the same manner. The percentage of herbivory or foliar pathogen damage was estimated per leaf for each surviving seedling. We also assessed seedling survival in May 2009 and November 2009.

We analysed survival after 6 months using the SAS procedure PROC GLIMMIX for binomial distributions and log-transformed above-ground biomass using PROC MIXED. Each model included seedling and adult species and their interaction as main effects, and log-transformed initial above-ground biomass (per seedling) as a covariate. The growth model also included number of days between the initial and final census (per seedling) as a covariate. We included ‘site × adult species’ and ‘site × adult species × seedling species’ as random effects, with site defined as a single adult tree (43 ‘sites’ in total). We determined the average feedback per species using methods identical to the shade-house experiment. To investigate the contribution of leaf herbivory and foliar fungal damage to the strength of feedback, we computed the per cent decrease in strength of feedback per species when each damage type was included as a covariate in two additional growth models (see Supplementary Table 2 and Supplementary Fig. 2).

Simulation. We used stochastic spatially explicit cellular automata computer simulations. Each cell on a 300 × 300 torus grid was randomly assigned a species identity. The initial grid contained an equal number of cells per species. Focal cells were then chosen at random and replaced. After 20 million replacements, we examined the abundance of cells representing each species. For all simulations, seeds of each species were assumed to disperse evenly over their 25 surrounding cells. The identity of the new occupant of a replaced focal cell was determined by the establishment probability of each species occurring within the local neighbourhood (25 surrounding cells) of the focal cell. Establishment probabilities were determined by the species-specific response to soil biotic compositions created by both the species previously occupying the focal cell and the suite of species occurring immediately adjacent to the focal cell (surrounding 8 cells). The strength of this response was scaled so that it would be highest immediately adjacent to an adult and taper in strength with increasing distance (see Supplementary Equations). We parameterized two separate simulations where plant response to soil biota (that is, establishment probability) was based on pairwise growth responses measured in either the field or the shade-house experiment. In addition, we simulated a community containing 15 species by assigning the conspecific plant response as a value between 0.1 and 0.2 and the base of the heterospecific pairwise plant response as a value between 0.2 and 0.6, with the individual heterospecific responses being chosen from a uniform distribution within 0.1 of that base value. In this simulation, a new random establishment matrix was generated for each replication. For simulations described thus far, all cells had an equal probability of being selected for replacement (that is, adult mortality rates were assumed to be equal across species). We ran an additional simulation where we relaxed the assumption of species equivalence in mortality by weighting the probability of replacement by estimates of species-specific difference in tree mortality. We also relaxed the assumption that establishment was determined only by plant response to soil biota by weighting this measure by species-specific seedling growth rates (see Supplementary Equations). For all simulations, we replicated each simulation ten times and averaged the correlation coefficients that examined the relationship between the average strength of feedback and tree abundance. Simulations were run in MATLAB.
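The replacement rule described in the Simulation paragraph can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB code: the function names (`window`, `replace_cell`, `simulate`), the small grid size, the response values 0.15 and 0.4, and the choice to weight each candidate species by its seed count times its mean response to the nine soil sources are all assumptions. The distance taper and the mortality and growth weightings from the Supplementary Equations are not reproduced, and the 25-cell neighbourhood is assumed to be the 5 × 5 window centred on the focal cell.

```python
import random
from collections import Counter


def window(grid, r, c, radius):
    """Species in the square window centred on (r, c), wrapping at the edges (torus)."""
    n = len(grid)
    return [grid[(r + dr) % n][(c + dc) % n]
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)]


def replace_cell(grid, response, r, c, rng):
    """Replace the focal cell with a species drawn from its local neighbourhood."""
    seed_sources = Counter(window(grid, r, c, 2))  # 25 cells (assumed 5 x 5 window)
    soil_sources = window(grid, r, c, 1)           # previous occupant + 8 adjacent cells
    species = sorted(seed_sources)
    weights = []
    for s in species:
        # Assumed rule: mean response of species s to the soil biota of nearby adults.
        mean_response = sum(response[s][a] for a in soil_sources) / len(soil_sources)
        weights.append(seed_sources[s] * mean_response)
    grid[r][c] = rng.choices(species, weights=weights)[0]


def simulate(n_species=3, size=30, replacements=20000, seed=1):
    """Run the toy automaton and return the final abundance of each species."""
    rng = random.Random(seed)
    cells = [i % n_species for i in range(size * size)]  # equal numbers per species
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    # Hypothetical establishment matrix: weaker response in conspecific soil.
    response = [[0.15 if s == a else 0.4 for a in range(n_species)]
                for s in range(n_species)]
    for _ in range(replacements):
        replace_cell(grid, response, rng.randrange(size), rng.randrange(size), rng)
    return Counter(cell for row in grid for cell in row)


if __name__ == "__main__":
    print(simulate())
```

Making `response` asymmetric across species is the natural way to explore how the strength of feedback relates to final abundance, which is the relationship the Methods report as a correlation coefficient.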

Vol 466 | 5 August 2010 | doi:10.1038/nature09304 LETTERS

Predicting protein structures with a multiplayer online game

Seth Cooper1, Firas Khatib2, Adrien Treuille1,3, Janos Barbero1, Jeehyung Lee3, Michael Beenen1, Andrew Leaver-Fay2†, David Baker2,4, Zoran Popović1 & Foldit players

People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully ‘crowd-sourced’ through games¹⁻³, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology⁴, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

Although it has been known for over 40 years that the three-dimensional structures of proteins are determined by their amino acid sequences⁵, protein structure prediction remains a largely unsolved problem for all but the smallest protein domains. The state-of-the-art Rosetta structure prediction methodology, for example, is limited primarily by conformational sampling; the native structure almost always has lower energy than any non-native conformation, but the free energy landscape that must be searched is extremely large (even small proteins have on the order of 1,000 degrees of freedom) and rugged due to unfavourable atom–atom repulsion that can dominate the energy even quite close to the native state. To search this landscape, Rosetta uses a combination of stochastic and deterministic algorithms: rebuilding all or a portion of the chain from fragments; random perturbation to a subset of the backbone torsion angles; combinatorial optimization of protein side-chain conformations; gradient-based energy minimization; and energy-dependent acceptance or rejection of structure changes⁶⁻⁸.

We hypothesized that human spatial reasoning could improve both the sampling of conformational space and the determination of when to pursue suboptimal conformations if the stochastic elements of the search were replaced with human decision making while retaining the deterministic Rosetta algorithms as user tools. We developed a multiplayer online game, Foldit, with the goal of producing accurate protein structure models through gameplay (Fig. 1). Improperly folded protein conformations are posted online as puzzles for a fixed amount of time, during which players interactively reshape them in the direction they believe will lead to the highest score (the negative of the Rosetta energy). The player’s current status is shown, along with a leader board of other players, and groups of players working together, competing in the same puzzle (Fig. 1, arrows 8 and 9). To make the game approachable by players with no scientific training, many technical terms are replaced by terms in more common usage. We remove protein elements that hinder structural problem solving, and highlight energetically frustrated areas of the protein where the player can probably improve the structure (Fig. 1, arrows 1–5). Side chains are coloured by hydrophobicity and the backbone is coloured by energy. There are specific visual cues depicting hydrophobicity (‘exposed hydrophobics’), interatomic repulsion (‘clashes’) and cavities (‘voids’). The players are given intuitive direct manipulation tools. The most immediate method of interaction is directly pulling on the protein. It is also possible to rotate helices and rewire β-sheet connectivity (‘tweak’). Players are able to guide moves by introducing soft constraints (‘rubber bands’) and fixing degrees of freedom (‘freezing’) (Fig. 1, arrows 6 and 7). They are also able to change the strength of the repulsion term to allow more freedom of movement. The available automatic moves, namely combinatorial side-chain rotamer packing (‘shake’), gradient-based minimization (‘wiggle’) and fragment insertion (‘rebuild’), are Rosetta optimizations modified to suit direct protein interaction and simplified to run at interactive speeds.

To engage players with no previous exposure to molecular biology, it was essential to introduce these concepts through a series of introductory levels (Supplementary Fig. 1 and Supplementary Table 1): puzzles that are always available, and can be completed by reaching a goal score. These levels teach the game’s tools and visualizations, and certain strategies. We have found the game to be approachable by a wide variety of people, not only those with a scientific background (Supplementary Fig. 2); in fact, few top-ranked players are professionally involved in biochemistry (Supplementary Fig. 3).

To evaluate players’ abilities to solve structure prediction problems, we posted a series of prediction puzzles. Puzzles in this series were blind, in the sense that neither the target protein nor homologous proteins had structures contained within publicly available databases for the duration of the puzzles. Detailed information for these ten blind structures, including comparisons between the best-scoring Foldit predictions and the best-scoring Rosetta predictions using the rebuild and refine protocol⁷, is given in Table 1. We found that Foldit

1Department of Computer Science and Engineering, University of Washington, Box 352350, Seattle, Washington 98195, USA. 2Department of Biochemistry, University of Washington, Box 357350, Seattle, Washington 98195, USA. 3School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA. 4Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, Washington 98195, USA. †Present address: Department of Biochemistry, University of North Carolina, CB 7260, Chapel Hill, North Carolina 27599, USA.


Figure 1 | Foldit screenshot illustrating tools and visualizations. The visualizations include a clash representing atoms that are too close (arrow 1); a hydrogen bond (arrow 2); a hydrophobic side chain with a yellow blob because it is exposed (arrow 3); a hydrophilic side chain (arrow 4); and a segment of the backbone that is red due to high residue energy (arrow 5). The players can make modifications including ‘rubber bands’ (arrow 6), which add constraints to guide automated tools, and freezing (arrow 7), which prevents degrees of freedom from changing. The user interface includes information about the player’s current status, including score (arrow 8); a leader board (arrow 9), which shows the scores of other players and groups; toolbars for accessing tools and options (arrow 10); chat for interacting with other players (arrow 11); and a ‘cookbook’ for making new automated tools or ‘recipes’ (arrow 12).

players were particularly adept at solving puzzles requiring substantial backbone remodelling to bury exposed hydrophobic residues into the protein core (Fig. 2). When a hydrophobic residue points outwards into solvent, and no corresponding hole within the core is evident, stochastic Monte Carlo trajectories are unlikely to sample the coordinated backbone and side-chain shifts needed to bury the residue properly in the core. By adjusting the backbone to allow the exposed hydrophobic residue to pack properly in the core, players were able to solve these problems in a variety of blind scenarios including a register shift and a remodelled loop (Fig. 2a, b), a rotated helix (Fig. 2c), two remodelled loops (Fig. 2d), and a helix rotation and remodelled loop (Fig. 2e).

Players were also able to restructure β-sheets to improve hydrophobic burial and hydrogen bond quality. Automated methods have difficulty performing major protein restructuring operations to change β-sheet hydrogen-bond patterns, especially once the solution has settled in a local low-energy basin. Players were able to carry out these restructuring operations in such scenarios as strand swapping (Fig. 3) and register shifting (Fig. 2a). In one strand-swap puzzle, Foldit players were able to get within 1.1 Å of the native structure, with the top-scoring Foldit prediction being 1.4 Å away. A superposition between the starting Foldit puzzle, the top-scoring Foldit solution, and model 1 of the native NMR structure 2kpo (Protein Data Bank) is shown in Fig. 3b. Rosetta’s rebuild and refine protocol, however, was unable to get within 2 Å of the native structure (Fig. 3a, yellow points). This example highlights a key difference between humans and computers. As shown in Fig. 3c, solving the strand-swap problem required substantially unravelling the structure (Fig. 3c, bottom), with a corresponding unfavourable increase in energy (Fig. 3c, top). Players persisted with this reconfiguration despite the energy increase because they correctly recognized that the swap could ultimately lead to lower energies. In contrast, although the Rosetta

Table 1 | Blind data set

Puzzle ID  Foldit Cα r.m.s.d. (Å)  Rebuild and refine Cα r.m.s.d. (Å)  Native  Method  Number of residues  Figure(s)
986875     1.4                     4.5                                 2kpo    NMR     99                  3a–c, Supplementary 4
986698     1.8                     3.7                                 2kky    NMR     102                 3d, e
986836     5.7                     6.6                                 3epu    X-ray   136                 2c, Supplementary 6d
987088     3.5                     4.3                                 2kpt    NMR     116                 2a, b, Supplementary 6a, b
987162     4.5                     5.2                                 3lur    X-ray   158                 Supplementary 6c
987076     3.3                     3.5                                 2kpm    NMR     81                  2e, Supplementary 5c
986629     3.5                     3.3                                 2kk1    NMR     135                 Supplementary 5b
987145     2.6                     2.3                                 3nuf    X-ray   105                 2d, Supplementary 5a
986844     6.9                     5.8                                 2ki0    NMR     36                  Supplementary 10a
986961     10.6                    5.7                                 2knr    NMR     118                 Supplementary 10b

A listing of all the Foldit puzzles run in the blind data set. A Cα r.m.s.d. comparison to the native structure is given between the best-scoring model produced by Foldit players and the best-scoring model produced by the Rosetta rebuild and refine protocol, given the same starting model(s). Solutions considerably better with one method than the other are indicated in bold. The solved structures (which were released after each puzzle ended) are represented by their Protein Data Bank (PDB) codes. Results from these Foldit puzzles can be accessed on the Foldit website by replacing ID with the corresponding Foldit puzzle ID in http://fold.it/portal/node/ID. 2kky, 2kpt, 2kpm, 2kk1 and 2knr were taken from the CASD-NMR experiment¹⁰. 2kpo was provided by N. Koga and R. Koga. 2ki0 and 3epu were found by searching for unreleased structures on the PDB website (http://www.rcsb.org/pdb/search/searchStatus.do). 3lur and 3nuf were provided by the Joint Center for Structural Genomics (JCSG). The location of figures containing results for each puzzle is provided in the last column.
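The r.m.s.d. values in Table 1 measure how far a model's Cα atoms lie from the corresponding atoms in the native structure: the square root of the mean squared distance over matched atoms. A minimal sketch of that calculation follows. It assumes the two coordinate lists are already matched atom for atom and already superimposed; the optimal superposition step that structure-comparison tools normally perform first is not shown, and the function name `rmsd` is illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two matched lists of (x, y, z) points.

    Assumes the structures have already been superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]  # made-up coordinates, in Å
    model = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0)]   # every atom displaced by 1 Å
    print(rmsd(native, model))
```

Because the deviations are squared before averaging, a few badly placed residues raise the value more than many slightly misplaced ones.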


Figure 2 | Structure prediction problems solved by Foldit players. Examples of blind structure prediction problems in which players were successfully able to improve structures. Native structures are shown in blue, starting puzzles in red, and top-scoring Foldit predictions in green. a, The red starting puzzle had a register shift and the top-scoring green Foldit prediction correctly flips and slides the β-strand. b, On the same structure as above, Foldit players correctly buried an exposed isoleucine residue in the loop on the bottom right by remodelling the loop backbone. c, The top-scoring Foldit prediction correctly rotated an entire helix that was misplaced in the starting puzzle. d, The starting puzzle had an exposed isoleucine and phenylalanine on the top, as well as an exposed valine on the bottom left. The top-scoring Foldit prediction was able to correctly bury these exposed hydrophobic residues. e, Another successful Foldit helix rotation along with a remodelled loop that correctly buries an exposed phenylalanine. Images were produced using PyMOL software¹¹.

[Figure 3 panels. a, c, d: plots with y axis ‘Rosetta energy’ and x axes ‘Buried residue full-atom r.m.s.d. to native 2kpo NMR model 1’, ‘Time (h)’ and ‘Cα r.m.s.d. to native 2kky NMR model 1’. b, e: structure images, with structures 1–6 of panel c shown alongside.]

Figure 3 | Puzzles in which human predictors significantly outperformed the Rosetta rebuild and refine protocol. a–c, Puzzle 986875. d, e, Puzzle 986698. a, Comparison of Foldit player solutions (green) to the low-energy structures sampled in Rosetta rebuild and refine trajectories (yellow) for blind Foldit puzzle 986875 based on the recently determined structure of 2kpo. The x axis is the all-atom r.m.s.d. to 2kpo, and the y axis is the Rosetta energy. The starting Foldit puzzle was 4.3 Å away from the native structure (shown by the black dot on the plot); Foldit players sampled many different conformations, with the top-scoring submission (the lowest scoring Rosetta energy) 1.4 Å away from the native structure, whereas the automated Rosetta protocol did not sample below 2 Å. The blue dots and lines correspond to the trajectory of a single Foldit player in c. b, Superposition of the top-scoring Foldit prediction in green with the experimentally determined NMR model 1 in blue. The starting puzzle is in red, where the terminal strand is incorrectly swapped with its neighbour; 8% of all Foldit players were able to swap these strands correctly (Supplementary Table 2). c, A score trajectory with selected structures for the top-scoring player in puzzle 986875 over a 2-h window, showing how the player explores through high-energy conformations to reach the native state. The y axis shows the Rosetta energy and the x axis the elapsed time in hours. The starting structure had a Rosetta energy of −243. Each point in the plot represents a solution produced by this player. The first structure (1) is near the starting puzzle structure, shown as the black dot in panel a. The following structures (2–6) are shown as blue dots in panel a. In structures 2–4, the player must explore higher energies to move the strand into place, shown by the blue lines. In structures 5 and 6, the player refines the strand pairing. d, Comparison of Foldit player solutions (green) to the low-energy structures sampled in Rosetta rebuild and refine trajectories (yellow) for blind Foldit puzzle 986698 based on the recently determined structure of 2kky. Foldit players were able to get the best Foldit score by correctly picking from multiple alternative starting Rosetta models (black) the model that was closest to the native structure. e, The native structure is shown in blue with the top-scoring Foldit prediction shown in green. The top-scoring Rosetta rebuild and refine prediction given the same ten starting models (shown in yellow) was unable to sample as close to the native structure as the Foldit players.

rebuild and refine protocol did sample some partially swapped conformations (Fig. 3a, leftmost yellow point), these were not retained in subsequent generations owing to their relatively high energies, resulting in the top-scoring Rosetta prediction being further from the native than the starting structure (Supplementary Fig. 5).

Human players are also able to distinguish which starting point will be most useful to them. Figure 3d, e shows a case where players were given ten different Rosetta predictions to choose from. Players were able to identify the model closest to the native structure, and to improve it further. Given the same ten starting models, the Rosetta rebuild and refine protocol was unable to get as close to the native structure as the top-scoring Foldit predictions.

Foldit players performed similarly to the Rosetta rebuild and refine protocol for three of the ten blind puzzles (Supplementary Fig. 6). They outperformed Rosetta on five of the puzzles (Fig. 3 and Supplementary Figs 5 and 7), including the two above cases where players performed significantly better. A larger set of successful solutions for similar, although non-blind, puzzles is described in Supplementary Figs 8–10. For two of the ten blind puzzles, the top-scoring Rosetta rebuild and refine prediction was numerically better than the Foldit solution (Table 1) but still basically incorrect (root mean squared deviation (r.m.s.d.) to native structure >5.7 Å) (Supplementary Fig. 11).

Despite the promising results described above, there exists room for improvement. For one particularly difficult class of problems, players are only given an extended protein chain to start from. Although the Foldit tools are sufficient to reach the native conformation from this unfolded start (Supplementary Fig. 12), players can have trouble reaching it from so far away (Supplementary Fig. 11a). This indicates the need to find the right balance between humans and computational methods: players guided by visual cues perform better in resolving incorrect features in partially correct models than in ‘blank slate’ de novo folding of an extended, featureless protein chain.

As interesting as the Foldit predictions themselves is the complexity, variation and creativity of the human search process. Foldit gameplay supports both competition and collaboration between players. For collaboration, players can share structures with their group members, and help each other out with strategies and tips through the game’s chat function, or across the wiki. The competition and collaboration create a large social aspect to the game, which alters the aggregate search progress of Foldit and heightens player motivation. As groups compete for higher rankings and discover new structures, other groups appear to be motivated to play more (Supplementary Fig. 14a), and within groups the exchange of solutions can help other members catch up to the leaders (Supplementary Fig. 14b).

Humans use a much more varied range of exploration methods than computers. Different players use different move sequences, both according to the puzzle type and throughout the duration of a puzzle (Fig. 4a). For example, some players prefer to manually adjust side chains; some will forgo large amounts of continuous minimization at the beginning of a puzzle, but increase it as the puzzle progresses; and some prefer a more direct approach and use more rubber bands when the puzzle begins from an extended chain. Within teams, there is often a division of labour: some players specialize in early-stage openings, others in middle- and end-game polishing. Our informal investigation revealed a fascinating array of thought processes, insights and previously unexplored methodologies developed solely through Foldit gameplay (see Supplementary Text, ‘Player Testimonials’ section and Supplementary Table 3 for more information). More in-depth analysis of player strategies should provide further insight into the basis for human achievement with Foldit and could lead to improved automated algorithms for protein structure prediction.

In designing Foldit we sought to maximize both engagement by a wide range of players (a requirement common to all games) and the scientific relevance of the game outcomes (unique to Foldit). We fine-tuned the game through continuous iterative refinement based on observations of player activity and feedback, taking approaches from players who did well and making them accessible to all players. Most of the tools available to players today are a product of this refinement. They either did not initially exist or have undergone major revision. The introductory levels were also iteratively tuned to reduce player attrition due to difficulty or lack of engagement. Just as Foldit players gained expertise by playing Foldit, both individually and collectively, the game itself adapted to players’ best practices and skill sets. We suspect that this process of co-adaptation of game and players should be applicable to similar scientific discovery games.

[Figure 4 panel labels. a, Players 46533, 115025, 101291 and 66184, each shown for refinement puzzles (986824, 986836, 986894, 986928, 986950, 986967) and freestyle puzzles (986844, 986870, 986931, 987043, 987056, 987060); rings: first hour, first day, entire puzzle. Move types: Band: add, remove, or modify a distance constraint. Freeze: fix or unfix a subset of torsion angles. Global wiggle: gradient-based minimization. Local wiggle: gradient-based minimization with loop closure. Shake: combinatorial side-chain optimization. Backbone pull: manual backbone modification. Side-chain pull: manual side-chain modification. Rebuild: fragment insertion with loop closure. Secondary structure: modify label for fragment selection. Tweak: sheet register shift or helix rotation. b, Rosetta rebuild and refine; rings: first hour, first day, entire job. Move types: Backbone perturb: random perturbation to backbone torsion angles. Global minimize: gradient-based minimization. Repack: combinatorial side-chain optimization. Loop rebuild: fragment insertion with loop closure. Cluster.]

Figure 4 | Player move preferences. a, Different Foldit players take different approaches to solving the same problem. Each circle represents the move type frequencies used in the top-scoring solution produced by each player in different time frames: the inner circle denotes the first hour; the middle circle denotes the first day; and the outer circle denotes the puzzle’s entire duration. Each colour represents a different type of move that can be made in the game. The left column reflects player move types for puzzles that start relatively close to the native topology. The right column reflects player move types for puzzles that start from a fully extended conformation. Each row represents a different Foldit player. Each player’s preferred move types across each puzzle class are distinct from one another, yet a player’s preferences are similar for both classes of puzzles. Also note that the move preferences change over the lifetime of a puzzle; local wiggle is heavily preferred by the end of puzzles but not by all players at the beginning. The move type preferences are very different from Rosetta’s current best automated protocol, rebuild and refine, shown in b.

To attract the widest possible audience for the game and encourage prolonged engagement, we designed the game so that the supported motivations and the reward structure are diverse, including short-term rewards (game score), long-term rewards (player status and rank), social praise (chats and forums), the ability to work individually or in a team, and the connection between the game and scientific outcomes. A survey of Foldit players (Supplementary Fig. 4) revealed that although the purpose of contributing to science is a motivating factor for many players, Foldit also attracts players interested in achievement through competition and point accumulation, social interaction through chat and web-based communication, and immersion through engaging gameplay and exploration of protein shapes⁹. We expect generally that future scientific discovery games will also benefit from varied motivation sets.

The solution of challenging structure prediction problems by Foldit players demonstrates the considerable potential of a hybrid human–computer optimization framework in the form of a massively multiplayer game. The approach should be readily extendable to related problems, such as protein design and other scientific domains where human three-dimensional structural problem solving can be used. Our results indicate that scientific advancement is possible if even a small fraction of the energy that goes into playing computer games can be channelled into scientific discovery.

Received 22 January; accepted 30 June 2010.

1. von Ahn, L. & Dabbish, L. Labeling images with a computer game. in CHI ’04: Proc. 2004 Conf. Human Factors Comput. Syst., 319–326 (ACM, 2004).
2. von Ahn, L., Liu, R. & Blum, M. Peekaboom: a game for locating objects in images. in CHI ’06: Proc. SIGCHI Conf. Human Factors Comp. Syst., 55–64 (ACM, 2006).
3. Westphal, A. J. et al. Non-destructive search for interstellar dust using synchrotron microprobes. In X-ray Optics Microanalysis: Proc. 20th Int. Congr. Vol. 1221, 131–138 (2010).
4. Rohl, C., Strauss, C., Misura, K. & Baker, D. Protein structure prediction using Rosetta. Methods Enzymol. 383, 66–93 (2004).
5. Anfinsen, C. B. Principles that govern the folding of protein chains. Science 181, 223–230 (1973).
6. Das, R. & Baker, D. Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382 (2008).
7. Qian, B. et al. High-resolution structure prediction and the crystallographic phase problem. Nature 450, 259–264 (2007).
8. Bradley, P., Misura, K. M. S. & Baker, D. Toward high-resolution de novo structure prediction for small proteins. Science 309, 1868–1871 (2005).
9. Yee, N. Motivations of play in online games. J. CyberPsychol. Behav. 9, 772–775 (2007).
10. Rosato, A. et al. CASD-NMR: critical assessment of automated structure determination by NMR. Nature Methods 6, 625–626 (2009).
11. DeLano, W. L. The PyMOL Molecular Graphics System (DeLano Scientific, 2002).

Supplementary Information accompanies the paper on www.nature.com/nature.

Acknowledgements We thank D. Salesin, K. Tuite, J. Snyder, D. Suskin, P. Krähenbühl, A. C. Snyder, H. Lü, L. S. Tan, A. Chia, M. Yao, E. Butler, C. Carrico, P. Bradley, I. Davis, D. Kim, R. Das, W. Sheffler, J. Thompson, O. , R. Vernon, B. Correia, D. Anderson, Y. Zhao, S. Herin and B. Bethurum for their help. We would like to thank N. Koga, R. Koga and A. Deacon and the JCSG for providing us with protein structures before their public release. We would also like to acknowledge all of the Foldit players who have made this work possible. Usernames of players whose solutions were used in figures can be found in Supplementary Table 4. This work was supported by NSF grants IIS0811902 and 0906026, DARPA grant N00173-08-1-G025, the DARPA PDP program, the Howard Hughes Medical Institute (D.B.), Microsoft, and an NVIDIA Fellowship. This material is based upon work supported by the National Science Foundation under a grant awarded in 2009.

Author Contributions All named authors contributed extensively to development and analysis for the work presented in this paper. Foldit players (more than 57,000) contributed extensively through their feedback and gameplay, which generated the data for this paper.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to Z.P. ([email protected]) or D.B. ([email protected]).

Vol 466 | 5 August 2010 | doi:10.1038/nature09182 LETTERS

Link communities reveal multiscale complexity in networks

Yong-Yeol Ahn1,2*, James P. Bagrow1,2* & Sune Lehmann3,4*

1Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA. 2Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard University, Boston, Massachusetts 02215, USA. 3Institute for Quantitative Social Science, Harvard University, Cambridge, Massachusetts 02138, USA. 4College of Computer and Information Science, Northeastern University, Boston, Massachusetts 02115, USA. *These authors contributed equally to this work.

Networks have become a key approach to understanding systems of interacting objects, unifying the study of diverse phenomena including biological organisms and human society1–3. One crucial step when studying the structure and dynamics of networks is to identify communities4,5: groups of related nodes that correspond to functional subunits such as protein complexes6,7 or social spheres8–10. Communities in networks often overlap9,10 such that nodes simultaneously belong to several groups. Meanwhile, many networks are known to possess hierarchical organization, where communities are recursively grouped into a hierarchical structure11–13. However, the fact that many real networks have communities with pervasive overlap, where each and every node belongs to more than one group, has the consequence that a global hierarchy of nodes cannot capture the relationships between overlapping groups. Here we reinvent communities as groups of links rather than nodes and show that this unorthodox approach successfully reconciles the antagonistic organizing principles of overlapping communities and hierarchy. In contrast to the existing literature, which has entirely focused on grouping nodes, link communities naturally incorporate overlap while revealing hierarchical organization. We find relevant link communities in many networks, including major biological networks such as protein–protein interaction6,7,14 and metabolic networks11,15,16, and show that a large social network10,17,18 contains hierarchically organized community structures spanning inner-city to regional scales while maintaining pervasive overlap. Our results imply that link communities are fundamental building blocks that reveal overlap and hierarchical organization in networks to be two aspects of the same phenomenon.

Although no common definition has been agreed upon, it is widely accepted that a community should have more internal than external connections19–24. Counterintuitively, highly overlapping communities can have many more external than internal connections (Fig. 1a, b). Because pervasive overlap breaks even this fundamental assumption, a new approach is needed.

The discovery of hierarchy and community organization has always been considered a problem of determining the correct membership (or memberships) of each node. Notice that, whereas nodes belong to multiple groups (individuals have families, co-workers and friends; Fig. 1c), links often exist for one dominant reason (two people are in the same family, work together or have common interests). Instead of assuming that a community is a set of nodes with many links between them, we consider a community to be a set of closely interrelated links.

Placing each link in a single context allows us to reveal hierarchical and overlapping relationships simultaneously. We use hierarchical clustering with a similarity between links to build a dendrogram where each leaf is a link from the original network and branches represent link communities (Fig. 1d, e and Methods). In this dendrogram, links occupy unique positions whereas nodes naturally occupy multiple positions, owing to their links. We extract link communities at multiple levels by cutting this dendrogram at various thresholds. Each node inherits all memberships of its links and can thus belong to multiple, overlapping communities. Even though we assign only a single membership per link, link communities can also capture multiple relationships between nodes, because multiple nodes can simultaneously belong to several communities together.

The link dendrogram provides a rich hierarchy of structure, but to obtain the most relevant communities it is necessary to determine the best level at which to cut the tree. For this purpose, we introduce a natural objective function, the partition density, D, based on link density inside communities; unlike modularity20, D does not suffer from a resolution limit25 (Methods). Computing D at each level of the link dendrogram allows us to pick the best level to cut (although meaningful structure exists above and below that threshold). It is also possible to optimize D directly. We can now formulate overlapping community discovery as a well-posed optimization problem, accounting for overlap at every node without penalizing that nodes participate in multiple communities.

As an illustrative example, Fig. 1f shows link communities around the word ‘Newton’ in a network of commonly associated English words. (See Supplementary Information, section 6, for details on networks used throughout the text.) The ‘clever, wit’ community is correctly identified inside the ‘smart/intellect’ community. The words ‘Newton’ and ‘Gravity’ both belong to the ‘smart/intellect’, ‘weight’ and ‘apple’ communities, illustrating that link communities capture multiple relationships between nodes. See Supplementary Information, section 3.6, for further visualizations.

Figure 1 | Overlapping communities lead to dense networks and prevent the discovery of a single node hierarchy. a, Local structure in many networks is simple: an individual node sees the communities it belongs to. b, Complex global structure emerges when every node is in the situation displayed in a. c, Pervasive overlap hinders the discovery of hierarchical organization because nodes cannot occupy multiple leaves of a node dendrogram, preventing a single tree from encoding the full hierarchy. d, e, An example showing link communities (colours in d), the link similarity matrix (e; darker entries show more similar pairs of links) and the link dendrogram (e). f, Link communities from the full word association network around the word ‘Newton’. Link colours represent communities and filled regions provide a guide for the eye. Link communities capture concepts related to science and allow substantial overlap. Note that the words were produced by experiment participants during free word associations.

Having unified hierarchy and overlap, we provide quantitative, real-world evidence that a link-based approach is superior to existing, node-based approaches. Using data-driven performance measures, we analyse link communities found at the maximum partition density in real-world networks, compared with node communities found by three widely used and successful methods: clique percolation9, greedy modularity optimization26 and Infomap21. Clique percolation is the most prominent overlapping community algorithm, greedy modularity optimization is the most popular modularity-based20 technique and Infomap is often considered the most accurate method available27.

We compiled a test group of 11 networks covering many domains of active research and representing the wide body of available data (Supplementary Table 2). These networks vary from small to large, from sparse to dense, and from those with modular structure to those with highly overlapping structure. We highlight a few data sets of particular scientific importance: The mobile phone network is the most comprehensive proxy of a large-scale social network currently in existence17,18; the metabolic network iAF1260, from Escherichia coli K-12 MG1655 strain, is one of the most elaborate reconstructions currently available16; and the three protein–protein interaction networks of Saccharomyces cerevisiae are the most recent and complete protein–protein interaction data yet published14.

These networks possess rich metadata that allow us to describe the structural and functional roles of each node. For example, the biological roles of each protein in the protein–protein interaction network can be described by a controlled vocabulary (Gene Ontology terms28). By calculating metadata-based similarity measures between nodes (Methods and Supplementary Information, section 5), we can determine the quality of communities by the similarity of the nodes they contain (‘community quality’). Likewise, we can use metadata to estimate the expected amount of overlap around a node, testing the quality of the discovered overlap according to the metadata (‘overlap quality’). For example, metabolites that participate in more metabolic pathways are expected to belong to more communities than metabolites that participate in fewer pathways. Some methods may find high-quality communities but only for a small fraction of the network; coverage measures describe how much of the network was classified by each algorithm (‘community coverage’) and how much overlap was discovered (‘overlap coverage’). Each community algorithm is tested by comparing its output with the metadata, to determine how well the discovered community structure reflects the metadata, according to the four measures. Each measure is normalized such that the best method attains a value of one. ‘Composite performance’ is the sum of these four normalized measures, such that the maximum achievable score is four. Full details are in Methods and Supplementary Information, sections 5 and 6.

Figure 2 displays the results of this quantitative comparison, showing that link communities reveal more about every network’s metadata than other tested methods. Not only is our approach the overall leader in every network, it is also the winner in most individual aspects of the composite performance for all networks, particularly the quality measures. The performance of link communities stands out for dense networks, such as the metabolic and word association networks, which are expected to have pervasively overlapping structure.
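The normalization behind the composite performance can be illustrated with a minimal Python sketch. It assumes the simplest reading of the description above: each measure is divided by the best value any method attains on a given network, and the four normalized values are summed. The function name `composite_performance`, the dictionary keys and the raw scores are hypothetical and are not taken from the paper; the exact procedure is given in the Supplementary Information.

```python
MEASURES = ("community_quality", "overlap_quality",
            "community_coverage", "overlap_coverage")

def composite_performance(raw):
    """Sum of four normalized measures per method, for one network.

    raw: {method: {measure: value}} with non-negative values (assumed).
    Each measure is divided by the best value any method attains, so the
    best method scores 1 on that measure and the maximum composite is 4.
    """
    best = {m: max(scores[m] for scores in raw.values()) for m in MEASURES}
    composite = {}
    for method, scores in raw.items():
        composite[method] = sum(
            scores[m] / best[m] if best[m] > 0 else 0.0 for m in MEASURES
        )
    return composite

# Made-up raw scores for two methods on a single network.
example = {
    "links":   {"community_quality": 2.0, "overlap_quality": 0.4,
                "community_coverage": 0.9, "overlap_coverage": 3.0},
    "infomap": {"community_quality": 1.0, "overlap_quality": 0.1,
                "community_coverage": 0.9, "overlap_coverage": 0.9},
}
scores = composite_performance(example)
```

In this toy input the hypothetical "links" method is best on every measure and therefore reaches the maximum score of four, while the other method is penalized on each measure in proportion to its distance from the best value.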

Figure 2 | Assessing the relevance of link communities using real-world networks. Composite performance (Methods and Supplementary Information) is a data-driven measure of the quality (relevance of discovered memberships) and coverage (fraction of network classified) of community and overlap. Tested algorithms are link clustering, introduced here; clique percolation9; greedy modularity optimization26; and Infomap21. Test networks were chosen for their varied sizes and topologies and to represent the different domains where network analysis is used. Shown for each are the number of nodes, N, and the average number of neighbours per node, ⟨k⟩. Link clustering finds the most relevant community structure in real-world networks. AP/MS, affinity-purification/mass spectrometry; LC, literature curated; PPI, protein–protein interaction; Y2H, yeast two-hybrid.

Plot: stacked bars of composite performance (vertical axis, 0 to 4) for each network and method (L, links; C, clique percolation; G, greedy modularity; I, Infomap). Stacked measures: community quality, overlap quality, community coverage, overlap coverage.

Network        Domain       N         ⟨k⟩
Metabolic      Biological   1,042     16.81
PPI (Y2H)      Biological   1,647     3.06
PPI (AP/MS)    Biological   1,004     16.57
PPI (LC)       Biological   1,213     4.21
PPI (all)      Biological   2,729     8.92
Phone          Social       885,989   6.34
Actor          Social       67,411    8.90
US Congress    Social       390       38.95
Philosopher    Social       1,219     9.80
Word assoc.    Other        5,018     22.02
Amazon.com     Other        18,142    5.09

It is instructive to examine further the statistics of link communities in the metabolic and mobile phone networks (Fig. 3). The community size distribution at the optimum value of D is heavy tailed for both networks, whereas the number of communities per node distinguishes them (Fig. 3, insets): Mobile phone users are limited to a smaller range of community memberships, most likely as a result of social and time constraints. Meanwhile, the membership distribution of the metabolic network displays the universality of currency metabolites (water, ATP and so on) through the large number of communities they participate in. Notable previous work11,15 removed currency metabolites before identifying meaningful community structure. The statistics presented here match current knowledge about the two systems, further confirming the communities’ relevance.

Figure 3 | Community and membership distributions for the metabolic and mobile phone networks. The distribution of community sizes and node memberships (insets). Community size shows a heavy tail. The number of memberships per node is reasonable for both networks: we do not observe phone users that belong to large numbers of communities and we correctly identify currency metabolites, such as water, ATP and inorganic phosphate (Pi), that are prevalently used throughout metabolism. The appearance of currency metabolites in many metabolic reactions is naturally incorporated into link communities, whereas their presence hindered community identification in previous work11,15.

Axes: number of communities against number of metabolites per community (metabolic) and number of users per community (mobile phone). Insets: number of metabolites against number of communities per metabolite, and number of users against number of communities per user.

Having established that link communities at the maximal partition density are meaningful and relevant, we now show that the link dendrogram reveals meaningful communities at different scales. Figure 4a–c shows that mobile phone users in a community are spatially co-located. Figure 4a maps the most likely geographic locations of all users in the network; several cities are present. In Fig. 4b, we show (insets) several communities at different cuts above the optimum threshold, revealing small, intra-city communities. Below the optimum threshold, larger, yet still spatially correlated, communities exist (Fig. 4c). Because we expect a tight-knit community to have only small geographical dispersion, the clustered structures on the map indicate that the communities are meaningful. The geographical correlation of each community does not suddenly break down, but is sustained over a wide range of thresholds. In Fig. 4d, we look more closely at the social network of the largest community in Fig. 4c, extracting the structure of its largest subcommunity along with its remaining hierarchy and revealing the small-scale structures encoded in the link dendrogram. This example provides evidence for the presence of spatial, hierarchical organization at a societal scale. To validate the hierarchical organization of communities quantitatively throughout the dendrogram, we use a randomized control dendrogram that quantifies how community quality would evolve if there were no hierarchical organization beyond a certain point. Figure 4e shows that the quality of the actual communities decays much more slowly than the control, indicating that real link dendrograms possess a large range of high quality community structures. The quantitative results of Fig. 4 are typical for the full test group, implying that rich, meaningful community structure is contained within the link dendrogram. Additional results supporting these conclusions are presented in Supplementary Information, section 7.

Figure 4 | Meaningful communities at multiple levels of the link dendrogram. a–c, The social network of mobile phone users displays co-located, overlapping communities on multiple scales. a, Heat map of the most likely locations of all users in the region, showing several cities. b, Cutting the dendrogram above the optimum threshold yields small, intra-city communities (insets). c, Below the optimum threshold, the largest communities become spatially extended but still show correlation. d, The social network within the largest community in c, with its largest subcommunity highlighted. The highlighted subcommunity is shown along with its link dendrogram and partition density, D, as a function of threshold, t. Link colours correspond to dendrogram branches. e, Community quality, Q, as a function of dendrogram level, compared with random control (Methods).

Axes in e: Q/Qmax against link dendrogram threshold, t, for the phone, metabolic and word association networks (actual and control).

Many cutting-edge networks are far from complete. For example, an ambitious project to map all protein–protein interactions in yeast is currently estimated to detect approximately 20% of connections14. As the rate of data collection continues to increase, networks become denser and denser, overlap becomes increasingly pervasive and approaches specifically designed to untangle complex, highly overlapping structure become essential. More generally, the shift in perspective from nodes to links represents a fundamentally new way to study complex systems. Here we have taken steps towards understanding the consequences of a link-based approach, but its full potential remains unexplored. Our work has primarily focused on the highly overlapping community structure of complex networks, but, as we have shown, the hierarchy that organizes these overlapping communities holds great promise for further study.

While finalizing this manuscript, we have been made aware of a similar approach developed independently by T. S. Evans and R. Lambiotte29,30.

METHODS SUMMARY

Link communities. We denote the set of node i and its neighbours as $n_+(i)$. For link pairs that share a node, the similarity between links $e_{ik}$ and $e_{jk}$ is $S(e_{ik}, e_{jk}) = |n_+(i) \cap n_+(j)| / |n_+(i) \cup n_+(j)|$. Single-linkage hierarchical clustering then builds a link dendrogram (agglomerate ties in S simultaneously). Cutting this dendrogram at some threshold yields link communities. See Supplementary Information for details, generalizations to multipartite and weighted graphs, and other algorithms.

Partition density. For a network with M links, $\{P_1, \ldots, P_C\}$ is a partition of the links into C subsets. Subset $P_c$ has $m_c = |P_c|$ links and $n_c = \left|\bigcup_{e_{ij} \in P_c} \{i, j\}\right|$ nodes. Then we define

$$D_c = \frac{m_c - (n_c - 1)}{n_c (n_c - 1)/2 - (n_c - 1)}$$

This is $m_c$ normalized by the minimum and maximum numbers of links possible between $n_c$ connected nodes. (We assume that $D_c = 0$ if $n_c = 2$.) The partition density, D, is the average of $D_c$, weighted by the fraction of present links:

$$D = \frac{2}{M} \sum_c m_c \, \frac{m_c - (n_c - 1)}{(n_c - 2)(n_c - 1)} \qquad (1)$$

Equation (1) does not possess a resolution limit25 because each term is local in c.

Community validation. Nontrivial communities possess 3+ nodes. We use metadata ‘enrichment’ to assess community quality, comparing how similar nodes are within nontrivial communities relative to all nodes (global baseline). Overlap quality is the mutual information between the number of nontrivial memberships and the overlap metadata (Supplementary Table 2). Community coverage is the fraction of nodes belonging to 1+ nontrivial communities. Overlap coverage, because methods with equal community coverage can extract different amounts of overlap, is the average number of nontrivial memberships per node. See Supplementary Information for full details.

Control dendrogram. To study the hierarchy beyond some threshold, $t_*$, we begin hierarchical clustering, merging all edge pairs with $S \ge t_*$ and thus fixing the community structure at threshold $t = t_*$. Then we randomly shuffle similarities amongst the remaining edge pairs with $S < t_*$, and continue the merging process. Full details are in Supplementary Information, section 7.4.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 29 October 2009; accepted 13 May 2010.
Published online 20 June 2010.

1. Newman, M. E. J., Barabási, A.-L. & Watts, D. J. The Structure and Dynamics of Networks (Princeton Univ. Press, 2006).
2. Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford Univ. Press, 2007).
3. Dorogovtsev, S. N., Goltsev, A. V. & Mendes, J. F. F. Critical phenomena in complex networks. Rev. Mod. Phys. 80, 1275–1335 (2008).
4. Girvan, M. & Newman, M. E. J. Community structure in social and biological networks. Proc. Natl Acad. Sci. USA 99, 7821–7826 (2002).
5. Fortunato, S. Community detection in graphs. Phys. Rep. 486, 75–174 (2010).
6. Krogan, N. J. et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637–643 (2006).
7. Gavin, A.-C. et al. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631–636 (2006).
8. Wasserman, S. & Faust, K. Social Network Analysis: Methods and Applications. Structural analysis in the social sciences (Cambridge Univ. Press, 1994).
9. Palla, G., Derény, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005).
10. Palla, G., Barabási, A. & Vicsek, T. Quantifying social group evolution. Nature 446, 664–667 (2007).
11. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabási, A.-L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002).
12. Sales-Pardo, M., Guimera, R., Moreira, A. & , L. Extracting the hierarchical organization of complex systems. Proc. Natl Acad. Sci. USA 104, 15224–15229 (2007).
13. Clauset, A., Moore, C. & Newman, M. E. J. Hierarchical structure and the prediction of missing links in networks. Nature 453, 98–101 (2008).
14. Yu, H. et al. High-quality binary protein interaction map of the yeast interactome network. Science 322, 104–110 (2008).
15. Guimerà, R. & Amaral, L. A. N. Functional cartography of complex metabolic networks. Nature 433, 895–900 (2005).
16. Feist, A. M. et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 orfs and thermodynamic information. Mol. Syst. Biol. 3, 121 (2007).
17. Onnela, J.-P. et al. Structure and tie strengths in mobile communication networks. Proc. Natl Acad. Sci. USA 104, 7332–7336 (2007).
18. González, M. C., Hidalgo, C. A. & Barabási, A.-L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
19. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V. & Parisi, D. Defining and identifying communities in networks. Proc. Natl Acad. Sci. USA 101, 2658–2663 (2004).
20. Newman, M. E. J. & Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004).
21. Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl Acad. Sci. USA 105, 1118–1123 (2008).
22. Reichardt, J. & Bornholdt, S. Detecting fuzzy community structures in complex networks with a Potts model. Phys. Rev. Lett. 93, 218701 (2004).
23. Li, D. et al. Synchronization interfaces and overlapping communities in complex networks. Phys. Rev. Lett. 101, 168701 (2008).
24. Lancichinetti, A., Fortunato, S. & Kertesz, J. Detecting the overlapping and hierarchical community structure in complex networks. N. J. Phys. 11, 033015 (2009).
25. Fortunato, S. & Barthélemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007).
26. Clauset, A., Newman, M. E. J. & Moore, C. Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004).
27. Lancichinetti, A. & Fortunato, S. Community detection algorithms: a comparative analysis. Phys. Rev. E 80, 056117 (2009).
28. The Gene Ontology Consortium. The Gene Ontology project in 2008. Nucleic Acids Res. 36, D440–D444 (2008).
29. Evans, T. S. & Lambiotte, R. Line graphs, link partitions and overlapping communities. Phys. Rev. E 80, 016105 (2009).
30. Evans, T. S. & Lambiotte, R. Edge partitions and overlapping communities in complex networks. Preprint at ⟨http://arxiv.org/abs/0912.4389⟩ (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements The authors thank A.-L. Barabási, S. Ahnert, J. Park, D.-S. Lee, P.-J. Kim, N. Blumm, D. Wang, M. A. Yildirim and H. Yu. The authors acknowledge the Center for Complex Network Research, supported by the James S. McDonnell Foundation 21st Century Initiative in Studying Complex Systems; the NSF-DDDAS (CNS-0540348), NSF-ITR (DMR-0426737) and NSF-IIS-0513650 programmes; US ONR Award N00014-07-C; the NIH (U01 A1070499-01/Sub #:111620-2); the DTRA (BRBAA07-J-2-0035); the NS-CTA sponsored by US ARL (W911NF-09-2-0053); and NKTH NAP (KCKHA005). S.L. acknowledges support from the Danish Natural Science Research Council.

Author Contributions Y.-Y.A., J.P.B. and S.L. designed and performed the research and wrote the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to S.L. ([email protected]).


METHODS

Link communities. For an undirected, unweighted network, we denote the set of node i and its neighbours as $n_+(i)$. Limiting ourselves to link pairs that share a node, expected to be more similar than disconnected pairs, we find the similarity, S, between links $e_{ik}$ and $e_{jk}$ to be

$$S(e_{ik}, e_{jk}) = \frac{|n_+(i) \cap n_+(j)|}{|n_+(i) \cup n_+(j)|} \qquad (2)$$

Shared node k does not appear in S because it provides no additional information and introduces bias. Single-linkage hierarchical clustering builds a link dendrogram from equation (2) (ties in S are agglomerated simultaneously). Cutting this dendrogram at some clustering threshold, for example the threshold with maximum partition density (see below), yields link communities. See Supplementary Information for details, generalizations to multipartite and weighted graphs, and the usage of other algorithms.

Partition density. For a network with M links and N nodes, $P = \{P_1, \ldots, P_C\}$ is a partition of the links into C subsets. The number of links in subset $P_c$ is $m_c = |P_c|$. The number of induced nodes, all nodes that those links touch, is $n_c = \left|\bigcup_{e_{ij} \in P_c} \{i, j\}\right|$. Note that $\sum_c m_c = M$ and $\sum_c n_c \ge N$ (assuming no unconnected nodes). The link density, $D_c$, of community c is

$$D_c = \frac{m_c - (n_c - 1)}{n_c (n_c - 1)/2 - (n_c - 1)}$$

This is the number of links in $P_c$ normalized by the minimum and maximum numbers of links possible between those nodes, assuming they remain connected. (We assume that $D_c = 0$ if $n_c = 2$.) The partition density, D, is the average of $D_c$, weighted by the fraction of present links:

$$D = \frac{2}{M} \sum_c m_c \, \frac{m_c - (n_c - 1)}{(n_c - 2)(n_c - 1)} \qquad (3)$$

Equation (3) does not possess a resolution limit25 because each term is local in c.

Community validation. Nontrivial communities possess 3+ nodes. We use metadata ‘enrichment’ to assess community quality, comparing how similar nodes are within nontrivial communities relative to all nodes (global baseline). Overlap quality is the mutual information between the number of nontrivial memberships and the overlap metadata (Supplementary Table 2). Community coverage is the fraction of nodes belonging to 1+ nontrivial communities. Overlap coverage, because methods with equal community coverage can extract different amounts of overlap, is the average number of nontrivial memberships per node (equivalent to community coverage for non-overlapping methods). See Supplementary Information for details.

Control dendrogram. To test whether the hierarchical structure is valid beyond some threshold, $t_*$, we introduce the following control. First we compute the similarities $S(e_{ik}, e_{jk})$ for all connected edge pairs $(e_{ik}, e_{jk})$, as normal. We then perform our standard single-linkage hierarchical clustering, merging all edge pairs in descending order of S for $S \ge t_*$, fixing the community structure up to $t = t_*$. Below $t_*$, we randomly shuffle similarities among the remaining edge pairs with $S < t_*$, then proceed with the merging process as before. This randomization only alters the merging order, and ensures that the rate of edge pair merging is preserved, because the same similarities are clustered. This strictly controls not only the merging rate but also the similarity distributions and the high-quality community structure found at $t_*$. This procedure ensures that the dendrogram is properly randomized while other salient features are conserved. Full details are in Supplementary Information, section 7.4.
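The link similarity, the single-linkage merging and the partition density defined in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is simple, undirected and unweighted, each link is a tuple of two distinct nodes, a union-find structure stands in for the dendrogram, and the function names (`link_similarities`, `partition_density`, `best_cut`) are hypothetical.

```python
from itertools import combinations, groupby

def link_similarities(edges):
    """Jaccard similarity of inclusive neighbour sets for links sharing a node."""
    nbrs, incident = {}, {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)      # n_+(u): u and its neighbours
        nbrs.setdefault(v, {v}).add(u)
        incident.setdefault(u, []).append((u, v))
        incident.setdefault(v, []).append((u, v))
    sims = []
    for k, links in incident.items():       # k is the shared node
        for e1, e2 in combinations(links, 2):
            i = e1[0] if e1[1] == k else e1[1]   # unshared endpoint of e1
            j = e2[0] if e2[1] == k else e2[1]   # unshared endpoint of e2
            sims.append((len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j]), e1, e2))
    return sims

def partition_density(communities, n_links):
    """Partition density D; communities with n_c = 2 contribute zero."""
    total = 0.0
    for links in communities:
        m = len(links)
        n = len({node for e in links for node in e})
        if n > 2:
            total += m * (m - (n - 1)) / ((n - 2) * (n - 1))
    return 2.0 * total / n_links

def best_cut(edges):
    """Merge link pairs in descending similarity; return (D, threshold, communities)
    at the level with maximum partition density."""
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]   # path halving
            e = parent[e]
        return e

    def communities():
        groups = {}
        for e in edges:
            groups.setdefault(find(e), []).append(e)
        return list(groups.values())

    sims = sorted(link_similarities(edges), key=lambda t: t[0], reverse=True)
    best = (partition_density(communities(), len(edges)), 1.0, communities())
    for s, tied in groupby(sims, key=lambda t: t[0]):
        for _, e1, e2 in tied:              # ties are agglomerated together
            parent[find(e1)] = find(e2)
        d = partition_density(communities(), len(edges))
        if d > best[0]:
            best = (d, s, communities())
    return best

# Two triangles sharing node 2 (a toy graph, not from the paper).
bowtie = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
density, threshold, link_comms = best_cut(bowtie)
```

On this toy graph the sweep recovers the two triangles as separate link communities, so node 2 inherits both memberships. Recomputing the communities at every level keeps the sketch short but is quadratic in the number of links; an implementation intended for large networks would update D incrementally.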

Vol 466 | 5 August 2010 | doi:10.1038/nature09171

LETTERS

Regulation of myeloid leukaemia by the cell-fate determinant Musashi

Takahiro Ito1*, Hyog Young Kwon1*, Bryan Zimdahl1, Kendra L. Congdon1, Jordan Blum1, William E. Lento1, Chen Zhao1, Anand Lagoo2, Gareth Gerrard3, Letizia Foroni3, John Goldman3, Harriet Goh4, Soo-Hyun Kim4, Dong-Wook Kim4, Charles Chuah5, Vivian G. Oehler6, Jerald P. Radich6, Craig T. Jordan7 & Tannishtha Reya1

Chronic myelogenous leukaemia (CML) can progress from a slow growing chronic phase to an aggressive blast crisis phase (ref. 1), but the molecular basis of this transition remains poorly understood. Here we have used mouse models of CML (refs 2, 3) to show that disease progression is regulated by the Musashi–Numb signalling axis (refs 4, 5). Specifically, we find that the chronic phase is marked by high levels of Numb expression whereas the blast crisis phase has low levels of Numb expression, and that ectopic expression of Numb promotes differentiation and impairs advanced-phase disease in vivo. As a possible explanation for the decreased levels of Numb in the blast crisis phase, we show that NUP98–HOXA9, an oncogene associated with blast crisis CML (refs 6, 7), can trigger expression of the RNA-binding protein Musashi2 (Msi2), which in turn represses Numb. Notably, loss of Msi2 restores Numb expression and significantly impairs the development and propagation of blast crisis CML in vitro and in vivo. Finally we show that Msi2 expression is not only highly upregulated during human CML progression but is also an early indicator of poorer prognosis. These data show that the Musashi–Numb pathway can control the differentiation of CML cells, and raise the possibility that targeting this pathway may provide a new strategy for the therapy of aggressive leukaemias.

Chronic myelogenous leukaemia (CML) is initiated by the BCR–ABL translocation, which leads to myeloid cell expansion while allowing differentiation (refs 8–11). Secondary translocations such as NUP98–HOXA9 or AML1–EVI1, or mutations in p53 or INK4A/ARF, trigger progression through an accelerated phase to a blast crisis phase, with progressive loss of the capacity to differentiate (ref. 1). Although blast crisis CML is, in part, more aggressive because of arrested differentiation, the pathways that underlie this arrest remain poorly understood. To determine whether CML progression may be driven by reversal of signals that regulate differentiation during normal development, we focused on Numb (ref. 4), a molecule that can be inherited differentially during asymmetric division and specify a committed fate (refs 12–16).

To determine whether Numb regulates leukaemia progression we used mouse models representing chronic phase and myeloid blast crisis CML. Chronic disease was generated by infecting haematopoietic stem-cell-enriched populations (c-Kit+Lin−Sca-1+ or KLS) with BCR–ABL and transplanting them into irradiated recipient mice (refs 2, 3). Myeloid blast crisis was modelled by transplanting KLS cells transduced with BCR–ABL and NUP98–HOXA9 (refs 6, 7, 17). Using these we found that Numb was expressed at significantly lower levels in the blast crisis phase compared with the chronic phase (Fig. 1a–c). The decreased expression of Numb in the blast crisis phase indicated that keeping Numb at low levels may be essential for maintaining an immature state and that increasing its levels could trigger differentiation and inhibit disease progression. To test this possibility, haematopoietic cells were infected with BCR–ABL and NUP98–HOXA9 together with either control vector or Numb, transplanted and leukaemia progression monitored. A total of 83% of control mice developed leukaemia compared with 63% of those transplanted with Numb-expressing cells (Fig. 1d). Notably, leukaemias that developed in the presence of Numb were more differentiated (Fig. 1e, f) and unable to propagate disease efficiently (93% versus 20%, Fig. 1g) or infiltrate secondary organs (Fig. 1h, i and Supplementary Fig. 1); no signs of leukaemia were detected in mice that survived (Fig. 1j and Supplementary Fig. 1). Numb also impaired propagation of fully established leukaemias and markedly reduced the frequency of cancer stem cells (Supplementary Fig. 2). These data show that continual repression of Numb is essential for maintenance of blast crisis CML, and that increasing the levels of Numb can inhibit disease.

Because Numb can antagonize Notch signalling in several systems (refs 13, 18, 19), we tested whether Numb and Notch had a reciprocal relationship in CML. Notch signalling was elevated in blast crisis CML (Supplementary Fig. 3), and its inhibition via dominant negative Xenopus Suppressor of Hairless (dnXSu(H)) delivery or through conditional deletion of Rbpj paralleled the effects of Numb and led to reduced incidence and propagation of blast crisis CML (Supplementary Fig. 4). Furthermore, levels of p53, another Numb target (ref. 20), were higher in Numb-expressing blast crisis CML (Supplementary Fig. 5a). In the absence of p53, Numb was unable to affect leukaemic cell growth in vivo or in vitro (Supplementary Fig. 5b–f), indicating that Numb's effects are in part dependent on p53.

The observation that Numb repression was critical for the maintenance of blast crisis CML led us to seek the mechanism by which Numb may be downregulated in this context. We focused on the RNA-binding protein Musashi (Msi), which has been shown in the nervous system to repress Numb by binding the 3′ untranslated region (UTR) of the transcript (ref. 21). Msi was originally identified in Drosophila as a regulator of asymmetric division (refs 5, 22) and its expression has been associated with stem and progenitor cells in several tissues (refs 23, 24). In the haematopoietic system we found that Msi2 was expressed at much higher levels than Msi1 (Fig. 2a), and was particularly elevated in stem cells (Fig. 2b). Paralleling this, Msi2 expression was tenfold higher in the more immature blast crisis CML (Fig. 2c); this pattern held true even in matched lineage-negative fractions (Fig. 2d), indicating that Msi2 upregulation in the advanced phase

1 Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, North Carolina 27710, USA.
2 Department of Pathology, Duke University Medical Center, Durham, North Carolina 27710, USA.
3 Department of Haematology, Imperial College London, Hammersmith Hospital, London W12 0NN, UK.
4 Division of Hematology, Seoul St Mary's Hospital, The Catholic University of Korea, Seoul, Korea.
5 Department of Haematology, Singapore General Hospital, Cancer and Stem Cell Biology Program, Duke-NUS Graduate Medical School, Singapore.
6 Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA.
7 James P. Wilmot Cancer Center, University of Rochester School of Medicine, Rochester, New York 14642, USA.
*These authors contributed equally to this work.

is not simply a consequence of altered cellular composition. Finally, expression of Msi2 was most enriched in the lineage-negative fraction of blast crisis CML (Fig. 2e). These data indicate that Msi2 expression associates predominantly with normal haematopoietic stem cells and the most immature fraction of leukaemic cells.

Because Msi2 and Numb were expressed in a reciprocal pattern, we tested whether Msi2 could repress Numb during leukaemogenesis. Expression of Msi2 in chronic phase CML cells led to downregulation of Numb (Fig. 2f, g). Furthermore, NUP98–HOXA9 could also activate this cascade by increasing expression of Msi2 (Fig. 2h). Because NUP98–HOXA9 initiates transformation through HoxA9-mediated DNA binding and transcription, we tested whether HoxA9 could bind the Msi2 promoter and activate its expression directly. Chromatin immunoprecipitation revealed that HoxA9 was associated with the putative HoxA9-binding element we identified at −5.7 kb (Fig. 2i–l). NUP98–HOXA9 expression was also able to induce Msi2 reporter activity in KLS cells (Fig. 2m and Supplementary Fig. 6a). These data show that Msi2 can be upregulated by NUP98–HOXA9 and subsequently contribute to blast crisis CML by repressing Numb.

To test if Msi2 is required for the development of blast crisis CML, we used a mouse in which the Msi2 gene was disrupted by a gene-trap (Gt) vector (ref. 25) (Supplementary Fig. 6a, b). Msi2 mutant mice were viable, albeit smaller and less frequent than predicted (Msi2+/+:Msi2+/Gt:Msi2Gt/Gt = 38:66:19, P = 0.038), and showed a two- to threefold reduction in the frequency (Fig. 3a) and absolute number (data not shown) of KLS cells. Additionally, loss of Msi2 led to significantly impaired leukaemia growth in vivo (Fig. 3b, 93% for control versus 57% for Msi2Gt/Gt).

To determine whether inhibiting Msi2 could have an impact on the growth of established CML, and to rule out the possibility that the reduced incidence of leukaemia in gene-trap mutants was due to developmental defects, Msi expression was targeted using an alternative short hairpin (sh)RNA approach (Supplementary Fig. 7a). Delivery of Msi2 shRNAs (shMsi) into established blast crisis CML cells increased Numb expression (data not shown) and reduced leukaemia growth in vitro (Fig. 3c and Supplementary Fig. 7b–e) and in vivo (Fig. 3d). Further, the majority of leukaemias that occurred in the presence of shMsi were more differentiated (Fig. 3e) and impaired in their ability to propagate disease (Fig. 3f, 88% control versus 25% shMsi). These data show that Msi2 is important for the establishment and continued propagation of blast crisis CML.

Finally, we examined whether MSI2 was aberrantly upregulated during human leukaemia progression. MSI2 was tracked in 30 patient samples from repositories in Korea and the United Kingdom, and found to be expressed at significantly higher levels in blast crisis

Figure 1 | Expression of Numb impairs blast crisis CML development. a, b, CML cells were immunostained with anti-Numb antibody (red) and 4′,6-diamidino-2-phenylindole (DAPI, green pseudocolour) (a), and fluorescence intensity was quantified, *P < 0.05 (b). a.u., arbitrary units. c, CML cells were analysed by western blot for Numb expression. d, Cells infected with BCR–ABL, NUP98–HOXA9 and either control vector or Numb were transplanted and survival was monitored (control, n = 18; Numb, n = 19). e, f, Representative (e) and average frequency of Lin− cells (f) from control or Numb expressing leukaemias. **P < 0.001. g, Donor-derived cells from primary leukaemias were serially transplanted and survival monitored (vector, n = 14; Numb, n = 15; **P < 0.001). h–j, Haematoxylin-and-eosin-stained spleen sections from control vector (h) or Numb expressing leukaemias (i) or surviving mice (j). Immature myeloid cells (red arrowheads) and lymphoid follicles (black arrowheads) are indicated. Original magnification, ×10. Error bars in all bar graphs are s.e.m. Data shown are representative of three to four independent experiments.

Figure 2 | The RNA-binding protein Musashi is highly expressed in immature normal and leukaemic cells and is regulated by HoxA9. a, Musashi (Msi) expression in whole bone marrow (WBM), KLS cells, chronic and blast crisis CML, olfactory bulb (OB), −reverse transcriptase (−RT in OB) and water. Tbp, TATA-binding protein. b–e, Real-time RT–PCR analysis of Msi2 expression in KLS cells (n = 3) and Lin+ cells (n = 2) (b), blast crisis phase (n = 9) and chronic phase (n = 6) (c), Lin− chronic and blast crisis phase cells relative to normal KLS and Lin+ cells (Lin+, n = 2 and others, n = 3) (d), and Lin− (n = 5) or Lin+ (n = 5) blast crisis CML cells (e). Error bars represent s.e.m.; *P = 0.039; **P < 0.001. f, g, Control vector- or Msi2-expressing CML cells were stained with anti-Numb antibody (red) and DAPI (green pseudocolour) (f), and fluorescence intensity was quantified (g). **P < 0.001. h, Msi2 expression in KLS cells transduced with either control vector or NUP98–HOXA9 retrovirus along with BCR–ABL. *P = 0.017. i–l, HoxA9 binds to the Msi2 promoter. Murine Msi2 gene structure: numbered boxes indicate exons; transcription start site (TSS) and the direction of transcription are indicated by +1 and the black flag, respectively; the oval indicates putative HOX binding element 5.7 kb upstream of TSS; the open rectangle indicates +110 kb site with no HoxA9 binding sequence. ChIP was performed either with IgG control or anti-HoxA9 antibody for Flt3, a known HoxA9 target gene, as a positive control (j), and for Msi2 −5.7 kb region (k) or Msi2 +110 kb region (l). m, KLS cells from Msi2 gene-trap reporter mice were transduced with BCR–ABL and either control vector or NUP98–HOXA9, and β-galactosidase reporter activity was quantified (n = 2 each; *P = 0.011). a.u., arbitrary units.
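The genotype ratio reported for the Msi2 gene-trap crosses (Msi2+/+:Msi2+/Gt:Msi2Gt/Gt = 38:66:19, P = 0.038) can be checked against Mendelian expectation. The sketch below assumes a chi-square goodness-of-fit test against a 1:2:1 ratio with two degrees of freedom. The text does not name the test that was used, so this choice is an assumption, although it reproduces the reported P value. The function name is hypothetical.

```python
import math


def chi_square_1_2_1(wild_type, heterozygous, homozygous):
    """Chi-square goodness-of-fit of observed genotype counts to a 1:2:1 ratio.

    Assumption: this is an illustrative test choice, not one stated in the text.
    """
    observed = (wild_type, heterozygous, homozygous)
    total = sum(observed)
    expected = (total / 4, total / 2, total / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With exactly 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-chi2 / 2), so no statistics library is needed.
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


chi2, p_value = chi_square_1_2_1(38, 66, 19)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")  # chi2 = 6.53, P = 0.038
```

The deficit comes mainly from the homozygous class (19 observed against 30.75 expected), in line with the statement that mutant mice were less frequent than predicted.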

CML (Fig. 4a, b). To determine if this reflected a general pattern in human CML progression, we examined the expression of MSI2 and associated genes in 90 patient samples from banks in the United States (ref. 26). Microarray analysis revealed a marked upregulation of MSI2 in every patient during CML progression (Fig. 4c). Furthermore, NUMB was downregulated in a majority of blast crisis patients (Fig. 4d). Notably, our mouse model was driven by NUP98–HOXA9 as a second hit, whereas human blast crisis CML patients harbour a variety of secondary mutations. Because Msi2 could be regulated by HoxA9 expression in the mouse model of CML, we examined whether HOXA9 was upregulated in blast crisis CML samples. The observation that a majority of patient samples had elevated levels of HOXA9 (Fig. 4e) may explain how MSI2 becomes upregulated in advanced stage disease regardless of the nature of the second hit. Notch signalling targets HES1 and TRIB2 were also elevated in a number of blast crisis patient samples (Fig. 4f and Supplementary Fig. 8), consistent with other independent reports (ref. 27).

Because the highest MSI2 expression was observed in blast crisis patients, where treatment outcomes are extremely poor, and because a range of expression was observed in both chronic and accelerated phase CML, we tested whether MSI2 expression correlated with outcome after allogeneic transplantation. Patients were divided into two groups based on median expression of MSI2. Among 37 chronic phase patients with available outcomes (9 relapses), increased MSI2 expression was associated with a higher risk of relapse (hazard ratio = 4.35; 95% confidence interval, 0.90–21.06, P = 0.07). Additionally, among 13 accelerated phase patients with available outcomes (6 deaths and 3 relapses), increased MSI2 expression was not only associated with higher risk of relapse (all relapses occurred in the increased MSI2 group, P = 0.06) but also with higher risk of death (hazard ratio = 6.76; 95% CI, 0.78–58.57, P = 0.08). The association of MSI2 with poorer outcomes indicates that MSI2 may be an early marker of advanced CML disease.

Our work identifies the Musashi–Numb axis as an important regulator of myeloid leukaemia and indicates that maintenance of the immature state is dependent on reversal of classical differentiation cues. Specifically, we find that MSI2 is upregulated and NUMB downregulated as chronic phase CML progresses to blast crisis, and that modulation of this pathway can inhibit disease (Fig. 4g). Although previous work has implicated Musashi and Numb in normal development (refs 13, 14, 23, 24), to our knowledge this is the first demonstration that this pathway is required for haematological malignancy.

Our previous work showing that whereas BCR–ABL cannot affect the choice between asymmetric and symmetric division, NUP98–HOXA9 can trigger a bias towards symmetric renewal (ref. 15), had led us to propose that regulators of asymmetric division might regulate leukaemic differentiation, and could thus be targets for therapy in advanced myeloid leukaemia. Our current work supports this and shows that Numb, which drives commitment and differentiation, can impair blast crisis CML establishment and propagation. It should be noted that just as Numb's influence may be mediated through p53 and/or Notch signalling (refs 12, 13, 20), Musashi may act through Numb as well as other targets such as p21WAF1 (refs 21, 28).

Because blast crisis CML is uniformly resistant to current treatments, it is critical to identify new pathways that drive this aggressive disease. In that context, our work is important because it shows that specific differentiation cues associated with the Musashi–Numb cascade can unlock the differentiation potential of blast crisis CML and impair its growth. These data, together with the fact that Musashi seems to be an early marker of advanced CML, indicate that its expression could serve as a prognostic tool, and that targeting it might represent a new approach to therapy. Finally, reports of increased expression of Musashi in glioblastoma (ref. 29) and decreased expression of NUMB in high-grade breast cancer (ref. 30) raise the possibility that this pathway may also be relevant in solid cancers.

Figure 3 | Loss of Musashi impairs the development and propagation of blast crisis CML. a, Representative FACS plots showing frequency of KLS cells in mice of the indicated genotypes (Msi2+/+, n = 4; Msi2+/Gt, n = 3; Msi2Gt/Gt, n = 4). b, Survival curve of mice transplanted with BCR–ABL- and NUP98–HOXA9-infected Msi2+/+ or Msi2Gt/Gt KLS cells (Msi2+/+, n = 15; Msi2Gt/Gt, n = 14; *P = 0.0159). c, Colony-forming ability of blast crisis CML cells transduced with control shRNA (shLuc) or Msi2 shRNA (shMsi). Error bars represent s.e.m. **P < 0.001. d, Survival curve of mice transplanted with established blast crisis CML cells infected with control shLuc or shMsi (n = 13 each; *P = 0.0267). e, Wright's stain of leukaemic cells from mice transplanted with control shLuc- or shMsi-infected blast crisis CML. Immature myeloblasts, filled arrowheads; differentiating myelocytes and mature band cells, open arrowheads. Original magnification, ×100. f, Survival curve of mice transplanted with Lin− cells from primary shRNA-expressing leukaemias (n = 16 each; **P < 0.001). Data shown is representative of two to three independent experiments.

Figure 4 | Musashi expression is upregulated during human CML progression. a, b, PCR analysis of MSI2 expression in chronic and blast crisis CML patient samples from the Korean Leukaemia Bank, Korea (n = 9 per cohort, Mann–Whitney U-test, **P < 0.001) (a), and the Hammersmith MRD Lab Sample Archive, United Kingdom (n = 6 per cohort, Mann–Whitney U-test, **P < 0.001) (b). Error bars represent s.e.m. c–f, Microarray analysis of expression of MSI2 (c), NUMB (d), HOXA9 (e) (all P < 0.001) and HES1 (f) (P = 0.68) in bone marrow and peripheral blood samples from 42 chronic (red), 17 accelerated (green) and 31 blast crisis phase (blue) patients in the United States. g, Proposed model for the role of MSI2 and NUMB in CML progression.

METHODS SUMMARY

Mouse models of CML were generated by transducing bone marrow stem and progenitor cells with retroviruses carrying BCR–ABL (chronic phase) or BCR–ABL and NUP98–HOXA9 (blast crisis phase) and transplanting them into irradiated recipient mice. The development of CML was confirmed by flow cytometry and histopathology. For Msi2 knockdown experiments, lineage-negative blast crisis CML cells were infected with Msi2 or control Luciferase shRNA retroviral constructs and leukaemia incidence monitored. Chromatin immunoprecipitation (ChIP) assays were performed using the myeloid leukaemia cell line M1. DNA was crosslinked and immunoprecipitated with control or anti-HOXA9 antibodies and analysed by PCR for regions of interest. CML patient samples were obtained from the Korean Leukaemia Bank (Korea), the Hammersmith MRD Lab Sample Archive (United Kingdom), the Fred Hutchinson Cancer Research Center (United States) and the Singapore General Hospital (Singapore). Gene expression in human chronic and blast crisis CML was analysed by PCR or by DNA microarrays.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 7 August 2008; accepted 13 May 2010. Published online 18 July 2010.

1. Calabretta, B. & Perrotti, D. The biology of CML blast crisis. Blood 103, 4010–4022 (2004).
2. Daley, G. Q., Van Etten, R. A. & Baltimore, D. Induction of chronic myelogenous leukemia in mice by the P210bcr/abl gene of the Philadelphia chromosome. Science 247, 824–830 (1990).
3. Pear, W. S. et al. Efficient and rapid induction of a chronic myelogenous leukemia-like myeloproliferative disease in mice receiving P210 bcr/abl-transduced bone marrow. Blood 92, 3780–3792 (1998).
4. Uemura, T., Shepherd, S., Ackerman, L., Jan, L. Y. & Jan, Y. N. numb, a gene required in determination of cell fate during sensory organ formation in Drosophila embryos. Cell 58, 349–360 (1989).
5. Nakamura, M., Okano, H., Blendy, J. A. & Montell, C. Musashi, a neural RNA-binding protein required for Drosophila adult external sensory organ development. Neuron 13, 67–81 (1994).
6. Mayotte, N., Roy, D. C., Yao, J., Kroon, E. & Sauvageau, G. Oncogenic interaction between BCR-ABL and NUP98-HOXA9 demonstrated by the use of an in vitro purging culture system. Blood 100, 4177–4184 (2002).
7. Dash, A. B. et al. A murine model of CML blast crisis induced by cooperation between BCR/ABL and NUP98/HOXA9. Proc. Natl Acad. Sci. USA 99, 7622–7627 (2002).
8. Witte, O. The role of Bcr-Abl in chronic myeloid leukemia and stem cell biology. Semin. Hematol. 38, 3–8 (2001).
9. Ren, R. Mechanisms of BCR-ABL in the pathogenesis of chronic myelogenous leukaemia. Nature Rev. Cancer 5, 172–183 (2005).
10. Melo, J. V. & Barnes, D. J. Chronic myeloid leukaemia as a model of disease evolution in human cancer. Nature Rev. Cancer 7, 441–453 (2007).
11. Goldman, J. M. & Melo, J. V. BCR-ABL in chronic myelogenous leukemia–how does it work? Acta Haematol. 119, 212–217 (2008).
12. Knoblich, J. A. Mechanisms of asymmetric cell division during animal development. Curr. Opin. Cell Biol. 9, 833–841 (1997).
13. Spana, E. P. & Doe, C. Q. Numb antagonizes Notch signaling to specify sibling neuron cell fates. Neuron 17, 21–26 (1996).
14. Shen, Q., Zhong, W., Jan, Y. N. & Temple, S. Asymmetric Numb distribution is critical for asymmetric cell division of mouse cerebral cortical stem cells and neuroblasts. Development 129, 4843–4853 (2002).
15. Wu, M. et al. Imaging hematopoietic precursor division in real time. Cell Stem Cell 1, 541–554 (2007).
16. Wang, H., Ouyang, Y., Somers, W. G., Chia, W. & Lu, B. Polo inhibits progenitor self-renewal and regulates Numb asymmetry by phosphorylating Pon. Nature 449, 96–100 (2007).
17. Neering, S. J. et al. Leukemia stem cells in a genetically defined murine model of blast-crisis CML. Blood 110, 2578–2585 (2007).
18. Justice, N., Roegiers, F., Jan, L. Y. & Jan, Y. N. Lethal giant larvae acts together with numb in notch inhibition and cell fate specification in the Drosophila adult sensory organ precursor lineage. Curr. Biol. 13, 778–783 (2003).
19. Wakamatsu, Y., Maynard, T. M., Jones, S. U. & Weston, J. A. NUMB localizes in the basal cortex of mitotic avian neuroepithelial cells and modulates neuronal differentiation by binding to NOTCH-1. Neuron 23, 71–81 (1999).
20. Colaluca, I. N. et al. NUMB controls p53 tumour suppressor activity. Nature 451, 76–80 (2008).
21. Imai, T. et al. The neural RNA-binding protein Musashi1 translationally regulates mammalian numb gene expression by interacting with its mRNA. Mol. Cell. Biol. 21, 3888–3900 (2001).
22. Okabe, M., Imai, T., Kurusu, M., Hiromi, Y. & Okano, H. Translational repression determines a neuronal potential in Drosophila asymmetric cell division. Nature 411, 94–98 (2001).
23. Sakakibara, S. et al. RNA-binding protein Musashi family: roles for CNS stem cells and a subpopulation of ependymal cells revealed by targeted disruption and antisense ablation. Proc. Natl Acad. Sci. USA 99, 15194–15199 (2002).
24. Okano, H. et al. Function of RNA-binding protein Musashi-1 in stem cells. Exp. Cell Res. 306, 349–356 (2005).
25. Taniwaki, T. et al. Characterization of an exchangeable gene trap using pU-17 carrying a stop codon-βgeo cassette. Dev. Growth Differ. 47, 163–172 (2005).
26. Radich, J. P. et al. Gene expression changes associated with progression and response in chronic myeloid leukemia. Proc. Natl Acad. Sci. USA 103, 2794–2799 (2006).
27. Nakahara, F. et al. Hes1 immortalizes committed progenitors and plays a role in blast crisis transition in chronic myelogenous leukemia. Blood 115, 2872–2881 (2010).
28. Battelli, C., Nikopoulos, G. N., Mitchell, J. G. & , J. M. The RNA-binding protein Musashi-1 regulates neural development through the translational repression of p21WAF-1. Mol. Cell. Neurosci. 31, 85–96 (2006).
29. Liu, G. et al. Analysis of gene expression and chemoresistance of CD133+ cancer stem cells in glioblastoma. Mol. Cancer 5, 67 (2006).
30. Pece, S. et al. Loss of negative regulation by Numb over Notch is relevant to human breast carcinogenesis. J. Cell Biol. 167, 215–221 (2004).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank A. M. Pendergast, J. Chute, K. Itahana, L. Penalva and L. Grimes for advice and reagents; K.-i. Yamamura for the Msi2 gene-trap mice; T. Honjo for the Rbpj conditional mice; N. Gaiano for the TNR mice; D. Baltimore for the lentiviral shRNA constructs; and A. Means and B. Hogan for comments on the manuscript. We also thank M. Cook, B. Harvat and L. Martinek for cell sorting; M. Fereshteh for advice on analysis of patient samples; D. McDonnell and H. Wade for advice on ChIP experiments; S. W. Tian for help in collecting patient samples and A. Chen and S. Honeycutt for technical help. The BCR–ABL construct was a gift from W. Pear and the NUP98–HOXA9 construct a gift from G. Gilliland. T.I. is the recipient of a postdoctoral fellowship from the Astellas Foundation for Research on Metabolic Disorders, K.L.C. is the recipient of an American Heart Association predoctoral award, B.Z. received support from T32 GM007184-33 and T.R. is the recipient of a Leukemia and Lymphoma Society Scholar Award. This work was also supported by an LLS Translational Research grant and an ASH Junior Faculty Award to V.G.O., as well as NIH grants CA18029 to J.P.R., CA140371 to V.G.O., CA122206 to C.T.J. and DK63031, DK072234, AI067798, HL097767, DP1OD006430 and an Alexander and Margaret Stewart Fund grant to T.R. We are grateful for the support received from the Lisa Stafford Research Prize.

Author Contributions T.I. and H.Y.K. designed the research, performed the majority of the experiments and helped write the paper. B.Z., K.L.C., J.B., W.E.L. and C.Z. provided experimental data and help; A.L. provided histopathological analysis; C.T.J., G.G., L.F., J.G., H.G., S.-H.K., D.-W.K. and C.C. provided human patient samples and experimental advice; T.I., H.Y.K., G.G. and B.Z. defined gene expression in patient samples by PCR; and V.G.O. and J.P.R. carried out all microarray and patient outcome analyses. T.R. conceived of the project, planned and guided the research, and wrote the paper.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to T.R. ([email protected]).


METHODS

Mice. C57BL6/J and BA (C57BL/Ka-Thy1.1) mice were used as transplant donors, and B6-CD45.1 (B6.SJL-Ptprca Pepcb/BoyJ) and HZ (C57BL/Ka-Thy1.1-CD45.1) mice were used as transplant recipients. All mice were 8–16 weeks of age. Msi2 mutant mice, B6;CB-Msi2Gt(pU-21T)2Imeg, were made and established by gene-trap mutagenesis (CARD, Kumamoto University). Floxed Rbpj mice, B6.Cg-Rbpsuhtm3Kyo, were from RIKEN BioResource Center (RBRC01071), and crossed with Vav-cre transgenic mice31,32. Mice were bred and maintained in the animal care facility at Duke University Medical Center. All animal experiments were performed according to protocols approved by the Duke University Institutional Animal Care and Use Committee.

Cell isolation and FACS analysis. Haematopoietic stem cells were sorted from mouse bone marrow essentially as described31. c-Kit-positive cells were enriched by staining whole bone marrow with anti-CD117/c-Kit microbeads and isolating positively labelled cells with autoMACS cell separation (Miltenyi Biotec). For lineage analysis peripheral blood cells were obtained by submandibular bleeding and diluted in 0.5 ml of 10 mM EDTA in PBS. 1 ml of 2% dextran was then added to each sample, and red blood cells depleted by sedimentation for 45 min at 37 °C. Red blood cells were lysed using RBC Lysis Buffer (eBioscience) before staining for lineage markers. The following antibodies were used to define the lineage positive cells in leukaemic samples: 145-2C11 (CD3ε), GK1.5 (CD4), 53-6.7 (CD8), RB6-8C5 (Ly-6G/Gr-1), M1/70 (CD11b/Mac-1), TER119 (Ly-76/TER119) and 6B2 (CD45R/B220). Other antibodies used for haematopoietic stem cell sorts included 2B8 (CD117/c-Kit) and D7 (Ly-6A/E/Sca-1). All antibodies were purchased from BD Pharmingen or eBioscience. Analysis and cell sorting were carried out on a FACSVantage SE, FACStar, FACSCanto II, or FACSDiva (all from Becton Dickinson) at the Duke Comprehensive Cancer Center Flow Core Facility, and data were analysed with FlowJo software (Tree Star Inc.).

Retroviral constructs and production. BCR-ABL was cloned into MSCV-IRES-GFP, -YFP or -CFP retroviral vector. NUP98-HOXA9 was cloned into the MSCV-IRES-YFP or -tNGFR vector. Numb cDNA (p65 isoform, NCBI accession number BC033459) was cloned into the MSCV-IRES-GFP vector. Msi2 cDNA (IMAGE clone ID 40045350) was purchased from Open Biosystems, and its protein coding region was cloned into MSCV-IRES-GFP or MSCV-IRES-CFP. Short hairpin RNA (shRNA) constructs were designed and cloned in MSCV/LTRmiR30-PIG (LMP) vector from Open Biosystems according to their instructions. The target sequences are 5′-CCCAGATAGCCTTAGAGACTAT-3′ for Msi2 and 5′-CTGTGCCAGAGTCCTTCGATAG-3′ for firefly luciferase as a negative control. MSCV-IRES-CFP with Msi2 mutant cDNA resistant to shMsi was constructed by inverse PCR strategy using primers with silent mutations (underlined) in the shMsi target sequence: 5′-CCTGACTCTCTGAGGGACTATTTTAGCAAATTTGG-3′. Lentiviral shRNA construct with the alternative Msi2 target sequence, 5′-AGTTAGATTCCAAGACGA-3′, was cloned in FG12 as described previously33. Virus was produced in 293T cells transfected with viral constructs along with gag-pol, VSV-G and Rev (in case of FG12) constructs. Viral supernatants were collected for 3–5 days and concentrated by ultracentrifugation at 50,000g for 3 h.

In vitro methylcellulose colony formation assays. Lineage negative (Lin⁻), NUP98-HOXA9-IRES-YFP positive cells from blast crisis CML were sorted and infected retrovirally with either Vector-IRES-GFP or Numb-IRES-GFP. After 48 h of infection, cells were sorted and serially plated in complete methylcellulose medium (Methocult GF M3434 from StemCell Technologies). For knockdown experiments, Lin⁻ population in blast crisis CML were sorted and infected with the indicated retroviruses for 48 h. Infected cells were sorted based on their fluorescent protein expression and plated as above. Colonies were counted 5–7 days after plating.

Generation and analysis of leukaemic mice. Bone marrow c-Kit⁺ or KLS cells were sorted and cultured overnight in X-Vivo15 media (BioWhittaker) supplemented with 50 µM 2-mercaptoethanol, 10% fetal bovine serum, stem cell factor (SCF, 100 ng ml⁻¹, R&D Systems) and thrombopoietin (TPO, 20 ng ml⁻¹, R&D Systems). Subsequently, cells were infected with the retroviruses. Viruses used were as follows: MSCV-BCR-ABL-IRES-YFP (or CFP or GFP) to generate chronic phase leukaemia, or MSCV-BCR-ABL-IRES-YFP (or CFP or GFP) and MSCV-NUP98-HOXA9-IRES-tNGFR (or YFP) to generate blast crisis CML. Cells were harvested 48 h after infection and transplanted retro-orbitally into groups of B6-CD45.1 mice. Recipients were lethally irradiated (10 Gy) for chronic phase leukaemia, and sublethally (7 Gy) or non-irradiated for blast crisis CML. For Numb overexpression, cells were infected with either MSCV-Numb-IRES-GFP or MSCV-IRES-GFP along with MSCV-BCR-ABL-IRES-YFP (or CFP) and MSCV-NUP98-HOXA9-IRES-tNGFR (or YFP) and 20,000 to 100,000 infected cells were transplanted per mouse. For secondary transplantation, cells from primary transplanted mice were sorted for either MSCV-Numb-IRES-GFP and MSCV-NUP98-HOXA9-IRES-YFP or MSCV-IRES-GFP and MSCV-NUP98-HOXA9-IRES-YFP, and 7,000 to 8,000 cells were transplanted per mouse. For Msi2 knockdown by retroviral shRNA transduction, the Lin⁻ population from blast crisis CML were sorted and infected with either control shLuc (against luciferase) or shMsi (against Msi2) retrovirus for 48 h. Infected cells were sorted based on their GFP expression, and 1,000 to 3,000 cells were transplanted in sublethally irradiated B6-CD45.1 recipients. After transplantation, recipient mice were maintained on antibiotic water (sulphamethoxazole and trimethoprim) and evaluated daily for signs of morbidity, weight loss, failure to groom and splenomegaly. Premorbid animals were killed and relevant tissues were harvested and analysed by flow cytometry and histopathology.

Immunofluorescence staining. For immunofluorescence, relevant leukaemic cell populations were sorted, cytospun and fixed in 4% paraformaldehyde for 5 min. Samples were then blocked using 20% normal donkey serum in PBS with 0.1% Tween 20, and stained at 4 °C overnight with an antibody followed by Alexafluor-conjugated secondary antibody (Molecular probe) and DAPI. Slides were mounted using mounting media (Fluoromount-G SouthernBiotech) and viewed on the Axio Imager (Zeiss). Antibodies used were as follows: anti-Numb, Ab4147 (Abcam) or C29G11 (Cell Signaling Technology); anti-p53, DO-1 (Thermo Scientific); anti-cleaved Notch1, Val 1744 (Cell Signaling). Fluorescence intensity was analysed using Metamorph software (Molecular Devices).

ChIP assays. To identify potential HOX binding sites in the Msi2 gene upstream promoter region, we used an algorithm ConCise Scanner34 using a combination of the following matrices: V$HOXA9.01, V$HOXB9.01, V$PBX_HOXA9.01, V$HOX_PBX.01, V$MEIS1A_HOXA9.01 and V$MEIS1B_HOXA9.01. The myeloid leukaemia cell line M1 was maintained in RPMI1640 media supplemented with 10% fetal bovine serum, and 1 × 10⁷ cells were subjected to DNA–protein cross-linking. ChIP assays were performed according to a modified protocol based on the ChIP-IT Express kit (Active Motif). PCR primer sequences are as follows: for Msi2 (−5.7 kb site), 5′-TGGACAGCCTCATCCACAGAGCA-3′ and 5′-ACTGTGCTACATTCCCAGCCGCT-3′; for Msi2 (+110 kb site), 5′-GTTCTTAGCTGCCTCTCTCAGA-3′ and 5′-GAACAATGTCTCTGTCAGGCCT-3′; for Flt3, 5′-AGTCAGAAGGGACTGGCTCC-3′ and 5′-GAGTGCTGCTTAGCAGATTACC-3′.

β-Galactosidase reporter gene assays. KLS cells were isolated from Msi2 gene-trap heterozygote bone marrow and infected with MSCV-BCR-ABL-IRES-YFP and MSCV-NUP98-HOXA9-IRES-GFP. GFP and YFP double positive cells were sorted 48 h after infection, and cultured in X-Vivo15 media supplemented with SCF and TPO as described. After 4 days, cells were harvested in reporter lysis reagent, and β-galactosidase activities were analysed by using β-Gal reporter gene assay, Chemiluminescent (Roche Diagnostics).

Real-time and standard RT–PCR analysis. RNA was isolated using RNAqueous-Micro (Ambion), equal amounts of RNAs were converted to cDNA using Superscript II reverse transcriptase (Invitrogen). Quantitative real-time PCRs were performed using an iCycler (BioRad) by mixing cDNAs, iQ SYBRGreen Supermix (BioRad) and gene-specific primers. Results were normalized to the level of β2-microglobulin (B2m, mouse) or β-actin (ACTB, human). Primer sequences are as follows: Numb-F, 5′-ATGAGTTGCCTTCCACTATGCAG-3′; Numb-R, 5′-TGCTGAAGGCACTGGTGATCTGG-3′; Msi1-F, 5′-ATGGATGCCTTCATGCTGGGT-3′; Msi1-R, 5′-CTCCGCTCTACACGGAATTCG-3′; Msi2-F, 5′-TGCCATACACCATGGATGCGT-3′; Msi2-R, 5′-GTAGCCTCTGCCATAGGTTGC-3′; B2m-F, 5′-ACCGGCCTGTATGCTATCCAGAA-3′; B2m-R, 5′-AATGTGAGGCGGGTGGAACTGT-3′; MSI2-F, 5′-GTTATCTGCGAACACAGTAGTG-3′; MSI2-R, 5′-ACCCTCTGTGCCTGTTGGTAG-3′; ACTB-F, 5′-AAGCCACCCCACTTCTCTCTAA-3′; ACTB-R, 5′-AATGCTATCACCTCCCCTGTGT-3′. Human HES1 (Hs00172878_m1) and TRIB2 (Hs00222224_m1) gene levels were analysed with TaqMan Gene Expression Assays.

Human leukaemia specimens and microarray gene expression studies. Chronic and blast crisis CML samples were obtained from the Korean Leukaemia Bank (Korea), the Hammersmith MRD Lab Sample Archive (UK), the Fred Hutchinson Cancer Research Center (USA) and the Singapore General Hospital (Singapore) from Institutional Review Board approved protocols with written informed consent in accordance with the Declaration of Helsinki. Gene expression profiles of CML patient samples have been described previously35. This published data set has been reanalysed to examine expression of MSI2, NUMB, HES1 and HOXA9 in bone marrow and peripheral blood samples from 42 chronic phase, 17 accelerated phase and 31 blast crisis CML patients. The procedures for RNA extraction, amplification, labelling and hybridization, as well as statistical analysis methods for the Rosetta platform, are as previously published35. GenePlus software (Enodar Biologic) was used to determine differential expression between groups (that is, by disease phase); P-values were calculated using gene-by-gene ANOVA and estimating equation techniques were used to calculate the number of false discoveries36.

Statistical analysis. The statistical analysis was carried out using the R language version 2.6.2 (http://www.r-project.org/) and GraphPad Prism software version 4.0c (GraphPad software Inc.).
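The normalization of real-time PCR results to a reference gene (B2m or ACTB) can be illustrated with a short sketch. The Methods do not state the exact formula used, so the comparative Ct (2^-ΔΔCt) calculation below is an assumption, and the function name and all Ct values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative expression by the comparative Ct method (assumed, not stated in the Methods).

    Each argument is a list of replicate Ct values. The target gene is first
    normalized to the reference gene within each condition (delta Ct), and the
    sample is then compared with the control condition (delta-delta Ct).
    """
    delta_sample = mean(ct_target) - mean(ct_reference)
    delta_control = mean(ct_target_ctrl) - mean(ct_reference_ctrl)
    return 2.0 ** -(delta_sample - delta_control)

# Hypothetical triplicate Ct values (illustration only, not data from the paper).
msi2_sample = [24.1, 24.0, 23.9]   # target gene, sample of interest
b2m_sample = [18.0, 18.1, 17.9]    # reference gene, sample of interest
msi2_control = [26.0, 26.1, 25.9]  # target gene, control sample
b2m_control = [18.0, 18.0, 18.0]   # reference gene, control sample

fold = relative_expression(msi2_sample, b2m_sample, msi2_control, b2m_control)
print(f"Relative expression: {fold:.2f}-fold")
```

With these made-up values the target gene crosses threshold two cycles earlier in the sample than in the control after normalization, which corresponds to about a 4-fold higher level if amplification efficiency is assumed to be 100%.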


31. Zhao, C. et al. Loss of β-catenin impairs the renewal of normal and CML stem cells in vivo. Cancer Cell 12, 528–541 (2007).
32. Han, H. et al. Inducible gene knockout of transcription factor recombination signal binding protein-J reveals its essential role in T versus B lineage decision. Int. Immunol. 14, 637–645 (2002).
33. Qin, X. F., An, D. S., Chen, I. S. & Baltimore, D. Inhibiting HIV-1 infection in human T cells by lentiviral-mediated delivery of small interfering RNA against CCR5. Proc. Natl Acad. Sci. USA 100, 183–188 (2003).
34. Jegga, A. G. et al. Detection and visualization of compositionally similar cis-regulatory element clusters in orthologous and coordinately controlled genes. Genome Res. 12, 1408–1417 (2002).
35. Radich, J. P. et al. Gene expression changes associated with progression and response in chronic myeloid leukemia. Proc. Natl Acad. Sci. USA 103, 2794–2799 (2006).
36. Zhao, L. P., Prentice, R. & Breeden, L. Statistical modeling of large microarray data sets to identify stimulus-response profiles. Proc. Natl Acad. Sci. USA 98, 5631–5636 (2001).

©2010 Macmillan Publishers Limited. All rights reserved Vol 466 | 5 August 2010 | doi:10.1038/nature09209 LETTERS

Epigenetic silencing of engineered L1 retrotransposition events in human embryonic carcinoma cells

Jose L. Garcia-Perez1,2, Maria Morell2,3, Joshua O. Scheys4, Deanna A. Kulpa5, Santiago Morell2, Christoph C. Carter4, Gary D. Hammer3,4,5,6, Kathleen L. Collins4,5,7, K. Sue O’Shea3, Pablo Menendez2 & John V. Moran1,4,5,8

Long interspersed element-1 (LINE-1 or L1) retrotransposition continues to affect human genome evolution1,2. L1s can retrotranspose in the germline, during early development and in select somatic cells3–8; however, the host response to L1 retrotransposition remains largely unexplored. Here we show that reporter genes introduced into the genome of various human embryonic carcinoma-derived cell lines (ECs) by L1 retrotransposition are rapidly and efficiently silenced either during or immediately after their integration. Treating ECs with histone deacetylase inhibitors rapidly reverses this silencing, and chromatin immunoprecipitation experiments revealed that reactivation of the reporter gene was correlated with changes in chromatin status at the L1 integration site. Under our assay conditions, rapid silencing was also observed when reporter genes were delivered into ECs by mouse L1s and a zebrafish LINE-2 element, but not when similar reporter genes were delivered into ECs by Moloney murine leukaemia virus or human immunodeficiency virus, suggesting that these integration events are silenced by distinct mechanisms. Finally, we demonstrate that subjecting ECs to culture conditions that promote differentiation attenuates the silencing of reporter genes delivered by L1 retrotransposition, but that differentiation, in itself, is not sufficient to reactivate previously silenced reporter genes. Thus, our data indicate that ECs differ from many differentiated cells in their ability to silence reporter genes delivered by L1 retrotransposition.

Human ECs have a transcription profile similar to human embryonic stem cells, and have been used as a model of early human development9. Previous studies demonstrated that human L1s are expressed in ECs and human embryonic stem cells3,10. We confirmed these findings by conducting L1 expression analyses in male ECs (NTera2D1, 833KE and 2102Ep) and a female EC (PA-1) that exhibits a restricted ectodermal differentiation pattern (Fig. 1a; Supplementary Figs 1, 2a and 2c).

We next assayed a human L1 element (LRE3)11 tagged with different indicator cassettes (mneoI, mneoI/ColE1 or mEGFPI)12–14 for retrotransposition (Supplementary Fig. 3). An inactive L1 (pJM111/L1RPmEGFPI)13,14 served as a negative control. LRE3 retrotransposition was readily detected in HeLa cells, but not ECs (Fig. 1b; Supplementary Figs 2b and 3). Because these assays rely on reporter-gene expression to detect retrotransposition, the above data indicate that L1 retrotransposition is inhibited in ECs. Alternatively, as observed in some experiments with neural progenitor cells5,8, the indicator cassette delivered by L1 retrotransposition may be silenced in ECs. Thus, we isolated genomic DNA from HeLa and PA-1 cells that were transfected either with pLRE3/mEGFPI or pJM111/L1RPmEGFPI seven days post-transfection12–14. PCR revealed the unspliced (vector) and spliced (retrotransposition) products in pLRE3/mEGFPI-transfected HeLa cells, but only the unspliced product in pJM111/L1RPmEGFPI-transfected HeLa cells (Fig. 1c and Supplementary Fig. 3). We also observed the spliced product in pLRE3/mEGFPI-transfected PA-1 cells (Fig. 1c), suggesting that the retrotransposed EGFP reporter gene (L1-retro-EGFP) was not expressed from the PA-1 genome.

To dissect the mechanism of L1-retro-EGFP silencing, we transfected cells with pLRE3/mEGFPI. Seven days later, we treated cells with the histone deacetylase inhibitor trichostatin A (TSA) for 14 h (Fig. 2a)5,8. Flow cytometry revealed a modest increase in the number of EGFP-positive cells after TSA treatment of HeLa cells (1.3% versus 2.6%; Fig. 2a). In contrast, we observed a marked increase of L1-retro-EGFP expression after TSA treatment of PA-1 and 2102Ep cells (roughly 22-fold and 12-fold, respectively; Fig. 2a). We observed a similar response in 833KE cells; however, we did not readily detect retrotransposition in NTera2D1 cells (Supplementary Fig. 4a, b and data not shown). We saw reactivation of L1-retro-EGFP expression on treatment of PA-1 cells with sodium butyrate and valproic acid, but not on treatment with 5-azacytidine (Supplementary Fig. 4c). Controls revealed that TSA treatment reactivated existing L1-retro-EGFP events and did not result in a burst of L1 retrotransposition (Supplementary Fig. 4d–f). Thus, several ECs accommodate L1 retrotransposition, but the resultant L1-retro-EGFP events undergo efficient silencing.

We also observed efficient silencing in PA-1 cells when the cytomegalovirus immediate-early (CMV) promoter driving EGFP expression was replaced with the mouse phosphoglycerate kinase-1 (pgk) promoter, and when the SV40 polyadenylation signal was removed from the L1 expression construct (Supplementary Table 1)13,14. Similarly, we observed efficient L1-retro-EGFP silencing when the cassette was delivered by a mouse L1 (TGF21)15, a synthetic mouse L1 (L1SM)16 or a zebrafish LINE-2 element that retrotransposes at a low level in human cells17. In each instance, TSA treatment reactivated the silenced L1-retro-EGFP cassette (Supplementary Table 1, Supplementary Fig. 4h, i and data not shown). Thus, the establishment of L1-retro-EGFP silencing appears to be independent of viral sequences or sequences within the engineered LINE constructs.

Retroviral insertions can also be efficiently silenced in ECs18–21. To determine whether the kinetics of retroviral and L1-retro-EGFP silencing are similar, we infected PA-1 cells with a human immunodeficiency

1Department of Human Genetics, 1241 East Catherine Street, University of Michigan Medical School, Ann Arbor, Michigan 48109-5618, USA. 2Andalusian Stem Cell Bank, Consejeria de Salud Junta de Andalucia, Center for Biomedical Research, University of Granada, Granada 18100, Spain. 3Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 4Cellular and Molecular Biology Program, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 5Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 6Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 7Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. 8Howard Hughes Medical Institute, Chevy Chase, Maryland 20815-6789, USA.

virus (HIV89.6ΔEnv) or a replication-deficient Moloney murine leukaemia retrovirus carrying an EGFP reporter gene. We then treated the cells with or without TSA seven days post-infection. Flow cytometry revealed that TSA treatment modestly increased the number of EGFP-positive PA-1 cells in the retroviral-based experiments, although the extent of reactivation was not as pronounced as in the L1-retro-EGFP experiments (roughly 2-fold in the human immunodeficiency virus experiment or roughly 3-fold in the Moloney murine leukaemia virus experiment versus more than 20-fold in the L1 experiments; Fig. 2b and Supplementary Table 1). Controls demonstrated that transfection of PA-1 or 2102Ep cells with a linearized neomycin or hygromycin expression plasmid readily led to the formation of drug-resistant foci (Supplementary Fig. 4g and data not shown). Thus, the efficiency of EGFP reporter-gene silencing seems to depend on the mechanism of integration.

We next characterized 36 clonal PA-1 cell lines containing at least one silenced L1-retro-EGFP event (see Supplementary Methods). Thirty-three cell lines exhibited efficient silencing, and we detected EGFP-positive cells only on TSA treatment (for example, pk-5; Fig. 3a). Three cell lines (for example, pk-87; Supplementary Fig. 5) exhibited only modest L1-retro-EGFP silencing, although TSA treatment increased the number of EGFP-positive cells (Supplementary Fig. 5). Characterization of nine retrotransposition events revealed that six occurred either within known genes or in genomic regions associated with expressed sequence tags (Supplementary Table 2), which is consistent with previous studies in cultured cells3,5,8,12,13.

We analysed the pk-5 clonal cell line in greater detail. Southern-blot and inverse-PCR (ref. 3) analyses revealed the presence of a single full-length L1-retro-EGFP event on chromosome 12q21.1 (Fig. 3a, b; Supplementary Fig. 6a). Treating pk-5 cells with TSA (Fig. 3a; Supplementary Movie), sodium butyrate, or valproic acid (Supplementary Fig. 6b, see 24-h panels; and data not shown) reactivated the silenced L1-retro-EGFP event. Additional experiments revealed that L1-retro-EGFP reactivation did not require cell division (Supplementary Fig. 7), and that withdrawal of histone deacetylase inhibitors led to a steady decrease in the number of EGFP-positive cells over a 120-h period (Fig. 3c; Supplementary Fig. 6b). Thus, the maintenance of L1-retro-EGFP silencing probably requires the presence of active histone deacetylases. The slower kinetics required to re-establish the silenced state in pk-5 cells may reflect the half-life of the EGFP protein (roughly 20 h)22.

To test whether reactivation of L1-retro-EGFP expression is correlated with histone modifications at the L1 integration site, we performed chromatin immunoprecipitation on naive and TSA-treated pk-5 cells, using antibodies diagnostic of transcriptionally active (acetylated histone-H4; H4ac) and transcriptionally repressed (dimethyl histone-H3-Lys9; H3K9me2) chromatin23. Quantitative-PCR experiments revealed a roughly 9-fold increase in the number of EGFP sequences precipitated using the H4ac antibody in TSA-treated pk-5 cells when compared to the untreated cell line, and a roughly 7-fold decrease in the number precipitated using the H3K9me2 antibody in TSA-treated pk-5 cells when compared to the untreated cell line (Fig. 3d). Thus, reactivation of L1-retro-EGFP expression is accompanied by histone modifications, indicating that silencing is principally mediated at the chromatin level.

Previous studies indicated that the silencing of retroviral sequences is attenuated in differentiating cells19–21. To test whether differentiation affects L1-retro-EGFP silencing, we transfected PA-1 cells with pLRE3/mEGFPI. We grew the cells for seven days in standard medium (10% fetal bovine calf serum; FBS) or medium that promotes differentiation (see Supplementary Methods), and then treated them with or without TSA to assay for L1-retro-EGFP silencing. TSA treatment resulted in similar numbers of EGFP-positive cells whether cells were grown in 10% FBS or in differentiation medium, indicating that the growth medium did not dramatically affect L1 retrotransposition (Fig. 4a; Supplementary Fig. 8a, b). However, we readily detected EGFP-positive cells in differentiation medium without TSA treatment (roughly 10% of cells grown in differentiation medium versus less than 0.3% of cells grown in 10% FBS; Fig. 4a; Supplementary Fig. 8a, b). Controls verified that the majority of EGFP-positive PA-1 cells identified in differentiation medium stained negatively for the transcription factor Oct4 and positively for the epithelial cell surface marker Lu5 (Supplementary Fig. 8c–e). We obtained similar results from experiments using a human L1 (pJM101/LRE3)3 or a codon-optimized mouse L1 (pCEPL1SM)16 containing the mneoI retrotransposition indicator cassette (Supplementary Figs 3 and 8f). Thus,

Figure 1 | L1 expression and retrotransposition in EC cells. a, Assay showing that ECs express endogenous L1-encoded protein ORF1p. The ribosomal S6 protein is a loading control. MW, molecular mass standards in kDa. b, Results of the retrotransposition assay in HeLa and EC cells. G418-resistant foci that expressed the retrotransposed NEO reporter gene were stained for visualization. UTR, untranslated region; EN, endonuclease; RT, reverse transcriptase; C, cysteine-rich domain; ColE1, bacterial origin of replication; An, poly(A) tail; SD, splice donor; SA, splice acceptor. c, PCR assay for intron removal (retrotransposition) in both HeLa and PA-1 cells. LRE3, a retrotransposition-competent L1; JM111, a retrotransposition-defective L1; MW, 1-kb molecular mass ladder. (−), PCR reaction conducted without template; (+), a positive control PCR conducted with vector DNA.

Figure 2 | Engineered L1 retrotransposition events are efficiently silenced in EC cells. a, Top, cartoon of an L1 and the experimental rationale. Bottom, cells transfected with an RC-L1 reporter plasmid (kpLRE3/mEGFPI, top and middle panels; cpLRE3/mEGFPI, bottom panel) and left untreated (left panel) or treated with TSA (right panel). The percentage of EGFP-positive cells and standard deviation (n = 3) is indicated. Hoechst staining (blue) highlights the nuclei of cells. P, experiments where puromycin was used to select for the episomal L1 expression plasmid; IHDAC, histone deacetylase inhibitors; FACS, fluorescence-activated cell sorting. Scale bars, 100 µm. b, Retroviral-based EGFP insertions are not as efficiently silenced as L1-retro-EGFP insertions in PA-1 cells (see Methods). Top, cartoon showing the experimental rationale. Bottom, graphs indicating the percentage of EGFP-positive cells and the standard deviation (n = 3). The x axes on the graph indicate microlitres of retroviral supernatant added to PA-1 cells and if PA-1 cells were treated with TSA. LTR, long terminal repeat; IRES, internal ribosome entry site. Ψ, packaging signal.

Panel a values (EGFP-positive cells, %, mean ± s.d.): HeLa, 1.30 ± 0.10 untreated and 2.63 ± 0.15 with TSA; PA-1, 0.30 ± 0.01 untreated and 6.63 ± 0.10 with TSA; 2102Ep, 0.79 ± 0.15 untreated and 9.59 ± 0.80 with TSA.
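The fold increases quoted in the text (roughly 22-fold for PA-1 and 12-fold for 2102Ep, against a modest change in HeLa) follow directly from the mean percentages of EGFP-positive cells shown in the Fig. 2a panels. The short sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative choices, and only the mean values are taken from the figure.

```python
# Mean percentage of EGFP-positive cells (untreated, TSA-treated) from the Fig. 2a panels.
percent_egfp = {
    "HeLa": (1.30, 2.63),
    "PA-1": (0.30, 6.63),
    "2102Ep": (0.79, 9.59),
}

def fold_increase(untreated, treated):
    """Ratio of treated to untreated percentages."""
    return treated / untreated

# Compute and report the fold increase for each cell line.
folds = {cell: fold_increase(*values) for cell, values in percent_egfp.items()}
for cell, fold in folds.items():
    print(f"{cell}: {fold:.1f}-fold")
```

The ratios come out near 2 for HeLa, 22 for PA-1 and 12 for 2102Ep, in line with the figures given in the text.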

Figure 3 | Analyses of L1 silencing in a clonal (pk-5) cell line. a, Cartoon indicating the chromosomal location of a silenced L1-retro-EGFP event in a clonal PA-1 cell line. L1-retro-EGFP expression can be reactivated by TSA treatment. A~102, approximate length of the poly(A) tail at the 3′ end of the insertion. Scale bars, 100 µm. b, Southern-blot analysis reveals that pk-5 cells contain a single L1-retro-EGFP event. Genomic DNA was digested with HindIII and the blot was probed with an α-32P radiolabelled EGFP probe. MW, molecular mass standards (kb). c, Top, cartoon showing the experimental rationale. Bottom, withdrawal of TSA (bottom panels) results in the re-establishment of L1-retro-EGFP silencing. IHDAC, histone deacetylase inhibitors. TSAw, TSA withdrawal. Scale bars, 250 µm. d, Chromatin immunoprecipitation analysis on naive and TSA-treated pk-5 cells using H4ac and H3K9me2 antibodies. Quantitative PCR revealed the enrichment (H4ac) or depletion (H3K9me2) of the retrotransposed EGFP sequences in the TSA-treated pk-5 cells (red-highlighted rectangles). The input cycle threshold (Ct) was designated as 1 and used to calculate fold-change differences. Samples were run in triplicate from the same experiment. The standard deviation (s.d., n = 3) is indicated in the graph. GFP, green fluorescent protein.

Panel c time points: 0, 24, 48, 96 and 120 h after TSA withdrawal, shown for untreated and TSA-withdrawn (TSAw) cells. Panel d annotations: 8.7-fold (anti-H4ac) and 6.8-fold (anti-H3K9me2).
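The caption states that the input Ct was set to 1 and used to calculate fold-change differences, but it does not spell out the formula. One common way to do this, shown here as a sketch under that assumption, is to express each immunoprecipitated signal relative to its input as 2^(Ct_input − Ct_IP) and then take the ratio between TSA-treated and untreated cells. The function names and all Ct values below are hypothetical.

```python
def relative_to_input(ct_input, ct_ip):
    """Signal of the immunoprecipitated sample relative to input (input = 1).

    Assumes 100% amplification efficiency, so one cycle equals a 2-fold difference.
    """
    return 2.0 ** (ct_input - ct_ip)

def chip_fold_change(untreated, treated):
    """Fold change of the input-normalized ChIP signal, treated over untreated.

    Each argument is a (ct_input, ct_ip) pair.
    """
    return relative_to_input(*treated) / relative_to_input(*untreated)

# Hypothetical Ct values for one antibody (illustration only, not data from the paper).
untreated_cts = (25.0, 30.0)  # (input Ct, IP Ct) in untreated cells
treated_cts = (25.0, 27.0)    # (input Ct, IP Ct) in TSA-treated cells

fold = chip_fold_change(untreated_cts, treated_cts)
print(f"Enrichment after treatment: {fold:.1f}-fold")
```

In this made-up case the precipitated target crosses threshold three cycles earlier after treatment at equal input, which corresponds to an 8-fold enrichment. A depletion, as reported for H3K9me2, would appear as a ratio below 1.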

L1-retro-reporter-gene silencing is more efficient in ECs than in differentiating cells. 2102Ep cells, which do not differentiate when grown in differentiation medium24, consistently exhibited L1-retro-EGFP silencing when experiments were conducted in either 10% FBS or differentiation medium (Supplementary Fig. 9).

We next generated a population of silenced L1-retro-EGFP retrotransposition events in PA-1 cells (Fig. 4b). We grew the EGFP-negative cells in 10% FBS or differentiation medium for seven days in the presence of the reverse transcriptase inhibitor 3′-azido-3′-deoxythymidine, to repress further L1 retrotransposition25. TSA treatment was required to reactivate L1-retro-EGFP expression in both 10% FBS and differentiation medium (Fig. 4b). Growing the clonal pk-5 cell line in differentiation medium rarely led to EGFP-positive cells (roughly 2% of cells; Supplementary Fig. 10). Thus, differentiation, in itself, is not sufficient to efficiently reactivate previously silenced L1-retro-EGFP insertions.

Our study builds on existing literature, suggesting that host mechanisms act to regulate L1 retrotransposition5,26–30. We propose that L1-retro-EGFP silencing occurs by a two-step process (Fig. 4c). First, because reporter cassettes delivered by various non-long-terminal-repeat retrotransposons are silenced in PA-1 cells, we speculate that nascent L1 complementary DNAs may be targeted by host factors, apparently sequence-independently, to ‘initiate’ L1-retro-EGFP silencing either during target-site primed reverse transcription or immediately after integration. Second, because the removal of histone deacetylase inhibitors results in the re-establishment of L1-retro-EGFP silencing, we propose that histone-modification enzymes (deacetylases) act to maintain silencing, and that silencing in ECs, at least in the short term, does not require methylation of the retrotransposed L1-retro-EGFP cDNA. It remains possible that L1s insert into chromosomal regions that are preferentially silenced in ECs but not in differentiated cells, although such a result lacks precedent and is not supported by the initial characterization of retrotransposition events in PA-1 cells (Supplementary Table 2). The silencing of L1-retro-EGFP events in ECs that express endogenous L1s may seem paradoxical. However, because 3 out of 36 (roughly 8%) L1-retro-EGFP events in PA-1 cells evaded complete silencing (see Supplementary Fig. 5), we suggest that some full-length endogenous L1s are expressed from favourable genomic contexts, and speculate that L1-mediated reporter-gene silencing may represent a mechanism for regulating retrotransposition in cells that naturally express human L1s.

We further determined that L1-retro-EGFP silencing is attenuated in differentiating cells, but that differentiation is not sufficient to reactivate a previously silenced L1-retro-EGFP cassette. A similar pattern has been reported for retroviral silencing in pluripotent cells19–21. Thus, we speculate that host factor(s) required for the initiation of L1-retro-EGFP silencing are expressed in multipotent ECs and undergo downregulation during cellular differentiation. Alternatively, a repressor of L1-retro-EGFP silencing could be activated on differentiation. In either case, we have uncovered a novel mechanism that mediates the silencing of engineered L1 retrotransposition events in ECs.

Figure 4 | Analysis of L1 silencing in differentiating cells. a, Top, cartoon showing the experimental rationale. Bottom, graphs indicating the percentage of EGFP-positive cells and the standard deviation (n = 3). Silencing was efficient in PA-1 cells grown in medium containing 10% FBS (white rectangles), but was attenuated in differentiation medium (grey rectangles). Red-highlighted rectangles indicate experiments with TSA treatment. DM, differentiation medium; FACS, fluorescence-activated cell sorting; Puro, experiments where puromycin was used to select for the episomal L1 expression plasmid. b, Differentiation is not sufficient to derepress L1 silencing (details are provided in the text). Top, cartoon showing the experimental rationale. Bottom, graphs indicating the percentage of EGFP-positive cells and the standard deviation (n = 3). AZT, 3′-azido-3′-deoxythymidine. c, A model for the initiation and maintenance of L1 silencing in EC cells (details are provided in the text). IHDAC, histone deacetylase inhibitors.

METHODS SUMMARY

Cell culture and plasmid DNA. We grew HeLa and human ECs as described3,13. DNA constructs are described in the Supplementary Methods section (see also Supplementary Methods for specific details and references to previously published works).

Retrotransposition assays. Cell transfection and L1-retrotransposition assays were performed as described12–14. In some instances, puromycin was added to the medium to select for the episomal L1 expression vector. Where indicated, transfected cells were treated with 500 nM–1 µM trichostatin A (TSA, Sigma), 1 mM valproic acid (VPA, Sigma), 1 mM sodium butyrate (NaB, Sigma) for 14–16 h, or with 25 µM 5-azacytidine (5-Aza, Sigma) for at least 56 h, to assay for the reactivation of L1-retro-EGFP expression. Silencing assays are reported in refs 5 and 8. Treating cells with TSA for longer than 24 h resulted in toxicity; thus, we performed time-course studies to optimize the TSA treatment time for our assays.

Southern blot and PCR. We conducted PCR reactions to follow the removal of the intron from the retrotransposition indicator cassette as described13,14. We also conducted Southern blot and inverse PCR as described3,5,12.

Western blot and immunocytochemistry. We performed western-blot and immunocytochemistry analyses as described3.

Chromatin immunoprecipitation assays. We performed chromatin immunoprecipitation assays as described23.

Received 28 September 2009; accepted 28 May 2010.

1. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
2. Goodier, J. L. & Kazazian, H. H. Retrotransposons revisited: the restraint and rehabilitation of parasites. Cell 135, 23–35 (2008).
3. Garcia-Perez, J. L. et al. LINE-1 retrotransposition in human embryonic stem cells. Hum. Mol. Genet. 16, 1569–1577 (2007).

4. Kano, H. et al. L1 retrotransposition occurs mainly in embryogenesis and creates somatic mosaicism. Genes Dev. 23, 1303–1312 (2009).
5. Muotri, A. R. et al. Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature 435, 903–910 (2005).
6. Ostertag, E. M. et al. A mouse model of human L1 retrotransposition. Nature Genet. 32, 655–660 (2002).
7. van den Hurk, J. A. et al. L1 retrotransposition can occur early in human embryonic development. Hum. Mol. Genet. 16, 1587–1592 (2007).
8. Coufal, N. G. et al. L1 retrotransposition in human neural progenitor cells. Nature 460, 1127–1131 (2009).
9. Sperger, J. M. et al. Gene expression patterns in human embryonic stem cells and human pluripotent germ cell tumors. Proc. Natl Acad. Sci. USA 100, 13350–13355 (2003).
10. Hohjoh, H. & Singer, M. F. Cytoplasmic ribonucleoprotein complexes containing human LINE-1 protein and RNA. EMBO J. 15, 630–639 (1996).
11. Brouha, B. et al. Evidence consistent with human L1 retrotransposition in maternal meiosis I. Am. J. Hum. Genet. 71, 327–336 (2002).
12. Gilbert, N., Lutz-Prigge, S. & Moran, J. V. Genomic deletions created upon LINE-1 retrotransposition. Cell 110, 315–325 (2002).
13. Moran, J. V. et al. High frequency retrotransposition in cultured mammalian cells. Cell 87, 917–927 (1996).
14. Ostertag, E. M., Prak, E. T., DeBerardinis, R. J., Moran, J. V. & Kazazian, H. H. Jr. Determination of L1 retrotransposition kinetics in cultured cells. Nucleic Acids Res. 28, 1418–1423 (2000).
15. Goodier, J. L., Ostertag, E. M., Du, K. & Kazazian, H. H. Jr. A novel active L1 retrotransposon subfamily in the mouse. Genome Res. 11, 1677–1685 (2001).
16. Han, J. S. & Boeke, J. D. A highly active synthetic mammalian retrotransposon. Nature 429, 314–318 (2004).
17. Sugano, T., Kajikawa, M. & Okada, N. Isolation and characterization of retrotransposition-competent LINEs from zebrafish. Gene 365, 74–82 (2006).
18. Loh, T. P., Sievert, L. L. & Scott, R. W. Proviral sequences that restrict retroviral expression in mouse embryonal carcinoma cells. Mol. Cell. Biol. 7, 3775–3784 (1987).
19. Teich, N. M., Weiss, R. A., Martin, G. R. & Lowy, D. R. Virus infection of murine teratocarcinoma stem cell lines. Cell 12, 973–982 (1977).
20. Wolf, D. & Goff, S. P. TRIM28 mediates primer binding site-targeted silencing of murine leukemia virus in embryonic cells. Cell 131, 46–57 (2007).
21. Wolf, D. & Goff, S. P. Embryonic stem cells use ZFP809 to silence retroviral DNAs. Nature 458, 1201–1204 (2009).
22. Li, X. et al. Generation of destabilized green fluorescent protein as a transcription reporter. J. Biol. Chem. 273, 34970–34975 (1998).
23. Gummow, B. M., Scheys, J. O., Cancelli, V. R. & Hammer, G. D. Reciprocal regulation of a glucocorticoid receptor-steroidogenic factor-1 transcription complex on the Dax-1 promoter by glucocorticoids and adrenocorticotropic hormone in the adrenal cortex. Mol. Endocrinol. 20, 2711–2723 (2006).
24. Matthaei, K. I., Andrews, P. W. & Bronson, D. L. Retinoic acid fails to induce differentiation in human teratocarcinoma cell lines that express high levels of a cellular receptor protein. Exp. Cell Res. 143, 471–474 (1983).
25. Kubo, S. et al. L1 retrotransposition in nondividing and primary human somatic cells. Proc. Natl Acad. Sci. USA 103, 8036–8041 (2006).
26. Bestor, T. H. & Tycko, B. Creation of genomic methylation patterns. Nature Genet. 12, 363–367 (1996).
27. Bourc’his, D. & Bestor, T. H. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature 431, 96–99 (2004).
28. Schumann, G. G. APOBEC3 proteins: major players in intracellular defence against LINE-1-mediated retrotransposition. Biochem. Soc. Trans. 35, 637–642 (2007).
29. Stetson, D. B., Ko, J. S., Heidmann, T. & Medzhitov, R. Trex1 prevents cell-intrinsic initiation of autoimmunity. Cell 134, 587–598 (2008).
30. Suzuki, J. et al. Genetic evidence that the non-homologous end-joining repair pathway is involved in LINE retrotransposition. PLoS Genet. 5, e1000461 (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank P. W. Andrews for providing human EC lines, discussing unpublished data from his laboratory and giving advice during the course of this study. We thank A. Macia and M. Munoz-Lopez (Andalusian Stem Cell Bank) for sharing unpublished data; A. V. Furano, H. Kazazian, J. K. Kim, H. Kopera, A. Muotri, and members of the Moran and Garcia-Perez laboratories for critical reading of the manuscript; and G. Smith and L. Villa for help in creating the time-lapsed movie. We thank M. Velkey for providing plasmid pBSSK-pgk; H. Kazazian for providing plasmid pJCC5/LRE3; J. Boeke for providing synthetic mouse LINE-1 constructs; M. Kajikawa and N. Okada for providing the zebrafish LINE-2 expression plasmids; T. Fanning for providing the polyclonal ORF1 antibody; I. Damjanov for comments on the teratoma characterization; T. Lanigan for preparing Moloney murine leukaemia virus retroviral supernatants; C. Pigott for EC culture advice; and T. de la Cueva, P. Catalina and A. Nieto (Andalusian Stem Cell Bank) for their help with mouse experimentation, SKY-FISH and pathology analyses, respectively. J.V.M. is supported by the National Institutes of Health (NIH) (GM060518 and GM082970) and the Howard Hughes Medical Institute. J.L.G.-P. is supported by the Instituto de Salud Carlos III - Consejeria de Salud Junta de Andalucia (ISCIII-CSJA) (EMER07/056), by a Marie Curie International Reintegration Grant action (FP7-PEOPLE-2007-4-3-IRG), by CICE (P09-CTS-4980) and Proyectos en Salud (PI0002/2009) from Junta de Andalucia (Spain) and through the Spanish Ministry of Health (FIS PI08171 and Miguel Servet CP07/00065). M.M. is supported by the ISCIII-CSJA (EMER07/056). P.M. is supported by the Spanish Ministry of Science and Innovation MICINN-PLANE (PLE-2009-0111), by CICE (P08-CTS-3678) from Junta de Andalucia (Spain) and by the Spanish Ministry of Health (FIS PI070026). K.S.O’S. is supported by the NIH (NS-048187 and GM-069985). K.L.C. is supported by the Burroughs Wellcome Foundation and by an NIH Research Project Grant (R01) (AI051198). G.D.H. is supported by a National Institute of Diabetes and Digestive and Kidney Diseases NIH R01 (DK62027). J.O.S. is supported by a Cellular and Molecular Approaches to Systems and Integrative Biology Training Grant (T32-GM08322). D.A.K. is supported by The Irvington Institute Fellowship Program of the Cancer Research Institute. S.M. is supported by a CICE (P08-CTS-3678) scholarship from Junta de Andalucia, Spain. C.C.C. is supported by a Rackham Predoctoral Fellowship from the University of Michigan. We defrayed the costs of DNA sequencing in part with the University of Michigan’s Cancer Center Support Grant (NIH 5 P30 CA46592).

Author Contributions J.V.M. and J.L.G.-P. directed the project, designed experiments and drafted the manuscript. J.L.G.-P. performed experiments with the assistance of M.M. and K.S.O’S. (cell cycle experiments), J.O.S. and G.D.H. (chromatin immunoprecipitation experiments), D.A.K., C.C.C. and K.L.C. (human immunodeficiency virus-based experiments), and S.M. and P.M. (teratoma assays). All the authors commented on the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to J.V.M. ([email protected]) and J.L.G.-P. ([email protected]).

Vol 466 | 5 August 2010 | doi:10.1038/nature09301

LETTERS

Branched tricarboxylic acid metabolism in Plasmodium falciparum

Kellen L. Olszewski1, Michael W. Mather2, Joanne M. Morrisey2, Benjamin A. Garcia3, Akhil B. Vaidya2, Joshua D. Rabinowitz4 & Manuel Llinás1

A central hub of carbon metabolism is the tricarboxylic acid cycle1, which serves to connect the processes of glycolysis, gluconeogenesis, respiration, amino acid synthesis and other biosynthetic pathways. The protozoan intracellular malaria parasites (Plasmodium spp.), however, have long been suspected of possessing a significantly streamlined carbon metabolic network in which tricarboxylic acid metabolism plays a minor role2. Blood-stage Plasmodium parasites rely almost entirely on glucose fermentation for energy and consume minimal amounts of oxygen3, yet the parasite genome encodes all of the enzymes necessary for a complete tricarboxylic acid cycle4. Here, by tracing 13C-labelled compounds using mass spectrometry5, we show that tricarboxylic acid metabolism in the human malaria parasite Plasmodium falciparum is largely disconnected from glycolysis and is organized along a fundamentally different architecture from the canonical textbook pathway. We find that this pathway is not cyclic, but rather is a branched structure in which the major carbon sources are the amino acids glutamate and glutamine. As a consequence of this branched architecture, several reactions must run in the reverse of the standard direction, thereby generating two-carbon units in the form of acetyl-coenzyme A. We further show that glutamine-derived acetyl-coenzyme A is used for histone acetylation, whereas glucose-derived acetyl-coenzyme A is used to acetylate amino sugars. Thus, the parasite has evolved two independent production mechanisms for acetyl-coenzyme A with different biological functions. These results significantly clarify our understanding of the Plasmodium metabolic network and highlight the ability of altered variants of central carbon metabolism to arise in response to unique environments.

The mitochondrion of P. falciparum contains the smallest genome sequenced to date, and seems to have evolved reduced functional roles compared with other eukaryotic organisms6. Moreover, the limited number of mitochondrial cristae, minimal oxygen consumption and rapid fermentation of glucose into lactate that are observed in intraerythrocytic human malaria parasites suggest that oxidative phosphorylation is not a significant source of ATP-generation during the blood stage6. Blood-stage Plasmodium spp. have also dispensed with several of the functions often associated with the mitochondrial tricarboxylic acid (TCA) cycle, such as de novo amino acid biosynthesis. Although the parasite possesses a functional electron-transport chain, and mitochondrial membrane potential is required for survival, we have shown that the critical metabolic function of electron transport during blood-stage growth is the regeneration of ubiquinone to supply pyrimidine biosynthesis7.

Several lines of evidence, however, suggest that TCA metabolism plays an active role in the metabolism of the parasite. The parasite genome encodes orthologues for all TCA cycle enzymes4, which are all transcribed during the blood stage8. The citrate synthase orthologue (PF10_0218), aconitase (PF13_0229) and isocitrate dehydrogenase (PfIDH, PF13_0242, see Supplementary Discussion) have been localized to the mitochondrion9,10, and PfIDH, aconitase and succinate dehydrogenase complex (PFL0630w, PF10_0334) have been biochemically characterized10–12, suggesting an active mitochondrial pathway. The presence of an essential de novo haem biosynthesis pathway in P. falciparum2 further implies that succinyl-coenzyme A (succinyl-CoA) must be generated in the mitochondrion. We have found that the intracellular levels of several TCA metabolites oscillate over the parasite growth cycle roughly in phase with the expression profiles of cognate enzymes13. Therefore, TCA metabolites are actively synthesized by the parasite. However, the P. falciparum pyruvate dehydrogenase (PDH) complex localizes not to the mitochondrion but to the apicoplast, a non-photosynthetic plastid-like organelle14. Thus, instead of its canonical role of feeding glucose-derived carbon into the TCA cycle, the suggested role of PDH is solely to produce acetyl-coenzyme A (acetyl-CoA) for fatty acid elongation14.

In addition to glucose, major TCA cycle carbon sources in many organisms are the amino acids aspartate, asparagine, glutamate and glutamine, which can be deaminated to yield oxaloacetate or 2-oxoglutarate (α-ketoglutarate). To elucidate the role of the TCA cycle in parasite metabolism, we have determined the major carbon source contributing to the accumulation of TCA intermediates. By culturing synchronized parasite-infected red blood cells in medium supplemented with U-13C-glucose, U-13C-15N-aspartate or U-13C-15N-glutamine, where U indicates labelling at all carbon or nitrogen atoms, we measured intracellular metabolite isotope-labelling patterns throughout the 48-hour parasite cell cycle using a liquid chromatography–mass spectrometry (LC–MS) platform capable of detecting most central carbon metabolites.

As expected, in parasites grown on U-13C-glucose the pools of all glycolytic intermediates were rapidly and uniformly labelled (data not shown). We observed limited labelling of carboxylic acid pools, with moderate amounts of +3 13C-malate and +3 13C-fumarate, where +3 indicates labelling at three carbon atoms. These +3 forms are consistent with phosphoenolpyruvate (PEP) carboxylation incorporating unlabelled carbonate from the gaseous environment15 (Fig. 1a). The absence of labelling in other TCA intermediates suggests that these labelled dicarboxylic acids derive from cytosolic pathways independent of mitochondrial TCA metabolism (Supplementary Fig. 1a). Similarly, parasite growth on U-13C-15N-aspartate results only in the generation of +4 13C-malate and +4 13C-fumarate (Supplementary Fig. 2), which can also occur in the cytosol (Supplementary Fig. 1b).

When parasites are fed U-13C-glucose, PDH-complex activity yields acetyl-labelled 13C-acetyl-CoA (Fig. 2a). Feeding on labelled

1Department of Molecular Biology and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA. 2Center for Molecular Parasitology, Drexel University College of Medicine, Philadelphia, Pennsylvania 19129, USA. 3Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544, USA. 4Department of Chemistry and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA.

glucose results in labelling of only a small fraction of the total acetyl-CoA pool, suggesting the presence of additional sources of two-carbon units. U-13C-glucose feeding also results in small but measurable amounts of both +2 and +5 13C-citrate (Fig. 1a), which derive from the condensation of acetyl-labelled 13C-acetyl-CoA with either unlabelled oxaloacetate or +3 13C-oxaloacetate, respectively. These labelled forms account for only a minor fraction of citrate, and the labelling does not propagate to other intermediates downstream in the TCA cycle. These data raised the possibility that glucose- and aspartate-derived metabolites are disconnected from mitochondrial TCA metabolism.

Figure 1 | Glutamine drives reverse flux through the TCA cycle. a, Concentrations of different isotope-labelled carboxylic acids in extracts of P. falciparum-infected red blood cells. We cultured synchronized parasites in medium supplemented with either U-13C-glucose or U-13C-15N-glutamine 2 h before invasion, then extracted them every 8 hours after invasion for high-performance LC–MS (HPLC–MS) analysis. The plots to the right of the grey triangles zoom in on the profiles of the labelled metabolites. The +3 and +4 malate arises from the reductive and oxidative pathways, respectively, whereas +3 fumarate probably derives from interconversion of fumarate and malate by fumarate hydratase (PFI1340w). Error bars show the s.d. of n = 3 biological replicates. b, Schematic of the oxidative pathway from 2-oxoglutarate to malate. Red dots denote 13C atoms arising from U-13C-15N-glutamine. GDP, guanosine diphosphate; GTP, guanosine triphosphate; Pi, inorganic phosphate. c, Schematic of the reductive carboxylation pathway from 2-oxoglutarate to malate. Ac-R represents either acetyl-CoA or acetate.

Consistent with the TCA cycle being fed instead from glutamine, we find significant labelling of all TCA compounds in parasites grown in the presence of U-13C-15N-glutamine (Fig. 1a). Extracellular glutamine is rapidly taken up by parasitized red blood cells16 and deamidated to glutamate, which can donate its carbon skeleton to TCA metabolism through conversion to 2-oxoglutarate. Although the growth medium contains only labelled glutamine, the intracellular glutamine/glutamate pools are incompletely labelled owing to the generation of unlabelled amino acids by haemoglobin catabolism17. Consistent with this glutamine-driven reaction pathway, 2-oxoglutarate is labelled at all five carbons (Fig. 1a). We also observe the +4 13C-labelled forms of the four-carbon (C4) compounds succinate, fumarate and malate, expected from the canonical TCA cycle reactions occurring in the standard clockwise direction (Fig. 1b).

We detect only +5 13C forms of the C6 metabolite citrate. This labelling is inconsistent with the TCA cycle turning in the standard clockwise direction, but is characteristic of the reductive carboxylation of 2-oxoglutarate to isocitrate, followed by isomerization to citrate18, in the reverse of standard TCA cycle directionality (Fig. 1c). We also observe +3 13C-labelled forms of both malate and fumarate, which are generated with temporal profiles similar to those of +5 13C-citrate (Fig. 1a). Such malate labelling is consistent with +5 13C-citrate being cleaved into +2 13C-acetate or acetyl-CoA and +3 13C-oxaloacetate, which is then reduced to +3 13C-malate (Fig. 1c). We also observe +2 13C-acetyl-CoA during growth on U-13C-15N-glutamine (Fig. 2a). Thus several TCA cycle reactions are running with net flux in the reverse direction, in the process generating C2 units from 2-oxoglutarate via citrate.

To further dissect the biological role of this reverse-TCA branch, we investigated the major metabolic fates for C2 units: fatty acid synthesis, protein modification and small-molecule acetylation. We

profiled 13C labelling of parasite lipids during growth on U-13C-glucose or U-13C-15N-glutamine by gas chromatography–mass spectrometry (GC–MS) but were unable to detect labelling under either condition, which is consistent with recent reports that the parasite’s de novo fatty acid synthesis pathway is not required during the blood stages19,20. Some of the major protein-acetylation targets in eukaryotes are the lysine residues within the amino-terminal tails of histones. When parasites are cultured in medium containing either U-13C-glucose or U-13C-15N-glutamine, we observe robust labelling of the acetyl groups in histone tails only in the U-13C-15N-glutamine-fed cultures (Fig. 2b, Supplementary Fig. 3). The acetyl-labelled histones comprise approximately 56% of the total acetylated histone pool, a proportion similar to the fractional labelling of the 2-oxoglutarate pool. However, UDP-N-acetyl-glucosamine (UDP-GlcNAc), a nucleotide amino sugar acetylated in the endoplasmic reticulum during the biosynthesis of glycosylphosphatidylinositol-anchored proteins associated with malaria pathogenesis21, is labelled at the acetyl group only during growth on U-13C-glucose (Fig. 2c). Thus it appears that the malaria parasite has evolved two independent pathways that produce acetyl-CoA for different metabolic functions. How glucose- and glutamine-derived C2 units are maintained as functionally distinct pools and transported from their respective organelles to different sites of acetylation remains to be investigated.

Figure 2 | Acetyl groups deriving from glucose and glutamine are functionally distinct. a, Labelling of acetyl-CoA in extracts of P. falciparum-infected red blood cells at t = 40 hours after invasion as determined by HPLC–MS. b, Labelling of a singly acetylated peptide derived from the N-terminal tail of histone H4, determined by proteomic mass spectrometry. c, Labelling of UDP-GlcNAc at t = 40 hours after invasion. Black bars, unlabelled molecule; red bars, the molecule labelled at both carbons of the acetyl group, regardless of any other labelling; dark grey bars, acetyl-CoA labelled at all five carbons of the ribose moiety of CoA, but not the acetyl group; white bars, UDP-GlcNAc labelled at some combination of the glucose, ribose or pyrimidine ring, but not the acetyl group; light grey bars, UDP-GlcNAc labelled at 0–3 nitrogens, but at no carbons. Error bars show the s.d. of n = 3 biological replicates.

Our metabolic labelling data suggest a branched architecture for mitochondrial carbon metabolism in which both arms produce malate. To achieve a net flux through these pathways it would be necessary to remove this terminal product, either by conversion or excretion. When we analysed the liquid culture media from cultures grown on labelled nutrients, we found that malate, 2-oxoglutarate and, to a lesser extent, fumarate are excreted from infected red blood cells at a significant rate (Fig. 3 and Supplementary Fig. 4). Cytosolic fumarate is a byproduct of the parasite’s purine salvage pathway22, whereas 2-oxoglutarate is produced by glutamate dehydrogenase. Our data imply that these metabolites, as well as malate derived from both cytosolic and mitochondrial pathways, flow out of the system as waste products.

On the basis of these results, we propose a new model for central carbon metabolism in blood-stage Plasmodium spp. (Fig. 4). In this pathway the ultimate carbon source for mitochondrial carboxylic acid pools is the amino acids glutamine and glutamate, and carbon flux in the mitochondrion is organized into two independent linear branches. Branch 1 (red in Fig. 4) begins with the reductive carboxylation of 2-oxoglutarate to isocitrate, which is then isomerized to citrate. This citrate is cleaved into a C2 compound and oxaloacetate, which is reduced to malate. Branch 2 (blue in Fig. 4) comprises the standard clockwise turning of the TCA cycle to oxidize 2-oxoglutarate to malate, in the process generating reducing power and succinyl-CoA, an essential precursor for haem biosynthesis. Two labelled forms are observed for malate and fumarate, but no other TCA intermediates, during growth on U-13C-15N-glutamine, suggesting that both branches converge at these metabolites, which are the terminal products of each. On the basis of current evidence, our model depicts these pathways as mitochondrial, although the localization of some enzymatic steps and details regarding transport are yet to be fully established (see Supplementary Discussion and Supplementary Figs 5–8).

This model for branched TCA metabolism is fundamentally different from any yet described. Reductive flux from 2-oxoglutarate has been demonstrated in human brown adipose cell cultures18, in which this pathway was shown to be a source of lipogenic C2 units18. However, such adipose cells seem capable of running a complete TCA cycle simultaneously with this reductive pathway. This was proposed to be due to the presence of two mitochondrial isoforms of isocitrate

Figure 3 | Malate excretion by P. falciparum-infected red blood cell cultures. We grew and cultured parasites as described above, collected samples of the culture medium and analysed them by HPLC–MS. The data are given as molar concentrations in the medium samples. The plot at the right, indicated by grey triangle, is a close up of the profiles of the labelled metabolites. All error bars show the s.d. of n = 3 biological replicates.
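The carbon bookkeeping behind the two branches described in the text can be sketched in a few lines of Python. This is only an illustrative toy, not part of the authors' analysis: the function names (`oxidative_branch`, `reductive_branch`, `mass_shifts`) are hypothetical, and the sketch assumes uniformly labelled 2-oxoglutarate (+5), an unlabelled carbon added at the reductive carboxylation step, and a fully labelled C2 unit released at citrate cleavage, as the labelling patterns reported in the text suggest.

```python
from __future__ import annotations

OG_CARBONS = 5  # 2-oxoglutarate has five carbon atoms


def _require_uniform(labelled: int) -> None:
    # Assumption: the sketch only covers uniformly labelled 2-oxoglutarate.
    if labelled != OG_CARBONS:
        raise ValueError("sketch only handles uniformly labelled 2-oxoglutarate")


def oxidative_branch(labelled: int = OG_CARBONS) -> dict[str, int]:
    """Count 13C atoms per metabolite when 2-oxoglutarate is oxidized to malate."""
    _require_uniform(labelled)
    c4 = labelled - 1  # oxidative decarboxylation releases one labelled carbon as CO2
    return {name: c4 for name in ("succinyl-CoA", "succinate", "fumarate", "malate")}


def reductive_branch(labelled: int = OG_CARBONS) -> dict[str, int]:
    """Count 13C atoms per metabolite when 2-oxoglutarate is reductively carboxylated."""
    _require_uniform(labelled)
    c6 = labelled      # the added carbon is unlabelled: six carbons, five of them 13C
    acetyl = 2         # assumed: cleavage releases a fully labelled C2 unit
    c4 = c6 - acetyl   # the remaining label stays in the C4 product
    return {"isocitrate": c6, "citrate": c6, "Ac-R": acetyl,
            "oxaloacetate": c4, "malate": c4}


def mass_shifts(counts: dict[str, int]) -> dict[str, str]:
    """Format 13C counts in the '+n' notation used in the text."""
    return {name: f"+{n}" for name, n in counts.items()}


if __name__ == "__main__":
    print("oxidative:", mass_shifts(oxidative_branch()))
    print("reductive:", mass_shifts(reductive_branch()))
```

Under these assumptions the sketch gives +4 malate for the oxidative branch and +5 citrate, +2 Ac-R and +3 malate for the reductive branch, which is the pair of malate forms the text attributes to the two branches.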

Figure 4 | An integrated model for central carbon metabolism in P. falciparum. Arrows show the direction of net flux; multiple arrows depict pathways not shown in their entirety and are labelled as such. Metabolites in red are those found to flow out into the medium as waste products. Red arrows indicate the reductive pathway of TCA metabolism; blue arrows show the oxidative pathway. Asterisk (*), the specific enzyme responsible for the citrate cleavage step and its localization are unclear (see Supplementary Discussion). Double asterisk (**), there are two predicted enzymes capable of catalysing this reaction, the cytosolic malate dehydrogenase (PFF0895w) and the putative mitochondrial malate:quinone oxidoreductase (MAL6P1.258). Abbreviations: Gln, glutamine; Glu, glutamate; OG, 2-oxoglutarate; ICT, isocitrate; Cit, citrate; Ac-R, acetate/acetyl-CoA; Ac-CoA, acetyl-CoA; OA, oxaloacetate; Mal, malate; Suc, succinyl; Suc-CoA, succinyl-CoA; Fum, fumarate; Glc, glucose; Asp, aspartate; PEP, phosphoenolpyruvate; Pyr, pyruvate; Lac, lactate.

dehydrogenase (IDH) in the human cells: IDH3, the canonical TCA-cycle enzyme which uses NAD(H) as a cofactor, and IDH2, which is specific for NADP(H) and may run in the reductive direction owing to a mitochondrial NADP+:NADPH ratio favouring the reverse reaction18. The P. falciparum genome encodes only an NADP(H)-specific, mitochondrial IDH11, suggesting that it may have entirely lost the ability to run a textbook TCA cycle and is effectively locked into this branched architecture. We propose that the mitochondrial NADPH required by this reductive pathway may be generated by the parasite’s NADP(H)-specific glutamate dehydrogenase (PF14_0164), and glutamate oxidation has been detected in isolated P. falciparum mitochondria23.

This branched TCA pathway can be understood as an evolutionary trade-off in which metabolic flexibility is lost to optimize growth within the specific environment of the host cell. Within the human bloodstream, an abundant and homeostatic supply of glucose ensures a constant supply of energy, whereas the high levels of plasma glutamine (about 0.5 mM) represent a ready source of C5 carbon skeletons to drive the mitochondrial production of reduced ubiquinone, succinyl-CoA and C2 acetyl units. In human cells, production of nuclear acetyl-CoA from mitochondrially derived citrate is a major determinant of the acetylation state of histones24, and acetylation of metabolic enzymes is gaining recognition as a major post-translational modification involved in sensing and regulating responses to nutrient availability in diverse organisms25,26. It is possible that flux through this reductive TCA pathway in P. falciparum serves as a nutrient sensor regulating enzymatic activities and transcriptional responses by means of protein acetylation. Also, studies have found that TCA cycle enzymes are upregulated in a subset of patient-derived blood-stage parasite isolates27 as well as in salivary gland sporozoites27,28. Our results suggest that under these glucose-limited conditions, reductive TCA flux might compensate for reduced synthesis of C2 units from glucose. Whether the pathway architecture described in our model is maintained within other tissues invaded during the parasite life cycle, such as the mosquito midgut and salivary gland or the human liver, merits further study, because the nutrient availabilities and metabolic demands in these environments vary substantially.

Our results highlight the growing part that metabolomic technologies play in elucidating the architecture of metabolic pathways, particularly in such divergent pathogens as Apicomplexan parasites. Genomic reconstructions4, which generally map metabolic networks onto those of well-studied model organisms, must be informed by direct experimental evidence or run the risk of failing to identify the pathways that represent the best candidates for drug targets. This study clarifies our understanding of the metabolism underlying plasmodial mitochondrial electron flow, haem biosynthesis and histone acetylation, all of which are current or suggested targets for pharmaceutical intervention29,30. In addition, it presents a clear case in which a fundamental metabolic pathway has undergone significant evolutionary adaptation towards a particular environmental niche.

METHODS SUMMARY

We performed P. falciparum culturing and metabolomics essentially as described13. For further details, and for descriptions of cloning, fluorescent imaging, mitochondrial isolation, enzyme assays, histone extraction, proteomics and GC–MS analysis, see the Supplementary Methods.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 19 March; accepted 11 June 2010.

1. Krebs, H. A. & Johnson, W. A. The role of citric acid in intermediate metabolism in animal tissues. Enzymologia 4, 148–156 (1937).
2. van Dooren, G. G., Stimmler, L. M. & McFadden, G. I. Metabolic maps and functions of the Plasmodium mitochondrion. FEMS Microbiol. Rev. 30, 596–630 (2006).
3. Sherman, I. W. in Malaria, Parasite Biology, Pathogenesis and Protection (ed. Sherman, I. W.) 135–143 (ASM, 1998).
4. Gardner, M. J. et al. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419, 498–511 (2002).
5. Munger, J. et al. Systems-level metabolic flux profiling identifies fatty acid synthesis as a target for antiviral therapy. Nature Biotechnol. 26, 1179–1186 (2008).
6. Vaidya, A. B. & Mather, M. W. Mitochondrial evolution and functions in malaria parasites. Annu. Rev. Microbiol. 63, 249–267 (2009).
7. Painter, H. J., Morrisey, J. M., Mather, M. W. & Vaidya, A. B. Specific role of mitochondrial electron transport in blood-stage Plasmodium falciparum. Nature 446, 88–91 (2007).
8. Bozdech, Z. et al. The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum. PLoS Biol. 1, e5 (2003).
9. Tonkin, C. J. et al. Localization of organellar proteins in Plasmodium falciparum using a novel set of transfection vectors and a new immunofluorescence fixation method. Mol. Biochem. Parasitol. 137, 13–21 (2004).
10. Hodges, M. et al. An iron regulatory-like protein expressed in Plasmodium falciparum displays aconitase activity. Mol. Biochem. Parasitol. 143, 29–38 (2005).
11. Wrenger, C. & Muller, S. Isocitrate dehydrogenase of Plasmodium falciparum. Eur. J. Biochem. 270, 1775–1783 (2003).
12. Suraveratum, N. et al. Purification and characterization of Plasmodium falciparum succinate dehydrogenase. Mol. Biochem. Parasitol. 105, 215–222 (2000).
13. Olszewski, K. L. et al. Host-parasite interactions revealed by Plasmodium falciparum metabolomics. Cell Host Microbe 5, 191–199 (2009).
14. Foth, B. J. et al. The malaria parasite Plasmodium falciparum has only one pyruvate dehydrogenase complex, which is located in the apicoplast. Mol. Microbiol. 55, 39–53 (2005).
15. Blum, J. J. & Ginsburg, H. Absence of α-ketoglutarate dehydrogenase activity and presence of CO2-fixing activity in Plasmodium falciparum grown in vitro in human erythrocytes. J. Protozool. 31, 167–169 (1984).
16. Elford, B. C., Haynes, J. D., Chulay, J. D. & Wilson, R. J. Selective stage-specific changes in the permeability to small hydrophilic solutes of human erythrocytes infected with Plasmodium falciparum. Mol. Biochem. Parasitol. 16, 43–60 (1985).
17. Liu, J. et al. Plasmodium falciparum ensures its amino acid supply with multiple acquisition pathways and redundant proteolytic enzyme systems. Proc. Natl Acad. Sci. USA 103, 8840–8845 (2006).
18. Yoo, H., Antoniewicz, M. R., Stephanopoulos, G. & Kelleher, J. K. Quantifying reductive carboxylation flux of glutamine to lipid in a brown adipocyte cell line. J. Biol. Chem. 283, 20621–20627 (2008).
19. Vaughan, A. M. et al. Type II fatty acid synthesis is essential only for malaria parasite late liver stage development. Cell. Microbiol. 11, 506–520 (2009).
20. Yu, M. et al. The fatty acid biosynthesis enzyme FabI plays a key role in the development of liver-stage malarial parasites. Cell Host Microbe 4, 567–578 (2008).

21. Gowda, D. C. & Davidson, E. A. Protein glycosylation in the malaria parasite. Parasitol. Today 15, 147–152 (1999).
22. Downie, M. J., Kirk, K. & Mamoun, C. B. Purine salvage pathways in the intraerythrocytic malaria parasite Plasmodium falciparum. Eukaryot. Cell 7, 1231–1237 (2008).
23. Fry, M. & Beesley, J. E. Mitochondria of mammalian Plasmodium spp. Parasitology 102, 17–26 (1991).
24. Wellen, K. E. et al. ATP-citrate lyase links cellular metabolism to histone acetylation. Science 324, 1076–1080 (2009).
25. Wang, Q. et al. Acetylation of metabolic enzymes coordinates carbon source utilization and metabolic flux. Science 327, 1004–1007 (2010).
26. Zhao, S. et al. Regulation of cellular metabolism by protein lysine acetylation. Science 327, 1000–1004 (2010).
27. Daily, J. P. et al. Distinct physiological states of Plasmodium falciparum in malaria-infected patients. Nature 450, 1091–1095 (2007).
28. Lasonder, E. et al. Proteomic profiling of Plasmodium sporozoite maturation identifies new proteins essential for parasite development and infectivity. PLoS Pathog. 4, e1000195 (2008).
29. Mather, M. W., Henry, K. W. & Vaidya, A. B. Mitochondrial drug targets in apicomplexan parasites. Curr. Drug Targets 8, 49–60 (2007).
30. Andrews, K. T., Tran, T. N., Wheatley, N. C. & Fairlie, D. P. Targeting histone deacetylase inhibitors for anti-malarial therapy. Curr. Top. Med. Chem. 9, 292–308 (2009).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank G. McFadden and I. Sherman for discussions and scrutiny of the manuscript; B. Bennett, T. Campbell, E. De Silva, J. O’Hara, and H. Painter for reading of the manuscript; I. Ying for assistance with histone extraction; T. Spurck and C. Tonkin for the modified erythrocyte immobilization procedure for microscopy; M. Clasquin and W. Lu for developing the LC–MS methodology; E. Melamud for LC–MS data extraction and analysis; and J. Groves and H. Cooper for GC–MS analysis. M.L. is funded by the Burroughs Wellcome Fund and an NIH Director’s New Innovators award (1DP2OD001315-01). J.D.R. is funded by a Beckman Young Investigators award, an NSF CAREER award and NIH R01 AI078063. M.L. and J.D.R. receive support from the Center for Quantitative Biology (P50 GM071508). B.A.G. receives support from NSF grant CBET-0941143. K.L.O. is funded by an NSF Graduate Research Fellowship. J.M.M., M.W.M. and A.B.V. are supported by grant AI028398 from NIAID, NIH.

Author Contributions K.L.O. cultured the parasites, and collected and analysed all LC–MS and GC–MS data; B.A.G. performed mass spectrometric analysis of histones. M.W.M. and J.M.M. carried out IDH localization studies. M.W.M. purified mitochondria and K.L.O. did biochemical assays. K.L.O., M.L., J.D.R., M.W.M., A.B.V. and B.A.G. designed the study; J.D.R. provided the metabolomic technology. M.L. and K.L.O. wrote the paper. All authors discussed the results and commented on the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.L. ([email protected]).

©2010 Macmillan Publishers Limited. All rights reserved. doi:10.1038/nature09301

METHODS

P. falciparum culturing and metabolite extraction. We maintained and synchronized P. falciparum cultures by standard methods31,32. Briefly, we grew P. falciparum-infected (3D7 strain) red blood cells in RPMI 1640 culture medium supplemented with sodium carbonate (2 mg ml−1), hypoxanthine (100 µM), Albumax II (0.25%) and gentamycin (50 µg ml−1) in a humidified incubator at 5% CO2, 6% O2 and 37 °C. We collected human red blood cells used for culturing two days before use, in tubes supplemented with sodium heparin instead of standard citrate-containing anticoagulants to avoid contaminating citrate.

For metabolic labelling experiments, growth medium was formulated according to the standard nutrient concentrations of RPMI 1640 medium33. We supplied vitamins using RPMI 100X Vitamins Solution (Sigma-Aldrich); we purchased inorganic salts and all other nutrients individually from Sigma-Aldrich; they were of the highest purity available. We found, by standard growth assays, that this reformulated medium was indistinguishable from commercial RPMI 1640 mixes in supporting P. falciparum growth. We purchased U-13C-glucose, U-13C-15N-aspartate and U-13C-15N-glutamine from Cambridge Isotope Laboratories and used them to replace the unlabelled nutrient at the normal concentrations of each (glucose, 2 g l−1; aspartate, 20 mg l−1; glutamine, 300 mg l−1).

We carried out metabolic labelling experiments as follows: we inspected a highly synchronized P. falciparum culture in the late schizont stage hourly by microscope until host-cell lysis and reinvasion were complete. We then adjusted this culture to 6% parasitaemia using cultured red blood cells, diluted it to 0.4% haematocrit in fresh, prewarmed (37 °C) medium containing one of the labelled nutrients, and returned it to the incubator. We allowed the cultures to equilibrate for 2 h and then collected infected red blood cells and culture media for the t = 0 time point and at 8-h intervals thereafter. We similarly treated an uninfected red blood cell culture and extracted it at t = 0 to use for normalization.

We extracted the metabolites using a modified version of our previous protocol13. Briefly, we pelleted red blood cells from liquid cultures by centrifugation for 5 min at 500g. We collected medium samples from the supernatant by dilution in 4 volumes of 100% methanol at −70 °C, then removed the supernatant by aspiration. We flash-quenched the cell pellet in 4 volumes of 100% methanol at −70 °C and incubated it on dry ice for 15 min, with vortexing every 5 min. We centrifuged the lysate for 5 min at 500g and collected the supernatant. We re-extracted the pellet in 10 volumes of 80:10 methanol:water at 4 °C and agitated it with ultrasound on ice in a water-bath sonicator for 15 min. We then centrifuged it for 5 min at 16,000g and collected the supernatant, then pooled it with the previous extract. We centrifuged the pooled extract for 10 min at 16,000g to precipitate denatured protein, and then transferred the supernatant to a fresh tube and dried it under nitrogen flow. We stored the dried extracts at −70 °C until we analysed them by LC–MS (less than 96 h). For analysis, we resuspended the dried extracts in 200 µl of chromatographic buffer A (97:3 water:methanol, 10 mM tributylamine, 15 mM acetic acid). We also prepared medium extracts as described above. To quantitate TCA intermediates in medium and cell extracts, we maintained a parasite culture in label-free medium and extracted it at 24 h post-invasion into methanol containing isotope-labelled internal standards at the following concentrations: U-13C-malic acid, 5 µg ml−1; U-13C-fumaric acid, 0.5 µg ml−1; 1,4-13C-succinic acid, 0.5 µg ml−1; 2,4-13C-citric acid, 1 µg ml−1; 1,2,3,4-13C-ketoglutaric acid disodium, 5 µg ml−1 (Cambridge Isotope Laboratories). We otherwise performed the extraction as above.

LC–MS instrumentation. We did LC–MS analyses on an Exactive Orbitrap mass spectrometer, coupled with an Accela U-HPLC system (Thermo Fisher Scientific) and HTC PAL autosampler (CTC Analytics AG). We achieved liquid chromatography separation on a Synergy Hydro-RP column (100 × 2 mm, 2.5 µm particle size, Phenomenex); the gradient is modified from our previous method34: 0 min, 0% B; 2.5 min, 0% B; 5 min, 20% B; 7.5 min, 20% B; 13 min, 55% B; 15.5 min, 95% B; 18.5 min, 95% B; 19 min, 0% B; 25 min, 0% B. Solvent A is 97:3 water:methanol with 10 mM tributylamine and 15 mM acetic acid; solvent B is methanol. Other LC parameters are: autosampler temperature 4 °C; injection volume 10 µl; column temperature 25 °C.

We operated the Exactive Orbitrap mass spectrometer in negative mode, scanning mass-charge ratio (m/z) 85–1,000. Other instrumental parameters are: resolution 100,000 at 1 Hz (1 scan per second); AGC (automatic gain control) target 3E6; maximum injection time 100 ms; sheath gas flow rate 25 (arbitrary unit); aux gas flow rate 8 (arbitrary unit); sweep gas flow rate 3 (arbitrary unit); spray voltage 3 kV; capillary temperature 270 °C; capillary voltage −50 V; tube lens voltage −100 V.

LC–MS data extraction and analysis. We converted Thermo Fisher mass spectrometry RAW files from profile mode into centroid mode using the ReAdW program35. We loaded centroid-mode files into MAVEN, an in-house analysis program, and aligned them with a nonlinear regression using a high degree polynomial to the retention times of the highest-intensity m/z measurements from each sample to construct a median reference. We extracted ion chromatograms using a 5-p.p.m. window centred around the expected m/z of each compound and smoothed them by applying a Gaussian filter to the intensity signal. Within each extracted ion chromatogram, we detected peaks and evaluated their quality using a neural-network-based classification model that takes into account peak height, peak width, peak area, the signal-to-noise ratio and the peak shape36. We grouped peaks across samples and matched them to the expected retention time of each compound. We used the peak closest to the expected retention time with a quality score of more than 0.5 for quantitation. We hand-checked all peaks used for quantitation after automated extraction and assignment. We extracted the isotopically labelled forms of compounds in a similar manner. We based the quantitation of labelled forms on the highest-intensity peak within a 5-p.p.m. window around the expected m/z of the 13C- and/or 15N-labelled form of the compound. We will describe the specifics of the MAVEN program in a forthcoming publication.

We identified metabolites on the basis of both their match to the expected m/z ratio and their chromatographic retention times determined previously for standard solutions. We identified isotope-labelled forms using the expected mass shifts given by 13C and 15N. Where ambiguous, we determined the positional labelling of specific atoms within the molecule by LC–MS or MS analysis as described previously13. We used the height of the extracted peak for each compound as the signal. We corrected the raw signals for each labelled form to account for the naturally occurring isotope distribution as calculated by the Qual Browser included in the Xcalibur software suite (Thermo Fisher Scientific). We added the signal from naturally occurring isotopes to the signal of the unlabelled form. Similarly, we discounted signals that were due to incomplete labelling of the isotope-labelled nutrients (which are generally 98–99% fully labelled). For quantitation experiments, we determined the signal ratio of the unlabelled metabolite to its isotope-labelled internal standard and used it to calculate the concentration of the metabolite in the extract at the 24-h time point. We used the relative signal in the other time-point samples to calculate the concentrations over the temporal profile.

We treated signals from the uninfected red blood cell sample as the background level in host cells and subtracted them from every time point; where this reduced signals to less than 1,000 counts the signal was set to 1,000 counts, the approximate limit of quantitation for the instrument. Each plotted point shows the average of n = 3 biological replicates; error bars show the standard deviation. We plotted the labelled forms only if the average signal in at least one time point is greater than 1,000 counts and the signal represents at least 1% of the signal of the unlabelled form.

Fatty-acid extraction and analysis. For lipid-labelling experiments, we cultured parasites collected at the trophozoite stage as described above, in labelled nutrient-supplemented RPMI media for 96 h (encompassing two growth cycles). We conducted these experiments using normally formulated RPMI and a minimal fatty-acid formulation (lacking Albumax II, we supplemented it with 60 µM lipid-free bovine serum albumin and 30 µM each of myristic, stearic and oleic acids) prepared according to an earlier report that growth in this medium resulted in enhanced elongation of preformed fatty acids by the parasite37. We lysed infected red blood cells by treatment with 100 pellet volumes of 0.1% saponin in phosphate-buffered saline (PBS) buffer and collected the freed parasite cells by centrifugation at 2,500g, 4 °C for 10 min. After washing the parasite pellet in PBS, we extracted the fatty acids and derivatized them to fatty acid methyl esters using the protocol of ref. 18.

We investigated the fatty acid methyl esters by GC–MS, using an Agilent 7890A GC coupled to a 5975C inert MSD with an Rtx-5Sil MS (30 m length, 0.25 mm internal diameter, 0.25 µm film) column. We used the thermal gradient: start at 70 °C, holding for 2 min; ramping to 230 °C at 20 °C min−1. We examined the mass spectra of the molecular ions of the palmitic and stearic acid methyl esters (monoisotopic m/z = 270.45 and m/z = 298.51) for deviation from the naturally occurring isotope distribution indicating 13C incorporation. We detected no incorporation in either the normal or minimal fatty acid RPMI cultures.

Histone extraction and analysis. For histone labelling experiments, we cultured parasites as above in labelled nutrient-supplemented RPMI media for 96 h (encompassing two growth cycles) and collected them at the trophozoite stage. We lysed infected red blood cell cultures (50 ml total volume, 2% haematocrit, 10% parasitaemia) by treatment with 100 pellet volumes of 0.1% saponin in PBS buffer and collected the freed parasite cells by centrifugation at 2,500g, 4 °C for 10 min. After washing the parasite pellet in PBS, we acid-extracted38 and purified39 the histones according to modified versions of the published protocols. Briefly, we resuspended the parasite pellet in 800 µl of acid-extraction buffer (0.2 M HCl, 1 mM dithiothreitol, 1 mM sodium orthovanadate, 10 mM sodium butyrate, 1 Roche EDTA-free protease inhibitor tablet per 50 ml). We gently agitated the suspension at 4 °C for 2 h, and then centrifuged it at 16,000g for
1 min at 4 °C. We collected the supernatant and slowly added 246 µl of 100% trichloroacetic acid. We mixed this by inversion and incubated it on ice overnight, and then centrifuged it at 16,000g for 10 min at 4 °C. We discarded the supernatant and washed the pellet twice with cold acetone at 4 °C. We allowed the acetone to evaporate and resuspended the pellet in deionized water for analysis.

We fractionated histone extracts by reverse phase HPLC on a C18 column using 30–60% buffer B in 100 min gradient (buffer A is 5% acetonitrile in 0.2% trifluoroacetic acid (TFA), buffer B is 90% acetonitrile in 0.188% TFA). We pooled fractions containing histone H4 and propionylated them as previously published40. We digested propionylated histone H4 with trypsin at a 20:1 protein:enzyme ratio for 7 h at 37 °C. We subjected digested histone H4 to MS analysis on an Orbitrap mass spectrometer, operated by obtaining a full mass spectrum at 30,000 resolution in the Orbitrap followed by 7 data-dependent MS/MS spectra acquired in the ion trap. We interpreted all mass spectra manually. We quantitated 13C labelling using the natural isotope distribution of the propionylated acetyl-H4 peptide calculated by the Qual Browser included in the Xcalibur software suite (Thermo Fisher Scientific).

Preparation of P. falciparum mitochondria. We synchronized P. falciparum cultures twice by treatment with sorbitol as described, and expanded and collected them at 8% parasitaemia in the early trophozoite stage. We prepared fractions substantially enriched in mitochondria using a procedure modified from the method of ref. 41. We collected parasitized erythrocytes by centrifugation, washed them in AIM medium (120 mM KCl, 20 mM NaCl, 20 mM glucose; 6 mM HEPES buffer, 6 mM MOPS buffer, 1 mM MgCl2, 0.1 mM EGTA; pH 7.0) and lysed them with 0.05% (w/v) saponin in AIM medium. After washing 3 times with AIM medium and once with MSEH buffer (225 mM mannitol, 75 mM sucrose, 4.3 mM MgCl2, 0.25 mM EGTA, 10 mM HEPES (Tris) buffer, 5 mM HEPES (KOH) buffer; pH 7.4), we disrupted the parasites by N2 cavitation (using a 4639 Cell Disruption Bomb, Parr) at 1,000 p.s.i. (6.9 MPa) for 20 min at 4 °C in de-aerated MSEH buffer containing 5 mM glucose and mitochondrial substrates (5 mM α-glycerophosphate and 2.5 mM dihydroorotate) in the presence of 1 mM PMSF inhibitor and 1 µl of fungal protease inhibitor cocktail (Sigma-Aldrich) per ml. After drop-wise release from the N2 bomb, we mixed another aliquot of protease inhibitors into the disrupted parasite sample. We removed the unbroken cells and cell debris by centrifugation at 900g for 6 min at 4 °C. We passed the low-speed supernatant slowly through a MACS CS column prewashed with MSEH buffer in a Vario MACS magnetic separation apparatus (Miltenyi Biotec) to remove most of the haemozoin from the preparation. We then recovered the mitochondria as a pellet by centrifugation at 23,000g for 20 min at 4 °C. We suspended the pellet in a minimal volume of MSEH buffer containing 1 mM dihydroorotate and 1 mg ml−1 fatty acid-free BSA and stored it at −80 °C.

Mitochondrial IDH assay. We assayed for mitochondrial reductive IDH activity using a modified version of the protocol of ref. 42. Briefly, we thawed purified mitochondrial preparations on ice and diluted them into 9 volumes of assay buffer (final concentration: 50 mM Na2HPO4, 0.5 mM MgCl2, 5 mM NaHCO3, 0.2 mM NADPH, 10 mM 1,2,3,4-13C-2-oxoglutarate; pH 7.0) in a sealed tube. We incubated the reaction mixture in a water bath at 37 °C and removed the samples at the specified times. We quenched the reaction by dilution into 9 volumes of methanol at −70 °C. We centrifuged these samples for 15 min at 16,000g, 4 °C to precipitate protein and biological material, then we diluted the supernatant into 9 volumes of water and subjected it to HPLC–MS analysis as described.

Citrate lyase assays. We assayed for ATP:citrate lyase and ATP-independent citrate lyase on lysates of uninfected erythrocytes, infected erythrocytes (trophozoite-stage, 10% parasitaemia), host cell-free parasites (trophozoite-stage) and isolated mitochondria. We prepared host cell-free parasites by standard saponin treatment: briefly, we collected blood cultures by centrifugation (500g, 5 min); washed them once with PBS buffer (500g, 5 min); resuspended them at 2% haematocrit in 0.1% w/v saponin in PBS buffer and incubated them for 2 min at 25 °C; pelleted them by centrifugation (2,000g, 10 min, 4 °C); and washed them once more in PBS buffer (2,000g, 10 min, 4 °C).

We performed ATP:citrate lyase assays using a protocol modified slightly from ref. 43. We diluted erythrocyte and parasite cell samples into 9 volumes of lysis buffer (50 mM Tris-HCl buffer (pH 8.0), 50 mM NaCl, 2 mM DTT redox reagent, 1 mM MgCl2, 1 EDTA-free protease inhibitor cocktail tablet per 10 ml; Roche complete Mini), lysed them by three rounds of freeze-thaw cycles (1 min in liquid nitrogen, 5 min in 37 °C water bath) and centrifuged them (16,000g, 10 min, 4 °C) to clear the supernatant. We diluted mitochondrial preparations into the same buffer, but with 0.05% dodecyl maltoside to permeabilize the membranes and lyse them. We diluted 2.5 µl of these samples into 17.5 µl of reaction cocktail with final concentrations of: 87 mM Tris-HCl buffer (pH 8.0), 20 mM MgCl2, 10 mM KCl, 10 mM DTT redox reagent, 100 mM coenzyme A, 150 mM 2,4-13C-citric acid, with or without 400 mM ATP. We incubated the reaction mix for 15 min at 37 °C and terminated the reaction by adding 80 µl of methanol and cooling it to −70 °C on dry ice for 15 min. We centrifuged the samples (16,000g, 10 min, 4 °C), diluted the supernatant 1:10 in water and analysed it by LC–MS for the production of +1 13C-acetyl-CoA.

We performed ATP-independent citrate lyase assays using a protocol modified from ref. 44. We prepared the samples as described above, except in a buffer with the formulation: 10 mM triethanolamine (pH 7.6), 0.3 mM ZnCl2, 454 mM ammonium sulphate, 10 mM DTT redox reagent and 1 EDTA-free protease inhibitor cocktail tablet per 10 ml (Roche complete Mini). We diluted 1 µl of sample into 29 µl of a reaction cocktail to final concentrations: 96 mM triethanolamine (pH 7.6), 0.5 mM ZnCl2, 0.23 mM β-NADH, 0.67 mM 2,4-13C-citric acid, 15 mM ammonium sulphate, 100 units L-lactate dehydrogenase, 50 units malate dehydrogenase. We incubated the reaction mix for 2 h at 37 °C and terminated the reaction by adding 120 µl of methanol and cooling to −70 °C on dry ice for 15 min. We centrifuged the samples (16,000g, 10 min, 4 °C), diluted the supernatant 1:10 in water and analysed it by LC–MS for the enzyme-linked production of +1 13C-malate and +1 13C-lactate.

IDH leader-GFP localization. We amplified the GFP gene from pHDGFP (ref. 45) with added 5′ XhoI and 3′ SalI sites. We eliminated the internal BstBI site using site-directed mutagenesis. We sub-cloned the modified GFP into pHHMC*/3R0.5 (ref. 46) digested with XhoI, producing the plasmid pHHGFP19. We amplified the 5′ 204 base pairs (bp) of the Pf IDH gene (PF13_0242), corresponding to the initial 68 amino acids (MGKHIRILKNQYLQFMSKRCIQSKAAFNICGKINVENPIVELDGDEMTRIIWKDIKEKLILPYVNLKI), from P. falciparum 3D7 DNA with primers adding a 5′ BstBI site and a 3′ XhoI site. We inserted the product into pHHGFP19 digested with BstBI and XhoI to produce pHHIDHldrGFP. We confirmed the cloned DNA sequences by sequencing. We transfected P. falciparum using standard methods47 and selected parasites with the drug WR99210.

We used the primers:
GFP–Xhosens, 34-mer: 5′ GCT CTC GAG TCT GCA GCA GCA GCA GCA GCA GCA G 3′.
GFP–Salanti, 50-mer: 5′ GCA GTC GAC TAT TAT AAA TCT TCT TCA GAT ATT AAT TTT TGT TCA GAT CC 3′.
PCR product 806 bp
GFP-rmvBstB-up, 42-mer: 5′ CCA CAC AAT CTG CCC TTT CTA AAG ATC CCA ACG AAA AGA GAG 3′.
GFP-rmvBstB-dn, 42-mer: 5′ CTC TCT TTT CGT TGG GAT CTT TAG AAA GGG CAG ATT GTG TGG 3′.
IDHldr-BstBsens, 42-mer: 5′ GAC GTT CGA ATA AAA TGG GAA AGC ATA TAC GAA TTT TAA AAA 3′.
IDHldr-Xhoanti, 45-mer: 5′ GAT CTC GAG TAT CTT TAA GTT AAC ATA TGG TAA GAT TAA TTT TTC 3′.

Fluorescence microscopy. We suspended live infected erythrocytes in RPMI medium containing mitotracker Red CM-H2XROS (Molecular Probes) at 50 nM and Hoechst 33342 dye (Sigma) at 1 µg ml−1 and incubated them at 37 °C for about 25 min. We immobilized the stained erythrocytes on a microscope coverslip in RPMI medium using a fibrin clot procedure modified from ref. 48. We captured images with an Olympus BX60 microscope equipped with a SPOT RT Slider digital camera and software system (Diagnostic Instruments).

31. Trager, W. & Jensen, J. B. Human malaria parasites in continuous culture. Science 193, 673–675 (1976).
32. Lambros, C. & Vanderberg, J. P. Synchronization of Plasmodium falciparum erythrocytic stages in culture. J. Parasitol. 65, 418–420 (1979).
33. Moore, G. E., Gerner, R. E. & Franklin, H. A. Culture of normal human leukocytes. J. Am. Med. Assoc. 199, 519–524 (1967).
34. Lu, W., Bennett, B. D. & Rabinowitz, J. D. Analytical strategies for LC–MS-based targeted metabolomics. J. Chromatogr. B 871, 236–242 (2008).
35. Keller, A. et al. A uniform proteomics MS/MS analysis platform utilizing open XML file formats. Mol. Syst. Biol. 1, 2005.0017 (2005).
36. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
37. Mi-Ichi, F., Kita, K. & Mitamura, T. Intraerythrocytic Plasmodium falciparum utilize a broad range of serum-derived fatty acids with limited modification for their growth. Parasitology 133, 399–410 (2006).
38. Miao, J., Fan, Q., Cui, L. & Li, J. The malaria parasite Plasmodium falciparum histones: organization, expression, and acetylation. Gene 369, 53–65 (2006).
39. Shechter, D., Dormann, H. L., Allis, C. D. & Hake, S. B. Extraction, purification and analysis of histones. Nature Protocols 2, 1445–1457 (2007).
40. Garcia, B. A. et al. Chemical derivatization of histones for facilitated analysis by mass spectrometry. Nature Protocols 2, 933–938 (2007).
41. Takashima, E. et al. Isolation of mitochondria from Plasmodium falciparum showing dihydroorotate dependent respiration. Parasitol. Int. 50, 273–278 (2001).
42. Kornberg, A. & Pricer, W. E. Jr. Di- and triphosphopyridine nucleotide isocitric dehydrogenases in yeast. J. Biol. Chem. 189, 123–136 (1951).
43. Ma, Z., Chu, C. H. & Cheng, D. A novel direct homogeneous assay for ATP citrate lyase. J. Lipid Res. 50, 2131–2135 (2009).
44. Bergmeyer, H. U., Gawehn, K. & Grassl, M. in Methods of Enzymatic Analysis Vol. 1 (ed. Bergmeyer, H. U.) 442–443 (Academic, 1974).
45. Kadekoppala, M., Kline, K., Akompong, T. & Haldar, K. Stable expression of a new chimeric fluorescent reporter in the human malaria parasite Plasmodium falciparum. Infect. Immun. 68, 2328–2332 (2000).
46. O’Donnell, R. A. et al. A genetic screen for improved plasmid segregation reveals a role for Rep20 in the interaction of Plasmodium falciparum chromosomes. EMBO J. 21, 1231–1239 (2002).
47. Fidock, D. A. & Wellems, T. E. Transformation with human dihydrofolate reductase renders malaria parasites insensitive to WR99210 but does not affect the intrinsic activity of proguanil. Proc. Natl Acad. Sci. USA 94, 10931–10936 (1997).
48. Forer, A. & Pickett-Heaps, J. D. Cytochalasin D and latrunculin affect chromosome behaviour during meiosis in crane-fly spermatocytes. Chromosome Res. 6, 533–549 (1998).
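
The signal-handling rules stated in the Methods under "LC–MS data extraction and analysis" (a 5-p.p.m. extraction window, background subtraction with a floor of 1,000 counts, and the criterion for plotting labelled forms) can be illustrated with a minimal Python sketch. This is not the MAVEN implementation, which the Methods do not describe; the function names are hypothetical, the 5-p.p.m. window is assumed to be a half-width on each side of the expected m/z, and the two plotting conditions are assumed to apply at the same time point.

```python
from typing import List, Tuple

LOQ_COUNTS = 1000.0  # approximate limit of quantitation quoted in the Methods


def ppm_window(expected_mz: float, ppm: float = 5.0) -> Tuple[float, float]:
    """Return (low, high) m/z bounds around expected_mz.

    Assumption: the 5-p.p.m. window is read as +/- ppm (a half-width).
    """
    delta = expected_mz * ppm * 1e-6
    return expected_mz - delta, expected_mz + delta


def subtract_background(signal: float, background: float) -> float:
    """Subtract the uninfected-cell signal, flooring the result at the LOQ."""
    return max(signal - background, LOQ_COUNTS)


def should_plot(labelled: List[float], unlabelled: List[float]) -> bool:
    """Apply the plotting rule to per-time-point mean signals.

    Assumption: the labelled form must exceed the LOQ and reach at least
    1% of the unlabelled signal at the same time point.
    """
    return any(
        lab > LOQ_COUNTS and lab >= 0.01 * unlab
        for lab, unlab in zip(labelled, unlabelled)
    )
```

For example, `subtract_background(1200.0, 900.0)` is floored at 1,000 counts rather than returning 300, which mirrors the statement that signals reduced below the limit of quantitation were set to that limit.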

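The primer list given in the Methods under "IDH leader-GFP localization" lends itself to a simple consistency check, sketched below in Python. The sequences and stated lengths are copied from the Methods; the helper names are hypothetical, and the restriction-site recognition sequences (XhoI CTCGAG, SalI GTCGAC, BstBI TTCGAA) are standard values supplied here as an assumption, since the Methods name the enzymes but do not print their sites.

```python
# Primers as printed in the Methods (5' to 3'), with their stated lengths.
PRIMERS = {
    "GFP-Xhosens": ("GCT CTC GAG TCT GCA GCA GCA GCA GCA GCA GCA G", 34),
    "GFP-Salanti": ("GCA GTC GAC TAT TAT AAA TCT TCT TCA GAT ATT AAT TTT TGT TCA GAT CC", 50),
    "GFP-rmvBstB-up": ("CCA CAC AAT CTG CCC TTT CTA AAG ATC CCA ACG AAA AGA GAG", 42),
    "GFP-rmvBstB-dn": ("CTC TCT TTT CGT TGG GAT CTT TAG AAA GGG CAG ATT GTG TGG", 42),
    "IDHldr-BstBsens": ("GAC GTT CGA ATA AAA TGG GAA AGC ATA TAC GAA TTT TAA AAA", 42),
    "IDHldr-Xhoanti": ("GAT CTC GAG TAT CTT TAA GTT AAC ATA TGG TAA GAT TAA TTT TTC", 45),
}

# Assumed standard recognition sequences (not printed in the Methods).
SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC", "BstBI": "TTCGAA"}

# Leader peptide quoted in the Methods (68 residues from the 5' 204 bp).
IDH_LEADER = (
    "MGKHIRILKNQYLQFMSKRCIQSKAAFNICGKINVENPIV"
    "ELDGDEMTRIIWKDIKEKLILPYVNLKI"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the printed primer list."""
    return seq.replace(" ", "")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unspaced DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_mismatches() -> list:
    """Names of primers whose sequence length differs from the stated -mer."""
    return [name for name, (seq, n) in PRIMERS.items() if len(clean(seq)) != n]
```

Under these assumptions, the two site-directed mutagenesis primers (GFP-rmvBstB-up and GFP-rmvBstB-dn) come out as exact reverse complements of each other, as expected for a complementary mutagenic pair, and 204 bp corresponds to the 68 quoted residues at three bases per codon.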
Vol 466 | 5 August 2010 | doi:10.1038/nature09265 LETTERS

Microbial metalloproteomes are largely uncharacterized

Aleksandar Cvetkovic1*, Angeli Lal Menon1*, Michael P. Thorgersen1*, Joseph W. Scott1, Farris L. Poole II1, Francis E. Jenney Jr1†, W. Andrew Lancaster1, Jeremy L. Praissman1, Saratchandra Shanmukh1, Brian J. Vaccaro1, Sunia A. Trauger2, Ewa Kalisiak2, Junefredo V. Apon2, Gary Siuzdak2, Steven M. Yannone3, John A. Tainer3 & Michael W. W. Adams1

Metal ion cofactors afford proteins virtually unlimited catalytic potential, enable electron transfer reactions and have a great impact on protein stability1,2. Consequently, metalloproteins have key roles in most biological processes, including respiration (iron and copper), photosynthesis (manganese) and drug metabolism (iron). Yet, predicting from genome sequence the numbers and types of metal an organism assimilates from its environment or uses in its metalloproteome is currently impossible because metal coordination sites are diverse and poorly recognized2–4. We present here a robust, metal-based approach to determine all metals an organism assimilates and identify its metalloproteins on a genome-wide scale. This shifts the focus from classical protein-based purification to metal-based identification and purification by liquid chromatography, high-throughput tandem mass spectrometry (HT-MS/MS) and inductively coupled plasma mass spectrometry (ICP-MS) to characterize cytoplasmic metalloproteins from an exemplary microorganism (Pyrococcus furiosus). Of 343 metal peaks in chromatography fractions, 158 did not match any predicted metalloprotein. Unassigned peaks included metals known to be used (cobalt, iron, nickel, tungsten and zinc; 83 peaks) plus metals the organism was not thought to assimilate (lead, manganese, molybdenum, uranium and vanadium; 75 peaks). Purification of eight of 158 unexpected metal peaks yielded four novel nickel- and molybdenum-containing proteins, whereas four purified proteins contained sub-stoichiometric amounts of misincorporated lead and uranium. Analyses of two additional microorganisms (Escherichia coli and Sulfolobus solfataricus) revealed species-specific assimilation of yet more unexpected metals. Metalloproteomes are therefore much more extensive and diverse than previously recognized, and promise to provide key insights for cell biology, microbial growth and toxicity mechanisms.

Once revealed, a metal cofactor adds new dimensions to understanding protein structure and function; yet, the presence of metal is often unsuspected until the protein is analysed1,2,5. For example, unexpected zinc and iron–sulphur sites gave fundamental insights into DNA repair proteins relevant to human cancers6. Unfortunately, the small fraction of biochemically characterized proteins and limitations of metalloprotein bioinformatics1,4,5,7–9 make it impossible to predict metals used by organisms and to define any metalloproteome. Previous metal-based studies examined individual purified proteins, recombinant proteins, biological fluids (such as blood or urine) or involved limited metals8,10–13. Yet, native biomass is likely essential as metalloproteins from recombinant sources may have incorrect or no metal at all14,15.

Herein we present combined technologies that reveal assimilated metals and metalloproteins from biomass of the prototypical microbe Pyrococcus furiosus16 (Supplementary Fig. 1). Whereas proteins containing five transition metals, cobalt (Co), iron (Fe), nickel (Ni), tungsten (W) and zinc (Zn), have been purified from P. furiosus (Supplementary Table 1) previously, surprisingly 21 of 53 metals analysed by ICP-MS11 were detected in the cytoplasmic extract, including lead (Pb), titanium (Ti) and uranium (U) (Supplementary Tables 2 and 3). P. furiosus specifically assimilated these 21 from 44 metals in the growth medium (Supplementary Table 4). This had seven added metals, the remaining coming from added organic components. Excepting chromium (Cr), ruthenium (Ru) and strontium (Sr), 18 metals were in macromolecular complexes (≥5 kDa) rather than free ions (Supplementary Fig. 2 and Table 3). Cells grown with added Pb, U, Ru, rhodium (Rh, each 50 nM) and Cr (200 nM) contained a more than tenfold increase in the intracellular concentrations of tightly-bound Pb and U, but not Rh, Ru and Cr (Supplementary Fig. 3), indicating specific uptake of metals that are available to P. furiosus in its marine environment17.

To investigate if uptake of unanticipated metals involved biological functions or inadvertent assimilation, we examined stable cytoplasmic metalloproteins that retained metals after an anion exchange separation (chromatography 1 (C1); Supplementary Fig. 4)18. This unambiguously identified 10 metals as multiple peaks in 126 C1 chromatography fractions: (1) molybdenum (Mo), manganese (Mn) and vanadium (V), not previously known in P. furiosus, (2) U and Pb (not found previously in any organism except in detoxification proteins), and (3) known metals (Co, Fe, Ni, W and Zn). Other cytoplasmic metals had no distinct peaks in C1 fractions. The most abundant were Fe and Zn (97% of the total), with less W and Ni (~2.5%) and even less Co, Mo, Mn, Pb, V and U (~0.5%; Fig. 1a, Supplementary Tables 5 and 6).

To further explore the P. furiosus metalloproteome, we separated C1 fractions by second level chromatography (C2; Supplementary Fig. 5). ICP-MS of 790 fractions obtained from fifteen C2 columns revealed 343 distinct metal peaks (Fig. 2, Supplementary Tables 7 and 8). HT-MS/MS18 identified 770 proteins or ~60% of the cytoplasmic proteins19. Given the difficulties of predicting metalloproteins1,4,5,7,9, we searched the Integrated Resource of Protein Domains and Functional Sites (InterPro)20 annotation of the P. furiosus genome (Supplementary Table 9) to assign metal peaks to proteins. This InterPro-Metal (IPM) analysis identified domains even remotely related to those that bind a metal. However, only 185 of the 343 metal

¹Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602, USA. ²Scripps Center for Mass Spectrometry and the Departments of Molecular Biology and Chemistry, The Scripps Research Institute, La Jolla, California 92037, USA. ³Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. †Present address: Philadelphia College of Osteopathic Medicine, Suwanee, Georgia 30024, USA (F.E.J.).
*These authors contributed equally to this work.
©2010 Macmillan Publishers Limited. All rights reserved. NATURE | Vol 466 | 5 August 2010

peaks detected in the C2 fractions contained an IPM-predicted protein for the identified metal. The remaining 158 unassigned metal peaks therefore represent metalloproteins containing unknown metal-binding domains (Fig. 1c, Supplementary Table 8). Consequently, P. furiosus assimilates more metals than expected, and even metals it is known to use (Co, Fe, Ni, W and Zn) give rise to numerous unassigned peaks (>80).

Figure 1 | Metal assimilation by P. furiosus and unassigned metal peaks. a–c, Relative amounts (percentage on a molar basis) of the ten metals present as peaks in the C1 chromatography fractions (a) and those ten metals in the growth medium (b)¹⁸, and number of metal peaks in the C2 chromatography fractions that can be assigned (solid bars) or cannot be assigned (shaded bars) to a protein with an InterPro-Metal (IPM) hit for that metal (c). The values for uranium are from the C1 column. The order of metals in the bar graphs reflects their abundance in the C1 fractions (the W content is 34% in the medium) and data for Fe and Zn are omitted for clarity. Metals that P. furiosus was not known to use are underlined (see Supplementary Tables 3 and 5).

To test the feasibility of assigning proteins to the 158 metal peaks without IPM hits, we selected eight peaks (two Mo, Ni, Pb and U; Supplementary Table 8) for multistep chromatography purification to obtain a homogeneous protein containing a near stoichiometric amount of the metal, analogous to the traditional purification of an enzymatic activity (Supplementary Table 10). Using 300 g of P. furiosus biomass, one of the 29 unassigned Ni peaks after five chromatography steps yielded a pure Ni-containing protein (600 mg) identified by matrix-assisted laser desorption/ionization (MALDI)-MS as PF0056 (Supplementary Fig. 6). This cupin/putative sugar-binding protein (14 kDa) had no IPM hit for Ni yet contained 0.47 ± 0.05 Ni and 0.50 ± 0.08 Zn atoms per mole (but no other > 0.1 atoms mol⁻¹). The PF0056 metal ions are predicted to be coordinated by three His and one Glu residue, from a homologue structure (PDB 1VJ2, Table 1). Cupins are among the most functionally diverse protein superfamilies²¹; PF0056 is the first native Ni-containing member of this family to be purified.

Two of 18 Mo peaks lacking IPM-predicted molybdoproteins were also purified (Supplementary Figs 7 and 8). After five chromatography steps, one Mo peak yielded a homogeneous protein (2.5 mg) identified by MALDI-MS as PF1972 (27.6 kDa). It contained Mo (0.77 ± 0.43 atoms mol⁻¹) and also iron with a Fe:Mo ratio of 4:1 (Supplementary Fig. 7) and is predicted to be a [4Fe–4S] cluster-containing activase for anaerobic ribonucleotide reductase (PF1971)²². Activases are widespread in anaerobes, but those in hyperthermophiles, like P. furiosus, contain four conserved Cys residues besides three expected conserved Cys residues coordinating the [4Fe–4S] cluster. Expression of PF1972 is upregulated at suboptimal growth temperatures, indicating a role for Mo in DNA synthesis under these conditions²³. The second Mo peak co-purified with two proteins that partially separated after six chromatography steps: a known tungstoprotein (PF0464; Supplementary Table 1) lacking Mo, and PF1587 with unique peptides detected by HT-MS/MS matching the Mo peak (Supplementary Fig. 8). PF1587 is a 32.9 kDa conserved hypothetical protein with archaeal and bacterial homologues that contain five conserved cysteines. If the cysteine residues in PF1587 and PF1972 directly coordinate Mo, this would be unprecedented for molybdoenzymes, wherein Mo is bound by S atoms of an organic pterin cofactor²⁴. PF1972 and PF1587 are the first Mo-proteins purified from P. furiosus, an organism not previously known to assimilate this metal.

The second unassigned Ni peak purified originated from PF0086 on the basis of native biomass¹⁸ and recombinant protein data. After three chromatography steps (Fig. 2, Supplementary Fig. 9), only PF0086 had a profile of unique peptides that matched the Ni peak (Supplementary Fig. 9 and Table 11). Annotated as alanyl-tRNA editing hydrolase²⁵, PF0086 is predicted to contain Zn but Zn was undetected in PF0086 fractions. To confirm PF0086 is a bona fide Ni-protein, the corresponding gene was expressed in E. coli grown in a Ni-supplemented medium (200 µM). The purified protein contained 0.86 ± 0.20 Ni atoms mol⁻¹. Interestingly, when PF0086 was expressed in E. coli grown in a Zn- or Co-supplemented medium (200 µM), its predominant metal was Zn or Co, respectively (Supplementary Fig. 10). E. coli evidently inserts the most abundant metal (Ni, Co or Zn), whereas P. furiosus specifically inserts Ni into PF0086, despite a Zn concentration ~50-fold greater than Ni in its growth medium (Fig. 1b). On the basis of a homologue structure (PDB 2E1B), the Ni in PF0086 is likely coordinated by three His and one Cys residue (Table 1). PF0086 is another new type of Ni-containing enzyme and with PF0056, increases the number known in biology from eight to ten²⁶. Such Ni-enzyme discoveries can enhance understanding of catalysis and biology as seen for Ni versus other metal-containing superoxide dismutases²⁷.

In contrast to Ni and Mo, purifications of U and Pb peaks, even from cells grown with more than tenfold higher U and Pb concentrations, yielded homogeneous proteins with only trace amounts of these metals (Table 1; Supplementary Table 12). For example, purification of one of 34 Pb peaks through six chromatography steps (Supplementary Fig. 11) yielded a single protein identified as PF1343, a known metalloprotease (39.4 kDa), containing 0.65 ± 0.09 Zn atoms mol⁻¹ but only 0.010 ± 0.002 Pb atoms mol⁻¹ (Table 1). Similarly, after five chromatography steps, another Pb peak yielded homogeneous PF0257, a pyrophosphatase (20.9 kDa; Supplementary Fig. 12). This is a new iron protein (1.40 ± 0.02 Fe atoms mol⁻¹) with low amounts of Pb (0.007 ± 0.001 Pb atoms mol⁻¹).

Similarly, after three chromatography steps, one U peak was associated with known iron-protein ferritin (PF0742, 20.3 kDa; Table 1, Supplementary Fig. 13) containing 1.20 ± 0.11 Fe atoms mol⁻¹ but only 0.010 ± 0.001 U atoms mol⁻¹ (and also 0.010 ± 0.001 Pb atoms mol⁻¹). A second U peak copurified through six steps with the glycolytic Mg²⁺-dependent enzyme enolase (PF0215, 46.8 kDa) but contained only 0.00010 ± 0.00004 U atoms mol⁻¹ (Supplementary Fig. 14). Although other U and Pb peaks may represent bona fide U- and Pb-proteins, the four analysed seem to have misincorporated U and Pb that dissociate over multiple chromatography steps. However, identification of proteins susceptible to such metal misincorporation has implications in elucidating mechanisms of metal

toxicity in both prokaryotes and eukaryotes. Our approach can identify proteins containing any of 53 metals using any organism’s biomass without requiring radiolabels.

Figure 2 | Metal concentration profiles after chromatographic fractionation of P. furiosus cytoplasmic extract. a–d, The C1 columns are vanadium (V) and lead (Pb) (a); nickel (Ni) and cobalt (Co) (b); molybdenum (Mo) and manganese (Mn) (c); tungsten (W) and uranium (U) (d). The bold lines above the fractions in the C1 columns indicate which were applied to a subsequent (C2) column. e–h, The metal concentrations and the number of proteins in the C2 columns are shown for Pb (e), Ni (f), Mo (g) and W (h). The bold line above the fractions in the Ni C2 column indicates which were applied to a subsequent (C3) column (see text and Supplementary Tables 10 and 11).

To test further if this approach is generally applicable, we fractionated cytoplasmic extracts of Escherichia coli and Sulfolobus solfataricus²⁸. Although their growth media contained the same 44 metals as the P. furiosus medium (Supplementary Table 4), there were substantial differences in the metals assimilated (Supplementary Figs 15 and 16; Table 13). Their C1 fractions also contained distinct peaks of Co, Fe, Mo, Mn, V, Zn and Pb, but E. coli fractions uniquely contained cadmium (Cd) and arsenic (As), whereas tin (Sn) and antimony (Sb) were found only in S. solfataricus. E. coli fractions also contained U and Ni but those of S. solfataricus did not. Which assimilated metals are biologically functional can be ascertained by the methods described herein. Such organism-specific assimilation likely reflects natural environments. P. furiosus is a marine anaerobe, S. solfataricus is a freshwater aerobic acidophile, and E. coli is a facultative anaerobe inhabiting the human gut. In general metal availability could significantly alter the metalloproteome and consequently microbial physiology, which raises the issue of whether laboratory media satisfy organisms’ metal requirements. This has an impact on efforts to grow new microbes and communities, which are often challenging or impossible.

Overall, we find that much of microbial metalloproteomes remain uncharacterized. Notably, even with metals P. furiosus was known to assimilate, half of the observed peaks were unassigned (Fig. 1c). Given the major roles that metals have in protein function, native metalloproteomes must be characterized to complement recombinant efforts including structural genomics. These results validate our metal-based, non-radiolabel approach to determine metals an organism assimilates and identify new metal-containing proteins with uncharacterized metal-binding domains. These encompass both known protein families and uncharacterized portions of genomes comprised of conserved/hypothetical proteins (Table 1). Furthermore, this technique can identify proteins with misincorporated metals, providing insight into metal toxicity mechanisms in both organisms and tissues²⁹,³⁰. The power and flexibility of this metal-based approach makes it a valuable

tool in unlocking a more complete understanding of the far-reaching roles of metals in biology.

Table 1 | Metalloproteins purified from P. furiosus by metal-based chromatography

Metal peak purified | Protein purified* | Metals present† | Proposed metal coordination | IPM-predicted metal | Annotated function‡
Mo | PF1972 (18978344) | Mo, Fe | Mo(Cys)₄ | Fe | Radical SAM activase
Mo | PF1587 (18977959) | Mo | Mo(Cys)₄ | None | Conserved hypothetical
Ni | PF0086 (18976458) | Ni | Ni(His)₃Cys | Zn | Alanyl-tRNA editing hydrolase
Ni | PF0056 (18976428) | Ni, Zn | Ni(His)₃Glu | None | Cupin/putative sugar-binding protein
U | PF0742 (18977114) | Fe (U, Pb) | – | Fe | Ferritin
U | PF0215 (18976587) | (U) | – | None | Enolase
Pb | PF1343 (18977715) | Zn (Pb) | – | Zn | Proline dipeptidase
Pb | PF0257 (18976629) | Fe (Pb) | – | None | Inorganic pyrophosphatase

* The NCBI GI number is given in parenthesis.
† Metals present in low amounts (<0.01 atoms mol⁻¹) are given in parenthesis.
‡ On the basis of the annotation in the InterPro database.

METHODS SUMMARY

Pyrococcus furiosus (DSM 3638ᵀ) was grown at 90 °C using maltose and peptides as the carbon source and cells were collected at late-exponential phase¹⁸. The cytoplasmic extract was prepared and fractionated by two chromatography steps (C1 and C2) and proteins were identified in the chromatography fractions by HT-MS/MS as described elsewhere¹⁸. Metal concentrations were measured in the uninoculated growth medium, in the cytoplasmic extract, and in C1 and C2 column fractions using a quadrupole-based ICP-MS equipped with a MicroMist Nebulizer operated under Ar with and without He as collision gas. Selected metal peaks in the C2 chromatography fractions were purified further individually by following the metal through multiple chromatography steps until a single protein band was obtained after analysis by sodium dodecyl sulphate electrophoresis. Metal stoichiometry in the purified proteins is based on the molecular mass calculated from the gene sequence and a colorimetric estimate of protein concentration¹⁸. Heterologous expression of PF0086 was induced by isopropyl β-D-1-thiogalactopyranoside in E. coli BL21(DE3) grown aerobically in a rich medium. The recombinant protein was purified by heat treatment (80 °C for 15 min) followed by multistep column chromatography. For metal analyses of their C1 chromatography fractions, E. coli was grown aerobically at 37 °C in a rich medium and S. solfataricus P2 was grown aerobically at 80 °C in a medium containing sucrose and peptides at pH 3.0 (ref. 28). Cytoplasmic extracts of each organism were subjected to anion exchange chromatography and metal analysis using the procedures devised for P. furiosus¹⁸. See Methods for details.

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 8 April; accepted 7 May 2010.
Published online 18 July 2010.

1. Gray, H. B., Stiefel, E. I., Valentine, J. S. & Bertini, I. Biological Inorganic Chemistry: Structure and Reactivity (Univ. Science Books, 2006).
2. Messerschmidt, A., Huber, R., Wieghart, K. & Poulos, T. Handbook of Metalloproteins, Vol. 1–3. (Wiley, 2005).
3. Shu, N., Zhou, T. & Hovmoller, S. Prediction of zinc-binding sites in proteins from sequence. Bioinformatics 24, 775–782 (2008).
4. Kasampalidis, I. N., Pitas, I. & Lyroudia, K. Conservation of metal-coordinating residues. Proteins: Struct. Funct. Bioinf. 68, 123–130 (2007).
5. Castagnetto, J. M. et al. MDB: the metalloprotein database and browser at the Scripps Research Institute. Nucleic Acids Res. 30, 379–382 (2002).
6. Fan, L. et al. XPD helicase structures and activities: insights into the cancer and aging phenotypes from XPD mutations. Cell 133, 789–800 (2008).
7. Andreini, C., Bertini, I., Cavallaro, G., Holliday, G. L. & Thornton, J. M. Metal-MACiE: a database of metals involved in biological catalysis. Bioinformatics 25, 2088–2089 (2009).
8. Waldron, K. J., Rutherford, J. C., Ford, D. & Robinson, N. J. Metalloproteins and metal sensing. Nature 460, 823–830 (2009).
9. Zhang, Y. & Gladyshev, V. N. General trends in trace element utilization revealed by comparative genomic analyses of Co, Cu, Mo, Ni, and Se. J. Biol. Chem. 285, 3393–3405 (2010).
10. Lobinski, R., Schaumlöffel, D. & Szpunar, J. Mass spectrometry in bioinorganic analytical chemistry. Mass Spec. Rev. 25, 255–289 (2006).
11. Sanz-Medel, A., Montes-Bayón, M., del Rosario Fernández de la Campa, M., Encinar, J. R. & Bettmer, J. Elemental mass spectrometry for quantitative proteomics. Analyt. Bioanalyt. Chem. 390, 3–16 (2008).
12. Shi, W. et al. Metalloproteomics: high-throughput structural and functional annotation of proteins in structural genomics. Structure 13, 1473–1486 (2005).
13. Atanassova, A., Högbom, M. & Zamble, D. B. in Methods in molecular biology Vol. 436 (eds B. Kobe, M. Guss & T. Huber) 319–330 (Humana Press, 2008).
14. Jenney, F. E. & Adams, M. W. W. Rubredoxin from Pyrococcus furiosus. Methods Enzymol. 334, 45–55 (2001).
15. Chai, S. C., Wang, W. L. & Ye, Q. Z. Fe(II) is the native cofactor for Escherichia coli methionine aminopeptidase. J. Biol. Chem. 283, 26879–26885 (2008).
16. Fiala, G. & Stetter, K. O. Pyrococcus furiosus sp. nov. represents a novel genus of marine heterotrophic archaebacteria growing optimally at 100 °C. Arch. Microbiol. 145, 56–61 (1986).
17. Aiuppa, A., Dongarra, G., Capasso, G. & Allard, P. Trace elements in the thermal groundwaters of Vulcano island (Sicily). J. Volc. Geoth. Res. 98, 189–207 (2000).
18. Menon, A. L. et al. Novel protein complexes identified in the hyperthermophilic archaeon Pyrococcus furiosus by non-denaturing fractionation of the native proteome. Mol. Cell. Proteomics 8, 735–751 (2009).
19. Poole, F. L. II et al. Defining genes in the genome of the hyperthermophilic archaeon Pyrococcus furiosus: implications for all microbial genomes. J. Bacteriol. 187, 7325–7332 (2005).
20. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 37, D211–D215 (2009).
21. Agarwal, G., Rajavel, M., Gopal, B. & Srinivasan, N. Structure-based phylogeny as a diagnostic for functional characterization of proteins with a cupin fold. PLoS ONE 4, e5736 (2009).
22. Luttringer, F., Mulliez, E., Dublet, B., Lemaire, D. & Fontecave, M. The Zn center of the anaerobic ribonucleotide reductase from E. coli. J. Biol. Inorg. Chem. 14, 923–933 (2009).
23. Weinberg, M. V., Schut, G. J., Brehm, S., Datta, S. & Adams, M. W. W. Cold shock of a hyperthermophilic archaeon: Pyrococcus furiosus exhibits multiple responses to a suboptimal growth temperature with a key role for membrane-bound glycoproteins. J. Bacteriol. 187, 336–348 (2005).
24. Schwarz, G., Mendel, R. R. & Ribbe, M. W. Molybdenum cofactors, enzymes and pathways. Nature 460, 839–847 (2009).
25. Splan, K. E., Musier-Forsyth, K., Boniecki, M. T. & Martinis, S. A. In vitro assays for the determination of aminoacyl-tRNA synthetase editing activity. Methods 44, 119–128 (2008).
26. Ragsdale, S. W. Nickel-based enzyme systems. J. Biol. Chem. 284, 18571–18575 (2009).
27. Perry, J. J., Shin, D. S., Getzoff, E. D. & Tainer, J. A. The structural biochemistry of the superoxide dismutases. Biochim. Biophys. Acta 1804, 245–262 (2010).
28. Zillig, W. et al. The Sulfolobus-“Caldariella” group: taxonomy on the basis of the structure of DNA-dependent RNA polymerases. Arch. Microbiol. 125, 259–269 (1980).
29. Kosnett, M. J. in Basic and clinical pharmacology 10th ed. (ed. B. G. Katzung) 945–957 (McGraw-Hill, 2007).
30. Bressler, J. P. et al. Metal transporters in intestine and brain: their involvement in metal-associated neurotoxicities. Hum. Exp. Toxicol. 26, 221–229 (2007).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This research is part of the MAGGIE (Molecular Assemblies, Genes and Genomes Integrated Efficiently) project supported by Department of Energy grant (DE-FG0207ER64326). We thank S. Hammond, L. Wells, R. Hopkins and D. Phillips for help with in-gel MS analyses.

Author Contributions A.C., A.L.M., M.P.T. and J.W.S. grew and fractionated P. furiosus; A.L.M. carried out cytoplasmic washes; A.L.M. and S.M.Y. grew and fractionated S. solfataricus; A.L.M. and M.P.T. grew and fractionated E. coli; A.C. and S.S. performed ICP-MS analyses; S.A.T., E.K., J.V.A. and G.S. performed HT-MS/MS analyses; A.L.M. purified PF0056; J.W.S. purified PF1972 and PF0086; M.P.T. and B.J.V. purified PF0742; M.T.P. purified PF1587, PF0215, PF1343 and PF0257; W.A.L., J.L.P. and F.L.P. carried out metal-protein bioinformatic analyses; A.C., A.L.M., F.E.J., F.L.P., M.P.T. and J.A.T. and M.W.W.A. contributed to experimental design and data analyses, and wrote the paper.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.W.W.A. ([email protected]).
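As a worked illustration of the stoichiometry values quoted in the text (atoms mol⁻¹), the following Python sketch converts an ICP-MS reading and a protein concentration into metal atoms per protein molecule. It is not the authors' procedure: the function names, the treatment of p.p.b. as µg per litre, and the example numbers (a 25 kDa protein at 0.1 mg ml⁻¹ and a reading of 4.0 p.p.b. Ni in a 50-fold diluted sample) are assumptions chosen only to show the arithmetic.

```python
# Standard atomic weights (g/mol) for some metals discussed in the text.
ATOMIC_MASS = {"Fe": 55.845, "Ni": 58.693, "Zn": 65.38,
               "Mo": 95.95, "Pb": 207.2, "U": 238.029}


def metal_molar(ppb, metal, dilution=1.0):
    """Metal concentration in mol/l from a reading in p.p.b.

    Assumes 1 p.p.b. == 1 microgram per litre (dilute aqueous sample)
    and multiplies by the dilution factor applied before measurement.
    """
    grams_per_litre = ppb * dilution * 1e-6
    return grams_per_litre / ATOMIC_MASS[metal]


def protein_molar(mg_per_ml, mass_kda):
    """Protein concentration in mol/l (mg/ml equals g/l; 1 kDa = 1000 g/mol)."""
    return mg_per_ml / (mass_kda * 1000.0)


def atoms_per_mol(ppb, metal, mg_per_ml, mass_kda, dilution=1.0):
    """Metal atoms per protein molecule (reported in the text as atoms/mol)."""
    return metal_molar(ppb, metal, dilution) / protein_molar(mg_per_ml, mass_kda)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    ratio = atoms_per_mol(4.0, "Ni", mg_per_ml=0.1, mass_kda=25.0, dilution=50)
    print(f"Ni atoms per protein: {ratio:.2f}")
```

With these illustrative inputs the ratio comes out at about 0.85, that is, a near-stoichiometric site; values such as 0.010 would correspond to the trace, misincorporated metal described for the Pb and U peaks.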

782 ©2010 Macmillan Publishers Limited. All rights reserved doi:10.1038/nature09265
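The InterPro-Metal (IPM) step, in which annotation text is searched for metal-related terms and each metal peak is then checked for a co-eluting protein with a matching hit, can be sketched as follows. The paper reports using a Perl script and the dictionary in Supplementary Table 9; this Python version is only an approximation under stated assumptions. The patterns, protein identifiers (`protA` to `protC`), domain descriptions and peaks are all hypothetical.

```python
import re

# Toy dictionary of metal-related terms (an assumption, not Supplementary Table 9).
METAL_PATTERNS = {
    "Fe": re.compile(r"\b(?:[Ii]ron|[Ff]err\w*|Fe)\b"),
    "Ni": re.compile(r"\b(?:[Nn]ickel|Ni)\b"),
    "Zn": re.compile(r"\b(?:[Zz]inc|Zn)\b"),
    "Mo": re.compile(r"\b(?:[Mm]olybd\w*|Mo)\b"),
}


def ipm_hits(description):
    """Return the set of metals whose terms occur in a domain description."""
    return {metal for metal, pattern in METAL_PATTERNS.items()
            if pattern.search(description)}


def classify_peaks(peaks, descriptions):
    """Split metal peaks into assigned and unassigned.

    peaks: list of (metal, [protein ids identified in the peak fractions]).
    A peak counts as assigned if any of its proteins has a hit for that metal.
    """
    annotated = {pid: ipm_hits(text) for pid, text in descriptions.items()}
    assigned, unassigned = [], []
    for metal, proteins in peaks:
        has_hit = any(metal in annotated.get(pid, set()) for pid in proteins)
        (assigned if has_hit else unassigned).append((metal, proteins))
    return assigned, unassigned


if __name__ == "__main__":
    # Hypothetical annotations and peaks, for illustration only.
    descriptions = {
        "protA": "Zinc finger, rubredoxin-type iron-binding domain",
        "protB": "Cupin, sugar-binding barrel",
        "protC": "Nickel-dependent hydrogenase, large subunit",
    }
    peaks = [("Ni", ["protB"]), ("Ni", ["protB", "protC"]),
             ("Zn", ["protA"]), ("Mo", ["protA", "protC"])]
    assigned, unassigned = classify_peaks(peaks, descriptions)
    print(len(assigned), "assigned;", len(unassigned), "unassigned")
```

In the paper the same kind of tally gave 185 assigned and 158 unassigned peaks out of 343. Keyword matching of this sort is deliberately permissive, which is why the authors describe manually assessing the dictionary hits before tabulating them.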

METHODS

Fractionation of P. furiosus. The procedures for the growth of Pyrococcus furiosus (DSM 3638ᵀ) at 90 °C on a rich medium (RM), the anaerobic preparation of the cytoplasmic extract, and its anaerobic fractionation using one first (C1) and fifteen second (C2) level chromatography columns, have been described elsewhere¹⁸. The column fractionation procedure is summarized in Supplementary Fig. 5. Information on other column steps is described for each protein that was purified. P. furiosus was also grown using the RMex and CMMex media. RMex is the RM medium supplemented with lead, uranium, rhodium and ruthenium (Pb(NO₃)₂, UO₂(C₂H₃O₂)₂·2H₂O, RhCl₃·3H₂O and RuCl₃·3H₂O, each at 50 nM) and chromium (CrCl₃·6H₂O, 200 nM). CMMex medium is complete maltose medium³¹,³² containing elemental sulphur (31 mM) to which Pb(NO₃)₂ and UO₂(C₂H₃O₂)₂·2H₂O were added at 500 nM each. Cells were processed for the C1 fractionation step as described for cells grown in the RM medium¹⁸. For metal analyses and susceptibility of metals to removal by filtration studies, the cytoplasmic extracts from cells grown in the RM and RMex media were used. Frozen cells (3 g) were gently lysed by osmotic shock anaerobically under a continuous flow of Ar in 9 ml of 50 mM Tris-HCl (pH 8.0) containing 2 mM sodium dithionite as a reductant and 0.5 mg ml⁻¹ DNase I to reduce viscosity. The cell-lysates were centrifuged at 100,000g for 1 h at 18 °C and the supernatants representing the cytoplasmic fractions were used for the metal analyses.

Protein identification. The procedures for protein identification in solution using high-throughput tandem mass spectrometry (HT-MS/MS) were described previously¹⁸. Proteins identified by MALDI-MS were first separated using native- or SDS-PAGE gradient gel electrophoresis (4–20% Criterion gels; Bio-Rad). The gel bands of interest were cut out, processed and digested for 16 h at 37 °C according to the manufacturer’s protocol provided with the recombinant porcine trypsin used for the in-gel protein digest (Roche Applied Science). The peptides were purified with C-18 reversed-phase NuTip cartridges according to the manufacturer’s instructions (Glygen). The peptides were eluted with 1 µl of a saturated solution of α-cyano-4-hydroxycinnamic acid (Sigma-Aldrich) dissolved in 50% (v/v) acetonitrile containing 0.1% (v/v) trifluoroacetic acid (TFA) and spotted onto a MTP 384 Massive MALDI target (Bruker Daltonics), along with 1 µl of ProteoMass Peptide & Protein MALDI-MS Calibration Kit standard (Sigma-Aldrich). The target was analysed using a Bruker Daltonics Autoflex MALDI time-of-flight mass spectrometer in reflectron mode using positive ion detection.

The mass list was generated by the SNAP peak detection algorithm using a signal-to-noise threshold of four following baseline correction of the spectra. Proteins were identified by searching the mass list against the National Center for Biotechnology Information (NCBI) annotation of the P. furiosus genome (NC_003413) using Mascot’s Peptide Mass Fingerprint tool (version 2.1, Matrix Science). The searches were conducted using a peptide mass tolerance of 1.0, variable modifications of carbamidomethylation (C) and oxidation (M), and a maximum of one missed cleavage. Proteins with a P < 0.05 (corresponding to a Mascot protein score greater than 46) were considered significant.

Metal analyses. Metals were measured using a quadrupole-based ICP-MS (7500ce, Agilent Technologies) equipped with a MicroMist Nebulizer (Agilent Technologies). This system uses an octupole collision/reaction cell for collision focusing and interference reduction. Sample solutions were introduced into the instrument via a peristaltic pump from an ASX-500 series ICP-MS autosampler (Agilent Technologies) at a flow rate of 0.2 ml min⁻¹ into a water-cooled (2 °C) quartz spray chamber. Argon (>99.99% purity) was used as the plasma, auxiliary, nebulizer and makeup gas. The instrument was operated with and without a collision gas. The use of helium (>99.99% purity) as the collision gas effectively removes almost all matrix- and carrier-gas-related interferences (the impact of ArO, a main interference of iron isotope ⁵⁶Fe, was negligible even when the instrument was run in the collision mode; data not shown). Before analysis, the instrument was stabilized for 30 min and equilibrated for 40 min with a matrix solution identical to that of the sample to be analysed. The operation conditions used to analyse all fractions are summarized in Supplementary Table 2.

Quantification of metals was performed using certified standard reference materials (IV-ICPMS-71A CCS-5 and CMS-2; Inorganic Ventures) as external standards. Standard stock solutions were diluted with high-purity, glass-distilled deionized water obtained from a Corning Mega-Pure System D2 water purifier (Corning) acidified with 2% (v/v) trace metal grade nitric acid (Fisher Scientific). To control the stability of the plasma, drifting and matrix effects, an internal standard IV-ICPMS-71D (10 mg l⁻¹ of Li, Sc, Y, In, Tb and Bi; Inorganic Ventures) was automatically added to the samples and to external standards before being added to the nebulizer. Li, Sc and Y were used as internal standards in the presence of collision gas, and Y, In, Tb and Bi were used in the non-collision mode of instrument operation.

Samples were diluted to the desired volume with 2% (v/v) Trace Metal Grade HNO₃ (Fisher Scientific) in acid-washed 15 ml polypropylene tubes (Sarstedt). The acidified samples were vortexed and incubated for at least 1.5 h at 24 °C to denature proteins and release metals. The samples were then centrifuged at 2,800g for 5 min at 24 °C in an Allegra 6R centrifuge (Beckman) immediately before the experimental run. The experimental parameters for ICP-MS (without collision gas) were optimized to maximize sensitivity for the isotopes present in the tuning solution (1 p.p.b.) using a procedure described by the instrument manufacturer (Agilent Technologies). This was followed by optimization of the maximum ion intensities of the isotopes in the tuning solution to minimize isobaric interferences in the presence of the collision gas. The dual ion detector (used in pulse and analogue mode) was calibrated for each of the investigated isotopes. A calibration (0, 0.1, 0.5, 1, 5, 10 and 50 p.p.b.) was performed for each of the metals to be analysed in a specific run and the regression coefficient for each metal was >0.99. Each sample was analysed in duplicate.

To release metals during sample digestion before ICP-MS analysis³³, the most commonly used acid, HNO₃, has the benefits of wide elemental solubility combined with low levels of interference and signal instability³⁴. Native purified P. furiosus rubredoxin (PF1282), a highly stable iron-containing metalloprotein³⁵, was used to develop the pretreatment procedure for ICP-MS analysis. The effect of HNO₃ concentration (1, 2 and 5%, v/v) and temperature (24 and 80 °C) after a 1 h pretreatment on the release of Fe from rubredoxin (at a final concentration of 0.83 mg ml⁻¹) was evaluated by ICP-MS operated in the collision mode. Pretreatment at 24 °C in 2% (v/v) HNO₃ for 1 h led to the quantitative release of Fe (data not shown). The same conditions were used to investigate the release of Co, Ni, Mo, W and Zn from experimental samples. Fraction position 48 collected after fractionation of P. furiosus cytosol over a DEAE-Sepharose FF (DEAE-FF, GE Healthcare) column was diluted 50-fold in 2% (v/v) HNO₃ and after the various pretreatments metals were detected by ICP-MS in collision mode. The results confirmed those obtained with rubredoxin (data not shown), namely, a 1 h sample treatment in 2 and 5% (v/v) nitric acid at 24 °C before ICP-MS analysis gave the same values for the release of the metals in the P. furiosus samples. The validity of all of these approaches is illustrated in Supplementary Fig. 1, which shows that there is an excellent correlation between the iron concentrations measured in the C1 chromatography fractions determined by a colorimetric assay and by ICP-MS analysis performed in the reaction and collision modes.

Identification of metalloproteins in P. furiosus. The list of the P. furiosus proteins having metal-associated or metal-binding domains was generated by analyzing the P. furiosus genome using the 2007 InterProScan tool³⁶ and the InterPro (IPR) database²⁰. Each gene of P. furiosus may have several different IPR hits each with a unique IPR identifier, which corresponds to a family, domain or functional site that is accompanied by a well-maintained summary page on the IPR web site (www.ebi.ac.uk/interpro/). All of the information in the summary pages is available in a single downloadable XML file. This summary XML file was searched against a dictionary of specific metal-related words or phrases (i.e. Fe, iron, metal, etc., see Supplementary Table 9) represented by regular expressions using a simple Perl script. These dictionary hits were then manually assessed for accuracy and tabulated. On the basis of this analysis, a list of P. furiosus proteins having metal-associated domains was created for each of the metals detected in the C1 fractions (Supplementary Table 7). These lists were then compared with the proteins identified by HT-MS/MS¹⁸ within each of the metal peaks to determine if any known or predicted protein contained that metal.

Recombinant protein expression. Heterologous expression of PF0086 was carried out in E. coli with the addition of either no metal, NiCl₂, CoCl₂ or ZnCl₂ (each 200 µM) to cells growing in NZCYM rich medium, which contains casein hydrolysate, casamino acids and yeast extract³⁷. The PF0086 open reading frame was amplified from P. furiosus genomic DNA using the forward primer 5′-GGGAGCTCCATATGACCAGATTGCTATACTATGAAGACGC-3′ containing an NdeI restriction site and the reverse primer 5′-AAGCTCGAGCGGCCGCCTAATCTTCCAGCCATATCTCCAATC-3′ containing an XhoI restriction site (restriction sites are underlined). The PCR product was digested with the restriction enzymes, NdeI and XhoI, and inserted into the pET24a(+) vector (Novagen). The sequence of the resulting plasmid, pET24a(+):PF0086, was verified by Sanger sequencing of both strands at the Integrated Biotech Laboratories facility at the University of Georgia. The plasmid was transformed into E. coli BL21(DE3) pRIPL and 1-l cultures were grown at 37 °C to an A₆₀₀ of 0.6–0.7. Isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.4 mM and either no metal, NiCl₂, CoCl₂ or ZnCl₂ was added to a final concentration of 200 µM. After a 16-h incubation at 16 °C, cells were collected by centrifugation, resuspended in 50 mM Tris, pH 8.0, and lysed with lysozyme. Cell-free extracts were prepared by centrifuging the lysed cells at 48,000g for 20 min. SDS–PAGE analysis of the cell-free extracts demonstrated the production of a ~25 kDa protein not seen in control cells lacking the recombinant plasmid. To purify PF0086, the cell extract was heat-treated (80 °C for

©2010 Macmillan Publishers Limited. All rights reserved doi:10.1038/nature09265

15 min) and centrifuged at 48,000g for 20 min. The supernatant was applied to a 5 ml QHP column (GE Healthcare) equilibrated with 50 mM Tris-HCl, pH 8.0, and the bound proteins were eluted using a linear gradient from 0 to 0.5 M NaCl over 20 column volumes (CVs). The 25 kDa protein eluted from the QHP column at salt concentrations between 0.3 and 0.4 M NaCl. Those fractions containing the ~25 kDa protein were combined, concentrated, and applied onto a Superdex 75 16/60 column equilibrated with 50 mM Tris-HCl, pH 8.0, containing 200 mM KCl. SDS–PAGE analysis was used to identify the fractions containing the ~25 kDa recombinant protein. MALDI-MS analysis confirmed that the major gel band was PF0086 and ICP-MS was used to determine the metal content of the recombinant protein. The metal content of the recombinant proteins obtained from E. coli grown in various media is shown in Supplementary Fig. 10.

Fractionation of S. solfataricus and E. coli. E. coli BW25113 was grown aerobically with shaking at 37 °C in 2 l of rich (2×YT) medium and collected in the late log phase yielding 11 g of cell paste³⁷. All further steps were performed under anaerobic and reducing conditions. The cells were resuspended in 3 CV of 50 mM Tris HCl (pH 8.0) containing 2 mM Na-dithionite (Buffer A) and 0.05 mg ml⁻¹ lysozyme and incubated with shaking for 1 h at 25 °C. The cell lysate was treated with DNAse I (4 mg ml⁻¹) and incubated an additional 30 min before centrifugation at 47,000g for 60 min at 4 °C and the supernatant (cytoplasmic fraction) was loaded at 25% in Buffer A onto a 45 ml (5.3 × 8.5 cm) DEAE-FF column equilibrated in Buffer A. After loading, the column was washed with 5 CV of Buffer A and the proteins were eluted with a 15 CV gradient of 0–500 mM NaCl in Buffer A (60 fractions), followed by a 7 CV gradient of 500–1,000 mM NaCl in Buffer A (5 fractions). The fractions generated were used for ICP-MS analysis (Supplementary Table 13).

S. solfataricus P2 was grown aerobically at 80 °C in a 600-l fermenter on a medium containing sucrose and peptides (pH 3.0) and collected in the late log phase yielding ~600 g of cell paste²⁸. The procedure for preparing the anaerobic cell-free extract and for running the first chromatography column were the same as described for P. furiosus, except that 100 g of frozen cells were processed. The cytoplasmic fraction was loaded onto a 175 ml (5 × 9 cm) DEAE-FF column, washed with three CV of Buffer A and proteins were eluted using a 0–250 mM NaCl gradient over 15 CV (60 fractions), followed by a gradient (3 CVs) of 250–1,000 mM NaCl (5 fractions). The fractions generated were used for ICP-MS analysis (Supplementary Table 13).

31. Adams, M. W. W. et al. Key role for sulfur in peptide metabolism and in regulation of three hydrogenases in the hyperthermophilic archaeon Pyrococcus furiosus. J. Bacteriol. 183, 716–724 (2001).
32. Schut, G. J., Bridger, S. L. & Adams, M. W. W. Insights into the metabolism of elemental sulfur by the hyperthermophilic archaeon Pyrococcus furiosus: characterization of a coenzyme A-dependent NAD(P)H sulfur oxidoreductase. J. Bacteriol. 189, 4431–4441 (2007).
33. Cai, Y., Georgiadis, M. & Fourqurean, J. W. Determination of arsenic in seagrass using inductively coupled plasma mass spectrometry. Spectrochim. Acta B 55, 1411–1422 (2000).
34. Karthikeyan, S., Joshi, U. M. & Balasubramanian, R. Microwave assisted sample preparation for determining water-soluble fraction of trace elements in urban airborne particulate matter: evaluation of bioavailability. Anal. Chim. Acta 576, 23–30 (2006).
35. Blake, P. R. et al. Determinants of protein hyperthermostability: purification and amino acid sequence of rubredoxin from the hyperthermophilic archaebacterium Pyrococcus furiosus and secondary structure of the zinc adduct by NMR. Biochemistry 30, 10885–10895 (1991).
36. Quevillon, E. et al. InterProScan: protein domains identifier. Nucleic Acids Res. 33, W116–W120 (2005).
37. Sambrook, J., Fritsch, E. F. & Maniatis, T. Molecular cloning: A Laboratory Manual 2nd ed. Vol. 3 (Cold Spring Harbor Laboratory Press, 1989).
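
The dictionary search of the InterPro summary XML described in the Methods can be illustrated with a short sketch. The authors report using a simple Perl script; the Python below is only an illustration of the same idea, not their code. The three-entry dictionary, the sample XML, its tag names and the entry identifiers are all assumptions made for the example (the real dictionary is in Supplementary Table 9), and, as in the Methods, any hits would still need manual assessment.

```python
import re
import xml.etree.ElementTree as ET

# Assumed, abbreviated dictionary (the paper's real list is in Supplementary Table 9).
# Element symbols are matched case-sensitively; words are matched case-insensitively.
METAL_PATTERNS = {
    "Fe": re.compile(r"\bFe\b|(?i:\b(?:iron|heme|haem)\b)"),
    "Zn": re.compile(r"\bZn\b|(?i:\bzinc\b)"),
    "metal (generic)": re.compile(r"(?i:\bmetal\w*)"),
}

# Made-up stand-in for the downloadable summary XML; tag names are assumptions.
SAMPLE_XML = """
<interprodb>
  <interpro id="IPR_EXAMPLE_1"><name>Rubredoxin-like domain</name>
    <abstract>Binds one iron atom through four cysteine residues.</abstract></interpro>
  <interpro id="IPR_EXAMPLE_2"><name>Zinc finger example</name>
    <abstract>A small metal-binding motif coordinating Zn ions.</abstract></interpro>
  <interpro id="IPR_EXAMPLE_3"><name>Transferase example</name>
    <abstract>Transfers methyl groups; no cofactor is described.</abstract></interpro>
</interprodb>
"""


def find_metal_hits(xml_text):
    """Return {entry id: sorted list of dictionary keys whose pattern matched}."""
    hits = {}
    for entry in ET.fromstring(xml_text).iter("interpro"):
        # Search the concatenated text of the entry (name plus abstract).
        text = " ".join(entry.itertext())
        matched = sorted(key for key, pattern in METAL_PATTERNS.items()
                         if pattern.search(text))
        if matched:
            hits[entry.get("id")] = matched
    return hits


if __name__ == "__main__":
    for ipr_id, metals in find_metal_hits(SAMPLE_XML).items():
        print(ipr_id, metals)
```

Matching element symbols case-sensitively is a design choice of this sketch: a case-insensitive match on a two-letter symbol such as Co would also catch ordinary prefixes like "co-", which is the kind of false hit the manual assessment step is there to remove.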

CAREERS
NATURE|Vol 466|5 August 2010

Researchers on a mission
Marine biologists are developing an appreciation for conservation, a change that is creating new jobs. Emma Marris reports.

Many marine biologists are now as interested in preserving species as in ensuring fish stocks.

Dennis Apeti and Andrew Mason spend most of their day collecting samples of water, sediment and oysters at varying depths in the Gulf of Mexico. In the wake of what many term the worst environmental disaster in American history, Apeti and Mason, both scientists at the US government’s National Oceanic and Atmospheric Administration (NOAA), are trying to assess the consequences as plumes of oil continue to spread. There is little doubt that the Gulf’s plants and animals will be affected, potentially threatening species and ecosystems.

Apeti and Mason are not conservation biologists, but they’re well aware of what results such as theirs could indicate for the futures of pelican and plankton. They also realize what such disasters mean for the conservation field. “The number of conservation jobs is going to increase with public awareness,” says Mason. “People are going to demand more answers.”

The scientific-research response to a disaster such as the Deepwater Horizon oil spill exemplifies how conservation biology, once largely a specialized terrestrial field, has become part and parcel of mainstream marine biology in the United States: when a marine ecosystem is transformed, scientists now often not only study the impact, but also suggest ways to conserve the affected species.

Humanity’s disruptive effects on the ocean have turned many former basic-research scientists into conservationists. At the same time, the culture of science is changing and fewer scientists believe that their work can ever be truly value-neutral. Many marine biologists have dropped their traditional objections to some kinds of science advocacy, or at least to experiments designed to inform conservation. “You cannot deal with the biology of turtles or whales without looking at habitat modification by humans,” says fisheries scientist Daniel Pauly of the University of British Columbia in Vancouver, Canada. “Noise, chemicals, removal of habitat by trawlers, you name it. You are forced to take a position. It is somehow unavoidable.”

For those considering entering marine biology, this trend (more conservation in marine biology and, sometimes indistinguishably, more marine in conservation biology) should provide more jobs, in potentially new areas, say some in the field. The trend also makes it easier to move between non-governmental organizations, academia and posts in government. “We have more options than we had ten years ago,” says Heather Leslie, an assistant professor of environmental studies and biology at Brown University in Providence, Rhode Island. “You can choose where you want to be on that spectrum.” Whether that means basic, applied or advocacy-driven research, interdisciplinary skill sets, including the tools to understand sociology and humans’ interactions with their natural surroundings, have become increasingly important.

In 1997, Elliott Norse, a marine biologist and president of the Marine Conservation Biology Institute in Bellevue, Washington, put together the first symposium on marine conservation biology in a conscious effort to start a new field. “There were fisheries

people meeting coral biologists meeting mammalogists,” he says. At one point in the conference, held in Victoria, British Columbia, big names in marine biology and conservation were sitting in clusters on the lawn, engaged in impassioned discussions. Norse recalls a colleague who turned to him in excitement and said, “This is Woodstock!”.

The marine biology that existed when Norse founded the Marine Conservation Biology Institute in 1996, save for that done by a small number of forward-thinking people, was either applied or purely academic, he says. “But the application was something like how many fish are in the sea so we could catch more of them.” Norse and other biologists who were concerned about the sea have made a conscious effort to change that. As a result, engagement by the marine community in conservation biology is increasing. Back in 1996, Norse and a colleague examined papers in nine volumes of the journal Conservation Biology, and found that just 5% of the total covered marine topics (K. E. Irish & E. A. Norse Conserv. Biol. 10, 680–681; 1996). An unpublished analysis performed recently on behalf of Naturejobs reports that in the past three-and-a-half years, marine topics have represented 9% of the journal’s total: still a minority, but a significant increase.

Culture shift
Not everyone believes that the old divisions are dead and buried, however. Evolutionary ecologist Les Kaufman still urges young scientists interested in the academic path to take what he calls “the stealth approach”: putting the emphasis on pure science until they are well established, and only then turning their research programmes towards conservation. Kaufman, a professor of biology in the marine programme at Boston University in Massachusetts and a principal investigator at Conservation International, a non-profit organization based in Arlington, Virginia, says that conservation experience counts for little in traditional academic evaluations. But he admits that this gambit has a downside: while researchers are biding their time, attempting to secure tenure, “the oceans are falling apart”.

Kaufman believes that marine biology is building up a “critical mass” of good, rigorous scientists who also count themselves as conservationists. When they represent the majority (hopefully before it is too late for the oceans), scientists will no longer be forced to put conservation on the back burner until later in their careers.

That time may already have come, says Leslie, who studies the social and ecological dynamics of coastal systems, including the design and evaluation of conservation plans. Employers took interest in her as a job candidate, she says, because her research programme had explicit connections to policy and management. Many of the marine-biology PhDs of her generation, she says, are coming out of the conservation closet much earlier, or never going in. “We didn’t want to wait until we were senior scientists to do this kind of work,” says Leslie.

A new conservation-focused mindset is apparent at the University of Washington’s renowned School of Aquatic and Fishery Sciences in Seattle, which was known until 2000 as the School of Fisheries, a name that was proving unappealing to potential students. “We were losing undergraduate interest among those who were more conservation oriented,” says school director David Armstrong. “Our department has diversified a lot in the past eight to ten years.” It was once a powerhouse for producing government fisheries scientists, but many of the school’s graduates now move into tenure-track academic positions or non-governmental organizations, and staff and students alike have a more conservationist outlook. Faculty member Julian Olden, hired in 2006, won the Early Career Conservationist Award from the Society for Conservation Biology this year for his research on the spread of invasive aquatic species and their effects on ecosystems.

Staff such as Olden might help to convince a new generation that fisheries science is not incompatible with conservation. “Classical fisheries science is so closely associated with failure, there is a real risk that we won’t get good students,” says Pauly. Those failures are notorious: fish stocks collapsing into nothing despite being managed by trained scientists. But now, many graduates of fisheries programmes are tackling the problem by finding new applications for quantitative models that were developed to maximize fisheries yields; they are modelling fish populations to learn how best to conserve them. Instead of asking how many fish can be taken out one year without depleting stocks the next, they seek to determine how many fish can be removed without damaging a functional ecosystem in which fish from all age classes are well represented.

Jobs growth
The infiltration of conservation into traditional fields and the birth of marine conservation biology have both spawned jobs in the past 15 years. These have come about through the addition of conservation-science positions in government, particularly at NOAA; the expansion of marine interests at the big non-governmental conservation organizations; and the growth of new institutes that specifically focus on marine conservation. The Marine Conservation Biology Institute is one such establishment, and the Scripps Center for Marine Biodiversity and Conservation, based at the University of California, San Diego, is another. Academic departments at the University of British Columbia, Brown University, Duke University in Durham, North Carolina, and others have also opened their doors to marine conservation.

Still, Norse bemoans the lack of funds for his pet field. At most institutions, he says, money comes through grant committees rooted in traditional disciplines: oceanography, biology and social sciences such as economics. There isn’t a lot of funding for interdisciplinary marine studies, although this represents the biggest research need, says Norse. Filling the gap for now are groups such as the David and Lucile Packard Foundation in Los Altos, California, and the Pew Charitable Trusts in Philadelphia, Pennsylvania.

Marine biologists in these new roles spend a surprising amount of their time studying a terrestrial species: humans. The researchers interviewed for this story were unanimous in recommending that young marine biologists with a conservationist bent develop experience in the social sciences by completing courses in sociology, anthropology and economics, or by including humans in their research projects (see ‘Voyages into conservation’). An interdisciplinary background is key. Some argue that managing the ocean is primarily a social question with a scientific component, rather than the other way around.

Although marine conservationists do sometimes design marine protected areas with limits on human activity, many doubt the feasibility of ‘fortress conservation’, the concept of preventing all human use of nature to which some terrestrial conservationists aspire. Instead, the marine conservationists are looking at problems through a “sustainability lens”, says Barry Gold, programme director for the marine conservation initiative at the Gordon and Betty Moore Foundation in Palo Alto, California. This means managing use rather than just banning it. Managers have a variety of tools, from laws to educational campaigns to complex financial incentives, for avoiding by-catch of threatened species, dumping into the sea, illegal fishing and even the destruction of reefs by recreational divers.

Interdisciplinary researchers might survey the species in a coral reef while interviewing local fishers and householders about which fish are considered tastiest and which are most culturally significant, or they might combine modelling the populations of a single species in a specific area with lobbying meetings of the regional fisheries management councils. “I am not sure,” says Joshua Cinner, a coral-reef expert at the Australian Research Council, “that the prospects for someone who can only count fish or look into a microscope are particularly bright.” ■

Emma Marris is a freelance writer based in Columbia, Missouri.

The Gulf of Mexico oil spill has raised awareness of marine conservation issues.

Voyages into conservation
Three early-career marine biologists share the moments when they dived into conservation. For each, taking the plunge also meant learning about a terrestrial species: humans.

Fisher’s-eye view
Janna Shackeroff, international coordinator for the US National Oceanic and Atmospheric Administration (NOAA) Coral Reef Conservation Program, based in Silver Spring, Maryland.

For Janna Shackeroff, a third-generation Californian and an avid swimmer and participant in beach clean-up from childhood, the conversion to conservation happened early. But it wasn’t until just before she started her graduate degree that she made the connection between her beloved sea and the people who live on its shores.

Before starting her PhD at Duke University in Beaufort, North Carolina, Shackeroff spent a few months helping her sister film a documentary about native fishermen in Hawaii. As she held the boom mic for the interviews, she became fascinated with the wealth of information about the sea held by non-scientists. For her graduate research, she went on to talk to fishermen, diving-shop owners and aquarium-fish collectors on the Kona coast of Hawaii’s Big Island, gathering thousands of pages of data and perspectives. “Dive-shop operators often will dive in the same reef a couple of times a week for 30 years,” she says. Her PhD in marine ecology and anthropology is “equally footed in the natural and social sciences”.

After earning her PhD in 2008, she went to work for NOAA, first at a marine protected area (a region where human activity is restricted to preserve resources) in Hawaii, and then in “office buildings, conference rooms and embassies”. Day to day, the team she manages helps other countries with tasks such as establishing marine protected areas and writing management plans. Shackeroff says that she chose a government position to be as close as possible to policy and management decisions.

Keeping a door open
Vera Agostini, scientist at the Nature Conservancy’s Global Marine Initiative, Miami, Florida.

When Vera Agostini earned her fisheries PhD from the University of Washington’s School of Aquatic and Fishery Sciences in Seattle in 2003, the department was still quite focused on industry-oriented fisheries management, and she didn’t mingle much with conservationists. “I was just too busy in my PhD programme; I didn’t reach out to the conservation world,” she says.

But once she had her degree, she wanted to do something to help the sea, and to use her communication skills. She took a position at the Pew Institute for Ocean Science at the University of Miami, where she tried to “bridge a gap between conservation and fisheries”. She is still based in Miami, now working for the Nature Conservancy. “In a typical work day, I might be talking to the mayor of a small town, or giving a presentation at the United Nations,” she says. But she also researches the life cycles of fish including anchovies and hake. “I produce the science that will help my policy colleagues go out and effect change.”

Her career doesn’t leave much time for peer-reviewed papers. But Agostini keeps at it, in case she ever wants to go back to academia. These days, universities no longer view a candidate as ‘tainted’ by advocacy work for non-governmental organizations, she says, “as long as you can keep your CV rich with what an academic would look for: publications”.

Chance favours the prepared
Michael Webster, programme officer with the Wild Salmon Ecosystems Initiative at the Gordon and Betty Moore Foundation, Palo Alto, California.

Michael Webster got his zoology PhD in 2001 at Oregon State University in Corvallis, on basic population biology and community ecology. It wasn’t until he began a postdoc with Bruce Menge and Jane Lubchenco, also at Oregon State, that he wondered whether pure science was for him. Menge focused on pure science, whereas his wife, Lubchenco, was involved in the interface of science and policy. “It was interesting to think which of these paths I would like to go down,” says Webster.

In the end, he chose Lubchenco’s path, moving into applied science. “I felt dissatisfied in ecology,” he says. “There is a lot of activity that focuses on getting degrees, getting grants, writing papers, but little incentive to apply that work and make it useful in the real world.”

He began looking for work at non-governmental organizations and government agencies in 2004, but there were few jobs available. An advert for a job at a foundation intrigued him, and the next thing he knew, he was managing grants, funding science that will inform salmon conservation. Marine-biology foundation jobs are hard to find, so Webster recommends having a plan B. But, he says, once one looks beyond the well-trodden academic path, all kinds of jobs like his are possible.

He has some advice for those wanting to go into conservation: “You have to be interested in more than the science: communication, policy, grassroots organizing,” he says. “Very little of it, for most people, is about being in the field and collecting data and writing papers.” E.M.

FUTURES
NATURE|Vol 466|5 August 2010

The silver bullet and the golden goose
A healthy profit.
Norman Spinrad

No, I’m not a Communist, but I’m not a capitalist either, not having the capital to finish the work any further myself, and that’s why. And the only reason I’m on my way to Cuba. If I can get there and convince the Cubans, the world won’t care that they’re Communists when they begin distributing the cure for cancer.

But even paranoiacs can have real enemies, and if mine manage to stop me, maybe this missive, tossed into the media ocean like a desperate message in multiple bottles, will at least tell the world what is being suppressed.

Was I hopelessly naive?

That’s what I was told one way or another by every major drug company I approached with the research that would surely lead to the cure for cancer. Not just for one specific cancer but for whole clades of malignancies. Not just another chemotherapy drug whose efficacy is measured in mere months of extended lifespan. Not something patients must take for the rest of their lives, like the AIDS cocktails, in order for those lives to continue.

The long-sought (or at least so I naively believed) silver bullet. The de-selfing of cancer cells and cancer cells alone, exposing every last one of them to destruction by any reasonably healthy human immune system without any negative effects on normal cells at all.

How much dare I reveal here not knowing who will read it?

I can tell you, whoever you may be, that my research has not been anything like that of the major drug companies and corporate laboratories upon which billions of dollars have been squandered since Richard Nixon declared the “War on Cancer”, a forever war, like the one on terrorism, that has produced many chemotherapy drugs that can hold back cancers, some for as long as they are taken, but damn little in the way of permanent cures.

I can tell you that my research has been based on the long-known fact that cancer cells arise and are destroyed by everyone’s immune system all the time, and that if this process were perfected, cancer as a deadly disease would not exist.

I can tell you that I spent a decade and more travelling the globe to seek out small tribal populations where, in fact, it did not exist: in jungles, deserts, obscure little islands. And that I found a dozen of them.

I can’t tell you where.

I depleted my finances getting the characteristic genomes of these isolated populations sequenced, only to find to my dismay that they seemed to have nothing relevant in common.

I can tell you that I realized the obvious under the influence of a local psychedelic in a hut in a jungle: if the cause of the absence of cancer in these isolated populations was not genetic, it could only be environmental.

Something common to a dozen isolated biomes, and probably a complex brew of many somethings coming together synergistically. Discover what the mix was, synthesize it and, voila, the silver bullet!

It would take the financial and technical resources of a major drug company to do it, as it would require exhaustive studies of the populations I had discovered and of their total biome surroundings on an interactive molecular level; synthesis of candidate combinations and their large-scale testing on neutral populations.

But the companies could well afford it, it was just the sort of dogged process that produced their streams of chemotherapy drugs, and out the other end this time would come the broad-spectrum cure for most if not all cancer. I’d simply tell all of them and let them compete to produce the silver bullet.

Hopelessly naive! So I was told, repeatedly.

“Don’t you watch television? Have you seen an ad for any drug that ever cured anything? It costs scores of millions to get any new drug out on the market, and that means we’ve got to keep selling it until the patent runs out. And chemotherapy drugs for cancer that the customers have to take indefinitely to stay alive are the profit centre that keeps this industry in the black. Your silver bullet cure would pull the rug out from our major profit centre and you expect us to pay for killing our own golden goose?”

“What are you, some kind of Communist?”

No, I’m not, but the Cuban government still is, isn’t it? Or, at least, their health-care system and their drug industry is government financed, and so their economic self-interest is in cutting health-care cost, not inflating it to maximize profit.

That’s why I’m on my way to Cuba.

That’s why I can’t risk revealing the location of the cancer-free populations I’ve discovered. Do I want to find out how far those whose economic self-interest is the opposite direction will go to protect their golden goose from my silver bullet?

But I’m a scientist, neither a capitalist nor a Communist.

But naive I no longer am. ■

Norman Spinrad’s latest novel is He Walked Among Us. He very recently spent three weeks in hospital being treated for stomach cancer, where he wrote an Internet diary on the experience in real-time. Now at home recovering well and back to work, but still receiving chemotherapy.

Jacey