<<

TECHNOLOGY FEATURE

Genetics: new tales from ancient DNA Vivien Marx

Coaxing ancient DNA to reveal its history delivers surprises and even improves ways of working with badly damaged present-day DNA.

As a , we may be 100,000 years older Because it has been preserved for a long than previously thought. A research team time, ancient DNA (aDNA) can shed new believes it has found the earliest known light on , epigenomes and Earth sapiens remains in Jebel Irhoud, history. Ancient DNA from between Morocco, which are around 300,000 years 7,000 and 45,000 years ago has helped old1. Fossils help scientists piece together researchers discover striking aspects of the ancient history of our and other species. European population history3. DNA can be Now that DNA can be extracted from fossils, extracted from ancient , teeth, , analyzed with high-throughput sequenc- eggshells, paleofeces and even . In the ing and reconstructed with computational absence of bones, researchers sifted through tools, DNA is becoming a kind of molecular soil in multiple Eurasian Pleistocene-era fossil2. “It is the most amazing data in the caves and found ancient human DNA4. world right now,” says , a popu- When analyzing ancient human DNA, lation and evolutionary biologist researchers use the human reference at Harvard Medical School. In many cases . With many organisms, howev- is the only way to truly look er, even a close living relative may still be MPI EVA S. Freidline, at what happened in the past, as opposed evolutionarily distant, says , A reconstruction of a 300,000-year-old face to making highly educated guesses, says evolutionary biologist at the University of based on H. sapiens fossils in Jebel Irhoud, Tom Gilbert, an evolutionary biologist at California at Santa Cruz. Toxodon, a hoofed, Morocco. the of the Natural extinct mammal, is an example where “we History Museum of Denmark, which is affil- don’t know what their closest living relative and methods developer at the Max Planck iated with the University of Copenhagen. is,” she says. Whether human or Toxodon, Institute for Evolutionary “It’s really like going back in time,” he says. what’s challenging about aDNA is that it’s a in Leipzig, Germany. These changes were mess. not well understood until high-throughput sequencing established a pattern: a Intriguing, messy aDNA that has undergone into uracil It is hard to extract aDNA efficiently, says is read as . Gilbert. Researchers need enough mate- Over time, ancient DNA breaks into piec- rial but must also responsibly minimize es, not with blunt-ended breaks but with the amount of precious sample they use to sticky, single-stranded overhangs. “And in generate complete genomes. “It’s not an easy these short single-stranded overhangs, you balance,” he says. have a very high deamination rate, and so Fossils can contain a low absolute amount your uracils accumulate particularly at the of endogenous aDNA, and some fossils have ends,” says Meyer. no endogenous DNA at all, says Qiaomei It can be helpful to treat aDNA with Fu, a researcher at the Institute of Vertebrate enzymes such as uracil-DNA glycosylase Paleontology and Paleoanthropology in or similar enzymes to cut the DNA where Beijing, where she and her team are cur- the uracils appear, says Gilbert. Uracils rently working on the human prehistory of accumulate only slowly over time, although Asia. the speed is temperature- and water-depen- El Sidrón research team research El Sidrón As DNA ages, chemical changes occur dent. The cytosine-to-uracil conversion is a DNA was found in the soil at an that are read as sequence changes, says hydrolytic reaction, so the warmer and wet- archaeological site in El Sidrón, Spain. Matthias Meyer, evolutionary biologist ter it is, the faster uracils will appear, he says.

NATURE METHODS | VOL.14 NO.8 | AUGUST 2017 | 771 TECHNOLOGY FEATURE

“We used to complain about the damage Yucatán Peninsula that was estimated to at the ends of the molecules,” says Shapiro, be between 12,000 and 13,000 years old5. but uracil occurrence is now being used After analyzing mitochondrial DNA to authenticate aDNA. Reich calls this pat- (mtDNA) from a and bones, they tern of DNA damage “a sanity check.” In a stated that their findings support a link batch of molecules, he and his colleagues between Paleoamericans and modern restrict analyses to molecules that show Native Americans. such damage. Meyer and his colleagues raised doubts about this interpretation because high- Fighting contamination throughput sequence evidence to confirm Contamination is a constant threat. “In the the presence of aDNA was lacking. The past, especially when working with modern or the sampling equipment might have human remains, there’s always been doubts,” been contaminated with modern Native says Meyer. As Gilbert explains, there have American DNA that was then amplified been multiple relatively high-profile exam- with PCR. In a published exchange, the ples of problems related to contamina- study authors countered that DNA damage tion. People contaminated samples when caused by contaminants can take on forms MPI EVA S. Tüpke, handling a human skeletal element and expected of aDNA. In their view, indepen- Extraction of aDNA happens in clean rooms. then performing PCR amplification. This dent replication matters for authentication in was before the advent of high-throughput human aDNA studies, and damage patterns sequencing. across samples from different environments time. It appears a founding population of Researchers use sterile excavation tech- might show an underappreciated variance. ancient was dispersed throughout niques, careful lab protocols and clean Viewpoint differences about these and Europe. This branch disappeared and was rooms, and perform PCR in rooms separate other damage patterns remain, but there replaced through migration. Around 19,000 from where DNA extraction is performed. is agreement about the fact that the major- years ago, at the end of the Ice Age, there was “This isn’t to say that contamination isn’t a ity of aDNA in a bone, for example, is not population migration from the area of cur- problem, but one can at least incorporate the from the ‘owner’ of that bone, says Shapiro. rent Spain; 14,000 years ago another migrat- uncertainty associated with it into analyses,” Most fossils are massively contaminated ing group arrived, likely from the east. says Gilbert. with microbial DNA. “This is irritating, and For this work, the teams extracted Typical aDNA damage patterns help makes analyses much more costly than they DNA in dedicated clean rooms and set researchers detect contamination, but it’s would otherwise be,” says Gilbert. Unless a up sequencing libraries. Fu brought her not always easy. The patterns have been lab is specifically studying metagenomes, ‘home-brew’ method again to bear: in- established mainly with aDNA from tem- this issue can be addressed through the use solution hybrid capture to enrich for perate zones and . Discoveries of enrichment techniques pre-sequencing many single- polymorphisms in other regions lead to discussions. In and data-filtering after sequencing. (SNPs)—between 390,000 and 3.7 million 2014, researchers reported the analysis of them. The team synthesized oligonu- of a skeleton found in a cave on Mexico’s Human DNA enrichment cleotides 52 base pairs long that targeted Microbial contamination led Fu to work on these SNPs and hybridized aDNA to these her “home-brew solution method,” a hybrid- probes. Without this strategy, she says, the ization enrichment strategy for mitochon- team would have been unable to obtain drial and nuclear DNA, which she co-devel- DNA from that many individuals from a oped with Meyer as a PhD student at the pool so contaminated with microbial DNA. Max Planck Institute in Leipzig6. Enriching Such human population studies are highly fragmented aDNA calls for overlap- remarkable, says Gilbert about this and other ping probes, which limits the size of genomic research. Yet, he says, one must bear in mind regions that can be targeted. To address this that labs draw conclusions from imperfect limitation, the team used oligonucleotides sample sets and the information is restricted made on arrays to construct probe libraries by the available samples. “The more things and combined them into one ‘superprobe we look at, the more we can refine our ideas library’. This was the main strategy used in of the past.” Almost on an annual basis, find- their work on Ice Age Europe, which she co- ings and data add to the story of Europe’s authored as a postdoctoral fellow in Reich’s and Australia’s peopling. “This is natural— lab3. it’s how science works—but it does mean The fossil record had revealed that Europe researchers have a responsibility to be very was first populated 45,000 years ago. By careful how they word their discoveries, assembling and analyzing genome-wide bearing in mind that their results may not T.C.G. Bosch T.C.G. data from 51 ancient humans between always be set in stone,” he says. Scientists need to be vigilant about aDNA 7,000 and around 45,000 years old, the Fu says that when working with aDNA, it contamination, says Tom Gilbert. researchers tracked genetic changes over is also advisable to “always think you are so

772 | VOL.14 NO.8 | AUGUST 2017 | METHODS TECHNOLOGY FEATURE dirty” and to “always feel you will contami- mtDNA analysis of the highly degraded way the group works and the types of proj- nate your samples and will easily introduce DNA indicated that the bones belong to ects the team can take on, he says. And it’s a cross-contamination.” For computational relatives of , eastern Eurasian method he and his group continue to devel- data analysis, she recommends constant relatives of . In a later study, op and use to generate reference genomes speculation about what might be wrong and nuclear DNA analysis showed that these for Neanderthals, Denisovans and early advises researchers to check results from hominin individuals are more closely relat- humans. different angles and with different methods. ed to Neanderthals than to Denisovans8. Each cell is likely to have several hun- Enrichment for mtDNA was also at the These fossils probably belong to early dred copies of mtDNA, but the bones in the heart of the work of a multinational group of Neanderthal ancestors or their close rela- Spanish caves showed only traces of DNA. scientists that identified both Neanderthal tives, says Meyer, who adds that the odd The mtDNA enrichment combined with and DNA in soil samples at mtDNA findings indicate that these groups’ single-stranded DNA sequencing library seven archaeological sites across Eurasia4. population history is more complex than prep helped this project, says Meyer. The technique presents the possibility of can be picked up from currently available Even though sequencing costs have detecting hominin groups at sites that lack data. dropped, it would be impossible to gen- skeletal remains. A key technique in this and other work, erate sequence from so many fragments A typical excavation delivers thousands says Meyer, was single-stranded sequenc- without mtDNA enrichment, says Meyer. of animal bones “but you find very, very few ing library preparation, in which the two Most humans did not die in caves. They left human fossils,” says Meyer. The study’s first strands of DNA are unzipped and con- only small traces of organic matter, which pass was to check the state of DNA preserva- verted into separate libraries. It was devel- meant that researchers had to isolate DNA tion. The team looked for mammalian DNA, oped for an analysis of DNA from a tiny from hundreds of samples. Using mtDNA expecting and finding much DNA from well-preserved piece of bone that turned rather than nuclear DNA makes it easier to , bison and deer bones. “What out to be from a Denisovan individual. The discern human from the plentiful deer and really shocked us was how much DNA there approach increased the sequencing library hyena DNA, he says. The fragments can is in sediment,” he says. They found trillions yield by an order of magnitude, he says, then be assembled and compared to refer- of DNA fragments in 50 mg of soil—an which was needed because all they had to ence sequence to determine what is human- amount that fits on the tip of a steak knife. work with to reconstruct a high-quality like DNA, as opposed to, for example, cave- In the analysis, the team used mtDNA sequence was the tip of a juvenile’s finger. bear-like. as ‘bait’ to pull out DNA fragments with This library prep method has changed the Meyer and colleagues recently applied similar mtDNA, says Meyer. DNA from this single-stranded library prep method Single-stranded library prep primates or great apes but also Neanderthal, for highly degraded DNA to present-day formalin-fixed samples Denisovan or archaic human sequence will with quite damaged DNA9. “We see a hybridize to this ‘bait’. Sequence analysis Double-stranded DNA fragments ridiculous improvement in yield,” he says. then helped filter the data down to hominin They obtained 3,100 times more molecules DNA. Previous work has shown that aDNA than with double-stranded library prep. can be isolated from soil, says Gilbert7, and The gains are due to a number of factors. the limits have mainly been financial. He Heat denaturation Classic DNA library prep, with its enzy- hopes that as sequencing costs drop, other matic manipulations and purification steps, groups, too, will be able to try such analyses. A splinter oligo bridges a biotinylated oligo and always leads to molecule loss. The single- Research with aDNA brings ever-new DNA fragment; a ligase connects them. stranded method uses the material more aspects to light about H. sapiens. There were NNNNNN efficiently, essentially doubling the amount Neanderthals, other ancient humans called of DNA. Denisovans and early modern H. sapiens. Gain is also due to the fact that most of Our species diverged from the other two the library prep takes place on solid sup- The single strand is immobilized on a bead, and a more than 500,000 years ago, but despite the second strand is synthesized. port. “All the subsequent reactions are tak- separate lineages, there was dating and mat- ing place on beads,” says Meyer. Buffer and ing among the groups—so-called admixture enzymes can be changed without any sub- events. Our current-day DNA, with its trac- stantial DNA loss, and DNA-purification es from our past, shows that we have both steps can be eliminated. The results with A double-stranded sequencing adaptor is ligated. Neanderthal and Denisovan DNA from damaged present-day DNA show the role those encounters. Sometimes aDNA analy- niche methods can play in other fields after sis leads to a reassessment of fossils. originating in an area that some might con- sider “very peculiar,” he says. Methods surprises Heat denaturation The method is also more efficient at Fossils from around 28 hominin individu- retaining short DNA fragments, which The library strand is released from the bead and als dated to be more than 400,000 years is ready for sequencing. might be as short as 17–20 base pairs with old have been excavated in caves in Spain’s aDNA. That’s too short to align unambigu- MPI EVA; E. Dewalt, Nature Research E. Dewalt, Nature MPI EVA; Sierra de Atapuerca, where one site is ously, but that does not stop Meyer and called Sima de los Huesos, the ‘pit of bones’. A single-stranded sequencing library prep makes others from pushing methods forward and The fossils appear Neanderthal-like, but the most out of precious little sample. dreaming about possibilities.

NATURE METHODS | VOL.14 NO.8 | AUGUST 2017 | 773 TECHNOLOGY FEATURE

which has increased the throughput “at least BOX 1­ COMPUTING aDNA by a factor of ten, probably more,” he says. The lab’s liquid-handling systems handle Labs can work on aDNA data with many computational tools and pipelines. “I’d say the most of the sample prep pipetting steps. biggest computational challenge is for people who want to study ancient metagenomes— making sense of a mix of true ancient metagenome and modern bacterial contaminants is Gilbert sees some labs turn to robots and a very hard problem,” says Gilbert. automation, but, he says, “whether they Classic mapping approaches work well for mapping short fragments, says Alexander are the best solution or not is something I Seitz, an aDNA computational methods developer who is wrapping up his PhD at the haven’t decided on yet.” University of Tübingen in the bioinformatics lab of Kay Nieselt. When mapping human As the costs associated with high- DNA, researchers can use tools such as BWA or Bowtie. For the specific traits of aDNA, labs throughput sequencing drop “we can can use EAGER and PALEOMIX, which are computational pipelines that map reads against increasingly afford to consider generat- a known reference in order to reconstruct aDNA, he says. Tools such as mapDamage and ing high-coverage ancient genomes,” PMDtools can be used to verify that the samples are aDNA and not modern DNA. says Gilbert. That high coverage requires Identifying genomic rearrangements or gene loss in aDNA requires de novo assembly plenty of material, but teams battle the low methods, which is where the software developed by Seitz and Nieselt comes in. It’s called efficiency of DNA extraction. And both MADAM, for iMproving Ancient DNA AsseMbly. “When using enrichment, missing genes extraction and sequencing library prep are cannot be identified, as the corresponding DNA fragments were not enriched,” says Seitz. not only hard but difficult to scale up to the De novo assembly tools for modern DNA samples for which there is no closely related population level. reference include SOAPdenovo2 and SPADES. But given how short the aDNA fragments Much DNA has been extracted from are, the team wanted to improve the options by applying de Bruijn graphs in a specific fossils, “but still it’s frustrating that there way. are so many interesting fossils out there De Bruijn assemblers represent the input reads as a graph with nodes and edges. Each where we just can’t manage to get any DNA node stands for a k-mer, a fixed string that is shorter in length than a read. An edge from,” says Meyer. Getting DNA sequences is added between two nodes to indicate overlap. For example, if k-mer A is ATGCG and from Homo floresiensis or Homo naledi, for k-mer B is TGCGA, there will be an edge between k-mer A and k-mer B. But with aDNA, example, is yet a dream. the choice of fixed k-mer length is challenging because a sample has many different read Some labs work to reconstruct epi­ lengths, says Seitz. MADAM takes a two-layer approach. In a first computational pass, genomes, for example, by analyzing DNA the contiguous sequences are assembled from short reads in a de Bruijn graph that uses extracted from ancient bison. The pro- multiple k-mers at various lengths. In a second computational step these ‘contigs’ are cess calls for much DNA and damages combined. the DNA, too, which makes it not a good choice for aDNA. There are methylation Wish list of higher-sensitivity methods to get more maps of both Neanderthal and Denisovan With fragments less than 30 base pairs long out of very difficult or older samples. genomes based on inferences from ‘natu- “it really gets tricky,” says Meyer, to discern “Another frontier is to automate ancient rally’ occurring base damage and existing an ancient human DNA fragment from, say, DNA analysis, to make it more efficient and data. Ancient human DNA methylation microbial DNA, but it makes for an inter- cheaper,” he says. analysis, a kind of ancient gene expres- esting computational challenge awaiting a Over the past two years, Meyer and his col- sion analysis, is possible but for now only solution that would greatly help the field leagues have been using more automation, with bones, says Meyer. One could com- (see Box 1, “Computing aDNA”). Another pare gene expression in current-day bone item on his wish list is aDNA repair, such to that in ancient bone. “It would be great as ways to fix aDNA breaks. “If you could to do this from ancient brains,” he says, seal them, you could make your molecules but unfortunately there is no ancient soft longer again, you could perhaps make more tissue. molecules for analysis,” he says. As scientists learn more about how Vivien Marx is technology editor for aDNA decays, new ideas and methods can Nature Methods ([email protected]). help them get more DNA and information out of the bones. Shapiro worked on an 1. Richter, D. et al. Nature 546, 293–296 (2017). aDNA project with results indicating that 2. Orlando, L., Gilbert, M.T. & Willerslev, E. Nat. the team might have found a different form Rev. Genet. 16, 395–408 (2015). 3. Fu, Q. et al. Nature 534, 200–205 (2016). of aDNA damage. They used a sequenc- 4. Slon, V. et al. Science http://dx.doi. ing platform by a now-defunct company org/10.1126/science.aam9695 (2017). called Helicos Biosciences. It’s hard to know 5. Chatters, J.C. et al. Science 344, 750–754 (2014). whether more could have been learned, she 6. Fu, Q. et al. Proc. Natl. Acad. Sci. USA 110, says, and “I think there’s certainly room to 2223–2227 (2013). Institute of Vertebrate Paleontology and Paleoanthropology, Beijing, China Beijing, Paleoanthropology, and Paleontology Vertebrate of Institute see different forms of decay as the technol- 7. Willerslev, E. et al. Science 300, 791–795 (2003). ogy for generating the data improves.” Qiaomei Fu developed a ‘home-brew’ hybridization 8. Meyer, M. et al. Nature 531, 504–507 (2016). Reich sees multiple frontiers in aDNA enrichment strategy for mitochondrial and 9. Gansauge, M.-T. et al. Nucleic Acids Res. 45, e79 research, one of which is the development nuclear aDNA. (2017).

774 | VOL.14 NO.8 | AUGUST 2017 | NATURE METHODS