<<

Ancient DNA

Looking back in time with evolutionary Downloaded from http://portlandpress.com/biochemist/article-pdf/42/1/34/867102/bio042010034.pdf by guest on 28 September 2021

Tony Capra is an Associate Professor of Biological at Vanderbilt University, and one of the leaders in the field of evolutionary genetics.

Tony, many thanks for agreeing to contribute to The pairs (the As, Ts, Cs, Gs). Thanks to advances in DNA Biochemist, we are delighted to be able to speak to you sequencing technology, whole genome sequences from about your work in the field of evolutionary genetics. more than 100,000 people and thousands of different species are available. Similarities and differences between Tell us about yourself – you originally these genomes can reveal how different individuals and studied mathematics, what drew you species are evolutionarily related. to the molecular biosciences? There’s no way we could analyse these trillions of data points without the help of computer programs and I studied math and computer in college and statistical models. Computational modelling helps us to then went on to a PhD program in computer science. identify patterns in the DNA that would be impossible to However, through a personal connection, I got a job as detect otherwise. For example, my group is interested in a programmer in a mouse neurogenetics lab for several how DNA sequence patterns influence when and where summers during college. It exposed me to the fact that different genes are expressed. We use machine learning many fields of (genetics, in particular) were to extract sequence motifs that are predictive of gene beginning to generate more data than they could analyse regulatory activity in different cellular contexts. using their traditional methods. During my first year of grad school, I tried out a few small computational How do you apply algorithms and biology projects and realized that it was fun to be both statistical models to ancient and a and a computer scientist at the same time. I modern genomes? haven’t looked back since! Thanks to recent advances in the ability to identify, Can you tell us about evolutionary extract, and sequence DNA from ancient material (e.g., genetics? What is the aim of the field? bones and teeth), DNA sequence information is now available from thousands of ancient humans and several The goal of evolutionary genetics is to understand how of our close relatives like the Neanderthals. These data evolutionary processes have produced the astonishing are revolutionizing how we think about the recent past diversity of form and function we see in the living world of our species and model evolution. Ancient DNA is and to identify the specific genetic differences responsible enabling us to identify differences between humans and for differences between individuals and species. our close extinct relatives and to try to infer traits of these groups that we cannot directly observe. For example, You use computational approaches to my group recently developed an algorithmic approach analyse DNA – could you explain that that uses modern genomes paired with for our readers? data to predict differences in gene regulation in different cellular contexts between ancient and modern humans. Genomes are very big. For example, one copy of the When I started in this field, I never thought we’d have human genome contains more than 3 billion DNA base data like this available. It is like having a time machine!

34 February 2020 © Biochemical Society Ancient DNA

One of the aims of your is to What did you nd? explore how our increasing knowledge of genomic variation can be translated We found thousands of genes that were likely active at into the treatment and prevention of dierent levels between humans and Neanderthals. By . Could you tell us about that? looking at what we know about the functions of these genes, we found many traits that were likely divergent. Given the importance of our genomes to how our Some of these agree with known dierences in skeletal bodies develop and function, knowing an individual’s structure from the fossil record, e.g. they were shorter genome sequence has great potential to inform how and had a more pronounced brow ridge. Others point Downloaded from http://portlandpress.com/biochemist/article-pdf/42/1/34/867102/bio042010034.pdf by guest on 28 September 2021 they should be treated. For example, some people have us to differences in their immune, reproductive, and genetic variants that make the that process cardiovascular systems. Our work also suggests that drugs more or less ecient. Failure to account for these some of these dierences in gene regulation may have dierences can result in improper dosage. Many of these been detrimental in humanNeanderthal hybrids genetic variants have dierent frequencies in dierent human populations, so it is very important to base such Our October issue explored the use predictions on data sets that match the ancestry of the of machine learning and arti cial patient. Furthermore, for many common we intelligence in research, including work do not fully understand how genetic variants inuence using deep learning programs to predict risk, so we cannot (yet) use genetic information to guide structures. How do you think the treatment or prevention. use of AI will aect research in the future?

Previous work has explored the impact AI is already commonly used in and of Neanderthal genes on the immune biomedicine. Its role will only increase over the next systems of modern humans. Can you decade. The development and application of these tell us about it? powerful computational methods will be the easy part. In my view, the two major challenges posed by the Small amounts of Neanderthal DNA are present in increasing use of machine learning in research are: 1) modern non-African individuals. My group pioneered an their interpretation and 2) ensuring that they do not approach that uses biobanks with DNA samples from introduce subtle biases into our knowledge. For example, patients linked to de-identied electronic health records we’ve developed deep learning algorithms that can to test how remaining Neanderthal DNA inuences accurately identify parts of our genome that inuence traits. We identied the Neanderthal DNA in tens gene regulation, but the DNA patterns ‘learned’ by these of thousands of people and then tested for traits that algorithms to make their predictions o en do not fully associated with dierent Neanderthal alleles. We’ve capture all the DNA motifs required for the function of found many associations, including inuences on many these regions in the . Sometimes these algorithms genes involved in the innate and risk can make very accurate predictions without reecting for many autoimmune diseases. Other groups have now biological reality. Furthermore, these algorithms are shown that cells from humans with Neanderthal DNA only as good as their training data. Since most of at certain positions in their genomes are more likely to our genomic databases have strong biases towards respond to dierent viral and bacterial challenges than individuals of European ancestry, algorithms trained on human cells without Neanderthal ancestry. Overall, these data have the potential to be much less accurate Neanderthal DNA influences many systems in the on individuals with other backgrounds. human body, but there is a particularly strong inuence on the immune system. However, I want to be clear Finally, if you could bring back one extinct that we cannot blame Neanderthals for these diseases; species (human or otherwise), what would Neanderthal DNA only contributes a small fraction of it be and why? the overall genetic risk. Rather than bringing back an extinct species, I think we Your recent paper described analysing should use our growing understanding of evolutionary Neanderthal DNA to nd dierences genetics to better take care of the humans and species in how genes are controlled between that remain. This kind of stewardship poses many modern humans and Neanderthals. challenges but is vitally important to our world. ■

February 2020 © Biochemical Society 35