2018 Institute Annual Research Report Epigenetics 330-41 Epigenetics

Inside cells, genetic information stored in DNA is packaged by proteins into a structure called chromatin. Epigenetics is the study of chemical modifications to DNA and to chromatin and the effects that these modifications have on genome function. Epigenetic marks are involved in the creation of different types of cells from stem cells and epigenetic changes over time are associated with ageing. Epigenetic marks also provide a form of cellular memory, recording certain information about past events and potentially carrying it between parent and child. Our work in this area aims to enhance our understanding of how epigenetics shapes human development and affects healthy ageing by examining: n How stem cells develop into different types of cells n How subtle epigenetic differences influence cell diversity n The impacts of diet on epigenetics, health and ageing n The inheritance of epigenetic memory between generations n How life events affect biological ageing through the epigenetic clock n New approaches and technologies to drive further progress

30 Group leaders

Wolf Olivia Myriam Jon Gavin Peter Stefan Reik Casanueva Hemberger Houseley Kelsey Rugg-Gunn Schoenfelder

31 Epigenetics

Single-cell epigenome landscape of development and ageing

We are interested in epigenetic single cell. Our collaborators have mechanisms in mammalian developed computational methods Selected Impact Activities development and ageing. Epigenetic by which biologically meaningful n Members of the group formed marks are able to regulate gene relationships between the regulatory a core part in developing and Programme leader expression and can behave as a form layers can be identified. In cells that delivering the Institute’s exhibit: of memory that records the history of exit from pluripotency and prepare for Race Against the Ageing Clock at Group members a cell. These marks include chemical differentiation, we made the surprising the 2018 Royal Society Summer Senior research scientists: changes to DNA or DNA-associated observation that cells become hugely Science Exhibition. Stephen Clark proteins. We are particularly interested heterogeneous in their methylation n Wolf Reik was featured in an Wendy Dean in the epigenetic rules that govern cell patterns especially in enhancers, and this article on big data ‘How big data is Melanie Eckersley-Maslin fate decisions in early development, and may be associated with transcriptional Fatima Santos changing science’ for the Wellcome how cell fate degrades during ageing. heterogeneity (or noise) which may help Ferdinand Von Meyenn Trust’s Mosaic platform, which Our research uses single-cell sequencing with cell fate decisions. We are also aiming (Left in 2018) was subsequently featured in methods to investigate cell fate to connect epigenetic marks in enhancers Research fellow: the Independent online on 11th decisions at the level of individual cells. with histone dynamics which may be Carine Stapel November, reaching over 22M (Started 2018) important for dynamic gene regulation. Current Aims people. Postdoctoral researchers: The group’s research focuses on Irina Abnizova n A visiting employee from Shift understanding how cell fate decisions are Rebecca Berrens (Left in 2018) Biosciences has been based in our Poppy Gould (Left in 2018) first programmed or primed, and which lab over the past year to apply Irene Hernando Herraez epigenetic layers this involves. We would the lab’s epigenetic ageing clock Nelly Olova then like to know which cell signalling- model to drug discovery. Aled Parry (Started in 2018) induced epigenetic rearrangements Solenn Patalano occur during fate change. Finally, we are PhD students: exploring whether epigenetic memory Celia Alda keeps cell identity of fully differentiated Diljeet Gill Oana Kubinyecz cells intact for the rest of our lives (or at (Started in 2018) least until we start to age). We are also Georgia Lea working on identifying DNA binding Tim Lohoff proteins which are involved in epigenetic Juliette Pearce priming of enhancers or promoters for Julia Spindel future lineage-specific gene expression. Research assistant: Mouse embryonic stem cells (ESCs) stained for DNA Laura Benson Progress in 2018 methylation, and pseudo-coloured according to (Started in 2018) In order to address these questions we signal intensity (higher signal - red; lowest - blue), Visiting scientists: are developing single-cell multi-omics revealing the heterogeneity of this epigenetic mark. Romina Durigon sequencing approaches which can reveal (Started in 2018) molecular hierarchies involved in fate decision-making. Our most advanced method combines sequencing of the transcriptome, the methylome, and chromatin accessibility from the same

Publications www.babraham.ac.uk/our-research/epigenetics/wolf-reik @ReikLab n Rulands, S. et al. (2018) Genome-scale oscillations in DNA methylation during exit from pluripotency. Cell Syst. 7(1):63-76 n Clark, S.J. et al. (2018) scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat. Commun. 9(1):781 n Raiber, E.A. et al. (2018) 5-Formylcytosine organizes nucleosomes and forms Schiff base interactions with histones in mouse embryonic stem cells. Nat. Chem. 10:1258-1266

32 Understanding the interplay between stress and metabolism during early stages of ageing

The discovery that genes control 2. How lipid and energy metabolism longevity has been quite significant for interplay with signalling pathways that the understanding of ageing because mediate healthy ageing. Metabolism is Olivia Casanueva it changed the view from a gradual a key mediator of longevity, however stochastic process, to a genetically its complexity makes it difficult to Group members controlled process that we can interfere study within the particular context Senior research assistant: with and potentially slow down. Since of ageing. With this goal in mind, we Sharlene Murdoch then, thousands of genes and conditions have developed computational tools to have been found to influence lifespan, Postdoctoral research study metabolic fluxes during ageing. with many of them controlling the way scientists: Progress in 2018 in which organisms deal with external Laetitia Chauve n We launched WormJam, a community- Cheryl Li (Left 2018) challenges brought by stress and driven platform that improved the Celia Raimondi nutrition. Such findings underscore the status of the existing model of C. elegans Boo Virk (Left 2018) key interplay between genes and the C. elegans worms labelled with a fluorescent protein by reconciling and manually curating environment and may explain the high and imaged using a confocal microscope. C. elegans PhD students: its metabolic pathways into a single Janna Hastings degree of discordance among identical is a really useful system for our research because we consensus model (refs 1 & 3). Abraham Mains (Left 2018) twins. Our lab’s interest is to understand can monitor fluorescent reporters of gene expression in vivo and study inter-individual variability in stress Manusnan Suriyalaksh the non-genetic influences on lifespan n We also discussed the relevance of response gene expression in isogenic individuals. Research assistants: and stress related phenotypes using these approaches in Current Opinions Francesca Hodge genetically identical lab strains of in Systems Biology (Hastings, in print), Sheikh Mukhtar C.elegans as a model organism. pointing at one of the key challenges Selected Impact Activities that we face when studying ageing with n The group was involved in several Visiting students: Current Aims these tools, namely that the modelling public engagement events Fatemeh Masoudzadeh Our overarching aim is to understand the tools available are optimised for throughout 2018, including being Pia Todtenhaupt (Left in 2018) molecular details of ageing and to discover animals or cells that are in the process part of the team that developed new ways to slow or even reverse the Other visitors: of growing, which is not happening in and presented the Institute’s Rebecca Aldunate ageing process. With that goal in mind, we aged animals. ‘Race Against the Ageing Clock’ (Left in 2018) use C. elegans to understand: exhibit at the Royal Society Rob Jelier (Left in 2018) n Confronted with this challenge, we 1. The significance of non-genetically Summer Exhibition, and sharing re-optimised the modelling tool by encoded variability in the expression the Institute’s science at the driving information from multi-omic of genes that respond to external cues Science Festival and sources (both transcriptomics and such as temperature and nutrients. We events held as part of the LifeLab metabolomics) and were able to are interesting in finding how early project for European Researchers’ optimise this tool to study metabolic molecular differences in the way worms Night. fluxes during ageing (2). This re- respond to stress can influence and be optimisation represents a significant n Lab members were involved in predictive of lifespan. technical advance for the field and will organising the second EU-LIFE allow more accurate predictions of postdoc retreat. metabolic fluxes during the course of n The lab coordinates the ageing. organisation of local area Cambridge worm meetings.

Publications www.babraham.ac.uk/our-research/epigenetics/olivia-casanueva n Witting, M. et al. (2018) Modeling meets metabolomics - The WormJam consensus model as basis for metabolic studies in the model organism Caenorhabditis elegans Front. Mol. Biosci. 5: 96 n Hastings, J. et al. Multi-omics and genome-scale modeling reveal a metabolic shift during C. elegans aging Front. Mol. Biosci. (in print) n Hastings, J. et al. (2017) WormJam: A consensus C. elegans metabolic reconstruction and metabolomics community and workshop series Worm 6(2): e1373939

33 Epigenetics

The placenta at the heart of development

WT MUT Progress in 2018 It has long been appreciated that development relies on a functional placenta. However, the extent to which the placenta potentially contributes to Myriam Hemberger embryonic defects had remained vastly under-estimated. In 2018, we reported Group members EMBRYO that the placenta is abnormal in about Senior researcher: two-thirds of gene mutations that cause Claire Senner embryonic lethality. We also found that placental defects often correlate with Postdoctoral researchers: Sarah Burge abnormal heart and brain development. Ruslan Strogantsev Thus, the placenta may be a frequent contributor to developmental and birth PhD student: defects. Natasha Morgan In parallel, we contributed to the Visiting scientists: PLACENTA Courtney Hanna establishment of a human placental stem Vicente Perez Garcia cell model, a breakthrough advance that Laura Woods will greatly facilitate research into placenta- based pregnancy complications. Visiting student: Comparison of a normal (WT) and mutant (MUT) mouse embryo that lacks a functional Nubpl gene. The Dominika Dudzinska corresponding placentas are depicted underneath. The placenta associated with the mutant embryo is severely malformed and contributes significantly to the developmental defects of the embryo. The placentas are stained for the nutrient exchange surface (green) and for an essential placental progenitor cell type (red, highlighted by arrows). Selected Impact Activities n Conference organiser: A functional placenta is critical for Current Aims Reproduction and Development normal embryonic development and Our work focuses on gaining a better Meeting, Cambridge, UK (10-12 lifelong health. The placenta is an molecular understanding of placental March 2018). integrated unit that develops from development. We use genetic and stem n Speaker at the British Society of cells derived both from the embryo cell models to identify critical factors Developmental Biology (BSDB) and from the mother. Our aim is to and pathways involved in the early Meeting, Oxford, UK (10-13 gain a comprehensive understanding placentation process. This critical time September 2018). of the collection of genes that window is when most defects occur that contribute to normal placentation, lead to common pregnancy disorders and n Received the Institute’s Athena as well as the maternal factors that developmental defects. We use state- SWAN Best Practice Award 2018 affect this process such as advancing of-the art technologies to manipulate for pioneering initiatives and maternal age. gene function and correlate this with commitment to women’s careers differences in placental cell differentiation in science. and function. We are also integrating the impact of the maternal environment on placental development, and are specifically interested in the changes induced by advancing maternal age and how they affect placental development.

Publications www.babraham.ac.uk/our-research/affiliated-scientists/myriam-hemberger @HembergerLab n Perez-Garcia, V. et al. (2018). Placentation defects are highly prevalent in embryonic lethal mouse mutants. Nature 555: 463-468 n Turco, M.Y. et al. (2018). Trophoblast organoids as a model for maternal-fetal interactions during human placentation. Nature 564: 263-267 n Chrysanthou, S. et al. (2018). A critical role of TET1/2 proteins in cell-cycle progression of trophoblast stem cells. Stem Cell Rep. 10: 1355-1368

34 How cells interact with their environment

We study how cells adapt to their Current Aims Progress in 2018 environment at the genetic and n To determine how novel mutations Our optimisation of yeast ageing methods epigenetic level, particularly how occur and whether these can be has finally borne fruit, allowing us to they adjust to challenging and toxic stimulated by the environment. show that certain epigenetic marks environments. This contributes to our become important as cells age to facilitate Jon Houseley n To establish when and how drug understanding of how our cells change appropriate gene expression patterns. We resistance mutations occur in in response to environmental pressures are now able to functionally profile multiple cancer cells. Group members and as a consequence of ageing. Our epigenetic marks during the chromatin Senior research associate: work aims to discover ways of improving n To understand the mechanistic link upheavals that are thought to accompany Cristina Cruz health throughout life and to find better between nutrient environment and the ageing in all eukaryotes, allowing new approaches to chemotherapy. insights into the relationships between Postdoctoral researchers: ageing process. Anna Channathodiyil environment, diet and ageing. Work on Ryan Hull non-random mutation is also progressing; Alex Whale we are elucidating new pathways of extrachromosomal DNA formation in PhD students: response to environmental change and Dorottya Horkai Andre Zylstra setting up industrial collaborations to address mechanisms of drug resistance Research assistant: acquisition. Michelle King Visiting students: Sebastian Parker (Left in 2018) Selected Impact Activities Fabiola Vacca (Started in 2018) n Members of the lab participated in the 2018 Royal Society Summer Exhibition ‘Race Against the Ageing Clock’ exhibit. n Jon Houseley presented on extrachromosomal DNA formation at the EMBL Molecular Evolution and Ecology meeting. n The lab opened a Twitter account - @HouseleyLab.

Images of very old yeast cells that have been aged under unhealthy (left) or healthy diets (right). Red and green channels show two fluorescent markers of ageing that allow us to quantify ageing pathology. Images acquired using an ImageStream imaging flow cytometer (within the Institute’s Flow Cytometry facility) by Dori Horkai.

Publications www.babraham.ac.uk/our-research/epigenetics/jon-houseley @HouseleyLab n Cruz, C., et al. (2018). Tri-methylation of histone H3 lysine 4 facilitates gene expression in ageing cells. eLife 7, e34081 n Frenk, S. & Houseley, J. (2018). Gene expression hallmarks of cellular ageing. Biogerontology 19(6): 547-566 n Cruz, C. & Houseley, J. (2018). Protocols for northern analysis of exosome substrates and other non-coding RNAs. Methods Mol Bio., in press

35 Epigenetics

Epigenetic legacies from eggs

As well as genetic information, the egg and sperm also contribute epigenetic annotations that may influence gene activity both before and after fertilisation. We examine epigenetics Gavin Kelsey during egg development and the effects of epigenetic marks on gene activity in Group members the embryo. Our goal is to understand Research fellow: whether, through epigenetics, factors Antonio Galvao such as a mother’s age or diet have consequences on the health of a child. Postdoctoral researchers: Hannah Desmond Current Aims Elena Ivanova We investigate how epigenetic states are PhD student: set up during oocyte development and Gintare Sendzikaite influence gene expression in the embryo; for example, how repressive chromatin Visiting scientists: Zahra Anvar (Left in 2018) marks in oocytes lead to long-term Salah Azzi (Left in 2018) silencing of maternal alleles particularly Joomyeong Kim (Left in 2018) in cells that will form the placenta. We are Evelyne Oller (Left in 2018) also interested in how variations in DNA Laura Saucedo-Cuevas methylation can come about in oocytes (Left in 2018) and whether we can use methylation Alternative roles for the repressive chromatin mark H3K27me3 in controlling genomic imprinting. In the upper panel, Visiting student: variation as a marker for oocyte quality a paternal allele is silenced by H3K27me3 at its promoter but poised for later expression. In the lower panel, extensive Anna Townley and embryo potential. To investigate these domains of H3K27me3 dependent on monoallelic expression of a long non-coding RNA repress multiple genes. (Started in 2018) questions, we develop methods to profile epigenetic information in small numbers of cells or even single cells. imprinted genes – genes that express Progress in 2018 a single allele – that depend on DNA Selected Impact Activities Using these sensitive methods, we have methylation conferred in the oocyte from n Participation of multiple lab developed high-resolution epigenetic those that depend on repressive chromatin members in the 2018 Royal Society maps of mouse embryos shortly after in oocytes. Importantly, we are beginning Summer Exhibition ‘Race Against implantation, in which we can separate to see a pattern to explain how chromatin- the Ageing Clock’. the epigenetic information obtained on dependent imprinting is controlled n Participation in the Cambridge maternal and paternal chromosomes and why it persists selectively in extra- LaunchPad partnership for STEM (alleles). This has allowed us to distinguish embryonic tissues. engagement (Hannah Demond). n Co-organisation of 2018 EU-LIFE postdoc retreat, Wellcome Sanger Institute, November 2018 (Hannah Demond).

Publications www.babraham.ac.uk/our-research/epigenetics/gavin-kelsey n Hanna, C. W., Demond, H. & Kelsey, G. (2018) Epigenetic regulation in development: is the mouse a good model for the human? Human Reproduction Update 24: 556-576 n Clark, S. J., et al. (2018) Joint profiling of chromatin accessibility, DNA methylation and transcription in single cells. Nat. Commun. 9:781 n Hanna, C. W. et al. (2018) MLL2 conveys transcription-independent H3K4me3 in the oocyte. Nat. Struct. Mol. Bio. 25: 73-82

36 Epigenetic regulation of human development

How DNA is packaged in cells and Progress in 2018 the use of biochemical switches in One highlight was discovering that very Selected Impact Activities the genome are key aspects of the active regulatory regions, called super- n We participated in several public epigenetic control of gene activity. We enhancers, are brought into close physical events including Pint of Science are interested in understanding how proximity to different gene promoters in and the Cambridge Science Peter Rugg-Gunn epigenetic processes are established different cell types through long-range Festival, and we contributed to a during human development and during DNA interactions (1). We further showed podcast by the Naked Scientists. Group members the differentiation of stem cells to form that the regulatory interactions connecting n We all enjoyed being part of the various cell types. This is important for super-enhancers with their target gene Senior researcher: Royal Society Summer Science Clara Novo understanding health and for finding promoters are controlled by cell type- Exhibition ‘Race Against the Ageing ways to use stem cells in regenerative specific transcription factors. Working with Clock’ – a very successful and Postdoctoral researchers: biology. the Corcoran and Schoenfelder groups, rewarding team effort. Mandy Collier we have now developed computational Claudia Semprich Current Aims methods to study genome interactions at n Peter took over the co-organisation (Started in 2018) We seek to define the epigenetic and a network scale (see figure). Investigating of the Cambridge Epigenetics Club gene regulatory mechanisms that operate PhD students: chromatin topology and activity in (@EpigeneticsClub). in unspecialised pluripotent stem cells Adam Bendall pluripotent cells offers new insights into and in cells transitioning towards more (Started in 2018) features of gene regulatory control during Charlene Fabian specialist cell types. We examine how development and stem cell differentiation. Andrew Malcolm these mechanisms are established in (Started in 2018) development, how they control cell state Visiting scientist: changes, and how their alteration can Paola Serena Nisi be helpful to reprogramme mature cell (Left in 2018) types back into an unspecialised form. Our current work also investigates how changes to epigenetic marks can alter the organisation of DNA interactions and associated gene activity. Applying this information will allow us to more precisely control cell fate decisions and to better understand the processes that shape human development.

Visualisation of promoter-capture Hi-C data at a network-scale uncovers global changes in gene regulatory interactions as human pluripotent stem cells transition from a ‘naïve’ state to a ‘primed’ state. The example shown highlights the acquisition of long-range, Polycomb-associated DNA interaction networks that contain the majority of genes encoding key developmental regulators. The figure was produced by Peter Chovanec and Amanda Collier, as part of a collaborative project with Anne Corcoran and Stefan Schoenfelder.

Publications www.babraham.ac.uk/our-research/epigenetics/peter-rugg-gunn @RuggGunnLab n Novo, C. et al. (2018) Long-range enhancer interactions are prevalent in mouse embryonic stem cells and are reorganized upon pluripotent state transition. Cell Rep 22: 2615-2627 n Lupo, G. et al. (2018) Molecular profiling of aged neural progenitors identifies Dbx2 as a candidate regulator of age-associated neurogenic decline. Aging Cell 17: e12745 n Collier, A. and Rugg-Gunn, P. (2018) Identifying human naïve pluripotent stem cells - evaluating state-specific reporter lines and cell-surface markers. Bioessays 40: e1700239

37 Epigenetics

3D genome organisation in stem cells

The three-dimensional organisation Developmental control of enhancer– of our genome is tightly linked to promoter contacts during early human its function. Dispersed throughout cell lineage specification. Transcriptional upregulation of PAX6 during the the non-coding part of our genome, conversion of human embryonic stem regulatory elements (such as enhancers Stefan Schoenfelder cells (ESCs) into neuroectodermal cells and promoters) function as ‘molecular (NECs) is accompanied by the formation switches’ to turn genes on and off. These of enhancer–promoter contacts in regulatory elements are brought into NECs (mapped by Promoter Capture spatial proximity through cell-type Hi-C; PCHi-C), which involves both novel specific folding of chromatin, which enhancer–promoter contacts and the represents a key regulatory mechanism emergence of the enhancer associated mark acetylation at lysine 27 of histone to control gene expression programmes H3 (H3K27ac) at genomic regions that during cell lineage specification. already interacted with PAX6 in ESCs Current Aims (modified from Freire-Pritchett et al., Our aim is to dissect 3D gene regulatory eLife 2017). networks and enhancer–promoter interactions during normal development, established after the first cell lineage element interactions. This indicates that and to understand how their function differentiation process during early priming of promoter–regulatory element is perturbed in disease and ageing. Our mammalian development. We have also contacts may contribute to efficient research focuses on pluripotent stem cells, shown that haematopoietic stem cells signalling pathway responses at the as they are a unique model to study early respond to growth factor signalling transcriptional level. We have further mammalian development, and hold great through widespread epigenome and analysed how the 3D genome organisation promise for applications in regenerative transcriptome remodelling, but with very changes during ageing in mouse B medicine, disease modelling and limited changes to promoter–regulatory lymphocytes. compound screening. We address three main questions: 1. How is the folding of the genome Selected Impact Activities n n rewired during early mammalian Presentation at EMBL-EBI Industry Interview with Tom Chivers that development? Programme workshop ‘Epigenetics for contributed to an online article ‘How ageing and disease’ (November 2018). big data is changing science’ for the 2. Which regulatory elements control ’s Mosaic platform, n Participation in a STEM Insights cell fate decisions during lineage which was subsequently featured teacher event hosted at the Babraham specification? in the Independent online on 11th Institute (October 2018). November, reaching over 3. How do sequence variants in regulatory 22M people. elements affect gene expression levels and thereby cell function? Progress in 2018 We have mapped promoter–regulatory element interactions in mouse embryonic stem cells and trophoblast stem cells, demonstrating that profound differences in 3D genome organisation are already

Publications www.babraham.ac.uk/our-research/epigenetics/stefan-schoenfelder n Schoenfelder S. et al. (2018) Divergent wiring of repressive and active chromatin interactions between mouse embryonic and trophoblast lineages. Nat. Commun 9: 4189 n Schoenfelder S. et al. (2018) Promoter Capture Hi-C: High resolution, genome-wide profiling of promoter interactions. J. Vis. Exp. 136. doi: 10.3791/5732 n Comoglio F. et al. (2018) Thrombopoietin signalling to chromatin elicits rapid and pervasive epigenome remodelling within poised chromatin architectures. Genome Res. 28: 295-309

38 39 Riding the data wave

Big data is revolutionising science. But as well as changing physics, chemistry and biology, it’s changing the nature of science itself. Institute researchers Wolf Reik and Stefan Schoenfelder and bioinformatics expert Simon Andrews reflect on how big data is re-shaping not only the way they work, but how they think. And we discover how bioinformatics – once considered a geeky corner of biology by some – has become central to scientific progress.

When Professor Wolf Reik, head scientific questions he can ask and has emerged at the intersection of the Epigenetics programme, an embodiment of how it’s reshaped of biology, computer science and thinks about how big data the way his younger colleagues statistics. has revolutionised his field he think. Dr Simon Andrews, head of remembers the Swiss anatomist Before big data, researchers thought Bioinformatics, belongs to this new Karl Theiler. Theiler spent a lifetime and worked on single genes – how breed of experts. Since joining the creating ‘The House Mouse: Atlas they were regulated and their Institute in 2001, he’s seen the group of Embryonic Development’, role in development, health and expand from two to 10 staff, many painstakingly taking embryos at ageing. Now, thanks to the recent of whom have their roots in biology. different stages of development and developments in next-generation using staining and microscopy to “Lots of people in my group were sequencing, the focus is firmly on identify each tissue type. once biologists who happened the genome as a whole. “We can to play with computers for fun. “Now, we take the same embryos now look at 20,000 genes or 20,000 My mother was a primary school and put them in a big machine promoters and get huge amounts of teacher. Sometimes she’d turn up which sequences up to 100,000 cells information. The younger members with a computer that had been at a time. Through gene expression, of my group get excited about the donated to the school, point me it gives us an equally detailed atlas whole genome and what it’s doing, towards it and say ‘make this do of development, but because we whereas I was brought up in an era something that I can take back into can now use multi-omics methods of asking what single genes do; it’s a the classroom!’” Andrews recalls. “At that link together different layers fundamental difference in thinking,” university I built my own computers in a single cell – the epigenome says Reik. because we couldn’t afford to and the transcriptome – we can ask Big data brings huge opportunities, buy them, and when I started much deeper questions about how but using techniques that generate research we were beginning to get these patterns arise mechanistically. massive amounts of increasingly electronically-generated data.” That’s what we really want to know,” complex data also presents huge says Reik. His PhD generated a respectable computational challenges. So how 1,000 bases of DNA sequence. Today, Despite being a relic of an earlier do Reik and other researchers a single sample at the Institute scientific age, Theiler’s atlas remains extract meaning from this deluge yields 40 billion. “The fundamental on Reik’s bookshelves: an illustration of data? The answer lies in change is that many experiments of how big data has transformed the bioinformatics, the science that generate amounts of data that are

40 Epigenetics

‘Big data changes how we think – and how we work’

impossible to understand without one of his key questions. “Whereas we a computer. Before, computers used to look at individual examples, were a nice add-on; now they are now it’s possible to address those fundamental,” he says. questions genome wide. We can get a complete picture of all the interacting One of the Institute’s core facilities, sequences in a cell,” he says. “When the Bioinformatics group provides I came here after my PhD, it was computational power and data something I thought might happen analysis plus expert advice and at the end of my career. That it’s bespoke development work. “What happened so quickly is incredible.” fires me up are computational problems that spring from biology,” It also means that researchers need says Andrews, and what researchers to learn how to interpret data, so often need most are ways of making the Institute’s Bioinformatics group their data more accessible. Over makes a major difference. “The skills several years, Andrews’ group has I was equipped with in my PhD are developed packages capable of not enough anymore. It’s normal to visualising sequencing data sets keep learning in science, but this is a with billions of data points. “These quantum leap,” says Schoenfelder. “In are unfathomable on their own, but a competitive field you need to work we can turn them into billions of rapidly. I often work with dedicated positions in a genome, and visualise bioinformaticians because it’s almost what they look like,” he explains. impossible to be an expert in both.”

Like Reik and Andrews, Dr Stefan The next scientific revolution is Schoenfelder has lived through anyone’s guess, but Schoenfelder the revolution wrought by next- is sure it will only underscore how generation sequencing and big data. much more we need to understand. “It changes the way you think and “Sequencing and its impact on changes the way we work,” he says. personalised medicine will continue “When I did my PhD 15 years ago I to grow. High-resolution microscopy, spent all my time doing experiments observing live cells and even in the lab. Now it’s the analysis that individual molecules, will be another takes the time.” game changer,” he concludes. “We make contributions all the time, but Schoenfelder is interested in how ‘Bioinformatics turns we know so little. That’s humbling gene function and gene expression – but it’s also very exciting to be a something unfathomable are controlled by non-coding bits of part of.” DNA known as regulatory sequences. into something we In linear terms, genes and their regulatory elements may be some can visualise’ distance apart, so how the genome is organised in three dimensions is

41 Designed & Produced by www.pickeringhutchins.com by & Produced Designed

Babraham Institute The Babraham Institute Babraham Research Campus Registered office: Cambridge Babraham Hall, Babraham, CB22 3AT Cambridge, CB22 3AT UK Registered in and Wales No. 3011737 as a company limited by guarantee www.babraham.ac.uk Registered Charity No. 1053902

Tel: +44 (0)1223 496000 The Babraham Institute receives [email protected] core funding in strategic programme grants from the BBSRC. @BabrahamInst