<<

2019 Society Meeting A Century of Genetics

13 – 15 November 2019. Royal College of Physicians,

Meeting Programme THANK YOU TO ALL OF OUR SPONSORS

GOLD

SILVER

BRONZE WELCOME

Just 10 years after Wilhelm Johannsen coined In June we celebrated the unveiling of Blue the term “gene”, and 14 years after William Plaques for Saunders and Bateson at the Bateson bequeathed us the word “genetics”, John Innes Centre, the focal point of UK on 25th June 1919 Bateson and Edith plant genetics, founded by Bateson. It is then “Becky” Saunders convened a meeting in the appropriate that the centenary conference rooms of the Linnaean Society to propose the should be convened in Edinburgh, for long a founding of a “Genetical Society”. Approved UK centre for animal and molecular genetics. unanimously, the first meeting was held in By happy coincidence, 2019 is also the Cambridge on 12th July that year. centenary of the origins of genetics research in Edinburgh. From the 26 members at the founding meeting, the Genetical Society (now the Genetics and Edinburgh Genetics Society) has grown over the last The first lectureship in genetics was instituted century to an all-time high now in excess at The in 1911. of 2500 members. Members reflect a The Animal Breeding Research Station of broad span of career stages and expertise, the government’s Board of Agriculture and covering all major branches of pure and Fisheries was established in Edinburgh in applied genetics. A cursory glance at the 1919 and evolved over the next 10 years to past Presidents of the Society reflects both become The University of Edinburgh’s Institute the standing of the Genetics Society over the of Animal Genetics. In 1930, the Institute past century and the breadth and depth of moved into what is now known as the Crew British genetics: from the founders of classical Building at King’s Buildings, named after its transmission genetics (Saunders, Reginald first director and the Buchanan Professor of Punnett), to the developers of the modern Animal Genetics. In 1945, the government evolutionary synthesis (RA Fisher and JBS established the National Animal Breeding Haldane), by way of the leading lights in yeast and Genetics Research Organisation as an genetics (Paul Nurse), developmental genetics Agricultural Research Council institute outside (Conrad Waddington), microbial genetics the University. Links to the University and the (David Hopwood, Noreen Murray), plant Institute of Animal Genetics were strong from genetics (Enrico Cohen), ecological genetics the beginning, and continued throughout (EB Ford), fly genetics (Mike Ashburner), numerous organisational changes, coming worm genetics (Jonathan Hodgkin), fungal full circle with the (re)integration of The Roslin genetics (Guido Pontecorvo, John Fincham), Institute (the successor Research Council population genetics (Brian Charlesworth), institute) into the University in 2008. The ageing genetics (Linda Partridge), and medical internationally renowned quantitative and and human genetics (, Veronica population geneticists, Douglas Falconer and van Heyningen). In our breadth, we run true Alan Robertson, were both associated with to the wishes of the founders who were keen, the Institute of Animal Genetics. for example, that both research academics and breeders be involved. Meanwhile other strands of genetics research

Introduction 3 flourished at King’s Buildings within what and pioneered the study of is now the School of Biological Sciences. abnormalities at a population level – Many fundamental contributions to genetics particularly on Hebridean islands. In 1988, were made here, including the discovery of the Unit changed its name to the MRC chemical mutagenesis, bacterial conjugation, Human Genetics Unit (HGU) and has led the purification of the first gene (frog rRNA understanding of human genome biology, genes), the pioneering work on gene cloning and both Mendelian and complex genetic by Noreen and Ken Murray, and the first disease. The HGU became part of The explorations of animal genomes by DNA University of Edinburgh and a partner in the hybridisation, by Peter Walker amongst Institute of Genetics and Molecular Medicine others. This led to the establishment of in 2012. the Medical Research Council Mammalian In short, Edinburgh is at the forefront of Genome Unit. The ‘Genome Unit’ population and quantitative genetics, pioneered genome biology, with its last human and animal genomics, and genetic director developing the fundamental manipulation of farmed animals. tools of modern molecular genetics – the eponymously named Southern blot – pulsed The conference programme celebrates field gel electrophoresis, microarrays, and not just the remarkable subject – in all its sequencing the first piece of mammalian breadth – that genetics has become, but also DNA – the guinea pig alpha satellite. The highlights the defining role the UK, and in Genome Unit was also the home of CpG no small part Edinburgh, plays in the subject. island discovery and the identification and For a small nation, the UK hits far above our cloning of telomeres. The current Institute weight. It should be a source of great pride of Evolutionary Biology at King’s Buildings, that over the last century the Genetics Society as one of the successors to the Institute of has been an important part of this success. Animal Genetics, has a strong reputation for As a Society we continue to promote genetics its research in evolutionary genetics. and geneticists not least through funding, and enabling attendance at, conferences like this Across the other side of the city, in 1956 the one. We support the promotion of genetics, Clinical Effects of Radiation Unit was set up both via training in science communication by the MRC to study human chromosome and by funding science engagement. Through abnormalities. Contemporaneous with the provision of training and fieldwork grants, the identification of trisomy 21 in Down’s and via the summer studentship scheme, we syndrome, the unit identified the XXY aim to support the next century’s geneticists. karyotype in patients with Klinefelter syndrome. This unit then became the MRC May the next 100 years be as productive for Clinical and Population Cytogenetics Unit British genetics as the first 100.

Laurence D. Hurst Eleanor Riley Wendy Bickmore Brian Charlesworth President of the Director, The Director, MRC Human IEB Genetics Society Genetics Unit

4 Introduction PROGRAMME

Wednesday 13 November 2019

09:00 REGISTRATION OPEN 09:30 INTRODUCTION TO THE CONFERENCE Laurence D. Hurst, Genetics Society Eleanor Riley, The Roslin Institute

SESSION 1: GENOME STABILITY AND INSTABILITY Chair: David Finnegan, co-chair Sveta Markovets 09:40 David Finnegan, The University of Edinburgh Introduction to the session 09:45 Steve Jackson, Harnessing genetic principles 10:15 Severine Chambeyron, Institute of Human Genetics, Montpelier piRNA pathway is a guardian of genome integrity 10:40 REFRESHMENT BREAK AND POSTER VIEWING 11:15 Kim Nasmyth, University of Oxford Title to be confirmed 11:45 Adele Marston, The University of Edinburgh Pericentromere organisation in mitosis 12:10 Lana Talmane, The University of Edinburgh Protein binding as a selective filter for new mutations at regulatory sites in the germline and in cancers 12:25 Poster pitches x 10 (odd numbers) 12:55 LUNCH BREAK POSTER SESSION - ODD NUMBERS

SESSION 2: GENETICS AND SELECTION OF QUANTITATIVE TRAITS Chair: Josephine Pemberton, co-chair Smaragda Tsairidou Sponsor: Aviagen 14:00 Josephine Pemberton, Institute of Evolutionary Biology Introduction to the session 14:05 Greg Kudla, The University of Edinburgh A multidimensional genotype-phenotype map of a fluorescent RNA aptamer 14:30 Peter Keightley, The University of Edinburgh Inferring the distribution of fitness effects of mutations 14:45 Trudy Mackay, Clemson University Charting the Genotype-Phenotype Map: Lessons from Drosophila 15:15 Troy Rowan, University of Missouri Detecting signatures of ongoing polygenic selection and local adaptation in United States Bos taurus beef cattle 15:30 REFRESHMENT BREAK AND POSTER VIEWING

Programme 5 PROGRAMME

Wednesday 13 November 2019

16:00 Felicity Jones, Max Planck Society, Tübingen Gene regulation, epigenomics and recombination in adaptive evolution of natural stickleback populations 16:25 Balfour Lecture: Susan Johnston, The University of Edinburgh Introduced by Laurence D. Hurst The evolution of individual recombination rates in the wild 16:50 John Hickey, The Roslin Institute Disruptions through Data Driven Breeding and integration will drive step changes in animal and crop performance 17:20 Ivan Pocrnic, The University of Edinburgh Limited dimensionality of genomic information and implications for genomic prediction 17:35 Patrick Sharman, University of Exeter Genetic improvement of racehorse speed 17:50 Mendel Medal: Professor Bill Hill Introduced by Laurence D. Hurst Appreciation of Bill’s work by Trudy Mackay

18:30 DRINKS RECEPTION AND POSTER VIEWING Sponsor: Genus PLC

19:30 Public Lecture: Farm, Field and Family – 100 years of genetics in Edinburgh Host: Sarah Chan, Usher Institute Speakers: Josephine Pemberton, Institute of Evolutionary Biology Bruce Whitelaw, The Roslin Institute Wendy Bickmore, Institute of Genetics and Molecular Medicine

Thursday 14 November 2019

09:00 REGISTRATION OPEN

SESSION 3: THE GERMLINE, SEX DETERMINATION & SEX Chair: Brian Charlesworth, co-chair Kay Boulton 09:20 Brian Charlesworth, The University of Edinburgh Introduction to the session 09:25 Genetics Society Medal: Deborah Charlesworth, The University of Edinburgh Introduced by Laurence D. Hurst The diversity of sex chromosomes and the importance of genetics in understanding them

6 Programme PROGRAMME

10:00 Doris Bachtrog, University of California, Berkeley Massive gene amplification on a recently formedDrosophila Y chromosome 10:25 Laura Ross, The University of Edinburgh Genomic exclusion: males that lose their fathers genes 10:50 REFRESHMENT BREAK AND POSTER VIEWING 11:30 Stephen Wright, University of Toronto Y degenerate? Evolutionary genomics of Y chromosome degeneration in plants 12:00 Peter Ellis, University of Kent Differential sperm motility mediates the sex ratio drive shaping mouse sex chromosome evolution 12:15 Poster pitches x 10 (even numbers) 12:45 LUNCH BREAK POSTERS SESSION - EVEN NUMBERS

SESSION 4: HUMAN GENETIC VARIATION: FROM MOLECULES TO POPULATIONS Chair: Wendy Bickmore, co-chair Peter Joshi Sponsor: Edinburgh Genomics 14:00 Wendy Bickmore, The University of Edinburgh Introduction to the session 14:05 Gil McVean, University of Oxford The ancestry of everyone 14:35 Peter Visscher, University of Queensland Estimation of additive genetic variance from whole genome sequence data 15:05 Evropi Theodoratou, The University of Edinburgh Phenome wide Mendelian randomisation study of genetically determined biomarkers using large scale data 15:30 REFRESHMENT BREAK AND POSTER VIEWING 16:00 Anna Gloyn, University of Oxford Unravelling causal mechanisms for diabetes 16:40 Joe Marsh, MRC Human Genetics Unit, The University of Edinburgh The role of protein complex assembly in dominant Mendelian genetics disorders 17:15 Alasdair MacKenzie, University of Aberdeen CRISPR and UK Biobank analysis of an enhancer that modulates fat and alcohol intake and mood suggests a role in male alcohol abuse and anxiety 17:30 Patricia Heyn, The University of Edinburgh Gain-of-function DNMT3A mutations cause microcephalic dwarfism and hypermethylation of Polycomb-regulated regions 17:45 M. Madan Babu, University of Cambridge Understanding variation in GPCR drug targets 18:15 Genetics Society AGM 19:30 Conference Dinner and Ceilidh, InterContinental Edinburgh the George (formerly Principal Hotel) 19-21 George Street, Edinburgh Sponsor: Cobb Vantress Programme 7 PROGRAMME

Friday 15 November 2019

09:00 REGISTRATION OPEN

SESSION 5: EPIGENETICS AND NON-CODING RNA Chair: Robin Allshire, co-chair Liz Bayne Sponsor: Diagenode 09:30 Robin Allshire, The University of Edinburgh Introduction to the session 09:35 Kat Arney, Genetics Unzipped 09:50 Adrian Bird, The University of Edinburgh DNA base composition controls cell fate via SALL4 10:15 Sito Torres-Garcia, The University of Edinburgh Stochastic epigenetic silencing by heterochromatin primes fungal resistance 10:30 David Baulcombe, University of Cambridge The tomato epigenome 11:00 REFRESHMENT BREAK 11:30 Beth Sullivan, Duke University Alpha satellite variation, transcription, and centromere assembly: finding function in the recesses of the human genome 12:00 Anna Thamm, University of Oxford MpFRH1 miRNA Controls the Patterning of Epidermal Structures in the Liverwort Marchantia Polymorpha 12:15 Rosalind John, Cardiff University Genomic Imprinting: linking early life adversity to maternal behaviour and lifelong health 12:30 Anne Ferguson-Smith, University of Cambridge Epigenetic modulation of repeat elements – implications for non-genetic inheritance 13:15 POSTER PRIZE AWARDS 13:30 Final Comments Eleanor Riley, The Roslin Institute Laurence D. Hurst, Genetics Society

POSTER PRIZE SPONSORS: SILVER SPONSOR:

Session 1: Session 3:

Session 2: Session 4:

Session 5:

8 Programme Love podcasts? Love genetics?

Listen to Genetics Unzipped The Genetics Society Podcast

Bringing you stories from the world of genes, genomes and DNA

Listen and subscribe for free from Apple podcasts, Spotify and all major podcast apps

Find episodes, transcripts and more at GeneticsUnzipped.com MEETING INFORMATION

Meeting organisers Location Kay Boulton, The Roslin Institute, Royal College of Physicians, The University of Edinburgh 9 Queen Street, Edinburgh EH2 1JQ Martin Taylor, MRC Human Genetics Unit, The University of Edinburgh Registration Helen Sang, The Roslin Institute, This will be held in the foyer. The University of Edinburgh Josephine Pemberton, IEB, The University of Edinburgh Talks Wendy Bickmore, MRC Human Genetics These will be held in the auditorium. Unit, The University of Edinburgh Brian Charlesworth, IEB, The University of Edinburgh Refreshments Smaragda Tsairidou, The Roslin Institute, These will be available in the foyer, the Great The University of Edinburgh Hall and the New Library. Douglas Vernimmen, The Roslin Institute, The University of Edinburgh Posters and the drinks reception Helen Nickerson, MRC Human Genetics To be held in the Great Hall and the New Unit, The University of Edinburgh Library (connected rooms). Albert Tenesa, The Roslin Institute, The University of Edinburgh Nicola Stock, The Roslin Institute, The conference dinner and The University of Edinburgh Ceilidh Ines Crespo, The Roslin Institute, These will be held in the Kings Hall at the The University of Edinburgh InterContinental Edinburgh the George Cristina Fonseca, Genetics Society (formerly Principal Hotel) Bill Hill, IEB, The University of Edinburgh 19-21 George Street, Edinburgh Alan Archibald, The Roslin Institute, The University of Edinburgh Social media Chris Haley, The Roslin Institute, Tweet @GenSocUK using #GenSoc100 The University of Edinburgh Eileen Wall, SRUC

10 Meeting information SITE MAP

Site map 11 Want to hear about the latest developments in genetics and evolution from the people making them? Then tune into the Heredity Podcast!

From discussions with authors to live event recordings and editor interviews, accompany our host James Burgon as he journeys through and beyond the pages of Heredity – the official journal of the Genetics Society.

Covering everything from the population genetics of invasive species to the evolution of human storytelling, join us to hear the stories behind the science.

Listen to our latest episodes on the Heredity website (nature.com/hdy/podcast) or find them on:

iTunes Overcast Google Podcasts GENOME STABILITY AND INSTABILITY

Chair: David Finnegan Co-chair: Sveta Markovets

The genetic information required by an organism is contained within its nuclear genome together with a little in mitochondria and, if you happen to be a plant, chloroplasts. Occasional changes to the genome are beneficial but generally they are bad news and, as one would expect, mechanisms have evolved to maintain genome integrity within single cells and as the genome as a whole is transmitted from one generation to the next and it is some of these mechanisms that will be addressed during this session. 01 STEVE JACKSON University of Cambridge

Harnessing genetic principles

As DNA is frequently subject to a wide array of molecularly distinct forms of damage, life has evolved multiple DNA repair and associated processes, collectively termed the DNA- damage response (DDR). The importance of DDR mechanisms is underlined by their deregulation or loss causing cancer and various human disorders whose pathologies include stem-cell exhaustion, developmental defects, infertility, immune-deficiency, neuro-degeneration, cancer predisposition and/or premature ageing. Work in my laboratory aims to decipher DDR mechanisms, particularly those triggered by DNA double-strand breaks. In this talk, I will describe how we have used various experimental approaches and the genetic principles of synthetic-lethality and synthetic- viability to identify new DDR factors/regulators and then define their functions. I will also explain how our work has contributed towards the generation of new cancer therapies and is suggesting how such therapies may be most effectively deployed. Finally, I will discuss how we are extending our studies into various genetic diseases.

Biography

Steve Jackson FRS FMedSci is Professor of Biology at the University of Cambridge. His research has identified many key principles by which cells respond to and repair DNA damage, and has involved him identifying many DNA-repair proteins, establishing how they function and showing how their dysfunction yields disease. Steve has founded and scientifically led several biotechnology companies. One of these, KuDOS Pharmaceuticals, generated the PARP inhibitor drug olaparib/LynparzaTM that is now marketed worldwide by AstraZeneca for treating certain ovarian and breast cancers. Lynparza is the world’s first drug targeting inherited cancer predisposition.

14 Speakers and abstracts 02 SÉVERINE CHAMBEYRON Institute of Human Genetics, Montpelier piRNA pathway is a guardian of genome integrity

Transposable elements (TEs) are genomic parasites that have managed to colonise the whole life tree. They hijack the cellular machinery to produce mRNAs and proteins that are essential for their own replication which rarely benefits to host cells directly. Moreover, the act of transposition is mutagenic because (1) random TE insertions can disrupt functional DNA and (2) double-stranded breaks caused by transposition can have dramatic consequences for genome integrity. In the ongoing conflict between TEs and their host genomes, no battleground is more critical than the germline, where even the most deleterious TE insertions are heritable. Unrestricted germline TE propagation can also disrupt gametogenesis, sometimes causing complete sterility. Recently, a RNAi-like genome surveillance system has been discovered in animal gonads involving small non-coding RNAs called PIWI-interacting RNAs (piRNAs). This small RNA- mediated immunity can keep track of all the genomic TEs, including the new genome invaders and is able to selectively target them for heterochromatinization. Our lab is interested in understanding the molecular mechanisms involved in piRNA biogenesis and function, using the Drosophila melanogaster model system. We use a variety of methodologies ranging from protein biochemistry, cell biology, and animal genetics to computational biology.

Biography

Séverine Chambeyron started to study Drosophila genetics in 1998 as a PhD student working on I-R hybrid dysgenesis at the University of Montpellier. In 2002, as a postdoc in the laboratory of Wendy Bickmore at the MRC Human Genetics Unit, Edinburgh, she studied the structure and the organisation of mouse chromatin. She then developed her own projects in a junior team at the Institute of Human Genetics in Montpellier, working on the epigenetic mechanisms involved in transposon repression in the Drosophila ovary. Of note, her lab discovered piRNAs as the molecular basis of the non-Mendelian inheritance of this silencing.

Speakers and abstracts 15 03 KIM NASMYTH University of Oxford

Kim Nasmyth FRS FMedSci is an English geneticist, the Whitley Professor of Biochemistry at the University of Oxford, a Fellow of Trinity College, Oxford, former Scientific Director of the Research Institute of Molecular Pathology, and former Head of the Department of Biochemistry, University of Oxford. He is a member of the Advisory Council for the Campaign for Science and Engineering. He is best known for his work on the segregation of chromosomes during cell division.

16 Speakers and abstracts 04 ADELE L. MARSTON The Wellcome Centre for Cell Biology

Pericentromere organisation in mitosis

The 3D architecture of the genome governs its maintenance, expression and transmission. The conserved ring-shaped cohesin complex organises the genome by topologically linking distant loci on either a single DNA molecule or, after DNA replication, on separate sister chromatids to provide the cohesion that resists the pulling forces of spindle microtubules during mitosis. Cohesin is highly enriched in specialised chromosomal domains surrounding centromeres, called pericentromeres. We determined the 3D structure of budding yeast pericentromeres. We find that convergent genes mark pericentromere borders and, together with core centromeres, define their structure and function by positioning cohesin.

Biography

Adele Marston is a Wellcome Senior Research Fellow and Professor in Cell Biology at The University of Edinburgh. Her group aims to understand the origin of aneuploidy, particularly during meiosis, the cell division that generates eggs and sperm. The focus is to understand molecular mechanisms of chromosome segregation, primarily using yeast, together with a combination of genetics, biochemistry and microscopy. Adele obtained her PhD from the University of Oxford, in the lab of Jeff Errington, and carried out postdoctoral work with John Chant at Harvard University and Angelika Amon at MIT. In 2005 she moved to the Wellcome Centre Cell Biology in Edinburgh to establish her independent research group as a Wellcome Research Career Development Fellow. In 2010 she obtained a Wellcome Senior Fellowship, which she renewed in 2015. Adele was an EMBO Young Investigator (2010) and elected to EMBO membership in 2019.

Speakers and abstracts 17 05 LANA TALMANE The University of Edinburgh

Protein binding as a selective filter for new mutations at regulatory sites in the germline and in cancers

Genetic mutations provide the raw material for evolution, they are responsible for heritable disease and driving the development of cancer. We, and others, have shown that the binding of chromatin and regulatory proteins to DNA can interfere with replication, surveillance and repair processes, but the proposed mechanisms presume the loading of sequence specific binding factors over nucleotide mismatches and other lesions, which seems paradoxical for sequence specific binding factors. Here we propose the biased mask model, where the binding of some transcription factors can tolerate mismatch substitutions or other lesions strand specifically at some sites, acting as a selective filter of new mutations. We provide electrophoretic mobility shift assay support for the biased mask, and illustrate how it is shaping the mutation patterns of both cancers and the human germline. Being replication associated, the mutational burden of this biased mask predicts that the protein binding sites occupied during germline replication are hotspots for functionally important mutations, which will be exacerbated by increased paternal age. Exploring this we have applied ATAC-seq to primary human spermatogonial cells, which account for up to 80% of human germline DNA replication, to map the germline protein binding landscape. By combining this map with human population variation data we confirm germline sequence specific binding sites as hotspots of deleterious (negatively selected) human mutations.

18 Speakers and abstracts POSTER PITCHES

Speakers and abstracts 19 GENOME STABILITY AND INSTABILITY

P01 MARIA GOMEZ Centro de Biologica Molecular Severo Ochoa, Spain

Non-coding transcript retention in chromatin underlies replication-transcription conflicts in histone H1-deficient cells

Chromatin conformation is central to the interplay between DNA replication and transcription. We recently described that loss of chromatin compaction in mES cells with reduced histone H1 content triggers massive replication fork stalling and DNA damage. Replicative stress was suppressed by transcription inhibition but rapidly re-established upon transcription re-start, indicating that correct amounts of histone H1 are required to avoid nascent RNA accumulation in chromatin. Chromatin-enriched RNAs characterisation showed specific accumulation of nascent long non-coding transcripts (lncRNAs) in H1- defficient cells, some of which can give rise to R-loop structures. Importantly, chromatin retention of non-coding transcripts also occurs in human cells transiently depleted of the orthologous histone H1 genes. Our results implicate histone H1 in the RNA surveillance of lncRNAs and stressed the multiple roles of chromatin structure in regulating RNA metabolism and thus in minimising replication-transcription conflicts.

P03 JASMINE ONO University College

Engineering recombination between diverged yeast species reveals genetic incompatibilities

Reproductive isolation, and consequently speciation, builds up from a combination of isolating barriers. We investigate the potential genetic causes of intrinsic reproductive isolation between two closely related species of yeast, Saccharomyces cerevisiae and S. paradoxus. F1 crosses between these species are viable but infertile, with less than 1% of their gametes being viable. The main cause for this sterility is the mis-segregation of chromosomes during meiosis, due to a failure of homologous chromosomes from the different species to recombine. By suppressing the meiotic expression of two genes, we increased recombination and improved hybrid fertility 70-fold. This allowed us to search for other forms of reproductive isolation in the system, such as negative genetic interactions (Bateson-Dobzhansky-Muller incompatibilities), which are thought to be a common mechanism for speciation. We present evidence of such incompatibilities in yeast and take further steps to map and characterise the causative loci.

20 Speakers and abstracts GENETICS AND SELECTION OF QUANTITATIVE TRAITS

P05 KONRAD RAWLIK The Roslin Institute, The University of Edinburgh

SNP heritability and how not to estimate it

With widespread access to genotype data, estimation of SNP heritability has been widely adopted as an analysis tool. But, do widely employed estimators actually estimate a quantity of interest? One might argue that estimators like REML are un-biased. However, this is only the case if the data is sampled from the assumed statistical model. This assumption is not tenable in real populations with an abundance of trait specific unquantified properties, like assortative mating, selection or genetic architecture. This has led to prolonged debate of the appropriate model choice and has yielded a number of approaches to account for frequency- and linkage disequilibrium-dependent genetic architectures. Here we resolve these debates. We first analytically answer the question of how estimates of model parameters relate to parameters of the population from which samples are actually drawn. Surprisingly, the correct model for the purpose of inference about parameters of the population does not actually require knowledge of the true genetic architecture of a trait or other difficult to establish properties of the population. Conversely, we illustrate how commonly employed approaches can lead to false conclusions about the genetic architecture. For example, how inclusion of SNPs with no effect on the phenotype can inflate the estimates of heritability. Overall, our results provide a complete perspective of genomic REML estimators, what estimates obtained under current practice mean, why they shouldn’t be used, and what the way forward is.

P07 JOEL PICK The University of Edinburgh

The importance of being normal: Decomposing phenotypic skew and its consequences for predicting the response to strong selection

A large amount of quantitative genetic theory is based on the premise that traits, and their underlying constituents (genetic and environmental values) are normally distributed. Theory however suggests that there are reasons to expect deviations from normality in distributions of both environmental and genetic values. Furthermore, such deviations from normality can impact how traits respond to selection and even have the potential to explain evolutionary stasis. Although phenotypic skew is seen in many traits, it is typically either ignored, or it is assumed to be constant across all underlying constituents. Using data of nearly 6000 individuals from a long-term study of wild blue tits (Cyanistes caeruleus), we decompose phenotypic skew into environmental and genetic skew, in juvenile body size, a trait under strong, persistent, directional selection. We then consider the impact that skew at different levels has on how this trait responds to selection.

Speakers and abstracts 21 THE GERMLINE, SEX DETERMINATION & SEX CHROMOSOMES

P09 QI ZHOU Zhejiang University & University of Vienna

Sex chromosome evolution of birds and bird-like mammals: shifting from karyotype to 3D genome

Sex chromosomes are the outliers of genome given their unique evolutionary history and regulatory programmes. Their diversity across species remain largely unexplored due to their usually repetitive sequence nature despite the status quo of third-generation sequencing. I will present our recent work on birds and platypus, whose genomes are assembled into a chromosome-level by Hi-C technique with corroboration of published fluorescence in situ hybridization (FISH) data. During the analyses of 15 Paleognathae and 12 songbird genomes, we found consistent with our previous results, bird sex chromosomes only shared one time of recombination suppression encompassing the male-determining DMRT1 gene, but independently form evolutionary strata after the rapid species radiation. We found several cases of genomic regions that exhibit excessive female heterozygosities, but a conserved synteny between the Z/W chromosomes according to FISH data. This indicates recombination suppression occurred without chromosomal inversions. In the case of platypus, we traced some of the sex chromosome origin to the microchromosomes of chicken. To our surprise, platypus sex chromosome show unexpected interchromosomal interactions that are not observed on any other chromosomes. We further confirmed the pattern with mitotic cell imaging showing indeed certain pairs of labelled sex chromosomes tend to have more contacts with each other. And such an interaction pattern seems to be conserved with human. These results highlight the fact that in the post-genomic era, karyotypic resources and techniques become in fact more important than before in the studies of sex chromosome evolution. And higher order of chromatin structure maybe conserved even involved chromosomes have changed their identity.

22 Speakers and abstracts P11 AURORA RUIZ-HERRERA Universitat Autònoma de Barcelona, Spain

Principles of 3D genome folding in the mammalian germline

Mammalian genomes are packaged into a specifically tailored chromatin structure; the regulation of which depends on several superimposed layers of organisation that includes the chemical modifications of the DNA, epigenetic modifications of the nucleosomes and the high-order organisation of chromatin compartments inside the nucleus. How these different levels of chromatin organisation interact during meiosis remains largely unexplored. Even more fragmented is our understanding of the heritability of genome organisation. In this context, germ cells represent a unique cell model, where unipotent diploid cells undergo extensive cellular differentiation to form highly specialised haploid cells that ultimately form a totipotent embryo. These sequential developmental stages involve dramatic and highly regulated chromosomal organisation and chromatin remodelling, whose regulatory pathways are far from understood. Here we present a comprehensive high-resolution structural and functional atlas of mouse spermatogenesis, which addresses outstanding questions in the field of genome organisation, such as the implications for recombination and fertility. We first developed a reproducible flow cytometry protocol to isolate highly enriched mouse male germ cell populations representing all stages of spermatogenesis (pre-meiotic, meiotic and post-meiotic cells). Then we subjected millions of cells (representing the whole developmental process) to a comprehensive and robust high-throughput analysis that included in situ Hi-C, RNA-seq, ChIP-seq of CTCF and meiotic cohesins (REC8 and RAD21L), coupled with confocal and super-resolution microscopy imaging. These data permitted unprecedented detailed study of the close interplay between chromatin high- order organisation dynamics and function during mammalian spermatogenesis. This allowed us to distinguish different stages of compartmentalisation for chromosomes in pre-meiotic (spermatogonia), meiotic (leptonema, zygonema, pachynema and diplonema) and post-meiotic cells (round spermatids and sperm). And this is reflected at different levels of genome organisation, such as intra-/inter-chromosomal interaction ratios, distance-dependent interaction frequencies, genomic compartments, topological domains, and occupancy of insulator proteins. Our unique collection of data show that chromatin remodelling during germ cell differentiation is coupled with a fine-scale correlation between the expression of key genes involved in early stages of development and architectural proteins, such as meiotic cohesins. This hints at an unpredicted functional role for cohesin in regulating gene transcription (independent of CTCF) during and after meiosis. Overall, our results provide new insights into how genome reshuffling influences chromatin folding and nuclear architecture, especially in the context of the dramatic chromatin changes that take place during the formation of the germ line. We anticipate that our results will provide impetus for further exploration of the functional and structural basis of genomes in a broad context, which will reinforce the link between developmental genetics and genome evolution.

Speakers and abstracts 23 HUMAN GENETIC VARIATION: FROM MOLECULES TO POPULATIONS

P13 CHLOE STANTON The University of Edinburgh

Mendelian disease and complex trait genetics implicate ZNF469 as a master regulator of corneal thickness

The collagen-rich corneal stroma accounts for 90% of corneal thickness in humans, and is a major determinant of visual acuity. Stromal disorganisation and thinning are seen in corneal disorders including keratoconus – the main cause of corneal transplants in the UK – and in systemic connective tissue disorders such as Ehlers-Danlos syndrome. Most devastatingly, the Mendelian recessive disorder Brittle Cornea Syndrome (BCS) characterised by extreme thinning of the cornea and sclera, and general connective tissue dysfunction, results from loss of function mutations in the poorly understood genes encoding transcription factors ZNF469 or PRDM5. Central corneal thickness (CCT) is a highly heritable trait. We, and others, discovered a strong GWAS signal for CCT >100kb upstream of ZNF469 on chromosome 16 in a region rich in ENCODE chromatin marks indicative of enhancer function. This suggests that ZNF469 regulation and function are key to stromal development and corneal biology. We hereby sought to determine the biological function of ZNF469 to improve understanding of mechanisms leading to extreme and normal variation of cornea thickness. In order to dissect the regulatory mechanisms modulated by the GWAS variant(s) upstream of ZNF469 and to establish the function of ZNF469, CRISPR-Cas9 genome editing, CRISPR interference (CRISPRi) and RNAi, using shRNAs to knockdown expression of ZNF469, have been performed in the human corneal keratocyte cell line, hTK. Cellular investigations include proliferation assays, transcriptomics and generation of cell-derived matrices for proteomics. CRISPR-Cas9 gene editing has been used to create a mouse line carrying a human BCS mutation. Phenotyping includes anterior segment OCT, corneal histology and immunostaining. Our data indicate that ZNF469 is required for the maintenance and proliferation of hTKs in culture, with complete knockout incompatible with cell viability. Reducing ZNF469 expression in vitro affects the expression of many keratocyte genes, with a significant enrichment of GWAS candidate target genes. These misregulated genes include not only stromal components but also attractive candidate regulatory factors for stromal components, and for ZNF469 itself. Our mouse model of BCS indicates that loss-of- function mutation in Znf469 results in thin corneas relative to wildtype control mice. Mendelian disease and complex trait genetics have converged to implicate ZNF469 as a master regulator of corneal thickness. Homozygous loss-of-function mutations in ZNF469 lead to extreme corneal thinning, representing the pathological extreme of the spectrum of stromal thickness in the human population which can be modulated by genetic variation in enhancer regions. Increased knowledge of ZNF469 function and regulation

24 Speakers and abstracts in vitro and in vivo will provide insight into biological mechanisms controlling corneal thickness. Work is on-going to determine whether the enhancer highlighted by CCT GWAS is cell- or tissue-specific, which regulatory factors contribute to enhancer activity, and to elucidate the mechanism by which ZNF469 controls corneal thickness.

P15 NICOLA PIRASTU The University of Edinburgh

Using genetics to disentangle the complex relationship between food choices and health status

Food choices are one of the most important factors influencing health, thus understanding their genetic determinants may shed light on important biological mechanisms underlying disease. However, nutritional epidemiology is plagued by biases and confounding, where for example, health status or education affect both behaviour and reporting of consumption. In this context, Mendelian Randomisation may prove the ideal choice of method as it is supposedly less influenced by these factors. We have thus first performed GWAS on 29 food consumption traits in up to 445,779 individuals identify 302 associated loci (286 of which novel). In order to verify if our results were influenced by bias we have developed applied a novel post-GWAS Mendelian Randomisation (MR)-based correction approach to measure and adjust for the effect of health-related traits on food choice/reporting. We show that these risk factors and diseases account for up to 42% of the genetic variance of reported food intake in UK Biobank affecting greatly both the identified loci but also genetic correlation with other traits producing confounding in these measures. We also show by simulation and real data that comparison of the adjusted and raw results distinguish loci directly influencing food choice, rather than via a mediator/confounder, allowing selection of more valid instrumental variables for MR and better gene prioritisation for functional follow up. Network analysis of the genes prioritised in the direct effect loci identify 10 different clusters each characterised by specific function and expression pattern. For example, cluster one identifies genes involved in feeding behaviours and expressed in the brain while genes in cluster two are linked to energy and carbohydrates metabolism and are expressed mostly in skeletal muscle. We show through extensive MR analysis that the use of the direct-effect only instruments can substantially change the results. The corrected MR results show that although apparently an overall healthy diet (more fish, vegetables, and fruit and less meat, alcohol and coffee) have protective effects over a wide range of diseases/traits, the specific foods patterns which affect them vary across traits. For example, although BMI and Body fat seem to be affected in the same way by the overall dietary pattern, meat consumption has an effect on the former while not on the latter. This suggests that although an overall “healthy” diet does in the end produce many beneficial effects the recommendations should be targeted based on the desired effect.

Speakers and abstracts 25 In conclusion, this study while shedding a new light on the biology underlying food consumption also gives new insight on the complex relationships between food and health showing how complex these effects are. Furthermore, we provide a framework which could potentially benefit the field of genetics in general giving the tools to address the ever more wide spread problem of biases in GWAS studies.

EPIGENETICS AND NON-CODING RNA

P17 HOLLIE MARSHALL University of Leicester and The University of Edinburgh

The role of DNA methylation in genomic imprinting in social insects.

Genomic imprinting is the expression of only one allele in a diploid organism; expression being dependent upon the sex of the parent the allele was inherited from. Haig’s kinship theory predicts imprinting exists because of conflict between the matrigenes (maternal alleles) and patrigenes (paternal alleles) of an individual over maternal resource allocation. This theory is also applicable to haplodiploid social insect species, predicting genomic imprinting should, for example, occur with relation to worker reproductive behaviour in social bees. However, the existence of genomic imprinting in social insects is currently unknown. For a gene to be classed as imprinted an ‘imprinting’ mark must be present to distinguish the parental alleles within the genome. In mammals and flowering plants imprinted genes are primarily associated with DNA methylation in areas of the genome known as imprinting control regions. Many social insects including the bumblebee, possess a functional DNA methylation system, meaning the tools for imprinting are in place. Whilst some previous research suggests genomic imprinting may exist in honeybees, it is only based on gene expression data, DNA methylation as a regulatory mechanism has not yet been explored. In order to determine the role of DNA methylation in genomic imprinting in social insects we carried out reciprocal crosses of two bumblebee subspecies. Whole genome sequencing of the parents and RNA-Seq of offspring allowed us to identify parent-of-origin expression for the first time in this species. We found equal numbers of maternally and paternally expressed alleles in both reproductive and sterile workers with the majority of genes showing the same expression bias in both castes. We also found low evolutionary conservation of potentially imprinted genes in Hymenoptera, with only two homologous genes showing parent-of-origin expression in Bombus terrestris and Apis mellifera, suggesting rapid evolution of imprinted genes in this order. Finally using whole genome bisulfite sequencing we have also identified a small number of CpGs which display parent-of-origin DNA methylation. However, none of the genes which contained these CpGs were associated with parent-of-origin expression, indicating imprinting may be primarily controlled by other epigenomic mechanisms in social insects.

26 Speakers and abstracts P19 ATITILA MOLNAR The University of Edinburgh

Non-perfectly Matching Small RNAs Can Induce Stable and Heritable Epigenetic Modifications

Short non-coding RNA molecules (sRNAs) play a fundamental role in gene regulation and development in higher organisms. They act as molecular postcodes and guide AGO proteins to target nucleic acids. In plants, sRNA-targeted mRNAs are degraded, reducing gene expression. In contrast, sRNA-targeted DNA sequences undergo cytosine methylation referred to as RNA-directed DNA methylation (RdDM). Cytosine methylation can suppress transcription, thus sRNAs are potent regulators of gene expression. sRNA-mediated RdDM is involved in genome stability through transposon silencing, mobile signalling for epigenetic gene control and hybrid vigour. Since cytosine methylation can be passed on to subsequent generations, RdDM contributes to transgenerational inheritance of the epigenome. Using a novel approach, which can differentiate between primary (inducer) and secondary (amplified) sRNAs, we show that initiation of heritable RdDM does not require complete sequence complementarity between the sRNAs and their nuclear target sequences. Our findings contribute to understanding how sRNA can directly shape the epigenome and may be used in designing the next generation of RNA silencing constructs.

Speakers and abstracts 27 GENETICS AND SELECTION OF QUANTITATIVE TRAITS

Chair: Josephine Pemberton, [email protected]

Co-chair: Smaragda Tsairidou

Sponsor:

Almost all biological traits of importance are complex, variation in such traits being controlled by many genes of small effect plus environmental influences. Probing the causes of complex trait variation is critical to our understanding in areas as diverse as evolutionary biology, animal and plant breeding, behaviour, genetics of disease and personalised medicine. The genomics era and the information revolution offer unprecedented opportunities to dissect such traits and we are now in a period of intensive development in analytical methods in humans, animals and plants. Edinburgh has been at the forefront of quantitative genetics since its inception, and through its MSc cluster ‘Quantitative genetics and Genome Analysis’ and its PhD students, it has trained a very substantial number of quantitative geneticists worldwide. In this session, we will hear about recent developments in quantitative genetics ranging from the fundamental to the highly applied, and we will hear an appreciation of our Mendel Medal winner, Bill Hill.

Biography Josephine Pemberton is Professor of Molecular Ecology at the Institute of Evolutionary Biology in the School of Biological Sciences, The University of Edinburgh. She works on evolutionary quantitative genetics in natural populations, using data from the long- term, individual-based studies of red deer on Rum and Soay on St Kilda. Using both pedigree (reconstructed from markers) and genome-wide SNP genotypes her lab is currently focused on studies of maternal effects and inbreeding depression.

28 Speakers and abstracts 06 GRZEGORZ KUDLA The University of Edinburgh

A multidimensional genotype-phenotype map of a fluorescent RNA aptamer

Advances in synthetic biology and next-generation sequencing open the way to systematic analysis of mutational effects in individual genes, with applications in medical diagnosis, molecular evolution, and structural biology. An important challenge is to understand how the effects of mutations depend on the environment and genetic background. Here we developed a microarray platform to analyse the phenotypes of all 10,000 single- and double-mutants of the 49-nt long fluorescent RNA aptamer, Broccoli. We collected approximately 1,000 fluorescence measurements for each mutant, with varying pH, ligand and metal ion concentrations, temperatures, and fluorescence emission wavelengths. Multiple variants showed improved brightness, blue- and yellow-shifted fluorescence spectra, and altered potassium ion sensitivity. Mapping these mutations to secondary and tertiary structure models of broccoli revealed a hotspot of phenotypic diversity in the ligand binding pocket. We are currently conducting structural modelling and machine learning analyses to understand the structural basis of phenotypic variability.

Biography

Grzegorz Kudla carried out his PhD with Maciej Zylicz in Warsaw, and postdocs with Joshua Plotkin at Harvard and David Tollervey in Edinburgh. Since 2012, he has been leading a research group at the MRC Human Genetics Unit in Edinburgh, funded by the Wellcome Trust, MRC, ERC, and EMBO. The Kudla group develops high-throughput approaches to study relationships between the sequence, structure, and function of RNA and proteins. They pioneered the use of saturation mutagenesis to study RNA expression and function, and the use of proximity ligation for in vivo studies of RNA-RNA interactions. In his free time, Grzegorz runs an after school Go club.

Speakers and abstracts 29 07 PETER KEIGHTLEY The University of Edinburgh

Inferring the distribution of fitness effects of mutations

The distribution of effects of new mutations for fitness and other quantitative traits is important for several questions in population and quantitative genetics, including the extent of nearly neutral molecular evolution and the maintenance of variation and response to selection for quantitative traits. The distribution of effects informs about the frequencies of small- versus large-effect mutations and the frequencies of advantageous versus deleterious mutations. At the Institute of Animal Genetics, The University of Edinburgh, Alan Robertson stressed the importance of the distribution of effects of mutations for understanding the genetic basis of quantitative variation (Robertson, 1967, Heritage from Mendel) and Bill Hill showed that the properties of the distribution are critically important for determining the response to artificial selection (Hill, Genet Res, 1982). I will present two experiments to directly estimate the distribution of fitness effects of mutations in the model single-celled alga Chlamydomonas reinhardtii. We have produced mutation accumulation (MA) lines of C. reinhardtii by allowing spontaneous mutations to accumulate in replicated sublines for many generations in the near absence of natural selection. We went on to determine the complement of mutations carried by each MA line by genome sequencing. Our two experiments start by crossing individual MA lines to a mutation-free ancestral strain of the same genetic background in order to produce recombinant lines (RLs), each carrying a random combination of the mutations present in the MA line parent. In the first experiment, we genotyped 1,526 RLs carrying a total of 386 mutations at their known locations and obtained replicate measures of their growth rate in liquid culture. In the second experiment, we sequenced pools of RLs in order to estimate the initial frequency of each mutation following crossing, and allowed the pools of RLs to evolve in liquid culture for ~50 generations. We then sequenced the evolved pools and measured the changes of frequency of each mutation following experimental evolution. For both experiments, we developed methods to infer the distribution of effects of mutations by Markov chain Monte Carlo (MCMC). We assume a random-effects model in which mutation effects are sampled from some distribution (e.g., a gamma distribution) or a mixture of distributions, and the parameters of the distribution(s) and the effects of each mutation are estimated.

30 Speakers and abstracts 08 TRUDY MACKAY Clemson University

Charting the Genotype-Phenotype Map: Lessons from Drosophila

A major challenge of modern biology is to understand the genetic and environmental factors causing variation in quantitative traits, such as susceptibility to common diseases, height and blood pressure in humans. This is required for predicting phenotypes from genetic and genomic data, the goal of precision medicine. Although intense efforts have been devoted to genome wide association studies for many diseases and quantitative traits in humans in the past decade, relatively few causal genes have been identified, in part due to extensive local linkage disequilibrium in human populations. The Drosophila melanogaster Genetic Reference Panel (DGRP) consists of 205 sequenced inbred strains derived from the Raleigh, NC population. The DGRP is a community resource for genome wide association (GWA) analyses for genetically complex traits, in a scenario where all molecular variants are known. The large amount of quantitative genetic variation, lack of population structure and rapid local decay of linkage disequilibrium in the DGRP present a favourable scenario for identifying candidate causal genes and even polymorphisms affecting complex traits. I will present the lessons we have learned about the genetic architecture of quantitative traits from studies on the DGRP and outbred populations derived from it, and the implications of the Drosophila data for genotype-phenotype mapping for complex traits in other species, including humans.

Biography

Trudy Mackay is the Director of the Clemson University Center for Human Genetics. Her research focuses on understanding the genetic and environmental factors affecting variation in quantitative traits, using Drosophila as a model system. Her laboratory seeks to identify the genetic loci at which segregating and mutational variation occurs, allelic effects and environmental sensitivities, and the causal molecular variants. Trudy’s research utilises mutagenesis, analysis of natural variation, and systems genomics analyses. She is a Fellow of the American Academy of Arts and Sciences, the Royal Society, the US National Academy of Sciences and the 2016 Wolf Prize Laureate for Agriculture.

Speakers and abstracts 31 09 TROY ROWAN University of Missouri

Detecting signatures of ongoing polygenic selection and local adaptation in United States Bos taurus beef cattle

Cattle are under intense selection pressures, both artificial and natural, making them an interesting system in which to study how selection acts on short timescales. This project leverages temporally-stratified genotypes to identify genomic changes that have occurred in cattle populations over the last ~50 years. Additionally, using geographic information and historical climate data, we identify unique signatures of selection associated with regional environments. Using an evolutionary approach to study artificial selection allows us to create deliverable results for use in genetic evaluations while providing important insights into adaptation biology. Genotypes for 22,000 Red Angus, 17,000 Simmental, and 15,000 Gelbvieh cattle, born from 1970 to 2016, were phased with Eagle v2.4 and imputed with Minimac3 up to ~850,000 autosomal SNPs. To identify ongoing selection occurring within these three populations, we performed Generation Proxy Selection Mapping. This method implements a genome-wide linear mixed model that uses an individual’s birth year as a proxy for generation number as the phenotype to identify shifts in allele frequency over time more significant than those caused by drift alone. The use of a genomic relationship matrix allows us to explicitly control for family structure while in a polygenic framework. In each dataset, we identified dozens of genomic regions under strong directional selection. These regions are associated with known targets of selection, QTL for economically-relevant traits, genes involved in immune function, and numerous other loci actively undergoing large shifts in allele frequency. Ongoing simulations are exploring the potential applications of this method to other species and systems. We leveraged the diverse geographic distribution of our samples to further identify regions of the genome contributing to local adaptation. We used 30-year mean temperature, precipitation, and elevation measurements for each sample as phenotypes in univariate and multivariate genome-wide association analyses. These tests identify loci associated with these continuous environmental gradients, which are presumably under selection in a region-specific manner. While univariate analyses were underpowered, the multivariate analysis identified 8 genomic regions strongly associated with continuous climate attributes. We also performed univariate and multivariate case-control analyses using an individual’s climate zone, from a k-means clustering-derived map, as the dependent variable. In all three datasets, we observed a considerable overlap between the continuous and discrete variable analyses. We identified potentially adaptive loci that have been associated with local adaptation in human populations, as well as variants within GRIA4 that have been associated with cold-adaptation in Arctic cattle populations. Loci associated with regional or environmental differences are being used

32 Speakers and abstracts to calculate region-specific genomic predictions via genomic feature selection. These predictions will provide beef producers with a new environment-aware selection tool to maximise efficiency specific to different environmental stressors throughout the United States.

Biography

Troy Rowan is a PhD candidate from the University of Missouri’s Animal Genomics Group (United States). His research uses large, commercially-generated datasets from United States beef cattle populations to study how selection and local adaptation shape the genome over short timescales and across fine-scale environments. He is interested in using this system to understand evolution as well as to breed more sustainable beef cattle. He is currently a visiting student with the AlphaGenes group at The Roslin Institute.

Speakers and abstracts 33 10 FELICITY JONES Max Planck Society, Tübingen

Gene regulation, epigenomics and recombination in adaptive evolution of natural stickleback populations

The ability of organisms to rapidly adapt to new environments can be both facilitated and constrained by molecular mechanisms operating at the genomic level. Genome studies frequently identify non-coding regions as drivers of adaptation, suggesting regulatory elements play a major role. It has previously been shown that stickleback fish make use of pre-existing, predominantly intergenic, and putatively regulatory adaptive alleles at more than 81 loci across the genome to adapt to divergent marine and freshwater environments. This presents a unique opportunity to discover and functionally dissect molecular mechanisms influencing evolution in a naturally adapting organism. In my talk, I am going to discuss two linked aspects of our research programme: defining the components of transcription regulation using functional genomics approaches like allele-specific RNAseq, ChIPseq, ATACseq, and transgenic assays, and charting variation in the recombination landscape among individuals, sexes and ecotypes using whole genome sequencing of pedigrees, linked-read sequencing and ChIPseq methods. These approaches highlight the importance of cis-acting changes for adaptation: cis- acting elements are closely linked to the genes whose expression they regulate and the tight linkage is amplified further through suppressed recombination. Evolutionarily, such architecture facilitates adaptive divergence under gene flow. Together these approaches provide key insight into how fundamental biological processes such as gene regulation and meiosis shape the adapting genome in the early stages of divergence and ecological speciation.

34 Speakers and abstracts Biography

Felicity Jones is a Max Planck Research Group Leader in evolutionary genetics at the Friedrich Miescher Laboratory of the Max Planck Society, Tübingen Germany. Her research focuses on the molecular mechanisms of adaptation and speciation and uses an evolutionary “supermodel” - the threespine stickleback fish. Felicity graduated with a Bachelor of Science in 1999 from the University of Queensland, Australia where she developed an interest in speciation working with Professor Craig Moritz. She pursued this during her PhD research where she focused on the early stages of adaptive divergence by systematically dissecting the pre- and post-zygotic components of reproductive isolation in stickleback hybrid zones. After completing her PhD at the Institute of Evolutionary Biology, The University of Edinburgh in 2005 Felicity moved to Stanford University, USA. There she worked as a postdoc with Professor David Kingsley developing genetic, genomic and transgenic tools that helped usher the stickleback into the genomic era. With a high resolution map of adaptive loci, solid transgenic methods, and recent advances in genome-editing technology Felicity’s team at the Max Planck Institute in Tübingen is tackling new challenges in evolutionary genetics such as functionally dissecting adaptive loci using comparative epigenomics, transcriptomics and transgenic approaches to identify molecular mechanisms underlying adaptation; sequencing thousands of genomes to identify factors that facilitate and constrain the availability of standing genetic variation for future adaptation. Supported by an ERC Consolidator Grant they are also using linked-read sequencing of gametes, ChIP sequencing of recombination proteins, and nuclear family sequencing to investigate how the genomic recombination landscape evolves and influences the genomic distribution of adaptive loci in natural populations.

Speakers and abstracts 35 11

BALFOUR LECTURE: SUSAN JOHNSTON The University of Edinburgh, [email protected], @SuseJohnston

The evolution of individual recombination rates in the wild

When animals and plants produce eggs or sperm, chromosome pairs swap large chunks of DNA in a process called meiotic recombination. This creates chromosomes that are a mosaic of DNA from both of an individual’s parents which are then passed on its offspring. Recombination is important for evolution as it can remove harmful mutations and create new combinations of alleles that are favoured by natural selection. However, too much recombination can break up good allele combinations and itself cause harmful mutations to occur. It was long thought that trade-offs between costs and benefits tightly constrained recombination rates to a narrow range, but we now know that rates can differ within and between chromosomes, individuals, populations and species. Yet, little is known about how and why this variation has evolved. In this lecture, I will present our work using genomic data from long-term studies of wild mammals and birds to quantify how recombination rates vary, which genes underlie this variation, and how variants at these genes relate to individual reproductive success and survival. Our goal is to gain a better understanding of the importance of recombination in fertility, genetic improvement, and how it affects the evolution of other traits in our changing world.

Biography

Susan Johnston is a Royal Society University Research Fellow at the Institute of Evolutionary Biology at The University of Edinburgh. Her research integrates quantitative genetics with genomics to understand evolutionary trade-offs in wild populations. Her PhD and postdoctoral work investigated genetic architecture and fitness trade-offs in sexually-selected traits, before shifting her focus to understanding the causes and consequences of recombination rate variation at different evolutionary timescales. Her group currently uses genomic data from long-term studies of birds and mammals to investigate a range of questions related to recombination, immune traits, indirect genetic effects and sexual conflict.

36 Speakers and abstracts 12 JOHN HICKEY The Roslin Institute, The University of Edinburgh, [email protected]

Disruptions through Data Driven Breeding and integration will drive step changes in animal and crop performance

The world population is predicted to reach 9 billion within the next 35 years, requiring a 70-100% increase in food production relative to current levels. Breeding, which has been responsible for ~50% of historical improvements in agriculture, is one of the key routes through which this increased production, efficiency and sustainability can be delivered, though it requires a doubling of the rates of genetic gain achieved by breeding. Although plant and animal breeding have similar objectives and have similar roots, the two fields, and their respective concepts and technology, have diverged over the past 100 or so years. With advances in technologies and a re-appreciation of the centrality of data science these two fields are converging again creating opportunities for synergy in research, training and deployment. This will drive disruptions and step changes. Breeding is increasingly viewed as an integrative science that has data science at its core. Advances in genomic sequencing, genome editing, reproductive, robotic and other technologies offer huge potential to drive step changes in crop performance. Untapping the potential of these technologies requires their development and deployment in concert build upon a platform and culture of data science.

Biography

Professor John Hickey holds the Chair of Animal Breeding at The Roslin Institute, which is part of The Royal Dick School of Veterinary Studies at The University of Edinburgh in the United Kingdom. His area of research spans animal breeding, plant breeding, with some spill over into human genetics. In particular, he seeks to develop computational methods to generate and analyse huge data sets with whole genome sequence information as well as methods and breeding strategies that use genomic information to increase rates of genetic progress. In recent years his research group have worked on breeding projects in pigs, chickens, cattle, groundnut, forest trees, strawberry, maize, wheat and other species. Software and algorithms developed by John Hickey and colleagues underpin aspects of several of the largest breeding programmes globally.

Speakers and abstracts 37 13 IVAN POCRNIC The Roslin Institute, The University of Edinburgh, [email protected]

Limited dimensionality of genomic information and implications for genomic prediction

Modern molecular genetic techniques are providing data for an ever-increasing number of loci, yet the amount of usable information for genomic prediction seems to be saturating. In this contribution we summarise (i) our approach to measuring the limited dimensionality of genomic information, (ii) utilisation of the limited dimensionality for fast and scalable genomic prediction and (iii) future research avenues. (i) We conceptualised the dimensionality of genomic information using Stam’s number of independent (effective) chromosome segments (Me), which is related to the Fisher’s theory of junctions. This metric has been “rediscovered” with the advent of genomic prediction, particularly as one of the key parameters for predicting the accuracy of genomic prediction. Stam derived an expectation of Me for a random mating population of constant size and showed that it is equal to 4NeL, where Ne is effective population size and L is genome length in Morgans. However, we generally do not know Ne for a given population or dataset at hand. We therefore proposed to estimate the dimensionality of genomic information for a given dataset as the number of non- negligible singular values of a genotype matrix, or equivalently, as the number of non- negligible eigenvalues of a genomic relationship matrix (GRM). Using this approach, we estimated that Me is about 10,000-15,000 in cattle and about 4,000 in pigs and broilers despite the fact that the underlying data had about 40,000 genome-wide loci for 16,000 to 81,000 genotyped animals. In these populations, the number of eigenvalues that explained 90, 95, and 98% of variation in GRM respectively correspond to NeL, 2NeL, and 4NeL. (ii) We exploited the limited dimensionality of genomic information to speed-up calculations of genomic prediction. Specifically, we have derived specialised inverse of GRM by dividing individuals into core and non-core, where non-core individuals are modeled as a function of core individuals. Limited dimensionality was exploited by minimising the number of core individuals, which increases the sparsity of inverse of GRM and substantially speeds-up calculations, in particular with large datasets (the core size can be constant). In a range of animal breeding populations accuracy of genomic predictions peak when we use the dimensionality corresponding to 98 to 99% of variation in GRM or around 4NeL. Therefore, the number of core individuals approximately corresponded to Me. This indicates that 1 to 2% of variation in GRM is due to the noise or that our datasets are too small to accurately estimate the effect of smallest eigenvectors. Further, we observe that the accuracies are only slightly reduced at half the “optimum” dimensionality.

38 Speakers and abstracts (iii) The results show that genomic information is limited, which provides us with the new viewpoint on the mechanisms behind the genomic prediction. Some of the open questions for future research are the relationship between the limited dimensionality and underlying genetic processes that generate linkage disequilibrium, choice of core individuals, the impact of limited dimensionality on genomic prediction and genome- wide association studies in homogeneous and structured populations, optimal number of loci for genomic studies and theoretical formulas for genomic prediction accuracy.

Biography

Ivan Pocrnic is a Core Scientist (Research Fellow) at The University of Edinburgh. His main research interest is in the development of novel breeding methodologies and tools. This work involves a synergy between theoretical quantitative genetics and practical animal breeding. Ivan’s main expertise is the application of genomic selection on big datasets for various species (pig, chicken, cattle, etc.) and various models, with emphasis on single-step genomic prediction. Before taking a post at The Roslin Institute with Gregor Gorjanc, Ivan was postdoc with Ignacy Misztal at the University of Georgia (USA), where he also obtained his PhD. Ivan previously worked at the Croatian department for breeding value estimation and obtained MSc in Animal Genetics and Breeding, and BSc in Agricultural Economics at the University of Zagreb (Croatia).

Speakers and abstracts 39 14 PATRICK SHARMAN University of Exeter

Genetic improvement of racehorse speed

The highly competitive and commercial nature of the racehorse breeding industry mean that selection on thoroughbred performance is assumed to be strong. However, a number of studies have failed to detect substantive improvement in thoroughbred racehorse speed over recent decades, leading to the suggestion that thoroughbred performance has reached a de facto selection limit. A more recent (and more comprehensive) analysis of race times in Great Britain from the mid-1800s to 2012 revealed that racehorses are still getting faster. However, this pattern is largely driven by improvement over sprint distance races, rather than over the longer distances run in the most prestigious races. Furthermore, estimated rates of improvement reported were well below those achieved in livestock breeding programmes more generally.

Biography

Patrick Sharman is a PhD student working with Alastair Wilson at the University of Exeter. His research addresses the genetics of performance and health traits in thoroughbred racehorses. Patrick completed his MSc in Quantitative Genetics and Genome Analysis at The University of Edinburgh in 2005, following which he spent a number of years working in the racing and horse breeding industries. He returned to academia in 2013 and now combines part-time PhD studies with continuing involvement in the racing industry.

40 Speakers and abstracts 15 MENDEL MEDAL: PROFESSOR WILLIAM G. HILL Introduced by Laurence D. Hurst Bill Hill, an appreciation: Trudy Mackay

Bill Hill was brought up on a family farm in Hertfordshire and later developed an interest in animal genetics and breeding, and its statistical underpinning. Much after a masters in the US, he carried out a PhD with Alan Robertson in Edinburgh. Bill then went on to join The University of Edinburgh staff, he still maintains an office, long after formal retirement. He has maintained an interest in long-term genetic improvement and genetics of finite populations.

Speakers and abstracts 41 THE GERMLINE, SEX DETERMINATION & SEX CHROMOSOMES

Chair: Brian Charlesworth, The University of Edinburgh, [email protected] Co-chair: Kay Boulton

Studies of the determination of gender by sex chromosomes played a critical role in establishing the chromosomal basis of inheritance. The pioneer of genetics in Edinburgh, F.A.E. Crew, wrote the first English-language book on sex determination, and contributed to early work in this field. H.J. Muller, who discovered dosage compensation in Drosophila and laid the foundations of theories of the evolution of X and Y chromosomes, worked here for several years. Mary Lyon, who completed her PhD in Edinburgh, discovered dosage compensation by X-inactivation in female mammals. Howard Cook made important contributions to the genetics of the human Y chromosome. Finally, Deborah Charlesworth and myself helped to develop modern theories of sex chromosome evolution, and Deborah pioneered the genetics of sex chromosomes in plants. It therefore seemed appropriate to include a session on these topics in this centennial celebration. Stephen Wright and Doris Bachtrog have made important discoveries about how the lack of genetic recombination on the Y chromosome has led to its loss of function and to dosage compensation, in plants and Drosophila respectively. Laura Ross will discuss some maverick systems involving the suppression or elimination of paternal genomes, which pose challenges to both molecular and evolutionary geneticists.

Biography Brian Charlesworth is a Senior Honorary Professorial Fellow at The University of Edinburgh. He obtained a PhD in Genetics at Cambridge in 1969. He subsequently worked at the Universities of Chicago, Liverpool and Sussex, moving to Edinburgh from Chicago as a Royal Society Research Professor in 1997. He is an FRS, FRSE and Foreign Associate of the US National Academy of Sciences. He received the Darwin Medal of the Royal Society in 2000, and the Thomas Hunt Morgan Medal of the Genetics Society of America in 2015. His research interests are in population genetics theory, molecular evolution and genome evolution. 42 Speakers and abstracts 16 GENETICS SOCIETY MEDAL: DEBORAH CHARLESWORTH The University of Edinburgh, [email protected]

The diversity of sex chromosomes and the importance of genetics in understanding them

I will outline several genetic studies of young sex chromosome systems, involving genetic work in my lab, using the guppy (a fish) and the white campion (a plant) to illustrate the importance of this kind of study for understanding the different kinds of sex chromosomes that have evolved in animals and plants.

Biography

Deborah Charlesworth received a PhD in Genetics from Cambridge University in 1968, and did postdoctoral work at Cambridge, the University of Chicago, and Liverpool University. She taught at the University of Chicago from 1988–1997, leaving to take up a Professorial Research Fellowship at the University of Edinburgh. She is best known for her work on the evolution of genetic self-incompatibility in plants and is recognised as a leader in that field. According to the Web of Science she has published over 300 articles in peer-reviewed journals. These articles have been cited over 10,000 times and she has an h-index of 53.

Speakers and abstracts 43 17 DORIS BACHTROG University of California, Berkeley

Massive gene amplification on a recently formedDrosophila Y chromosome

Widespread loss of genes on the Y chromosome is considered to be a hallmark of sex chromosome differentiation. Here we show that the initial stages of Y evolution are driven by massive amplification of distinct classes of genes. The neo-Y chromosome of Drosophila miranda initially contained about 2700 protein-coding genes, but has gained over 3200 genes since its formation about 1.5 MY ago, primarily by amplification of protein-coding genes ancestrally present on this chromosome. We show that distinct evolutionary processes may account for this drastic increase in gene number on the Y. Testis-specific and dosage sensitive genes appear to have amplified on the Y in order to increase male fitness. A distinct class of meiosis-related multi-copy Y genes is co-amplified on the X, and their expansion is probably driven by conflicts over segregation. Co-amplified X/Y genes are highly expressed in the testis, are enriched for meiosis and RNAi functions, and are frequently targeted by short RNAs. This suggests that their amplification is driven by X vs. Y antagonism over increased transmission, where sex chromosome drive suppression is probably mediated by sequence homology between the suppressor and distorter, through RNAi mechanisms. Thus, our analysis suggests that newly emerged sex chromosomes are a battleground for sexual and meiotic conflict.

Biography

Doris Bachtrog obtained her PhD in 2002, working with Professor Charlesworth at The University of Edinburgh. She was a postdoctoral scholar at Cornell University, and an Assistant Professor at UCSD and is now a Professor at UC Berkeley. She received a Sloan Research fellowship, and a Packard Fellowship. Her work centres on a broad range of topics in functional evolutionary genomics, with a particular focus on sex chromosome biology.

44 Speakers and abstracts 18 LAURA ROSS The University of Edinburgh

Genomic exclusion; males that lose their fathers genes

Under Mendelian inheritance, individuals receive one set of chromosomes from each of their parents, and transmit one set of these chromosomes at random to their offspring. Yet, in thousands of animals Mendel’s laws are broken and the transmission of maternal and paternal alleles becomes unequal. Why such non-Mendelian reproductive systems have evolved repeatedly across the tree of life remains unclear. My lab studies a variety of insect species to understand why, when and how the transmission of genes from one generation to the next deviate from Mendel’s laws. We mainly focus on species with Paternal Genome Elimination: Males transmit only those chromosomes they inherited from their mother to their offspring, while paternal chromosomes are excluded from sperm through meiotic drive. Here I present an overview of our work on the evolution of this unusual reproductive strategy and the mechanisms responsible for the parent-of-origin specific behaviour and loss of genes.

Biography

Laura Ross is a Royal Society Dorothy Hodgkin’s fellow and ERC starting grant holder at the Institute of Evolutionary Biology at The University of Edinburgh. Her research interests lie in evolutionary genetics, especially the question of how genes within organisms can come into conflict over gene transmission, and how these conflicts affect the way organisms reproduce. During her PhD at the University of Groningen and subsequent fellowships at the University of Massachusetts Amherst and Oxford University she studied insects with unusual reproductive strategies using a mix of comparative, theoretical, and empirical approaches. She has continued this work at The University of Edinburgh and her lab currently focuses on understanding the evolution and mechanism of genomic imprinting, sex determination and meiotic drive associated with these reproductive strategies.

Speakers and abstracts 45 19 STEPHEN WRIGHT University of Toronto, @stepheniwright

Y degenerate? Evolutionary genomics of Y chromosome degeneration in plants

Sex chromosomes represent one of the most striking examples of parallel evolutionary change, where many organisms show a recurrent pattern of gene degeneration and loss. However, the evolutionary forces driving this degenerative genome evolution remain poorly understood. Here, I present work from our group that has aimed to investigate the relative importance of distinct evolutionary processes that may contribute to degenerate genome evolution on the non-recombining Y chromosomes of the plant genus Rumex. Our analyses indicate that inefficient natural selection caused by recombination suppression in a large non- recombining chromosomal region is the predominant force contributing to Y chromosome degeneration. Our results also suggest that ancestral suppression of recombination across large fractions of the chromosome probably facilitated the formation of sex chromosomes.

Biography

Stephen Wright is Professor and Canada Research Chair in the Department of Ecology and Evolutionary Biology at the University of Toronto. Research in his group focuses on understanding the evolutionary forces structuring patterns of genomic diversity and genome evolution. Using plants as model systems, his research group has investigated the evolutionary causes and consequences of evolutionary transitions, including the evolution of self-fertilisation from outcrossing, loss of sexual reproduction, and the early stages of sex chromosome evolution.

46 Speakers and abstracts 20 PETER ELLIS University of Kent

Differential sperm motility mediates the sex ratio drive shaping mouse sex chromosome evolution

The mouse Y chromosome, and to a lesser extent the X chromosome, exhibit an extraordinary level of gene amplification selectively affecting genes expressed in postmeiotic spermatids. This is ascribed to a rampant and ongoing genomic conflict between the X and Y chromosomes over the control of offspring sex ratio. Previous work has shown that offspring sex ratio is modulated via competition between the X- and Y-linked transcriptional regulators Slx and Sly. These genes compete to regulate postmeiotic sex chromatin, and alterations in the balance of these genes’ activity (e.g. due to Y deletion, interspecific hybridisation or transgenic knockdown) lead to global shifts in X/Y transcription in spermatids. Competition between Slx and Sly has led to a runaway “arms race” between the chromosomes that drives massive amplification of competing X / Y gene complexes, and also of any autosomal loci embroiled in the conflict. The signs of this conflict have now been observed at the molecular genetic level in the form of opposing transcriptional regulatory consequences of Slx / Sly depletion; at the cellular biological level in the form of Slx / Sly competition for intracellular binding targets; at the organism level in the form of sex ratio skewing in Yq-deleted animals, transgenic Slx / Sly knockdown animals and various Y chromosome congenic males; at the ecological level in the form of population sex ratio skewing in introgression zones between subspecies with different copy numbers of the Y-borne amplicons; and at the evolutionary population genetic level in the form of correlated co-amplification of gene families and recurrent selective sweeps at X and Y-linked amplicons. However, as yet the physiological mechanism through which these genes regulate offspring sex ratio skewing remains completely unknown. Using IVF and sperm motility studies, we now show that sex ratio skewing in Yq deleted males is not due to any aspect of sperm/egg interaction, but is caused by a relatively greater motility of X-bearing sperm. Using a novel software package for sperm morphometric analysis, we show that the sperm morphological abnormalities associated with Yq deletion are more severe in Y-bearing sperm, providing a potential hydrodynamic basis for the altered motility. Collectively, this work therefore traces a direct link from an underlying genetic perturbation to a quantifiable difference between X- and Y-bearing sperm, and from this to the resulting skewed offspring sex ratio.

Speakers and abstracts 47 POSTER PITCHES

48 Speakers and abstracts GENOME STABILITY AND INSTABILITY

P02 RODRIGO PRACANA University of Oxford

Runaway GC evolution in gerbil genomes

Biases in mutation and recombination can cause the fixation of non-adaptive and maladaptive alleles in populations. For instance, deleterious mutations can spread in the population by processes such as GC-biased gene conversion or as a result of low recombination rates. The fat sand rat (Psammommys obesus, a species of gerbil) has recently been shown to carry a ~10 Mb region of the genome with approximately 80 genes that have an extremely high GC content. Several of the genes in the region, including the homeobox gene PDX1, have a high nonsynonymous GC-skew, despite considerable sequence conservation across other vertebrates. This nonsynonymous divergence suggests that this species has been affected by non-adaptive GC accumulation. Here, we measure substitution rates for AT-to-GC, GC-to-AT and GC-conservative mutations for orthologous genes of four gerbil species, which we compare to mouse and rat. We show that (1) all genes in the region have an elevated substitution rate for all mutational categories; (2) that the substitution rate for AT-to-GC mutations in the region is higher than for the other categories, consistent with the observed GC skew; and (3) that gerbils have other clusters of GC-skewed genes scattered across their genome. We discuss that the current model of GC-biased gene conversion cannot explain the increase in the substitution rate of all mutation types. We also highlight that GC content in rodents should be studied over short phylogenetic distances, given the fast evolution of karyotypes and of recombination hotspots in this clade.

Speakers and abstracts 49 P04 JULIET LUFT The University of Edinburgh

Loss-of-heterozygosity as a mechanism of oncogenic selection

Loss-of-heterozygosity (LOH) is a common occurrence in many types of cancer that reduces the genetic diversity of the tumour. By analysing patterns of preferential allelic retention during LOH across approximately 10,000 cancer samples from The Cancer Genome Atlas, we sought to identify genetic polymorphisms currently segregating in the human population that are preferentially selected for, or against during cancer development. Low quality variant data, experimental batch effects and cross-sample contamination were found to be major confounders in our analysis. To mitigate these we developed a data- cleaning pipeline, including a generally applicable contamination classifier for use with matched tumour:normal whole exome sequencing data. Using our pipeline, we filtered 8.8% of common variants and found evidence of contamination in 1,493 pairs of samples (13.3%). Our analysis of LOH revealed highly significant selection of predicted damaging germline variants (pDGVs) in recessive tumour suppressor genes. Analysis of individual genes found this signal to be dominated by mutations in ATM, BRCA1, BRCA2 and SDHB. Furthermore - we identified 25 protein interaction pathways with significant preferential retention of pDGVs, 14 of which are involved in DNA double-strand break repair. Our results identified damaging germline variants in genes and pathways associated with DNA damage repair as the most common targets of preferential LOH. This may indicate that heterozygous mutations in these genes reduce the efficacy of DNA double-strand break repair, increasing the likelihood of second hit at the locus triggering an oncogenic mutator phenotype.

50 Speakers and abstracts GENETICS AND SELECTION OF QUANTITATIVE TRAITS

P06 BAILEY HARRINGTON MRC Human Genetics Unit, The University of Edinburgh

Dissecting out the bones of height

Since the advent of GWAS, hundreds of genetic variants that are significantly associated with morphometric traits, such as height, have been found, but the mechanisms through which they affect morphology remain elusive. One avenue of enquiry that may lead to a better understanding of relative body proportions and bone morphology, as well as the components of growth, is breaking height into its constituent parts, such as the long bones of the leg, and studying them. The major long bones of the leg are the largest contributors to height in humans, and, while their length correlates with height, significant variation also exists within them, independent of variation in height. The same trends are seen in the major long bones of the arms. Genetic variants acting on the length aspect of bone morphology may have larger relative effect sizes on long bone length than they do on height, so focusing on these more immediate phenotypes should result in greater power to uncover single nucleotide polymorphism (SNP)-trait associations. Bone lengths have previously seen limited study in humans due to the difficulty of measuring them accurately in living participants, and the lack of cohorts with both long bone lengths and genetic data available. The major long bone lengths in both the arms (humerus, radius, ulna) and legs (femur, tibia, fibula) were obtained from dual x-ray absorptiometry (DXA) whole-body scans in the UK Biobank cohort (N=15,000), using image-analysis software that was developed in-house. After initial quality control on ethnicity and phenotypic vs genotypic sex, 12,000 images remained to be analysed in the GWAS. To allow for replication, the 12,000 were split into independent discovery and replication cohorts (N = 8,000 and 4,000, respectively). Genotyping was performed by the UK Biobank using the UK Biobank Axiom® Array (820,967 markers), and data were imputed to the Haplotype Reference Consortium panel (24.9 million SNPs after imputation and quality control). Bone length SNP heritability estimates for these traits in the discovery cohort range from 37% for radius length, to 54% for femur length, after including height, age, sex, genotyping array, and the first 10 genetic principal components fitted as standard fixed effects, and a genomic relationship matrix. These same covariates were included in the model for the GWAS. Prior to running the associations, individuals with a z-score for their raw phenotype values of more than 5, and a z-score of more than 3 for the phenotype residuals after fitting the linear-mixed model, were removed from each analysis.

Speakers and abstracts 51 Six variants reached genome-wide significance for individual long bone lengths in the discovery cohort; of these, none have been previously reported for associations with either long bone length, or height. One of these variants, associated with femur length and located on chromosome 4, has an r 2 value of 0.81 with a height-associated variant. These results show that there are variants affecting long bone length that are not associated with height, and therefore affect proportional morphology in humans.

THE GERMLINE, SEX DETERMINATION & SEX CHROMOSOMES

P08 KAMIL JARON The University of Edinburgh

Genomic consequences of the loss of recombination

Recombination is a key driver of genome evolution as shown by the dramatic differences between non-recombining sex chromosomes and other supergenes versus recombining parts of genomes. Specifically, predicted consequences of the loss of recombination include a reduced effectiveness of selection, increased accumulation of transposable elements, gene loss and structural rearrangements as well as effects on heterozygosity. However, while these consequences are empirically well documented for sex chromosomes, it remains largely unknown if they extend to the non-recombining genomes of asexual organisms. We systematically characterised genomic features in the genomes of 31 asexual animals representing at least 23 independent transitions to asexuality, and compared them to close or direct sexual relatives where possible. In contrast to sex chromosomes, asexual genomes are characterised by a low transposable element load and little structural rearrangement. Other studied genome features were either not affected at all or not in the same direction across all or even a majority of transitions to asexuality, suggesting that genome characteristics were lineage specific rather than caused by asexuality. Overall, despite the importance of recombination rate variation for understanding the evolution of sexual animal genomes, the genome-wide absence of recombination does not appear to have the dramatic effects, which are expected from empirical studies of sex chromosomes or classical theoretical models. The reasons for this are probably a combination of lineage-specific patterns, impact of the origin of asexuality, and a survivor bias of asexual lineages.

52 Speakers and abstracts P10 ELENA BERNABEU GOMEZ The Roslin Institute, The University of Edinburgh

Genomics of sexual differences

Despite males and females sharing nearly identical genomes, there are differences between the sexes in complex traits and in the risk, incidence, and prevalence of a wide array of diseases. Gene by sex interactions (GxS) are thought to underlie some of this sexual dimorphism. However, the extent and basis of these interactions are poorly understood. Here we provide insights into both the scope and mechanism of GxS across the genome of circa 450,000 individuals of European ancestry and 530 complex traits in UK Biobank. We report sex differences in heritability for 71 traits, through genetic correlations for 69 traits, and in associations of genetic variants located in the autosome and X-chromosome for 103 traits. We also show that sex specific polygenic risk scores can lead to improved phenotypic prediction. Finally, we highlight how sex agnostic analyses could be missing loci of interest. This study marks the largest examination of the genetics of sexual dimorphism to date. Our findings parallel previous reports, proving the widespread presence of sexual genetic heterogeneity across complex traits, though be it of generally modest magnitude. This warrants the need for future studies to consider sex-stratifying analyses, in order to both shed light into possible sex-specific molecular mechanisms as well as improve prediction of high-level phenotypes.

Speakers and abstracts 53 HUMAN GENETIC VARIATION: FROM MOLECULES TO POPULATIONS

P12 YIZHOU YU Imperial College London

The cross-species transmission and infection dynamics of retroviruses

Non-human animals are increasingly recognised as a major reservoir for viruses, including rabies, Ebola, and MERS. Here we investigated past incidents of cross-species transmission involving the endogenous retrovirus HERV.I as well as the presence of long tandem repeat (LTR) retrotransposons to uncover potential mechanisms of cross-species of transmission. We first employed the Basic Local Alignment Search Tool (BLAST) to screen for the presence of LTR retrotransposons in all fully sequenced genomes of mammals and bird, using the reverse transcriptase motif of gypsy open reading frame. While previous studies suggest that there are at most one copy, this screen revealed 8 species that potentially contains large number of LTR retrotransposons, indicating potential mechanisms of viral replication and flaws of genome stability. Additionally, performing BLAST through fully sequenced vertebrates using the HERV.I amino acid sequence for conserved motifs of the lemon shark genome, we identified several infected species. Construction of a parsimonious evolutionary lineage tree across these species proposes that there could be cross-taxa transmission across birds and mammals. In conclusion, the investigation of the distribution of retroelements reveals different host- specific infection dynamics. Current research investigates potential mechanisms of cross- species infection and genomic adaptations post-infection.

54 Speakers and abstracts P14 NEIL CLARK The University of Edinburgh

Trait-causal effects of altered Vitamin D Receptor binding revealed by Epistatic and Northerliness interactions with Allele Specific Binding

Genetic variation that alters binding affinity of sequence-specific transcription factors (TFs) is likely to be a common causal mechanism explaining genetic and complex trait variation. Identifying such variants would elucidate the causal pathways from genome to trait and would inform a rational approach to therapeutic intervention. Sufficiently powered GWAS are able to identify causal associations between genetic and trait variation. However, GWAS are mechanism-agnostic and causal links between individual variants and trait are inevitably convolved with variants lying in linkage disequilibrium. Mendelian Randomisation (MR), by examining the trait variation explained by the genomic component of the exposure variation, can potentially distinguish causal effects from environmental confounding factors. The reliability of such causal estimates depends crucially on the ability to identify and correct for pleiotropic bias deriving from the existence of unobserved causal pathways between the instrumental variant and trait which do not include the exposure. We first identified 305 variants as reliable instrumental variants for the occupancy of the Vitamin D Receptor (VDR) in at least one of 16 genotyped lymphoblastoid cell lines, determined by ChIP-exo (PMID: 28335003). Based on summary statistics for all 776 traits from UK Biobank (UKBB) we then calculated the ratio-estimate of the causal effect of VDR binding at each locus in each trait and identified 847 significant causal effects on trait under the strong assumption of no pleiotropic bias. In a second step, we distinguished true causal effects from pleiotropy by applying a mechanistic model in which the causal effect size is modulated by the bioavailability of Vitamin D, estimated using either (epistatic) genetic or environmental variables. The genomic component of the serum levels of the Vitamin D precursor, 25-hydroxyvitamin D, was estimated based on trans-acting variants used previously as MR instrumental variants. Separately, we used the latitude of UK Biobank participants as a proxy for skin ultraviolet B exposure, and thus vitamin D bioavailability. We identified 78 binding variant-trait pairs for which there is a significant epistatic interaction between the VDR binding variant effect and the genomic based bioavailability of Vitamin D using a linear model trained on individual-level UK Biobank GWAS data. For the environmental (latitude) analysis, 33 binding variant-trait pairs were identified, four of which replicate pairs identified by the genetic epistasis model. Effect sizes of UKBB participants’ latitude at entry significantly exceed those of their birth latitude indicating that elevating vitamin D levels by supplementation or sun exposure has a therapeutic effect on some traits with a size that is modulated by VDR binding variant genotypes. Our results demonstrate that Mendelian Randomisation models are sufficiently powered to identify trait causal effects of altered TF-DNA occupancy that generalise to other Tfs.

Speakers and abstracts 55 EPIGENETICS AND NON-CODING RNA

P16 CATHERINE NAUGHTON The University of Edinburgh

Centromeres have a transcription dependent disrupted chromatin structure

Centromeres are highly specialised genomic loci that function to maintain genome stability. They are identified cytogenetically as the primary constriction of metaphase chromosomes where they serve as the assembly site for the kinetochore, a multiprotein structure that forms attachments to the microtubules of the mitotic and meiotic spindles. Whilst kinetochore architecture and microtubule interactions have been described in detail the structure and organisation of chromatin forming the centromere has remained obscure. This is due to most eukaryotic centromeres being formed on repetitive DNA sequences, for example, human centromeres are composed of tandem arrays of α-satellite DNA, thus hampering assays that necessitate mapping using a unique DNA sequencing. In contrast to canonical centromeres, neocentromeres are formed on unique DNA sequences, a feature we have exploited in this study to comprehensively describe the interphase chromatin structure of a functioning human centromere. Neocentromeres arise in rare instances where chromosomes silence their endogenous centromere and a new centromere forms at an ectopic location within the euchromatic chromosome arms. We have used two human lymphoblastoid cell lines, called Neo3 or Neo6, where a neocentromere has formed on HSA3 or HSA6, respectively. Cytogenetic analysis combined with DNA FISH showed a shift in primary constriction location to a region of unique DNA sequence, whilst the precise location of the core neocentromeric domains were mapped using antibodies against CENP-A and CENP-C to a ~100kb region of 3q24 (Neo3) and ~90kb region of 6p22.2 (Neo6). To genetically separate the normal HSA3 and HSA3 carrying a neocentromere to enable independent analysis of the 3q24 region we generated human hamster hybrid cell lines through fusion of the Neo3 cells with CHO cells. These cell lines provide a unique hybrid model to independently assay the chromatin structure of the normal 3q24 locus versus when it functions as a centromere. Centromeric transcription has been suggested to be necessary for function however due to the repetitive nature of human centromeres, clear evidence is lacking. RNA-seq was used to investigate changes in the transcriptional landscape across the 3q24 locus associated with the formation of an active centromere. This core centromere domain is particularly gene poor, with only 2 active genes (ZIC1 and ZIC4) that upon centromerization lose RNA pol II localisation and become transcriptionally silenced. Our analysis also shows that the core neocentromere is flanked by a 4.5MB transcriptionally repressed domain, delimited by H3K27me3 and H3K9me3. Strikingly, large-scale chromatin structure analysis, analysed by FISH, revealed that neocentromere formation decompacts the core chromatin structure, and is enriched in negatively supercoiled DNA and disrupted chromatin fibres. In contrast to repetitive pericentromeric chromatin found at endogenous centromeres the flanking pericentromeric regions surrounding the neocentromere core do not show an altered structure. Significantly, disrupted chromatin at the centromere core was transcription

56 Speakers and abstracts dependent, and RNase sensitive. However, despite exhaustive methods, stable transcripts were not found emanating from the centromeric region. We therefore suggest that sparse transcription, at a level too low to be detected, is responsible for generating a disrupted chromatin structure landscape.

P18 COLUM WALSH University of Ulster

UHRF1 is required to suppress viral mimicry through both DNA methylation- dependent and -independent mechanisms

Loss of histone 3 lysine 9 trimethylation (H3K9me3) or of DNA methylation have been shown to induce a state of “viral mimicry” involving upregulation of endogenous retroviruses (ERV) and a subsequent innate immune response. This approach may be useful in combination with immune checkpoint cancer therapies but little is currently known about cellular control of ERV suppression. The UHRF1 protein can interact with H3K9me3 and the maintenance methylation protein DNMT1 and is known to play an important role in epigenetic control in the cell. To address its possible role in ERV regulation we first established stable knockdown cell lines in normal lung fibroblasts. While these showed loss of DNA methylation genome- wide, the transcriptional response was dominated by activation of innate immune signalling, consistent with viral mimicry. We confirmed using mechanistic approaches that activation of interferons and interferon-stimulated genes involved in double-stranded RNA detection was crucial to the response. ERVs were demethylated and transcriptionally activated in stable knockdown cells. As well as these normal cell lines, ERV activation and interferon response also occurred following transient loss of UHRF1 in both melanoma and colon cancer cell lines. Restoring UHRF1 in either transient- or stable knockdown systems abrogated ERV reactivation and interferon response, but without substantial restoration of DNA methylation. Rescued cell lines were hypersensitive to loss of SETDB1 and KAP1, implicating H3K9me3 as crucial to UHRF1-mediated repression in the absence of DNA methylation. Confirming this, cells rescued with UHRF1-containing point mutations affecting H3K9me3 binding could not shut down ERV transcription or the innate immune response. Finally, by introducing similar point mutations in the mouse homologue we could show that this pathway is conserved in mice. Our results clearly implicate UHRF1 as a key regulator of ERV suppression, thus suggesting a promising new approach to inducing viral mimicry in cancer immunotherapy.

Speakers and abstracts 57 HUMAN GENETIC VARIATION: FROM MOLECULES TO POPULATIONS

Chair: Wendy Bickmore, The University of Edinburgh, [email protected] Co-chair: Peter Joshi

Sponsor:

The first chromosome structural abnormality associated with a human phenotype was discovered in Edinburgh 60 years ago by Pat Jacobs and John Strong working in the Medical Research Council Group for Research on the General Effects of Radiation, - the forerunner to the current MRC Human Genetics Unit. Molecular insights into human genome variation did not come until the pioneering work of Ed Southern to develop Southern blotting and then DNA microarray technology at the MRC Mammalian Genome Unit, King’s Buildings, The University of Edinburgh. Amazing technological advances have now allowed for the sequencing of close to 1 million human genomes, and genetic association studies have been performed on many more. The scale of these data is daunting and we are still a long way from being able to predict phenotype from genotype in most instances. Speakers in this session will discuss progress in the analysis of human genetic variation at scale (Gil McVean) and detecting signals of genetic association with disease and how these might be used in disease prediction (Evi Theodoratou). Anna Gloyn will show the power of genetics to provide new insights into normal physiology and disease aetiology in the context of diabetes and the transformative power of genetics in diagnosing and treating monogenic disease. Our ability to predict the pathogenicity of mutations in human genes is still remarkably limited and Joe Marsh and M. Madan Babu will both discuss how an understanding of protein structure and function is key to progressing this.

Biography Wendy Bickmore is Director of the MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine at the University of Edinburgh. She is fascinated by the structure and organisation of chromatin in the nucleus. Her group showed that different human chromosomes have preferred positions in the nucleus, related to their gene content, and addressed how genes are organised and packaged in the nucleus and how they move in the cell cycle and during development. She demonstrated that the polycomb repressive complex functions by compacting higher-order chromatin at target loci. Current research in Wendy Bickmore’s laboratory focuses on how the spatial organisation of the nucleus influences genome function in development and disease, including how enhancers communicate with their target gene promoters. Wendy is an EMBO member, a Fellow of the Royal Society and of the Academy of Medical Sciences. She was the President of the Genetics Society of Great Britain from 2015 to 2018. She is an editor on many journals including PLoS Genetics and Cell. 58 Speakers and abstracts 21 GIL MCVEAN University of Oxford

The ancestry of everyone

The explosion of genome sequencing is transforming our understanding of how genetic variants arise, the evolutionary forces that act on them and their consequences for health and disease. Genome data can also reveal, with unprecedented detail, the way in which individuals and groups of people share common ancestry. I will describe how we are using genomic data to reconstruct genealogical histories from population-scale data sets, potentially consisting of millions of individuals, and how we can use this to learn about the forces that have shaped our species.

Biography

Gil McVean is Professor of Statistical Genetics at the University of Oxford and Director of Oxford’s Big Data Institute within the Li Ka Shing Centre for Health Information and Discovery (www.bdi.ox.ac.uk). His research focuses on understanding the molecular and evolutionary processes that shape genetic variation in populations and the relationship between genetic variation and phenotype. He has played a leading role in the HapMap and 1000 Genomes Projects, is co-founder of Genomics plc (www.genomicsplc.com), and currently works on organisms from HIV to malaria.

Speakers and abstracts 59 22 PETER VISSCHER University of Queensland

Estimation of additive genetic variance from whole genome sequence data

Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data, as shown by RA Fisher a century ago. Such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. We will show results on the estimation and partitioning of additive genetic variance from whole-genome sequence (WGS) data on ~22,000 unrelated individuals of European ancestry for human height, using approximately 50 million DNA variants. The estimated heritability was ~0.8, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein altering variants, consistent with negative selection thereon. Cumulatively, variants in the MAF range of 0.0001 to 0.1 explained ~50% of additive variance for height. The results imply that the still missing heritability of height, and possibly for other complex traits, is accounted for by rare variants, in particular those in regions of low LD.

Biography

Peter Visscher is a Quantitative Geneticist, trained in Edinburgh from 1987-1991 by Bill Hill, Robin Thompson and others. Bill has remained a mentor ever since. Visscher’s research interests are to better understand genetic variation for complex traits in human populations, including quantitative traits and common disease. The first half of his research career to date was predominantly in livestock genetics (animal breeding is applied quantitative genetics), whereas the last 15 years he has contributed to methods, software and applications to quantify and dissect the genetic architecture of human traits.

60 Speakers and abstracts 23 EVROPI THEODORATOU The University of Edinburgh

Phenome Wide Mendelian Randomisation Study of genetically determined biomarkers using large scale data

The recent progress in genome-wide association studies (GWASs) discovered an increasing number of trait-associated risk variants, with the overarching goal to better understand the biology of disease. Beyond that, GWASs have also been successfully implemented for the creation of polygenic risk scores, for assisting in risk prediction and in Instrumental Variable methods. In my talk, I will present one example of the application of GWAS findings. I will show the findings of our most recent Phenome Wide Association analysis together with a Bayesian analysis of tree structured phenotypic models to examine disease outcomes related to genetically determined serum urate levels in 339,256 UK Biobank participants. Mendelian Randomisation analyses were also employed to replicate significant findings using various GWAS consortia data.

Biography

Evropi Theodoratou is Professor of Cancer Epidemiology and Global Health at The University of Edinburgh. She graduated from the School of Biology at the Aristotle University of Thessaloniki, Greece in 2003. She then completed an MSc in Medical Biology (2004) (Linkoping University, Sweden) and a PhD in Cancer Epidemiology (2009) (The University of Edinburgh, UK). Her research includes work on risk factors and risk models of colorectal cancer risk and prognosis; analysis of large scale genetic data and other -omics data for various outcomes; and meta-research and evaluation of published evidence using state-of-the-art methods.

Speakers and abstracts 61 24 ANNA GLOYN University of Oxford, @annagloyn

Unravelling causal mechanisms for diabetes

Over the past decade, common variant genome wide association studies (GWAS) have emerged as the dominant strategy for human genetics discovery. In terms of locus detection this approach has proved extremely effective with hundreds of loci identified which influence type 2 diabetes risk and glycaemic traits. Each of these loci has the potential to reveal novel insights into biology, and to underpin future translational advances. However, several of the characteristics of these loci – in particular the modest effect size of most common variant risk-loci, and their predilection for regulatory sequence – have served as an obstacle to efforts to connect each of these signals to the gene through which the effect on diabetes risk is mediated, the so called “effector transcript” and thereby access knowledge and functional approaches which are generally focused on transcript biology. Recent developments in high throughput genomics has provided strategies for data generation and integration which now offer highly-efficient approaches for connecting the risk-variants and haplotypes revealed by common variant GWAS studies, to their effector transcripts. Whilst genome-editing in human induced pluripotent stem cells coupled with improved differentiation protocols have opened up the possibility of studying disease alleles in isogenic backgrounds in inaccessible cell types. This presentation will focus on recent efforts to use a variety of these approaches to unlock the biology at GWAS signals for type 2 diabetes. Funding: Wellcome, MRC, European Union, NIDDK, NIH, NIHR

Biography

Anna Gloyn is currently Professor of Molecular Genetics & Metabolism and a Wellcome Senior Fellow based jointly at the Oxford Centre for Diabetes Endocrinology and Metabolism & the Wellcome Centre for Human Genetics at the University of Oxford. She is also the current lead for the Diabetes & Metabolism Theme of the NIHR Oxford Biomedical Research Centre. The over-arching aim of her research is to identify effective therapeutic targets for type 2 diabetes (T2D) treatment through mechanistic studies of proteins causally implicated in T2D risk through human genetics.

62 Speakers and abstracts 25 JOE MARSH MRC Human Genetics Unit, The University of Edinburgh

The role of protein complex assembly in dominant Mendelian genetics disorders

Most proteins can assemble into complexes, and protein complex subunits are highly enriched in human disease mutations. I will discuss our recent work trying to understand the role of protein complexes in human genetic disease. First, I will focus on the dominant-negative effect, whereby a heterozygous mutation can cause an almost complete loss of function by sequestering wild type protein subunits within a non-functional complex. We have used systematic computational analyses to investigate the molecular determinants of dominant-negative mutations in comparison to other pathogenic mutations. We find that both protein quaternary structure and assembly patterns strongly influence the likelihood of observing a dominant-negative effect for a given gene. We also demonstrate that currently available phenotype predictors significantly underperform on dominant-negative mutations, likely due to the fact that they tend to be much milder at a protein structural level than other pathogenic mutations. Second, I will address the phenomenon of haploinsufficiency, in which the partial loss of function from a heterozygous mutation can cause disease. We show how this is often due to imbalances in protein complex stoichiometry, and that relative subunit expression levels, assembly pathways and transcriptional noise are all important properties for understanding assembly-mediated haploinsufficiency.

Biography

Joe Marsh obtained his PhD in 2010 from the University of Toronto, followed by postdoctoral work at the MRC Laboratory of and the EMBL European Bioinformatics Institute in Cambridge. He started his research group at the MRC Human Genetics Unit, The University of Edinburgh in 2014, where he is now a Reader, supported by an MRC Career Development Award and a Lister Institute Research Prize. Research in his group employs a variety of mostly computational approaches to study how and why proteins assemble into complexes, and what the implications of this are for human genetic disease.

Speakers and abstracts 63 26 ALASDAIR MACKENZIE University of Aberdeen, [email protected]

CRISPR and UK Biobank analysis of an enhancer that modulates fat and alcohol intake and mood suggests a role in male alcohol abuse and anxiety

Alcohol abuse accounts for 7.9 % of male deaths annually through cancer and liver disease and is frequently co-morbid with anxiety. In addition, obesity, that affects 25% of Western populations is related to increasing consumption of dietary fats and is linked to type-II diabetes and cancer. Thus, understanding the genomic influences governing consumptive behaviour and mood will be essential to understanding susceptibility to alcohol abuse, obesity and anxiety that greatly impact human health world-wide. Because of its role in modulating fat intake, ethanol intake and mood we focused on the expression of the galanin neuropeptide (encoded by the GAL gene) in the hypothalamus and amygdala where it controls consumptive behaviour and mood respectively. Because the GAL gene lacked polymorphisms in its coding region, we explored regions of DNA flanking the GAL gene and discovered a polymorphic DNA region that lay 42kb from the GAL coding region and which had been conserved for 400 million years. Using transgenic reporter mice, we established this sequence as an enhancer (GAL5.1) that drove reporter gene expression within GAL expressing cells of the hypothalamus and amygdala. Further reporter gene analysis in hypothalamic cells found that the dominant GG haplotype of GAL5.1 acted as a stronger enhancer than did the minor CA haplotype. Analysis of these polymorphisms through the human UK Biobank cohort demonstrated a positive association between the GG allele and alcohol intake and anxiety in men (n=115,865; p=0.0008). This prompted the use of CRISPR genome editing to disrupt the GAL5.1 enhancer in mice (GAL5.1KO). RNA analysis identified a major decrease in GAL mRNA expression in all regions of GAL5.1KO brains examined but not in four other flanking genes. Behavioural analysis of these mice demonstrated a decrease in high-fat diet intake and fat-mass accumulation. Critically, we also observed a major decrease in alcohol intake compared to wild-type litter-mates and a decrease in anxiety-like behaviour, particularly in male mice, mirroring our human data. Using a combination of ChIP analysis, expression vector/reporter vector and agonist treatments we were able to identify that the GG allele of GAL5.1 bound the EGR1 transcription factor and responded to PKCε signalling. However, these interactions were blunted in the CA allele. These observations provide a possible target for the development of personalised anxiolytics and that could also reduce fat and alcohol intake. Intriguingly, major changes in methylation of the GAL5.1 enhancer were observed in the offspring of mothers exposed to high fat diet during pregnancy and nursing suggesting the involvement of epigenetic mechanisms in modulating GAL5.1 activity.

64 Speakers and abstracts These findings highlight an important role for the polymorphic GAL5.1 enhancer in harmful consumptive behaviour and anxiety that may be further modulated by epigenetics. Our study demonstrates that combining comparative genomics and CRISPR genome editing with in vivo behavioural analysis and human association analysis represent an additional powerful tool for understanding the genomic mechanisms governing consumptive behaviour and mental health and suggests novel directions for the development of personalised treatments.

Biography

Alasdair MacKenzie worked as a postdoc at the MRC HGU under Bob Hill and then with John Quinn at the Vet School where he gained his passion for understanding the tissue specific regulation of neuropeptides in the hypothalamus and amygdala. Since coming to Aberdeen, Alasdair has been funded by the MRC, Wellcome Trust and BBSRC where his group has used transgenic and CRISPR technologies in mice to characterise enhancer regions modulating anxiety, alcohol abuse and fat intake. Alasdair also seeks to determine the effects of epigenetics and polymorphisms on the mechanisms controlling these enhancers with a view to developing personalised therapeutics.

Speakers and abstracts 65 27 PATRICIA HEYN The University of Edinburgh

Gain-of-function DNMT3A mutations cause microcephalic dwarfism and hypermethylation of Polycomb-regulated regions

Chromatin regulators have well established roles in cancer and in addition significant relevance to human growth disorders, with the dysregulation of the methyltransferases DNMT3A, NSD1, SETD2 and EZH2 established to cause human overgrowth. We have recently identifiedde novo gain-of-function mutations in DNMT3A, encoding the DNA methyltransferase DNMT3A, as a cause of microcephalic dwarfism, a hypocellular disorder of extreme global growth failure. As such DNMT3A dysregulation can cause both, overgrowth and growth failure. Functionally, substitutions in the DNMT3A-PWWP domain abrogate binding to the histone modifications H3K36me2/3 and result in altered DNA methylation in the cells of microcephalic dwarfism patients. DNA methylation canyons/valleys, hypomethylated domains encompassing developmental genes, consequently become methylated with concomitant loss of the polycomb mark H3K27me3. Such de novo DNA methylation occurs during differentiation of pluripotent cells in vitro and is also evident in Dnmt3aW326R/+ dwarf mice. The DNMT3A PWWP domain therefore normally acts to prevent DNA methylation of polycomb-marked domains, implicating the interplay between DNA methylation and polycomb at key developmental regulators as a determinant of organism size in mammals. Our ongoing work investigates how the loss of H3K36me2/3 binding leads to increased DNA methylation and how the mutually exclusive histone marks H3K36me2/3 and H3K27me3 may regulate DNA methylation and organism growth.

Biography

Patricia Heyn is a Postdoctoral Researcher at the MRC Human Genetics Unit, The University of Edinburgh in the group of Professor Andrew Jackson. She undertook her PhD studies at the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden, Germany in the laboratory of Professor Karla Neugebauer on the transcriptional control of early zebrafish development. Her current work focuses on how epigenetic modifiers regulate organism growth, specifically on the interplay between DNA methylation and polycomb as a possible determinant of organism size in mammals.

66 Speakers and abstracts 28 M. MADAN BABU University of Cambridge

Understanding variation in GPCR drug targets

G-protein-coupled receptors (GPCRs) participate in diverse physiological processes, ranging from sensory responses such as vision, taste and smell to those regulating behaviour, the immune and the cardiac system among others. The ~800 human GPCRs sense diverse signaling molecules such as hormones and neurotransmitters to allosterically activate the associated G proteins, which in turn regulate intracellular signaling. In this manner, GPCRs regulate virtually every aspect of human physiology. Not surprisingly, GPCRs are the targets of over one-third of all prescribed human drugs. In this presentation, I will first discuss how one could exploit data on structures of GPCRs and that of completely sequenced genomes of individuals from the human population to gain insights into natural receptor variation, which can result in variable drug response. I will then present ongoing work wherein by studying transcriptome data from over 30 different tissues in humans, one could begin to understand how alternative splicing can create diversity in GPCR signaling components, which could contribute to tissue-specific differences in receptor signaling and drug response. I will conclude by discussing how understanding variation at these different spatio-temporal dimensions, i.e. among different individuals of a species, and between tissues of a species, can provide a rich source of new hypotheses with implications for personalised medicine and understanding basic receptor biology.

Biography

M. Madan Babu heads the Regulatory Genomics and Systems Biology group at the MRC Laboratory of Molecular Biology, Cambridge, UK. His group investigates how regulation is achieved at multiple levels of complexity in cellular systems and how this influences evolution of organisms and their genome. He obtained his undergraduate degree in 2001 from the Centre for Biotechnology, Anna University, India with fellowships from the Indian Institute of Science and the Indian Academy of Sciences. He then received an LMB-Cambridge International Fellowship and a Trinity College Research Scholarship to carry out his doctoral research at the Medical Research Council’s Laboratory of Molecular Biology (MRC-LMB) in Cambridge, UK. During this time, he carried out studies on the structure, evolution and dynamics of transcriptional regulatory networks by employing a combination of computational approaches. For this work, he was awarded the Max Perutz Prize for outstanding PhD research in 2004.

Speakers and abstracts 67 He subsequently received an NIH International Visiting Fellowship to work at the NCBI, USA where he defined general principles of the organisation of biological networks. He returned to the UK to become an independent Group Leader at the MRC Laboratory of Molecular Biology in 2006, was elected as a Schlumberger Research Fellow at Darwin College, Cambridge in 2007 and was appointed as a Programme Leader in 2010. Madan is an Executive Editor at Nucleic Acids Research, an Associate Editor at Molecular BioSystems and a member of F1000. He was a Director of Studies at Trinity College, Cambridge (2011-2015), and a member of the HUGO council. He has published over 125 papers, reviews and chapters (h-index: 50) and has received several awards including the Biochemical Society’s Early Career Award (2009), EMBO Young Investigator Award (2010), British Genetics Society’s Balfour Award (2011), Royal Society of Chemistry’s Molecular BioSystems Award (2011), the Biochemical Society’s Colworth Medal (2014), the Lister Prize from the Lister Institute of Preventive Medicine and the Protein Science Young Investigator Award (2014) from the American Protein Society. Most recently, Madan received the Francis Crick Medal and Lecture (2015) from the Royal Society and was elected as a member of EMBO (2016) and Fellow of the Royal Society of Chemistry (2017).

68 Speakers and abstracts WELCOME EPIGENETICS AND NON-CODING RNA

Chair: Robin Allshire, The University of Edinburgh Co-chair: Liz Bayne

Sponsor:

The term Epigenetics was coined by Conrad Waddington, Professor of Animal Genetics, The University of Edinburgh 1947-1975. In his usage ‘epigenetics’ referred to the mechanisms involved in the acquisition of stable cell fates during development. Since then the term ‘epigenetics’ has been repurposed and now has distinct meanings in different contexts. We will learn how DNA base composition and DNA methylation is interpreted to ensure normal cell fate and development from Adrian Bird. Adrian carried out his PhD research on rDNA at the MRC Epigenetics Research Group, Edinburgh that was founded by Conrad Waddington. Adrian has been a stalwart of Edinburgh molecular genetics and cell biology for many years. Enigmatic epigenetic regulation such as paramutation and RNA-mediated gene silencing became considerably more tangible following the discovery of small RNAs in plants by David Baulcombe. David performed his PhD research on plant messenger RNA at the Department of Botany, The University of Edinburgh; he will tell us about his intriguing findings concerning RNA-directed epigenetic regulation in tomatoes. As a post-doc Beth Sullivan investigated nuclear compartmentalisation with Wendy Bickmore at the MRC Human Genetics Unit, Edinburgh. Since then her research has provided insight into the epigenetic processes, including non-coding transcription, that influence the location of centromeres on human chromosomes. At the Institute of Animal Genetics in Edinburgh, Anne McLaren performed research on reproductive biology and early development and this environment spawned her interest in epigenetically regulated mechanisms such as imprinting and X inactivation. Anne Ferguson-Smith has carried similar interests forward and will tell us about her latest research on genomic imprinting and non-genetic inheritance.

Speakers and abstracts 69 29 KAT ARNEY @Kat_Arney

The Genetics Society’s podcast, Genetics Unzipped, is one year old. Presenter Kat Arney will share some highlights from the series so far, featuring interviews with some of the leading lights from the field as well as our centenary series exploring 100 Ideas in Genetics, along with some facts and figures about the show’s success so far.

Biography

Dr Kat Arney is an award-winning science writer, broadcaster and public speaker, and the founder and creative director of science communications consultancy First Create The Media. She presents The Genetics Society’s Genetics Unzipped podcast which is produced by First Create The Media’s talented audio team. She is the author of the popular science books Herding Hemingway’s Cats: Understanding How Our Genes Work, and How to Code a Human. Her next book will be published in June 2020, exploring the genetics and evolution of cancer.

70 Speakers and abstracts 30 ADRIAN BIRD The University of Edinburgh

DNA base composition controls cell fate via SALL4

Patterns of epigenomic modification change during development and are often disrupted in cancer, but the origin and molecular consequences of these shifts are incompletely understood. We are testing the hypothesis that these changes can be traced back to proteins that interpret the DNA sequence. Specifically, we proposed that proteins recognising short, frequent DNA sequence motifs recruit enzymes that deposit or remove these marks. A biochemical screen revealed candidate proteins recognising AT-rich sequence motifs, including SALL4, which is implicated in pluripotency and cancer. We found that the stem cell-specific factor SALL4 preferentially senses A/T content to bring about blanket inhibition of genes that promote neuronal differentiation. An A/T binding zinc finger cluster is essential to prevent inappropriate stem cell differentiation, whereas other zinc finger clusters are dispensable. We conclude that base composition is not a passive by-product of genome evolution, but is read as a signal to facilitate control of developmental gene expression programmes and hence cell fate.

Biography

Adrian Bird holds the Buchanan Chair of Genetics at The University of Edinburgh. He obtained his PhD at Edinburgh University and postdoctoral experience at the Universities of Yale and Zurich. He then joined the MRC Mammalian Genome Unit in Edinburgh and in 1987, moved to the Institute for Molecular Pathology in Vienna, returning to Edinburgh in 1990 where he became Director of the Centre for Cell biology. Awards include the Gairdner International Award and the Shaw Prize. His research focuses on the basic biology of DNA methylation and other epigenetic processes, including the molecular mechanisms underlying the neurological disorder Rett Syndrome.

Speakers and abstracts 71 31 SITO TORRES-GARCIA The University of Edinburgh

Stochastic epigenetic silencing by heterochromatin primes fungal resistance

Genes embedded in H3K9 methylation–dependent (H3K9me) heterochromatin are transcriptionally silenced. In fission yeast,Schizosaccharomyces pombe, H3K9me heterochromatin silencing can be transmitted through cell division provided the gene encoding the counteracting demethylase Epe1 is deleted. It is possible that under certain conditions wild-type cells might utilise heterochromatin heritability to form epimutations, phenotypes mediated by metastable silencing rather than changes in DNA. We have identified resistant heterochromatin-mediated epimutants that form in response to threshold levels of the external insult caffeine. ChIP-seq analyses of unstable resistant isolates revealed new distinct heterochromatin domains over several genes, which in some cases reduce the expression of genes known to confer resistance when deleted. Forced synthetic heterochromatin formation confirmed that resistance results from heterochromatin-mediated silencing at several loci. In some isolates, subsequent or co- occurring gene amplification events enhance resistance. Our analyses reveal that epigenetic processes allow wild-type fission yeast to adapt to non-favourable environments without altering their genotype. Heterochromatin- dependent epimutant formation provides a bet-hedging strategy that allows cells to remain wild-type but transiently adapt to external insults. Since unstable caffeine- resistant isolates show cross-resistance to the antifungal drug clotrimazole it is possible that related heterochromatin-dependent processes contribute to anti-fungal resistance in pathogenic fungi.

72 Speakers and abstracts 32 DAVID BAULCOMBE University of Cambridge, [email protected], @dcb40

The tomato epigenome

It is an oversimplification to think of epigenetics in plant and animal genomes as simply defence against the mutagenic effects of transposable elements. Epigenetics does silence these mobile DNAs and it does prevent their damaging influence on genes and chromosomes but there are also more diverse effects - both positive and negative - on the transposon and the host genome. To characterise the influence of an epigenome on a plant genome we are using tomato as a model system. Our approaches involve gene-edited mutants and genetic knock down in epigenetic pathways involving RNA-directed DNA methylation (NRPD1, NRPE1), other mechanisms (DDM1, CMT3, KYP) and wide cross hybrids of cultivated and wild tomato. These approaches lead to the conclusion that the epigenome is dynamic with features being gained and lost over time. These features may influence gene expression and, in some extreme instances, they are associated with non-Mendelian inheritance patterns that resemble paramutation in maize and other species.

Biography

David Baulcombe fell in love with Edinburgh during his PhD (1977) but he also fell in love with a southerner so, after postdocs in Montreal and Athens Georgia, he returned south to the Cambridge Plant Breeding Institute, the Sainsbury Laboratory in Norwich and, since 2007, Cambridge University as Regius Professor of Botany. David is an FRS and a member (foreign associate) of the US National Academy of Sciences. His awards include the 2006 Royal Medal, a 2008 Lasker Award and the Wolf Prize for Agriculture in 2010. He was knighted in June 2009.

Speakers and abstracts 73 33 BETH SULLIVAN Duke University

Alpha satellite variation, transcription, and centromere assembly: Finding function in the recesses of the human genome

Centromeres are essential to genome inheritance. Abnormal centromere function is associated with birth defects, infertility, and cancer. Human centromeric (CEN) chromatin typically forms on alpha satellite DNA, a 171bp monomeric sequence that is tandemly organised higher order repeats (HORs) that are reiterated across multiple megabases with little sequence variation. Using Homo sapiens chromosome 17 (HSA17) as a model, we showed that HOR size and sequence variation is present in ~35% of the population. High levels of variation are linked to architecturally flawed kinetochores, resulting in chromosome aneuploidy. The molecular basis for how genomic variation affects centromere assembly and maintenance is not clear. We have tested if alpha satellite arrays containing variant HORs are sub-optimally organised for proper kinetochore assembly and produce unstable transcripts that cannot interact appropriately with centromere proteins (CENPs). We have found that long-range alpha satellite organisation differs dramatically between alpha satellite arrays, due to the placement of wt and variant HORs. Moreover, we have identified the presence and activity of transposable elements within alpha satellite arrays that we propose drive alpha satellite transcription. Our work offers new structural and functional views of challenging repetitive DNA regions that have been historically absent from the human genome assembly.

Biography

Beth Sullivan is Associate Professor of Molecular Genetics and Microbiology at Duke University. She earned a BA in Biology, Chemistry, and Classics (Western Maryland College) and a PhD in Human Genetics (University of Maryland, Baltimore). Beth was a postdoc at the MRC Human Genetics Unit and the Salk Institute. Her research focuses on human centromeres, incorporating genomics, genetics, and chromosome engineering to study alpha satellite organisation and transcription. Her lab also studies experimentally-generated dicentric and artificial chromosomes. Beth teaches an undergraduate Epigenetics honors course. She is an Associate Editor for PLOS Genetics and Executive Editor of Chromosome Research.

Contact Beth at: Sullivan Lab @TeamCentromere Department of Molecular Genetics and Microbiology @DukeMGM Duke University Medical Center @DukeMedSchool

74 Speakers and abstracts 34 ANNA THAMM University of Oxford, [email protected]

MpFRH1 miRNA Controls the Patterning of Epidermal Structures in the Liverwort Marchantia Polymorpha

Epidermal patterning in plants is regulated by interactions between activators and inhibitors of cell differentiation. We previously demonstrated that rhizoid differentiation in Marchantia polymorpha is positively regulated by the ROOT HAIR DEFECTIVE SIX- LIKE1 (MpRSL1) basic Helix Loop Helix (bHLH) transcription factor which in turn is directly repressed by the FEW RHIZOIDS1 (MpFRH1) miRNA. Further investigation showed that MpFRH1 miRNA acts as an inhibitor in the rhizoid cell patterning mechanism by repressing the activity of the MpRSL1 activator. We generated loss of function (LOF) mutants lacking the MpFRH1 miRNA, which consequently overexpress MpRSL1. Larger clusters of rhizoid initial cells (RICs) develop on Mpfrh1LOF mutants than in wild type. Furthermore, the arrangement of RICs within a cluster is altered, and RICs become more cluttered in plants that lack MpFRH1 function compared to wildtype. Even bigger RIC clusters were observed in MpRSL1GOF lines. Taken together these data suggest that MpFRH1 miRNA acts as an inhibitor of RSL1- mediated differentiation during the patterning of rhizoids in the epidermis. Therefore, like in angiosperms, a mechanism comprising activators and inhibitors controls epidermal patterning in M. polymorpha. But unlike the mechanism in A. thaliana, where proteins act as inhibitors, the liverwort inhibitor is a miRNA. This indicates that different inhibitory mechanism of epidermal patterning evolved in the different land plant lineages.

Biography

Anna Thamm did her undergraduate degree at the Freie Universität in Berlin, followed by a master’s degree in cellular and molecular biology at the University of Potsdam. She then joined the Marie-Curie Actions PLANTORIGINS- Initial Training Network, working in Oxford (UK) and Oeiras (Portugal) to characterise genes that control developmental processes in early diverged land plants. She then returned to Oxford to complete a PhD investigating the genetic mechanisms controlling the development of epidermal structures based in the Department of Plant Sciences in Oxford, where she now continues her project as a postdoc.

Speakers and abstracts 75 35 ROSALIND JOHN Cardiff University

Genomic Imprinting: linking early life adversity to maternal behaviour and lifelong health

A critically important adaptation during the mammalian pregnancy is the induction of maternal care. Maternal care is vitally important for the survival of the newborn and the withdrawal of maternal care or poor quality maternal care can result in adverse behavioural and metabolic outcomes later in life. We identified a novel function for the epigenetically regulated imprinted genes in modulating the development of the endocrine lineages of the mouse placenta, with maternally expressed genes functioning antagonistically to paternally expressed genes. Placental hormones have been implicated in the induction of maternal care in rodents. These observations led us to hypothesise that imprinted genes expressed in the foetally-derived placenta modulate maternal care provision through regulating the production of placental hormones. We tested this hypothesis by developing experimental models in which wild type dams were exposed to embryos carrying genetic modifications of the maternally expressed Phlda2 gene. We discovered alterations in the brain and behaviour of these dams with “paternalisation” increasing interactions between the dam and her pups and “maternalisation” decreasing these interactions (Creeth et al., 2018). We similarly observed alterations in the behaviour of wild type dams carrying and caring for paternally expressed Peg3 mutant pups (McNamara et al., 2018) lending further support to our original hypothesis. Peg3 mutant offspring are already known to demonstrate both behavioural and metabolic abnormalities and we recently discovered that Phlda2 offspring are also behaviourally abnormal. Remarkably, both the transgenic and their non-transgenic littermates are behaviourally impacted by the adverse early life exposure, and in a sexually dimorphic manner. The same genes we are studying in mice are associated with low birth weight and prenatal depression in human pregnancies (Janssen et al., 2016a; Jensen et al., 2014). Together, these data demonstrate the inter-related, reiterative interactions between mothers and their offspring, and suggest an alternative to the foetal programming hypothesis with clear relevance to human health.

76 Speakers and abstracts Biography

Rosalind John studied Biochemistry at King’s College, London and gained a postgraduate degree in Human Genetics at Imperial College London followed by postdoctoral training at UCSF/Stanford (USA), The Wellcome Trust Institute, Cambridge University and the MRC CSC Institute. She joined Cardiff University’s School of Biosciences in late 2003 as Senior Lecturer and was promoted to Professor of Developmental Epigenetics and Head of the Biomedicine Division in 2015. Her research involves understanding the remarkable process of epigenetics whereby the genome is marked during development to initiate and maintain precise patterns of gene expression through mitosis, and for the life span of the individual. Her group use both genetically and environmentally modified experimental models (mouse) and human cohort studies to interrogate exposures during pregnancy that interact with the developing epigenome to alter phenotypic outcomes. Her expertise spans from molecular (DNA methylation) to studying the pregnant animal in its environment. Her work has wide implications for maternal mental health, infant wellbeing and their health later in life.

Speakers and abstracts 77 36 ANNE FERGUSON-SMITH University of Cambridge

Epigenetic modulation of repeat elements – implications for non-genetic inheritance

Genetic models of epigenetic inheritance can provide useful insights into the mechanisms, stability and heritability of modified states. Endogenous retroviruses (ERVs) are a class of LTR retrotransposons representing around one quarter of the repetitive elements in the murine genome. Most are epigenetically silenced. However, in two classic mouse models, Agouti viable yellow (Avy) and Axin fused (Axin(Fu)), a member of the IAP class of ERV has inserted in the vicinity of the agouti and axin genes respectively, and been variably DNA methylated between individuals; this is associated with transcriptional variability of the associated genes, and non-genetically conferred phenotypic variation which can be transmitted across generations. Such alleles are known as metastable epialleles. We have conducted a genome-wide systematic screen for metastable epialleles at ERVs using mouse strain-specific datasets that we generated as part of the BLUEPRINT reference epigenome project (EUFP7 BLUEPRINT grant HEALTH-F5-2011-282510). The properties, impact and heritability of these novel variably methylated IAPs provides useful insights into the impact of the epigenetic control of repetitive elements on the mammalian genome and transgenerational epigenetic inheritance. In particular, the variable methylation state is locus-specific within an individual and very few act as heterologous promoters for adjacent genes. Of interest, variably methylated IAPs are enriched for the methylation-sensitive DNA binding factor CTCF which may influence the establishment/maintenance of the metastable state as well as contribute longer range functional effects. Variably methylated IAPs are reprogrammed after fertilization and re-established as variable loci in the next generation indicating an ability to reconstruct a stochastically variable epigenetic state. Finally, we explore the sensitivity of metastable epialleles to environmental effects to determine the extent to which they can act as biosensors of environmental compromise and mediators of environmental effects on genome function.

78 Speakers and abstracts Biography

Anne Ferguson-Smith is the Arthur Balfour Professor of Genetics and Head of the Department of Genetics at the University of Cambridge. She studied molecular biology at Glasgow University and received her PhD in Biology from Yale University USA, exploring the importance of mammalian genome organisation for development. In 1989, she moved to Cambridge for postdoctoral work in the emerging field of epigenetic inheritance and this has been her focus since then. Her current research addresses three themes: i) genomic imprinting and the epigenetic programme in development; (ii) functional interactions between the genome and epigenome with particular emphasis on repetitive elements; and (iii) the relationship between the genome, epigenome and the environment in development and disease. She is an elected member of EMBO, a Fellow of the UK Academy of Medical Sciences and a Fellow of the Royal Society. Anne feels very honoured to be succeeding Laurence D. Hurst, as President of the Genetics Society.

Speakers and abstracts 79 POSTER ABSTRACTS

80 Speakers and abstracts GENOME STABILITY AND INSTABILITY

P20 HEATHER JEFFERY University of Oxford

Probing chromatin structure in single molecules

A third of proteins are estimated to have the ability to bind DNA. This may be a critical part of their function, as DNA-binding proteins are involved in many cellular processes, including DNA replication, transcription and DNA repair. Dysregulation of any of these processes can lead to disease. In particular, faulty DNA replication has been linked to cancer and Meier-Gorlin syndrome. Protein binding to DNA can be analysed by chromatin accessibility assays such as ATAC- seq, DNase I-seq, MNase-seq, FAIRE-seq and ChIP-seq. All these methods produce population data with some only providing information on specific proteins. Furthermore, these methods hide the biological stochasticity and regulation that may occur on single molecules. I present a new approach to investigate DNA-protein interactions on single molecules. This involves combining enzymatic modification of accessible DNA in Saccharomyces cerevisiae with long read Nanopore sequencing. Subsequent data analysis employs recently developed tools to detect enzymatic base modifications on DNA. My aim is to generate genomic footprints of chromatin accessibility on single molecules, with a focus on the positions of DNA replication origins. This will allow investigation of the interactions between origin-binding proteins and surrounding chromatin in single molecules. I aim to apply this method to DNA replication in Saccharomyces cerevisiae and to compare chromatin structures between active and dormant origins. The timing and efficiency of origins are known in S. cerevisiae. However, the preparatory steps that occur before replication, in the stage termed origin licensing, are less well studied. Therefore, I hope to further characterise these steps by looking at how the pre-replication complex forms at origins with different timing and efficiencies. By looking at single molecules we can learn more about the stoichiometry and interactions of proteins involved in the first steps of DNA replication, that cannot be done with population-based approaches. I will present the development of this single molecule chromatin accessibility assay using nanopore sequencing, which I anticipate will provide insight into the stochasticity or pre-determined nature of DNA-binding events occurring at DNA replication origins. Understanding the initial steps of DNA replication is a step towards understanding how this process may be defective in cancer and Meier-Gorlin syndrome. Although my focus is on DNA replication, this method can also be applied to other DNA-protein interactions.

Speakers and abstracts 81 P21 YICHEN (SERENA) DAI University of Oxford

Weird gene in a weird mammal: A highly divergent pancreatic duodenal homeobox 1 (Pdx1) gene in the fat sand rat

Various forces leading to strong GC skew in local genomic regions cause genome instabilities, giving rise to conflict between increasing GC levels and alteration of conserved amino acids. In most cases, natural selection will purge any deleterious alleles that arise. However, in the gerbil subfamily of rodents, several conserved genes serving key functions have undergone radical alteration in association with strong GC skew. We present an extreme example concerning the highly conserved homeobox gene Pdx1, a key gene in initiation of pancreatic organogenesis in embryonic development. In the fat sand rat Psammomys obesus and close relatives, we observe a highly divergent Pdx1 gene associated with high GC content. In this study, we investigate the antagonistic interplay between very rare amino acid changes driven by GC skew and the force of natural selection. Using ectopic protein expression in cell culture, pulse-chase labelling, in vitro mutagenesis and drug treatment, we compare properties of mouse and sand rat Pdx1 proteins. We find that amino acid changes driven by GC skew resulted in altered protein stability, with a significantly longer protein half-life for sand rat Pdx1. We show that both sand rat and mouse Pdx1 are degraded through the ubiquitin proteasome pathway. However, in vitro mutagenesis reveals that GC skew has caused loss of a key ubiquitination site conserved through vertebrate evolution, and we suggest that sand rat Pdx1 may have evolved a new ubiquitination site to compensate. Our results give molecular insight into the conflict between natural selection and genetic instabilities driven by strong GC skew.

P22 VERA KAISER MRC Institute of Genetics and Molecular Medicine, The University of Edinburgh

From germline DNA breakage to developmental disorders

DNA breakage is an integral aspect of meiosis and the germline the ultimate source of structural variation, affecting traits such as developmental disorders. However, neither the mutational processes which give rise to structural variation nor their impact on the noncoding parts of the genome are well understood. The aims of this study are two- fold: First, we leverage genomic datasets to study the effects of transcription factor binding in the germline on population level structural variation. In particular, we make use of ATAC-seq in spermatogonia to infer the locations of transcription factors and meiotic recombination initiation sites (PRDM9 binding sites), and relate these to rare DNA breakage events observed in a variety of developmental disease cohorts. Second, we assess the overall impact of DNA breakage on non-coding variation and its effects on enhancer elements as well as regulatory domains that are active in the developing brain.

82 Speakers and abstracts P23 RIYADH AEBAN University of Reading

Characterisation of Microsatellite loci for Dermatophagoides farinae (Acari) and novel identification of Saudi House Dust Mites

In forensic genetics, the use of microsatellites to identify suspects, victims and to analyse trace evidence of biological origin has become routine practice. Microsatellites are genetic markers (DNA sequences) characterised by the repetition of nucleotides in short tandems. Mites are tiny, microscopic chelicerates (arthropods), that have managed to colonise all terrestrial and aquatic environments. A great majority of mite species are highly synanthropic and live in inhabited human dwellings, the so-called House Dust Mites (HDM). HDMs are well studies, being responsible for house allergies. They are micro- habitat specific, therefore, they can be easily recovered in a crime scene as trace evidence. This especially applies to the investigation of cases of rape or sexual assault considering the association of these mites with human secretions, semen, saliva, etc, as they live inside mattresses, and other clothed furniture. The aim of this work is to design, describe and characterise microsatellite markers of the most ubiquitous species of domestic mites, Dermatophagoides farinae. This study highlights the potential use of HDMs as trace evidence and markers for the indoor environments. According to the data analysis of microsatellites characterisation of D farinae genome (genome searching for screening the potential microsatellite motifs), the highest number of short tandem repeat are MS 101, MS 98, MS 88, and MS 80, while the highest length of short tandem repeat are MS 101, MS 98, MS 80 and MS 85. The results also showed that the trinucleotide repeats were the most abundant STR types, followed by dinucleotide, hexanucleotide and tetranucleotide.

Speakers and abstracts 83 P24 CRISTINA TUFARELLI University of Leicester

LCT13, a LINE-1 driven transcript with potential roles in CRC

Most of our DNA is composed of repetitive sequences such as LINE-1, elements of viral origins that can move themselves to different locations in the genome. Of the more than 500,000 LINE1s within the genome, only 80-100 can mobilise, but about 7,000 carry intact promoters. These promoters are usually silenced in normal tissues and their activity is considered a hallmark of cancer, especially for epithelial types such as colorectal cancer. Activation of LINE-1 promoters in colorectal cancer can drive transcription of L1 chimeric transcripts (LCTs) composed of part L1 and part adjacent genomic sequence. We previously reported LCT13, an LCT whose transcriptional unit encompasses the metastasis suppressor gene TFPI-2 and whose expression in CRC is inversely linked to TFPI-2 expression. Interestingly, one of the two LCT13 alternative splicing isoforms so far identified contains the intact open reading frame of GNGT1, the gamma subunit of a retina- specific heterotrimeric G-protein. Here we show a link between LCT13 expression and detection of GNGT1 in primary tissues and sera from colorectal cancer (CRC) patients. In addition, treatment of LCT13 positive HCA7 cells with salicylic acid and resveratrol leads to decreased levels of LCT13 expression suggesting a potential role for inflammation in LCT13 promoter activation. These findings further highlight the importance of studying individual LINE-1 chimeric transcripts and their protein products as they may play important roles in cancer and provide information with potential for clinical exploitation.

P25 ANE STRANGER University of Kent

Investigating the role of H2AX Y142 phosphorylation in the DNA damage response

One of the earliest events following DNA damage is phosphorylation of H2AX at S139, forming gamma-H2AX (γH2AX). γH2AX spreads across large chromatin domains around double-stranded breaks (DSBs) and recruits repair proteins to the site of damage, initiating the DNA damage response. More recently, a second H2AX phosphorylation site was discovered. H2AX is phosphorylated at Y142 in undamaged cells and the phosphorylation mark is thought to be gradually removed upon DNA damage. However, some H2AX retains its Y142ph in DNA damage conditions, and becomes doubly phosphorylated at both S139 and Y142, forming di-γH2AX. Whether di-γH2AX is a signal in its own right or a transient state as Y142ph is removed from H2AX and S139ph is added, is not known, but persistence of the di-γH2AX signal is thought to initiate apoptosis, suggesting a role

84 Speakers and abstracts for H2AX phosphorylation in life/death decisions. A previous lack of reagents specific to this phosphorylation state has made it difficult to test this hypothesis directly. Here, an antibody against di-γH2AX previously developed by the lab, was used to study the formation, appearance and localisation of di-γH2AX in DNA damaged cells, to further understand its role in DNA damage signalling. First, the antibody’s specificity was tested by dot blot against either unphosphorylated, S139ph, Y142ph or S139phY142ph H2AX peptides, and found to be highly specific for doubly phosphorylated H2AX. In human cell lines treated with the radiomimetic drug bleomycin, both γH2AX and di-γH2AX were present in apoptotic cells and repair foci marking DSBs. However, a decrease in the di-γH2AX signal upon DNA damage was not seen, in contrast to previous reports. Instead, the signal remained constant throughout the bleomycin time course. Chromatin immunoprecipitation (ChIP)-qPCR analysis of the DSB-inducible via AsiSI restriction enzyme (DiVA) cell line showed that di-γH2AX is increased around DSBs. Interestingly, it follows a similar pattern to γH2AX, with a higher signal at distal sites versus proximal sites (~3000 basepairs from the DSB versus ~180 basepairs). In addition, ChIP-qPCR experiments indicate a decrease of H2AX Y142ph around DSBs upon damage induction, in line with previous results. Our results demonstrate that our di-γH2AX antibody can be used in conjunction with a variety of experimental techniques. The presence of both γH2AX and di-γH2AX in apoptotic and non-apoptotic cells suggests that the role of di-γH2AX is not simply acting as a “switch” between repair and apoptosis, but that di-γH2AX instead serve other/ additional purposes in the DNA damage response. Future experiments will focus on what kinases are able to produce di-γH2AX and di-γH2AX interaction partners in DNA damaged cells. Further ChIP-seq analysis will allow studying genome-wide di-γH2AX signalling in DNA damaged cells, how di-γH2AX is organised around DSBs and whether di-γH2AX associates with a particular type of chromatin.

P26 MATTHEW HARTFIELD The University of Edinburgh

Using singleton densities to infer recent selection in Bos taurus

Detecting regions of the genome that have been subject to natural selection is a major focus of evolution research. While there has been considerable success with discovering individual genes exhibiting footprints of historical selection, quantifying ongoing evolution has proved more challenging. This is especially true when multiple genes contribute to a phenotype (‘polygenic’ selection). Domesticated species are characterised by strong selection on traits of economic value (e.g. milk production and stature in cattle). Yet these phenotypes likely have a polygenic basis, and it remains unclear how those genes act in concert to influence phenotypic evolution, and over what timescales selection acts. The ‘Singleton Density Score’ (or SDS) was recently introduced to detect ongoing selection in humans, by identifying SNPs that exhibit a reduced occurrence of singletons (variants in a dataset that are unique to an individual) in their vicinity. Here we investigate the

Speakers and abstracts 85 extent of contemporary selection in Bos taurus, using a modified SDS method. We first perform a scan across autosomes to determine individual genetic regions that appear to be subject to ongoing selection. We then study the distribution of SDS scores in both genes associated with milk protein content, and at SNPs lying close to stature QTLs, to determine how domestication has shaped the recent evolution of these traits. Our study clarifies the genetic basis underlying the evolution of complex phenotypes.

P27 VICTORIA SMITH University of Nottingham

Co-purification of the replicative helicase CMG complex in archaeonHaloferax volcanii

The accurate and timely replication of deoxyribonucleic acid (DNA) is a ubiquitous requirement of all cells, whereby the entire genome needs to be copied once per cell division. Bacteria, Eukaryotes and Archaea carry various replication fork proteins, with some broad conservation between the latter two domains, where Archaea possess simplified versions of Eukaryotic components. For example, eukaryotic heterohexameric helicase MCM(2-7) is a simplified homohexameric MCM helicase in archaeal species. This makes Archaea strong model organisms for working out the complexities of intricate information processing systems such as DNA replication. Halophilic archaeon Haloferax volcanii is a genetically tractable organism with a vast range of genetic tools available which have previously been utilised for studies into replication and repair. The CMG helicase complex, consisting of proteins Cdc45, MCM helicase and GINS, is a key component of DNA replication in eukaryotes and archaea. In eukaryotes the regulated recruitment of Cdc45 and GINS at the replication fork drives DNA unwinding and therefore fork progression by MCM helicase. Previous studies have shown archaea possess a simplified ancestral form of this complex, with homologues of GINS, Cdc45 (called RecJ or GAN in archaea) and MCM being utilised for archaeal replication. Both MCM and GINS are known to be essential in H. volcanii, suggesting the CMG complex plays a key role in archaeal replication, as in eukaryotes. However, of the four recJ genes in H. volcanii, it is yet to be elucidated which acts as the Cdc45 homologue at the replisome. Here, we utilise affinity tags for purification of MCM and GINS in halophilic archaeonH. volcanii. Using these, we were able to co-purify members of the CMG complex, elucidating the binding partners of these key players during replication using mass spectrometry. This work will help define the replisome employed in replicating H. volcanii cells.

86 Speakers and abstracts P28 ROBERT FOSTER MRC Human Genetics Unit, The University of Edinburgh

Comprehensive in vitro modelling of Cornelia de Lange Syndrome (CdLS) in human differentiated dopaminergic neurons to identify convergent cellular pathology with translational potential for non-syndromic autism

Typical Cornelia de Lange syndrome (CdLS, MIM #122470) is a very severe, clinically distinct multisystem disorder characterised by , prenatal onset growth retardation, limb malformations, multiple chronic medical problems and distinctive facial features. Most affected individuals (~70%) have de novo heterozygous, loss-of-function mutations in NIPBL, while a proportion (<10%) of the remaining individuals have heterozygous or hemizygous mutations in the genes SMC3, RAD21, SMC1A and HDAC8. Generally, individuals with mutations in NIPBL present with all features typical of CdLS, whereas mutations in other genes lead to a less severe phenotype, with minimal/absent limb abnormalities, a less distinctive facial appearance and mild intellectual disability and behavioural abnormalities. All genes known to cause CdLS code for proteins which are involved in the formation or functioning of the cohesin complex; a loop-like structure with a ‘clasp’ which encircles DNA. In humans the loop of the cohesin complex is formed by the proteins SMC3 and SMC1A and the clasp is formed by RAD21 and STAG1/2. NIPBL is required to load cohesin onto DNA while HDAC8 is a deacetylase which regulates the opening and closing of the clasp via SMC3. Although the most well-known role of cohesin is in bringing sister chromatids together during cell division, it has several other critical functions, including during transcription, DNA replication and DNA repair. However, it is believed to be impairments in cohesin’s ability to bring enhancers and promoters together to form transcriptionally-associated domains (TADs) which lead to the abnormalities seen in CdLS; a key marker of which is the global down- regulation of many thousands of genes across all tissues, including transcription factors. In this project, we aim to determine whether the phenotypic convergence we see upon mutation of cohesin related genes is predictive of an underlying molecular convergence which would allow treatments to be developed that target mutations in all genes. To this end, we have introduced patient mutations from several genes into mouse embryonic stem cells (mESCs) using CRISPR/Cas9. These mutant mESC lines can then be differentiated into several key disease cell types, including neurons and optic vesicles, and organoids, which we will profile in their differentiated state and during differentiation using techniques such as RNA- seq, mass-spectrometry and high-content imaging. Our goal is to identify suitable affected pathways shared between the mutated genes, and then use these cell lines to perform high-throughput small molecule screens for therapeutics. We will also examine the mutant differentiated neuronal cultures specifically, in areas such as synapse formation and function, the up-/downregulation of cell-surface receptors and affected signalling pathways. This will give us insights into what leads to the neurological phenotypes in CdLS, and more broadly autism and intellectual disability in general. During our work with these mutated cell lines we also expect to uncover novel insights into each gene’s role in development and differentiation, and potentially explain the phenotypic differences we see between individuals with mutations in these genes.

Speakers and abstracts 87 P29 LUCIA CAMPOS-DOMINGUEZ The University of Edinburgh

Is genome instability driving faster speciation in Begonia?

Speciation events are driven by genetic changes which may be either adaptive, driven by selection or can be fixed in genetically isolated populations through the action of genetic drift. Genome instability and variation are very commonly found in angiosperms through polyploidy, high transposon activity and epigenetic alterations. Moreover, this group of land plants is known to represent one of the fastest species radiations in the tree of life. Within angiosperms, the genus Begonia is one of the fastest-growing and has over c.1900 pantropically distributed species that have been currently identified. Recent studies on Begonia provide a robust phylogenetic background for the analysis of evolutionary patterns across the group and confirm that speciation events in this group are not fully adaptive. Furthermore, previous studies have shown wide variation in genome size, chromosome number and repetitive elements content across the genus, confirming that Begonia species have very unstable or “dynamic” genomes. Using Begonia as a model, we are testing the role of genome instability in the evolution of species. We aim to determine the role of genome dynamics and composition in the repeated bursts of speciation indicated by the Begonia phylogeny. For this purpose, we selected 6 species representing different Begonia clades for genomic structure analysis. Our whole genome sequencing results show a wide variation in both content and types of DNA repeats across different radiations. We have been able to confirm that two of the three highly specious clades under analysis are composed by species with bigger and more repetitive genomes compared to their more depauperate sister clades. Our findings and upcoming experiments are taking us one step close to revealing the genome dynamics behind Begonia speciation, providing new insights on the role of non-coding elements in plant genome evolution.

GENETICS AND SELECTION OF QUANTITATIVE TRAITS

P30 VICTORIA LINDSAY Royal Veterinary College

Dissecting the genetic architecture of equine exertional rhabdomyolysis

Exertional rhabdomyolysis (ER) is a syndrome characterised by episodes of exercise-induced muscle stiffness and contractures, myofibre necrosis, myoglobinuria and even renal failure and death. The disease phenotype varies in severity. Histologically it is characterised by non-specific features of muscle damage on biopsy. ER affects humans, particularly athletes and highly active careers such as the military, and animal species including horses, but the aetiology and pathogenesis is unclear.

88 Speakers and abstracts One genetic cause of ER has so far been identified: polysaccharide storage myopathy (PSSM1), caused by a mutation in glycogen synthase (R309H in GYS1). However, this mutation has not been reported in many breeds including Thoroughbreds, where prevalence of recurrent ER (RER) is high (5-7% in racehorses), and a variety of other breeds are also diagnosed with ER. RER is a genetic disease in horses, however the causal genes and underlying molecular mechanisms have not yet been identified. There is also currently no cure for this disorder although many horses can continue to perform well if managed appropriately. This study aims to identify specific breed-related phenotypic patterns and investigate the genetic architecture of RER in two contrasting breed types that are common in the UK horse population: Connemara ponies and Warmblood horses. We hypothesise that the phenotypic heterogeneity seen in RER cases has a genetic basis and that the genetic architecture of the disease in horses is similar to the one in humans. The referral database from the Royal Veterinary College Comparative Neuromuscular Disease Laboratory was utilised to investigate the presence of distinct breed-associated RER phenotypes. This database contained 1087 cases referred to the laboratory with differential diagnoses of neuromuscular disease between May 2006 and September 2017, of which 205 were diagnosed with RER. Principal component analysis (PCA) was carried out using features including reported clinical signs, biopsy histological features and animal characteristics such as age, sex and neutering status. Chi-squared test was used to assess the associations between breed, and within Warmbloods the type, and number of diagnoses of RER and PSSM1. In order to investigate the genetic architecture of RER 40 horses, 20 RER cases and 20 controls, half of each breed (Warmbloods and Connemaras), were whole-genome sequenced (WGS) using Illumina Hi-Seq. Variant calling was carried out using the GATK best practices pipeline and EquCab 3.0 reference genome, and functional impact predicted using Variant Effect Predictor. Three potential RER phenotypes were identified by PCA when utilising reported clinical signs only. The clusters did not appear to be related to breed, sex, disease severity or confidence of diagnosis. A distinction based on histological features were not identified. The only breed that showed an overall significant association with number of diagnoses of RER or PSSM1 were the Warmblood breed type Selle Francais, which were significantly more likely to be diagnosed with PSSM1 than RER or any other diagnosis - other Warmblood breeds and breeds typically associated with RER and PSSM showed no significant associations. We have extracted 50 candidate genes based on the human and horse literature from the WGS data and are looking for variation with a potential functional impact on the encoded proteins. Analyses within and across breeds are currently underway.

Speakers and abstracts 89 P31 CHANDANA BASU MALLICK The Roslin Institute, The University of Edinburgh

Investigating the role of positively selected PRSS53 variants: the journey from mouse models to cells

Positively selected variants have been instrumental in understanding the genetic architecture of adaptive traits. Studying the footprints of selection in the present genomes enables us to understand, why a trait has been selected, how it emerged in a population and how it relates to reproductive fitness. The story seems straightforward if the selected allele contributes to a single phenotype/trait like lactase persistence but in case the variant is associated with a myriad of traits, it’s difficult to decipher the selected phenotype. Human head hair varies in its shape and texture across populations. Among the 18 loci so far identified, some of the hair variants are also exemplary candidates of positively selection. However, it has not been clear if hair shape itself is the target of selection for these variants or any other associated phenotype. In this study, we focus on PRSS53 encoding a serine protease, which harbors two positively selected variants. One being selected in East Asians (rs11150606 encoding PRSS53p.Q30R) associated with straight hair and other being selected in South Asians (rs201075024, encoding PRSS53p.G34S) with no phenotype reported till date. Hence, we use an integrated approach of functional genomics, molecular biology and population genetics to gain a complete understanding of the biological implications of these two variants. Our comprehensive phenotyping of gene-edited mouse models attests the role of the PRSS53 in hair shape and further identifies another skin phenotype. Using population genetic and cell-based approaches, we demonstrate that though PRSS53 and EDAR are co-selected at the population level, they do not functionally interact. This further suggests that the variants in these genes influence hair shape through independent mechanisms. Lastly, the fact that both the genes intersect on hair shape as their common phenotype provides further clue of hair shape being the main target of selection in human evolution.

P32 LOUISE JOHNSON University of Reading

Experimental evolution of invasion and escape in cancer cell lines

Experimental evolution approaches developed in microbial systems can be applied to cancer cells in vitro to address the causes of evolutionary changes during cancer progression. Cell motility traits are important in cancer progression, particularly for metastasis and tissue invasion. By tracking cells over several generations, we can determine which cell traits show significant, stable, heritable variation within the population, and are therefore most readily altered by selection.

90 Speakers and abstracts We can also directly subject cancer cell lines to selection. We selected for two metastasis- like behaviours: firstly, we repeatedly placed a dense population of cells inside a compressed collagen disc, and collected those that successfully move out into the surrounding medium - “escape lines”. Secondly, we selected for cells that successfully invaded Matrigel, a synthetic basement membrane, from an adjacent softer medium - “invasion lines”. Using time-lapse video microscopy and deep learning, we monitor dynamic cell phenotypes such as changes in cell shape, overall cell speed, and response to the proximity of other cells, and compare both of these selection lines with the ancestral population and with a control line cultured on plastic over the same timescale. We detect clear phenotypic changes between our ancestral, control, and selection lines. In both escape lines and invasion lines, changes in cell shape are far more tightly linked to speed of movement than in control and ancestral lines. The fact that invasion and escape appear to select for similar cell motility phenotypes may suggest that those cells that have escaped a tumour and entered the bloodstream have some degree of pre-adaptation to reinvasion, the next step of metastasis.

P33 OLIVIER GERVAIS Kyoto University, Japan

Heritability of traits related to metabolic syndrome and chronic kidney disease in a large-scale cohort from Northeast Japan

Over the past few decades, obesity has become a public health issue of pandemic scale. The upsurge in the incidence of obesity in low-income countries and the dramatic increase in the number of overweight/obese youths also raise concerns that the effects of this trend will worsen in the coming years, notably by increasing the risk for developing associated medical conditions such as chronic kidney disease. Despite this being a global issue, the majority of the research on the genetics of obesity is still being performed in populations of European descent. In this study, we focused, in a large-scale cohort from Northeast Japan, on the estimation of the genetic parameters of a wide range of traits related to metabolic syndrome and chronic kidney disease, including height, BMI, blood pressure, serum creatinine, and cystatin C. We applied linear mixed model methods and the AI-REML approach in GCTA for variance components estimation. The heritability estimates of height, HDL cholesterol, and serum creatinine, were 0.561 (± 0.031), 0.330 (± 0.031), and 0.268 (± 0.024), respectively. Overall, our results are largely in line with estimates from other populations such as that of the UK Biobank. Our findings suggest that in spite of the genetic and environmental differences that exist between populations of European descent and the Japanese population, the proportion of phenotypic variation that is due to genetic variation in these populations is similar. Further research is however needed to assess whether this result applies to the genetic correlation between these traits as well as to a broader spectrum of phenotypes.

Speakers and abstracts 91 P34 LARA URBAN European Bioinformatics Institute

Scrutinising nanopore sequencing for the study of freshwater microbiomes

Year by year, Cambridge rowers and punters experience serious infections associated with pathogens from the city’s local river, the Cam. To address this problem, and to shed light on the Cam’s bacterial diversity and pathogenic landscape, our team has collected water samples and leveraged real-time DNA sequencing with the portable MinION DNA sequencing device to study the river microbiome. In April, June, and August 2018, we analysed metagenomes from nine distinct locations along an 11.6km trajectory of the Cam. Besides the optimisation of DNA extraction and real-time nanopore sequencing workflows for microbial community studies, we also integrated key physicochemical variables into our study, such as pH, flow rate, water temperature and salt levels through optical emission spectroscopy. Our data captures both, broad seasonal trends in freshwater microbial community composition and localised effects consistent with environmental features of the sampling range. Albeit we find that our analyses are mostly constrained to family-level taxonomy, ongoing nanopore base accuracy improvements may allow for targeting genus and species-level resolution in the future, even in workflows relying on (full length) 16S rRNA amplicons. At this stage, we tested the efficiency of selective manual read curation to obtain species level resolution of the bacterial communities. For example, we investigated the bacterial genus Leptospira, which pathogenic species represent the infectious water agents of Weil’s disease. Accounting for biased homopolymer sequencing errors suggested the presence of Saprophytic instead of pathogenic Leptospira species, a finding we could subsequently confirm by targeted PCR analyses. Overall, this study aims (i) at creating a MinION benchmark for freshwater microbiome monitoring, with a focus on cost optimisation for applications in One Health contexts, and (ii) at providing spatio-temporal metagenomics snapshots from an exemplary East Anglian river.

92 Speakers and abstracts P35 SMARAGDA TSAIRIDOU The Roslin Institute, The University of Edinburgh

Optimising genomic selection with low density markers in farmed Atlantic salmon

Genomic selection is increasingly applied in aquaculture breeding to expedite genetic gain for key production traits such as disease resistance. Its effective application in breeding programmes depends on large training datasets of phenotypes and genotypes. In livestock breeding, genotype information on large populations can be achieved in a cost-effective manner through genotype imputation, which allows inference of high density genetic information from low density genotype data. In typical aquaculture breeding programmes where large full-sibling families sharing large genomic segments are available, the value of imputation has yet to be fully assessed. The aim of this study was to evaluate strategies for the use of low density SNP panels and genotype imputation in Atlantic salmon breeding programmes. Our study focuses on two economically important traits, namely resistance to sea lice (sea lice count) and body weight. Data were derived from a Scottish Atlantic salmon breeding programme population challenged with L. salmonis (Landcatch, UK) and genotyped with the Affymetrix Axion 132K Atlantic salmon SNP chip (78362 SNP markers). The analysis comprised three main parts: (a) identifying high density (HD) SNP panels with optimal density and composition; (b) testing the imputation accuracy of a range of low density (LD) imputation SNP panels; (c) comparing the genomic prediction accuracies of each of the datasets, with and without imputation. All analyses were performed using a bespoke in-house software pipeline, which employed R for selection of HD and LD SNP panels, calculation of genomic relationship matrices, and cross-validation, FImpute for genotype imputation and ASReml for estimation of breeding values. Mean genomic prediction accuracies were obtained across 50 replicates of 5-fold cross-validation, as the correlation between phenotypes and predicted breeding values divided by the square root of the heritability. The heritability estimated using the genomic relationship matrix was 0.19 (0.07) for sea lice resistance, and 0.57 (0.07) for weight. Reducing the SNP panel density from the full HD SNP panel to 200 SNPs resulted in a 14.5% decrease in genomic prediction accuracy for sea lice resistance, and a 27.9% decrease for body weight. This loss of accuracy started between 2,000 and 5,000 SNPs for both traits, therefore these densities were tested as the optimal target HD SNP panels for imputation. Imputation accuracy increased with increasing SNP panel density. Genomic prediction accuracy using imputed genotypes was only marginally higher when increasing the density of the imputation SNP panel. The benefits of imputation were highest at the lowest imputation panel densities. Prediction accuracy using 200 SNPs without imputation for sea lice resistance was 0.45, whereas with imputation to 5000 SNPs it was 0.53, which corresponds to a 17.47% increase due to imputation. In conclusion, genomic prediction using LD genotypes imputed to medium densities can provide better accuracies than using LD true genotype data alone, allowing realistic use of small LD SNP panels as a cost effective tool for genomic prediction in Atlantic salmon.

Speakers and abstracts 93 P36 ALASDAIR ALLAN MRC Harwell Institute

Generation and validation of increasingly complex alleles introduced by genome editing

Mouse models are valuable tools to understand gene function, genetic disease mechanisms and to develop and test new therapeutic treatments in vivo. The CRISPR/Cas9 system can be utilised as a genome-engineering tool and has brought new perspectives in the generation of mouse models of human disease. The use of this system allows for the introduction of targeted point mutations and other subtle modifications into the mouse genome in a time-efficient and cost-effective manner. We will present our recent developments of processes for genome engineering. As the CRISPR/Cas9 system becomes better understood and communicated, the requests for mouse models of human disease become ever more complex. These include flanking a target exon with FLOX sites, large megabase deletions of DNA sequence containing multiple genes and humanising sections of a gene to better mimic a disease model in humans. Alongside the generation of these mutants, their validation represents a new challenge. With new processes for allele validation, we uncover further variability in the outcome of applying CRISPR/Cas9 to the modification of mouse early embryos. We will demonstrate how we have used droplet digital PCR (ddPCR) to identify discrete sequence changes, the generation of larger than expected deletions and chromosomal rearrangements. Extensive validation in this manner recognises unwanted variants at early stages of the mutagenesis process, which reduces the number of animals required for genome engineering and contributes to experimental reproducibility of the model.

94 Speakers and abstracts P37 IRENE BREIDER The Roslin Institute, The University of Edinburgh

Effect of training population composition when using genomic selection for introgression of exotic alleles

By 2050 we need to feed over 9 billion people worldwide. In 2015 cereals contributed almost 50% of dietary energy worldwide. To meet the future demand in cereal consumption, production needs to grow by almost one billion tonnes between 2005 and 2050. Since the green revolution years increase in cereal yields has been falling. An opportunity to overcome this stagnation in genetic gain is the introgression of exotic alleles. However, (re)introducing exotic alleles to improve polygenic traits poses difficulties, due to e.g. genetic lag, linkage drag and a lack of adaptation. Here we show that introduction of exotic alleles using a pre-breeding strategy combined with genomic selection increases genetic gain in wheat yield. The challenge with introgression of exotic alleles to improve a polygenic trait is that many genes are involved, most with a small effect. Genomic selection enables us to directly target alleles with a desirable effect on the trait of interest. However, the success of introgression using genomic selection depends on the choice for an appropriate training population. Our objective was to develop a breeding strategy that enables us; 1) to build a training population including exotic alleles, and 2) to use a subset of our training population at different stages of our breeding programme, to optimise introgression and increase in genetic gain simultaneously. A stochastic simulation in AlphaSimR was performed simulating introgression of exotic germplasm into an elite wheat breeding programme using pre-breeding. Three baseline scenarios were simulated; 1) a conventional breeding programme using phenotypic selection, 2) a conventional breeding programme using genomic selection, and 3) a breeding programme with introgression using phenotypic selection. Five scenarios of a breeding programme with introgression using genomic selection were simulated, each using 1-3 subsets of a training population used at 4 different stages of the programme. For introgression an exotic line was simulated that was 5, 10 or 15 years behind the simulated elite line. Operational costs of the breeding programmes were equalized so that costs were equivalent or less to operating costs of a conventional breeding programme using phenotypic selection. Results show that the increase in genetic gain is higher when introgression is combined with genomic selection, compared to any of the baseline scenarios. Differences in genetic gain between the scenarios investigating different training populations were not significant. Accuracy of genomic selection was significantly different between some scenarios. Results demonstrate that it is possible to improve genetic gain in a quantitative trait using introgression with genomic selection and show the advantage of tailoring the training population composition to the stages of the breeding programme in optimising genetic gain. These results show the potential of using introgression with genomic selection to increase yield in food crops to contribute to meeting future food requirements.

Speakers and abstracts 95 P38 IAN DUNN The Roslin Institute and Royal (Dick) School of Veterinary Studies

Lack of genetic correlation between laying hen bone quality and egg production

Bone fractures including Keel Bone damage are a challenge for laying hens, as the physiological adaptations for egg laying make them susceptible to osteoporosis. There is also a welfare paradox in laying hens; cage free production that allows greater movement and loading of the bone is consistently associated with increased bone quality but at the same time these alternative systems are associated with greater incidence of bone damage, often featuring the keel. We are also starting to see the ‘Long Life Layer’, which has great benefits for sustainability but with a longer period of production this could be a greater risk for bone quality at the end of lay. In this context it is frequently stated that poor bone quality is a result of selection for increased egg production. However, the literature is not entirely supportive of this hypothesis. Whilst laying eggs can be related to potential issues of poor bone quality, it is not clear that the relationship between numbers of eggs laid and bone quality is negative. To answer this question, we estimated genetic correlations between egg production and bone quality towards the end of lay in a White Leghorn (WL, 53 weeks of age) and a Rhode Island Red (RIR, 68 weeks of age) pure line. Egg numbers at the commencement of lay, related to age at first egg, were analysed separately (early egg number) and the remaining periods (egg number) represented post peak production. Tibia and humerus breaking strength and the radiographic density of the tibia, humerus and keel bone were measured (>900 per line). Heritability estimates for bone quality traits were 0.19±0.07-0.48±0.08 for WL, 0.35±0.08-0.59±0.09 for RIR. Heritability of keel bone density was near zero. Genetic correlations between egg number and bone quality were not significant. Correlation between egg production and tibia bone density was 0.24±0.21and -0.05±0.22 for WL and RIR respectively. In the WL there were negative genetic correlations between bone quality and early egg number (-0.50±0.11). Genetic correlations were also negative between early egg number and body weight in the WL line (-0.64±0.13). This may reflect the interaction between bone quality and puberty which also affects mature body weight. As we have shown previously the potential exists to improve bone quality by selection, assuming a simple phenotype can be developed. We are currently working on that. We can find no evidence for a relationship between post peak egg production and bone quality. However, in one line there is an effect of age at first egg. This suggests the move towards longer laying periods will not necessarily impact bone quality but will need to be carefully monitored. These results will inform and guide future efforts to improve bone quality and in some cases merits an examination of puberty onset.

96 Speakers and abstracts P39 JACQUELINE SMITH The Roslin Institute, The University of Edinburgh

Defining QTLs, genes and genome variants for resistance to Marek’s Disease Virus

Marek’s Disease Virus (MDV), is a highly-contagious oncogenic and a highly immune- suppressive alpha-herpes virus. MDV infects chickens, causing neurological symptoms and tumour formation. Though partially controlled by vaccination, Marek’s disease (MD) continues to have large impacts on animal health and welfare, and a profound effect on the economy of the poultry industry. Even after years of study, the genetic mechanisms underlying resistance to MDV remain poorly understood. The MHC is known to play a role in disease resistance, and a handful of other genes have also been implicated in resistance. Genome wide association study (GWAS) of an F6 advanced intercross line derived from a cross between two commercial layer lines phenotyped for survival in the face of MDV challenge, has allowed us to define Quantitative Trait Locus Regions (QTLRs) associated with resistance. We have further identified in the QTLRs candidate genes for survival, by integrating transcriptomics and genetic association tests. The Fms-Related Tyrosine Kinase 3 (FLT3) gene has been shown to have strong genetic association with resistance, along with several other candidate genes, miRNAs and lncRNAs. This provides targets for mitigating the effects of MDV by selective breeding, improved vaccine design or gene- editing technologies.

P40 ARIANNA LANDINI The University of Edinburgh

First GWAS on transferrin N-glycans: one step closer to understanding the genetics of protein glycosylation

Glycomics, the study of the collection of glycans in biological systems, is an emerging field among omics data. Despite glycans being involved in the aging process and in a wide variety of diseases (including cancer, immunological and autoimmune diseases, diabetes and congenital disorders), the genetic regulation of glycosylation is not fully understood. To partly address the knowledge gap regarding genetic regulation of glycosylation, we focused on transferrins, blood plasma glycoproteins regulating iron’s level in body fluids. This is the first study integrating genomics and transferrin glycomics, thus allowing us to investigate genetic regulation of transferrin N-glycosylation and whether the same or distinct genes are involved in the N-glycosylation of different proteins. Accordingly, we compared results obtained for transferrin N-glycans to state-of-the-art knowledge about genetic regulation of immunoglobulin G (IgG) N-glycosylation. We performed genome-wide association study (GWAS) of 35 transferrin N-glycans, using data collected on 948 samples from the CROATIA-Korcula cohort, a genetic isolate characterised by high kinship. This dataset features UPLC-quantified N-glycan traits and genetic data imputed to the Haplotype Reference Consortium.

Speakers and abstracts 97 We identified 522 genome-wide significant loci (p-value < 5 x 10-8/35 traits), mapping in genes encoding glycosyltransferases (MGAT5, FUT6, FUT8, ST3GAL4, B3GAT1), genes potentially related to the glycosylation process (MSR1, TUSC3) and transferrin (TF), or genes previously associated with N-glycosylation of IgG (NXPE1, NXPE4). Some of these genes (TF, MSR1, TUSC3) have not previously been associated with glycosylation. We thus suggest that while some genes are responsible for regulating the N-glycosylation of both transferrin and IgG proteins, others instead appear to control N-glycosylation of only one of these proteins. Our findings shed light on the complex mechanisms regulating protein N-glycosylation and also corroborate the notion that while some genes are specific for a restricted number of proteins, others instead affect the N-glycosylation of multiple proteins.

P41 ANDREW WHALEN The Roslin Institute and Royal (Dick) School of Veterinary Studies

Large scale livestock genomic data can be used to accurately detect individual- level recombinations

Estimating the recombination rates for different species is important for understanding the evolutionary history of the genome, detecting adaptive loci, and potentially increasing the rate of genetic gain in breeding programmes. In this project, we used SNP array data and pedigree data from a commercial pig breeding programme to predict the number of recombinations for each individual. We used AlphaImpute2 to generate per-locus recombination estimates for 150,000 pigs across 9 different breeding lines. We found genetic map length estimates for each chromosome that were in line with previous published estimates, including accurately measuring per-chromosome map lengths, recovering the sex difference in recombination rates, and correctly estimating local, within-chromosome variation in recombination rates. In addition, we compared recombination rates between each of the pig breeding lines, performed a GWAS to find loci that were predictive of the number of recombinations for an individual, and found results of biological interest. Overall, this project demonstrates the feasibility and the usefulness of generating individual-level recombination data using large scale SNP array and pedigree data in livestock populations.

98 Speakers and abstracts P42 JULIA RAMÍREZ Queen Mary University of London

The genetic underpinning of the electrocardiogram Tpeak to Tend interval

The T-peak-to-end (Tpe) interval on the electrocardiogram is heritable and longer values and increased variation with exercise are associated with increased arrhythmic risk. Currently, the genetic basis of resting Tpe and the response of the Tpe interval to heart rate (Tpe dynamics) remains to be elucidated. The aim of this study was to discover single- nucleotide polymorphisms (SNPs) associated with resting Tpe and Tpe dynamics. 1-lead ECG recordings of 52,147 participants of European ancestry from the UK Biobank study who had an exercise stress test were analysed. The test used a bicycle ergometer and included six minutes of graded cycling exercise and 1-minute recovery. Three Tpe phenotypes were derived, resting Tpe, Tpe dynamics during exercise (quantifying the Tpe changes between resting and peak exercise), and Tpe dynamics during recovery (measured between peak exercise and 50 s post-exercise). We performed a discovery genome-wide association studies (GWASs) for each phenotype in 30,000 individuals and a validation experiment in the remaining 22,147 samples. We also conducted a GWAS in the full data set, all from UK Biobank. For evidence of additional support, we performed a lookup of the identified SNVs for resting Tpe in an independent cohort (N = 20,453) of individuals from UK Biobank and in a meta-analysis combining all participants from both cohorts (N = 72,953). For resting Tpe, replication was confirmed if (i) P < 0.05/n in the validation cohort (where n = number of sentinel SNVs with P < 10 -6 in the discovery cohort), (ii) P < 0.05 in the independent cohort and (iii) P < 5 x 10 -8 in the meta-analysis. For the Tpe dynamics traits, replication was confirmed if P < 0.05/n in the validation cohort. Twenty-one (RNF207, SSBP3, SGIP1, KCND3, MEF2D, STRN, SERTAD2, SCN5A-SCN10A, CAMK2D, NKX2-5, SLC35F1, CREB5, KCNH2, PRAG1, AZIN1, ZMIZ1, IGF1R, GINS3, KCNJ2, PYGB and KCNJ4) and one (KCNJ2) loci for resting Tpe and Tpe dynamics during exercise, respectively, formally replicated. In the full data set GWAS, nine further loci for resting Tpe (FAAP20, DPT, SLC22A23, KDM1B, CAV2, FAM167A, PTGES3, LITAF and PRKCA), one for Tpe dynamics during exercise (EIPR1), and one for Tpe dynamics during recovery (NAF1) were genome-wide significant (P ≤ 5 × 10 −8). Sex-specific analyses identified one female-specific locus for resting Tpe (POLR2D), one male specific locus for Tpe dynamics during exercise (ETS2) and two female-specific loci for Tpe dynamics during recovery (NRXN3 and NOL4L). In total, we discovered 37 loci, with one (KCNJ2) being common across resting Tpe and Tpe dynamics during exercise. Bioinformatics analyses indicated that the top biological processes involved ventricular repolarisation and cardiac conduction and contraction. Our findings reinforce current understanding of ventricular conduction and repolarization and its response to heart rate changes and will guide future studies evaluating its contribution in cardiovascular risk prediction as well as identifying new therapeutic targets to modulate resting Tpe and its dynamics.

Speakers and abstracts 99 P43 JULIE GAUZERE The University of Edinburgh

Impact of maternal genetic effects on the evolutionary potential of a red deer population

The identification of the key processes facilitating or constraining rapid evolutionary responses is a major topic in evolutionary biology. Theoretical developments have highlighted that maternal genetic effects can alter evolutionary trajectories of traits, and potentially resolve the persistent problem of the ‘paradox of stasis’, whereby heritable traits under strong selection do not evolve. In mammals, the large influence of maternal traits on offspring development and fitness is well established. Nevertheless, only a few studies have tried to investigate how maternal genetic effects may mediate a life-history trade-off between mother and offspring performances. Here, we used one of the largest wild mammal field studies, of red deer Cervus( elaphus) on the Isle of Rum (Scotland), to address this question and estimate the influence of direct and indirect genetic effects on offspring phenotypes. We analysed the phenotypic data of 2,300 individuals using animal models and a genomic relatedness matrix. The analysis of birth weight, a key juvenile trait, revealed that maternal genetic effects explained the largest proportion of the total phenotypic variance for this trait (27%), followed by additive direct genetic effects (14%) and maternal environmental effects (10%). There was no genetic correlation between additive and maternal genetic effects, suggesting no constraints to the evolution of this offspring trait due to negative correlation between direct and indirect genetic effects. Finally, these analyses highlight that maternal genetic effects increase the total amount of genetic variation available for selection (h2tot = 0.25 instead of 0.14) and the adaptive potential of this juvenile trait.

P44 CHRISTINA JOSEPH MRC Human Genetics Unit, The University of Edinburgh

Genetic and molecular analysis of urinary magnesium concentration in Scottish and Croatian populations

Electrolyte imbalance, including changes in magnesium concentration and other electrolyte ratios, can result in dizziness, arrhythmia and if left untreated can result in serious illness or even death. Magnesium is the second most abundant bivalent cation in the body and is essential for many cellular processes. Renal magnesium handling plays an important role in maintaining magnesium homeostasis, however the exact biological mechanisms remain unclear. A recent genome-wide association study (GWAS) identified an association between urinary magnesium concentration (uMg) and variants in the ARL15 gene on chromosome 5. ARL15 encodes a GTP-binding protein that regulates the magnesium transporter channel TRPM6 and other proteins involved in magnesium homeostasis in physiologically relevant cell lines.

100 Speakers and abstracts Meta-analyses for urinary magnesium, and for ratios of magnesium to creatinine (uMg/ cre) and other clinically relevant electrolytes were conducted in 11,617 individuals from Scottish and Croatian populations. Functional validation of the candidate gene Arl15, using CRISPR-Cas9 to establish in vivo and in vitro knockout models of Arl15, is underway. Protein pulldown and mass spectrometry analysis was conducted to identify interactors of Arl15 and determine pathways involved in magnesium homeostasis. We observed genome-wide significant peaks in the ARL15 locus in meta-analyses conducted for uMg, magnesium to creatinine (uMg/cre), magnesium to phosphate (uMg/ uPh) and magnesium to potassium (uMg/uK) ratios. The top SNP associated with uMg and uMg/cre in the meta-analyses lies within a transcription factor binding site in an enhancer region of ARL15. Individuals homozygous for the alternate allele of rs35931 have lower urinary magnesium levels compared to individuals with the homozygous reference allele. Arl15 interacts with magnesium transporters within the distal convoluted tubule segment of the kidneys. Furthermore, homozygous deletion of Arl15 is lethal at the organismal level. We hypothesise that the variant found affects the expression of ARL15, which, in turn modulates magnesium transport by regulating cellular localisation of magnesium transporters. Further functional studies to elucidate the role ARL15 plays in magnesium homeostasis both in vitro and in vivo are underway.

P45 MARTIN JOHNSSON Swedish University of Agricultural Sciences

Detecting deleterious variants in the pig by means of whole-genome sequencing

Deleterious mutation is an unavoidable fact of genetics and can cause substantial reduction in fitness. With the advent of affordable genome sequencing and effective pedigree-based imputation, it is now possible to observe deleterious variants directly in livestock breeding programmes, and design new breeding strategies to improve fitness. We used a series of bioinformatic and population genetic methods to detect deleterious variants in sequence data from 8,000 pigs and imputed them to 350,000 genotyped pigs from an elite breeding nucleus population. With this dataset, we studied the distribution of deleterious variants in the genome, between lines, and across generations. We also scored variant intolerance and evolutionary constraint genome-wide in the pig. These results form the basis for designing next-generation sequence based breeding strategies.

Speakers and abstracts 101 P46 AMY FINDLAY MRC Human Genetics Unit, The University of Edinburgh

Fam151b, homologue of C. elegans menorin, is necessary for photoreceptor survival in mice

Fam151b is a mammalian homologue of the C.elegans menorin gene, involved in neuronal branching. The International Mouse Phenotyping Consortium (IMPC) aims to knock out every gene in the mouse and comprehensively phenotype the mutant animals. This project identified Fam151b mutant mice as having retinal degeneration. We show they have no photoreceptor function from eye opening, demonstrated by a lack of electroretinograph (ERG) response. Histological analysis has shown that during development of the eye, the correct number of cells are produced and the layers of the retina differentiate normally. However, after eye opening at P14, Fam151b mutant eyes exhibit signs of retinal stress and rapidly lose photoreceptor cells. We have mutated the second menorin homologue, Fam151a, and mutant mice have no discernible phenotype. Sequence analysis indicates that the Fam151 proteins are members of the PLC-like phosphodiesterase superfamily, but the substrates and function of the proteins remains unknown.

P47 JON BANCIC The Roslin Institute, The University of Edinburgh

Plant breeding designs for intercropping

Intercropping is an agricultural practice in which two or more component crops are grown and managed in the field simultaneously. Allowing a more diverse production in the field positively influences agricultural land and provides more resilience to the farmer. Despite this, breeding for intercropping has been largely ignored due to strong preferences for simpler monoculture farming practices and uniform production systems. The existing elite monoculture varieties are not necessarily adapted to combine well with other plant species. Due to recent renewed efforts towards supporting more sustainable and diversified production, breeding for intercropping has become an important topic of research as its implications could potentially be applied in emerging and advanced economies. In this study, the aim was to propose and evaluate different intercrop breeding programmes for two plant species. Genomic selection is a novel breeding strategy that has revolutionised plant breeding in recent years and offers a unique opportunity to overcome some of the obstacles associated with breeding for improved intercrop varieties. Five different breeding programmes with and without genomic selection were developed based on the knowledge gained from hybrid breeding programmes and the limited intercrop breeding literature.

102 Speakers and abstracts Stochastic simulations were used to compare the five breeding programmes under each of three different budget and trait correlation scenarios. Budget scenarios were $250K, $500K and $1 million to represent available resources to the breeder. Trait correlation scenarios between line’s two traits representing monocrop grain yield and intercrop grain yield were 0.4, 0.7, and 0.9. A key finding was that all intercrop breeding programmes implementing genomic selection set within budget constraints achieved higher genetic gains compared to a proposed phenotypic selection intercrop breeding programme, regardless of the budget or trait correlation scenario. Surprisingly, different genomic selection breeding programmes performed better under trait correlation scenarios as a result of programme’s ability to predict intercrop performance. This is the first study using stochastic simulations that has investigated the maximisation of genetic potential for intercrop cultivars. The simulation approach serves as a good starting point to showcase the potential of intercrop breeding programmes to encourage otherwise reluctant plant breeders to adopt new breeding strategies. Our simulations have demonstrated the potential of intercrop breeding programmes with genomic selection to perform well even under limited budget and low trait correlation scenarios. However, further research is required to address an issue of crop interaction and economic importance of each crop. Such work will ultimately support intercrop adoption and the diversification of food systems.

P48 YIZHOU YU Imperial College London

Reactive oxygen species as a signalling mechanism in homeostatic sleep regulation

It is widely known that sleep duration varies across individuals. Recent high-throughput experiments assaying sleep in about 500 Drosophila melanogaster samples indicate that some wild types (Canton-S) are virtually sleepless. Importantly, short sleep phenotypes can be inherited (Serrano et al., 2018). However, the implication of short sleep is unclear. Here we propose the hypothesis that reduced sleep leads to increased activation in dopaminergic neurons, which produces reactive oxygen species (ROS). ROS reduces the ratio of nicotinamide adenine dinucleotide (NAD) to its oxidised form (NADH), which act as a signalling mechanism to increase sleep pressure. A the MAK037 kit was employed to measure the average NAD/NADH ratio. 6 dissected brains were used in each biological replicate, and the difference between ratios were assessed using the Wilcoxon rank sum test. All changes were significant (p < 0.05) unless indicated. We measured a 1.67-fold decrease in the NAD/NADH level in the brain of inbred short sleeping mutants in comparison to control (Canton-S). This level is comparable to the 1.83-fold decrease observed in hyperkinetic mutants, which have a known short sleeping phenotype due to a mutated potassium channel. Similarly, a 1.9-fold decreased was also observed in brains of mechanically sleep deprived Canton-S flies.

Speakers and abstracts 103 In order to observe the reverse effect of ROS on sleep, the sleep fraction of Canton-S, short sleeping and hyperkinetic flies were calculated using Rethomics. We observed that the sleep fraction of short sleeping flies increased by 1.70 fold, and that of hyperkinetic mutants increased by 2.42 fold when fed with the oxidant H2O2 (5.0 mM). However, the sleep fraction in Canton-S flies fed with oxidant food did not significantly change (p ≈ 0.03). In conclusion, we suggest that short sleep could lead to increased oxidative stress, and increasing oxidative stress by food intake in turn leads to more sleep in these mutants. Notably, the sleep fraction of longer sleeping Canton-S flies was not affected by oxidant intake, possibly due to the inherent capacity of longer sleeping flies to cope with a small dietary increase in H2O2. Our results contribute to bridging recent hypotheses on sleep regulation and mechanisms of neurological diseases. Ongoing experiments aim to increase the number of biological replicates of the NAD/NADH assay using controls with different genetic backgrounds.

P49 JULIANE FRIEDRICH The Roslin Institute, The University of Edinburgh

Assessing the genetic contribution to complex behavioural traits in German Shepherd dogs

The close co-existence with humans, favourable genetic structure and intense selection for behaviour characteristics resulting in a high diversity of behavioural features highlights the potential of dogs for studying the genetic contribution to behaviour variance. Results from these studies can have implications for animal welfare and human conditions. However, behaviours are complex traits, which have been shown to be influenced by numerous genetic and non-genetic factors, complicating their analysis and demanding innovative approaches. In this study, we examined the genetic contribution to behaviour variance by combining quantitative and population genetics approaches. First, the association between behavioural traits of German Shepherd dogs (GSDs) and genetic variants was analysed, while accounting for various environmental factors. Several behavioural traits that were defined based on dog owner responses to questions of the established Canine Behavioural Assessment and Research Questionnaire (C-BARQ) exhibited moderate heritabilities, with the highest estimates identified for Playfulness and Non-social Fear. Furthermore, we identified genomic regions distributed over 17 autosomes that showed at least chromosome-wide significant association with the analysed traits; for several traits, multiple regions were identified, supporting the hypothesis that behaviour traits are influenced by multiple genes.

104 Speakers and abstracts Second, we exploited the different breeding backgrounds of the analysed GSD cohorts to identify signatures of recent selection for behaviour. Several statistics, e.g., the integrated haplotype score, the fixation index and the cross-population extended haplotype homozygosity, were calculated to detect these signatures within and between cohorts. Putative selection signatures within cohorts were found in regions on CFA4 and CFA19, both of which harbour behaviour-related candidate genes. Target regions for divergent selection between cohorts were found on CFA12 and CFA32, both of which have been described previously to be associated with morphological and behavioural variation in different dog breeds.

P50 BENJAMIN LIVESEY The University of Edinburgh

Using Deep Mutational Scanning data to benchmark computational phenotype predictors

Genome sequencing is becoming an increasingly useful tool for the diagnosis and treatment of genetic disease, however a large proportion of the coding variants we identify are variants of unknown clinical significance. In order to address this issue, many groups have developed computational phenotype predictors to help classify these variants as damaging or benign to protein function. The datasets used to assess predictor accuracy vary between groups, hindering a direct comparison of predictive ability. Furthermore, predictors based on machine learning methods may make biased predictions on mutations which were present in their training set. We present deep mutational scanning, a high throughput experimental measurement of variant phenotype as a source of unbiased datasets which can be used to benchmark computational phenotype predictors. These experiments produce large numbers of rare or novel mutations which are not present in existing training and testing data. In this study, we compare the results from 29 deep mutational scans of human, yeast, bacterial and viral proteins with 29 different computational prediction metrics. This data is used to rank the predictor performance for each group of organisms. Our results identify DeepSequence, SNPs&GO and SNAP2 as among the best correlated metrics with deep mutational scanning data. We also show that deep mutational scanning is comparable to even biased methods for predicting the effects of known pathogenic and benign mutations. We hope that deep mutational scanning datasets will, in the future be used as an independent benchmark of computational phenotype predictor performance in order to provide an unbiased performance metric and allow predictors to be easily compared.

Speakers and abstracts 105 P51 LETÍCIA LARA Luiz de Queiroz College of Agriculture, University of São Paulo

Mixed models with different variance-covariance structures for genetic and residual effects in forage breeding

In forage experimental designs, it is common to take repeated measures of a given trait in the same plot and its repeatability in successive measures (harvests) is estimated by linear mixed models (LMM) with variance homogeneity and absence of correlation. However, the main feature of LMM methodology is to allow modelling heterogeneous and correlated variance and covariance (VCOV) structures for genetic (G) and residual (R) effects as well as to include information from relatives, exploiting genetic correlation by an additive relationship matrix (A). In this work, we aimed to compare three different classes of models: genetic effects fitted by compound symmetry (CS), genotypes assumed as not correlated, and residual effects assumed with homoscedasticity and absence of correlation (Class A); genetic and residual effects with different VCOV and genotypes not correlated (Class B); genetic and residual effects with different VCOV and genotypes correlated by an A matrix based on pedigree values (Class C). The experiment was conducted in an augmented block design, with 570 sexual plants as treatments and three cultivars as controls in six blocks. Individual plants were evaluated in eight harvests for total green matter (TGM), total dry matter (TDM), leaf dry matter (LDM), stem dry matter (SDM), and percentage of leaf blade (PLB). Analyses were performed using software GenStat17th. Matrices G and R were analysed considering six different structures: Identity (ID), Diagonal (DIAG), CS, Heterogeneous CS (CS-Het), Power (Po), and Factor analytic (1) (FA1). Matrix A was obtained by R package AGHmatrix. Model selection was performed based on AIC and BIC criteria. Correspondence among the 20 best plants selected by different classes was evaluated by coincidence index. For all traits, Class C had a better fit. Generalised heritability ranged from 34.93% (PLB) to 87.54% (LDM). In general, lower heritability traits also had lower coincidence of the 20 best plants selected. Given these results, we recommend to incorporate A matrix as well as to model genotype x harvest interaction by Po structure in longitudinal mixed models in Panicum maximum. Furthermore, an A matrix estimated by molecular markers can be used instead of pedigree to increase the accuracy in predicting genetic values.

106 Speakers and abstracts P52 CAL DONNELLY The Roslin Institute, The University of Edinburgh

A matter of life and death; identifying host genomic factors that determine susceptibility of chickens to highly pathogenic avian influenza

DNA samples collected from survivors of recent outbreaks of highly pathogenic avian influenza (HPAI) in Mexico and the USA have provided a rare opportunity to study the genetic mechanisms underpinning susceptibility of chickens to this devastating and economically impactful disease which normally exhibits 70-100% mortality in the chicken host. Whole Genome Sequence (WGS) data has been used to perform Genome-Wide Association Studies (GWAS), which have highlighted single nucleotide polymorphisms (SNPs) segregating significantly between survivors and controls, with a pedigreed experimental group exhibiting a highly significant signal on chicken chromosome 2 in the region of a biologically relevant gene. Candidate SNPs for resilience are currently being validated using in vitro gene editing methodologies that modulate candidate gene expression to investigate the effect on viral replication and cellular response to HPAI infection. A detailed understanding of the genomic resilience to HPAI from this study will have implications for both the poultry industry and for public health.

THE GERMLINE, SEX DETERMINATION & SEX CHROMOSOMES

P53 PHILIPPA JACKSON MRC Human Genetics Unit, The University of Edinburgh

Determining the mechanistic importance of DNA methylation for Polycomb targeting in mESCs

Following fertilisation and during early development gene expression must be tightly regulated and coordinated to promote appropriate gene expression patterns. This includes silencing and activation of genes, which in mammalian cells is linked with DNA methylation and histone modification states. Previous work has shown that removal of global DNA methylation in Dnmt1 hypomorphic mouse embryonic fibroblasts results in the redistribution of H3K27me3, a repressive histone mark deposited by Polycomb Repressive Complex 2. This resulted in ectopic deposition of H3K27me3 in regions previously highly DNA methylated, and a dilution of PRC2 away from its normal targets, and consequently aberrant gene repression and activation. However, the molecular mechanisms underlying this redistribution have not been elucidated.

Speakers and abstracts 107 This project aims to dissect the importance of DNA methylation for PRC2 distribution, and the kinetics of recruitment or loss of these marks from different gene targets. Stable mESC lines will be generated using Recombination Mediated Cassette Exchange, a technique which allows efficient insertion of sequences of interest into an epigenetically inert locus by Cre-Lox recombination. Here, these sequences will include promoters of genes strictly controlled by DNA methylation or polycomb proteins, and retroviral sequences. The methylome will be manipulated in these lines, and the epigenetic response of the ectopically introduced sequence investigated. In this way we can study changes in methylation level, protein recruitment, histone PTMs, and chromatin conformation of a specific locus induced by gain or loss of DNA methylation, and how this response varies with target and tissue type.

P54 MARC STIFT University of Konstanz, Germany

At least two loci involved in the loss of self-incompatibility in Arabidopsis lyrata

Breakdown of self-incompatibility (SI) and subsequent shifts to inbreeding have intrigued evolutionary geneticists for decades. In self-incompatible plants, self-pollen is rejected due to the simultaneous expression of female recognition proteins in the stigma and matching male-recognition proteins in the pollen. The S-locus, a cluster of a female and a male self- recognition gene, encodes these complementing recognition proteins. In species with a functional SI system, multiple S-locus specificities (matching female and male recognition pairs) are maintained in the population. The intraspecific breakdown of SI in North American Arabidopsis lyrata provides an excellent system to unravel the genetic basis of the breakdown of SI. The fixation of certain S-specificities in selfing populations of this species (S1 in two populations, S19 in three populations) suggests a functional link between SC and the S-locus, for example due to a loss-of-function mutation at these haplotypes that compromises self-recognition. However, evidence contradicts this, leading to the hypothesis that more than one locus is responsible for SC. To test this, we crossed self- incompatible and self-compatible plants (SIxSC). Specifically, we tested whether the frequency of SC and SI progeny could fit single-locus models. We found roughly equal proportions of SC and SI progeny, which could fit a single-locus model only if the allele causing SC is recessive, and occurs at 50% frequency in outcrossing populations. However, the latter does not fit the observation that SC individuals are quite rare (at a frequency well below 25%) in outcrossing populations. Therefore, we propose that two-locus models are more likely. More specifically, we propose that the expression of the S-haplotypes fixed in selfing populations is affected by genetic modifiers unlinked to the S-locus.

108 Speakers and abstracts P55 CAROLINA BARATA University of St Andrews

Bait-ER: fishing for targets of selection in a sea of alleles

Combining high-throughput next-generation sequencing with Experimental Evolution is a powerful approach to describe the evolution of allele frequencies in laboratory experiments. Researchers are thus able to re-sequence experimental populations to test evolutionary theory predictions. These Evolve and Re-sequence (E&R) experiments are especially useful for detecting signatures of short-term adaptation in the genome. One cost-effective way of estimating allele frequencies is to sequence pools of individuals (Pool- Seq). Though practical, pooled sequencing generates allele frequency variance due to finite sequencing depth. Moreover, most laboratory populations are small, where genetic drift plays a substantial role in determining the fate of most polymorphic sites. Accordingly, it is anything but trivial to test for selection in the genome and, particularly, to estimate selection coefficients. Despite numerous efforts to investigate allele frequency changes in such E&R experiments, most methods that aim at detecting selection still suffer from high false positive rates and low statistical power. For that, we have developed a fully Bayesian approach to estimate selection coefficients aimed at E\&R allele frequency datasets. Our method is based on the Moran model of nucleotide evolution which allows for overlapping generations. We also employ a statistical test that uses Bayes Factors to compare two alternative non-nested models of evolution - neutrality and genetic drift vs positive selection. We have tested our method using simulated data and compared its performance to that of other available software. It is comparable to other approaches in terms of computational time and shows high accuracy even for complex demographic scenarios. Last but not least, we have analysed a Drosophila pseudoobscura time series dataset which looks at sexual selection in populations of naturally polyandrous flies. These flies were reared in the laboratory for more than 15 years and subjected to altered mating system treatments. Predictions suggest monogamy will reduce sexual conflict and elevated polyandry will aggravate it. Will our inferred targets of selection match patterns of reduced effective population size on the X chromosome vs autosomes?

Speakers and abstracts 109 P56 ANDRÉS G. DE LA FILIA The University of Edinburgh

Whole-genome maternal imprinting in mealybugs under paternal genome elimination (PGE) revealed by allele-specific RNAseq

Mendelian laws of gene transmission and expression, whereby genes are equally likely to be transmitted and expressed regardless of the parent they were inherited from, are ubiquitous across diploid organisms. However, exceptions to these laws are rife across eukaryotes: meiotic drive (non-random segregation during gametogenesis) and genomic imprinting (parent-of-origin-specific expression) are two widely studied departures from Mendelian rules. One of the most dramatic examples of Mendelian violations that combines aspects of both these phenomena is paternal genome elimination (PGE), a type of reproduction under which diploid males only transmit maternally-inherited chromosomes to the offspring, while the paternal set is excluded from sperm. Moreover, in many PGE taxa, such as the mealybug Planocococcus citri (Hemiptera:Pseudococcidae), paternal chromosomes are tightly condensed in early development. For decades, silencing of paternal chromosomes has been assumed through a combination of cytological and classic genetics studies, but a direct genome-wide confirmation has been thus far lacking. Here, we present a parent-of-origin allele-specific RNAseq gene expression analysis in somatic and reproductive tissues of male offspring from hybrid and intraspecific crosses, confirming that gene expression is globally, but not completely, biased towards the maternal genome. We analysed parent-of-origin expression patterns in >6,000 genes and found exclusive maternal expression in 25% of somatic genes (80% in testes), with a further 70% exhibiting maternally-biased expression. Finally, we performed a functional interrogation analysis of the small fraction of genes exhibiting biparental expression in order to understand why silencing of the paternal genome is incomplete. Together, our results reveal an extreme form of tissue-specific, whole-genome imprinting and offer a first insight into the dynamics of male-specific gene regulation and potential intragenomic conflict between parental genomes in this species.

P57 CHRISTINA HODSON The University of Edinburgh

Evolution of germline specific chromosomes in a fungus gnat Sciara( coprophila)

Mendel’s laws of inheritance are applicable to most sexual species. However, some species exhibit unusual chromosome inheritance systems (either for specific chromosomes or entire sets of chromosomes). Understanding these systems better provides information about why Mendel’s laws are followed by most species, as well as how unconventional chromosome inheritance systems evolve. A good system to explore this topic in is the fungus gnat species Sciara coprophila. This species has been of interest to biologist for

110 Speakers and abstracts over a century due to its intriguing chromosome dynamics. It exhibits paternal genome elimination, a system of reproduction whereby males only transmit maternally inherited chromosomes to offspring, sex determination via X chromosome elimination in somatic cells early in development rather than by the number or type of sex chromosomes inherited from the parents, and germline specific chromosomes that are eliminated in somatic cells but retained in the germline. Very little is understood about how this interesting genetic system has evolved, however, one theory suggests that the germline restricted chromosomes have evolved from X chromosomes. We are focusing on understanding how the germline restricted chromosomes in Sciara coprophila evolved. We have generated genome sequence data and RNAseq data from the germline and somatic cells of Sciara corprophila to determine what genes are present on the germline specific chromosomes, what their function may be in development and reproduction, and whether expression of genes on these chromosomes is sex specific. We have found that the germline specific chromosome makes up approximately 53Mb of the 274Mb genome for this species and contains numerous genes which we are still determining the function of. We are also working to understand how these chromosomes have evolved by comparing the gene content on the germline restricted chromosomes to the X chromosomes (from which they have been theorised to evolve) and comparing the rates of evolution of germline restricted chromosome genes to genes on other chromosomes. This will also allow us to determine how this system compares to other systems with germline specific chromosomes (such as the zebra finch which has a large germline specific chromosome which carries genes paralogous to autosomal genes and which is enriched in genes that function in reproduction).

P58 SIMON BROWN MRC Human Genetics Unit, The University of Edinburgh

Lymphoid Specific Helicase (LSH) Regulates Chromosome Synapsis and Meiotic Recombination During Mouse Spermatogenesis

Meiotic recombination shuffles alleles of the parental chromosomes, setting up the recombination landscape of the genome that shapes genetic diversity and evolution. Lymphoid specific helicase (LSH) is a member of the SNF2 family of chromatin remodelling enzymes that is required for de novo DNA methylation of repetitive sequences and some single copy genes, and has a poorly defined role in progression through meiosis. Lsh- /- spermatocytes have defects in pairing homologous chromosomes during meiosis, but despite the phenotypic similarity to other mouse mutants with defects in establishing or maintaining DNA methylation, it is unclear why homologous chromosomes do not pair properly in the absence of Lsh, or whether this meiotic error occurs due to problems in establishing or maintaining de novo DNA methylation. To investigate the role of Lsh in mammalian meiosis, we are using a conditional mouse knockout strategy to delete Lsh during (Ddx4-Cre) or after (Stra8-Cre) de novo DNA methylation, but prior to the onset of meiosis in the male germline.

Speakers and abstracts 111 As expected from published analyses of Lsh-/- mice, Ddx4-Cre Lsh cKO mice exhibit defects in pairing homologous chromosomes during meiosis, and increased cell death at this stage of spermatogenesis. Thus, Lsh has a crucial function during de novo DNA methylation that impacts on chromosome behaviour during meiosis. Loss of Lsh after de novo methylation has occurred (Stra8-cre Lsh cKO) also results in defects in pairing homologous chromosomes during meiosis and increased spermatocyte death at this stage, suggesting that Lsh is required after de novo methylation has been established in the male germline. But surprisingly, in contrast to other DNA methylation deficient mutants, Stra8-Cre Lsh cKO, but not Ddx4-Cre Lsh cKO, mice have defects loading components of the meiotic recombination machinery involved in the homology search onto chromosomes prior to homologous chromosome pairing. Thus, despite the common arrest point, Stra8-Cre Lsh cKO exhibit clear phenotypic differences from other DNA methylation mutants, and unexpectedly exhibit a meiotic recombination phenotype that is not evident in the earlier deleting Ddx4-Cre Lsh cKO mice. Our data suggests that late foetal prospermatogonia and/or spermatogonial stem cells are able to epigenetically compensate for loss of Lsh at this earlier stage of spermatogenesis, but more differentiated spermatogonia and/or spermatocytes cannot. These in vivo findings resonate with the ability of polycomb repressive complexes to compensate for loss of DNA methylation in embryonic stem cells, but not in more differentiated fibroblasts ex vivo. Understanding more about how Lsh fulfils its functions in meiotic recombination will generate insight into how the epigenetic chromatin environment can influence the meiotic recombination landscape.

P59 SILVIYA DIMOVA The University of Edinburgh

Targeting of Polycomb to Heterochromatin when perturbing DNA methylation

Epigenetic factors such as DNA methylation and post-translational modifications of the tails of histone proteins are implicated in regulating genome organisation throughout early embryo development. In vitro mouse embryonic stem cell (mESC) models have demonstrated that loss of DNA methylation is associated with the redirection of Polycomb Repressive Complex 2 (PRC2) away from its target sites and the subsequent deposition of its mark – H3K27me3 at the hypomethylated pericentromeric heterochromatic (PCH) regions within the nucleus. The PCH is composed of repetitive sequences, mainly major satellite repeats, which are known to be highly methylated in WT mESC and marked with the constitutive heterochromatin modification H3K9me3. Interestingly, upon depletion of H3K9me3 in knockout experiments, H3K27me3 is also found to be relocated to PCH. A cell line (CH9 cells) has been generated in which the endogenous maintenance DNA methyltransferase –DNMT1 contains a TET-OFF cassette so that the gene can be reversibly silenced upon treatment with doxycycline. This project aims to use the CH9 cells in conjunction with other differentially methylated mES cell lines and different culture media such as serum and 2i media to discern the steps required for Polycomb complexes (PRC1-

112 Speakers and abstracts H2AK119ub and PRC2’s –H3K27me3) to be redistributed to PCH immediately following DNA demethylation. Additionally, the interplay between H3K9me3 and H3K27me3 in silencing major satellite regions and the role of PRC2’s H3K27me3 as a potential backup silencer of pericentric repetitive sequences are investigated. Results so far, demonstrate that DNA demethylation occurs shortly after doxycycline treatment and is quickly followed by relocalisation of Polycomb marks to PCH. Re-expression of Dnmt1 restores DNA methylation and H2K27me3 profiles.

HUMAN GENETIC VARIATION: FROM MOLECULES TO POPULATIONS

P60 SEBASTIAN MAY-WILSON The University of Edinburgh

Prediction of blood proteins via multiple-tissue expression models

More and more studies are relying on integrating transcriptomics data, in the form of eQTLs, in order to provide more power for the discovery of causal loci and provide evidence of biological function. However, despite their use in examining differential expression there is increasing evidence that eQTLs do not predict translated protein levels effectively. This is believed to be due to the many layers of control that exist between transcription and protein translation. This includes multiple splicing variants, the existence of trans-eQTLs, mRNA half-life, and the rate of translation which can be influenced by a number of factors. Another possibility is that the protein levels in specific tissues may be modulated by exportation of proteins from one tissue to another. To examine this possibility, we conducted multiple attempts to generate a polygenic model for predicting protein levels in multiple tissues simultaneously. This was done using GTEx eQTLs modelled against approximately 1,000 proteins measured in blood samples from 1,000 individuals in the ORCADES dataset, an isolated and richly phenotyped population study. We compared these models against predictions created for individual tissues. Prior to this we compared the computational methods of PrediXcan and LDpred for creating polygenic weights, to determine which algorithm is superior for creating the predictive scores, however this had mixed results and had results which were somewhat contentious. An initial attempt at creating the multi-tissue model against individual tissue scores had a higher predictive power in 54% for the 711 genes carried through the analysis. This model was created using 800 individuals from ORCADES and tested in the remaining 200. Then when comparing the predictive scores for specific tissues, the whole blood eQTLs only best predicted the blood proteins in 6% of genes, of which half were improved on by the multi-tissue model. However, the initial multi-tissue model is believed to be prone to overfitting as it only accounted for testing within 200 individuals. Therefore, a subsequent k5-fold cross-validation model was created in which each individual is assigned to one of five groups and each group is tested against the other four.

Speakers and abstracts 113 The results from this model are believed to more closely resemble the truth and were much more conservative, with only ~33% of genes seemingly better predicted by the new multi-tissue model. However, when comparing the likelihood-ratios of the new multi-tissue model with that of the individual tissues, the two models were only significantly different (P < 5 x 10-5) in 83 cases and the multi-tissue model best in 61 of these. Therefore, the multi-tissue model only significantly improves on protein prediction in blood in 8.5% of cases. Finally, an enrichment and pathway overrepresentation analysis was conducted on the results of the new model, demonstrating an over-representation of proteins involved in immune response were better predicted by the multi-tissue model, which might suggest that due to the secretion of immune response proteins into surrounding tissue, these proteins may specifically adhere to the initially stated hypothesis.

P61 MAXIM FREYDIN King’s College London

SNP-by-sex interaction analysis for chronic back pain

Gender differences in the prevalence of chronic back pain (cBP) have been recorded, but not fully explained. Suggesting gene-by-sex interaction as a possible reason, we explored sex-specific genetic loci for cBP via a genome-wide association study (GWAS) using the UK Biobank dataset. A total of 200,948 males and 236,412 females of European ancestry have been analysed. The phenotype was “Back pain for 3+months”. The GWAS were carried out examining males and females separately adjusting for age and technical covariates, followed by a comparisons of SNP effect sizes (regression estimates) between the sexes using t-statistics. The prevalence of cBP was 17.7% and 18.2% and the mean age (SD) was 57.5 (8.1) and 57.1 (7.9) in males and females, respectively. Four and 8 non-overlapping genome- wide significant loci were identified for males and females. SNP-explained heritability was higher in females (0.079 vs 0.067, p = 0.006). There was a high genetic correlation between the sexes (r = 0.84, p = 1.7e-94). The GWAS comparing the effect size between the sexes revealed three significant loci: on chromosomes 4 (rs62327819, SLC10A7), 6 (rs556434841, MEI4), and 8 (rs184345217, CSMD1). We found differences in genetic correlations between sexes for four traits: “Neck and shoulder pain”, “Serious illness, injury or assault to yourself”, “Dorsalgia”, and “Self-reported disc problems”. The results suggest modest differences in the genetic component of cBP in males and females. The mechanisms underlying these differences and their impact of cBP epidemiology are to be explored. The study was performed using the UK Biobank data (project # 18219). We are grateful to the UK Biobank participants.

114 Speakers and abstracts P62 REEM ALHAMIDI University of Nottingham

Determining the relationship between DEFA1A3 copy number variation and mRNA expression levels

Alpha-defensins are small cationic peptides that have important roles in host defence of innate immunity. They possess anti-microbial properties and are highly expressed in neutrophils. Alpha-defensin 1 (DEFA1) and alpha-defensin 3 (DEFA3) genes encode Human Neutrophil Peptides 1 and 3 (HNP1 and HNP3), respectively. Both peptides have similar sequences except they differ in their first amino acid. Both genes are clustered on chromosome 8 (8p23.1). Approximately 10% of the population lack the DEFA3 gene; primates lack DEFA3 completely which suggests that it is human-specific and has emerged from DEFA1. Both genes have high similarity in sequences, so that the locus symbol DEFA1A3 has been introduced, which means that each haplotype carries a variable number in a tandem array of a 19 kb full repeat (consisting of DEFA1/DEFA3 and DEFT1P pseudogene) and a single copy of a 10 kb centromeric partial repeat (consisting of DEFA1/ DEFA3). DEFA1A3 is one of the many complex multi-allelic copy number variations (CNV) found in humans. Because of its complexity, measuring DEFA1A3 copy number becomes increasingly difficult as the gene CN increases, therefore finding an application to produce accurate estimates of copy number is essential. It was reported that individuals exhibit 3 to 16 copies of DEFA1A3 for the European population. Normally with gene expression, it is expected that copy number should correlate with the levels of mRNA expression, which implies that lower numbers of copies should yield lower gene expression. Investigations of DEFA1A3 copy number relationship with mRNA levels is inconclusive so far, with studies conducted so far having small sample size. DEFA1A3 copy number measurements from the 1000 Genomes Project (1kGP) with 2504 samples across different populations and the BLUEPRINT consortium (BP) data with 197 samples were determined. The DEFA1A3 copy number ranged from 3 to 16 copies in 1kGP and 4 to 12 copies in BP. Samples NA12878 and NA12760 with 5 long amplicons each (tiling path) were used for sequencing that was loaded to a flongle with 126 channels from Oxford Nanopore Technologies (ONT). The amplicons span the 19 kb repeat and flanking regions in the telomeric and centromeric regions from the gene. Each amplicon is ~6kb in length. The 24 hr run produced 10.8k reads yielding 26.76 Mb with only 20.52% of reads passing the quality threshold. The amplicons were separated to determine the haplotypic structures. NA12878 and NA12760 showed 1,611 and 1,265 reads mapped to the reference sequence, respectively. The BP consortium datasets will be used to determine gene expression levels and investigate the relationship between DEFA1A3 copy number and mRNA expression levels. Nanopore amplicon sequencing will be used to explore the haplotypic structures of NA12878 and NA12760. We are expecting to obtain 5 copies of DEFA1A3 and 3 copies of the DEFT1P pseudogene from both samples. The ratio of DEFA1:DEFA3 in NA12878 is 4:1 while in NA12760 it is 3:2.

Speakers and abstracts 115 P63 LIDIYA NEDEVSKA Oxford Brookes University

From rare to common variants: genetic overlaps between Usher Syndrome, hearing, auditory processing and speech/language abilities

Detecting human DNA variation for the first time using restriction fragment length polymorphisms took a lot of effort, time and resources. Nowadays, we are able to cheaply and quickly sequence a whole human genome and compare all the variation between one person and thousands of others. The analysis and classification of variants contributing to typical ability and those which cause disease, however, is just as challenging. Traditionally and based on inheritance, diseases would be grouped into Mendelian (very rare disorders with monogenic inheritance) and complex (common disorders with polygenic inheritance plus environmental and genetic interactions). Numerous examples from the literature have shown us that there is no clear-cut separation and that rare and common variants often play a role in the same disease. In this work, we consider variations across the Usher Syndrome Type II gene USH2A and find overlapping and interacting genetic mechanisms that contribute to hearing, auditory processing and the risk of a speech/language disorder. This study combines whole genome sequencing data from a family affected by a severe form of language disorder with genetic characterisation of two large population cohorts (the Avon Longitudinal Study of Parents and Children and UK10K). We apply three models to look for variants in USH2A that may impact on hearing and language abilities: monogenic model (searching for very rare causal variants with high effect sizes), common disease- common variants model (searching for risk common variants with low risk effect sizes and allowing for interaction variables) and common disease- rare variants model (searching for rare risk variants with high effect sizes). A genetic knockout of the recent homolog Ush2a was assessed to complement human findings. Sequencing analyses identified a very rare stop-gain mutation in USH2A which co- segregated with language disorder in the discovery family. Mouse data indicated that heterozygous disruptions of Ush2a are associated with modified low-frequency hearing and altered vocalisations. According to the monogenic model, pathogenic USH2A variants in humans do not result in an apparent phenotype when present in a heterozygous form. In contrast, our data indicate that these variants may play a role in a complex genetic model where common USH2A variants when combined with altered low-frequency hearing thresholds lead to worse language outcomes. By applying three different models to study the impact of genetic variation, we identify a complex, interactive genetic mechanism. We hypothesise that USH2A common risk variants contribute to low-frequency hearing loss, altering auditory processing and thus increasing the risk of language disorder. These findings support emerging ideas around genetic complexity and indicate a continuous model of complex genetic risk that includes multiple interacting factors.

116 Speakers and abstracts P64 ARCHIE CAMPBELL The University of Edinburgh

Generation Scotland - Using Electronic Health Records for Research

Generation Scotland: Scottish Family Health Study (GS:SFHS) is a family-based genetic epidemiology study of ~24,000 volunteers from ~7000 families recruited across Scotland between 2006 and 2011 with the capacity for follow-up through record linkage and re-contact. Broad consent was obtained for linkage to “medical records” for 98% of the cohort. This created a resource for investigation of the genetics of common conditions, available to researchers in Scotland, the UK and worldwide. Participants completed a demographic, health and lifestyle questionnaire, provided biological samples, and underwent detailed clinical assessment. The samples, phenotype and genotype data collected form a resource with broad consent for research on the genetics of health, disease and conditions of current and projected public health importance. This has become a longitudinal dataset by linkage to routine NHS hospital, maternity, lab tests, prescribing and dental records, using the Scottish Community Health Index (CHI). Mortality data has been obtained from the General Register of Scotland. We plan to use the SHARE register to obtain new research samples from routine NHS tests. Researchers can now use the linked datasets to find prevalent and incident disease cases, and healthy controls, to test research hypotheses on a stratified population. They can also do targeted recruitment of participants to new studies, including recall by genotype, utilising the NHS CHI register for current contact details. We have now established and validated EHR linkage, overcoming technical and governance issues in the process. Using consented data avoids some limitations of safe havens for analysis. Genome-wide association studies (GWAS) have been done on a wide range of quantitative traits and biomarker measurements. Generation Scotland is a contributor to major international consortia, with collaborators from many institutions from the UK and worldwide, both academic and commercial. GS has also collaborated with Dementia Platforms UK and Health Data Research (HDR) UK to make the resources more widely known. There have been over 300 research collaborations, and GS data has contributed to 200 published papers, with more in the pipeline. Generation Scotland have thoroughly tested the linkage process and are now extending it to include primary care data (GP records) and scanned images (PACS) this year. We have been awarded a substantial grant from the Wellcome Trust to extend the cohort, and plan to recruit another 20,000 participants. (“NextGenScot”). We will contact current volunteers to invite new and younger family members to take part. Any family member living in Scotland, aged 14 or over, not already in the Scottish Family Health Study, will have the chance to take part. New families will also be asked to volunteer from other studies, such as the Aberdeen Children of the 1950s. The GS resources are a Research Tissue Bank available to academic and commercial researchers through a managed access process (www.generationscotland.org). Participants can be invited to take part in new studies, targeted by phenotype, genotype, or presence or absence of a condition in their medical records. Speakers and abstracts 117 P65 SEAN BANKIER Centre for Cardiovascular Science, Queen’s Medical Research Institute

Causal inference approaches for the reconstruction of transcriptomic networks from common genetic variants influencing plasma cortisol

A genome wide meta-analysis by the CORtisol NETwork (CORNET) consortium has identified genetic variants spanning the SERPINA6/A1 locus on chromosome 14 associated with changes in morning plasma cortisol and predictive of cardiovascular disease. SERPINA6 encodes Corticosteroid Binding Globin (CBG) which binds most cortisol in blood and influences delivery of cortisol to target tissues. We hypothesised that genetic variants in SERPINA6 influence CBG expression and cortisol delivery to tissues which promote cardiovascular disease, reflected in tissue-specific variation in cortisol-regulated gene expression. The Stockholm Tartu Atherosclerosis Reverse Networks Engineering Task study (STARNET) provides genome wide DNA and RNAseq data in seven vascular and metabolic tissues from over 600 patients undergoing coronary artery bypass grafting. We used STARNET to link SNPs identified in CORNET to SERPINA6 transcript levels in liver and the expression of other trans-associated genes across all STARNET tissues. Pairwise causal inference was employed to reconstruct interactions between these genetic factors and their downstream targets using eQTLs as genetic instruments to infer causality. These gene-gene relationships can be used as the foundation for the reconstruction of overarching gene networks. We identified 21 SNPs that were significant in CORNET (p ≤ 5x10 -8) and cis-eQTLs for SERPINA6 expression in liver (q ≤ 0.05). Tissue-specific trans-genes in liver, subcutaneous and visceral abdominal adipose tissue were associated with these SNPs, with an over- representation of glucocorticoid-regulated genes. Of these trans-genes, the interferon regulatory factor IRF2, controls a putative glucocorticoid-regulated network with targets including LDB2 and LIP1, both associated with coronary artery disease. Motif enrichment was used to demonstrate an over-abundance of IRF2 and Glucocorticoid Receptor (GR) transcription factor targets within this network. We conclude that variants in the SERPINA6/A1 locus mediate their effect on plasma cortisol through variation in CBG expression in liver, and that variation in CBG influences gene expression in extrahepatic tissues through modulating cortisol delivery, notably in adipose tissue. The cortisol-responsive gene networks identified here represent candidate pathways to mediate cardiovascular risk associated with elevated cortisol.

118 Speakers and abstracts P66 LEE MURPHY The University of Edinburgh

Understanding how environmental factors, diet and gut micro-organisms influence Crohn’s Disease and Ulcerative Colitis flare

Crohn’s disease and ulcerative colitis are chronic, incurable inflammatory bowel diseases that are on the rise worldwide. PREdiCCt (The PRognostic effect of Environmental factors in Crohn’s and Colitis) is a major study that is focused on understanding how diet, lifestyle and the bacteria in the gut influence the characteristic disease flare and recovery in inflammatory bowel disease. We are recruiting 3,100 people with Crohn’s disease and ulcerative colitis who are in clinical remission. We collect a large amount of data at baseline including clinical information, lifestyle and diet, with monthly follow-up questionnaires through a digital patient portal. Patient follow-up is for two years. Biological samples are collected through the Edinburgh Clinical Research Facility Genetics Core. Stool samples are collected for calprotectin assay and DNA extraction for microbiome sequencing. Saliva samples are collected for whole-genome sequencing.

P67 SUSAN FAIRLEY The European Bioinformatics Institute

Open human genomic variation resources in the International Genome Sample Resource (IGSR): building on the 1000 Genomes Project with new populations and data sets

The 1000 Genomes Project generated the largest wholly publicly accessible catalogue of common human genetic variation, making all data available and developing valuable, and widely used, reference resources. Uniquely, the data encompasses around 100 samples, including trios, from each of 26 populations, spanning five continents. Our work maintains and expands the resources created by the 1000 Genomes Project, updating the resources to GRCh38, adding new data sets generated on samples from the 1000 Genomes Project and also adding new populations, which were not part of the 1000 Genomes Project. The cell lines created by the 1000 Genomes Project enable researchers to generate new data from the samples. This year, we have added 30x coverage Illumina data for the 2,504 samples in the 1000 Genomes Project phase three panel to our collections. The sequence data was generated by the New York Genome Center (NYGC), funded by NHGRI. NYGC have also used GATK to generate comprehensive variation call from these data on GRCh38. Further analysis is in progress including our set of variation calls from FreeBayes and collaborations between the Human Genome Structural Variation Consortium (HGSVC)

Speakers and abstracts 119 and NYGC. All analysis results will be shared on our FTP site as they are completed. This data sits alongside the existing range of data types previously available for these samples in IGSR. Through ongoing collaboration with the HGSVC, a variety of data types have been added to our resources, including PacBio, Strand-seq, Oxford Nanopore and Bionano. It is planned that the range of samples for which some of these data types are available will continue to grow as the HGSVC commences a new phase of data generation. New populations have been added as well as new data types. We are performing variant discovery on 8x coverage data for four populations from the Gambian Genome Variation Project (GGVP), three of which were not part of the 1000 Genomes Project. These samples were sequenced by the Kwiatkowski lab (Wellcome Sanger Institute and Oxford). The site discovery process is similar to that used in our work recalling the 1000 Genomes Project data on GRCh38, using BCFtools, GATK UnifiedGenotyper and FreeBayes in site discovery. As new data is added to IGSR, including data from the Simons Genome Diversity Project (SGDP) and the Human Genome Diversity Project (HGDP), which add new samples using different population structures and naming, our work will continue to facilitate easily finding data as well as adding new methods for exploring the available data sets in our website’s data portal. We offer further support for accessing and understanding the data via our helpdesk at [email protected]

P68 JOSEPH GILBODY The University of Edinburgh

Harnessing haplotypes to hunt for hunks of human heredity: a case study searching for hidden heritability in depression using UK Biobank

Around 1 in 5 adults display symptoms of depression and it is the leading cause of world- wide disabilities. Depression is a trait with a strong genetic component, and heritability is estimated at 37%. However, genome wide association studies (GWAS) have only been able to identify single nucleotide polymorphisms (SNPs) that contribute ~9% to the variation of depression. This discrepancy could, in part, be due to the inability of GWAS to detect rare variants. Alternate forms of genomic analysis have been developed to complement GWAS approaches and identify those rare variants. A joint SNP-haplotype genome-based restricted maximum likelihood (GREML) regional heritability mapping method was applied in the case of depression (an exemplar polygenic trait with ‘hidden heritability’) using UK Biobank. This method jointly fitted SNP and haplotype-based genetic relationship matrices (GRM) at a local level while also fitting a genome relationship matrix to estimate the heritability explained by haplotype blocks. The use of haplotype based GRMs provides a greater ability to capture relationships when identifying rare variants due to segregating rare variation, and should assist in explaining the hidden heritability of depression. We examined the degree to which haplotype blocks

120 Speakers and abstracts that were previously identified as associated with depression using this methodology replicate across five geographically separated UK Biobank cohorts (Leeds, Nottingham, Bristol, Newcastle, Liverpool). A meta-analysis was then performed across the 5 cohorts to identify novel regions associated with depression. The joint haplotype-SNP GREML analysis examined the associations between 273 regions and depression across 5 UK Biobank cohorts, with replication occurring in 38% of the regions selected for significance in two previously analysed Biobank cohorts. The haplotype based model (for which the regions were selected) identified 102 regions above the significance threshold of -log10 (p-value) of 4.44 calculated using a Bonferroni correction for the number of independent test performed (1365), while the SNP based model identified 33 regions. Furthermore, the joint cohort meta-analysis of the regions identified 35 regions which are significant at a genome-wide level (p-values of between 2.79E- 07 and 1.31E-35 at a threshold of 4.33E-07). Importantly, none of these corresponded with variants identified in previous GWAS, and were novel variants strongly indicative of explaining hidden heritability. The ability to identify novel regions associated with depression and replicate those results across cohorts acts as a proof of concept for this novel methodology. Since these regions had not previously been identified using GWAS, this exploration shows that this method works in a complementary manner to other forms of analyses. Future work will replicate these analyses across every region in all 5 cohorts at a genome-wide scale, and is likely to offer the opportunity to identify a greater number of novel regions.

P69 MARIA TIMOFEEVA Institute of Genetics and Molecular Medicine, University of Edinburgh

Plasma 25-Hydroxyvitamin D concentration and dietary intake interact with genetic variation at the e-cadherin (CDH1) gene locus to exert clinically-relevant modifying effects on colorectal cancer risk.

Genome-wide association studies (GWAS) have identified over 90 common genetic variants that individually impart modestly increased colorectal cancer (CRC) risk. These account for <11% of the 2.2-fold excess familial risk of CRC. Some “missing heritability” may be attributable to interactions between genetic and environmental risk factors. Vitamin D is strongly associated with colorectal cancer risk in observational studies but not in randomised controlled trials or studies using Mendelian randomisation approach. 1,25-dihydroxycholecalciferol binds to its receptor ligand (VDR) and influences expression of numerous cancer-relevant genes. We examined whether vitamin D modifies the effect of common genetic variants associated with CRC risk. The Discovery Cohort comprised 2,000 cases and 2,236 controls from Scotland. Data on demographic, health, lifestyle and other variables were collected. DNA from blood samples was genotyped and plasma assayed for total 25-hydroxyvitamin D (25OHD) concentration by mass spectrometry. Case-control logistic regression was used to test for GxE interactions

Speakers and abstracts 121 between 93 CRC risk SNPs and vitD (25OHD concentration and dietary vitamin D intake). Case-only analysis was conducted in two Replication Cohorts (2,354 Scottish, 544 Croatian CRC cases). The 16q22.1 (e-cadherin) CRC risk locus, tagged by rs99299218, showed a significant interaction with plasma level (P=0.0002), supplementation (P=0.0028) and dietary intake of vitamin D (P=0.016). A clinically relevant effect of vitamin D was observed after stratification by rs9929218 genotype. Comparing the highest to lowest tertiles of plasma 25OHD, AA carriers had 3.7-fold (95%CI: 2.13-6.67) and GG carriers had 1.56-fold (95%CI: 1.25-1.92) greater CRC risk. The interaction effect was validated in Scottish and Croatian replication cohorts in univariate (p=0.00027; p=0.013) and adjusted case only analysis (p=0.0033; p=0.017). These findings establish that vitamin D deficiency differentially affects CRC susceptibility, dependent on genetic variation at the 16q22.1 CRC risk locus. The magnitude of effect is sufficiently large to be considered clinically actionable. These findings indicate that CRC risk attributable to vitamin D deficiency varies according to germline variation at known CRC risk loci. Consideration should be given to stratification by genotype of populations recruited to vitD intervention trials, thereby optimising statistical power and effect size.

P70 ERIN MACDONALD-DUNLOP The University of Edinburgh

Comparison of Multiple Omics Ageing Clocks

Biological age, a measure of deterioration and ageing that is distinct from chronological age has been found to predict disease and mortality. Since the first published measure of biological age: Hannum’s epigenetic clock, closely followed by Horvath’s epigenetic clock, ageing clocks have been built using telomere length, facial morphology, metabolomics, glycomics and proteomics. However, comparison between clocks has so far been limited. Fortunately, the ORCADES study is a deeply annotated cross-sectional cohort with approximately 1,000 individuals, where multiple omics assays such as circulating plasma proteins, lipids, glycans and metabolomics are available, enabling consistent comparisons in a common dataset. Elastic net regression were used to create multiple omics clocks in a training set and their accuracy assessed in a testing set and an additional independent cohort, in univariate and bivariate analyses. We replicate existing clocks, (with some improvement of our proteomics clock over Enroth et al’s, r =0.83) in predicting chronological age. We find our protein clock (r=0.93) out preforms our IgG glycan and metabolite clocks (r=0.74 and r=0.77 respectively). In pairwise bivariate analysis we find large but not complete overlap between clocks. We show that there is a significant difference of excess biological age (3-4 years) between current and non-smokers of the same chronological age and the effect of having formerly been a smoker on biological age. Finally, we show that increased age acceleration

122 Speakers and abstracts estimated by clocks are consistently associated with increased BMI, cortisol and stroke across omics clocks. Proteins more accurately predict chronological age than glycans and glycans can track the benefits of giving up smoking. Multi-omics clocks appear to track both overlapping and distinct aspects of ageing and offer the prospect of identifying distinct ageing pathways.

P71 KATHRYN JACKSON-JONES MRC Human Genetics Unit, The University of Edinburgh

A Novel Nonsense-Mediated Decay Pathway Factor Located at the Endoplasmic Reticulum

Nonsense mediated decay (NMD) is a translation-dependent quality control pathway that identifies and degrades mRNAs that would otherwise lead to truncated peptides and disease. Furthermore, NMD has also been shown to regulate post-transcriptional gene expression by degrading ~10% of correctly transcribed RNAs, and in this way fine-tunes many physiological processes including development, cell cycle, cell stress and tumorigenesis. Whilst the RNA helicase UPF1 is central to NMD across species, another protein NBAS has recently been shown to be essential for NMD in human cells, Caenorhabditis elegans and zebrafish. Unlike previously identified NMD factors located in the cytoplasm, NBAS is specifically tethered at the endoplasmic reticulum (ER). Around 30% of proteins are translated at the ER, mainly those in the secretome and membrane proteins. As such, we hypothesise a novel branch of NMD moderated specifically by NBAS that target proteins translated at the ER. We sought to investigate the targets of NBAS by comparing them with those of UPF1. To this end, we measured global RNA expression and stability changes in HeLa cells. Differential expression analysis showed a significant overlap between transcripts significantly upregulated when either NBAS or UPF1 was depleted. GO term analysis of the overlapping transcript set revealed an enrichment for secretome and membrane-bound proteins showing that NBAS specifically affects those NMD targets that are translated at the ER. Since NMD decreases transcript stability rather than expression directly, we simultaneously measured the change in RNA stability genome wide by labelling cells with the uridine analogue 4sU. Using a novel statistical approach, we were able to identify genes with increasing and decreasing stability using a single time point. We found a significant global increase in RNA stability in cells without either UPF1 or NBAS and a high correlation between genes significantly changed in stability when either factor was depleted. We are currently investigating the differing characteristics of these genes. To elucidate where in the NMD pathway NBAS has a role, we created a GFP-tagged NBAS cell line using CRISPR/Cas9 and have identified direct protein binding partners using immunoprecipitation-mass spectroscopy and immunoprecipitation-western blot. We have found that NBAS interacts with sec61β and sec63, components of the translocon where translating ribosomes dock at the ER. We have previously shown that a subset of UPF1 also interacts with sec61β suggesting an ER-specific NMD complex involving NBAS, UPF1 and

Speakers and abstracts 123 the ER translocon. These data demonstrate that NBAS mediates decay of a specific subset of NMD targets that are being translated at the ER and likely acts through a complex with UPF1 and the ER translocon.

P72 THIBAUD BOUTIN The University of Edinburgh

Insights into the genetic basis of retinal detachment

Retinal detachment can lead to permanent partial or total sight loss if untreated; therefore, it is a major cause of emergency eye surgery. Nevertheless, the genetic aetiology of this common condition remains poorly characterised due to the limited size of the case and control cohorts available. Here, we harnessed the large UK Biobank dataset to gain insight into the genetics architecture of retinal detachment. UK Biobank self-report and electronic health records (hospitalisation) allowed to identify a set of retinal detachment cases (N = 3977). This dataset was first validated based on wealth of phenotypic information. The genetic correlations between retinal detachment and high myopia or cataract operation were estimated to be respectively 0.46 (SE=0.08) and 0.44 (SE=0.07) in agreement with known epidemiological associations. A meta- analysis of genome-wide association studies combining this biobank dataset and two cohorts, each of ~1000 clinically ascertained retinal detachment cases, uncovered 11 genome-wide significant association signals located near, or within, the following retinal expressed genes: ZC3H11B, BMP3, COL22A1, DLG5, PLCE1, EFEMP2, TYR, FAT3, TRIM29, COL2A1 and LOXL1. A replication analysis was then conducted with the 23andMe dataset, where retinal detachment is self-reported by participants. This has confirmed six of the retinal detachment risk loci: FAT3, COL22A1, TYR, BMP3, ZC3H11B and PLCE1. Amongst these loci, remarkably the two first had not been associated with other eye trait or condition, underlining a specific impact on retinal detachment risk. The lead variant in TYR is a common missense variant previously associated with macular thickness. The other loci have been associated with myopia and/ or longer eye (BMP3, ZC3H11B), a well-recognised risk factor for retinal detachment, or glaucoma (PLCE1). Molecular investigations have been initiated to elucidate the molecular mechanisms underlying the association at the BMP3 locus, using both mouse and human cell line. The genetic study sheds light on shared genetic aetiologies with cataract, myopia and glaucoma. The larger study size, enabled by resources linked to health records or self-report, provides novel insights into retinal detachment aetiology and underlying pathological pathways.

124 Speakers and abstracts P73 MARIA TIMOFEEVA MRC Human Genetics Unit, The University of Edinburgh

Causal relationship between obesity and colorectal cancer risk: a Mendelian Randomisation analysis

Abstract Colorectal Cancer (CRC) is the third most commonly diagnosed cancer worldwide causing ~10% of cancer specific deaths, making the cancer a major public health burden. High body mass index (BMI) and obesity are strongly associated with colorectal cancer risk in observational studies. However, observational studies are unable to assess causality due to the presence of environmental confounders. Identifying a causal relationship between a risk factor and CRC could be beneficial by establishing a potential drug target or a lifestyle change, both of which could contribute to lowering CRC incidence in a general population or a high-risk group. Since obesity can be easily prevented due to lifestyle choices, the study of causality is crucial for the prevention of the cancer. Mendelian Randomisation is a statistical analysis tool that utilises genetic variants associated with the exposure as instrumental variables. It is based on the assumption that genetic variants are fixed at conception and are randomly allocated during gamete formation which is in agreement with Mendel’s law of inheritance. Mendelian Randomisation has previously established causal associations between body mass index (BMI) and waist-to-hip ratio (WHR) with increased CRC risk. This project has expanded on these findings by utilising a novel GWAS on BMI and WHR and a novel CRC genome-wide association study, and the addition of novel obesity phenotypes from dual energy x-ray absorptiometry scans. We applied two-sample summary level Mendelian Randomisation to study the causal relationship between CRC risk and nineteen obesity phenotypes reflecting overall obesity and different types of fat distributions across the body. The instrumental variables were derived from multiple GWAS. The Inverse Variance-Weighted method was implemented, and six obesity phenotypes were found to be significantly associated (after correction for multiple testing) to the increased risk of CRC. The phenotypes were related to overall obesity and upper body obesity; BMI/WHR same direction of effects (OR = 1.18), BMI (OR = 1.16), Visceral Fat Mass Index (OR = 1.15), Android Lean Mass Index (OR = 1.11), Subcutaneous Fat (OR = 1.10) and Apple Fat (OR = 1.09). Additionally, Lean Mass Percentage (OR = 0.94) and BMI/WHR opposite direction of effects (OR = 0.54), suggested a potential association to the decreased risk of CRC. The results from this project suggest a differential effect of fat distribution on cancer risk with no association between CRC and lower body fat. These results provide additional evidence for a compelling causal relationship between obesity and the risk of developing CRC. It not only reiterates the crucial need for the prevention and treatment of obesity, but it additionally appoints obesity as a critical target for the primary prevention of CRC. Heterogeneity and pleiotropy could not be ruled out of this project, therefore, future studies focussing on understanding the mechanisms of the relationship between CRC and obesity measurements are required.

Speakers and abstracts 125 P74 MOHAMMAD MUSTAFA MRC Institute of Genetics & Molecular Medicine, The University of Edinburgh

Functional mechanisms underlying the RXRA-COL5A1 signal associated with central cornea thickness and keratoconus

Keratoconus is characterised by progressive corneal thinning and significant irregular astigmatism. The process that leads to stromal loss is poorly understood, but may reflect a pathological extreme of biological mechanisms controlling healthy central corneal thickness (CCT), a highly heritable trait. Genome wide association studies (GWAS) have identified multiple variants associated with decreased CCT and increased keratoconus risk. One set of such variants lies in the intergenic region between RXRA and COL5A1 on chromsome 9. We hypothesize that this locus lies in a putative enhancer for COL5A1, encoding the a1 isoform chain of type 5 collagen, a major component of the corneal stroma. Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) was performed on a human keratocyte cell line derived from corneal stroma (hTK) and a human corneal epithelial cell line (hTCEpi). The COL5A1 gene was expressed in hTK but not hTCEpi and both expressed the RXRA gene. Dual luciferase reporter assays were performed by transient transfection of pGL4 plasmids to test associated haplotypes in hTK and in a human skin fibroblast cell line (RITVA). Site directed mutagenesis was used to change the CCT decreasing haplotype to the CCT increasing haplotype to dissect the contribution of associated SNPs to enhancer activity. ATAC-seq shows the associated variants are located in regions of accessible chromatin in hTK cells, but not in hTCEpi, consistent with a cell type specific regulatory mechanism being disturbed by the associated variants. Transfection with the CCT reducing haplotype construct displayed significantly lower (p=0.004) normalised luminescence compared to reference haplotype in hTK cells, whereas no significant difference was seen in RITVA cells. When the CCT reducing allele of a single variant, rs3118515, was changed to the reference allele, a significant increase in relative luciferase activity was recorded, consistent with variation at rs3118515 influencing enhancer activity. We showed that the SNP rs3118515 is a functional variant in a transcriptional regulatory element that disrupts a putative enhancer region in a cell type specific manner. The gene this enhancer is influencing is being investigated. We are also using chromatin immunoprecipitation to confirm transcription factors predicted to bind in this region.

126 Speakers and abstracts P75 SUSAN TWEEDIE European Bioinformatics Institute

HGNC: 40 years of standardised human gene nomenclature

Long before the human genome project was considered feasible, geneticists realised they needed unique, memorable, and ideally pronounceable gene nomenclature for effective communication. The first official human gene naming guidelines were published in 1979 and the HUGO Gene Nomenclature Committee (HGNC, www.genenames.org) subsequently established as the global authority for assigning human gene nomenclature. Forty years on, we continue to work closely with authors, journals, annotation groups and other nomenclature committees to assign gene symbols and names. Every gene also receives a unique and stable - but less user-friendly - HGNC ID number, which is invaluable for navigating between resources. By August 2019, HGNC had approved just over 42K gene symbols, primarily for protein coding genes (19,244), pseudogenes (13,329) and non-coding RNAs (ncRNAs, 7,804). While small numbers of new protein coding genes continue to be identified, the majority of the newly named genes are ncRNAs – either published loci or unpublished ncRNAs for which there is strong support and consensus between annotation groups. We will present our approach to naming uncharacterised long ncRNAs. While originally confined to scientific literature and biomedical databases, our gene symbols now reach a much wider audience, frequently being quoted in clinical reports and the media, increasing the importance of nomenclature to patients, doctors, charities, support groups etc. It is crucial for clinicians to be able to identify the correct genes for tests – ambiguity could be a matter of life or death – and genetic counsellors need to be able to discuss genes without the complication of inappropriate gene names. Nomenclature stability is also important to minimise changes to websites, test panels, protocols etc. Hence, we have been reviewing our clinically-related gene symbols and names with a view to optimising them prior to marking them as ‘stable’. In a small number, review has revealed that the approved symbol is now misleading or confusing due to new data, or even potentially rude or pejorative, so we have consulted with the research community to agree alternatives. We have also been reviewing usage – and occasionally replacing unused symbols and/or updating gene names for those favoured in the literature, to ensure we are stabilising something that will be used. We present a summary of our progress and encourage everyone to alert us to any problematic nomenclature so that we can correct any issues prior to stabilisation. Another major development for the HGNC has been the instigation of our sister project, the Vertebrate Gene Nomenclature Committee (VGNC), established to name genes in selected and well-studied vertebrates that lack their own nomenclature authority. Approved nomenclature is now available for chimpanzee, cow, dog, horse, cat and macaque, with >11K protein coding genes named in each species, and >81K genes named in total to date. All VGNC data are available via the VGNC website (https:// vertebrate.genenames.org) and the symbols and names are displayed in key resources such as Ensembl and NCBI Gene. We will present our progress in this work to date.

Speakers and abstracts 127 P76 ROSIE LITTLE MRC Harwell Institute

Investigating the role of FGF Signalling in Left-Right development

Left-right patterning is set up early in embryonic development. Cilia in the pit of the murine node are required to create a leftwards flow of extracellular fluid and thus break symmetry. Immotile cilia on cells at the edge of the node are required in order to respond to the flow. There is debate in the field as to how these immotile cilia translate the flow into left-specific gene expression and other side-specific signals. FGF signalling is implicated in left-right development, with Fgf8 being known to have a role in the mouse due to Fgf8 mutant embryos analysed in 1999 showing situs defects. However, the specifics of this role are not agreed upon, and the direct action of this gene is not clear. It has been suggested that Fgf8 may directly regulate the release of vesicles within the pit of the node, which contain signalling molecules. Conversely, it has been proposed to be directly upstream of Nodal, a key asymmetric gene involved in left-specific development. I am investigating the role of FGFs in left-right patterning through the use of a number of techniques, including embryonic culture using inhibitors of FGF receptors to elucidate receptor action. Additionally, I am analysing the expression of Fgfrs and their counterpart FGFR protein localisation. Also, I am analysing the Fgf8 mutant created in 1999 in order to investigate the role it has in mice on a congenic background, and further characterise the phenotypic abnormalities and perhaps elucidate the role of this gene in this context.

EPIGENETICS AND NON-CODING RNA

P77 TAMIR CHANDRA MRC Human Genetics Unit, The University of Edinburgh

Age-related clonal haemopoiesis is associated with increased epigenetic age

Age-related clonal haemopoiesis (ARCH) in healthy individuals was initially observed through an increased skewing in X-chromosome inactivation. More recently, several groups reported that ARCH is driven by somatic mutations, previously described as drivers of myeloid malignancies. ARCH is associated with an increased risk for haematological cancers. ARCH also confers an increased risk for non-haematological diseases, such as cardiovascular disease, atherosclerosis, and chronic ischemic heart failure, for which age is a main risk factor. Whether ARCH is linked to accelerated ageing has remained unexplored. The most accurate and commonly used tools to measure age acceleration are epigenetic clocks: they are based on age-related methylation differences at specific CpG sites.

128 Speakers and abstracts Deviations from chronological age towards an increased epigenetic age have been associated with increased risk of earlier mortality and age-related morbidities. Here we present evidence of accelerated epigenetic age in individuals with ARCH. The most prevalent ARCH mutations are loss of function mutations in DNMT3A and TET2 genes, which paradoxically should have opposite epigenetic effects. We go on to compare the effects DNMT3A and TET2 mutations have on methylated CpG’s.

P78 KIRSTY UTTLEY MRC Institute of Genetics & Molecular Medicine, The University of Edinburgh

Understanding enhancers: Decoding the precise spatial and temporal dynamics of the PAX6 cis-regulatory landscape

Cis-regulatory elements (CREs) play a critical role in development by precisely modulating the spatial and temporal aspects of tissue-specific gene expression. CREs contain docking sites for transcription factors (TFs), which are expressed in a highly tissue-specific manner to regulate target gene expression. CRE sequence variants, which can disrupt the potential for TF binding, have been implicated in disease, however establishing the functional consequences of these changes on the expression of target genes is challenging. A major reason for this is a lack of understanding as to how CREs function together, for example when multiple enhancer CREs drive a common tissue-specific gene expression pattern. Our work aims to address this knowledge gap using a highly quantitative and robust CRE-reporter transgenic assay during zebrafish embryonic development. This system has previously been used to validate the effect of disease-associated CRE mutations on gene expression and disease aetiology, and is being further developed for this use. We now aim to use this system to understand the interplay of CREs in the regulation of genes with vital pleiotropic roles in development such as PAX6, which is the master regulator of eye development. We will use our CRE-reporter transgenic zebrafish lines to visualise in vivo the spatiotemporal activity of PAX6 ocular enhancers during early embryonic development using time-lapse confocal imaging. In addition, we will gather RNA-sequencing data for cell types where these CREs are active, to map these populations onto publically available zebrafish single-cell gene-expression lineage databases. Taken together, these methods will inform on the precise functions of key PAX6 ocular enhancers to elucidate the regulatory network underpinning PAX6-mediated eye development, and contribute to the fundamental understanding of gene regulation.

Speakers and abstracts 129 P79 DANIEL ZERBINO The European Bioinformatics Institute

Functional annotation of GWAS regulatory variants powered by Ensembl

Ensembl is one of the world’s leading sources of information on the structure and function of the genome. It brings together genome sequences, genes, non-coding RNAs, known variants, and other data to create an up-to-date, comprehensive and consistent resource. The Ensembl annotations are widely used for the analysis and interpretation of genome data using tools such as the Ensembl Variant Effect Predictor (VEP), which can quickly annotate the known variants of an individual and report on the potential effects of each. The analysis of individual variants remains challenging when analyzing the results of genome wide association studies (GWAS). In a majority of cases, the associated variants are not likely causing coding variants, rather potential regulatory variants with weak phenotypic associations. Here, we present how weak associations with causal genes can be detected through a genome-wide and multi-layered integrative analysis. Ensembl’s Regulatory Build synthesizes epigenomic datasets produced by large-scale projects such as ENCODE, Roadmap Epigenomics or BLUEPRINT. The resulting regulatory annotation defines biochemically active regions across 118 human cell types and 79 mouse cell types, assigning them a function wherever possible. To support the Regulatory Build, Ensembl maintains the International Human Epigenome Consortium’s (IHEC) Epigenome Reference Registry (EpiRR), where large epigenomic consortia collect the metadata describing their datasets across multiple species. Regulatory elements are chiefly of interest because of their effect on gene expression. To gain further insight, Ensembl is developing a database of cis-regulatory interactions that attach them to their target genes; as a first step all of the GTEx summary eQTL data is incorporated and can be accessed and viewed. Having brought all this data together, we demonstrate the possibilities of functional data integration with our post-GWAS analysis platform. This algorithm compares human GWAS results, as stored in public archives or in a private individual study, to a collection of genomic annotations, many of which are stored in Ensembl, producing a list of putative causal genes along with linked evidence. It optionally integrates a Bayesian colocalization algorithm that stochastically tests possible sets of causal variants at each GWAS peak. This pipeline can be quickly deployed locally and will also soon be available online on the Ensembl genome browser.

130 Speakers and abstracts P80 NAOUEL ATHMANE MRC Institute of Genetics and Molecular Medicine, The University of Edinburgh

Visualising genomic loci in live mammalian cells

Imaging chromatin dynamics is crucial to understand genome organisation and its role in transcriptional regulation. Chromatin folding of enhancer regions are of particular interest because they regulate gene expression over long distances. What we know about 3D genome organisation from imaging based methods relies on Fluorescent in situ Hybridization (FISH) on fixed cells, which misses a big part of the genome’s dynamic information. Recently, there has been significant progress in live cell imaging of biological processes including tools to tag genomic DNA in a live cell, the most common ones employ fluorescently tagged Cas9 and TALE proteins. However, the binding dynamics of these proteins are still unclear. My PhD project aims to understand firstly, the binding kinetics of Cas9 and TALE proteins to telomeres in human cells using Fluorescence Recovery After Photobleaching (FRAP) and Fluorescence Correlation Spectroscopy (FCS) in order to choose the best labelling method. This will allow me, furthermore, using advanced super-resolution microscopy techniques, to develop a method that will enable the visualisation of single copy loci in living cells.

P81 EMILY PERRY European Bioinformatics Institute

The Ensembl genome browser: what’s new?

Ensembl (www.ensembl.org) offers comprehensive access to genome annotation for a wide variety of species, including vertebrates, plants, fungi, bacteria, protists and non- vertebrate metazoa. The genome browser allows for easy visualisation of genomic regions, with comprehensive information about genes, genetic variants and features that regulate gene expression. Our popular tools include BioMart, for bulk export of gene data, the Variant Effect Predictor (VEP), for annotating variant calls with the genes affected and known variation data, and our REST APIs for programmatic access. Recent updates to Ensembl include a collaboration with NCBI to unify the Ensembl GENCODE gene annotation with RefSeq and identify the most biologically relevant transcript of a gene, known as the MANE Select (Matched Annotation from NCBI and EBI), for better reporting of variants. Planned for early 2020 is the launch of a preview version of our new website, designed from the ground up; the new website will be faster and easier to use than the existing website, although we plan a smooth transition between the two, having both available for some time. To facilitate researchers, both wet-lab and bioinformaticians, working with genetic and genomic data, Ensembl offer hands-on training on using their tools and interfaces.

Speakers and abstracts 131 P82 LAUREN KANE MRC Human Genetics Unit, The University of Edinburgh

Is TAD structure important for distal regulation of the Shh locus?

Mammalian genomes are organised into topologically associated domains (TADs) and chromatin loops. Structuring of chromatin within these highly conserved TADs is believed to both facilitate gene regulation via distal enhancers whilst insulating their regulatory activity at TAD boundaries. Shh and its regulatory elements are all contained within a highly conserved, well-characterised ~960kb TAD that has previously been proposed to form an invariant chromatin loop structure across various tissue types. To investigate the importance of chromatin conformation and TAD integrity on developmental gene regulation, we have manipulated the Shh TAD by deleting CTCF sites at the TAD boundaries. Both RNA and DNA fluorescencein situ hybridisation assays were used the investigate changes in Shh expression and chromatin conformation that result from these manipulations. We found that Shh loses its’ co-localisation with the forebrain enhancer, SBE2, in the developing forebrain of CTCF mutant embryos but with no consequence on Shh forebrain expression patterns. Our data suggest that the removal of CTCF sites affects TAD structure but has no readily detectable effect on Shh expression patterns during development and results in no evident phenotypes. Our data suggests that, contrary to expectations, the developmental regulation of Shh expression is remarkably robust to TAD perturbations.

P83 IOANNIS KAFETZOPOULOS The University of Edinburgh

Investigating the relationship between DNA methylation, transcription, replication timing and nuclear positioning in colorectal cancer cells

DNA methylation is a pervasive epigenetic mark in normal cells. Abnormalities of DNA methylation are a fundamental hallmark of cancer that are believed to promote carcinogenesis through misregulation of transcription and increased genomic instability. In particular, loss of DNA methylation levels is observed in heterochromatic regions in tumours as compared to normal cells. These hypomethylated regions are termed as partially methylated domains (PMDs), which show tissue and cancer type specificity. PMDs along their heterochromatic state are correlated with transcriptional repression, replication during late S-phase as well as association with lamina at the nuclear periphery. In this project, we are interested to elucidate the relationship between replication timing and its effects at DNA methylation levels and particularly characterise why and how PMDs become hypomethylated during tumorigenesis. We investigate the contribution of DNA methyltransferases (DNMTs), the enzymes that catalyze DNA methylation, in the establishment of methylation patterns in HCT116 colorectal cancer cells.

132 Speakers and abstracts We observe a peculiar effect where in the absence of one of the DNMTs - DNMT1 or DNMT3B genetic knock outs (KOs), where lower levels of methylation would be expected throughout the genome, approximately 14% of PMDs for DNMT1KO and 32% of PMDs for DNMT3BKO increase their methylation levels. We hypothesize that these PMD regions are able to shift to non-PMD like regions and gain higher levels of methylation. Factors that could be attributed to such a shift in the behaviour of a genetic loci could be gene transcription, replication timing as well as nucleus positioning. RNA seq analysis of DNMT1 and DNMT3B KOs shows no transcriptional activation of genes located within the hypermethylated PMDs, while preliminary replication timing experiments have showed no shift of replication of these particular PMDs. We are currently investigating the interaction of these domains with lamina and whether the altered methylation levels affect the nuclear positioning of these loci. This project aims to dissect the relationship between DNA methylation and fundamental cellular process in order to determine their effect and causality.

P84 SIMGE ACAR Koc University School of Medicine, Turkey

Sensitisation of GBM cells to Temozolomide by efficient MGMT repression via combination of CRISPR-based repression systems

Temozolomide (TMZ) has been shown to marginally prolong survival in Glioblastoma (GBM) patients, however there is still a growing field of research towards increasing its efficacy. O-6- Methyguanine-DNA Methyltransferase (MGMT) is a well-known DNA repair enzyme and its silencing through promoter hypermethylation is associated with better TMZ response. In this project, we aimed to develop and apply new CRISPR-based repression systems and silence MGMT expression in TMZ-resistant GBM cell lines. To this end, we employed dCas9-fused KRAB repressor transcription factor and dCas9-fused DNMT3A and DNMT3L to provide DNA methylation for repression as single tools or in combination. We engineered new systems that aimed to combine these repressors in a single complex for more efficient repression. DNMT3A, DNMT3L and KRAB repressors were fused with MS2 coating protein (MCP) and also cloned into lentiviral systems. The functionality of the repression systems was tested by transfection studies of the 293T cell line with all possible combinations of dCas9 fused and MCP fused repressors. Because GBM cell lines cannot be transfected efficiently, T98G GBM cell line was infected with dCas9-KRAB and gRNAs targeting MGMT promoter. Accordingly, system comparisons via transfection showed that dCas9-KRAB + MS2-DNMT3A + MS2-DNMT3L provides the most efficient repression in the 293T cell line. A partial MGMT repression and partial sensitivity to TMZ was observed in TMZ-resistant T98G GBM cell line infected with dCas9-KRAB and multiple MGMT gRNAs. In ongoing studies, optimal combination system found in comparison studies is being used to repress MGMT more efficiently and obtain higher TMZ sensitivity in GBM cell lines. On the other hand, effects of repression systems may be cell line dependent; thus, different combinations will continue to be tested in GBM cells to ascertain the most optimal system for increasing sensitivity to TMZ.

Speakers and abstracts 133 P85 ALUN JONES University of Leicester

RNA Editing in Eusocial Hymenoptera

The allocation of reproductive work is a hall mark of eusocial societies. The task of reproduction is often attributed to a few or even a single individual within a colony. These individuals can have specific phenotypes that help with reproduction, such as increased size compared to non-reproductive members, but have to produce these phenotypes from the same genome as their non-reproductive counterparts. The ability to be reproductive is, more often than not, a flexible trait with differing levels of flexibility associated with the complexity of the eusocial society. Bombus terrestris (Bumble bee) and Ooceraea biroi (Clonal raider ant) are just two examples where reproductive division of labour occurs but any individual can become reproductive, depending upon colony conditions. How reproduction is controlled is unclear. Whilst changes in gene expression, novel genes and methylation differences have been associated with reproductive castes, it is still not known whether or not there is a conserved underlying mechanism or, if varying regulatory mechanisms occur in different lineages. RNA editing is the conversion of one RNA base to another, the most common being A to I editing by adenosine deaminase. Editing has been identified in all domains of life and has only began to be investigated in social insects. RNA editing has the potential to change protein function, by altering coding regions, but also altering expression through 3 prime untranslated region modifications. This innate flexibility to alter gene expression, as well as protein coding, is often used for highly variable processes such as neurological regulation and immune responses. This additional transcript variability would appear an ideal mechanism for a flexible trait, such as reproductive potential, to be regulated by. Changes have indeed been identified between reproductive and non-reproductive castes in Acromyrmex echinatior (Leaf cutter ant). Our aim is to use NGS data sets, containing DNA and RNA, to identify RNA editing in different eusocial hymenoptera. By identifying sites where editing occurs and how it differs between reproductive castes we aim to identify 1) If editing is conserved as a mechanism for reproductive division of labour 2) If the genes targeted for editing are conserved across species 3) Whether the target of editing shift depending upon reproductive flexibility.

www.genetics.org.uk

134 Speakers and abstracts www.genetics.org.uk