<<

PERSISTENCE: Molecular Mechanisms of Evolution

Announcements: • Please fill out course questionnaire

MMN - How do epigenetic modifications of change transcription? - What are the implications of the broad occurrence of transposable elements?

DKN - Caltech connections: extra credit iClicker point opportunity - Do mutations arise spontaneously, or in response to their environment? - How do mutations form? - How can we apply the power of bacterial genetics to solve problems?

Both iClicker Qs at the end STRUCTURE AND TRANSCRIPTION/TRANSLATION IN EUKARYA transcription factor (a must)/ polymerase binding Transcription stop transcription start DNA start codon (ATG) stop codon (TGA)

Promoter 5’ UTR INTRON EXON INTRON EXON INTRON EXON 3’ UTR

5’ cap transcription polyA tail Pre- RNA 5’ UTR INTRON EXON INTRON EXON INTRON EXON 3’ UTR splicing (alternative)

RNA 5’ UTR Open reading frame 3’ UTR translation

protein Mendelian genetics

PUNNETT Square

How genes encode for traits. Is it really that simple? Just over 10 years ago, the full human genome sequence was determined.

Why so few genes? ~20,000 – 30,000

[aren’t we more complex than worms, fish and plants?] Other mysteries:

Identical twins

IDENTICAL?

Fertilized egg 8 cell EMBRYOGENESIS embryo

--HOW???? differentiation blood cells

nerve cells cardiac muscle Epigenetics

Transposable elements (TEs) EPIGENETICS: ‘ABOVE THE GENOME’

STUDY OF THE INHERITANCE OF CHANGES IN GENE FUNCTION THAT OCCUR WITHOUT CHANGES IN THE DNA SEQUENCE

M M

Some major epigenetic mechanisms of control: M M M Methylation H3C DNA itself methyl Histones

Acetylation of histones

acetyl DNA HISTONE ACETYLATION histones (usually enhances transcription – opens up the chromatin)

Histone tail

DNA modifications

Histone core EPIGENETICS IN HEALTH AND DISEASE

?

pancreatic cancer (and other cancers) PERSISTENCE: Molecular Mechanisms of Evolution

Announcements: • Please fill out course questionnaire

MMN - How do epigenetic modifications of genes change transcription? - What are the implications of the broad occurrence of transposable elements?

DKN - Caltech connections: extra credit iClicker point opportunity - Do mutations arise spontaneously, or in response to their environment? - How do mutations form? - How can we apply the power of bacterial genetics to solve problems?

Both iClicker Qs at the end DISCOVERIES

Among others

Transposition (this lecture)

Crossing over during cell division (next lecture)

Barbara McClintock (1902-1992) in / 1983 McClintock studied corn genetics – abundance of visible traits, due to (we now know) a diverse genome

-why are these corn kernels spotted? Mendelian genetics could not explain the phenomenon.

HAD NO MECHANISM

1940’s discovered transposition -such criticism that didn’t publish again on the subject after 1953 Transposable Elements (TEs): ‘Junk DNA’

DNA sequence that can change its relative position within the genome of a single cell.

Likely originally viral, then passed in the genome. Promoter 5’ UTR INTRON EXON INTRON EXON INTRON EXON 3’ UTR

GENE

Transposable element (TE) inserts

GENE

CAN DROP IN ANYWHERE replicate TE excises

GENE + NO PHYLOGENETIC PATTERN TO TE LOAD – significance – higher ‘evolvability’?

corn mosquito (Aedes aegypti) silkmoth gorilla man rat mouse poplar rice chicken anemone mosquito (Anopheles gambiae) fruit fly slime mold another chicken yeast puffer fish fungus honey bee

10 20 30 40 50 60 70 Proportion of transposable elements (% of genome) The human genome TE load

Protein-coding sequence 2%

Goodier and Kazazian, 2008 A REVOLUTION ONE EXAMPLE:

LETTER doi:10.1038/nature10531

Somatic retrotransposition alters the genetic landscape of the human brain

J. Kenneth Baillie1*, Mark W. Barnett1*, Kyle R. Upton1*, Daniel J. Gerhardt2, Todd A. Richmond2, Fioravante De Sapio1, Paul Brennan3, Patrizia Rizzu4, Sarah Smith1, Mark Fell1, Richard T. Talbot1, Stefano Gustincich5, Thomas C. Freeman1, John S. Mattick6, David A. Hume1, Peter Heutink4, Piero Carninci7, Jeffrey A. Jeddeloh2 & Geoffrey J. Faulkner1

Retrotransposons are mobile genetic elements that use a germline Mapping the individual retrotransposition events that collectively ‘copy-and-paste’ mechanism to spread throughout metazoan form a somatic mosaic is challenging owing to the rarity of each genomes1. At least 50 per cent of the human genome is derived allele in a heterogeneous cell population. We therefore from retrotransposons, with three active families (L1, Alu and developed a high-throughput protocol that we call retrotransposon SVA) associated with insertional mutagenesis and disease2,3. capture sequencing (RC-seq). First, fragmented genomic DNA was Epigenetic and post-transcriptional suppression block retrotran- hybridized to custom sequence capture arrays targeting the 59 and 39 sposition in somatic cells4,5, excluding early embryo development termini of full-length L1, Alu and SVA retrotransposons (Fig. 1a and and some malignancies6,7. Recent reports of L1 expression8,9 and Supplementary Tables 1 and 2). Immobile ERVK and ERV1 long copy number variation10,11 in the human brain suggest that L1 terminal repeat (LTR) elements were included as negative controls. mobilization may also occur during later development. However, Second, the captured DNA was deeply sequenced, yielding ,25 million the corresponding integration sites have not been mapped. Here paired-end 101-mer reads per sample (Fig. 1b). Last, read pairs were we apply a high-throughput method to identify numerous L1, Alu mapped using a conservative computational pipeline designed to and SVA germline mutations, as well as 7,743 putative somatic L1 identify known (Fig. 1c) and novel (Fig. 1d and Supplementary Fig. insertions, in the hippocampus and caudate nucleus of three indi- 1a–d) retrotransposon insertions with uniquely mapped read pairs viduals. Surprisingly, we also found 13,692 somatic Alu insertions (‘diagnostic reads’) spanning their termini. and 1,350 SVA insertions. Our results demonstrate that retrotran- Previous works have equated L1 CNV with somatic mobilization in sposons mobilize to protein-coding genes differentially expressed vivo10,11. To test this assumption with RC-seq, we first screened five and active in the brain. Thus, somatic genome mosaicism driven by brain subregions taken from three individuals (donors A, B and C) for retrotransposition may reshape the genetic circuitry that under- L1 CNV. A significant (P , 0.001) increase was observed in the num- pins normal and abnormal neurobiological processes. ber of copies of L1 open reading frame 2 (ORF2) present in DNA Malignancy and ageing are commonly associated with the accu- extracted from the hippocampus of donor C, and a similar, though mulation of deleterious mutations that lead to loss of function, cell smaller, increase was observed for donor A (Fig. 2). We then applied death or uncontrolled growth. Retrotransposition is strongly muta- RC-seq to the brain regions that showed the highest (hippocampus) genic; an estimated 400 million retrotransposon-derived structural and lowest (caudate nucleus) L1 CNV using samples from all three variants are present in the global human population3 and more than donors, including a technical replicate of caudate nucleus from donor 70 diseases involve heritable and de novo retrotransposition events2. A. A total of 177.4 million RC-seq paired-end reads were generated Presumably for this reason, transposition-competent retrotranspo- from seven libraries (Supplementary Table 3). RC-seq achieved deep sons are heavily methylated and transcriptionally inactivated4,5. sequencing coverage of known active retrotransposons, high repro- Nevertheless, substantial somatic L1 retrotransposition has been ducibility and limited sequence capture bias (Supplementary Results). detected in neural cell lineages10–12. Given the complex structural Read pairs diagnostic for novel retrotransposon insertions were and functional organization of the mammalian brain, its adaptive clustered on the basis of their insertion site, relative orientation and and regenerative capabilities13 and the unresolved aetiology of many retrotransposon family. A total of 25,229 clusters were produced. neurobiological disorders, these somatic insertions could be of great Proximal clusters arranged on opposing strands indicated two termini importance14. of one insertion and were paired, resulting in a catalogue of 24,540 One explanation for the observed transpositional activity in the novel insertions (Supplementary Table 4). As expected, the great brain may be that the L1 promoter is transiently released from epi- majority of these were either L1 (32.2%) or Alu (60.9%) (Fig. 3a). To genetic suppression during neurogenesis11,12. Transposition-competent segregate germline mutations from other events, we combined the L1 retrotransposons can then repeatedly mobilize to different loci in three largest available catalogues of L1 and Alu polymorphisms6,16,17 individual cells and produce somatic mosaicism. Several lines of evid- as an annotation database and also performed RC-seq on pooled ence support this model, including L1 transcription8,9 and copy number genomic DNA extracted from blood, producing 6,150 clusters variation (CNV) in brain tissues from human donors of various (Supplementary Table 5) that were intersected with the existing brain ages10,11, as well as mobilization of engineered L1 retrotransposons in RC-seq clusters. Any brain clusters that contained RC-seq reads from vitro and in transgenic rodents10,12. Importantly, it is not known where more than one region or individual, overlapped a blood RC-seq cluster somatic L1 insertions occur in the genome, nor, considering that open or matched a known polymorphism were designated as germline chromatin is susceptible to L1 integration15, whether these events dis- insertions. Overall, 8.4% of Alu insertions in the brain were anno- proportionately affect protein-coding loci expressed in the brain. tated as germline, versus only 1.9% for L1. Nearly all unannotated

1Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Edinburgh EH25 9RG, UK. 2Roche NimbleGen, Inc., 500 South Rosa Road, Madison, Wisconsin 53719, USA. 3Edinburgh Cancer Research Centre, Western General Hospital, Crewe Road South, Edinburgh EH4 2XR, UK. 4Section of Medical Genomics, Department of Clinical Genetics, VU University Medical Center, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands. 5Sector of Neurobiology, International School for Advanced Studies (SISSA), via Bonomea 265, 34136 Trieste, Italy. 6Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia. 7RIKEN Yokohama Institute, Omics Science Center, 1-7-22 Suehiro-choˆ, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan. *These authors contributed equally to this work.

534 | NATURE | VOL 479 | 24 NOVEMBER 2011 ©2011 Macmillan Publishers Limited. All rights reserved PERSISTENCE: Molecular Mechanisms of Evolution

MMN - How do epigenetic modifications of genes change transcription?

Through modifications of the DNA and its associated proteins

- What are the implications of the broad occurrence of transposable elements?

We seem to have only discovered the tip of the iceberg.

Barbara McClintock (1902-1992) Nobel Prize in Physiology/Medicine 1983

Transposable Elements - Rediscovered in in Late 1960s &1970s PERSISTENCE: Molecular Mechanisms of Evolution

Announcements: • Please fill out course questionnaire

MMN - How do epigenetic modifications of genes change transcription? - What are the implications of the broad occurrence of transposable elements?

DKN - Caltech connections: extra credit iClicker point opportunity - Do mutations arise spontaneously, or in response to their environment? - How do mutations form? - How can we apply the power of bacterial genetics to solve problems?

Both iClicker Qs at the end Examples of Caltech research outside of Biology applying/probing aspects of the Central Dogma: 2 extra credit iClicker point—see website!

Pierce Rothemund

Tirell Clemons Shan

Folding, structure, function

Mayo Gray Examples of Caltech research in Biology applying/probing aspects of the Central Dogma: 2 extra credit iClicker point—see website!

RNA interference (i.e. RNA-mediated gene silencing) Aravin Reverse Transcriptase (e.g. HIV)

Fejes-Toth

Balmore

Campbell Deshaies Varshavsky What types of mutations are there? à Single nucleotide mismatch

(DNA) ...A A G A C A G C A… (DNA) A A G A C T G C A (RNA) U U C U G U C G U (RNA) U U C U G A C G U aa Phe Cys Arg aa Phe TERM Arg (DNA) A A G A C A G C A (RNA) U U C U A U C G U aa Phe Tyr Arg ribosomal error Ala Tyr Arg

à Deletion (frameshift, partial, clean deletion) (DNA) A A G A C A G C A (RNA) U U G U C G U aa Leu Ser

à Insertion (insertion elements: transposons) (DNA) A A G G G G C C C G C C G C C A C A G C A

* differences in GC content suggest horizontal gene transfer How do mutations form?

“Vertical” inheritance:

• DNAP, RNAP, ribosome mistakes [lower proofreading fidelity] • environmental stressor e.g. chemical mutagens, UV

“Horizontal” gene transfer:

à Transformation (natural, chemical, electrical “competence”)

à Transfection (phage infection)

à Conjugation (exchange of DNA between bacteria) Why you should care about this: resistance!

Dangerous liasons – conjugative transposons

à Some transposons hop at random into the genome; others integrate at specific sites! Do mutations arise spontaneously, or in response to their environment?

Darwin (theory of natural selection) Lamarck (adaptive mutation)

How to test? “” (almost entirely comprising physicists)

Salvador Luria, MIT

Max Delbrück, Caltech

Nobel Prize 1969 Physiology and Medicine (with A. Hershey) Why phage and bacteria?

à Simplest systems that could be studied

à Easy assays, requiring no more than “toothpicks and logic” The plaque assay

Lytic phage create a burst (clearing) on bacterial lawn

à A simple assay to measure mutation frequency is to score the numbers of bacteria that can grow in the presence of the phage (i.e. phage resistant) Question: Do mutations arise spontaneously, or in response to their environment?

Specifically: Does E.coli acquire resistance to phage T1 by “induced immunity” or by spontaneous mutation?

Luria’s key insight at a faculty party à hitting the jackpot is random Luria–Delbrück experiment (1943) Fluctuation Test

Experimental Design & Results: à Draw out 1 flask split into 20 volumes at the end à Draw out 20 parallel flasks à Plate them all (normalizing # of cells) on a saturating T1 phage background, select for resistant colonies à Compare the number of resistant colonies à What would the implications be if you saw equal numbers of colonies regardless of how you grew up the cultures? What if you saw different numbers? à They observed different numbers, showing that mutations arose randomly in some cultures (“jackpot”) but not in others—had nothing to do with the exposure to the phage!

Conclusion: à Mutations spontaneously arise in the absence of selection, not a response to environment

How can we apply the power of bacterial genetics to solve problems?

Draw out cartoon of microbe respiring As(V) to As(III)

Discuss As(III) more mobile and Toxic

How can we know when microbes are catalyzing this activity?

GOAL: FIND A ROBUST MARKER! Isolating a good “model organism” to find the gene(s) specifically conferring the ability to respire As(V)

- Tell the story of isolation

- Explain how yellow mineral assay works

- Explain identified a new strain of Shewanella ANA-3 is metabolically versatile, and a close relative of other strains that CANNOT respire As(V)

Why do you think these properties would help us find the As(V)- respiratory genes?

How did we find the gene? What approaches could have been taken?

Gain of function

Prof. Chad Saltikov Loss of function UCSC

Davin Malasarn à Actual approach posted in supplementary readings on website Turns out every As(V)-respiring strain has the arr genes—robust biomarker

à Encode a highly conserved protein family! à Can detect genes in the environment using PCR

SUMMARY

- Caltech connections: extra credit iClicker point opportunity - What types of mutations are there? point mutations, insertions, deletions

- How do mutations form? Vertical inheritance (DNAP, RNAP, ribosomal mistakes) Horizontal inheritance (transformation, transfection, conjugation)

- Do mutations arise spontaneously, or in response to their environment? In many instances spontaneously, but there are exceptions (prions)

- How can we apply the power of bacterial genetics to solve problems? To identify genes controlling a process via mutational analysis.