
A Zero-Knowledge Based Introduction to Biology Jim Notwell 09 January 2013 Q: What is your genome? A: Q: What is your genome? A:The sum of your hereditary information. From DNA to Organism You are composed of ~ 10 trillion cells From DNA to Organism Cell From DNA to Organism Cell Protein Proteins do most of the work in biology Central Dogma of Biology DNA: “Blueprints” for a cell •Genetic information encoded in long strings •Deoxyribonucleic acid comes in four flavors: adenine, thymine, guanine, and cytosine Phosphate-deoxyribose Backbone to previous nucleotide O 5’ H to base O P O C O H O- C C H H H H C C 3’ H to next nucleotide Nucleobase Complementary Pairing purines Adenine (A) Guanine (G) Thymine (T) Cytosine (C) pyrimidines DNA Double Helix DNA Packaging Q: What is your genome? A:The sum of your hereditary information. Q: What is your genome? A:The sum of your hereditary information. Humans bundle two copies of the genome into 46 chromosomes in every cell Central Dogma of Biology DNA vs RNA RNA Nucleobases purines Adenine (A) Guanine (G) Uracil (U) Cytosine (C) pyrimidines Gene Transcription G A T T A C A . 5’ 3’ 3’ C T A A T G T . 5’ Gene Transcription G A T T A C A . 5’ 3’ 3’ C T A A T G T . 5’ Gene Transcription G A T T A C A . 5’ 3’ 3’ C T A A T G T . 5’ Strands are separated (DNA helicase) Gene Transcription G A T T A C A . 5’ G A U U A C A 3’ 3’ C T A A T G T . 5’ An RNA copy of the 5’→3’ sequence is created from the 3’→5’ template Gene Transcription G A T T A C A . 5’ 3’ 3’ C T A A T G T . 5’ G A U U A C A . pre-mRNA 5’ 3’ RNA Processing 5’ cap poly(A) tail exon intron mRNA 5’ UTR 3’ UTR Gene Structure introns 5’ 3’ promoter 5’ UTR exons 3’ UTR coding non-coding Central Dogma of Biology From RNA to Protein •Proteins are long strings of amino acids joined by peptide bonds •Translation from RNA sequence to amino acid sequence performed by ribosomes •20 amino acids → 3 RNA letters required to specify a single amino acid Amino Acid Alanine Arginine Asparagine Aspartate H O Cysteine Glutamate Glutamine H N C C OH Glycine Histidine Isoleucine H Leucine Lysine R Methionine Phenylalanine Proline Serine Threonine Tryptophan Tyrosine There are 20 standard amino acids Valine Proteins H O to previous aa N C C to next aa H R H OH N-terminus C-terminus (start) (end) from 5’ 3’ mRNA Translation The ribosome (a complex of protein and RNA) synthesizes a protein by reading the mRNA in triplets (codons). Each codon is translated to an amino acid. Translation U C A G UUU"Phenylalanine"(Phe) UCU"Serine"(Ser) UAU"Tyrosine"(Tyr) UGU"Cysteine"(Cys) U# UUC"Phe UCC"Ser UAC"Tyr UGC"Cys C# U# UUA"Leucine"(Leu) UCA"Ser" UAA"STOP UGA"STOP A# UUG"Leu UCG"Ser" UAG"STOP UGG"Tryptophan"(Trp) G CUU"Leucine"(Leu) CCU"Proline"(Pro) CAU"His4dine"(His) CGU"Arginine"(Arg) U# CUC"Leu CCC"Pro CAC"His CGC"Arg" C# C# CUA"Leu CCA"Pro CAA"Glutamine"(Gln) CGA"Arg" A# CUG"Leu CCG"Pro CAG"Gln CGG"Arg" G AUU"Isoleucine"(Ile) ACU"Threonine"(Thr) AAU"Asparagine"(Asn) AGU"Serine"(Ser) U# AUC"Ile ACC"Thr AAC"Asn AGC"Ser" C# A# AUA"Ile ACA"Thr" AAA"Lysine"(Lys) AGA"Arginine"(Arg) A# AUG"Methionine"(Met)"or"START ACG"Thr AAG"Lys AGG"Arg" G GUU"Valine"(Val) GCU"Alanine"(Ala) GAU"Aspar4c#acid"(Asp) GGU"Glycine"(Gly) U# GUC"Val GCC"Ala GAC"Asp GGC"Gly C# G# GUA"Val GCA"Ala GAA"Glutamic#acid"(Glu) GGA"Gly A# GUG"Val GCG"Ala GAG"Glu GGG"Gly G Translation 5’ . A U U A U G G C C U G G A C U U G A . 3’ Translation 5’ . A U U A U G G C C U G G A C U U G A . 3’ UTR Met Ala Trp Thr Start Stop Codon Codon Translation Central Dogma of Biology Protein coding 1% Other Stuff 99% Protein Non-coding coding exons 1% 2% ??? Introns/ 32% promoters/ polyA sites 37% Intergenic transcribed RNA Regulatory 19% elements 9% The ENCODE Project Consortium (2012) Nature 489:57-74 Non-coding RNAs •RNAs transcribed from DNA but not translated into protein •Structural ncRNAs: Conserved secondary structure •Involved in gene regulation microRNA Protein Non-coding coding exons 1% 2% ??? Introns/ 32% promoters/ polyA sites 37% Intergenic transcribed RNA Regulatory 19% elements 9% The ENCODE Project Consortium (2012) Nature 489:57-74 Different Cell Types Subsets of the DNA sequence determine the identity and function of different cells Gene Expression Regulation •When should each gene be expressed? •Why? Every cell has same DNA but each cell expresses different proteins. •Signal transduction: One signal converted to another: cascade has “master regulators” turning on many proteins, which in turn each turn on many proteins Central Dogma of Biology Transcription Regulation •Transcription factors link to binding sites •Complex of transcription factors forms •Complex assists or inhibits formation of the RNA polymerase machinery Gene Transcription G A T T A C A . 5’ 3’ 3’ C T A A T G T . 5’ Transcription Factor Binding Sites •Short, degenerate DNA sequences recognized by particular transcription factors •For complex organisms, cooperative binding of multiple transcription factors required to initiate transcription Binding Sequence Logo Transcription Regulation TF A Gene B Binding Site Transcription Factor A ANRV285-GG07-02 ARI 8 August 2006 1:29 Gene Regulatory Region understand how different permutations of the Distal regulatory elements same regulatory elements alter gene expres- sion. An understanding of how the combina- Locus control region Insulator torial organization of a promoter encodes reg- Silencer Enhancer ulatory information first requires an overview of the proteins that constitute the transcrip- tional machinery. Proximal Core THE EUKARYOTIC promoter promoter TRANSCRIPTIONAL elements MACHINERY Promoter ( 1 kb) Factors involved in the accurate transcrip- tion of eukaryotic protein-coding genes by Figure 1 RNA polymerase II can be classified into three Schematic of a typical gene regulatory region. The promoter, which is groups: general (or basic) transcription fac- composed of a core promoter and proximal promoter elements, typically spans less than 1 kb pairs. Distal (upstream) regulatory elements, which can tors (GTFs), promoter-specific activator pro- include enhancers, silencers, insulators, and locus control regions, can be teins (activators), and coactivators (Figure 2). located up to 1 Mb pairs from the promoter. These distal elements may GTFs are necessary and can be sufficient for contact the core promoter or proximal promoter through a mechanism that accurate transcription initiation in vitro (re- involves looping out the intervening DNA. viewed in 141). Such factors include RNA polymerase II itself and a variety of auxil- (73); subsequent reinitiation of transcription iary components, including TFIIA, TFIIB, then only requires rerecruitment of RNA TFIID, TFIIE, TFIIF, and TFIIH. In addi- polymerase II-TFIIF and TFIIB. General tion to these “classic” GTFs, it is apparent that The assembly of a PIC on the core pro- transcription factor in vivo transcription also requires Mediator, moter is sufficient to direct only low levels of (GTF): a factor that a highly conserved, large multisubunit com- accurately initiated transcription from DNA assembles on the plex that was originally identified in yeast (re- templates in vitro, a process generally referred core promoter to viewed in 38, 119). to as basal transcription. Transcriptional ac- form a preinitiation complex and is GTFs assemble on the core promoter in tivity is greatly stimulated by a second class required for an ordered fashion to form a transcription of factors, termed activators. In general, ac- transcription of all preinitiation complex (PIC), which directs tivators are sequence-specific DNA-binding (or almost all) genes RNA polymerase II to the transcription start proteins whose recognition sites are usually Coactivators: site (TSS). The first step in PIC assembly present in sequences upstream of the core adaptor proteins that is binding of TFIID, a multisubunit com- promoter (reviewed in 149). Many classes of typically lack by Stanford University Robert Crown Law Lib. on 04/03/07. For personal use only. plex consisting of TATA-box-binding pro- activators, discriminated by different DNA- intrinsic sequence-specific tein (TBP) and a set of tightly bound TBP- binding domains, have been described, each DNA binding but Annu. Rev. Genom. Human Genet. 2006.7:29-59. Downloaded from arjournals.annualreviews.org associated factors (TAFs). Transcription then associating with their own class of specific provide a link proceeds through a series of steps, including DNA sequences. Examples of activator fam- between activators promoter melting, clearance, and escape, be- ilies include those containing a cysteine- and the general fore a fully functional RNA polymerase II rich zinc finger, homeobox, helix-loop-helix transcriptional machinery elongation complex is formed. The current (HLH), basic leucine zipper (bZIP), fork- model of transcription regulation views this head, ETS, or Pit-Oct-Unc (POU) DNA- PIC: preinitiation complex as a cycle, in which complete PIC assembly is binding domain (reviewed in 142). In addition stimulated only once. After RNA polymerase to a sequence-specific DNA-binding domain, TSS: transcription start site II escapes from the promoter, a scaffold struc- a typical activator also contains a separable ture, composed of TFIID, TFIIE, TFIIH, activation domain that is required for the ac- and Mediator, remains on the core promoter tivator to stimulate transcription (149).
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages60 Page
-
File Size-