Introduction to Bioinformatics

21‐Mar‐15

Bas E. Dutilh
Systems Biology: Bioinformatic Data Analysis
Utrecht University, March 19th 2015

Info and documentation
• http://theory.bio.uu.nl/BDA/2015
• http://www.google.com
  – … but only for guidance and hints: never take the internet for granted
• Campbell Biology, 9th or 10th edition, Pearson
• Reader
  – Printed in black and white
  – Download full color PDF at: http://theory.bio.uu.nl/BDA/2015/BioInf2015.pdf
  – Errata: http://theory.bio.uu.nl/BDA/2015/errata.html

Evaluation
• Final mark course
  – 2/3 mark of Mathematics/Theoretical Biology
  – 1/3 mark of Bioinformatic Data Analysis
• Bioinformatics: mark of written exam only
  – NOTE: this is different from the info in the study guide (studiegids)!
  – Date: April 9th 2015 at 17:00‐20:00 in Educatorium Gamma
• Bonus point
  – NOTE: this is different from the info in the study guide (studiegids)!
  – Make all practicals and have them signed by the assistant
    • In case of emergencies you can be late by one class maximum
  – Hand in your mini‐article on time (deadline: April 7th 2015) through http://theory.bio.uu.nl/sb/rooster.html
  – The bonus point will only be added to the mark of the written exam if this mark is >4 before addition
  – The maximum mark is a 10

How would you figure out the function of a protein?
• Activity assay
• X‐ray structure
• Knock‐out mouse
• BLAST search

How about for all proteins in a genome?
Genome sizes
• Chaos chaos (1.4 Tb, Friz 1968)
• Tb: Tera base pairs (10^12)
• Gb: Giga base pairs (10^9)
• Mb: Mega base pairs (10^6)
• kb: Kilo base pairs (10^3)

Gene density and non‐coding DNA
• Mammals (including humans) have the lowest gene density
  – Number of genes in a given length of DNA
• Introns within genes
• Noncoding DNA between genes

Components of the human genome
• 20,000 – 25,000 protein‐coding genes (1.5%)
• Introns (25.9%)
• Transposable elements (44.7%)
  – DNA transposons
  – Long terminal repeat (LTR) retrotransposons
  – Short interspersed nuclear elements (SINEs)
  – Long interspersed nuclear elements (LINEs)
  – Endogenous retroviruses
  – Miniature inverted repeat transposable elements (MITEs)

Largest genomes
• Largest sequenced genome: Loblolly pine (Pinus taeda), 20,000,000,000 bp (20 Gb)
• Kinugasasō (Paris japonica), 149,000,000,000 bp (149 Gb)

Smallest genomes
• Eukaryota
  – Free: Ostreococcus tauri (12.6 Mb)
  – Endosymb: Encephalitozoon intestinalis (2.3 Mb)
• Bacteria and Archaea
  – Free: Mycoplasma genitalium (580 kb)
  – Endosymb: Cand. Carsonella ruddii (160 kb)
• Viruses
  – Circoviridae (1.8 kb, only two proteins!)
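
The unit prefixes on the "Genome sizes" slide are plain powers of ten, so the genome sizes quoted on these slides can be compared directly once they are converted to base pairs. The sketch below is only an illustration: the helper names `format_bp` and `parse_size` are made up for this example, and the sizes are the rounded values given on the slides.

```python
# Unit prefixes from the "Genome sizes" slide, as numbers of base pairs.
UNITS = [("Tb", 10**12), ("Gb", 10**9), ("Mb", 10**6), ("kb", 10**3)]


def format_bp(n_bp: int) -> str:
    """Write a genome size using the largest unit that keeps the value >= 1."""
    for suffix, factor in UNITS:
        if n_bp >= factor:
            return f"{n_bp / factor:g} {suffix}"
    return f"{n_bp} bp"


def parse_size(text: str) -> int:
    """Convert a string such as '12.6 Mb' into a number of base pairs."""
    value, suffix = text.split()
    factors = dict(UNITS, bp=1)
    return round(float(value) * factors[suffix])


# Rounded genome sizes as quoted on the slides.
genomes = {
    "Paris japonica": 149_000_000_000,
    "Pinus taeda": 20_000_000_000,
    "Homo sapiens": 3_000_000_000,
    "Ostreococcus tauri": parse_size("12.6 Mb"),
    "Mycoplasma genitalium": parse_size("580 kb"),
    "Cand. Carsonella ruddii": parse_size("160 kb"),
}

for name, size in sorted(genomes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:25s} {format_bp(size):>8s}")

# How many times larger is the Paris japonica genome than the human genome?
ratio = genomes["Paris japonica"] / genomes["Homo sapiens"]
print(f"Paris japonica / human: {ratio:.1f}x")

# 1.5% of the human genome is protein-coding: how many base pairs is that?
coding_bp = round(0.015 * genomes["Homo sapiens"])
print("Protein-coding part of the human genome:", format_bp(coding_bp))
```

Keeping everything in base pairs internally and formatting only for display avoids mixing Mb and kb values in one comparison.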
Genetic diversity
• Phylogenetic Tree of Life
[Figure: tree of life; labels: Eukaryotes, Archaea, Bacteria, Prokaryotes]

Human genome
• 3,000,000,000 bp (3 Gb)
• Human Genome Project (HGP)
  – 1990‐2003
  – Draft genome sequence complete in 2000
• Reference genome
  – Source: blood (female) and sperm (male)
  – Samples taken from many donors, but only a few were used to protect donor identities
  – Sequence is not from one individual
    • >70% from one male donor
• Cost HGP: $ 3,000,000,000
  – Target: $ 1,000 genome

Genome sequencing
• Whole Genome Shotgun (WGS) approach
[Figure labels: Cloned genomes; Segments known order; Fragment and sequence; Assemble sequences; Consensus genome]

Personal genome sequences
[Figure: Reference Genome, Craig Venter, James Watson, Your personal genome sequence; labels: ~2,000,000 differences, ~5,000,000 differences, ~5,000,000 differences]

So we have a $200 personal genome…
• …now the million dollar question is:
  What can I learn from my 3,000,000,000 A’s, C’s, G’s, and T’s?

Personalized medicine
• Sergey Brin (Co‐founder, Co‐investor)
  – LRRK2 polymorphism on chromosome 12
    • 28% risk of Parkinson’s at age 59
    • 51% at age 69
    • 74% at age 79
• From reactive to proactive medicine
  – Identify high risk alleles
  – Adapt lifestyle (e.g. risk of high blood pressure)
  – Preventive screening or treatment (e.g. risk of cancer)
• Pharmacogenomics:
  – Impact of genetic variation on response to medication

Biology is Big Data science
[Figure: # sequenced genomes]
• Moore's Law: computer power doubles every ~2 years.

Omics sciences
• The suffix ‐ome refers to a totality of some sort
[Figure labels: DNA, RNA, Protein]
• Gene (genetics) • Genome • Genomics
• Transcript (RNA) • Transcriptome • Transcriptomics
• Protein • Proteome • Proteomics
• Metabolite • Metabolome • Metabolomics
• Lipid • Lipidome • Lipidomics
• Microbe • Microbiome • Microbiomics (?!)
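
The "Genome sequencing" slide names the shotgun steps only as figure labels (fragment and sequence, assemble sequences, consensus genome). As a rough illustration of the idea, the toy sketch below cuts a short made-up sequence into overlapping reads, shuffles them, and merges the pair with the largest overlap until nothing overlaps anymore. Everything here is an assumption for teaching purposes: the function names, the 20-nucleotide "genome", and the greedy strategy itself. Real assemblers also have to handle sequencing errors, repeats, and both DNA strands, and this sketch only works because the example sequence contains no repeated 4-mers.

```python
import random


def shotgun_reads(genome: str, read_len: int, step: int) -> list[str]:
    """Cut overlapping reads from the genome and shuffle them, losing their order."""
    starts = list(range(0, len(genome) - read_len, step)) + [len(genome) - read_len]
    reads = [genome[s:s + read_len] for s in starts]
    random.Random(0).shuffle(reads)  # fixed seed keeps the example reproducible
    return reads


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 4) -> list[str]:
    """Merge the pair of sequences with the largest overlap until none overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # the remaining contigs share no overlap of min_len or more
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Made-up toy sequence without repeated 4-mers, so overlaps are unambiguous.
genome = "ATGCGTACCTGAAGTTCAGG"
reads = shotgun_reads(genome, read_len=8, step=3)
contigs = greedy_assemble(reads)
print(len(reads), "reads ->", contigs)
```

With a repeat longer than the minimum overlap, the greedy merge could join the wrong reads, which is one reason repeat-rich genomes such as the human genome are hard to assemble.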
Genomics
• Identify differences in gene content between genomes
• Analyze genome evolution
• Predict gene functions
[Figure label: Chordata ↔ Echinodermata]

Metagenomics
• Discover new species: “Biological Dark Matter”
[Figure labels: Sample; Filter; Microbes or viruses]

Human microbiome and virome
• In your body:
  – ~10^13 human cells
  – ~10^14 bacteria
  – ~10^15 viruses
Image: Lisa Brown for

Bioinformatics
• Bioinformatics: study of informatic processes in biotic systems
  – Paulien Hogeweg and Ben Hesper (Utrecht University, 1970)
• Bioinformatic Data Analysis: using computational methods to analyze biological data

Bioinformatics in Utrecht today
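
The first Genomics bullet, identifying differences in gene content between genomes, maps naturally onto set operations once each genome is reduced to the set of genes it contains. The sketch below is a minimal illustration under loud assumptions: the three genomes are hypothetical, and although the gene names exist in real bacteria, their assignment to these genomes is invented.

```python
# Gene names are real, but their assignment to these three genomes is invented.
gene_content = {
    "genome_A": {"dnaA", "rpoB", "recA", "lacZ"},
    "genome_B": {"dnaA", "rpoB", "recA", "nifH"},
    "genome_C": {"dnaA", "rpoB", "nifH", "luxI"},
}

# Genes present in every genome, and genes present in at least one genome.
shared_by_all = set.intersection(*gene_content.values())
all_genes = set.union(*gene_content.values())

# Genes found in exactly one genome and in none of the others.
unique_genes = {}
for name, genes in gene_content.items():
    others = set.union(*(g for n, g in gene_content.items() if n != name))
    unique_genes[name] = genes - others

print("shared by all:", sorted(shared_by_all))
print("total distinct genes:", len(all_genes))
for name, genes in unique_genes.items():
    print(f"unique to {name}:", sorted(genes))
```

In practice the hard part is deciding which genes in different genomes count as "the same" gene, which this sketch sidesteps by using shared names.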
