Spring 2003 Final3


NCBI News
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Department of Health and Human Services
Spring 2003

In this issue: Field Guide to GenBank; Human Reference Sequence; UniGene Expands; Rat Genome Assembly; Taxonomy Browser; Search the NCBI Web; Recent Publications; New Genomes in GenBank; Entrez Quiz; Submissions Corner; GenBank Cumulative Updates; GenBank Release 135

A Field Guide to GenBank® and NCBI Resources: NCBI’s Scientific Outreach and Training Program

Biological sequence and structure information is now used in nearly every field of biological research. A working knowledge of these resources and standard computational biology tools is an essential part of every biologist’s toolkit. However, keeping up with these databases and tools can be challenging in this period of rapidly changing bioinformatics resources.

In order to help researchers keep abreast of enhancements and the increasing diversity of NCBI molecular biology resources, the NCBI service desk provides a free training program for biologists on the effective use of NCBI databases and tools. The course, called “A Field Guide to GenBank and NCBI Resources”, is designed especially for biologists who work at the bench or in the field but use sequence and structure data in their research. All researchers, educators and students who work with biological sequence and structure data should find this to be a useful introduction and survey of the available NCBI tools and databases. Because of the rapid expansion of the resources, even experienced NCBI users will likely learn something new (continued on page 4)

Figure 1: This Field Guide Map shows the number of Field Guides held in each state through April 2003.

First Version of Human Genome Reference Sequence Debuts on DNA’s 50th

April 14, 2003 marked the 50th anniversary of the description of the structure of DNA and also saw the release of the first version of the 3 billion base pair reference sequence of the human genome. Annotations to the raw sequence made public on April 14 were released on April 29 when the reference genome, NCBI build 33, appeared in the NCBI Map Viewer.

The human genome reference sequence is a critical contribution to RefSeq, NCBI’s database of reference sequences for genomes, mRNAs, and proteins.

The reference sequence covers about 99 percent of the human genome’s gene-containing, euchromatic regions, at an accuracy of 99.99 percent. Only the sequence near centromeres, telomeres, and other heterochromatic blocks, along with a small number of unclonable gaps, has yet to be determined. Small updates to the assembly will continue to be made as complex regions are refined and the remaining gaps, about 400 of less than 100 kilobases each, are closed.

The assembly is the end product of several years of collaborative work by the Human Genome Project sequencing centers, NCBI, the University of California at Santa Cruz, and Ensembl, a joint project between EMBL-EBI and the Sanger Institute.

There are about 32,000 genes annotated by NCBI on build 33; of those, almost 18,000 are supported by mRNA alignments and may be considered to be confirmed. The typical confirmed human gene has 12 exons of an average length of 236 base pairs each, separated by introns of an average length of 5,478 base pairs. As a consequence, the average intron length is about twice the average transcript length. Some statistics on NCBI’s build 33 are given in Table 1.

To view the genome or download the sequence, start at the “Human Genome Resources” link under “Hot Spots” on the NCBI Home Page. See the upcoming summer edition of the NCBI News for more on the human genome and how to explore it at NCBI.

Gene Identification Method   EST Alignment   Predicted   mRNA Alignment
Number of Genes                      3,295      11,972           17,708
Exons/Gene (Avg)                         4           5               12
Exon Length (Avg)                      440         158              226
Intron Length (Avg)                  5,605      10,512            5,478
mRNA Length (Avg)                    2,127         820            2,741
Gene Length (Avg)                   13,940      43,918           55,147

Table 1: Statistics: First version of the human genome reference sequence, NCBI human genome build 33.

UniGene Expands; C. elegans added to LocusLink

With the recent additions of Ciona intestinalis (sea squirt), Oncorhynchus mykiss (rainbow trout), and Vitis vinifera (the wine grape), the UniGene database of non-redundant gene-oriented clusters now covers 15 animals, 12 plants and Dictyostelium discoideum, a slime mold. Each UniGene cluster is linked to related information, such as the tissue types in which the gene is expressed, model organism protein similarities, the LocusLink report for the gene, and its map location. Try UniGene by following the “UniGene” link under “Hot Spots” on the NCBI Home Page.

The nematode, Caenorhabditis elegans, represented in UniGene with over 19,000 clusters, is now also represented in the LocusLink database of genetic loci with over 22,000 records.

NCBI News

NCBI News is distributed four times a year. We welcome communication from users of NCBI databases and software and invite suggestions for articles in future issues. Send correspondence to NCBI News at the address below.

NCBI News
National Library of Medicine
Bldg. 38A, Room 3S308
8600 Rockville Pike
Bethesda, MD 20894
Phone: (301) 496-2475
Fax: (301) 480-9241
E-mail: [email protected]

Editors: Dennis Benson, David Wheeler
Contributors: Medha Bhagwat, Peter Cooper, Eric Sayers
Writers: Vyvy Pham, David Wheeler
Editing and Production: Robert Yates
Graphic Design: Gary Mosteller, Joe Vuthipong

In 1988, Congress established the National Center for Biotechnology Information as part of the National Library of Medicine; its charge is to create information systems for molecular biology and genetics data and perform research in computational molecular biology.

The contents of this newsletter may be reprinted without permission. The mention of trade names, commercial products, or organizations does not imply endorsement by NCBI, NIH, or the U.S. Government.

NIH Publication No. 03-3272
ISSN 1060-8788
ISSN 1098-8408 (Online Version)

Rat Genome Assembly Deposited into GenBank

With the recent deposit of the draft rat genome assembly into GenBank, researchers can view the rat genome in Entrez and from the Rat Genome Map Viewer. Other resources such as Rat Genomic BLAST and downloadable files of predicted gene and protein sequences complement the navigational and display capabilities of the Viewer.

The Whole Genome Shotgun (WGS) effort that produced the rat genome was managed by the Rat Genome Sequencing Consortium (RGSC), which is led by the Baylor College of Medicine Human Genome Sequencing Center. The current assembly comprises over 157,000 WGS contigs fused to produce almost 1,800 supercontigs, or scaffolds.

The rat WGS contigs are assigned accession numbers of the format “AABR########”, with the first two digits describing the project version number. Thus, the first version has the accession AABR01000000, and consists of sequences AABR01000001 through AABR01157561. One can either search for AABR01000000 in Entrez, and from its record, click to view the records AABR01000001 through AABR01157561; or, one can search for any of the individual records separately. The WGS reads are found in the Trace Archive and can also be queried using the BLAST service described below. There are over 38,000,000 rat traces available via the “Hot Spots” “Trace archive” link on the NCBI Home Page. More information on the assembly and other rat genomic resources is available from the NCBI “Genomic Biology” page. Follow the “Rat” link under “Organism-Specific Resources”. To download sequence and annotations via FTP, use ftp.ncbi.nih.gov/genomes

The genome assembly can be displayed using the Rat Genome Map Viewer, as in Figure 1. In the figure, several different maps are shown, including gene sequence, component, contig, GenomeScan, UniGene, and STS. The component map shows the tiling path of the WGS contig accessions used to build the “NW_xxxxxx” scaffolds, whereas the contig map displays the placement of the scaffolds on the chromosomes. Genes that have also been annotated on the genome assembly using the alignment of mRNAs to the contigs are shown on the Genes_seq map, while GenomeScan predictions are included on a separate map. In addition, the Transcript map shows the combinations of exons, or splice variants, that are implied by mRNA alignments. The Rn_UniGene map shows the alignment of rat ESTs to the genome assembly and shows EST-only gene models based on shared introns and alignment to a common position on the genome.

Figure 1: Map Viewer display of rat genome assembly for chromosome 17 showing several parallel maps, including those for predicted and confirmed genes, and the contigs and their components. Click on “Maps and Options” to view all the available maps.

Try the rat MapViewer at: www.ncbi.nlm.nih.gov/mapview

Researchers can also BLAST a query sequence against WGS contigs, WGS traces from the Trace Archive, BAC ends, HTG, RNA, and EST sequences, using the Rat Genome BLAST service. BLAST hits are shown marked in red on the genome assembly. Figure 2 shows the BLAST hit of a Bos taurus p53 tumor suppressor phosphoprotein (TP53) mRNA against the rat genome assembly. The query sequence shows 82% identity to the rat contig NW_042660.1. The parallel gene sequence map shows the rat gene TP53 annotated within the same region, suggesting a homologous relationship between the Bos taurus and rat genes.

The Rat BLAST service can be accessed from the BLAST Home Page.
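
The WGS accession scheme described in the rat genome article can be sketched in a few lines of Python. This is only an illustration, assuming the layout stated there: a four-letter project prefix ("AABR"), a two-digit project version, and a six-digit contig number, with the all-zero contig number marking the project-level record. The names `WgsAccession`, `parse_wgs_accession`, `is_project_record`, and `contig_accession` are hypothetical helpers, not part of any NCBI tool.

```python
import re
from typing import NamedTuple


class WgsAccession(NamedTuple):
    """Parts of a WGS accession such as AABR01000001 (hypothetical helper type)."""
    project: str  # four-letter project prefix, e.g. "AABR"
    version: int  # two-digit project version, e.g. 1
    contig: int   # six-digit contig number; 0 is the project-level record


# Layout assumed from the article: 4 letters + 2 digits + 6 digits.
_WGS_PATTERN = re.compile(r"([A-Z]{4})(\d{2})(\d{6})")


def parse_wgs_accession(accession: str) -> WgsAccession:
    """Split a WGS accession into project prefix, version, and contig number."""
    match = _WGS_PATTERN.fullmatch(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a WGS accession of the assumed form: {accession!r}")
    project, version, contig = match.groups()
    return WgsAccession(project, int(version), int(contig))


def is_project_record(accession: str) -> bool:
    """True for the project-level record, e.g. AABR01000000."""
    return parse_wgs_accession(accession).contig == 0


def contig_accession(project: str, version: int, contig: int) -> str:
    """Build a contig accession string from its parts."""
    return f"{project}{version:02d}{contig:06d}"


if __name__ == "__main__":
    first = parse_wgs_accession("AABR01000001")
    last = parse_wgs_accession("AABR01157561")
    # Number of contig records spanned by the first and last accessions.
    print(last.contig - first.contig + 1)
    print(is_project_record("AABR01000000"))
```

Applied to the range quoted in the article, AABR01000001 through AABR01157561, the contig numbers span 157,561 records, which agrees with the "over 157,000 WGS contigs" figure. Scaffold accessions of the "NW_xxxxxx" form follow a different pattern and are deliberately rejected by this sketch.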