UCSC Genome Browser Tutorial

Introduction to the UCSC Genome Browser

Overview
Today we’re going to cover:
§ Introduction to the UCSC Genome Browser
§ Navigation and layout
§ Search and Configure
§ Details and Sequences
§ Download data
§ Search by Sequence (BLAT)
§ Share configuration and Export Images
§ Visualize your own data

http://genome.ucsc.edu/

What is the UCSC Genome Browser?
The UCSC Genome Browser is a graphical visualization tool for exploring the genome and genomic data. It was developed by Jim Kent and is maintained by the UCSC Genome Browser Project Team.

Genome Browser: like Google Maps for Genomes
Map layers shown: Businesses, Bus Stops, Traffic

Genome Browser: like Google Maps for Genomes (cont.)
Coordinates (like streets): position in the DNA sequence
[Figure: browser view of hg19 around chr21:33,026,000-33,039,000 (scale bar 5 kb). Tracks shown: UCSC Genes; RNA-seq from ENCODE/Caltech (GM78 and K562 samples); Transcription Factor Binding Sites by ChIP-seq from ENCODE/HAIB. Callouts: Genes, Expression, Regulation.]
“Annotation tracks”: everything below the coordinates annotates something notable about that region

Why use the UCSC Genome Browser?
§ Visualize genetic and genomic data
– Publicly available data
– Your own data
§ Download data
– Useful if you want a subset of the data
– Common interface for data from many public sources
§ Tools for working with genomic data
– Graphical interface
– Command-line

Genome Browser Home Page

Gateway: Start Page
Callouts: Change Assembly, Change Species, Search Genes
• Gene names, symbols, IDs
• Chromosome number (chr7) or position (chr11:4000-5000)
• Keywords: kinase, receptor

Choosing an Assembly
§ The Genome Browser displays a “reference sequence” for every organism; genetic variation (SNPs, insertions, deletions, etc.) is annotated in terms of how it differs from the reference
§ The species-specific genome reference sequence is still being fine-tuned, and there are different versions (called “assemblies”)
§ If you have a list of genomic coordinates, they belong to a particular assembly!

Gateway: Start Page
• Hit “GO” without entering a search term to go to a default location
• Or to the last place you were: the browser “remembers” it

Genome Browser View
Callouts: Zoom, Navigation, Ideogram, Search, Genome View

Genome Browser View (cont.)
Callouts: Tracks, Category, Change Visibility

Visualization of Track Types
[Figure: browser view of hg38 around chr1:11,097,500-11,102,000 (scale bar 2 kb) at the EXOSC10 locus. Tracks shown: GENCODE v24 Comprehensive Transcript Set (only Basic displayed by default); RefSeq gene predictions from NCBI; OMIM Allelic Variants; Gene Expression in 53 tissues from GTEx RNA-seq of 8555 samples (570 donors); Simple Nucleotide Polymorphisms (dbSNP 147) Found in >= 1% of Samples; H3K27Ac Mark (Often Found Near Regulatory Elements) on 7 cell lines from ENCODE; DNase I Hypersensitivity Peak Clusters from ENCODE (95 cell types); 100 vertebrates Basewise Conservation by PhyloP; Multiz Alignments of 100 Vertebrates (Rhesus, Mouse, Dog, Elephant, Chicken, X_tropicalis, Zebrafish, Lamprey); Repeating Elements by RepeatMasker (SINE, LINE, LTR, DNA, Simple, Low Complexity, Satellite, RNA, Other, Unknown). Callouts: Gene, Feature, Direction, 5’ UTR, Exon, 3’ UTR, Wiggle, Regions.]

Additional Options for Changing Images
• Drag tracks up and down to re-arrange
• Drag-and-select
• Zoom in
• Highlight selected section
• Highlight Chromosome View to zoom in/out of section
• Right-click track to change visibility

Basic Visibility Example
• Hide: remove track from view
• Dense: all items collapsed on a single line
• Squish: each item on a separate line, but 50% height + packed
• Pack: each item on a separate line and efficiently stacked
• Full: each item on a separate line

Mid-Page Options to Change Settings
Callouts: Reverse 5’<->3’; Data Search; Reset default track settings; Hides all tracks; Slice track view (e.g. only exons); Fit browser to window

Search an item
§ If the name of the item you search for:
– Is a perfect unique match in a track: the track will open and the item will be highlighted
– Matches multiple items or tracks: you see a list of possible matches and need to select the item/track you want
– Returns an “alternate” chromosome location: there are alternate haplotype chromosomes

Display Configuration (demo)
§ Track visibility
– Hide, dense, squish, pack, full (pack is usually a good choice)
– Tracks automatically collapse to squish/dense format when you zoom out far enough
§ Track ordering: drag and drop
§ Track right-click menu
– Another way to change the visibility and configure display
§ Track highlighting and zoom
§ Search

Track Details
Callouts: Configuration, Description

Item Details
Click any part of the gene
Item description page with additional information

Get Sequence from Description Page
Callouts: Genomic Sequence, Coding Sequence, Translated Protein, Annotation Description Page, Region Page

Get DNA Sequence for Whole View

Details and Sequence (demo)
§ Track Details
§ Item Details
§ Get DNA sequence
– Region
– Item

How to Download Data
• Method 1: Use the Table Browser to get a subset of data
• Method 2: Entire table

Method 1: The Table Browser

The Table Browser (Assembly and Table)
Specify the assembly and table

The Table Browser (Region)
Specify whole genome, or just a region

The Table Browser (Identifiers)
Optional: Get output only for specific item names

Table Browser Query (demo)
Retrieve a list of common SNPs in the BRCA2 gene
1. Navigate to the BRCA2 gene locus using the search box
2. Go to “Tools” -> “Table Browser”
3. Select:
– Assembly: GRCh38/hg38
– Group: Variation
– Track: Common SNPs (147)
– Table: snp147Common
– Region: position (these coordinates will be the BRCA2 position)
4. Hit “get output”
5. Note that you get every column in the table
6. To limit to a few columns, go back and choose:
– Output format: “selected fields from primary and related tables”
7. Select only these fields:
– chrom, chromStart, chromEnd, name, strand, class, func

Table Browser (Output)
§ Note: You need to add 1 to table start coordinates to get the position displayed in the Genome Browser
§ See explanations: https://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
§ http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms

The Table Browser (File Output)
• Enter name for output file.
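
The coordinate note on the Table Browser (Output) slide is easy to get wrong by one base, so a small worked sketch may help. It assumes the convention described in the linked FAQ: table coordinates (chromStart, chromEnd) are 0-based with an exclusive end, while positions shown in the browser (such as chr11:4000-5000) are 1-based and inclusive. The function names `table_to_browser` and `browser_to_table` are made up for this illustration and are not part of any UCSC tool.

```python
import re

# Browser-style position, e.g. "chr11:4000-5000" or "chr21:33,026,000-33,039,000".
_POSITION_RE = re.compile(r"^([^:\s]+):([\d,]+)-([\d,]+)$")


def table_to_browser(chrom, chrom_start, chrom_end):
    """Turn 0-based, end-exclusive table coordinates into a 1-based position string."""
    # Only the start shifts by one; the end value is the same in both conventions.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def browser_to_table(position):
    """Turn a 1-based position string into (chrom, chromStart, chromEnd)."""
    match = _POSITION_RE.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {position!r}")
    chrom, start_text, end_text = match.groups()
    start = int(start_text.replace(",", ""))  # drop thousands separators
    end = int(end_text.replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid range in position: {position!r}")
    return chrom, start - 1, end


if __name__ == "__main__":
    print(table_to_browser("chr11", 3999, 5000))
    print(browser_to_table("chr11:4000-5000"))
```

For example, a table row with chromStart 3999 and chromEnd 5000 on chr11 corresponds to the browser position chr11:4000-5000. Zero-length items (where chromStart equals chromEnd) are not treated specially in this sketch.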