Fishing for Genes in the UCSC Browser: A Tutorial

Rachel Harte <[email protected]>
Mark Diekhans <[email protected]>
University of California, Santa Cruz, CA
Version 1.0, September 18, 2008

Abstract

This tutorial is aimed at the biologist who is interested in exploring protein-coding genes using the University of California Santa Cruz (UCSC) Genome Browser. It is geared towards those who have little or no experience using the UCSC Genome Browser and for more advanced users who are not familiar with many of the gene-oriented browser features. Using the example of a human gene, PPP1R1B, the reader is guided through a step-by-step process for finding and visualizing protein-coding genes in the context of the human genome and a wide variety of genomic data. The user is shown how to use the UCSC Genome Browser to locate a Mammalian Gene Collection (MGC) clone of the gene and how to order the clone from suppliers.

1 Accessing the UCSC Genome Browser

The UCSC Genome Bioinformatics website consists of a suite of tools for the viewing and mining of genomic data. The UCSC Genome Browser [16, 18] facilitates the viewing of clones from the Mammalian Gene Collection (MGC)[35, 37] in the context of other genome annotations. Various types of annotations such as MGC Genes are visualized in tracks which are graphical representations of data displayed in a genomic context on the Genome Browser.

To view these annotation tracks, first go to the home page at http://genome.ucsc.edu (Figure 1). The blue menu bars at the top and the left side of the Genome Browser home page both include the following links to various tools (Figure 1):

• Genomes on the top blue bar or the Genome Browser link on the side blue menu bar allow entry to the Gateway page for the Genome Browser. From here, users may select the genomes that they wish to browse. An additional route of entry is via a Photo Gateway page in the UCSC genomewiki with photographs of each species whose genome is represented in the UCSC Genome Browser.
• Blat is a fast alignment tool[20] which allows the user to align DNA or protein sequences to the genome assemblies.
• Tables on the top blue bar or Table Browser[17] on the side blue bar link to an interface allowing the retrieval of data associated with tracks from the databases underlying the Genome Browser. Data added as additional tracks created by the user (Custom Tracks) may also be queried with this tool.
• Gene Sorter is a tool that displays a sorted table of genes that are related by some metric selected by the user e.g. similar expression patterns, protein homology or proximity in the genome[19].
• Genome Graphs on the side blue bar is a tool to facilitate viewing of whole genome datasets such as genome-wide SNP association studies, linkage studies and homozygosity mapping. Instructions are in the Genome Graphs User's Guide.
• PCR on the top blue bar or In Silico PCR on the side blue bar is a fast method of searching for a pair of PCR primer sequences in a genome assembly.
• VisiGene is a virtual microscope for viewing in situ hybridization images.
• Proteome on the top blue bar and Proteome Browser on the side blue bar lead to the gateway of a tool for viewing proteins and their properties[14]. Both graphical images and links to external websites provide a rich source of protein information.
• Utilities on the side blue bar allows access to various tools to remove non-sequence-related characters from DNA or protein, for creating a gif image for a phylogenetic tree and a tool which converts genome coordinates between assemblies (liftOver).
• Downloads on the side blue bar allows bulk download of data including dumps from the Genome Browser databases.
• Custom Tracks is a powerful tool allowing users to add their own data for viewing and querying within the context of a genome and its associated annotation data. A User's Guide provides help for creating Custom Tracks.

Figure 1: UCSC Genome Browser home page. On this page, there is an introduction to the web site and postings of news of training seminars and recent additions such as new assemblies and software features. The blue menu bars at the top and left side of the page allow access to the Genome Browser for genome assemblies of a variety of organisms, data mining tools and help pages.

2 Searching for a gene

Let's suppose that a user wishes to study the human PPP1R1B (protein phosphatase 1, regulatory (inhibitor) subunit 1B) gene, which is expressed in the brain. The protein encoded by this gene is also known as DARPP32. Its phosphorylation status is regulated by dopaminergic and glutamatergic (NMDA) receptors. Once phosphorylated, PPP1R1B is a potent protein phosphatase-1 inhibitor. To visualize this gene in the UCSC Genome Browser in the context of various genomic annotations, the user may take the following steps:

1. To get started, clicking on either the Genomes or Genome Browser link will take the user to the Gateway page where the clade, genome and assembly of interest may be selected from pull-down lists of multiple organisms and genome assemblies.

2. Below this area, there is a section describing the selected assembly which also indicates that the default assembly (at the time of writing) is known as hg18 on the UCSC Genome Browser website. This assembly is also known as NCBI Build 36.1. This section also contains a list of Sample position queries. These are a selection of queries that may be entered into the position or search term box adjacent to the genome and assembly controls.

[...] 35046403 link. This will take you to the PPP1R1B locus in the human genome. Note that there are two RefSeq splice variants listed for this locus and this one is the longer transcript variant.

3 Genome Browser annotation tracks

The Genome Browser displays certain annotation tracks by default in the main Browser image. The current default display (Figure 4) shows a number of gene tracks:

• UCSC Gene predictions[15, 18]
• BLAT alignments of sequences from GenBank[2]
• BLAT alignments of full-length ORF MGC Genes.

Other visible tracks in this default display are: [...]
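As a side note to the search steps in Section 2, the choice of assembly and the text typed into the position or search term box can also be written as a single browser URL. The following is a minimal sketch, not part of the tutorial's own procedure. It assumes the browser's URL-linking convention of a cgi-bin/hgTracks path with db and position parameters; only the site address, the hg18 assembly name and the PPP1R1B gene symbol come from the text above. The function name browser_url is hypothetical.

```python
from urllib.parse import urlencode

# The site address comes from the tutorial. The cgi-bin/hgTracks path and the
# "db" / "position" parameter names are assumptions based on the browser's
# URL-linking convention, not something stated in this tutorial.
BROWSER_BASE = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(assembly: str, query: str) -> str:
    """Build a Genome Browser URL for an assembly and a position or search term."""
    # urlencode percent-escapes characters that are unsafe in a query string.
    return BROWSER_BASE + "?" + urlencode({"db": assembly, "position": query})


if __name__ == "__main__":
    # Assembly hg18 and gene symbol PPP1R1B are the tutorial's running example.
    print(browser_url("hg18", "PPP1R1B"))
```

Opening the printed address in a web browser should be roughly equivalent to selecting hg18 on the Gateway page and submitting PPP1R1B as the search term, although the site may first show a list of matching results, as it does for an interactive search.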
