
The UCSC Browser Introduction

Materials prepared by Warren C. Lathe, Ph.D., and Mary Mangan, Ph.D.
www.openhelix.com
Updated: Q2 2007

Copyright OpenHelix. No use or reproduction without express written consent. Version10_0407

The UCSC Homepage: http://genome.ucsc.edu

Homepage layout:
- Navigation links
- General information
- Specific information: new features, current status, etc.

The Genome Browser Gateway start page, basic search

text/ID searches

Helpful search examples, suggestions below

- Use this Gateway to search by:
  - Gene names, symbols
  - Chromosome number: chr7, or region: chr11:1038475-1075482
  - Keywords: kinase, receptor
  - IDs: NP, NM, OMIM, and more…
- See lower part of page for help with format
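The two position formats above (a whole chromosome, or a chromosome with a start-end range) can be checked before pasting them into the Gateway. The following is only a sketch: the function name `parse_position` and the exact pattern are assumptions for illustration, not the Gateway's own parser.

```python
import re

# "chrN" alone, or "chrN:start-end"; commas inside the numbers are tolerated.
_POSITION = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")

def parse_position(text):
    """Return (chrom, start, end); start and end are None for a whole chromosome."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end
```

Anything that does not match, such as a keyword or a gene symbol, raises `ValueError`, which mirrors the fact that those inputs are text searches rather than positions.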

The Genome Browser Gateway start page choices, April 2007

Make your Gateway choices:
1. Select Clade
2. Select genome = species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other choices

Accessing the BLAT tool

BLAT = BLAST-like Alignment Tool
- Rapid searches by INDEXING the entire genome
- Works best with high similarity matches
- See documentation and publication for details
- Kent, WJ. Genome Res. 2002. 12:656
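To see why indexing makes searches fast, here is a toy sketch of the general idea: build a lookup table of short words (k-mers) from the genome once, then look up the words of each query. This is not BLAT's implementation; the function names, the tiny sequences, and the choice of k = 4 are assumptions for illustration only.

```python
from collections import defaultdict

def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start offsets."""
    index = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index

def find_seeds(index, query, k):
    """Look up every k-mer of the query; return (query_offset, genome_offset) hits."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], []):
            hits.append((q, g))
    return hits

# Made-up sequences, far smaller than any real genome.
genome = "ACGTACGTTTGACCA"
index = build_index(genome, 4)
seeds = find_seeds(index, "CGTTTGAC", 4)
```

Each seed is only a candidate location; a real aligner would extend seeds into full alignments. Exact word matches are also a plausible reason the tool works best with high similarity matches.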

BLAT tool overview: www.openhelix.com/sampleseqs.html

- Make choices
- Paste one or more sequences
  - DNA limit 25000 bases
  - Protein limit 10000 aa
  - 25 total sequences
- Or upload
- Submit
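The limits above can be checked locally before submitting. In this sketch the three numbers come from the slide; the function name `check_submission` and the reading of the base and amino-acid limits as per-sequence limits are assumptions.

```python
DNA_LIMIT = 25000      # bases (from the slide)
PROTEIN_LIMIT = 10000  # amino acids (from the slide)
MAX_SEQUENCES = 25     # sequences per submission (from the slide)

def check_submission(sequences, is_protein=False):
    """Return a list of problems; an empty list means the input fits the limits."""
    # Assumption: the length limit applies to each sequence separately.
    limit = PROTEIN_LIMIT if is_protein else DNA_LIMIT
    problems = []
    if len(sequences) > MAX_SEQUENCES:
        problems.append(
            f"{len(sequences)} sequences submitted; limit is {MAX_SEQUENCES}"
        )
    for number, seq in enumerate(sequences, start=1):
        if len(seq) > limit:
            problems.append(
                f"sequence {number} has {len(seq)} residues; limit is {limit}"
            )
    return problems
```

For example, `check_submission(["ACGT"])` returns an empty list, while a single 25001-base sequence produces one problem message.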

The Genome Browser Gateway sample search for TP53

- Sample search: human, March 2006 assembly, tp53
- Select from results list
- ID search may go right to a viewer page, if unique

Overview of the whole Genome Browser page (mature release)

Genome viewer section

Groups of data:
- Mapping and Sequencing Tracks
- Phenotype and Disease Tracks
- Genes and Gene Prediction Tracks
- mRNA and EST Tracks
- Expression and Regulation
- Comparative Genomics
- Variation and Repeats
- ENCODE Tracks

Sample Genome Viewer image, TP53 region

Tracks labeled in the image:
- base position
- STS markers
- UCSC genes
- RefSeq genes
- MGC clones
- ESTs
- 17 species compared
- single species compared
- SNPs
- repeats

Options for Changing Images: Upper Section

Controls: walk left or right; zoom in; zoom out; specify a position; fonts, window, more; click to zoom 3x and re-center

- Change your view or location with controls at the top
- Use “base” to get right down to the nucleotides
- Configure: to change font, window size, more…
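The "click to zoom 3x and re-center" control amounts to simple arithmetic on the displayed window. The sketch below assumes 1-based inclusive coordinates and a known chromosome length; the function name `zoom_in` and the way the window is kept inside the chromosome are assumptions, not the browser's actual code.

```python
def zoom_in(start, end, click, chrom_size, factor=3):
    """Return a window 1/factor as wide as [start, end], centered on click."""
    width = max(1, (end - start + 1) // factor)
    new_start = click - width // 2
    # Shift the window if it would run off either end of the chromosome.
    new_start = max(1, min(new_start, chrom_size - width + 1))
    return new_start, new_start + width - 1
```

With a 9000-base window at 1001-10000 on a chromosome of 100000 bases, clicking at base 5000 gives the window 3500-6499, which is 3000 bases wide.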

Get DNA, with Extended Case/Color Options

- Use the DNA link at the top
- Plain or Extended options
- Change colors, fonts, etc.

Overview of the whole Genome Browser page (first day, new human release)

Genome viewer section
- Tracks are added to an assembly over time
- Not all are present in a new release at first

Track and image controls (day 1 = 40 tracks)

The UCSC Gene Sorter & Table Browser
Advanced searching and discovery using the UCSC Table Browser

Materials prepared by: Warren C Lathe, Ph.D. Updated: Q2 2007

Version 10_0407

Gene Sorter

From homepage select “Gene Sorter”

Gene Sorter

Sort genes by similarity criteria

Gene Sorter: Choose Genome and Gene

- Choose Genome
- Choose Assembly
- Type in gene name or accession number

Gene Sorter: Similarity Sorting Options

Choose from several similarity sorting options; the options vary by assembly and species.

Gene Sorter: Get Results

Go to results

Gene Sorter: Results

Description

Re-sort: re-sort genes based on the gene clicked, using the same parameters.

Gene Sorter: Results

VisiGene: open the UCSC VisiGene image page for the gene.

Gene Sorter: Results

Alignment: amino acid alignment between the protein products of the reference gene and the chosen gene.

Gene Sorter: Results

Browser: open the Genome Browser at the location of the chosen gene.

Gene Sorter: Results

Description: open the gene details page of the chosen gene.

Gene Sorter: Text File of Data

Data text: obtain a tab-delimited file of all displayed gene data.
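A tab-delimited file like this is easy to load for further work. The sketch below uses only Python's standard `csv` module; the function name `read_gene_table`, the assumption that the first row holds column names, and the two-column sample are all hypothetical, since the real columns depend on what is displayed.

```python
import csv
import io

def read_gene_table(handle):
    """Read tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)

# Hypothetical excerpt standing in for a downloaded file.
sample = "name\tdescription\nTP53\ttumor protein p53\n"
rows = read_gene_table(io.StringIO(sample))
```

For a saved file, pass an open file object instead, for example `open(path, newline="")`.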

Gene Sorter: Gene Sequences

Sequences: choose the sequence type, then get a FASTA file of all gene sequences (Get sequences).

Gene Sorter

Takes the gene of interest and finds the most similar genes based on:
- expression
- protein sequence or domains
- GO annotation
- etc.

Sorted genes can be filtered to narrow the results down to exactly the list of genes you want

The Gene Sorter displays links out to diverse information about the identified genes

The Table Browser

http://genome.ucsc.edu/

Genome Browser Database

visualize search & download

Underlying Database (MySQL)

Primary table: positions, names, etc.
Auxiliary table: related data
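The primary/auxiliary split can be pictured as two relational tables joined on a shared key. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL; the table names, column names, and the single row are invented for illustration and are not the real UCSC schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical primary table: positions and names
    CREATE TABLE gene_positions (
        name TEXT, chrom TEXT, chromStart INTEGER, chromEnd INTEGER
    );
    -- Hypothetical auxiliary table: related data keyed by name
    CREATE TABLE gene_descriptions (name TEXT, description TEXT);
    INSERT INTO gene_positions VALUES ('geneA', 'chr4', 100, 900);
    INSERT INTO gene_descriptions VALUES ('geneA', 'example description');
""")

# Join the auxiliary data onto the primary rows by the shared name.
rows = conn.execute(
    "SELECT p.name, p.chrom, p.chromStart, p.chromEnd, d.description "
    "FROM gene_positions AS p "
    "JOIN gene_descriptions AS d ON d.name = p.name"
).fetchall()
```

Keeping positions in one table and descriptive data in another lets position queries stay small while extra annotation is pulled in only when needed.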

The Table Browser

Open browser

http://genome.ucsc.edu/

Table Browser: Choose Genome

Choose Genome

In the Human genome, search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Table Browser: Choose Table to Search

Choose Data Table

In the Human genome, search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Table Browser: Describe Table

Describe table

Table Browser: Choose Region to Search

Choose Region to Search: define region

In the Human genome, search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Table Browser: Upload Locations to Search

Paste Upload

Table Browser: Filter to Refine Search

Create Filter

Submit Filter

In the Human genome, search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
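The logic of this filter (restrict to a region, then keep rows whose copy number exceeds 10) can be sketched on downloaded rows. The `Repeat` record, its field names, the 0-based half-open coordinates, and the sample rows are assumptions for illustration; the Table Browser applies the filter on the server.

```python
from typing import NamedTuple

class Repeat(NamedTuple):
    chrom: str
    start: int       # assumed 0-based, half-open coordinates
    end: int
    copy_num: float  # hypothetical field name for the copy number

def filter_repeats(rows, chrom, region_start, region_end, min_copies):
    """Keep repeats on chrom that overlap the region with copy number above min_copies."""
    return [
        r for r in rows
        if r.chrom == chrom
        and r.start < region_end and r.end > region_start
        and r.copy_num > min_copies
    ]

# Made-up rows, not real annotation.
rows = [
    Repeat("chr4", 100, 150, 12.5),
    Repeat("chr4", 200, 230, 4.0),
    Repeat("chr7", 100, 150, 20.0),
    Repeat("chr4", 5000, 5050, 30.0),
]
kept = filter_repeats(rows, "chr4", 0, 1000, 10)
```

Only the first row survives: the second has too few copies, the third is on another chromosome, and the fourth lies outside the region.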


Table Browser: Output Data

Output data

In the Human genome, search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.

Table Browser: Output Formats

Text Fields Output formats

Table Browser: Fasta Sequence Output

Sequence

Table Browser: Database Format Outputs

Database

Table Browser: Custom Track Output

Custom Track

Table Browser: Obtaining Output

Adding a file name creates a file on your desktop; leaving the name blank shows the output in the browser (exception: custom track).

Data Summary

Table Browser: Output configuration

Sequence Format

Get Sequence

Table Browser: Intersecting Data

2nd Table Any Overlap

Intersect Submit

Find simple repeats (copy number > 10) within known genes and download the sequence.


Table Browser: Intersecting Data Narrows Search

Filtered simple repeats

Summary Filtered simple repeats, intersected (overlapping) w/ known genes
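The "any overlap" intersection keeps each item from the first table that touches at least one item in the second. A minimal sketch, assuming intervals are `(chrom, start, end)` tuples with 0-based half-open coordinates; the function names and sample intervals are hypothetical.

```python
def overlaps(a, b):
    """True when two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def intersect_any(primary, secondary):
    """Keep each primary interval that overlaps at least one secondary interval."""
    return [p for p in primary if any(overlaps(p, s) for s in secondary)]

# Made-up intervals: three filtered repeats and one gene.
repeats = [("chr4", 100, 150), ("chr4", 300, 310), ("chr7", 100, 150)]
genes = [("chr4", 120, 300)]
in_genes = intersect_any(repeats, genes)
```

This compares every pair, which is fine for small lists; large tables would call for sorting or an interval index. Because the result can only drop rows, intersecting narrows the search, as the summary counts show.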

Table Browser: Downloading Sequence Data

Sequence Format

Get Sequence

Table Browser

A resource for advanced searching and discovery of specific data in the UCSC database

Available data for dozens of

Diverse data types such as SNPs, repeats, CpG islands, expression, or inter-species conservation

Lets you filter for exactly the data type you want and intersect different datasets

Different possible output formats

Galaxy

A platform for interactive large-scale genome analysis Genome Res 15:1451-55 (2005)

Dorota Retelska, January 28th, 2008

Web-based integrative platform

- Retrieval of genomic sequences and annotations
- Simple operations on genomic data
- Sequence analysis tools
- Output displays
- Flexible history saves your data

Overview

Tools; Current actions; Stored data

Data acquisition: upload

Data acquisition: databases

Data storage

All data is stored in your history and can be freely recovered on your next connection

Data manipulation: Text

Various modifications on data files

Operations on intervals

- All promoters of your list that have CpGs
- Genes including non-synonymous SNPs
- …

Other operations

- Fetch sequences
- Filter using regular expressions
- Statistics
- Histograms
- Regional variation
- Evolution: tree building
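Filtering with a regular expression, one of the operations listed above, keeps only the lines of a dataset that match a pattern. The sketch below shows the idea in plain Python; it is not Galaxy's tool, and the function name `filter_lines` and the sample lines are assumptions.

```python
import re

def filter_lines(lines, pattern):
    """Return the lines in which the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Made-up tab-separated interval lines.
lines = [
    "chr4\t100\t150\trepeatA",
    "chr7\t10\t20\trepeatB",
    "chr4\t300\t340\trepeatC",
]
chr4_only = filter_lines(lines, r"^chr4\t")
```

Anchoring the pattern with `^` and including the tab keeps `chr4` from also matching names such as `chr4_random`.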

Bioinformatic tools

- EMBOSS-based tools
- Paste or modify a sequence
- Reverse complement
- Search for pattern
- Shuffle a set of sequences
- Translation

Summary

- Galaxy allows data retrieval from databases
- Managing and bioinformatic operations on your own and public data
- Eliminates a large fraction of the need for script-based processing ☺
- Allows biologists to perform complex genomic analyses