ArrayExpress Gene Expression Atlas Investigating gene expression patterns
The Bioinformatics Roadshow – Rotterdam Erasmus Postgraduate school Molecular Medicine
Ibrahim Emam, Functional Genomics Group, European Bioinformatics Institute
ArrayExpress – two databases
May 2, 2012
How we view experiments…
• Consider one experiment in which the effect of a particular compound treatment is assayed in two different strains and four different tissue types.
Examine profile of Saa4 in one experiment
• With respect to Compound Treatment
Examine profile of Saa4 in one experiment
• With respect to Genotype
Examine profile of Saa4 in one experiment
• With respect to Organism Part (Tissue)
Atlas construction
Atlas construction - example
Meta-analysis framework
• For each experiment:
  • Identify differentially expressed genes between groups of samples
  • A gene is significantly differentially expressed if the combined F-statistic derived from all pairwise comparisons of the means of a gene's expression levels across factors has a sufficiently small adjusted p-value
  • Score every condition/gene/experiment triplet – this score gives us the likelihood that this gene is differentially expressed for this condition in the given experiment
  • Correct these scores for multiple testing and make a cut-off – differentially expressed: yes/no
• Repeat for all experiments
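The "correct for multiple testing and make a cut-off" step can be sketched as follows. The slides do not say which correction method the Atlas uses; the Benjamini-Hochberg procedure, the 0.05 cut-off and the function names below are assumptions chosen for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the original order.

    Assumption: BH is used here only as a common example of a
    multiple-testing correction; the Atlas method is not stated.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_differential(p_values, cutoff=0.05):
    """Yes/no call per score: True if the adjusted p-value passes the cut-off."""
    return [p <= cutoff for p in benjamini_hochberg(p_values)]


# Toy p-values for four hypothetical condition/gene/experiment triplets.
calls = call_differential([0.01, 0.04, 0.03, 0.5])
```

Note that a raw p-value below 0.05 can fail the cut-off after adjustment, which is the point of the correction step.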
Meta-analysis framework (cont’d)
• For every condition-gene pair, count the number of experiments in which the gene is differentially expressed
• The result is a two-dimensional matrix where rows correspond to genes and columns correspond to conditions, rather than samples.
• The matrix entries are p-values together with a sign, indicating the significance and direction of differential expression
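The counting step above can be sketched as a small aggregation over per-experiment yes/no calls. The tuple layout, the "up"/"down" labels, and the gene, condition and experiment names are all hypothetical; the sketch stores counts per direction rather than signed p-values.

```python
from collections import defaultdict


def count_experiments(calls):
    """Build a gene x condition table of experiment counts.

    calls: iterable of (experiment, gene, condition, direction) tuples,
    where direction is "up" or "down" (hypothetical layout).
    Returns {gene: {condition: {"up": n, "down": n}}}.
    """
    # Collect distinct experiments per (gene, condition, direction),
    # so a repeated record is not counted twice.
    seen = defaultdict(set)
    for experiment, gene, condition, direction in calls:
        if direction not in ("up", "down"):
            raise ValueError(f"unknown direction: {direction!r}")
        seen[(gene, condition, direction)].add(experiment)

    matrix = defaultdict(dict)
    for (gene, condition, direction), experiments in seen.items():
        cell = matrix[gene].setdefault(condition, {"up": 0, "down": 0})
        cell[direction] = len(experiments)
    return matrix


# Made-up calls, not real Atlas results.
toy_calls = [
    ("EXP-1", "geneA", "liver", "up"),
    ("EXP-2", "geneA", "liver", "up"),
    ("EXP-3", "geneA", "liver", "down"),
    ("EXP-1", "geneA", "kidney", "down"),
]
toy_matrix = count_experiments(toy_calls)
```

Rows of the resulting table are genes and columns are conditions, matching the matrix described above.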
Gene Expression Atlas Matrix
[Figure labels: Conditions; Sample annotation; Genes; Gene expression levels; Gene annotations]
Saa4 in E-AFMX-4
Saa4 in E-MEXP-114
Saa4 in E-MEXP-748
Experiment selection criteria
• The criteria we use for selecting experiments for inclusion in the Atlas are as follows:
• Array designs relating to the experiment must be provided to enable re-annotation using Ensembl or UniProt (or have the potential for this to be done)
• High MIAME scores
• Experiment must have 6 or more hybridizations
• Sufficient replication and large sample size
• Experimental factors (EF) and experimental factor values (EFV) must be well annotated
• Adequate sample annotation must be provided
• Processed data must be provided or raw data which can be renormalized must be available
Gene Expression Atlas – when to use it
• Find out if the expression of a gene (or a group of genes with a common gene attribute, e.g. GO term) change(s) across all the experiments available in the Expression Atlas;
• Discover which genes are differentially expressed in a particular biological condition that you are interested in
Gene Expression Atlas Database
• Provides a gene/condition centric view of data in ArrayExpress
• Queries are optimized for summary, meta-analytical gene expression results (over-expressed/under-expressed) across all experiments in any condition and any gene
• Use cases
  • Search for all genes over/under-expressed in a condition/set of conditions
  • Search for a gene/set of genes over/under-expressed in a condition/set of conditions
  • View all summary expression results for a specific gene
  • View gene expression patterns in a particular experiment
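The first three use cases amount to filtering summary results by gene, condition and direction. A minimal in-memory sketch of that idea follows; it is not the Atlas interface, and the `Result` record, the `search` function and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One summary record (hypothetical layout, not the Atlas schema)."""
    gene: str
    condition: str
    direction: str  # "up" (over-expressed) or "down" (under-expressed)
    experiment: str


def search(results, genes=None, conditions=None, direction=None):
    """Filter records; an argument left as None places no restriction."""
    hits = []
    for r in results:
        if genes is not None and r.gene not in genes:
            continue
        if conditions is not None and r.condition not in conditions:
            continue
        if direction is not None and r.direction != direction:
            continue
        hits.append(r)
    return hits


# Made-up records for illustration only.
records = [
    Result("geneA", "liver", "up", "EXP-1"),
    Result("geneA", "kidney", "down", "EXP-1"),
    Result("geneB", "liver", "down", "EXP-2"),
]

# All genes under-expressed in a condition.
down_in_liver = search(records, conditions={"liver"}, direction="down")
# All summary results for a specific gene.
all_for_gene_a = search(records, genes={"geneA"})
```

Leaving any argument unset mirrors leaving the corresponding search box empty.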
Gene Expression Atlas
Atlas home page: http://www.ebi.ac.uk/gxa/
Query for genes
Restrict query by direction of differential expression
Query for conditions
The ‘advanced query’ option allows building more complex queries
Atlas searching fields – auto suggest
• The ‘Genes’ and ‘Conditions’ search fields
Scenario
• Imagine you are a scientist working in a drug discovery laboratory developing new therapies for neurodegenerative diseases.
You want to find human genes involved in the disease that could possibly be targets for drug therapy.
You have recently read a paper stating that 'glutamate receptors are important in neurodegeneration', so you are particularly interested in finding signaling receptor proteins containing an NMDA domain (a particular class of glutamate receptor) that are deregulated in neurodegenerative disease.
Search for genes
Interested in genes involved in receptor activity: use the GO term ‘receptor activity’ in the genes search box
Start typing your query in the ‘gene search box’
Auto-suggest will display all matching gene properties available in the Atlas
Add experimental conditions to your search
Start typing ‘nervous system disease’ in the 'Conditions' box and see if any EFO term matches your search criteria
Search results – heatmap view
Columns: Conditions in EFO terms
Rows: Genes
Heatmap cell: expression and number of times the gene is up- or down-regulated
Advanced query
Search whether any of these genes encode a protein carrying an NMDA receptor domain
Clicking on ‘advanced query’ will expand the query window to add more query items
Choosing ‘InterPro Term’ from the ‘gene property’ drop down will add a new query item
Gene Expression Atlas Views
• Search results views
  • Heatmap view
  • List view
• Gene View
• Experiment View
• Download results
Search results – heatmap view
Click on heatmap cell
Plots of experiments supporting the selected gene-condition pair will be shown
p-value of significance of differential expression
Search results – list view
Each row represents a gene-condition pair
Expanding the row displays thumbnail plots for corresponding experiments
Refine query
Download search results
• Download a tab-delimited file of your search results
• Keep track of all downloads per session
Gene view
Terms and cross-references to external databases
List of experiments showing differential expression for this gene. Clicking on a particular factor on the heatmap will filter the list to experiments showing differential expression for that factor
Anatomogram showing gene expression in different tissues
Expression heatmap summary listing all conditions in which the gene was observed to be differentially expressed
A word of caution
Differential expression of a gene in a certain condition is calculated in the context of individual experiments
When we say a gene is over-expressed in kidney in 10 experiments, we are not suggesting it is a kidney-specific gene.
It means that, in each of those experiments, the gene was differentially expressed in kidney compared to the other conditions in that experiment
Gene view plots
Click on liver from expression summary
Liver samples clearly show a POTENTIAL liver-specific expression of this gene
Gene view expression summary
Shows all conditions where this gene has been differentially expressed
Atlas Experiment View
• Plot the expression of genes in a particular experiment showing the different conditions and experimental factors studied in this experiment
• Show top DE genes for an experiment and be able to plot their expression pattern
• Search for gene(s) of interest to examine their behavior in this experiment
• Identify sample properties
Atlas experiment view
Three sections: Plot, Genes, Samples
Experiment box plot
Hovering on bars will show summary statistics for the gene expression
Box plot
Displays graphically the so-called five-number summary of a dataset
The summary consists of the minimum, the lower quartile, the median, the upper quartile and the maximum; individual extreme values (outliers) may also be plotted separately
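A minimal sketch of computing the five-number summary with the Python standard library. Quartile conventions differ between tools; the "inclusive" method used here is an assumption and may not match the values shown in the Atlas plots. The function name and sample values are illustrative.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Needs at least two values. Quartiles use the "inclusive" method of
    statistics.quantiles, which is one of several common conventions.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Made-up expression values for one gene in one condition.
summary = five_number_summary([7.1, 6.8, 7.4, 9.9, 7.0, 6.9])
```

The box spans the lower to upper quartile with the median marked inside, and the whiskers reach toward the minimum and maximum.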
Experiment line plot
Samples are grouped by condition
Clicking on each EF will plot the same gene but showing different condition groups
Experiment line plot (cont’d)
Hovering over a sample will display all its properties
Condensed experiment plots
Zoom in/out of plot to see all conditions
Experiment view – HTS data
Clicking on “genome view” will show the sequence in the Ensembl genome browser
Adding genes to plot
Select genes by EF
Search for a gene, or use the list of top DE genes (default)
Add/remove a gene from the plot by clicking on the small + / - next to the gene name
Viewing sample attributes
Clicking on experiment design
Export to tab delimited file
Summary
• The Gene Expression Atlas is a database that provides information about gene expression patterns within different biological conditions
• Search for differentially expressed genes either:
  • by gene name or gene attribute(s) (e.g. Gene Ontology terms)
  • by biological conditions (e.g. diseases, organism parts, cell types)
  • by using both gene(s) and condition(s)
• Different output views used in the Atlas
  • the gene page
  • the experiment page
  • the heatmap/list views
That’s all folks! Questions?