Ensembl Variation API
Hinxton, May 2013

Anja Thormann [email protected]

EBI is an Outstation of the European Molecular Biology Laboratory.

Lactose Intolerance

rs4988235: a SNP near the LCT gene controls whether the lactase enzyme is turned on or off as a person grows older. Alleles: G/A

Genotype What it means

GG Likely to be lactose intolerant.

AG Likely to be tolerant due to lactase persistence.

AA

Different paths to Variation API

• Variation name (e.g. Variation consequence)
• Genomic Location (e.g. )
• Population or individual specific (variation, location)
• Phenotype
• Study (Structural variation data)

Ensembl Variation

• Build a new variation database: import data from e.g. dbSNP, …
• Quality control
• Data annotation: e.g. phenotype data, ancestral alleles, …
• Calculate consequences of variant on transcript
• SIFT and PolyPhen scores

• Variation – Data description
• Variation – Predicted data

Course outline

• Central data objects: Variation, VariationFeature
• Structural variations
• Variation consequences
• Phenotype annotation
• Linkage disequilibrium
• Resequencing data
• Variation sets
• What next?

http://www.ensembl.org/info/docs/Doxygen/variation-api/index.html
http://www.ensembl.org/info/docs/Doxygen/core-api/index.html
http://www.ensembl.org/Homo_sapiens/Variation/Summary?r=2:136608146-136609146;source=dbSNP;v=rs4988235;vdb=variation;vf=3877172

[Diagram: API data objects] StructuralVariation, OverlapConsequence, Allele, StructuralVariationFeature, PhenotypeFeature, TranscriptVariationAllele, LDFeatureContainer, TranscriptVariation, SupportingStructuralVariation, VariationFeature, Variation

[Diagram: adaptors] TranscriptVariationAdaptor, VariationAdaptor, …Adaptor

Variation

name: $variation->name
>rs5571078
>COSM998

var_class: $variation->var_class
>SNV
>indel

source: $variation->source
>dbSNP
>COSMIC
…

Allele

allele: $allele->allele
>A

frequency: $allele->frequency
>0.85

population: $allele->population
>Bio::EnsEMBL::Variation::Population

…

Object creation

• Using adaptors
  • Fetch object(s) according to some property, e.g. name, location
  • Check the documentation to see which methods the adaptor provides
• Using API objects, e.g. Variation
  • Get object(s) from an API object
  • $variation->get_all_Alleles() # returns a listref of Allele objects

Adaptors

use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);

# get VariationAdaptor; the arguments are species, group, object type
my $va = $registry->get_adaptor('', 'variation', 'variation');

…
my $variation = $va->fetch_by_name('rs334');
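The adaptor and object calls above can be combined into one short script. This is a minimal sketch rather than the course's own code. It assumes the Ensembl Perl API is installed, the public server is reachable, and the species is 'human' (the slide leaves that argument blank). The helper names describe_variation and describe_allele are hypothetical. The accessors name, var_class, source, allele, frequency and population are those shown on the slides; calling name on the population object is an assumption, and in later API releases source may return an object rather than a string.

```perl
use strict;
use warnings;

# Hypothetical helper: one tab-separated line with the Variation
# attributes shown on the slides (name, class, source).
sub describe_variation {
    my ($variation) = @_;
    return join "\t", $variation->name, $variation->var_class, $variation->source;
}

# Hypothetical helper: allele string, frequency and population name.
# Frequency and population are optional, so missing values print as '-'.
sub describe_allele {
    my ($allele) = @_;
    my $frequency  = $allele->frequency;
    my $population = $allele->population;
    return join "\t",
        $allele->allele,
        defined $frequency  ? $frequency        : '-',
        defined $population ? $population->name : '-';
}

# Connect only when variation names are given on the command line,
# e.g.  perl variation_info.pl rs334
if (@ARGV) {
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    # Assumption: species is 'human'.
    my $va = $registry->get_adaptor('human', 'variation', 'variation');

    for my $name (@ARGV) {
        my $variation = $va->fetch_by_name($name);
        unless (defined $variation) {
            warn "No variation found for $name\n";
            next;
        }
        print describe_variation($variation), "\n";
        print "  ", describe_allele($_), "\n"
            for @{ $variation->get_all_Alleles() };
    }
}
```

Keeping the formatting in small subroutines that only call accessors means they can be tried out on any object providing the same methods, without a database connection.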

Exercise 1: Variation, Allele

• Retrieve source and variation class for the following variations in human:
  • rs55710239
  • rs56385407
  • COSM998
  • CI003207
• For SNP rs1333049 in human, retrieve the following information for each of its alleles:
  • Allele
  • Frequency*
  • Population name*
  • Submitter name (‘handle’)*

*if it exists