NGS, Cancer and Bioinformatics Exome-sequencing followed by Variant Calling.
29 January 2015 · NGS & Cancer training: Exome analyses

Overview of exome analysis: Galaxy Workflow

Inputs: Reads (Fastq), Reference Genome (Fasta), Target regions (bed)

Workflow steps (tool or action):
- Conversion to Galaxy Format (FASTQ Groomer)
- Quality Control (FastQC)
- Mapping (Bowtie2)
- PCR duplicates Marking (MarkDup)
- Target Intersection (Intersect Bam)
- Preprocess GATK part 1 (Local realignment around indels)
- Preprocess GATK part 2 (Base Quality Score Recalibration)

Output: Aligned and preprocessed reads (BAM)
- Marked PCR duplicates
- Intersected on target regions
- Realigned around indels
- Recalibrated
Training course “NGS & Cancer: Analysis of Genomic Variants”, 7-9 April 2014

Public dataset
• Accessible online on SRA (Sequence Read Archive): ERA148528
  - Exome sequencing of 2 samples: tumor (lung cancer) and blood (normal sample)
  - Publication: Ys et al., Genome Res. 2012 Mar;22(3):436-45
• 100 bp paired-end reads, Illumina HiSeq 2000
• Mean depth is higher for the tumor sample (~100X) than for the normal sample (~30X), in order to detect somatic variants with a low allelic frequency
• Aligned exome size: ~15 GB tumor; ~7 GB blood
• Complete analysis processing time: ~20 h
  - The analysis therefore needs to be restricted to a few regions (~112 kb) to limit the processing time
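To illustrate why a higher depth helps with low allelic frequencies, here is a minimal sketch. It assumes a simple binomial model in which each read covering a position carries the variant independently, with probability equal to the allelic frequency. The model, the function name `prob_at_least`, the 5% allelic frequency and the threshold of 3 supporting reads are illustrative assumptions; they do not come from the slides or from any variant caller.

```python
from math import comb


def prob_at_least(min_reads, depth, allele_fraction):
    """P(at least `min_reads` variant reads) under a binomial model."""
    # Probability of observing fewer than `min_reads` variant reads.
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    allele_fraction = 0.05  # hypothetical somatic variant seen in 5% of reads
    for depth in (30, 100):  # approximate normal and tumor mean depths
        expected = depth * allele_fraction
        p = prob_at_least(3, depth, allele_fraction)
        print(f"depth {depth}X: expected variant reads = {expected:.1f}, "
              f"P(>= 3 variant reads) = {p:.2f}")
```

Under these assumptions the expected number of variant-supporting reads grows linearly with depth, which is the intuition behind sequencing the tumor sample deeper than the normal one.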
Select libraries on Galaxy
1. Open your web browser and go to « https://galaxy.gustaveroussy.fr/galaxyprod » (or, on the Roscoff server, « http://galaxy.sb-roscoff.fr »).
2. In the top menu, click on « Shared Data » then « Data Libraries ».
3. Click on « [FORMATION] Input Data » then « EXOME » (on the Roscoff server: « canceropole-tp-input »).
4. Select « tumor_R1.fastq », « tumor_R2.fastq », « exome_regions.bed » and « known_sites_regions.vcf », then click on « Go ».
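As a quick sanity check on a target file such as « exome_regions.bed », the total size of the regions can be summed from the BED intervals. The sketch below assumes a standard BED layout (chromosome, 0-based start, end excluded, whitespace-separated) and non-overlapping intervals. The function name `bed_total_size` and the example coordinates are hypothetical.

```python
def bed_total_size(lines):
    """Sum interval lengths (end - start) over BED lines."""
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total += end - start  # BED: 0-based start, end excluded
    return total


if __name__ == "__main__":
    # Hypothetical intervals, only to show the call.
    example = ["chr1\t1000\t1500", "chr2\t200\t260"]
    print(bed_total_size(example), "bp")
```

On a real file, the same function can be called on an open file handle, for example `bed_total_size(open("exome_regions.bed"))`. Overlapping intervals would be counted twice with this simple approach.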
FASTQ format conversion
1. Rename your history to « Tumor » by clicking on « Unnamed history ».
2. In the left panel, find « FASTQ Groomer » (under the « NGS: QC and manipulation » section, or by typing « FASTQ Groomer » in the « search tools » textbox) and click on it to convert both your FASTQ files into FASTQ Sanger format.
3. Click on « Execute » to launch the conversion.
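To give an idea of what a conversion to FASTQ Sanger format involves, here is a minimal sketch of re-encoding quality strings. It assumes the input uses the Illumina 1.3+ encoding (Phred scores stored with an ASCII offset of 64), whereas Sanger FASTQ uses an offset of 33. The slides do not state the input encoding, and this is not the FASTQ Groomer implementation; the function names are hypothetical.

```python
def illumina64_to_sanger(quality):
    """Re-encode a Phred+64 quality string as Phred+33 (Sanger)."""
    converted = []
    for char in quality:
        phred = ord(char) - 64  # decode the Phred+64 character
        if phred < 0 or phred > 62:
            raise ValueError(f"character {char!r} is not valid Phred+64")
        converted.append(chr(phred + 33))  # encode as Phred+33
    return "".join(converted)


def groom_record(record):
    """Convert one 4-line FASTQ record; only the quality line changes."""
    header, sequence, _separator, quality = record
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return [header, sequence, "+", illumina64_to_sanger(quality)]


if __name__ == "__main__":
    record = ["@read1", "ACGT", "+read1", "hhh@"]
    print("\n".join(groom_record(record)))
```

The sequence itself is untouched; only the characters of the quality line are shifted, so the Phred scores they represent stay the same.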
GENERAL TIP: RENAME YOUR HISTORY ITEMS FREQUENTLY TO BE MORE EXPLICIT THAN « on data xxx »!

FASTQC: FASTQ Quality Control
1. In the left panel, find « FASTQC: Read QC » (under the « NGS: QC and manipulation » section, or by typing « FASTQC: Read QC » in the « search tools » textbox).
2. Select the FASTQ Groomer dataset and click on « Execute »; repeat for both reads.
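One of the things a read quality report typically shows is the mean base quality at each read position. The sketch below computes that metric from Sanger-encoded FASTQ (Phred+33), assuming each record spans exactly four lines. It is an illustration of the metric, not FastQC's implementation; the function name is hypothetical.

```python
import io


def per_position_mean_quality(handle):
    """Mean Phred score per read position for a 4-line-per-record FASTQ."""
    totals, counts = [], []
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        handle.readline()  # sequence line (unused here)
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        for position, char in enumerate(quality):
            if position == len(totals):
                totals.append(0)
                counts.append(0)
            totals[position] += ord(char) - 33  # decode Phred+33
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    fastq = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACG\n+\n!!I\n")
    print(per_position_mean_quality(fastq))
```

A drop in mean quality towards the end of the reads is the kind of pattern to look for before mapping.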