[Inserm-00719917, V1] Integrated Analysis of Somatic Mutations and Focal Copy-Number Changes Identifies Key Genes and Pathways I
Total Page:16
File Type:pdf, Size:1020Kb
a Final target bases 46.2 Mb Target bases covered 43.3 Mb Percent target bases covered 93.81% Average depth 73.3 Median depth 70 Coverage at 25x 75.9% Coverage at 10x 87.0% Coverage at 4x 92.3% Coverage at 1x 96.0% b 100 75 50 25 Reads depth (X) depth Reads 0 1 2 3 4 5 6 7 8 9 5 6 2 10 11 12 13 14 1 1 17 18 19 20 21 2 X # Chromosome c 1.0 Author manuscript, published in "Nature Genetics 2012;44(6):694-8" 0.8 0.6 0.4 Coverage in target regions target in 0.2 Coveragex1 Coveragex4 Coveragex10 Coveragex25 0.0 1 2 3 4 5 6 7 8 9 5 8 1 4 7 8 1 4 10 11 12 13 14 1 16 17 1 19 20 2 22 23 2 25 26 2 28 29 30 31 32 33 34 35 36 37 3 39 40 4 42 43 4 45 46 47 48 # Exomes DOI : Supplementary Figure 1. Statistics of mapping of sequencing reads 10.1038/ng.2256 inserm-00719917, version 1 - 22 Jul 2012 a, summary statistics for whole-exome sequence reads of 24 HCC with their non tumor liver tissues. b, mean depth (with 95% IC) of reads on each chromosome, c, cumulative fraction of coding bases covered in captured regions. A 1-fold, 4-fold, 10-fold and 25- fold coverage were considered (mean with 95% IC) per exome (numbered #1-48). $ Bioinforma5cs$workflow$ Variant$calling$Tumor$ ! Variant$calling$Normal$ 1,775,892$variants$ $ Database$extrac5on$ ! dbSNP$ 1,577,724$$ 198,168$ $ background$known$polymorphisms$ $ Coding$SNPs/Indels$ (targeted$sequencing$of$exons)$ ! Intronic,$ 16,498$ UTR$ synonymous$$ 65,947$ variants$ non$coding$variants$ 44,172$nonsynonymous$ $coding$SNPs/Indels$ $ $ Quality$check$of$reads$ (Mean$size$±3SD,$Paired$end,$Phred$>=10,$$N>=$6)$ $ ! 35,539$ $ $$ Soma5c$variants$ ! 1000$ Proprietary$ Duplicate$variants$$ Genomes$ database$ per$individual$ 29,056$$ germline$ (unique$per$individual) 3,777$ $ nonsynonymous$$ soma5c$novel$nonsynonymous$$ coding$SNPs$ coding$SNPs$ /Indels$ $ Func5onal$Evalua5on$ (visual$control,$PPH2)$ ! Artefact$(visual$control)$`$$1,501$ Predict$to$be$benign$or$neutral$(Polyphen`2)$`$1,190$ 1,086$variants$predicted$to$alter$func5on$ $ inserm-00719917, version 1 - 22 Jul 2012 $$ Independant$verifica5on$ (capillary$Sanger$sequencing)$ Verified$as$ Verified$as$ ! Not$validated$ germline$ soma5c$ 23$ 994$ 69$ Supplementary Figure 2. Whole-exome sequencing analysis flowchart. Extrac(on of func(onally somac gene mutaons for 24 HCC. Boxes refer to major bioinformacs steps; arrows indicate the number of variants obtained or removed from subsequent analysis. Variants were filtered for their coding localizaon, annotaon in dbSNP31 or 1000 genomes, somac and func(onally impairment. The per-base and reads quality scores were used to filter false posi(ve gene mutaon events. a b 100 p< 0.0001 80 60 40 Percentsurvival 20 None Homozygous Deletion 0 0 20 40 60 Time subjects 57 43 28 21 at risk 42 19 9 3 Supplementary Figure 3. Genome-wide copy number changes and recurrent homozygous dele8ons in HCC. a, Frequency of gains, dele8ons, and losses of heterozygosity (LOH) along the genome in a series of 125 HCC. b, Kaplan-Meier analysis comparing survival of cases presen8ng homozygous dele8ons (n=42) versus those without local losses (n=57). Univariate inserm-00719917, version 1 - 22 Jul 2012 Cox P-value for risk index is included. Only paents with curave (R0) resec8on were included in survival analysis. CHC012T ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● CHC031T CHC014T ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ●● 0 ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ●● ● ● ●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● CHC031T CHC130T ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ●●● ●● ●● ● ● ●● ● ● ● 0 ● ● ●● ● ● ● 0 ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● Log R Ratio ● ● ● ● ● ● ●● ● ● ● ● ● CHC1199T CHC205T ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● 0 ● ● ● ● ● ● ● ●●● ● 0 ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● HBZHBA2 HBQ1 ITFG3 ARHGDIG MPG AXIN1 CHC218T ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● NPRL3 HBM HBA1 LUC7L RGS11 PDIA2 MRPL28 ● ● ● ● ● ● ● ● ● ● 300 400 CHC235T Genomic position (Kb) ● ● ●● ●● ● ● ● ●● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ● ●● 0 ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● Log R Ratio ●● ● ●● ● ● ● ● ● ● ● ●● ● ● CHC012T ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ●● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● 0 ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● CHC433T ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● 0 ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● CHC018T ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●● ●● ● ● 0 ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●● ●●● ● ● ●●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● CHC793T ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ●●● ● ● ● ●● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●●● ● 0 ●● ● ● ● ● ● ● ●● ● ●● ● ● ●●● ●● ●● ● ● ● ● ●●●●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● CHC034T ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●● ● ●● 0 ● ●●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● CHC1069T ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●●● ● ●● ● ● ● ● ● ● ●● ● ● ●● ●● ● ●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●●● ● ●●● ●●● ●● ● ● ●●●● ●● ● ● ● ● ●●● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ●● ●● ● ●● ● ● ● ● ●● ●● ● ●● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● Log R Ratio ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● CHC239T ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● MIR31 MTAP CDKN2A CDKN2BAS ● ● ●● ● ● ● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● inserm-00719917, version 1 - 22 Jul 2012 0 ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● MIR31HG C9orf53 CDKN2B ● ● ● STOX2 IRF2 CASP3 MLF1IP Genomic position (Kb) ENPP6 CCDC111 ACSL1 Genomic position (Kb) Supplementary Figure 4. Recurrent homozygous dele8ons at CDKN2A/B, AXIN1 and IRF2 loci. Raw log R raos are represented as black dots, and smoothed log R raos as blue lines. The minimal region of loss is indicated in red for each region. CTNNB1 ARM TD TP53 NES DBD NLS Dim NTD 781aa 393aa AXIN1 RGS P53B GSK3B βCatB PP2AB DIX CDKN2A ANK ANK ANK ANK 862aa 173aa APC ARM UbD IRF2 DBD 2843aa 349aa K137 ARID1A LXXLL ARID LXXLL LXXLL ARID2 ARID LXXLL RFX ZnF 2285aa 1835aa PIK3CA P85B RasB LID Helical K KRAS GTPB K RasB LID Helical 1068aa 189aa RPS6KA3 K K 740aa S227 NFE2L2 Neh2 Neh6 DBD Neh3 HNF1A POU-S POU-H TD 605aa 631aa IL6ST IL6BD TM Cyt 918aa Nonsense DeleCon of several exons Splice site Frameshi< Inframe Indels Missense Supplementary Figure 5. Somac mutaon spectra in 125 HCC. InacCvang and acCvang mutaons are shown above and below the core protein, respecCvely. FuncConal domains are colored boxes. ANK: Ankyrin repeat, ARM: Armadillo repeat, βCatB: βCatenin binding Domain, Cyt: Cytosolic Domain, DBD: DNA Binding domain, inserm-00719917, version 1 - 22 Jul 2012 Dim: Dimerizaon Domain, GSK3B: GSK3 Binding Domain, GTPB: GTP binding Domain, IL6BD: IL6 Binding Domain, K: Kinase domain, LID: Lipid interacCon domain, Neh: Nrf2- ECH Homology Domain, NES: Nuclear Export Signal, NLS: Nuclear Localizaon Signal, NTD: Negave TransacCvaon Domain, P53B: P53 Binding Domain, P85B: P85 Binding Domain, PP2AB: PP2A Binding Domain, RasB: Ras Binding Domain, RGS: regulator of G- protein signalling, TD: TransacCvaon domain, TM: Transmembrane domain, UbD: UbiquiCnylated domain. a b c 500 12.5 2 1.5 ) pControl f f n G o siControl o o p i r r pIRF2 e 10.0 s e s e s l siIRF2 (1) s l H l b l b e e e r 1.0 n m i m c c 7.5 p u u x 2 250 P=0.0001 A 2 n n P<0.0001 e G N G 2 e e p e p v 5.0 R v i v i e t e i t 0.5 t m a H a l H a l l 2 e 2.5 e 1 e F r R R ( R I 0.0 0.0 0 24 48 72 96 0 24 48 72 96 120 0 Time (Hours) Time (Hours) Control SiIRF2 pIRF2 d Supplementary Figure 6. IRF2 as