Statistical Genomics

MSc CoMPLEX UCL, 12/4/2012
Dr Andrew Teschendorff
Statistical Cancer Genomics, UCL Cancer Institute ([email protected])

Outline
1. Motivation and biological/clinical background.
2. Statistical tests: parametric vs non-parametric testing, univariate and multivariate regressions, empirical nulls.
3. The multiple-testing problem in genomics: estimating the false discovery rate (FDR).
4. Power calculations in genomic studies.
5. Gene Set Enrichment Analysis (GSEA).
6. Dimensional reduction: singular value decomposition.

Statistical Genomics
Definition: The development and application of statistical methodology to help analyze and interpret data from omic technologies.
Goal: Ultimately, the development of statistical algorithms and software to improve the clinical management of complex genetic diseases.

Motivation (biological/clinical)
• “Omic” data sets (e.g. mRNA expression, SNPs, DNA methylation) have revolutionized the field of molecular genetics and medicine.
• Example: an ongoing clinical trial (the MINDACT trial) is assessing a prognostic 70-gene expression signature, called MammaPrint, in deciding whether to give chemotherapy to breast cancer patients.
• Personalized medicine: in cancer, knowing the repertoire of aberrations (genomic & epigenomic) in any given tumour, can we predict which treatments will work on that tumour?
• Improved understanding of systems biology principles underlying complex genetic diseases.

Statistical Genomics Flowchart
1. Biological/clinical question
2. Experimental design
3. Experiment (microarrays / sequencing)
4. Preprocessing (e.g. image analysis)
5. Normalisation
6. Downstream analysis (feature selection, classification, clustering, etc.)
7. Biological verification and interpretation

Typical tasks in Statistical Genomics
1. Experimental design: (i) large experiments in genomics that profile many samples over a large number of arrays require careful design so as to avoid confounding by technical factors (e.g. chip/batch effects), (ii) power calculations to determine minimum sample size.
2. Normalisation of raw data: raw data needs to be carefully calibrated and normalised (need for both intra- and inter-array/sample normalisation).
3. Identification of genomic features correlating with a phenotype of interest: the purpose of the experiment is usually to identify genomic features (e.g. mRNA expression levels) that are different between two conditions (e.g. normal versus cancer).
4. Constructing classifiers for prediction: often we want to know whether we can derive a predictor based on genomic features (e.g. can we predict the prognosis of breast cancer patients based on epigenetic DNA methylation profiles measured at the time of diagnosis?).

Types of omic data
1. Transcriptomics: genome-wide quantification of mRNA & miRNA expression (continuous valued data).
2. Proteomics: large-scale quantification of protein expression (continuous valued).
3. Epigenomics: genome-wide quantification of epigenetic marks (e.g. DNA methylation: covalent modification of cytosines by a methyl group). Although binary in single cells, it becomes continuous when measured over many cells due to (stochastic) variation.
4. Metabolomics: large-scale quantification of metabolite levels (continuous).
5. Genomics: genome-wide quantification of allele-specific copy-number state (continuous & discrete) & SNP profiling (discrete valued data).

Functional genomics: measuring gene expression
• We can “easily” measure the mRNA levels of most known transcripts and individual exons, over ~100-1000 samples with microarray-based technologies (cDNA-, oligo-, exon-arrays), ~£100-200 per sample.
• The microarray consists of a solid surface onto which known DNA molecules have been chemically bonded at special locations.
  – Each array location is typically known as a probe and contains many replicates of the same molecule.
  – The molecules in each array location are carefully chosen so as to hybridise only with mRNA molecules corresponding to a single gene.

“Omic” data matrices
[Figure: raw data (array scans, images) → intermediate data (spot/image quantitations) → final data: a matrix of abundance levels with p features/genes and n samples, where p >> n.]

Choosing a statistical test: binary phenotype (0,1)
• Suppose we would like to establish if two sample distributions are different (sample distributions assumed to be representative of each phenotype).
• The main characteristic of a distribution is the mean, the first statistical moment of a distribution (higher order moments include variance, skewness, etc.). So, typically we want a test to determine if the mean is different.
• Parametric testing: it implicitly assumes a model for how the data is distributed in each sample group, i.e. the model is given by a statistical distribution, the parameters of which specify the model. Testing relies on parameter estimation (e.g. Student’s t-test).
• Non-parametric testing: no implicit model, testing does not involve parameter estimation (e.g. Wilcoxon rank sum test).

Student’s (unpaired) t-test
• Suppose data is normally distributed in each phenotype (small deviations from normality will not affect the testing).

t-statistic:
$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Null hypothesis: $\mu_1 = \mu_2$, under which $T \sim t(0, \nu)$, where $t(0, \nu)$ is a t-distribution of mean 0 and $\nu$ degrees of freedom:
$$\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

[Figure: a comparison of the t distribution with 4 df (in blue) and the standard normal distribution (in red) (same mean and variance).]
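The t-statistic and degrees of freedom above can be sketched in a few lines of Python. This is a minimal illustration, assuming the unequal-variance form of the unpaired test (commonly called Welch's t-test). The function name `welch_t` and the two small groups of values are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x1, x2):
    """Return (T, nu) for the unpaired t-test under the null mu1 = mu2."""
    n1, n2 = len(x1), len(x2)
    v1 = variance(x1) / n1  # s1^2 / n1 (sample variance, n - 1 denominator)
    v2 = variance(x2) / n2  # s2^2 / n2
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    # Degrees of freedom of the reference t-distribution.
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu


# Hypothetical expression values for one gene in two phenotypes.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.4, 4.1, 4.8, 4.5]
t_stat, dof = welch_t(group_a, group_b)
print(f"T = {t_stat:.3f}, nu = {dof:.2f}")
```

Turning T into a P-value additionally requires the cumulative distribution of t(0, ν), which is not in the Python standard library; a statistics package would normally be used for that step.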
Wilcoxon rank sum test (unpaired)
• If the normality assumption is grossly violated, it is better to use the non-parametric Wilcoxon rank sum test (Mann-Whitney U-test). However, when the data are in fact normally distributed, the test is less powerful than a t-test.
• Null hypothesis (for continuous data) is: P(Red>Black)=P(Black>Red)
1. Arrange all values ($n = n_1 + n_2$ values) in order without regard to phenotype. Assign ranks (in the example below, rank 1 is the largest value).
2. Sum ranks of all values within one phenotype => $R_1$
3. Then, the following must hold: $R_1 + R_2 = n(n+1)/2$
4. Statistic: $W_1 = R_1 - n_1(n_1+1)/2$ (with $W_1 + W_2 = n_1 n_2$)
5. The statistic $\max\left(\frac{W_1}{n_1 n_2}, \frac{W_2}{n_1 n_2}\right)$ is directly related to the AUC. AUC = Area under the ROC curve.

Example values and ranks:
Value   Rank
20      1
18.5    2
15.2    3
10.1    4
8.6     5
6.9     6
4.2     7

• Note: the statistic is derived from the actual ranks and not the values. In the above example, the red values have ranks 1, 3 and 4, so R(red)=1+3+4=8 => W(red)=8-6=2 => W(black)=12-2=10 => AUC=10/12=0.83.

Wilcoxon rank sum test (unpaired)
• Exercise 1: given data for two phenotypes (black & red) (-1, 2.5, 3.5, 7.5, 4, 9, 10, 10.5, 11, 12, 11.5, 13, 15), find the AUC and P-value for rejecting the null hypothesis that P(Black>Red)=P(Red>Black). For the P-value calculation you might want to use the R function wilcox.test.
• Exercise 2: Now consider data (1.1, 1.0, 1.2, 4.1, 4.0, 4.2). Compute P-values according to the Wilcoxon test and the t-test separately. What is the AUC in the case of the Wilcoxon test? What does this tell you about using non-parametric tests in the case where sample sizes are small?

Parametric or non-parametric?
Drawbacks of non-parametric testing:
• Given the sample size, there is a minimum achievable P-value (this constitutes a problem when correcting for multiple testing).
• Features of low variance may be given highly significant P-values.
[Figure: strip plots of two features, A) and B), with the samples of the two groups marked x and o.]
• The Wilcoxon test would assign the same P-value to features A) and B), i.e. it is blind to the effect sizes of the features.
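The rank-sum steps and the worked example above can be reproduced with a short Python sketch. It makes several assumptions: the red/black split is read off the ranks quoted in the note (red = ranks 1, 3, 4), rank 1 is the largest value as in the example, there are no ties, and the function names are hypothetical. The exact two-sided P-value is obtained by enumerating every possible assignment of ranks to the first group, which is only feasible for small samples and is not necessarily what wilcox.test does internally.

```python
from itertools import combinations


def rank_sum_stats(group1, group2):
    """Return (R1, W1, W2, AUC) following steps 1-5; assumes no tied values."""
    pooled = sorted(group1 + group2, reverse=True)  # rank 1 = largest value
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n1, n2 = len(group1), len(group2)
    r1 = sum(rank[v] for v in group1)       # step 2: rank sum of group 1
    w1 = r1 - n1 * (n1 + 1) // 2            # step 4
    w2 = n1 * n2 - w1                       # because W1 + W2 = n1 * n2
    return r1, w1, w2, max(w1, w2) / (n1 * n2)


def exact_two_sided_p(group1, group2):
    """Fraction of all rank assignments at least as extreme as the observed one."""
    n1, n2 = len(group1), len(group2)
    _, w1_obs, w2_obs, _ = rank_sum_stats(group1, group2)
    observed = max(w1_obs, w2_obs)
    offset = n1 * (n1 + 1) // 2
    hits = total = 0
    for ranks in combinations(range(1, n1 + n2 + 1), n1):
        w1 = sum(ranks) - offset
        hits += max(w1, n1 * n2 - w1) >= observed
        total += 1
    return hits / total


red = [20, 15.2, 10.1]          # ranks 1, 3, 4 in the example
black = [18.5, 8.6, 6.9, 4.2]   # ranks 2, 5, 6, 7
print(rank_sum_stats(red, black))     # R(red)=8, W(red)=2, W(black)=10, AUC=10/12
print(exact_two_sided_p(red, black))
```

Because the P-value is a ratio of counts over a finite set of rank assignments, it cannot fall below 2 divided by the number of assignments, which illustrates the minimum achievable P-value mentioned among the drawbacks.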
Testing with continuous phenotypes
• Suppose we want to determine if a genomic variable (e.g. gene expression) is correlated with a continuous phenotype (e.g. age).
• For this, we can use a regression framework, e.g. a linear model with n data points:
$$y = \alpha + \beta x + \varepsilon, \qquad y' = \beta x + \varepsilon \quad (y' = y - \alpha)$$
Least squares estimate:
$$\hat{\beta} = \frac{y'^{T} x}{x^{T} x} \;\Rightarrow\; t = \frac{\hat{\beta}}{\mathrm{SE}(\hat{\beta})} \sim t(0, n-2) \quad \text{under } \beta = 0$$
• Compare with the Pearson correlation & Fisher Z-transform (x and y taken as mean-centred):
$$\rho_{xy} = \frac{y^{T} x}{\sqrt{x^{T} x}\,\sqrt{y^{T} y}}, \qquad -1 \le \rho_{xy} \le 1$$
$$Z = \frac{1}{2}\log\frac{1 + \rho_{xy}}{1 - \rho_{xy}} \sim N\!\left(0, \frac{1}{n-3}\right) \quad \text{under } \rho_{xy} = 0 \text{ (variance } 1/(n-3)\text{)}$$

Some notes
• The statistical significance of correlation values depends on the sample size n (e.g. a correlation of “only” 0.1 can be significant if n>300). Exercise: check this.
• If there are outliers, using a t-test to evaluate significance is not a good idea, because the residuals won’t be normally distributed (the assumption underlying the t-test). In this case, we can obtain the “null” distribution of the t-statistic by randomly reassigning the phenotype labels to expression values (this needs to be done many times, > 1000, to generate a reasonable estimate of the null distribution).
• Null distribution = distribution of the statistic when the null hypothesis is true. By permuting a large number of times you effectively destroy any potential association between predictor and phenotype. By definition this constitutes the null hypothesis.
• Often, the null distribution can’t be derived analytically, in which case permutation is the only approach to derive it and hence estimate significance. In this case, we talk about an empirical null.

Deriving an empirical null
• Given an observed statistic S, to derive the null distribution of the statistic:
i) randomly permute the phenotype labels.
ii) recompute the statistic with the permuted labels => $S_P$
iii) repeat a large number, $n_P$, of times (>1000) => $(S_{P1}, S_{P2}, \ldots)$
iv) an empirical P-value can be calculated as: P-value = $\#(S_P > S)/n_P$
• In this case, noise was modelled from a Gaussian distribution, so, not surprisingly, the analytical and empirical estimates for the P-value are in close agreement.
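Steps i) to iv) can be sketched as follows. This is a minimal illustration under stated assumptions: the statistic S is taken to be the difference in group means for a binary phenotype (any other statistic could be substituted), the data values, the function name `empirical_p_value` and the fixed random seed are all hypothetical, and the P-value is the one-sided estimate #(S_P > S)/n_P given above.

```python
import random
from statistics import mean


def empirical_p_value(values, labels, n_perm=2000, seed=0):
    """Return (S, P) where P = #(S_P > S) / n_P from label permutations."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def statistic(lab):
        # Difference in means between phenotype 1 and phenotype 0.
        g1 = [v for v, l in zip(values, lab) if l == 1]
        g0 = [v for v, l in zip(values, lab) if l == 0]
        return mean(g1) - mean(g0)

    s_obs = statistic(labels)
    permuted = list(labels)
    exceed = 0
    for _ in range(n_perm):            # step iii): repeat n_P times
        rng.shuffle(permuted)          # step i): permute phenotype labels
        if statistic(permuted) > s_obs:  # step ii): recompute the statistic
            exceed += 1
    return s_obs, exceed / n_perm      # step iv)


# Hypothetical expression values; the first five samples have phenotype 1.
values = [5.1, 4.9, 5.6, 5.8, 6.0, 4.2, 4.4, 4.1, 4.8, 4.5]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
s, p = empirical_p_value(values, labels)
print(f"S = {s:.2f}, empirical P = {p}")
```

With a strict inequality the estimate can come out as exactly 0 when no permutation exceeds S; a common variant adds 1 to both the count and n_P to avoid reporting a P-value of zero.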