4/26/17
Toward Cancer Prevention: High-Throughput Transcriptomics for Chemical Carcinogenicity Screening
Stefano Monti
Section of Computational Biomedicine, BUSM
Department of Biostatistics, BUSPH
Bioinformatics Program, BU
Cancer Program, Broad Institute of MIT & Harvard
Modeling Chemical Carcinogenicity
Development of analytical/experimental framework to
predict long-term cancer risk from exposure to chemical compounds using genomic assays and computation
Cancer Prevention
The Case for Carcinogenicity Screening: Prevention vs. Care
Prevention
Care
High-profile Reports
PCP recommendations (2009)
“A precautionary, prevention-oriented approach should replace current reactionary approaches to environmental contaminants.”
“High-throughput screening technologies and related data interpretation models should be developed and used to evaluate multiple exposures simultaneously.”
IBCERCC recommendations (2013)
• Prevention is key to reducing the burden of breast cancer.
• Investments to reduce or eliminate harmful environmental exposures.
• Utilize high-throughput technologies to evaluate multiple risk factors simultaneously.
Report of the Interagency Breast Cancer and Environmental Research Coordinating Committee (IBCERCC)
February 2013
Heritable & Environmental Factors
“Inherited genetic factors make a minor contribution to susceptibility to most types of neoplasms.”
“The environment has the principal role in causing sporadic cancer.”
Lichtenstein et al., NEJM 343:78-85 (2000).
Bad Luck vs. Bad Environment
Two-thirds of mutations in human cancers are the result of DNA replication errors.
Chemical Exposure Understudied
• Constant exposure to pesticides, industrial pollutants, consumer products and drugs (~85,000)
• Less than 2% of all chemical compounds have been systematically tested (~1,500)
• Mixtures of compounds challenging to evaluate (~10^9–10^12)
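The figures on this slide can be checked with a few lines of arithmetic. The sketch below takes ~85,000 and ~1,500 directly from the slide. Counting unordered two-compound mixtures is only an assumption about how a number of the order of 10^9 could arise; the slide does not say how the mixture range was derived.

```python
from math import comb

n_chemicals = 85_000  # approximate number of chemicals (from the slide)
n_tested = 1_500      # approximate number systematically tested (from the slide)

# Fraction of chemicals that have been systematically tested.
fraction_tested = n_tested / n_chemicals

# Assumption for illustration: a "mixture" is an unordered pair of compounds.
n_pairs = comb(n_chemicals, 2)

print(f"fraction tested: {fraction_tested:.2%}")
print(f"two-compound mixtures: {n_pairs:.2e}")
```

Under that assumption, pairs alone number in the billions, which is why exhaustive testing of mixtures is impractical.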
Carcinogenicity Testing Approaches
Epidemiology studies
✓ observational, not randomized trial
✓ incomplete/unstandardized exposure data
✓ difficult to control for confounders
In vivo assays
✓ two-year rat bioassay (“gold standard”)
✓ time and resource consuming
✓ imperfect mapping to human carcinogenicity
In vitro assays
✓ human cell lines
✓ less time and resource consuming
✓ allows large sample size of chemical perturbations
✓ Challenge: translation to in vivo relevance
Predicting and Modeling Carcinogenicity
[Diagram: Chemical → Cancer Risk Prediction Model → Carcinogen / Non-carcinogen; other possible labels: Toxic, Endocrine Disruptor, Obesogen, …, Pathway X Inhibitor]
Understand Why
✓ Pathways affected
✓ Driving genetic alterations
✓ Biomarkers
✓ …
Predicting Carcinogenicity as a Machine Learning/Data Mining problem
[Diagram, movie-preference analogy: Movie → Prediction Model → Like / Don’t Like]
(Machine) Learning from Known Examples: Movie 1, Movie 2, …, Movie n
Features describing each movie: Director, Scriptwriter, Genre, Actor, Period, Foreign, Length, Color/BW
[Diagram: Chemical → Cancer Risk Prediction Model → Carcinogen / Non-carcinogen]
The model is learned from a gene expression matrix: rows are genes (gene1, gene2, gene3, …), columns are chemicals grouped into Non-carcinogens and Carcinogens.
To generate this database, 10K/100K's of experiments need to be performed.
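To make the machine-learning framing concrete, here is a minimal sketch of learning from labeled expression profiles and classifying a new compound. The nearest-centroid rule and the toy numbers are illustrative assumptions; the slides do not state which classifier the project uses.

```python
from statistics import mean

def centroid(profiles):
    """Per-gene mean across a list of equal-length expression profiles."""
    return [mean(values) for values in zip(*profiles)]

def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(profiles, labels):
    """Learn one centroid per class from labeled profiles."""
    return {
        cls: centroid([p for p, lab in zip(profiles, labels) if lab == cls])
        for cls in sorted(set(labels))
    }

def predict(model, profile):
    """Assign the class whose centroid is closest to the profile."""
    return min(model, key=lambda cls: squared_distance(model[cls], profile))

# Toy data (made up): rows are chemicals, columns are gene1..gene4.
profiles = [
    [2.1, 1.8, 0.2, 0.1],
    [1.9, 2.2, 0.0, 0.3],
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.3, 0.2, 0.1],
]
labels = ["carcinogen", "carcinogen", "non-carcinogen", "non-carcinogen"]

model = train(profiles, labels)
new_compound = [1.7, 2.0, 0.1, 0.2]  # hypothetical unlabeled profile
prediction = predict(model, new_compound)
print(prediction)
```

With thousands of genes and few chemicals, a real model would also need feature selection or regularization and held-out evaluation, which this sketch omits.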
Experimental Design Overview
Cell lines/iPSC treated w/ compounds … and mRNA profiled
Carcinogenicity Prediction
• Phenotypes: Carcinogenicity, Genotoxicity, …
• “New” compound → Carcinogen / Non-Carcinogen
Prediction Evaluation
• Classification Accuracy
• Sensitivity/Specificity
• ROC curve
• …
Biology of Exposure
• Exposure MoA
• Pathways
• “Drivers”
• Exposure risk models
• …
Project relies on high-throughput, cost-effective gene expression assay: Luminex-1000 (L1000) @ Broad Institute [or highly multiplexed RNA-sequencing (piloting)]
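The evaluation measures named on this slide (classification accuracy, sensitivity/specificity, ROC curve) can be computed as in the sketch below. The labels, scores, and the 0.5 decision threshold are made-up illustrations, with 1 standing for carcinogen and 0 for non-carcinogen. The area under the ROC curve is computed through its rank interpretation: the probability that a randomly chosen carcinogen receives a higher score than a randomly chosen non-carcinogen.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made up): true labels and model scores for 8 compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
sensitivity = tp / (tp + fn)  # fraction of carcinogens detected
specificity = tn / (tn + fp)  # fraction of non-carcinogens cleared
auc = roc_auc(y_true, scores)

print(accuracy, sensitivity, specificity, auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds.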
Experimental Design Overview: Central Hypothesis
Long-term in-vivo exposure phenotypes (Carcinogenicity, Genotoxicity, …) can be modeled by short-term in-vitro genomic assays.
Can Carcinogenicity be Predicted from GEP? The answer from short-term in-vivo (rat-based) assays
[Diagram: experimental design with rats exposed to compounds … and profiled on Affymetrix; outputs are Carcinogenicity Prediction, Prediction Evaluation, and Biology of Exposure]
1000s of profiles, 100s of chemicals
Carcinogens vs. Non-Carcinogens (in liver)
Carcinogens induce a more pervasive (more genes) and more significant transcriptional response
• Response consistent across multiple compounds
• Up-regulation more frequent than down-regulation
• Heightened response driven by non-genotoxic mechanisms
Can Carcinogenicity be predicted from GEP? The DrugMatrix/TG-GATEs answer
Yes it can
AUC: ~77%–83%
~130 chemicals
Prediction can be improved with more chemicals
Helps elucidate mechanisms (MoA’s)
• DNA damage
• Oxidative Stress
• Altered metabolism
• Proteasome
• …
Gusenleitner et al., PLoS ONE 2014
From in-vivo to in-vitro …
Ongoing Experiments and Platform Comparison
✓ Multiple cell/tissue types
✓ Multiple perturbations
✓ Multiple doses
Multiple Platforms: L1000, 3’DGE, SFL
✓ majority profiled on L1000
✓ subset profiled on RNA-seq
3’DGE: Soumillion et al., bioRxiv 2014
SFL: Shishkin et al., Nat Meth 2015
Motivation
✓ In-house capabilities
✓ Multiple organisms
✓ Whole-Transcriptome
✓ Lower throughput needed
…
Ongoing Experiments and Profiles
Currently profiling >500 chemicals
Long-Term Annotation
• Liver carcinogens
• Breast carcinogens
• Lung carcinogens

Chemical Type                      # Chemicals   # Profiles (L1000)   Notes
Liver (HepG2 cells)                                                   6 doses, 3 replicates, 1 cell type
  Carcinogens                              131                 2358
  Non-Carcinogens                          172                 3096
  Others (BUSRP)                            33                  594
  Total                                    336                 6048
Breast (MCF-10A, MCF-10A P53+/-)                                      3 doses, 3 replicates, 2 cell types
  Carcinogens                              120                 2160
  Non-Carcinogens                          114                 2052
  Others (BUSRP)                            68                 1224
  Total                                    302                 5436

Total of more than 11,000 profiles
LINCS/NIH/NIEHS funding …
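The profile counts follow from the design parameters in the Notes column: each chemical yields doses × replicates × cell types profiles. The sketch below recomputes the totals from the numbers on the slide. The `Design` class and its field names are hypothetical, introduced only for this check.

```python
from dataclasses import dataclass

@dataclass
class Design:
    """Hypothetical container for one tissue's experimental design."""
    doses: int
    replicates: int
    cell_types: int
    chemicals: dict  # chemical type -> number of chemicals

    def profiles_per_chemical(self):
        return self.doses * self.replicates * self.cell_types

    def profiles(self):
        per_chemical = self.profiles_per_chemical()
        return {kind: n * per_chemical for kind, n in self.chemicals.items()}

# Numbers taken from the table on this slide.
liver = Design(6, 3, 1, {"Carcinogens": 131, "Non-Carcinogens": 172, "Others (BUSRP)": 33})
breast = Design(3, 3, 2, {"Carcinogens": 120, "Non-Carcinogens": 114, "Others (BUSRP)": 68})

liver_total = sum(liver.profiles().values())
breast_total = sum(breast.profiles().values())
grand_total = liver_total + breast_total

print(liver.profiles(), liver_total)
print(breast.profiles(), breast_total)
print(grand_total)
```

Both designs give 18 profiles per chemical, and the recomputed totals match the table rows and the "more than 11,000 profiles" summary.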
From in-vivo to in-vitro
Results thus far …
Carcinogenicity can be Captured by in-vitro (human) models – LUNG
Human lung cell lines exposed to carcinogens and non-carcinogens
Luminex-1000 data: 36 genes (FDR ≤ 0.05 | FC ≥ 2)
Rat carcinogenicity signature can be mapped to human data (p < 0.005)
[Figure: enrichment plot of DrugMatrix signature genes; y-axis: Enrichment Score; x-axis: L1000-based gene ranking, from carc to non-carc]
Daniel Gusenleitner
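An enrichment score of this kind measures whether a gene set (here, the rat-derived signature genes) clusters toward one end of a ranked gene list (here, the human L1000 ranking). The sketch below implements a simplified, unweighted running-sum statistic in the style of gene set enrichment analysis. It is an illustrative assumption, not necessarily the exact procedure behind the figure, and the gene names are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximum-deviation running-sum statistic (unweighted, KS-like).

    Walk down the ranked list, stepping up at each gene in the set and
    down at each gene outside it; return the most extreme deviation.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and outside the set")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Toy ranking (placeholder names), most carcinogen-associated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
signature = {"g1", "g2", "g4"}

es_top = enrichment_score(ranked, signature)
es_bottom = enrichment_score(list(reversed(ranked)), signature)
print(es_top, es_bottom)
```

A positive score means the signature genes concentrate near the top of the ranking and a negative score near the bottom. A p-value such as the one on the slide would typically come from comparing the observed score with scores obtained after permuting the labels or the gene set, which this sketch omits.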
Carcinogenicity can be Captured by in-vitro (human) models – LIVER
Human liver cell lines exposed to carcinogens and non-carcinogens
712 genes (FDR ≤ 0.05)
Rat carcinogenicity signature can be mapped to human data (p < 0.02)
Amy Li
Hepato-Carcinogenicity can be Captured by in-vitro (human) models – Differentially Expressed Genes
FDR ≤ 0.05 • 712 genes
Amy Li
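The FDR ≤ 0.05 cutoff refers to controlling the false discovery rate across all genes tested. The slides do not say which FDR procedure was used; the sketch below assumes the widely used Benjamini-Hochberg adjustment and applies it to made-up p-values for placeholder genes.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up per-gene p-values from a carcinogen vs. non-carcinogen comparison.
p_values = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039, "geneD": 0.041, "geneE": 0.60}

q_values = dict(zip(p_values, benjamini_hochberg(list(p_values.values()))))
significant = [gene for gene, q in q_values.items() if q <= 0.05]
print(q_values)
print(significant)
```

In this toy example geneC and geneD have raw p-values below 0.05 but do not pass the adjusted cutoff, which is the point of correcting for multiple testing when thousands of genes are screened.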
Hepato-Carcinogenicity can be Captured by in-vitro (human) models – Mechanisms of Action
Pathway annotation points to Mechanisms of Action
FDR ≤ 0.05 • 712 genes
Amy Li
In-vitro human models distinguish GTOX vs. non-GTOX hepato-carcinogens
[Figure: heatmap of the top 100 DE genes; color scale (val) from −3 to 3; samples annotated by Genotoxicity: NEGATIVE / POSITIVE]
Amy Li
Summary & Ongoing Work
➢ Chemical procurement and quality are challenging.
✓ Dose determination is essential
➢ Predicting long-term carcinogenicity from short-term in vivo (rat) exposure assays is highly accurate
✓ Transcriptional models point to Mechanisms of Action (MoA’s)
➢ Predicting long-term carcinogenicity from short-term in vitro (cell lines) exposure assays is feasible but more challenging
✓ Analysis of hepato-carcinogens ongoing …
➢ Ongoing: generation of a large GEP dataset from in vitro experiments (>5K profiles) with Breast Carcinogens
✓ ~350 chemicals
✓ 2 cell types (MCF-10A, MCF-10A TP53+/-)
✓ Multiple profiling platforms
Gusenleitner et al., PLoS ONE 2014
Mulas et al., BMC Bioinformatics 2017
Li et al., Manuscript in preparation.
“Spin-Offs”
➢ Personalized (Drug & Chemical) Exposure Assessment
✓ Profiling of “genetically tailored” cell line models (CRISPR)
✓ Exposure to selected tobacco smoke chemicals
✓ Profiling on 3’DGE, SFL, microarrays (subset)
with Catalina Perdomo & Gang Liu @ CBM/BU
➢ Transcriptional Profiling of Chemical & Pharmaceutical Obesogens
✓ Profiling of 3T3 mouse adipocytes exposed to known (and unknown) obesogens (including nuclear receptor ligands)
✓ Profiling on 3’DGE (SFL perhaps)
with Jennifer Schlezinger @ BUSPH
➢ A yeast-based, DNAseq-based assay for Genotoxicity
with Josh Campbell @ CBM/BU & David Levin @ GSDM/BU
Acknowledgments
BU CBM: Amy Li, Eric Reed, Daniel Gusenleitner, Liye Zhang, Francesca Mulas, Gang Liu, Joshua Campbell, Liz Moses, Catalina Perdomo
BU SPH: David Sherr, Paola Sebastiani, Jennifer Schlezinger, Stephanie Kim
Broad Institute: Aravind Subramanian, Todd Golub, Xiaodong Lu, Ted Natoli, Joshua Bittker
NTP/NIEHS: Scott Auerbach, Rick Paules, Ray Tice