
4/26/17

Toward Cancer Prevention: High-Throughput Transcriptomics for Chemical Carcinogenicity Screening

Stefano Monti

Section of Computational Biomedicine, BUSM
Department of Biostatistics, BUSPH
Bioinformatics Program, BU
Cancer Program, Broad Institute of MIT & Harvard

Modeling Chemical Carcinogenicity

Development of an analytical/experimental framework to predict long-term cancer risk from exposure to chemical compounds, using genomic assays and computation

Cancer Prevention


The Case for Carcinogenicity Screening: Prevention vs. Care

Prevention

Care

High-profile Reports

PCP recommendations (2009)

• “A precautionary, prevention-oriented approach should replace current reactionary approaches to environmental contaminants.”
• “High-throughput screening technologies and related data interpretation models should be developed and used to evaluate multiple exposures simultaneously”

IBCERCC recommendations (2013)

• Prevention is key to reducing the burden of breast cancer.
• Investments to reduce or eliminate harmful environmental exposures.
• Utilize high-throughput technologies to evaluate multiple risk factors simultaneously.

Report of the Interagency Breast Cancer and Environmental Research Coordinating Committee (IBCERCC)

February 2013

Heritable & Environmental Factors

Lichtenstein et al., NEJM 343:78-85 (2000).

“Inherited genetic factors make a minor contribution to susceptibility to most types of neoplasms.”

“The environment has the principal role in causing sporadic cancer.”


Bad Luck vs. Bad Environment

Two-thirds of mutations in human cancers are the result of DNA replication errors.


Chemical Exposure Understudied

• Constant exposure to industrial pollutants, consumer products and drugs (~85,000)

• Less than 2% of all chemical compounds have been systematically tested (~1,500)

• Mixtures of compounds are challenging to evaluate (~10⁹–10¹²)

Carcinogenicity Testing Approaches

Epidemiology studies
✓ observational, not randomized trial
✓ incomplete/unstandardized exposure data
✓ difficult to control for confounders

In vivo assays
✓ two-year rat bioassay (“gold standard”)
✓ time and resource consuming
✓ imperfect mapping to human carcinogenicity

In vitro assays
✓ human cell lines
✓ less time and resource consuming
✓ allows large sample size of chemical perturbations
✓ Challenge: translation to in vivo relevance


Predicting and Modeling Carcinogenicity

Chemical → Cancer Risk Prediction Model → Carcinogen / Non-…

Other output labels shown: Toxic, Obesogen, …, Pathway X Inhibitor

Understand Why
✓ Pathways affected
✓ Driving genetic alterations
✓ Biomarkers
✓ …


Predicting Carcinogenicity as a Machine Learning/Data Mining problem

Movie → Prediction Model → Like / Don’t Like

(Machine) learning from known examples: Movie 1, Movie 2, …, Movie n, each described by features such as Director, Scriptwriter, Genre, Actor, Period, Foreign, Length, Color/BW.

Predicting Carcinogenicity as a Machine Learning/Data Mining problem

Chemical → Cancer Risk Prediction Model → Carcinogen / Non-carcinogen

Known examples: Carcinogens and Non-Carcinogens, each described by gene expression features (gene1, gene2, gene3, gene4, gene5, gene6, gene7, …).

To generate this database, 10K/100K's of experiments need to be performed.
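The learning-from-known-examples framing above can be sketched in a few lines. This is only an illustration under stated assumptions: the expression values and gene count are made up, and the nearest-centroid rule is a stand-in chosen for brevity, not the model used in the project.

```python
from math import dist
from statistics import mean

# Toy training set (made-up numbers): each entry is one chemical's expression
# profile over three hypothetical genes, paired with its known label.
profiles = [
    ([2.1, 0.3, 1.8], "carcinogen"),
    ([1.9, 0.1, 2.2], "carcinogen"),
    ([0.2, 0.4, 0.1], "non-carcinogen"),
    ([0.0, 0.6, 0.3], "non-carcinogen"),
]

def fit_centroids(rows):
    """Average the profiles of each class, gene by gene."""
    by_label = {}
    for values, label in rows:
        by_label.setdefault(label, []).append(values)
    return {label: [mean(col) for col in zip(*vals)]
            for label, vals in by_label.items()}

def predict(centroids, profile):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))

centroids = fit_centroids(profiles)
new_compound = [1.7, 0.2, 1.9]  # hypothetical profile of a "new" compound
prediction = predict(centroids, new_compound)
print(prediction)
```

In practice the feature vector would span hundreds to thousands of genes, which is why a large database of profiled chemicals is needed before any such model can be trained.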


Experimental Design Overview

Cell lines/iPSC treated w/ compounds … and mRNA profiled. The profiles feed three analyses:

• Carcinogenicity Prediction: “New” compound → Carcinogen / Non-Carcinogen (Genotoxicity, Carcinogenicity, …)
• Prediction Evaluation: Classification Accuracy, Sensitivity/Specificity, ROC curve, …
• Biology of Exposure: Exposure MoA, Pathways, “Drivers”, Exposure risk models, …

Project relies on a high-throughput, cost-effective gene expression assay: Luminex-1000 (L1000) @ Broad Institute [or highly multiplexed RNA-sequencing (piloting)]
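The evaluation measures named on this slide (classification accuracy, sensitivity/specificity, ROC curve) can be computed as in the sketch below. The labels and scores are invented for illustration, the 0.5 threshold is an arbitrary assumption, and the AUC is computed through its rank interpretation (the probability that a randomly chosen carcinogen scores higher than a randomly chosen non-carcinogen).

```python
def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed score threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of carcinogens detected
        "specificity": tn / (tn + fp),  # fraction of non-carcinogens cleared
    }

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = carcinogen, 0 = non-carcinogen.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
metrics = confusion_metrics(labels, scores)
auc = roc_auc(labels, scores)
print(metrics, auc)
```

Unlike accuracy, sensitivity and specificity, the AUC does not depend on a particular threshold, which makes it convenient for comparing models across chemical sets with different class balances.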


Central Hypothesis

Long-term in-vivo exposure phenotypes can be modeled by short-term in-vitro genomic assays.

Can Carcinogenicity be Predicted from GEP (gene expression profiles)? The answer from short-term in-vivo (rat-based) assays

Rats exposed to compounds … and profiled on Affymetrix. The profiles feed the same three analyses:

• Carcinogenicity Prediction: “New” compound → Carcinogen / Non-Carcinogen (Genotoxicity, Carcinogenicity, …)
• Prediction Evaluation: Classification Accuracy, Sensitivity/Specificity, ROC curve, …
• Biology of Exposure: Exposure MoA, Pathways, “Drivers”, Exposure risk models, …

1000s of profiles, 100s of chemicals


Carcinogens vs. Non-Carcinogens (in liver)

Carcinogens induce a more pervasive (more genes) and more significant transcriptional response.

• Response consistent across multiple compounds
• Up-regulation more frequent than down-regulation
• Heightened response driven by non-genotoxic mechanisms


Can Carcinogenicity be predicted from GEP? The DrugMatrix/TG-GATEs answer

Yes, it can

• AUC: ~77%–83% (~130 chemicals)
• Prediction can be improved with more chemicals
• Helps elucidate mechanisms

MoA’s

• DNA damage
• Oxidative Stress
• Altered
• Proteasome
• …

Gusenleitner et al., PLoS ONE 2014


From in-vivo to in-vitro …

Ongoing Experiments and Platform Comparison

✓ Multiple cell/tissue types
✓ Multiple perturbations
✓ Multiple doses

Multiple Platforms: L1000, 3’DGE, SFL
✓ majority profiled on L1000
✓ subset profiled on RNA-seq
3’DGE: Soumillion et al., bioRxiv 2014
SFL: Shishkin et al., Nat Meth 2015

Motivation
✓ In-house capabilities
✓ Multiple organisms
✓ Whole-transcriptome
✓ Lower throughput needed
…


Ongoing Experiments and Profiles

Currently profiling >500 chemicals

Long-Term Annotation
• Liver carcinogens
• Breast carcinogens
• Lung carcinogens

Chemical Type          # Chemicals   # Profiles (L1000)   Notes
Liver
  Carcinogens                  131                 2358   6 doses
  Non-Carcinogens              172                 3096   3 replicates
  Others (BUSRP)                33                  594   1 cell type (HepG2 cells)
  Total                        336                 6048
Breast
  Carcinogens                  120                 2160   3 doses
  Non-Carcinogens              114                 2052   3 replicates
  Others (BUSRP)                68                 1224   2 cell types (MCF-10A, MCF-10A P53 +/-)
  Total                        302                 5436

Total of more than 11,000 profiles

LINCS/NIH/NIEHS funding …
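The profile counts in the table appear to follow from chemicals × doses × replicates × cell types. The sketch below checks that arithmetic using the numbers on the slide; the dictionary layout and function name are illustrative assumptions, and the multiplicative rule itself is an inference from the table rather than something the slide states.

```python
# Design parameters and chemical counts as listed on the slide.
design = {
    "liver": {
        "doses": 6, "replicates": 3, "cell_types": 1,
        "chemicals": {"carcinogens": 131, "non-carcinogens": 172,
                      "others (BUSRP)": 33},
    },
    "breast": {
        "doses": 3, "replicates": 3, "cell_types": 2,
        "chemicals": {"carcinogens": 120, "non-carcinogens": 114,
                      "others (BUSRP)": 68},
    },
}

def profile_counts(panel):
    """Profiles per chemical class = chemicals x doses x replicates x cell types."""
    per_chemical = panel["doses"] * panel["replicates"] * panel["cell_types"]
    return {name: n * per_chemical for name, n in panel["chemicals"].items()}

counts = {tissue: profile_counts(panel) for tissue, panel in design.items()}
totals = {tissue: sum(c.values()) for tissue, c in counts.items()}
grand_total = sum(totals.values())
print(counts, totals, grand_total)
```

Both panels work out to 18 profiles per chemical, so the liver and breast sets contribute 6048 and 5436 profiles respectively, consistent with the "more than 11,000 profiles" total.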

From in-vivo to in-vitro

Results thus far …


Carcinogenicity can be Captured by in-vitro (human) models – LUNG

Human lung cell lines exposed to carcinogens and non-carcinogens

Luminex-1000 data: 36 genes (FDR ≤ .05 | FC ≥ 2)

Rat carcinogenicity signature can be mapped to human data (p < 0.005)

[Figure: enrichment score of the DrugMatrix signature genes along the L1000-based gene ranking, carc vs. non-carc]

Daniel Gusenleitner
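The slide reports an enrichment score for the DrugMatrix signature genes along an L1000-based gene ranking, but does not state the exact statistic. As a rough illustration, the sketch below assumes an unweighted, Kolmogorov-Smirnov-style running sum of the kind used in gene set enrichment analysis; the gene names and ranking are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranking, stepping up by 1/n_hits at each signature gene and
    down by 1/n_miss otherwise; return the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("ranking must contain both signature and other genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranking (most up-regulated in carcinogens first).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
signature = {"g1", "g2", "g4"}
es_top = enrichment_score(ranked, signature)
es_bottom = enrichment_score(list(reversed(ranked)), signature)
print(es_top, es_bottom)
```

A positive score means the signature genes cluster at the top of the ranking, a negative score that they cluster at the bottom. A p-value such as the one quoted on the slide would typically come from comparing the observed score against scores from permuted rankings or labels, which this sketch omits.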

Carcinogenicity can be Captured by in-vitro (human) models – LIVER

Human liver cell lines exposed to carcinogens and non-carcinogens

712 genes (FDR ≤ .05)

Rat carcinogenicity signature can be mapped to human data (p < 0.02)

Amy Li


Hepato-Carcinogenicity can be Captured by in-vitro (human) models – Differentially Expressed Genes

FDR ≤ 0.05 • 712 genes

Amy Li
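The gene list above is selected at FDR ≤ 0.05, but the slide does not say which multiple-testing correction was used. The sketch below assumes the Benjamini-Hochberg procedure, a common choice for differential expression analyses; the gene names and p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a carcinogen vs. non-carcinogen test.
p_values = {"geneA": 0.001, "geneB": 0.012, "geneC": 0.02,
            "geneD": 0.2, "geneE": 0.8}
q_values = dict(zip(p_values, benjamini_hochberg(list(p_values.values()))))
selected = [gene for gene, q in q_values.items() if q <= 0.05]
print(q_values, selected)
```

A fold-change filter, as in the lung analysis (FC ≥ 2), could be applied as an additional condition on the genes that pass the FDR cutoff.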

Hepato-Carcinogenicity can be Captured by in-vitro (human) models – Mechanisms of Action

Pathway annotation points to Mechanisms of Action

FDR ≤ 0.05 • 712 genes

Amy Li


In-vitro human models distinguish GTOX vs. non-GTOX hepato-carcinogens

[Heatmap: top 100 DE genes; color scale (val) from −3 to 3; samples annotated by Genotoxicity: NEGATIVE / POSITIVE]

Amy Li

Summary & Ongoing Work

➢ Chemical procurement and quality are challenging.
✓ Dose determination essential

➢ Predicting long-term carcinogenicity from short-term in vivo (rats) exposure assays is highly accurate
✓ Transcriptional models point to Mechanisms of Action (MoA’s)

➢ Predicting long-term carcinogenicity from short-term in vitro (cell lines) exposure assays is feasible but more challenging
✓ Analysis of hepato-carcinogens ongoing …

➢ Ongoing: generation of a large GEP dataset from in vitro experiments (>5K profiles) with Breast Carcinogens
✓ ~350 chemicals
✓ 2 cell types (MCF-10A, MCF-10A TP53+/-)
✓ Multiple profiling platforms

Gusenleitner et al., PLoS ONE 2014
Mulas et al., BMC Bioinformatics 2017
Li et al., Manuscript in preparation.


“Spin-Offs”

➢ Personalized (Drug & Chemical) Exposure Assessment
✓ Profiling of “genetically tailored” cell line models (CRISPR)
✓ Exposure to selected tobacco smoke chemicals
✓ Profiling on 3’DGE, SFL, microarrays (subset)
with Catalina Perdomo & Gang Liu @ CBM/BU

➢ Transcriptional Profiling of Chemical & Pharmaceutical Obesogens
✓ Profiling of 3T3 mouse cells exposed to known (and unknown) obesogens (including ligands)
✓ Profiling on 3’DGE (SFL perhaps)
with Jennifer Schlezinger @ BUSPH

➢ A yeast-based, DNAseq-based assay for Genotoxicity
with Josh Campbell @ CBM/BU & David Levin @ GSDM/BU

Acknowledgments

BU CBM: Amy Li, Eric Reed, Daniel Gusenleitner, Liye Zhang, Francesca Mulas, Gang Liu, Joshua Campbell, Liz Moses, Catalina Perdomo

BU SPH: David Sherr, Paola Sebastiani, Jennifer Schlezinger, Stephanie Kim

Broad Institute: Aravind Subramanian, Todd Golub, Xiaodong Lu, Ted Natoli, Joshua Bittker

NTP/NIEHS: Scott Auerbach, Rick Paules, Ray Tice
