CHARACTERIZATION OF TM9SF2 AND WAC AS NOVEL COLORECTAL CANCER DRIVER GENES

A DISSERTATION SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY

CHRISTOPHER ROBERT CLARK

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNDER THE ADVISEMENT OF DR. TIMOTHY K. STARR

OCTOBER 2018

© Christopher Robert Clark 2018

Acknowledgements

My graduate work would not have been possible without the help of so many other people, from those that helped plan and troubleshoot experiments to those that were willing to listen and provide endless amounts of encouragement during times of frustration.

Dr. Timothy Starr, my thesis advisor, graciously supported me over the last six years. Scientifically, Tim has provided me with many lessons that include how to artfully craft a grant, how to enthusiastically engage and mentor undergraduate student researchers, and how to think as an independent scientist. A trait that I still struggle with, but admire the most about Tim, is his fearlessness when it comes to taking on risky projects or using the latest technologies to advance our science. Tim has a special ability to focus on the rewards that may come from taking on such difficult challenges. I would be mistaken to claim that the most important things I learned from Tim are restricted to the scientific realm. Some of Tim’s greatest qualities are his abilities to always be optimistic, always be accepting, and always be big-hearted. He has always been a tremendous role model for me and everyone else in his lab.

The landscape of the Starr lab has changed multiple times in just six short years. When I first joined the lab, I was primarily taught by Dr. Casey Dorr. Together we made several pairs of TAL endonucleases targeting the human WAC gene. I’m fairly certain we never used these reagents, but years later I remain fascinated by Golden Gate cloning. Lee Pribyl, my first officemate and now friend, was instrumental in helping me get my feet on the ground and making Minnesota feel like home. Dr. Juan Abrahante and I overlapped for my first two years in the lab before he moved on to bigger and better things. Juan patiently taught me many methods in molecular biology and how to become a “mac person.” Intentionally, Juan made sure to be around during the most stressful times of my graduate school journey. He was supportive throughout the pains of the preliminary exam process, dropped everything he was doing to come visit me and my wife in the hospital after she badly broke her wrist, and was willing to listen as I mourned the loss of my cousin, my grandfather, and then my grandmother. Thank you, Juan.

A number of undergraduate students have contributed to these studies and I’d be thoughtless not to thank Madison Weg, Grant Hedblom, Conor Nath, Jared Tait, Michael Chandler, Anna Strauss, Angela Nguyen, and Kaila Thatcher for all of their efforts. Additionally, I would like to thank Makayla Maile and Patrick Blaney for their support as lab managers and also their valuable friendship over the years. The current Starr lab members Mihir Shetty, Zenas Chang, and Willa Durose have been tremendously helpful in my last years as a graduate student and I am extremely thankful for their contributions and friendship.

I would also like to thank several members of the Bazzaro lab for their contributions to this work and my scientific development. The Germans, also known as Stefano “Rafa” Hellweg and Juri Habicht, always volunteered to exchange labor for English lessons. I’m fairly certain their understanding of the English language is greater than my own. Ashely Mooneyham and I overlapped for much of our graduate studies and I am thankful for her dear friendship over the last several years.

I owe thanks to the members of my thesis committee: Drs. Scott McIvor, Martina Bazzaro, Robert Cormier, and Emil Lou. All have contributed time and effort to aid in my training as a scientist and helped me as I progress to the next point in my career. I am extremely thankful for their mentorship.

Outside of the lab, many people have helped me stay grounded over the years. My fellow MICab students and friends Nick Brady and Dylan White have provided endless laughs during Trivia nights and football Sundays. I am also grateful for my fitness center buddies who have always been eager to hear about my progress. I would also like to thank all my east coast friends who provided support from a distance.

Lastly, I owe my deepest thanks to my loving wife Jamie Clark. She has been unwavering in her support and devotion over the last six years. She has always pushed me through the hard times and has also made a point to enthusiastically celebrate my successes. Not a day goes by that I am not reminded she is my biggest fan.

Dedication

Dedicated to my wife and my family for a lifetime of support.

To my cousin Michael Mahoney, my grandfather Kevin Donovan, and my grandmother Alice Chase. Though it was a great challenge to mourn the loss of these loving family members while trying to conduct my research, I always took comfort in knowing that each of them would have been so proud of this accomplishment.

Abstract

The studies performed in this dissertation focused on the characterization of two candidate cancer genes, TM9SF2 and WAC, and their role in colorectal cancer (CRC). Interest in these two genes stems from their discovery as frequently mutated genes in a mouse-based CRC mutagenesis screen. The first chapter will discuss CRC and provide an historical overview of the methods used to discover novel CRC driver genes. The chapter will also cover modern strategies to identify new CRC driver genes before it ends with a thorough overview of the transposon-based forward mutagenesis screen used for CRC gene discovery.

The second chapter describes my work demonstrating that the transmembrane protein TM9SF2 is a novel CRC oncogene. Here, we have shown that TM9SF2 is a significant driver gene in murine CRC tumors and, with multiple approaches, that TM9SF2 is overexpressed in approximately one-third of human CRC samples. We provide functional data demonstrating that shRNA-mediated reduction of TM9SF2 or complete knockout by CRISPR/Cas9 drastically reduces tumor fitness in human CRC cell lines. Finally, we provide evidence that high TM9SF2 expression is correlated with poor patient prognosis. The third chapter focuses on the WAC gene and its potential tumor suppressor activity in CRC. We have shown that WAC is frequently mutated in murine CRC mutagenesis screens and that reduction in WAC expression reduces . This chapter also discusses our finding that loss of WAC is detrimental to mouse embryonic development. The final chapter in this dissertation provides a discussion of the significance of these findings and how these results will impact the CRC research community.

Table of Contents

Acknowledgements
Dedication
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
  Colorectal Cancer Overview
  Introduction
  Early advances in CRC gene discovery
  Current advances in CRC gene discovery
  Mouse Models of FAP
  Mouse Models of Lynch Syndrome
  Models of Sporadic CRC
  Conclusion
  Thesis Statement
Chapter 2. Transposon mutagenesis screen in mice identifies TM9SF2 as a novel colorectal cancer oncogene
  Synopsis
  Introduction
  Results
    Insertional mutagenesis screens identify TM9SF2 as candidate cancer gene
    TM9SF2 is overexpressed in human CRC
    TM9SF2 functions as an oncogene in CRC cell lines
    TM9SF2 functions as an oncogene in vivo
    Cell cycle and metabolic pathways upregulated by TM9SF2
    ELF1 regulates TM9SF2 expression
    High levels of TM9SF2 correlate with higher stage cancer and decreased disease-free survival
  Discussion
  Materials and Methods
Chapter 3. WAC is a common insertion site gene and has potential tumor suppressor activity in human colorectal cancer
  Synopsis
  Introduction
  Results
    Insertional Mutagenesis screens in mice have identified Wac as a candidate
    WAC is somatically mutated in cancer and its expression is down regulated in human CRC
    In mouse CRC cell lines, loss of WAC promotes anchorage independent growth
    WAC knockdown cooperates with APC and to promote anchorage independent growth of premalignant colonic epithelial cells
    Partial reduction of WAC does not increase the growth of sporadic CRC cell lines
    Complete knockout and strong overexpression of WAC reveal its tumor suppressive properties in sporadic CRC cell lines
    WAC is a potential regulator of TP53 and CDKN1A
    Loss of WAC is embryonic lethal in mice
  Discussion
  Materials and Methods
Chapter 4. Discussion, future directions, and concluding remarks
  Discussion
  Future directions
  The utility of the transposon-based insertional mutagenesis screen in a next generation era
References
Appendix

List of Tables

Table 2.1. Genesets enriched in parental HT-29 cells compared to HT-29 TM9SF2 knockout cells.
Table 3.1. Top 10 common insertion site genes from insertional mutagenesis screens in mice.

List of Figures

Figure 1.1. A schematic of the T2/Onc transposon.
Figure 1.2. Breeding scheme used to generate triple transgenic mice.
Figure 2.1. SB screen identifies Tm9sf2 as a candidate CRC driver gene.
Figure 2.2. Frequency of SB insertions in Tm9sf2 in SBCD database.
Figure 2.3. TM9SF2 is overexpressed in human CRC samples.
Figure 2.4. TM9SF2 is overexpressed in human CRC cell lines.
Figure 2.5. TM9SF2 expression is elevated in patient tumor versus matched normal samples.
Figure 2.6. Knockdown of TM9SF2 reduces anchorage independent growth in DLD1 cells.
Figure 2.7. CRISPR/Cas9 Knockout of TM9SF2 reduces cell growth in soft agar.
Figure 2.8. CRISPR/Cas9 Knockout of TM9SF2 reduces cell proliferation rate.
Figure 2.9. Overexpression of TM9SF2 enhances anchorage independent growth of HCT116 cells.
Figure 2.10. TM9SF2 knockout reduces tumor growth and extends survival in an in vivo xenograft model.
Figure 2.11. TM9SF2 knockout demonstrates its role as a potential regulator.
Figure 2.12. ELF1 binding motif in the TM9SF2 promoter region.
Figure 2.13. ELF1 expression positively correlates with TM9SF2 expression in human CRC samples.
Figure 2.14. ELF1 expression in a panel of CRC cell lines.
Figure 2.15. ELF1 binds to the promoter region of TM9SF2.
Figure 2.16. TM9SF2 expression increases with disease stage.
Figure 2.17. TM9SF2 expression predicts relapse free survival.
Figure 3.1. Wac is a common insertion site gene in mouse intestinal tumors.
Figure 3.2. WAC is mutated in human
Figure 3.3. WAC is somatically mutated in human CRC samples.
Figure 3.4. WAC mRNA expression is downregulated in CRC tumors.
Figure 3.5. WAC functions to suppress anchorage independent growth in murine colon epithelial cells.
Figure 3.6. WAC has tumor suppressive function in human colorectal adenoma cells.
Figure 3.7. Partial WAC reduction has minor effect on sporadic CRC cell growth.
Figure 3.8. CRISPR/Cas9 deletion of WAC in the CRC cell line HT-29.
Figure 3.9. WAC is tumor suppressive in sporadic CRC cell lines.
Figure 3.10. WAC is tumor suppressive in sporadic CRC cell lines.
Figure 3.11. WAC is involved in cell cycle regulation and may be involved in the TP53 response.
Figure 3.12. Reduction of WAC does not sensitize cells to the effects of chemotherapeutics.
Figure 3.13. Creation of conditional WAC knockout mouse and experimental design.
Figure 3.14. Wac is critical for proper embryonic development and may be embryonic lethal.

Chapter 1. Introduction

*This work has been published as a review article in the World Journal of Gastroenterology.*

Colorectal Cancer Overview

Synopsis

Colorectal Cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility of colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient derived CRC samples led to the discovery of several highly penetrant mutations (e.g. APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remain a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC.

Key words: Mouse Models; Colorectal cancer; Cancer genes; Insertional mutagenesis; Transposable elements

© The Author(s) 2015. Published by Baishideng Publishing Group Inc. All rights reserved.

Core tip: Successful implementation of targeted therapy will require a much more sophisticated understanding of colorectal cancer genetics, including the ability to discern "driver" mutations from the more common "passenger" mutations. Interpreting causality from large human genomic datasets will benefit from data produced by animal models and will expedite clinical trials using targeted therapies. This review describes the benefits and limitations of both traditional and new mouse models that are being used to discover and define colorectal cancer driver genes.

Introduction

The cause of CRC, as with many malignancies, is the accumulation of genetic mutations and other genetic aberrations over an individual’s lifetime. Several decades of genomic studies have revealed that colorectal tumors can be grouped into two general molecular categories known as the “mutator” phenotype and the “chromosome instable” phenotype [1-3].

It is estimated that fifteen percent of patients with CRC have mutations in the DNA mismatch repair system. Because these patients have tumors containing a significantly higher number of somatic mutations than those with proficient DNA repair systems, their tumors are said to display the mutator phenotype. Sporadic CRC tumors displaying the mutator phenotype typically have a loss of MLH1 expression due to promoter methylation [4]. The MLH1 gene is highly conserved and its gene product functions in repairing DNA replication errors that result in mismatched nucleotide bases. Loss of MLH1 function results in the accumulation of single nucleotide alterations across the genome, which are especially apparent in regions of the genome containing highly repetitive sequence [5]. The result of dysfunctional DNA mismatch repair is illustrated by the fact that MLH1 knockout mice display increased susceptibility to cancer formation [5]. Individuals with Lynch syndrome, also known as Hereditary Non-polyposis Colorectal Cancer (HNPCC), have germline mutations in DNA mismatch repair genes causing a high mutation load and microsatellite instability. Those with Lynch syndrome are at high risk of developing CRC and typically account for 3-5% of the total number of new CRC diagnoses [6]. For reasons that are not well understood, patients with CRC tumors that are genetically defined as the mutator phenotype generally have a better clinical prognosis than their non-mutator phenotype counterparts [7,8].

Inactivating mutations in the adenomatous polyposis coli gene (APC), a tumor suppressor gene, are widely regarded as the key somatic mutations responsible for driving sporadic colorectal tumor formation [9]. This is supported by data from large-scale genomic studies that identified mutations in APC in approximately 80% of sporadic CRCs [10,11]. The protein encoded by the APC gene functions as a negative regulator of the Wnt signaling pathway, which is a highly conserved pathway responsible for controlling many cellular processes including cellular proliferation and differentiation [12,13]. Additionally, the Wnt pathway is critical for maintaining organ homeostasis and is particularly important in maintaining homeostasis in the intestinal epithelium. The consequences of deregulated Wnt signaling, usually due to inactivating mutations in APC, are best exemplified in those affected by the relatively uncommon syndrome known as familial adenomatous polyposis (FAP). Individuals with FAP have germline mutations in APC and consequently develop hundreds to thousands of colon polyps by their teenage years. Furthermore, it is estimated that nearly all FAP patients are diagnosed with CRC prior to the age of 40 [14,15].

The vast majority of sporadic colorectal tumors (~85%) are molecularly characterized as chromosome instable [2]. Tumors in this category display major chromosomal aberrations including aneuploidy, loss of heterozygosity, gene fusions, and large insertions and deletions [2,16]. The root cause of chromosomal instability in colon cancer is unclear, but many mechanisms have been proposed. These mechanisms include, but are not limited to, defects in the system responsible for faithful chromosome segregation during mitosis and inactivation of genes responsible for the repair of DNA damaged by genotoxic stressors (e.g. methylating agents) [2,17]. Regardless of the underlying cause of chromosome instability, it is likely that the accumulation of somatic mutations in tumor suppressors and oncogenes, in combination with genome instability, leads to the initiation and progression of CRC.

Early advances in CRC gene discovery

During the 1970s, researchers identified and defined a small number of oncogenes and tumor suppressors based on the study of oncoviruses and recurrent chromosomal abnormalities. In the 1980s, several groups used this knowledge to assay those known cancer genes in CRC and found that RAS mutations were present in 30% to 50% of patients [18,19]. Furthermore, recurrent allelic deletions were identified that eventually implicated TP53, APC and SMAD4 as tumor suppressors in CRC [19-21]. By analyzing tumors at various stages along the continuum of adenoma-adenocarcinoma-metastasis, these researchers were able to hypothesize that certain mutations were early gatekeepers, such as APC and KRAS, while other mutations were only found at later stages, such as SMAD4 and TP53.

In the early 1990s scientists used FAP and Lynch syndrome family cohorts to identify the causative tumor suppressor genes for these two syndromes. The APC gene was identified by positional cloning in FAP cohorts, while linkage analysis implicated mutations at chromosomal regions 2p16 and 3p21 in Lynch syndrome families [22-28]. At the same time, it was discovered that a subset of CRC patients had novel microsatellite alleles in their cancers, indicating microsatellite instability, and microbiologists had recently identified mismatch repair (MMR) genes in yeast. These findings led to the hypothesis that mutations in human homologs of the yeast MMR genes may be the cause of Lynch syndrome [29]. This hypothesis was strengthened by the discovery that one of the homologs, MSH2, was in the chromosome 2p region linked to Lynch syndrome [30]. Within the next few years, mutations in several other MMR genes, including MLH1, PMS1, PMS2, and MSH6, were discovered in Lynch syndrome patients [31-35].

Current advances in CRC gene discovery

New technologies, such as next generation sequencing, have allowed researchers to comprehensively analyze whole exomes, genomes and transcriptomes of large numbers of CRC patients. These datasets have been mined using various bioinformatic algorithms to identify putative CRC driver genes. In general, these algorithms detect genes that are recurrently mutated, amplified, deleted, or altered by other means and assign a rank or p-value to predict whether or not the gene is a driver of tumorigenesis. For example, a study of the entire exomes of 11 CRC patient tumors identified 140 putative cancer drivers based on frequency of recurrence [36,37]. More recently, 224 CRC patient tumors were assayed for mutations, gene expression, copy number, and methylation status by The Cancer Genome Atlas (TCGA) Network. This study produced various lists of genes based on recurrent changes, including 31 genes recurrently mutated and a larger number of genes found in genomic regions that were recurrently amplified or deleted [38]. It is still unknown how many genetic mutations are necessary and sufficient for cancer initiation, with estimates ranging from 3 to 14 [37,39].

These landmark studies have provided cancer geneticists with a wealth of data regarding the genetic drivers of CRC and were the springboard for the creation of several animal models.
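As a minimal sketch of the recurrence-based ranking described in this section, the following Python example scores each gene with a one-sided binomial test: given the number of tumors in which a gene is mutated, it estimates how surprising that count would be under a uniform background probability. The gene labels, counts, and background probability are hypothetical placeholders and are not taken from the cited studies. Published algorithms also account for covariates such as gene length and sequence context, which are ignored here.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rank_recurrent_genes(mutated_counts: dict, n_tumors: int, background_prob: float) -> list:
    """Sort genes by the p-value of their recurrence, most significant first."""
    scored = [
        (gene, binomial_tail(count, n_tumors, background_prob))
        for gene, count in mutated_counts.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical input: number of tumors (out of n_tumors) with a mutation in each gene.
n_tumors = 224
mutated_counts = {"GENE_A": 90, "GENE_B": 12, "GENE_C": 3}
# Assumed probability that a given gene is mutated in a tumor by chance alone.
background_prob = 0.02

ranking = rank_recurrent_genes(mutated_counts, n_tumors, background_prob)
for gene, p_value in ranking:
    print(f"{gene}\tp = {p_value:.3g}")
```

A low p-value in this sketch only flags a gene as mutated more often than the assumed background would predict; it does not by itself establish that the gene is a driver, which is the gap the animal models discussed next are meant to address.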

Mouse Models of FAP

Moser et al developed the first mouse model of FAP in a forward genetic screen using N-Ethyl-N-Nitrosourea (ENU) as a germline mutagen. C57BL/6J (B6) mice were treated with ENU then crossed to AKR mice to generate progeny harboring ENU-induced germline mutations. The progeny from this breeding scheme exhibited an interesting circling trait that proved to be heritable in the offspring of AKR/B6 (F1) crossed to B6. Adult offspring from this cross were often anemic and frequently passed bloody stool. GI tract adenomas were observed in all anemic mice and tumors eventually progressed to adenocarcinomas in aging mice. Mice displaying these phenotypic features were said to carry the Multiple Intestinal Neoplasia (Min) gene. Further analysis demonstrated the Min mutation was inherited in an autosomal dominant fashion and heterozygous offspring developed hundreds of GI tract tumors resulting in death at approximately 120 days [40]. Su et al later identified the Min locus as the mouse homolog of the human APC gene. A thymine to adenine transversion mutation, creating a premature stop codon, was found at nucleotide 2549 (codon 850) of the Apc gene [41]. With similar nonsense mutations often observed in FAP patients, it was determined that the ApcMin mouse was a suitable FAP model.

Several variations of the Apc mutant mouse have since been developed with the main differences being the location of the Apc truncating mutations. The most notable and well characterized of these variations are mice engineered with Apc truncated at amino acids 716 and 1638 [42,43]. The development of Cre-lox technology has also enabled researchers to control the location and/or timing of Apc deletions. For example, expressing Cre recombinase from a tissue specific promoter (e.g. Fabpl and Villin promoters) results in deletions of Apc specifically in the GI tract epithelial cells [44-48]. Each Apc deletion mutant displays slight phenotypic variations but all develop GI tract adenomas that eventually progress to adenocarcinomas. For more detail on the various Apc mutants please refer to these excellent reviews [49-52]. Because Apc mutant mice rarely develop aggressive metastatic disease, these models have been mostly helpful in studying the genetic events driving early CRC tumor formation.

Mouse Models of Lynch Syndrome

Several attempts to create a mouse model of Lynch syndrome have been made but most are limited in their ability to faithfully recapitulate the early onset of CRC tumors. Early efforts to create a Lynch syndrome mouse model focused on deleting the murine homologs of MLH1 and MSH2, as these genes are mutated in approximately 90% of Lynch syndrome patients [14]. Msh2 knockout mice are viable and develop without abnormalities. Adult Msh2-/- mice have a reduced life span (6-12 mo.) due to T-cell malignancies [53,54]. Gastrointestinal tract adenomas and adenocarcinomas are observed in 80% of Msh2-/- mice that survive 8-10 months [55]. The microsatellite instability observed in Lynch syndrome CRC tumors was also found in tumor tissues (T-cell and GI tract) from Msh2-/- mice. Mlh1 knockout mice are phenotypically similar to Msh2-/- mice [56]. It has also been shown that mice deficient in the MMR protein Msh6 develop GI tract tumors but, unlike Mlh1-/- and Msh2-/- models, these mice do not display the microsatellite instability characteristic of Lynch syndrome CRC tumors [57].

In order to avoid early death from lymphomas, researchers have bred MMR gene knockout mice to immunocompromised strains (tap1-/-) that lack CD8+ T-cells. Such mice do not die early from lymphomas, allowing for the development of CRC tumors resembling those found in Lynch syndrome patients [58]. This approach has been improved upon using the Cre-lox system to restrict MMR gene deletion to the GI tract epithelial cells of immunocompetent mice [59].

Although useful, genetically engineered mice (transgenic or gene knockout/knockin) are typically designed to harbor only a few mutant alleles and are, therefore, inherently limited in their ability to accurately model the complex genetic alterations of sporadic CRC tumors. Another deficiency is the difficulty of simultaneously altering multiple genes, including genes of unknown function. The next section describes models that can overcome these limitations.

Models of Sporadic CRC

Chemically Induced Tumor Models

To mimic environmentally induced cancers, researchers have used carcinogenic chemicals to generate sporadic CRC tumors. Methylazoxymethanol (MAM), 1,2-dimethylhydrazine (DMH), and azoxymethane (AOM) are examples of popular compounds used to generate CRC tumors in mice. Although the number of tumors varies depending on the strain, mice treated with these compounds quickly develop CRC tumors in the distal colon that moderately resemble human CRC tumors (e.g. KRAS mutations). Although these models reliably produce sporadic CRC tumors, it is challenging to determine their mutational landscape. To do so, one must target previously identified cancer drivers for DNA sequencing or use a more unbiased method of analysis (e.g. whole exome, RNA-seq, etc.), which can be cost prohibitive. For a comprehensive overview of the many carcinogen-induced CRC models refer to these reviews [60-63].

Insertional mutagenesis models for CRC gene discovery

The advancement of CRC therapeutics, specifically the development of molecularly targeted therapies, is critically dependent on the identification of novel CRC driver mutations. Insertional mutagenesis forward genetic screens are an excellent method to identify novel cancer genes. Retroviruses (e.g. MMTV and MuLV) have long been used as insertional mutagens and their use has led to the discovery of several major cancer genes [64-67]. There are two flavors of retroviruses, the acute transforming and the slow transforming viruses. Slow transforming retroviruses are often used as insertional mutagens because they do not carry viral oncogenes within their genomes. Instead, slow transforming viruses promote tumor formation by proviral insertion into endogenous oncogenes and tumor suppressors. Upon insertion next to, or within, a gene, elements within the viral genome can act in cis to alter the expression of cellular genes [68]. The virally disrupted genes can be identified by using PCR to amplify the viral-host genome DNA junction, followed by sequencing. Due to tissue tropisms, retroviruses have mainly been used to model mammary and blood cancers.

Class II transposable elements represent novel insertional mutagens used for the discovery of CRC driver genes. Class II transposons are DNA elements that rely on the enzymatic activity of a transposase to be “cut” from one genomic location and “pasted” to another. Transposons have been used for decades to study gene function in a wide array of organisms but the use of transposons in vertebrates is a relatively new advancement [69,70]. Ivics et al (1997) were the first to construct a synthetic transposon, Sleeping Beauty (SB), and transposase that showed activity in vertebrate cells [71]. Quickly after its introduction, the SB transposon system was engineered into the mouse genome and transposition was shown to occur in vivo [72,73]. Enhancements that increased the transposition frequency of the SB transposon led to the development of the oncogenic transposon known as T2/Onc [74,75]. The T2/Onc transposon carries multiple mutagenic elements that make it an ideal tool for the discovery of candidate cancer genes. To mimic gain of function mutations, the T2/Onc transposon carries a strong viral promoter and a splice donor sequence. In the event of T2/Onc integration upstream of a gene and in the same transcriptional orientation, the strong viral promoter carried by the transposon will drive expression of the cellular gene. Similarly, if the transposon lands in an intron it can also produce an active truncated protein. T2/Onc also carries splice acceptor sites on both DNA strands and a bidirectional poly(A) signal. If T2/Onc integrates within a gene, in either orientation, the splice acceptors and poly(A) signal will terminate transcription, effectively mimicking loss of function mutations (figure 1.1). In separate publications Dupuy et al and Collier et al used the T2/Onc transposon and SB transposase to demonstrate the mutagenic potential of the T2/Onc transposon in somatic cells [76,77].
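The position and orientation rules described above for T2/Onc can be sketched as a small Python function. This is an illustrative simplification rather than an analysis tool used in the cited studies: the `Gene` and `Insertion` types, the effect labels, and the 10 kb upstream window are all assumptions introduced for the example, and the sketch does not distinguish introns from exons.

```python
from dataclasses import dataclass

# Illustrative labels for the predicted consequences of an insertion.
OVEREXPRESSION = "promoter-driven overexpression"
TRUNCATION = "premature termination (loss of function)"
TRUNCATED_PRODUCT = "possible expression of a truncated product"


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"


@dataclass(frozen=True)
class Insertion:
    position: int
    strand: str  # orientation of the transposon's promoter and splice donor


def predict_effects(gene: Gene, ins: Insertion, upstream_window: int = 10_000) -> set:
    """Predict how one transposon insertion could affect one gene.

    upstream_window is an assumed cutoff for how far upstream the
    transposon promoter is considered able to drive the gene.
    """
    same_orientation = gene.strand == ins.strand
    inside = gene.start <= ins.position <= gene.end
    if gene.strand == "+":
        upstream = gene.start - upstream_window <= ins.position < gene.start
    else:
        # For a minus-strand gene, "upstream" lies at higher coordinates.
        upstream = gene.end < ins.position <= gene.end + upstream_window

    effects = set()
    if upstream and same_orientation:
        effects.add(OVEREXPRESSION)
    if inside:
        # Splice acceptors and poly(A) signals act in either orientation.
        effects.add(TRUNCATION)
        if same_orientation:
            # Promoter and splice donor may drive the downstream exons.
            effects.add(TRUNCATED_PRODUCT)
    return effects


example_gene = Gene("GeneX", 1_000, 5_000, "+")  # hypothetical gene
print(predict_effects(example_gene, Insertion(500, "+")))
print(predict_effects(example_gene, Insertion(2_000, "-")))
```

In this sketch, an insertion just upstream of the hypothetical gene in the same orientation is labeled as a candidate gain of function event, while an insertion inside the gene in the opposite orientation is labeled only as a candidate loss of function event.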

Soon after the debut of the T2/Onc transposon, Starr et al harnessed its mutagenic potential for the discovery of novel CRC driver genes. In this forward genetic screen, the authors created a triple transgenic mouse by crossing mice harboring a Cre-responsive SB transposase allele with double transgenic mice engineered to express the Cre recombinase only in the gut as well as carry a concatemer of T2/Onc transposons (figure 1.2). In this model the T2/Onc transposon is mobilized by the SB transposase only after Cre recombinase (villin-Cre) removes the LoxP-STOP-LoxP cassette located upstream of the SB transposase.


Figure 1.1. A schematic of the T2/Onc transposon. The central viral promoter and splice donor (SD) promote the expression of candidate oncogenes. The bidirectional splice acceptor (SA) sites and polyadenylation (pA) sites allow for the disruption of candidate tumor suppressor gene expression. The inverted repeat direct repeat (IR/DR) sites (in pink) serve as binding sites for the SB transposase.


Figure 1.2. Breeding scheme used to generate triple transgenic mice. Villin-Cre mice express the Cre recombinase from the GI-specific villin promoter. T2/Onc mice carry a concatemer of T2/Onc transposons on Chromosome 1 or 15. Rosa-LSL-SB mice carry a knockin of the SB11 transposase enzyme downstream from a Lox-STOP-Lox (LSL) cassette at the Rosa locus. This design restricts T2/Onc transposition to the epithelial cells lining the GI tract.

The results from this study revealed that triple transgenic mice become moribund at a faster rate than control mice (double transgenics) and that the vast majority of these triple transgenic mice developed intestinal lesions. Histopathology revealed the resulting GI tract growths to be intraepithelial neoplasias, adenomas, and adenocarcinomas. To identify the disrupted host genes, DNA was harvested from 135 tumors for linker-mediated PCR (LM-PCR) amplification of the transposon-host genome junction. Sequencing of the LM-PCR products revealed each tumor contained, on average, 124 T2/Onc insertions. Using statistical approaches to identify common insertion site (CIS) loci, the authors were able to home in on a list of 77 candidate CRC genes. The ability of this tool to model sporadic CRC was confirmed by the finding that APC, PTEN, and SMAD4, which are commonly disrupted in human CRC tumors, were also identified as T2/Onc CISs in mouse tumors [78]. Of the 77 CIS genes, 17 were identified as novel CRC driver genes. One of these genes, RSPO2, has recently been identified as a tumor suppressor gene in human CRC tumors [79].

Similar screens have been carried out using the T2/Onc transposon as an insertional mutagen in Apc mutant mice. This approach was used by Starr et al (2011) to identify novel CRC genes that cooperate with Apc mutations. T2/Onc transposition in ApcMin mice increased the polyp count from 112 (controls) to an average of 360 polyps per mouse. DNA analysis of 96 polyps from 12 mice revealed the presence of more than 30,000 transposon insertions. CIS analysis identified 37 genes in this set of transposon insertions. The remaining wild-type Apc allele of the ApcMin mouse was the most commonly mutated gene in this study, a result that further demonstrates the importance of APC mutations in CRC development. To validate some of the 37 CIS genes, the authors used RNAi to decrease the expression of nine CIS genes (CNOT1, PDE4DIP, PDCD6IP, ATF2, SF11, FNBP1L, MYO5B, SNX24, and STAG1) in vitro. It was determined that knockdown of CNOT1, PDE4DIP, PDCD6IP, ATF2, and SF11 significantly decreased the proliferation of the SW480 CRC cell line 80. Using a similar approach, March et al also determined that SB transposition increased the morbidity and tumor burden of Apc mutant mice. CIS analysis of DNA from 446 transposon-induced tumors identified hundreds of CIS genes. Using pathway analysis, the authors determined that genes from their CIS list were involved in 38 cancer related genetic networks (e.g. K-Ras signaling pathway) and as many as 183 CIS genes were identified as Wnt pathway targets 81. A more recent publication by Takeda et al expanded upon this approach to include transposon based insertional mutagenesis on KRAS, SMAD4, and TP53 mutant strains of mice. In human CRC tumors, the loss of APC is a tumor initiating mutation, followed by activating mutations in KRAS in early to intermediate adenomas and loss of function mutations in SMAD4 and TP53 in later stages. By generating mouse models with initiating mutations in these three genes (KRAS, SMAD4, and TP53), the authors attempted to identify mutations that cooperate with these three genes. Using this strategy, the authors discovered different sets of CIS genes that were unique to each genetic background, suggesting that the initiating mutation influences which genes are mutated during CRC tumor development. One of their findings was that the gatekeeper Apc mutation was common in all backgrounds, while activating mutations in the Wnt pathway members Rspo1 and Rspo2 were only found in the SMAD4 mutant mice.

Conclusion

The ultimate goal of cancer gene discovery is to translate new knowledge into more effective therapies for treating CRC. Identification of recurrently altered genes is informative but does not directly result in reduced mortality rates for CRC patients. The limited success of targeted therapies is likely attributable to tumor heterogeneity and the action of unidentified CRC driver genes. Mouse models continue to be an essential tool used to unveil novel CRC driver genes with therapeutic potential. One promising example is the identification of ion channel genes as candidate drivers of CRC. Both KCNQ1 and CFTR were identified in multiple Sleeping Beauty insertional mutagenesis screens 80-82. Based on these findings, Than et al reported that KCNQ1 is a tumor suppressor in human CRC 83, while recent findings from cystic fibrosis patients support the hypothesis that CFTR is also a human tumor suppressor 84. These findings suggest that ion channel modulators may represent a new class of drugs for treating a subset of CRC patients.

Eventually, clinical trials using approved and experimental therapies are required. Once the panoply of driver genes has been defined, we will need to develop clinical lab diagnostics capable of determining the exact drivers of each individual patient's cancer based on tumor biopsies. Such information will allow physicians to select drugs and therapies specifically designed to target those drivers.

Thesis Statement

The aim of this dissertation is to validate that the common insertion site genes TM9SF2 and WAC are novel human CRC driver genes. Specifically, I propose that TM9SF2, a relatively unknown channel protein, is a colorectal cancer oncogene. I have tested this hypothesis with bioinformatic approaches examining the expression level of TM9SF2 in publicly available data sets as well as a set of tumor and matched normal tissues. Furthermore, I have utilized CRISPR/Cas9 technology to develop the first TM9SF2 knockout cell line to use as a model for functional assays. Lastly, I propose that WAC is a colorectal cancer tumor suppressor with a potential role in the regulation of P53 and P53 target genes.

Chapter 2. Transposon mutagenesis screen in mice identifies TM9SF2 as a novel colorectal cancer oncogene

Authors: Christopher R. Clark a, Makayla Maile a, Patrick Blaney a, Stefano R. Hellweg a, Anna Strauss a, Wilaiwan Durose a, Sambhawa Priya d,e, Juri Habicht a, Michael B. Burns f, Ran Blekhman d,e, Juan E. Abrahante c, Timothy K. Starr a,b

Synopsis

New therapeutic targets for advanced colorectal cancer (CRC) are critically needed. Our laboratory recently performed an insertional mutagenesis screen in mice to identify novel CRC driver genes and, thus, potential drug targets. Here, we define Transmembrane 9 Superfamily 2 (TM9SF2) as a novel CRC oncogene. TM9SF2 is an understudied protein belonging to a well conserved protein family characterized by their nine putative transmembrane domains. Based on our transposon screen we found that TM9SF2 is a candidate progression driver in digestive tract tumors. Analysis of The Cancer Genome Atlas (TCGA) data revealed that approximately 35% of CRC patients have elevated levels of TM9SF2 mRNA, data we validated using an independent set of CRC samples. RNAi silencing of TM9SF2 reduced anchorage-independent growth, a hallmark of cancer, in CRC cells. Furthermore, CRISPR/Cas9 knockout of TM9SF2 substantially diminished CRC tumor fitness in vitro and in vivo. Transcriptome analysis of TM9SF2 knockout cells revealed a potential role for TM9SF2 in cell cycle progression, oxidative phosphorylation, and ceramide signaling. Lastly, we report that increased TM9SF2 expression correlates with disease stage and low TM9SF2 expression correlates with a more favorable relapse-free survival. Taken together, this study provides evidence that TM9SF2 is a novel CRC oncogene.

Introduction

Colorectal cancer (CRC) arises from a stepwise accumulation of mutations that transform normal epithelia into cancerous tissue 19,85. Decades of research analyzing the genetic basis of CRC has resulted in the identification of several important driver genes including APC, KRAS, SMAD4, and TP53. In addition, recent large scale genomic analyses, such as The Cancer Genome Atlas (TCGA), have identified numerous additional recurrent somatic mutations, focal copy number alterations and gene expression changes 86,87. From these studies it is clear that CRC has a complex genetic etiology. It is important that we understand the functional significance of these genetic changes so that we can develop better therapies, especially for advanced disease.

To functionally define genetic drivers of CRC, we and others have used Sleeping Beauty (SB) transposon mutagenesis screens in mice, an unbiased method of finding genetic drivers of CRC. These studies have produced multiple lists of genes suspected of contributing to CRC when altered by transposon mutagenesis 78,80,82,88. With the goal of finding potential therapeutic targets we are using cross-species bioinformatics approaches to select genes from these lists for further study. This approach has resulted in the identification of potential actionable targets including KCNQ1, CFTR, and RSPO2/3 83,89,90. In this study, we report our findings on TM9SF2, a transmembrane protein belonging to the transmembrane-9 superfamily (TM9SF), which includes TM9SF1-4. Although the family is well conserved evolutionarily, very little is known about the function of TM9SF proteins in mammalian cells or their role in cancer 91,92. TM9SF1 has been implicated in autophagosome formation and has been linked to bladder cancer 93,94. It has been reported that TM9SF3 is upregulated in chemoresistant breast cancer cells after combination treatment with paclitaxel and an HDAC inhibitor and may also play a role in gastric cancer 95,96. The most well studied member, TM9SF4, is reportedly overexpressed in human melanoma cells and has also been described as a proton pump associated protein 97,98.

In this study, we identify TM9SF2 as a novel oncogene in CRC. We found that TM9SF2 is potentially regulated by the Ets-family transcription factor ELF1, and TM9SF2 is upregulated in approximately one-third of human CRC samples. We used RNAi and CRISPR/Cas9 to either reduce or knockout the expression of TM9SF2, which had the effect of reducing tumor fitness in both the in vitro and in vivo settings. Finally, we performed transcriptome analysis to gain insight into the potential role of TM9SF2 as a cell cycle regulating protein.

Results

Insertional mutagenesis screens identify TM9SF2 as a candidate cancer gene

Our laboratory previously performed an insertional mutagenesis screen in mice to identify novel gastrointestinal (GI) tract cancer driver genes 78. In this study we used the Sleeping Beauty (SB) DNA system consisting of an oncogenic DNA transposon (T2/Onc) capable of disrupting tumor suppressor genes and activating oncogenes, which is activated by tissue-specific expression of the SB transposase 76,77,99. We identified 77 candidate cancer genes whose activity was potentially altered by T2/Onc transposition based on common insertion site (CIS) analysis 100. Of these 77 candidate cancer genes, we chose to focus on TM9SF2 for further study because we found this gene to be overexpressed in a large percentage of human CRC samples, suggesting a potential oncogenic function.

TM9SF2 is a member of a highly conserved family of proteins that span the lipid bilayer nine times. The predicted function of the TM9SF2 protein product is to act as a small molecule transporter or ion channel. In our screen the T2/Onc transposon insertions were mapped to the murine Tm9sf2 gene in nine tumor samples (Fig 2.1).

To further explore the role of TM9SF2 as a cancer gene, we used two publicly available databases that catalog cancer genes discovered using DNA transposon insertional mutagenesis. The Candidate Cancer Gene Database (CCGD: http://ccgd-starrlab.oit.umn.edu/about.php) catalogs cancer genes identified in 69 insertional mutagenesis studies covering 12 tumor types 88.


Figure 2.1. SB screen identifies Tm9sf2 as a candidate CRC driver gene. Schematic representation of gastrointestinal tract tumor T2/Onc insertion sites within the murine Tm9sf2 gene. Triangles depict the location of insertion as well as the orientation of the promoter-splice donor within the transposon.


Mining the CCGD database revealed that Tm9sf2 was a transposon-targeted mutation in an additional eight forward genetic screens, including screens for liver, pancreatic, breast, and gastric cancer (see supplementary table S2.2). The Sleeping Beauty Cancer Driver Database (SBCDDB: http://sbcddb.moffitt.org/index.html) catalogs over 1.5 million transposon insertions from 2,354 tumors taken from approximately 1,000 mice from 19 tumor types 101. Mining of the SBCDDB revealed that Tm9sf2 was a common insertion site in 7.2% (121/1,674 tumors) of all digestive tumors, which include liver, pancreas, intestine, and stomach tumors, but was not identified as a driver in hematopoietic tumors (Fig 2.2A). Several insertional mutagenesis studies were conducted using mice that are predisposed to GI tract cancer by manipulating known genetic drivers, including Apc, Kras, Smad4, and Trp53 80-82. Interestingly, Tm9sf2 was not identified as a driver gene in mice with Apc or Smad4 mutations but was identified as a driver in mice with Kras or Trp53 mutations. Mice harboring the activating Kras G12D allele had Tm9sf2 transposon insertions in 13 out of 173 tumors (7.5%; p=1.68e-05) and mice harboring the dominant negative R172H Trp53 mutation had insertions in 7 out of 55 tumors (12.7%; p<0.005) (Fig 2.2B and Supplementary Table S3). This analysis indicates TM9SF2 has a contributing role in the formation of murine intestinal tumors.
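The insertion frequencies quoted above follow directly from the reported tumor counts. As a minimal arithmetic check, the percentages can be recomputed as sketched here; the dictionary keys are illustrative labels only and are not identifiers taken from the SBCDDB.

```python
# Tumor counts as reported in the text:
# (tumors with Tm9sf2 insertions, tumors analyzed).
# The keys are illustrative labels, not database identifiers.
counts = {
    "digestive_tract": (121, 1674),
    "kras_g12d": (13, 173),
    "trp53_r172h": (7, 55),
}


def insertion_frequency(hits, total):
    """Return the percentage of tumors carrying an insertion."""
    return 100.0 * hits / total


frequencies = {name: insertion_frequency(h, n) for name, (h, n) in counts.items()}

for name, pct in frequencies.items():
    print(f"{name}: {pct:.1f}%")
```

This reproduces only the proportions; the p-values reported above come from the original screens and are not recalculated here.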



Figure 2.2. Frequency of SB insertions in Tm9sf2 in the SBCDDB. A, The frequency of tumors with SB insertions in Tm9sf2 in digestive tract tumors, solid tumors, liquid tumors, and all tumors analyzed in the SBCDDB. Gray bars represent instances where Tm9sf2 is a progression driver gene. White bars are not significantly altered cases. B, The frequency of Tm9sf2 insertions in intestinal-specific mutagenesis screens in mice with predisposing mutations in Trp53 (R172H allele) or Kras (G12D allele). Tm9sf2 insertions are predicted to act as a progression driver gene in both studies.

TM9SF2 is overexpressed in human CRC

To determine if TM9SF2 is altered in human disease we evaluated TM9SF2 mutation data from the Catalogue of Somatic Mutations in Cancer (COSMIC) database (see: http://cancer.sanger.ac.uk/cosmic/publications) 102. Interestingly, data from COSMIC revealed that mutations in TM9SF2 are rarely catalogued in human tumors. We found that tumors derived from the endometrium contained the highest rate of TM9SF2 mutation at a tumor frequency of approximately 2% (13/656 total tested). Tumors from the small and large intestine were the next most likely to contain mutations in TM9SF2 but the mutation frequency remained very low at 1.92% and 1.71% respectively. These data suggest that point mutations and small indels in TM9SF2 are unlikely to play a significant role in CRC development and progression.

We used several approaches to measure TM9SF2 gene expression in CRC. First, we used cBioportal to analyze gene expression in CRC tumors from The Cancer Genome Atlas 3,103. Analyses of gene expression levels using microarray or RNA sequencing reveal that, when compared to the mean expression distribution of tumor samples that are diploid for TM9SF2, TM9SF2 is overexpressed in approximately one-third (194/601 tumors) of large intestine tumors (Fig 2.3).


Figure 2.3. TM9SF2 is overexpressed in human CRC samples. (Top) Oncoprint of genomic alterations in 379 TCGA patient colorectal adenocarcinomas subjected to RNA sequencing. TM9SF2 copy number alterations and mRNA expression changes are shown in addition to a heatmap depicting the intensity of TM9SF2 mRNA changes. (Bottom) Oncoprint showing alterations in TM9SF2 in 222 TCGA patients subjected to Agilent microarray analysis.


These data suggest that TM9SF2 may function as a proto-oncogene in CRC. For our second approach, we evaluated the mRNA levels of TM9SF2 in a panel of nine commonly used colorectal cancer cell lines using qRT-PCR. For comparison we used the Human Colonic Epithelial Cell (HCEC) line, which is a non-oncogenic immortalized cell line, as a normal (non-tumor) control sample 104. All cell lines tested had significantly upregulated levels of TM9SF2 transcript when compared to HCEC. DLD1, HCT-8, and HT-29 cells expressed the highest levels of TM9SF2, ranging from approximately 4- to 5-fold that of HCEC cells (Fig 2.4). Finally, we performed RNA-sequencing on a set of 44 CRC tumor and matched normal samples collected at the University of Minnesota. Expression levels were significantly increased in tumors compared to matched normal samples (P = 4.461 x 10^-6; Fig 2.5). In all but eight samples the levels of TM9SF2 mRNA were increased in the patient's tumor sample compared to the matched normal sample (Fig 2.5). These data support the hypothesis that TM9SF2 is overexpressed in CRC and may play a role as an oncogene.
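The direction-of-change count above (an increase in all but eight samples) can be examined with a simple exact sign test. This is a sketch only and is not the test used in this study (the Figure 2.5 legend names a paired non-parametric test on expression values). It assumes that the 44 samples correspond to 44 tumor/normal pairs with no ties, so that 36 pairs show an increase; the function name is illustrative.

```python
from math import comb


def sign_test_two_sided(n_pairs, n_increased):
    """Exact two-sided sign test p-value.

    Null hypothesis: an increase and a decrease are equally likely in each pair.
    """
    k = max(n_increased, n_pairs - n_increased)
    # Probability of a split at least this extreme in one direction.
    tail = sum(comb(n_pairs, i) for i in range(k, n_pairs + 1)) / 2 ** n_pairs
    return min(1.0, 2 * tail)


# Assumption: 44 matched pairs, 8 of which do not show an increase.
n_pairs = 44
n_increased = n_pairs - 8
p_value = sign_test_two_sided(n_pairs, n_increased)
print(f"{n_increased}/{n_pairs} pairs increased, sign test p = {p_value:.2e}")
```

The sign test ignores the magnitude of each change, so it is expected to be more conservative than a rank-based paired test on the same data.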


Figure 2.4. TM9SF2 is overexpressed in human CRC cell lines. qRT-PCR analysis for TM9SF2 expression in human CRC cell lines. HCEC, an immortalized but non-transformed colon epithelial cell line, was used as the control. *, t-test P < 0.05. **, P < 0.01. ***, P < 0.001. ****, P < 0.0001.


Figure 2.5. TM9SF2 expression is elevated in patient tumor versus matched normal samples. RNA sequencing results showing TM9SF2 expression levels in a set of CRC tumor and matched normal tissue samples from the University of Minnesota. Green lines highlight samples with an increase in TM9SF2 mRNA in tumor versus normal and red lines indicate a decrease. Expression differences between healthy and tumor tissue were tested for significance using a non-parametric Mann-Whitney U test (paired).


TM9SF2 functions as an oncogene in CRC cell lines

To determine if TM9SF2 plays an oncogenic role in CRC, we generated stable TM9SF2 knockdowns in DLD1 cells using lentiviral shRNA vectors (Fig S1A & Fig S1B). Reducing the level of TM9SF2 in DLD1 cells resulted in reduced anchorage-independent growth compared to empty-vector control cells based on their reduced ability to form colonies in soft agar. We observed a significant reduction in colony numbers in two independent knockdown DLD1 cell lines (shRNA3: 25% and shRNA7: 37%) (Fig 2.6 & Fig S1C).

To further validate the oncogenic role of TM9SF2, we used CRISPR/Cas9 editing to knockout the TM9SF2 gene in HT-29 and HCT116 CRC cell lines (Supplementary Fig 2.1). We generated multiple independent clones that did not express TM9SF2 based on sequencing and Western blot analysis (Supplementary Fig S2.2D). Similar to the DLD1 knockdown cell line, loss of TM9SF2 in HT-29 resulted in reduced anchorage-independent growth based on the soft agar assay (Fig 2.7). In addition, the proliferation rate was decreased in HT-29 TM9SF2 KO cells (Fig 2.8). Anchorage-independent growth and proliferation were not affected in TM9SF2 knockout HCT116 cells (data not shown), which is likely due to the already low level of expression in the parental HCT116 cells (Fig 2.4). In fact, overexpression of TM9SF2 in HCT116 cells resulted in increased colony growth in soft agar, which further supports the finding that TM9SF2 functions as a CRC driving oncogene (Fig 2.9 & Fig S2.2E).


Figure 2.6. Knockdown of TM9SF2 reduces anchorage independent growth in DLD1 cells. Quantification of colonies indicating that TM9SF2 knockdown reduces DLD1 growth in soft agar. ****, t-test P < 0.0001.


Figure 2.7. CRISPR/Cas9 knockout of TM9SF2 reduces cell growth in soft agar. A, Representative quantification of colonies indicating that TM9SF2 knockout reduces cell growth in soft agar. Two independent single cell clones, A3 and C6, were used in this experiment. B, Images of colonies stained with crystal violet ten days post plating. ****, t-test P < 0.0001.


Figure 2.8. CRISPR/Cas9 knockout of TM9SF2 reduces cell proliferation rate. Proliferation assay measuring trypan blue exclusion in parental (black line) and TM9SF2 HT-29 knockout clones (gray lines). *, t-test P < 0.05. ****, t-test P < 0.0001.


Figure 2.9. Overexpression of TM9SF2 enhances anchorage independent growth of HCT116 cells. Representative quantification of colonies indicating TM9SF2 overexpression increases cell growth in soft agar. ****, t-test P < 0.0001.

TM9SF2 functions as an oncogene in vivo

To determine if TM9SF2 knockout has an effect on tumor growth in vivo we performed a xenograft experiment using HT-29 TM9SF2 KO cells compared to parental cells. By day 20, mean tumor volume for parental control cells exceeded 1,000 mm³ compared to 600 mm³ and 325 mm³ for knockout clones A3 and C6 respectively (Fig 2.10A). This reduced tumor burden corresponded to a significantly increased overall survival for mice bearing TM9SF2 knockout tumor cells. Mice grafted with clones A3 and C6 had a median survival of 39 and 43 days, while mice grafted with control cells survived on average only 25 days (Fig 2.10B). Taken together, these data support the hypothesis that TM9SF2 functions as an oncogene in CRC.
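A minimal sketch of how a Kaplan-Meier median survival, such as the values reported above, could be computed. The survival times below are hypothetical placeholders and are not data from this study; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each animal
    events : 1 if the endpoint was reached at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all animals that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical example: five animals, all reaching the endpoint.
example_times = [20, 25, 25, 30, 35]
example_events = [1, 1, 1, 1, 1]
example_curve = kaplan_meier(example_times, example_events)
example_median = median_survival(example_curve)
print(example_curve, example_median)
```

Censored animals (event = 0) leave the risk set without lowering the survival estimate, which is the reason the estimator is preferred over a simple mean of survival times.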

Cell cycle and metabolic pathways upregulated by TM9SF2

To uncover the molecular pathways affected by TM9SF2 we performed RNA-seq and quantified transcript levels in the parental and TM9SF2 knockout HT-29 cells. We identified 835 genes that were differentially expressed, with 596 being increased and 239 being decreased in the knockout cells (see Supplementary Table S5). The expression of several notable cell cycle checkpoint genes including CCNA2, CCNB2, and AURKA, as well as the target proliferation-related genes PCNA and NPM1, were all significantly lower in the TM9SF2 knockout cells. The reduced proliferative phenotype observed in TM9SF2 knockout cells can likely be explained by the downregulation of these genes.


Figure 2.10. TM9SF2 knockout reduces tumor growth and extends survival in an in vivo xenograft model. A, HT-29 TM9SF2 knockout cells were injected subcutaneously in the rear flank of athymic nude mice and tumor volume was measured with calipers approximately every other day. TM9SF2 knockout cells have a significantly reduced ability to grow versus control cells. B, Kaplan-Meier survival curves for control animals and two groups bearing tumors from two independent TM9SF2 knockout clones. *, P < 0.05. **, P < 0.01. ****, t-test P < 0.0001.


To identify gene sets that are affected by TM9SF2 loss we performed Gene Set Enrichment Analysis (GSEA) on the set of altered genes 105. Gene sets that were significantly enriched in parental controls included the G2-M cell cycle checkpoint genes, Myc target genes, and genes in the reactive oxygen species pathway (Fig 2.11 and Table 2.1). The highest-ranking member of the reactive oxygen species pathway was the glucose-6-phosphate dehydrogenase gene (G6PD). Loss or downregulation of this gene has been known to cause embryonic lethality in mice and lead to cellular senescence in various cell types 106-108. TM9SF2 knockout cells have more than a two-fold decrease in G6PD, which likely contributes to the reduced oncogenic behavior of these cells.
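For readers unfamiliar with how an enrichment score is obtained, the running-sum statistic at the core of GSEA can be sketched as below. This is a simplified illustration, not the GSEA software used in this study: it omits the permutation-based normalization that yields the NES and FDR values, and the ranking scores and the GENE_X/GENE_Y/GENE_Z placeholders are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked   : list of (gene, score) sorted from highest to lowest score
    gene_set : set of gene names to test
    weight   : exponent applied to |score| for genes in the set
    """
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_norm = sum(abs(s) ** weight for (_, s), h in zip(ranked, hits) if h)
    if n_miss == 0 or hit_norm == 0:
        raise ValueError("need at least one scored hit and one miss")

    running = 0.0
    best = 0.0
    for (_, score), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(score) ** weight / hit_norm  # step up on a set member
        else:
            running -= 1.0 / n_miss                     # step down otherwise
        if abs(running) > abs(best):
            best = running                              # keep the largest deviation
    return best


# Hypothetical ranking (parental versus knockout), highest score first.
toy_ranking = [
    ("CCNB2", 3.0), ("AURKA", 2.5), ("GENE_X", 1.0),
    ("CCNA2", 0.5), ("GENE_Y", -1.0), ("GENE_Z", -2.0),
]
toy_set = {"CCNB2", "AURKA", "CCNA2"}
toy_es = enrichment_score(toy_ranking, toy_set)
print(round(toy_es, 4))
```

A positive score indicates that members of the set cluster toward the top of the ranking, which in this comparison would correspond to higher expression in parental cells.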

We used Ingenuity Pathway Analysis to identify canonical pathways altered in TM9SF2 knockout cells. This analysis found significant alterations in the expression of genes associated with Ceramide signaling, Mitotic Roles of Polo-like Kinase, and Protein Kinase A signaling pathways (Supplementary Table S6). Notable Ceramide signaling genes altered in the HT-29 knockout cells included PIK3R3 (-2.1-fold), S1PR2 (+5.7-fold), SMPD3 (-8.9-fold), and SPHK1 (+5.2-fold). These genes, and the majority of the other molecules associated with Ceramide signaling, are altered in the direction consistent with pathway activation. These genes are known to play an integral part in the metabolism of sphingolipids and they have been associated with various neoplasms 109-113.
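Signed fold changes such as those listed above are often converted to log2 ratios before comparison across genes. The sketch below assumes the common convention that a value of -x-fold denotes a knockout/parental ratio of 1/x; that convention is an assumption and is not stated explicitly in the text.

```python
from math import log2

# Signed fold changes as reported in the text
# (negative values indicate lower expression in knockout cells).
fold_changes = {"PIK3R3": -2.1, "S1PR2": 5.7, "SMPD3": -8.9, "SPHK1": 5.2}


def signed_fold_to_log2(fold):
    """Convert a signed fold change to a log2 ratio, assuming -x means 1/x."""
    if abs(fold) < 1:
        raise ValueError("signed fold changes are expected to have magnitude >= 1")
    return log2(fold) if fold > 0 else -log2(-fold)


log2_ratios = {gene: signed_fold_to_log2(fc) for gene, fc in fold_changes.items()}

for gene, value in log2_ratios.items():
    print(f"{gene}: {value:+.2f}")
```

On the log2 scale, increases and decreases of equal magnitude are symmetric around zero, which makes the up- and downregulated genes directly comparable.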


Figure 2.11. TM9SF2 knockout demonstrates its role as a potential cell cycle regulator. Three panels showing the enrichment score (ES) for the top three hallmark gene sets enriched in the control HT-29 cells compared to the TM9SF2 knockout cells.


Table 2.1. Genesets enriched in parental HT-29 cells compared to HT-29 TM9SF2 knockout cells.

Geneset                              NES    p-value   FDR q-val
Hallmark_G2M CHECKPOINT              1.25   0.000     0.215
Hallmark_MYC TARGETS V1              1.26   0.000     0.246
Hallmark_Oxidative Phosphorylation   1.3    0.0063    0.252

ELF1 transcription factor regulates TM9SF2 expression

To identify regulatory factors responsible for upregulation of TM9SF2 in CRC we analyzed HCT116 ChIP-seq data from the ENCODE project 114. Based on ENCODE data, there was a strong ELF1 binding motif (chr13: 100,153,732 – 100,153,742) in the 5' UTR of TM9SF2. The consensus ELF1 binding motif is 5’-CGGAAGT, which is a near-perfect sequence match to the observed ELF1 DNA binding site in the TM9SF2 promoter region (5’-CGGAACT). Based on ENCODE ChIP-seq data, the location of the ELF1 binding site overlaps with elevated levels of H3K4 trimethylation, an epigenetic mark commonly associated with promoters (Fig 2.12 & Fig S2.3).
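The comparison between the consensus motif and the observed site can be made explicit with a position-by-position check. The two sequences are taken from the text above; the function name is illustrative.

```python
def motif_mismatches(consensus, observed):
    """Return (1-based position, consensus base, observed base) for each mismatch."""
    if len(consensus) != len(observed):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, c, o)
        for i, (c, o) in enumerate(zip(consensus, observed))
        if c != o
    ]


consensus_motif = "CGGAAGT"  # consensus ELF1 motif quoted in the text
observed_site = "CGGAACT"    # site observed in the TM9SF2 promoter region

diffs = motif_mismatches(consensus_motif, observed_site)
identity = 1 - len(diffs) / len(consensus_motif)
print(diffs, f"{identity:.0%} identity")
```

The two sequences differ at a single position (a G to C change at position 6), so six of the seven bases match.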

If ELF1 is required for elevated expression of TM9SF2, we predicted that ELF1 mRNA levels would correlate with TM9SF2 expression. To test this prediction, we analyzed ELF1 and TM9SF2 mRNA expression levels in 382 TCGA CRC patient samples and found a strong positive correlation (Pearson: 0.781; Spearman: 0.744) between the expression of TM9SF2 and ELF1 (Fig 2.13). For further validation we performed qRT-PCR for ELF1 in our panel of CRC cell lines. ELF1 expression remains generally unchanged in the five cell lines with lower TM9SF2 expression but is highly upregulated in the three cell lines with the highest TM9SF2 expression (HT-29, HCT-8 and DLD1) (Fig 2.4 and Fig 2.14).
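The two correlation coefficients reported above measure different things: Pearson captures linear association, while Spearman captures monotonic association by correlating ranks. A minimal standard-library sketch follows; the expression values are hypothetical and are not TCGA data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for five samples.
elf1_expr = [1, 2, 3, 4, 5]
tm9sf2_expr = [1, 4, 9, 16, 25]
r_pearson = pearson(elf1_expr, tm9sf2_expr)
r_spearman = spearman(elf1_expr, tm9sf2_expr)
print(round(r_pearson, 3), round(r_spearman, 3))
```

In this toy example the relationship is perfectly monotonic but not linear, so the Spearman coefficient is 1 while the Pearson coefficient is slightly lower.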

To verify that ELF1 binds to the TM9SF2 promoter we performed ChIP-qPCR using DLD1 cells. At steady state we observed a 6- to 15-fold enrichment in TM9SF2 promoter DNA after chromatin pull-down with an ELF1 antibody (Fig 2.15). These data are in support of ELF1 as a TM9SF2 regulating transcription factor.
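Fold enrichment in ChIP-qPCR is commonly expressed relative to a mock (IgG) pull-down, consistent with the normalization described in the Figure 2.15 legend. The sketch below assumes perfect primer efficiency (a doubling per cycle) unless stated otherwise, and the Ct values are hypothetical rather than measurements from this study.

```python
def fold_enrichment(ct_ip, ct_igg, efficiency=2.0):
    """Fold enrichment of the specific ChIP over the IgG mock ChIP.

    ct_ip      : qPCR threshold cycle for the ELF1 pull-down
    ct_igg     : threshold cycle for the IgG mock pull-down
    efficiency : amplification factor per cycle (2.0 = perfect doubling)
    """
    return efficiency ** (ct_igg - ct_ip)


# Hypothetical Ct values: the ELF1 ChIP amplifies three cycles earlier than IgG.
example_enrichment = fold_enrichment(ct_ip=27.0, ct_igg=30.0)
print(example_enrichment)
```

Each cycle of difference corresponds to a two-fold difference in starting template under the doubling assumption, so small Ct differences translate into large fold changes.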

High levels of TM9SF2 correlate with higher stage cancer and decreased disease-free survival

Based on our data supporting a role for TM9SF2 as an oncogene in CRC we tested the hypothesis that TM9SF2 expression correlates with worse patient prognosis. We analyzed data from Staub et al. (Staub, 2009, GSE12945), who measured gene expression in 62 patient samples spanning all clinical stages of CRC (I-IV). We found that TM9SF2 expression correlates with disease severity, with late stage (III & IV) cancers having a significantly higher level of expression compared to stage I (Fig 2.16).

In two additional studies with 566 patients (Marisa et al.) and 355 patients (Sieber et al.; Smith et al.) we tested the association of TM9SF2 mRNA levels with disease-free survival 115-117. In both studies, we observed a significantly favorable disease-free survival probability in patients with the lowest levels of TM9SF2 expression (Fig 2.17A/B).


Figure 2.12. ELF1 binding motif in the TM9SF2 promoter region. Position weight matrix from ENCODE ChIP-seq experiments showing a potential ELF1 transcription factor binding motif in the 5’ UTR of the TM9SF2 gene.


Figure 2.13. ELF1 expression positively correlates with TM9SF2 expression in human CRC samples. ELF1 and TM9SF2 mRNA co-expression scatter plot using TCGA RNAseq data from CRC patients.


Figure 2.14. ELF1 expression in a panel of CRC cell lines. qRT-PCR analysis for ELF1 expression in human CRC cell lines. HCEC, an immortalized but non-transformed colon epithelial cell line, was used as the control. t-test, ns = not significant, *, P < 0.05. **, P < 0.01. ***, P < 0.001. ****, P < 0.0001.


Figure 2.15. ELF1 binds to the promoter region of TM9SF2. Data from ELF1 ChIP normalized against those from IgG mock ChIP. Schematic shows binding location of two qPCR primer sets (black and gray arrows) used to quantify TM9SF2 promoter DNA after ELF1 ChIP. Translation start site is indicated with a bent arrow.


Figure 2.16. TM9SF2 expression increases with disease stage. Microarray data (Staub, 2009, GSE12945) depicting TM9SF2 mRNA expression levels in patient samples from stages I through IV.


Figure 2.17. TM9SF2 expression predicts relapse-free survival. A, B, Kaplan-Meier curves depicting relapse-free survival with data stratified by intensity of TM9SF2 mRNA measured in CRC samples via microarray (top: Marisa, 2013, GSE39582; bottom: Sieber, Smith, 2010, GSE14333 plus GSE17538 minus identical samples).

Discussion

Our studies support the hypothesis that TM9SF2 functions as an oncogene in CRC. Analysis of data from our previous SB transposon forward genetic screen identified Tm9sf2 as a top candidate cancer gene. Mining of other publicly available SB transposon databases revealed further support for Tm9sf2 as a candidate cancer gene. Transposon insertions in the murine Tm9sf2 gene occurred in over 7% of the 1,674 analyzed digestive tract tumors, and Tm9sf2 is predicted to be a progression driver gene 101. We demonstrated using both TCGA data and our own independent data set that TM9SF2 mRNA is overexpressed in up to one-third of CRC patients. Finally, using RNAi and CRISPR/Cas9 gene editing we demonstrated that reduction or complete knockout of TM9SF2 resulted in reduced tumor fitness in the in vitro and in vivo settings.

Our data also support TM9SF2 expression as a potential prognostic indicator as we found that mRNA expression levels correlate with both disease stage and relapse-free survival probability. We observed a significant increase in TM9SF2 expression in patients with stage III/IV disease versus those with stage I, suggesting that TM9SF2 expression may promote either the migration of cancerous cells or their seeding and growth in distant organs.

The molecular mechanism of TM9SF2's oncogenic function in CRC is currently unknown. Others have demonstrated that TM9SF proteins are responsible for controlling surface expression of adhesion molecules 118,119. In our RNA-seq data we observed alterations in several cell adhesion related genes including ITGA1 and three members of the CEACAM family (CEACAM5, CEACAM6, CEACAM19). The mRNA levels of CEACAM5 and CEACAM6 were reduced by 4.9-fold and 12.3-fold respectively in knockout cells. Both CEACAM5 and CEACAM6 are known to play a critical role in facilitating tumorigenesis and metastasis by inhibiting anoikis 120,121. The reduced ability of TM9SF2 knockout cells to grow in soft agar could be explained by an increased sensitivity to anoikis due to the accompanying decrease in CEACAM expression.

It was recently reported that TM9SF4 plays a role in the assembly of the V-ATPase proton pump in CRC cells 98. Silencing of TM9SF4 expression led to a more acidic cytoplasmic pH with an accompanying alkalization of intracellular vesicles. Ingenuity pathway analysis revealed that the top canonical pathway altered in TM9SF2 knockout cells was the Ceramide Signaling pathway. Ceramides are bioactive lipids that are critical for modulating multiple cellular processes including cell cycle, senescence, and stemness 122. Ceramide levels within cells are tightly controlled by pH-sensitive phospholipases called sphingomyelinases 123. Given its close similarity in amino acid sequence and structure to TM9SF4, we hypothesize that TM9SF2 may function as a membrane-bound vesicular protein involved in the regulation of intracellular acidity. Future studies are required to determine the role of TM9SF2 in intracellular ion homeostasis and regulation of ceramide signaling molecules.

An increased understanding of the genetic basis of CRC will be useful for designing new targeted therapies. Current efforts to target the known drivers of CRC, including APC, SMAD4, TP53, and KRAS, have not yet resulted in significant improvements in survival for advanced CRC. In the current study we identify TM9SF2 as an oncogene that is most likely regulated by the transcription factor ELF1. Our findings provide a rationale for exploring the efficacy of drugs targeting either TM9SF2 or ELF1 for treating advanced-stage CRC.

Materials and Methods

Cells

All cell lines (DLD1, LoVo, HCT116, HT-29, HCT-8, SW480, SW620, and

T84) except HCEC were obtained from the ATCC and were minimally passaged.

Human colonic epithelial cells (HCEC) were immortalized by expression of cyclin dependent kinase 4 (Cdk4) and the active components of human telomerase

(hTERT) and were kindly provided by Dr. Jerry Shay (UT Southwestern, Dallas

TX). HCEC cells were maintained in DMEM media with 2% calf serum, 25 ng/ml

EGF (Sigma Aldrich, St. Louis, MO), 2 µg/ml transferrin (Sigma Aldrich, St. Louis,

MO), 10 µg/ml insulin (Sigma Aldrich), 5 nM sodium selenite (Sigma Aldrich, St.

Louis, MO), 1 µg/ml hydrocortisone (Sigma Aldrich), and 50 µg/ml gentamicin

(Gemini Bio-products, West Sacramento, CA). HCEC cells were grown at 37 °C on Primaria flasks (Corning Inc., Corning NY) placed in chambers purged with

93% nitrogen, 5% carbon dioxide, and 2% oxygen. All other cell lines were maintained in DMEM with 4 mM L-glutamine, 10% FBS, and 1x penicillin/streptomycin at 37 °C and 5% carbon dioxide.

Vectors

RNAi experiments were performed using pLKO.1 vectors obtained through the University of Minnesota Genomics Center (UMGC) partnership with the Open

Access Program from Open Biosystems. The pLKO.1 vector with the oligo ID

(TRCN0000059768) was used to generate DLD1 shRNA3 cells and the vector with oligo ID (TRCN0000059772) was used to generate DLD1 shRNA7 cells.

Non-silencing control vectors were obtained in the same manner. The lentiviral packaging vector psPAX2 was a gift from Didier Trono (Addgene plasmid #

12260) and viral envelope encoding vector pCMV-VSV-G was a gift from Bob

Weinberg (Addgene plasmid #8454). The lentiCRISPR v2 vector used to generate TM9SF2 knockout cells was a gift from Feng Zhang (Addgene plasmid

# 52961).

shRNA knockdown cells

DLD1 stable TM9SF2 knockdown cells were created by transduction with lentiviral particles encoding TM9SF2 shRNAs followed by puromycin selection.

Virus production was performed using standard 293T packaging protocols and the pLKO.1 vectors.

CRISPR/Cas9 knockout cells

CRISPR lentiviral constructs targeting TM9SF2 were generated by annealing together primers for sgTM9SF2 CRISPR1, sgTM9SF2 CRISPR2, and sgTM9SF2 CRISPR6 and cloning the annealed product into lentiCRISPR v2 as described previously 124. The sgRNA primer sequences are listed in supplementary table S2.1. The sgRNAs were designed to specifically target coding regions of the TM9SF2 gene using the CRISPR design tool from Dr. Feng

Zhang’s laboratory (http://crisper.mit.edu/). After transducing cells with TM9SF2 targeting CRISPR/Cas9 lentiviral particles, single cell puromycin-resistant clones were isolated by limiting dilution. DNA was extracted from single cell clones for

PCR amplification of the genomic loci targeted by CRISPR/Cas9. After PCR, the amplicons were purified and subsequently sequenced. Sequencing data from these clones were then used for Tracking of Indels by Decomposition (TIDE) analysis and mutants were confirmed by immunoblot with an anti-TM9SF2 antibody (Sigma Aldrich, St. Louis, MO, Catalog # HPA005657) using a standard

Western Blotting protocol.

Soft Agar assay and cell proliferation by trypan blue exclusion

For soft agar assays, cells were grown in between layers of 0.5% Sea

Plaque Low Melt Agarose (Lonza Cat. # 50101) in DMEM supplemented with

10% FBS and 1x penicillin/streptomycin for 10 days. Colonies were fixed, stained with crystal violet, and imaged. Colony counts were analyzed using ImageJ software. For cell proliferation assays, equal numbers of HT-29 control and TM9SF2 knockout cells were plated in triplicate wells of 24-well culture plates and allowed to adhere for 24 hours. Cells were detached using trypsin and, after adding trypan blue, counted with an automated cell counter (Thermo Fisher Scientific, Waltham MA). The total number of live cells was determined for each well and the average number of viable cells for each triplicate was plotted for days 1 through 4.

RNA extraction and qRT-PCR

Total RNA was extracted using the RNeasy Plus mini kit (Qiagen,

Germantown, MD) according to the manufacturer’s protocol. Total RNA concentration and quality were measured using the BioTek Epoch microplate spectrophotometer (BioTek, Winooski, VT). One microgram of total RNA was reverse transcribed using the High Capacity cDNA reverse transcription kit

(Applied Biosystems, Foster City, CA). RT-qPCR was then performed in triplicate using diluted cDNA, FastStart Essential DNA Green Master mix (Roche, Basel,

Switzerland), and either ELF1 or TM9SF2 specific primers. Samples were run in the LightCycler 96 (Roche, Basel, Switzerland). Data were normalized to human beta actin and fold change was calculated using the delta-delta Ct method.

Primer sequences are listed in supplementary table S1.
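The delta-delta Ct calculation named above can be sketched as follows. This is a minimal illustration and not the analysis code used in this study: the function name and the Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both primer sets.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumes ~100% PCR efficiency)."""
    # Normalize the target gene to the reference gene (e.g. beta actin) per condition.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized sample to the normalized control.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical mean Ct values from triplicate wells (not data from this study).
fold = fold_change_ddct(ct_target_sample=26.0, ct_ref_sample=18.0,
                        ct_target_control=24.0, ct_ref_control=18.0)
print(f"Fold change relative to control: {fold:.2f}")
```

A value below 1 indicates lower expression in the sample than in the control, and a value above 1 indicates higher expression.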

Xenograft model

Athymic nude mice were obtained from The Jackson Laboratory (Stock

No. 002019) and protocols were approved by the University of Minnesota’s

Institutional Animal Care and Use committee. Five million HT-29 control and

TM9SF2 knockout cells were resuspended in 200 µL of DMEM media and subcutaneously injected into the rear flank. Each mouse carried duplicate tumors on the left and right flank. Tumor dimensions were measured with digital calipers every other day and tumor volume was calculated using the formula: $\text{volume} = (\text{width})^2 \times (\text{length})/2$, where the width is the smaller of the two measurements. Mice were sacrificed according to IACUC protocols.
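The caliper-based volume formula can be written as a short function. This is a sketch for illustration only; the function name and the example measurements are hypothetical, and the only assumption beyond the text is that the two caliper readings may be passed in either order.

```python
def tumor_volume_mm3(dim_a_mm, dim_b_mm):
    """Tumor volume = width^2 * length / 2, with width the smaller measurement."""
    width = min(dim_a_mm, dim_b_mm)
    length = max(dim_a_mm, dim_b_mm)
    return width ** 2 * length / 2


# Hypothetical caliper readings in millimetres (not data from this study).
print(tumor_volume_mm3(10.0, 6.0))
```

Taking the minimum inside the function guards against swapping width and length when measurements are recorded.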

Gene expression analysis

Oncoprint data showing the genomic alterations in TM9SF2 were generated using cBioPortal (version 1.12.1; http://www.cbioportal.org/index.do) using both RNA-seq and microarray data from Colorectal Adenocarcinoma

(TCGA, Provisional) 3,103. cBioPortal enrichment analysis revealed that TM9SF2 and ELF1 were highly positively correlated as measured by Pearson’s or

Spearman’s correlation.
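For readers unfamiliar with the two correlation measures, the sketch below computes both from paired expression values. It is not how cBioPortal implements its enrichment analysis; the function names and the expression values are hypothetical, and Spearman's coefficient is computed here as Pearson's coefficient on average ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(gene_a, gene_b), spearman(gene_a, gene_b))
```

Because the relationship in the example is monotonic but not linear, the rank-based coefficient is higher than the linear one.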

RNA sequencing

Matched frozen tumor and normal samples were acquired from 44 patients (88 samples total) from the University of Minnesota's Tissue Repository.

All samples were de-identified. For RNA-seq of patient samples, total RNA was extracted using a previously established protocol 125,126. Purified RNA was submitted to the University of Minnesota Genomics Center for library preparation and sequencing. A quality check of raw sequence data (FASTQ files) was performed using FastQC software

(https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to assess overall sequence quality, GC content, adaptor content, etc. Quality trimming was performed to remove sequence adaptors and low-quality bases using

Trimmomatic with a 3bp sliding window trimming from 3’ end requiring minimum

Q16 (phred33) 127. FastQC was run on the resulting trimmed FASTQ files to ensure good quality of sequences. The paired-end reads were mapped to the NCBI v38 H. sapiens reference genome using HISAT2, resulting in an average alignment rate of 87.11% across the 88 samples 128. SAMtools was used for sorting and indexing the aligned BAM files. After alignment, featureCounts was used to generate gene abundance, and Cuffnorm was used to generate an FPKM expression table 129. We used the Ensembl web service from the biomaRt package in R to map gene transcripts (Ensembl gene ID) to HGNC gene symbols 130.
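As a reminder of what an FPKM value represents, the sketch below applies the textbook definition (fragments per kilobase of transcript per million mapped fragments). It does not reproduce Cuffnorm's internal library-size normalization; the function name and the counts are hypothetical.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2,000 bp long, 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))
```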

RNA-seq of HT-29 knockout cells: 50bp FastQ paired-end reads (n=8.8 million per sample) were trimmed using Trimmomatic (v 0.33) enabled with the optional

“-q” option; 3bp sliding-window trimming from 3’ end requiring minimum Q30.

Quality control checks on raw sequence data for each sample were performed with FastQC. Read mapping was performed via HISAT2 (v2.0.2) using the UCSC hg38 assembly as the reference. Gene quantification was done via Cuffquant for FPKM values and featureCounts for raw read counts. Differentially expressed genes were identified using the edgeR (negative binomial) feature in CLCGWB (Qiagen, Valencia, CA) using raw read counts. We filtered the resulting list using a minimum 2-fold absolute fold change and a Bonferroni-corrected p < 0.05. These filtered genes were then imported into Ingenuity

Pathway Analysis Software (Qiagen, Valencia, CA) for pathway identification.
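The filtering step applied to the differential expression results can be sketched as below. This is an illustration under stated assumptions, not the CLCGWB implementation: the record layout, function names, and gene entries are hypothetical, fold change is assumed to be signed (negative for downregulation), and the Bonferroni correction multiplies each raw p-value by the number of genes tested.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def filter_genes(results, min_abs_fold_change=2.0, alpha=0.05):
    """Keep genes passing both the fold-change and the adjusted p-value cutoffs."""
    adjusted = bonferroni([r["p_value"] for r in results])
    return [
        r["gene"]
        for r, p_adj in zip(results, adjusted)
        if abs(r["fold_change"]) >= min_abs_fold_change and p_adj < alpha
    ]


# Hypothetical results for four genes (not data from this study).
results = [
    {"gene": "GENE_A", "fold_change": 3.0, "p_value": 0.001},
    {"gene": "GENE_B", "fold_change": -2.5, "p_value": 0.01},
    {"gene": "GENE_C", "fold_change": 1.5, "p_value": 0.0001},
    {"gene": "GENE_D", "fold_change": -4.0, "p_value": 0.02},
]
print(filter_genes(results))
```

In the example, GENE_C fails the fold-change cutoff and GENE_D fails the corrected p-value cutoff, so only the first two genes are retained.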

Chromatin Immunoprecipitation and qPCR

DLD1 cells were plated at 3 × 10^6 cells per 10 cm plate in standard fully supplemented DMEM overnight. ChIP was performed using an ELF1-specific antibody (Santa Cruz, SC-631) or a non-specific IgG rabbit isotype control using Protein G magnetic beads (Active Motif). Analysis was performed using previously described methods 131. qPCR data are presented as fold enrichment relative to the negative (IgG) sample. Primers are listed in supplemental table S1.
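One common way to express ChIP-qPCR signal as fold enrichment over IgG is shown below. The exact calculation used here follows the cited method (reference 131) and is not spelled out in the text, so this sketch is an assumption; the function name and Ct values are hypothetical, and roughly 100% primer efficiency is assumed.

```python
def fold_enrichment(ct_ip, ct_igg):
    """Fold enrichment of the specific IP over the IgG control: 2^(Ct_IgG - Ct_IP)."""
    return 2.0 ** (ct_igg - ct_ip)


# Hypothetical Ct values at one locus (not data from this study).
print(fold_enrichment(ct_ip=25.0, ct_igg=28.0))
```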

Patient Prognoses and Survival

The Staub cohort of 62 CRC patient samples used for transcriptome analysis by microarray (GSE12945) was previously described 132. Patients were grouped by disease stage and the mean expression value of TM9SF2 in each stage was plotted using box plots. Statistical significance between groups was determined using one-way ANOVA with Tukey’s post hoc test. Kaplan-Meier curves depicting relapse-free survival (RFS) probability were generated using the

R2 genomics analysis and visualization platform (http://hgserver1.amc.nl/). The

Marisa et al. cohort consisting of 566 samples was used for RFS analysis, with patients divided into two groups (high vs. low TM9SF2 mRNA expression) based on optimal expression separation. RFS groups were compared using the log-rank test. The same method was used when analyzing the 355-patient cohort provided by

Sieber and Smith (GSE1433 and GSE175) 116,117.
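The Kaplan-Meier estimate underlying the RFS curves can be sketched in a few lines. This is not the R2 platform's implementation; the function name and the follow-up data are hypothetical, and the sketch omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    events[i] is True if relapse was observed at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both relapsed and censored subjects leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up times (months) and relapse indicators.
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the curve only steps down at observed relapses.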

Acknowledgements

This work was supported by a 3M Science and Technology Fellowship award and the Norman Wells Memorial Fellowship from the Minnesota Colorectal

Cancer Research Foundation (Clark CR). Additional financial support came from the National Cancer Institute of the National Institutes of Health, No.

5R00CA1516723-03, the Mezin-Koats Colorectal Cancer Research Fund and the

University of Minnesota Masonic Cancer Center (Starr, TK). Other support came from the Minnesota Supercomputing Institute and the University of MN Genomics

Center. The authors thank Robert T. Cormier for critically reviewing and editing the manuscript.

Affiliations

a Department of Obstetrics, Gynecology and Women’s Health, University of Minnesota, Minneapolis MN
b Masonic Cancer Center, University of Minnesota, Minneapolis MN
c University of Minnesota Informatics Institute, Minneapolis MN
d Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis MN
e Department of Ecology, Evolution, and Behavior, University of Minnesota, Minneapolis MN
f Department of Biology, Loyola University, Chicago IL

Contributions

Data acquisition: C.R.C., M.M., P.B., S.R.H., A.S., W.D., S.P.; Data analysis and interpretation: C.R.C., S.P., M.B., R.B., J.E.A., T.K.S.; Wrote the manuscript:

C.R.C.; Conception, design, and supervision of study: C.R.C., T.K.S. All authors contributed to manuscript revision.

Chapter 3. WAC has potential tumor suppressor activity in human colorectal cancer

Synopsis

Our lab recently performed a DNA transposon forward genetic screen in mice that was designed to identify low-frequency mutations that contribute to colorectal cancer (CRC) initiation and progression. Results from this screen identified the WW domain-containing adaptor with coiled-coil (Wac) gene as a top DNA transposon insertion site. WAC has previously been implicated in several cellular processes including amino acid starvation-induced autophagy,

Golgi biogenesis, and transcription associated histone modification but has never before been linked to tumorigenesis. Transposon mutagenesis screens performed by others (Takeda et al. Nature Genetics 2015) have also identified

Wac as a common insertion site, a result that further implicates WAC as a candidate CRC driver gene. Analyses of transposon insertion patterns within

Wac predict loss of gene function and a role as a tumor suppressor. Using publicly available mutation data we determined that WAC is somatically mutated in human CRC and its mRNA expression is reduced in tumor versus matched normal samples. Soft agar colony formation assays reveal that shRNA mediated silencing of Wac cooperates with Apc mutations in mouse colorectal cells to promote cellular transformation. Additional colony formation assays using the human adenoma derived AA/C1 cell line also show that silencing WAC is protumorigenic. We also demonstrate that complete knockout of WAC using

CRISPR/Cas9 technology increases the tumor fitness of sporadic CRC cells.

Using a zebrafish model, we demonstrate that overexpression of wild type, but not cancer-associated mutant forms of WAC induces expression of the cell cycle inhibitor . Finally, we show that disruption of WAC is embryonic lethal in mouse models. Taken together, these data are the first to implicate WAC as a tumor suppressor gene with a possible role in cell cycle regulation.

Introduction

Colorectal cancer (CRC) is the third most common type of cancer worldwide and it is a major cause of cancer-related deaths in both men and women 133,134. Treatment for CRC patients typically involves surgical resection to remove cancerous tissue and chemotherapy. Surgical resection is potentially curative in patients with early stage disease, but this approach is not suitable for patients suffering from metastatic CRC 135. Unfortunately, as many as 25% of patients newly diagnosed already have metastatic CRC that has spread to the liver and/or lungs 136. Patients with potentially resectable or unresectable metastatic disease are generally treated with traditional cytotoxic chemotherapies. Relatively recently, molecularly targeted therapies have been included in therapeutic regimens and have provided additional treatment options for those with metastatic CRC 135. These targeted therapies include the vascular endothelial growth factor (VEGF) targeting antibody bevacizumab and the two antibodies directed against epidermal growth factor receptor (EGFR),

panitumumab and cetuximab. While these targeted therapies have improved outcomes, their effectiveness has been modest. To improve outcomes in those with advanced disease we must enhance our understanding of the molecular events driving CRC.

For decades, it has been known that CRC is established through a stepwise accumulation of mutations in oncogenes and tumor suppressors. In the vast majority of cases, the early initiating event in CRC is loss of the tumor suppressor gene, Adenomatous Polyposis Coli (APC) 11. The gene product of this “gatekeeper” gene acts as an important tumor suppressor in colonic epithelial cells through its ability to restrict WNT signaling. Activation of the KRAS oncogene, loss of SMAD4, and mutation in TP53 are additional genetic lesions classically associated with CRC tumor evolution. Though we have understood these major CRC driving events for years, we remain limited in our ability to exploit these molecules for significant clinical benefit. Thus, there has been a major effort to identify new drug targets. Recent next generation sequencing studies have revealed that CRC tumors can contain anywhere from 60 to over

700 somatic mutations 86. There is no doubt that these data are valuable, but a considerable effort is required to identify and validate promising new molecules.

Transposon-based insertional mutagenesis screens in mice have been a useful tool for deciphering which low frequency mutation events are drivers and which are passengers 100. Using this method, we have identified WW domain containing adaptor with coiled-coil (WAC) as a candidate tumor suppressor gene

in CRC 78,80. WAC is an understudied adaptor protein characterized by its WW and coiled-coil domains, which are important in mediating protein-protein interactions. Early studies implicated WAC as a possible RNA processing protein due to its colocalization with SC35, a marker protein for pre-mRNA splicing machinery 137. It has also been reported that WAC exists in a complex with the cytoplasmic deubiquitinase VCIP135 and p97 and is required for Golgi biogenesis 138. Recent genome-wide siRNA screens identified WAC as a promising regulator of starvation induced autophagy 139. Work by Joachim et al. demonstrated that WAC directly interacts with the Golgi protein GM130 and that this interaction is required for autophagy 140. Finally, it has been shown that WAC functions as an adaptor protein in transcription-coupled histone modification, specifically H2B monoubiquitination 141. To carry out this function, nuclear WAC interacts with the C-terminal domain of RNA polymerase II through its WW domain and recruits an E3 ligase complex composed of RNF20 and RNF40. The

WAC-RNF20/40 interaction is mediated through the WAC coiled-coil domain.

This report also showed that WAC is critical for proper cell cycle checkpoint activation as evidenced by recruitment of the WAC-RNF20/40 complex to the

CDKN1A (p21) locus in response to DNA damage 141. Taken together, these findings suggest that WAC is an adaptor protein with multiple functions that are pertinent to tumor biology.

In this study, we show that WAC is a common insertion site gene in a variety of murine cancer models and that it is also mutated in human tumors. Analyses

of publicly available data sets reveal that WAC expression is decreased in tumors as compared to matched normal tissues. Using a variety of in vitro model systems, we demonstrate that WAC alters the ability of colonic epithelial cells to grow in an anchorage independent fashion. We confirm, using multiple functional assays, that WAC expression influences the expression of cell cycle inhibitory genes. Through development of a conditional knockout mouse, we demonstrate that Wac may be critical for proper embryonic development. Collectively, these are the first data to demonstrate that WAC has a tumor suppressive function in human CRC.

Results

Insertional Mutagenesis screens in mice have identified Wac as a candidate tumor suppressor gene.

In a transposon screen designed to identify novel genetic drivers of CRC,

Starr et al. discovered murine Wac as a common insertion site (CIS) of the

T2/Onc transposon 78. Additional transposon insertional mutagenesis screens were performed in mice harboring predisposing genetic alterations, specifically the APCMIN (Starr et al., 2011) and P53R270H (unpublished) backgrounds. Using mice with predisposing mutations accelerates tumor formation and reveals transposon insertions that cooperate with the predisposing alteration to promote tumor formation and progression 80. The top ten transposon common insertion sites from wild type mice (Starr et al., 2009), APCMIN mice

(Starr et al., 2011), and P53R270H (unpublished) mice are shown in table 3.1. It is

important to note that the Apc gene was the top target for disruption in each of the three genetic backgrounds. APC is a critical regulator of the WNT signaling pathway and its disruption is found in roughly 70-80% of human colon and rectal cancers 142. With Apc as the most common transposon insertion site, these models recapitulate human colorectal cancer (CRC) tumors, at least in the predominant mechanism of tumor initiation. Also included in the Table 3.1 CIS lists are Rspo2,

Kcnq1, Tcf12, and Ptprk, all of which have been linked to human CRC thus providing further support of the power of these studies to model human disease

79,83,143,144. Remarkably, Wac was a top common insertion site in each of these mutagenesis screens. These findings suggest that alterations in Wac are a critical requirement in CRC development as they were observed across wild type,

Apc deficient, and dominant negative P53 backgrounds.

Analysis of the pattern of insertions can provide insight into the predicted effect after transposition occurs within or near a gene. In wild type animals, transposon insertion in Wac occurred in 12 out of 135 (8.9%) gastrointestinal (GI) tract tumors analyzed by Starr et al. (Figure 3.1). The varying orientation and scattered distribution of transposon insertions suggest that T2/Onc is disrupting normal Wac expression, which mimics loss of function mutations observed in tumor suppressor genes. The same sporadic insertion patterns are observed in tumors from Apc deficient mice (15/96 tumors; 15.6%) as well as those harvested from mice expressing dominant negative P53 (11/30 tumors; 36.7%). These data suggest that Wac is a novel tumor suppressor gene in CRC.
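The insertion frequencies follow directly from the tumor counts given for each cohort, as the short sketch below shows. The counts are taken from the text; the dictionary layout and function name are illustrative.

```python
def percent(hits, total):
    """Percentage of tumors carrying an insertion."""
    return 100.0 * hits / total


# (tumors with a Wac insertion, tumors analyzed) per cohort, from the text.
cohorts = {
    "wild type": (12, 135),
    "Apc deficient": (15, 96),
    "dominant negative P53": (11, 30),
}

for name, (hits, total) in cohorts.items():
    print(f"{name}: {hits}/{total} = {percent(hits, total):.1f}%")
```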

WT Cohort | APCMIN Cohort | P53R270H Cohort
APC | APC | APC
RSPO2 | 4930422G04Rik | FBXW7
KCNQ1 | SNX24 | RREB1
TCF12 | MYO5B | WAC
WAC | AP1AR | MLL3
FBXW7 | EMCN | UBN2
PTPRK | WAC | RSF1
CUGBP1 | NO GENE | GSK3B
NR6A1 | PIGL | SMG1
ZCCHC7 | ESCO1 | PELI1

Table 3.1. Top 10 common insertion site genes from CRC insertional mutagenesis screens in mice. Listed are the top ten common insertion site (CIS) genes of the T2/Onc transposon. Left column represents data from a screen modeling CRC in wild type mice (Starr et al., 2009). Center column represents CIS genes from a screen modeling CRC in mice that harbor a cancer predisposing Apc mutation (Starr et al., 2011). Right column represents CIS genes from a screen modeling CRC in mice that harbor a cancer predisposing p53 mutation (unpublished).


Figure 3.1. Wac is a common insertion site gene in mouse intestinal tumors. Wac was identified as a CIS gene in three separate insertional mutagenesis screens designed to identify novel CRC driver genes. Pictured (top) is a schematic of the murine Wac gene showing introns (black lines) and exons (black rectangles) as well as the transcriptional orientation of the gene (arrow pointing to right). The triangles below represent the location and orientation of transposon insertions, with orange triangles depicting insertions where the MSCV promoter within the transposon is in the same orientation as Wac and purple triangles depicting transposon insertions in the opposite orientation.

In further support, nineteen other groups have also identified Wac as a cancer associated CIS gene. A meta-analysis of these studies using the Candidate Cancer Gene Database demonstrates that transposon insertions associated with murine Wac have occurred in ten unique tissues (blood, breast, colorectal, gastric, liver, lung, nervous system, pancreas, sarcoma, and skin)

(Supplementary Table S3.2) 81,82,145-160. When defined, the most common predicted effect of transposon insertion within Wac was a loss of function, which is consistent with our data and prediction that Wac has a tumor suppressive function. In independent studies specifically designed to identify intestinal driver genes, March et al. and Takeda et al. also defined Wac as a common insertion site gene 81,82. These groups demonstrated that transposon insertions in Wac occur in Apc deficient animals, those harboring a constitutively active Kras allele,

Smad4 knockout mice, and finally, mice expressing dominant negative p53.

Combined, these cohorts carry the sensitizing mutations observed at different stages of CRC evolution from early adenoma to carcinoma. These data strongly suggest that Wac deficiency promotes tumor formation, especially in the intestinal tract.

WAC is somatically mutated in cancer and its expression is down regulated in human CRC.

Next, we analyzed publicly available data to determine if WAC mutations are observed in human cancers. To date, The Cancer Genome Atlas (TCGA) database provides comprehensive genomic analyses on thousands of cancer

samples from over 200 studies 3,103. We queried nearly 32,000 TCGA samples from 175 studies for copy number alterations and mutation data using the cBioPortal user interface. We removed any overlapping datasets and focused on published studies as they are typically subjected to more rigorous quality control standards than the available provisional datasets. This analysis revealed that

WAC is altered in one percent (314/31,731) of tumor samples (Figure 3.2A). The tumor type with the highest frequency of WAC alterations was prostate cancer with 6% of samples showing copy number amplification. Since copy number gain is typically associated with oncogenes and considering our hypothesis that WAC has tumor suppressive function, this was an unexpected finding 161.

Next, we specifically examined four published colorectal studies in the

TCGA database. In these data sets, the average alteration frequency in 1,105 human CRC samples was 1% with only 11 samples containing WAC mutations

(Figure 3.2B). These data differ dramatically from our observation that Wac was potentially altered by transposon insertion in anywhere from 8 to 36% of murine

CRC samples. Out of the 11 total TCGA somatic mutations, nine were missense mutations, one was a nonsense mutation (E172X) potentially leading to a nonfunctional truncation mutant, and one was a frameshift mutation (D21Tfs*171), resulting in an amino acid change and a new reading frame ending in a stop codon at position 171 of the WAC protein (Figure 3.3A, Table S3.3). Like the

E172X nonsense mutation, the D21Tfs*171 mutation is also likely to lead to truncation and loss of WAC function.

We also analyzed mutation data from the Catalogue of Somatic Mutations in

Cancer (COSMIC) database 162. This manually curated database contains some overlapping samples from TCGA but also contains additional data from non-TCGA studies. We found a total of 50 somatic mutations in WAC out of 2,357 large intestine tumors (Table S3.4). Forty-four percent of these mutations were silent, and the non-silent mutations included potentially deleterious missense (38%) and nonsense (8%) mutations (Figure 3.3B). Several non-silent mutations are of interest because of their location within the WAC amino acid sequence.

The L627P (COSM1347500) and K640N (COSM1347501) missense mutations, both of which are derived from nonpublished TCGA colorectal adenocarcinoma studies, are point mutations that occur within the coiled-coil domain of WAC. The coiled-coil domain is known to bind to the RNF20/40 complex and is required for histone monoubiquitination during transcript elongation (Tables S4 and S5) 141.

Each of these point mutations is predicted to be pathogenic by FATHMM-MKL, an integrative algorithm developed to predict functional effects of coding sequence variation (L627P FATHMM Score 0.99; K640N FATHMM Score 0.96)

163.

Next, we analyzed the mRNA expression of WAC in 41 tumor and matched normal samples profiled by the TCGA. The mRNA expression levels were significantly decreased in CRC tumors when compared to their matched normal colonic tissue (p=0.0006). Twenty-eight samples displayed a 10% or

greater decrease in WAC expression while only six samples had a greater than

10% increase in expression (Figure 3.4).
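The tally of samples with a 10% or greater decrease versus a greater than 10% increase can be reproduced from paired expression values along these lines. This is a sketch only: the function name and the paired values are hypothetical and do not come from the TCGA samples analyzed here.

```python
def classify_changes(pairs, threshold=0.10):
    """Count tumor/normal pairs by relative change in expression.

    pairs: iterable of (normal_value, tumor_value).
    """
    counts = {"decreased": 0, "increased": 0, "unchanged": 0}
    for normal, tumor in pairs:
        change = (tumor - normal) / normal
        if change <= -threshold:
            counts["decreased"] += 1
        elif change > threshold:
            counts["increased"] += 1
        else:
            counts["unchanged"] += 1
    return counts


# Hypothetical (normal, tumor) RPKM pairs for five patients.
pairs = [(1000.0, 800.0), (1200.0, 1000.0), (900.0, 950.0),
         (1000.0, 1300.0), (1500.0, 1490.0)]
print(classify_changes(pairs))
```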

[Figure 3.2: bar charts of WAC alteration frequency. Panel A: alteration frequency by cancer type, colored by alteration type (mutation, fusion, amplification, deep deletion, multiple alterations). Panel B: mutation frequency in four colorectal studies (Genentech, TCGA pub, DFCI 2016, MSKCC).]

Figure 3.2. WAC is mutated in human cancers. A, The frequency of WAC alterations found in published TCGA tumor samples. The coloration represents the nature of the alteration with green being somatic mutation, purple representing gene fusions, red representing copy number amplification, blue representing copy number deletion, and gray representing samples with multiple alterations. We queried 31,731 samples in 175 studies and WAC is altered in 314 (1%) of queried samples. B, The frequency of WAC mutations found in published TCGA CRC datasets. We queried 1,105 samples in 4 studies and WAC is mutated in 11 (1%) of queried samples.

Figure 3.3. WAC is somatically mutated in human CRC samples. A, Schematic of the WAC protein (647 aa) depicting the number and location of specific somatic mutations in CRC samples from the TCGA. The color of the “lollipop” depicts the type of mutation with green representing missense mutations and black representing truncating mutations. The WW domain is depicted as a green rectangle. B, The mutation distribution of WAC mutations found in large intestine tumors from the COSMIC database.

Figure 3.4. WAC mRNA expression is downregulated in CRC tumors. WAC mRNA levels (RPKM) in 41 tumor and matched normal colorectal tissue (P = 0.0006). Data are obtained from TCGA samples subjected to RNA-sequencing.

Loss of WAC promotes anchorage independent growth in murine colonic epithelial cells.

To determine if loss of WAC plays a role in the cellular transformation of colonic epithelial cells, we generated stable WAC knockdowns in both the YAMC and IMCE cell lines using lentiviral vectors. These two conditionally immortalized colon cell lines are derived from the intestinal mucosa of the “Immortomouse” and express a temperature-sensitive SV40 large T antigen from an interferon gamma (IFNγ) responsive promoter (H-2Kb-tsA58) 164-166. Under permissive conditions (incubation at 33°C in IFNγ-supplemented medium), the SV40 large T protein is in its active conformation, allowing it to bind and inhibit p53. Inhibition of p53 in this manner renders the IMCE and YAMC cell lines “immortal,” meaning they evade p53-induced cellular senescence. The major difference between YAMC and IMCE cells is that YAMC cells are wild type for the Apc gene while IMCE cells carry one mutant Apc allele (ApcMIN/+). Because these cells are immortalized but non-oncogenic, this system allowed us to examine the impact of loss of WAC in normal (YAMC) and premalignant (IMCE) colonic epithelia. We achieved a 60% reduction in WAC mRNA as measured by qRT-PCR (data not shown).

We grew, under permissive conditions, stable YAMC and IMCE WAC knockdown cells in soft agar (Figure 3.5A). This well-established assay is considered a stringent test for the malignant phenotype of anchorage independent growth, a hallmark of cancer. Knockdown of Wac expression in YAMC cells had no effect on colony formation compared to their negative control counterparts. Depletion of Wac in the Apc mutant IMCE cell line resulted in significantly increased colony formation (Figure 3.5B). These data support our hypothesis that WAC has tumor suppressive activity, but only in cells that have prior genetic lesions, particularly in Apc.

WAC knockdown cooperates with APC and P53 to promote anchorage independent growth of premalignant colonic epithelial cells.

To investigate the tumor suppressor activity of WAC in human cells, we created stable WAC knockdowns in the APC mutant AA/C1 adenoma cell line. This cell line is derived from an intestinal polyp of a patient with familial adenomatous polyposis (FAP). FAP is caused by germline inheritance of a truncating mutation in APC and is characterized by the development of numerous colonic polyps as early as the teenage years. Previous characterization of AA/C1 cells also demonstrated that these cells are TP53 wild type, anchorage-dependent, and non-tumorigenic in mice 167,168. In soft agar colony formation assays, stable knockdown of WAC alone did not increase anchorage independent growth of AA/C1 cells. Since mutation of TP53 is a common occurrence in the cancerous transformation of colonic epithelial cells, we decided to knock down TP53 in AA/C1 cells. Short-hairpin mediated reduction of TP53 resulted in a small but significant increase in anchorage independent growth. We also used shRNAs to co-knockdown both TP53 and WAC. Interestingly, the simultaneous knockdown of TP53 and WAC significantly increased colony formation compared to controls (Figure 3.6). These data suggest that in human cells the loss of WAC is detrimental but requires additional genetic insult in both APC and TP53.

Figure 3.5. WAC functions to suppress anchorage independent growth in murine colon epithelial cells. A, Wac knockdown increases anchorage independent growth in Apc mutant IMCE cells (ApcMIN) but not in Apc wild type YAMC cells. Stable Wac knockdown cells were grown in soft agar under permissive conditions. Three weeks after plating, colonies were fixed with formaldehyde and stained with crystal violet. Stained colonies were imaged with a dissection scope and the resulting photos were used for colony counting via ImageJ software. B, Representative quantification of colonies from (A).

Figure 3.6. WAC has tumor suppressive function in human colorectal adenoma cells. Quantification of crystal violet stained AA/C1 colonies growing in soft agar. Only the combination of WAC and TP53 knockdown increased the colony formation of the AA/C1 adenoma (APC mutant) cell line versus controls.

Partial reduction of WAC does not increase the growth of sporadic CRC cell lines.

The AA/C1 model demonstrated that loss of WAC alone is insufficient to initiate transformation of pre-malignant cells. Therefore, we sought to determine the impact of WAC in tumor progression by asking whether WAC loss would enhance the growth of already tumorigenic CRC cell lines. To answer this question, we used two lentiviral shRNAs (A6 and A8) to create stable WAC knockdowns in the commonly used sporadic CRC cell lines DLD-1 and HCT116.

In DLD-1 cells, we achieved only a minor reduction of WAC mRNA in shWAC-A6 cells whereas we observed an approximately 90% reduction in shWAC-A8 cells (Figure 3.7A). For HCT116 cells, we achieved a more consistent reduction of WAC with an 80 to 90% decrease in mRNA for both shWAC-A6 and shWAC-A8 (Figure 3.7B). In soft agar colony formation assays, we unexpectedly observed a small but statistically significant reduction of anchorage independent growth in both DLD-1 and HCT116 WAC knockdown clones (Figures 3.7C and 3.7D). These data suggest that WAC may actually have some oncogenic and not tumor suppressive properties in the DLD-1 and HCT116 cell lines.

To further examine the role of WAC in DLD-1 and HCT116 cell growth, we performed MTS assays to measure changes in cellular proliferation and wound healing scratch assays to measure changes in the migratory behavior of cells. For each cell line, we observed no change in either the cellular proliferation or migration rates of WAC knockdown cells (data not shown). These data suggest that in DLD-1 and HCT116 cells WAC may function to promote resistance to anoikis-mediated cell death but plays no other role in pathways related to cell growth or cell migration.

Figure 3.7. Partial WAC reduction has minor effect on sporadic CRC cell growth. A, Quantitative RT-PCR was used to measure WAC expression levels in negative control DLD-1 cells as well as in stable knockdown clones shWAC-A6 and shWAC-A8. Relative expression (% of control) of WAC is shown after normalization with housekeeping genes (i.e., ACTB or TBP). B, Quantitative RT-PCR was used to measure WAC expression levels in negative control HCT116 cells as well as in stable knockdown clones shWAC-A6 and shWAC-A8. Relative expression (% of control) of WAC is shown after normalization with housekeeping genes. C, Representative quantification of DLD-1 control and stable knockdown cells showing reduction of WAC expression results in a minor decrease in colonies growing in soft agar. Histogram shows average number of colonies counted (in 10 fields/replicate). D, Representative quantification of HCT116 control and stable knockdown cells showing reduction of WAC expression results in a minor decrease in colonies growing in soft agar. Histogram shows average number of colonies counted (in 10 fields/replicate).

Complete knockout and strong overexpression of WAC reveal its tumor suppressive properties in sporadic CRC cell lines.

We postulated that the contradictory results observed between our model systems may be attributed to only achieving a partial reduction of WAC rather than a complete loss. To generate complete WAC knockout cells, we used lentiCRISPR/Cas9 technology to generate lentiviral particles carrying Cas9 and WAC specific gRNAs in a single vector (Supplementary Figures 3.1 and 3.2). We chose the HT-29 cell line for CRISPR/Cas9 mediated deletion of WAC, as it was determined that these cells express the highest levels of WAC mRNA (Supplementary Figure 3.3). Hence, we hypothesized that WAC knockout would have a more dramatic phenotype in these cells versus other CRC cell lines. To ensure the complete ablation after CRISPR/Cas9 editing we designed several gRNAs targeting the WW and coiled-coil (CC) regions, which are critically important domains for proper WAC function (Figure 3.8A) 141,169. We also selected two gRNAs that were predicted to have high targeting efficiency but did not act in the coding regions of the WW or CC domain.

Initial characterization of cells surviving single cell cloning identified two promising knockout candidates (Supplementary Figure 3.4). Clone C1/C2-A2 was co-transduced with lentiviruses carrying the predicted high efficiency gRNAs and clone C4-B4 was transduced with gRNA C4 lentivirus designed to target the WW domain. Verification of knockout demonstrated complete loss of WAC in clone C1/C2-A2 and only a partial reduction in clone C4-B4 (Figure 3.8B). Thus, we selected only clone C1/C2-A2 for further experimentation. Compared to parental control cells, C1/C2-A2 WAC knockout clones had significantly higher colony counts in soft agar assays (Figure 3.9A and 3.9C). These findings are consistent with our hypothesis that WAC functions as a tumor suppressor gene in CRC cells. We also compared the colony area, as measured by the number of pixels in a given crystal violet stained colony. Interestingly, knockout of WAC in HT-29 cells resulted in a 15% decrease in colony area versus parental control cells. Overexpression of WAC in HCT116 cells corroborated the tumor suppressive activity that was observed in HT-29 knockout cells (Supplementary Figure 3.3B). There was about a 50% reduction in the number of colonies growing in soft agar in WAC overexpression cells compared to controls (Figure 3.10A and 3.10C). We also observed altered colony area in cells overexpressing WAC. Although the overall number of colonies was reduced in WAC overexpression cells, their size was approximately twice that of control cells (Figure 3.10B and 3.10C). Taken together, these results support the hypothesis that WAC has a tumor suppressive function in CRC.

Figure 3.8. CRISPR/Cas9 deletion of WAC in the CRC cell line HT-29. A, Schematic depicting the cut site location of CRISPR/Cas9 gRNA designed to target coding regions in WAC (1994 base pairs). Two gRNAs were designed to target the WW domain (C3-WW, C4-WW), two gRNAs were designed to target the coiled-coil domain (C5-CC, C6-CC), and two were chosen based on their prediction to cut with high efficiency (C1-HE, C2-HE). B, Western blot demonstrating the complete knockout of WAC in HT-29 clone C1/C2-A2 and partial reduction in clone C4-B4. Beta-actin was used as a loading control.

Figure 3.9. WAC is tumor suppressive in sporadic CRC cell lines. A, Representative quantification of HT-29 control and WAC knockout cells showing complete loss of WAC expression results in a significant increase in the number of colonies growing in soft agar compared to controls. Histogram shows average number of colonies counted (in 10 fields/replicate). B, Quantification of colony area (pixel^2) from control and WAC knockout cells grown in soft agar. C, Images of colonies stained with crystal violet ten days post plating. *, t-test P < 0.05.

Figure 3.10. WAC is tumor suppressive in sporadic CRC cell lines. A, Representative quantification of HCT116 EGFP control and WAC overexpression cells showing overexpression results in a significant decrease in the number of colonies growing in soft agar compared to controls. Histogram shows average number of colonies counted (in 10 fields/replicate). B, Quantification of colony area (pixel^2) from EGFP control and WAC overexpression cells grown in soft agar. C, Images of colonies stained with crystal violet ten days post plating. *, t-test P < 0.05.

WAC is a potential regulator of TP53 and CDKN1A gene expression.

It is known that WAC is recruited to p53 responsive genes after cellular stress, where it interacts with RNF20/40 and hRAD6 to monoubiquitinate H2B and promote transcript elongation 141. Therefore, we sought to understand the relationship between WAC, p53, and p53 response genes in CRC cells. First, we used our stable HCT116 knockdown clones to examine any changes that occur in CRC cells after reduction of WAC expression. In both shWAC-A6 and shWAC-A8 clones we observed around a 25% reduction in TP53 expression. A similar result was observed for the expression of CREBBP, a known p53 interacting protein 170. Lastly, knockdown of WAC, even at steady state, had a dramatic effect on the expression of the cell cycle inhibitor CDKN1A (p21). We observed a greater than 50% reduction in CDKN1A mRNA in both shWAC-A6 and shWAC-A8 (Figure 3.11A).

To further explore the relationship between WAC and CDKN1A expression we used an in vivo zebrafish model. Here, we injected in vitro transcribed WAC mRNA into zebrafish embryos and subsequently measured CDKN1A expression by qRT-PCR twenty-four hours later. Injection with wild type WAC mRNA resulted in a two-fold increase in the expression of CDKN1A (Figure 3.11B). Next, we used the same model to test whether cancer-associated WAC mutants were also capable of inducing CDKN1A expression. WAC-E172X is a nonsense mutation that introduces a premature stop codon just after the WW domain of the protein. This mutation likely results in a severely truncated protein and loss of WAC function. We also constructed the WAC-S475L mutant mRNA. While the impact of this point mutation is unknown, the serine 475 residue is evolutionarily conserved and may be an important phosphorylation site (data not shown). Neither WAC-E172X nor WAC-S475L had the ability to induce CDKN1A expression (Figure 3.11B). Taken together, these data suggest a possible role for wild type WAC, and not its cancer-associated mutant forms, in the regulation of several critical pathways.

Considering the potential importance of WAC in cellular stress responses, we wondered whether reduction in WAC would alter the sensitivity to the commonly used chemotherapeutic drug 5-Fluorouracil (5-FU). To test this idea, we treated stable HCT116 knockdown cells with increasing concentrations of 5-FU and measured cell viability 72 hours later. We observed no decrease in viability of control or WAC knockdown cells after 72-hour treatment with 10 µM 5-FU. With the highest dose (50 µM), we observed a 50% reduction in cell viability but no differences between control and WAC knockdown clones (Figure 3.12). While reduction of WAC clearly impacts p53 and cell cycle inhibitor expression, the functional consequences of this change remain to be determined.
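As an illustration of how MTS readings can be expressed as viability relative to vehicle-treated cells, the following Python sketch normalizes A490 absorbance values. The absorbance values, the optional blank subtraction, and the function name `percent_viability` are assumptions made for illustration; they are not data or procedures taken from this study.

```python
from statistics import mean


def percent_viability(a490_treated, a490_vehicle, a490_blank=0.0):
    """Return viability as a percentage of the vehicle-treated wells.

    a490_blank is an assumed medium-only background reading that is
    subtracted from both groups before taking the ratio.
    """
    treated = mean(a490_treated) - a490_blank
    vehicle = mean(a490_vehicle) - a490_blank
    if vehicle <= 0:
        raise ValueError("vehicle signal must exceed the blank")
    return 100.0 * treated / vehicle


# Hypothetical triplicate A490 readings (not data from this study).
vehicle_wells = [1.20, 1.18, 1.22]
treated_wells = [0.62, 0.60, 0.58]
blank = 0.10

viability = percent_viability(treated_wells, vehicle_wells, blank)
print(f"{viability:.1f}% of vehicle")
```

Applying the same calculation to each clone and each 5-FU concentration would give values comparable across control and knockdown cells, since every group is scaled to its own vehicle wells.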

Figure 3.11. WAC is involved in cell cycle regulation and may be involved in the TP53 response. A, Quantitative RT-PCR showing reduction of WAC in HCT116 cells. Also pictured is the expression level of TP53, CDKN1A, and CREBBP in stable WAC knockdown clones shWAC-A6 and shWAC-A8. Data were first normalized to TBP expression. B, Zebrafish embryos were injected with either in vitro transcribed wild type WAC or cancer-associated WAC mutant mRNAs. The level of CDKN1A was measured by qRT-PCR and fold-change in expression is shown after normalization with S6K. Data are representative of triplicate experiments. *, t-test P < 0.05.

Figure 3.12. Reduction of WAC does not sensitize cells to the effects of chemotherapeutics. Histogram showing the viability of control and stable HCT116 WAC knockdown cells after treatment with 5-Fluorouracil (5-FU). Cells were treated with 10, 25, or 50 µM concentration of 5-FU for 72 hours and viability was measured using the MTS tetrazolium reduction assay (absorbance A490, shown as % of vehicle-treated cells).

Loss of WAC is embryonic lethal in mice.

To directly assess the tumor suppressive function of WAC in a living system, we created a conditional knockout mouse. Here, we used a targeting vector to introduce loxP sites flanking exon 3 of murine Wac (Figure 3.13A). The intended goal of this study was to compare the number of gastrointestinal tumors and survival rate of control mice (Wac +/+) with that of heterozygous (Wac +/-) and homozygous (Wac -/-) knockout mice. To limit Wac deletion to the gastrointestinal tract we would cross our conditional Wac mutant mice to mice expressing Cre recombinase only in the intestinal tract (Villin-Cre mice) (Figure 3.13B).

To produce an adequate number of mice for each genotype and to properly power these studies, we bred mice for over two years. From these matings a startling trend emerged. The expected Mendelian inheritance ratio from a heterozygous breeding pair is 1:2:1. Thus, a litter of pups from a heterozygous (Wac +/fl) cross should comprise 25% homozygous mutants (Wac fl/fl) harboring two conditional knockout alleles, 50% heterozygotes (Wac +/fl), and 25% wild type animals (Wac +/+). Analysis of the genotypes of over 250 weaned mice from 43 Wac heterozygous crosses revealed that we were obtaining homozygous conditional Wac knockout mice at just a 3% frequency (Figure 3.14A). Since mice are typically weaned at 3 to 4 weeks after birth, we next evaluated the genotypes of neonate animals from Wac (fl/+) breeding pairs. Analysis of 25 newborn pups revealed that 16% were of the Wac fl/fl genotype, which is slightly lower than the predicted 25% rate (Figure 3.14B). These data suggest that our conditional Wac knockout allele may be embryonic lethal or may disrupt normal development such that mice surviving until birth do not survive until weaning age. A closer look at the average litter size supports this hypothesis. The average number of pups from litters (n=107 litters) that could not produce Wac (fl/fl) animals was 8.5 whereas the average number of offspring from breeding two Wac (fl/+) animals (n=173 litters) was only 6.2 (Figure 3.14C). This reduction in litter size (2.3 pups, roughly 27%) was statistically significant. Although it is speculative, we predict that disruption of both wild type Wac alleles with our conditional knockout construct is detrimental to embryonic development (Figure 3.14D).
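The deviation from the expected 25% homozygous frequency can be checked with a one-sided exact binomial test, sketched below in Python. This is an illustrative calculation and not the analysis performed in this study: the pup counts are back-calculated from the reported percentages (3% of 252 weaned pups, 16% of 25 neonates) and are therefore approximations, and the choice of test and the helper name `binomial_lower_tail` are assumptions.

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Mendelian expectation for fl/fl offspring from a +/fl x +/fl cross.
P_HOMOZYGOUS = 0.25

# Weaned pups: counts approximated from the reported 3% of 252 pups.
n_weaned = 252
observed_weaned = round(0.03 * n_weaned)
expected_weaned = P_HOMOZYGOUS * n_weaned
p_weaned = binomial_lower_tail(observed_weaned, n_weaned, P_HOMOZYGOUS)

# Neonates: counts approximated from the reported 16% of 25 pups.
n_neonates = 25
observed_neonates = round(0.16 * n_neonates)
p_neonates = binomial_lower_tail(observed_neonates, n_neonates, P_HOMOZYGOUS)

print(f"weaned: observed {observed_weaned}, expected {expected_weaned:.0f}, "
      f"P = {p_weaned:.2e}")
print(f"neonates: observed {observed_neonates} of {n_neonates}, "
      f"P = {p_neonates:.3f}")
```

Under these assumed counts, the shortfall among weaned pups is far too large to attribute to sampling, whereas the neonate sample is too small for the same test to distinguish 16% from 25%, which is consistent with the cautious wording used for the neonate data.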

Figure 3.13. Creation of conditional WAC knockout mouse and experimental design. A, Targeting vector used to create conditional Wac knockout mice. The blue line represents the short homology arm, the light blue box represents the neomycin resistance gene used to positively select for desired recombinants, and the red line represents the long Wac homology arm. LoxP sites surrounding exon 3 (tan triangle) are depicted as black triangles. Note that the NeoR cassette was removed prior to all matings. B, The breeding scheme used to generate Wac knockout mice, where Wac knockout is restricted to the gastrointestinal tract. Wacflox indicates mice harboring conditional Wac knockout (CKO) alleles and Villin-Cre are mice expressing Cre recombinase from a gut specific promoter. Image is a representation of the desired outcome of the experiment.

Figure 3.14. Wac is critical for proper embryonic development and may be embryonic lethal. A, Histogram showing the frequency of genotypes obtained after breeding pairs of heterozygous Wac mice (Wac+/fl x Wac+/fl). The data represent the genotypes of 252 weaned pups from 43 litters. B, Histogram showing the frequency of genotypes obtained after breeding pairs of heterozygous Wac mice (Wac+/fl x Wac+/fl). The data represent the genotypes of 25 day 1 neonates from 4 litters. C, Histogram showing the average litter size at weaning of offspring from wild type Wac (Wac+/+) and heterozygous Wac (Wac+/fl) crosses. Also pictured is the average litter size of offspring from crosses where both parents are heterozygous for the Wac conditional knockout allele. D, Images of mice harvested at embryonic day E15 of development. Red box identifies the only pup with the Wac (fl/fl) genotype. *, t-test P < 0.05.

Discussion

We have presented several lines of evidence implicating WAC as a tumor suppressor gene. First, transposon based insertional mutagenesis screens identified Wac as a common insertion site in gastrointestinal tract tumors 78,80-82. More broadly, Wac was also identified as a common insertion site gene in a variety of other murine cancer models, suggesting its tumor suppressive function may be important in a variety of tissue types. Our analyses of publicly available human tumor data sets revealed that WAC is somatically mutated in CRC and several other cancers, albeit at a relatively low frequency. In two-thirds of human CRC tumors, the mRNA expression of WAC is significantly decreased compared to matched normal tissue. Short-hairpin mediated depletion of Wac in murine colonic epithelial cells led to an increase in anchorage independent growth in those cell lines with Apc mutation. These data were supported by experiments using the human adenoma AA/C1 cell line, with the caveat that p53 depletion is also required to achieve increased anchorage independent growth. Though we observed mixed results in our experiments using sporadic CRC cell lines, we were able to show using more robust complete knockout and strong overexpression models that WAC has tumor suppressive function.

We have also provided initial insights into the possible mechanism of WAC mediated tumor suppression. Using stable WAC knockdown cells and an in vivo zebrafish embryo model, we showed that WAC induces the expression of the cell cycle inhibitor CDKN1A. Though it remains to be fully elucidated, the ability to induce CDKN1A expression is a likely explanation for the tumor suppressive phenotypes observed in our functional assays. In fact, our zebrafish model demonstrated that cancer-associated mutant forms of WAC were unable to induce CDKN1A expression, suggesting that loss of WAC in any given tissue could lead to uncontrolled cell growth. There is evidence that CDKN1A is downregulated during CRC tumorigenesis, an event that has been attributed to coinciding loss of the CDKN1A transcriptional regulator CDX2 171,172. CDX2 is known to be involved in intestinal development, and its expression is important for proper intestinal epithelial cell differentiation and control of cell growth via its ability to induce CDKN1A expression 173. Interestingly, the expression of CDX2 is known to be regulated by APC, which is mutated in the vast majority of human CRC samples 9,172. Our work indicates that loss of WAC cooperates with APC mutations to promote tumor fitness and that WAC influences CDKN1A expression. More work should be done to define the role of WAC in the complex cell cycle regulation networks, with specific attention paid to the APC-CDX2-CDKN1A axis.

In our knockout and overexpression models, we discovered that WAC manipulation affected not only colony counts but also the physical size of each colony. It is well known that cellular size is controlled by the mammalian target of rapamycin (mTOR)/p70 ribosomal S6 kinase (S6K) signaling pathway 174. Previous data have demonstrated that long lasting deactivation of mTOR/S6K signaling generates a population of cells with reduced size and mass 175. It is possible that WAC acts as an adaptor protein directly in the mTOR/S6K signaling pathway or that it regulates downstream mTOR/S6K target genes through its function as a histone modifying protein. Alternatively, it has been suggested that cells will continue to grow in size when cell division is blocked by cell cycle inhibitory genes 174. Thus, an alternate explanation for the change in colony size is simply that WAC induces CDKN1A expression, which inhibits the cell cycle but allows cells to grow in size. Our data would likely support this hypothesis (Figures 3.9 and 3.10), but more work is required to demonstrate that complete knockout or strong overexpression significantly alters CDKN1A expression. Lastly, WAC has been linked to autophagy, which is a complex self-degradative process that occurs in response to multiple stimuli 140. Increasing evidence supports a relationship between autophagy and cell size, which likely occurs by crosstalk of autophagy and mTOR/S6K signaling pathways 176. Thus, the cell size related phenotypes observed in our experiments could be related to WAC's function in autophagy.

Recent evidence has implicated WAC in human disabilities. WAC-related intellectual disability (ID) is characterized by developmental delay and/or intellectual disabilities. Behavioral symptoms observed in the 18 individuals known to have WAC-related ID span from attention deficit disorders to autism spectrum disorders. The physical symptoms associated with WAC-ID are decreased muscle tone in infants, neonatal feeding issues, gastroesophageal reflux, constipation, respiratory issues, abnormal vision, and unusually short fingers and toes 177. Though our conditional Wac knockout mouse did not allow us to study the role of Wac in CRC development, we have unintentionally supported the hypothesis that WAC is required for proper development. Disruption of the region surrounding exon 3 of murine Wac clearly impacted embryonic development of mice and their ability to survive past weaning age. Further investigation could shed light on important regulatory elements within this region of the gene or the importance of the amino acid residues encoded by exon 3 for proper WAC protein function.

The development of targeted therapies for advanced human CRC remains a critical need. The efforts to develop targeted therapies against known, highly penetrant, drivers of CRC (APC, SMAD4, TP53, and KRAS) have yet to provide significant improvements in patient survival. Although low frequency mutations are typically ignored in drug development, these rare mutations should be taken more seriously. In the current study we identify WAC, which is mutated in just 1% of CRC samples, as a novel tumor suppressor in human CRC. Improving our understanding of the low penetrance driver mutations in CRC could lead to identification of novel compounds useful for a subset of patients or, at the very least, improvement of preclinical models used in the drug development process.

Materials and Methods

Cells

The Immortomouse (IMCE) and young adult mouse colon (YAMC) cells were a kind gift from Dr. Robert Whitehead. IMCE cells were maintained in RPMI-1640 medium containing 5% serum, 1 mg/mL insulin, and alpha-thioglycerol. In “permissive” conditions, cells were cultured in medium containing 5 U/mL of mouse IFN gamma and placed at 33°C in a humidified incubator. These conditions allow for the expression of SV40 large T antigen 165,166.

The AA/C1 cell line was a gift from Dr. Christos Paraskeva. AA/C1 cells were cultured in DMEM containing 20% fetal bovine serum, 2 mM glutamine, 1 µg/mL hydrocortisone, 0.2 U/mL of insulin, and 1% penicillin/streptomycin under standard conditions. Unless noted, all other cell lines were maintained in DMEM with 10% fetal bovine serum and penicillin/streptomycin.

siRNA reagents and shRNA Vectors

To generate transient Wac IMCE and YAMC knockdown cells, cells were transfected with pooled siRNAs targeting mouse Wac (Dharmacon, siGENOME Mouse Wac siRNA SMART pool cat. No. 225131) or a negative control non-targeting pool (Dharmacon, siGENOME non-targeting pool #1) using Lipofectamine RNAiMAX transfection reagent (Invitrogen). Gene expression was measured 72 hours post transfection.

Stable puromycin resistant YAMC, IMCE, or AA/C1 knockdown cells were generated by standard viral transduction with lentiviral vectors containing shRNAs targeting WAC (mouse Wac shRNA V2LMM_20397, human WAC shRNA V2LHS_135342) or TP53 (human P53 shRNA V2LHS_93613). All vectors were obtained from Open Biosystems. Efficiency of knockdown was measured by qRT-PCR (Supplementary table S3.1).

Other RNAi experiments were performed using pLKO.1 vectors obtained through the University of Minnesota Genomics Center (UMGC) partnership with the Open Access Program from Open Biosystems. The pLKO.1 vector with the oligo ID (TRCN0000135667) was used to generate DLD-1 and HCT116 shRNA A6 cells and the vector with oligo ID (TRCN0000138407) was used to generate DLD-1 and HCT116 shRNA A8 cells. Non-silencing control vectors were obtained from the same source. The lentiviral packaging vector psPAX2 was a gift from Didier Trono (Addgene plasmid #12260) and the viral envelope encoding vector pCMV-VSV-G was a gift from Bob Weinberg (Addgene plasmid #8454). The lentiCRISPR v2 vector used to generate WAC knockout cells was a gift from Feng Zhang (Addgene plasmid #52961).

qRT-PCR

Total RNA was extracted using the RNeasy Plus mini kit (Qiagen, Germantown, MD) according to the manufacturer’s protocol. Total RNA concentration and quality were measured using the BioTek Epoch microplate spectrophotometer (BioTek, Winooski, VT). One microgram of total RNA was reverse transcribed using the High Capacity cDNA reverse transcription kit (Applied Biosystems, Foster City, CA). qRT-PCR was then performed in triplicate using diluted cDNA, FastStart Essential DNA Green Master mix (Roche, Basel, Switzerland), and housekeeping and/or gene-specific primers. Samples were run in the LightCycler 96 (Roche, Basel, Switzerland). Data were normalized to a housekeeping gene and fold change was calculated using the delta-delta Ct method. Primer sequences are listed in Supplementary Table S3.1.
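
The delta-delta Ct calculation described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in this study: it assumes roughly 100% amplification efficiency (a doubling of product per cycle), and the function name, argument names, and Ct values are hypothetical.

```python
from statistics import mean


def fold_change_ddct(target_ct_sample, ref_ct_sample,
                     target_ct_control, ref_ct_control):
    """Relative expression by the delta-delta Ct method.

    Each argument is a list of technical-replicate Ct values
    (for example, the triplicate wells of one sample).
    Assumes ~100% primer efficiency, so one cycle equals a 2-fold change.
    """
    # Delta Ct: target gene normalized to the housekeeping gene
    delta_ct_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    delta_ct_control = mean(target_ct_control) - mean(ref_ct_control)
    # Delta-delta Ct: treated sample relative to control sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles later in the
# knockdown sample than in the control, with an unchanged reference gene.
knockdown_fold = fold_change_ddct(
    target_ct_sample=[26.0, 26.0, 26.0],
    ref_ct_sample=[18.0, 18.0, 18.0],
    target_ct_control=[24.0, 24.0, 24.0],
    ref_ct_control=[18.0, 18.0, 18.0],
)
print(f"Fold change relative to control: {knockdown_fold:.2f}")
```

A fold change below 1 indicates reduced expression relative to the control; with the hypothetical values shown, a two-cycle delay corresponds to one quarter of the control expression level.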

CRISPR/Cas9 Knockout cells

CRISPR lentiviral constructs targeting WAC were generated by annealing the primers listed in Supplementary Figure 3.1 and then cloning the annealed product into lentiCRISPR v2 as described previously 124. The sgRNAs were designed to specifically target the WAC gene using the CRISPR design tool from Dr. Feng Zhang’s laboratory (http://crisper.mit.edu/). After transducing cells with WAC-targeting CRISPR/Cas9 lentiviral particles, single-cell puromycin-resistant clones were isolated by limiting dilution. DNA was extracted from single-cell clones for PCR amplification of the genomic loci targeted by CRISPR/Cas9. After PCR, the amplicons were purified and subsequently sequenced. Sequencing data from these clones were then used for Tracking of Indels by Decomposition (TIDE) analysis and mutants were confirmed by immunoblot with an anti-WAC antibody (Sigma Aldrich, St. Louis, MO, Catalog # HPA042609) using a standard Western blotting protocol. To ensure equal loading, blots were probed with anti-Beta Actin (8H10D10) antibodies (cat. No. 3700).

Soft Agar assays

For soft agar assays, cells (10,000 per well) were grown in between layers of 0.5% Sea Plaque Low Melt Agarose (Lonza Cat. # 50101) in DMEM supplemented with 10% FBS and 1x penicillin/streptomycin for 10 days. Colonies were fixed with formaldehyde, stained with crystal violet, and imaged using a dissection scope. The photos of the resulting colonies were then used to count the number of colonies per field via the ImageJ software. For IMCE/YAMC cells, the assays were conducted at permissive temperature (33°C) for three weeks prior to staining.

Animals (Zebrafish and Mice)

In vitro transcribed RNA encoding human wildtype or mutant WAC was micro-injected into one-cell zebrafish embryos obtained from natural matings of wildtype fish. Embryos were raised under standard conditions (28.5°C in embryo water) until 24 hours post fertilization. Batches of 10 embryos per condition were dechorionated and RNA was extracted for analysis by quantitative RT-PCR.

Fold-change in p21 mRNA expression was determined by normalization to S6K, relative to uninjected control embryos. Data represent the average of >3 biological replicates +/- S.E.M.

Conditional Wac knockout mice were generated by inGenious Targeting Laboratory. A 7.26 Kb region used to construct the targeting vector was first subcloned from a positively identified C57BL/6 BAC clone (RP23:16F20). The region was designed such that the short homology arm (SA) extends about 1.68 Kb 5’ to the Neo cassette. The long homology arm (LA) ends 3’ to the single LoxP site and is 4.91 Kb long. The LoxP/FRT flanked Neo cassette is inserted 304 bp upstream of exon 3. The single LoxP site, containing engineered HindIII and MfeI sites for Southern blot analysis, is inserted 161 bp downstream of exon 3. The target region is 660 bp and includes exon 3. The targeting vector was confirmed by restriction analysis after each modification step. P6 and T73 primers anneal to the backbone vector sequence and read into the 5’ and 3’ ends of the BAC sub-clone. N1 and iNeoN2 primers anneal to the 5’ and 3’ ends of the LoxP/FRT flanked Neo cassette and sequence the 5’ side of the target region and 3’ side of the SA, respectively.

Somatic mutation and gene expression analysis

Mutation data for human CRC samples were obtained from the Catalog of Somatic Mutations in Cancer database (COSMIC) (https://cancer.sanger.ac.uk/cosmic). To compare normal versus tumor mRNA levels, we downloaded colorectal adenocarcinoma (COAD) data from The Cancer Genome Atlas (https://cancergenome.nih.gov) and analyzed RNA-seq RPKM values only from those samples with matched tissue. In the case of any duplicates, the RPKM values were averaged prior to comparison. Data were accessed in October 2015. Statistical significance was assessed using Prism software. To generate “lollipop” diagrams illustrating the location and frequency of somatic mutations within WAC, we used the cBioPortal interface.
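
The handling of duplicate and unmatched samples described above can be sketched as follows. This is an illustrative outline only, not the procedure actually run for this study: the record layout, patient identifiers, and RPKM values are hypothetical, and the statistical comparison itself (performed in Prism) is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def matched_pairs(records):
    """Average duplicate RPKM values and keep only matched patients.

    records: iterable of (patient_id, tissue, rpkm) tuples, where tissue
    is "normal" or "tumor". Returns {patient_id: (normal_rpkm, tumor_rpkm)}
    for patients that have both tissue types.
    """
    grouped = defaultdict(list)
    for patient, tissue, rpkm in records:
        grouped[(patient, tissue)].append(rpkm)

    pairs = {}
    for patient in sorted({p for p, _ in grouped}):
        has_normal = (patient, "normal") in grouped
        has_tumor = (patient, "tumor") in grouped
        if has_normal and has_tumor:
            # Duplicate measurements are averaged before comparison
            pairs[patient] = (
                mean(grouped[(patient, "normal")]),
                mean(grouped[(patient, "tumor")]),
            )
    return pairs


# Hypothetical records: P1 has a duplicated tumor sample, P2 has no normal.
example_records = [
    ("P1", "normal", 10.0),
    ("P1", "tumor", 4.0),
    ("P1", "tumor", 6.0),
    ("P2", "tumor", 7.0),
]
example_pairs = matched_pairs(example_records)
print(example_pairs)
```

Restricting the comparison to matched pairs allows each tumor to be compared with normal tissue from the same patient, which is why unmatched samples such as the hypothetical P2 are dropped.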

Author Contributions

Christopher R. Clark – Designed and performed experiments and analyzed experimental data. Also responsible for writing this manuscript.

Caitlin B. Conboy – Responsible for IMCE/YAMC data and generating CIS diagram. Also created cancer-associated mutant forms of WAC.

Makayla Maile – Animal husbandry and genotyping conditional Wac knockout mice.

Julia Hatler – Zebrafish embryo model and qRT-PCR to measure p21 expression after WAC mRNA injections.

Kaila Thatcher – Aided in the design and cloning of WAC CRISPR constructs.

Conor Nath – Aided in isolation and characterization of CRISPR knockout clones.

Anna Strauss – Aided in isolation and characterization of CRISPR knockout clones.

Jesenia Perez – assisted with tissue culture experiments with WAC knockout and overexpression clones.

David Largaespada – Contributed to the oversight of these experiments.

Timothy Starr – Conducted the insertional mutagenesis screen and contributed to the conception of this study.

Chapter 4. Discussion, future directions, and concluding remarks

Discussion

The overarching goal of this dissertation is to characterize TM9SF2 and WAC as novel CRC driver genes. The basis for these studies is derived from a murine CRC mutagenesis screen performed by Starr et al. (2009, 2011). Though these studies were able to identify numerous candidate cancer genes, more work was necessary to functionally validate individual genes and their impact on tumor biology. Using publicly available mouse and human tumor databases, we identified a possible oncogenic role for TM9SF2 in CRC. Using genetic deletion and overexpression systems, we demonstrated that manipulation of TM9SF2 directly impacts the tumor fitness of CRC cells. We provided preliminary insights into the possible regulatory mechanism of TM9SF2 expression and used genome-wide sequencing technologies to shed light on the possible mechanism by which TM9SF2 promotes tumor growth. Moreover, we have shown that expression levels of TM9SF2 can be used as a possible prognostic indicator. Finally, using similar approaches, we have demonstrated a tumor suppressive function for WAC in CRC cells.

In chapter 2, we aimed to define TM9SF2 as a novel CRC oncogene. TM9SF2 was one of 77 candidate CRC cancer genes identified in an insertional mutagenesis screen performed by our lab. Analysis using the Candidate Cancer Gene Database (CCGD: http://ccgd-starrlab.oit.umn.edu/about.php) revealed that TM9SF2 was also identified as a common insertion gene in an additional eight studies. Further investigation into these studies using the Sleeping Beauty Cancer Driver database revealed that TM9SF2 was a common insertion site in almost 8% of the 1,674 digestive tract tumors profiled in the database. These findings strongly suggest a contributing role for TM9SF2 in CRC tumorigenesis. Analysis using the COSMIC database revealed that TM9SF2 is rarely somatically mutated, suggesting that the oncogenic mechanism of TM9SF2 is unlikely to be related to a structural variation in the protein encoded by this gene. Our gene expression analysis confirmed these suspicions. Analysis of RNA sequencing data from tumors profiled by TCGA revealed that TM9SF2 is overexpressed in one-third of samples. These findings suggest that the oncogenic role of TM9SF2 likely occurs through gene amplification events or loss of negative regulators. The findings from TCGA samples were validated by performing RNA-sequencing on a set of 44 CRC tumor and matched normal samples collected at the University of Minnesota. As a further measure, we demonstrated using qRT-PCR that TM9SF2 is overexpressed in CRC cell lines versus a non-transformed but immortalized control cell line. To test our hypothesis that TM9SF2 functions as an oncogene, we used both RNAi and CRISPR/Cas9 technology to reduce or completely knock out TM9SF2, followed by a variety of growth assays. These approaches demonstrated that the reduction or complete ablation of TM9SF2 resulted in decreased cell growth. These findings support our hypothesis that TM9SF2 is a novel proto-oncogene whose overexpression is required for CRC cell growth.

To provide insights into the possible molecular mechanism of TM9SF2-driven cell growth, we performed next generation RNA sequencing on our CRISPR/Cas9 knockout clones. We found that the loss of TM9SF2 resulted in the differential expression of over 800 genes. Gene set enrichment and ingenuity pathway analyses using this list of differentially expressed genes revealed that TM9SF2 may function in a variety of pathways including cell cycle regulation, reactive oxygen species pathways, and ceramide signaling. More work is required to demonstrate the precise role of TM9SF2 in these pathways.

In many of the human CRC samples where this gene is altered, TM9SF2 expression is increased even though there is no observed copy number amplification. This finding suggests that TM9SF2 mRNA overexpression can be explained by changes in transcriptional regulation. To explore this hypothesis, we analyzed ChIP-seq data from the ENCODE project for possible transcription factors binding in the TM9SF2 promoter. This analysis pointed to ELF1 as a candidate transcription factor responsible for controlling TM9SF2 expression. Using ChIP-qPCR, we have shown that ELF1 is enriched at the TM9SF2 promoter, further implicating this protein in transcriptional regulation of TM9SF2. Moreover, we demonstrated that the expression levels of TM9SF2 and ELF1 are positively correlated in 382 human CRC samples profiled by the TCGA. These findings provide solid rationale for further work to demonstrate that ELF1 directly influences TM9SF2 expression.

Lastly, we have demonstrated the prognostic potential of TM9SF2 mRNA expression levels. Through analyses of gene expression data from a microarray experiment, we found that increasing TM9SF2 expression levels significantly correlate with higher disease stage. Additionally, we demonstrated using two different data sets that CRC patients with a lower TM9SF2 expression level had a more favorable disease-free survival rate. Collectively, these are the pioneering studies showing an oncogenic role for TM9SF2 in human CRC.

In chapter 3, studies were performed to test whether WAC is a tumor suppressor gene in human CRC. Like TM9SF2, this gene was originally identified as a candidate cancer gene via transposon insertional mutagenesis screens. WAC was a top common insertion site gene in wild type, ApcMIN and p53R270H screens. A closer analysis of the location and orientation of T2/Onc insertion sites within Wac revealed that this gene is a candidate tumor suppressor. The T2/Onc insertions were randomly dispersed across the Wac gene in an unbiased orientation. This pattern suggests that the splice acceptor and bidirectional polyadenylation signals within the T2/Onc transposon are disrupting normal transcription, which mimics loss of function mutations. The Wac gene has also been identified in mutagenesis screens performed by others, strengthening our hypothesis that this gene has cancer relevance. We examined both the COSMIC and TCGA databases to find mutations in the human WAC gene. This analysis revealed that WAC is somatically mutated in a small subset (1%) of human tumors as well as CRC samples. Several of the identified mutations were nonsense mutations and are likely to be deleterious truncating mutations causing loss of normal WAC function. Gene expression analysis using 41 CRC tumor and matched normal samples from the TCGA demonstrated that, in the majority of tumors, WAC expression was significantly decreased. This finding is consistent with our hypothesis that WAC loss is a contributing event in CRC tumorigenesis.

To determine if WAC plays a role in the transformation of colonic epithelial cells, we used the conditionally immortalized YAMC and IMCE models. RNA silencing experiments demonstrated that reduction of Wac increased anchorage independent growth in Apc mutant IMCE cells but not in YAMC cells, which are Apc wild type. This finding suggests that Wac is required to restrict transformation of normal colonic epithelial cells and that loss of Wac cooperates with Apc mutations to promote tumor growth. Next, we used the human adenoma cell line AA/C1, which is derived from a person harboring a truncating mutation in APC. Here, the loss of WAC promoted cell growth but only after coinciding reduction of p53. These data suggest that in human cells the loss of WAC function will be detrimental only in the context of prior APC and p53 mutations.

With the rationale that multiple mutations are requisite to observe a phenotype after WAC manipulation, we focused our attention on sporadic CRC models. Interestingly, shRNA-mediated silencing of WAC in DLD-1 and HCT116 cells showed a minor reduction in cell growth, a finding at odds with our tumor suppressive hypothesis. Because knockdown efficiency was poor, we chose to use CRISPR/Cas9 to knock out WAC in HT-29 cells. Using this model system, we demonstrated that complete knockout of WAC is required to reveal its tumor suppressive function. We complemented these experiments using a strong overexpression model. In HCT116 cells, we demonstrated that strong overexpression of WAC significantly reduces the ability of cells to grow in an anchorage dependent manner.

To gain mechanistic insights into the ability of WAC to restrict cell growth, we examined the effect of WAC expression on cell cycle related genes. Prior studies demonstrated that WAC was recruited to the CDKN1A (p21) locus in response to DNA damage. Using qRT-PCR, we found that stable reduction of WAC in HCT116 cells resulted in decreased expression of TP53, CREBBP, and CDKN1A. Next, we used an in vivo zebrafish system to further examine the relationship between WAC and p21 expression. Here, injection of embryos with wild type WAC mRNA, but not cancer-associated mutant forms of WAC, caused a two-fold increase in p21 expression. While these findings are preliminary, these studies have begun to demonstrate that WAC could be important in cell cycle regulation as well as DNA damage responses.

Finally, we attempted to use a conditional knockout mouse to demonstrate that loss of Wac in vivo is protumorigenic. While we were not able to complete the proposed study, two years of breeding demonstrated that manipulation of the genomic locus surrounding exon 3 of Wac has drastic consequences. From heterozygous crosses (Wac fl/+ x Wac fl/+) we rarely obtained homozygous conditional knockout genotypes (Wac fl/fl). Some mice harboring two floxed Wac alleles survived until birth, but less than 5% of these mice lasted past weaning age. Moreover, an analysis of litter size demonstrated that heterozygous crosses (Wac fl/+ x Wac fl/+) produced a significantly reduced number of offspring compared to wild type by heterozygous crosses. These findings suggest that Wac plays an important role in proper embryonic or neonatal development. Altogether, the results of Chapters 2 and 3 demonstrate a critical role for TM9SF2 and WAC in CRC biology.
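
One way to quantify the under-representation of Wac fl/fl offspring is a chi-square goodness-of-fit test against the 1:2:1 genotype ratio expected from a heterozygous cross. The sketch below is illustrative only and was not part of the analysis reported here: the genotype counts are hypothetical placeholders rather than data from this study, and the critical value of 5.991 corresponds to two degrees of freedom at a significance level of 0.05.

```python
def mendelian_chi_square(n_wt, n_het, n_hom):
    """Chi-square statistic for observed genotype counts vs. a 1:2:1 ratio."""
    observed = [n_wt, n_het, n_hom]
    total = sum(observed)
    if total <= 0:
        raise ValueError("at least one genotyped animal is required")
    # Expected counts from a heterozygous x heterozygous cross
    expected = [total * 0.25, total * 0.50, total * 0.25]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value for df = 2 at alpha = 0.05
CHI2_CRITICAL_DF2_ALPHA_05 = 5.991

# Hypothetical counts (+/+, fl/+, fl/fl), NOT data from this study
hypothetical_counts = (30, 60, 10)
statistic = mendelian_chi_square(*hypothetical_counts)
deviates = statistic > CHI2_CRITICAL_DF2_ALPHA_05
print(f"chi-square = {statistic:.2f}; deviates from 1:2:1: {deviates}")
```

A statistic above the critical value would indicate that the observed genotype distribution is unlikely under Mendelian inheritance, consistent with loss of homozygous animals before genotyping.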

Future directions

Although the work in Chapter 2 has provided an important foundation for continuing studies on the role of TM9SF2 in cell and tumor biology, many questions remain unanswered. The TM9SF family is highly conserved throughout evolution, as its members are also found in several lower organisms including the D. discoideum (amoeba), S. cerevisiae (yeast), D. rerio (zebrafish), and D. melanogaster (fruit fly) model organisms 91. Although the precise molecular function of TM9SF proteins remains unclear, previous studies in the aforementioned model organisms have suggested these proteins localize to endosomes and participate in cellular adhesion 91,178-180. Other work suggests that TM9SF proteins function as regulators of intracellular pH 97,181. Like other TM9SF family proteins, TM9SF2 contains nine putative transmembrane domains and is predicted to localize to endosomal vesicles where it may function as an ion channel or as a scaffolding protein for other endocytosis related proteins 98,182.

Original studies of TM9SF2 suggested that this gene’s protein product is localized within intracellular vesicles, but its precise intracellular location remains unclear 183. Based on these data and predictive models, it would be satisfying to test the hypothesis that TM9SF2 will localize to early endosomes. To test this hypothesis, one could use lentiviral vectors to stably integrate an mCherry-TM9SF2 fusion construct into the immortalized but non-oncogenic Human Colonic Epithelial Cell (HCEC) line, which would then be used in colocalization studies 104. The intracellular localization of mCherry-TM9SF2 can be inferred by assessing its degree of colocalization with known markers of various endocytic vesicles (i.e. EEA1, Rab7, Rab11, and Lamp1). The Pearson’s correlation coefficient can be used to estimate the degree of overlap between signals, and the significance of these outcomes can be assessed by t-test. For a more comprehensive evaluation of TM9SF2 localization, one should repeat these localization experiments in additional colorectal cell lines. Based on previously published TM9SF4 data, we anticipate that TM9SF2 will localize in early endosomes, as determined by colocalization with the early endosome marker EEA1.
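
The Pearson’s correlation coefficient mentioned above can be computed directly from the paired pixel intensities of two fluorescence channels. The following is a minimal sketch under stated assumptions: each channel is supplied as a flat list of intensities from the same region of interest, background subtraction and thresholding have already been applied, and the function name and pixel values are hypothetical.

```python
from math import sqrt


def pearson_colocalization(channel_a, channel_b):
    """Pearson's correlation coefficient between two intensity lists.

    channel_a and channel_b hold intensities for the same pixels, in the
    same order (for example, mCherry-TM9SF2 and an EEA1 immunostain).
    Returns a value between -1 (anti-correlated) and +1 (fully correlated).
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must contain the same number of pixels")
    n = len(channel_a)
    if n < 2:
        raise ValueError("at least two pixels are required")

    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return covariance / sqrt(var_a * var_b)


# Hypothetical pixel intensities for one region of interest
red_channel = [1.0, 2.0, 3.0, 4.0]
green_channel = [2.0, 4.0, 6.0, 8.0]
r_value = pearson_colocalization(red_channel, green_channel)
print(f"Pearson r = {r_value:.2f}")
```

Coefficients calculated per cell for each marker could then be compared between markers, for example by the t-test proposed above.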

Prior studies have demonstrated that silencing TM9SF4 expression in malignant melanoma cells results in altered intravesicular pH, leading to the idea that TM9SF genes have a role in regulating endosomal pH 97. Based on the high level of similarity to TM9SF4, we hypothesize that TM9SF2 will also contribute to the acidification of intracellular vesicles. To test this hypothesis, we could use our TM9SF2 KO cells and live-cell confocal microscopy to measure the effects that loss of TM9SF2 expression has on endosomal pH over time. The pH measurement of endocytic compartments by confocal microscopy is based on the ratio of fluorescence between pH-sensitive and pH-insensitive dextran dye conjugates 184. Dextrans are inert molecules that enter cells via fluid-phase endocytosis and then traverse the increasingly acidic compartments of the endocytic pathway. Throughout this course, the fluorescence of the pH-sensitive dextran dye will decrease in intensity as it encounters increasingly acidic environments, while the pH-insensitive dye will maintain a strong level of fluorescence along this same endocytic pathway. Consequently, the ratio of pH-insensitive to pH-sensitive fluorescence intensity will gradually increase over time as the dextrans transition from early endosomes to the lysosomes. Experimental data would be plotted against a calibration curve to interpolate the intravesicular pH levels at various time points. For a comprehensive understanding, we would also evaluate changes in the cytoplasmic pH level in HCEC control versus TM9SF2 KO cells using a similar fluorescence-based approach. Using these strategies, we anticipate that TM9SF2 KOs will have abnormal endosomal pH versus controls.
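
Interpolating intravesicular pH from a calibration curve, as proposed above, can be sketched with simple piecewise-linear interpolation. This is a hedged illustration, not an established protocol: the calibration points are hypothetical values of the kind obtained by clamping cells at known pH, the fluorescence ratio must be defined the same way for calibration and experimental measurements, and a real calibration curve may call for a nonlinear fit.

```python
def interpolate_ph(ratio, calibration):
    """Estimate pH from a fluorescence ratio by linear interpolation.

    calibration: list of (ratio, pH) pairs measured at known pH values.
    Raises ValueError when the ratio falls outside the calibrated range,
    because extrapolation beyond the curve is not reliable.
    """
    points = sorted(calibration)
    if len(points) < 2:
        raise ValueError("at least two calibration points are required")
    if not points[0][0] <= ratio <= points[-1][0]:
        raise ValueError("ratio is outside the calibrated range")

    for (r0, ph0), (r1, ph1) in zip(points, points[1:]):
        if r0 <= ratio <= r1:
            if r1 == r0:
                return ph0
            fraction = (ratio - r0) / (r1 - r0)
            return ph0 + fraction * (ph1 - ph0)
    raise ValueError("ratio could not be placed on the calibration curve")


# Hypothetical calibration: the ratio rises as the compartment acidifies
hypothetical_calibration = [(1.0, 6.5), (2.0, 5.5), (3.0, 4.5)]
estimated_ph = interpolate_ph(1.5, hypothetical_calibration)
print(f"Estimated intravesicular pH: {estimated_ph:.2f}")
```

Applying the same function to ratios measured at successive time points would give a pH time course for control and TM9SF2 KO cells.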

Integrins are transmembrane glycoproteins that are responsible for anchoring cells to the surrounding extracellular matrix and stabilizing cytoskeletal elements during cell migration. These molecules are continuously internalized from the cell membrane by endocytosis, after which they are either degraded or recycled to the cell surface, a process known as integrin trafficking. Deregulated integrin trafficking and altered cell surface expression are commonly associated with decreased survival in cancer patients 185,186. Interestingly, TM9SF genes have been implicated in cellular adhesion in several model organisms 118,178,179. It will be important to explore the consequences of TM9SF2 loss on integrin trafficking. To test such a hypothesis, one could use control and TM9SF2 KO cells in both integrin internalization and recycling assays. In brief, these assays require labeling all membrane receptors with biotin molecules and briefly incubating cells at 4°C (no trafficking) and 37°C (internalization period). Receptors that have been endocytosed can be retrieved using streptavidin beads, and specific biotin-labeled proteins can then be detected by western blot. The assay can be stopped at various time points to obtain a robust representation of integrin internalization. The percentage of internalized integrin will be calculated by comparing the signal intensity of the internalized integrin relative to the total surface integrin level at each time point.
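
The percentage calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: band intensities are densitometry values in arbitrary units, a single background value is subtracted from both bands, and the function name, time points, and intensities are hypothetical.

```python
def percent_internalized(internalized_signal, total_surface_signal,
                         background=0.0):
    """Internalized integrin as a percentage of total surface integrin.

    Both signals are western blot band intensities (arbitrary units)
    from the same time point; background is subtracted from each.
    """
    total = total_surface_signal - background
    if total <= 0:
        raise ValueError("total surface signal must exceed background")
    internalized = max(internalized_signal - background, 0.0)
    return 100.0 * internalized / total


# Hypothetical densitometry values keyed by minutes at 37°C:
# (internalized band, total surface band)
hypothetical_blot = {
    0: (0.0, 100.0),
    5: (20.0, 100.0),
    15: (45.0, 100.0),
}
time_course = {
    minutes: percent_internalized(internal, total)
    for minutes, (internal, total) in hypothetical_blot.items()
}
for minutes, percent in time_course.items():
    print(f"{minutes:>2} min: {percent:.1f}% internalized")
```

Comparing such time courses between control and TM9SF2 KO cells would indicate whether loss of TM9SF2 alters the rate or extent of integrin internalization.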

There is a growing body of evidence supporting the notion that alterations in endocytosis related genes and ion channel expression contribute to chemoresistance and decreased cancer patient survival 187-189. Studies have demonstrated that specific inhibition of endocytosis regulating genes and proton transporters in tumors results in reduced tumor invasiveness 190,191. Furthermore, several studies have shown that deregulated endocytosis and altered tumor acidity are critically important drivers of both tumor progression and resistance to chemotherapeutics 192-194. Combined with our data demonstrating oncogenic activity, successful completion of the proposed work would provide a more robust understanding of the relationship between TM9SF2 and CRC.

In chapter 3, our work provides the initial findings that WAC has a tumor suppressive function in CRC. Several additional studies should be performed to provide a more complete understanding of the function of this gene in CRC tumorigenesis. Although we demonstrated that WAC is mutated and/or downregulated in CRC samples, we have not yet demonstrated that these findings have any clinical meaning. Further studies should be done to determine if mutation or expression levels of WAC are related to tumor subtype, disease stage, progression free survival, or overall survival. It would be valuable to obtain a tissue microarray (TMA) containing a large number of CRC samples and perform immunohistochemical staining for WAC. While this method is not perfect, we could use this tool to accomplish multiple goals. First, we could correlate the level of WAC expression with clinicopathological data of the samples contained in the TMA. Second, antibody-based protein staining of these samples could provide clues as to the intracellular localization of WAC in tumor samples. This type of analysis using TMAs has been successfully used in prior studies 195. As previously mentioned, WAC is known to act in the cytoplasm, where it functions in autophagy and Golgi biosynthesis, as well as in the nucleus, where it is required for histone modifications related to transcript elongation. In the proposed TMA analysis, it would be interesting to determine if there is a relationship between the general intracellular localization (i.e. cytoplasmic vs. nuclear) of WAC and tumor grade or stage.

Our work in Chapter 3 showed that WAC deletion promotes anchorage independent growth, but we did not find any significant difference between control cells and knockouts in cell proliferation or migration assays. These data suggest that perhaps the tumor suppressive function of WAC is best exemplified when cells are grown in more complex, multidimensional environments. To explore this issue, we could use our verified gRNAs to target deletion of WAC in a 3-dimensional colon organoid system. Colon organoid cultures are increasingly popular in the field and CRISPR/Cas9 technology has been used to model CRC evolution with this system 196,197. We hypothesize that the loss of WAC would lead to altered organoid development compared to controls, a result that can be inferred through morphological, histological, and immunofluorescent (IF) analyses. For example, differences in crypt-like structure formation could be observed by IF staining for Ki67 and Lgr5, which mark proliferative cells (Ki67) and cells at the base of the colon crypt (Lgr5). Immunofluorescent staining for EPHB2+ transit-amplifying cells, VIL1+ epithelial cells, and MUC2+ goblet cells would provide for a robust comparison between control and WAC knockout clones 198. Harvesting established organoids for RNA-sequencing to examine the gene expression difference between WAC knockout and control organoids would also provide a wealth of information regarding the role of WAC in intestinal cell biology.

The creation of an additional conditional Wac knockout mouse model is also in progress. We anticipate this experiment will be successful as this exact mouse was used in previous studies identifying Wac as a critical tumor suppressor gene in breast, skin, and prostate cancers 199. This mouse contains loxP sites flanking exon 5, whereas our previous model used in Chapter 3 targeted exon 3 for deletion. Using this mouse, de la Rosa et al. demonstrated that Wac is a new obligate haploinsufficient gene in prostate cancer. In their system, mice with prostate-specific homozygous deletion of Pten (Pten∆/∆) combined with prostate-specific heterozygous deletion of Wac (Wac+/∆) developed the largest prostate tumors. Interestingly, mice with prostate-specific homozygous deletion of Pten (Pten∆/∆) combined with prostate-specific homozygous deletion of Wac (Wac∆/∆) were protected from tumor progression 199. Using these data, we will strategically design a study to examine the effects of both heterozygous and homozygous deletion of Wac in the mouse gastrointestinal tract. Our previous data using CRISPR/Cas9 to completely delete WAC in CRC cell lines would suggest that homozygous deletion animals would get more tumors and have shorter survival, but work by de la Rosa et al. demonstrates that we should be diligent and unbiased in our analyses of future mouse experiments.

An additional observation by de la Rosa et al. was that the prostate tumor-promoting effect of partial Wac inactivation was fleeting. In their experiments, larger prostate tumors were observed in 4-month-old mice with combined homozygous Pten (Pten∆/∆) and heterozygous Wac (Wac+/∆) deletion, but this phenotype largely disappeared after an additional 5 months of ageing. These data suggest that the influence of WAC mutations may be important in driving early tumor progression but becomes passive in later, more invasive stages of disease. This finding should also be used as a guideline in our in vivo model. To observe a tumor driving force caused by Wac deletion, it may be necessary to sacrifice mice at numerous timepoints rather than waiting until mice become moribund.

The utility of the transposon-based insertional mutagenesis screen in a next generation era

The prognosis for CRC patients varies widely with five-year survival rates ranging from approximately 10 to 90% depending on the stage of disease at time of diagnosis 200. The variation in patient survival rates also reflects the genetic heterogeneity of colorectal tumors. Several studies have demonstrated that even patients with similar disease severity (i.e. clinical stage) show significant variation in survival outcomes 201-204. This variation in survival illustrates that, regardless of our extensive knowledge of highly penetrant mutations, there remains a continual need to identify and validate novel genetic alterations contributing to colorectal tumorigenesis.

Transposon-based insertional mutagenesis screens have been an invaluable tool used to help fill the gap in knowledge regarding the identification of novel tumor driver genes in CRC as well as multiple other tumor types. For example, de la Rosa et al. recently developed a whole-body insertional mutagenesis model used to identify novel tumor suppressor genes whose loss cooperates with Pten mutation to promote breast, skin, and prostate tumor formation 199. Their work identified five promising candidate tumor suppressor genes that included ZBTB20, CELF2, PARD3, AKAP13, and interestingly, WAC.

The breast cancer field has taken strong interest in the molecular classification of tumor subtypes 205. Recent work by Chen et al. used the transposon insertional mutagenesis system in mice to generate breast tumors and identify new breast cancer driver genes. Rather than focus on single genes for further study, this group integrated their common insertion site data with human breast cancer gene expression data to develop a more robust clinicopathological prognostic tool 147.

Transposon insertional mutagenesis screens have also been used to identify drug resistance genes. Perna et al. developed a mutagenesis screen in mice expressing mutant Braf only in melanocytes. Mutations in the BRAF kinase are thought to occur in 50-60% of human melanomas 206. Braf mutation combined with T2/Onc transposition led to the formation of melanomas, at which point mice were treated with the small molecule Braf inhibitor PLX4720. In some cases, the treatment caused an initial tumor regression followed by relapse of the melanoma. Transposon insertion site analysis in treatment resistant tumors revealed eight candidate genes associated with drug resistance 155. This approach further demonstrates the current value of the transposon-based mutagenesis screen.

The transposon-based gene discovery model has been elegantly used to strengthen our understanding of the genetic mechanisms driving CRC. For example, Takeda et al. used this system to bolster our understanding of the evolutionary forces driving CRC tumors. Here, Takeda et al. performed insertional mutagenesis screens in cohorts of mice carrying sensitizing mutations that act at different times in the evolution of a CRC tumor from an adenoma to carcinoma (i.e. WT, KrasG12D, Smad4 KO, p53R172H cohorts) 82. Data from this study revealed that there are candidate driver genes common amongst all cohorts, but also that there exists a fair number of candidate driver genes unique to each cohort. These data suggest that during CRC progression the genes responsible for driving tumor growth may change over time. Developing molecularly targeted therapies against molecules unique to each cohort could be a strategy for treating patients at varying stages of disease. Remarkably, the transposon-based system was also used to identify CRC driver genes in an ex vivo setting. Using discarded normal colon tissue from CRC surgical resections, Chen et al. decellularized the colon, repopulated the native organ extracellular matrix with cells equipped for insertional mutagenesis, and eventually identified common insertion site genes in the resulting invasive adenomas that formed ex vivo 207. This daunting technical feat provides the basis for using insertional mutagenesis screens to identify cancer driver genes in a complex, multidimensional model system.

Collectively, the work in this dissertation and the above examples demonstrate the value of the transposon-based mutagenesis screen in cancer gene discovery. This system can be used to effectively identify novel driver genes as well as to complement existing gene expression data to better predict clinical outcomes. In an era where it has become commonplace to use next-generation sequencing technology to profile a large number of tumors, the transposon-based mutagenesis screen should not be overlooked.

References

1 Thibodeau, S. N., Bren, G. & Schaid, D. Microsatellite instability in cancer of the proximal colon. Science 260, 816-819 (1993).
2 Pino, M. S. & Chung, D. C. The chromosomal instability pathway in colon cancer. Gastroenterology 138, 2059-2072, doi:10.1053/j.gastro.2009.12.065 (2010).
3 Gao, J. et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Science signaling 6, pl1, doi:10.1126/scisignal.2004088 (2013).
4 Bertagnolli, M. M. et al. Microsatellite instability and loss of heterozygosity at chromosomal location 18q: prospective evaluation of biomarkers for stages II and III colon cancer--a study of CALGB 9581 and 89803. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 29, 3153-3162, doi:10.1200/JCO.2010.33.0092 (2011).
5 Ellison, A. R., Lofing, J. & Bitter, G. A. Human MutL homolog (MLH1) function in DNA mismatch repair: a prospective screen for missense mutations in the ATPase domain. Nucleic acids research 32, 5321-5338, doi:10.1093/nar/gkh855 (2004).
6 Bonadona, V. et al. Cancer risks associated with germline mutations in MLH1, MSH2, and MSH6 genes in Lynch syndrome. JAMA : the journal of the American Medical Association 305, 2304-2310, doi:10.1001/jama.2011.743 (2011).
7 Merok, M. A. et al. Microsatellite instability has a positive prognostic impact on stage II colorectal cancer after complete resection: results from a large, consecutive Norwegian series. Annals of oncology : official journal of the European Society for Medical Oncology / ESMO 24, 1274-1282, doi:10.1093/annonc/mds614 (2013).
8 Hveem, T. S. et al. Prognostic impact of genomic instability in colorectal cancer. British journal of cancer 110, 2159-2164, doi:10.1038/bjc.2014.133 (2014).
9 Powell, S. M. et al. APC mutations occur early during colorectal tumorigenesis. Nature 359, 235-237, doi:10.1038/359235a0 (1992).
10 Fearon, E. R. & Vogelstein, B. A genetic model for colorectal tumorigenesis. Cell 61, 759-767 (1990).
11 Fearon, E. R. Molecular genetics of colorectal cancer. Annual review of pathology 6, 479-507, doi:10.1146/annurev-pathol-011110-130235 (2011).
12 Jin, Y. R. & Yoon, J. K. The R-spondin family of proteins: emerging regulators of WNT signaling. The international journal of biochemistry & cell biology 44, 2278-2287, doi:10.1016/j.biocel.2012.09.006 (2012).
13 de Sousa, E. M., Vermeulen, L., Richel, D. & Medema, J. P. Targeting Wnt signaling in colon cancer stem cells. Clinical cancer research : an official journal of the American Association for Cancer Research 17, 647-653, doi:10.1158/1078-0432.CCR-10-1204 (2011).

14 Jasperson, K. W., Tuohy, T. M., Neklason, D. W. & Burt, R. W. Hereditary and familial colon cancer. Gastroenterology 138, 2044-2058, doi:10.1053/j.gastro.2010.01.054 (2010).
15 Armaghany, T., Wilson, J. D., Chu, Q. & Mills, G. Genetic alterations in colorectal cancer. Gastrointestinal cancer research : GCR 5, 19-27 (2012).
16 Duval, A. & Hamelin, R. Mutations at coding repeat sequences in mismatch repair-deficient human cancers: toward a new concept of target genes for instability. Cancer research 62, 2447-2454 (2002).
17 Thompson, S. L., Bakhoum, S. F. & Compton, D. A. Mechanisms of chromosomal instability. Current biology : CB 20, R285-295, doi:10.1016/j.cub.2010.01.034 (2010).
18 Bos, J. L. et al. Prevalence of ras gene mutations in human colorectal cancers. Nature 327, 293-297, doi:10.1038/327293a0 (1987).
19 Vogelstein, B. et al. Genetic alterations during colorectal-tumor development. The New England journal of medicine 319, 525-532, doi:10.1056/NEJM198809013190901 (1988).
20 Baker, S. J. et al. Chromosome 17 deletions and p53 gene mutations in colorectal carcinomas. Science 244, 217-221 (1989).
21 Fearon, E. R. et al. Identification of a chromosome 18q gene that is altered in colorectal cancers. Science 247, 49-56 (1990).
22 Joslyn, G. et al. Identification of deletion mutations and three new genes at the familial polyposis locus. Cell 66, 601-613 (1991).
23 Groden, J. et al. Identification and characterization of the familial adenomatous polyposis coli gene. Cell 66, 589-600 (1991).
24 Kinzler, K. W. et al. Identification of FAP locus genes from chromosome 5q21. Science 253, 661-665 (1991).
25 Kinzler, K. W. et al. Identification of a gene located at chromosome 5q21 that is mutated in colorectal cancers. Science 251, 1366-1370 (1991).
26 Nishisho, I. et al. Mutations of chromosome 5q21 genes in FAP and colorectal cancer patients. Science 253, 665-669 (1991).
27 Lindblom, A., Tannergard, P., Werelius, B. & Nordenskjold, M. Genetic mapping of a second locus predisposing to hereditary non-polyposis colon cancer. Nature genetics 5, 279-282, doi:10.1038/ng1193-279 (1993).
28 Peltomaki, P. et al. Genetic mapping of a locus predisposing to human colorectal cancer. Science 260, 810-812 (1993).
29 Strand, M., Prolla, T. A., Liskay, R. M. & Petes, T. D. Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365, 274-276, doi:10.1038/365274a0 (1993).
30 Fishel, R. et al. The human mutator gene homolog MSH2 and its association with hereditary nonpolyposis colon cancer. Cell 77, 1 p following 166 (1994).
31 Nicolaides, N. C. et al. Mutations of two PMS homologues in hereditary nonpolyposis colon cancer. Nature 371, 75-80, doi:10.1038/371075a0 (1994).

32 Papadopoulos, N. et al. Mutation of a mutL homolog in hereditary colon cancer. Science 263, 1625-1629 (1994).
33 Liu, B. et al. Analysis of mismatch repair genes in hereditary non-polyposis colorectal cancer patients. Nature medicine 2, 169-174 (1996).
34 Wijnen, J. et al. MSH2 genomic deletions are a frequent cause of HNPCC. Nature genetics 20, 326-328, doi:10.1038/3795 (1998).
35 Montazer Haghighi, M. et al. Four novel germline mutations in the MLH1 and PMS2 mismatch repair genes in patients with hereditary nonpolyposis colorectal cancer. International journal of colorectal disease 24, 885-893, doi:10.1007/s00384-009-0731-1 (2009).
36 Sjoblom, T. et al. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268-274, doi:10.1126/science.1133427 (2006).
37 Wood, L. D. et al. The genomic landscapes of human breast and colorectal cancers. Science 318, 1108-1113, doi:10.1126/science.1145720 (2007).
38 Network, T. C. G. A. R. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330-337, doi:10.1038/nature11252 (2012).
39 Tomasetti, C. & Vogelstein, B. Cancer etiology. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science 347, 78-81, doi:10.1126/science.1260825 (2015).
40 Moser, A. R., Pitot, H. C. & Dove, W. F. A dominant mutation that predisposes to multiple intestinal neoplasia in the mouse. Science 247, 322-324 (1990).
41 Su, L. K. et al. Multiple intestinal neoplasia caused by a mutation in the murine homolog of the APC gene. Science 256, 668-670 (1992).
42 Fodde, R. et al. A targeted chain-termination mutation in the mouse Apc gene results in multiple intestinal tumors. Proceedings of the National Academy of Sciences of the United States of America 91, 8969-8973 (1994).
43 Oshima, M. et al. Loss of Apc heterozygosity and abnormal tissue building in nascent intestinal polyps in mice carrying a truncated Apc gene. Proceedings of the National Academy of Sciences of the United States of America 92, 4482-4486 (1995).
44 Shibata, H. et al. Rapid colorectal adenoma formation initiated by conditional targeting of the Apc gene. Science 278, 120-123 (1997).
45 Saam, J. R. & Gordon, J. I. Inducible gene knockouts in the small intestinal and colonic epithelium. The Journal of biological chemistry 274, 38071-38082 (1999).
46 Pinto, D., Robine, S., Jaisser, F., El Marjou, F. E. & Louvard, D. Regulatory sequences of the mouse villin gene that efficiently drive transgenic expression in immature and differentiated epithelial cells of small and large intestines. The Journal of biological chemistry 274, 6476-6482 (1999).
47 Madison, B. B. et al. Cis elements of the villin gene control expression in restricted domains of the vertical (crypt) and horizontal (duodenum, cecum) axes of the intestine. The Journal of biological chemistry 277, 33275-33283, doi:10.1074/jbc.M204935200 (2002).
48 el Marjou, F. et al. Tissue-specific and inducible Cre-mediated recombination in the gut epithelium. Genesis 39, 186-193, doi:10.1002/gene.20042 (2004).
49 Heyer, J., Yang, K., Lipkin, M., Edelmann, W. & Kucherlapati, R. Mouse models for colorectal cancer. Oncogene 18, 5325-5333, doi:10.1038/sj.onc.1203036 (1999).
50 McCart, A. E., Vickaryous, N. K. & Silver, A. Apc mice: models, modifiers and mutants. Pathology, research and practice 204, 479-490, doi:10.1016/j.prp.2008.03.004 (2008).
51 Nandan, M. O. & Yang, V. W. Genetic and Chemical Models of Colorectal Cancer in Mice. Current colorectal cancer reports 6, 51-59, doi:10.1007/s11888-010-0046-1 (2010).
52 Karim, B. O. & Huso, D. L. Mouse models for colorectal cancer. American journal of cancer research 3, 240-250 (2013).
53 de Wind, N., Dekker, M., Berns, A., Radman, M. & te Riele, H. Inactivation of the mouse Msh2 gene results in mismatch repair deficiency, methylation tolerance, hyperrecombination, and predisposition to cancer. Cell 82, 321-330 (1995).
54 Reitmair, A. H. et al. MSH2 deficient mice are viable and susceptible to lymphoid tumours. Nature genetics 11, 64-70, doi:10.1038/ng0995-64 (1995).
55 Reitmair, A. H. et al. MSH2 deficiency contributes to accelerated APC-mediated intestinal tumorigenesis. Cancer research 56, 2922-2926 (1996).
56 Edelmann, W. et al. Tumorigenesis in Mlh1 and Mlh1/Apc1638N mutant mice. Cancer research 59, 1301-1307 (1999).
57 Edelmann, W. et al. Mutation in the mismatch repair gene Msh6 causes cancer susceptibility. Cell 91, 467-477 (1997).
58 de Wind, N., Dekker, M., van Rossum, A., van der Valk, M. & te Riele, H. Mouse models for hereditary nonpolyposis colorectal cancer. Cancer research 58, 248-255 (1998).
59 Kucherlapati, M. H. et al. An Msh2 conditional knockout mouse for studying intestinal cancer and testing anticancer agents. Gastroenterology 138, 993-1002 e1001, doi:10.1053/j.gastro.2009.11.009 (2010).
60 Neufert, C., Becker, C. & Neurath, M. F. An inducible mouse model of colon carcinogenesis for the analysis of sporadic and inflammation-driven tumor progression. Nature protocols 2, 1998-2004, doi:10.1038/nprot.2007.279 (2007).

61 De Robertis, M. et al. The AOM/DSS murine model for the study of colon carcinogenesis: From pathways to diagnosis and therapy studies. Journal of carcinogenesis 10, 9, doi:10.4103/1477-3163.78279 (2011).
62 Rosenberg, D. W., Giardina, C. & Tanaka, T. Mouse models for the study of colon carcinogenesis. Carcinogenesis 30, 183-196, doi:10.1093/carcin/bgn267 (2009).
63 Tong, Y., Yang, W. & Koeffler, H. P. Mouse models of colorectal cancer. Chinese journal of cancer 30, 450-462, doi:10.5732/cjc.011.10041 (2011).
64 Nusse, R. & Varmus, H. E. Many tumors induced by the mouse mammary tumor virus contain a provirus integrated in the same region of the host genome. Cell 31, 99-109 (1982).
65 Selten, G., Cuypers, H. T., Zijlstra, M., Melief, C. & Berns, A. Involvement of c-myc in MuLV-induced T cell lymphomas in mice: frequency and mechanisms of activation. The EMBO journal 3, 3215-3222 (1984).
66 Johansson, F. K. et al. Identification of candidate cancer-causing genes in mouse brain tumors by retroviral tagging. Proceedings of the National Academy of Sciences of the United States of America 101, 11334-11337, doi:10.1073/pnas.0402716101 (2004).
67 Mikkers, H. et al. High-throughput retroviral tagging to identify components of specific signaling pathways in cancer. Nature genetics 32, 153-159, doi:10.1038/ng950 (2002).
68 Uren, A. G., Kool, J., Berns, A. & van Lohuizen, M. Retroviral insertional mutagenesis: past, present and future. Oncogene 24, 7656-7672, doi:10.1038/sj.onc.1209043 (2005).
69 Osborne, B. I. & Baker, B. Movers and shakers: maize transposons as tools for analyzing other plant genomes. Current opinion in cell biology 7, 406-413 (1995).
70 Spradling, A. C. et al. Gene disruptions using P transposable elements: an integral component of the Drosophila genome project. Proceedings of the National Academy of Sciences of the United States of America 92, 10824-10830 (1995).
71 Ivics, Z., Hackett, P. B., Plasterk, R. H. & Izsvak, Z. Molecular reconstruction of Sleeping Beauty, a Tc1-like transposon from fish, and its transposition in human cells. Cell 91, 501-510 (1997).
72 Dupuy, A. J., Fritz, S. & Largaespada, D. A. Transposition and gene disruption in the male germline of the mouse. Genesis 30, 82-88 (2001).
73 Fischer, S. E., Wienholds, E. & Plasterk, R. H. Regulated transposition of a fish transposon in the mouse germ line. Proceedings of the National Academy of Sciences of the United States of America 98, 6759-6764, doi:10.1073/pnas.121569298 (2001).
74 Zayed, H., Izsvak, Z., Walisko, O. & Ivics, Z. Development of hyperactive sleeping beauty transposon vectors by mutational analysis. Molecular therapy : the journal of the American Society of Gene Therapy 9, 292-304, doi:10.1016/j.ymthe.2003.11.024 (2004).

75 Geurts, A. M. et al. Gene transfer into genomes of human cells by the sleeping beauty transposon system. Molecular therapy : the journal of the American Society of Gene Therapy 8, 108-117 (2003).
76 Collier, L. S., Carlson, C. M., Ravimohan, S., Dupuy, A. J. & Largaespada, D. A. Cancer gene discovery in solid tumours using transposon-based somatic mutagenesis in the mouse. Nature 436, 272-276, doi:10.1038/nature03681 (2005).
77 Dupuy, A. J., Akagi, K., Largaespada, D. A., Copeland, N. G. & Jenkins, N. A. Mammalian mutagenesis using a highly mobile somatic Sleeping Beauty transposon system. Nature 436, 221-226, doi:10.1038/nature03691 (2005).
78 Starr, T. K. et al. A transposon-based genetic screen in mice identifies genes altered in colorectal cancer. Science 323, 1747-1750, doi:10.1126/science.1163040 (2009).
79 Wu, C. et al. RSPO2-LGR5 signaling has tumour-suppressive activity in colorectal cancer. Nat Commun 5, 3149, doi:10.1038/ncomms4149 (2014).
80 Starr, T. K. et al. A Sleeping Beauty transposon-mediated screen identifies murine susceptibility genes for adenomatous polyposis coli (Apc)-dependent intestinal tumorigenesis. Proceedings of the National Academy of Sciences of the United States of America 108, 5765-5770, doi:10.1073/pnas.1018012108 (2011).
81 March, H. N. et al. Insertional mutagenesis identifies multiple networks of cooperating genes driving intestinal tumorigenesis. Nature genetics 43, 1202-1209, doi:10.1038/ng.990 (2011).
82 Takeda, H. et al. Transposon mutagenesis identifies genes and evolutionary forces driving gastrointestinal tract tumor progression. Nature genetics 47, 142-150, doi:10.1038/ng.3175 (2015).
83 Than, B. L. et al. The role of KCNQ1 in mouse and human gastrointestinal cancers. Oncogene 33, 3861-3868, doi:10.1038/onc.2013.350 (2014).
84 Billings, J. L. et al. Early colon screening of adult patients with cystic fibrosis reveals high incidence of adenomatous colon polyps. Journal of clinical gastroenterology 48, e85-88, doi:10.1097/MCG.0000000000000034 (2014).
85 Tanaka, T. Colorectal carcinogenesis: Review of human and experimental animal studies. J Carcinog 8, 5 (2009).
86 Cancer Genome Atlas, N. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330-337, doi:10.1038/nature11252 (2012).
87 Haan, J. C. et al. Genomic landscape of metastatic colorectal cancer. Nat Commun 5, 5457, doi:10.1038/ncomms6457 (2014).
88 Abbott, K. L. et al. The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice. Nucleic Acids Res 43, D844-848, doi:10.1093/nar/gku770 (2015).

89 Than, B. L. N. et al. CFTR is a tumor suppressor gene in murine and human intestinal cancer. Oncogene 36, 3504, doi:10.1038/onc.2017.3 (2017).
90 Seshagiri, S. et al. Recurrent R-spondin fusions in colon cancer. Nature 488, 660-664, doi:10.1038/nature11282 (2012).
91 Pruvot, B. et al. Comparative analysis of nonaspanin protein sequences and expression studies in zebrafish. Immunogenetics 62, 681-699, doi:10.1007/s00251-010-0472-x (2010).
92 Chluba-de Tapia, J., de Tapia, M., Jaggin, V. & Eberle, A. N. Cloning of a human multispanning membrane protein cDNA: evidence for a new protein family. Gene 197, 195-204 (1997).
93 He, P. et al. High-throughput functional screening for autophagy-related genes and identification of TM9SF1 as an autophagosome-inducing gene. Autophagy 5, 52-60 (2009).
94 Zaravinos, A., Lambrou, G. I., Boulalas, I., Delakas, D. & Spandidos, D. A. Identification of common differentially expressed genes in urinary bladder cancer. PloS one 6, e18135, doi:10.1371/journal.pone.0018135 (2011).
95 Chang, H. et al. Identification of genes associated with chemosensitivity to SAHA/taxane combination treatment in taxane-resistant breast cancer cells. Breast cancer research and treatment 125, 55-63, doi:10.1007/s10549-010-0825-z (2011).
96 Oo, H. Z. et al. Identification of novel transmembrane proteins in scirrhous-type gastric cancer by the Escherichia coli ampicillin secretion trap (CAST) method: TM9SF3 participates in tumor invasion and serves as a prognostic factor. Pathobiology 81, 138-148, doi:10.1159/000357821 (2014).
97 Lozupone, F. et al. The human homologue of Dictyostelium discoideum phg1A is expressed by human metastatic melanoma cells. EMBO Rep 10, 1348-1354, doi:10.1038/embor.2009.236 (2009).
98 Lozupone, F. et al. TM9SF4 is a novel V-ATPase-interacting protein that modulates tumor pH alterations associated with drug resistance and invasiveness of colon cancer cells. Oncogene 34, 5163-5174, doi:10.1038/onc.2014.437 (2015).
99 Starr, T. K. & Largaespada, D. A. Cancer gene discovery using the Sleeping Beauty transposon. Cell Cycle 4, 1744-1748, doi:10.4161/cc.4.12.2223 (2005).
100 Copeland, N. G. & Jenkins, N. A. Harnessing transposons for cancer gene discovery. Nature reviews. Cancer 10, 696-706, doi:10.1038/nrc2916 (2010).
101 Newberg, J. Y., Mann, K. M., Mann, M. B., Jenkins, N. A. & Copeland, N. G. SBCDDB: Sleeping Beauty Cancer Driver Database for gene discovery in mouse models of human cancers. Nucleic Acids Res 46, D1011-D1017, doi:10.1093/nar/gkx956 (2018).

102 Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 45, D777-D783, doi:10.1093/nar/gkw1121 (2017).
103 Cerami, E. et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer discovery 2, 401-404, doi:10.1158/2159-8290.CD-12-0095 (2012).
104 Roig, A. I. et al. Immortalized epithelial cells derived from human colon biopsies express stem cell markers and differentiate in vitro. Gastroenterology 138, 1012-1021 e1011-1015, doi:10.1053/j.gastro.2009.11.052 (2010).
105 Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America 102, 15545-15550, doi:10.1073/pnas.0506580102 (2005).
106 Stanton, R. C. Glucose-6-phosphate dehydrogenase, NADPH, and cell survival. IUBMB Life 64, 362-369, doi:10.1002/iub.1017 (2012).
107 Ho, H. Y. et al. Enhanced oxidative stress and accelerated cellular senescence in glucose-6-phosphate dehydrogenase (G6PD)-deficient human fibroblasts. Free Radic Biol Med 29, 156-169 (2000).
108 Longo, L. et al. Maternally transmitted severe glucose 6-phosphate dehydrogenase deficiency is an embryonic lethal. EMBO J 21, 4229-4239 (2002).
109 Widau, R. C., Jin, Y., Dixon, S. A., Wadzinski, B. E. & Gallagher, P. J. Protein phosphatase 2A (PP2A) holoenzymes regulate death-associated protein kinase (DAPK) in ceramide-induced anoikis. The Journal of biological chemistry 285, 13827-13838, doi:10.1074/jbc.M109.085076 (2010).
110 Powell, J. A. et al. Targeting sphingosine kinase 1 induces MCL1-dependent cell death in acute myeloid leukemia. Blood 129, 771-782, doi:10.1182/blood-2016-06-720433 (2017).
111 Adada, M. M. et al. Intracellular sphingosine kinase 2-derived sphingosine-1-phosphate mediates epidermal growth factor-induced ezrin-radixin-moesin phosphorylation and cancer cell invasion. FASEB J 29, 4654-4669, doi:10.1096/fj.15-274340 (2015).
112 Shida, D., Takabe, K., Kapitonov, D., Milstien, S. & Spiegel, S. Targeting SphK1 as a new strategy against cancer. Curr Drug Targets 9, 662-673 (2008).
113 Kim, W. J. et al. Mutations in the neutral sphingomyelinase gene SMPD3 implicate the ceramide pathway in human leukemias. Blood 111, 4716-4722, doi:10.1182/blood-2007-10-113068 (2008).
114 Consortium, E. P. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74, doi:10.1038/nature11247 (2012).
115 Marisa, L. et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med 10, e1001453, doi:10.1371/journal.pmed.1001453 (2013).

116 Jorissen, R. N. et al. Metastasis-Associated Gene Expression Changes Predict Poor Outcomes in Patients with Dukes Stage B and C Colorectal Cancer. Clinical cancer research : an official journal of the American Association for Cancer Research 15, 7642-7651, doi:10.1158/1078-0432.CCR-09-1431 (2009).
117 Smith, J. J. et al. Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology 138, 958-968, doi:10.1053/j.gastro.2009.11.005 (2010).
118 Froquet, R. et al. TM9/Phg1 and SadA proteins control surface expression and stability of SibA adhesion molecules in Dictyostelium. Molecular biology of the cell 23, 679-686, doi:10.1091/mbc.E11-04-0338 (2012).
119 Perrin, J. et al. TM9 family proteins control surface targeting of glycine-rich transmembrane domains. Journal of cell science 128, 2269-2277, doi:10.1242/jcs.164848 (2015).
120 Ordonez, C., Screaton, R. A., Ilantzis, C. & Stanners, C. P. Human carcinoembryonic antigen functions as a general inhibitor of anoikis. Cancer research 60, 3419-3424 (2000).
121 Ilantzis, C., DeMarte, L., Screaton, R. A. & Stanners, C. P. Deregulated expression of the human tumor marker CEA and CEA family member CEACAM6 disrupts tissue architecture and blocks colonocyte differentiation. Neoplasia 4, 151-163, doi:10.1038/sj/neo/7900201 (2002).
122 Lai, M. et al. Complete Acid Ceramidase ablation prevents cancer-initiating cell formation in melanoma cells. Scientific reports 7, 7411, doi:10.1038/s41598-017-07606-w (2017).
123 Goni, F. M. & Alonso, A. Sphingomyelinases: enzymology and membrane activity. FEBS letters 531, 38-46 (2002).
124 Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat Methods 11, 783-784, doi:10.1038/nmeth.3047 (2014).
125 Burns, M. B. et al. APOBEC3B is an enzymatic source of mutation in breast cancer. Nature 494, 366-370, doi:10.1038/nature11881 (2013).
126 Burns, M. B., Lynch, J., Starr, T. K., Knights, D. & Blekhman, R. Virulence genes are a signature of the microbiome in the colorectal tumor microenvironment. Genome Med 7, 55, doi:10.1186/s13073-015-0177-8 (2015).
127 Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120, doi:10.1093/bioinformatics/btu170 (2014).
128 Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12, 357-360, doi:10.1038/nmeth.3317 (2015).
129 Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930, doi:10.1093/bioinformatics/btt656 (2014).

130 Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc 4, 1184-1191, doi:10.1038/nprot.2009.97 (2009).
131 Chan, S. C. et al. Targeting chromatin binding regulation of constitutively active AR variants to overcome prostate cancer resistance to endocrine-based therapies. Nucleic Acids Res 43, 5880-5897, doi:10.1093/nar/gkv262 (2015).
132 Staub, E. et al. An expression module of WIPF1-coexpressed genes identifies patients with favorable prognosis in three tumor types. J Mol Med (Berl) 87, 633-644, doi:10.1007/s00109-009-0467-y (2009).
133 Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2018. CA Cancer J Clin 68, 7-30, doi:10.3322/caac.21442 (2018).
134 Ferlay, J. et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 136, E359-386, doi:10.1002/ijc.29210 (2015).
135 Van Cutsem, E., Cervantes, A., Nordlinger, B., Arnold, D. & Group, E. G. W. Metastatic colorectal cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 25 Suppl 3, iii1-9, doi:10.1093/annonc/mdu260 (2014).
136 Surveillance, E., and End Results (SEER) Program Research Data (1973-2015), National Cancer Institute, DCCPS, Surveillance Research Program, (2018).
137 Xu, G. M. & Arnaout, M. A. WAC, a novel WW domain-containing adapter with a coiled-coil region, is colocalized with splicing factor SC35. Genomics 79, 87-94, doi:10.1006/geno.2001.6684 (2002).
138 Totsukawa, G. et al. VCIP135 deubiquitinase and its binding protein, WAC, in p97ATPase-mediated membrane fusion. EMBO J 30, 3581-3593, doi:10.1038/emboj.2011.260 (2011).
139 McKnight, N. C. et al. Genome-wide siRNA screen reveals amino acid starvation-induced autophagy requires SCOC and WAC. EMBO J 31, 1931-1946, doi:10.1038/emboj.2012.36 (2012).
140 Joachim, J. et al. Activation of ULK Kinase and Autophagy by GABARAP Trafficking from the Centrosome Is Regulated by WAC and GM130. Mol Cell 60, 899-913, doi:10.1016/j.molcel.2015.11.018 (2015).
141 Zhang, F. & Yu, X. WAC, a functional partner of RNF20/40, regulates histone H2B ubiquitination and gene transcription. Mol Cell 41, 384-397, doi:10.1016/j.molcel.2011.01.024 (2011).
142 Segditsas, S. & Tomlinson, I. Colorectal cancer and genetic alterations in the Wnt pathway. Oncogene 25, 7531-7537, doi:10.1038/sj.onc.1210059 (2006).
143 Lee, C. C. et al. TCF12 protein functions as transcriptional repressor of E-cadherin, and its overexpression is correlated with metastasis of colorectal cancer. The Journal of biological chemistry 287, 2798-2809, doi:10.1074/jbc.M111.258947 (2012).

144 Storm, E. E. et al. Targeting PTPRK-RSPO3 colon tumours promotes differentiation and loss of stem-cell function. Nature 529, 97-100, doi:10.1038/nature16466 (2016).
145 Bard-Chapeau, E. A. et al. Transposon mutagenesis identifies genes driving hepatocellular carcinoma in a chronic hepatitis B mouse model. Nature genetics 46, 24-32, doi:10.1038/ng.2847 (2014).
146 Berquam-Vrieze, K. E. et al. Cell of origin strongly influences genetic selection in a mouse model of T-ALL. Blood 118, 4646-4656, doi:10.1182/blood-2011-03-343947 (2011).
147 Chen, L. et al. Transposon insertional mutagenesis in mice identifies human breast cancer susceptibility genes and signatures for stratification. Proceedings of the National Academy of Sciences of the United States of America 114, E2215-E2224, doi:10.1073/pnas.1701512114 (2017).
148 Dorr, C. et al. Transposon Mutagenesis Screen Identifies Potential Lung Cancer Drivers and CUL3 as a Tumor Suppressor. Mol Cancer Res 13, 1238-1247, doi:10.1158/1541-7786.MCR-14-0674-T (2015).
149 Genovesi, L. A. et al. Sleeping Beauty mutagenesis in a mouse medulloblastoma model defines networks that discriminate between human molecular subgroups. Proceedings of the National Academy of Sciences of the United States of America 110, E4325-4334, doi:10.1073/pnas.1318639110 (2013).
150 Guo, Y. et al. Comprehensive Ex Vivo Transposon Mutagenesis Identifies Genes That Promote Growth Factor Independence and Leukemogenesis. Cancer research 76, 773-786, doi:10.1158/0008-5472.CAN-15-1697 (2016).
151 Kodama, T. et al. Two-Step Forward Genetic Screen in Mice Identifies Ral GTPase-Activating Proteins as Suppressors of Hepatocellular Carcinoma. Gastroenterology 151, 324-337 e312, doi:10.1053/j.gastro.2016.04.040 (2016).
152 Moriarity, B. S. et al. A Sleeping Beauty forward genetic screen identifies new genes and pathways driving osteosarcoma development and metastasis. Nature genetics 47, 615-624, doi:10.1038/ng.3293 (2015).
153 Morris, S. M. et al. Transposon mutagenesis identifies candidate genes that cooperate with loss of transforming growth factor-beta signaling in mouse intestinal neoplasms. Int J Cancer 140, 853-863, doi:10.1002/ijc.30491 (2017).
154 Perez-Mancera, P. A. et al. The deubiquitinase USP9X suppresses pancreatic ductal adenocarcinoma. Nature 486, 266-270, doi:10.1038/nature11114 (2012).
155 Perna, D. et al. BRAF inhibitor resistance mediated by the AKT pathway in an oncogenic BRAF mouse melanoma model. Proceedings of the National Academy of Sciences of the United States of America 112, E536-545, doi:10.1073/pnas.1418163112 (2015).

156 Quintana, R. M. et al. A transposon-based analysis of gene mutations related to skin cancer development. J Invest Dermatol 133, 239-248, doi:10.1038/jid.2012.245 (2013).
157 Rahrmann, E. P. et al. Forward genetic screen for malignant peripheral nerve sheath tumor formation identifies new genes and pathways driving tumorigenesis. Nature genetics 45, 756-766, doi:10.1038/ng.2641 (2013).
158 Rangel, R. et al. Transposon mutagenesis identifies genes that cooperate with mutant Pten in breast cancer progression. Proceedings of the National Academy of Sciences of the United States of America 113, E7749-E7758, doi:10.1073/pnas.1613859113 (2016).
159 Takeda, H. et al. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development. Proceedings of the National Academy of Sciences of the United States of America 113, E2057-2065, doi:10.1073/pnas.1603223113 (2016).
160 Wu, X. et al. Clonal selection drives genetic divergence of metastatic medulloblastoma. Nature 482, 529-533, doi:10.1038/nature10825 (2012).
161 Soh, J. et al. Oncogene mutations, copy number gains and mutant allele specific imbalance (MASI) frequently occur together in tumor cells. PloS one 4, e7464, doi:10.1371/journal.pone.0007464 (2009).
162 Forbes, S. A. et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 39, D945-950, doi:10.1093/nar/gkq929 (2011).
163 Shihab, H. A. et al. An integrative approach to predicting the functional effects of non-coding and coding sequence variation. Bioinformatics 31, 1536-1543, doi:10.1093/bioinformatics/btv009 (2015).
164 Jat, P. S. et al. Direct derivation of conditionally immortal cell lines from an H-2Kb-tsA58 transgenic mouse. Proceedings of the National Academy of Sciences of the United States of America 88, 5096-5100 (1991).
165 Whitehead, R. H. & Joseph, J. L. Derivation of conditionally immortalized cell lines containing the Min mutation from the normal colonic mucosa and other tissues of an "Immortomouse"/Min hybrid. Epithelial Cell Biol 3, 119-125 (1994).
166 Whitehead, R. H., VanEeden, P. E., Noble, M. D., Ataliotis, P. & Jat, P. S. Establishment of conditionally immortalized epithelial cell lines from both colon and small intestine of adult H-2Kb-tsA58 transgenic mice. Proceedings of the National Academy of Sciences of the United States of America 90, 587-591 (1993).
167 Williams, A. C. et al. Transfection and expression of mutant p53 protein does not alter the in vivo or in vitro growth characteristics of the AA/C1 human adenoma derived cell line, including sensitivity to transforming growth factor-beta 1. Oncogene 9, 1479-1485 (1994).
168 Williams, A. C., Harper, S. J. & Paraskeva, C. Neoplastic transformation of a human colonic epithelial cell line: in vitro evidence for the adenoma to carcinoma sequence. Cancer research 50, 4724-4730 (1990).

169 Shi, J. et al. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat Biotechnol 33, 661-667, doi:10.1038/nbt.3235 (2015).
170 Scolnick, D. M. et al. CREB-binding protein and p300/CBP-associated factor are transcriptional coactivators of the p53 tumor suppressor protein. Cancer research 57, 3693-3696 (1997).
171 Polyak, K., Hamilton, S. R., Vogelstein, B. & Kinzler, K. W. Early alteration of cell-cycle-regulated gene expression in colorectal neoplasia. Am J Pathol 149, 381-387 (1996).
172 da Costa, L. T. et al. CDX2 is mutated in a colorectal cancer with normal APC/beta-catenin signaling. Oncogene 18, 5010-5014, doi:10.1038/sj.onc.1202872 (1999).
173 Bai, Y. Q., Miyake, S., Iwai, T. & Yuasa, Y. CDX2, a homeobox transcription factor, upregulates transcription of the p21/WAF1/CIP1 gene. Oncogene 22, 7942-7949, doi:10.1038/sj.onc.1206634 (2003).
174 Fingar, D. C., Salama, S., Tsou, C., Harlow, E. & Blenis, J. Mammalian cell size is controlled by mTOR and its downstream targets S6K1 and 4EBP1/eIF4E. Genes Dev 16, 1472-1487, doi:10.1101/gad.995802 (2002).
175 Fumarola, C., La Monica, S., Alfieri, R. R., Borra, E. & Guidotti, G. G. Cell size reduction induced by inhibition of the mTOR/S6K-signaling pathway protects Jurkat cells from apoptosis. Cell Death Differ 12, 1344-1357, doi:10.1038/sj.cdd.4401660 (2005).
176 Wang, R. C. & Levine, B. Autophagy in cellular growth control. FEBS Lett 584, 1417-1426, doi:10.1016/j.febslet.2010.01.009 (2010).
177 Varvagiannis, K., de Vries, B. B. A. & Vissers, L. in GeneReviews® (eds M. P. Adam et al.) (1993).
178 Bergeret, E. et al. TM9SF4 is required for Drosophila cellular immunity via cell adhesion and phagocytosis. Journal of cell science 121, 3325-3334, doi:10.1242/jcs.030163 (2008).
179 Froquet, R. et al. Control of cellular physiology by TM9 proteins in yeast and Dictyostelium. The Journal of biological chemistry 283, 6764-6772, doi:10.1074/jbc.M704484200 (2008).
180 Benghezal, M. et al. Synergistic control of cellular adhesion by transmembrane 9 proteins. Molecular biology of the cell 14, 2890-2899, doi:10.1091/mbc.E02-11-0724 (2003).
181 Le Coadic, M. et al. Phg1/TM9 proteins control intracellular killing of bacteria by determining cellular levels of the Kil1 sulfotransferase in Dictyostelium. PloS one 8, e53259, doi:10.1371/journal.pone.0053259 (2013).
182 Jensen, A. G. et al. Biochemical characterization and lysosomal localization of the mannose-6-phosphate protein p76 (hypothetical protein LOC196463). Biochem J 402, 449-458, doi:10.1042/BJ20061205 (2007).

183 Schimmoller, F., Diaz, E., Muhlbauer, B. & Pfeffer, S. R. Characterization of a 76 kDa endosomal, multispanning membrane protein that is highly conserved throughout evolution. Gene 216, 311-318 (1998).
184 Majumdar, A. et al. Activation of microglia acidifies lysosomes and leads to degradation of Alzheimer amyloid fibrils. Molecular biology of the cell 18, 1490-1496, doi:10.1091/mbc.E06-10-0975 (2007).
185 Shin, S., Wolgamott, L. & Yoon, S. O. Integrin trafficking and tumor progression. International journal of cell biology 2012, 516789, doi:10.1155/2012/516789 (2012).
186 Kuwada, S. K., Kuang, J. & Li, X. Integrin alpha5/beta1 expression mediates HER-2 down-regulation in colon cancer cells. The Journal of biological chemistry 280, 19027-19035, doi:10.1074/jbc.M410540200 (2005).
187 Nam, K. T. et al. Loss of Rab25 promotes the development of intestinal neoplasia in mice and is associated with human colorectal adenocarcinomas. The Journal of clinical investigation 120, 840-849, doi:10.1172/JCI40728 (2010).
188 Dozynkiewicz, M. A. et al. Rab25 and CLIC3 collaborate to promote integrin recycling from late endosomes/lysosomes and drive cancer progression. Developmental cell 22, 131-145, doi:10.1016/j.devcel.2011.11.008 (2012).
189 Than, B. L. et al. The role of KCNQ1 in mouse and human gastrointestinal cancers. Oncogene, doi:10.1038/onc.2013.350 (2013).
190 Schempp, C. M. et al. V-ATPase Inhibition Regulates Anoikis Resistance and Metastasis of Cancer Cells. Molecular cancer therapeutics, doi:10.1158/1535-7163.MCT-13-0484 (2014).
191 Supino, R. et al. Antimetastatic effect of a small-molecule vacuolar H+-ATPase inhibitor in in vitro and in vivo preclinical studies. The Journal of pharmacology and experimental therapeutics 324, 15-22, doi:10.1124/jpet.107.128587 (2008).
192 Prevarskaya, N., Skryma, R. & Shuba, Y. Ion channels and the hallmarks of cancer. Trends in molecular medicine 16, 107-121, doi:10.1016/j.molmed.2010.01.005 (2010).
193 Mosesson, Y., Mills, G. B. & Yarden, Y. Derailed endocytosis: an emerging feature of cancer. Nature reviews. Cancer 8, 835-850, doi:10.1038/nrc2521 (2008).
194 Raghunand, N., Martinez-Zaguilan, R., Wright, S. H. & Gillies, R. J. pH and drug resistance. II. Turnover of acidic vesicles and resistance to weakly basic chemotherapeutic drugs. Biochemical pharmacology 57, 1047-1058 (1999).
195 Chung, G. G. et al. Tissue microarray analysis of beta-catenin in colorectal cancer shows nuclear phospho-beta-catenin is associated with a better prognosis. Clinical cancer research : an official journal of the American Association for Cancer Research 7, 4013-4020 (2001).

196 Sato, T. et al. Long-term expansion of epithelial organoids from human colon, adenoma, adenocarcinoma, and Barrett's epithelium. Gastroenterology 141, 1762-1772, doi:10.1053/j.gastro.2011.07.050 (2011).
197 Matano, M. et al. Modeling colorectal cancer using CRISPR-Cas9-mediated engineering of human intestinal organoids. Nature medicine 21, 256-262, doi:10.1038/nm.3802 (2015).
198 Crespo, M. et al. Colonic organoids derived from human induced pluripotent stem cells for modeling colorectal cancer and drug testing. Nature medicine 23, 878-884, doi:10.1038/nm.4355 (2017).
199 de la Rosa, J. et al. A single-copy Sleeping Beauty transposon mutagenesis screen identifies new PTEN-cooperating tumor suppressor genes. Nature genetics 49, 730-741, doi:10.1038/ng.3817 (2017).
200 Howlader, N. et al. SEER Cancer Statistics Review, 1975-2011, (2014).
201 TCGA. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330-337, doi:10.1038/nature11252 (2012).
202 Budinska, E. et al. Gene expression patterns unveil a new level of molecular heterogeneity in colorectal cancer. The Journal of pathology 231, 63-76, doi:10.1002/path.4212 (2013).
203 McKeown, E. et al. Current Approaches and Challenges for Monitoring Treatment Response in Colon and Rectal Cancer. Journal of Cancer 5, 31-43, doi:10.7150/jca.7987 (2014).
204 De Sousa, E. M. F. et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nature medicine 19, 614-618, doi:10.1038/nm.3174 (2013).
205 Rouzier, R. et al. Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clinical cancer research : an official journal of the American Association for Cancer Research 11, 5678-5685, doi:10.1158/1078-0432.CCR-04-2421 (2005).
206 Davies, H. et al. Mutations of the BRAF gene in human cancer. Nature 417, 949-954, doi:10.1038/nature00766 (2002).
207 Chen, H. J. et al. A recellularized human colon model identifies cancer driver genes. Nat Biotechnol 34, 845-851, doi:10.1038/nbt.3586 (2016).

Appendix

Supplemental Figure 2.1: TM9SF2 knockdown reduces anchorage-independent growth.

Supplemental Figure 2.1. TM9SF2 knockdown reduces anchorage-independent growth in DLD1 cells. A, Quantification of TM9SF2 mRNA levels in DLD1 cells transduced with lentiviral particles carrying TM9SF2 shRNA. B, Western blot confirmation of TM9SF2 knockdown in DLD1 cells. C, Images of colonies stained with crystal violet ten days after plating.

Supplemental Figure 2.2. TM9SF2 knockout schema and knockout confirmation.


Supplemental Figure 2.2. A, Schematic of TM9SF2 gene editing with CRISPR/Cas9 lentiV2. Underlined nucleotides represent the protospacer adjacent motif (PAM) sequence following the DNA sequence targeted by the Cas9 nuclease (DNA sequence in red font). B, Representative sequence trace decomposition by TIDE analysis. Shown is the predicted mutant sequence after CRISPR/Cas9 editing and the indel spectrum determined by TIDE (panel C). D, Western blot analysis for TM9SF2 protein expression in single-cell HT-29 knockout clones. E, Western blot analysis for TM9SF2 protein expression in HCT116 overexpression cells.

Supplemental Figure 2.3. ENCODE H3K4me3 and ELF1 ChIP-Seq peaks in HCT116 cells.

Supplemental Figure 2.3. Gene track view of ChIP-Seq data at the TM9SF2 locus (Chr 13). The read densities after H3K4me3 (middle track) and ELF1 (bottom track) immunoprecipitations are shown. Also pictured is the ELF1 binding motif in the TM9SF2 promoter region. Data are from the UCSC Genome Browser (GRCh37/hg19).

Table S2.1: Primers used in Chapter 2

No. | Primer Name | Sequence (5' to 3') | Explanation
1 | sgTM9SF2 CRISPR1 F | CACCGTAGAAAGCGCCGCTCCGGCG | CRISPR1 forward oligo
2 | sgTM9SF2 CRISPR1 R | AAACCGCCGGAGCGGCGCTTTCTAC | CRISPR1 reverse oligo
3 | sgTM9SF2 CRISPR2 F | CACCGCGCAGAAGTTGACGGGCGCC | CRISPR2 forward oligo
4 | sgTM9SF2 CRISPR2 R | AAACGGCGCCCGTCAACTTCTGCGC | CRISPR2 reverse oligo
5 | sgTM9SF2 CRISPR6 F | CACCGATCAGCGCTCCTCGGTTGGC | CRISPR6 forward oligo
6 | sgTM9SF2 CRISPR6 R | AAACGCCAACCGAGGAGCGCTGATC | CRISPR6 reverse oligo
7 | CRC1 | CCCGAAACAACTATCATGAGC | Forward primer flanking CRISPR 1 and 2 cut site
8 | CRC2 | TGCGTCCTCCAGAAACTACC | Reverse primer flanking CRISPR 1 and 2 cut site
9 | CRC3 | CATGGTTTTGCGTGAAGTTG | Reverse primer flanking CRISPR 6 cut site
10 | CRC4 | AAGAGGGAGGGGATAGTGGA | Forward primer flanking CRISPR 6 cut site
11 | CRC036 | CAGAGAAAGCTGAAGACAAACAAA | Forward primer for TM9SF2 qRT-PCR experiment
12 | CRC038 | CAGAACCTCTGACCATCTTCAA | Reverse primer for TM9SF2 qRT-PCR experiment
13 | CRC079 | TATTGAGGCTGCTGAGGCAC | Forward primer for ELF qRT-PCR experiment
14 | CRC080 | CCTGCTGTGTTTCCATCACTTC | Reverse primer for ELF qRT-PCR experiment
15 | TS hACTB ex5-6 | CATTGCCGACAGGATGCAGAAG | Reverse housekeeping gene primer for qRT-PCR
16 | TS hACTB ex5-6 | GATCCACACGGAGTACTTGCGCT | Forward housekeeping gene primer for qRT-PCR
17 | CRChip 1 | GGCCGGCTTCCTTTATCTCT | Forward TM9SF2 qPCR primer used after ELF1 ChIP
18 | CRChip 2 | TCGGGGAAAGGGCAATCAAA | Reverse TM9SF2 qPCR primer used after ELF1 ChIP
19 | CRChip 7 | TCCCTCCCGGAAACTCCTC | Forward TM9SF2 qPCR primer used after ELF1 ChIP
20 | CRChip 8 | CAACAGAAACCGAAGCGGAC | Reverse TM9SF2 qPCR primer used after ELF1 ChIP
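The sgRNA oligos in Table S2.1 (primers 1 to 6) come in forward/reverse pairs that are meant to anneal to each other. A minimal Python sketch of a consistency check for such pairs follows. It assumes, from reading the sequences alone, that each forward oligo starts with CACC and each reverse oligo with AAAC, and that the remainder of the reverse oligo is the reverse complement of the remainder of the forward oligo. The function names and the dictionary are illustrative and are not part of the original methods.

```python
# Translation table that maps each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_annealing_pair(forward: str, reverse: str,
                      fwd_overhang: str = "CACC",
                      rev_overhang: str = "AAAC") -> bool:
    """Check that two oligos form one duplex flanked by the assumed 5' overhangs.

    The overhang defaults are an assumption read off the table sequences.
    """
    if not (forward.startswith(fwd_overhang) and reverse.startswith(rev_overhang)):
        return False
    fwd_core = forward[len(fwd_overhang):]
    rev_core = reverse[len(rev_overhang):]
    return rev_core == reverse_complement(fwd_core)


# Oligo pairs copied from Table S2.1 (primers 1-6); the keys are illustrative labels.
SG_OLIGOS = {
    "CRISPR1": ("CACCGTAGAAAGCGCCGCTCCGGCG", "AAACCGCCGGAGCGGCGCTTTCTAC"),
    "CRISPR2": ("CACCGCGCAGAAGTTGACGGGCGCC", "AAACGGCGCCCGTCAACTTCTGCGC"),
    "CRISPR6": ("CACCGATCAGCGCTCCTCGGTTGGC", "AAACGCCAACCGAGGAGCGCTGATC"),
}

if __name__ == "__main__":
    for name, (fwd, rev) in SG_OLIGOS.items():
        print(name, is_annealing_pair(fwd, rev))
```

A check of this kind is a quick way to catch transcription errors in an oligo table before ordering primers.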

Table S2.2. Insertional mutagenesis studies identifying TM9SF2 as a candidate cancer gene¹

Study | PubMed ID | Predicted Effect | CIS Address² | Cancer Type
Bard-Chapeau 2014-01 | 24316982 | Not Determined | 14:122107081-122159603 | Liver Cancer
Kodama 2016-01 | 27178121 | Not Determined | 14:122107038-122159604 | Liver Cancer
Morris 2016-01 | 27790711 | Not Determined | 14:122108778-122160778 | Colorectal Cancer
Perez Mancera 2012-01 | 22699621 | Not Determined | 14:122130850-122130850 | Pancreatic Cancer
Perez Mancera 2012-02 | 22699621 | Not Determined | 14:122130686-122130686 | Pancreatic Cancer
Rangel 2016-01 | 27849608 | Not Determined | 14:122107038-122159604 | Breast Cancer
Starr 2009-01 | 19251594 | Loss | 14:122086303-122194129 | Colorectal Cancer
Takeda 2016-01 | 27006499 | Not Determined | 14:122107038-122159604 | Gastric Cancer

¹Downloaded from the CCGD database (http://ccgd-starrlab.oit.umn.edu/search.php)

²CIS address format: chromosome:start coordinate-end coordinate
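The CIS address format described in the footnote can be split into its three fields with a few lines of Python. This is a minimal sketch: the `CisAddress` type and `parse_cis_address` function are hypothetical names introduced for illustration, and treating both coordinates as inclusive (so that a single-base address has a span of 1) is an assumption rather than something stated by the CCGD.

```python
from typing import NamedTuple


class CisAddress(NamedTuple):
    """One common insertion site (CIS) address, e.g. 14:122107081-122159603."""
    chromosome: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Assumes both coordinates are inclusive.
        return self.end - self.start + 1


def parse_cis_address(text: str) -> CisAddress:
    """Parse 'chromosome:start-end' into a CisAddress."""
    chromosome, _, coords = text.strip().partition(":")
    start_text, _, end_text = coords.partition("-")
    # int() raises ValueError if either coordinate is missing or not numeric.
    start, end = int(start_text), int(end_text)
    if not chromosome or start > end:
        raise ValueError(f"malformed CIS address: {text!r}")
    return CisAddress(chromosome, start, end)


if __name__ == "__main__":
    # Address taken from the Bard-Chapeau 2014-01 row of Table S2.2.
    cis = parse_cis_address("14:122107081-122159603")
    print(cis.chromosome, cis.start, cis.end, cis.span)
```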

Table S2.3. Insertional mutagenesis studies from SBCDDB in which TM9SF2 is considered a progression driver gene.

Organ | Dataset | Study | Expected Insertions | Observed Insertions | Total Tumors | Corrected p-value
Liver | HCC-HBV | A liver tumor model consisting of tumors with and without HBV. Contains unpublished T2Onc3 data. | 2.2 | 46 | 356 | 1.02E-24
Liver | HCC-noHBV | A liver tumor model consisting of tumors with and without HBV. Contains unpublished T2Onc3 data. | 2.3 | 17 | 146 | 7.62E-08
Intestine | INT-Kras | A cohort of an intestinal cancer model consisting of tumors with a Kras G12D allele. | 13.1 | 13 | 173 | 1.68E-05
Stomach | GAS | A gastric cancer model consisting of tumors with a Smad4 KO allele. | 2.8 | 8 | 66 | 5.80E-05
Pancreas | PDAC | A pancreatic ductal adenocarcinoma model using a Pdx1-cre model consisting of tumors with Kras G12D or Kras wild-type alleles. Draws on data from two studies. | 2.4 | 11 | 172 | 1.58E-03
Intestine | INT-Trp53 | A cohort of an intestinal cancer model consisting of tumors with a Trp53 R172H allele. | 1.4 | 7 | 55 | 4.05E-03
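To give a feel for what an excess of observed over expected insertions means, the sketch below computes a Poisson upper-tail probability for one row of Table S2.3. This is an illustrative assumption only: the statistical procedure and multiple-testing correction actually used by SBCDDB are not described here, so the output is not expected to reproduce the corrected p-values in the table. The function name `poisson_upper_tail` is hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Illustrative model only; not the SBCDDB method.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0
    # First term P(X = observed), computed in log space to avoid overflow.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    # Add successive terms until they no longer change the sum.
    while term > 0.0 and term > total * 1e-17:
        total += term
        k += 1
        term *= expected / k
    return total


if __name__ == "__main__":
    # Liver HCC-HBV row: 2.2 insertions expected, 46 observed.
    print(poisson_upper_tail(46, 2.2))
```

Under this simple model, the further the observed count sits above the expected count, the smaller the tail probability becomes.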

Table S2.4. Not Shown (See publication)

Table S2.5. Differentially expressed genes comparing parental HT-29 to HT-29 TM9SF2 KO cells

Gene | Fold Change | P-value | Fold change (original values) | Bonferroni correction | FDR p-value
CTNNA2 | -∞ | 1.31E-07 | -41.09 | 3.46E-03 | 4.51E-06
LINC00665 | -∞ | 3.65E-11 | -75.38 | 9.67E-07 | 3.02E-09
SMIM10L1 | -∞ | 1.46E-14 | -1,293.09 | 3.86E-10 | 2.38E-12
SPRR3 | -∞ | 3.80E-09 | -66.01 | 1.01E-04 | 1.96E-07
ZNF260 | -∞ | 1.21E-12 | -83.82 | 3.22E-08 | 1.41E-10
XIST | -182.5 | 5.29E-86 | -178.09 | 1.40E-81 | 5.00E-83
PRSS2 | -119.56 | 3.05E-33 | -108.16 | 8.07E-29 | 2.07E-30
PLXDC2 | -35 | 5.77E-09 | -24.03 | 1.53E-04 | 2.88E-07
TRIB2 | -33.25 | 2.81E-34 | -32.12 | 7.44E-30 | 1.96E-31
FLJ43879 | -28 | 1.81E-11 | -22.53 | 4.80E-07 | 1.61E-09
FAM84A | -24.8 | 1.07E-12 | -21 | 2.84E-08 | 1.26E-10
MAGEA3 | -22.35 | 1.69E-13 | -23.15 | 4.47E-09 | 2.27E-11
SPRR1B | -21.91 | 9.06E-23 | -21.37 | 2.40E-18 | 4.00E-20
ZNF83 | -21.71 | 5.40E-17 | -20.95 | 1.43E-12 | 1.31E-14
SOSTDC1 | -20.75 | 2.60E-31 | -20.58 | 6.89E-27 | 1.57E-28
LINGO1 | -20 | 3.58E-14 | -19.07 | 9.47E-10 | 5.41E-12
GALNT8 | -19.27 | 6.41E-12 | -17.79 | 1.70E-07 | 6.36E-10
SPRR1A | -18.67 | 3.82E-10 | -16.38 | 1.01E-05 | 2.56E-08
GCNT1 | -18.33 | 3.43E-16 | -17.82 | 9.09E-12 | 7.33E-14
C1QTNF9B-AS1 | -17.5 | 1.66E-07 | -14.26 | 4.39E-03 | 5.57E-06
TM9SF2 | -16.45 | 0 | -16.23 | 0 | 0
ANO3 | -14.77 | 5.74E-07 | -13.42 | 0.02 | 1.63E-05
ANXA13 | -14.4 | 7.93E-17 | -14.21 | 2.10E-12 | 1.88E-14
PPP1R14C | -14 | 3.84E-10 | -13.5 | 1.02E-05 | 2.56E-08
PRB2 | -13.41 | 2.29E-07 | -12.42 | 6.06E-03 | 7.38E-06
SCEL | -12.72 | 9.07E-45 | -12.76 | 2.40E-40 | 7.28E-42
ADAM19 | -12.37 | 1.52E-06 | -12.06 | 0.04 | 3.73E-05
TNC | -12.34 | 9.84E-13 | -12.17 | 2.61E-08 | 1.17E-10
CEACAM6 | -12.33 | 0 | -12.25 | 0 | 0
PTPRO | -10.42 | 1.81E-10 | -10.34 | 4.79E-06 | 1.29E-08
MUC2 | -10.25 | 2.65E-17 | -10.23 | 7.02E-13 | 6.68E-15
AUTS2 | -10.19 | 1.16E-11 | -10.45 | 3.08E-07 | 1.08E-09
LOC101929549 | -9.93 | 7.22E-22 | -9.87 | 1.91E-17 | 3.13E-19
IGFL2 | -9.9 | 6.91E-08 | -9.74 | 1.83E-03 | 2.61E-06
C2orf54 | -9.69 | 1.25E-07 | -9.92 | 3.32E-03 | 4.36E-06
PI3 | -9.19 | 7.86E-21 | -9.18 | 2.08E-16 | 2.97E-18
CAMK1D | -9.18 | 2.54E-10 | -8.99 | 6.72E-06 | 1.78E-08
APCDD1L-AS1 | -8.97 | 1.05E-06 | -8.6 | 0.03 | 2.72E-05
LRRN4 | -8.9 | 3.02E-14 | -8.84 | 7.99E-10 | 4.67E-12
SMPD3 | -8.89 | 1.13E-37 | -9 | 2.99E-33 | 8.07E-35
ANKRD45 | -8.74 | 2.06E-08 | -8.34 | 5.46E-04 | 8.85E-07
KIF19 | -8.21 | 1.70E-08 | -7.98 | 4.50E-04 | 7.49E-07
CLEC3A | -8.13 | 9.68E-19 | -8.27 | 2.56E-14 | 2.88E-16
SSTR1 | -8 | 1.35E-06 | -7.75 | 0.04 | 3.36E-05
MAGEA6 | -7.98 | 1.79E-13 | -8.11 | 4.74E-09 | 2.38E-11
SFTPA2 | -7.69 | 4.72E-07 | -7.64 | 0.01 | 1.39E-05
MT1A | -7.58 | 9.13E-07 | -7.21 | 0.02 | 2.41E-05
MALRD1 | -7.2 | 1.65E-07 | -7.04 | 4.37E-03 | 5.55E-06
MAMDC2 | -6.9 | 1.27E-06 | -6.75 | 0.03 | 3.17E-05
PLA2G2A | -6.65 | 1.32E-09 | -6.52 | 3.49E-05 | 7.63E-08
FAXDC2 | -6.62 | 3.57E-13 | -6.58 | 9.46E-09 | 4.59E-11
SULT2B1 | -6.43 | 3.29E-12 | -6.63 | 8.70E-08 | 3.45E-10
SLC28A3 | -6.34 | 4.13E-09 | -6.28 | 1.09E-04 | 2.12E-07
EPHB6 | -6.29 | 2.78E-18 | -6.28 | 7.35E-14 | 7.51E-16
PHGR1 | -6.29 | 7.02E-07 | -6.47 | 0.02 | 1.94E-05
MYH15 | -6.05 | 1.18E-08 | -5.97 | 3.13E-04 | 5.51E-07
CCDC170 | -6 | 2.38E-07 | -5.87 | 6.30E-03 | 7.60E-06
ITGA1 | -5.84 | 6.80E-18 | -5.82 | 1.80E-13 | 1.78E-15
MYLK | -5.79 | 1.41E-18 | -5.77 | 3.73E-14 | 3.96E-16
UPK3B | -5.55 | 7.57E-17 | -5.61 | 2.01E-12 | 1.82E-14
CLCA4 | -5.48 | 7.97E-07 | -5.39 | 0.02 | 2.16E-05
CEACAM5 | -4.88 | 5.85E-24 | -4.96 | 1.55E-19 | 2.72E-21
GUCA2A | -4.88 | 3.59E-09 | -4.84 | 9.51E-05 | 1.86E-07
NEBL | -4.73 | 1.02E-18 | -4.75 | 2.71E-14 | 3.01E-16
RIMBP2 | -4.71 | 1.32E-06 | -4.65 | 0.03 | 3.29E-05
VSIG10L | -4.55 | 0 | -4.55 | 0 | 0
PTHLH | -4.49 | 6.98E-10 | -4.47 | 1.85E-05 | 4.39E-08
NOXO1 | -4.48 | 1.60E-23 | -4.51 | 4.24E-19 | 7.18E-21
MT1E | -4.35 | 8.22E-10 | -4.47 | 2.18E-05 | 4.98E-08
SBSPON | -4.33 | 0 | -4.41 | 0 | 0
LGR6 | -4.21 | 1.22E-06 | -4.32 | 0.03 | 3.08E-05
LOC101929128 | -3.97 | 2.83E-12 | -4 | 7.50E-08 | 2.99E-10
EPHB3 | -3.91 | 0 | -3.92 | 0 | 0
LIMS2 | -3.9 | 0 | -3.89 | 0 | 0
ATF5 | -3.88 | 0 | -3.88 | 0 | 0
ANXA3 | -3.84 | 0 | -3.87 | 0 | 0
ATP10B | -3.84 | 1.97E-16 | -3.88 | 5.21E-12 | 4.41E-14
SEPW1 | -3.84 | 0 | -3.86 | 0 | 0
SLCO3A1 | -3.83 | 1.35E-09 | -3.85 | 3.57E-05 | 7.78E-08
KLK6 | -3.67 | 2.22E-16 | -3.75 | 5.88E-12 | 4.90E-14
ATP2A3 | -3.6 | 8.57E-07 | -3.7 | 0.02 | 2.29E-05
ANXA6 | -3.57 | 1.07E-10 | -3.62 | 2.82E-06 | 7.99E-09
FAM3D | -3.44 | 9.38E-16 | -3.46 | 2.48E-11 | 1.87E-13
GRHL3 | -3.36 | 3.68E-12 | -3.38 | 9.76E-08 | 3.81E-10
MIR622 | -3.29 | 2.94E-07 | -3.35 | 7.78E-03 | 9.05E-06
MEX3B | -3.28 | 9.48E-08 | -3.28 | 2.51E-03 | 3.39E-06
EMP1 | -3.22 | 3.85E-08 | -3.3 | 1.02E-03 | 1.55E-06
AQP1 | -3.21 | 3.92E-11 | -3.25 | 1.04E-06 | 3.23E-09
KRT8 | -3.21 | 0 | -3.27 | 0 | 0
KRT18 | -3.15 | 1.78E-15 | -3.21 | 4.70E-11 | 3.34E-13
TCN1 | -3.12 | 2.92E-09 | -3.14 | 7.74E-05 | 1.56E-07
NRP1 | -3.07 | 4.83E-10 | -3.13 | 1.28E-05 | 3.17E-08
PLK2 | -3.07 | 5.33E-15 | -3.07 | 1.41E-10 | 9.29E-13
THOC3 | -3.02 | 1.56E-08 | -3.05 | 4.12E-04 | 7.00E-07
ECM1 | -2.97 | 1.00E-07 | -3.01 | 2.66E-03 | 3.58E-06
SFN | -2.95 | 0 | -3 | 0 | 0
KRT8P41 | -2.91 | 8.96E-08 | -2.95 | 2.37E-03 | 3.24E-06
RFK | -2.91 | 3.33E-13 | -2.95 | 8.81E-09 | 4.32E-11
SULF2 | -2.87 | 2.41E-08 | -2.94 | 6.37E-04 | 1.02E-06
TPPP | -2.85 | 2.46E-11 | -2.86 | 6.51E-07 | 2.14E-09
LMNB1 | -2.8 | 0 | -2.82 | 0 | 0
RFC3 | -2.8 | 0 | -2.81 | 0 | 0
DUSP4 | -2.79 | 1.05E-12 | -2.82 | 2.77E-08 | 1.23E-10
SGK1 | -2.71 | 0 | -2.74 | 0 | 0
EVL | -2.7 | 2.30E-07 | -2.76 | 6.10E-03 | 7.42E-06
GJB2 | -2.68 | 3.22E-14 | -2.72 | 8.53E-10 | 4.96E-12
HSPH1 | -2.68 | 0 | -2.72 | 0 | 0
TXN | -2.68 | 4.77E-09 | -2.74 | 1.26E-04 | 2.43E-07
DLL4 | -2.66 | 1.87E-07 | -2.67 | 4.96E-03 | 6.19E-06
EIF1AX | -2.64 | 2.92E-11 | -2.69 | 7.74E-07 | 2.48E-09
RRM2 | -2.64 | 0 | -2.67 | 0 | 0
BAG2 | -2.63 | 0 | -2.64 | 0 | 0
ISOC1 | -2.63 | 0 | -2.65 | 0 | 0
SRM | -2.62 | 4.88E-07 | -2.68 | 0.01 | 1.43E-05
KLK11 | -2.61 | 0 | -2.63 | 0 | 0
NDUFAF4 | -2.6 | 6.00E-09 | -2.65 | 1.59E-04 | 2.97E-07
RGS2 | -2.6 | 0 | -2.63 | 0 | 0
COL18A1 | -2.59 | 3.26E-09 | -2.64 | 8.62E-05 | 1.71E-07
CHORDC1 | -2.57 | 2.07E-12 | -2.61 | 5.48E-08 | 2.25E-10
HNRNPAB | -2.53 | 0 | -2.55 | 0 | 0
TUBA4A | -2.52 | 1.80E-14 | -2.56 | 4.76E-10 | 2.87E-12
SVIP | -2.5 | 3.45E-07 | -2.55 | 9.14E-03 | 1.05E-05
MPV17L2 | -2.47 | 2.70E-12 | -2.49 | 7.16E-08 | 2.88E-10
COMMD10 | -2.44 | 7.00E-07 | -2.44 | 0.02 | 1.93E-05
HAUS7 | -2.44 | 1.41E-08 | -2.48 | 3.74E-04 | 6.47E-07
RRP9 | -2.44 | 0 | -2.48 | 0 | 0
CCNB1 | -2.42 | 0 | -2.44 | 0 | 0
BRIX1 | -2.41 | 0 | -2.44 | 0 | 0
CD3EAP | -2.39 | 1.62E-06 | -2.44 | 0.04 | 3.91E-05
SPNS2 | -2.39 | 1.52E-10 | -2.39 | 4.02E-06 | 1.10E-08
BRI3BP | -2.38 | 1.48E-06 | -2.42 | 0.04 | 3.64E-05
ANXA2 | -2.36 | 4.45E-12 | -2.4 | 1.18E-07 | 4.54E-10
CCZ1 | -2.35 | 2.48E-07 | -2.39 | 6.58E-03 | 7.87E-06
NOP16 | -2.35 | 2.22E-16 | -2.37 | 5.88E-12 | 4.90E-14
ASF1A | -2.33 | 1.68E-07 | -2.38 | 4.46E-03 | 5.64E-06
LINC00941 | -2.32 | 4.53E-09 | -2.33 | 1.20E-04 | 2.32E-07
EBNA1BP2 | -2.3 | 9.33E-15 | -2.33 | 2.47E-10 | 1.57E-12
G6PD | -2.3 | 0 | -2.32 | 0 | 0
P2RY2 | -2.3 | 1.24E-07 | -2.3 | 3.28E-03 | 4.32E-06
ANXA2P2 | -2.29 | 2.22E-15 | -2.31 | 5.88E-11 | 4.08E-13
PDCD6 | -2.29 | 4.40E-11 | -2.33 | 1.17E-06 | 3.59E-09
UCP2 | -2.29 | 6.66E-16 | -2.3 | 1.76E-11 | 1.36E-13
CDCA3 | -2.28 | 1.18E-10 | -2.32 | 3.13E-06 | 8.78E-09
GEMIN5 | -2.28 | 1.39E-12 | -2.29 | 3.69E-08 | 1.58E-10
LANCL2 | -2.28 | 3.09E-11 | -2.3 | 8.17E-07 | 2.60E-09
MUC3A | -2.28 | 6.24E-11 | -2.3 | 1.65E-06 | 4.89E-09
CSTF2 | -2.27 | 5.33E-15 | -2.29 | 1.41E-10 | 9.29E-13
DNAJA1 | -2.27 | 3.61E-08 | -2.31 | 9.57E-04 | 1.47E-06
NPM1 | -2.27 | 1.00E-13 | -2.3 | 2.65E-09 | 1.38E-11
PDGFB | -2.27 | 2.87E-08 | -2.3 | 7.61E-04 | 1.20E-06
SHF | -2.27 | 1.02E-09 | -2.29 | 2.69E-05 | 5.96E-08
RASSF6 | -2.25 | 1.27E-06 | -2.26 | 0.03 | 3.17E-05
CCDC86 | -2.24 | 6.66E-16 | -2.27 | 1.76E-11 | 1.36E-13
FAM111B | -2.24 | 6.99E-14 | -2.27 | 1.85E-09 | 9.75E-12
TOMM40 | -2.24 | 4.44E-15 | -2.25 | 1.18E-10 | 7.89E-13
TSPAN1 | -2.24 | 0 | -2.27 | 0 | 0
PTTG1 | -2.22 | 4.44E-15 | -2.24 | 1.18E-10 | 7.89E-13
TIPIN | -2.22 | 2.78E-07 | -2.26 | 7.37E-03 | 8.64E-06
CYCS | -2.21 | 1.19E-12 | -2.24 | 3.14E-08 | 1.38E-10
PA2G4 | -2.21 | 1.67E-09 | -2.25 | 4.41E-05 | 9.43E-08
TNFRSF21 | -2.21 | 1.35E-14 | -2.24 | 3.59E-10 | 2.23E-12
HMGCS1 | -2.19 | 6.66E-16 | -2.21 | 1.76E-11 | 1.36E-13
HSPA4 | -2.19 | 1.64E-11 | -2.2 | 4.34E-07 | 1.47E-09
NDUFS6 | -2.19 | 5.31E-14 | -2.21 | 1.41E-09 | 7.64E-12
CDKN3 | -2.18 | 2.15E-07 | -2.22 | 5.70E-03 | 7.01E-06
CSE1L | -2.18 | 8.62E-13 | -2.22 | 2.28E-08 | 1.03E-10
POLR3K | -2.17 | 2.50E-09 | -2.2 | 6.61E-05 | 1.35E-07
PRSS23 | -2.16 | 1.00E-10 | -2.18 | 2.66E-06 | 7.59E-09
SMIM15 | -2.16 | 6.68E-14 | -2.18 | 1.77E-09 | 9.42E-12
KRT80 | -2.15 | 1.59E-06 | -2.2 | 0.04 | 3.85E-05
PA2G4P4 | -2.15 | 2.41E-07 | -2.17 | 6.38E-03 | 7.68E-06
POLR2L | -2.15 | 3.35E-14 | -2.17 | 8.88E-10 | 5.13E-12
ARHGDIB | -2.14 | 5.70E-09 | -2.18 | 1.51E-04 | 2.86E-07
CCT5 | -2.14 | 3.77E-15 | -2.15 | 1.00E-10 | 6.85E-13
CREB3L1 | -2.14 | 5.10E-12 | -2.17 | 1.35E-07 | 5.14E-10
PMEPA1 | -2.13 | 1.83E-06 | -2.18 | 0.05 | 4.38E-05
CACYBP | -2.12 | 1.27E-14 | -2.14 | 3.35E-10 | 2.09E-12
MACC1 | -2.12 | 2.92E-11 | -2.13 | 7.74E-07 | 2.48E-09
NDUFAF2 | -2.12 | 1.31E-10 | -2.13 | 3.47E-06 | 9.61E-09
SKIV2L2 | -2.12 | 1.30E-08 | -2.13 | 3.44E-04 | 6.03E-07
LPCAT2 | -2.11 | 2.50E-09 | -2.15 | 6.63E-05 | 1.35E-07
SLC7A7 | -2.11 | 1.05E-06 | -2.13 | 0.03 | 2.71E-05
ADRM1 | -2.1 | 1.72E-11 | -2.12 | 4.56E-07 | 1.54E-09
ANKRD22 | -2.1 | 1.36E-08 | -2.13 | 3.61E-04 | 6.27E-07
PDE12 | -2.09 | 3.76E-08 | -2.12 | 9.96E-04 | 1.52E-06
PGGT1B | -2.09 | 1.04E-06 | -2.09 | 0.03 | 2.70E-05
TSPAN13 | -2.09 | 8.94E-13 | -2.11 | 2.37E-08 | 1.07E-10
YWHAH | -2.09 | 2.90E-08 | -2.13 | 7.67E-04 | 1.21E-06
DUSP7 | -2.08 | 5.94E-07 | -2.09 | 0.02 | 1.68E-05
RAD51AP1 | -2.08 | 1.08E-09 | -2.09 | 2.86E-05 | 6.32E-08
AHSA1 | -2.07 | 6.45E-08 | -2.11 | 1.71E-03 | 2.46E-06
EFNB2 | -2.07 | 1.22E-11 | -2.1 | 3.23E-07 | 1.12E-09
MRPS30 | -2.07 | 4.94E-11 | -2.08 | 1.31E-06 | 3.97E-09
UTP15 | -2.07 | 3.32E-08 | -2.08 | 8.80E-04 | 1.37E-06
DIMT1 | -2.06 | 1.23E-10 | -2.08 | 3.27E-06 | 9.10E-09
PIK3R3 | -2.06 | 1.72E-06 | -2.1 | 0.05 | 4.14E-05
SRFBP1 | -2.06 | 4.46E-07 | -2.07 | 0.01 | 1.32E-05
TMX4 | -2.06 | 6.60E-08 | -2.08 | 1.75E-03 | 2.50E-06
CCT6A | -2.05 | 5.81E-11 | -2.08 | 1.54E-06 | 4.60E-09
CDC25A | -2.05 | 2.99E-09 | -2.07 | 7.92E-05 | 1.59E-07
DDX27 | -2.05 | 1.46E-09 | -2.08 | 3.86E-05 | 8.35E-08
GPX1 | -2.05 | 1.20E-11 | -2.07 | 3.17E-07 | 1.10E-09
PCNA | -2.05 | 4.37E-08 | -2.08 | 1.16E-03 | 1.72E-06
POF1B | -2.05 | 1.29E-12 | -2.07 | 3.42E-08 | 1.49E-10
AK1 | -2.04 | 4.74E-11 | -2.06 | 1.26E-06 | 3.83E-09
ANXA1 | -2.04 | 2.65E-11 | -2.07 | 7.02E-07 | 2.29E-09
CCNA2 | -2.04 | 7.82E-12 | -2.07 | 2.07E-07 | 7.59E-10
GCLM | -2.04 | 3.17E-10 | -2.06 | 8.40E-06 | 2.19E-08
MISP | -2.04 | 3.98E-08 | -2.07 | 1.05E-03 | 1.59E-06
NUDC | -2.04 | 7.19E-13 | -2.06 | 1.90E-08 | 8.77E-11
THG1L | -2.04 | 7.76E-09 | -2.06 | 2.06E-04 | 3.74E-07
TTC1 | -2.04 | 7.87E-10 | -2.05 | 2.08E-05 | 4.83E-08
CDC20 | -2.03 | 1.79E-13 | -2.05 | 4.74E-09 | 2.38E-11
NOC4L | -2.03 | 2.37E-09 | -2.05 | 6.28E-05 | 1.29E-07
NR2F2 | -2.03 | 1.53E-08 | -2.06 | 4.04E-04 | 6.91E-07
RARS | -2.03 | 1.55E-10 | -2.04 | 4.10E-06 | 1.12E-08
ANXA5 | -2.02 | 2.03E-07 | -2.03 | 5.38E-03 | 6.66E-06
DCPS | -2.02 | 9.70E-12 | -2.04 | 2.57E-07 | 9.27E-10
DSCC1 | -2.02 | 1.94E-09 | -2.05 | 5.15E-05 | 1.08E-07
IDH3A | -2.02 | 7.04E-07 | -2.05 | 0.02 | 1.94E-05
PPP2CA | -2.02 | 5.86E-13 | -2.04 | 1.55E-08 | 7.25E-11
ANTXR2 | -2.01 | 1.01E-07 | -2.02 | 2.69E-03 | 3.60E-06
ATG3 | -2.01 | 4.04E-12 | -2.03 | 1.07E-07 | 4.17E-10
PFDN2 | -2.01 | 6.12E-07 | -2.04 | 0.02 | 1.72E-05
UBIAD1 | -2.01 | 2.25E-09 | -2.03 | 5.96E-05 | 1.23E-07
ABCE1 | -2 | 4.96E-09 | -2.03 | 1.31E-04 | 2.52E-07
AURKA | -2 | 7.39E-11 | -2.03 | 1.96E-06 | 5.74E-09
PSMG1 | -2 | 1.44E-06 | -2.04 | 0.04 | 3.55E-05
RAP2B | -2 | 8.79E-07 | -2.04 | 0.02 | 2.34E-05
AMD1 | -1.99 | 4.47E-09 | -2.02 | 1.18E-04 | 2.29E-07
MUC13 | -1.99 | 6.06E-12 | -2 | 1.60E-07 | 6.08E-10
NSUN2 | -1.99 | 3.93E-10 | -2.01 | 1.04E-05 | 2.61E-08
RDH10 | -1.99 | 8.11E-11 | -2.01 | 2.15E-06 | 6.22E-09
SYNCRIP | -1.99 | 2.85E-07 | -2.02 | 7.56E-03 | 8.83E-06
UBE2S | -1.99 | 7.74E-11 | -2.01 | 2.05E-06 | 5.98E-09
CALB2 | -1.98 | 9.01E-10 | -2 | 2.39E-05 | 5.39E-08
PSMA3 | -1.98 | 9.13E-07 | -2.01 | 0.02 | 2.41E-05
SUB1 | -1.98 | 3.09E-10 | -2.01 | 8.19E-06 | 2.14E-08
UHRF1 | -1.97 | 3.02E-09 | -2 | 8.00E-05 | 1.60E-07
ZBTB4 | 2.01 | 3.54E-07 | 2 | 9.37E-03 | 1.08E-05
ARMCX3 | 2.02 | 1.47E-07 | 2 | 3.89E-03 | 5.03E-06
KRCC1 | 2.03 | 1.08E-06 | 2 | 0.03 | 2.77E-05
PMM1 | 2.04 | 1.81E-06 | 2.03 | 0.05 | 4.35E-05
PSMD5-AS1 | 2.04 | 1.83E-06 | 2.02 | 0.05 | 4.37E-05
ZNF160 | 2.04 | 1.36E-08 | 2.02 | 3.60E-04 | 6.27E-07
BAHCC1 | 2.05 | 1.03E-06 | 2.02 | 0.03 | 2.66E-05
EXOC7 | 2.05 | 3.49E-10 | 2.03 | 9.24E-06 | 2.38E-08
SPRY4 | 2.05 | 7.25E-07 | 2.05 | 0.02 | 1.99E-05
DOCK6 | 2.06 | 1.15E-06 | 2.06 | 0.03 | 2.92E-05
MAPK8IP3 | 2.06 | 6.42E-08 | 2.05 | 1.70E-03 | 2.45E-06
ATG2B | 2.07 | 1.44E-06 | 2.06 | 0.04 | 3.54E-05
LINC00674 | 2.07 | 3.70E-07 | 2.07 | 9.81E-03 | 1.12E-05
GIGYF1 | 2.08 | 6.49E-08 | 2.07 | 1.72E-03 | 2.47E-06
LGR5 | 2.08 | 2.81E-07 | 2.06 | 7.44E-03 | 8.71E-06
NEK9 | 2.08 | 7.75E-09 | 2.05 | 2.05E-04 | 3.74E-07
PTPRB | 2.08 | 7.46E-08 | 2.07 | 1.98E-03 | 2.79E-06
WDR59 | 2.09 | 1.41E-07 | 2.09 | 3.72E-03 | 4.83E-06
CTPS2 | 2.1 | 8.77E-09 | 2.08 | 2.32E-04 | 4.19E-07
HMGCL | 2.1 | 1.62E-09 | 2.08 | 4.30E-05 | 9.23E-08
ITM2C | 2.1 | 2.67E-11 | 2.08 | 7.08E-07 | 2.31E-09
SZT2 | 2.1 | 6.20E-08 | 2.09 | 1.64E-03 | 2.38E-06
ANKRD13A | 2.12 | 1.32E-08 | 2.1 | 3.50E-04 | 6.10E-07
VAT1 | 2.12 | 8.03E-10 | 2.11 | 2.13E-05 | 4.89E-08
CRTAP | 2.13 | 1.10E-07 | 2.1 | 2.91E-03 | 3.87E-06
FAM53C | 2.13 | 2.14E-09 | 2.1 | 5.66E-05 | 1.18E-07
SLC23A2 | 2.14 | 1.61E-06 | 2.14 | 0.04 | 3.90E-05
CDK18 | 2.15 | 1.37E-08 | 2.15 | 3.63E-04 | 6.30E-07
GRB7 | 2.15 | 2.68E-08 | 2.15 | 7.11E-04 | 1.14E-06
PLEKHG4 | 2.15 | 4.28E-11 | 2.12 | 1.13E-06 | 3.50E-09
TP53I13 | 2.15 | 2.97E-09 | 2.12 | 7.86E-05 | 1.58E-07
CHD3 | 2.16 | 7.63E-07 | 2.16 | 0.02 | 2.08E-05
PLCG1 | 2.16 | 2.92E-11 | 2.14 | 7.74E-07 | 2.48E-09
ZCCHC11 | 2.16 | 4.50E-08 | 2.15 | 1.19E-03 | 1.77E-06
ASIC1 | 2.17 | 6.63E-08 | 2.16 | 1.76E-03 | 2.51E-06
PDE10A | 2.17 | 6.47E-07 | 2.16 | 0.02 | 1.81E-05
DBNDD1 | 2.18 | 2.81E-08 | 2.15 | 7.44E-04 | 1.18E-06
OPTN | 2.18 | 4.12E-07 | 2.14 | 0.01 | 1.23E-05
PHF21A | 2.18 | 1.42E-08 | 2.16 | 3.76E-04 | 6.48E-07
KDM4B | 2.19 | 5.25E-07 | 2.16 | 0.01 | 1.52E-05
HERC2P2 | 2.2 | 5.09E-07 | 2.18 | 0.01 | 1.48E-05
KLF6 | 2.2 | 1.36E-12 | 2.19 | 3.61E-08 | 1.57E-10
LMBR1L | 2.2 | 8.09E-07 | 2.18 | 0.02 | 2.19E-05
CCDC57 | 2.21 | 5.39E-07 | 2.2 | 0.01 | 1.55E-05
GGH | 2.21 | 1.75E-09 | 2.18 | 4.64E-05 | 9.88E-08
ZFP36 | 2.21 | 3.35E-09 | 2.18 | 8.88E-05 | 1.76E-07
IFT80 | 2.22 | 1.98E-07 | 2.22 | 5.23E-03 | 6.51E-06
LOC100129034 | 2.22 | 4.78E-07 | 2.21 | 0.01 | 1.40E-05
SMARCA1 | 2.22 | 2.24E-09 | 2.21 | 5.94E-05 | 1.23E-07
MACROD1 | 2.23 | 5.76E-10 | 2.19 | 1.53E-05 | 3.70E-08
CLDN2 | 2.24 | 7.98E-11 | 2.21 | 2.11E-06 | 6.14E-09
WDR60 | 2.24 | 3.32E-08 | 2.22 | 8.78E-04 | 1.37E-06
WDR19 | 2.25 | 5.31E-07 | 2.24 | 0.01 | 1.53E-05
KIAA1549 | 2.26 | 3.73E-07 | 2.22 | 9.88E-03 | 1.13E-05
LRP1 | 2.26 | 1.01E-11 | 2.24 | 2.68E-07 | 9.62E-10
RECQL5 | 2.26 | 3.80E-10 | 2.25 | 1.01E-05 | 2.55E-08
YPEL5 | 2.26 | 3.02E-11 | 2.23 | 8.00E-07 | 2.56E-09
MAPK3 | 2.27 | 1.82E-08 | 2.26 | 4.81E-04 | 7.93E-07
TDRD7 | 2.27 | 1.40E-06 | 2.23 | 0.04 | 3.48E-05
CELSR3 | 2.29 | 5.55E-11 | 2.27 | 1.47E-06 | 4.41E-09
PPARD | 2.3 | 1.22E-10 | 2.28 | 3.23E-06 | 9.03E-09
LINC00339 | 2.32 | 1.20E-07 | 2.3 | 3.18E-03 | 4.20E-06
MIF4GD | 2.32 | 1.81E-07 | 2.3 | 4.80E-03 | 6.01E-06
TGFB1 | 2.32 | 1.96E-12 | 2.3 | 5.19E-08 | 2.17E-10
TGIF2 | 2.32 | 5.73E-11 | 2.3 | 1.52E-06 | 4.54E-09
PKN3 | 2.33 | 2.52E-07 | 2.31 | 6.66E-03 | 7.94E-06
XYLT2 | 2.33 | 1.49E-11 | 2.31 | 3.95E-07 | 1.35E-09
ZDHHC8 | 2.33 | 8.95E-07 | 2.33 | 0.02 | 2.38E-05
BCL6 | 2.34 | 1.42E-06 | 2.34 | 0.04 | 3.50E-05
PLA2G6 | 2.34 | 6.74E-07 | 2.33 | 0.02 | 1.87E-05
C1RL | 2.35 | 6.19E-11 | 2.34 | 1.64E-06 | 4.87E-09
NMRK1 | 2.35 | 1.39E-06 | 2.33 | 0.04 | 3.45E-05
ZNF767P | 2.35 | 2.22E-07 | 2.33 | 5.88E-03 | 7.19E-06
EME1 | 2.36 | 1.55E-07 | 2.35 | 4.10E-03 | 5.28E-06
EPN2 | 2.36 | 3.53E-08 | 2.37 | 9.35E-04 | 1.44E-06
HERC3 | 2.36 | 1.80E-06 | 2.33 | 0.05 | 4.31E-05
KREMEN1 | 2.36 | 1.03E-06 | 2.32 | 0.03 | 2.66E-05
BIVM | 2.37 | 8.25E-08 | 2.35 | 2.18E-03 | 3.02E-06
PPP1R15A | 2.37 | 2.01E-12 | 2.35 | 5.33E-08 | 2.21E-10
ZBTB10 | 2.37 | 1.52E-06 | 2.34 | 0.04 | 3.73E-05
CALCOCO1 | 2.38 | 5.86E-08 | 2.38 | 1.55E-03 | 2.26E-06
SELENBP1 | 2.38 | 1.08E-07 | 2.37 | 2.86E-03 | 3.82E-06
STAT2 | 2.38 | 2.22E-07 | 2.38 | 5.88E-03 | 7.19E-06
MEGF6 | 2.39 | 5.93E-07 | 2.38 | 0.02 | 1.68E-05
TMEM159 | 2.39 | 4.91E-08 | 2.39 | 1.30E-03 | 1.91E-06
MEGF8 | 2.4 | 1.00E-12 | 2.38 | 2.66E-08 | 1.19E-10
SLC35E2 | 2.4 | 1.23E-08 | 2.39 | 3.27E-04 | 5.74E-07
HS1BP3 | 2.41 | 2.80E-08 | 2.4 | 7.42E-04 | 1.18E-06
NDST1 | 2.42 | 1.35E-06 | 2.43 | 0.04 | 3.36E-05
NLRX1 | 2.42 | 8.29E-07 | 2.38 | 0.02 | 2.23E-05
TNRC6C-AS1 | 2.42 | 2.33E-10 | 2.39 | 6.16E-06 | 1.64E-08
BRD8 | 2.43 | 5.62E-08 | 2.38 | 1.49E-03 | 2.18E-06
REPS2 | 2.45 | 1.77E-08 | 2.44 | 4.68E-04 | 7.74E-07
WSB1 | 2.46 | 2.30E-12 | 2.44 | 6.09E-08 | 2.48E-10
IFI35 | 2.48 | 8.55E-07 | 2.44 | 0.02 | 2.29E-05
PRCP | 2.5 | 4.32E-14 | 2.47 | 1.14E-09 | 6.39E-12
TNFRSF10D | 2.5 | 1.78E-06 | 2.46 | 0.05 | 4.27E-05
INPPL1 | 2.51 | 2.49E-16 | 2.5 | 6.59E-12 | 5.35E-14
NCMAP | 2.51 | 2.96E-07 | 2.5 | 7.84E-03 | 9.10E-06
PPP2R5B | 2.51 | 2.18E-08 | 2.49 | 5.77E-04 | 9.32E-07
IFT140 | 2.52 | 6.09E-09 | 2.5 | 1.61E-04 | 3.00E-07
H6PD | 2.53 | 1.25E-07 | 2.49 | 3.32E-03 | 4.36E-06
TMEM254 | 2.53 | 6.02E-07 | 2.5 | 0.02 | 1.70E-05
KIAA0895L | 2.56 | 8.99E-10 | 2.56 | 2.38E-05 | 5.39E-08
ALPK1 | 2.58 | 4.34E-08 | 2.55 | 1.15E-03 | 1.71E-06
SUZ12P1 | 2.58 | 1.87E-06 | 2.53 | 0.05 | 4.46E-05
KLHL24 | 2.59 | 1.13E-06 | 2.6 | 0.03 | 2.88E-05
SLC25A25-AS1 | 2.59 | 5.62E-07 | 2.58 | 0.01 | 1.60E-05
GDF11 | 2.6 | 4.96E-07 | 2.56 | 0.01 | 1.45E-05
PPP2R3A | 2.6 | 6.26E-07 | 2.58 | 0.02 | 1.76E-05
LRRC37B | 2.61 | 1.01E-08 | 2.59 | 2.68E-04 | 4.79E-07
IDUA | 2.63 | 4.16E-08 | 2.6 | 1.10E-03 | 1.65E-06
C22orf46 | 2.64 | 1.62E-08 | 2.63 | 4.29E-04 | 7.22E-07
DENND5A | 2.64 | 3.21E-11 | 2.61 | 8.50E-07 | 2.69E-09
MAP3K10 | 2.64 | 8.77E-09 | 2.6 | 2.32E-04 | 4.19E-07
ULK1 | 2.64 | 1.15E-13 | 2.61 | 3.03E-09 | 1.57E-11
HEXDC | 2.65 | 9.75E-10 | 2.62 | 2.58E-05 | 5.75E-08
AMIGO2 | 2.66 | 1.66E-08 | 2.61 | 4.40E-04 | 7.35E-07
CC2D2A | 2.66 | 2.57E-07 | 2.64 | 6.80E-03 | 8.06E-06
LMF1 | 2.66 | 7.59E-08 | 2.63 | 2.01E-03 | 2.82E-06
ACAD10 | 2.67 | 5.55E-10 | 2.66 | 1.47E-05 | 3.59E-08
ENDOV | 2.67 | 1.53E-08 | 2.64 | 4.05E-04 | 6.91E-07
KIAA0141 | 2.67 | 2.38E-07 | 2.62 | 6.30E-03 | 7.60E-06
TUBG2 | 2.68 | 5.11E-07 | 2.64 | 0.01 | 1.48E-05
UNC119 | 2.68 | 8.65E-07 | 2.62 | 0.02 | 2.31E-05
ZNF655 | 2.68 | 4.95E-12 | 2.67 | 1.31E-07 | 5.00E-10
ABCC3 | 2.69 | 6.43E-19 | 2.68 | 1.70E-14 | 2.00E-16
FOXO4 | 2.69 | 4.13E-08 | 2.67 | 1.09E-03 | 1.64E-06
PRDM5 | 2.69 | 5.43E-07 | 2.68 | 0.01 | 1.56E-05
SPACA6P | 2.69 | 5.23E-07 | 2.66 | 0.01 | 1.51E-05
THRA | 2.69 | 8.18E-14 | 2.66 | 2.17E-09 | 1.13E-11
ZFP90 | 2.7 | 3.39E-10 | 2.68 | 8.98E-06 | 2.33E-08
ALDH6A1 | 2.71 | 8.48E-09 | 2.68 | 2.25E-04 | 4.07E-07
FKBP10 | 2.71 | 3.28E-10 | 2.66 | 8.68E-06 | 2.26E-08
RAB40B | 2.71 | 3.48E-08 | 2.65 | 9.22E-04 | 1.42E-06
SYTL1 | 2.71 | 2.88E-10 | 2.69 | 7.62E-06 | 2.00E-08
FRY | 2.73 | 1.18E-06 | 2.74 | 0.03 | 2.99E-05
ZSCAN30 | 2.73 | 4.46E-07 | 2.73 | 0.01 | 1.32E-05
CCDC24 | 2.75 | 7.13E-07 | 2.71 | 0.02 | 1.96E-05
HOOK2 | 2.75 | 2.09E-09 | 2.74 | 5.53E-05 | 1.16E-07
NRSN2 | 2.75 | 6.96E-12 | 2.71 | 1.84E-07 | 6.85E-10
PCED1A | 2.75 | 1.74E-14 | 2.74 | 4.60E-10 | 2.79E-12
CLSTN3 | 2.76 | 1.97E-12 | 2.75 | 5.22E-08 | 2.17E-10
SPATA20 | 2.76 | 1.55E-10 | 2.71 | 4.10E-06 | 1.12E-08
JUN | 2.77 | 1.92E-15 | 2.73 | 5.08E-11 | 3.58E-13
NME4 | 2.77 | 1.42E-08 | 2.72 | 3.75E-04 | 6.47E-07
RHPN1 | 2.78 | 9.04E-16 | 2.75 | 2.39E-11 | 1.83E-13
GHDC | 2.79 | 1.52E-11 | 2.75 | 4.02E-07 | 1.37E-09
CTSV | 2.8 | 6.52E-11 | 2.76 | 1.73E-06 | 5.08E-09
STAT5B | 2.8 | 3.38E-13 | 2.76 | 8.94E-09 | 4.36E-11
BCORL1 | 2.81 | 1.46E-08 | 2.81 | 3.86E-04 | 6.61E-07
CAPS | 2.81 | 4.73E-09 | 2.78 | 1.25E-04 | 2.41E-07
KIAA0195 | 2.82 | 3.44E-14 | 2.81 | 9.11E-10 | 5.24E-12
APH1B | 2.83 | 9.14E-08 | 2.79 | 2.42E-03 | 3.29E-06
BCAS3 | 2.83 | 8.02E-08 | 2.81 | 2.12E-03 | 2.96E-06
GOLGA8A | 2.83 | 1.00E-14 | 2.8 | 2.66E-10 | 1.68E-12
KCNAB2 | 2.86 | 9.61E-10 | 2.84 | 2.54E-05 | 5.68E-08
KDM3A | 2.86 | 6.01E-20 | 2.83 | 1.59E-15 | 2.07E-17
ZNF334 | 2.86 | 2.67E-12 | 2.84 | 7.08E-08 | 2.85E-10
FBXO36 | 2.87 | 1.58E-06 | 2.83 | 0.04 | 3.85E-05
TTC6 | 2.87 | 8.51E-07 | 2.83 | 0.02 | 2.28E-05
GALK1 | 2.89 | 6.72E-07 | 2.84 | 0.02 | 1.86E-05
JUNB | 2.89 | 9.26E-08 | 2.83 | 2.45E-03 | 3.33E-06
RNF213 | 2.9 | 1.72E-21 | 2.87 | 4.55E-17 | 7.23E-19
CABP4 | 2.92 | 5.66E-10 | 2.89 | 1.50E-05 | 3.64E-08
NDRG2 | 2.93 | 6.44E-10 | 2.89 | 1.70E-05 | 4.08E-08
PABPC1L | 2.93 | 3.78E-16 | 2.91 | 1.00E-11 | 8.00E-14
AHSA2 | 2.94 | 1.44E-11 | 2.93 | 3.81E-07 | 1.31E-09
FABP6 | 2.94 | 7.06E-08 | 2.9 | 1.87E-03 | 2.66E-06
ABTB1 | 2.95 | 3.53E-11 | 2.91 | 9.35E-07 | 2.93E-09
TMEM74B | 2.95 | 1.01E-07 | 2.89 | 2.68E-03 | 3.60E-06
NRBP2 | 2.96 | 3.27E-13 | 2.93 | 8.67E-09 | 4.27E-11
RELL2 | 2.97 | 1.22E-09 | 2.93 | 3.23E-05 | 7.11E-08
SGSH | 2.97 | 4.92E-13 | 2.94 | 1.30E-08 | 6.14E-11
GTF2IP20 | 2.98 | 6.55E-07 | 2.97 | 0.02 | 1.83E-05
LRRC23 | 2.98 | 3.41E-07 | 2.94 | 9.02E-03 | 1.04E-05
CUL9 | 2.99 | 1.27E-07 | 3 | 3.36E-03 | 4.39E-06
AGO4 | 3.01 | 3.59E-13 | 2.97 | 9.51E-09 | 4.59E-11
PON3 | 3.01 | 2.67E-08 | 2.98 | 7.08E-04 | 1.13E-06
SOX12 | 3.01 | 4.19E-14 | 2.97 | 1.11E-09 | 6.27E-12
CYP3A5 | 3.02 | 9.62E-19 | 3 | 2.55E-14 | 2.88E-16
GGT6 | 3.02 | 1.30E-09 | 3 | 3.43E-05 | 7.51E-08
RCN3 | 3.02 | 1.93E-11 | 2.99 | 5.10E-07 | 1.70E-09
DEF6 | 3.03 | 2.64E-08 | 2.99 | 6.98E-04 | 1.12E-06
ZNF34 | 3.03 | 8.79E-08 | 2.99 | 2.33E-03 | 3.19E-06
RHOQ | 3.04 | 1.38E-16 | 3.02 | 3.66E-12 | 3.12E-14
VANGL2 | 3.04 | 6.17E-09 | 3.05 | 1.63E-04 | 3.03E-07
FAM13A | 3.05 | 1.60E-09 | 3.04 | 4.23E-05 | 9.11E-08
ICAM1 | 3.05 | 1.79E-09 | 3.02 | 4.74E-05 | 1.00E-07
CCDC78 | 3.06 | 7.12E-08 | 3.01 | 1.89E-03 | 2.68E-06
LZTS3 | 3.07 | 4.98E-09 | 3.07 | 1.32E-04 | 2.52E-07
TXNIP | 3.07 | 5.16E-14 | 3.06 | 1.37E-09 | 7.51E-12
SRP14-AS1 | 3.08 | 7.49E-10 | 3.03 | 1.98E-05 | 4.65E-08
ANKZF1 | 3.11 | 6.74E-15 | 3.09 | 1.79E-10 | 1.15E-12
FAM160B2 | 3.11 | 8.19E-08 | 3.06 | 2.17E-03 | 3.00E-06
GRAMD1C | 3.12 | 2.35E-07 | 3.11 | 6.22E-03 | 7.54E-06
KLHL22 | 3.12 | 1.83E-07 | 3.05 | 4.84E-03 | 6.05E-06
P4HA1 | 3.12 | 1.21E-06 | 3.04 | 0.03 | 3.05E-05
FBXL20 | 3.13 | 1.71E-07 | 3.15 | 4.53E-03 | 5.71E-06
NHS | 3.14 | 9.11E-21 | 3.11 | 2.41E-16 | 3.40E-18
ZNF337 | 3.14 | 4.40E-16 | 3.1 | 1.16E-11 | 9.24E-14
BACE1 | 3.15 | 1.12E-16 | 3.11 | 2.96E-12 | 2.57E-14
GUSBP11 | 3.15 | 5.55E-10 | 3.11 | 1.47E-05 | 3.59E-08
PAQR8 | 3.17 | 5.28E-11 | 3.15 | 1.40E-06 | 4.22E-09
SHISA4 | 3.17 | 9.63E-07 | 3.12 | 0.03 | 2.52E-05
HID1 | 3.18 | 1.29E-09 | 3.19 | 3.42E-05 | 7.50E-08
BDH2 | 3.21 | 2.06E-12 | 3.16 | 5.46E-08 | 2.25E-10
C9orf172 | 3.21 | 1.26E-08 | 3.2 | 3.33E-04 | 5.85E-07
GLS2 | 3.21 | 1.39E-11 | 3.16 | 3.69E-07 | 1.27E-09
KLHL5 | 3.21 | 2.03E-10 | 3.18 | 5.38E-06 | 1.44E-08
PYROXD2 | 3.21 | 5.55E-09 | 3.21 | 1.47E-04 | 2.78E-07
CCDC40 | 3.22 | 1.72E-08 | 3.21 | 4.56E-04 | 7.57E-07
PDK4 | 3.22 | 3.71E-08 | 3.23 | 9.81E-04 | 1.50E-06
CSAD | 3.23 | 1.25E-07 | 3.22 | 3.30E-03 | 4.34E-06
NBR2 | 3.24 | 2.69E-08 | 3.23 | 7.12E-04 | 1.14E-06
ARHGEF40 | 3.26 | 1.22E-20 | 3.23 | 3.23E-16 | 4.49E-18
CFAP69 | 3.26 | 3.39E-07 | 3.25 | 8.97E-03 | 1.03E-05
DDIT3 | 3.28 | 9.26E-16 | 3.24 | 2.45E-11 | 1.86E-13
GSAP | 3.28 | 1.42E-13 | 3.26 | 3.76E-09 | 1.93E-11
LOC146880 | 3.28 | 4.03E-19 | 3.26 | 1.07E-14 | 1.32E-16
FMNL3 | 3.29 | 1.46E-09 | 3.25 | 3.87E-05 | 8.35E-08
GATS | 3.29 | 7.13E-08 | 3.3 | 1.89E-03 | 2.68E-06
LOC284454 | 3.29 | 1.54E-08 | 3.3 | 4.08E-04 | 6.95E-07
LYSMD1 | 3.29 | 1.59E-06 | 3.25 | 0.04 | 3.87E-05
PCYOX1L | 3.3 | 2.08E-09 | 3.25 | 5.51E-05 | 1.16E-07
TBX6 | 3.31 | 2.05E-07 | 3.27 | 5.42E-03 | 6.70E-06
ENGASE | 3.32 | 4.12E-13 | 3.32 | 1.09E-08 | 5.19E-11
MYO7A | 3.33 | 5.52E-07 | 3.33 | 0.01 | 1.58E-05
PLD2 | 3.34 | 3.23E-09 | 3.34 | 8.55E-05 | 1.70E-07
TXNRD3 | 3.34 | 5.53E-07 | 3.29 | 0.01 | 1.58E-05
ANO9 | 3.35 | 1.79E-20 | 3.32 | 4.74E-16 | 6.50E-18
CACNB1 | 3.37 | 6.39E-12 | 3.33 | 1.69E-07 | 6.36E-10
SPIRE2 | 3.39 | 5.79E-16 | 3.37 | 1.53E-11 | 1.21E-13
CD27-AS1 | 3.4 | 3.87E-08 | 3.37 | 1.02E-03 | 1.56E-06
CATSPER2 | 3.44 | 1.34E-06 | 3.46 | 0.04 | 3.33E-05
CDK5RAP3 | 3.44 | 3.08E-15 | 3.44 | 8.16E-11 | 5.62E-13
FER1L4 | 3.44 | 1.26E-15 | 3.45 | 3.34E-11 | 2.46E-13
PLCE1 | 3.47 | 8.57E-10 | 3.44 | 2.27E-05 | 5.15E-08
DYNC2H1 | 3.48 | 2.07E-11 | 3.45 | 5.48E-07 | 1.82E-09
RAB29 | 3.48 | 9.22E-10 | 3.48 | 2.44E-05 | 5.48E-08
STAT5A | 3.48 | 3.40E-08 | 3.46 | 9.01E-04 | 1.40E-06
GYLTL1B | 3.49 | 2.60E-11 | 3.45 | 6.88E-07 | 2.26E-09
NICN1 | 3.5 | 6.92E-09 | 3.47 | 1.83E-04 | 3.36E-07
RUSC2 | 3.5 | 5.17E-11 | 3.47 | 1.37E-06 | 4.14E-09
ZNF395 | 3.5 | 1.10E-07 | 3.42 | 2.93E-03 | 3.89E-06
IFNLR1 | 3.52 | 1.03E-07 | 3.47 | 2.73E-03 | 3.66E-06
C7orf31 | 3.54 | 3.81E-08 | 3.49 | 1.01E-03 | 1.53E-06
PAN2 | 3.56 | 2.27E-14 | 3.57 | 6.00E-10 | 3.55E-12
SLC7A2 | 3.57 | 1.78E-12 | 3.52 | 4.71E-08 | 1.98E-10
DNAJC18 | 3.59 | 1.60E-08 | 3.53 | 4.24E-04 | 7.16E-07
ENO3 | 3.6 | 2.20E-07 | 3.53 | 5.81E-03 | 7.13E-06
SEMA4A | 3.61 | 1.66E-08 | 3.56 | 4.40E-04 | 7.35E-07
NEAT1 | 3.62 | 1.66E-09 | 3.65 | 4.40E-05 | 9.42E-08
ECHDC2 | 3.63 | 1.54E-15 | 3.6 | 4.08E-11 | 2.96E-13
LOC401320 | 3.64 | 7.92E-10 | 3.62 | 2.10E-05 | 4.83E-08
ACSF2 | 3.66 | 5.40E-24 | 3.62 | 1.43E-19 | 2.60E-21
FGF11 | 3.67 | 2.18E-08 | 3.61 | 5.76E-04 | 9.32E-07
HDAC5 | 3.67 | 2.08E-18 | 3.65 | 5.52E-14 | 5.81E-16
LINC01061 | 3.67 | 1.96E-08 | 3.66 | 5.18E-04 | 8.44E-07
EZH1 | 3.68 | 2.89E-20 | 3.65 | 7.65E-16 | 1.03E-17
SCX | 3.68 | 1.30E-06 | 3.62 | 0.03 | 3.26E-05
CROCC | 3.69 | 1.11E-07 | 3.6 | 2.95E-03 | 3.92E-06
SLC27A1 | 3.69 | 6.29E-09 | 3.69 | 1.67E-04 | 3.08E-07
ENO2 | 3.7 | 2.61E-21 | 3.64 | 6.91E-17 | 1.05E-18
MALAT1 | 3.7 | 4.17E-10 | 3.71 | 1.10E-05 | 2.77E-08
SLC38A11 | 3.76 | 6.40E-12 | 3.76 | 1.70E-07 | 6.36E-10
GOLGA8B | 3.8 | 4.93E-12 | 3.76 | 1.31E-07 | 5.00E-10
ADPRHL1 | 3.81 | 6.73E-08 | 3.76 | 1.78E-03 | 2.55E-06
LINC00910 | 3.81 | 8.26E-10 | 3.79 | 2.19E-05 | 5.00E-08
LPIN3 | 3.81 | 1.70E-15 | 3.78 | 4.51E-11 | 3.24E-13
EGLN3 | 3.82 | 5.49E-10 | 3.74 | 1.45E-05 | 3.56E-08
GSTM4 | 3.82 | 2.09E-07 | 3.74 | 5.55E-03 | 6.84E-06
WDR27 | 3.85 | 9.92E-16 | 3.83 | 2.63E-11 | 1.96E-13
PPP1R1B | 3.87 | 1.79E-32 | 3.84 | 4.74E-28 | 1.19E-29
CTSK | 3.89 | 1.17E-08 | 3.85 | 3.11E-04 | 5.49E-07
LOC155060 | 3.89 | 1.87E-11 | 3.86 | 4.95E-07 | 1.65E-09
SERINC4 | 3.89 | 1.09E-06 | 3.89 | 0.03 | 2.80E-05
GALNT6 | 3.91 | 5.03E-07 | 3.93 | 0.01 | 1.46E-05
WDR66 | 3.91 | 1.75E-06 | 3.82 | 0.05 | 4.20E-05
ATG16L2 | 3.92 | 5.93E-14 | 3.86 | 1.57E-09 | 8.48E-12
ASIC3 | 3.95 | 3.76E-09 | 3.9 | 9.96E-05 | 1.94E-07
MC1R | 3.96 | 1.94E-10 | 3.95 | 5.14E-06 | 1.38E-08
SDCBP2-AS1 | 3.99 | 8.26E-08 | 3.99 | 2.19E-03 | 3.02E-06
ACCS | 4 | 1.30E-10 | 4 | 3.43E-06 | 9.54E-09
USP32P2 | 4 | 3.58E-09 | 4.01 | 9.47E-05 | 1.86E-07
MYO15B | 4.01 | 6.67E-13 | 3.94 | 1.77E-08 | 8.17E-11
CNN3 | 4.02 | 6.48E-15 | 3.99 | 1.72E-10 | 1.11E-12
EDN1 | 4.07 | 2.86E-21 | 4.01 | 7.58E-17 | 1.13E-18
PGM1 | 4.07 | 1.37E-12 | 4.06 | 3.63E-08 | 1.57E-10
RNF207 | 4.07 | 2.85E-08 | 4.06 | 7.54E-04 | 1.19E-06
NSUN7 | 4.08 | 1.80E-11 | 4.06 | 4.78E-07 | 1.61E-09
AGAP2-AS1 | 4.09 | 2.66E-07 | 4 | 7.04E-03 | 8.30E-06
HSF4 | 4.1 | 1.32E-18 | 4.08 | 3.49E-14 | 3.75E-16
BTN2A2 | 4.11 | 3.61E-12 | 4.06 | 9.57E-08 | 3.75E-10
BLMH | 4.13 | 1.32E-16 | 4.06 | 3.51E-12 | 3.02E-14
HHAT | 4.15 | 8.77E-07 | 4.06 | 0.02 | 2.34E-05
UAP1L1 | 4.16 | 4.95E-15 | 4.12 | 1.31E-10 | 8.74E-13
LINC00950 | 4.17 | 9.89E-07 | 4.07 | 0.03 | 2.58E-05
FBF1 | 4.21 | 1.21E-10 | 4.13 | 3.21E-06 | 8.99E-09
ZNF514 | 4.22 | 1.03E-16 | 4.2 | 2.72E-12 | 2.40E-14
DDIT4 | 4.23 | 1.43E-08 | 4.12 | 3.80E-04 | 6.53E-07
PAQR6 | 4.24 | 1.39E-12 | 4.21 | 3.68E-08 | 1.58E-10
VEGFA | 4.26 | 2.49E-21 | 4.2 | 6.61E-17 | 1.02E-18
ZNF493 | 4.28 | 1.81E-07 | 4.18 | 4.80E-03 | 6.01E-06
KCNC3 | 4.31 | 4.54E-10 | 4.3 | 1.20E-05 | 2.99E-08
PDK1 | 4.33 | 6.98E-14 | 4.24 | 1.85E-09 | 9.75E-12
SCD5 | 4.34 | 6.29E-14 | 4.3 | 1.67E-09 | 8.96E-12
FZD4 | 4.38 | 8.58E-10 | 4.35 | 2.27E-05 | 5.15E-08
SERP2 | 4.38 | 3.63E-07 | 4.37 | 9.61E-03 | 1.10E-05
BIRC7 | 4.42 | 5.89E-11 | 4.41 | 1.56E-06 | 4.64E-09
CXCL16 | 4.42 | 2.50E-07 | 4.3 | 6.62E-03 | 7.91E-06
RTN4RL2 | 4.42 | 3.85E-07 | 4.31 | 0.01 | 1.16E-05
HCG11 | 4.43 | 1.57E-08 | 4.35 | 4.15E-04 | 7.04E-07
KLF4 | 4.44 | 4.77E-07 | 4.48 | 0.01 | 1.40E-05
HSH2D | 4.45 | 4.95E-07 | 4.33 | 0.01 | 1.45E-05
KIAA1107 | 4.46 | 7.69E-10 | 4.43 | 2.04E-05 | 4.75E-08
YPEL3 | 4.47 | 1.71E-10 | 4.37 | 4.54E-06 | 1.23E-08
CES4A | 4.48 | 3.49E-08 | 4.43 | 9.23E-04 | 1.42E-06
POU6F1 | 4.52 | 6.08E-07 | 4.42 | 0.02 | 1.71E-05
LOC101930452 | 4.58 | 5.97E-07 | 4.46 | 0.02 | 1.69E-05
MAMDC4 | 4.58 | 1.69E-08 | 4.46
4.49E-04 7.49E-07 C6orf223 4.6 8.61E-07 4.47 0.02 2.30E-05 SGSM2 4.62 7.15E-12 4.64 1.89E-07 7.02E-10 SLC35G2 4.64 7.89E-10 4.63 2.09E-05 4.83E-08 KIF12 4.65 2.94E-18 4.64 7.79E-14 7.86E-16 TSPAN33 4.66 3.45E-10 4.55 9.13E-06 2.36E-08 TMEM91 4.68 6.38E-09 4.69 1.69E-04 3.12E-07 FCHSD1 4.7 1.99E-21 4.63 5.26E-17 8.23E-19 SAT2 4.7 4.32E-14 4.62 1.14E-09 6.39E-12 SLC9A3 4.7 5.48E-07 4.58 0.01 1.57E-05 CLHC1 4.75 1.15E-11 4.72 3.05E-07 1.08E-09 EHHADH 4.76 1.86E-08 4.76 4.92E-04 8.06E-07 LOC101929567 4.78 1.07E-08 4.75 2.82E-04 5.01E-07 LAMB2 4.83 4.07E-25 4.76 1.08E-20 2.16E-22 SNX10 4.84 9.68E-08 4.76 2.56E-03 3.46E-06 WNK4 4.86 4.08E-19 4.85 1.08E-14 1.32E-16 LOC100131564 4.87 4.80E-08 4.88 1.27E-03 1.87E-06 TMEM25 4.88 8.32E-09 4.78 2.20E-04 4.01E-07 PLOD2 4.89 3.86E-15 4.79 1.02E-10 6.96E-13 TESK2 4.89 1.70E-08 4.86 4.49E-04 7.49E-07 EPHX4 4.9 4.49E-11 4.78 1.19E-06 3.65E-09 TP53INP1 4.91 1.01E-10 4.91 2.67E-06 7.60E-09 ACBD4 4.92 5.66E-21 4.87 1.50E-16 2.17E-18 FAM229A 4.92 1.79E-08 4.8 4.75E-04 7.83E-07 GATA6-AS1 4.94 3.17E-08 4.97 8.40E-04 1.31E-06 SYNE4 5.02 4.10E-12 5.02 1.09E-07 4.21E-10 LOC284581 5.03 7.41E-11 4.99 1.96E-06 5.74E-09 RNF183 5.03 1.74E-15 4.97 4.60E-11 3.28E-13 COL27A1 5.05 2.54E-24 5.04 6.73E-20 1.27E-21 TMEM198B 5.06 1.88E-14 5.03 4.97E-10 2.98E-12 DNHD1 5.07 2.52E-07 5.12 6.68E-03 7.95E-06 FAM13A-AS1 5.07 1.76E-07 4.99 4.66E-03 5.87E-06 PCED1B 5.09 1.39E-23 5.03 3.69E-19 6.36E-21 PLIN1 5.09 7.29E-08 4.94 1.93E-03 2.74E-06 EPB41L2 5.1 5.81E-24 5.07 1.54E-19 2.72E-21 TPTE2P5 5.13 6.10E-08 5.07 1.62E-03 2.34E-06 SMARCD3 5.15 3.31E-07 5.02 8.76E-03 1.01E-05 CCDC183 5.16 7.91E-17 5.14 2.10E-12 1.88E-14 RASL10B 5.2 2.98E-20 5.13 7.90E-16 1.05E-17 L3MBTL1 5.21 4.56E-08 5.19 1.21E-03 1.79E-06 SNTA1 5.22 3.31E-08 5.16 8.77E-04 1.37E-06 CRYM 5.27 1.52E-09 5.15 4.02E-05 8.67E-08 METTL7A 5.27 3.44E-07 5.1 9.10E-03 1.05E-05 KLF15 5.31 3.98E-17 5.26 1.05E-12 9.86E-15 FBXL16 5.33 1.49E-11 5.23 3.94E-07 1.35E-09 166 PAPLN 5.33 4.72E-09 5.21 1.25E-04 
2.41E-07 PROCA1 5.33 4.89E-07 5.15 0.01 1.43E-05 MMP14 5.36 2.50E-14 5.29 6.63E-10 3.90E-12 SNAI3 5.37 9.22E-08 5.27 2.44E-03 3.32E-06 PRX 5.39 4.68E-07 5.32 0.01 1.38E-05 AXL 5.4 5.07E-13 5.33 1.34E-08 6.30E-11 SPHK1 5.4 1.56E-06 5.23 0.04 3.81E-05 CA11 5.41 2.22E-07 5.24 5.88E-03 7.19E-06 LCK 5.43 1.54E-06 5.33 0.04 3.76E-05 IL11RA 5.45 3.49E-11 5.32 9.24E-07 2.90E-09 SEPN1 5.46 9.06E-07 5.29 0.02 2.40E-05 LOC100130691 5.47 1.77E-06 5.35 0.05 4.25E-05 KCNAB3 5.48 1.58E-06 5.31 0.04 3.85E-05 EPOR 5.52 2.44E-13 5.41 6.45E-09 3.21E-11 KIAA1407 5.52 1.58E-17 5.47 4.19E-13 4.03E-15 OBSCN 5.52 6.80E-07 5.6 0.02 1.88E-05 GCGR 5.53 1.56E-06 5.42 0.04 3.80E-05 GDPD3 5.58 1.96E-07 5.45 5.19E-03 6.47E-06 PARD6G 5.58 1.27E-07 5.49 3.35E-03 4.39E-06 P2RX4 5.62 8.66E-18 5.53 2.29E-13 2.25E-15 ALDOC 5.67 6.25E-10 5.51 1.65E-05 3.98E-08 CCDC153 5.67 1.85E-08 5.5 4.89E-04 8.03E-07 PLEKHF1 5.67 6.35E-14 5.59 1.68E-09 8.99E-12 SARDH 5.67 1.36E-10 5.6 3.61E-06 9.98E-09 MTMR9LP 5.69 9.56E-11 5.68 2.53E-06 7.26E-09 ERO1B 5.79 2.02E-07 5.57 5.36E-03 6.64E-06 ABCA5 5.81 1.53E-08 5.84 4.05E-04 6.91E-07 ECI2 5.81 1.67E-12 5.78 4.43E-08 1.87E-10 QPRT 5.85 5.29E-14 5.71 1.40E-09 7.64E-12 GSDMB 5.87 3.34E-20 5.86 8.84E-16 1.16E-17 HIST2H2BE 5.87 9.03E-15 5.86 2.39E-10 1.53E-12 NOXA1 5.88 1.06E-16 5.81 2.80E-12 2.45E-14 KIAA1875 5.91 2.97E-09 5.75 7.86E-05 1.58E-07 S1PR2 5.93 4.03E-08 5.72 1.07E-03 1.61E-06 TCAF2 5.97 5.11E-09 5.81 1.35E-04 2.58E-07 LOC374443 5.98 1.55E-07 5.79 4.10E-03 5.28E-06 ADCY5 6 9.92E-07 5.87 0.03 2.59E-05 FMNL1 6.03 1.46E-14 5.93 3.88E-10 2.38E-12 CEACAM19 6.05 1.19E-09 6.01 3.16E-05 6.98E-08 NLGN2 6.05 4.71E-14 5.92 1.25E-09 6.94E-12 TTLL3 6.06 4.16E-07 5.91 0.01 1.24E-05 HACE1 6.07 2.18E-12 5.92 5.78E-08 2.37E-10 LOC100129550 6.09 7.87E-07 6.05 0.02 2.14E-05 CCDC183-AS1 6.1 3.06E-09 6.11 8.10E-05 1.62E-07 KIFC2 6.1 4.18E-30 6.06 1.11E-25 2.41E-27 NEIL1 6.1 4.20E-11 6 1.11E-06 3.44E-09 ADA 6.11 2.37E-07 5.92 6.27E-03 7.59E-06 ZNF358 6.12 9.08E-07 5.93 0.02 2.40E-05 
SALL4 6.13 6.42E-08 6.09 1.70E-03 2.45E-06 SCAMP5 6.17 2.29E-09 5.97 6.06E-05 1.25E-07 BAIAP3 6.21 3.33E-12 6.08 8.83E-08 3.49E-10 167 BNIP3L 6.21 1.84E-07 6.03 4.87E-03 6.08E-06 CES3 6.21 7.29E-07 6.21 0.02 2.00E-05 THBS3 6.22 5.99E-26 6.14 1.59E-21 3.31E-23 FLT3LG 6.24 1.18E-11 6.08 3.11E-07 1.09E-09 PIK3IP1 6.24 2.64E-12 6.13 6.99E-08 2.83E-10 C11orf45 6.27 9.89E-12 6.09 2.62E-07 9.42E-10 PHC1 6.27 1.08E-17 6.18 2.85E-13 2.76E-15 PNPLA7 6.29 1.02E-09 6.14 2.69E-05 5.96E-08 ABAT 6.3 1.64E-10 6.08 4.33E-06 1.18E-08 SEC31B 6.34 5.43E-07 6.4 0.01 1.56E-05 CFAP43 6.36 9.50E-09 6.09 2.52E-04 4.50E-07 AMT 6.4 2.55E-18 6.3 6.74E-14 6.95E-16 LINC00482 6.42 7.71E-13 6.29 2.04E-08 9.36E-11 SCPEP1 6.42 1.05E-31 6.39 2.79E-27 6.48E-29 TPST1 6.44 9.08E-10 6.26 2.41E-05 5.42E-08 LINC00174 6.49 8.48E-19 6.41 2.25E-14 2.58E-16 CROCCP3 6.53 1.41E-12 6.41 3.73E-08 1.59E-10 TTC7B 6.54 1.64E-07 6.32 4.35E-03 5.53E-06 NHLRC4 6.55 6.29E-07 6.21 0.02 1.76E-05 CYP2U1 6.65 1.15E-11 6.48 3.05E-07 1.08E-09 ZC4H2 6.7 2.33E-10 6.71 6.16E-06 1.64E-08 NR4A1 6.72 9.99E-32 6.66 2.64E-27 6.30E-29 PLEKHA4 6.75 1.11E-15 6.64 2.94E-11 2.18E-13 SLC39A5 6.84 1.76E-09 6.88 4.66E-05 9.89E-08 PDZD3 6.88 1.17E-13 6.71 3.10E-09 1.60E-11 PFKFB4 6.95 5.26E-19 6.8 1.39E-14 1.66E-16 C1orf115 6.99 2.12E-11 6.8 5.63E-07 1.86E-09 MLXIPL 6.99 2.89E-07 6.77 7.65E-03 8.92E-06 FABP1 7 8.31E-12 7.06 2.20E-07 7.98E-10 LMNTD2 7.05 7.56E-09 6.82 2.00E-04 3.66E-07 FAM171A2 7.06 2.39E-11 6.97 6.32E-07 2.08E-09 PNCK 7.14 1.01E-06 6.84 0.03 2.63E-05 CLIP3 7.32 9.34E-11 7.27 2.47E-06 7.13E-09 FBXO44 7.32 2.21E-09 7.05 5.85E-05 1.21E-07 FZD2 7.32 3.58E-19 7.2 9.49E-15 1.19E-16 ZDHHC2 7.32 3.84E-13 7.15 1.02E-08 4.90E-11 KCND1 7.34 6.85E-09 7.04 1.81E-04 3.33E-07 TRABD2A 7.37 4.57E-13 7.39 1.21E-08 5.73E-11 SEMA3G 7.4 2.29E-09 7.24 6.05E-05 1.25E-07 YJEFN3 7.59 7.59E-10 7.44 2.01E-05 4.71E-08 ATF3 7.61 1.24E-26 7.51 3.27E-22 6.96E-24 PDE4A 7.77 5.02E-14 7.59 1.33E-09 7.35E-12 DMKN 7.78 2.16E-14 7.83 5.71E-10 3.40E-12 THNSL2 
7.79 8.28E-12 7.58 2.19E-07 7.98E-10 RLTPR 7.87 1.85E-08 7.61 4.90E-04 8.03E-07 FAM131C 7.9 5.65E-08 7.62 1.50E-03 2.19E-06 GAA 7.99 7.94E-19 7.79 2.10E-14 2.45E-16 SEMA6C 8.06 4.73E-07 7.56 0.01 1.39E-05 EPHA10 8.09 2.84E-11 7.74 7.52E-07 2.43E-09 TSSK3 8.1 8.14E-08 8.12 2.16E-03 3.00E-06 TMEM98 8.12 8.84E-08 7.7 2.34E-03 3.21E-06 168 FBXO43 8.19 8.28E-08 7.69 2.19E-03 3.02E-06 PCDH12 8.21 3.32E-07 7.64 8.79E-03 1.02E-05 UST 8.27 5.50E-09 8.21 1.46E-04 2.77E-07 LDHD 8.33 5.07E-07 8.13 0.01 1.47E-05 C2orf48 8.34 8.14E-07 8.3 0.02 2.20E-05 CDKL2 8.36 1.53E-06 7.75 0.04 3.75E-05 RILPL1 8.46 5.88E-10 8.13 1.56E-05 3.76E-08 ADCY4 8.53 4.42E-12 8.24 1.17E-07 4.52E-10 AMN 8.57 1.40E-12 8.34 3.71E-08 1.59E-10 GNB3 8.6 1.39E-10 8.3 3.67E-06 1.01E-08 RASAL1 8.69 2.55E-07 8.75 6.75E-03 8.01E-06 BRSK2 8.79 1.41E-09 8.45 3.74E-05 8.13E-08 AGBL2 9.13 4.07E-11 8.77 1.08E-06 3.35E-09 ASAP3 9.14 1.23E-14 8.99 3.25E-10 2.05E-12 CCDC88A 9.15 5.83E-09 9.07 1.54E-04 2.90E-07 CD74 9.18 1.65E-13 8.93 4.36E-09 2.22E-11 SLC29A4 9.27 3.66E-24 9.09 9.70E-20 1.80E-21 PACSIN3 9.48 6.27E-13 9.21 1.66E-08 7.72E-11 DLG4 9.5 2.16E-09 9.05 5.73E-05 1.19E-07 MPP2 9.5 2.74E-10 9.33 7.26E-06 1.91E-08 PDE4C 9.66 1.57E-14 9.6 4.16E-10 2.54E-12 MUC1 9.71 7.25E-12 9.41 1.92E-07 7.08E-10 PALM3 9.75 8.98E-07 9.3 0.02 2.39E-05 ADAMTS13 9.85 3.35E-19 9.62 8.86E-15 1.12E-16 ATP8B3 9.91 5.45E-15 9.63 1.44E-10 9.44E-13 ADGRL1 9.92 2.01E-08 9.56 5.32E-04 8.65E-07 DUSP1 9.92 3.26E-47 9.86 8.63E-43 2.88E-44 PCOLCE 10.05 2.76E-13 9.8 7.31E-09 3.62E-11 DCST2 10.11 1.47E-07 9.59 3.88E-03 5.03E-06 INPP5J 10.11 2.02E-08 9.94 5.34E-04 8.68E-07 NUPR1 10.13 2.46E-16 9.81 6.50E-12 5.33E-14 MFI2-AS1 10.38 2.16E-13 9.94 5.72E-09 2.86E-11 EGR2 10.52 2.28E-08 10.4 6.04E-04 9.74E-07 MFI2 10.73 4.94E-63 10.63 1.31E-58 4.51E-60 ROBO3 10.73 6.60E-18 10.53 1.75E-13 1.75E-15 PIGZ 10.74 3.80E-32 10.75 1.01E-27 2.45E-29 SLC26A11 11.05 1.12E-18 10.87 2.96E-14 3.25E-16 IKZF3 11.06 8.94E-11 10.97 2.37E-06 6.84E-09 LINC00921 11.19 
2.98E-08 10.59 7.90E-04 1.24E-06 SSPO 11.2 1.57E-12 10.7 4.16E-08 1.76E-10 FOS 11.22 4.29E-46 11.06 1.14E-41 3.67E-43 ITGB2-AS1 11.31 3.79E-10 10.59 1.00E-05 2.55E-08 TNNT1 11.68 5.31E-31 11.45 1.41E-26 3.13E-28 SEMA4F 11.75 2.82E-17 11.44 7.48E-13 7.06E-15 FAM167B 11.83 4.05E-07 10.88 0.01 1.21E-05 ZNF385C 12 6.04E-09 11.06 1.60E-04 2.98E-07 NDNF 12.1 2.66E-10 11.51 7.05E-06 1.86E-08 WNT10B 12.25 3.75E-09 11.42 9.94E-05 1.94E-07 ATF7IP2 12.42 1.29E-18 12.04 3.42E-14 3.71E-16 SCARF2 12.42 1.57E-07 11.27 4.16E-03 5.34E-06 NFATC4 12.47 6.38E-10 11.97 1.69E-05 4.06E-08 169 GAREML 12.48 1.31E-15 11.97 3.48E-11 2.54E-13 GIPR 12.48 1.16E-06 12.7 0.03 2.95E-05 SLC9A5 12.52 1.38E-21 12.25 3.64E-17 5.88E-19 MFSD4 12.67 8.24E-13 12.07 2.18E-08 9.92E-11 IPO5P1 13.05 1.03E-25 12.72 2.72E-21 5.55E-23 TCEA3 13.48 4.83E-25 13.34 1.28E-20 2.51E-22 BZRAP1 13.94 7.21E-07 13.3 0.02 1.98E-05 PPFIA4 14.02 2.13E-18 13.69 5.63E-14 5.87E-16 CCDC146 14.08 8.42E-10 12.94 2.23E-05 5.08E-08 DFNB31 14.1 1.73E-08 12.75 4.58E-04 7.59E-07 LAMP3 14.35 6.06E-07 13.63 0.02 1.71E-05 NXPH4 14.6 3.21E-08 14.04 8.50E-04 1.33E-06 ARHGAP31 14.86 6.51E-08 13.62 1.72E-03 2.48E-06 PIP5KL1 15.62 3.23E-09 13.93 8.56E-05 1.70E-07 CCDC114 15.67 4.68E-11 14.44 1.24E-06 3.79E-09 P3H3 15.87 4.53E-10 15.3 1.20E-05 2.99E-08 C1R 16.17 4.21E-10 14.77 1.12E-05 2.79E-08 CYP26B1 16.25 5.05E-19 15.67 1.34E-14 1.61E-16 LINC01125 16.5 1.58E-08 14.53 4.18E-04 7.07E-07 RIMS3 16.8 1.02E-07 15.31 2.71E-03 3.63E-06 ZNF204P 17.77 1.05E-11 16.87 2.79E-07 9.94E-10 CARD14 17.87 6.40E-20 17.21 1.70E-15 2.17E-17 MIR210HG 18.04 6.56E-12 17.26 1.74E-07 6.49E-10 DBH-AS1 18.25 8.55E-07 14.63 0.02 2.29E-05 GRID2IP 18.5 1.64E-07 15.98 4.34E-03 5.52E-06 NDUFA4L2 19.01 2.69E-11 18.37 7.12E-07 2.31E-09 LIMD2 19.35 9.26E-07 18.33 0.02 2.44E-05 CTGF 19.54 5.21E-17 18.95 1.38E-12 1.28E-14 OBSL1 19.9 3.31E-38 19.59 8.76E-34 2.43E-35 IL27RA 20.95 2.20E-15 20.05 5.83E-11 4.08E-13 DHDH 21 3.04E-07 16.97 8.06E-03 9.34E-06 FLJ31356 21.33 6.59E-08 18.17 
1.74E-03 2.50E-06 NDRG1 21.8 3.20E-95 21.39 8.48E-91 3.14E-92 SCARA3 22.96 2.42E-16 21.63 6.41E-12 5.30E-14 SLC2A10 23.72 5.06E-41 23.26 1.34E-36 3.94E-38 GPR162 24 1.15E-06 18.96 0.03 2.92E-05 LINC00663 24 1.82E-06 18.93 0.05 4.37E-05 CD83 24.16 6.46E-25 23.49 1.71E-20 3.29E-22 CCDC136 24.25 6.69E-07 19.25 0.02 1.86E-05 C4orf47 26 2.96E-08 21.05 7.83E-04 1.23E-06 C9orf173-AS1 28.5 4.01E-08 22.79 1.06E-03 1.60E-06 EGR1 30.21 1.38E-45 29.48 3.65E-41 1.14E-42 CERCAM 30.55 1.06E-08 28.03 2.80E-04 4.97E-07 TLE2 31.25 1.58E-07 27.4 4.18E-03 5.36E-06 DUSP5P1 31.5 1.23E-09 25 3.26E-05 7.17E-08 KIAA1324L 31.5 3.40E-12 28.23 9.00E-08 3.54E-10 NTF4 32 1.10E-06 27.66 0.03 2.81E-05 CACNA1F 33.5 1.48E-06 22.46 0.04 3.64E-05 APOE 33.75 6.53E-09 31.71 1.73E-04 3.19E-07 ARAP3 34.5 4.03E-13 32.77 1.07E-08 5.11E-11 PRKCG 34.5 3.53E-10 27.86 9.34E-06 2.39E-08 170 DOC2A 36 8.88E-07 24.42 0.02 2.36E-05 PLIN5 36.08 3.27E-21 33.24 8.65E-17 1.27E-18 N4BP2L1 36.81 8.18E-12 33.8 2.17E-07 7.91E-10 GNG7 38.75 1.18E-11 31.25 3.14E-07 1.10E-09 AQP7 41 1.30E-07 27.49 3.45E-03 4.50E-06 LYRM9 43.5 8.07E-07 34.26 0.02 2.18E-05 ARHGEF25 49.5 1.29E-07 38.62 3.43E-03 4.48E-06 TMEM229B 51 6.63E-07 33.93 0.02 1.85E-05 ALOX12B 55 3.82E-09 36.58 1.01E-04 1.96E-07 FNDC4 58 2.31E-07 47.35 6.12E-03 7.44E-06 LOC729970 61 5.64E-10 41.22 1.49E-05 3.63E-08 GBP4 75 3.73E-08 48.61 9.89E-04 1.51E-06 ANGPTL4 86.92 2.83E-12 77.33 7.50E-08 2.99E-10 TBKBP1 102.25 2.57E-12 79.42 6.81E-08 2.77E-10 FOSB 142.09 2.56E-40 143.19 6.79E-36 1.94E-37 ALOX15 ∞ 3.08E-08 89.56 8.16E-04 1.28E-06 AZIN2 ∞ 4.90E-11 105.77 1.30E-06 3.94E-09 C1orf228 ∞ 9.33E-08 75.71 2.47E-03 3.35E-06 EBF4 ∞ 1.03E-10 143.33 2.72E-06 7.74E-09 LINC01252 ∞ 1.16E-10 128.11 3.06E-06 8.65E-09 MCOLN2 ∞ 7.50E-08 86.16 1.99E-03 2.80E-06 NRN1 ∞ 3.65E-14 231.01 9.67E-10 5.50E-12 RND2 ∞ 1.12E-08 71.86 2.97E-04 5.25E-07 SIGLEC10 ∞ 1.26E-06 100.85 0.03 3.16E-05 TTLL6 ∞ 4.06E-08 85.25 1.07E-03 1.62E-06 VWA3A ∞ 5.00E-07 55.97 0.01 1.46E-05 ZMAT1 ∞ 1.05E-06 59.89 0.03 
2.72E-05

Table S2.6 Top canonical pathways altered by TM9SF2 knockout

Pathway                              p-value    % Overlap   Overlap gene number
Ceramide Signaling                   3.82E-04   11.8%       11/93
Mitotic Roles of Polo-Like Kinase    4.46E-04   13.6%       9/66
Protein Kinase A Signaling           8.54E-04   6.8%        27/400
Gustation Pathway                    9.32E-04   9.2%        14/153
Beta alanine Degradation I           1.21E-04   100%        2/2

Chapter 3 Supplementary Information

Supplementary Figure 3.1. Shown are the six gRNA oligo pairs designed to target the coding regions of the human WAC gene. Each gRNA oligo pair has flanking BsmBI restriction enzyme overhang sites (grey italics) for cloning into the LentiCRISPRv2 vector.
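
As an illustration of how such an oligo pair can be laid out, the following is a minimal Python sketch. It assumes the CACCG / AAAC...C overhang convention commonly used for BsmBI cloning into lentiCRISPRv2-type vectors; the overhang sequences are not stated in this caption, so they should be treated as an assumption. The function names and the example guide are hypothetical, and the guide is a placeholder rather than one of the six WAC gRNAs.

```python
# Sketch only: build a forward/reverse oligo pair for BsmBI cloning of a gRNA.
# ASSUMPTION: overhangs follow the common CACCG / AAAC...C convention;
# the caption above does not list the overhang sequences.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def grna_oligo_pair(guide):
    """Return (forward, reverse) oligos for a 20-nt guide sequence."""
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("guide must be exactly 20 nt of A/C/G/T")
    forward = "CACCG" + guide                           # 5' overhang + extra G
    reverse = "AAAC" + reverse_complement(guide) + "C"  # pairs with the extra G
    return forward, reverse


# Hypothetical placeholder guide, NOT a WAC gRNA from the figure.
EXAMPLE_GUIDE = "GATTACAGATTACAGATTAC"
forward_oligo, reverse_oligo = grna_oligo_pair(EXAMPLE_GUIDE)
print(forward_oligo)
print(reverse_oligo)
```

Once annealed, the two oligos leave single-stranded ends compatible with the BsmBI-digested vector, so the guide can only ligate in one orientation.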


Supplementary Figure 3.2. LentiCRISPRv2 vector used to generate WAC knockout clones. Between the Long Terminal Repeat (LTR) sites in the parental vector (shown left) are a mammalian codon-optimized Cas9 and cloning sites for gene-specific gRNA oligos. Parental vectors are digested with BsmBI to release the “stuffer” sequence, which is then replaced by WAC-specific gRNA oligos (shown right).

[Figure: panel A, plot of relative WAC expression; panel B, Western blot with EGFP and WAC OE lanes probed for WAC and β-actin.]

Supplementary Figure 3.3. Relative WAC expression in a panel of human CRC cell lines. A, Quantitative RT-PCR was used to measure WAC mRNA levels in eight human CRC cell lines relative to expression in the human colonic epithelial cell (HCEC) line. HCEC cells are immortalized but non-tumor-forming. B, Western blot confirming the overexpression of WAC in HCT116 cells.
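
Relative expression of this kind is often computed with the comparative Ct (2^-ΔΔCt) method. The sketch below assumes that method and a single reference gene (for example ACTB, whose primers are listed in Table S3.1); the caption does not state which calculation was used, and the function name and Ct values are hypothetical illustrations, not measured data.

```python
# Sketch only: comparative Ct (2^-ddCt) relative expression.
# ASSUMPTION: the caption does not state the quantification method;
# the Ct values below are made up for illustration.


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in a sample versus a control line."""
    delta_sample = ct_target_sample - ct_ref_sample     # dCt in the CRC line
    delta_control = ct_target_control - ct_ref_control  # dCt in the control line
    return 2.0 ** -(delta_sample - delta_control)       # 2^-ddCt


# Hypothetical Ct values: WAC and reference gene in a CRC line and in HCEC.
fold_change = relative_expression(24.0, 18.0, 25.0, 18.0)
print(fold_change)
```

A value above 1 indicates higher target mRNA in the sample than in the control line, and a value below 1 indicates lower expression.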

[Figure: panel A, gel of PCR products from clones C1 to C6 alongside a ladder (L) with the 500 bp band marked; panels B and C, TIDE analysis output.]

Supplementary Figure 3.4. Confirmation of WAC mutation by CRISPR/Cas9 gene editing. A, PCR was performed on DNA extracted from single-cell HT-29 clones after viral transduction with LentiCRISPRv2 vectors carrying gRNAs targeting human WAC. PCR primers were designed to flank the targeted cut site. The product was purified and sent for Sanger sequencing. B, Sanger sequencing trace files were subjected to decomposition by TIDE analysis. HT-29 parental DNA (black lines) was used as a control for decomposition, and test samples (clone C4B4) are shown in green. The vertical blue dotted line represents the predicted CRISPR/Cas9 cut site. C, Pictured is the indel spectrum after CRISPR/Cas9 editing as determined by TIDE analysis.


Table S3.1. Primers used for qRT-PCR in Chapter 3.

Gene     Species  Project                           Forward Primer              Reverse Primer
Actb     mouse    IMCE/YAMC                         TCCAGCCTTCCTTCTTGGGTATGGA   CGCAGCTCAGTAACAGTCCGCC
Wac      mouse    IMCE/YAMC                         TGCAGATGATTGGTCTGAGC        AACTGCCAGCTTATTTGCTTC
ACTB     human    ACC1                              GCCGTCTTCCCCTCCATCGT        TGCTCTGGGCCTCGTCGC
WAC      human    AAC1                              CAACCACAGTGCTCTTCATAGTTC    AGCTAATATGCTCAGACCAGTCATC
TP53     human    AAC1                              TGTTTCCTGACTCAGAGGGG        GAGCGTGCTTTCCACGAC
CDKN1A   human    HCT116 WAC KD                     CACCGAGGCACTCAGAGGA         CACTGGGCCGAAGAGGC
TBP      human    HCT116 WAC KD                     CCCATGACTCCCATGACC          TTTACAACCAAGATTCACTGTGG
CREBBP   human    HCT116 WAC KD                     AGAGCCAAACTCAGCTCGC         TGGGTATCAGCTCATCAGGAAG
TP53     human    HCT116 WAC KD                     ACACGCTTCCCTGGATTGG         GTTTCCTGACTCAGAGGGGG
WAC      human    All non-AAC1 WAC KD experiments   GGACTCGCAGCCTTACCAG         AGAGCACTGTGGTTGTGTGAA

Table S3.2. Insertional mutagenesis studies identifying WAC as a candidate cancer gene¹

Cancer Type             Study                   PubMed ID   CIS Address          Predicted Effect   Relative Rank
Blood Cancer            Berquam Vrieze 2011-03  21828136    18:7896837-7925450   Loss               B
Blood Cancer            Guo 2016-01             26676752    18:7868832-7973547   Not Determined     B
Breast Cancer           Chen 2017-01            28251929    18:7839022-7927627   Not Determined     D
Breast Cancer           Rangel 2016-01          27849608    18:7868832-7973547   Not Determined     C
Colorectal Cancer       March 2011-01           22057237    18:7842074-7955421   Not Determined     A
Colorectal Cancer       Morris 2016-01          27790711    18:7873602-7925602   Not Determined     A
Colorectal Cancer       Starr 2009-01           19251594    18:7874333-7972742   Loss               A
Colorectal Cancer       Starr 2011-01           21436051    18:7855250-8046128   Loss               B
Colorectal Cancer       Takeda 2015-01          25559195    18:7867714-7935905   Not Determined     A
Colorectal Cancer       Takeda 2015-02          25559195    18:7861270-7926589   Not Determined     A
Colorectal Cancer       Takeda 2015-03          25559195    18:7863777-7931990   Not Determined     A
Gastric Cancer          Takeda 2016-01          27006499    18:7884817-7912398   Loss               C
Liver Cancer            Bard-Chapeau 2014-01    24316982    18:7868857-7929027   Not Determined     A
Liver Cancer            Kodama 2016-01          27178121    18:7868832-7973547   Not Determined     A
Lung Cancer             Dorr 2015-01            25995385    18:7833702-7924702   Loss               B
Nervous System Cancer   Genovesi 2013-01        24167280    18:7868831-7929028   Not Determined     D
Nervous System Cancer   Rahrmann 2013-01        23685747    18:7868832-7973547   Not Determined     B
Nervous System Cancer   Rahrmann 2013-02        23685747    18:7868832-7973547   Not Determined     B
Nervous System Cancer   Wu 2012-01              22343890    18:7868832-7973547   Not Determined     Not Ranked
Pancreatic Cancer       Perez Mancera 2012-01   22699621    18:7892929-7892929   Not Determined     B
Pancreatic Cancer       Perez Mancera 2012-02   22699621    18:7899105-7899105   Not Determined     B
Sarcoma                 Moriarity 2015-01       25961939    18:7869196-7929028   Loss               C
Sarcoma                 Moriarity 2015-02       25961939    18:7868832-7973547   Loss               C
Skin Cancer             Perna 2015-01           25624498    18:7894705-7894705   Not Determined     A
Skin Cancer             Quintana 2012-01        22832494    18:7868832-7973547   Not Determined     A

Table S3.3. TCGA database CRC mutations

Study                              Sample ID                   Cancer Type   Protein Change   Mutation Type   Center         Start      End
COAD (DFCI, Cell Reports 2016)     coadread_dfci_2016_578      Colorectal    R522I            Missense        Harvard        28905110   28905110
COAD (TCGA, Nature 2012)           TCGA-AA-3977-01             Colon         A423T            Missense        Broad          28899729   28899729
COAD (TCGA, Nature 2012)           TCGA-AA-3984-01             Colon         E600K            Missense        Broad          28906637   28906637
COAD (TCGA, Nature 2012)           TCGA-AG-A036-01             Rectal        S442P            Missense        Broad          28900738   28900738
COAD (Genentech, Nature 2012)      587278                      Colorectal    T552M            Missense        Genentech      28905200   28905200
COAD (Genentech, Nature 2012)      587376                      Colorectal    E172*            Nonsense        Genentech      28879665   28879665
COAD (TCGA, Nature 2012)           TCGA-AG-3892-01             Rectal        K277N            Missense        Broad          28884882   28884882
COAD (TCGA, Nature 2012)           TCGA-AA-3516-01             Colorectal    P322H            Missense        hgsc.bcm.edu   28897160   28897160
COAD (DFCI, Cell Reports 2016)     coadread_dfci_2016_227039   Colorectal    D21Tfs*171       Frame           Harvard        28822941   28822941
COAD (DFCI, Cell Reports 2016)     coadread_dfci_2016_3451     Colorectal    I416T            Missense        Harvard        28899709   28899709
COAD (DFCI, Cell Reports 2016)     coadread_dfci_2016_485      Colorectal    R18Q             Missense        Harvard        28822938   28822938

Notes: All samples were considered adenocarcinomas. WAC is located on chromosome 10.

Table S3.4. WAC mutation data from COSMIC Database

Pos.   CDS Mutation               AA Mutation         ID (COSM)   Count   Type
3      c.7A>G                     p.M3V               3686701     1       Substitution - Missense
18     c.53G>A                    p.R18Q              6821135     1       Substitution - Missense
21     c.56delG                   p.D21fs*171         6821130     1       Deletion - Frameshift
32     c.96G>A                    p.S32S              2134812     1       Substitution - coding silent
38     c.114C>T                   p.S38S              2134813     2       Substitution - coding silent
50     c.149G>C                   p.G50A              2134815     1       Substitution - Missense
56     c.168T>C                   p.N56N              2134817     1       Substitution - coding silent
63     c.187G>A                   p.D63N              2134821     1       Substitution - Missense
99     c.295G>T                   p.E99*              4740723     2       Substitution - Nonsense
149    c.446A>G                   p.N149S             6052979     1       Substitution - Missense
172    c.514G>T                   p.E172*             1232484     1       Substitution - Nonsense
178    c.533T>C                   p.V178A             4641354     1       Substitution - Missense
230    c.689G>A                   p.R230K             2134837     1       Substitution - Missense
242    c.726T>C                   p.S242S             2134839     3       Substitution - coding silent
247    c.741G>A                   p.Q247Q             6821131     1       Substitution - coding silent
257    c.771T>C                   p.T257T             2134845     3       Substitution - coding silent
269    c.807G>A                   p.T269T             4740725     2       Substitution - coding silent
270    c.810A>T                   p.L270L             2134847     2       Substitution - coding silent
277    c.831G>A                   p.K277K             2134853     1       Substitution - coding silent
277    c.831G>T                   p.K277N             258360      2       Substitution - Missense
290    c.868A>C                   p.K290Q             1347494     1       Substitution - Missense
324    c.972G>A                   p.T324T             2134863     3       Substitution - coding silent
342    c.1026G>A                  p.A342A             2134865     1       Substitution - coding silent
346    c.1030_1038delCCTGTTTCT    p.S346_V348delSPV   1347495     1       Deletion - In frame
347    c.1041T>C                  p.P347P             5480772     1       Substitution - coding silent
373    c.1119G>A                  p.T373T             2134867     1       Substitution - coding silent
395    c.1183A>G                  p.T395A             3414945     1       Substitution - Missense
416    c.1247T>C                  p.I416T             6821133     1       Substitution - Missense
437    c.1311G>A                  p.P437P             286487      1       Substitution - coding silent
442    c.1324T>C                  p.S442P             290867      1       Substitution - Missense
475    c.1424C>T                  p.S475L             32899       1       Substitution - Missense
497    c.1491G>A                  p.Q497Q             1347498     1       Substitution - coding silent
522    c.1565G>T                  p.R522I             4013638     2       Substitution - Missense
552    c.1655C>T                  p.T552M             1232483     1       Substitution - Missense
552    c.1656G>A                  p.T552T             5448011     1       Substitution - coding silent
556    c.1668G>A                  p.T556T             2134879     1       Substitution - coding silent
587    c.1761C>T                  p.R587R             3414947     1       Substitution - coding silent
606    c.1816A>C                  p.K606Q             1347499     1       Substitution - Missense
617    c.1851T>A                  p.I617I             5459555     1       Substitution - coding silent
618    c.1852C>T                  p.Q618*             1165131     1       Substitution - Nonsense
627    c.1880T>C                  p.L627P             1347500     1       Substitution - Missense
640    c.1920A>C                  p.K640N             1347501     1       Substitution - Missense
       c.275-10delT               p.?                 6821128     3       Unknown
       c.610+1G>A                 p.?                 1347493     1       Unknown
       c.79-2A>T                  p.?                 4740722     1       Unknown
