
UGENE, A NEWLY IDENTIFIED PROTEIN THAT IS COMMONLY OVER-EXPRESSED IN CANCER, AND THAT BINDS URACIL DNA-GLYCOSYLASE

by

CHUNGUANG GUO

Submitted in partial fulfillment of the requirements For the degree of Doctor of Philosophy

Dissertation Adviser: Dr. Sanford Markowitz

Department of Molecular Biology and Microbiology

CASE WESTERN RESERVE UNIVERSITY

January, 2009

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of

Chunguang Guo

candidate for the Ph.D. degree *.

(signed) Robert Silverman (chair of the committee)

George Stark

Andrei Gudkov

Sanford D Markowitz

(date) 06/24/08

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of contents

List of tables

List of figures

Acknowledgement

Abstract

Chapter 1 Introduction

1.1 Epidemiology of Colorectal Cancer

1.2 Staging, Therapy and Prognosis

1.3 Genetics of Colon Cancer

1.4 The Role of TGF-β Signaling in Colorectal Cancer

1.4.1 TGFbR2

1.4.2 Smads

1.5 DNA repair pathways and colorectal cancer

1.6 Microarray techniques in cancer research

Chapter 2 Identification of a novel gene, Ugene, over-expressed in most cancer types

2.1 Expression of EST U46258 transcript is consistently up-regulated in colon cancer by DNA microarray analysis

2.2 Identification of Ugene

2.3 Validation of Ugene expression in colon cancer vs. normal colon

2.3.1 Ugene expression is elevated in colon cancer

2.3.2 Both Ugene-p and Ugene-q are over-expressed in colon cancer

2.3.3 Ugene is over-expressed in many types of cancers

2.4 Investigation of the possible causes for the elevated expression level of Ugene in colon cancers

2.4.1 Amplification and rearrangement

2.4.2 TGF-β regulation

2.5 Ugene expression is cell cycle dependent

2.6 Ugene encodes a nuclear protein

Materials and methods

Chapter 3 Epitope tagging of endogenous Ugene

Materials and methods

Chapter 4 Investigation of potential roles of Ugene in cell proliferation and oncogenesis

4.1 Potential roles of Ugene in cell transformation

4.1.1 NIH 3T3 cells transformation

4.1.2 REF-52 cells transformation

4.1.3 WI-38 cells transformation

4.1.4 HME1 cells transformation

4.2 Potential roles of Ugene in cell proliferation

4.3 Ugene potential interaction partners

4.3.1 Identification of Ugene potential partners

4.3.2 Validation of Ugene potential partners

4.3.3 Ugene-p binds to the NH2-terminus of UNG2

4.3.4 Ugene binding does not directly alter UNG2 enzymatic activity or localization

Materials and methods

Chapter 5 Discussion and Future Directions

Bibliography

List of tables

Table 1-1 DNA repair genes linked to familial cancer syndromes

Table 2-1 Difference between Ugene-p and Ugene-q

Table 2-2 Ratio of Ugene-p and Ugene-q in colon cancer cell lines

Table 2-3 Ratio of Ugene-p and Ugene-q in V330

List of figures

Figure 1-1 Genetic changes associated with colorectal tumorigenesis

Figure 2-1 Expression of EST U46258 on DNA microarrays

Figure 2-2 Structure of the Ugene locus

Figure 2-3 Ugene consists of two different transcripts, Ugene-p and Ugene-q

Figure 2-4 Ugene mRNA expression in normal and cancer samples

Figure 2-5 Ugene expression in cancers

Figure 2-6 Detection of amplification, rearrangement of Ugene

Figure 2-7 TGF-β regulation of Ugene expression on microarray

Figure 2-8 Ugene can be down-regulated by TGF-β

Figure 2-9 Cell cycle profiling of Ugene mRNA expression

Figure 2-10 Subcellular localization of Ugene

Figure 3-1 Schematic diagram of tagging endogenous protein with 3xFlag

Figure 3-2 3xFlag epitope tagging of endogenous Ugene-p

Figure 3-3 Timelines for 3xFlag knock-in and antibody production

Figure 4-1 NIH 3T3 cell transformation assay

Figure 4-2 Anchorage independent growth assay of WI-38 cells

Figure 4-3 Suppression of Ugene expression by specific siRNAs

Figure 4-4 Colony formation assays after siRNA transfection

Figure 4-5 Flag pull-down of Ugene-p

Figure 4-6 Protein-protein interaction of Ugene-p and UNG2

Figure 4-7 Ugene-q does not interact with UNG2

Figure 4-8 Protein-protein interaction of Ugene and NOSIP

Figure 4-9 NOSIP doesn’t form complex with UNG2

Figure 4-10 Mapping of the UNG2 domain for binding to Ugene-p

Figure 4-11 Western analysis of UNG in UNG1 null and UNG null cells

Figure 4-12 Assays of UNG enzymatic activity

Figure 4-13 UNG2 localization in DLD1 cells with and without presence of Ugene-p

Acknowledgement

First, I would like to express my sincere gratitude to my thesis advisor, Dr. Sanford D Markowitz, not only for his great guidance and endless support in my project, but also for his patience and encouragement throughout the course of my graduate studies. Over the past six years, he has taught me innumerable lessons about how to think critically, how to query the unknown experimentally, and how to write about science. His support has meant more to me than I can express here, and he has given me the strong motivation and unbending perseverance in scientific research to embrace further challenges.

My project would have been much harder to complete without the help of all the incredible past and present members of the Markowitz lab. Special thanks go to Dr. Steve Fink and Dr. Kishore Guda for all their support on my project and their patience in teaching me countless techniques and ways of scientific thinking; Dr. Jerome F Sah, Dr. Min Yan, Dr. Baozhong Xin and Dr. Hui Xiao for good-spirited discussions and excellent advice on presentation; James Lutterbaugh for preparing a lot of samples; Lydia Beard for tremendous tissue culture work; Lucy He for proofreading the dissertation; and all the other members for assistance in my lab work and for helpful discussions.

Specifically, I would like to thank Dr. Zhenghe J Wang for his insightful advice and guidance, given without any reservation, on both my research project and my scientific career development; Dr. Xiaodong Zhang for all the valuable discussions.

I also owe my deep appreciation to my previous and present committee members for their patience, time, and guidance over the years. In particular I thank Dr. Andrei Gudkov for coming all the way from Buffalo to attend every one of my committee meetings and Dr. Robert H Silverman for accepting the last-minute request to be my committee chair. Dr. George Stark has contributed remarkably to my work through his invaluable expertise.

I will always be deeply indebted to my wife for her unending and unconditional love and support throughout the years of hardship. Being the wife of a busy scientist is not an easy position. In particular I thank you for your tolerance of my irritability at times of experimental failure. Thank you for all the after-hours and weekends you spent with me in the lab. Thank you for making me the person I am today.

Last but not least, I would like to thank my parents. Without their encouragement and support, it would have been impossible for me to finish my graduate study. The poverty of language is unable to express the invaluable roles they have played in my life.


Ugene, a Newly Identified Protein that is Commonly Over-Expressed in Cancer, and that Binds to Uracil DNA-Glycosylase

Abstract

by

CHUNGUANG GUO

Expression microarrays identified a novel transcript, designated Ugene, whose expression is absent in normal colon and colon adenomas, but that is commonly induced in malignant colon cancers. These findings were validated by real-time PCR and Northern blot analysis in an independent panel of colon cancer cases. In addition, Ugene expression was found to be elevated in many other common cancer types, including breast, lung, uterus, and ovary. Immunofluorescence of V5-tagged Ugene revealed it to have a nuclear localization. In a pull-down assay, uracil DNA-glycosylase 2 (UNG2), an important enzyme in the base excision repair pathway, was identified as a partner protein that binds to Ugene. Co-immunoprecipitation and Western blot analysis confirmed the binding between endogenous Ugene and UNG2. Using deletion constructs, we found that Ugene binds to the first 25 amino acids of the UNG2 NH2-terminus. We suggest that Ugene induction in cancer may contribute to the cancer phenotype by interacting with the base excision repair pathway.

Chapter 1 Introduction

1.1 Epidemiology of Colorectal Cancer

Colorectal cancer (which includes cancer of the colon, rectum, anus, and appendix) is the second-leading cause of cancer-related deaths in the United States and the third most common cancer worldwide (Greenlee et al., 2000). Half of the Western population develops a colorectal tumor by the age of 70, which may progress to colorectal cancer in around 10% of these affected individuals (Parker et al., 1996). According to the American Cancer Society, more than 154,000 new cases are diagnosed in the United States each year, representing 15% of all cancers, so this disease constitutes a major public health problem. Colon cancers arise from the epithelial cells lining the colon lumen. As one of the most highly proliferative cell populations among all human tissues, these cells renew themselves every four days from a stem cell population at the base of the colon crypts. The process of colon neoplasia formation generally takes several years, from benign colon adenoma to invasive cancer that has metastasized to lymph nodes, liver and other distal organs. Both genetic and environmental factors contribute to colorectal neoplasia (Moinova et al., 2002).

1.2 Staging, Therapy and Prognosis

Surgical resection of the involved intestine is the initial treatment for colon and many rectal cancers. The prognosis of colorectal cancer is closely related to the degree of penetration of the tumor through the intestinal wall, nodal involvement and metastasis. These characteristics are the basis for the TNM clinical staging system. Briefly, T1N0M0 (Willson et al.) cancers are confined to the submucosa; T2N0M0 cancers invade the major muscular layer; T3N0M0 (Xin et al.) cancers invade through the muscular layer into the subserosa, or into the pericolic or perirectal tissues; cancers metastatic to regional lymph nodes are designated N1-3 (Xin et al.); cancers metastatic to distant organs (Brunschwig et al.) are designated stage IV (Moinova et al., 2002).

Surgical resection provides a cure rate of over 90% in stage I and 75% in stage II colon cancer patients. With the addition of 5-fluorouracil-based chemotherapy following surgical resection as the standard of care, an overall survival of 60% may be reached in stage III patients. While surgical resection and chemotherapy can be employed in stage IV patients, most patients succumb to metastatic disease within 2 years (Moinova et al., 2002).

1.3 Genetics of Colon Cancer

The risk of developing colon cancer is influenced by a number of factors that include age and diet, but it is primarily a genetic disease, resulting from both the activation of oncogenes, which triggers uncontrolled proliferation of the cell, and the inactivation of tumor suppressor genes, which prevents the destruction of damaged cells via apoptosis (Vogelstein and Kinzler, 1993). The etiology of cancer is a complex interplay of numerous acquired genetic abnormalities, including amplification of oncogenes, deletion of tumor suppressor genes, gene rearrangements and loss or gain of function mutations (Gray et al., 2000). Some of the alterations have been convincingly shown to promote colon carcinogenesis (Figure 1-1; Kinzler and Vogelstein, 1996). During the past decade, some important aspects of the mechanism of colon tumorigenesis have been revealed. For example, activation of the Wnt signaling pathway through APC mutation is a predominant event in colon cancer. In fact, germline APC mutation causes the hereditary familial adenomatous polyposis (FAP) syndrome, characterized by hundreds of intestinal polyps in affected individuals (Kinzler and Vogelstein, 1996). APC protein promotes the degradation of β-catenin, thus suppressing the β-catenin-mediated Wnt signaling pathway. Alternatively, a subset of colon cancers activates Wnt signaling through activating mutation of β-catenin. Pro-tumorigenic genes, such as c-myc, cyclin D1 and PPAR-δ, are induced by the Wnt signaling pathway.

Figure 1-1 Genetic changes associated with colorectal tumorigenesis (Kinzler and Vogelstein, 1996).

Dysfunction of the DNA mismatch repair complex leads to genomic instability, characterized by microsatellite instability (MSI), in around 13% of colon cancers. Indeed, germline mutations of hMLH1 or hMSH2, two genes encoding major mismatch repair proteins, are found in most hereditary non-polyposis colon cancer (HNPCC; Lu et al.) kindreds, accounting for 5% of all colon cancer cases. Meanwhile, the most common cause of sporadic MSI colon cancer is epigenetic methylation of both hMLH1 alleles (Grady and Markowitz, 2002, 2003; Moinova et al., 2002).

In addition, 50% of colon cancer cases harbor an oncogenic mutation of RAS, especially K-RAS. First found in colon cancer, inactivating mutation of p53 is now recognized as the most common genetic event in human cancer (Moinova et al., 2002).

TGF-β signaling is another major growth inhibitory pathway in colon epithelial cells. MSI cancers ubiquitously inactivate the TGF-β type II receptor (RII) by frameshift mutation within a 10 base pair coding region poly-A repeat (Markowitz et al., 1995). Mutations in the kinase domain of RII are found in around 15% of microsatellite stable cancers (Grady et al., 1999). Smad2 and Smad4, two downstream mediators of TGF-β signaling, are also targets of mutation in colon cancer, underscoring the importance of the TGF-β mediated tumor suppressor pathway. More details about this pathway are presented below.

1.4 The Role of TGF-β Signaling in Colorectal Cancer

Transforming growth factor β (TGF-β) is a widely studied growth factor involved in the regulation of cell proliferation, differentiation, migration, apoptosis and tumorigenesis (Derynck and Feng, 1997; Massague et al., 2000). The three members of the TGF-β family (TGF-β1, TGF-β2 and TGF-β3) share the same receptor signaling pathway. TGF-β signals through a heteromeric complex of two transmembrane serine/threonine kinases, named “type I” (RI) and “type II” (RII) receptors. TGF-β binding induces the type II receptor to phosphorylate the glycine-serine-rich region (GS box) upstream of the kinase domain of the type I receptor. Upon phosphorylation, the type I receptor undergoes auto-phosphorylation and phosphorylates downstream target proteins.

Smad proteins are a family of well-characterized downstream mediators of TGF-β signaling. The Smad protein family is evolutionarily conserved from C. elegans, Drosophila melanogaster and Xenopus to human. The N- and C-terminal domains of human Smads are highly homologous to their counterparts in the Drosophila orthologue, Mad, and thus are named the MH1 and MH2 domains, respectively. The MH1 and MH2 domains are connected by a less well-conserved, proline-rich linker domain. Smad proteins are divided into three major classes: (a) the receptor-regulated Smads (R-Smads), including Smad1, 2, 3 and 5; (b) the common Smad (Co-Smad), Smad4, which binds R-Smads to form a heteromeric complex and further regulate gene transcription; (c) the inhibitory Smads (I-Smads), including Smad6 and Smad7, which antagonize the Smad-mediated TGF-β pathway.

Activated RI phosphorylates Smad2 and Smad3 on two serine residues in a conserved SS(M/V)S motif located at the C-terminus of the R-Smads. Activated Smad2 and Smad3 bind to the Co-Smad (Smad4), translocate to the nucleus and modulate gene expression. Smad4 acts as a convergent node in that it can interact with all R-Smads downstream of the TGF-β superfamily signaling pathways. Smad4 was originally thought to be essential for TGF-β induced Smad translocation and regulation of gene expression. However, Smad2 and Smad3 may achieve nuclear translocation without Smad4 (Fink et al., 2003), and not all gene expression changes with TGF-β treatment are dependent on the presence of Smad4 (Hocevar et al., 1999).

The translocated Smad complex regulates transcription of various target genes through binding to cis-regulatory Smad-binding sequences and to other transcription factors such as p300/CBP, ski, etc. Thus, it is widely believed that TGF-β exerts its effects by regulating the transcription of various genes, although TGF-β modulation of the stability of specific mRNAs and proteins has also been reported (Uchida et al., 2003; Westerhausen et al., 1991).

The past decade has seen an accumulation of evidence demonstrating the tumor suppressor role of the TGF-β signaling pathway in colorectal cancer. As mentioned above, the TGF-β pathway may be targeted in colon tumorigenesis at the level of either the receptor (RII) or the downstream mediators (Smad2, Smad4).

1.4.1 TGFbR2

a) MSI

It has long been known that most colon cancer cell lines have lost responsiveness to TGF-β mediated growth inhibition. In fact, almost all MSI colon cancers have inactivated both alleles of TGFbR2, which encodes the type II TGF-β receptor (Markowitz et al., 1995). In exon 3 of TGFbR2, there is a 10 base pair poly-adenine repeat resembling a microsatellite sequence, named BAT-RII (Big Adenine Tract-RII). In one study, 100 out of 110 MSI colon cancers had a one or two nucleotide insertion or deletion in BAT-RII, resulting in a truncated RII protein product (Parsons et al., 1995). This mutant RII cannot transmit TGF-β signaling because it lacks the transmembrane and kinase domains. The MSI tumors that did not possess biallelic BAT-RII mutations had missense mutations in the residual TGFbR2 allele.

b) MSS

The tumor suppressor role of TGF-β is further confirmed by the finding that 15% of microsatellite stable (MSS) colon tumors contain inactivating mutations in TGFbR2. These mutations are not frameshift mutations in BAT-RII but are missense mutations in the kinase domain or putative binding domain of TGFbR2, leading to inactivation of TGF-β signaling (Grady et al., 1999). In aggregate, the overall incidence of TGFbR2 mutation in both MSS and MSI colon cancers approaches 30%.

c) Germline Mutation

A germline mutation of TGFbR2 has been identified in a Japanese colon-cancer family. This family is characterized by an HNPCC-like presentation but with an older age of onset of the tumors. The residual wild-type allele of TGFbR2 is inactivated in the tumors arising in this family. This provides independent genetic evidence supporting the tumor suppressor role of TGFbR2 (Lu et al., 1998).

d) Functional Reconstitution

Reconstitution of wild-type type II receptor expression in an MSI colon cancer cell line (e.g. HCT116) has led to growth delay in vitro and in vivo. In fact, the observed effect may underestimate the tumor suppressive activity of RII because the RII-reconstituted clones have been selected to tolerate RII expression (Wang et al., 1995).

1.4.2 Smads

a) Smad4 as tumor suppressor gene

Given the tumor suppressor role of TGFbR2 and the importance of Smad4 in propagation of TGF-β signaling, it is not surprising that Smad4 has been identified as a tumor suppressor gene. Indeed, inactivating mutation of Smad4 is found in 15% of colon cancers (Takagi et al., 1996), and germline mutation of Smad4 accounts for approximately one third of patients with familial juvenile polyposis (FJP), an autosomal dominant disease characterized by a predisposition to hamartomatous polyps and gastrointestinal cancers (Friedl et al., 1999; Howe et al., 1998).

Furthermore, mice with a heterozygous germline deletion of Smad4 develop gastric polyposis at 6-12 months of age, and polyps in the duodenum and cecum at lower frequency in older animals. With increasing age, polyps in the antrum show sequential changes from hyperplasia to dysplasia, in-situ carcinoma, and finally invasion (Takaku et al., 1999; Taketo and Takaku, 2000; Xu et al., 2000).

Another murine model, a compound heterozygote Smad4-/+/ApcΔ716, develops highly invasive colon cancer, as opposed to the benign intestinal adenomas found in the ApcΔ716 mouse (Takaku et al., 1998). In aggregate, Smad4 is a tumor suppressor gene which may play roles in both tumor initiation and progression.

b) Study of Smad2 and Smad3

Smad2 inactivation through missense mutations and deletions is found in around 5% of colon cancers (Riggins et al., 1996). The other Smad genes do not appear to be frequently altered in colon cancer. Although no Smad3 mutation has been found in human cancer, mice homozygously null for Smad3 develop aggressive metastatic colorectal cancer at an early age in a manner that seems to be highly dependent on the genetic background of the mice (Grady and Markowitz, 2002). In conclusion, Smad mutations and deletions appear to play a role in tumor formation in a subset of colon cancers.

1.5 DNA repair pathways and colorectal cancer

The discovery that alterations of the DNA mismatch repair (MMR) system were linked to the common human cancer susceptibility syndrome HNPCC resulted in the recognition of a third class of genes involved in tumor development. In addition to oncogenes and tumor suppressors, alterations of DNA repair genes involved in maintaining genomic stability were found to be a clear cause of tumorigenesis. Nearly all tumors display some form of genomic instability, whether at the level of single nucleotides or of chromosomes. This observation suggested that the establishment of genomic instability, termed the Mutator Phenotype, was an important aspect of tumor development (Loeb, 1991; Loeb et al., 1974). Since the initial identification of the human MutS homolog hMSH2 nearly a decade ago (Fishel et al., 1993; Leach et al., 1993), more links have been described between human cancers and genes involved in maintaining genomic stability (Heinen et al., 2002).

Table 1-1 DNA repair genes linked to familial cancer syndromes

Familial cancer syndrome              Disease genes
Hereditary nonpolyposis colon cancer  MSH2, MLH1, MSH6
Familial breast cancer                BRCA1, BRCA2, CHK2
Fanconi anemia                        FANC-A, -C, -D2, -E, -F, -G, BRCA2
Ataxia telangiectasia                 ATM
Li Fraumeni syndrome                  CHK2, p53
Bloom’s syndrome                      BLM
Werner’s syndrome                     WRN
Nijmegen breakage syndrome            NBS1
Ataxia telangiectasia-like disease    MRE11
Xeroderma pigmentosum                 XPA, XPB, XPC, XPD, XPE, XPF, XPG

As mentioned previously, the DNA mismatch repair system (also known as the Mutation Mismatch Repair (MMR) system) consists of a complex of proteins that recognize and repair base pair mismatches that occur during DNA replication. Microsatellite Instability (MSI) occurs as a consequence of inactivation of the MMR system and is recognized by frameshift mutations in microsatellite repeats located throughout the genome. Inactivation of the MMR system due to germline gene defects accounts for the colon-cancer family syndrome, Hereditary Nonpolyposis Colon Cancer Syndrome (HNPCC). Somatic inactivation of the mismatch repair system accounts for approximately 15% of sporadic colon cancers (Grady et al., 2002). Alterations in at least 6 of the genes that encode proteins involved in the MMR system have been identified in either HNPCC or sporadic colon cancer, including hMSH2, hMSH3, hMSH6, hMLH1, hPMS1 and hPMS2 (Grady, 2004).

The MSI tumor formation process has been termed the microsatellite mutator phenotype and is a pathway to tumor formation that is distinct from that seen in colon cancers that are microsatellite stable (Yamamoto et al., 1997, 1998). The most frequently targeted gene in MSI tumors is the TGFβ receptor type II tumor suppressor gene (TGFβ RII), which undergoes somatic frameshift mutations at a repeat stretch of adenines due to a germline defect in MMR (Markowitz et al., 1995; Parsons et al., 1995).
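The effect of a frameshift at an adenine tract can be illustrated with a small sketch. The sequence below is a hypothetical toy coding sequence, not the real TGFβ RII sequence: it contains only a start codon, a 10-adenine tract and a few arbitrary codons ending in a stop codon. Under these assumptions, deleting one adenine shifts the reading frame and exposes a premature stop codon, whereas an in-frame deletion of three adenines does not.

```python
# Toy illustration of a frameshift at a poly-A tract.
# The sequence is hypothetical and NOT the real TGF-beta RII coding sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}

UPSTREAM = "ATGGCC"                   # start codon plus one codon (made up)
DOWNSTREAM = "GCCTGGTGAGACTTTGCTAA"   # arbitrary tail ending in a TAA stop


def build_cds(n_adenines):
    """Assemble the toy coding sequence with an adenine tract of a given length."""
    return UPSTREAM + "A" * n_adenines + DOWNSTREAM


def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop_index(seq):
    """Return the codon index of the first in-frame stop codon, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


wild_type = first_stop_index(build_cds(10))       # intact 10 bp tract
minus_one = first_stop_index(build_cds(9))        # 1 bp deletion: frameshift
minus_three = first_stop_index(build_cds(7))      # 3 bp deletion: frame kept

print("wild type stop codon index:", wild_type)
print("-1 frameshift stop codon index:", minus_one)
print("-3 in-frame deletion stop codon index:", minus_three)
```

In this toy sequence the one-base deletion moves the first stop codon upstream, which is the kind of change that would yield a truncated protein product; the exact positions have no biological meaning.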

Besides MMR, the base excision repair (BER) pathway also plays an important role in colorectal cancer. One example is MutYH-associated polyposis (MAP). MAP is a recently described colorectal adenoma and carcinoma predisposition syndrome that is associated with biallelic inherited mutations of the human MutY homologue gene, MutYH (also called MYH). Reactive oxygen species generated during aerobic metabolism represent a potent source of DNA damage. Notably, the oxidation product 8-oxoG is stable and readily mispairs with adenine (instead of cytosine). Unless repaired, this leads to a G:C to T:A transversion at the next round of DNA replication. MutYH is a DNA glycosylase that plays a key role in BER of 8-oxoG:A (and G:A) mismatches by removing the mismatched adenine. MAP tumors display a mutational signature of somatic guanine-to-thymine transversion mutations in the adenomatous polyposis coli and K-ras genes, reflecting the normal role of MutYH in the base excision repair of adenines misincorporated opposite 7,8-dihydro-8-oxoguanine, a prevalent and stable product of oxidative damage to DNA (Sampson et al., 2005).
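The route from an oxidized guanine to a fixed G:C to T:A transversion can be traced with a minimal sketch. It makes a deliberate simplification that is an assumption of this example only: 8-oxoG (written "O" below) always pairs with adenine and no repair takes place. The function names are illustrative.

```python
# Minimal sketch of how unrepaired 8-oxoG can fix a G:C -> T:A transversion.
# Assumption for illustration: 8-oxoG ("O") always mispairs with adenine
# and no base excision repair occurs.

PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "O": "A"}


def replicate(template_base):
    """Return the base incorporated opposite a template base."""
    return PAIRING[template_base]


def trace_transversion():
    """Follow the strand synthesized opposite an oxidized guanine."""
    # Original pair is G:C; the G is oxidized to 8-oxoG ("O").
    first_round = replicate("O")            # A is misincorporated opposite 8-oxoG
    second_round = replicate(first_round)   # T is incorporated opposite that A
    return first_round, second_round


new_partner, replaced_base = trace_transversion()
print("Base opposite 8-oxoG after first replication:", new_partner)
print("Base now occupying the original G position:", replaced_base)
```

Removing the adenine opposite 8-oxoG before the second replication, which is the role the text assigns to MutYH, would interrupt this sequence of events.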

1.6 Microarray techniques in cancer research

Tumorigenesis is governed by a series of complex genetic and epigenetic changes. Both mechanisms can result in either the silencing or aberrant expression of messages in a cell. As a widely used gene expression profiling technique, microarray analysis can provide global overviews of these changes, as well as identify key genes and pathways involved in this process. Essentially there are two types of array: those that carry PCR products from cloned nucleic acids (e.g. cDNA, BACs, cosmids) and those that use oligonucleotides. Each has advantages and disadvantages, but it is now possible to survey genome-wide DNA copy number abnormalities and expression levels to allow correlations between losses, gains and amplifications in tumor cells and genes that are over- and under-expressed in the same samples. The gene expression arrays that provide estimates of mRNA levels in tumors have given rise to exon-specific arrays that can identify gene expression levels, alternative splicing events and mRNA processing alterations. Oligonucleotide arrays are also being used to interrogate single nucleotide polymorphisms (SNPs) throughout the genome for linkage and association studies, and these have been adapted to quantify copy number abnormalities and loss of heterozygosity events. To identify as yet unknown transcripts, tiling arrays across the genome have been developed, which can also identify DNA methylation changes and be used to identify DNA-protein interactions using ChIP on Chip protocols. Ultimately DNA arrays will allow resequencing of chromosome regions and whole genomes. With all of these capabilities becoming routine in laboratories, the idea of a systematic characterization of the sum of genetic events that give rise to a cancer cell is rapidly becoming a reality.

Using Affymetrix cDNA array techniques covering around 50,000 human ESTs, our lab has identified Slc5A8 and HLTF as potential tumor suppressors, and CCSP1 and CCSP2 as potential oncogenes (Li et al., 2003; Moinova et al., 2002; Xin et al., 2005), by comparing expression of transcripts in normal colon epithelium, colon cancers at different stages and liver metastases. Since TGF-β signaling represents an important tumor suppressor pathway in colorectal cancer, and many TGF-β responses result from changes in the expression of its target genes, the same array technology was also used to compare expression of transcripts in V330 cell lines with and without TGF-β treatment. Combining these two microarray experiments greatly narrowed the candidate list of cancer-related genes and increased the probability of finding important genes involved in colorectal cancer development. Based on these two sets of array data, our lab has identified 15-hydroxyprostaglandin dehydrogenase (15-PGDH) as a tumor suppressor. 15-PGDH is a prostaglandin-degrading enzyme that physiologically antagonizes COX-2. Yan et al. found that 15-PGDH transcript and protein are both highly expressed by normal colonic epithelia but are nearly undetectable in colon cancers, and that colonic 15-PGDH expression is directly controlled and strongly induced by activation of the TGF-β tumor suppressor pathway. Later work showed that restoring 15-PGDH expression in colon cancer cells by gene transfection strongly inhibited the ability of these cells to form tumors in immune-deficient mice, and that 15-PGDH gene knockout induces a marked 7.6-fold increase in colon tumors arising in the Min (multiple intestinal neoplasia) mouse model (Myung et al., 2006; Yan et al., 2004). These exciting discoveries greatly strengthened our confidence in using this method to identify more important genes involved in tumorigenesis, which was the original objective of this project.


Chapter 2 Identification of a novel gene, Ugene, over-expressed in most cancer types

2.1 Expression of EST U46258 transcript is consistently up-regulated in colon cancer by DNA microarray analysis

As normal colon epithelial cells undergo malignant transformation, their gene expression pattern undergoes corresponding changes. To identify genes involved in colon tumorigenesis, we performed DNA microarray experiments to compare the expression profiles of approximately 55,000 genes, EST clusters, and predicted exons in 21 dissected colon epithelial strips from normal colonic mucosa, a group of 72 primary and metastatic colon cancer resection specimens and 36 colon cancer cell lines (Platzer et al., 2002). Analysis of the microarray data revealed a group of genes showing different expression patterns in cancers versus normal tissues, one of them being an EST with GenBank accession number U46258. In normal colon samples, EST U46258 expression was low. Indeed, among 21 normal colonic mucosa samples from different patients, the average level of U46258 expression was at the measurement floor of the assay, a value of 25 average intensity (AI) units. In contrast, in colon cancer tissues, U46258 expression was dramatically increased. It was expressed with an average level of 124 in Dukes’ B stage (stage II) cancers that were confined to the wall of the colon; 131 in Dukes’ C stage (stage III) cancers that had spread to regional lymph nodes; and 92 in Dukes’ D stage (stage IV) cancers that had spread beyond the confines of the lymph nodes to distant organs (Figure 2-1). This high expression of U46258 was also detected in liver metastases of colorectal origin, with an average level of 126. In colon cancer cell lines, which consist of pure populations of epithelial cells, U46258 showed the highest expression, with an average level of 220. U46258 induction was found to be specific to colon cancer, and was absent in nine precursor colon adenoma cases (median value 25, corresponding to null expression).
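To put the reported averages on a common scale, the short sketch below divides each group average by the 25 AI measurement floor observed in normal colon. The group labels and variable names are chosen for this example only. Because the normal samples sit at the assay floor, these ratios are best read as lower bounds on the true fold induction rather than as exact values.

```python
# Fold difference of U46258 average intensity relative to the assay floor.
# The numbers are the group averages quoted in the text; names are illustrative.

ASSAY_FLOOR_AI = 25.0  # average intensity (AI) units in normal colon

average_intensity = {
    "Stage II primary cancers": 124,
    "Stage III primary cancers": 131,
    "Stage IV primary cancers": 92,
    "Liver metastases": 126,
    "Colon cancer cell lines": 220,
}

# Ratio of each group average to the normal-colon floor value.
fold_over_floor = {
    group: intensity / ASSAY_FLOOR_AI
    for group, intensity in average_intensity.items()
}

for group, fold in fold_over_floor.items():
    print(f"{group}: at least {fold:.2f}-fold over normal colon")
```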

[Figure 2-1 plot: average intensity (y-axis, 0 to 600) of U46258 in normal colon epithelium, adenoma, colon cancer Stages II, III and IV, metastasis, and colon cancer cell lines.]

Figure 2-1 Expression of EST U46258 on DNA microarrays

Expression of U46258 on GeneChip microarrays. Shown for comparison are analyses of RNA samples from normal colon epithelium, colon adenomas, primary colon cancers of Stages II, III and IV, colon cancer hepatic metastases, and colon cancer cell lines. Horizontal bars denote median expression values within each group. 25 average intensity (AI) units correspond to null expression.

2.2 Identification of Ugene

U46258 corresponds to a 123-bp EST sequence mapping to chromosome 1q32.

To derive a full-length transcript corresponding to this EST, I used a method called EST walking. Specifically, I used this EST sequence to BLAST the NCBI EST database, aligned all the hits to obtain a longer consensus, and then used this consensus to BLAST the same database again, repeating until the consensus could no longer be extended. By EST walking, I obtained a 2335-nt mRNA sequence that mapped to within 20 kb of U46258. Several methods were then applied to confirm that this mRNA exists in human cells and that it represents a full-length transcript. First, connection RT-PCR from colon cancer cell lines (SW480 and VACO241) using 2 pairs of primers amplified the transcript from exon 1 to exon 4, including the U46258 sequence. Second, 5’ rapid amplification of cDNA ends (RACE) confirmed that there was no additional exon upstream of exon 1, and 3’ RACE showed that exon 4 was the true end of this transcript. Third, Northern blotting with a probe from the middle part of this transcript (covering all four exons) showed a single band whose size was between 2000 and 2500 nt. Blasting this transcript in

the NCBI database, I found that it was a novel gene with no publications on it. This

gene, named Ugene, has an mRNA consisting of 4 exons. The open-reading frame

encodes a 149 amino acid (aa) protein (16.9 kDa) with an ATG start codon in exon

1 and a TAA stop codon in exon 4 (Figure 2-2). Searching online databases did not

identify any known protein motifs contained within the Ugene sequence. BLAST

alignment results indicated that the Ugene mRNA sequence shared 99% identity

to a predicted human mRNA (accession number XM_001133365), designated as

“Homo sapiens similar to family with sequence similarity 72, member A”, which

was computationally predicted from the Ugene locus (NCBI contig NT_086602)

by the Gnomon gene prediction method.
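The iterative logic of EST walking described above can be sketched as a toy program. This is not the procedure actually used, which relied on BLAST searches of the NCBI EST database and alignment of the hits; here a small in-memory list of reads stands in for the database, exact string overlap stands in for alignment, and all names (`overlap_extend`, `est_walk`, `min_overlap`, `max_rounds`) and the example sequences are hypothetical.

```python
def overlap_extend(consensus, read, min_overlap):
    """Extend the consensus with one read if they overlap exactly."""
    # A read that fully contains the consensus replaces it.
    if len(read) > len(consensus) and consensus in read:
        return read
    longest = min(len(consensus), len(read) - 1)
    # Right extension: a suffix of the consensus matches a prefix of the read.
    for k in range(longest, min_overlap - 1, -1):
        if consensus.endswith(read[:k]):
            return consensus + read[k:]
    # Left extension: a prefix of the consensus matches a suffix of the read.
    for k in range(longest, min_overlap - 1, -1):
        if consensus.startswith(read[-k:]):
            return read[:-k] + consensus
    return consensus


def est_walk(seed, est_db, min_overlap=8, max_rounds=100):
    """Repeat search-and-extend rounds until the consensus stops growing."""
    consensus = seed
    for _ in range(max_rounds):  # cap guards against endless growth on repeats
        extended = consensus
        for read in est_db:
            extended = overlap_extend(extended, read, min_overlap)
        if len(extended) == len(consensus):
            break
        consensus = extended
    return consensus


# Hypothetical 49-nt "transcript" cut into three overlapping reads.
transcript = "ATGGCGTACCTGAAGTTCGATCCGTAAGCTTGGCACTGACCATGTTAGC"
reads = [transcript[0:22], transcript[10:36], transcript[26:49]]
seed = transcript[14:26]  # plays the role of the short starting EST

walked = est_walk(seed, reads)
print(walked == transcript)
```

In this toy case the 12-nt seed is extended in a single round to the full sequence, and a second round confirms that no read extends it further, which mirrors the stopping rule of the method.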

To further characterize the Ugene transcript, we sequenced 5 or more individually

cloned Ugene cDNAs from each of 9 colon cancer cell lines. Surprisingly, each

line demonstrated two distinct Ugene transcripts, designated as Ugene-p and

Ugene-q, which differed at 3 individual nucleotide positions in exons 1, 3 and 4,

respectively. Two of the three nucleotide differences result in amino acid changes

at codon 99 (Gly/Arg) and codon 125 (Trp/Arg) (Figure 2-2, Table 2-1). These

findings are suggestive that Ugene-p and Ugene-q arise from separate gene loci.

In support of this interpretation we additionally demonstrated amplification from

genomic DNA of both Ugene-p and Ugene-q specific forms of exons 3 and 4 in

both normal colon (n=20) and colon cancer cell line (n=6) samples.

BLAST analysis showed 100% correct alignment of the four Ugene-p exons to

two different positions on Chromosome 1 (corresponding to GenBank accession

numbers NT_086586, NT_086602, Figure 2-3), suggesting either mis-assembly

of Chromosome 1 or multiple copies of this gene. Additionally, BLAST

alignment identified 2 exons corresponding to Ugene-q exons 3 and 4 as located on Chromosome 1, but at a position different from Ugene-p (corresponding to

GenBank accession number NT_034403, Figure 2-3). Our cDNA cloning supports that in fact four distinct Ugene-q exons exist, likely all located on

Chromosome 1. Because Ugene-p is conserved across all mammalian species, we focused on Ugene-p in this study.

Figure 2-2 Structure of the Ugene locus.

Numbered black boxes denote the 4 Ugene exons, with locations of initiator ATG and termination TAA designated. Sites which differentiate Ugene-p and Ugene-q are indicated by arrow heads. P1, P2, P3 and P4 indicate the primers used in the connection RT-PCR. Nucleotide and deduced amino acid (aa) sequence of complete Ugene (Ugene- p) coding region. The Ugene nucleotide sequence is provided in the upper reading frame and numbered in roman type. The deduced aa sequence is provided underneath the nucleotide sequence, and is numbered in italic. The in-frame stop codon 5’ of the start codon is indicated in boldface. Underlined letters represent codons that differ in Ugene-q.

Figure 2-3 Ugene consists of two different transcripts, Ugene-p and Ugene-q

A) Example of Ugene sequencing result. Red arrows show the double signals at the position 1143(G/A) and 1221(T/C). B) BLAST with Ugene sequence. The upper panel shows that Ugene has four hit GIs on Chromosome 1, one on p arm and three on q arm. The lower panel shows a zoomed-in view of Chromosome 1. GenBank sequence indicates that the top and the bottom ones are Ugene-p and the middle two are Ugene-q.

Table 2-1 Difference between Ugene-p and Ugene-q

Position    DNA/mRNA sequence       Encoded protein
number      Ugene-p    Ugene-q      Ugene-p      Ugene-q

854         T          C            Ser (TCT)    Ser (TCC)
1143        G          A            Gly (GGA)    Arg (AGA)
1221        T          C            Trp (TGG)    Arg (CGG)
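The codon assignments in Table 2-1 can be checked against the standard genetic code with a few lines of code. The sketch below includes only the six codons that appear in the table; the variable and function names are illustrative and are not part of the original analysis.

```python
# Subset of the standard genetic code covering the codons in Table 2-1.
CODON_TO_AA = {
    "TCT": "Ser", "TCC": "Ser",
    "GGA": "Gly", "AGA": "Arg",
    "TGG": "Trp", "CGG": "Arg",
}

# (nucleotide position, Ugene-p codon, Ugene-q codon) as listed in Table 2-1.
TABLE_2_1 = [
    (854, "TCT", "TCC"),
    (1143, "GGA", "AGA"),
    (1221, "TGG", "CGG"),
]


def classify_change(p_codon, q_codon):
    """Report whether a codon difference changes the encoded amino acid."""
    aa_p, aa_q = CODON_TO_AA[p_codon], CODON_TO_AA[q_codon]
    return "synonymous" if aa_p == aa_q else f"{aa_p}->{aa_q}"


changes = {pos: classify_change(p, q) for pos, p, q in TABLE_2_1}

# Each pair of codons should differ at exactly one nucleotide.
single_base = all(sum(a != b for a, b in zip(p, q)) == 1 for _, p, q in TABLE_2_1)

print(changes, single_base)
```

The check agrees with the text: the difference at position 854 is silent, while those at positions 1143 and 1221 produce the Gly/Arg and Trp/Arg substitutions.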

2.3 Validation of Ugene expression in colon cancer vs. normal

colon

2.3.1 Ugene expression is elevated in colon cancer

To confirm the DNA microarray findings, Ugene expression was first analyzed by

Northern blotting analysis. As shown in Figure 2-4A, Northern blot analysis corroborated that Ugene transcripts are expressed by malignant but not normal colon tissues, detecting a single 2.4 kb Ugene mRNA with moderate to strong intensity in ten of 11 colon cancer cell lines, but in none of 4 normal colon

epithelial tissue samples. To provide a more quantitative measurement of Ugene induction, we extended this analysis by employing real-time PCR. Real-time PCR demonstrated only barely detectable Ugene expression in eleven of 11 normal

colon epithelial samples (mean value 2.6, range 1.1 – 4.7), whereas colon cancer

cell lines showed an average 56-fold increased level of expression (mean value

147, range 3.3 - 503), with eleven of 13 colon cancer cell lines showing a greater

than 15-fold increase in expression (Figure 2-4B).

Over-expression of Ugene in malignant colon cancer was also confirmed by real-

time PCR analysis of Ugene mRNA in primary colon cancers versus matched

normal colon mucosa from the same individuals. A median increase of 6.8 fold in

Ugene expression was observed in cancers versus matched colon normals, with

greater than 2-fold increase exhibited by 18 of the 20 tumors (Figure 2-4C). These

20 colon cancers examined by real-time-PCR constituted a ‘validation set’ of

samples completely independent of those that had been previously characterized

on the GeneChip expression microarrays.

Taken together, up-regulation of Ugene mRNA expression in colon cancer

suggested by the microarray analysis was further confirmed by three independent

techniques applied to three independent sets of colon cancer samples.

Figure 2-4 Ugene mRNA expression in normal and cancer samples.

A) Upper panel shows Northern blot analysis of Ugene expression in four normal colon epithelium samples (Normal-1, -2, -3, and -4) versus 12 colon cancer cell lines. Lower panel shows the ethidium bromide staining of 28S and 18S ribosomal RNA subunits for each of the corresponding samples. B) Real-time PCR measurement of Ugene transcript expression in 8 normal colon epithelial samples versus 15 colon cancer cell lines. Ugene values are normalized against expression of the house-keeping gene B2M. Black bars indicate the mean value for each group. C) The ratio of Ugene expression in colon cancer versus matched normal colon mucosa, as measured by real time PCR in 20 patients. Ugene values are normalized against expression of the house-keeping gene B2M. Values represent averages of six replicates.

Table 2-2 Ratio of Ugene-p and Ugene-q in colon cancer cell lines

Cell line   Ugene-p   Ugene-q   Neither

V241        15/30     13/30     2/30
SW480       19/32     13/32     0/32

PCR: 24 cycles

2.3.2 Both Ugene-p and Ugene-q are over-expressed in colon cancer

To determine expression of Ugene-p versus Ugene-q encoded transcripts, we sequenced individual Ugene cDNA clones from two colon cancer cell lines,

SW480 and VACO241. Excluding clones that matched neither form, Ugene-p accounted for approximately 54% (VACO241) and 59% (SW480) of Ugene clones, and Ugene-q for approximately 46% and 41%, respectively (Table 2-2). Given that Ugene expression is up-regulated by more than 20-fold in both cell lines compared with normal colon (real-time PCR, Figure 2-4B), this result suggests that both Ugene-p and Ugene-q are over-expressed in colon cancer.
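The isoform proportions can be recomputed directly from the clone counts in Table 2-2. The sketch below assumes that clones scored as "Neither" are excluded from the denominator and that clone frequency after 24 PCR cycles reflects relative transcript abundance; the names are illustrative.

```python
# Clone counts from Table 2-2 (numbers of sequenced cDNA clones).
clone_counts = {
    "V241": {"p": 15, "q": 13, "neither": 2},
    "SW480": {"p": 19, "q": 13, "neither": 0},
}


def isoform_percentages(counts):
    """Percentage of Ugene-p and Ugene-q among clones assigned to either form."""
    assigned = counts["p"] + counts["q"]
    return {form: round(100.0 * counts[form] / assigned, 1) for form in ("p", "q")}


percentages = {line: isoform_percentages(c) for line, c in clone_counts.items()}

for line, pct in percentages.items():
    print(f"{line}: Ugene-p {pct['p']}%, Ugene-q {pct['q']}%")
```

Both forms make up a substantial share of the clones in each line, which is the basis for concluding that neither form alone accounts for the elevated expression.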

2.3.3 Ugene is over-expressed in many types of cancers

To examine whether Ugene is over-expressed in other cancer types, we probed a

cDNA Cancer Profiling Array (BD Biosciences), comparing Ugene expression

level in matched tumor and normal tissue from a variety of organs. As expected, a high proportion of colon cancer samples were observed to have elevated expression of Ugene (Figure 2-5). Using densitometry, the intensity of the radioactive probe signal from each cDNA sample was quantitated. Twenty-two of

34 (65%) of the colon cancer cases showed greater than 2-fold increased expression of Ugene. Furthermore, we also found Ugene expression elevated in multiple other common cancer types, including breast (56% of cases), lung (52% of cases), stomach (64% of cases), uterus (67% of cases), and ovary (79% of cases).

Figure 2-5 Ugene expression in cancers

cDNA dot blot analysis of Ugene expression level in matched tumor and normal tissue from various organs. N: normal; T: tumor. Numbers above each blot represent the number of tumors with >2-fold increase in Ugene expression relative to the matched normal (Ugene (↑)) versus the total number of samples analyzed (total #). Each L-shaped box of three dots represents normal tissue (left), primary tumor (upper right), and metastasis (lower right) from the same patient. Equal sample loading over all lanes was confirmed by rehybridizing the blot with a ubiquitin probe (data not shown).

2.4 Investigation of the possible causes for the elevated expression

level of Ugene in colon cancers

2.4.1 Chromosome amplification and rearrangement

Using the Ugene ORF region as a probe, Southern blotting of 12 colon cancer cell lines did not demonstrate any increase in Ugene gene copy number as an explanation for Ugene over-expression (Figure 2-6). Furthermore, there was no change in the size of the band corresponding to the Ugene promoter region, which indicated no chromosome rearrangement (Figure 2-6). Therefore, amplification and rearrangement are unlikely to account for the over-expression of Ugene in colon cancers.

[Figure 2-6 image: Southern blots probed for Ugene (upper panel) and TGFβ RII (lower panel).]

Cell type:            V364  V576  V503  V400  V389  V429  V9M   V394  V330  V411  V425  V10M  N1    N2    N3    N4
Ratio of Ugene/RII:   0.34  0.16  0.28  0.26  0.31  0.30  0.25  0.27  0.26  0.36  0.22  0.34  0.20  0.27  0.27  0.23

Real-time: 71.8 5.65 5.92 33.5 7.82 21.1 28.5 8.98 2.09 0.47 0.37

Chip data: 297 148 36 249 269 306 374 241 196 120 175 206 19

Figure 2-6 Detection of amplification, rearrangement of Ugene

Southern blotting analysis with probe in ORF region of Ugene cDNA sequence in 4 normal colon epithelium samples and 12 colon cancer cell lines (upper panel). Lower panel shows Southern blotting analysis with TGFβ RII probe as internal control. The genomic DNA of the samples was digested with PstI.

2.4.2 TGF-β regulation

Two completely independent microarray experiments consistently showed that Ugene could be down-regulated by TGF-β in the VACO330 cell line from as early as 24 hours after treatment (Figure 2-7). Moreover, the fact that Ugene is up-regulated in carcinoma but not in adenoma in the previous microarray data also supports the possibility that Ugene is a TGF-β target gene.

To validate the DNA microarray findings, Ugene expression in colon cancer cell lines with and without TGF-β treatment was first analyzed by Northern blot analysis. Since the TGF-β pathway is frequently mutated in colorectal cancer, most colon cancer cell lines are not sensitive to TGF-β treatment. Our previous data showed that the VACO330, VACO394, VACO235, and FET cell lines were still responsive to TGF-β treatment, so we used these four cell lines in this analysis. As shown in

Figure 2-8A, Ugene expression was clearly decreased in VACO330, VACO394 and FET cell lines after TGF-β treatment. Moreover, this down-regulation in

VACO330 cell line was highest after 72-hour treatment, compared with 24- and

48-hour treatment, which indicates that the suppression is time dependent. To quantify Ugene suppression by TGF-β, we employed real-time PCR technique.

Real-time PCR corroborated the Northern blot result and showed a 36.3-fold reduction in VACO330 cell line, 3.7-fold reduction in V394 cell line, and 6.5-fold reduction in FET cell line (Figure 2-8B).

[Figure 2-7 plots: two panels showing Ugene expression (y-axis) at 24h, 48h and 72h (x-axis), with (+TGFβ) and without (-TGFβ) treatment.]

Figure 2-7 TGF-β regulation of Ugene expression on microarray

Expression of Ugene on GeneChip microarrays. Shown for comparison are analyses of

RNA samples from VACO330 cell lines with (red) and without (blue) TGF-β treatment for 24, 48 and 72 hours.

Figure 2-8 Ugene can be down-regulated by TGF-β

A) Upper panel shows Northern blot analysis of Ugene expression with (+) and without (-) TGF-β treatment in four colon cancer cell lines (V330, V394, V235 and FET). VACO330 cells were treated for 24, 48, and 72 hours. The other 3 cell lines were treated for 72 hours. Lower panel shows the ethidium bromide staining of 28S and 18S ribosomal RNA subunits for each of the corresponding samples. B) Real-time PCR quantification of Ugene transcript expression with and without TGF-β treatment in four colon cancer cell lines. Ugene values are normalized against expression of the house-keeping gene B2M. Numbers below the chart indicate the fold reduction of Ugene by TGF-β.

To determine whether Ugene-p or Ugene-q is down-regulated by TGF-β, we

sequenced individual Ugene cDNA clones from VACO330 cell line with and

without TGF-β treatment. We found that the ratio of Ugene-p to Ugene-q changed from 45% versus 55% before TGF-β treatment, to 79% versus 21% after TGF-β treatment (Table 2-3). Given that Ugene expression is down-regulated by more than 30 fold in VACO330 cell line after TGF-β treatment (real time PCR, Figure

2-8B), this result suggests that both Ugene-p and Ugene-q are suppressed by

TGF-β treatment.
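The inference that both forms are suppressed follows from combining the total fold reduction with the shift in clone proportions. A minimal worked sketch is given below, assuming that clone frequency among assigned clones (Table 2-3, "Neither" excluded) reflects relative transcript abundance and using the 36.3-fold reduction measured in VACO330 by real-time PCR; the function name is illustrative.

```python
TOTAL_FOLD_REDUCTION = 36.3  # VACO330, real-time PCR (Figure 2-8B)

# Clone counts from Table 2-3, excluding clones scored as "Neither".
before = {"p": 9, "q": 11}   # without TGF-beta
after = {"p": 19, "q": 5}    # with TGF-beta


def isoform_fold_reduction(form):
    """Fold reduction of one form = total reduction x (fraction before / fraction after)."""
    frac_before = before[form] / sum(before.values())
    frac_after = after[form] / sum(after.values())
    return TOTAL_FOLD_REDUCTION * frac_before / frac_after


reduction_p = isoform_fold_reduction("p")
reduction_q = isoform_fold_reduction("q")

print(f"Ugene-p: about {reduction_p:.1f}-fold lower")
print(f"Ugene-q: about {reduction_q:.1f}-fold lower")
```

Under these assumptions Ugene-p falls by roughly 20-fold and Ugene-q by roughly 95-fold, so the rise in the Ugene-p share reflects a stronger suppression of Ugene-q rather than an escape of Ugene-p from TGF-β regulation.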

Table 2-3 Ratio of Ugene-p and Ugene-q in V330

TGF-β   Ugene-p   Ugene-q   Neither

-       9/24      11/24     4/24
+       19/26     5/26      2/26

PCR: 24 cycles

2.5 Ugene expression is cell cycle dependent

Losing the proper regulation of cell division cycle can lead to cancer. Most of the genes

whose expression correlates with the proliferative state of tumors are found to be

periodically expressed during the cell cycle. Using cDNA microarray technology,

Whitfield et al. identified 874 genes that show periodic expression across the human cell cycle in HeLa cells (Whitfield et al., 2002). Included are two probe EST sequences, R22949 and AA628867, both of which are parts of the 3’ region of Ugene. Based on their data, they defined Ugene as a G2/M phase gene. Interestingly, among the genes expressed in this phase, there are several which contribute directly to tumor phenotypes.

For example, the two highest scoring genes in this study, STK15 and Polo-like kinase

(PLK), have transforming activity in NIH3T3 cells (Holtrich et al., 1994; Zhou et al.,

1998; Takai et al., 2001). STK15 has been shown to be amplified in human colon cancers and in cell lines derived from many other kinds of human tumors (Bischoff et al., 1998;

Zhou et al., 1998). High expression has also been observed in the absence of amplification and contributes to abnormal centrosome numbers in cell lines (Zhou et al.,

1998). PLK has also been implicated in centrosome maturation; injection of PLK into either HeLa cells or human foreskin fibroblasts resulted in reduced centrosome size and abnormal chromatin distribution (Lane and Nigg, 1996). Amplification or overexpression of STK15 and/or PLK has been postulated as a potential cause of aneuploidy in human tumors (Lengauer et al., 1998). The high expression of genes involved in centrosome duplication may contribute to the chromosomal translocations and aneuploidy found in

HeLa cells. In addition, Bub1 and Bub1B are both members of the G2/M cluster.

Mutations of Bub1 were found in various types of cancers, while Bub1B mutation has been found to cause constitutional aneuploidy and cancer predisposition (Myrie et al,

2000, Hanks et al, 2004).

To validate these microarray data, we employed the double thymidine block (DTB) method to synchronize HCT116 cells at the G1/S border. After removal of thymidine, all the cells started a new cell cycle simultaneously by entering S phase. Samples were then collected at different time points and tested for Ugene expression. Real-time PCR showed that Ugene mRNA expression was cell cycle dependent: it rose from G1 through S phase and reached its peak at G2/M phase. After cytokinesis, it decreased gradually and returned to baseline in G1 phase (Figure 2-9).

Figure 2-9 Cell cycle profiling of Ugene mRNA expression

A) Real time PCR measurement of Ugene expression at different time points after release from double thymidine block. The bars indicating cell cycle phases were generated according to the flow cytometry data, as demonstrated in B)

2.6 Ugene encodes a nuclear protein

To investigate the subcellular localization of Ugene encoded protein, constructs expressing V5-epitope tagged Ugene-p and Ugene-q protein were transfected into

SW480 cells. Figure 2-10 shows the immunofluorescent staining for the V5 tag

(green) in Ugene transfected cells. Results demonstrate that tagged Ugene protein accumulates in nuclei, which were defined by DAPI staining (red). As Ugene is a small protein (16.9kD) and lacks a nuclear localization signal, this accumulation suggested that Ugene might be held in the nucleus through interacting with other nuclear proteins.

Figure 2-10 Subcellular localization of Ugene.

SW480 cells were transfected with constructs expressing V5-tagged Ugene-p protein (Panel A) and Ugene-q protein (Panel B). The expression of Ugene-p and Ugene-q was detected with V5 antibody and visualized by fluorescent microscopy.

Materials and methods

Cell lines and tissues

VACO cell lines were established and maintained as previously described

(Willson et al., 1987). DLD1 and SW480 cell lines were obtained from American

Type Culture Collection (ATCC, Manassas, VA). Normal colons, primary colon cancers, and liver metastasis tissues were obtained from the archives of University

Hospitals of Cleveland (Cleveland, OH) under an IRB-approved protocol. Total

RNA and genomic DNA were prepared as described (Markowitz et al., 1995).

cDNA microarray data generation

We isolated total RNA from fresh frozen tissues using Trizol solution (Invitrogen, Carlsbad, CA) and a Polytron homogenizer (Kinematica, Cincinnati, OH) for samples >100 mg in weight or Fastprep beads (Bio101, Carlsbad, CA) for samples <100 mg in weight, or by CsCl banding of total RNA extracted in guanidine isothiocyanate. A second Trizol extraction and an RNeasy chromatography purification (Qiagen, Valencia, CA) were done to ensure high-quality total RNA. Ten µg of total RNA was used per sample for cDNA production employing the Superscript Choice System (Invitrogen, Carlsbad, CA). cRNA was generated

and labeled with biotin using the T7 MEGAscript protocol (Ambion, Austin, TX)

and purified by RNeasy chromatography purification (Qiagen, Valencia, CA).

cRNA was hybridized to a custom Affymetrix Gene chip (Eos Hu03) designed by

Eos Biotechnology, Inc. The single Eos Hu03 Gene chip contains >59,000

probesets which represent ~45,000 mRNAs and EST clusters along with 6,200 ab

initio predicted genes from the human genomic sequence not represented in the

mRNA and EST expressed sequences at the time of chip design. Labeled cRNA

was hybridized to the custom Affymetrix arrays using standard protocols

(Affymetrix, Santa Clara, CA), and raw image data was collected using the

Affymetrix Expression Array . Data was normalized using protocols and software developed at Eos Biotechnology (Ghandour and Glynne, 2000). In brief, probe intensity values were background subtracted and normalized to a gamma distribution. An average intensity (AI) was calculated from these probe intensities.

Rapid amplification of cDNA ends (RACE)-PCR

5′ and 3′ RACE-ready cDNA was generated from 2µg of total RNA (V241 cell line) using the 5′/3′ RACE kit (Roche, Indianapolis, IN). The gene-specific primers used for 5′ RACE were as follows: SP1, 5′-GCG GGA CCT AGA GCT

TTT CT-3′; SP2, 5′-GAG GCA GGT GGA GTT TGA AG-3′; and SP3, 5′-ATC

CCT TCC CCA GCA TTA AG-3′. The gene-specific primer for 3′ RACE was: 5′-

ACC TCA TCC TTC CTG CGA CG-3′. Full-length Ugene was PCR amplified from RACE-ready cDNA using the forward 5′-CCG ACT GAG CCT CTA AAG

CGA C-3′ and reverse 5′-TCC TGA TTC ACA AAC TCT TGC TCC-3′ primers.
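Basic properties of the primers listed above, such as length and GC content, can be checked with a short script. This is a convenience sketch only: the primer sequences are copied from the text with spaces removed, and the dictionary keys and function names are labels chosen here for illustration.

```python
# Primer sequences from the text (5' to 3'), written without spaces.
PRIMERS = {
    "SP1": "GCGGGACCTAGAGCTTTTCT",
    "SP2": "GAGGCAGGTGGAGTTTGAAG",
    "SP3": "ATCCCTTCCCCAGCATTAAG",
    "3RACE": "ACCTCATCCTTCCTGCGACG",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


summary = {
    name: {"length": len(seq), "gc": round(gc_fraction(seq), 2)}
    for name, seq in PRIMERS.items()
}

for name, info in summary.items():
    print(name, info["length"], "nt, GC fraction", info["gc"])
```

The reverse complement is useful for locating the 5′ RACE primers, which are antisense to the transcript, within the mRNA sequence.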

Northern analysis

Ten µg of total RNA were separated on a 1% formaldehyde agarose gel and transferred to Nytran SuPerCharge (Schleicher & Schuell, Keene, NH). The blot was prehybridized using express-hyb Buffer (Clontech, Palo Alto, CA) for 30 min

at 68°C. Amplified Ugene-p coding cDNA was labeled with 32P-dATP using

Strip-EZ DNA labeling kit (Ambion, Austin, TX) and purified with Sephadex G-

50 quick spin columns (Roche, Indianapolis, IN). The labeled probe was added to

the membrane in express-hyb buffer for 1.5 h at 68°C and washed according to

manufacturer’s protocol. The membrane was then exposed to a phosphor screen

overnight and then analyzed using a STORM optical scanner (Molecular

Dynamics, Sunnyvale, CA).

Southern analysis

Total genomic DNA from cell lines and normal tissues was digested with PstI,

separated by electrophoresis on a 0.8% agarose gel and transferred onto a Zeta-

Probe blotting membrane. 32P labeled DNA probes were prepared by random

primer extension of a fragment containing the Ugene coding sequence. Equal

loading of DNA was confirmed by rehybridizing blots with a probe directed against

the TGFβ-RII gene, which is relatively copy number invariant in CRC.

Human cancer dot blots

Radioactively labeled cDNA probes were synthesized from human Ugene or ubiquitin control cDNA using random primer labeling followed by probe

purification on CHROMA SPIN+STE-100 columns (BD Biosciences).

Hybridization of the Cancer Profiling Array with human Ugene probes and washings of the array were done according to the manufacturer's

recommendations (BD Biosciences). The hybridized Cancer Profiling Arrays were

then exposed to the phosphorimaging screens and scanned with a Storm 840

PhosphorImager. We then stripped this same membrane and hybridized it with a human ubiquitin cDNA probe to show equal sample loading.

Ugene real-time PCR

Primers and a fluorogenic hybridization probe were designed using Primer3

software (Rozen and Skaletsky, 2000). Ugene was amplified using 400 nM of

forward primer 5′-CTG TCT TCT TTC CTG CAA CAA C-3′ and reverse primer

5′-TAG GAC GTT TAC ACC TGT GGA G-3′, and detected using fluorogenic hybridization probe 5′-/56-FAM/ATA AAC TGC CTG GCT GTG AAA CAT

CCA G/3BHQ_2/-3′. Beta-2-microglobulin (B2M) was amplified using 0.2× of the human B2M TaqMan primer/probe kit (Perkin-Elmer Biosciences, Foster

City, CA). Each PCR was carried out in triplicate in a 25 µL volume using

TaqMan Assay Mastermix (Applied Biosystems, Foster City, CA) for 8 min at

95ºC, followed by 50 cycles of 95ºC for 15 s, 57ºC for 30 s, and 72ºC for 30 s.

The level of Ugene expression was determined as the ratio Ugene : B2M = $2^{(C_T^{\mathrm{B2M}} - C_T^{\mathrm{Ugene}})}$.
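The ratio above can be computed from replicate threshold cycle (CT) values as sketched below. The formula assumes that both amplicons double every cycle, that is, close to 100% amplification efficiency. The CT values in the example are hypothetical and serve only to illustrate the arithmetic.

```python
def mean(values):
    """Arithmetic mean of replicate CT values."""
    return sum(values) / len(values)


def ugene_to_b2m_ratio(ct_ugene, ct_b2m):
    """Ugene : B2M ratio = 2 ** (CT_B2M - CT_Ugene), assuming doubling per cycle."""
    return 2.0 ** (mean(ct_b2m) - mean(ct_ugene))


# Hypothetical triplicate CT values for one sample (illustration only).
ct_ugene_example = [23.0, 23.0, 23.0]
ct_b2m_example = [18.0, 18.0, 18.0]

ratio_example = ugene_to_b2m_ratio(ct_ugene_example, ct_b2m_example)
print(ratio_example)
```

With these example values Ugene crosses threshold five cycles later than B2M, giving a ratio of 2 to the power of -5. A higher ratio indicates more Ugene transcript relative to B2M.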

Test of ratio of Ugene-p versus Ugene-q

0.1 µg of cDNA from colon cancer cell lines (SW480, VACO241 and VACO330)

were amplified by PCR for 24 cycles (linear condition), using primers: forward

5′-ACC TCA TCC TTC CTG CGA CG-3′ and reverse 5′-TCT AAT ACA CTC

CTC TGC TGA GAT-3′. The products were then cloned into the pcDNA2.1-

TOPO vector (Invitrogen) for sequencing.

Double thymidine block

Wash HCT116 cell culture (25-30% confluency) twice with 1xPBS and add medium with 2 mM thymidine for 18 h (first block); remove thymidine by washing with 1xPBS; add fresh medium for 9 h to release cells; then add DMEM medium with 2 mM thymidine for 18 h (second block); remove thymidine by washing with 1xPBS; release cells by adding fresh medium. Cells are collected at different time points and subjected to flow cytometry and RNA extraction for real-time PCR.
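The timing of the protocol above can be laid out as a simple schedule, which makes it easier to see when the final release, and hence time zero for sample collection, occurs. The step names and the function are illustrative; the durations are those given in the protocol.

```python
def build_schedule(steps):
    """Turn (name, duration in hours) steps into (name, start, end) intervals."""
    schedule = []
    start = 0
    for name, duration_h in steps:
        schedule.append((name, start, start + duration_h))
        start += duration_h
    return schedule


# Durations taken from the double thymidine block protocol.
steps = [
    ("first thymidine block", 18),
    ("release in fresh medium", 9),
    ("second thymidine block", 18),
]

schedule = build_schedule(steps)
final_release_h = schedule[-1][2]  # hours from start until the final release

for name, start, end in schedule:
    print(f"{start:>2} h to {end:>2} h: {name}")
print(f"final release at {final_release_h} h")
```

Time points for flow cytometry and RNA extraction are then counted from the final release.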

Construction of expression/deletion vectors

The coding sequence of Ugene (Ugene-p/Ugene-q, XM_001133365) was PCR

amplified and cloned into the eukaryotic expression vector pcDNA3.1/V5/His-

TOPO (Invitrogen, Carlsbad, CA) to generate COOH-terminal V5-tagged Ugene-

p/q expression vectors. The primer sequences for constructing the vectors are as

follows: forward 5′-ACC TCA TCC TTC CTG CGA CG-3′ and reverse 5′-TCT

AAT ACA CTC CTC TGC TGA GAT-3′.

Immunofluorescence

SW480 cells were seeded at 1.0×10^6 cells per 100-mm dish and transfected the next day with 2 µg of V5-tagged Ugene expression vector using 12 µL of Fugene 6

(Roche Applied Sciences, Indianapolis, IN) as per the manufacturer’s protocols.

Immunofluorescence was performed 48 hours after transfection using V5 antibody (Invitrogen) at 1:200, followed by Alexa Fluor 488 goat anti–mouse IgG antibody (Invitrogen) at 1:400.

Propidium iodide (PI) staining and flow cytometry analysis

Harvest the cells by trypsinization; wash the cells twice with 1xPBS; pellet the cells

by centrifugation; resuspend cells in a small amount (~50µL) 1xPBS; add drop-

wise 9 volumes of cold methanol while vortexing; incubate for at least 20 min at -

20°C to fix the cells. Wash twice with 1xPBS; pellet the cells by centrifugation.

Add 50 µl of 100 µg/ml RNase and incubate 30min at 37°C to remove RNA. Add

200 µl propidium iodide (50 µg/ml) and stain for 1h. Flow cytometry analysis was performed on EPICS-XL MCL flow cytometer (Beckman Coulter).

Chapter 3 Epitope tagging of endogenous Ugene

We have now characterized Ugene as over-expressed in many cancers and down-regulated by TGF-β. Next, we wanted to know the function of the Ugene-encoded protein(s) and why cancer cells turn this gene on. To answer these questions, we needed to perform functional analysis of the Ugene protein. The techniques most commonly used in such analyses, including immunohistochemistry, immunofluorescence, immunoprecipitation, chromatin immunoprecipitation, and enzyme-linked immunosorbent assay, require good antibodies against the target proteins. We tried to develop specific antibodies against Ugene proteins. Unfortunately, because Ugene is highly conserved across mammalian species (the similarity between human Ugene protein and mouse Ugene protein is 88%), the antibodies generated in mouse and rabbit had low specificity and low affinity.

One way to solve this problem is transgenic expression of recombinant proteins.

Because recombinant proteins can be tagged by commonly used epitopes that have commercially available specific antibodies, these antibodies can be used in

the protein analysis assays. Previously, this method was used in investigating the

subcellular localization of Ugene encoded protein by immunofluorescence of V5

tagged recombinant proteins. However, there are several major pitfalls that make

this method not suitable for many purposes. First, transgenic expression of

recombinant proteins is generally driven by a strong promoter, such as SV40 or CMV, and there is normally more than one copy of the transgene in the genome. The expression of the tagged protein is therefore much higher than its physiological level, which can generate non-specific effects. Second, because transgenes are not driven by the endogenous promoter sequences, or, even when they are, the endogenous promoter sequences are not in the same genomic context, the expression of the recombinant proteins cannot be regulated correctly. Such regulation includes cell cycle dependent regulation, responses to specific cell signaling, and responses to DNA damage. Deregulated expression of these genes will greatly influence their functions and can sometimes reverse their effects. Third, because transgenes are randomly incorporated into the genome, there is a significant chance that these insertions will affect or abolish expression of adjacent genes. To avoid these problems, we developed a strategy to introduce epitope tag-encoding DNA into endogenous loci by homologous recombination-mediated ‘knock-in’. The knock-in approach provides a general solution for the study of proteins to which antibodies are substandard or not available.

For ease in constructing directed knock-ins of Flag-epitope tags, we developed a universal knock-in vector. The vector contains two multiple cloning sites, sequences that encode a triple Flag epitope tag (3xFlag), a neomycin resistance gene flanked by loxP sites, and two inverted terminal repeats (Figure 3-1). For targeted knock-in of 3xFlag, we inserted sequences homologous to the 5’ and 3’ regions flanking the target locus into the two respective cloning sites and packaged the resulting vector into recombinant adeno-associated virus (rAAV). We then infected cells with the targeting virus and selected neomycin-resistant clones. We identified correctly targeted clones by genomic PCR and then excised the neomycin resistance gene by infection with adenovirus expressing Cre recombinase.

Figure 3-1 Schematic diagram of tagging endogenous protein with 3xFlag

(A) Targeting (NEO-loxP-3xFlag) vector. L-ITR and R-ITR, left and right inverted terminal repeats, respectively; MCS, multiple cloning site; CMV, cytomegalovirus promoter; NEO, neomycin resistance gene. (B) Diagram of knock-in strategy. rAAV targeting vectors contain a left and right 'arm' homologous to sequences in the target gene, flanking a NEO-loxP-3xFlag cassette. Clones are then screened by genomic PCR with primers complementary to the neomycin resistance gene and upstream of the left (P1 and NR) or downstream of the right (NF and P2) homologous arms. The neomycin gene cassette is excised with Cre recombinase, and genomic PCR using primers P3 and P4 identifies clones with the correct excision.

Using this new technique, we successfully tagged one allele of endogenous Ugene-p with the 3xFlag epitope. The upper part of Figure 3-2B shows the Western blot analysis against the Flag epitope. While the lane loaded with parental cell lysate was clean, the knock-in clone showed a single band of the expected molecular weight (about 25 kD; Ugene-p is 17 kD and 3xFlag is 8 kD). The lower part of Figure 3-2B shows that these knock-in clones are suitable for the immunoprecipitation assay.

A real time PCR analysis was performed to demonstrate no change in total Ugene transcript expression in cells bearing the 3xFlag knock-in epitope (Figure 3-2C), and a semi-quantitative PCR showed the knock-in Ugene allele to be expressed at essentially the same level as the non-targeted alleles (Figure 3-2D). Quantitation of the incorporated radiolabel in each band (corrected for the molecular weight of the band) showed a molar ratio of 0.27:1 for the 3xFLAG allele versus the summed untagged Ugene alleles, equivalent to a ratio of 0.8:1 for expression of the 3xFLAG allele versus an individual Ugene allele. Accordingly, introduction of the 3xFLAG tag essentially makes no change in expression of either total

Ugene transcript or of transcript arising from the tagged allele.
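The conversion from the summed ratio to the per-allele ratio can be made explicit. The untagged band represents three alleles (two Ugene-q and one Ugene-p, as noted in the legend to Figure 3-2), so if each contributes equally, the tagged allele is compared against one third of the untagged signal. The equal-contribution assumption and the function name are simplifications introduced here.

```python
def per_allele_ratio(tagged_vs_summed, n_untagged_alleles):
    """Ratio of the tagged allele to a single untagged allele,
    assuming the untagged alleles contribute equally to the summed band."""
    return tagged_vs_summed * n_untagged_alleles


ratio_per_allele = round(per_allele_ratio(0.27, 3), 2)
print(f"tagged allele : single untagged allele = {ratio_per_allele}:1")
```

A molar ratio of 0.27:1 against three alleles therefore corresponds to about 0.8:1 against one allele, consistent with near-normal expression from the tagged locus.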

In Chapter 2, it was demonstrated that Ugene mRNA expression is cell cycle dependent. To test whether this regulation is reflected at the protein level, a Western blot analysis against the Flag epitope was performed on cell lysates of 3xFlag knock-in clones collected at different time points after release from double thymidine block. As shown in Figure 3-2E, the Ugene-p protein level varied in a cell cycle dependent manner. Consistent with mRNA expression, Ugene-p protein expression was low at G1-S phase and increased about 2.5-fold through G2/M phase. The protein level decreased after mitosis and remained low until the next G2/M phase of the cell cycle.

[Figure 3-2, panels A-E. Panel B labels: WB: Flag (upper); IP: Flag, WB: Flag (lower).]

Figure 3-2 3xFlag epitope tagging of endogenous Ugene-p

A) Genomic PCR of Ugene. Primers flank the epitope sequence incorporation site. In the knock-in clones, the upper bands (larger products) were amplified from the Ugene-p knock-in allele. B) Western blot (upper) and immunoprecipitation (lower) analyses of endogenous 3xFlag-tagged Ugene-p in DLD1 cells. C) Real-time PCR assay of total Ugene transcript expression in DLD1 parental cells compared to DLD1 cells in which one Ugene-p allele bears a knocked-in 3xFLAG epitope tag. D) PCR comparison of expression of 3xFLAG-tagged versus untagged Ugene alleles. Radio-labeled RT-PCR was performed with primers that flank the 3xFLAG epitope tag, allowing amplification of a short band corresponding to mRNA from the three untagged Ugene alleles (two Ugene-q and one Ugene-p; band labeled Ugene) and amplification of a larger band corresponding to mRNA from the 3xFLAG knock-in allele (Ugene-p-3xFLAG). The molar ratio for the 3xFLAG allele versus the summed untagged Ugene alleles was 0.27:1. E) Western blot analysis of Ugene-p-3xFlag protein in DLD1 knock-in clones after release from double thymidine block.

We have developed a new method to tag an endogenous protein with a widely used epitope. Using this method, we tagged Ugene-p with 3xFlag and showed that the epitope tag did not affect its total mRNA level or the regulation of its expression. We also demonstrated that the tagged protein can be used for Western blot and immunoprecipitation analysis. It also has potential for assays such as immunofluorescence and chromatin immunoprecipitation (Zhang et al., 2008). Together, this evidence supports the knock-in approach as a general solution for studying proteins for which antibodies are substandard or unavailable.

More importantly, endogenous epitope tagging takes less than half the time needed for antibody generation (Figure 3-3).

Figure 3-3 Timelines for 3xFlag knock-in and antibody production


Materials and methods

Cells and reagents

The human colorectal cancer cell line DLD1 was grown in McCoy's 5A modified medium (Invitrogen) supplemented with 10% fetal bovine serum (HyClone) and penicillin/streptomycin (Invitrogen). Mouse anti-Flag monoclonal antibody was from Sigma (Cat #: P2983).

Construction of Ugene-p rAAV-3xFlag Knock-in targeting vectors

We first constructed a pAAV-LoxP-Neo vector by ligating a fragment containing a neomycin resistance gene cassette, flanked by two LoxP sites and by left and right multiple cloning sites, with an AAV vector backbone consisting of the left and right ITRs, a bacterial replication origin and an ampicillin resistance gene. We then PCR amplified the 3xFlag tag with linker sequences from pMZ3F using primers 5’-ggaattcgaaaagagaagatggaaa-3’ and 5’-ggaattctcactacttgtcatcgtc-3’. The PCR products were digested with Eco RI and inserted into the pAAV-LoxP-Neo vector to construct a plasmid called pAAV-LoxP-Neo-3xFlag. A PCR fragment (~1 kb) extending from an intron to the last exon of Ugene-p, ending just before the stop codon, was amplified from genomic DNA and cloned in-frame with the 3xFlag tag sequence as the left homologous arm. A PCR fragment starting after the TGA stop codon and extending toward the 3’ end of the targeted gene was also amplified from genomic DNA to be used as the right arm. The primers used were: left arm 5′-GGC TCG AGC AAC CTG GCC CTA AAG TTC A-3′ and 5′-CCG ATA TCT CTA ATA CAC TCC TCT GCT GAG-3′; right arm 5′-GGA CTA GTA TGG AAT TAT GAT ATA TAT GAT ATA C-3′ and 5′-AAC CGC GGC AAA ACC ACA ACT CAG TCT GCT-3′.
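Each of the primers listed above carries a short 5′ extension, and a quick scan shows which restriction enzyme recognition sequences those extensions contain. The sketch below is illustrative only: the text names Eco RI alone, so the other enzyme names are inferred here from their standard recognition sequences, and whether those enzymes were actually used for cloning the arms is an assumption. The primer labels are also hypothetical.

```python
# Standard recognition sequences (all palindromic, so one strand is enough).
SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "EcoRV": "GATATC",
    "SpeI": "ACTAGT",
    "SacII": "CCGCGG",
}

# Primer sequences copied from the text; the dictionary keys are illustrative labels.
PRIMERS = {
    "3xFlag forward": "ggaattcgaaaagagaagatggaaa",
    "3xFlag reverse": "ggaattctcactacttgtcatcgtc",
    "left arm forward": "GGC TCG AGC AAC CTG GCC CTA AAG TTC A",
    "left arm reverse": "CCG ATA TCT CTA ATA CAC TCC TCT GCT GAG",
    "right arm forward": "GGA CTA GTA TGG AAT TAT GAT ATA TAT GAT ATA C",
    "right arm reverse": "AAC CGC GGC AAA ACC ACA ACT CAG TCT GCT",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based position} for each recognition site found in the primer."""
    seq = primer.replace(" ", "").upper()
    return {enzyme: seq.find(site) for enzyme, site in SITES.items() if site in seq}


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Each primer turns out to contain exactly one of the listed sites near its 5′ end, which is consistent with directional cloning of the tag and the two homology arms.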

Packaging of rAAV targeting constructs

The targeting construct made above (2.5 μg) was mixed with pAAV-RC and pHelper (2.5 μg of each) from the AAV Helper-Free System (Stratagene) and transfected into HEK 293T cells (ATCC) using Lipofectamine (Invitrogen). The DNA was diluted in Opti-MEM reduced-serum medium (Invitrogen) to a total volume of 750 μl (i.e., if the volume of DNA was 50 μl, the volume of Opti-MEM was 700 μl). Similarly, 54 μl of Lipofectamine was diluted in Opti-MEM to a total volume of 750 μl. The two tubes were combined and the DNA–Lipofectamine mix was incubated at room temperature for 15 min. HEK 293T cells at 70–80% confluence in a 75 cm² flask were washed with Hank’s Balanced Salt Solution (HBSS, HyClone) and then 7.5 ml of Opti-MEM was added. To this, the 1.5 ml DNA–Lipofectamine mixture was added dropwise, and the cells were incubated at 37°C for 3–4 h. The Opti-MEM was replaced with HEK 293T growth medium and the cells were allowed to grow for 72 h prior to harvesting virus. Virus was harvested according to the AAV Helper-Free System instructions with minor modifications. Briefly, the medium was aspirated from the flask and the 293 cells were scraped into 1 ml of phosphate-buffered saline (Invitrogen), transferred to a 2 ml microfuge tube, and subjected to three cycles of freeze–thaw. Each cycle consisted of a 10 min freeze in a dry ice–ethanol bath and a 10 min thaw in a 37°C water bath, with vortexing after each thaw. The lysate was then clarified by centrifugation at 12,000 r.p.m. in a microfuge to remove cell debris, and the supernatant containing rAAV was divided into three aliquots of 330 μl each and frozen at –80°C. The rAAV preparation generally contained ~3 x 10^8 genome particles/ml.

AAV titration assay

10 μl of rAAV stock was mixed with 10 μl of salmon sperm DNA (1 mg/ml) and 20 μl of 2 M NaOH. The mixture was incubated at 56°C for 30 min and then neutralized by adding 19 μl of 2 M HCl. The rAAV lysate was diluted 10-fold, and 1 μl of the dilution was mixed with 2 μl of 5 μM forward primer (5’-TGAATGAACTGCAGGACGAG-3’), 2 μl of 5 μM reverse primer (5’-CAATAGCAGCCAGTCCCTTC-3’) and 12.5 μl of SYBR green PCR mix in a total volume of 25 μl. To calculate the copy number, the rAAV-NEO targeting vector was serially diluted over the range of 10^3 to 10^6 copies per μl to serve as the real-time PCR standards (Veldwijk et al., 2002).
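The titration described above amounts to fitting a standard curve of Ct against log10(copies) and then back-calculating through the dilution steps. The following is a minimal sketch under stated assumptions: the Ct values are hypothetical placeholders rather than measured data, 1 μl of each standard is assumed per reaction, and all function names are illustrative. Only the volumes (10 + 10 + 20 + 19 μl, then a 10-fold dilution, then 1 μl of template) come from the protocol.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get template copies in one reaction."""
    return 10 ** ((ct - intercept) / slope)


def genomes_per_ml(copies_per_reaction):
    """Back-calculate the stock titer through the dilutions in the protocol."""
    lysis_dilution = (10 + 10 + 20 + 19) / 10  # 10 ul stock in 59 ul after neutralization
    pcr_dilution = 10                          # lysate diluted 10-fold before PCR
    template_ul = 1                            # 1 ul of the dilution per reaction
    per_ul_stock = copies_per_reaction / template_ul * pcr_dilution * lysis_dilution
    return per_ul_stock * 1000                 # ul -> ml


# Hypothetical standards: copies per reaction and illustrative Ct values (not measured data).
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
sample_copies = copies_from_ct(26.7, slope, intercept)
print(f"slope={slope:.2f}, titer={genomes_per_ml(sample_copies):.2e} genomes/ml")
```

With these placeholder numbers, a sample Ct equal to that of the 10^4 standard corresponds to about 5.9 x 10^8 genomes/ml of stock, the same order of magnitude as the titer quoted for typical preparations.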

Gene targeting and isolation of recombinant cell lines

DLD1 cells were grown in 25 cm² flasks and infected with rAAV when 75% confluent (~3 x 10^6 cells). At the time of infection, the medium was aspirated and 4 ml of medium containing 50–250 μl of rAAV lysate (0.2-1 x 10^8 viral particles) was added to each flask. 24 h after infection, cells were washed with PBS and detached with trypsin (Invitrogen). Cells were replated in eight 96-well plates in medium containing geneticin (Invitrogen) at a final concentration of 1 mg/ml. Drug-resistant colonies were grown for 10-14 days (~3,000 G418-resistant clones per T25 flask). At the end of the selection period, genomic DNA was extracted from single clones growing in 96-well plates using the Lyse-N-Go reagent (Pierce). Locus-specific integration was assessed by PCR using a primer that annealed outside the homology region and another that annealed within neo. Positive clones were confirmed by PCR across both homology arms. Primers used for screening of individual targeted loci are available upon request.

Cre-mediated excision of the drug resistance marker in targeted cells

To remove the drug resistance marker from correctly targeted clones, cells were infected with an adenovirus that expresses the Cre recombinase (Kohli et al., 2004). 24 h after infection, cells were plated at limiting dilution in nonselective medium. After 2 weeks, single cell clones were plated in duplicate and 0.4 mg/ml geneticin was added to one set of wells. After 1 week of growth, clones that were geneticin-sensitive were expanded for further analysis.

Western blot and immunoprecipitation analysis

Cells were lysed in RIPA buffer with protease and phosphatase inhibitors (50 mM Tris-HCl, pH 8.0, 0.5% Triton X-100, 0.25% sodium deoxycholate, 150 mM sodium chloride, 1 mM EDTA, 1 mM sodium orthovanadate, 50 mM NaF, 80 μM β-glycerophosphate and 20 mM sodium pyrophosphate). FLAG immunoprecipitation was performed with anti-FLAG M2 affinity gel as described in the manufacturer’s protocol. Western blots were performed using anti-FLAG M2 (Sigma, 1:1000) followed by horseradish peroxidase-conjugated donkey anti-mouse secondary antibody (1:1500, Jackson ImmunoResearch Laboratories, West Grove, PA), and visualized using an Enhanced Chemiluminescence Plus detection kit (Amersham Biosciences).

Double thymidine block, PI staining, flow cytometry and real time PCR were

performed as described in Chapter 2.


Chapter 4 Investigation of potential roles of Ugene in cell proliferation and oncogenesis

Being over-expressed in tumors and down-regulated by a tumor suppressor, Ugene has both major expression characteristics of an oncogene. However, structure analysis and motif searching did not provide any clues to its function. To determine the potential functions of Ugene and its roles in cell proliferation and oncogenesis, I tried several strategies: focus formation and anchorage-independent growth assays to test transforming capacity; siRNA suppression followed by either cell cycle profiling or colony formation assays to test effects on cell proliferation; and pull-down assays and DNA binding tests to identify potential Ugene partners.

4.1 Potential roles of Ugene in cell transformation

4.1.1 NIH 3T3 cells transformation

Malignant transformation is typically characterized by the acquisition of some or all of the following properties: alteration in cell morphology, increased growth rate, reduced serum dependence, loss of density-dependent growth inhibition, altered gene expression, acquisition of anchorage-independent growth potential, and the ability to form tumors in experimental animals. The expression of oncogenic forms of Ras proteins, as well as over-expression of normal Ras proteins, triggers these cellular consequences in the NIH 3T3 mouse fibroblast cell line, a preneoplastic cell line. The exquisite sensitivity of NIH 3T3 cells to transformation provides a valuable tool for investigating the function of oncogenes in general.

To test whether Ugene protein possesses a similar transforming capacity, constructs expressing Ugene-p or Ugene-q were transfected into NIH 3T3 cells. As shown in Figure 4-1, after three weeks, Ras generated over 100 transformed foci on each dish, whereas neither the Ugene-p nor the Ugene-q construct induced significantly more transformed foci than the empty vector control. To test for a potential synergistic effect between Ugene-p and Ugene-q, they were co-transfected into NIH 3T3 cells. However, this combination did not enable cells to form foci. These results suggest that Ugene alone may not be sufficient to induce transformation. It may need to cooperate with other oncogenes to induce malignant transformation of normal cells.

[Figure 4-1, panels A and B. Panel A plates: Ras, empty vector, Ugene-p, Ugene-q, Ugene-p + Ugene-q. Panel B: Western blot for Ugene.]

Figure 4-1 NIH 3T3 cell transformation assay

A) Pictures were taken of stained foci, three weeks after transfection. Each construct was transfected into two parallel 60-mm plates. B) Western blot analysis of transfected Ugene (V5 tagged).

4.1.2 REF-52 cells transformation

Although NIH 3T3 cells can be transformed by oncogenic Ras alone, the majority of normal primary cells cannot be transformed by the simple addition of a single oncogene. Ras over-expression in most primary cells leads to growth arrest unless Ras is co-expressed with collaborating genes such as adenovirus early region 1A (E1A) or c-myc. An established rodent cell line, REF52, which resembles certain primary rodent cells with regard to transformation by Ras, has been widely used to investigate the ability of other proteins to collaborate with Ras in cell transformation (Franza et al., 1986).

To test whether Ugene protein can rescue cells from Ras-induced growth arrest, constructs expressing Ugene-p, Ugene-q, or both were co-transfected with oncogenic Ras into REF52 cells. After three weeks, dominant negative P53 (positive control) cooperated with Ras to confer on the cells the ability to form foci, but neither Ugene-p, Ugene-q, nor the combination showed a similar effect (data not shown). Therefore, Ugene protein cannot rescue the cells from Ras-induced growth arrest.

4.1.3 WI-38 cells transformation

While two oncogenic “hits” can frequently transform primary rodent cells, primary human cells have proven refractory to transformation by numerous combinations of cellular and viral oncoproteins. It has been reported that the combined expression of E1A, Ha-RasV12, and MDM2 is sufficient to convert human fibroblasts into cells capable of forming tumors (Seger et al., 2002). Given that MDM2 is an antagonist of the tumor suppressor P53, this is a good model to test whether Ugene is involved in the P53 pathway.

Again, constructs expressing Ugene-p, Ugene-q, or both were co-transfected with Ha-Ras and E1A into a human primary lung cell line called WI-38. Figure 4-2 shows the result of an anchorage-independent growth assay. As expected, cells transfected with E1A, Ha-RasV12, and MDM2 were able to form colonies in soft agar. However, this anchorage-independent growth ability was not obtained when MDM2 was replaced with Ugene-p, Ugene-q or both. This result suggests that Ugene may not function through the P53 pathway.

[Figure 4-2, panels A and B. Panel B lanes: empty vector, Ugene-p, Ugene-q, Ugene-p + Ugene-q; blots: Ugene and β-actin.]

Figure 4-2 Anchorage independent growth assay of WI-38 cells

A) Anchorage-independent growth of WI38 cells with different combinations of gene expression. WI38 fibroblasts were infected with retroviruses to direct the expression of E1A and Ha-RasV12 in combination with the proteins indicated in the figure. B) Expression of V5-tagged Ugene protein in WI38 cells stably expressing E1A and Ras, detected by Western blot analysis using anti-V5 antibody.

4.1.4 HME1 cells transformation

The transformation models tested above, NIH 3T3 cells, REF52 cells and WI-38 cells, were all fibroblasts. Because colorectal cancer originates from epithelial cells, we also tested the potential oncogenic function of Ugene in human mammary epithelial cells, HME-1 (no primary colon epithelial cells are available). It has been reported that expression of a CyclinD1-CDK2 fusion protein is sufficient for HME-1 cells to grow in soft agar (Chytil et al., 2004). However, anchorage-independent growth assays indicated that Ugene protein (Ugene-p, Ugene-q or both) could not transform HME-1 cells and could not enable them to grow in soft agar, even when co-transfected with mutant P110 (data not shown).

To sum up, four models have been used to test potential roles of Ugene in cell transformation, including one epithelial cell model (HME-1) and three fibroblast cell models (NIH 3T3, REF52 and WI38). After over-expression, Ugene, with or without combination with other oncogenes, did not demonstrate the ability to transform the targeted cells.

4.2 Potential roles of Ugene in cell proliferation

Many oncogenes, such as Bcl-2, eIF4E and mutant P110, are important for

promoting cell proliferation in cancer cells. Suppression of their expression will

lead to cell cycle arrest and even apoptosis of host cancer cells (An et al., 2007;

Anai et al., 2007; Dong et al., 2008). Because Ugene is over-expressed in most cancer cells and its expression is cell cycle dependent, we tested potential roles of

Ugene in cell proliferation by knocking down Ugene expression.

Both shRNA and siRNA techniques were employed for Ugene knockdown. Eight shRNA constructs were made on two different backbones and carriers (a mir30 backbone in lentivirus and a regular hairpin backbone in retrovirus). None of them yielded adequate suppression, even after drug selection for stable clones. A confounding complication could be that there are too many copies of Ugene in the human genome.

A total of 8 siRNAs were tested. Among them, si1017 was the best and showed more than 90% suppression; si1078, si1688 and si1656 showed more than 75% suppression (Figure 4-3A). Furthermore, the suppression was sustained for up to 8 days (Figure 4-3B). However, only si1017, the best suppressor of Ugene expression, caused apoptosis and decreased colony numbers of DLD1 and HCT116 cells (Figure 4-4). This could be explained either as an off-target effect of this siRNA or by a very high suppression threshold for this phenotype. Unfortunately, no other siRNA suppressed Ugene expression to the extent that si1017 achieved (>90%).
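The suppression percentages quoted above are derived from Western blot band intensities relative to the mock-transfected control. As a minimal sketch of that calculation, the snippet below uses hypothetical intensity values chosen only for illustration; they are not read from Figure 4-3A, and the function name is an assumption.

```python
def percent_suppression(signal: float, mock_signal: float) -> float:
    """Percent reduction of a band's intensity relative to the mock control."""
    if mock_signal <= 0:
        raise ValueError("mock signal must be positive")
    return 100.0 * (1.0 - signal / mock_signal)


# Hypothetical band intensities in arbitrary units (NOT values from Figure 4-3A).
intensities = {"Mock": 100.0, "siGLO": 98.0, "si1017": 8.0, "si1078": 22.0}

suppression = {
    name: percent_suppression(value, intensities["Mock"])
    for name, value in intensities.items()
}
for name, pct in suppression.items():
    print(f"{name}: {pct:.0f}% suppression")
```

Normalizing to the mock lane rather than to the siGLO lane is a choice made for this sketch; either control could serve as the reference.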

[Figure 4-3, panels A and B. Panel A bar chart: y-axis, signal intensity (0-120); x-axis, Mock, siGLO, si1017, si1078, si1278, si1656, si1688, si1279, si1880.]

Figure 4-3 Suppression of Ugene expression by specific siRNAs.

A) Ugene-p suppression 72 hours after siRNA transfection in 3xFlag knock-in DLD1 clones, detected by Western blot against Flag epitope (upper panel). Lower panel shows the quantitation of Ugene suppression by siRNAs, indicated by signal intensity of Western blot bands. Mock: transfection reagent only; siGLO: control siRNA that does not interact with human mRNA. B) Ugene suppression by si1017 in 3xFlag knock-in DLD1 clones at different time points after siRNA transfection (as indicated).

[Figure 4-4, panels A and B.]

Figure 4-4 Colony formation assays after siRNA transfection

A) Images of colonies. HCT 116 cells were transfected with siRNAs, then counted and plated. Images were taken 10 days after cells were seeded. B) Quantitation of the colony numbers.

4.3 Ugene potential interaction partners

4.3.1 Identification of Ugene potential partners

To look for potential Ugene partners, we performed a pull-down assay using Flag-tagged Ugene-p protein over-expressed in SW480 cells. Silver staining of an SDS-PAGE separation of the pull-down products showed two extra bands with molecular weights around 36 kD that precipitated with Ugene-p-Flag (Figure 4-5). Using mass spectrometry, these two protein bands were identified as uracil DNA glycosylase-2 (UNG2) and nitric oxide synthase interacting protein (NOSIP). UNG2 is the most active uracil-DNA glycosylase in human cells. Its major function is to prevent mutagenesis by eliminating uracil from DNA molecules: it cleaves the N-glycosylic bond and initiates the base-excision repair (BER) pathway. It is very important for maintaining genomic stability during DNA synthesis and throughout the whole cell cycle. NOSIP is a nucleocytoplasmic shuttle protein. It can modulate eNOS (endothelial nitric oxide synthase) activity by altering eNOS subcellular localisation. Because both proteins are localized in the nucleus, they are legitimate candidates as Ugene interaction partners.


Figure 4-5 Flag pull-down of Ugene-p

Left: Silver staining of SDS-PAGE separation of pull-down products. Arrows on the left indicate the extra protein bands precipitated with Ugene-p. Right: Western blot analysis against Flag (Ugene-p).

4.3.2 Validation of Ugene potential partners

4.3.2.1 UNG2 binds to Ugene-p, but not Ugene-q

To confirm the interaction of Ugene-p and UNG2, we first co-transfected tagged Ugene-p and UNG2 constructs, immunoprecipitated either protein, and then analyzed the immunoprecipitates by Western blot to detect the presence of the potential partner. Ugene-p and UNG2 co-immunoprecipitated in assays in which either of the proteins was first pulled down (Figure 4-6A). To prove that this binding was not an artifact of protein over-expression, we immunoprecipitated Ugene-p from 3xFlag knock-in cell lysates with antibodies against the Flag epitope. Western blot analysis confirmed that endogenous UNG2 co-immunoprecipitated with the tagged endogenous Ugene-p protein (Figure 4-6B).

Because Ugene-p and Ugene-q are encoded by different genomic loci and are both expressed in colon cancer, we asked whether Ugene-q could also interact with UNG2. A similar co-transfection and co-immunoprecipitation analysis was performed. To our surprise, although it differs from Ugene-p by only 2 amino acids, Ugene-q did not bind to UNG2 at all: Ugene-q and UNG2 did not co-immunoprecipitate in assays in which either of the proteins was first pulled down (Figure 4-7A). Furthermore, using the knock-in technique, we introduced a point mutation (W125R), at a position where Ugene-p differs from Ugene-q, into endogenous Ugene-p, and showed that the mutated Ugene-p(W125R) lost the capacity to interact with UNG2 (Figure 4-7B). Therefore, tryptophan 125 is necessary for the interaction between Ugene and UNG2.

4.3.2.2 Both Ugene-p and Ugene-q interact with NOSIP

Next, we validated the interaction between Ugene protein and NOSIP. Both Ugene-p and Ugene-q were able to co-precipitate NOSIP when they were transiently expressed in HEK293 cells (Figure 4-8A and B). Co-IP with 3xFlag knock-in cell lysates also confirmed the interaction at the physiological level (Figure 4-8C).

4.3.2.3 NOSIP does not form a complex with UNG2

Because Ugene-p interacts with both NOSIP and UNG2, we asked whether these three nuclear proteins are part of the same complex. To test this, tagged NOSIP and Ugene-p were co-transfected, followed by immunoprecipitation of NOSIP and of Ugene-p. Although Ugene-p co-immunoprecipitated with NOSIP and UNG2 co-immunoprecipitated with Ugene-p, UNG2 could not be detected in NOSIP immunoprecipitates (Figure 4-9). These results suggest that UNG2, NOSIP and Ugene-p are not part of the same complex. Of these two Ugene partners, we were more interested in the DNA repair enzyme UNG2, so we concentrated on UNG2 in the following study.

[Figure 4-6, panels A and B: immunoprecipitation (IP) and Western blot (WB) images of transfected cells (A) and of parental versus Ugene-p-3xFlag knock-in cells (B); lanes and antibodies are described in the legend.]

Figure 4-6 Protein-protein interaction of Ugene-p and UNG2

A) Left panel shows HEK293 cells transfected with an expression vector for UNG2 (V5-epitope-tagged), and cotransfected with an expression vector for Ugene-p (Flag-epitope-tagged) versus an empty expression vector. Interaction of Ugene-p and UNG2 was tested by immunoprecipitation (IP) of Ugene-p and Western detection (WB) of UNG2 (upper panel); expression of Ugene-p and UNG2 in the transfected cells is shown by Western blot against epitope tags (lower panel). Right panel shows HEK293 cells transfected with an expression vector for Ugene-p (V5-epitope-tagged), and cotransfected with an expression vector for UNG2 (Flag-epitope-tagged) versus an empty expression vector. Upper panel again tests for interaction of Ugene-p and UNG2 by IP of UNG2 and Western detection of Ugene-p; lower panel shows expression of translated proteins. B) Western detection of endogenous Ugene-p and its interaction with endogenous UNG2. Ugene-p was immunoprecipitated from Ugene-p-3xFlag tagged cells using Flag antibodies. Upper panel demonstrates endogenous Ugene-p protein detected by Western blot against Flag; lower panel demonstrates the presence of co-immunoprecipitated UNG2 by Western blot against UNG2. Assay of parental DLD1 cells is shown as a control.

[Figure 4-7, panels A and B: immunoprecipitation (IP) and Western blot (WB) images; lanes and antibodies are described in the legend.]

Figure 4-7 Ugene-q does not interact with UNG2

A) Left panel shows HEK293 cells transfected with an expression vector for UNG2 (V5-epitope-tagged), and cotransfected with an expression vector for Ugene-q (Flag-epitope-tagged) versus an empty expression vector. Interaction of Ugene-q and UNG2 was tested by IP of Ugene-q and Western detection of UNG2 (upper panel); expression of Ugene-q and UNG2 in the transfected cells is shown by Western blot against epitope tags (lower panel). Right panel shows HEK293 cells transfected with an expression vector for Ugene-q (V5-epitope-tagged), and cotransfected with an expression vector for UNG2 (Flag-epitope-tagged) versus an empty expression vector. Upper panel again tests for interaction of Ugene-q and UNG2 by IP of UNG2 and Western detection of Ugene-q; lower panel shows expression of translated proteins. B) 3xFlag-tagged endogenous wild type and mutant (W125R) Ugene-p were immunoprecipitated using Flag antibodies. Upper panel demonstrates the presence of co-immunoprecipitated UNG2 by Western blot against UNG2; lower panel demonstrates endogenous Ugene-p protein detected by Western blot against Flag.

[Figure 4-8, panels A-C.]

Figure 4-8 Protein-protein interaction of Ugene and NOSIP

NOSIP binds to both Ugene-p (A) and Ugene-q (B). Left panel shows HEK293 cells transfected with an expression vector for NOSIP (V5-epitope-tagged), and cotransfected with an expression vector for Ugene (Flag-epitope-tagged) versus an empty expression vector. Interaction of Ugene and NOSIP was tested by IP of Ugene and Western detection of NOSIP (upper panel); expression of Ugene and NOSIP in the transfected cells is shown by Western blot against epitope tags (lower panel). Right panel shows HEK293 cells transfected with an expression vector for Ugene (V5-epitope-tagged), and cotransfected with an expression vector for NOSIP (Flag-epitope-tagged) versus an empty expression vector. Upper panel again tests for interaction of Ugene and NOSIP by IP of NOSIP and Western detection of Ugene; lower panel shows expression of translated proteins. C) Western detection of interaction between endogenous Ugene-p and endogenous NOSIP. Ugene-p was immunoprecipitated from Ugene-p-3xFlag tagged cells using Flag antibodies. Upper panel demonstrates endogenous Ugene-p protein detected by Western blot against Flag; lower panel demonstrates the presence of co-immunoprecipitated NOSIP by Western blot against NOSIP. Assay of parental DLD1 cells is shown as a control.


Figure 4-9 NOSIP does not form a complex with UNG2

HEK293 cells were transfected with an expression vector for NOSIP (Flag-epitope-tagged) and cotransfected with an expression vector for Ugene-p (V5-epitope-tagged). The upper panel shows Western blots against UNG2, Ugene-p (V5) and NOSIP (Flag) after IP of NOSIP (Flag). The lower panel shows Western blots against UNG2 and Ugene-p (V5) after IP of Ugene-p. Note that UNG2 could only be detected in Ugene-p (V5) immunoprecipitates, although similar amounts of Ugene-p (V5) were detected in both Ugene-p (V5) and NOSIP (Flag) immunoprecipitates.

4.3.3 Ugene-p binds to the NH2-terminus of UNG2

To determine the UNG2 motif responsible for binding to Ugene-p, we made a series of constructs expressing V5-epitope-tagged nested UNG2 deletions (Figure 4-10A). After co-transfecting each of these V5-epitope-tagged UNG2 deletion constructs with Flag-epitope-tagged Ugene-p, we immunoprecipitated Ugene-p and performed Western blot analysis to test for co-immunoprecipitation of each of the UNG2 deletion constructs (Figure 4-10B). One UNG2 deletion, which lacked only the sequence between codons 3 and 33, showed complete loss of the capacity to bind to Ugene-p. This result suggested that Ugene-p binds to the NH2-terminus of UNG2.
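The logic of the deletion mapping can be sketched as a set calculation: residues needed for binding must be missing from every construct that fails to bind and present in every construct that still binds. The sketch below takes the deleted ranges from the construct labels in Figure 4-10A. It assumes that every construct other than Del 3-33 retained binding, which the text implies but does not state for each construct individually, and the function name is illustrative.

```python
# Deleted residue ranges (inclusive), taken from the construct labels in Figure 4-10A.
DELETIONS = {
    "Del 280-313": (280, 313),
    "Del 268-313": (268, 313),
    "Del 217-313": (217, 313),
    "Del 166-313": (166, 313),
    "Del 166-195": (166, 195),
    "Del 133-176": (133, 176),
    "Del 101-146": (101, 146),
    "Del 72-104": (72, 104),
    "Del 44-72": (44, 72),
    "Del 14-44": (14, 44),
    "Del 3-33": (3, 33),
}

# Only Del 3-33 is reported as non-binding; treating all others as binders is an assumption.
NON_BINDERS = {"Del 3-33"}


def candidate_residues(deletions, non_binders):
    """Residues deleted in every non-binding construct and in no binding construct."""
    def span(name):
        start, end = deletions[name]
        return set(range(start, end + 1))

    required = set.intersection(*(span(name) for name in non_binders))
    for name in deletions:
        if name not in non_binders:
            required -= span(name)
    return sorted(required)


residues = candidate_residues(DELETIONS, NON_BINDERS)
print(f"candidate residues: {residues[0]}-{residues[-1]}")
```

Under that assumption, the candidate region narrows to the extreme NH2-terminus, because the overlapping Del 14-44 construct rules out residues 14-33. This is a deduction from the construct design rather than a result reported in the text.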

To further test whether the UNG2 NH2-terminus is able to bind to Ugene-p, we synthesized a peptide corresponding to NH2-terminal amino acids 1-25 of UNG2 (1-25-UNG2). We then tested whether this peptide could competitively block the binding of endogenous UNG2 to Flag-epitope-tagged endogenous Ugene-p. Figure 4-10C shows that adding the 1-25-UNG2 peptide to cell lysates blocked UNG2 binding to Ugene-p in a dose-dependent fashion. At 5 µM, the 1-25-UNG2 competing peptide competed out almost all Ugene-p binding to UNG2.

To test whether NH2-terminal amino acids 1-25 of UNG2 are sufficient for binding to Ugene-p, we expressed a fusion protein with 1-25-UNG2 fused to green fluorescent protein (1-25-UNG2-GFP) under the regulatory control of doxycycline. We performed this in cells already containing the 3xFlag-epitope-tagged endogenous Ugene-p. Serial immunoprecipitation and Western blot analysis confirmed that Ugene-p bound to the 1-25-UNG2-GFP protein (Figure 4-10D). Indeed, induction of the 1-25-UNG2-GFP decoy protein completely outcompeted and blocked co-immunoprecipitation of endogenous UNG2 with endogenous Ugene-p (Figure 4-10D). Therefore, NH2-terminal amino acids 1-25 of UNG2 are sufficient in vivo for the interaction with Ugene-p.

[Figure 4-10, panels A-D. Panel A constructs (V5-tagged UNG2, residues 1-313): full length, Del 280-313, Del 268-313, Del 217-313, Del 166-313, Del 166-195, Del 133-176, Del 101-146, Del 72-104, Del 44-72, Del 14-44, Del 3-33. Panel B: IP: Ugene-p-Flag; WB: α-V5 (UNG2) and α-Flag (Ugene-p); whole cell lysate WB: α-V5 (UNG2). Panel C: 1-25aa of UNG2 peptide at control, 0.2 µM, 1 µM and 5 µM; IP: Ugene-p-Flag; WB: UNG2 and Ugene-p-Flag. Panel D: Dox - and +; IP: Ugene-p-Flag; WB: UNG2, GFP and Ugene-p-Flag.]

Figure 4-10 Mapping of the UNG2 domain for binding to Ugene-p

A) Schematic diagram of constructs expressing nested UNG2 deletions. B) HEK 293T cells were transfected with Flag-tagged Ugene-p and a cDNA encoding either V5-tagged wild type UNG2 or the indicated UNG2 deletion mutants. 48 h after transfection, immunoprecipitates prepared by IP with Flag antibodies were analyzed by Western blot for UNG2 (V5) and Ugene-p (Flag). Box indicates that deletion of UNG2 amino acids 3-33 abolished the binding to Ugene-p. The lower panel shows expression of V5-tagged wild type UNG2 and UNG2 deletion proteins in whole cell lysates. C) Cell lysates of DLD1 cells expressing endogenous 3xFlag-tagged Ugene-p were mixed with synthesized competing peptide (1-25-UNG2, amounts as indicated), then immunoprecipitated with Flag antibodies. Immunoprecipitates were analyzed by Western blot for presence of Ugene-p (Flag) and for co-IP of UNG2 (V5). D) DLD1 cells expressing endogenous 3xFlag-tagged Ugene-p were transfected with pcDNA6/TR and pcDNA4-1-25-UNG2-GFP, and selected with blasticidin (10 µg/ml) and zeocin (200 µg/ml) to derive clones conditionally expressing 1-25-UNG2-GFP fusion protein under doxycycline (dox) regulation. These clones are designated DLD/Ugene-p-3xFlag/1-25-UNG2-GFP. Interaction of Ugene-p with either UNG2 or the 1-25-UNG2-GFP decoy protein was assayed by IP for Ugene-p with the Flag antibody, followed by Western blot detection of Ugene-p, GFP, and UNG2, in cells without (dox -) and with (dox +) induced expression of the 1-25-UNG2-GFP decoy protein.

4.3.4 Ugene binding does not directly alter UNG2 enzymatic activity or localization

To examine potential functional effects of Ugene-p binding to UNG2, we performed a co-immunoprecipitation to collect UNG2 bound to Ugene-p (pulled down by antibodies against the Flag epitope). A biochemical assay showed that UNG2 bound to Ugene-p was an active enzyme, as indicated by its initiating a cascade that cleaved a uracil-containing oligonucleotide from the parental 21 nucleotides down to 10 nucleotides (Figure 4-12A, lane 2). To ensure that the activity in the Ugene-p (Flag) immunoprecipitates derived from captured UNG2, we repeated the assay in DLD1 cells rendered UNG null by somatic cell knockout (Figure 4-11B). No activity was detected in Ugene-p immunoprecipitates from UNG null cells. Thus, we conclude that the biochemical activity detected in Ugene-p precipitates from parental DLD1 cells derives from active UNG2 bound to Ugene-p.

To examine whether binding to Ugene-p can alter UNG2 subcellular localization,

we expressed V5-epitope tagged UNG2 (UNG2-V5) in the cells conditionally

expressing the 1-25-UNG2-GFP fusion protein under doxycycline (dox)

regulation. Immunofluorescence against the V5 epitope showed that UNG2 was

localized in the nucleus irrespective of expression of the 1-25-UNG2-GFP decoy

protein (Figure 4-13). Therefore, expressing a competitor for Ugene-p binding did

not alter UNG2 nuclear localization.

The UNG locus encodes both a nuclear isoform, UNG2, and a mitochondrial isoform, UNG1, which share the same catalytic domain but differ in size (Nilsen et al., 1997). They are transcribed from different start sites, which causes the difference in their NH2-terminal protein sequences (Figure 4-11A). In repeated assays, only a UNG2-sized protein was ever detected in Ugene-p immunoprecipitates (data not shown).

To further assay effects of Ugene-p expression on UNG2 activity, we generated cells null for UNG1. This was done by selective knockout of the UNG1 specific exon 1 from the UNG locus (Figure 4-11). In these cells expressing UNG2 only, we again introduced the 1-25-UNG2-GFP decoy protein under doxycycline regulation. These cells were used to determine UNG2 enzymatic activity under two experimental conditions (Figure 4-12B). First, we compared UNG2 activity in cell lysates without (dox -) and with (dox +) induced expression of the 1-25-UNG2-GFP decoy protein (upper panels). As shown in Figure 4-10D, the highly expressed decoy protein totally abolished the interaction of Ugene-p and UNG2, but did not alter UNG2 biochemical activity in the lysates, as shown in Figure 4-12B (upper panels). Specifically, equal signal intensity of the 10nt cleavage product of the uracil-containing oligos was seen in both dox(+) and dox(-) conditions. Second, we compared UNG2 activity in lysates prepared from cells without and with suppression of Ugene expression by si1017. Ugene knock-down likewise did not change the enzymatic activity of UNG2, as shown in Figure 4-12B (lower panels). These findings were equally true whether UNG2 activity was analyzed with a 21-bp oligo containing a U:A pair or a U:G mispair at position 10, which respectively modeled uracil misincorporation into DNA and uracil arising from spontaneous deamination of cytosine. These results suggest that under the experimental conditions employed, changing Ugene-p expression did not alter UNG2 biochemical activity.

[Figure 4-11 image. Panel B lane labels: UNG1 null, UNG null, Parental; band labels: UNG2, UNG1, Actin.]

Figure 4-11 Western analysis of UNG in UNG1 null and UNG null cells

A) Genomic structure of UNG locus. B) Western blot of UNG

[Figure 4-12 image. Panel A: lanes 1-4, UNG wild type (Wt) or null cells, without (-) or with (+) Ugene-p-Flag; 20nt and 10nt bands for the U:A oligo; UNG2 Western blot below. Panel B: upper panels -dox vs. +dox, lower panels siGLO vs. si1017, each with 1/4, 1/2, 1, and 2 µg of cell lysate; 20nt and 10nt bands for the U:A and U:G oligos.]

Figure 4-12 Assays of UNG enzymatic activity

A) Wild type or UNG null DLD1 cells were transfected with a plasmid expressing Flag-tagged Ugene-p or the corresponding empty vector. Cell lysates were then immunoprecipitated with Flag antibodies. Immunoprecipitates were subjected to a UNG biochemical activity assay, in which activity is indicated by the presence of a 10nt product (arrow) generated by cleavage of a 21bp input double-stranded DNA. Input oligos contain a single U:A base pair. The lower panel shows a Western blot of UNG2 in Ugene-p immunoprecipitates. B) Using an AAV-mediated somatic knockout technique, cells expressing UNG2 only were constructed in the DLD1/Ugene-p-3xFlag/1-25-UNG2-GFP background. The upper two panels show the UNG2 activity in the cell lysates (amount as indicated) without (dox -) and with (dox +) induced expression of the 1-25-UNG2-GFP decoy protein. The lower two panels compare UNG2 activity in lysates (amount as indicated) from cells without (siGLO) and with (siRNA-1017) suppression of Ugene expression by siRNA. In the left two panels, the input oligos for the assay contain a single U:A base pair; in the right two panels, the input oligos contain a single U:G base pair. UNG2 activity is indicated by the autoradiographic signal of the 10nt oligos.

Figure 4-13 UNG2 localization in DLD1 cells with and without presence of Ugene-p

DLD/Ugene-p-3xFLAG/1-25-UNG2-GFP cells (see legend of figure 4C) were transfected with plasmids expressing V5-epitope-tagged UNG2. Immunofluorescence against the V5 epitope (green) was applied to the transfected cells without (dox -) and with (dox +) expression of the 1-25-UNG2-GFP decoy protein, to detect subcellular localization of UNG2.

Materials and methods

Cell lines and tissues are as described in the previous chapters

Clonogenic assay

Cells were seeded at 5.0×10⁴ cells/well in a six-well plate and transfected the next day with 100 pmol of double-stranded siRNA (Dharmacon) using 5 µL of LipofectAMINE 2000 (Invitrogen, Carlsbad, CA) as per the manufacturer's instructions. 18 hours after transfection, the cells were washed in PBS and trypsinized. Cells were then seeded in six-well plates at 400 cells/well in 2 mL medium. Colonies were grown for 10 days, stained as previously described (Guda et al., 2007) and quantified using an Alpha imager (Alpha Innotech, San Leandro, CA).

Construction of expression/deletion vectors

The coding sequences of Ugene (Ugene-p/Ugene-q, XM_001133365) and UNG2 (NM_080911) were PCR amplified and cloned into the eukaryotic expression vector pcDNA3.1/V5/His-TOPO (Invitrogen, Carlsbad, CA) to generate COOH-terminal V5-tagged Ugene/UNG2 expression vectors. The primer sequences for constructing the vectors are as follows: for Ugene, forward 5′-ACC TCA TCC TTC CTG CGA CG-3′ and reverse 5′-TCT AAT ACA CTC CTC TGC TGA GAT-3′; for UNG2, forward 5′-ATG GGC GTC TTC TGC CTT G-3′ and reverse 5′-CAG CTC CTT CCA GTC AAT G-3′. FLAG-tagged constructs were similarly made by adding the reverse complement of the FLAG tag coding sequence with a stop codon (5′-TTA CTT GTC ATC GTC GTC CTT GTA GTC-3′) at the 5′ end of the reverse primer. UNG2 deletion constructs were generated by blunt-ligation of the PCR products, amplified using the V5-tagged UNG2 expression vector as a template, with the primer sets listed in supplemental table 1. The decoy fusion protein 1-25-UNG2-GFP expression vector was constructed by ligating the following 3 fragments: a) green fluorescent protein (GFP) DNA [that was PCR amplified from pEGFP-N1 template (Clontech, Mountain View, CA) using forward 5′-TTG AAT TCA TGG TGA GCA AGG GCG AGG AG-3′ and reverse 5′-TTC TCG AGC CCT TGT ACA GCT CGT CCA TGC-3′, and digested with EcoRI and XhoI]; b) nucleotides corresponding to amino acids 1-25 of UNG2 (that were amplified using forward 5′-TTG GAT CCC TCC TCA GCT CCA GGA TGA T-3′ and reverse 5′-TTG AAT TCC TCG GGG CTG GGG GCG TGT-3′, then digested with BamHI and EcoRI); c) pcDNA4/myc-His(B) (Invitrogen) fragment (that was obtained by digesting the vector with BamHI and XhoI). In this construct, 1-25-UNG2-GFP is driven by a CMV promoter with two copies of the TetO2 operator sequence, which can be suppressed by the Tet repressor.
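The primer design described above can be checked computationally. The following is a minimal sketch, not part of the original protocol: it reverse-complements and translates the FLAG tail added to the reverse primers, and looks for the EcoRI, XhoI and BamHI recognition sites in the primers used for the 1-25-UNG2-GFP fragments. The helper names and the reduced codon table (only the codons needed for this one sequence) are illustrative assumptions.

```python
# Sanity checks on primer sequences quoted in the text above.
# Helper names and the reduced codon table are illustrative assumptions.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sites of the restriction enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG", "BamHI": "GGATCC"}

# Only the codons needed to read the FLAG epitope and its stop codon.
CODONS = {"GAC": "D", "GAT": "D", "TAC": "Y", "AAG": "K", "TAA": "*"}


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the text and upper-case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    return clean(seq).translate(DNA_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    seq = clean(seq)
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


def sites_in(primer: str) -> list[str]:
    """Names of the enzymes whose recognition site occurs in the primer."""
    s = clean(primer)
    return [name for name, site in SITES.items() if site in s]


# FLAG tail placed at the 5' end of the reverse primers.
flag_tail = "TTA CTT GTC ATC GTC GTC CTT GTA GTC"
flag_peptide = translate(reverse_complement(flag_tail))

primers = {
    "GFP forward": "TTG AAT TCA TGG TGA GCA AGG GCG AGG AG",
    "GFP reverse": "TTC TCG AGC CCT TGT ACA GCT CGT CCA TGC",
    "UNG2 1-25 forward": "TTG GAT CCC TCC TCA GCT CCA GGA TGA T",
    "UNG2 1-25 reverse": "TTG AAT TCC TCG GGG CTG GGG GCG TGT",
}
found = {name: sites_in(seq) for name, seq in primers.items()}

print(flag_peptide)
for name, enzymes in found.items():
    print(name, enzymes)
```

Read on the coding strand, the FLAG tail encodes the FLAG epitope followed by a stop codon, and each primer carries the site of the enzyme used to digest its fragment.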

Construction of DLD1 clones conditionally expressing 1-25-UNG2-GFP

DLD1 cells were seeded at 1.0×10⁶ cells/100 mm dish and transfected the next day with 1.6 µg of pcDNA6/TR and 0.4 µg of pcDNA4/myc-His/1-25-UNG2-GFP plasmid using 12 µL of Fugene 6 (Roche Applied Sciences) as per the manufacturer’s protocols, then selected with blasticidin (10 µg/ml) and zeocin (200 µg/ml) for 2 weeks to derive clones conditionally expressing the 1-25-UNG2-GFP fusion protein under doxycycline (0.5 µg/mL) regulation.


Immunoprecipitation and Western blot analysis

HEK 293T cells were seeded at 4.0×10⁶ cells/T75 flask and transfected the next day with a total of 6 µg of plasmids using 36 µL of LipofectAMINE (Invitrogen). Cell lysates were prepared 48 hours after transfection using lysis buffer (50 mM Tris-HCl pH7.5, 1 mM EDTA pH8.0, 150 mM NaCl, 1% Triton X-100) supplemented with protease inhibitor mixture (Roche Applied Sciences). FLAG immunoprecipitation was performed with anti-FLAG M2 affinity gel as described in the manufacturer’s protocol. After elution by either FLAG peptide (Sigma, St. Louis, MO) or 3xFLAG peptide (Sigma), eluates were used for biochemical activity assay or for Western blot. Western blots were performed using anti-FLAG M2 (Sigma, 1:1000), anti-V5 (1:1000, Invitrogen), anti-UNG (1:500, Abcam, Cambridge, MA), anti-GFP (1:5000, Invitrogen), or anti-beta actin (1:100,000, Sigma) antibodies, followed by horseradish peroxidase-conjugated donkey anti-mouse secondary antibody (1:1500, Jackson ImmunoResearch Laboratories, West Grove, PA), and visualized using an Enhanced Chemiluminescence Plus detection kit (Amersham Biosciences).

Somatic cell knockout

Somatic cell knockout was performed as described (Kohli et al., 2004). Knockout of both UNG1 and UNG2 transcription units was accomplished by disrupting UNG locus exon 2 using left and right targeting arms amplified with primers: left arm 5′-GGC TCG AGA GGC ACA AAG CGA ATG AAA G-3′/5′-CCG AAT TCA GTC AGT CAC TCT GGA TCC GGT CCA ACT-3′, right arm 5′-GGA CTA GTG GAG AGA GCT GGA AGA AGC A-3′/5′-AAC CGC GGT TTG AAC TTC ACC ACC ACC A-3′. UNG1 specific knockout was accomplished by eliminating the UNG1 specific exon 1 using targeting arms amplified with primers: left arm 5′-GGC TCG AGA AGA GCC TGT CCA AAG AGC A-3′/5′-CCG AAT TCC GGG AAT TGG GAA TTA GGT T-3′, right arm 5′-GGA CTA GTC TCT TGA GCC GCC TCT GC-3′/5′-AAC CGC GGT TTG AAC TTC ACC ACC ACC A-3′. Cells expressing only UNG2 were generated by UNG1 specific knockout of one allele combined with the UNG locus knockout of the second allele.

siRNA-mediated Ugene silencing

The Ugene-specific and control siRNAs were synthesized by Dharmacon. For siRNA transfection experiments, DLD1 cells were seeded on 100 mm culture dishes and transfected with 30 µL of 20 µM siRNA stock using 30 µL

LipofectAMINE 2000 (Invitrogen). Cell lysates were collected 48 hours after transfection and knockdown was validated by Western blot. The sequences of

Ugene siRNA-1017 were as follows: sense 5′-GGA AGA UGC UAU UUC ACC

AUU-3′, antisense 5′-pUGG UGA AAU AGC AUC UUC CUU-3′.
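As an illustrative check on the siRNA-1017 sequences and amounts given above, the sketch below confirms that the 19-nt cores of the sense and antisense strands are reverse complements, leaving 2-nt UU 3′ overhangs, and converts the stated volume and stock concentration into picomoles. The function names are assumptions, and the 5′ phosphate ("p") on the antisense strand is omitted because it is not a base.

```python
# Illustrative check of the siRNA-1017 duplex; names are assumptions.

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the text and upper-case."""
    return seq.replace(" ", "").upper()


def rna_reverse_complement(seq: str) -> str:
    return clean(seq).translate(RNA_COMPLEMENT)[::-1]


sense = clean("GGA AGA UGC UAU UUC ACC AUU")
antisense = clean("UGG UGA AAU AGC AUC UUC CUU")  # 5' phosphate omitted

OVERHANG = 2  # 3' overhang of two uridines on each strand
sense_core = sense[:-OVERHANG]
antisense_core = antisense[:-OVERHANG]

# The duplex is valid if the two 19-nt cores pair along their full length.
cores_pair = rna_reverse_complement(sense_core) == antisense_core

# Amount of siRNA per 100 mm dish: microliters x micromolar = picomoles.
volume_ul = 30
stock_um = 20
sirna_pmol = volume_ul * stock_um

print(cores_pair, len(sense_core), sirna_pmol)
```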

In vitro UNG biochemical activity assay

The uracil-containing oligonucleotide (5′-CCT GCC CTG UGC AGC TGT GGG-3′) (R&D, Minneapolis, MN) was annealed to an equimolar amount of its complementary strand (5′-CCC ACA GCT GCA CAG GGC AGG-3′ or 5′-CCC ACA GCT GCG CAG GGC AGG-3′, for U:A and U:G pairs respectively): the strands were mixed and heated to 95°C in annealing buffer (20 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 mM DTT, and 0.1 mg/mL BSA), and allowed to cool slowly to room temperature. The DNA was then end-labeled with gamma 32P-dATP by T4 polynucleotide kinase. The in vitro UNG biochemical activity assay was performed as per the manufacturer's instructions (R&D). In brief, after exposure to UNG2, which removes the uracil base and leaves an abasic site, the oligonucleotide was split in half at that site by incubation in alkali buffer (300 mM NaOH, 97% formamide) at 100°C for 10 minutes.
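The substrate design can be verified directly from the sequences listed above. The sketch below, with illustrative helper names, locates the single uracil in the 21-nt strand and reads off the base facing it in each partner strand. It assumes standard antiparallel pairing, so that position p of one strand faces position 21 - p + 1 of the other.

```python
# Illustrative check of the UNG assay substrates; names are assumptions.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in the text and upper-case."""
    return seq.replace(" ", "").upper()


uracil_strand = clean("CCT GCC CTG UGC AGC TGT GGG")
ua_partner = clean("CCC ACA GCT GCA CAG GGC AGG")
ug_partner = clean("CCC ACA GCT GCG CAG GGC AGG")

# 1-based position of the uracil, counted from the 5' end.
u_position = uracil_strand.index("U") + 1


def base_opposite(partner: str, position: int) -> str:
    """Base in the antiparallel partner that faces a 1-based position."""
    return partner[len(partner) - position]


opposite_in_ua = base_opposite(ua_partner, u_position)
opposite_in_ug = base_opposite(ug_partner, u_position)

# Perfect Watson-Crick partner of the uracil strand (U treated as T).
expected_partner = uracil_strand.replace("U", "T").translate(DNA_COMPLEMENT)[::-1]

# 1-based positions at which the U:G partner departs from that complement.
ug_differences = [
    i + 1 for i, (a, b) in enumerate(zip(expected_partner, ug_partner)) if a != b
]

print(u_position, opposite_in_ua, opposite_in_ug, ug_differences)
```

The uracil sits at position 10, as stated in the Results, and the two partner strands differ only at the base facing it.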


Chapter 5 Discussion and Future Directions

In summary, we report here the identification of a novel gene, Ugene, whose

expression is broadly induced in many human cancer types. Using various

techniques we have also demonstrated suppression of Ugene message by TGF-β

treatment. Moreover, we demonstrate the interaction of Ugene-p with UNG2, a base excision repair enzyme. The frequent over-expression of Ugene-p in cancer suggests that this gene may participate in the cancer phenotype. A direct assay to test Ugene-p for oncogenic activity, however, revealed no transforming or focus forming activity when tested in epithelial (HMEC) or fibroblast (WI-38, IMR-90,

REF-52, NIH 3T3) cells.

The interaction of Ugene-p with UNG2, in particular, is highly intriguing, as multiple DNA repair pathways are now recognized as targets for alteration in cancers, including inactivation of genes in the mismatch repair pathway in colon cancers (Grady and Markowitz, 2002) and inactivation of the BRCA1/2 proteins in breast cancers (Fackenthal and Olopade, 2007). Despite this intriguing association, we have not yet been able to demonstrate a direct regulation of UNG2 repair activity by Ugene-p in vitro. It is likely, however, that the in vivo activity of UNG2 is more complicated than we have been able to model in in vitro assays, as UNG2 in vivo activity involves recognition of mis-incorporated uracil at the replication fork and recognition of uracils that are spontaneously generated through cytosine deamination in native chromatin, in addition to interactions with other members of the BER complex, such as AP (apurinic/apyrimidinic) endonuclease. That Ugene-p could promote a specialized function of UNG2 is suggested by the observation that immunoprecipitation of over-expressed Flag-tagged Ugene-p pulled down only a subpopulation of total UNG2 protein. Of note, the NH2-terminus of UNG2, to which Ugene-p binds, has also been shown to bind the PPM1D phosphatase, which dephosphorylates Thr6, effecting a protein modification that is suggested to play an important role in the regulation of UNG2 activity under some circumstances (Lu et al., 2004). Further analysis of the effect of Ugene-p on UNG2 in these native contexts will be undertaken in future studies.

Genome comparisons show Ugene-p arose as a feature of mammalian cells, in which it is highly conserved, suggesting an important role for the protein in higher organisms. It is, however, unclear from the current genome assemblies whether there are multiple copies of Ugene-p on chromosome 1, or whether the chromosome 1 assembly remains in need of revision. In contrast, Ugene-q, which is unable to bind UNG2, is specific to humans and is absent in other mammals. It is tempting to speculate that Ugene-q may act as a competitor of Ugene-p interactions with other proteins, but testing of this model awaits additional clarification of the functional activities of Ugene-p. The fact that Ugene-q does not bind to UNG2 also suggests that Ugene-q may have different functions. However, since these two sister genes share such a similar genomic structure and are regulated together, Ugene-p and Ugene-q must be functionally related.

Having multiple copies also complicates the study of Ugene function. It is not practical to knock out all the copies in human cells, and epitope knock-in can tag at most one fourth of the total Ugene molecules, which is not enough to be detected by immunofluorescence. Because there is only one copy of the Ugene homolog in the mouse genome, murine cells can be used to study Ugene function in the future.

Additional future study will also be needed to clarify the role of a second protein, NOSIP, that binds to both Ugene-p and Ugene-q. Although on further evaluation NOSIP did not co-precipitate with UNG2, it could still be a partner for which Ugene-p and Ugene-q compete.

The cell cycle dependent expression of Ugene indicates that this gene may be important for cell cycle progression, or that it may function only at certain phases of the cell cycle. However, knocking down Ugene expression with Ugene-specific siRNA did not affect the cell cycle profile (data not shown) or the colony-forming ability of the targeted cells. It would be informative to knock down or knock out Ugene-p or Ugene-q alone, and then to test whether disturbing the balance between Ugene-p and Ugene-q is the key to understanding their functions.

In summary, we report Ugene-p as a novel gene, commonly over-expressed in human cancers, and participating in a nuclear complex with the base-excision-repair gene, UNG2.

Bibliography

An, H.J., Cho, N.H., Yang, H.S., Kwak, K.B., Kim, N.K., Oh, D.Y., Lee, S.W.,

Kim, H.O., and Koh, J.J. (2007). Targeted RNA interference of

phosphatidylinositol 3-kinase p110-beta induces apoptosis and proliferation arrest

in endometrial carcinoma cells. The Journal of pathology 212, 161-169.

Anai, S., Goodison, S., Shiverick, K., Hirao, Y., Brown, B.D., and Rosser, C.J.

(2007). Knock-down of Bcl-2 by antisense oligodeoxynucleotides induces

radiosensitization and inhibition of angiogenesis in human PC-3 prostate tumor

xenografts. Molecular cancer therapeutics 6, 101-111.

Brunschwig, E.B., Wilson, K., Mack, D., Dawson, D., Lawrence, E., Willson,

J.K., Lu, S., Nosrati, A., Rerko, R.M., Swinler, S., et al. (2003). PMEPA1, a

transforming growth factor-beta-induced marker of terminal colonocyte

differentiation whose expression is maintained in primary and metastatic colon

cancer. Cancer research 63, 1568-1575.

Chytil, A., Waltner-Law, M., West, R., Friedman, D., Aakre, M., Barker, D., and

Law, B. (2004). Construction of a cyclin D1-Cdk2 fusion protein to model the

biological functions of cyclin D1-Cdk2 complexes. The Journal of biological

chemistry 279, 47688-47698.

Derynck, R., and Feng, X.H. (1997). TGF-beta receptor signaling. Biochim

Biophys Acta 1333, F105-150.

Dong, K., Wang, R., Wang, X., Lin, F., Shen, J.J., Gao, P., and Zhang, H.Z.

(2008). Tumor-specific RNAi targeting eIF4E suppresses tumor growth, induces

apoptosis and enhances cisplatin cytotoxicity in human breast carcinoma cells.

Breast cancer research and treatment.

Fackenthal, J.D., and Olopade, O.I. (2007). Breast cancer risk associated with

BRCA1 and BRCA2 in diverse populations. Nature reviews 7, 937-948.

Fink, S.P., Mikkola, D., Willson, J.K., and Markowitz, S. (2003). TGF-beta-induced nuclear localization of Smad2 and Smad3 in Smad4 null cancer cell lines. Oncogene 22, 1317-1323.

Fishel, R., Lescoe, M.K., Rao, M.R., Copeland, N.G., Jenkins, N.A., Garber, J.,

Kane, M., and Kolodner, R. (1993). The human mutator gene homolog MSH2 and

its association with hereditary nonpolyposis colon cancer. Cell 75, 1027-1038.

Friedl, W., Kruse, R., Uhlhaas, S., Stolte, M., Schartmann, B., Keller, K.M.,

Jungck, M., Stern, M., Loff, S., Back, W., et al. (1999). Frequent 4-bp deletion in

exon 9 of the SMAD4/MADH4 gene in familial juvenile polyposis patients.

Genes, chromosomes & cancer 25, 403-406.

Ghandour, G., and Glynne, R.J. (2000). Method and apparatus for analysis of data

from biomolecular arrays, International Patent WO0079645.

Grady, W.M., and Markowitz, S.D. (2002). Genetic and epigenetic alterations in colon cancer. Annual review of genomics and human genetics 3, 101-128.

Grady, W.M., and Markowitz, S.D. (2003). Hereditary colon cancer genes.

Methods Mol Biol 222, 59-83.

Grady, W.M., Myeroff, L.L., Swinler, S.E., Rajput, A., Thiagalingam, S.,

Lutterbaugh, J.D., Neumann, A., Brattain, M.G., Chang, J., Kim, S.J., et al.

(1999). Mutational inactivation of transforming growth factor beta receptor type

II in microsatellite stable colon cancers. Cancer Res 59, 320-324.

Guda, K., Natale, L., and Markowitz, S. (2007). An improved method for staining cell colonies in clonogenic assays. Cytotechnology 54, 85-88.

Heinen, C.D., Schmutte, C., and Fishel, R. (2002). DNA repair and tumorigenesis: lessons from hereditary cancer syndromes. Cancer biology & therapy 1, 477-485.

Hocevar, B.A., Brown, T.L., and Howe, P.H. (1999). TGF-beta induces fibronectin synthesis through a c-Jun N-terminal kinase-dependent, Smad4-independent pathway. Embo J 18, 1345-1356.

Howe, J.R., Roth, S., Ringold, J.C., Summers, R.W., Jarvinen, H.J., Sistonen, P., Tomlinson, I.P., Houlston, R.S., Bevan, S., Mitros, F.A., et al. (1998). Mutations in the SMAD4/DPC4 gene in juvenile polyposis. Science 280, 1086-1088.

Kinzler, K.W., and Vogelstein, B. (1996). Lessons from hereditary colorectal cancer. Cell 87, 159-170.

Kohli, M., Rago, C., Lengauer, C., Kinzler, K.W., and Vogelstein, B. (2004).

Facile methods for generating human somatic cell gene knockouts using recombinant adeno-associated viruses. Nucleic acids research 32, e3.

Leach, F.S., Nicolaides, N.C., Papadopoulos, N., Liu, B., Jen, J., Parsons, R.,

Peltomaki, P., Sistonen, P., Aaltonen, L.A., Nystrom-Lahti, M., et al. (1993).

Mutations of a mutS homolog in hereditary nonpolyposis colorectal cancer. Cell

75, 1215-1225.

Li, H., Myeroff, L., Smiraglia, D., Romero, M.F., Pretlow, T.P., Kasturi, L.,

Lutterbaugh, J., Rerko, R.M., Casey, G., Issa, J.P., et al. (2003). SLC5A8, a sodium transporter, is a tumor suppressor gene silenced by methylation in human colon aberrant crypt foci and cancers. Proceedings of the National Academy of

Sciences of the United States of America 100, 8412-8417.

Loeb, L.A. (1991). Mutator phenotype may be required for multistage carcinogenesis. Cancer research 51, 3075-3079.

Loeb, L.A., Springgate, C.F., and Battula, N. (1974). Errors in DNA replication as a basis of malignant changes. Cancer research 34, 2311-2321.

Lu, S.L., Kawabata, M., Imamura, T., Akiyama, Y., Nomizu, T., Miyazono, K., and Yuasa, Y. (1998). HNPCC associated with germline mutation in the TGF-beta type II receptor gene. Nat Genet 19, 17-18.

Lu, X., Bocangel, D., Nannenga, B., Yamaguchi, H., Appella, E., and

Donehower, L.A. (2004). The p53-induced oncogenic phosphatase PPM1D interacts with uracil DNA glycosylase and suppresses base excision repair.

Molecular cell 15, 621-634.

Markowitz, S., Wang, J., Myeroff, L., Parsons, R., Sun, L., Lutterbaugh, J., Fan,

R.S., Zborowska, E., Kinzler, K.W., Vogelstein, B., et al. (1995). Inactivation of

the type II TGF-beta receptor in colon cancer cells with microsatellite instability.

Science 268, 1336-1338.

Massague, J., Blain, S.W., and Lo, R.S. (2000). TGFbeta signaling in growth

control, cancer, and heritable disorders. Cell 103, 295-309.

Moinova, H.R., Chen, W.D., Shen, L., Smiraglia, D., Olechnowicz, J., Ravi, L.,

Kasturi, L., Myeroff, L., Plass, C., Parsons, R., et al. (2002). HLTF gene silencing

in human colon cancer. Proceedings of the National Academy of Sciences of the

United States of America 99, 4562-4567.

Myung, S.J., Rerko, R.M., Yan, M., Platzer, P., Guda, K., Dotson, A., Lawrence, E., Dannenberg, A.J., Lovgren, A.K., Luo, G., et al. (2006). 15-Hydroxyprostaglandin dehydrogenase is an in vivo suppressor of colon tumorigenesis. Proceedings of the National Academy of Sciences of the United States of America 103, 12098-12102.

Nilsen, H., Otterlei, M., Haug, T., Solum, K., Nagelhus, T.A., Skorpen, F., and

Krokan, H.E. (1997). Nuclear and mitochondrial uracil-DNA glycosylases are generated by alternative splicing and transcription from different positions in the

UNG gene. Nucleic acids research 25, 750-755.

Parker, S.L., Tong, T., Bolden, S., and Wingo, P.A. (1996). Cancer statistics,

1996. CA Cancer J Clin 46, 5-27.

Parsons, R., Myeroff, L.L., Liu, B., Willson, J.K., Markowitz, S.D., Kinzler,

K.W., and Vogelstein, B. (1995). Microsatellite instability and mutations of the transforming growth factor beta type II receptor gene in colorectal cancer. Cancer

Res 55, 5548-5550.

Riggins, G.J., Thiagalingam, S., Rozenblum, E., Weinstein, C.L., Kern, S.E.,

Hamilton, S.R., Willson, J.K., Markowitz, S.D., Kinzler, K.W., and Vogelstein,

B. (1996). Mad-related genes in the human. Nat Genet 13, 347-349.

Rozen, S., and Skaletsky, H. (2000). Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 132, 365-386.

Sampson, J.R., Jones, S., Dolwani, S., and Cheadle, J.P. (2005). MutYH (MYH) and colorectal cancer. Biochemical Society transactions 33, 679-683.

Takagi, Y., Kohmura, H., Futamura, M., Kida, H., Tanemura, H., Shimokawa, K.,

and Saji, S. (1996). Somatic alterations of the DPC4 gene in human colorectal

cancers in vivo. Gastroenterology 111, 1369-1372.

Takaku, K., Miyoshi, H., Matsunaga, A., Oshima, M., Sasaki, N., and Taketo,

M.M. (1999). Gastric and duodenal polyps in Smad4 (Dpc4) knockout mice.

Cancer research 59, 6113-6117.

Takaku, K., Oshima, M., Miyoshi, H., Matsui, M., Seldin, M.F., and Taketo,

M.M. (1998). Intestinal tumorigenesis in compound mutant mice of both Dpc4

(Smad4) and Apc genes. Cell 92, 645-656.

Taketo, M.M., and Takaku, K. (2000). Gastro-intestinal tumorigenesis in Smad4 mutant mice. Cytokine & growth factor reviews 11, 147-157.

Uchida, D., Omotehara, F., Nakashiro, K., Tateishi, Y., Hino, S., Begum, N.M.,

Fujimori, T., and Kawamata, H. (2003). Posttranscriptional regulation of TSC-22

(TGF-beta-stimulated clone-22) gene by TGF-beta 1. Biochem Biophys Res

Commun 305, 846-854.

Veldwijk, M.R., Topaly, J., Laufs, S., Hengge, U.R., Wenz, F., Zeller, W.J., and Fruehauf, S. (2002). Development and optimization of a real-time quantitative PCR-based method for the titration of AAV-2 vector stocks. Mol Ther 6, 272-278.

Wang, J., Sun, L., Myeroff, L., Wang, X., Gentry, L.E., Yang, J., Liang, J.,

Zborowska, E., Markowitz, S., Willson, J.K., et al. (1995). Demonstration that

mutation of the type II transforming growth factor beta receptor inactivates its tumor suppressor activity in replication error-positive colon carcinoma cells. J

Biol Chem 270, 22044-22049.

Westerhausen, D.R., Jr., Hopkins, W.E., and Billadello, J.J. (1991). Multiple transforming growth factor-beta-inducible elements regulate expression of the plasminogen activator inhibitor type-1 gene in Hep G2 cells. J Biol Chem 266,

1092-1100.

Willson, J.K., Bittner, G.N., Oberley, T.D., Meisner, L.F., and Weese, J.L.

(1987). Cell culture of human colon adenomas and carcinomas. Cancer research

47, 2704-2713.

Xin, B., Platzer, P., Fink, S.P., Reese, L., Nosrati, A., Willson, J.K., Wilson, K., and Markowitz, S. (2005). Colon cancer secreted protein-2 (CCSP-2), a novel candidate serological marker of colon neoplasia. Oncogene 24, 724-731.

Xu, X., Brodie, S.G., Yang, X., Im, Y.H., Parks, W.T., Chen, L., Zhou, Y.X.,

Weinstein, M., Kim, S.J., and Deng, C.X. (2000). Haploid loss of the tumor suppressor Smad4/Dpc4 initiates gastric polyposis and cancer in mice. Oncogene

19, 1868-1874.

Yan, M., Rerko, R.M., Platzer, P., Dawson, D., Willis, J., Tong, M., Lawrence,

E., Lutterbaugh, J., Lu, S., Willson, J.K., et al. (2004). 15-Hydroxyprostaglandin dehydrogenase, a COX-2 oncogene antagonist, is a TGF-beta-induced suppressor of human gastrointestinal cancers. Proceedings of the National Academy of

Sciences of the United States of America 101, 17468-17473.

Zhang, X., Guo, C., Chen, Y., Shulha, H.P., Schnetz, M.P., LaFramboise, T.,

Bartels, C.F., Markowitz, S., Weng, Z., Scacheri, P.C., et al. (2008). Epitope tagging of endogenous proteins for genome-wide ChIP-chip studies. Nature methods 5, 163-165.
