MECHANISTIC STUDY OF FRAGILE SITE INSTABILITY BY INVESTIGATING RET/PTC REARRANGEMENTS, A COMMON CAUSE OF PAPILLARY THYROID CARCINOMA

BY

LAURA WILLIAMS DILLON

A Dissertation Submitted to the Graduate Faculty of

WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES

in Partial Fulfillment of the Requirements

for the Degree of

DOCTOR OF PHILOSOPHY

Biochemistry and Molecular Biology

May 2012

Winston-Salem, North Carolina

Approved By:

Yuh-Hwa Wang, Ph.D., Advisor

David A. Ornelles, Ph.D., Chair

Peter A. Antinozzi, Ph.D.

Thomas Hollis, Ph.D.

Alan J. Townsend, Ph.D.

ACKNOWLEDGEMENTS

First and foremost I would like to thank my advisor, Dr. Yuh-Hwa Wang. She has provided me with support and guidance over the years and I couldn’t have accomplished any of this without her. Throughout my graduate career under her supervision, she has always put education and scientific growth as a priority. I thank her for all the knowledge she has imparted to me, shaping me into the researcher I am today.

I would also like to thank my committee members, Dr. David Ornelles, Dr. Peter Antinozzi, Dr. Thomas Hollis, and Dr. Alan Townsend, for their scientific guidance and encouragement.

Thank you to all the current and past members of the Wang laboratory for providing support and discussing scientific thoughts and problems. I would like to especially thank Christine Lehman, for her work and contribution to the analysis of double-strand DNA breaks in patient thyroid tissue samples, and Allison Weckerle, for being a great friend and always available for advice.

Additionally, I would like to thank our collaborators, without whom much of this work would not have been possible. I thank Dr. Yuri Nikiforov and his colleague Dr. Manoj Gandhi from the University of Pittsburgh for collecting the data on chromosomal breakage at RET/PTC and detection of RET/PTC rearrangements in thyroid cells, and for providing thyroid tissue samples for the DNA breakage analysis studies. I also thank Dr. Jennifer Cannon from Wake Forest School of Medicine for providing patient thyroid tissue samples.

Finally, I would like to thank my family and friends for all their love and support. My parents, David and Susan Williams, have always encouraged me to do anything I put my mind to and have been by my side every step of the way. From an early age, they provided a model for me of what it takes to be a great scientist, and I hope to always lead by their example. Thank you to my husband, Stephen, who has supported me in all my endeavors and always knows how to brighten my day.

TABLE OF CONTENTS

LIST OF ABBREVIATIONS

LIST OF FIGURES

LIST OF TABLES

ABSTRACT

Chapter

I. INTRODUCTION

II. DNA BREAKS AT FRAGILE SITES GENERATE ONCOGENIC RET/PTC REARRANGEMENTS IN HUMAN THYROID CELLS (Published in Oncogene, April 2010)

III. DNA TOPOISOMERASES PARTICIPATE IN ONCOGENE RET FRAGILITY (Manuscript in Preparation)

IV. DNA SECONDARY STRUCTURES INVOLVED IN FRAGILE SITE BREAKAGE FROM THE STUDY OF HUMAN CHROMOSOME 10 (Manuscript Submitted)

V. DEVELOPMENT OF A DNA BREAKAGE ASSAY TO DETECT SUSCEPTIBILITY TO RET/PTC REARRANGEMENT FORMATION AND POTENTIAL EXPOSURE TO ENVIRONMENTAL FRAGILE SITE-INDUCING CHEMICALS

VI. CONCLUSIONS

APPENDIX

SCHOLASTIC VITA

LIST OF ABBREVIATIONS

°C Degrees Celsius

2-AP 2-Aminopurine

ADD3 Adductin 3 (gamma)

APH Aphidicolin

ATM Ataxia telangiectasia mutated

ATP Adenosine triphosphate

ATR Ataxia telangiectasia and Rad3 related

ATRIP ATR interacting protein

BA Betulinic acid

BAC Bacterial artificial chromosome

BCL-2 B-cell CLL/lymphoma 2

bp Base pair

BRCA1 Breast cancer 1

BRCA2 Breast cancer 2

BrdU Bromodeoxyuridine

c-MYC V-myc myelocytomatosis viral oncogene homolog (avian)

CA California

CCDC6 Coiled-coil domain containing 6

cDNA Complementary deoxyribonucleic acid

CFS Common fragile site

CHK1 Checkpoint kinase 1

CHK2 Checkpoint kinase 2

Chr Chromosome

cm Centimeter

CPT Camptothecin

CPT-11 Camptothecin-11

DEN Diethylnitrosamine

DNA Deoxyribonucleic acid

DNA-PKcs DNA-dependent protein kinase, catalytic subunit

dsDNA Double-strand DNA

dUTP Deoxyuridine triphosphate

EDTA Ethylenediaminetetraacetic acid

FA Fanconi anemia

FANCD2 Fanconi anemia, complementation group D2

FANCI Fanconi anemia, complementation group I

FHIT Fragile histidine triad

FISH Fluorescence in situ hybridization

FUdR Fluorodeoxyuridine

G1-phase Gap 1 phase

G2-phase Gap 2 phase

G6PD Glucose-6-phosphate dehydrogenase

H2AX H2A family, member X

HR Homologous recombination

HUS1 HUS1 checkpoint homolog (S. pombe)

IGH@ Immunoglobulin heavy locus

INA Internexin neuronal intermediate filament protein, alpha

Kb Kilobase

kcal Kilocalorie

KCl Potassium Chloride

L Liter

LM-PCR Ligation-mediated Polymerase chain reaction

M Molar

Mb Megabase

mg Milligram

min Minute

mL Milliliter

mM Millimolar

mol Mole

mRNA Messenger ribonucleic acid

Na Sodium

NaOH Sodium hydroxide

NC North Carolina

NCBI National Center for Biotechnology Information

NCOA4 Nuclear receptor co-activator 4

NFKB2 Nuclear factor of kappa light polypeptide enhancer in B-cells 2

ng Nanogram

NHEJ Non-homologous end joining

nM Nanomolar

NSC Non-small cell

nt Nucleotide

NUP98 Nucleoporin 98 kDa

P Phosphorus

PA Pennsylvania

PAC P1 artificial chromosome

PAGE Polyacrylamide gel electrophoresis

PBS Phosphate buffered saline

PC Positive control

PCNA Proliferating cell nuclear antigen

PCR Polymerase chain reaction

PI Propidium iodide

PTC Papillary thyroid carcinoma

PTEN Phosphatase and tensin homolog

RAD51 RAD51 homolog (S. cerevisiae)

RET Rearranged during transfection proto-oncogene

RFS Rare fragile site

RNA Ribonucleic acid

RNase Ribonuclease

RPMI Roswell Park Memorial Institute Medium

RT-PCR Reverse transcription polymerase chain reaction

S-phase Synthesis phase

SD Standard deviation

SDS Sodium dodecyl sulfate

sec Second

SMC1 Structural maintenance of chromosomes 1

SV40 Simian virus 40

TBE Tris/Borate/EDTA

TBXAS1 Thromboxane A synthase 1 (platelet)

TE Tris/EDTA

TopBP1 Topoisomerase II binding protein 1

Tris Tris(hydroxymethyl)aminomethane

USA United States of America

V Volt

VP-16 VePesid-16, etoposide

WRN Werner syndrome, RecQ helicase-like

WWOX WW domain containing oxidoreductase

ZMIZ1 Zinc finger, MIZ-type containing 1

μg Microgram

μL Microliter

μM Micromolar

LIST OF FIGURES

2.1: Fluorescence in situ hybridization on metaphase chromosomes of HTori-3 cells after treatment with fragile site-inducing chemicals

2.2: DNA breaksite mapping by LM-PCR

2.3: LM-PCR detection of breaks formed in HTori-3 cells after treatment with APH

2.4: Location of breakpoints within intron 11 of RET induced by treatment with APH

2.5: LM-PCR detection of breaks formed in HTori-3 cells after treatment with APH

2.6: Detection of RET/PTC rearrangements in HTori-3 cells after treatment with fragile site-inducing chemicals

3.1: Location of APH-induced DNA breakpoints within intron 11 of RET detected by LM-PCR

3.2: Location of APH-induced breakpoints within intron 11 of RET relative to known patient breakpoints

3.3: Comparison of APH-induced DNA breakpoints to predicted DNA topoisomerase I and II cleavage sites

3.4: Comparison of CPT-11 and VP-16 induced topoisomerase I and II cleavage to predicted cleavage sites

3.5: Location of APH-induced RET intron 11 breakpoints on predicted DNA secondary structures

3.6: Cell survival of HTori-3 cells following drug treatment

3.7: The effect of DNA topoisomerase catalytic inhibitors on the APH-induced common fragile site breakage

3.8: Frequency of APH-induced DNA breakage in combination with CPT-11 treatment

4.1: Free energy values for predicted DNA secondary structures on chromosome 10

4.2: Division of the chromosome 10 sequence into non-fragile and fragile regions

4.3: Density of secondary structure forming potential on chromosome 10

4.4: Establishment and validation of a threshold for Mfold prediction of chromosomal fragility

4.5: Regions predicted to exhibit fragile site breakage within APH-induced common fragile site FRA10G

4.6: DNA secondary structure prediction and in vitro detection within regions predicted to exhibit fragile site instability

4.7: The most stable Mfold predicted DNA secondary structures and free energy values for DNA fragments analyzed by in vitro reduplexing assays

4.8: Location of regions predicted to exhibit fragile site instability and correlation with cancer-associated chromosomal aberrations

5.1: Induction of fragile site breakage by environmental and dietary agents, benzene and DEN

5.2: The effect of APH, benzene, and DEN treatments on the cell cycle

5.3: Frequency of DNA breakage in patient normal thyroid tissue

5.4: Detection of double-stranded DNA breaks by LM-PCR

6.1: Model of fragile site instability in the formation of RET/PTC1 rearrangements in papillary thyroid carcinoma

Appendix Figure 1: Linker and primer sequences for LM-PCR

LIST OF TABLES

1.1: Classification of chromosomal fragile sites

1.2: DNA damage checkpoint proteins shown to regulate common fragile site stability

1.3: Types of RET/PTC rearrangements found in papillary thyroid carcinomas

1.4: External fragile site-inducing/enhancing agents

2.1: Percentage of chromosomes showing disruption of RET, NCOA4, and CCDC6 after exposure to fragile site-inducing agents

3.1: Frequency of DNA breakage with RET intron 11 as detected by LM-PCR

4.1: Classification of the chromosome 10 sequence based on fragile site location

4.2: Free energy of predicted DNA secondary structures for fragile and non-fragile regions of chromosome 10

4.3: Percentage of segments with free energy below -40 kcal/mol

4.4: Percentage of segments with free energy below -50 kcal/mol

4.5: Distribution of proportion of segments per section with free energy below -40 kcal/mol

4.6: Genes located in regions capable of forming highly stable secondary structures and disease associations

5.1: Frequency of DNA breaks at RET, FRA3B, and 12p12.3 loci in patient normal thyroid tissue determined by LM-PCR

5.2: Frequency of double-stranded DNA breaks at RET in patient normal thyroid tissue determined by LM-PCR

Appendix Table 1: Primer and linker sequences for LM-PCR

Appendix Table 2: APH-Induced DNA breakpoints within RET intron 11

Appendix Table 3: Location of RET intron 11 APH-induced DNA breakpoints on predicted DNA secondary structures

Appendix Table 4: Regions on chromosome 10 with at least seven consecutive segments below -43.61 kcal/mol

Appendix Table 5: Genes located in regions capable of forming highly stable secondary structures and disease associations

Appendix Table 6: Copy number alterations that overlap with regions of predicted high levels of DNA secondary structure

ABSTRACT

Dillon, Laura Williams

MECHANISTIC STUDY OF FRAGILE SITE INSTABILITY BY INVESTIGATING RET/PTC REARRANGEMENTS, A COMMON CAUSE OF PAPILLARY THYROID CARCINOMA

Dissertation under the direction of Yuh-Hwa Wang, Ph.D., Associate Professor

Chromosomal fragile sites are non-random regions of the genome with a predisposition to the formation of DNA breaks. Common fragile sites, which are found in all individuals, often coincide with regions mutated in cancer, and therefore are believed to play a role in carcinogenesis. However, there has been no direct evidence linking breakage at fragile sites to the formation of a cancer-causing chromosomal translocation. While fragile sites are stable under normal conditions, exposure to certain chemicals can induce DNA breakage at fragile sites, which ultimately may result in cancer development. The mechanism of instability at fragile sites remains elusive, making it difficult to determine the role of fragile sites in cancer and the risk factors involved. The goal of this work is to investigate the mechanism of common fragile site breakage, the role it plays in the formation of cancer-causing chromosomal translocations, and how this knowledge can be utilized to tailor the treatment of patients.

To provide direct evidence for the role of fragile sites in the formation of oncogenic chromosomal translocations, RET/PTC rearrangements, a common cause of papillary thyroid carcinoma (PTC), were examined since the genes involved in the two most common subtypes, RET/PTC1 and RET/PTC3, are all located within common fragile sites. Furthermore, thyroid cancer rates are rapidly increasing, especially in women where it is the fastest growing cancer, and increased PTC incidence is almost entirely responsible for this upsurge. A large number of PTC tumors containing RET/PTC rearrangements are sporadic in nature, and may be the result of fragile site instability.

The location of the RET, CCDC6, and NCOA4 genes, which participate in RET/PTC1 and RET/PTC3 rearrangements, in fragile sites and their instability when exposed to fragile site-inducing chemicals were confirmed by fluorescence in situ hybridization (FISH) and ligation-mediated PCR (LM-PCR). More importantly, treatment of thyroid cells with these chemicals resulted in the formation of RET/PTC1 rearrangements, providing direct evidence for fragile site involvement in the formation of a cancer-causing chromosomal translocation.

While a role for fragile sites in cancer development is strongly supported, the initial events leading to breakage at these sites are not well understood. Although no consensus sequence has been identified for common fragile sites, several characteristics are shared among many fragile sites studied to date, including the formation of stable DNA secondary structures, which are believed to impede replication fork progression and result in genomic instability. DNA topoisomerases I and II maintain chromosome structural integrity during replication and transcription by transiently inducing DNA breaks, and these have been shown to recognize and preferentially cleave at DNA secondary structures. Using the RET oncogene, initial events of fragile site breakage following treatment with the fragile site-inducing chemical aphidicolin (APH) were investigated. The location of DNA breaks within intron 11 of RET, the major breakpoint cluster region observed in PTC patients with RET/PTC rearrangements, was determined following treatment of thyroid cells with APH using LM-PCR. These breakpoints were located at or near predicted DNA topoisomerase I and/or II cleavage sites. Furthermore, these breakpoints were predicted to coincide with DNA structural features recognized by topoisomerases I and II. Using topoisomerase catalytic inhibitors in combination with APH treatment, the rate of APH-induced DNA breakage at RET and FHIT, also located at an APH-induced common fragile site, was significantly decreased, confirming the involvement of DNA topoisomerases I and II in initiating DNA breakage at these common fragile sites.

The location of chromosomal fragile sites has traditionally been defined cytogenetically, but molecular characterization of many fragile sites has revealed that instability does not occur throughout the entire region. The results above suggest that fragile site breakage may be the result of DNA secondary structure formation, but an unbiased examination of the ability to form stable DNA secondary structures at fragile sites has not been performed. Using DNA secondary structure predictions of the human chromosome 10 sequence, APH-induced common fragile sites were found to contain sequence segments with potential high secondary structure-forming ability, and these segments clustered more densely than those in non-fragile DNA. These predictions were used to refine cytogenetically defined fragile sites, as well as predict potential new fragile sites within non-fragile DNA. Many of these regions were found to overlap with genes mutated in various human diseases as well as regions of copy number variation in cancer.
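The kind of scan summarized above can be pictured with a minimal sketch. It assumes that free energy values (kcal/mol) have already been predicted for consecutive sequence segments, and it flags runs of at least seven consecutive segments below -43.61 kcal/mol, the criterion named in the title of Appendix Table 4. The function name, the input list, and the example values are hypothetical illustrations and are not taken from this work.

```python
from typing import List, Tuple


def find_stable_runs(free_energies: List[float],
                     threshold: float = -43.61,
                     min_run: int = 7) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) segment indices for every run of at
    least `min_run` consecutive segments whose predicted free energy is
    below `threshold` (more negative means a more stable structure)."""
    runs = []
    start = None  # index where the current below-threshold run began
    for i, dg in enumerate(free_energies):
        if dg < threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended at i - 1; keep it only if long enough.
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Handle a run that extends to the final segment.
    if start is not None and len(free_energies) - start >= min_run:
        runs.append((start, len(free_energies) - 1))
    return runs


# Hypothetical values: a run of 8 stable segments, then a run of only 6.
example = [-30.0] * 3 + [-50.0] * 8 + [-20.0] * 2 + [-45.0] * 6
print(find_stable_runs(example))
```

In this illustration only the first run is reported, because the second contains fewer than seven consecutive segments below the threshold.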

Exposure to various dietary and environmental chemicals, including chemotherapeutic drugs, has been shown to induce fragile site breakage. The rate of second cancers is on the rise, and exposure to fragile site-inducing chemicals as a part of treatment or in daily life may be a potential risk factor for developing second primary tumors. PTC has been observed as a second cancer in patients, including those with a history of treatment with DNA topoisomerase poison chemotherapy drugs. This, combined with the increasing rates of thyroid cancer, is a growing concern for the treatment of patients. The potential contribution of external agents in PTC development was tested by monitoring DNA breakage within RET intron 11 in response to treatment with two known environmental fragile site-inducing chemicals, benzene and diethylnitrosamine (DEN). Using LM-PCR, significant levels of DNA breakage were observed within RET intron 11, and this breakage was specific to fragile sites.

Furthermore, treatment of thyroid cells with benzene and DEN resulted in a significant accumulation of cells in the S-phase of the cell cycle, a profile similar to that seen with APH treatment, suggesting similar modes of action for these drugs. The ability to monitor a patient’s susceptibility to fragile site breakage prior to treatment with chemotherapeutic agents may prove valuable in tailoring individualized care and preventing the occurrence of second cancers. Utilizing genomic DNA from normal thyroid tissue of PTC patients with or without RET/PTC rearrangements and patients with noncancerous growths, the rate of DNA breakage was monitored at the RET oncogene using LM-PCR. Although no significant difference was observed in the overall level of DNA breaks at RET in the normal tissue of RET/PTC-positive patients compared to RET/PTC-negative patients, the level of double-strand DNA breaks at RET was significantly elevated in RET/PTC-positive patients compared to RET/PTC-negative hyperplastic nodule patients.

Overall, these results strongly support the role of fragile sites in the formation of cancer-causing chromosomal translocations, specifically the RET/PTC1 rearrangement in PTC. The location of fragile site breakage within the RET oncogene and the DNA secondary structure prediction analysis of chromosome 10 provide valuable insight into the initial events of common fragile site breakage, supporting a model of fragile site instability whereby the formation of stable DNA secondary structures impedes replication fork progression and DNA topoisomerases I and II initiate DNA breakage at these sites through recognition of secondary structures. Furthermore, the ability of benzene and DEN to induce fragile site-specific breakage at the RET oncogene supports fragile site involvement in sporadic PTC tumors. Significant levels of double-strand DNA breakage at RET in normal thyroid tissue from RET/PTC-positive patients further support this finding, suggesting this property could be used as a potential diagnostic assay to tailor the treatment of cancer patients with chemotherapeutic drugs or monitor cancer susceptibility in individuals exposed to high levels of external fragile site-inducing agents.

CHAPTER I: INTRODUCTION

Chromosomal fragile sites are regions of the genome susceptible to DNA breakage and have been linked to mutations leading to cancer development. To date there are 121 fragile sites that can be further classified based on their frequency in the population, common or rare, and their mode of induction. Common fragile sites are present in all individuals and breakage can be induced at these sites by a variety of chemical and environmental agents (1). Variability in fragile site breakage has been observed among individuals (2), with high levels being associated with cancer patients (3). These variances may reflect differing exposures to external fragile site-inducing agents, and such exposure may predispose an individual to a variety of cancers. The mechanism of fragile site instability has yet to be fully elucidated, providing obstacles for understanding the role of fragile sites in cancer and potential risk factors.

The goal of this work is to investigate the mechanism of common fragile site breakage, the role it plays in the formation of cancer-causing chromosomal translocations, and how this knowledge can be utilized to tailor the treatment of patients. These goals were addressed through investigation into the role of fragile site breakage in the formation of the papillary thyroid carcinoma-causing RET/PTC1 translocation, the initiation of fragile site breakage at the corresponding RET oncogene, computational analysis of DNA secondary structure formation at fragile sites, analysis of the effects of environmental agents on the induction of fragile site breakage at RET, and development of a potential diagnostic assay to monitor fragile site susceptibility in patients.

Human Chromosomal Fragile Sites

Chromosomal fragile sites are non-random loci that can be observed as gaps or breaks on metaphase chromosomes under conditions that partially inhibit DNA replication (1). Fragile sites can further be defined as common or rare, based on the frequency of their occurrence in the population and the chemical used to induce their expression (Table 1.1). Rare fragile sites consist of repeated sequence motifs, such as trinucleotide repeats, which are present in less than 5% of the population and are inherited in a Mendelian manner (4). In contrast, common fragile sites are present in all individuals and therefore are a normal component of chromosomal architecture (5).

Fragile sites are stable under normal conditions but can be observed in culture through treatment with various chemicals. The majority of common fragile sites can be induced by aphidicolin (APH), an inhibitor of DNA polymerases α, δ, and ε (6,7). Induction of other common fragile sites has been observed following treatment with bromodeoxyuridine (BrdU) or 5-azacytidine (8). Most rare fragile sites are expressed through the removal of folate, while others show induction following treatment with distamycin-A or BrdU (4). Recently, Mrasek et al. found that APH can induce all types of common and rare fragile sites, suggesting that their expression is less dependent on their currently defined mode of induction, and instead, a classification of fragile sites based on their frequency is more appropriate (9). Additionally, common fragile site breakage can be induced or enhanced through exposure to various dietary or environmental chemicals, including chemotherapeutic agents (10).

Table 1.1: Classification of chromosomal fragile sites

Classes of Fragile Sites (n=121)

Common (n=90) | Rare (n=31)
Aphidicolin inducible (n=79) | Folate sensitive (n=24)
5-azacytidine inducible (n=4) | Distamycin-A inducible (n=5)
BrdU inducible (n=7) | BrdU inducible (n=2)

Traditionally, fragile sites are defined cytogenetically as unstained gaps of an average 3 Mb in size on metaphase chromosomes. However, fragility is not present in the entire region. Rare fragile sites are the result of unstable nucleotide repeats, for example the (CGG)n repeat at folate sensitive rare fragile sites (11) or AT-rich minisatellite repeats at others (12), which can form stable DNA secondary structures. Unlike rare fragile sites, no known consensus sequence exists for common fragile sites. However, analysis of several common fragile sites found that regions of instability corresponded to highly flexible AT-rich sequences (13,14) and predicted stable DNA secondary structures (13,15,16).

Molecular Basis of Fragile Sites

Understanding the molecular basis of fragile site breakage is critical for dissecting the role of fragile sites in cancer. Several intrinsic factors may contribute to their expression. Replication timing experiments have demonstrated that all fragile sites examined so far, including FRA1H (17), FRA2G (17), FRA3B (18), FRA7H (19), FRA10B (20), FRA16B (20), and FRAXA (21), exhibit late replication, which can be further delayed by the addition of replication inhibitors, with some fragile site alleles remaining unreplicated in the late G2 phase (18,19). Large genes are also overrepresented in common fragile sites (8). Furthermore, transcription of late replicating long genes at fragile sites can result in the formation of stable R-loop structures that can lead to common fragile site expression (22).

Although there is no consensus sequence among fragile sites, analysis of several common fragile sites has revealed AT-rich sequences displaying the potential to form highly stable secondary structures (13,14), which may stall DNA replication fork progression. The CGG repeats, which are present in all rare, folate-sensitive fragile sites, can form quadruplex (23) and hairpin (24) structures in vitro, and display significant blocks to DNA replication both in vitro (25) and in vivo (26). The AT-rich rare fragile site FRA16B demonstrates the formation of secondary structure and DNA polymerase stalling in vitro, as well as reduced replication efficiency and increased instability in human cells (27). The examination of replication intermediates from cells containing AT-rich sequences within common fragile site FRA16D in yeast showed site-specific replication fork stalling depending on the length of the AT repeat (15). DNA synthesis of the same fragile site by human replicative polymerases δ and α using an in vitro primer extension assay confirmed polymerase stalling at sites predicted to form inhibitory DNA structures (16). Similar findings were observed for eukaryotic replicative polymerases at hairpin and tetraplex structures formed within CGG repeat expansions (28). Replication fork stalling also occurs at endogenous AT-rich sequences within the common fragile site FRA16C in human lymphoblastoid cells, which is enhanced by APH treatment (29). These data suggest that the ability of fragile sites to form stable secondary structures during DNA replication likely contributes to their breakage by stalling replication fork progression.

Insufficient origin activation at fragile sites has also been observed and contributes to breakage in these regions. Letessier et al. observed a lack of origin firing within the 700-Kb core of common fragile site FRA3B in lymphoblastoid cells, which in turn forced replication forks in the flanking regions to cover large distances to replicate this region (30). Furthermore, these replication forks fired late in the S-phase, such that slowed polymerase progression resulted in incomplete replication of this region. Ozeri-Galai et al. observed that normal replication of common fragile site FRA16C results in slowed and stalled replication forks at regions with the potential to form secondary structures, requiring the activation of additional origins to complete replication. Replication fork stalling at FRA16C increased with mild replication stress and the firing of additional origins to compensate did not occur, resulting in breakage.

The current model for common fragile site breakage is that under conditions of mild replication stress the helicase complex becomes uncoupled from the replicative DNA polymerases, resulting in long stretches of single-stranded DNA. At fragile sites this can result in the formation of stable DNA secondary structures that block replication fork progression, resulting in a stalled replication fork and triggering of the ATR (Ataxia telangiectasia and Rad3 Related)-dependent DNA damage checkpoint pathway. If the replication fork is not properly repaired, this can result in DNA breakage, and the gaps or breaks observed in metaphase chromosomes are believed to represent these unreplicated regions. Additional work needs to be performed to understand the initial events leading to DNA breakage at fragile sites and ultimately mutational events resulting in cancer.

Regulation of Common Fragile Site Stability and Repair of DNA Breaks

ATR kinase is a DNA damage sensor protein that has a major role in regulating stability at common fragile sites (Table 1.2). ATR works with downstream target proteins to respond to stalled and collapsed replication forks, resulting in a block in further replication and mitosis progression and the promotion of DNA repair, recombination, or apoptosis (31,32). The loss of functional ATR in cells results in a defective DNA damage response to agents which block replication fork progression, including APH and hydroxyurea (33-35), and conditions of hypoxia (36). Casper et al. found that cells deficient in ATR, but not ATM (Ataxia telangiectasia-mutated), display up to a 20-fold increase in fragile site breakage following treatment with low doses of APH compared to control cells (37). Also, a deficiency in ATR alone is enough to induce fragile site breakage in cells without treatment with replication inhibitors. Cells from patients with Seckel syndrome, who express low levels of ATR protein due to a hypomorphic mutation in the ATR gene, exhibit an increase in chromosomal breakage at common fragile sites compared to unaffected individuals (38). Furthermore, mice hypomorphic for ATR also display an increase in common fragile site breakage and a significant delay in checkpoint induction (39).

Other downstream targets of the ATR-mediated pathway involved in maintaining fragile site stability include BRCA1 (40), CHK1 (41), SMC1 (42), FANCD2 (43), HUS1 (44), WRN (45), and Claspin (46) (Table 1.2). BRCA1 is a primary target of both ATR and ATM phosphorylation in response to DNA damage. Cells lacking BRCA1 show significantly more fragile site breakage after treatment with APH compared to control cells (40). Also, cells expressing mutant BRCA1 exhibit elevated levels of fragile site breakage but lack the G2/M checkpoint, suggesting BRCA1 regulates fragile site stability through its role at this checkpoint.

CHK1 kinase is the major downstream target of ATR and serves as the central regulator of the ATR checkpoint pathway. Loss of CHK1, but not the ATM-regulated CHK2, in cells was found to result in a significant increase in fragile site breakage after treatment with APH (41). Also, it was found that both ATR and ATM phosphorylate CHK1 following treatment with low doses of APH (47). These data suggest that the role of ATM in fragile site maintenance may be to activate the ATR pathway through phosphorylation of CHK1, when ATR is missing or fails to properly respond to damage.

HUS1 is a member of the PCNA-related 9-1-1 complex, which promotes the phosphorylation of ATR substrates like CHK1 and aids in DNA repair through association with multiple factors. A significant increase in DNA breakage at common fragile sites was observed after inactivation of HUS1 (44). SMC1 is a chromosomal structural maintenance protein that belongs to the cohesin complex, which is necessary for sister chromatid cohesion and DNA repair and acts to hold DNA strands in place. After treatment with APH, cells exhibit an ATR-dependent, ATM-independent phosphorylation of SMC1 and increased fragile site breakage after SMC1 inhibition (42). Claspin is another member of the ATR pathway that is required for ATR-mediated phosphorylation of CHK1 in response to replication stress. Inhibition of Claspin expression increases fragile site expression, with or without APH treatment (46).

Table 1.2: DNA damage checkpoint proteins shown to regulate common fragile site stability

Protein | Function | Reference
ATM | Kinase, maintains fragile site stability in the absence of ATR | (47)
ATR | Kinase, binds to fragile DNA in response to replication stress, phosphorylates downstream targets to activate checkpoint response | (37, 53)
BRCA1 | Phosphorylated by ATR, major downstream target of ATR, necessary for G2/M checkpoint activation following replication stress | (40)
CHK1 | Kinase, phosphorylated by ATR in response to replication stress, central regulator of ATR pathway | (41)
Claspin | Phosphorylated and interacts with CHK1 in response to replication stress | (46)
FANCD2 | Fanconi Anemia pathway protein, phosphorylated by ATR leading to activation by mono-Ub, activated by replication stress | (43)
HUS1 | Member of the 9-1-1 complex, promotes phosphorylation of ATR substrates | (44)
SMC1 | Chromosomal structural maintenance protein, member of the cohesin complex | (42)
WRN | ATP-dependent 3’-5’ helicase, 3’-5’ exonuclease | (16, 45)

Several studies have focused on the role of the Fanconi anemia (FA) pathway in regulating fragile site stability (43). This pathway responds to DNA cross-linking damage and chromosomal instability through an as yet unknown mechanism involving interactions with BRCA1 and RAD51 and recruitment of BRCA2. Fanconi anemia is an autosomal recessive disease associated with an increase in cancer susceptibility, and is most often a result of mutations in FA genes (subtypes A, B, C, D1, D2, E, F, G, I, J, L, M, N) (48). Chromosomal breaks in blood lymphocytes of FA patients are preferentially located at fragile sites (49), and FANCD2 and FANCI specifically associate with common fragile site loci under conditions of replication stress (50). Also, ATR phosphorylates the FA protein FANCD2 and is required for its mono-ubiquitination (51), which is necessary for its activation during S-phase and subsequent colocalization with BRCA1 and RAD51 (52) in response to replication stress. Treatment of both FANCD2 knockdown cells and FA-patient cells with APH results in increased fragile site breakage (43).

WRN is an ATP-dependent 3’-5’ helicase and 3’-5’ exonuclease that is targeted by ATR and interacts with ATR-pathway proteins. Increased fragile site breakage is seen in cells of patients with Werner syndrome (a premature aging disease associated with a greater susceptibility to cancer development), and in WRN knockdown cells after treatment with APH (45). In addition, double knockdown of WRN and ATR did not result in increased chromosomal damage compared to WRN or ATR knockdown alone, suggesting these proteins work in a common pathway. The activity of WRN in fragile site maintenance remains unclear. Pirzio et al. presented data suggesting that WRN helicase activity, not exonuclease activity, plays the main role in stabilizing fragile sites (45). In contrast, Shah et al. found that neither WRN helicase nor exonuclease activity was necessary for polymerase δ progression past stalled replication forks within various FRA16D sequences in vitro (16).

While the importance of the ATR pathway in fragile site maintenance has been established, the mechanism is not fully understood. Recently, Wan et al. found that ATR binds (directly or through complexes) to fragile site FRA3B preferentially compared to non-fragile regions under conditions of mild replication stress (53). This binding increases in a dose-dependent manner, peaking at 0.4 μM APH, and decreases at higher APH concentrations. While the level of ATR binding to FRA3B changes with treatment, the cellular levels of ATR, phospho-ATR (Serine 428), and the ATR-interacting proteins ATRIP and TopBP1 remain unchanged. This suggests that ATR binding to the fragile site is guided initially by the level of replication stress signals generated at FRA3B due to APH treatment, and that ATR is then sequestered from FRA3B regions by successive signals from other non-fragile site regions, which are produced at the higher concentrations of APH. Furthermore, the kinase activity of ATR was required for ATR binding to FRA3B in response to APH treatment. While ATR kinase activity is known to be necessary for phosphorylation of downstream targets to activate the checkpoint signaling cascade (32), these data indicate that the kinase activity of ATR is also necessary for ATR interaction with fragile site regions, most likely through phosphorylation of ATRIP and TopBP1 to stabilize the interaction between these three proteins and the fragile DNA.

Two models, which are not mutually exclusive, have been proposed to explain how ATR helps to maintain fragile site stability (54). The first model states that a loss of ATR can lead to a bypass of stalled replication forks at fragile sites, ultimately resulting in a failure of checkpoint pathways to prevent entry into mitosis, thus leaving DNA breakage at the unreplicated DNA. The second model states that a loss of ATR leads to replication fork collapse at fragile sites, and improper resolution of these structures by ATR leads to DNA breaks. The current information about the involvement of ATR at fragile sites supports a combination of both models. The preferential binding of ATR protein to FRA3B fragile DNAs following APH treatment (53) suggests that ATR plays a possible local role in stabilizing stalled replication forks at fragile regions. Also, this binding and increased fragile site breakage following the inhibition of various members of the ATR pathway suggest that ATR response to fragile sites under conditions of replication stress can activate the ATR-dependent pathway. Finally, decreased ATR binding to FRA3B at higher concentrations of APH (53), which induce more chromosomal gaps or genomic breaks, supports the idea that DNA breakage at fragile sites is due to a failure of ATR to stabilize replication forks and to signal a checkpoint response.

The double-strand DNA break repair protein ATM is also involved in maintaining fragile site stability, but only in the absence of ATR, because loss of ATM alone does not cause increased common fragile site breakage (37). Ozeri-Galai et al. found that a loss of both ATR and ATM significantly increases APH-induced common fragile site breakage compared to the loss of ATR alone. Also, ATM is activated and forms nuclear foci with γH2AX following treatment with low doses of APH (47). These findings indicate that ATR is the major pathway responsible for maintaining fragile site stability, but that ATM also plays a secondary role, perhaps through a downstream response to double-strand breaks that form as a result of ATR deficiency. Schwartz et al. found that downregulation of Rad51, DNA-PKcs, and Ligase IV, key components of the homologous recombination (HR) and non-homologous end joining (NHEJ) repair pathways, significantly increases fragile site breakage with APH treatment, and that γH2AX and phosphorylated DNA-PKcs foci were located at expressed fragile sites (55). The initiation and repair of DNA breaks at fragile sites remain unclear, and more studies are needed to identify the factors required for maintaining fragile site stability.

Fragile Sites and Cancer

Studies over the past several decades have shown a correlation between fragile sites and cancer-specific chromosomal aberrations (56). Many of the genes identified within fragile sites are known tumor suppressor genes or oncogenes (57). Fragile sites have been identified as hot spots for sister chromatid exchange (58), viral integrations (59-64), and gene amplifications (65-68) in tumor cells. Additionally, mutational signatures of some unexplained homozygous deletions observed in cancer cell lines match those at fragile sites (69).

A comprehensive examination of all known simple chromosomal translocations in cancer revealed that 52% of these recurrent translocations had at least one breakpoint located within a fragile site (70). Specifically, 40% of translocations had breakpoints within one gene located in a fragile site, while an additional 12% of translocations had breakpoints within both genes located in fragile sites. Furthermore, 65% of the breakpoints identified within fragile sites were within common - not rare - fragile sites, conferring a genetic risk among all individuals. Since this study only focused on simple translocations between two genes and not the participation of fragile sites in other more complex genomic rearrangements, the association of fragile sites with breakpoints in cancer may prove to be greater than estimated in this study.
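The percentages in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the cited study: the variable names are illustrative assumptions, and the only inputs are the 40% and 12% figures quoted from (70).

```python
# Percentages of recurrent simple translocations quoted in the text from (70).
# Variable names are illustrative and do not come from the cited study.
one_gene_pct = 40    # breakpoint within a fragile site in one partner gene
both_genes_pct = 12  # breakpoints within fragile sites in both partner genes

# Translocations with at least one breakpoint located within a fragile site.
at_least_one_pct = one_gene_pct + both_genes_pct

# Remainder: translocations with no breakpoint mapped to a fragile site.
# This value is derived here by subtraction; it is not reported in the text.
neither_pct = 100 - at_least_one_pct

print(f"At least one breakpoint in a fragile site: {at_least_one_pct}%")
print(f"No breakpoint in a fragile site: {neither_pct}%")
```

The sum reproduces the 52% figure given for translocations with at least one fragile-site breakpoint.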

In addition to the correlation between fragile sites and regions of the genome mutated in cancer, the direct contribution of fragile site breakage to the formation of cancer-specific chromosomal deletions has been observed. Durkin et al. observed deletions within the tumor suppressor gene FHIT, located within the most active common fragile site FRA3B, following treatment of human-mouse chromosome 3 somatic hybrid cells with APH. These deletions were consistent with those observed in esophageal, breast, and lung cancers (71). While this study provides evidence for the involvement of fragile sites in the formation of chromosomal deletions in cancer, no study has been performed linking fragile site breakage to the formation of a cancer-causing chromosomal translocation.

Variability in fragile site breakage has been observed among individuals (2), with high levels being associated with cancer patients (3). Individuals genetically predisposed to forming cancer also have higher levels of fragile site breakage. For example, Seckel syndrome - a rare genetic disorder in which patients exhibit high levels of chromosomal instability and cancer - is caused by low expression of the DNA repair protein ATR, due to a hypomorphic mutation in the ATR gene (72). Cells from patients with Seckel syndrome have significantly higher levels of APH-induced fragile site breakage compared to normal individuals (38). Another rare genetic disorder, Fanconi anemia (FA), is the result of mutations in various proteins involved in the FA double-strand DNA repair pathway. Patients with FA have elevated levels of chromosomal breakage and cancer (73). Chromosomal breakpoints in blood lymphocytes from FA patients are preferentially located in fragile sites (49), and APH-induced fragile site breakage is significantly increased among these patients (43). Proteins in both the ATR and FA DNA repair pathways are important in maintaining stability at fragile sites (10), and these provide extreme examples for a genetic predisposition for fragile site breakage and cancer development.

Together these previous studies provide a strong link between fragile sites and cancer, whereby exposure to fragile site-inducing conditions and/or a genetic predisposition could contribute to the development of various types of cancer.

RET/PTC Rearrangements in Papillary Thyroid Carcinoma

The incidence of thyroid cancer is dramatically rising in the United States and other countries. Thyroid cancer has increased steadily in the United States over the past several decades, and according to data from the National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) database, the incidence is now nearly three times that of the early 1970s (74-76). Furthermore, for unknown reasons, thyroid cancer is three times more prevalent in women than men. Thyroid cancer is the sixth most common type of cancer among women, and is increasing more rapidly than any other cancer. The American Cancer Society estimates that 56,460 new cases of thyroid cancer (43,210 in women, and 13,250 in men) will be diagnosed in the United States in 2012, with approximately 80% of patients below 65 years old (77).
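As a quick consistency check on these figures, the sketch below sums the sex-specific estimates and computes the female-to-male ratio of new cases. It is a minimal illustration using only the numbers quoted from (77); the variable names are assumptions made for this example.

```python
# American Cancer Society estimates of new US thyroid cancer cases in 2012,
# as quoted in the text from (77). Variable names are illustrative only.
cases_women = 43_210
cases_men = 13_250

# The two sex-specific estimates should add up to the quoted total.
total_cases = cases_women + cases_men

# Ratio of estimated new cases in women to estimated new cases in men.
female_to_male_ratio = cases_women / cases_men

print(f"Total estimated cases: {total_cases:,}")
print(f"Female-to-male ratio: {female_to_male_ratio:.2f}")
```

The total matches the quoted 56,460 cases, and the ratio is a little over three, in line with the roughly threefold difference between women and men described in the text.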

The recent upsurge in thyroid cancer is not well understood. Rates of thyroid cancer diagnoses have increased the most among small (≤ 2 cm) thyroid nodules, which may be explained by the use of thyroid ultrasound for diagnosis beginning in the 1980s (75,78). However, it is believed that the increase in thyroid cancer is not solely based on diagnostic methods (79), but due to changes in other risk factors as well (80). One possible contributory risk factor is exposure to ionizing radiation (81), either from external radiation, such as X-rays and γ-radiation, or internal radiation, from ingestion or inhalation of radioiodine. Increases in thyroid cancer have been well-documented following exposure to high doses of radiation during medical procedures or following nuclear bomb explosions or nuclear reactor fallouts (82). However, exposure to low doses of radiation from routine diagnostic X-ray procedures and in the workplace does not increase the risk of thyroid cancer development (82), suggesting radiation exposure is not the only risk factor. An increased body mass index (BMI) is also positively associated with thyroid cancer in both women and men (83), suggesting obesity is another risk factor for thyroid cancer.

Papillary thyroid carcinoma (PTC) is primarily responsible for the upsurge in thyroid cancer rates (76). One mutation commonly observed in PTC is the RET/PTC rearrangement, in which the RET oncogene translocates with a variety of genes that are constitutively expressed in the thyroid. RET (rearranged during transfection) encodes a cell membrane receptor tyrosine kinase that responds to ligands of the glial cell line-derived neurotrophic factor (GDNF) family, activating and survival pathways (84). Expression of RET in the thyroid is high in neural crest-derived C-cells but not in follicular cells, where RET/PTC rearrangements can result in its activation through expression of fusion proteins leading to tumorigenesis.

While prevalence of RET/PTC rearrangements is variable among different studies, overall these translocations are found in 30-40% of adult and 50-60% of pediatric PTC tumors (85). To date, 12 RET/PTC rearrangements have been reported (Table 1.3), all involving RET (86). The two most common subtypes are RET/PTC1 and RET/PTC3, where RET is translocated with CCDC6 and NCOA4, respectively (87). One known risk factor for RET/PTC rearrangement formation is exposure to radiation, where the incidence of RET/PTC rearrangements in PTC patients increases to 60-70% regardless of age (85). RET/PTC3 rearrangements have shown a strong correlation with radiation exposure, where multiple studies indicated these translocations in 63-75% of radiation-induced RET/PTC-positive pediatric PTC tumors (88-91). In contrast, RET/PTC1 rearrangements have been observed in 50-71% of RET/PTC-positive sporadic PTC tumors, while RET/PTC3 rearrangements were only observed in 13-42% of tumors (88,92-94).

Table 1.3: Types of RET/PTC rearrangements found in papillary thyroid carcinomas

N | RET/PTC type | Partner gene | Chromosomal rearrangement | Frequency
1 | RET/PTC1 | CCDC6 | inv10(q11.2;q21) | 70%
2 | RET/PTC2 | PRKAR1A | t(10;17)(q11.2;q23) | 3%
3 | RET/PTC3 (RET/PTC4) | NCOA4 | inv10(q11.2;q10) | 30% (higher in radiation-induced tumors)
4 | RET/PTC5 | GOLGA5 | t(10;14)(q11.2;q32) | Rare
5 | RET/PTC6 | HTIF1 | t(7;10)(q32;q11.2) | Rare
6 | RET/PTC7 | TIF1G | t(1;10)(p13;q11.2) | Rare
7 | ELKS/RET | ELKS | t(10;12)(q11.2;p13.3) | Rare
8 | RET/PTC8 | KTN1 | t(10;14)(q11.2;q22.1) | Rare
9 | RET/PTC9 | RFG9 | t(10;18)(q11.2;q21) | Rare
10 | PCM1/RET | PCM1 | t(8;10)(p21;q11.2) | Rare
11 | RFP/RET | RFP | t(6;10)(p21;q11.2) | Rare
12 | HOOK3/RET | HOOK3 | t(8;10)(p11.21;q11.2) | Rare

Information reviewed in (86).

RET, CCDC6, and NCOA4, the genes participating in RET/PTC1 and RET/PTC3 rearrangements, are all located within common fragile sites. RET and NCOA4 are both located within the same APH-inducible common fragile site, FRA10G, while CCDC6 is located within the BrdU-induced common fragile site FRA10C. The location of these genes within fragile sites, the unexplained nature of sporadic PTC tumors containing translocations of these genes, and the increasing incidence of PTC tumors suggest that fragile site breakage may contribute to sporadic RET/PTC rearrangements.

External Fragile Site-Inducing/Enhancing Agents

Aside from classic fragile site-inducing chemicals like APH, DNA breakage at common fragile sites has been observed following exposure to many external agents, including dietary, environmental, and chemotherapeutic compounds (Table 1.4). Variability in fragile site breakage has been observed among individuals (2), with high levels being associated with cancer patients (3). These variances may reflect differing exposures to external fragile site-inducing agents, and such exposure may predispose an individual to a variety of cancers, including PTC.

Environmental and Dietary Fragile Site-Inducing/Enhancing Chemicals

Numerous dietary and environmental chemicals can induce or enhance fragile site breakage (Table 1.4). Caffeine and ethanol are two dietary agents that can significantly increase the rate of fragile site breakage. Caffeine, an inhibitor of phosphoinositide 3-kinase-related kinases, including ATR and ATM, significantly enhances fragile site breakage in combination with APH or fluorodeoxyuridine (FUdR) (95,96). Similarly, ethanol enhances APH-induced fragile site breakage (97). Cells from chronic alcohol users have an increased frequency of fragile site breakage compared to non-drinkers, which suggests that long-term alcohol use can induce fragile site expression (98).

Exposure to cigarette smoke, pesticides, or hypoxic conditions can also increase susceptibility to fragile site breakage. Peripheral blood lymphocytes from cigarette smokers have significantly greater levels of APH-induced fragile site breakage compared to non-smokers (99). Interestingly, peripheral blood lymphocytes from non-smokers and patients with small cell lung cancer who have stopped smoking both display lower levels of fragile site breakage following APH treatment than active smokers, suggesting this risk factor is reversible (100). Individuals exposed to pesticides through occupational work, such as pesticide sprayers or flower collectors working in greenhouses, have increased levels of APH-induced fragile site breakage in their blood lymphocytes compared to control individuals, and these results persisted even a year later (101-103). Furthermore, the pesticide-induced breakage was located within fragile sites containing breakpoints observed in non-Hodgkin’s lymphoma and leukemia; consistent with this finding, increasing rates of hematopoietic cancers have been linked to pesticide exposure (104,105). Hypoxic conditions also enhance fragile site breakage with or without APH treatment in CMA32 Chinese hamster cells (106).

The dietary and environmental agents carbon tetrachloride, dimethyl sulfate, benzene, and diethylnitrosamine (DEN) can all induce fragile site breakage (107). Carbon tetrachloride is used in refrigerants, pesticides, and industrial and manufacturing processes (108). Formerly, this compound was used in cleaning fluids and fire extinguishers, but it has since been banned from home use due to its carcinogenic properties. An increased risk of non-Hodgkin’s lymphoma has been reported among individuals working in manufacturing, industrial, and laboratory jobs in which they are exposed to carbon tetrachloride. Dimethyl sulfate is used to manufacture organic chemicals including pesticides, dyes, drugs, perfumes, fabric softeners, and adhesives (108). Dimethyl sulfate was also formerly used as a chemical weapon. Occupational exposure to this compound has been linked to cancers of the eye, bronchus, and lung.

Benzene is a known human carcinogen found in gasoline, pesticides, cigarette smoke, and industrial emissions, and is a common contaminant detected in food and water (108). According to the U.S. Department of Health and Human Services, half of the national exposure to benzene comes from cigarette smoke, a known fragile site-enhancing agent. The optimal concentration of benzene for inducing fragile sites (500 μg/mL) is relevant to everyday exposure levels. Smoke from a smoldering cigarette yields 345-653 μg of benzene, and the average exposure for one hour of driving or riding in a car is about 40 μg benzene (even greater in highly congested areas) (108). Therefore, the general population is exposed to a level of benzene comparable to the amount able to induce fragile sites, especially under long-term exposure.

Exposure to benzene due to occupation or geographic location has been linked to leukemia. Recently, Pellegriti et al. observed a significantly higher prevalence of PTC in the Sicilian province of Catania, located near the Mount Etna volcano, compared to other provinces, which could not be explained by industrial pollution or mild iodine deficiencies (109). The authors suggest that exposure to environmental factors associated with the Mount Etna volcano may be responsible. Benzene can form following the incomplete combustion of organic materials in volcanoes and forest fires (110), and has been detected in lava gases emitted from Mount Etna (111), suggesting fragile site breakage due to benzene exposure may contribute to the increasing incidence of PTC in Catania.

DEN is found in pesticides, cigarette smoke, industrial pollution, drinking water, and foods and beverages (108). DEN was previously used as a gasoline and lubricant additive, antioxidant, stabilizer in plastics, and in other manufacturing processes. Although there are no studies relating exposure to DEN with cancer susceptibility in humans, many studies in laboratory animals have shown that DEN exposure can result in the development of various tumors (108).

Atenolol is a common β-blocker used to treat hypertension, and peripheral blood lymphocytes from hypertensive patients taking this drug exhibit higher levels of chromatid and chromosomal breaks than normal individuals not taking atenolol; these breaks were preferentially located at fragile sites (112). Furthermore, blood lymphocytes from patients taking atenolol have significantly more micronuclei than those from normal individuals (113). Although antihypertensive drugs have been evaluated for carcinogenic effects, studies performed to date may not be complete enough to rule out these drugs as potential cancer-causing agents (114). Due to the prevalent usage of antihypertensive drugs and the ability of atenolol to induce fragile site breakage, usage of such drugs may be an additional risk factor for cancer development, and extensive investigations are needed.

The American Cancer Society 2012 report indicated that the incidence of thyroid cancer among women is three times higher than among men (77). Thyroid cancer is the fastest-growing cancer and the sixth most common among women. Furthermore, the risk of thyroid cancer peaks earlier for women, where most women are diagnosed in the fourth and fifth decades of life compared to the sixth and seventh for men. Despite this dramatic increase and strong disparity between women and men, there has been no explanation for these observations. However, various fragile site-inducing factors may contribute to the difference in PTC incidence in women versus men. One possible explanation for this gender disparity is hormonal differences between men and women. The number of chromosomal breaks and sister chromatid exchanges is elevated in women who are pregnant or taking oral contraceptives (115). Variation in the frequency of APH-induced common fragile site breakage was also observed in women during different times of the menstrual cycle, with a significant increase during the luteal phase when progesterone and estradiol levels are at their highest (116). Therefore, increased usage of hormonal birth control among reproductive-aged women (117) may be one contributing factor. Additional trends that may increase a woman’s exposure to external fragile site-inducing agents include an increasing presence of women in the work force (118), increased alcohol consumption among young women (119), and a minimal decline in cigarette smoking (120). Together these factors may explain some of the increase in PTC incidence among women over the past several decades and further implicate fragile sites in this process.

Table 1.4: External fragile site-inducing/enhancing agents

Agents | Applications | References

Dietary and Environmental
atenolol | hypertension drug | (112)
benzene | cigarette smoke, gasoline, pesticides, food, water | (107)
caffeine | dietary agent | (95,96)
carbon tetrachloride | refrigerants, pesticides | (107)
cigarette smoke | dietary and environmental agent | (99,100)
diethylnitrosamine | cigarette smoke, pesticides, food, beverage | (107)
dimethyl sulfate | dyes, drugs, perfumes, pesticides | (107)
ethanol | dietary agent | (97,98)
hypoxia | low oxygen, tumor microenvironment | (106)
pesticides | dietary and environmental agent | (101-103)

Chemotherapeutics
5-azacytidine | myelodysplastic syndrome, leukemia | (107)
actinomycin D | sarcoma, Wilms' tumor, germ cell, testicular, melanoma, neuroblastoma, retinoblastoma | (107)
bleomycin | squamous cell, melanoma, sarcoma, testicular, Hodgkin’s and non-Hodgkin’s lymphoma | (107)
busulfan | chronic myelogenous leukemia | (107)
camptothecin | colon, rectal | (121)
chlorambucil | chronic lymphocytic leukemia, Hodgkin’s and non-Hodgkin’s lymphoma, breast, ovarian, testicular | (107)
cytosine arabinoside | leukemia, lymphoma | (107)
floxuridine | colon, kidney, stomach | (107)
methotrexate | breast, head and neck, lung, stomach, esophageal, sarcoma, non-Hodgkin’s lymphoma, acute lymphoblastic leukemia | (107)

Chemotherapeutic Agents

Several chemotherapeutic agents are known to induce fragile site breakage, including actinomycin D, bleomycin, busulfan, camptothecin, chlorambucil, cytosine arabinoside (cytarabine), 5-azacytidine, floxuridine, and methotrexate (Table 1.4) (107,121). The fragile site breakage observed following treatment with these chemotherapeutic agents was increased 3- to 8-fold by the addition of caffeine. These chemotherapeutics are commonly used to treat cancer, including leukemias and lymphomas (Table 1.4). Aside from killing cancer cells, collateral effects of chemotherapy drugs include mutations in healthy cells that could result in a therapy-related second primary tumor. The plasma concentrations of cytarabine achieved at treatment dosages (122) are comparable to (~10 μM) or higher than (depending on the regimen) the concentration that causes fragile site induction (10 μM).

The rate of second primary cancers is on the rise, and they now account for one in six of all newly diagnosed cancers in the United States (123). Thyroid cancer has been observed as a secondary cancer following treatment for various cancers, including Hodgkin’s lymphoma (124,125). Patients with testicular cancer treated with chemotherapy and/or radiation also have a significantly elevated risk for developing thyroid cancer (126). PTC has been observed in patients treated for osteosarcoma (127-132), including children treated with chemotherapeutic agents (some of which are known to induce fragile sites, including bleomycin, actinomycin D, and methotrexate), but not radiation (132). PTC was documented as a secondary malignancy following treatment of a pediatric rhabdomyosarcoma patient with only chemotherapeutic drugs, including actinomycin D (133). PTC has also been observed as a second primary cancer in children treated with chemotherapy alone for acute lymphoblastic leukemia, neuroblastoma, and Ewing’s sarcoma (134-137).

Many chemotherapeutic agents target DNA topoisomerases, acting as enzymatic poisons and resulting in an accumulation of double-strand DNA breaks in cells. DNA topoisomerase I activity is essential for common fragile site breakage (138,139). Camptothecin, one of the fragile site-inducing chemotherapeutic agents, is a DNA topoisomerase I poison. The remaining fragile site-inducing chemotherapeutic agents perturb DNA replication and/or RNA transcription in cells, the mode by which many fragile site-inducing chemicals lead to chromosomal breakage. Besides the chemotherapeutic drugs already shown to induce fragile sites, many others work in a similar manner by inhibiting DNA topoisomerases and perturbing DNA replication or RNA transcription, suggesting additional drugs may have the ability to induce fragile site breakage.

Together, chemotherapeutic, dietary, and environmental agents represent a diverse spectrum of exposures that can increase an individual’s risk of fragile site breakage. Long-term exposure or exposure to significant doses of any of these agents, or a combination of different agents, can increase a person’s susceptibility to cancer development, including PTC.

CHAPTER II: DNA BREAKS AT FRAGILE SITES GENERATE ONCOGENIC RET/PTC REARRANGEMENTS IN HUMAN THYROID CELLS

Manoj Gandhi1, Laura W. Dillon2, Sreemanta Pramanik2,3, Yuri E. Nikiforov1 and Yuh-Hwa Wang2

1Department of Pathology and Laboratory Medicine, University of Pittsburgh, Pittsburgh, PA 15261, USA. 2Department of Biochemistry, Wake Forest University School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157-1016, USA. 3Current address: Environmental Biotechnology Division, National Environmental Engineering Research Institute, Nehru Marg, Nagpur-440020, India.

The following paper was published in Oncogene in April 2010, with modifications in number of Figures and References. Differences in organization reflect the requirements of the journal.

ABSTRACT

Human chromosomal fragile sites are regions of the genome that are prone to DNA breakage, and are classified as common or rare, depending on their frequency in the population. Common fragile sites frequently coincide with the location of genes involved in carcinogenic chromosomal translocations, suggesting their role in cancer formation. However, there has been no direct evidence linking breakage at fragile sites to the formation of a cancer-specific translocation. Here, we studied the involvement of fragile sites in the formation of RET/PTC rearrangements, which are frequently found in papillary thyroid carcinoma (PTC). These rearrangements are commonly associated with radiation exposure; however, most of the tumors found in adults are not linked to radiation. In this study, we provide structural and biochemical evidence that the RET, CCDC6, and NCOA4 genes, which participate in two major types of RET/PTC rearrangements, are located in common fragile sites FRA10C and FRA10G, and undergo DNA breakage after exposure to fragile site-inducing chemicals. Moreover, exposure of human thyroid cells to these chemicals results in the formation of cancer-specific RET/PTC rearrangements. These results provide direct evidence for the involvement of chromosomal fragile sites in the generation of cancer-specific rearrangements in human cells.

INTRODUCTION

Cancer development can be initiated by the accumulation of various genetic abnormalities that lead to the dysregulation of genes involved in various cellular processes. Chromosomal translocations are one such abnormality commonly seen in cancer cells. Translocations result in the rearrangement of genetic material, which typically leads to the expression of an oncogenic fusion protein contributing to the neoplastic process (140). To date, there are a total of 705 known recurrent translocations in cancer that involve 459 different gene pairs, and are present in many different types of cancer (141).

In all translocations, the development of breaks in DNA strands must occur. There are various ways in which a cell can acquire these breaks, such as ionizing radiation (142). DNA breaks are commonly repaired by two pathways, homologous recombination or non-homologous end joining (143), but dysfunction of these pathways can contribute to the formation of chromosomal translocations (140). Alternatively, an overwhelming accumulation of DNA breaks could prevent these normally functioning pathways from eliminating all of the breaks, leading to translocation events.

Chromosomal fragile sites are known to contribute to the formation of DNA breaks and are hotspots for sister chromatid exchange (58), chromosomal translocations, deletions (144), and viral integrations (57). Fragile sites are non-random specific loci which are stable under normal conditions, but under certain culture conditions can form visible gaps or breaks in metaphase chromosomes (8). Depending on their frequency in the population, fragile sites can be divided into two classes: common and rare. Common fragile sites, which constitute the majority of the two classes, are present in all individuals, and are a normal component of chromosome structure (5). Common fragile sites can be further classified based on their mode of induction, as not all sites are induced by the same compounds, or to the same extent. Aphidicolin (APH) induces expression of the majority of common fragile sites. Other known fragile site-inducing conditions include the addition of 5-bromodeoxyuridine (BrdU), 5-azacytidine, and distamycin A, and the removal of folic acid (11). Also, certain dietary and environmental factors have been shown to contribute to fragile site expression, including caffeine (96), ethanol (97), hypoxia (106), and pesticides (101).

Together, genetic influences on fragile site instability, along with external influences from chemical, dietary and environmental factors, suggest a possible role for fragile sites in sporadic cancer formation.

Fragile sites are also known to be late replicating regions of the genome. Delayed DNA replication has been observed in all fragile sites examined to date (17,19-21,66,145,146). Delayed replication at fragile sites is believed to result from the high propensity of their DNA sequences to form stable secondary DNA structures (13-15,24-26,147). Difficulty in passage of the replication fork, caused by secondary DNA structures formed within the fragile DNA regions, could result in stalled replication. ATR, a major replication checkpoint protein, is crucial for maintaining fragile site stability (38), and its inhibition by 2-aminopurine (2-AP) in conjunction with fragile site-inducing chemicals significantly increases common fragile site expression (37). Therefore, it is suggested that DNA breakage at fragile sites results from delayed replication forks that escape the ATR-mediated checkpoint pathway (8).

Many studies point towards the association between fragile sites and formation of cancer-specific translocations (56). In a comprehensive survey, we found that 52% of all known recurrent simple chromosomal translocations have at least one gene located within a fragile site, strongly suggesting a potential role for fragile sites in the initiation of translocation events (70). Also, Glover and colleagues found that upon addition of APH, submicroscopic deletions within FHIT, located in the fragile site FRA3B and associated with various human cancers, were detected and resembled those seen in cancer cells (71). However, there has been no direct evidence linking breakage at fragile sites to the formation of cancer-causing chromosomal aberrations.

Genes participating in the two main types of RET/PTC rearrangements, RET/PTC1 and RET/PTC3, have been mapped to known fragile sites (70). RET/PTC rearrangements are commonly found in papillary thyroid carcinomas (PTC), and in all cases result in the fusion of the tyrosine kinase domain of RET to the 5’ portion of various unrelated genes (148). In the case of RET/PTC1 and RET/PTC3, RET is fused with CCDC6 and NCOA4, respectively (87). These rearrangements result in the expression of a fusion protein possessing constitutive tyrosine kinase activity, which is tumorigenic in thyroid follicular cells (148). Both genes involved in the RET/PTC3 rearrangement, RET and NCOA4, are located at 10q11.2 within fragile site FRA10G, a common fragile site induced by APH. The CCDC6 gene, involved in RET/PTC1, is located at 10q21.2 within fragile site FRA10C, a common fragile site induced by BrdU. Major breakpoint cluster regions for these genes have been identified, and are located within intron 11 of RET, intron 5 of NCOA4, and intron 1 of CCDC6 (149,150). RET/PTC rearrangements are known to be associated with radiation exposure, although most adult tumors are sporadic and arise in patients with no history of radiation exposure (86), implying that other mechanisms are likely to be responsible for DNA breakage and RET/PTC formation in most tumors. Clinical studies have shown that RET/PTC3 rearrangements are common in radiation-induced tumors (88,89,151). In contrast, sporadic PTC tumors have shown a greater prevalence of RET/PTC1 rearrangements (93), which account for 70% of all RET/PTC tumor types (86).

Because the participating genes co-localize with fragile sites and there is a well-established association between RET/PTC rearrangements and DNA damage induced by ionizing radiation, these rearrangements offer an excellent model to examine directly the role of fragile sites in the formation of cancer-specific chromosomal translocations.

In this study, we demonstrate that fragile site-inducing chemicals can create DNA breaks within the RET/PTC partner genes and ultimately lead to the formation of RET/PTC rearrangements, offering direct evidence for the role of fragile sites in cancer-specific translocations.

RESULTS

Chromosomal disruptions in RET/PTC gene partners upon fragile site induction

To examine whether chromosomal regions involved in RET/PTC rearrangements are part of fragile sites, HTori-3 human thyroid cells were exposed to APH, APH+2-AP, and BrdU+2-AP. Metaphase spreads of cultured HTori-3 cells were hybridized with fluorescently labeled BAC probes covering the entire genomic sequence of RET, NCOA4 and CCDC6 (Figure 2.1). Without exposure to fragile site-inducing chemicals, metaphase chromosomes of HTori-3 cells appeared normal with smooth contours and intact RET signal (Figure 2.1a). With exposure to fragile site-inducing chemicals, the morphology of metaphase chromosomes appeared distorted with irregular surfaces and loss of continuity. After treatment with 0.4 μM APH for 24 hours, RET was disrupted in 6.00 ± 0.35% of chromosomes (Figure 2.1b; Table 2.1), NCOA4 was disrupted in 0.62% of chromosomes, and no breaks were identified in the CCDC6 gene (Table 2.1). The appearance of breaks in RET but not in CCDC6 is consistent with the characteristics of the fragile sites in which each of these genes is located (RET at APH-induced FRA10G, and CCDC6 at BrdU-induced FRA10C). The frequency of breakage observed in RET is in agreement with previously published levels at FRA10G obtained using Giemsa-stained chromosomes, which averaged 4.6% following treatment of human skin fibroblasts with 0.2 μM APH for 26 hours (152). After addition of APH and 2-AP, 5.93 ± 0.52% of chromosomes showed breaks in RET, 0.63 ± 0.08% showed breaks in NCOA4, and 0.98 ± 0.58% showed breaks in CCDC6. 2-AP is a general inhibitor of ATR kinase and is known to increase fragile site expression with or without the addition of replication inhibitors like APH (37). While breakage in RET and NCOA4 did not change significantly, breakage was now seen in CCDC6, consistent with 2-AP action. Treatment with BrdU and 2-AP resulted in 2.72 ± 0.78% of chromosomes showing breaks in CCDC6 (Figure 2.1c), whereas RET and NCOA4 were each disrupted in 0.60 ± 0.08% of chromosomes (Table 2.1). Increased breakage in CCDC6 is consistent with its fragile site mode of induction. Also, the level of breakage at CCDC6 is comparable with previous reports at FRA10C, with DNA breakage ranging from 4-20% following treatment of human blood lymphocytes from ten patients with 50 mg/L BrdU for 4-6 hours (153). The breakage frequency seen in RET and NCOA4 with BrdU and 2-AP treatment is similar to that observed in CCDC6 after treatment with APH and 2-AP, showing consistency with 2-AP-induced breakage. In concert, these results demonstrate directly that chemicals known to result in fragile site breakage cause DNA breaks within genomic sequences of genes participating in RET/PTC rearrangements.

Figure 2.1: Fluorescence in situ hybridization on metaphase chromosomes of HTori-3 cells after treatment with fragile site-inducing chemicals. (a) Negative control without treatment showing smooth chromosomes with intact RET (red) signal. (b) Exposure to APH resulting in irregular chromosome contours and one RET signal (red) showing split in the signal whereas four other RET signals are intact. Centromeric probe for chromosome 10 is labeled in green. (c) Exposure to BrdU+2-AP resulting in the disruption of CCDC6 (green) while NCOA4 is intact (red). (d) Exposure to APH+2-AP+BrdU resulting in split in RET (red).

Table 2.1: Percentage of chromosomes showing disruption of RET, NCOA4, and CCDC6 after exposure to fragile site-inducing agents

Gene     APH            APH+2-AP       BrdU+2-AP
RET      6.00 ± 0.35    5.93 ± 0.52    0.60 ± 0.08
NCOA4    0.62           0.63 ± 0.08    0.60 ± 0.08
CCDC6    0              0.98 ± 0.58    2.72 ± 0.78

Induction of DNA breaks in intron 11 of the RET gene by APH treatment

All RET/PTC rearrangements involve the fusion of the tyrosine kinase domain of RET, and the major breakpoint cluster region identified in tumor cells is located within intron 11 (149). While fluorescence in situ hybridization (FISH) experiments allowed us to detect breaks occurring within the RET gene sequence, whether or not the breaks are located in intron 11 was next examined using ligation-mediated PCR (LM-PCR). HTori-3 cells were treated with APH for 24 hours, and the genomic DNAs from both the treated and untreated cells were subjected to primer extension with biotinylated primers that are specific to the regions of interest (Materials & Methods; Figure 2.2). The synthesis reaction terminated at a DNA break to produce a duplex with a blunt end, and the duplex was ligated to a linker. The linker-attached DNAs were then isolated by streptavidin beads, amplified by two rounds of PCR, and visualized by agarose gel electrophoresis (Figure 2.3). Each lane on the agarose gel represents the DNA breaks isolated from approximately 4000 cells, and each band observed on the gel corresponds to a break found within the region of interest. DNA breaks were observed within intron 11 of RET after treatment with APH (Figure 2.3a) with a frequency of 0.024 ± 0.015 breaks per 100 cells, which was significantly higher than that in the untreated cells (0.004 ± 0.009/100 cells, p = 0.010) (Figure 2.3b). DNA samples from lanes 1 and 3-6 in Figure 2.3a (marked with asterisks) were sequenced to determine the location of the induced breakpoints in the RET gene (Figure 2.4). DNA sequencing revealed the breakpoints to be located within intron 11, and at a distance from exon 12 that is consistent with the size of the PCR product observed on the agarose gel in Figure 2.3a. The locations of these breakpoints were compared to the location of known breakpoints found in PTC tumors containing RET/PTC rearrangements (Figure 2.4) (154,155). Each induced breakpoint was found to be located near a human tumor breakpoint, with distances ranging from 2-15 base pairs. It is important to note that these induced breakpoints were detected prior to a rearrangement event, while the breakpoints found in tumors have been identified after a rearrangement event has occurred. In most cases, small modifications, such as deletions and insertions of 1-18 nucleotides, have been observed surrounding the fusion points in human tumors. These results confirm that the exposure of thyroid cells to APH induces the formation of DNA breaks within the major breakpoint cluster region found in the RET gene, and these induced breakpoints are located close to known breakpoints found in human tumors.

Figure 2.2: DNA breaksite mapping by LM-PCR. Genomic DNA was isolated from HTori-3 cells with or without APH treatment, and was denatured and then annealed to a biotinylated primer specific for the region of interest. Primer extension was carried out with DNA Sequenase, and the reaction terminates at a DNA break. DNA breaks were isolated through ligation of the LL3/LP2 linker, and recovered by streptavidin beads. Amplification of these DNA breaks was achieved by nested PCR of the extension-ligation products. The final PCR products were resolved by agarose gel electrophoresis. Each band observed on the gel corresponds to a break found within the region of interest. The exact breakpoint sites were determined by DNA sequencing of the PCR products, and by identifying the nucleotide adjacent to the LL3/LP2 linker sequence.

Figure 2.3: LM-PCR detection of breaks formed in HTori-3 cells after treatment with APH. LM-PCR detection of DNA breaks formed in HTori-3 cells at intron 11 of RET (a), the fragile site FRA3B (c), and the non-fragile 12p12.3 region (d) after treatment with APH. The same reaction was carried out as in (a) for intron 11 of RET, but using DNA from cells without APH treatment (b). Last lane of each gel is a 100 bp molecular weight ladder. Bands below 100 bp correspond to primer dimers. Asterisks mark DNA fragments that were sequenced, and results are shown in Figure 2.4.

Figure 2.4: Location of breakpoints within intron 11 of RET induced by treatment with APH. (a) DNA samples from lanes 1 and 3-6 in Figure 2.3a (marked with asterisks) were sequenced, and six breakpoints were identified and are indicated by black arrowheads. The locations of known breakpoints found in tumors containing RET/PTC rearrangements are indicated by grey arrowheads (154,155). The grey arrow corresponds to the RET-7 primer with a dual biotin label (grey circles), which is annealed to exon 12 of the RET gene. The black solid and dashed arrows correspond to the RET-R1b and RET-R1 nested PCR primers, respectively. The sequence of intron 11 is italicized. (b) The distance of each induced breakpoint from the 5’ end of the RET-R1b primer and from the nearest patient tumor breakpoint is listed.

Figure 2.5: LM-PCR detection of breaks formed in HTori-3 cells after treatment with APH. LM-PCR detection of DNA breaks formed in HTori-3 cells at intron 11 of RET (a) and the non-fragile exon 1 of G6PD (b) after treatment with APH. Last lane of each gel is a 100 bp molecular weight ladder.

DNA breaks were also examined within FRA3B after APH treatment. FRA3B is the most inducible fragile site in the human genome and contains FHIT, a gene involved in several cancers, where microscopic deletions have been observed after treatment with APH (71,146). Intron 4 of the FHIT gene, a major region of high instability in various tumors and APH-treated cells (156,157), was examined here for DNA breaks. DNA breaks were detected within intron 4 of FHIT upon APH treatment (Figure 2.3c) at a frequency of 0.036 ± 0.020 breaks per 100 cells, confirming that APH treatment can indeed induce fragile site breakage. A greater number of breaks was observed within FRA3B than within RET, which is consistent with FRA3B being the most active fragile site in the genome. A non-fragile region, 12p12.3 (14), and the G6PD gene, located within FRAXF (a rare folate-sensitive fragile site not induced by APH), were also examined after treatment with APH. In contrast to RET and FRA3B, no DNA breaks were observed within the 12p12.3 region (Figure 2.3d) or in exon 1 of G6PD (Figure 2.5). The absence of breaks in 12p12.3 and G6PD suggests that the DNA breaks observed within RET and FRA3B after exposure to fragile site-inducing chemicals are due to their fragile nature in response to APH.
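As a rough check on scale, the break frequencies reported above can be converted into an expected number of bands per gel lane, given that each lane represents approximately 4000 cells. The Python sketch below uses only the frequencies quoted in the text; the variable and function names are illustrative, and it assumes breaks are spread evenly across lanes, which the text does not state.

```python
# Illustrative arithmetic only; names are not taken from the original study.
CELLS_PER_LANE = 4000  # each lane represents approximately 4000 cells

# Reported mean break frequencies, in breaks per 100 cells.
breaks_per_100_cells = {
    "RET intron 11, APH": 0.024,
    "RET intron 11, untreated": 0.004,
    "FRA3B (FHIT intron 4), APH": 0.036,
}

def expected_bands_per_lane(freq_per_100: float, cells: int = CELLS_PER_LANE) -> float:
    """Convert breaks per 100 cells into expected breaks (bands) per lane."""
    return freq_per_100 / 100 * cells

bands = {region: expected_bands_per_lane(f) for region, f in breaks_per_100_cells.items()}
for region, n in bands.items():
    print(f"{region}: about {n:.2f} bands per lane")
```

Under these assumptions, an APH-treated lane would be expected to show roughly one RET band, while most untreated lanes would be expected to show none.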

Generation of RET/PTC rearrangements after treatment with fragile site-inducing chemicals

To test for the induction of RET/PTC rearrangement after exposure to fragile site-inducing chemicals, HTori-3 cells were treated with APH and 2-AP for 24 hours with the addition of BrdU for the last 5 hours. These treatment conditions were chosen because they have been previously established to be optimal for the induction of fragile sites FRA10C and FRA10G (152,153). To confirm breakage in the genes after exposure, metaphase spreads were made and chromosomes were scored for disruption of the probe (Figure 2.1d). The breakage frequencies in the probes for RET, NCOA4 and CCDC6 were 7.47%, 1.15% and 2.87%, respectively. The mRNA was then isolated and used in RT-PCR for detection of RET/PTC1 and RET/PTC3 formation. To ensure that a cell with the rearrangement would be detected, 1 x 10^6 cells in a 10 cm culture dish were divided among 30 culture dishes 24 hours post-exposure. Therefore, each dish received approximately 3 x 10^4 cells, and if a dish contained only one cell with RET/PTC, it would constitute 1 part in 3 x 10^4, a fraction within the limit of detection (158). No RET/PTC rearrangement was detected without any treatment in five independent experiments (Figure 2.6), indicating an extremely low level of spontaneous generation of RET/PTC in this human cell line and the absence of contamination. Similarly, no RET/PTC rearrangement was detected using the same experimental approach in HTori-3 cells in four independent experiments in a study reported by Caudill et al (158). Exposure to a combination of APH, 2-AP and BrdU resulted in the generation of RET/PTC1, with 5 total events identified in 5 independent experiments, each assaying 10^6 cells (incidence of 2, 1, 2, 0, 0 events per 10^6 cells) (Figure 2.6b). However, no RET/PTC3 rearrangements were identified. Representative RT-PCR blots are shown in Figure 2.6a. Statistical analysis revealed a significant difference in the incidence of RET/PTC1 induction between untreated cells (zero events) and cells treated with fragile site-inducing agents (five total events) (p = 0.027). These results demonstrate that the exposure of thyroid cells to fragile site-inducing chemicals can lead to the formation of a carcinogenic RET/PTC rearrangement.

Figure 2.6: Detection of RET/PTC rearrangements in HTori-3 cells after treatment with fragile site-inducing chemicals. (a) Detection of RET/PTC rearrangements in representative RT-PCR experiment after exposure to APH+2-AP+BrdU. PC, Positive control. (b) Number of rearrangement events detected in untreated cells and cells exposed to APH+2-AP+BrdU. Five independent experiments were carried out for each treatment, and each experiment analyzed 10^6 cells.
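The dilution reasoning used for the RT-PCR assay (one million treated cells divided among 30 dishes) can be written out as simple arithmetic. The sketch below is only an illustration with assumed variable names; it shows that an even split gives about 3.3 x 10^4 cells per dish, which the text rounds to 3 x 10^4.

```python
# Illustrative arithmetic; variable names are assumptions, not from the source.
total_cells = 1_000_000  # cells in the treated 10 cm dish (1 x 10^6)
n_dishes = 30            # dishes the culture was divided among

# Assumes an even split of cells across dishes.
cells_per_dish = total_cells / n_dishes

# If exactly one cell in a dish carries RET/PTC, this is its share of that dish.
fraction_single_positive = 1 / cells_per_dish

print(f"cells per dish: {cells_per_dish:.0f}")
print(f"one positive cell = 1 part in {cells_per_dish:.1e}")
```

The relevant comparison is between this fraction and the detection limit of the RT-PCR assay cited in the text (158).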

DISCUSSION

Chromosomal rearrangements contribute to the development of many types of human tumors. Therefore, it is critical to understand the mechanisms of chromosomal rearrangements in cancer cells. Here, we demonstrated that DNA breakage at fragile sites FRA10C and FRA10G under fragile site-inducing conditions initiates and leads to the generation of RET/PTC1 rearrangement, which is known to contribute to PTC development. To our knowledge, this is the first demonstration that a cancer-specific rearrangement can be produced in human cells by inducing DNA breaks at fragile sites.

Interestingly, only RET/PTC1 rearrangements were observed, and no RET/PTC3 rearrangements were identified. While breakage was seen within NCOA4, the RET/PTC3 partner gene, the frequency of breakage was lower when compared to RET and CCDC6. NCOA4 breakage remained relatively constant with each combination of fragile site-inducing chemicals, and was about 10-fold lower than the breakage observed within RET, and about 4.5-fold below the level found in CCDC6. The lower incidence of breakage within NCOA4 could contribute to the lack of RET/PTC3 rearrangement events. Also, clinical studies have revealed that RET/PTC3 rearrangements are frequent in radiation-induced tumors (88,89,151), while RET/PTC1 rearrangements are more commonly seen in sporadic tumors (93). Our observation of RET/PTC1 rearrangement, but not RET/PTC3 rearrangement, generated by fragile site induction, further supports the idea that sporadic PTC tumors may result from breakage at fragile sites. It is known that specific environmental and food toxins (such as caffeine, alcohol, tobacco) (96,97), and other stress factors (such as hypoxia) (106) can induce fragile sites. Therefore, our results suggest that these exogenous factors may contribute to the occurrence of chromosomal rearrangements, and thus to cancer initiation in human populations, by a mechanism of DNA breakage at fragile sites.
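The fold differences in NCOA4 breakage quoted in this discussion can be checked against the mean values in Table 2.1. The short Python sketch below does this; the dictionary layout is illustrative, and the choice of which treatment column to compare for each gene is an assumption, since the text does not say which columns were used.

```python
# Mean percentage of chromosomes with a disrupted probe, from Table 2.1.
table_2_1 = {
    "APH":       {"RET": 6.00, "NCOA4": 0.62, "CCDC6": 0.00},
    "APH+2-AP":  {"RET": 5.93, "NCOA4": 0.63, "CCDC6": 0.98},
    "BrdU+2-AP": {"RET": 0.60, "NCOA4": 0.60, "CCDC6": 2.72},
}

# Assumption: each gene is compared under the treatment matching its fragile
# site (APH for RET at FRA10G, BrdU+2-AP for CCDC6 at FRA10C).
ret_vs_ncoa4 = table_2_1["APH"]["RET"] / table_2_1["APH"]["NCOA4"]
ccdc6_vs_ncoa4 = table_2_1["BrdU+2-AP"]["CCDC6"] / table_2_1["BrdU+2-AP"]["NCOA4"]

print(f"RET / NCOA4 under APH: {ret_vs_ncoa4:.1f}-fold")
print(f"CCDC6 / NCOA4 under BrdU+2-AP: {ccdc6_vs_ncoa4:.1f}-fold")
```

With this choice of columns, the ratios come out just under 10 and just over 4.5, in line with the figures quoted in the text.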

To demonstrate that fragile site-inducing chemicals can cause DNA breaks at RET/PTC participating genes, FISH analysis of chromosome 10 and LM-PCR analysis of the RET gene at the nucleotide level were performed. Using FISH, we showed that upon exposure of human thyroid cells to fragile site-inducing chemicals, chromosomal breaks are formed within the RET and CCDC6 genes. RET and CCDC6 are located within APH-induced and BrdU-induced fragile sites, respectively, and display breakage only after the addition of APH or BrdU, accordingly. These results not only demonstrate that fragility is indeed present within the genes involved in RET/PTC rearrangements, but also underline the specificity of fragile site induction that was observed in these regions. While 2-AP addition is known to increase overall chromosomal breakage and fragile site FRA3B expression (37), no significant increase in breakage at RET and NCOA4 genes was noted in HTori-3 cells, indicating its weaker influence on the FRA10G site. Furthermore, the addition of 2-AP in combination with APH resulted in the appearance of breaks within CCDC6, while its combination with BrdU resulted in breaks within RET and NCOA4. This nonspecific effect of 2-AP on induction of DNA breaks at fragile sites is in agreement with its ability to inhibit ATR protein, which plays a key role in maintaining fragile site stability.

The DNA breaks generated in RET after exposure to APH were confirmed to be located within intron 11, which is the breakpoint cluster region identified in thyroid tumors, while untreated cells showed little to no breaks. That these breaks reflect fragile site behavior is further supported by comparing break formation within the FRA3B, 12p12.3 and G6PD regions. FRA3B, the most inducible fragile site in the human genome (71,146), displayed DNA breaks after treatment with APH (Figure 2.3c), while 12p12.3, a non-fragile region, and the G6PD gene, located within a rare folate-sensitive fragile site, showed no DNA breakage with the same treatment (Figure 2.3d and 2.5b). Together with cytogenetic analysis, these results demonstrate that fragile site-inducing chemicals can generate breaks within RET and CCDC6 genes, which could result in the formation of cancer-causing RET/PTC1 rearrangement.

The induction rate of RET/PTC rearrangement by fragile site-inducing chemicals was four orders of magnitude lower than the frequency of chromosomal breaks observed in RET and CCDC6 genes. DNA breaks, a serious threat to genome stability and cell viability, can trigger DNA repair pathways, including homologous recombination or non-homologous end joining (143). The action of these pathways ensures proper repair of DNA breaks, and prevents the deleterious consequences of such breakage. However, a small number of DNA breaks that escape the repair pathways will ultimately result in large-scale chromosomal changes, such as RET/PTC rearrangement.
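The "four orders of magnitude" comparison can be reproduced from numbers given in the Results: 5 RET/PTC1 events across five experiments of 10^6 cells each, and breakage frequencies of 7.47% (RET) and 2.87% (CCDC6) under APH+2-AP+BrdU. The sketch below is a back-of-the-envelope check with assumed variable names. It compares a per-cell rearrangement rate directly with a per-chromosome breakage frequency, which is a simplification.

```python
import math

events = 5                 # RET/PTC1 events observed in total
cells_assayed = 5 * 10**6  # five experiments, 10^6 cells each
rearrangement_rate = events / cells_assayed  # events per cell

# Fraction of chromosomes with a disrupted probe after APH+2-AP+BrdU.
breakage = {"RET": 7.47 / 100, "CCDC6": 2.87 / 100}

# Orders of magnitude separating breakage frequency from rearrangement rate.
orders = {gene: math.log10(freq / rearrangement_rate) for gene, freq in breakage.items()}

for gene, value in orders.items():
    print(f"{gene}: breakage exceeds rearrangement rate by 10^{value:.1f}")
```

Both values fall between 4 and 5 under these assumptions, consistent with the stated gap of four orders of magnitude.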

This study provides important information about the mechanisms of formation of carcinogenic chromosomal rearrangements in human cells. In addition, it establishes an experimental system that will allow for testing the role of specific environmental substances, dietary toxins, and other stress factors in the generation of chromosomal rearrangements and tumor induction.

MATERIALS AND METHODS

Cell line and culture conditions

The experiments were performed on HTori-3 cells, which are human thyroid epithelial cells transfected with an origin-defective SV40 genome. They are characterized as immortalized, partially transformed, differentiated cells that have three copies of chromosome 10 with intact RET, NCOA4 and CCDC6 loci and preserve the expression of thyroid differentiation markers such as thyroglobulin production and sodium iodide symporter, as we reported previously (158). The cells were purchased from the European Tissue Culture Collection and grown in RPMI 1640 medium (Invitrogen) supplemented with 10% fetal bovine serum.

Fragile site induction

HTori-3 cells (1 x 10^6) were plated in 10-cm culture dishes and 16 h later exposed for 24 h to APH (0.4 μM) or APH and 2-AP (2 mM) (37). When desired, cells were treated with BrdU (50 mg/L) for the last 5 h in addition to 2-AP and/or APH for 24 h. For DNA breaksite detection, 5 x 10^5 cells were plated in 10-cm culture dishes and treated the same as above with 0.4 μM APH.

Metaphase chromosome preparation

HTori-3 cells exposed to various chemicals were treated with 0.1 μg/mL of Colcemid for the last 2 hours before harvesting. Cells were incubated in hypotonic solution (0.075 M KCl), fixed in multiple changes of methanol:acetic acid (3:1) and dropped onto moistened slides in order to obtain metaphase spreads. Slides were aged overnight and pretreated with RNase before proceeding for hybridization.

Probes for FISH

BAC clones RP11-351D16 (RET), RP11-481A12 (NCOA4), RP11-435G3 and RP11-369L1 (CCDC6) were obtained from BAC/PAC Resources, Children's Hospital, Oakland. BAC clone RP11-481A12 containing the NCOA4 gene was subcloned into fosmid vector after cutting with restriction enzymes (Epicentre). A mixture of subcloned probes (SC10, SC19) containing 70 kb of the NCOA4 gene and its flanking regions was used as a probe for NCOA4. The probes were labeled by nick translation using Spectrum Green-dUTP, Spectrum Orange-dUTP or Spectrum Red-dUTP (Vysis Inc.). Hybridization was performed as previously described (159). On average 150 chromosomes were scored for breaks in the RET, NCOA4 and CCDC6 probes for each condition.

DNA breaksite mapping by LM-PCR

To detect DNA breaks within intron 11 of RET induced by APH, a 5’-biotinylated primer, RET-7, corresponding to the 5’ end of RET exon 12 (the grey arrow in Figure 2.4a) was used to extend into intron 11. For the first and second rounds of nested PCR, primers RET-R1b and RET-R1 were used, respectively. To isolate the DNA breaks, a duplex DNA linker LL3/LP2 was used as described (160), as well as the corresponding linker-specific primers LL4 and LL2 (Figure 2.2). For FRA3B, the biotinylated primer FRA3B-20 was used to allow identification of break sites occurring at intron 4 of the FHIT gene, which contains major clusters of APH-induced breakpoints in FRA3B (156,157), and primers FRA3B-9 and FRA3B-23 were used in the first and second rounds of nested PCR, respectively. For detection of breaks within the 12p12.3 region, the biotinylated primer 12p12.3-1 and primers 12p12.3-2 and 12p12.3-3 were used. For detection of breaks within exon 1 of G6PD, the biotinylated primer G6PDF3 and primers G6PDF and G6PDF2 were used. The sequences of the linkers and PCR primers are given in Appendix Figure 1.

DNA breaksite mapping was performed as described (160) with modifications (Figure 2.2). Genomic DNA was isolated from HTori-3 cells with or without APH treatment. Primer extension was performed using 200 ng of DNA at 45°C, and the DNA breaks were isolated through ligation of the LL3/LP2 linker and then recovered using streptavidin beads. Amplification of these DNA breaks was achieved by nested PCR of the extension-ligation products. The final PCR products were resolved by electrophoresis on a 1.3% agarose gel. Each band observed on the gel corresponds to a break isolated within the region of interest. To confirm that the bands observed were located within intron 11 of RET, the PCR products were sequenced. The exact breakpoint sites were determined from the sequencing results by identifying the nucleotide adjacent to the LL3/LP2 linker sequence.
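The final step, reading the breakpoint off as the nucleotide adjacent to the linker, can be sketched in a few lines of Python. This is only an illustration and not the procedure used in the study: the sequences are made-up placeholders (the actual LL3/LP2 and primer sequences are in Appendix Figure 1), the function name is hypothetical, and the read is assumed to run from genomic sequence into the linker on the same strand.

```python
from typing import Optional

def breakpoint_offset(read: str, linker: str) -> Optional[int]:
    """Return the 0-based index in `read` of the nucleotide immediately 5' of
    the linker, or None if the linker is absent or sits at the read start."""
    junction = read.upper().find(linker.upper())
    if junction <= 0:
        return None
    return junction - 1

# Placeholder sequences for illustration only (not the LL3/LP2 linker or RET).
LINKER = "GATTACAGATTACA"
genomic_part = "ACGTTGCAAGCTTCCGGA"
read = genomic_part + LINKER + "TTTT"

offset = breakpoint_offset(read, LINKER)
print("breakpoint index in read:", offset)
print("breakpoint nucleotide:", read[offset])
```

If the read begins at the 5’ end of a nested primer, the returned index plus one is the distance of the breakpoint from that primer end. A read sequenced from the linker side would need to be reverse-complemented first.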

Detection of RET/PTC rearrangements

Upon treatment with fragile site-inducing agents for 24 hours, cells were split into 30 6-cm culture dishes at a density of approximately 3 x 10^4 cells per dish and grown for 3-4 days. To sustain growth for 9 days, cells were transferred to 10-cm culture dishes 4–5 days after seeding into 6-cm dishes. RNA was extracted from each culture dish using Trizol reagent (Invitrogen). Then, mRNA was purified using the Oligotex mRNA minikit (QIAGEN). RT-PCR was performed using a Superscript first strand synthesis system kit and random hexamer priming (Invitrogen). PCR was performed to simultaneously detect RET/PTC1 and RET/PTC3 rearrangement using primers RET/PTC1 forward, RET/PTC3 forward, and common reverse (Appendix Figure 1). As positive controls, cDNA from RET/PTC1-positive TPC-1 cells and a RET/PTC3-positive tumor sample were used. Ten μL of each PCR product was electrophoresed in a 1.5% agarose gel, transferred to a nylon membrane, and hybridized with 32P-labeled oligonucleotide probes specific for RET/PTC1 and RET/PTC3 (Appendix Figure 1). Evidence of RET/PTC rearrangement in the cells from a given flask was scored as one RET/PTC event. All statistical analyses were performed using a one-tailed Student’s t-test.
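To illustrate the statistical comparison, the sketch below applies a one-tailed, two-sample Student's t-test to the RET/PTC1 event counts reported in the Results (2, 1, 2, 0, 0 events in treated cells versus zero events in each of five untreated experiments). The equal-variance (pooled) form is an assumption, since the text does not say which variant was used. The function names are illustrative, and the tail probability uses a closed-form expression that holds only for even degrees of freedom.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

def one_tailed_p_even_df(t, df):
    """Upper-tail P(T > t) for Student's t; closed form valid for even df only."""
    if df % 2 != 0 or df < 2:
        raise ValueError("closed form implemented for even df >= 2 only")
    theta = math.atan(t / math.sqrt(df))
    cos2 = math.cos(theta) ** 2
    term, series = 1.0, 1.0
    for k in range(1, df // 2):
        term *= (2 * k - 1) / (2 * k) * cos2
        series += term
    central = math.sin(theta) * series  # P(-t <= T <= t) for t >= 0
    return (1 - central) / 2

treated = [2, 1, 2, 0, 0]    # RET/PTC1 events per 10^6 cells, APH+2-AP+BrdU
untreated = [0, 0, 0, 0, 0]  # no events in any untreated experiment

t, df = pooled_t(treated, untreated)
p = one_tailed_p_even_df(t, df)
print(f"t = {t:.3f}, df = {df}, one-tailed p = {p:.3f}")
```

Under these assumptions the computed value is close to the reported p = 0.027. A small difference could reflect rounding or a different test variant.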

ACKNOWLEDGEMENTS

This work was supported by the National Cancer Institute (CA113863 to Y.-H. W. and Y. E. N.).

CHAPTER III: DNA TOPOISOMERASES PARTICIPATE IN ONCOGENE RET FRAGILITY

Laura W. Dillon1, Levi C.T. Pierce2, Yuri E. Nikiforov3, Yuh-Hwa Wang1

1Department of Biochemistry, Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157-1016, USA. 2Department of Chemistry and Biochemistry, University of California-San Diego, 9500 Gilman Drive, Urey Hall, La Jolla, CA 92003-0365, USA. 3Department of Pathology and Laboratory Medicine, University of Pittsburgh, Pittsburgh, PA 15261, USA.

The following manuscript is in preparation for submission.

ABSTRACT

Fragile site breakage was previously shown to result in rearrangements of RET resembling those found in papillary thyroid carcinoma. Common fragile sites are specific regions of the genome with a high susceptibility to DNA breakage under conditions that partially inhibit DNA replication, and often coincide with genes deleted, amplified, or rearranged in cancer. While a substantial amount of work has been performed investigating DNA repair and cell cycle checkpoint proteins vital for maintaining stability at fragile sites, little is known about the initial events leading to DNA breakage at these sites. These initial events are investigated here through the detection of aphidicolin (APH)-induced DNA breakage within the RET oncogene, in which 144 APH-induced DNA breakpoints were mapped at the nucleotide level in human thyroid cells within intron 11 of RET, the breakpoint cluster region found in patients. These breakpoints were located at or near DNA topoisomerase I and/or II predicted cleavage sites, as well as at DNA secondary structural features recognized and preferentially cleaved by DNA topoisomerases I and II. Co-treatment of thyroid cells with APH and the topoisomerase catalytic inhibitors betulinic acid and merbarone significantly decreased APH-induced fragile site breakage within RET intron 11, and within common fragile site FRA3B. These data demonstrate that DNA topoisomerases I and II are involved in initiating APH-induced common fragile site breakage at RET, and may act through the recognition of DNA secondary structures formed during perturbed DNA replication.

INTRODUCTION

The oncogene RET rearranges with various genes in a class of translocations known as RET/PTC rearrangements, which result in papillary thyroid carcinoma (PTC) (85). The incidence of thyroid cancer has steadily increased over the past several decades; in the United States alone, cases have doubled in the past decade and nearly tripled since the early 1970s (74,76). Interestingly, almost all of the increase in thyroid cancer can be accounted for by an increase in papillary thyroid carcinoma (76). Approximately 20% of all PTC cases are due to RET/PTC rearrangements (85). The most common form of RET/PTC rearrangement is the RET/PTC1 type, where RET translocates with CCDC6 (86). RET and CCDC6 are both located within common chromosomal fragile sites, FRA10G and FRA10C, respectively. Recently we found that the formation of RET/PTC1 rearrangements can be induced in human thyroid cells through treatment with fragile site-inducing chemicals (161). Exposure to chemicals that can induce fragile sites may contribute to the increasing rates of thyroid cancer.

Chromosomal fragile sites are specific regions of the genome that exhibit gaps or breaks on metaphase chromosomes under conditions that partially inhibit DNA replication (8). These sites often co-localize with regions deleted, amplified, or rearranged in cancer (10). Over half of all known simple recurrent chromosomal translocations in cancer have breakpoints located within at least one fragile site (70). Mutational signatures of some unexplained homozygous deletions in cancer cell lines match those found in fragile site regions (69). Furthermore, fragile site-inducing conditions have been shown to introduce in vivo deletions within the tumor suppressor gene FHIT and to generate oncogenic RET/PTC1 rearrangements, like those observed in patients (71,161).

Although a strong connection between fragile sites and cancer has been established, little is known about the initial events leading to DNA breakage at these sites. Chromosomal fragile sites are traditionally defined cytogenetically as unstained gaps having an average size of 3 Mb. Some common fragile sites have been defined on the molecular level, where DNA breakage is observed over large regions up to several megabases in size (162). Unlike rare fragile sites, which consist of repeated sequence elements that are present in less than 5% of the population and are inherited in a Mendelian manner (4), common fragile sites are present in all individuals and have no known consensus sequence (5). Common fragile sites are further characterized based on the culture conditions known to induce breakage within these regions, the most common being aphidicolin (APH), an inhibitor of DNA polymerases α, δ, and ε (6,7).

Although no consensus sequence is known for common fragile sites, several characteristics are shared among many sites studied to date, including being late-replicating (17-20), located within large genes (8), containing highly flexible AT-rich sequences (13,14), and having the potential to form highly stable DNA secondary structures (14-16). Recently, we found through the study of the human chromosome 10 sequence that APH-induced common fragile sites are predicted to form more stable DNA secondary structures that cluster with greater density than non-fragile regions (Manuscript Submitted). One proposed mechanism for common fragile site breakage is that replication stress results in a long stretch of single-stranded DNA and subsequent formation of stable DNA secondary structures, which can pause polymerase progression, resulting in incomplete replication at fragile sites and ultimately DNA breakage (8). In addition to DNA replication, transcription of large genes at fragile sites can result in the formation of stable R-loop structures that ultimately result in common fragile site breakage (22). Triplet repeat expansions, including those observed at rare fragile sites, have also been found to form stable R-loops during transcription, most likely influenced by the formation of stable DNA secondary structures on the non-template strand (163-165).

DNA topoisomerases play a critical role in maintaining chromosome structural integrity during DNA processes such as replication or transcription by regulating DNA supercoiling and removing knots in the genomic material (166). During replication and transcription, topoisomerase I alleviates DNA supercoiling by transiently inducing a single-strand DNA break and then re-ligating at the cleavage site. Topoisomerase II modulates DNA supercoiling and removes knots and tangles in the DNA formed during replication by transiently inducing a double-strand DNA break and then re-ligating at the cleavage site. Additionally, DNA topoisomerases I and II can recognize and preferentially cleave DNA at regions capable of forming stable DNA secondary structures (167-169), similar to those predicted or formed at fragile sites. Furthermore, normal topoisomerase I activity is vital for common fragile site breakage (138,139).

The critical role of DNA topoisomerases in replication and transcription, their recognition of DNA secondary structures, and the involvement of topoisomerase I in fragile site breakage prompted us to directly investigate the role of these enzymes in initiating common fragile site breakage. The nucleotide locations of APH-induced DNA breaks within intron 11 of the RET oncogene, the major breakpoint cluster region in PTC patients (149), were determined using ligation-mediated PCR (LM-PCR) and were found to be at or near topoisomerase I and/or II predicted DNA cleavage sites. Furthermore, using DNA secondary structure predictions of the intron 11 sequence, the APH-induced breakpoints were found to be present in structural features known to be recognized by topoisomerases I and II. Finally, treatment of thyroid cells with low doses of topoisomerase catalytic inhibitors significantly reduced the rate of APH-induced DNA breakage within RET intron 11, as well as intron 4 of FHIT, located within the most active common fragile site FRA3B, to levels observed in untreated samples. These results support the involvement of DNA topoisomerases I and II in the initiation of DNA breakage at APH-induced common fragile sites, possibly through recognition of DNA secondary structures formed during perturbed DNA replication.

RESULTS

Identification of APH-induced DNA breakpoints within intron 11 of RET

To investigate the initial events of fragile site-induced DNA breakage, the nucleotide locations of APH-induced DNA breakpoints within intron 11 of RET were identified by LM-PCR. The translocation of RET with various partner genes is the hallmark of RET/PTC rearrangements, and the major breakpoint cluster region observed in patients occurs within intron 11 (149). Previously, we established using LM-PCR that APH treatment induces DNA breakage within RET intron 11 in the human thyroid epithelial cell line HTori-3, and that this breakage is specific to fragile sites (161). Here, the entire RET intron 11 was examined by the same method for APH-induced breakpoints using genomic DNA isolated from HTori-3 cells following treatment with 0.4 μM APH for 24 hours. In short, the genomic DNA was subjected to primer extension using biotinylated primers specific for the region of interest (see Materials and Methods; Appendix Table 1). Upon reaching a DNA break, the synthesis reaction terminates, resulting in a blunt-ended DNA molecule, which was then captured by linker ligation. The linker-attached DNAs were isolated using streptavidin beads, amplified by two rounds of nested PCR, and visualized by agarose gel electrophoresis (Figure 3.1A). Each lane on the agarose gel represents DNA isolated from approximately 1300 cells and each band on the gel represents a DNA break isolated within RET intron 11. A total of 144 DNA breaks were isolated within the 1847-bp intron 11 sequence of RET on both DNA strands using four sets of primers, sets 1-4 (Figure 3.1B; Appendix Table 2). DNA breakage within RET intron 11 was observed at a frequency of 0.077 ± 0.029 DNA breaks per 100 cells, which is significantly greater than the frequency of DNA breakage without treatment (0.016 ± 0.009 breaks per 100 cells, P = 3.31E-4; Table 3.1, Figure 3.1A).
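To give a sense of scale, the reported frequencies can be converted into the expected number of breaks (gel bands) per LM-PCR lane. The sketch below uses only the figures stated above, about 1300 cells per lane and the two reported frequencies; the function name is illustrative, and it assumes the frequency was obtained as bands counted over cells analyzed.

```python
CELLS_PER_LANE = 1300  # approximate cells per LM-PCR lane, as stated in the text


def breaks_per_lane(breaks_per_100_cells, cells_per_lane=CELLS_PER_LANE):
    """Convert a frequency in breaks per 100 cells to expected breaks per lane."""
    return breaks_per_100_cells / 100 * cells_per_lane


aph = breaks_per_lane(0.077)        # 0.4 uM APH, 24 hours
untreated = breaks_per_lane(0.016)  # no treatment

print(f"APH: {aph:.2f} breaks per lane")
print(f"Untreated: {untreated:.2f} breaks per lane")
print(f"Fold increase with APH: {0.077 / 0.016:.1f}")
```

Under these assumptions, an APH-treated lane carries about one detectable break on average, whereas roughly one untreated lane in five does.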

To assure the location of the APH-induced breakpoints representing the initial events of APH-induced fragile site breakage within this region, we first confirmed that the LM-PCR procedure can accurately identify the nucleotide location of a DNA break, and that the breaks being detected are not due to premature termination of the primer

[Figure 3.1, panel A: LM-PCR gels, lanes 1-7, for Untreated and 0.4μM APH. Panel B: sequence of RET intron 11 with flanking exon sequences, shown below.]

5’-CTACCACAAGTTTGCCCACAAGCCACCCATCTCCTCAGCTGAGATGACCTTCCGGAGGCCCGCCCAGGCCTTCCCGGTCAGCTACTCCTCTTCCGGTGCCCGCCGGCCCTCGCTGGACTCCA

TGGAGAACCAGGTCTCCGTGGATGCCTTCAAGATCCTGGTGAGGGTCCCTGCGGGGCAGGGAAGATCCCCTGCCCTCCCCAGCTGCCTTCCAGGGAGGGAGGCCAGCTGGGGAGACAGA

GGCCATCCTGTGAGGGGCTGCCAACGCTGGGCAGACGAGGCCTGTGTTCTGCCCCCATTTCCATAGGGCGCTGTGTGGGGACAGTCTGTGGGGTGGGACTGTGATGAGGTGCCGTTCCCA

TCTAGGTGAGAGGCAGTGGTCAGGGTCACAGCATCGGGCAGGGGAGCAGCAGTGTGGATGGAGGGGCACTGAAGTCAGAAGGGGGTGCCTTTCTGGGGAGCCTGGCCTGCAGGTCTGC

ATGTGCTACTCAGAGCCTCCAGGCTGTGCCGAGTATCCTGGAGCCTCCTTGTCCCGGCCAGGCAGGCCTCTGCCCTCTCCTGGTGGTGGCCTGCCCCTTCAGTGTTCCTACTAGCACTGTCCA

GGGCGCTGGAAGCCAAGCCCAGTTCTGGAAGTAACAGAGGCTCAGAGCCAAGGGTGTGAGTGAACGGTGAGCCACGCAGCTTATGGTGGCGTGAATAGCTCCTCGGCAGGAGCCTCCAG

GGAGGAAGCTGAGCACCCAGTGGCCACAGGGCCCTGGCAGTTCCCATCTCAGGCTGGGAGGTGGCCTGGGATTCCTGGGAGGGGCCATATCCCACAGTGCAGCTCAGCCTGAGGCCTCG

GCCCTGGAGCCTCCGTTCAGGCAACACCCAGCCCTCGGTAAGGGTGTGAGCCAAGGAGGCCTTCCCAGATGTGGCCACTGCCGCTTCCCCACCAGCTTTCCTAATTGGTGGTCCCCATCCTG

GCCTGGCTGCAGCTTAGCCTCATGGCAGGGCTCTAGGATGAGCCACCAGAGTCCTTCATAAACCCAGTGGGTTTGTGTGAGGCTGCCCAGGAAGGCCGCACTGGTCTGGGCTGCTGCTGGC

AGAGACCACCACCCTAACCCCAGTCAGCTCCAGAGTCACACTCATCAGCACCAGGTCTTGGACCCATGACTCAACCTCAGTATTTGAGAGGATCAGGTTGATGTCGCCCTCATGTGCTTATTGC XbaI AGTCTCTAGAGTGTGGTAAACAGGTTTCCAGTGCCAGCTGTGGAGGTGACAGCGGCAGGGAAGCCATGGCAGTGTCGACACTGACCTTGACTGTGGGTTCCCAGGGAATGTGGGGCCAGA

CCAGGACAGCCCAGGAGCAGGAGACCTGGGGTGACGGATGCCCAGAGCTGGCACATCAAGGGAGGGTTCCTGGATCATGGCAGGCTTTGGCCTCCCTGGTCAGAGTTCAAGTACTGGGG

GCCAGGGTGGGGGTCTGGGAAGGCATCCGGAGCAGTCCCAAGTGGGCCCAATGTGTGGATAGAACTTTGGTGGGAGGGCAGGGTGGTAGTGCCAGCAGGCAGGGTGAGCGGGTGCGTG BanI AGGGCCAGTGGCAGCCCTTGAGGAGCAGTGCTTCCACACTCTGAGGCGGAACATGGTGGCGCCTTTCTTTGCAGGGGTGGCTATGTAGAGAAGTTGTCCTGGACACTTCCACTGTAGTCAG

AGGTCCTGGGCTGGGCCTGGTGCTCATTTAGTCCTGGGGCAGGGGTCAGGGGAGACAGTAGACCAGGAACCAGAGAGGGTCGAAGTACTGAGTCCAAGCCATGCTGTGACCACACCTGTC

ATGTAGCAGCTTTCAGGGGCCTGGCTGTGGGGTCCTGCCCAGGGCAGAGACAGGCAGCGTTGCCGCTGGCTCAGATGACAGCCGGTTCTCTGCACATTGGAACTTGTCCATGGGGCCTCCT

TTAAGGGTCTTGCCTTCTTCCTCCCCTGTCATCCTCACACTTTTCCCCCCTCTTCTCCCCCTTCCCTCATTTCCAACATAGGAGGATCCAAAGTGGGAATTCCCTCGGAAGAACTTGGTTCTTGGA

AAAACTCTAGGAGAAGGCGAATTTGGAAAAGTGGTCAAGGCAACGGCCTTCCATCTGAAAGGCAGAGCAGGGTACACCACGGTGGCCGTGAAGATGCTG-3’

[Figure 3.1, panel C: Percentage of Breakpoints (%) (y-axis, 0 to 12) versus RET Intron 11 (bp) (x-axis, 0 to 1750).]

Figure 3.1: Location of APH-induced DNA breakpoints within intron 11 of RET detected by LM-PCR. (A) DNA breaks formed in intron 11 of RET were detected by LM-PCR with or without 24 hour 0.4 μM APH treatment. Each lane represents a separate PCR reaction using DNA from approximately 1300 cells. The first lane of each gel is a 100-bp molecular weight ladder. (B) The locations of 144 APH-induced DNA breakpoints isolated within intron 11 of RET by LM-PCR were determined by DNA sequencing and are marked by arrowheads. DNA breaks identified on the top strand are indicated by black arrowheads and on the bottom strand by grey arrowheads. Open arrowheads indicate the locations of known patient breakpoints observed in PTC tumors containing RET/PTC rearrangements (149,150,154,155,170). The locations of BanI and XbaI digestion sites within intron 11 are labeled. RET primer sets, as described in Appendix Table 1, are indicated by arrows, where lines with circles are dual biotin labeled primers followed by two nested primers. The dashed black lines represent primer set 1, dashed grey lines primer set 2, solid black lines primer set 3, and solid grey lines primer set 4. The sequence of intron 11 is displayed along with the flanking exon 10 and 11 sequences, shown in italics. (C) The distribution of APH-induced DNA breakpoints within intron 11 is depicted as a smooth curve fit of the percentage of breakpoints (y-axis) located in every 50-bp of intron 11 in a 5’ to 3’ direction (x-axis).

extension reaction. Genomic DNA isolated from HTori-3 cells was digested with either the restriction enzyme BanI or XbaI, and the LM-PCR products generated with RET primer set 1 (Appendix Table 1) are expected to be 454-bp and 864-bp in size, respectively, for BanI and XbaI digested DNAs (Figure 3.1B). Indeed, PCR products corresponding to the correct sizes were observed for the digested DNA samples, and DNA sequencing revealed that 100% of the BanI or XbaI-induced breaks corresponded to the correct nucleotide location (data not shown). These results verify that LM-PCR is a valid method for identification of DNA breakpoints up to about 1 kb from the initial biotinylated primer.

Next, we examined whether the locations of DNA breaks being detected by LM-PCR reflect the true APH-induced breaks, not the consequences of subsequent repair processes or the DNA purification procedure. Intact HTori-3 nuclei were treated with BanI, after which genomic DNA was isolated and analyzed by LM-PCR using RET primer set 1. DNA sequencing of 28 breakpoints generated by the LM-PCR revealed that 79% were located at the predicted nucleotide, while the remaining breakpoints contained deletions of up to 5-bp, which may be the result of exonuclease digestion (data not shown). Together, these results show that the nucleotide locations of DNA breaks identified by LM-PCR mostly correspond to the initially induced breaks formed inside the cell.

Table 3.1: Frequency of DNA breakage within RET intron 11 as detected by LM-PCR.

Treatment                      DNA Breaks/100 cells ± SD    P-Value vs APH*
Untreated                      0.016 ± 0.009                3.31E-04
0.4µM APH                      0.077 ± 0.029                -
0.4µM APH + 60nM BA            0.023 ± 0.010                4.01E-03
0.4µM APH + 3µM Merbarone      0.018 ± 0.007                1.98E-03

*P-value calculated using an independent two-tailed Student’s T-test

The nucleotide locations of the 144 APH-induced DNA breakpoints were determined by sequencing of the PCR products. APH treatment induced DNA breakage throughout intron 11 on both DNA strands (Figure 3.1B). Interestingly, the breakpoints form a notable pattern of clusters with spacing approximately every 250-bp, equivalent to the interval between nucleosomes (Figure 3.1C). Next, the locations of the APH-induced breakpoints were compared with the locations of known breakpoints found in PTC tumors containing RET/PTC rearrangements (Figure 3.1B; Figure 3.2; Appendix Table 2) (149,150,154,155,170), and we found that 94 (69%) of the APH-induced breakpoints have a known patient breakpoint located within 0 to 20-bp. While the breakpoints identified in PTC tumors were isolated following rearrangement, the APH-induced breakpoints were determined prior to a translocation event. In the majority of PTC tumors, small insertions or deletions ranging from 1 to 18-bp have been observed surrounding the fusion points.
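The clustering described above rests on counting breakpoints in 50-bp intervals along the intron (Figure 3.1C). A minimal sketch of that binning step follows; the coordinates in the example are made up for illustration and are not the mapped breakpoints, and the smoothing applied to the published curve is not reproduced.

```python
from collections import Counter


def percent_per_bin(positions, bin_size=50):
    """Percentage of breakpoints falling in each bin, keyed by the bin start (bp)."""
    counts = Counter((p // bin_size) * bin_size for p in positions)
    total = len(positions)
    return {start: 100 * n / total for start, n in sorted(counts.items())}


# Hypothetical breakpoint coordinates (bp from the 5' end of the intron).
example_positions = [10, 20, 60, 260]
for start, pct in percent_per_bin(example_positions).items():
    print(f"{start}-{start + 49} bp: {pct:.1f}%")
```

Plotting such percentages against position is one way to make a periodic spacing between clusters visible.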


Figure 3.2: Location of APH-induced breakpoints within intron 11 of RET relative to known patient breakpoints. The nucleotide location of previously reported fusion points observed in PTC patients with RET/PTC1 or RET/PTC3 translocations were compared to APH-induced breakpoints. The distance range is represented on the x axis, where a negative position refers to the closest patient breakpoint being upstream of the APH-induced breakpoint and a positive being downstream, and the percentage of the total APH-induced breakpoints contained within the distance range is displayed on the y axis.

These results reveal that APH-treatment induces DNA breakage throughout intron 11 of RET and that many of these breakpoints are located at or near known patient breakpoints. We also show that the LM-PCR can accurately identify the nucleotide location of a DNA breakpoint, therefore the locations of these breakpoints can be used to identify initial events in APH-induced fragile site breakage within the RET oncogene.

Location of APH-induced DNA breakpoints relative to predicted topoisomerase I and II cleavage sites

To determine if DNA topoisomerases I and II are involved in initiating APH-induced DNA breaks, APH-induced DNA breakpoints were compared to predicted


Figure 3.3: Comparison of APH-induced DNA breakpoints to predicted DNA topoisomerase I and II cleavage sites. (A) Topoisomerase I cleavage sites within RET intron 11 were predicted based on the consensus sequence [(A,T,G)/(C,G,A)/(A,T)/(T,C)] (171), compared to APH-induced DNA breakpoints, and represented as the distance in bp from each APH breakpoint to the closest predicted cleavage site (x-axis). A positive distance refers to the closest topoisomerase I cleavage site being downstream of the APH breakpoint, and a negative distance to it being upstream. The percentage of all APH-induced breakpoints is displayed on the y-axis. (B) Topoisomerase IIα cleavage sites were predicted using the consensus sequence [(no A)/(no T)/(A, no C)/-/(C, no A)/-/-/-/-/(no T)/-/(T, no G)/(C, no A)] (172), where breakage occurs between nucleotides five and six. The locations of APH-induced breakpoints were compared to the predicted sites and represented in the same manner as for topoisomerase I.

58 topoisomerase I and II cleavage sites. The location of topoisomerase I cleavage sites within RET intron 11 were predicted on both DNA strands using the consensus sequence determined by Been et al. (171). We found that all APH-induced breakpoints are located within 19-bp of a predicted topoisomerase I site, with 76% being within 6-bp (Figure

3.3A; Appendix Table 2). As with topoisomerase I, topoisomerase II cleavage sites were predicted within RET intron 11 on both strands using the consensus sequence (172) and compared to the APH-induced DNA breaks. All APH-induced breakpoints were found to have a topoisomerase II site within 12-bp, with 91% being within 6-bp (Figure 3.3B;

Appendix Table 2).
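The comparison above amounts to scanning both strands for consensus matches and measuring the signed distance from each breakpoint to the nearest match. The sketch below illustrates that idea and is not the procedure used in this study. Several choices in it are assumptions: the topoisomerase II consensus is read as exclusions only (for example, "A, no C" is taken as "any base except C"), cleavage by topoisomerase I is placed immediately 3’ of the fourth consensus base, sites from both strands are pooled on top-strand coordinates, and the toy sequence is invented.

```python
import re
from bisect import bisect_left

# Topoisomerase I consensus from the text: (A,T,G)/(C,G,A)/(A,T)/(T,C).
TOPO1 = "[ATG][CGA][AT][TC]"
# Topoisomerase II consensus from the text, 13 positions, read as exclusions only.
TOPO2 = "[^A][^T][^C].[^A]....[^T].[^G][^A]"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def predicted_sites(seq, pattern, cut_offset):
    """Sorted cut positions (top-strand, 0-based) for overlapping matches on both strands.

    A cut position p means cleavage between bases p-1 and p; cut_offset is the
    number of bases from the start of a match to its cut.
    """
    finder = re.compile(f"(?=({pattern}))")  # lookahead allows overlapping matches
    n = len(seq)
    sites = {m.start() + cut_offset for m in finder.finditer(seq)}
    # Map bottom-strand cuts back onto top-strand coordinates.
    sites |= {n - (m.start() + cut_offset)
              for m in finder.finditer(reverse_complement(seq))}
    return sorted(sites)


def signed_distance_to_nearest(breakpoint, sites):
    """Distance (bp) to the closest site; positive means the site is downstream."""
    i = bisect_left(sites, breakpoint)
    candidates = sites[max(i - 1, 0):i + 1]
    return min((s - breakpoint for s in candidates), key=abs)


toy = "CCGATCCC"  # invented sequence, not RET intron 11
topo1_sites = predicted_sites(toy, TOPO1, cut_offset=4)
print(topo1_sites, signed_distance_to_nearest(4, topo1_sites))
```

For topoisomerase II the same function would be called with `TOPO2` and `cut_offset=5`, reflecting cleavage between nucleotides five and six of the consensus. Note that `signed_distance_to_nearest` expects a non-empty, sorted list of sites.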

Since the topoisomerase I and II consensus sequences are fairly loose, we wanted to verify that these enzymes cleave DNA within RET intron 11 and that these cleavage sites correspond with the predicted sites. To capture topoisomerase breakage,

HTori-3 cells were treated with topoisomerase poisons CPT-11 and VP-16, which allow topoisomerases I and II, respectively, to cleave DNA but prevent re-ligation (173,174).

HTori-3 cells were exposed to 10 μM CPT-11 or VP-16 for 1.5 hours, treatments which are known to induce detectable levels of topoisomerase DNA breakage (175,176), after which the DNA was isolated and analyzed for DNA breakage within RET intron 11 by LM-

PCR using primer set 1. A total of 22 breakpoints were sequenced for CPT-11 and 21 for

VP-16 and compared to predicted topoisomerase I or II cleavage sites, respectively.

Interestingly, 18% of the CPT-11-induced and 29% of the VP-16-induced breakpoints corresponded to a predicted cleavage site (Figure 3.4). The remaining breakpoints were

[Figure 3.4 plot: Percentage of Breakpoints (%) (y-axis, 0 to 30) versus Distance from CPT11 or VP16-Induced Breakpoint to Predicted Topoisomerase I or II Cleavage Site (bp) (x-axis, -6 to +6); series: CPT11 and VP16.]

Figure 3.4: Comparison of CPT-11 and VP-16 induced topoisomerase I and II cleavage to predicted cleavage sites. Topoisomerase I and II DNA cleavage was induced by treatment of HTori-3 cells with 10μM CPT-11 or VP-16, respectively, for 1.5 hours. The location of CPT-11 (n=22) or VP-16 (n=21) induced DNA cleavage within RET intron 11 was detected using LM-PCR and RET primer set 1 and compared to either topoisomerase I (CPT-11) or II (VP-16) predicted cleavage sites. A positive distance indicates the predicted cleavage site is downstream of the drug-induced site, and a negative distance that it is upstream. The percentage of all drug-induced breakpoints is represented on the y-axis.

located within 6-bp of a predicted topoisomerase I or II site, suggesting the consensus sequences are not a perfect predictor of topoisomerase breakage.

If the criterion suggested by the topoisomerase poison experiments, namely being at or within 6-bp of a predicted topoisomerase I or II cleavage site, is applied, topoisomerase breakage can explain all but one of the APH-induced breakpoints observed. Specifically, 8% are associated with topoisomerase I, 24% with topoisomerase II, and 67% with both topoisomerase I and II. Together, these results suggest the potential involvement of DNA topoisomerases I and II in the initiation of fragile site breakage within RET intron 11 following treatment with APH.
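The assignment rule used above can be written out explicitly. The sketch below is illustrative only: the function name and the example distances are hypothetical, and the only element taken from the text is the threshold of 6-bp from a predicted cleavage site.

```python
from collections import Counter


def assign_topoisomerase(dist_topo1, dist_topo2, max_dist=6):
    """Classify a breakpoint by which predicted sites lie within max_dist bp.

    dist_topo1 and dist_topo2 are signed distances (bp) to the nearest predicted
    topoisomerase I and II cleavage sites, respectively.
    """
    near1 = abs(dist_topo1) <= max_dist
    near2 = abs(dist_topo2) <= max_dist
    if near1 and near2:
        return "both"
    if near1:
        return "topoisomerase I"
    if near2:
        return "topoisomerase II"
    return "neither"


# Hypothetical (topo I, topo II) distances for four breakpoints.
example = [(0, -3), (-2, 9), (12, 4), (19, 12)]
tally = Counter(assign_topoisomerase(d1, d2) for d1, d2 in example)
for label, n in tally.items():
    print(f"{label}: {100 * n / len(example):.0f}%")
```

Applied to the distances in Appendix Table 2, a tally of this kind would yield the breakdown reported above, with the single unexplained breakpoint falling in the "neither" class.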

Correlation of APH-Induced DNA breakpoints to sites on predicted DNA secondary structures with topoisomerase cleavage features

Aside from the recognition of consensus sequences in double-stranded DNA, DNA topoisomerases I and II recognize and preferentially cleave single-stranded DNA within regions that form DNA secondary structures. DNA topoisomerase I cleavage of single-stranded DNA requires the formation of a DNA duplex, where cleavage occurs within the duplexed stem of the secondary structure, and the consensus sequence needs only to be approximate (167). DNA topoisomerase II has been shown to cleave DNA hairpins one nucleotide from the 3’-base of the stem, where DNA secondary structure and the presence of a double-stranded/single-stranded DNA junction at the 3’-base of the hairpin, rather than sequence specificity, are the predominant features recognized by the enzyme (168). Additionally, human topoisomerase II recognizes hairpin structures formed within alpha satellite DNA, and cleaves within the single-stranded DNA loop region of the hairpin structure (169). Recently, using DNA secondary structure forming analyses, we predicted potential fragile sites on chromosome 10, and among these regions was the RET oncogene, including intron 11 (Manuscript Submitted). Using an in vitro reduplexing assay, we showed that RET intron 11 DNA forms significantly greater levels of DNA secondary structure than regions not predicted to possess this ability.


Figure 3.5: Location of APH-induced RET intron 11 breakpoints on predicted DNA secondary structures. (A) A representative predicted DNA secondary structure is shown, corresponding to the RET gene nucleotides 43,610,735 to 43,611,034 (hg 37.2), with the locations of APH-induced breakpoints indicated by arrows. The program Mfold was used to predict potential DNA secondary structures within the RET intron 11 sequence, analyzing 300-nt fragments at a time with 150-nt shift increments on both DNA strands and selecting the most energetically favorable structure for each fragment. The location of each APH-induced DNA breakpoint on the predicted secondary structures was analyzed. (B) The percentage of the APH-induced RET breakpoints is shown for each DNA secondary structural feature recognized by DNA topoisomerases I and II.

Therefore, to investigate if there is a correlation between the location of the APH-induced DNA breakpoints and DNA secondary structure formation, potential DNA secondary structures for both DNA strands of RET intron 11 were predicted using the program Mfold, analyzing 300-nt segments with 150-nt shift increments and determining the structure with the most favorable free energy value for each DNA segment. Due to the sequence overlap from the segment shift, the location of each DNA breakpoint was analyzed on two potential structures, and one structural feature was assigned to each breakpoint (Figure 3.5A; Appendix Table 3). Of the 144 APH-induced DNA breakpoints, 61 (42.4%) are located within a predicted double-stranded DNA stem, suggesting the potential involvement of DNA topoisomerase I (Figure 3.5B). Another 49 (34%) breakpoints are located at a double-stranded/single-stranded DNA junction and 22 (15.3%) breakpoints are located within a single-stranded DNA loop, suggesting the involvement of DNA topoisomerase II. The remaining 12 (8.3%) breakpoints are located in predicted single-stranded DNA bubbles, for which there is at present no known potential mechanism of DNA cleavage (Figure 3.5B).
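The windowing scheme described above, 300-nt segments advanced in 150-nt steps, explains why most breakpoints are examined on two predicted structures. A minimal sketch of the window layout follows. The structure prediction itself (Mfold) is not reproduced here, and how the final, shorter segment was handled in the study is not stated, so truncating it at the sequence end is an assumption, as is applying the windows to the 1847-bp intron alone.

```python
def sliding_windows(length, size=300, step=150):
    """Yield half-open (start, end) windows; the last window is truncated at the end."""
    start = 0
    while start < length:
        yield start, min(start + size, length)
        if start + size >= length:
            break
        start += step


def windows_containing(position, windows):
    """Windows whose half-open interval includes the given 0-based position."""
    return [w for w in windows if w[0] <= position < w[1]]


windows = list(sliding_windows(1847))  # 1847-bp intron 11
print(len(windows), windows[0], windows[-1])
print(windows_containing(500, windows))
```

With this layout, interior positions fall in exactly two overlapping windows, while positions near either end of the analyzed sequence fall in only one.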

These findings provide additional support for the role of DNA topoisomerases I and II in the initiation of APH-induced fragile site breakage within RET intron 11. Furthermore, they suggest a mechanistic connection between the formation of DNA secondary structures at fragile sites and the initial DNA breakage events following APH treatment.

Topoisomerase catalytic inhibitors decrease APH-induced DNA breakage at RET

Since the locations of the APH-induced DNA breakpoints within RET intron 11 suggest the potential involvement of topoisomerases I and II in initiating DNA breakage within this region following APH treatment, this hypothesis was directly tested by examining the effect on DNA breakage frequency in HTori-3 cells that are co-treated with APH and topoisomerase catalytic inhibitors. Topoisomerase catalytic inhibitors block DNA cleavage by the enzyme, and therefore if topoisomerases participate in initiating APH-induced DNA breakage within RET intron 11, the catalytic inhibitors would be expected to decrease the rate of APH-induced DNA breakage within this region. Two catalytic inhibitors, betulinic acid and merbarone, were chosen for cell treatments.

Betulinic acid (BA) inhibits topoisomerase I DNA cleavage through prevention of topoisomerase I-DNA cleavable complex formation by sequestering topoisomerase I in the nucleoplasm (177). There have been conflicting reports over the inhibitory effect of BA on topoisomerase IIα (178-180), and no systematic study has been performed to clarify the action of the drug on this enzyme. Merbarone, an inhibitor of DNA topoisomerase II with selectivity for the α over the β isoform (181), inhibits topoisomerase II by interacting with the enzyme and preventing DNA scission (182).

Optimal dosages of BA or merbarone were determined in combination with APH treatment in HTori-3 cells such that significant levels of cell death were not induced (Figure 3.6). Using these established conditions, HTori-3 cells were treated with 0.4 μM APH and 60 nM BA or 3 μM merbarone for 24 hours. The genomic DNA was then isolated and breakpoint analysis was performed by LM-PCR using RET primer set 1


Figure 3.6: Cell survival of HTori-3 cells following drug treatment. The level of HTori-3 cell death following treatment with APH and/or topoisomerase inhibitor drugs was determined using a propidium iodide (PI) stain and measured using flow cytometry. The percentage of live cells (PI negative) relative to untreated cells was determined for each treatment and averaged for at least three experimental replicates. Error bars indicate standard deviation.

(Figure 3.7A). Co-treatment of cells with APH and BA or merbarone significantly decreased the level of APH-induced DNA breakage within RET intron 11 from 0.077 ± 0.029 breaks per 100 cells to levels similar to untreated (0.023 ± 0.010 or 0.018 ± 0.007 breaks per 100 cells, P = 4.01E-3 or 1.98E-3, respectively) (Figure 3.7B, Table 3.1).

Next, the involvement of topoisomerases I and II in APH-induced fragile site breakage at the most active common fragile site, FRA3B, was tested by determining the rate of DNA breakage within intron 4 of FHIT, a region within FRA3B known to exhibit clustering of APH-induced DNA breakage (157,183). We previously established that APH treatment in HTori-3 cells results in DNA breakage within this region (161). In agreement with our previous results, a significant increase in DNA breakage was observed within

[Figure 3.7 panels: (A) LM-PCR gels, lanes 1-7, for 0.4μM APH + 60nM BA and 0.4μM APH + 3μM Merbarone. (B, C) DNA Breaks/100 Cells (y-axis) versus Treatment (x-axis): Untreated; 0.4μM APH; 0.4μM APH + 60nM BA; 0.4μM APH + 3μM Merbarone.]

Figure 3.7: The effect of DNA topoisomerase catalytic inhibitors on APH-induced common fragile site breakage. (A) HTori-3 cells were treated with 0.4μM APH in combination with topoisomerase I and II catalytic inhibitors, 60nM betulinic acid (BA) or 3μM merbarone, for 24 hours. LM-PCR was used to detect DNA breaks within RET intron 11 using RET primer set 1. Each lane represents a separate PCR reaction using DNA from approximately 1300 cells. The first lane of each gel is a 100-bp molecular weight ladder. (B) The frequency of DNA breakage within RET intron 11 following 0.4μM APH treatment combined with 60nM betulinic acid or 3μM merbarone significantly decreases compared to APH treatment alone (*P ≤ 4.01E-3), to levels similar to those without drug treatment. Error bars indicate standard deviation. (C) The frequency of DNA breakage within FHIT intron 4, located within APH-induced common fragile site FRA3B, shows a significant increase with 0.4μM APH treatment. As with RET intron 11, the frequency of APH-induced DNA breakage within FHIT intron 4 showed a significant decrease when combined with BA or merbarone (*P ≤ 7.21E-6). Error bars indicate standard deviation.

FRA3B following treatment of HTori-3 cells with 0.4 μM APH for 24 hours (0.110 ± 0.012 breaks per 100 cells) compared to untreated (0.023 ± 0.006 breaks per 100 cells, P = 1.36E-7, Figure 3.7C). As was seen in intron 11 of RET, when APH treatment was combined with BA or merbarone, the frequency of APH-induced DNA breakage significantly decreased (0.023 ± 0.006 or 0.030 ± 0.017 breaks per 100 cells, P = 5.03E-7 or 7.21E-6, respectively) to levels similar to untreated (Figure 3.7C). Together these results confirm that DNA topoisomerases I and II are involved in initiating APH-induced DNA breakage within common fragile sites located at RET and FHIT, and this mechanism may extend to other APH-induced common fragile sites as well.

DISCUSSION

In this study, we analyzed the initial events of APH-induced common fragile site breakage within the RET oncogene. APH treatment of HTori-3 cells induces significant levels of DNA breakage within intron 11 of RET, the breakpoint cluster region found in PTC patients (Table 3.1). Previously we confirmed that APH treatment specifically induces breakage at fragile sites in HTori-3 cells by detecting high levels of breakage within RET and FHIT, located in APH-induced common fragile sites FRA10G and FRA3B, respectively, but not in the non-fragile 12p12.3 region or the G6PD gene, located within the non-APH-inducible rare folate-sensitive fragile site FRAXF (161). Using LM-PCR, we next mapped the nucleotide locations of 144 APH-induced DNA breaks on both the top and bottom strands of RET intron 11 (Figure 3.1). All but one of the breakpoints induced by APH were located at or near predicted DNA topoisomerase I and/or IIα cleavage sites (Figure 3.3). Utilizing the DNA secondary structure prediction program Mfold, the locations of these APH-induced breakpoints were compared to predicted DNA secondary structures of the intron 11 sequence. The majority of the breakpoints (91.7%) were located at structural features known to be recognized and preferentially cleaved by topoisomerases I or II (Figure 3.5). Finally, we confirmed the involvement of topoisomerases I and IIα in APH-induced DNA breakage at the RET oncogene by measuring the effect of topoisomerase catalytic inhibitors on the level of APH-induced DNA breakage within intron 11. When catalytic inhibitors BA and merbarone were combined with APH treatment, the frequency of DNA breakage within RET intron 11 significantly decreased to levels similar to untreated (Figure 3.7B). Furthermore, this effect was also observed at FHIT intron 4, confirming the involvement of DNA topoisomerases I and IIα in APH-induced DNA breakage at other common fragile sites as well (Figure 3.7C). Together these results provide strong evidence that DNA topoisomerases I and IIα initiate APH-induced DNA breakage at common fragile sites through recognition and preferential cleavage of DNA secondary structures.

Previous studies have also implicated DNA topoisomerase I in common fragile site instability. Depletion of topoisomerase I in HCT116 cells results in a significant increase in common fragile site breakage (139). Arlt et al. observed that co-treatment of cells with APH and the topoisomerase I catalytic inhibitor BA significantly decreased common fragile site breakage, including at FRA3B (138). These results are consistent with our observation that BA significantly decreased APH-induced DNA breakage at RET (FRA10G) and FHIT (FRA3B) (Figure 3.7B,C). When Arlt et al. combined APH treatment with the topoisomerase I poison camptothecin (CPT), which prevents topoisomerase I from religating DNA following cleavage, they also observed a significant reduction in common fragile site breakage, including at FRA3B (138). However, when we combined a low dosage (150 nM) of the topoisomerase poison CPT-11 (Figure 3.6) with 0.4 μM APH treatment, we observed a significant increase in APH-induced DNA breakage at RET (FRA10G) and FHIT (FRA3B) in HTori-3 cells (Figure 3.8). The difference between our observations and those of Arlt et al. may be attributed to the detection method: we detect fragile site breakage as single- and double-strand DNA breaks, whereas they detect chromosomal disruptions on metaphase chromosomes. Nevertheless, our data combined with these previous studies provide compelling evidence for DNA topoisomerase involvement in common fragile site breakage.

Figure 3.8: Frequency of APH-induced DNA breakage in combination with CPT-11 treatment. Co-treatment of HTori-3 cells with APH and CPT-11 significantly increases the frequency of DNA breakage within RET intron 11 and FRA3B relative to APH treatment alone. The frequency of DNA breakage for each treatment was measured using LM-PCR and averaged over at least three independent experiments. Significance was calculated using a two-tailed Student’s t-test (P ≤ 5.9E-6). Error bars indicate standard deviation.

One model for common fragile site expression is that delayed replication can cause an uncoupling of the helicase complex from the DNA polymerase, resulting in long stretches of single-stranded DNA, and at fragile sites this DNA can form stable DNA secondary structures that pause polymerase progression and ultimately result in DNA breakage (8). The initial events of fragile site breakage remain unclear, but our data here support the involvement of DNA topoisomerases I and IIα in the initiation of APH-induced fragile site breakage. DNA topoisomerases I and IIα participate in replication by maintaining chromosomal structural integrity through transient introduction of DNA breakage. The presence of human topoisomerases I and IIα has been observed at replication origins, and inhibition of topoisomerase I interferes with replication origin firing, indicating these enzymes play a role in replication initiation (184). Once replication is initiated, DNA is unwound by DNA helicase, resulting in DNA overwinding (positive supercoiling) in front of the replication fork and DNA underwinding (negative supercoiling) behind the replication fork. Positive and negative supercoiling as a result of replication can be removed by both topoisomerases I and IIα (185). Negative supercoiling behind the replication fork can also result in knots and tangles in the newly replicated DNA, which can be removed by topoisomerase IIα (186). The presence of these enzymes at the replication fork, their ability to cleave DNA, and their necessity for induction of fragile site breakage at RET intron 11 and FHIT intron 4 in HTori-3 cells following APH treatment suggest these enzymes are involved in the initiation of DNA breakage. Furthermore, the location of APH-induced DNA breaks in RET intron 11 at DNA secondary structural features recognized by topoisomerases I and II suggests these enzymes may recognize and preferentially cleave these structures while traveling ahead of or behind the replication fork and scanning for topological changes, or may be recruited separately to these sites.

As with DNA replication, transcription results in positive supercoiling ahead of the transcription bubble and negative supercoiling behind (187,188). Furthermore, negative supercoiling enhances the formation of stable RNA-DNA hybrids (R-loops) (189), which are linked with genomic instability and double-strand DNA break formation (190). Another model for common fragile site breakage is that transcription of long genes at fragile sites results in the formation of stable R-loops due to the collision of transcription and replication machinery, ultimately leading to genomic instability within these regions (22). DNA topoisomerase I activity is vital during transcription for the removal of positive and negative supercoiling, and thus for suppressing R-loop formation (166). Trinucleotide repeats, including those observed at rare fragile sites, also preferentially form stable R-loops (163-165), and this may be due to the formation of stable DNA secondary structures on the non-template DNA strand (191). Therefore, as with replication, DNA secondary structure formation at common fragile sites during transcription may result in the formation of stable R-loops and stalled transcription machinery, which may be recognized and preferentially cleaved by DNA topoisomerase I.

RET protein expression in the thyroid is high in neural crest-derived C-cells but not in follicular cells, where RET/PTC rearrangements can result in its expression as a fusion protein and lead to PTC. The HTori-3 cell line used in our study is derived from normal human thyroid follicular epithelium (192) and thus should not express RET, which we confirmed by RT-PCR (data not shown). Since active transcription of a gene is required for the transcription-associated model of common fragile site breakage, delayed replication must be responsible for all of the APH-induced fragile site breakage we detect at the RET gene. Therefore, the formation of stable R-loops cannot explain breakage at all common fragile sites, further supporting the secondary structure-forming/replication-stalling mechanism of common fragile site breakage.

Cancer is often treated with chemotherapeutic agents that act as DNA topoisomerase poisons. The incidence of secondary primary tumors is on the rise; in the United States they account for one in six of all newly diagnosed cancers (123), and this may be attributed to cancer treatments. Thyroid cancer has been observed following chemotherapy for a variety of different cancers, including non-Hodgkin’s lymphoma (124,193) and testicular cancer (126). Specifically, PTC has been observed as a second cancer following treatment of osteosarcoma (132), rhabdomyosarcoma (133), acute lymphoblastic leukemia, neuroblastoma, and Ewing’s sarcoma (134-137) with chemotherapy alone. This suggests that manipulation of normal topoisomerase activity can result in PTC development. Since we observe here that perturbed DNA topoisomerase I and II activity can affect fragile site breakage at the RET oncogene, more work should be done to investigate the role of topoisomerase poisons and fragile sites in the formation of secondary PTC tumors and other cancers.

These studies show that DNA topoisomerases I and II play a vital role in initiating breakage at APH-induced common fragile sites, providing valuable insight into the initial events of common fragile site breakage. Furthermore, our data support the mechanism by which stable DNA secondary structures form at common fragile sites during delayed replication, and we present the idea that these structures can be recognized and preferentially cleaved by topoisomerases. The data described here provide new mechanistic insight into how fragile site breakage at the RET oncogene can lead to PTC in patients, and this may be extendable to other regions of the genome and other cancers as well.

METHODS AND MATERIALS

Cell line and culture conditions

Experiments were performed on HTori-3 cells, a human thyroid epithelial cell line transfected with an origin-defective SV40 genome. They are characterized as immortalized, partially transformed, differentiated cells that have three copies of chromosome 10 with intact RET loci and that preserve the expression of thyroid differentiation markers such as thyroglobulin and the sodium iodide symporter (158). The cells were purchased from the European Tissue Culture Collection and grown in RPMI 1640 medium (Invitrogen) supplemented with 10% fetal bovine serum.

Cell treatments and fragile site induction

For breakpoint detection, HTori-3 cells (1 × 10^5) were plated in 10 cm dishes and treated 18 hours later for 24 hours with 0.4 μM APH (Sigma), in the presence or absence of 3 μM merbarone (Sigma), 60 nM betulinic acid (BA) (Sigma), or 150 nM CPT-11 (Sigma). For detection of DNA topoisomerase I and II cleavage sites, HTori-3 cells were plated in the same manner, and treated for 1.5 hours with either 10 μM CPT-11 or 10 μM VP-16 (Sigma).

DNA breaks were directly introduced to HTori-3 genomic DNA through digestion with the restriction enzyme BanI or XbaI (New England BioLabs). DNA breaks were induced within intact nuclei isolated from HTori-3 cells [isolation of nuclei was performed as described in (183)] through treatment with BanI, after which the genomic DNA was isolated.

DNA breakpoint mapping by LM-PCR

DNA breaks were identified within intron 11 of RET using four sets of primers (Appendix Table 1), two sets detecting breaks on one DNA strand and two sets detecting breaks on the complementary strand, with each set covering a DNA region of approximately 1 kb. Each primer set consists of a 5’-biotinylated primer that extends to the breakpoint, and two nested primers, used separately in the first and second rounds of PCR to amplify the DNA. DNA breaks within FRA3B were isolated using a set of primers corresponding to intron 4 of FHIT (Appendix Table 1) (161), which is a hotspot of APH-induced DNA breakage in FRA3B (157,158).

Detection of DNA breakpoints following drug treatment was performed as previously described (161). In short, genomic DNA was isolated from HTori-3 cells with or without treatment. Primer extension was performed using 200 ng of DNA at 45°C with DNA Sequenase (Affymetrix, Inc.), and the DNA breaks were isolated by ligation of the LL3/LP2 linker followed by binding to streptavidin beads. Amplification of these DNA breaks was achieved by nested PCR of the extension-ligation products, using the equivalent of 8 ng of genomic DNA per reaction (for all treatments except APH + CPT-11, where 4 ng was used). The final PCR products were resolved by electrophoresis on a 1.3% agarose gel. Each band observed on the gel corresponds to a break isolated within the region of interest. The locations of the observed bands were confirmed by DNA sequencing. The nucleotide location of each breakpoint was determined from the sequencing results by identifying the nucleotide adjacent to the LL3/LP2 linker sequence.

DNA secondary structure prediction by Mfold

The DNA sequence of intron 11 of RET was obtained from NCBI (human genome build 37.2, Chr10: 43610185-43612031). Using the Mfold program (194), the potential of single-stranded DNA to form stable secondary structures can be predicted. The secondary structure-forming potential of RET intron 11 was analyzed by inputting 300-nt segments with 150-nt shift increments into the Mfold program. We chose the 300-nt length because it equals the Okazaki initiation zone of the DNA replication fork in mammalian cells, which possesses a single-stranded property during DNA replication (195,196). The default [Na+], [Mg2+], and temperature used were 1.0 M, 0.0 M, and 37°C, respectively. The most stable predicted DNA secondary structure for each 300-nt segment was used to analyze the location of APH-induced DNA breaks within intron 11 of RET.

Cell survival analysis following drug treatment

HTori-3 cells (1 × 10^5) were plated in 6-well plates and treated 18 hours later with 0.4 μM APH and/or topoisomerase inhibitors for 24 hours. Cells were harvested by trypsinization, washed with phosphate-buffered saline (PBS, Invitrogen), and resuspended in PBS containing 2 μg/mL propidium iodide. Cell viability was then determined using a Becton Dickinson FACSCalibur flow cytometer. Titrations of betulinic acid (30 nM to 3 μM), merbarone (1 to 100 μM), and CPT-11 (3 nM to 10 μM) were performed to determine optimal dosing in HTori-3 cells. After an optimal range of doses for each of these topoisomerase inhibitors alone was determined, titrations were also performed for betulinic acid (30 nM to 0.3 μM), merbarone (1 to 10 μM), and CPT-11 (150 to 500 nM) in combination with 0.4 μM APH.

ACKNOWLEDGEMENTS

This work was supported by the National Institutes of Health (CA113863 to Y.-H.W. and Y.E.N. and T32GM095440 to L.W.D.).

CHAPTER IV: DNA SECONDARY STRUCTURES INVOLVED IN FRAGILE SITE BREAKAGE FROM THE STUDY OF HUMAN CHROMOSOME 10

Laura W. Dillon1, Levi C. T. Pierce2, Maggie C. Y. Ng3,4, Yuh-Hwa Wang1

1Department of Biochemistry, 3Center for Genomics and Personalized Medicine Research, 4Center for Diabetes Research, Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157-1016, USA 2Department of Chemistry and Biochemistry, University of California-San Diego, 9500 Gilman Drive, Urey Hall, La Jolla, CA 92003-0365, USA

The following paper has been submitted for publication in PLoS Genetics, with modifications in number of Figures and References. Differences in organization reflect the requirements of the journal.

ABSTRACT

The formation of alternative DNA secondary structures can result in DNA breakage leading to cancer and other diseases. Chromosomal fragile sites, which are regions of the genome that exhibit chromosomal breakage under conditions of mild replication stress, are predicted to form stable DNA secondary structures. DNA breakage at fragile sites is associated with regions that are deleted, amplified, or rearranged in cancer. Despite this correlation, the ability of fragile sites to form secondary structures has not been examined in an unbiased manner. Here, we utilize DNA secondary structure predictions of the chromosome 10 sequence to compare fragile and non-fragile DNA. We found that aphidicolin-induced common fragile sites contain more sequence segments with potential high secondary structure-forming ability, and these segments clustered more densely than those in non-fragile DNA. Additionally, we refined legitimate fragile sites within the cytogenetically defined boundaries, and identified potential fragile regions within non-fragile DNA. Many of these regions coincide with genes mutated in various diseases and regions of copy number alteration in cancer. This study supports the role of DNA secondary structures in common fragile site instability, provides a systematic method for their identification, and suggests a mechanism by which DNA secondary structures can lead to human disease.

INTRODUCTION

Alternative DNA secondary structures, which vary in conformation from the customary right-handed B form, are suggested to have a role both in biological processes such as transcription and telomere maintenance, and in genomic mutational events including deletions, amplifications, and chromosomal rearrangements (197). At least ten alternative conformations have been identified to date, including hairpins/cruciforms, Z-DNA, triplexes, tetraplexes, slipped DNA, and sticky DNA (198). Formation of these structures can occur when the DNA duplex is unwound during metabolic DNA processes such as DNA replication and transcription, and may cause abnormalities in these processes. DNA secondary structures are strongly associated with ~20 hereditary neurological diseases (due to simple sequence amplifications), ~50 human diseases (caused by genomic rearrangements and deletions), and several psychiatric diseases (resulting from polymorphisms in simple repeat sequences) (199). Triplet repeats, which form hairpin loops or slipped conformations, can give rise to expansions resulting in diseases such as myotonic dystrophy, fragile X syndrome, Friedreich’s ataxia, and Huntington disease (200). Z-DNA and triplex formation potentials at the oncogene c-MYC correspond with major breakpoint hotspots found in lymphomas and leukemias. Similarly, the major breakpoint cluster region in BCL-2 follicular lymphomas can form triplex DNA structures (201). Multiple stem-loop structures have been predicted or identified in several human fragile sites examined so far (202). Genome-wide analysis of palindrome formation, due to large inverted repeats, revealed that these sequences cluster in cancer cells at regions which undergo gene amplification, implicating these alternative structures in tumor progression (203). Purine-pyrimidine tracts and other repetitive elements capable of forming alternative DNA structures are overrepresented in DNA sequences surrounding breakpoints involved in chromosomal rearrangements (204-206). Detailed analysis of 11 gross deletions resulting in various diseases revealed that alternative DNA conformations could explain the formation of DNA breaks at known breakpoints in patients (206). While these studies provide further evidence supporting a role of various alternative DNA secondary structures in disease development and progression, no unbiased study has been performed analyzing the formation of multiple stem-loop DNA secondary structures.

Chromosomal fragile sites exhibit gaps or breaks on metaphase chromosomes under conditions that partially inhibit DNA synthesis (1). Many genes deleted, amplified, or rearranged in cancer are located within fragile sites (207). In a comprehensive survey of simple recurrent cancer-specific translocations, we found that for over half of the gene pairs involved in these translocations, the breakpoints of at least one gene mapped to fragile sites (70). Bignell et al. also found, by deriving mutation signatures of unexplained homozygous deletions in cancer cell lines, that some matched those of recessive cancer genes while others had signatures similar to fragile sites (69). In addition to cancer, fragile site instability is associated with neuropsychiatric diseases, including schizophrenia and autism (208). More importantly, fragile site-induced DNA breakage can produce deletions within the FHIT gene and the formation of RET/PTC1 rearrangements, resembling those found in human tumors (71,161).

Fragile sites are divided into two classes, common or rare, based on their frequency in the population, and are further divided according to their mode of induction in cultured cells. While rare fragile sites are present in less than 5% of the population and inherited in a Mendelian manner, common fragile sites are present in all individuals. Most common fragile sites are induced by low doses of aphidicolin (APH) (8), an inhibitor of DNA polymerases α, δ, and ε (6,7). The precise mechanism of instability at fragile sites remains elusive, but analysis of several common fragile sites has revealed AT-rich sequences displaying the potential to form highly stable secondary structures (13,14), which may stall DNA replication fork progression. The CGG repeats, which are present in all rare, folate-sensitive fragile sites, can form quadruplex (23) and hairpin (24) structures in vitro, and display significant blocks to DNA replication both in vitro (25) and in vivo (26). Our study of the AT-rich rare fragile site FRA16B demonstrated the formation of secondary structure and DNA polymerase stalling within this sequence in vitro, as well as reduced replication efficiency and increased instability in human cells (27). The examination of replication intermediates from cells containing AT-rich sequences within common fragile site FRA16D in yeast showed site-specific replication fork stalling depending on the length of the AT repeat (15). DNA synthesis of the same fragile site by human replicative polymerases δ and α using an in vitro primer extension assay confirmed polymerase stalling at sites predicted to form inhibitory DNA structures (16). Similar findings were observed for eukaryotic replicative polymerases at hairpin and tetraplex structures formed within CGG repeat expansions (28). Replication fork stalling also occurs at endogenous AT-rich sequences within the common fragile site FRA16C in human lymphoblastoid cells, and this stalling is enhanced by APH treatment (209). These data suggest that the ability of fragile sites to form stable secondary structures during DNA replication likely contributes to their breakage by stalling replication fork progression. Despite the connection between DNA secondary structure formation and DNA fragility, no comprehensive study has evaluated the extent of DNA secondary structure-forming potential at fragile versus non-fragile regions.

In this study, we use computational predictions of the ability of the chromosome 10 sequence to form multiple stem-loop DNA secondary structures, and compare the stability and clustering of these structures at fragile sites versus non-fragile sites. On chromosome 10, three APH-induced common fragile sites contained more sequence segments with potential high secondary structure-forming ability compared to non-fragile regions, and in the APH-induced sites these segments occurred in denser clusters than in non-fragile regions. Traditionally, fragile sites are defined cytogenetically as unstained gaps with an average size of 3 Mb on metaphase chromosomes. However, fragility is not present in the entire region. Using secondary structure-forming ability in combination with the information derived from molecularly defined rare and common fragile sites, we developed a threshold to predict DNA fragility; using this threshold, we can narrow down sites of true fragility within the current cytogenetically defined fragile sites. These computational analyses were further validated with an in vitro alternative DNA structure formation assay and a DNA breakage cell assay. Similarly applying this threshold, we uncovered potential new fragile sites that were previously unidentified or too small to be observed cytogenetically. Furthermore, regions identified by this DNA fragility prediction threshold were correlated with regions known to be mutated in human diseases and cancer.

METHODS AND MATERIALS

DNA secondary structure prediction

The nucleotide sequence of chromosome 10 was obtained from the Human Genome Project build 37.2 (hg 37.2) as FASTA text files. The downloaded contig sequences were assembled sequentially, in the p to q arm direction, with unsequenced gap nucleotides between contigs assigned a sequence of “N”. A list of known fragile sites on chromosome 10 and their locations was obtained from the NCBI Gene database (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene using "fragile site" as a search term). For example, FRA10G is located at cytogenetic position 10q11.2, corresponding to nucleotides 42,300,001 to 52,900,000. Fragile sites were then classified as common or rare and by mode of induction [APH, bromodeoxyuridine (BrdU), or folic acid]. Non-fragile regions were defined by removing all known fragile sites, centromeric (210), subtelomeric (211), and gap sequences.

Using the Mfold program (194), the potential of single-stranded DNA to form a stable secondary structure can be predicted along with its free energy value. The secondary structure-forming potential of the entire chromosome 10 was analyzed by inputting 300-nt segments with a 150-nt shift window into the Mfold program. We chose the 300-nt length because it equals the Okazaki initiation zone of the DNA replication fork in mammalian cells, which possesses a single-stranded property during DNA replication (195,196). The default [Na+], [Mg2+], and temperature used were 1.0 M, 0.0 M, and 37°C, respectively. The free energy value for the most stable predicted secondary structure for each segment was used in the analyses. To evaluate whether segments with a low predicted free energy value clustered within a region, a section was arbitrarily defined as 50 consecutive segments.
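The segmentation scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, Mfold itself is not invoked, and discarding every window that contains an "N" is an assumption based on the gap handling described under Statistical analysis.

```python
from typing import Iterator, List, Tuple

WINDOW = 300               # segment length in nt (from the text)
STEP = 150                 # shift between consecutive segments in nt (from the text)
SEGMENTS_PER_SECTION = 50  # segments grouped into one section (from the text)


def segments(seq: str, window: int = WINDOW, step: int = STEP) -> Iterator[Tuple[int, str]]:
    """Yield (0-based start, subsequence) for every full-length window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


def foldable_segments(seq: str) -> List[Tuple[int, str]]:
    """Keep only windows without gap characters (assumed filter, see lead-in)."""
    return [(start, sub) for start, sub in segments(seq) if "N" not in sub.upper()]


def section_span(n_segments: int = SEGMENTS_PER_SECTION,
                 window: int = WINDOW, step: int = STEP) -> int:
    """Nucleotides covered by n overlapping segments: (n - 1) * step + window."""
    return (n_segments - 1) * step + window
```

With the parameters above, `section_span()` evaluates to 49 × 150 + 300 = 7,650 nt per section. Each surviving window would then be passed to the folding program separately.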

Analyses of DNA secondary structure within FRA3B and FRA16D were performed using the Mfold program in the same manner as above. The FRA3B sequence, defined by Becker et al. (212), is located between nucleotides 59,619,291 and 64,206,880 of chromosome 3. The FRA16D sequence, defined by Krummel et al. (213), is located between nucleotides 76,517,028 and 79,359,665 of chromosome 16.

Statistical analysis

The mean and standard deviation (SD) of the predicted free energy values were first calculated for the entire chromosome to generate threshold values that deviate substantially from the overall sequence (1 or 2 SD below the mean). The average predicted free energy value of the entire chromosome 10 is -27.25 ± 9.35 kcal/mol. Therefore, we defined two threshold values of -40 and -50 kcal/mol that correspond to ~1 and 2 SD below the average predicted free energy. A free energy value below these thresholds indicates that a segment is energetically more likely to fold into secondary structures than the overall sequence. The mean and SD were then calculated for each fragile site as well as the non-fragile regions, using the predicted free energy values for the 300-nt segments contained within those regions. Segments containing sequences overlapping with other regions or ‘N’ gap sequence were not used in analyses. To compare the overall ability of each fragile or non-fragile region to form stable secondary structures, the percentage of segments within each region with a free energy value less than -40 or -50 kcal/mol was calculated. Using a chi-square test, the percentage of segments below each threshold was compared between the non-fragile region and each fragile site.
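A minimal Python sketch of the threshold and chi-square comparison is given below. The analyses in this study were run in SPSS, so this is not the code that was used; the function names are hypothetical, and the use of a Pearson chi-square on a 2 × 2 table without continuity correction is an assumption. With the reported mean and SD, the exact 1 SD and 2 SD values are -36.6 and -45.95 kcal/mol, which the text approximates with cutoffs of -40 and -50 kcal/mol.

```python
import math
from typing import Sequence, Tuple


def thresholds_from_summary(mean: float, sd: float) -> Tuple[float, float]:
    """Return the values 1 SD and 2 SD below the mean."""
    return mean - sd, mean - 2 * sd


def percent_below(energies: Sequence[float], threshold: float) -> float:
    """Percentage of segments with free energy strictly below the threshold."""
    return 100.0 * sum(e < threshold for e in energies) / len(energies)


def chi_square_2x2(below_a: int, total_a: int,
                   below_b: int, total_b: int) -> Tuple[float, float]:
    """Pearson chi-square statistic and P value for two regions (1 df).

    Assumption: no continuity correction; every expected count must be nonzero.
    """
    observed = [[below_a, total_a - below_a], [below_b, total_b - below_b]]
    row_totals = [total_a, total_b]
    col_totals = [below_a + below_b, (total_a - below_a) + (total_b - below_b)]
    n = total_a + total_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

Here `below_a` and `total_a` would be the counts for one fragile site, and `below_b` and `total_b` those for the non-fragile region.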

To investigate whether low free energy segments form clusters, fifty consecutive 300-nt segments were grouped into one section, and sections were numbered consecutively in the p to q arm direction. The proportion of segments with a free energy value less than -40 kcal/mol within each section was calculated. A high proportion suggests that the low free energy segments tend to cluster together in that section as compared to other sections. The mean and standard deviation of the proportion of segments were calculated for the sections contained within each fragile site and the non-fragile regions. Sections containing sequences overlapping with other regions or ‘N’ gap sequence were not used in analyses. We first assessed the overall difference in the proportion of low free energy segments among the 7 non-fragile and fragile regions by analysis of variance. We then performed post-hoc comparisons of the mean difference in the proportion of segments per section for each fragile site versus the non-fragile DNA using a two-tailed Student’s t-test. P values were calculated after Bonferroni correction for multiple comparisons.
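The per-section clustering measure and the Bonferroni adjustment can be illustrated as follows. This is a hedged sketch with hypothetical function names; dropping an incomplete trailing section is an assumption made here for simplicity, and the analysis of variance and t-tests themselves are not reproduced.

```python
from typing import List, Sequence


def section_proportions(energies: Sequence[float],
                        per_section: int = 50,
                        threshold: float = -40.0) -> List[float]:
    """Proportion of segments below the threshold in each full section.

    Assumption: a trailing group with fewer than per_section segments is dropped.
    """
    proportions = []
    for start in range(0, len(energies) - per_section + 1, per_section):
        block = energies[start:start + per_section]
        proportions.append(sum(e < threshold for e in block) / per_section)
    return proportions


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

The list returned by `section_proportions` for a given region is the quantity whose mean and standard deviation are compared between each fragile site and the non-fragile DNA.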

All statistical analyses were performed using IBM SPSS Statistics v.19 (SPSS, Chicago, IL, USA).

Plasmids

The plasmid containing intron 11 of human RET sequence, pRET3, was constructed by first generating a 2177-bp fragment produced by polymerase chain reaction (PCR) of human genomic DNA using primers (5’-TACCCTGCTCTGCCTTTCAGATGG-3’ and 5’-CTGTCCTCTTCTCCTTCATC-3’) flanking intron 11 of RET, and then cloning into the SmaI site of the pGEM3zf(+) vector (Promega Co.).

The plasmid containing intron 7 of human NCOA4 sequence, pELE1, was constructed by first generating a 2262-bp fragment produced by PCR of human genomic DNA using primers (5’-AGACCTTGGAGAACAGTCAG-3’ and 5’-CAGAGCCTCCTTCTCACAATTC-3’) flanking intron 7 of NCOA4, and then cloning into the pGEM3zf(+) vector.

Re-duplexing assay

Reduplexing reactions were performed as previously described (27,214,215) with slight modifications. The 366-bp RET fragment (corresponding to 43610126-43610492 of chromosome 10 from hg37.2) was obtained by digesting pRET3 with restriction enzymes PstI and NaeI. The 348-bp NCOA4 fragment (corresponding to 51583022-51583371) was obtained by digesting pELE1 with restriction enzymes SwaI and HincII. The 317-bp PTEN exon 1 fragment (corresponding to 89623390-89623693) was obtained by digesting a PTEN cDNA clone (Open Biosystems, Catalog # MHS1011-61412) with restriction enzymes EcoRI and BanI. The 289-bp PTEN exon 9 fragment (corresponding to 89725202-89725489) was obtained by digesting the PTEN cDNA clone with DpnI and AseI. The fragments were then gel-purified, dephosphorylated at the 5’ end with calf intestinal alkaline phosphatase, and end-labeled with [γ-32P] ATP (PerkinElmer) using T4 kinase. End-labeled DNA (1 ng) was added to 10 μL denaturing solution (0.5 M NaOH, 1.5 M NaCl) and incubated at room temperature for 5 min. Following denaturation, 490 μL of 5X TE buffer containing NaCl (ranging from 0.05 to 1 M) was added to the reaction mixture, which was then incubated at 37°C for 3 hours. DNA was ethanol-precipitated in the presence of glycogen (Roche). The DNA pellets were then air-dried and resuspended in TE buffer. DNA samples were electrophoresed in a 4% polyacrylamide gel cast in TBE at 150 V for 3 hours at room temperature. The gel was dried and visualized by phosphorimaging (GE Healthcare). All enzymes were purchased from New England Biolabs.

Copy number alterations

Regions of copy number alteration on chromosome 10 were obtained from Tumorscape (www.broadinstitute.org/tumorscape).

RESULTS

Analysis of chromosome 10 for DNA secondary structure-forming potential

Chromosome 10 was chosen to investigate differences in DNA secondary structure-forming ability between fragile and non-fragile DNA, due to its moderate length and relatively even distribution of non-fragile (~60%) versus fragile (~40%) sequences. In addition, the fragile sites present on chromosome 10 include both common and rare sites, as well as different modes of induction (APH, BrdU, and folic acid) (Table 4.1). In contrast to the rare fragile sites on chromosome 10, the common fragile sites have yet to be defined on the molecular level, and a computational approach would be beneficial in predicting their underlying fragility.

Table 4.1: Classification of the chromosome 10 sequence based on fragile site location

Region        Cytogenetic location   Fragile site class   Mode of induction   Number of segments   Number of sections
Non-fragile   -                      -                    -                   517301               10334
FRA10G        10q11.2                common               APH                 63867                1272
FRA10C        10q21                  common               BrdU                118000               2360
FRA10D        10q22.1                common               APH                 28668                574
FRA10A        10q23.3, 10q24.2       rare                 folic acid          67335                1349
FRA10B        10q25.2                rare                 BrdU                20001                401
FRA10E        10q25.2                common               APH                 20001                401
FRA10F        10q26.1                common               APH                 55665                1113

The approximately 136-Mb sequence of chromosome 10 was used, and unsequenced gap, centromeric (210), and subtelomeric (211) sequences were then removed. The secondary structure-forming potential of the remaining chromosome 10 sequence (131 Mb) was analyzed using the Mfold program (194) by sequentially inputting 300-nt segments with a 150-nt shift window (Figure 4.1). For each DNA segment, the program predicts various potential DNA secondary structures along with a corresponding free energy value. A more negative free energy value indicates a more stable DNA secondary structure. Therefore, the most favorable structure and free energy value were selected for each segment. The average free energy value for chromosome 10 is -27.2 ± 9.3 kcal/mol, with a range of -180.5 to +6.1 kcal/mol.
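The selection step can be written compactly. The sketch below assumes that the predicted structures for one segment have already been parsed into records; the `Fold` type and its field names are hypothetical and do not correspond to any Mfold output format.

```python
from typing import List, NamedTuple


class Fold(NamedTuple):
    structure: str   # any structure label or notation (illustrative only)
    delta_g: float   # predicted free energy in kcal/mol


def most_stable(folds: List[Fold]) -> Fold:
    """Return the prediction with the most negative free energy."""
    return min(folds, key=lambda fold: fold.delta_g)
```

Because stability increases as the free energy becomes more negative, taking the minimum over `delta_g` picks the most favorable structure for the segment.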

[Figure 4.1 plot: free energy (kcal/mol; y axis, 50 to -200) against segment number (lower x axis, 0 to 9 x 10^5) and nucleotide position (upper x axis, 0 to 135 Mb), with fragile sites FRA10A through FRA10G bracketed and a legend distinguishing fragile from non-fragile DNA.]

Figure 4.1: Free energy values for predicted DNA secondary structures on chromosome 10. The free energy value for the most favorable Mfold-predicted DNA secondary structure for each 300-nt segment, with 150-nt increments, of the chromosome 10 sequence is presented. The lower x axis depicts the segment number in the p to q arm direction on chromosome 10, the upper x axis depicts the corresponding nucleotide number, and the y axis displays the free energy value of the predicted structure. Un-sequenced gap, centromeric, and subtelomeric sequences were removed. Non-fragile and fragile DNAs are depicted in grey and black, respectively. Brackets mark the locations of fragile sites.

Figure 4.2: Division of the chromosome 10 sequence into non-fragile and fragile regions

To examine whether there is clustering of sequences capable of forming stable DNA secondary structures, fifty consecutive 300-nt segments were grouped into sections, with each section consisting of 7,650 nt. The level of clustering was calculated based on the proportion of segments within each section with a free energy value less than -40 kcal/mol. This threshold was chosen because it is approximately one SD below the mean free energy value of chromosome 10 and predicts a more energetically favorable structure. A higher proportion indicates that more segments are capable of forming highly stable secondary structures. The mean proportion of segments per section below -40 kcal/mol for chromosome 10 is 0.087 ± 0.116, with a range of 0 to 0.96.
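The clustering measure described above can be sketched as follows. This is an illustrative sketch only: the function names and the toy free energy values are hypothetical, and dropping an incomplete trailing section is an assumption, since the handling of leftover segments is not stated in the text. The helper `section_length_nt` shows why fifty half-overlapping 300-nt segments span 7,650 nt.

```python
def section_proportions(free_energies, per_section=50, threshold=-40.0):
    """Group consecutive segments into sections and return, for each section,
    the proportion of segments with free energy below `threshold`.
    An incomplete trailing section is dropped (assumption)."""
    proportions = []
    for i in range(0, len(free_energies) - per_section + 1, per_section):
        section = free_energies[i:i + per_section]
        below = sum(1 for dg in section if dg < threshold)
        proportions.append(below / per_section)
    return proportions


def section_length_nt(per_section=50, size=300, step=150):
    """Nucleotides spanned by `per_section` overlapping segments."""
    return size + step * (per_section - 1)


# Toy data (hypothetical): 5 of the first 50 segments fall below -40 kcal/mol.
energies = [-45.0] * 5 + [-20.0] * 45 + [-20.0] * 50
proportions = section_proportions(energies)
```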

Predicted DNA secondary structure-forming ability at fragile sites compared with non-fragile region

Delayed replication within fragile sites due to the formation of stable DNA secondary structures is suggested to contribute to fragile site breakage (14). To investigate if fragile DNA can form significantly more stable secondary structures than non-fragile DNA, the 131-Mb chromosome 10 sequence was divided into non-fragile or fragile regions, with the fragile sequence being further defined into individual fragile sites based on class and mode of induction (Figure 4.2). A total of 373,537 segments were analyzed for fragile sites, and 517,301 segments for non-fragile regions (Table 4.1).

The mean free energies of the predicted secondary structures in the non-fragile and fragile regions are shown in Table 4.2. The non-fragile DNA exhibited a mean free energy value comparable to that of the entire chromosome 10, and no significant difference was seen between non-fragile and fragile sites. This result is not unexpected, since the mean free energy values are averaged over very large regions and differences between non-fragile and fragile sites may be masked.

Table 4.2: Free energy of predicted DNA secondary structures for fragile and non-fragile regions of chromosome 10

Region                       Chr 10  Non-fragile  FRA10G  FRA10C  FRA10D  FRA10A  FRA10B/E  FRA10F
Number of segments           870837  517301       63867   118000  28668   67335   20001     55665
Mean Free Energy (kcal/mol)  -27.2   -27.5        -28.1   -24.0   -32.0   -26.3   -27.2     -29.4
Standard Deviation           9.3     9.4          10.1    7.6     10.2    8.9     8.4       9.5

Next, we examined the number of segments with free energy values less than -40 kcal/mol for each fragile site and non-fragile DNA; such segments are energetically more favorable to fold into secondary structures compared to the overall chromosome 10 sequence. Three APH-induced common fragile sites, FRA10G, FRA10D, and FRA10F, all contain a significantly greater proportion of segments with free energy values less than -40 kcal/mol than non-fragile DNA (P < 1E-100), indicating more stable DNA secondary structures forming within these regions (Table 4.3). In contrast, the BrdU-induced common fragile site FRA10C, the folate-sensitive rare fragile site FRA10A, and the co-mapped BrdU-induced rare fragile site FRA10B and APH-induced common fragile site FRA10E had a smaller proportion of segments with free energy values less than -40 kcal/mol compared to non-fragile DNA. Similar results were obtained when analyzing the percentage of segments with free energy values below -50 kcal/mol (Table 4.4), which is approximately two SD below the mean free energy value of chromosome 10, indicating increasing stability in the predicted secondary structures.

Table 4.3: Percentage of segments with free energy below -40 kcal/mol

Region                       Non-fragile     FRA10G         FRA10C          FRA10D         FRA10A         FRA10B/E       FRA10F
Segments ΔG>-40 kcal/mol*    471257 (91.1%)  56047 (87.8%)  114086 (96.7%)  22991 (80.2%)  63094 (93.7%)  18724 (93.6%)  48948 (87.9%)
Segments ΔG<-40 kcal/mol*    46044 (8.9%)    7820 (12.2%)   3914 (3.3%)     5677 (19.8%)   4241 (6.3%)    1277 (6.4%)    6717 (12.1%)
P-value**                    -               2.34E-166      0               0              1.21E-133      7.04E-35       4.67E-133

* Total number of segments with a free energy value higher or lower than -40 kcal/mol, with the percentage of the total in parentheses.
** P-value calculated by comparing each fragile site to non-fragile DNA using a Chi-square test. A P-value of 0 represents a P-value < 1E-200.
*** Fragile sites highlighted in grey (FRA10G, FRA10D, and FRA10F) have a significantly greater proportion of segments with a free energy value less than -40 kcal/mol compared to non-fragile DNA.
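As an illustration of the Chi-square comparison reported in Table 4.3, the following sketch computes a 2 x 2 Pearson Chi-square statistic and its P-value (one degree of freedom) using only the Python standard library. The function name is hypothetical, and the absence of a continuity correction is an assumption, since the text does not specify which variant of the test was used. The counts are those listed for FRA10G and non-fragile DNA in Table 4.3.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square for the table [[a, b], [c, d]] without continuity
    correction (assumption). Returns (statistic, P-value) for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the Chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# FRA10G versus non-fragile DNA: segments below and above -40 kcal/mol (Table 4.3).
chi2, p_value = chi_square_2x2(7820, 56047, 46044, 471257)
```

For very large statistics the floating-point P-value underflows to zero, which is in line with the table's convention of reporting extremely small P-values as 0.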

Table 4.4: Percentage of segments with free energy below -50 kcal/mol

Region                       Non-fragile     FRA10G         FRA10C          FRA10D         FRA10A         FRA10B/E       FRA10F
Segments ΔG>-50 kcal/mol*    506900 (98.0%)  61984 (97.1%)  117153 (99.3%)  27440 (95.7%)  66378 (98.6%)  19765 (98.8%)  54297 (97.5%)
Segments ΔG<-50 kcal/mol*    10401 (2.0%)    1883 (2.9%)    847 (0.7%)      1228 (4.3%)    957 (1.4%)     236 (1.2%)     1368 (2.5%)
P-value**                    -               1.77E-54       8.02E-203       2.07E-148      1.21E-113      1.28E-16       1.62E-12

* Total number of segments with a free energy value higher or lower than -50 kcal/mol, with the percentage of the total in parentheses.
** P-value calculated by comparing each fragile site to non-fragile DNA using a Chi-square test.
*** Fragile sites highlighted in grey (FRA10G, FRA10D, and FRA10F) have a significantly greater proportion of segments with a free energy value less than -50 kcal/mol compared to non-fragile DNA.

To examine any differences in the clustering of DNAs capable of forming stable DNA secondary structures between non-fragile and fragile regions, fifty consecutive 300-nt segments were grouped into sections and analyzed. A total of 7,470 sections were analyzed for fragile sites and 10,334 sections for non-fragile regions (Table 4.1). The proportion of segments within each section having a free energy value less than -40 kcal/mol was then calculated (Figure 4.3) and averaged for chromosome 10, non-fragile DNA, and each fragile site (Table 4.5). Analysis of variance revealed a significant difference in the proportion of segments below -40 kcal/mol among all fragile and non-fragile sites (P = 7.7E-292). Subsequent pairwise comparison of each fragile site with non-fragile DNA showed that APH-induced common fragile sites FRA10G, FRA10D, and FRA10F exhibited a significantly greater proportion of low free energy segments per section as compared to non-fragile DNA (P < 1E-13). Conversely, BrdU-induced common fragile site FRA10C, rare fragile site FRA10A, and co-mapped rare fragile site FRA10B and APH-induced common fragile site FRA10E each had a smaller proportion of segments below -40 kcal/mol compared to non-fragile DNA.
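The pairwise comparisons can be approximated from the summary statistics in Table 4.5, as in the sketch below. Several choices here are assumptions rather than details stated in the text: the standard error is computed without pooling variances, the two-tailed P-value uses a normal approximation to Student's t distribution (reasonable for samples this large), and the Bonferroni factor is taken to be six, one per fragile-site column. The function name is hypothetical.

```python
import math


def compare_to_nonfragile(mean, sd, n,
                          ref_mean=0.089, ref_sd=0.119, ref_n=10334,
                          n_tests=6):
    """Compare one fragile site with non-fragile DNA using summary statistics.
    Returns (standard error, z statistic, Bonferroni-adjusted two-tailed P)."""
    se = math.sqrt(sd ** 2 / n + ref_sd ** 2 / ref_n)   # unpooled standard error
    z = (mean - ref_mean) / se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))   # normal approximation
    return se, z, min(1.0, p_two_tailed * n_tests)


# FRA10G summary statistics from Table 4.5: mean 0.122, SD 0.144, 1272 sections.
se, z, p_adjusted = compare_to_nonfragile(0.122, 0.144, 1272)
```

Under these assumptions the standard error for FRA10G comes out close to the 0.0042 listed in Table 4.5, and the adjusted P-value is of the same order of magnitude as the tabulated one.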

Interestingly, in both analyses only APH-induced common fragile sites were statistically significant in favor of forming stable secondary structure compared to non-fragile regions, except for FRA10E, an APH-induced common fragile site co-located with the BrdU-induced rare fragile site FRA10B. However, a literature search revealed that FRA10E was initially classified as a provisional common fragile site (216) and has not been verified since. Therefore, due to its overlap with a rare fragile site and poor classification, this region may not be an APH-induced common fragile site. Together, these results show that APH-induced common fragile sites possess more sequences with potential to form highly stable secondary structures, and that these sequences cluster together more often than in non-fragile DNA.

Figure 4.3: Density of secondary structure-forming potential on chromosome 10. The proportion of segments per section (consisting of 50 segments) with a predicted free energy value less than -40 kcal/mol is depicted for chromosome 10, where each data point represents 7,650 nt. APH-induced common fragile sites (CFS) have high levels of DNA secondary structure clustering. BrdU-induced common fragile site FRA10C has low levels of clustering compared to the overall sequence. Folic acid rare fragile site (RFS) FRA10A has low levels of clustering in the 5’ half and higher levels in the 3’ section. Non-fragile DNA exhibits fluctuations in regions with low and high levels of clustering, with increasingly high levels of clustering approaching the 3’ telomere.

Refining of cytogenetically-defined fragile sites

Table 4.5: Distribution of proportion of segments per section with free energy below -40 kcal/mol

Region                             Chr 10  Non-fragile  FRA10G    FRA10C     FRA10D    FRA10A    FRA10B/E  FRA10F
Number of Sections                 17403   10334        1272      2360       574       1349      401       1113
Mean                               0.087   0.089        0.122     0.033      0.198     0.063     0.064     0.121
Standard Deviation                 0.116   0.119        0.144     0.047      0.153     0.081     0.073     0.122
Mean Difference from Non-fragile   -       -            0.033     -0.056     0.109     -0.026    -0.025    0.032
Standard Error                     -       -            0.0042    0.0015     0.0065    0.0025    0.0038    0.0038
P-value*                           -       -            2.27E-14  8.19E-279  1.17E-51  1.11E-23  1.39E-4   2.77E-15

* Overall, the mean difference among fragile and non-fragile sites analyzed by Analysis of Variance is highly significant (P = 7.7E-292). Subsequently, post-hoc Student’s t tests (2-tailed) were performed between each fragile site and non-fragile DNA. P values were adjusted for multiple comparisons by Bonferroni correction.
** Fragile sites highlighted in grey have a significantly greater proportion of segments per section with a free energy value less than -40 kcal/mol.

Most fragile sites are identified by G-banding of metaphase chromosomes. Because of the limited resolution of determining fragile site boundaries cytogenetically, not all regions within these sites are necessarily fragile. Therefore, if highly stable secondary structure-forming ability is a major contributing factor to fragile site instability, this property could be used to narrow down sites of fragility within cytogenetically-defined fragile sites. Next, utilizing the Mfold DNA secondary structure data, we investigated the development of a threshold to predict regions susceptible to fragile site breakage. The formation of alternative DNA secondary structures corresponding to disease-associated trinucleotide repeat expansions has been well established at rare, folate-sensitive fragile sites (202). All folate-sensitive fragile sites sequenced to date consist of a CGG repeat, including FRA10A, the most prevalent folate-sensitive rare fragile site in the human genome (217). FRA10A has been characterized at 10q23.3 (217) and 10q24.2 (216,218,219). Sequence analysis of FRA10A at 10q23.3 revealed a minimum of approximately 250 CGG repeats for fragile site breakage (217). The reference sequence (hg build 37.2) that we used in this study has 8 CGG repeats within the FRA10A allele at 10q23.3. Mfold predictions of the reference sequence containing the non-expanded repeats and surrounding sequence (7,650 nt) do not show significant DNA secondary structure-forming potential (Figure 4.4A). This is not surprising, since the large repeat expansions necessary for fragile site breakage are not present in the reference sequence. To determine a threshold of predicted DNA secondary structure-forming potential necessary to result in fragile site breakage, we artificially inserted an additional 242 CGG repeats into the reference sequence, resulting in a total of 250 repeats. Mfold analysis then revealed that the segments corresponding to the expanded CGG repeat exhibit low free energy values; using a 95% confidence interval, we found seven consecutive 300-nt segments below -43.61 kcal/mol (Figure 4.4A). Since expanded CGG repeats form stable secondary structures in vitro and at least 250 repeats at FRA10A result in fragile site breakage, we established a threshold for potential fragile site breakage as having at least seven consecutive segments with a free energy value below -43.61 kcal/mol.
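The threshold defined above (at least seven consecutive segments with a free energy value below -43.61 kcal/mol) amounts to a run-length scan over the per-segment free energies. The following is a minimal sketch of such a scan; the function name and the toy values are hypothetical, and it is not presented as the code used in this study.

```python
def fragile_runs(free_energies, threshold=-43.61, min_run=7):
    """Return (first_segment_index, run_length) for every run of at least
    `min_run` consecutive segments whose free energy is below `threshold`."""
    runs = []
    start = None
    for i, dg in enumerate(free_energies):
        if dg < threshold:
            if start is None:
                start = i            # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and len(free_energies) - start >= min_run:
        runs.append((start, len(free_energies) - start))
    return runs


# Toy free energies (hypothetical): runs of 7, 6, and 9 low-energy segments.
toy = [-30.0] * 3 + [-50.0] * 7 + [-30.0] * 2 + [-45.0] * 6 + [-20.0] + [-60.0] * 9
predicted = fragile_runs(toy)
```

In this toy example the run of six segments is rejected because it falls short of the seven-segment minimum, while the runs of seven and nine segments are retained.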

Next, we examined the validity of this threshold by analyzing common fragile sites characterized on the molecular level. The two most active common fragile sites in the genome, FRA3B and FRA16D (8), display fragility in regions of 4.5 Mb (212) and 2.8 Mb (213), respectively. The DNA sequence corresponding to these characterized regions was analyzed for DNA secondary structure-forming potential, using the Mfold program in the same manner as the chromosome 10 sequence. Upon applying the DNA fragility threshold, we identified eight regions within FRA3B and three within FRA16D, overlapping six and three genes, respectively (Figure 4.4B). Together, these regions account for 0.23% and 0.13% of the defined FRA3B and FRA16D sequences, respectively. One FRA3B region is within a previously defined ‘active’ region located within the tumor suppressor gene FHIT (59,220) and corresponds with APH-induced hybrid deletions (71), a hereditary renal cell carcinoma translocation breakpoint (221), and deletions found in gastric (222), lung (223), cervical (157,224), breast, acute lymphoblastic leukemia, esophageal, liver, brain, skin, and prostate cancers and cancer cell lines (Tumorscape). Within FRA16D, we identified a region corresponding to the tumor suppressor gene WWOX, which contains heterozygous deletions found in gastric and colorectal adenocarcinomas (225) and frequent loss-of-heterozygosity in breast cancer (226). Additionally, all three regions identified in FRA16D correspond with deletions observed in colorectal, breast, lung, brain, skin, and prostate cancers (Tumorscape). Therefore, these findings validate the ability of the DNA fragility threshold to predict regions with the potential of fragile site breakage.

Figure 4.4: Establishment and validation of a threshold for Mfold prediction of chromosomal fragility. (A) The free energy values for the most favorable Mfold-predicted DNA secondary structure for each 300-nt segment with 150-nt increments are shown for FRA10A at 10q23.3 with eight CGG repeats and surrounding sequence found in the reference sequence (grey dots) and artificially inserted CGG repeats to total 250 repeats (black dots). The (CGG)8-containing sequence is not predicted to form significant DNA secondary structures. A minimum of 250 CGG repeats is necessary for FRA10A breakage, and this sequence is predicted to form stable DNA secondary structures. When using a 95% confidence interval for chromosome 10, these expanded repeats exhibit seven consecutive segments below -43.61 kcal/mol (horizontal black line). (B) Applying the fragility prediction threshold to Mfold-predicted DNA structures for FRA3B and FRA16D, eight and three regions were identified, respectively. Overlap with gene sequences and the percent GC content are indicated for each of the regions.

We next applied the DNA fragility threshold to the chromosome 10 sequence, and identified a total of 615 potential fragile regions (Appendix Table 4), with an average of 9.6 ± 3.4 consecutive segments below -43.61 kcal/mol and a range of 7 to 39. These regions account for 0.74% of the overall chromosome 10 sequence. Approximately 31% of the predicted fragile regions are located within the cytogenetic boundaries of previously detected common fragile sites, and the remaining regions are located within non-fragile regions. The latter group may represent potential new fragile sites not detected previously or too small to be detected on the cytogenetic level. The former group, including 172 regions within APH-induced sites and 15 within a BrdU-induced site, can be used to refine potential sites of fragility within previously known fragile sites. For example, within FRA10G, the largest (~10 Mb) of all four APH-induced sites on chromosome 10, there are 74 regions with at least seven consecutive segments below -43.61 kcal/mol (Appendix Table 4), and their distribution revealed variations in the presence of the predicted fragile regions (Figure 4.5), with a cluster of regions possessing a high number of consecutive segments below -43.61 kcal/mol located at the 5’ portion of the sequence.
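The reported coverage of roughly 0.74% can be checked with a short back-of-the-envelope calculation. The sketch below is only an approximation: it uses the mean run length (9.6 segments) rather than the individual region lengths, takes the analyzed sequence as a round 131 Mb, and assumes the predicted regions do not overlap one another. The function name is hypothetical.

```python
def run_span_nt(n_segments, size=300, step=150):
    """Nucleotides spanned by `n_segments` consecutive 300-nt segments
    that are shifted by 150 nt relative to one another."""
    return size + step * (n_segments - 1)


n_regions = 615            # predicted fragile regions on chromosome 10
mean_run = 9.6             # mean number of consecutive segments per region
analyzed_nt = 131_000_000  # approximate analyzed chromosome 10 sequence

covered_nt = n_regions * run_span_nt(mean_run)
percent_covered = 100 * covered_nt / analyzed_nt
```

A minimal seven-segment region spans 1,200 nt, and the estimate from the mean run length lands close to the stated 0.74% of the chromosome 10 sequence.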

RET, an oncogene involved in twelve translocations that result in papillary thyroid carcinoma, known as RET/PTC rearrangements (86), is located within the cluster at the 5’ portion of the FRA10G sequence (Figure 4.5). The average free energy of the entire RET sequence is -41.0 ± 11.9 kcal/mol, which is significantly lower than that of the remaining chromosome 10 sequence (-27.2 ± 9.3 kcal/mol) (P = 4.77E-68), and the RET gene contains five predicted fragile regions with a range of nine to sixteen consecutive segments below -43.61 kcal/mol (Appendix Table 4). In each of the reported RET/PTC rearrangements, RET fuses with a different partner gene. NCOA4, one genetic partner of RET, which participates in the RET/PTC3 translocation (149), is also located in FRA10G. In contrast to RET, however, NCOA4 is located within a region devoid of predicted fragile regions identified by the threshold (Figure 4.5). The average free energy of the NCOA4 sequence is -26.8 ± 8.2 kcal/mol, comparable to the value for the overall chromosome 10 sequence.

Figure 4.5: Regions predicted to exhibit fragile site breakage within APH-induced common fragile site FRA10G. Regions with at least seven consecutive segments with a predicted free energy value of less than -43.61 kcal/mol are presented. The lower x axis depicts the number of the first 300-nt segment isolated by the threshold, the upper x axis displays the corresponding chromosome 10 nucleotide number, and the y axis depicts the number of consecutive segments within the isolated region. The locations of two translocation participating genes, RET (segments 290482-290839) and NCOA4 (343766-343939), are labeled. RET contains five significant regions, while NCOA4 has none.

Studies of RET/PTC tumors indicate that patient breakpoint cluster regions are located in intron 11 of RET and intron 7 of NCOA4 (149). As with the overall free energy values for RET and NCOA4, intron 11 of RET has a significantly more energetically favorable free energy value (-48.5 ± 7.9 kcal/mol) compared to the average predicted free energy value of chromosome 10 (-27.2 ± 9.3 kcal/mol) (P = 1.60E-17), and is within the predicted fragile regions, while intron 7 of NCOA4 (-25.9 ± 6.7 kcal/mol) exhibits a value similar to the chromosome 10 average (Figure 4.6A). Further supporting its secondary structure formation predictions and its involvement in APH-induced DNA fragility, RET, but not NCOA4, exhibits high levels of chromosomal breakage following APH treatment in the HTori-3 human thyroid epithelial cell line (161). In these experiments using fluorescence in situ hybridization assays, we observed RET breakage at a rate of 6% of chromosomes with 0.4 μM APH treatment compared to only 0.62% of chromosomes for NCOA4. DNA breakage within intron 11 of RET has also been confirmed on the nucleotide level following APH treatment. We detected 0.024 ± 0.015 breaks per 100 cells within intron 11 with 0.4 μM APH treatment, compared to 0.004 ± 0.009 breaks per 100 cells without treatment (P = 0.010) (161).

These data suggest that the potential to form a highly stable DNA secondary structure contributes to the observed fragile site-induced breakage at RET and, to some extent, to the chromosomal breakage in patients leading to RET/PTC rearrangements, which ultimately result in papillary thyroid carcinoma. Using DNA secondary structure predictions and the established threshold, we have refined regions of fragility within the cytogenetic boundaries of FRA10G and other common fragile sites, as well as predicted potential new fragile sites.

Validating the formation of secondary structures

To validate the predictions of the Mfold program, we investigated secondary structure formation within the breakpoint cluster regions of RET and NCOA4 by subjecting these DNA fragments to re-duplexing with various concentrations of NaCl to allow re-annealing of the single strands following denaturation. This in vitro re-duplexing assay has been used to analyze the formation of DNA secondary structures generated by CTG (214,215) and CGG (227) repeats, as well as the FRA16B AT-rich repeat sequence (27). Here, in addition to the two sequences derived from FRA10G, two regions (exon 1 and exon 9) within the tumor suppressor gene PTEN were also examined. PTEN, one of the most highly mutated tumor suppressor genes (228), contains one predicted fragile region with eight consecutive segments below -43.61 kcal/mol (Appendix Table 4). Mutations within PTEN are found throughout the gene and its promoter region, with few mutations located after exon 8 (229). The region identified in this study contains exon 1 and the promoter sequence of PTEN, where high levels of mutations have been observed. The average free energy for exon 1 of PTEN (-58.4 ± 21.3 kcal/mol) is significantly more energetically favorable compared to the average free energy of chromosome 10 (-27.2 ± 9.3 kcal/mol) (P = 1.25E-3) (Figure 4.6A). Exon 9 of PTEN is located within a region lacking mutations (229), and like NCOA4, has an average free energy (-21.2 ± 4.0 kcal/mol) similar to chromosome 10 (Figure 4.6A).

The 366-bp RET intron 11 and 317-bp PTEN exon 1 DNA fragments predicted to form highly stable secondary structures (with free energy values of -74.5 and -73.2 kcal/mol, respectively), as well as the 348-bp NCOA4 intron 7 and 289-bp PTEN exon 9 DNA fragments predicted to form less stable structures (-20.1 and -23.2 kcal/mol, respectively) (Figure 4.7), were subjected to the re-duplexing reaction to observe the formation of DNA secondary structures. We observed slower migrating products in the re-duplexed samples, but not in the untreated samples (Figure 4.6B), suggesting the formation of secondary structure during re-duplexing of these DNAs. RET, PTEN exon 1, NCOA4, and PTEN exon 9 all exhibited the formation of secondary structures to varying degrees (Figure 4.6B). While the concentration of NaCl (0.05 to 1 M) had no impact on the amount of secondary structure formation, DNA sequence did. RET and PTEN exon 1 showed significantly greater secondary structure formation compared to NCOA4 (P = 4.8E-10 and 3.0E-5, respectively) and PTEN exon 9 (P = 5.7E-8 and 1.6E-3, respectively) (Figure 4.6C). No significant difference was observed in the level of secondary structure formation between RET and PTEN exon 1 or between NCOA4 and PTEN exon 9. Therefore, these data confirm that a more energetically favorable free energy value predicted by the Mfold program corresponds to greater DNA secondary structure formation. These results demonstrate the validity of the Mfold secondary structure predictions and further confirm that the established threshold is an accurate indicator of potential DNA fragility.

Figure 4.6: DNA secondary structure prediction and in vitro detection within regions predicted to exhibit fragile site instability. (A) The computed lowest free energy value of the predicted DNA secondary structures from segments analyzed by the Mfold program was fit to a curve for regions isolated by the threshold for potential fragility within RET (red) and PTEN exon 1 (green), and regions not isolated by the threshold within NCOA4 (blue) and PTEN exon 9 (orange). The Matlab function polyfit found coefficients of a polynomial P(X) of degree N that fit the raw data best in a least-squares sense. Intron 11 of RET (solid line), exon 1 of PTEN (solid line), intron 7 of NCOA4 (solid line), and exon 9 of PTEN (solid line) plus flanking sequences (dashed) are depicted. The mean free energy of chromosome 10 (-27.2 kcal/mol) is depicted by the horizontal black line. The x axis indicates the size of the sequences and the y axis displays the free energy of the predicted structure. (B) Representative gel electrophoresis analysis of re-annealed RET intron 11, PTEN exon 1, NCOA4 intron 7, and PTEN exon 9 DNAs. DNA fragments were denatured and subjected to re-duplexing in the presence of 0.1 M NaCl and analyzed by native 4% PAGE. When compared to untreated samples, slower-migrating products suggest the formation of a secondary structure during re-duplexing of these DNAs. (C) Average percentages of DNA secondary structures for DNA fragments of RET (n=12), PTEN exon 1 (n=7), NCOA4 (n=12), and PTEN exon 9 (n=5) are shown with ± SD. The intensity of all bands was measured by densitometric analysis, and the percentage of shifted bands was calculated and averaged over 5-12 individual experiments performed over a range of NaCl concentrations (50 mM-1 M). Statistical analysis was performed using a Student’s t test (2-tailed), where * refers to P < 2E-3.

Analysis of regions capable of forming highly stable secondary structures

Figure 4.7: The most stable Mfold predicted DNA secondary structures and free energy values for DNA fragments analyzed by in vitro re-duplexing assays: RET intron 11 (A), PTEN exon 1 (B), NCOA4 intron 7 (C), and PTEN exon 9 (D).

The co-localization of these predicted fragile regions with genes was next examined. Of these 615 regions, 426 (69%) are located within regions encoding 258 genes (Appendix Table 4). Among them, 47 genes (possessing 71 predicted fragile regions) have known insertions, deletions, translocations, or point mutations that result in cancer or other diseases in humans (Table 4.6, Appendix Table 5). Specifically, six of the 22 genes with mutations consisting of insertions, deletions, or translocations are found in cancer (Figure 4.8). Included within these genes are RET and PTEN, which form significant levels of DNA secondary structure in vitro. CCDC6, the partner gene of RET in the RET/PTC1 rearrangement in PTC (86), was also identified. Located within the BrdU-induced common fragile site FRA10C, CCDC6 exhibits chromosomal breakage in HTori-3 cells at a rate of 2.72% of chromosomes following BrdU treatment (161). Additionally, the region predicted in this study to form high levels of secondary structure corresponds to intron 1 of CCDC6, the breakpoint cluster region isolated in RET/PTC patient tumors (149). Also among these genes are ZMIZ1 (230), ADD3 (231), and NFKB2 (232), all known to be disrupted in chromosomal translocations resulting in leukemia. In B-cell acute lymphoblastic leukemia, ZMIZ1 rearranges with ABL1 (230), and translocations of ADD3 with NUP98 (231) are detected in T-cell acute lymphoblastic leukemia. NFKB2 rearranges with INA, resulting in B-cell chronic lymphocytic leukemia and cutaneous T-cell lymphoma (233); with TBXAS1, resulting in multiple myeloma; and with IGH@, resulting in B-cell non-Hodgkin lymphoma (234). NFKB2 also displays deletions, point mutations, and gene amplification found in B- and T-cell leukemias and lymphomas (232).
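The gene co-localization step described at the start of this section reduces to an interval-overlap test between predicted fragile regions and gene boundaries. The sketch below illustrates the idea in segment-number coordinates, using the RET and NCOA4 segment ranges given in the Figure 4.5 caption. The function name and the example regions passed to it are hypothetical and are not entries from Appendix Table 4.

```python
# Gene boundaries in segment numbers, as given in the Figure 4.5 caption.
GENES = {
    "RET": (290482, 290839),
    "NCOA4": (343766, 343939),
}


def genes_overlapping(first_segment, n_segments, genes=GENES):
    """Return the names of genes whose segment range overlaps a predicted
    region that starts at `first_segment` and spans `n_segments` segments."""
    last_segment = first_segment + n_segments - 1
    return [name for name, (gene_start, gene_end) in genes.items()
            if first_segment <= gene_end and last_segment >= gene_start]


# Hypothetical predicted regions: (first segment, number of consecutive segments).
inside_ret = genes_overlapping(290500, 12)
edge_of_ret = genes_overlapping(290475, 9)
between_genes = genes_overlapping(343000, 8)
```

A region counts as co-localized here if it overlaps a gene by at least one segment; a stricter criterion (for example, full containment) would be an equally plausible choice.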

The location of these predicted fragile regions was also compared to copy number alterations observed in various tumors and cancer cell lines (235) (Tumorscape). We found that the regions identified by the fragility prediction threshold mostly overlap with regions containing deletions in breast, colorectal, non-small cell (NSC) lung, and prostate cancers, glioma, and melanoma, and with amplifications in breast, ovarian, and prostate cancers (Figure 4.8, Appendix Table 6). Deletions in breast cancer, colorectal cancer, glioma, and melanoma are found in the region identified to contain PTEN exon 1, in which deletions, insertions, and point mutations have been sequenced in various cancers (228), and stable DNA secondary structures were observed in vitro (described above). The region identified within NFKB2 is included in the locations of deletions found in colorectal and breast cancers, and the predicted fragile region within ADD3 overlaps with deletions found in NSC lung, colorectal, and breast cancers. Deletions found in melanoma, NSC lung, colorectal, and breast cancers, and glioma span the identified regions within the APH-induced common fragile site FRA10F, and sequences located near the telomere on the q-arm of chromosome 10 (10q26.2-qter), which constitute 33% of the predicted fragile regions but were not previously identified as a fragile site. Amplifications found in breast, ovarian, and prostate cancers include regions of predicted DNA secondary structure that cluster within 10q22.2-3, which contains the ZMIZ1 gene and borders the cytogenetic boundaries of the APH-induced common fragile site FRA10D (10q22.1).

Table 4.6: Genes located in regions capable of forming highly stable secondary structures and disease associations

Gene      Chromosomal position  Fragile site  Insertion, Deletion, Translocation* / Point Mutation*
PHYH      10p13                 -             +
NMT2      10p13                 -             +
VIM       10p13                 -             +
CACNB2    10p12.33              -             +
NEBL      10p12.31              -             +
BMI1      10p12.2               -             +
PTF1A     10p12.2               -             +
MAP3K8    10p11.23              -             +
ZEB1      10p11.22              -             +
RET       10q11.21              FRA10G        + +
CXCL12    10q11.21              FRA10G        +
ALOX5     10q11.21              FRA10G        +
CHAT      10q11.23              FRA10G        +
CCDC6     10q21.2               FRA10C        +
RHOBTB1   10q21.2               FRA10C        +
STOX1     10q21.3               FRA10C        +
HK1       10q22.1               FRA10D        +
NEUROG3   10q22.1               FRA10D        +
PCBD1     10q22.1               FRA10D        +
CDH23     10q22.1               FRA10D        +
CHST3     10q22.1               FRA10D        +
KCNMA1    10q22.3               -             +
ZMIZ1     10q22.3               -             +
CDHR1     10q23.1               -             +
LDB3      10q23.2               -             +
BMPR1A    10q23.2               -             + +
GLUD1     10q23.2               -             +
PTEN      10q23.31              FRA10A        + +
RBP4      10q23.33              FRA10A        +
HPS1      10q24.2               FRA10A        +
ABCC2     10q24.2               FRA10A        + +
PAX2      10q24.31              -             + +
PDZD7     10q24.31              -             +
FGF8      10q24.32              -             +
HPS6      10q24.32              -             + +
PITX3     10q24.32              -             + +
NFKB2     10q24.32              -             +
CNNM2     10q24.32              -             + +
COL17A1   10q24.33              -             + +
ADD3      10q25.1               -             +
ADRB1     10q25.3               -             + +
BAG3      10q26.11              FRA10F        + +
FGFR2     10q26.13              FRA10F        + +
PLEKHA1   10q26.13              FRA10F        +
HTRA1     10q26.13              FRA10F        +
OAT       10q26.13              FRA10F        + +
ADAM12    10q26.2               -             +

* Detailed descriptions of diseases and references can be found in Appendix Table 5.

Figure 4.8: Location of regions predicted to exhibit fragile site instability and correlation with cancer-associated chromosomal aberrations. Regions on chromosome 10 with at least seven consecutive segments below -43.61 kcal/mol are presented. The lower x axis depicts the number of the first 300-nt segment of the consecutive segments identified by the threshold, the upper x axis displays the corresponding chromosome 10 nucleotide number, and the y axis depicts the number of consecutive segments within each identified region. The locations of the six genes with insertion, deletion, or translocation mutations in cancer that coincide with regions isolated by the threshold are marked by red diamonds. Regions of copy number alterations (Tumorscape) across various tumor types that overlap with regions identified by the threshold are depicted as horizontal lines. Black lines represent deletions and grey lines represent amplifications.

Additionally, 86 significant regions were identified on the p-arm of chromosome 10, where no chromosomal fragile site was previously defined. These regions on the p-arm coincide with deletions found in breast, prostate, NSC lung, and colorectal cancers, as well as amplifications found in breast cancer.

Therefore, using the ability to form highly stable secondary structures within a given threshold defined by analysis of molecularly-defined fragile sites, we have narrowed down cytogenetically-defined fragile sites; identified possible new fragile sites in previously unidentified regions on chromosome 10; and correlated these sites both with known mutations in human disease and regions of copy number alterations in cancer. We suggest that the secondary structure-forming/replication stalling mechanism for DNA breakage could occur within these regions to generate the respective cancer-specific gene rearrangements.

DISCUSSION

In this study, we analyzed the differences in multiple stem-loop DNA secondary structure-forming potential between fragile and non-fragile DNA on chromosome 10 using the DNA secondary structure prediction program Mfold. The APH-induced common fragile sites FRA10G, FRA10D, and FRA10F had a greater potential to form stable DNA secondary structures than non-fragile DNA (Table 4.2). Additionally, DNA capable of forming these stable secondary structures clustered more densely in these APH-induced common fragile sites than in non-fragile DNA (Table 4.4). FRA10E, although classified as an APH-induced common fragile site, did not have a greater potential to form stable DNA secondary structures than non-fragile DNA (Table 4.2, Table 4.4). However, the original classification of this fragile site was listed as provisional (236), and further experiments to confirm its fragility have not been performed. Therefore, we suggest this region may not be a true APH-induced common fragile site. To our knowledge, this is the first unbiased demonstration supporting secondary structure formation at proven APH-induced common fragile sites as a unifying feature distinguishing these regions from non-fragile sites.

We also confirmed the validity of the secondary structure predictions using an in vitro re-duplexing assay of RET, PTEN, and NCOA4 sequences. Intron 11 of RET and exon 1 of PTEN are located within regions identified by the fragility prediction threshold (Appendix Table 4) and are predicted to form secondary structures with free-energy values significantly more favorable than the average value of the chromosome 10 sequence (Figure 4.6A). In contrast, intron 7 of NCOA4 and exon 9 of PTEN were not isolated in our study and are predicted to form secondary structures with free-energy values similar to those of the overall chromosome 10 sequence (Figure 4.6A). The in vitro re-duplexing assay showed that RET intron 11 and PTEN exon 1 DNA form significantly greater amounts of stable secondary structures than NCOA4 intron 7 and PTEN exon 9 DNA (Figures 4.6B and C), agreeing with the differential folding propensity predicted by the Mfold program. Further, we previously found that RET, but not NCOA4, exhibits high levels of chromosomal breakage following APH treatment (161), and the formation of DNA breaks within intron 11 of RET (the major patient breakpoint cluster region found in human tumors) was also detected at the nucleotide level following APH treatment (161). Interestingly, NCOA4 is located within the same APH-induced fragile site, FRA10G, as RET. However, because of its inability to form highly stable secondary structures and, more importantly, the absence of chromosomal breakage in response to APH, NCOA4, unlike the RET sequence, does not fit the criteria for a bona fide APH-induced fragile site. Rearrangements in NCOA4 are most frequent in patients with a history of radiation exposure, accounting for 63-76% of the total RET/PTC rearrangements in pediatric tumors (88-90,237), suggesting that radiation, not DNA secondary structure, causes DNA breakage at this site. Mutations within the PTEN gene, including deletions, are found in many types of cancer. While mutations are commonly found in exon 1 of PTEN, they are rare in exon 9 (229), in agreement with our secondary structure predictions and the observations from the in vitro re-duplexing assay. These results validate the Mfold analysis as a method for predicting potential regions of DNA fragility, and support the role of alternative DNA secondary structure in the mechanism of APH-induced common fragile site instability.

The boundaries of fragile sites have traditionally been defined cytogenetically, and to date only 23 common fragile sites have been cloned and characterized. Rare fragile sites are the result of CGG trinucleotide repeats or AT-rich minisatellite repeats located within the cytogenetically defined boundaries (238). Unlike rare fragile sites, no consensus sequence has been identified for common fragile sites to narrow down regions of fragility, but it is unlikely that these regions are fragile throughout the currently defined cytogenetic boundaries. Additionally, cloning and characterization of fragile sites is time-consuming. Here, through Mfold analysis of FRA10A, we observed a signature of fragility when comparing the reference sequence to a fully expanded repeat sequence known to produce chromosomal breaks, thereby establishing a threshold for predicting potential fragile sites (Figure 4.5A). This threshold was verified by the analysis of the two most active fragile sites in the genome, FRA3B and FRA16D, which have been defined on the molecular level. The predicted fragile regions within FRA3B and FRA16D exhibit highly frequent fragile site breakage as identified previously and correspond to the location of deletions and translocations found in cancer (Figure 4.5B). Using this threshold, we could pinpoint sites of potential fragility within the cytogenetic boundaries of fragile sites on chromosome 10 (Figure 4.8, Appendix Table 4). We isolated a total of 172 regions within APH-induced common fragile sites, effectively narrowing down fragility within these regions, as in the example of FRA10G described above. We also identified 15 regions within the BrdU-induced common fragile site FRA10C. Interestingly, in a global screening study for APH-induced fragile sites, FRA10C showed levels of chromosomal breakage in peripheral blood lymphocytes in the presence of APH at least half that of the APH-induced common fragile sites on chromosome 10 (9). Therefore, FRA10C may have characteristics of both APH- and BrdU-induced fragile site breakage, and the regions that we identified in this study may be responsible for the APH breakage (9).

Another drawback of defining DNA fragility cytogenetically at metaphase chromosomes is that some fragile regions are too small to be detected or are difficult to locate by G-banding. Therefore, using the ability to form highly stable secondary structures, we can predict “micro”-fragile sites or previously unidentified fragile sites within non-fragile regions. Using the same threshold as used to narrow down cytogenetically defined fragile sites, we predicted an additional 428 potential regions of fragility within non-fragile regions (Figure 4.8, Appendix Table 4). A total of 85 of these regions are located on the p-arm, where no fragile sites have been well established. Interestingly, 75 of these regions correspond to the locations of chromosome breaks observed by Mrasek et al. in their global study (9). Although no independent reproduction of their results has been published, our data support the presence of APH-induced common fragile sites on the p-arm. Previous studies of cloning and characterization of fragile sites have demonstrated that fragile site breakage can extend to the neighboring bands of a cytogenetically-defined fragile site (212,213). Thus, we suggest that 34 regions within 10q22.2-3, adjacent to APH-induced common fragile site FRA10D (10q22.1), might be an extension of FRA10D. We also predicted 173 regions clustering within the 10q26.2-qter region (see Figure 4.8). Due to the location of these regions at the extreme end of the q-arm and immediately telomeric to common fragile site FRA10F, cytogenetic detection of common fragile site breakage within these regions may be difficult, leaving this potential fragile site undetected previously.

The nucleotide composition of the sequences contained within the predicted fragile regions (Appendix Table 4) is generally GC-rich (average GC content 65.1 ± 6.9%, range 8 to 79%). Interestingly, only a few regions were AT-rich. Previous analyses of several common fragile sites revealed AT-rich sequences with inherently higher flexibility (13,14). Although most sequences we identified are GC-rich, those that are not are instead AT-rich. The same GC richness was found in the regions identified within FRA3B and FRA16D (Figure 4.5B), where one sequence was AT-rich and all the rest were GC-rich. In a global analysis of sequence content and DNA flexibility, Tsantoulis et al. found common fragile site sequences to be on average more GC-rich and less flexible than non-fragile sequences (239). Our results agree with those observations, and further support DNA secondary structure-forming potential rather than DNA flexibility as an important feature of common fragile sites.
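The GC content reported for the predicted regions is a simple base-composition percentage. A minimal sketch of that calculation is given here; the function name `gc_content` and the decision to ignore ambiguous bases such as N are assumptions made for illustration, not a description of the software used in this study.

```python
def gc_content(sequence: str) -> float:
    """Return the GC content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so characters such as
    N do not affect the result. The sequence may be upper or lower case.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Hypothetical 300-nt segments would be passed in the same way.
    for name, seq in [("toy GC-rich", "GGCCAT"), ("toy AT-rich", "ATATTA")]:
        print(f"{name}: {gc_content(seq):.1f}% GC")
```

Applied to every segment in a predicted region, the per-region values could then be summarized as a mean, standard deviation, and range in the manner reported above.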

Extensive studies (13-16,23-28,209) have suggested that the ability of fragile sites to form stable secondary structures during DNA replication likely contributes to their breakage by stalling replication fork progression. In addition to replication fork stalling, paucity of replication initiation in fragile site regions (30) and the presence of transcription-derived R-loops during DNA replication of fragile sites (22) are also suggested to be involved in the mechanism of fragility. However, it is not clear whether the potential of fragile site sequences to form highly stable secondary structures participates in the latter two mechanisms. Interestingly, the formation of R-loops promotes trinucleotide repeat instability. The ability of trinucleotide repeats to form stable secondary structure may stabilize R-loops by adopting a hairpin structure on the non-template DNA strand, thereby favoring hybrid formation between RNA transcripts and the DNA template strand (163-165,191). Further investigation of the specific role of the secondary structure-forming ability of fragile sites in the mechanism of fragility is needed.

This study is the first unbiased investigation of DNA secondary structure-forming potential at fragile sites versus non-fragile regions, and supports the role of DNA secondary structure formation in the mechanism of APH-induced common fragile site instability. Additionally, we have established a method whereby sites of fragility can be predicted, allowing systematic identification of common fragile sites. The co-localization of fragile sites and genes deleted, amplified, or rearranged in cancer is well documented (207). Also, genomic and epigenomic instability in neurological disorders such as schizophrenia and autism has been linked to fragile sites (208). Therefore, the ultimate goal is to create a list of legitimate sites that are prone to DNA breakage caused by the secondary structure-forming mechanism(s), to evaluate genomic stress caused by endogenous and exogenous insults. Creating a panel of such regions and measuring their fragility could help tailor diagnosis and therapy based on direct knowledge of DNA damage.

FUNDING

This work was supported by the National Institutes of Health (RO1CA85826 and RO1CA113863 to Y.-H. W. and T32GM095440 to L.W.D.).

ACKNOWLEDGEMENTS

We thank Rafael Diaz-Garcia for generating the plasmids, pRET3 and pELE1.

CHAPTER V: DEVELOPMENT OF A DNA BREAKAGE ASSAY TO DETECT SUSCEPTIBILITY TO RET/PTC REARRANGEMENT FORMATION AND POTENTIAL EXPOSURE TO ENVIRONMENTAL FRAGILE SITE-INDUCING CHEMICALS

ABSTRACT

Supporting a role for fragile sites in cancer development, we have previously shown that laboratory fragile site-inducing chemicals result in the formation of RET/PTC1 rearrangements in human thyroid cells and induce DNA breakage within the RET oncogene. While these data indicate that fragile site breakage can result in a cancer-causing chromosomal translocation, the role of external environmental agents in this process has not been investigated. Here, we show that low doses of two environmental agents, benzene and DEN, induce significant levels of fragile site-specific DNA breakage at RET, and that these chemicals act analogously to APH by delaying DNA replication. Aside from benzene and DEN, a variety of other agents, including chemotherapeutic agents, have been shown to induce fragile site breakage. Therefore, the ability to monitor an individual’s risk of cancer development due to exposure to external fragile site-inducing agents would provide a valuable diagnostic assay for tailoring the treatment of cancer patients with chemotherapy and for monitoring those with high exposure to environmental agents.

Measuring the frequency of DNA breakage in normal thyroid tissue of patients, we found that PTC patients with RET/PTC rearrangements had a higher level of double-strand DNA breaks at RET than RET/PTC-negative patients with non-cancerous hyperplastic nodules. These data provide a foundation for development of a DNA breakage assay to evaluate fragility at the RET oncogene in patients, which can be translated to other mutation-prone regions where fragile site breakage occurs, so that the risk of developing numerous different cancers can be monitored.

INTRODUCTION

A strong correlation between fragile sites and cancer (10), along with the ability of fragile site breakage to result in the papillary thyroid carcinoma (PTC)-causing RET/PTC1 translocation (161), provides strong evidence for a role of fragile site breakage in carcinogenesis. However, the majority of conditions used to study fragile site breakage employ laboratory chemicals and do not represent the conditions under which fragile site breakage may be induced in humans to promote the formation of chromosomal aberrations. Nevertheless, a variety of external dietary and environmental agents, including benzene and diethylnitrosamine (DEN) (107), have been shown to induce or enhance fragile site breakage (Table 1.4). The concentrations of these chemicals that optimally induce fragile sites are relevant to human exposure: the general population is exposed to a level of benzene comparable to the amount able to induce fragile sites, especially under long-term exposure. Furthermore, individuals working in professions with a high exposure to these chemicals, such as farmers and flower collectors (101-103), and those with a history of cigarette smoking (99,100), which exposes them to inhaled benzene and DEN, have an increased susceptibility to fragile site breakage.

The rate of second primary cancers is on the rise, and they now account for one in six of all newly diagnosed cancers in the United States (123). One possible cause of second primary tumors is medical treatment of the first primary tumor with either radiation or chemotherapy. Several chemotherapeutic agents have been found to induce fragile site breakage as well (Table 1.4). PTC has been observed as a secondary cancer following chemotherapeutic treatment of a variety of cancers, including osteosarcoma (127-132), rhabdomyosarcoma (133), acute lymphoblastic leukemia, neuroblastoma, and Ewing’s sarcoma (134-137). Furthermore, DNA topoisomerase poisons are common chemotherapeutic agents, and DNA topoisomerases I and II were shown in Chapter III to participate in initiating fragile site breakage at the RET oncogene. Therefore, given the increased risk of fragile site breakage following treatment with chemotherapeutic agents as well as in patients with a history of exposure to fragile site-inducing chemicals, an assay for detecting a patient’s risk of fragile site breakage, and subsequently of cancer development, would be a valuable tool for tailoring the treatment of cancer patients as well as for monitoring cancer susceptibility among workers with occupational exposure to fragile site-inducing agents.

In this study, we investigated whether the external fragile site-inducing agents benzene and DEN can induce fragile site breakage at the RET oncogene in human thyroid epithelial cells. We found that both benzene and DEN induced significant levels of DNA breakage within RET, similar to those observed with aphidicolin (APH) treatment. Furthermore, this breakage was found to be specific to fragile sites. Cell cycle analysis revealed that both benzene and DEN treatments result in a significant accumulation of cells in S-phase, as observed with APH treatment, suggesting that these chemicals work analogously to APH in their induction of fragile site breakage by delaying DNA replication.

As a first step toward developing a tool for monitoring high-risk populations susceptible to cancers caused by fragile site-mediated rearrangements, we sought to determine whether normal cells of PTC patients with RET/PTC rearrangements have more DNA breakage in the RET region than those of individuals without such rearrangements. Genomic DNA was isolated from normal thyroid tissue from PTC patients with or without RET/PTC rearrangements and from patients with non-cancerous growths. The level of DNA breakage was measured within the fragile regions RET intron 11 and FHIT intron 4 (FRA3B), as well as the non-fragile 12p12.3 region, using LM-PCR. Although no significant difference was observed in the overall level of DNA breaks at RET in the normal tissue of RET/PTC-positive patients compared to RET/PTC-negative patients, the level of double-strand DNA (dsDNA) breaks at RET was significantly elevated in RET/PTC-positive patients compared to RET/PTC-negative hyperplastic nodule patients. Therefore, measuring fragile site breakage in the form of dsDNA breaks in rearrangement-prone genomic regions may be an effective method for monitoring a patient’s susceptibility to cancer development, and provides a basis for further development of a diagnostic assay.

METHODS AND MATERIALS

Cell line and thyroid tissues

Experiments were performed on HTori-3 cells, a human thyroid epithelial cell line transfected with an origin-defective SV40 genome. They are characterized as immortalized, partially transformed, differentiated cells that have three copies of chromosome 10 with intact RET loci and preserve the expression of thyroid differentiation markers such as thyroglobulin production and the sodium iodide symporter (158). The cells were purchased from the European Tissue Culture Collection and grown in RPMI 1640 medium (Invitrogen) supplemented with 10% fetal bovine serum.

Paired normal and tumor tissues from PTC patients (P10-10N, P10-09N, and PHS04-15094) and hyperplastic nodules (01-18N, 01-26N, 01-27N, and 01-33N) were obtained from Dr. Yuri Nikiforov (University of Pittsburgh). Paired normal and tumor tissues from PTC patients (2011-1345 and 2237182) and multinodular goiters were obtained from the Tumor Tissue Core Facility (Comprehensive Cancer Center of Wake Forest School of Medicine, IRB00014385).

Cell treatments and fragile site induction

For breakpoint detection, HTori-3 cells (1 × 10⁵) were plated in 10 cm dishes and treated 18 hours later for 24 hours with 0.4 μM APH (Sigma), 0.5 mg/mL benzene (Sigma), or 3.5 mg/mL DEN (Sigma), after which genomic DNA was isolated.

For cell viability analysis by flow cytometry, HTori-3 cells (1 × 10⁵) were plated in 6-well plates and treated 18 hours later with benzene, DEN, or 0.4 μM APH for 24 hours. Treatments with various concentrations were performed for benzene (0.25-8 mg/mL) and DEN (1.5-7 mg/mL) to determine optimal dosing.

For cell cycle analysis by flow cytometry, HTori-3 cells (1 × 10⁶) were plated in 6-well plates and treated 18 hours later with 0.4 μM APH, 0.5 mg/mL benzene, or 3.5 mg/mL DEN for 24 hours.

Isolation of genomic DNA from thyroid tissue

Normal thyroid tissue (10 to 50 mg) was cut into small pieces and added to a 1.5 mL tube containing 0.3 mL extraction buffer (0.1 M NaCl, 20 mM Tris pH 8.0, 25 mM EDTA pH 8.0, 0.5% SDS). Tissue was then homogenized, after which an additional 0.2 mL extraction buffer and 1 mg proteinase K were added, and the sample was incubated at 55°C for up to three days or until the tissue was fully digested. Genomic DNA was then isolated by phenol/chloroform extraction and the DNA pellet was resuspended in 1X TE buffer.
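For readers preparing the extraction buffer, the volumes of each stock solution follow from the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the final concentrations are those given in the text, but the stock concentrations (5 M NaCl, 1 M Tris pH 8.0, 0.5 M EDTA pH 8.0, 10% SDS), the 10 mL batch size, and the function name `stock_volumes_ml` are assumptions and are not stated in this chapter.

```python
def stock_volumes_ml(final_volume_ml, targets, stocks):
    """Return the volume (mL) of each stock needed, plus water to volume.

    targets and stocks map component name -> concentration. Each component
    must use the same unit in both dictionaries (for example M for salts,
    percent for SDS). Volumes follow from C1 * V1 = C2 * V2.
    """
    volumes = {}
    for name, target in targets.items():
        stock = stocks[name]
        if target > stock:
            raise ValueError(f"{name}: target exceeds stock concentration")
        volumes[name] = final_volume_ml * target / stock
    # Water makes up the remainder of the final volume.
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# Final concentrations from the text (M, M, M, and % respectively).
TARGETS = {"NaCl": 0.1, "Tris pH 8.0": 0.020, "EDTA pH 8.0": 0.025, "SDS": 0.5}
# Hypothetical stock concentrations, in the same units as TARGETS.
STOCKS = {"NaCl": 5.0, "Tris pH 8.0": 1.0, "EDTA pH 8.0": 0.5, "SDS": 10.0}

if __name__ == "__main__":
    for component, volume in stock_volumes_ml(10.0, TARGETS, STOCKS).items():
        print(f"{component}: {volume:.2f} mL")
```

With these assumed stocks, a 10 mL batch would need 0.2 mL NaCl, 0.2 mL Tris, 0.5 mL EDTA, 0.5 mL SDS, and 8.6 mL water.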

DNA breakage analysis by LM-PCR

Isolation of total DNA breaks from treated HTori-3 cells or from patient genomic DNA was performed using LM-PCR as described in Chapter III. DNA breaks were detected within RET intron 11 [using primer set 1 (Appendix Table 1)], FRA3B at FHIT intron 4 (Appendix Table 1), and the non-fragile 12p12.3 region (Appendix Table 1).

LM-PCR was adapted for isolation of double-stranded DNA breaks as follows (Figure 5.4): The LL3/LP2 linker was directly ligated to 400 ng of genomic DNA. Following ligation, the excess linker was removed using a G-100 column. Two rounds of nested PCR were performed using 8 ng of ligated DNA and the nested PCR primers of RET intron 11 primer set 1 (Appendix Table 1). PCR products were analyzed in the same manner as LM-PCR of total DNA breaks. All statistics were performed using an unpaired, two-tailed Student’s t-test.
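The unpaired, two-tailed Student's t-test can be sketched with the Python standard library alone. The code below is an illustration, not the software used in this study: the function names are hypothetical, equal variances are assumed (the classical pooled-variance form), and the two-tailed P value is obtained by numerically integrating the t density with Simpson's rule. The example input uses the rounded per-patient dsDNA break frequencies listed in Table 5.2.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability of 0 <= T <= |t|
    return max(0.0, 1.0 - 2.0 * central_area)


def students_t_test(x, y):
    """Unpaired, pooled-variance t-test. Returns (t, degrees of freedom, P)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t = (mean(x) - mean(y)) / se
    return t, df, two_tailed_p(t, df)


if __name__ == "__main__":
    ret_ptc_positive = [0.028, 0.038, 0.023]     # Table 5.2, RET/PTC-positive PTC
    hyperplastic_nodule = [0.014, 0.014, 0.005]  # Table 5.2, hyperplastic nodules
    t, df, p = students_t_test(ret_ptc_positive, hyperplastic_nodule)
    print(f"t = {t:.3f}, df = {df}, P = {p:.4f}")
```

With these rounded inputs the sketch gives t = 3.5 with 4 degrees of freedom and P of roughly 0.025, close to the P = 2.56E-2 reported in Figure 5.3B; the small difference is expected because the table values are rounded.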

Cell survival analysis following drug treatment

HTori-3 cells (1 × 10⁵) were plated in 6-well plates and treated 18 hours later with APH, benzene, or DEN for 24 hours as described above. Cells were harvested by trypsinization, washed with phosphate-buffered saline (PBS, Invitrogen), and resuspended in PBS containing 2 μg/mL propidium iodide (PI, Sigma). Cell viability was then determined using a Becton Dickinson FACSCalibur flow cytometer.
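Viability is reported in Figure 5.1A as the percentage of live (PI-negative) cells relative to untreated cells, averaged over replicates. A minimal sketch of that normalization follows. The function name, the pairing of each treated replicate with an untreated replicate from the same experiment, and the example numbers are all assumptions for illustration; they are not data or code from this study.

```python
from statistics import mean, stdev


def relative_viability(treated_live_pct, untreated_live_pct):
    """Express treated viability as a percentage of the untreated control.

    Both arguments are lists of PI-negative percentages, one per replicate,
    paired by experiment. Returns (per-replicate values, mean, sample SD).
    """
    if len(treated_live_pct) != len(untreated_live_pct):
        raise ValueError("replicate lists must have the same length")
    relative = [100.0 * t / u for t, u in zip(treated_live_pct, untreated_live_pct)]
    return relative, mean(relative), stdev(relative)


if __name__ == "__main__":
    # Hypothetical PI-negative percentages from three replicate experiments.
    treated = [90.0, 81.0, 88.0]
    untreated = [100.0, 90.0, 88.0]
    values, avg, sd = relative_viability(treated, untreated)
    print(values, round(avg, 2), round(sd, 2))
```

Normalizing within each experiment before averaging is one reasonable design choice because it removes day-to-day variation in baseline viability; averaging first and normalizing afterwards would be an equally simple alternative.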

Cell cycle analysis following drug treatment

HTori-3 cells (5 × 10⁵) were plated in 6-well plates and treated 18 hours later with APH, benzene, or DEN for 24 hours as described above. Cells were harvested by trypsinization and washed with PBS. Ethanol (100%, 0.5 mL) was added to the cell pellet dropwise while vortexing. Cells were pelleted by centrifugation and resuspended in PI-RNase stain (50 μg/mL PI and 100 μg/mL RNase diluted in 1X PBS). Cell cycle distribution was then determined using a Becton Dickinson FACSCalibur flow cytometer. Flow cytometry data were analyzed with code developed using the flowCore suite of tools available through Bioconductor (240), running under R, the open-source environment for statistical computing (241).

Detection of RET/PTC rearrangements in thyroid tissues by RT-PCR

Normal and/or tumor thyroid tissue (50 to 100 mg) was cut into pieces and added to a 1.5 mL tube containing 0.5 mL TRIZOL (Invitrogen). Tissue was homogenized and an additional 0.5 mL of TRIZOL was added. RNA was isolated following the TRIZOL reagent protocol and resuspended in RNase-free water.

RNA isolated from thyroid tissue was converted to cDNA using SuperScript First-Strand Synthesis for RT-PCR (Invitrogen, cat# 11904-018). RET/PTC1 fusion transcript expression was examined using primers RET11 (5’-AGCAGGTCTCGAAGCTCACTC-3’) and H4-5 (5’-CAAGAGAACAAGGTGCTGAAG-3’) and the following PCR conditions: 1 mM MgCl2 and [(94°C for 3 min)/(94°C for 3 min, 60°C for 30 sec, 72°C for 50 sec) 35 cycles/(72°C for 7 min)]. RET/PTC3 fusion transcript expression was evaluated using primers RET11 and ELE1-10 (5’-CGGTATTGTAGCTGTCCCTTTC-3’) with the same PCR conditions. Actin mRNA expression served as a positive control for RNA isolation, and was examined using primers actin1 (5’-GCGGGAAATCGTGCGTGACATT-3’) and actin2 (5’-GATGGAGTTGAAGGTAGTTTCGTG-3’) with the same PCR conditions except for 1.5 mM MgCl2.

RESULTS

Fragile site induction by external environmental and dietary agents benzene and DEN

To test whether the external environmental and dietary fragile site-inducing agents benzene and DEN can induce DNA breakage within the RET oncogene, appropriate dosages in HTori-3 cells were first determined to minimize cell death. Previously published results showed that dosages of 0.469 mg/mL for benzene and 3.4 mg/mL for DEN could induce fragile site breakage in human blood lymphocytes (107). Therefore, cell viability experiments were performed using flow cytometry for a range of dosages of benzene (0.25 to 8 mg/mL) and DEN (1.5 to 7 mg/mL), none of which resulted in a significant amount of cell death, including doses approximating those previously shown to induce fragile site breakage, 0.5 mg/mL for benzene and 3.5 mg/mL for DEN (Figure 5.1A). Therefore, these dosages were chosen to measure fragile site breakage following treatment of HTori-3 cells with these agents.

[Figure 5.1 plots: (A) Cell Viability (% of Untreated) versus Treatment (Untreated, 0.4 μM APH, 0.5 mg/mL Benzene, 3.5 mg/mL DEN); (B) DNA Breaks/100 Cells versus Treatment for RET, FRA3B, and 12p12.3.]

Figure 5.1: Induction of fragile site breakage by environmental and dietary agents, benzene and DEN. (A) The level of HTori-3 cell death following treatment with APH, benzene, or DEN was determined using a propidium iodide (PI) stain and measured using flow cytometry. The percentage of live cells (PI negative) relative to untreated cells was determined for each treatment and averaged for at least three experimental replicates. Error bars indicate standard deviation. (B) The frequency of DNA breakage following treatment with APH, benzene, or DEN was measured using LM-PCR. Treatment with APH, benzene, or DEN resulted in a significant increase in the level of DNA breakage at RET intron 11 and FHIT intron 4 (FRA3B) (*P < 0.005 relative to the untreated samples). DNA breakage induced by benzene and DEN treatments was specific to fragile sites, since the frequency of DNA breaks at the non-fragile 12p12.3 region was significantly less than at RET and FRA3B (#P < 0.05). Error bars indicate standard deviation.

Genomic DNA was isolated from HTori-3 cells following 24-hour treatment with 0.5 mg/mL benzene or 3.5 mg/mL DEN, and the level of DNA breakage within RET intron 11 was measured using LM-PCR. A significant level of DNA breaks was induced at RET following treatment with benzene (0.042 ± 0.007 DNA breaks per 100 cells) and DEN (0.100 ± 0.022 DNA breaks per 100 cells) compared to untreated (0.016 ± 0.009 DNA breaks per 100 cells; P = 2.71E-3 and 1.74E-5, respectively) (Figure 5.1B). This trend was also observed at FHIT intron 4, located within FRA3B, where DNA breakage following treatment with benzene (0.075 ± 0.023 DNA breaks per 100 cells) and DEN (0.063 ± 0.000 DNA breaks per 100 cells) was significantly greater compared to untreated (0.023 ± 0.006 DNA breaks per 100 cells; P = 1.50E-4 and 2.13E-6, respectively) (Figure 5.1B). Furthermore, the levels of DNA breaks induced by benzene and DEN at RET and FRA3B were similar to those observed following APH treatment (Figure 5.1B). However, this trend did not extend to the non-fragile 12p12.3 region, where DNA breakage rates following benzene and DEN treatments (0.015 ± 0.018 and 0.017 ± 0.019 DNA breaks per 100 cells, respectively) were significantly lower than at RET (P = 4.81E-2 and 5.47E-4, respectively) and FRA3B (P = 1.67E-3 and 1.40E-3, respectively) (Figure 5.1B).

Fragile site breakage is often the result of conditions that delay DNA replication, such as treatment with APH, which inhibits replicative DNA polymerases (6,7). To determine whether benzene and DEN also cause a replication delay, cell cycle analysis was performed using flow cytometry. HTori-3 cells were treated with APH, benzene, or DEN using the same conditions as those used to determine fragile site breakage by LM-PCR. Treatment with benzene and DEN resulted in cell cycle profiles similar to that of APH treatment (Figure 5.2). A significant increase in the percentage of cells in S-phase was observed for APH (52.38 ± 7.55%), benzene (50.33 ± 3.08%), and DEN (44.24 ± 3.43%) treatment compared to untreated (32.93 ± 3.97%; P = 3.76E-3, 3.61E-4, and 4.44E-4, respectively), suggesting a delay in DNA replication for all three treatments.

[Figure 5.2 plot: Percentage of Cells (%) in G1, S, and G2 versus Treatment (Untreated, 0.4 μM APH, 0.5 mg/mL Benzene, 3.5 mg/mL DEN).]

Figure 5.2: The effect of APH, benzene, and DEN treatments on the cell cycle. HTori-3 cells were treated with APH, benzene, or DEN, and the percentage of cells within different phases of the cell cycle (G1, S, or G2) was determined using a propidium iodide (PI) stain and measured using flow cytometry. APH, benzene, and DEN treatments all resulted in a significant accumulation of cells in the S-phase (*P < 0.005 relative to untreated). Error bars indicate standard deviation.

Together these results indicate that benzene and DEN have the ability to induce fragile site-specific breakage at the RET oncogene in a manner analogous to APH treatment, which in turn could result in PTC-causing RET/PTC translocations in individuals exposed to these chemicals.

Table 5.1: Frequency of DNA breaks at RET, FRA3B, and 12p12.3 loci in patient normal thyroid tissue determined by LM-PCR

                                          DNA Breaks/100 Cells ± SD
Patient number   Patient status           RET              FRA3B            12p12.3
P10-10N          RET/PTC1                 1.075 ± 0.096    2.200 ± 0.200    0.300 ± 0.000
P10-09N          RET/PTC1                 1.933 ± 0.379    2.133 ± 0.416    0.250 ± 0.087
PHS04-15094      RET/PTC3                 0.400 ± 0.100    1.333 ± 0.462    0.317 ± 0.176
2011-1345        PTC, non-RET/PTC         1.113 ± 0.103    1.733 ± 0.115    0.017 ± 0.029
2237182          PTC, non-RET/PTC         1.050 ± 0.705    2.067 ± 0.503    0.000 ± 0.000
01-18N           Hyperplastic Nodule      0.575 ± 0.340    1.667 ± 0.306    0.383 ± 0.153
01-26N           Hyperplastic Nodule      0.950 ± 0.252    1.600 ± 0.400    0.200 ± 0.087
01-27N           Hyperplastic Nodule      1.275 ± 0.229    2.267 ± 0.231    0.317 ± 0.153
01-33N           Hyperplastic Nodule      1.167 ± 0.153    2.000 ± 0.849    0.350 ± 0.132
2011-429         Multinodular Goiter      0.725 ± 0.222    1.067 ± 0.416    0.300 ± 0.087
2011-430         Multinodular Goiter      1.200 ± 0.100    2.000 ± 0.200    0.200 ± 0.050
2011-434         Multinodular Goiter      0.767 ± 0.058    1.200 ± 0.529    0.300 ± 0.087

Development of a DNA breakage assay for detecting susceptibility to fragile site breakage

Measuring an individual’s susceptibility to fragile site breakage would prove to be a valuable diagnostic tool for tailoring the treatment of cancer patients with chemotherapy, as well as for monitoring patients with a high occupational exposure to fragile site-inducing chemicals, such as benzene or DEN. Regions that are highly mutated in cancer and known to be susceptible to fragile site breakage, such as the RET oncogene, would be good targets for an assay that detects levels of DNA breakage.

Figure 5.3: Frequency of DNA breakage in patient normal thyroid tissue. (A) The frequency of overall DNA breaks at RET intron 11, FHIT intron 4 (FRA3B), and 12p12.3 was determined by LM-PCR and averaged for each patient group from Table 5.1. (B) The frequency of double-stranded DNA breaks at RET intron 11 was determined by the modified LM-PCR protocol and averaged for RET/PTC-positive PTC patients and RET/PTC-negative hyperplastic nodule patients from Table 5.2 (*P = 2.56E-2). Error bars indicate standard deviation.

To test the viability of such an assay, the level of DNA breakage was tested in genomic DNA isolated from normal thyroid tissue of patients with or without RET/PTC rearrangements. Four types of patients were selected for testing: PTC patients with (n=3) or without (n=2) RET/PTC rearrangements, and patients with non-cancerous hyperplastic nodules (n=4) or multinodular goiters (n=3) that do not contain RET/PTC rearrangements. RNA was isolated from cancerous (if applicable) and normal tissue and tested for the presence of RET/PTC rearrangements, allowing for categorization of tissue types (Table 5.1). Next, genomic DNA was isolated from the normal tissue of each patient and DNA breakage levels were measured at RET intron 11, FHIT intron 4 (FRA3B), and the non-fragile 12p12.3 region using LM-PCR (Table 5.1). Patients were then grouped based on type, and the frequency of DNA breakage for each genomic region was averaged for each of the four groups (Figure 5.3A). Each of the groups displayed the same trend in the frequency of DNA breakage: FRA3B showed the highest level of DNA breaks, followed by RET, with the non-fragile 12p12.3 region having the lowest (Figure 5.3A). This trend is consistent with that observed in HTori-3 cells following APH treatment (161), where fragile regions displayed higher levels of DNA breakage than non-fragile regions, and the frequency of FRA3B breakage was greater than that of RET. However, no significant difference was observed between any of the patient groups, including PTC patients with RET/PTC rearrangements and non-cancerous patients (Figure 5.3A).

Although no significant difference in the overall DNA breakage levels was observed between the patient types, the frequency of DNA breakage in the patient tissues (Figure 5.3A) was much greater than observed in HTori-3 cells (Figure 5.1B).

Since the LM-PCR method used identifies both double- and single-stranded DNA breaks (Figure 2.2), we next determined whether there was a difference in the frequency of dsDNA breaks, which are the most detrimental form of DNA breakage and direct substrates for forming gene rearrangements. The LM-PCR method was adapted to detect only dsDNA breaks (Figure 5.4), and the level of breakage was measured at RET intron 11 in PTC patients with RET/PTC rearrangements and patients with non-cancerous hyperplastic nodules that do not contain RET/PTC rearrangements (Table 5.2). PTC patients with RET/PTC rearrangements displayed a significantly greater frequency of dsDNA breaks at RET intron 11 (0.029 ± 0.005 dsDNA breaks per 100 cells) than patients with non-cancerous hyperplastic nodules (0.011 ± 0.005 dsDNA breaks per 100 cells, P = 2.56E-2) (Figure 5.3B).

Figure 5.4: Detection of double-stranded DNA breaks by LM-PCR. Double-stranded DNA breaks in genomic DNA from normal thyroid tissue of patients were isolated through ligation of the LL3/LP2 linker. Excess linker was removed using a G-100 column and the concentration of ligated DNA was determined. Amplification of these DNA breaks was achieved through two rounds of nested PCR of 8 ng of ligated DNA. The final PCR products were resolved by agarose gel electrophoresis. Each band observed on the gel corresponds to a break found within the region of interest, which was used to quantitate the frequency of DNA breakage.

Table 5.2: Frequency of double-stranded DNA breaks at RET in patient normal thyroid tissue determined by LM-PCR

Patient number    Patient status         RET dsDNA/100 Cells ± SD
P10-10N           RET/PTC1               0.028 ± 0.017
P10-09N           RET/PTC1               0.038 ± 0.019
PHS04-15094       RET/PTC3               0.023 ± 0.009
01-18N            Hyperplastic nodule    0.014 ± 0.014
01-26N            Hyperplastic nodule    0.014 ± 0.005
01-27N            Hyperplastic nodule    0.005 ± 0.005
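The group comparison above can be approximately re-derived from the per-patient values listed in Table 5.2. The following is a minimal sketch, not the analysis actually performed: this section does not state which statistical test produced the reported P value, so a two-sample Student's t-test with equal variances is assumed here, and the function and variable names are illustrative. Because the table values are rounded to three decimal places, the recomputed group means and test statistic may differ slightly from the figures reported in the text.

```python
from math import sqrt
from statistics import mean, stdev

# Per-patient dsDNA breaks per 100 cells at RET intron 11, copied from Table 5.2.
RET_PTC_POSITIVE = [0.028, 0.038, 0.023]
HYPERPLASTIC_NODULE = [0.014, 0.014, 0.005]

# Two-tailed critical t value for alpha = 0.05 with 4 degrees of freedom
# (3 + 3 - 2), taken from a standard t table.
T_CRIT_DF4_ALPHA05 = 2.776


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples.

    Assumption: equal variances in both groups (the test used in the
    dissertation is not stated in this section).
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error


t_stat = pooled_t_statistic(RET_PTC_POSITIVE, HYPERPLASTIC_NODULE)
significant = abs(t_stat) > T_CRIT_DF4_ALPHA05

print(f"RET/PTC-positive mean:    {mean(RET_PTC_POSITIVE):.4f}")
print(f"Hyperplastic nodule mean: {mean(HYPERPLASTIC_NODULE):.4f}")
print(f"t statistic: {t_stat:.2f}, significant at 0.05: {significant}")
```

Under these assumptions the difference between the two groups exceeds the 0.05 threshold, in line with the significance reported in Figure 5.3B.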

Therefore, these results demonstrate that normal thyroid tissue from RET/PTC-positive PTC patients has significantly more dsDNA breaks within intron 11 of RET, the region mutated in RET/PTC rearrangements and susceptible to fragile site breakage, than that from RET/PTC-negative hyperplastic nodule patients. These differences were detected using a DNA breakage assay that could be adapted for monitoring cancer susceptibility in patients at the RET locus as well as other genomic regions susceptible to fragile site breakage.

DISCUSSION

Although a strong correlation between fragile sites and cancer has been established and direct evidence has been provided that fragile site breakage results in the formation of the RET/PTC1 rearrangement in PTC, little evidence exists directly linking exposure to external dietary and environmental agents to fragile site breakage at mutation-prone genomic regions. Benzene and DEN are mutagens previously identified to induce fragile site breakage (107), and these chemicals are found in a variety of sources, including cigarette smoke, pesticides, car exhaust, industrial emissions, and contaminated food and water (108). Here we tested the ability of these chemicals to induce fragile site breakage, and found that treatment of HTori-3 cells with low doses of benzene or DEN results in a significant level of DNA breakage at RET intron 11 and FHIT intron 4 (FRA3B), and the frequency of this breakage was similar to that following APH treatment (Figure 5.1B). Furthermore, the level of DNA breakage induced at RET and FRA3B by benzene and DEN was significantly greater than breakage induced at the non-fragile 12p12.3 region (Figure 5.1B), indicating breakage is specific to fragile sites.

Chromosomal fragile sites are normally stable, but conditions that delay DNA replication, such as treatment with APH, result in their expression. To determine whether benzene and DEN induce fragile site breakage in a similar manner, cell cycle distributions were determined for HTori-3 cells following treatment with APH, benzene, or DEN. A significant accumulation of cells in S-phase was observed for all treatments, suggesting that benzene and DEN induce fragile site breakage in a manner analogous to APH, by inhibiting DNA replication.

Benzene and DEN are present in cigarette smoke and pesticides, and exposure to these environmental insults has been linked to increased fragile site breakage in humans (99-103). The optimal dosages for induction of fragile site breakage are confirmed here to be 500 μg/mL for benzene and 3.5 mg/mL for DEN. Tobacco smoke from a single cigarette exposes an individual, whether the smoker or not, to 345 to 653 μg of benzene and 8 to 73 ng of DEN (108). Furthermore, the concentration of benzene in the air is highest in areas of heavy motor vehicle traffic and around gasoline stations, such that one hour of driving or riding in a motor vehicle results in an estimated exposure to 40 μg of benzene, and spending 70 minutes a year pumping gasoline results in an estimated exposure to 10 μg of benzene (108). Exposure to DEN via air, diet, and smoking is estimated to be several micrograms per day (108). Combined exposure to these chemicals demonstrates that the dosages of benzene and DEN that result in optimal fragile site breakage are relevant to the everyday exposure of individuals, with higher exposures depending on lifestyle and occupation.

Aside from exposure to environmental insults, cancer can also arise as a consequence of treatment of previous tumors. The rate of second primary cancers is on the rise, such that one in every six newly diagnosed cancers in the United States is a second primary neoplasm (123). PTC has been observed as a second cancer following treatment of a variety of different cancers with chemotherapeutic agents (127-137), and many chemotherapeutic agents have been found to induce fragile site breakage (Table 1.4). Therefore, monitoring fragile site breakage at mutation-prone regions would provide a valuable diagnostic tool for tailoring the treatment of cancer patients with chemotherapy and help limit the occurrence of secondary cancers. This same technique could also be used to monitor cancer susceptibility in patients working in occupations with high exposure to external fragile site-inducing agents, such as benzene and DEN.

The overall frequency of DNA breakage at RET, FRA3B, and the non-fragile 12p12.3 region was measured in four patient groups: PTC patients with RET/PTC rearrangements and three RET/PTC-negative patient groups, namely PTC, non-cancerous hyperplastic nodule, and non-cancerous multinodular goiter individuals (Table 5.1, Figure 5.3A). Although no significant difference was observed in the frequency of overall DNA breaks between the patient groups, all groups followed the same trend, where FRA3B displayed the highest level of breakage, followed by RET and then 12p12.3 (Figure 5.3A). This trend follows that observed in HTori-3 cells following treatment with fragile site-inducing chemicals (Figure 5.1B), indicating breakage at these sites may be due to daily exposure to fragile site-inducing agents. Furthermore, the frequency of breakage in patient tissue was much greater than that observed in cell culture, which may also indicate combined exposure to agents and genetic factors. However, a significant increase in the frequency of dsDNA breaks at RET intron 11 was observed in RET/PTC-positive PTC patients compared to RET/PTC-negative hyperplastic nodule patients (Figure 5.3B). Therefore, even though these patient groups displayed a similar frequency of overall DNA breaks, the frequency of dsDNA breaks, the most detrimental form of DNA breakage and the precursor to chromosomal rearrangement, differed. This may be explained by a combination of environmental exposures and genetic factors, such as unfavorable DNA repair. Due to this difference, an assay monitoring only dsDNA breakage may allow for identification of at-risk patients.
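For such an assay, band counts from the LM-PCR gel (Figure 5.4) must be expressed in the "breaks per 100 cells" unit used in Table 5.2. The sketch below shows one plausible normalization and is an assumption rather than the procedure used in this work: it converts the mass of input DNA into cell equivalents using an approximate diploid human genome mass of 6.6 pg, a figure not taken from this text. The function names, the band count, and the number of reactions are hypothetical; only the 8 ng of ligated DNA per amplification comes from Figure 5.4.

```python
# Approximate mass of one diploid human genome in picograms.
# Assumption for illustration; this value is not given in the dissertation text.
DIPLOID_GENOME_PG = 6.6


def cells_assayed(ng_dna_per_reaction: float, n_reactions: int) -> float:
    """Approximate number of cell equivalents represented by the input DNA."""
    pg_total = ng_dna_per_reaction * 1000.0 * n_reactions  # ng -> pg
    return pg_total / DIPLOID_GENOME_PG


def breaks_per_100_cells(n_bands: int, ng_dna_per_reaction: float, n_reactions: int) -> float:
    """Bands counted on the gel, normalized to 100 cell equivalents."""
    return 100.0 * n_bands / cells_assayed(ng_dna_per_reaction, n_reactions)


# Hypothetical example: 4 bands counted across 10 reactions of 8 ng each.
example = breaks_per_100_cells(n_bands=4, ng_dna_per_reaction=8.0, n_reactions=10)
print(f"{example:.3f} breaks per 100 cells")
```

With 8 ng of input per reaction, each reaction represents on the order of a thousand cell equivalents under this assumption, so frequencies in the range reported in Table 5.2 would correspond to less than one band per reaction on average and would require pooling counts across replicate reactions.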

Together these results demonstrate that external environmental and dietary agents, benzene and DEN, have the ability to induce fragile site breakage at the RET oncogene, which in turn has the potential to result in the formation of cancer-causing

RET/PTC translocations. Furthermore, normal thyroid tissue from PTC patients with

RET/PTC rearrangements has significantly elevated levels of dsDNA breaks at RET, indicating a potential predisposition of these individuals to RET breakage. Therefore, modification of this dsDNA breakage assay could provide potential therapeutic benefit for monitoring cancer susceptibility in patients at RET or other regions of the genome mutated in cancer and susceptible to fragile site breakage, allowing for individualized treatment plans and cancer prevention.

ACKNOWLEDGEMENTS

Special thanks to Drs. Yuri Nikiforov and Jennifer Cannon for evaluating and collecting thyroid tissues, Christine Lehman for performing the LM-PCR analysis of double-strand

DNA breaks in patients, and Dr. David Ornelles for assistance in analyzing flow cytometry data.

CHAPTER VI: CONCLUSIONS

Strong evidence supports a role for chromosomal fragile sites in carcinogenesis, but direct evidence for the involvement of fragile site breakage in the formation of cancer-causing chromosomal translocations and the initial events that lead to breakage at these sites remains elusive. Therefore, it remains essential to investigate the molecular basis and mechanism behind common fragile site breakage, the consequences of breakage at these sites, and external contributing factors that can lead to cancer development. In this dissertation, it is demonstrated that common fragile site breakage can directly lead to the formation of the cancer-causing RET/PTC1 translocation found in papillary thyroid carcinoma (PTC). The location of fragile site breakage in the RET oncogene coincides with breaks found in patients and predicted cleavage sites of DNA topoisomerase I and IIα. Furthermore, RET was predicted and confirmed to form stable DNA secondary structures with features recognized and preferentially cleaved by topoisomerases I and IIα. The importance of DNA secondary structures in the mechanism of fragile site breakage was further revealed through computational prediction and analysis of the human chromosome 10 sequence.

Moreover, environmental and dietary agents benzene and DEN were found to induce fragile site breakage at RET, supporting a role for fragile sites in tumorigenesis. Lastly, normal thyroid tissue from PTC patients with RET/PTC rearrangements exhibited a higher frequency of double-strand DNA breaks at the RET oncogene, providing a basis for developing a DNA breakage assay for measuring cancer susceptibility in patients.

Fragile Site Breakage Generates Oncogenic RET/PTC Rearrangements

A strong correlation between fragile sites and regions of the genome deleted, amplified, or rearranged in cancer has been well established (10). To directly demonstrate that fragile site breakage can lead to the formation of a cancer-causing chromosomal translocation, we chose RET/PTC rearrangements as our model system.

RET/PTC rearrangements all involve translocation of the RET oncogene and are a common cause of PTC. All genes involved in the two major subtypes, RET/PTC1 and RET/PTC3, namely RET, NCOA4, and CCDC6, are located in cytogenetically defined fragile sites (Figure 6.1). RET and NCOA4 are located within FRA10G, and CCDC6 is mapped to FRA10C. Furthermore, the incidence of PTC is on the rise with no definitive explanation for the increase, but given the location of these genes in fragile sites, fragile site breakage may contribute to this increase. To analyze a role for fragile sites in the formation of chromosomal translocations, the RET oncogene was examined.

Although RET and its translocation partners are located within the cytogenetic boundaries of fragile sites, their fragility needs to be examined at a molecular level. A consensus sequence is not known for common fragile sites, but the ability to form stable DNA secondary structures has been suggested to be a unifying feature among fragile sites (8). The process of molecularly refining a cytogenetically defined fragile site is time consuming. Therefore, using the DNA secondary structure prediction program Mfold, we narrowed down regions of fragility on the human chromosome 10 sequence, which contains the RET/PTC genes. RET and CCDC6, but not NCOA4, showed potential to form highly stable DNA secondary structures, and thus were predicted to be located within common fragile sites. The susceptibility of RET and CCDC6 to chromosomal breakage following treatment with laboratory fragile site-inducing chemicals was confirmed by fluorescence in situ hybridization (FISH) (Figure 6.1). In further agreement with the secondary structure predictions, NCOA4 did not exhibit significant levels of chromosomal breakage, confirming that this gene region is not a true fragile site and demonstrating the ability of secondary structure prediction to refine fragile sites. The induction of DNA breakage within intron 11 of RET, the breakpoint cluster region observed in PTC patients, was also confirmed by ligation-mediated PCR (LM-PCR), and these breakpoints coincided with DNA secondary structure signatures preferentially cleaved by DNA topoisomerases I and IIα (Figure 6.1).
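The idea of narrowing down candidate fragile regions by predicted secondary structure stability can be illustrated with a sliding-window scan. The following is a hypothetical sketch only: the window size, step, threshold, and all function names are illustrative and are not the parameters used in this work, and the scoring function is a toy placeholder rather than a thermodynamic model. In a real analysis, the free energy predicted by a folding program such as Mfold for each window would be supplied in place of `toy_energy`.

```python
from typing import Callable, List, Tuple


def sliding_windows(seq: str, window: int, step: int) -> List[Tuple[int, str]]:
    """Return (start, subsequence) pairs for every full-length window along seq."""
    return [(i, seq[i:i + window]) for i in range(0, len(seq) - window + 1, step)]


def flag_stable_windows(
    seq: str,
    window: int,
    step: int,
    fold_energy: Callable[[str], float],
    threshold: float,
) -> List[Tuple[int, float]]:
    """Keep windows whose predicted free energy is at or below the threshold.

    More negative values are treated as more stable predicted structures.
    """
    flagged = []
    for start, sub in sliding_windows(seq, window, step):
        dg = fold_energy(sub)
        if dg <= threshold:
            flagged.append((start, dg))
    return flagged


def toy_energy(sub: str) -> float:
    """Placeholder scorer, NOT a folding model: -1 for each A or T base.

    It exists only so the scan can run; substitute a real predicted
    free energy for each window in practice.
    """
    return -float(sum(base in "AT" for base in sub))


# Hypothetical 26-nt sequence and parameters, chosen only to keep the demo small.
demo_seq = "GCGCGCGCATATATATATGCGCGCGC"
hits = flag_stable_windows(demo_seq, window=8, step=4, fold_energy=toy_energy, threshold=-6.0)
print(hits)
```

Overlapping windows are used so that a structure-forming segment spanning a window boundary is not missed; the flagged start positions would then be merged into candidate regions for experimental testing.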

After pinpointing the location of RET and CCDC6 in common fragile sites, the ability of fragile site breakage at these genes to result in the formation of RET/PTC1 rearrangements in thyroid cells was directly tested. Treatment of thyroid cells with laboratory fragile site-inducing chemicals, APH, BrdU, and 2-AP, resulted in the formation of RET/PTC1, and not RET/PTC3, rearrangement events, demonstrating for the first time that fragile site breakage can induce the formation of a cancer-causing chromosomal translocation (Figure 6.1).

Proposed Mechanism of Fragile Site Breakage in the Formation of RET/PTC1 Rearrangements

Discussed here is a proposed model for fragile site breakage in the formation of RET/PTC1 rearrangements in PTC (Figure 6.1). Genes participating in the RET/PTC1 rearrangement, RET and CCDC6, are located within common fragile sites FRA10G and FRA10C, respectively. In normal thyroid epithelial cells, CCDC6 is actively transcribed while RET is not. Replication stress, which can be induced by chemicals such as APH or BrdU, results in an uncoupling of the replicative DNA polymerase from the helicase and gives rise to long stretches of single-stranded DNA. At fragile sites, regions that remain single-stranded during replication, such as the Okazaki initiation zone on the lagging strand template, may form stable DNA secondary structures. At RET and CCDC6, stable DNA secondary structures have been predicted by Mfold, and directly observed using an in vitro reduplexing assay. The formation of these stable DNA secondary structures can cause difficulties in replication, resulting in a stalled replication fork. The ATR-dependent DNA repair pathway, which responds to stalled or collapsed replication forks (240,241) and is vital for maintaining fragile site stability (37), is then triggered. If the ATR repair pathway properly responds, the replication fork is repaired and replication is restarted. DNA topoisomerase I is present at the replication fork and participates in DNA replication by removing positive and negative supercoiling (184). DNA topoisomerase IIα is also present at the replication fork, removing positive and negative supercoiling as well as precatenanes (184). If ATR does not properly repair the stalled replication fork, DNA topoisomerases I and IIα could initiate DNA breakage through recognition and preferential cleavage of DNA secondary structural features, as suggested by analysis of fragile site breakage at RET in this study. If the breakage remains unrepaired, it could result in the formation of RET/PTC1 rearrangements. Translocation of RET with CCDC6 results in transcription of RET-CCDC6 fusion transcripts, which is driven by the promoter of CCDC6, and ultimately leads to the development of PTC.

Figure 6.1: Model of fragile site instability in the formation of RET/PTC1 rearrangements in papillary thyroid carcinoma. RET and CCDC6 genes are located on chromosome 10 within common fragile sites FRA10G and FRA10C, respectively. Exposure to APH, BrdU, or environmental agents such as benzene or DEN, results in replication stress, causing the replicative DNA polymerases to become uncoupled from the helicase. As a result, long stretches of single-stranded DNA are exposed, which can form stable DNA secondary structures at fragile sites, stalling replication fork progression. Stalled replication forks trigger the ATR-dependent DNA repair pathway. If the ATR repair pathway responds properly, the replication fork can be repaired and DNA replication restarted. DNA topoisomerases I and IIα are also present at the replication fork and participate in DNA replication through removal of positive and negative supercoiling, with topoisomerase IIα also removing precatenanes. If ATR does not properly repair the stalled replication fork, DNA topoisomerases I and IIα can initiate DNA breakage at RET through recognition and preferential cleavage of DNA secondary structure features. DNA breakage at these sites can lead to the formation of RET/PTC1 translocations, expression of the RET-CCDC6 fusion protein, and ultimately papillary thyroid carcinoma development.

Induction of RET Fragile Site Breakage by Environmental Factors and Correlation to Patients

Aside from laboratory chemicals, a variety of external environmental, dietary, and chemotherapeutic agents can induce fragile site breakage. Here, the chemicals benzene and DEN, which are present in environmental and dietary sources, were found to induce fragile site-specific DNA breakage at RET and to delay DNA replication, like APH (Figure 6.1). Given the rapid increase in PTC cases in the United States, these findings indicate that some of this increase may be attributed to RET/PTC1 rearrangement formation due to fragile site breakage from exposure to external fragile site-inducing agents, such as benzene and DEN. Further experiments to demonstrate that DNA breaks caused by exposure to these environmental factors indeed generate RET/PTC rearrangements will be important.

The incidence of second primary tumors is also increasing, and since some chemotherapeutic agents can induce fragile site breakage, fragile site instability may also contribute to some of these cases. PTC has been observed as a second cancer following treatment with chemotherapeutic agents as well. Further supporting this notion, many chemotherapeutic agents are topoisomerase poisons, and we propose here that DNA topoisomerases may be responsible for inducing DNA breaks at fragile sites (Figure 6.1).

Altogether, this suggests a need for a diagnostic assay that can be used to predict a patient’s susceptibility to fragile site breakage, allowing for customized treatment of cancer patients with chemotherapy and monitoring of individuals with high exposure to external fragile site-inducing agents. Testing this notion, we found that PTC patients with RET/PTC rearrangements had higher levels of double-strand DNA breaks at RET in their normal thyroid tissue than RET/PTC-negative hyperplastic nodule patients. This suggests that exposure to fragile site-inducing agents in daily life may contribute to this higher level of DNA breakage, increasing a patient’s susceptibility to RET/PTC rearrangement formation (Figure 6.1). Therefore, these findings provide a basis for development of a DNA breakage assay that can be used to monitor the genomic integrity of RET, and other mutation-prone genetic regions susceptible to fragile site breakage, supporting personalized patient care and cancer prevention.

Overall Significance

The data presented in this dissertation clearly demonstrate a role for fragile sites in cancer development and provide further understanding of the underlying mechanisms of instability at these sites. Here, fragile site breakage is directly shown to result in the formation of an oncogenic chromosomal translocation. Through study of RET/PTC1 rearrangements as a model system, the mechanism of fragile site breakage was further expanded to include the involvement of DNA topoisomerases I and IIα in initiation of DNA breaks at fragile sites through the recognition of DNA secondary structures. This model can be extended to other diseases caused by rearrangements where mutations occur within fragile site regions. Furthermore, prediction of DNA secondary structure-forming ability was demonstrated to be a valid method for identification and narrowing down of fragile sites in the genome. This method can be used to perform genome-wide mapping of fragile sites, and the resulting maps can be compared to mutation-prone regions of the genome found in other cancers, such as leukemia, and psychiatric diseases, such as schizophrenia and autism. Therefore, DNA fragility at mutation-prone regions can be used to gauge susceptibility to the development of various diseases and to guide personalized medical treatment.

BIBLIOGRAPHY

1. Richards, R.I. (2001) Fragile and unstable chromosomes in cancer: causes and consequences. Trends Genet, 17, 339-345. 2. Craig-Holmes, A.P., Strong, L.C., Goodacre, A. and Pathak, S. (1987) Variation in the expression of aphidicolin-induced fragile sites in human lymphocyte cultures. Hum Genet, 76, 134-137. 3. Tunca, B., Egeli, U., Zorluoglu, A., Yilmazlar, T., Yerci, O. and Kizil, A. (2000) The expression of fragile sites in lymphocytes of patients with rectum cancer and their first-degree relatives. Cancer Lett, 152, 201-209. 4. Sutherland, G.R. (2003) Rare fragile sites. Cytogenet Genome Res, 100, 77-84. 5. Glover, T.W. (2006) Common fragile sites. Cancer Lett, 232, 4-12. 6. Cheng, C.H. and Kuchta, R.D. (1993) DNA polymerase epsilon: aphidicolin inhibition and the relationship between polymerase and exonuclease activity. Biochemistry, 32, 8568-8574. 7. Glover, T.W., Berger, C., Coyle, J. and Echo, B. (1984) DNA polymerase alpha inhibition by aphidicolin induces gaps and breaks at common fragile sites in human chromosomes. Hum Genet, 67, 136-142. 8. Durkin, S.G. and Glover, T.W. (2007) Chromosome fragile sites. Annu Rev Genet, 41, 169-192. 9. Mrasek, K., Schoder, C., Teichmann, A.C., Behr, K., Franze, B., Wilhelm, K., Blaurock, N., Claussen, U., Liehr, T. and Weise, A. (2010) Global screening and extended nomenclature for 230 aphidicolin-inducible fragile sites, including 61 yet unreported ones. Int J Oncol, 36, 929-940. 10. Dillon, L.W., Burrow, A.A. and Wang, Y.H. (2010) DNA instability at chromosomal fragile sites in cancer. Curr Genomics, 11, 326-337. 11. Sutherland, G.R. (1991) Chromosomal fragile sites. Genet Anal Tech Appl, 8, 161- 166. 12. Sutherland, G.R., Baker, E. and Richards, R.I. (1998) Fragile sites still breaking. Trends Genet, 14, 501-506. 13. Mishmar, D., Rahat, A., Scherer, S.W., Nyakatura, G., Hinzmann, B., Kohwi, Y., Mandel-Gutfroind, Y., Lee, J.R., Drescher, B., Sas, D.E. et al. 
(1998) Molecular characterization of a common fragile site (FRA7H) on human chromosome 7 by the cloning of a simian virus 40 integration site. Proc Natl Acad Sci U S A, 95, 8141-8146. 14. Zlotorynski, E., Rahat, A., Skaug, J., Ben-Porat, N., Ozeri, E., Hershberg, R., Levi, A., Scherer, S.W., Margalit, H. and Kerem, B. (2003) Molecular basis for expression of common and rare fragile sites. Mol Cell Biol, 23, 7143-7151. 15. Zhang, H. and Freudenreich, C.H. (2007) An AT-rich sequence in human common fragile site FRA16D causes fork stalling and chromosome breakage in S. cervisiae. Molecular Cell, 27, 367-379. 16. Shah, S.N., Opresko, P.L., Meng, X., Lee, M.Y. and Eckert, K.A. (2010) DNA structure and the Werner protein modulate human DNA polymerase delta-

145 dependent replication dynamics within the common fragile site FRA16D. Nucleic Acids Res, 38, 1149-1162. 17. Pelliccia, F., Bosco, N., Curatolo, A. and Rocchi, A. (2008) Replication timing of two human common fragile sites: FRA1H and FRA2G. Cytogenet Genome Res, 121, 196-200. 18. Le Beau, M.M., Rassool, F.V., Neilly, M.E., Espinosa, R., 3rd, Glover, T.W., Smith, D.I. and McKeithan, T.W. (1998) Replication of a common fragile site, FRA3B, occurs late in S phase and is delayed further upon induction: implications for the mechanism of fragile site induction. Hum Mol Genet, 7, 755-761. 19. Hellman, A., Rahat, A., Scherer, S.W., Darvasi, A., Tsui, L.C. and Kerem, B. (2000) Replication delay along FRA7H, a common fragile site on human chromosome 7, leads to chromosomal instability. Mol Cell Biol, 20, 4420-4427. 20. Handt, O., Baker, E., Dayan, S., Gartler, S.M., Woollatt, E., Richards, R.I. and Hansen, R.S. (2000) Analysis of replication timing at the FRA10B and FRA16B fragile site loci. Chromosome Res, 8, 677-688. 21. Hansen, R.S., Canfield, T.K., Fjeld, A.D., Mumm, S., Laird, C.D. and Gartler, S.M. (1997) A variable domain of delayed replication in FRAXA fragile X chromosomes: X inactivation-like spread of late replication. Proc Natl Acad Sci U S A, 94, 4587- 4592. 22. Helmrich, A., Ballarino, M. and Tora, L. (2011) Collisions between replication and transcription complexes cause common fragile site instability at the longest human genes. Mol Cell, 44, 966-977. 23. Fry, M. and Loeb, L.A. (1994) The fragile X syndrome d(CGG)n nucleotide repeats form a stable tetrahelical structure. Proc Natl Acad Sci U S A, 91, 4950-4954. 24. Gacy, A.M., Goellner, G., Juranic, N., Macura, S. and McMurray, C.T. (1995) Trinucleotide repeats that expand in human disease form hairpin structures in vitro. Cell, 81, 533-540. 25. Usdin, K. and Woodford, K.J. (1995) CGG repeats associated with DNA instability and chromosome fragility form structures that block DNA synthesis in vitro. 
Nucleic Acids Res, 23, 4202-4209. 26. Samadashwily, G.M., Raca, G. and Mirkin, S.M. (1997) Trinucleotide repeats affect DNA replication in vivo. Nat Genet, 17, 298-304. 27. Burrow, A.A., Marullo, A., Holder, L.R. and Wang, Y.H. (2010) Secondary structure formation and DNA instability at fragile site FRA16B. Nucleic Acids Res, 38, 2865-2877. 28. Kamath-Loeb, A.S., Loeb, L.A., Johansson, E., Burgers, P.M. and Fry, M. (2001) Interactions between the Werner syndrome helicase and DNA polymerase delta specifically facilitate copying of tetraplex and hairpin structures of the d(CGG)n trinucleotide repeat sequence. J Biol Chem, 276, 16439-16446. 29. Oh, J., Bailin, T., Fukai, K., Feng, G.H., Ho, L., Mao, J.I., Frenk, E., Tamura, N. and Spritz, R.A. (1996) Positional cloning of a gene for Hermansky-Pudlak syndrome, a disorder of cytoplasmic organelles. Nat Genet, 14, 300-306.

146 30. Letessier, A., Millot, G.A., Koundrioukoff, S., Lachages, A.M., Vogt, N., Hansen, R.S., Malfoy, B., Brison, O. and Debatisse, M. (2011) Cell-type-specific replication initiation programs set fragility of the FRA3B fragile site. Nature, 470, 120-123. 31. Abraham, R.T. (2001) Cell cycle checkpoint signaling through the ATM and ATR kinases. Genes & development, 15, 2177-2196. 32. Cimprich, K.A. and Cortez, D. (2008) ATR: an essential regulator of genome integrity. Nature reviews. Molecular cell biology, 9, 616-627. 33. Cliby, W.A., Roberts, C.J., Cimprich, K.A., Stringer, C.M., Lamb, J.R., Schreiber, S.L. and Friend, S.H. (1998) Overexpression of a kinase-inactive ATR protein causes sensitivity to DNA-damaging agents and defects in cell cycle checkpoints. The EMBO journal, 17, 159-169. 34. Cortez, D., Guntuku, S., Qin, J. and Elledge, S.J. (2001) ATR and ATRIP: partners in checkpoint signaling. Science, 294, 1713-1716. 35. Nghiem, P., Park, P.K., Kim Ys, Y.S., Desai, B.N. and Schreiber, S.L. (2002) ATR is not required for p53 activation but synergizes with p53 in the replication checkpoint. J Biol Chem, 277, 4428-4434. 36. Hammond, E.M., Denko, N.C., Dorie, M.J., Abraham, R.T. and Giaccia, A.J. (2002) Hypoxia links ATR and p53 through replication arrest. Mol Cell Biol, 22, 1834- 1843. 37. Casper, A.M., Nghiem, P., Arlt, M.F. and Glover, T.W. (2002) ATR regulates fragile site stability. Cell, 111, 779-789. 38. Casper, A.M., Durkin, S.G., Arlt, M.F. and Glover, T.W. (2004) Chromosomal instability at common fragile sites in Seckel syndrome. Am.J.Hum.Genet., 75, 654-660. 39. Ragland, R.L., Arlt, M.F., Hughes, E.D., Saunders, T.L. and Glover, T.W. (2009) Mice hypomorphic for Atr have increased DNA damage and abnormal checkpoint response. Mamm Genome, 20, 375-385. 40. Arlt, M.F., Xu, B., Durkin, S.G., Casper, A.M., Kastan, M.B. and Glover, T.W. (2004) BRCA1 is required for common-fragile-site stability via its G2/M checkpoint function. Mol Cell Biol, 24, 6701-6709. 
41. Durkin, S.G., Arlt, M.F., Howlett, N.G. and Glover, T.W. (2006) Depletion of CHK1, but not CHK2, induces chromosomal instability and breaks at common fragile sites. Oncogene, 25, 4381-4388. 42. Musio, A., Montagna, C., Mariani, T., Tilenni, M., Focarelli, M.L., Brait, L., Indino, E., Benedetti, P.A., Chessa, L., Albertini, A. et al. (2005) SMC1 involvement in fragile site expression. Hum Mol Genet, 14, 525-533. 43. Howlett, N.G., Taniguchi, T., Durkin, S.G., D'Andrea, A.D. and Glover, T.W. (2005) The Fanconi anemia pathway is required for the DNA replication stress response and for the regulation of common fragile site stability. Hum Mol Genet, 14, 693- 701. 44. Zhu, M. and Weiss, R.S. (2007) Increased common fragile site expression, cell proliferation defects, and apoptosis following conditional inactivation of mouse Hus1 in primary cultured cells. Molecular biology of the cell, 18, 1044-1055.

147 45. Pirzio, L.M., Pichierri, P., Bignami, M. and Franchitto, A. (2008) Werner syndrome helicase activity is essential in maintaining fragile site stability. The Journal of cell biology, 180, 305-314. 46. Focarelli, M.L., Soza, S., Mannini, L., Paulis, M., Montecucco, A. and Musio, A. (2009) Claspin inhibition leads to fragile site expression. Genes Chromosomes Cancer, 48, 1083-1090. 47. Ozeri-Galai, E., Schwartz, M., Rahat, A. and Kerem, B. (2008) Interplay between ATM and ATR in the regulation of common fragile site stability. Oncogene, 27, 2109-2117. 48. Moldovan, G.L. and D'Andrea, A.D. (2009) How the fanconi anemia pathway guards the genome. Annu Rev Genet, 43, 223-249. 49. Schoder, C., Liehr, T., Velleuer, E., Wilhelm, K., Blaurock, N., Weise, A. and Mrasek, K. (2010) New aspects on chromosomal instability: chromosomal break- points in Fanconi anemia patients co-localize on the molecular level with fragile sites. Int J Oncol, 36, 307-312. 50. Chan, K.L., Palmai-Pallag, T., Ying, S. and Hickson, I.D. (2009) Replication stress induces sister-chromatid bridging at fragile site loci in mitosis. Nature cell biology, 11, 753-760. 51. Andreassen, P.R., D'Andrea, A.D. and Taniguchi, T. (2004) ATR couples FANCD2 monoubiquitination to the DNA-damage response. Genes & development, 18, 1958-1963. 52. Taniguchi, T. and D'Andrea, A.D. (2002) The Fanconi anemia protein, FANCE, promotes the nuclear accumulation of FANCC. Blood, 100, 2457-2462. 53. Wan, C., Kulkarni, A. and Wang, Y.H. (2010) ATR preferentially interacts with common fragile site FRA3B and the binding requires its kinase activity in response to aphidicolin treatment. Mutat Res, 686, 39-46. 54. Cimprich, K.A. (2003) Fragile sites: breaking up over a slowdown. Current biology : CB, 13, R231-233. 55. Schwartz, M., Zlotorynski, E., Goldberg, M., Ozeri, E., Rahat, A., le Sage, C., Chen, B.P., Chen, D.J., Agami, R. and Kerem, B. 
(2005) Homologous recombination and nonhomologous end-joining repair pathways regulate fragile site stability. Genes & development, 19, 2715-2726. 56. Arlt, M.F., Durkin, S.G., Ragland, R.L. and Glover, T.W. (2006) Common fragile sites as targets for chromosome rearrangements. DNA Repair (Amst), 5, 1126- 1135. 57. Popescu, N.C. (2003) Genetic alterations in cancer as a result of breakage at fragile sites. Cancer Lett., 192, 1-17. 58. Glover, T.W. and Stein, C.K. (1987) Induction of sister chromatid exchanges at common fragile sites. Am J Hum Genet, 41, 882-890. 59. Wilke, C.M., Hall, B.K., Hoge, A., Paradee, W., Smith, D.I. and Glover, T.W. (1996) FRA3B extends over a broad region and contains a spontaneous HPV16 integration site: direct evidence for the coincidence of viral integration sites and fragile sites. Hum Mol Genet, 5, 187-195.

60. De Braekeleer, M., Sreekantaiah, C. and Haas, O. (1992) Herpes simplex virus and human papillomavirus sites correlate with chromosomal breakpoints in human cervical carcinoma. Cancer Genet Cytogenet, 59, 135-137.
61. Popescu, N.C. and DiPaolo, J.A. (1989) Preferential sites for viral integration on mammalian genome. Cancer Genet Cytogenet, 42, 157-171.
62. Smith, P.P., Friedman, C.L., Bryant, E.M. and McDougall, J.K. (1992) Viral integration and fragile sites in human papillomavirus-immortalized human keratinocyte cell lines. Genes Chromosomes Cancer, 5, 150-157.
63. Thorland, E.C., Myers, S.L., Gostout, B.S. and Smith, D.I. (2003) Common fragile sites are preferential targets for HPV16 integrations in cervical tumors. Oncogene, 22, 1225-1237.
64. Thorland, E.C., Myers, S.L., Persing, D.H., Sarkar, G., McGovern, R.M., Gostout, B.S. and Smith, D.I. (2000) Human papillomavirus type 16 integrations in cervical tumors frequently occur in common fragile sites. Cancer Res, 60, 5916-5921.
65. Coquelle, A., Pipiras, E., Toledo, F., Buttin, G. and Debatisse, M. (1997) Expression of fragile sites triggers intrachromosomal mammalian gene amplification and sets boundaries to early amplicons. Cell, 89, 215-225.
66. Hellman, A., Zlotorynski, E., Scherer, S.W., Cheung, J., Vincent, J.B., Smith, D.I., Trakhtenbrot, L. and Kerem, B. (2002) A role for common fragile site induction in amplification of human oncogenes. Cancer Cell, 1, 89-97.
67. Kuo, M.T., Vyas, R.C., Jiang, L.X. and Hittelman, W.N. (1994) Chromosome breakage at a major fragile site associated with P-glycoprotein gene amplification in multidrug-resistant CHO cells. Mol Cell Biol, 14, 5202-5211.
68. Miller, C.T., Lin, L., Casper, A.M., Lim, J., Thomas, D.G., Orringer, M.B., Chang, A.C., Chambers, A.F., Giordano, T.J., Glover, T.W. et al. (2006) Genomic amplification of MET with boundaries within fragile site FRA7G and upregulation of MET pathways in esophageal adenocarcinoma. Oncogene, 25, 409-418.
69. Bignell, G.R., Greenman, C.D., Davies, H., Butler, A.P., Edkins, S., Andrews, J.M., Buck, G., Chen, L., Beare, D., Latimer, C. et al. (2010) Signatures of mutation and selection in the cancer genome. Nature, 463, 893-898.
70. Burrow, A.A., Williams, L.E., Pierce, L.C. and Wang, Y.H. (2009) Over half of breakpoints in gene pairs involved in cancer-specific recurrent translocations are mapped to human chromosomal fragile sites. BMC genomics, 10, 59.
71. Durkin, S.G., Ragland, R.L., Arlt, M.F., Mulle, J.G., Warren, S.T. and Glover, T.W. (2008) Replication stress induces tumor-like microdeletions in FHIT/FRA3B. Proc Natl Acad Sci U S A, 105, 246-251.
72. O'Driscoll, M., Ruiz-Perez, V.L., Woods, C.G., Jeggo, P.A. and Goodship, J.A. (2003) A splicing mutation affecting expression of ataxia-telangiectasia and Rad3-related protein (ATR) results in Seckel syndrome. Nat Genet, 33, 497-501.
73. Kitao, H. and Takata, M. (2011) Fanconi anemia: a disorder defective in the DNA damage response. International journal of hematology, 93, 417-424.
74. Chen, A.Y., Jemal, A. and Ward, E.M. (2009) Increasing incidence of differentiated thyroid cancer in the United States, 1988-2005. Cancer, 115, 3801-3807.

75. Davies, L. and Welch, H.G. (2006) Increasing incidence of thyroid cancer in the United States, 1973-2002. JAMA : the journal of the American Medical Association, 295, 2164-2167.
76. Enewold, L., Zhu, K., Ron, E., Marrogi, A.J., Stojadinovic, A., Peoples, G.E. and Devesa, S.S. (2009) Rising thyroid cancer incidence in the United States by demographic and tumor characteristics, 1980-2005. Cancer Epidemiol Biomarkers Prev, 18, 784-791.
77. ACS. (2012). American Cancer Society, Atlanta.
78. Kent, W.D., Hall, S.F., Isotalo, P.A., Houlden, R.L., George, R.L. and Groome, P.A. (2007) Increased incidence of differentiated thyroid carcinoma and detection of subclinical disease. CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne, 177, 1357-1361.
79. Zhang, Y., Zhu, Y. and Risch, H.A. (2006) Changing incidence of thyroid cancer. JAMA : the journal of the American Medical Association, 296, 1350; author reply 1350.
80. How, J. and Tabah, R. (2007) Explaining the increasing incidence of differentiated thyroid cancer. CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne, 177, 1383-1384.
81. Wartofsky, L. (2010) Increasing world incidence of thyroid cancer: increased detection or higher radiation exposure? Hormones (Athens), 9, 103-108.
82. Ron, E., Lubin, J.H., Shore, R.E., Mabuchi, K., Modan, B., Pottern, L.M., Schneider, A.B., Tucker, M.A. and Boice, J.D., Jr. (1995) Thyroid cancer after exposure to external radiation: a pooled analysis of seven studies. Radiation research, 141, 259-277.
83. Kitahara, C.M., Platz, E.A., Freeman, L.E., Hsing, A.W., Linet, M.S., Park, Y., Schairer, C., Schatzkin, A., Shikany, J.M. and Berrington de Gonzalez, A. (2011) Obesity and thyroid cancer risk among U.S. men and women: a pooled analysis of five prospective studies. Cancer Epidemiol Biomarkers Prev, 20, 464-472.
84. Arighi, E., Borrello, M.G. and Sariola, H. (2005) RET tyrosine kinase signaling in development and cancer. Cytokine & growth factor reviews, 16, 441-467.
85. Nikiforov, Y.E. (2002) RET/PTC rearrangement in thyroid tumors. Endocrine pathology, 13, 3-16.
86. Nikiforova, M.N. and Nikiforov, Y.E. (2008) Molecular genetics of thyroid cancer: implications for diagnosis, treatment and prognosis. Expert Rev Mol Diagn, 8, 83-95.
87. Santoro, M., Melillo, R.M. and Fusco, A. (2006) RET/PTC activation in papillary thyroid carcinoma: European Journal of Endocrinology Prize Lecture. Eur J Endocrinol, 155, 645-653.
88. Nikiforov, Y.E., Rowland, J.M., Bove, K.E., Monforte-Munoz, H. and Fagin, J.A. (1997) Distinct pattern of ret oncogene rearrangements in morphological variants of radiation-induced and sporadic thyroid papillary carcinomas in children. Cancer Res, 57, 1690-1694.
89. Fugazzola, L., Pilotti, S., Pinchera, A., Vorontsova, T.V., Mondellini, P., Bongarzone, I., Greco, A., Astakhova, L., Butti, M.G., Demidchik, E.P. et al. (1995) Oncogenic rearrangements of the RET proto-oncogene in papillary thyroid carcinomas from children exposed to the Chernobyl nuclear accident. Cancer Res, 55, 5617-5620.
90. Klugbauer, S., Lengfelder, E., Demidchik, E.P. and Rabes, H.M. (1995) High prevalence of RET rearrangement in thyroid tumors of children from Belarus after the Chernobyl reactor accident. Oncogene, 11, 2459-2467.
91. Klugbauer, S., Jauch, A., Lengfelder, E., Demidchik, E. and Rabes, H.M. (2000) A novel type of RET rearrangement (PTC8) in childhood papillary thyroid carcinomas and characterization of the involved gene (RFG8). Cancer Res, 60, 7028-7032.
92. Bongarzone, I., Vigneri, P., Mariani, L., Collini, P., Pilotti, S. and Pierotti, M.A. (1998) RET/NTRK1 rearrangements in thyroid gland tumors of the papillary carcinoma family: correlation with clinicopathological features. Clinical cancer research : an official journal of the American Association for Cancer Research, 4, 223-228.
93. Fenton, C.L., Lukes, Y., Nicholson, D., Dinauer, C.A., Francis, G.L. and Tuttle, R.M. (2000) The ret/PTC mutations are common in sporadic papillary thyroid carcinoma of children and young adults. J Clin Endocrinol Metab, 85, 1170-1175.
94. Finn, S.P., Smyth, P., O'Leary, J., Sweeney, E.C. and Sheils, O. (2003) Ret/PTC chimeric transcripts in an Irish cohort of sporadic papillary thyroid carcinoma. J Clin Endocrinol Metab, 88, 938-941.
95. Glover, T.W., Coyle-Morris, J. and Morgan, R. (1986) Fragile sites: overview, occurrence in acute nonlymphocytic leukemia and effects of caffeine on expression. Cancer Genet Cytogenet, 19, 141-150.
96. Yunis, J.J. and Soreng, A.L. (1984) Constitutive fragile sites and cancer. Science, 226, 1199-1204.
97. Kuwano, A. and Kajii, T. (1987) Synergistic effect of aphidicolin and ethanol on the induction of common fragile sites. Hum Genet, 75, 75-78.
98. Demirhan, O. and Tastemir, D. (2008) Cytogenetic effects of ethanol on chronic alcohol users. Alcohol Alcohol, 43, 127-136.
99. Kao-Shan, C.S., Fine, R.L., Whang-Peng, J., Lee, E.C. and Chabner, B.A. (1987) Increased fragile sites and sister chromatid exchanges in bone marrow and peripheral blood of young cigarette smokers. Cancer Res, 47, 6278-6282.
100. Stein, C.K., Glover, T.W., Palmer, J.L. and Glisson, B.S. (2002) Direct correlation between FRA3B expression and cigarette smoking. Genes Chromosomes Cancer, 34, 333-340.
101. Musio, A. and Sbrana, I. (1997) Aphidicolin-sensitive specific common fragile sites: a biomarker of exposure to pesticides. Environ Mol Mutagen, 29, 250-255.
102. Sbrana, I. and Musio, A. (1995) Enhanced expression of common fragile site with occupational exposure to pesticides. Cancer Genet Cytogenet, 82, 123-127.
103. Webster, L.R., McKenzie, G.H. and Moriarty, H.T. (2002) Organophosphate-based pesticides and genetic damage implicated in bladder cancer. Cancer Genet Cytogenet, 133, 112-117.
104. Blair, A. and Zahm, S.H. (1991) Cancer among farmers. Occup Med, 6, 335-354.

105. Brown, L.M., Blair, A., Gibson, R., Everett, G.D., Cantor, K.P., Schuman, L.M., Burmeister, L.F., Van Lier, S.F. and Dick, F. (1990) Pesticide exposures and other agricultural risk factors for leukemia among men in Iowa and Minnesota. Cancer Res, 50, 6585-6591.
106. Coquelle, A., Toledo, F., Stern, S., Bieth, A. and Debatisse, M. (1998) A new role for hypoxia in tumor progression: induction of fragile site triggering genomic rearrangements and formation of complex DMs and HSRs. Mol Cell, 2, 259-265.
107. Yunis, J.J., Soreng, A.L. and Bowe, A.E. (1987) Fragile sites are targets of diverse mutagens and carcinogens. Oncogene, 1, 59-69.
108. NTP. (2011). U.S. Department of Health and Human Services, Public Health Service, National Toxicology Program, Research Triangle Park, NC, pp. 499.
109. Pellegriti, G., De Vathaire, F., Scollo, C., Attard, M., Giordano, C., Arena, S., Dardanoni, G., Frasca, F., Malandrino, P., Vermiglio, F. et al. (2009) Papillary thyroid cancer incidence in the volcanic area of Sicily. Journal of the National Cancer Institute, 101, 1575-1583.
110. Pecoraino, G., Scalici, L., Avellone, G., Ceraulo, L., Favara, R., Candela, E.G., Provenzano, M.C. and Scaletta, C. (2008) Distribution of volatile organic compounds in Sicilian groundwaters analysed by head space-solid phase micro extraction coupled with gas chromatography mass spectrometry (SPME/GC/MS). Water research, 42, 3563-3577.
111. Jordan, A. (2003) Volcanic Formation of Halogenated Organic Compounds. Springer-Verlag, Germany.
112. Telez, M., Ortiz-Lastra, E., Gonzalez, A.J., Flores, P., Huerta, I., Ramirez, J.M., Barasoain, M., Criado, B. and Arrieta, I. (2010) Assessment of the genotoxicity of atenolol in human peripheral blood lymphocytes: correlation between chromosomal fragility and content of micronuclei. Mutat Res, 695, 46-54.
113. Telez, M., Martinez, B., Criado, B., Lostao, C.M., Penagarikano, O., Ortega, B., Flores, P., Ortiz-Lastra, E., Alonso, R.M., Jimenez, R.M. et al. (2000) In vitro and in vivo evaluation of the antihypertensive drug atenolol in cultured human lymphocytes: effects of long-term therapy. Mutagenesis, 15, 195-202.
114. Brambilla, G. and Martelli, A. (2006) Genotoxicity and carcinogenicity studies of antihypertensive agents. Mutat Res, 612, 115-149.
115. McQuarrie, H.G., Scott, C.D., Ellsworth, H.S., Harris, J.W. and Stone, R.A. (1970) Cytogenetic studies on women using oral contraceptives and their progeny. American journal of obstetrics and gynecology, 108, 659-665.
116. Furuya, T., Hagiwara, J., Ochi, H., Tokuhiro, H., Kikawada, R., Karube, T. and Watanabe, S. (1991) Changes of common fragile sites on chromosomes according to the menstrual cycle. Hum Genet, 86, 471-474.
117. Mosher, W.D. and Jones, J. (2010) Use of contraception in the United States: 1982-2008. Vital and health statistics. Series 23, Data from the National Survey of Family Growth, 1-44.
118. BLS. (March 2011), Department of Labor, Bureau of Labor Statistics.

119. Grucza, R.A., Norberg, K.E. and Bierut, L.J. (2009) Binge drinking among youths and young adults in the United States: 1979-2006. Journal of the American Academy of Child and Adolescent Psychiatry, 48, 692-702.
120. National Center for Health Statistics (U.S.) and National Center for Health Services Research. (2011), Hyattsville, MD.
121. Sbrana, I., Zavattari, P., Barale, R. and Musio, A. (1998) Common fragile sites on human chromosomes represent transcriptionally active regions: evidence from camptothecin. Hum Genet, 102, 409-414.
122. Ozkaynak, M.F., Avramis, V.I., Carcich, S. and Ortega, J.A. (1998) Pharmacology of cytarabine given as a continuous infusion followed by mitoxantrone with and without amsacrine/etoposide as reinduction chemotherapy for relapsed or refractory pediatric acute myeloid leukemia. Medical and pediatric oncology, 31, 475-482.
123. Allan, J.M. and Travis, L.B. (2005) Mechanisms of therapy-related carcinogenesis. Nature reviews. Cancer, 5, 943-955.
124. Boffetta, P. and Kaldor, J.M. (1994) Secondary malignancies following cancer chemotherapy. Acta Oncol, 33, 591-598.
125. Swerdlow, A.J., Douglas, A.J., Hudson, G.V., Hudson, B.V., Bennett, M.H. and MacLennan, K.A. (1992) Risk of second primary cancers after Hodgkin's disease by type of treatment: analysis of 2846 patients in the British National Lymphoma Investigation. British Medical Journal, 304, 1137-1143.
126. Travis, L.B., Curtis, R.E., Storm, H., Hall, P., Holowaty, E., Van Leeuwen, F.E., Kohler, B.A., Pukkala, E., Lynch, C.F., Andersson, M. et al. (1997) Risk of second malignant neoplasms among long-term survivors of testicular cancer. Journal of the National Cancer Institute, 89, 1429-1439.
127. Goto, M., Miller, R.W., Ishikawa, Y. and Sugano, H. (1996) Excess of rare cancers in Werner syndrome (adult progeria). Cancer Epidemiol Biomarkers Prev, 5, 239-246.
128. Jimenez, M., Leon, P., Castro, L., Azcona, C. and Sierrasesumaga, L. (1995) [Second tumors in pediatric oncologic patients. Report of 5 cases]. Revista de medicina de la Universidad de Navarra, 40, 72-77.
129. Tsuchiya, H., Tomita, K., Ohno, M., Inaoki, M. and Kawashima, A. (1991) Werner's syndrome combined with quintuplicate malignant tumors: a case report and review of literature data. Japanese journal of clinical oncology, 21, 135-142.
130. Verneris, M., McDougall, I.R., Becton, D. and Link, M.P. (2001) Thyroid carcinoma after successful treatment of osteosarcoma: a report of three patients. Journal of pediatric hematology/oncology, 23, 312-315.
131. Yen, B.C., Kahn, H., Schiller, A.L., Klein, M.J., Phelps, R.G. and Lebwohl, M.G. (1993) Multiple hamartoma syndrome with osteosarcoma. Archives of pathology & laboratory medicine, 117, 1252-1254.
132. Kim, M.S., Sim, Y.S., Lee, S.Y. and Jeon, D.G. (2008) Secondary thyroid papillary carcinoma in osteosarcoma patients: report of two cases. Journal of Korean medical science, 23, 149-152.

133. Venkitaraman, R., Affolter, A., Ahmed, M., Thomas, V., Pritchard-Jones, K., Sharma, A.K., Marais, R. and Nutting, C.M. (2008) Childhood papillary thyroid cancer as second malignancy after successful treatment of rhabdomyosarcoma. Acta Oncol, 47, 469-472.
134. de Vathaire, F., Hawkins, M., Campbell, S., Oberlin, O., Raquin, M.A., Schlienger, J.Y., Shamsaldin, A., Diallo, I., Bell, J., Grimaud, E. et al. (1999) Second malignant neoplasms after a first cancer in childhood: temporal pattern of risk according to type of treatment. British journal of cancer, 79, 1884-1893.
135. Gow, K.W., Lensing, S., Hill, D.A., Krasin, M.J., McCarville, M.B., Rai, S.N., Zacher, M., Spunt, S.L., Strickland, D.K. and Hudson, M.M. (2003) Thyroid carcinoma presenting in childhood or after treatment of childhood malignancies: An institutional experience and review of the literature. Journal of pediatric surgery, 38, 1574-1580.
136. Vane, D., King, D.R. and Boles, E.T., Jr. (1984) Secondary thyroid neoplasms in pediatric cancer patients: increased risk with improved survival. Journal of pediatric surgery, 19, 855-860.
137. Smith, M.B., Xue, H., Strong, L., Takahashi, H., Jaffe, N., Ried, H., Zietz, H. and Andrassy, R.J. (1993) Forty-year experience with second malignancies after treatment of childhood cancer: analysis of outcome following the development of the second malignancy. Journal of pediatric surgery, 28, 1342-1348; discussion 1348-1349.
138. Arlt, M.F. and Glover, T.W. (2010) Inhibition of topoisomerase I prevents chromosome breakage at common fragile sites. DNA repair, 9, 678-689.
139. Tuduri, S., Crabbe, L., Conti, C., Tourriere, H., Holtgreve-Grez, H., Jauch, A., Pantesco, V., De Vos, J., Thomas, A., Theillet, C. et al. (2009) Topoisomerase I suppresses genomic instability by preventing interference between replication and transcription. Nature cell biology, 11, 1315-1324.
140. Gasparini, P., Sozzi, G. and Pierotti, M.A. (2007) The role of chromosomal alterations in human cancer development. J Cell Biochem, 102, 320-331.
141. Mitelman, F., Johansson, B., and Mertens, F. (2008).
142. Weterings, E. and Chen, D.J. (2008) The endless tale of non-homologous end-joining. Cell Res, 18, 114-124.
143. Shrivastav, M., De Haro, L.P. and Nickoloff, J.A. (2008) Regulation of DNA double-strand break repair pathway choice. Cell Res, 18, 134-147.
144. Glover, T.W. and Stein, C.K. (1988) Chromosome breakage and recombination at fragile sites. Am J Hum Genet, 43, 265-273.
145. Palakodeti, A., Han, Y., Jiang, Y. and Le Beau, M.M. (2004) The role of late/slow replication of the FRA16D in common fragile site induction. Genes Chromosomes Cancer, 39, 71-76.
146. Wang, L., Darling, J., Zhang, J.S., Huang, H., Liu, W. and Smith, D.I. (1999) Allele-specific late replication and fragility of the most active common fragile site, FRA3B. Hum Mol Genet, 8, 431-437.
147. Hewett, D.R., Handt, O., Hobson, L., Mangelsdorf, M., Eyre, H.J., Baker, E., Sutherland, G.R., Schuffenhauer, S., Mao, J.I. and Richards, R.I. (1998) FRA10B structure reveals common elements in repeat expansion and chromosomal fragile site genesis. Mol Cell, 1, 773-781.
148. Nikiforov, Y.E. (2008) Thyroid carcinoma: molecular pathways and therapeutic targets. Mod Pathol, 21 Suppl 2, S37-43.
149. Smanik, P.A., Furminger, T.L., Mazzaferri, E.L. and Jhiang, S.M. (1995) Breakpoint characterization of the ret/PTC oncogene in human papillary thyroid carcinoma. Hum Mol Genet, 4, 2313-2318.
150. Nikiforov, Y.E., Koshoffer, A., Nikiforova, M., Stringer, J. and Fagin, J.A. (1999) Chromosomal breakpoint positions suggest a direct role for radiation in inducing illegitimate recombination between the ELE1 and RET genes in radiation-induced thyroid carcinomas. Oncogene, 18, 6330-6334.
151. Motomura, T., Nikiforov, Y.E., Namba, H., Ashizawa, K., Nagataki, S., Yamashita, S. and Fagin, J.A. (1998) ret rearrangements in Japanese pediatric and adult papillary thyroid cancers. Thyroid, 8, 485-489.
152. Murano, I., Kuwano, A. and Kajii, T. (1989) Fibroblast-specific common fragile sites induced by aphidicolin. Hum Genet, 83, 45-48.
153. Sutherland, G.R., Parslow, M.I. and Baker, E. (1985) New classes of common fragile sites induced by 5-azacytidine and bromodeoxyuridine. Hum Genet, 69, 233-237.
154. Bongarzone, I., Butti, M.G., Fugazzola, L., Pacini, F., Pinchera, A., Vorontsova, T.V., Demidchik, E.P. and Pierotti, M.A. (1997) Comparison of the breakpoint regions of ELE1 and RET genes involved in the generation of RET/PTC3 oncogene in sporadic and in radiation-associated papillary thyroid carcinomas. Genomics, 42, 252-259.
155. Klugbauer, S., Pfeiffer, P., Gassenhuber, H., Beimfohr, C. and Rabes, H.M. (2001) RET rearrangements in radiation-induced papillary thyroid carcinomas: high prevalence of topoisomerase I sites at breakpoints and microhomology-mediated end joining in ELE1 and RET chimeric genes. Genomics, 73, 149-160.
156. Boldog, F., Gemmill, R.M., West, J., Robinson, M., Robinson, L., Li, E., Roche, J., Todd, S., Waggoner, B., Lundstrom, R. et al. (1997) Chromosome 3p14 homozygous deletions and sequence analysis of FRA3B. Hum Mol Genet, 6, 193-203.
157. Corbin, S., Neilly, M.E., Espinosa, R., 3rd, Davis, E.M., McKeithan, T.W. and Le Beau, M.M. (2002) Identification of unstable sequences within the common fragile site at 3p14.2: implications for the mechanism of deletions within fragile histidine triad gene/common fragile site at 3p14.2 in tumors. Cancer Res, 62, 3477-3484.
158. Caudill, C.M., Zhu, Z., Ciampi, R., Stringer, J.R. and Nikiforov, Y.E. (2005) Dose-dependent generation of RET/PTC in human thyroid cells after in vitro exposure to gamma-radiation: a model of carcinogenic chromosomal rearrangement induced by ionizing radiation. J Clin Endocrinol Metab, 90, 2364-2369.
159. Ciampi, R., Zhu, Z. and Nikiforov, Y.E. (2005) BRAF copy number gains in thyroid tumors detected by fluorescence in situ hybridization. Endocrine pathology, 16, 99-105.

160. Kong, Q. and Maizels, N. (2001) Breaksite batch mapping, a rapid method for assay and identification of DNA breaksites in mammalian cells. Nucleic Acids Res, 29, E33.
161. Gandhi, M., Dillon, L.W., Pramanik, S., Nikiforov, Y.E. and Wang, Y.H. (2010) DNA breaks at fragile sites generate oncogenic RET/PTC rearrangements in human thyroid cells. Oncogene, 29, 2272-2280.
162. McAvoy, S., Ganapathiraju, S.C., Ducharme-Smith, A.L., Pritchett, J.R., Kosari, F., Perez, D.S., Zhu, Y., James, C.D. and Smith, D.I. (2007) Non-random inactivation of large common fragile site genes in different cancers. Cytogenet Genome Res, 118, 260-269.
163. Grabczyk, E., Mancuso, M. and Sammarco, M.C. (2007) A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro. Nucleic Acids Res, 35, 5351-5359.
164. Lin, Y., Dent, S.Y., Wilson, J.H., Wells, R.D. and Napierala, M. (2010) R loops stimulate genetic instability of CTG.CAG repeats. Proc Natl Acad Sci U S A, 107, 692-697.
165. Reddy, K., Tam, M., Bowater, R.P., Barber, M., Tomlinson, M., Nichol Edamura, K., Wang, Y.H. and Pearson, C.E. (2011) Determinants of R-loop formation at convergent bidirectionally transcribed trinucleotide repeats. Nucleic Acids Res, 39, 1749-1762.
166. Vos, S.M., Tretter, E.M., Schmidt, B.H. and Berger, J.M. (2011) All tangled up: how cells direct, manage and exploit topoisomerase function. Nature reviews. Molecular cell biology, 12, 827-841.
167. Been, M.D. and Champoux, J.J. (1984) Breakage of single-stranded DNA by eukaryotic type 1 topoisomerase occurs only at regions with the potential for base-pairing. J Mol Biol, 180, 515-531.
168. Froelich-Ammon, S.J., Gale, K.C. and Osheroff, N. (1994) Site-specific cleavage of a DNA hairpin by topoisomerase II. DNA secondary structure as a determinant of enzyme recognition/cleavage. J Biol Chem, 269, 7719-7725.
169. Jonstrup, A.T., Thomsen, T., Wang, Y., Knudsen, B.R., Koch, J. and Andersen, A.H. (2008) Hairpin structures formed by alpha satellite DNA of human centromeres are cleaved by human topoisomerase IIalpha. Nucleic Acids Res, 36, 6165-6174.
170. Jhiang, S.M., Caruso, D.R., Gilmore, E., Ishizaka, Y., Tahira, T., Nagao, M., Chiu, I.M. and Mazzaferri, E.L. (1992) Detection of the PTC/retTPC oncogene in human thyroid cancers. Oncogene, 7, 1331-1337.
171. Been, M.D., Burgess, R.R. and Champoux, J.J. (1984) Nucleotide sequence preference at rat liver and wheat germ type 1 DNA topoisomerase breakage sites in duplex SV40 DNA. Nucleic Acids Res, 12, 3097-3114.
172. Capranico, G. and Binaschi, M. (1998) DNA sequence selectivity of topoisomerases and topoisomerase poisons. Biochim Biophys Acta, 1400, 185-194.
173. Garcia-Carbonero, R. and Supko, J.G. (2002) Current perspectives on the clinical experience, pharmacology, and continued development of the camptothecins. Clinical cancer research : an official journal of the American Association for Cancer Research, 8, 641-661.
174. Hande, K.R. (2008) Topoisomerase II inhibitors. Update on Cancer Therapeutics, 3, 13-26.
175. Long, B.H., Musial, S.T. and Brattain, M.G. (1985) Single- and double-strand DNA breakage and repair in human lung adenocarcinoma cells exposed to etoposide and teniposide. Cancer Res, 45, 3106-3112.
176. Pondarre, C., Strumberg, D., Fujimori, A., Torres-Leon, R. and Pommier, Y. (1997) In vivo sequencing of camptothecin-induced topoisomerase I cleavage sites in human colon carcinoma cells. Nucleic Acids Res, 25, 4111-4116.
177. Ganguly, A., Das, B., Roy, A., Sen, N., Dasgupta, S.B., Mukhopadhayay, S. and Majumder, H.K. (2007) Betulinic acid, a catalytic inhibitor of topoisomerase I, inhibits -mediated apoptotic topoisomerase I-DNA cleavable complex formation in prostate cancer cells but does not affect the process of cell death. Cancer Res, 67, 11848-11858.
178. Bar, F.M., Khanfar, M.A., Elnagar, A.Y., Liu, H., Zaghloul, A.M., Badria, F.A., Sylvester, P.W., Ahmad, K.F., Raisch, K.P. and El Sayed, K.A. (2009) Rational design and semisynthesis of betulinic acid analogues as potent topoisomerase inhibitors. J Nat Prod, 72, 1643-1650.
179. Syrovets, T., Buchele, B., Gedig, E., Slupsky, J.R. and Simmet, T. (2000) Acetyl-boswellic acids are novel catalytic inhibitors of human topoisomerases I and IIalpha. Mol Pharmacol, 58, 71-81.
180. Wada, S. and Tanaka, R. (2005) Betulinic acid and its derivatives, potent DNA topoisomerase II inhibitors, from the bark of Bischofia javanica. Chemistry & biodiversity, 2, 689-694.
181. Drake, F.H., Hofmann, G.A., Mong, S.M., Bartus, J.O., Hertzberg, R.P., Johnson, R.K., Mattern, M.R. and Mirabelli, C.K. (1989) In vitro and intracellular inhibition of topoisomerase II by the antitumor agent merbarone. Cancer Res, 49, 2578-2583.
182. Fortune, J.M. and Osheroff, N. (1998) Merbarone inhibits the catalytic activity of human topoisomerase IIalpha by blocking DNA cleavage. J Biol Chem, 273, 17643-17650.
183. Mulvihill, D.J. and Wang, Y.H. (2004) Two breakpoint clusters at fragile site FRA3B form phased nucleosomes. Genome Res, 14, 1350-1357.
184. Abdurashidova, G., Radulescu, S., Sandoval, O., Zahariev, S., Danailov, M.B., Demidovich, A., Santamaria, L., Biamonti, G., Riva, S. and Falaschi, A. (2007) Functional interactions of DNA topoisomerases with a human replication origin. The EMBO journal, 26, 998-1009.
185. Wang, J.C. (2002) Cellular roles of DNA topoisomerases: a molecular perspective. Nat Rev Mol Cell Biol, 3, 430-440.
186. Postow, L., Crisona, N.J., Peter, B.J., Hardy, C.D. and Cozzarelli, N.R. (2001) Topological challenges to DNA replication: conformations at the fork. Proc Natl Acad Sci U S A, 98, 8219-8226.

187. Liu, L.F. and Wang, J.C. (1987) Supercoiling of the DNA template during transcription. Proc Natl Acad Sci U S A, 84, 7024-7027.
188. Wu, H.Y., Shyy, S.H., Wang, J.C. and Liu, L.F. (1988) Transcription generates positively and negatively supercoiled domains in the template. Cell, 53, 433-440.
189. Drolet, M., Bi, X. and Liu, L.F. (1994) Hypernegative supercoiling of the DNA template during transcription elongation in vitro. J Biol Chem, 269, 2068-2074.
190. Li, X. and Manley, J.L. (2005) Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell, 122, 365-378.
191. Lin, Y. and Wilson, J.H. (2011) Transcription-induced DNA toxicity at trinucleotide repeats: double bubble is trouble. Cell Cycle, 10, 611-618.
192. Lemoine, N.R., Mayall, E.S., Jones, T., Sheer, D., McDermid, S., Kendall-Taylor, P. and Wynford-Thomas, D. (1989) Characterisation of human thyroid epithelial cells immortalised in vitro by simian virus 40 DNA transfection. British journal of cancer, 60, 897-903.
193. Swerdlow, A.J., Douglas, A.J., Vaughan Hudson, G., Vaughan Hudson, B. and MacLennan, K.A. (1993) Risk of second primary cancer after Hodgkin's disease in patients in the British National Lymphoma Investigation: relationships to host factors, histology and stage of Hodgkin's disease, and splenectomy. Br J Cancer, 68, 1006-1011.
194. Zuker, M. (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res, 31, 3406-3415.
195. Anderson, S. and DePamphilis, M.L. (1979) of Okazaki fragments during simian virus 40 DNA replication. J Biol Chem, 254, 11495-11504.
196. Hay, R.T. and DePamphilis, M.L. (1982) Initiation of SV40 DNA replication in vivo: location and structure of 5' ends of DNA synthesized in the ori region. Cell, 28, 767-779.
197. Bacolla, A. and Wells, R.D. (2004) Non-B DNA conformations, genomic rearrangements, and human disease. J Biol Chem, 279, 47411-47414.
198. Zhao, J., Bacolla, A., Wang, G. and Vasquez, K.M. (2010) Non-B DNA structure-induced genetic instability and evolution. Cell Mol Life Sci, 67, 43-62.
199. Wells, R.D. (2007) Non-B DNA conformations, mutagenesis and disease. Trends Biochem Sci, 32, 271-278.
200. Gatchel, J.R. and Zoghbi, H.Y. (2005) Diseases of unstable repeat expansion: mechanisms and common principles. Nat Rev Genet, 6, 743-755.
201. Wang, G. and Vasquez, K.M. (2006) Non-B DNA structure-induced genetic instability. Mutat Res, 598, 103-119.
202. Schwartz, M., Zlotorynski, E. and Kerem, B. (2006) The molecular basis of common and rare fragile sites. Cancer Lett, 232, 13-26.
203. Tanaka, H., Bergstrom, D.A., Yao, M.C. and Tapscott, S.J. (2005) Widespread and nonrandom distribution of DNA palindromes in cancer cells provides a structural platform for subsequent gene amplification. Nat Genet, 37, 320-327.
204. Abeysinghe, S.S., Chuzhanova, N., Krawczak, M., Ball, E.V. and Cooper, D.N. (2003) Translocation and gross deletion breakpoints in human inherited disease and cancer I: Nucleotide composition and recombination-associated motifs. Hum Mutat, 22, 229-244.
205. Chuzhanova, N., Abeysinghe, S.S., Krawczak, M. and Cooper, D.N. (2003) Translocation and gross deletion breakpoints in human inherited disease and cancer II: Potential involvement of repetitive sequence elements in secondary structure formation between DNA ends. Hum Mutat, 22, 245-251.
206. Bacolla, A., Jaworski, A., Larson, J.E., Jakupciak, J.P., Chuzhanova, N., Abeysinghe, S.S., O'Connell, C.D., Cooper, D.N. and Wells, R.D. (2004) Breakpoints of gross deletions coincide with non-B DNA conformations. Proc Natl Acad Sci U S A, 101, 14162-14167.
207. Popescu, N.C. (2003) Genetic alterations in cancer as a result of breakage at fragile sites. Cancer Lett, 192, 1-17.
208. Smith, C.L., Bolton, A. and Nguyen, G. (2010) Genomic and epigenomic instability, fragile sites, schizophrenia and autism. Curr Genomics, 11, 447-469.
209. Ozeri-Galai, E., Lebofsky, R., Rahat, A., Bester, A.C., Bensimon, A. and Kerem, B. (2011) Failure of origin activation in response to fork stalling leads to chromosomal instability at fragile sites. Mol Cell, 43, 122-131.
210. Jackson, M.S., Rocchi, M., Thompson, G., Hearn, T., Crosier, M., Guy, J., Kirk, D., Mulligan, L., Ricco, A., Piccininni, S. et al. (1999) Sequences flanking the centromere of human chromosome 10 are a complex patchwork of arm-specific sequences, stable duplications and unstable sequences with homologies to telomeric and other centromeric locations. Hum Mol Genet, 8, 205-215.
211. Linardopoulou, E.V., Williams, E.M., Fan, Y., Friedman, C., Young, J.M. and Trask, B.J. (2005) Human subtelomeres are hot spots of interchromosomal recombination and segmental duplication. Nature, 437, 94-100.
212. Becker, N.A., Thorland, E.C., Denison, S.R., Phillips, L.A. and Smith, D.I. (2002) Evidence that instability within the FRA3B region extends four megabases. Oncogene, 21, 8713-8722.
213. Krummel, K.A., Roberts, L.R., Kawakami, M., Glover, T.W. and Smith, D.I. (2000) The characterization of the common fragile site FRA16D and its involvement in multiple myeloma translocations. Genomics, 69, 37-46.
214. Pearson, C.E., Tam, M., Wang, Y.H., Montgomery, S.E., Dar, A.C., Cleary, J.D. and Nichol, K. (2002) Slipped-strand DNAs formed by long (CAG)*(CTG) repeats: slipped-out repeats and slip-out junctions. Nucleic Acids Res, 30, 4534-4547.
215. Pearson, C.E., Wang, Y.H., Griffith, J.D. and Sinden, R.R. (1998) Structural analysis of slipped-strand DNA (S-DNA) formed in (CTG)n. (CAG)n repeats from the myotonic dystrophy locus. Nucleic Acids Res, 26, 816-823.
216. Muleris, M., Dutrillaux, A.M., Lombard, M. and Dutrillaux, B. (1987) Noninvolvement of a constitutional heritable fragile site at 10q24.2 in rearranged chromosomes from rectal carcinoma cells. Cancer Genet Cytogenet, 25, 7-13.
217. Sarafidou, T., Kahl, C., Martinez-Garay, I., Mangelsdorf, M., Gesk, S., Baker, E., Kokkinaki, M., Talley, P., Maltby, E.L., French, L. et al. (2004) Folate-sensitive fragile site FRA10A is due to an expansion of a CGG repeat in a novel gene, FRA10AC1, encoding a nuclear protein. Genomics, 84, 69-81.
218. Dutrillaux, B., Muleris, M. and Prieur, M. (1985) [Exact localization of several fragile sites remains uncertain. The example of fra(10) sensitive to folate]. Ann Genet, 28, 161-163.
219. Giraud, F., Ayme, S., Mattei, J.F. and Mattei, M.G. (1976) Constitutional chromosomal breakage. Hum Genet, 34, 125-136.
220. Paradee, W., Wilke, C.M., Wang, L., Shridhar, R., Mullins, C.M., Hoge, A., Glover, T.W. and Smith, D.I. (1996) A 350-kb cosmid contig in 3p14.2 that crosses the t(3;8) hereditary renal cell carcinoma translocation breakpoint and 17 aphidicolin-induced FRA3B breakpoints. Genomics, 35, 87-93.
221. Boldog, F.L., Gemmill, R.M., Wilke, C.M., Glover, T.W., Nilsson, A.S., Chandrasekharappa, S.C., Brown, R.S., Li, F.P. and Drabkin, H.A. (1993) Positional cloning of the hereditary renal carcinoma 3;8 chromosome translocation breakpoint. Proc Natl Acad Sci U S A, 90, 8509-8513.
222. Huiping, C., Kristjansdottir, S., Bergthorsson, J.T., Jonasson, J.G., Magnusson, J., Egilsson, V. and Ingvarsson, S. (2002) High frequency of LOH, MSI and abnormal expression of FHIT in gastric cancer. Eur J Cancer, 38, 728-735.
223. Sozzi, G., Alder, H., Tornielli, S., Corletto, V., Baffa, R., Veronese, M.L., Negrini, M., Pilotti, S., Pierotti, M.A., Huebner, K. et al. (1996) Aberrant FHIT transcripts in Merkel cell carcinoma. Cancer Res, 56, 2472-2474.
224. Mimori, K., Druck, T., Inoue, H., Alder, H., Berk, L., Mori, M., Huebner, K. and Croce, C.M. (1999) Cancer-specific chromosome alterations in the constitutive fragile region FRA3B. Proc Natl Acad Sci U S A, 96, 7456-7461.
225. Finnis, M., Dayan, S., Hobson, L., Chenevix-Trench, G., Friend, K., Ried, K., Venter, D., Woollatt, E., Baker, E. and Richards, R.I. (2005) Common chromosomal fragile site FRA16D mutation in cancer cells. Hum Mol Genet, 14, 1341-1349.
226. Chen, T., Sahin, A. and Aldaz, C.M. (1996) Deletion map of chromosome 16q in ductal carcinoma in situ of the breast: refining a putative tumor suppressor gene region. Cancer Res, 56, 5605-5609.
227. Pearson, C.E. and Sinden, R.R. (1996) Alternative structures in duplex DNA formed within the trinucleotide repeats of the myotonic dystrophy and fragile X loci. Biochemistry, 35, 5041-5053.
228. Keniry, M. and Parsons, R. (2008) The role of PTEN signaling perturbations in cancer and in targeted therapy. Oncogene, 27, 5477-5485.
229. Rodriguez-Escudero, I., Oliver, M.D., Andres-Pons, A., Molina, M., Cid, V.J. and Pulido, R. (2011) A comprehensive functional analysis of PTEN mutations: implications in tumor- and autism-related syndromes. Hum Mol Genet, 20, 4132-4142.
230. Soler, G., Radford-Weiss, I., Ben-Abdelali, R., Mahlaoui, N., Ponceau, J.F., Macintyre, E.A., Vekemans, M., Bernard, O.A. and Romana, S.P. (2008) Fusion of ZMIZ1 to ABL1 in a B-cell acute lymphoblastic leukaemia with a t(9;10)(q34;q22.3) translocation. Leukemia, 22, 1278-1280.

160 231. Lahortiga, I., Vizmanos, J.L., Agirre, X., Vazquez, I., Cigudosa, J.C., Larrayoz, M.J., Sala, F., Gorosquieta, A., Perez-Equiza, K., Calasanz, M.J. et al. (2003) NUP98 is fused to adducin 3 in a patient with T-cell acute lymphoblastic leukemia and myeloid markers, with a new translocation t(10;11)(q25;p15). Cancer Res, 63, 3079-3083. 232. Courtois, G. and Gilmore, T.D. (2006) Mutations in the NF-kappaB signaling pathway: implications for human disease. Oncogene, 25, 6831-6843. 233. Migliazza, A., Lombardi, L., Rocchi, M., Trecca, D., Chang, C.C., Antonacci, R., Fracchiolla, N.S., Ciana, P., Maiolo, A.T. and Neri, A. (1994) Heterogeneous chromosomal aberrations generate 3' truncations of the NFKB2/lyt-10 gene in lymphoid malignancies. Blood, 84, 3850-3860. 234. Neri, A., Chang, C.C., Lombardi, L., Salina, M., Corradini, P., Maiolo, A.T., Chaganti, R.S. and Dalla-Favera, R. (1991) B cell lymphoma-associated chromosomal translocation involves candidate oncogene lyt-10, homologous to NF-kappa B p50. Cell, 67, 1075-1087. 235. Beroukhim, R., Mermel, C.H., Porter, D., Wei, G., Raychaudhuri, S., Donovan, J., Barretina, J., Boehm, J.S., Dobson, J., Urashima, M. et al. (2010) The landscape of somatic copy-number alteration across human cancers. Nature, 463, 899-905. 236. Sutherland, G.R. and Ledbetter, D.H. (1989) Report of the committee on cytogenetic markers. Cytogenet Cell Genet, 51, 452-458. 237. Rabes, H.M., Demidchik, E.P., Sidorow, J.D., Lengfelder, E., Beimfohr, C., Hoelzel, D. and Klugbauer, S. (2000) Pattern of radiation-induced RET and NTRK1 rearrangements in 191 post-chernobyl papillary thyroid carcinomas: biological, phenotypic, and clinical implications. Clin Cancer Res, 6, 1093-1103. 238. Lukusa, T. and Fryns, J.P. (2008) Human chromosome fragility. Biochim Biophys Acta, 1779, 3-16. 239. Tsantoulis, P.K., Kotsinas, A., Sfikakis, P.P., Evangelou, K., Sideridou, M., Levy, B., Mo, L., Kittas, C., Wu, X.R., Papavassiliou, A.G. et al. 
(2008) Oncogene-induced replication stress preferentially targets common fragile sites in preneoplastic lesions. A genome-wide study. Oncogene, 27, 3256-3264. 240. Ellis, B., Haaland, F., Hahne, F., Le Meur, N. and Gopalakrishnan, N. (2011) flowCore: Basic Structures for Flow Cytometry Data. R package Version 1.20.0. ISBN URL 241. R Development Core Team. (2012) R: A languange and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/

APPENDIX

Appendix Figure 1: Linker and primer sequences for LM-PCR

LL3        5'-CGAGTTCAGTCCGTAGACCATGGAGATCTGAATTC-3'
LP2        5'-GAATTCAGATCTCC-3'
LL4        5'-CGAGTTCAGTCCGTAGAC-3'
LL2        5'-GTAGACCATGGAGATCTGAAATTC-3'
RET-7      5'-BBCAGCATCTTCACGGCCACCGTGG-3' (B, biotin)
RET-R1b    5'-TATCCTGCTCTGCCTTTCAGATGG-3'
RET-R1     5'-AGTTCTTCCGAGATTCC-3'
FRA3B-20   5'-BBCCTATCTGACGACTTCAC-3' (B, biotin)
FRA3B-9    5'-GAAAGCATAAAGTGTGGC-3'
FRA3B-23   5'-TAACTGCTTATTTTTCCGATGT-3'
12p12.3-1  5'-BBTTTTCTTGACTAGTCTAACCAGAT-3' (B, biotin)
12p12.3-2  5'-TTTCACTTGTATTGATCTCCTTCAT-3'
12p12.3-3  5'-TTTCCACTGTTTGCCGCATTAT-3'
G6PDF3     5'-BBAGTAAAAACACAAGCCCCGCCCC-3' (B, biotin)
G6PDF      5'-TAGGGCCGCATCCCGCTCCGGAGAGAAGTCT-3'
G6PDF2     5'-GGCCACTTTGCAGGGCGTCA-3'

PCR primers for Detection of RET/PTC Rearrangements
RET/PTC1 forward  5'-CAAGAGAACAAGGTGCTGAAG-3'
RET/PTC3 forward  5'-CGGTATTGTAGCTGTCCCTTTC-3'
common reverse    5'-GCAGGTCTCGAAGCTCACTC-3'

32P-labeled oligonucleotide probes
RET/PTC1  5'-CGTTACCATCGAGGATCCAAA-3'
RET/PTC3  5'-GAACAGTCAGGAGGTCCAA-3'

Appendix Table 1: Primer and linker sequences for LM-PCR

Region  Primer Set  Primer Name  Primer Sequence
RET     1           RET-7        5'-BBCAGCATCTTCACGGCCACCGTGG-3'
RET     1           RET-R1b      5'-TATCCTGCTCTGCCTTTCAGATGG-3'
RET     1           RET-R1       5'-AGTTCTTCCGAGATTCC-3'
RET     2           RET-8        5'-ACCTGTTTACCACACTCTAGAGAC-3'
RET     2           RET-9        5'-BBGCACATGAGGGCGACATCAACCTG-3'
RET     2           RET-10       5'-CCTCTCAAATACTGAGGTTGAGTC-3'
RET     3           RET-14       5'-BBTACCACAAGTTTGCCCACAAGCCACCC-3'
RET     3           RET-15       5'-CCCGGTCAGCTACTCCTCTTCC-3'
RET     3           RET-16       5'-TCTCCGTGGATGCCTTCAAGAT-3'
RET     4           RET-17       5'-BBGCTCTAGGATGAGCCACCAGAGTCC-3'
RET     4           RET-18       5'-CCAGGAAGGCCGCACTGGTC-3'
RET     4           RET-19       5'-GCTGCTGCTGGCAGAGACCA-3'
FRA3B   1           FRA3B-20     5'-BBCCTATCTGACGACTTCAC-3'
FRA3B   1           FRA3B-9      5'-GAAAGCATAAAGTGTGGC-3'
FRA3B   1           FRA3B-23     5'-TAACTGCTTATTTTTCCGATGT-3'
Linker              LL3          5'-CGAGTTCAGTCCGTAGACCATGGAGATCTGAATTC-3'
Linker              LP2          5'-GAATTCAGATCTCC-3'
Linker              LL4          5'-CGAGTTCAGTCCGTAGAC-3'
Linker              LL2          5'-GTAGACCATGGAGATCTGAAATTC-3'

Appendix Table 2: APH-Induced DNA breakpoints within RET intron 11

                    Distance   Distance to       Distance to Closest  Distance to Closest
Breakpoint          From Exon  Closest Patient   Predicted Topo I     Predicted Topo II
Number      Strand  11 (bp)    Breakpoint (bp)*  Cleavage Site (bp)*  Cleavage Site (bp)*
1           Top     48         -15               -19                  +12
2           Bottom  86         +35               +4                   -8
3           Bottom  100        +21               -10                  +2
4           Bottom  102        +19               -12                  0
5           Bottom  116        +5                +8                   -3
6           Bottom  150        +18               +3                   +9
7           Bottom  152        +16               +1                   +7
8           Bottom  154        +14               -1                   +5
9           Top     173        -1                -6                   -1
10          Bottom  211        +33               -3                   +7
11          Bottom  212        +32               -4                   +6
12          Bottom  215        +29               +4                   +3
13          Bottom  215        +29               +4                   +3
14          Top     233        +11               +2                   -5
15          Bottom  238        +6                +6                   +5
16          Bottom  249        -2                +3                   -6
17          Bottom  277        -24               -3                   -9
18          Bottom  280        -27               +5                   +9
19          Bottom  283        -30               +2                   +6
20          Bottom  294        -41               +4                   +3
21          Bottom  297        -44               +1                   0
22          Bottom  297        -44               +1                   0
23          Bottom  326        +29               -5                   0
24          Bottom  329        +26               +2                   -3
25          Bottom  348        +7                +1                   -2
26          Bottom  356        +1                +3                   +5
27          Top     378        -7                -6                   +3
28          Top     440        0                 0                    +1
29          Bottom  466        -1                -3                   -8
30          Top     500        +6                +7                   -1
31          Bottom  502        +4                -1                   +1
32          Bottom  507        -1                +1                   0
33          Bottom  556        +31               -5                   -3
34          Top     584        +3                -7                   +2
35          Bottom  592        -1                +8                   -2
36          Top     605        -7                -2                   -3
37          Top     651        -53               +1                   -8
38          Bottom  680        +57               +6                   -1
39          Bottom  687        +50               -1                   +6
40          Bottom  694        +43               0                    0
41          Top     706        +31               +1                   -1
42          Top     724        +13               -17                  +3
43          Bottom  735        +2                -8                   -4
44          Bottom  755        -1                -7                   0
45          Top     794        -40               -1                   0
46          Top     806        +34               +11                  +2
47          Top     806        +34               +11                  +2
48          Top     806        +34               +11                  +2
49          Top     806        +34               +11                  +2
50          Top     806        +34               +11                  +2
51          Top     806        +34               +11                  +2
52          Top     806        +34               +11                  +2
53          Top     806        +34               +11                  +2
54          Top     809        +31               +8                   -1
55          Top     809        +31               +8                   -1
56          Top     815        +25               +2                   0
57          Bottom  830        10                +9                   0
58          Top     845        0                 -4                   0
59          Bottom  845        0                 -4                   0
60          Top     855        -10               0                    +4
61          Top     855        -10               0                    +4
62          Top     898        -6                +5                   -1
63          Top     900        -8                +3                   -3
64          Top     935        +9                -6                   +1
65          Bottom  941        +3                -3                   -1
66          Top     945        0                 +3                   +3
67          Bottom  970        -12               -4                   +3
68          Top     986        +28               0                    0
69          Top     993        +21               0                    -2
70          Top     996        +18               -1                   +2
71          Bottom  999        +15               -4                   -8
72          Bottom  1028       +9                -5                   -1
73          Bottom  1036       +2                0                    +6
74          Bottom  1045       +4                +3                   -3
75          Bottom  1048       +1                0                    +2
76          Bottom  1049       0                 -1                   +1
77          Bottom  1094       +2                0                    +3
78          Bottom  1142       -5                +1                   +1
79          Bottom  1142       -5                +1                   +1
80          Bottom  1182       -45               -1                   +4
81          Bottom  1186       -49               +1                   0
82          Top     1210       +45               -4                   -2
83          Top     1214       +41               +7                   +2
84          Bottom  1222       +33               -1                   -2
85          Bottom  1223       +32               -2                   -3
86          Top     1239       +16               +3                   +3
87          Bottom  1241       +14               -1                   -2
88          Bottom  1246       +9                -6                   0
89          Bottom  1246       +9                -6                   0
90          Top     1254       +1                0                    -2
91          Top     1260       0                 -6                   +6
92          Bottom  1269       -5                0                    +1
93          Bottom  1282       -3                -7                   -3
94          Bottom  1292       -10               +6                   -1
95          Bottom  1293       -11               +5                   -2
96          Bottom  1309       +24               +1                   -1
97          Bottom  1320       +13               0                    +6
98          Bottom  1330       -3                -5                   -4
99          Top     1332       +1                -9                   +6
100         Top     1334       0                 -11                  +4
101         Bottom  1339       -5                -1                   -5
102         Top     1346       -12               0                    0
103         Top     1365       -31               -12                  -2
104         Bottom  1383       -49               +6                   -10
105         Bottom  1391       +58               0                    +5
106         Bottom  1394       +55               -3                   +2
107         Bottom  1396       +53               +1                   0
108         Bottom  1409       +20               -8                   +5
109         Top     1419       +30               +18                  -5
110         Bottom  1444       +5                +9                   -2
111         Top     1453       0                 +3                   -2
112         Bottom  1456       -3                -3                   -1
113         Bottom  1456       -3                -3                   -1
114         Top     1462       -9                -4                   0
115         Top     1465       -12               -7                   -3
116         Top     1474       +21               -1                   -4
117         Bottom  1488       +7                -2                   0
118         Top     1511       -15               +1                   -3
119         Top     1516       -20               -4                   +7
120         Top     1519       -23               +4                   +4
121         Bottom  1534       +20               -7                   0
122         Top     1545       +9                +5                   0
123         Top     1556       -2                +2                   -3
124         Top     1576       -8                -4                   -1
125         Top     1579       -11               +2                   +2
126         Top     1581       -13               0                    0
127         Top     1581       -13               0                    0
128         Top     1583       -15               -2                   -2
129         Top     1651       -6                -4                   -2
130         Top     1669       +12               +10                  +2
131         Bottom  1678       +3                -2                   +3
132         Top     1707       +10               +10                  -3
133         Top     1708       +9                +9                   -4
134         Top     1718       -2                -1                   +3
135         Top     1718       -2                -1                   +3
136         Top     1720       -3                +1                   +1
137         Top     1738       -13               +2                   -2
138         Top     1763       +26               -10                  -3
139         Bottom  1771       +18               +1                   -7
140         Bottom  1772       +17               0                    +6
141         Top     1781       +8                +4                   -3
142         Top     1785       -4                0                    +1
143         Top     1795       0                 +1                   +4
144         Top     1805       +10               -1                   -5

*A positive distance indicates that the patient breakpoint or topoisomerase cleavage site is downstream of the APH-induced breakpoint, while a negative distance indicates that it is upstream.
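
The sign convention in the footnote to Appendix Table 2 can be illustrated with a short Python sketch. This is a minimal illustration and not the analysis code used to build the table. The function name `signed_distance_to_closest`, the tie-breaking rule, and the assumption that all positions are counted in bp from exon 11 and increase in the downstream direction are hypothetical. The example site positions (33 and 121) are back-calculated from the first rows of the table purely for illustration.

```python
from typing import Sequence


def signed_distance_to_closest(breakpoint: int, sites: Sequence[int]) -> int:
    """Return the signed distance (bp) from a breakpoint to its nearest site.

    Positive: the site lies downstream of the breakpoint.
    Negative: the site lies upstream of the breakpoint.
    Assumes positions increase in the downstream direction (an assumption).
    """
    if not sites:
        raise ValueError("at least one site position is required")
    # Ties in absolute distance are resolved toward the upstream site;
    # this rule is an arbitrary choice made for the sketch.
    return min((site - breakpoint for site in sites),
               key=lambda d: (abs(d), d))


# Hypothetical site positions inferred from rows 1-5 of Appendix Table 2.
example_sites = [33, 121]
for position in (48, 86, 100, 102, 116):
    distance = signed_distance_to_closest(position, example_sites)
    print(f"{position}\t{distance:+d}")
```

Under these assumptions, the breakpoint at 48 bp yields -15 and the breakpoint at 86 bp yields +35, in agreement with the corresponding table entries.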

167 Appendix Table 3: Location of RET intron 11 APH-induced DNA breakpoints on predicted DNA secondary structures Distance Double-Stranded/ Double- Single- Single- Breakpoint Strand From Exon Single-Stranded Stranded Stranded Stranded Number 11 (bp) DNA Junction* DNA Stem DNA Loop DNA Bubble 1 Top 48 x 2 Bottom 86 x 3 Bottom 100 x 4 Bottom 102 x 5 Bottom 116 x 6 Bottom 150 x 7 Bottom 152 x 8 Bottom 154 x 9 Top 173 x 10 Bottom 211 x 11 Bottom 212 x 12 Bottom 215 x 13 Bottom 215 x 14 Top 233 x 15 Bottom 238 x 16 Bottom 249 x 17 Bottom 277 x 18 Bottom 280 x 19 Bottom 283 x 20 Bottom 294 x 21 Bottom 297 x 22 Bottom 297 x 23 Bottom 326 x 24 Bottom 329 x 25 Bottom 348 x 26 Bottom 356 x 27 Top 378 x 28 Top 440 x 29 Bottom 466 x 30 Top 500 x 31 Bottom 502 x 32 Bottom 507 x 33 Bottom 556 x 34 Top 584 x 35 Bottom 592 x 36 Top 605 x 37 Top 651 x 38 Bottom 680 x 39 Bottom 687 x

168 40 Bottom 694 x 41 Top 706 x 42 Top 724 x 43 Bottom 735 x 44 Bottom 755 x 45 Top 794 x 46 Top 806 x 47 Top 806 x 48 Top 806 x 49 Top 806 x 50 Top 806 x 51 Top 806 x 52 Top 806 x 53 Top 806 x 55 Top 809 x 54 Top 809 x 56 Top 815 x 57 Bottom 830 x 59 Bottom 845 x 58 Top 845 x 60 Top 855 x 61 Top 855 x 62 Top 898 x 63 Top 900 x 64 Top 935 x 65 Bottom 941 x 66 Top 945 x 67 Bottom 970 x 68 Top 986 x 69 Top 993 x 70 Top 996 x 71 Bottom 999 x 72 Bottom 1028 x 73 Bottom 1036 x 74 Bottom 1045 x 75 Bottom 1048 x 76 Bottom 1049 x 77 Bottom 1094 x 78 Bottom 1142 x 79 Bottom 1142 x 80 Bottom 1182 x 81 Bottom 1186 x 82 Top 1210 x 83 Top 1214 x

169 84 Bottom 1222 x 85 Bottom 1223 x 86 Top 1239 x 87 Bottom 1241 x 88 Bottom 1246 x 89 Bottom 1246 x 90 Top 1254 x 91 Top 1260 x 92 Bottom 1269 x 93 Bottom 1282 x 94 Bottom 1292 x 95 Bottom 1293 x 96 Bottom 1309 x 97 Bottom 1320 x 98 Bottom 1330 x 99 Top 1332 x 100 Top 1334 x 101 Bottom 1339 x 102 Top 1346 x 103 Top 1365 x 104 Bottom 1383 x 105 Bottom 1391 x 106 Bottom 1394 x 107 Bottom 1396 x 108 Bottom 1409 x 109 Top 1419 x 110 Bottom 1444 x 111 Top 1453 x 112 Bottom 1456 x 113 Bottom 1456 x 114 Top 1462 x 115 Top 1465 x 116 Top 1474 x 117 Bottom 1488 x 118 Top 1511 x 119 Top 1516 x 120 Top 1519 x 121 Bottom 1534 x 122 Top 1545 x 123 Top 1556 x 124 Top 1576 x 125 Top 1579 x 126 Top 1581 x 127 Top 1581 x

128 Top 1583 x 129 Top 1651 x 130 Top 1669 x 131 Bottom 1678 x 132 Top 1707 x 133 Top 1708 x 134 Top 1718 x 135 Top 1718 x 136 Top 1720 x 137 Top 1738 x 138 Top 1763 x 139 Bottom 1771 x 140 Bottom 1772 x 141 Top 1781 x 142 Top 1785 x 143 Top 1795 x 144 Top 1805 x

*Breakpoint is located within the double-stranded DNA stem, one nucleotide from the base of the stem, with allowance of up to a 5-nt deletion. The allowance for deletion is based upon the observation of up to a 5-nt deletion of BanI-induced DNA breakage within RET intron 11, as described in the results.
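
Appendix Table 4 lists runs of at least seven consecutive segments whose predicted free energy falls below -43.61 kcal/mol. The Python sketch below shows one way such runs could be identified and converted to nucleotide coordinates. It is a sketch under stated assumptions, not the procedure used to generate the table. The 300-nt segment length and 150-nt shift are inferred from the segment and nucleotide columns of the table, and the function names, the `Region` type, and the input energies are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple

WINDOW_NT = 300      # assumed segment length, inferred from the table
STEP_NT = 150        # assumed shift between consecutive segment starts
THRESHOLD = -43.61   # kcal/mol, from the table title
MIN_RUN = 7          # minimum consecutive segments, from the table title


class Region(NamedTuple):
    segment_start: int
    segment_end: int
    n_segments: int
    nucleotide_start: int
    nucleotide_end: int


def segments_to_nucleotides(first: int, last: int) -> Tuple[int, int]:
    """Map 1-based segment numbers to 1-based nucleotide coordinates."""
    start = (first - 1) * STEP_NT + 1
    end = (last - 1) * STEP_NT + WINDOW_NT
    return start, end


def find_regions(energies: Sequence[float]) -> List[Region]:
    """Return runs of >= MIN_RUN consecutive segments below THRESHOLD."""
    regions: List[Region] = []
    run_start = None
    # A sentinel above the threshold closes a run that reaches the end.
    padded = list(energies) + [float("inf")]
    for i, energy in enumerate(padded, start=1):
        if energy < THRESHOLD:
            if run_start is None:
                run_start = i
            continue
        if run_start is not None and i - run_start >= MIN_RUN:
            last = i - 1
            nt_start, nt_end = segments_to_nucleotides(run_start, last)
            regions.append(Region(run_start, last, last - run_start + 1,
                                  nt_start, nt_end))
        run_start = None
    return regions


# Hypothetical energies: one run of seven qualifying segments, one of six.
demo = [-10.0] * 3 + [-50.0] * 7 + [-10.0] * 2 + [-50.0] * 6
for region in find_regions(demo):
    print(region)
```

With these assumptions, `segments_to_nucleotides(1204, 1215)` returns `(180451, 182400)`, which matches the first row of the table. The run of six segments in the demonstration input is not reported because it is shorter than the seven-segment minimum.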

171 Appendix Table 4: Regions on chromosome 10 with at least seven consecutive segments below -43.61 kcal/mol Number of GC Segment Segment consecutive Nucleotide Nucleotide Chromosomal content start end segments start end Position Fragile site Gene (%) 1,204 1,215 12 180,451 182,400 10p15.3 ZMYND11 74 2,450 2,459 10 367,351 369,000 10p15.3 DIP2C 68 3,216 3,222 7 482,251 483,450 10p15.3 DIP2C 60 4,166 4,172 7 624,751 625,950 10p15.3 DIP2C 39 4,467 4,473 7 669,901 671,100 10p15.3 DIP2C 62 4,898 4,910 13 734,551 736,650 10p15.3 DIP2C 71 5,210 5,219 10 781,351 783,000 10p15.3 none 64 5,531 5,544 14 829,501 831,750 10p15.3 none 68 8,340 8,346 7 1,250,851 1,252,050 10p15.3 ADARB2 63 9,365 9,377 13 1,404,601 1,406,700 10p15.3 ADARB2 69 10,042 10,048 7 1,506,151 1,507,350 10p15.3 ADARB2 65 11,384 11,390 7 1,707,451 1,708,650 10p15.3 ADARB2 61 11,859 11,867 9 1,778,701 1,780,200 10p15.3 ADARB2 64 20,735 20,742 8 3,110,101 3,111,450 10p15.2 PFKP 70 32,580 32,586 7 4,886,851 4,888,050 10p15.1 AKR1E2 64 ANKRD16, 39,540 39,551 12 5,930,851 5,932,800 10p15.1 FBXO18 74 40,127 40,134 8 6,018,901 6,020,250 10p15.1 IL15RA 73 41,244 41,254 11 6,186,451 6,188,250 10p15.1 PFKFB3 70 41,629 41,637 9 6,244,201 6,245,700 10p15.1 PFKFB3 71 41,716 41,723 8 6,257,251 6,258,600 10p15.1 PFKFB3 61 45,845 45,861 17 6,876,601 6,879,300 10p14 none 22 49,696 49,702 7 7,454,251 7,455,450 10p14 none 65 53,943 53,960 18 8,091,301 8,094,150 10p14 none 67 55,406 55,413 8 8,310,751 8,312,100 10p14 none 9 73,730 73,738 9 11,059,351 11,060,850 10p14 CUGBP2 73 77,687 77,693 7 11,652,901 11,654,100 10p14 USP6NL 71 78,561 78,567 7 11,784,001 11,785,200 10p14 ECHDC3 71 79,411 79,417 7 11,911,501 11,912,700 10p14 C10orf47 71 79,573 79,580 8 11,935,801 11,937,150 10p14 none 65 82,604 82,617 14 12,390,451 12,392,700 10p13 CAMK1D 70 83,740 83,748 9 12,560,851 12,562,350 10p13 CAMK1D 74 88,943 88,949 7 13,341,301 13,342,500 10p13 PHYH 67 89,261 89,272 12 13,389,001 13,390,950 10p13 SEPHS1 70 90,468 90,475 8 
13,570,051 13,571,400 10p13 none 79 91,324 91,331 8 13,698,451 13,699,800 10p13 FRMD4A 67 97,638 97,644 7 14,645,551 14,646,750 10p13 FAM107B 75

172 CDNF, 99,198 99,206 9 14,879,551 14,881,050 10p13 HSPA14 66 99,469 99,475 7 14,920,201 14,921,400 10p13 SUV39H2 71 101,399 101,405 7 15,209,701 15,210,900 10p13 NMT2 73 102,750 102,756 7 15,412,351 15,413,550 10p13 FAM171A1 79 115,143 115,151 9 17,271,301 17,272,800 10p13 VIM 65 116,636 116,644 9 17,495,251 17,496,750 10p12.33 ST8SIA6 72 117,725 117,732 8 17,658,601 17,659,950 10p12.33 PTPLA 68 117,806 117,813 8 17,670,751 17,672,100 10p12.33 none 69 122,861 122,869 9 18,429,001 18,430,500 10p12.33 CACNB2 68 143,085 143,091 7 21,462,601 21,463,800 10p12.31 NEBL 72 145,231 145,238 8 21,784,501 21,785,850 10p12.31 C10orf114 69 145,323 145,330 8 21,798,301 21,799,650 10p12.31 none 65 148,613 148,619 7 22,291,801 22,293,000 10p12.31 DNAJC1 67 COMMD3, 150,729 150,741 13 22,609,201 22,611,300 10p12.2 BMI1 73 150,827 150,839 13 22,623,901 22,626,000 10p12.2 none 66 151,504 151,510 7 22,725,451 22,726,650 10p12.2 none 67 151,766 151,772 7 22,764,751 22,765,950 10p12.2 none 65 153,349 153,359 11 23,002,201 23,004,000 10p12.2 PIP4K2A 73 156,415 156,427 13 23,462,101 23,464,200 10p12.2 none 66 156,537 156,549 13 23,480,401 23,482,500 10p12.2 PTF1A 71 156,586 156,592 7 23,487,751 23,488,950 10p12.2 none 66 158,185 158,196 12 23,727,601 23,729,550 10p12.2 OTUD1 75 166,297 166,310 14 24,944,401 24,946,650 10p12.1 ARHGAP21 74 166,744 166,754 11 25,011,451 25,013,250 10p12.1 ARHGAP21 72 169,763 169,771 9 25,464,301 25,465,800 10p12.1 GPR158 65 171,182 171,189 8 25,677,151 25,678,500 10p12.1 GPR158 59 183,900 183,908 9 27,584,851 27,586,350 10p12.1 none 69 186,875 186,886 12 28,031,101 28,033,050 10p12.1 MKX 66 192,141 192,153 13 28,821,001 28,823,100 10p12.1 WAC 70 193,106 193,115 10 28,965,751 28,967,400 10p12.1 BAMBI 71 199,901 199,908 8 29,985,001 29,986,350 10p11.23 SVIL 70 200,160 200,174 15 30,023,851 30,026,250 10p11.23 SVIL 71 202,318 202,324 7 30,347,551 30,348,750 10p11.23 KIAA1462 73 204,817 204,825 9 30,722,401 30,723,900 10p11.23 MAP3K8 67 210,722 210,729 8 
31,608,151 31,609,500 10p11.22 ZEB1 73 211,004 211,011 8 31,650,451 31,651,800 10p11.22 ZEB1 67 213,655 213,661 7 32,048,101 32,049,300 10p11.22 none 67 214,780 214,789 10 32,216,851 32,218,500 10p11.22 ARHGAP12 69 215,059 215,067 9 32,258,701 32,260,200 10p11.22 none 76 217,568 217,578 11 32,635,051 32,636,850 10p11.22 EPC1 66

173 221,643 221,651 9 33,246,301 33,247,800 10p11.22 ITGB1 74 224,160 224,166 7 33,623,851 33,625,050 10p11.22 none 70 234,021 234,032 12 35,103,001 35,104,950 10p11.21 PARD3 74 237,500 237,510 11 35,624,851 35,626,650 10p11.21 CCNY 73 239,311 239,318 8 35,896,501 35,897,850 10p11.21 GJD4 66 239,522 239,546 25 35,928,151 35,932,050 10p11.21 FZD8 70 249,424 249,431 8 37,413,451 37,414,800 10p11.21 ANKRD30A 61 253,128 253,139 12 37,969,051 37,971,000 10p11.21 none 65 253,340 253,348 9 38,000,851 38,002,350 10p11.1 none 61 258,253 258,261 9 38,737,801 38,739,300 10p11.1 none 68 288,327 288,341 15 43,248,901 43,251,300 10q11.21 FRA10G none 68 289,075 289,081 7 43,361,101 43,362,300 10q11.21 FRA10G none 64 289,158 289,164 7 43,373,551 43,374,750 10q11.21 FRA10G none 62 289,521 289,528 8 43,428,001 43,429,350 10q11.21 FRA10G none 66 289,643 289,650 8 43,446,301 43,447,650 10q11.21 FRA10G none 63 289,768 289,778 11 43,465,051 43,466,850 10q11.21 FRA10G none 59 289,900 289,906 7 43,484,851 43,486,050 10q11.21 FRA10G none 61 289,909 289,916 8 43,486,201 43,487,550 10q11.21 FRA10G none 63 289,973 289,981 9 43,495,801 43,497,300 10q11.21 FRA10G none 61 290,263 290,271 9 43,539,301 43,540,800 10q11.21 FRA10G none 62 290,408 290,414 7 43,561,051 43,562,250 10q11.21 FRA10G none 57 290,479 290,489 11 43,571,701 43,573,500 10q11.21 FRA10G RET 71 290,511 290,519 9 43,576,501 43,578,000 10q11.21 FRA10G RET 62 290,665 290,680 16 43,599,601 43,602,150 10q11.21 FRA10G RET 63 290,731 290,742 12 43,609,501 43,611,450 10q11.21 FRA10G RET 61 290,757 290,767 11 43,613,401 43,615,200 10q11.21 FRA10G RET 62 290,887 290,893 7 43,632,901 43,634,100 10q11.21 FRA10G CSGALNACT2 67 291,295 291,311 17 43,694,101 43,696,800 10q11.21 FRA10G RASGEF1A 61 291,325 291,331 7 43,698,601 43,699,800 10q11.21 FRA10G RASGEF1A 67 291,335 291,341 7 43,700,101 43,701,300 10q11.21 FRA10G RASGEF1A 63 291,344 291,350 7 43,701,451 43,702,650 10q11.21 FRA10G RASGEF1A 64 291,371 291,379 9 43,705,501 43,707,000 10q11.21 
FRA10G RASGEF1A 60 291,421 291,427 7 43,713,001 43,714,200 10q11.21 FRA10G RASGEF1A 62 291,494 291,506 13 43,723,951 43,726,050 10q11.21 FRA10G RASGEF1A 69 291,659 291,666 8 43,748,701 43,750,050 10q11.21 FRA10G RASGEF1A 62 291,669 291,677 9 43,750,201 43,751,700 10q11.21 FRA10G RASGEF1A 63 291,683 291,690 8 43,752,301 43,753,650 10q11.21 FRA10G RASGEF1A 60 291,729 291,735 7 43,759,201 43,760,400 10q11.21 FRA10G RASGEF1A 63 291,739 291,751 13 43,760,701 43,762,800 10q11.21 FRA10G RASGEF1A 70 292,162 292,169 8 43,824,151 43,825,500 10q11.21 FRA10G none 65 292,611 292,618 8 43,891,501 43,892,850 10q11.21 FRA10G HNRNPF 69

174 292,878 292,884 7 43,931,551 43,932,750 10q11.21 FRA10G none 64 294,567 294,573 7 44,184,901 44,186,100 10q11.21 FRA10G none 65 298,372 298,378 7 44,755,651 44,756,850 10q11.21 FRA10G none 57 299,053 299,059 7 44,857,801 44,859,000 10q11.21 FRA10G none 55 299,197 299,207 11 44,879,401 44,881,200 10q11.21 FRA10G CXCL12 68 302,866 302,874 9 45,429,751 45,431,250 10q11.21 FRA10G TMEM72 59 305,611 305,626 16 45,841,501 45,844,050 10q11.21 FRA10G none 66 305,791 305,804 14 45,868,501 45,870,750 10q11.21 FRA10G ALOX5 65 306,255 306,265 11 45,938,101 45,939,900 10q11.21 FRA10G ALOX5 69 309,071 309,077 7 46,360,501 46,361,700 10q11.22 FRA10G none 65 311,828 311,834 7 46,774,051 46,775,250 10q11.22 FRA10G none 70 313,128 313,140 13 46,969,051 46,971,150 10q11.22 FRA10G SYT15 67 313,325 313,335 11 46,998,601 47,000,400 10q11.22 FRA10G GPRIN2 64 315,033 315,039 7 47,254,801 47,256,000 10q11.22 FRA10G none 70 317,566 317,575 10 47,634,751 47,636,400 10q11.22 FRA10G none 57 317,580 317,586 7 47,636,851 47,638,050 10q11.22 FRA10G none 62 318,263 318,269 7 47,739,301 47,740,500 10q11.22 FRA10G none 60 321,145 321,152 8 48,171,601 48,172,950 10q11.22 FRA10G none 61 321,727 321,733 7 48,258,901 48,260,100 10q11.22 FRA10G ANXA8 58 321,847 321,853 7 48,276,901 48,278,100 10q11.22 FRA10G none 61 322,193 322,200 8 48,328,801 48,330,150 10q11.22 FRA10G none 58 322,585 322,595 11 48,387,601 48,389,400 10q11.22 FRA10G RBP3 63 322,731 322,744 14 48,409,501 48,411,750 10q11.22 FRA10G none 57 322,921 322,928 8 48,438,001 48,439,350 10q11.22 FRA10G GDF10 70 324,750 324,758 9 48,712,351 48,713,850 10q11.22 FRA10G none 61 330,095 330,102 8 49,514,101 49,515,450 10q11.22 FRA10G none 70 331,049 331,062 14 49,657,201 49,659,450 10q11.22 FRA10G ARHFAP22 65 331,111 331,117 7 49,666,501 49,667,700 10q11.22 FRA10G ARHFAP22 61 331,541 331,550 10 49,731,001 49,732,650 10q11.22 FRA10G ARHFAP22 68 332,422 332,429 8 49,863,151 49,864,500 10q11.22 FRA10G none 69 332,557 332,566 10 49,883,401 49,885,050 
10q11.22 FRA10G none 20 332,657 332,663 7 49,898,401 49,899,600 10q11.22 FRA10G WDFY4 59 337,074 337,080 7 50,560,951 50,562,150 10q11.23 FRA10G none 55 337,122 337,128 7 50,568,151 50,569,350 10q11.23 FRA10G none 59 337,351 337,360 10 50,602,501 50,604,150 10q11.23 FRA10G none 67 337,368 337,375 8 50,605,051 50,606,400 10q11.23 FRA10G none 64 337,502 337,508 7 50,625,151 50,626,350 10q11.23 FRA10G none 64 CHAT, 338,783 338,801 19 50,817,301 50,820,300 10q11.23 FRA10G SLC18A3 69 339,024 339,032 9 50,853,451 50,854,950 10q11.23 FRA10G CHAT 58 339,840 339,847 8 50,975,851 50,977,200 10q11.23 FRA10G none 67 343,356 343,363 8 51,503,251 51,504,600 10q11.23 FRA10G none 61

175 349,217 349,233 17 52,382,401 52,385,100 10q11.23 FRA10G SGMS1 68 349,611 349,618 8 52,441,501 52,442,850 10q11.23 FRA10G none 62 358,823 358,830 8 53,823,301 53,824,650 10q21.1 FRA10C PRKG1 59 360,708 360,716 9 54,106,051 54,107,550 10q21.1 FRA10C none 63 396,351 396,358 8 59,452,501 59,453,850 10q21.1 FRA10C none 26 400,179 400,185 7 60,026,701 60,027,900 10q21.1 FRA10C IPMK 67 400,630 400,636 7 60,094,351 60,095,550 10q21.1 FRA10C UBE2D1 63 406,239 406,246 8 60,935,701 60,937,050 10q21.1 FRA10C PHYHIPL 67 411,103 411,110 8 61,665,301 61,666,650 10q21.2 FRA10C CCDC6 69 418,021 418,031 11 62,703,001 62,704,800 10q21.2 FRA10C RHOBTB1 65 430,428 430,437 10 64,564,051 64,565,700 10q21.3 FRA10C ADO 67 431,382 431,388 7 64,707,151 64,708,350 10q21.3 FRA10C none 76 434,832 434,839 8 65,224,651 65,226,000 10q21.3 FRA10C JMJD1C 67 435,205 435,211 7 65,280,601 65,281,800 10q21.3 FRA10C REEP3 72 463,239 463,245 7 69,485,701 69,486,900 10q21.3 FRA10C none 8 468,701 468,707 7 70,305,001 70,306,200 10q21.3 FRA10C none 49 470,579 470,587 9 70,586,701 70,588,200 10q21.3 FRA10C STOX1 74 472,556 472,562 7 70,883,251 70,884,450 10q22.1 FRA10D VPS26A 68 473,851 473,861 11 71,077,501 71,079,300 10q22.1 FRA10D HK1 68 475,545 475,555 11 71,331,601 71,333,400 10q22.1 FRA10D NEUROG3 65 475,964 475,973 10 71,394,451 71,396,100 10q22.1 FRA10D none 60 476,024 476,031 8 71,403,451 71,404,800 10q22.1 FRA10D none 57 476,078 476,085 8 71,411,551 71,412,900 10q22.1 FRA10D none 61 477,601 477,607 7 71,640,001 71,641,200 10q22.1 FRA10D COL13A1 61 478,348 478,355 8 71,752,051 71,753,400 10q22.1 FRA10D none 67 478,748 478,759 12 71,812,051 71,814,000 10q22.1 FRA10D H2AFY2 65 479,365 479,375 11 71,904,601 71,906,400 10q22.1 FRA10D TYSND1 67 479,989 480,002 14 71,998,201 72,000,450 10q22.1 FRA10D none 67 480,097 480,103 7 72,014,401 72,015,600 10q22.1 FRA10D NPFFR1 67 480,556 480,562 7 72,083,251 72,084,450 10q22.1 FRA10D LRRC20 63 480,941 480,949 9 72,141,001 72,142,500 10q22.1 FRA10D LRRC20 64 
481,583 481,593 11 72,237,301 72,239,100 10q22.1 FRA10D KIAA1274 68 481,979 481,986 8 72,296,701 72,298,050 10q22.1 FRA10D KIAA1274 57 482,878 482,884 7 72,431,551 72,432,750 10q22.1 FRA10D ADAMTS14 69 482,976 482,989 14 72,446,251 72,448,500 10q22.1 FRA10D ADAMTS14 72 483,410 483,417 8 72,511,351 72,512,700 10q22.1 FRA10D ADAMTS14 62 483,472 483,479 8 72,520,651 72,522,000 10q22.1 FRA10D ADAMTS14 60 484,316 484,322 7 72,647,251 72,648,450 10q22.1 FRA10D PCBD1 68 484,626 484,633 8 72,693,751 72,695,100 10q22.1 FRA10D none 59 485,096 485,103 8 72,764,251 72,765,600 10q22.1 FRA10D none 59 486,480 486,487 8 72,971,851 72,973,200 10q22.1 FRA10D UNC5B 68

176 486,770 486,776 7 73,015,351 73,016,550 10q22.1 FRA10D UNC5B 55 486,962 486,968 7 73,044,151 73,045,350 10q22.1 FRA10D UNC5B 61 487,706 487,718 13 73,155,751 73,157,850 10q22.1 FRA10D CDH23 70 488,051 488,057 7 73,207,501 73,208,700 10q22.1 FRA10D CDH23 57 488,936 488,942 7 73,340,251 73,341,450 10q22.1 FRA10D CDH23 54 CDH23, 489,815 489,821 7 73,472,101 73,473,300 10q22.1 FRA10D C10orf105 60 CDH23, 490,216 490,222 7 73,532,251 73,533,450 10q22.1 FRA10D C10orf54 65 490,324 490,336 13 73,548,451 73,550,550 10q22.1 FRA10D CDH23 60 490,475 490,482 8 73,571,101 73,572,450 10q22.1 FRA10D CDH23 64 490,859 490,866 8 73,628,701 73,630,050 10q22.1 FRA10D none 68 491,488 491,497 10 73,723,051 73,724,700 10q22.1 FRA10D CHST3 68 491,780 491,788 9 73,766,851 73,768,350 10q22.1 FRA10D CHST3 67 492,267 492,274 8 73,839,901 73,841,250 10q22.1 FRA10D SPOCK2 59 492,311 492,321 11 73,846,501 73,848,300 10q22.1 FRA10D SPOCK2 68 493,551 493,557 7 74,032,501 74,033,700 10q22.1 FRA10D DDIT4 63 493,897 493,906 10 74,084,401 74,086,050 10q22.1 FRA10D none 64 493,919 493,927 9 74,087,701 74,089,200 10q22.1 FRA10D none 61 493,963 493,969 7 74,094,301 74,095,500 10q22.1 FRA10D DNAJB12 62 499,038 499,045 8 74,855,551 74,856,900 10q22.1 FRA10D P4HA1 61 503,282 503,288 7 75,492,151 75,493,350 10q22.2 none 71 SEC24C, 503,545 503,552 8 75,531,601 75,532,950 10q22.2 FUT11 65 503,804 503,811 8 75,570,451 75,571,800 10q22.2 NDST2 72 503,992 503,998 7 75,598,651 75,599,850 10q22.2 CAMK2G 65 504,222 504,231 10 75,633,151 75,634,800 10q22.2 CAMK2G 74 510,563 510,572 10 76,584,301 76,585,950 10q22.2 none 71 513,131 513,139 9 76,969,501 76,971,000 10q22.2 VDAC2 66 513,292 513,305 14 76,993,651 76,995,900 10q22.2 COMTD1 70 513,951 513,959 9 77,092,501 77,094,000 10q22.2 none 71 514,367 514,379 13 77,154,901 77,157,000 10q22.2 none 68 514,390 514,399 10 77,158,351 77,160,000 10q22.2 ZNF503 70 514,447 514,453 7 77,166,901 77,168,100 10q22.2 none 62 529,315 529,323 9 79,397,101 79,398,600 10q22.3 KCNMA1 
74 531,237 531,246 10 79,685,401 79,687,050 10q22.3 DLG5 74 533,969 533,975 7 80,095,201 80,096,400 10q22.3 none 57 534,854 534,860 7 80,227,951 80,229,150 10q22.3 none 67 537,820 537,826 7 80,672,851 80,674,050 10q22.3 none 61 538,220 538,226 7 80,732,851 80,734,050 10q22.3 none 66 538,856 538,874 19 80,828,251 80,831,250 10q22.3 ZMIZ1 68 539,313 539,320 8 80,896,801 80,898,150 10q22.3 ZMIZ1 61

177 539,463 539,472 10 80,919,301 80,920,950 10q22.3 ZMIZ1 58 539,594 539,600 7 80,938,951 80,940,150 10q22.3 ZMIZ1 60 540,016 540,024 9 81,002,251 81,003,750 10q22.3 ZMIZ1 74 540,229 540,235 7 81,034,201 81,035,400 10q22.3 ZMIZ1 61 540,713 540,719 7 81,106,801 81,108,000 10q22.3 PPIF 66 541,001 541,007 7 81,150,001 81,151,200 10q22.3 ZCCHC24 59 541,140 541,148 9 81,170,851 81,172,350 10q22.3 ZCCHC24 59 541,363 541,370 8 81,204,301 81,205,650 10q22.3 ZCCHC24 72 541,775 541,782 8 81,266,101 81,267,450 10q22.3 none 59 542,955 542,961 7 81,443,101 81,444,300 10q22.3 none 60 543,123 543,130 8 81,468,301 81,469,650 10q22.3 none 63 543,903 543,912 10 81,585,301 81,586,950 10q22.3 none 71 544,043 544,051 9 81,606,301 81,607,800 10q22.3 FAM22E 62 544,942 544,948 7 81,741,151 81,742,350 10q22.3 none 70 548,090 548,097 8 82,213,351 82,214,700 10q23.1 TSPAN14 73 548,625 548,631 7 82,293,601 82,294,800 10q23.1 none 76 552,280 552,288 9 82,841,851 82,843,350 10q23.1 none 57 555,532 555,541 10 83,329,651 83,331,300 10q23.1 none 61 557,560 557,570 11 83,633,851 83,635,650 10q23.1 NRG3 74 573,027 573,036 10 85,953,901 85,955,550 10q23.1 CDHR1 66 574,046 574,052 7 86,106,751 86,107,950 10q23.1 FAM190B 65 575,335 575,341 7 86,300,101 86,301,300 10q23.1 none 64 586,540 586,548 9 87,980,851 87,982,350 10q23.2 GRID1 59 587,420 587,426 7 88,112,851 88,114,050 10q23.2 GRID1 61 587,504 587,516 13 88,125,451 88,127,550 10q23.2 GRID1 73 587,730 587,742 13 88,159,351 88,161,450 10q23.2 none 67 589,393 589,400 8 88,408,801 88,410,150 10q23.2 none 59 589,528 589,535 8 88,429,051 88,430,400 10q23.2 LDB3 60 589,595 589,610 16 88,439,101 88,441,650 10q23.2 LDB3 63 589,626 589,633 8 88,443,751 88,445,100 10q23.2 LDB3 60 589,734 589,740 7 88,459,951 88,461,150 10q23.2 LDB3 60 589,806 589,812 7 88,470,751 88,471,950 10q23.2 LDB3 70 590,103 590,113 11 88,515,301 88,517,100 10q23.2 BMPR1A 70 591,430 591,438 9 88,714,351 88,715,850 10q23.2 MMRN2 60 MMRN2, 591,447 591,459 13 88,716,901 88,719,000 
10q23.2 SNCG 62 AGAP11, 591,537 591,547 11 88,730,401 88,732,200 10q23.2 C10orf116 65 GLUD1, 592,358 592,368 11 88,853,551 88,855,350 10q23.2 FAM35A 70 593,155 593,162 8 88,973,101 88,974,450 10q23.2 none 59 593,271 593,278 8 88,990,501 88,991,850 10q23.2 FAM22A 63

594,010 594,019 10 89,101,351 89,103,000 10q23.2 none 71
594,153 594,160 8 89,122,801 89,124,150 10q23.2 FAM22D 63
597,486 597,493 8 89,622,751 89,624,100 10q23.31 FRA10A KILLIN, PTEN 70
611,407 611,414 8 91,710,901 91,712,250 10q23.31 FRA10A none 56
616,864 616,871 8 92,529,451 92,530,800 10q23.31 FRA10A HTR7 60
619,481 619,492 12 92,922,001 92,923,950 10q23.32 FRA10A none 71
620,551 620,558 8 93,082,501 93,083,850 10q23.32 FRA10A none 67
621,129 621,136 8 93,169,201 93,170,550 10q23.32 FRA10A HECTD2 70
627,564 627,577 14 94,134,451 94,136,700 10q23.33 FRA10A none 78
628,888 628,895 8 94,333,051 94,334,400 10q23.33 FRA10A IDE 69
629,559 629,570 12 94,433,701 94,435,650 10q23.33 FRA10A none 74
629,658 629,669 12 94,448,551 94,450,500 10q23.33 FRA10A HHEX 71
630,719 630,725 7 94,607,701 94,608,900 10q23.33 FRA10A EXOC6 69
632,148 632,154 7 94,822,051 94,823,250 10q23.33 FRA10A CYP26C1 68
632,186 632,192 7 94,827,751 94,828,950 10q23.33 FRA10A CYP26C1 61
632,210 632,216 7 94,831,351 94,832,550 10q23.33 FRA10A none 59
632,220 632,227 8 94,832,851 94,834,200 10q23.33 FRA10A CYP26A1 65
635,735 635,741 7 95,360,101 95,361,300 10q23.33 FRA10A RBP4 70
641,080 641,088 9 96,161,851 96,163,350 10q23.33 FRA10A TBC1D12 68
650,460 650,467 8 97,568,851 97,570,200 10q24.1 ENTPD1 71
652,019 652,027 9 97,802,701 97,804,200 10q24.1 CCNJ 67
655,152 655,159 8 98,272,651 98,274,000 10q24.1 TLL2 67
655,640 655,646 7 98,345,851 98,347,050 10q24.1 TM9SF3 71
657,277 657,287 11 98,591,401 98,593,200 10q24.1 LCOR 72
658,389 658,395 7 98,758,201 98,759,400 10q24.1 SLIT1 59
658,404 658,413 10 98,760,451 98,762,100 10q24.1 SLIT1 62
659,489 659,498 10 98,923,201 98,924,850 10q24.1 SLIT1 60
659,632 659,639 8 98,944,651 98,946,000 10q24.1 SLIT1 69
660,235 660,245 11 99,035,101 99,036,900 10q24.1 ARHGAP19 79
660,525 660,533 9 99,078,601 99,080,100 10q24.1 FRAT1 71
660,623 660,632 10 99,093,301 99,094,950 10q24.1 FRAT2 72
661,719 661,729 11 99,257,701 99,259,500 10q24.1 MMS19, UBTD1 70
661,918 661,927 10 99,287,551 99,289,200 10q24.1 UBTD1 73
662,199 662,205 7 99,329,701 99,330,900 10q24.2 FRA10A UBTD1 62
662,975 662,982 8 99,446,101 99,447,450 10q24.2 FRA10A AVPI1 63
663,156 663,163 8 99,473,251 99,474,600 10q24.2 FRA10A MARVELD1 72
663,539 663,545 7 99,530,701 99,531,900 10q24.2 FRA10A SFRP5 72
663,874 663,880 7 99,580,951 99,582,150 10q24.2 FRA10A none 56
665,058 665,064 7 99,758,551 99,759,750 10q24.2 FRA10A CRTAC1 56
665,260 665,273 14 99,788,851 99,791,100 10q24.2 FRA10A CRTAC1 69
667,845 667,852 8 100,176,601 100,177,950 10q24.2 FRA10A HPS1 60

673,925 673,935 11 101,088,601 101,090,400 10q24.2 FRA10A CNNM1 67
675,297 675,304 8 101,294,401 101,295,750 10q24.2 FRA10A NKX2-3 66
677,318 677,334 17 101,597,551 101,600,250 10q24.2 FRA10A ABCC2 76
677,673 677,679 7 101,650,801 101,652,000 10q24.2 FRA10A DNMBP 67
679,013 679,024 12 101,851,801 101,853,750 10q24.2 FRA10A none 73
680,246 680,253 8 102,036,751 102,038,100 10q24.31 BLOC1S2 78
680,707 680,714 8 102,105,901 102,107,250 10q24.31 SCD 64
682,772 682,778 7 102,415,651 102,416,850 10q24.31 none 66
683,396 683,404 9 102,509,251 102,510,750 10q24.31 PAX2 63
683,906 683,913 8 102,585,751 102,587,100 10q24.31 none 66
684,859 684,865 7 102,728,701 102,729,900 10q24.31 FAM178A 67
685,044 685,050 7 102,756,451 102,757,650 10q24.31 LZTS2 68
685,059 685,067 9 102,758,701 102,760,200 10q24.31 LZTS2 71
685,075 685,084 10 102,761,101 102,762,750 10q24.31 LZTS2 59
685,186 685,193 8 102,777,751 102,779,100 10q24.31 PDZD7 64
685,685 685,691 7 102,852,601 102,853,800 10q24.31 TLX1NB 62
686,036 686,043 8 102,905,251 102,906,600 10q24.31 none 63
688,868 688,874 7 103,330,051 103,331,250 10q24.32 none 66
690,227 690,242 16 103,533,901 103,536,450 10q24.32 FGF8 69
690,260 690,272 13 103,538,851 103,540,950 10q24.32 C10orf76 64
691,967 691,973 7 103,794,901 103,796,100 10q24.32 none 67
692,165 692,180 16 103,824,601 103,827,150 10q24.32 HPS6 66
692,492 692,498 7 103,873,651 103,874,850 10q24.32 LDB1 67
692,531 692,537 7 103,879,501 103,880,700 10q24.32 LDB1 70
692,985 692,991 7 103,947,601 103,948,800 10q24.32 none 56
693,265 693,272 8 103,989,601 103,990,950 10q24.32 PITX3 68
694,393 694,401 9 104,158,801 104,160,300 10q24.32 NFKB2 64
694,415 694,421 7 104,162,101 104,163,300 10q24.32 NFKB2, PSD 64
694,480 694,486 7 104,171,851 104,173,050 10q24.32 PSD 58
694,537 694,546 10 104,180,401 104,182,050 10q24.32 FBXL15 66
694,733 694,740 8 104,209,801 104,211,150 10q24.32 none 69
696,019 696,032 14 104,402,701 104,404,950 10q24.32 TRIM8 69
696,133 696,142 10 104,419,801 104,421,450 10q24.32 none 61
696,492 696,500 9 104,473,651 104,475,150 10q24.32 ARL3, SFXN2 66
697,848 697,862 15 104,677,051 104,679,450 10q24.32 CNNM2 65
699,257 699,264 8 104,888,401 104,889,750 10q24.32 NT5C2 78
699,681 699,687 7 104,952,001 104,953,200 10q24.33 NT5C2 68
700,242 700,253 12 105,036,151 105,038,100 10q24.33 INA 68
700,733 700,739 7 105,109,801 105,111,000 10q24.33 PCGF6 64
701,410 701,416 7 105,211,351 105,212,550 10q24.33 CALHM2 64
701,447 701,458 12 105,216,901 105,218,850 10q24.33 CALHM1 63

701,687 701,694 8 105,252,901 105,254,250 10q24.33 NEURL 70
702,343 702,349 7 105,351,301 105,352,500 10q24.33 NEURL 60
702,795 702,802 8 105,419,101 105,420,450 10q24.33 SH3PXD2A 61
702,806 702,812 7 105,420,751 105,421,950 10q24.33 SH3PXD2A 68
702,849 702,858 10 105,427,201 105,428,850 10q24.33 SH3PXD2A 61
704,098 704,104 7 105,614,551 105,615,750 10q24.33 SH3PXD2A 74
705,292 705,298 7 105,793,651 105,794,850 10q24.33 COL17A1 61
707,162 707,170 9 106,074,151 106,075,650 10q25.1 ITPRIP 62
709,330 709,336 7 106,399,351 106,400,550 10q25.1 SORCS3 65
709,340 709,348 9 106,400,851 106,402,350 10q25.1 SORCS3 70
735,507 735,513 7 110,325,901 110,327,100 10q25.1 none 61
745,112 745,121 10 111,766,651 111,768,300 10q25.1 ADD3 67
748,381 748,390 10 112,257,001 112,258,650 10q25.2 FRA10B/FRA10E DUSP5 71
752,250 752,261 12 112,837,351 112,839,300 10q25.2 FRA10B/FRA10E ADRA2A 67
756,488 756,496 9 113,473,051 113,474,550 10q25.2 FRA10B/FRA10E none 60
772,019 772,035 17 115,802,701 115,805,400 10q25.3 ADRB1 71
774,422 774,432 11 116,163,151 116,164,950 10q25.3 AFAP1L2 68
776,850 776,856 7 116,527,351 116,528,550 10q25.3 none 69
777,208 777,214 7 116,581,051 116,582,250 10q25.3 FAM160B1 70
779,015 779,026 12 116,852,101 116,854,050 10q25.3 ATRNL1 71
786,867 786,873 7 118,029,901 118,031,100 10q25.3 GFRA1 67
786,876 786,883 8 118,031,251 118,032,600 10q25.3 GFRA1 67
790,008 790,017 10 118,501,051 118,502,700 10q25.3 HSPA12A 75
791,761 791,767 7 118,764,001 118,765,200 10q25.3 KIAA1598 65
792,661 792,668 8 118,899,001 118,900,350 10q25.3 none 68
792,815 792,822 8 118,922,101 118,923,450 10q25.3 none 67
793,173 793,179 7 118,975,801 118,977,000 10q25.3 none 66
793,331 793,341 11 118,999,501 119,001,300 10q25.3 SLC18A2 68
794,225 794,236 12 119,133,601 119,135,550 10q26.11 FRA10F PDZD8 69
795,293 795,301 9 119,293,801 119,295,300 10q26.11 FRA10F none 67
798,704 798,710 7 119,805,451 119,806,650 10q26.11 FRA10F RAB11FIP2 68
802,357 802,363 7 120,353,401 120,354,600 10q26.11 FRA10F PRLHR 64
803,425 803,432 8 120,513,601 120,514,950 10q26.11 FRA10F C10orf46 67
804,009 804,019 11 120,601,201 120,603,000 10q26.11 FRA10F none 57
805,257 805,267 11 120,788,401 120,790,200 10q26.11 FRA10F NANOS1 75
806,441 806,450 10 120,966,001 120,967,650 10q26.11 FRA10F GRK5 71
808,019 808,029 11 121,202,701 121,204,500 10q26.11 FRA10F GRK5 60
808,674 808,686 13 121,300,951 121,303,050 10q26.11 FRA10F RGS10 68
809,035 809,042 8 121,355,101 121,356,450 10q26.11 FRA10F TIAL1 62

809,403 809,411 9 121,410,301 121,411,800 10q26.11 FRA10F BAG3 70
809,872 809,879 8 121,480,651 121,482,000 10q26.11 FRA10F none 67
810,360 810,368 9 121,553,851 121,555,350 10q26.11 FRA10F INPP5F 71
810,877 810,883 7 121,631,401 121,632,600 10q26.11 FRA10F MCMBP 71
814,772 814,783 12 122,215,651 122,217,600 10q26.12 FRA10F PPAPDC1A 67
818,056 818,062 7 122,708,251 122,709,450 10q26.12 FRA10F none 59
822,377 822,387 11 123,356,401 123,358,200 10q26.13 FRA10F FGFR2 69
824,580 824,587 8 123,686,851 123,688,200 10q26.13 FRA10F ATE1 69
824,892 824,899 8 123,733,651 123,735,000 10q26.13 FRA10F NSMCE4A 66
825,641 825,648 8 123,846,001 123,847,350 10q26.13 FRA10F TACC2 61
825,813 825,821 9 123,871,801 123,873,300 10q26.13 FRA10F TACC2 65
826,149 826,158 10 123,922,201 123,923,850 10q26.13 FRA10F TACC2 66
826,397 826,406 10 123,959,401 123,961,050 10q26.13 FRA10F TACC2 61
827,559 827,565 7 124,133,701 124,134,900 10q26.13 FRA10F PLEKHA1 69
828,133 828,145 13 124,219,801 124,221,900 10q26.13 FRA10F HTRA1 68
832,630 832,638 9 124,894,351 124,895,850 10q26.13 FRA10F HMX3 69
832,641 832,648 8 124,896,001 124,897,350 10q26.13 FRA10F HMX3 68
834,719 834,725 7 125,207,701 125,208,900 10q26.13 FRA10F none 51
836,170 836,176 7 125,425,351 125,426,550 10q26.13 FRA10F GPR26 71
838,340 838,346 7 125,750,851 125,752,050 10q26.13 FRA10F none 64
838,626 838,632 7 125,793,751 125,794,950 10q26.13 FRA10F CHST15 63
839,011 839,021 11 125,851,501 125,853,300 10q26.13 FRA10F CHST15 72
840,514 840,520 7 126,076,951 126,078,150 10q26.13 FRA10F none 65
840,712 840,718 7 126,106,651 126,107,850 10q26.13 FRA10F OAT 69
840,904 840,910 7 126,135,451 126,136,650 10q26.13 FRA10F NKX1-2 68
841,507 841,513 7 126,225,901 126,227,100 10q26.13 FRA10F LHPP 62
841,533 841,541 9 126,229,801 126,231,300 10q26.13 FRA10F LHPP 60
841,594 841,603 10 126,238,951 126,240,600 10q26.13 FRA10F LHPP 66
841,749 841,755 7 126,262,201 126,263,400 10q26.13 FRA10F LHPP 61
841,906 841,912 7 126,285,751 126,286,950 10q26.13 FRA10F LHPP 61
841,943 841,959 17 126,291,301 126,294,000 10q26.13 FRA10F LHPP 64
842,029 842,036 8 126,304,201 126,305,550 10q26.13 FRA10F none 61
842,093 842,099 7 126,313,801 126,315,000 10q26.13 FRA10F FAM53B 62
842,264 842,270 7 126,339,451 126,340,650 10q26.13 FRA10F FAM53B 60
842,874 842,887 14 126,430,951 126,433,200 10q26.13 FRA10F FAM53B 66
844,034 844,040 7 126,604,951 126,606,150 10q26.13 FRA10F none 72
844,581 844,587 7 126,687,001 126,688,200 10q26.13 FRA10F CTBP2 59
844,601 844,616 16 126,690,001 126,692,550 10q26.13 FRA10F CTBP2 63
844,764 844,773 10 126,714,451 126,716,100 10q26.13 FRA10F CTBP2 65
845,648 845,659 12 126,847,051 126,849,000 10q26.13 FRA10F CTBP2 72
845,663 845,672 10 126,849,301 126,850,950 10q26.13 FRA10F C10orf137 68

849,383 849,390 8 127,407,301 127,408,650 10q26.13 FRA10F none 61
849,746 849,752 7 127,461,751 127,462,950 10q26.13 FRA10F MMP21 69
849,756 849,762 7 127,463,251 127,464,450 10q26.13 FRA10F MMP21 64
850,562 850,568 7 127,584,151 127,585,350 10q26.2 FANK1 70
853,840 853,848 9 128,075,851 128,077,350 10q26.2 ADAM12 66
857,289 857,299 11 128,593,201 128,595,000 10q26.2 none 71
859,959 859,968 10 128,993,701 128,995,350 10q26.2 DOCK1, FAM196A 72
863,562 863,574 13 129,534,151 129,536,250 10q26.2 FOXI2 68
863,579 863,585 7 129,536,701 129,537,900 10q26.2 FOXI2 66
864,699 864,707 9 129,704,701 129,706,200 10q26.2 PTPRE 71
866,720 866,730 11 130,007,851 130,009,650 10q26.2 none 67
867,108 867,114 7 130,066,051 130,067,250 10q26.2 none 60
871,851 871,857 7 130,777,501 130,778,700 10q26.3 none 59
875,098 875,105 8 131,264,551 131,265,900 10q26.3 MGMT 69
875,461 875,467 7 131,319,001 131,320,200 10q26.3 MGMT 62
877,050 877,056 7 131,557,351 131,558,550 10q26.3 MGMT 59
877,148 877,157 10 131,572,051 131,573,700 10q26.3 none 61
877,427 877,434 8 131,613,901 131,615,250 10q26.3 none 63
878,379 878,390 12 131,756,701 131,758,650 10q26.3 EBF3 66
878,409 878,424 16 131,761,201 131,763,750 10q26.3 EBF3 70
878,441 878,453 13 131,766,001 131,768,100 10q26.3 none 64
878,471 878,477 7 131,770,501 131,771,700 10q26.3 none 70
879,919 879,928 10 131,987,701 131,989,350 10q26.3 none 70
879,989 879,997 9 131,998,201 131,999,700 10q26.3 none 63
880,657 880,666 10 132,098,401 132,100,050 10q26.3 none 68
883,837 883,843 7 132,575,401 132,576,600 10q26.3 none 56
885,197 885,203 7 132,779,401 132,780,600 10q26.3 none 10
886,159 886,165 7 132,923,701 132,924,900 10q26.3 TCERG1L 57
886,779 886,786 8 133,016,701 133,018,050 10q26.3 TCERG1L 61
886,789 886,796 8 133,018,201 133,019,550 10q26.3 TCERG1L 64
886,864 886,870 7 133,029,451 133,030,650 10q26.3 TCERG1L 59
887,099 887,109 11 133,064,701 133,066,500 10q26.3 TCERG1L 61
887,394 887,406 13 133,108,951 133,111,050 10q26.3 TCERG1L 74
889,672 889,679 8 133,450,651 133,452,000 10q26.3 none 60
889,911 889,917 7 133,486,501 133,487,700 10q26.3 none 62
890,530 890,537 8 133,579,351 133,580,700 10q26.3 none 59
890,542 890,548 7 133,581,151 133,582,350 10q26.3 none 64
891,156 891,166 11 133,673,251 133,675,050 10q26.3 none 63
891,965 891,975 11 133,794,601 133,796,400 10q26.3 BNIP3 70
892,216 892,222 7 133,832,251 133,833,450 10q26.3 none 61
893,027 893,033 7 133,953,901 133,955,100 10q26.3 JAKMIP3 62

893,036 893,043 8 133,955,251 133,956,600 10q26.3 JAKMIP3 67
893,065 893,074 10 133,959,601 133,961,250 10q26.3 JAKMIP3 60
893,213 893,228 16 133,981,801 133,984,350 10q26.3 JAKMIP3 66
893,325 893,343 19 133,998,601 134,001,600 10q26.3 DPYSL4 72
893,394 893,403 10 134,008,951 134,010,600 10q26.3 DPYSL4 63
893,407 893,419 13 134,010,901 134,013,000 10q26.3 DPYSL4 65
893,422 893,430 9 134,013,151 134,014,650 10q26.3 DPYSL4 67
893,436 893,453 18 134,015,251 134,018,100 10q26.3 DPYSL4 64
893,473 893,481 9 134,020,801 134,022,300 10q26.3 STK32C 65
893,574 893,588 15 134,035,951 134,038,350 10q26.3 STK32C 66
893,621 893,631 11 134,043,001 134,044,800 10q26.3 STK32C 63
893,799 893,814 16 134,069,701 134,072,250 10q26.3 STK32C 64
893,873 893,879 7 134,080,801 134,082,000 10q26.3 STK32C 52
893,882 893,889 8 134,082,151 134,083,500 10q26.3 STK32C 58
893,975 893,981 7 134,096,101 134,097,300 10q26.3 STK32C 59
894,135 894,149 15 134,120,101 134,122,500 10q26.3 STK32C 70
894,300 894,306 7 134,144,851 134,146,050 10q26.3 LRRC27 69
894,682 894,690 9 134,202,151 134,203,650 10q26.3 none 69
894,726 894,743 18 134,208,751 134,211,600 10q26.3 PWWP2B 74
894,750 894,763 14 134,212,351 134,214,600 10q26.3 PWWP2B 64
894,767 894,773 7 134,214,901 134,216,100 10q26.3 PWWP2B 63
894,779 894,795 17 134,216,701 134,219,400 10q26.3 PWWP2B 67
894,799 894,810 12 134,219,701 134,221,650 10q26.3 PWWP2B 64
894,813 894,819 7 134,221,801 134,223,000 10q26.3 PWWP2B 67
894,825 894,833 9 134,223,601 134,225,100 10q26.3 PWWP2B 65
894,863 894,877 15 134,229,301 134,231,700 10q26.3 PWWP2B 64
894,880 894,890 11 134,231,851 134,233,650 10q26.3 none 65
894,897 894,906 10 134,234,401 134,236,050 10q26.3 none 66
894,950 894,956 7 134,242,351 134,243,550 10q26.3 none 65
894,995 895,007 13 134,249,101 134,251,200 10q26.3 none 60
895,062 895,073 12 134,259,151 134,261,100 10q26.3 C10orf91 66
895,093 895,114 22 134,263,801 134,267,250 10q26.3 none 65
895,126 895,138 13 134,268,751 134,270,850 10q26.3 none 63
895,172 895,186 15 134,275,651 134,278,050 10q26.3 none 65
895,202 895,210 9 134,280,151 134,281,650 10q26.3 none 65
895,273 895,282 10 134,290,801 134,292,450 10q26.3 none 66
895,341 895,347 7 134,301,001 134,302,200 10q26.3 none 63
895,533 895,540 8 134,329,801 134,331,150 10q26.3 none 62
895,668 895,677 10 134,350,051 134,351,700 10q26.3 INPP5A 73
895,836 895,844 9 134,375,251 134,376,750 10q26.3 INPP5A 61
896,426 896,432 7 134,463,751 134,464,950 10q26.3 INPP5A 63

896,466 896,473 8 134,469,751 134,471,100 10q26.3 INPP5A 64
896,480 896,489 10 134,471,851 134,473,500 10q26.3 INPP5A 59
896,750 896,760 11 134,512,351 134,514,150 10q26.3 INPP5A 64
896,847 896,853 7 134,526,901 134,528,100 10q26.3 INPP5A 66
896,948 896,954 7 134,542,051 134,543,250 10q26.3 INPP5A 62
896,962 896,969 8 134,544,151 134,545,500 10q26.3 INPP5A 60
896,995 897,004 10 134,549,101 134,550,750 10q26.3 INPP5A 61
897,055 897,063 9 134,558,101 134,559,600 10q26.3 INPP5A 64
897,140 897,148 9 134,570,851 134,572,350 10q26.3 INPP5A 65
897,153 897,159 7 134,572,801 134,574,000 10q26.3 INPP5A 65
897,179 897,187 9 134,576,701 134,578,200 10q26.3 INPP5A 57
897,258 897,270 13 134,588,551 134,590,650 10q26.3 INPP5A 64
897,281 897,290 10 134,592,001 134,593,650 10q26.3 INPP5A 62
897,316 897,354 39 134,597,251 134,603,250 10q26.3 NKX6-2 71
897,363 897,375 13 134,604,301 134,606,400 10q26.3 none 64
897,379 897,395 17 134,606,701 134,609,400 10q26.3 none 66
897,415 897,424 10 134,612,101 134,613,750 10q26.3 none 66
897,430 897,436 7 134,614,351 134,615,550 10q26.3 none 67
897,477 897,487 11 134,621,401 134,623,200 10q26.3 TTC40 65
897,490 897,509 20 134,623,351 134,626,500 10q26.3 TTC40 66
897,651 897,660 10 134,647,501 134,649,150 10q26.3 TTC40 63
897,728 897,740 13 134,659,051 134,661,150 10q26.3 TTC40 64
897,830 897,837 8 134,674,351 134,675,700 10q26.3 TTC40 63
897,952 897,962 11 134,692,651 134,694,450 10q26.3 TTC40 65
898,077 898,089 13 134,711,401 134,713,500 10q26.3 TTC40 65
898,152 898,159 8 134,722,651 134,724,000 10q26.3 TTC40 63
898,180 898,188 9 134,726,851 134,728,350 10q26.3 TTC40 68
898,213 898,228 16 134,731,801 134,734,350 10q26.3 TTC40 64
898,231 898,247 17 134,734,501 134,737,200 10q26.3 TTC40 66
898,475 898,483 9 134,771,101 134,772,600 10q26.3 none 62
898,486 898,496 11 134,772,751 134,774,550 10q26.3 none 66
898,574 898,584 11 134,785,951 134,787,750 10q26.3 none 65
898,702 898,712 11 134,805,151 134,806,950 10q26.3 none 64
898,802 898,808 7 134,820,151 134,821,350 10q26.3 none 64
898,871 898,880 10 134,830,501 134,832,150 10q26.3 none 63
898,891 898,898 8 134,833,501 134,834,850 10q26.3 none 65
898,967 898,975 9 134,844,901 134,846,400 10q26.3 none 64
899,016 899,025 10 134,852,251 134,853,900 10q26.3 none 62
899,087 899,095 9 134,862,901 134,864,400 10q26.3 none 64
899,125 899,131 7 134,868,601 134,869,800 10q26.3 none 58
899,157 899,163 7 134,873,401 134,874,600 10q26.3 none 65

899,234 899,240 7 134,884,951 134,886,150 10q26.3 none 66
899,254 899,260 7 134,887,951 134,889,150 10q26.3 none 63
899,300 899,309 10 134,894,851 134,896,500 10q26.3 none 63
899,325 899,336 12 134,898,601 134,900,550 10q26.3 none 63
899,341 899,353 13 134,901,001 134,903,100 10q26.3 GPR123 69
899,390 899,397 8 134,908,351 134,909,700 10q26.3 GPR123 64
899,404 899,411 8 134,910,451 134,911,800 10q26.3 GPR123 61
899,438 899,449 12 134,915,551 134,917,500 10q26.3 GPR123 65
899,512 899,518 7 134,926,651 134,927,850 10q26.3 GPR123 63
899,536 899,543 8 134,930,251 134,931,600 10q26.3 GPR123 64
899,599 899,606 8 134,939,701 134,941,050 10q26.3 GPR123 65
899,612 899,619 8 134,941,651 134,943,000 10q26.3 GPR123 70
899,696 899,704 9 134,954,251 134,955,750 10q26.3 none 66
899,814 899,836 23 134,971,951 134,975,550 10q26.3 KNDC1 69
899,848 899,864 17 134,977,051 134,979,750 10q26.3 KNDC1 63
899,870 899,876 7 134,980,351 134,981,550 10q26.3 KNDC1 66
899,951 899,958 8 134,992,501 134,993,850 10q26.3 KNDC1 60
899,974 899,980 7 134,995,951 134,997,150 10q26.3 KNDC1 66
900,057 900,074 18 135,008,401 135,011,250 10q26.3 KNDC1 63
900,078 900,085 8 135,011,551 135,012,900 10q26.3 KNDC1 71
900,088 900,102 15 135,013,051 135,015,450 10q26.3 KNDC1 68
900,115 900,121 7 135,017,101 135,018,300 10q26.3 KNDC1 61
900,131 900,144 14 135,019,501 135,021,750 10q26.3 KNDC1 62
900,161 900,170 10 135,024,001 135,025,650 10q26.3 KNDC1 65
900,175 900,182 8 135,026,101 135,027,450 10q26.3 KNDC1 66
900,216 900,228 13 135,032,251 135,034,350 10q26.3 KNDC1 68
900,287 900,299 13 135,042,901 135,045,000 10q26.3 UTF1 74
900,330 900,345 16 135,049,351 135,051,900 10q26.3 VENTX 71
900,364 900,371 8 135,054,451 135,055,800 10q26.3 VENTX 59
900,379 900,386 8 135,056,701 135,058,050 10q26.3 none 60
900,390 900,397 8 135,058,351 135,059,700 10q26.3 none 63
900,407 900,415 9 135,060,901 135,062,400 10q26.3 none 67
900,474 900,486 13 135,070,951 135,073,050 10q26.3 none 65
900,492 900,504 13 135,073,651 135,075,750 10q26.3 none 67
900,532 900,570 39 135,079,651 135,085,650 10q26.3 ADAM8 66
900,576 900,584 9 135,086,251 135,087,750 10q26.3 ADAM8 68
900,588 900,610 23 135,088,051 135,091,650 10q26.3 ADAM8 66
900,621 900,631 11 135,093,001 135,094,800 10q26.3 TUBGCP2 62
900,812 900,820 9 135,121,651 135,123,150 10q26.3 TUBGCP2, ZNF511 69
900,923 900,931 9 135,138,301 135,139,800 10q26.3 CALY 71
900,941 900,950 10 135,141,001 135,142,650 10q26.3 CALY 60

900,988 901,002 15 135,148,051 135,150,450 10q26.3 CALY 71
901,071 901,078 8 135,160,501 135,161,850 10q26.3 PRAP1 65
901,088 901,097 10 135,163,051 135,164,700 10q26.3 PRAP1 63
901,125 901,142 18 135,168,601 135,171,450 10q26.3 none 67
901,269 901,282 14 135,190,201 135,192,450 10q26.3 none 65
901,285 901,292 8 135,192,601 135,193,950 10q26.3 PAOX 72
901,547 901,554 8 135,231,901 135,233,250 10q26.3 MTG1 61
901,584 901,590 7 135,237,451 135,238,650 10q26.3 SPRN 73
901,639 901,646 8 135,245,701 135,247,050 10q26.3 CYP2E1 65
901,803 901,813 11 135,270,301 135,272,100 10q26.3 CYP2E1 65
901,817 901,823 7 135,272,401 135,273,600 10q26.3 CYP2E1 70
902,274 902,282 9 135,340,951 135,342,450 10q26.3 CYP2E1 63
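
The segment and nucleotide columns in the rows above appear to follow a fixed arithmetic: each nucleotide start equals (segment start - 1) x 150 + 1, and each nucleotide end equals (segment end - 1) x 150 + 300, which is what 300-nt windows advanced in 150-nt steps would produce. The Python sketch below makes that relationship explicit. The window and step sizes are inferred from the listed coordinates rather than stated in this table, and the function name is hypothetical.

```python
WINDOW = 300  # nucleotides per segment (assumption inferred from the rows)
STEP = 150    # shift between consecutive segments (assumption inferred from the rows)


def segment_run_to_nucleotides(seg_start, seg_end):
    """Map a run of consecutive 1-based segments to (count, nt_start, nt_end)."""
    if seg_end < seg_start:
        raise ValueError("seg_end must not be smaller than seg_start")
    count = seg_end - seg_start + 1
    nt_start = (seg_start - 1) * STEP + 1    # first base of the first window
    nt_end = (seg_end - 1) * STEP + WINDOW   # last base of the last window
    return count, nt_start, nt_end


# CALY row: segments 900,988 to 901,002
print(segment_run_to_nucleotides(900_988, 901_002))  # (15, 135148051, 135150450)
```

Under these assumptions the final CYP2E1 row (segments 902,274 to 902,282) maps to nine segments covering nucleotides 135,340,951 to 135,342,450, matching the listed values.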

Appendix Table 5: Genes located in regions capable of forming highly stable secondary structures and disease associations

Gene  Chromosomal Position  Fragile Site  Insertions, Deletions, and Translocations  Point Mutations  Reference*
PHYH  10p13  Refsum Disease  602026
NMT2  10p13  Hypogonadism and Testicular Atrophy  603801
VIM  10p13  Cataracts  193060
CACNB2  10p12.33  Brugada Syndrome 4  600003
NEBL  10p12.31  DiGeorge syndrome-2  605491
BMI1  10p12.2  Hematological Malignancies  [1]
PTF1A  10p12.2  Diabetes Mellitus, Cerebellar Agenesis  607194
MAP3K8  10p11.23  Lung Cancer  191195
ZEB1  10p11.22  Corneal Dystrophy  189909
RET  10q11.21  FRA10G  Papillary Thyroid Carcinoma, BCR-ABL(-) Myeloproliferative Disorders, Papillary Thyroid Carcinoma of the Ovary, Hirschsprung's Disease  Hirschsprung Disease, Multiple Endocrine Neoplasia, Medullary Thyroid Carcinoma, Pheochromocytoma, Renal Abnormalities  164761, [1]
CXCL12  10q11.21  FRA10G  Resistance to AIDS  600835
ALOX5  10q11.21  FRA10G  Asthma, Atherosclerosis  152390
CHAT  10q11.23  FRA10G  Myasthenic Syndrome  118490
CCDC6  10q21.2  FRA10C  Papillary Thyroid Carcinoma, BCR-ABL(-) Myeloproliferative Disorders, Chronic Myeloid Leukemia  601985, [1]
RHOBTB1  10q21.2  FRA10C  Head and Carcinoma  [1]
STOX1  10q21.3  FRA10C  Preeclampsia  609397
HK1  10q22.1  FRA10D  Hemolytic Anemia  142600
NEUROG3  10q22.1  FRA10D  Diarrhea  604882
PCBD1  10q22.1  FRA10D  Hyperphenylalaninemia  126090
CDH23  10q22.1  FRA10D  Deafness, Usher syndrome  605516
CHST3  10q22.1  FRA10D  Spondyloepiphyseal Dysplasia  603799
KCNMA1  10q22.3  Epilepsy  600150
ZMIZ1  10q22.3  B-Cell Acute Lymphoblastic Leukemia  [2]
CDHR1  10q23.1  Cone-Rod Dystrophy 15  609502
LDB3  10q23.2  Cardiomyopathy, Myofibrillar Myopathy  605906
BMPR1A  10q23.2  Juvenile Polyposis Syndrome  Juvenile Polyposis Syndrome  601299
GLUD1  10q23.2  Hyperinsulinism-Hyperammonemia Syndrome  138130
PTEN  10q23.31  FRA10A  Bannayan-Riley-Ruvalcaba Syndrome; Endometrial, Prostate, Breast, Cervical, Squamous Cell, and Head and Neck Carcinoma  Malignant Melanoma; Cowden Disease; Bannayan-Riley-Ruvalcaba, Lhermitte-Duclos, and Macrocephaly/Autism Syndromes  601728

RBP4  10q23.33  FRA10A  Retinol-Binding Protein Deficiency  180250
HPS1  10q24.2  FRA10A  Hermansky-Pudlak Syndrome 1  604982
ABCC2  10q24.2  FRA10A  Dubin-Johnson Syndrome  Dubin-Johnson Syndrome  601107
PAX2  10q24.31  Papillorenal Syndrome  Papillorenal Syndrome, Isolated Renal Hypoplasia  167409
PDZD7  10q24.31  Usher Syndrome  612971
FGF8  10q24.32  Kallmann Syndrome 6  600483
HPS6  10q24.32  Hermansky-Pudlak Syndrome 6  Hermansky-Pudlak Syndrome 6  607522
PITX3  10q24.32  Cataracts  Anterior Segment Mesenchymal Dysgenesis, Cataracts  602669
NFKB2  10q24.32  Hematological Malignancies  164012, [1]
CNNM2  10q24.32  Renal Hypomagnesemia 6  Renal Hypomagnesemia 6  607803
COL17A1  10q24.33  Epidermolysis Bullosa  Epidermolysis Bullosa  113811
ADD3  10q25.1  T-cell Acute Lymphoblastic Leukemia  601568
ADRB1  10q25.3  Congestive Heart Failure  Congestive Heart Failure  109630
BAG3  10q26.11  FRA10F  Myofibrillar Myopathy, Dilated 1HH Cardiomyopathy 1HH  603883
FGFR2  10q26.13  FRA10F  Saethre-Chotzen, Apert, Pfeiffer, LADD, and Beare-Stevenson Cutis Gyrata Syndromes  Endometrial, Gastric, Lung, Breast, and Ovarian Cancers; Crouzon, Pfeiffer, Jackson-Weiss, Antley-Bixler, Beare-Stevenson Cutis Gyrata, Apert, and LADD Syndromes; Craniosynostosis; Axenfeld-Rieger Anomaly  176943, [1]
PLEKHA1  10q26.13  FRA10F  Age-Related Maculopathy  607772
HTRA1  10q26.13  FRA10F  Age-related Macular Degeneration, Cerebral Autosomal Recessive Arteriopathy with Subcortical Infarcts and Leukoencephalopathy  602194
OAT  10q26.13  FRA10F  Gyrate Atrophy of the Choroid and Retina  Gyrate Atrophy of the Choroid and Retina  613349
ADAM12  10q26.2  Breast Cancer  [1]

*Reference numbers refer to Online Mendelian Inheritance in Man entries. URL www.omim.org
[1] Atlas of Genetics and Cytogenetics in Oncology and Haematology. URL http://AtlasGeneticsOncology.org
[2] Soler, G. et al. (2008) Leukemia, 22, 1278-1280.

Appendix Table 6: Copy number alterations that overlap with regions of predicted high levels of DNA secondary structure

Cancer Type                 Nucleotide Start*  Nucleotide End*  Segment Start  Segment End
Glioma       Deletion       89,505,798         90,410,621       596,704        602,738
Glioma       Deletion       116,731,800        130,571,762      778,211        870,479
Breast       Amplification  79,120,871         81,735,347       527,471        544,903
Breast       Amplification  123,214,869        123,367,187      821,431        822,448
Breast       Amplification  101,955            12,088,176       678            80,588
Breast       Deletion       89,505,798         90,410,621       596,704        602,738
Breast       Deletion       100,166,329        135,071,762      667,774        900,479
Breast       Deletion       19,022,250         42,889,659       126,814        285,932
Colorectal   Deletion       86,049,212         120,513,596      573,660        803,424
Colorectal   Deletion       121,281,931        133,627,776      808,545        890,852
Colorectal   Deletion       101,955            12,248,313       678            81,656
Lung NSC     Deletion       52,319,777         53,165,029       348,797        354,434
Lung NSC     Deletion       101,955            1,001,958        678            6,680
Lung NSC     Deletion       104,446,715        135,071,762      696,310        900,479
Melanoma     Deletion       89,308,314         90,410,621       595,387        602,738
Melanoma     Deletion       129,149,375        135,071,762      860,994        900,479
Melanoma     Deletion       54,201,467         57,779,853       361,342        385,200
Ovarian      Amplification  75,589,696         80,443,337       503,930        536,289
Prostate     Amplification  74,192,290         82,019,313       494,614        546,796
Prostate     Amplification  61,909,390         66,364,553       412,728        442,431
Prostate     Deletion       101,955            44,811,377       678            298,743

*Location of deletions and amplifications obtained from Tumorscape (www.broadinstitute.org/tumorscape) and based on human genome build hg19
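
Whether a copy number alteration overlaps a predicted structure-forming region reduces to an interval comparison on nucleotide coordinates. The Python sketch below illustrates that comparison, assuming closed 1-based intervals and that both coordinate sets refer to the same genome build. The function name is hypothetical; the glioma deletion comes from this table, and the two regions are rows from the preceding appendix table of segment coordinates.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """Return True if closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return a_start <= b_end and b_start <= a_end


glioma_deletion = (89_505_798, 90_410_621)  # Glioma Deletion row of this table
pten_region = (89_622_751, 89_624_100)      # KILLIN, PTEN row (10q23.31)
fam22a_region = (88_990_501, 88_991_850)    # FAM22A row (10q23.2)

print(intervals_overlap(*glioma_deletion, *pten_region))    # True
print(intervals_overlap(*glioma_deletion, *fam22a_region))  # False
```

The sketch compares nucleotide coordinates only; it does not reproduce how the segment columns of this table were derived.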

SCHOLASTIC VITA

NAME: Laura Williams Dillon

BUSINESS ADDRESS: Wake Forest School of Medicine, Department of Biochemistry and Molecular Biology, 1 Medical Center Boulevard, Winston-Salem, NC 27157

BUSINESS TELEPHONE: (336) 716-6220

E-MAIL ADDRESS: [email protected]

EDUCATION: 2006 B.S., Biochemistry, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA

2012 Ph.D., Biochemistry and Molecular Biology, Graduate School of Arts and Sciences, Wake Forest University, Winston-Salem, NC. Thesis Title: Mechanistic Study of Fragile Site Instability by Investigating RET/PTC Rearrangements, a Common Cause of Papillary Thyroid Carcinoma

RESEARCH EXPERIENCE:

Thesis research:

Wake Forest School of Medicine, Department of Biochemistry and Molecular Biology. Under the supervision of Dr. Yuh-Hwa Wang, May 2008 - May 2012. Explored the mechanism of instability at chromosomal fragile sites through the study of RET/PTC rearrangements, which are a common cause of papillary thyroid carcinoma. Results from these experiments implicated DNA secondary structure and DNA topoisomerases I and II in initiating DNA breakage at fragile sites. Environmental and dietary fragile site-inducing chemicals were also explored for their ability to induce DNA breakage within the RET oncogene. Through the analysis of RET breakage in thyroid cells from patients with RET rearrangements and from normal individuals, a potential diagnostic assay was developed.

Industry experience:

Pharmaceutical Product Development, Inc., Richmond, VA. Associate Scientist, Immunochemistry Department, May 2006 - August 2007. Analysed biological matrix samples using various immunoassays to determine the levels of pharmaceutical compounds. Sample analysis supported the development of pharmaceutical drugs and was performed according to GLP and FDA standards.

Undergraduate research:

Virginia Tech, Department of Biochemistry. Under the supervision of Dr. Sunyoung Kim, September 2004 - May 2005. Analysed the role and potential mechanism of electron transfer in the repair of CPD lesions by photolyase through the use of infrared spectroscopy, UV-Vis spectroscopy, and electron paramagnetic resonance. Generated a database of infrared spectroscopy signatures for ATP and other purine derivatives.

PUBLICATIONS:

A. Burrow, L. Williams, L. Pierce, and Y.H. Wang. (2009) “Over half of breakpoints in gene pairs involved in cancer-specific recurrent translocations are mapped to human chromosomal fragile sites,” BMC Genomics. 10:59.

M. Gandhi, L. Dillon, S. Pramanik, Y. Nikiforov, Y.H. Wang. (2010) “DNA breaks at Fragile Sites Generate Tumorigenic RET/PTC Rearrangements in Human Thyroid Cells,” Oncogene. 29: 2272-80.

L. Dillon, A. Burrow, and Y.H. Wang. (2010) “DNA Instability at Chromosomal Fragile Sites in Cancer,” Current Genomics. 11: 326-337.

L. Dillon, A. Burrow, and Y.H. Wang. (2012) “DNA Instability at Chromosomal Fragile Sites in Cancer,” In Neri, C. (ed.), Advances in Genome Science. Bentham Science Publishers, Vol. I.

L. Dillon, C. Lehman, Y.H. Wang. “The Role of Fragile Sites in Sporadic Papillary Thyroid Carcinoma.” Journal of Thyroid Research. In press.

L. Dillon, L. Pierce, M. Ng, and Y.H. Wang. “DNA Secondary Structures Involved in Fragile Site Breakage from the Study of Human Chromosome 10.” Manuscript Submitted.

L. Dillon, L. Pierce, Y. Nikiforov, and Y.H. Wang. “DNA Topoisomerases Participate in Oncogene RET Fragility.” Manuscript in Preparation.

INVITED SEMINARS:

Nov 2011 Department of Biochemistry, University of Virginia. “The Mechanistic Study of Fragile Site Instability by Investigating RET/PTC Rearrangements, a Common Cause of Papillary Thyroid Carcinoma.” Charlottesville, VA

ABSTRACTS/POSTER PRESENTATIONS:

April 2009 Wake Forest Graduate Student Research Day. “Over half of breakpoints in gene pairs involved in cancer-specific recurrent translocations are mapped to human chromosomal fragile sites.” Winston-Salem, NC

October 2009 Genetics and Environmental Mutagenesis Society 27th Annual Fall Meeting. “DNA breaks at Fragile Sites Generate Tumorigenic RET/PTC Rearrangements in Human Thyroid Cells.” Chapel Hill, NC

March 2010 Wake Forest Graduate Student Research Day. “DNA breaks at Fragile Sites Generate Tumorigenic RET/PTC Rearrangements in Human Thyroid Cells.” Winston-Salem, NC

April 2010 American Society for Biochemistry and Molecular Biology 2010 Annual Meeting. “DNA breaks at Fragile Sites Generate Tumorigenic RET/PTC Rearrangements in Human Thyroid Cells.” Anaheim, CA

March 2012 Wake Forest Graduate Student Research Day. “DNA Secondary Structures Involved in Fragile Site Breakage from the Study of Human Chromosome 10.” Winston-Salem, NC

AWARDS:

2007 Sandy Lee Cowgill Fellowship, Wake Forest University Graduate School of Arts and Sciences

2009 Artom Fellowship, Wake Forest University Graduate School of Arts and Sciences

2010 Alumni Student Travel Award, Wake Forest University Graduate School of Arts and Sciences

2011-2012 NIH NRSA predoctoral fellowship in Structural and Computational Biophysics (T32GM095440-01), Wake Forest School of Medicine

2011 Herbert C. Cheung, Ph.D. Award in Biochemistry, Wake Forest University Graduate School of Arts and Sciences
