UNIVERSITY OF CINCINNATI

Date: 28-May-2010

I, Emily Sites, hereby submit this original work as part of the requirements for the degree of: Master of Science in Genetic Counseling

It is entitled: Copy Number Variation in Monozygotic Twins with NF1

Student Signature: Emily Sites

This work and its defense approved by:

Committee Chair: Lisa Martin, PhD

Teresa Smolarek, PhD

Elizabeth Schorry, MD

5/27/2010

Copy Number Variation in Monozygotic Twins with NF1

Emily Sites

M.S. University of Cincinnati

December, 2008

University of Cincinnati

Genetic Counseling Graduate Program

College of Medicine

Master’s Degree Thesis

May 28th, 2010

Thesis Committee:

Elizabeth Schorry, M.D.

Teresa Smolarek, Ph.D.

Lisa Martin, Ph.D. (Chair)

Emily Sites

Genetic Counseling Graduate Program

Abstract for master’s thesis: Copy Number Variation in Monozygotic Twins with NF1

May 28th, 2010

General Abstract

A major challenge of managing patients with Neurofibromatosis Type 1 (NF1) is the extreme variability of its phenotype, with no way to predict which patients are at high risk for serious complications. Family members and even monozygotic (MZ) twins with NF1 demonstrate inconsistent expression of the disease. The underlying mechanism for this discordance has never been elucidated. We propose that DNA copy number variants (CNVs), small deletions and duplications of genomic material, may contribute to the variability of disease manifestation. CNVs are known to differ between MZ twins and have recently been implicated in the etiology of several disorders including autism and schizophrenia.

Hypothesis: MZ twins with NF1 will have within-pair differences in CNVs that may explain their discordant NF1 complications, and CNVs will be present in larger numbers in NF1 patients compared to the general population.

Methods: MZ twins with NF1, ages 5 to 18 years, were recruited from the NF clinic population at Cincinnati Children’s Hospital Medical Center (CCHMC). Extensive data were collected on each twin’s NF1 features and complications. The Illumina 610k SNP microarray chip was used to identify and compare CNVs in peripheral blood from the twins and their parents. Age-matched controls were selected from a pre-existing CNV study population.

Results: Of the five twin pairs reported here, three were discordant for optic pathway glioma, three for number of plexiform neurofibromas, one for pectus deformity, one for scoliosis, one for malignancy, and one pair was concordant for all parameters. We identified 43 CNVs meeting our conservative criteria, 18 of which overlap known or predicted genes. The average number of CNVs per twin pair was 8.6, with a range from 3 to 12. Of interest were five previously unreported areas of copy number change, two of which contain genes. We have yet to identify a de novo (non-familial) CNV. A larger study population would be needed to identify a correlation between familial CNVs and specific complications.

Conclusions: Additional data are needed to determine if there is a correlative relationship between CNVs and NF1 phenotype.

TABLE OF CONTENTS

Introduction

Methods

Results

Discussion

Tables

Figures

References

INTRODUCTION

Neurofibromatosis Type 1 (NF1) is a tumor predisposition syndrome that affects one in thirty-five hundred live births. Individuals with NF1 are at an increased risk of developing both benign and malignant tumors. Other characteristics of the disease include café au lait spots, Lisch nodules of the iris, optic pathway gliomas, scoliosis, pectus deformities, other bone and vascular abnormalities, and learning disabilities (Friedman and Birch 1997). Neurofibromatosis Type 1 was first described in 1882, and the responsible gene was cloned in 1990 (Ober 1978; Wallace, Marchuk et al. 1990). The 350 kb NF1 gene is located at 17q11.2 (Daston, Scrable et al. 1992).

One of the most challenging aspects of managing patients with NF1 is the extreme variability of phenotype. Currently, there is no way to predict which patients are at high risk for serious complications like malignancy, and to appropriately manage those at higher risk. Family members and even monozygotic (MZ) twins with NF1 demonstrate inconsistent expression of the disease. The underlying mechanism for this discordance has never been elucidated, although several have been proposed. These include: modifying genes, stochastic “2nd hit” events, environmental agents, epigenetic alterations, and post-zygotic (somatic) mutations. Despite extensive research, little evidence has been found to identify these factors or events.

The search for other modifying genes has yielded little success to date. Second-hit, loss-of-heterozygosity events have been well-documented in several tumor types of NF1 and in at least one type of skeletal complication (Stevenson, Zhou et al. 2006), but are unlikely to explain the entire spectrum of NF1 features. Indeed, in NF1 neurofibromas, less than 50% of tumors have been found to have 2nd hit events in the NF1 gene, implying an additional mechanism at play (Wiest, Eisenbarth et al. 2003).

Full gene deletions of NF1 can be associated with a more severe phenotype and higher tumor burden (Upadhyaya, Ruggieri et al. 1998), but no other significant genotype-phenotype correlations have been found. This full gene deletion, also referred to as NF1 microdeletion, occurs in only 5-10% of patients and predisposes individuals to more severe learning disabilities and facial dysmorphology, earlier or greater burden of cutaneous neurofibromas and possibly malignancy (Lopez Correa, Brems et al. 2000). It has been suggested that the severity of NF1 microdeletion is due to the concomitant loss of a group of genes adjacent to the NF1 gene, including CENTA2, RAB11FIP4, C17orf79, and UTP6 (Bartelt-Kirbach, Wuepping et al. 2009), but this has not been demonstrated conclusively.

Twin studies have historically been a valuable tool for studying genetic disorders.

Early genetic studies compared monozygotic twins raised separately to determine heritability of specific traits. The literature reports at least 24 pairs of proven MZ twins with NF1 [see References, Case Studies]. Easton et al. (Easton, Ponder et al. 1993) studied a large NF1 population, and found that NF1 features varied to a greater degree with increasing distance from a proband, with 6 pairs of MZ twins having the closest agreement in traits such as presence of neurofibromas, head circumference, and learning disabilities, compared to more distant relatives with NF1. They concluded that there was significant evidence for modifying loci affecting NF1 expression.

Most other twin reports have been case reports. Some have shown remarkable concordance for NF1 features, although others have shown equally intriguing discordances. Most MZ twin pairs have been generally concordant for overall numbers of café-au-lait spots and cutaneous neurofibromas, implying control by genetic factors, although not solely by the NF1 gene. Presence and location of plexiform neurofibromas have generally been discordant in twin pairs, as expected if these tumors required a random, second-hit event. There have been surprising reports of concordance in other NF1 tumors such as optic nerve glioma (ONG; 6 pairs concordant, 5 pairs discordant) (Cartwright 1982; Crawford and Buckler 1983; Pascual-Castroviejo, Verdu et al. 1988; Kelly, Sproul et al. 1998), more than expected from the prevalence of ONG in 15% of children with NF1.

Although MZ twins were historically assumed to be genetically identical, recent studies have shown remarkable differences in MZ twins in areas such as copy number variants (CNVs) (Bruder, Piotrowski et al. 2008). Copy number variants are microduplications or deletions of DNA that are typically defined as being larger than 1 kilobase (kb), and are now known to be widespread throughout the genome (Scherer, Lee et al. 2007). There is mounting evidence that CNVs have played a significant role in normal population variation, evolution, and disease predisposition (Wong, deLeeuw et al. 2007; McCarroll, Kuruvilla et al. 2008). Genes involved in immunity and defense, sensory perception, and cell adhesion have been found to be associated with CNVs (Conrad, Andrews et al. 2006). Significant CNV associations have been found in schizophrenia and autism (Sebat, Lakshmi et al. 2007; Xu, Roos et al. 2008).

Some CNVs represent recurrent polymorphisms occurring at a low rate in the population and have no known clinical significance. However, more than 50% of CNVs occur in areas of known genes and could potentially cause significant phenotypic effect (Iafrate, Feuk et al. 2004). The average individual in the general population will be found to have at least 10-50 CNVs using current SNP-based microarrays, with the majority of those (>90%) being familial (Iafrate, Feuk et al. 2004; Conrad, Andrews et al. 2006; McCarroll, Kuruvilla et al. 2008). The incidence of CNVs depends greatly on methodology used, and it is likely that the numbers demonstrated to date represent only the tip of the iceberg.

De novo CNVs are those found in a proband but in neither parent, and in our case perhaps in only one twin. These are thought to arise either in a germ cell or in the early embryo. If a CNV were present in only one of a pair of MZ twins, we would expect that the change occurred sometime after division of the 2 embryos. De novo CNVs which occur post-fertilization may have a mosaic distribution throughout the body. This could potentially be detected by sampling tissues representative of multiple germ layers. We sampled both blood (mesoderm) and buccal (ectoderm) cells to investigate this possibility. De novo CNVs are generally thought to be a likely cause of phenotypic abnormality.

Bruder et al. studied whole-genome CNV in MZ twin pairs discordant for Parkinsonian disease and compared them to CNVs in unselected and concordant MZ twin pairs. They found multiple CNVs in all twin pairs, most in mosaic form, and concluded that these differences represented an example of somatic mosaicism. This group utilized both the Illumina 300k SNP array and 32K BAC array platforms and were reportedly able to detect CNVs with as low as 10% mosaicism. They asserted that the post-twinning de novo CNV rate was 5% per twin or 10% per twin pair in unselected phenotypically concordant pairs, and potentially even higher in phenotypically discordant twins. Their findings open up new possibilities in the use of discordant MZ twins for identifying genetic regions harboring disease- or trait-influencing loci (Bruder, Piotrowski et al. 2008).

Multiple reports have estimated the rate of de novo CNVs in the general population to be on the order of 1%, meaning that one in one hundred persons could be expected to harbor a de novo CNV (Redon, Ishikawa et al. 2006; Marshall, Noor et al. 2008). This will of course vary slightly depending upon the genotyping platform utilized. This rate appears to be increased, however, in certain subsets of the population affected by neurocognitive disorders like schizophrenia and autism. Reports by multiple groups suggest that in sporadic (non-familial) cases of schizophrenia and autism, the de novo CNV rate increases to 10%. By comparison, in familial cases the rate of de novo CNV was approximately 2%, and 1% in the unaffected control group (Sebat, Lakshmi et al. 2007; Xu, Roos et al. 2008).

Researchers in the NF1 field have already begun to investigate CNVs as a tool to improve the prognosis and treatment of malignant peripheral nerve sheath tumor (MPNST). By performing microarray analysis on DNA extracted from benign and malignant NF1-related tumors, it has been demonstrated that certain CNVs can be found repeatedly in the MPNSTs of multiple individuals (Mantripragada, de Stahl et al. 2009). The presence of specific CNVs within MPNSTs may even be predictive of prognosis and 10-year survival (Brekke, Ribeiro et al.). For our purposes, individuals identified as carrying one of these CNVs could potentially be considered high-risk, and monitored closely for the development of MPNST. Genes within these particular regions of CNV represent strong candidates for further study, and potential targets for new therapeutics. We will compare CNVs identified in our twins to these published lists.

MATERIAL AND METHODS

Recruitment

We reviewed our NF1 Clinical Database, and found 9 pairs of MZ twins with NF1 from a total population of nearly 1,000 patients evaluated at Cincinnati Children’s Hospital Medical Center (CCHMC). In this preliminary report, we report data on 5 pairs of these monozygotic twins (ages 5 to 18 years) and their parents. Subjects were recruited from the existing NF1 patient population at CCHMC. Inclusion criteria were 1) a diagnosis of NF1, and 2) confirmation of monozygosity of the twins. Monozygosity was verified by short tandem repeats at 16 chromosomal loci. Study participation was offered to individuals via phone or in person. Informed, written consent was obtained from parents and twins aged 18 years and older; assent was obtained from children aged 10 to 17 years. For children 17 and younger, a parent consented for their children. Research protocol and all study materials were approved by both the CCHMC Institutional Review Board and the United States Department of Defense Human Research Protection Office.

Samples

Blood samples were obtained from each individual and used to prepare both linear DNA and using standard protocols. In addition, buccal swabs were obtained from each twin to assess for intra-individual mosaicism between blood and oral mucosa, which are representative of the mesoderm and ectoderm, respectively. The quality of the DNA extracted from peripheral blood and buccal swabs was monitored by analysis of OD260/OD280 and OD260/OD230 ratios. Acceptable samples had values between 1.8 and 2.0 and ratios > 2.0, respectively. Four microliters of a 50–100 ng/ml dilution of genomic DNA was aliquoted into 96-well plates and genotyped using the Illumina Human 610k SNP Chip according to manufacturer’s protocol. All samples from a given family were run simultaneously to minimize technical variation.
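The purity criteria described above can be written as a simple acceptance check. The following Python sketch is illustrative only and is not the laboratory's actual procedure; the `DnaSample` class, the sample identifiers, and the example ratio values are hypothetical, and only the two thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class DnaSample:
    sample_id: str    # hypothetical identifier, for illustration only
    od260_280: float  # OD260/OD280 ratio
    od260_230: float  # OD260/OD230 ratio


def passes_purity_qc(sample: DnaSample) -> bool:
    """Accept a sample when OD260/OD280 lies between 1.8 and 2.0
    and OD260/OD230 is greater than 2.0, as stated in the text."""
    return 1.8 <= sample.od260_280 <= 2.0 and sample.od260_230 > 2.0


# Made-up example values; these are not measurements from the study.
samples = [
    DnaSample("example_blood", 1.85, 2.10),
    DnaSample("example_buccal", 1.55, 1.20),
]
accepted = [s.sample_id for s in samples if passes_purity_qc(s)]
print(accepted)
```

Whether the boundary values 1.8 and 2.0 are themselves acceptable is not stated in the text; the sketch assumes they are.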

Genome-wide SNP Data

All samples were genotyped using the Illumina Human 610K SNP Chip, and only autosomal SNPs (chromosomes 1–22) were used. The HLA region on chromosome 6 was excluded from the analysis due to the high rate of population polymorphism within this region. Samples with a call rate of < 0.997 were discarded. Data was analyzed using Illumina’s Genome Studio software. The Illumina 610K platform is able to detect changes of >80 kb, with gaps in coverage of 1 megabase or less. All SNP probes are replicated 5 to 20 times throughout the chip; replicates act as an internal control, reducing the false-positive rate significantly. This data set corresponds to Build 36 of the Human Genome Project.
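The sample and SNP filters described above could be sketched as follows. This is a minimal Python illustration, not the Genome Studio workflow itself. The call-rate threshold and the restriction to chromosomes 1–22 are taken from the text; the HLA coordinates are a placeholder assumption, because the text does not give the boundaries that were used.

```python
AUTOSOMES = {str(c) for c in range(1, 23)}  # chromosomes 1-22
MIN_CALL_RATE = 0.997                       # samples below this were discarded

# ASSUMPTION: placeholder boundaries for the excluded HLA region on
# chromosome 6. The thesis does not state the coordinates it used.
HLA_REGION = ("6", 29_000_000, 34_000_000)


def keep_sample(call_rate: float) -> bool:
    """Keep a sample unless its call rate is below 0.997."""
    return call_rate >= MIN_CALL_RATE


def keep_snp(chrom: str, position: int) -> bool:
    """Keep autosomal SNPs that fall outside the excluded HLA region."""
    if chrom not in AUTOSOMES:
        return False
    hla_chrom, hla_start, hla_end = HLA_REGION
    return not (chrom == hla_chrom and hla_start <= position <= hla_end)


print(keep_sample(0.9985), keep_snp("X", 1_000_000), keep_snp("6", 30_000_000))
```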

Copy number changes were identified using the cnvPartition Algorithm v.2.3.4. This algorithm identifies regions of copy number variation and calculates the log R ratio and B allele frequency. All CNVs were visually inspected with regard to log R ratio and B allele frequency, and compared to parental samples, as available, to confirm hereditary or de novo nature of each CNV. Attention was paid to both chromosomal location and CNV breakpoints. The Database of Genomic Variants (http://projects.tcag.ca/variation/) was utilized to categorize CNVs as novel or benign polymorphism with regard to previously reported losses, gains, inversions and segmental duplications. The cnvPartition algorithm also generated a confidence score for each CNV. CNVs with a confidence score < 100 were excluded from the final analysis due to the increased chance of artifact in this group. Known and predicted genes within confirmed CNV regions were catalogued for further analysis. Potential genes of interest within 500 kb up- or downstream of CNVs were also recorded for future analysis.
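The filtering and classification steps above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: in the study, inheritance was confirmed by visual inspection against parental samples, whereas here a call is treated as inherited when a parental call of the same type overlaps it, which is an assumed rule. The `Cnv` class, the function names, and the confidence values are hypothetical; the 4q28.3 coordinates are taken from Table 4.

```python
from __future__ import annotations

from dataclasses import dataclass

MIN_CONFIDENCE = 100  # calls scoring below this were excluded
FLANK_BP = 500_000    # window searched for nearby genes of interest


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int
    end: int
    kind: str          # "loss" or "gain"
    confidence: float  # example scores below are made up


def is_conservative(cnv: Cnv) -> bool:
    """Keep calls whose confidence score is at least 100."""
    return cnv.confidence >= MIN_CONFIDENCE


def overlaps(a: Cnv, b: Cnv) -> bool:
    """True when two calls on the same chromosome share any position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def classify_origin(cnv: Cnv, parental_calls: list[Cnv]) -> str:
    """ASSUMED rule: inherited if a parental call of the same type overlaps."""
    inherited = any(p.kind == cnv.kind and overlaps(cnv, p) for p in parental_calls)
    return "inherited" if inherited else "candidate de novo"


def flank_window(cnv: Cnv) -> tuple[int, int]:
    """Interval extending 500 kb up- and downstream of the CNV."""
    return max(0, cnv.start - FLANK_BP), cnv.end + FLANK_BP


twin_call = Cnv("4", 134_352_228, 134_359_100, "loss", 250.0)  # 4q28.3, Table 4
weak_call = Cnv("1", 1_000_000, 1_010_000, "gain", 40.0)       # hypothetical
parent_calls = [Cnv("4", 134_352_228, 134_359_100, "loss", 300.0)]

kept = [c for c in (twin_call, weak_call) if is_conservative(c)]
print(len(kept), classify_origin(twin_call, parent_calls), flank_window(twin_call))
```

A call labelled "candidate de novo" by such a rule would still need visual confirmation of log R ratio and B allele frequency, as described above.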

RESULTS

Phenotype

Of the five twin pairs reported here, three were discordant for optic pathway glioma, two for presence of plexiform neurofibromas, one for number of plexiforms (2 versus 1), one for pectus deformity, one for T2 hyperintensities on brain imaging, one for scoliosis, one for learning disability, and one pair was discordant for malignancy (MPNST). In Pair D, one twin had velopharyngeal insufficiency which resulted in hypernasal speech, but the twins were concordant for mild speech delay. Pair D was the only pair discordant for learning disability; one twin had an Individualized Education Program (IEP) at school whereas the other did not. Overall, 4 of 5 pairs were discordant for 2 or more features. Pair E, the youngest, was concordant for all parameters studied (Table 1). Number of café au lait macules, number of cutaneous neurofibromas and presence or absence of Lisch nodules were remarkably concordant across all five pairs, which agrees with previous reports.

Copy Number Variants

The overall number of CNVs was relatively similar across the five twin pairs, with each pair having between 12 and 26 raw CNVs (mean = 19.6). All CNVs identified were concordant within the twin pair, and inherited from a parent. Applying a stringent CNV confidence score cutoff value of 100, we eliminated 56% of those CNVs, leaving us with 43 conservative CNVs. Each pair had between 3 and 12 conservative CNVs (mean = 8.6). The majority of raw (74/98 = 75.5%) and conservative (37/43 = 86%) CNVs represent copy number loss. Among conservative CNVs, 38 (88.4%) were identified as benign polymorphisms, occurring in > 1% of the general population (Table 2).
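The summary figures in this paragraph follow directly from the per-pair counts in Table 2. The short Python check below reproduces the arithmetic; the per-pair counts are copied from Table 2, the raw loss count of 74 is the figure stated in the text, and the variable names are chosen here for illustration.

```python
# Per-pair counts copied from Table 2.
raw = {"A": 22, "B": 21, "C": 17, "D": 12, "E": 26}
conservative = {"A": 11, "B": 8, "C": 9, "D": 3, "E": 12}
conservative_losses = {"A": 9, "B": 8, "C": 9, "D": 2, "E": 9}
RAW_LOSSES = 74  # stated in the text; Table 2 does not break this down by pair

total_raw = sum(raw.values())
total_cons = sum(conservative.values())
total_cons_losses = sum(conservative_losses.values())

mean_raw = total_raw / len(raw)
mean_cons = total_cons / len(conservative)
pct_removed = 100 * (total_raw - total_cons) / total_raw
pct_raw_loss = 100 * RAW_LOSSES / total_raw
pct_cons_loss = 100 * total_cons_losses / total_cons

print(total_raw, total_cons, round(mean_raw, 1), round(mean_cons, 1))
print(round(pct_removed, 1), round(pct_raw_loss, 1), round(pct_cons_loss, 1))
```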

Genes

Conservative CNVs are further classified by their association with known or predicted genes. Eighteen of the conservative CNVs (41.9%) overlap known or predicted gene sequences (Table 3). Of those CNVs, 16 were considered to be benign polymorphisms. The 2 remaining CNVs occurred at 11q22.3 (pair E) and 21q21.1 (pair A). The region of chromosome 11 bounded by SNP markers at positions 108678237 and 109100854 contains several predicted sequences: RefSeq C11orf87, Ensembl ENST00000327419, and Mammalian Gene Collection mRNAs BC068577 and BC035798. The region on chromosome 21 bounded by SNP markers at positions 18242313 and 18250201 contains the CHODL gene, which encodes the transmembrane protein Chondrolectin (MIM: 607247).

Twin pair A carries a large duplication polymorphism on 10q11.22 which contains or overlaps several genes (GPRIN2, ANXA8, ANXA8L1, ANXA8L2, SYT15 and L25628) as well as RefSeq predicted sequences. GPRIN2 (MIM: 611240) is G protein regulated inducer of neurite outgrowth 2. ANXA8 (MIM: 602396), ANXA8L1, and ANXA8L2 represent the annexin family of evolutionarily conserved calcium and phospholipid binding proteins, which may have anticoagulant activity. SYT15 (MIM: 608081) encodes Synaptotagmin 15, a membrane trafficking protein expressed in neurons.

Twin pairs A and B share an overlapping region of copy number loss on 8p11.23 which contains the tMDC gene, also known as ADAM5, a metallopeptidase pseudogene.

Twin pairs A and C have adjacent but non-overlapping CNVs on 4q12 which do not contain known genes. In addition, all five twin pairs share the same CNV at 4q28.3 which does not contain any genes (Table 4). These changes at 8p11.23, 4q12 and 4q28.3 are common population polymorphisms (TS, unpublished data).

CNV Distribution

In general, identified CNVs were found to be widely distributed across the genome, with a concentration on 4q (Figure 1). The long arm of chromosome 4 seemed to be a hotspot for CNVs in our twins, accounting for 14/43 or 32.5% of all conservative CNVs. This is consistent with the greater amount of heterochromatin on 4q. Of note, all CNVs in this region represented copy number loss and 10/13 did not contain any genes, known or predicted. There were only three genes identified within the CNVs on 4q: UGT2B17, UGT2B28 and LRBA. Copy number loss of any of these 3 genes is within the spectrum of normal variation.

We did not detect copy number variation within the NF1 gene or the contiguous microdeletion region in any of the twin pairs studied. This was not unexpected as none of these twins are known to carry the full gene deletion. Missense and nonsense mutations too small to detect with microarray have been found in twin pairs A, D and E; pairs B and C have not had molecular testing. We did note in twin pair A, however, a large duplication 4.1 Mb away from the NF1 gene spanning the region from 17p11.2-17q11.1 which contains several predicted genes, but is considered a benign polymorphism. The NF1 microdeletion region is 6.3 Mb away from this CNV. Of potential interest, one twin in this pair has MPNST.

Comparison to NF1-related Tumor CNVs and Scoliosis-related CNVs

We sought to compare our conservative CNVs to those reported in NF1-related neurofibromas and MPNSTs in the literature. Multiple tumor samples have harbored copy number changes in chromosomes 17, 19 and 22q (Koga, Iwasaki et al. 2002), gains in 7, 8q, 15q and 17q (Schmidt, Taubert et al. 2000), gains in 8q, 7p, 16p and 17q and losses in 9p, 11q, 17p, and 10q (Brekke, Ribeiro et al.). These chromosomal locations were not well represented among our conservative CNVs. Of those that did overlap, the CNVs were not matched for either breakpoints or copy number change. Mantripragada et al. reported losses in MPNST samples at 9p21.3, containing the CDKN2A/B and MTAP genes. Twin pair A also had a loss at 9p21.3, but it did not include these key genes. Additionally, Mantripragada et al. reported copy number losses at 11q22.3 and 17p11.2-q11.1, which corresponded to copy number gains in our study group at the same locations (Table 5).

A search for the term “scoliosis” in the DECIPHER database (v4.4) on 5/20/10 (https://decipher.sanger.ac.uk) yielded a list of 44 patients with the scoliosis phenotype, as well as their associated microarray findings. Only twin pair C is affected by scoliosis; one twin has a mild, 10-degree curvature. We compared the DECIPHER scoliosis-related CNVs to CNVs identified in pair C. One raw CNV at 10p11.23 (29613720-29629015) overlapped with a DECIPHER CNV (chr10: 28833195-29823698). The CNV in twin pair C overlaps the LYZL1 gene, whereas the DECIPHER CNV contains LYZL1 and overlaps 4 additional genes. The CNV in twin pair C has not been previously reported as a polymorphism.
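The overlap reported for 10p11.23 can be verified from the two coordinate pairs given above. The Python sketch below assumes both intervals are on the same genome build and use inclusive coordinates, which the text does not state; the function name `overlap_bp` is chosen here for illustration.

```python
def overlap_bp(start_a: int, end_a: int, start_b: int, end_b: int) -> int:
    """Number of shared positions between two inclusive intervals (0 if none)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


pair_c_cnv = (29_613_720, 29_629_015)    # raw CNV in twin pair C, 10p11.23
decipher_cnv = (28_833_195, 29_823_698)  # DECIPHER scoliosis-related CNV

shared = overlap_bp(*pair_c_cnv, *decipher_cnv)
pair_c_length = pair_c_cnv[1] - pair_c_cnv[0] + 1
fully_contained = decipher_cnv[0] <= pair_c_cnv[0] and pair_c_cnv[1] <= decipher_cnv[1]

print(shared, pair_c_length, fully_contained)
```

Under these assumptions the twin pair C interval lies entirely within the larger DECIPHER interval, so the shared length equals the full length of the twin pair C call.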

DISCUSSION

The goal of this study was to identify somatic genetic differences in MZ twins with NF1, which could explain their often discordant phenotypes. The SNP-based microarray is a novel tool for investigating the variability of NF1 disease manifestation. Based on previous studies, we expected to uncover one de novo CNV in 10 twin pairs (Sebat, Lakshmi et al. 2007; Xu, Roos et al. 2008), but have not yet identified a de novo change among these five twin pairs. This may be a consequence of the small sample size, or the relatively high degree of phenotypic concordance among these five twin pairs. Another potential explanation is the young age of our study subjects, between 5 and 18 years. CNVs are known to accumulate with age, thus we may not see a disparity until later years (Fraga, Ballestar et al. 2005).
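The effect of sample size on this expectation can be made concrete with a short calculation. The Python sketch below takes the expected rate of one de novo CNV per 10 twin pairs at face value and assumes that twin pairs are independent; both are simplifying assumptions made here for illustration, not results of the study.

```python
RATE_PER_PAIR = 0.10  # one de novo CNV expected per 10 twin pairs
N_PAIRS = 5           # twin pairs analyzed in this report

# Expected number of pairs carrying a de novo CNV.
expected_pairs = RATE_PER_PAIR * N_PAIRS

# Probability that none of the pairs carries a de novo CNV,
# assuming each pair is an independent trial.
p_none = (1 - RATE_PER_PAIR) ** N_PAIRS

print(expected_pairs, round(p_none, 2))
```

Under these assumptions, observing no de novo CNV in five pairs is the more likely outcome, which is consistent with the small sample size being a sufficient explanation.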

The long arm of chromosome 4 seemed to be a hotspot for CNVs in our twins, accounting for 32.5% of all conservative CNVs. Variation in this region is also extremely common in the general population, as 4q is relatively gene poor (Clapp, Bolland et al. 2003); only 3/10 conservative CNVs in this region contained genes. Interestingly, all of the conservative CNVs on 4q were copy number losses. Losses on 4q have been associated with poor prognosis in MPNSTs, although the specific breakpoints of these losses were not described by Brekke et al., and thus cannot be compared conclusively to ours (Brekke, Ribeiro et al.).

NF1-related MPNSTs and plexiform neurofibromas have been interrogated for CNVs by multiple groups. Overall, MPNSTs seem to contain more genomic rearrangements than benign plexiforms, suggesting an element of genomic instability which accompanies (or precedes) malignant transformation. There are multiple reports of copy number changes in NF1-related tumors, with copy number gain being more common in MPNST, and loss occurring more frequently in benign tumors (Schmidt, Taubert et al. 2000; Koga, Iwasaki et al. 2002; Mantripragada, de Stahl et al. 2009). While the tumor samples contained many CNVs, a significant proportion may have been constitutional variation in the individual from which the tumor was harvested.

The majority of our CNVs, both raw and benign, represent copy number loss, contradicting previous findings in population-based studies which found that copy number change was predominantly comprised of duplications (Iafrate, Feuk et al. 2004; Conrad, Andrews et al. 2006; Eichler 2006). This could be due to differences in genotyping platforms, i.e. BAC array and CGH array versus our SNP-based array, as well as the differing genomic resolution and reference populations of each platform. Copy number probes make CNVs more visible in terms of B allele frequency than do SNP probes, and deletions are more easily detected than duplications in general.

Several genes of potential interest were within 500 kb of reported CNVs. These include PTPN20A/B, MAP2K3 and FOXG1. MAP2K3 (MIM: 602315) codes for mitogen-activated protein kinase kinase 3, an element which functions downstream of activated Ras in an intracellular signaling cascade. Constitutive activation of Ras signaling is believed to be the major mechanism of pathogenesis in NF1 (Klose, Ahmadian et al. 1998). Loss or gain of genetic material within 500 kb of a transcriptional start site could significantly impact transcription of a gene by altering transcription factor binding in the promoter region. There are multiple other mechanisms by which a copy number change, even when devoid of genes, could affect the transcription of near and distant genes.

If a CNV interrupts a locus control region (LCR) or insulator element, the normal packaging, methylation and acetylation patterns of chromatin in that area could be affected. Heterochromatin may become exposed or euchromatin condensed, resulting in altered transcriptional activity within the region. Elements within the chromatin which are normally silenced may become exposed and interact with transcription factors or other components of the transcriptional machinery (Hubner and Spector). Enhancers are an additional element of transcriptional control which can be located some distance from a gene, even on another chromosome (Williams, Spilianakis et al.). This significantly expands the potential of CNVs to affect gene expression.

We found a large duplication on chromosome 17 in twin pair A, 4.1 Mb away from the NF1 gene. This duplication spans the region from 17p11.2-17q11.1 and contains several predicted genes, but is polymorphic. Of interest, one twin in this pair has an MPNST. It may be that this CNV has predisposed the twins to mitotic recombination or other instability near the NF1 locus. This CNV was maternally inherited and therefore in trans to the constitutional NF1 mutation, which was paternally inherited. As a result, cells may be more susceptible to loss of heterozygosity (LOH) at NF1 and ultimately, malignant transformation. LOH at NF1 is generally considered the causative factor for the development of tumors in the NF1 population (Serra, Puig et al. 1997). Additionally, NF1 neurofibromas have revealed a wide array of “2nd hit” mutations, even within tumors from the same patient, implying a high tendency for new mutation events (Wiest, Eisenbarth et al. 2003).

Although we have not compared copy number variation in our twins to that of non-NF1 controls, each twin represents an ideal control for their co-twin. We originally hypothesized that NF1 haploinsufficiency might predispose patients to genomic instability and a higher rate of de novo CNV occurrence. The instability would manifest as de novo CNVs that arose somatically, after division of the two embryos, or even later in development after separation of the germ layers. If this were the case, we would expect to detect CNV differences between co-twins. As we have not yet identified any CNVs as discordant between twins, our current data does not support this hypothesis.

Ideally, DNA from buccal swabs would have provided further information on genomic instability by revealing tissue type mosaicism for CNVs. If a CNV were de novo, the result of a somatic change, we would be likely to see differences between germ layers, depending upon when in embryogenesis the CNV arose. Unfortunately the buccal swabs yielded low-quality DNA which was not suitable for SNP analysis. The potential for de novo changes in tissues outside of peripheral blood still exists.

Two recent studies of identical twins with discordant disease phenotypes have also failed to identify genetic differences between discordant MZ twins. The first is a case report of MZ twins with BRCA1 mutations, one of whom developed cancer twice, the other not at all. The authors were not able to detect any differences in CNVs which could explain the differing phenotypic expression of the BRCA1 mutation (Lasa, Ramon et al.). The second report describes three pairs of MZ twins discordant for multiple sclerosis. Analysis of CNVs, genomic methylation patterns and mRNA expression in all three, as well as full genome sequencing of one pair, revealed no significant differences, even in confirmed MS-susceptibility SNPs (Baranzini, Mudge et al.).

Although CNVs reported here are also present in the general population, they may have greater or different significance in the context of NF1. And while de novo CNVs would represent a basis for discordance between twins, familial CNVs could be predisposing factors in twin pairs with concordant NF1 features. This study does not have sufficient power to demonstrate a correlative relationship between NF1 symptoms and CNVs; however, we have described several genes and genomic regions which may be of interest for future investigation. Larger studies will be needed to conclusively correlate CNVs with specific NF1 complications.

In the future, this line of research could give us clues to the genes and pathways involved in the pathogenesis of these complications. Knowledge of these modifying genes or genomic regions could help us to better predict and prevent NF1 complications, and may contribute to the development of new, targeted therapies. Additionally, these findings could translate to research in the same disease processes in the general population. We hope to expand this study to include the larger NF1 population, to better characterize any associations between CNVs and NF1 phenotype.

In addition to CNVs, several other possibilities exist for investigation of phenotypic differences between MZ twins. Differential methylation patterns at the NF1 promoter and remote genomic locations may have the potential to modify disease presentation between twins, and MZ twins have been found to display differences in both gene-specific and whole-genome methylation (Harder, Rosche et al. 2004; Petronis 2006). The NF1 promoter contains several transcription factor binding sites which are potentially susceptible to changes in methylation (Hajra et al. 1994; Zou et al. 2004). This represents an exciting new direction for the investigation of phenotypic discordance between MZ twins with NF1.

TABLES & FIGURES

Table 1.


Table 2.

Twin Pair            A    B    C    D    E   Total   Mean
Raw CNVs            22   21   17   12   26      98   19.6
Conservative CNVs   11    8    9    3   12      43    8.6
CN Losses            9    8    9    2    9      37    7.4
Polymorphism         8    6    8    3    9      34    6.8
Contain genes        6    3    3    1    5      18    3.6

Table 2. CNV type by twin pair. Conservative CNVs are those with CNV confidence values > 100. CN Losses = CNVs which represent copy number loss. Polymorphism = CNVs which occur in > 1% of the general population.

Table 3.

Cyto Band          Genes                                              CNV type
3q22.1             CPNE4                                              loss
4q13.2             UGT2B17                                            loss
4q13.2             UGT2B28                                            loss
4q31.3             LRBA                                               loss
5q33.1             predicted                                          loss
8p11.23            tMDC                                               loss
8p11.23            tMDC, TMDCII                                       loss
9q34.3             predicted                                          gain
10q11.22           GPRIN2, ANXA8/L1/L2, SYT15, L25628, predicted      gain
10q11.22           predicted                                          gain
11q11              OR4 family                                         loss
11q22.3            predicted                                          gain
12q15              predicted                                          loss
15q14              GOLGA8B, predicted                                 loss
17p11.2-17q11.1    predicted                                          gain
21q21.1            CHODL                                              loss
22q11.23           GSTT1, predicted                                   loss
22q11.23           LRP5L, predicted                                   loss

Table 3. Genes within regions of CNV. Eighteen regions of conservative CNV were found to contain or overlap known or predicted gene sequence; these are summarized here. "Predicted" may refer to RefSeq or Ensembl genes, or to Mammalian Gene Collection mRNA sequences. The cytogenetic location and type (copy number gain or loss) of each CNV are shown.

Table 4.

Twin Pair   Chrom   Cyto Band          start        end          CNV type
E           1       1p13.3             111180501    111189246    loss
A           2       2p22.3             34556561     34571303     loss
B           3       3q22.1             133185033    133195707    loss
C           3       3q26.1             165141051    165158207    loss
A           4       4q12               58409164     58418102     loss
C           4       4q12               58417022     58418102     loss
A           4       4q13.2             69064675     69163188     loss
E           4       4q13.2             70164518     70246877     loss
C           4       4q13.3             64380191     64392223     loss
B           4       4q26               115398433    115401739    loss
E           4       4q26               115398433    115401739    loss
A           4       4q28.3             134352228    134359100    loss
B           4       4q28.3             134352228    134371364    loss
C           4       4q28.3             134352228    134359100    loss
D           4       4q28.3             134352228    134359100    loss
E           4       4q28.3             134352228    134359100    loss
E           4       4q31.3             152100215    152103164    loss
B           4       4q32.1             161276893    161291569    loss
E           5       5q23.1             120328975    120440249    gain
C           5       5q33.1             151495149    151499003    loss
E           6       6q14.1             79029649     79090197     loss
A           8       8p11.23            39350791     39497557     loss
B           8       8p11.23            39356825     39497557     loss
A           9       9p11.2             44683090     44770712     loss
B           9       9p21.3             22486640     22492851     loss
C           9       9p22.2             17900043     17908806     loss
C           9       9p23               11392634     11398865     loss
D           9       9q34.3             137288987    137397935    gain
A           10      10q11.22           46291137     47454088     gain
E           10      10q11.22           46631679     47173619     gain
D           10      10q21.3            66980652     66983043     loss
A           11      11q11              55124465     55209499     loss
E           11      11q22.3            108678237    109100854    gain
B           12      12q12              44185155     44217885     loss
E           12      12q14.1            59041515     59043746     loss
B           12      12q15              69160993     69162893     loss
E           13      13q13.1            31431830     31433996     loss
A           14      14q12              26520903     27111869     loss
E           15      15q14              32459510     32625184     loss
A           17      17p11.2-17q11.1    21452168     22370819     gain
A           21      21q21.1            18242313     18250201     loss
C           22      22q11.23           22668071     22715105     loss
C           22      22q11.23           23980406     24244593     loss

Table 4. Complete index of conservative CNVs. Type, location and twin pair of origin are summarized for all 43 conservative CNVs found in this study. Start and end numbers represent the location of a SNP marker.
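Several regions in Table 4 were called in more than one twin pair with different boundaries. As an illustration only, the following sketch computes the span shared by two such calls. The coordinates are transcribed from Table 4; the function name and the choice to measure the shared span as a simple difference between SNP marker positions are assumptions and do not represent the analysis performed in this study.

```python
def overlap_bp(first, second):
    """Span in base pairs shared by two (start, end) intervals; 0 if they are disjoint."""
    shared_start = max(first[0], second[0])
    shared_end = min(first[1], second[1])
    return max(0, shared_end - shared_start)


# (start, end) marker positions transcribed from Table 4.
calls = {
    "4q12": {"A": (58409164, 58418102), "C": (58417022, 58418102)},
    "8p11.23": {"A": (39350791, 39497557), "B": (39356825, 39497557)},
    "10q11.22": {"A": (46291137, 47454088), "E": (46631679, 47173619)},
}

shared = {}
for band, by_pair in calls.items():
    first, second = by_pair.values()
    shared[band] = overlap_bp(first, second)

for band, span in shared.items():
    print(band, span)
```

In all three regions the call from one twin pair lies entirely within the call from the other, so the shared span equals the length of the smaller call.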

Figure 1.

Figure 1. Visual representation of CNV distribution. Conservative CNVs are depicted as colored regions corresponding to their cytogenetic location. Red = gain and blue = loss. Horizontal size indicates the number of twin pairs with that CNV. Vertical size indicates the number of cytogenetic bands involved in the CNV (not to scale).

Table 5.

Mantripragada
1p35-33 loss                            1q25 gain           6q15 gain
1p21 loss                               3p26 gain           6q23-24 gain
9p21.3 loss including CDKN2A/B, MTAP    3q13 gain           7p22 gain
10q25 loss                              5p12 gain           7p14-13 gain
11q22-23 loss                           5q11.2-q14 gain     7q21 gain
17q11 loss                              5q21-23 gain        7q36 gain
20p12.2 loss                            5q31-33 gain        8q22-24 gain
                                        6p23-21 gain        14q22 gain
                                        6p12 gain           17q21-25 gain

Koga                                    Schmidt             Brekke
17q gain                                8q gain             8q gain
4q gain                                 17q42->qter gain    7p14->pter gain
22q loss                                7p14->pter gain     16p gain
17p11.2-p13 loss including p53          15q gain            17q gain
17q25-25 loss                           7q gain             9p loss
19p13.2 loss                            5p gain             11q loss
19q13.2->qter loss                      20q gain            17p loss
                                                            10q loss

Table 5. Comparison of Conservative CNVs to Tumor CNVs in the Literature. CNVs reported by various authors (as indicated) in benign plexiform and malignant peripheral nerve sheath tumors. Areas of overlap with our conservative twin CNVs are highlighted in yellow. Matching parameters between the reports and our data set (cyto location, gain/loss, genes involved) are bolded.

REFERENCES, CASE REPORTS

Vaughn, A.J., D. Bachman, and A. Sommer, Neurofibromatosis in monozygotic twins: a case report of spontaneous mutation. Am J Med Genet, 1981. 8(2): p. 155-8.
Cartwright, S.C., Concordant optic glioma in a pair of monozygotic twins with neurofibromatosis. Clin Pediatr (Phila), 1982. 21(4): p. 236-8.
Crawford, M.J. and J.M. Buckler, Optic gliomata affecting twins with neurofibromatosis. Dev Med Child Neurol, 1983. 25(3): p. 370-3.
Lubinsky, M.S., Non-random associations and vascular fields in neurofibromatosis 1: a pathogenetic hypothesis. Am J Med Genet A, 2006. 140(19): p. 2080-4.
Koul, R.L., A. Chacko, and H.O. Leven, Dandy-Walker syndrome in association with neurofibromatosis in monozygotic twins. Saudi Med J, 2000. 21(4): p. 390-2.
Craigen, M.A. and N.M. Clarke, Familial congenital pseudarthrosis of the ulna. J Hand Surg [Br], 1995. 20(3): p. 331-2.
Akesson, H.O., R. Axelsson, and B. Samuelsson, Neurofibromatosis in monozygotic twins: a case report. Acta Genet Med Gemellol (Roma), 1983. 32(3-4): p. 245-9.
Payne, M.S., et al., Congenital glaucoma and neurofibromatosis in a monozygotic twin: case report and review of the literature. J Child Neurol, 2003. 18(7): p. 504-8.
Kelly, T.E., et al., Discordant puberty in monozygotic twin sisters with neurofibromatosis type 1 (NF1). Clin Pediatr (Phila), 1998. 37(5): p. 301-4.
Tubridy, N., et al., Hippocampal involvement in identical twins with neurofibromatosis type 1. J Neurol Neurosurg Psychiatry, 2001. 71(1): p. 131-2.
Pascual-Castroviejo, I., et al., Optic glioma with progressive occlusion of the aqueduct of Sylvius in monozygotic twins with neurofibromatosis. Brain Dev, 1988. 10(1): p. 24-9.
Bauer, M., H. Lubs, and M.L. Lubs, Variable expressivity of neurofibromatosis-1 in identical twins. Neurofibromatosis, 1988. 1(5-6): p. 323-9.
Brady, W.J., Brain stem gliomas causing hydrocephalus in twins with von Recklinghausen's disease. J Neuropathol Exp Neurol, 1962. 21: p. 555-65.

REFERENCES

Baranzini, S. E., J. Mudge, et al. "Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis." Nature 464(7293): 1351-6.
Bartelt-Kirbach, B., M. Wuepping, et al. (2009). "Expression analysis of genes lying in the NF1 microdeletion interval points to four candidate modifiers for neurofibroma formation." Neurogenetics 10(1): 79-85.
Brekke, H. R., F. R. Ribeiro, et al. "Genomic changes in chromosomes 10, 16, and X in malignant peripheral nerve sheath tumors identify a high-risk patient group." J Clin Oncol 28(9): 1573-82.

Bruder, C. E., A. Piotrowski, et al. (2008). "Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles." Am J Hum Genet 82(3): 763-71.
Cartwright, S. C. (1982). "Concordant optic glioma in a pair of monozygotic twins with neurofibromatosis." Clin Pediatr (Phila) 21(4): 236-8.
Clapp, J., D. J. Bolland, et al. (2003). "Genomic analysis of facioscapulohumeral muscular dystrophy." Brief Funct Genomic Proteomic 2(3): 213-23.
Conrad, D. F., T. D. Andrews, et al. (2006). "A high-resolution survey of deletion polymorphism in the human genome." Nat Genet 38(1): 75-81.
Crawford, M. J. and J. M. Buckler (1983). "Optic gliomata affecting twins with neurofibromatosis." Dev Med Child Neurol 25(3): 370-3.
Daston, M. M., H. Scrable, et al. (1992). "The protein product of the neurofibromatosis type 1 gene is expressed at highest abundance in neurons, Schwann cells, and oligodendrocytes." Neuron 8(3): 415-28.
Easton, D. F., M. A. Ponder, et al. (1993). "An analysis of variation in expression of neurofibromatosis (NF) type 1 (NF1): evidence for modifying genes." Am J Hum Genet 53(2): 305-13.
Eichler, E. E. (2006). "Widening the spectrum of human genetic variation." Nat Genet 38(1): 9-11.
Fraga, M. F., E. Ballestar, et al. (2005). "Epigenetic differences arise during the lifetime of monozygotic twins." Proc Natl Acad Sci U S A 102(30): 10604-9.
Friedman, J. M. and P. H. Birch (1997). "Type 1 neurofibromatosis: a descriptive analysis of the disorder in 1,728 patients." Am J Med Genet 70(2): 138-43.
Harder, A., M. Rosche, et al. (2004). "Methylation analysis of the neurofibromatosis type 1 (NF1) promoter in peripheral nerve sheath tumours." Eur J Cancer 40(18): 2820-8.
Hubner, M. R. and D. L. Spector "Chromatin dynamics." Annu Rev Biophys 39: 471-89.
Iafrate, A. J., L. Feuk, et al. (2004). "Detection of large-scale variation in the human genome." Nat Genet 36(9): 949-51.
Kelly, T. E., G. T. Sproul, et al. (1998). "Discordant puberty in monozygotic twin sisters with neurofibromatosis type 1 (NF1)." Clin Pediatr (Phila) 37(5): 301-4.
Klose, A., M. R. Ahmadian, et al. (1998). "Selective disactivation of neurofibromin GAP activity in neurofibromatosis type 1." Hum Mol Genet 7(8): 1261-8.
Koga, T., H. Iwasaki, et al. (2002). "Frequent genomic imbalances in chromosomes 17, 19, and 22q in peripheral nerve sheath tumours detected by comparative genomic hybridization analysis." J Pathol 197(1): 98-107.
Lasa, A., Y. C. T. Ramon, et al. "Copy number variations are not modifiers of phenotypic expression in a pair of identical twins carrying a BRCA1 mutation." Breast Cancer Res Treat.
Lopez Correa, C., H. Brems, et al. (2000). "Unequal meiotic crossover: a frequent cause of NF1 microdeletions." Am J Hum Genet 66(6): 1969-74.
Mantripragada, K. K., T. D. de Stahl, et al. (2009). "Genome-wide high-resolution analysis of DNA copy number alterations in NF1-associated malignant peripheral nerve sheath tumors using 32K BAC array." Genes Chromosomes Cancer 48(10): 897-907.

Marshall, C. R., A. Noor, et al. (2008). "Structural variation of chromosomes in autism spectrum disorder." Am J Hum Genet 82(2): 477-88.
McCarroll, S. A., F. G. Kuruvilla, et al. (2008). "Integrated detection and population-genetic analysis of SNPs and copy number variation." Nat Genet 40(10): 1166-74.
Ober, W. B. (1978). "Selected items from the history of pathology. Mark Akenside, MD (1721-1770): first recorded description of multiple neurofibromatosis." Am J Pathol 92(1): 315.
Pascual-Castroviejo, I., A. Verdu, et al. (1988). "Optic glioma with progressive occlusion of the aqueduct of Sylvius in monozygotic twins with neurofibromatosis." Brain Dev 10(1): 24-9.
Petronis, A. (2006). "Epigenetics and twins: three variations on the theme." Trends Genet 22(7): 347-50.
Redon, R., S. Ishikawa, et al. (2006). "Global variation in copy number in the human genome." Nature 444(7118): 444-54.
Scherer, S. W., C. Lee, et al. (2007). "Challenges and standards in integrating surveys of structural variation." Nat Genet 39(7 Suppl): S7-15.
Schmidt, H., H. Taubert, et al. (2000). "Gains in chromosomes 7, 8q, 15q and 17q are characteristic changes in malignant but not in benign peripheral nerve sheath tumors from patients with Recklinghausen's disease." Cancer Lett 155(2): 181-90.
Sebat, J., B. Lakshmi, et al. (2007). "Strong association of de novo copy number mutations with autism." Science 316(5823): 445-9.
Serra, E., S. Puig, et al. (1997). "Confirmation of a double-hit model for the NF1 gene in benign neurofibromas." Am J Hum Genet 61(3): 512-9.
Stevenson, D. A., H. Zhou, et al. (2006). "Double inactivation of NF1 in tibial pseudarthrosis." Am J Hum Genet 79(1): 143-8.
Upadhyaya, M., M. Ruggieri, et al. (1998). "Gross deletions of the neurofibromatosis type 1 (NF1) gene are predominantly of maternal origin and commonly associated with a learning disability, dysmorphic features and developmental delay." Hum Genet 102(5): 591-7.
Wallace, M. R., D. A. Marchuk, et al. (1990). "Type 1 neurofibromatosis gene: identification of a large transcript disrupted in three NF1 patients." Science 249(4965): 181-6.
Wiest, V., I. Eisenbarth, et al. (2003). "Somatic NF1 mutation spectra in a family with neurofibromatosis type 1: toward a theory of genetic modifiers." Hum Mutat 22(6): 423-7.
Williams, A., C. G. Spilianakis, et al. "Interchromosomal association and gene regulation in trans." Trends Genet 26(4): 188-97.
Wong, K. K., R. J. deLeeuw, et al. (2007). "A comprehensive analysis of common copy-number variations in the human genome." Am J Hum Genet 80(1): 91-104.
Xu, B., J. L. Roos, et al. (2008). "Strong association of de novo copy number mutations with sporadic schizophrenia." Nat Genet 40(7): 880-5.
