Global Analysis of Gene Expression in the
Developing Brain of Gtf2ird1-/- Mice
by
Jennifer Anne O’Leary
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Molecular Genetics
University of Toronto
© Copyright by Jennifer Anne O’Leary (2011)
Abstract
Williams-Beuren Syndrome (WBS) is an autosomal dominant neurodevelopmental disorder caused by hemizygous deletion of a 1.5 Mb region on chromosome 7q11.23. Symptoms are numerous and include behavioural and cognitive components. One of the deleted genes, GTF2IRD1, a putative transcription factor, has been implicated in the neurological features of WBS by studies of patients with atypical deletions of 7q11.23. Gtf2ird1-targeted mice have features consistent with the WBS phenotype, namely reduced innate fear and increased sociability. To identify neural targets of GTF2IRD1, microarray analyses were performed comparing gene expression in whole brains of Gtf2ird1-/- and wildtype (WT) mice at embryonic day 15.5 and at birth. Overall, the changes in gene expression in the mutant mice were not striking, with most falling in the range of 0.3 to 2 fold. qRT-PCR was used to verify the expression levels of candidate genes, and examination of verified genes revealed that most were located on chromosome 5, within 50 Mb of Gtf2ird1. Expression of these candidate genes in Gtf2ird1-/- mice was found to be the same as in WT 129S1/SvImJ mice, indicating that the differences were the result of flanking chromosomal material from the 129-derived R1 ES cells from which the Gtf2ird1-/- mice were generated, and that expression differences were unrelated to Gtf2ird1 dosage. Further analysis found that while many genes showed decreased expression using primers targeting the 3’ UTR, expression of upstream exons was not affected. Transcripts using alternative polyadenylation sites were identified using 3’ RACE, and qRT-PCR showed that expression of different 3’ UTR isoforms can occur in a strain specific manner. Expression analysis of previously identified GTF2IRD1 targets also failed to demonstrate an in vivo effect. In summary, I was unable to find any in vivo neuronal targets of this putative transcription factor, despite its robust expression in the developing rodent brain.
Acknowledgements
The work that I completed over the past seven years would not have been possible without the help and support of many people. First, I must thank my supervisor, Dr. Lucy Osborne, for her guidance and support. It has been a pleasure working in her lab, and her ability to put a positive spin on my negative results kept me from getting too depressed as the list of genes that Gtf2ird1 does not regulate continued to grow. I would also like to thank my supervisory committee members, Dr. Sabine Cordes and Dr. Timothy Hughes, for their helpful insights, suggestions and technical assistance.
All members of the Osborne lab, past and present, have made the lab a great environment to work in. I will greatly miss the countless hours of “scientific” discussion, cookie days, and their company during “coffee time”. In particular, I owe a big thank you to Ted Young for teaching me how to be a good scientist. Although his Lil John impressions drove me crazy, he always came through with helpful advice when it was most needed.
Finally, none of this would have been possible without the continued support of my family, especially my parents. I was the first of their four children to enter university, and the last one to leave. Their unconditional love and encouragement have undoubtedly made it possible for me to be where I am today.
Table of Contents
Abstract
Acknowledgements
Table of Contents
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
Chapter I: Introduction
1.1 Williams-Beuren syndrome
1.1.1 History of Williams-Beuren syndrome
1.1.2 Williams-Beuren syndrome clinical phenotype
1.1.3 The Williams-Beuren syndrome cognitive phenotype
1.1.4 The Williams-Beuren syndrome behavioural phenotype
1.2 The genetic basis of Williams-Beuren syndrome
1.2.1 Identification of a microdeletion at 7q11.23
1.2.2 Genomic rearrangements at 7q11.23
1.2.3 Atypical deletions in the Williams-Beuren syndrome region
1.3 General Transcription Factor 2-I (GTF2-I) gene family
1.3.1 General Transcription Factor 2-I (TFII-I)
1.3.2 General Transcription Factor 2-I Repeat Domain containing 1 (TFII-IRD1)
1.3.3 General Transcription Factor 2-I Repeat Domain containing 2 (TFII-IRD2)
1.4 The Gtf2ird1 mouse model
1.4.1 Generation of the mouse model
1.4.2 Behavioural phenotypic analysis
1.4.3 Biochemical and electrophysiological phenotypic analysis
1.5 Research Aims and Hypothesis
Chapter II: TFII-IRD1 may not function as a transcription factor in the developing mouse brain
2.1 Abstract
2.2 Literature Review
2.2.1 Evidence supporting the role of TFII-IRD1 as a transcription factor
2.2.2 Cellular localization of TFII-IRD1
2.3 Material and Methods
2.3.1 Generation of probes for in situ hybridization
2.3.2 Whole mount in situ hybridization of Gtf2ird1-/- embryos
2.3.3 In situ hybridization of P0 mouse brain sections
2.3.4 Preparation and culture of mouse embryonic fibroblast (MEF) cells
2.3.5 Dissection of mouse tissues and RNA isolation
2.3.6 Genotyping of P0 and embryonic mice
2.3.7 Microarray analysis using the Affymetrix mouse 430 2.0 gene chip
2.3.8 Microarray analysis using the Illumina mouseWG-6 v2.0 BeadChip
2.3.9 Expression analysis using quantitative Real-Time PCR
2.3.10 siRNA knockdown of Gtf2ird1 in neuronal cell lines
2.3.11 Cellular localization of Gtf2ird1 in Neuro2a cells
2.3.12 Expression analysis using western blots
2.4 Results
2.4.1 Gtf2ird1 is expressed in the developing mouse brain
2.4.2 Expression of candidate target genes Hoxc8 and Gsc are not altered in E11.5 Gtf2ird1-/- mouse embryos
2.4.3 Expression of TFII-IRD1 candidate target genes identified in vitro are not altered in vivo
2.4.4 Global expression analysis of P0 mouse whole brain
2.4.5 Global expression analysis of E15.5 embryo heads
2.4.6 Validation of candidate gene expression using qRT-PCR
2.4.7 Knockdown of Gtf2ird1 in neuronal cell lines does not affect expression of candidate genes
2.4.8 Altered gene expression in Gtf2ird1-/- mice is the result of differences in genetic background
2.4.9 TFII-IRD1 is found in the cytoplasm of Neuro2a cells
2.5 Discussion
2.5.1 Targets of TFII-IRD1 identified in vitro
2.5.2 Global analysis of gene expression in Gtf2ird1-/- mice
2.5.3 Cellular localization of TFII-IRD1
Chapter III: Exon specific differences in gene expression between different mouse strains
3.1 Abstract
3.2 Literature Review
3.2.1 Polyadenylation of pre-mRNA
3.2.2 Transcription termination
3.2.3 Strain specific gene expression
3.3 Material and Methods
3.3.1 Expression analysis using quantitative Real-Time PCR
3.3.2 Generation of probes for Northern blots
3.3.3 Northern blot analysis
3.3.4 3’ Rapid Amplification of cDNA ends (RACE)
3.3.5 Cloning and sequencing of 3’ RACE products
3.3.6 Expression analysis using western blots
3.4 Results
3.4.1 Differential gene expression detected in Gtf2ird1-/- mice is exon specific
3.4.2 Northern blot analysis does not detect novel alternatively spliced transcripts
3.4.3 Alternative splicing in the 3’UTR identified using 3’ RACE
3.4.4 Expression levels of different 3’UTR isoforms differ between genotypes
3.4.5 Expression of Stx3 is variable and does not correlate with genotype
3.4.6 Differences in gene expression of genes located close to the Gtf2ird1 locus are related to genetic background
3.4.7 Differentially expressed exons do not affect protein levels of Stx3
3.5 Discussion
3.5.1 Alternative splicing in the 3’ UTR
3.5.2 Use of alternative polyadenylation sites
3.5.3 qRT-PCR validation of microarrays
Chapter IV: Summary and Future Directions
4.1 Summary
4.2 Further investigation of GTF2IRD1 function
4.3 Further investigation of alternative polyA site selection
4.4 Conclusion
References
LIST OF TABLES
CHAPTER I: Introduction to Williams-Beuren syndrome
Table 1.1 Genes located in the WBS deletion region……………………………... 11
CHAPTER II: TFII-IRD1 may not function as a transcription factor in the developing mouse brain
Table 2.1 Sequences of primers used in qRT-PCR
Table 2.2 siRNA sequences used to knockdown Gtf2ird1
Table 2.3 Genes found to have altered expression in the brains of Gtf2ird1-/- P0 mice by microarray
Table 2.4 Genes found to have altered expression in the heads of Gtf2ird1-/- E15.5 mice by microarray
Table 2.5 Comparison of SNPs in the 3’UTR of Zfp68 in Gtf2ird1-/- mice and WT mice relative to 129S1/SvImJ mice
CHAPTER III: Exon specific differences in gene expression between different mouse strains
Table 3.1 Sequences of primers used in qRT-PCR
Table 3.2 Sequences of primers used to generate Northern blot probes
Table 3.3 Sequences of primers used in synthesis of first strand cDNA from 3' RACE
LIST OF FIGURES
CHAPTER I: Introduction to Williams-Beuren syndrome
Figure 1.1 Characteristic facial features of WBS
Figure 1.2 Grammar skills in WBS patients
Figure 1.3 Visual spatial skills in WBS patients
Figure 1.4 Physical map of the WBS region
Figure 1.5 Mechanisms of non-allelic homologous recombination
Figure 1.6 Patients with atypical deletions in the WBS region
Figure 1.7 Structural elements of the TFII-I proteins
Figure 1.8 Synteny between human chromosome 7q11.23 and mouse 5G
CHAPTER II: TFII-IRD1 may not function as a transcription factor in the developing mouse brain
Figure 2.1 Gtf2ird1 expression in E11.5 and P0 mice
Figure 2.2 Expression of Hoxc8 and Gsc in Gtf2ird1-/- mice
Figure 2.3 Expression pattern of Hoxc8 in Gtf2ird1-/- mice
Figure 2.4 Expression of Bmpr1b and Fgf15 in Gtf2ird1-/- mice
Figure 2.5 In vivo expression of TFII-IRD1 target genes identified in MEFs
Figure 2.6 qRT-PCR validation of expression of candidate genes identified in P0 mice
Figure 2.7 qRT-PCR validation of expression of candidate genes identified in E15.5 mice
Figure 2.8 Knockdown of Gtf2ird1 in Neuro2A cells
Figure 2.9 Expression of candidate genes in Gtf2ird1 siRNA treated neuronal cells
Figure 2.10 Expression of candidate genes in the brain of different mouse strains
Figure 2.11 TFII-IRD1 expression in transfected Neuro2A cells
Figure 2.12 Localization of TFII-IRD1 in Neuro2A cells
CHAPTER III: Exon specific differences in gene expression between different mouse strains
Figure 3.1 Exon specific differences in gene expression in P0 Gtf2ird1-/- mice
Figure 3.2 Exon specific differences in gene expression in E15.5 Gtf2ird1-/- mice
Figure 3.3 Northern blot analysis of Stx3, Kin, Mrpl16 and Pex1 expression
Figure 3.4 Mrpl16 and Stx3 transcripts identified using 3' RACE
Figure 3.5 Zfp68 transcripts identified using 3' RACE
Figure 3.6 Coq2 transcripts identified using 3' RACE
Figure 3.7 Ap4m1 and Taf6 transcripts identified using 3' RACE
Figure 3.8 Actl6b transcripts identified using 3' RACE
Figure 3.9 Exon specific changes in Zfp68 expression
Figure 3.10 Exon specific changes in Coq2 expression
Figure 3.11 Exon specific changes in Ap4m1 and Taf6 expression
Figure 3.12 Exon specific changes in Stx3 and Mrpl16 expression
Figure 3.13 Stx3 expression shows natural variation unrelated to genotype
Figure 3.14 Exon specific differences in gene expression between different mouse strains
Figure 3.15 STX3 expression in Gtf2ird1-/- and WT mice
LIST OF ABBREVIATIONS
5-HIAA  5-hydroxyindoleacetic acid
5HT  Serotonin
ADHD  Attention Deficit Hyperactivity Disorder
AdML  Adenovirus Major Late
AUAP  Abridged Universal Amplification Primer
BCR  B Cell Antigen Receptor
BDNF  Brain-Derived Neurotrophic Factor
BEN  Binding Factor for Early Enhancer
Btk  Bruton's Tyrosine Kinase
CFIIm  Cleavage Factor II
CFIm  Cleavage Factor I
ChIP  Chromatin Immunoprecipitation
CMT1A  Charcot-Marie-Tooth Neuropathy Type 1A
CPSF  Cleavage and Polyadenylation Specificity Factor
CREAM  Containing Repetitive Eighty-Six Amino-Acid Motif
CstF  Cleavage Stimulatory Factor
CTD  C-Terminal Domain
DE  Distal Element
DICE  Downstream Immunoglobulin Control Element
D-MEM  Dulbecco's Modified Eagles Medium
DNS  Down Syndrome
E  Embryonic day (Days Post-Conception)
ECL  Enhanced Chemiluminescence
EE  Early Enhancer
ELN  Elastin
EMSA  Electrophoretic Mobility Shift Analysis
FACS  Fluorescence Activated Cell Sorting
FDR  False Discovery Rate
GSC  Goosecoid
GTF2I  General Transcription Factor 2-I
GTF2IRD1  General Transcription Factor 2-I Repeat Domain Containing 1
GTF2IRD2  General Transcription Factor 2-I Repeat Domain Containing 2
GUR  GTF2IRD1 Upstream Region
HDAC3  Histone Deacetylase 3
HLH  Helix-Loop-Helix
HPLC  High-Performance Liquid Chromatography
IgH  Immunoglobulin Heavy Chain
IgM H-chain  Immunoglobulin M Heavy Chain
Inr  Initiator
LCR  Low Copy Repeat
LIMMA  Linear Models for Microarray Data
LTP  Long Term Potentiation
LZ  Leucine Zipper
MEF  Mouse Embryonic Fibroblast
MEF2C  Myocyte Enhancer Factor
miRNA  MicroRNA
MusTRD1  Muscle TFII-I Repeat Domain-Containing Protein 1
N2A  Neuro2A
NAHR  Non Allelic Homologous Recombination
NCoR  Nuclear Receptor Co-Repressor
PAP  Poly(A) Polymerase
PAPOLG  Poly(A) Polymerase γ
PBS  Phosphate Buffered Saline
PBT  PBS + 0.1% Tween-20
PE  Proximal Element
PFA  Paraformaldehyde
qRT-PCR  Quantitative Real-Time PCR
RACE  Rapid Amplification of cDNA Ends
RMA  Robust Multiarray Analysis
RNAPII  RNA Polymerase II
SAM  Significance Analysis of Microarrays
Sdha  Succinate Dehydrogenase
SELEX  Systematic Evolution of Ligands by Exponential Enrichment
siRNA  Small Interfering RNA
SPIN  SRF-Phox1 Interacting Protein
SRE  Serum Response Element
SRF  Serum Response Factor
SVAS  Supravalvular Aortic Stenosis
TBS  Tris-Buffered Saline
TBS-T  Tris-Buffered Saline Tween-20
TCAG  The Centre for Applied Genomics
TGFβ  Transforming Growth Factor Beta
TnIs  Troponin, Slow Isoform
TRPC3  Transient Receptor Potential Channel 3
UAP  Universal Amplification Primer
USE  Upstream Regulatory Element
USF1  Upstream Stimulatory Factor 1
VEGFR-2  Vascular Endothelial Growth Factor Receptor-2
WBS  Williams-Beuren Syndrome
WT  Wildtype
Chapter I: Introduction
1.1 Williams-Beuren syndrome
1.1.1 History of Williams-Beuren syndrome
The first reports associated with Williams-Beuren syndrome (WBS) appeared in Europe during the early 1950s [1,2]. During this period there were many reported cases of idiopathic infantile hypercalcemia in England, which were found to be caused by excessive vitamin D intake in children who ate government supplied formulas and cereals containing dietary supplements [3]. A sub-group of children with a severe form of infantile hypercalcemia was noted who could not be cured by dietary restriction of vitamin D [3]. This group also suffered from generalized retardation, heart murmurs, characteristic facial features, and renal impairment [1,2,4]. It was postulated by Lightwood and Stapleton that this group of children represented a distinct clinical syndrome [1].
In 1961 Williams et al. reported four patients with a localized narrowing of the ascending aorta, a condition known as supravalvular aortic stenosis (SVAS). These patients also had mental retardation and facial features that were similar to each other [5]. One year later Beuren et al. reported three more patients with a similar phenotype, and noted that they all “have the same kind of friendly nature – they love everyone, are loved by everyone, and are very charming” [6]. The similarities between the children with severe infantile hypercalcemia [4] and those with SVAS reported by Williams et al. and Beuren et al. were first noted by Black and Bonham Carter in 1963 [7]. They reported five additional children with aortic systolic murmurs, which are characteristic of aortic stenosis. A review of their early case histories revealed that the children also had many of the characteristic features associated with hypercalcemia [7]. These attributes, namely infantile hypercalcemia, SVAS, mental retardation, a friendly personality and characteristic facial features, are hallmarks of the disorder which is known today as Williams-Beuren Syndrome.
WBS was originally thought to be caused by problems with vitamin D metabolism in the mother, the fetus or both [8]. The offspring of rabbits that were fed excessive amounts of vitamin D during pregnancy were born with aortic lesions that had a similar histology to the SVAS seen in people [9]. Further studies found the offspring also had other symptoms which are features of WBS, namely dental anomalies, peculiar facial features, low birth weight and strabismus [8].
Evidence that WBS is a genetic disorder was found approximately 40 years after the initial reports of the syndrome. Two lines of evidence pointed to the genetic basis of WBS. First, cases of parent-to-child transmission of WBS were reported, indicating that it is an autosomal dominant disorder [10]. Second, the region of the genome responsible for WBS was identified following the discovery that disruptions of the elastin gene (ELN) cause SVAS [11]. ELN was found to be deleted in individuals with WBS, but as the deletion of ELN alone was unlikely to cause the full spectrum of phenotypes seen in WBS, it was postulated that other genes must also be included in the deletion [12]. Since this discovery, researchers have been focused on determining which genes are deleted in WBS patients and the role that each of these genes plays in the phenotype.
1.1.2 Williams-Beuren syndrome clinical phenotype
WBS is a relatively common disorder, with a prevalence of approximately 1 in 7500 live births [13]. The full phenotype consists of a number of physical abnormalities along with characteristic behavioural and cognitive features. Patients have distinguishing craniofacial features (figure 1.1), including dolichocephaly (a disproportionately long and narrow head), bitemporal depressions and asymmetry [14]. Their cheeks are full with malar flattening, and the nose has a bulbous tip and low nasal root [14,15]. Their eyes often have a stellate pattern, with periorbital fullness and epicanthal folds. They have a small jaw, dental malocclusion, small and widely spaced teeth and lips that are wide and full [15].
In addition to craniofacial abnormalities, neuroanatomical abnormalities are also present in WBS patients. The overall brain and cerebral volumes are decreased, with a relative preservation of the cerebellum and superior temporal gyrus and a disproportionate decrease in the brainstem volume [16]. Reductions in sulcal length and depth have also been reported; the central sulcus is 1-2 cm shorter in WBS patients than in control subjects [17], and the intraparietal/occipitoparietal sulcus is 8.5 mm shallower on average [18].
Individuals with WBS show a pattern of retarded growth beginning in utero [19]. Failure to thrive occurs in 80% of infants as a result of colic, gastroesophageal reflux and constipation [15]. The rate of growth during childhood is 75% of normal, and as adults 70% of people with WBS will remain shorter than the height predicted by their genetic background [15,20]. The musculoskeletal system is also affected, with common problems including scoliosis, lordosis, kyphosis and radioulnar synostosis [15].
Figure 1.1 – Characteristic WBS facial features. The same individual with WBS is shown at three different ages. He displays the characteristic facial features including malar flattening, a bulbous nasal tip and dolichocephaly.
Incidences of SVAS and infantile hypercalcemia led to the discovery of WBS; however, neither is an obligate symptom of the disorder [21]. Hypercalcemia has been documented in 15% of individuals with WBS [15], and is diagnosed by measuring the serum levels of ionized calcium; an upper limit of 1.35 mmol/L is considered normal and values in excess of this, with or without elevated total calcium levels, would be considered hypercalcemic [22]. Children with mildly increased levels are generally asymptomatic, while the severe cases that occur with WBS can lead to vomiting, poor feeding, irritability and/or seizures.
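The diagnostic rule quoted above can be written as a one-line comparison. The following is only an illustrative sketch: the function and constant names are hypothetical, the 1.35 cutoff is the value quoted in the text, and no clinical use is implied.

```python
# Illustrative sketch only; names are hypothetical and not taken from the thesis.
# The cutoff is the upper limit of normal for serum ionized calcium quoted in
# the text, expressed in the same units as the text.
IONIZED_CALCIUM_UPPER_LIMIT = 1.35


def is_hypercalcemic(ionized_calcium: float) -> bool:
    """Return True when serum ionized calcium exceeds the upper limit of normal.

    Total calcium is deliberately not an argument: per the text, the call is
    made on ionized calcium with or without elevated total calcium.
    """
    return ionized_calcium > IONIZED_CALCIUM_UPPER_LIMIT
```

A value exactly at the limit is treated as normal here, because the text describes only values in excess of the limit as hypercalcemic.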
Cardiovascular problems are common in WBS, with 84% of patients having at least one type of abnormality [23]. Stenosis (an abnormal narrowing in the vasculature) is the most common type of cardiovascular clinical finding seen in WBS, occurring as a result of smooth muscle overgrowth which results in a thickening of the vascular media [24]. SVAS, which occurs when stenosis is located above the aortic valve, is seen in 69% of individuals with WBS, while pulmonary arterial stenosis occurs in 34% of individuals [23]. Other common cardiovascular conditions, and the percentage of patients affected, include hypertension (17-50%), mitral valve disease (15%), coarctation of the aorta (4%), and pulmonary valve disease (5%) [23,24].
Urinary tract problems are common, including structural defects of the kidneys, bladder diverticulae, nephrocalcinosis, frequent urinary tract infections and enuresis during childhood [15]. Individuals with WBS may also suffer from numerous gastrointestinal problems including feeding problems, reflux, constipation, colon diverticulosis and chronic abdominal pain [15]. The endocrine system in WBS patients is also affected. Hypercalciuria (excessive calcium excretion through urination) is common, either alone or in conjunction with hypercalcemia [24]. Up to 30% of patients are diagnosed with subclinical hypothyroidism [24]. Adults with WBS are at increased risk of developing diabetes mellitus; in one study only 10% of participants had normal results on an oral glucose-tolerance test [25].
1.1.3 The Williams-Beuren syndrome cognitive phenotype
Typical WBS patients have mild mental retardation with an average IQ of 55 [26], although individual IQs may range from 40 to 100 [27]. It is important to note that individuals with WBS differ from other individuals with similar IQs in that their abilities generally show a characteristic pattern of strengths and weaknesses. Children with WBS generally achieve developmental milestones at a later age than typical children; the development of language skills is a clear example of this, with only 14% of 26 month olds with WBS having a vocabulary size that is above the fifth percentile of the general population [28]. However, as the children grow older, expressive language becomes a relative strength. Their grammar, vocabulary, syntactic processing and semantic fluency skills are much stronger than those seen in individuals with Down syndrome (DNS) matched for age and IQ, and in some cases close to those seen in normal controls [26] (figure 1.2). Other relative strengths in individuals with WBS include facial processing [26] (the ability to recognize and remember both familiar and unfamiliar faces) and auditory rote memory [15].
In contrast to these strengths, visual-spatial processing is a relative weakness for WBS patients. Their drawings lack organization and the individual elements are not cohesive; this is true regardless of whether they are copying an image or design placed in front of them or drawing freely [15,26] (figure 1.3A). Block design tests have given interesting insight into how individuals with WBS process visual-spatial information. In these tests participants are asked to replicate a geometric pattern by arranging a set of blocks which have sides coloured red, white or half-and-half. Individuals with WBS focus on the small details of the design and are unable to replicate the global configuration. This is the opposite of what is seen in individuals with DNS, who focus on the global organization of the blocks but are unable to replicate the specific pattern [26]. These different ways of processing information were more clearly illustrated in a study where groups of children with WBS and DNS, who were matched for age and IQ, were asked to copy a large global figure which was made up of smaller local components (a “D” shape composed of smaller “Y”s). As would be expected from the block design experiments, the children with DNS focused on the global configuration and reproduced the “D” shape, while the children with WBS focused on the local forms and reproduced the “Y”s arranged haphazardly on the page [26] (figure 1.3B).
Figure 1.2. Grammar skills in WBS patients. Children with Williams-Beuren syndrome (WBS) and Down syndrome (DNS) who were matched for age and IQ were asked the conditional question “What if you were a bird?”. Children with WBS performed better with respect to grammar and content. (Adapted from Bellugi et al., 2000 [26])
Figure 1.3. Visual spatial skills in children with Williams-Beuren syndrome (WBS) and Down syndrome (DNS). (A) Children were asked to draw a bicycle; the child with WBS had difficulty properly connecting all the elements together. (B) Children were asked to copy the model image of a “D” made up of smaller “Y”s; the children with WBS could not arrange the smaller components into the proper global configuration. (Adapted from Bellugi et al., 2000 [26])
1.1.4 The Williams-Beuren syndrome behavioural phenotype
In addition to a unique cognitive profile, WBS typically includes a unique behavioural profile. Individuals are often described as “over-friendly” or “hyper-social”, and generally suffer from anxiety and simple phobias. The friendly personality is evident even during infancy, when babies will engage with people around them through eye contact, smiling and cooing [29]. They enjoy interacting with others to the point that it may affect their ability to complete another task. While IQ tests were being administered to seven toddlers with WBS, it was noted that five of the seven children were unable to perform the cognitive task in front of them because they were more interested in the examiner’s face [29]. People with WBS have no apparent fear of strangers, and frequently approach them to begin conversations. Despite this social disinhibition, children with WBS have difficulty cultivating friendships, especially with their peers. Numerically speaking, they have fewer friends and participate in fewer activities than children with DNS [30].
Individuals with mental retardation are at a greater risk of developing psychiatric disorders and maladaptive behaviours than the general public; however, hyperactivity, difficulty concentrating and attention deficit hyperactivity disorder (ADHD) occur more frequently in WBS than in other mental retardation disorders [31]. In addition, people with WBS often suffer from fear, anxiety and phobias. In one study a group of individuals with WBS ranging in age from 8-39 years and their mothers were asked questions about their fears and compared to a control group composed of individuals with mental retardation of various etiologies [31]. When asked open ended questions, the WBS group reported an average of 3.78 fears per person as opposed to 2.45 in the control group. The most frequent fears mentioned in the WBS group were thunderstorms (47%), loud sounds (22%), death/dead people (22%), high places (22%) and ghosts or spooky things (19%); this differs slightly from the fears named in the control group, which were most commonly ghosts/spooky things (29%), snakes (21%) and high places (17%).
While only 16-18% of individuals with WBS meet the clinical criteria for a diagnosis of generalized anxiety disorder, a majority of individuals experience one or more of the associated symptoms, including “excessive worry about the future”, is a “worrier”, “becomes sick from worry” and “shows an inability to relax” [31]. Specific phobias are also common, with 35% of individuals meeting the clinical criteria for diagnosis. Nearly all individuals meet two of the three criteria necessary for a clinical diagnosis, namely “marked, persistent, anxiety-producing fears” (96%) and “avoid fearful stimuli or endure with distress” (84%); however, only 35% exhibit the final symptom, “impaired adaptive functioning” [31]. In contrast, other studies have reported phobias in 0.6-4.3% of people with mental retardation and 2.3-2.4% of normal individuals [31].
1.2 The genetic basis of Williams-Beuren syndrome
1.2.1 Identification of a microdeletion at 7q11.23
SVAS has long been known to occur either as part of WBS, or as an independently occurring autosomal dominant trait [32]. In 1993, Curran et al. identified a family with SVAS and demonstrated that a translocation on chromosome 7 which disrupted the elastin gene (ELN) segregated with SVAS in this family [11]. ELN had previously been mapped to 7q11.2 [33], and the identification of this mutation provided the first clue as to the region of the genome responsible for WBS. It was postulated that haploinsufficiency for ELN also occurs in WBS and, later that year, Ewart et al. used Southern blots to show that people with WBS are indeed hemizygous for ELN [12]. The deletions in WBS patients were shown to extend beyond the ELN locus, encompassing at least 114 kb, and indicating that neighbouring genes were likely to play a role in the disorder [12].
In the years following this discovery researchers attempted to identify the size of the deletion that occurs in WBS and the genes involved, and to understand the mechanism of deletion. Polymorphic DNA markers were initially used to show the deletion extends at least 500 kb [34], and it was shown that repeated sequences flank the deletion [35], which was the first indication of how the deletion might occur. By 1999 it had been established that the typical deletion seen in WBS patients is ~1.5 Mb, occurs at 7q11.23 [36], contains 28 genes (table 1.1), and is flanked by three different low copy repeat (LCR) sequences [37] (figure 1.4).
Figure 1.4. Physical map of the WBS region. Arrangement of genes and blocks of low copy repeats (LCRs) located at 7q11.23. The centromeric, medial and telomeric LCRs are shown, with arrows underneath indicating the relative orientations of the sequence. The red box highlights the 28 genes which are typically deleted in WBS. (Adapted from Pober, 2010 [24])
Table 1.1 Genes located in the Williams-Beuren syndrome deletion region (Adapted from Tassabehji, 2003 [38])
Gene | Description | Function
NSUN5 (WBSCR20) | NOP2/Sun domain family, member 5 | Protein with a NOL1/NOP2/sun domain. May play a role in the regulation of the cell cycle
TRIM50 | Tripartite motif-containing 50 | Encodes an E3 ubiquitin ligase
FKBP6 | FK506-binding protein 6 | Immunophilin protein. Role in male fertility and homologous chromosome pairing in meiosis
FZD9 | Frizzled drosophila homolog of 9 | 'Frizzled' proteins act as receptors for Wnt signalling proteins. May be involved in tissue polarity and development
BAZ1B | Bromodomain adjacent to zinc finger domain 1B | Protein with a bromodomain. May be involved in chromatin-dependent regulation of transcription
BCL7B | B-cell CLL/lymphoma 7B | Member of BCL7 protein family. Unknown function
TBL2 | Transducin-β-like 2 | β-transducin protein with four putative WD40 repeats. May play a role in intracellular signalling pathways or cytoskeletal organization
MLXIPL (WBSCR14) | Max-like protein interacting protein-like | bHLH-LZ transcription factor. May play a role in cell proliferation and/or differentiation
VPS37D (WBSCR24) | Vacuolar protein sorting 37 homolog D | Regulator of vesicular trafficking. Possible role in cell growth and differentiation
DNAJC30 (WBSCR18) | DnaJ (Hsp40) homolog, subfamily C, member 30 | Protein has DnaJ domain involved in protein folding
WBSCR22 | WBS critical region 22 | Protein with S-adenosyl-L-methionine binding motif. May be involved in DNA methylation
STX1A | Syntaxin 1A | Syntaxin 1A protein plays a key role in intracellular transport and neurotransmitter release
ABHD11 (WBSCR21) | Abhydrolase domain containing 11 | Protein has an α/β hydrolase fold domain. Unknown function
CLDN3 | Claudin 3 | Protein component of tight junction strands in liver epithelial cells. Role in maintaining cellular polarity
CLDN4 | Claudin 4 | Protein component of tight junction strands in kidney epithelial cells. Role in maintaining cellular polarity
WBSCR27 | WBS critical region 27 | Protein belongs to the ubiE/COQ5 methyltransferase family
WBSCR28 | WBS critical region 28 | Unknown function
ELN | Elastin | Structural protein, component of elastic fibres. Role in arterial morphogenesis
LIMK1 | LIM kinase 1 | Serine/threonine kinase with LIM domains. Role in actin cytoskeletal reorganization essential for directional movement of neurons
EIF4H (WBSCR1) | Eukaryotic initiation factor 4H | Protein contains an RNA recognition motif. Stimulates initiation of protein synthesis at the level of mRNA utilization
LAT2 (WBSCR5) | Linker for activation of T cells | Role in immune cell development
Replication factor C, Component of replication factor C complex which is an RFC2 subunit 2 activator of DNA polymerases during replication Cytoplasmic linker protein. Role in regulating CYLN2 Cytoplasmic linker 2 microtubule dynamics General transcription Member of GTF2I transcription factor family . May play GTF2IRD1 factor 2-I repeat domain a role in activating/repressing gene transcription containing 1
WBSCR23 WBS critical region 23 Intronless gene. Unknown function
General transcription Multifunctional transcription factor. Functions both as GTF2I factor 2-I a basal factor and as an activator neutrophil cytosolic Component of phagocyte NADPH-oxidase system. Role NCF1 factor 1 in immunity General transcription gene containing both I-repeats and a Charlie-8-like GTF2IRD2 factor 2-I repeat domain transposase motif. Function unknown containing 2
1.2.2 Genomic rearrangements at 7q11.23
The segmental duplications occurring at 7q11.23 contain three different repeated sequences, designated “A”, “B” and “C”. There are three blocks of segmental duplications, the centromeric, telomeric and medial LCRs, that flank the deleted segment of DNA. The centromeric and medial blocks of repeats are in a different order relative to each other, but are in
the same orientation. The telomeric block contains repeated sequences which are in the same
order as the centromeric block, but are in the opposite orientation39 (figure 1.4). The LCRs
contain transcribed genes, pseudogenes and putative telomere associated repeats40.
WBS genomic deletions arise from non-allelic homologous recombination (NAHR)
occurring between the highly similar LCR sequences. The majority (95%) of individuals with
WBS carry a 1.55 Mb deletion resulting from unequal crossing over occurring between the
centromeric and medial “B” repeats39. Five percent of patients carry a larger (1.84 Mb) deletion
that arises when unequal recombination occurs between the “A” repeat blocks in the centromeric
and medial LCRs39 (figure 1.4). It is hypothesized that the deletion breakpoints are most likely to occur in the “B” blocks due to the high sequence similarity between the blocks (99.6% with no large gaps; the “A” blocks are only 98.2% identical with two large gaps), and the shorter physical distance between the two “B” blocks39.
NAHR can occur between homologous chromosomes (interchromosomal), homologous
chromatids (interchromatidal) or within a chromatid (intrachromatidal). Intrachromatidal NAHR
will result in a chromosome with a deletion of the WBS region and create an acentric
chromosome fragment that will be lost (figure 1.5). In contrast, both interchromosomal and
interchromatidal NAHR result in one chromosome with a deletion of the WBS region and another
with the reciprocal duplication (figure 1.5). A small number of patients with the reciprocal
duplication have been identified. The duplication results in a syndrome that is distinct from
WBS, with the key feature being an impairment in expressive language41,42.
Approximately 5% of the population carries a paracentric inversion containing the region
typically deleted in individuals with WBS37. The inversion is created as a result of NAHR
occurring between centromeric and telomeric LCRs which are inverted relative to each other37.
The expression of genes contained within the inverted region is unaffected, and there are no clinical symptoms associated with the inversion43. However, 25-33% of individuals with WBS received the affected chromosome from a parent who carries an inversion of the region44,45. This indicates that the presence of the inversion predisposes an individual to undergo NAHR at this locus.
Figure 1.5 Mechanisms of inter- and intrachromatid non-allelic homologous recombination (NAHR). (A) Interchromosomal and interchromatidal NAHR result in deletions and duplications of the intervening region. (B) Intrachromatidal NAHR results only in deletions of the intervening region. Arrows indicate the direction of the centromeric (cen), medial (mid) and telomeric (tel) low copy repeat blocks (A, B & C) (taken from Schubert, 200937).
1.2.3 Atypical deletions in the Williams-Beuren syndrome region
To date, only the role of ELN in the WBS phenotype has been elucidated; however, the
discovery of patients with atypical deletions of genes in the WBS region has begun to provide
some clues to the correlation between phenotype and genotype in this disorder. The first cases
of atypical 7q11.23 deletions were reported in 1999 by Botta et al., who identified two individuals
with the full spectrum of WBS phenotypes who carried smaller deletions which began at the
common telomeric breakpoint and extended up to, and including, the ELN locus46. A third
patient with a similar deletion who also displayed symptoms of typical WBS was later
reported47 (figure 1.6). These findings indicate that the deletion of the nine genes from ELN to
GTF2I (at the telomeric end of the deletion) is sufficient to cause the phenotypes typically seen
in individuals with WBS; this region is referred to as the “minimal critical region”. It can then be concluded that haploinsufficiency for more than one of these genes is necessary for WBS to occur. ELN is known to cause SVAS, but the specific gene(s) which cause the remainder of the symptoms have yet to be discovered.
Figure 1.6 Published cases of individuals with atypical deletions in the WBS region46-53. The genes located in the region are shown at the top. Boxes labelled C, M & T represent the centromeric, medial and telomeric LCRs respectively. The deleted region in each individual is depicted by a thin line. The last deletion represents what is typically seen in individuals with WBS.
A number of other atypical patients have been reported with deletions which spare one or
more of the genes in the minimal critical region48-53 (Figure 1.6). By comparing the different
phenotypes reported in these patients it has been determined that genes at the telomeric end of the deletion appear to be responsible for the behavioural and cognitive aspects of the WBS phenotype. Patients who retain two copies of GTF2IRD1 and/or GTF2I, members of the General
Transcription Factor 2-I (GTF2-I) gene family, generally do not show the traditional dysmorphic
facial features associated with WBS (or show very mild features), perform better on visual-
spatial processing tasks than individuals with a full deletion of the WBS region, and have more
normal intelligence.
1.3 General Transcription Factor 2-I (GTF2-I) gene family
The GTF2-I gene family consists of three genes: General Transcription Factor 2-I
(GTF2I), General Transcription Factor 2-I Repeat Domain Containing 1 (GTF2IRD1), and
General Transcription Factor 2-I Repeat Domain Containing 2 (GTF2IRD2) (Figure 1.7). These genes code for the proteins TFII-I, TFII-IRD1 and TFII-IRD2 respectively. Members of the
TFII-I protein family are characterized by the unique I-repeat domains they contain. The I-repeats are helix-loop-helix (HLH)-like domains which are highly conserved between family members54,55. HLH domains contain two α-helices connected by a loop which can range from 5 to
25 amino acids56. The HLH-like domains found in the TFII-I gene family differ from traditional
HLH domains in that the loop region is much larger (~40 amino acids)55. Proteins containing
HLH domains are known to form both homo- and heterodimers, and to bind to DNA (when the
HLH domain is preceded by a basic region)56.
Each member of the TFII-I family also contains a leucine zipper (LZ) located at the N-
terminus. The LZ domain is a dimerization motif which can be used to form homo- and
heterodimers55. There is experimental evidence that TFII-I and TFII-IRD1 can form homodimers in vitro, mediated by the LZ; however, there is conflicting evidence regarding their ability to form heterodimers57-59. Sequence analysis of the LZ region in all three GTF2-I family
members by Hinsley et al. indicates that while TFII-I and TFII-IRD2 may be able to form
heterodimers, it is highly unlikely that TFII-I and TFII-IRD1 heterodimerize55.
Only one gene containing HLH-like I-repeats can be detected in Danio rerio and
Takifugu rubripes, and the sequence of this gene is highly similar to GTF2IRD160. Based on this information, GTF2IRD1 is believed to be the ancestral gene of the gene family, with GTF2I
arising second through duplication and divergence. Based on sequence analysis, GTF2IRD2
appears to be more closely related to GTF2I than to GTF2IRD1, and is likely to have been the
third GTF2-I family member to be created60.
Figure 1.7 Structural elements of the TFII-I proteins encoded by the GTF2I gene family. The different TFII-I isoforms result from the inclusion/exclusion of the A and B exons (the green and grey rectangles respectively). The I-repeats are shown as purple boxes (R1-R6).
1.3.1 General Transcription Factor 2-I (TFII-I)
TFII-I (TFII-I/BAP-135/SPIN) was the first member of the TFII-I family to be identified.
It was first identified in 1991 as a protein that is able to activate basal transcription of the adenovirus major late promoter (AdML) by binding to the initiator (Inr) sequence61. It was also
shown that TFII-I binds to an upstream E-box element which is usually recognized by HLH-containing proteins such as upstream stimulatory factor 1 (USF1), and is able to cooperate with
USF to activate transcription of the AdML promoter61. These results indicated that TFII-I could
function as both a basal transcription factor and a transcriptional co-activator. It was later shown
that TFII-I can initiate transcription through interactions with the E-box element, even in
promoters which do not contain an Inr sequence54.
TFII-I was independently identified twice in 1997: as a phosphorylation target of
Bruton’s tyrosine kinase (Btk)62, and as a protein involved in serum-induced expression of the
c-FOS gene63. Following binding of an antigen to the B cell antigen receptor (BCR), a cascade of
tyrosine phosphorylation occurs, resulting in the activation of a number of different pathways
and, ultimately, in proliferation and differentiation of the cell. When these pathways are
compromised it can result in immunodeficiencies such as X-linked agammaglobulinemia62. Btk
is a Src-related tyrosine kinase which is activated within minutes of a B cell encountering an
antigen64. In order to identify proteins that interact with Btk in vivo, Yang and Desiderio immunoprecipitated Btk from a human B lymphoid cell line, and identified an associated protein with a molecular mass of 135 kDa62. They named this protein Btk-associated protein of 135 kDa
(BAP-135), and determined that following activation of Btk through phosphorylation, Btk goes
on to phosphorylate a tyrosine residue of BAP-135. Later that year, it was determined through
sequence analysis that TFII-I and BAP-135 were the same protein54.
In response to extracellular signals, the c-FOS gene is activated following binding of serum response factor (SRF) to the serum response element (SRE) in the c-FOS promoter63.
Interactions between the protein Phox1 and SRF facilitate binding of SRF to the SRE. While
attempting to reconstitute the binding that occurs between these proteins in vivo under in vitro
conditions, Grueneberg et al. identified a protein they named SRF-Phox1 interacting protein
(SPIN)63. The addition of SPIN to SRF and Phox1 allowed for the formation of a stable complex
which could bind to the SRE. SPIN was found to bind to multiple sites in the c-FOS promoter,
and through interactions with SRF and Phox1, SPIN is able to induce expression of a c-FOS reporter gene in response to serum63. Cloning and sequencing of SPIN cDNA revealed that
SPIN is identical to TFII-I63. It was later shown that in order for TFII-I to activate c-FOS expression, TFII-I must be phosphorylated at a specific tyrosine residue, which results in the translocation of TFII-I to the nucleus65. Phosphorylation of TFII-I occurs as a result of extracellular signals, and so it has been proposed that TFII-I is able to link signal transduction events to transcription.
There are four known splice forms of TFII-I in humans (α-, β-, γ- and ∆-; figure 1.7), three of which are present in mice (β-, γ- and ∆-)66. Each of the isoforms shows a similar
subcellular distribution when expressed ectopically in COS cells66. The different isoforms are all
capable of both homo- and heteromeric interactions both in vitro and in vivo, and it has been
proposed that different combinations of isoforms may play specific roles in the transcriptional
regulation of target genes66. Each of the isoforms is capable of binding DNA66, and studies on
the ∆-isoform have shown that deletion of either the leucine zipper region, or a basic region
which precedes the second I-repeat, is sufficient to impede binding of the protein to an Inr
sequence, and activation of a reporter gene58.
In addition to AdML and c-FOS, TFII-I has been shown to play a direct role in the activation of vascular endothelial growth factor receptor-2 (VEGFR-2)67 and goosecoid (GSC)68.
TFII-I may also play an indirect role in transcriptional control of other genes through its
interactions with histone deacetylase 3 (HDAC3) and PIASxβ, a member of the E3 ligase family
of proteins which are known to be involved in the SUMOylation of several transcription factors69.
TFII-I is expressed in preimplantation mouse embryos, where it can be detected in both
the cytoplasm and nucleus in embryos at the two-cell stage through to the 128-cell blastocyst
which implants into the uterus at embryonic day 4.570. This indicates that TFII-I is likely to play
a role in early embryonic development.
In the developing mouse brain, Gtf2i mRNA expression is restricted to neuronal cells according to in situ hybridization71. Between embryonic day (E)18 and postnatal day 7 (P7), the
mRNA is ubiquitously expressed throughout the brain. Expression in the cerebellum appears to
be relatively enhanced beginning at P7, and by the time the mouse is six weeks old the
expression pattern of Gtf2i mRNA changes to its adult state. At this time the highest levels of
expression are seen in cerebellar Purkinje cells, the hippocampus and the neurons of the cerebral cortex. Expression can also be detected in the olfactory bulbs and in neurons of other regions of the brain. Immunohistochemistry using an antibody which recognizes all splice forms of TFII-I
revealed a similar expression pattern in the adult brain to that of the mRNA, however protein
expression in the cerebral cortex appeared lower than that of the corresponding RNA71. The high
expression of TFII-I in the cerebellum is interesting as there appear to be anatomical
abnormalities in this area in WBS patients. The relatively high hippocampal expression is also
noteworthy as the hippocampus is important for learning and memory71.
1.3.2 General Transcription Factor 2-I Repeat Domain containing 1 (TFII-IRD1)
TFII-IRD1 (GTF2IRD1/MusTRD1/BEN/CREAM1/GTF3/WBSCR11) was
independently identified in multiple experiments72-76. Troponin I is a component of the troponin protein complex which controls the contraction of muscles in response to the level of
intracellular calcium. There are three troponin isoforms, each encoded by separate genes. All
isoforms are expressed in all muscle types early in development, but during late fetal
development the slow isoform of troponin I (TnIs) is down-regulated in all types of muscle
fibres, except for those destined to become slow fibers75. An upstream regulatory element (USE)
in the TnIs promoter was shown to be sufficient to confer preferential slow-muscle activity to a heterologous thymidine kinase minimal promoter77. O’Mahoney et al. performed a yeast one-hybrid
screen to identify proteins capable of binding to the TnIs USE, and identified a novel protein which was similar to TFII-I, which they referred to as muscle TFII-I repeat domain-containing protein 1 (MusTRD1)75.
Soon after this, TFII-IRD1 was identified in a second yeast one-hybrid screen as a protein
that binds to the early enhancer of the Hoxc8 gene, and was given the name binding factor for early
enhancer (BEN)74. It was also found to bind to the retinoblastoma protein (Rb) through its
C-terminal region, and was given the name containing repetitive eighty-six amino-acid motif
(CREAM1)72.
Like its family member TFII-I, TFII-IRD1 is believed to be a transcription factor. There
is evidence to suggest that it plays a role in the regulation of multiple genes including TnIs78,
GSC79 and VEGFR-267. Interestingly, some experiments indicate that TFII-I and TFII-IRD1 may
counter-regulate some of the same genes with TFII-I activating the expression of the target gene and TFII-IRD1 repressing expression of the same gene67,68.
TFII-IRD1 has been shown to bind to two distinct DNA sequences, which have been
found in the promoter regions of some of the proposed target genes. Vullhorst and Buonanno
identified the consensus sequence GTCGAGATTAGBGA using SELEX on the I-repeats of mouse
TFII-IRD180. They found that all of the mouse I-repeats were capable of binding to DNA with the exception of the first I-repeat (R1), and R4 was found to have the greatest affinity for DNA, binding specifically to the consensus sequence. The core of the consensus sequence, GGATTA, is found in the regions of both the TnIs and Hoxc8 promoters that TFII-IRD1 had previously been
shown to bind to. Lazebnik et al. identified a different TFII-IRD1 consensus sequence using
SELEX, CWGCCAYA81. The methods of Lazebnik et al. differed slightly from those of
Vullhorst and Buonanno in that they used the entire TFII-IRD1 protein as bait while Vullhorst
and Buonanno cloned each of the I-repeats individually to determine their different DNA binding abilities. TFII-IRD1 was shown to repress expression of a reporter gene when three copies of the
CWGCCAYA sequence were cloned upstream81. An in silico analysis found that this consensus sequence is present in the promoters of both the human and mouse BMPR1B and FGF15 genes.
Lazebnik et al. demonstrated that when TFII-IRD1 is knocked down in C2C12 cells using siRNA, expression of both of these genes is dramatically increased, indicating that TFII-IRD1
may play a role in the transcriptional regulation of these genes81.
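The consensus sequences described above are written with IUPAC degenerate nucleotide codes (for example, W = A or T, Y = C or T, B = C, G or T). As an illustrative sketch only, and not the in silico method used by Lazebnik et al., a degenerate consensus can be converted to a regular expression and used to scan both strands of a sequence for candidate sites. The function names and the short test sequence below are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(consensus, seq):
    """Return sorted (start, strand, matched_site) tuples for a degenerate consensus.

    start is 0-based and always given in forward-strand coordinates.
    """
    seq = seq.upper()
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead is used so that overlapping sites are also reported
    pattern = re.compile("(?=(" + body + "))")
    n, m = len(seq), len(consensus)
    hits = [(hit.start(), "+", hit.group(1)) for hit in pattern.finditer(seq)]
    for hit in pattern.finditer(reverse_complement(seq)):
        hits.append((n - hit.start() - m, "-", hit.group(1)))
    return sorted(hits)


# Hypothetical 12-nt sequence containing one forward-strand match
print(find_sites("CWGCCAYA", "TTCAGCCACATT"))  # [(2, '+', 'CAGCCACA')]
```

A scan of this kind only identifies candidate binding sites; whether a site is functional still has to be tested experimentally, as in the reporter and knock-down experiments described above.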
Using a knock-in LacZ mouse model of Gtf2ird1, Palmer et al. were able to establish the expression pattern of Gtf2ird1 throughout mouse development82. At E7.5, TFII-IRD1 was expressed in all germ layers and extra-embryonic tissues, and expression became more refined at the onset of organogenesis. Expression in the forebrain and gut was detected in E9.5 embryos, and as development progressed, expression of TFII-IRD1 was also detected in many tissues including the midbrain, branchial arches and heart. As fetal development progressed expression in the brain was highest in the olfactory bulbs, cerebellum, thalamic and hypothalamic nuclei.
Expression of TFII-IRD1 in the brains of adult mice was relatively low, but was detected in all neuronal types examined. The highest levels of expression were detected in the olfactory bulbs,
Purkinje neurons of the cerebellum and neurons in the piriform cortex. Studies in our lab have
revealed similar results and shown that expression of TFII-IRD1 in the prefrontal cortex of adult mice is restricted to layer V neurons83.
Outside of the nervous system, the highest levels of TFII-IRD1 in adult mice were found
in the testis, endothelial cells, brown adipose tissue, heart and smooth muscle of the gut and
bladder82.
1.3.3 General Transcription Factor 2-I Repeat Domain containing 2 (TFII-IRD2)
The human genome contains three copies of GTF2IRD2, however two of the copies are
pseudogenes located within the LCRs and are unlikely to produce a functional protein60. There is only one copy of the gene in the mouse genome, which shows a high degree (80%) of homology to the functional human locus84.
The N-terminal half of TFII-IRD2 contains two I-repeats, which appear to be derived
from the first and sixth I-repeats found in TFII-I60. Similar to other members of the TFII-I gene
family, the N-terminus also contains a leucine zipper which is believed to facilitate the formation
of protein dimers. The C-terminal portion of the protein contains a CHARLIE8 transposable
domain which has inserted in frame into the locus, resulting in the production of a fusion
protein84. The CHARLIE8 transposon is an autonomous transposon which is unique to
mammals. Sequence analysis of the CHARLIE8 domain in TFII-IRD2 and the surrounding
genomic region indicates that it may have retained some of the functions associated with
transposition84. These functions could include interacting with specific DNA or protein motifs or
cleavage of DNA strands. It is also possible that presence of the transposase target sites could
predispose the genomic region to instability by allowing other transposases to bind to and cleave the
DNA at these sites84. A similar mechanism has been proposed for Charcot-Marie-Tooth
neuropathy type 1A (CMT1A) and hereditary neuropathy with liability to pressure palsies
(HNPP), which are caused by the duplication or reciprocal deletion of 1.5 Mb on chromosome 17,
respectively84. The region is flanked by repeated sequences which can undergo NAHR, and it
has been determined that 76% of the crossover events which lead to these disorders occur at a
recombination hotspot containing a transposable-like element that is flanked by transposase recognition sites85. Other transposases are believed to bind to and cleave the DNA at these sites.
Based on Northern blot analysis, Gtf2ird2 is expressed highly in mouse heart, brain and
liver tissues84. Weaker expression was detected in the spleen, lung, kidney and skeletal muscle.
Using RT-PCR, no expression could be detected in mouse embryos at days E9.5 and E10.5.
In humans GTF2IRD2 lies within the medial LCR block, and depending on the exact
breakpoint of the deletion, it may or may not be present in WBS patients. The inclusion of
GTF2IRD2 in the deletion does not seem to have any obvious effect on the phenotype of the
patient84.
1.4 The Gtf2ird1 mouse model
The genomic region associated with WBS on human chromosome 7q11.23 is conserved
in mice on chromosome 5G2 (figure 1.8). The regions are highly syntenic; however, in mice the sequence of the locus is inverted relative to the human sequence86. In addition, the LCRs which
flank the region in humans are not found in mice. Minor differences found in the mouse genome
include the absence of WBSCR23 and CCL26, and the presence of an additional Cldn gene
(Cldn13) which is not found in the human genome. The mouse and human genomes are generally highly similar, which means that mutant mice will often show many of the clinical
symptoms seen in humans with a particular genetic mutation87. This genetic similarity, along
with the development of tests to examine both clinical and behavioural phenotypes in mice,
makes mice an excellent system to study genotype-phenotype correlations in WBS.
Figure 1.8. The WBS deletion region in humans is syntenic to a region of mouse chromosome 5. The corresponding region in the mouse genome is inverted relative to the human genome, and is lacking two genes found in humans (* - WBSCR23 and CCL26). The mouse genome contains one extra gene (* - Cldn13) which is not found in humans.
1.4.1 Generation of the mouse model
Previous work in our lab has generated a Gtf2ird1-/- mouse model using gene targeting to
better understand the role this gene plays in the WBS phenotype88. Exons 2, 3, 4 and part of 5 of
Gtf2ird1 were replaced with a neomycin-resistant gene cassette in R1 murine embryonic stem cells. Chimeric mice were generated by aggregating targeted cells with morula stage embryos.
Chimeric male mice were mated to wildtype (WT) female mice on a CD1 (outbred) genetic background. The Gtf2ird1+/- offspring were then backcrossed onto a CD1 background.
Heterozygous mice were intercrossed to generate Gtf2ird1-/- mice. Both heterozygous and
homozygous mice were viable and fertile, and mutant offspring were born at the expected Mendelian
ratio.
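For a heterozygous intercross the expected Mendelian ratio of WT, heterozygous and homozygous offspring is 1:2:1. As a minimal sketch of how agreement with this ratio can be checked, the following computes a chi-square goodness-of-fit statistic with two degrees of freedom. The offspring counts are hypothetical and are not data from this work, and the function name is illustrative.

```python
import math


def mendelian_chi_square(n_wt, n_het, n_hom):
    """Chi-square goodness-of-fit test of genotype counts against a 1:2:1 ratio."""
    observed = [n_wt, n_het, n_hom]
    total = sum(observed)
    expected = [total * 0.25, total * 0.5, total * 0.25]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2)
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical counts for 100 offspring of a heterozygous intercross
stat, p = mendelian_chi_square(30, 40, 30)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")  # chi2 = 4.00, p = 0.135
```

A p-value above the chosen significance threshold (commonly 0.05) means the observed counts are consistent with the expected 1:2:1 ratio.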
Real-time PCR was performed on RNA extracted from neonate and adult brains to
determine expression levels of Gtf2ird1 in the mutant mice using primers located in the deleted
region (exon 2). In heterozygous mice the expression level of Gtf2ird1 was approximately half
of that seen in WT mice, and no Gtf2ird1 expression could be detected in Gtf2ird1-/- mice.
Expression of Gtf2i and Clip2, which flank the Gtf2ird1 locus, was not altered in the mutant mice88.
However, when primers located in exon 9 (which was not included in the deletion) of
Gtf2ird1 were used, a transcript could be detected in Gtf2ird1-/- mice. Sequence analysis of this
transcript showed that exon 1, which contains part of the 5’ UTR, was splicing directly into exon
6. The truncated transcript shows increased expression relative to the WT transcript in both
Gtf2ird1+/- and Gtf2ird1-/- mice. There are two possible translational start sites in this transcript: the first would produce a small, out-of-frame protein, while the second could produce a truncated in-frame protein. As the transcript is missing the first four coding exons, the truncated protein would lack both the leucine zipper and the first I-repeat. We have been unable to confirm whether the aberrant transcript is producing a truncated Gtf2ird1 protein as there are no specific antibodies available; however, the presence of a dose-dependent phenotype (see below) strongly supports a loss-of-function model rather than a dominant negative model.
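The relative expression levels described in this section (for example, Gtf2ird1 expression in heterozygous mice being approximately half of that in WT mice) are the kind of quantity usually derived from real-time PCR threshold cycle (Ct) values. This section does not state how the data were quantified; the sketch below assumes the widely used 2^-ΔΔCt calculation with a reference gene and approximately 100% amplification efficiency. The Ct values and the function name are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control (2^-ddCt).

    Assumes both primer pairs amplify with roughly 100% efficiency.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample      # normalise sample to reference gene
    d_ct_control = ct_target_control - ct_ref_control   # normalise control to reference gene
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold one cycle later in the
# heterozygous sample than in WT, with an unchanged reference gene
print(relative_expression(25.0, 18.0, 24.0, 18.0))  # 0.5
```

Under these assumptions, a delay of one cycle corresponds to a halving of transcript abundance, which is the pattern expected for loss of one functional allele.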
1.4.2 Behavioural phenotypic analysis
There were no obvious morphological or anatomical abnormalities in the Gtf2ird1+/- or
Gtf2ird1-/- mice. Physically, Gtf2ird1-/- mice (male and female) were significantly smaller (15%) than WT mice. The Gtf2ird1+/- mice were also smaller, although the trend was not significant.
Gtf2ird1-/- and Gtf2ird1+/- mice exhibited some behaviours that are consistent with the
WBS phenotype. When placed in a cage with an unknown mouse they displayed significantly
fewer aggressive interactions than WT mice, and the aggressive interactions that they did engage
in were shorter in duration. The mutant mice spent significantly more time following the
intruder around, and spent more time sniffing the intruder mouse88.
Mutant mice also had decreased levels of anxiety and fear as measured by the elevated
plus maze and cued fear conditioning tests. In the elevated plus maze, mice were placed on an
elevated “plus” shaped platform with two open arms and two enclosed arms. Mice typically
prefer to be in the enclosed arms where they feel safe; however Gtf2ird1-/- mice entered into the
open arms a greater number of times, spent a greater amount of time inside the open arms, and
dipped their heads over the sides of the platform a greater number of times indicating reduced
anxiety88.
In the fear conditioning test, mice were placed into a test chamber and an auditory cue
was paired with an electric foot shock. When the auditory cue was repeated at a later time WT
mice froze in anticipation of the shock, whereas Gtf2ird1-/- mice displayed significantly less
freezing. Together, these results indicate that Gtf2ird1-/- mice have significantly reduced levels
of anxiety and a reduced natural fear response88.
1.4.3 Biochemical and electrophysiological phenotypic analysis
Serotonin (5HT) metabolism is known to be linked to anxiety and aggression. As these traits are altered in Gtf2ird1-/- mice, high-performance liquid chromatography (HPLC) was performed to determine whether 5HT metabolism is altered in the mutant mice. Levels of 5HT and 5-hydroxyindoleacetic acid (5-HIAA; a 5HT metabolite) were measured in different brain regions of Gtf2ird1-/- mice, and significantly increased levels of 5-HIAA were found in the amygdala, frontal cortex and parietal cortex relative to WT mice. No significant differences in
5HT levels were detected88.
In order to investigate the effects of 5HT on the neurons of the prefrontal cortex in
Gtf2ird1-/- mice, electrophysiological analysis was performed on acute brain slices from the prefrontal cortex of Gtf2ird1-/- and WT littermates83. Whole cell recordings on neurons in layer
V of the cortex revealed that application of 5HT results in increased inhibitory outward currents in Gtf2ird1-/- mice, relative to WT littermates. The inhibition was shown to be mediated through
5HT1A receptors. 5HT1A receptors in the prefrontal cortex had previously been shown to regulate anxiety-like behaviours89, and the enhanced post-synaptic inhibition of these receptors seen in Gtf2ird1-/- mice could be related to their atypical behaviours.
1.5 Research Aims and Hypothesis
Due to the presumptive role of TFII-IRD1 in the cognitive and behavioural aspects of the
WBS phenotype, it is of interest to identify neural targets of this putative transcription factor.
The Gtf2ird1-/- mice previously created in our lab provide an excellent system to study the biological role of this gene. The goal of my project was to use these mice to identify downstream targets of TFII-IRD1 in the mouse brain.
Given that TFII-IRD1 has been shown to bind DNA, and regulate the expression of target
genes in vitro, I hypothesized that the behavioural phenotype seen in the Gtf2ird1-/- mice was the
result of altered expression of genes regulated by TFII-IRD1. Using microarray analysis and qRT-PCR, I hoped to identify target genes of TFII-IRD1 and use this information to gain an understanding of the molecular mechanisms which give rise to the behavioural phenotype seen in both the mutant mice and WBS patients.
Chapter II: TFII-IRD1 may not function as a transcription factor in the developing mouse brain.
2.1 Abstract
Members of the TFII-I gene family, including TFII-IRD1, have been shown to regulate
transcription by binding to specific DNA sequences. Numerous in vitro studies examining the
effect of TFII-IRD1 on gene regulation have been done, and a few direct targets have been
proposed, including TnIs, Hoxc8, Gsc, and Vegfr2. However, to date the list of proposed target
genes has not included plausible candidates for the cognitive and behavioural phenotype seen in
either individuals with WBS, or Gtf2ird1-/- mice. In order to identify novel transcriptional
targets of TFII-IRD1, I performed the first in vivo microarray screen, examining expression in brain from Gtf2ird1-/- and WT mice at E15.5 and at birth. Changes in gene expression in the mutant mice were moderate (0.3 to 2.5 fold), and most candidate genes whose altered expression
was verified using real-time PCR were located on chromosome 5, within 50 Mb of Gtf2ird1. siRNA
knock-down of Gtf2ird1 in two mouse neuronal cell lines failed to identify changes in expression
of any of the genes identified from the microarray and subsequent analysis showed that
differences in expression of genes on chromosome 5 were the result of retention of that
chromosome region from the targeted embryonic stem cell line, and so were dependent upon
strain rather than Gtf2ird1 genotype. In addition, specific analysis of genes previously identified
as direct in vitro targets of GTF2IRD1 failed to show altered expression. In summary, I was
unable to find any in vivo neuronal targets of this putative transcription factor, despite its
widespread and robust expression in the developing rodent brain.
2.2 Literature Review
2.2.1 Evidence supporting the role of TFII-IRD1 as a transcription factor
The first transcriptional target of TFII-IRD1 to be identified was TnIs75. TFII-IRD1 was
identified through a yeast-one hybrid as a protein capable of binding to an Inr-like element in the
TnIs promoter75,90. TnIs is initially expressed in all muscle fibers, but during fetal development it becomes up-regulated in future slow-twitch myofibers and down-regulated in future fast-twitch
myofibers91. Slow-twitch myofibers are important for endurance and maintaining posture, while
fast-twitch fibers are needed for movement and fast power generation90. A 157 bp upstream
enhancer (USE) sequence has been shown to be necessary for slow myofiber-specific expression
of TnIs77, and TFII-IRD1 was shown to bind to an Inr-like element contained in the USE.
Polly et al. showed that TFII-IRD1 is able to repress TnIs transcription78.
TFII-IRD1-mediated repression of TnIs may occur through two separate pathways: directly through binding
to the USE, and indirectly through interactions with the nuclear receptor co-repressor (NCoR) protein and/or the transcription factor myocyte enhancer factor (MEF2C)78. MEF2C is an
activator of TnIs expression, and also binds to an element contained within the USE. When a
MEF2C expression construct was transfected into C2C12 (muscle) cells along with a luciferase
reporter construct containing the USE sequence, luciferase expression was increased relative to
controls. Transfection of a TFII-IRD1 expression construct, either alone or in conjunction with
the MEF2C construct, resulted in repression of luciferase expression. Expression of TFII-IRD1
was able to repress expression even when the USE contained point mutations which prevented
TFII-IRD1 binding. TFII-IRD1 was subsequently shown to interact in vitro with both NCoR and
MEF2C; it is possible that TFII-IRD1 could prevent MEF2C from activating TnIs expression by
preventing it from binding to the USE and this may occur in conjunction with NCoR78.
In order to determine the in vivo role TFII-IRD1 plays in regulating TnIs expression, Issa
et al. generated a transgenic mouse which expressed the human GTF2IRD1 gene in all skeletal
muscles beginning early in development92. Phenotypic analysis of adult mice revealed that they
lacked slow-twitch fibers in their hindlimb muscles; the total number of muscle fibers did not
differ from WT, but the muscle was composed almost entirely of fast-twitch fibers. In transgenic embryonic mice, development of slow-twitch fibers proceeded normally, indicating that the absence of slow fibers in adult mice was the result of postnatal fiber conversion.
Expression of slow-fiber specific genes including TnIs was reduced in the muscle of the
transgenic mice and fast-fiber specific genes were found to be upregulated. These results
indicate that TFII-IRD1 is a repressor of slow fiber-specific genes. Issa et al. hypothesized that
all of the slow-fiber genes which showed decreased expression in the transgenic mice may share
a common regulatory sequence which TFII-IRD1 is able to bind to, and that binding of TFII-
IRD1 represses expression of the genes needed for slow muscle fiber development92.
Hoxc8 was the second gene proposed to be a transcriptional target of TFII-IRD174. Using
a yeast-one hybrid, Bayarsaihan and Ruddle identified TFII-IRD1 as a protein that is capable of binding to the early enhancer (EE) region of the Hoxc8 promoter. The EE sequence is over 200 bp and is located 3 kb upstream of the Hoxc8 transcriptional start site, and has been shown to be necessary for the proper spatial and temporal expression of Hoxc8 in the neural tube and paraxial mesoderm during development93. In transgenic mice where the EE sequence has been deleted,
initial expression of Hoxc8 occurs later than in WT mice and the expression boundaries are
altered. By E11.5 the expression of Hoxc8 is indistinguishable from WT, however the mice
have many phenotypic similarities to Hoxc8-/- mice including an abnormal hindlimb clasping
reflex upon tail suspension and skeletal transformations93. It has been reported that interactions
between TFII-IRD1 and the EE may repress Hoxc8 expression94, however no evidence of this
has been published.
Goosecoid (Gsc) is also proposed to be a direct target of TFII-IRD1, however there are contradicting reports as to how TFII-IRD1 may regulate Gsc68,79. The interaction between TFII-
IRD1 and the Gsc promoter was first identified by Ring et al. who performed a yeast-one hybrid
to identify proteins which interact with the distal element (DE) upstream of the Gsc
transcriptional start site79. Gsc is a transcription factor which plays an important role in the
proper development and patterning of vertebrate embryos95,96. In particular, Gsc-/- mice have
been shown to have craniofacial defects along with fused ribs and abnormalities in the sternum97.
Two regions in the Gsc promoter are necessary for proper expression of the gene: the DE which
is activated by activin and nodal family members, and the proximal element (PE) which is
activated by Wnt signalling95.
After determining that TFII-IRD1 is able to bind to the DE of the Xenopus Gsc promoter in vitro using yeast-one hybrid analysis and electrophoretic mobility shift analysis (EMSA), Ring et al. sought to determine the effect that TFII-IRD1 binding has on Gsc transcription79. They
injected Xenopus embryos with mRNA encoding a VP16-GTF2IRD1 fusion protein along with a reporter gene construct containing the DE sequence. The VP16 domain is a transcriptional activator and ensures that TFII-IRD1 will be constitutively active once translated. The VP16-
TFII-IRD1 fusion protein was able to activate expression of the reporter construct, and was also able to activate expression of the endogenous Gsc gene.
As activin is known to activate Gsc expression through the DE, Ring et al. next determined that TFII-IRD1 was necessary for this activation to occur by co-transfecting
morpholinos which prevent translation of endogenous GTF2IRD1, activin mRNA and a DE
sequence containing reporter construct into Xenopus embryos. Not only did injection of the morpholinos prevent activin from activating the reporter construct, but it also resulted in decreased expression of the endogenous Gsc gene. Based on these observations it was hypothesized that TFII-IRD1 binding to the Gsc DE results in activation of Gsc expression, and this activation was believed to be the result of TFII-IRD1 interacting with other proteins.
Three years later, Ku et al. also found that TFII-IRD1 plays a role in the regulation of
Gsc expression, however they determined that binding of TFII-IRD1 to the DE serves to repress expression of Gsc68. According to Ku et al. Gsc expression in P19 cells can be activated by a
complex of TFII-I and SMAD2 binding to the DE, following stimulation with transforming
growth factor beta (TGFβ). TFII-I belongs to the same gene family as TFII-IRD1 and SMAD2
is a transcription factor which is known to play a role in the regulation of genes that are activated
by TGFβ/activin68. A reduction in Gsc expression could be detected following knockdown of
TFII-I expression in P19 cells using siRNA and in Xenopus embryos using morpholinos.
Together these findings indicate that TFII-I is necessary for the proper activation of Gsc
expression.
When TFII-IRD1 and TFII-I expression constructs were transfected into P19 cells at a
1:1 ratio along with a reporter construct containing the DE, TGFβ-induced expression of the reporter was greatly increased. As the ratio of TFII-IRD1:TFII-I increased, TGFβ was no longer able to activate the reporter construct; in contrast, when the TFII-IRD1:TFII-I ratio was decreased, TFII-IRD1 was no longer able to repress reporter gene expression. ChIP assays were performed on P19 cells and endogenous TFII-IRD1 was found to localize to the Gsc promoter in the absence of TGFβ signalling, and TFII-I was found at the promoter following stimulation of
the cells with TGFβ. This suggests that binding of the TFII-I family members to the DE in the
Gsc promoter may be mutually exclusive; TFII-IRD1 appears to constitutively repress Gsc
expression until the TGFβ signalling cascade is initiated, at which point TFII-I activates Gsc
expression68.
These results appear to contradict the findings of Ring et al.; Ku et al. believe that the
reason for the contradiction lies in the VP16-TFII-IRD1 fusion protein which Ring et al. used.
The VP16 domain is known to be a transcriptional activator and had previously been shown to
cause a transcriptional repressor to serve as a transcriptional activator98. Therefore, the findings
of Ring et al. using the fusion protein can only show that TFII-IRD1 is capable of binding to the
DE, and cannot indicate what result this binding has on Gsc expression. Ring et al. also found
decreased expression of endogenous Gsc when TFII-IRD1 expression was knocked down using
morpholinos. Ku et al. used the same morpholinos to knock down TFII-IRD1 expression and they found that in vitro protein synthesis of TFII-I was inhibited, and so they propose that the decreased expression of Gsc in Xenopus embryos treated with TFII-IRD1 morpholinos may be the result of decreased levels of TFII-I.
Tassabehji et al. provided further evidence that TFII-IRD1 is able to regulate Gsc expression, and their results were in agreement with those of Ring et al.53,79. Tassabehji et al.
knocked down endogenous TFII-IRD1 expression in HEK293 cells using siRNAs and found that
this reduced expression of a reporter construct containing the Gsc promoter sequence52.
Knockdown of endogenous TFII-I did not have any effect on reporter gene expression. Based on
these conflicting findings it seems likely that the effect of TFII-IRD1 and TFII-I binding to the
DE in the Gsc promoter may depend on both cell type and which cellular signalling pathways have been activated.
Further support of the notion that TFII-IRD1 may be able to positively and negatively regulate a specific target gene depending on the cell type comes from the work of Tantin et al., who studied regulation of the murine immunoglobulin heavy chain (IgH) promoter59. The expression of IgH is restricted to B lymphocytes, and an element downstream of the transcription
start site, termed the downstream immunoglobulin control element (DICE), had previously been
shown to confer specific activation of the IgH promoter in B lymphocytes99. In order to identify
proteins which interact with the DICE sequence, WT DICE segments were coupled to latex
microspheres and incubated with B cell nuclear extracts. A 110 kDa protein was isolated which
bound with greater affinity to the WT DICE sequence than to a mutant sequence, and mass
spectroscopy revealed this protein to be TFII-IRD1. The in vitro affinity of TFII-IRD1 for the
DICE sequence was confirmed using EMSA. Interestingly, the EMSA data showed that TFII-I
was also able to form complexes with DICE.
In order to determine the consequence of TFII-IRD1 binding to DICE, a dominant
negative TFII-IRD1 construct was generated99. The mutant protein retained the ability to bind to
DICE, but was unable to interact with other proteins and form higher order complexes. When
the dominant negative protein was over expressed in an M12 cell line (mature B cell
plasmacytoma cells) along with an IgH reporter construct, IgH promoter activity was reduced
indicating that TFII-IRD1 positively regulates IgH expression. However when the same mutant
protein and reporter construct were over expressed in murine HAFTL cells (a pre-B cell line), promoter activity was increased. This result would indicate that TFII-IRD1 negatively regulates
IgH expression. Evidence for negative regulation of IgH promoter activity by TFII-IRD1 was also found in a third murine pre-B cell line (70Z/3) using the dominant negative protein and siRNA knockdown of endogenous TFII-IRD1 expression.
It seems likely that IgH expression in B lymphocytes is regulated
temporally, and that through interactions with other proteins present in the cell, TFII-IRD1 is able to ensure proper IgH expression at the correct stage of B-cell development.
Another gene which may be counter regulated by members of the TFII-I gene family is vascular endothelial growth factor receptor 2 (VEGFR2)67. The VEGFR2 promoter does not
contain a TATA box sequence, but does have an Inr element. TFII-I has been shown to activate
VEGFR2 expression through binding to the Inr100. Jackson et al. demonstrated that TFII-I
mediated activation of VEGFR2 expression could occur even in the absence of a functional Inr
sequence, and that TFII-I is able to bind to three different E-box sequences located in the VEGFR2
promoter67. As TFII-I and TFII-IRD1 had previously been shown to counter regulate the same
genes, they then went on to see if TFII-IRD1 also plays a role in VEGFR2 transcriptional
regulation.
While no direct interaction between VEGFR2 and TFII-IRD1 was detected, when
TFII-IRD1 was transfected into bovine pulmonary artery endothelial (BPAE) cells, along with a
VEGFR2 promoter-reporter construct, basal reporter activity was decreased.
The majority of studies on the function of TFII-IRD1 in transcriptional regulation have focused on specific target genes; however, in 2006 Chimge et al. performed an unbiased screen to identify new transcriptional targets of this putative transcription factor101. An immortalized MEF cell line was transfected with an expression construct containing GST-TFII-IRD1. This resulted in a 6.6-fold increase in Gtf2ird1 mRNA as determined by qRT-PCR. Two separate transfection experiments were performed, and RNA was extracted from the cells 24 hours later. Microarray analysis was performed using the Operon microarray chip which contains probes for 16,460
genes. Approximately 2000 genes were found to be altered by more than 1.7 fold relative to
mock transfected controls. It is important to note that a low statistical cut-off was used in
generating this list so that it would include genes which had previously been shown to be
regulated by or interact with the TFII-I gene family.
A total of 11 genes were selected for validation by qRT-PCR; G1p2, Ccl7, Ube2I6, Tgfb2 and Shrm were confirmed to be up-regulated in MEFs which over express TFII-IRD1, while
Folr1, Tgfbr2, Csrp2 and Dlk1 were confirmed to be down-regulated101. FoxH1 and Cfl were
down-regulated according to the microarray results, however qRT-PCR analysis revealed that
expression of these genes was increased in three separate transfection experiments.
Chimge et al. then went on to identify further potential targets of TFII-IRD1 using a
bioinformatics approach102. They combined the results of previous SELEX experiments and
known TFII-IRD1 binding sites to derive the consensus sequence “BRGATTRBR”, and used
this sequence to search a database of transcriptional start sites. The consensus sequence was
identified within 1kb of the start site in 1772 mouse/human orthologous pairs. Of these genes,
601 were identified as being regulated by TFII-IRD1 in the microarray performed on MEF cells.
ChIP analysis was used to show that both TFII-IRD1 and TFII-I, when overexpressed in MEF cells, can bind to the promoters of a number of these genes, and siRNA knockdown of Gtf2ird1 and Gtf2i in MEFs resulted in alterations in the expression of a number of genes including Cfl1, Opn, Fgf11 and Ccnd3102.
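The consensus search described above can be illustrated with a minimal sketch. It assumes the standard IUPAC nucleotide codes (R = A or G; B = C, G or T) and scans only the given strand; the function names and the example sequence are hypothetical and are not taken from Chimge et al.

```python
import re

# IUPAC nucleotide codes needed for this motif (R = purine, B = "not A").
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "B": "[CGT]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(sequence, consensus="BRGATTRBR"):
    """Return 0-based start positions of all matches on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical promoter fragment, made up purely for illustration.
example = "AAACGGATTACGAAA"
print(consensus_to_regex("BRGATTRBR"))
print(find_sites(example))
```

A genome-wide search of the kind reported would also need to scan the reverse complement and restrict hits to a window around annotated transcriptional start sites; neither step is shown here.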
2.2.2 Cellular localization of TFII-IRD1
Bayarsaihan et al. looked at cellular localization of TFII-IRD1 during mouse embryonic development using an anti-TFII-IRD1 antibody103. TFII-IRD1 can be detected in the nucleus
beginning at the two-cell stage (the onset of zygote gene expression), until E3.5. At E4.5 the
localization of TFII-IRD1 shifts, and it can only be detected in the cytoplasm of trophoblast
cells. TFII-IRD1 expression remains cytoplasmic until E7.5 when nuclear expression can be
detected in the neural ectoderm and embryonic mesoderm. Cellular localization in the later
stages of development was not examined in this study, but multiple studies on the localization
of TFII-IRD1 in cultured cells have been performed.
Endogenous TFII-IRD1 has been shown to localize specifically to the nucleus in HeLa
cells94 and C2C8 myoblast cells57, as has GFP-tagged TFII-IRD1 when over expressed in COS7
cells68,94. It has also been suggested that TFII-IRD1 and TFII-I may affect the cellular localization of the other when co-expressed. Tussie-Luna et al. proposed that TFII-IRD1 can exclude TFII-I from the nucleus, thereby repressing the activation of TFII-I-responsive genes94.
When GFP-tagged TFII-IRD1 and GST-tagged TFII-I were co-expressed in COS7 cells, TFII-
IRD1 was predominantly found in the nucleus, while the majority of TFII-I was found in the cytoplasm. The expression of TFII-I alone resulted in nuclear localization of the protein, and activation of TFII-I target genes94. However, a study published three years later, by many of the
same authors, found that when the same GFP-TFII-IRD1 and GST-TFII-I constructs were co-expressed in COS7 cells TFII-I localized to the nucleus while TFII-IRD1 was found in the cytoplasm68.
The nuclear localization of TFII-IRD1 described in these studies is consistent with the
role of TFII-IRD1 as a transcription factor.
2.3 Material and Methods
2.3.1 Generation of probes for in situ hybridization
The Hoxc8 probe sequence was amplified from WT mouse genomic DNA using the
following primer sequences FOR- 5’ GGAACCGGCCTATTACGACT 3’ and REV- 5’
TTAAGTGGCCTTGTCCTTCG 3’. The Gtf2ird1 probe sequence was amplified from WT
mouse cDNA (from P0 brains) using the following primer sequences: FOR-5’
AACAGACTGGGGGAGAAGGT 3’ and REV-5’ CCTTGGCGGCAGGAATATAG 3’.
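As a quick sanity check on the primer pairs listed above, the following sketch reports length, GC content and reverse complement for each sequence. The primer sequences are copied from the text; the dictionary keys and function names are hypothetical labels chosen for illustration, and no melting-temperature model is assumed.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def reverse_complement(primer):
    """Reverse complement, written 5' to 3'."""
    return primer.upper().translate(COMPLEMENT)[::-1]


# Sequences from the text; the labels are illustrative only.
PRIMERS = {
    "Hoxc8_FOR": "GGAACCGGCCTATTACGACT",
    "Hoxc8_REV": "TTAAGTGGCCTTGTCCTTCG",
    "Gtf2ird1_FOR": "AACAGACTGGGGGAGAAGGT",
    "Gtf2ird1_REV": "CCTTGGCGGCAGGAATATAG",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), gc_percent(seq), reverse_complement(seq))
```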
The PCR amplicons were purified using Microclean (Microzone) when a single PCR product was detected using gel electrophoresis, and gel extracted using the QIAquick gel extraction kit (Qiagen) when multiple products were detected. The purified sequences were then cloned into the pCR2.1-TOPO TA cloning vector (Invitrogen) and excised using EcoRI digestion (New England Biolabs). The excised fragment was then ligated into the pBluescriptII
KS (Fermentas) vector using the EcoRI site in the multiple cloning region. Restriction enzyme digestion was used to identify clones carrying the insert in the forward and reverse orientation in order to generate sense and antisense probes. The plasmids containing the probe sequences were linearized at the 5’ end of the probe sequence, purified using phenol-chloroform extraction and then precipitated using RNase-free NaOAc and ethanol.
DIG-labelled probes were generated using in vitro transcription under RNase-free conditions. 1 µg of template DNA (linearized vector) was transcribed with T3 (Boehringer
Mannheim #1031171) or T7 (Boehringer Mannheim # 108981767) RNA polymerase using 10X
DIG RNA labeling Mix (Roche #11277073910) following the manufacturer's protocol.
2.3.2 Whole mount in situ hybridization of Gtf2ird1-/- embryos
Male and female mice were housed together overnight, and the female was checked for a vaginal plug in the morning. 12:00 pm on the day the plug was found was considered to be E0.5.
Embryos were collected from the pregnant mother at E11.5. The embryos were dissected from the uterus into ice cold RNase-free phosphate buffered saline (PBS). The back of the head was punctured with a needle, and the embryos were fixed in 4% paraformaldehyde-PBS at 4° C overnight. The embryos were then washed twice with PBS containing 0.1% Tween-20 (PBT) and dehydrated with a series of methanol/PBT washes. Each wash was 15 minutes long, with rocking, and the methanol concentration was increased from 25% to 50%, 75% and then 100%.
The embryos were then stored in 100% methanol at -20° C.
Before hybridization, the embryos were rehydrated by taking them through the methanol/PBT series in reverse and then washing twice with PBT. They were then bleached with 6% hydrogen peroxide in PBT for 1 hour at room temperature, washed 3x in PBT, treated with 10 µg/mL proteinase K in PBT for 15 minutes, washed for 10 minutes in 2 mg/mL glycine in PBT, washed 2x in PBT, refixed for 20 minutes in 0.2% glutaraldehyde/4% paraformaldehyde in PBT and finally washed 2x with PBT.
The embryos were then placed in a 2 mL tube filled with hybridization solution (50% formamide, 5x SSC pH 4.5, 50 µg/mL yeast RNA, 1% SDS, 50 µg/ml Heparin, 0.1% CHAPS,
5 mM EDTA). Once the embryos sank to the bottom of the tube the solution was replaced and the embryos were incubated at 70° C for 1 hour. The hybridization solution was replaced and the probe was added at a concentration of 1 µg/mL and then the embryos were incubated at 70° C overnight.
Following hybridization the embryos were washed 2x 30 min at 70° C with solution 1
(50% formamide, 5X SSC pH 4.5, 1% SDS, 0.1% CHAPS), 3x 5 min at 70° C with solution 2
(0.5 M NaCl, 10 mM Tris-HCl pH 7.5, 0.1% Tween-20, 0.1% CHAPS), 2x 30 min at 37° C with
100 µg/mL RNase A in solution 2, 2x 30 min at 65° C in solution 3 (50% formamide, 2X SSC
pH 4.5, 0.1% CHAPS) and then 3x 5 min with TBS-T (TBS with 0.1% Tween-20). The
embryos were then pre-blocked for 60 min with heat inactivated 10% sheep serum in TBS-T
before incubation with preabsorbed anti-Digoxigenin-AP antibody (Roche #11093274910)
overnight at 4° C.
The antibody was preabsorbed using embryo powder generated from E11.5 mouse
embryos. To prepare the powder embryos were homogenized in a minimal volume of PBS, 4
volumes of ice cold acetone were then added and the mixture was incubated on ice for 30 min.
Following centrifugation at 10,000 g for 10 min the supernatant was removed and the pellet was
washed with ice cold acetone and spun down again. The pellet was then dried out on a sheet of
filter paper and ground into a fine powder. For each embryo used in the in situ hybridization 3
mg of embryo powder was added to 0.5 mL of TBS-T and 5 µL of sheep serum. The mixture
was incubated at 70° C for 3 min and then cooled on ice. 1 µL (0.75 U) of anti-Digoxigenin-AP
antibody was added and incubated for 60 min at 4 ° C. Following centrifugation for 10 min the
supernatant was collected and diluted to 2 mL with 1% sheep serum in TBS-T. The embryos
were placed in this pre-absorbed antibody solution for overnight incubation.
The next morning the embryos were washed 3x 5 min and then 5x 60 min with TBS-T
containing 2 mM levamisole, and then 3x 10 min with NTMT (100 mM NaCl, 100 mM Tris-HCl
pH 9.5, 50 mM MgCl2, 0.1% Tween-20, 2 mM levamisole). The embryos were then incubated with 200 µL of NBT/BCIP solution (Roche #1681451) in 10 mL of NTMT. Once the colour
had developed to the desired extent, the embryos were washed once with NTMT and twice with PBT with 1 mM EDTA.
2.3.3 In situ hybridization of P0 mouse brain sections
The entire head of WT P0 mouse pups was removed and fixed in 4% paraformaldehyde
(PFA) in PBS overnight at 4° C. The following day the heads were washed 2x 30 min in PBS at
4° C, and then incubated at 4° C in 30% sucrose in PBS overnight or until the heads sank.
The heads were then rinsed in O.C.T Compound (Tissue-Tek, #4583), and then immersed and frozen in O.C.T. compound and stored at -80° C. A cryostat was used to cut 10 micron sections which were collected on silane-prep slides (Sigma, #S4651), air dried and then stored at -80° C.
All solutions used while performing in situ hybridization were RNase free. Sections on slides were re-fixed in cold 4% PFA-PBS for 10 min and then washed 3x 10 min in PBS. Slides were then incubated in acetylation mix (0.013% triethanolamine, 0.003% acetic anhydride) and washed 3x 5 min. in PBS. A humidified chamber was created by placing kimwipes soaked in
50% formamide/ 5x SSC into a tupperware container. Disposable pipettes were placed on the bottom of the container and the slides were laid on them. Each slide was pre-hybridized with
200 µl of hybridization buffer (50% formamide, 5X SSC, 0.25 mg/mL yeast tRNA, 0.5 mg/mL salmon sperm DNA, 5X Denhardt’s) in the humidified chamber for 4 hours at room temperature.
The DIG-labelled RNA probes were added to hybridization buffer at a concentration of 200 ng/mL. The probe solution was heated at 80° C for 5 min. then added to the slides and incubated in the humidified chamber at 60° C for 16 hr.
Before incubation with an anti-DIG antibody, the slides were rinsed in 5X SSC at 60° C, washed in 0.2X SSC at 60° C for 1 hr, then for 10 min., rinsed in solution B1 (pH 7.5, 0.1M
maleic acid, 0.15M NaCl, 0.175M NaOH), and then blocked in solution B1 with 1% blocking
reagent (Roche # 11096176001) for 1 hr. The slides were then incubated in a 1:5000 dilution of
anti-Digoxigenin-AP antibody (Roche #11093274910) in solution B1 for 1 hr at room temperature. Following incubation with the antibody, slides were washed 2x 15 min. in solution
B1, rinsed in solution B3 (1M Tris, pH 9.5, 5M NaCl, 1M MgCl2) for 5 min and then incubated in
solution B3 with 2% NBT/BCIP solution (Roche #1681451) until the colour had developed to
the desired extent. The slides were then washed 2x in PBS.
2.3.4 Preparation and culture of mouse embryonic fibroblast (MEF) cells
Male and female mice were housed together overnight, and the female was checked for a
vaginal plug in the morning. 12:00 pm on the day the plug was found was considered to be E0.5.
Embryos were collected from the pregnant mother at E15.5. If the genotype of the embryos was
unknown, yolk-sacs were collected for genotyping. The embryos were dissected from the uterus
into sterile PBS. The head, limbs and internal organs were removed from the embryos, and the
carcasses were washed 3x with sterile Dulbecco's Modified Eagles Medium (D-MEM) (Sigma-
Aldrich D5796). The embryos were then minced into small pieces and placed in a 50 mL tube
with 10 mL Trypsin-EDTA (Gibco #25200056) and 5 mL of sterile 3 mm glass beads (Sigma-
Aldrich Z265926) and incubated at 37° C for 90 min with shaking. 10 mL of Trypsin-EDTA
was added at 30 min. intervals for a total volume of 30 mL. The cell suspension was then
decanted into two 50 mL tubes, each containing 3 mL of fetal bovine serum (FBS; Gibco
#12483-020). The tube containing the glass beads was washed 2x with D-MEM + 10% FBS
and the washings were added to the cell suspension mixture. This mixture was then centrifuged
at 200 g for 5 min. and the pellet was suspended in 50 mL of D-MEM + 10% FBS. The cells
were counted and 5 x 10^6 cells were plated in D-MEM + 10% FBS & 1X penicillin-streptomycin
(Sigma-Aldrich P4333). Cells were cultured at 37° C with 5% CO2. Cells were passaged at least
twice before any experimentation to ensure a homogenous population of MEF cells was present.
2.3.5 Dissection of mouse tissues and RNA isolation
P0 mice were sacrificed using decapitation. The whole brain was removed and
immediately flash frozen in liquid nitrogen. Tails were collected for genotyping when necessary.
For embryonic dissections, the mother was sacrificed using cervical dislocation and the uterus was removed and placed in ice-cold PBS. The embryos were immediately removed from the
yolk-sacs and placed into a separate dish of ice-cold PBS. Yolk-sacs were collected for genotyping. Entire E11.5 embryos, and the heads of E15.5 embryos, were flash frozen in liquid nitrogen. The embryos were then homogenized in TriReagent (Sigma-Aldrich Canada, Oakville,
ON) and stored at -80° C. Total RNA was extracted following the manufacturer’s protocol.
2.3.6 Genotyping of P0 and embryonic mice
Genomic DNA was isolated from yolk-sacs (embryonic) or tail clippings (P0). The tissues were incubated in 400 µL of lysis buffer (0.5% SDS, 0.1M NaCl, 50 mM Tris, pH 8.0,
0.5 µM EDTA, 0.25 µg/µL proteinase K) at 52° C until tissue was no longer visible. To purify the DNA, potassium acetate was added to a final concentration of 1.2M, and a volume of chloroform equal to the total solution volume was added. The solution was incubated at -20° C for 20 min. and then centrifuged for 5 min at 12,000 g at room temperature. The aqueous phase was transferred to a new tube, and the DNA was precipitated with 2 volumes of 100% ethanol.
The DNA was then centrifuged again for 5 min at 12,000 g and the pellet was washed with 70% ethanol before resuspension in 100 µL of nuclease free water.
Samples were genotyped using conventional PCR. Two separate PCR reactions were performed for each sample. Each reaction used the same forward primer (For-5’
CGACCACCATAGGTTGAAGG 3’), located in the first intron of Gtf2ird1, in a region found in
WT and Gtf2ird1-/- alleles. The reverse primers were designed to distinguish between WT and
Gtf2ird1-/- alleles. One reaction used a sequence (Rev- 5’ GGGGAACTTCCTGACTAGGG 3’)
present in the NEO-cassette which was inserted into Gtf2ird1-/- alleles, and would only amplify from Gtf2ird1-/- alleles. The other reaction used a sequence (Rev- 5’
TGGGGAACTGTTTGAGAAGG 3’), which is located in an area of the first intron of Gtf2ird1
which is deleted from Gtf2ird1-/- alleles, and would only amplify from WT alleles. The genotype
of each mouse (WT, Gtf2ird1+/- or Gtf2ird1-/-) was determined based on the results of the two
PCR reactions.
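The decision rule implied by the two-reaction design can be sketched as follows. The two boolean inputs stand for whether a band was scored in the WT-specific and NEO-specific reactions; the function name and the "no call" label for a failed sample are assumptions added for illustration.

```python
def call_genotype(wt_band, neo_band):
    """Infer genotype from the presence of a band in each PCR reaction.

    wt_band:  product seen with the reverse primer in the deleted intron region
    neo_band: product seen with the reverse primer in the NEO cassette
    """
    if wt_band and neo_band:
        return "Gtf2ird1+/-"   # both alleles present
    if wt_band:
        return "WT"            # only the intact allele amplifies
    if neo_band:
        return "Gtf2ird1-/-"   # only the targeted allele amplifies
    return "no call"           # neither reaction worked; repeat the PCR


for bands in [(True, False), (True, True), (False, True), (False, False)]:
    print(bands, call_genotype(*bands))
```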
2.3.7 Microarray analysis using the Affymetrix mouse 430 2.0 gene chip
The extracted total RNA from the brains of P0 newborn mice was cleaned up using an
RNeasy kit (Qiagen) and run on a 1.2% agarose/formaldehyde denaturing gel to determine the
integrity of each sample. The concentration of each sample was then determined using a
spectrophotometer (Beckman DU 530). RNA from individual samples was pooled together.
Three pools containing RNA from WT mice were created along with three pools containing
RNA from Gtf2ird1-/- mice. Each of the 6 pools contained equal amounts of RNA from 9
different mice, at a final concentration of 1 µg/mL.
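Pooling equal amounts of RNA from samples of differing concentration reduces to dividing the target mass by each measured concentration. The sketch below illustrates that arithmetic only; the concentrations and the mass contributed per sample are hypothetical values, not figures from this experiment.

```python
def pooling_volumes(concentrations_ug_per_ul, mass_per_sample_ug):
    """Volume (in µL) of each sample that contributes the same RNA mass."""
    return [mass_per_sample_ug / conc for conc in concentrations_ug_per_ul]


# Hypothetical spectrophotometer readings (µg/µL) for three samples.
readings = [2.0, 1.0, 0.5]
volumes = pooling_volumes(readings, mass_per_sample_ug=1.0)
print(volumes)
print(sum(volumes))  # total pool volume before any final dilution
```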
Microarray analysis was performed by The Centre for Applied Genomics (TCAG) at the
Hospital for Sick Children (Toronto, ON). The RNA was first analyzed on a Bioanalyzer to
ensure that it was of good quality, and then each pool was analyzed on the Affymetrix mouse
430 2.0 gene chip (which contains probes for over 39,000 transcripts) following the
manufacturer’s protocol.
The signals from the gene chips were normalized using Robust Multiarray Analysis
(RMA)104. Differences in gene expression were detected using a second software program,
Significance Analysis of Microarrays (SAM)105, which uses q values as a measure of the false
discovery rate.
2.3.8 Microarray analysis using the Illumina mouseWG-6 v2.0 BeadChip
The extracted total RNA from the heads of E15.5 mice was cleaned up using an RNeasy
kit (Qiagen) and run on a 1.2% agarose/formaldehyde denaturing gel to determine the integrity
of each sample. The concentration of each sample was then determined using a nanodrop
spectrophotometer (Beckman DU 530). RNA from 5 WT mice and 5 Gtf2ird1-/- mice was used
for microarray analysis. WT and Gtf2ird1-/- littermates were used, with 3 WT and 3 Gtf2ird1-/- mice collected from one litter, and 2 WT and 2 Gtf2ird1-/- mice collected from a second litter.
RNA samples were not pooled in this experiment.
Microarray analysis was performed by TCAG at the Hospital for Sick Children (Toronto,
ON). The RNA was first analyzed on a Bioanalyzer to ensure that it was of good quality, and then each sample was analyzed on the Illumina Mouse WG-6 v2.0 Expression BeadChip (which contains probes for over 45,200 transcripts) following the manufacturer’s protocol.
Analysis of microarray data was performed by the Statistical Analysis Core Facility at
TCAG. Data pre-processing included three steps: background correction was performed in the BeadStudio program (Illumina), the data were then transformed to the log2 scale, and quantile normalization106 was performed. Differentially expressed genes were identified using LIMMA
(linear models for microarray data)107. LIMMA fits a linear model for each gene and then uses an empirical Bayes method to moderate the gene-wise residual standard deviations, shrinking them towards a common value, before calculating a moderated t-statistic for each gene. Moderating the residual standard deviations across genes gives a more stable inference for each gene. The moderated standard deviations are a compromise between the individual gene-wise standard deviations and an overall pooled standard deviation.
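The pre-processing and shrinkage steps described above can be illustrated with a small, self-contained sketch. It is not the code used by the Statistical Analysis Core Facility; the function names are hypothetical, ties in the quantile normalization are broken by row order for simplicity, and the prior variance and prior degrees of freedom (which LIMMA estimates from all genes) are supplied by hand here as assumptions.

```python
import math


def log2_transform(matrix):
    """Convert raw intensities (rows = probes, columns = arrays) to the log2 scale."""
    return [[math.log2(value) for value in row] for row in matrix]


def quantile_normalize(matrix):
    """Give every array (column) the same distribution of values.

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by row order (a simplification).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[r][c] for r in range(n_rows)] for c in range(n_cols)]
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_columns) / n_cols for i in range(n_rows)]

    result = [[0.0] * n_cols for _ in range(n_rows)]
    for c, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda r: col[r])
        for rank, r in enumerate(order):
            result[r][c] = rank_means[rank]
    return result


def moderated_variance(gene_var, gene_df, prior_var, prior_df):
    """Empirical Bayes posterior variance: a weighted compromise between the
    gene-wise variance and a common (prior) variance."""
    return (prior_df * prior_var + gene_df * gene_var) / (prior_df + gene_df)


if __name__ == "__main__":
    raw = [[4.0, 8.0], [16.0, 2.0], [64.0, 32.0]]   # hypothetical intensities
    print(quantile_normalize(log2_transform(raw)))
    # 5 vs 5 samples leave 8 residual degrees of freedom per gene;
    # the prior values below are invented for illustration.
    print(moderated_variance(gene_var=4.0, gene_df=8, prior_var=1.0, prior_df=4))
```

The moderated t-statistic for a gene then uses the square root of this moderated variance in place of the ordinary gene-wise standard deviation.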
2.3.9 Expression analysis using quantitative Real-Time PCR
Following extraction, total RNA samples were treated with DNase (Turbo DNA free,
Ambion) and 5 µg of RNA was converted to cDNA using the Superscript II First-Strand
Synthesis System (Invitrogen Canada Inc., Burlington, ON) and random hexamer primers.
cDNA samples were diluted 1/100 with sterile water and subjected to real-time PCR analysis using the Power SYBR Green PCR Master mix (Applied Biosystems, Foster City, CA) and the ABI Prism 7900HT sequence detection system (Applied Biosystems, Foster City, CA).
Primers used for expression analysis are listed in Table 2.1. Samples were run in triplicate, and each experiment was repeated at least twice with consistent results. Absolute quantification analysis was used; each plate included a no template control (water) and serially diluted concentrations of control genomic DNA (range 0.63 – 10 ng/well) to generate a standard curve for transcript quantification. All test genes were normalized to the housekeeping gene succinate dehydrogenase (Sdha). Samples which included RNA and all of the reagents to produce cDNA, except for reverse transcriptase, were run as a negative control to ensure that there was no genomic contamination of the samples.
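As an illustration of absolute quantification from a standard curve, the sketch below fits Ct against log10 of the standard amount, converts sample Ct values to quantities, and normalizes a test gene to Sdha. The Ct values, the function names and the use of one curve per primer pair are assumptions made for the example; they are not taken from the experiments described here.

```python
import math


def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a quantity from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_cts, reference_cts, target_curve, reference_curve):
    """Mean quantity of the test gene divided by mean quantity of the reference gene."""
    target = sum(quantity_from_ct(ct, *target_curve) for ct in target_cts) / len(target_cts)
    reference = sum(quantity_from_ct(ct, *reference_curve) for ct in reference_cts) / len(reference_cts)
    return target / reference


if __name__ == "__main__":
    standards_ng = [10.0, 5.0, 2.5, 1.25, 0.625]     # serial dilution, ng/well
    standard_cts = [24.1, 25.1, 26.2, 27.1, 28.2]    # hypothetical Ct values
    curve = fit_standard_curve(standards_ng, standard_cts)
    # Hypothetical triplicate Ct values for a test gene and for Sdha.
    print(normalized_expression([26.0, 26.1, 25.9], [25.0, 25.1, 24.9], curve, curve))
```

In practice each primer pair would have its own standard curve on the plate; the same curve is reused for both genes above only to keep the example short.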
Table 2.1 Primers used for quantitative real-time PCR amplification from cDNA
Primer Name   Forward primer sequence (5’ – 3’)   Reverse primer sequence (5’ – 3’)

Housekeeping genes:
mHmbsRT   TCCAAGAGGAGCCCAGCTA   ATTAAGCTGCCGTGCAACA
mHprt1RTe3   TGCTCGAGATGTCATGAAGG   AATGTAATCCAGCAGGTCAGC
mSdha   TGATCTTCGCTGGTGTGGATGTCA   CCCACCCATGTTGTAATGCACAGT

Gtf2ird1:
mGtf2ird1e2   ACTGTGACATCCCCACCAAC   GAGTCTAAGGCGGACACCAG
mGtf2ird1e9   CGAGGCTGTGGAAATTGTG   TGTGTCGCTCCTCCAGAATC
mGtf2ird1e21   TGAAGCTCTGGGCATCAAAT   GGGGTAGGCCTTCAATGATTA

Gtf2ird1 candidate target genes identified in vitro:
mBmpr1b_3UTR   GAAGGGTTGGTGTCACTGGT   TGAAAGAGCTGCCTACCACA
mCcnd3_3UTR   GCTCCAACCTTCTCAGTTGC   TAGGGCAGCTCCTCATAAGC
mCfl1_3UTR   GCTATCCCTTCACCCCAGTT   TCAAAAGCAGTTTGGGAAGG
mEpc1_3UTR   GCAGGGAGTATGGAGAGCAC   AGCACGAGAGATTCGAGAGC
mEzh2_e14   ACGGCTCCTCTAACCATGT   CTATCACACAAGGGCACGAA
mFgf11_e4   CTCTCTACCGTCAGCGTCGT   GCTGCCTTGGTCTTCTTGAC
mFgf15_3UTR   CGAGGAAGCCAGAAGGTATG   GGCAAGCTAAGATCCCATGA
mGscRTe1   GCATGTTCAGCATCGACAAC   GTAGAGCCGGGAAGACCAC
mHoxc8_3UTR   AGGGAATGAGGAAGAGGAGAA   AAACTTCAAGGGAGTTGCTG
mLhx1_e2   GGCGAGGAGCTCTACATCAT   TGTTCTCTTTGGCGACACTG
mOpn_e6   TTCCAAAGAGAGCCAGGAGA   TTGTGGCTCTGATGTTCCAG
mTgfb2_e6   GCAGGATAATTGCTGCCTTC   TGTACCCTTTGGGTTCATGG

Gtf2ird1 candidate target genes identified in P0 mice:
4833441D16Rik   CCACCAGTGCAGTGAAAATG   ATGGCTCAGGTCAGAGGAAA
mAI506816   ATAGTGGCCCCATCAAAGTG   AGCCAGTCAAGGATGGTTTG
mAI536236 (Mphosph9)   CCCACGCGTTAGAAAGAGAG   TGACTTACTGGGGTGGGAAG
mAI647811 (BC046410)   TGGGCCTTCCTCATATTCAG   TACCCATGCTGGAGGAAGTC
mAK018172   AGGCAGGAGTGGTGTTCACT   CACCCCCAGTTGTTCTCACT
mAkap9_3UTR   TCAATGGCTCTTTTGTGCTTT   TTCATGTGCTGCTGCTAAGG
mAU019852   AGACCAGGCTGACCACAAAC   GATGAAAGAGCCTGGCGTAG
mAuts2_e24   CAGCACCTCTAGTCGGGAAG   CTTCCTTGCGTTCCTCTTTG
mAV343709 (A930011O12Rik)   GGGTGTGTCCCCAGCTAATA   GGTCAAGTGCCTTCCACATT
mAW556697 (Arrdc3)   ACTGGTCCGAAACAGGATTG   GGAAATACAGGCGACTCCAA
mBB023775   GGGTGATACGGAAGGTTTGA   TCTGAGACACGGTTTTGCTG
mBB040120   TGGTTACCATGGGCATTTG   ATGGAAAGTGGCAGCATAGG
mBB051515   GAGCTGTGCTTTTGTGTGGA   GTGGGATTTCCGTGAGACTG
mBB167280   TTGAGTGAGTGTGTGCGTGA   AGCTCCACAGGACCAACATC
mBB202611 (Dzip1)   TAATCGTATGCAGGGCTGGT   CTAGCGATGCTGCTTGTACG
mBB206454   GGGAAAAGCAAAACAAACCA   CCTGGTGTTCACCTCATCCT
mBB337886   TTTGGTCAGGATGTCTTAGTGC   TGTGAGTTTGTAATGTCCAGCA
mBB373816   AAGCTGGCTTCAAGGAAGAA   TCAGGGGAATCGTTTCAGAC
mBB451211   TTTCCTGGACACTTGCACTG   ATGAGCATGAAGCTCCCATT
mBE956180 (Hpvc-ps)   TGGCTGGTGTTCAGACACTC   ACTGCCTTACACCAGGGATG
mBG070910   TGTTGCTTCTCGTGTTCTGG   GGCAGAGGACATTTGGAAGA
mBG145571   CCGGAACTCAAAAATGTGCT   GGAGGCCTAGGCAACATAGA
mBM199880 (Zfr)   TCTCCAGCCCTCTTGTGACT   ATCCATAGCACTGCCCATGT
mBM200210 (Pex1)   GTTTGCTCCCATCTCTCCAG   GCAAGTGGCACTGATGGTAA
mBM234702   TCTGCTGTGTCTGCTTGTCA   CCCCCATTGTAGCTTCTTGA
mBM239037   CAGCTGAGTGCTTGCTGAGT   GGCAGTTTTCCTCAGTTGCT
mCcm1_3UTR   CTGGCCTTGTGGTAAAGCTG   CAAAATGTGGTGGTTTGTACTCA
mCyp51_3UTR   AAGCCAGTGTGGAGAGAGGA   CAACCCAGTACAGCACGAGA
mDcx_3UTR   TCGCTCAAGTGACCAACAAG   GGCCCAGAAGAGAAGTCACA
mDhcr24_e7   CATCTTCCGCTACCTCTTCG   TACAGCTTGCGTAGCGTCTC
mEif2s1_3UTR   TGGTAGAACTCAATGGGCAAG   TCTACCAGGGGTCAAATTCC
mGprc5b_3UTR   ATCTCACACGGGAAGACACC   GCCCTCAAGAAAGACACAGC
mHoxA5_3UTR   CGGGCAGCTCTCTGTAGTGT   ACGAGAACAGGGCTTCTTCA
mKcnh1_e11   GTGTCCAAGGCAGAGTCCAT   ATTCCGCTGTCACAGGAGTC
mKin_3UTR   TGAAAGGACGCAGAGTTGAA   GTGCCTTGGCTAACACCAAT
mLztf1_3UTR   GAACCTGCCACACATGAACA   CAAGGAAAGCCTAAACATTGG
mMapk8_3UTR   GATGACTACTTGGGCCTTGG   TCACTCAAAAATATGACCACTGAA
mMat2a_e1   AAGCGATCCTCCCTCTGTGT   CGGCGGTGAGAGAGGGCGAC
mMospd3_3UTR   CTCTCAGCTGAACCCACCTC   AGGAGCAAGGTGCAAACATC
mMphosh9_3UTR   TCATGTTTTGCGCAGCTCT   GCCTTTTCCCAGTGCATAAT
mMrpl16_3UTR   GTAGTGAAAGCGCGAGGAAC   AGAACCAGCAAAGACCCTCA
mNdel1_e8   GAATCCAAGTTAGCCGCTTG   TTGCTGTTCATTACCCCACA
mNedd4_e20   CGCAAACATTCTGGAGGATT   GCAACCCCTCCATAGTCAAG
mNpas3_e12   GGGCAATCAGTCCGAGAATA   GTCGTTGCAGTTCATGTCGT
mPeg3_e9   GGATGCACTGATGGGAAACT   CAAATCCTCTGCCCTCAAAG
mPtprf_e30   ATGGGCAGTCAAGGACAATC   GAAGCCTTCACCTGTTTTGG
mRanpb2_3UTR   ACGGCCAGAATACCAACAGT   TCACAGTATCCATGCCATCC
mRgs5_3UTR   CTATGCCCTGATGGAGAAGG   GCAACTTTTGGAAGCCTGAC
mRtn4_3UTR   GCAAAAATCCCTGGATTGAA   CCAAGGGAGTGTCCCCTTTA
mStx3_3UTR   ACAACATGCCCAACTCAACA   TGCGACCTAGAAGAGCCATT
mThrap2_3UTR   AGCAGTACAACGCCCTATCC   CATTGTACAGCTGCGTGAGC
mTrpm3_3UTR   ACAGGGGTCAAAGCATGTTC   ACTTTCTCTGGTGCCTGGTG

Gtf2ird1 candidate target genes identified in embryonic mice:
m2310002F18RIK (Coq2)   GCCCAAGGCTCTAGGTTCTC   TTGCTCATCCAAGCCTAACA
mActb_3UTR   TGGTTACAGGAAGTCCCTCA   AAGCAATGCTGTCACCTTCC
mActl6b_3UTR   ATACCCGTCCACCCCATC   GGGTAATGGGAAAGGGAGAG
mAp4m1_e15   CTCCAGGTTCGATTCCTCAG   TTGCTGTGGCTTAGATGTCG
mAuts2_e20   CTGGCTTACCGAGCTTCAAT   CGGAGGACTACGCCTCTGT
mKatnal1_e11   GGCTTGAGTCCGGAAGAGAT   TCAGAGCCAACTCCAAGTCC
mMospd3_3UTR   CTCTCAGCTGAACCCACCTC   AGGAGCAAGGTGCAAACATC
mRpl21_e5   CTGGCCAAGAGGATCAATGT   CCCTTCTCTTTGGCTTCCTT
mSlc46a3_3UTR   GCAATCCACAGGACAAAACC   GCTGGGCCTGTTCTCTGTAG
mSlc4a4_e23   CGACCTCAGCTTCCTTGATG   TCGTCATTGTCGCTATCCAA
mTaf6_3UTR   TCACATGTGCTGACCTCCTC   GGGGAAAACCTTTCCTCCTT
mZfp68_3UTR   GCTAAGGGGACCCTGTGATT   CAAGGTTTTCCTTCACCGTTT
2.3.10 siRNA knockdown of Gtf2ird1 in neuronal cell lines
siRNA knockdown of Gtf2ird1 was performed in two different neuroblastoma derived
cell lines: Neuro2A (N2A; ATCC #CCL-131) and N1E-115 (ATCC # CRL-2263). Cell lines were maintained in D-MEM (Sigma-Aldrich D5796) with 10% FBS (Gibco A12617DJ) and 1X penicillin-streptomycin (Sigma-Aldrich P4333). For siRNA transfection, cells were cultured in
D-MEM + 10% FBS without antibiotics. Cells were maintained at 37°C with 5% CO2.
siRNAs targeting Gtf2ird1 (Table 2.2), Gapd (ON-TARGETplus GAPD Control Pool
(Mouse)) and a non-targeting control (ON-TARGETplus Non-targeting siRNA #1) were ordered from Dharmacon. siRNAs were resuspended in 250 µL to create a stock concentration of 20 µM. Transfections of siRNA into N2A and N1E-115 cells were conducted using
Lipofectamine 2000 (Invitrogen) following the manufacturer’s protocol. Briefly, cells were transfected in 6-well plates once they were 50-60% confluent. Lipofectamine 2000 was diluted
1/50 in Opti-MEM Reduced Serum Medium (Gibco), and siRNAs were diluted in Opti-MEM
Reduced Serum Medium (final concentration of RNA when added to the cells was 40, 70 or 100 nM). After a five-minute incubation, the diluted Lipofectamine and RNAs were combined, incubated for 20 min. at room temperature, and then added to the cells. Cells were harvested either 24 or 48 hrs following transfection and total RNA was extracted.
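The dilution arithmetic behind these concentrations can be sketched as below. The stock concentration, resuspension volume and final concentrations come from the text; the final volume per well (2 mL) is an assumption made only for this example, since it is not stated, and the function names are hypothetical.

```python
STOCK_UM = 20.0           # stock concentration given in the text (µM)
RESUSPENSION_UL = 250.0   # resuspension volume given in the text (µL)
WELL_VOLUME_UL = 2000.0   # ASSUMPTION: final volume per well of a 6-well plate (µL)


def sirna_amount_nmol(stock_um, volume_ul):
    """Amount of siRNA in the stock tube: µM x µL gives pmol, so divide by 1000 for nmol."""
    return stock_um * volume_ul / 1000.0


def stock_volume_ul(final_nm, well_volume_ul, stock_um):
    """Volume of stock needed per well, from C1 * V1 = C2 * V2 (stock converted to nM)."""
    return final_nm * well_volume_ul / (stock_um * 1000.0)


if __name__ == "__main__":
    print("siRNA per tube (nmol):", sirna_amount_nmol(STOCK_UM, RESUSPENSION_UL))
    for final_nm in (40, 70, 100):
        volume = stock_volume_ul(final_nm, WELL_VOLUME_UL, STOCK_UM)
        print(f"{final_nm} nM final: {volume:.1f} µL of stock per well")
```

With a different well volume the stock volumes scale proportionally.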
Table 2.2 Sequences of siRNAs used to knock down Gtf2ird1 expression
Dharmacon ID   siRNA sequence   Gtf2ird1 exon
J-050113-09   GUACUUACGGAGUGCCGAA   9
J-050113-10   GGAGAUGACUGACUCGUUA   9
J-050113-11   GGUUCUGGAGGAGCGACA   14
J-050113-12   CGGAGGAGCUGUUCGUACU   17
J-050113-19   GAAUGUUCGAUGAGCGCAU   10
J-050113-20   UCAAUGAGAAAUACGGUGA   24
J-050113-21   GUACACAAUGAGAGCGUCU   3
J-050113-22   ACACCAGACUCUCGCGGAU   20
siRNA Pools (contain equal concentrations of each RNA): Mouse GTF2IRD1 ON-TARGETplus Pool A Pool B Pool C SMARTpool L-050113-01-0005 J-050113-09 J-050113-09 J-050113-11 J-050113-09 J-050113-10 J-050113-11 J-050113-19 J-050113-10 J-050113-11 J-050113-19 J-050113-12 J-050113-22
2.3.11 Cellular localization of Gtf2ird1 in Neuro2a cells
There are multiple isoforms of the Gtf2ird1 transcript, which can be classified as “short” or “long” depending on which of two possible 3’ UTRs they contain. The long forms are predominantly expressed in the brain, while the short forms are predominantly expressed in muscle. A long Gtf2ird1 transcript was amplified from cDNA prepared from the brain of a WT P0 mouse, using primers designed to incorporate restriction enzyme sites and remove the endogenous stop codon (For- 5’ AAGCTTCCATGGCCTTGCTGGGGAAG 3’ and Rev- 5’
GCGGCCGCGGCTCTGAGGTCTAATAATCAA 3’). Multiple bands were present when the
PCR product was run on an agarose gel; the brightest band was extracted and cloned into a
pCR2.1-TOPO TA vector (Invitrogen). The Gtf2ird1 sequence was then excised and cloned, in frame, into the multiple cloning site of the mammalian expression vector pcDNA 3.1/myc-His A
(Invitrogen). This vector produces TFII-IRD1 with both myc- and polyhistidine-tags at the C-
terminus.
The Gtf2ird1-pcDNA 3.1 vector was transfected into N2A cells using Lipofectamine
2000 (Invitrogen) following the manufacturer’s protocol. The cells were cultured on coverslips in 6-well plates in D-MEM (Sigma-Aldrich D5796) with 10% FBS (Gibco A12617DJ), and maintained at 37°C with 5% CO2.
Twenty-four hours following transfection the cells were fixed on the coverslips by
treatment with formalin for 10 min. at room temperature. The cells were then washed with PBS
+ 0.1% Triton X-100 for 15 min at room temperature before blocking in 0.5% BSA in PBS for 1
hr at room temperature. Primary antibodies were diluted with 0.5% BSA in PBS, added directly
to the coverslips and incubated for 1 hr at room temperature as follows: mouse monoclonal
antibody Anti-Human TFII-IRD1 (myBioSource #MBS120021), diluted 1:100, or Anti-myc
mouse monoclonal antibody (Invitrogen R95025) diluted 1:1000. The cells were then washed 3x
5 min. with PBS, then incubated with Alexa Fluor 594 goat anti-mouse IgG (H + L) (Invitrogen
#A11005) diluted 1:1000 with 0.5% BSA in PBS for 1hr at room temperature in the dark and
finally washed 3x 5 min. with PBS. The coverslips were dipped into a beaker of ddH2O before
mounting them onto slides using ProLong Gold antifade reagent with DAPI (Invitrogen
#P36931). Pictures were taken at 40X magnification.
2.3.12 Expression analysis using western blots
Protein was extracted from cells using RIPA lysis buffer (10 mM Tris (pH 8.0), 100 mM
NaCl, 1 mM EDTA, 1% NP-40, 0.5% NaDOC, 0.1% SDS) with a protease inhibitor cocktail
(Sigma, P8340). After removing the media, cells were washed with PBS, and then cells were removed from the plate in RIPA buffer, using a cell scraper to detach the cells from the bottom of the dish. Cells in RIPA buffer were passed through a syringe to ensure full lysis of the cells.
The cells were then incubated on ice in RIPA buffer for 20 min. Once lysis was complete the cells were centrifuged at 4° C for 20 min., and the supernatant was transferred to a new tube.
Protein concentration was determined using the Detergent Compatible (DC) Protein
Assay kit (Bio-Rad, 500-0112). 20 µg of protein per sample were used for western blot analysis.
Samples were boiled for 10 min. in SDS loading buffer, before being separated by SDS-PAGE on an 8% polyacrylamide gel. The protein was transferred to a 0.2 µm nitrocellulose membrane
(Pall), and membranes were blocked overnight at 4°C in 5% non-fat dry milk powder in TBS-T
(TBS + 0.05% Tween). Primary antibodies used: Anti-c-myc mouse monoclonal, clone 9E10
(Roche, 11667149001), diluted 1/400, Anti TFII-I/BAP135 (BD Biosciences, 610943), diluted
1/1000. Primary antibodies were diluted in blocking solution and incubated for 90 min. at room temperature with shaking. Membranes were washed 3 x 10 min. in TBS-T, and then incubated for 1 hour at room temperature with ECL Mouse IgG, HRP-Linked Whole Ab (from sheep) (GE
Healthcare, NXA931), diluted 1/10,000 in blocking solution. Following 2 x 10 min. washes in
TBS-T and a final 10 min. wash in TBS, chemiluminescent detection was performed using ECL
(enhanced chemiluminescence) reagents (GE Healthcare) and Hyperfilm (GE Healthcare).
2.4 Results
2.4.1 Gtf2ird1 is expressed in the developing mouse brain
RNA in situ hybridization analysis was done in order to determine if Gtf2ird1 is expressed in the developing mouse brain. A DIG-conjugated anti-sense probe specific to the
Gtf2ird1 3’UTR was used. Whole mount in situ hybridization on E11.5 embryos revealed high levels of expression in the developing forebrain and midbrain (Figure 2.1A).
The same anti-sense probe was used to determine the expression pattern of Gtf2ird1 in newborn (P0) mouse brains. Horizontal sections through the entire head were mounted on slides and subjected to RNA in situ hybridization. Expression of Gtf2ird1 could be detected throughout the brain, with the highest levels detected in the hippocampus, cortex, thalamus, striatum, olfactory bulbs and brain stem (Figure 2.1C).
2.4.2 Expression of candidate target genes Hoxc8 and Gsc is not altered in E11.5 Gtf2ird1-/- mouse embryos
There is evidence to suggest that Hoxc8 and Gsc may be transcriptionally regulated by
TFII-IRD1; however, to date there is no in vivo evidence to support this claim. Gsc is a homeobox-containing gene which is expressed during two key periods of mouse embryo development: initially for a short period of time in the developing primitive streak at E6.4–E6.7 while gastrulation is occurring108, and then during organogenesis beginning at E10.5 109. During
the second phase of expression, high levels of Gsc are found in regions that will form the head and limbs. Gsc-/- mice display no gross abnormalities consistent with a defect in gastrulation;
however, they die shortly after birth110. The cause of death is likely related to craniofacial defects
which impair breathing and olfaction.
Atypical deletions in the WBS region identified in patients with WBS-like phenotypes have indicated that TFII-IRD1 may play a role in proper craniofacial development53. In addition, mice from a transgenic line in which a c-myc transgene has disrupted Gtf2ird1 expression show craniofacial abnormalities53,111. Given the presumed roles for TFII-IRD1 and Gsc in craniofacial development, and the ability of TFII-IRD1 to bind to the Gsc promoter region in vitro68,79, I
investigated whether TFII-IRD1 may regulate Gsc expression in vivo.
Figure 2.1. (A & B) Gtf2ird1 embryonic expression. Whole mount RNA in situ on E11.5 mouse embryos using DIG conjugated probes. (A) Anti-sense probe specific to the 3’ UTR of the Gtf2ird1 transcript. High expression levels are seen in the developing cerebrum. Telencephalic vesicle (TV), ventral mesencephalon (VM). (B) Sense control probe. (C & D) Gtf2ird1 expression in the newborn mouse brain. RNA in situ on a horizontal section of the newborn head. (C) Anti-sense probe specific to the 3’ UTR of the Gtf2ird1 transcript. Expression is seen throughout the brain, including the hippocampus (H), cortex (C), thalamus (T), striatum (S) olfactory bulbs (Ob) and brain stem (Bs). (D) Sense control probe.
I performed qRT-PCR to determine the levels of Gsc expression in Gtf2ird1-/- and WT
E11.5 mouse embryos. At this time point both Gsc and Gtf2ird1 are expressed. mRNA was
extracted from whole embryos, and expression values were normalized to the housekeeping gene
Sdha. No differences in the level of Gsc expression could be detected between Gtf2ird1-/- and
WT mice (Figure 2.2).
Figure 2.2. Expression of goosecoid (Gsc) and Hoxc8 determined by qPCR. mRNA was extracted from whole E11.5 embryos. Expression values are shown relative to the housekeeping gene Sdha. There is in vitro evidence that suggests both Gsc and Hoxc8 are directly regulated by TFII-IRD1, however no statistically significant differences in expression were detected between genotypes using Student’s t-test.
I also looked at the expression levels and pattern of Hoxc8 in Gtf2ird1-/- and WT E11.5
mouse embryos. Hoxc8 is expressed in embryos beginning at E7.5, and continuing until at least
E17.5112. At E11.5 there are two different domains of Hoxc8 expression: in the neural tube and
the paraxial mesoderm. Using qRT-PCR, no differences in the expression level of Hoxc8 between genotypes were detected (Figure 2.2). Whole mount RNA in situ hybridization was performed on E11.5 embryos. Hoxc8 expression was detected in the neural tube and paraxial mesoderm of both Gtf2ird1-/- and WT embryos, and there were no obvious differences in the
expression boundaries (Figure 2.3). Together, these results indicated that TFII-IRD1 does not play a role in the transcriptional regulation of Hoxc8 or Gsc at this time point.
Figure 2.3. The expression pattern of Hoxc8 is not altered in Gtf2ird1-/- mice. RNA in situs on E11.5 Gtf2ird1-/- and wildtype mouse embryos incubated with DIG conjugated probes specific for the Hoxc8 transcript. At E11.5 Hoxc8 is expressed between somites (S) 15 and 23, in the neural tube (N) and the paraxial mesoderm (PM) as indicated by the arrows.
2.4.3 Expression of TFII-IRD1 candidate target genes identified in vitro is not altered in vivo
Chimge et al. performed a microarray on MEF cells which over-expressed TFII-IRD1101.
A number of genes were identified as having altered expression, and in a later publication a
subset were verified to have expression changes in MEF cells that were treated with Gtf2ird1 and
Gtf2i siRNA102. In addition ChIP was used to show that TFII-IRD1 can bind to the promoter
region of some of these genes. In a related study, Lazebnik et al. found that Gtf2ird1 siRNA
treatment of C2C12 cells (a mouse myoblast cell line), resulted in a 600-fold increase in Bmpr1b
expression and a 6900-fold increase in Fgf15 expression81. Using ChIP, TFII-IRD1 was found to
bind to the Fgf15 promoter in C2C12 cells. Fgf15 is an attractive candidate gene for some of the
WBS phenotypes as it is known to be involved in neocortical patterning and development113.
To determine if TFII-IRD1 is involved in the regulation of Fgf15 and Bmpr1b in vivo, I
looked at the expression of these genes in the brains of newborn Gtf2ird1-/- and WT mice using
qRT-PCR. At this time point, both of these genes are expressed in the newborn brain, but I
could not detect any differences in expression levels between genotypes (Figure 2.4). This
indicates that at this time point TFII-IRD1 does not play a role in the regulation of Fgf15 or
Bmpr1b.
Figure 2.4. Expression of Bmpr1b and Fgf15 determined by qPCR. mRNA was extracted from whole brains of P0 mice (n=3/genotype). Expression values are shown relative to the housekeeping gene Sdha. No statistically significant differences in expression were detected between genotypes using Student’s t-test.
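For readers unfamiliar with the test named in the figure legends, the following sketch computes a two-sample Student's t statistic (equal variances assumed) for n=3 per genotype. The expression values are invented for illustration and are not data from this study; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns the t statistic and the degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


if __name__ == "__main__":
    wt = [1.00, 1.10, 0.95]        # hypothetical Sdha-normalized expression, WT
    mutant = [1.05, 0.98, 1.12]    # hypothetical values, Gtf2ird1-/-
    t_stat, df = student_t(wt, mutant)
    # For df = 4, |t| must exceed 2.776 to reach two-sided p < 0.05.
    print(f"t = {t_stat:.3f}, df = {df}, significant = {abs(t_stat) > 2.776}")
```

With only three animals per genotype the test has limited power, so small differences in expression would not be expected to reach significance.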
qRT-PCR was also used to look at the in vivo expression of a number of TFII-IRD1 candidate genes identified by Chimge et al., which were validated in MEF cells. I cultured
MEFs from Gtf2ird1-/-, Gtf2ird1+/- and WT E15.5 embryos, and measured expression levels of
seven of the genes identified by Chimge et al. None of the genes tested showed significant
differences in expression between genotypes, indicating that TFII-IRD1 is unlikely to regulate expression of these genes under the culture conditions used (Figure 2.5A).
In order to determine if TFII-IRD1 plays a role in the regulation of these genes in the brain, I looked at expression of five genes identified by Chimge et al. in the brains of E18.5 and
adult mice. No significant differences in expression of these genes could be detected between
genotypes (Figure 2.5B).
Figure 2.5. Expression of genes previously shown to be targets of TFII-IRD1 in MEFs. (A) mRNA was extracted from MEFs which were cultured from +/+ (n=3), +/- (n=2), and -/- (n=1) E15.5 embryonic mice. (B) mRNA was extracted from whole brains (embryonic mice: n=3/genotype; adult mice: n=3 (+/+) and n=2 (-/-)). Expression values are shown relative to the housekeeping gene Sdha. Hmbs and Hprt are housekeeping genes and were used as a control. (For presentation purposes, some values were scaled as indicated.) No statistically significant differences in expression were detected between genotypes using Student’s t-test.
2.4.4 Global expression analysis of P0 mouse whole brain
Microarray analysis was performed in order to identify direct and indirect targets of TFII-
IRD1 in an unbiased manner. Total RNA was extracted from the brains of newborn WT and
Gtf2ird1-/- mice. RNA from 9 mice of the same genotype was pooled together to create three pools of RNA from WT mice and three pools of RNA from Gtf2ird1-/- mice. The RNA was pooled in order to reduce variability that was not caused by Gtf2ird1 genotype as the mice were from different litters and were on a CD1 (outbred) background. In addition the brains were harvested at slightly different time points due to the variability in accessing newborn litters. All of these factors could result in differences in gene expression that were not attributable to
Gtf2ird1 copy number, and so it was hoped that the effect of these variables could be reduced by pooling the RNA and ensuring that each pool contained mice from different litters.
cRNA was prepared from each of the six pools and was hybridized to an Affymetrix mouse 430 2.0 gene chip. The signals from the gene chips were normalized using Robust
Multiarray Analysis (RMA)104, and differences in gene expression were detected using a second software program, Significance Analysis of Microarrays (SAM)105, which uses q values as a measure of the false discovery rate (FDR) of the identified genes.
Relatively few genes showed altered expression, and the magnitudes of these changes were generally very small (Table 2.3). Using an FDR cutoff of 10%, 8 genes were identified as having changes in expression in the null mice of greater than 2-fold in either direction. An additional 79 genes had changes in expression between 1.2- and 2-fold.
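The thresholds described above can be expressed as a simple filter. This is a sketch only: the function name and category labels are hypothetical, fold changes below 1 are treated as decreases of magnitude 1/fold (so 0.5 counts as a 2-fold decrease), and a q-value equal to the cutoff is assumed to pass. The example pairs are illustrative and are not matched rows from Table 2.3.

```python
def classify_change(fold_change, q_percent, fdr_cutoff=10.0):
    """Bin a probe set by FDR and by the magnitude of its fold change.

    fold_change: mutant/WT expression ratio (values below 1 are decreases).
    q_percent:   q-value reported by SAM, in percent.
    """
    if q_percent > fdr_cutoff:
        return "not significant"
    magnitude = max(fold_change, 1.0 / fold_change)
    if magnitude > 2.0:
        return ">2-fold"
    if magnitude >= 1.2:
        return "1.2- to 2-fold"
    return "<1.2-fold"


if __name__ == "__main__":
    # Illustrative (fold change, q-value %) pairs.
    examples = [(4.555, 0.0), (0.457, 3.64), (1.30, 7.33), (0.75, 8.34), (2.5, 12.0)]
    for fold, q in examples:
        print(fold, q, classify_change(fold, q))
```

Treating increases and decreases symmetrically in this way is what allows a fold change of 0.457 to count among the genes changed by more than 2-fold.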
9 3C 9D 7F3 4E1 5G2 5A1 1H4 5A3 1A3 2H3 5A1 5G1 19B 5A3 2H4 5A1 6C1 4C3 5A1 5A2 5G2 5A3 6C3 14D1 13C3 12C3 14D1 8A1.3 14E2.2 11A3.3 11A3.3 osome 16C3.3 chrom-
number Accession Accession NR_015554 NM_177047 NM_177047 NM_194462 NM_194462 NM_175245 NM_175245 NM_029813 NM_029813 XM_001472446 XM_001472446 XM_001481304 XM_001481304 NM_001042591 NM_001042591 NM_001083918 NM_001083918 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (%) 7.33 7.33 3.64 1.96 8.34 3.28 3.28 2.04 3.64 8.34 7.33 8.34 3.64 7.33 8.34 7.33 q-value 0.757 1.897 1.396 1.391 0.829 4.555 1.268 0.603 1.410 1.525 0.699 0.774 1.247 1.409 1.301 0.457 0.697 0.762 0.751 1.336 0.770 3.378 0.753 0.585 1.303 0.741 0.741 2.837 0.690 0.511 0.773 1.285 0.459 1.305 Fold Change Fold 1457191_at 1439224_at 1459497_at 1452379_at 1443222_at 1440807_at 1460151_at 1459253_at 1459420_at 1439948_at 1455151_at 1434664_at 1430393_at 1439928_at 1442483_at 1432198_at 1449910_at 1458525_at 1442893_at 1437126_at 1440694_at 1439483_at 1430195_at 1456706_at 1435453_at 1424784_at 1438238_at 1446324_at 1446713_at 1445307_at 1444956_at Probe Set ID 1442760_x_at 1437717_x_at 1453589_a_at autism susceptibility candidate 2 candidate susceptibility autism Arrestin domain containing 3 A kinase (PRKA) anchor protein (yotiao) 9 (yotiao) protein anchor (PRKA) kinase A RIKEN cDNA 4833441D16 gene Description Auts2 Akap9 Arrdc3 AI506816 AK018172 AV264602 BB451211 BB373816 AU019852 BB341550 BB337886 BG070910 BB167280 BB206454 BC046401 BB148843 BB534083 BB051515 BB023775 BB040120 BB113018 BB115513 BB474913 BB523556 2510017J16Rik 1700029I01Rik 2610005L07Rik 2010315B03Rik 2410129H14Rik 2210418O10Rik 2810043O03Rik 4833441D16Rik A930011O12Rik Gene Table to found expression altered to microarray P0 2.3. Genes have of mice inbrains the according analysis
7F2 4E2 XF2 9F4 6C1 4C7 2A1 6B3 14B 5B1 5A1 5G2 1H6 19A 5A1 5A3 5G2 14E4 19C3 17A1 12C3 13C3 12C3 5G1.3 osome 11A3.3 16C3.3 chrom- number Accession Accession NR_002847 NM_025943 NM_025943 NM_145569 NM_053272 NM_053272 NM_025280 NM_010453 NM_010453 NM_016700 NM_020010 NM_020010 NM_022420 NM_022420 NM_026114 NM_026114 NM_030675 NM_030675 NM_008283 NM_008283 NM_033322 NM_001033460 NM_001033460 NM_001081462 NM_001081462 NM_001038607 NM_001160016 NM_001160016 NM_001110222 NM_001110222 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (%) 3.28 7.33 7.33 8.34 3.28 2.04 3.64 3.64 7.33 8.34 3.28 7.33 3.28 7.33 q-value 0.522 0.756 1.797 0.758 0.564 1.466 0.775 1.434 0.753 1.306 0.605 0.782 1.292 0.595 0.741 0.707 1.433 0.748 1.924 0.632 0.797 0.700 0.737 1.271 0.725 1.262 1.310 0.512 0.794 Fold Change Fold 1443772_at 1435514_at 1423667_at 1443905_at 1456880_at 1440417_at 1418129_at 1427317_at 1448926_at 1457936_at 1455279_at 1422533_at 1438892_at 1439705_at 1428729_at 1437577_at 1438871_at 1425908_at 1459061_at 1418141_at 1448974_at 1420491_at 1451411_at 1418414_at 1436202_at 1444869_at 1445420_at Probe Set ID 1428974_s_at 1451899_a_at DAZ interacting protein 1 methionine adenosyltransferase II, alpha DNA segment,DNA 19, ERATO Chr Doi 409, expressed 24-dehydrocholesterol reductase antigenic determinant of rec-A protein mitogen activated protein kinase 8 kinase protein activated mitogen homeo box A5 predicted gene 1060 cytochrome P450, 51 P450, cytochrome diabetic embryopathy 1 Description cerebral cavernous malformations 1 malformations cavernous cerebral doublecortin 1 subunit 2, factor initiation translation eukaryotic alpha guanine nucleotide binding protein (G protein), beta 1 G protein-coupled receptor, family C, group 5, member B domain- repeat I II factor transcription general containing 1 motif, sequence central E5 18 papillomavirus Human pseudogene (eag- H subfamily channel, voltage-gated potassium 1 member related), 1 factor-like transcription zipper leucine adenocarcinoma lung 
associated metastasis transcript 1 (non-coding RNA) Kin Dcx Gnb1 Dep1 Lztfl1 Dzip1 Ccm1 Kcnh1 Eif2s1 Hoxa5 Cyp51 Mat2a Mapk8 Malat1 Dhcr24 Gprc5b Gtf2ird1 Hpvc-ps Gm1060 BG075322 BG145571 BM239037 BM211666 BM238940 BM117148 D19Ertd409e Gene
5F 5F 5F 5F 1H3 5G2 14B 8A2 7A1 5G2 7B1 5A2 19A 5B1 8C5 19A 19B 6G2 10B4 12C1 17A1 11B3 15A1 16B3 4D2.1 13D2.1 13D2.2 11A3.3 osome chrom- number Accession Accession XM_884414 XM_884414 NM_009063 NM_009063 NM_011240 NM_011240 NM_013780 NM_013780 NM_145456 NM_145456 NM_023852 NM_023852 NM_029869 NM_029869 NM_022656 NM_022656 NM_011130 NM_011130 NM_172424 NM_172424 NM_008817 NM_008817 NM_030037 NM_030037 NM_023668 NM_023668 NM_011767 NM_008820 NM_008820 NM_027777 NM_027777 NM_011213 NM_172707 NM_172707 NM_026623 NM_026623 NM_025606 NM_025606 NM_024226 NM_024226 NM_001145899 NM_001145899 NM_001040398 NM_001040398 NM_001081203 NM_001081203 NM_001159516 NM_001159516 NM_001025307 NM_001025307 NM_001035239 NM_001035239 NM_001081323 NM_001081323 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (%) 7.33 7.33 7.33 3.28 3.64 7.33 2.04 7.33 7.33 7.33 7.33 7.33 3.28 8.34 7.33 7.33 q-value 1.290 1.462 1.416 1.262 0.442 0.656 1.315 0.729 0.799 0.686 1.247 1.335 0.695 1.272 1.309 0.782 1.465 1.301 1.236 1.301 0.573 1.300 0.646 0.609 0.439 0.347 0.734 1.421 0.703 1.421 Fold Change Fold 1437353_at 1435284_at 1439650_at 1426559_at 1420941_at 1440104_at 1450287_at 1459722_at 1432415_at 1440771_at 1433758_at 1439840_at 1453906_at 1417355_at 1460452_at 1429735_at 1424893_at 1442311_at 1425424_at 1416712_at 1440267_at 1420843_at 1431328_at 1437213_at 1431053_at 1440915_at 1450880_at 1456923_at 1417600_at Probe Set ID 1425530_a_at SET domain containing 1B strawberry notch homolog 1 regulator of G-protein signaling 5 signaling G-protein of regulator RAN binding protein 2 neuronal PAS domain protein 3 Zinc finger, SWIM domain containing 6 member RAS oncogene family oncogene RAS member zinc finger with KRAB and SCAN domains 1 nischarin polymerase (DNA directed), beta directed), (DNA polymerase thyroid receptor hormone associated protein paternally expressed 3 motile sperm domain containing 3 containing domain sperm motile quaking nuclear distribution gene E-like homolog 1 Zinc finger RNA binding protein 
Description hypothetical protein LOC620031 peptidase 4 peroxisomal biogenesis factor 1 factor biogenesis peroxisomal protein tyrosine phosphatase, receptor type, F syntaxin 3 M-phase phosphoprotein 9 nudix (nucleoside diphosphate linked moiety X)-type motif 21 protein phosphatase 1, catalytic subunit, beta isoform reticulon 4 transporter), (H+/peptide 15 family carrier solute member 2 subfamily channel, cation potential receptor transient M, member 3 mitochondrial ribosomal protein L16 protein ribosomal mitochondrial Qk Zfr Stx3 Polb Sbno Pex1 Ptprf Rtn4 Rgs5 Peg3 Pep4 Nisch Ndel1 Npas3 Trpm3 Setd1b Thrap2 Nudt21 Ppp1cb Mrpl16 Ranbp2 RAB3C Zswim6 Slc15a2 Zkscan1 Mospd3 MGC7817 Mphosph9 Gene
2.4.5 Global expression analysis of E15.5 embryo heads
In order to find additional targets of the putative transcription factor TFII-IRD1, a second microarray experiment was performed using the Illumina BeadChip platform. Studies on adult
Gtf2ird1-/- mice have shown that they have structural cerebellar defects, as well as neurotransmitter differences in the cerebellum, cortex, and amygdala. For this reason, I focused on the embryonic time points when these structures/brain regions are developing. RNA was extracted from the heads of E15.5 mouse embryos, and 5 Gtf2ird1-/- mice were compared to 5
WT mice. The mice used in the experiment came from crosses between Gtf2ird1+/- mice, which allowed for the comparison of Gtf2ird1-/- and WT littermates.
Differentially expressed genes were detected using LIMMA107, following normalization of the log2 scale transformed data using the quantile normalization method106. Eighteen genes were shown to have altered expression in Gtf2ird1-/- mice with an adjusted p-value <0.1 (Table
2.4). Similar to the findings of the first microarray experiment, the changes in expression detected were generally small, with approximately half of the genes being altered by less than 2-fold. Mospd3 and Auts2 were the only genes to be identified in both microarray experiments.
However, in the microarray performed on newborn mice Auts2 was found to be increased in expression by 1.3 fold, while it was found to be decreased by 1.5 fold in the embryonic mice.
There were two probes corresponding to Mospd3 on the Illumina BeadChip array which was used for the embryonic mice; one of these indicated that Mospd3 expression was slightly decreased while the other indicated that it was slightly increased. The array performed with newborn mice on the Affymetrix platform also indicated that Mospd3 expression was slightly increased in Gtf2ird1-/- mice.
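The preprocessing and multiple-testing steps described above can be illustrated with a minimal sketch. It assumes that the adjusted p-values are Benjamini-Hochberg false discovery rates (the default adjustment in LIMMA, although the method is not stated here), and it does not reproduce LIMMA's moderated t-statistics. The function names and the input values are hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (already on the log2 scale).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position (a simplification).
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical log2 intensities: two samples, three probes each.
normalized = quantile_normalize([[1.0, 2.0, 3.0], [4.0, 2.0, 0.0]])

# Hypothetical raw p-values; probes with adjusted p < 0.1 would be reported.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.2])
selected = [i for i, p in enumerate(adjusted) if p < 0.1]
print(normalized, adjusted, selected)
```

After normalization every sample has the same distribution of values, so differences between genotypes at a given probe are not driven by array-wide intensity shifts.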
Table 2.4. Genes shown to be altered in the brains of 15.5 d.p.c. Gtf2ird1-/- mouse embryos according to microarray analysis.
[Table columns: Gene, Description, Probe set ID, Fold Change, Adjusted p-value, Accession number, Chromosome. Fold changes range from 0.07 to 5.09. Chromosome bands represented: 4B3, 5E1, 5E4, 5G2, 5G3, 11B1.3, 11E2.]
ACTB: actin, beta
ACTL6B: actin-like 6B
AP4M1: adaptor-related protein complex AP-4, mu 1
AUTS2: autism susceptibility candidate 2
Coq2: coenzyme Q2 homolog, prenyltransferase
GTF2IRD1: GTF2I repeat domain containing 1
KATNAL1: katanin p60 subunit A-like 1
LOC333841
LOC382264
LOC382555
MOSPD3: motile sperm domain containing 3
MUP2: major urinary protein 2
Rpl21: ribosomal protein L21
RPO1-3: RNA polymerase 1-3
SLC46A3: solute carrier family 46, member 3
SLC4A4: solute carrier family 4 (anion exchanger), member 4
TAF6: TAF6 RNA polymerase II, TATA box binding protein (TBP)-associated factor
ZFP68: zinc finger protein 68
2.4.6 Validation of candidate gene expression using qRT-PCR
qRT-PCR was performed in order to verify that the alterations in gene expression
detected in the Gtf2ird1-/- mice using microarrays were accurate. When possible, primer pairs in which one of the primers overlapped with the microarray-probe sequence were used.
Expression changes of most of the known protein-coding genes identified as being altered
in the brains of newborn Gtf2ird1-/- mice could not be validated. When gene expression was
found to be significantly different between genotypes, the changes in expression were generally
small (Figure 2.6). The protein coding genes which showed the largest changes in expression
were Kin, Stx3 and Mrpl16. Interestingly, Stx3 and Mrpl16 are located in a tail-to-tail orientation
on mouse chromosome 19, with only ~150 bp separating their 3’ UTRs. This tail-to-tail
orientation is conserved in humans. Stx3 is an attractive candidate gene for the neurological
symptoms of WBS as it is expressed in neuronal growth cones and is necessary for proper neuron
growth114.
Many of the changes in gene expression identified in the microarray on embryonic mice
also could not be validated using qRT-PCR (Figure 2.7). Significant differences in expression
between genotypes were only detected in seven genes, with the largest changes in expression
seen in Actl6b, Taf6 and Zfp68. Actl6b is an attractive candidate gene for some of the
neurological phenotypes of WBS as it is a member of a neuron specific chromatin remodelling
complex which regulates dendritic growth and arborization115.
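As an illustration of how such a validation can be quantified, the sketch below expresses a target gene relative to a reference gene and compares two genotypes with Student's t-test. It assumes the comparative Ct approach with roughly 100% amplification efficiency, which is not stated in the text; the Ct values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene (e.g. Sdha).

    Assumes the product doubles each cycle, so a difference of one cycle
    corresponds to a two-fold difference in template.
    """
    return 2.0 ** -(ct_target - ct_reference)


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical Ct values for three cDNA pools per genotype (reference Ct = 20.0).
wt = [relative_expression(ct, 20.0) for ct in (22.0, 22.1, 21.9)]
ko = [relative_expression(ct, 20.0) for ct in (23.0, 23.2, 22.9)]

t, df = student_t(wt, ko)
# With df = 4, the two-sided critical value at p = 0.05 is 2.776.
print(t, df, abs(t) > 2.776)
```

With only three pools per genotype the test has four degrees of freedom, so small expression differences need very low variability between pools to reach significance.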
Figure 2.6. qPCR validation of expression of candidate genes identified in microarray analysis of newborn brains. RNA from 9 mice of the same genotype was pooled together to make cDNA, n=3 separate pools/genotype. Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated). * p < 0.05, ** p < 0.005 using Student’s t-test.
Figure 2.7. qPCR validation of expression of candidate genes identified in microarray analysis of E15.5 embryo heads. Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated). * p < 0.05, ** p < 0.005 using Student’s t-test.
2.4.7 Knockdown of Gtf2ird1 in neuronal cell lines does not affect expression of candidate genes
It was noted that nearly all of the genes identified by microarray as being altered in the
embryonic mice were located on chromosome 5, within 45 Mb of the Gtf2ird1 locus. Actl6b,
Taf6, and Zfp68 are all within 5 Mb of the Gtf2ird1 locus. In order to determine if the alterations
in gene expression that were detected were actually the result of the physical disruption of that
locus caused by gene targeting, and not specifically related to the loss of TFII-IRD1, siRNA
knockdown of Gtf2ird1 was performed in neuronal cell lines. siRNA knockdown decreases the
amount of TFII-IRD1 present in the cells without physically disrupting the chromosome, a disruption that can itself alter the expression of nearby genes.
siRNA knockdown of Gtf2ird1 was performed in two different cell lines: Neuro2A and
N1E-115, both of which are derived from mouse neuroblastomas. qRT-PCR analysis was used to determine the level of Gtf2ird1 knockdown. Optimization of Gtf2ird1 knockdown was performed in N2A cells using a pool of 4 different siRNAs (in equal concentrations) designed by
Dharmacon. A non-targeting siRNA was used as a negative control, and a pool of siRNAs which target Gapdh was used as a positive control. Gapdh siRNA treated cells showed a specific 90% reduction in Gapdh expression, while Gtf2ird1 siRNA treated cells only showed a
60% reduction in Gtf2ird1 expression (Figure 2.8A). Gtf2i expression was not affected by the
Gtf2ird1 siRNAs indicating they specifically target Gtf2ird1.
In an attempt to increase the level of Gtf2ird1 knockdown, each of the 4 siRNAs in the
SMARTpool was tested individually, along with 4 additional Gtf2ird1 siRNAs ordered from
Dharmacon. The best-performing individual siRNAs were then mixed in different combinations to create three pools of Gtf2ird1 siRNAs (named A, B and C). The pools were then
tested on both N2A and N1E-115 cell lines in duplicate. Each of the Gtf2ird1 siRNA pools specifically knocked down Gtf2ird1 expression by approximately 60% in Neuro2A cells and by
80% in N1E-115 cells (Figure 2.8B). Treatment with a non-targeting siRNA, or an siRNA targeting Gapdh expression had no effect on the expression level of Gtf2ird1.
Figure 2.8. (A) Knockdown of Gtf2ird1 in the neuronal cell line N2A, determined by qPCR. Gapdh siRNAs were used as a positive control, and Gapdh showed a 90% decrease in expression. Expression of Gtf2i was not affected. (B) Knockdown of Gtf2ird1 in the neuronal cell lines N2A and N1E-115. Expression of the housekeeping gene Hmbs was not affected. Expression values are shown relative to the housekeeping gene Sdha. Primers used in PCR are shown on the X-axis.
Expression levels of candidate genes identified in the microarrays and verified using
qRT-PCR were analyzed in Gtf2ird1 siRNA treated cells. As there were no differences in the effects of Gtf2ird1 pools A, B, and C, candidate gene expression was tested in cells treated with each pool and the expression values were averaged together (n=6). Although Gtf2ird1 expression was only knocked down 60-80% in these cell lines, differences in the expression levels of candidate genes could be detected in Gtf2ird1+/- mice in which Gtf2ird1 expression is
decreased by ~50%. Therefore if the candidate genes were being either directly or indirectly
regulated by TFII-IRD1, I would have expected to see a significant change in expression in the
siRNA treated cells. However, no significant differences in the expression of Actl6b, Taf6, Kin or
Zfp68 could be detected in Gtf2ird1 siRNA treated cells when compared with either non-
targeting siRNA treated cells or untreated cells (Figure 2.9). Stx3 and Mrpl16 are not expressed
in these cell lines, and so the effect of Gtf2ird1 knockdown on their expression could not be
examined. These results indicate that TFII-IRD1 is unlikely to play a role in the transcriptional regulation of these candidate genes in the cell types examined.
Figure 2.9. Expression of candidate genes in Gtf2ird1 siRNA treated neuronal cell lines. Expression values are shown relative to the housekeeping gene Sdha. No statistically significant changes in expression were detected between Gtf2ird1 siRNA treated cells and non-targeting siRNA treated or untreated cells using Student’s t-test. * p < 0.05.
2.4.8 Altered gene expression in Gtf2ird1-/- mice is the result of differences in genetic background
The initial targeting of the Gtf2ird1 locus was done in R1 ES cells, which are derived from a 129X1/SvJ & 129S1 cross, and the mice were backcrossed onto CD1. As the region around the targeted locus may retain a 129 genotype, I hypothesized that the gene expression differences in the Gtf2ird1-/- mice may actually be the result of differential expression between
genetically different mouse strains.
Expression of candidate genes in the brains of newborn and adult mice from a
129S1/SvImJ genetic background was analyzed using qRT-PCR. These mice are very similar
genetically to the R1 ES cells in which the Gtf2ird1 locus was targeted. If the expression
differences in the candidate genes were due to differences in genetic background, then the
expression of these genes in Gtf2ird1-/- mice would be similar to the expression in WT
129S1/SvImJ mice. Conversely, the expression of these genes in WT CD1 mice would be
different than the expression in WT 129S1/SvImJ mice.
Expression of candidate genes located near the Gtf2ird1 locus on chromosome 5 was
found to be the same in Gtf2ird1-/- mice as in WT 129S1/SvImJ mice, and significantly different
from CD1 WT mice (Figure 2.10). In addition, analysis of strain-specific SNPs within Zfp68 (a gene that showed altered expression) demonstrated that the Gtf2ird1-/- mice were homozygous
for 12/13 129S1/SvImJ SNPs, while CD1 WT mice were only similar to the 129S1/SvImJ mice at 2/13 SNPs (Table 2.5). While the 129S1/SvImJ mice are genetically similar to the R1 line which was used to derive the Gtf2ird1-/- mice, they are not identical. This could explain why the
SNPs are not a 100% match between Gtf2ird1-/- mice and 129S1/SvImJ mice.
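The SNP comparison above amounts to counting, for each strain, the positions at which its genotype matches the 129S1/SvImJ reference. A minimal sketch follows; the position labels and alleles are placeholders chosen only to reproduce the 12/13 and 2/13 proportions and are not the genotypes reported in Table 2.5.

```python
def snp_concordance(sample, reference):
    """Return (matches, total) over SNP positions genotyped in both strains."""
    shared = [pos for pos in reference if pos in sample]
    matches = sum(1 for pos in shared if sample[pos] == reference[pos])
    return matches, len(shared)


# Placeholder genotypes at 13 hypothetical SNP positions (snp1 .. snp13).
positions = [f"snp{i}" for i in range(1, 14)]
ref_129 = {pos: "A" for pos in positions}

# Knockout allele: matches the 129 reference everywhere except one position.
knockout = dict(ref_129, snp13="G")

# CD1 wildtype: matches the 129 reference at only two positions.
cd1 = {pos: "G" for pos in positions}
cd1.update(snp1="A", snp2="A")

print(snp_concordance(knockout, ref_129))
print(snp_concordance(cd1, ref_129))
```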
Figure 2.10. Expression of candidate genes in the brains of P0 mice from different genetic backgrounds determined by qPCR. 129+/+ (n=7), Gtf2ird1-/- (n=5), CD1+/+ (n=6). Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated). * p < 0.05, ** p < 0.005 using Student’s t-test.
Table 2.5. A comparison of SNPs in the 3’ UTR of Zfp68 in Gtf2ird1-/- mice and CD1 WT mice relative to 129S1/SvImJ mice.
In order to determine if these results represented true differences in expression between the 129S1/SvImJ and CD1 mice, or if they were the result of SNPs in the genomic sequence inhibiting probe/primer binding, I had each PCR amplicon that showed significantly altered expression sequenced. I did not find any SNPs in the primer sequences themselves, and only two of the amplicons contained a SNP. Thus, the apparent differences in expression between the strains are likely to be true differences, and not the result of different PCR
efficiencies. These results indicate that the differences in expression I had previously detected
for these genes in the brains of Gtf2ird1-/- mice relative to CD1 wildtype mice were not related to the function of TFII-IRD1.
Expression analysis of Pex1 and AI506816, which previously showed significant differences between Gtf2ird1-/- and CD1 WT mice, failed to replicate the differences between
these two genotypes. However the expression of these genes in WT 129S1/SvImJ mice was
significantly different than both Gtf2ird1-/- and CD1 WT mice (Figure 2.10). Both Pex1 and
AI506816 are located on mouse chromosome 5A, while Gtf2ird1 is located at 5G2, a
considerable distance away. It is likely that when the initial expression analysis of these genes
was conducted both the Pex1 and AI506816 alleles were derived from 129S1/SvImJ in the
Gtf2ird1-/- mice, and at some point before samples were collected for the more recent
experiment, recombination resulted in the Gtf2ird1-/- mice having CD1 derived alleles for these
genes.
Expression levels of Stx3 and Mrpl16 were significantly lower in Gtf2ird1-/- mice than in
WT CD1 or WT 129S1/SvImJ mice. Both of these genes are located on mouse chromosome 19.
Thus, the altered expression of these genes is unlikely to be a result of Gtf2ird1-/- mice
harbouring 129S1/SvImJ alleles for these genes.
2.4.9 TFII-IRD1 is found in the cytoplasm of Neuro2a cells
Microarray experiments at time points when Gtf2ird1 is widely expressed in the brain
have been unable to find any clear targets of this putative transcription factor. TFII-I, a family
member of TFII-IRD1, has been shown to have a cytoplasmic role in addition to its role as a
transcription factor116. In order to determine if TFII-IRD1 may also have a cytoplasmic cellular
role, I looked at the localization of TFII-IRD1 protein in Neuro2A cells.
Endogenous expression of TFII-IRD1 in these cells could not be detected using immunocytochemistry or western blots. This could be because the protein is expressed at levels below the threshold of detection or because the antibody binds poorly to the target protein. To date, I have tested a variety of antibodies against TFII-IRD1 and have been unable to detect endogenous expression in tissue samples using western blots, or in cell cultures using western blots or immunocytochemistry. In order to raise cellular TFII-IRD1 levels, Neuro2A cells were transfected with a construct that expresses myc-tagged TFII-IRD1. Western blots performed on transfected cells using an antibody against the myc-tag identified a band of the expected size.
No bands were detected in untransfected cells (Figure 2.11). Immunocytochemistry was performed on transfected Neuro2A cells using either an antibody against the myc-tag, or an antibody against TFII-IRD1 to determine the cellular localization of the protein. Both antibodies produced identical results (Figure 2.12). No signal is visible when either of these antibodies is used on untransfected cells.
TFII-IRD1 can clearly be seen in the nucleus and throughout the cytoplasm of these cells, including the neurite extensions (Figure 2.12). This indicates that TFII-IRD1 may have a biological role other than as a transcription factor.
Figure 2.11. (A) Western blot showing that an anti-myc-tag antibody specifically detects myc-TFII-IRD1 in transfected Neuro2A cells. (B) An anti-TFII-I antibody was used as a loading control. Myc-TFII-IRD1(n) is produced from an expression vector containing the TFII-IRD1 isoform that is most highly expressed in neuronal cells. Myc-TFII-IRD1(m) is produced from an expression vector containing the TFII-IRD1 isoform that is most highly expressed in muscle cells.
Figure 2.12. TFII-IRD1 is found in both the cytoplasm and nucleus of Neuro2A cells. Cells were transfected with a myc-tagged TFII-IRD1 expression construct. Immunohistochemistry was performed using either an antibody against the myc epitope (top row) or an antibody against TFII-IRD1 (bottom row).
2.5 Discussion
Transcription factors are proteins which recognize and bind to specific DNA sequences,
and regulate transcription of the corresponding genes either positively or negatively117. There
have been many studies showing that members of the TFII-I gene family, including TFII-IRD1, are able to regulate transcription by binding to specific DNA sequences. Thus, it is surprising that I was unable to confirm any of the previously identified TFII-IRD1 target genes in vivo, or identify any novel targets in our Gtf2ird1-/- mice.
2.5.1 Targets of TFII-IRD1 identified in vitro
TFII-IRD1 has previously been shown to bind to the promoter regions of the Hoxc8, Gsc,
and TnIs genes using yeast one-hybrid studies74,75,79. A comparison of the bait sequences used in
each experiment revealed a common motif, GGATTA, found in the promoter of each gene. This
is the same consensus sequence that Vullhorst and Buonanno identified using SELEX with the
I-repeats of mouse TFII-IRD180. In addition, TFII-IRD1 was shown to regulate the expression of
a reporter gene when regions of the Gsc and TnIs promoters containing the GGATTA motif were
placed upstream of the transcription start site68,78.
I did not look at the expression of TnIs in the Gtf2ird1-/- mice, as TnIs is involved in
muscle fiber type specification, and is not expressed in the tissues/time-points that I was
studying. The expression of TnIs in Gtf2ird1-/- mice has been studied by others, and no
differences in expression were detected (Stephen Palmer, personal communication). I analyzed
the expression of Gsc and Hoxc8 in Gtf2ird1-/- mouse embryos, and no differences from WT were detected. Previous studies have indicated that TFII-IRD1 represses the expression of both of these genes and so I had expected to find increased levels of Hoxc8 and Gsc in the Gtf2ird1-/-
mice. Recent findings by Palmer et al.118 could help to explain why the results of my in vivo analysis do not correlate with previous in vitro studies.
Palmer et al. found that TFII-IRD1 is able to negatively auto-regulate itself by binding to its own promoter118. They generated a Gtf2ird1 knockout mouse using homologous
recombination to delete exon 2 in 129R1 ES cells. Exon 2 contains the transcription start site, and so this was expected to prevent Gtf2ird1 transcription in these mice. However it was found that the mice did produce a transcript with exon 1 splicing directly into exon 3. Similar to our
Gtf2ird1-/- mice, their knockout mouse produced increased levels of the truncated transcript.
There are two AUG codons in exon 3 of Gtf2ird1, the first of which is out of frame and is
followed by a stop codon five codons downstream. Use of the second AUG would produce an
in-frame protein. Using an expression construct which produced the mutant truncated protein
and comparing the RNA and protein levels to those of a wildtype expression construct, it was
determined that the mutant protein is produced at ~3% of the level of the WT protein. This led
Palmer et al. to postulate that TFII-IRD1 may use negative feedback to increase transcription when the protein levels are too low.
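The reading-frame argument above can be made concrete with a small sketch that locates each ATG in a sequence, reports its frame, and counts the codons before the first in-frame stop. The sequence used is a toy example constructed to mimic the situation described (an out-of-frame first ATG followed by a stop five codons downstream, then an in-frame second ATG); it is not the Gtf2ird1 exon 3 sequence, and frame 0 is arbitrarily defined as the frame starting at the first base.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_until_stop(seq, start):
    """Number of codons read from `start` before the first in-frame stop.

    Returns None if no stop codon is reached before the sequence ends.
    """
    for n, pos in enumerate(range(start, len(seq) - 2, 3)):
        if seq[pos:pos + 3] in STOP_CODONS:
            return n
    return None


def scan_start_codons(seq):
    """List (position, frame, codons_before_stop) for every ATG in seq."""
    results = []
    pos = seq.find("ATG")
    while pos != -1:
        results.append((pos, pos % 3, codons_until_stop(seq, pos)))
        pos = seq.find("ATG", pos + 1)
    return results


# Toy sequence (hypothetical): first ATG in frame 1, second ATG in frame 0.
toy = "C" + "ATGAAACCCGGGTTTTAG" + "CC" + "ATGGCCGCAGGT"
for position, frame, codons in scan_start_codons(toy):
    print(position, frame, codons)
```

In this toy case, translation from the first ATG terminates after five codons, whereas translation from the second ATG continues to the end of the sequence, which parallels why only the second AUG could give rise to a truncated in-frame protein.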
In support of the auto-regulation through negative feedback hypothesis, Palmer et al., and others, have demonstrated that GTF2IRD1 expression in lymphoblast cells from WBS patients is not significantly different than unaffected controls118-121. This indicates that in this cell type
there is increased transcription from the single copy of GTF2IRD1 to make up for the loss of one
copy of the gene. However, this effect appears to be cell type specific as the levels of
GTF2IRD1 are decreased by ~50% in fibroblast cells of WBS patients121.
In order to determine if the auto-regulation occurred through direct interactions between
TFII-IRD1 and its promoter, Palmer et al. compared the GTF2IRD1 promoter sequences from a
variety of organisms on the assumption that the sequence which TFII-IRD1 binds to would be conserved. They identified a 104 bp sequence (GTF2IRD1 upstream region; GUR) which is highly conserved between humans and fish, and contains three GGATTA sequence motifs: a sequence that TFII-IRD1 had previously been shown to recognize and bind. These six base-pair motifs are 100% conserved between different species, and the sequence homology of the
104 bp region between species ends soon before/after the proximal and distal GGATTA sequences.
Using EMSA and luciferase reporter assays, Palmer et al. demonstrated that in order for
TFII-IRD1 to bind to the GUR sequence and/or regulate reporter gene expression, there needed to be a minimum of two GGATTA sequence motifs present, and they could not be separated by more than 57 bp. The strongest interactions occurred when there were three GGATTA sequence motifs present, and if there was only one motif, or if the motifs were separated by more than 57 bp, TFII-IRD1 was not able to bind to the DNA.
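The binding rule reported by Palmer et al. (at least two GGATTA motifs, separated by no more than 57 bp) can be expressed as a simple sequence scan. The sketch below rests on two assumptions that the text does not specify: that the separation is measured from the end of one motif to the start of the next, and that motifs on either strand count. The function names and test sequences are hypothetical.

```python
MOTIF = "GGATTA"
MAX_SPACING = 57  # bp; maximum separation reported by Palmer et al.

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def motif_positions(seq, motif=MOTIF):
    """0-based start positions of the motif on either strand."""
    seq = seq.upper()
    targets = {motif, revcomp(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


def meets_binding_rule(seq, max_spacing=MAX_SPACING):
    """True if two neighbouring motifs lie within max_spacing bp of each other.

    Assumption: spacing is the gap between the end of one motif and the
    start of the next. Checking neighbouring pairs is sufficient, because
    the closest pair of motifs is always a neighbouring pair.
    """
    pos = motif_positions(seq)
    return any(b - a - len(MOTIF) <= max_spacing for a, b in zip(pos, pos[1:]))


single = "CCCC" + MOTIF + "CCCC"        # one motif only
close = MOTIF + "C" * 10 + MOTIF        # two motifs, 10 bp apart
distant = MOTIF + "C" * 60 + MOTIF      # two motifs, 60 bp apart
print(meets_binding_rule(single), meets_binding_rule(close), meets_binding_rule(distant))
```

Applied to a promoter containing a single GGATTA motif, such a scan would return False under these assumptions.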
The yeast one-hybrid experiments which found that TFII-IRD1 can bind to regions of the promoters of the Hoxc8, Gsc and TnIs genes all used bait sequences which included the sequence
GGATTA. The bait sequences were always replicated three times in the construct74,75,79. When
TFII-IRD1 was shown to regulate expression of reporter genes placed downstream of Gsc and
TnIs promoter sequences, the sequences used also contained the GGATTA motif, and the
sequence was replicated six and three times in the constructs respectively78,79. The GGATTA
sequence motif is found only once in the promoter regions of Hoxc8, Gsc and TnIs, and
based on the results of Palmer et al. it is unlikely that TFII-IRD1 is able to bind to, and regulate,
expression of these genes in vivo. It is likely that triplicating the bait sequences in the yeast one-hybrid
and reporter gene assays allowed TFII-IRD1 to bind to a region of DNA which it would not normally be able to interact with. Thus, it is not surprising that no differences in Hoxc8 or
Gsc expression could be detected in Gtf2ird1-/- mouse embryos.
2.5.2 Global analysis of gene expression in Gtf2ird1-/- mice
The two different microarray experiments looking at gene expression levels in the brains
of Gtf2ird1-/- and WT mice were unable to identify any genes that are likely to be regulated by
TFII-IRD1. The number of genes identified in each experiment, and the magnitudes of the
changes in expression were both smaller than expected, given the large number of genes (2000+)
identified by microarray as having altered expression in MEFs over-expressing TFII-IRD1101.
Numerous ChIP-seq experiments have been performed recently to identify binding sites
for specific transcription factors. These studies have identified hundreds to thousands of motifs
throughout the genome to which a particular transcription factor binds122-124. However many of
the binding sites are not located in the vicinity of transcription start sites and it is unlikely that all
of the binding events influence gene expression.
An in vivo microarray experiment was recently performed by Enkhmandakh et al. to look
at gene expression in a different Gtf2ird1 knockout mouse model, Gtf2ird1Gt(XE465)Byg/ Gt(XE465)Byg
(Gtf2ird1Gt/Gt), which was generated from a gene trap ES cell line125. They identified 536 genes
with altered expression in E9.5 Gtf2ird1Gt/Gt embryos; however there are several caveats to the
interpretation of these data. Firstly, Gtf2ird1Gt/Gt embryos die between E8.5 and E12.5, with most
showing signs of being actively resorbed by E9.5125. Thus, it is likely that much of the altered
gene expression was due to cellular processes involved in embryonic death and resorption and is
unrelated to the absence of TFII-IRD1. Secondly, the Gtf2ird1Gt/Gt mouse has a far more severe
phenotype than the other four published Gtf2ird1 mouse models53,82,88,118. In each of these other
models, homozygous mice are healthy and fertile, with milder phenotypes such as behavioural
and cognitive deficits or craniofacial abnormalities. The embryonic lethality observed in the
Gtf2ird1Gt/Gt mice likely results from the use of a gene trap ES cell line which contains an
insertion into intron 22 of Gtf2ird1. The resulting transcript would lead to translation of a fusion protein containing most of TFII-IRD1, but lacking a nuclear localization signal. This fusion protein may still interact with its usual protein partners but be incapable of carrying out its normal function. If this is the case, the downstream effects on global gene expression would be likely to include effects on genes that are not normally either direct or indirect TFII-IRD1
targets.
A number of the genes which I identified as having altered expression in the brains of
Gtf2ird1-/- mice using microarray analysis were good candidate genes for the behavioural
phenotype seen in these mice. Actl6b, which showed decreased expression in Gtf2ird1-/- mice, is a member of a post-mitotic neuron-specific chromatin remodelling complex, and is known to be involved in dendritic growth and development115. Approximately 75% of Actl6b-/- mice die
within two days of birth as a result of defects in neuronal development. Those which survive are
hyperactive indicating defects in neuronal development115.
Microarray results also found Zfp68 to have decreased expression in Gtf2ird1-/- mice.
Little is known about the function of ZFP68, but it does contain the Kruppel-associated
box motif, which is known to cause transcriptional silencing. ZFP68 binds to KAP1
and these proteins then form a complex with other proteins resulting in the formation of
heterochromatin126. Kap1-/- mice show increased anxiety-like behaviours127.
However, while expression of these genes was confirmed to be decreased in the brains of
Gtf2ird1-/- mice using qRT-PCR, the expression differences are unlikely to be linked to the
absence of TFII-IRD1 in the mice. It was noted that all of the genes identified in the microarray
using the Illumina platform on E15.5 embryos were located within 50 Mb of the Gtf2ird1 locus
on chromosome 5, as were many of the genes identified in the microarray using the Affymetrix
platform on P0 mouse brains. This raised the possibility that the physical targeting of the
Gtf2ird1 locus had disrupted the expression of nearby genes. In order to determine if the
differences in expression were directly related to the absence of TFII-IRD1, siRNA was used to knock down expression of Gtf2ird1 in two different neuronal cell lines. This allowed the expression of candidate genes on chromosome 5 to be studied without physically disrupting the chromosome. Gtf2ird1 expression was reduced by 60-80% relative to controls. Differences in expression of candidate genes were detected in the brains of Gtf2ird1+/- mice, which have higher levels of Gtf2ird1 expression than the siRNA treated cells, so this level of knockdown would be expected to have an effect on the expression of any true target genes.
Using qRT-PCR, no differences in the expression of the candidate genes could be detected between Gtf2ird1 siRNA treated cells and controls. This could indicate that TFII-IRD1
does not regulate the expression of the candidate genes examined in the cell types that were used,
or that the differences in expression detected in Gtf2ird1-/- mice are not directly related to the
absence of TFII-IRD1 in the mice.
Another explanation for the clustering of the candidate target genes around the Gtf2ird1
locus is that the differences in expression were the result of background strain. The Gtf2ird1-/-
mice were generated in R1 ES cells which are derived from a 129X1/SvJ & 129S1 cross. The
mice were then back-crossed onto a CD1 background. Most of the genome in the mice used for
analysis would have been a CD1 genotype, but the region flanking the targeted locus would
retain a 129 genotype. Actl6b, Zfp68 and all of the other genes identified in the microarray
performed on E15.5 mouse embryos flank the Gtf2ird1 locus, and therefore may have
polymorphisms in the Gtf2ird1-/- mice which differ from the WT CD1 mice that they were
compared to and which may alter gene expression. Thus when comparing the expression of
genes on chromosome 5G between Gtf2ird1-/- and WT mice, I was actually comparing expression between CD1 and 129 mice.
The phenomenon of genes which flank a targeted locus confounding the results of microarray experiments is a recognized problem128-130. Many polymorphisms exist which result in altered levels of gene expression between different mouse strains131. Thus, a mouse with a
targeted allele may express the genes surrounding the targeted locus at different levels than the
WT mouse to which it is being compared. This flanking gene effect has been shown to persist
after 11 generations of backcrossing130, and to extend up to 40 Mb from the targeted locus130.
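Why flanking donor material persists can be illustrated with a standard linkage calculation. If the targeted allele is selected in every backcross generation, a linked locus keeps its donor (129) allele only if no recombination separates the two in any generation, giving a probability of (1 - r)^n after n generations, where r is the recombination fraction. The sketch below converts physical to genetic distance using an assumed mouse genome-wide average of 0.5 cM/Mb and Haldane's map function; both are simplifications, local recombination rates vary considerably, and the function names are hypothetical.

```python
from math import exp

CM_PER_MB = 0.5  # assumed genome-wide average for mouse; local rates vary


def recombination_fraction(distance_mb, cm_per_mb=CM_PER_MB):
    """Haldane map function: recombination fraction for a physical distance."""
    morgans = distance_mb * cm_per_mb / 100.0
    return 0.5 * (1.0 - exp(-2.0 * morgans))


def prob_donor_retained(distance_mb, generations, cm_per_mb=CM_PER_MB):
    """Probability that a linked locus still carries the donor allele.

    Assumes the targeted allele is selected in every backcross generation
    and that each generation gives one independent chance of recombination.
    """
    r = recombination_fraction(distance_mb, cm_per_mb)
    return (1.0 - r) ** generations


for distance in (5, 20, 40):
    print(distance, round(prob_donor_retained(distance, generations=11), 3))
```

Under these assumptions, genes within a few Mb of the targeted locus are very likely to remain donor-derived even after many generations of backcrossing, while the probability falls off with distance.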
In order to determine if the candidate genes had altered expression because of differences
in genotype, expression of candidate genes was analyzed in Gtf2ird1-/-, WT CD1 and WT 129
mice using qRT-PCR. Expression of genes which flank the Gtf2ird1 locus was found to be
significantly different between WT CD1 and 129 mice, while Gtf2ird1-/- mice were not
significantly different than WT 129 mice. SNP analysis was then used to show that the Zfp68
allele in the Gtf2ird1-/- mice is genetically 129 and contains different polymorphisms than those
found in CD1 mice. These results indicated that the differences in expression of these genes
detected in Gtf2ird1-/- mice were the result of differences in background strain and were not
related to loss of TFII-IRD1.
As some of the genes which had altered expression in Gtf2ird1-/- mice play a role in brain
development and behavioural pathways, it raises the possibility that the behavioural phenotype
seen in the mice is unrelated to the absence of TFII-IRD1 and is actually the result of differences between different mouse strains. This is unlikely, since in contrast to the low anxiety detected in
Gtf2ird1-/- mice, mice of a 129 genetic background have been found to have higher levels of
anxiety compared to other mouse strains132-134. In addition, Gtf2i+/- mice do not show increased
sociability (unpublished results). Gtf2i is also located on chromosome 5G, adjacent to Gtf2ird1
and the Gtf2i+/- mice were also derived from R1 ES cells, and therefore would be expected to
have a 129 genotype for the genes flanking the region.
Not all of the differences in expression in Gtf2ird1-/- mice can be attributed to differences
in genetic background. Microarray analysis on P0 mice found that Kin, Stx3 and Mrpl16 all
showed decreased expression in Gtf2ird1-/- mice. This difference in expression was confirmed
using qRT-PCR; Gtf2ird1-/- mice were shown to have decreased expression of Stx3 and Mrpl16
relative to both WT CD1 and 129 mice. Stx3 was an attractive candidate gene for some of the
Gtf2ird1-/- mouse behavioural phenotypes as it is known to be involved in neuronal growth114 and
synapse function135.
My results indicate that the choice of microarray platform may affect the genes found to
be differentially expressed between two groups. There was very little overlap between the genes
identified as having altered expression using the Illumina array and the Affymetrix array. Part
of the reason for this could be that the arrays were performed at two different time points;
however, many of the genes identified could be validated at both time points using qPCR.
Studies comparing microarray platforms have generally found good correlation between the
Affymetrix and Illumina platforms136-138. One of the biggest differences between the two platforms is in
the probe design; Illumina generally uses one 50-mer probe per transcript while Affymetrix uses
multiple 25-mer probes.
There are a number of possible explanations as to why no targets of TFII-IRD1 were
identified in the mutant mice. It could be that TFII-IRD1 does not regulate gene expression at
the time points examined, or that the regulation is only occurring in a very specific cell
population in the brain, and by examining the entire brain I diluted out the effect. These
scenarios are unlikely as Gtf2ird1 is expressed throughout the brain, at relatively high levels, at
both of the time points examined. It is unlikely that such robust expression would occur if the gene was not fulfilling an important role. It is possible that other GTF2I-family members are compensating for the loss of TFII-IRD1 at the time points studied; however, there is no evidence to date to support this theory.
Another possibility is that the absence of TFII-IRD1 does affect gene expression in
Gtf2ird1-/- mice, but the changes in expression are small, and below the threshold of detection.
This was the case when gene expression in Mecp2 mutant mice (a model of Rett Syndrome) was
examined139. Mecp2 is believed to be a general transcription repressor, and Mecp2-null mice show a disease phenotype similar to that seen in people with Rett syndrome. Microarray analysis was performed on brains from these mice at multiple time points, and no significant changes in gene expression could be detected. However, the authors realized that they could differentiate mutant mice from WT by looking at very subtle changes in gene expression that occurred in a number of genes simultaneously. It is possible that TFII-IRD1 also only causes subtle changes in gene expression which are sufficient to cause the behavioural phenotype seen in Gtf2ird1-/-
mice.
Recent studies have indicated that MeCP2 may play a role in chromatin remodelling in
addition to, or instead of, acting as a gene specific transcription factor. Ishibashi et al.
demonstrated that MeCP2 binds chromatin at sites of entry and exit from nucleosomes, similar to
linker histones; however unlike linker histones, MeCP2 binding is dependent on methylation140.
It was later discovered by Skene et al. that mature neurons contain one MeCP2 molecule for
every two nucleosomes, and that MeCP2 binding occurs throughout the genome, suggesting a
global regulatory role for the protein141. In support of this theory, histone H1 (a linker histone) is
known to be expressed at roughly 50% lower levels in neurons relative to other cell types142.
However, in MeCP2 null mice H1 expression in neurons is increased by 2-fold, bringing it in line with the expression levels seen in other cell types141. Skene et al. proposed that H1 and
MeCP2 may compete for chromatin binding sites, and the absence of MeCP2 can be partially
compensated for by increased H1 expression. MeCP2 may be able to lead to small, but global, changes in transcription levels by acting as a linker protein and by recruiting HDACs to the
chromatin. Thus, MeCP2 is unlikely to function as a classical transcription factor as was
previously believed.
TFII-IRD1 is unlikely to function in the same manner as MeCP2; however an alternate theory to explain my findings is that TFII-IRD1 does not function as a classical transcription factor in the brain at the time points studied. It is possible that TFII-IRD1 does not directly regulate gene expression in vivo, and instead is involved in protein-protein interactions. In order to determine if there is an alternate role for TFII-IRD1 the localization of the protein within cells was studied.
2.5.3 Cellular localization of TFII-IRD1
Previous studies on the cellular localization of TFII-IRD1 have reported nuclear localization of the protein, as would be expected of a transcription factor57,68,94. However,
immunohistochemistry performed on transfected N2A cells revealed that TFII-IRD1 localizes to both the nucleus and cytoplasm of these cells. Interestingly it appeared that in any given cell
TFII-IRD1 staining was seen in only the nucleus or the cytoplasm, never both. However, further studies using confocal microscopy will be needed to confirm this. This is the first time that TFII-IRD1 localization has been studied in a neuronal cell line, and the result indicates that the protein may play a role other than that of a transcription factor in these cells.
Recently a cellular role for TFII-I was discovered. Caraveo et al. found that TFII-I is able to negatively regulate agonist-induced calcium entry into cells143. Intracellular calcium
signalling is initiated by receptor tyrosine kinases and G protein-coupled receptors which
activate the γ or β forms of PLC. PLC-γ is then able to bind to a calcium channel, transient receptor potential channel 3 (TRPC3), resulting in the insertion of the channel into the plasma membrane and an influx of calcium144. Caraveo et al. demonstrated that when TFII-I is
phosphorylated, it can bind to PLC-γ. They went on to show that knocking down Gtf2i in PC12
cells (derived from rat adrenal medulla) leads to an increase in calcium influx, and this
phenotype could be rescued by expressing a human GTF2I isoform that is unaffected by the rat-specific siRNAs. These results suggested that TFII-I negatively regulates intracellular calcium levels.
As binding of PLC-γ to TRPC3 channels results in insertion of the channels in the plasma membrane, Caraveo et al. studied the surface accumulation of TRPC3 after knocking down TFII-I in PC12 cells. They found that increased levels of TRPC3 were found in the plasma
membrane, and there was an increase in total TRPC3 protein levels. This effect was believed to
be post-transcriptional as mRNA levels were not affected.
Proper regulation of intracellular calcium levels via TRPC channels is essential for many neuronal functions including axon guidance145, membrane depolarization146 and innate levels of
fear147. It is possible that TFII-IRD1 may also play a role in regulating intracellular calcium levels through TRPC channels. Decreased levels of TRPC4 protein have been detected in the frontal cortex of adult Gtf2ird1-/- mice (Ted Young, personal communication). Further work will
need to be done to determine if there is a direct interaction between TFII-IRD1 and TRPC4, PLC or other cytosolic proteins, and if intracellular calcium levels are altered in Gtf2ird1-/- mice.
Chapter III: Exon specific differences in gene expression between different mouse strains
3.1 Abstract
There are a number of factors which can confound the analysis of gene expression levels
when comparing mice of different genetic backgrounds, including the use of different polyadenylation and splice sites. I have previously shown that a subset of genes in Gtf2ird1-/-
mice are expressed at different levels than in WT CD1 mice as a result of the retention of DNA
from the parental 129 derived strain. A closer look at expression of these genes in Gtf2ird1-/-
mice revealed that while changes in expression could be detected using primers specific to the
3’UTR of the transcripts, primers targeting upstream coding exons did not necessarily detect a
similar expression pattern. For some genes, including Stx3, Taf6 and Coq2, the genotype
specific expression differences were restricted to the 3’ UTR. Zfp68 and Actl6b also had
decreased expression in the 3’ UTRs, however expression of coding regions was variable:
increased in some exons and decreased in others. Northern blot analysis performed on a subset
of the genes failed to identify alternative transcripts which could explain these findings; however
transcripts which use alternative splice sites within the 3’ UTR and/or alternative
polyadenylation sites were identified using 3’ RACE. Primers designed to differentiate between
transcripts using alternative polyadenylation sites detected genotype specific differences in
expression of genes located on chromosome 5. Altered expression of these genes was found to
be the result of retention of that chromosome region from the targeted embryonic stem cell line,
and therefore dependent upon background strain rather than Gtf2ird1 genotype.
3.2 Literature Review
3.2.1 Polyadenylation of pre-mRNA
A key step in the generation of eukaryotic mRNAs is the addition of a poly(A) tail to the
3’ end of the transcript. This process involves cleavage of the primary RNA transcript in the 3’
UTR, and the subsequent addition of adenosine residues. Proper processing of the poly(A) tail is
essential for gene expression since the poly(A) tail is necessary for transport of the transcript out
of the nucleus148, stabilization of the transcript149 and initiation of translation150.
The majority of mammalian transcripts contain the canonical polyadenylation signal
sequence A(A/U)UAAA 10 – 35 nucleotides upstream of the cleavage site and a GU-rich motif
14-70 nucleotides downstream of the cleavage site148,151. Cleavage requires binding of the
proteins cleavage and polyadenylation specificity factor (CPSF) and cleavage stimulatory factor
(CstF) to these motifs respectively. Binding of these two proteins allows for the assembly of the
3’ processing complex which includes up to 85 different proteins including cleavage factors Im
and IIm (CFIm and CFIIm) and poly(A) polymerase148,152. Not all of the proteins associated with
the polyadenylation complex play a role in the formation of the 3’ end of the transcript, some are
involved in transcription, splicing, and termination153. The actual cleavage of the RNA transcript
is performed by CPSF-73, a subunit of CPSF, however the entire complex of proteins is needed
for cleavage to occur154. Poly(A) polymerase then adds a string of 200-250 adenosine residues to
the 3’ end of the transcript upstream of the cleavage site155.
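The positional rule described above can be illustrated with a short script. This is only a minimal sketch: the function name, the toy sequences and the way distance is measured (from the 3’ end of the hexamer to the cleavage position) are assumptions made for illustration and are not taken from the cited studies.

```python
import re

# Canonical hexamers A(A/U)UAAA, written as DNA (AATAAA or ATTAAA).
# The lookahead lets overlapping candidates be reported as well.
PAS_PATTERN = re.compile(r"(?=(A[AT]TAAA))")

def find_pas_upstream(seq, cleavage_pos, min_dist=10, max_dist=35):
    """List canonical poly(A) signals lying min_dist..max_dist nt upstream of a cleavage site.

    cleavage_pos is the 0-based index of the first nucleotide after the cut.
    Distance is counted from the 3' end of the hexamer (an assumption of this sketch).
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for match in PAS_PATTERN.finditer(seq):
        start = match.start(1)
        distance = cleavage_pos - (start + 6)
        if min_dist <= distance <= max_dist:
            hits.append((start, match.group(1), distance))
    return hits

# Hypothetical 3' UTR fragment: signal, 20 nt spacer, then a GU-rich stretch.
toy_utr = "GGGG" + "AATAAA" + "C" * 20 + "GTGTGT"
print(find_pas_upstream(toy_utr, cleavage_pos=30))
```

A transcript that returns an empty list for its annotated cleavage site would fall into the class of non-canonical sites discussed in the next paragraph.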
In some cases mRNA cleavage and polyadenylation occurs in the absence of the canonical A(A/U)UAAA signal motif. 15-30% of human transcripts and 25% of mouse transcripts do not contain an AAUAAA or AUUAAA element in the 3’ UTR156,157. The
mechanisms by which this occurs are not yet fully understood, but in at least one case CFIm has
been shown to determine where polyadenylation will occur in the absence of an A(A/U)UAAA
motif151. CFIm is composed of two subunits (a small 25 kDa protein, and a larger protein of
either 59, 68 or 72 kDa), and is believed to play a role in linking 3’ end formation and splicing of
pre-mRNAs151,158,159. The most frequently used poly(A) site in the 3’ UTR of the poly(A) polymerase-γ (PAPOLG) gene does not contain a motif with more than a 4 nucleotide match to the sequence A(A/U)UAAA151. The 3’ end of PAPOLG is highly conserved among vertebrates,
and CFIm was found to bind to a UGUAN element repeated upstream of the primary poly(A)
site. Binding of CFIm to these sites is necessary for both poly(A) site cleavage and proper
addition of adenosine nucleotides. CFIm is able to direct 3’ end polyadenylation of PAPOLG
transcripts through interactions with the hFip1 subunit of CPSF (which usually binds to the
A(A/U)UAAA motif) and PAP151. A similar mechanism is likely to regulate poly(A) site
selection in other transcripts which lack the canonical poly(A) motif.
Approximately 54% of human transcripts and 32% of mouse transcripts use alternative
polyadenylation sites, meaning that they must choose which of multiple polyadenylation signals
to use in a given transcript155. This will lead to transcripts with 3’ UTRs that differ in length, or
in combination with alternative splicing, can generate transcripts with different coding regions.
3’ UTRs contain sequences important for RNA localization, stability, translation and microRNA
binding160. Poly(A) site selection has been shown to be important for the proper expression and
function of many genes in the nervous system, including brain-derived neurotrophic factor
(BDNF). Cleavage of BDNF transcripts occurs at one of two possible poly(A) sites, allowing for
the generation of two possible transcripts (“short” and “long”), each of which encodes the same
protein and is produced at high levels161. An et al. demonstrated that the short form localizes
to the somata, while the long form localizes to the dendrites, where it is locally translated161.
Transgenic mice which only express the short form of BDNF have dysmorphic dendritic spines
and impairments in hippocampal LTP, despite the fact that they express the same level of BDNF
protein as WT mice.
The mechanisms by which a cell selects which poly(A) site to use are not yet well
understood162. Potential site selection methods are thought to include the blocking of protein
binding sites in the transcript by RNA binding proteins, post-translational modifications (such as
phosphorylation) of the polyadenylation protein complex, or the affinity of certain subunits of
the polyadenylation complex for sequences flanking the cleavage site162.
CstF-64, a subunit of CstF, was one of the first proteins demonstrated to have a direct
role in poly(A) site selection163. CstF-64 is involved in the processing of immunoglobulin M heavy chain (IgM H-chain) pre-mRNA. B cells produce a membrane-bound form of IgM H-chain (µm), while differentiated plasma cells produce a secreted form (µs)164. The secreted IgM
H-chain uses an upstream µs specific poly(A) site, while the membrane-bound form is generated using a downstream µm specific poly(A) site. Increasing the cellular concentration of
CstF-64 is sufficient to cause the switch from the production of µm to µs163. CstF-64 expression
is the limiting factor in the generation of CstF. It is believed that when low levels of CstF-64 are
present the weaker µs poly(A) site is not recognized, however when levels of CstF-64 are increased, the µs poly(A) site (which is transcribed first) is used. These results indicate that the level of expression of the polyadenylation complex subunits may regulate poly(A) site selection for certain genes; higher levels of expression may be necessary in order to utilize weaker poly(A) signal motifs.
Recently, a novel role for the Nova2 protein in poly(A) site selection was identified165.
Nova2 is a brain specific RNA binding protein which binds to YCAY motifs and regulates the
splicing of pre-mRNAs166. Licatalosi et al. identified Nova2 binding sites using HITS-CLIP, a technique involving the crosslinking of RNA-protein complexes, followed by the use of high throughput sequencing to identify RNA fragments bound to the proteins165. The authors were
surprised to find that Nova2 bound to known 3’UTR regions, near poly(A) sites, as well as to
regions within 10 kb of stop codons which were believed to be unannotated longer 3’ UTRs of
known genes. qRT-PCR was used to look at the expression of 29 candidate Nova targets in
Nova2 knockout mice. Twelve of these genes were found to have significant differences in
poly(A) site selection, though the total levels of expression were the same. Nova2 was able to
block or enhance use of a poly(A) site by binding to flanking YCAY-rich motifs.
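As an illustration of what a “YCAY-rich” flank means in practice, the sketch below counts YCAY motifs (Y = C or U) in a window around a candidate poly(A) site. The window size, the function names and the example sequence are hypothetical choices for this sketch; they are not parameters reported by Licatalosi et al.

```python
import re

# Y = pyrimidine (C or U); sequences are handled as DNA, so U is converted to T.
YCAY = re.compile(r"(?=([CT]CA[CT]))")

def ycay_positions(seq):
    """Return the 0-based start of every (possibly overlapping) YCAY motif."""
    seq = seq.upper().replace("U", "T")
    return [m.start(1) for m in YCAY.finditer(seq)]

def ycay_count_near(seq, site, flank=50):
    """Count YCAY motifs starting within `flank` nt on either side of `site`."""
    lo, hi = max(0, site - flank), site + flank
    return sum(lo <= p < hi for p in ycay_positions(seq))

# Hypothetical RNA fragment with two motifs (UCAU at 0, CCAC at 4).
print(ycay_positions("UCAUCCACGG"))
print(ycay_count_near("UCAUCCACGG", site=5, flank=2))
```

Comparing such counts between the flanks of two competing poly(A) sites is one simple way to ask which site is more likely to be influenced by Nova2 binding.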
In rare cases, 3’ end formation of mRNAs has been found to occur through noncanonical
mechanisms. For example, transcripts of the yeast CTH2 gene are not usually cleaved at the poly(A) signal site; instead the final 1.8 kb of the primary transcript is degraded in a 3’ – 5’ manner by the nuclear exosome/TRAMP complex until it reaches a G/U rich sequence and polyadenylation occurs167. In mice and humans the 3’ UTRs of MALAT1 and MEN β, two non-coding RNAs, form a secondary structure which is recognized and cleaved by RNase P168,169.
Each of these genes does have a canonical polyadenylation site which is used at low levels in
vivo; however, in the majority of transcripts 3’ end processing is done by RNase P, an enzyme
previously identified for its role in tRNA processing. These transcripts do have poly(A) tails;
however, they are not generated using poly(A) polymerase. Instead there is a stretch of
adenosine residues genomically encoded immediately upstream of the RNase P cleavage site.
Together these examples illustrate that polyadenylation of transcripts is a complex
process, and there are many exceptions to the “rule” of canonical 3’ end processing.
3.2.2 Transcription termination
During the elongation stage of transcription the C-terminal domain (CTD) of RNA polymerase II (RNAPII) is phosphorylated at specific serine residues to allow for binding of proteins needed at different points in the generation of the mRNA170. Early in transcription
proteins needed for the addition of the 5’ cap bind to the CTD, then as transcription proceeds
differential phosphorylation allows for interactions with proteins needed for polyadenylation,
and then termination171,172. RNAPII continues to transcribe RNA after passing by the
polyadenylation signal(s), and the mechanisms by which transcription terminates have not yet
been fully elucidated173. There is evidence to suggest that the processing of the 3’ end of mRNA
is a key step in the process of transcription termination174.
Two models have been proposed linking transcription termination, and the displacement
of RNAPII from the RNA strand to the cleavage of the RNA strand in the 3’ UTR for
polyadenylation. The allosteric/anti-termination model proposes that after recognition of a poly(A) site, an anti-termination factor dissociates from RNAPII and/or RNAPII undergoes a conformational change, which gradually causes transcription to cease and RNAPII to “fall off” the transcript175,176. This is in contrast to the torpedo model, first proposed by Connelly and
Manley in 1988177. According to this model, following cleavage of the nascent RNA transcript
at the poly(A) site, a 5’ – 3’ exonuclease degrades the portion of the transcript that is downstream of the cleavage site. Once the exonuclease catches up with RNAPII it causes transcription termination.
Rtt103 was identified in a search for proteins that bind to the CTD of RNAPII in yeast,
and was found to play a role in 3’ end processing178. Rat1, a yeast 5’ – 3’ exoRNase, was found
to associate with Rtt103 and both proteins have been shown to associate strongly with the 3’
ends of genes using ChIP. Kim et al. demonstrated that Rat1 co-transcriptionally degrades the uncapped RNA downstream of the poly(A) site, and that this exonuclease activity was required for proper termination of transcription178. XRN2, the human homolog of Rat1, has been found to
play a similar role in transcription termination giving support to the torpedo model of
termination179.
More recently, it was demonstrated that while XRN2 and Rat1 are necessary for RNAPII release from the nascent transcript, they are not sufficient to cause release of the elongation complex180. The authors found that in addition to its co-transcriptional exonuclease activity,
Rat1 also serves to recruit other 3’ end processing factors needed for termination including Pcf11
and Rna15. Pcf11 has been shown to bind to the CTD of RNAPII, resulting in the dissociation
of RNAPII and the nascent transcript from the DNA template181. It has been suggested that
Pcf11 is an allosteric effector that causes the dissociation of RNAPII by altering its
conformation180,181. These results indicate that neither the torpedo model nor the allosteric model
of termination is sufficient to describe this complicated process. It is likely that processes
described in each model are necessary for RNAPII dissociation and transcription termination to
occur in vivo.
3.2.3 Strain specific gene expression
Gene expression has been shown to vary between different mouse strains. Nadler et al. studied gene expression in 7 different brain regions in 10 different strains of inbred mice, and found that 30% of genes show strain specific differences in expression in at least one brain
region182. This variation in brain expression is not surprising, as different mouse strains are
known to exhibit different behaviours132,133. MicroRNA (miRNA) expression is also likely to
show great variation between mouse strains, and a number of miRNAs have been shown to have
strain specific expression levels in the hippocampus183. This could potentially have downstream
effects on the protein expression levels of multiple genes as many genes are post-
transcriptionally regulated by miRNAs. This type of regulation often occurs in conjunction with
poly(A) site selection as the use of alternative poly(A) sites can result in transcripts which either
include or exclude miRNA binding sites. For example, the β-actin gene encodes two isoforms
which differ in 3’ UTR length. The longer isoform is expressed at lower levels, however this
isoform is translated with higher efficiency due to miRNA binding160.
While poly(A) site selection in different mouse strains has not been studied in detail, strain specific selection has been shown to occur. C57BL/6 mice express an Adh4 isoform in their stomachs that is not found in other mouse strains184. Analysis of the 3’ UTR revealed that a SNP
in C57BL/6 introduced a new poly(A) signal.
These examples illustrate how gene expression can be differentially regulated in different
mouse strains both at the level of transcription and post-transcriptionally.
3.3 Material and Methods
3.3.1 Expression analysis using quantitative Real-Time PCR
The protocol for quantitative real-time PCR can be found in section 2.3.9. Primer
sequences used to detect exon specific expression differences are listed in table 3.1.
Table 3.1 Primers used for quantitative real-time PCR amplification from cDNA
Primer Name     Forward primer sequence (5’ – 3’)  Reverse primer sequence (5’ – 3’)
Actl6b ex.3     GGAGGGGGAGAAAGAGAAGA               CATTCTTGAGGGGCGACAT
Actl6b ex.7     CTGGCAGGGGACTTCATCT                TGGCTGCAATCATGTAAGGA
Actl6b ex.10    GCCCACTGTGCATTATGAAA               GGATCAAACAGGCCCTCAG
Actl6b ex.13    GTCTTAAGCTCATCGCCAGCA              CAGTGAGGCCAAGATGGAG
Actl6b ex.14    TTCCAGCAGATGTGGATCTC               CAACTTCAGGGGCACTTCC
Actl6b 3' UTR   ATACCCGTCCACCCCATC                 GGGTAATGGGAAAGGGAGAG
Ap4m1 ex.15     CTCCAGGTTCGATTCCTCAG               TTGCTGTGGCTTAGATGTCG
Ap4m1 5' UTR    GGGTTCAACTTTCACCGTGT               CCCCTTGGAAGACAGAATGA
Coq2 ex.2       GACCCAGGTTGTTTTCCAGA               CACGGTCCCACATGTCATTA
Coq2 ex.6       TCAGCTCGTCTGCTCACTTC               GGGATTGTCCTTGGAAACCT
Coq2 3' UTR-A   GCCCAAGGCTCTAGGTTCTC               TTGCTCATCCAAGCCTAACA
Coq2 3' UTR-B   AATGCTAACACAGGGGCCTA               GGCAGCGTGTACTGGACTTT
Coq2 3' UTR-C   TCTTGAATTACAGCTTTGGCAGT            ACATGGCCGTGTGCTTTATT
Kin ex.1        GCAAGTCGGATTTTCTGAGC               ATCTGGCAGTACCAGCGAAG
Kin 3' UTR      TGAAAGGACGCAGAGTTGAA               GTGCCTTGGCTAACACCAAT
Mrpl16 5' UTR   CTGGGAAAAGCCACTGTTGT               AAAGTGCATCCGCAGGAG
Mrpl16 ex.4     GCTTGACAATCAACCGCTTT               CAACACCTTTGCGAGTGATG
Mrpl16 3' UTR   GTAGTGAAAGCGCGAGGAAC               AGAACCAGCAAAGACCCTCA
Pex1 ex.21      CCTGGGAAAGACCCGTTATT               ATCTCTCTGCTCCTGGGTGA
Pex1 3' UTR-A   GTTTGCTCCCATCTCTCCAG               GCAAGTGGCACTGATGGTAA
Pex1 3' UTR-B   GCATTAGCTTGAGCACAGCA               TCAAGTGCTTGAATGCTTGG
Stx3 ex.10      ATCATGATCTGCTGTATTATCCTTG          AGGCAAATATGCCCCCAAT
Stx3 3' UTR-A   ACAACATGCCCAACTCAACA               TGCGACCTAGAAGAGCCATT
Stx3 3' UTR-B   TGGTCTCAGGATGGAGGTTC               TTTGGGAGCTGGGTCATAGT
Stx3 3' UTR-C   AAAAGTAGGGAGTACCATGATCTGA          CAGGATGGTGGTGAAGAGG
Stx3 3' UTR-D   GAAGGGACATGGTAGTATTCGAG            CGCATTCTTAACCAACCACA
Stx3 3' UTR-E   AAAAATCATGTTCCCAATGGT              GCCACTTTCAGATGTCTGCTT
Stx3 3' UTR-F   TCAATACAAAGCCAGCTTCTACA            GGCACATAGAAAAATATGGCAAC
Stx3 3' UTR-G   GGACCAGTTTCTTGCACATTT              AAGGAAGCCAAGGGGATAAA
Stx3 3' UTR-H   GCTGCCCATCTTCTGTCAGT               GCTTCAGATCTAGGAAGGGTTTC
Taf6 ex.10      GTGGACAATCACTGGGCACT               GATGTTGTTGGTGGTTGTGC
Taf6 3' UTR     TCACATGTGCTGACCTCCTC               GGGGAAAACCTTTCCTCCTT
Zfp68 ex.3      AGGTGGCTGTGCTGTAGACTC              CATTTCTGGCTTCCAGGACA
Zfp68 ex.5      GGCACTGCAAAACCAAACC                CACACTGGTGTGAGGCTTCT
Zfp68 ex.6      ACACACCCCGAGCAAGTTAG               TGTTGAAGGATTTCCCTGGT
Zfp68 3' UTR-A  GCTAAGGGGACCCTGTGATT               CAAGGTTTTCCTTCACCGTTT
Zfp68 3' UTR-B  CACCGTTTATTCATTTGGTTTAAT           GCTAAGGGGACCCTGTGATT
Zfp68 3' UTR-C  AAAGCAGGAGAGATGGCTCA               TCAGAGGACAAATCCCAAGG
Zfp68 3' UTR-D  GCCACTTCTTTGCTGTTTCC               CCCATGGATAGGTCATGGTC
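Primer sets such as those in Table 3.1 are often sanity-checked for length, GC content and an approximate melting temperature. The sketch below shows one way this could be done for two of the listed pairs. It uses the Wallace rule (Tm ≈ 2(A+T) + 4(G+C)), which is only a rough estimate for primers of this length; this is an illustrative check, not the primer design procedure used in this work, and the function names are assumptions.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)

def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule: 2(A+T) + 4(G+C)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc

# Two primer pairs copied from Table 3.1 (forward, reverse).
primers = {
    "Actl6b ex.3": ("GGAGGGGGAGAAAGAGAAGA", "CATTCTTGAGGGGCGACAT"),
    "Stx3 3' UTR-A": ("ACAACATGCCCAACTCAACA", "TGCGACCTAGAAGAGCCATT"),
}

for name, pair in primers.items():
    for label, seq in zip(("forward", "reverse"), pair):
        print(name, label, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Large differences in estimated melting temperature between the two primers of a pair would be one reason to re-examine that pair before interpreting exon-specific differences in amplification.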
3.3.2 Generation of probes for Northern blots
Probe sequences were amplified from cDNA that had previously been prepared from a
WT P0 mouse brain. Conventional PCR was performed using a high fidelity enzyme mix (Fermentas, K0192) and primers specific to the test genes (Table 3.2). The desired PCR amplicons were cloned into a pCR 2.1-TOPO TA cloning vector (Invitrogen). The vectors were digested with EcoRI to remove the probe, and following electrophoresis, the probe sequence was purified from an agarose gel using a QIAquick Gel Extraction Kit (Qiagen).
Table 3.2. Primer sequences used in the generation of Northern blot probes
Probe               Forward Primer Sequence (5' - 3')  Reverse Primer Sequence (5' - 3')  Length (bp)
Mrpl16 5' end       CTGGGAAAAGCCACTGTTGT               TGATCGATGGCACCTTTACC               504
Mrpl16 3' end       GTGGACGGTGTGAATTTGAA               CAGCTGGCTATCAAACTGTCC              497
Mrpl16 full length  GGAAAAGCCACTGTTGTAGTTG             CAGCTGGCTATCAAACTGTCC              1042
Kin 5' end          GCAAGTCGGATTTTCTGAGC               TCTCCTGCTCTTTCCCTTCC               558
Kin 3' end          CACCGAAAGGCTGGTACATT               GTGCCTTGGCTAACACCAAT               843
Pex1 5' end         TTGGACTCTCAAGCGGAGAT               CTTTGTCAGGCAACAGGACA               819
Pex1 3' end         AGATCAGGTGTCCCGTCTTG               CAGCTCAGCAAATTCCTTCC               745
Pex1 long & short   CCTAAAGACGTCAATGAAGAAAC            GTGATCAAAAGAGCGCCATTC              440
Stx3 e9 - 3' UTR    TTGACCGCATTGAGAACAAC               GCAGGCACTGTGTGCTAAGA               742
Stx3 e10 - 3' UTR   TGCCATCATCTTGGCTTCTA               TGAACCTCCATCCTGAGACC               892

The probes were radioactively labeled before hybridization to the blot. 100 ng of DNA was boiled with 1.25 µL random hexamers, in a total volume of 8.5 µL for two minutes. 1 µL of
100X BSA and 10 µL of 2.5X Random Priming Buffer (0.5 M HEPES, 12.5 mM MgCl2, 0.025
M β-mercaptoethanol, 0.125M Tris, pH8.0, 50 µM dATP, 50 µM dGTP, 50 µM dTTP) were added, before incubation at room temperature for 10 min. 1µL of Klenow fragment (Fermentas,
EP0051), and 5 µL (50 µCi) of deoxycytidine 5’-triphosphate [α-32P] (GE Healthcare,
Quebec) were then added, and the mixture was incubated at 37°C for 2 hours. The probe was then passed through a Sephadex column to remove unincorporated nucleotides.
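The final concentrations in the labelling reaction follow from the volumes listed above by simple dilution (C1V1 = C2V2). The short calculation below is a sketch of that arithmetic; it assumes the five listed volumes make up the whole reaction, and the variable and function names are illustrative.

```python
# Volumes (in µL) as listed in the labelling protocol above.
volumes_ul = {
    "DNA + random hexamers": 8.5,
    "100X BSA": 1.0,
    "2.5X Random Priming Buffer": 10.0,
    "Klenow fragment": 1.0,
    "[alpha-32P]dCTP": 5.0,
}
total_ul = sum(volumes_ul.values())

def final_concentration(stock, volume_ul, total_volume_ul):
    """Concentration after dilution, in the same units as `stock` (C1*V1 = C2*V2)."""
    return stock * volume_ul / total_volume_ul

buffer_ul = volumes_ul["2.5X Random Priming Buffer"]
buffer_fold = final_concentration(2.5, buffer_ul, total_ul)   # in "X"
hepes_molar = final_concentration(0.5, buffer_ul, total_ul)   # in M
dntp_micromolar = final_concentration(50, buffer_ul, total_ul)  # each of dATP, dGTP, dTTP, in µM

print(total_ul, round(buffer_fold, 2), round(hepes_molar, 3), round(dntp_micromolar, 1))
```

Under this assumption the buffer ends up at close to 1X in a reaction volume of about 25 µL.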
3.3.3 Northern blot analysis
Total RNA was extracted from the brains of two WT and two Gtf2ird1-/- mice, using
TriReagent (Sigma) following the manufacturer’s protocol. 15 µg of each RNA sample was
mixed with 2X RNA loading dye containing formamide (Fermentas) and heated at 70°C for 10
min. After chilling on ice, the RNA samples were then run on a 1% agarose-formaldehyde
(0.7M) gel in 3-(N-morpholino) propanesulfonic acid (MOPS) buffer. An RNA ladder,
RiboRuler high range (Fermentas, SM1821), was also run to determine the sizes of the detected
transcripts.
The RNA containing gel was rinsed in RNase free ddH2O, and then soaked for 20 min. in
5 gel volumes of 0.01 N NaOH/3 M NaCl to partially hydrolyze the RNA. The RNA was then
transferred to a positively charged nylon membrane (Amersham Hybond-N+, RPN303B) using
capillary action. The RNA was fixed to the membrane by irradiating at 254 nm at 120 mJ/cm2
for two minutes using the Spectrolinker XL-1500 UV crosslinker.
The blots were incubated for 2 hours at 68° C in modified Church-Gilbert solution (0.5 M sodium phosphate (pH 7.2), 7% (w/v) SDS, 1 mM EDTA (pH 7.0)) before addition of the probe. 2
µL of labelled probe was used to check the efficiency of the labelling reaction, and the rest was denatured by heating at 100° C for 10 min, and then added directly to the modified Church-
Gilbert solution already on the blot. The blots were then incubated at 68° C overnight.
Following hybridization, the blots were washed 1x10 min. with 1X SSC + 0.1% SDS at room temperature, and then 3x10 min. with 0.5X SSC + 0.1% SDS at 68° C. Membranes were then exposed to X-ray film at -70°C for 24 hours to 1 week.
3.3.4 3’ Rapid Amplification of cDNA ends (RACE)
Total RNA was extracted from the heads of E15.5 mouse embryos (Gtf2ird1-/- and WT, 3
mice per genotype) using TriReagent (Sigma) following the manufacturer’s protocol. RNA
samples were treated with DNase (Turbo DNA free, Ambion) to ensure they were free of
genomic contamination. Synthesis of first strand cDNA was performed using Invitrogen’s 3’
RACE System for Rapid Amplification of cDNA Ends kit, following the manufacturer’s
protocol. Briefly, 4 µg of RNA in 11 µL was mixed with 1 µL of the 10 µM oligo(dT)-
containing Adaptor Primer and incubated at 70°C for 10 min. After cooling, 2 µL of 10X PCR
buffer was added along with 2 µL of 25 mM MgCl2, 1 µL of 10 mM dNTP mix and 2 µL of 0.1
M DTT. The mixture was incubated at 42° C for five min. at which point 1 µL of SuperScript II
RT was added and the mixture was incubated for an additional 50 min. The reaction was
terminated by heating at 70° C for 15 min, and any remaining RNA was degraded using RNase
H.
Nested PCR was used to amplify the desired transcripts. The first round of PCR was
generally performed using a forward (gene-specific) primer located 1-2 exons upstream of the exon containing the 3’ UTR. The only exception to this was Zfp68; both rounds of PCR used a forward primer located in the 3’ UTR containing exon. The reverse primer used in the first round of PCR was the Universal Amplification Primer (UAP) which was provided with the
3’RACE kit (Invitrogen). The UAP will bind to the adaptor region which is added to each cDNA by the Adaptor Primer during the first strand synthesis step. The product of the first round of PCR was diluted 1/10,000, and then 1 µL was used in the second PCR round. The forward primer was located either 1 exon upstream of, or at the beginning of, the exon containing
the 3’ UTR. The reverse primer was the Abridged Universal Amplification Primer (AUAP), also provided with the 3’RACE kit.
Table 3.3. Forward primer sequences used in nested PCRs of first strand cDNA from 3’ RACE
Primer name   Primer sequence (5' - 3')
Actl6b ex.12  ACTCTGCTTCAGGGGTTCAC
Actl6b ex.13  AAGTTCAGCCCCTGGATTG
Ap4m1 ex.13   CAACATTCATCTGCACCTTCC
Ap4m1 ex.14   TCCAGATCAGAAGGCAGAGC
Coq2 ex.5     GTGCGTTACTCGGATGGTCT
Coq2 ex.6     AGGAGAACACAAGGCAGTGG
Kin17 ex.3    TCCGAAATGACTTTCTGGAAC
Kin17 ex.4    GCACTAAAAGGGTCCACAACA
Mrpl16 ex.3   TGTCTCCATCCCTGAAAGGT
Mrpl16 ex.4   GCTTGACAATCAACCGCTTT
Stx3 ex.9     TTGACCGCATTGAGAACAAC
Stx3 ex.10    TGCCATCATCTTGGCTTCTA
Taf6 ex.13    CAAGCTCAGCAGGTCAACAG
Taf6 ex.14    CTCCTCAGCCTTCTCCTCCT
Zfp68 ex.6a   CCTGACTCGACACCAGAGAA
Zfp68 ex.6b   GAAAGCCTCTGAGAGCAAACA
3.3.5 Cloning and sequencing of 3’ RACE products
The products from the second round of nested PCR on first strand cDNAs from 3’ RACE were run on an agarose gel. Each band was cut out of the gel, and the DNA was extracted from the agarose using a QIAquick Gel Extraction Kit (Qiagen). Each PCR product was cloned into the pCR 2.1-TOPO TA cloning vector (Invitrogen). The size of the inserted piece of DNA was confirmed using EcoRI digestion to remove the DNA from the vector. Vectors containing DNA of each detected size were sent to the Sanger Sequencing Facility at TCAG for sequencing using
capillary-based fluorescent sequencing on the ABI 3730XL instrument. Sequencing was done
using forward and reverse primers specific to the pCR 2.1-TOPO TA vector. When sequences
derived using these primers were not sufficient to cover the entire piece of DNA, internal primers
within the DNA sequence were used. The sequences were aligned against the mouse genome
using the UCSC genome browser (http://genome.ucsc.edu/).
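When 3’ RACE clones are aligned to a genome, the non-templated poly(A) tail is usually trimmed first so that the end of the alignment marks the cleavage site. The following is a minimal sketch of such a trimming step; it is not described in the protocol above, and the function name and the minimum tail length of 10 are assumptions made for illustration.

```python
import re

def trim_poly_a(read, min_len=10):
    """Remove a terminal run of at least `min_len` A residues from a 3' RACE read.

    Returns (trimmed_read, tail_start), where tail_start is the 0-based index at
    which the removed run began, or None if no tail was found. Genomically encoded
    A residues directly adjacent to the tail cannot be told apart from the tail
    itself and are trimmed with it.
    """
    match = re.search(r"A{%d,}$" % min_len, read.upper())
    if match is None:
        return read, None
    return read[:match.start()], match.start()

# Hypothetical read: seven templated bases followed by a 12 nt tail.
print(trim_poly_a("GATTACC" + "A" * 12))
```

The ambiguity noted in the docstring means that a cleavage position inferred this way is best treated as accurate only to within the length of any A-rich genomic stretch at the site.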
3.3.6 Expression analysis using western blots
Western blots were performed as described in section 2.2.12 with the following
exceptions.
Protein was extracted from newborn mouse brains using RIPA lysis buffer (10 mM Tris
(pH 8.0), 100 mM NaCl, 1 mM EDTA, 1% NP-40, 0.5% NaDOC, 0.1% SDS) with a protease inhibitor cocktail (Sigma, P8340). Each brain was homogenized in 2 mL of lysis buffer, and then incubated on ice for 20 min. Once lysis was complete the lysate was centrifuged at 4° C for
20 min., and the supernatant was transferred to a new tube.
Primary antibodies used: rabbit anti-Syntaxin 3 (Sigma, S5547), diluted 1/1000, and mouse anti-α-Tubulin (Sigma, T9026), diluted 1/50,000. Primary antibodies were diluted in blocking solution and incubated with the membrane for 1 – 2 hours at room temperature with shaking.
Membranes were washed 3 x 10 min. in TBS-T, and then incubated for 1 hour at room temperature with ECL Mouse IgG, HRP-Linked Whole Ab (from sheep) (GE Healthcare,
NXA931) or ECL Anti-rabbit IgG, HRP-Linked Whole Ab (from donkey) (GE Healthcare,
NA934), diluted 1/10,000 in blocking solution. Following 2 x 10 min. washes in TBS-T and a final 10 min. wash in TBS, chemiluminescent detection was performed using ECL (enhanced chemiluminescence) reagents (GE Healthcare) and Hyper Film (GE Healthcare).
3.4 Results
3.4.1 Differential gene expression detected in Gtf2ird1-/- mice is exon specific
Further expression analysis of candidate genes which were altered in the Illumina
mouseWG-6 v2.0 BeadChip array and the Affymetrix mouse 430 2.0 gene chip array revealed
that the changes in expression detected in the Gtf2ird1-/- mice were exon specific; differences
could only be detected in certain exons. The primers that were initially designed to confirm the
results of the microarray were generally in the 3’ UTR of the gene, as this is usually where the
probe sequences on microarray chips are found. When possible, one of the primer sequences
overlapped with the probe sequence on the microarray chip. Other primer sets, designed to
distinguish between known splice forms of the candidate genes, were then used to validate the
changes in gene expression. Surprisingly, expression differences between exons
were detected which could not be explained by known splice forms.
Using qRT-PCR on P0 mice, Kin, Mrpl16 and Stx3 were all found to have significantly
decreased expression in Gtf2ird1-/- mice when primers targeting the 3’ UTR sequences were
used; however, when primers for upstream sequences were used, no changes in expression could
be detected (Figure 3.1). There are two known splice forms of Kin, which differ in where the 3’
UTR begins. qRT-PCR was performed using primers in exon 1 and at the terminal end of the
3’ UTR; both of these primer sets amplify from both known splice forms. Exon 1 expression did
not differ between WT and Gtf2ird1-/- mice, whereas 3’ UTR expression was significantly lower
in the Gtf2ird1-/- mice (Figure 3.1). In addition, expression of the region of the 3’ UTR amplified by this primer set appears to be lower than the expression of exon 1. As the primer pairs used will amplify both known splice forms, the expression level of exon 1 would be expected to equal that of the 3’ UTR. A similar trend was seen for
Mrpl16: a primer pair specific to the 5’ UTR detected no genotype specific differences in
expression, yet expression of the 3’ UTR was found
to be approximately 50% decreased in Gtf2ird1-/- mice (Figure 3.1).
Interestingly, Mrpl16 is found next to Stx3 on mouse chromosome 19. The genes are
oriented in a tail-to-tail fashion, with their 3’ UTRs separated by only 150 bp. The 3’ UTR of
Stx3 also showed significantly decreased expression in Gtf2ird1-/- mice. Exon 10 of Stx3 showed decreased expression in Gtf2ird1-/- mice as well, although the difference was not as dramatic
(Figure 3.1). If the 3’ UTR of either Stx3 or Mrpl16 were to extend further than is indicated in the genomic database (UCSC browser) and overlap with the other gene, it would affect the ability of qRT-PCR to detect gene specific changes in expression, as the primers designed to amplify the
3’ UTR of one gene would actually amplify from both of the genes.
The microarray probe which detected altered Pex1 expression in Gtf2ird1-/- mice is located in the 3’ UTR of a small Pex1 transcript which begins and ends within the longer Pex1 transcript. All coding exons of the short transcript are shared with the longer transcript, but the
5’ and 3’ UTRs are unique (found within introns of the longer transcript). Only this short transcript was affected in the P0 Gtf2ird1-/- mice (Figure 3.1). Expression of the longer
transcript, as detected using primers located in exon 21, was not different between genotypes.
The longer transcript is expressed at 10X higher levels than the short transcript, which made it
impossible to determine if there were significant expression differences within the coding exons
of the short transcript.
Figure 3.1. Differences in gene expression detected in Gtf2ird1-/- mice using qPCR are exon specific. RNA was extracted from P0 mouse brains. RNA from 9 mice of the same genotype was pooled together to make cDNA, n=3 separate pools/genotype. Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated). * p < 0.05, ** p < 0.005 using Student’s t-test.
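The way an expression value "relative to Sdha" can be derived from raw qRT-PCR data is sketched below. This is a minimal illustration that assumes the comparative Ct (2^-ΔCt) method with approximately 100% amplification efficiency for both primer pairs; the quantification method is not restated in this section, and all Ct values, function names and variable names are hypothetical rather than data from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene (2^-dCt).

    Assumes both primer pairs amplify with ~100% efficiency.
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(mutant_values, wt_values):
    """Ratio of mean relative expression, mutant over wild type."""
    return mean(mutant_values) / mean(wt_values)


# Hypothetical (Ct of 3' UTR amplicon, Ct of Sdha) for three RNA pools per genotype.
wt_pools = [(24.0, 20.0), (24.2, 20.1), (23.9, 20.0)]
ko_pools = [(25.0, 20.0), (25.1, 20.1), (25.0, 20.0)]

wt_rel = [relative_expression(target, ref) for target, ref in wt_pools]
ko_rel = [relative_expression(target, ref) for target, ref in ko_pools]

# A one-cycle increase in dCt corresponds to roughly half the expression.
print(f"mutant / WT = {fold_change(ko_rel, wt_rel):.2f}")
```

Because each amplicon is normalized separately, a 3’ UTR amplicon and an exon 1 amplicon from the same gene can give different values even when they are expected to report on the same transcripts.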
Exon specific expression differences were also detected in Zfp68, Taf6, Coq2 and Actl6b, all of which were identified as candidate genes in the Illumina mouseWG-6 v2.0 BeadChip array. qRT-PCR was used to look at the expression levels of specific exons within these transcripts. Alternative splicing is not known to occur within Actl6b transcripts, yet there was great variation in the expression of specific exons in Gtf2ird1-/- mice relative to WT mice.
Expression of the 3’ UTR was decreased by approximately 50%, while expression of exons 13
and 14, which are immediately upstream of the 3’ UTR, was increased by 40-50%. Exons
located further upstream, including exons 3, 7 and 10, did not show differential expression
between genotypes (Figure 3.2).
A similar trend was seen when looking at the expression of Zfp68; the 3’ UTR was
expressed at lower levels in Gtf2ird1-/- mice, while exons 5 and 6, immediately upstream of the
3’ UTR, were expressed at higher levels in Gtf2ird1-/- mice (Figure 3.2). The PCR amplicons
were sequenced, and there were no SNPs found in the primer sequences which would affect PCR
efficiency. However, it may be possible to partially explain these findings based on the
alternative splicing which is known to occur with this gene. Taf6 and Coq2 both showed altered
expression in the 3’ UTR in Gtf2ird1-/- mice, with no detected expression differences in upstream coding exons (Figure 3.2). As alternative splicing is not known to occur with these genes, these findings are harder to explain.
Figure 3.2. Differences in gene expression detected in Gtf2ird1-/- mice using qPCR are exon specific. RNA was extracted from E15.5 embryo heads (n=5/genotype). Expression values are shown relative to the housekeeping gene Sdha. * p < 0.05, ** p < 0.005 using Student’s t-test.
3.4.2 Northern blot analysis does not detect novel alternatively spliced transcripts
In order to identify novel transcripts which could explain the exon specific differences in
gene expression that were detected, Northern blots were performed to look at Mrpl16, Stx3, Kin
and Pex1 expression. Multiple probes were used for each gene to ensure that the probes were
binding specifically to the target RNA.
Identical transcripts were identified with each of the three probes which were used to
look at Mrpl16 expression. Two sizes of transcripts were detected: a large transcript of ~4500 bp
which is likely to correspond to the Mrpl16 primary transcript, and a smaller transcript of ~1200
bp corresponding to Mrpl16 mRNA (Figure 3.3A). No differences in the size or number of
transcripts could be detected between genotypes. At least four different sized transcripts were
detected using probes for Stx3. Again, there were no obvious differences in the size or number
of transcripts between genotypes (Figure 3.3B). There were two different transcripts detected for
Kin: a smaller transcript, under 1800 bp, which likely corresponds to the mRNA, and a larger
transcript of approximately 4700 bp, which may represent the primary RNA transcript (Figure
3.3C). The final gene to be studied using Northern blot analysis was Pex1, which expresses a
full length transcript as well as a smaller transcript that starts and stops internal to the full length
transcript (Figure 3.3D). A specific probe could not be generated for the smaller transcript, and so probes were used which would detect either both the small and large transcripts or only the
larger transcript. The probe which detected the short and long isoforms identified the same
transcripts as the probe which was specific to the 3’ end of the longer isoform (Figure 3.3D).
This indicated that the shorter transcripts may be expressed below the threshold of detection.
The probe which was specific to the 5’ end of the longer isoform appeared to detect some
smaller transcripts (< 2000 bp) which were not detected by the other probes. These could represent non-specific binding or incomplete Pex1 transcripts.
Taken together, these results indicate that, if alternative splicing is the cause of the detected exon specific differences in gene expression, the transcripts produced are either close in size to the known transcripts or produced at levels below the threshold of detection.
Figure 3.3. Northern blot analysis did not reveal any genotype specific splice forms of Mrpl16(A), Stx3(B), Kin(C), or Pex1(D). RNA was extracted from whole brains of P0 mice. Multiple probes were used for each gene. Expected transcript length and exons contained in the cDNA probes are indicated.
3.4.3 Alternative splicing in the 3’UTR identified using 3’ RACE
As Northern blots did not detect any novel splice forms of the candidate genes,
I next examined the transcripts produced by these genes using 3’ RACE. This technique is able to detect small differences in transcripts, such as the inclusion or exclusion of small numbers of base pairs. The 3’ RACE products for each gene were cloned and sequenced, and the sequences were aligned against the UCSC genome browser database to identify transcripts which used alternate polyadenylation sites, or had splicing occurring within the 3’ UTR.
Multiple Stx3 transcripts were identified, including some which appear to be novel
(Figure 3.4). These included transcripts with splicing occurring within the 3’ UTR, and transcripts with shortened 3’ UTRs. Six different 3’ UTR ‘end points’ were detected and 4 of
these corresponded with canonical polyadenylation signal sequence motifs. Mouse mRNAs and
ESTs have previously been reported with similar end points to 3 of the 6 different transcripts that
I detected. There are no mRNAs or ESTs reported which have splicing occurring within the 3’
UTR. No splicing was detected within the Mrpl16 3’ UTR; however, three different 3’ UTR end points were detected (Figure 3.4). The Mrpl16 3’ UTR contains only one canonical polyadenylation signal sequence motif, which is in the middle of the 3’ UTR and is not used in the generation of the full length transcript. I detected full length transcripts, transcripts which correspond to the polyadenylation signal sequence motif, and shorter transcripts which end halfway through the last coding exon. The UCSC database contains ESTs which correspond to each of these transcript end points.
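Checking whether a 3’ RACE end point corresponds to a polyadenylation signal can be expressed as a simple sequence scan. The sketch below is illustrative only: it assumes that "canonical" means the hexamers AATAAA and ATTAAA and that a signal supports an end point when it lies 10 to 30 nt upstream of it. Both the motif set and the window are adjustable assumptions, and the sequence is a toy example rather than a real 3’ UTR.

```python
import re

# Assumed motif set; restrict to ("AATAAA",) for a stricter definition.
CANONICAL_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(utr, signals=CANONICAL_SIGNALS):
    """Return sorted (0-based start, motif) for every motif occurrence."""
    utr = utr.upper().replace("U", "T")
    hits = []
    for motif in signals:
        # Lookahead so that overlapping occurrences are also reported.
        for match in re.finditer("(?=" + motif + ")", utr):
            hits.append((match.start(), motif))
    return sorted(hits)


def supporting_signals(utr, end_point, min_gap=10, max_gap=30):
    """Signals whose 3' edge lies min_gap..max_gap nt upstream of end_point.

    end_point is the 0-based index of the first base after the transcript end.
    """
    supported = []
    for start, motif in find_polya_signals(utr):
        gap = end_point - (start + len(motif))
        if min_gap <= gap <= max_gap:
            supported.append((start, motif))
    return supported


# Toy sequence: one AATAAA near the start and one ATTAAA near the end.
toy_utr = "GGG" + "AATAAA" + "C" * 20 + "GGG" + "ATTAAA" + "GG"

print(find_polya_signals(toy_utr))
print(supporting_signals(toy_utr, end_point=29))
print(supporting_signals(toy_utr, end_point=len(toy_utr)))
```

End points without a supporting signal in the window, such as those ending within a coding exon, would be flagged by an empty result.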
Figure 3.4. 3’ RACE analysis identified novel 3’ UTR splicing and alternative polyadenylation site usage for Mrpl16 and Stx3. RACE was performed on RNA extracted from heads of E15.5 mouse embryos. Nested PCR was performed using gene specific forward primers. PCR products were cloned and sequenced, and the sequences were aligned against the UCSC database. Green arrows indicate location of forward primer used in final round of PCR.
3’ RACE analysis of Zfp68 transcripts detected transcripts with 3 different end points, each of which corresponds to a canonical polyadenylation signal sequence motif, and to mRNAs which have previously been identified (Figure 3.5). The 3’ UTR of Zfp68 is relatively long
(~2600 bp) and contains 8 canonical polyadenylation signal sequence motifs. I did not identify any full length transcripts using this method; however, they are present in the brain tissue at this time point, as I was able to amplify sequences from the terminal end of the 3’ UTR by PCR
performed on cDNA generated using random hexamer primers. It is likely that full length 3’
UTRs were not detected because during the creation of the first strand cDNAs, extension from the terminal end of the UTR did not reach the location of the forward primer used for PCR.
Figure 3.5. 3’ RACE analysis identified alternative polyadenylation site usage for Zfp68. RACE was performed on RNA extracted from heads of E15.5 mouse embryos. Nested PCR was performed using gene specific forward primers. PCR products were cloned and sequenced, and the sequences were aligned against the UCSC database. Green arrows indicate location of forward primer used in final round of PCR.
There are two canonical polyadenylation signal sequence motifs in the Coq2 3’ UTR, and
transcripts corresponding to the use of each motif were detected (Figure 3.6). In addition, a
transcript which extended into the intronic region following the last coding exon was identified.
This transcript did not include any of the 3’ UTR sequence, or a polyadenylation sequence motif.
The UCSC database does not contain any mRNAs or ESTs corresponding to either of the
shorter Coq2 transcripts. A transcript was also detected which has the same end point as full
length Coq2 transcripts, but had a portion of the 3’ UTR and last coding exon spliced out.
Figure 3.6. 3’ RACE analysis identified novel 3’ UTR splicing and alternative polyadenylation site usage for Coq2. RACE was performed on RNA extracted from heads of E15.5 mouse embryos. Nested PCR was performed using gene specific forward primers. PCR products were cloned and sequenced, and the sequences were aligned against the UCSC database. Green arrows indicate location of forward primer used in final round of PCR.
Ap4m1 and Taf6 are located on chromosome 5, in a tail-to-tail orientation; their 3’ UTRs overlap by ~50 bp. Using a Taf6 gene specific forward primer, only full length Taf6 transcripts were identified (Figure 3.7). Full length transcripts were also identified using a forward primer
specific to Ap4m1; however, an extended Ap4m1 transcript was also identified (Figure 3.7). This transcript extends well into the 3rd-last exon of Taf6, and contains a polyadenylation sequence motif at the terminus. Primers used to look at expression of the Taf6 3’ UTR using qRT-PCR would have detected this extended Ap4m1 transcript as well as the Taf6 transcript.
Figure 3.7. 3’ RACE analysis identified novel 3’ UTR splicing and alternative polyadenylation site usage for Ap4m1. RACE was performed on RNA extracted from heads of E15.5 mouse embryos. Nested PCR was performed using gene specific forward primers. PCR products were cloned and sequenced, and the sequences were aligned against the UCSC database. Green arrows indicate location of forward primer used in final round of PCR. Sequencing was done on each cloning vector using forward and reverse primers; the dotted line in the extended Ap4m1 transcript represents the region not covered using the sequencing primers.
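The consequence of overlapping, oppositely oriented transcripts for qRT-PCR can be made explicit by asking which transcripts could serve as template for a given amplicon. The following is a minimal sketch that assumes cDNA is not strand specific, so any transcript whose exon fully contains the amplicon interval is counted. The coordinates and the `Transcript` structure are hypothetical and are not the real positions of Taf6 or Ap4m1.

```python
from typing import NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    strand: str
    exons: Tuple[Tuple[int, int], ...]  # half-open genomic intervals (start, end)


def templates_for_amplicon(amplicon, transcripts):
    """Names of transcripts with an exon fully containing the amplicon.

    Strand is ignored on purpose: after PCR on cDNA made with random
    hexamers, products from either strand are indistinguishable.
    """
    a_start, a_end = amplicon
    return [
        tx.name
        for tx in transcripts
        if any(start <= a_start and a_end <= end for start, end in tx.exons)
    ]


# Hypothetical tail-to-tail layout with a ~50 bp annotated 3' UTR overlap.
transcripts = [
    Transcript("Taf6", "-", ((1000, 1500), (3000, 3150))),
    Transcript("Ap4m1_annotated", "+", ((600, 1050),)),
    Transcript("Ap4m1_extended", "+", ((600, 2450),)),
]

taf6_utr_amplicon = (1200, 1300)       # inside the Taf6 3' UTR
taf6_upstream_amplicon = (3020, 3120)  # upstream exon, beyond the extension

print(templates_for_amplicon(taf6_utr_amplicon, transcripts))
print(templates_for_amplicon(taf6_upstream_amplicon, transcripts))
```

In this toy layout the 3’ UTR amplicon reports on two genes at once, whereas the upstream amplicon reports on one.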
Actl6b was the final gene subjected to 3’ RACE analysis. PCR using Actl6b specific forward primers resulted in a single band in WT and Gtf2ird1-/- mice. These bands were extracted from the gel, and cloned. Twenty random clones derived from mice of each genotype were selected for sequencing. No large deviations from the Actl6b full length transcript were detected; however, it was noted that the transcripts detected with 3’ RACE had 9 different end points (Figure 3.8).
There was a difference of ~30 bp between the longest and shortest transcripts, with most of the transcripts being within 15 bp of each other. Actl6b does not have a canonical polyadenylation signal sequence motif, and the terminus of the 3’ UTR is highly conserved among mammals, indicating that this region could be of functional importance.
Figure 3.8. 3’ RACE analysis identified novel transcript endpoints for Actl6b. RACE was performed on RNA extracted from heads of E15.5 mouse embryos. Nested PCR was performed using gene specific forward primers. PCR products were cloned and sequenced, and the sequences were aligned against the UCSC database. Regions of the 3’ UTR that are conserved between mammals are indicated at the bottom.
3.4.4 Expression levels of different 3’UTR isoforms differ between genotypes
Primers were designed to perform qRT-PCR analysis which could differentiate between the different candidate gene isoforms that were detected using 3’ RACE. Three different 3’ UTR
lengths were detected for Zfp68: a short 3’ UTR and two intermediate length 3’ UTRs.
Significant decreases in expression were seen in Gtf2ird1-/- mice when primers which detect the short 3’ UTR or the full length 3’ UTR were used (Figure 3.9). The primers used to detect the
full length transcript were located at the terminal end of the transcript, and would not have
detected any of the shorter isoforms. However, the primers which would amplify from
transcripts with the short 3’ UTR would also amplify transcripts with intermediate or full length
3’ UTRs. Thus the decreased expression of the short isoform in Gtf2ird1-/- mice is likely to be
larger than indicated by these results. No genotype specific changes in the expression levels of
the transcripts containing intermediate length 3’ UTRs were detected.
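The reasoning that the true decrease in the short isoform is likely larger than measured can be illustrated with a worked example. The sketch below assumes that all amplicons amplify with equal efficiency and are expressed on a common scale, so that the signal from a proximal amplicon is the sum of all isoforms containing it. The numbers are hypothetical, and a single intermediate class is used for brevity.

```python
def isoform_specific(nested_signals):
    """Estimate per-isoform abundance from nested amplicon signals.

    nested_signals is a list of (label, signal) ordered from the most
    proximal amplicon (present in every isoform) to the most distal
    amplicon (present only in the longest 3' UTR). Each estimate is the
    signal minus the signal of the next, longer class.
    """
    estimates = {}
    for i, (label, signal) in enumerate(nested_signals):
        longer = nested_signals[i + 1][1] if i + 1 < len(nested_signals) else 0.0
        estimates[label] = signal - longer
    return estimates


# Hypothetical amplicon signals (arbitrary units).
wt = isoform_specific([("short", 1.00), ("intermediate", 0.40), ("full", 0.10)])
ko = isoform_specific([("short", 0.80), ("intermediate", 0.40), ("full", 0.05)])

apparent_drop = 1 - 0.80 / 1.00                  # what the proximal amplicon shows
corrected_drop = 1 - ko["short"] / wt["short"]   # short isoform alone

print(f"apparent decrease:  {apparent_drop:.0%}")
print(f"corrected decrease: {corrected_drop:.0%}")
```

With these illustrative values, the proximal amplicon understates the change in the short isoform because its signal is diluted by the longer isoforms it also detects.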
Figure 3.9. qPCR using primers designed for specific isoforms identified through 3’ RACE. Location of primer amplicons is indicated on the bottom diagram. RNA was extracted from E15.5 embryo heads (n=5/genotype). Expression values are shown relative to the housekeeping gene Sdha. ** p < 0.005 using Student’s t-test.
Similar results were seen when looking at expression of the various Coq2 3’ UTR isoforms. Genotype specific differences in expression were seen only when using primers which would amplify transcripts with short and long 3’ UTRs (Figure 3.10). When primers were used which exclusively detected the long form of the 3’ UTR, or detected a region which was spliced out in some transcripts, no differences in expression could be detected in Gtf2ird1-/- mice.
Figure 3.10. qPCR using primers designed for specific isoforms of Coq2 identified through 3’ RACE. Location of primer amplicons is indicated on the bottom diagram. RNA was extracted from E15.5 embryo heads (n=5/genotype). Expression values are shown relative to the housekeeping gene Sdha. * p < 0.05 using Student’s t-test.
No alternate transcripts were detected for Taf6 using 3’ RACE; however, a transcript was detected for Ap4m1 which overlaps with the Taf6 sequence. While the probes on the microarray would have been able to differentiate between the transcripts, the primers used in qRT-PCR would amplify from both the Taf6 transcript and the longer Ap4m1 transcript. Although primers in the Taf6 3’ UTR detected a ~40% decrease in expression in Gtf2ird1-/- mice, no differences were detected in exon 10 (which does not overlap with the Ap4m1 transcript) or exon 13 (which does overlap with the Ap4m1 transcript) (Figure 3.11). Primers were also used which are located in introns 12 and 14 of Taf6. No significant differences in expression between genotypes were
detected in intron 14; however, intron 12 showed significantly increased expression in the
Gtf2ird1-/- mice (Figure 3.11).
Figure 3.11. qPCR using primers designed for specific isoforms of Ap4m1 identified through 3’ RACE. Location of primer amplicons is indicated on the bottom diagram. RNA was extracted from E15.5 embryo heads (n=5/genotype). Expression values are shown relative to the housekeeping gene Sdha. ** p < 0.005, * p < 0.05 using Student’s t-test.
Of the genes examined using 3’ RACE, Stx3 had the largest number of 3’ UTR isoforms.
Eight different PCR primer pairs were used to look at expression of the Stx3 3’ UTR in 5 WT and 5 Gtf2ird1-/- mice (Figure 3.12). It was noted that some of the mice belonging to each genotype showed decreased 3’ UTR expression using certain primer pairs, and those same mice showed no (or extremely low) 3’ UTR expression using primers specific to different areas of the
3’ UTR. It is difficult to interpret the expression levels detected by the different primer sets, and it is likely that there are alternate isoforms of the Stx3 3’ UTR which were not detected in this study. Interestingly, there were WT and Gtf2ird1-/- mice which showed decreased expression in
the Mrpl16 3’UTR, and these were the same mice which showed lower levels of Stx3 3’ UTR
expression (Figure 3.12). This could indicate that the mechanism of 3’ UTR sequence selection for each of these genes is co-regulated, or that there may be an undetected isoform of one of these genes which extends into the UTR of the other gene, similar to what is seen with Ap4m1 and Taf6. This would mean that only one of the genes is actually affected, but both genes show altered expression using qRT-PCR as the primers do not distinguish between transcripts of the different genes.
Figure 3.12. qPCR using primers designed for specific Mrpl16 and Stx3 isoforms identified through 3’ RACE. Location of primer amplicons is indicated on the bottom diagram. RNA was extracted from E15.5 embryo heads. Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated)
3.4.5 Expression of Stx3 is variable and does not correlate with genotype
Two different patterns of Stx3 3’ UTR expression were detected, “high” and “low”.
These different patterns were seen with approximately equal frequency in the brains of the 5 WT and 5 Gtf2ird1-/- E15.5 mice used in the previous analysis. In order to determine if this variation
in Stx3 expression is common, or if it occurs more often in one particular genotype, a larger
sample size was used. Brain expression of the Stx3 3’ UTR was examined using primer set
“UTR-A” in a total of 27 WT (15 P0, 12 adult), and 25 Gtf2ird1-/- (13 P0, 12 adult) mice. High
and low levels of expression were detected in WT and Gtf2ird1-/- mice of both age groups
(Figure 3.13). The expression levels of Stx3 detected in the adult mice were not significantly
different between genotypes (average relative to Sdha: WT = 0.24 ± 0.13, Gtf2ird1-/- = 0.23 ±
0.24, p = 0.98). Roughly equal numbers of WT and Gtf2ird1-/- P0 mice had high and low levels
of expression, but the expression values of the Gtf2ird1-/- mice were slightly, and significantly, lower than those of the WT mice (average relative to Sdha: WT = 0.31 ± 0.14, Gtf2ird1-/- = 0.20 ± 0.14, p
= 0.047). These results indicate that the significant differences previously detected in Stx3, and
possibly Mrpl16, are unrelated to genotype and instead reflect sampling bias: in a small sample,
mice expressing one of the commonly expressed isoforms can be over-represented in one genotype.
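The P0 comparison can be approximately reproduced from the summary values quoted above. This is a sketch under stated assumptions: the ± values are taken to be standard deviations, the test is taken to be an equal-variance two-sample Student’s t-test, and the two-sided p-value is obtained by numerically integrating the t density so that only the standard library is needed. Function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Equal-variance t-test from means, standard deviations and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m1 - m2) / se
    return t, df, two_sided_p(t, df)


# P0 mice: WT = 0.31 +/- 0.14 (n = 15), Gtf2ird1-/- = 0.20 +/- 0.14 (n = 13).
t, df, p = student_t_from_summary(0.31, 0.14, 15, 0.20, 0.14, 13)
print(f"t = {t:.2f}, df = {df}, p = {p:.3f}")
```

Under these assumptions the result falls just below 0.05, in line with the reported p = 0.047; small discrepancies are expected because the inputs are rounded.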
Figure 3.13. Expression of the Stx3 3’ UTR is variable between mice of the same genotype. qPCR was performed on RNA extracted from whole brains of P0 and adult mice. The Stx3 3’UTR-A primer set was used. Expression values are shown relative to the housekeeping gene Sdha.
3.4.6 Differences in gene expression of genes located close to the Gtf2ird1 locus are related to genetic background
It was previously determined that some of the expression differences between WT and
Gtf2ird1-/- mice were the result of differential expression between different mouse strains, and
were not related to the function of TFII-IRD1. In order to determine if the exon specific expression differences within certain genes were also a result of the 129 alleles carried by the
Gtf2ird1-/- mice, qRT-PCR was used to look at the expression of multiple Actl6b, Taf6, Ap4m1
and Zfp68 exons in WT CD1, Gtf2ird1-/- and WT 129S1/SvImJ P0 mice.
As was noted previously, 129S1/SvImJ mice showed expression levels that were similar
to those of Gtf2ird1-/- mice and, for many of the exons examined, 129S1/SvImJ and Gtf2ird1-/-
mice displayed significantly different expression levels than WT CD1 mice (Figure 3.14).
Figure 3.14. Exon specific differences in gene expression are related to the genetic background of the mouse and not genotype. 129+/+ (n=7), Gtf2ird1-/- (n=5), CD1+/+ (n=6). Expression values are shown relative to the housekeeping gene Sdha. (For presentation purposes, some values were scaled as indicated) ** p < 0.005, *p < 0.05 using Student’s t-test.
3.4.7 Differentially expressed exons do not affect protein levels of Stx3
Western blots were used to look at the levels of Stx3 in the brains of three P0 Gtf2ird1-/- and WT mice. There were no obvious differences in protein expression between genotypes
(Figure 3.15), although subtle differences cannot be excluded. Protein was extracted from the whole brain for this experiment, and so the Stx3 expression levels could not be confirmed using qRT-PCR. As there is variation in Stx3 expression in WT and Gtf2ird1-/- mice, it is possible that the mice selected for this experiment all had equal levels of Stx3 expression. A larger sample size and confirmation of Stx3 mRNA expression levels will be needed to determine if the different 3’
UTRs detected influence the Stx3 protein levels.
Figure 3.15. Stx3 protein levels do not appear to vary between Gtf2ird1-/- and wildtype mice. Western blot analysis was performed on protein extracted from the whole brain of P0 mice. An anti-Stx3 antibody was used which detects all three Stx3 isoforms. Anti-α-tubulin was used as a loading control.
3.5 Discussion
An in-depth analysis of transcription in Gtf2ird1-/-, WT CD1 and WT 129 mice illustrates a
number of problems which can confound expression analysis when looking at transcription in
mice with genetically different backgrounds. In addition, these results indicate that expression
differences detected when looking specifically at a very small portion of a transcript are not
necessarily indicative of the expression levels of other exons, and care must be taken when
validating microarray experiments.
The primers used in qRT-PCR amplified regions of 80 – 120 bp, located within one exon
of a gene. Initially, primers were used that targeted the 3’ UTR, and significant differences in
expression were detected between WT CD1 mice and both Gtf2ird1-/- and WT 129 mice in a number of genes.
The most straightforward explanation of this finding would be that there were differences in the
overall mRNA levels of the genes in question. However, for a number of genes, including Kin,
Mrpl16, Actl6b, Zfp68 and Taf6 no differences in expression were detected between genotypes
when the primers used in qRT-PCR amplified regions at the 5’ end of the gene, while primers
which amplified from the 3’ ends of the same transcripts found significant genotype specific
differences in expression. These results indicated that mRNAs were being produced which
differed in their 3’ ends and suggest that the differences in expression detected are due to a shift
in the ratios of alternatively spliced and polyadenylated transcripts.
Northern blot analysis was performed in order to identify transcripts of unique sizes which
were believed to be produced in a genotype specific manner. Four different genes were analyzed
using this technique, and the transcripts produced by WT and Gtf2ird1-/- mice did not appear to
be different in size. It was still possible that alternative transcripts were being produced either at
levels below the threshold of detection for this technique, or with small differences in size that
could not be resolved, and so 3’ RACE was performed in another attempt to identify alternative
transcripts.
The 3’ ends of mRNAs were amplified using an oligo-dT primer and gene specific forward primers. The PCR products were cloned and sequenced, which allowed differences as small as
1 bp to be detected. When the sequences were aligned against the UCSC genomic database, it was clear that alternative splicing and/or polyadenylation was occurring in the 3’ UTRs of the genes in question.
3.5.1 Alternative splicing in the 3’ UTR
Stx3 and Coq2 transcripts were detected in which a portion of the 3’ UTR had been spliced out. Alternative splicing of primary transcripts can result in proteins with different structures and functions, and is believed to be a driving force in the phenotypic complexity that exists in mammals185. Recently, deep sequencing of cDNAs from 15 different human
tissues and cell lines revealed that 92-94% of human transcripts are alternatively spliced185.
Splicing is a highly regulated process controlled by the spliceosome, a complex of both proteins
and snRNAs. The spliceosome recognizes 4 highly conserved signals in the RNA sequence: the
5’ donor and 3’ acceptor splice sites found at the ends of introns, a branch site sequence found
upstream of the 3’ splice site and a polypyrimidine tract found between the branch site and the 3’
splice site186. A study of 43,337 human splice junction pairs found that 98.71% of splice sites
contain the canonical 5’ GT donor and 3’ AG acceptor dinucleotide sequences187.
Three different Stx3 transcripts with alternative splicing in the 3’ UTR were identified,
containing the non-canonical donor-acceptor pairs CC-AG, AT-TC and AT-AA. The donor-acceptor pair used in the alternatively spliced Coq2 transcript was also non-canonical, GA-TG.
In a study of 10,803 human transcripts representing 91,846 donor-acceptor sites, Chong et al. found that 22 donor-acceptor sites (including the canonical GT-AG) represented 99.16% of the data set188. The splice sites used in the alternative Stx3 and Coq2 transcripts were not on this list,
and occurred extremely infrequently (CC-AG = 10 times, AT-TC = 0 times, AT-AA = 3 times,
GA-TG = 3 times); however, the authors felt that all isoforms that were identified were real as they were identified in multiple mRNAs.
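Classifying a putative intron by its terminal dinucleotides is straightforward once the missing region has been located on the genomic sequence. The following is a minimal sketch that assumes the genomic sequence is given on the sense strand of the transcript and that the intron coordinates are 0-based and end-exclusive. The sequences are toy examples; only the occurrence counts in `REPORTED_COUNTS` are taken from the text above.

```python
CANONICAL_PAIR = "GT-AG"

# Occurrences in the data set of Chong et al., as quoted in the text.
REPORTED_COUNTS = {"CC-AG": 10, "AT-TC": 0, "AT-AA": 3, "GA-TG": 3}


def splice_dinucleotides(genomic, intron_start, intron_end):
    """Return the donor-acceptor pair, e.g. 'GT-AG', for one intron.

    The intron occupies genomic[intron_start:intron_end] on the sense strand.
    """
    intron = genomic[intron_start:intron_end].upper()
    if len(intron) < 4:
        raise ValueError("intron too short to have distinct donor and acceptor")
    return intron[:2] + "-" + intron[-2:]


def is_canonical(pair):
    """True only for the canonical GT-AG donor-acceptor pair."""
    return pair == CANONICAL_PAIR


# Toy sequences: exon | intron | exon.
canonical_example = "ACGTAC" + "GTAAGTTTTTTTCTCCAG" + "GGATCC"
noncanonical_example = "ACGTAC" + "CCTTTTTTAG" + "GGATCC"

for seq, start, end in ((canonical_example, 6, 24), (noncanonical_example, 6, 16)):
    pair = splice_dinucleotides(seq, start, end)
    print(pair, "canonical" if is_canonical(pair) else "non-canonical",
          REPORTED_COUNTS.get(pair, "count not quoted"))
```

A pair that is non-canonical and rarely or never reported elsewhere is a reason to consider explanations other than genuine splicing.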
A recent study by Houseley and Tollervey proposes another explanation for the unique 3’
UTRs detected in the Stx3 and Coq2 transcripts189. They found that some cDNAs which appear
to have been alternatively spliced using non-canonical splice sites were actually created by
template switching of reverse transcriptase. In most cases where this was found to occur, the
region spliced out was flanked by direct repeats, however in some transcripts no areas of
homology could be identified. If template switching occurred during the reverse transcription of
Stx3, it would explain why no alternative transcripts were detected using Northern blot analysis.
The Northern blot was run with total RNA extracted from tissue, and the RNA was not subjected to reverse transcription.
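One way to screen an apparent splice junction for the direct repeats associated with template switching is to compare the sequence at the two breakpoints of the missing region. The sketch below is a hypothetical helper, not a procedure used in this study: it measures how far the sequence at the start of the gap matches the sequence immediately after the gap, in both directions. A long match is consistent with, but does not prove, a reverse transcriptase artefact, and as noted above some such artefacts show no homology at all.

```python
def junction_homology(genomic, gap_start, gap_end):
    """Length of the direct repeat shared by the two breakpoints of a gap.

    The gap is genomic[gap_start:gap_end] (0-based, end-exclusive), i.e. the
    region present in the genome but absent from the cDNA clone.
    """
    g = genomic.upper()
    # Extend rightwards: start of the gap versus sequence just after the gap.
    right = 0
    while gap_end + right < len(g) and g[gap_start + right] == g[gap_end + right]:
        right += 1
    # Extend leftwards: sequence just before the gap versus end of the gap.
    left = 0
    while gap_start - left - 1 >= 0 and g[gap_start - left - 1] == g[gap_end - left - 1]:
        left += 1
    return left + right


# Toy example: the missing region begins with a 7 nt repeat that also
# follows it, so the junction could lie anywhere within the repeat.
with_repeat = "AAACCC" + "GATTACA" + "TTTTTGGGGG" + "GATTACA" + "CCCAAA"
no_repeat = "AAACCC" + "GTTTTTTTTTAG" + "CCGGAA"

print(junction_homology(with_repeat, 6, 23))
print(junction_homology(no_repeat, 6, 18))
```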
Ultimately, the novel Coq2 and Stx3 transcripts are unlikely to be related to genotype.
When a larger sample size was used to look at Stx3 (and Mrpl16) expression, it was found that the expression levels of different isoforms were naturally variable and the differences between genotypes were not significant. qRT-PCR using primers which differentiate between
Coq2 isoforms with and without splicing in the 3’ UTR did not find significant differences in expression between genotypes, indicating that the alternative form is produced at equal levels in all mice.
3.5.2 Use of alternative polyadenylation sites
In addition to alternatively spliced isoforms, 3’ RACE also detected mRNAs which used alternative polyadenylation sites, resulting in shorter Mrpl16, Stx3, Zfp68 and Coq2 transcripts than those in the UCSC database, and a longer Ap4m1 transcript. As previously mentioned, the variation in expression of Mrpl16 and Stx3 appears to occur in all mice, and is unrelated to genotype. qRT-PCR using primers to distinguish between long and short forms of Coq2 found that Gtf2ird1-/- mice expressed higher levels of full length transcripts than WT CD1 mice. The opposite result was found for Zfp68: Gtf2ird1-/- and WT 129 mice expressed lower levels of full length transcripts than WT CD1 mice. However, the fraction of transcripts which included the full length 3’ UTR was very low in all mice. The full length 3’ UTR is approximately 4 kb; however, more than half of the transcripts appear to use the first polyadenylation site, resulting in a 3’ UTR of only 130 bp. In Gtf2ird1-/- and WT 129 mice, Zfp68 exons 5 and 6 are expressed 3X higher than in WT CD1 mice. There are 3 known isoforms of Zfp68: A, B and C. Isoform B does not include the 3rd exon of isoforms A and C. Expression of exon 3 does not differ between
genotypes, and so it is possible that Gtf2ird1-/- and WT 129 mice express higher levels of Zfp68B transcripts which use a different 3’ UTR. The forward primer used in 3’ RACE was immediately upstream of the annotated 3’ UTR, and so if transcripts terminated before this point they would not have been detected using this technique. Unfortunately, there are no commercial antibodies for Zfp68 available, and so I was unable to determine if the different 3’ UTRs result in different levels or localization of the protein in mice of different genetic backgrounds.
Ap4m1 was also found to use an alternative polyadenylation site, resulting in a transcript
1.4 kb longer than any Ap4m1 mRNA or EST listed in the UCSC database. The commonly used polyadenylation site does not contain a canonical signal sequence; however, the extended transcript does. Ap4m1 is a subunit of adaptor protein complex (AP) 4, which is involved in the trafficking of membrane proteins, and localizes to dendrites and Golgi-like structures in the cell bodies of neurons190. It is possible that the localization of Ap4m1 is regulated in a similar manner to BDNF161, and that the different 3’ UTR lengths detected determine in which area of the neuron Ap4m1 produced by a certain transcript will be found. As the 3’ UTRs of Ap4m1 and
Taf6 are anti-sense to each other, it is impossible to accurately determine relative levels of the
Ap4m1 isoforms using qRT-PCR on cDNA generated with random hexamer primers.
Polyadenylation site selection in different mouse strains has not been studied in detail. In transcripts containing multiple potential polyadenylation signal sequences, the selection of poly(A) site usage is known to depend on a number of factors including the tissue191, stage of development163,192, and even neuronal activity191. As previously mentioned, one example of mouse strain specific use of a polyadenylation site has been reported. A SNP located in the 3’ UTR of Adh4 in C57BL/6 mice results in the formation of a new poly(A) signal which causes the expression of an Adh4 isoform in their stomachs that is not found in other mouse
strains184. To my knowledge, my results are the first evidence that identical polyadenylation
sites may be used at different frequencies for certain genes in different mouse strains.
It is easy to postulate a number of mechanisms which could lead to one strain favouring
use of one poly(A) site over another. There may be SNPs in the flanking genomic sequences
which make the poly(A) signals weaker or stronger. Each isoform identified in this study which
contained the canonical sequence A(A/U)UAAA, had an identical sequence in each mouse
tested. However, as sequences flanking this site are known to affect the affinity of the
polyadenylation complex for that region of the mRNA193,194, SNPs may alter the ratio at which a
particular site is used. In the isoforms which did not contain a canonical poly(A) site, SNPs may
also affect the affinity of the polyadenylation complex for the mRNA molecule.
SNPs between different strains may also result in differential expression levels or amino acid sequence of any of the many proteins which are involved in polyadenylation directly (as members of the polyadenylation complex) or indirectly (as members of the elongation complex, or proteins which cause post-translational modifications to either of these complexes).
Transgenic mice targeted in ES cells of one strain, and then backcrossed onto a different strain for 12 generations would be expected to contain an average of 16 cM of DNA from the ES cell strain flanking the targeted locus195. This represents about 1% of the mouse genome.
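The 16 cM figure can be reproduced with a commonly quoted approximation for congenic strains, in which the expected length of donor-derived chromosome flanking a selected locus after N backcross generations is roughly 200(1 - 2^-N)/N cM. The sketch below is illustrative only: the approximation is not taken from this work, and the total genetic length of the mouse genome (set to 1,450 cM here) is an assumed round value, so the resulting fraction should be read as an order-of-magnitude estimate.

```python
# Minimal sketch, not part of the original analysis.
# Assumption: expected flanking donor segment ~ 200 * (1 - 2**-N) / N cM
# after N backcross generations (a standard congenic-strain approximation).

MOUSE_GENOME_CM = 1450.0  # assumed total genetic length of the mouse genome


def expected_flanking_cm(generations: int) -> float:
    """Approximate mean length (cM) of ES-cell-derived DNA around the targeted locus."""
    if generations < 1:
        raise ValueError("generations must be >= 1")
    return 200.0 * (1.0 - 2.0 ** -generations) / generations


def fraction_of_genome(generations: int, genome_cm: float = MOUSE_GENOME_CM) -> float:
    """Fraction of the genome expected to remain donor-derived near the locus."""
    return expected_flanking_cm(generations) / genome_cm


if __name__ == "__main__":
    n = 12
    print(f"N = {n}: {expected_flanking_cm(n):.1f} cM, "
          f"{100 * fraction_of_genome(n):.1f}% of the genome")
```

With N = 12 the approximation gives a flanking segment of between 16 and 17 cM, or about 1% of the assumed genome length, in line with the values quoted above.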
Given the large number of genes involved in 3’ end processing, it is likely that Gtf2ird1-/-
mice retained one or more genes which play a role in polyadenylation from the 129 derived ES
cells. Cpsf4, which encodes a 30 kDa subunit of CPSF, is located on mouse chromosome 5, only
11 Mb from the Gtf2ird1 locus. CPSF4 has been shown to bind to poly(U) sequences in RNA
molecules, and has been proposed to enhance the ability of CPSF to bind at poly(A) sites196.
This protein, or any other protein involved in polyadenylation encoded by a 129 derived allele in
the Gtf2ird1-/- mice could explain the differences in poly(A) site usage detected.
3.5.3 qRT-PCR validation of microarrays
This work has demonstrated that care needs to be taken when validating microarray
experiments. Using primers/probes which target the 3’ UTR may give results which are not
necessarily indicative of the expression level of the entire transcript. A recent study using the
Illumina Mouse WG array to look at expression in the mouse striatum found that 22% of 1100
genes with multiple probes showed discordant expression between the probe sets197.
Poor correlation has been reported between mRNA expression levels detected by microarray analysis and proteomic analysis198,199. There are many factors which are likely to
contribute to this discrepancy, one of which is that many microarray chips are designed to look at expression of only a very small area of the transcript. The findings of the experiments described in this report demonstrate that expression levels in different areas of a transcript need to be measured in order to accurately reflect what is occurring in vivo.
Using qRT-PCR, the expression level of different areas of the transcript, including coding exons and regions of the 3’ UTR, can be measured independently. Primers should be designed to distinguish between both alternatively spliced and alternatively polyadenylated mRNA isoforms.
However, other factors can confound the analysis of qRT-PCR data. Expression of anti-sense
transcripts, such as Taf6 and Ap4m1, cannot be distinguished using cDNA synthesized with
random hexamer or oligo(dT) primers. In order to look at the expression of each of these
transcripts individually, 1st strand cDNA synthesis would have to be done for each gene
separately using a gene specific primer.
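The region-by-region comparison described above can be sketched as follows. The sketch uses the widely used comparative Ct (2^-ddCt) method for relative quantification, which is an assumption here and not a description of the analysis performed in this work. It also assumes 100% amplification efficiency for every primer pair. The amplicon names and Ct values are hypothetical, chosen only to show how a 3’ UTR amplicon can report reduced expression while a coding-exon amplicon from the same gene does not.

```python
# Minimal sketch, assuming the comparative Ct (2^-ddCt) method and
# 100% primer efficiency. All names and Ct values are hypothetical.


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target amplicon in sample vs control, normalized to a reference gene."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical mean Ct values for two amplicons of the same gene.
amplicons = {
    "coding_exon": {"mutant": 24.0, "wildtype": 24.0},
    "distal_3utr": {"mutant": 27.0, "wildtype": 26.0},
}
# Hypothetical reference (housekeeping) gene Ct values.
reference = {"mutant": 18.0, "wildtype": 18.0}

fold_changes = {
    name: relative_expression(ct["mutant"], reference["mutant"],
                              ct["wildtype"], reference["wildtype"])
    for name, ct in amplicons.items()
}

if __name__ == "__main__":
    for name, fc in fold_changes.items():
        print(f"{name}: {fc:.2f}-fold relative to wildtype")
```

In this made-up example the coding-exon amplicon is unchanged while the distal 3’ UTR amplicon is halved, which is the pattern that would point to a difference in 3’ end processing and not in overall transcript abundance.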
The method by which transcriptional termination occurs can also confound analysis of
gene expression when primers which amplify from the 3’ UTR are used. RNAPII and the
elongation complex continue transcribing DNA past the poly(A) site, where the transcript is
cleaved and the poly(A) tail added. Elongation, splicing and polyadenylation occur
concurrently, and so the nascent transcript is likely to be cleaved soon after being transcribed.
The RNA downstream of the cleavage site will still be present in the cell, at least temporarily,
and it will continue to be extended in the 3’ direction until RNAPII dissociates from the chromatin
template. Therefore, in addition to measuring levels of mRNA, qRT-PCR performed on cDNA prepared with random hexamer primers will also detect RNA downstream of the polyadenylation site. In order to avoid this problem, mRNA should be isolated from the total RNA before cDNA preparation, or 1st strand synthesis of cDNA should be performed with oligo(dT) primers.
Chapter IV: Summary and Future Directions
4.1 Summary
Williams-Beuren syndrome is an autosomal dominant developmental disorder caused by
the deletion of 26-28 genes from chromosome 7q11.2336. The clinical manifestations of this
disorder are numerous and include dysmorphic facial features, SVAS, retarded growth, infantile
hypercalcemia and renal defects15,21. In addition, WBS patients have a distinct cognitive
phenotype characterized by ‘peaks and valleys’ of ability. The average IQ in WBS is 5526 with
individual IQs ranging from 40-10027. Individuals with WBS typically have relatively strong
expressive language skills, and show relatively weak performance in tasks that involve visual-
spatial processing26. In addition they are overly-friendly and show reduced social inhibitions26.
Of the 28 genes included in the common 1.55 Mb WBS deletion, only ELN has been conclusively linked to a particular aspect of the WBS phenotype. Mutations that disrupt ELN
have been found in individuals with SVAS (who do not have WBS), linking ELN to this aspect
of the WBS phenotype33. A number of individuals have been identified who have smaller
deletions in the WBS region on chromosome 7 which encompass only a few genes48-53.
Phenotypic analysis of these individuals indicates that hemizygosity for members of the GTF2I
family, GTF2I and GTF2IRD1, is likely to be responsible for the behavioural and cognitive
aspects of the WBS phenotype. Gtf2ird1-/- mice display behaviours similar to those seen in WBS
patients including increased sociability and a decreased natural fear response88. The phenotypes
seen in individuals with atypical deletions in the WBS region, and in Gtf2ird1-/- mice, indicate
that the protein product of GTF2IRD1, TFII-IRD1, plays an important role in the brain.
Since its discovery in 199875, TFII-IRD1 has been widely reported to be a transcription factor.
This protein has repeatedly been shown to bind to DNA in a sequence specific manner, and to
regulate gene expression in luciferase assays and in transformed cell lines after TFII-IRD1
knockdown/over-expression. However, the ability of TFII-IRD1 to regulate gene expression in vivo has never been demonstrated.
In order to identify genes which are regulated by TFII-IRD1 in vivo, and may play a role in
the behavioural phenotype seen in Gtf2ird1-/- mice, I performed microarray analysis to study
gene expression in the brains of Gtf2ird1-/- mice. Analysis was performed at two different
developmental time-points: E15.5 and P0. Although TFII-IRD1 is robustly expressed
throughout the brain at both of these time-points, I failed to detect any changes in gene
expression caused by the absence of this protein. These results indicate that TFII-IRD1 may have a role other than a transcription factor in the developing mouse brain.
qRT-PCR validation of the microarray experiments only confirmed altered expression in a
small subset of genes examined. The differences in gene expression that were detected between
genotypes using the Affymetrix and Illumina microarray platforms were the result of natural
variation in gene expression or differences in genotype resulting from the carry-over of genes
flanking the Gtf2ird1 locus from the parental R1 ES cells. This highlights the importance of
proper analysis and validation of microarray data as the presence of flanking genes from parental
ES cells or using a small number of biological samples may confound interpretation of the
results.
The initial qRT-PCR that was performed to validate the microarray results generally used
primers located in the 3’ UTRs of the transcripts as this is where the majority of the microarray
probes were located on both platforms. For many of the genes which showed decreased
expression using primers in the 3’ UTR of the gene, primers specific to upstream coding exons
showed no differences in expression between genotypes. Northern blot analysis was used to
determine if alternative splicing was occurring in these genes, but the size or number of
transcripts did not appear to vary between genotypes. However, transcripts utilizing alternative
polyadenylation signals were identified using 3’ RACE.
qRT-PCR using primers designed to differentiate between the different 3’ UTR isoforms
of the same gene found that there were differences in polyA site selection between Gtf2ird1-/- and
WT mice. Further examination revealed that these differences were unrelated to TFII-IRD1, and were the result of alleles from the R1 129 background strain flanking the Gtf2ird1 locus in the
mutant mice. To my knowledge this is the first evidence of a strain specific bias in polyA site
selection in mice.
4.2 Further investigation of GTF2IRD1 function
While in vivo targets of this putative transcription factor have yet to be identified, it is clear
that proper expression of TFII-IRD1 is needed for normal behaviour in mice. In addition, two
more individuals with atypical deletions in the WBS region have recently been identified, further
implicating TFII-IRD1 in the cognitive aspects of the WBS phenotype200,201. Ferrero et al. have
identified a patient with a 1 Mb deletion which does not include the genes GTF2IRD1 or
GTF2I200. This patient does not have the typical facial features seen in WBS and has a normal
IQ. He was not formally tested for sociability or anxiety, but the authors noted that he did not
appear to display increased sociability or show signs of anxiety. Dai et al. reported a patient
with a deletion that included GTF2IRD1 but not GTF2I201. This patient showed facial features
characteristic of WBS, and had low scores on tests of visual-spatial cognition. She did not appear to show signs of increased sociability such as maintaining eye contact and attention to strangers.
TFII-I has been shown to have functional roles in both the nucleus and cytoplasm, acting as a transcription factor61,63,67, and regulating intracellular calcium levels through protein-protein
interactions143. Given the lack of transcriptional targets in the tissues and time-points examined, it is possible that TFII-IRD1 may also participate in protein-protein interactions that are necessary for proper neuronal function, yet do not alter gene expression.
Preliminary research on TFII-IRD1 localization in Neuro2A cells indicates that TFII-IRD1
is found in the cytoplasm of these cells at relatively high levels, supporting the idea that TFII-
IRD1 may have a cytoplasmic function. These studies need to be repeated using confocal
microscopy which will give better resolution, and clearly show if TFII-IRD1 localization is
cytoplasmic, nuclear, peri-nuclear, or some combination of these.
We are also currently using an unbiased approach to identify proteins with which TFII-
IRD1 interacts. Affinity purification of TFII-IRD1 followed by liquid chromatography and
tandem mass spectrometry is currently being performed by Andrew Emili’s lab at the University
of Toronto. Proteins that interact with human TFII-IRD1 in HEK-293 cells, and with mouse
TFII-IRD1 in Neuro2A cells should be identified using this method. This technique has been used successfully to pull out known and novel protein interactions using members of previously documented protein complexes as bait202.
If TFII-IRD1 is acting as a transcription factor at the time-points studied, or at other stages of life, it is likely that this activity is restricted to a specific cell population, since I was unable to detect any changes in gene expression using whole brain extracts. Conditional knockout mice would be a great tool to further elucidate the temporal and cell-specific roles of TFII-IRD1. In order to accomplish this, a mouse with the Gtf2ird1 locus flanked by loxP sites in the proper
orientation would need to be generated. In the presence of Cre-recombinase, the region between
the loxP sites will be deleted. By crossing mice with floxed Gtf2ird1 alleles with mice
expressing Cre under the control of a promoter of a gene that is expressed in a time- or
cell-specific manner, Gtf2ird1 can be deleted in vivo. This method has been successfully used in
mice to delete Nmdar1 from the CA1 pyramidal cells of the hippocampus, while allowing the
gene to be expressed normally elsewhere203.
By removing TFII-IRD1 at different time points or in different cell populations, it would be possible to
narrow down the physiological source of the behavioural phenotype. For example, is the
phenotype related to improper neuronal development, or is TFII-IRD1 needed throughout life for proper brain function? Does the phenotype result from the absence of TFII-IRD1 in one particular type of neuron? This information will give clues as to the cellular role of TFII-IRD1 and help in the development of future experiments.
It is also possible to use this technology to generate mice which will express eGFP upon the deletion of the Gtf2ird1 allele by Cre204. This would then allow the specific cell populations
lacking TFII-IRD1 to be sorted using fluorescence-activated cell sorting (FACS). One reason
why I failed to detect any transcriptional targets of TFII-IRD1 may be that it is only acting as a
transcription factor in a specific cell population, and the alterations in gene expression were
diluted out by looking at expression in the whole brain. This method would allow gene
expression to be studied in restricted cell populations which may allow the identification of in
vivo transcriptional targets of TFII-IRD1.
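The dilution argument above can be made concrete with a small calculation. This is a simplified sketch under a stated assumption: the gene is expressed at the same baseline level in every cell contributing RNA to the whole-brain sample, and its expression changes only in the affected population. The fraction and fold-change values are hypothetical.

```python
# Minimal sketch of signal dilution in a mixed-tissue sample.
# Assumption: equal baseline expression per cell in all populations.


def bulk_fold_change(fraction_affected: float, fold_change_in_affected: float) -> float:
    """Fold change observed in bulk tissue when only a subset of cells responds."""
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be between 0 and 1")
    if fold_change_in_affected < 0.0:
        raise ValueError("fold_change_in_affected must be non-negative")
    unaffected = 1.0 - fraction_affected
    return unaffected + fraction_affected * fold_change_in_affected


if __name__ == "__main__":
    # Hypothetical case: a 2-fold increase confined to 5% of cells.
    print(f"{bulk_fold_change(0.05, 2.0):.3f}")
```

Under these assumptions, a 2-fold change restricted to 5% of the cells appears as a 1.05-fold change in whole-brain RNA, which is well within the range of normal sample-to-sample variation and would be difficult to detect by microarray.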
In order to further elucidate the role of TFII-IRD1 in the development of neurons, primary cortical cultures are currently being generated from WT and Gtf2ird1-/- mice. Neuronal
precursor cells differentiate into neurons, astrocytes and oligodendrocytes. Studying the ratios
at which different cell types develop in WT and Gtf2ird1-/- mice may provide information on the
role of TFII-IRD1 in neuronal development. In addition, studying axon and dendrite length, and the number and morphology of dendrites may also give clues as to the role of TFII-IRD1 in neuronal function.
4.3 Further investigation of alternative polyA site selection
The 3’ UTR has many important roles including RNA localization, stability, translation and microRNA binding160. The different usage of polyA sites detected between WT CD1 and
WT 129S1/SvImJ mice could have a functional effect by impairing or enhancing any of these
roles. Unfortunately there are no antibodies available for most of the genes for which differences
in 3’ UTR expression were detected. There are antibodies available for STX3 and ACTL6B, but
these antibodies detect multiple isoforms, including isoforms which are not differentially
expressed between strains.
Cellular localization of transcripts with altered 3’ UTR expression can be studied by
generating DIG labeled probes specific to certain isoforms and performing RNA in situ
hybridization on primary neuronal cultures. In order to determine the effect of 3’ UTR length on
protein levels, tagged proteins with different 3’ UTRs can be expressed in neuronal cultures. By
comparing the mRNA expression level to protein levels, the effect of the 3’ UTR on translation
can be determined.
4.4 Conclusion
In conclusion, although this work failed to identify the mechanism by which TFII-IRD1
contributes to the behavioural phenotype seen in Gtf2ird1-/- mice or people with WBS, genotype-
phenotype correlations indicate that this protein does play an important role in proper brain
development and/or function. Further experimentation focusing on potential cytoplasmic roles for TFII-IRD1 and gene expression in restricted neuronal cell populations should aid in elucidating the function of this protein.
References
1. Lightwood, R. & Stapleton, T. Idiopathic hypercalcaemia in infants. Lancet 265, 255-6 (1953).
2. Fanconi, G., Girardet, P., Schlesinger, B., Butler, N. & Black, J. [Chronic hyperglycemia, combined with osteosclerosis, hyperazotemia, nanism and congenital malformations.]. Helv Paediatr Acta 7, 314-49 (1952).
3. Jones, K.L. Williams syndrome: an historical perspective of its evolution, natural history, and etiology. Am J Med Genet Suppl 6, 89-96 (1990).
4. Schlesinger, B.E., Butler, N.R. & Black, J.A. Severe type of infantile hypercalcaemia. Br Med J 1, 127-34 (1956).
5. Williams, J.C., Barratt-Boyes, B.G. & Lowe, J.B. Supravalvular aortic stenosis. Circulation 24, 1311-8 (1961).
6. Beuren, A.J., Apitz, J. & Harmjanz, D. Supravalvular aortic stenosis in association with mental retardation and a certain facial appearance. Circulation 26, 1235-40 (1962).
7. Black, J.A. & Carter, R.E. Association between Aortic Stenosis and Facies of Severe Infantile Hypercalcaemia. Lancet 2, 745-9 (1963).
8. Friedman, W.F. & Mills, L.F. The relationship between vitamin D and the craniofacial and dental anomalies of the supravalvular aortic stenosis syndrome. Pediatrics 43, 12-8 (1969).
9. Friedman, W.F. & Roberts, W.C. Vitamin D and the supravalvar aortic stenosis syndrome. The transplacental effects of vitamin D on the aorta of the rabbit. Circulation 34, 77-86 (1966).
10. Morris, C.A., Thomas, I.T. & Greenberg, F. Williams syndrome: autosomal dominant inheritance. Am J Med Genet 47, 478-81 (1993).
11. Curran, M.E., Atkinson, D.L., Ewart, A.K., Morris, C.A., Leppert, M.F. & Keating, M.T. The elastin gene is disrupted by a translocation associated with supravalvular aortic stenosis. Cell 73, 159-68 (1993).
12. Ewart, A.K., Morris, C.A., Atkinson, D., Jin, W., Sternes, K., Spallone, P., Stock, A.D., Leppert, M. & Keating, M.T. Hemizygosity at the elastin locus in a developmental disorder, Williams syndrome. Nat Genet 5, 11-6 (1993).
13. Stromme, P., Bjornstad, P.G. & Ramstad, K. Prevalence estimation of Williams syndrome. J Child Neurol 17, 269-71 (2002).
14. Morris, C.A., Demsey, S.A., Leonard, C.O., Dilts, C. & Blackburn, B.L. Natural history of Williams syndrome: physical characteristics. J Pediatr 113, 318-26 (1988).
15. Morris, C.A. & Mervis, C.B. Williams syndrome and related disorders. Annu Rev Genomics Hum Genet 1, 461-84 (2000).
16. Reiss, A.L., Eliez, S., Schmitt, J.E., Straus, E., Lai, Z., Jones, W. & Bellugi, U. IV. Neuroanatomy of Williams syndrome: a high-resolution MRI study. J Cogn Neurosci 12 Suppl 1, 65-73 (2000).
17. Galaburda, A.M. & Bellugi, U. V. Multi-level analysis of cortical neuroanatomy in Williams syndrome. J Cogn Neurosci 12 Suppl 1, 74-88 (2000).
18. Kippenhan, J.S., Olsen, R.K., Mervis, C.B., Morris, C.A., Kohn, P., Meyer-Lindenberg, A. & Berman, K.F. Genetic contributions to human gyrification: sulcal morphometry in Williams syndrome. J Neurosci 25, 7840-6 (2005).
19. Pankau, R., Partsch, C.J., Gosch, A., Oppermann, H.C. & Wessel, A. Statural growth in Williams-Beuren syndrome. Eur J Pediatr 151, 751-5 (1992).
20. Jones, K.L. & Smith, D.W. The Williams elfin facies syndrome. A new perspective. J Pediatr 86, 718-23 (1975).
21. Preus, M. The Williams syndrome: objective definition and diagnosis. Clin Genet 25, 422-8 (1984).
22. Rodd, C. & Goodyer, P. Hypercalcemia of the newborn: etiology, evaluation, and management. Pediatr Nephrol 13, 542-7 (1999).
23. Pober, B.R., Johnson, M. & Urban, Z. Mechanisms and treatment of cardiovascular disease in Williams-Beuren syndrome. J Clin Invest 118, 1606-15 (2008).
24. Pober, B.R. Williams-Beuren syndrome. N Engl J Med 362, 239-52 (2010).
25. Cherniske, E.M., Carpenter, T.O., Klaiman, C., Young, E., Bregman, J., Insogna, K., Schultz, R.T. & Pober, B.R. Multisystem study of 20 older adults with Williams syndrome. Am J Med Genet A 131, 255-64 (2004).
26. Bellugi, U., Lichtenberger, L., Jones, W., Lai, Z. & St George, M. I. The neurocognitive profile of Williams Syndrome: a complex pattern of strengths and weaknesses. J Cogn Neurosci 12 Suppl 1, 7-29 (2000).
27. Martens, M.A., Wilson, S.J. & Reutens, D.C. Research Review: Williams syndrome: a critical review of the cognitive, behavioral, and neuroanatomical phenotype. J Child Psychol Psychiatry 49, 576-608 (2008).
28. Mervis, C.B. & Robinson, B.F. Expressive vocabulary ability of toddlers with Williams syndrome or Down syndrome: a comparison. Dev Neuropsychol 17, 111-26 (2000).
29. Jones, W., Bellugi, U., Lai, Z., Chiles, M., Reilly, J., Lincoln, A. & Adolphs, R. II. Hypersociability in Williams Syndrome. J Cogn Neurosci 12 Suppl 1, 30-46 (2000).
30. Frigerio, E., Burt, D.M., Gagliardi, C., Cioffi, G., Martelli, S., Perrett, D.I. & Borgatti, R. Is everybody always my friend? Perception of approachability in Williams syndrome. Neuropsychologia 44, 254-9 (2006).
31. Dykens, E.M. Anxiety, fears, and phobias in persons with Williams syndrome. Dev Neuropsychol 23, 291-316 (2003).
32. Johnson, L.W., Fishman, R.A., Schneider, B., Parker, F.B., Jr., Husson, G. & Webb, W.R. Familial supravalvular aortic stenosis. Report of a large family and review of the literature. Chest 70, 494-500 (1976).
33. Fazio, M.J., Mattei, M.G., Passage, E., Chu, M.L., Black, D., Solomon, E., Davidson, J.M. & Uitto, J. Human elastin gene: new evidence for localization to the long arm of chromosome 7. Am J Hum Genet 48, 696-703 (1991).
34. Osborne, L.R., Martindale, D., Scherer, S.W., Shi, X.M., Huizenga, J., Heng, H.H., Costa, T., Pober, B., Lew, L., Brinkman, J., Rommens, J., Koop, B. & Tsui, L.C. Identification of genes from a 500-kb region at 7q11.23 that is commonly deleted in Williams syndrome patients. Genomics 36, 328-36 (1996).
35. Robinson, W.P., Waslynka, J., Bernasconi, F., Wang, M., Clark, S., Kotzot, D. & Schinzel, A. Delineation of 7q11.2 deletions associated with Williams-Beuren syndrome and mapping of a repetitive sequence to within and to either side of the common deletion. Genomics 34, 17-23 (1996).
36. Hockenhull, E.L., Carette, M.J., Metcalfe, K., Donnai, D., Read, A.P. & Tassabehji, M. A complete physical contig and partial transcript map of the Williams syndrome critical region. Genomics 58, 138-45 (1999).
37. Schubert, C. The genomic basis of the Williams-Beuren syndrome. Cell Mol Life Sci 66, 1178-97 (2009).
38. Tassabehji, M. Williams-Beuren syndrome: a challenge for genotype-phenotype correlations. Hum Mol Genet 12 Spec No 2, R229-37 (2003).
39. Bayes, M., Magano, L.F., Rivera, N., Flores, R. & Perez Jurado, L.A. Mutational mechanisms of Williams-Beuren syndrome deletions. Am J Hum Genet 73, 131-51 (2003).
40. Osborne, L.R., Li, M., Pober, B., Chitayat, D., Bodurtha, J., Mandel, A., Costa, T., Grebe, T., Cox, S., Tsui, L.C. & Scherer, S.W. A 1.5 million-base pair inversion polymorphism in families with Williams-Beuren syndrome. Nat Genet 29, 321-5 (2001).
41. Somerville, M.J., Mervis, C.B., Young, E.J., Seo, E.J., del Campo, M., Bamforth, S., Peregrine, E., Loo, W., Lilley, M., Perez-Jurado, L.A., Morris, C.A., Scherer, S.W. & Osborne, L.R. Severe expressive-language delay related to duplication of the Williams-Beuren locus. N Engl J Med 353, 1694-701 (2005).
42. Orellana, C., Bernabeu, J., Monfort, S., Rosello, M., Oltra, S., Ferrer, I., Quiroga, R., Martinez-Garay, I. & Martinez, F. Duplication of the Williams-Beuren critical region: case report and further delineation of the phenotypic spectrum. J Med Genet 45, 187-9 (2008).
43. Tam, E., Young, E.J., Morris, C.A., Marshall, C.R., Loo, W., Scherer, S.W., Mervis, C.B. & Osborne, L.R. The common inversion of the Williams-Beuren syndrome region at 7q11.23 does not cause clinical symptoms. Am J Med Genet A 146A, 1797-806 (2008).
44. Hobart, H.H., Morris, C.A., Mervis, C.B., Pani, A.M., Kistler, D.J., Rios, C.M., Kimberley, K.W., Gregg, R.G. & Bray-Ward, P. Inversion of the Williams syndrome region is a common polymorphism found more frequently in parents of children with Williams syndrome. Am J Med Genet C Semin Med Genet 154C, 220-8 (2010).
45. Osborne, L. & Pober, B. Genetics of childhood disorders: XXVII. Genes and cognition in Williams syndrome. J Am Acad Child Adolesc Psychiatry 40, 732-5 (2001).
46. Botta, A., Novelli, G., Mari, A., Novelli, A., Sabani, M., Korenberg, J., Osborne, L.R., Digilio, M.C., Giannotti, A. & Dallapiccola, B. Detection of an atypical 7q11.23 deletion in Williams syndrome patients which does not include the STX1A and FZD3 genes. J Med Genet 36, 478-80 (1999).
47. Heller, R., Rauch, A., Luttgen, S., Schroder, B. & Winterpacht, A. Partial deletion of the critical 1.5 Mb interval in Williams-Beuren syndrome. J Med Genet 40, e99 (2003).
48. Hirota, H., Matsuoka, R., Chen, X.N., Salandanan, L.S., Lincoln, A., Rose, F.E., Sunahara, M., Osawa, M., Bellugi, U. & Korenberg, J.R. Williams syndrome deficits in visual spatial processing linked to GTF2IRD1 and GTF2I on chromosome 7q11.23. Genet Med 5, 311-21 (2003).
49. Morris, C.A., Mervis, C.B., Hobart, H.H., Gregg, R.G., Bertrand, J., Ensing, G.J., Sommer, A., Moore, C.A., Hopkin, R.J., Spallone, P.A., Keating, M.T., Osborne, L., Kimberley, K.W. & Stock, A.D. GTF2I hemizygosity implicated in mental retardation in Williams syndrome: genotype-phenotype analysis of five families with deletions in the Williams syndrome region. Am J Med Genet A 123A, 45-59 (2003).
50. Howald, C., Merla, G., Digilio, M.C., Amenta, S., Lyle, R., Deutsch, S., Choudhury, U., Bottani, A., Antonarakis, S.E., Fryssira, H., Dallapiccola, B. & Reymond, A. Two high throughput technologies to detect segmental aneuploidies identify new Williams-Beuren syndrome patients with atypical deletions. J Med Genet 43, 266-73 (2006).
51. Tassabehji, M., Metcalfe, K., Karmiloff-Smith, A., Carette, M.J., Grant, J., Dennis, N., Reardon, W., Splitt, M., Read, A.P. & Donnai, D. Williams syndrome: use of chromosomal microdeletions as a tool to dissect cognitive and physical phenotypes. Am J Hum Genet 64, 118-25 (1999).
52. Gagliardi, C., Bonaglia, M.C., Selicorni, A., Borgatti, R. & Giorda, R. Unusual cognitive and behavioural profile in a Williams syndrome patient with atypical 7q11.23 deletion. J Med Genet 40, 526-30 (2003).
53. Tassabehji, M., Hammond, P., Karmiloff-Smith, A., Thompson, P., Thorgeirsson, S.S., Durkin, M.E., Popescu, N.C., Hutton, T., Metcalfe, K., Rucka, A., Stewart, H., Read, A.P., Maconochie, M. & Donnai, D. GTF2IRD1 in craniofacial development of humans and mice. Science 310, 1184-7 (2005).
54. Roy, A.L., Du, H., Gregor, P.D., Novina, C.D., Martinez, E. & Roeder, R.G. Cloning of an inr- and E-box-binding protein, TFII-I, that interacts physically and functionally with USF1. EMBO J 16, 7091-104 (1997).
55. Hinsley, T.A., Cunliffe, P., Tipney, H.J., Brass, A. & Tassabehji, M. Comparison of TFII-I gene family members deleted in Williams-Beuren syndrome. Protein Sci 13, 2588-99 (2004).
56. Ferre-D'Amare, A.R., Prendergast, G.C., Ziff, E.B. & Burley, S.K. Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain. Nature 363, 38-45 (1993).
57. Vullhorst, D. & Buonanno, A. Characterization of general transcription factor 3, a transcription factor involved in slow muscle-specific gene expression. J Biol Chem 278, 8370-9 (2003).
58. Cheriyath, V. & Roy, A.L. Structure-function analysis of TFII-I. Roles of the N-terminal end, basic region, and I-repeats. J Biol Chem 276, 8377-83 (2001).
59. Tantin, D., Tussie-Luna, M.I., Roy, A.L. & Sharp, P.A. Regulation of immunoglobulin promoter activity by TFII-I class transcription factors. J Biol Chem 279, 5460-9 (2004).
60. Makeyev, A.V., Erdenechimeg, L., Mungunsukh, O., Roth, J.J., Enkhmandakh, B., Ruddle, F.H. & Bayarsaihan, D. GTF2IRD2 is located in the Williams-Beuren syndrome critical region 7q11.23 and encodes a protein with two TFII-I-like helix-loop-helix repeats. Proc Natl Acad Sci U S A 101, 11052-7 (2004).
61. Roy, A.L., Meisterernst, M., Pognonec, P. & Roeder, R.G. Cooperative interaction of an initiator-binding transcription initiation factor and the helix-loop-helix activator USF. Nature 354, 245-8 (1991).
62. Yang, W. & Desiderio, S. BAP-135, a target for Bruton's tyrosine kinase in response to B cell receptor engagement. Proc Natl Acad Sci U S A 94, 604-9 (1997).
63. Grueneberg, D.A., Henry, R.W., Brauer, A., Novina, C.D., Cheriyath, V., Roy, A.L. & Gilman, M. A multifunctional DNA-binding protein that promotes the formation of serum response factor/homeodomain complexes: identity to TFII-I. Genes Dev 11, 2482-93 (1997).
64. Saouaf, S.J., Mahajan, S., Rowley, R.B., Kut, S.A., Fargnoli, J., Burkhardt, A.L., Tsukada, S., Witte, O.N. & Bolen, J.B. Temporal differences in the activation of three
classes of non-transmembrane protein tyrosine kinases following B-cell antigen receptor surface engagement. Proc Natl Acad Sci U S A 91, 9524-8 (1994).
65. Cheriyath, V., Desgranges, Z.P. & Roy, A.L. c-Src-dependent transcriptional activation of TFII-I. J Biol Chem 277, 22798-805 (2002).
66. Cheriyath, V. & Roy, A.L. Alternatively spliced isoforms of TFII-I. Complex formation, nuclear translocation, and differential gene regulation. J Biol Chem 275, 26300-8 (2000).
67. Jackson, T.A., Taylor, H.E., Sharma, D., Desiderio, S. & Danoff, S.K. Vascular endothelial growth factor receptor-2: counter-regulation by the transcription factors, TFII-I and TFII-IRD1. J Biol Chem 280, 29856-63 (2005).
68. Ku, M., Sokol, S.Y., Wu, J., Tussie-Luna, M.I., Roy, A.L. & Hata, A. Positive and negative regulation of the transforming growth factor beta/activin target gene goosecoid by the TFII-I family of transcription factors. Mol Cell Biol 25, 7144-57 (2005).
69. Tussie-Luna, M.I., Bayarsaihan, D., Seto, E., Ruddle, F.H. & Roy, A.L. Physical and functional interactions of histone deacetylase 3 with TFII-I family proteins and PIASxbeta. Proc Natl Acad Sci U S A 99, 12807-12 (2002).
70. Enkhmandakh, B., Bitchevaia, N., Ruddle, F. & Bayarsaihan, D. The early embryonic expression of TFII-I during mouse preimplantation development. Gene Expr Patterns 4, 25-8 (2004).
71. Danoff, S.K., Taylor, H.E., Blackshaw, S. & Desiderio, S. TFII-I, a candidate gene for Williams syndrome cognitive profile: parallels between regional expression in mouse brain and human phenotype. Neuroscience 123, 931-8 (2004).
72. Yan, X., Zhao, X., Qian, M., Guo, N., Gong, X. & Zhu, X. Characterization and gene structure of a novel retinoblastoma-protein-associated protein similar to the transcription regulator TFII-I. Biochem J 345 Pt 3, 749-57 (2000).
73. Franke, Y., Peoples, R.J. & Francke, U. Identification of GTF2IRD1, a putative transcription factor within the Williams-Beuren syndrome deletion at 7q11.23. Cytogenet Cell Genet 86, 296-304 (1999).
74. Bayarsaihan, D. & Ruddle, F.H. Isolation and characterization of BEN, a member of the TFII-I family of DNA-binding proteins containing distinct helix-loop-helix domains. Proc Natl Acad Sci U S A 97, 7342-7 (2000).
75. O'Mahoney, J.V., Guven, K.L., Lin, J., Joya, J.E., Robinson, C.S., Wade, R.P. & Hardeman, E.C. Identification of a novel slow-muscle-fiber enhancer binding protein, MusTRD1. Mol Cell Biol 18, 6641-52 (1998).
76. Osborne, L.R., Campbell, T., Daradich, A., Scherer, S.W. & Tsui, L.C. Identification of a putative transcription factor gene (WBSCR11) that is commonly deleted in Williams-Beuren syndrome. Genomics 57, 279-84 (1999).
77. Corin, S.J., Levitt, L.K., O'Mahoney, J.V., Joya, J.E., Hardeman, E.C. & Wade, R. Delineation of a slow-twitch-myofiber-specific transcriptional element by using in vivo somatic gene transfer. Proc Natl Acad Sci U S A 92, 6185-9 (1995).
78. Polly, P., Haddadi, L.M., Issa, L.L., Subramaniam, N., Palmer, S.J., Tay, E.S. & Hardeman, E.C. hMusTRD1alpha1 represses MEF2 activation of the troponin I slow enhancer. J Biol Chem 278, 36603-10 (2003).
79. Ring, C., Ogata, S., Meek, L., Song, J., Ohta, T., Miyazono, K. & Cho, K.W. The role of a Williams-Beuren syndrome-associated helix-loop-helix domain-containing transcription factor in activin/nodal signaling. Genes Dev 16, 820-35 (2002).
80. Vullhorst, D. & Buonanno, A. Multiple GTF2I-like repeats of general transcription factor 3 exhibit DNA binding properties. Evidence for a common origin as a sequence-specific DNA interaction module. J Biol Chem 280, 31722-31 (2005).
81. Lazebnik, M.B., Tussie-Luna, M.I. & Roy, A.L. Determination and functional analysis of the consensus binding site for TFII-I family member BEN, implicated in Williams-Beuren syndrome. J Biol Chem 283, 11078-82 (2008).
82. Palmer, S.J., Tay, E.S., Santucci, N., Cuc Bach, T.T., Hook, J., Lemckert, F.A., Jamieson, R.V., Gunning, P.W. & Hardeman, E.C. Expression of Gtf2ird1, the Williams syndrome-associated gene, during mouse development. Gene Expr Patterns 7, 396-404 (2007).
83. Proulx, E., Young, E.J., Osborne, L.R. & Lambe, E.K. Enhanced prefrontal serotonin 5-HT(1A) currents in a mouse model of Williams-Beuren syndrome with low innate anxiety. J Neurodev Disord 2, 99-108 (2010).
84. Tipney, H.J., Hinsley, T.A., Brass, A., Metcalfe, K., Donnai, D. & Tassabehji, M. Isolation and characterisation of GTF2IRD2, a novel fusion gene and member of the TFII-I family of transcription factors, deleted in Williams-Beuren syndrome. Eur J Hum Genet 12, 551-60 (2004).
85. Reiter, L.T., Murakami, T., Koeuth, T., Pentao, L., Muzny, D.M., Gibbs, R.A. & Lupski, J.R. A recombination hotspot responsible for two inherited peripheral neuropathies is located near a mariner transposon-like element. Nat Genet 12, 288-97 (1996).
86. Valero, M.C., de Luis, O., Cruces, J. & Perez Jurado, L.A. Fine-scale comparative mapping of the human 7q11.23 region and the orthologous region on mouse chromosome 5G: the low-copy repeats that flank the Williams-Beuren syndrome deletion arose at breakpoint sites of an evolutionary inversion(s). Genomics 69, 1-13 (2000).
87. Osborne, L.R. Animal models of Williams syndrome. Am J Med Genet C Semin Med Genet 154C, 209-19 (2010).
88. Young, E.J., Lipina, T., Tam, E., Mandel, A., Clapcote, S.J., Bechard, A.R., Chambers, J., Mount, H.T., Fletcher, P.J., Roder, J.C. & Osborne, L.R. Reduced fear and aggression and altered serotonin metabolism in Gtf2ird1-targeted mice. Genes Brain Behav 7, 224-34 (2008).
89. Kusserow, H., Davies, B., Hortnagl, H., Voigt, I., Stroh, T., Bert, B., Deng, D.R., Fink, H., Veh, R.W. & Theuring, F. Reduced anxiety-related behaviour in transgenic mice overexpressing serotonin 1A receptors. Brain Res Mol Brain Res 129, 104-16 (2004).
90. Calvo, S., Vullhorst, D., Venepally, P., Cheng, J., Karavanova, I. & Buonanno, A. Molecular dissection of DNA sequences and factors involved in slow muscle-specific transcription. Mol Cell Biol 21, 8490-503 (2001).
91. Zhu, L., Lyons, G.E., Juhasz, O., Joya, J.E., Hardeman, E.C. & Wade, R. Developmental regulation of troponin I isoform genes in striated muscles of transgenic mice. Dev Biol 169, 487-503 (1995).
92. Issa, L.L., Palmer, S.J., Guven, K.L., Santucci, N., Hodgson, V.R., Popovic, K., Joya, J.E. & Hardeman, E.C. MusTRD can regulate postnatal fiber-specific expression. Dev Biol 293, 104-15 (2006).
93. Juan, A.H. & Ruddle, F.H. Enhancer timing of Hox gene expression: deletion of the endogenous Hoxc8 early enhancer. Development 130, 4823-34 (2003).
94. Tussie-Luna, M.I., Bayarsaihan, D., Ruddle, F.H. & Roy, A.L. Repression of TFII-I-dependent transcription by nuclear exclusion. Proc Natl Acad Sci U S A 98, 7789-94 (2001).
95. Watabe, T., Kim, S., Candia, A., Rothbacher, U., Hashimoto, C., Inoue, K. & Cho, K.W. Molecular mechanisms of Spemann's organizer formation: conserved growth factor synergy between Xenopus and mouse. Genes Dev 9, 3038-50 (1995).
96. Laurent, M.N., Blitz, I.L., Hashimoto, C., Rothbacher, U. & Cho, K.W. The Xenopus homeobox gene twin mediates Wnt induction of goosecoid in establishment of Spemann's organizer. Development 124, 4905-16 (1997).
97. Rivera-Perez, J.A., Mallo, M., Gendron-Maguire, M., Gridley, T. & Behringer, R.R. Goosecoid is not an essential component of the mouse gastrula organizer but is required for craniofacial and rib development. Development 121, 3005-12 (1995).
98. Ferreiro, B., Artinger, M., Cho, K. & Niehrs, C. Antimorphic goosecoids. Development 125, 1347-59 (1998).
99. Tantin, D. & Sharp, P.A. Mouse lymphoid cell line selected to have high immunoglobulin promoter activity. Mol Cell Biol 22, 1460-73 (2002).
100. Wu, Y. & Patterson, C. The human KDR/flk-1 gene contains a functional initiator element that is bound and transactivated by TFII-I. J Biol Chem 274, 3207-14 (1999).
101. Chimge, N.O., Mungunsukh, O., Ruddle, F. & Bayarsaihan, D. Expression profiling of BEN regulated genes in mouse embryonic fibroblasts. J Exp Zool B Mol Dev Evol 308, 209-24 (2007).
102. Chimge, N.O., Makeyev, A.V., Ruddle, F.H. & Bayarsaihan, D. Identification of the TFII-I family target genes in the vertebrate genome. Proc Natl Acad Sci U S A 105, 9006-10 (2008).
103. Bayarsaihan, D., Bitchevaia, N., Enkhmandakh, B., Tussie-Luna, M.I., Leckman, J.F., Roy, A. & Ruddle, F. Expression of BEN, a member of TFII-I family of transcription factors, during mouse pre- and postimplantation development. Gene Expr Patterns 3, 579-89 (2003).
104. Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U. & Speed, T.P. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4, 249-64 (2003).
105. Tusher, V.G., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98, 5116-21 (2001).
106. Bolstad, B.M., Irizarry, R.A., Astrand, M. & Speed, T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185-93 (2003).
107. Smyth, G.K. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3, Article3 (2004).
108. Blum, M., Gaunt, S.J., Cho, K.W., Steinbeisser, H., Blumberg, B., Bittner, D. & De Robertis, E.M. Gastrulation in the mouse: the role of the homeobox gene goosecoid. Cell 69, 1097-106 (1992).
109. Gaunt, S.J., Blum, M. & De Robertis, E.M. Expression of the mouse goosecoid gene during mid-embryogenesis may mark mesenchymal cell lineages in the developing head, limbs and body wall. Development 117, 769-78 (1993).
110. Yamada, G., Mansouri, A., Torres, M., Stuart, E.T., Blum, M., Schultz, M., De Robertis, E.M. & Gruss, P. Targeted mutation of the murine goosecoid gene results in craniofacial defects and neonatal death. Development 121, 2917-22 (1995).
111. Durkin, M.E., Keck-Waggoner, C.L., Popescu, N.C. & Thorgeirsson, S.S. Integration of a c-myc transgene results in disruption of the mouse Gtf2ird1 gene, the homologue of the human GTF2IRD1 gene hemizygously deleted in Williams-Beuren syndrome. Genomics 73, 20-7 (2001).
112. Kwon, Y., Shin, J., Park, H.W. & Kim, M.H. Dynamic expression pattern of Hoxc8 during mouse early embryogenesis. Anat Rec A Discov Mol Cell Evol Biol 283, 187-92 (2005).
113. Borello, U., Cobos, I., Long, J.E., McWhirter, J.R., Murre, C. & Rubenstein, J.L. FGF15 promotes neurogenesis and opposes FGF8 function during neocortical development. Neural Dev 3, 17 (2008).
114. Darios, F. & Davletov, B. Omega-3 and omega-6 fatty acids stimulate cell membrane expansion by acting on syntaxin 3. Nature 440, 813-7 (2006).
115. Wu, J.I., Lessard, J., Olave, I.A., Qiu, Z., Ghosh, A., Graef, I.A. & Crabtree, G.R. Regulation of dendritic development by neuron-specific chromatin remodeling complexes. Neuron 56, 94-108 (2007).
116. Hakre, S., Tussie-Luna, M.I., Ashworth, T., Novina, C.D., Settleman, J., Sharp, P.A. & Roy, A.L. Opposing functions of TFII-I spliced isoforms in growth factor-induced gene expression. Mol Cell 24, 301-8 (2006).
117. Pan, Y., Tsai, C.J., Ma, B. & Nussinov, R. Mechanisms of transcription factor selectivity. Trends Genet 26, 75-83 (2010).
118. Palmer, S.J., Santucci, N., Widagdo, J., Bontempo, S.J., Taylor, K.M., Tay, E.S., Hook, J., Lemckert, F., Gunning, P.W. & Hardeman, E.C. Negative autoregulation of GTF2IRD1 in Williams-Beuren syndrome via a novel DNA binding mechanism. J Biol Chem 285, 4715-24 (2010).
119. Antonell, A., Del Campo, M., Magano, L.F., Kaufmann, L., de la Iglesia, J.M., Gallastegui, F., Flores, R., Schweigmann, U., Fauth, C., Kotzot, D. & Perez-Jurado, L.A. Partial 7q11.23 deletions further implicate GTF2I and GTF2IRD1 as the main genes responsible for the Williams-Beuren syndrome neurocognitive profile. J Med Genet 47, 312-20 (2010).
120. Collette, J.C., Chen, X.N., Mills, D.L., Galaburda, A.M., Reiss, A.L., Bellugi, U. & Korenberg, J.R. William's syndrome: gene expression is related to parental origin and regional coordinate control. J Hum Genet 54, 193-8 (2009).
121. Merla, G., Howald, C., Henrichsen, C.N., Lyle, R., Wyss, C., Zabot, M.T., Antonarakis, S.E. & Reymond, A. Submicroscopic deletion in patients with Williams-Beuren syndrome influences expression levels of the nonhemizygous flanking genes. Am J Hum Genet 79, 332-41 (2006).
122. Corbo, J.C., Lawrence, K.A., Karlstetter, M., Myers, C.A., Abdelaziz, M., Dirkes, W., Weigelt, K., Seifert, M., Benes, V., Fritsche, L.G., Weber, B.H. & Langmann, T. CRX ChIP-seq reveals the cis-regulatory architecture of mouse photoreceptors. Genome Res 20, 1512-25 (2010).
123. Wei, G.H., Badis, G., Berger, M.F., Kivioja, T., Palin, K., Enge, M., Bonke, M., Jolma, A., Varjosalo, M., Gehrke, A.R., Yan, J., Talukder, S., Turunen, M., Taipale, M., Stunnenberg, H.G., Ukkonen, E., Hughes, T.R., Bulyk, M.L. & Taipale, J. Genome-wide analysis of ETS-family DNA-binding in vitro and in vivo. EMBO J 29, 2147-60 (2010).
124. Pfenning, A.R., Kim, T.K., Spotts, J.M., Hemberg, M., Su, D. & West, A.E. Genome-wide identification of calcium-response factor (CaRF) binding sites predicts a role in regulation of neuronal signaling pathways. PLoS One 5, e10870 (2010).
125. Enkhmandakh, B., Makeyev, A.V., Erdenechimeg, L., Ruddle, F.H., Chimge, N.O., Tussie-Luna, M.I., Roy, A.L. & Bayarsaihan, D. Essential functions of the Williams-Beuren syndrome-associated TFII-I genes in embryonic development. Proc Natl Acad Sci U S A 106, 181-6 (2009).
126. Matsuda, E., Agata, Y., Sugai, M., Katakai, T., Gonda, H. & Shimizu, A. Targeting of Kruppel-associated box-containing zinc finger proteins to centromeric heterochromatin. Implication for the gene silencing mechanisms. J Biol Chem 276, 14222-9 (2001).
127. Jakobsson, J., Cordero, M.I., Bisaz, R., Groner, A.C., Busskamp, V., Bensadoun, J.C., Cammas, F., Losson, R., Mansuy, I.M., Sandi, C. & Trono, D. KAP1-mediated epigenetic repression in the forebrain modulates behavioral vulnerability to stress. Neuron 60, 818-31 (2008).
128. Valor, L.M. & Grant, S.G. Clustered gene expression changes flank targeted gene loci in knockout mice. PLoS One 2, e1303 (2007).
129. Kedmi, M. & Orr-Urtreger, A. Differential brain transcriptome of beta4 nAChR subunit-deficient mice: is it the effect of the null mutation or the background strain? Physiol Genomics 28, 213-22 (2007).
130. Schalkwyk, L.C., Fernandes, C., Nash, M.W., Kurrikoff, K., Vasar, E. & Koks, S. Interpretation of knockout experiments: the congenic footprint. Genes Brain Behav 6, 299-303 (2007).
131. Noyes, H.A., Agaba, M., Anderson, S., Archibald, A.L., Brass, A., Gibson, J., Hall, L., Hulme, H., Oh, S.J. & Kemp, S. Genotype and expression analysis of two inbred mouse strains and two derived congenic strains suggest that most gene expression is trans regulated and sensitive to genetic background. BMC Genomics 11, 361 (2010).
132. Voikar, V., Koks, S., Vasar, E. & Rauvala, H. Strain and gender differences in the behavior of mouse lines commonly used in transgenic studies. Physiol Behav 72, 271-81 (2001).
133. Ducottet, C. & Belzung, C. Correlations between behaviours in the elevated plus-maze and sensitivity to unpredictable subchronic mild stress: evidence from inbred strains of mice. Behav Brain Res 156, 153-62 (2005).
134. Rodgers, R.J., Boullier, E., Chatzimichalaki, P., Cooper, G.D. & Shorten, A. Contrasting phenotypes of C57BL/6JOlaHsd, 129S2/SvHsd and 129/SvEv mice in two exploration-based tests of anxiety-related behaviour. Physiol Behav 77, 301-10 (2002).
135. Rodger, J., Davis, S., Laroche, S., Mallet, J. & Hicks, A. Induction of long-term potentiation in vivo regulates alternate splicing to alter syntaxin 3 isoform expression in rat dentate gyrus. J Neurochem 71, 666-75 (1998).
136. Shi, L., Reid, L.H., Jones, W.D., Shippy, R., Warrington, J.A., Baker, S.C., Collins, P.J., de Longueville, F., Kawasaki, E.S., Lee, K.Y., Luo, Y., Sun, Y.A., Willey, J.C., Setterquist, R.A., Fischer, G.M., Tong, W., Dragan, Y.P., Dix, D.J., Frueh, F.W., Goodsaid, F.M., Herman, D., Jensen, R.V., Johnson, C.D., Lobenhofer, E.K., Puri, R.K., Schrf, U., Thierry-Mieg, J., Wang, C., Wilson, M., Wolber, P.K., Zhang, L., Amur, S., Bao, W., Barbacioru, C.C., Lucas, A.B., Bertholet, V., Boysen, C., Bromley, B., Brown, D., Brunner, A., Canales, R., Cao, X.M., Cebula, T.A., Chen, J.J., Cheng, J., Chu, T.M., Chudin, E., Corson, J., Corton, J.C., Croner, L.J., Davies, C., Davison, T.S., Delenstarr, G., Deng, X., Dorris, D., Eklund, A.C., Fan, X.H., Fang, H., Fulmer-Smentek, S., Fuscoe, J.C., Gallagher, K., Ge, W., Guo, L., Guo, X., Hager, J., Haje, P.K., Han, J., Han, T., Harbottle, H.C., Harris, S.C., Hatchwell, E., Hauser, C.A., Hester, S., Hong, H., Hurban, P., Jackson, S.A., Ji, H., Knight, C.R., Kuo, W.P., LeClerc, J.E., Levy, S., Li, Q.Z., Liu, C., Liu, Y., Lombardi, M.J., Ma, Y., Magnuson, S.R., Maqsodi, B., McDaniel, T., Mei, N., Myklebost, O., Ning, B., Novoradovskaya, N., Orr, M.S., Osborn, T.W., Papallo, A., Patterson, T.A., Perkins, R.G., Peters, E.H., Peterson, R., Philips, K.L., Pine, P.S., Pusztai, L., Qian, F., Ren, H., Rosen, M., Rosenzweig, B.A., Samaha, R.R., Schena, M., Schroth, G.P., Shchegrova, S., Smith, D.D., Staedtler, F., Su, Z., Sun, H., Szallasi, Z., Tezak, Z., Thierry-Mieg, D., Thompson, K.L., Tikhonova, I., Turpaz, Y., Vallanat, B., Van, C., Walker, S.J., Wang, S.J., Wang, Y., Wolfinger, R., Wong, A., Wu, J., Xiao, C., Xie, Q., Xu, J., Yang, W., Zhong, S., Zong, Y. & Slikker, W., Jr. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24, 1151-61 (2006).
137. Pedotti, P., 't Hoen, P.A., Vreugdenhil, E., Schenk, G.J., Vossen, R.H., Ariyurek, Y., de Hollander, M., Kuiper, R., van Ommen, G.J., den Dunnen, J.T., Boer, J.M. & de Menezes, R.X. Can subtle changes in gene expression be consistently detected with different microarray platforms? BMC Genomics 9, 124 (2008).
138. Barnes, M., Freudenberg, J., Thompson, S., Aronow, B. & Pavlidis, P. Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms. Nucleic Acids Res 33, 5914-23 (2005).
139. Tudor, M., Akbarian, S., Chen, R.Z. & Jaenisch, R. Transcriptional profiling of a mouse model for Rett syndrome reveals subtle transcriptional changes in the brain. Proc Natl Acad Sci U S A 99, 15536-41 (2002).
140. Ishibashi, T., Thambirajah, A.A. & Ausio, J. MeCP2 preferentially binds to methylated linker DNA in the absence of the terminal tail of histone H3 and independently of histone acetylation. FEBS Lett 582, 1157-62 (2008).
141. Skene, P.J., Illingworth, R.S., Webb, S., Kerr, A.R., James, K.D., Turner, D.J., Andrews, R. & Bird, A.P. Neuronal MeCP2 is expressed at near histone-octamer levels and globally alters the chromatin state. Mol Cell 37, 457-68 (2010).
142. Pearson, E.C., Bates, D.L., Prospero, T.D. & Thomas, J.O. Neuronal nuclei and glial nuclei from mammalian cerebral cortex. Nucleosome repeat lengths, DNA contents and H1 contents. Eur J Biochem 144, 353-60 (1984).
143. Caraveo, G., van Rossum, D.B., Patterson, R.L., Snyder, S.H. & Desiderio, S. Action of TFII-I outside the nucleus as an inhibitor of agonist-induced calcium entry. Science 314, 122-5 (2006).
144. Park, C.Y. & Dolmetsch, R. Cell signaling. The double life of a transcription factor takes it outside the nucleus. Science 314, 64-5 (2006).
145. Li, Y., Jia, Y.C., Cui, K., Li, N., Zheng, Z.Y., Wang, Y.Z. & Yuan, X.B. Essential role of TRPC channels in the guidance of nerve growth cones by brain-derived neurotrophic factor. Nature 434, 894-8 (2005).
146. Tai, C., Hines, D.J., Choi, H.B. & Macvicar, B.A. Plasma membrane insertion of TRPC5 channels contributes to the cholinergic plateau potential in hippocampal CA1 pyramidal neurons. Hippocampus (2010).
147. Riccio, A., Li, Y., Moon, J., Kim, K.S., Smith, K.S., Rudolph, U., Gapon, S., Yao, G.L., Tsvetkov, E., Rodig, S.J., Van't Veer, A., Meloni, E.G., Carlezon, W.A., Jr., Bolshakov, V.Y. & Clapham, D.E. Essential role for TRPC5 in amygdala function and fear-related behavior. Cell 137, 761-72 (2009).
148. Lutz, C.S. Alternative polyadenylation: a twist on mRNA 3' end formation. ACS Chem Biol 3, 609-17 (2008).
149. Wickens, M., Anderson, P. & Jackson, R.J. Life and death in the cytoplasm: messages from the 3' end. Curr Opin Genet Dev 7, 220-32 (1997).
150. Sachs, A.B., Sarnow, P. & Hentze, M.W. Starting at the beginning, middle, and end: translation initiation in eukaryotes. Cell 89, 831-8 (1997).
151. Venkataraman, K., Brown, K.M. & Gilmartin, G.M. Analysis of a noncanonical poly(A) site reveals a tripartite mechanism for vertebrate poly(A) site recognition. Genes Dev 19, 1315-27 (2005).
152. Shi, Y., Di Giammartino, D.C., Taylor, D., Sarkeshik, A., Rice, W.J., Yates, J.R., 3rd, Frank, J. & Manley, J.L. Molecular architecture of the human pre-mRNA 3' processing complex. Mol Cell 33, 365-76 (2009).
153. Wilusz, J.E. & Spector, D.L. An unexpected ending: noncanonical 3' end processing mechanisms. RNA 16, 259-66 (2010).
154. Ryan, K., Calvo, O. & Manley, J.L. Evidence that polyadenylation factor CPSF-73 is the mRNA 3' processing endonuclease. RNA 10, 565-73 (2004).
155. Gunderson, S.I., Vagner, S., Polycarpou-Schwarz, M. & Mattaj, I.W. Involvement of the carboxyl terminus of vertebrate poly(A) polymerase in U1A autoregulation and in the coupling of splicing and polyadenylation. Genes Dev 11, 761-73 (1997).
156. Tian, B., Hu, J., Zhang, H. & Lutz, C.S. A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res 33, 201-12 (2005).
157. Zarudnaya, M.I., Kolomiets, I.M., Potyahaylo, A.L. & Hovorun, D.M. Downstream elements of mammalian pre-mRNA polyadenylation signals: primary, secondary and higher-order structures. Nucleic Acids Res 31, 1375-86 (2003).
158. Ruegsegger, U., Beyer, K. & Keller, W. Purification and characterization of human cleavage factor Im involved in the 3' end processing of messenger RNA precursors. J Biol Chem 271, 6107-13 (1996).
159. Awasthi, S. & Alwine, J.C. Association of polyadenylation cleavage factor I with U1 snRNP. RNA 9, 1400-9 (2003).
160. Ghosh, T., Soni, K., Scaria, V., Halimani, M., Bhattacharjee, C. & Pillai, B. MicroRNA-mediated up-regulation of an alternatively polyadenylated variant of the mouse cytoplasmic beta-actin gene. Nucleic Acids Res 36, 6318-32 (2008).
161. An, J.J., Gharami, K., Liao, G.Y., Woo, N.H., Lau, A.G., Vanevski, F., Torre, E.R., Jones, K.R., Feng, Y., Lu, B. & Xu, B. Distinct role of long 3' UTR BDNF mRNA in spine morphology and synaptic plasticity in hippocampal neurons. Cell 134, 175-87 (2008).
162. MacDonald, C.C. & McMahon, K.W. Tissue-specific mechanisms of alternative polyadenylation: testis, brain, and beyond. Wiley Interdiscip Rev RNA 1, 494-501 (2010).
163. Takagaki, Y., Seipelt, R.L., Peterson, M.L. & Manley, J.L. The polyadenylation factor CstF-64 regulates alternative processing of IgM heavy chain pre-mRNA during B cell differentiation. Cell 87, 941-52 (1996).
164. Early, P., Rogers, J., Davis, M., Calame, K., Bond, M., Wall, R. & Hood, L. Two mRNAs can be produced from a single immunoglobulin mu gene by alternative RNA processing pathways. Cell 20, 313-9 (1980).
165. Licatalosi, D.D., Mele, A., Fak, J.J., Ule, J., Kayikci, M., Chi, S.W., Clark, T.A., Schweitzer, A.C., Blume, J.E., Wang, X., Darnell, J.C. & Darnell, R.B. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464-9 (2008).
166. Jelen, N., Ule, J., Zivin, M. & Darnell, R.B. Evolution of Nova-dependent splicing regulation in the brain. PLoS Genet 3, 1838-47 (2007).
167. Ciais, D., Bohnsack, M.T. & Tollervey, D. The mRNA encoding the yeast ARE-binding protein Cth2 is generated by a novel 3' processing pathway. Nucleic Acids Res 36, 3075-84 (2008).
168. Wilusz, J.E., Freier, S.M. & Spector, D.L. 3' end processing of a long nuclear-retained noncoding RNA yields a tRNA-like cytoplasmic RNA. Cell 135, 919-32 (2008).
169. Sunwoo, H., Dinger, M.E., Wilusz, J.E., Amaral, P.P., Mattick, J.S. & Spector, D.L. MEN epsilon/beta nuclear-retained non-coding RNAs are up-regulated upon muscle differentiation and are essential components of paraspeckles. Genome Res 19, 347-59 (2009).
170. Orphanides, G. & Reinberg, D. A unified theory of gene expression. Cell 108, 439-51 (2002).
171. Ahn, S.H., Kim, M. & Buratowski, S. Phosphorylation of serine 2 within the RNA polymerase II C-terminal domain couples transcription and 3' end processing. Mol Cell 13, 67-76 (2004).
172. Komarnitsky, P., Cho, E.J. & Buratowski, S. Different phosphorylated forms of RNA polymerase II and associated mRNA processing factors during transcription. Genes Dev 14, 2452-60 (2000).
173. Jimeno-Gonzalez, S., Haaning, L.L., Malagon, F. & Jensen, T.H. The yeast 5'-3' exonuclease Rat1p functions during transcription elongation by RNA polymerase II. Mol Cell 37, 580-7 (2010).
174. Proudfoot, N.J. How RNA polymerase II terminates transcription in higher eukaryotes. Trends Biochem Sci 14, 105-10 (1989).
175. Greenblatt, J., Nodwell, J.R. & Mason, S.W. Transcriptional antitermination. Nature 364, 401-6 (1993).
176. Calvo, O. & Manley, J.L. Evolutionarily conserved interaction between CstF-64 and PC4 links transcription, polyadenylation, and termination. Mol Cell 7, 1013-23 (2001).
177. Connelly, S. & Manley, J.L. A functional mRNA polyadenylation signal is required for transcription termination by RNA polymerase II. Genes Dev 2, 440-52 (1988).
178. Kim, M., Krogan, N.J., Vasiljeva, L., Rando, O.J., Nedea, E., Greenblatt, J.F. & Buratowski, S. The yeast Rat1 exonuclease promotes transcription termination by RNA polymerase II. Nature 432, 517-22 (2004).
179. West, S., Gromak, N. & Proudfoot, N.J. Human 5'→3' exonuclease Xrn2 promotes transcription termination at co-transcriptional cleavage sites. Nature 432, 522-5 (2004).
180. Luo, W., Johnson, A.W. & Bentley, D.L. The role of Rat1 in coupling mRNA 3'-end processing to transcription termination: implications for a unified allosteric-torpedo model. Genes Dev 20, 954-65 (2006).
181. Zhang, Z., Fu, J. & Gilmour, D.S. CTD-dependent dismantling of the RNA polymerase II elongation complex by the pre-mRNA 3'-end processing factor, Pcf11. Genes Dev 19, 1572-80 (2005).
182. Nadler, J.J., Zou, F., Huang, H., Moy, S.S., Lauder, J., Crawley, J.N., Threadgill, D.W., Wright, F.A. & Magnuson, T.R. Large-scale gene expression differences across brain regions and inbred strains correlate with a behavioral phenotype. Genetics 174, 1229-36 (2006).
183. Parsons, M.J., Grimm, C.H., Paya-Cano, J.L., Sugden, K., Nietfeld, W., Lehrach, H. & Schalkwyk, L.C. Using hippocampal microRNA expression differences between mouse inbred strains to characterise miRNA function. Mamm Genome 19, 552-60 (2008).
184. Dolney, D.E., Szalai, G., Duester, G. & Felder, M.R. Molecular analysis of genetic differences among inbred mouse strains controlling tissue expression pattern of alcohol dehydrogenase 4. Gene 267, 145-56 (2001).
185. Wang, E.T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Kingsmore, S.F., Schroth, G.P. & Burge, C.B. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470-6 (2008).
186. Keren, H., Lev-Maor, G. & Ast, G. Alternative splicing and evolution: diversification, exon definition and function. Nat Rev Genet 11, 345-55 (2010).
187. Burset, M., Seledtsov, I.A. & Solovyev, V.V. Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res 28, 4364-75 (2000).
188. Chong, A., Zhang, G. & Bajic, V.B. Information for the Coordinates of Exons (ICE): a human splice sites database. Genomics 84, 762-6 (2004).
189. Houseley, J. & Tollervey, D. Apparent non-canonical trans-splicing is generated by reverse transcriptase in vitro. PLoS One 5, e12271 (2010).
190. Yap, C.C., Murate, M., Kishigami, S., Muto, Y., Kishida, H., Hashikawa, T. & Yano, R. Adaptor protein complex-4 (AP-4) is expressed in the central nervous system neurons and interacts with glutamate receptor delta2. Mol Cell Neurosci 24, 283-95 (2003).
191. Flavell, S.W., Kim, T.K., Gray, J.M., Harmin, D.A., Hemberg, M., Hong, E.J., Markenscoff-Papadimitriou, E., Bear, D.M. & Greenberg, M.E. Genome-wide analysis of MEF2 transcriptional program reveals synaptic target genes and neuronal activity-dependent polyadenylation site selection. Neuron 60, 1022-38 (2008).
192. Ji, Z., Lee, J.Y., Pan, Z., Jiang, B. & Tian, B. Progressive lengthening of 3' untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proc Natl Acad Sci U S A 106, 7028-33 (2009).
193. DeZazzo, J.D. & Imperiale, M.J. Sequences upstream of AAUAAA influence poly(A) site selection in a complex transcription unit. Mol Cell Biol 9, 4951-61 (1989).
194. Legendre, M. & Gautheret, D. Sequence determinants in human polyadenylation site selection. BMC Genomics 4, 7 (2003).
195. Gerlai, R. Gene-targeting studies of mammalian behavior: is it the mutation or the background genotype? Trends Neurosci 19, 177-81 (1996).
196. Barabino, S.M., Hubner, W., Jenny, A., Minvielle-Sebastia, L. & Keller, W. The 30-kD subunit of mammalian cleavage and polyadenylation specificity factor and its yeast homolog are RNA-binding zinc finger proteins. Genes Dev 11, 1703-16 (1997).
197. Ciobanu, D.C., Lu, L., Mozhui, K., Wang, X., Jagalur, M., Morris, J.A., Taylor, W.L., Dietz, K., Simon, P. & Williams, R.W. Detection, validation, and downstream analysis of allelic variation in gene expression. Genetics 184, 119-28 (2010).
198. Tian, Q., Stepaniants, S.B., Mao, M., Weng, L., Feetham, M.C., Doyle, M.J., Yi, E.C., Dai, H., Thorsson, V., Eng, J., Goodlett, D., Berger, J.P., Gunter, B., Linseley, P.S., Stoughton, R.B., Aebersold, R., Collins, S.J., Hanlon, W.A. & Hood, L.E. Integrated genomic and proteomic analyses of gene expression in Mammalian cells. Mol Cell Proteomics 3, 960-9 (2004).
199. Chen, G., Gharib, T.G., Huang, C.C., Taylor, J.M., Misek, D.E., Kardia, S.L., Giordano, T.J., Iannettoni, M.D., Orringer, M.B., Hanash, S.M. & Beer, D.G. Discordant protein and mRNA expression in lung adenocarcinomas. Mol Cell Proteomics 1, 304-13 (2002).
200. Ferrero, G.B., Howald, C., Micale, L., Biamino, E., Augello, B., Fusco, C., Turturo, M.G., Forzano, S., Reymond, A. & Merla, G. An atypical 7q11.23 deletion in a normal IQ Williams-Beuren syndrome patient. Eur J Hum Genet 18, 33-8 (2010).
201. Dai, L., Bellugi, U., Chen, X.N., Pulst-Korenberg, A.M., Jarvinen-Pasley, A., Tirosh-Wagner, T., Eis, P.S., Graham, J., Mills, D., Searcy, Y. & Korenberg, J.R. Is it Williams syndrome? GTF2IRD1 implicated in visual-spatial construction and GTF2I in sociability revealed by high resolution arrays. Am J Med Genet A 149A, 302-14 (2009).
202. Mak, A.B., Ni, Z., Hewel, J.A., Chen, G.I., Zhong, G., Karamboulas, K., Blakely, K., Smiley, S., Marcon, E., Roudeva, D., Li, J., Olsen, J.B., Wan, C., Punna, T., Isserlin, R., Chetyrkin, S., Gingras, A.C., Emili, A., Greenblatt, J. & Moffat, J. A lentiviral functional proteomics approach identifies chromatin remodeling complexes important for the induction of pluripotency. Mol Cell Proteomics 9, 811-23 (2010).
203. Tsien, J.Z., Huerta, P.T. & Tonegawa, S. The essential role of hippocampal CA1 NMDA receptor-dependent synaptic plasticity in spatial memory. Cell 87, 1327-38 (1996).
204. Novak, A., Guo, C., Yang, W., Nagy, A. & Lobe, C.G. Z/EG, a double reporter mouse line that expresses enhanced green fluorescent protein upon Cre-mediated excision. Genesis 28, 147-55 (2000).