UNIVERSITY OF CINCINNATI
DNA RECOGNITION BY THE K50 CLASS HOMEODOMAIN PITX2: SOLUTION
STRUCTURE, MOLECULAR DYNAMICS, AND IMPLICATIONS FOR MUTATIONS
THAT CAUSE RIEGER SYNDROME.
A dissertation submitted to the
Division of Research and Advanced Studies of the University of Cincinnati
In partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY (Ph.D.)
In the Department of Molecular Genetics, Biochemistry and Microbiology of the College of Medicine
2005
by
Beth A. Chaney
B.S. Wilmington College, 2000
Committee Chair: Mark Rance, Ph.D.
ABSTRACT
We have determined the solution structure of a complex containing the K50 class homeodomain Pituitary homeobox protein 2 (PITX2) bound to its consensus DNA site
(TAATCC). Previous studies have suggested that residue 50 is an important determinant of differential DNA-binding specificity among homeodomains. Although structures of several
homeodomain-DNA complexes have been determined, this is the first structure of a native K50
class homeodomain. The only K50 homeodomain structure determined previously is an X-ray
crystal structure of an altered specificity mutant, Engrailed Q50K (EnQ50K). Analysis of the
NMR structure of the PITX2 homeodomain indicates that the lysine at position 50 makes contacts with two guanines on the antisense strand of the DNA, adjacent to the TAAT core DNA sequence, consistent with the structure of EnQ50K. Our evidence suggests that this side chain may make fluctuating interactions with the DNA. Mutations in the human PITX2 gene are responsible for Rieger syndrome, an autosomal dominant disorder. Analysis of the residues mutated in Rieger syndrome indicates that many of these residues are involved in DNA binding, while others are involved in formation of the hydrophobic core of the protein. We have also performed molecular dynamics (MD) simulations on the PITX2 homeodomain-DNA complex.
The results indicate that motion of the K50 side chain is on a time scale longer than what we can simulate by MD. The results also show a number of long-lived water molecules in the vicinity of
R52 and R53, which form water-bridging interactions with the DNA. A number of water molecules are shown to be in the vicinity of other arginines in contact with the DNA, and near
K50. We also performed molecular dynamics simulations of Rieger mutant complexes. Comparison of these simulations with the wild-type case reveals many differences in levels of hydration, water residence times, energy levels, water bridging interactions, and
direct protein-DNA interactions. Overall, the role of K50 in homeodomain recognition is further clarified, and the results indicate that native K50 homeodomains may exhibit differences from altered specificity mutants. These results also provide further insight into how Rieger mutations cause severe phenotypic consequences.
Acknowledgements
There have been countless people who have helped me get this far. First and foremost, I have to
thank my thesis advisor, Dr. Mark Rance, who allowed me to join his lab and has provided the
best working environment/"home" for me for the past 5 years. His guidance and support have
been invaluable. I also have to thank all of the members of my thesis committee (Dr. Paul
Rosevear, Dr. Gary Dean, Dr. Jun Ma and Dr. John Maggio) who have been a great help when I
needed to determine a path for my thesis to take, and have provided help along that path.
Countless thanks go to Dr. Kimber Clark-Baldwin, who taught me virtually everything I know about labwork and has always been a great second opinion when I was trying to interpret strange lab results. I also have to thank Jamie Titus, who was struggling along in her project at the same time I was, and was always there to commiserate with when our clones kept coming up negative!
And of course, I can't forget Eric Johnson, who has great taste in music and let me copy his
CDs. He was also always there to bring us back down to earth with all of his equations. I would also like to thank Dr. Jack Howarth for computer support and maintaining the NMR facility at the University of Cincinnati. And special thanks to members of the Rosevear Lab who have provided comic relief over the years, especially Neal and Alex, both of whom I greatly miss! I'd
also like to thank the love of my life, Jeff, for putting up with my endless complaining about how
I want to graduate and always being there for me and encouraging me to stick with it. And last,
but most certainly not least, I'd like to dedicate this work to my parents, Jerry and Phyllis
Chaney, whose love and support have kept me going over the years, and without whom, I would
never have made it this far!
TABLE OF CONTENTS
Table of Contents……………………………………………………………………….. 1
List of Figures and Tables………………………………………………………………. 5
Abbreviations…………………………………………………………………………… 8
CHAPTER 1: The PITX2 Homeodomain—a review of the literature 10
Introduction…………………………………………………………………………….. 10
PITX2 and Disease……………………………………………………………………… 10
PITX2 and Development……………………………………………………………….. 14
PITX1 and PITX3……………………………………………………………………… 22
The Homeodomain…………………………………………………………………….. 23
Homeodomain Structure and DNA Recognition………………………………………. 25
Bicoid………………………………………………………………………………….. 36
Non-consensus Site Recognition…………………………………….……………….. 37
Statement of Research Goals…………………………………………………………. 41
CHAPTER 2: Materials and Methods 43
Expression of the PITX2 homeodomain………………………………………………. 43
Purification of the PITX2 homeodomain……………………………………………… 45
Gel Shift Assays………………………………………………………………………. 47
Determination of KD…………………………………………………………………. 47
NMR Structure Determination………………………………………………………... 48
Protein assignments…………………………………………………………... 48
DNA assignments………………………………………………………….…. 49
Structural constraints…………………………………………………………. 49
Data processing and analysis…………………………………………………. 49
Relaxation rates………………………………………………………………. 50
Structure calculations…………………………………………………………. 50
Docking of the protein to the DNA…………………………………………… 52
Structure refinement and analysis…………………………………………….. 56
Molecular Dynamics Calculations…………………………………………………… 57
Preparation of mutant protein files…………………………………………… 57
Molecular dynamics………………………………………………………….. 58
Analysis of production run data……………………………………………… 67
CHAPTER 3: Solution Structure of the K50 Class Homeodomain PITX2 Bound to DNA and Implications for Mutations That Cause Rieger Syndrome 71
Functional Analysis of Purified PITX2 Homeodomain……………………………… 71
KD of the PITX2 Homeodomain Bound to its DNA Consensus Site………………… 72
Analysis of Protein Folding by HSQC………………………………………………. 73
Resonance Assignments…………………………………………………………….. 75
Chemical Shift Indices……………………………………………………………… 77
Aromatic Assignments……………………………………………………………… 79
Side Chain Assignments for Arginine, Asparagine, and Glutamine Residues…….. 80
Chemical Shift Assignments for the DNA bound to the PITX2 homeodomain……. 81
Protein-DNA NOEs………………………………………………………………… 82
Tertiary Structure of the Pitx2 Homeodomain……………………………………… 82
Structure determination…………………………………………………….. 82
Quality of the NMR structure ……………………………………………… 83
Tertiary structure of the PITX2 homeodomain-DNA complex……………. 87
Protein-DNA recognition…………………………………………………… 94
The role of lysine at position 50…………………………………………… 98
Analysis of residues mutated in Rieger syndrome………………………… 103
Concluding Remarks………………………………………………………………. 106
CHAPTER 4: DNA Recognition by the Human PITX2 Homeodomain: Molecular Dynamics Simulation of Wild-Type and Rieger Mutant Complexes 109
Introduction…………………………………………………………………………. 109
Overall Behavior of the Trajectories……………………………………………….. 112
Analysis of the molecular dynamics of the wild-type PITX2 HD-DNA Complex ………………………………………………………………... 114
Hydration and water-mediated protein-DNA contacts…………………….. 116
Properties of Lys50………………………………………………………… 119
Analysis of Mutant Complexes…………………………………………………….. 122
T30P Simulation…………………………………………………………... 123
R31H Simulation…………………………………………………………... 124
V45L Simulation…………………………………………………………... 124
R46W Simulation…………………………………………………………... 127
K50Q Simulation…………………………………………………………... 128
R52C Simulation…………………………………………………………... 128
R53P Simulation…………………………………………………………... 129
Discussion…………………………………………………………………………. 130
CHAPTER 5: Thesis Summary and Future Directions 135
CHAPTER 6: Literature Cited 140
LIST OF FIGURES AND TABLES
FIGURES
Figure 1.1 Symptoms of Rieger syndrome…………………………………… 12
Figure 1.2 Structure of the Engrailed homeodomain as a model for overall homeodomain structure …………………………………………. 27
Figure 2.1 Amino acid sequence of the PITX2 homeodomain used for the structural studies…………………………………………………. 43
Figure 2.2 Optimizing expression conditions for the PITX2 homeodomain…. 44
Figure 2.3 Purification of the PITX2 homeodomain and production of a pure NMR sample………………………………………………… 46
Figure 2.4 DNA sequence of the binding site used in the structural studies…. 46
Figure 3.1 Gel shift assays of the PITX2 homeodomain……………………. 72
Figure 3.2 KD of PITX2 homeodomain……………………………………… 73
Figure 3.3 15N HSQC for the PITX2 homeodomain bound to its consensus DNA site……………………………………………… 74
Figure 3.4 15N-HSQC of the PITX2 homeodomain labeled with backbone and side chain assignments obtained through triple-resonance experiments……………………………………… 75
Figure 3.5 Ensemble of structures of the PITX2 homeodomain/DNA complex…………………………………………………………… 86
Figure 3.6 Ramachandran plot………………………………………………… 87
Figure 3.7 Structure of the PITX2 homeodomain/DNA complex……………. 88
Figure 3.8 Overlay of PITX2 homeodomain and EnQ50K homeodomain structures…………………………………………………………… 92
Figure 3.9 R2 relaxation rate constants for the PITX2 homeodomain………… 93
Figure 3.10 Detailed view of the protein-DNA interface and protein-DNA contacts……………………………………………… 94
Figure 3.11 The K50 side chain may be mobile……………………………….. 101
Figure 3.12 Ribbon diagram of the PITX2 homeodomain/DNA complex showing the positions of the side chains for the residues known to be mutated in Rieger syndrome and related disorders…………… 105
Figure 4.1 RMSD values of the MD snapshots versus the starting NMR structure……………………………………………………….. 113
Figure 4.2 Total energy levels of the MD snapshots as a function of simulation time………………………………………………………. 114
Figure 4.3 NUCPLOT diagram of the average structure of the wild-type protein-DNA complex during the 2 ns trajectory……………………. 115
Figure 4.4 Outline of some of the water molecules at the protein-DNA interface……………………………………………………………… 118
Figure 4.5 Snapshots of a single water molecule’s trajectory during the 2 ns simulation time…………………………………………………. 119
Figure 4.6 Properties of K50 during the MD simulation………………………… 121
Figure 4.7 NUCPLOT diagram of the average structure of the V45L mutant protein-DNA complex during the 2 ns trajectory…………… 126
Figure 4.8 Overlay of the wild-type and R46W mutant complexes……………. 127
Figure 4.9 NUCPLOT diagram of the average structure of the R53P mutant protein-DNA complex during the 2 ns trajectory…………… 130
TABLES
Table 1.1 Mutations found in Rieger syndrome, and their properties………… 20
Table 1.2 Sequence alignment of homeodomains……………………………. 25
Table 1.3 List of DNA sites that PITX2 recognizes………………………….. 40
Table 2.1 Sequences of oligonucleotide duplexes used in gel shift assays…… 47
Table 3.1 List of chemical shifts for the PITX2 homeodomain bound to DNA……………………………………………………………… 76
Table 3.2 Chemical shift indices for the Hα, Cα, and CO atoms of the PITX2 homeodomain………………………………………….. 78
Table 3.3 Assignments of atoms in aromatic groups of the PITX2 homeodomain……………………………………………………… 79
Table 3.4 Chemical shift assignments of the arginine side chains…………… 80
Table 3.5 Chemical shift assignments of the asparagine side chains………… 80
Table 3.6 Chemical shift assignments of the glutamine side chains………… 80
Table 3.7 Chemical shift assignments of the DNA binding site…………….. 81
Table 3.8 Table of protein-DNA NOEs……………………………………… 82
Table 3.9 NMR structure statistics…………………………………………… 84
Table 4.1 Hydration in the WT and mutant trajectories……………………… 118
ABBREVIATIONS
AMBER Assisted Model Building with Energy Refinement
AML Acute Myeloid Leukemia
ANF Atrial Natriuretic Factor gene
Antp Antennapedia
bb backbone
Bcd Bicoid
CANDID combined automated NOE assignment and structure determination
CSI chemical shift indices
CYANA CANDID (see above) + DYANA (dynamics algorithm for NMR applications)
DORV double outlet right ventricle
DSS 2,2-dimethyl-2-silapentane-5-sulfonic acid
DTT dithiothreitol
En Engrailed
EnQ50K Engrailed Q50K mutant
Ftz Fushi tarazu
GCMa Glial Cells Missing homolog 1
HD homeodomain
HMQC heteronuclear multiple quantum coherence
HSQC heteronuclear single quantum correlation
IPTG isopropyl-beta-D-thiogalactopyranoside
MD molecular dynamics
MEF2A Myocyte Enhancer Factor 2A
NMR nuclear magnetic resonance
NOE nuclear Overhauser effect
NOESY nuclear Overhauser enhancement spectroscopy
Nps maximum number of picoseconds a water is within 3.0 Å
Nw maximum number of water molecules within 3.0 Å
PBS phosphate buffer solution
PITX1 Pituitary homeobox protein 1
PITX2 Human Pituitary homeobox protein 2
Pitx2 Mouse, chick, zebrafish Pituitary homeobox protein 2
PITX3 Pituitary homeobox protein 3
PLOD Procollagen Lysyl Hydroxylase gene
PME Particle Mesh Ewald
PMSF phenyl methyl sulfonyl fluoride
POMC Pro-opiomelanocortin gene
RMSD root mean square deviation
SANDER simulated annealing with NMR-derived energy restraints
TCB thrombin cleavage buffer
TOCSY total correlation spectroscopy
Vnd vnd/Nk-2
WT wild-type
CHAPTER 1: The PITX2 Homeodomain—a review of the literature
Introduction
Pituitary homeobox protein 2 (PITX2) is a transcription factor that binds DNA through its homeodomain. This protein plays important roles in the development of the heart and in left-right asymmetry. Mutations in the human PITX2 gene are responsible for Rieger syndrome, an autosomal dominant disorder. Many of the mutations found in this gene in Rieger syndrome are single amino acid substitutions within the homeodomain region, which suggests that
this domain is very important in development. The homeodomain of PITX2 is a member of the
K50 class of homeodomains, which all have a characteristic lysine residue at position 50 of the
homeodomain. The Drosophila morphogenetic protein Bicoid is another important and well-
studied member of this class of homeodomain proteins. Bicoid and PITX2 are both known to
control pattern formation in a dose-dependent manner during embryonic development. The
highly homologous proteins PITX1 and PITX3 are also known to function in a similar manner during embryogenesis, and their homeodomain regions are also highly homologous, including
the lysine residue at position 50 of the homeodomain. Previously, no structure had been
determined for a native K50 class homeodomain. The goal of this research was to determine the
structure of the PITX2 homeodomain bound to DNA and to analyze the molecular dynamics of
this interaction. This chapter covers previous research regarding the importance of PITX2 and
related proteins in development, the mechanisms by which homeodomains bind to DNA, and structural information available from other classes of homeodomain proteins.
PITX2 and Disease
The PITX2 gene was originally identified by positional cloning of the 4q25 locus in patients with Rieger syndrome [Semina et al, 1996a]. Other groups have also cloned this gene
and given it various other names (Ptx2, Otlx2, Brx1, and ARP1) [Kitamura et al, 1997; Arakawa et al, 1998]. Mutations in this gene not only cause Rieger syndrome, but also the related and less severe conditions iris hypoplasia and iridogoniodysgenesis syndrome [Kozlowski & Walter,
2000; Kulak et al, 1998]. These are all autosomal dominant disorders, and they all are characterized by anterior segment abnormalities, which are abnormalities of the eye. Rieger syndrome is a dominant haploinsufficient disorder, which indicates that reduction of PITX2 activity by half can cause disorders of development [Semina et al, 1996a]. There are numerous mutations in the highly homologous PITX1 gene that result in Treacher Collins syndrome
[Graham & McGonnell, 1999]. This disorder includes symptoms that are highly variable, such as underdeveloped and malformed facial bones, hearing loss, and strabismus (a turning in of the eyes).
Axenfeld-Rieger syndrome is a heterogeneous disorder, which is characterized by malformations of the eyes, teeth and umbilicus. It is a group of anomalies that includes Rieger syndrome, Axenfeld anomaly and Rieger anomaly. Patients with Axenfeld anomaly have defects of the eye, with abnormal iris tissue. Patients with Rieger anomaly have the abnormalities seen in Axenfeld anomaly, with the addition of iris changes and a displaced pupil. The most important ocular feature of this family of disorders is glaucoma, which develops in approximately 50% of patients [Espinoza et al, 2002]. Iris hypoplasia is the mildest of the disorders, characterized solely by maldevelopment of the iris stroma and early-onset glaucoma.
The pigment epithelium that is visible through the thin stroma gives the iris a striking color of slate gray to chocolate brown [Alward et al, 1998]. Patients with iridogoniodysgenesis syndrome have these defects, along with abnormalities in iridocorneal angle tissue differentiation
[Kulak et al, 1998]. Rieger syndrome is the most extreme member of this family of diseases,
with ocular, facial, dental, and umbilical anomalies (see Figure 1.1). Omphalocele, in which abdominal contents protrude through the base of the umbilical cord, is found in about 5% of patients [Katz et al, 2004]. Dental anomalies include abnormally small teeth (microdontia), spaces between teeth, misshapen teeth, and missing teeth (hypodontia). In older patients, the teeth can become brittle and fall out. Consistent with the role of PITX2 in heart development, patients with Rieger syndrome often exhibit cardiac defects [Gage et al, 1999a;
Kitamura et al, 1999; Lu et al, 1999; Mammi et al, 1998].
Figure 1.1. Symptoms of Rieger Syndrome. A) Dental hypoplasia. B) Omphalocele. C) Glaucoma and craniofacial defects. D) Protrusion of umbilicus [Semina et al, 1996a].
Rieger syndrome is associated with mutations in the PITX2 gene, and can also be associated with PAX6 gene abnormalities [Riise et al, 2001; Perveen et al, 2000]. Mutations in
PITX2 account for about 40% of the known cases of Rieger syndrome [Hjalt et al, 1999].
Sequencing of DNA from human patients has shown that many mutations in PITX2 result in single amino acid substitutions within the homeodomain region [Priston et al, 2001]. This illustrates the importance of the homeodomain of PITX2 in development. PITX2 with a mutation at position 45 of the homeodomain region (V45L) can bind DNA at slightly lower
levels, and has a 200% increase in transactivation activity [Priston et al, 2001]. It is believed that this mutation affects the homeodomain conformation in such a way as to affect DNA binding and transactivation differently. Another mutation found in the homeodomain in Rieger syndrome is
K50E [Saadi et al, 2001]. Transient transfection assays with the prolactin promoter and both
PITX2 and another pituitary transcription factor, Pit-1, show a strong synergistic effect on transactivation [Amendt et al, 1998]. The K50E mutation suppresses this synergism [Saadi et al,
2001; Amendt et al, 1998]. It has been found that PITX2 can form homodimers in the absence of
DNA, and the K50E mutant has stronger dimerization activity [Saadi et al, 2003]. The wild-type
PITX2 homodimers can bind cooperatively to DNA, but the K50E-WT heterodimers have
greatly reduced cooperativity and transactivation function. This mutation therefore acts in a
dominant negative fashion. An R46W mutation was found specifically in iris hypoplasia [Heon et
al, 1995; Alward et al, 1998]. An R31H mutation was found specifically in iridogoniodysgenesis
syndrome [Chisholm & Chudley, 1983; Walter et al, 1996; Kulak et al, 1998]. Other mutations
found in the homeodomain of PITX2 in Rieger syndrome include L16Q, T30P, and R53P
[Semina et al, 1996a; Semina et al, 1996b; Murray et al, 1992]. Analysis of these mutant
proteins by electrophoretic mobility shift assays has shown that the iris hypoplasia mutant
(R46W) retains most of its DNA-binding activity, while the Rieger syndrome mutants are
nonfunctional [Kozlowski & Walter, 2000]. These results support the hypothesis that differences
in functional amounts of PITX2 may be the basis for the wide spectrum of anomalies in the
Axenfeld-Rieger group of disorders. Other studies have also provided evidence that physical or
functional haploinsufficiency of PITX2 is a pathogenic mechanism for Rieger syndrome
[Flomen et al, 1998; Espinoza et al, 2002]. The R53P mutant also exhibits cytoplasmic staining
in COS-7 cells, which supports the hypothesis that there is a nuclear localization signal within
the third helix of the PITX2 homeodomain. These mutants will be discussed further below and are outlined in Table 1.1. A recent study described a Chinese family in which mutational analysis showed a frameshift mutation that causes the PITX2 protein to be truncated after the homeodomain [Wang et al, 2003]. Affected members of this family show prominent dental abnormalities along with the other symptoms of Rieger syndrome.
PITX2 and Development
PITX2 is a protein that is found in many developing tissues in vertebrate embryos. It is expressed in the brain, heart, pituitary, mandibular and maxillary regions, eye, gut, limb and umbilicus [Semina et al, 1996a; Gage & Camper, 1997; Mucchielli et al, 1997; Hjalt et al, 2000].
It is the first transcriptional marker observed during tooth development [Green et al, 2001].
There have been three major isoforms of PITX2 identified, and these isoforms are produced by alternative splicing and use of different promoters [Semina et al, 1996a; Gage & Camper, 1997;
Arakawa et al, 1998; Gage et al, 1999a; Kitamura et al, 1999]. All of the isoforms contain different N-terminal domains, while the homeodomain and C-terminal domains are identical.
The C-terminal region contains a transcriptional activation domain. Phosphorylation of the C-terminus by PKC enhances the interaction with cellular factors, and increases transcriptional activation [Espinoza et al, 2005]. Studies have shown that tissue and organ development is differentially regulated by PITX2a and PITX2c isoforms. In the chick, Pitx2c plays a crucial role in left-right axis determination and rightward heart looping [Yu et al, 2001]. In zebrafish,
Pitx2a has a greater impact on cardiac symmetry than Pitx2c [Essner et al, 2000]. Experiments in mice have shown that different organs have different dosage requirements for Pitx2c [Liu et al, 2001]. This study showed that lower levels of Pitx2c are required for cardiac atria development, while higher levels are required for duodenum and lung development. Another
PITX2 isoform (PITX2d) has been identified, but it has a truncated homeodomain and does not
bind to DNA [Cox et al, 2002]. This study showed that PITX2d can negatively regulate the
transcriptional activities of PITX2a and PITX2c. One study has shown that in the craniofacial
region where all three isoforms are expressed, it is not the isoform type that controls differential
regulation of genes, but the dosage of the Pitx2 protein [Liu et al, 2003]. They found that
repression of Bmp4 signaling requires high doses of Pitx2, while maintenance of Fgf8 signaling
only requires low levels of Pitx2.
Experiments in mice have shown that Pitx2 is expressed in the odontogenic epithelium
and is the first transcriptional marker of tooth development [Mucchielli et al, 1997; Green et al,
2001]. In tooth development, the epithelium differentiates into the enamel-secreting ameloblasts, while the mesenchyme cells become the dentin-secreting odontoblasts. Expression of Pitx2 is
restricted to the epithelium, and can be detected as early as embryonic day 8.5 during mouse
tooth morphogenesis [Mucchielli et al, 1997; St. Amand et al, 2000]. Pitx2 expression remains
specific to the oral epithelium with a progressive restriction to the dental placodes, followed by
high-level expression in the dental lamina and enamel knot. Postnatal expression is still detected
in relatively undifferentiated epithelial tissue in the tooth germs, in the later-developing second
and third molar anlage [Green et al, 2001]. Pitx2 is found in the preameloblasts, and is absent
from the fully differentiated ameloblasts postnatally [Mucchielli et al, 1997].
The internal organs of vertebrates are arranged on both sides of the body's midline with a characteristic asymmetry. At the end of gastrulation, the precursors of the respiratory and digestive organs and the heart are located at the midline. The first visible sign of left-right asymmetry is the right-sided looping of the developing heart. Eventually, all visceral organs show left-right asymmetry, either as single organs (heart, stomach and spleen) or because paired
organs such as the lungs display more lobes on one side than the other. The right lung has three
lobes, while the left lung has two lobes. The liver and gallbladder are positioned to the right of
the midline, while the stomach and spleen are positioned to the left. The apex of the heart points
to the left and the primitive gut coils in a counterclockwise direction. Alterations in left-right specification can lead to severe defects, including left-right reversals of organ position (situs inversus), mirror image symmetry of asymmetric tissues (isomerism), or random and
independent laterality defects in different tissues (heterotaxia). Situs inversus carries minor
medical risk because the organs are normal in structure and in their positions relative to one
another, but the other defects can have severe consequences [Bisgrove & Yost, 2001]. PITX2
plays a role in left-right asymmetry as part of the Nodal signaling pathway. It is expressed on
the left side of developing embryos, and when expressed on the right side, the location of the organs is reversed. Nodal signals have been implicated in specification of the germ layer, patterning of the nervous system, and determination of bilateral asymmetry of organs. Nodal signaling on the left side of the developing embryo induces expression of Pitx2 on that side [Kathiriya & Srivastava, 2000]. Members of the Lefty family
of TGF-β proteins have been shown to act as inhibitors of Nodal signaling, and it appears that a delicate balance between Lefty and Nodal proteins regulates the left-sided expression of Pitx2
[Bisgrove & Yost, 2001].
Another study has shown a role for Pitx2 in mediating proliferation of specific cell types as part of the Wnt/Dvl/β-Catenin pathway [Clevers, 2002; Kioussi et al, 2002]. Pitx2 expression in cardiac neural crest cells is decreased in Dvl2-/- mice, and chromatin immunoprecipitation
analysis in a pituitary cell line has shown that Lef1 and β-catenin physically occupy the Pitx2
promoter. In Pitx2-/- mice, the cardiac outflow tract and pituitary glands have lower numbers of
proliferating cells, and transgenic overexpression of Pitx2 leads to increased cell numbers. It was subsequently shown that the cell cycle regulator cyclin D2 has bicoid binding sites in its promoter region that are bound and subsequently regulated by Pitx2. This regulation also involves a physical interaction of Pitx2 with β-catenin. These results support a model in which
Wnt signaling activates Pitx2, which then drives cell proliferation in a tissue-specific manner. A more recent study has shown that this pathway partially regulates Pitx2 by controlling the turnover of the unstable Pitx2 mRNA [Briata et al, 2003]. In return, Pitx2 is a mediator of
Wnt/β-catenin-induced mRNA stabilization.
During looping of the heart in chicks, Pitx2 is present in the left atrium, in the ventral portion of the ventricles and in the left-ventral part of the outflow tract [Campione et al, 2001].
Mouse Pitx2 shows a similar developmental expression pattern. Pitx2 null mice show no alteration in heart looping, but they have numerous heart abnormalities, showing that Pitx2 is important for normal heart development [Lu et al, 1999; Lin et al, 1999; Gage et al, 1999a;
Kitamura et al, 1999]. The cardiac ventricles are displaced rightwards, the heart fails to septate the atrium, and there is variable hypoplasia of the ventricles in the mutant mice. These hearts fail to develop tricuspid and mitral valves and a common atrioventricular valve develops
[Kitamura et al, 1999]. Loss of Pitx2 function causes severe cardiovascular defects, such as atrial isomerism, double inlet left ventricle, transposition of the great arteries, persistent truncus arteriosus, and abnormal aortic arch remodeling, which are all conditions found in humans
[Franco & Campione, 2003]. Ectopic expression of either Pitx2c or Pitx2a via retroviral infection to the right side equally randomizes heart looping direction [Yu et al, 2001]. Ectopic
Pitx2c expression in the developing myocardium of mice creates double outlet right ventricle
(DORV) [Franco & Campione, 2003].
Mice that have been genetically engineered to be homozygous for a Pitx2 null allele have, in addition to the above heart defects, arrest of development of the pituitary gland, numerous defects of the eye, and alteration of development of the mandibular and maxillary regions [Gage et al, 1999a; Lu et al, 1999; Lin et al, 1999]. These null mutants display right pulmonary isomerism and altered architecture of the lobes of the left lung [Kitamura et al, 1999;
Hjalt et al, 2000]. These mice die by embryonic day 14.5. Mice that are heterozygous for a null
Pitx2 allele also exhibit defects in embryogenesis [Gage et al, 1999a]. A small number of these mice have anterior chamber defects of the eye and heart defects [Gage et al, 1999a]. They also fail to close the ventral body wall, which is consistent with omphalocele found in some Rieger patients [Gage et al, 1999a]. Rieger syndrome results from haploinsufficiency, which is consistent with some of the defects seen in the heterozygous mice [Gage et al, 1999a; Flomen et al, 1997]. The correct concentration of Pitx2 protein appears to be crucial for normal physiological function of this transcription factor.
Several target genes for PITX2 have been identified previously. It is believed that protein binding to the C-terminus of PITX2 allows for binding of PITX2 to DNA, possibly by masking an inhibitory domain [Amendt et al, 1999]. In the pituitary, the prolactin gene is synergistically activated by Pit-1 and PITX2 [Amendt et al, 1998; Quentien et al, 2002a]. Other PITX2 target genes in the pituitary have also been described [Tremblay et al, 2000]. A number of genes outside of the pituitary have been shown to be regulated specifically by PITX2. It has been shown by chromatin immunoprecipitation with antibodies specific for PITX2 that PITX2 regulates expression of procollagen lysyl hydroxylase (PLOD) and Dlx2 [Hjalt et al, 2001; Green et al, 2001]. Subsequent experiments showed that the Atrial Natriuretic Factor (ANF) gene is also a target of PITX2 [Ganga et al, 2003]. ANF is expressed early in embryonic development
when cells are committed to the cardiac phenotype. It has also been found that PITX2 strongly activates the Gad1 promoter, which is involved in GABAergic neuron differentiation during mammalian neural development [Westmoreland et al, 2001]. The gene PLOD2 encodes a protein that is responsible for hydroxylation of lysines in collagens, which plays a role in creating the extracellular matrix and provides a foundation for the morphogenesis of tissues and organs. The gene Dlx2 encodes a transcription factor that is expressed in the mesenchymal and epithelial cells of the facial region and tooth-forming anlage, and is also expressed in the diencephalon. Dlx2 is a member of the distal-less family of genes, and has been shown to regulate branchial arch development [Qiu et al, 1995; Thomas et al, 2000]. PITX2 and Dlx2 are expressed in the same tissues during early development. PITX2 binds to consensus and nonconsensus bicoid sites in the Dlx2 promoter and activates this promoter 30-fold in CHO cells
[Espinoza et al, 2002]. Another study found a 45-fold activation of this promoter in CHO cells by PITX2 [Green et al, 2001]. PITX2 proteins engineered with mutations found in Rieger syndrome were used to determine if they could transactivate the Dlx2 promoter. A phenotypically less severe mutation (R46W) is able to bind and transactivate the promoter. A more severe mutation (T30P), which presents with the full spectrum of anomalies, is unable to transactivate the Dlx2 promoter. One study looked at five mutations found in Rieger syndrome that are still able to bind to the CE-3 DNA response element from the pituitary POMC gene
[Quentien et al, 2002b]. All five of the mutant proteins (L16Q, T30P, R31H, R46W, R53P) have lost the ability to transactivate three different pituitary gene promoters (prolactin, growth hormone, and pit-1). Four of the five mutant proteins tested fail to affect wild-type PITX2 induction of the prolactin promoter, while the fifth (R53P) acts as a dominant negative inhibitor, blocking wild-type PITX2 induction of the prolactin promoter. Small changes in the protein conformation
caused by point mutations in DNA-binding or protein-binding surfaces may alter the types of
protein binding partnerships that form, and may also change which cofactors are recruited to a
target gene promoter [Voss & Day, 2002]. A summary of the point mutations found in the
PITX2 homeodomain in Rieger syndrome and their biochemical effects is shown in Table 1.1.
Thus, the molecular basis of tooth anomalies in Rieger syndrome appears to be the inability of
PITX2 to activate genes involved in tooth morphogenesis. An additional mutation, V45L, has
been found to increase activation function by about 200%, even though DNA binding is reduced [Priston et al, 2001]. Overexpression of Pitx2 in mice has been shown to have very similar phenotypic consequences, with glaucoma and anterior defects [Holmberg et al, 2004].
Therefore, overexpression of PITX2 in development appears to be just as detrimental as underexpression.
Mutation  Disease               Properties
L16Q      Rieger Syndrome       Unstable, no activation, no consensus binding
T30P      Rieger Syndrome       No activation, only binds consensus
R31H      Iridogoniodysgenesis  Reduced activation, only binds consensus site
V45L      Rieger Syndrome       <10-fold reduction in DNA-binding, increased activation
R46W      Iris Hypoplasia       Reduced binding to CE-3 site, reduced activation
K50E      Rieger Syndrome       No DNA binding or activation, dominant negative
K50Q      Rieger Syndrome       Not known
R52C      Rieger Syndrome       Not known
R53P      Rieger Syndrome       No CE-3 binding, no activation, dominant negative
Table 1.1: Mutations found in Rieger Syndrome, and their developmental properties [Semina et al, 1996; Priston et al, 2001; Kulak et al, 1998; Saadi et al, 2001; Heon et al, 1995; Alward et al, 1998; Chisholm & Chudley, 1983; Walter et al, 1996; Murray et al, 1992; Quentien et al, 2002b].
A recent study has indicated that PITX2 interacts with the transcription factor GCMa, which is expressed in the placenta during development, and in the kidney and thymus postnatally
[Schubert et al, 2004]. This study showed that these two proteins bind cooperatively to promoters. Another study found that PITX2 interacts with myocyte enhancer factor 2A (MEF2A) in regulating expression of ANF [Toro et al, 2004]. As mentioned above, the prolactin gene is synergistically activated by Pit-1 and PITX2 [Amendt et al, 1998; Quentien et al, 2002a], and Pitx2 associates with β-catenin to regulate cyclin D2 expression [Kioussi et al, 2002]. Therefore, interactions with other transcription factors may be important for the regulation of gene expression by PITX2.
It is believed that PITX2 plays a role in regulation of cell differentiation and cell proliferation in adult vertebrates as well. PITX2 has been isolated as a downstream target of the human acute leukemia ALL1 gene, which has been implicated in the development of human acute leukemia associated with abnormalities at 11q23 [Arakawa et al, 1998]. PITX2 is expressed in normal human bone marrow and leukemic cell lines with a normal ALL1 allele, but is not expressed in the leukemic cell lines in which ALL1 is rearranged. A recent study has shown that expression of PITX2a induces actin-myosin reorganization and increased cell spreading in HeLa cells [Wei & Adelstein, 2002]. As discussed below, the lysine at position 50 of the homeodomain is critical for specificity of PITX2 binding to the bicoid DNA site. When this lysine was mutated, the mutants did not cause the changes in cell spreading and morphology, which suggests that this cellular phenotype requires PITX2a with lysine at position 50 [Wei &
Adelstein, 2002]. Another study has found hypermethylation of the PITX2 promoter region in
86% of patients with acute myeloid leukemia (AML) [Toyota et al, 2001]. Hypermethylation of
CpG-rich promoter regions can result in gene silencing.
PITX1 and PITX3
PITX1 is crucial for proper development of the craniofacial region. This gene is
expressed in the head and oral cavity regions during embryonic development [Crawford et al,
1997]. It is also expressed in the tissues that give rise to the lower body wall, bladder and
hindgut. Pitx1 is expressed in the mesenchyme of the hindlimb bud, but not in that of the
forelimb. One study has shown that together, Pitx1 and Pitx2 are required for formation of the
hindlimb buds in the mouse and, when present in low amounts, they are also important for
development of the femur, tibia, and digit 1 hindlimb structures [Marcil et al, 2003]. Treacher
Collins Syndrome is believed to be caused by mutations in PITX1 [Crawford et al, 1997]. This is
an autosomal dominant disorder that is characterized by craniofacial abnormalities. PITX1 and
PITX2 have 97% similarity in the homeodomain region [Gage & Camper, 1997; Suh et al,
2002]. Along with Pitx2, Pitx1 is involved in regulating tooth development [St. Amand et al,
2000]. Mice deficient in Pitx1 have severe defects in development of the hindlimb, along with cleft palate formation and additional mild pituitary defects [Lanctot et al, 1999; Szeto et al,
1999]. In contrast to Pitx2 and Pitx3, mutations in Pitx1 in mice are recessive [Gage et al,
1999b]. In mice that are mutant for Pitx1, the pelvic girdle is smaller, and the long bones of the
limbs are significantly shorter [Graham & McGonnell, 1999; Lanctot et al, 1999]. The reduced
diameter of the bones is due to impaired ossification and calcification. The joints are also
altered. PITX1 was found to be involved in transcription of the pro-opiomelanocortin (POMC)
gene in the anterior pituitary lobe [Lamonerie et al, 1996]. The first cells to differentiate in the
pituitary are the cells that express this gene. Mutations in Pitx3 cause the aphakia phenotype in
mice. Human congenital aphakia is very rare and is classified as primary, in which no lens
anlage has developed, or secondary, in which a lens has begun to develop but is then expelled in
utero [Semina et al, 1997].
PITX3 was cloned from neuronal tissues and is expressed in the midbrain dopaminergic
neurons [Smidt et al, 1997]. These particular neurons are of interest because they are involved in
the pathogenesis of Parkinson's disease. A recent study has found that Pitx3 activates the mouse
tyrosine hydroxylase promoter [Lebel et al, 2001]. Tyrosine hydroxylase is the rate-limiting
enzyme of dopamine and noradrenaline biosynthesis. Mice deficient in Pitx3 have
microphthalmia, and arrested development of the lens and anterior segment structures [Semina et
al, 2000]. This protein was also found to be involved in cataract and congenital total cataract
when mutated [Semina et al, 2000]. The PITX3 protein has 70% overall identity to other
members of this family [Semina et al, 1998].
The Homeodomain
The homeodomain is a protein domain that has been conserved throughout evolution in organisms such as yeast, Drosophila, and humans [Dave et al, 2000]. The homeodomain consists of 60 amino acids, and the 180 base pair DNA sequence that encodes it is called the homeobox.
Proteins with homeodomains are known to be important in controlling embryonic pattern formation and cell-type specification and differentiation [Dave et al, 2000]. Homeotic mutations were first observed in Drosophila genetic research about 75 years ago [Billeter, 1996]. The homeodomain was first seen in proteins that are involved in specifying segment identity in
Drosophila. Mutations in these proteins in flies cause one body segment to be transformed into another, which is called homeosis. Examples are mutations resulting in transformations of the
third thoracic segment to a second thoracic segment, which leads to four-winged flies (bithorax mutations) [Lewis, 1978], or mutations that result in the replacement of the antennae on a fly's
head by legs (Antennapedia mutations) [Gehring, 1966]. As gene cloning techniques developed,
these mutations were localized to their genes, and it was found that a 180 base pair segment
cross-hybridized with many other genes, which led to the discovery of the homeobox [McGinnis
et al, 1984; Scott & Weiner, 1984]. Hundreds of homeoboxes were subsequently discovered.
Homeoboxes have been found at all levels of development: in the establishment of morphogenetic gradients, in the structure formation of groups of body segments, and in defining the unique identity of single segments. This domain is responsible for recognizing specific sequences of DNA, and thereby recruiting the corresponding transcription factors to specific target genes. In the course of evolution, the amino acid sequence of the homeodomain has been conserved to a very high degree. An example is the human Hox-A7 homeodomain, which differs in only 1 out of 60 positions from that of the Antennapedia homeodomain from Drosophila
[Gehring et al, 1994]. The protein sequence of homeodomains tends to be more highly conserved than the DNA sequence that encodes it, which suggests that it is the protein sequence that is being selected for and maintained during evolution [Scott et al, 1989]. A sequence alignment of many common homeodomains is shown in Table 1.2. There are currently over 750 known homeodomain proteins. Mutations in homeodomains have been found in many forms of human disease, including the ones discussed above for the PITX family of homeobox proteins
[Boncinelli, 1997]. Other examples of diseases caused by mutations in homeobox genes include
synpolydactyly (an abnormality of the hands and feet involving both webbing of the fingers and duplication of fingers), which is caused by mutations in HOXD13 [Muragaki et al, 1996; Akarsu et al, 1995], and acute myeloid leukemia, which involves mutations in HOXA9 [Nakamura et al, 1996; Borrow et al,
1996].
                       10                  20                  30
Pitx2 Q R R Q R T H F T S Q Q L Q Q L E A T F Q R N R Y P D M S T
Bcd   P R R T R T T F T S S Q I A E L E Q H F L Q G R Y L T A P R
Antp  R K R G R Q T Y T R Y Q T L E L E K E F H F N R Y L T R R R
En    E K R P R T A F S S E Q L A R L K R E F N E N R Y L T E R R
Ftz   S K R T R Q T Y T R Y Q T L E L E K E F H F N R Y I T R R R
Mat2  K P Y R G H R F T K E N V R I L E S W F A K N* P Y L D T K G
Vnd   K R K R R V L F T K A Q T Y E L E R R F R Q Q R Y L S A P E

                       40                  50                  60
Pitx2 R E E I A V W T N L T E A R V R V W F K N R R A K W R K R E
Bcd   L A D L S A K L A L G T A Q V K I W F K N R R R R H K I Q S
Antp  R I E I A H A L C L T E A Q I K I W F Q N R R M K W K K E N
En    R Q Q L S S E L G L N E A Q I K I W F Q N K R A K I K K S
Ftz   R I D I A N A L S L S E R Q I K I W F Q N R R M K S K K D R
Mat2  L E N L M K N T S L S R I Q I K N W V S N R R R K E K T I T
Vnd   R E H L A S L I R L T P T Q V K I W F Q N H R Y K T K R A Q
Table 1.2: Sequence alignment of homeodomains. En stands for Engrailed, Ftz for Fushi tarazu, Antp for Antennapedia, and Vnd for Vnd/NK-2. The asterisk (*) marks a 3-residue insertion that was removed for alignment. Highlighted residues indicate residues conserved in all 7 homeodomains shown.
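The positions conserved across the alignment in Table 1.2 can be recomputed directly from the sequences. The following is a minimal sketch in Python, assuming the sequences exactly as transcribed in the table (the Mat2 insertion already removed, and En listed with 59 residues, so only positions 1-59 are compared). The names ALIGNMENT and conserved_positions are illustrative and were not part of the original analysis.

```python
# Sequences transcribed from Table 1.2, written in blocks of ten residues.
ALIGNMENT = {
    "Pitx2": "QRRQRTHFTS" "QQLQQLEATF" "QRNRYPDMST"
             "REEIAVWTNL" "TEARVRVWFK" "NRRAKWRKRE",
    "Bcd":   "PRRTRTTFTS" "SQIAELEQHF" "LQGRYLTAPR"
             "LADLSAKLAL" "GTAQVKIWFK" "NRRRRHKIQS",
    "Antp":  "RKRGRQTYTR" "YQTLELEKEF" "HFNRYLTRRR"
             "RIEIAHALCL" "TEAQIKIWFQ" "NRRMKWKKEN",
    "En":    "EKRPRTAFSS" "EQLARLKREF" "NENRYLTERR"
             "RQQLSSELGL" "NEAQIKIWFQ" "NKRAKIKKS",
    "Ftz":   "SKRTRQTYTR" "YQTLELEKEF" "HFNRYITRRR"
             "RIDIANALSL" "SERQIKIWFQ" "NRRMKSKKDR",
    "Mat2":  "KPYRGHRFTK" "ENVRILESWF" "AKNPYLDTKG"
             "LENLMKNTSL" "SRIQIKNWVS" "NRRRKEKTIT",
    "Vnd":   "KRKRRVLFTK" "AQTYELERRF" "RQQRYLSAPE"
             "REHLASLIRL" "TPTQVKIWFQ" "NHRYKTKRAQ",
}


def conserved_positions(alignment):
    """Return 1-based positions at which every sequence has the same residue.

    zip() stops at the shortest sequence, so trailing positions that are
    missing from any row are not evaluated.
    """
    columns = zip(*alignment.values())
    return [i for i, column in enumerate(columns, start=1)
            if len(set(column)) == 1]


if __name__ == "__main__":
    for position in conserved_positions(ALIGNMENT):
        # Report each conserved position with the PITX2 residue found there.
        print(position, ALIGNMENT["Pitx2"][position - 1])
```

With the table as transcribed, the conserved positions are 16, 20, 25, 40, 48, 51 and 53 (L, F, Y, L, W, N and R in PITX2). Position 50 is not among them, which is consistent with its role as a variable specificity determinant.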
Homeodomain Structure and DNA Recognition
The homeodomain is a self-folding, stable protein domain of 60 amino acids,
and previous structures of homeodomains have shown that it consists of a compact three-helix
structure and a flexible N-terminal arm [Gehring et al, 1994; Wolberger, 1993] (see Figure 1.2).
The third helix is called the recognition helix, and it makes specific contacts within the major
groove of the DNA. The overall arrangement of the homeodomain-DNA complex structures that
have been determined is very similar, with helices I and II aligned in an antiparallel
arrangement above the DNA. The recognition helix (helix III) is positioned in the major groove
of the DNA. Homeodomains have evolved different DNA specificities in part by altering the
amino acid residue at position 50, which can interact with base pairs 5 and 6, and to a lesser
extent, base pair 4, in the TAATNN consensus binding site. A previous study has shown that each of 6 different amino acids tested at position 50 confers a different DNA binding specificity
[Wilson DS et al, 1996]. Tucker-Kellogg et al. [1997] and others have emphasized the point that the degree of specificity of a homeodomain for its particular DNA binding consensus site depends on the identity of the amino acid residue in position 50. Most homeodomains contain a glutamine residue at this position, and are therefore referred to as Q50 homeodomains. Q50 homeodomains prefer DNA sequences such as TAATTA and TAATGG. The homeodomain of
Bicoid, which is a Drosophila morphogenetic protein, contains a lysine at position 50, and is the founding member of the K50 class of homeodomains [Hanes & Brent, 1989]. The K50 class of homeodomains recognizes a consensus DNA sequence of TAATCC. Much attention has been focused on the consequences of lysine being located at position 50, largely because the most dramatic examples of altered DNA specificity occur when a lysine is either introduced or replaced at position 50. For example, when Q50 in Engrailed is mutated to an alanine, the
Q50A mutant has an affinity and specificity very similar to those of the wild-type protein, but when Q50 is mutated to a lysine, the specificity changes from TAATTA to TAATCC, clearly demonstrating the important role played by the residue in position 50, especially in the case of K50, in defining the specificity of DNA binding [Ades & Sauer, 1994; Fraenkel et al, 1998; Grant et al, 2000].
Percival-Smith et al [1990] investigated wild-type and Q50K mutant Fushi tarazu homeodomains in conjunction with altering the base pairs at positions 5 and 6 in the binding site, and found that differences in KD of ~100-fold are observed when the binding site is not the optimal one. In addition to position 50, position 47 has a role in defining specificity for some homeodomains in correlation with base-pair 4 of the binding site, especially when the residue is phenylalanine or arginine [Tron et al, 2001; Pomerantz & Sharp, 1994].
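The relationship between the residue at position 50 and the two base pairs that follow the TAAT core can be made concrete with a small lookup. The sketch below is illustrative only: it encodes just the Q50 and K50 preferences stated in the text, and the names PREFERRED_SITES, variable_bases, opposite_strand_bases and preferred_sites are hypothetical helpers rather than part of any published tool.

```python
# Preferred sites as stated in the text; other residue-50 classes are omitted.
PREFERRED_SITES = {
    "Q": ("TAATTA", "TAATGG"),
    "K": ("TAATCC",),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def variable_bases(site):
    """Return base pairs 5 and 6 of a TAATNN site (the part read by residue 50)."""
    if len(site) != 6 or not site.startswith("TAAT"):
        raise ValueError("expected a six-base site with a TAAT core")
    return site[4:]


def opposite_strand_bases(site):
    """Return the bases paired with positions 5 and 6 on the complementary strand.

    The result is listed in the order of positions 5 and 6, not as a
    reverse complement.
    """
    return "".join(COMPLEMENT[base] for base in variable_bases(site))


def preferred_sites(residue50):
    """Look up the preferred sites for a residue-50 class (empty if not listed)."""
    return PREFERRED_SITES.get(residue50.upper(), ())


if __name__ == "__main__":
    for residue in ("Q", "K"):
        for site in preferred_sites(residue):
            print(residue, site, variable_bases(site), opposite_strand_bases(site))
```

For the K50 site TAATCC, the bases on the complementary strand at positions 5 and 6 are both guanines, whereas the Q50 sites place different bases there.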
Figure 1.2. Structure of the Engrailed homeodomain as a model for overall homeodomain structure [Pabo & Sauer, 1992].
A comparison of the homeodomain's three-dimensional structure with that of other DNA-binding proteins reveals a large amount of structural similarity of the fragment formed by helices
II and III to the earlier described helix-turn-helix motif, which is part of many prokaryotic repressors [Qian et al, 1989; Brennan & Matthews, 1989]. However, the second helix of this motif (helix III) is longer in homeodomains, although its C-terminal end was found to be structurally less stable in free homeodomains and is sometimes called helix IV [Qian et al, 1989;
Billeter et al, 1990; Qian et al, 1993; Tsao et al, 1994]. The first experimental structure of a homeodomain-DNA complex (Antennapedia) revealed that homeodomains interact in a different way with DNA than prokaryotic repressor domains [Otting et al, 1990]. In conventional helix-turn-helix motifs, residues of the turn and of the first helical loop contact the DNA bases. In homeodomains, these contacts are made with residues located about two helical turns away from the N-terminal end of helix III [Harrison, 1991]. The structure of the LFB1/HNF1 homeodomain, which contains a 21 residue long insertion in the turn between helices II and III, superimposes very well with the Antennapedia homeodomain, which indicates that the structure of the turn is not important for the DNA binding of homeodomains [Ceska et al, 1993; Leiting et al, 1993; Schott et al, 1997]. The relative arrangement of the helices therefore may help to stabilize a global fold so that a helix can fit in the major groove of DNA.
The tertiary structure of the homeodomain was first determined by solution-state NMR studies of the Antennapedia homeodomain from Drosophila [Otting et al, 1988; Qian et al, 1989;
Billeter et al, 1990; Guntert et al, 1991]. This structure consists of 3 α-helices folded into a
compact, globular structure, and an N-terminal extension. The N-terminal extension precedes
helix I, which is separated from helix II by a loose loop. Helix II forms a helix-turn-helix motif
with helix III. Structural information about the contacts the homeodomain makes with DNA
has been obtained from the X-ray crystal structure of the Drosophila Engrailed protein bound to DNA and the NMR structure of the Antennapedia HD-DNA complex [Clarke et al, 1994;
Kissinger et al, 1990; Fraenkel et al, 1998; Fraenkel & Pabo, 1998]. A number of structural studies have shown that the global structure and method of DNA binding of homeodomains are highly conserved. Intermolecular interactions responsible for the specificity of the DNA recognition are concentrated in amino acid residues of helix III [Kissinger et al, 1990; Otting et al, 1990; Wolberger et al, 1991; Billeter & Wuthrich, 1993; Klemm et al, 1994; Hirsch &
Aggarwal, 1995; Li et al, 1995; Wilson DS et al, 1996]. Because the recognition helix spans the entire major groove, residues from three helical turns can reach the DNA bases (residues 47, 50,
51 and 54) [Qian et al, 1989]. Approximately six DNA base pairs are involved in the recognition site (typically a TAAT core followed by 2 base pairs that differ). The N-terminal arm is known to make additional contacts with bases in the minor groove of the DNA [Billeter, 1996]. The loop between helices I and II makes contacts with the DNA backbone. While previous studies agree on the overall docking arrangement, questions remain concerning the roles of key residues at the protein-DNA interface, hydration, and the extent of side chain motion.
The three helices span residues 10-21, 28-38 and 42-58. The removal of the N-terminal residues 1-6, which are flexibly disordered, causes only localized structural variations and does
not noticeably affect helix I [Qian et al, 1994a]. The structure is stabilized mostly by hydrophobic interactions involving F8, L13, L16, F20, L40, V45, W48, and F49 [Gehring et al,
1994; Qian et al, 1989; Kornberg, 1993]. Gln12 probably stabilizes the N-terminus by hydrogen bonding to residue 9 [Billeter, 1996]. Several regions of homeodomains are involved in nonspecific contacts with DNA: the N-terminus, the loop between helices I and II and residues from all parts of helix III. Many of these interactions involve basic residues that contact the phosphate groups of the DNA. Some examples are the arginines that are highly conserved at positions 31, 52 and 53, and the lysine at position 55. Specific contacts with DNA are limited to a small number of residues at the N-terminus, and residues from the recognition helix.
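The helix boundaries given above can be used to place any homeodomain position in its structural element, for example the positions mutated in Rieger syndrome (Table 1.1). The following sketch assumes the helix ranges stated in the text (10-21, 28-38 and 42-58); the labels for the intervening segments and the function name structural_element are illustrative choices, not definitions taken from the cited structures.

```python
# Helix boundaries as given in the text; names of the other segments are illustrative.
SEGMENTS = [
    (1, 9, "N-terminal arm"),
    (10, 21, "helix I"),
    (22, 27, "loop I-II"),
    (28, 38, "helix II"),
    (39, 41, "turn II-III"),
    (42, 58, "helix III"),
    (59, 60, "C-terminus"),
]

# Homeodomain positions of the point mutations listed in Table 1.1.
RIEGER_POSITIONS = [16, 30, 31, 45, 46, 50, 52, 53]


def structural_element(position):
    """Map a homeodomain position (1-60) to the segment that contains it."""
    for start, end, name in SEGMENTS:
        if start <= position <= end:
            return name
    raise ValueError("homeodomain positions run from 1 to 60")


if __name__ == "__main__":
    for position in RIEGER_POSITIONS:
        print(position, structural_element(position))
```

Under these boundaries, position 16 falls in helix I, positions 30 and 31 in helix II, and positions 45 through 53 in the recognition helix (helix III).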
Most models of DNA recognition by proteins rely on intermolecular interactions and the complementarity of the protein and the DNA surface. Asn51 is conserved in nearly all of the known homeodomains [Billeter, 1996]. It has been found to form hydrogen bonds with a conserved adenine in the DNA binding site. Mutation of this residue is highly disruptive: no other amino acid side chain at position 51 gives the same high affinity as asparagine. It is believed that this residue contributes mostly to the correct positioning of the recognition helix in the major groove, and not to binding specificity [Billeter, 1996]. The positioning is also likely due to backbone contacts made by other conserved amino acid residues.
Sixteen residues in the α-helices are conserved in homeodomains. The side chains of eight of these amino acid residues are believed to define the conformational relationships of the three helices, as they point to the interior of the protein (see above). The major contacts with the DNA are made by amino acids usually in positions 2 or 3 and 5 of the N-terminal arm, as well as residues in the recognition helix (positions 46, 47, 50, 51, and 54) [Billeter, 1996]. These contacts seem to make up the primary interactions that are responsible for specific binding. The
surrounding residues can also affect the specificity through more indirect effects. Residue 47 interacts only with the β-strand of DNA and always contacts base pair 4 and sometimes base pair
3 [Billeter, 1996]. If the side chain of this residue is hydrophobic, the nucleotide that is contacted in base pair 4 is T and the contacts are hydrophobic. When an asparagine is found at this position, then a C is contacted by a hydrogen bond. Residue 54 is the last residue at the C-terminal end of the helix that is involved in contacting DNA. The role of this residue may be to stabilize the helix or modulate DNA specificity [Billeter, 1996].
In terms of structural stability, the recognition helix of free homeodomains can be divided into a well-structured N-terminal half (residues 42 to 51), and a C-terminal half (residues 52 to
59) that may be less structured, depending on the homeodomain [Tell et al, 1999]. A previous study has shown that the identities of the residues at positions 54 and 56 can alter the structural stability and melting temperature of the homeodomain [Tell et al, 1999]. This study found that when both of these residues are hydrophobic, the stability of the C-terminal half of the recognition helix is the highest. Alternatively, when the residues at positions 54 and 56 are a combination of hydrophobic and hydrophilic residues, the overall stability of the recognition helix drops significantly. This unstable C-terminal half may provide the recognition helix with structural flexibility, which allows a better “induced fit”. This higher flexibility could also allow the recognition of a wider spectrum of DNA sequences. Another study has suggested that the salt bridges formed in the homeodomain structure have no essential stabilizing role at room temperature, but instead might be important for improving thermostability [Iurcu-Mustata et al,
2001]. This hypothesis was supported by a correlation between the melting temperatures of several homeodomains and the number of salt bridges and cation-π interactions that connect the secondary structures. CD and fluorescence spectroscopy have been used in the past to probe the
structure and stability of the Bicoid protein (also a K50 protein, discussed below) [Subramaniam et al, 2001]. This study found that both W48 and F8 are necessary for the structural stability of the Bicoid homeodomain. An aromatic residue in position 8, and its interaction with W48, may
be critical for the structure of the homeodomain by bringing helices I and II into a conformation
optimal for DNA binding.
Relatively new NMR techniques have emerged that allow for the estimation of the
location and the lifetime of contacts between atomic groups of macromolecules and the water
molecules that are hydrating their surfaces [Otting et al, 1991]. When these techniques were
applied to the Antennapedia homeodomain-DNA complex, they demonstrated that there are
water molecules next to residues that are critical for specific DNA recognition [Qian et al, 1993].
A molecular dynamics simulation of the complex in a water bath implies the presence of up to
five water molecules in the cavity at the interface between the recognition helix and the DNA
[Billeter et al, 1996]. Water molecules at the protein-DNA interface are also visible in X-ray
crystal structures at high resolution [Hirsch & Aggarwal, 1995; Li et al, 1995; Wilson DS et al,
1996]. In the paired (S50) structure, position 50 forms hydrogen bonds to two water molecules,
which then hydrogen bond to DNA bases [Wilson DS et al, 1996]. This interfacial water is
likely to allow for mobility of the protein side chains at the interface. X-ray structures provide
information about the exact location of water molecules, while NMR data provides information
regarding the lifetime of a water-protein contact [Billeter, 1996]. The current model for the
specific homeodomain-DNA contacts is a fluctuating network of hydrogen bonds between polar
groups of the protein and the DNA, and the water molecules. These are complemented by
hydrophobic contacts. The lifetimes of the interactions are in the nanosecond to microsecond
range [Qian et al, 1993]. Water molecules appear not only to improve the
complementarity of the protein-DNA interaction, but also to reduce entropic costs when a
large number of interactions are required for a highly specific recognition [Billeter et al, 1996].
Both the protein and DNA specifically recognize each other's hydration pattern, and there
appears to be a mobile arrangement of bonding interactions between protein, DNA and water
molecules.
Very little is known about the dynamics of the interaction between the
homeodomain and DNA. The few NMR studies that have examined side chain
dynamics of DNA-binding proteins have found evidence of significant conformational
flexibility at the protein-DNA interface [Slijper et al, 1997; van Heijenoort et al, 1998; Palmer,
1993; Pervushin et al, 1997]. The only NMR relaxation rate measurements reported
for homeodomain-DNA complexes are backbone 15N data on the Vnd/NK-2 homeodomain
[Fausti et al, 2001] and a study of 2H relaxation for asparagine and glutamine side chains in the
Ftz homeodomain [Pervushin et al, 1997]. The first study found that the motional behavior
primarily reflects the protein’s tertiary structure and stability of the backbone. The molecular
dynamics simulation described above for the Antennapedia homeodomain indicated that
homeodomain-DNA interactions are dynamic and fluctuating [Billeter et al, 1996], while
crystallographic studies indicate that the Antennapedia homeodomain has a well-defined
conformation at the protein-DNA interface [Fraenkel & Pabo, 1998]. Further studies into the
dynamics of the homeodomain-DNA complexes should provide insight into these issues.
Although structures of several homeodomains and homeodomain-DNA complexes have
been determined by X-ray crystallography or NMR spectroscopy [Banerjee-Basu et al, 2003],
including representatives of the wild-type Q50, S50, C50, G50 and I50 classes of homeodomains
[Otting et al, 1988; Li et al, 1995; Cox et al, 1995; Piper et al, 1999; Tejada et al, 1999], the only
experimentally determined K50 homeodomain structure available is an X-ray crystal structure of an altered specificity mutant, Engrailed Q50K (EnQ50K), bound to the TAATCC site [Tucker-Kellogg et al, 1997].
The latter study found that the side chain of K50 projects into the major groove of the DNA and makes hydrogen bond contacts with the guanines at base pairs 5 and 6 of the complementary strand of the TAATCC binding site. This is the only case in which direct hydrogen bond contacts have been reported for amino acid residue 50 in any homeodomain-DNA complex structure.
Unfortunately, the relevance of the EnQ50K studies, or analyses of other mutants such as Paired S50K [Wilson DS et al, 1996] and Fushi tarazu Q50K [Zhao et al,
2000], to the case of native K50 homeodomains is unclear in the absence of experimental structural data for a native K50 homeodomain. For example, the identity of the amino acid residue at position 54 seems to be constrained by the residue at position 50. A glutamine at position 50 allows for many different residues to be present at position 54, with Met being the most abundant (see Table 1.2). However, Met54 is never found when position 50 is lysine
[Pellizzari et al, 1997]. Determining the biological relevance of studies of single site mutants should take into account possible covariation of residues [Clarke, 1995]. For example, structural studies of an EnQ50A mutant have been conducted [Grant et al, 2000] in order to provide additional information concerning the role of residue 50 in general, and Q50 in particular. However, a phage display selection of Engrailed mutants failed to recover a mutant carrying Q50A alone; the only Q50A mutant recovered also contained an I47T mutation [Simon et al, 2004]. Another issue concerning the Engrailed Q50K mutant is the observation that it binds to the consensus
TAATCC site with an unusually high affinity, which approaches the picomolar range [Ades &
Sauer, 1994]. There is no evidence that natural K50 class homeodomains have such a high affinity for DNA [Amendt et al, 1998; Ma et al, 1996]. The full-length PITX2 protein has a KD
of 50 nM [Amendt et al, 1998]. A KD was determined for the Q50K mutant of the Fushi tarazu
homeodomain, and this value was found to be 0.63 nM [Percival-Smith et al, 1990], a much
lower affinity than that of the EnQ50K mutant. The KD for the PITX2 homeodomain alone was found
to be 2.6 ± 0.38 nM (see Figure 3.2), which is comparable to that of the Fushi tarazu mutant and also reflects a much lower affinity than that of the Engrailed Q50K mutant. Moreover, the X-ray structure of
EnQ50K reveals two distinct conformations with the side chain of K50 contacting either the 5th or 6th position on the anti-sense strand of the DNA. Whether or not natural K50 homeodomains exhibit these two conformations in a static state, and whether or not the side chain of K50 exhibits a fluctuating state were unknown. These considerations underscore the importance of obtaining solution structures of native K50 class homeodomains.
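To put the dissociation constants quoted above on a common energy scale, they can be converted to standard binding free energies using ΔG° = RT ln(KD). The sketch below is a worked illustration only: it assumes a temperature of 298.15 K and a 1 M standard state, neither of which is specified in the cited measurements, and the function name binding_free_energy is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Dissociation constants quoted in the text, in nM.
KD_VALUES_NM = {
    "full-length PITX2": 50.0,
    "Ftz Q50K homeodomain": 0.63,
    "PITX2 homeodomain": 2.6,
}


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Return dG = RT ln(KD) in kcal/mol for a KD given in mol/L.

    The temperature default is an assumption made for illustration.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    for name, kd_nm in KD_VALUES_NM.items():
        dg = binding_free_energy(kd_nm * 1e-9)
        print(f"{name}: KD = {kd_nm} nM, dG = {dg:.2f} kcal/mol")
```

Under these assumptions a KD of 2.6 nM corresponds to roughly -11.7 kcal/mol, and each 100-fold change in KD shifts the free energy by about 2.7 kcal/mol, so a picomolar-range binder is several kcal/mol more stable than the nanomolar complexes listed here.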
The question regarding side chain conformational heterogeneity, referred to above in the context of the observations concerning the K50 side chain in the EnQ50K crystal structure, is broader in scope and of fundamental importance for understanding the full range of interactions that can occur at a protein-DNA interface. Crystallographic studies have generally indicated that there are several conserved and relatively stable contacts at the homeodomain-DNA interface. In
several instances, such as the aforementioned case of K50 in the EnQ50K structure and the case
of Gln50 in the crystal structure of an even-skipped homeodomain complex [Hirsch & Aggarwal,
1995], multiple, significantly populated conformations are observed for the side chain, while the
nearly invariant asparagine in position 51 is observed to make very stable contacts with the
adenine base in position 3 of the consensus TAAT core binding site. On the other hand, NMR
studies [Tsao et al, 1994] and molecular dynamics simulations [Billeter et al, 1996; Gutmanas &
Billeter, 2004] have provided strong indications of a dynamic, fluctuating environment
encompassing some of the key amino acid side chains at the interface, most importantly, the side
chains of asparagine 51 and of the position 50 residue. Billeter and co-workers [1996] proposed that, at least in the case of Antennapedia, the homeodomain achieves specificity through a fluctuating network of short-lived contacts that allow it to recognize DNA without the entropic cost that would result if side chains were immobilized upon DNA binding (as discussed above).
Significant interest has been expressed in the literature [Tucker-Kellogg et al, 1997; Billeter et al,
1996; Gutmanas & Billeter, 2004; Duan & Nilsson, 2002] in obtaining experimental data on native K50 homeodomains in order to shed further light on these fundamental issues.
It is difficult to explain the effects of single amino acid replacements without having the actual three-dimensional structure of the PITX2 homeodomain. While no structure of a natural
K50 homeodomain had been determined prior to our work, computer modeling had been used to create a model of the structure of the PITX2 homeodomain bound to DNA [Banerjee-Basu &
Baxevanis, 1999]. This study used threading analysis to model the PITX2 homeodomain structure after the Engrailed homeodomain. The Engrailed homeodomain is 35% homologous to the PITX2 homeodomain. They found that the structure is most likely stabilized primarily by hydrophobic interactions between residues at the helical interface, and the key hydrophobic interactions are strictly conserved between the Engrailed and PITX2 homeodomains. This study also did threading analysis of mutant PITX2 homeodomains and found that the severity of the defect as determined biochemically was directly correlated with how much that particular mutation would disrupt the putative structure and interactions of the homeodomain with DNA.
The threading analysis did not provide a PDB file that we could analyze in detail and is not necessarily indicative of true molecular structure. For this threading analysis, the focus was the role of Rieger mutations in causing disease, and there was no discussion of the role of K50 in determining the DNA-binding affinity and specificity of the homeodomain, which is something
best addressed via an experimentally determined structure rather than a threading model. This
study also did not analyze the K50 Rieger mutants, so there is no indication what the structure of
this side chain was in their analysis.
Proposals have been made on what effects the mutations seen in the PITX2
homeodomain in Rieger syndrome and the related anomalies may have on the structure and
function of this protein [Banerjee-Basu & Baxevanis, 1999; Kozlowski & Walter, 2000]. These
studies proposed that replacing R46 on the hydrophilic face of helix 3 with a hydrophobic and
bulky tryptophan residue may interfere with the stability of the homeodomain-DNA complex.
The arginine at position 46 may contact a sugar residue of the DNA backbone, just as K46 does
in Engrailed. The mutation at position 31 occurs at a position important for DNA binding in
helix 2. The reduction in function of the mutation at position 16 may be due to improper folding
of the homeodomain. A leucine at this position is conserved in homeodomains, and mutating
this residue may disrupt the packing of helix 1 against helix 3. One study did a survey of point
mutations found in 17 homeodomains, and three “hot spot” regions were found: Arg5, Arg31,
and amino acids in the recognition helix (especially Arg 52 and Arg 53) [D’Elia et al, 2001].
The arginines in the recognition helix contact phosphates in the DNA backbone, so mutations at these positions would be expected to disrupt homeodomain function. Arg31 also contacts the phosphate backbone, and is believed to establish a salt bridge with E42 at the N-terminus of the recognition helix, and is therefore involved in the correct packing of helices II and III.
Bicoid
Bicoid is the founding member of the K50 class of homeodomain proteins, and many of the studies of DNA recognition by K50 proteins have been performed with Bicoid. Head and thorax development in the Drosophila embryo require the maternal determinant Bicoid
[Frohnhofer & Nusslein-Volhard, 1986]. Bicoid mRNA is made during oogenesis and is transported into the egg, where it becomes localized at the anterior tip and diffuses away from there, forming a concentration gradient [Berleth et al, 1988; Driever & Nusslein-Volhard,
1988a]. This protein is expressed as an anteroposterior concentration gradient in the early embryo and is necessary for the expression of many zygotic genes in distinct anterior domains
[Driever & Nusslein-Volhard, 1988b; Struhl et al, 1989]. Increases or decreases in protein levels in different regions of the Drosophila embryo cause a corresponding posterior or anterior shift of anterior anlagen in the embryo [Driever & Nusslein-Volhard, 1988b]. Embryos from strong mutant bcd alleles completely lack head and thorax and instead, they have a second telson at the anterior end [Frohnhofer & Nusslein-Volhard, 1986]. The homeodomain of Bicoid is of the K50 class, and it recognizes DNA sequences found in the enhancer elements of Bcd-responsive genes such as hunchback, knirps, buttonhead, runt, hairy, orthodenticle and even-skipped [Burz et al,
1998; Driever & Nusslein-Volhard, 1989; Gao & Finkelstein, 1998; Hanes & Brent, 1989; La
Rosee et al, 1997; Small et al, 1992; Tsai & Gergen, 1994; Wilson et al, 1996; Wimmer et al,
1995; Yuan et al, 1996]. Bicoid also binds to caudal mRNA in the 3'UTR and prevents cap- dependent translation initiation of this mRNA and therefore prevents Caudal synthesis in response to the Bicoid gradient [Rivera-Pomar et al, 1996; Niessing et al, 2000].
Non-consensus Site Recognition
Although most homeodomains recognize a TAAT core sequence, they have evolved different DNA specificities by altering the amino acid residue at position 50, which can interact with base pairs 5 and 6 (see above). The K50 class of homeodomains, such as Bicoid and
PITX2, recognize a consensus DNA sequence of TAATCC. A previous study has shown that when the Q50 protein Fushi tarazu (Ftz) is mutated to carry a lysine residue at position 50, the
Ftz(Q50K) protein can recognize the TAATCC consensus sequence in vitro [Zhao et al, 2000].
However, this mutant fails to select natural Bicoid targets in vivo. This indicates that the Ftz(Q50K)
mutant cannot recognize nonconsensus sites, and emphasizes the importance of nonconsensus
site recognition in vivo by Bicoid and PITX2. Bicoid is known to recognize at least three types
of nonconsensus DNA sites, which are TAAGCC, TGATCC, and AAATCC [Driever &
Nusslein-Volhard, 1989; Rivera-Pomar et al, 1995; Yuan et al, 1999]. Other nonconsensus sites
that have been reported to be recognized by Bicoid include TAAGCT and TCATCC [Driever &
Nusslein-Volhard, 1989]. Other studies have found that a single amino acid replacement at
position 50 of the homeodomain is sufficient to switch the DNA specificity of the Bicoid protein
[Hanes & Brent, 1989; Treisman et al, 1989].
Jun Ma's group (Cincinnati Children’s Hospital, Department of Developmental Biology) has performed experiments to look at the interactions between the Bicoid homeodomain and two different types of DNA sites, A1 and X1 [Dave et al, 2000]. A1 has the consensus sequence
TAATCC, while X1 has the nonconsensus sequence TAAGCT. Footprint analysis has shown that there are both shared and distinct contacts using the two different sites, which suggests that
Bicoid binds to these sites with a similar overall structure, but different interactions.
Experiments have indicated that Arg54 of the Bicoid homeodomain recognizes X1 by
specifically contacting the guanine at position 4. Arg54 is believed to contact the adenine of position 3 in the consensus site. In searching all natural K50 homeodomains, Bicoid is the only one with an arginine residue at position 54 of the homeodomain [Banerjee-Basu et al, 2003].
This residue is believed to allow Bicoid to recognize the X3 nonconsensus sequence (TGATCC)
and to allow efficient binding to X1. The PITX2 homeodomain has an alanine at position 54
instead of an arginine. PITX2 has a high affinity for the A1 site, but weaker affinity for X1, and no
binding to the X3 (TGATCC) nonconsensus site. When the alanine at position 54 is mutated to
an arginine, the mutant homeodomain binds efficiently to the X3 site, and retains binding affinity
for A1 and X1.
Previous studies have indicated that recognition of nonconsensus DNA sites plays an
important role in mediating the function of Bicoid [Driever & Nusslein-Volhard, 1989; Ma et al,
1999; Rivera-Pomar et al, 1995; Yuan et al, 1999]. Chemical footprint assays have revealed
shared and distinct contacts with the consensus site (also called A1) and the nonconsensus site
TAAGCT (also called X1), and this suggests that the Bicoid homeodomain binds to these sites
with a similar overall structure, but different kinds of interactions [Dave et al, 2000].
Methylation interference studies have shown that when the guanines at positions 5 and 6 of the
antisense strand of A1 (3'-ATTAGG-5') are methylated, this interferes strongly with binding of
the Bicoid homeodomain [Dave et al, 2000]. This study also showed that when the guanine at
position 5 on the antisense strand of X1 (3'-ATTCGA-5') is methylated, Bicoid homeodomain
binding is inhibited, and methylation of position 4 of the sense strand of X1 (TAAGCT) also
interfered with binding. Dave et al [2000] also conducted KMnO4 interference assays to
determine which thymines are important for binding of the Bicoid homeodomain. They
demonstrated that the thymine at position 2 in both A1 (3'-ATTAGG-5') and X1 (3'-ATTCGA-
5') is important for Bicoid binding.
Studies have shown that the PITX2 homeodomain can also recognize different DNA sites. Biochemical studies have shown that the PITX2 homeodomain can recognize both the consensus TAATCC site and a non-consensus TAAGCT site [Dave et al, 2000]. Another study has shown that the PITX2 homeodomain can recognize a TAAGCC site from a PITX1 target gene, pro-opiomelanocortin (POMC) [Kozlowski & Walter, 2000]. A recent study has shown by
chromatin immunoprecipitation that the mouse procollagen lysyl hydroxylase PLOD2 gene is a
target gene of PITX2 [Hjalt et al, 2001]. The promoter region of this gene contains multiple
DNA sequences that are recognized by PITX2 in vitro and in cells. Several of these sites are
nonconsensus sequences. A PITX2(T30P) mutant homeodomain, which has a threonine-to-
proline mutation, can recognize a TAATCC site with only a small reduction in affinity, but cannot recognize the CE-3 element of POMC, which has a TAAGCC sequence. The sites that
PITX2 recognizes are presented in Table 1.3.
Type            Core Sequence   Source      Sequence
Consensus       TAAT            hb A1       CCAACGTAATCCCCATAG
Non-consensus   TAAG            POMC        CGCTGCTAAGCCGGCCATC
Non-consensus   AAAT            hb X3t      CATCCAAATCCAAGTGCG
Non-consensus   TCAT            plod-2 C    TTTGTTTTCATCCCTAAACAC
Non-consensus   TAGT            plod-2 E    CACTTTTAGTCCCAGGATTT
Non-consensus   TATT            Dlx2        TATTCC
Non-consensus   TTAT            Dlx2        TTATCC
Non-consensus   CAAT            Dlx2        CAATCC
Table 1.3: List of DNA sites that PITX2 recognizes [Dave et al, 2000; Kozlowski & Walter, 2000; Yuan et al, 1999; Hjalt et al, 2001; Green et al, 2001; Espinoza et al, 2002].
A structural study of the MATα2 homeodomain revealed some interesting features of homeodomains when they bind nonspecifically to DNA [Aishima & Wolberger, 2003]. This study found that when this homeodomain nonspecifically binds to DNA, the third helix actually rotates and makes a different set of contacts with the DNA. These nonspecifically bound homeodomains make few of the expected base-specific contacts seen in other structures, yet make many contacts with the phosphate backbone of the DNA. In the nonspecific complex, residues at positions 46 and 47 make contacts with the DNA, while other contacts are lost. This provides evidence that the homeodomain is capable of adjusting itself structurally in order to sample different DNA binding sequences, and it is possible that PITX2 is also capable of adjusting itself structurally to bind different DNA sites.
Statement of Research Goals
The purpose of this research was to determine the structure of the PITX2 homeodomain bound to consensus DNA and then analyze the molecular dynamics of this interaction. The
PITX2 homeodomain is an important member of the K50 class of homeodomain proteins.
PITX2 plays vital functions in the development of the heart, umbilicus, pituitary and craniofacial regions, just to name a few. Mutations in PITX2 are known to cause the autosomal dominant disorder Rieger syndrome, and related disorders. The homeodomain of PITX2 has a lysine at position 50, which defines the DNA sequences it is able to recognize. No structure of a native
K50 homeodomain protein had been determined prior to the research presented in this thesis.
Analyzing such a structure will be useful in determining what governs the specificity of this class of homeodomains, and what structural characteristics allow for binding to both consensus and nonconsensus binding sites. Previous studies of homeodomains have indicated possible
fluctuating interactions with the DNA, but very little research has been performed to study this possibility.
The primary goal of this research was to determine the structure of the PITX2 homeodomain bound to DNA, and therefore have the first structure of a native K50 class homeodomain. The molecular dynamics of the wild-type PITX2 homeodomain interaction with the DNA was then explored to gain a better understanding of whether fluctuating interactions between K50 and the DNA are present. Molecular dynamics of Rieger mutant PITX2 homeodomain-DNA complexes were also explored, to gain a better understanding of differences in protein-DNA recognition, including differences in specific contacts, and differences in hydration of the protein-DNA interface, between the wild-type and mutant complexes. The results of these analyses are presented in this thesis.
CHAPTER 2: Materials and Methods
Expression of the PITX2 Homeodomain
The expression plasmid pGEX-1λt-pitx2HD was kindly provided by the laboratory of Dr.
Jun Ma (Division of Developmental Biology at the Cincinnati Children's Hospital Research
Foundation). This construct consists of a glutathione S-transferase tag and a thrombin cleavage
site prior to the PITX2 homeodomain sequence. There are two extra residues at the N-terminus
as a result of thrombin cleavage, and six extra residues at the C-terminus that are part of the
expression system. The final protein construct can be viewed in Figure 2.1.
Human PITX2 HD

       1          11         21
  GS   QRRQRTHFTS QQLQQLEATF QRNRYPDMST
       31         41         51       60
       REEIAVWTNL TEARVRVWFK NRRAKWRKRE EFIVTD

Figure 2.1: Amino acid sequence of the PITX2 homeodomain used for the structural studies. The extra residues at the N- and C-terminus are part of the expression system.
The PITX2 homeodomain was obtained by growing Escherichia coli strain BL21-Star
(Invitrogen) transformed with pGEX-1λt-pitx2HD in minimal media [0.85 g/L NaOH, 10.5 g/L
K2HPO4, 12 g/L Na2HPO4, 6 g/L KH2PO4, 1 g/L NaCl, 6 mg/L CaCl2, 13.2 mL/L concentrated
(12.2 N) HCl, nucleotides (0.5 g/L adenine, 0.65 g/L guanosine, 0.2 g/L thymine, 0.5 g/L uracil,
0.2 g/L cytosine), vitamins (1 mg/L choline chloride, 1 mg/L pyridoxal phosphate, 100 µg/L riboflavin, 50 mg/L thiamine, 50 mg/L niacin, 1 mg/L biotin), and trace elements (107 µg/L
MgCl2•6H2O, 20 µg/L FeCl2•4H2O, 0.7 µg/L CaCl2•2H2O, 0.26 µg/L H3BO3, 0.16 µg/L
MnCl2•4H2O, 16 ng/L CuCl2•2H2O, 2.4 µg/L Na2MoO4•2H2O, 10 µM FeCl3, 135 mM CaCl2,
50 µM ZnSO4)] containing 150 mg/L ampicillin, 4 g/L glucose, and 1 g/L NH4Cl. Half a liter of
bacterial culture was grown in baffled flasks in an incubator shaker at 37°C until saturation (A600
~ 5.0). This culture was spun down (2000g, 10 min) and resuspended in 1 L of minimal media
enriched with 10% Isogro (Isotec). When preparing labeled samples, 13C-glucose and 15NH4Cl
were used (Isotec). Expression was then induced by adding IPTG to a final concentration of
0.05-1 mM, and growing in an incubator shaker at 20°C for approximately 24 hours. The cells
were harvested by centrifugation (3200g, 30 min), and the cell paste was frozen at -80°C.
Typical wet cell paste yields were between 12g and 16g per liter of cell growth. Figure 2.2
illustrates optimization of expression conditions.
Figure 2.2: Optimizing expression conditions for the PITX2 homeodomain. (A) Narrowing optimum temperature range and length of induction. The most induction with the most soluble protein was seen when inducing 15h at 30°C in this experiment. (B) Determining optimum temperature, length of induction, and concentration of IPTG. The most induction with the most soluble protein was seen when inducing for ~24h at 20°C with 0.05mM IPTG.
Purification of the PITX2 Homeodomain
Protein purification was carried out at 4°C to minimize protease degradation. For every
10 g of wet cells, 100 mL of ice-cold PBS buffer (144 mM NaCl, 2.7 mM KCl, 2.96 mM
Na2HPO4, 1.79 mM KH2PO4, pH 7.3) was used to resuspend the cells. Lysozyme (200 mg), 100
mM PMSF in isopropanol (667 µL), and two Complete protease inhibitor cocktail tablets
(Roche) were added for every 100 mL of resuspended cell mixture. This mixture was sonicated
and cell debris was removed by centrifugation (3200g, 25 min). The cleared lysate was applied
to a glutathione sepharose column and washed with PBS buffer, followed by thrombin cleavage
buffer (50 mM trizma hydrochloride, 150 mM NaCl, 2.5 mM CaCl2, pH 8.0). The resin was resuspended in thrombin cleavage buffer (TCB) and transferred to a 50 mL conical tube. The homeodomain was cleaved from the glutathione S-transferase fusion tag using 1 mg thrombin for
3 hours at 4°C with rotation. Nearly complete cleavage was obtained during this time as measured by SDS-PAGE (Figure 2.3). The cleaved protein was then eluted from the resin using
5 bed volumes of TCB. It was loaded onto a SP sepharose fast flow column (2 mL bed volume), washed with washing buffer (10 mM NaH2PO4, 250 mM NaCl, pH 7.0) and eluted with buffer
containing a higher salt concentration (10 mM NaH2PO4, 1 M NaCl, pH 7.0). Fractions
containing the homeodomain were identified by Abs278, pooled, and dialyzed overnight at 4°C in
10 mM NaH2PO4, pH 7.0. Protein yields were ~4.5 mg/L of cell growth. The consensus DNA
duplex (IdtDNA) was added in a 1:1 protein:DNA ratio and the complex was concentrated by
burying the dialysis bag in Spectra/Gel Absorbent (Spectrum, made of polyacrylate-polyalcohol)
or Aquacide (Calbiochem). The sequence of this DNA duplex can be seen in Figure 2.4.
Samples were concentrated to 540µL, and 60µL of D2O was added. Complete protease
inhibitors (1 tablet in 3 mL, add 1 µL), leupeptin (0.3 mM), DTT (2 mM), and Pefabloc (0.2
mM, Roche) were all added to inhibit proteases. Sodium azide (6 mM stock, add 1µL, final concentration 0.06 mM) was added to prevent bacterial growth in the sample.
Figure 2.3: Purification of the PITX2 homeodomain and production of a pure NMR sample. (A) Cleavage of the fusion protein by thrombin. (B) Samples from different stages of the purification process. (C) Pure NMR sample of PITX2 HD.
Figure 2.4: DNA sequence of the binding site used in the structural studies. The DNA sequence consists of the TAATCC binding site surrounded by residues to confer stability to the double strand. Prepared using NUCPLOT.
Gel Shift Assays
Complementary oligonucleotides (Table 2.1) were annealed by heating to 95°C and
slowly cooling to room temperature. This oligonucleotide duplex (7 pmol) was added to 5 pmol
of PITX2 homeodomain in 10 µl of H2O and incubated on ice for 30 min. These samples were
loaded onto a 15% acrylamide gel and stained with Sybr Green I (Invitrogen). The bands were
viewed under UV light and photographed.
Binding Site    Sequence of Oligo
TAATCC          5'- GCT CTA ATC CCC G -3'
                3'- CGA GAT TAG GGG C -5'
TAAGCC          5'- GCT GCT AAG CCG GCC -3'
                3'- CGA CGA TTC GGC CGG -5'
TAGTCC          5'- GCT TTT AGT CCC AGG -3'
                3'- CGA AAA TCA GGG TCC -5'
TATTCC          5'- GCT CTA TTC CCC G -3'
                3'- CGA GAT AAG GGG C -5'
Table 2.1: Sequences of oligonucleotide duplexes used in gel shift assays.
Determination of KD
Measurements of KD were kindly provided by Vrushank Dave and taken by following the procedure of Dave et al [2000]. The DNA probe concentrations used in this analysis were 1, 2,
4, 8, 16, 20 and 40 nM. Quantitative gel shift assays were performed by measuring the bound
and free fractions of the probes with a PhosphorImager as previously described [Zhao et al,
2000]. The data were analyzed using Microsoft Excel (linear regression analysis) to determine
the KD value ( -1/KD = slope of the plot of Bound/Free against Bound DNA).
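The Scatchard relation stated above (slope of Bound/Free against Bound equals -1/KD) can be illustrated with a short sketch. This is not the spreadsheet used for the measurements: the function name `scatchard_kd` and the synthetic, noise-free data are hypothetical, and an ordinary least-squares fit stands in for the linear regression performed in Excel.

```python
def scatchard_kd(bound, free):
    """Estimate KD from a Scatchard plot (Bound/Free vs. Bound); slope = -1/KD."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of y on x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -1.0 / slope

# Synthetic data for illustration only (not measured values):
# single-site binding with KD = 2.0 nM and 1.0 nM of binding sites.
KD_TRUE, SITES = 2.0, 1.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]                      # free probe, nM
bound = [SITES * f / (KD_TRUE + f) for f in free]     # bound probe, nM
kd_est = scatchard_kd(bound, free)
```

With ideal single-site data the fit recovers the KD used to generate the points; with real gel shift data, scatter in the Bound/Free ratio at low occupancy tends to dominate the uncertainty of the slope.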
NMR Structure Determination
Determining the structure of a biomolecule or complex by NMR spectroscopy involves four basic steps: 1) identification of appropriate sample conditions (as discussed above); 2) sequence-specific assignment of the 1H, 13C, and 15N resonances; 3) identification of as many
geometrical constraints as possible from the NMR data, such as internuclear distances from
nuclear Overhauser enhancement (NOE) measurements, and dihedral angles from measurement
of scalar couplings; and 4) the calculation and refinement of the three-dimensional structure
from the structural constraints.
All NMR experiments were carried out on Varian Inova 600 and 800 MHz spectrometers.
The sample temperature was set to 295 K. Spectra were referenced to an external DSS standard.
Protein assignments: Protein 1H, 13C, and 15N resonance assignments were obtained primarily
from heteronuclear-edited NMR spectra, using conventional triple resonance 1H- {13C, 15N}
NMR probes. The pulse programming codes were written in-house. Approximately 92% of
assignable atoms were assigned. Sequence-specific assignments of the backbone HN, N, C’, Cα
and Cβ resonances were obtained from 3D HNCO, HN(CO)CA, HNCA, CBCA(CO)NH and
HNCACB [Grzesiek & Bax, 1992a; Ikura et al, 1990; Kay et al, 1994; Muhandiram & Kay,
1994; Grzesiek & Bax, 1992b; Wittekind & Mueller, 1993; Sattler et al, 1999] spectra.
Assignment of the aliphatic side chain resonances was accomplished using a combination of 3D
15N-edited-TOCSY-HSQC [Zhang et al, 1994; Kay et al, 1992], H(CCO)NH-TOCSY, as well as
HBHA(CBCACO)NH spectra [Grzesiek & Bax, 1992b]. Aromatic 1H and 13C resonances were obtained from a combination of 2D HMQC, 2D HMQC-TOCSY, 3D HMQC-TOCSY and 2D
NOESY-HMQC spectra [Marion et al, 1989; Zerbe et al, 1996]. A HNHA experiment was performed to assign Hα and to obtain coupling constants [Vuister & Bax, 1993].
DNA assignments: Resonance assignments for unlabeled DNA bound to 13C, 15N-labeled protein were obtained using standard assignment methods for DNA [Wuthrich, 1986]. The data was
obtained with doubly 13C/15N-filtered NOESY and ω2-filtered TOCSY experiments [Otting &
Wuthrich, 1990; Breeze, 2000].
Structural constraints: The main source of structural information was the proton-proton distance constraints identified from NOESY spectra. Three-dimensional 15N-NOESY-HSQC
experiments [Talluri & Wagner, 1996] using 50-125 ms mixing times were used for
intramolecular restraints in the homeodomains, along with a 13C-NOESY-HSQC experiment
using a 150 ms mixing time [Sattler et al, 1999].
Intramolecular distance restraints for the DNA were obtained from an r⁻⁶ scaling of cross-peak
volumes in the NOESY spectra. Upper and lower bounds were calibrated on the cytosine
intraresidue H5-H6 NOE and set to +/- 15% of the calculated distance for base and H1’ protons.
Restraint boundaries to other sugar protons were widened an additional 10% to account for
effects of spin diffusion. Restraints from the longer mixing time 125 ms experiment were
assigned a lower bound of 3 Å and an upper bound of 5 Å.
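The r⁻⁶ calibration described above can be written out as a small sketch. The reference H5-H6 distance of 2.45 Å is an assumed typical value that is not stated in the text, the function name is hypothetical, and reading "widened an additional 10%" as bounds of +/- 25% is one possible interpretation.

```python
def noe_distance_bounds(volume, ref_volume, ref_distance, tolerance=0.15):
    """Return (lower, distance, upper) in Angstrom from a cross-peak volume.

    Uses V ~ r**-6, so r = r_ref * (V_ref / V) ** (1/6).
    """
    distance = ref_distance * (ref_volume / volume) ** (1.0 / 6.0)
    return distance * (1.0 - tolerance), distance, distance * (1.0 + tolerance)

# Assumed reference: cytosine intraresidue H5-H6 separation of about 2.45 A.
REF_DISTANCE = 2.45

# Illustrative volumes only: a peak 64 times weaker than the reference
# corresponds to twice the reference distance.
lower, dist, upper = noe_distance_bounds(1.0e5, 6.4e6, REF_DISTANCE)

# Other sugar protons: bounds widened by a further 10% (assumed +/- 25%).
lower_s, _, upper_s = noe_distance_bounds(1.0e5, 6.4e6, REF_DISTANCE, tolerance=0.25)
```

Because the distance depends on the sixth root of the volume ratio, even a twofold error in an integrated volume changes the derived distance by only about 12%, which is why fairly generous bounds are usually sufficient.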
Intermolecular restraints between the protein and DNA were obtained from 2D 13C(ω1)-edited,
[13C, 15N](ω2)-filtered NOESY spectra [Breeze, 2000; Stuart et al, 1999; Lee et al, 1994].
The NOEs were assigned manually and only unambiguously assigned peaks were used as
restraints in the docking calculation. Weak peaks were assigned an upper distance limit of 6.0 Å,
while medium peaks had an upper distance limit of 5.0 Å, and stronger peaks an upper distance
limit of 4.0 Å.
Data processing and analysis: Raw NMR data was processed using NMRPipe [Delaglio et al,
1995]. Linear prediction was used in the t1 dimension for 2D spectra and in the t1 and t2
dimensions for 3D spectra, using squared sinebell window functions for apodization and zero
filling in all dimensions. Spectra were viewed and analyzed using the Sparky graphical interface
[Goddard & Kneller]. This program was used to pick peaks and integrate them using a
Lorentzian function.
Relaxation Rates: R2 rate constants were determined on a Varian 600 MHz spectrometer using a
standard Carr-Purcell-Meiboom-Gill (CPMG) spin-echo experiment [Farrow et al, 1994; Skelton
et al, 1993]. The data was recorded at 21.8°C. The R2 relaxation rate constant was extracted from this data by following the procedure “Extracting R1 and R2 Relaxation Rate Constants from
Varian NMR Data” that was written for our lab by Dr. Mark Rance.
Structure calculation: Referenced chemical shift assignments and peak intensities from Sparky
were entered into the structure calculation program CYANA [Guntert et al, 1997; Herrmann et
al, 2002]. CYANA includes an automated NOE assignment program, CANDID, which
automatically assigns all NOESY cross peaks, taking into account nearness of chemical shift,
network anchoring, ambiguous distance constraints, and constraint combination [Herrmann et al,
2002]. The structure is then calculated using the DYANA algorithm, which calculates structures using torsion angle dynamics [Guntert et al, 1997]. Calibration constants for peak intensities
versus upper distance limits were determined automatically by CYANA. Peak lists from 50 ms,
80 ms, and 125 ms 15N-NOESY-HSQC, and 150 ms 13C-NOESY-HSQC experiments were
entered into CYANA, along with a list of the chemical shift assignments (pitx2.prot), all
prepared using SPARKY. Chemical shift tolerances of 0.025 ppm were used for the protons, and 0.4 ppm
for the 15N and 13C chemical shifts. The script prepared to enter all of this information is called
“pitx2.cya”:
pitx2.cya
# Combined automatic assignment and structure calculation with CANDID
peaks     := 50ms2,80ms2,125ms2,13Cms3   # names of peak lists
prot      := pitx2                       # names of proton lists
tolerance := 0.025, 0.025, 0.4           # chemical shift tolerances (ppm);
                                         # order: 1H(a), 1H(b), 13C/15N(a), 13C/15N(b)
# cal     := 9.0E7, 3.0E8, 1.0E8         # calibration constants (will be determined
                                         # automatically by CANDID, if commented out)

subroutine ANNEAL                        # subroutine for structure calculation
  var n
  ./init                                 # re-initialize
  read upl cycle$cycle.upl               # NOE upper distance limits from CANDID
  read cco pitx2c.cco append             # coupling constants
  read aco set.aco append                # CA shift-derived dihedral angles
  hbond O 53 HN 56                       # hydrogen bond
  hbond O 54 HN 57
  hbond O 55 HN 58
  hbond O 42 HN 45
  hbond O 43 HN 46
  hbond O 44 HN 47
  hbond O 45 HN 48
  hbond O 46 HN 49
  hbond O 47 HN 50
  hbond O 48 HN 51
  hbond O 49 HN 52
  hbond O 50 HN 53
  hbond O 51 HN 54
  hbond O 52 HN 55
  hbond O 10 HN 13
  hbond O 11 HN 14
  hbond O 12 HN 15
  hbond O 13 HN 16
  hbond O 14 HN 17
  hbond O 15 HN 18
  hbond O 16 HN 19
  hbond O 17 HN 20
  hbond O 28 HN 31
  hbond O 29 HN 32
  hbond O 30 HN 33
  hbond O 31 HN 34
  hbond O 32 HN 35
  hbond O 33 HN 36
  hbond O 34 HN 37
  seed=5671                              # random number generator seed
  n=30                                   # number of start conformers
  if (def('nproc')) n=nint(real(n)/nproc)*nproc   # adapt to a multiple of number of CPUs
  distance stat
  calc_all structures=n steps=10000      # structure calculation
  overview cycle$cycle structures=20 cor # write overview file and coordinates
end
candid peaks=$peaks prot=$prot calculation=ANNEAL
Simulated annealing was performed using the default parameters. The command “cyana” can be typed in a terminal to start the program, and “pitx2” can then be typed to run the “pitx2.cya” script. The 20 lowest energy conformers were retained after structure calculation and used for docking to DNA.
Docking of the protein to the DNA: The protein was docked to the DNA using the AMBER all- atom force field with the Generalized Born solvation model [Case et al, 1996; Pearlman et al,
1995]. The 20 CYANA structures with the lowest values of target function were docked onto canonical B-form DNA. This was chosen as the starting DNA structure, since NOESY spectra for the complex indicated the DNA to be close to B-form. Starting structures of the complex were generated by systematically placing PITX2 in varying orientations relative to the DNA, with helix 3 approximately 50 Å from the DNA, in MOLMOL. For each of the 20 lowest- energy structures, 5 different orientations of the protein relative to the DNA were selected, yielding 100 starting conformers. The first orientation was with the third helix directly facing
the major groove of the DNA. The other 4 orientations were generated by rotating the first
orientation by 45 and 90 degrees in either direction.
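The bookkeeping behind the 100 starting conformers (20 structures, each in 5 orientations) can be sketched as follows. The actual placement was done interactively in MOLMOL; the rotation axis is not specified in the text, so the z axis used here is purely an assumption, and all names are hypothetical.

```python
import math
from itertools import product

ANGLES_DEG = (0, 45, -45, 90, -90)   # reference orientation plus four rotations
N_CONFORMERS = 20                    # lowest target-function CYANA structures

def rotate_about_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis (axis choice is an assumption)."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

# One (conformer index, rotation angle) pair per docking start.
starts = list(product(range(1, N_CONFORMERS + 1), ANGLES_DEG))

# Example: a hypothetical atom 1 A along x, rotated by +90 degrees.
rotated = rotate_about_z((1.0, 0.0, 0.0), 90)
```

Enumerating the starts explicitly makes it easy to check that every conformer is paired with every orientation before the docking jobs are launched.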
The protein was docked onto the DNA by a 20 ps simulated annealing calculation (T =
600 K, time-step = 1 fs) using an altered version of a procedure described previously for docking
of TFIIIA to DNA [Wuttke et al, 1997]. DNA base-pairing was maintained by incorporating
Watson-Crick hydrogen-bonding restraints. These Watson-Crick DNA restraints were implemented as lower and upper bound restraints on base-paired heteroatom-heteroatom (2.7 to
3.1 Å) and heteroatom-proton distances (1.67 to 2.07 Å) (WatCrick.dist), and had a final force constant of 50 kcal mol⁻¹ Å⁻². The intramolecular protein (pitx2.dist) and DNA restraints
(dnaDist.dist, dnaNOE.dist) had final force constants of 20 kcal mol⁻¹ Å⁻². Protein (helix.dist) and DNA angle restraints (dnaAng.dist) had a final force constant of 32 kcal mol⁻¹ Å⁻². Protein-
DNA intermolecular restraints (docking.dist) had a final force constant of 32 kcal mol⁻¹ Å⁻².
Protein restraints were applied to prevent the protein conformation from being altered too much from the structure calculated by CYANA. DNA restraints were applied to prevent fraying of the
DNA, and to maintain the structure close to B-form.
The format of all distance restraint files entered into the AMBER docking calculation consists of the residue number of the first atom, the residue name, the atom name, the residue number of the second atom, the residue name, the atom name, and finally the upper distance limit. An example of a line from one of the distance restraint files is:
43 THR HN 46 ARG+ HB2 4.14
The format of all angle restraint files entered into the AMBER docking calculation consists of the residue number, the residue name, the angle name, followed by the lower and upper bounds on this angle. An example of a line from one of the angle restraint files is:
69 GUA GAMMA -20.0 140.0
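The two restraint line formats described above are simple whitespace-separated records, and a quick sanity check before conversion might look like the sketch below. The class and function names are hypothetical and are not part of AMBER; the two example lines are the ones quoted in the text.

```python
from typing import NamedTuple

class DistanceRestraint(NamedTuple):
    res1: int
    resname1: str
    atom1: str
    res2: int
    resname2: str
    atom2: str
    upper: float    # upper distance limit, Angstrom

class AngleRestraint(NamedTuple):
    res: int
    resname: str
    angle: str
    lower: float    # degrees
    upper: float    # degrees

def parse_distance(line):
    """Parse one 7-field distance restraint line."""
    f = line.split()
    if len(f) != 7:
        raise ValueError(f"expected 7 fields, got {len(f)}: {line!r}")
    return DistanceRestraint(int(f[0]), f[1], f[2], int(f[3]), f[4], f[5], float(f[6]))

def parse_angle(line):
    """Parse one 5-field angle restraint line and check the bounds are ordered."""
    f = line.split()
    if len(f) != 5:
        raise ValueError(f"expected 5 fields, got {len(f)}: {line!r}")
    restraint = AngleRestraint(int(f[0]), f[1], f[2], float(f[3]), float(f[4]))
    if restraint.lower > restraint.upper:
        raise ValueError(f"lower bound exceeds upper bound: {line!r}")
    return restraint

d = parse_distance("43 THR HN 46 ARG+ HB2 4.14")
a = parse_angle("69 GUA GAMMA -20.0 140.0")
```

Checking field counts and bound ordering up front catches malformed lines before they reach the conversion step.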
To start the docking calculation, a parameter file has to be created. This is accomplished
by loading the PDB file into tleap as follows:
% tleap -f leaprc.ff99
> complex = loadpdb DNA+pitx2.pdb
> saveamberparm complex DNA+pitx2.parm7 DNA+pitx2.x
> quit
The distance and angle restraint files were then converted to the format that AMBER
requires by running makeDIST_RST and makeANG_RST commands as shown in the following
two examples:
% makeDIST_RST -upb docking.dist -pdb DNA+pitx2.pdb -rst RST.dist
% makeANG_RST -pdb DNA+pitx2.pdb -con dnaAng.dist -lib $AMBERHOME/src/nmr_aux/prepare_input/tordef.lib > RST.dnaang
The force constants were then set to the values described above by editing each RST file. The values rk2 and rk3 on line 5 of each file were set to the chosen force constants. All of the
converted restraints were then combined into one restraint file as follows:
% cat RST.pitx2 RST.dnadist RST.dnaNOE RST.wc RST.dist RST.ang RST.dnaang > RST
With the restraint and parameter files prepared, a single round of
minimization was performed using the file “RSTmin.in”:
RSTmin.in
energy minimization for Pitx2 with restraints
 &cntrl
  imin=1, maxcyc=100, ncyc=50, ntpr=20, ntb=0,
 /
 &ewald
  eedmeth=5,
 /
This calculation takes about 1 minute to run, and can be executed using the following command:
% sander -O -i RSTmin.in -o RSTmin.out -c DNA+pitx2.x -p DNA+pitx2.parm7 -r RST.min.x
After this minimization, the protein was docked to the DNA by simulated annealing, using the combined restraint file RST prepared above. The temperature was increased from 0 to 600 K over the first 4 ps, held at 600 K for 2 ps, then slowly cooled to 0 K over 14 ps. The weights of the force constants were linearly increased from 0.1 to 1 during the course of the calculation. The input file is
“RSTanneal.in” and is executed in the same way as the minimization step, using the appropriate input file names:
RSTanneal.in
simulated annealing protocol, 20 ps
 &cntrl
  nstlim=20000, pencut=-0.001, nmropt=1,
  ntpr=200, ntt=1, ntwx=200,
  cut=12.0, ntb=0, vlimit=10,
  igb=1, saltcon=0.2, offset=0.13,
 /
 &ewald
 /
#
#Simple simulated annealing algorithm:
#
#from steps 0 to 4000: heat the system to 600K
#from steps 4001-6000: hold at 600K
#from steps 6001-20000: final cooling
#
 &wt type='TEMP0', istep1=0,istep2=4000,value1=0., value2=600., /
 &wt type='TEMP0', istep1=4001, istep2=6000, value1=600.0, value2=600.0, /
 &wt type='TEMP0', istep1=6001, istep2=20000, value1=600.0, value2=0.0, /
 &wt type='TAUTP', istep1=0,istep2=20000,value1=0.1, value2=20.0, /
 &wt type='REST', istep1=0,istep2=20000,value1=0.1, value2=1.0, /
 &wt type='END' /
LISTOUT=POUT
DISANG=RST
This file also outlines the changes in temperature that are introduced during the calculation, as described above. This calculation takes approximately 13 hours to complete (all calculations performed with a 2.4 GHz Intel Xeon processor). An explanation of what the values in the input files represent can be found at http://amber.scripps.edu/tutorial/dna_NMR/nmr_dna_tutorial.htm.
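The annealing schedule encoded by the weight-change entries in RSTanneal.in can be tabulated with a short Python sketch. It assumes that each entry interpolates linearly between value1 and value2 over its step range, and that one step corresponds to 1 fs (20000 steps for the 20 ps protocol). The function names are hypothetical and serve only to illustrate the schedule; they are not part of AMBER.

```python
def linear_ramp(step, step1, step2, value1, value2):
    """Linearly interpolate between value1 (at step1) and value2 (at step2)."""
    if step2 == step1:
        return value2
    frac = (step - step1) / (step2 - step1)
    return value1 + frac * (value2 - value1)


# (istep1, istep2, value1, value2) copied from the TEMP0 entries of RSTanneal.in
TEMP0_SEGMENTS = [
    (0, 4000, 0.0, 600.0),       # heating
    (4001, 6000, 600.0, 600.0),  # hold at 600 K
    (6001, 20000, 600.0, 0.0),   # final cooling
]


def target_temperature(step):
    """Target temperature (K) at a given step of the annealing run."""
    for s1, s2, v1, v2 in TEMP0_SEGMENTS:
        if s1 <= step <= s2:
            return linear_ramp(step, s1, s2, v1, v2)
    raise ValueError(f"step {step} is outside the 0-20000 schedule")


def restraint_weight(step):
    """Relative restraint weight (REST entry): 0.1 at step 0, 1.0 at step 20000."""
    return linear_ramp(step, 0, 20000, 0.1, 1.0)


if __name__ == "__main__":
    for step in (0, 2000, 4000, 6000, 13000, 20000):
        print(step, round(target_temperature(step), 1),
              round(restraint_weight(step), 3))
```

Printing the schedule at a few steps is a quick way to confirm that the entries reproduce the heating, hold, and cooling phases described in the text.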
Structure refinement and analysis: The 20 structures with the lowest total energy values were subjected to restrained energy minimization by the SANDER module of the AMBER 7.0 package [Case et al, 1996; Pearlman et al, 1995]. Each conformer was subjected to a conjugate-gradient energy minimization calculation with solvent included. New parameter files were created, adding water to a distance of 8.0 Å with the “solvateBox” command in tleap, and neutralizing with sodium ions:
% tleap -f leaprc.ff99
> complex = loadpdb DNA+pitx2_docked.pdb
> solvateBox complex WATBOX216 8.0
> charge complex
> addIons2 complex Na+ 0
> saveamberparm complex DNA+pitx2_docked.parm7 DNA+pitx2_docked.x
> quit
Two minimization steps were carried out, using the files called “min.in” and “min_all.in”:
min.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin=1, maxcyc=200,
  ntpr=5, ntr=1,
 &end
END
END

min_all.in
Minimization of the entire molecular system
 &cntrl
  imin=1, maxcyc=200,
  ntpr=5,
 &end
The evaluation of the structure, i.e., analysis of geometry, stereochemistry, and energy distributions in the models, was performed using the program PROCHECK [Laskowski et al,
1993]. Restraint violations were determined and analyzed using the program AQUA [Laskowski
et al, 1996]. Graphics were prepared using MOLMOL [Koradi et al, 1996].
Molecular Dynamics Calculations
Preparation of Mutant Protein Files
MD simulations were begun using the NMR solution structure of the PITX2
homeodomain bound to its DNA site [Protein Data Bank (PDB) code: 1yz8]. The average
structure of the 20 conformers was used as the starting point. Mutations of the protein were made within the program MOLMOL [Koradi et al, 1996] by replacing the wild-type residue with the mutant one and saving the molecule as a new PDB file to use in MD experiments. The
“SelectRes” command is used in MOLMOL to select the residue to be mutated, and the
“ChangeRes” command is used to switch it to the mutant.
Molecular Dynamics
The SANDER and LEaP modules of the AMBER7 suite of programs and the 1999
version of the AMBER force field were used in setting up and performing the simulations [Case
et al, 1996; Pearlman et al, 1995; Wang et al, 2000]. All of the complexes were put through the same protocol, which was loosely based on that of Gutmanas & Billeter [2004]. The complexes
were initially minimized in vacuo with 100 steps of steepest descent, followed by 900 steps of
the conjugate gradient method, with a 30 Å cutoff on nonbonded interactions. This calculation
takes about 9 minutes to run, and the input file is called “min.in”, with the execution script
illustrated in the commented out (#) section (this will be shown for each of the MD scripts
presented below):
min.in
pitx2: initial minimisation prior to MD
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 100,
  ntb = 0, igb = 0,
  cut = 30
 /
# sander -O -i min.in -o pitx2_min.out -c pitx2_vac.crd -p pitx2_vac.parm7 -r pitx2_vac_init_min.rst
# ambpdb -p pitx2_vac.parm7 < pitx2_vac_init_min.rst > pitx2_vac_init_min.pdb
These minimized structures were then solvated in a box of water, with a minimal distance of 8 Å
from the solute to the border (performed in tleap, as described above for the docking
calculation). A total of 4558 water molecules were added. Seventeen sodium ions were added to
make the system’s charge neutral (“addions” command in tleap). The simulations had a total of
15703 atoms. The solvated complexes were then put through a step of energy minimization.
This step consisted of 1000 steps of steepest descent, with 50 kcal/(mol Å²) restraints on all heavy atoms of the solute, and a 10 Å cutoff for nonbonded interactions. This calculation takes about
13 minutes to run, and the input file is “min2.in”:
min2.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
50.0
RES 1 94
END
END
# sander -O -i min2.in -o pitx2_min2.out -c pitx2.crd -p pitx2.parm7 -r pitx2_min2.rst -ref pitx2.crd
A 25 ps MD simulation was then performed, with position restraints of 50 kcal/(mol Å²) and a 10 Å cutoff, as above. A constant pressure of 1 atm was used. The SHAKE algorithm was used, with a timestep of 2 fs. The system was heated from 100 K to 300 K during the first 2 ps, and kept at a constant temperature afterwards. This calculation takes about 8 hours to run, and the input file is “eq1.in”:
eq1.in
Pitx2: 25ps MD with res
 &cntrl
  imin = 0, irest = 0, ntx = 1,
  ntb = 2, ntp = 1,
  cut = 10, ntr = 1,
  ntc = 2, ntf = 2,
  tempi = 100.0, temp0 = 300.0,
  tautp = 2, ntt = 1,
  gamma_ln = 1,
  nstlim = 12500, dt = 0.002
  ntpr = 100, ntwx = 100, ntwr = 1000
 /
Keep solute fixed
50.0
RES 1 94
END
END
# sander -O -i eq1.in -p pitx2.parm7 -c pitx2_min2.rst -r pitx2_eq1.rst -x pitx2_eq1.crd -o pitx2_eq1.out -ref pitx2_min2.rst
Another minimization step was performed, as above, with the position restraints relaxed to 25 kcal/(mol Å²). This calculation takes about 40 minutes to run, and the input file is “eq2.in”:
eq2.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
25.0
RES 1 94
END
END
# sander -O -i eq2.in -o pitx2_eq2.out -c pitx2_eq1.rst -p pitx2.parm7 -r pitx2_eq2.rst -ref pitx2_eq1.rst
Next, a 3 ps MD simulation was performed. The position restraints were kept at 25 kcal/(mol Å²), and Particle Mesh Ewald (PME) summation was introduced. This calculation takes about 30 minutes to run, and the input file is called “eq3.in”:
eq3.in
Pitx2: 3ps MD with res
 &cntrl
  imin = 0, irest = 0, ntx = 1,
  ntb = 2, ntp = 1,
  cut = 10, ntr = 1,
  ntc = 2, ntf = 2,
  tempi = 100.0, temp0 = 300.0,
  tautp = 2, ntt = 1,
  gamma_ln = 1.0,
  nstlim = 1500, dt = 0.002
  ntpr = 100, ntwx = 100, ntwr = 1000
 /
 &ewald
 /
Keep solute fixed
25.0
RES 1 94
END
END
# sander -O -i eq3.in -p pitx2.parm7 -c pitx2_eq2.rst -r pitx2_eq3.rst -x pitx2_eq3.crd -o pitx2_eq3.out -ref pitx2_eq2.rst
Then, five more minimization steps were performed, relaxing the position restraints from 20 to 0 kcal/(mol Å²). These calculations take about 30 minutes each to run, and the input files are called
“eq4.in”, “eq5.in”, “eq6.in”, “eq7.in”, and “eq8.in”:
eq4.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
20.0
RES 1 94
END
END
# sander -O -i eq4.in -o pitx2_eq4.out -c pitx2_eq3.rst -p pitx2.parm7 -r pitx2_eq4.rst -ref pitx2_eq3.rst

eq5.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
15.0
RES 1 94
END
END
# sander -O -i eq5.in -o pitx2_eq5.out -c pitx2_eq4.rst -p pitx2.parm7 -r pitx2_eq5.rst -ref pitx2_eq4.rst

eq6.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
10.0
RES 1 94
END
END
# sander -O -i eq6.in -o pitx2_eq6.out -c pitx2_eq5.rst -p pitx2.parm7 -r pitx2_eq6.rst -ref pitx2_eq5.rst

eq7.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  ntr = 1,
  cut = 10
 &end
Group input for restrained atoms
5.0
RES 1 94
END
END
# sander -O -i eq7.in -o pitx2_eq7.out -c pitx2_eq6.rst -p pitx2.parm7 -r pitx2_eq7.rst -ref pitx2_eq6.rst

eq8.in
Minimization with Cartesian restraints for the solute
 &cntrl
  imin = 1, maxcyc = 1000, ncyc = 1000,
  ntb = 1, igb = 0,
  cut = 10
 &end
# sander -O -i eq8.in -o pitx2_eq8.out -c pitx2_eq7.rst -p pitx2.parm7 -r pitx2_eq8.rst
A 2 ps MD run was then performed at constant pressure, heating the system from 100 K to 300 K during the first 2 ps, and keeping the temperature constant afterwards. This calculation takes
about 45 minutes to run, and the input file is called “eq9.in”:
eq9.in
Pitx2: 2ps MD
 &cntrl
  imin = 0, irest = 0, ntx = 1,
  ntb = 2, ntp = 1,
  cut = 10,
  ntc = 2, ntf = 2,
  tempi = 100.0, temp0 = 300.0,
  tautp = 2, ntt = 1,
  gamma_ln = 1.0,
  nstlim = 1000, dt = 0.002
  ntpr = 100, ntwx = 100, ntwr = 1000
 /
END
# sander -O -i eq9.in -p pitx2.parm7 -c pitx2_eq8.rst -r pitx2_eq9.rst -x pitx2_eq9.crd -o pitx2_eq9.out
A final MD simulation was run for 100 ps at constant pressure, and a constant temperature of
300 K. This calculation runs overnight, and the input file is called “eq10.in”:
eq10.in
Pitx2: 100ps MD
 &cntrl
  imin = 0, irest = 0, ntx = 1,
  ntb = 2, ntp = 1,
  cut = 10,
  ntc = 2, ntf = 2,
  temp0 = 300.0,
  tautp = 2, ntt = 1,
  gamma_ln = 1.0,
  nstlim = 50000, dt = 0.002
  ntpr = 100, ntwx = 100, ntwr = 1000
 &end
END
END
# sander -O -i eq10.in -p pitx2.parm7 -c pitx2_eq9.rst -r pitx2_eq10.rst -x pitx2_eq10.crd -o pitx2_eq10.out -ref pitx2_eq9.rst
The trajectory length for the production run was 2 ns for each complex. Periodic boundary conditions were used, at constant volume and a constant temperature of 300 K. The SHAKE algorithm was used, with a timestep of 2 fs. A cutoff of 10 Å was used for long-range interactions, along with PME summation. This calculation takes about 11 days to run, and is run in separate steps of 200 ps each. The input file is called “pitx2_prod.in”:
pitx2_prod.in
Constant pressure constant temperature production run
 &cntrl
  nstlim=100000, dt=0.002, ntx=5, irest=1,
  ntpr=500, ntwr=5000, ntwx=5000,
  temp0=300.0, ntt=1, tautp=2.0,
  ntb=1, ntp=0,
  ntc=2, ntf=2, cut=10,
  nrespa=2,
 /
 &ewald
 /
END
END
To run the entire 2 ns in separate 200 ps steps, the script “run_pitx2_2000ps.x” is used:
run_pitx2_2000ps.x
#!/bin/csh
set AMBERHOME="/usr/local/amber7"
set MDSTARTJOB=1
set MDENDJOB=10
set MDCURRENTJOB=$MDSTARTJOB
set MDINPUT=0

echo -n "Starting Script at: "
date
echo ""

while ( $MDCURRENTJOB <= $MDENDJOB )
  echo -n "Job $MDCURRENTJOB started at: "
  date
  @ MDINPUT = $MDCURRENTJOB - 1
  $AMBERHOME/exe/sander -O -i pitx2_prod.in \
    -o pitx2_prod$MDCURRENTJOB.out \
    -p pitx2.parm7 \
    -c pitx2_prod$MDINPUT.rst \
    -r pitx2_prod$MDCURRENTJOB.rst \
    -x pitx2_prod$MDCURRENTJOB.crd \
    -ref pitx2_prod$MDINPUT.rst
  gzip -9 -v pitx2_prod$MDCURRENTJOB.crd
  echo -n "Job $MDCURRENTJOB finished at: "
  date
  @ MDCURRENTJOB = $MDCURRENTJOB + 1
end
echo "ALL DONE"
To make this script executable, the following command was typed in the terminal window:
% chmod +x run_pitx2_2000ps.x
To run this script in the background, the following command was typed in the terminal window:
% nohup ./run_pitx2_2000ps.x >& run.log &
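The bookkeeping behind the chained production run can be checked with a small Python sketch: the number of jobs follows from the total trajectory length divided by nstlim × dt, and each job restarts from the restart file written by the previous one. The function plan_production is hypothetical and only mirrors the file naming used in run_pitx2_2000ps.x; it does not run sander.

```python
def plan_production(total_ps, nstlim, dt_ps, prefix="pitx2_prod"):
    """Return (ps per job, list of per-job file names) for a chained MD run."""
    ps_per_job = nstlim * dt_ps
    n_jobs = round(total_ps / ps_per_job)
    jobs = []
    for job in range(1, n_jobs + 1):
        jobs.append({
            "job": job,
            "restart_in": f"{prefix}{job - 1}.rst",   # written by the previous job
            "restart_out": f"{prefix}{job}.rst",
            "trajectory": f"{prefix}{job}.crd",       # gzipped after each job
        })
    return ps_per_job, jobs


if __name__ == "__main__":
    # 2 ns total, nstlim=100000 and dt=0.002 ps as in pitx2_prod.in
    ps_per_job, jobs = plan_production(2000.0, 100000, 0.002)
    print(f"{len(jobs)} jobs of {ps_per_job:.0f} ps each")
    for j in jobs:
        print(j["job"], j["restart_in"], "->", j["restart_out"], j["trajectory"])
```

Note that the first job reads pitx2_prod0.rst, so the final equilibration restart file has to be available under that name before the script is started.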
Analysis of Production Run Data
PDB files were prepared of the average structure during the MD run for each complex, and for individual steps throughout the trajectory (“ambpdb” command). These were compared
and analyzed in MOLMOL [Koradi et al, 1996]. The average structures during the trajectories
of the MD simulations were calculated using the PTRAJ module of the AMBER package [Case
et al, 1996; Pearlman et al, 1995]. The input file for this calculation is called “ptraj.in”:
ptraj.in
trajin pitx2_prod1.crd.gz
trajin pitx2_prod2.crd.gz
trajin pitx2_prod3.crd.gz
trajin pitx2_prod4.crd.gz
trajin pitx2_prod5.crd.gz
trajin pitx2_prod6.crd.gz
trajin pitx2_prod7.crd.gz
trajin pitx2_prod8.crd.gz
trajin pitx2_prod9.crd.gz
trajin pitx2_prod10.crd.gz
trajout pitx2_2000ps.crd
rms first out pitx2_2000ps.rmsfit @P,O3',O5',C3',C4',C5',CA time 10
center :1-94
image familiar
solvent byres :WAT
closest 1050 :1-2012 first
average avg.pdb pdb
# ptraj pitx2.parm7 < ptraj.in
For calculation of the average structure, only the 1050 water molecules closest to the complex
were retained (“closest” in ptraj.in). For these structures, the program NUCPLOT was used to
get a full list of protein-DNA and water-mediated contacts [Luscombe et al, 1997]. In addition,
the PTRAJ and CARNAL modules in AMBER were used to characterize hydrogen bonds and
water contacts during the course of the trajectories [Case et al, 1996; Pearlman et al, 1995]. The input file for the PTRAJ calculation is called “hbond.in”:
hbond.in
trajin pitx2_2000ps.crd
# ptraj pitx2.parm7 < hbond.in > hbond44.out
# specify the electron pair DONOR # donor mask :46@O
# specify the ACCEPTOR(s)
acceptor WAT O H1
acceptor WAT O H2
# calculate the waters in the first and second solvation shells
# (0-3.5A and 3.5-5.0A) and output to watershell.list
watershell :46 watershell44.list
# do the Hbond search/output
hbond solventacceptor O H1 solventacceptor O H2 solventneighbor 2 series hbond hbond distance 2.8 angle 35 donor acceptor neighbor 2 series \
The input file for the CARNAL calculation is called “carnal_hbond.in” and is edited for each
residue analyzed:
carnal_hbond.in
# HBOND analysis of ligand-receptor interaction
FILES_IN
  PARM p1 pitx2.parm7;
  STREAM s1 pitx2_2000ps.crd;
FILES_OUT
  HBOND h2 pitx2_50_dna TABLE LIST;
DECLARE
  GROUP g1 (RES 52);
  GROUP g2 (RES 85,86);
OUTPUT
  HBOND h2 DONOR g2 ACCEPTOR g1 STATS;
END
RMSD values of the trajectory versus the initial structure were calculated (“rms first out” command in ptraj.in). A Perl script was used to pull out energy versus time data (% process_mdout.perl *.out). To calculate the distance between the K50 NZ and the O6 and N7 atoms of the guanines G83 and G84, the PTRAJ module was used, with the “distance” command using the input file “ptraj_calc_K50_distance.in”:
ptraj_calc_K50_distance.in
trajin pitx2_prod1.crd.gz
trajin pitx2_prod2.crd.gz
trajin pitx2_prod3.crd.gz
trajin pitx2_prod4.crd.gz
trajin pitx2_prod5.crd.gz
trajin pitx2_prod6.crd.gz
trajin pitx2_prod7.crd.gz
trajin pitx2_prod8.crd.gz
trajin pitx2_prod9.crd.gz
trajin pitx2_prod10.crd.gz
distance K50to85 :52@NZ :85@N7 out K50to85dist time 10
distance K50to86 :52@NZ :86@N7 out K50to86dist time 10
distance K50to85O :52@NZ :85@O6 out K50to85distO time 10
distance K50to86O :52@NZ :86@O6 out K50to86distO time 10
# ptraj pitx2.parm7 < ptraj_calc_K50_distance.in
# xmgrace K50to85dist K50to86dist
To calculate the side chain angles for the K50 side chain, the “dihedral” command was used in the PTRAJ module with the input file called “ptraj_calc_K50_angles.in”:
ptraj_calc_K50_angles.in
trajin pitx2_prod1.crd.gz
trajin pitx2_prod2.crd.gz
trajin pitx2_prod3.crd.gz
trajin pitx2_prod4.crd.gz
trajin pitx2_prod5.crd.gz
trajin pitx2_prod6.crd.gz
trajin pitx2_prod7.crd.gz
trajin pitx2_prod8.crd.gz
trajin pitx2_prod9.crd.gz
trajin pitx2_prod10.crd.gz
dihedral chi_1 :52@C :52@CA :52@CB :52@CG out chi_1
dihedral chi_3 :52@CB :52@CG :52@CD :52@CE out chi_3
dihedral chi_4 :52@CG :52@CD :52@CE :52@NZ out chi_4
dihedral chi_2 :52@CA :52@CB :52@CG :52@CD out chi_2
# ptraj pitx2.parm7 < ptraj_calc_K50_angles.in
# xmgrace chi_1 chi_3 chi_4 chi_2
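For readers who want to verify individual values outside of PTRAJ, the torsion angle defined by four atoms can be computed directly from their Cartesian coordinates. The following Python sketch uses the standard atan2 formulation of the dihedral angle; it is an independent illustration, not the PTRAJ implementation, and the coordinates in the example are made up for the test geometry rather than taken from the trajectory.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal of the plane p1-p2-p3
    n2 = _cross(b2, b3)          # normal of the plane p2-p3-p4
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Illustrative geometry only: a +90 degree torsion about the z axis
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the four atoms named in each “dihedral” line above, a function of this kind should reproduce the corresponding angle for a single frame.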
CHAPTER 3: Solution Structure of the K50 Class Homeodomain PITX2 Bound to DNA and Implications for Mutations that Cause Rieger Syndrome
Functional Analysis of Purified PITX2 Homeodomain
After expressing and purifying the PITX2 homeodomain as described in Chapter 2, it was important to verify that the protein was functional, i.e., that it binds the consensus TAATCC site and some of the nonconsensus DNA sequences. Gel shift assays were performed, one of which is shown in Figure 3.1. A shifted band, corresponding to protein bound to DNA, appears in the lanes with protein but not in the lanes without protein. Solubility provided further evidence of DNA binding: at high concentrations, the free protein irreversibly aggregates and precipitates out of solution, whereas in the presence of DNA the protein remains soluble at 1 mM concentrations. A sample of the
PITX2 homeodomain was also sent to the University of Michigan for N-terminal analysis, which verified that the first four residues were correct and secondary thrombin cleavage was not occurring. These results indicate that the PITX2 homeodomain being expressed and purified
here is functioning properly, and structure determination can proceed.
Lanes (numbered 1 to 10 from left to right):
1- No protein, TAATCC
2- With protein, TAATCC
3- No protein, TAAGCC
4- With protein, TAAGCC
5- No protein, neg control
6- With protein, neg control
7- No protein, TAGTCC
8- With protein, TAGTCC
9- No protein, TATTCC
10- With protein, TATTCC
Figure 3.1: Gel shift assays of the PITX2 homeodomain. Gel shift assays were performed as described in Chapter 2 to examine the DNA-binding activity of the PITX2 homeodomain. The shifted bands in the lanes with protein indicate that the protein is functioning properly, in that it binds consensus and nonconsensus DNA binding sites as determined previously [Dave et al, 2000; Yuan et al, 1999;
Hjalt et al, 2001; Espinoza et al, 2002].
KD of the PITX2 Homeodomain Bound to its DNA Consensus Site
The KD of the PITX2 homeodomain alone bound to its consensus site had not been determined previously. This value was determined by Vrushank Dave, using protein we prepared. The Engrailed Q50K mutant binds to the consensus TAATCC site with an unusually high affinity, near the picomolar range [Ades & Sauer, 1994]. There is no evidence that natural
K50 class homeodomains have such a high affinity for DNA [Amendt et al, 1998; Ma et al,
1996]. The KD for the Q50K mutant of the Fushi tarazu homeodomain was found to be 0.63 nM
[Percival-Smith et al, 1990], which is a much lower affinity than that of the EnQ50K mutant. The KD for the PITX2 homeodomain alone was found to be 2.6 ± 0.38 nM (Figure 3.2), which is
comparable to that of the Fushi tarazu mutant, and also a much lower affinity than that of the Engrailed Q50K mutant.
[Figure 3.2. (A) Gel shift assay, lanes 1-7, showing bound and free probe. (B) Scatchard plot of DNA Bound/Free against DNA Bound (nM).]
Figure 3.2: KD of the PITX2 homeodomain (Prepared by Vrushank Dave). Gel shift assay for Scatchard analysis to determine the affinity of recombinant PITX2 homeodomain for the bicoid consensus DNA element. (A) The DNA probe concentrations used in this analysis were 1, 2, 4, 8, 16, 20 and 40 nM for lanes 1 to 7, respectively, at a fixed protein concentration. (B) The KD value obtained was 2.6 ± 0.38 nM (mean ± standard deviation) from three independent DNA binding curves obtained from quantitative gel shift assays by measuring the bound and free fractions of the probes with a PhosphorImager. The data were analyzed using Microsoft Excel (linear regression analysis) to determine the KD value (the slope of the plot of Bound/Free against Bound DNA equals -1/KD).
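The Scatchard relation used in Figure 3.2 can be written as Bound/Free = (Bmax - Bound)/KD, so a straight-line fit of Bound/Free against Bound has slope -1/KD and x-intercept Bmax. The Python sketch below illustrates the regression with an ordinary least-squares fit. It is not the spreadsheet analysis that was actually used, and the numbers in the example are synthetic values generated from an assumed KD and Bmax, not the measured gel shift data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard_kd(bound, free):
    """Return (KD, Bmax) from bound and free probe concentrations."""
    ratios = [b / f for b, f in zip(bound, free)]
    slope, intercept = linear_fit(bound, ratios)
    kd = -1.0 / slope            # slope = -1/KD
    bmax = -intercept / slope    # x-intercept of the Scatchard line
    return kd, bmax


if __name__ == "__main__":
    # Synthetic example: ideal single-site binding with KD = 2.6 nM, Bmax = 2.0 nM
    true_kd, true_bmax = 2.6, 2.0
    bound = [0.25, 0.5, 1.0, 1.5]
    free = [b * true_kd / (true_bmax - b) for b in bound]
    print(scatchard_kd(bound, free))
```

With real data the points scatter about the line, which is why the reported KD is the mean of three independent binding curves.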
Analysis of Protein Folding by HSQC
The first NMR experiment, a Heteronuclear Single Quantum Correlation (HSQC) experiment, was performed on a 15N-labeled sample. This 2D experiment shows the chemical shifts of the N-H amide groups in a protein, and can be used to assess the folding of the protein. The chemical shifts of each amino acid depend on its environment, which is determined by the conformation of the protein. When the protein is well folded, the chemical shift dispersion is good. When the protein is not well folded, the peaks tend to be clumped
together. The HSQC for the PITX2 homeodomain is shown in Figure 3.3. The good chemical shift dispersion indicates that the protein is well folded under these experimental conditions when bound to DNA. The signal-to-noise ratio is appropriate for a complex of this size.
Approximately 59-60 resonances can be fully or partially resolved for the backbone amide groups. Eight of the nine Asn and Gln side chains are visible. Eleven out of the twelve arginine side chains are visible outside of this spectral view.
Figure 3.3: 15N HSQC for the PITX2 homeodomain bound to its consensus DNA site.
Resonance Assignments
Assignment of the resonances of the atoms in the PITX2 homeodomain and its DNA consensus binding site was performed using the experiments described in Chapter 2. These assignments are listed in Table 3.1. A 15N-HSQC labeled with the backbone assignments is presented in Figure 3.4.
Figure 3.4: 15N-HSQC labeled with backbone and side-chain assignments obtained through triple-resonance experiments. All red peaks, and arginine NH-QH peaks are folded in, with actual resonances listed in the following tables.
Residue N HN CA CB CO HA HB CG HG CD HD
G-1 119.3 5.783 47.76 177.5 3.815
S-2 114.7 8.171 58.71 63.9 174.4 4.508 3.915
Q1 122.4 8.551 55.7 29.6 175.6 4.435 2.116,2.009 33.84 2.388
R2 123.8 8.462 55.85 31.57 176.3 4.464 1.827 27.23 1.720 43.7 3.293
R3 125.2 9.065 56.38 31.31 175.7 4.260 1.871,1.770 26.92 1.711 44.48 3.304
Q4 122.2 8.465 55.54 29.58 175.2 4.259 1.989 34.06 2.369
R5 124.2 8.622 56.45 32.84 175.2 4.272 1.593 34.14 1.694 2.366
T6 125.6 8.03 64.96 69.15 172.6 3.826 3.502 20.68 0.460
H7 125.7 8.669 53.64 29.9 174.7 4.778 3.180,3.089
F8 126.3 8.681 58.62 40.53 176.7 4.758 2.762,3.127
T9 113.7 9.151 60.39 70.97 175.6 4.530 4.823 21.8 1.393
S10 116.7 9.185 62.14 62.31 177.2 4.177 3.974
Q11 121 8.409 59.61 27.97 178.8 4.005 2.147,1.981 33.83 2.453
Q12 119.3 7.789 59.48 27.96 177.7 3.831 1.637 35 2.670
L13 117.1 8.293 57.93 41.6 179 3.609 1.799,1.524 26.97 1.660 25.38 0.940,0.799
Q14 118.1 8.149 59.26 28.52 179.1 3.982 2.180 34.15 2.497,2.362
Q15 119.8 7.57 58.49 29.53 179.8 4.174 2.006,1.865 35.46 2.352,2.211
L16 123.5 8.302 58.53 37.81 178.1 3.506 0.4867,-1.204 26.17 1.147 22.86 -0.811,0.4104
E17 119.2 8.247 59.03 29.16 179.1 4.358 1.951,2.177 34.51 2.350,2.517
A18 120.6 7.944 55.19 18 181.6 4.175 1.540
T19 113.5 7.932 66.99 68.32 176.1 4.088 4.570 21.83 1.515
F20 125.5 8.829 61.76 39.07 175.6 4.367 3.354,3.230
Q21 112.4 7.908 57.52 28.16 177.2 3.793 2.259,2.099 33.73 2.762,2.664
R22 116.8 7.393 57.1 31.6 176.6 4.358 1.980,1.906 27.56 1.777,1.656 43.53 3.237
N23 119.4 8.245 52.58 38.4 172.9 4.475 2.736
R24 121.5 8.39 57.6 31.06 174.9 3.788 1.281,0.9161 27.24 -0.283 42.7 1.648,1.699
Y25 114.1 7.722 55.74 39.12 4.647 2.543,3.127
P26 62.67 31.07 177.4 4.402 1.582,1.415 25.63
D27 124.2 8.457 52.51 40.62 175.6 4.475 3.224,2.825
M28 119.4 8.674 60.34 32.14 177.5 3.843 2.567,2.471 32.12 2.111
S29 113.4 8.376 61.83 62.47 177.5 4.244 3.890
T30 120.9 8.402 67.08 67.61 176.5 4.001 4.173 20.95 1.151
R31 121 8.944 61.14 32.1 178.6 3.859 2.092 1.553 2.438
E32 118.8 8.595 59.8 29.32 178.8 3.962 2.305,2.028 36.68 2.571,2.187
E33 121.2 7.629 59.62 29.29 178.7 3.988 2.166,2.009 36.41 2.309,2.009
I34 119.8 8.402 64.77 38.6 179.6 3.614 1.887 28.37 1.158,0.916 14.96, 10.916,0.8326
A35 124.3 8.538 55.71 16.97 178.7 3.659 1.365
V36 118.4 7.773 66.52 31.61 180.1 3.811 2.245 22.83, 21.153,0.961
W37 120.3 8.099 59.11 29.89 178 4.622 3.447
T38 104.5 7.999 61.65 71.14 174.9 4.379 4.265 22.37 1.207
N39 119 7.913 54.81 37.29 174.1 4.527 3.239,2.889
L40 120.2 8.626 53.08 47.04 175.9 5.039 1.687,1.458 23.86 1.102 25.96 0.897
T41 106.5 7.071 59.49 71.46 175.9 4.705 4.709 22.3 1.336
E42 124.1 9.445 61.56 28.39 177.8 3.616 1.930,2.362 37.3 2.624,2.111
A43 119.7 8.462 55.59 18.63 180.1 3.995 1.487
R44 116.9 7.656 59.07 31.67 180.1 4.296 2.249,2.143 27.9 1.980 43.46 3.405
V45 120.1 7.999 67.25 32.56 177.3 3.757 2.435 22.21, 21.150,1.035
R46 121.6 9.423 60.7 30.56 179.6 3.930 2.269,2.104 26.16 1.703,1.560 43.5 3.347
V47 120.8 8.097 66.92 32.43 176 3.626 2.273 24.32 1.107
W48 122.9 8.419 62.61 28.23 180.3 4.876 3.449
F49 118.3 9.039 63.67 39.42 177.5 3.714 3.280
K50 121.4 7.901 60.5 33.41 177.5 3.933 1.960,1.722 30.2 1.374,1.472 30.19 2.247
N51 119.4 8.669 56.22 38.35 177.8 4.487 2.846
R52 125.6 8.836 56.68 27.66 180.3 3.664 0.663 23.9 -0.620, -0.209 2.474
R53 120.3 8.607 61.17 31.6 178.2 4.313 2.242 1.989 43.07 2.452
A54 122 7.374 55.62 17.49 180.1 4.338 1.595
K55 120.1 7.703 59.58 32.95 178.2 4.055 1.815 25.03 1.498
W56 121.3 8.345 60.31 29.59 178.3 4.582 3.411
R57 117.6 8.457 59.59 31.26 178.7 3.859 1.949 28.98 2.166 44.08 3.385
K58 118.3 7.758 58.09 32.98 177.9 4.205 1.958 25.23 1.644 29.16 1.496
R59 116.9 7.959 56.26 30.98 176.3 4.339 1.863 27.09 1.624 43.18 3.103
E60 119.5 8.068 56.84 30.01 176.4 3.986 1.820 36.61 2.013
E61 120.1 7.877 57.36 30.07 175.9 3.869 1.782,1.491 36.47 2.075,1.918
F62 118.9 7.722 56.96 39.19 175.2 4.571 3.034
I63 123.5 7.794 60.97 38.56 175.7 4.121 1.786 27.32 1.4,1.104,0.832 12.62 0.831
V64 125.9 8.281 62.25 32.88 176.3 4.197 2.088 21.1 0.994,0.823
T65 118.8 8.335 61.27 70.09 173.5 4.435 4.295 21.33 1.171
D66 128.2 7.994 55.95 42.23 4.418 2.680,2.573

Table 3.1: List of chemical shifts for the PITX2 homeodomain bound to DNA.
Chemical Shift Indices
The values of the resonance frequencies of Hα, Cα, and C′ atoms can be used to get a general idea of secondary structure by using chemical shift indices (CSI) [Wishart et al, 1992;
Spera & Bax, 1991; Luginbuhl et al, 1995]. Chemical shifts are highly characteristic of the chemical environment and vary based on secondary structure. Table 3.2 shows the analyses for these atoms. If a chemical shift is greater than a range defined for that amino acid type, a "1" is assigned to it. If the chemical shift is below the range, a "-1" is assigned. If it is within the range,
a "0" is assigned. A strip of four or more "1"s for C′ or Cα, or "-1"s for Hα, is indicative of a helical structure. Table 3.2 indicates that the secondary structure of the PITX2 homeodomain is very similar to other homeodomains in that it consists of three helices.
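The index assignment and the search for helical stretches described above are simple enough to express in a few lines of Python. This is a minimal sketch under stated assumptions: the reference range passed to csi_flag is a placeholder (the actual ranges depend on amino acid type and are taken from the cited literature), and both function names are hypothetical.

```python
def csi_flag(shift, low, high):
    """Return 1 if shift is above the reference range, -1 if below, else 0."""
    if shift > high:
        return 1
    if shift < low:
        return -1
    return 0


def find_runs(flags, value, min_len=4):
    """Return (start, end) index pairs of runs of `value` at least min_len long."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag == value:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags) - 1))
    return runs


if __name__ == "__main__":
    # Placeholder reference range, for illustration only
    print(csi_flag(4.0, low=4.2, high=4.4))
    # Helical stretches appear as runs of -1 in the HA column
    ha_flags = [0, -1, -1, -1, -1, -1, 0, 1, -1, -1]
    print(find_runs(ha_flags, -1))
```

Applied to the CO or CA columns the same search would look for runs of 1, and applied to the HA column it would look for runs of -1.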
Residue  CO  HA  CA      Residue  CO  HA  CA
G-1       1  -1   1      E33       1  -1   1
S0        1   0   1      I34       1  -1   1
Q1       -1   0  -1      A35       1  -1   1
R2        0   0  -1      V36       1  -1   1
R3       -1  -1   0      W37       1  -1   1
Q4        0  -1  -1      T38       0   0  -1
R5       -1   0   1      N39      -1  -1   1
T6       -1  -1   1      L40      -1   1  -1
H7        0   0  -1      T41       0   1  -1
F8        1   1   1      E42       1  -1   1
T9        0   1  -1      A43       1  -1   1
S10       1  -1   1      R44       1  -1   1
Q11       1  -1   1      V45       0  -1   1
Q12       1  -1   1      R46       1  -1   1
L13       1  -1   1      V47      -1  -1   1
Q14       1  -1   1      W48       1   1   1
Q15       1  -1   1      F49       1  -1   1
L16       1  -1   1      K50       1  -1   1
E17       1   0   1      N51       1  -1   1
A18       1  -1   1      R52       1  -1   1
T19       1  -1   1      R53       1   0   1
F20       0  -1   1      A54       1   0   1
Q21       1  -1   1      K55       1  -1   1
R22       0   0   1      W56       1  -1   1
N23      -1  -1  -1      R57       1  -1   1
R24      -1  -1   1      K58       1  -1   1
Y25       0  -1          R59       0   0   0
P26       1   0  -1      E60       0  -1   0
D27      -1  -1          E61       0  -1   1
M28       1  -1   1      F62       0  -1  -1
S29       1  -1   1      I63      -1   1  -1
T30       1  -1   1      V64      -1   1  -1
R31       1  -1   1      T65      -1   0  -1
E32       1  -1   1      D66      -1   1
Table 3.2: Chemical shift indices for the Hα, Cα, and CO atoms of the PITX2 homeodomain. The residues highlighted in yellow indicate where the 3 helices are found to be in the Antennapedia homeodomain.
Aromatic Assignments
Assignment of the atoms in aromatic groups was performed using NMR experiments described in Chapter 2. The 2D HMQC-TOCSY with a 20 ms mixing time was the most useful in making assignments. Previous structural studies of homeodomains have indicated that interactions between aromatic groups play an important structural role in the hydrophobic core of the protein
[Subramaniam et al, 2001]. Therefore, it was important to assign these atoms in order to obtain an accurate tertiary structure. The assignments for these aromatic groups are listed in Table 3.3.
Residue NE1 CD1/2 HD1/2 CE1/2 HE1/2 CZ HZ CE3 HE3 CZ2/3 HZ2/3 CH2 HH2
H7 119.8 7.120 137.0 8.315
F8 7.351 7.292 7.075
F20 126.7 7.585 132.2 7.384/7.446 122.0 7.170
Y25 133.6 7.065 117.8 6.793
W37 109.0 126.4 7.128 10.210 122.0 7.127 114.7/120.7 7.448/7.572 124.7 7.204
W48 109.3 128.7 6.966 9.301 7.429 113.7 7.125/7.091 7.217
F49 132.2 7.799 7.769 7.594
W56 110.6 127.5 7.473 10.350 130.5 7.041 114.7/119.8 7.256/7.694 122.0 7.169
F62 131.8 7.093 130.5/130 7.318/7.318 7.295

Table 3.3: Assignments of atoms in aromatic groups of the PITX2 homeodomain.
Side Chain Assignments for Arginine, Asparagine, and Glutamine Residues
Side chains of arginine, asparagine and glutamine residues were assigned by looking at
NOESY spectra and matching large NOESY peaks to previous assignments. These assignments are shown in Tables 3.4, 3.5 and 3.6.
Residue  Nε    Hε
R2       83.8  7.689
R3       84.7  7.478
R5       84.5  7.256
R22      83.0  7.652
R24      83.6  6.359
R31      83.2  8.186
R44      84.6  7.522
R46      80.5  9.330
R52      89.9  9.852
R53      86.4  7.495
R57      85.6  7.452
R59      85.8  7.396
Table 3.4: Chemical shift assignments of the arginine side-chains.
Residue  ND2    HD21   HD22
N23
N39      112.3  7.520  6.878
N51      123.6  8.906  8.405
Table 3.5: Chemical shift assignments of the asparagine side-chains.
Residue  NE2    HE21   HE22
Q1       108.6  6.583  7.514
Q4       113.0  6.893  7.634
Q11      112.0  7.663  6.787
Q12      110.7  6.716  7.982
Q14      111.7  6.835  7.472
Q15      112.0  6.787  7.663
Q21      113.5  7.910  7.083
Table 3.6: Chemical shift assignments of the glutamine side-chains.
Chemical Shift Assignments for the DNA Bound to the PITX2 Homeodomain
Resonances for the protons of the DNA were assigned as described in Materials and
Methods (Chapter 2). These assignments are listed in Table 3.7.
Residue TCH3 2'H 2"H 1'H 6H 8H 3'H C5H A2H 4'H 5'H
G67 2.665 2.757 5.988 7.966 4.834 4.237 4.168
C68 2.159 2.544 6.068 7.517 5.356 4.248 4.209
T69 1.590 6.061 7.426 4.164 4.195
C70 2.435 2.229 5.896 7.531 5.555 4.159
T71 1.738 2.249 2.363 6.078 7.618 4.049 5.564
A72 1.805 2.409 5.679 7.758 4.131 5.253
A73 2.313 2.883 6.173 7.438
T74 1.091 2.139 2.445 6.101 7.038 4.129 4.222
C75 2.461 2.619 5.891 7.532 5.471 4.211
C76 2.245 2.508 5.906 7.577 5.612 4.209
C77 2.151 2.486 6.079 7.536 4.781 5.610 4.149 4.161
C78 1.936 2.350 5.763 7.460 5.587 4.060 4.107
G79 2.362 2.605 6.170 7.932 4.669 4.062 4.178
C80 1.979 2.444 5.892 7.605 4.696 5.892 4.121 4.067
G81 2.733 2.777 5.800 7.870 4.978 4.043 4.352
G82 2.637 2.743 5.803 7.423 4.923 4.427 4.352
G83 2.245 2.694 5.926 7.665 4.378 4.065
G84 2.608 2.939 5.810 7.768 5.012 4.192 4.462
A85 2.462 2.999 5.952 8.062 5.008 4.427 4.458
T86 1.344 1.810 2.427 6.157 6.981 4.748 4.136 4.429
T87 1.538 2.096 2.410 5.678 7.178 4.912 4.147
A88 2.647 2.821 5.912 8.413 5.022
G89 2.110 2.175 6.145 7.402 4.912 4.045 4.465
A90 2.638 2.848 5.895 7.739
G91 2.638 2.747 5.932 7.631 4.917 4.424
C92 2.442 2.647 5.827 7.402 5.377 4.311
Table 3.7: Chemical shift assignments of the DNA binding site.
Protein-DNA NOEs
NOEs between the protein and the DNA were obtained and assigned as described in
Materials and Methods (Chapter 2). A table of these assigned NOEs, with the upper distance limits used, is shown in Table 3.8.
DNA          Protein      Upper Distance Limit (Å)
T71 Q5'      R52 QB       5.00
A72 4'H      R3 QD        4.00
A72 Q5'      R3 Hε        5.00
A72 Q5'      T6 HG        5.00
A72 Q5'      V47 QG1      6.00
A72 8H       W48 HA       6.00
A73 2'H      V47 QG1      5.00
A73 8H       R44 Hε       6.00
A73 2"H      V47 QG1      5.00
T74 Q5'      R2 HA        6.00
T74 Q5'      R2 QD        6.00
T74 4'H      R44 QB       6.00
T74 2"H      R44 QD       6.00
T74 6H       V47 QG1      6.00
T74 TCH3     V47 QG1      6.00
G81 Q5'      Y25 HE       6.00
G82 Q5'      R31 QG       5.00
G82 4'H      Y25 HE       5.00
G83 4'H      F49 QB       6.00
G83 Q5'      F49 QB       6.00
G83 4'H      R53 HG2      6.00
G83 Q5'      K50 HG2      6.00
G83 Q5'      K50 HG1      5.00
G84 8H       R46 QD       6.00
G84 Q5'      K50 HG1      6.00
G89 8H       R5 Hε        4.00
G89 8H       R5 HN        5.00
Table 3.8: Table of protein-DNA NOEs
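As an illustration only, the restraints in Table 3.8 can be held in a simple list and tallied per protein residue. The sketch below is not part of the structure calculation; the tuple layout and variable names are assumptions chosen for this example, and the values are copied from the table.

```python
from collections import Counter

# Each entry: (DNA nucleotide, DNA atom, protein residue, protein atom,
# upper distance limit in Å), transcribed from Table 3.8.
restraints = [
    ("T71", "Q5'", "R52", "QB", 5.0), ("A72", "4'H", "R3", "QD", 4.0),
    ("A72", "Q5'", "R3", "Hε", 5.0), ("A72", "Q5'", "T6", "HG", 5.0),
    ("A72", "Q5'", "V47", "QG1", 6.0), ("A72", "8H", "W48", "HA", 6.0),
    ("A73", "2'H", "V47", "QG1", 5.0), ("A73", "8H", "R44", "Hε", 6.0),
    ("A73", '2"H', "V47", "QG1", 5.0), ("T74", "Q5'", "R2", "HA", 6.0),
    ("T74", "Q5'", "R2", "QD", 6.0), ("T74", "4'H", "R44", "QB", 6.0),
    ("T74", '2"H', "R44", "QD", 6.0), ("T74", "6H", "V47", "QG1", 6.0),
    ("T74", "TCH3", "V47", "QG1", 6.0), ("G81", "Q5'", "Y25", "HE", 6.0),
    ("G82", "Q5'", "R31", "QG", 5.0), ("G82", "4'H", "Y25", "HE", 5.0),
    ("G83", "4'H", "F49", "QB", 6.0), ("G83", "Q5'", "F49", "QB", 6.0),
    ("G83", "4'H", "R53", "HG2", 6.0), ("G83", "Q5'", "K50", "HG2", 6.0),
    ("G83", "Q5'", "K50", "HG1", 5.0), ("G84", "8H", "R46", "QD", 6.0),
    ("G84", "Q5'", "K50", "HG1", 6.0), ("G89", "8H", "R5", "Hε", 4.0),
    ("G89", "8H", "R5", "HN", 5.0),
]

# Number of intermolecular restraints contributed by each protein residue.
per_residue = Counter(entry[2] for entry in restraints)

# DNA nucleotides contacted by the K50 side chain.
k50_partners = sorted({entry[0] for entry in restraints if entry[2] == "K50"})

print(len(restraints), per_residue["K50"], k50_partners)
```

The tally confirms the 27 intermolecular restraints quoted in the text, three of which involve K50 and nucleotides G83 and G84.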
Tertiary Structure of the PITX2 Homeodomain
Structure Determination. Assignments of the protein backbone and side chain 1H, 13C, and 15N resonances were obtained from heteronuclear spectra. Restraint data derived for the PITX2 homeodomain-DNA complex are summarized in Table 3.9. Analysis of 15N and 13C heteronuclear-edited NOESY spectra recorded at various mixing times provided 1259 intramolecular distance restraints comprising 513 intraresidue, 338 sequential, 300 medium-range (2-5 residues apart) and 108 long-range (>5 residues apart) NOE contacts. Torsional restraints for 55 φ and 43 ψ angles were obtained from a 3D HNHA experiment and from Cα chemical shifts [Spera & Bax, 1991; Luginbuhl et al, 1995]. Overall, there are on average 19 intramolecular protein NOE restraints per residue. All of these protein restraints were used for structure calculation with the program CYANA [Guntert et al, 1997; Herrmann et al, 2002]. After the final round of structure calculation, the 20 structures with the lowest CYANA target function were used for docking to DNA and energy minimization. The final average CYANA target function for the 20 structures was 2.05.
A total of 292 distance restraints between protons within the DNA were obtained from 13C/15N-filtered NOE spectra. A series of 2D 13C(ω1)-edited, [13C,15N](ω2)-filtered NOESY spectra provided 27 unambiguous intermolecular restraints between the protein and the DNA. These restraints were entered into the program AMBER for docking and energy minimization, as described in Materials and Methods.
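The restraint counts quoted above can be cross-checked against the totals in Table 3.9 with a few lines of arithmetic. This is only a bookkeeping sketch; the variable names are illustrative, and the numbers are those given in the text.

```python
# Restraint counts as reported in the text and in Table 3.9.
protein_noe = {
    "intraresidue": 513,
    "sequential": 338,
    "medium_range": 300,
    "long_range": 108,
}
dihedral = {"phi": 55, "psi": 43}
dna_noe = 292             # intra-DNA distance restraints
intermolecular_noe = 27   # protein-DNA distance restraints

protein_noe_total = sum(protein_noe.values())
dihedral_total = sum(dihedral.values())
grand_total = protein_noe_total + dihedral_total + dna_noe + intermolecular_noe

print(protein_noe_total, dihedral_total, grand_total)
```

The sums reproduce the 1259 protein distance restraints, 98 dihedral restraints, and 1676 total restraints listed in Table 3.9.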
Quality of the NMR structure. The structure of the PITX2-DNA complex was calculated by a restrained molecular dynamics docking and energy minimization procedure, starting from the coordinates of the PITX2 protein calculated with CYANA and canonical B-form DNA, as described in Materials and Methods. The 20 structures with the lowest total energies were selected for conformer analysis. These structures exhibited a mean AMBER energy of –6268 kcal mol-1 and mean van der Waals and electrostatic energies of –399 and –3974 kcal mol-1, respectively. The mean AMBER energies given represent the intra-protein interaction energy.
Table 3.9. NMR structure statistics

NMR constraints
  Protein
    Distance constraints                          1259
      Intraresidue                                 513
      Sequential                                   338
      Medium-range                                 300
      Long-range                                   108
    Dihedral constraints                            98
      Phi                                           55
      Psi                                           43
  DNA                                              292
  Protein-DNA (intermolecular)                      27
  Total                                           1676

CYANA target function value (Å2)a                 2.05 +/- 0.39

Number of violations (average per conformer)
  Distance violations (>0.30 Å)                      0
  Dihedral angle violations (>5.0°)                  1

AMBER energies (kcal/mol)b
  Mean AMBER energy                              -6268 +/- 250
  Van der Waals                                   -399 +/- 33
  Electrostatic                                  -3974 +/- 336

Ramachandran plot (%)c
  Residues in most favored regions                80.1
  Residues in additional allowed regions          14.8
  Residues in generously allowed regions           2.3
  Residues in disallowed regions                   2.8

RMSD from the mean structure (Å)
  Protein (bb, residues 3-58)                     1.38
  All heavy atoms (residues 3-58)                 1.95
  Protein (bb, all residues)                      1.85
  DNA (residues 68-78, 81-91)                     1.30
  Complex (residues 3-58, 68-78, 81-91)           1.81

A total of 30 conformers were calculated and the 20 structures with the smallest residual CYANA target function values were subjected to docking and energy minimization.
a The value given for the CYANA target function corresponds to the value before energy minimization (the CYANA target function is not defined after energy minimization, since the conformers no longer have ECEPP standard geometry).
b The value given represents the intra-protein interaction energy.
c For residues 3-58.
The superposition of the structures (Figure 3.5) demonstrates a well-defined tertiary structure for PITX2 bound to DNA. The structures have no distance violations greater than 0.3
Å, and only 1 angle violation greater than 5 degrees. Analysis of Ramachandran plots for the ensemble indicates that the structures generally show favorable backbone conformations within allowed conformational space, with 80.1% of the residues 3-58 within the most favored regions,
14.8% in additionally allowed regions, 2.3% in generously allowed regions, and 2.8% in disallowed regions for the 20 conformers (Table 3.9, Figure 3.6). The N- and C-termini are largely disordered. When superimposed, residues 3-58 have an average root-mean-square deviation (RMSD) from the mean structure of 1.38 Å for backbone (N, Cα, C’ and O), 1.95 Å for all heavy atoms, and 1.85 Å for the backbone when all residues are included. The global RMSD for all DNA heavy atoms (nucleotides 68-78, 81-91) is 1.30 Å. The RMSD for the entire complex (residues 3-58; nucleotides 68-78, 81-91) is 1.81 Å.
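For readers unfamiliar with the statistic, the following is a minimal sketch of how an average RMSD from the mean structure can be computed. It assumes the conformers have already been superimposed (the least-squares fitting step is omitted), and the function names and toy coordinates are hypothetical, not taken from the software used in this work.

```python
import math

def mean_structure(conformers):
    """Average each atom's x, y, z coordinates over all conformers."""
    n_conf = len(conformers)
    n_atoms = len(conformers[0])
    return [
        tuple(sum(conf[i][k] for conf in conformers) / n_conf for k in range(3))
        for i in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized atom lists."""
    squared = sum(
        (a[k] - b[k]) ** 2
        for a, b in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))

def mean_rmsd_from_mean(conformers):
    """Average, over conformers, of the RMSD to the mean structure."""
    mean = mean_structure(conformers)
    return sum(rmsd(conf, mean) for conf in conformers) / len(conformers)

# Toy ensemble (hypothetical coordinates in Å): two conformers, two atoms each.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
print(mean_rmsd_from_mean(toy_ensemble))
```

In the toy ensemble every atom lies 1 Å from its mean position, so the statistic is 1 Å. Restricting the atom list to a subset (for example backbone atoms of residues 3-58) gives the corresponding subset RMSD.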
Figure 3.5: Ensemble of structures of the PITX2 homeodomain/DNA complex. (a) Ensemble of 20 structures showing the protein backbone N, Cα, and C’ atoms and the DNA backbone. Helix 1 is colored pink, helix 2 green, helix 3 purple, and the DNA strands are coral. Superimposition was performed using backbone atoms from protein and DNA. (b) Alternate view of the structure, rotated by approximately 90 degrees.
Figure 3.6. Ramachandran plot, generated by PROCHECK [Laskowski et al., 1993, 1996], for the 20 structures in the ensemble, for residues 3-58; 80.1% of the residues are in the most favored regions, 14.8% in the additionally allowed regions, 2.3% in generously allowed regions, and 2.8% in disallowed regions.
Tertiary structure of the PITX2 homeodomain-DNA complex. The overall tertiary structure of the PITX2 homeodomain is similar to that of other homeodomains, supporting previous findings that this tertiary structure is conserved among homeodomains [Gehring et al, 1990; Gehring et al, 1994; Scott et al, 1989; Billeter, 1996]. The PITX2 homeodomain comprises three α-helices (Figure 3.7). Helix 1 (residues 10-20) is followed by a loop region, and then helix 2 (residues 28-37) runs antiparallel to helix 1. Helix 2 and helix 3 (residues 42-58) form a helix-turn-helix motif. Helix 3 is approximately perpendicular to helices 1 and 2, and fits into the major groove of the DNA. The N-terminus of the homeodomain makes contacts within the minor groove of the DNA.
Figure 3.7. Structure of the PITX2 homeodomain-DNA complex. (a) Mean structure of the homeodomain. Helix 1 is colored pink, helix 2 green and helix 3 is purple. The DNA strands are colored coral. (b) Ribbon diagram of the mean structure of the PITX2 homeodomain-DNA complex.
The helices of the PITX2 homeodomain are held together by a core of eight tightly packed hydrophobic amino acids (F8, L13, L16, F20, L40, V45, W48, and F49). These amino acids are either invariant (W48 and F49) or highly conserved in all homeodomains [Gehring et al, 1994; Qian et al, 1989; Kornberg, 1993]. In a threading analysis performed previously for the PITX2 homeodomain [Banerjee-Basu & Baxevanis, 1999], it was hypothesized that the tertiary structure of the PITX2 homeodomain would be similar to that of other homeodomains, mainly because many of the hydrophobic, aromatic amino acids present in other homeodomains are also present in the PITX2 homeodomain. That analysis threaded the PITX2 homeodomain sequence onto the Engrailed homeodomain structure, so the resulting tertiary structure is necessarily very close to that of Engrailed. The threading analysis did not provide a PDB file that we could analyze in detail, and it is not necessarily indicative of the true molecular structure. Its focus was the role of Rieger mutations in causing disease; it did not discuss the role of K50 in determining the DNA-binding affinity and specificity of the homeodomain, a question best addressed with an experimentally determined structure rather than a threading model. The study also did not analyze the K50 Rieger mutants, so there is no indication of the conformation of this side chain in that analysis. In the absence of an experimentally determined structure, the threading model was most useful for visualizing some of the intramolecular interactions that stabilize the tertiary structure, and the predicted interactions are consistent with our experimental data.
While we cannot compare our PITX2 tertiary structure directly to the threaded structure, we can compare it to EnQ50K and other homeodomain structures. In our experimentally determined structure, the first helix is closer to the second helix, as measured from the backbone nitrogen of L16 to the backbone nitrogen of I34, than in the EnQ50K, Antennapedia, wild-type Engrailed, Fushi tarazu, vnd/NK-2, and MATα2 homeodomains [Tucker-Kellogg et al, 1997; Billeter et al, 1993; Kissinger et al, 1990; Qian et al, 1994b; Gruschus et al, 1997; Wolberger et al, 1991]. With respect to this distance, PITX2 is an outlier compared to the other six homeodomains. The distance is 9.60-10.70 Å for the Antennapedia conformers, 9.43 Å for the EnQ50K crystal structure, 9.54 Å for wild-type Engrailed, 9.30-11.10 Å for Fushi tarazu, 8.67 Å for vnd/NK-2, and 10.9 Å for MATα2. For the 20 PITX2 conformers, the distance is only 7.55-8.58 Å, which is on average 1.8 Å shorter. Given our RMSD of 1.38 Å for the protein backbone atoms (residues 3-58), this result is still significant when compared with the ranges of distances seen in the structures of the other homeodomains. The difference is especially significant considering that the RMSD for helices 1 and 2 alone is only 0.78 Å. The range of L16-I34 distances for all of the other homeodomains together is 8.67-11.10 Å, and the PITX2 range lies completely outside it.
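The comparison of L16-I34 distances can be reproduced from the values quoted above. The sketch below uses the midpoint of each reported range as a representative value, which is an assumption of this example (the text does not state how the 1.8 Å average was obtained); the dictionary and function names are illustrative.

```python
# L16 N to I34 N distances (Å) quoted in the text, as (low, high) ranges.
# Single-structure values are stored as a zero-width range.
others = {
    "Antennapedia": (9.60, 10.70),
    "EnQ50K": (9.43, 9.43),
    "Engrailed": (9.54, 9.54),
    "Fushi tarazu": (9.30, 11.10),
    "vnd/NK-2": (8.67, 8.67),
    "MATalpha2": (10.9, 10.9),
}
pitx2 = (7.55, 8.58)

def midpoint(distance_range):
    """Midpoint of a (low, high) range; an assumed representative value."""
    low, high = distance_range
    return (low + high) / 2.0

others_min = min(low for low, _ in others.values())
others_max = max(high for _, high in others.values())

# PITX2 is an outlier if its whole range lies below the combined range.
pitx2_outside = pitx2[1] < others_min

mean_other = sum(midpoint(r) for r in others.values()) / len(others)
difference = mean_other - midpoint(pitx2)

print(others_min, others_max, pitx2_outside, difference)
```

Under this midpoint assumption the difference comes out at about 1.75 Å, consistent with the approximately 1.8 Å quoted, and the longest PITX2 distance (8.58 Å) is shorter than the shortest distance in any of the other structures (8.67 Å).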
In addition to the narrower distance between the first and second helices in the PITX2 structure, there are several other differences between the PITX2 and EnQ50K structures. In particular, the third helix of PITX2 is positioned about 0.5 Å lower (closer to the N-terminus of helix 1 and C-terminus of helix 2) than in EnQ50K (Figure 3.8). This difference in the orientation of the three helices causes slightly different contacts to be made between the first and third helices, and may provide an explanation for the decreased stability of this homeodomain. Unlike other homeodomains that are stable in the free form [Tsao et al, 1994; Qian et al, 1994b; Damante et al, 1994; Carra & Privalov, 1997; Otting et al, 1988; Yamamoto et al, 1992], the PITX2 homeodomain is unstable in the absence of DNA in that it irreversibly aggregates at micromolar concentrations, which suggests a possible lack of stable tertiary structure in the free form. This may be due to slightly different hydrophobic interactions within the core of the protein, and the absence of other stabilizing interactions such as the salt bridge linking residues 19 and 30, which can be present in most homeodomains [Iurcu-Mustata et al, 2001] but is not possible in PITX2.
One difference seen here is that F49, which is nearly invariant among homeodomains, points slightly upwards towards the loop region of the homeodomain, instead of pointing towards the interior of the protein. The altered orientation of the first helix in relation to the third would cause a steric clash with F49 if this side chain were oriented as in other homeodomains. While there is still an interaction involving F49 and F20 within the hydrophobic core of the PITX2 homeodomain, the orientations of the side chains themselves are different. This differing orientation may lessen the strength of the interaction between the first and third helices, which may affect the stability of the protein in the absence of DNA. This difference in orientation may be due to any number of differing residues between the two homeodomains (see Table 1.2). One possibility is a proline residue that is found in the loop region between helices 1 and 2 in PITX2, but is not present in Engrailed or Fushi tarazu.
Figure 3.8. Overlay of PITX2 homeodomain and EnQ50K homeodomain structures. Cyan corresponds to the structure of the PITX2 homeodomain, and black corresponds to the structure of the Engrailed mutant homeodomain. (a) Helices 1 and 2 are approximately 1.8 Å closer to each other in PITX2 than in other homeodomains. (b) Alternate view, rotated by approximately 90 degrees. Helix 3 is about 0.5 Å lower in PITX2 than in EnQ50K.
In the PITX2 structure, the N- and C-terminal segments -2 to 3 and 60 to 68 (Figure 2.1) appear disordered (Figure 3.5), which is to be expected given the lack of medium-range and long-range constraints for these residues. Relaxation analysis (Figure 3.9) indicates that residues -2 to 2 and 59 to 66 are more mobile in solution, explaining the observed disorder and lack of restraint information for these regions.
[Plot: 15N R2 (1/sec) versus residue number]
Figure 3.9: R2 relaxation rate constants for the PITX2 homeodomain. Low R2 rate values correspond to a higher mobility in that area of the protein. The N- and C-termini are seen to be very mobile, which corresponds to the disorder seen in these regions in the tertiary structure. Residues in the core of the protein that appear more mobile are found in the loop regions.
Our study also reveals structural information about the DNA when it is bound to the protein. Distance restraints obtained from the experiments described above for assigning the DNA were entered into AMBER during the docking procedure. Visual inspection of the structure of the PITX2 homeodomain-DNA complex indicates that there is a slight widening of the minor groove of the DNA compared to B-form DNA, and a concomitant narrowing of the major groove. Previous structures of protein-DNA complexes have indicated that changes in DNA structure are possible upon protein binding [Jones et al, 1999]. A more thorough, quantitative analysis of the DNA structure when PITX2 is bound will not be possible until a high-resolution structure of the DNA is determined using labeled DNA [Fernandez et al, 1999].
Protein-DNA recognition. Analysis of the filtered NOESY experiments produced 27 usable distance restraints between the protein and the DNA. These include contacts that have been seen in other biochemical and structural studies of homeodomains. Many of the residues that interact with the DNA are arginines, including R3 and R5 at the N-terminus, R31 in the second helix, and R46, R52 and R53 in the third helix (Figure 3.10). Other residues that were found to make DNA contacts are Y25 and F49. A number of NOESY peaks were also seen between K50 and the DNA, and these contacts are discussed further below.
Panel labels for Figure 3.10 (protein atom – DNA atom):
(a) T6 HG – A72 Q5’; R2 HA – T74 Q5’; R2 QD – T74 Q5’; R3 QD – A72 4’H; R3 Hε – A72 Q5’; R5 Hε – G89 8H; R5 HN – G89 8H
(b) Y25 HE – G81 Q5’; Y25 HE – G82 4’H; R31 QG – G82 Q5’
(c) V47 QG1 – A72 Q5’; V47 QG1 – A73 2’H; V47 QG1 – A73 2”H; V47 QG1 – T74 6H; V47 QG1 – T74 7H; W48 HA – A72 8H; R44 Hε – A73 8H; R44 QB – T74 4’H; R44 QD – T74 2”H
(d) R46 QD – G84 8H; F49 QB – G83 4’H; F49 QB – G83 Q5’; K50 HG2 – G83 Q5’; K50 HG1 – G83 Q5’; K50 HG1 – G84 Q5’; R53 HG2 – G83 4’H; R52 QB – T71 Q5’
Figure 3.10. Detailed view of the protein-DNA interface and protein-DNA contacts. Side- chains of the protein are illustrated in cyan. On the DNA, blue corresponds to guanine residues, green to cytosine, pink to adenine, and purple to thymine. (a) View of the protein-DNA NOE contacts between the N-terminus of the PITX2 homeodomain and the minor groove of the DNA. (b) View of the protein-DNA NOEs between Y25, R31, and the DNA. (c) and (d) View of protein-DNA NOE contacts between residues in the third helix and the major groove of the DNA.
A detailed picture of the protein-DNA interface is shown in Figure 3.10. This figure illustrates the orientations of some of the side chains that are important in DNA binding, particularly within the third helix. Figure 2.4 outlines the numbering of the DNA used in the following discussion. Figure 3.10a illustrates the protein-DNA NOE contacts seen within the N-terminal arm. NOE contacts were seen in the minor groove between R2, R3 and R5 and DNA residues A72, T74 and G89. Although the NOESY-derived distance constraints indicate contact between residues R3 and R5 and the minor groove, relaxation data (Figure 3.9) indicate that this region of the N-terminus retains some degree of mobility; similar results were reported for the Even-skipped homeodomain, based on refined atomic B-factors [Hirsch & Aggarwal, 1995]. Broad linewidths were observed for the backbone NH resonances of His7 and Phe8, which are indicative of slow timescale motions in this region of the homeodomain and could render NOEs from these residues to the DNA undetectable.
In the second helix, R31 has an NOE contact to G82 Q5’, as can be seen in Figure 3.10b. HBPLUS analysis [McDonald & Thornton, 1994] indicates that R31 makes a hydrogen bond with the backbone phosphate of this nucleotide. In the loop between helices 1 and 2, Y25 Hε makes NOE contacts with G81 and G82. In the third helix, V47 Qγ1 (Q refers to a pseudoatom representation) makes conserved NOE contacts to A72, A73 and T74. Residue W48 has an NOE contact between Hα and A72 8H. R44 makes contacts with DNA nucleotides A73 and T74. HBPLUS analysis indicates that R44 makes a hydrogen bond to the backbone phosphate of T74. Residues 44, 47 and 48 are illustrated in Figure 3.10c. Also in the third helix, R46 and R52 appear to make conserved contacts with the DNA backbone. R46 extends upwards, and R52 extends downwards, to make these contacts (Figure 3.10d). R46 Qδ has an NOE contact with G84 8H. R52 has an NOE contact with T71 Q5’. R53 makes an NOE contact with G83 4’H. All of these NOEs could be due to the close proximity of the atoms while the side chains form hydrogen bonds with backbone phosphate groups. NOEs are also seen between F49 Qβ and G83. K50 will be discussed further below, but as can be seen in Figure 3.10d, there are NOE contacts between the K50 side chain and atoms from G83 and G84.
Other residues that were found to be in close contact with the DNA, but for which no NOEs were seen in the NMR data, are N51, K55, R57, and K58. N51 is nearly invariant among homeodomains [Kornberg, 1993] and is found here to make the same highly conserved interaction within the major groove with base A73. This residue has been shown in crystal structures to form a pair of hydrogen bonds with this adenine at the N7 and N6 positions, while NMR studies have indicated possible rapidly interchanging conformations [Tsao et al, 1994; Qian et al, 1993]. NMR studies have shown this close interaction, but NOEs are not seen, possibly due to line-broadening effects [Qian et al, 1993]. While NOEs are not seen between K55, R57 or K58 and the DNA, HBPLUS analysis of the complex indicates that interactions are possible. K55 may be forming a hydrogen bond with the phosphate of T71. R57 may be contacting the phosphate of G84. K58 may be contacting the phosphate of C70. Given the usual sensitivity limitations of the edited/filtered NMR experiments employed to identify intermolecular NOEs, it is quite likely that a number of anticipated NOEs fall at or below the threshold for detection.
The role of lysine at position 50. No previous structures have been described for any native K50 class homeodomains. However, the X-ray crystal structure of the Q50K mutant of the Engrailed homeodomain bound to DNA has been reported [Tucker-Kellogg et al, 1997], and the side chain of K50 was found to project into the major groove of the DNA, making hydrogen bond contacts with the O6 and N7 atoms of the guanines at base pairs 5 and 6 of the complementary strand of the TAATCC binding site. Our structure of the PITX2 homeodomain marks the first experimentally determined structure of a native K50 class homeodomain, and is important for validating results seen in the studies of non-native proteins. When binding to the consensus site, the position of K50 is very similar to that seen in the EnQ50K structure, with the side chain of K50 extending outward and making contacts with the two guanines adjacent to the
TAAT core sequence on the antisense strand (Figure 3.10d). The Nζ of the K50 side chain is likely making hydrogen bond contacts to the O6 and N7 atoms of G83 and G84, according to analysis by HBPLUS [McDonald & Thornton, 1994].
NMR allows one to elucidate information about the mobility of the protein backbone and side chains. A key finding in the present study was that the side chain of K50 potentially mediates recognition by fluctuating between multiple conformations. The conformational heterogeneity can be seen in Figure 3.11a. This preliminary evidence is based on averaging of
NOEs and broadening of resonances for this residue. The averaging of NOEs was dealt with as ambiguous distance constraints within the structure calculation in CYANA, and these constraints were satisfied in all structures of the family. When peaks from an H(CCO)NH-TOCSY experiment are compared between the K50 and K58 side chains (Figure 3.11b), peaks are easily seen for the K58 side chain resonances, but only the HA resonance is seen for the K50 side chain. The extra peaks in the K50 strip of Figure 3.11b are peaks from another residue on a different nitrogen plane that are strong enough to show up as residual peaks on this plane. The broadening of resonances for this side chain made it difficult to assign using typical heteronuclear-edited NMR spectra. Instead, assignments were made using NOESY spectra and eliminating assignments from nearby residues, until only K50 resonances were left. In principle, it is possible that the line-broadening of K50 side chain resonances could be caused by ring current effects from aromatic bases in the DNA, or by mobility of other nearby protons in the
DNA binding site. However, no anomalous line-broadening was observed for DNA proton resonances in the vicinity of the K50 side chain. In addition, results similar to those reported here have been seen in other DNA-binding proteins in which side chain mobility appears to cause line broadening of resonances (vide infra) [Tsao et al, 1994; Qian et al, 1993; Foster et al,
1997; Nishikawa et al, 2001]. These results, in combination with the multiple conformations observed for K50 in EnQ50K, provide compelling evidence that K50 is mobile. Backbone dynamics (Figure 3.9) did not show anything unusual for the backbone of K50, although this does not rule out motion of the side chain. Some degree of side chain mobility at the protein-DNA interface would be expected to confer an entropic advantage for binding to the DNA. It has been estimated previously that the entropic cost of keeping a lysine side chain static during binding is 3 kcal mol-1 [Doig & Sternberg, 1995]. This possible entropic component cannot be assessed until a detailed thermodynamic study is performed for this complex. This hypothesis of K50 side chain mobility will be explored further in the future, but for now, it is complementary to the data for the EnQ50K mutant [Tucker-Kellogg et al, 1997].
The crystal structure indicates that there are two alternate conformations for the K50 side chain, one in which the side chain points to base pairs 5 and 6, and one in which the side chain is oriented slightly more towards base pair 5. It must be pointed out that this X-ray structure was solved at cryogenic temperatures, so it is possible that a subset of conformational populations was frozen out. It is also possible that these results indicate two static conformations for this side chain of EnQ50K, rather than a dynamic fluctuation between two conformations, as indicated by the B-factors for the K50 side chain in the mutant. The B-factors in this case provide no evidence for distinguishing between these possibilities. The B-factors are low for the side chain of K50 in the 1.9 Å crystal structure of EnQ50K, varying over the range 20.8 to 23.6, which are the lowest values in the protein, aside from the aromatic ring of F49. B-factors of about 20 indicate uncertainties of about 0.5 Å. Typically, B-factors of 60 or greater in high-resolution crystal structures indicate possible mobility of a side chain. So, according to the crystal results, the side chain position of K50 is well-defined in the crystal, in contrast to the possible mobility of the K50 side chain seen in our results. The true nature of the side chain may involve a combination of the states revealed by the two different experimental approaches, so that the K50 side chain has two predominant conformations, and fluctuates between these alternatives.
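The correspondence between a B-factor of about 20 and a positional uncertainty of about 0.5 Å follows from the standard isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square atomic displacement. The short sketch below applies this relation to the values discussed above; the function name is illustrative, and the isotropic treatment is an assumption of the example.

```python
import math

def rms_displacement(b_factor):
    """RMS displacement (Å) from an isotropic B-factor (Å^2),
    using B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# B-factor values discussed in the text for EnQ50K K50, plus the
# threshold of 60 taken there as suggestive of side chain mobility.
for b in (20.0, 20.8, 23.6, 60.0):
    print(b, round(rms_displacement(b), 2))
```

A B-factor of 20 corresponds to an RMS displacement of about 0.5 Å, whereas a B-factor of 60 corresponds to roughly 0.9 Å.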
Figure 3.11. The K50 side-chain may be mobile. (a) View of the 20 conformers, with only the K50, G83, and G84 backbone and side-chain atoms shown to illustrate the extent of disorder of the K50 side-chain, implying possible mobility of this side-chain in interacting with the DNA. K50 atoms are shown in blue, and G83 and G84 atoms are shown in pink. Backbone atoms are bolder than side-chain atoms. (b) Strips from an H(CCO)NH-TOCSY spectrum showing proton resonances for the side-chains of K58 and K50. Line-broadening of resonances in the K50 side-chain is indicative of possible motion of this side-chain.
Although a more detailed characterization of the side chain dynamics in the PITX2 HD-DNA interface must await future data, substantial support for our observation of flexibility in the K50 side chain already exists from studies of related systems. Significant broadening of side chain resonances at the protein-DNA interface was observed in studies of homeodomain-DNA complexes of Antennapedia [Qian et al, 1993] and NK-2 [Tsao et al, 1994]. Moreover, flexibility in lysine side chains appears to be a significant feature of various modes of protein-DNA interactions. Foster and co-workers [Foster et al, 1997] have reported clear indications of substantial conformational fluctuations in lysine side chains in the interface of the zinc-finger protein TFIIIA with its DNA binding site, including the observation of broadened resonances and multiple NOE contacts that strongly suggest rapid conformational averaging. Significant line-broadening effects were also reported for a lysine side chain in NMR studies of the telomeric DNA complex of trf1 [Nishikawa et al, 2001]. In addition to NMR studies, molecular dynamics simulations of wild-type [Billeter et al, 1996] and a Q50K mutant [Gutmanas & Billeter, 2004] of the Antennapedia homeodomain bound to DNA provide further evidence in support of a dynamic homeodomain-DNA interface. For example, the Q50K simulations indicated that the side chain of K50 exhibited very pronounced mobility, with several arrangements of the lysine side chain torsion angles allowing for frequent contacts, both hydrogen bonds and hydrophobic interactions, with base pairs 5 and 6 in the TAATCC binding site. In this case, the lysine in the Q50K mutant provides both entropic and enthalpic contributions to protein-DNA affinity. A general observation arising from the known structures of homeodomain-DNA complexes is that the region of position 50 is not in intimate contact with the bases of the major groove. Such a relatively unrestrained arrangement allows for relatively long-range contacts to be formed in multiple, possibly isoenergetic ways.
Previous studies have shown that the lysine at position 50 is critical for binding to the TAATCC DNA binding site [Kornberg, 1993]. In contrast, homeodomains with a glutamine at position 50 bind to TAATGG sites with a higher affinity. The glutamine at position 50 appears to have a more modest role. When this residue is mutated to an alanine, the Q50A mutant has an affinity and specificity very similar to those of the wild-type protein, but when it is mutated to a lysine, the specificity changes [Fraenkel et al, 1998]. These studies, along with the current results, indicate that the interaction between K50 and the two guanines at positions 5 and 6 is vital to the affinity and specificity of the protein. The current model for specific homeodomain-DNA interactions consists of a fluctuating network of hydrogen bonds formed between polar groups of the protein and the DNA, and the interfacial water [Billeter et al, 1996]. These interactions are further complemented by hydrophobic contacts. The possible fluctuating hydrogen bonding interactions between K50 and the DNA, and the resulting strict specificity of this class of homeodomains, are consistent with this model. Investigation of side chain-base interactions has shown that lysine-guanine interactions are very common [Mandel-Gutfreund et al, 1995]. K50 homeodomains may have such a strong specificity for the TAATCC site because the lysine is ideally positioned for its charged group to make hydrogen-bonding contacts with the two guanines.
In contrast, these hydrogen bonds cannot be made with cytosines, which are in these positions for the Q50 binding site TAATGG [Mandel-Gutfreund et al, 1995]. The N7 of guanine is the most electronegative region of the major groove [Saenger, 1984], and the favorable interactions that the lysine can make with both guanines in a mobile model may describe why K50 homeodomains are so specific for the TAATCC binding site, rather than other binding sites.
Analysis of residues mutated in Rieger syndrome. There have been 9 missense mutations found in the PITX2 homeodomain in Rieger syndrome and related disorders [Semina et al,
1996a; Priston et al, 2001; Kulak et al, 1998; Saadi et al, 2001; Heon et al, 1995; Alward et al,
1998; Chisholm & Chudley, 1983; Walter et al, 1996; Murray et al, 1992; Quentien et al, 2002b].
These mutations, along with their known biochemical effects, are listed in Table 1.1. The consequences of these mutations vary. Some mutant proteins lack DNA binding entirely, while others can still bind DNA, albeit with a decreased affinity. These consequences are directly reflected in the severity of the disease. A model of the PITX2 homeodomain structure was created previously by threading analysis, which allowed predictions to be made regarding the role of Rieger syndrome mutations in PITX2 dysfunction, though it is not necessarily an indication of the true molecular structure [Banerjee-Basu & Baxevanis, 1999].
The orientations of the side chains altered in Rieger syndrome patients are shown in
Figure 3.12. Analyzing these orientations provides insights into the role of each side chain, and how mutations in these positions could alter the structure and function of the protein. Future studies will focus on analyzing the mutant proteins by NMR spectroscopy. The side chain of highly conserved L16 points towards the interior hydrophobic core of the protein, and is probably involved in stabilizing both the formation of this core and the overall tertiary structure of the protein; the L16Q mutation would therefore be expected to destabilize or disrupt this hydrophobic core. The side chain of T30 extends outward from the second helix, away from the
DNA, so it does not appear to play a role in DNA recognition. Biochemical studies have shown that this mutant can still bind consensus DNA, but no longer activates transcription of a reporter gene [Amendt et al, 1998]. This residue may perform an activation function by interacting with other proteins, which could easily be disrupted by the effects of the proline mutation. An interesting observation is that in many homeodomains, residue 30 is involved in a salt bridge to residue 19, whereas this is not possible for PITX2. The side chain of R31, as described above, appears to contact the DNA backbone phosphate of G82. Therefore mutating this residue, even to another positively charged residue, may disrupt this interaction with the DNA and may disrupt a possible salt bridge with E42. The histidine side chain at this position in the mutant may not have favorable steric interactions with the DNA. The side chain of V45 points towards the
interior of the protein from the third helix. Like L16, this side chain appears to be involved in formation of the hydrophobic core of the protein. Unlike the L16Q mutant, the V45L mutant has the unusual characteristic of having a greatly heightened activation function, while having a reduced DNA-binding ability. It is possible that this mutant affects the protein in a way that alters these two functions separately, with a different fold of the protein that allows for a more efficient interaction with other proteins. For example, altered interactions of the PITX2 homeodomain with the C-terminal tail of the full-length PITX2 protein could have differential effects on DNA binding and activation [Amendt et al, 1999]. The DNA-binding functions of
R46, K50, R52 and R53 were discussed in detail above. Mutating these residues would disrupt many favorable interactions with the DNA, and biochemical studies have indicated these mutations interfere with DNA binding. Overall, these results are similar to the threading analysis, but provide a more direct and detailed understanding of the roles of these residues.
Figure 3.12. Ribbon diagram of the PITX2 homeodomain/DNA complex showing the positions of the side-chains for the residues known to be mutated in Rieger Syndrome and related disorders.
Many of the residues in the PITX2 homeodomain found to be altered in Rieger syndrome are involved in contacting the DNA. Other residues are involved in forming the hydrophobic core of the protein, which stabilizes the global fold. The analysis of mutations causing structural changes could be very relevant for the understanding and prediction of dysfunctions caused by mutations in homeodomains, as several homeodomains are known to be involved in various diseases [Boncinelli, 1997; Muragaki et al, 1996; Nakamura et al, 1996; D’Elia et al, 2001;
Borrow et al, 1996].
Concluding Remarks:
The structure previously determined for the Engrailed Q50K mutant [Tucker-Kellogg et al, 1997] provided some interesting insights into the possible role of lysine at position 50. The presence of hydrogen bonds between position 50 and the DNA had not been seen previously.
But many questions remained unanswered concerning the role of lysine in a native K50 homeodomain. For example, the Engrailed mutant has a dissociation constant of 0.0088 nM
[Tucker-Kellogg et al, 1997], representing an unusually high affinity for homeodomain-DNA interactions. Previous studies have indicated that proteins with excessively high affinities for
DNA or RNA can cause functional defects [Watanabe & Lambowitz, 2004; Monsalve et al,
1998]. The unusually high affinity of EnQ50K for DNA suggests that it may have properties that make it different from natural K50 homeodomains. Unlike the Engrailed mutant, the native
K50 class homeodomains PITX2 and Bicoid have properties that make them unstable in free forms, and have affinities within the normal nanomolar range [Amendt et al, 1998; Ma et al,
1996]. When DNA is not present, these proteins will irreversibly aggregate and precipitate out of solution at micromolar concentrations. These differences in biochemical properties between
the mutant and natural K50 proteins suggest the importance of understanding the structural properties of lysine at position 50 in the context of a native K50 class protein.
But the question still remains as to what causes these differences. The authors of the
EnQ50K structure found that the mutant bound to DNA more tightly and specifically than did the native protein [Tucker-Kellogg et al, 1997]. They hypothesized that this was due to very specific hydrogen bonds between the K50 side chain and the guanines at positions 5 and 6 on the anti-sense strand. In our study, we found that the native K50 homeodomain PITX2 has a slightly different tertiary structure, with helix 1 being closer to helix 2 than in other homeodomains, including the EnQ50K mutant. Helix 3 is angled about 0.5 Å closer to the N-terminus of helix 1 and C-terminus of helix 2 than EnQ50K. This appears to cause a difference in the way that helix
1 and helix 3 can interact, and previous studies have shown that this interaction between the helices stabilizes the global fold of the homeodomain [Gehring et al, 1994; Qian et al, 1989;
Kornberg, 1993]. Another Q50K mutant, this time of Fushi tarazu, is unable to bind non-consensus DNA sites that PITX2 and Bicoid are able to recognize [Zhao et al, 2000]. It is currently unknown whether the Engrailed mutant can bind non-consensus sites. These differences in affinity and specificity may involve any of the differing residues between these homeodomains. Positions 50 and 54 have been shown to be involved in recognizing non-consensus DNA sites [Pellizzari et al, 1997], and it is possible that other residues are also involved. Within the third helix, position 52 of Engrailed is a lysine. In PITX2, Bicoid and
Fushi tarazu, this residue is an arginine. We do not know whether having lysine residues at both positions 50 and 52 could contribute to the unnaturally tight binding of EnQ50K, but this is a possibility.
The current study of the solution structure of the PITX2 homeodomain reveals possible fluctuating interactions between position 50 and the DNA. It is possible that this mobile side chain may allow the protein to sample multiple DNA binding sites, and bind to the non-consensus sites, though at a slightly lower affinity. It will be interesting in the future to determine if other natural K50 class proteins share similar properties with PITX2. Future studies will focus on analyzing Rieger mutants of the PITX2 homeodomain, and analyzing the structural features of this protein when bound to non-consensus DNA binding sites. This will allow a greater understanding of the roles of specific residues in consensus and non-consensus DNA binding, and a greater understanding of how proteins can recognize multiple DNA sites to activate transcription of genes.
CHAPTER 4: DNA Recognition by the Human PITX2 Homeodomain: Molecular Dynamics Simulation of Wild-Type and Rieger Mutant Complexes
Introduction
Transcription, replication, recombination, and repair of genes are processes essential to cellular function. These activities all require the non-covalent interaction of DNA-binding proteins with DNA. While there have been a large number of structural, functional and thermodynamic studies reported, molecular recognition between protein and DNA is complex and not yet fully understood. Among the mechanisms that affect protein-DNA recognition are hydrogen bonding, hydration, conformational changes, electrostatic effects, and changes in dynamics. While much has been learned about some of these mechanisms, others, particularly dynamics, require further study. Molecular dynamics (MD) simulations are a useful tool for examining the protein-DNA interface. These simulations are routinely performed on proteins, nucleic acids, and macromolecular complexes to record trajectories with a length in the nanosecond range [Hansson et al, 2002]. MD simulations have also been used in the past to study homeodomain-DNA interactions [Billeter et al, 1996; Duan & Nilsson, 2002; Gutmanas &
Billeter, 2004]. These studies looked at the importance of water at the protein-DNA interface and the role of residue 50 in discriminating between different DNA sequences. Another recent application of MD simulations has been in examining the properties of mutant proteins that are involved in human disease [Wu et al, 2003].
While MD simulations have been performed on Q50 homeodomains in which the glutamine was replaced with a lysine residue [Gutmanas & Billeter, 2004; Duan & Nilsson,
2002], these simulations have not been performed on a native K50 homeodomain structure.
Preliminary data have indicated that K50 may be mobile, based on line broadening, which indicates possible motion on a µs-ms timescale (see Chapter 3). Previous MD simulations on an
Antp Q50K mutant have indicated that the lysine may fluctuate between the two guanine residues adjacent to the TAAT core, preferring to spend 1 ns at each guanine before switching
[Gutmanas & Billeter, 2004]. NMR studies [Tsao et al, 1994] and molecular dynamics simulations [Billeter et al, 1996; Gutmanas & Billeter, 2004] have provided strong indications of a dynamic, fluctuating environment encompassing some of the key amino acid side chains at the interface, most importantly, the side chains of asparagine 51 and of the position 50 residue.
Billeter and co-workers [1996] proposed that, at least in the case of Antennapedia, the homeodomain achieves specificity through a fluctuating network of short-lived contacts that allow it to recognize DNA without the entropic cost that would result if side chains were immobilized upon DNA binding. It will be interesting to look at MD data on a native K50 structure, to determine how K50 behaves in its natural context (refer to Table 1.2).
The study by Wu et al [2003] used the Generalized Born model to represent solvent effects, rather than an explicit water model. In this study, we used explicit water in comparing mutant complexes to the wild-type complex and in looking at differences in hydration.
Hydration water is predominantly on the macromolecular surface, but a small number of water molecules may be located in the interior of complexes. In looking at protein-DNA complexes at an atomic level, previous studies have shown that contacts between DNA and protein can be explained in terms of direct hydrogen bonds, water-mediated hydrogen bonds, van der Waals, electrostatic, and hydrophobic contacts [Jayaram & Jain, 2004; Reddy et al, 2001]. Water molecules are observed quite often at the interface between protein and DNA in crystal structures, the classic case being the trp repressor-operator complex [Janin, 1999; Otwinowski et al, 1988; Schwabe, 1997]. These studies presented the idea of water molecules acting as extensions of protein side chains in mediating interactions with DNA. NMR studies of the Antp
homeodomain indicate that at least two side chains at the protein-DNA interface are close to water molecules [Fraenkel and Pabo, 1998]. A molecular dynamics simulation of the complex in a water bath implies the presence of up to five water molecules in the cavity at the interface between the recognition helix and the DNA [Billeter et al, 1996]. Water molecules at the protein-DNA interface are also visible in X-ray crystal structures at high resolution [Hirsch &
Aggarwal, 1995; Li et al, 1995; Wilson et al, 1995]. In the paired (S50) structure, position 50 forms hydrogen bonds to two water molecules, which then hydrogen bond to DNA bases
[Wilson et al, 1995]. This interfacial water is likely to allow for mobility of the protein side chains at the interface. Water-mediated contacts between protein and DNA have been seen in many other protein-DNA complexes [Kosztin et al, 1997; Davey et al, 2002; Chai et al, 2003;
Chiu et al, 2002]. In homeodomains, these water-mediated contacts are considered an essential contributor to specificity [Billeter, 1996; Wolberger, 1993]. Molecular dynamics simulations have the capability of determining where water-bridging interactions are occurring, and what the residence times of water molecules at the protein-DNA interface are. The water molecules allow for amino acids that are otherwise out of reach of the bases to contribute to a network of hydrogen bonds that is believed to be important for the specificity of DNA recognition
[Schwabe, 1997]. Both the protein and DNA specifically recognize each other's hydration pattern, and there appears to be a fluctuating network of bonding interactions between protein,
DNA and water molecules. This may reduce the entropic cost of forming a rigid interface.
The mutations found in the PITX2 homeodomain in Rieger syndrome have been presented in previous chapters. A previous study presented a threading analysis of some of the
Rieger mutant PITX2 homeodomains [Banerjee-Basu & Baxevanis, 1999]. This study was able to predict the structures of the mutants, but was unable to provide detailed information about
differences in protein-DNA recognition, including differences in specific contacts, and differences in hydration of the protein-DNA interface. In addition, this threading analysis found that the structures of the mutants did not vary significantly from wild-type, so the differences in binding and activation may be due to the properties mentioned above. MD study of these mutants can provide further insight into the specific structural role of each residue and into the cause of Rieger syndrome and other diseases involving DNA-binding proteins.
In this study, MD simulations were performed on the wild-type PITX2 homeodomain-
DNA complex, and the Rieger mutant complexes in H2O. The results indicate that motion of the
K50 side chain is on a time scale longer than what we can see by MD simulations, which is in contrast to results seen for a Q50K mutant. The results also show differences in DNA recognition between wild-type and mutant complexes. The role of hydration in this differential
DNA recognition is discussed.
Overall Behavior of the Trajectories
All of the trajectories were analyzed to ensure that the simulations proceeded properly, without huge variations in overall energy or, especially in the case of the wild-type complex, large variations in RMSD over the course of the simulations. This analysis provides a validation of the calculations. The RMSDs of the complexes versus the simulation time are shown in
Figure 4.1. These RMSDs are for both the protein and the DNA in the complex. The RMSDs of all of the complexes stay very close to the starting structure. The wild-type complex only drifts from the starting complex by about 0.55 Å, while the mutant complexes differ by no more than
0.75 Å. The RMSDs go up and then level off, indicating that the structures settle into a stable structure and then this stable structure remains intact for the remainder of the simulations. The stability of the energy of the systems versus simulation time was analyzed for all of the
trajectories, and this is shown in Figure 4.2. The total energies of each of the systems remain quite low during the course of the simulations. The energies fluctuate slightly during the simulations due to thermal fluctuations. It is interesting to note that while the wild-type and most of the mutant complexes have similar total energies, the energy of the R52C complex is significantly higher. The energy of the V45L complex is also slightly higher. These results will be discussed further below.
[Figure 4.1 plot: RMSD (Å) versus Time (ps)]
Figure 4.1: RMSD values of the MD snapshots versus the starting NMR structure for all of the residues of both the protein and the DNA, using the protein backbone and the heavy atoms of the DNA bases, as a function of simulation time. The color scheme is as follows: wild-type complex is black, T30P is red, R31H is green, V45L is blue, R46W is magenta, K50Q is brown, R52C is yellow, and R53P is violet.
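The RMSD traces in Figure 4.1 can be illustrated with a minimal sketch. It assumes that each snapshot has already been superimposed on the starting structure and that the selected atoms (protein backbone and DNA base heavy atoms) are supplied as equal-length lists of (x, y, z) coordinates in Å. The function name and the toy coordinates are hypothetical, and this is not the analysis program used for the simulations.

```python
import math

def rmsd(reference, snapshot):
    """Root-mean-square deviation (in Å) between two pre-aligned coordinate sets."""
    if len(reference) != len(snapshot) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(reference, snapshot):
        # Squared displacement of one atom between the two structures.
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(reference))

# Toy example (assumed data): three atoms, each displaced by 0.5 Å along x.
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (3.5, 0.0, 0.0)]
print(round(rmsd(start, frame), 2))  # 0.5
```

Evaluating this function for every saved snapshot against the starting structure would give one RMSD value per time point, which is the kind of trace plotted against simulation time.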
[Figure 4.2 plot: Etot (kcal/mol) versus Time (ps)]
Figure 4.2: Total energy levels of the MD snapshots as a function of simulation time. The color scheme is the same as in Figure 4.1.
Analysis of the Molecular Dynamics of the Wild-Type PITX2 HD-DNA Complex
As discussed above, there have been no molecular dynamics simulations performed previously for any native K50 homeodomain structure. The NMR structure of the PITX2 homeodomain is the first structure to be solved of a native K50 class homeodomain (See Chapter
3), and single amino acid substitutions in this domain are known to cause Rieger syndrome.
Therefore, to gain a better understanding of how these mutations cause disease, we performed
MD simulations on the wild-type and mutant complexes. To form a basis for comparison of the mutants, the results of the wild-type simulation were analyzed first. The average structure of the wild-type PITX2 HD-DNA complex during the simulation was analyzed using the program
NUCPLOT [Luscombe et al, 1997] to obtain a list of protein-DNA contacts seen in this complex during the MD simulation. A summary of these contacts can be seen in Figure 4.3. These contacts are discussed in great detail in Chapter 3. The unique nature of K50 must be
emphasized again, in that it forms hydrogen bonds with the O6 and N7 atoms of the two guanines on the antisense strand, adjacent to the TAAT core DNA binding site. R31, R46, R52 and R53, which are all mutated in Rieger syndrome, are also shown to make contacts with the
DNA in the wild-type complex.
Figure 4.3: NUCPLOT diagram of the average structure of the wild-type protein-DNA complex during the 2 ns trajectory, showing the contacts between protein side-chains and specific DNA bases. Atom names correspond with the nomenclature used in the PDB file. NZ corresponds to Nζ, NH1 and NH2 to Nη1 and Nη2 (written as Nε1 and Nε2 in the text), OG1 to Oγ1, ND2 to Nδ2, and NE2 to Nε2.
Hydration and water-mediated protein-DNA contacts.
When a protein recognizes and binds to a DNA molecule, it is recognizing not only the
DNA itself, including its charge and sequence properties, but it is also recognizing the hydration pattern of the water that is on the surface of the DNA in its natural solution-state environment.
In examining the molecular dynamics of a molecular system, water plays an important role in mediating the interplay between various members of the system. In a protein-DNA interaction, water will penetrate the various crevices of the protein and the DNA and, even more importantly, can enter the protein-DNA interface and mediate contacts between the protein and the DNA. These contacts are referred to as water bridges, and are vitally important in mediating the binding of the protein to the DNA.
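A purely geometric reading of a water bridge can be sketched as follows: a water oxygen counts as bridging when it lies within a distance cutoff of both a protein atom and a DNA atom in the same snapshot. The 3.5 Å cutoff, the function name, and the atom labels are assumptions made for illustration; the contacts reported in this chapter were obtained with NUCPLOT, whose criteria may differ.

```python
import math

def find_water_bridges(protein_atoms, dna_atoms, water_oxygens, cutoff=3.5):
    """Return (protein_label, water_index, dna_label) for every water oxygen
    lying within `cutoff` Å of both a protein atom and a DNA atom."""
    bridges = []
    for w_index, w_xyz in enumerate(water_oxygens):
        near_protein = [label for label, xyz in protein_atoms.items()
                        if math.dist(xyz, w_xyz) <= cutoff]
        near_dna = [label for label, xyz in dna_atoms.items()
                    if math.dist(xyz, w_xyz) <= cutoff]
        # Each protein/DNA pair sharing this water is reported as one bridge.
        for p_label in near_protein:
            for d_label in near_dna:
                bridges.append((p_label, w_index, d_label))
    return bridges

# Toy snapshot (assumed coordinates in Å, hypothetical labels).
protein = {"R52 NH1": (0.0, 0.0, 0.0)}
dna = {"T71 OP1": (5.0, 0.0, 0.0)}
waters = [(2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(find_water_bridges(protein, dna, waters))
# [('R52 NH1', 0, 'T71 OP1')]
```

A stricter definition would also test donor and acceptor identities and hydrogen-bond angles; the distance-only test is used here to keep the idea visible.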
Any MD analysis of a protein-DNA complex must take into account the role of hydration, in order to gain a full understanding of the dynamics of the complex. The hydration of the wild-type complex can be seen in Table 4.1. The term Nw refers to the average number of water molecules in the vicinity (within 3 Å) of the given residue and side chain. The term Nps refers to the maximum number of picoseconds that a given water molecule resides in the vicinity of the residue. This reflects water residence times. These numbers can be used to describe the presence of long-lived water molecules and increased hydration of the complex. The positions of the various side chains mutated in Rieger syndrome and discussed below are illustrated in
Figure 3.12 for the NMR structure of PITX2. We investigated the hydration of these side chains in each simulation because all of these residues serve important roles in the homeodomain’s structure and function, and we wanted to determine whether altering one residue can cause differences in hydration at the other residues. A view of the
positions of the side chains of interest can be seen in Figures 3.12, 4.4, and 4.6. Hydration at positions 16 and 45 was not included in the table, as these residues are shielded within the hydrophobic core, and there was virtually no hydration seen near these residues in any of the simulations. The results for the wild-type complex show that the water molecules in the vicinity of R52 and R53 have a long maximum lifetime in this trajectory. Further analysis by NUCPLOT
[Luscombe et al, 1997] indicates that there are many possible water bridges involving these two residues. R52 makes water-mediated contacts to the phosphates of T71 and A72 in this trajectory. The R52 side chain interacts with the DNA backbone and is not within the major groove. This allows for more water molecules to be in the vicinity of this side chain. Water bridges are also present between R53 Nε1 and Nε2, and T86 (O4 and N3 atoms) and A85 (N1 and
N6 atoms). The R53 side chain is in the major groove, and less water can access this side chain, which would explain why there are fewer water molecules near this side chain than near R52. K50 has a relatively small number of water molecules in close proximity, and these have much shorter maximum lifetimes. Because K50 is believed to make direct hydrogen bonds with the O6 and
N7 atoms of G84 and G83, water-mediated contacts probably do not play as large a role as in Q50 homeodomains. T30 is located in helix 2. Biochemical studies have indicated that this residue may be involved in activation, rather than DNA binding. It is relatively accessible to water, and these water molecules tend to have short residence times. R31 is also located in helix
2, and makes contacts with the DNA backbone. A water bridge was seen in the NUCPLOT analysis for this residue’s Nε1 with the phosphate of G81. R46 is located at the major groove of the DNA and makes contacts with the phosphate backbone. It has a small number of water molecules in contact with it, and these molecules appear to have short residence times. A water bridge is seen between R46 Nε1 and the phosphate of DNA residue G83, and this bridge is short-lived. A summary of some of the water at the interface can be viewed pictorially in Figure 4.4.
As shown there, the water molecules are present between the protein side chains and the DNA. A
view of the trajectory of a single water molecule can be seen in Figure 4.5, to give a better idea
of the pathway a water molecule can follow during the course of a trajectory. Within the 2 ns
trajectory, this particular water molecule starts well away from the complex, and ends up within
the major groove between the protein and DNA, before leaving again.
Table 4.1. Hydration in the WT and mutant trajectories
Residue      WT         T30P        R31H        V45L        R46W        K50Q        R52C        R53P
            Nw   Nps    Nw   Nps    Nw   Nps    Nw   Nps    Nw   Nps    Nw   Nps    Nw   Nps    Nw   Nps
50        2.81   230  6.34   880  3.20  1050  3.56   570  1.52   510  2.39   360  2.98   600  5.74  1360
30        5.81   110  6.91    20  4.69   140  6.20   110  5.28   100  5.98    80  8.12   220  7.64    90
31        3.89   790  5.98  1350  2.89  1950  2.84   350  7.31  1850  5.29  1130  5.21  1540  3.43  1890
46        2.58   110  4.27   510  3.89  1950  2.86   210  8.63   120  4.08   180  2.80   890  3.37  1310
52        8.01  1980 10.22  1740  8.54  1130  7.56  1990  7.51   200  8.44  1970  4.36   370  6.96  1740
53        3.71  1990  8.18   700  2.46   480  7.44   990  2.83   430  3.63   360  3.55   460  6.26   110
a Water molecules are within 3.0 Å of an atom from the protein side chains listed in this column.
b For each complex the average number (Nw) of water molecules in the vicinity of the given protein side chain and the maximum number of picoseconds (Nps) that a given water molecule resides in this vicinity are given.
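The two quantities reported in Table 4.1 can be illustrated with a minimal sketch. It assumes that, for each saved snapshot, the identifiers of the water molecules within 3.0 Å of the side chain have already been collected into a set, and that "resides" means uninterrupted presence across consecutive snapshots. That interpretation, the snapshot spacing, and the function name are assumptions and are not taken from the analysis software used here.

```python
def hydration_stats(frames, ps_per_frame):
    """frames: list of sets of water IDs within the cutoff in each snapshot.
    Returns (Nw, Nps): mean number of waters per snapshot and the longest
    continuous stay of any single water molecule, in picoseconds."""
    if not frames:
        raise ValueError("need at least one frame")
    n_w = sum(len(present) for present in frames) / len(frames)
    longest = 0
    current = {}  # water ID -> length of its ongoing run, in frames
    for present in frames:
        # Waters still present extend their run; absent waters are dropped.
        current = {w: current.get(w, 0) + 1 for w in present}
        if current:
            longest = max(longest, max(current.values()))
    return n_w, longest * ps_per_frame

# Toy trajectory (assumed data): 4 snapshots saved every 10 ps.
frames = [{1, 2}, {1}, {1, 3}, {3}]
print(hydration_stats(frames, 10))  # (1.5, 30)
```

In the toy trajectory, water 1 is present in three consecutive snapshots, so the maximum residence time is 30 ps, while the average occupancy is 1.5 water molecules per snapshot.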
Figure 4.4: Outline of some of the water molecules at the protein- DNA interface that are in close contact with residues R31, K50 and R53, and form water-bridging interactions between the protein (particularly R31 and R53) and the DNA.
Figure 4.5: Snapshots of a single water molecule’s trajectory during the 2 ns simulation time, taken every 200 ps. The color scale starts at black, then goes to cyan, red, blue, green, grey, yellow, orange, pink, and purple. After 1.4 ns, this water molecule spends about 200 ps at the protein-DNA interface before leaving again.
Properties of Lys50.
The position of the K50 side chain in the average structure of the wild-type PITX2 HD-
DNA complex during the trajectory can be seen in Figure 4.6a. As shown, the lysine side chain extends upwards and interacts with the two guanines adjacent to the TAAT core DNA binding site (G83 and G84). An MD simulation performed by Gutmanas & Billeter [2004] of a
Q50K mutant of the Antp homeodomain indicated that over the first nanosecond of the trajectory, the lysine at position 50 prefers to form a hydrogen bond with base pair 5 (G84 in this study), while during the second nanosecond, it prefers to form a hydrogen bond with base pair 6
(G83 in this study). These results indicate that K50 fluctuates between the two base pairs, remaining at each for 1 ns. Our previous NMR results (see Chapter 3) show line broadening for the K50 side chain resonances, which indicates motion on a timescale longer than this (µs-ms).
This was one reason we wanted to closely re-create the MD simulations run by Gutmanas and
Billeter [2004], so we could determine if we obtained similar results for the native complex. In fact, we did not see the same clear switch in the lysine side chain binding to one guanine, and then the other. Throughout the entire trajectory, we saw hydrogen bonding to both G83 and
G84, with possible hydrogen bonding to G83 N7 and O6 present almost the entire time.
Hydrogen bonding to G84 O6 is present for 100% of the simulation, but hydrogen bonding to
G84 N7 is only present for 1.5% of the simulation. As can be seen in Figure 4.6b, K50 Nζ appears to be closer to the G83 N7, and this distance is in the range for hydrogen bonding interactions (3-4 Å), while the distance to G84 N7 is significantly greater (~4.5-5.0 Å). But when looking at the distance between Nζ and the O6 atoms of both guanines, the distances are essentially the same for G83 and G84 (~3.0 Å), which indicates no preference for one guanine over the other. To determine if the K50 side chain itself is mobile, we analyzed the angles of the
K50 side chain throughout the simulation (Figure 4.6c). While there are fluctuations in the angles, there are no large-scale changes in the chi angles throughout the 2 ns simulation, such as those seen by Gutmanas and Billeter [2004]. One explanation for these results is that K50 in the
PITX2 homeodomain is behaving similarly to the Q50K mutant, only on a slower timescale that cannot be seen in these simulations; therefore, the movement shows up as multiple populations of the side chain. This hypothesis is supported by previous data, which has shown line broadening for the K50 side chain [see Chapter 3]. Here, the side chain appears to have a slight preference for hydrogen bonding to G83 for the 2 ns trajectory. Because hydrogen bonds can be seen to both guanines during certain steps of the trajectory, it is possible that the lysine side chain is forming transient simultaneous hydrogen bonds to both guanines. This formation of two hydrogen bonds by one lysine side chain has been seen in the past [Mandel-Gutfreund et al,
1995], but would probably require the side chain to be less mobile than what we see with K50.
The actual state of the K50 side chain will be further clarified in the future with NMR experiments examining side chain dynamics.
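The percentages quoted above for each hydrogen bond correspond to an occupancy, that is, the fraction of snapshots in which the contact is present. A minimal sketch is given below, assuming a distance-only criterion with a 3.5 Å cutoff between K50 Nζ and the acceptor atom. The cutoff, the function name, and the toy distance traces are assumptions and do not reproduce the values reported for the trajectory.

```python
def occupancy(distances, cutoff=3.5):
    """Percentage of snapshots in which the donor-acceptor distance (Å)
    is at or below the cutoff."""
    if not distances:
        raise ValueError("need at least one distance")
    hits = sum(1 for d in distances if d <= cutoff)
    return 100.0 * hits / len(distances)

# Toy distance traces in Å (assumed data), one value per snapshot.
nz_to_o6 = [2.9, 3.0, 3.1, 3.0]
nz_to_n7 = [4.8, 4.6, 3.4, 4.9]
print(occupancy(nz_to_o6))  # 100.0
print(occupancy(nz_to_n7))  # 25.0
```

Because the result depends directly on the chosen cutoff, occupancies computed with different hydrogen-bond criteria are not directly comparable.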
[Figure 4.6: panel (a) image; panel (b) Distance (Å) versus Time (ps); panel (c) Angle (º) versus Time (ps)]
Figure 4.6: Properties of K50 during the MD simulation. (a) Positioning of the K50 side-chain in the average structure of the wild-type protein-DNA complex during the 2 ns trajectory. (b) Distance (Å) between K50 NZ and the N7 atoms of G83 (black) and G84 (red), and between K50 NZ and the O6 atoms of G83 (green) and G84 (blue) during the course of the 2 ns simulation. (c) Dihedral angles of the K50 side-chain during the course of the simulation. χ1 (black), χ2 (blue), χ3 (red), and χ4 (green).
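The side-chain dihedral angles in panel (c) are each defined by four consecutive atoms; for lysine, χ1 is N-Cα-Cβ-Cγ, χ2 is Cα-Cβ-Cγ-Cδ, χ3 is Cβ-Cγ-Cδ-Cε, and χ4 is Cγ-Cδ-Cε-Nζ. The following minimal sketch computes a signed dihedral angle from four atom positions. The function names and the toy coordinates are hypothetical, and this is not the program used to generate the figure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four atom positions."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of the two outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy geometry (assumed coordinates): the outer bonds are perpendicular
# when viewed along the central bond.
print(round(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                     (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)), 1))  # 90.0
```

Applying the function to the four atoms of each χ angle in every snapshot would give the angle-versus-time traces; a large-scale side-chain transition would appear as a jump between rotamer wells rather than as small fluctuations.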
Analysis of Mutant Complexes
Many differences were seen between the wild-type complex and the mutant complexes.
Differences were seen in level of hydration, water residence time, energy levels, side chain position, and protein-DNA contacts. These differences are illustrated in Table 4.1 and Figures
4.7-4.9, and described below for each complex. A simulation was not performed for the L16Q or
K50E complexes, as these are the only complexes in which it is known that there is no DNA binding or activation seen with the consensus DNA site in biochemical studies [Saadi et al,
2001]. The complexes focused upon are the ones that are believed to still have some DNA binding activity, albeit at a lower level. The threading analysis performed previously for the
PITX2 homeodomain [Banerjee-Basu & Baxevanis, 1999] found that the mutant complexes, except for L16Q, all had similar threading scores to the wild-type complex, and therefore, they hypothesized that the overall tertiary structure was similar to wild-type for each of the mutants.
Therefore, differences in structure that lead to the differences in function may be localized to differences in protein-DNA interactions and hydration of the protein-DNA interface, and these are the focus of the following discussion.
T30P Simulation.
Position 30 is in helix 2. Previous biochemical studies of the T30P mutant have indicated that this mutant can still bind DNA almost as well as wild-type, but the activation function is severely reduced [Amendt et al, 1998]. In the current analysis, we found a greater level of hydration around residues 30, 31, 46, 50, and 52, compared to the case of the wild-type homeodomain. The residence times were greater for residues 31, 46 and 50. Analysis of the protein-DNA contacts shows that contacts between R44 Nε1 and the phosphate backbone of T74, and between R53 Nε1 and the N7 atom of A85 are missing. There is a new possible non-bonded contact between R31 Cδ and the phosphate of G82. There are new possible water-mediated contacts between R46 and the phosphate backbone between G81 and G83. The addition of new contacts between R31 and the DNA, and new possible water-mediated contacts may compensate for the loss of other contacts, and this may explain why DNA-binding affinity is not lowered for this complex.
R31H Simulation.
R31 is in helix 2 and makes a contact with the DNA backbone (Figure 4.3). Replacement of this arginine with a histidine creates a much bulkier side chain at the protein-DNA interface, which likely creates steric hindrance. Analysis of the protein-DNA contacts indicates that the histidine is capable of contacting the phosphate of G82, which is to be expected based on its positive charge, but this interaction is only present in a small number of the snapshots of the trajectory (~25%). Levels of hydration are very similar in this complex as compared to the wild-type complex (Table 4.1). In contrast to the wild-type complex, maximum residence times for the water molecules are much higher near residues 31, 46, and 50. These residence times are lower for residues 52 and 53. Interestingly, while residence times are lower for R52, there is the addition of many possible water-mediated contacts between R52 and the phosphate of residue
C70. Analysis of protein-DNA contacts indicates a new contact between R52 Nε1 and T71 O3’.
The contact between N51 and A73 N7 is removed and there is a new non-bonded contact between N51 Nδ2 and C8 of A72. This slightly different positioning of the protein on the DNA may explain the lowering of DNA binding affinity. The increase in water residence times at the protein-DNA interface may be due to a destabilization of the protein-DNA interaction, with water-mediated interactions becoming more important at stabilizing the interaction with DNA.
V45L Simulation.
Position 45 is located in helix 3 and points towards the interior of the homeodomain, making up part of the hydrophobic core of the protein. The extension of the side chain by a methylene group is likely to create steric hindrance within the core of this protein, causing a destabilization of the homeodomain. Results of this simulation show that this extension of the side chain causes the backbone of the third helix to be pushed 1.15 Å outwards from helices 1
and 2. Levels of hydration are very similar to wild-type for the V45L mutant complex, with the exception of R53 where the level is significantly higher. Maximum water residence times are raised for positions 46 and 50, and lowered for positions 31 and 53. This complex has a much lower number of water-mediated contacts. Water-mediated contacts seen between K55 and the phosphate backbone of C70 are missing. All water-mediated contacts between R53 and T86 are missing. Instead, there are new water bridges between R53 and the phosphate of G84. The total energy for the complex during the simulation is raised compared to most of the other complexes.
The direct protein-DNA contacts for this complex can be seen in Figure 4.7. There are new contacts between W48 Cζ3 and the phosphate of A72, and N51 Nδ2 and A72 C8. The contact between R44 and the T74 phosphate is missing. All of the contacts between R53 and residues 85 and 86 are missing, and there is a new contact between R53 and the phosphate of G84. There is a contact missing between R52 and the phosphate of A72. The contact between R57 Nε and the phosphate of G84 is missing, as well as the contact between K50 Nζ and G83 O6. There is a new contact between R31 Nε1 and the phosphate of G81. The many differences in protein-DNA contacts likely explain the lowered DNA binding seen for this mutant complex. What is interesting about this mutant is that while DNA binding is lowered, activation is greatly increased [Priston et al, 2001]. The lowered DNA binding is readily explained by the many altered protein-DNA contacts discussed above. But how is activation increased when DNA binding is lowered? One possibility is that the interaction with the DNA remains favorable enough for a reduced amount of binding to be seen, but that this highly destabilized interaction causes the complex to have a higher energy (see Figure 4.2) and exposes residues that would otherwise be buried to possible interactions with transcriptional activators. This hypothesis will be explored further in the future, in structural and functional studies of this mutant in the full-length PITX2 protein.
Figure 4.7: NUCPLOT diagram of the protein-DNA contacts in the average structure of the V45L mutant protein-DNA complex during the 2 ns trajectory. Atom nomenclature is described in the caption for Figure 4.3.
R46W Simulation.
R46 is located in the third helix of the PITX2 homeodomain and makes contacts with the DNA backbone of G82 and G83 (Figure 4.3). Figure 4.8 illustrates an overlay of the wild-type and R46W average structures during the simulation, with position 46 highlighted. In the wild-type case, the positively charged arginine extends towards the negatively charged DNA backbone to make contacts. In the mutant, replacement of the arginine with a tryptophan places a bulky, hydrophobic side chain at the protein-DNA interface, which disrupts the favorable interaction the arginine had with the DNA. Levels of hydration are higher for positions 31 and 46, with maximum water residence times being higher for positions 31 and 50, and lower for positions 52 and 53. In analyzing protein-DNA contacts, contacts between R2 Nε2 and the phosphate of A88, and R53 Nε1 and A85 N7 are missing. There are new contacts between Y25 OH and G82 O4’, and between M28 N and the phosphate of G82. The contact between R31 and the phosphate of G82 is shifted to G81. As expected, the contact between position 46 and G82 is missing.
Figure 4.8: Overlay of wild-type and R46W mutant complexes, illustrating the different positions of the two side chains and showing how tryptophan would cause an unfavorable steric interaction at the protein-DNA interface.
K50Q Simulation.
In the K50Q simulation, water residence times are higher for residues 31 and 46. Maximum water residence times are higher for residue 31 and lower for residue 53. Analysis of protein-DNA contacts shows new contacts between R3 (Nε and Nε2) and T87 O2, with the contact between the phosphate of A88 and R2 Nε2 missing. The direct hydrogen bond between N51 and A73 is also missing. There are new contacts between Y25 Cε2 and G82 O3’ and between M28 N and the phosphate of G82, and the contact between R31 and the phosphate of G82 is shifted to G81. There are new water-mediated contacts between the phosphate of G89 and R3 and R5. There is a new water-mediated contact between R3 Nε2 and T87. In this simulation, it appears that Q50 Oε1 is making water-mediated contacts with G83 and G84. The biochemical properties of this mutant are unknown.
R52C Simulation.
Position 52 is located in the third helix and makes contacts with the phosphates of residues T71 and A72 (Figure 4.3). Replacement of this arginine with a cysteine disrupts this interaction. In our MD simulation for the R52C mutant, levels of hydration are higher for positions 30 and 31, and lower for residue 52. Maximum water residence times are higher for residues 30, 31, 46 and 50, and lower for residues 52 and 53. The total energy of the system for this complex is much higher than for the other complexes (Figure 4.2). Analysis of protein-DNA contacts indicates that the contact between R2 Nε2 and the phosphate of A88 is missing. The contact between R53 Nε1 and N7 of A85 is missing as well. The loss of contacts with no compensation by addition of new contacts may explain the higher energy of this complex.
R53P Simulation.
R53 is located in helix 3 and makes contacts with A85 N7 and T86 O4 in the wild-type complex simulation. In the simulation for R53P, levels of hydration are higher for residues 30, 46, 50 and 53, and lower for residue 52. Maximum water residence times are higher for residues 31, 46, and 50 and much lower for residue 53. The protein-DNA contacts for this mutant complex are summarized in Figure 4.9. There are new contacts between K50 O and A73 N6, and between A54 and the bases of C70. The contact between R3 Nε2 and T87 O2 is missing, along with the contact between R44 and A73. There are new contacts involving Y25, M28 and R31 contacting the DNA backbone between residues 80 and 83. There are new possible water bridges between R46 and the phosphate of G82. There is also a new nonbonded contact between D27 and the phosphate of G82.
Figure 4.9: NUCPLOT diagram of the protein-DNA contacts in the average structure of the R53P mutant protein-DNA complex during the 2 ns trajectory. Atom nomenclature is described in the caption for Figure 4.3.
Discussion
Specific interactions between protein and DNA depend largely on the complementarity of the binding surfaces of each macromolecule, which includes favorable intermolecular interactions, such as direct hydrogen bonds, water-mediated hydrogen bonds, van der Waals contacts, electrostatic interactions, and hydrophobic contacts. There is evidence that many of these interactions are not static in homeodomains, but that there is movement of protein side chains in making interactions with the DNA [Billeter et al, 1996; Tsao et al, 1994; Gutmanas & Billeter, 2004]. Some degree of side chain mobility at the protein-DNA interface would be expected to confer an entropic advantage for binding to the DNA, as the side chain would not be immobilized. In addition to consideration of mobility of the interacting surfaces, the role of water in protein-DNA interactions must be considered. Water molecules appear to act not only as a foundation to improve the complementarity of the interacting surfaces, but also as an assistant to reduce the entropic expense that arises when a highly specific macromolecular interaction requires a large number of interactions [Billeter et al, 1996]. Homeodomains provide good model systems for exploring these aspects of protein-DNA interactions.
The simulations characterized in this study may shed some light on the mechanisms that govern the specificity of homeodomains, and the structural roles that mutated residues play in causing human disease. Previous studies have hypothesized that motion plays a role in the way that K50 recognizes the DNA consensus site [Gutmanas & Billeter, 2004; Duan & Nilsson, 2002]. The molecular dynamics study that made this hypothesis examined a Q50K mutant of the Antp homeodomain, and its authors proposed that the K50 side chain spends 1 ns at each guanine adjacent to the TAAT core sequence before switching to the other guanine [Gutmanas & Billeter, 2004]. Results for a mutant may not apply to the native case. As shown with our mutants in this study, differences in residues at one position can cause many differences at other residues throughout the protein sequence. Therefore, the residue differences between Antennapedia and PITX2 could cause many differences in the molecular dynamics of a lysine at position 50. In our previous study in which we presented the NMR solution structure of the PITX2 homeodomain, we saw line broadening of the side chain resonances for K50, which indicates possible motion on a µs-ms timescale. Our current results in this MD study indicate that motion of the K50 side chain is on a scale longer than the 2 ns trajectory presented here. In addition, hydration levels were greater for the K50 side chain in the current study than for the Q50K mutant. Because the K50 side chain makes direct hydrogen bonds with the guanines adjacent to the TAAT core DNA sequence, water-mediated contacts appear not to play as large a role as they do with Q50 homeodomains. For the wild-type Q50 homeodomain Antennapedia, water molecules spend a maximum residence time of 1.5 ns near position 50, and are shown to make water bridges between the Q50 side chain and the DNA [Gutmanas & Billeter, 2004; Duan & Nilsson, 2002]. For the K50 homeodomain PITX2, the maximum water residence time is only 230 ps, which is comparable to the 367 ps seen for the Q50K mutant of Antennapedia [Gutmanas & Billeter, 2004]. Therefore, one of the major differences between Q50 homeodomains and K50 homeodomains appears to be the importance of water in mediating the specificity of the homeodomain.
Nine missense mutations are found within the PITX2 homeodomain in Rieger syndrome and closely-related disorders. To try to elucidate how these mutations could affect the hydration of the PITX2 homeodomain and particularly the specific interactions with DNA, we have used MD simulations over a 2 ns trajectory. Some of the complexes can still bind DNA, but at significantly lower levels than the wild-type, with many of the protein-DNA contacts being lost at the expense of DNA binding (R31H, V45L, R46W, R52C, R53P). It appears that with some mutations, the mutant complexes still bind DNA by using the DNA contacts that are still present, and supplement these interactions with further water-mediated contacts with the DNA (R31H, R53P). Other mutant complexes have no way to compensate for the lost contact, and do not bind the TAATCC binding site at all (K50E). The V45L mutation involves a residue that is not involved in DNA binding, but in forming the hydrophobic core of the protein. The results of this study show that this mutant loses many favorable interactions with the DNA, probably due to instability in the formation of the tertiary structure, which then is unable to properly dock onto the DNA. We will want to study the V45L mutant further in the future, as this mutant has the interesting biochemical properties of having less DNA binding, but a greatly heightened activation function. In our study, the total energy of this system was raised in relation to most of the other complexes, and this complex had a great number of missing and altered interactions with the DNA and with water-bridging molecules. Overall, mutant complexes appear to have lowered DNA binding levels due to a lower number of favorable interactions with the DNA. The complexes that are still able to bind DNA often do so by having a greater number of water-mediated contacts with the DNA.
During the 2 ns simulation time, the majority of the water molecules had residence times that were shorter than 100 ps. Some of the water molecules had much longer residence times, as long as 99.5% of the simulation time for water molecules near residues 52 and 53 in the wild-type simulation. Figure 4.5 illustrates the path of a single water molecule that spends around 200 ps in the protein-DNA interface of the wild-type simulation, showing how far a single water molecule can travel during the 2 ns trajectory. These MD data are in agreement with earlier NMR measurements for the Antennapedia homeodomain, in which a few water molecules have long residence times, in the nanosecond range [Qian et al, 1993]. The coupling of direct and water-mediated protein-DNA interactions is interesting. It seems that each is vitally important to the binding of the protein to the DNA, and when mutations occur that alter the protein, differences in both the direct and water-mediated contacts are seen throughout the protein and the DNA. Throughout the simulation time of the wild-type and mutant complexes, there is no detectable change in any of the global structural characteristics of the complex, such as RMSD from starting structure and energy levels. But there are many differences between wild-type and mutant complexes at the level of particular side chains, in terms of their orientations, interactions with DNA, and hydration levels and residence times.
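As an illustration of how a maximum water residence time of the kind discussed above can be extracted from a trajectory, the following minimal sketch takes, for one water molecule, a per-frame flag indicating whether it lies near a given residue and returns the longest uninterrupted stay. The function name, the frame spacing, and the example flags are hypothetical, and the residence-time definition used in this study may differ (for example, by tolerating brief excursions).

```python
def max_residence_time(present, dt_ps):
    """Longest continuous stretch (in ps) that a water molecule stays near a residue.

    present: one boolean per saved frame (True = water within the chosen cutoff)
    dt_ps:   time between saved frames in picoseconds (an assumed input)
    """
    longest = current = 0
    for flag in present:
        # Extend the current run while the water stays; reset when it leaves.
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest * dt_ps


# Hypothetical presence flags for eight frames saved 1 ps apart.
frames = [True, True, False, True, True, True, False, True]
print(max_residence_time(frames, dt_ps=1.0))  # 3.0
```

Taking the maximum of this quantity over all water molecules that visit a residue gives one per-residue figure that can be compared between wild-type and mutant trajectories.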
In conclusion, MD simulations suggest that lysine at position 50 in the PITX2 homeodomain contacts the two guanines on the antisense DNA strand adjacent to the TAAT core DNA binding site, using motion on a time scale greater than the 2 ns trajectory utilized here. Missense mutations in this homeodomain that lead to Rieger syndrome and closely related disorders cause differences in direct protein-DNA contacts, levels of hydration, and water residence times. There are also differences involving water-mediated contacts between the protein and the DNA.
CHAPTER 5: Thesis Summary and Future Directions
PITX2 is a transcription factor that is found in many developing tissues in vertebrate embryos. It has been shown to be expressed in the brain, heart, pituitary, mandibular and maxillary regions, eye, gut, limbs, and umbilicus [Semina et al, 1996; Gage et al, 1997; Mucchielli et al, 1997; Hjalt et al, 2000]. Mutations in PITX2 are known to be a cause of Rieger syndrome [Xia et al, 2004; Priston et al, 2001; Lines et al, 2004; Phillips, 2002; Kulak et al, 1998], and many of these mutations result in single amino acid substitutions within the PITX2 homeodomain [Priston et al, 2001; Lines et al, 2004; Phillips, 2002; Kulak et al, 1998]. The PITX2 homeodomain is a member of the K50 class of homeodomains, which have a lysine at position 50 and recognize a consensus DNA sequence of TAATCC [Hanes & Brent, 1989]. The only K50 homeodomain structure determined previously is an X-ray crystal structure of an altered specificity mutant, Engrailed Q50K (EnQ50K) [Tucker-Kellogg et al, 1997]. An issue concerning EnQ50K is the observation that it binds to the consensus TAATCC site with an unusually high affinity, which approaches the picomolar range [Ades & Sauer, 1994]. A KD was determined for the Q50K mutant of the Fushi tarazu homeodomain, and this value was found to be 0.63 nM [Percival-Smith et al, 1990], which is a much lower affinity than that of the EnQ50K mutant. The KD for the PITX2 homeodomain alone was found to be 2.6 ± 0.38 nM.
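To give a sense of scale for the dissociation constants quoted above, the following sketch converts a ratio of KD values into a difference in standard binding free energy using ΔΔG = RT ln(KD,a / KD,b). The temperature of 298.15 K and the function name are assumptions made for illustration; the published KD values were measured in separate studies, possibly under different conditions, so the result is only a rough comparison.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_kd(kd_a_nm, kd_b_nm, temperature_k=298.15):
    """Binding free-energy difference (kcal/mol) between complexes a and b.

    Positive values mean complex a binds more weakly than complex b.
    The default temperature is an assumption, not a value from the thesis.
    """
    return R_KCAL * temperature_k * math.log(kd_a_nm / kd_b_nm)


# PITX2 homeodomain (2.6 nM) versus the Fushi tarazu Q50K mutant (0.63 nM).
print(round(ddg_from_kd(2.6, 0.63), 2))
```

Under these assumptions the roughly fourfold difference in KD corresponds to less than 1 kcal/mol, which is small compared with the gap between nanomolar and near-picomolar binding.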
In this thesis, the solution structure of the PITX2 homeodomain bound to its consensus DNA site (TAATCC) has been determined by NMR spectroscopy. Although structures of several homeodomain/DNA complexes have been determined, this is the first structure of a native K50 class homeodomain. Analysis of the NMR structure of the PITX2 homeodomain indicates that the lysine at position 50 makes contacts with two guanines on the antisense strand of the DNA, adjacent to the TAAT core DNA sequence, consistent with the structure of EnQ50K. Our evidence suggests that this side chain may make fluctuating interactions with the DNA, which is complementary to the crystal data for EnQ50K. There are differences in the tertiary structure between the native K50 structure and that of EnQ50K, which may explain differences in affinity and specificity between these proteins. The information provided in this thesis will form the basis for many future studies focused in three different areas: nonconsensus DNA binding, side chain dynamics of the K50 and other side chains, and analysis of Rieger mutant proteins.
Studies have shown that the PITX2 homeodomain can recognize DNA sites that deviate from the consensus site, and these sites are physiologically relevant [Dave et al, 2000; Yuan et al, 1999; Hjalt et al, 2001; Espinoza et al, 2002]. A list of these sites is presented in Table 1.3. While it is unknown whether the EnQ50K mutant can bind nonconsensus DNA sites, a Q50K mutant of the Fushi tarazu homeodomain is unable to bind any nonconsensus DNA binding sites tested [Zhao et al, 2000], which indicates that studies of nonconsensus DNA site recognition by K50 class homeodomains must focus on the native members. Future structural studies will focus on analyzing the structures of the PITX2 homeodomain bound to multiple nonconsensus DNA sites. It will be interesting to determine how the PITX2 homeodomain adjusts itself structurally to bind multiple DNA sites. It will also be interesting to determine what allows the PITX2 homeodomain to bind nonconsensus sites, while the Q50K mutant of the Fushi tarazu homeodomain is unable to bind any nonconsensus DNA sites. The structure presented in this thesis of the PITX2 homeodomain bound to the consensus DNA site will be used as the basis for these comparisons. As described in Chapter 1, amino acids at positions 47, 50, 51 and 54 are important in DNA recognition by homeodomains, and it may be the combination of residues at these positions that allows these homeodomains to recognize a multitude of DNA sites [Gruschus et al, 1997; Clarke, 1995]. Position 54 may be involved in recognition of variant DNA sites [Dave et al, 2000; Gruschus et al, 1997]. It has been shown in the past that there is a functional interaction between residues at positions 47 and 51 [Pomerantz & Sharp, 1994]. A functional interaction has also been shown to exist between amino acids at positions 50 and 54 [Pellizzari et al, 1997]; in particular, methionine is never present at position 54 when a lysine is at position 50. Position 47 is usually hydrophobic, position 51 is almost always asparagine, and position 54 varies [Billeter, 1996]. Bicoid and PITX2 differ from each other in the combination of residues at these positions (see Table 1.2), yet both recognize nonconsensus sites. Both also differ at these positions from other well-studied homeodomains, which suggests that these two homeodomains can recognize nonconsensus sites in part because of their particular combinations of residues. As has been shown by Dave et al [2000], Bicoid and PITX2 do not recognize the same nonconsensus sites, and each has its own unique properties in this sense. It will be interesting in the future to compare structures of these two homeodomains bound to nonconsensus sites to determine if there are any common characteristics.
We have performed molecular dynamics (MD) simulations on the solution structure of PITX2 bound to its consensus DNA site. This is the first molecular dynamics simulation of a native K50 homeodomain. The results show a number of long-lived water molecules in the vicinity of R52 and R53, which form water-bridging interactions with the DNA. A number of water molecules are also shown to be in the vicinity of other arginines in contact with the DNA, and near K50. The results indicate that motion of the K50 side chain is on a time scale longer than what we can see by MD simulations, which is in contrast to results seen for a Q50K mutant previously. The line-broadening seen in these structural studies and described in Chapter 3 for the K50 side chain also indicates that there is possibly motion of this side chain in interacting with the DNA. Future studies will focus on performing NMR dynamics experiments to analyze the possible motional properties of the K50 side chain and the timescale of this motion, to determine exactly how this side chain may fluctuate in interacting with positions 5 and 6 of the consensus site, and how this motion may be involved in its DNA recognition function and in the thermodynamics of the homeodomain-DNA interaction. The dynamics of the arginine side chains that are so important in contacting the DNA will also be examined, which will provide information on motion at the protein-DNA interface, and may paint a clearer picture of how motion of side chains is important in binding and recognition of DNA binding sites. This motion may also be important in recognition of nonconsensus sites.
Mutations in the human PITX2 gene are responsible for Rieger syndrome, an autosomal dominant disorder. Analysis of the residues mutated in Rieger syndrome indicates that many of these residues are involved in DNA binding, while others are involved in formation of the hydrophobic core of the protein. In this thesis, we analyzed the structural roles of residues that are mutated in Rieger syndrome, and performed molecular dynamics simulations of mutant complexes. The results of these simulations were compared to the wild-type case, and there are many differences in levels of hydration, water residence times, energy levels, water bridging interactions, and direct protein-DNA interactions. These results provide further insight into the mechanism by which K50 homeodomains bind DNA, and how Rieger mutations cause severe phenotypic consequences. Future studies will focus on making single site mutations to create Rieger mutant homeodomains, and then performing structural studies on these mutants to determine more directly how these mutations alter the structure of the homeodomain and its interaction with DNA. These studies could provide insight into how these mutants cause disease, and into possible treatment options for not only this disease, but also the many others that are caused by mutations in homeodomains. We will initially focus on the mutant proteins that are known to be more stable and have some DNA-binding activity (see Table 1.1). In particular, we would like to analyze the V45L mutant, which has the unusual property of having a lowered DNA binding activity, but a greatly heightened activation function [Priston et al, 2001]. We would like to determine how this mutation changes the structure of the homeodomain itself in such a way as to produce this phenotype. It may be necessary to determine the structure of the full-length PITX2 protein, and examine this mutant in the context of the full-length protein, which would include the transcriptional activation domains.
Unlike other homeodomains that are stable in the free form [Tsao et al, 1994; Qian et al, 1994b; Damante et al, 1994; Carra & Privalov, 1997; Otting et al, 1988; Yamamoto et al, 1992], the PITX2 homeodomain is unstable in the absence of DNA in that it irreversibly aggregates at micromolar concentrations, which suggests a possible lack of stable tertiary structure in the free form. This may be due to slightly different hydrophobic interactions within the core of the protein, and the absence of other stabilizing interactions such as the salt bridge linking residues 19 and 30, which can be present in most homeodomains [Iurcu-Mustata et al, 2001], but is not possible in PITX2. Because some of the Rieger mutant proteins have lower or no DNA binding activity, we would like to determine experimental conditions that would make the free PITX2 protein more stable at high concentrations, and possibly determine the structure of the protein in the free form. This would allow us to examine how the homeodomain changes its structure in binding the DNA. Experimental conditions that stabilize the free PITX2 homeodomain are also likely to stabilize some of the Rieger mutants with lowered DNA binding, which would allow for efficient structure determination of these proteins.
CHAPTER 6: Literature Cited
Ades, S.E., & Sauer, R.T. Differential DNA-binding specificity of the Engrailed homeodomain:
the role of residue 50. Biochemistry 33, 9187-9194 (1994).
Aishima, J., & Wolberger, C. Insights into nonspecific binding of homeodomains from a
structure of MATα2 bound to DNA. Proteins 51, 544-551 (2003).
Akarsu, A.N., Akhan, O., Sayli, B.S., Sayli, U., Baskaya, G., & Sarfarazi, M. A large Turkish
kindred with syndactyly type II (synpolydactyly). 2. Homozygous phenotype? J. Med. Genet.
32, 435-441 (1995).
Alward, W. L. M., Semina, E.V., Kalenak, J.W., Heon, E., Sheth, B.P., Stone, E.M., & Murray,
J.C. Autosomal dominant iris hypoplasia is caused by a mutation in the Rieger syndrome
(RIEG/PITX2) gene. Am. J. Ophthalmol. 125, 98-100 (1998).
Amendt, B. A., Sutherland, L. B., Semina, E. V. & Russo, A. F. The molecular basis of Rieger
syndrome. J. Biol. Chem. 273, 20066-20072 (1998).
Amendt, B. A., Sutherland, L. B. & Russo, A. F. Multifunctional role of the Pitx2 homeodomain
protein C-Terminal tail. Mol. Cell. Biol. 19, 7001-7010 (1999).
Arakawa, H., Nakamura, T., Zhadanov, A.B., Fidanza, V., Yano, T., Bullrich, F., Shimizu, M.,
Blechman, J., Mazo, A., Canaani, E., & Croce, C.M. Identification and characterization of
the ARP1 gene, a target for the human acute leukemia ALL1 gene. Proc. Natl. Acad. Sci. 95,
4573-4578 (1998).
Banerjee-Basu, S. & Baxevanis, A. D. Threading analysis of the Pitx2 homeodomain: Predicted
structural effects of mutations causing Rieger syndrome and iridogoniodysgenesis. Hum.
Mutat. 14, 312-319 (1999).
Banerjee-Basu, S., Moreland, T., Hsu, B.J., Trout, K.L., & Baxevanis, A.D. The homeodomain
resource: 2003 update. Nuc. Acids Res. 31, 304-306 (2003).
Berleth, T., Burri, M., Thoma, G., Bopp, D., Richstein, S., Frigerio, G., Noll, M., & Nusslein-
Volhard, C. The role of localization of bicoid RNA in organizing the anterior pattern of the
Drosophila embryo. EMBO J. 7, 1749-1756 (1988).
Billeter, M., Qian, Y., Otting, G., Muller, M., Gehring, W.J., & Wuthrich, K. Determination of
the three-dimensional structure of the Antennapedia homeodomain from Drosophila in
solution by 1H nuclear magnetic resonance spectroscopy. J. Mol. Biol. 214, 183-197 (1990).
Billeter, M., & Wuthrich, K. Model studies relating nuclear magnetic resonance data with the
three-dimensional structure of protein-DNA complexes. J. Mol. Biol. 234, 1094-1097 (1993).
Billeter, M. Homeodomain-type DNA recognition. Prog. Biophys. Mol. Biol. 66, 211-225
(1996).
Billeter, M., Guntert, P., Luginbuhl, P. & Wuthrich, K. Hydration and DNA recognition by
homeodomains. Cell 85, 1057-1065 (1996).
Bisgrove, B. W. & Yost, H. J. Classification of left-right patterning defects in zebrafish, mice,
and humans. Am. J. Med. Genet. 101, 315-323 (2001).
Boncinelli, E. Homeobox genes and disease. Curr. Op. Genet. Dev. 7, 331-337 (1997).
Borrow, J., Shearman, A.M., Stanton, V.P. Jr., Becher, R., Collins, T., Williams, A.J., Dube, I.,
Katz, F., Kwong, Y.L., Morris, C., Ohyashiki, K., Toyama, K., Rowley, J., & Housman, D.E.
The t(7;11)(p15;p15) translocation in acute myeloid leukaemia fuses the genes for
nucleoporin NUP98 and class I homeoprotein HOXA9. Nat. Genet. 12, 159-167 (1996).
Breeze, A.L. Isotope-filtered NMR methods for the study of biomolecular structure and
interactions. Prog. NMR Spectros. 36, 323-372 (2000).
Brennan, R.G., & Matthews, B.W. Structural basis of DNA-protein recognition. Trends
Biochem. Sci. 14, 286-290 (1989).
Briata, P., Ilengo, C., Corte, G., Moroni, C., Rosenfeld, M.G., Chen, C., & Gherzi R. The
Wnt/β-catenin→Pitx2 pathway controls the turnover of Pitx2 and other unstable mRNAs. Mol.
Cell 12, 1201-1211 (2003).
Burz, D. S., Rivera-Pomar, R., Jackle, H. & Hanes, S. D. Cooperative DNA-binding by Bicoid
provides a mechanism for threshold-dependent gene activation in the Drosophila embryo.
EMBO J. 17, 5998-6009 (1998).
Campione, M., Ros, M.A., Icardo, J.M., Piedra, E., Christoffels, V.M., Schweickert, A., Blum,
M., Franco, D., & Moorman, A.F. Pitx2 expression defines a left cardiac lineage of cells:
evidence for atrial and ventricular molecular isomerism in the iv/iv mice. Dev. Biol. 231,
252-264 (2001).
Carra, J.H., & Privalov, P.L. Energetics of folding and DNA binding of the MAT alpha 2
homeodomain. Biochemistry 36, 526-535 (1997).
Case, D.A., Pearlman, D.A., Caldwell, J.W., Cheatham, T.E., Ross, W.S., Simmerling, C.L.,
Darden, T.A., Merz, K.M., Stanton, R.V., Cheng, A.L., Vincent, J.J., Crowley, M., Tsui, V.,
Radmer, R.J., Duan, Y., Pitera, J., Massova, I., Seibel, G.L., Singh, U.C., Weiner, P.K., &
Kollman, P.A. AMBER7, University of California, San Francisco (1996).
Ceska, T.A., Lamers, M., Monaci, P., Nicosia, A., Cortese, R., & Suck, D. The X-ray structure
of an atypical homeodomain present in the rat liver transcription factor LFB1/HNF1 and
implications for DNA binding. EMBO J. 12, 1805-1810 (1993).
Chai, J., Wu, J.W., Yan, N., Massague, J., Pavletich, N.P., & Shi, Y. Features of a Smad3 MH1-
DNA complex. J. Biol. Chem. 278, 20327-20331 (2003).
Chisholm, E.A., & Chudley, A.E. Autosomal dominant iridogoniodysgenesis with associated
somatic anomalies: four-generation family with Rieger's syndrome. Br. J. Ophthalmol. 67,
529-534 (1983).
Chiu, T.K., Sohn, C., Dickerson, R.E., & Johnson, R.C. Testing water-mediated DNA
recognition by the Hin recombinase. EMBO J. 21, 801-814 (2002).
Clarke, N.D. Covariation of residues in the homeodomain sequence. Protein Sci. 4, 2269-2278
(1995).
Clarke, N.D., Kissinger, C.R., Desjarlais, J., Gilliland, G.L. & Pabo, C.O. Structural studies of
the engrailed homeodomain. Protein Sci. 3, 1779-1787 (1994).
Clevers, H. Inflating numbers by Wnt. Mol. Cell 10, 1260-1261 (2002).
Cox, M., van Tilborg, P.J., de Laat, W., Boelens, R., van Leeuwen, H.C., van der Vliet, P.C., &
Kaptein, R. Solution structure of the Oct-1 POU homeodomain determined by NMR and
restrained molecular dynamics. J. Biomol. NMR 6, 23-32 (1995).
Cox, C. J., Espinoza, H.M., McWilliams, B., Chappell, K., Morton, L., Hjalt, T.A., Semina,
E.V., & Amendt, B.A. Differential regulation of gene expression by PITX2 isoforms. J. Biol.
Chem. 277, 25001-25010 (2002).
Crawford, M. J., Lanctot, C., Tremblay, J.J., Jenkins, N., Gilbert, D., Copeland, N., Beatty, B., &
Drouin, J. Human and murine PTX1/Ptx1 gene maps to the region for Treacher Collins
Syndrome. Mamm. Genome 8, 841-845 (1997).
Damante, G., Tell, G., Leonardi, A., Fogolari, F., Bortolotti, N., DiLauro, R., & Formisano, S.
Analysis of the conformation and stability of rat TTF-1 homeodomain by circular dichroism.
FEBS Lett. 354, 293-296 (1994).
Dave, V., Zhao, C., Yang, F., Tung, C. & Ma, J. Reprogrammable recognition codes in Bicoid
homeodomain-DNA interaction. Mol. Cell. Biol. 20, 7673-7684 (2000).
Davey, C.A., Sargent, D.F., Luger, K., Maeder, A.W., & Richmond, T.J. Solvent mediated
interactions in the structure of the nucleosome core particle at 1.9Å resolution. J. Mol. Biol.
319, 1097-1113 (2002).
Delaglio, F., Grzesiek, S., Vuister, G.W., Zhu, G., Pfeifer, J., & Bax, A. NMRPipe: A
multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277-
293 (1995).
D'Elia, A.V., Tell, G., Paron, I., Pellizzari, L., Lonigro, R., & Damante, G. Missense mutations
of human homeoboxes: a review. Hum. Mutat. 18, 361-374 (2001).
Doig, A.J., & Sternberg, M.J.E. Side-chain conformational entropy in protein folding. Protein
Sci. 4, 2247-2251 (1995).
Driever, W. & Nusslein-Volhard, C. A gradient of Bicoid protein in Drosophila embryos. Cell
54, 83-93 (1988a).
Driever, W. & Nusslein-Volhard, C. The Bicoid protein determines position in the Drosophila
embryo in a concentration-dependent manner. Cell 54, 95-104 (1988b).
Driever, W. & Nusslein-Volhard, C. The Bicoid protein is a positive regulator of hunchback
transcription in the early Drosophila embryo. Nature 337, 138-143 (1989).
Duan, J., & Nilsson, L. The role of residue 50 and hydration water molecules in homeodomain
DNA recognition. Eur. Biophys. J. 31, 306-316 (2002).
Espinoza, H. M., Cox, C. J., Semina, E. V. & Amendt, B. A. A molecular basis for differential
developmental anomalies in Axenfeld-Rieger syndrome. Hum. Mol. Genet. 11, 743-753
(2002).
Espinoza, H.M., Ganga, M., Vadlamudi, U., Martin, D.M., Brooks, B.P., Semina, E.V., Murray,
J.C., & Amendt, B.A. Protein Kinase C phosphorylation modulates N- and C-terminal
regulatory activities of the PITX2 homeodomain protein. Biochemistry (2005).
Essner, J. J., Branford, W. W., Zhang, J. & Yost, H. J. Mesendoderm and left-right brain, heart
and gut development are differentially regulated by pitx2 isoforms. Development 127, 1081-
1093 (2000).
Farrow, N.A., Muhandiram, R., Singer, A.U., Pascal, S.M., Kay, C.M., Gish, G., Shoelson, S.E.,
Pawson, T., Forman-Kay, J.D., & Kay, L.E. Backbone dynamics of a free and a
phosphopeptide-complexed Src Homology 2 domain studied by 15N NMR relaxation.
Biochemistry 33, 5984-6003 (1994).
Fausti, S., Weiler, S., Cuniberti, C., Hwang, K.J., No, K.T., Gruschus, J.M., Perico, A.,
Nirenberg, M., & Ferretti, J.A. Backbone dynamics for the wildtype and a double
H52R/T56W mutant of the vnd/NK-2 homeodomain from Drosophila melanogaster.
Biochemistry 40, 12004-12012 (2001).
Fernandez, C., Szyperski, T., Billeter, M., Ono, A., Iwai, H., Kainosho, M., & Wuthrich, K.
Conformational changes of the BS2 operator DNA upon complex formation with the
Antennapedia homeodomain studied by NMR with 13C/15N-labeled DNA. J. Mol. Biol.
292, 609-617 (1999).
Flomen, R.H., Gorman, P.A., Vatcheva, R., Groet, J., Barisic, I., Ligutic, E., Sheer, D., &
Nizetic, D. Rieger Syndrome locus: a new reciprocal translocation t(4;12)(q25;q15) and a
deletion del(4)(q25q27) both break between markers D4S2945 and D4S193. J. Med. Genet.
34, 191-195 (1997).
Flomen, R. H., Vatcheva, R., Gorman, P.A., Baptista, P.R., Groet, J., Barisic, I., Ligutic, I., &
Nizetic, D. Construction and analysis of a sequence-ready map in 4q25: Rieger syndrome
can be caused by haploinsufficiency of RIEG, but also by chromosome breaks ~90kb
upstream of this gene. Genomics 47, 409-413 (1998).
Foster, M.P., Wuttke, D.S., Radhakrishnan, I., Case, D.A., Gottesfeld, J.M., & Wright, P.E.
Domain packing and dynamics in the DNA complex of the N-terminal zinc fingers of
TFIIIA. Nat. Struct. Biol. 4, 605-608 (1997).
Fraenkel, E. & Pabo, C. O. Comparison of X-ray and NMR structures for the Antennapedia
homeodomain-DNA complex. Nat. Struct. Biol. 5, 692-697 (1998).
Fraenkel, E., Rould, M. A., Chambers, K. A. & Pabo, C. O. Engrailed Homeodomain-DNA
Complex at 2.2 Å Resolution: A detailed view of the interface and comparison with other
Engrailed structures. J. Mol. Biol. 284, 351-361 (1998).
Franco, D., & Campione, M. The role of Pitx2 during cardiac development. Trends Cardiovasc.
Med. 13, 157-163 (2003).
Frohnhofer, H. G. & Nusslein-Volhard, C. Organization of anterior pattern in the Drosophila
embryo by the maternal gene bicoid. Nature 324, 120-125 (1986).
Gage, P. J. & Camper, S.A. Pituitary homeobox 2, a novel member of the bicoid-related family
of homeobox genes, is a potential regulator of anterior structure formation. Hum. Mol. Genet.
6, 457-464 (1997).
Gage, P. J., Suh, H. & Camper, S.A. Dosage requirement of Pitx2 for development of multiple
organs. Development 126, 4643-4651 (1999a).
Gage, P. J., Suh, H. & Camper, S. A. The bicoid-related Pitx gene family in development.
Mamm. Genome 10, 197-200 (1999b).
Ganga, M., Espinoza, H.M., Cox, C.J., Morton, L., Hjalt, T.A., Lee, Y., & Amendt, B.A. PITX2
isoform specific regulation of ANF expression: synergism and repression with Nkx2.5. J.
Biol. Chem. 278, 22437-22445 (2003).
Gao, Q., & Finkelstein, R. Targeting gene expression to the head: the Drosophila orthodenticle
gene is a direct target of the Bicoid morphogen. Development 125, 4185-4193 (1998).
Gehring, W. Cell heredity and changes of determination in cultures of imaginal discs in
Drosophila melanogaster. J. Embryol. Exp. Morphol. 15, 77-111 (1966).
Gehring, W. J., Affolter, M. & Burglin, T. Homeodomain proteins. Annu. Rev. Biochem. 63,
487-526 (1994).
Gehring, W.J., Muller, M., Affolter, M., Percival-Smith, A., Billeter, M., Qian, Y.Q., Otting, G.,
& Wuthrich, K. The structure of the homeodomain and its functional implications. Trends
Genet. 6, 323-329 (1990).
Goddard, T.D., & Kneller, D.G. SPARKY 3, University of California, San Francisco.
Graham, A. & McGonnell, I. Limb development: Farewell to arms. Curr. Biol. 9, 368-370
(1999).
Grant, R. A., Rould, M. A., Klemm, J. D. & Pabo, C. O. Exploring the role of glutamine 50 in
the homeodomain-DNA interface: crystal structure of Engrailed (Gln50-->Ala) complex at
2.0 Å. Biochemistry 39, 8187-8192 (2000).
Green, P. D., Hjalt, T.A., Kirk, D.E., Sutherland, L.B., Thomas, B.L., Sharpe, P.T., Snead, M.L.,
Murray, J.C., Russo, A.F., & Amendt, B.A. Antagonistic regulation of Dlx2 expression by
PITX2 and Msx2: implications for tooth development. Gene Expr. 9, 265-281 (2001).
Gruschus, J.M., Tsao, D.H., Wang, L.H., Nirenberg, M., & Ferretti, J.A. Interactions of the
vnd/NK-2 homeodomain with DNA by nuclear magnetic resonance spectroscopy: basis of
binding specificity. Biochemistry 36, 5372-5380 (1997).
Grzesiek, S., & Bax, A. Improved 3D triple-resonance NMR techniques applied to a 31 kDa
protein. J. Magn. Reson. 96, 432-440 (1992a).
Grzesiek, S., & Bax, A. Correlating backbone amide and side-chain resonances in larger
proteins by multiple relayed triple resonance NMR. J. Am. Chem. Soc. 114, 6291-6293
(1992b).
Guntert, P., Mumenthaler, C., & Wuthrich, K. Torsion angle dynamics for NMR structure
calculation with the new program DYANA. J. Mol. Biol. 273, 283-298 (1997).
Guntert, P., Qian, Y.Q., Otting, G., Muller, M., Gehring, W., & Wuthrich, K. Structure
determination of the Antp (C39-->S) homeodomain from nuclear magnetic resonance data in
solution using a novel strategy for the structure calculation with the programs DIANA,
CALIBA, HABAS and GLOMSA. J. Mol. Biol. 217, 531-540 (1991).
Gutmanas, A., & Billeter, M. Specific DNA recognition by the Antp homeodomain: MD
simulations of specific and nonspecific complexes. Proteins: Struct. Funct. Bioinfo. 57, 772-
782 (2004).
Hanes, S. D. & Brent, R. DNA specificity of the Bicoid activator protein is determined by
homeodomain recognition helix residue 9. Cell 57, 1275-1283 (1989).
Hansson, T., Oostenbrink, C., & van Gunsteren, W.F. Molecular dynamics simulations. Curr.
Opin. Struct. Biol. 12, 190-196 (2002).
Harrison, S.C. A structural taxonomy of DNA-binding domains. Nature 353, 715-719 (1991).
Heon, E., Sheth, B.P., Kalenak, J.W., Sunden, S.L., Streb, L.M., Taylor, C.M., Alward, W.L.,
Sheffield, V.C., & Stone, E.M. Linkage of autosomal dominant iris hypoplasia to the region
of the Rieger Syndrome locus (4q25). Hum. Mol. Genet. 4, 1435-1439 (1995).
Herrmann, T., Guntert, P., & Wuthrich, K. Protein NMR structure determination with automated
NOE assignment using the new software CANDID and the torsion angle dynamics algorithm
DYANA. J. Mol. Biol. 319, 209-227 (2002).
Hirsch, J.A., & Aggarwal, A.K. Structure of the even-skipped homeodomain complexed to AT-rich
DNA: new perspectives on homeodomain specificity. EMBO J. 14, 6280-6291 (1995).
Hjalt, T. A. & Murray, J. C. The human BARX2 gene: genomic structure, chromosomal
localization, and single nucleotide polymorphisms. Genomics 62, 456-459 (1999).
Hjalt, T. A., Semina, E. V., Amendt, B. A. & Murray, J. C. The Pitx2 protein in mouse
development. Dev. Dyn. 218, 195-200 (2000).
Hjalt, T. A., Amendt, B. A. & Murray, J. C. PITX2 regulates procollagen lysyl hydroxylase
(PLOD) gene expression: implications for the pathology of Rieger syndrome. J. Cell Biol.
152, 545-552 (2001).
Holmberg, J., Liu, C., & Hjalt, T.A. PITX2 gain-of-function in Rieger syndrome eye model.
Am. J. Path. 165, 1633-1641 (2004).
Ikura, M., Kay, L.E., & Bax, A. A novel approach for sequential assignment of H-1, C-13, and
N-15 spectra of larger proteins: heteronuclear triple-resonance 3-dimensional NMR
spectroscopy: application to calmodulin. Biochemistry 29, 4659-4667 (1990).
Iurcu-Mustata, G., Van Belle, D., Wintjens, R., Prevost, M., & Rooman, M. Role of salt bridges
in homeodomains investigated by structural analyses and molecular dynamics simulations.
Biopolymers 59, 145-159 (2001).
Janin, J. Wet and dry interfaces: The role of solvent in protein-protein and protein-DNA
recognition. Structure 7, R277-R279 (1999).
Jayaram, B., & Jain, T. The role of water in protein-DNA recognition. Annu. Rev. Biophys.
Biomol. Struct. 33, 343-361 (2004).
Jones, S., van Heyningen, P., Berman, H.M., & Thornton, J.M. Protein-DNA interactions: a
structural analysis. J. Mol. Biol. 287, 877-896 (1999).
Kathiriya, I. S. & Srivastava, D. Left-right asymmetry and cardiac looping. Am. J. Med. Genet.
97, 271-279 (2000).
Katz, L.A., Schultz, R.E., Semina, E.V., Torfs, C.P., Krahn, K.N., & Murray, J.C. Mutations in
PITX2 may contribute to cases of omphalocele and VATER-like syndromes. Am. J. Med.
Genet. 130A, 277-283 (2004).
Kay, L.E., Keifer, P., & Saarinen, T. Pure absorption gradient enhanced heteronuclear single
quantum correlation spectroscopy with improved sensitivity. J. Am. Chem. Soc. 114, 10663-
10665 (1992).
Kay, L.E., Xu, G.Y., & Yamazaki, T. Enhanced-sensitivity triple-resonance spectroscopy with
minimal H2O saturation. J. Magn. Reson. Ser. A 103, 129-133 (1994).
Kioussi, C., Briata, P., Baek, S.H., Rose, D.W., Hamblet, N.S., Herman, T., Ohgi, K.A., Lin, C.,
Gleiberman, A., Wang, J., Brault, V., Ruiz-Lozano, P., Nguyen, H.D., Kemler, R., Glass,
C.K., Wynshaw-Boris, A., & Rosenfeld, M.G. Identification of a Wnt/Dvl/β-Catenin →
Pitx2 pathway mediating cell-type-specific proliferation during development. Cell 111, 673-
685 (2002).
Kissinger, C. R., Liu, B., Martin-Blanco, E., Kornberg, T. B. & Pabo, C. O. Crystal structure of
an Engrailed homeodomain-DNA complex at 2.8 Å: a framework for understanding
homeodomain-DNA interactions. Cell 63, 579-590 (1990).
Kitamura, K., Miura, H., Yanazawa, M., Miyashita, T. & Kato, K. Expression patterns of Brx1
(Rieg gene), Sonic hedgehog, Nkx2.2, Dlx1 and Arx during zona limitans intrathalamica and
embryonic ventral lateral geniculate nuclear formation. Mech. Dev. 67, 83-96 (1997).
Kitamura, K., Miura, H., Miyagawa-Tomita, S., Yanazawa, M., Katoh-Fukui, Y., Suzuki, R.,
Ohuchi, H., Suehiro, A., Motegi, Y., Nakahara, Y., Kondo, S., & Yokoyama, M. Mouse
Pitx2 deficiency leads to anomalies of the ventral body wall, heart, extra- and periocular
mesoderm and right pulmonary isomerism. Development 126, 5749-5758 (1999).
Klemm, J.D., Rould, M.A., Aurora, R., Herr, W. & Pabo, C.O. Crystal structure of the Oct-1
POU domain bound to an octamer site: DNA recognition with tethered DNA-binding
modules. Cell 77, 21-32 (1994).
Koradi, R., Billeter, M., & Wuthrich, K. MOLMOL: a program for display and analysis of
macromolecular structures. J. Mol. Graph. 14, 51-55 (1996).
Kornberg, T. B. Understanding the Homeodomain. J. Biol. Chem. 268, 26813-26816 (1993).
Kosztin, D., Bishop, T.C., & Schulten, K. Binding of the estrogen receptor to DNA. The role of
waters. Biophys. J. 73, 557-570 (1997).
Kozlowski, K. & Walter, M. A. Variation in residual PITX2 activity underlies the phenotypic
spectrum of anterior segment developmental disorders. Hum. Mol. Genet. 9, 2131-2139
(2000).
Kulak, S. C., Kozlowski, K., Semina, E. V., Pearce, W. G. & Walter, M. A. Mutation in the
RIEG1 gene in patients with iridogoniodysgenesis syndrome. Hum. Mol. Genet. 7, 1113-
1117 (1998).
La Rosee, A., Hader, T., Taubert, H., Rivera-Pomar, R., & Jackle, H. Mechanism and Bicoid-
dependent control of hairy stripe 7 expression in the posterior region of the Drosophila
embryo. EMBO J. 16, 4403-4411 (1997).
Lamonerie, T., Tremblay, J.J., Lanctot, C., Therrien, M., Gauthier, Y., & Drouin, J. Ptx1, a
bicoid-related homeobox TF involved in transcription of the pro-opiomelanocortin gene.
Genes Dev. 10, 1284-1295 (1996).
Lanctot, C., Moreau, A., Chamberland, M., Tremblay, M. L. & Drouin, J. Hindlimb patterning
and mandible development require the Ptx1 gene. Development 126, 1805-1810 (1999).
Laskowski, R.A., MacArthur, M.W., Moss, D.S., & Thornton, J.M. PROCHECK: a program to
check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283-291 (1993).
Laskowski, R.A., Rullmann, J.A.C., MacArthur, M.W., Kaptein, R., & Thornton, J.M. AQUA
and PROCHECK-NMR: programs for checking the quality of protein structures solved by
NMR. J. Biomol. NMR 8, 477-486 (1996).
Lebel, M., Gauthier, Y., Moreau, A. & Drouin, J. Pitx3 activates mouse tyrosine hydroxylase
promoter via a high-affinity binding site. J. Neurochem. 77, 558-567 (2001).
Lee, W., Revington, M.J., Arrowsmith, C., & Kay, L.E. A pulsed field gradient isotope-filtered
3D 13C HMQC-NOESY experiment for extracting intermolecular NOE contacts in molecular
complexes. FEBS Lett. 350, 87-90 (1994).
Leiting, B., De Francesco, R., Tomei, L., Cortese, R., Otting, G., & Wuthrich, K. The three-
dimensional NMR-solution structure of the polypeptide fragment 195-286 of the
LFB1/HNF1 transcription factor from rat liver comprises a nonclassical homeodomain.
EMBO J. 12, 1797-1803 (1993).
Lewis, E.B. A gene complex controlling segmentation in Drosophila. Nature 276, 565-570
(1978).
Li, T., Stark, M.R., Johnson, A.D., & Wolberger, C. Crystal structure of the MATa1/MATα2
homeodomain heterodimer bound to DNA. Science 270, 262-269 (1995).
Lin, C.R., Kioussi, C., O’Connell, S., Briata, P., Szeto, D., Liu, F., Izpisua-Belmonte, J.C., &
Rosenfeld, M.G. Pitx2 regulates lung asymmetry, cardiac positioning and pituitary and tooth
morphogenesis. Nature 401, 279-282 (1999).
Liu, C., Liu, W., Lu, M., Brown, N. A. & Martin, J. F. Regulation of left-right asymmetry by
thresholds of Pitx2c activity. Development 128, 2039-2048 (2001).
Liu, W., Selever, J., Lu, M., & Martin, J.F. Genetic dissection of Pitx2 in craniofacial
development uncovers new functions in branchial arch morphogenesis, late aspects of tooth
morphogenesis and cell migration. Development 130, 6375-6385 (2003).
Lu, M.-F., Pressman, C., Dyer, R., Johnson, R. L. & Martin, J. F. Function of Rieger syndrome
gene in left-right asymmetry and craniofacial development. Nature 401, 276-278 (1999).
Luginbuhl, P., Szyperski, T., & Wuthrich, K. Statistical basis for the use of 13C chemical shifts
in protein structure determination. J. Magn. Reson. Ser. B. 109, 229-233 (1995).
Luscombe, N.M., Laskowski, R.A., & Thornton, J.M. NUCPLOT: A program to generate
schematic diagrams of protein-nucleic acid interactions. Nucl. Acids Res. 25, 4940-4945
(1997).
Ma, X., Yuan, D., Diepold, K., Scarborough, T., & Ma, J. The Drosophila morphogenetic
protein Bicoid binds DNA cooperatively. Development 122, 1195-1206 (1996).
Ma, X., Yuan, D., Scarborough, T. & Ma, J. Contributions to gene activation by multiple
functions of Bicoid. Biochem. J. 338, 447-455 (1999).
Mammi, I., De Giorgio, P., Clementi, M. & Tenconi, R. Cardiovascular anomaly in Rieger
Syndrome: Heterogeneity or contiguity? Acta Ophthalmol. Scand. 76, 509-512 (1998).
Mandel-Gutfreund, Y., Schueler, O., & Margalit, H. Comprehensive analysis of hydrogen bonds
in regulatory protein-DNA complexes: in search of common principles. J. Mol. Biol. 253,
370-382 (1995).
Marcil, A., Dumontier, E., Chamberland, M., Camper, S.A., & Drouin, J. Pitx1 and Pitx2 are
required for development of hindlimb buds. Development 130, 45-55 (2003).
Marion, D., Driscoll, P.C., Kay, L.E., Wingfield, P.T., Bax, A., Gronenborn, A., & Clore, G.M.
Overcoming the overlap problem in the assignment of H-1-NMR spectra of larger proteins
by use of 3-dimensional heteronuclear H-1-N-15 Hartmann-Hahn multiple quantum
coherence and nuclear Overhauser multiple quantum coherence spectroscopy-application to
interleukin-1-beta. Biochemistry 28, 6150-6156 (1989).
McDonald, I.K., & Thornton, J.M. Satisfying hydrogen bonding potential in proteins. J. Mol.
Biol. 238, 777-793 (1994).
McGinnis, W., Garber, R. L., Wirz, J., Kuroiwa, A. & Gehring, W.J. A homologous protein-coding
sequence in Drosophila homeotic genes and its conservation in other metazoans. Cell
37, 403-408 (1984).
Monsalve, M., Calles, B., Mencia, M., Rojo, F., & Salas, M. Binding of phage Φ29 protein p4 to
the early A2c promoter: recruitment of a repressor by the RNA polymerase. J. Mol. Biol.
283, 559-569 (1998).
Mucchielli, M., Mitsiadis, T.A., Raffo, S., Brunet, J., Proust, J., & Goridis, C. Mouse
Otlx2/RIEG expression in the odontogenic epithelium precedes tooth initiation and requires
mesenchyme-derived signals for its maintenance. Dev. Biol. 189, 275-284 (1997).
Muhandiram, D.R., & Kay, L.E. Gradient-enhanced triple-resonance 3-dimensional NMR
experiments with improved sensitivity. J. Magn. Reson. Ser. B 103, 203-216 (1994).
Muragaki, Y., Mundlos, S., Upton, J., & Olsen, B.R. Altered growth and branching patterns in
synpolydactyly caused by mutations in HOXD13. Science 272, 548-551 (1996).
Murray, J.C., Bennett, S.R., Kwitek, A.E., Small, K.W., Schinzel, A., Alward, W.L., Weber,
J.L., Bell, G.I., & Buetow, K.H. Linkage of Rieger Syndrome to the region of the epidermal
growth factor gene on chromosome 4. Nat. Genet. 2, 46-49 (1992).
Nakamura, T., Largaespada, D.A., Lee, M.P., Johnson, L.A., Ohyashiki, K., Toyama, K., Chen,
S.J., Willman, C.L., Chen, I.M., Feinberg, A.P., Jenkins, N.A., Copeland, N.G., &
Shaughnessy, J.D. Jr. Fusion of the nucleoporin gene NUP98 to HOXA9 by the chromosome
translocation t(7;11)(p15;p15) in human myeloid leukaemia. Nat. Genet. 12, 154-158 (1996).
Niessing, D., Driever, W., Sprenger, F., Taubert, H., Jackle, H., & Rivera-Pomar, R.
Homeodomain position 54 specifies transcriptional versus translational control by Bicoid.
Mol. Cell 5, 395-401 (2000).
Nishikawa, T., Okamura, H., Nagadoi, A., Konig, P., Rhodes, D., & Nishimura, Y. Solution
structure of a telomeric DNA complex of human TRF1. Structure 9, 1237-1251 (2001).
Otting, G., Qian, Y.Q., Muller, M., Affolter, M., Gehring, W., & Wuthrich, K. Secondary
structure determination for the Antennapedia homeodomain by nuclear magnetic resonance
and evidence for a helix-turn-helix motif. EMBO J. 7, 4305-4309 (1988).
Otting, G., Qian, Y.Q., Billeter, M., Muller, M., Affolter, M., Gehring, W.J., & Wuthrich, K.
Protein-DNA contacts in the structure of a homeodomain-DNA complex determined by
nuclear magnetic resonance spectroscopy in solution. EMBO J. 9, 3085-3092 (1990).
Otting, G., & Wuthrich, K. Heteronuclear filters in two-dimensional [1H,1H]-NMR
spectroscopy: combined use with isotope labeling for studies of macromolecular
conformation and intermolecular interactions. Quart. Rev. Biophys. 23, 39-96 (1990).
Otting, G., Liepinsh, E., Farmer, B.T. 2nd, & Wuthrich, K. Protein hydration studied with
homonuclear 3D 1H NMR experiments. J. Biomol. NMR 1, 209-215 (1991).
Otwinowski, Z., Schevitz, R.W., Zhang, R.G., Lawson, C.L., Joachimiak, A., Marmorstein,
R.Q., Luisi, B.F., & Sigler, P.B. Crystal structure of trp repressor/operator complex at
atomic resolution. Nature 335, 321-329 (1988).
Palmer, A.G. 3rd. Dynamic properties of proteins from NMR spectroscopy. Curr. Opin.
Biotechnol. 4, 385-391 (1993).
Pearlman, D.A., Case, D.A., Caldwell, J.W., Ross, W.S., Cheatham, T.E., deBolt, S., Ferguson,
D., Seibel, G., & Kollman, P.A. AMBER, a computer program for applying molecular
mechanics, normal mode analysis, molecular dynamics and free energy calculations to
elucidate the structures and energies of molecules. Comp. Phys. Commun. 91, 1-41 (1995).
Pellizzari, L., Tell, G., Fabbro, D., Pucillo, C., & Damante, G. Functional interference between
contacting amino acids of homeodomains. FEBS Lett. 407, 320-324 (1997).
Percival-Smith, A., Muller, M., Affolter, M., & Gehring, W.J. The interaction with DNA of
wild-type and mutant fushi tarazu homeodomains. EMBO J. 9, 3967-3974 (1990).
Perveen, R., Lloyd, C., Clayton-Smith, J., Churchill, A., van Heyningen, V., Hanson, I., Taylor,
D., McKeown, C., Super, M., Kerr, B., Winter, R., & Black, G.C.M. Phenotypic variability
and asymmetry of Rieger syndrome associated with PITX2 mutations. Invest. Ophthalmol.
Vis. Sci. 41, 2456-2460 (2000).
Pervushin, K., Wider, G. & Wuthrich, K. Deuterium relaxation in a uniformly 15N-labeled
homeodomain and its DNA complex. J. Am. Chem. Soc. 119, 3842-3843 (1997).
Piper, D. E., Batchelor, A. H., Chang, C., Cleary, M. L. & Wolberger, C. Structure of a HoxB1-
Pbx1 heterodimer bound to DNA: role of the hexapeptide and a fourth homeodomain helix
in complex formation. Cell 96, 587-597 (1999).
Pomerantz, J. L. & Sharp, P. A. Homeodomain determinant of major groove recognition.
Biochemistry 33, 10851-10858 (1994).
Priston, M., Kozlowski, K., Gill, D., Letwin, K., Buys, Y., Levin, A.V., Walter, M.A., & Heon,
E. Functional analyses of two newly identified PITX2 mutants reveal a novel molecular
mechanism for Axenfeld-Rieger syndrome. Hum. Mol. Genet. 10, 1631-1638 (2001).
Qian, Y. Q., Billeter, M., Otting, G., Muller, M., Gehring, W.J., & Wuthrich, K. The structure of
the Antennapedia homeodomain determined by NMR spectroscopy in solution: comparison
with prokaryotic repressors. Cell 59, 573-580 (1989).
Qian, Y.Q., Otting, G., Billeter, M., Muller, M., Gehring, W., & Wuthrich, K. Nuclear magnetic
resonance spectroscopy of a DNA complex with the uniformly 13C-labeled Antennapedia
homeodomain and structure determination of the DNA-bound homeodomain. J. Mol. Biol.
234, 1070-1083 (1993).
Qian, Y. Q., Resendez-Perez, D., Gehring, W. J. & Wuthrich, K. The des(1-6) Antennapedia
homeodomain: comparison of the NMR solution structure and the DNA-binding affinity
with the intact Antennapedia homeodomain. Proc. Natl. Acad. Sci. USA 91, 4091-4095 (1994a).
Qian, Y.Q., Furukubo-Tokunaga, K., Resendez-Perez, D., Muller, M., Gehring, W.J., &
Wuthrich, K. Nuclear magnetic resonance solution structure of the fushi tarazu
homeodomain from Drosophila and comparison with the Antennapedia homeodomain. J.
Mol. Biol. 238, 333-345 (1994b).
Qiu, M., Bulfone, A., Martinez, S., Meneses, J.J., Pedersen, R.A., & Rubenstein, J.L. Null
mutation of Dlx-2 results in abnormal morphogenesis of proximal first and second branchial
arch derivatives and abnormal differentiation in the forebrain. Genes Dev. 9, 2523-2538
(1995).
Quentien, M., Manfroid, I., Moncet, D., Gunz, G., Muller, M., Grino, M., Enjalbert, A., &
Pellegrini, I. Pitx factors are involved in basal and hormone-regulated activity of the human
prolactin promoter. J. Biol. Chem. 277, 44408-44416 (2002a).
Quentien, M., Pitoia, F., Gunz, G., Guillet, M., Enjalbert, A., & Pellegrini, I. Regulation of
Prolactin, GH, and Pit-1 gene expression in anterior pituitary by Pitx2: an approach using
Pitx2 mutants. Endocrinology 143, 2839-2851 (2002b).
Riise, R., Storhaug, K. & Brondum-Nielsen, K. Rieger syndrome is associated with PAX6
deletion. Acta Ophthalmol. Scand. 79, 201-203 (2001).
Rivera-Pomar, R., Lu, X., Perrimon, N., Taubert, H. & Jackle, H. Activation of posterior gap
gene expression in the Drosophila blastoderm. Nature 376, 253-256 (1995).
Rivera-Pomar, R., Niessing, D., Schmidt-Ott, U., Gehring, W. J. & Jackle, H. RNA binding and
translational suppression by bicoid. Nature 379, 746-749 (1996).
Saadi, I., Semina, E.V., Amendt, B.A., Harris, D.J., Murphy, K.P., Murray, J.C., & Russo, A.F.
Identification of a dominant negative homeodomain mutation in Rieger syndrome. J. Biol.
Chem. (2001).
Saadi, I., Kuburas, A., Engle, J.J., & Russo, A.F. Dominant negative dimerization of a mutant
homeodomain protein in Axenfeld-Rieger syndrome. Mol. Cell. Biol. 23, 1968-1982 (2003).
Saenger, W. Principles of Nucleic Acid Structure, Springer-Verlag, New York (1984).
Sattler, M., Schleucher, J., & Griesinger, C. Heteronuclear multidimensional NMR experiments
for the structure determination of proteins in solution employing pulsed field gradients.
Prog. NMR Spectros. 34, 93-158 (1999).
Schott, O., Billeter, M., Leiting, B., Wider, G., & Wuthrich, K. The NMR solution structure of
the non-classical homeodomain from the rat liver LFB1/HNF1 transcription factor. J. Mol.
Biol. 267, 673-683 (1997).
Schubert, S.W., Kardash, E., Khan, M.A., Cheusova, T., Kilian, K., Wegner, M., &
Hashemolhosseini, S. Interaction, cooperative promoter modulation, and renal colocalization
of GCMa and PITX2. J. Biol. Chem. 279, 50358-50365 (2004).
Schwabe, J.W.R. The role of water in protein-DNA interactions. Curr. Opin. Struct. Biol. 7, 126-134 (1997).
Scott, M.P., & Weiner, A.J. Structural relationships among genes that control development:
sequence homology between the antennapedia, ultrabithorax and fushi tarazu loci of
Drosophila. Proc. Natl. Acad. Sci. USA 81, 4115-4119 (1984).
Scott, M. P., Tamkun, J. W. & Hartzell, G. W. The structure and function of the homeodomain.
Biochim. Biophys. Acta 989, 25-48 (1989).
Semina, E.V., Reiter, R., Leysens, N.J., Alward, W.L.M., Small, K.W., Datson, N.A., Siegel-Bartelt,
J., Bierke-Nelson, D., Bitoun, P., Zabel, B.U., Carey, J.C., & Murray, J.C. Cloning
and characterization of a novel bicoid-related homeobox transcription factor gene, RIEG,
involved in Rieger syndrome. Nat. Genet. 14, 392-398 (1996a).
Semina, E.V., Datson, N.A., Leysens, N.J., Zabel, B.U., Carey, J.C., Bell, G.I., Bitoun, P.,
Lindgren, C., Stevenson, T., Frants, R.R., van Ommen, G., & Murray, J.C. Exclusion of
epidermal growth factor and high-resolution physical mapping across the Rieger syndrome
locus. Am. J. Hum. Genet. 59, 1288-1296 (1996b).
Semina, E. V., Reiter, R. S. & Murray, J. C. Isolation of a new homeobox gene belonging to the
Pitx/Rieg family: expression during lens development and mapping to the aphakia region on
mouse chromosome 19. Hum. Mol. Genet. 6, 2109-2116 (1997).
Semina, E. V., Ferrell, R.E., Mintz-Hittner, H.A., Bitoun, P., Alward, W.L.M., Reiter, R.S.,
Funkhauser, C., Daack-Hirsch, S., & Murray, J.C. A novel homeobox gene PITX3 is mutated
in families with autosomal-dominant cataracts and ASMD. Nat. Genet. 19, 167-170 (1998).
Semina, E. V., Murray, J. C., Reiter, R., Hrstka, R. F. & Graw, J. Deletion in the promoter region
and altered expression of Pitx3 homeobox gene in aphakia mice. Hum. Mol. Genet. 9, 1575-
1585 (2000).
Simon, M.D., Sato, K., Weiss, G.A., & Shokat, K.M. A phage display selection of engrailed
homeodomain mutants and the importance of residue Q50. Nucl. Acids Res. 32, 3623-3631
(2004).
Skelton, N.J., Palmer, A.G., Akke, M., Kordel, J., Rance, M., & Chazin, W.J. Practical aspects
of two-dimensional proton-detected 15N spin relaxation measurements. J. Magn. Reson. Ser. B
102, 253-264 (1993).
Slijper, M., Boelens, R., Davis, A.L., Konings, R.N., van der Marel, G.A., van Boom, J.H., &
Kaptein, R. Backbone and side chain dynamics of lac repressor headpiece (1-56) and its
complex with DNA. Biochemistry 36, 249-254 (1997).
Small, S., Blair, A., & Levine, M. Regulation of even-skipped stripe 2 in the Drosophila embryo.
EMBO J. 11, 4047-4057 (1992).
Smidt, M.P., van Schaick, H.S., Lanctot, C., Tremblay, J.J., Cox, J.J., van der Kleij, A.A.,
Wolterink, G., Drouin, J., & Burbach, J.P. A homeodomain gene Ptx3 has highly restricted
brain expression in the mesencephalic dopaminergic neurons. Proc. Natl. Acad. Sci. USA 94,
13305-13310 (1997).
Spera, S., & Bax, A. Empirical correlation between protein backbone conformation and Cα and
Cβ 13C chemical shifts in protein structure determination. J. Am. Chem. Soc. 113, 5490-5492
(1991).
St. Amand, T. R., Zhang, Y., Semina, E.V., Zhao, X., Hu, Y., Nguyen, L., Murray, J.C., &
Chen, Y. Antagonistic signals between BMP4 and FGF8 define the expression of Pitx1 and
Pitx2 in mouse tooth-forming anlage. Dev. Biol. 217, 323-332 (2000).
Struhl, G., Struhl, K. & Macdonald, P.M. The gradient morphogen Bicoid is a concentration-
dependent transcriptional activator. Cell 57, 1259-1273 (1989).
Stuart, A.C., Borzilleri, K.A., Withka, J.M., & Palmer, A.G. Compensating for variations in the
1H-13C scalar coupling constants in isotope-filtered NMR experiments. J. Am. Chem. Soc.
121, 5346-5347 (1999).
Subramaniam, V., Jovin, T.M., & Rivera-Pomar, R.V. Aromatic amino acids are critical for
stability of the Bicoid homeodomain. J. Biol. Chem. 276, 21506-21511 (2001).
Suh, H., Gage, P. J., Drouin, J. & Camper, S. A. Pitx2 is required at multiple stages of pituitary
organogenesis: pituitary primordium formation and cell specification. Development 129,
329-337 (2002).
Szeto, D. P., Rodriguez-Esteban, C., Ryan, A.K., O’Connell, S.M., Liu, F., Kioussi, C.,
Gleiberman, A.S., Izpisua-Belmonte, J.C., & Rosenfeld, M.G. Role of the Bicoid-related
homeodomain factor Pitx1 in specifying hindlimb morphogenesis and pituitary development.
Genes Dev. 13, 484-494 (1999).
Talluri, S., & Wagner, G. An optimized 3D NOESY-HSQC. J. Magn. Reson. Ser. B 112, 200-
205 (1996).
Tejada, M.L., Jia, Z., May, D., & Deeley, R.G. Determinants of the DNA-binding specificity of the
Avian homeodomain protein, AKR. DNA Cell Biol. 18, 791-804 (1999).
Tell, G., Acquaviva, R., Formisano, S., Fogolari, F., Pucillo, C., & Damante, G. Comparative
stability analysis of the thyroid transcription factor 1 and Antennapedia homeodomains:
evidence for residue 54 in controlling the structural stability of the recognition helix. Int. J.
Biochem. Cell Biol. 31, 1339-1353 (1999).
Thomas, B.L., Liu, J.K., Rubenstein, J.L., & Sharpe, P.T. Independent regulation of Dlx2
expression in the epithelium and mesenchyme of the first branchial arch. Development 127,
217-224 (2000).
Toro, R., Saadi, I., Kuburas, A., Nemer, M., & Russo, A.F. Cell-specific activation of the Atrial
Natriuretic Factor promoter by PITX2 and MEF2A. J. Biol. Chem. 279, 52087-52094
(2004).
Toyota, M., Kopecky, K.J., Toyota, M.O., Jair, K., Willman, C.L., & Issa, J.J. Methylation
profiling in acute myeloid leukemia. Blood 97, 2823-2829 (2001).
Treisman, J., Gonczy, P., Vashishtha, M., Harris, E. & Desplan, C. A single amino acid can
determine the DNA binding specificity of homeodomain proteins. Cell 59, 553-562 (1989).
Tremblay, J.J., Goodyer, C.G., & Drouin, J. Transcriptional properties of Ptx1 and Ptx2
isoforms. Neuroendocrinology 71, 277-286 (2000).
Tron, A.E., Bertoncini, C.W., Palena, C.M., Chan, R.L., & Gonzalez, D.H. Combinatorial
interactions of two amino acids with a single base pair define target site specificity in plant
dimeric homeodomain proteins. Nucl. Acids Res. 29, 4866-4872 (2001).
Tsai, C., & Gergen, J.P. Gap gene properties of the pair-rule gene runt during Drosophila
segmentation. Development 120, 1671-1683 (1994).
Tsao, D.H., Gruschus, J.M., Wang, L.H., Nirenberg, M., & Ferretti, J.A. Elongation of helix III
of the NK-2 homeodomain upon binding to DNA: a secondary structure study by NMR.
Biochemistry 33, 15053-15060 (1994).
Tucker-Kellogg, L., Rould, M.A., Chambers, K.A., Ades, S.E., Sauer, R.T., & Pabo, C.O.
Engrailed (Gln50->Lys) homeodomain-DNA complex at 1.9 Å resolution: structural basis for
enhanced affinity and altered specificity. Structure 5, 1047-1054 (1997).
van Heijenoort, C., Penin, F., & Guittet, E. Dynamics of the DNA binding domain of the fructose
repressor from the analysis of linear correlations between the 15N-1H bond spectral densities
obtained by nuclear magnetic resonance spectroscopy. Biochemistry 37, 5060-5073 (1998).
Voss, T.C., & Day, R.N. Editorial: Pitx-2 mutants and somatolactotroph gene regulation—
deciphering the combinatorial code. Endocrinology 143, 2836-2838 (2002).
Vuister, G.W., & Bax, A. Quantitative J correlations: a new approach for measuring
homonuclear three-bond J(HNHα) coupling constants in 15N-enriched proteins. J. Am. Chem.
Soc. 115, 7772-7777 (1993).
Walter, M.A., Mirzayans, F., Mears, A.J., Hickey, K., & Pearce, W.G. Autosomal-dominant
iridogoniodysgenesis and Axenfeld-Rieger syndrome are genetically distinct. Ophthalmology
103, 1907-1915 (1996).
Wang, J., Cieplak, P., & Kollman, P.A. How well does a restrained electrostatic potential (RESP)
model perform in calculating conformational energies of organic and biological molecules?
J. Comput. Chem. 21, 1049-1074 (2000).
Wang, Y., Zhao, H., Zhang, X., & Feng, H. Novel identification of a four-base-pair deletion
mutation in Pitx2 in a Rieger syndrome family. J. Dent. Res. 82, 1008-1012 (2003).
Watanabe, K., & Lambowitz, A.M. High-affinity binding site for a group II intron-encoded
reverse transcriptase/maturase within a stem-loop structure in the intron RNA. RNA 10,
1433-1443 (2004).
Wei, Q., & Adelstein, R.S. Pitx2a expression alters actin-myosin cytoskeleton and migration of
HeLa cells through Rho GTPase signaling. Mol. Biol. Cell 13, 683-697 (2002).
Westmoreland, J.J., McEwen, J., Moore, B.A., Jin, Y., & Condie, B.G. Conserved function of
Caenorhabditis elegans UNC-30 and mouse Pitx2 in controlling GABAergic neuron
differentiation. J. Neurosci. 21, 6810-6819 (2001).
Wilson, D.S., Guenther, B., Desplan, C., & Kuriyan, J. High resolution crystal structure of a
paired (pax) class homeodomain dimer on DNA. Cell 82, 709-719 (1995).
Wilson, D.S., Sheng, G., Jun, S., & Desplan, C. Conservation and diversification in
homeodomain-DNA interactions: A comparative genetic analysis. Proc. Natl. Acad. Sci.
USA 93, 6886-6891 (1996).
Wilson, J.E., Connell, J.E., Schlenker, J.D., & Macdonald, P.M. Novel genetic screen for genes
involved in posterior body patterning in Drosophila. Dev. Genet. 19, 199-209 (1996).
Wimmer, E.A., Simpson-Brose, M., Cohen, S.M., Desplan, C., & Jackle, H. Trans- and cis-
acting requirements for blastodermal expression of the head gap gene buttonhead. Mech.
Dev. 53, 235-245 (1995).
Wishart, D.S., Sykes, B.D., & Richards, F.M. Chemical shift index: a fast and simple method
for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry
31, 1647-1651 (1992).
Wittekind, M., & Mueller, L. HNCACB, a high-sensitivity 3D NMR experiment to correlate
amide-proton and nitrogen resonances with the α-carbon and β-carbon resonances in
proteins. J. Magn. Reson., Ser. B 101, 201-205 (1993).
Wolberger, C., Vershon, A.K., Liu, B., Johnson, A.D., & Pabo, C.O. Crystal structure of a MAT
alpha 2 homeodomain-operator complex suggests a general model for homeodomain-DNA
interactions. Cell 67, 517-528 (1991).
Wolberger, C. Transcription factor structure and DNA binding. Curr. Op. Struct. Biol. 3, 3-10
(1993).
Wu, J.H., Gottlieb, B., Batist, G., Sulea, T., Purisima, E.O., Beitel, L.K., & Trifiro, M. Bridging
structural biology and genetics by computational methods: An investigation into how the
R774C mutation in the AR gene can result in complete Androgen Insensitivity
Syndrome. Hum. Mut. 22, 465-475 (2003).
Wuthrich, K. NMR of Proteins and Nucleic Acids. New York: John Wiley & Sons, 1986.
Wuttke, D.S., Foster, M.P., Case, D.A., Gottesfeld, J.M., & Wright, P.E. Solution structure of
the first three zinc fingers of TFIIIA bound to the cognate DNA sequence: determinants of
affinity and sequence specificity. J. Mol. Biol. 273, 183-206 (1997).
Yamamoto, K., Yee, C.C., Shirakawa, M., & Kyogoku, Y. Characterization of the bacterially
expressed Drosophila engrailed homeodomain. J. Biochem. Tokyo 111, 793-797 (1992).
Yu, X., St Amand, T.R., Wang, S., Li, G., Zhang, Y., Hu, Y., Nguyen, L., Qiu, M., & Chen, Y.
Differential expression and functional analysis of Pitx2 isoforms in regulation of heart
looping in the chick. Development 128, 1005-1013 (2001).
Yuan, D., Ma, X., & Ma, J. Sequences outside the homeodomain of Bicoid are required for
protein-protein interaction. J. Biol. Chem. 271, 21660-21665 (1996).
Yuan, D., Ma, X., & Ma, J. Recognition of multiple patterns of DNA sites by Drosophila
homeodomain protein Bicoid. J. Biochem. 125, 809-817 (1999).
Zerbe, O., Szyperski, T., Ottinger, M., & Wuthrich, K. Three-dimensional 1H-TOCSY-relayed
ct-[13C,1H]-HMQC for aromatic spin system identification in uniformly 13C-labeled proteins.
J. Biomol. NMR 7, 99-106 (1996).
Zhang, O., Kay, L.E., Olivier, J.P., & Forman-Kay, J.D. Backbone 1H and 15N resonance
assignments of the N-terminal SH3 domain of drk in folded and unfolded states using
enhanced-sensitivity pulsed field gradient NMR techniques. J. Biomol. NMR 4, 845-858
(1994).
Zhao, C., Dave, V., Yang, F., Scarborough, T., & Ma, J. Target selectivity of Bicoid is dependent
on nonconsensus site recognition and protein-protein interaction. Mol. Cell. Biol. 20, 8112-
8123 (2000).