The Genetics of Atrial Septal Defect and Patent Foramen Ovale
EDWIN PHILIP ENFIELD KIRK
A thesis submitted in fulfilment
of the requirements for the degree of Doctor of Philosophy
December, 2007
School of Women’s and Children’s Health
University of New South Wales
ORIGINALITY STATEMENT
I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.
Signed ……………………………………………......
Date ……………………………………………......
COPYRIGHT STATEMENT
‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation’
Signed…………………………………………………….
Date……………………………………………………….
Authenticity Statement
‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’
Signed…………………………………………………….
Date……………………………………………………….
DEDICATION
To my dear wife, Sue.
With gratitude, and with all my love.
Abstract
Congenital heart disease is the most common form of birth defect, affecting approximately 1% of liveborn babies. Secundum atrial septal defect (ASD) is the second most common form of congenital heart disease (CHD). Most cases have no known cause. Chromosomal, syndromal and teratogenic causes account for a minority of cases. The hypothesis that mutations in the ASD genes NKX2-5 and GATA4 may cause apparently sporadic ASD was tested by sequencing them in unrelated probands with ASD. In this study, 1/102 individuals with ASD had an NKX2-5 mutation, and 1/129 had a deletion of the GATA4 gene.
The cardiac transcription factor TBX20 interacts with other ASD genes but had not previously been associated with human disease. Of 352 individuals with CHD, including 175 with ASD, 2 individuals, each with a family history of CHD, had pathogenic mutations in TBX20. Phenotypes included ASD, VSD, valvular abnormalities and dilated cardiomyopathy.
These studies of NKX2-5, GATA4 and TBX20 indicate that dominant ASD genes account for a small minority of cases of ASD, and emphasize the considerable genetic heterogeneity in dominant ASD (also caused by mutations in MYH6 and ACTC). A new syndrome of dominant ASD and the Marcus Gunn jaw winking phenomenon is reported. Linkage to known loci was excluded, extending this heterogeneity, but a whole genome scan did not identify a candidate locus for this disorder.
Previous studies of inbred laboratory mice showed an association between patent foramen ovale (PFO) and measures of atrial septal morphology, particularly septum primum length (“flap valve length” or FVL). In humans, PFO is associated with cryptogenic stroke and migraine, and is regarded as being in a pathological continuum with ASD. Twelve inbred strains, including 129T2/SvEms and QSi5, were studied, with generation of [129T2/SvEms x QSi5] F1, F2 and F14 mice. Studies of atrial morphology in 3017 mice confirmed the relationship between FVL and PFO but revealed considerable complexity. An F2 mapping study identified 7 significant and 6 suggestive quantitative trait loci (QTL), affecting FVL and two other traits, foramen ovale width (FOW) and crescent width (CRW). Binary analysis of PFO supported four of these.
Acknowledgements
During the course of an 8-year candidature, I have become indebted to a great number of people for help of many kinds. Above all, I am grateful to my wife, Susan O’Regan, for her love, support, encouragement and forbearance. My children, Seamus, Yasmin and Finn, are starting to understand what I’m up to and have been cheering me on in recent months. I followed my father into medicine and he has been an inspiration to me. My mother’s love and encouragement have been very important to me. In addition, she has given me help with graphics (a talent I managed not to inherit from her).
Academically, my greatest debts are, of course, to my supervisor, Prof Richard Harvey and co-supervisor, Dr Michael Buckley, for their patient guidance and friendship over the years. I’ve learned a great deal from both of them, and in particular I think I’m starting to get the hang of Richard’s lessons on structure (of manuscripts, as well as hearts). Prof Richard Henry was initially my supervisor. Over time the research moved in a different direction, but he continued to provide support and encouragement.
Many members of Richard Harvey’s lab have helped me and I am grateful to all of them. I want especially to thank Dr Christine Biben for teaching me to dissect mouse hearts, and for providing ongoing advice and occasional second opinions; Dr David Elliott for advice and guidance with molecular methods; and Dr Changbaig Hyun for his contributions to the QTL study and the studies of GATA4 and TBX20. Others at the Victor Chang Cardiac Research Institute who have helped include Leticia Castro, Louise Lynagh, Haley Crotty, Milena Furtado, and Drs Mauro Costa, Robyn Otway, Thomas Yeoh, Guanglan Guo, Owen Prall, Orit Wolstein, Daniel Schaft, Mark Solloway, Aaron Schindeler, Suchitra Chander, Fiona Stennard, and Donna Lai; all have helped me in various ways and have made the lab an enjoyable place to work. A/Prof Diane Fatkin has been generous with her time and advice.
At Sydney University, Prof Chris Moran has been a major guiding force, generous with his time and experience. A better collaborator could not be found. Dr Ian Martin taught me everything I have learned about care and breeding of laboratory mice, including how to dissect them. He also bred the F1 and F2 mice (Chapters 5 and 6), a large undertaking. Dr Peter Thomson has been unstinting in giving of his time and expertise in matters mathematical. All of the animal house staff, but especially Matt Jones and Mamdouh Nessiem, have been unfailingly helpful, often going well beyond the scope of their duties to make sure that things ran well. Noelia Lopez ordered and bred the Hapmap mice, and did a number of dissections as well as co-measuring a subset of these mice. Kim Dilati provided assistance in the last few weeks of the advanced intercross line breeding and dissection, when things were very busy.

At Sydney Children’s Hospital, members of the Department of Medical Genetics have been unfailingly supportive and helpful. Dr Fiona McKenzie worked as a part time research assistant during the first year of the project, and I doubt it would have got off the ground at all without her kickstarting things. Dr Owen Jones was invaluable in recruitment of children with ASD. At the South Eastern Area Laboratory Service, Peter Taylor, George Eliakis and Glenda Mullan have been especially helpful, but many other members of staff have given advice or help over the years. At the Children’s Hospital at Westmead, A/Prof David Winlaw and members of his lab have contributed greatly to the human studies.
Prof Ian Glass first identified the ASD + Marcus Gunn family and contributed greatly to recruitment of family members; he continues to advise and encourage on that project. A/Prof Jenny Donald provided guidance on mapping, and Dr Kyall Zenger taught me to drive LINKAGE, as well as doing a good deal of analysis towards the project himself. Dr Carol Cheung did a similar QTL study and taught me to use Mapmaker/QTL, along with a great deal of other advice and help. Many cardiologists have contributed their time and expertise, especially Dr Rob Justo, Dr Michael Tsicalis (and his wonderful secretary, Dianne Reddell) and Prof Michael Feneley. My thanks to them. Every co-author on the papers arising from this project has helped me, and my thanks are due to all. There have also been many others, such as the private laboratory services who never failed to help with specimen collection and shipping, whom I’m unable to name but would like to acknowledge. Likewise, I am immensely grateful to but unable to name the many people who volunteered as research subjects, particularly those whose families I studied and into whose homes and workplaces I was invited. If there are others I should have named but have omitted, my apologies as well as thanks to them.
Lastly, I am deeply indebted to the bodies which funded this research. The National Heart Foundation of Australia awarded me a scholarship as well as providing research grant support. Goldman Sachs Australia, the Sydney Children’s Hospital Foundation, the National Institutes of Health in the United States, the RT Hall Trust and the Royal College of Pathologists of Australasia all provided research grant support; I am very grateful to them all.
Publications arising from this work
1. Elliott DA, Kirk EP, Yeoh T, Chandar S, McKenzie F, Taylor P, Grossfeld P, Fatkin D, Jones O, Hayes P, Feneley M, Harvey RP. Cardiac homeobox gene NKX2-5 mutations and congenital heart disease - associations with atrial septal defect and hypoplastic left heart syndrome. Journal of the American College of Cardiology 2003;41(11):2072-2076
2. Kirk EP, Hyun C, Thomson PC, Lai D, Castro ML, Biben C, Buckley MF, Martin ICA, Moran C, Harvey RP. Quantitative Trait Loci Modifying Cardiac Atrial Septal Morphology and Risk of Patent Foramen Ovale in the Mouse. Circulation Research 2006;98:651-658
3. Kirk EP*, Sunde M*, Costa MW, Rankin SA, Wolstein O, Castro ML, Butler TL, Hyun C, Guo G, Otway R, Mackay JP, Waddell LB, Cole AD, Hayward C, Keogh A, Macdonald P, Griffiths L, Fatkin D, Sholler GF, Zorn AM, Feneley MP, Winlaw DS, Harvey RP. Mutations in cardiac T-box factor gene TBX20 are associated with diverse cardiac pathologies, including defects of septation and valvulogenesis and cardiomyopathy. American Journal of Human Genetics 2007;81:280-291 *shared first authorship
Published conference proceedings
1. Harvey RP, Lai D, Elliott D, Biben C, Solloway M, Prall O, Stennard F, Schindeler A, Groves N, Lavulo L, Hyun C, Yeoh T, Costa M, Furtado M, Kirk E. Homeodomain Factor Nkx2-5 in Heart Development and Disease. Cold Spring Harbor Symposium on Quantitative Biology. Volume 67, Cold Spring Harbor Laboratory Press, Cold Spring Harbor 2002.
Table of Contents

Dedication
Abstract
Acknowledgements
Publications arising from this work
Table of Contents
List of figures
List of tables
Abbreviations used
1. Literature review
1.1 Overview
1.2 Genes and human disease
1.3 Mapping Mendelian disorders
1.3.1 Principles of Mendelian inheritance
1.3.1.1 Dominance and recessiveness
1.3.2 Meiotic recombination
1.3.3 Maps of genetic variation
1.3.4 Mapping Mendelian disorders
1.4 Quantitative Genetics
1.4.1 Quantitative trait loci
1.4.1.1 The liability model for binary traits
1.4.2 Mapping QTL
1.4.2.1 Experimental designs for QTL mapping
1.4.2.2 Significance thresholds
1.4.2.3 Selective genotyping
1.4.2.4 Software packages for QTL mapping
1.4.2.5 The mouse as a model organism
1.4.3 Identifying the underlying genetic basis of QTL
1.4.4 The mouse Hapmap project: application to QTL mapping
1.5 The heart
1.5.1 Normal cardiac anatomy
1.5.2 Heart development
1.5.3 The interatrial septum
1.5.4 Regulation of cardiac development by transcription factors
1.6 Congenital heart disease
1.6.1 Types of CHD
1.6.2 Causes of CHD
1.6.3 Patent foramen ovale
1.6.3.1 PFO and stroke
1.6.3.2 PFO and migraine
1.6.3.3 Other pathological consequences of PFO
1.6.3.4 Genetics of PFO
1.6.4 Atrial septal defect
1.6.4.1 Secundum ASD
1.6.4.2 Ostium primum ASD
1.6.4.3 Sinus venosus ASD
1.6.4.4 Coronary sinus ASD
1.6.4.5 Pathology associated with ASD
1.6.5 Relationship between PFO and ASD
1.7 Causes of ASD
1.7.1 Syndromes associated with CHD
1.7.1.1 Holt-Oram syndrome
1.7.1.2 Chromosomal disorders, particularly 8p deletions
1.7.2 Non-syndromal Mendelian ASD
1.7.3 Multifactorial/polygenic causation of ASD
1.7.3.1 Excess of females affected by ASD
1.7.3.2 QTL for CHD
1.7.4 Environmental factors
1.7.4.1 Major teratogens
1.7.4.2 Other environmental factors
1.8 Project outline
2. Materials and Methods

2.1 Mouse experiments
2.1.1 Ethics committee approval
2.1.2 Animal resources
2.1.3 Breeding protocols
2.1.3.1 F2 mice
2.1.3.2 Advanced intercross line
2.1.4 Mouse phenotyping
2.1.4.1 Initial dissection
2.1.4.2 Fine dissection
2.1.4.3 Identification of patent foramen ovale
2.1.4.4 Measurements of atrial septal anatomy
2.1.4.5 Blinding
2.1.5 Strain selection for F2 and AIL studies
2.2 Human subjects
2.2.1 Ethics committee approval
2.2.2 Ascertainment of subjects
2.2.2.1 Children
2.2.2.2 Adults
2.2.2.3 Numbers of subjects studied for mutations in NKX2-5 & GATA4
2.2.2.4 Follow-up of family members
2.2.3 History
2.2.4 Examination
2.2.5 Investigations
2.3 Molecular genetics methods
2.3.1 Extraction of DNA from human blood and mouse spleens
2.3.1.1 DNA extraction from mouse spleens
2.3.1.2 DNA extraction from blood
2.3.4 Polymerase chain reaction (PCR) and sequencing of NKX2-5 and GATA4
2.3.4.1 PCR of NKX2-5
2.3.4.2 PCR of Exon 5 of GATA4
2.3.5 Sequence analysis
2.4 Microsatellite analysis
2.5 Marker selection
2.5.1 Human Markers
2.5.2 Mouse Markers
2.6 Laboratory methods used at AGRF
2.7 Error checking
2.7.1 Error Checking of Human Data
2.7.2 Error Checking of Mouse Data
2.8 Statistical methods
2.8.1 Basic statistical analyses
2.9 Linkage analysis
2.9.1 Linkage analysis for autosomal dominant trait
2.9.2 QTL analysis
2.9.2.1 Selective Genotyping
2.9.2.2 Linkage analyses
2.9.2.3 Binary trait analysis
3. The role of mutations in the cardiac transcription factors NKX2-5, GATA4 and TBX20 in causing CHD and cardiomyopathy
3.1 Introduction
3.2 Mutations in NKX2-5 cause autosomal dominant CHD and AV conduction block
3.2.1 Subjects screened for NKX2-5 mutations
3.2.2 Results
3.2.2.1 Family 1024: T178M
3.2.2.2 Family AF1: E21Q
3.2.3 Role of NKX2-5 mutations in nonsyndromal ASD
3.2.4 Implications for asymptomatic mutation-positive individuals
3.3 The role of mutations in GATA4 in ASD and PFO
3.3.1 Subjects screened for GATA4 mutations
3.3.2 Results of sequencing and cytogenetic analysis
3.3.2.1 Family 1012 – GATA4 variants A411V and S377G
3.3.2.2 Family z10 – GATA4 variant D425N
3.3.2.3 Family 1006 – 8p23 deletion
3.3.3 The common variant S377G – possible role in PFO with stroke
3.3.4 Role of GATA4 mutations in nonsyndromal ASD
3.4 Mutations in TBX20 are associated with diverse cardiac pathologies, including abnormal septation and valvulogenesis, and cardiomyopathy
3.4.1 Subjects screened for TBX20 mutations
3.4.2 TBX20 mutations
3.4.2.1 Family 9001: TBX20 mutation I152M
3.4.2.2 Family z103: TBX20 mutation Q195X
3.4.2.3 Family WM1: TBX20 polymorphism T209I
3.4.3 Functional and other studies of the TBX20 mutations
3.4.3.1 Transcriptional assays of Tbx20 function
3.4.3.2 Xenopus embryo gastrulation assay
3.4.3.3 Protein modelling
3.4.4 Significance of mutations in TBX20
3.5 Conclusions: the role of mutations in NKX2-5, GATA4 and TBX20 in human disease
4. Atrial septal defect and Marcus Gunn phenomenon: further evidence for clinical and genetic heterogeneity in autosomal dominant atrial septal defect
4.1 Introduction
4.2 Marcus Gunn phenomenon
4.3 Phenotypes of affected family members
4.4 Cytogenetics
4.5 Sequencing of cardiac genes
4.6 Mapping results
4.6.1 Chromosome 1
4.6.2 Chromosome 5
4.6.3 Chromosome 6
4.6.4 Chromosome 7
4.6.5 Chromosome 8
4.6.6 Chromosome 12
4.6.7 Chromosome 14
4.6.8 Chromosome 15
4.7 Discussion
4.7.1 Linkage results
4.7.2 ASD and MGP
4.7.3 Clefting
4.7.4 Future studies
5. Cardiac atrial septal morphology and risk of patent foramen ovale in inbred laboratory mice
5.1 Introduction
5.2 The relationship between atrial septal morphology and PFO: previous work
5.3 Selection and breeding of mice for study
5.4 Analysis of data from QSi5, 129T2/SvEms, and the [QSi5 x 129T2/SvEms] F1, F2 and F14 mice
5.4.1 Descriptive statistics
5.4.2 Relationships between the continuous traits
5.4.3 Analysis of variance for factors affecting FVL, FOW and CRW in F2 mice
5.4.4 Relationship between PFO and the continuous variables
5.4.4.1 Relationship between FVL and PFO
5.4.4.2 Relationship between FOW and PFO
5.4.4.3 Relationship between CRW and PFO
5.4.5 Biological significance of the relationships between FVL, FOW and CRW
6. Quantitative trait loci modifying cardiac atrial septal morphology and risk of patent foramen ovale in inbred laboratory mice
6.1 Introduction
6.2 Study design
6.3 Selection of mice for genotyping
6.4 Markers used
6.5 Linkage results
6.6 Chromosomes with noteworthy findings
6.6.1 MMU1
6.6.2 MMU2
6.6.3 MMU3
6.6.4 MMU4
6.6.5 MMU6
6.6.6 MMU7
6.6.7 MMU8
6.6.8 MMU9
6.6.9 MMU10
6.6.10 MMU13
6.6.11 MMU15
6.6.12 MMU18
6.6.13 MMU19
6.7 Discussion
6.7.1 Cryptic QTL
6.7.2 Binary trait analysis
6.7.3 Genetic relationship between FVL, FOW and CRW
6.7.4 Contribution of the identified QTL to the phenotypes under study
6.7.5 Candidate genes
6.7.6 Future studies
7. Comparisons of atrial septal anatomy in 12 strains of inbred laboratory mice reveal unexpected complexity
7.1 Introduction
7.2 History of the inbred laboratory mouse
7.3 Application of mouse haplotype data to mapping
7.4 Number of mice to phenotype
7.5 Analyses of Hapmap strains
7.6 Training of a second observer
7.7 Assessments of the reliability of measurement of atrial septal anatomy
7.8 Descriptive statistics for the Hapmap strains (including 129T2/SvEms and QSi5)
7.9 ASD in DBA/1J mice
7.10 Relationships between PFO and other traits in the 12 strains of inbred mice
7.11 Comparison with the study by Biben and colleagues
7.12 Conclusions
8. Conclusions and Future Directions
8.1 Genetic heterogeneity and its clinical implications
8.2 Cardiac phenotypes other than CHD
8.2.1 AV conduction abnormalities and ASD
8.2.2 Cardiomyopathy in association with mutations in TBX20 and NKX2-5
8.3 The role of NKX2-5, GATA4 and TBX20 in multifactorial ASD
8.4 Future studies of dominant ASD genes in unselected subjects
8.5 Mapping genes affecting prevalence of PFO in inbred laboratory mice
8.6 Significance of findings
REFERENCES

APPENDIX 1: ASD and Marcus Gunn Phenomenon - Linkage Results

APPENDIX 2: Microsatellite markers used for QTL mapping

APPENDIX 3: Candidate genes within QTL
List of figures

Figure 1.1: Dominance
Figure 1.2: The adult mammalian heart
Figure 1.3: Normal cardiac development
Figure 1.4: Relative arrangement of septum primum and septum secundum

Figure 2.1: Cartoon illustrating the breeding scheme used
Figure 2.2: Dissection of mouse hearts
2.2A: Initial dissection
2.2B: Opening the auricle
2.2C: Laying open the atrium
2.2D: Final appearance of the heart
Figure 2.3: Detail of atrial septum

Figure 3.1: Families with ASD and NKX2-5 sequence changes
Figure 3.2: Families with GATA4 variants and 8p23 deletion
Figure 3.3: Families with TBX20 mutations
Figure 3.4: Transcription studies and Xenopus gastrulation assay

Figure 4.1: Family with ASD and MGP
Figure 4.2: Multipoint mapping of chromosome 5

Figure 5.1: Scatterplot of FVL vs heart weight in F2 mice
Figure 5.2: Histogram of FVL in F2 mice with and without PFO
Figure 5.3: Histogram of FVL in F14 mice with and without PFO
Figure 5.4: Histogram of FOW in F2 mice with and without PFO
Figure 5.5: Histogram of FOW in F14 mice with and without PFO
Figure 5.6: Histogram of CRW in F2 mice with and without PFO
Figure 5.7: Histogram of CRW in F14 mice with and without PFO

Figure 6.1: MMU1
Figure 6.2: MMU2
Figure 6.3: MMU3
Figure 6.4: MMU4
Figure 6.5: MMU5
Figure 6.6: MMU6
Figure 6.7: MMU7
Figure 6.8: MMU8
Figure 6.9: MMU9
Figure 6.10: MMU10
Figure 6.11: MMU11
Figure 6.12: MMU12
Figure 6.13: MMU13
Figure 6.14: MMU14
Figure 6.15: MMU15
Figure 6.16: MMU16
Figure 6.17: MMU17
Figure 6.18: MMU18
Figure 6.19: MMU19
Figure 6.20: MMUX (female mice)
Figure 6.21: MMUX (male mice)

Figure 7.1: ASD in a DBA/1J mouse
Figure 7.2: Scatterplot of %PFO vs FVL
Figure 7.3: Scatterplot of %PFO vs FOW
Figure 7.4: Scatterplot of %PFO vs CRW
Figure 7.5: Scatterplot of Weight vs Heart Weight

Figure 8.1: Mapping results for chromosome 1
List of tables

Table 1.1: Percentage of CHD accounted for by the most common lesions
Table 1.2: Reports of dominant ASD prior to the first identification of causative mutations
Table 1.3: Environmental exposures and other factors significantly associated with risk of ASD
Table 1.4: Environmental exposures and other factors with no significant association with risk of ASD

Table 2.1: Breeding scheme for AIL
Table 2.2: Primers used for PCR and sequencing

Table 3.1: Mutations in NKX2-5
Table 3.2: Patient characteristics (NKX2-5)
Table 3.3: Mutations in GATA4
Table 3.4: S377G allele distribution in indigenous human populations
Table 3.5: S377G in Caucasian subjects
Table 3.6: Characteristics of subjects sequenced for TBX20 mutations

Table 5.1: Characteristics of parental strains, F1 and F2 mice
Table 5.2: Basic statistical data and correlations for F2 mice
Table 5.3: Basic statistical data and correlations for F14 mice
Table 5.4: Comparison between data for 129T2/SvEms mice with and without PFO
Table 5.5: Analysis of variance for FVL in F2 mice
Table 5.6: Analysis of variance for FVL in F14 mice
Table 5.7: Analysis of variance for FOW in F2 mice
Table 5.8: Analysis of variance for FOW in F14 mice
Table 5.9: Analysis of variance for CRW in F2 mice
Table 5.10: Analysis of variance for CRW in F14 mice

Table 6.1a: Loci with LOD score >2.8 for FVL
Table 6.1b: Loci with LOD score >2.8 for FOW
Table 6.1c: Loci with LOD score >2.8 for CRW
Table 6.2: LOD scores at loci of orthologues of reported human ASD genes

Table 7.1: Measures of inter-rater reliability
Table 7.2: Descriptive statistics for 12 strains of inbred laboratory mice
Table 7.3: Correlation between mean values for %PFO, FVL, FOW, CRW, weight and heart weight
Table 7.4: Results of ANOVA for FVL, FOW and CRW – single analyses with PFO as the model
Table 7.5: Combined mean values for 12 mouse strains with and without PFO

Table A1.1a: Chromosome 1
Table A1.1b: Chromosome 1 – additional markers
Table A1.2: Chromosome 2
Table A1.3: Chromosome 3
Table A1.4: Chromosome 4
Table A1.5: Chromosome 5
Table A1.6: Chromosome 6
Table A1.7: Chromosome 7
Table A1.8: Chromosome 8
Table A1.9: Chromosome 9
Table A1.10: Chromosome 10
Table A1.11: Chromosome 11
Table A1.12: Chromosome 12
Table A1.13: Chromosome 13
Table A1.14: Chromosome 14
Table A1.15: Chromosome 15
Table A1.16: Chromosome 16
Table A1.17: Chromosome 17
Table A1.18: Chromosome 18
Table A1.19: Chromosome 19
Table A1.20: Chromosome 20
Table A1.21: Chromosome 21
Table A1.22: Chromosome 22

Table A2.1: List of markers with map location

Table A3.1: Genes within QTL affecting FVL
Table A3.2: Genes within QTL affecting FOW
Table A3.3: Genes within QTL affecting CRW
Abbreviations used

AF Atrial fibrillation
AGRF Australian Genome Research Facility
AIL Advanced intercross line
ANOVA Analysis of variance
ASD Atrial septal defect
AS Aortic stenosis
ASA Atrial septal aneurysm
AV Atrioventricular
AVCD Atrioventricular canal defects
AVN Atrioventricular node
BAV Bicuspid aortic valve
Coarct Coarctation of the aorta
CHD Congenital heart disease
CI Confidence interval
cM centiMorgans
CMP Cardiomyopathy
CRW Crescent width
DC Direct current
DCM Dilated cardiomyopathy
DNA Deoxyribonucleic acid
d-TGA d-Transposition of the great arteries
ECG Electrocardiogram
FDR False discovery rate
FISH Fluorescent in-situ hybridization
FOW Foramen ovale width
FVL Flap valve length
GLM General linear model
HLHS Hypoplastic left heart syndrome
HOS Holt-Oram syndrome
IM Interval mapping
LA Left atrium
LOD Logarithm of odds
LSVC Left superior vena cava
LV Left ventricle
MGP Marcus Gunn phenomenon
MR Mitral regurgitation
MS Mitral stenosis
MV Mitral valve
MVP Mitral valve prolapse
OR Odds ratio
PA Pulmonary atresia
PCR Polymerase chain reaction
PDA Patent ductus arteriosus
PFO Patent foramen ovale
PS Pulmonary stenosis
PTA Persistent truncus arteriosus
QTL Quantitative trait locus
RA Right atrium
RFLP Restriction fragment length polymorphism
RNA Ribonucleic acid
RV Right ventricle
SD Standard deviation
SHF Second heart field
SMM Single marker mapping
SNP Single nucleotide polymorphism
TA Tricuspid atresia
TAPVR Total anomalous pulmonary venous return
TOF Tetralogy of Fallot
TR Tricuspid regurgitation
VSD Ventricular septal defect
VE Variance due to environmental effects
VF Ventricular fibrillation
VG Variance due to genetic effects
VT Total variance
WPW Wolff-Parkinson-White syndrome
θ Recombination fraction
1. Literature review
1.1 Overview
Congenital heart disease (CHD) is the most common form of birth defect, with estimates of birth incidence in liveborn children ranging from 0.4% to 1.0% (Bower and Ramsay, 1994; Grech and Gatt, 1999; Ferencz et al., 1985; Gillum, 1994). A comprehensive review of epidemiological studies of CHD by Hoffman and Kaplan (Hoffman and Kaplan, 2002) yielded a combined incidence of 6/1000 live births, rising to 75/1000 live births if trivial lesions, such as tiny muscular ventricular septal defects (VSDs) present at birth but closing thereafter, are included. Although some cardiac malformations are relatively benign, the overall morbidity and mortality associated with CHD are enormous. In the 10-year period 1979-1988, there were 46,450 deaths attributed to CHD in the United States of America, of which 26,319 occurred in the first year of life (Gillum, 1994). In Australia, 15% of neonatal deaths and 11% of post-neonatal childhood deaths are attributable to CHD (Bower and Ramsay, 1994).
Given this, it is perhaps surprising how little is known of the causes of CHD. There are many genetic syndromes associated with CHD, and there has been considerable success in elucidating the causes of these. However, 75% of CHD is non-syndromic, in the sense that there are no evident associated features (Bower and Ramsay, 1994), and even among the “syndromic” category, not all cases have a known cause. Genetic factors make an important contribution to non-syndromic CHD, but specific mutations have so far been identified in only a small minority of cases (Elliott et al., 2003; McElhinney et al., 2003). Teratogens such as alcohol, although important because of the potential for prevention, account for a small minority of cases (Tikkanen and Heinonen, 1992). This applies even where the relative risk associated with an exposure is comparatively high, such as the risk of CHD associated with maternal diabetes. For this exposure, Pradat (Pradat, 1992) found a relative risk for all CHD of 2.67 (95% CI 1.43-4.99), and an even higher risk for septal defects (ASD and VSD combined) for which the relative risk was 6.2 (95% CI 1.97-19.5).
Nonetheless, in this study only 1.2% of CHD was attributable to maternal diabetes (specific figures were not available for the subgroup of septal defects).
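The contrast between a sizeable relative risk and a small attributable fraction can be illustrated with a short calculation. The sketch below uses Levin's population attributable fraction formula, which is an assumption made here for illustration and not necessarily the method used by Pradat; the function names are hypothetical. It back-calculates the exposure prevalence that would be consistent with a relative risk of 2.67 and an attributable fraction of 1.2%.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1))."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def implied_exposure_prevalence(paf: float, relative_risk: float) -> float:
    """Invert Levin's formula to find the exposure prevalence p for a given PAF."""
    return paf / ((1.0 - paf) * (relative_risk - 1.0))


# Figures quoted in the text: relative risk for all CHD with maternal
# diabetes, and the fraction of CHD attributed to that exposure.
rr_all_chd = 2.67
paf_reported = 0.012

# Illustrative back-calculation only; the prevalence is not a reported value.
p = implied_exposure_prevalence(paf_reported, rr_all_chd)
print(f"Implied exposure prevalence: {p:.2%}")
print(f"Round-trip attributable fraction: {attributable_fraction(p, rr_all_chd):.3f}")
```

Under these assumptions the implied prevalence of exposure is under 1%, which is why an exposure with a relative risk well above 2 can still account for only a small share of all cases.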
The parents of a child born with CHD naturally want to know why their child has this problem. Based on current knowledge, as sketched above, the clinician treating the child will usually be unable to answer this question in more than general terms. This is the underlying motivation behind the work reported here: the drive to understand the causes of CHD in more detail. The focus is necessarily narrow – an attempt to delineate the genetic contribution to defects of atrial septal morphogenesis, specifically secundum atrial septal defect (ASD) and patent foramen ovale (PFO). However, it is anticipated that the lessons learned from study of these disorders will have wider relevance to other forms of CHD.
To place these genetic studies in context, this chapter starts with a discussion of the contribution of genes to human disease, and ways of unravelling it, including mapping techniques for Mendelian and quantitative traits. Next, there is a description of cardiac development and particularly the role of the transcription factors NKX2-5, GATA4 and TBX20 in early cardiac development. Finally the nature, epidemiology and causation of CHD are reviewed, with a focus on ASD and PFO.
1.2 Genes and human disease At the time of writing, the Entrez Genome Project web page listed 413 eukaryotic genome sequencing projects, of which 25 were complete, 162 at the assembly stage and 226 in progress (http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi, accessed on 23rd August, 2007). In addition, there are nearly fourteen hundred prokaryote genome projects at various stages. The human genome project published its draft sequence of the human genome as long ago as 2001 (Lander et al., 2001).
From this vantage point, with vast and ever-increasing repositories of genomic information readily accessible to us, it is easy to forget that it is only a little over
a century ago that the first paper correctly identifying a genetic mechanism for a human disease was published. This was Garrod’s seminal paper on alkaptonuria (Garrod, 1902), in which he recognised that alkaptonuria is inherited in an autosomal recessive fashion. This in turn occurred only 37 years after Mendel’s publication of the principles of what is now known as Mendelian inheritance. Prior to these advances, while it was undoubtedly recognised that some traits and diseases were hereditary, there was no accurate understanding of the mechanisms underlying this.
It is now clear that there is a genetic contribution to many, and perhaps most, forms of human disease. Over 1000 genes for rare disorders which conform to Mendelian inheritance have now been identified. However, even taken together, these account for only a small proportion of human disease (Altshuler et al., 2005) – perhaps 1-2% in all (Rimoin et al., 2002). More importantly, most common disease has an identifiable genetic contribution – from ischaemic heart disease (Shiffman et al., 2005) to most forms of cancer (Knoepfler, 2007). In the case of cancer the genetic contribution is not necessarily hereditary – in most instances it consists of acquired, somatic mutation rather than inherited germline mutation. Even conditions which have a readily identifiable external cause, such as trauma and infectious diseases, can be shown to have a genetic contribution. Impulsive behaviour and risk-taking, which increase the risk of trauma, are contributed to by genetic factors (Kreek et al., 2005). Host factors which are genetically determined affect the response to infectious disease, influencing likelihood of clinically recognised infection and severity (Casanova and Abel, 2007).
The reverse is also true. Disorders which have been thought of as purely genetic in origin are influenced by environment. Children with cystic fibrosis, a classic example of an autosomal recessive disorder, have lung disease the severity of which is determined in part by which pathogens they happen to encounter (Jones et al., 2004). The phenotype in phenylketonuria can be greatly modified by provision of a modified diet low in phenylalanine (Scriver and Waters, 1999). There are numerous other examples, but more importantly,
it is undoubtedly the case that less obvious, currently unrecognised, environmental influences contribute to the phenotype of most (perhaps all) genetic disease. Chance also represents an intrinsically unobservable, but important, component of this environmental contribution. The regulation of gene expression (and by extension, development) is an inherently stochastic process (Fiering et al., 2000; Kaern et al., 2005).
Human disease, then, can be seen as the outcome of a complex interplay between genes and environmental factors. Understanding the causation of many forms of human disease – including CHD – thus requires investigation of both genetic and environmental contributions. Environmental influences can be studied by epidemiological means, or in some instances by the use of animal models, although this can only ever provide indirect evidence for the role of a particular environmental exposure in human disease. Epidemiological studies of ASD and PFO are reviewed in section 1.7.4 of this chapter. In the remainder of this section, methods of mapping traits inherited in a Mendelian fashion and quantitative trait mapping will be reviewed.
1.3 Mapping Mendelian disorders Mapping of Mendelian disorders relies on Mendel’s principles of inheritance, modified by the increased likelihood of co-segregation of alleles which are in physical proximity, and on the effects of meiotic recombination. It is greatly facilitated by the availability of maps of human genetic variation.
1.3.1 Principles of Mendelian inheritance Mendel derived four main principles from his work on the garden pea, summarised by Cook et al (Cook J et al., 2002) as follows:
1. Genes come in pairs (Mendel termed them factors), one inherited from each
parent.
2. Individual genes can have different alleles, some of which (dominant traits)
exert their effects over others (recessive traits) – the principle of dominance. In Mendel’s own words “those characters which are transmitted entire, or
almost unchanged in the hybridisation, and therefore in themselves constitute
the characters of the hybrid, are termed the dominant, and those which become
latent in the process recessive”
3. At meiosis alleles segregate from each other with each gamete receiving only
one allele – the principle of segregation, or Mendel’s first law.
4. The segregation of different pairs of alleles is independent – the principle of
assortment, or Mendel’s second law.
Some modification to these principles has been required. X-linked inheritance, in which females have a pair of alleles but males only a single copy (hemizygosity), was not discussed by Mendel. Nonetheless X-linked disorders are considered Mendelian because the principles of X-linked inheritance are essentially a special case of the principles described by Mendel.
Mendel’s second law is true except for alleles which are located close together on the same chromosome, which do not segregate independently. Modification of Mendel’s principles by the addition of this fact formed the basis of the mapping studies of Morgan, and indeed of all subsequent genetic mapping studies, including those reported here.
1.3.1.1 Dominance and recessiveness The term “dominance” is used somewhat differently in Mendelian and quantitative genetics, and somewhat differently again in reference to disease states. For convenience, all three uses of the term will be discussed at this point. Elsewhere in this thesis, the terms are used in the sense relevant to the topic under discussion.
In both Mendelian and quantitative genetics, dominance is a term which describes the relationship between two alleles. In Mendelian genetics, an allele (A) is dominant to another (B) if the phenotype in heterozygous organisms is the same as the phenotype in those homozygous for the A allele (Cook J et al., 2002). The allele B is recessive to the allele A. Incomplete dominance, also called semidominance, occurs if the phenotype in the heterozygote is intermediate between that seen with the two homozygous states.
In speaking of diseases with Mendelian inheritance, however, dominant inheritance refers to any disorder in which heterozygosity for a mutated allele is sufficient to cause a pathological state.
For some autosomal dominant human diseases, it has been demonstrated that the disorder conforms to the more rigorous definition of dominance. For example, homozygosity for the triplet repeat expansion responsible for Huntington disease produces a phenotype indistinguishable from that seen in heterozygotes (Wexler et al., 1987). Similarly, a woman homozygous for a BRCA1 mutation had breast cancer at the age of 32, consistent with the phenotype seen in heterozygotes (Boyd et al., 1995). The age of onset of her cancer would not have been unusually early for a woman with a heterozygous mutation, and indeed a woman in the same family, presumably heterozygous for the same mutation, had breast cancer aged 22 (Boyd et al., 1995).
However, there are numerous examples of autosomal dominant disorders which in classical Mendelian terminology would be referred to as semidominant. For example, heterozygosity for mutations in KCNQ1 causes long QT syndrome, which is described as an autosomal dominant condition. Homozygosity causes the Jervell and Lange-Nielsen syndrome with severe long QT syndrome and sensorineural deafness (Splawski et al., 1997). Similarly, heterozygous mutations in CDMP1 cause minor skeletal anomalies including brachydactyly type C (Polinkovsky et al., 1997) and a phenotype resembling brachydactyly type A1 (Thomas et al., 1997); however, homozygous mutations cause a severe bone dysplasia, Grebe type chondrodysplasia (Thomas et al., 1997).
In practice, the effect of homozygosity for alleles associated with autosomal dominant disorders is generally not known. It is likely that many, even most, disorders described as having autosomal dominant inheritance are like these examples, and in a strict sense should be termed semidominant. Homozygosity for “dominant” mutations has been reported on numerous occasions, often as a result of consanguinity or assortative mating (e.g. in achondroplasia). This often results in a more severe phenotype than in the heterozygote (Zlotogora, 1997). The mechanism by which the mutation produces a phenotype is likely to influence this. It is highly likely, for example, that if haploinsufficiency is sufficient to produce a phenotype, complete loss of gene function will produce a severe phenotype. The effects of homozygosity for mutations which act in a different fashion (gain of function, abnormal participation in homodimer function and so on) are more difficult to predict but might reasonably be expected to be more severe than heterozygosity in many cases.
In summary, there is overlap between the use of the term “dominant” and “recessive” in classical Mendelian genetics and in the terminology of human diseases with Mendelian inheritance. However, the terminology is more loosely applied in relation to human diseases.
In quantitative genetics, the use of the term dominance is closely related to its strict Mendelian meaning. Here, dominance refers to the heterozygote effect of one allele relative to another (Fig 1.1, below). Suppose that homozygosity for allele 1 results in a phenotypic value of +1 for a given trait, and homozygosity for allele 2 results in a value of –1. If the heterozygote has a phenotypic value of 0, there is no dominance effect. Heterozygote values between 0 and 1 reflect partial dominance of allele 1 over allele 2, values between –1 and 0 reflect partial dominance of allele 2 over allele 1, and values of 1 or –1 represent complete dominance of allele 1 or 2, respectively. Overdominance refers to the situation in which the heterozygote has a more extreme phenotype than either homozygous state (>1 or <–1). Note that this refers to a theoretical situation in which the effects of a single pair of alleles on a quantitative trait can be separated out from all other effects and measured.
[Figure 1.1: four panels (No dominance, Partial dominance, Complete dominance, Overdominance), each placing the genotypes A2A2, A1A2 and A1A1 on a phenotypic scale running from -1 to 1.]
Fig 1.1: Dominance (see text for description). Adapted from Fig. 2.1 in Introduction to Quantitative Genetics (Falconer DS and Mackay TF, 1996).
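The quantitative definition of dominance can be made concrete with a short sketch. It assumes the scale used in the text (A2A2 = -1, A1A1 = +1) and a heterozygote value measured without error; the function name and the tolerance parameter are illustrative only and do not come from any cited source.

```python
def classify_dominance(het_value, tol=1e-9):
    """Classify a heterozygote value on the scale A2A2 = -1, A1A1 = +1.

    `tol` is an arbitrary numerical tolerance (an assumption of this sketch).
    """
    if abs(het_value) <= tol:
        return "no dominance"
    allele = "A1" if het_value > 0 else "A2"
    if abs(abs(het_value) - 1.0) <= tol:
        # Heterozygote matches one of the homozygotes exactly.
        return f"complete dominance of {allele}"
    if abs(het_value) > 1.0:
        # Heterozygote is more extreme than either homozygote.
        return "overdominance"
    return f"partial dominance of {allele}"


for value in (0.0, 0.5, -1.0, 1.3):
    print(value, "->", classify_dominance(value))
```

In practice the heterozygote value is an estimate with sampling error, so a real analysis would compare confidence intervals rather than point values.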
1.3.2 Meiotic recombination During meiosis, homologous chromosomes pair and exchange segments by recombination. The effect of this is that the copy of each chromosome present in the gamete is not identical to either of the parental chromosomes, but rather is a mosaic of segments derived from each of them (Anderson NH, 2002). The effect of this process occurring over successive generations is illustrated in figure 2.1. The relevance of this process to genetic mapping is that recombination events, although more likely to occur in some parts of a chromosome than others, are widely distributed along the chromosomal length and vary considerably from meiosis to meiosis. This means that with an increasing number of meioses within a pedigree, there is an increasing probability that the region containing a mutated gene will become separated from genetic markers which were close to it on the chromosome of the first family member to have carried the mutation. In turn, this allows the progressive narrowing down of the region likely to contain the mutated gene.
1.3.3 Maps of genetic variation In order to track transmission of chromosomal segments through a pedigree, a map of genetic variation is required. To be useful for genetic linkage studies, an ideal map would contain markers which are densely spaced and highly polymorphic – and thus likely to be informative within a family. Such maps have been progressively developed, starting with restriction fragment length polymorphisms (RFLPs) (Botstein et al., 1980), progressing to short tandem repeats (also known as microsatellites) (Weber and May, 1989) and finally to single nucleotide polymorphisms (SNPs) (The International Hapmap Consortium, 2003; Altshuler et al., 2005). The availability of densely-spaced marker maps is obviously also important to mapping for complex diseases.
1.3.4 Mapping Mendelian disorders Mapping of Mendelian disorders depends on identifying one or more alleles which segregate with the phenotype in accordance with the principles of Mendelian inheritance (Anderson NH, 2002; Ott and Hoh, 2000). Generally the mode of inheritance will be known, but there will be no prior information regarding the likely chromosomal localisation of the gene of interest. Exceptions to this include X-linked disorders, and the rare circumstance of identification of one or more affected individuals with apparently balanced chromosomal translocations segregating with the phenotype (in which case the breakpoints represent candidate loci). If no localising information is available, a screen of the entire genome (less the sex chromosomes if X and Y-linkage can be excluded on the basis of pedigree analysis) will be required. Polymorphic loci spaced as closely as possible across the genome are genotyped in all available family members. With the availability of extremely dense genetic maps, cost has become the main limiting factor restricting marker density used in such a screen.
While in principle it should be possible to identify regions of genetic linkage simply by inspecting haplotypes to identify markers which segregate with disease state, in practice this is not usually a straightforward undertaking (Ott and Hoh, 2000). The large amount of data produced during a mapping exercise,
the problem of incomplete penetrance, and the fact that in the real world there are often missing individuals or other barriers to such an approach, necessitate the use of computer programs in data analysis in most instances. These calculate the likelihood that the disease locus is present on a marker map. The likelihood ratio is the ratio between the likelihood of the data under the hypothesis that there is linkage (θ<0.5), LHA, and the likelihood under the null hypothesis of no linkage (θ=0.5), LH0. θ (theta) is the recombination fraction, which is 0.5 when there is no linkage (i.e. there is a 50% chance that two unlinked alleles will be transmitted together).
This is expressed as a logarithm of odds (LOD) score, which is the log10 of LHA/LH0 (Ott and Hoh, 2000; Nyholt, 2002). Lander and Kruglyak calculated that in a whole genome scan in humans, a LOD score of 1.9 would be expected to occur by chance once per whole genome scan, and a LOD score of 3.3 would be expected to occur by chance once per 20 whole genome scans (Lander and Kruglyak, 1995). These were proposed as cutoffs for reporting suggestive and significant linkage results, respectively. Thresholds were also calculated for quantitative trait locus (QTL) mapping using various study designs.
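The LOD score calculation can be illustrated for the simplest possible case. The sketch below assumes fully informative, phase-known meioses with complete penetrance, so that the likelihood at recombination fraction θ is θ^r (1-θ)^(n-r) for r recombinants among n meioses; real pedigree programs handle far more general situations. The function names and the 0.01 grid of θ values are illustrative assumptions.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score at a given theta for phase-known, fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    non_recombinants = meioses - recombinants
    # log10 likelihood under linkage at this theta
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1.0 - theta))
    # log10 likelihood under the null hypothesis (theta = 0.5)
    log_l_null = meioses * math.log10(0.5)
    return log_l_linked - log_l_null


def max_lod(recombinants, meioses):
    """Scan theta from 0.01 to 0.50 and return (best_theta, best_lod)."""
    grid = [i / 100 for i in range(1, 51)]
    return max(((t, lod_score(recombinants, meioses, t)) for t in grid),
               key=lambda pair: pair[1])


best_theta, best_lod = max_lod(recombinants=1, meioses=10)
print(f"theta = {best_theta:.2f}, LOD = {best_lod:.2f}")
```

With one recombinant in ten meioses the maximum falls at θ = 0.1 with a LOD score of about 1.6, which illustrates why a single small family rarely reaches the thresholds quoted above.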
Map distances are measured in centiMorgans (cM), named for the great fly geneticist Thomas Hunt Morgan. The cM is equivalent to the recombination fraction expressed as a percentage. A recombination fraction of 0.5 represents a map distance of 50cM, a recombination fraction of 0.01 represents a map distance of 1cM and so on.
Examples of programs used in mapping of Mendelian traits (or potentially Mendelian traits, in the case of nonparametric programs) include those used for parametric analyses such as the LINKAGE package (Lathrop et al., 1984) and VITESSE (O'Connell and Weeks, 1995) and nonparametric programs such as GENEHUNTER (Kruglyak et al., 1996). Parametric programs specify a genetic model (eg autosomal dominant inheritance) and other parameters such as penetrance, whereas nonparametric methods are “model-free”. Parametric methods are powerful when there is good information available about the mode of inheritance and other relevant parameters. However, for more complex
situations or where there is limited information available about the disorder, nonparametric approaches are superior (O'Connell and Weeks, 1995; Nyholt, 2002).
1.4 Quantitative Genetics The application of Mendelian genetics to human disease virtually always involves discrete phenotypes – a patient either has or does not have cystic fibrosis. Gradations of severity (variable expressivity) and the phenomenon of incomplete penetrance, in which an individual has a genotype which is associated with disease in others but is not affected by the disorder in question, do not negate this observation. However, the great majority of variation within a population involves traits which are continuously variable rather than discrete in nature. These are quantitative traits. Examples include body weight, blood pressure and levels of blood lipids and homocysteine. Of themselves, none of these necessarily represents a pathological state, even at the extremes of their distribution in the population (arguably malignant hypertension represents an exception to this). However, these examples were chosen for discussion because all of them represent risk factors for atherogenic cardiovascular disease (Fruchart et al., 2004). Many, perhaps most, such traits are under genetic control to some degree. Quantitative genetics deals with efforts to identify the underlying genetic variants which are responsible for variation in quantitative traits.
1.4.1 Quantitative trait loci A QTL is a chromosomal segment which contains one or more genetic elements which affect a quantitative trait (Falconer DS and Mackay TF, 1996). The use of the term “genetic elements” here is deliberate, as it is not certain that all QTL have their effect due to variation in the coding region of a gene, or even in the promoter region or other regulatory elements of a gene. Given the emerging understanding of the regulatory function of RNAs not directly associated with genes, or with regulatory effects beyond their immediate chromosomal environment (Mattick, 2007), it is possible that variations in some of these may affect quantitative traits and thus be responsible for the effects
observed due to a QTL. Any given QTL may, on closer dissection, resolve into multiple loci (Flint et al., 2005), each of smaller effect than the original QTL. This means that elucidating the underlying basis of a QTL can be a daunting task (see section 1.4.3 below). Despite its difficulty, this is an important endeavour, given the major contribution of polygenic inheritance to human disease. QTL mapping is also important in agriculture, with the use of marker assisted selection to improve conventional breeding schemes for the modification of economically important species (Ribaut and Ragot, 2007).
The infinitesimal model, developed by RA Fisher in 1918 and still influential now (Barton and Keightley, 2002), considered polygenic disorders as being determined by a very large number of loci each of very small effect (Fisher RA, 1918). In practice, however, QTL of large effect are frequently detected, sometimes accounting for up to 50% of observed variation in the trait under study (Flint et al., 2005). Although most QTL effect sizes are substantially smaller than this, QTL of moderate effect size (>10%) are fairly common. There are a number of possible explanations for this, including overestimation of effect size. This is particularly likely if the sample size is smaller than about 500 (Barton and Keightley, 2002). Having said that, the majority of detected QTL are of relatively small size (~5%) and it is likely that for most traits there are indeed many further QTL of very small effect.
1.4.1.1 The liability model for binary traits The focus of this thesis is CHD, and as CHD is a binary trait (either present or absent in an individual) it is worth explicitly discussing the relationship between binary traits and QTL. While it is undoubtedly true that as such, CHD is intrinsically less informative and more difficult to analyze from a quantitative genetic perspective than a continuously distributed trait, QTL mapping remains a potentially important tool in understanding the genetic basis of CHD. Falconer (Falconer, 1965) developed the liability model for binary traits with multifactorial inheritance. This model assumes an underlying continuously distributed but unobservable scale of liability, with a threshold above which the observable binary trait is expressed. If it is possible to identify a quantitative phenotype
which confers a risk of an individual developing the binary trait, the quantitative phenotype can act as a proxy for the binary trait in mapping experiments. QTL mapping for traits such as blood pressure represents such an undertaking, with the binary trait of interest being the occurrence or non-occurrence of a cardiovascular event (myocardial infarction or stroke, for example).
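The liability model lends itself to a brief numerical illustration. The sketch below assumes that liability follows a standard normal distribution in the general population, which is the usual working assumption but is not stated explicitly above; the function names, the 1% prevalence and the one-SD shift are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def liability_threshold(prevalence):
    """Threshold (in SD units) above which the binary trait is expressed."""
    # The proportion of the population above the threshold equals the prevalence.
    return NormalDist().inv_cdf(1.0 - prevalence)


def risk_given_shift(prevalence, shift):
    """Risk for a subgroup whose mean liability is shifted by `shift` SD units."""
    threshold = liability_threshold(prevalence)
    return 1.0 - NormalDist(mu=shift).cdf(threshold)


prevalence = 0.01  # hypothetical population prevalence
print(f"threshold = {liability_threshold(prevalence):.2f} SD")
print(f"risk with +1 SD shift = {risk_given_shift(prevalence, 1.0):.3f}")
```

Under these assumptions a modest shift in mean liability produces a disproportionate change in the risk of the binary trait, which is why a quantitative proxy for liability can be informative in mapping experiments.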
1.4.2 Mapping QTL The principles on which methods for mapping QTLs are based are similar to those for mapping of Mendelian traits (Falconer DS and Mackay TF, 1996). Considering a single locus, there is evidence for a QTL at that locus if there is a statistically significant difference between individuals with different genotypes at that locus (Mackay TF, 2001). As for mapping of Mendelian traits, this implies the need for an informative genetic map. It is necessary for the trait of interest to be under significant genetic control. Although it is possible to study QTL directly in humans, animal models have considerable advantages. It is possible to set up crosses which allow phenotyping and genotyping of very large numbers of individuals. Moreover, inbred laboratory animals are essentially homozygous at every locus (Moore and Nagle, 2000). This means that in a breeding experiment based on a cross between two strains, identification of a marker which is polymorphic between the strains means that it will be informative in all offspring derived from the cross. F1 animals will be heterozygous at every locus, and subsequent generations will segregate the two parental alleles in predictable ratios depending on the nature of the second (and subsequent, where relevant) cross. Heterozygotes provide information regarding dominance effects but not regarding the location of the QTL.
Formal description of the fundamentals of QTL mapping in an F2 population of mice (as described in chapter 6) is relatively simple (Mackay TF, 2001). Consider a hypothetical marker locus, M, and a QTL, Q, each with two alleles
(M1, M2, Q1, Q2), with the recombination fraction between M and Q being θ. The QTL has additive (a) and dominance (d) effects. In the F2 population, the difference in the quantitative trait between homozygotes for each marker allele, expressed as half the difference between the M1M1 and M2M2 class means, will be a(1-2θ). The difference between the average phenotype of the two homozygous classes and the mean phenotype of heterozygous (M1M2) animals is d(1-2θ)^2. If there is no linkage between Q and M, θ=0.5 (see 1.3.4) and therefore a(1-2x0.5) = 0; likewise d(1-2x0.5)^2 = 0. Since the total variance in the trait, VT, is the sum of variance due to environmental effects (VE) and genetic effects (VG), and VG in turn is the sum of the additive and dominance effects, it follows that (assuming shared environment, which would be the expectation in experimental conditions) at θ=0.5 there will be no difference between the phenotypes of M1M1 and M2M2 mice.
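These expectations can be checked by enumerating gametes directly. The following is a minimal sketch, assuming that the F1 parents carry M1 with Q1 on one chromosome and M2 with Q2 on the other (as expected from a cross between two inbred strains), and that the QTL genotypes Q1Q1, Q1Q2 and Q2Q2 have values +a, d and -a. The function name is illustrative only. On this scale, half the difference between the M1M1 and M2M2 class means is a(1-2θ), and the heterozygote contrast is d(1-2θ)^2.

```python
def f2_marker_class_means(a, d, theta):
    """Expected trait mean for each marker genotype in an F2 population."""
    p_q1_given_m1 = 1.0 - theta  # gamete carrying M1 also carries Q1
    p_q1_given_m2 = theta        # gamete carrying M2 carries Q1 only after recombination

    def mean(p_first, p_second):
        q1q1 = p_first * p_second
        q2q2 = (1.0 - p_first) * (1.0 - p_second)
        q1q2 = 1.0 - q1q1 - q2q2
        return a * q1q1 + d * q1q2 - a * q2q2

    return {
        "M1M1": mean(p_q1_given_m1, p_q1_given_m1),
        "M1M2": mean(p_q1_given_m1, p_q1_given_m2),
        "M2M2": mean(p_q1_given_m2, p_q1_given_m2),
    }


means = f2_marker_class_means(a=1.0, d=0.5, theta=0.1)
additive_contrast = (means["M1M1"] - means["M2M2"]) / 2
dominance_contrast = means["M1M2"] - (means["M1M1"] + means["M2M2"]) / 2
print(additive_contrast, dominance_contrast)
```

Setting theta to 0.5 gives identical means for all three marker classes, which is the no-linkage case described in the text.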
The closer the marker locus is to the QTL, the smaller the value of and the larger the difference in trait phenotype between the two homozygous classes. Within a chromosomal region, the marker which is associated with the greatest difference in mean values between homozygotes for the two alleles is closest to the QTL. From this, it would intuitively follow that denser spacing of markers will lead to more accurate localization of QTL. This is only true up to a certain point. Exact figures will depend on the effect size, but for a QTL of moderate effect (allele substitution effect 0.25) spacing of markers at 10cM has almost the same power to detect a QTL as an infinitely dense map (Darvasi et al., 1993), and spacing at 20cM and even 50cM does not substantially reduce power. For a study size of 500 animals, power of detecting a QTL on a 100cM chromosome is 0.64, 0.58 and 0.47 for a QTL located halfway between the interval midpoint and the nearest marker, with map densities of 10cM, 20cM and 50cM respectively. Increasing the size of the experiment does make a difference; the equivalent figures for an experiment with 1000 animals are 0.91, 0.90 and 0.81 respectively (Darvasi et al., 1993). Increasing experiment size also has an important effect on the capacity of the experiment to resolve the location of the QTL. Resolving power (defined by Darvasi and Soller as “the 95% confidence interval for the QTL map location, that would be obtained when scoring an infinite number of markers”) is inversely proportional to the sample size and to the proportion of variance explained by the QTL (Darvasi and Soller, 1997).
1.4.2.1 Experimental designs for QTL mapping The description above is of an intercross experiment. An alternate strategy is a backcross, where the F1 (heterozygous) mice are crossed with one or both of the parental strains. The choice of study design will depend on the phenotype and mode of inheritance of the QTL. While an intercross design is more powerful than a backcross design in many circumstances and is most commonly used, backcross is superior in some situations - for example, if one of the strains is zero for a fully recessive phenotype (Moore and Nagle, 2000). This is a relatively uncommon scenario, however, and most commonly there is no information available about the nature of the QTL prior to commencing the experiment.
1.4.2.2 Significance thresholds Lander and Kruglyak calculated significance thresholds for backcross and intercross studies in mouse or rat (Lander and Kruglyak, 1995), based on results which would be expected to occur by chance once per genome scan (“suggestive”) or once per 20 genome scans (“significant”). For the intercross design used in this study (Chapter 6) the LOD score thresholds are 2.8 for suggestive linkage and 4.3 for significant linkage. The reason such calculations are required is to compensate for the effects of analysis of large numbers of markers. The figures derived by Lander and Kruglyak are conservative thresholds (Moore and Nagle, 2000), reducing the risk of type I statistical error (i.e. false positive), although this is at the expense of increasing the risk of type II error (i.e. false negative). An alternate approach is permutation testing, in which the experimental data are repeatedly randomized and the randomized figures analyzed to obtain experiment-specific levels of significance.
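The permutation approach can be sketched as follows. This is an illustration under stated assumptions rather than the procedure implemented in any particular package: genotypes are coded 0, 1 and 2, the per-marker statistic is simply the absolute difference in mean phenotype between the two homozygous classes (a real analysis would use a LOD score or likelihood ratio), and all function and parameter names are hypothetical.

```python
import random


def max_marker_statistic(genotypes, phenotypes):
    """Largest absolute difference in homozygote means across all markers."""
    best = 0.0
    for marker in range(len(genotypes[0])):
        low = [y for g, y in zip(genotypes, phenotypes) if g[marker] == 0]
        high = [y for g, y in zip(genotypes, phenotypes) if g[marker] == 2]
        if low and high:  # skip markers lacking one homozygous class
            diff = abs(sum(high) / len(high) - sum(low) / len(low))
            best = max(best, diff)
    return best


def permutation_threshold(genotypes, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Experiment-specific threshold from the null distribution of the genome-wide maximum."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_maxima = []
    for _ in range(n_perm):
        # Shuffling phenotypes breaks any genotype-phenotype association
        # while preserving the correlation structure between markers.
        rng.shuffle(shuffled)
        null_maxima.append(max_marker_statistic(genotypes, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return null_maxima[index]
```

Because the maximum is taken over all markers within each permutation, the resulting threshold accounts for multiple testing across the genome for the specific marker map, sample size and trait distribution of the experiment.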
1.4.2.3 Selective genotyping As discussed above, larger numbers of animals lead to increased power to detect QTL and increased ability to resolve the location of QTL. However, increasing the number of animals in an experiment inevitably increases the associated cost of genotyping, which is usually much more expensive than measuring the phenotype under study. Lander and Botstein (Lander and Botstein, 1989) introduced the principle of selective genotyping, in which only individuals with extreme phenotypes are genotyped. The logic behind this is that such individuals contain most of the genetic information.
Specifically, for a normally distributed trait, progeny with phenotypes more than 1 standard deviation (SD) from the mean comprise about 33% of the total population but contribute about 81% of the total linkage information. Growing a population only 25% larger and genotyping these extremes of the distribution would provide the same amount of linkage information but require genotyping only 40% as many individuals (Lander and Botstein, 1989). Progeny with phenotypes more than 2 SD from the mean comprise about 5% of the population but contribute about 28% of the total linkage information, and so on. However, extension of selective genotyping beyond this level (and perhaps even to this level) is not recommended, because of the risk that some of the more extreme phenotypes may be artefactual (e.g. measurement error). Moreover, the relative cost of phenotyping goes up as the percentage of animals to be genotyped goes down (Lander and Botstein, 1989). The extent to which this is a problem depends on the phenotype of interest and the cost of breeding and maintaining the mice until they are old enough to be phenotyped.
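The order of magnitude of these figures can be recovered with a short calculation. The sketch below assumes a standard normal trait and, as a simplifying approximation, that the linkage information contributed by an individual is proportional to the squared deviation of its phenotype from the mean; the function name is illustrative. Under that assumption the share of information in the tails beyond c SD is 2[c·φ(c) + (1 - Φ(c))], where φ and Φ are the standard normal density and distribution function.

```python
from statistics import NormalDist


def tail_summary(cutoff_sd):
    """Return (fraction of progeny genotyped, approximate share of linkage information)."""
    std = NormalDist()
    upper_tail = 1.0 - std.cdf(cutoff_sd)
    fraction_genotyped = 2.0 * upper_tail
    # Integral of z^2 * pdf(z) over |z| > c; the integral over all z is 1.
    information_share = 2.0 * (cutoff_sd * std.pdf(cutoff_sd) + upper_tail)
    return fraction_genotyped, information_share


for cutoff in (1.0, 2.0):
    fraction, share = tail_summary(cutoff)
    print(f"> {cutoff} SD: {fraction:.1%} of progeny, {share:.1%} of information")
```

For a cutoff of 1 SD this gives roughly 32% of progeny and 80% of the information, close to the published figures quoted above; the small discrepancies presumably reflect rounding or details of the original calculation that are not reproduced in this simplified version.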
1.4.2.4 Software packages for QTL mapping There are numerous software packages available for QTL mapping. A comprehensive list is available at http://linkage.rockefeller.edu/soft/ . Among the most commonly used are the MAPMAKER/EXP and MAPMAKER/QTL package (Lander ES et al., 1987) and Mapmanager QT (Manly and Olson, 1999). MAPMAKER/QTL performs interval mapping (IM) based on the maximum-likelihood theorem. This uses data from multiple marker loci simultaneously to estimate both the position and effects of QTL. A particular strength of this package is its ability to handle missing genotype data – the program can “fill in” a missing datum and still use the phenotype information from that individual. This makes it particularly useful for data analysis in conjunction with selective genotyping (Moore and Nagle, 2000).
Mapmanager QT uses multiple regression analysis, which is less computationally demanding but almost as powerful as maximum-likelihood analysis (Moore and Nagle, 2000); it is also more user-friendly than MAPMAKER/QTL. Unfortunately, however, it is much less tolerant of missing data, and in the context of selective genotyping overestimates effect sizes because it is unable to take the bulk of the phenotype data (that which is not in the genotyped extremes) into account.
1.4.2.5 The mouse as a model organism Despite the obvious differences between humans and laboratory animals such as mice, genetic studies in animals have contributed importantly to our understanding of human genetics (Moore and Nagle, 2000), aided by the high homology between the mouse and human genomes. Approximately 80% of mouse genes have a single identifiable orthologue in the human genome; only about 1% of mouse genes have no detectable orthologue in the human genome (the reverse is also true) (Waterston et al., 2002). The mouse also has the advantages of being easy to keep in a laboratory setting, having a short generation time (approximately 9 weeks) and, for many strains, fecundity which is high enough to make colony maintenance and breeding experiments straightforward.
1.4.3 Identifying the underlying genetic basis of QTL By 2005, over 2000 QTL had been mapped in mice (Flint et al., 2005). However, the genetic basis of at most 21 of these had been identified, depending on the standard used. Two reviews at the time identified 21 quantitative trait genes in total, but agreed on only 4 genes in mice and 5 in rats (Flint et al., 2005). Clearly, progressing from identifying a QTL to identifying its underlying genetic basis is an extremely difficult task.
The confidence intervals for most QTL at the time of identification are very wide, and contain large numbers of genes and even larger amounts of non-coding DNA which may be of functional importance. It follows that the first step in identifying the genetic basis of a QTL is to narrow down the region of interest. A number of techniques have been developed or proposed to achieve this.
Congenic mouse strains have the QTL interval transferred from one of the strains under study into the other, by repeated backcrossing (Moore and Nagle, 2000). This process can be accelerated by genotyping mice at each generation to follow the region of interest (Markel et al., 1997), obviating the need for phenotyping of mice at each generation to aid selection of mice to breed. Once the region of interest has been bred from strain A into strain B, replacing the original locus for that strain, further breeding between the congenic and strain B can be done, looking for recombination events within the region of interest. When these occur, mice homozygous for the alternate segregants can be phenotyped to determine which side of the recombination contains the QTL.
An alternate approach is to watch for recombination events occurring during the backcrossing process. Any mouse with a recombination event becomes the founder of a new congenic line, so that a series of subinterval-specific congenic strains are developed and can be phenotyped to determine the localization of the QTL. In theory, analysis of ~1000 mice could reduce a QTL to ~1cM (Darvasi, 1997).
Advanced intercross lines, a method proposed by Darvasi and Soller (Darvasi and Soller, 1995), have the advantage of allowing fine mapping of multiple QTL simultaneously. This is discussed in more detail in chapters 2 and 5. Briefly, the pair of strains in which a QTL has been identified are crossed to produce an F1 population. These are then intercrossed and offspring of the F2 generation are crossed in turn, continuing for a number of generations but avoiding inbreeding. With each generation there are additional recombination events (Figure 2.1). Phenotyping and genotyping are performed only at the last generation. Analysis is then done along the same lines as the original QTL analysis.
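The accumulation of recombination events across generations can be sketched numerically. The formula used below, r_t = [1 - (1 - r)^(t-2) (1 - 2r)] / 2, is the expression usually attributed to Darvasi and Soller (1995) for the expected proportion of recombinant haplotypes between two loci at generation t of an advanced intercross; it is quoted here as an assumption for an idealised, large, randomly mated population rather than taken from this chapter, and the function name is hypothetical.

```python
def ail_recombinant_fraction(r, generation):
    """Expected recombinant proportion between two loci in generation F_t.

    r is the per-meiosis recombination fraction (0 to 0.5) and generation
    is t >= 2, where t = 2 corresponds to the ordinary F2 intercross.
    Idealised model: large population, random mating, no selection.
    """
    if not 0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    if generation < 2:
        raise ValueError("generation must be 2 (F2) or later")
    return (1 - (1 - r) ** (generation - 2) * (1 - 2 * r)) / 2

# Two loci about 1 cM apart (r of roughly 0.01), followed over several generations.
for t in (2, 5, 10, 20):
    print(t, round(ail_recombinant_fraction(0.01, t), 4))
```

Under this model the recombinant proportion equals r in the F2 and rises with each further generation towards its limit of 0.5, which is why the same number of animals gives finer map resolution in a late-generation advanced intercross than in an F2.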
Once a QTL has been narrowed down to as small an interval as possible, the next, and possibly most difficult, step is to pinpoint the genetic basis of the QTL.
Identification and study of candidate genes is a common approach at this point (Moore and Nagle, 2000). Problems with this include the likelihood that there will be a large number of candidate genes even in the smallest region to which a QTL is resolvable (Flint et al., 2005), and the existence of large numbers of SNPs, meaning that any gene which is studied is likely to contain variants which could potentially influence gene expression, and hence underlie the QTL. Candidate variants can be used in association studies, but may prove merely to be closely linked to the genetic variant which truly underlies the QTL. Resolving this is likely to be very difficult. Expression studies, particularly using microarrays, may provide evidence supporting gene candidacy (Guo and Lange, 2000), but such evidence is likely to fall short of proof. Targeted knock-in and knock-out mice potentially can provide strong evidence for a gene’s role in a QTL – if the phenotype under study is substantially altered in the knock-in or knock-out mice (Flaherty et al., 2005). The resources involved in creating such mice are still very considerable, however, forming a barrier to such studies.
At present, QTL mapping techniques are well-established. Experimental designs are tried and tested, dense marker maps exist and software for data analysis is readily available. However, the transition from identification of a QTL to elucidation of its underlying genetic basis remains extremely difficult.
1.4.4 The mouse Hapmap project: application to QTL mapping The findings of the mouse Hapmap project (Frazer KA et al., 2007), including the identification of over 8 million SNPs densely distributed across the mouse genome, open up a new approach to QTL mapping (Flaherty et al., 2005; Frazer et al., 2004). The mouse Hapmap project initially studied 11 classical laboratory mouse strains and four wild-derived strains. The haplotype map developed had over 40,000 segments each with an average of three ancestral haplotypes, reflecting origin from one of the wild mouse subspecies – M. m. domesticus, M. m. musculus, M. m. castaneus and the hybrid M. m. molossinus. In the classical strains these subspecies contributed 68%, 6%, 3% and 10% respectively, with the remainder being of unknown origin (Frazer KA et al., 2007). Knowledge of the structure and distribution of these ancestral segments allows the performance of association studies with considerable power to map QTL (Frazer et al., 2004). In the next phase of this project, a total of 38 “classical” inbred strains and 11 wild-derived strains have been genotyped with a SNP microarray covering 138,793 SNPs (data available at http://www.broad.mit.edu/mouse/hapmap/ ). The existence of large amounts of data regarding quantitative traits in the common laboratory strains means that mapping for many such traits will be possible without the need to perform additional experiments. For less well-studied phenotypes, like those described here, however, additional phenotyping will be required to make use of this resource.
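The principle of association mapping across inbred strains can be sketched as follows: for each haplotype segment, strains are grouped by the ancestral haplotype they carry, and a strain-level phenotype is compared between groups. The sketch below uses a one-way ANOVA F statistic for that comparison. The strain labels, haplotype labels, phenotype values and function names are all hypothetical, and the published analyses may well use different statistics and corrections (for example for population structure), so this is an illustration of the idea only.

```python
import math

def anova_f(groups):
    """One-way ANOVA F statistic for a list of phenotype groups."""
    values = [v for group in groups for v in group]
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        return 0.0  # not enough groups or replicates to compare
    grand_mean = sum(values) / n
    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        ss_between += len(group) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in group)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def segment_f(haplotypes, phenotypes):
    """Group strains by the haplotype they carry at one segment, then test."""
    by_haplotype = {}
    for strain, hap in haplotypes.items():
        by_haplotype.setdefault(hap, []).append(phenotypes[strain])
    return anova_f(list(by_haplotype.values()))

# Hypothetical strain means for a quantitative trait.
phenotypes = {"S1": 1.0, "S2": 1.2, "S3": 0.9, "S4": 2.1, "S5": 2.0, "S6": 2.3}

# Hypothetical ancestral haplotype carried by each strain at two segments.
segments = {
    "seg_A": {"S1": "h1", "S2": "h1", "S3": "h1", "S4": "h2", "S5": "h2", "S6": "h2"},
    "seg_B": {"S1": "h1", "S2": "h2", "S3": "h1", "S4": "h2", "S5": "h1", "S6": "h2"},
}

scores = {name: segment_f(haps, phenotypes) for name, haps in segments.items()}
print(scores)
```

In this toy example the haplotype pattern at seg_A tracks the phenotype and therefore scores far higher than seg_B; with real data the number of strains sets a hard limit on power, which is why existing phenotype data across many strains are so valuable.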
1.5 The heart The heart’s role in mammalian physiology is a central one – without constant distribution of oxygenated blood to the body’s tissues, oxidative metabolism rapidly ceases, leading to death of the organism within minutes. This fundamental role has meant that the heart’s structure and development have been highly conserved during evolution, and in particular mammalian hearts, from mouse to whale, share very similar anatomy. It is remarkable that, despite the critical importance of the heart, severely disordered cardiac development, leading to a markedly structurally and functionally abnormal heart, may still be compatible with survival to and beyond birth. Lesser degrees of cardiac maldevelopment may be asymptomatic, or may become symptomatic only well into adult life.
1.5.1 Normal cardiac anatomy The adult mammalian heart is shown in Figure 1.2. Functionally, the heart can be viewed as a pair of two-chambered muscular pumps, fused anatomically and controlled by an electrical conducting system. The normal arrangement consists of a right-sided circulation which receives deoxygenated blood from the systemic veins, and distributes it to the lungs via the pulmonary arteries; and a left-sided circulation which receives oxygenated blood from the lungs and distributes it to the body via the aorta. The atria are collecting chambers, separated from the ventricles by valves (tricuspid on the right and mitral on the left) which prevent regurgitation of blood during ventricular contraction. The tricuspid and mitral valves are supported by fibrous chords, the chordae tendineae. The ventricles are pumping chambers and are separated from the great arteries by valves (pulmonary on the right and aortic on the left) which prevent regurgitation of blood after ventricular contraction. Cardiac muscle receives its own blood supply from the coronary arteries, which originate from the proximal aorta. Cardiac contraction is regulated by an electrical conduction system. Specialized cells are organized into nodes and tracts. The sinoatrial node, located at the junction between the right atrium and the superior vena cava, initiates the beat. The electrical impulse is then propagated throughout the atria, and to the atrioventricular (AV) node and thence to the ventricles.
From the AV node, the impulse passes down the bundle of His and its branches to the apex of the ventricles, and then along the Purkinje fibres to the remainder of the ventricles.
1.5.2 Heart development The heart becomes the first functional organ in the developing embryo (Buckingham et al., 2005). Cardiac development is illustrated in Figure 1.3. The first identifiable cardiac tissue (expressing myocardial markers) forms as bilateral groups of cells in the lateral mesoderm of the early embryo, having originated as undifferentiated mesodermal cells which migrate from the anterior region of the primitive streak (Buckingham et al., 2005). These cells are linked across the anterior-ventral midline, to form the so-called cardiac crescent, which in turn folds to form the linear heart tube. This is a transient structure composed of an inner endothelial tube surrounded by a myocardial layer (Harvey, 2002). At this point, cells from an additional cardiac progenitor field called the second heart field (SHF) migrate from dorsal mesocardium, pericardium and branchial-arch mesoderm into the heart, and contribute to its structural development at both poles (Moorman et al., 2003). These contribute to the outflow tract, right ventricle and much of the atria (Buckingham et al., 2005).
Figure 1.2. The adult mammalian heart The structure of the adult human heart, whole (panel a) and in section (panel b). The right atrium (RA) receives venous blood from the body, and passes it through the tricuspid valve to the right ventricle (RV). The RV pumps the blood via the pulmonary artery to the lungs. Oxygenated blood from the lungs is returned to the left atrium (LA) via the pulmonary veins, and passes through the mitral valve to the left ventricle (LV). The tricuspid and mitral valves are supported by the chordae tendineae. The left ventricle pumps blood through the aortic valve to the systemic circulation. The coronary circulation branches off from the proximal aorta. The heart beat is regulated by specialised electrical conducting cells which are organised into clusters (nodes) or tracts; it is initiated at the sino-atrial node, propagates through the atria and to the atrioventricular node (AVN). After a delay it is then passed via the bundle of His and its bundle branches to the apex of the ventricles, and then to the rest of the ventricles via the Purkinje fibres. Ca = caudal (inferior), Cr = cranial (superior); L = left, R = right. Adapted by permission from Macmillan Publishers Ltd [NATURE REVIEWS GENETICS] (Harvey, 2002), copyright 2002 (license no 1838550679750)
The heart tube during its progressive formation undergoes a process of looping, with the development of a rightward spiral form, leading to the formation of distinct anatomical features. The caudal part of the heart tube moves dorsally and anteriorly to form the future atria (Prall et al., 2002). The ventricles become distinct at this stage and balloon outwards. The spatial relationship between the developing chambers at this point approximates their final alignment. The precursors of the atrioventricular valves first appear as endocardial cushions at this stage, forming at the level of the atrioventricular canal from cells from the endocardial layer of the heart. Endocardial cushions giving rise to the aortic and pulmonary valves also form in the outflow tract at this stage. Specification of distinct myogenic tissue types is also a feature of the looping stage of heart development. Of particular importance is the emergence of “working myocardium” along the outer curvature of the looping heart tube (Christoffels et al., 2000) – this becomes the contractile tissue of the heart chambers. Non-chamber myocardium gives rise to elements of the conduction system and fibrous tissue.
The looping phase of cardiac development is followed by a remodelling phase, during which septation of the heart chambers becomes complete, with the formation of distinct right and left atria and ventricles. The inflow and outflow tracts assume their final positions at this stage.
Figure 1.3 Normal cardiac development This figure illustrates the main stages in early heart development in mammals and other amniotes, with staging in days of embryonic development (E) based on mouse development. The whole embryo or isolated heart is shown at left. At right, a representative section (transverse in panels b and d, longitudinal in panels f and h) illustrates the main internal features. All views are ventral. Myocardium and its progenitors are depicted in red. The cardiac progenitors are first recognisable as a crescent-shaped epithelium (the cardiac crescent) at the cranial and cranio-lateral parts of the embryo (panels a and b). Next, heart progenitors move ventrally to form the linear heart tube (panels c and d). The inflow region of the linear heart tube is located caudally and its outflow cranially. The linear heart tube undergoes a complex process called cardiac looping, in which the tubular heart becomes spiral with its outer surface moving rightwards (panels e and f). Endocardial cushions (EC), precursors to the tricuspid and mitral valves, are forming in the atrioventricular (AV) canal. The trabeculae (T) also form at this stage. Panels g and h depict the remodelling phase of heart development, during which septation of the cardiac chambers is completed, and distinct right and left ventricles (LV and RV) and atria (LA and RA) become evident. Further spiralling of the heart tube results in the outflow region becoming wedged between the developing ventricles on the ventral side (panel g) and the inflow region spans the ventricles dorsally (panel h). The chambers and vessels have now reached the same alignment as in the adult heart. The muscular inter-atrial and inter-ventricular septa fuse with the non-muscular AV septum, which is derived from the endocardial cushions. Ca = caudal, Cr = cranial, L = left, R = right. Adapted by permission from Macmillan Publishers Ltd [NATURE REVIEWS GENETICS] (Harvey, 2002), copyright 2002 (license no 1838550679750)
1.5.3 The interatrial septum The interatrial septum is a wall of tissue which, in postnatal life, separates the left and right atria, preventing the flow of blood from left to right. During normal heart development in mammals, the interatrial septum acts as a valved communication between the atrial chambers, allowing a right-to-left atrial blood shunt that helps to bypass circulation to the lungs, which are nonfunctional until after birth. Two distinct septal walls, the septum primum and septum secundum, contribute to its final structure (Webb et al., 1998). Each maintains a natural but offset opening between the atrial chambers, creating a one-way flap valve (Figure 1.4). Expansion of the lungs at birth is accompanied by an increase in left atrial pressure, which forces the septum primum against the septum secundum. In humans, the two septa seal permanently by adhesion in the first year of life in ~75% of individuals. However, the valve remains unsealed to varying degrees in ~25% of the adult population, a condition termed patent foramen ovale (PFO) (Hagen et al., 1984).
Figure 1.4 Relative arrangement of septum primum and septum secundum Cartoon of the interatrial septum in the developing mammalian heart, showing the relative arrangements of the septum primum (orange) and septum secundum (purple). The red arrow indicates direction of blood flow in prenatal life. Adapted with permission from (Biben et al., 2000) Lippincott Williams and Wilkins, copyright 2000
1.5.4 Regulation of cardiac development by transcription factors Cardiac development, as outlined above, is an extremely complex process and occurs under the control of an interactive cascade of genetic regulators (Harvey, 2002; Olson, 2006; Dunwoodie, 2007). A core set of evolutionarily conserved transcription factors (NK2, MEF2, GATA, Tbx and Hand) control cardiac cell fates, the expression of contractile protein-encoding genes and cardiac morphogenesis (Olson, 2006). In turn these transcription factors regulate one another, and many other transcription factors are involved. Of these, MEF2 is the key myogenic transcription factor, involved in the differentiation of all types of myocyte. In turn it is under regulation by NK2 homeobox genes, particularly tinman in Drosophila and its orthologues in mammals. The homeodomain factor NKX2-5 is a key transcription factor in cardiac development (Dunwoodie, 2007). It is expressed in cardiac progenitor cells of both the first and second heart fields. Expression continues in the primary heart tube and in the looping heart, in the outflow tract, ventricles, common atrium and the proximal horns of the sinus venosus. Expression continues in muscular layers of the heart throughout the remainder of embryogenesis and into postnatal and adult life (Prall et al., 2002). The absence of Nkx2-5 is catastrophic to heart development in the mouse embryo, resulting in complete failure of cardiac morphogenesis, chamber formation and outflow tract development.
NKX2-5 acts as part of a pathway in which it physically interacts with a set of other transcription factors to activate target genes. For example, the zinc finger transcription factor GATA4 (one of a group of genes named because their protein products bind to the nucleotide sequence GATA) physically interacts with NKX2-5. When co-expressed, their effect on the transcription of some cardiac genes is synergistically augmented (Prall et al., 2002). GATA4 protein is regulated by other co-transcription factors including the Friend of GATA (Fog) proteins. Gata4 null mouse embryos have severely disrupted cardiac development, with failure to form the primitive heart tube among other severe developmental abnormalities (Molkentin et al., 1997).
The T-box genes are a group of transcription factors which share a highly conserved 180-amino acid DNA binding domain called the T-box (Stennard and Harvey, 2005). Of the seven or more T-box genes expressed in the developing human heart, TBX1, TBX5 and TBX20 (chapter 3) have been implicated in human congenital heart disease. TBX1 is important in the secondary heart field and subsequently the outflow tract, consistent with its role as the major determinant of the cardiac phenotype in velocardiofacial syndrome (characterized by conotruncal malformations) (Yagi et al., 2003).
There is evidence that TBX5 functions as part of the NKX2-5 pathway (Prall et al., 2002; Dunwoodie, 2007). Mouse Tbx5 associates directly with Nkx2-5 and Gata4, synergistically stimulating chamber-specific genes in later stages of cardiac development. Tbx5 is specifically expressed in the first heart field, at the cardiac crescent stage and later in the primary heart tube. Tbx5 null mouse embryos are severely dysmorphic and fail to undergo cardiac looping. Interestingly, mice heterozygous for a Tbx5 null allele have similar cardiac abnormalities to those seen in Holt-Oram syndrome, with septal defects and AV conduction block (Stennard and Harvey, 2005; Prall et al., 2002).
Mouse Tbx20 is expressed in the cardiac crescent, and in some cells of the secondary heart field. In the heart tube, it is expressed in myocardium and in endothelial cells associated with the endocardial cushions; this latter expression persists with further development, as myocardial expression weakens. Tbx20 interacts directly with Tbx5, Nkx2-5, and Gata4 (Stennard and Harvey, 2005). Tbx20 null mouse embryos have hypoplastic, unlooped hearts. Expression of Tbx20 is required for normal levels of Nkx2-5 expression (Dunwoodie, 2007).
1.6 Congenital heart disease The complexity of the developmental process briefly outlined above provides many opportunities for maldevelopment of the heart, although it is still not possible to ascribe a patho-developmental mechanism to all types of CHD with confidence (Anderson et al., 1999). While some malformations are lethal in utero, a wide variety of malformations are seen in liveborn infants, occurring singly or in combination. The reported incidence of CHD varies from 0.4-1.0% (Hoffman and Kaplan, 2002), with most studies at the higher end of that range. A trend towards higher incidence in more recent studies appears to be due to improved detection of minor lesions of no clinical importance, particularly small VSDs, following advances in cardiac imaging, especially 2-D echocardiography (Wilson et al., 1993; Stephensen et al., 2004).
1.6.1 Types of CHD Various classification systems have been devised for CHD. Classification by severity (Hoffman and Kaplan, 2002), by embryological origin (Marino and Digilio, 2000), by presumed aetiology (Boughman et al., 1987), by associated features (Bower and Ramsay, 1994; Stoll et al., 1989) and by anatomical divisions such as sequential segmental analysis (Craatz et al., 2002) have been proposed. While all may have merits, the proliferation of classification systems can make comparisons between studies of the epidemiology of CHD challenging. Depending on the classification system used, various malformations may be grouped in different ways, making it hard to separate out data relevant to particular lesions. Regardless of the classification system used, however, the relative frequencies of the more common malformations are generally consistent between studies, where data for individual lesions are provided. The proportion of CHD accounted for by VSD has increased over time, due to increased detection of small VSDs, as discussed above. Table 1.1 shows the percentages of common malformations reported in three studies from different continents (Australia, Europe and North America) as an illustration of this point, along with the data pooled from a large number of studies by Hoffman and Kaplan (Hoffman and Kaplan, 2002).
1.6.2 Causes of CHD The majority of cases of CHD do not have a single identifiable cause. Among identifiable causes, chromosomal abnormalities are the most common, accounting for 8.5% of cases in an Australian series (Bower and Ramsay, 1994). Genetic syndromes, mainly inherited in a Mendelian fashion, account for a further 1.2%. The Australian data are comparable to those obtained in other populations. There have been recent epidemiological surveys from Malta (Grech and Gatt, 1999), Iceland (Stephensen et al., 2004), Italy (Calzolari et al., 2003) and a combination of California, France and Sweden (Harris et al., 2003). In these studies, the percentage of cases attributable to chromosomal abnormalities ranged from 4.9%-18%; 2%-4.5% of subjects had identifiable syndromes and 6%-12% had associated extra-cardiac abnormalities.
Table 1.1: Percentage of CHD accounted for by the most common lesions

Lesion    Bower and Ramsay, 1994    Samanek, 1994    Boughman et al., 1987    Hoffman and Kaplan, 2002 (a)
n         1337                      4409             1055                     pooled data
VSD       42.6                      41.3             25.6                     37.2
ASD       7.2                       12.0             9.7                      9.8
PS        7.3                       6.4              7.3                      7.6
PDA       6.8                       5.8              2.6                      8.3
Coarct    5.8                       5.6              6.9                      4.3
TOF       3.5                       2.8              7.5                      4.4
d-TGA     5.8                       5.1              4.6                      3.3
AS        3.3                       6.2              3.1                      4.2
AVCD      1.6                       2.9              8.5                      3.6
HLHS      2.8                       3.2              4.2                      2.8
PA        1.8                       1.9              1.6                      1.4
TA        1.9                       0.6              1.6                      0.8
TAPVR     1.0                       0.8              1.6                      1.0
Other     8.6                       17.4             15.0                     11.3

All figures other than the first row are percentages. (a) This study pooled results from 62 separate studies. Percentages listed here are based on the mean values derived. VSD, ventricular septal defect; ASD, atrial septal defect; PS, pulmonary stenosis; PDA, patent ductus arteriosus; Coarct, coarctation of the aorta; TOF, tetralogy of Fallot; d-TGA, d-transposition of the great arteries; AS, aortic stenosis; AVCD, atrioventricular canal defects; HLHS, hypoplastic left heart syndrome; PA, pulmonary atresia; TA, tricuspid atresia; TAPVR, total anomalous pulmonary venous return
Non-syndromal familial forms of CHD account for an as yet undetermined percentage. Multifactorial inheritance, with a strong genetic contribution, probably accounts for most of the remainder – i.e. the great majority of all CHD. The exact model which best describes the relative contribution of genes and environment is uncertain. The classical multifactorial model, with numerous genes each contributing a small amount to variation in the population, has been challenged (Burn J and Goodship J, 2002) and it is possible that some forms of CHD are better explained by oligogenic or other models.
Although there is clearly an important environmental contribution to the causation of most CHD, it is usually not possible to identify a specific environmental factor. Having said that, a small minority of cases are caused by identifiable teratogens such as maternal exposure to alcohol and medications, and the teratogenic effect of maternal diabetes. While it is possible to use statistical analyses to determine the proportion of cases attributable to relatively common in utero exposures such as maternal diabetes, in most instances it is not possible to say for certain which infants have malformations which would not have been present but for that teratogenic effect. Thus, for approximately 90% of children affected by CHD, no specific cause can be found.
1.6.3 Patent foramen ovale As described above (1.5.3), passage of blood from right to left atrium via the foramen ovale is a normal feature of circulation before birth. Following birth, persistent patency of the foramen ovale is very common, as noted above (found in ~25% of adults) (Hagen et al., 1984) and has generally been considered benign. Haemodynamically, PFO is usually of no consequence. Although there is a structural passage between the atria, the pressure differential between the left (higher pressure) and right sides produces a functional seal by pressing the septum primum against the septum secundum, preventing shunting of important volumes of blood in either direction. The small size of most PFOs (mean width 5mm (Hagen et al., 1984)) also reduces the likelihood of significant shunting of blood.
1.6.3.1 PFO and stroke However, in recent years it has become apparent that PFO is not always harmless. PFO has been shown to be a risk factor for ischaemic stroke without an identifiable cause, known as “cryptogenic stroke”, and also for migraine. Case-control studies show that, particularly among younger stroke patients, there is a significantly higher incidence of PFO than among controls. In stroke patients aged under 40 years, Webster and colleagues found PFO in 50% of cases, but only 15% of controls (Webster et al., 1988). In patients younger than 55, Lechat and colleagues found PFO in 40% of cases and 10% of controls (Lechat et al., 1988). The association is not seen in older patients (Jones et al., 1994) but this may be due to the fact that other causes of ischaemic stroke become very much more common with increasing age (particularly in individuals >60), potentially masking the effect of PFO in older individuals.
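The strength of the association implied by these case-control figures can be expressed as a crude odds ratio. The short calculation below is a sketch only: it works from the rounded percentages quoted above rather than from the original counts, so it gives unadjusted point estimates without confidence intervals, and the function name is hypothetical. These values are not reported here as the studies' own results.

```python
def odds_ratio(p_cases, p_controls):
    """Crude odds ratio from the proportion exposed in cases and in controls."""
    for p in (p_cases, p_controls):
        if not 0 < p < 1:
            raise ValueError("proportions must lie strictly between 0 and 1")
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls

# PFO prevalence quoted above: cases versus controls.
webster_or = odds_ratio(0.50, 0.15)  # patients under 40 years; roughly 5.7
lechat_or = odds_ratio(0.40, 0.10)   # patients under 55 years; 6.0

print(round(webster_or, 1), round(lechat_or, 1))
```

On these figures the odds of having a PFO are roughly six times higher among young stroke patients than among controls in both series, although the small size of such studies means that the true uncertainty around each estimate is wide.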
The mechanism by which PFO confers an increased risk of stroke in younger individuals is thought to be predominantly by producing a vulnerability to “paradoxical embolism” in which an embolus (usually thrombus detached from a deep vein thrombosis) crosses from the right side of the circulation to the left (McGaw and Harper, 2001). This may be more likely to occur at times when the systemic venous pressure is increased, for example if the patient is lifting a heavy object or straining at stool.
The observation that PFOs in individuals with cryptogenic stroke are both larger and are associated with a greater degree of right to left shunting of blood than those seen in individuals with known causes of ischaemic stroke (Homma et al., 1994) appears to support this proposition. However, although paradoxical embolism has been directly observed during echocardiography in at least one instance (Maier et al., 2007), there is little direct evidence to support this model (Kizer and Devereux, 2005). There is seldom a history connecting the stroke with an episode or condition which would be expected to raise right sided pressure, and a source of emboli is seldom found.
It is thus possible that other factors are responsible for the observed association between PFO and stroke. In situ thrombosis and atrial tachyarrhythmias have been considered. Large PFOs are often associated with other structural anomalies such as aneurysm of the septum primum (McGaw and Harper, 2001) and persistence of embryonic features of right atrial morphology (Hagen et al., 1984), which could promote in situ formation of thrombus. However, mural thrombi and atrial tachyarrhythmias are not commonly found in association with cryptogenic stroke (Kizer and Devereux, 2005). There are likely to be cell adhesion factors which are involved in physical closure of the foramen ovale following birth. Hypothetically, these could be involved pathologically later in life, in the development of small thrombi in the systemic circulation which then embolize and cause stroke. Abnormalities of such factors could then predispose to PFO and stroke without the requirement for paradoxical embolism to occur. There is no direct evidence to support this concept either, however.
1.6.3.2 PFO and migraine Migraine is a common, recurrent form of headache with distinctive characteristics. The headache is throbbing, unilateral and often severe, and is accompanied by nausea, vomiting or sensitivity to sound and light. The headache typically lasts from 4-72 hours (Post et al., 2007). About one third of patients with migraine experience aura, which is a period of focal neurological symptoms usually occurring within the hour preceding the onset of headache (Henry et al., 1992). The pathogenesis of migraine is not well understood but changes in cerebral blood flow have been implicated (Henry et al., 1992). The observation that PFO was associated with stroke, combined with the previous finding that migraine is also a risk factor for stroke, led to the hypothesis that PFO may be associated with migraine.
A number of studies have been performed which appear to confirm that such a relationship exists. Firstly, retrospective studies of the prevalence and severity of migraine in people who have had PFO closure show a decreased prevalence of migraine (particularly migraine with aura) after closure (Wilmshurst et al., 2000; Reisman et al., 2005; Post et al., 2004). In a recent prospective study of patients undergoing PFO closure (Anzola et al., 2006), 36% of patients with migraine had resolution of their migraine symptoms 1 year after closure. Among subjects with migraine with aura, only 7/33 still had aura symptoms 1 year after closure of PFO, compared with 21/21 controls with migraine and PFO who did not have closure.
Again, the mechanism for this relationship is not certain, although the passage of microemboli from the systemic venous to arterial circulation across the atrial septum has been suggested (Post et al., 2007) – a form in miniature of paradoxical embolism.
1.6.3.3 Other pathological consequences of PFO In addition to stroke and migraine, PFO has been implicated in the pathophysiology of decompression illness in divers (Wilmshurst et al., 2001) and in hypoxaemia in individuals with obstructive sleep apnoea (Shanoudy et al., 1998).
1.6.3.4 Genetics of PFO There has been relatively little study of the underlying causes of PFO. Research to date has focused entirely on the genetics of PFO, at the level of establishing that there is an increased risk of PFO in relatives of individuals with PFO and/or ASD. Arquizan et al (Arquizan et al., 2001) studied sibs of patients with ischaemic stroke, comparing sibs of those with PFO with sibs of those without PFO. 61.5% of sibs of patients with PFO also had PFO, compared with 30.6% of sibs of those without PFO. Interestingly, concordance for this trait was higher among female than male sibs. Although ASD shows a female preponderance (see 1.7.3.1), PFO seems to be evenly distributed between the sexes (Hagen et al., 1984). The authors comment that the relatively small numbers in each group once the figures are broken down by sex make a type I statistical error possible.
Wilmshurst and colleagues performed echocardiography on 71 relatives of 20 probands with either small ASD or large PFO with shunt (Wilmshurst et al., 2004). Of the 71 individuals, 60.6% had ASD or PFO. There were no controls but this is well above the population prevalence of PFO. However, the data may have been skewed to some extent by the identification of several families in which atrial shunting appeared to be segregating in an autosomal dominant fashion, including one family with 21 affected members. This family was not included in Table 1.2 (below), because the authors do not differentiate between ASD and PFO.
Rodriguez et al studied differences in PFO, ASA and right atrial anatomy in patients with ischaemic stroke (Rodriguez et al., 2003) from white, black and Hispanic backgrounds. They also assessed subjects for the presence of Chiari’s network, a congenital remnant of the right valve of the sinus venosus, the presence of which is associated with a high incidence of PFO and ASA (Schneider et al., 1995). The presence of Chiari’s network is not a major contributor to PFO, however, as it is present in only about 2% of adults (Schneider et al., 1995). Rodriguez et al found a similar incidence of PFO, ASA and Chiari’s network across the different racial groups. However, whites and Hispanics were more likely to have a large PFO and the degree of shunt was greater than in blacks. The significance of this finding is uncertain, particularly as it may relate to ASD. In the Baltimore-Washington infant study, nonwhites were found to be more likely to have ASD than whites, although the difference was small (OR 1.5, 95% CI 1.1-2.0).
1.6.4 Atrial septal defect Notwithstanding the potential pathological consequences of PFO discussed above, in most individuals PFO is associated with a functionally competent atrial septum with minimal if any blood flow between the atria. An ASD is present when there is a frank hole in the atrial septum. There are four anatomical types of ASD: secundum ASD, ostium primum ASD, sinus venosus ASD and coronary sinus defect, and these are described below. Confluent ASDs, large holes caused by a combination of two types of lesion, can generally be classified as secundum ASD; and common atrium is a form of primum ASD (Kouchoukos NT et al., 2003). By far the most common type is secundum ASD, which, with PFO, is the main focus of this thesis. Unless otherwise specified (eg “primum ASD”), from this point onwards “ASD” refers to secundum ASD.
1.6.4.1 Secundum ASD Secundum ASD forms in the region of the foramen ovale and is in essence a failure of the flap valve (septum primum) to cover the opening of the foramen ovale. This may be because the foramen ovale is too large, because the septum primum is too short, because the septum primum forms abnormally and is fenestrated, or from a combination of these abnormalities.
1.6.4.2 Ostium primum ASD Ostium primum ASD is a form of endocardial cushion defect, in which the septum primum does not fuse with the endocardial cushions, leaving a patent foramen primum. There is usually an associated cleft in the anterior cusp of the mitral valve.
1.6.4.3 Sinus venosus ASD Sinus venosus ASDs are located high in the septum, just below the orifice of the superior vena cava, and are usually associated with anomalous pulmonary venous return. This is a very rare type of ASD.
1.6.4.4 Coronary sinus ASD Coronary sinus ASDs are part of the “unroofed coronary sinus syndrome” (Kouchoukos NT et al., 2003), in which the coronary sinus lacks a partition to separate it from the left atrium. As a result, the ostium of the coronary sinus forms a hole in the atrial septum.
1.6.4.5 Pathology associated with ASD Regardless of type, the main pathological consequence of ASD is the presence of an intracardiac shunt (Krasuski, 2007). In postnatal life, the pressure in the left atrium is greater than that in the right atrium, and an ASD permits left to right shunting of blood. This results in volume overload of the right side of the heart, as well as an increase in the pressure in the right atrium. Very large holes
may cause important symptoms in infancy or early childhood, with failure to thrive, frequent respiratory infections and even heart failure, requiring early surgical closure (Lammers et al., 2005). Smaller lesions, however, may take many years to produce symptoms. If not detected on routine examination in childhood, presentation well into adulthood is not uncommon. Presenting symptoms of fatigue and breathlessness result from progressive pulmonary hypertension, secondary to volume overload (Krasuski, 2007). Progressive right atrial enlargement and thickening of the atrial wall are usually seen, in the presence of a normal sized left atrium (Kouchoukos NT et al., 2003). This may lead to atrial fibrillation. If untreated, pulmonary hypertension can result in Eisenmenger syndrome, in which right sided pressure exceeds that on the left and the shunt is reversed, causing systemic hypoxaemia (Krasuski, 2007).
Cryptogenic stroke has also been associated with ASD (Bartz et al., 2006), although ASD is a much less common finding than PFO in patients with stroke. This is not surprising given the very high prevalence of PFO in the population.
1.6.5 Relationship between PFO and ASD Clinically, the smallest ASDs can be viewed as large PFOs with an incompetent flap valve, and indeed ASD and PFO are viewed as existing in a continuum by cardiac surgeons and cardiologists (Kouchoukos NT et al., 2003) (personal communication, Prof Michael Feneley, 2006). This clinical observation suggests an aetiological relationship between ASD and PFO, which is supported by studies in mice and humans. Wilmshurst and colleagues (Wilmshurst et al., 2004) studied family members of patients with left to right shunting on echocardiography. The probands had been investigated for a variety of reasons including decompression illness, ischaemic stroke, haemodynamic features of ASD and migraine with aura. Of the 71 family members who had contrast echocardiography, 61% had either ASD or PFO. There were no controls but this incidence is well above the population prevalence of PFO. Although most of these lesions appear to have been PFOs, and the pedigrees presented do not distinguish between ASD and PFO, the authors state that family members with both lesions were identified. Evidence is presented that inter-atrial shunt
segregated as an autosomal dominant trait in at least 6/20 families. This study supports the concept that there is a genetic link between ASD and PFO in humans.
In mice, the main evidence that ASD and PFO have a shared aetiology comes from the work of Biben and colleagues (Biben et al., 2000), which forms the basis for most of the mouse work reported in this thesis. Biben and colleagues studied the effects of heterozygous Nkx2-5 mutations on the murine atrial septum. NKX2-5 mutations in humans had been shown to cause a variety of forms of congenital heart disease, but particularly ASD. NKX2-5, and its murine orthologue Nkx2-5, are homeodomain-containing transcription factors important in early cardiac development in man and mouse, respectively (see above).
In Nkx2-5+/- mice, Biben et al found an increased incidence of patent foramen ovale and atrial septal aneurysm, and a decreased length of the septum primum flap valve, compared with wild-type mice. These effects varied between strains and were particularly striking in mice from the strain 129T2/SvEms, in which heterozygosity for the Nkx2-5 null allele resulted in severe PFO bordering on ASD in 6/35 mice (17%). In total, 5/425 Nkx2-5+/- mice had ASD – much lower than in humans heterozygous for NKX2-5 mutations (see below) but considerably higher than in wild-type mice, in which ASDs are generally very rare (see chapter 5). Importantly, in every strain assessed, heterozygosity for the Nkx2-5 null allele was associated with a markedly increased incidence of PFO compared with wild-type mice – 78% vs 26% in B6 mice, 94% vs 74% in 129T2/SvEms, 36% vs 6% in a Swiss x B6 cross, and 62% vs 2.6% in an FVB x B6 cross. Thus, a mutation in Nkx2-5 led to measurable changes in atrial septal morphology, and an increased incidence of both PFO and ASD.
In summary, ASD and PFO are common and closely related conditions. While the role of PFO in human pathology makes it worthy of study in its own right, the apparent aetiological link between the two lesions suggests that PFO in the mouse may represent a valid model of ASD in the human. Insights gained from
the study of murine PFO may then be of importance in understanding ASD and, in turn, other forms of CHD.
1.7 Causes of ASD In this section the known causes of ASD will be discussed.
1.7.1 Syndromes associated with CHD Numerous malformation syndromes have been described in which CHD occurs. This can be one of the main defining features of a syndrome (as in Holt-Oram syndrome (Basson et al., 1997), velocardiofacial syndrome (Fokstuen et al., 1998) and Noonan syndrome (Tartaglia et al., 2002)). Alternatively, it may be an association of variable importance in a syndrome defined primarily by non-cardiac pathology – as in Smith-Magenis syndrome (Sweeney et al., 1999) and Rubinstein-Taybi syndrome (Stevens and Bhakta, 1995), in both of which about a third of affected individuals have CHD; or cerebrocostomandibular syndrome, in which CHD is a rare but apparently real association (Plotz et al., 1996; Kirk and Ades, 1998).
It is difficult to be accurate about how many syndromes are associated with CHD. The London Dysmorphology database (London Medical Databases Ltd, Bushey UK 2004) lists 873 syndromes in which CHD is a feature. The POSSUM dysmorphology database (Murdoch Children’s Research Institute, Melbourne, 2005) lists 933 syndromes with abnormalities of structure or function of the heart as a feature, although 151 of these are chromosomal anomalies, and some refer to cardiomyopathy rather than structural lesions. A search using the term “congenital heart disease” in the online database OMIM (Online Mendelian Inheritance in Man, http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM ; accessed 7 June 2007) yields 387 results, although this is likely to be an incomplete set because of the way OMIM is written – for example, an entry which referred to “heart defects” but not to “congenital heart disease” would not be captured by this search. In the appendix to their majestic review of the genetics of CHD, Burn and Goodship (Burn J and Goodship J, 2002) list 317 syndromes associated with cardiac malformation, including many for which the
evidence that CHD is a feature of the disorder is limited to a single case report, and also including a number of teratogenic syndromes such as fetal alcohol syndrome. Of the Mendelian and chromosomal causes of CHD, although some may be allelic disorders, the majority will be caused by distinct genetic abnormalities, and in many instances mutations in more than one gene can lead to a single syndrome. Thus, there are a very large number of separate routes which lead to the common endpoint of CHD.
Almost every known mechanism of inheritance has been associated with CHD, and ASD in particular, including all forms of Mendelian inheritance, multifactorial/polygenic inheritance and chromosomal aneuploidies including small deletions and duplications (Burn J and Goodship J, 2002). Even mitochondrial inheritance has been suggested (Sherman et al., 1985), although the evidence for this is limited, and it is difficult to see how abnormalities in the function of the respiratory chain (the sole function of the mitochondrial genome (DiMauro and Schon, 2003)) would lead to isolated CHD.
1.7.1.1 Holt-Oram syndrome Among syndromes with CHD as a feature, Holt-Oram syndrome (HOS) deserves special mention here, for three reasons. Firstly, the great majority of affected individuals have CHD, and particularly ASD, which affected most individuals in the first reported family (Holt and Oram, 1960) and has been the most commonly reported lesion in subsequent studies, affecting 60% of people with HOS (Sletten and Pierpont, 1996). HOS is thus the syndrome most strongly associated with ASD in the minds of clinical geneticists.
Secondly, HOS can be viewed as one form of autosomal dominant ASD with conduction abnormalities, which is discussed in section 1.4.1.3 below. The non-cardiac manifestations in HOS are skeletal abnormalities of the upper limbs, which range from severe reduction anomalies in approximately 5% of affected individuals through to very mild anomalies such as clinodactyly and sloping shoulders (Newbury-Ecob et al., 1996). Although penetrance for limb anomalies is very high in HOS, the high frequency of mild anomalies reported by Newbury-Ecob and colleagues implies that HOS should always be considered in the differential diagnosis of autosomal dominant CHD, particularly where ASD is the main lesion observed in a family.
Thirdly, a proportion of patients with HOS can be shown to have mutations in TBX5 (Basson et al., 1997; Li et al., 1997). As discussed in section 1.5.3, TBX5 interacts with NKX2-5 and GATA4, mutations in which cause autosomal dominant ASD as well as other forms of CHD. It also interacts with TBX20, mutations in which are shown here (see chapter 3) also to cause CHD, particularly ASD. Although TBX5 is the main gene associated with HOS, in most studies fewer than half of all patients with HOS have identifiable TBX5 mutations (Basson et al., 1997; Li et al., 1997; Borozdin W et al., 2006). When strict criteria for diagnosis (personal and/or family history of abnormalities of cardiac septation and/or conduction with preaxial radial ray deformity) are applied, the detection rate goes up to 74% (McDermott et al., 2005). This is still low enough to suggest the possibility of genetic heterogeneity; it is clear that such heterogeneity exists among the heart-hand syndromes in general (Basson et al., 1995), but it may also be the case for HOS in particular. Mutations in the gene SALL4 have been reported in a small number of individuals with features of HOS (Brassington et al., 2003), although no further cases have been reported, and there is clinical overlap between the Duane-radial ray syndrome (caused by SALL4 mutations) and HOS.
1.7.1.2 Chromosomal disorders, particularly 8p deletions Numerous chromosomal abnormalities have been reported in association with ASD, ranging from very common abnormalities such as trisomy 21 (Vida et al., 2005) and 22q deletions associated with velocardiofacial syndrome (Fokstuen et al., 1998) through to unique chromosomal rearrangements reported only in single individuals. With the exception of balanced chromosomal rearrangements, which cause phenotypic effects either by direct disruption of genes at the breakpoint or by regional genomic effects of the translocation (Lettice et al., 2002), the common theme in chromosomal disorders is abnormalities of gene copy number. Trisomies, duplications, tetrasomies and so
on increase the copy number of genes within the affected chromosome or chromosomal segment from two to three or more (in the case of autosomal genes and X-linked genes in females) or from one to two or more (in the case of X-linked genes in males). Deletions reduce the copy number of genes from two to one (in the case of autosomal genes and X-linked genes in females) or from one to none, in the case of X-linked genes in males.
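The copy number rules in the preceding paragraph can be sketched in a few lines of Python. This is an illustration only: the function names, the event labels and the restriction to single-copy gains are assumptions made for the example and do not come from any cited study.

```python
# Illustrative sketch only: names and event labels are assumptions for this example.

def baseline_copies(x_linked: bool, sex: str) -> int:
    """Normal copy number of a gene: one for X-linked genes in males, otherwise two."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    return 1 if (x_linked and sex == "male") else 2

# Change in copy number for a single event. Larger gains (for example
# tetrasomy, "three or more") are deliberately left out of this sketch.
COPY_CHANGE = {"deletion": -1, "duplication": +1, "trisomy": +1}

def copies_after(event: str, x_linked: bool, sex: str) -> int:
    """Copy number of a gene lying within the affected chromosome or segment."""
    return baseline_copies(x_linked, sex) + COPY_CHANGE[event]

# Print the outcome for each combination discussed in the text.
for x_linked, sex in [(False, "female"), (True, "female"), (True, "male")]:
    kind = "X-linked" if x_linked else "autosomal"
    for event in ("deletion", "duplication"):
        print(f"{kind} gene, {sex}, {event}: {copies_after(event, x_linked, sex)} copies")
```

Under these assumptions a deletion leaves one copy of an autosomal gene but none of an X-linked gene in a male, which is the asymmetry described above.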
There are undoubtedly many genes for which copy number is of no consequence. For example, the great majority of autosomal recessive disorders produce no discernable phenotype in heterozygotes, even if the mutation in question is functionally a null mutation. Deletion of one copy of CFTR (for example) would not of itself be expected to produce a phenotype. Similarly, for many genes an increase in copy number would not be expected to produce a phenotype. However, there are many autosomal dominant disorders for which haploinsufficiency appears to be the main reason for the phenotypic effect of mutations. Examples include TBX5 in HOS (Borozdin W et al., 2006), SHOX in Leri-Weill syndrome (Rappold et al., 2002) and SOX9 in campomelic dysplasia (Wunderle et al., 1998); numerous others are known. Similarly, it is clear that increased gene copy number, as in trisomy 21, must be the reason for most, if not all, of the pathological effects associated with chromosomal abnormalities which result in a gain of chromosomal material.
Study of patients with chromosomal abnormalities and congenital heart disease therefore holds out the prospect of identifying genes important in cardiac development and relevant to CHD in general. In order for study of a chromosomal abnormality to lead to identification of such a gene, the chromosomal anomaly must either involve a very small region, containing few genes, or must be sufficiently common that it is possible – by study of large numbers of affected patients – to identify a relatively small critical region which must be involved in order to produce a cardiac phenotype. Most chromosomal abnormalities are detected because they are large enough to be visible cytogenetically, and therefore are likely to contain large numbers of genes. This
means that in practice only relatively common chromosomal abnormalities are likely to yield useful information in the study of CHD.
Examples of such lesions include deletions of 1p36, 22q11 and 8p23. Deletions of 1p36 are increasingly recognised and are associated with CHD, although this only infrequently includes ASD (2/30 cases in one series) (Heilstedt et al., 2003). Deletions of 22q11 are very common and have been intensively studied. Recently, evidence has been presented that haploinsufficiency for TBX1 is responsible for most of the cardiac phenotype associated with deletions of 22q11 (Yagi et al., 2003). Although ASD occurs in patients with 22q11 deletions, conotruncal lesions such as tetralogy of Fallot are more characteristic of this deletion, with only 4% of patients having ASD (Fokstuen et al., 1998).
Deletions of 8p23 are of perhaps the greatest interest in the context of the current study, because of the location of GATA4 within this region. CHD is a common feature of patients with deletions of this cytogenetic band. Both primum and secundum ASD have been reported in multiple patients with 8p23 deletions, although a variety of other cardiac lesions including pulmonary valve stenosis, double outlet right ventricle, aortic valve stenosis and ventricular septal defect have also been reported (Devriendt et al., 1999; Giglio et al., 2000).
Devriendt et al defined a critical region for cardiac malformations in individuals with 8p deletions and suggested GATA4 as a candidate gene which may be responsible for the CHD in affected subjects (Devriendt et al., 1999). Pehlivan and colleagues used fluorescent in-situ hybridization (FISH) to show that 4 patients with deletion of 8p23.1 and CHD were haploinsufficient for GATA4; a fifth patient with a normal heart had two copies of GATA4 (Pehlivan et al., 1999). Contradicting these findings, Giglio et al presented evidence that not all subjects with 8p deletions and CHD were deleted for GATA4. Two subjects in their study with CHD were FISH positive for a GATA4 probe, and they defined a critical region between markers WI-8327 and D8S1825 which excluded GATA4
(Giglio et al., 2000). From this it appeared that GATA4 may not play a role in causation of CHD in 8p23 deletions. However, in 2003, Garg et al reported GATA4 mutations in two families with congenital heart disease, predominantly ASD but also AVSD and various valvular abnormalities (Garg et al., 2003). It is thus clear that GATA4 haploinsufficiency does contribute to CHD in at least some individuals with 8p23 deletions, although it is likely that at least one other gene in the region also contributes to the cardiac phenotype, given the findings of Giglio et al.
1.7.2 Non-syndromal Mendelian ASD Prior to the first identification of mutations associated with ASD, there were at least 15 separate reports of nonsyndromal autosomal dominant ASD (Table 1.2), with 12 of the reported families having ASD and disorders of cardiac conduction. There have been no reports of X-linked nonsyndromal ASD, and no definite reports of autosomal recessive ASD, although it would be difficult to confidently state that recurrence in a sibship was due to recessive inheritance rather than multifactorial causation (see 1.4.1.4). It is possible that the family reported by Libshitz and Barth (Libshitz and Barth, 1974), in which 4 sibs had ASD and a fifth died at two years of suspected CHD, had an autosomal recessive form of ASD. However, assessment of the parents was described only as “routine cardiac workup” and the study antedates routine 2-D echocardiography. Even if both parents did have normal hearts, one of them may have been nonpenetrant for cardiac disease. Thus, dominant inheritance cannot be excluded in this family. Despite the large number of affected sibs, multifactorial causation is also possible.
Autosomal recessive inheritance has been reported for at least one other form of nonsyndromic CHD, persistent truncus arteriosus (PTA). In a large consanguineous Kuwaiti family, a mutation in NKX2-6 segregated with PTA, with affected individuals being homozygous (Heathcote et al., 2005). Indirect evidence for autosomal recessive forms of ASD comes from studies of consanguinity and CHD. In a case-control study of neonates with CHD, Khalid
et al found that parental consanguinity was a risk factor for CHD (Khalid et al., 2006). Overall, first cousin marriages were associated with an adjusted odds ratio (OR) for CHD of 1.8 (95% CI 1.1-3.1). In particular, consanguinity was associated with an increased risk of ASD (first cousin marriages, p = 0.002; more distant relatives, p = 0.044). Similar studies comparing the rate of consanguinity in cases to population data rather than directly to controls had similar findings to those of Khalid et al, including an effect specifically on ASD (Becker et al., 2001; Nabulsi et al., 2003). These data could be consistent with recessively acting alleles contributing to multifactorial causation, but it is also possible that there are indeed rare autosomal recessive forms of ASD.
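For readers who want to relate the reported odds ratio to its confidence interval, the following Python sketch back-calculates an approximate standard error and two-sided p-value from the published figures (OR 1.8, 95% CI 1.1-3.1). It assumes the interval is symmetric on the log scale and based on a normal approximation; that is a general convention for odds ratios, not something stated by Khalid et al, and the function names are hypothetical.

```python
import math

def log_or_standard_error(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate SE of log(OR), assuming a symmetric normal-theory CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def approximate_two_sided_p(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value for OR = 1, derived from the OR and its 95% CI."""
    z = math.log(odds_ratio) / log_or_standard_error(lower, upper)
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Figures quoted in the text for first cousin marriages and CHD
se = log_or_standard_error(1.1, 3.1)
p = approximate_two_sided_p(1.8, 1.1, 3.1)
print(f"SE of log(OR) ~ {se:.2f}, approximate two-sided p ~ {p:.3f}")
```

On these assumptions the interval corresponds to a p-value of roughly 0.03, in keeping with a lower confidence limit just above 1.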
In considering the reports of autosomal dominant ASD, some consistent themes emerge. Firstly, there are substantially more reports of ASD with conduction disorder than of ASD without conduction disorder. This is somewhat skewed by the fact that four of the families were ascertained as part of a study of relatives of probands with ASD + conduction disorder (Emanuel et al., 1975) and by the inclusion of small families with ASD + conduction disorder but exclusion of small families with ASD alone. It is also possible that families with conduction disorder are more likely to be published, particularly as there is a striking tendency to progression of the conduction disease with age, and there are numerous instances of sudden unexpected death in family members. Nonetheless, it does appear likely that dominant ASD with conduction abnormalities is more common than dominant ASD without conduction abnormalities.
While atrial septal defect is the predominant cardiac lesion in all of the reported families summarised in Table 1.2, the majority of the families include at least one affected individual with other forms of CHD in addition to or instead of ASD. There are a wide variety of lesions, but valvular abnormalities, particularly affecting the mitral valve, are common, as are VSDs. On the whole, penetrance for some form of cardiac abnormality is high, and in families with ASD with conduction abnormalities, there are often individuals with structurally normal hearts but with abnormal cardiac conduction. Conduction abnormalities range from mild prolongation of the PR interval to complete heart block and in many
families there is progression with age. Older individuals often report syncopal episodes and sudden death in adulthood is not uncommon.
The family reported by Mégarbané et al (Megarbane et al., 1999) is also worthy of mention here. Although reported as a syndromal form of ASD, 8/15 affected individuals had no noncardiac features. In many of the individuals who did have noncardiac abnormalities these were relatively mild (pectus excavatum and hypertelorism). Two subjects had cleft lip or cleft lip and palate; there were no other malformations of note. Cardiac malformations other than ASD included PS, MS, Ebstein anomaly and AS. One subject had Wolff-Parkinson-White syndrome, which was also observed in two members of the family reported by Zuckerman et al and Lynch et al (Zuckerman et al., 1962; Lynch et al., 1978). Seven had right bundle branch block but there were no other conduction abnormalities and this family is probably best classified in the group of ASD without conduction abnormalities.
Subsequent to the reports summarised in Table 1.2, mutations in six genes have been associated with autosomal dominant ASD. These are TBX5 (Basson et al., 1997), NKX2-5 (Schott et al., 1998) (both associated with conduction abnormalities), GATA4 (Garg et al., 2003), MYH6 (Ching et al., 2005), ACTC (Matsson H et al., 2005) and TBX20 (Kirk et al., 2007) (this study; chapter 3). Mutations in the latter four genes are not associated with conduction abnormalities. In addition, there has been one report of a family linked to chromosome 5p for which no causative mutation has yet been identified (Benson et al., 1998). Mohl and Mayr reported linkage to the HLA complex on chromosome 6p in three families, two quite small (Mohl and Mayr, 1977). The paper is very brief (only a page in length) and contains no pedigrees. It is difficult to be confident about the reliability of these findings, which have never been replicated.
Table 1.2: Reports of dominant ASD prior to the first identification of causative mutations

Reference(s) | Conduction abnormalities (number affected) | Number of individuals with CHD | Number with ASD | Other cardiac abnormalities | Number of nonpenetrant individuals
(Zetterqvist P, 1960) (Johansson BW and Sievers J, 1967) (Zetterqvist P et al., 1971)¹ | No | 20+² | 20+ | - | -³
(Zuckerman et al., 1962) (Lynch et al., 1978)¹ | No | 15 | 11 | WPW, VSD, PDA, MS, Complex | 1⁴
(Williamson EM, 1969) | No | 11 | 5 | CMP, VSD, MS, Coarct | -
(Benson et al., 1998)⁵˒⁶ | No | 12 | 9 | ASA | 1
(Benson et al., 1998) | No | 11 | 6 | PDA, Bicuspid AV, AS | -
(Weil and Allenstein, 1961) | Yes (3) | 5 | 5 | VSD, PS | -
(Kahler et al., 1966) (Maron et al., 1978)¹ | Yes (10) | 10 | 7 | - | -⁷
(Amarasingham R and Fleming, 1967) | Yes (3) | 3 | 3 | - | -
(Bizarro et al., 1970) | Yes (11) | 16 | 16 | VSD, PS, MS, AF | -⁸
(Bjornstad P, 1974) | Yes (10) | 10 | 10⁹ | AF, MS | -
(Emanuel et al., 1975)¹⁰ | Yes (3) | 4 | 4 | MR | -
(Emanuel et al., 1975) | Yes (2) | 3 | 3 | Cleft MV | -
(Emanuel et al., 1975) | Yes (3)¹¹ | 7 | 3 | MV abn | -
(Emanuel et al., 1975) | Yes (2)¹² | 1 | 1 | - | -
(Pease et al., 1976) | Yes (11)¹³ | 15 | 8 | Coarct, TOF, TOF+PA, PTA, VSD, AS | 4+¹⁴
(Schaede and Ramacher, 1977) | Yes (3) | 5 | 5 | - | -
(Bosi et al., 1992) | Yes (3) | 3 | 3 | - | -¹⁵

WPW – Wolff-Parkinson-White syndrome; VSD – ventricular septal defect; PDA – patent ductus arteriosus; AS – aortic stenosis; bicuspid AV – bicuspid aortic valve; MS – mitral stenosis; Complex – complex CHD, lethal in infancy; CMP – cardiomyopathy; Coarct – aortic coarctation; ASA – atrial septal aneurysm; AF – atrial fibrillation (only noted if affecting multiple family members); MR – mitral regurgitation; cleft MV – cleft mitral valve; MV abn – cusps thin with nodular edges and elongated chordae; no evidence of rheumatic heart disease (one individual); TOF – tetralogy of Fallot; TOF+PA – tetralogy of Fallot with pulmonary atresia; PTA – persistent truncus arteriosus

Only reports of families with >4 affected individuals or with associated A-V conduction block are included. There are numerous reports of families with smaller numbers of affected individuals without conduction block, which are hard to distinguish from familial clustering due to multifactorial causation and therefore are not included here. The presence of right bundle branch block is not regarded as sufficient for families to be classified as having conduction abnormalities.

1. Multiple reports of the same family
2. 12 confirmed by cardiac catheterization; 8 based on clinical assessment; others in earlier generations suspected to have been affected based on history
3. No definite nonpenetrance but diagnosis uncertain in deceased members of earlier generations
4. In addition, 3 obligate mutation carriers were identified as probably affected on history
5. Two families in one report. A third family reported in this paper was subsequently shown to be segregating an NKX2-5 mutation (Schott et al., 1998) and is discussed in the introduction to Chapter 3.
6. This family linked to 5p
7. No unaffected individuals but 3/10 had conduction abnormalities with structurally normal hearts
8. One individual with clinical diagnosis only. Two obligate heterozygotes not examined and no history available (deceased).
9. 4/10 diagnosis on history only (deceased individuals)
10. Study of relatives of 10 probands with ASD + conduction delay; 4 families with >1 affected individual identified, listed separately here
11. 4/7 affected family members deceased, exact nature of CHD and presence of conduction abnormalities uncertain
12. Father with ASD + conduction abnormality; son with conduction abnormality but structurally normal heart
13. 2/11 had structurally normal heart
14. In one branch of the family, a grandmother (daughter of a deceased obligate heterozygote) had a normal heart, as did three of her children each of whom had one affected child. It seems very unlikely that three of 11 children in the last generation would have had CHD by chance (i.e. phenocopy), implying nonpenetrance in 4 members of this branch of the family. In other branches of the family, there were individuals who may have been nonpenetrant. One deceased obligate heterozygote had no history of CHD. In another branch of the family, an affected individual had an unaffected son and grandson but affected great-grandson. In another branch of the family, an unaffected son of an obligate heterozygote had an unaffected son who had an affected daughter. This case is more likely to have been a phenocopy because there were 24 individuals in that branch of the family of whom only one was affected.
15. One individual with structurally normal heart but with conduction delay
1.7.3 Multifactorial/polygenic causation of ASD Most common human diseases can be viewed as having multifactorial causation, with contributions from both genes and environment. In general, the genetic component involves contributions from multiple genes (usually modelled as having additive effects), termed “polygenic inheritance”. Since polygenic disorders (like all genetic disease) will inevitably be influenced by the environment to some degree, and multifactorial causation implies a contribution from multiple genes, the terms are effectively interchangeable. For continuous traits, such as blood pressure or height, it is relatively straightforward to posit the interaction of a number of genes with environmental factors, some of which
(eg salt intake for blood pressure and nutrition in childhood for height) can be readily identified.
The proposition that most CHD is multifactorial in origin was put by Nora in 1968 (Nora, 1968) and was widely accepted for some time thereafter. In favour of the concept were the recurrence risks for CHD – generally on the order of 2-5%, similar to the square root of the incidence for individual lesions, as predicted for multifactorial causation. Studies of heritability suggested heritability of ~0.6 for CHD in general (Williamson EM, 1969; Burn J and Goodship J, 2002). Other evidence, such as the work on PDA by Zetterqvist (Zetterqvist P, 1972), discussed by Burn and Goodship (Burn J and Goodship J, 2002), supported multifactorial causation. PDA showed the characteristics expected of a disorder with multifactorial causation: the recurrence risks were as above, the risks to sibs and offspring were about equal, the risk to more distant relatives declined rapidly, and the risk increased when there were multiple affected family members.
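The square root relationship mentioned above can be made concrete with a short Python sketch. The incidences used are hypothetical round figures chosen only to span the 2-5% range quoted in the text; they are not estimates for any particular lesion, and the function name is an assumption of this example.

```python
import math

def approx_first_degree_risk(incidence: float) -> float:
    """Square-root approximation: recurrence risk ~ sqrt(population incidence)."""
    if not 0.0 < incidence <= 1.0:
        raise ValueError("incidence must be a proportion in (0, 1]")
    return math.sqrt(incidence)

# Hypothetical population incidences, chosen only for illustration
for label, incidence in [("1 in 2500", 1 / 2500), ("1 in 1000", 1 / 1000), ("1 in 400", 1 / 400)]:
    risk = approx_first_degree_risk(incidence)
    print(f"incidence {label}: approximate recurrence risk {risk:.1%}")
```

With these illustrative inputs the approximation gives risks of about 2.0%, 3.2% and 5.0%, which is why recurrence risks of 2-5% were regarded as compatible with the multifactorial model.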
More recent evidence, however, has called this model into question. In particular, studies of the offspring of adults with CHD suggest a higher risk to offspring of affected individuals than to sibs, with a higher risk to offspring if the affected parent is the mother (Buskens et al., 1995; Burn et al., 1998; Romano-Zelekha et al., 2001). Even taking into account the fact that a proportion of families have CHD due to single gene disorders (as discussed in 1.4.2 above), which makes interpretation of studies of the relatives of affected individuals more difficult, it seems likely that at least some forms of CHD will prove to be best explained by means other than the standard multifactorial model. In particular, hypoplastic left heart syndrome (Boughman et al., 1987) and AVSD (Burn et al., 1998) have been proposed to fit single gene models, and tetralogy of Fallot an oligogenic model (Burn et al., 1998).
Notwithstanding these doubts about precise mechanisms, the debate is about how genes and environment interact to produce CHD, rather than whether they do.
1.7.3.1 Excess of females affected by ASD
Since sex is genetically determined, it is relevant to comment at this point that there is an apparent excess of females affected by ASD. Not all studies confirm this finding, with the reported ratio of boys:girls ranging from 1.5:1 to 1:3 (Samanek, 1994), but larger studies generally do show an excess, with typically about 55-60% of affected individuals being female (Rothman and Fyler, 1976; Ferencz C et al., 1997).
1.7.3.2 QTL for CHD
A Medline search combining the terms Quantitative Trait Loci/ and Heart/ (performed on 29th September, 2007) gave 22 results, none of which referred to CHD; replacing Heart/ with Heart Defects, Congenital/ yielded only one result, which did not in fact refer to a QTL study, and replacing Heart/ with Cardiovascular Diseases/ yielded 48 results, none of which was relevant to CHD. However, the published report of the findings described in chapter 6 (Kirk et al., 2006) was not identified by any of these search strategies, suggesting that there may have been other such studies which are not readily identifiable using Medline. Although there have been numerous QTL studies relevant to cardiovascular disease (Yagil and Yagil, 2006), the focus of such research has been phenotypes relevant to atherogenic disease, not CHD. It seems likely that the study reported here represents the first effort to identify QTL potentially relevant to congenital heart disease.
1.7.4 Environmental factors
There have been numerous studies of environmental contributors to CHD. There are several clearly teratogenic influences which are capable of producing CHD, probably with a relatively small influence from genetic effects. On the other hand, there are numerous factors which have been identified as making relatively small contributions in epidemiologic studies. The focus here will be on environmental contributors to the causation of ASD rather than on CHD in general.
1.7.4.1 Major teratogens
These can be divided into infections, the effects of maternal exposure to medications and other substances, and the effects of maternal illness such as diabetes. Fetal rubella infection, particularly in the first trimester, is strongly associated with CHD (up to 48% affected (Overall, 1972)), with ASD being one of the more common lesions. Maternal exposure to medications is not particularly strongly associated with ASD, although ASD has been reported as part of retinoic acid embryopathy (Lammer et al., 1985) and the fetal valproate syndrome (Clayton-Smith and Donnai, 1995).
Maternal consumption of large amounts of ethanol during pregnancy may cause the fetal alcohol syndrome (FAS). About a third of children with the full-blown syndrome have CHD, typically ASD or VSD (Sandor et al., 1981). However, it is possible that exposure to smaller amounts of ethanol in the first trimester may produce ASD without other features of FAS (see section 1.7.4.2, below). It is likely that genetic factors contribute to individual susceptibility to the effects of ethanol on the fetal heart, as for other features of the syndrome (Gemma et al., 2007).
Similarly, maternal diabetes can produce a multi-system embryopathy which can include CHD. Although ASD is not commonly reported as part of this syndrome (Ferencz et al., 1990), it is mentioned here because some of the epidemiological surveys discussed in 1.7.4.2 support a contribution from maternal diabetes to the causation of some cases of ASD.
1.7.4.2 Other environmental factors
The distinction made here between “major teratogens” and “other environmental factors” is arguably somewhat artificial, since an environmental influence which causes a malformation is by definition a teratogen. The intention is to distinguish between exposures which are clearly the major cause of ASD in at least some individuals, and those for which there is statistical evidence of an association but which are likely to be but one of multiple factors contributing to ASD. Moreover, some of the factors discussed here, such as birth order, are not likely to be teratogenic even under the strictest interpretation of the term. It is possible that birth order is a marker for advancing maternal and paternal age, with an associated increase in risk of chromosomal abnormalities or new dominant mutations. Alternatively, large family size may reflect low socio-economic status, which could increase the risk of ASD on a purely environmental basis. Clearly, it is hard in some cases to separate apparent environmental influences from genetic effects.
Papers summarised in Tables 1.3 and 1.4 are restricted to those for which separate data for ASD are given. Studies of paternal and maternal age effects are included here, because such studies are done in conjunction with studies of environmental influences. Two studies in particular, those of Tikkanen and colleagues in Finland (Tikkanen and Heinonen, 1992; Tikkanen and Heinonen, 1991; Tikkanen J and Heinonen OP, 1990) and the Baltimore-Washington Infant Study (Boughman et al., 1987; Ferencz et al., 1985; Ferencz C et al., 1997) provide the bulk of the data discussed here.
Table 1.3: Environmental exposures and other factors significantly associated with risk of ASD
Exposure | Reference | Country | OR (95% CI) | p value (if OR not reported) | Negative studies of same exposure | OR (95% CI) of negative studies
Birth order (first born as risk factor) | (Rothman and Fyler, 1976) | USA | N/A | 0.02 [1] | |
Birth order (>1st born as risk factor) | (Zhan et al., 1991) | China | 2.139 (1.109-4.126) [2] | - | |
Birth order (fourth or subsequent child vs first born) | (Ferencz C et al., 1997) | USA | 1.7 (1.1-2.5) [3] | - | |
Maternal fever >38°C | (Tikkanen and Heinonen, 1991) | Finland | N/A | <0.01 | (Stoll et al., 1989) | 0.76 (0.18-1.29)
Maternal alcohol consumption in first trimester | (Tikkanen J and Heinonen OP, 1990; Tikkanen and Heinonen, 1992) [4] | Finland | 2.0 (1.1-3.6) [5] | - | (Ferencz C et al., 1997) | 1.2 (0.9-1.6)
Maternal exposure to chemicals at work in first trimester | (Tikkanen J and Heinonen OP, 1990; Tikkanen and Heinonen, 1992) [4] | Finland | 1.9 (1.1-3.4) | - | (Stoll et al., 1989) | 1.29 (0.67-2.52)
Paternal exposure to paint stripping | (Ferencz C et al., 1997) | USA | 1.6 (1.0-2.6) | | |
Paternal exposure to “miscellaneous solvents” | (Ferencz C et al., 1997) | USA | 1.7 (1.1-2.7) | | |
Paternal age [6], father’s age 45-49 | (Olshan et al., 1994) | USA | 2.7 (1.3-5.8) | - | (Ferencz C et al., 1997) [7]; (Zhan et al., 1991) [8]; (Stoll et al., 1989) [9] | 1.0 (0.8-1.4); 0.898 (0.78-1.034); 0.61 (0.38-1.06)
Maternal education (< high school) | (Ferencz C et al., 1997) | USA | 1.4 (1.0-2.0) | | |
Gestational diabetes | (Ferencz C et al., 1997) | USA | 2.4 (1.4-4.3) [10] | | (Stoll et al., 1989) | 1.64 (0.38-7.09)
Urinary tract infection in first trimester | (Ferencz C et al., 1997) | USA | 1.7 (2.2-2.5) | | |
Bleeding during pregnancy | (Ferencz C et al., 1997) | USA | 1.5 (1.0-2.2) | | |
Corticosteroids in first trimester | (Ferencz C et al., 1997) | USA | 5.1 (2.1-12.7) | | |
Paternal cocaine consumption | (Ferencz C et al., 1997) | USA | 2.3 (1.3-4.2) | | |
Paternal smoking >20 cigarettes/day | (Ferencz C et al., 1997) | USA | 1.7 (1.1-2.7) [11] | | |
Paternal occupational exposure to extremely cold temperatures [12] | (Ferencz C et al., 1997) | USA | 8.7 (2.6-28.4) | | |
OR = odds ratio; CI = confidence interval
1. Significantly higher risk for first born (p = 0.02) compared with observed distributions for other cardiac disorders, but no clear pattern in subsequent children, i.e. no significant fall in risk for third vs second or fourth vs third child. No OR provided. The same study found an increase in risk with young maternal age (<20 years) which was not independent from the birth order risk. Note that Ferencz et al (Ferencz C et al., 1997) found no significant association with young or old maternal age.
2. OR for birth order 2 or higher, compared with birth order 1.
3. No significant association with birth order otherwise. However, note the positive association with advanced maternal age from the same study, consistent with this finding.
4. Reported twice from essentially the same dataset.
5. Significant association was for the group “at least a single drink in first trimester”, 56% of 50 cases vs 39% of 756 controls. No significant association for regular alcohol consumption (every week) or for “at least 2-3 drinks per occasion” but the numbers were much smaller in each of these groups. Figure from Ferencz et al is for “any amount consumed in first trimester”.
6. Paternal age 45-49 compared with paternal age 25-29. No other age groups had significant associations and there was no clear trend to lower risk with lower paternal age and higher risk with higher paternal age. Thus, the status of this association is doubtful.
7. Paternal age >29 compared with paternal age 20-29. Younger paternal age (<20) also not significantly associated.
8. Paternal age <24 years compared with 24 years+.
9. Comparison not explicitly described. Mean paternal age 29.5 vs controls 29.2.
10. “Overt” diabetes was not significantly associated with ASD, OR 2.5 (95% CI 0.7-8.5), although the numbers were very small for this exposure (only 3 cases) and it seems likely that a larger study might reveal an increased risk.
11. No association with paternal smoking of 20 or fewer cigarettes/day.
12. 4/187 cases, 9/3572 controls.
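To make the tabulated odds ratios easier to interpret, the counts given in footnote 12 can be turned into an odds ratio and confidence interval. This is a minimal sketch, not the method documented by the original study: it assumes that 187 and 3572 are the total numbers of cases and controls (so that the unexposed groups are 183 and 3563), and it uses the standard log-odds (Woolf) approximation for the 95% CI. The function and variable names are hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, total_cases, exposed_controls, total_controls, z=1.96):
    """Odds ratio with an approximate 95% CI from the log odds ratio (Woolf method)."""
    a = exposed_cases
    b = total_cases - exposed_cases        # unexposed cases
    c = exposed_controls
    d = total_controls - exposed_controls  # unexposed controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from footnote 12: 4 of 187 cases and 9 of 3572 controls exposed.
or_, lo, hi = odds_ratio_ci(4, 187, 9, 3572)
print(f"OR {or_:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```

Under these assumptions the result rounds to the tabulated value of 8.7 (2.6-28.4). The width of the interval reflects the very small number of exposed cases.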
Table 1.4: Environmental exposures and other factors with no significant association with risk of ASD
Exposure | Reference | Country | OR (95% CI) | p value (if OR not reported)
Maternal age | (Zhan et al., 1991) | China | 1.034 (0.932-1.15) [1] | -
 | (Stoll et al., 1989) | France | 1.01 (0.59-1.72) [2] | -
Workplace temperature during first trimester (mother) | (Tikkanen and Heinonen, 1991) | Finland | N/A | >0.05
Frequency of maternal sauna bathing during first trimester | (Tikkanen and Heinonen, 1991) | Finland | N/A | >0.05
Maternal epilepsy | (Stoll et al., 1989) | France | 1.01 (0.99-1.03) | -
Maternal X-rays in first trimester | (Stoll et al., 1989) | France | 0.82 (0.11-1.35) | -
 | (Ferencz C et al., 1997) | USA | 1.2 (0.4-3.9) [3] |
Maternal hypertension in first trimester | (Stoll et al., 1989) | France | 0.43 (0.16-1.13) | -
Maternal “flu” in first trimester | (Stoll et al., 1989) | France | 0.73 (0.26-2.03) | -
 | (Ferencz C et al., 1997) | USA | 0.8 (0.4-1.5) |
Maternal medication in first trimester | (Stoll et al., 1989) | France | 0.83 (0.24-1.34) | -
 | (Tikkanen and Heinonen, 1992) | Finland | 1.0 (0.4-2.2) (all medications); 0.3 (0-2.4) (salicylic acid) |
 | (Boneva et al., 1999) [3] | USA | 0.61 (0.15-2.41) | -
Maternal smoking in first trimester | (Stoll et al., 1989) | France | 0.71 (0.42-1.19) | -
 | (Tikkanen and Heinonen, 1992) | Finland | 0.7 (0.3-1.6) (1-14 cigarettes/day); 0.8 (0.1-5.7) (15-29 cigarettes/day) | -
 | (Ferencz C et al., 1997) | USA | 1.5 (0.9-2.1) (1-10 cigarettes/day); 1.1 (0.7-1.7) (11-20 cigarettes/day); 1.6 (0.9-2.8) (>20 cigarettes/day) |
Exposure to passive smoking in first trimester | (Tikkanen and Heinonen, 1992) | Finland | 1.0 (0.5-1.9) (exposure at home); 0.5 (0.2-1.5) (exposure at work) |
Maternal coffee consumption | (Tikkanen and Heinonen, 1992) | Finland | 0.7 (0.4-1.5) |
Maternal use of deodorants | (Tikkanen and Heinonen, 1992) | Finland | 1.2 (0.7-2.2) |
Maternal work attendance in first trimester | (Tikkanen and Heinonen, 1992) | Finland | 1.2 (0.6-2.4) |
Maternal regular exposure to organic solvents at work in first trimester | (Tikkanen and Heinonen, 1992) | Finland | 2.6 (0.7-9.1) [4] |
Maternal regular exposure to dyes, lacquers or paints at work in first trimester | (Tikkanen and Heinonen, 1992) | Finland | 2.2 (0.3-1.8) [4] |
OR = odds ratio; CI = confidence interval; N/A = not available
1. Maternal age 29 years+ compared with <29 years
2. Anti-nausea medication only
3. Occupational exposure
4. Note significant effect from “exposure to chemicals at work” from same study
As can be seen from these tables, the literature is contradictory at times and some of the reported significant associations are difficult to interpret or biologically implausible. A brief discussion of each of these follows.
The data on birth order and parental age are contradictory, with two studies suggesting later birth order as a risk factor for ASD (Zhan et al., 1991; Ferencz C et al., 1997) and one suggesting that the first born is at higher risk (Rothman and Fyler, 1976). Zhan et al find that advanced maternal age is associated with increased risk, whereas Rothman et al find that lower maternal age is associated with increased risk. In a fourth study (Olshan et al., 1994), higher paternal age is associated with increased risk, but the results are significant only for a single age range and need to be treated with caution. Nonetheless, overall it seems likely that late birth order and advancing parental age may be minor risk factors for ASD. The mechanism for this is uncertain, as discussed above.
Tikkanen and Heinonen (Tikkanen and Heinonen, 1991) exhaustively investigated the possibility that maternal hyperthermia in the first trimester may contribute to causing CHD. CHD was divided into 5 categories, of which ASD was one. They studied maternal fever >38°C, workplace temperature, sauna bathing (which is very popular in Finland), month of birth (because of the marked seasonal temperature variations in Finland), and a history of upper respiratory tract infection and acetylsalicylic acid use as a marker of febrile illness. The only significant association for ASD was with maternal fever. There were relatively few positive associations among the other groups of malformations. Even assuming this is a genuine association, it is not clear whether the putative teratogenic effect relates to the fever per se or to teratogenic effects of the pathogen responsible for the fever. Two other studies did not find a relationship between viral upper respiratory tract infections in the first trimester and ASD (Stoll et al., 1989; Ferencz C et al., 1997), and Stoll et al also studied fever directly and found no association.
Given the teratogenic effect of ethanol in the fetal alcohol syndrome and of diabetes in diabetic embryopathy (discussed in 1.7.4.1), it is not surprising that positive associations with ASD have been reported (Tikkanen J and Heinonen OP, 1990; Tikkanen and Heinonen, 1992; Ferencz C et al., 1997), although not all studies confirm a role for ethanol (Ferencz C et al., 1997) or for diabetes (Stoll et al., 1989). Urinary tract infection and maternal corticosteroid exposure were also identified as risk factors in one study (Ferencz C et al., 1997). On the other hand, maternal smoking, including passive smoking, has been extensively studied and no evidence has been found of any link with ASD (Stoll et al., 1989; Tikkanen and Heinonen, 1992; Ferencz C et al., 1997).
The role of occupational exposures, if any, remains uncertain. Maternal exposure to chemicals at work was implicated in one study (Tikkanen and Heinonen, 1992) but not confirmed in another (Stoll et al., 1989). Given the diversity of chemicals used in workplaces it seems unlikely that a broad grouping of all chemical exposures would provide meaningful results. When more specific exposures (organic solvents and dyes, lacquers and paints) were studied, no association was found (Tikkanen and Heinonen, 1992), but the numbers exposed were small and it is possible that a larger study would identify a real causative role for such exposures.
Surprisingly, paternal exposures have also been identified as risk factors. These include paternal exposure to paint stripping, “miscellaneous solvents”, and extremely cold temperatures (Ferencz C et al., 1997). Paternal smoking and cocaine use are also significantly associated with ASD (presumably not usually in an occupational setting) (Ferencz C et al., 1997). Again, it is difficult to draw a biologically plausible link between such exposures and ASD.
Mutagenesis affecting sperm is implausible as a mechanism. It is possible that these associations represent markers for other factors, such as socioeconomic status, although there is little direct evidence for that as a significant contributor, other than an association with low maternal education levels (Ferencz C et al., 1997).
1.8 Project outline
The aim of this project was to add to our understanding of the genetics of CHD, with a focus on ASD and PFO. Complementary strategies, involving mouse and man, were used.
In human subjects, two separate studies were undertaken, both focusing on Mendelian forms of CHD. In the first, subjects with CHD including ASD, PFO and a variety of other lesions were screened for mutations in the cardiac transcription factors NKX2-5, GATA4 and TBX20. This work is described in chapter 3. In the second study, a large family with a previously undescribed autosomal dominant ASD syndrome was investigated, with clinical evaluation and an attempt to map the disorder. This is described in chapter 4.
The major mouse study, reported in chapters 5 and 6, was a QTL study using an F2 intercross design. The parental strains were 129T2/SvEms and QSi5. In addition, an Advanced Intercross Line (AIL) using the same parental strains has been bred to completion and ~1000 mice from the final generation (F14) phenotyped. Analysis of phenotype data from the parental strains and from F1, F2 and F14 mice is presented in chapter 5. Chapter 6 describes the results of the QTL study. After this study was completed, 10 additional mouse strains were phenotyped as part of work towards an analysis based on the recently published mouse HapMap data. Phenotype data from these mice are reported in chapter 7. Overall conclusions and future directions are discussed in chapter 8.
2. Materials and Methods
2.1 Mouse experiments
2.1.1 Ethics committee approval
Animal experiments were performed under Animal Care and Research Ethics approvals N02/2-2001/1/3336, N02/2-2001/2/3336 and N02/2-2001/3/3336 (for all studies other than the AIL) and N00/4-2003/1/3745, N00/4-2003/2/3745 and N00/4-2003/3/3745 (for the AIL) from the University of Sydney.
2.1.2 Animal resources
Parental inbred mice were obtained from the Centre for Advanced Technologies in Animal Genetics and Reproduction, University of Sydney (QSi5), and from the Garvan Institute (129T2/SvEms). Mice were kept in a rodent facility at the University of Sydney in a purpose-built air-conditioned room with a 12 hour light/dark cycle and ad libitum access to food and water until dissection at 6-8 weeks of age. The aim was to dissect at 6 weeks and efforts were made to keep the age of dissection as close to that as possible. In practice, breeding production varied from week to week, so that a variable number of mice reached the age of 6 weeks in any given week. It was not always possible to dissect all the available mice in a particular week, and if this happened the excess mice were kept until they could be dissected.
For the F2 study, a total of 1437 mice (680 female, 757 male) were dissected on 63 sampling days over a 9 month period. Mean age at dissection was 46.3 days (SD 3.46, range 39-60). Breeding of the AIL took three years and three months, including the period during which the dissections were done. A total of 1003 AIL F14 mice were dissected on 34 sampling days over a 6 month period, but data for 27 of these were lost as the result of a motor vehicle accident. Of the remaining 976 F14 mice, 480 were male and 496 female. Mean age at dissection was 44.0 days (SD 2.6, range 40-55).
In addition, 75 129T2/SvEms mice and 137 QSi5 mice were dissected, although data are analysed in chapter 5 for only 66 of the QSi5 mice. This is because only minimal information was recorded about the first 71, which were dissected as part of learning the dissection technique. Subsequently, 85 F1 mice were also dissected. An additional 280 mice from the HapMap strains were dissected. Of these, Ms Noelia Lopez did part or all of the dissections for 232. Of these, in turn, 147 were co-measured by EK and Ms Lopez (results recorded by each without knowledge of the other’s measurement), and 28 were dissected and measured by Ms Lopez and not co-measured by EK. The total number of mice dissected during the course of this project was 3017.
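The counts reported in this section can be cross-checked with simple arithmetic. The sketch below is purely illustrative bookkeeping, not part of the original analysis. The variable names are hypothetical, and the only assumption is that the grand total counts every dissected mouse, including the 27 AIL mice whose data were lost and the 71 QSi5 mice dissected while learning the technique.

```python
# Counts as reported in the text of this section.
f2 = {"female": 680, "male": 757}
ail_dissected, ail_lost = 1003, 27
ail_f14 = {"male": 480, "female": 496}
parental = {"129T2/SvEms": 75, "QSi5": 137}
qsi5_training = 71   # QSi5 mice dissected while learning the technique
f1 = 85
hapmap = 280

f2_total = sum(f2.values())
ail_remaining = ail_dissected - ail_lost
qsi5_analysed = parental["QSi5"] - qsi5_training
grand_total = f2_total + ail_dissected + sum(parental.values()) + f1 + hapmap

print(f2_total, ail_remaining, qsi5_analysed, grand_total)
```

Under that assumption the sub-totals agree with the text: 1437 F2 mice, 976 F14 mice with data, 66 analysed QSi5 mice, and 3017 mice overall.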
2.1.3 Breeding protocols
2.1.3.1 F2 mice
The initial matings for the F2 mice were between 129T2/SvEms sires and QSi5 dams. The resulting F1 mice were then crossed to produce F2 mice. As the parental strains were inbred mice, and hence essentially homozygous at every locus, F1 mice were heterozygous at every locus at which the strains differ. As discussed in Chapter 1, meiotic recombination results in F2 mice inheriting variable contributions from each of the parental strains. On average, at any given locus, the expected ratio of genotypes in F2 mice should be 1 aa : 2 ab : 1 bb, where a represents the parental allele from one strain and b the parental allele from the other. The cartoon (Fig 2.1) illustrates breeding and transmission of alleles for F2 and onwards, as for the AIL pedigree (see below).
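The expected F2 genotype ratio at a single locus can be derived by enumerating the four equally likely allele combinations from two heterozygous F1 parents. The following is a minimal sketch of that enumeration; the function name is hypothetical, and the labels a and b follow the convention used in the text.

```python
from collections import Counter
from itertools import product

def f2_genotype_counts(sire=("a", "b"), dam=("a", "b")):
    """Enumerate equally likely offspring genotypes at one locus for an F1 x F1 cross."""
    # Sort each allele pair so that 'ab' and 'ba' are counted as the same genotype.
    return Counter("".join(sorted(pair)) for pair in product(sire, dam))

counts = f2_genotype_counts()
print(dict(counts))  # {'aa': 1, 'ab': 2, 'bb': 1}
```

This describes the expectation at a single locus only. Across the genome, recombination means that each F2 mouse carries a different mosaic of the two parental genomes.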
2.1.3.2 Advanced intercross line
Initial breeding for the AIL proceeded exactly as for the original F2 resource. However, breeding then continued for a further 12 generations, and the F14 mice were dissected. Sufficient F2 mice were bred to stock 48 cages, each containing one male and one female mouse. The offspring of these mice were the F3 mice, and so on. For the F3 x F3 and subsequent matings, avoiding inbreeding was essential, to reduce the risk of genetic drift causing loss of genetic information within the AIL. To avoid brother/sister matings, a system of cascading matings was used. Wherever possible, a female mouse born in one cage would be mated with a male mouse from the next cage, and so on, as illustrated in Table 2.1.
Figure 2.1 Cartoon illustrating the breeding scheme used. 129T2/SvEms mice are white-bellied agouti chinchilla in colour (Eppig et al., 2005), and are represented here as dark grey; QSi5 mice are albino (Holt et al., 2004); the F1 mice were chinchilla and are represented as light grey; and the F2 mice were white, chinchilla or agouti in a ratio of 1:2:1. The parental 129T2/SvEms chromosomes are shown in grey and the parental QSi5 chromosomes are shown in white; recombinant chromosomes in F2 and subsequent generations have a mixture of 129T2/SvEms and QSi5 parental chromosomal material.
Table 2.1: Breeding scheme for AIL
 | Box 1 | Box 2 | Box 3 | Box 4 | Box 5
Offspring of previous mating present in box | M1 F1 | M2 F2 | M3 F3 | M4 F4 | M5 F5
Mice used for next mating | M1 FR | M2 F1 | M3 F2 | M4 F3 | M5 F4
M1 = male born in box 1, M2 = male born in box 2, F1 = female born in box 1, and so on. FR = random female (see below)
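The cascading scheme in Table 2.1 can be expressed as a simple pairing rule: the male born in each box is mated with the female born in the preceding box, and the male from box 1 receives a reserve ("random") female. The sketch below is an illustration of that rule only. The function name and label format are hypothetical, and it ignores the practical exceptions described in the text (replacements for dead or infertile mice, and delays between boxes).

```python
def cascade_pairs(n_boxes: int, random_female: str = "FR"):
    """Return (male, female) labels for the next round of matings, one pair per box."""
    if n_boxes < 1:
        raise ValueError("need at least one box")
    pairs = []
    for box in range(1, n_boxes + 1):
        male = f"M{box}"
        # Box 1 has no preceding box, so its male is paired with a reserve female.
        female = random_female if box == 1 else f"F{box - 1}"
        pairs.append((male, female))
    return pairs

for male, female in cascade_pairs(5):
    print(male, "x", female)
```

Because no pair shares a box of origin, the rule never produces a brother/sister mating, which is the property the scheme was designed to guarantee.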
In practice, it took several rounds of breeding to produce sufficient F2 mice to stock all 48 boxes. This meant that the boxes with higher numbers produced litters later. Additionally, there was considerable variation in inter-litter times. This meant that it was not possible in any generation to mate the male from box 1 with a female from box 48, although on several occasions females from box 48 were used in other matings. A selection of male and female mice from each generation were kept as reserves, usually 3 boxes each of males and females, with the mice in each box selected from boxes 1-16, 17-32, or 33-48. If a mouse needed to be replaced (e.g. because it had died before reproducing) a replacement would be selected at random, from one of the two reserve boxes chosen from the range which did not include that mouse’s original box. This avoided inbreeding without the need to keep reserve mice from every litter. On some occasions it was possible to use a female from box 48 as a “random” mouse. Most generations required 0 or 1 such substitutions, with a maximum of 3 substitutions being recorded in any one generation. It was also necessary to use random females to mate with the males from box 1, for every generation after F2. Rarely (once every 3-4 generations), it was also necessary to use random mice if a pair of mice failed to produce offspring for a prolonged period. No specific time limit was set for a pair of mice to produce offspring. However, there was a policy of not setting up matings for one generation until all matings for the previous generation had been set up (meaning that there were only ever a maximum of two generations present in the colony at any one time) and this influenced the timing of such decisions.
Despite the variability in inter-litter time for individual boxes, the intergeneration time averaged 11 weeks over the course of the breeding programme, only two weeks more than the expected minimum inter-litter time of 9 weeks. Fairly frequently, there were delays of several weeks between one box being ready to mate and the next box in the series being ready. As weaned litters were not separated by sex, due to a lack of space, this created the risk that females would already be pregnant by their littermates at the time of mating. This would be undesirable as it would lead to inbreeding and loss of genetic information from the colony. Careful attention was therefore paid to the time between mating and birth of the first litter of pups in a box; any litters born before 21 days post-mating were culled.
2.1.4 Mouse phenotyping
2.1.4.1 Initial dissection
Mice were killed by asphyxiation in carbon dioxide. Within 15 minutes of death, the thoracic organs were removed as follows. A nick was made in the abdominal skin and the skin was firmly distracted rostrally and caudally, exposing the thorax and abdomen. The xiphisternum was grasped with toothed forceps and elevated. Using fine scissors, an incision was made below the xiphisternum and this was extended laterally on each side in an arc through the rib cage (convex laterally), leaving the sternum and the portions of the ribs attached to it anchored proximally and by the diaphragm, but otherwise unattached. The diaphragm was divided and the flap containing sternum and attached rib ends reflected upwards and caudally. The heart and lungs were gently lifted to expose the great vessels and oesophagus. These were then firmly grasped with non-toothed forceps and cut distally to the forceps. Firm upwards traction on the great vessels and oesophagus allowed rapid dissection of the structures passing through the thoracic inlet, with removal en bloc of the heart, lungs, thymus and associated mediastinal structures and tissues (oesophagus, trachea, fat, mediastinal lymph nodes etc). These were then placed without further dissection into a 1.5 ml Eppendorf tube, containing approximately 0.5 ml of phosphate buffered saline (PBS), for storage and transport until fine dissection could be done.
The abdominal cavity was opened with a transverse incision. The spleen was exposed by blunt dissection, removed and snap-frozen in liquid nitrogen. For the AIL mice, a tail biopsy was also taken and snap-frozen.
2.1.4.2 Fine dissection
Fine dissection and measurement were done on the same day as the initial dissection, generally within 6 hours. A Leica MZ8 dissecting microscope was used. The dissections and determination of PFO status were done under low magnification, with the measurements done under higher magnification using an eyepiece graticule. The thoracic organs were placed in a dish containing PBS and the lungs, trachea and bronchi, oesophagus, thymus, great vessels and mediastinal fat were removed. The heart was then weighed. The left atrium was then opened as illustrated in Fig 2.2.
Fig 2.2: Dissection of mouse hearts
Fig 2.2A: Initial dissection. The heart following removal of other organs, most mediastinal fat and distal great vessels. The auricle of the left atrium (marked “auricle”), the remainder of the aorta (marked “Ao”) and the stumps of the pulmonary veins (marked “PV”) are indicated by arrows.
Figure 2.2B: Opening the auricle. An incision has been made across the auricle of the left atrium, removing about half of the auricle and opening the atrium. The arrow indicates the cut edge of the auricle.
Figure 2.2C: Laying open the atrium. Fine dissecting scissors are inserted through the opening created by the previous step. The lower blade is kept in as superficial a position as possible to avoid damage to the atrial septum. A cut is made to open the atrium. This is extended through the proximal pulmonary veins for maximum exposure of the septum. Throughout the dissection the heart is held in position by a needle (indicated by arrow) inserted through the apex of the heart into the foam backing material.
Figure 2.2D: Final appearance of the heart. The heart is shown following the incision shown in 2.2C. Light traction with fine forceps is now sufficient to expose the left side of the atrial septum. Note that the right auricle (marked with an arrow) is intact. This makes it possible to pressurize the right atrium to produce a flow of blood across a patent foramen ovale, if present.
Fig 2.3: Detail of atrial septum. Atrial septal detail as seen from the left aspect after dissection as described above (A). The annulus of the mitral valve is toward the lower left of this panel. B, Same as in A, with septal landmarks and quantitative septal measurement used in this study identified. Note that the foramen ovale in A appears indistinct because it is covered by the membranous atrial septum primum. Crescent corresponds to the leftward edge of the atrial septum primum, forming a prominent ridge.
2.1.4.3 Identification of patent foramen ovale
As described in Fig. 2.2, the integrity of the right atrium was protected during dissection so that it was possible to pressurize the atrium and use the remaining blood contained in it to determine whether a PFO was present. If present, blood would pass across the septum, emerging to the left of the crescent with the heart in the orientation illustrated above. The amount of blood passing across the septum varied considerably, from free passage of large quantities of blood without pressurization of the right atrium down to tiny quantities of blood, visible only after careful and repeated pressurization of the right atrium. On rare occasions, Orange G dye was injected into the left superior vena cava under pressure, to supplement the use of the atrial blood as described above. This was usually done when repeated pressurization of the atrium had exhausted the available blood in the atrium and there was still a question as to whether a PFO was present.
2.1.4.4 Measurements of atrial septal anatomy
Measurements were done using an eyepiece graticule. Care was taken to orientate the atrial septum perpendicularly to the observer’s line of sight, to avoid parallax error. It was necessary to use dissecting forceps to hold the cut edges of the atrium apart in order to expose the atrial septum. As the septum is a highly elastic structure, measurements could be altered considerably by the amount of force used to expose the septum. An excess of stretch would result in falsely high measurements. Considerable attention was therefore directed towards maintaining a consistent amount of stretch, with the aim being to use the minimum amount of force necessary to fully expose the area of interest.
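The effect of septal orientation on a graticule reading can be illustrated with a simple geometric model. This is a sketch under stated assumptions, not a description of any calibration performed in this study: it treats the septum as a flat structure viewed by orthographic projection, so that a tilt of θ away from the ideal plane shortens the apparent length by a factor of cos θ. The function name and the 1.0 mm length are hypothetical values chosen for illustration.

```python
import math

def apparent_length(true_length_mm: float, tilt_deg: float) -> float:
    """Projected length of a flat structure tilted by tilt_deg from the ideal plane."""
    return true_length_mm * math.cos(math.radians(tilt_deg))

true_length = 1.0  # mm, illustrative value only
for tilt in (0, 5, 10, 20):
    seen = apparent_length(true_length, tilt)
    shortfall = 1 - seen / true_length
    print(f"tilt {tilt:2d} deg -> apparent {seen:.3f} mm ({shortfall:.1%} shortfall)")
```

In this simplified model small tilts matter little, but the error grows quickly with larger tilts, which is consistent with the care taken over orientation. The model does not capture the separate effect of stretch described above.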
2.1.4.5 Blinding
On the first day of dissections of the parental strains (129T2/SvEms and QSi5) an attempt was made to do dissections blinded to the strain of the mice. All hearts for the day were placed in a plastic bag and each tube containing a heart was removed with its number concealed. The heart was dissected and only once all measurements were recorded was the number checked. However, blinding to strain proved impossible. The two strains had such different atrial septal wall morphology that the strain of the mouse being dissected was instantly obvious once the atrium was opened. Attempts at blinding were therefore discontinued for this pair of strains. For the HapMap strains, measurements by EK were generally done blinded to strain. Since there was no prior information about the phenotypes for each strain, observer-introduced bias is unlikely.
For dissections of the F2 and AIL mice, the genotypes of the mice were unknown at the time of dissection, so blinding was not a consideration.
2.1.5 Strain selection for F2 and AIL studies
As described in section 1.6.5, previous work by Christine Biben and colleagues (Biben et al., 2000) had established that different strains of inbred laboratory mice have varying cardiac septal anatomy, with a particularly close relationship between mean flap valve length (FVL) and incidence of PFO (this study is discussed in detail in section 5.2). This measure formed the basis for strain selection for this study. Strains evaluated by Biben et al. included FVB/N, 129T2/SvEms, C57Bl6, and Swiss QS (the latter is not an inbred strain), as well as crosses of several of these strains and mice heterozygous for an Nkx2-5 null mutation. Of the wild-type strains and crosses, the shortest flap valve lengths were seen in 129T2/SvEms and the longest in FVB/N.
In addition to these strains, for this study a further inbred strain, QSi5, was evaluated. This strain was developed at the University of Sydney with selection for high fecundity and short inter-litter interval (Holt et al., 2004). The large numbers of mice needed for this study made this superior reproductive performance desirable. QSi5 proved to have the longest flap valve length of any strain assessed, with FVL of 1.13mm (SD 0.11). Although FVB/N had a lower incidence of PFO (FVB/N: 0 of 51, 0%, as opposed to QSi5: 3 of 66, 4.5%) the combination of the long FVL and reproductive characteristics led to the choice of QSi5 as one of the parental strains for the study. The other strain used, 129T2/SvEms, had the highest incidence of PFO (21 of 31, 75%) and shortest FVL (0.6mm, SD 0.11). Although data on the characteristics of 129T2/SvEms
were already available, a further 75 129T2/SvEms mice were dissected at the beginning of the study, to confirm the characteristics of the strain, and to provide baseline measures for FOW and CRW, measurements which are slightly different from those used in the study of Biben et al. (Biben et al., 2000).
2.2 Human subjects
2.2.1 Ethics committee approval
Human experiments were conducted under Human Research Ethics Committee approval from the South East Health Research Ethics Committee – Eastern Division (approval no 99/261), the St Vincent’s Hospital Research Ethics Committee (approval no H01/076) and the Children’s Hospital at Westmead Research Ethics Committee (approval no 2003/049).
2.2.2 Ascertainment of subjects
2.2.2.1 Children
Children were recruited from Sydney Children’s Hospital (SCH) and the Children’s Hospital at Westmead (CHW). SCH subjects were recruited retrospectively by searching the records of the cardiology department for individuals with a diagnosis of ASD, and excluding those who had other significant cardiac pathology. There was no selection on the basis of family history, extracardiac pathology or the presence of abnormalities of cardiac conduction. CHW patients were recruited prospectively by a cardiac surgeon (Dr David Winlaw) by approaching patients with congenital heart disease (CHD) during outpatient clinics. Again, there was no selection of subjects other than on the basis of confirmed CHD.
2.2.2.2 Adults
Adult subjects were recruited prospectively from St Vincent’s Hospital and St Vincent’s Private Hospital. Most were recruited at the time that they were having trans-oesophageal echocardiography (TOE). A number of different cardiologists were involved, but Prof Michael Feneley was the main driving force behind this recruitment effort.
2.2.2.3 Numbers of subjects studied for mutations in NKX2-5 and GATA4
In this study, a total of 146 individuals were screened for mutations in NKX2-5. Subjects were unselected for familial disease and the aim was to test the significance of NKX2-5 mutations in common CHD and their prevalence among familial cases, focusing on ASD and PFO. Recruitment at this stage of the study came from St Vincent’s Hospital and Sydney Children’s Hospital. Additionally, following the identification of one subject with HLHS and a mutation in NKX2-5, a group of 18 children with HLHS recruited via the Children’s Hospital-San Diego and US HLHS support group was included. Subjects with PFO were recruited at the time of investigation for cryptogenic stroke, and thus represent a group likely to have relatively severe forms of PFO.
A total of 129 subjects with ASD, 109 with other types of CHD, 59 with PFO ascertained during investigation of cryptogenic stroke, and 29 with PFO ascertained during investigation for reasons other than stroke had GATA4 sequencing performed, including the exons and intron/exon boundaries.
2.2.2.4 Follow-up of family members
Whenever a possible or definite mutation was identified in a family, as many first degree relatives as possible were recruited, and depending on the results of testing of those individuals, more distant relatives were also sometimes recruited.
2.2.3 History
For all children recruited at Sydney Children’s Hospital, and for all subjects in whom a possible or definite mutation was identified, plus the members of their families, a detailed clinical history was taken. This included details of cardiac malformations, other malformations, pregnancy history, growth, developmental progress and any prior investigations. Permission was obtained to review hospital medical records when possible. For adult subjects recruited at St Vincent’s Hospital or St Vincent’s Private Hospital, a more limited history was taken by cardiology staff which included cardiac history (malformations, arrhythmias, surgical or other procedures), and family history. Ethnicity was not recorded initially but part way through recruitment this began to be recorded and previously-recruited subjects were re-contacted to obtain this information. Age and sex were recorded for all subjects.
2.2.4 Examination
All children recruited at Sydney Children’s Hospital had a full clinical examination by EK and/or Dr Fiona McKenzie. This included attention to presence of dysmorphic features, examination of limbs including palmar and plantar creases, palate, genitalia, skin, nails, teeth and hair. Examination of the hands included close inspection for radial ray anomalies, with the thenar eminences, thumbs and nails being closely examined, and comparison being made between the two sides for evidence of asymmetry. A record was kept of all examination findings.
Adults generally had a more limited clinical examination. All probands were examined by a cardiologist. Relatives of probands were examined by a clinical geneticist (the author) but it was often not appropriate to do as thorough an examination as was possible for children. This was in part because many of these examinations were carried out during the course of home visits without a chaperone present. In some cases examinations were conducted in unusual circumstances; several members of one family were examined (and had blood taken) in the food storage and preparation area of the family’s grilled chicken shop. Another woman had history taken, limited clinical examination done and blood taken, all in the lobby of a large hotel, where she was attending a conference. The reason for this was that she was seen during a short trip to Brisbane and there was a limited window of opportunity for meeting her. Two other members of the same family were examined in a back room of the family business, a carpet warehouse. Notwithstanding this, the minimum examination for such patients included examination of upper limbs (with attention to the radial ray as described above) and a dysmorphological assessment of the face, head and neck, plus inspection of palate, exposed skin, teeth, nails and hair.
Trips were made to Wollongong, Nowra, Canberra, Newcastle, Grafton, Lismore, the Gold Coast and Brisbane for the purpose of recruiting subjects. Numerous home visits were also done within the Sydney metropolitan area. Despite this, a small number of subjects could not be examined because they lived in remote areas. For these individuals, a history was taken by telephone and medical records were obtained.
2.2.5 Investigations
All subjects had at least electrocardiography (ECG) and transthoracic echocardiography. Most of the adult probands had TOE as well. These investigations were interpreted wherever possible by one of the collaborating cardiologists: Drs Owen Jones and Robert Justo and the cardiologists at Children’s Hospital at Westmead, especially A/Prof Gary Sholler (paediatric patients) and Prof Michael Feneley (adult patients). However, in some instances access was gained to the records of assessments done by other cardiologists. Dr Michael Tsicalas assessed several members of the family described in Chapter 4. Other investigations were arranged from time to time, if clinically indicated. Two subjects had a standard blood karyotype done using routine methods in service laboratories, and one had FISH for 22q11 deletion.
2.3 Molecular genetics methods
2.3.1 Extraction of DNA from human blood and mouse spleens
Genomic DNA was extracted from human blood and mouse spleens using modified salt precipitation protocols derived from the method of Miller et al. (Miller et al., 1988).
2.3.1.1 DNA extraction from mouse spleens
All tubes used at each stage of this procedure were pre-labelled with the unique identification number assigned to the mouse from which the spleen had been taken.
1. Preparation of spleen tissue
Following sampling (described above), the spleens were stored in liquid nitrogen until the time of DNA extraction. On removal from the storage tube, the spleen was placed on a dissection platform lined with clean foil. A segment of partially-thawed spleen weighing approximately 30mg (corresponding to a piece roughly 3mm x 3mm x 3mm in size) was cut off using a new surgical blade. The remainder of the spleen was re-frozen.
2. Differential red cell lysis
The 30mg sample of spleen was transferred into a 1.5mL Eppendorf tube and soaked in 600µL of NH4Cl lysis buffer (NH4Cl 160mM, KHCO3 10mM, di-sodium EDTA 0.4mM) for 24 hours at room temperature.
3. Nucleated cell lysis
The supernatant (which contained the red cell lysate) was removed. The remaining splenic tissue was then homogenized using a DNase-free micropestle (SST). 600µL of TNES (50mM Tris-HCl pH 7.5, 400mM NaCl, 20mM EDTA, 0.5% SDS) was added, with 30µL of proteinase K (20mg/mL). The sample was then briefly vortexed and placed in a 50°C water bath for overnight incubation.
4. Protein precipitation
200µL of 5M NaCl solution was added to the Eppendorf tube and the sample was mixed by vortexing, and then centrifuged at 13 000 rpm for 8 minutes.
5. DNA precipitation
The supernatant (which should at this point be clear) was transferred to a second 1.5mL Eppendorf tube, with care being taken to avoid transferring any precipitate. 600µL of 100% ethanol was added to the supernatant and the tube was inverted gently several times. At this point condensed DNA strands would be visible within the liquid in the tube.
6. Removal of excess salt
200µL of 70% ethanol was placed in a third Eppendorf tube. A clean yellow disposable pipette tip was used to transfer the DNA from the previous tube into the 70% ethanol. The tube which now contained the DNA was then centrifuged at 13 000 rpm for 3 minutes in a benchtop centrifuge, leaving a DNA pellet firmly attached to the bottom of the tube. The ethanol was gently tipped out of the tube and the sample left to air-dry for 30-60 minutes.
7. Dissolving and storage of DNA
300µL of 1x TE buffer was then added to the DNA pellet, and the DNA was left to stand at room temperature for 1-2 weeks.
The concentration of DNA in the sample was then determined by spectrophotometry.
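The conversion from absorbance to concentration is not spelled out in the text. As a minimal sketch, assuming the widely used convention that an A260 of 1.0 over a 1cm path corresponds to about 50ng/µL of double-stranded DNA, the calculation and the subsequent dilution to a working stock (C1V1 = C2V2) might look as follows. The function names and the example readings are hypothetical and are not taken from the thesis.

```python
# Illustrative sketch only: the A260 = 1.0 -> 50 ng/uL dsDNA convention is an
# assumption here, and the function names are hypothetical.

def dsdna_conc_ng_per_ul(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration (ng/uL) from absorbance at 260 nm."""
    return a260 * 50.0 * dilution_factor / path_cm


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical reading: A260 of 0.25 measured on a 1-in-20 dilution.
    conc = dsdna_conc_ng_per_ul(0.25, dilution_factor=20)
    print(conc, dilution_volumes(conc, 100.0, 50.0))
```

The second function covers the kind of dilution to a 100ng/µL working stock described in section 2.3.4.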
2.3.1.2 DNA extraction from blood
1. Sample collection, transport and storage
Peripheral blood samples were collected by venesection using standard techniques, and placed in tubes containing EDTA as an anticoagulant. Sample collection from children was facilitated by the use of Emla cream (lignocaine 25mg/g and prilocaine 25mg/g, Astrazeneca) as a local anaesthetic agent, applied 60 minutes before venesection. In some instances local pathology services collected blood from subjects resident in remote areas. However, wherever possible, arrangements were made to visit areas where subjects lived in order to examine subjects and collect blood, as described in section 2.2 above. Samples were transported at room temperature.
2. Sample labelling
The initial sample tube was labelled with at least two identifying pieces of information, usually study identification number, name and date of birth. Intermediate tubes used during the process of DNA extraction were labelled with the identification number only. The final tube in which the extracted DNA was stored was labelled with the identification number, date of birth of the subject, date of extraction and concentration of DNA as determined by spectrophotometry.
3. Red cell lysis
A biological safety cabinet class II was used for handling blood samples. The blood was transferred from the EDTA-containing tube into a 50mL screw-cap centrifuge tube. For every 10mL of whole blood, 40mL of NH4Cl lysis buffer was added (NH4Cl 160mM, KHCO3 10mM, di-sodium EDTA 0.4mM). The sample and lysis buffer were mixed well by inversion and allowed to stand for 30-120 minutes at room temperature, being inverted once more during this time to ensure thorough mixing. The sample was then centrifuged for 10min at 3500rpm. The supernatant was discarded, leaving a white cell pellet in the bottom of the tube.
5mL of NH4Cl lysis buffer was then added and the pellet was resuspended by vigorous shaking or vortexing. A further 30mL of NH4Cl lysis buffer was added and the tube shaken hard to ensure resuspension of the pellet. The sample was again centrifuged for 10min at 3500rpm, and the supernatant discarded.
Optional: at this stage, if required, the sample could be stored at -20°C for future completion of DNA extraction.
4. Nucleated cell lysis
The white cell pellet was resuspended in 1-2mL (1mL if the pellet was small) of TE lysis buffer (Tris-HCl 200mM, di-sodium EDTA 5mM). 30µL of Proteinase K (10mg/mL) was added and the sample mixed with gentle shaking. 100µL of 10% SDS solution was added and mixed by gentle shaking. The sample was then incubated overnight in a water bath at 50°C.
5. Protein precipitation
The next day, the sample was taken from the water bath and cooled to 4°C in a refrigerator. 700µL of ammonium acetate (5M) was added to the sample and the sample was shaken vigorously for approximately 20 seconds. It was then centrifuged for 10min at 5000rpm.
6. DNA precipitation
The supernatant was transferred to a clean 20mL screwcap container and two volumes (4mL) of 95% ethanol added. Gentle mixing caused strands of DNA to condense and become visible. The DNA was then removed using a 1µL disposable microbiological loop and washed with 70% ethanol while still on the loop to remove excess salt. The DNA was then transferred to a 1.5mL screwcap tube and air dried for 1-5 minutes. Sterile TE (100µL per 5mL of blood in the original sample) was then added and the sample allowed to stand overnight to allow the DNA to dissolve.
The DNA concentration was then measured by spectrophotometry, and the sample stored at -20°C.
2.3.4 Polymerase chain reaction (PCR) and sequencing of NKX2-5 and GATA4
2.3.4.1 PCR of NKX2-5
Initially the two exons of NKX2-5 were amplified in separate PCR reactions. Results for exon 1 were generally good but amplification and sequencing of exon 2 were problematic, possibly due in part to the high GC content of this exon. Subsequently the Expand PCR system (Roche) was used to amplify both exons and the intron in a single amplicon. Genomic DNA was diluted to a working concentration of 100ng/µL and this stock was used in PCR.
Reaction mixture (per sample):
Template DNA                                      1µL
Expand polymerase                                 0.75µL
10x Expand buffer                                 5µL
10mM dNTPs                                        2.5µL
Oligonucleotide PF1 (concentration 50ng/µL)       2.5µL
Oligonucleotide PR1 (concentration 50ng/µL)       2.5µL
dH2O                                              35.75µL
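When many samples are amplified together, per-sample volumes such as those listed above are usually scaled into a master mix to which template is added separately. The following is a minimal sketch of that arithmetic. The per-sample volumes are those given for the NKX2-5 reaction, but the 10% pipetting overage and the function names are assumptions for illustration and are not described in the thesis.

```python
# Illustrative sketch only: the overage parameter and function names are
# hypothetical; the per-sample volumes (uL) are those listed in the text.

NKX2_5_MIX_UL = {
    "Expand polymerase": 0.75,
    "10x Expand buffer": 5.0,
    "10mM dNTPs": 2.5,
    "Oligonucleotide PF1": 2.5,
    "Oligonucleotide PR1": 2.5,
    "dH2O": 35.75,
}
TEMPLATE_UL = 1.0  # template DNA is added to each tube individually


def reaction_volume(per_sample_ul, template_ul):
    """Total volume (uL) of a single reaction, including template."""
    return sum(per_sample_ul.values()) + template_ul


def master_mix(per_sample_ul, n_samples, overage=0.1):
    """Scale each reagent for n_samples plus a fractional pipetting overage."""
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_sample_ul.items()}


if __name__ == "__main__":
    print(reaction_volume(NKX2_5_MIX_UL, TEMPLATE_UL))
    for reagent, volume in master_mix(NKX2_5_MIX_UL, 24).items():
        print(f"{reagent}: {volume} uL")
```

Summing the listed components gives a 50µL reaction, which is a useful check that no reagent has been dropped from the list.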
The following cycling conditions were used:
95°C for 1min 30sec
94°C for 30 sec, 68°C for 2 min (10 cycles)
94°C for 20 sec, 68°C for 2 min +5 sec/cycle (24 cycles)
72°C for 5 min (1 cycle)
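The "+5 sec/cycle" notation means that the extension step lengthens with each successive cycle, so the run time is not simply the number of cycles multiplied by the step lengths. A minimal sketch of the total programmed hold time follows, assuming that cycle k of the 24-cycle stage adds 5 x k seconds to the 2 minute extension and ignoring ramp times between temperatures. Both points are assumptions, since the exact behaviour depends on the thermocycler, and the function names are hypothetical.

```python
# Illustrative sketch only: assumes cycle k of an incrementing stage adds
# increment_s * k seconds, and ignores temperature ramp times.

def stage_seconds(step_seconds, cycles, increment_s=0):
    """Total hold time (s) for a stage repeated `cycles` times."""
    base = sum(step_seconds) * cycles
    # Sum of increment_s * k for k = 1..cycles.
    extra = increment_s * cycles * (cycles + 1) // 2
    return base + extra


def nkx2_5_program_seconds():
    """Hold time for the NKX2-5 cycling program as listed in the text."""
    return (
        stage_seconds([90], 1)                          # 95C, 1 min 30 sec
        + stage_seconds([30, 120], 10)                  # 10 cycles
        + stage_seconds([20, 120], 24, increment_s=5)   # 24 cycles, +5 sec/cycle
        + stage_seconds([300], 1)                       # 72C, 5 min
    )


if __name__ == "__main__":
    total = nkx2_5_program_seconds()
    print(total, "seconds;", total / 60, "minutes")
```

Under these assumptions the incremental extension contributes 1500 seconds, about a quarter of the total hold time.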
DNA purification
The PCR products were purified using the Qiaquick PCR purification kit (Qiagen) according to the manufacturer’s instructions.
Sequencing of NKX2-5: second method
Sequencing was done using the BigDye sequencing mix (ABI). Oligonucleotide concentration was 25ng/µL.
Reaction mixture (per sample):
PCR product                                       6µL
Oligonucleotide (S1R, S1F, S2R, or S2F)           1µL
CSA buffer                                        3µL
BigDye sequencing mix                             4µL
dH2O                                              6µL
CSA buffer: 200mM Tris pH9, 5mM MgCl2
Cycling conditions:
96°C for 2 minutes
96°C for 10 sec, 50°C for 10 sec, 60°C for 4 min (25 cycles)
2.3.4.2 PCR of Exon 5 of GATA4
The AmpliTaq Gold PCR system (ABI) was used to amplify exon 5 of GATA4. Genomic DNA was diluted to a working concentration of 100ng/µL and this stock was used in PCR.
Reaction mixture (per sample):
Template DNA                                      0.7µL
AmpliTaq polymerase                               0.1µL
10x buffer (Buffer II)                            1.5µL
5mM dNTPs                                         0.6µL
Oligonucleotide G1 (concentration 50ng/µL)        0.5µL
Oligonucleotide G2 (concentration 50ng/µL)        0.5µL
MgCl2                                             0.8µL
dH2O                                              10.5µL
The following cycling conditions were used:
94°C for 11min (1 cycle)
94°C for 10 sec, 60°C for 45 sec, 72°C for 1 min (35 cycles)
72°C for 5 min (1 cycle)
DNA purification
The PCR products were purified using the Qiaquick PCR purification kit (Qiagen) according to the manufacturer’s instructions.
Sequencing of GATA4
Sequencing was done using the BigDye sequencing mix (ABI). Oligonucleotide concentration was 25ng/µL.
Reaction mixture (per sample):
PCR product                                       5µL
Oligonucleotide (G1)                              1µL
CSA buffer                                        2µL
BigDye sequencing mix                             2µL
CSA buffer: 200mM Tris pH9, 5mM MgCl2
Cycling conditions:
96°C for 2 minutes (1 cycle)
96°C for 20 sec, 50°C for 10 sec, 60°C for 4 min (25 cycles)
2.3.5 Sequence analysis
For both NKX2-5 and GATA4, the products of the sequencing reactions were analysed using an ABI 3700 sequencer at the sequencing facility of the University of New South Wales. Lasergene DNAstar (Wisconsin) software was used to read the resulting sequence chromatograms.
Table 2.2: Primers used for PCR and sequencing
Gene      Primer    Sequence (5’=>3’)
NKX2-5    PF1       gcaccatgcagggaagctgcc
          PR1       tcattgcacgctgcataatcgcc
          S1F       tgagactggcgctgccacc
          S1R       ctttcttttcggctctagggtcc
          S2F       agctggagcggcgcttcaag
          S2R       tggccggctgcgctggggaac
GATA4     1F        atcgttgttgccgtcgttttctct
          1R        gccctcgcgcgctcctactcacc
          2F        gagagctgggcataaacaaagaat
          2R        ccccgatgcacaccctcaag
          3F        acgcgaggtggaagggcagtg
          3R        caaaggaagaagacaagggaggac
          4F        cttctcgcagcaggtgtg
          4R        tgaaaggccagggatgtc
          5F        tgtccccggcaaatgtagataaag
          5R        cagtcggcctccccacaaacagc
          6F        tgggcctcatcgtgtgctttctgc
          6R        tccaacacccgcttcccctaacca
TBX20     1F        acccttttccctgaacctgt
          1R        tcatggcttgagcatcagac
          2F        catttggttatgctgttctttcc
          2R        ctacccagggagtgtcctg
          3F        gagtcagaccctttccctcc
          3R        aggcttggaatgctctcttg
          4F        cccacttatatatggtttatgtgttcc
          4R        agatagaaggtgggaagggg
          5F        cactgtaatttggcctgtttagc
          5R        aatataagaacctcctaaatccttctc
          6F        ttccacccttctcaggacac
          6R        aggcctgcctgatgtctct
NKX2-5 primers PF1 and PR1 were used only for PCR. Primers S1F, S1R, S2F, S2R were used only for sequencing. For GATA4, all primers were used for PCR and primers 1F, 2F, 3F, 4F, 5F and 6R were also used for sequencing. For TBX20, all primers were used for PCR and 1F, 2R, 3F, 4R, 5R and 6R were used for sequencing.
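Section 2.3.4.1 attributes some of the difficulty with NKX2-5 exon 2 to high GC content. As a minimal sketch, the GC fraction of any primer in Table 2.2 and its reverse complement (useful when checking a reverse primer against the forward strand) can be computed as below. The helper names are hypothetical, and no such calculation is reported in the thesis.

```python
# Illustrative sketch only: helper names are hypothetical. The two sequences
# are the NKX2-5 PCR primers as listed in Table 2.2.

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

NKX2_5_PCR_PRIMERS = {
    "PF1": "gcaccatgcagggaagctgcc",
    "PR1": "tcattgcacgctgcataatcgcc",
}


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    s = seq.lower()
    return (s.count("g") + s.count("c")) / len(s)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in NKX2_5_PCR_PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```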
2.4 Microsatellite analysis
All microsatellite analyses were performed by a commercial service, the Australian Genome Research Facility (AGRF), in Melbourne, Australia.
2.5 Marker selection
2.5.1 Human Markers
A standard set of 382 autosomal markers used by the AGRF (selected from the Genethon linkage map) was used in human mapping studies. An additional 5 markers were used in a second round of (fine) mapping, to investigate an area of interest on chromosome 1. The average intermarker distance in the initial genome screen was 9.0cM. The markers used are shown in Chapter 4, tables 4.1a –4.22.
2.5.2 Mouse Markers
Candidate microsatellites were selected from the Whitehead Institute database and from local resources, and their informativeness confirmed by screening with DNA from the parental mouse strains. This was done by gel electrophoresis at the AGRF (as detailed below) and by polyacrylamide gel electrophoresis and autoradiography at the Victor Chang Cardiac Research Institute, with this being done largely by Dr Changbaig Hyun. Eighty-nine markers were selected to span the mouse genome, yielding an average intermarker distance of ~17cM. The markers used are shown in Chapter 6 (Table 6.1).
2.6 Laboratory methods used at AGRF
Microsatellites were amplified under standard conditions. Each PCR reaction was carried out in a total volume of 6µL. Reactions were amplified using a PTC-225 DNA Engine Tetrad (MJ Research) and PCR products pooled to run more than one marker per lane. Primers were fluorescently labelled with the fluorescent dyes FAM, HEX and TET. Gel electrophoresis was done using 0.2mm denaturing 4.5% polyacrylamide gels. PCR product (1µL) was electrophoresed for 2.8 hours on an Applied Biosystems 377 DNA Sequencer, with the size standard TAMRA 500 (Red) applied to each gel. Genescan software Version 3.1.2 (AB) was used to assign tracking for each sample lane. Files were then imported into Genotyper Version 2.1 (AB) software in order to interpret the electropherogram and assign genotypes.
2.7 Error checking
All genotypes were machine scored and then manually checked by AGRF staff. The majority of the markers used performed well and it was possible to use the genotype information directly as provided by the AGRF, following this process. However, the data quality was checked, as described below.
2.7.1 Error Checking of Human Data
In the analysis of the human data the main assurance of data quality was confirmation of Mendelian segregation of markers within the family. There were only 8 genotypes with apparent non-Mendelian segregation. These were re-examined and four were found to be due to incorrect reading of the original trace, two were probable new mutations and two were probable null alleles. The data were adjusted appropriately before analysis.
2.7.2 Error Checking of Mouse Data
It was possible to examine the data quality for each mouse marker by comparing the observed ratios of the three possible genotypes (homozygous for the QSi5 allele, heterozygous, homozygous for the 129T2/SvEms allele) with the expected ratio of 1:2:1. A Chi-square test was performed to compare the observed with expected ratios. On review of the data, 9 markers gave results that differed highly significantly from this ratio and were therefore not consistent with Mendelian segregation of alleles: D1Mit26, D3Mit41, D16Mit38, D2Mit83, D2Mit265, D5Mit125, D7Mit67, D11Mit62, and D14Mit125. On review of results for these markers it emerged that 96 of the genotypes for D1Mit26 had been incorrectly copied from proprietary software to Microsoft Excel; correction of these genotypes resulted in a return to Mendelian ratios.
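The segregation check described above is a goodness-of-fit test of three genotype counts against a 1:2:1 expectation, with 2 degrees of freedom. The thesis analyses were run in Minitab; the following is only an illustrative sketch of the same calculation. The function name and the example counts are hypothetical, and the p-value uses the closed form exp(-x/2), which is exact for a chi-square distribution with 2 degrees of freedom.

```python
# Illustrative sketch only: not the software used in the thesis. The function
# name and the example counts are hypothetical.
import math


def chi_square_1_2_1(n_hom_a, n_het, n_hom_b):
    """Chi-square goodness-of-fit of genotype counts to a 1:2:1 ratio.

    Returns (statistic, p_value) with 2 degrees of freedom.
    """
    observed = (n_hom_a, n_het, n_hom_b)
    total = sum(observed)
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 2 df.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts for one marker typed in 100 F2 mice.
    print(chi_square_1_2_1(40, 40, 20))
```

A marker whose counts match the expectation exactly gives a statistic of 0 and a p-value of 1, while strongly skewed counts give a large statistic and a small p-value.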
In addition to the manual checking of the machine calls described above, all genotypes for D16Mit38 and a selection (~10% of genotypes) for each of the remaining 8 markers were manually checked in Sydney (by the author). It was found that D16Mit38 performed well and that calling by AGRF of the genotypes had been accurate. The deviation from Mendelian patterns in this case was less marked than in most of the other suspect markers and it appears that in this instance the skewed distribution of alleles occurred by chance. It is unlikely to have been caused by overrepresentation of a phenotype-associated allele, as both extremes of the phenotype distributions were selected for genotyping – thus any such effect should have been balanced out.
The remaining markers, D3Mit41, D2Mit83, D2Mit265, D5Mit125, D7Mit67, D11Mit62, and D14Mit125 all proved to have performed poorly at the genotyping stage despite having appeared suitable during marker evaluation. In each case one or both alleles amplified inconsistently, leading to a high number of miscalls which could not be corrected with repeated genotyping.
Replacement markers were therefore selected for D3Mit41, D2Mit83, D2Mit265, D5Mit125, D7Mit67, D11Mit62, and D14Mit125. The replacement markers were D3Mit147, D2Mit235, D2Mit517, D5Mit71, D7Mit74, D11Mit71, and D14Mit7, respectively.
2.8 Statistical methods
2.8.1 Basic statistical analyses
All basic statistical analyses including calculations of mean values, standard deviations, correlation coefficients, chi square calculations, t-tests and analysis of variance (ANOVA) using the general linear model (GLM) were performed using Minitab V14.1 (Minitab, Inc).
2.9 Linkage analysis
2.9.1 Linkage analysis for autosomal dominant trait
Linkage analysis in the family with dominant ASD and Marcus Gunn Phenomenon was done using the MLINK and LINKMAP programs within the LINKAGE software package (Lathrop and Lalouel, 1988). Allele frequency data generated in Australian subjects were used. These were generously provided by the CRC for Discovery of Genes for Common Diseases.
2.9.2 QTL analysis
2.9.2.1 Selective Genotyping
As discussed in section 1.4.2.3, selective genotyping of animals at the extremes of the phenotypic distribution has advantages in terms of cost and study power (Lander and Botstein, 1989). In this study, therefore, the top and bottom deciles for each of FVL and FOW (corrected for sex and week of gestation) were genotyped, a total of 466 mice, selected from the 1328 F2 mice with complete records.
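The selection step can be pictured as taking, for each trait, the animals in the lowest and highest tenth of the distribution and then pooling the two sets, so that an animal extreme for both traits is counted once. The sketch below assumes that the trait values have already been corrected for covariates. The function names, the tie handling (simple sort order) and the rounding of the decile size are assumptions for illustration and may differ from the procedure actually used.

```python
# Illustrative sketch only: function names, tie handling and rounding of the
# decile size are assumptions. Inputs are taken to be covariate-corrected.

def decile_extremes(values, fraction=0.10):
    """Indices of animals in the bottom and top `fraction` of `values`."""
    n = len(values)
    k = int(n * fraction)
    order = sorted(range(n), key=lambda i: values[i])
    return set(order[:k]) | set(order[n - k:])


def select_for_genotyping(trait_a, trait_b, fraction=0.10):
    """Union of the extreme animals for two traits measured on the same mice."""
    return decile_extremes(trait_a, fraction) | decile_extremes(trait_b, fraction)


if __name__ == "__main__":
    # Hypothetical corrected trait values for 20 animals.
    fvl = list(range(20))
    fow = [(i * 7) % 20 for i in range(20)]
    print(sorted(select_for_genotyping(fvl, fow)))
```

Because the two sets are pooled, the number of animals selected is at most four times the decile size and is smaller whenever the extremes of the two traits overlap.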
The relationship between other traits for which data were recorded (age, sex, week of dissection, coat colour, body weight and heart weight) and the cardiac phenotypes of interest is discussed in detail in Chapter 5.
2.9.2.2 Linkage analyses
Linkage analyses were performed using the Mapmaker/QTL package (Lander and Botstein, 1989). This program performs interval mapping using a maximum likelihood estimation algorithm, as discussed in section 1.4.2.4. It is particularly advantageous for analysis of a selectively genotyped population, because provided that the phenotypic data for all individuals are included (with the genotypes for individuals not in the extremes entered as “missing”), the program can accurately estimate the effect sizes of QTL. This is in contrast to most other available QTL mapping programs. Important consequences would arise from the use of a program not designed with selective genotyping in mind. The main issue is that, as all the phenotype data are from individuals with extreme phenotypes, effect sizes will be overestimated. This was seen in practice when the linkage analyses were performed in parallel with the Mapmanager QTXb13 software (Manly and Olson, 1999), which uses a regression-based algorithm but cannot handle missing genotype data. Although this program confirmed the location of the QTL detected using Mapmaker/QTL, the effect sizes were overestimated as predicted.
2.9.2.3 Binary trait analysis
As discussed in section 1.4.1.1, PFO can be viewed as essentially a binary trait (PFO is either present or absent). As such it is intrinsically less informative and more difficult to analyze from a quantitative genetic perspective than a continuously distributed trait. However, the large amount of data available in this study made it practicable to conduct such an analysis. This was done by Dr Peter Thomson, using an approach (detailed below) and software developed by him.
For binary analysis of PFO, the model took the form
\[
\log\left(\frac{p_i}{1-p_i}\right) = \mu + \mathrm{sex}_i + a\,x_i^{(QQ)} + d\,x_i^{(Qq)} - a\,x_i^{(qq)}
\]
or equivalently,
\[
p_i = \frac{1}{1+\exp\left[-\left(\mu + \mathrm{sex}_i + a\,x_i^{(QQ)} + d\,x_i^{(Qq)} - a\,x_i^{(qq)}\right)\right]}
\]
where $p_i$ is the probability that an animal has PFO, $\mathrm{sex}_i$ is a 0 or 1 indicator variable for sex (male = 1, female = 0), and $x_i^{(QQ)}$, $x_i^{(Qq)}$ and $x_i^{(qq)}$ are unobserved 0 or 1 indicator variables indexing the QTL genotype (QQ, Qq, or qq) with $x_i^{(QQ)} + x_i^{(Qq)} + x_i^{(qq)} = 1$. Note that the Q allele refers to the 129T2/SvEms line, while the q allele refers to the QSi5 line. The parameters a and d refer to additive and dominance effects of the 129T2/SvEms allele, on the logit scale.
Because the QTL genotype indicator variables are unobserved, the model is fitted as a three-component mixture, with mixing probabilities $\pi_i^{(QQ)} = P(x_i^{(QQ)} = 1 \mid m_i)$, $\pi_i^{(Qq)} = P(x_i^{(Qq)} = 1 \mid m_i)$, and $\pi_i^{(qq)} = P(x_i^{(qq)} = 1 \mid m_i)$, with $\pi_i^{(QQ)} + \pi_i^{(Qq)} + \pi_i^{(qq)} = 1$, where these are the conditional probabilities of the QTL genotype, given the flanking marker genotypes, $m_i$. These are calculated in a standard way for an inbred F2 design.
As a protection against spurious results, QTL “peaks” with LOD > 2 were checked by selecting the nearest marker and performing a standard (non-mixture) logistic regression,
\[
\log\left(\frac{p_i}{1-p_i}\right) = \mu + \mathrm{sex}_i + a\,x_i^{(MM)} + d\,x_i^{(Mm)} - a\,x_i^{(mm)},
\]
where the $x_i^{(MM)}$, $x_i^{(Mm)}$ and $x_i^{(mm)}$ are the (observed) 0-1 indicator variables for the marker genotypes.
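To make the structure of the mixture model concrete, the sketch below evaluates the probability of PFO for a known QTL genotype on the logit scale, and then the likelihood contribution of one animal when the genotype is unobserved and only the three mixing probabilities are available. It is not Dr Thomson's software. The function names are hypothetical, the explicit coefficient on the sex indicator (`sex_effect`) is an assumption added for illustration, and the mixing probabilities are simply taken as inputs rather than derived from flanking markers.

```python
# Illustrative sketch only: not the software used in the thesis. Function
# names and the explicit `sex_effect` coefficient are assumptions.
import math


def pfo_probability(mu, sex_effect, a, d, sex, genotype):
    """P(PFO) for a known QTL genotype: 'QQ' (+a), 'Qq' (+d) or 'qq' (-a)."""
    genetic = {"QQ": a, "Qq": d, "qq": -a}[genotype]
    eta = mu + sex_effect * sex + genetic  # linear predictor on the logit scale
    return 1.0 / (1.0 + math.exp(-eta))


def mixture_likelihood(has_pfo, sex, mixing, mu, sex_effect, a, d):
    """Likelihood for one animal, summed over the unobserved QTL genotype.

    `mixing` maps 'QQ', 'Qq' and 'qq' to conditional genotype probabilities
    (given flanking markers), which should sum to 1.
    """
    total = 0.0
    for genotype, weight in mixing.items():
        p = pfo_probability(mu, sex_effect, a, d, sex, genotype)
        total += weight * (p if has_pfo else 1.0 - p)
    return total


if __name__ == "__main__":
    # Hypothetical values: an uninformative F2 prior of 1/4, 1/2, 1/4.
    prior = {"QQ": 0.25, "Qq": 0.5, "qq": 0.25}
    print(mixture_likelihood(True, 1, prior, mu=-1.0, sex_effect=0.2, a=0.8, d=0.1))
```

In a full analysis the log of this quantity would be summed over animals and maximised over the parameters at each tested position; the nearest-marker check replaces the mixture with a single observed genotype, which corresponds to calling `pfo_probability` directly.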
3. The role of mutations in the cardiac transcription factors NKX2-5, GATA4 and TBX20 in causing CHD and cardiomyopathy
3.1 Introduction
It is now 10 years since the genetic basis of rare Mendelian forms of CHD began to be elucidated, with the identification of mutations in TBX5 in Holt-Oram syndrome (Basson et al., 1997). Subsequently, mutations were found in NKX2-5 (Schott et al., 1998), GATA4 (Garg et al., 2003), MYH6 (Ching et al., 2005) and ACTC (Matsson H et al., 2005) in families with nonsyndromal autosomal dominant ASD. The discovery of causes for rare forms of CHD naturally raises the question: could mutations in the same genes cause common CHD? Moreover, the fact that TBX5, NKX2-5 and GATA4 interact during early development, and MYH6 and ACTC are downstream targets of the other known ASD genes, raises the question: could other genes which interact with them also be associated with dominant ASD? This chapter represents a partial answer to these questions, with investigation of a large number of individuals with ASD and other forms of CHD to determine whether they harbour mutations in each of NKX2-5, GATA4 and TBX20. The latter was a candidate gene based on its known interactions with NKX2-5 and TBX5 (see section 1.5.3).
The chapter is divided into three sections, one for each of these three genes. Because the work has been progressing over a period spanning the years 2000-2007, the number of subjects studied for each gene was different.
3.2 Mutations in NKX2-5 cause autosomal dominant CHD and AV conduction block
Mutations in human NKX2-5 were first identified by Schott and colleagues in 1998 (Schott et al., 1998). They described four families, each with multiple individuals affected by CHD and/or AV conduction block, in which NKX2-5 mutations segregated with the cardiac phenotype. In this and subsequent reports, the AV block associated with NKX2-5 mutations has been noted to become progressively more severe with age (Schott et al., 1998; Gutierrez-Roelens I et al., 2002). Although ASD was the main form of CHD in all four families, there were a number of other malformations including VSD, tetralogy of Fallot, subvalvular aortic stenosis, pulmonary atresia and redundant mitral valve leaflets with fenestrations. This and subsequent reports have led to identification of a total of 35 germline mutations in NKX2-5. The mutations identified to date are summarized in Table 3.1.
The results of the study described here (reported in (Elliott et al., 2003)) are not included in the table. The intriguing findings of Dentice and colleagues (Dentice et al., 2006), are not included because the study was not of CHD. They noted expression of Nkx2-5 during thyroid development in the mouse. This led them to screen 241 individuals with thyroid dysgenesis for mutations in NKX2-5. Three missense mutations were found, two of which were not found in 561 healthy controls, and one of which was found once among the 561 controls. One of the affected individuals had mild mitral valve regurgitation, but otherwise they did not have CHD or conduction abnormalities. In each family, there was at least one heterozygote who was clinically normal. Thus, if these are pathogenic changes, the penetrance for thyroid disease associated with them must be low. Mutations in the related gene NKX2-1 have been identified in patients with congenital hypothyroidism (Devriendt et al., 1998). A homozygous mutation in NKX2-6 has also been found to cause common arterial trunk in a large consanguineous family (Heathcote et al., 2005).
Also not included in the table are the reports by Reamon-Buettner and colleagues of multiple somatic NKX2-5 mutations in complex CHD seen in formalin-fixed hearts, part of a museum collection (Reamon-Buettner and Borlak, 2004; Reamon-Buettner et al., 2004; Reamon-Buettner and Borlak, 2006). The authors argue that the lack of mutations in healthy tissue and the lack of mutations in other, non-cardiac specific genes in the same samples confirm that these are genuine mutations and not artefact. Nonetheless, testing of formalin-fixed tissues, stored for decades, is fraught with difficulty and these results must be viewed with caution until they are replicated by another group, preferably in fresh tissue obtained at surgery. The only attempt to do this to date has been inconclusive, with no somatic mutations found in 19 subjects undergoing surgery for bicuspid aortic valve (Majumdar et al., 2006).
This represents only weak evidence against a role for somatic NKX2-5 mutations in CHD, however, as bicuspid aortic valve is not one of the typical lesions associated with NKX2-5 mutations, although it is more frequent in mice heterozygous for an Nkx2-5 mutation (Biben et al., 2000).

Of the 35 NKX2-5 mutations summarized above, 21 are missense mutations, 6 are nonsense mutations, 6 are frameshift mutations, one is a splice site mutation and one is an in-frame deletion of a single amino acid. However, of the missense mutations, there is currently limited evidence for pathogenicity for R25C, E21Q, A219V, K15I, Q22P, A63V, A127E, R216C, and P275T. The single amino acid deletion N291 is also of uncertain significance. For all of these mutations, there is either demonstrated nonpenetrance in the setting of only a small number of affected individuals (in most cases only one), or no information is available about the mutation status of family members other than the proband, with no family history of CHD. It is striking that all of the other 12 missense mutations, including all missense mutations identified as the result of study of families with multiple individuals affected by CHD, are located within the homeodomain.
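The tally in the preceding paragraph can be checked with a short calculation. The sketch below is purely illustrative: the dictionary and variable names are assumptions introduced here, and the counts are simply those quoted in the text.

```python
# Counts of germline NKX2-5 mutations by class, as quoted in the text.
# The names used here are hypothetical and exist only for this illustration.
mutation_classes = {
    "missense": 21,
    "nonsense": 6,
    "frameshift": 6,
    "splice site": 1,
    "in-frame deletion": 1,
}

# Number of missense changes listed in the text as having limited
# evidence for pathogenicity.
uncertain_missense = 9

total_mutations = sum(mutation_classes.values())
remaining_missense = mutation_classes["missense"] - uncertain_missense

print(f"Total germline mutations: {total_mutations}")
print(f"Missense mutations not listed as uncertain: {remaining_missense}")
```

The two results correspond to the total of 35 mutations and the 12 remaining missense mutations referred to in the text.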
It remains possible that the missense mutations located outside the homeodomain are pathogenic. The identification of these sequence variants in a relatively large number of individuals with CHD, but not in controls, makes it likely that this is the case for at least some of them. The absence of family history does not exclude the possibility that other family members have unrecognised CHD. Some or all may be proven pathogenic on study of
Table 3.1 Mutations in NKX2-5

Mutation | T178M | T178M | T178M | Q170X | Q198X | Q198X | Q149X | R189G
Reference | Schott et al., 1998 | Schott et al., 1998 | Hirayama-Yamada et al., 2005 | Schott et al., 1998 | Schott et al., 1998 | Hosoda et al., 1999 | Benson et al., 1999 | Benson et al., 1999
# heterozygous | 7 | 5 | 2 | 4 | 3 | 1 | 6 | 5
ASD | 9 | 8 | 4 | 4 | 6 | 2 | 4 | 4
VSD | 2 | - | - | - | - | - | 3 | -
AV block | 10 | 8 | 6 | 3 | 3 | 2 | 5 | 5
Nonpenetrant | - | - | (2) | - | - | - | (1) | (1)
Other CHD (no) | TOF(2), SVAHS, PA | - | - | - | MVF | - | - | TV abn
Location | Homeobox | Homeobox | Homeobox | Homeobox | Homeobox | One AA 3’ to homeobox | Homeobox | Homeobox
Comment (two entries; column positions unclear) | Proband’s father died suddenly at 42, son died at 18 (pneumonia) | 3 with decreased LV function

Mutation | Y259X | Y191C | N188K | Int 1DSG+1T | R25C | R25C | R25C
Reference | Benson et al., 1999 | Benson et al., 1999 | Benson et al., 1999 | Benson et al., 1999 | Benson et al., 1999 | Goldmuntz et al., 2001 | Goldmuntz et al., 2001; McElhinney et al., 2003
# heterozygous | 7 | 1 | 5 | 1 | 1 | 2 | 1 x 7
ASD | 6 | 1 | 5 | - | - | - | -
VSD | 3 | 1 | - | - | - | 1 | -
AV block | 7 | 1 | 5 | 1 | - | - | -
Nonpenetrant | (1) | - | - | (1) | - | - | 2+
Other CHD (no) | - | - | Ebstein anomaly (3) | - | TOF | TOF | TOF(4), TA, IAA, HLHS
Location | 3’ coding region | Homeobox | Homeobox | Splice site | 5’ coding region | 5’ coding region | 5’ coding region
Comment | | de novo mutation in proband | 1 with decreased LV function | Proband ascertained because of AV block. Father died suddenly aged 29. | Other family members not genotyped | | 7 isolated cases listed here (“x7”). See discussion

Mutation | E21Q | A219V | C264X | 215-221del | 223-224del | R142C
Reference | Goldmuntz et al., 2001 | Goldmuntz et al., 2001; McElhinney et al., 2003 | Ikeda et al., 2002 | Watanabe et al., 2002 | Watanabe et al., 2002 | Gutierrez-Roelens I et al., 2002
# heterozygous | 3 | 2 | 1 | 5 | 4 | 13
ASD | - | - | 4 | 5* | 4 | 10
VSD | - | - | - | 1 | - | 3
AV block | - | - | 6 | 5 | 5 | 11
Nonpenetrant | 2 | 1 | (2) | 1 | (1) | (3)
Other CHD (no) | - | TOF | - | - | - | TOF, PS, PDA
Location | 5’ coding region | NK2 domain | 3’ coding region | Frameshift in exon 1 | Frameshift in exon 1 | Homeobox
Comment | Nonpenetrance, ? pathological significance | Nonpenetrance, ? pathological significance | | Nonpenetrant individual had AF aged 46 but no AV block and no CHD. One affected individual had visceral situs inversus with polysplenia but not dextrocardia | | 4 subjects had CHD but no AV block, but 3 were children at the time of study

Mutation | Q187H | K15I | Q22P | A63V | A127E | R216C | InsTCCCT701
Reference | Gutierrez-Roelens I et al., 2002 | McElhinney et al., 2003 | McElhinney et al., 2003 | McElhinney et al., 2003 | McElhinney et al., 2003 | McElhinney et al., 2003 | McElhinney et al., 2003
# heterozygous | 6 | 2 | 1 | 1 | 3 | 1 | 2
ASD | 6 | 1 | - | - | 1 | - | 1
VSD | - | - | - | - | - | - | -
AV block | 7 | - | - | - | - | - | 1
Nonpenetrant | (1) | 1 | ? | ? | 1 | ? | 1
Other CHD (no) | Anomalous systemic venous return (2) | - | TOF | L-TGA | BAV | TOF | -
Location | Homeobox | TN domain | 5’ coding region | 5’ coding region | 5’ coding region | NK2 domain | 3’ coding region
Comment | | Nonpenetrance, ? pathological significance | No family history of CHD, parents not genotyped | No family history of CHD, parents not genotyped | Nonpenetrance, one affected only has BAV, ? significance | No family history of CHD, parents not genotyped |

Mutation | P275T | N291 | A323T | 605-606del | W185L | 498insC | L171P
Reference | McElhinney et al., 2003 | McElhinney et al., 2003 | McElhinney et al., 2003 | Sarkozy A et al., 2004 | Sarkozy A et al., 2004 | Sarkozy A et al., 2004 | Kasahara and Benson, 2004
# heterozygous | 1 | 2 | 1 | 3 | 3 | 1 | 9
ASD | - | - | - | 7 | 4 | 1 | 7
VSD | - | - | - | 2 | 3 | - | 1
AV block | - | - | - | 7 | 2 | 1 | 9
Nonpenetrant | ? | 1 | ? | - | - | - | (2)
Other CHD (no) | Coarct | DORV | TOF | - | MVP, LV noncompaction | - | -
Location | 3’ coding region | 3’ coding region | 3’ coding region | Homeobox | Homeobox | 5’ coding region | Homeobox
Comment | No family history of CHD, parents not genotyped | Nonpenetrance, ? pathological significance | No family history of CHD, parents not genotyped | | | de novo mutation | 7 deceased individuals had CHD, type not confirmed

Mutation | R190H | c.262del | R190C | Y256X | Q160P
Reference | Kasahara and Benson, 2004 | Hirayama-Yamada et al., 2005 | Hirayama-Yamada et al., 2005 | Gutierrez-Roelens et al., 2006 | Rifai L et al., 2007
# heterozygous | 3 | 4 | 1 | 5 | 4
ASD | 3 | 3 | 2 | 2 | 4
VSD | 2 | - | - | - | -
AV block | 3 | 4 | 1 | 5 | 4
Nonpenetrant | - | (1) | ?2 | (2) | -
Other CHD (no) | - | - | - | MVP | -
Location | Homeobox | 5’ coding region, FS | Homeobox | 3’ coding region | Homeobox
Comment | Two deceased individuals had CHD, type not confirmed | Father of 4 affected children unavailable for study | Proband’s father died aged 63 of subarachnoid haemorrhage, cousin with ASD not genotyped/uncle not studied | |
Mutations are listed in order of publication, except that a second or subsequent report of a mutation is listed adjacent to the original report. Where multiple families with the same mutation have been reported, each is listed separately. # heterozygous: number proven heterozygous for the mutation. Number with ASD or other abnormalities may be greater than this if there are multiple affected individuals in the pedigree who could not be genotyped (e.g. deceased individuals). AV block: confirmed AV conduction block. Nonpenetrant: confirmed mutation positive OR obligate heterozygote, known normal cardiac status. Numbers in brackets refer to individuals with AV block but structurally normal heart. Other CHD: may overlap with ASD/VSD categories if >1 cardiac diagnosis; (no): number if more than one affected in pedigree. TOF, tetralogy of Fallot (individuals with TOF not listed separately as having VSD); SVAHS, supravalvular aortic stenosis; PA, pulmonary atresia; MVF, mitral valve fenestration; TV abn, tricuspid valve abnormality; LV, left ventricle (left ventricular); PS, pulmonary stenosis; TA, truncus arteriosus; IAA, interrupted aortic arch; L-TGA, L-transposition of the great arteries; BAV, bicuspid aortic valve; Coarct, coarctation of the aorta; DORV, double outlet right ventricle; MVP, mitral valve prolapse; AA, amino acid; FS, frameshift. *3 had sinus venosus ASD, 2 had type not specified.

additional families. Even if the family history is correct, some or all of these mutations may be pathogenic but with reduced penetrance. Alternatively, these may be rare polymorphisms (they are not common as they were not identified in 100 control chromosomes) or may be participating in multifactorial causation of CHD.
The recurrent mutation R25C is particularly noteworthy. The mutation is not in a recognised functional domain of the gene. It has mainly been reported in individuals with TOF. None of the affected individuals with this mutation have had ASD or conduction abnormalities. In most of the families in which it has been reported, studies of family members other than the proband have not been possible. There are three families in which other family members have been studied. In two, a healthy parent was heterozygous for the mutation. In the third, the father of the proband was heterozygous for the mutation and had VSD (Goldmuntz et al., 2001; McElhinney et al., 2003). Although the mutation was not identified in 50 healthy individuals not selected for racial background, 2/43 healthy African-American controls were heterozygous for the mutation. Of the 7 probands identified with the mutation, 5 were of African-American ancestry, one Hispanic and one Caucasian (McElhinney et al., 2003). In functional studies in vitro, NKX2-5 protein with the R25C mutation localised normally to the nucleus, bound to DNA normally at low concentrations but was subtly abnormal in formation of dimers at higher concentrations, and had similar effects on transcriptional activation to wild-type protein (Kasahara and Benson, 2004). The clinical data, including identification of the mutation in two healthy controls, suggest that this mutation is likely to have low penetrance. The in vitro data show only subtle differences between wild-type and mutant protein. On the other hand, the identification of a family with two affected individuals (one with TOF and one with VSD) with CHD segregating with the mutation represents evidence in favour of its pathogenicity. It is possible that this is not a benign polymorphism, but rather is a mutation which participates in multifactorial causation of CHD.
The E21Q mutation was identified in a proband with TOF and was also present in the proband’s mother and maternal grandmother, both of whom had normal hearts (McElhinney et al., 2003). Its status is therefore uncertain.
Studies of DNA binding in homeodomain NKX2-5 missense mutations indicate that DNA binding is impaired by these mutations (Kasahara and Benson, 2004). This, together with the nonsense and frameshift mutations, and the observation of CHD and AV conduction defects in children with deletions of the NKX2-5 locus (Baekvad-Hansen et al., 2006), suggests that the mechanism by which NKX2-5 mutations cause disease is haploinsufficiency. However, dominant negative inhibition of other transcription factors or transcription complexes may also play a role.
The penetrance associated with mutations other than those listed above as being of uncertain significance is very high. Penetrance for AV conduction disease is almost complete, with the exceptions being children at the time of assessment, implying a possibility that they will develop AV conduction defects later in life. This has important clinical implications, implying a need for long term cardiac follow-up for individuals found to be heterozygous for an NKX2-5 mutation. Penetrance for CHD, although lower than penetrance for AV conduction defects, is likewise high.
3.2.1 Subjects screened for NKX2-5 mutations
Subjects screened for mutations in NKX2-5 were as described in section 2.2.2.3. Patient characteristics are shown in Table 3.2.
Contributions by EK to this part of the study include participation in study design, patient recruitment (recruitment of all paediatric subjects, assisted by Dr Fiona MacKenzie, but also including review of medical records and reading of ECGs for many of the adult subjects), DNA extraction, sequencing of samples from paediatric subjects and some of the adult subjects (the majority of sequencing of adult subjects was performed by Dr David Elliott), clinical evaluation of family members, and data analysis.
Table 3.2 Patient characteristics (NKX2-5)

Characteristic | ASD (n=102) | PFO (n=25) | HLHS (n=19*)
Male/female | 37/65 | 12/13 | 11/8
Mean age in years (range) | 31.9 (birth to 80) | 48.7 (12-73) | 0.9 (birth to 4)
Positive family history† | 12 | 1 | 12
AV conduction block | 4 | 1 | 0
Other cardiac conditions:
Anomalous pulmonary venous drainage | 2 | 0 | 0
Tricuspid annular dilatation | 1 | 0 | 0
Double-outlet right ventricle | 0 | 0 | 2
Complete heterotaxy | 0 | 0 | 1
NKX2-5 mutation | 1 | 0 | 1

* Includes one unaffected individual who was an obligate carrier of a familial mutation
# At the time of recruitment into the study
† First or second-degree relative with congenital heart disease
Both members of family 1024

3.2.2 Results
The coding regions and intron-exon boundaries of NKX2-5 were sequenced in 102 subjects ascertained only because of their need for ASD repair and 25 with PFO diagnosed following cryptogenic stroke. Thirty-two were children (age range birth to 14 years, mean 3.9 years) and the remainder were adults (age range 18-80 years, mean 44.7 years). Within the cohort, five individuals (3.9%) also had first-degree heart block and 3 (2.9%) had some other form of CHD (Table 3.2). Thirteen patients (10%) had a family history of ASD in at least one first or second-degree relative, although no family members had evidence of AV conduction block. Clinical and laboratory methods were as described in sections 2.2 and 2.3.
Two individuals had sequence changes in NKX2-5 identified, both of which had been reported previously (Figure 3.1). These were 61G>C, leading to the amino acid change E21Q, and 532C>T, leading to the amino acid change T178M.
3.2.2.1 Family 1024: T178M
This mutation has been reported previously, in two of the families in the original report of NKX2-5 mutations in human disease (Schott et al., 1998) and in a more recent report by Hirayama-Yamada and colleagues (Hirayama-Yamada et al., 2005). The substitution of a threonine residue with methionine in the homeobox between the second and third helix probably changes the angle of the third helix, reducing contact to the major groove of DNA (Kasahara and Benson, 2004). It has been shown to markedly impair the ability of NKX2-5 protein to bind DNA (Kasahara et al., 2000; Kasahara and Benson, 2004).
In the family reported here, the proband was a girl aged 10 at the time of recruitment into the study. A cardiac murmur was noted on clinical screening six weeks after birth and the diagnosis of ASD was made. Surgical repair was done at age 4 and was complicated by pulmonary oedema; however, she made a good recovery and at age 10 was in good health. Clinical examination was normal other than the presence of the surgical scar. On review at age 13, transthoracic echocardiogram and ECG were normal. Her father (I:2 on the pedigree) had had an ASD diagnosed and surgically repaired at the age of 27; he also had mitral valve replacement for mitral regurgitation at the same time. Twelve months later he developed heart failure due to restrictive pericarditis, which was managed surgically. At the age of 37, he had a cardiac arrest while at a concert, and was successfully treated for ventricular fibrillation (VF) with DC cardioversion. He had consumed approximately 5L of beer just before the concert, which his cardiologist regarded as a possible contributing factor. Electrophysiological studies did not induce any arrhythmias and an implantable defibrillator was placed. He also developed atrial fibrillation, for which he was successfully treated with flecainide. There was little extended family history available.

Figure 3.1 Families with ASD and NKX2-5 sequence changes. At left is family 1024, with ASD and HLHS. NlaIII restriction digest of PCR products (below pedigree) demonstrated that all affected and one clinically normal individual (II:4) carried the mutation. At right is family AF1, with ASD in three generations; however, only one affected individual was heterozygous for the 61G>C change (family AF1 sequencing and family 1024 restriction digest by Dr David Elliott). All sequence changes illustrated were confirmed by bidirectional sequencing. Sequence changes shown: 61G>C = E21Q; 532C>T = T178M.
The proband had three sibs. The first child in the family (II:1) had died aged 9 weeks of hypoplastic left heart syndrome with an associated ASD. With the consent of the parents, this child’s newborn screening card was accessed and DNA extracted from it. An NlaIII restriction enzyme digest performed by Dr David Elliott confirmed that she was heterozygous for the T178M mutation.
Of the other two sibs, one was found not to carry the mutation. The other (II:4) was heterozygous for the mutation. However, clinical examination, echocardiography and ECG were normal when she was last seen, aged 10 years.
The finding of this mutation in a child with HLHS prompted sequencing of the gene in an additional 18 individuals with HLHS, none of whom had identifiable mutations in NKX2-5.
3.2.2.2 Family AF1: E21Q
The sequence change 61G>C, leading to the amino acid change E21Q, was identified in one member of a family in which four individuals had ASD without AV conduction abnormalities. However, E21Q did not segregate with the cardiac phenotype in the family; the other three affected individuals did not carry the mutation. The mother of the individual who was heterozygous for E21Q (II:1) was not available for testing, so it is not certain whether this was a de novo change or inherited. Although II:1 was said not to have cardiac disease it is possible that she did have unrecognised CHD. Nonetheless, this finding adds weight to the suggestion that this may be a benign polymorphism (see 3.2 above).
3.2.3 Role of NKX2-5 mutations in nonsyndromal ASD
The findings reported here are consistent with those of other investigators. Considering ASD alone, among 102 individuals with ASD one definite mutation was identified, in an individual with a strong family history of CHD and conduction disease; this represented 1/12 subjects with a family history (or 1/13 if PFO is considered to represent CHD). Ikeda and colleagues found that 1/109 individuals with ASD had a mutation in NKX2-5; that proband had a family history of ASD and conduction block (Ikeda et al., 2002). This paper does not state how many of the individuals studied had a family history. Gutierrez-Roelens and colleagues found mutations in 2/50 subjects, both of whom had extensive family histories of CHD and conduction block (Gutierrez-Roelens I et al., 2002). There were a total of 16 probands with a family history of CHD.
McElhinney and colleagues studied a total of 608 individuals, of whom 18 had a mutation identified (although, as discussed above, a number of these are of uncertain significance; 9/18 had tetralogy of Fallot including 7/9 with R25C) (McElhinney et al., 2003). Considering ASD alone, 3/71 had mutations. One of these individuals was heterozygous for the K15I mutation. This subject had no conduction abnormality and one parent was heterozygous for the mutation but had no CHD. A second individual had the A127E mutation. This individual had ASD with no conduction defect, a parent with a normal heart and a grandparent with bicuspid aortic valve (BAV). Given the high frequency of BAV, it is hard to be certain that this confirms pathogenicity of the mutation. However, mice with Nkx2-5 mutation have an increased frequency of bicuspid aortic valve (Biben et al., 2000) so it is possible that the bicuspid valve in the grandparent was caused by the A127E variant and that this is indeed a pathogenic mutation. The third individual had an unequivocal mutation; this subject had ASD and AV conduction disease and had a de novo frameshift mutation, InsTCCCT701 (McElhinney et al., 2003). Assuming that these latter two mutations are pathogenic but R25C is not, 2/71 subjects with ASD had mutations. It is not stated how many of the 71 subjects with ASD had a family history.
Combining these figures, 6/332 individuals ascertained purely on the basis of ASD have been shown to have mutations in NKX2-5 (1.8%). However, 3/28 individuals with a family history of ASD had mutations (10.7%). It is possible that publication bias is at work, with negative studies not having been published. Nonetheless, these findings suggest that where there is a family history of ASD, particularly with conduction defects, the yield from screening for NKX2-5 mutations is high enough to warrant testing, particularly as this is a relatively small gene (two exons). These figures do not support screening individuals without family history for mutations in NKX2-5; of the 6 individuals with mutations, 5 had a family history of CHD (if the bicuspid aortic valve is included) and the sixth had AV conduction block in addition to ASD.
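The pooled yields quoted above can be reproduced from the per-study figures given in this section. The following is a minimal sketch only: the dictionary names and the helper function pooled_yield are hypothetical, introduced here for illustration, and the McElhinney entry uses the 2/71 figure, which rests on the assumption stated in the text about which variants are pathogenic.

```python
# (mutation-positive, screened) counts for subjects ascertained because of ASD,
# as quoted in this section. All names below are illustrative assumptions.
asd_cohorts = {
    "present study": (1, 102),
    "Ikeda et al., 2002": (1, 109),
    "Gutierrez-Roelens I et al., 2002": (2, 50),
    "McElhinney et al., 2003": (2, 71),  # assumes 2 of the 3 variants are pathogenic
}

# Subset with a family history; only two studies report this denominator.
family_history_cohorts = {
    "present study": (1, 12),
    "Gutierrez-Roelens I et al., 2002": (2, 16),
}


def pooled_yield(cohorts):
    """Return (positives, screened, percentage) summed across studies."""
    positives = sum(pos for pos, _ in cohorts.values())
    screened = sum(n for _, n in cohorts.values())
    return positives, screened, 100 * positives / screened


asd_pos, asd_n, asd_pct = pooled_yield(asd_cohorts)
fh_pos, fh_n, fh_pct = pooled_yield(family_history_cohorts)

print(f"All ASD: {asd_pos}/{asd_n} = {asd_pct:.1f}%")
print(f"Family history: {fh_pos}/{fh_n} = {fh_pct:.1f}%")
```

This is a simple sum of numerators and denominators across studies; it does not weight studies or adjust for differences in ascertainment, which is consistent with how the figures are combined in the text.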
Supporting this conclusion, Benson and colleagues (Benson et al., 1999) found mutations in all of five individuals with CHD and AV block, four of whom had an extensive family history. In addition, 1/10 individuals with idiopathic 2nd or 3rd degree heart block but no CHD had a mutation (Benson et al., 1999). The latter part of the study has not been replicated and it remains to be seen whether this of itself is sufficient indication for mutation testing, but based on these studies the combination of CHD (particularly ASD) and AV conduction block does represent sufficient indication for NKX2-5 mutation testing, even in the absence of family history.
3.2.4 Implications for asymptomatic mutation-positive individuals
The identification of an asymptomatic child with an NKX2-5 mutation (and, of course, her sister whose ASD has been successfully treated) raises the question of the implications for the future health of such individuals. As discussed above, conduction disease in patients with NKX2-5 mutations is often progressive, and sudden death has been reported. Penetrance for conduction disease in adults is very high. Individual I:2 would probably not have survived his episode of VF if it had not occurred at a concert, with paramedics equipped with a defibrillator stationed close at hand. Normal ECGs in childhood are therefore not reassuring. Currently, these children are being monitored with annual ECG. This regimen is not evidence-based, as evidence for a specific monitoring program does not yet exist. It may need modification as more experience with the phenotype associated with NKX2-5 mutations accumulates.
3.3 The role of mutations in GATA4 in ASD and PFO
As discussed in section 1.7.1.2, a role for GATA4 in human CHD was first suggested by the observation that patients with deletions of 8p23, where GATA4 is located, commonly have CHD. Mutations in GATA4 were first reported in 2003 (Garg et al., 2003). The mutations G296S and c.1075delG segregated with CHD, predominantly ASD, without conduction abnormalities or extracardiac manifestations. Since then, an additional three germline mutations have been reported (Table 3.3). Reamon-Buettner and colleagues have reported multiple somatic GATA4 mutations in formalin-preserved cardiac specimens. These mutations are not included in Table 3.3. This study used the same set of hearts in which the same investigators identified numerous somatic NKX2-5 mutations (Reamon-Buettner and Borlak, 2005). As for the NKX2-5 mutations, these somatic GATA4 mutations may be artefactual, and until the findings are replicated by another group, preferably in fresh cardiac tissue, their status will be uncertain.
The five previously reported mutations include two single basepair deletions causing frameshifts, and three missense mutations. Apart from the missense mutation E216D, which was found to occur de novo in two unrelated individuals with TOF (Nemer G et al., 2006), all of them segregate with disease in the families in which they have been reported, and none have been found in large numbers of controls.
Table 3.3: Mutations in GATA4.

Mutation | G296S | G296S | G296S | c.1075delG | c.1074delC | S52F | E216D
Reference | Garg et al., 2003 | Sarkozy A et al., 2004 | Sarkozy A et al., 2004 | Garg et al., 2003; Hirayama-Yamada et al., 2005 | Okubo et al., 2004 | Hirayama-Yamada et al., 2005 | Nemer et al., 2006
# heterozygous | 13 | 2 | 3 | 5 | 9 | 3 | 1x2
ASD | 16 | 2 | 6 | 7 | 11 | 3 | -
VSD | 3 | - | 1 | - | - | - | -
PS | 6 | 2 | 3 | - | 2 | - | -
Nonpenetrant | -* | - | - | - | - | - | -
Other CHD (no) | AR, MR, PDA, AVSD | - | - | Dextrocardia | - | - | TOF(2)
Location | Adjacent to zinc finger and nuclear localization signal (NLS) | Adjacent to zinc finger and nuclear localization signal (NLS) | Adjacent to zinc finger and nuclear localization signal (NLS) | 3’ coding region | 3’ coding region | TAD1 | Zinc finger
Comment | | | | | | | 2 unrelated individuals with de novo mutation

Mutations are listed in order of publication, except that a second report of a mutation is listed adjacent to the original report. # heterozygous: number proven heterozygous for the mutation. Number with ASD or other abnormalities may be greater than this if there are multiple affected individuals in the pedigree who could not be genotyped (e.g. deceased individuals). Nonpenetrant: confirmed mutation positive OR obligate heterozygote, known normal cardiac status. Other CHD: may overlap with ASD/VSD categories if >1 cardiac diagnosis; (no): number if more than one affected in pedigree. AR, aortic regurgitation; MR, mitral regurgitation; PDA, patent ductus arteriosus; PS, pulmonary stenosis; TOF, tetralogy of Fallot; TAD1, transactivation domain. *Two obligate heterozygotes not known to have CHD were unavailable for evaluation.

In expression studies, both G296S and c.1075delG showed reduced activity, with the effect being most pronounced with the frameshift mutation (Garg et al., 2003). GATA4 protein with the hypomorphic G296S allele showed reduced DNA binding compared with wild-type protein, and there was evidence that interaction with TBX5 was disrupted.
This mutation appears to be particularly associated with pulmonary stenosis (PS), with 11/24 reported affected individuals having PS (all of them also have ASD). Somewhat confusingly, the family with c.1075delG from the first report of GATA4 mutations was reported a second time, in a paper first-authored by one of the authors of the original paper (Hirayama-Yamada et al., 2005). No additional clinical information or laboratory information was provided in this subsequent paper, however.
No functional assays have been done on the single base-pair deletion c.1074delC (Okubo et al., 2004). However, this mutation results in a frameshift, with a premature stop codon resulting at amino acid 403. This, combined with the location of the mutation very close to the well-studied and very similar c.1075delG, plus the segregation of the mutation with CHD in the family in which it is reported, leaves little doubt as to its pathogenicity.
The missense mutation S52F is located in a known functional domain (the transcriptional activation domain (TAD1)) and segregated with CHD in a small family (3 affected individuals in two generations) (Hirayama-Yamada et al., 2005). No functional assays have been reported on this sequence change but it is likely to be pathogenic.
The missense mutation E216D, identified as a de novo change in two unrelated individuals with TOF, showed normal cellular localization and binding to the consensus GATA binding site in assays in rat cells. However, transcriptional activity was modestly, but statistically significantly, reduced, by about 50% (Nemer G et al., 2006). This is a highly conserved residue, although the amino acid change is relatively conservative: both glutamate and aspartate are polar amino acids, although glutamate is larger than aspartate. Overall, the status of this mutation is uncertain based on currently available evidence.
3.3.1 Subjects screened for GATA4 mutations
Subjects screened for GATA4 mutations were as described in section 2.2.2.3. There were two control groups. The first group had had trans-oesophageal echocardiography (TOE) for a variety of indications and were known to have structurally normal hearts, and in particular intact atrial septa (“TOE controls”). The second group were ascertained by Dr Lyn Griffiths and were unselected except for Caucasian ancestry (see below). Results of testing for the S377G variant (see below) were confirmed by commercial SNP analysis (Genera Biosystems). Contributions by EK to this work were participation in study design, patient recruitment, management of clinical data including coordination of a retrospective data collection exercise to obtain ethnicity information (with the assistance of Ms Haley Crotty and Ms Janan Fornusek), follow-up and clinical assessment of family members, and data analysis.
3.3.2 Results of sequencing and cytogenetic analysis
Two missense changes, A411V and D425N, were identified, in subjects with ASD and large PFO + mitral regurgitation respectively (Figure 3.2). Additionally, when the proband in family 1006 was evaluated, the history and examination findings (see below) raised the possibility of a chromosomal abnormality. Cytogenetic analysis confirmed the presence of a deletion of 8p23.
A previously reported sequence variant, S377G, was identified in a number of cases and controls and further studies were done to investigate the possibility that this may be a contributor to CHD (see below; Tables 3.4 and 3.5).
3.3.2.1 Family 1012 – GATA4 variants A411V and S377G
In this family (Figure 3.2a), the proband (II:1) had ASD diagnosed aged 9 weeks following identification of a cardiac murmur at routine examination aged 6 weeks. The pregnancy and birth history were unremarkable. At 20 months a 10mm diameter lesion was repaired surgically. Subsequently he was in good health apart from mild asthma. The only family history of note was that his mother had unilateral breast cancer diagnosed at the age of 29. There was no other family history suggestive of a familial cancer syndrome.
On examination at the age of 21, II:1 was not dysmorphic. He had mild bilateral fifth finger clinodactyly. Individual II:1 was found to be heterozygous for both A411V and S377G. The latter variant is discussed in more detail below. Sequencing of GATA4 in all first-degree relatives showed that both of the proband’s sisters (II:2 and II:3) were heterozygous for S377G but not for A411V. The proband’s father, I:1, proved to be homozygous for S377G and heterozygous for A411V. He was a healthy 53 year old man, with no past medical history of note. Transoesophageal echocardiography by Prof Michael Feneley was normal, and in particular the atrial septum appeared intact. A bubble study did not reveal any evidence of PFO.
The alanine at position 411 is not in a recognised functional domain. The change to valine is relatively conservative, substituting a large nonpolar amino acid for a small nonpolar amino acid. This residue is not evolutionarily conserved: at the same position, mice share the alanine with humans, but in rat a proline is substituted, in chicken a glutamine and in Xenopus histidine. This variant has been reported previously as a rare polymorphism (minor allele frequency <0.01) (Poirier et al., 2003). In this family, one of two individuals heterozygous for A411V has a structurally and functionally normal heart. Taking all these factors into consideration, it is likely that A411V is a polymorphism with no pathogenic impact. A role in multifactorial causation of ASD cannot be excluded.
116 A B C
Figure 3.2: Families with GATA4 variants and 8p23 deletion. Sequencing done by Dr Changbaig Hyun.
A. Family 1012 (A411V + S377G) B. Family z10 (D425N) C. Family 1006 (deletion). Arrows indicate residues affected by mutation. WT = wild type, del = deletion.
3.3.2.2 Family z10 – GATA4 variant D425N
In this family (Figure 3.2b) the proband, I:2, had mitral valve replacement aged 65 for mitral stenosis presumed to be due to rheumatic heart disease. A large PFO with bidirectional shunting was identified during pre-operative evaluation for this surgery. Other cardiac findings included atrial fibrillation, severe tricuspid regurgitation and mild thickening of the aortic valve leaflets with normal function. The only other known medical history was cholecystectomy. It was not clear whether or not she had actually had rheumatic fever in childhood. Unfortunately, I:2 died aged 76, between the time that she was recruited to the study and the finding that she was heterozygous for D425N. Her sons (II:1 and II:2) were uncertain of the cause of death. They agreed to venepuncture for the purpose of the study, but were unwilling to travel to a hospital in order to have echocardiography (clinical assessment and venepuncture were conducted at the home of one of them). This was disappointing, particularly given that both proved to be heterozygous for D425N. On clinical examination, neither was dysmorphic and neither had clinical evidence of CHD. II:1 had a unilateral single transverse palmar crease. There was no other family history of note.
D425N has not been reported previously. The aspartate residue at position 425 is not in a recognised functional domain. It is highly conserved, being invariant in mouse, rat, chicken and frog. Nonetheless, the lack of definitive cardiac assessment in individuals II:1 and II:2 means that the status of this sequence variant is uncertain.
3.3.2.3 Family 1006 – 8p23 deletion
At the time of recruitment to the study, the proband in this family (II:2) was 17 years old. She had been born at term following an uncomplicated pregnancy. Birth weight and length were on the 10th centile, but head circumference was on the 50th. ASD and PDA were identified in the newborn period, as was a left-sided diaphragmatic hernia. The diaphragmatic hernia was repaired in the newborn period and the ASD at the age of 4 years. Medical problems in childhood included mild asthma and several episodes of pneumonia. Cognitive development was delayed. Formal developmental assessment at ages 6 and 8 placed her in the mild range of intellectual disability (IQ 60-70), and she required special schooling. Cerebral CT scan and karyotype at 6 years were normal.
There was a steady increase in weight during her teenage years. She was not hyperphagic. Aged 17, she had a mildly low serum calcium at 2.17 mmol/L (NR 2.25-2.58) and fasting insulin was high at 47.4 mU/L (NR 0.8-16), suggesting insulin resistance. Other endocrine investigations were normal and treatment with modified diet and metformin was commenced.
On examination aged 17, weight was 71.1 kg (90th centile), height 141.8 cm (<3rd centile) and head circumference 52.5 cm (2nd centile). She had deep-set eyes, short palpebral fissures and a smooth philtrum. She had truncal obesity. Her 4th and 5th metacarpals were mildly short bilaterally. She had striae on her shoulders but no cutaneous calcification. She was pubertal, at Tanner stage 5 for both breast and pubic hair development. These features raised the possibility of a chromosomal disorder. It was more than 10 years since the previous karyotype and, given advances in cytogenetic techniques, chromosomal analysis on blood was repeated. This showed a deletion at chromosome 8p23, with the karyotype being 46,XX,del(8)(p23.1p23.3). This includes the locus of GATA4. Because of the similarities of the phenotype to that seen in Albright Hereditary Osteodystrophy (AHO), sequencing of GNAS1 was also requested and this was kindly done by Dr Eileen Shore; no mutation was identified.
The S377G variant was also identified in members of this family and in particular the proband was found to be hemizygous for S377G.
Deletions of 8p23, encompassing GATA4, were discussed in section 1.7.1.2. It is likely that the high incidence of CHD in patients with these deletions is at least in part due to haploinsufficiency for GATA4. Although a phenotype similar to AHO has not previously been reported in patients with such deletions, other features seen in II:2 have been previously reported. Apart from intellectual handicap, which is common to most chromosomal deletions, it is noteworthy that congenital diaphragmatic hernia (CDH) has been repeatedly reported in association with deletions of 8p23 – there have been more than 10 such cases reported (Holder et al., 2007). Since CDH has not been reported in families with GATA4 mutations, it is likely that deletion of another gene or genes within the region is responsible for CDH.
3.3.3 The common variant S377G – possible role in PFO with stroke
The GATA4 variant S377G has previously been reported as a common SNP with a minor allele frequency of 0.11. Given the role of GATA4 in dominant CHD, this allele is a candidate for involvement in multifactorial causation of CHD. Allele frequency can vary between ethnic groups due to selection or founder effects. Therefore, before performing an association study, the allele frequency was determined in a previously established set of globally distributed indigenous populations (Martinson et al., 2000). This was done by Dr Jeremy Martinson (Table 3.4).
High allele frequencies were observed in populations of European and American Caucasian, Middle Eastern and – to a lesser extent – Indian and Hispanic-American descent. African, East Asian and Pacific Islander populations had very low frequencies. These data suggest a relatively recent and Caucasian origin for S377G.
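The allele frequencies in Table 3.4 follow directly from the genotype counts: each heterozygote contributes one copy of the variant allele and each homozygote two, out of two alleles per person. The following is a minimal sketch of that arithmetic for a few of the populations listed in the table. The function and variable names are illustrative only and are not taken from the study.

```python
def allele_frequency(n_total, n_het, n_hom):
    """Variant allele frequency from diploid genotype counts."""
    # Each heterozygote carries one variant allele, each homozygote two;
    # every individual contributes two alleles to the denominator.
    return (n_het + 2 * n_hom) / (2 * n_total)


# (total n, S377G heterozygotes, S377G homozygotes) as listed in Table 3.4
populations = {
    "Caucasian US": (480, 88, 10),
    "UK": (42, 13, 2),
    "Yemen": (89, 24, 1),
    "India/Pakistan": (111, 17, 0),
    "Madagascar": (117, 1, 0),
}

frequencies = {
    name: allele_frequency(*counts) for name, counts in populations.items()
}

for name, freq in frequencies.items():
    print(f"{name}: {100 * freq:.2f}%")
```

The same calculation applies to the subject groups in Table 3.5.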
The association study was therefore restricted to Caucasian subjects (Table 3.5). There was some variation in both heterozygosity and allele frequency between groups. The differences between the ASD and “other CHD” groups and controls were not statistically significant. However, an excess of S377G was observed in subjects with PFO with stroke, with an allele frequency of 0.18.
Table 3.4 S377G allele distribution in indigenous human populations
Population | Total n | Wild type* n (%) | S377G het n (% heterozygosity*) | S377G hom n (% homozygosity*) | S377G allele frequency* (%)
Caucasian
Caucasian US | 480 | 382 (79.6%) | 88 (18.3%) | 10 (2.1%) | 11.3%
UK | 42 | 27 (64.3%) | 13 (31.0%) | 2 (4.8%) | 20.2%
Cyprus | 37 | 27 (73.0%) | 9 (24.3%) | 1 (2.7%) | 14.9%
Russian Caucasus | 112 | 84 (75.0%) | 26 (23.2%) | 2 (1.8%) | 13.4%
Russia | 47 | 33 (70.2%) | 14 (29.8%) | 0 (0.0%) | 14.9%
Asian
Yemen | 89 | 64 (71.9%) | 24 (27.0%) | 1 (1.1%) | 14.6%
India/Pakistan | 111 | 94 (84.7%) | 17 (15.3%) | 0 (0.0%) | 7.7%
Hong Kong | 57 | 57 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0.0%
Taiwan | 92 | 92 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0.0%
African
Madagascar | 117 | 116 (99.1%) | 1 (0.9%) | 0 (0.0%) | 0.4%
Central African Republic | 44 | 44 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0.0%
Pacific Islander
Papua New Guinea | 88 | 88 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0.0%
*Reflects the prevalence of the A->G single nucleotide polymorphism that causes the S377G amino acid change. Data provided by Dr JJ Martinson. het = heterozygous, hom = homozygous.

Table 3.5: S377G in Caucasian Subjects
Characteristic | ASD$ (n=131) | Other CHD* (n=109) | PFO With Stroke# (n=59) | PFO Without Stroke## (n=29) | Stroke/no PFO/ASD^ (n=66) | TOE Controls (no PFO/ASD/Stroke)% (n=113) | Caucasian Population Controls (n=391)
Age at Study (yrs) | 0-77 (mean: 21.4) | 0.1-44.3 (mean: 3.9) | 19-86 (mean: 51.7) | 38-88 (mean: 65.9) | 29-87 (mean: 65.9) | 21-89 (mean: 63.3) | 16-84 (mean: 53.1)
Male (%) | 51 (39.2) | 69 (63.3) | 33 (56.9) | 20 (69.0) | 44 (66.7) | 66 (58.4) | 200 (50)
Family History of CHD (%) | 19/129 (14.7) (2 unk) | 13 (11.9) | 6/57 (10.5) (1 unk) | 4/27 (14.8) (2 unk) | 2/65 (3.1) (1 unk) | 12 (10.6) | N/A
GATA4 S377G Heterozygous (%) | 18 (13.7) | 24 (22.0) | 15 (25.4) | 8 (27.6) | 16 (24.2) | 17 (15.0) | 83 (21.2)
GATA4 S377G Homozygous (%) | 3 (2.3) | 3 (2.8) | 3 (5.1) | 0 (0.0) | 0 (0.0) | 2 (1.8) | 7 (1.8)
GATA4 S377G Allele Frequency | 0.092 | 0.14 | 0.18** | 0.14 | 0.12 | 0.093 | 0.12
Severe Atheroma | N/A | N/A | 0/44 (0.0) (14 unk) | 3 (10.3) | 18 (27.3) | 18/109 (16.5) (4 unk) | N/A
Atrial Fibrillation | N/A | N/A | 1 (1.7) | N/A | 13 (19.7) | N/A | N/A
$ Includes 2 TAPVR, 2 LSVC, 2 coarct, 2 PS, 1 LSVC & PS, 1 PDA, 3 MVP.
* Includes 27 VSD, 17 VSD + minor abnormalities and 65 VSD with other anomalies: 33 with TOF/PA, 19 with TGA/DORV, 13 with other malformations.
# Includes 1 Ebstein's Anomaly, 2 MVP, 1 prosthetic pulmonary valve.
## Includes 1 Quadri-leaflet Aortic Valve, 3 MVP, 3 prosthetic Aortic Valve, 1 prosthetic Aortic Valve & MR.
^ Includes 1 BAV, 5 MVP, 4 prosthetic AV, 2 MVR.
% Includes 1 Ebstein's Anomaly, 8 BAV, 1 BAV & Coarct, 1 BAV & MR, 1 Sick Sinus Syndrome, 1 PDA, 1 aortic root replacement, 9 MVP, 4 prosthetic MV, 6 prosthetic aortic valve, 3 prosthetic aortic & mitral valves, 1 MR & TR.
** p=0.022 compared to TOE Controls; OR 2.17
unk = unknown
When compared with the TOE control group, this was statistically significant (p=0.022, chi-square test). This is arguably the most appropriate comparison given the high prevalence of PFO in the general population – it is likely that approximately 25% of the unselected Caucasian controls have PFO (Hagen et al., 1984).
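The comparison between the PFO with stroke group and the TOE controls can be sketched from the genotype counts in Table 3.5. The text does not state whether the chi-square test was applied to allele counts or to genotype counts, nor how the odds ratio was derived. The sketch below therefore assumes a 2x2 Pearson chi-square test without continuity correction and computes the odds ratio both per allele and per carrier. All function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def allele_counts(n, het, hom):
    """(variant alleles, wild-type alleles) for a group of n diploid subjects."""
    variant = het + 2 * hom
    return variant, 2 * n - variant


def carrier_counts(n, het, hom):
    """(subjects carrying at least one variant allele, non-carriers)."""
    carriers = het + hom
    return carriers, n - carriers


# (n, S377G heterozygotes, S377G homozygotes) from Table 3.5
pfo_stroke = (59, 15, 3)
toe_controls = (113, 17, 2)

allele_table = allele_counts(*pfo_stroke) + allele_counts(*toe_controls)
carrier_table = carrier_counts(*pfo_stroke) + carrier_counts(*toe_controls)

allele_chi2, allele_p = chi_square_2x2(*allele_table)
allele_or = odds_ratio(*allele_table)
carrier_or = odds_ratio(*carrier_table)

print(f"allele-based chi-square = {allele_chi2:.2f}, p = {allele_p:.3f}")
print(f"allele-based OR = {allele_or:.2f}, carrier-based OR = {carrier_or:.2f}")
```

Under these assumptions the allele-based test gives a p-value of about 0.022, matching the reported figure, while the reported odds ratio of 2.17 corresponds to the carrier-based comparison (the allele-based odds ratio is about 2.11). This is a reconstruction from the tabulated counts and may not be the exact procedure used in the study.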
These findings suggest the possibility that S377G may be implicated in the causation of PFO. This finding should be interpreted with caution. Multiple comparisons were made. ASD did not show an increase in S377G as might be expected given the presumed shared aetiology of ASD and PFO (see section 1.6.5). It is possible that if there is a relationship between S377G and PFO with stroke, the effect is not an increase in the risk of PFO but rather a direct effect on stroke risk. GATA4 is expressed in liver as well as heart, and has been implicated as a transcriptional repressor of the gene for apolipoprotein(a), high levels of which are an independent risk factor for atherosclerosis and stroke (Negi et al., 2004). However, if there were such an independent effect, an increased allele frequency would be expected in the “stroke/no PFO/ASD” group and this was not seen.
At present, despite these data, S377G is most likely to represent a polymorphism of no clinical significance. However, a replication study is under way and if this finding can be reproduced it will have important implications for the pathogenesis of cryptogenic stroke.
3.3.4 Role of GATA4 mutations in nonsyndromal ASD
The findings reported here are consistent with previous studies. It appears that GATA4 mutations are somewhat less common than mutations in NKX2-5. If A411V and D425N are considered polymorphisms, 1/131 subjects with ASD had a whole gene deletion, and 0/19 with a family history of CHD had a GATA4 mutation identified. Sarkozy and colleagues found mutations in 2/29 probands, including 16 familial cases (Sarkozy A et al., 2004). In a subsequent study, the same group studied 42 subjects with AVCD for mutations in GATA4 and CRELD1 but found no mutations (Sarkozy et al., 2005) (not included in the combined figures below). Zhang and colleagues found mutations in 0/99 subjects with CHD, of whom however only 6 had ASD (no information available regarding family history) (Zhang L et al., 2006). Schluterman and colleagues found no mutations in 157 probands with CHD, of whom 14 had ASD (no family history information available) (Schluterman M et al., 2007). Nemer and colleagues found mutations in 0/94 subjects with a variety of forms of CHD, including 0/12 with ASD (their findings in TOF are discussed above) (Nemer G et al., 2006). Hirayama-Yamada and colleagues found mutations in 2/16 familial cases (Hirayama-Yamada et al., 2005), but this was not part of a larger study of unselected cases.
Combining all of these data, 3/192 unselected cases (the majority being from this study) had a GATA4 mutation or deletion (1.5%), and 4/51 cases with a family history had a GATA4 mutation (7.8%). Of the three unselected cases, one had extracardiac malformations, intellectual disability, short stature and dysmorphic features as clues to the chromosomal basis of her problems, and the other two both had a family history of CHD. Thus, screening for mutations in GATA4 appears indicated in affected individuals with a family history of CHD, particularly in families in which the predominant phenotype is ASD +/- PS.
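The pooled figures can be retraced from the per-study counts given in the preceding paragraph. The breakdown below is an inference from the text rather than a table taken from the thesis. In particular, assigning both Sarkozy et al. (2004) mutations to familial cases follows from the statement that the other two unselected cases both had a family history of CHD. Names are illustrative.

```python
# (study, number of cases, cases with a GATA4 mutation or deletion)
unselected_asd = [
    ("This study", 131, 1),
    ("Sarkozy et al., 2004", 29, 2),
    ("Zhang et al., 2006", 6, 0),
    ("Schluterman et al., 2007", 14, 0),
    ("Nemer et al., 2006", 12, 0),
]

familial = [
    ("This study", 19, 0),
    ("Sarkozy et al., 2004", 16, 2),  # assumed: both mutations in familial cases
    ("Hirayama-Yamada et al., 2005", 16, 2),
]


def pooled(rows):
    """Return (mutation-positive cases, total cases, proportion) across studies."""
    cases = sum(n for _, n, _ in rows)
    positive = sum(m for _, _, m in rows)
    return positive, cases, positive / cases


unselected_result = pooled(unselected_asd)
familial_result = pooled(familial)

for label, (positive, cases, rate) in [
    ("Unselected", unselected_result),
    ("Familial", familial_result),
]:
    print(f"{label}: {positive}/{cases} = {100 * rate:.2f}%")
```

On this breakdown the totals are 3/192 and 4/51, as stated in the text. The unrounded proportions are roughly 1.56% and 7.84%.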
3.4 Mutations in TBX20 are associated with diverse cardiac pathologies, including abnormal septation and valvulogenesis, and cardiomyopathy
As discussed in 1.5.3, the T-box transcription factors share a highly conserved DNA-binding domain called the T-box. TBX20 is an ancient member of this family of genes, related to TBX1. In mice, Tbx20 is expressed in cardiac progenitor cells, in the developing myocardium and in endothelial cells associated with the endocardial cushions, which are the precursors for the cardiac valves and AV septum (Stennard et al., 2003). The Tbx20 protein contains both transcriptional activation and repression domains, and it physically or genetically interacts with Nkx2-5, Gata4, and Tbx5 (Brown et al., 2005). Tbx20 null mice have profoundly abnormal cardiac development (Stennard and Harvey, 2005), with a rudimentary heart lacking chamber myocardium. There is evidence that Tbx20 directly represses Tbx2, which in turn functions as a repressor in the development of both chamber and nonchamber myocardium (Stennard and Harvey, 2005). Mice heterozygous for a Tbx20 null mutation have an atrial septal phenotype similar to mice heterozygous for Nkx2-5 mutations. They have mild atrial septal abnormalities, including an increased prevalence of PFO, atrial septal aneurysm (ASA) and ASD, as well as mild dilated cardiomyopathy (Stennard and Harvey, 2005).
TBX20 mutations have not previously been associated with human disease, but the information described above, plus the knowledge that mutations in TBX5, NKX2-5 and GATA4 are associated with CHD, made TBX20 a logical candidate gene for study in human subjects with CHD.
3.4.1 Subjects screened for TBX20 mutations
A total of 353 individuals with CHD were screened for mutations in TBX20 by sequencing of the coding regions of the gene and intron-exon boundaries. This sequencing was done by Leticia Castro, Changbaig Hyun and Andrew Cole. At this stage of the study, subjects were recruited mainly from the Children’s Hospital at Westmead and St Vincent’s Hospital, with only 10% being recruited from Sydney Children’s Hospital. Contributions by EK to this part of the study were limited to participation in study design, patient recruitment, particularly recruitment and clinical evaluation of family members of probands with possible mutations, and data analysis and figure preparation (but not laboratory benchwork) for the transcriptional assays, and to a lesser extent the Xenopus studies.
Patient characteristics are summarised in Table 3.6.

Table 3.6 Characteristics of subjects sequenced for TBX20 mutations
Phenotype | ASD only (a) | ASD + other CHD (b) | VSD only | VSD + other CHD (c) | Other CHD (d)
No. of subjects: Total | 151 | 24 | 41 | 22 | 115
Male | 53 | 16 | 23 | 10 | 70
With positive family history (e) | 20 | 5 | 4 | 2 | 8
With AV conduction block (f) | 5 | 3 | 0 | 1 | 0
With atrial fibrillation | 8 | 0 | 0 | 0 | 0
With LV dysfunction | 5 (g) | 1 (h) | 0 | 0 | 0
Mean (range) age at enrolment, in years | 26 (0-79) | 12 (0.2-62) | 6 (0-68) | 6 (0-59) | 4 (0-16)
a. Three adults had mitral valve prolapse.
b. Including sinus venosus ASD (n = 13; all others are secundum ASD); partial anomalous pulmonary venous connection (n = 6); left SVC (n = 2); valvular lesions (n = 5), including one example of supravalvar mitral ring; coarctation of the aorta (n = 1).
c. Including ASD (n = 7), left SVC (n = 5), aortic valve abnormalities (n = 5), coarctation of the aorta (n = 4), double-chambered right ventricle (n = 2), pulmonary stenosis (n = 1), patent ductus arteriosus (n = 1), and partial anomalous venous connection (n = 1). One subject had mitral valve prolapse, and one had supravalvar mitral ring.
d. Including outflow tract lesions (n = 75), atrioventricular septal defect and variants (n = 18), functional single ventricle (n = 17, including 2 with mitral valve atresia), heterotaxy (n = 2), cor triatriatum (n = 1), and Ebstein anomaly (n = 1).
e. Positive family history was defined as at least one first-degree relative affected with CHD. Thirty-seven subjects were found to have syndromes known to be associated with CHD, including trisomy 21 (n = 20) and 22q microdeletions (n = 12). However, only two subjects with a positive family history were from this group.
f. First-degree or complete heart block. Complete and partial right bundle-branch block were not included in this group. Two subjects with ASD had left bundle branch block.
g. All subjects were aged 1-55 years. Subjects had normal LV size and contractility but impaired diastolic relaxation (n = 2) or impaired systolic function with (n = 2) or without (n = 1) LV dilation.
h. This patient (family 2, individual III:4) was positive for the TBX20 mutation Q195X.
3.4.2 TBX20 mutations
Mutations in TBX20 were identified in two families. These were 456C>G, coding for the protein change I152M, and 583C>T, coding for Q195X. In addition, a third change, 626C>T, coding for T209I, was identified in another family, but did not segregate with CHD in that family (Figure 3.3).
Figure 3.3 Families with TBX20 mutations (panels: Family 9001, Family z103 and Family WM1). Relevant sequence chromatograms of patient and wild-type DNA are shown. The arrow under the sequence indicates the detected single-nucleotide change or the corresponding normal sequence. Individual II:4 in family z103 was not available for genotyping.
3.4.2.1 Family 9001: TBX20 mutation I152M
This mutation was identified in a family in which septal defects affected members of three generations. The proband (III:1) had ASD, which was corrected surgically in childhood. Her mother (II:2) had a large PFO with a permanent left to right shunt. The proband’s grandmother (I:2) had a small VSD. Cardiac valves and LV function were normal in all family members. There were no limb anomalies or other malformations, and the affected individuals did not have AV conduction abnormalities or arrhythmias. The mutation was absent in >450 controls.
3.4.2.2 Family z103: TBX20 mutation Q195X
In family z103 (family 2), in which the Q195X mutation was identified, the phenotype was more complex and varied considerably between affected family members. Congenital heart disease included atrial septal defect and coarctation of the aorta (in the proband, III:4), and a strongly suggestive history of congenital heart disease in individual II:6. Although records for this woman are no longer available, she was said to have been born with a “hole in the heart” and was scheduled for corrective cardiac surgery at the time of her death in a motor vehicle accident, aged in her early 20s. Valvular dysfunction, primarily affecting the mitral valve, was present in several affected family members; individual II:2 has marked mitral valve prolapse (but only mild mitral regurgitation), and individual I:2 had a mitral valve replacement in the 1960s for presumed rheumatic heart disease. While it is possible that this was a correct diagnosis, the presence of mitral valve disease in other family members suggests that this may have been a manifestation of the family’s TBX20 mutation. Individual III:2 died of cardiac failure in 1967, aged 11 months. At 10 months, cardiac catheterization showed evidence of severe pulmonary hypertension, right ventricular hypertrophy and an enlarged left atrium, presumably due to mitral stenosis. At post mortem examination, she had a small mitral valve ring and thickened valve leaflets. In addition, she was found to have endocardial fibroelastosis, more pronounced in the left ventricle than the right, right ventricular hypertrophy and hypoplasia of the left ventricle. The atrial septum was intact. Her sister, III:3, died in 1977, aged 7, of heart failure following a long history of pulmonary hypertension. She was first investigated with cardiac catheterization aged 3, at which time she was asymptomatic but was shown to have moderate pulmonary hypertension. This progressed over the next 4 years. No post mortem examination was done. The proband, III:4, also had pulmonary hypertension in childhood (in addition to his cardiac structural malformations), but this resolved by early adulthood. However, at the age of 34 he was found to have a mildly dilated left ventricle with mild global impairment of systolic function. His mother, II:2, also has evidence of a mild left ventricular dilated cardiomyopathy, with unusual apico-lateral hypertrophy recorded. Q195X was not identified in >300 controls. There are no noncardiac malformations in members of this family.
3.4.2.3 Family WM1: TBX20 polymorphism T209I
In family WM1, in which the TBX20 mutation T209I was identified, the proband (II:3) had an ASD, one brother (II:4) had a VSD and one sister (II:6) had an ASD. Several members of this family declined to take part in the study (clinical studies of this family were done by A/Prof David Winlaw and members of his lab). Both II:3 and II:4 were heterozygous for the T209I mutation, but II:6 was homozygous for the wild-type allele. II:3 also had features of Klippel-Feil syndrome, in which CHD is relatively common (Tracy et al., 2004). Thus, the T209I mutation is not segregating with the cardiac phenotype in this family. While it is possible that II:6 represents a phenocopy, the significance of the mutation in this family is uncertain, based on pedigree analysis. It seems most likely that this is a polymorphism of no pathological significance, although it is possible that it does play some role in the pathogenesis of CHD in the family members who carry it. If it is a polymorphism, it is a rare one – it was not found in >300 controls.
3.4.3 Functional and other studies of the TBX20 mutations
Work done by a number of others, outlined below, supported the pathogenicity of the I152M and Q195X mutations, and provided some evidence of abnormal function associated with T209I (Kirk et al., 2007). Specifically, transcriptional assays were done by members of Prof Richard Harvey’s lab, including Drs Mauro Costa, Orit Wolstein and Guanglan Guo. Work in Xenopus embryos was done by Drs Aaron Zorn and Scott Rankin. Biophysical studies were performed by Drs Margaret Sunde and Joel Mackay.
3.4.3.1 Transcriptional assays of Tbx20 function
The mouse Tbx20 protein exists in long (Tbx20a) and short (Tbx20c) isoforms. When assayed alone, the long isoform has weak transcriptional activity compared with the short isoform, because of dominant effects of its C-terminal trans-repression domain (absent in the short isoform) (Stennard et al., 2003). However, when collaborating with Nkx2-5 and Gata4, the long isoform has strong transcriptional activity. In a transcriptional assay in 293T cells (Figure 3.4), measuring activation of the Nppa promoter in the presence of Tbx20c, activity of Tbx20c carrying I152M was significantly reduced (p=0.05), and the activity of Tbx20c carrying Q195X was severely impaired. Interestingly, in this assay the T209I change was also associated with reduced transcriptional activity (p=0.008). When assayed together with Nkx2-5, Tbx20a I152M activity was slightly elevated (p=0.05); Tbx20a T209I activity was unchanged, and not surprisingly Tbx20a Q195X activity was again severely reduced (p=0.02).
3.4.3.2 Xenopus embryo gastrulation assay
Microinjection of synthetic Tbx20a wild-type mRNA into Xenopus laevis embryos severely disrupts gastrulation (Figure 3.4). This capacity was preserved with Tbx20a I152M and Tbx20a T209I mRNA, but was abolished when Tbx20a Q195X mRNA was used.
3.4.3.3 Protein modelling
A model of the Tbx20 T-box was constructed by Drs Margaret Sunde and Joel Mackay, using the known crystal structure of the human TBX3 T-box. This model placed the side chain of the I152 residue in the core of the T-box, packed against other hydrophobic residues. Replacement of isoleucine by methionine in other proteins has been shown to have a destabilising effect (Gassner et al., 1996; Ohmura et al., 2001). The T209 residue is also located in the T-box, at the DNA-interaction face. Substitution of threonine with isoleucine at this point has the potential to disrupt H bonds which stabilise the chain in the DNA binding region.
The Q195X mutation results in truncation of the protein within the T-box – thus its severe effects on Tbx20 function in the transcription and Xenopus assays are not surprising.
Figure 3.4 Transcription studies and Xenopus embryo gastrulation assay. a, 293T cell-transfection assay measuring activation of the Nppa promoter in the presence of Tbx20c (short isoform without C-terminal trans-activation and trans-repression domains). b, COS cell-transfection assay measuring activation of the Nppa promoter in the presence of Tbx20a (full-length isoform), Nkx2-5 and Gata4, alone or in combination. The combination of all three factors is required for synergistic activation. c, Ability of wild type and mutant Tbx20a to disturb gastrulation in Xenopus embryos after microinjection of mRNAs into fertilized eggs. Assays done by Mauro Costa, Orit Wolstein and Guanglan Guo (transcription assays) and Aaron Zorn and Scott Rankin (Xenopus). Statistical analysis of transcription assays and figure preparation by EK.
3.4.4 Significance of mutations in TBX20
There is strong evidence that the cardiac pathology seen in members of families 1 and 2 is caused by the unique mutations in TBX20 found in each. Neither mutation was found in >300 controls. The missense mutation, I152M, affected a highly conserved amino acid in the T-box DNA binding domain. It segregated with cardiac septal abnormalities over three generations in the affected family. Biophysical studies by Drs Margaret Sunde and Joel Mackay showed abnormalities including a fourfold reduction in the DNA-binding “on” rate (Kirk et al., 2007). Transcriptional activity of the short isoform of mouse Tbx20 was significantly reduced, by ~40%, although the overexpression assays relying on a synergistic interaction with other transcription factors did not confirm this. These data point to I152M causing reduced TBX20 function, although it is clearly not a null allele.
By contrast, the Q195X mutation results in a functionally inactive protein. It introduces a stop mutation within one of the exons coding for the T-box DNA-binding domain, a key functional domain of the protein. The resulting TBX20 protein is truncated within the T-box and completely lacks the trans-activation and trans-repression domains located in the C terminus of the protein (Stennard et al., 2003). In all functional assays the Q195X mutation had severely reduced activity. Although only two affected individuals within the family were alive and available for genotyping, both are heterozygous for the mutation. Moreover, the pattern of congenital heart disease and cardiomyopathy within the family is consistent with autosomal dominant inheritance.
The phenotypes seen in this family are remarkable for their range, and particularly for the occurrence of both CHD and cardiomyopathy. Mitral valve structural malformations were prominent, with both mitral valve stenosis and prolapse occurring in different members of the family. Congenital mitral valve stenosis is a rare but serious malformation, generally associated with poor prognosis, whereas mitral valve prolapse is common, and is usually detected later in life. The causes of both of these types of mitral valve pathology are unknown, but it appears that both can arise from loss of TBX20 function. As discussed in Chapter 1, mouse Tbx20 is expressed strongly in cells of the endocardial cushions and, subsequently, the cardiac valves (Stennard and Harvey, 2005; Stennard et al., 2003). Consistent with a functional role for TBX20 in this tissue, mouse embryos with a partial RNAi-mediated knockdown of Tbx20 expression show severely hypoplastic and/or immature atrioventricular valves (Takeuchi et al., 2005).
Dilated cardiomyopathy (DCM) was present with structural CHD in two individuals heterozygous for the Q195X mutation. While this could reflect pathological decompensation after functional adaptation to structural defects, the impression of the treating cardiologists was that the degree of DCM was greater than could be explained on this basis in both affected individuals. Consistent with the idea that the TBX20 mutation may cause DCM independently of structural anomalies, mild DCM was observed in mice heterozygous for Tbx20 mutation but without significant structural anomalies (Stennard and Harvey, 2005). A possible explanation is that the TBX20 mutation provides a sensitized developmental template for adult-onset DCM. Heart failure is also seen in some patients carrying NKX2-5 mutations, years after correction of structural CHD (Schott et al., 1998; Benson et al., 1999). In a mouse model of Nkx2-5 deficiency, DCM is present even in fetal life (Elliott et al., 2006). Familial DCM is highly genetically heterogeneous, with most mutations occurring in genes encoding myofilament, cytoskeletal, energy, and Ca2+ handling proteins (Fatkin and Graham, 2002), rather than transcription factors. However, mutation of the transcriptional coactivator, EYA4 (MIM 603550), causes familial DCM and sensorineural hearing loss (Schonberger et al., 2005).
The pathology seen in the family with the Q195X mutation is generally more severe than in the family with the I152M mutation, and this may reflect the more severe disruption to the protein. It seems likely that for this mutation at least, the pathogenic effects of the mutation result from haploinsufficiency, although a dominant-negative effect of the mutant protein cannot be excluded.
The third change detected in this study, T209I, did not segregate with pathology in the proband’s family – similarly to the E21Q mutation in NKX2-5 described in section 3.2.2.2. The affected residue is located within the T-box; subtle transcriptional defects and instability in bacteria were observed for this allele; and it was not found in controls. However, the lack of segregation with phenotype makes it difficult to ascribe it a role in causing pathology in this family. Although every effort was made to prevent this, the possibility that the non-segregation was due to one or more sample handling errors cannot be excluded. Additional samples from the family members were not available to check this.
Mutations in TBX20 have not previously been linked to human disease, but from the data presented here there is little doubt that, in some families at least, TBX20 mutation is responsible for CHD, particularly septal and mitral valve abnormalities, and DCM. Although mutations were present in only 0.6% (2 of 352) of probands with CHD, among those with a family history 5.1% (2/39) carried a TBX20 mutation.
3.5 Conclusions: the role of mutations in NKX2-5, GATA4 and TBX20 in human disease
The hypothesis that mutations in genes responsible for dominant forms of ASD might contribute to common forms of ASD is only partially borne out by the studies reported here. Mutations in NKX2-5, GATA4 and TBX20 accounted for 1.8%, 1.5% and 0.6% respectively of unselected cases of ASD. The great majority of individuals with mutations had a positive family history, and this is reflected in the finding that 10.7%, 7.8% and 5.1% of familial cases respectively (including all reported studies) had mutations in one of these three genes. Combining these figures, nearly a quarter of familial ASD is accounted for by mutations in one of these three genes. It seems likely that studies of the other known nonsyndromal ASD genes, MYH6 and ACTC, will produce similar results. This will leave a considerable percentage of familial ASD unaccounted for. Some cases will represent recurrences in families due to multifactorial inheritance, rather than autosomal dominant inheritance, but it is likely that there is considerable genetic heterogeneity yet to be unravelled, accounting for the bulk of familial cases. Chapter 4 describes an effort to identify an additional gene contributing to this heterogeneity.
There are tantalizing hints here that variation in these genes may play a role in the multifactorial causation of ASD. In each of these three genes, sequence changes resulting in altered amino acid structure have been identified, which although not found in large numbers of controls, do not segregate with disease or meet other criteria for Mendelian mutations. Determining whether such variants are truly relevant to common forms of ASD will require large scale association studies of a type not yet conducted. The exception is the S377G variant, which remains of uncertain significance despite the association study reported here, but may contribute to PFO and cryptogenic stroke.
4. Atrial septal defect and Marcus Gunn phenomenon: further evidence for clinical and genetic heterogeneity in autosomal dominant atrial septal defect
4.1 Introduction
As discussed in Chapters 1 and 3, to date mutations in six genes have been associated with autosomal dominant ASD – TBX5 (as part of Holt-Oram syndrome), NKX2-5, GATA4, MYH6, ACTC and now TBX20. In addition, linkage to 5p has been reported in a single family, with the responsible gene yet to be identified (Benson et al., 1998). Four of the genes identified to date have been transcription factors and two (MYH6 and ACTC) structural proteins. There is thus considerable genetic heterogeneity in dominant ASD. This chapter describes a large autosomal dominant ASD family without conduction defects, in which 5 of 10 affected family members also have the Marcus Gunn jaw winking phenomenon. Although the effort to map the ASD gene in this family was unsuccessful, there was no evidence of linkage to any of the known ASD loci, indicating that this is an eighth form of dominant ASD. No association between ASD and Marcus Gunn phenomenon (MGP) has previously been reported – thus the phenotype in this family represents a new syndrome.
4.2 Marcus Gunn phenomenon
MGP consists of synkinetic upper eyelid motion with stimulation of the ipsilateral pterygoid muscles, in association with varying degrees of congenital ptosis. This manifests as rapid movements of the affected eyelid with movement of the mandible during jaw protrusion or opening (e.g. during chewing). It is most pronounced in the newborn period, when the eye movements are noted during sucking, but usually persists into adult life. It is thought that the disorder results from abnormal connection of axons which would normally travel within the motor branch of the trigeminal nerve, innervating the pterygoid muscle (Freedman H and Kushner B, 1997). The abnormal connection is to the levator palpebrae superioris muscle (normally innervated by a branch of the oculomotor nerve).
Although most cases of MGP are sporadic, there have been several reports of familial occurrences of the condition, usually consistent with autosomal dominant inheritance (Falls et al., 1949; Kuder GG and Laws HW, 1968; Kirkham, 1969; Mrabet et al., 1991; Pratt et al., 1984). Incomplete penetrance has been observed in dominant MGP (Mrabet et al., 1991), but no associated malformations have been reported in these families. The only previous report of an association between CHD and MGP was in a case report of a child with MGP and complex CHD involving Tetralogy of Fallot, with left heart hypoplasia and total anomalous pulmonary return (Festa et al., 2005). The MGP is mentioned only in the abstract of the paper and there is no mention of family history.
There has been a single report of KIF21A mutations in four patients with congenital fibrosis of the extra-ocular muscles (CFEOM) who also have MGP (Yamada et al., 2005). CFEOM is an autosomal dominant disorder characterised by nonprogressive ophthalmoplegia, bilateral ptosis and a downward primary position of the eyes, with limited ability to elevate (supraduct) the eyes, and is caused by mutations in KIF21A, which encodes a kinesin motor protein and is located at 12q12 (Yamada et al., 2003). CFEOM is associated with abnormal development of the oculomotor axis, including the nucleus of the oculomotor nerve, the superior division of the nerve itself and the levator and superior rectus muscles (Yamada et al., 2005), and thus its pathogenesis overlaps with that of MGP. Of the four patients with CFEOM and MGP, two had de novo mutations in KIF21A and two had familial mutations. In the latter two families there were other family members who had CFEOM but did not have MGP (Yamada et al., 2005). To date, there have been no reports of mutation screening in KIF21A in subjects with isolated MGP, familial or otherwise.
Doco-Fenzy and colleagues (Doco-Fenzy et al., 2006) reported a child with a chromosomal duplication involving 12q24.1-q24.2 who had MGP as well as multiple congenital anomalies. The cardiac phenotype of this child was of multiple small VSDs, which closed spontaneously, and pseudo-coarctation of the aorta. TBX5 is one of the more than 75 genes within the duplicated region. It
is not clear what link, if any, this has to the association of ASD and MGP reported here. Holt-Oram syndrome is caused by loss of function of TBX5 (Basson et al., 1999), rather than gain of function as would be expected in a duplication; however, in animal models cardiac development is disrupted by Tbx5 overexpression (Hatcher et al., 2004), and it is plausible that TBX5 duplication could cause CHD.
There has been a single report of MGP in a child with CHARGE syndrome (Weaver et al., 1997). This antedated the identification of CHD7 mutations in individuals with CHARGE syndrome (Vissers et al., 2004) so the molecular basis of CHARGE syndrome in this patient is not confirmed although a CHD7 mutation is likely. Facial nerve palsy, usually unilateral, is a common feature of CHARGE syndrome, and indeed abnormalities of all other cranial nerves have been reported (Sanlaville and Verloes, 2007). It is plausible that the MGP in the patient reported by Weaver et al (Weaver et al., 1997) is an unusual manifestation of a common component of the syndrome, i.e. abnormal development of cranial nerves and their nuclei. It seems unlikely that CHD7 mutations would account for MGP in patients without other features of CHARGE syndrome.
4.3 Phenotypes of affected family members
The pedigree is shown in Figure 4.1. Ten individuals in four generations were affected by congenital heart disease. All of these had secundum ASD but otherwise structurally normal hearts, except for individuals II:7 (primum and secundum ASD), III:4 (VSD, no ASD), and III:5 (TOF and ASD). The severity of ASD ranged from a defect only a few mm in size in an asymptomatic 88-year-old man (I:1) diagnosed as part of the study, to a lesion 4 cm in diameter requiring surgical closure (II:7). No affected family members had cardiac conduction abnormalities, but II:7 had atrial fibrillation diagnosed at the age of 55. In addition, 5 of the 10 individuals with congenital cardiac malformations had MGP. Individual III:6 had very pronounced bilateral MGP and required surgery for ptosis and to eliminate the "wink". Other individuals were less severely affected. No family members with MGP in the absence of congenital heart disease were identified. Individual II:1 is an obligate gene carrier, since he has an affected parent and child, but is apparently non-penetrant for the disorder, having a normal heart on echocardiography and lacking MGP. Individual III:7 had unilateral cleft lip and palate which were repaired in childhood, in addition to ASD. All family members were assessed by a clinical geneticist (EK). None had significant craniofacial dysmorphism or other features suggestive of known syndromes. In particular, none had features of velocardiofacial syndrome, or subtle limb abnormalities suggestive of Holt-Oram syndrome. Individuals II:4 and III:4 declined to take part in the study, and information about their cardiac lesions and the MGP in III:4 is based on history provided by other family members. The family was originally ascertained by Prof Ian Glass, and he played an important role in recruitment of subjects, including doing many of the venepunctures. Dr Rob Justo and Dr Michael Tsicalis did the cardiac assessments and Dr Tim Sullivan provided ophthalmological advice.
4.4 Cytogenetics
Although no family members were felt to have features consistent with velocardiofacial syndrome, a karyotype and FISH for 22q11 microdeletion were performed in individual IV:2. Both were normal.
4.5 Sequencing of cardiac genes
No abnormalities in the coding sequences of NKX2-5, GATA4 or TBX20 were detected in individual II:7 (sequencing of NKX2-5 by EK, GATA4 by Dr Changbaig Hyun and TBX20 by Ms Leticia Castro). DNA extraction was done by EK, assisted by Dr Fiona McKenzie.
Figure 4.1: Family with ASD and MGP
[Pedigree diagram not reproduced: four generations, comprising individuals I:1 to I:2, II:1 to II:8, III:1 to III:9 and IV:1 to IV:2. Symbols indicate congenital cardiac malformation and Marcus Gunn phenomenon; + = examined and/or genotyped.]
Note that individuals II:5, II:8 and III:9 were genotyped as part of the mapping study but not examined
4.6 Mapping results
The names and map locations of the markers used in this study, together with the 2-point LOD scores obtained for each marker at θ = 0, are shown in Appendix 1. Analysis was also done at θ = 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3 and 0.4 (data not shown) but no loci had substantially higher LOD scores at θ > 0 than at θ = 0. Two LOD scores are shown for each marker. The first (“LOD score all”) is the score obtained using all available phenotype data. Because this was inconclusive, the genome scan was re-run with all unaffected individuals coded as “unknown”, apart from the unaffected spouses II:5, II:8 and III:9. The results are in the columns headed “LOD score affected”. The rationale for this was that if one of the individuals who was phenotypically unaffected was in fact heterozygous for the disease-causing mutation, this would manifest in the mapping results as a recombinant, lowering the LOD score. Although the maximum obtainable LOD score using only affected individuals would be lower, this problem would be avoided. Genetic distances from the Généthon linkage map (Gyapay et al., 1994) are given in cM from the p telomere. As there are 6 instances of male to male transmission in the pedigree, the X chromosome was not analysed. Dr Kyall Zenger did the initial analysis of the data (LOD score all) and the multipoint analysis of chromosome 5; EK did all other analyses.
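The effect of a single apparent recombinant on a 2-point LOD score can be illustrated with a minimal sketch. It assumes the simplest textbook case of phase-known, fully informative meioses, which is not the pedigree likelihood computed by the linkage software used in this study; the function names are hypothetical, and the grid of recombination fractions is the one listed above.

```python
import math

# Recombination fractions at which the 2-point analysis was evaluated (from the text).
THETAS = [0.0, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.4]


def two_point_lod(recombinants, meioses, theta):
    """LOD score for phase-known, fully informative meioses (simplifying assumption).

    LOD(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ]
    where k is the number of recombinants among n meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Any recombinant is impossible at theta = 0, so the likelihood is zero.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2)
    log_likelihood_theta = (recombinants * math.log10(theta)
                            + non_recombinants * math.log10(1.0 - theta))
    log_likelihood_null = meioses * math.log10(0.5)
    return log_likelihood_theta - log_likelihood_null


def lod_profile(recombinants, meioses):
    """LOD score at each recombination fraction in THETAS."""
    return {theta: two_point_lod(recombinants, meioses, theta) for theta in THETAS}


if __name__ == "__main__":
    # Hypothetical example: 10 meioses with 0 or 1 apparent recombinants.
    for k in (0, 1):
        profile = lod_profile(k, 10)
        best = max(profile, key=profile.get)
        print(f"recombinants={k}: max LOD {profile[best]:.2f} at theta={best}")
```

Under these assumptions, one misclassified carrier among ten meioses moves the maximum away from θ = 0 and lowers it considerably, which is the reason for re-running the scan with unaffected individuals coded as unknown.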
4.6.1 Chromosome 1
Marker D1S2878 was the only marker studied to give a result suggestive of linkage, with a LOD score of 2.03, just above the value of 1.9 proposed by Lander and Kruglyak as the threshold for suggestive linkage. Interestingly, when non-affected individuals are removed from the analysis the LOD score at this marker actually goes down to –1.23. Nonetheless, as the best candidate locus further study was warranted and additional markers closely flanking D1S2878 were genotyped. The results are in Appendix 1, Table A1.1b. Note that the map locations given in the key are from the Marshfield map, as two of the additional markers selected (D1S382 and D1S1679) are not listed on the Généthon map. This is important because D1S382 is at the same position on the genetic map as D1S2878, but has a slightly different map position listed in Table A1.1b. Given the results for these closely flanking markers, and the fact that I:1 was homozygous for D1S2878, linkage to this region can be regarded as excluded.
4.6.2 Chromosome 5
No mutation was found in NKX2-5, which is located on 5q at 180cM, in an affected family member. However, additional analysis was done on this chromosome because of the previously reported linkage (with no gene yet identified) on the short arm of chromosome 5. The results of multipoint linkage analysis are shown in Figure 4.2. Linkage to both loci is excluded.
Figure 4.2 Multipoint mapping of chromosome 5. The locations of NKX2-5 and D5S208, the marker at which maximum linkage was found by Benson et al (Benson et al., 1998), are indicated.
4.6.3 Chromosome 6
Mohl and Mayr (Mohl and Mayr, 1977) reported linkage of ASD to the HLA complex, which is at about 44cM on the Généthon map. As discussed above, it is difficult to assess their very brief report, and the finding has not been replicated in the 30 years since its publication. Although the data shown do not suggest linkage to 6p is likely in this family, the possibility has not been completely excluded.
4.6.4 Chromosome 7
TBX20 is located at 55.6cM, between markers D7S516 and D7S484. Given the low LOD scores at these markers, it is not surprising that no mutation was found in the coding regions of TBX20 in an affected family member.
4.6.5 Chromosome 8
GATA4 is located at 21.2cM, between D8S550 and D8S549. Given the low LOD scores in this region, it is not surprising that sequencing of the coding regions of this gene in an affected family member did not reveal a mutation.
4.6.6 Chromosome 12
TBX5 is located at 125cM, between D12S78 and D12S79. Given their very low LOD scores it is unlikely that this family is linked to TBX5. Clinically there is no evidence of even subtle limb changes in any family member, which is additional strong evidence against a role for TBX5 mutation in this family. Penetrance for limb anomalies is very high in Holt-Oram syndrome (Newbury-Ecob et al., 1996).
4.6.7 Chromosome 14
MYH6 is located at 10.97cM, close to D14S283. The low LOD score at this locus indicates a mutation in MYH6 is unlikely in this family.
4.6.8 Chromosome 15
The dominant ASD gene ACTC is located at 32cM, between markers D15S1007 and D15S1012. Given the linkage results for these markers it is unlikely that an ACTC mutation is responsible for the phenotype in this family.
4.7 Discussion
4.7.1 Linkage results
The size of the family reported here is, unfortunately, marginal at best for a successful mapping effort. In principle, mapping in an autosomal dominant pedigree with 10 affected individuals could yield a LOD score of 3. However, this relies on a fully informative marker, positioned at or very close to the gene of interest, and in practice analysis of a rather larger number of affected individuals is likely to be required (Anderson NH, 2002). Although there are unaffected individuals in the pedigree, they contribute relatively little linkage information, because their status is uncertain (Lander and Botstein, 1989). The exception is II:1, who is an obligate heterozygote on the basis of having an affected parent and an affected child. The decision by two affected family members (II:4 and III:4) not to participate in the study also reduced the likelihood of success.
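The figure of a LOD score of 3 from 10 affected individuals can be checked with a short calculation. This is a sketch under the idealised assumption that each affected individual contributes exactly one fully informative, phase-known meiosis with no recombination, so that each meiosis adds log10(2) to the LOD score at θ = 0; the function names are hypothetical.

```python
import math

# Contribution of one fully informative, non-recombinant meiosis at theta = 0.
LOD_PER_MEIOSIS = math.log10(2)  # approximately 0.301


def max_lod(informative_meioses):
    """Largest LOD score attainable from the given number of informative meioses."""
    return informative_meioses * LOD_PER_MEIOSIS


def meioses_needed(target_lod):
    """Smallest number of informative meioses whose maximum LOD reaches target_lod."""
    return math.ceil(target_lod / LOD_PER_MEIOSIS)


if __name__ == "__main__":
    print(f"Maximum LOD from 10 meioses: {max_lod(10):.2f}")
    print(f"Meioses needed for LOD 3:    {meioses_needed(3.0)}")
    # Losing two participants (as with II:4 and III:4) lowers the ceiling.
    print(f"Maximum LOD from 8 meioses:  {max_lod(8):.2f}")
```

Because ten meioses only just reach the conventional threshold, any loss of information (an uninformative marker, a non-participant or a misclassified individual) leaves the pedigree below it, which is why the family is described as marginal.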
Non-penetrance, as seen in II:1, may also have been a problem. If one of the individuals classed as unaffected was in fact heterozygous for the mutation but was non-penetrant for a cardiac phenotype or for MGP, he or she would have represented an apparent recombination event, even with a perfect marker, and would have lowered the LOD score obtained. Similarly, with a common class of malformations like CHD, there is the possibility of a phenocopy occurring – i.e. a member of the family having CHD for reasons unconnected with the familial mutation. This would have serious consequences for success of the linkage analysis. Re-analyzing the data with clinically unaffected individuals marked as “unknown” was an attempt to control for the possibility of non-penetrance in one or more of those people. Although this had a significant effect on many of the LOD scores obtained, none rose substantially above zero.
Although no definite linkage was detected, and the only suggestive locus (on chromosome 1) was not supported by additional fine mapping, the results of this linkage analysis do have something to tell us about the genetics of dominant ASD. All 6 of the known ASD genes were excluded, as was the only locus which has been convincingly mapped without the gene yet being identified (on 5p). Linkage to the HLA locus on 6p was not excluded but appears unlikely, and in any case the status of that linkage result is in doubt, as discussed above.
Thus, this family provides firm evidence for further genetic heterogeneity in dominant ASD. Including TBX5 and the locus responsible for this family’s CHD, there are at least 8 different dominant ASD loci. It seems likely that ultimately there will prove to be even more than this.
4.7.2 ASD and MGP
The association of ASD and MGP represents a new dominant ASD syndrome. Non-penetrance for MGP is common, with only 5 of 11 presumed mutation carriers (counting II:1) manifesting MGP. Non-penetrance has been observed in previous families with dominant MGP (Mrabet et al., 1991). Although CHD has not previously been reported in association with dominant MGP, ASD can easily be missed, as evidenced by the fact that several members of this family had their ASDs identified as a result of participation in the study, and it is possible that this disorder may sometimes present as a primarily ophthalmological phenotype.
4.7.3 Clefting
Individual III:7 had cleft lip and palate repaired as a child. All other family members had normal lips and palates on examination, and it is possible that this is a chance association rather than representing a part of the phenotype of this disorder. There have been two previous reports of an association between MGP and orofacial clefting. In one case, the father of a girl with MGP had cleft lip and palate (Brooks, 1987). In another, a boy with MGP had cleft lip, as did two of his six sibs (neither of whom was affected by MGP) (Awan, 1976). This raises the possibility that the cleft lip and palate seen in individual III:7 in the family reported here may not be a chance association, but could in fact be an uncommon feature of the disorder.
4.7.4 Future studies
The majority of the work reported here was completed in 2000 and 2001. Since then, there have been considerable technological advances in mapping techniques. Specifically, SNP microarray is now becoming cheap enough to be an accessible alternative to microsatellite markers, offering the advantage of very dense map coverage – ranging upwards from 10,000 markers, compared with the 382 microsatellite markers used in this study. This increases the chance of finding the “perfect marker” or combination of closely linked markers which would allow the maximum possible LOD score to be obtained from the pedigree. While the spacing of markers used here was dense enough to make double recombinants in between markers unlikely (another possible cause for the lack of success in establishing linkage) it still seems possible that repeating the whole genome screen at the density allowed by microarray technology may be the key to future success.
5. Cardiac atrial septal morphology and risk of patent foramen ovale in inbred laboratory mice
5.1 Introduction
Chapters 3 and 4 described studies of the Mendelian variants of ASD, which proved to be rare. In affected individuals without a family history of CHD, the chance of finding a mutation in one of the genes known to be associated with ASD is low. It is unlikely that similar studies of other such dominant genes will substantially advance our understanding of the causes of the majority of ASD.
An alternate approach is therefore required. This chapter and the next describe studies of an animal model of atrial septal dysmorphogenesis. Different strains of inbred laboratory mice have different susceptibilities to PFO, a point first demonstrated by Biben and colleagues (Biben et al., 2000). The Biben study formed the basis on which this study stands and will be discussed in detail in section 5.2, below. An important finding of that study was that there are features of the mouse atrial septal wall which correlate with the presence of PFO, with strain to strain variation in these quantitative traits being closely related to the risk of PFO in each strain. Biben and colleagues studied the length of the septum primum, or flap valve length (FVL). The traits foramen ovale width (FOW) and crescent width (CRW) were found in the course of this study to also be associated with the risk of PFO (descriptions of all of these traits are to be found in section 2.1.4.2, especially Figure 2.3).
While PFO itself is a binary trait, which offers relatively little power for mapping studies, the identification of quantitative traits which are associated with risk of PFO made a QTL mapping study a viable option. The underlying hypothesis of this study was that if QTL relevant to FVL, FOW and CRW could be mapped, these would also influence risk of PFO. Identification of the underlying genetic basis of any such QTL should provide insight into normal and abnormal morphogenesis of the atrial septal wall, and may have wider significance in the study of CHD.
Chapter 6 reports the identification of QTL relevant to PFO risk in two strains of inbred laboratory mice, QSi5 and 129T2/SvEms. For the QTL mapping study, 85 [QSi5 x 129T2/SvEms] F1 mice and 1437 F2 mice were dissected. Subsequently, an advanced intercross line (AIL) was established from the same parental strains and bred for 14 generations; 1003 mice from the F14 generation were dissected. These experiments generated a large amount of quantitative data relating to atrial septal wall morphology. This chapter presents analyses of those data, with a focus on the relationship between septal morphology and PFO. Independently from the QTL analysis, the morphological data provide insight into the relationships between PFO and each of FVL, FOW and CRW, as well as information about interaction between the traits and their relationship to other variables such as sex, body weight and heart weight. All dissections, measurements and statistical analyses including LOD score analysis presented in chapters 5 and 6 were done by EK.
5.2 The relationship between atrial septal morphology and PFO: previous work
The only previous investigation of the relationship between atrial septal morphology and incidence of PFO in laboratory mice was that of Biben and colleagues (Biben et al., 2000). As part of an exploration of the effects of heterozygous mutations of Nkx2-5 on the murine heart, Biben and colleagues noted that the incidence of PFO varies considerably between strains of inbred laboratory mice. Moreover, they found a relationship between measures of atrial septal morphology and strain-specific incidence of PFO. In particular, the mean FVL was inversely related to the incidence of PFO in a given strain. Mice heterozygous for Nkx2-5 mutations had shorter FVL and a higher incidence of PFO than wild-type mice of the same genetic background. There was a very strong correlation between these traits, with a correlation coefficient of –0.97 (Biben et al., 2000). The authors made the point that it was not clear whether there was a causal relationship between short FVL and high risk of PFO, or even in which direction such a causal relationship might act. Either the shortness of the septum primum might reduce the chance of the flap valve forming an effective seal, or the presence of the PFO might lead to increased blood flow which could place additional stress on a fragile structure, resulting in a shorter flap valve in postnatal life.
Mice heterozygous for the Nkx2-5 mutation were found to have other abnormalities, including (in 129T2/SvEms mice) a high incidence of ASD and some 17% having borderline ASD (only 6% had no PFO or ASD). In C57Bl/6 mice, 1.4% of wild-type mice had bicuspid aortic valve, compared with 11% of Nkx2-5+/- mice; and female heterozygotes had a mild but significant prolongation of the PR interval. There was also an increased incidence of atrial septal aneurysm (ASA) in heterozygotes.
This paper was a vital precursor to all of the mouse work described in this chapter and the next. It showed 1) a relationship between FVL and PFO; 2) that the mouse heart responds to Nkx2-5 haploinsufficiency in a similar way to the human heart, albeit with a milder phenotype in mouse than man; and 3) clear evidence of a genetic link between ASD and PFO. In short, the findings of Biben and colleagues validated a quantitative analysis of the mouse atrial septum, and particularly of PFO as a model for human ASD.
In addition to FVL, Biben and colleagues studied the width of the patent corridor behind the flap valve in cases of PFO, showing that this was wider in Nkx2-5+/- than in wild-type mice. As this can only be studied in mice with PFO, alternative measures were sought during initial dissections for this study, and FOW and CRW were identified as being worthy of investigation.
5.3 Selection and breeding of mice for study
The process of strain selection for the QTL mapping study reported in chapter 6 is described in section 2.1.5. The strains chosen for study, QSi5 and 129T2/SvEms, had extreme values for FVL, with QSi5 having FVL of 1.13±0.11mm (mean±SD), compared with 0.60±0.11mm for 129T2/SvEms. The difference between the strains, which is 4.8 times the standard deviation of the strains, is large enough that a QTL mapping exercise would be expected to have good prospects of success. By way of comparison, Kirkpatrick and colleagues (Kirkpatrick et al., 1998) identified 3 QTL for fecundity using just 41 informative markers across all chromosomes in an F2 resource of only 200 mice, where phenotypic means were separated by 4-5 SD. The study reported here was designed to use much larger numbers of mice and a denser marker map, suggesting that the prospects of success were excellent.
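The separation of the parental strains in standard deviation units follows directly from the FVL values quoted above. The sketch below reproduces that arithmetic; the pooled standard deviation is an assumption added here so that the same function could be applied to strains with unequal SDs, and the function name is hypothetical.

```python
import math


def separation_in_sd(mean_a, sd_a, mean_b, sd_b):
    """Difference between two strain means, expressed in pooled-SD units.

    The pooled SD is taken as the root mean square of the two SDs (an
    assumption of equal group sizes, made for illustration only).
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return abs(mean_a - mean_b) / pooled_sd


# FVL (mm, mean and SD) for QSi5 and 129T2/SvEms, as quoted in the text.
fvl_separation = separation_in_sd(1.13, 0.11, 0.60, 0.11)

if __name__ == "__main__":
    print(f"Parental strains differ by {fvl_separation:.1f} SD for FVL")
```

With equal SDs in the two strains the pooled value is simply 0.11 mm, so the result matches the 4.8 SD stated above.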
The F1 and F2 mice were bred by Dr Ian Martin as part of a mapping protocol, as described in sections 1.4.2 and 2.1.3.1. A total of 85 F1 mice were dissected early in the study, in order to gain an impression of overall dominance effects. At that stage, the decision to measure FOW and CRW had not yet been made and only FVL was measured.
Following the success of the QTL mapping project (Chapter 6), approaches to fine mapping were considered. One such approach is the use of an Advanced Intercross Line (AIL). The principles behind and generation of an AIL are described in section 2.1.3.2. The advantages of this approach include the ability to do fine mapping on multiple regions of interest simultaneously, and the ability to breed over a number of generations and only phenotype and genotype at the final generation. The gain in resolution of mapping diminishes with increasing generation number and after about 10 generations, the confidence interval (i.e. the size of the area of interest) is reduced only modestly for each additional generation (Darvasi and Soller, 1995). As it happened, it was not possible to devote the necessary time for phenotyping a large number of mice when the 10th generation of AIL mice were being born, and breeding (which was much less time-consuming than phenotyping) was thus continued until the 14th generation, when 1003 mice were dissected. These mice have yet to be genotyped (see chapter 8 for a discussion of future plans). However, the phenotype data generated by their dissection are of interest and are presented here. All mouse breeding, dissections and measurements for the AIL were done by EK.
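The diminishing return from additional AIL generations can be illustrated numerically. The sketch below assumes a simple scaling rule in which the confidence interval for a QTL in generation t is roughly 2/t of its width in the F2; this rule is an assumption adopted here for illustration rather than a formula stated in this chapter, and the function names are hypothetical.

```python
def relative_ci(generation):
    """Confidence interval width in generation F_t relative to the F2.

    Assumes the interval shrinks in proportion to 2/t (illustrative rule only).
    """
    if generation < 2:
        raise ValueError("an intercross line starts at the F2 generation")
    return 2.0 / generation


def gain_from_next_generation(generation):
    """Reduction in relative CI width achieved by breeding one more generation."""
    return relative_ci(generation) - relative_ci(generation + 1)


if __name__ == "__main__":
    for t in (2, 4, 6, 8, 10, 12, 14):
        print(f"F{t}: CI = {relative_ci(t):.3f} of F2 width, "
              f"next generation gains {gain_from_next_generation(t):.3f}")
```

Under this assumed rule most of the improvement is obtained by about the 10th generation, and the four further generations bred here would narrow the interval only a little more.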
5.4 Analysis of data from QSi5, 129T2/SvEms, and the [QSi5 x 129T2/SvEms] F1, F2 and F14 mice
In this section, data are presented from the parental strains used in the QTL mapping experiment, and from strains derived from crossing them at various generations. Wherever possible, all available data for each trait are analysed, but for the more complex comparisons, analyses are restricted to the subset of mice with complete data. Selection of mice for genotyping was based only on mice with complete data available. Data were missing for some mice for a variety of reasons, most commonly inadvertent damage to the atrial septum during dissection. Uncommonly, atypical anatomy made it impossible to do the full set of measurements. For example, occasionally a PFO would be observed with two separate outlets in the left atrial septal wall, making it possible to measure FOW, but not CRW or FVL as there would be two values for each of these.
5.4.1 Descriptive statistics
Table 5.1 shows the means and standard deviations for the continuous traits which were measured (FVL, FOW, CRW, body weight and heart weight) and the percentage of PFO in each set of mice. QSi5 mice had a low prevalence of PFO, associated with a long FVL, small FOW and large CRW when compared with 129T2/SvEms. They were also heavier and had correspondingly greater heart weight compared with 129T2/SvEms. Interestingly, the atrial septa of the F1 mice were strikingly similar to QSi5 mice, with no PFOs identified in 85 F1 mice, and with very similar FVL to QSi5 mice, although body weight and heart weight were intermediate between the two parental strains. This similarity may represent the effect of alleles with substantial dominance effects in the direction of the QSi5 phenotype. FOW and CRW were not measured for the F1 mice as the decision to include these measures had not been made at the time that these mice were dissected. The F2 mice were intermediate between the parental strains for %PFO (albeit closer to QSi5 than 129T2/SvEms), for FVL and for body weight. They were similar to QSi5 for FOW and heart weight, and similar to 129T2/SvEms for CRW. It is hard to interpret this information with confidence, but again, it is possible that this represents dominance effects.
In theory, the F14 mice should be very similar to F2 mice for all measures. In practice, although all the continuous variables for the F14 mice were within one standard deviation of the values for the F2 mice, there was a substantial difference in prevalence of PFO.
Of the F14 mice, 34% had PFO, double the value for the F2 mice (p<0.0005, chi square test). The most likely explanation for this is that during the breeding of successive generations of mice a significant amount of genetic drift occurred – i.e. the random loss of some alleles from the population. The experiment was structured to minimize the risk of this occurring, with the use of 48 pairs of mice and with great care taken to avoid inbreeding at each generation. Nonetheless, it is possible that the continuation of the experiment past the 10th generation contributed to this unwanted outcome - if this is indeed the explanation for the changes from the F2 to F14 generations. The reason this is undesirable is that there is a risk that at one or more of the QTL detected in the F2 mapping experiment (see chapter 6), the F14 mice have become fixed for one of the parental strain alleles, or at least there may be heavy skewing towards one of the parental strain alleles. If this has happened the experiment will be unable to confirm and refine any such QTL (if fixed) or will have reduced power to do so (if not fixed but skewed). The large number of QTL which were detected mean that it is unlikely that all of the QTL will have been affected by genetic drift. It is also possible that none have been affected, and that the difference in PFO prevalence between F2 and F14 results from genetic drift affecting QTL which were not detected in the original experiment. Other possible explanations for this difference, such as a change in environmental factors in the interval between the breeding and dissection of the F2 and F14 mice (several years), are less likely. Animal husbandry practices in the animal house did not change over the period of these studies.
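The comparison of PFO prevalence between the F2 and F14 mice can be sketched as a 2 x 2 chi-square test. The counts used below are hypothetical: they are reconstructed by applying the reported percentages (17% and 34%) to the total numbers of mice dissected (1437 and 1003), whereas the number of mice actually scored for PFO in each generation is not stated in this section. The test is computed without a continuity correction, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value). For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts reconstructed from the reported percentages and totals.
f2_total, f14_total = 1437, 1003
f2_pfo = round(0.17 * f2_total)
f14_pfo = round(0.34 * f14_total)

statistic, p_value = chi_square_2x2(
    f2_pfo, f2_total - f2_pfo,
    f14_pfo, f14_total - f14_pfo,
)

if __name__ == "__main__":
    print(f"chi-square = {statistic:.1f}, p = {p_value:.2e}")
```

With counts of this order the difference is far beyond the p<0.0005 level reported, so the conclusion does not depend on the exact denominators assumed here.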
Table 5.1: Characteristics of parental strains, F1, F2 and F14 mice
                  QSi5          129T2/SvEms   F1            F2            F14
n                 66            75            85            >1247*        >933*
PFO (%)           4.5           80            0             17            34
FVL (mm)          1.13±0.11     0.60±0.11     1.15±0.14     1.0±0.19      1.01±0.16
FOW (mm)          0.21±0.061    0.24±0.058    N/R           0.21±0.07     0.24±0.07
CRW (mm)          0.51±0.13     0.44±0.12     N/R           0.41±0.12     0.54±0.15
Body weight (g)   29.4±2.77     17.5±2.1      21.9±2.9      26.6±3.3      25.8±2.9
Heart weight (g)  0.21±0.024    0.14±0.021    0.16±0.028    0.21±0.033    0.18±0.028
Values for continuous traits are mean±SD.
*not all mice had complete data; see text
5.4.2 Relationships between the continuous traits
Tables 5.2 and 5.3 show the relationships between the continuous variables. In these tables, the numbers on the diagonals are the mean ± standard deviation, with the number of mice with complete data for that trait in brackets. To the right of the diagonal, each cell contains the correlation coefficient between the trait at the left of the row and the trait at the top of the column which intersect at that cell, with the p value in brackets.
Table 5.2 Basic statistical information and correlations for F2 mice

                      Body weight   Heart weight  Flap valve    Crescent      Foramen ovale
                                                  length        width         width
Body weight           26.635        0.744         0.133         0.086         0.096
                      ±3.284        (<0.0005)     (<0.0005)     (0.002)       (0.002)
                      (1437)
Heart weight                        0.20599       0.121         0.036         0.152
                                    ±0.03362      (<0.0005)     (0.209)       (<0.0005)
                                    (1436)
Flap valve length                                 3.3303        -0.034        -0.087
                                                  ±0.6024       (0.231)       (0.001)
                                                  (1373)
Crescent width                                                  1.3717        0.121
                                                                ±0.6237       (<0.0005)
                                                                (1247)
Foramen ovale width                                                           0.70216
                                                                              ±0.22785
                                                                              (1344)

Table 5.3 Basic statistical data and correlations for F14 mice

                      Body weight   Heart weight  Flap valve    Crescent      Foramen ovale
                                                  length        width         width
Body weight           25.801        0.640         0.100         0.117         0.007
                      ±2.877        (<0.0005)     (0.002)       (<0.0005)     (0.825)
                      (971)
Heart weight                        0.182         0.134         0.130         0.074
                                    ±0.0277       (<0.0005)     (<0.0005)     (0.023)
                                    (953)
Flap valve length                                 1.01          0.005         -0.284
                                                  ±0.1564       (0.890)       (<0.0005)
                                                  (933)
Crescent width                                                  0.545         0.117
                                                                ±0.146        (<0.0005)
                                                                (937)
Foramen ovale width                                                           0.2375
                                                                              ±0.074
                                                                              (1344)
It is not surprising that heart weight and body weight are strongly (and highly significantly) positively correlated in both F2 and F14 mice: a large mouse would be expected to have a large heart.
There are significant correlation coefficients between body weight and FVL (in F2 and F14), body weight and CRW (in both), body weight and FOW (in F2 but not F14), and between heart weight and FVL (in both), heart weight and crescent width (in F14) and heart weight and FOW (in both). However, all of these correlation coefficients are very low and these variables can be viewed as essentially independent. Figure 5.1, a scatterplot graphing FVL against heart weight in F2 mice, illustrates this point: although there is a highly significant correlation between these variables, the graph shows a lack of any clear pattern. One of the weaknesses of the correlation coefficient is that if there are very large numbers of observations, as in this study, there is a high chance of a statistically significant correlation being observed, which is unlikely to be of any biological significance.
Figure 5.1: Scatterplot of FVL vs heart weight in F2 mice
[Scatterplot: FVL on the y-axis (0.2 to 1.6) against heart weight on the x-axis (0.10 to 0.30)]
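The point about large samples can be made concrete with a small sketch. It takes the correlation between FVL and heart weight in F2 mice from Table 5.2 (r = 0.121) and computes an approximate two-sided p value together with the proportion of variance explained. The sample size of 1373 is an assumption (the number of mice with FVL data; the number of complete pairs may be slightly smaller), and the normal approximation to the t distribution is a simplification that is reasonable only for large n. The function name is hypothetical.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p value for a Pearson correlation r from n pairs.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) with a normal approximation to the
    t distribution, which is only appropriate when n is large.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


r, n = 0.121, 1373  # r from Table 5.2; n is an assumed number of complete pairs
p = correlation_p_value(r, n)
variance_explained = r * r
print(f"p = {p:.2g}, variance explained = {variance_explained:.1%}")
```

With these inputs the p value is far below 0.0005, yet the correlation accounts for less than 2% of the variance, which is why the scatterplot shows no visible pattern.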
Similarly, although several pairs of the measurements of greatest interest here, FVL, FOW and CRW, have correlation coefficients which are statistically significant, the size of the correlation coefficient is small in every instance. This suggests that these variables are likely to be under largely independent genetic control, a prediction supported by the results of the QTL mapping (see chapter 6).
5.4.3 Analysis of variance for factors affecting FVL, FOW and CRW in F2 mice
Analysis of variance (ANOVA) was used to study the relationships between the various traits, cardiac and noncardiac, recorded for each mouse. The General Linear Model (GLM) for ANOVA was used for all analyses of variance, because unlike other available models it does not require data to be perfectly balanced (e.g. equal numbers of male and female mice).
The full ANOVA computer output is shown below only for the more complex models with multiple terms. The results of ANOVA for single variables in the F2 mice were as follows. For FVL, sex (p=0.019), coat colour (p=0.04) and heart weight (p=0.003) all significantly affected FVL (note that here, and from here onwards in statistical discussion, the words “affected” and “effect” are used in the statistical sense and do not necessarily imply biological causation). For FOW, age at dissection (p<0.0005), coat colour (p=0.03), heart weight (p<0.0005) and week of dissection (in F2 but not F14) (p<0.0005) were significant. For CRW, sex (p=0.002), age (p<0.0005), week of dissection (in F2 but not F14 mice) (p<0.0005) and body weight (p=0.038) were significant, but adjusting for sex removed the effect of weight (p=0.09). Week of dissection was included to take into account the possibility of inconsistency in measurement technique, particularly with gains in experience over the course of the very large number of dissections performed in the study. Although no consistent trend towards larger or smaller values was observed over the course of the study, the effect was significant and it was considered prudent to include week of dissection as a covariate in the ANOVA for the F2 mice.
Table 5.4: Comparison between data for 129T2/SvEms mice with and without PFO

                   129T2/SvEms   129T2/SvEms   129T2/SvEms   P value#
                   (all)         with PFO      without PFO
n                  75            15            60            -
PFO (%)            80            -             -             -
FVL (mm)           0.60 ± 0.11   0.60 ± 0.11   0.61 ± 0.13   0.69
FOW (mm)           0.24 ± 0.058  0.25 ± 0.058  0.21 ± 0.052  0.05
CRW (mm)           0.44 ± 0.12   0.44 ± 0.13   0.40 ± 0.09   0.25
Body weight (g)    17.5 ± 2.1    17.4 ± 2.1    18.1 ± 2.0    0.27
Heart weight (g)   0.14 ± 0.021  0.14 ± 0.021  0.14 ± 0.019  0.60

# two tailed t-test
5.4.4 Relationship between PFO and the continuous variables
The low frequency of PFO among QSi5 and F1 mice makes it difficult to assess the effect of the measured traits on the risk of PFO in these mice. However, there were sufficient 129T2/SvEms mice without PFO to allow comparisons between groups. Table 5.4 shows data for 129T2/SvEms mice with and without PFO. For most of the measures, the results are very similar in mice with and without PFO. Only for FOW is there a marginally significant difference, of small magnitude (<1 standard deviation). Mice with PFO have slightly larger FOW than mice without PFO. Correction for multiple comparisons has not been done but would probably render this association non-significant. On the whole, however, these data suggest that comparisons between inbred strains are more likely to provide useful information than comparisons within strains.
In the F2 and F14 mice, there were sufficient mice both with and without PFO to allow statistical comparisons to be made. Analysis of variance was performed for each of the key variables: FVL, FOW and CRW.
The tables below are the ANOVA output produced by the statistical package Minitab V14 (Minitab Inc). In each table, DF = degrees of freedom, Seq SS = sequential sums of squares, Adj SS = adjusted sums of squares, and Adj MS = adjusted mean squares. Adjusted sums of squares are used because all factors are considered in the model, without dependence on model order. The F statistic is the ratio of between-group to within-group variance and is the basis on which the p value is calculated. The smallest p value reported by Minitab is 0.000, which is equivalent to <0.0005, as values greater than or equal to 0.0005 are rounded up to 0.001. In each case the model used includes the variables which were found to have a significant effect on PFO in single analyses.
5.4.4.1 Relationship between FVL and PFO
In the F2 and F14 mice, the relationship between FVL and PFO demonstrated by Biben and colleagues (Biben et al., 2000) was resoundingly confirmed. Biben and colleagues found that strains with short FVL had a high prevalence of PFO and strains with long FVL had a low prevalence of PFO. The same pattern was found in the F2 and F14 mice. Tables 5.5 and 5.6 show the results of ANOVA for FVL.
Table 5.5 Analysis of variance for FVL in F2 mice

Source      DF   Seq SS    Adj SS    Adj MS        F      P
HtWeight     1   0.5927    0.4635    0.4635    16.14  0.000
Sex          1   0.0004    0.0003    0.0003     0.01  0.925
Colour       2   0.1492    0.0511    0.0256     0.89  0.411
PFO          1   4.6488    4.6488    4.6488   161.91  0.000
Error     1322  37.9576   37.9576    0.0287
Total     1327  43.3488

Table 5.6 Analysis of variance for FVL in F14 mice

Source      DF   Seq SS    Adj SS    Adj MS        F      P
HtWeight     1   0.4105    0.3499    0.3499    19.35  0.000
Sex          1   0.0220    0.0297    0.0297     1.64  0.200
Colour       2   0.0012    0.0058    0.0029     0.16  0.852
PFO          1   5.6147    5.6147    5.6147   310.42  0.000
Error      927  16.7672   16.7672    0.0181
Total      932  22.8156
Note the extremely high F statistics in both F2 and F14 mice. The F statistics for heart weight, 16.14 in F2 and 19.35 in F14 mice, already yield p<0.0005. The F statistics for PFO, even after correction for other potentially significant factors, are very substantially larger (161.91 and 310.42 respectively), so the true p value for the significance of this effect must be <<0.0005. Figures 5.2 and 5.3 are histograms comparing the distribution of FVL in mice with and without PFO. For F2 and F14 mice, there is an approximately normal distribution of FVL lengths. There is overlap between the values for mice with and without PFO, but mice with PFO have generally shorter FVL than mice without PFO.
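The F statistics quoted here can be checked directly from the other columns of Tables 5.5 and 5.6, since each F value is the adjusted mean square for the term divided by the error mean square. The short sketch below does this for three entries; the function name is hypothetical and the numbers are copied from the tables.

```python
def f_statistic(adj_ss: float, df: int, error_ss: float, error_df: int) -> float:
    """F = (Adj SS / DF) / (Error SS / Error DF), i.e. Adj MS over the error mean square."""
    return (adj_ss / df) / (error_ss / error_df)


# Values copied from Table 5.5 (F2 mice) and Table 5.6 (F14 mice)
f_pfo_f2 = f_statistic(4.6488, 1, 37.9576, 1322)
f_pfo_f14 = f_statistic(5.6147, 1, 16.7672, 927)
f_heart_f2 = f_statistic(0.4635, 1, 37.9576, 1322)
print(round(f_pfo_f2, 2), round(f_pfo_f14, 2), round(f_heart_f2, 2))
```

The recomputed values agree with the tabulated F statistics to within rounding of the printed sums of squares.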
Figure 5.2: Histogram of FVL in F2 mice with and without PFO
[Histogram: number of mice (y-axis) against FVL in mm (x-axis); series: with PFO, without PFO]

Figure 5.3: Histogram of FVL in F14 mice with and without PFO
[Histogram: number of mice (y-axis) against FVL in mm (x-axis); series: with PFO, without PFO]
In the F2 mice there were FVL values available for 1328 mice. Of the top decile for FVL (i.e. the 10% of mice with the longest FVL), 1/133 had PFO. Of the bottom decile, 62/133 mice had PFO. There were similar figures for the F14 mice. There were 933 FVL values, and of the top decile 1/93 had PFO, compared with 73/93 in the bottom decile. The somewhat higher proportions in F14 than in F2 mice reflect the overall higher prevalence of PFO in the F14 than F2 mice, as discussed above.
5.4.4.2 Relationship between FOW and PFO
FOW as defined for this study was not investigated by Biben and colleagues, and there have been no other reported studies of FOW. As a measure, it too proved to have a very strong relationship with PFO, with larger FOW being associated with a higher risk of PFO and vice versa. Tables 5.7 and 5.8 are ANOVA results for FOW.
Table 5.7: Analysis of variance for FOW in F2 mice

Source      DF    Seq SS    Adj SS    Adj MS       F      P
HtWeight     1  0.146315  0.160581  0.160581   40.01  0.000
Week         1  0.050327  0.053148  0.053148   13.24  0.000
Age         18  0.244240  0.237463  0.013192    3.29  0.000
Colour       2  0.044954  0.031331  0.015666    3.90  0.020
PFO          1  0.480393  0.480393  0.480393  119.68  0.000
Error     1304  5.234077  5.234077  0.004014
Total     1327  6.200306

Table 5.8: Analysis of variance for FOW in F14 mice

Source      DF    Seq SS    Adj SS    Adj MS       F      P
HtWeight     1  0.028424  0.044196  0.044196   11.42  0.001
Age         14  0.253185  0.107972  0.007712    1.99  0.016
Colour       2  0.003786  0.002914  0.001457    0.38  0.686
PFO          1  1.308869  1.308869  1.308869  338.13  0.000
Error      923  3.572821  3.572821  0.003871
Total      941  5.167085
Figure 5.4: Histogram of FOW in F2 mice with and without PFO
[Histogram: number of mice (y-axis) against FOW in mm (x-axis); series: with PFO, without PFO]

Figure 5.5: Histogram of FOW in F14 mice with and without PFO
[Histogram: number of mice (y-axis) against FOW in mm (x-axis); series: with PFO, without PFO]
As for FVL, there is a very highly significant relationship between FOW and PFO. Figures 5.4 and 5.5 are histograms comparing the distribution of FOW in mice with and without PFO. Week of dissection is not included as a variable for ANOVA of F14 mice because of its lack of significant effect on FOW in the single variable analysis of these mice (the same applies for CRW).
While the pattern of results is a little less striking than for FVL, it is similar in essence. There is an approximately normal distribution of results for mice with and without PFO, and the distributions overlap. The relationship is the opposite to that seen for FVL in that larger values for FOW are associated with PFO. Considering the upper and lower deciles, for the F2 mice there were values available for 1328 mice. In the top decile, 55/133 mice had PFO; in the bottom decile 4/133 mice had PFO. For the F14 mice, there were 942 values available. 87/94 mice in the top decile and 8/94 in the bottom decile had PFO.
In summary, FOW appears to be nearly as good a predictor of PFO as FVL. These analyses led to the decision to select mice for genotyping in the QTL study on the basis of both FVL and FOW.
5.4.4.3 Relationship between CRW and PFO
CRW was not studied by Biben and colleagues and there have been no other reported studies of this measurement. Tables 5.9 and 5.10 are ANOVA results for CRW.
Table 5.9: Analysis of variance for CRW in F2 mice

Source     DF    Seq SS    Adj SS   Adj MS      F      P
Week        1   1.11120   1.06793  1.06793  81.93  0.000
Weight      1   0.50019   0.31863  0.31863  24.45  0.000
Sex         1   0.00033   0.00099  0.00099   0.08  0.783
Age        18   0.58072   0.48260  0.02681   2.06  0.006
PFO         1   0.47049   0.47049  0.47049  36.10  0.000
Error    1205  15.70660  15.70660  0.01303
Total    1227  18.36954

Table 5.10: Analysis of variance for CRW in F14 mice

Source     DF    Seq SS    Adj SS   Adj MS      F      P
Weight      1   0.27177   0.19664  0.19664   9.62  0.002
Sex         1   0.10741   0.08047  0.08047   3.94  0.048
Age        14   0.62692   0.61444  0.04389   2.15  0.008
PFO         1   0.06206   0.06206  0.06206   3.04  0.082
Error     919  18.78996  18.78996  0.02045
Total     936  19.85811
In the F2 mice, there was a strong association between CRW and PFO, although this was much less pronounced than for FVL and FOW. Interestingly, in the F14 generation the association appears to have been lost with p = 0.08 in this analysis. This may be the result of the effects of genetic drift, as discussed above in relation to the increased prevalence of PFO in the F14 mice. Figures 5.6 and 5.7 are histograms comparing the distribution of CRW in mice with and without PFO.
Figure 5.6: Histogram of CRW in F2 mice with and without PFO
[Histogram: number of mice (y-axis) against CRW in mm (x-axis); series: with PFO, without PFO]

Figure 5.7: Histogram of CRW in F14 mice with and without PFO
[Histogram: number of mice (y-axis) against CRW in mm (x-axis); series: with PFO, without PFO]
Although the values for mice with and without PFO are once again approximately normally distributed, the distinction between mice with and without PFO is much more subtle. For the F14 mice the distributions are not evidently different, consistent with the finding on ANOVA of no significant association between CRW and PFO. For the F2 mice, the distribution for mice with PFO is somewhat shifted to the right relative to the distribution for mice without PFO, but this is much less clear-cut than for FVL and FOW. Considering the top and bottom deciles as for FVL and FOW, in the F2 mice there is a clear distinction. Of mice in the top decile (high CRW) 42/123 had PFO, compared with 12/123 in the bottom decile (from 1228 mice with data available). This is a highly significant difference (p<0.0005, chi square test). By comparison, for the F14 mice, 33/94 from the top decile had PFO, compared with 24/94 from the bottom decile (from 937 mice with data available). This is not a statistically significant difference (p=0.15, chi square test).
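The decile comparisons above are 2 x 2 tables (top or bottom decile against PFO present or absent), so the chi square tests can be sketched in a few lines. The version below is an illustration only: it assumes the Pearson statistic without continuity correction, which may or may not be the exact variant used for the reported p values, and the function name is hypothetical. With one degree of freedom the p value can be obtained from the complementary error function, so no statistical library is needed.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi square (no continuity correction) and p value for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi square with 1 degree of freedom
    return chi2, p


# Rows: top decile, bottom decile for CRW; columns: with PFO, without PFO
chi2_f2, p_f2 = chi_square_2x2(42, 123 - 42, 12, 123 - 12)
chi2_f14, p_f14 = chi_square_2x2(33, 94 - 33, 24, 94 - 24)
print(f"F2:  chi2 = {chi2_f2:.2f}, p = {p_f2:.2g}")
print(f"F14: chi2 = {chi2_f14:.2f}, p = {p_f14:.2g}")
```

Under this assumption the F2 comparison gives a p value well below 0.0005 and the F14 comparison a p value of about 0.15, in line with the figures quoted in the text.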
Based on these data, CRW is a much weaker predictor of PFO than either FVL or FOW, and F2 mice were not selected for genotyping based on CRW. The strength of the association in the F2 mice did suggest that it may be possible to map QTL for CRW, and analyses were done using the genotypes of mice selected on the basis of FVL and/or FOW (discussed in Chapter 6).
5.4.5 Biological significance of the relationships between FVL, FOW and CRW
These figures demonstrate a strong relationship between FVL and PFO (confirming the findings of Biben and colleagues (Biben et al., 2000)), between FOW and PFO, and to a lesser extent between CRW and PFO. The biological bases for these associations are unknown. It is tempting to speculate that a long flap valve presents greater opportunity for fusion, reducing the chance of PFO, and that a short flap valve barely covers the foramen ovale, increasing the likelihood of PFO. For FOW, it is possible that a wide foramen ovale results in greater blood flow across the septum in early postnatal life, inhibiting fusion of the flap valve; or perhaps a wide foramen ovale results in a wider channel which is less likely to close simply because there is more of it to close off. Likewise, it is possible that a wide crescent reflects the width of the channel in prenatal life. However, there is at present no direct evidence for any of these ideas. It is hoped that identification of the genetic bases for the QTL identified in chapter 6 will provide evidence for one or more of these alternatives, or will suggest an entirely different mechanism.
6. Quantitative trait loci modifying cardiac atrial septal morphology and risk of patent foramen ovale in inbred laboratory mice
6.1 Introduction
Chapter 3 described an hypothesis-driven approach to understanding the genetics of ASD. The hypothesis can be stated as follows: mutations in genes associated with rare dominant forms of ASD, or which are known to be important during development of the heart, may be responsible for a significant proportion of ASD. In chapter 4, an effort was made to map an additional ASD gene. If this had been successful, it would in turn have been logical to screen affected individuals, as was done with NKX2-5, GATA4, and TBX20. However, the results of this and similar studies suggest that this kind of candidate gene approach identifies causative mutations in only a small proportion of cases, and in most of them the mechanism appears to be autosomal dominant inheritance rather than a contribution to multifactorial causation. A possible exception to this, as discussed in section 3.2.3, is the work of McElhinney and colleagues (McElhinney et al., 2003), which does suggest a possible role for NKX2-5 in the multifactorial causation of ASD. The GATA4 polymorphism S377G (see section 3.3.3) may represent another exception to this rule. Nonetheless, these genes are likely to be relatively minor contributors to the great bulk of CHD, and ASD in particular.
It follows that a non-hypothesis driven approach is required if we are to understand the genetic basis of common forms of CHD. One option might be to conduct a whole-genome association study, studying very large numbers of affected individuals and unaffected controls with markers densely spaced through the genome (Carlson et al., 2004). While this is a powerful study design and has the advantage of working directly with human disease, it remains dauntingly expensive, and for this reason, in part, most studies to date have been grossly under-powered. Even with continued reductions in the cost of genotyping with improved technology, the sample sizes required for such studies run to thousands of cases and controls (Wang et al., 2005).
QTL mapping in laboratory animals represents an approach which is non-hypothesis-driven, allows ready generation of large sample sizes in a way which is not possible in humans, and (compared with whole-genome association studies) is relatively inexpensive. For this reason it has been widely used for study of quantitative cardiovascular risk factors such as hypertension and hyperlipidemia (as well as numerous other clinically relevant quantitative traits) (Mashimo et al., 2007; Redina et al., 2006; Ueno et al., 2003).
These studies have shown the relevance of the animal models to human disease. In QTL studies in humans of hypertension, a major cardiovascular disease risk factor, 30 of 36 QTL regions identified were predicted by work in rats or mice (Stoll et al., 2000; Cowley, 2006). This also points to the enormous genetic complexity which can emerge in the course of QTL studies. In 2006, Cowley identified 108 hypertension QTL in the human genome, and hundreds more in rodents (Cowley, 2006).
Stroke and myocardial infarction, like CHD, can be treated as binary traits: a patient either has one of these phenotypes or not. Although severity can be clinically graded, the scoring scales are not continuous and thus cannot be used for mapping with nearly the power afforded by QTL analysis of a continuous trait. Decades of intensive study into the causes of these problems have led to the identification of quantitative risk factors which allow a QTL mapping approach to be applied, not directly to the phenotype of ultimate interest but to a clinically important proxy, in accordance with the liability model for binary traits (Falconer, 1965) (see section 1.7.3). Until now, this has not been possible for CHD because of a lack of quantitative endophenotypes which can be studied. The work of Biben and colleagues (Biben et al., 2000), as discussed in sections 1.6.5 and 5.2, has identified endophenotypes influencing the risk of PFO, which in turn made this study possible.
6.2 Study design
As described in 2.1.3.1, the study used an intercross design with parental strains being represented by 129T2/SvEms males and QSi5 females, and the F1 mice being crossed to produce the F2 generation. The strains were selected on the basis of having extreme phenotypes for FVL and PFO. 129T2/SvEms has short FVL and high incidence of PFO, and QSi5 has long FVL and low incidence of PFO (see section 5.3). In addition, the strains have significantly different mean FOW (QSi5 < 129T2/SvEms) and CRW (QSi5 > 129T2/SvEms). QSi5 has the additional advantage of very high fecundity (average litter size 13.4) (Holt et al., 2004), an advantage for a study in which large numbers of animals are required.
6.3 Selection of mice for genotyping
A total of 1437 F2 mice were dissected. Complete data were available from 1328 of these. After correction for sex and week of dissection, the top and bottom deciles for FVL and FOW were selected for genotyping, amounting to 466 mice. Note that, as anticipated, there was a relatively small overlap between the mice selected on the basis of FVL and those selected on the basis of FOW: if there had been no overlap at all, 532 mice would have been genotyped. This was an expected result because of the low correlation between the two traits (see section 5.4.2). There was no selection on the basis of CRW values, because this trait had the weakest association with PFO of any of the three. However, taking advantage of the ability of MAPMAKER/QTL to deal with missing data, it was possible to perform QTL mapping for QTL influencing CRW, albeit with lower power than the mapping for FOW and FVL, because the genotyped mice were not selected because of extreme values for CRW.
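The selection arithmetic can be illustrated with a small sketch. It draws synthetic, independent FVL and FOW values for 1328 mice (the real measurements are not reproduced here, and the means and standard deviations are loosely based on Table 5.1), selects the top and bottom deciles of each trait, and counts the union. The correction for sex and week of dissection is omitted, and the function and variable names are hypothetical.

```python
import random


def extreme_deciles(values: list[float], fraction: float = 0.10) -> set[int]:
    """Return the indices of animals in the bottom and top `fraction` of values."""
    k = round(len(values) * fraction)
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:k]) | set(order[-k:])


random.seed(0)
n_mice = 1328
# Synthetic, independent trait values; assumptions for illustration only
fvl = [random.gauss(1.0, 0.19) for _ in range(n_mice)]
fow = [random.gauss(0.21, 0.07) for _ in range(n_mice)]

selected_fvl = extreme_deciles(fvl)
selected_fow = extreme_deciles(fow)
selected = selected_fvl | selected_fow
max_possible = len(selected_fvl) + len(selected_fow)  # 4 deciles of 133 mice
overlap = max_possible - len(selected)
print(f"{len(selected)} mice selected, {overlap} chosen on both traits")
```

With 1328 mice each decile contains 133 animals, so four deciles give the maximum of 532 quoted above; the 466 mice actually genotyped imply that 66 animals were extreme for both traits.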
The decision to control for sex and week of dissection was based on the significant statistical effects each of these had on the risk of PFO (see section 5.5.4). Heart weight (p=0.003) and coat colour (p=0.04) also affected FVL. Age at dissection (p<0.001), coat colour (p=0.03), and heart weight (p<0.001) affected FOW, and age (p<0.001) and weight (p=0.038) affected CRW. However, it was decided not to control for these factors. Although statistically significant, these effects (including the ones which were controlled for) were of small size. For example, mean male FVL was 1.01 mm, whereas mean female FVL was 0.98 mm, a difference of 0.03 mm, compared with an SD for FVL of 0.19. Adjustment for coat colour risked concealing the presence of a QTL that may be linked to coat colour genes, and was not done for that reason. Adjustment for heart weight could have masked QTL relevant to chamber morphology which also affect heart weight. Importantly, although there were substantial differences between heart weight and body weight in the parental strains, in the F2 mice there was no apparent direct relationship between PFO status and body weight or heart weight. Mice with PFO had a body weight of 26.85 ± 3.32 g (mean ± SD) and heart weight of 0.208 ± 0.034 g, and mice without PFO had a body weight of 26.61 ± 3.28 g and heart weight of 0.205 ± 0.033 g. Moreover, in F2 mice, there were only weak correlations between each of body weight and heart weight and FVL, FOW, and CRW (correlation coefficients, all <0.16).
6.4 Markers used
Eighty-nine markers were selected, spanning the mouse genome with an average intermarker distance of ~17 cM. As described in section 2.7.2, it was necessary to replace 7 of the initial markers selected because although they appeared to work well in the hands of the AGRF in evaluation, in practice they were unreliable. The final set of markers used (after replacement of the poorly performing markers) is listed in Appendix 2. Map positions, in cM, are from the Whitehead Institute database (Dietrich et al., 1996) (http://www.broad.mit.edu/cgi-bin/mouse/sts_info?database=mouserelease).
6.5 Linkage results
The figures and tables below show the results obtained using MAPMAKER/QTL as described in 2.9.2.2. Each chromosome was treated as an independent linkage group, and the phenotype data from all F2 animals with complete data were used for the purposes of estimating QTL effect sizes. In addition to the results from MAPMAKER/QTL, the figures include the results of a binary trait analysis performed by Dr Peter Thomson, using a program he wrote for this explicit purpose using the model described in 2.9.2.3. Results for the X chromosome have been calculated separately for males and females and are graphed separately. In each figure, open triangles indicate FVL; solid triangles, FOW; solid squares, CRW; and open diamonds, PFO (binary analysis). The y-axis represents LOD scores for that chromosome; the x-axis represents map distance. Note that the figures are NOT all to the same scale. The highest peak LOD score is almost 9 times the lowest peak, and the longest chromosome is nearly three times the length of the shortest. If all the graphs were to the same scale, either some very large figures or some very small ones would be required. Thus, each graph has been scaled to fit comfortably within half a page, although figures sharing a page are kept roughly in scale with one another as far as possible.
[Figures 6.1 to 6.21: LOD score (y-axis) against map distance in cM (x-axis), one plot per chromosome]
Figure 6.1: MMU1
Figure 6.2: MMU2
Figure 6.3: MMU3
Figure 6.4: MMU4
Figure 6.5: MMU5
Figure 6.6: MMU6
Figure 6.7: MMU7
Figure 6.8: MMU8
Figure 6.9: MMU9
Figure 6.10: MMU10
Figure 6.11: MMU11
Figure 6.12: MMU12
Figure 6.13: MMU13
Figure 6.14: MMU14
Figure 6.15: MMU15
Figure 6.16: MMU16
Figure 6.17: MMU17
Figure 6.18: MMU18
Figure 6.19: MMU19
Figure 6.20: MMUX (female mice)
Figure 6.21: MMUX (male mice)
6.6 Chromosomes with noteworthy findings
Using the significance thresholds of 2.8 for suggestive linkage and 4.3 for significant linkage proposed by Lander and Kruglyak (Lander and Kruglyak, 1995), there were a total of seven significant and six suggestive QTL identified in this study. The findings are summarised in tables 6.1a, 6.1b and 6.1c. The results for a number of the chromosomes deserve specific comment (see below).
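As an illustration of how these thresholds are applied, the sketch below classifies the peak FVL LOD scores reported in Table 6.1a. It covers FVL only, so its counts are a subset of the study-wide totals, and it assumes that a score equal to a threshold counts as meeting it. The function and variable names are hypothetical.

```python
SUGGESTIVE, SIGNIFICANT = 2.8, 4.3  # LOD thresholds from Lander and Kruglyak (1995)


def classify_lod(lod: float) -> str:
    """Label a peak LOD score; treating a score equal to a threshold as meeting it is an assumption."""
    if lod >= SIGNIFICANT:
        return "significant"
    if lod >= SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Peak FVL LOD scores by mouse chromosome, copied from Table 6.1a
fvl_lod = {6: 4.11, 8: 5.52, 10: 3.8, 13: 4.64, 15: 3.56, 18: 3.05, 19: 6.04}
calls = {chrom: classify_lod(lod) for chrom, lod in fvl_lod.items()}
n_significant = sum(call == "significant" for call in calls.values())
n_suggestive = sum(call == "suggestive" for call in calls.values())
print(calls)
print(n_significant, "significant,", n_suggestive, "suggestive")
```

For FVL this gives three significant loci (chromosomes 8, 13 and 19) and four suggestive loci (chromosomes 6, 10, 15 and 18).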
Table 6.1a: Loci with LOD score >2.8 for FVL
MMU chromosome 6 8 10 13 15 18 19
LOD score 4.11 5.52 3.8 4.64 3.56 3.05 6.04
Maximum binary LOD* 2.44 0.99 1.47 0.17 3.25 0.94 3.35
Maximum FOW LOD* 2.11 2.99
Maximum CRW LOD* 2.1
Position (cM) 59 60 68 10 24 11 9
Estimated physical location (Mb) 142 114 118 37 53 40 13
Confidence interval (Mb) 118-tel 108-tel 111-tel 6.5-48 21-90 cen-56 cen-46
% attributable phenotypic variance 2.4% 3.4% 1.8% 2.6% 2.1% 1.9% 3.5%