THE UNIVERSITY OF NEW SOUTH WALES Thesis/Dissertation Sheet

Surname or Family name: Moradi Marjaneh

First name: Mahdi Other name/s:

Abbreviation for degree as given in the University calendar: PhD

School: St Vincent's Clinical School Faculty: Medicine

Title: Genetics of Congenital Heart Diseases

Abstract:

Development of the cardiac atrial septum involves complex morphogenetic processes including programmed cell growth and death. Secundum atrial septal defect (ASDII) and patent foramen ovale (PFO) are common atrial septal anomalies associated with numerous pathologies including stroke. Data from studies in humans and mouse suggest that PFO and ASDII exist in an anatomical continuum of septal dysmorphogenesis with a common genetic basis.

Analysis of quantitative trait loci (QTL) and genome technology form a powerful approach to understand genetic complexity underpinning common disease. A previous study of inbred mice mapped QTL for quantitative anatomical atrial septal parameters correlating with PFO, including flap valve length (FVL) and foramen ovale width (FOW). Here, we explore an advanced intercross line (AIL) for confirmation and fine mapping of these QTL. An AIL between parental strains QSi5 and 129T2/SvEms, showing extreme values for FVL and PFO, was established over 14 generations. Linkage analysis using 141 single nucleotide polymorphism markers focused on 6 significant and one suggestive QTL regions for FVL or FOW found previously, and we also sought QTL for heart weight (HW) normalized to body weight (BW). Virtually all QTL were confirmed and refined and many new QTL were discovered, with analysis of PFO as a binary trait providing strong support. QTL were not explained by HW/BW differences between parental strains. The overlap between FVL and FOW QTL was striking, indicating many QTL affect processes relevant to both septum primum and septum secundum. This study provides a high-resolution picture of genetic complexity underpinning atrial septal variation in the mouse and predicts involvement of potentially hundreds of risk variants in humans.

Subsequently, high-throughput sequencing of the whole genomes of the parental lines, with analysis of HapMap datasets as a validation method and application of multi-step filtering criteria, resulted in identification of variants underlying QTL, and of candidates for involvement in phenotypic variation based on the predicted impact of sequence changes.

Mutations in multiple members of the evolutionarily conserved cardiac transcription factor network, including GATA4, cause or predispose to ASDII and PFO. Here, we assessed whether the most prevalent variant of the GATA4 gene, S377G, was significantly associated with PFO or ASD. Our analysis of world indigenous populations showed that GATA4 S377G was largely Caucasian-specific, and so subjects were restricted to those of Caucasian descent. To select for patients with larger PFO, we limited our analysis to those with cryptogenic stroke in which PFO was a subsequent finding. In an initial study of Australian subjects, a weak association between GATA4 S377G and PFO/Stroke was observed. However, in a follow up study of German Caucasians no association was found with either PFO or ASD. Analysis of combined Australian and German data confirmed the lack of association. Thus, the common GATA4 variant S377G is likely to be relatively benign in terms of its participation in CHD and PFO/Stroke.

Declaration relating to disposition of project thesis/dissertation

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.

I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).


The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.


COPYRIGHT STATEMENT

‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'


AUTHENTICITY STATEMENT

‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’


ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’


Genetics of Congenital Heart Diseases

MAHDI MORADI MARJANEH

A thesis in fulfilment of

the requirements for the degree of

Doctor of Philosophy

August, 2012

St Vincent's Clinical School

University of New South Wales

Dedication

To my dear wife and parents.

With my deepest love, gratitude, and respect.


Acknowledgment

During the course of my PhD, I have become indebted to many people for their generous support and help of many kinds. First and foremost I am grateful to my family, in particular my parents, for their unconditional love, never-ending support, and encouragement during all of my educational endeavours. I met my wife, Madinah Yate, when I was doing the last semester of my PhD. Since then my life has changed so much for the better. She has been a great emotional support and inspiration for me.

Academically, my greatest debts are, of course, to my supervisor, Prof. Richard Harvey, and co-supervisor, A/Prof. Edwin Kirk, for all their guidance, support, help, and encouragement.

I am eternally obliged to them for everything I learnt from them.

I am grateful to all the people at the Victor Chang Cardiac Research Institute (VCCRI) and, above all, the members of the Harvey lab, where I did my PhD, for their help in various ways and their friendship over the years. In particular, I would like to thank Dr. Naisana Asli for teaching me many lab techniques and helping me with any technical questions that I have had, Dr. Mirana Ramialison and Mrs. Tram Doan for teaching me and helping me with the bioinformatics techniques, and Dr. Gonzalo del Monte Nieto for teaching me to dissect mouse hearts and for providing ongoing help with the dissection of mouse atrial septums. I would like to address special thanks to Prof. Sally Dunwoodie, a member of my PhD committee, for her valuable feedback in the PhD reviews. There have been many others at VCCRI who helped me in various ways. I am especially grateful to Mr. Amirsalar Rashidianfar, Dr. Ashley Waardenberg, Ms. Bernice Stewart, Prof. Bob Graham, Mr. Brendan Lee, A/Prof. Catherine Suter, Ms. Dasha Syal, Dr. David Humphreys, Ms. Karen Brennan, Mr. Khai Do, Dr. Michael Swanton, Dr. Munira Xaymardan, Mr. Pardeep Dhiman, Mr. Paul Young, Dr. Reena Singh, Dr. Romaric Bouveret, Dr. Stella Lee, Mr. Timothy Kersten, and Mr. Travis Byron (listed in alphabetical order).


The QTL mapping studies could not have been done without the help and support of Prof. Chris Moran and A/Prof. Peter Thomson at the University of Sydney. Chris’s valuable advice on QTL mapping and his feedback as a member of my PhD committee, and Peter’s excellent skills in statistical genetics and linkage analysis, have greatly enhanced the content of the projects. I owe them many thanks. At the University of Sydney, I am also grateful to Prof. Claire Wade for her generous advice and Dr. Ian Martin for his contribution to the establishment of the AIL.

I would like to thank those who contributed to the GATA4 S377G study including Prof. Michael Feneley, Dr. Robyn Otway, and A/Prof. Diane Fatkin at VCCRI; Dr. Maximilian Posch and his research team in Berlin, Germany; A/Prof. David Winlaw, Dr. Tanya Butler, and Ms. Gillian Blue at the Children's Hospital at Westmead; A/Prof. Jeremy Martinson at the University of Pittsburgh, USA; and Prof. Lyn Griffiths at Griffith University.

I am also thankful to those who contributed to the study of the family with ASD and Marcus-Gunn phenomenon, first of all to Dr. Alan Ma, clinical geneticist at the Children’s Hospital at Westmead. This study was a collaboration with Alan under the supervision of A/Prof. Edwin Kirk, and also fulfilled Alan’s dissertation requirement. Ms. Glenda Mullan and the SEALS laboratory team helped us with DNA preparation and shipment. Exome sequencing was performed at Dr. Michael Bamshad’s laboratory in Seattle, USA. The variants identified were filtered by Dr. Tony Roscioli. Thanks to all of them.

Finally, I am deeply grateful to the University of New South Wales, which awarded me a scholarship, and to the National Health and Medical Research Council (NHMRC) for providing research grant support.

My sincere apologies to those whose names should be here but may have been unintentionally omitted, and thanks to them all.


Abstract

Development of the cardiac atrial septum involves complex morphogenetic processes including programmed cell growth and death. Secundum atrial septal defect (ASDII) and patent foramen ovale (PFO) are common atrial septal anomalies associated with numerous pathologies including stroke. Data from studies in humans and mouse suggest that PFO and ASDII exist in an anatomical continuum of septal dysmorphogenesis with a common genetic basis.

Analysis of quantitative trait loci (QTL) and genome technology form a powerful approach to understand genetic complexity underpinning common disease. A previous study of inbred mice mapped QTL for quantitative anatomical atrial septal parameters correlating with PFO, including flap valve length (FVL) and foramen ovale width (FOW). Here, we explore an advanced intercross line (AIL) for confirmation and fine mapping of these QTL. An AIL between parental strains QSi5 and 129T2/SvEms, showing extreme values for FVL and PFO, was established over 14 generations. Linkage analysis using 141 single nucleotide polymorphism markers focused on 6 significant and one suggestive QTL regions for FVL or FOW found previously, and we also sought QTL for heart weight (HW) normalized to body weight (BW). Virtually all QTL were confirmed and refined and many new QTL were discovered, with analysis of PFO as a binary trait providing strong support. QTL were not explained by HW/BW differences between parental strains. The overlap between FVL and FOW QTL was striking, indicating many QTL affect processes relevant to both septum primum and septum secundum. This study provides a high-resolution picture of genetic complexity underpinning atrial septal variation in the mouse and predicts involvement of potentially hundreds of risk variants in humans.

Subsequently, high-throughput sequencing of the whole genomes of the parental lines, with analysis of HapMap datasets as a validation method and application of multi-step filtering criteria, resulted in identification of gene variants underlying QTL, and of candidates for involvement in phenotypic variation based on the predicted impact of sequence changes.

Mutations in multiple members of the evolutionarily conserved cardiac transcription factor network, including GATA4, cause or predispose to ASDII and PFO. Here, we assessed whether the most prevalent variant of the GATA4 gene, S377G, was significantly associated with PFO or ASD. Our analysis of world indigenous populations showed that GATA4 S377G was largely Caucasian-specific, and so subjects were restricted to those of Caucasian descent. To select for patients with larger PFO, we limited our analysis to those with cryptogenic stroke in which PFO was a subsequent finding. In an initial study of Australian subjects, a weak association between GATA4 S377G and PFO/Stroke was observed. However, in a follow up study of German Caucasians no association was found with either PFO or ASD. Analysis of combined Australian and German data confirmed the lack of association. Thus, the common GATA4 variant S377G is likely to be relatively benign in terms of its participation in CHD and PFO/Stroke.


Publications arising from this work

1. Moradi Marjaneh M, Kirk EP, Posch MG, Ozcelik C, Berger F, Hetzer R, Otway R, Butler TL, Blue GM, Griffiths LR, Fatkin D, Martinson JJ, Winlaw DS, Feneley MP, Harvey RP. Investigation of association between PFO complicated by cryptogenic stroke and a common variant of the cardiac transcription factor GATA4. PLoS One. 2011; 6(6):e20711.

2. Moradi Marjaneh M, Martin ICA, Kirk EP, Harvey RP, Moran C, Thomson PC. QTL mapping of complex binary traits in an advanced intercross line. Animal Genetics. 2012; 43 Suppl 1: 97-101.

Published abstracts

1. Moradi Marjaneh M, Kirk EP, Doan TB, Thomson PC, Martin ICA, Moran C, Harvey RP. Genetic dissection of atrial septal abnormalities: integrating QTL mapping and genomic technology. European Human Genetics Conference 2012, June 23-26, Nuremberg, Germany. Published in: European Journal of Human Genetics. 2012; Vol. 20, Suppl. 1: page 223. (shortlisted for the best poster award)


Table of Contents

1 Introduction and literature review
1.1 Gene and genetic disorders
1.2 Inheritance of genetic disorders
1.2.1 Mendelian inheritance
1.2.2 Complex inheritance
1.3 Genetic mapping
1.3.1 Linkage analysis
1.3.2 Association study
1.3.3 From QTL to gene
1.4 Atrial septation
1.5 Fetal and neonatal circulation
1.5.1 Fetal circulation
1.5.2 Transition to neonatal circulation
1.6 Patent Foramen Ovale
1.6.1 Clinical importance of PFO
1.6.2 Genetics of PFO
1.7 Atrial septal defect
1.7.1 Anatomical subtypes of ASD
1.7.2 Pathophysiology and clinical features of ASD
1.7.3 Etiology of ASD
1.8 A link between ASD and PFO

2 Materials and Methods
2.1 Human studies
2.1.1 Study of association between PFO and GATA4 S377G
2.2 Animal studies
2.2.1 Fine mapping of QTL affecting atrial septal morphology using AIL
2.2.2 Whole genome sequencing of the AIL parental strains

3 Statistical methods for QTL mapping of complex binary traits in AIL
3.1 Introduction
3.2 Statistical methodology for mapping of complex binary traits
3.3 Application
3.4 Some extensions
3.5 Concluding remarks

4 Investigation of association between PFO complicated by cryptogenic stroke and a common variant of the cardiac transcription factor GATA4
4.1 Introduction
4.2 Results
4.2.1 Global distribution of GATA4 S377G
4.2.2 Australian study
4.2.3 German study
4.2.4 Pooled data
4.3 Discussion
4.4 Genetic study of a family with ASD and Marcus-Gunn phenomenon

5 Fine mapping of QTL affecting atrial septal morphology using an AIL
5.1 Introduction
5.2 Results
5.2.1 Phenotypes
5.2.2 Linkage results for atrial septal morphology
5.2.3 Linkage analysis for Heart Weight
5.3 Discussion

6 Identification of candidate underlying the AIL QTL using whole genome sequencing of the parental strains
6.1 Introduction
6.1.1 Haplotype mapping and QTL positional cloning
6.1.2 Use of deep sequencing to prioritize candidate genes for AIL QTL
6.2 Identification of the genetic variations between QSi5 and 129T2/SvEms mouse strains using whole genome sequencing
6.3 Filtering the genetic variations between QSi5 and 129T2/SvEms mouse strains
6.4 Genomic analysis of SNP density
6.4.1 SNP density analysis using data from the whole genome sequencing
6.4.2 Validation of the SNP density analysis using HapMap data
6.5 Discussion

7 Conclusions and future prospects

8 References

9 Supplemental materials
Supplementary Table 1: List of markers with physical and genetic location


1 Introduction and literature review

1.1 Gene and genetic disorders

The application of genetics to various aspects of human life has been growing rapidly during recent decades. Agriculturalists, for example, have used genetic techniques to improve crop yields in both quantity and quality. More important advances have been seen in medicine, where several genetic disorders have been characterized, leading to significant advances in diagnosis, prevention, and treatment. Although the basic concept of genetics - inheritance of traits from parents to the offspring - has been used for centuries, the modern science of genetics is still young. It was only about 150 years ago that Gregor Mendel performed his well-known experiments showing scientifically the transmission of biological characteristics from one generation to the next through hereditary units, which were later termed genes. Mendel’s observations established the basics of modern genetics.

Wilhelm Johannsen was the first to use the term "gene" as the unit of heredity in his influential book (Johannsen, 1909). He defined genes as “special conditions, foundations, and determiners which are present [in the gametes] in unique, separate, and thereby independent ways [by which] many characteristics of the organism are specified”. Since then, numerous advances in genetic research have dramatically revolutionized the concept of genes, although the field is still dynamic. According to a recent and widely accepted definition by the Human Gene Nomenclature Committee, a gene is “a DNA segment that contributes to phenotype/function and in the absence of demonstrated function a gene may be characterized by sequence, transcription or homology” (Wain et al., 2002). This definition emphasises the function of protein-coding genes, segments of DNA which encode a protein produced through transcription and translation. Gingeras updated the definition, taking into account new understanding of the complexity of the genome as well as the transcriptional system, including concepts of splicing and intergenic transcription (Gingeras, 2007). He also suggested using “transcript” instead of gene as a functional unit of the genome. According to the results from the human genome project, the human genome is estimated to harbor 25,000 protein-coding genes spanning only 2% of the genome sequence (2008). The rest of the genome, the so-called “non-coding” DNA, has also become important because of its several functions, including regulation of coding genes through cis- and trans-regulatory elements and contribution to the transcription of non-coding RNAs, RNA molecules which are not translated into proteins but have recently been found to have significant biological functions. In another recently proposed definition, the functional importance of non-coding DNA is taken into account and a gene is defined as “a locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions, and/or other functional sequence regions” (Pearson, 2006).

In parallel with the development of the gene concept, scientists have discovered that most if not all diseases have a genetic component. It is now clear that genes play roles in the onset or progress of the majority of human diseases, although the size of this contribution is highly variable from disease to disease. In some diseases, genetic factors do not play a major causative role but contribute to the pathogenesis, positively or negatively, through several mechanisms; in some infectious diseases, for instance, they modify the host susceptibility to the infectious agent. For example, some variations of the CCR5 gene have been shown to be protective against HIV/AIDS (Anderson and Akkina, 2007). On the other hand, a significant abnormality in the genetic material may play a key causative role in some diseases, termed genetic disorders. A genetic disorder in an individual may be inherited from parents or be considered "de novo" and caused by new mutations or changes in an individual’s DNA.

Alteration in the number or structure of chromosomes can cause chromosomal (cytogenetic) disorders. Chromosomal aneuploidy, which refers to a gain or loss of a segment or the entirety of a chromosome, is the most common chromosomal disorder in humans, affecting about 3 out of 1000 live births, with the most common specific anomalies being trisomy 21 and sex-chromosome trisomies (Hassold et al., 1996; Hassold and Hunt, 2001). An enormous amount of genetic material can be lost, gained or rearranged in a chromosomal disorder and therefore, if the patient survives, a spectrum of signs and symptoms representing involvement of multiple systems of the body may be seen.

Single-gene (Mendelian) disorders are caused by disruption of a single gene, mostly due to mutation. About 10,000 single-gene disorders have been identified in humans. Although each disorder may be rare by itself, the combined prevalence of all disorders is high (1 out of every 100 births) (WHO, 1996). Currently, OMIM (Online Mendelian Inheritance in Man), a frequently updated database of Mendelian traits and disorders in humans, contains 21266 entries including 13921 genes with known sequence and 153 genes with known sequence and phenotype (http://omim.org/statistics/entry, accessed on 12 June 2012). Although the number of entries has increased logarithmically during the last years, there are still 1772 Mendelian phenotypes or loci with unknown molecular basis.

For the majority of genetic disorders the etiology is not as straightforward as in chromosomal or single-gene disorders. Complex (polygenic or multifactorial) disorders are caused by multiple genes, possibly interacting with environmental factors. While they present the greater challenge to human geneticists, new mapping approaches help to elucidate several of their underlying features. Some of these approaches, including quantitative mapping, are discussed in detail in this thesis.

1.2 Inheritance of genetic disorders

1.2.1 Mendelian inheritance

As mentioned above, many human diseases are Mendelian, which means their inheritance follows Mendelian principles. In 1856, Gregor Mendel initiated his well-known 7-year experiment on cross-breeding of the garden pea (Pisum sativum) (Robin Marantz, 2009). He statistically analysed the data and, in particular, calculated the proportions of different genotypes in the progeny of parents with specific combinations of genotypes, known as Mendelian ratios (Mendel, 1965). Mendel reported his findings based on these ratios, although at that time they were ignored by other scientists. In 1900 his work was replicated and the findings were rediscovered; they are currently known as Mendel’s principles of inheritance and include the following concepts (Robin Marantz, 2009):

• Gene concept: Biological characteristics pass from parents to their children by individual units (termed “factors” by Mendel and later called “genes”). Genes come in pairs, one from each parent.

• Principle of dominance: Each gene has two alleles, with one having a dominant effect over the other.

• Law of segregation (Mendel’s first law): This describes the behaviour of chromosomes during meiosis, whereby the two alleles for the same trait become separated during gametogenesis and therefore each gamete receives only one allele.

• Law of independent assortment (Mendel’s second law): During gametogenesis, segregation of pairs of alleles for different traits occurs independently.
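The two laws above can be illustrated with a minimal sketch that enumerates all equally likely gamete pairings and counts offspring phenotypes. The function names (`gametes`, `phenotype`, `cross`) and the genotype encoding (one two-letter string per gene, upper case for the dominant allele) are assumptions made for this illustration and are not taken from the thesis.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Return all gametes for a genotype given as a list of allele pairs.

    Segregation: each gamete receives exactly one allele per gene.
    Independent assortment: alleles of different genes combine freely.
    """
    return ["".join(combo) for combo in product(*genotype)]


def phenotype(offspring):
    """Dominant phenotype (upper case) if at least one dominant allele is present."""
    return "".join(
        pair[0].upper() if any(a.isupper() for a in pair) else pair[0].lower()
        for pair in offspring
    )


def cross(parent1, parent2):
    """Count offspring phenotypes over all equally likely gamete pairings."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        offspring = [a + b for a, b in zip(g1, g2)]
        counts[phenotype(offspring)] += 1
    return counts


monohybrid = cross(["Aa"], ["Aa"])             # 3 dominant : 1 recessive
dihybrid = cross(["Aa", "Bb"], ["Aa", "Bb"])   # 9 : 3 : 3 : 1
```

Under complete dominance and unlinked genes, the monohybrid cross gives the 3:1 ratio and the dihybrid cross gives the 9:3:3:1 ratio; the deviations from these ratios are what motivated the amendments to Mendel’s principles.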

After Mendel’s principles were proposed, several genetic studies showed noticeable deviations from Mendelian ratios, and therefore some amendments had to be made.

• Multiple allele trait: Many traits are controlled by more than two alleles, such as blood group, which is controlled by the three alleles A, B, and O.

• Incomplete dominance and co-dominance: Neither of the two alleles for a specific trait may be completely dominant over the other, and therefore both alleles can be expressed in the phenotype (incomplete dominance). The two alleles may be equally strong and expressed equally in the phenotype (co-dominance).

• Linked genes: If two genes are located on the same chromosome within a short enough distance, they may behave as linked genes and segregate together. Therefore, Mendel’s second law can be applied to genes on non-homologous chromosomes or genes on homologous chromosomes located far enough from each other.

• Pleiotropy and epistasis: Many genes affect more than one trait (pleiotropy) and many traits are controlled by multiple genes, with some acting as modifiers of the others (epistasis) (Phillips, 2008). Moreover, environment may affect gene-gene and gene-phenotype interactions. Therefore, for some traits, instead of the simple pattern of single gene-single phenotype which was proposed by Mendel, a complex network of genes, phenotypes, and environmental factors must be applied.

• X-linked inheritance: According to Mendel’s principles, genes come in pairs, one originally from each parent. The sex chromosomes in males are an exception, as males carry only one copy of the X chromosome and consequently a single copy of its corresponding genes.
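As an illustration of the multiple-allele and co-dominance amendments, the sketch below enumerates the offspring blood groups for two parents. It assumes the standard textbook model of the ABO system, in which A and B are co-dominant and O is recessive to both; that dominance relationship, and the function names used here, are not stated in the text above and are included only for illustration.

```python
from collections import Counter
from itertools import product


def abo_phenotype(genotype):
    """Map a two-allele ABO genotype (e.g. "AO") to its blood group.

    Assumed model: A and B are co-dominant, O is recessive to both.
    """
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"


def offspring_phenotypes(parent1, parent2):
    """Return the expected proportion of each blood group among offspring."""
    counts = Counter(abo_phenotype(a + b) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Parents with genotypes AO and BO can have children of all four blood groups.
proportions = offspring_phenotypes("AO", "BO")
```

A cross between AO and BO parents yields the groups AB, A, B and O in equal proportions, a pattern that a two-allele model with simple dominance cannot produce.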

Despite these modifications, Mendel’s principles laid the foundation for modern genetics, and they can explain the inheritance not only of single-gene but also of polygenic disorders, as discussed later (Bowler, 1989). A main group of geneticists known as “Mendelians” apply these principles as the most typical features of genetic traits.

1.2.2 Complex inheritance

Qualitative traits (also known as discontinuous or binary traits) have a phenotype which can take one of a few different values. For example, the phenotype for blood type cannot take a continuous value and falls within a few different categories. Furthermore, a subdivision of qualitative traits known as Mendelian traits are controlled by a single locus and inherited in a straightforward Mendelian fashion. In contrast to the Mendelian traits, complex (polygenic or multifactorial) traits are ascribed to multiple loci, and are likely to be modified by environmental exposures or a combination of the two (Wu and Lin, 2006). Complex traits include quantitative traits, which show a continuously distributed phenotype within a range of extreme values (e.g. blood pressure), as well as a subdivision of qualitative traits which do not follow Mendelian inheritance, termed complex binary traits. As a single phenotype is affected by multiple loci, the effect of each single locus cannot be observed separately in the phenotype and therefore the inheritance pattern cannot be studied using Mendel’s ratios. However, inheritance of each single locus may follow Mendel’s principles and therefore extensions of Mendel’s principles may be applied to the complex inheritance of such traits, forming the basis of quantitative genetics.

In 1918 Fisher proposed his pedigree-based infinitesimal model, according to which genetic variation is controlled by a very high (approaching infinite) number of loci, each with a very small (infinitesimal) effect (Fisher, 1918). The basics of quantitative genetics developed later, extending this model from “pedigree” to “population”, according to which a complex trait is determined by a limited (not infinite) number of loci, with each locus having Mendelian characteristics (Falconer and Mackay, 1996). Loci may interact with each other to influence the trait. Every genetic concept described for Mendelian inheritance is applicable here, such as dominance, pleiotropy, epistasis, and linkage. Importantly, the phenotypic expression of each locus may be affected by non-genetic or environmental factors.
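The idea that many Mendelian loci of small effect add up to a continuously distributed trait can be sketched with a small simulation. This is only an illustrative additive model under stated assumptions: biallelic loci with equal effects, no dominance or epistasis, random mating, and an optional Gaussian environmental term. The function name and all parameters are hypothetical and do not come from the thesis.

```python
import random
import statistics


def simulate_trait(n_individuals, n_loci, allele_freq=0.5, env_sd=0.0, seed=0):
    """Simulate an additive polygenic trait.

    Each locus carries 0, 1 or 2 copies of a trait-increasing allele and
    contributes an equal share, so the expected trait value does not
    depend on the number of loci.
    """
    rng = random.Random(seed)
    effect = 1.0 / n_loci  # equal, small effect per locus
    phenotypes = []
    for _ in range(n_individuals):
        genetic = sum(
            effect * ((rng.random() < allele_freq) + (rng.random() < allele_freq))
            for _ in range(n_loci)
        )
        # Non-genetic (environmental) deviation added to the genetic value.
        phenotypes.append(genetic + rng.gauss(0.0, env_sd))
    return phenotypes


one_locus = simulate_trait(500, n_loci=1)      # only three phenotype classes
many_loci = simulate_trait(500, n_loci=100)    # near-continuous distribution
print(len(set(one_locus)), round(statistics.mean(many_loci), 2))
```

With a single locus the trait falls into at most three discrete classes, as for a Mendelian trait; with a hundred loci the same segregation rules give many closely spaced values, and adding environmental noise (`env_sd > 0`) blurs the remaining steps into a continuous distribution.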

Quantitative trait loci (QTL) mapping is a major division of quantitative genetics which focuses on localising the genetic or physical position of a QTL, a region of the genome which contains one or more genetic factors contributing to the variation of a quantitative trait. Details on QTL mapping, including different techniques and experimental designs, are discussed in 1.3.1.3.


1.3 Genetic mapping

Genetic mapping is the localisation of genetic elements on the genome, which can be performed through two broadly used approaches, linkage analysis and association study.

1.3.1 Linkage analysis

1.3.1.1 Basics

Linkage analysis is based on the concept of crossing over and recombination during meiosis. During prophase I of meiosis, homologous chromosomes exchange segments (crossing over), contributing to genetic recombination, which results in recombinant gametes. The probability of crossing over between two loci on the same chromosome is directly related to the genetic distance between them and is represented as the recombination fraction (θ), defined as the number of recombinant progeny divided by the total number of progeny.

Given two loci on a same chromosome if they are located far apart, the chance of crossing ove r b etween t hem a pproaches t o 100 % r esulting in half o f t he p rogeny being recombinant (θ  0.5). On the other side, it is very unlikely that crossing over occurs between two closely spaced loci (θ  0). The middle situation is when two loci a re l ocated f ar e nough t o a llow c rossing ove r t o ha ppen onl y i n a f raction of meioses and consequently the recombination fraction falls between 0 a nd 0.5. Thus genetic distance can be obtained by measurement of recombination fraction.

The unit of genetic distance is the centiMorgan (cM), defined as the distance between two loci when the recombination fraction between them is 0.01. Recombination fractions of 0 and 0.5 give genetic distances of 0 cM and 50 cM, respectively. In the human genome, 1 cM corresponds to approximately 1 million base pairs on average (Scott et al., 2004).
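As a minimal illustrative sketch (not part of the analyses in this thesis), the recombination fraction and the corresponding approximate genetic distance can be computed as follows. The function names and progeny counts are hypothetical, and the conversion assumes the simple proportionality of 1 cM per 1% recombination, which is reasonable only for closely spaced loci.

```python
def recombination_fraction(recombinants: int, total: int) -> float:
    """Return theta = recombinant progeny / total progeny."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need total > 0 and 0 <= recombinants <= total")
    return recombinants / total


def approx_distance_cm(theta: float) -> float:
    """Approximate map distance, assuming 1 cM per 1% recombination."""
    return 100.0 * theta


# Hypothetical cross: 12 recombinant progeny out of 200 scored.
theta = recombination_fraction(12, 200)
print(f"theta = {theta:.3f}, distance ~ {approx_distance_cm(theta):.1f} cM")
```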

1.3.1.2 Linkage analysis of Mendelian traits

1.3.1.2.1 Two-point linkage mapping

A pedigree or a population with affected individuals is genotyped for a set of genetic markers spread over the genome. The genetic markers should be informative, with a known specific location. Using the two-point mapping approach, recombination fractions between hypothetical trait loci and every single marker are analysed. For small and simple families the recombination fraction for a marker may be calculated manually, simply by counting recombinant and non-recombinant individuals. However, linkage analysis in large or complex families requires a more complex statistical method, referred to as the maximum likelihood technique, which gives two values as output for each marker: θmax and the logarithm of odds (LOD). The first represents the best estimate of the recombination fraction, or genetic distance, between the marker and the trait locus; the second determines whether the recombination fraction is significantly different from 0.5 (Haldane and Smith, 1947; Smith, 1953; Ott, 1974). A marker with a recombination fraction significantly different from 0.5 is said to show “linkage” to the trait locus. Given L(θ) and L(0.5) as the probabilities of the observed data in the pedigree when there is linkage and when there is no linkage (θ = 0.5), respectively, the likelihood ratio is calculated as L(θ)/L(0.5) and therefore gives the odds of linkage. To allow the evidence obtained from different pedigrees to be summed, linkage analysis uses the LOD, which is calculated as log10 [L(θ)/L(0.5)]. The best estimate of the recombination fraction, termed θmax, is the value of θ which gives the highest LOD. Computer programs calculate the LOD for each marker at several values of θ and, as output, list θmax and the corresponding LOD for each marker. The LOD is the base-10 logarithm of the odds of linkage and can therefore also be negative: while a positive LOD supports linkage, a negative LOD provides evidence against linkage. As standard criteria for single two-point mapping, LOD = 3 means that the observed data are 1000 times more probable under linkage than under no linkage and is taken as the threshold for significance, whereas LOD ≤ -2 significantly rejects linkage (Morton, 1955). Using the output data for each chromosome, we can map the order of the loci and the genetic distances between them. While two-point linkage mapping is often fast and easy to perform, the resulting map usually has low resolution, with wide confidence intervals for the locus position (Aston and Wilson, 1986). Besides, two-point linkage mapping is not very efficient when high-density markers are applied.
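The calculation of θmax and LOD can be sketched for the simplest case. This sketch assumes phase-known, fully informative meioses, so that L(θ) = θ^R (1 − θ)^(N − R) for R recombinants among N meioses; the function names and counts are hypothetical, and real pedigree programs must additionally handle unknown phase and missing data.

```python
import math


def lod_score(theta, recombinants, total):
    """LOD(theta) = log10[L(theta) / L(0.5)] for phase-known meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = total - recombinants
    log_l_null = total * math.log10(0.5)
    if theta == 0.0:
        # L(0) is zero as soon as a single recombinant is observed.
        return -math.inf if recombinants > 0 else -log_l_null
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    return log_l_theta - log_l_null


def theta_max(recombinants, total, steps=50):
    """Grid search over theta = 0, 0.01, ..., 0.5 for the highest LOD."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    best = max(grid, key=lambda t: lod_score(t, recombinants, total))
    return best, lod_score(best, recombinants, total)


# Hypothetical data: 2 recombinants among 20 informative meioses.
best_theta, best_lod = theta_max(2, 20)
print(f"theta_max = {best_theta:.2f}, LOD = {best_lod:.2f}")
```

With these hypothetical counts the highest LOD is found at θ = 0.10 and exceeds the threshold of 3, whereas the LOD at θ = 0.5 is zero by construction.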

1.3.1.2.2 Multipoint linkage mapping

Multipoint linkage mapping calculates the likelihood of linkage between the trait locus and a group of genetic markers with fixed positions. In other words, for each analysis, information on the transmission of a series of flanking markers is combined and the transmission of the trait locus is tracked accordingly. This provides the opportunity for more precise mapping, including calculation of likelihoods for inter-marker distances and better estimation of identical by descent (IBD) sharing (Terwilliger and Ott, 1994). Two or more alleles are considered IBD if they are identical (disregarding rare mutations which may occur during inheritance) in two related individuals and have been inherited from a common ancestor. Although multipoint mapping is in general a more powerful approach than two-point linkage mapping, it is sensitive to misspecification of the locus map order and/or inter-marker distances (Ott and Lathrop, 1987; Halpern and Whittemore, 1999). In summary, two-point and multipoint linkage mapping can complement each other, and the ideal approach in linkage mapping of a Mendelian trait is to run both on the same data and compare the two sets of output, taking into account the weaknesses of each approach.

1.3.1.3 Linkage analysis of quantitative traits (QTL mapping)

1.3.1.3.1 QTL mapping techniques

Unlike a Mendelian trait, which is controlled by a single locus and inherited in a straightforward Mendelian fashion, a complex trait is ascribed to multiple loci, each of which may harbour Mendelian properties, to environmental exposure, or, most likely, to a combination of the two. Analysis of such loci is beyond the capabilities of classical Mendelian techniques. However, it has become feasible using more recently developed QTL mapping techniques, which are mainly based on regression analysis and maximum likelihood.

1.3.1.3.1.1 Single marker analysis

Single marker analysis uses analysis of variance at each marker to identify phenotypic variation between classes of individuals with different homozygous genotypes. A significant difference is considered evidence for the presence of a QTL, and the corresponding marker indicates the estimated location of the QTL. The main limitation of single marker analysis is the potential confounding of QTL position and QTL effect. A difference in mean phenotype between the homozygous classes for a marker may be due to a QTL which is closely linked to the marker and has an effect equal to the observed difference, or to another QTL located further from the marker but with a larger effect (Falconer and Mackay, 1996). However, single marker analysis may not be able to distinguish between the two situations.
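The analysis of variance at a single marker can be sketched as below. This is a minimal illustration using only the Python standard library; the phenotype values and genotype classes are invented, and in practice the F statistic would be compared with the appropriate F distribution to obtain a P value.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical phenotypes grouped by homozygous genotype at one marker.
phenotype_MM = [1.0, 1.2, 0.9, 1.1]
phenotype_mm = [1.5, 1.7, 1.6, 1.4]
f_stat, df_b, df_w = one_way_f([phenotype_MM, phenotype_mm])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F at a marker indicates that the genotype classes differ in mean phenotype, but, as noted above, it does not by itself separate a nearby QTL of small effect from a more distant QTL of large effect.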

1.3.1.3.1.2 Interval mapping

Interval mapping confounds the position and effect of a QTL much less than single marker analysis does. Basically, it estimates the effect of a putative QTL within ordered pairs of markers termed intervals (Lander and Botstein, 1989). The putative QTL position is therefore fixed and moved incrementally along each interval. Maximum likelihood is the main statistical approach used to estimate the QTL effect. The likelihood ratio (LR) for a given position is defined as minus twice the natural logarithm of the ratio between the maximum probability (Lmax) of the observed data on the assumption that there is no QTL effect (reduced model) and the maximum probability of the observed data on the assumption that there is a QTL with additive and dominance effects (full model), summarized as below:

LR = -2 ln (Lmax (reduced model)/Lmax (full model))

For each interval the likelihood ratio is calculated at each position and for a set of parameters including QTL effect. Given the observed phenotypic and genotypic data, the best estimate of the QTL effect is the one which maximizes the likelihood ratio.

As for linkage analysis of Mendelian traits (see 1.3.1.2), LOD can be applied here. It is calculated as:

LOD = -log10 (Lmax (reduced model)/Lmax (full model))

It follows from this formula that LOD can be easily converted to LR by multiplying by 4.61 (that is, 2 ln 10). The output of interval mapping includes LR, LOD, and QTL effect (displayed as estimated phenotype means for allelic substitutions) at each position along the genome. A LOD which peaks above the threshold of significance indicates the presence and location of a QTL.
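The relationship between LR and LOD can be checked numerically. The sketch below assumes that the maximized natural-log likelihoods of the reduced and full models are already available from a fitting routine; the function names and numerical values are hypothetical.

```python
import math


def likelihood_ratio(loglik_reduced, loglik_full):
    """LR = -2 ln(Lmax(reduced) / Lmax(full)), from natural-log likelihoods."""
    return -2.0 * (loglik_reduced - loglik_full)


def lod_from_lr(lr):
    """Convert LR to LOD by dividing by 2 ln(10), approximately 4.61."""
    return lr / (2.0 * math.log(10.0))


# Hypothetical maximized log-likelihoods at one test position.
lr = likelihood_ratio(loglik_reduced=-120.0, loglik_full=-112.0)
print(f"LR = {lr:.2f}, LOD = {lod_from_lr(lr):.2f}")
```

Because the full model can never fit worse than the reduced model, both statistics are non-negative, and a larger value indicates stronger support for a QTL at the tested position.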

While interval mapping is the most commonly used approach in QTL mapping, it may ignore the effect of other QTL outside the interval. This effect is controlled for by extensions of interval mapping, including composite interval mapping and multiple interval mapping (Zeng, 1994; Kao et al., 1999).

1.3.1.3.2 Experimental designs for QTL mapping

Generally, in QTL mapping, a “population” is studied instead of a “pedigree”. Using an experimental design, a population is created in which QTL have the chance to segregate with their linked genetic markers, allowing the effect and genetic location of the QTL to be estimated.

1.3.1.3.2.1 Backcross and F2 intercross

Backcross and F2 intercross are the most popular QTL mapping designs. The parental inbred strains, which are homozygous at each locus, are crossed to generate the first generation (F1), which is consequently heterozygous at all loci at which the parental strains differ. The next step distinguishes backcross from intercross. In the backcross design, the F1 generation is crossed with one of the parental lines, while in the intercross design F1 individuals are crossed with each other. Given a marker locus for which the parental lines have the genotypes MM and mm, all F1 individuals have the Mm genotype at this locus. While the backcross progeny will show only two genotypes, Mm and MM (or mm), each with 50% frequency, the F2 intercross will have three genotypes: MM (25%), Mm (50%), and mm (25%). Therefore, in general, the backcross is easier to analyse while the intercross is more informative. However, the nature of the phenotype and the mode of inheritance determine which design is preferred. Consider the case of a disorder caused by a mutant allele. If the mutation is fully dominant or fully recessive, 75% of F2 individuals, phenotyped as affected or unaffected respectively, have a genotype that cannot be identified from the phenotype. In such cases the backcross is the design of choice as it provides more genetic information. On the other hand, the intercross is preferred when the mutation is codominant and therefore each of the three F2 genotype classes shows a different phenotype (Birren and Green, 1997).
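The expected genotype frequencies of the two designs can be reproduced by enumerating the equally likely gamete combinations at a single locus. The sketch below is purely illustrative; the function name is hypothetical and genotypes are written as two-letter strings.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_frequencies(parent1, parent2):
    """Expected single-locus genotype frequencies among offspring of a cross."""
    # Each parent transmits either of its two alleles with equal probability.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


f1 = "Mm"  # every F1 individual from an MM x mm cross
print("backcross:", offspring_frequencies(f1, "MM"))
print("intercross:", offspring_frequencies(f1, f1))
```

The backcross yields two genotype classes at equal frequency, while the intercross yields three classes in a 1:2:1 ratio, which is why the intercross carries more information per individual when all three classes can be distinguished.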

1.3.1.3.2.2 Outbred populations

QTL mapping of populations derived from inbred parental lines is relatively straightforward. However, such populations are designed and established by scientists and do not arise naturally. Outbred populations, including human pedigrees and livestock, can also be studied for QTL mapping, albeit with more complexity. For example, full-sib families (each consisting of offspring originating from the same outbred parents) are among the popular outbred populations. Because the parents are not inbred, they may be heterozygous at several loci, and therefore a maximum of four alleles may segregate at any given locus (unlike inbred populations, in which only two alleles segregate at each locus). Furthermore, it is generally not known which allele is from which parent. In other words, the so-called “phase” of linkage is unknown. To overcome such difficulties, one may perform linkage analysis for each parent separately to identify alleles inherited from each parent or both, and then use the two resulting genetic maps to construct a single genetic map for the pedigree (Grattapaglia and Sederoff, 1994).

1.3.1.3.3 Fine mapping of QTL

Increasing the number of recombination events in an experimental design increases the chance of crossing over between every two loci and therefore provides the opportunity for fine mapping of QTL or for splitting linked QTL. Fine mapping experiments are designed to accumulate recombination events in the last generation. Genotyping the last generation with high-density markers may provide a genetic map with higher resolution.

1.3.1.3.3.1 Recombinant inbred line

The recombinant inbred line (RIL), first developed by Bailey (Bailey, 1971), is established from an F2 generation by several successive generations of self-fertilization (selfing) or sibling mating. As a result, a new line is produced whose genome is a mosaic of the parental lines and is homozygous at each locus (inbred). Given genotyping with high-density markers, RILs established by selfing and sibling mating can improve the resolution of a genetic map up to twofold and fourfold, respectively (Haldane and Waddington, 1931; Williams et al., 2001). Not only are RILs used in fine mapping studies, they may also be applied in original genome scans. Besides, genotyping data obtained from a single RIL can be used to study several phenotypes. However, RILs are not always the method of choice. While the application of such lines is limited to mapping purposes, establishing them is difficult, expensive, and time-consuming, and their maintenance is also costly.
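The twofold and fourfold figures can be illustrated with the relations usually attributed to Haldane and Waddington (1931), which are stated here as an assumption rather than taken from the text above: R = 2r/(1 + 2r) for selfing and R = 4r/(1 + 6r) for sibling mating, where r is the recombination fraction per meiosis and R is the proportion of recombinant lines at fixation. The function names and the chosen values of r are hypothetical.

```python
def ril_recombinant_fraction_selfing(r):
    """Proportion of recombinant RILs after inbreeding by selfing."""
    return 2.0 * r / (1.0 + 2.0 * r)


def ril_recombinant_fraction_sib(r):
    """Proportion of recombinant RILs after inbreeding by sibling mating."""
    return 4.0 * r / (1.0 + 6.0 * r)


for r in (0.001, 0.01, 0.1):
    expansion_self = ril_recombinant_fraction_selfing(r) / r
    expansion_sib = ril_recombinant_fraction_sib(r) / r
    print(f"r = {r}: selfing x{expansion_self:.2f}, sib mating x{expansion_sib:.2f}")
```

Under these relations the map expansion approaches twofold and fourfold only for tightly linked loci and declines as r increases, consistent with the "up to" wording above.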

1.3.1.3.3.2 Advanced intercross line

As an alternative method, Darvasi and Soller proposed the advanced intercross line (AIL), in which repeated random intercrossing of two inbred lines for at least 10 generations results in the accumulation of recombination events up to the last generation (Darvasi and Soller, 1995). As with RILs, a set of markers of sufficient density is required. AILs have the potential to narrow the resolution of genetic maps to about 1 cM. They are more economical and less time-consuming than RILs.


However, unlike RILs, which can be maintained permanently, AILs are transient, and each line can generally be used to analyse only one phenotype. In addition, AILs should be controlled for random genetic drift. Successive generations of intercrossing are at risk of accumulating random genetic drift, which may cause fixation of markers or alleles, leading to loss of genetic information.
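The risk posed by random genetic drift can be illustrated with a simple simulation. The sketch below assumes an idealised Wright-Fisher population of constant size with random mating and no selection, which is a simplification of any real breeding scheme; the population size, number of generations, and function names are hypothetical.

```python
import random


def simulate_drift(n_individuals, generations, p0=0.5, seed=0):
    """Allele frequency trajectory at one locus under pure random drift."""
    rng = random.Random(seed)
    n_alleles = 2 * n_individuals  # diploid population
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each allele copy in the next generation is drawn from the current pool.
        count = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = count / n_alleles
        trajectory.append(p)
    return trajectory


def fraction_fixed(n_individuals, generations, n_loci=200, seed=0):
    """Fraction of independent loci that have lost one allele by the end."""
    fixed = 0
    for locus in range(n_loci):
        final = simulate_drift(n_individuals, generations, seed=seed + locus)[-1]
        fixed += final in (0.0, 1.0)
    return fixed / n_loci


# Hypothetical comparison of a small and a larger breeding population.
for size in (10, 100):
    print(size, "breeders:", fraction_fixed(size, generations=14))
```

In such a model, smaller breeding populations lose alleles at more loci over the same number of generations, which is why the effective population size of an AIL needs to be kept as large as practical.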

1.3.2 Association study

Unlike linkage analysis, which detects co-segregation of loci in a pedigree, an association study looks for co-occurrence of alleles in the fraction of a population which presents a specific phenotype. For example, to find the genetic cause of a human disease, a population is genotyped for a candidate gene and the frequency of genetic variants is compared between affected and unaffected classes. Association is based on linkage disequilibrium, in which the co-occurrence of two or more alleles within a population differs statistically from the values expected under random distribution. Even if an association study does not identify a causative variant, it may find loci which are linked to the causative gene(s). The genetic map obtained from the association approach is more precise than linkage maps.
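Linkage disequilibrium between two biallelic loci is commonly summarised by the coefficient D, the difference between the observed haplotype frequency and the frequency expected under independence, and by the squared correlation r². The sketch below uses these standard definitions, which are not given in the text above; the function name and the frequencies are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for alleles A and B at two biallelic loci.

    p_ab: observed frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # zero when alleles co-occur at random
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    r_squared = d * d / denominator if denominator > 0 else float("nan")
    return d, r_squared


# Hypothetical frequencies: the A-B haplotype is over-represented.
d, r2 = linkage_disequilibrium(p_ab=0.40, p_a=0.5, p_b=0.5)
print(f"D = {d:.2f}, r^2 = {r2:.2f}")
```

A value of D close to zero means that the two alleles co-occur about as often as expected by chance, in which case a marker carries little information about a nearby causative variant.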

Genome-wide association studies (GWASs) are large-scale extensions of association studies in which genomic DNA is genotyped for a set of high-density markers (usually SNPs) covering the whole genome. The GWAS approach has more power than linkage analysis to detect genetic variants of modest effect (Risch and Merikangas, 1996). The idea of GWAS was complemented by the completion of the Human Genome Project and the International HapMap Project providing high SNP density arrays for . Association studies have reported several genetic variants contributing to human diseases, although in the initial phase many of them failed to be replicated (1999). The results obtained from association studies may be affected by several factors, including chance, bias and confounding variables, leading to a susceptibility to false negatives and false positives (Campbell and Rudan, 2002).

1.3.3 From QTL to gene

While identification of the QTL controlling complex traits is a major step toward their genetic dissection, the ultimate purpose of every QTL mapping study is to pinpoint the underlying candidate genes and then to provide sufficient proof to validate their contribution to the phenotype. Two broadly different approaches can be applied to track down the genes underlying a QTL: positional cloning and the candidate gene approach, which are based on the concepts of linkage and association, respectively.

Positional cloning involves narrowing down the QTL to a level which can be managed by available positional analysis and, subsequently, identifying the genes located within the genomic region of the QTL. This requires reducing the confidence interval of the QTL to about 0.3 cM, which can be achieved using fine mapping approaches (Falconer and Mackay, 1996). However, positional cloning may be hampered by two problems. Firstly, as positional cloning is based on linkage analysis, genetic variants with large effect have a greater chance of being detected and those of small effect may be missed (Takahashi et al., 1994). Secondly, even a finely mapped QTL may contain several genes, of which only a few contribute to the phenotype, raising a new challenge: prioritizing the candidate genes for further analysis.


Unlike positional cloning, which takes into account every gene within the QTL region, the candidate gene approach focuses on particular candidate genes within the QTL region which are known to be involved in the physiological mechanism, pathogenesis pathway, or regulatory system of the trait of interest. The next step is searching for association between different molecular variants at each candidate gene and the phenotype. The weaknesses of association studies, as described above, may also apply here, and underlying genes which have not yet been reported to be relevant to the trait of interest may be missed.

Neither of the two approaches described above is sufficient on its own to clone QTL underlying complex traits. Depending on the initial QTL maps and the available information on the phenotype, one approach may be preferred. However, in general, a combination of the two provides a better approach.

1.4 Atrial septation

Development of the four-chambered mammalian heart involves permanent separation of the systemic and pulmonary circulations through septation of common chambers, including the common atrium. Atrial septation is a complex process to which different structures and multiple cell populations contribute. Initially, the myogenic septum primum grows from the common atrial roof toward the superior endocardial cushion, creating a narrowing triangular hole termed the ostium primum (Anderson et al., 2003). A mesenchymal cap on the leading edge of the septum primum fuses with the superior endocardial cushion. Concomitantly, the dorsal mesenchymal protrusion (DMP; also called “spina vestibuli”), a wedge of mesenchymal tissue in continuity with the mesenchyme forming the dorsal wall of the common atrium, infiltrates the forming septal complex and fuses with the inferior endocardial cushion (Webb et al., 1998). Before the ostium primum is fully closed, the upper edge of the septum primum becomes fenestrated by apoptotic cell death to form a new communication termed the ostium secundum (Moore and Persaud, 1998; Anderson et al., 2003). A second septal wall, the septum secundum, then develops to the right of the septum primum, maintaining an offset opening termed the foramen ovale. The septum secundum forms the upper and lower rims of the foramen ovale, the lower rim forming through muscularization of tissue originating from the DMP, and the upper as a deep folding of the atrial wall (Anderson et al., 2003). Together, the offset communication between the primary and secondary septa creates the one-way flap valve that supports a right-to-left shunt.

1.5 Fetal and neonatal circulation

1.5.1 Fetal circulation

During prenatal life, the fetus develops within the amniotic sac, inhaling and exhaling the amniotic fluid. Although this “breathing” is necessary for normal development of the lungs, it does not provide gas exchange. Instead, the respiratory function is performed by the placenta, albeit passively and with less efficiency. The placenta receives oxygenated blood from the maternal circulation and sends it to the fetus through the umbilical vein, while deoxygenated blood is carried back to the placenta by the umbilical arteries and enters the maternal circulation. The partial pressure of oxygen (PO2) in the umbilical vein is only about 30-35 mmHg, much lower than that of maternal arterial blood (~80-100 mmHg) (Rhoades and Bell, 2012). Although this is partially compensated for by the higher affinity of fetal haemoglobin for oxygen compared with its maternal equivalent, structural adaptation is also required to divert blood away from the unnecessary pulmonary circulation and to supply enough blood to the vital organs, including the brain and heart.

The adaptation is achieved by vasoconstriction of the pulmonary vessels together with temporary intracardiac and extracardiac shunts: the ductus venosus, foramen ovale, and ductus arteriosus (Kliegman et al., 2011). The ductus venosus shunts blood from the umbilical vein into the inferior vena cava, while the other two bypass the pulmonary circulation by connecting the right atrium to the left atrium (foramen ovale) and the pulmonary artery to the aorta (ductus arteriosus). On average, half of the umbilical venous blood bypasses the liver via the ductus venosus and mixes with the deoxygenated inferior vena cava blood coming from the lower body (Rudolph et al., 1971). After entering the right atrium, this mixed blood is mostly directed by the eustachian valve, a crescent-shaped fold at the entrance of the inferior vena cava, across the foramen ovale into the left atrium (Sadler, 2004). This blood is subsequently pumped from the left ventricle to the upper body organs, including the brain, via the ascending aorta. In contrast to blood from the inferior vena cava, blood from the superior vena cava mostly flows to the right ventricle rather than passing through the foramen ovale and is then pumped into the pulmonary artery. However, due to the vasoconstriction of the pulmonary vessels, the majority of this blood escapes from the pulmonary artery into the descending aorta via the ductus arteriosus (Kliegman et al., 2011).

1.5.2 Transition to neonatal circulation

At birth, when the newborn starts to breathe, the lungs expand and the arterial PO2 increases, leading to a significant reduction in pulmonary vascular resistance (Teitel et al., 1990). In parallel, the systemic vascular resistance increases due to detachment of the placental circulation (Kliegman et al., 2011). As a consequence, the ductus arteriosus flow reverses to a left-to-right shunt and the pulmonary circulation receives the whole output from the right ventricle, supplemented by that from the ductus arteriosus. Within the next few days the ductus arteriosus constricts and eventually closes due to the high arterial PO2, leaving a fibrous band termed the ligamentum arteriosum (Kliegman et al., 2011). The increased venous return to the left atrium increases the left atrial volume. The resulting left-to-right pressure gradient across the interatrial septum forces the septum primum against the septum secundum and functionally closes the flap valve of the foramen ovale. Subsequently, the remnant of the septum primum fuses with the edge of the septum secundum and completely closes the foramen ovale, leaving an embryonic remnant, the fossa ovalis, an oval depression in the lower part of the interatrial septum (Standring, 2008). The ductus venosus also closes due to the elimination of the placental circulation.

1.6 Patent Foramen Ovale

As described in 1.5.2, the flap valve of the foramen ovale normally closes after birth by fusion of the septum primum and the septum secundum. However, fusion is incomplete in about one-quarter of the general human population, leading to an anatomic variant termed patent foramen ovale (PFO) (Hagen et al., 1984). The severity of PFO in the normal population is variable, ranging from small pinhole openings to large tunnel-shaped corridors. The diameter ranges from 1 to 10 mm in 98% of PFOs, and only ~1% of PFOs show an inter-atrial communication of ≥10 mm (Hagen et al., 1984).


1.6.1 Clinical importance of PFO

Haemodynamically, PFO is usually a benign condition. Even in cases of large PFO, the normal pressure gradient between the left and right atria forces the septum primum against the septum secundum and functionally closes the foramen ovale. However, in large PFOs a change in the gradient during some physiological conditions, such as the Valsalva maneuver, may allow the flap valve to open easily, shunting a significant volume of blood.

The main clinical importance of PFO is its association with other clinical conditions, some of which are severe.

1.6.1.1 PFO and stroke

Stroke is the third leading cause of death (2010) and the major cause of serious, long-standing disability in the United States (2001), affecting about 800,000 individuals each year (Lloyd-Jones et al., 2010). Among cases of ischemic stroke, 18-42% have no identifiable cause and are termed cryptogenic (Sacco et al., 1989; Homma et al., 2002; Musolino et al., 2003). For larger PFOs, there is a possibility of clinically significant inter-atrial communication, which is associated with a higher risk of cryptogenic stroke, particularly in young patients. Lechat et al. first reported this association, showing that the prevalence of PFO (detected by contrast echocardiography) was significantly higher in adults under 55 years old with ischemic stroke than in controls (40% vs 10%, respectively) (Lechat et al., 1988). The association in young adults was confirmed by several other clinical studies using transesophageal echocardiography (TEE), the gold standard for the diagnosis of PFO in adults (Homma et al., 1994; Hubail et al., 2011). However, the association in older patients remains controversial. Unlike early case-control studies (Hausmann et al., 1992; Jones et al., 1994), a recent cohort study (Handke et al., 2007) showed an association in both older (≥55 years) and younger (<55 years) patients, suggesting that the results of some previous reports (Overell et al., 2000) were biased because older patients underwent TEE less often than younger patients.

The most likely mechanism for the association between PFO and cryptogenic stroke is paradoxical embolism, described as the passage of an embolus (usually from a deep vein thrombosis) through the PFO shunt from the venous side to the arterial system, from where it may eventually reach the brain (Lechat et al., 1988; Webster et al., 1988). Paradoxical embolism can be induced by situations which increase the right atrial pressure, for example the Valsalva maneuver, which can be mimicked by coughing or straining at stool, so that a right-to-left shunt through the PFO occurs. This mechanism is supported by the fact that patients with larger PFOs are at a higher risk of stroke (Bridges et al., 1992; Schuchlenz et al., 2000). In addition, those with a residual shunt after PFO closure show a higher rate of stroke recurrence (Wahl et al., 2001).

PFO may also contribute to ischemic stroke through other proposed mechanisms. The tunnel-like structure of a PFO and its possibly turbulent blood flow may predispose to in situ thrombus formation (Homma and Sacco, 2005). Also, thrombus formation may occur secondary to atrial arrhythmias, which are more frequent in cases with PFO (Berthet et al., 2000).

1.6.1.1.1 PFO, ASA, and stroke

Large PFOs are often associated with persistence of embryonic atrial features, as well as with atrial septal aneurysm (ASA) (Hagen et al., 1984; Homma et al., 2003), a rare cardiac abnormality described as a bulging of the atrial septum through the fossa ovalis into the right or left atrium or both (Silver and Dorsey, 1978). ASA is also independently associated with cryptogenic stroke (Hagen et al., 1984; Homma et al., 2003). It is more frequently detected in patients who have had an ischemic stroke (Agmon et al., 1999). Also, in patients with ischemic stroke and ASA, ASA (with or without PFO) is usually the only potential cardioembolic source detected by TEE (Agmon et al., 1999).

1.6.1.2 PFO and migraine

Migraine is a highly prevalent headache disorder affecting approximately 10% to 18% of adults in industrialized countries (Rasmussen, 2001). It is associated with a high socio-economic burden. According to WHO statistics, it is ranked as the nineteenth leading cause of disability among all diseases worldwide (2004). Migraine is described as recurrent attacks of headache, with each attack lasting 4-72 hours. The headache is unilateral and pulsating, of moderate or severe intensity, aggravated by physical activity, and associated with nausea and/or photophobia and phonophobia (fear of noises) (2004). About 30% of migraineurs experience aura, a period of focal neurological (visual, sensory, or motor) symptoms usually occurring within 60 minutes prior to the headache attack (Silberstein, 2004).

The association between PFO and migraine was first observed in a case-control study performed by Del Sette et al. (Del Sette et al., 1998). The prevalence of right-to-left shunt in patients with migraine with aura was similar to that in young patients with stroke and significantly higher than that in controls, suggesting PFO as a risk factor for migraine as for cryptogenic stroke. The association was also reported by other studies, including those which investigated the link between PFO and stroke and observed a higher prevalence of migraine in individuals with PFO than in controls (Lamy et al., 2002). Studies of patients who underwent PFO closure showed a beneficial effect on migraine. The effect varied from a reduction in the incidence of migraine to an improvement in migraine symptoms or aura (Azarbal et al., 2005; Reisman et al., 2005; Anzola et al., 2006; Luermans et al., 2008). These studies not only documented the association between PFO and migraine but also suggested paradoxical embolism as a possible mechanism for this association. For example, Reisman et al. showed that patients with paradoxical cerebral embolism are more frequently affected by migraine headaches than the general population, and that transcatheter closure of the PFO significantly reduces the frequency of migraine attacks. A quantitative systematic review of 6 studies on migraine and PFO, including 194 patients with migraine who had undergone PFO closure, showed an improvement in the frequency and severity of migraine (Schwedt et al., 2008). However, the benefit of PFO closure in migraine treatment was questioned by a recent well-designed trial (Dowson et al., 2008) which, unlike the early studies on the effect of PFO closure on migraine outcome, was randomized and double-blind. Patients with migraine with aura and with moderate or large right-to-left shunts were randomized to transcatheter PFO closure with a STARFlex implant or to a sham procedure (skin incision in the groin). Six-month follow-up showed no significant difference in resolution of migraine between the two groups. A more recent study has also questioned the association between migraine and PFO. Garg et al. performed a large case-control study to assess the prevalence of PFO in individuals with and without migraine (Garg et al., 2010). Transthoracic echocardiography and transcranial Doppler ultrasonography were used to identify PFO, which was considered present if it was detected by both methods. The results showed no association between migraine headaches and the presence of PFO, or between migraine severity and PFO size, suggesting that further investigations are required to establish the association between migraine and PFO.

1.6.1.3 PFO and other clinical conditions

A right-to-left shunt in PFO is associated with some hypoxemic symptoms. A significant right-to-left shunt through PFO may worsen symptoms in patients with chronic hypoxemia. PFO is associated with decompression illness in divers, possibly through paradoxical gas embolism via large right-to-left shunts (Wilmshurst et al., 2001). PFO is more prevalent in patients with obstructive sleep apnea than in controls and may contribute to the associated hypoxemia in such patients (Shanoudy et al., 1998). Moreover, interatrial right-to-left shunts including PFO are the main cause of platypnea-orthodeoxia syndrome, a rare syndrome described as dyspnea and deoxygenation induced by a change in position from recumbent to upright (Cheng, 1999). A recent study has reported an increased prevalence of right-to-left shunt in patients with isolated incompetence of the great saphenous vein (Wright et al., 2010). However, it is not yet known whether this association is etiologic or functional.

1.6.2 Genetics of PFO

There are no conclusive data on the underlying causes of PFO, although there is good evidence that genetic factors are involved. The available data on the genetics of PFO can be classified into two main categories: first, studies that provide evidence for familial inheritance of PFO, and second, those that have tried to identify the underlying genetic factors, with special attention to the possible link between PFO and atrial septal defect (ASD). Here, the first category is discussed; details on the second are provided in 1.8.

Arquizan et al. studied familial aggregation of PFO in adult patients with ischemic stroke and their siblings (Arquizan et al., 2001). All individuals were assessed by contrast-enhanced transcranial Doppler for the presence of PFO. A significantly higher prevalence of PFO was observed in siblings of patients with PFO than in siblings of patients without PFO. In addition, the association showed a significant sex dependency. In separate analyses of males and females, females showed a significant difference in PFO prevalence between siblings of patients with and without PFO, whereas the association disappeared in males, suggesting familial aggregation of PFO in women.

In a recent study by Wright et al., adults with symptomatic varicose veins were recruited, of whom those with isolated incompetence of the great saphenous vein were selected and assessed for the presence of right-to-left shunt using transcranial Doppler (Wright et al., 2010). As mentioned in 1.6.1.3, a significantly higher prevalence of right-to-left shunt was observed in this group compared to the general population. Varicose veins are common and highly heritable. FOXC2 was recently suggested to be involved in the pathogenesis of this vascular abnormality (Ng et al., 2005). Therefore, the link between varicose veins and PFO may indicate a genetic contribution to PFO.

Seventy-one relatives of 20 probands with a large PFO or ASD (altogether forming 18 families and two sibships) were assessed for the presence of atrial shunt using contrast echocardiography (Wilmshurst et al., 2004). The detected atrial shunts were mostly large PFO and showed autosomal dominant inheritance. All individuals were also assessed for migraine symptoms. Interestingly, the inheritance of migraine with aura in some families was linked to the inheritance of atrial shunt.

1.7 Atrial septal defect

Atrial septal defects (ASDs) are a more severe form of congenital heart defect (CHD) than PFO. According to anatomical features, ASDs are classified into four subtypes: secundum ASD, ostium primum ASD, sinus venosus ASD, and coronary sinus ASD.

1.7.1 Anatomical subtypes of ASD

1.7.1.1 Secundum ASD

Among the anatomical sub-types of ASD, the secundum form (ASDII) is the most common, comprising 70% of ASDs and 7% of all CHDs (Feldt et al., 1971). ASDII arises from abnormal formation of the septum primum or septum secundum (or both) such that the septum primum remnant is insufficient to cover the foramen ovale, leading to a frank corridor between the two atrial chambers in the area of the fossa ovalis (Zipes et al., 2005). ASDII is considered a true atrial septal defect since it is a defect in the atrial septum (Yen Ho et al., 2007), whereas the other sub-types of ASD, including ostium primum ASD, sinus venosus ASD, and coronary sinus ASD, are interatrial communications rather than defects within the confines of the atrial septum.

1.7.1.2 Ostium primum ASD

In the developing heart the ostium primum is a hole between the growing septum primum and the endocardial cushions, which eventually closes by fusion of these structures. Ostium primum ASDs, which comprise 20% of ASDs, are caused by defects in the endocardial cushions leading to a gap between the free margin of the atrial septum and the conjoined leaflets of the atrioventricular valves (Yen Ho et al., 2007).

1.7.1.3 Sinus venosus ASD

Sinus venosus ASDs account for 4% to 11% of ASDs (Jost et al., 2005). The defect is in the common wall that normally separates the left atrium from either caval vein (Oliver et al., 2002).

1.7.1.4 Coronary sinus ASD

Coronary sinus ASD, also referred to as an “unroofed coronary sinus”, is a rare form of ASD (1% of ASDs) characterized by a defect in the wall which normally separates the coronary sinus from the left atrium. As a result, the coronary sinus is partly or completely unroofed.

1.7.2 Pathophysiology and clinical features of ASD

As left atrial pressure rises postnatally, the resultant left-to-right shunt causes volume overload of the right side of the heart, which may lead to clinical consequences. The size of the defect is the main determinant of the outcome of the ASD. Patients with larger ASDs become symptomatic earlier than those with smaller ones, and their symptoms are more severe. The symptoms usually start with exercise intolerance (Webb and Gatzoulis, 2006). Atrial fibrillation or flutter may occur due to atrial dilatation. Cases with untreated large ASD may develop more severe complications such as decompensated right heart failure and pulmonary hypertension, and eventually Eisenmenger syndrome, in which the ASD shunt is reversed into a right-to-left shunt due to the increased pressure in the right side of the heart secondary to the pulmonary hypertension. The advanced stages of Eisenmenger syndrome may cause severe complications including central cyanosis.

1.7.3 Etiology of ASD

1.7.3.1 Single factor and multifactorial causation of ASD

Despite many advances in identifying the causes of CHDs, the etiology of most cases remains unknown. A wide spectrum of genetic and environmental factors has been reported as causal for only a small fraction of CHDs (about 15%) (Botto and Correa, 2003). Chromosomal abnormalities, defects in single genes, and environmental exposures have been linked to about 5–10%, 3–5%, and 2% of CHD cases, respectively (Clark, 2001).

It is generally believed that the majority of CHDs are of multifactorial origin, caused by interactions between multiple genetic elements and environmental factors. The idea was first proposed by Nora, who tested different hypotheses on the causation of CHD and suggested a multifactorial model indicating that CHDs are mostly caused by multiple factors, with the expression of each being determined by the genetic-environmental interaction (Nora, 1968). This model was later supported by several other studies (Nabulsi et al., 2003; Brent, 2004).

Several genetic and environmental factors which have been identified to be involved in the causation of ASD are discussed below. However, like other CHDs, ASD mostly has a multifactorial causation. The most prominent evidence for this was provided by Kirk et al. through QTL mapping of quantitative parameters of the atrial septum in a mouse model. These parameters, which had already been shown to be correlated with PFO and ASD, were mapped to a number of QTL with high LOD scores located on different chromosomes (Biben et al., 2000; Kirk et al., 2006). This study is discussed further in chapter 5.

1.7.3.2 Genetic causes of ASD

1.7.3.2.1 Chromosomal abnormalities

Structural abnormalities of autosomal or sex chromosomes may result in chromosomal syndromes which are often associated with a high incidence of CHDs.

1.7.3.2.1.1 Aneuploidies

Down syndrome (trisomy 21) is the most common chromosomal abnormality in live births, resulting in a wide range of symptoms. About half of the patients are affected by cardiac abnormalities, with septal defects including atrioventricular septal defect (AVSD), ASDII, and ventricular septal defect (VSD) being overrepresented (Freeman et al., 2008). Other types of aneuploidy, including trisomy 13 and trisomy 18, are also common and associated with a high incidence of ASD (Musewe et al., 1990; Baty et al., 1994).

1.7.3.2.1.2 22q11 deletion syndrome

22q11 deletion can present in different forms including DiGeorge, velocardiofacial, and conotruncal anomaly face syndromes (Scambler, 2000). Cardiac anomalies occur in 75% of patients. The outflow tract is mostly affected and ASD is less common, suggesting that the main pathology is an abnormal migration of neural crest cells into the branchial arches and outflow septum (Ryan et al., 1997).

1.7.3.2.1.3 8p23 deletion syndrome

About 60% of cases with 8p23 deletion show cardiac malformations, which can be AVSD, ASD, VSD, double outlet right ventricle (DORV), hypoplastic left heart, or pulmonary stenosis (Pehlivan et al., 1999). The 8p23.1 region and, in particular, the GATA4 zinc finger transcription factor gene located in this region have been suggested to be responsible for the cardiac defects (Marino et al., 1992; Digilio et al., 1993; Pehlivan et al., 1999). More details on GATA4 and its role in the causation of ASD are discussed below.

1.7.3.2.1.4 Cri-du-chat syndrome

Cri-du-chat syndrome is caused by a spectrum of chromosomal deletions ranging from deletion of only the 5p15.2 region to loss of the whole short arm of chromosome 5 (Overhauser et al., 1994). Cardiac malformations are found in a high proportion of patients, with the most common forms being patent ductus arteriosus (PDA), VSD, and ASD (Hills et al., 2006).

1.7.3.2.2 Non-chromosomal syndromes

1.7.3.2.2.1 Noonan and LEOPARD syndromes

Noonan syndrome is common, affecting 1/1000 to 1/2000 live births (Allanson, 1987). After Down syndrome, it is the most common genetic disorder associated with congenital heart anomalies. Left ventricular hypertrophy (hypertrophic cardiomyopathy), pulmonary stenosis, and ASDII are the most common cardiac abnormalities observed in patients with Noonan syndrome (Burch et al., 1993; Marino et al., 1999). The syndrome follows an autosomal dominant mode of inheritance and is mainly caused by gain-of-function mutations in PTPN11, a gene located at chromosome 12q22-qter that encodes the tyrosine phosphatase SHP-2. LEOPARD syndrome is a rare autosomal dominant syndrome which shares many clinical features with Noonan syndrome. It is also caused by mutations in PTPN11, albeit with a loss-of-function effect (Keyte and Hutson, 2012). Electrocardiographic abnormalities and ventricular hypertrophy are the most common heart disorders, and septal defects occur less commonly than in Noonan syndrome (Limongelli et al., 2007).

1.7.3.2.2.2 Marfan syndrome

Marfan syndrome is a common genetic disorder of connective tissue with an estimated prevalence of 1/5000 (von Kodolitsch and Robinson, 2007). It is inherited as an autosomal dominant trait and is caused by mutations in the fibrillin 1 gene (Dietz et al., 1991). Fibrillin is a major structural component of the microfibrils which structurally support elastic and nonelastic connective tissue. Cardiovascular manifestations including dilation of the ascending aorta and aortic dissection are among the common features of Marfan syndrome. ASD is rarely reported.

1.7.3.2.2.3 Holt-Oram syndrome

Holt-Oram syndrome (HOS) is a rare autosomal dominant disorder affecting 1 in 100000 individuals. It was first reported by Holt and Oram in a four-generation family as a syndrome of skeletal lesions of the upper limbs and heart anomalies, mainly ASDII (Holt and Oram, 1960). Other studies also showed a high prevalence of ASD among patients. In Newbury-Ecob’s clinical report, ASDII was the most prevalent cardiac abnormality among HOS patients (34% and 45% among familial and isolated HOS cases, respectively) (Newbury-Ecob et al., 1996). The majority of familial or sporadic HOS cases are associated with mutations in TBX5 (McDermott et al., 2005). TBX5 has a key role in the development of the upper limbs as well as the heart, in particular in dividing the developing heart into four chambers. Its effect may also be exerted through interactions with the cardiac transcription factors TBX20, NKX2-5, and GATA4, mutations in which are known to cause a variety of heart defects including ASD.

1.7.3.2.3 Non-syndromic ASD

The majority of ASDs with known etiology are caused by sporadic genetic changes, mostly point mutations, which result in Mendelian patterns of inheritance, the majority being autosomal dominant. The transcription factors regulating the complex process of cardiac development are among the known underlying genes. As discussed below, they can act either directly or through interaction with other genetic factors.

NKX2-5, a homeobox transcription factor, was the first gene identified to be involved in the causation of non-syndromic CHD. Schott et al. performed a linkage analysis of pedigrees with CHDs (Schott et al., 1998). They mapped a dominant chromosomal locus associated with CHDs, mainly ASDII, in addition to atrioventricular conduction abnormalities, to a region on chromosome 5 which contains NKX2-5, and subsequently found 3 causative NKX2-5 mutations. This was followed by another work by this group which identified 7 more NKX2-5 mutations associated with atrioventricular block, ASD and VSD (Benson et al., 1999). Several other families with ASD have been assessed, resulting in identification of a number of NKX2-5 mutations. However, slightly different findings were observed in mice. Biben et al. studied mice heterozygous for Nkx2-5 mutations for cardiac defects (Biben et al., 2000). While, unlike in humans, ASD was rare in such mice, a spectrum of atrial septal dysmorphogenesis in addition to mild conduction abnormalities was observed, suggesting convergence of the mouse and human phenotypes.

GATA4 belongs to the conserved GATA family of zinc finger transcription factors, which play an important role in mammalian cardiac lineage specification and morphogenesis (Crispino et al., 2001). Pehlivan et al. were the first to show that haploinsufficiency at the GATA4 locus is often seen in patients with del8p23.1 and CHD, suggesting a causal role for GATA4 deficiency (Pehlivan et al., 1999). Garg et al. analysed two pedigrees, both with isolated CHDs, mainly ASD (Garg et al., 2003). In the larger family, the disease locus was mapped to 8p22-23, a region which contains GATA4. A missense mutation of GATA4, G296S, was observed in all affected individuals but not in any unaffected individuals. In the smaller family, they found a frame-shift mutation of GATA4 linked to CHDs. Functional analysis of the mouse G295S mutation (human G296S) revealed an impaired interaction between GATA4 and TBX5, suggesting that GATA4 mutations are involved in the causation of septal defects through their interaction with TBX5. Since then a number of missense mutations of GATA4 have been identified in families with CHD, in particular ASD (Hirayama-Yamada et al., 2005; Tomita-Mitchell et al., 2007; Posch et al., 2008; Zhang et al., 2008).

Tbx20 is a member of the T-box transcription factor family and is essential for early heart development (Stennard et al., 2005). It interacts with other cardiac transcription factors including NKX2-5 and GATA4, forming a regulatory network for gene expression in the developing heart (Stennard et al., 2003). A causative role for TBX20 in familial ASD was first highlighted by Kirk et al., who reported two TBX20 mutations, one nonsense and one missense, associated with a family history of CHD as well as a spectrum of cardiac anomalies including septal defects (Kirk et al., 2007). Qian et al. found two non-synonymous TBX20 mutations in two unrelated children, one with isolated ASDII and one with ASDII and TOF (Qian et al., 2008). Study of non-Caucasians provided more insight into the role of TBX20 in the causation of ASD, although the data were not conclusive. Liu et al. assessed a large group of Chinese patients with sporadic CHD, mostly ASD and TOF, for TBX20 variants (Liu et al., 2008). A number of TBX20 mutations were observed in some patients but not in controls, suggesting their possible contribution to CHD. More recently, a novel TBX20 missense mutation was found by Posch et al. segregating with CHD in family members from three generations, including a patient with cribriform ASDII (Posch et al., 2010).

In addition to the cardiac transcription factors, sarcomeric filament genes have been reported to be linked to familial ASD. MYH6 is a cardiac-specific sarcomeric gene which encodes the alpha-myosin heavy chain. It is directly regulated by TBX5, GATA4, and TBX20. Ching et al. performed a linkage analysis of a large family with dominant ASD (Ching et al., 2005). The identified locus was then mapped to a missense mutation of MYH6. Moreover, in a study by Matsson et al., isolated ASD was mapped to mutations in ACTC1, a sarcomeric gene that encodes actin, alpha, cardiac muscle 1 (Matsson et al., 2008). They studied two families with ASDII and found a founder ACTC1 mutation in both families. In another work, families with hypertrophic cardiomyopathy, dilated cardiomyopathy, or left ventricular non-compaction were studied. E101K, another ACTC mutation, was found in several family members including some with ASDII (Monserrat et al., 2007). The MYH7 gene encodes the beta-myosin heavy chain. Budde et al. analysed a large family with noncompaction of the ventricular myocardium (NVM) and observed a MYH7 missense mutation in the individuals with NVM, some of whom also had ASD (Budde et al., 2007).

1.7.3.2.4 Environmental exposure

Maternal exposure during pregnancy is a major determinant of susceptibility to CHD. Several environmental factors have been reported to modify the risk of CHD in the fetus. While some of them, such as periconceptional intake of multivitamins or folic acid, may reduce the risk of CHD, the majority act as risk factors, including maternal diseases such as pregestational diabetes, rubella, phenylketonuria, and systemic lupus erythematosus, and maternal exposure to drugs and other environmental factors such as thalidomide, lithium, indomethacin tocolysis, retinoic acid, alcohol, and marijuana (Lin and Ardinger, 2005; Jenkins et al., 2007). Major reported risk factors for ASD are highlighted in Table 1.1.

Table 1.1: Environmental factors with reported risk of ASD, mostly extracted from Jenkins et al.

Exposure                        ASD-specific odds ratio   Reference

Maternal illnesses
Rubella                         Not available             (Gregg et al., 1945)
Epilepsy                        Not available             (Pradat, 1992)
Febrile illness                 Not available             (Zhang and Cai, 1993)
Influenza                       Not available             (Scanlon et al., 1998)
Phenylketonuria                 Not available             (Levy et al., 2001)

Maternal conditions
Alcohol                         1.9                       (Tikkanen and Heinonen, 1992)
Age > 34 years                  1.6                       (Ferencz et al., 1997)
Previous preterm birth          2.1                       (Ferencz et al., 1997)
Smoking                         2.2                       (Torfs and Christianson, 1999)
BMI > 29 kg/m2                  1.37                      (Cedergren and Kallen, 2003)

Maternal drug exposure
Anticonvulsants                 Not available             (Hanson, 1986)
Thalidomide                     Not available             (Smithells and Newman, 1992)
Vitamin A congeners/retinoids   Not available             (Geiger et al., 1994)
Sulfasalazine                   Not available             (Hernandez-Diaz et al., 2000)
Ibuprofen                       Not available             (Ericson and Kallen, 2001)
Trimethoprim-sulfonamide        Not available             (Czeizel et al., 2001)

Paternal conditions
Age ≥ 40: 40-44 years           1.5                       (Olshan et al., 1994)
Age ≥ 40: 45-49 years           2.7                       (Olshan et al., 1994)
Age ≥ 40: ≥ 50 years            2.1                       (Olshan et al., 1994)
Age < 20                        1.9                       (Olshan et al., 1994)

Currently, the National Birth Defects Prevention Study (NBDPS), a large population-based study aimed at finding risk factors for birth defects including CHDs, is ongoing in the United States.

1.8 A link between ASD and PFO

A link between ASD and PFO has not been formally established in humans, although the largest and most complex PFO can be difficult to distinguish from ASD and are often regarded as forme fruste ASD. This relationship is also strongly suggested by family studies, including those of families bearing mutations in cardiac transcription factor genes. Mutations in TBX20, for example, are associated with defects in septation including ASD and PFO with permanent left-to-right shunt (Kirk et al., 2007). Heterozygous TBX20 mutant mice show an increased background prevalence of PFO and septal dysmorphogenesis, as well as a genetic predisposition to ASD (Stennard et al., 2005). Mutations in the homeodomain factor NKX2-5, a key transcriptional regulator of cardiac development, cause familial ASD in humans, and in mice lead to a high prevalence of PFO correlating with the severe end of a spectrum of atrial septal dysmorphogenesis (Schott et al., 1998; Biben et al., 2000; Srivastava and Olson, 2000; McElhinney et al., 2003; Kasahara and Benson, 2004; Reamon-Buettner et al., 2004; Hirayama-Yamada et al., 2005; Sarkozy et al., 2005). The prevalence of PFO in mice is also highly genetic background-dependent. A recent study showed a higher incidence of PFO in a mouse model of the GATA4 G295S mutation (human GATA4 G296S), a causative mutation for ASD (Misra et al., 2012). Collectively, these data suggest that ASD and PFO exist in an anatomical continuum with a common genetic basis.

2 Materials and Methods

2.1 Human studies

2.1.1 Study of association between PFO with cryptogenic stroke and GATA4 S377G

2.1.1.1 Ethics committee approval

All human experiments in the Australian study were carried out under Human Research Ethics Committee approval from the South East Health Research Ethics Committee – Eastern Division, the St Vincent’s Hospital Research Ethics Committee, and the Children’s Hospital at Westmead Research Ethics Committee.

The German study protocol was approved by the Institutional Review Board of the Charité-Universitätsmedizin Berlin, Germany.

Written informed consent was obtained from all participants (or from a parent/legal guardian on behalf of children participating in the study).

2.1.1.2 Subject recruitment

2.1.1.2.1 Australian study of association between PFO and GATA4 S377G

Subjects were recruited from Sydney hospitals from 1982 to 2004 and those of Caucasian descent were selected for the study. Adult ASD and CHD patients were recruited mostly from St. Vincent’s Hospital and St. Vincent’s Private Hospital (SVHs), while children were from the Sydney Children’s Hospital (SCH) and the Children’s Hospital at Westmead (CHW). Most were recruited prospectively during outpatient services and were unselected for family history or particular cardiac anomalies. PFO and Stroke patients were recruited prospectively from St Vincent’s Hospital and St Vincent’s Private Hospital after referral to echocardiography services for a variety of indications including stroke or transient ischemic attack (TIA).

Allocation to case (PFO with cryptogenic stroke; n = 58) and control (PFO without stroke/TIA, n = 29; stroke/TIA without PFO/ASD, n = 66) groups occurred retrospectively. Cases consisted of 58 individuals under investigation for cryptogenic stroke in whom PFO was demonstrated by transesophageal echocardiography (TEE). In control groups, PFO was either absent or an incidental finding, as also determined by TEE.

Clinical evaluation included medical history, 12-lead electrocardiography and transthoracic echocardiography and/or TEE with intravenous saline contrast injection during the strain and release phases of the Valsalva maneuver.

Two additional control groups were also assessed. TEE controls recruited from St. Vincent’s Hospital (n = 113) had undergone TEE for a range of indications and were reported to have structurally normal hearts, in particular an intact atrial septum. The second group (population controls; n = 391) were Australian Caucasians in generally good health (no stroke or known heart disease), although unselected for septal status (Sherin et al., 2008).

The PFO/Stroke group was restricted to patients with cryptogenic stroke. Since atrial fibrillation (AF) is a known cause of stroke, all of these patients had Holter monitor evaluation unless there was prior evidence in their medical history for AF. We excluded all patients with AF from the PFO/Stroke group. Patients in the Stroke/no PFO/ASD group were also evaluated by Holter monitor as part of their clinical evaluation, although those with AF were included in the study.

2.1.1.2.2 German study of association between PFO and GATA4 S377G

The vast majority of individuals were of Caucasian origin and were recruited from Berlin hospitals between 2005 and 2007. Three Arab patients were included in the study since the allele frequency of S377G is similar in Arab and Caucasian populations (Table 4.1).

Ninety-six and 95 patients had isolated ASDII and isolated PFO, respectively. ASDII probands represent a subgroup of a previously published CHD cohort included in genetic screening (Posch et al., 2008). Most PFO probands (91%) had been admitted for interventional PFO closure after experiencing one or more thromboembolic events. All 191 cases were recruited at the Department of Pediatric Cardiology, Deutsches Herzzentrum Berlin, and the Adult Cardiology Department at the Charité-Universitätsmedizin Berlin, Campus Virchow (CVK).

All patients were characterized by physical examination, 12-lead electrocardiography and TEE. Ninety-six patients attending the Department of Cardiology CVK were used as controls after exclusion of ASD and PFO by contrast-enhanced TEE after Valsalva maneuver.

All ASDII and PFO patients underwent 24-hour Holter monitoring. AF was found in none of the PFO patients and all were designated as having cryptogenic stroke.

2.1.1.3 S377G genotyping and statistical analyses

2.1.1.3.1 Global distribution of S377G

The distribution of the S377G allele (dbSNP ref. 3729856) was determined in different indigenous populations (Martinson et al., 2000) (Table 4.1) using Fluorescence Polarisation (Chen et al., 1999).

2.1.1.3.2 Australian study of association between PFO and GATA4 S377G

The S377G genotyping was performed at the Australian Genome Research Facility (AGRF). Initially, GATA4 coding exons were amplified by polymerase chain reaction (PCR) from 100 ng of leukocyte DNA, purified with PCR Cleanup Plates (Millipore) and sequenced using the Big Dye Terminator v3.1 kit (Applied Biosystems) and an ABI PRISM® 3700 DNA Analyser. Subsequently, GATA4 S377 status was confirmed in all samples, including population and TEE controls, using a commercial SNP analysis platform (Genera Biosystems).

Chi-square or Fisher’s exact test was used to detect differences between allele frequencies.
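To illustrate the kind of allele-frequency comparison described here, the following minimal Python sketch converts genotype counts into allele counts and applies a Pearson chi-square test to the resulting 2 x 2 table. The genotype counts and function names are hypothetical and are not data or software from this study; it is a sketch of the general technique only.

```python
import math

def allele_counts(n_aa, n_ag, n_gg):
    """Return (A, G) allele counts from genotype counts (two alleles per person)."""
    return 2 * n_aa + n_ag, 2 * n_gg + n_ag

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df the chi-square tail probability equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Hypothetical genotype counts (AA, AG, GG), for illustration only.
cases = allele_counts(30, 15, 5)
controls = allele_counts(70, 25, 5)
stat, p = chi_square_2x2(cases[0], cases[1], controls[0], controls[1])
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Counting alleles rather than individuals doubles the sample size of the table, which is the usual convention when allele frequencies, rather than genotype frequencies, are compared.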

2.1.1.3.3 German study of association between PFO and GATA4 S377G

PCR fragments including the S377G variation were first analysed by single-stranded conformational polymorphism (SSCP). Differential migration patterns allowed a secure distinction of all three possible genotypes (AA, AG, GG) on 12% polyacrylamide gels at 20°C. All three genotypes were confirmed in 12 samples by direct DNA sequencing.

Discrete variables were expressed as counts or percentages and compared using the chi-square test without Yates correction. Fisher’s exact test was employed for association of counts of which one subgroup was smaller than four. Continuous variables were checked for Gaussian distribution, expressed as mean value ± standard deviation, and compared by use of an unpaired, two-sided t-test.
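The rule for choosing between the two tests can be sketched in Python as follows. This is an illustration under stated assumptions rather than the software actually used: "one subgroup smaller than four" is read here as any cell of the 2 x 2 table being below four, the function names are hypothetical, and the example counts are invented.

```python
from math import comb, erfc, sqrt

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def chi_square_p(a, b, c, d):
    """Pearson chi-square p-value (1 df, no Yates correction)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(stat / 2))

def compare_counts(a, b, c, d, small=4):
    """Pick the test by the smallest cell count (assumed reading of the rule)."""
    if min(a, b, c, d) < small:
        return "fisher", fisher_exact_two_sided(a, b, c, d)
    return "chi-square", chi_square_p(a, b, c, d)

print(compare_counts(3, 1, 1, 3))      # small cells, so Fisher's exact test
print(compare_counts(30, 20, 20, 30))  # larger cells, so chi-square
```

The exact test is preferred for sparse tables because the chi-square approximation becomes unreliable when expected counts are small.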

2.2 Animal studies

2.2.1 Fine mapping of QTL affecting atrial septal morphology using an advanced intercross line

2.2.1.1 Mice and Advanced Intercross Line

An efficient QTL analysis requires a large disparity for the trait of interest between parental strains. The parental strains in the original F2 study (Kirk et al., 2006), QSi5 and 129T2/SvEms, were selected based on having extreme values for mean flap valve length (FVL) and prevalence of PFO, two indicators of atrial septal morphology which are strongly negatively correlated (Biben et al., 2000). 129T2/SvEms had the highest prevalence of PFO (75%) and shortest FVL (mean = 0.6 mm) among inbred strains and multiple crosses (Biben et al., 2000). QSi5 had the longest FVL (mean = 1.13 mm) and among the lowest prevalence of PFO (4.5%).

We randomly intercrossed the original F2 mice for 12 further generations. The F2 mice were bred to produce 48 male and female pairs which were then housed in 48 separate cages. We intercrossed the mice in each cage to generate the F3 mice. For the F3 x F3 and subsequent matings a cascading scheme was used in which a female mouse from one cage would be mated with a male mouse from the next cage. This breeding design, in which each pair contributes exactly two offspring (one male and one female) to the next generation, doubles the effective population size and reduces the random changes in allele frequency due to genetic drift over the 10 generations. Animals were bred and housed under Animal Care and Research Ethics approvals N00/4-2003/1/3745, N00/4-2003/2/3745 and N00/4-2003/3/3745 from the University of Sydney.
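The cascading scheme and its effect on effective population size can be sketched as follows. This is a minimal Python illustration with several assumptions that are not stated in the text: cages are numbered from 0, the last cage wraps around to the first, and the effective size is estimated with the common textbook approximation Ne = 4N / (Vk + 2), where N is the number of parents and Vk is the variance in family size.

```python
def cascade_pairs(n_cages):
    """Pair the female from cage i with the male from cage i + 1.

    Wrapping the last cage around to the first is an assumption made here
    so that every animal gets a mate.
    """
    return [(i, (i + 1) % n_cages) for i in range(n_cages)]

def effective_size(n_parents, family_size_variance):
    """Approximate effective population size, Ne = 4N / (Vk + 2)."""
    return 4 * n_parents / (family_size_variance + 2)

pairs = cascade_pairs(48)
print(pairs[:3], pairs[-1])          # (female cage, male cage) tuples

n_parents = 2 * 48                   # one male and one female per cage
print(effective_size(n_parents, 2))  # random (Poisson-like) family sizes
print(effective_size(n_parents, 0))  # exactly two offspring per pair
```

Under this approximation, fixing every family at two offspring (Vk = 0) gives twice the effective size obtained with random family sizes (Vk = 2), which is the doubling referred to above.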

2.2.1.2 Dissection and measurements

In total, 1003 AIL F14 mice were dissected, of which 933 had complete phenotypic data (475 males and 458 females). Phenotyping of each mouse, including initial and fine dissections, determination of PFO status, and measurement of septal features, was performed on the same day. As in the F2 study, the thoracic organs including heart, lungs, and mediastinum were initially dissected en bloc and stored in PBS. A tail biopsy was also taken from each mouse and snap-frozen for DNA extraction.

The following steps were performed under a Leica MZ8 dissecting microscope. The mediastinal organs were removed to expose the atria. Subsequently, the left atrium was opened to expose the atrial septum. We detected PFO by pressurization of the right atrium. The right to left passage of blood across the inter-atrial septum would indicate the presence of PFO. In cases with insufficient blood in the atrium, we injected Orange G dye into the left superior vena cava to provide color contrast.

Measurement of septal features including FVL, FOW, and CRW was performed using an eyepiece graticule.


FVL was defined formally, as in our previous study, as the length of the flap valve from the edge of the crescent (proximal rim of the ostium secundum) to the distal rim of the fossa ovalis. The maximum width of the foramen ovale (foramen ovale width; FOW) was measured perpendicular to the FVL. Crescent width (CRW) was defined as the maximum width of the prominent crescent-shaped ridge, representing the proximal rim of the ostium secundum and edge of the flap valve as previously described (Kirk et al., 2006).

2.2.1.3 Normalization

A general linear model (PASW Statistics 18) was used to analyze the effect of various covariates on the traits of interest (FVL, FOW, and heart weight, HW), considering a p-value of < 0.05 as significant (Tables 2.1-2.4). FVL and FOW were significantly affected only by HW (p = 0.002 and p = 0.028, respectively); the effects of the other covariates, including sex, age, body weight (BW), and coat color, were not significant. However, we did not adjust for HW since QTL relevant to HW may also influence atrial septal morphology. On the other hand, HW was significantly affected by sex (p < 0.001), age (p = 0.032), BW (p < 0.001) and coat color (p = 0.042). Therefore, prior to sample selection and further analysis, HW was normalized for age, sex and BW, although not for coat color so as to avoid missing QTL linked to coat color genes.
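The text does not say how HW was normalized. One common way, assumed here purely for illustration, is to take the residuals of a least-squares regression of HW on the covariates. The function name and the toy numbers below are hypothetical.

```python
def ols_residuals(y, covariates):
    """Residuals of y after least-squares adjustment for the covariates.

    covariates: one row of numeric covariates per animal (e.g. sex coded
    0/1, age, BW). An intercept is added automatically.
    """
    X = [[1.0] + [float(v) for v in row] for row in covariates]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    beta = [A[i][k] for i in range(k)]
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]


# Hypothetical animals: (sex, age in days, BW in g) and HW in g.
covs = [(0, 10, 20), (1, 12, 25), (0, 14, 22), (1, 16, 30), (0, 18, 27)]
hw = [0.12, 0.15, 0.13, 0.20, 0.16]
hw_adjusted = ols_residuals(hw, covs)
```

The residuals then serve as the covariate-adjusted HW phenotype; coat color is deliberately left out of the covariates, as in the text.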


Table 2.1: Effect of various covariates on FVL

Source    Type III Sum of Squares    df    Mean Square    F        P-value
Sex       0.315                      1     0.315          1.177    0.278
Age       0.507                      1     0.507          1.893    0.169
BW        0.125                      1     0.125          0.467    0.495
HW        2.467                      1     2.467          9.219    0.002
Colour    0.001                      1     0.001          0.003    0.956

Table 2.2: Effect of various covariates on FOW

Source    Type III Sum of Squares    df    Mean Square    F        P-value
Sex       0.030                      1     0.030          0.500    0.480
Age       0.056                      1     0.056          0.929    0.335
BW        0.170                      1     0.170          2.797    0.095
HW        0.293                      1     0.293          4.820    0.028
Colour    0.016                      1     0.016          0.264    0.608

Table 2.3: Effect of various covariates on CRW

Source    Type III Sum of Squares    df    Mean Square    F        P-value
Sex       0.345                      1     0.345          1.492    0.222
Age       0.315                      1     0.315          1.366    0.243
BW        0.372                      1     0.372          1.613    0.204
HW        0.601                      1     0.601          2.604    0.107
Colour    1.320                      1     1.320          5.713    0.017


Table 2.4: Effect of various covariates on HW

Source    Type III Sum of Squares    df    Mean Square    F          P-value
Sex       0.083                      1     0.083          228.508    < 0.001
Age       0.002                      1     0.002          4.621      0.032
BW        0.131                      1     0.131          359.744    < 0.001
Colour    0.002                      1     0.002          4.150      0.042

2.2.1.4 Sample selection

Selective genotyping of extreme phenotypes is an efficient method to increase the power of QTL mapping (Lander and Botstein, 1989). However, the benefits of this method decline with an increasing number of uncorrelated or weakly correlated traits. The focus of our study was to fine map QTL underlying the main, highly correlated traits (FVL and FOW) and a peripheral trait, HW. We did not consider CRW as a basis for sample selection since it was a less well-defined anatomical structure and, unlike FVL and FOW, it was not associated with PFO (see chapter 5). For each trait (FVL, FOW, and HW), we selected approximately 100 F14 animals with extreme phenotypes. Given the overlap between the extreme phenotypes for different traits, 237 mice were selected by this method. To compensate for biases in selective genotyping, selection of extreme phenotypes was combined with a degree of random selection: a further 163 mice were selected randomly, giving a total sample of 400 mice. We took the structure of the breeding protocol into account to select the same number of males and females from the breeding cages. As a result, the selected mice gave as equal as possible a representation of males and females and of breeding cages.
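The two-stage selection (phenotypic extremes first, then a random top-up) can be sketched as follows. This is a simplified illustration under stated assumptions: the extremes are split equally between the low and high tails, and the balancing by sex and breeding cage described above is not modelled. All names and the toy data are hypothetical.

```python
import random


def select_for_genotyping(traits, n_extreme, n_total, seed=0):
    """Pick animals with extreme phenotypes, then top up at random.

    traits: dict mapping trait name -> dict of animal id -> trait value.
    n_extreme: animals taken per trait (assumed split evenly between tails).
    n_total: final sample size after the random top-up.
    """
    chosen = set()
    half = n_extreme // 2
    for values in traits.values():
        ranked = sorted(values, key=values.get)
        if half > 0:
            chosen.update(ranked[:half])   # low tail
            chosen.update(ranked[-half:])  # high tail
    all_ids = set().union(*traits.values())
    remaining = sorted(all_ids - chosen)
    n_random = max(0, n_total - len(chosen))
    rng = random.Random(seed)
    chosen.update(rng.sample(remaining, min(n_random, len(remaining))))
    return chosen


# Toy data: 20 animals, two traits.
toy = {
    "FVL": {i: float(i) for i in range(20)},
    "FOW": {i: float((i * 7) % 20) for i in range(20)},
}
selected = select_for_genotyping(toy, n_extreme=4, n_total=10)
```

Because animals can be extreme for more than one trait, the union of the per-trait extremes is smaller than the sum of the per-trait counts, which is why 237 rather than roughly 300 mice were obtained in the first stage.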

2.2.1.5 Marker selection

We searched genotype data from the Mouse HapMap project for potentially informative single nucleotide polymorphisms (SNPs) between the QSi5 and 129T2/SvEms strains (2009). Markers were selected according to the density of informative SNPs between the two strains; in other words, more markers were selected from regions with high SNP density (Figure 2.1). In total, a set of 135 markers with an average interval of 2 cM was selected for genotyping of the genomic regions of significant QTL from the F2 study (3 QTL for FVL and 3 for FOW) and 1 QTL for heart weight, HW (Supplementary Table 1). In addition, a suggestive QTL for FOW on MMU9 (LOD = 3.43) was included in the AIL study as its peak covered the known ASD/VSD gene Tbx20 (Kirk et al., 2006). Fifteen extra markers were chosen to cover the whole peak of this QTL.


Figure 2.1: Distribution of the selected markers spanning the FOW QTL on chromosome 1, chosen according to the density of informative SNPs between the QSi5 and 129T2/SvEms strains. Blue bars represent the density of informative SNPs and red bars the positions of the selected markers. The LOD score curve for FOW from the F2 study is shown in black.


2.2.1.6 Genotyping

The genotyping was performed at the AGRF. First, genomic DNA was isolated from mouse tails using the Macherey-Nagel Nucleospin kit according to the manufacturer’s guidelines. Prior to genotyping, the informativeness of the markers for the parental strains was verified. Subsequently, a total of 150 markers were genotyped in the selected mice using the iPLEX™ MassArray® assay following the manufacturer’s protocol.

2.2.1.7 Linkage analysis

We performed interval mapping linkage analysis for the quantitative traits at 0.25 cM intervals using a maximum likelihood method implemented in R software. The program was initially developed for an F2 design, but was modified for an AIL using the methods described by Darvasi and Soller (Darvasi and Soller, 1995). Given the density of the selected markers and the size of the genomic region covered by the markers, a LOD score of 2 was set as the threshold level of significance (Lander and Botstein, 1989). We used the 1 LOD drop-off to estimate the confidence interval of each QTL (Lander and Botstein, 1989). However, in some cases two significant peaks were located close together and their 1 LOD drop-off intervals overlapped. To find out whether these peaks reflected the same underlying QTL or identified independent QTL, we re-ran the linkage analysis using a model in which the marker closest to the higher peak was included as a fixed term (Figure 2.2), effectively a simplified form of composite interval mapping but applied in a maximum likelihood framework (Kearsey and Hyne, 1994; Wu and Li, 1994). Disappearance of the lower peak would indicate that the two peaks represented a single QTL. On the other hand, if the lower peak remained significant, the two peaks would represent two separate QTL.
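As an illustration of the 1 LOD drop-off rule, the sketch below scans outward from the peak of a LOD profile and reports the outermost grid positions whose LOD stays within one unit of the peak. The function name and the example profile are hypothetical, and the result is limited to the evaluation grid (no interpolation between grid points).

```python
def lod_support_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) positions of the LOD drop-off interval.

    positions: map positions (cM) at which the LOD was evaluated, ascending.
    lods: LOD score at each position.
    drop: size of the drop-off from the peak (1.0 for a 1 LOD interval).
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical profile evaluated every 0.25 cM.
grid = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
profile = [0.5, 1.2, 2.4, 3.1, 2.6, 1.9, 0.8]
interval = lod_support_interval(grid, profile)
```

Two peaks whose intervals computed this way overlap are the cases that were re-analysed with the peak marker as a fixed term.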


[Figure 2.2, four panels, each plotting LOD score against genetic position (cM) under two models, with one of the two named markers included as a fixed term in each: A, MMU1, FOW (rs32666041, rs30203203); B, MMU2, CRW (rs3695852, rs51010451); C, MMU13, FVL (rs29552398, rs29250410); D, MMU19, FVL (rs30653264, rs30464413).]

Figure 2.2: Linkage analysis of the AIL data. For each chromosome, two different models were used; in each, one peak marker was included as a fixed term. Vertical blue lines correspond to the positions of the AIL markers. The bold arrows indicate the positions of the fixed markers.


We also extended the QTL program in R to perform linkage analysis for PFO as a binary trait (presence or absence). For the binary analysis, the linear model used for the quantitative analysis was replaced by a generalized linear model in the form of a logistic regression model; the procedure was otherwise the same as that described previously, modified for an AIL design. As far as we are aware, this was the first time a method for analysis of a binary trait in an AIL design has been implemented.

The published statistical methods are described in detail in a separate chapter (see chapter 3) (Moradi Marjaneh et al., 2012).

2.2.1.8 Conversion of genetic maps

We had previously defined the genetic positions of the F2 markers using an old mouse genetic map developed by the Whitehead Institute and the Massachusetts Institute of Technology (MIT) (Dietrich et al., 1996). However, the genetic positions of the AIL markers were based on the current standard genetic map established by the Mouse Genome Informatics (MGI) project at the Jackson Laboratory (Bult et al., 2008). To be able to superimpose the LOD plots obtained from the F2 and AIL studies, we converted the genetic positions of the F2 markers from the Whitehead/MIT map to the MGI map using the mouse maps introduced by Cox et al. (Cox et al., 2009). For intermarker intervals, linear interpolation was used to convert old genetic positions to new ones.
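The interpolation step can be sketched as below. It assumes a set of anchor markers with known positions on both maps, listed in strictly ascending order; the function name and the anchor values are hypothetical and are not taken from either map.

```python
from bisect import bisect_right


def convert_position(old_pos, old_map, new_map):
    """Convert a position from the old map to the new map.

    old_map, new_map: positions (cM) of the same anchor markers on the
    two maps, in the same order, with old_map strictly ascending.
    Positions between two anchors are interpolated linearly.
    """
    if not old_map[0] <= old_pos <= old_map[-1]:
        raise ValueError("position lies outside the anchored region")
    # Index of the right-hand anchor of the interval containing old_pos.
    j = min(max(bisect_right(old_map, old_pos), 1), len(old_map) - 1)
    x0, x1 = old_map[j - 1], old_map[j]
    y0, y1 = new_map[j - 1], new_map[j]
    return y0 + (old_pos - x0) * (y1 - y0) / (x1 - x0)


# Hypothetical anchors: three markers positioned on both maps.
old_anchors = [10.0, 20.0, 40.0]
new_anchors = [12.0, 25.0, 45.0]
converted = convert_position(15.0, old_anchors, new_anchors)
```

Positions outside the anchored region are rejected rather than extrapolated, since the text describes interpolation only within intermarker intervals.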


2.2.2 Whole genome sequencing of the AIL parental strains

The whole process, including high-throughput sequencing and data analysis, was carried out at the Victor Chang Cardiac Research Institute (VCCRI). Details of each step are provided here.

2.2.2.1 Step 1: DNA extraction and quantification

Genomic DNA was extracted from mouse liver specimens, one from QSi5 and one from 129T2/SvEms, using the phenol-chloroform extraction method (Sambrook et al., 1989). The concentration and quality of the extracted DNA were assessed with a NanoDrop spectrophotometer (Table 2.1).

Table 2.1: DNA properties measured by NanoDrop spectrophotometer

                         QSi5    129T2/SvEms
Concentration (ng/μl)    238     334
Absorbance A260/280      1.98    1.92
Absorbance A260/230      1.93    1.65

From here onwards, the steps carried out for DNA library preparation and sequencing are described. The same methodology was applied to both DNA samples. The methodology for the library preparation, amplification and sequencing steps described below is taken from the Applied Biosystems SOLiD™ 4 System guidelines.

2.2.2.2 Step 2: Library preparation

To adapt the samples for SOLiD™ System sequencing, a genomic DNA library was prepared according to the SOLiD™ 4 System Library Preparation Guide (2010). The fragment library method was used, in which DNA was first sheared into small fragments and then forward and reverse adaptors (P1 and P2) were ligated to each end of the sheared DNA (Figure 2.3).

Figure 2.3: Basics of fragment library method. Adapted from Figure 2 in SOLiD™ 4 System Library Preparation Guide (2010)

Shear the DNA

End-repair the DNA: 1. Repair the DNA ends with End Polishing Enzyme 1 and End Polishing Enzyme 2 2. Purify the DNA with SOLiD™ Library Column Purification Kit

Ligate P1 and P2 Adaptors to the DNA: 1. Ligate the P1 and P2 Adaptors to the DNA 2. Purify the DNA with the SOLiD™ Library Column Purification Kit

Size-select the DNA

Nick-translate, then amplify the library: 1. Nick-translate, then amplify the library 2. Purify the DNA with the SOLiD™ Library Column Purification Kit

Quantitate the library by performing quantitative PCR (qPCR)

Figure 2.4: Workflow for preparing a standard fragment library. Adapted from SOLiD™ 4 System Library Preparation Guide (2010)


An overview of the workflow is illustrated in Figure 2.4. Each step is summarized below.

• Shearing the DNA: The genomic DNA was sonicated into small fragments with a mean fragment size of 165 bp using a Covaris™ S2 System.

• End-repairing the DNA: End Polishing Enzyme 1 and End Polishing Enzyme 2 were used to convert DNA that had damaged or incompatible 5′-protruding and/or 3′-protruding ends to 5′-phosphorylated, blunt-ended DNA.

• Purifying the DNA: This was done using the SOLiD™ Library Column Purification Kit.

• Ligating the P1 and P2 Adaptors to the DNA: P1 and P2 Adaptors were ligated to the ends of the end-repaired DNA.

• Size-selecting the DNA: The ligated and purified DNA was run on a SOLiD™ Library Size Selection gel. The correctly sized ligation products (200 to 230 bp) were electrophoresed to the collection wells of the Size Selection gel. The eluate in each collection well could be transferred directly to the nick translation reaction.

• Nick-translating, then amplifying the library: The eluates from the SOLiD™ Library Size Selection gel underwent nick translation and subsequently amplification using Library PCR Primers 1 and 2 and Platinum® PCR Amplification Mix. After amplification, PCR samples were purified with the SOLiD™ Library Column Purification Kit.

• Quantitating the library by performing quantitative PCR (qPCR): The library was quantitated using the SOLiD™ Library TaqMan® Quantitation Kit. In total, we quantitated 4 samples, 2 from the QSi5 and 2 from the 129T2/SvEms libraries (Table 2.2). The quantitation results for 2 samples are shown in Figure 2.5.

Table 2.2: DNA size and concentration for the library samples which were quantitated by qPCR.

No.    Sample                        Size (nt)    Concentration (ng/μl)
1      129T2/SvEms                   209          13.61
2      129T2/SvEms (1:2 dilution)    205          5.9
3      QSi5                          208          33.12
4      QSi5 (1:2 dilution)           210          15.29


Figure 2.5: Quantitation of the libraries by qPCR. A illustrates the detected PCR bands which correspond to the 4 samples described in Table 2.2. B and C represent the quantitation results for samples 1 and 3 (129T2/SvEms and QSi5), respectively.

2.2.2.3 Step 3: Amplification

After construction of the libraries, the DNA was amplified using emulsion PCR according to the SOLiD™ 4 System Templated Bead Preparation Guide (2010).


Prepare the full-scale emulsion PCR (ePCR) reaction: 1. Prepare the oil phase 2. Prepare the aqueous phase 3. Prepare the SOLiD™ P1 DNA Beads 4. Create emulsion using the ULTRA-TURRAX® Tube Drive from IKA® 5. Perform the ePCR reaction and inspect the emulsion

Perform the emulsion break and bead wash 1. Break the emulsion 2. Wash the templated beads 3. Quantitate the beads

Enrich the full-scale templated beads 1. Prepare Denaturing Buffer solution 2. Prepare 60% glycerol 3. Prepare the enrichment beads 4. Prepare the templated beads for enrichment 5. Enrich the templated beads 6. Isolate the P2-enriched beads

Modify the 3′ ends 1. Extend 3′ ends with Terminal Transferase and Bead Linker 2. Quantitate the beads

Figure 2.6: Workflow for preparing emulsion PCR (ePCR) reaction. Adapted from SOLiD™4 System Templated Bead Preparation Guide (2010)

Figure 2.6 shows the workflow for this step. The tasks performed are summarized below.

• Preparing the full-scale emulsion PCR (ePCR) reaction: The oil phase and aqueous phase of the emulsion were prepared separately, and then emulsified using the ULTRA-TURRAX® Tube Drive from IKA®. Each emulsion was seeded with 1.6 billion SOLiD™ P1 DNA Beads, and then transferred into a single 96-well plate for cycling.

• Performing the emulsion break and bead wash: The emulsion break used 2-butanol to purify emulsified templated beads from the oil phase following amplification. The beads were washed to remove residual 2-butanol, oil, and aqueous phase containing PCR reagents. The SOLiD™ Emulsion Collection Tray was placed over the 96-well plate, and then the plate was centrifuged. Next, 2-butanol was added to the reservoir. The broken emulsion was transferred to a 50-mL tube for further processing.

• Enriching the full-scale templated beads: The templated bead enrichment procedure isolated beads with full-length extension products following ePCR. Beads with full-length extension products were isolated by oligo hybridization using the sequence of the P2 primer. Both monoclonal and polyclonal beads were enriched.

• Modifying the 3′ ends: The P2-enriched beads were extended with a Bead Linker by Terminal Transferase.

2.2.2.4 Step 4: Sequencing

After ePCR and enrichment, the templated beads were deposited onto a slide. The templates were then sequenced on the SOLiD™ 4 System according to the SOLiD™ 4 System Instrument Operation Guide (2010). We ran the SOLiD™ Analyzer twice; in each run two slides were analyzed, one from QSi5 and one from 129T2/SvEms. As described below, the reads obtained from the two runs for each strain were pooled for data analysis.

2.2.2.5 Step 5: Data analysis

The pipeline for data analysis is illustrated in Figure 2.7.


[Figure 2.7, flow chart: pipeline for whole genome sequencing data analysis. Main stages: bin reads; assess reads quality; map reads (BFAST); assess mappability and coverage; pre-process for variant calling; call variants; filter variants*. Steps shown within the stages: create BFAST index; convert .csfasta and .qval files to fastq; run BFAST Match; run BFAST Align; run BFAST Postprocess (SAM outputs); convert SAM to BAM format and sort; merge BAM files and index; filter out unmapped reads using bamtools; mark PCR duplicates using Picard; local realignment using GATK; recalibrate quality values using GATK; call variants using GATK’s UnifiedGenotyper; recalibrate variant quality.]

Figure 2.7: Data analysis pipeline for whole genome sequencing of AIL parental strains

* This step is fully described in chapter 6.


Figure 2.8: Average quality values by position in read

In summary, the reads were first binned into different quality value bins and then assessed for quality (Figure 2.8). The reads were then mapped to the reference genome using BFAST (BLAT-like Fast Accurate Search Tool), a DNA sequence aligner, resulting in outputs in SAM (Sequence Alignment/Map) format. The SAM files were merged into run-level BAM (binary version of the SAM format) files, giving 4 BAM files corresponding to the 4 sequencing runs, 2 for each strain. The run-level BAM files were then merged into library-level BAM files, giving 2 BAM files corresponding to the two sequencing libraries, QSi5 and 129T2/SvEms. The mappability and coverage of the reads were also assessed. QSi5 and 129T2/SvEms reads showed 76.5% and 68.3% mappability (Figure 2.9), and 8 and 6 times coverage of the genome, respectively (Figure 6.2). Pre-processing of the mapped reads for variant calling included marking PCR duplicates, local re-alignment and quality recalibration.


Figure 2.9: Mappability of the total reads

Variant calling was performed using GATK, the Genome Analysis Toolkit, resulting in output files in VCF format.

GATK was also used for variant quality recalibration. In this method, a high-confidence set of variants is used as a truth dataset to “learn” the characteristics of “good” SNPs. The “learned” characteristics are then used to recalibrate the quality scores of the called variants to allow selection of higher-quality SNPs. In human studies, HapMap data are usually used as the “truth” dataset in the model. However, HapMap data are currently very limited for mouse, and using mouse HapMap as a “truth” dataset in our analysis did not provide high-quality results. Instead, we used the subset of our called variants with the highest quality values as the “truth” dataset. Mouse dbSNP was used as both the training and known sets of variants.

Next, a multi-step process was applied to filter the variants and consequently the candidate genes. This included applying multiple criteria, SNP annotation using SNP Effect Predictor (SnpEff) and Ensembl Variant Effect Predictor (VEP), and genomic analysis of SNP density; full details are provided in Chapter 6.


3 Statistical methods for QTL mapping of complex binary traits in advanced intercross line

3.1 Introduction

Mapping QTL using inbred lines requires crossing to produce a segregating generation, usually the F2. However, the resolution of map position in F2 resources is quite low, usually of the order of 20 to 30 cM, which is not useful for identification of the gene(s) and mutation(s) underlying the QTL. There is insufficient opportunity for recombination in generating an F2 resource and thus a QTL can only be assigned to a large haplotype block. As discussed in 1.3.1.3.3.1, one approach to resolving this problem involves the construction of recombinant inbred lines (RILs), usually by sib mating within the line from the F2 generation onward, providing about a fourfold improvement in resolution and also a mapping resource where phenotypes can be measured with extremely high precision (Williams et al., 2001). However, RILs are extremely expensive to generate and to maintain.

An advanced intercross line (AIL) is an alternative approach to higher-resolution mapping of QTL (see 1.3.1.3.3.2) (Darvasi and Soller, 1995). In an AIL, additional generations of intercrossing to generation 10 or beyond are expected to improve the resolution of map position of the QTL down to about 1 cM, provided enough markers are available. An AIL is much more easily and economically bred than a set of RILs, but provides lower precision of phenotype measurement (single measurement), is transient rather than permanent as is the case for RILs, and is at risk of random genetic drift causing random fixation of marker or QTL alleles during the additional generations of intercrossing. Nevertheless, AILs are an important tool in the QTL mapping strategy.

The complexity of inheritance that can underpin a binary phenotype can be explained by a threshold model assuming an unobservable quantitative variable termed liability underlying the binary phenotype, with presence of the abnormality above a threshold and absence below (Falconer, 1965). Two broadly different approaches may be applied for QTL mapping of complex binary traits. Firstly, one may identify a continuously distributed measure reflective of the binary trait and map QTL for this trait using the standard QTL mapping techniques. For example, several quantitative measurements such as blood pressure and serum cholesterol have been linked to the risk of cerebro-cardiovascular diseases. These risk factors can be studied to map QTL in animal models, providing an indirect albeit powerful approach for genetic dissection of such complex diseases (Gallardo et al., 2008; Johnson et al., 2009). As another example, quantitative parameters of atrial septal anatomy in the murine heart have been shown to be correlated with the risk of patent foramen ovale (PFO), a common abnormal connection between the two atria due to incomplete fusion of the septum primum and the septum secundum after birth (see 1.6) (Kirk et al., 2006). PFO in humans is associated with stroke, migraine and other disorders. Among the parameters mapped, the length of the septum primum (flap valve length; FVL) is strongly negatively associated with the risk of PFO (see chapter 5) (Kirk et al., 2006). Using an F2 resource, these atrial septal measures were mapped as proxies for the liability to the binary trait of PFO and the identified QTL were assumed also to influence the risk of PFO (Kirk et al., 2006). The second approach for QTL mapping of complex binary traits is direct mapping of the liability underlying the binary trait, although this presents a greater statistical challenge due to the nonlinear association between the binary phenotype and the quantitative liability. However, several approaches have been developed that can be applied here, including parametric QTL mapping using a generalized linear model (Hackett and Weller, 1995; Visscher et al., 1996).

We have conducted an AIL study to fine map QTL underlying atrial septal morphology in which both approaches described above were applied. This is fully described in chapter 5. The present chapter is focused on describing the statistical methods we used in our AIL design in order to analyse our main trait of interest, the all-or-none trait of patent foramen ovale (PFO). This represents the first methodology developed for analysis of a binary trait in an AIL design.

The statistical methods described here were developed by Dr. Peter Thomson, our collaborator at the University of Sydney. My contribution to this work included participation in applying this methodology to our AIL data (see section 3.3) and writing a technical paper on this (Moradi Marjaneh et al., 2012).

3.2 Statistical methodology for mapping of complex binary traits

For binary traits, a logistic regression model is fitted to the data (Kirk et al., 2006),

\[ \log_e\!\left(\frac{p_i}{1 - p_i}\right) = \mathbf{x}_i'\boldsymbol{\beta} + a\,q_i^{(QQ)} - a\,q_i^{(qq)} + d\,q_i^{(Qq)} \]

where

$p_i = P(Y_i = 1)$ = probability of the binary outcome, where $Y_i$ = 1 (disease present) or 0 (disease absent);

$\mathbf{x}_i$ = set of predictor variables to adjust for (e.g. sex, age, interaction terms, and a constant);

$\boldsymbol{\beta}$ = regression coefficients corresponding to the predictors;

$a$ = additive QTL effect; and

$d$ = dominance QTL effect,

and where

$q_i^{(\cdot)}$ = 0-1 unobserved indicator variable indexing the QTL genotype (QQ, qq, or Qq),

with $q_i^{(QQ)} + q_i^{(qq)} + q_i^{(Qq)} = 1$. Note that it is assumed that allele Q originates from parental line 1 (all QQ), and allele q from parental line 2 (all qq).

Because the QTL genotype indicator variables are unobserved, the model is fitted as a three-component mixture, with mixing probabilities $\pi_i^{(QQ)} = P\{q_i^{(QQ)} = 1 \mid \mathbf{m}_i\}$, $\pi_i^{(Qq)} = P\{q_i^{(Qq)} = 1 \mid \mathbf{m}_i\}$, and $\pi_i^{(qq)} = P\{q_i^{(qq)} = 1 \mid \mathbf{m}_i\}$, with $\pi_i^{(QQ)} + \pi_i^{(Qq)} + \pi_i^{(qq)} = 1$, where these are the conditional probabilities of the QTL genotype, given the flanking marker genotypes, $\mathbf{m}_i$. For an inbred F2 design, these probabilities can be calculated as outlined in Lynch & Walsh (Lynch and Walsh, 1998), and depend on the recombination rate between flanking markers and the putative QTL. The Haldane mapping function is used in calculation of the recombination rate.

For the AIL design, these mixing probabilities are calculated as a modification to the standard way for an inbred F2 design. The modification required for the AIL design is the ‘exact’ method described by Darvasi & Soller (Darvasi and Soller, 1995) and involves the following recombination probabilities:

\[ r'_1 = \tfrac{1}{2}\left[1 - (1 - r_1)^{n-2}(1 - 2r_1)\right] \]

\[ r'_2 = \tfrac{1}{2}\left[1 - (1 - r_2)^{n-2}(1 - 2r_2)\right] \]

\[ r'_{12} = \tfrac{1}{2}\left[1 - (1 - r_{12})^{n-2}(1 - 2r_{12})\right] \]

where $r_1$ and $r_2$ are the recombination rates between the putative QTL and flanking markers 1 and 2; $r_{12}$ is the recombination rate between the flanking markers; and $n$ is the number of AIL generations. (Clearly, with $n = 2$ these simply return the F2 rates.) Here, the AIL recombination rates $r'_1$, $r'_2$ and $r'_{12}$ replace $r_1$, $r_2$ and $r_{12}$ in the F2 QTL interval mapping procedure.
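A minimal numerical sketch of these two ingredients follows: the Haldane mapping function, which converts a map distance to a recombination rate, and the AIL adjustment given above. The function names are illustrative and are not those of the in-house program.

```python
import math


def haldane_recombination(distance_cM):
    """Haldane mapping function: map distance (cM) -> recombination rate."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cM / 100.0))


def ail_recombination(r, n):
    """Recombination probability accumulated by generation n of an AIL.

    r: single-generation recombination rate between two loci.
    n: AIL generation number (n = 2 corresponds to the F2).
    """
    return 0.5 * (1.0 - (1.0 - r) ** (n - 2) * (1.0 - 2.0 * r))


r_f2 = haldane_recombination(2.0)    # two loci 2 cM apart
r_f14 = ail_recombination(r_f2, 14)  # the same loci at generation 14
```

The accumulated rate grows with the generation number and approaches 0.5, which is why the same marker interval resolves QTL position more finely in the F14 than in the F2.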

The model was fitted using maximum likelihood, incorporating the expectation-maximisation (E-M) algorithm as outlined in the mixture model approach of McLachlan and Basford (McLachlan and Basford, 1988) and Jansen (Jansen, 1992).

The E-step involves calculation of the posterior probability of an animal having a particular QTL genotype, given its phenotype and flanking marker information. For genotype QQ this probability is

\[ \tau_i^{(QQ)} = P\{q_i^{(QQ)} = 1 \mid \mathbf{m}_i, y_i\} = \frac{P\{q_i^{(QQ)} = 1 \mid \mathbf{m}_i\}\; p(y_i \mid q_i^{(QQ)} = 1, \mathbf{m}_i)}{p(y_i \mid \mathbf{m}_i)} \]

where

\[ p(y_i \mid \mathbf{m}_i) = \pi_i^{(QQ)}\, p(y_i \mid q_i^{(QQ)} = 1, \mathbf{m}_i) + \pi_i^{(Qq)}\, p(y_i \mid q_i^{(Qq)} = 1, \mathbf{m}_i) + \pi_i^{(qq)}\, p(y_i \mid q_i^{(qq)} = 1, \mathbf{m}_i). \]

Here, $p(y_i \mid q_i^{(\cdot)} = 1, \mathbf{m}_i)$ is the probability of the observed disease status as calculated from the logistic regression model. Similar expressions can be obtained for the posterior probabilities of genotypes Qq ($\tau_i^{(Qq)}$) and qq ($\tau_i^{(qq)}$), with $\tau_i^{(QQ)} + \tau_i^{(Qq)} + \tau_i^{(qq)} = 1$.
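For a single animal, the E-step can be sketched as follows. This is an illustration of the formulae above rather than the in-house R code: the function names are hypothetical, and the linear predictor for the adjusting covariates is passed in as one number (`baseline`).

```python
import math


def logistic(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def e_step(y, baseline, a, d, prior):
    """Posterior QTL genotype probabilities (tau) for one animal.

    y: observed binary phenotype (1 = disease present, 0 = absent).
    baseline: value of the covariate part of the linear predictor.
    a, d: additive and dominance QTL effects on the logit scale.
    prior: dict of mixing probabilities (pi) for "QQ", "Qq" and "qq".
    """
    eta = {"QQ": baseline + a, "Qq": baseline + d, "qq": baseline - a}
    lik = {}
    for genotype, value in eta.items():
        p = logistic(value)
        lik[genotype] = p if y == 1 else 1.0 - p
    marginal = sum(prior[g] * lik[g] for g in lik)
    return {g: prior[g] * lik[g] / marginal for g in lik}


pi_i = {"QQ": 0.25, "Qq": 0.5, "qq": 0.25}
tau_i = e_step(y=1, baseline=-0.7, a=0.8, d=0.1, prior=pi_i)
```

With a positive additive effect, an affected animal has its QQ probability raised above the prior and its qq probability lowered, while the three posteriors still sum to one.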

The M -step in volves m aximisation o f the ex pected l og-likelihood, a nd i s a chieved through a call to a standard generalised linear model fitting procedure, but the procedure needs to be able to specify weights for each record. To accommodate the analysis, data need to be organised as follows, with the data being repeated three times, once for each possible QTL genotype:

Binary Data    Predictors    q(QQ)    q(Qq)    q(qq)    Weight
Y              x             1        0        0        τ(QQ)
Y              x             0        1        0        τ(Qq)
Y              x             0        0        1        τ(qq)

Thus with n binary observations, the data set provided to the generalised linear model call is of length 3n. The fitting is conducted iteratively, with the set of weights (posterior probabilities) updated at each iteration. Iteration is terminated on convergence of the parameter estimates and log-likelihood,

\[ \log_e L(\boldsymbol{\beta}, a, d) = \sum_{i=1}^{n} \log_e p(y_i \mid \mathbf{m}_i) = \sum_{i=1}^{n} \log_e\left\{ \pi_i^{(QQ)}\, p(y_i \mid q_i^{(QQ)} = 1, \mathbf{m}_i) + \pi_i^{(Qq)}\, p(y_i \mid q_i^{(Qq)} = 1, \mathbf{m}_i) + \pi_i^{(qq)}\, p(y_i \mid q_i^{(qq)} = 1, \mathbf{m}_i) \right\}. \]
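The stacked data layout and the convergence criterion can be sketched as below. The sketch only builds the 3n weighted records and evaluates the mixture log-likelihood; the weighted logistic fit itself is left to whatever generalised linear model routine is available. All names are hypothetical.

```python
import math

GENOTYPES = ("QQ", "Qq", "qq")


def stack_records(y, x, tau):
    """Repeat each record three times, once per possible QTL genotype.

    Returns rows of (y, x, q_QQ, q_Qq, q_qq, weight), where the weight is
    the posterior probability tau of that genotype for that animal.
    """
    rows = []
    for yi, xi, ti in zip(y, x, tau):
        for g in GENOTYPES:
            indicators = tuple(1 if g == h else 0 for h in GENOTYPES)
            rows.append((yi, xi) + indicators + (ti[g],))
    return rows


def mixture_loglik(y, prior, p_disease):
    """Mixture log-likelihood summed over animals.

    prior: per-animal dicts of mixing probabilities (pi).
    p_disease: per-animal dicts of fitted disease probabilities by genotype.
    """
    total = 0.0
    for yi, pi_i, p_i in zip(y, prior, p_disease):
        lik = sum(pi_i[g] * (p_i[g] if yi == 1 else 1.0 - p_i[g])
                  for g in GENOTYPES)
        total += math.log(lik)
    return total


y_obs = [1, 0]
x_obs = [(1.0,), (1.0,)]
tau_obs = [{"QQ": 0.5, "Qq": 0.3, "qq": 0.2},
           {"QQ": 0.1, "Qq": 0.6, "qq": 0.3}]
stacked = stack_records(y_obs, x_obs, tau_obs)
```

In each E-M cycle the weights in the last column are refreshed from the E-step, the weighted model is refitted, and the log-likelihood is re-evaluated until it stops changing.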

The mixture model can then be fitted in steps (e.g. 1 cM, 0.25 cM) along the length of the chromosome, returning the LOD value as well as parameter estimates at each fitted position. At the peak LOD QTL position(s), as well as the estimates of a and d (on the logit scale or liability scale), the posterior probabilities $\tau_i^{(QQ)}, \tau_i^{(Qq)}, \tau_i^{(qq)}$ can again be obtained, and these can be used to predict the likelihood of each animal having a particular QTL genotype, and hence used in subsequent screening or selective breeding.
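The text reports LOD values without restating how they are obtained from the fitted log-likelihoods. Under the standard definition (the base-10 logarithm of the likelihood ratio), and assuming the comparison is against a model with no QTL at the tested position, the conversion from natural-log likelihoods is a one-liner. The function name is illustrative.

```python
import math


def lod_score(loglik_qtl, loglik_null):
    """LOD = log10 of the likelihood ratio, from natural-log likelihoods."""
    return (loglik_qtl - loglik_null) / math.log(10.0)


# Hypothetical log-likelihoods at one map position.
example_lod = lod_score(-250.0, -250.0 - math.log(1000.0))
```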

The procedure was coded using an in-house program written in R (R Development Core Team, 2010), as an inbred-line version of the QTL-MLE program described by Thomson et al. (Thomson et al., 2007).

3.3 Application

PFO is a common abnormal connection between the two atria due to incomplete fusion of the septum primum and the septum secundum after birth. In humans, PFO is associated with stroke, migraine and several other disorders (Webster et al., 1988; Wilmshurst et al., 2000). Previously, quantitative parameters of atrial septal anatomy in the murine heart have been shown to be associated with the risk of PFO (Biben et al., 2000; Kirk et al., 2006). Among the parameters studied, the length of the septum primum (flap valve length; FVL) is strongly negatively associated with the risk of PFO (see chapter 5) (Biben et al., 2000; Kirk et al., 2006). Using an F2 intercross of inbred mice, the atrial septal parameters were mapped as proxies for the liability to the binary trait of PFO and the identified QTL were assumed also to influence the risk of PFO (Kirk et al., 2006).

We have fine mapped the QTL identified by the F2 study using an AIL (see chapter 5). In summary, an AIL was established as a cross between QSi5 and 129T2/SvEms mice, which show extremes of atrial septal variation and PFO, and this was continued until generation 14. A total of 933 F14 mice were phenotyped for various traits including PFO, which we have considered to be an all-or-none binary trait, and FVL, a measurement-scale trait. Of these, 400 animals were selected, including those with phenotypic extremes (of the measurement-scale traits), and were subsequently genotyped for 150 markers at an average interval of 2 cM. The markers were distributed over 8 chromosomes, aiming to span the F2 QTL regions previously identified by Kirk et al. (Kirk et al., 2006). QTL mapping of quantitative atrial septal parameters confirmed and finely mapped the F2 QTL. In addition, the methodology described above was successfully applied to map QTL underlying the binary trait of PFO. The full study and results can be found in chapter 5.

Here, as an example, the MMU1 results for PFO and FVL are presented (Figure 3.1). QTL interval mapping of MMU1 for both PFO and FVL was conducted using 30 markers, with the profile log-likelihood evaluated at 0.25 cM intervals. Similar linkage patterns for PFO and FVL were observed, suggesting that the assumed underlying liability scale for PFO is probably very similar to the measurement-scale FVL trait. To obtain further support for this, we configured a synthetic binary trait, FVL-binary, constructed by dichotomising FVL at 0.95 mm (Y = 0 if FVL > 0.95 mm; Y = 1 if FVL < 0.95 mm). The cut-off point of 0.95 mm was chosen as this gives approximately the same proportion of mice having a small FVL (34.7%) as those having PFO (33.7%), and indeed there was a highly significant association between the binary traits of PFO and FVL-binary (odds ratio = 6.5, P < 0.0001). The QTL mapping of FVL-binary also resulted in a similar pattern to those for PFO and FVL (Figure 3.1). All three parameters showed strong linkage to a marker at 41.6 cM (rs30203203). However, not surprisingly, the peak LOD for the FVL trait is higher than for the binary traits, presumably because of the greater information content of a measurement-scale trait over a binary trait (although this was not always the case in this study).
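The construction of FVL-binary and its association with PFO can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical and are not taken from the study; the handling of a value exactly equal to the cut-off is also an assumption, since the definition above only covers FVL above or below 0.95 mm.

```python
def dichotomise(values, cutoff):
    """Code each measurement as 1 if it is below the cutoff, otherwise 0.

    A value exactly equal to the cutoff is coded 0 here (an assumption).
    """
    return [1 if v < cutoff else 0 for v in values]


def odds_ratio(x, y):
    """Odds ratio for two paired 0/1 vectors, computed from their 2x2 table."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return (a * d) / (b * c)


# Hypothetical toy data: flap valve length (mm) and PFO status for 8 animals
fvl_mm = [0.80, 0.90, 1.00, 1.10, 0.85, 1.20, 0.92, 1.05]
pfo = [1, 1, 0, 0, 0, 0, 1, 1]

fvl_binary = dichotomise(fvl_mm, cutoff=0.95)
toy_or = odds_ratio(fvl_binary, pfo)
print(fvl_binary, toy_or)
```

An odds ratio above 1 in such a table indicates that animals with a short flap valve are more often PFO-positive, which is the direction of association reported above.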

[Figure 3.1 plot: LOD score (y-axis, 0 to 7) against genetic position in cM (x-axis, 0 to 70) for PFO, FVL and FVL-binary, with marker rs30203203 labelled.]

Figure 3.1: Interval maps for QTL on MMU1. PFO was mapped using a binary trait model, whereas FVL was mapped assuming a normally distributed trait. FVL-binary is a trait constructed by coding FVL as 1 (FVL < 0.95 mm) or 0 (FVL > 0.95 mm) and then analysed as a binary trait. The position of rs30203203 is illustrated at 41.6 cM.

3.4 Some extensions

The methods described here cover situations with a binary outcome such as PFO. The method is easily extended to situations where the outcome is ordinal, as occurs for differing severity of disease. Because of the implementation using the E-M algorithm, relatively little re-coding is required: a call to an ordinal logistic regression model replaces the call to a binary logistic regression, but again this is performed iteratively and weights must be provided. This is easily achieved using the polr() function instead of the glm() function in R (Thomson et al., 2007).

A further complication arises when there are repeated observations over time on the same animal, introducing correlation structure into the data. In general, correlated data are handled by the use of either marginal models or random effect models (Diggle et al., 2002), and this can also be achieved for binary trait QTL analysis. For the marginal modelling approach, a generalised estimating equations approach has been outlined for non-normal traits by Lange & Whittaker (Lange and Whittaker, 2001) and Thomson (Thomson, 2003). Alternatively, generalised linear mixed model approaches have been proposed by Kayis (Kayis, 2000) and Wittenburg et al. (Wittenburg et al., 2007).

3.5 Concluding remarks

The methodology described here was successfully applied to map QTL for the binary traits of PFO and FVL-binary and can also be utilised for any other complex binary trait in an AIL design. The mapping results for quantitative parameters of atrial septal morphology were remarkably similar to those for the binary trait of PFO on most of the chromosomes studied in our AIL design (see chapter 5), suggesting that the binary approach may be applied as a confirmatory tool for the quantitative analysis, providing more support for the quantitative QTL. In addition, when multiple quantitative measures are studied, the one which shows the most similar linkage pattern to the binary trait may be the best quantitative indicator of the assumed underlying liability scale and can be used in further studies. Collectively, these data suggest that a combination of QTL mapping of the liability itself and the quantitative measures acting as proxies for the liability provides greater insight into the QTL architecture than either approach alone.


4 Investigation of association between PFO complicated by cryptogenic stroke and a common variant of the cardiac transcription factor GATA4

4.1 Introduction

Stroke is a leading cause of morbidity and mortality (CDC, 2001; CDC, 2010). While many cases can be attributed to known factors such as atherosclerosis, there is an important subset in which no cause is readily apparent, termed cryptogenic stroke (Sacco et al., 1989; Homma et al., 2002; Musolino et al., 2003). The association between PFO and cryptogenic stroke is discussed in detail in Section 1.6.1.1. Briefly, young patients with cryptogenic stroke are known to have a significantly higher incidence of PFO than controls (Lechat et al., 1988; Homma et al., 1994; Hubail et al., 2011). The aetiological link is thought to be passage of thrombus across the atrial septum into the systemic arterial circulation, termed paradoxical embolism, although other factors may be involved (Lechat et al., 1988; Webster et al., 1988). PFOs in patients with stroke are generally large (>2 mm) and often associated with increased mobility or aneurysm of the fossa ovalis membrane, the remnant of the embryonic septum primum (Bridges et al., 1992; De Castro et al., 2000; Schuchlenz et al., 2000; Aslam et al., 2006).

As described in Section 1.8, PFO and ASDII are known to form a pathological continuum. We therefore hypothesized that variants in genes known to be associated with ASD may be important in the causation of PFO, and hence may have a role in the causation of cryptogenic stroke.


GATA4 is a member of the conserved GATA family of zinc finger transcription factors with a key role in regulation of genes involved in cardiac morphogenesis and differentiation (Kuo et al., 1997; Molkentin et al., 1997). Mutations in GATA4 are associated with CHD in humans, particularly ASDII (Garg et al., 2003). Indeed, it is apparent that perturbations in multiple members of the evolutionarily conserved cardiac transcription factor network cause or predispose to ASDII and PFO. The most prevalent variant of the GATA4 gene is an A>G non-synonymous transition at position c.1647 resulting in a serine to glycine change at amino acid 377 (GATA4 S377G), with a minor allele frequency of 0.11 (Poirier et al., 2003). Although this is a conservative amino acid change, the polymorphism is located within the C-terminal GATA4 domain, which is highly conserved in vertebrate species and essential for GATA4 transcriptional activity (Morrisey et al., 1997; Charron et al., 2001; Garg et al., 2003). In vitro functional analysis using the alpha myosin heavy chain promoter has revealed that GATA4 S377G has significantly enhanced transcriptional activity compared to wildtype GATA4 (Schluterman et al., 2007). Although S377G had shown no significant association with migraine (Sherin et al., 2008), an association with CHD including ASD and PFO had not been characterized at the time of our study.

This chapter describes our study exploring whether there was an association between this GATA4 variant and ASD or PFO in patients unselected for family history. First, the global distribution of GATA4 S377G is presented, which justifies selection of the Caucasian population as the target ethnic group for this study. Next, the association between PFO complicated by cryptogenic stroke and GATA4 S377G in two independent studies in Australia and Germany, as well as in pooled data from both studies, is described. To focus on the more severe end of the PFO spectrum, PFO subjects comprised largely patients with cryptogenic stroke in whom PFO was subsequently found by echocardiography. Finally, the findings from these datasets are discussed.

Contributions by MM to this study included designing the experiments (assisted by Prof. Richard Harvey, Dr. Edwin Kirk, Dr. Maximilian Posch, Prof. Lyn Griffiths, Dr. Diane Fatkin, Dr. Jeremy Martinson, Dr. David Winlaw, and Prof. Michael Feneley), data collection in Australia including review of medical records (assisted by Dr. Edwin Kirk, Dr. Robyn Otway, Dr. Tanya Butler, and Ms. Gillian Blue), management of samples from German patients for genotyping on an independent platform, data analysis (assisted by Dr. Edwin Kirk, Dr. Maximilian Posch, and Dr. Cemil Ozcelik), and taking primary responsibility for writing a paper on this study (assisted by Prof. Richard Harvey, Dr. Edwin Kirk, and Dr. Maximilian Posch) (Moradi Marjaneh et al., 2011). Data collection in Germany was performed by Dr. Maximilian Posch. Dr. Jeremy Martinson provided the data on S377G allele distribution in indigenous human populations (see below). Australian patients were recruited by Prof. Michael Feneley, Dr. David Winlaw, Dr. Diane Fatkin, and Prof. Lyn Griffiths. German patients were recruited by Prof. Felix Berger.

Certain parts of this work had been carried out before MM arrived. A preliminary association study had identified an excess of S377G in subjects with PFO and stroke, which was statistically significant relative to the Caucasian transesophageal echocardiogram (TEE) controls. However, no significant difference between the ASD group and controls was seen. To confirm this result a replication study was initiated, and during the process of recruiting more patients MM joined the Harvey lab and took primary charge of this project. In parallel, analysis of another cohort was performed by our collaborators in Berlin, Germany. They found a strong association between S377G and PFO but no association between S377G and ASD, the same as in our original study. Since these results appeared to replicate the original study, we decided to discontinue patient recruitment and prepare a publication including the two data sets, one from Australia and one from Germany. While preparing the manuscript, we noticed that the distribution of genotypes from the Berlin cohort was out of Hardy-Weinberg equilibrium. We therefore re-genotyped the Berlin TEE control, ASD and PFO samples. MM organized the re-genotyping on an independent platform. Unfortunately, the association disappeared on re-genotyping; the negative results were nevertheless published.

4.2 Results

4.2.1 Global distribution of GATA4 S377G

The global distribution of the GATA4 S377G allele (dbSNP ref. 3729856) was determined by analysis of DNA from world indigenous populations (Martinson et al., 2000). European, Australian, American and Middle Eastern Caucasians had the highest allele frequency (11.3-20.2%), while Indians/Pakistanis had a lower frequency (7.7%) (Table 4.1). By contrast, Africans, East Asians and Pacific Islanders showed a very low frequency (0-0.4%). Genotype frequencies in all populations were in Hardy-Weinberg equilibrium. These data suggest a relatively recent and Caucasian origin for S377G, potentially in a single individual among Neolithic farming populations from the Near East before their expansion into Europe within the last 10,000 years (Cavalli-Sforza et al., 1994). Further comparisons of S377G frequency were restricted to Caucasian subjects.


Table 4.1: S377G allele distribution in indigenous human populations

Population                  Total   Wild type¹    S377G              S377G            S377G allele
                            n       n (%)         heterozygosity¹    homozygosity¹    frequency¹ (%)
                                                  n (%)              n (%)
Caucasian
  Caucasian US              480     382 (79.6)    88 (18.3)          10 (2.1)         11.3
  UK                        42      27 (64.3)     13 (31)            2 (4.8)          20.2
  Australia²                391     301 (77)      83 (21.2)          7 (1.8)          12.4
  Cyprus                    37      27 (73)       9 (24.3)           1 (2.7)          14.9
  Russian Caucasus          112     84 (75)       26 (23.2)          2 (1.8)          13.4
  Russia                    47      33 (70.2)     14 (29.8)          0 (0)            14.9
Asian
  Yemen                     89      64 (71.9)     24 (27)            1 (1.1)          14.6
  India/Pakistan            111     94 (84.7)     17 (15.3)          0 (0)            7.7
  Hong Kong                 57      57 (100)      0 (0)              0 (0)            0
  Taiwan                    92      92 (100)      0 (0)              0 (0)            0
African
  Madagascar                117     116 (99.1)    1 (0.9)            0 (0)            0.4
  Central African Republic  44      44 (100)      0 (0)              0 (0)            0
Pacific Islander
  Papua New Guinea          88      88 (100)      0 (0)              0 (0)            0

¹ Reflects the prevalence of the A->G single nucleotide polymorphism that causes the S377G amino acid change.

² The same population was used as "population controls" in the Australian cohort.
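As an illustration of how the allele frequencies in Table 4.1 follow from the genotype counts, and of one common way to check Hardy-Weinberg proportions, here is a minimal Python sketch. The function names are hypothetical, and the Pearson chi-square test shown is an assumption: the text does not state which test was used for the equilibrium check.

```python
def allele_frequency(n_wt, n_het, n_hom):
    """Variant allele frequency from genotype counts (two alleles per subject)."""
    n = n_wt + n_het + n_hom
    return (n_het + 2 * n_hom) / (2 * n)


def hwe_chi_square(n_wt, n_het, n_hom):
    """Pearson chi-square statistic for departure from Hardy-Weinberg proportions."""
    n = n_wt + n_het + n_hom
    q = allele_frequency(n_wt, n_het, n_hom)
    p = 1.0 - q
    observed = (n_wt, n_het, n_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expected count (e.g. a monomorphic population)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Genotype counts for the "Caucasian US" row of Table 4.1
us_freq = allele_frequency(382, 88, 10)
us_chi2 = hwe_chi_square(382, 88, 10)
print(f"allele frequency = {us_freq:.4f}, HWE chi-square = {us_chi2:.2f}")
```

For this row the frequency is 108/960, in line with the tabulated 11.3%, and the statistic falls below 3.84, the 5% critical value of the chi-square distribution with one degree of freedom.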


4.2.2 Australian study

In Table 4.2, baseline demographics and relevant clinical data are tabulated. As ASD and PFO are likely to lie on an anatomical continuum with a common genetic basis, we also included ASD subjects (n = 129), who were a mixture of children and adults (mean age 21.6 years), with the majority having pure ASDII. “Other CHD” subjects (n = 109), who served as controls, were mostly children (mean age 3.9 years), and the majority had isolated ventricular septal defect (VSD), VSD with minor abnormalities, or VSD with outflow tract defects including tetralogy of Fallot, pulmonary atresia, transpositions or double outlet right ventricle (DORV). Both ASD and Other CHD subjects were unselected for family history, although 26.6% and 11%, respectively, had a known family history of CHD.

Within PFO/stroke cases (n = 58), the mean age was 51.3 years, some 12-14 years younger than in our three adult control groups in which septal status was known (PFO without Stroke, Stroke without PFO/ASD, and TEE Controls), suggesting over-representation of patients with premature stroke. No patient had severe atheroma or atrial fibrillation. Only a minority (10.7%) had a family history of CHD. No significant gender bias was evident in either cases or controls.

In all study populations, GATA4 S377G allele distribution was in Hardy-Weinberg equilibrium. Prevalence of heterozygosity and allele frequency varied somewhat, although not significantly, between case and control groups (Table 4.3). In healthy Population Controls, which were unselected for atrial septal status (n = 391), the allele frequency was 12.4%, in good agreement with other world Caucasian populations (Table 4.1). The frequency was not significantly different in other CHD subjects, those with PFO without Stroke or Stroke without PFO/ASD. However, allele frequency in our gold standard TEE Controls, who did not have stroke and in whom ASD and PFO had been specifically excluded (n = 113), was lower at 9.3%.


Table 4.2: Clinical characteristics of Australian cohort

                                      ASD¹            CHD             PFO with        PFO without     Stroke/no       TEE Controls    Caucasian
                                                      controls²       Stroke³         Stroke⁴         PFO/ASD⁵        (no PFO/ASD/    Population
                                                                                                                      Stroke)⁶        Controls
                                      (n=129)         (n=109)         (n=58)          (n=29)          (n=66)          (n=113)         (n=391)
Age at Study⁷ (years), range          0-77            0.1-44.3        19-86           38-88           29-87           21-89           16-84
Age at Study⁷ (years), mean           21.6            3.9             51.3            65.9            65.9            63.3            53.1
Male (%)                              54 (41.9)       69 (63.3)       32 (55.2)       20 (69.0)       44 (66.7)       66 (58.4)       200 (50)
Family History of CHD n/total (%)⁷    34/128 (26.6)   12/109 (11.0)   6/56 (10.7)     4/27 (14.8)     2/65 (3.1)      12/113 (10.6)   N/A
Severe Atheroma n/total (%)⁸          N/A             N/A             0/43 (0.0)      3/29 (10.3)     18/66 (27.3)    18/109 (16.5)   N/A
Atrial Fibrillation n/total (%)       N/A             N/A             0/58 (0.0)      N/A             13/66 (19.7)    N/A             N/A

¹ Includes 2 APVD, 2 LSVC, 2 Coarctation of the aorta, 2 Pulmonary Stenosis, 1 LSVC & Pulmonary Stenosis, 1 PDA, 3 MVP.
² Includes 26 isolated VSD, 17 VSD with minor abnormalities, 1 functional single ventricle and 64 VSD with malformations of outflow tracts - these include 34 with Tetralogy/pulmonary atresia, 15 with transposition/DORV and 15 with other malformations.
³ Includes 1 Ebstein's Anomaly, 2 MVP, 1 prosthetic pulmonary valve, 1 prosthetic mitral valve.
⁴ Includes 1 Quadri-leaflet Aortic Valve, 3 MVP, 3 prosthetic AV, 1 prosthetic AV&MVR.
⁵ Includes 1 BAV, 4 MVP, 4 prosthetic AV, 2 MVR.
⁶ Includes 1 Ebstein's Anomaly, 8 BAV, 1 BAV & Coarctation of the Aorta, 1 BAV & MVR, 1 Sick Sinus Syndrome, 1 PDA, 1 aortic root replacement, 9 MVP, 3 prosthetic MV, 6 prosthetic AV, 2 prosthetic AV&MV, 1 MVR, 1 prosthetic AV & MVR, 1 MVR & tricuspid VR.
⁷ Family history was unknown for a small number of patients.
⁸ Grade of atherosclerosis was unknown for 14 cases with PFO and stroke, 2 cases with PFO and without stroke and 4 TEE controls.


Table 4.3: GATA4 genotypes of Australian cohort

                      ASD             CHD             PFO with        PFO without     Stroke/no       TEE Controls    Population
                                      controls        Stroke          Stroke          PFO/ASD         (no PFO/ASD/    Controls
                                                                                                      Stroke)
                      (n=129)         (n=109)         (n=58)          (n=29)          (n=66)          (n=113)         (n=391)
GATA4 S377G
  Heterozygous (%)    18 (14)         24 (22.0)       15 (25.9)       8 (27.6)        16 (24.2)       17 (15.0)       83 (21.2)
  Homozygous (%)      3 (2.3)         3 (2.8)         3 (5.2)         0 (0.0)         0 (0.0)         2 (1.8)         7 (1.8)
  Allele Frequency    9.3             13.8            18.1¹           13.8            12.1            9.3             12.4

¹ p=0.02 compared to TEE Controls; Odds ratio 2.16


Although this fell short of significance when compared to Population Controls (p = 0.20), it would be the expected trend if S377G played a causative role in PFO at the severe end of the spectrum.

Allele frequency was also low in ASD subjects (9.3%), suggesting no role for S377G in causation of ASDII. The differences between the ASD group and controls were not statistically significant. However, a significant increase in the prevalence of S377G was observed in cases with PFO and stroke, with an allele frequency of 18.1% (p = 0.02 compared with the TEE control group).
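The odds ratio of 2.16 quoted in the footnote to Table 4.3 can be reproduced from the tabulated genotype counts if it is computed on allele counts. The Python sketch below makes that assumption explicit; the thesis does not state how the odds ratio or the p value was derived, the function names are hypothetical, and the significance test itself is not reproduced here.

```python
def allele_counts(n_subjects, n_het, n_hom):
    """Return (variant, reference) allele counts for a group of diploid subjects."""
    variant = n_het + 2 * n_hom
    reference = 2 * n_subjects - variant
    return variant, reference


def allelic_odds_ratio(cases, controls):
    """Odds of carrying the variant allele in cases relative to controls."""
    case_var, case_ref = cases
    ctrl_var, ctrl_ref = controls
    return (case_var * ctrl_ref) / (case_ref * ctrl_var)


# Genotype counts taken from Table 4.3
pfo_stroke = allele_counts(58, n_het=15, n_hom=3)      # PFO with Stroke
tee_controls = allele_counts(113, n_het=17, n_hom=2)   # TEE Controls

pfo_freq = pfo_stroke[0] / sum(pfo_stroke)
tee_freq = tee_controls[0] / sum(tee_controls)
odds = allelic_odds_ratio(pfo_stroke, tee_controls)
print(f"{pfo_freq:.3f} vs {tee_freq:.3f}, allelic OR = {odds:.2f}")
```

Both groups carry 21 variant alleles, but out of 116 and 226 chromosomes respectively, which gives the 18.1% and 9.3% frequencies reported in the table.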

4.2.3 German study

We also assessed S377G allele frequency independently in German Caucasians with ASD and PFO compared to the TEE controls. Basic demographics and clinical information from ASD and PFO subjects are shown in Table 4.4. The ASD group (n = 96) comprised children and adults (aged 2.9 years to 75.9 years) who were mostly female (70.5%). This female preponderance of ASD is consistent with a previous epidemiologic report (Feldt et al., 1971). PFO probands (n = 95) were all adults (mean age 52.9 years) who mostly had been admitted subsequent to neurologic events (91.2%). During TEE, atrial septal aneurysm (ASA) was detected in a relatively high number of ASDs and PFOs (37.4% and 40.7%, respectively). It is known that ASA is often associated with other atrial septal defects, in particular ASDII and PFO, and is an independent risk factor for stroke (Hanley et al., 1985; Belkin et al., 1986; Zabalgoitia-Reyes et al., 1990). More than two thirds of ASD subjects had an ECG pattern of right bundle branch block (RBBB) and about 14% had atrioventricular block (AVB). In the PFO group, RBBB was found in just 20% and one patient had AVB.


Table 4.4: Clinical characteristics¹ of German cohort

                                      ASDII²                PFO/Stroke³
                                      (n=96)                (n=95)
Age at study (years), range           2.9-75.9              26.3-74.4
Age at study (years), mean            38.6                  52.9
Male (%)                              28 (29.5)             50 (54.9)
Family history of CHD n/total (%)     8/42 (19.2)           1/58 (1.7)
Neurologic event (%)⁴                 9 (9.9)               83 (91.2)
Migraine (%)                          3 (3.3)               6 (6.6)
Defect closure (%)⁵                   89 (98.9)             91 (100)
Atrial septal aneurysm (%)            34 (37.4)             37 (40.7)
AVB (%)⁶                              13 (14.3)             1 (1.1)
iRSB/RSB (%)⁷                         68 (73.9)/6 (6.5)     19 (20.0)/2 (2.1)
Atrial Fibrillation (%)               3 (3.3)               0 (0)

¹ Family history was unknown for 54 cases with ASDII and 37 cases with PFO. Other clinical characteristics were unknown for a small number of patients.
² All ASDII cases were free of any other cardiac malformation.
³ All PFO cases were free of any other cardiac malformation. All were designated as having cryptogenic stroke.
⁴ Neurologic event includes thromboembolic stroke, PRIND (prolonged reversible ischemic neurologic deficit) and TIA.
⁵ Defect closure includes interventional closure and surgical closure.
⁶ AVB: electrocardiographic atrioventricular block.
⁷ iRSB/RSB: (incomplete) right bundle block.

In the German Caucasians, the S377G allele frequency was not significantly different between ASDs, PFOs and TEE controls (range 12% to 13.5%), and it was close to the allele frequency of other world Caucasian populations (Table 4.5).


Table 4.5: GATA4 genotypes of German cohort

                      ASDII         PFO/Stroke    TEE Controls
                                                  (no PFO/ASD)
                      (n=96)        (n=95)        (n=96)
GATA4 S377G
  Heterozygous (%)    15 (15.6)     21 (22.1)     16 (16.7)
  Homozygous (%)      4 (4.2)       1 (1.1)       5 (5.2)
  Allele Frequency    12            12.1          13.5

4.2.4 Pooled data

We continued recruitment of Australian Caucasians in the same St. Vincent’s Hospital treatment population using identical inclusion criteria and patient classification as in the original study (see 2.1.1.2.1), but over two years were only able to recruit 32 new cases of PFO with neurologic event. Among them, only 8 had undergone PFO closure. This low number may be because of depletion of adults with severe PFO in the treatment population. However, in the same time frame, we collected an additional independent group of 134 TEE controls.

Table 4.6: Pooled genotype data from Australian and German cohorts

                      ASDII         PFO/Stroke    TEE Controls
                                                  (no PFO/ASD)
                      (n=234)       (n=183)       (n=340)
GATA4 S377G
  Heterozygous (%)    36 (15.4)     41 (22.4)     65 (19.1)
  Homozygous (%)      7 (3)         5 (2.7)       8 (2.4)
  Allele Frequency    10.7          13.9          11.9


The power of both the Australian and German studies was weakened by the low number of subjects. Given the relative homogeneity of the two studies, we pooled data from all subjects, including the additional TEE controls, into one dataset for analysis (Table 4.6). The PFOs (n = 183) and ASDs (n = 234) showed S377G allele frequencies of 13.9% and 10.7%, respectively, which were not significantly different from that of TEE controls (n = 340, S377G allele frequency = 11.9%).

4.3 Discussion

GATA transcription factors regulate several biological processes during embryogenesis including cardiac development. Of the six vertebrate GATA factors, GATA4, GATA5, and GATA6 are involved in cardiogenesis (Laverriere et al., 1994; Charron and Nemer, 1999; Molkentin, 2000; Peterkin et al., 2005). Several mutations in GATA4 have been previously described in patients with CHD. S377G is located within the conserved and essential C-terminal GATA4 domain and leads to an apparent increase in transcriptional activity of GATA4 in vitro (Schluterman et al., 2007). Given the causal role of GATA4 in CHD, this common allele may play a modifier role.

We initially screened for GATA4 S377G in three different sets of CHD patients including those with ASD, PFO and other forms of CHD, all of Australian Caucasian origin. We observed a high allele frequency of S377G in subjects with PFO and stroke (18%), which was significantly higher than in TEE controls in whom ASD and PFO were excluded (p = 0.02). This finding suggested a role for S377G in the etiology of PFO complicated by stroke. However, several observations make this result less compelling. Around one-fourth of Caucasians unselected for septal status would be expected to have PFO (Hagen et al., 1984). If S377G were involved in causation of PFO, we would expect its frequency in Population Controls to be higher than that in TEE Controls. Indeed, this was the case (12.4% vs 9.3%). However, the difference was not statistically significant (p = 0.20), although this could be due to the underpowered nature of the study. Furthermore, the difference in S377G frequency between patients with PFO and stroke and population controls (18.1% vs 12.4%) was non-significant.

It is possible that an increased frequency of S377G in the patients with PFO and stroke is due to an independent effect of this variant on stroke risk. For example, GATA4 represses the transcriptional activity of the apolipoprotein(a) gene (Negi et al., 2004), high levels of which are an independent risk factor for premature atherosclerosis and stroke (Marcovina and Koschinsky, 1998; Lippi and Guidi, 2003). However, a direct association between S377G and stroke is unlikely because the allele frequency of S377G in the “stroke/no PFO/ASD” group (12.1%) was not different from that of TEE or population controls.

There was no association between S377G and PFO complicated by cryptogenic stroke when we analysed an independent group of German Caucasians including 95 subjects with PFO, most of whom had experienced a neurologic event. The S377G allele frequency was not significantly different in any comparison between ASDs, PFOs and TEE controls. Analysis of combined Australian and German data confirmed the lack of association.

In an initial screen of 162 subjects (including 43 PFO/stroke, 9 incidental PFO with no stroke and 110 ASD subjects), all coding exons of the GATA4 gene were sequenced. No GATA4 variants were identified in the PFO/stroke cohort. However, 4 non-synonymous variants were found, 3 in the ASD group (full gene deletion; E359del; A411V) and one in an incidental PFO subject (D425N). These changes were not detected by subsequent screening in 270 controls. A411V and D425N have been reported in a separate publication describing a larger screen (Butler et al., 2010). While these mutations are potentially deleterious, they had no obvious functional consequences in transient transfection assays.

In summary, the study of independent and pooled populations of Australian and German Caucasians failed to show a significant association between S377G and PFO with stroke. We conclude that the common GATA4 variant S377G is relatively benign in terms of its participation in CHD and PFO/Stroke. However, it remains possible that common mutations in other cardiac transcription factors play a significant role in the pathogenesis of cryptogenic stroke through causation of PFO.


4.4 Genetic study of a family with ASD and Marcus-Gunn phenomenon

As discussed in Chapter 1, among ASD cases with known causation the majority can be attributed to point mutations in protein coding genes. The mutations can be passed down through families following Mendelian patterns of inheritance. Linkage analysis is a powerful approach to map single gene diseases with highly penetrant alleles. Indeed, linkage analysis of families with ASD has identified several causative mutations, mostly in cardiac transcription factors (see 1.7.3.2.3).

Here, we present the framework of an additional ongoing study of a large autosomal dominant ASD family with some members also being affected by Marcus Gunn phenomenon (MGP). ASDII and its causation are fully discussed in Section 1.7. MGP, also known as jaw-winking syndrome, is characterised by congenital ptosis associated with a winking motion of the affected lid induced by movement of the jaw (Gunn, 1883). Approximately 5% of neonates with congenital ptosis are affected by MGP, mostly sporadically, although some autosomal dominant cases of MGP with incomplete penetrance have been reported (Kirkham, 1969; Pratt et al., 1984). Co-occurrence of CHD and MGP has been reported in a few human studies (Weaver et al., 1997; Festa et al., 2005; Doco-Fenzy et al., 2006). However, none of them has conclusively established an association between the two conditions.

The identification of the family and evaluation of patients in this study was done by Dr. Edwin Kirk and his collaborators. The aim was linkage mapping in 18 members of the family, 8 of whom were affected by ASDII and 4 of whom had MGP (Figure 4.1). The initial linkage analysis using 382 microsatellite markers was not successful (Kirk, 2007). Marker D1S2878 showed the highest linkage across the genome, with a LOD score of only 2.03. Furthermore, fine mapping of this locus did not provide evidence of linkage.
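To make the LOD scale concrete, the following Python sketch computes a two-point LOD score in the simplest textbook setting: phase-known, fully informative meioses, where the likelihood depends only on the number of recombinants. This is a deliberately simplified illustration with a hypothetical function name and made-up counts; pedigree-based linkage programs evaluate the same kind of likelihood ratio in a far more general way.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**recombinants * (1 - theta)**(meioses - recombinants).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in the interval (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical example: 1 recombinant observed in 10 informative meioses
example_lod = lod_score(recombinants=1, meioses=10, theta=0.1)
print(f"LOD at theta = 0.1: {example_lod:.2f}")
```

Because the score is a base-10 logarithm of a likelihood ratio, each additional unit corresponds to a tenfold increase in support for linkage over free recombination.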

Figure 4.1: The family with ASD and MGP that underwent the original linkage analysis (Kirk, 2007)

Subsequently, we recruited more family members, expanding the family to 31 individuals including 17 with ASD (Figure 4.2). We genotyped 24 family members, including 14 affected individuals and 10 controls, using the Illumina HumanCytoSNP-12 array consisting of more than 300,000 SNP markers. This added an extra 6 affected family members to the original linkage mapping. Next, the data were analysed for two-point parametric linkage using the SuperLink v1.6 program in the easyLINKAGE software package (Hoffmann and Lindner, 2005). No significant linkage was found but, taking a LOD score of 1.9 as the lower threshold for suggestive linkage, evidence of suggestive linkage was found at 193.5 cM on Chromosome 3 and at 9.4 cM on Chromosome 18, both with a LOD score of 2.41 (Figure 4.3). While the highest LOD score obtained from this study was slightly higher than that of the original study, it again failed to provide significant evidence of linkage. The possible explanations for this, which are discussed in Ma’s project (Ma, 2012), are summarized below:

• Nonpenetrance and phenocopies of ASD: It was possible that the disease locus in II:1 is non-penetrant or that his daughter, III:2, is a phenocopy for ASD.

• Misclassification of affected individuals, or spontaneous ASD closure which could lead to misclassification: This was supported by the fact that several children in the new generation had ASDs that closed spontaneously in the first few months of life (IV:2, IV:4, IV:7). None of these children were included in the study.

• Alternative genetic mechanisms: This linkage study assumed dominant inheritance for this family, due to the instances of male to male transmission and every generation being affected. However, another mechanism such as digenic inheritance could be responsible.

Collectively, the possible explanations above suggest that non-parametric linkage analysis can be beneficial here. Indeed, a non-parametric linkage analysis based on an allele sharing model showed significant linkage. Different linkage programs which use different algorithms are being applied to confirm the result. This analysis is still in progress.

Before the data analysis step, Dr. Alan Ma was the main researcher on this project, supervised by Dr. Edwin Kirk. In particular, he recruited new patients from the family and then managed the samples for genotyping. Involvement of MM in this project began with reformatting the SNP array data for linkage use and setting up the program. The data were then analysed by MM and Dr. Alan Ma. In addition, MM has set up and performed nonparametric linkage analysis on this family and is still analysing the family using different linkage approaches.

[Figure 4.2 pedigree: four generations (I:1 to I:2, II:1 to II:9, III:1 to III:12, IV:1 to IV:11). Legend: atrial septal defect; Marcus Gunn phenomenon; other cardiac conditions; cleft lip/palate; included in study.]

Figure 4.2: The expanded family with ASD and MGP that underwent linkage analysis

Figure 4.3: LOD score plot for two-point linkage analysis of chromosome 18. Marker rs8095878 at 9.40 cM shows the highest LOD score of 2.41.


Next generation sequencing (NGS) technologies have now become widely available. Another approach for genetic dissection of this family is to use NGS technology to find variants shared between the affected individuals which are absent in normal individuals. Indeed, we have performed whole exome sequencing of two affected members. Sample preparation and submission for sequencing were performed by Dr. Alan Ma (assisted by Ms Glenda Mullan). Sequencing was performed at Dr. Michael Bamshad’s laboratory in Seattle, USA. This project was supervised by Dr. Edwin Kirk. The variants identified were filtered by Dr. Tony Roscioli, resulting in 15 novel candidate mutations shared between the two exomes. MM is taking charge of this work from here, firstly performing bioinformatics tasks to provide more support for the candidate mutations and then performing segregation studies, including sequencing of all individuals for the candidate mutations.


5 High resolution mapping of quantitative trait loci affecting cardiac atrial septal morphology using an advanced intercross line

5.1 Introduction

As discussed in chapter 1, development of the cardiac atrial septum involves complex morphogenetic processes including the convergence of multiple tissue elements. This process is genetically vulnerable and defects in septation occur commonly as part of the CHD spectrum in man, representing a significant burden to health resources. ASDII, PFO and ASA are common atrial septal anomalies associated with numerous pathologies including stroke.

QTL analysis has emerged as an approach to understand the genetic basis of both quantitative and complex (non-Mendelian) binary traits (see chapters 1 and 3), and this approach is being adopted in genome wide studies of human disease. One model for complex binary traits (e.g. PFO in this study) assumes an underlying continuous but unobservable variable (termed liability) with a threshold: individuals whose liability lies above the threshold express one phenotype and those below it express the other (Falconer, 1965). QTL underlying PFO can be identified through mapping of quantitative parameters acting as proxies for the assumed liability or through direct mapping of the liability (see chapter 3).
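The threshold model can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: liability is taken to be standard normal, the function names and the `genotype_effect` parameter are hypothetical, and the 17% prevalence is the F2 PFO prevalence from Table 5.1 used purely as an example value.

```python
import random
from statistics import NormalDist


def liability_threshold(prevalence):
    """Threshold on a standard-normal liability scale that yields the given prevalence."""
    return NormalDist().inv_cdf(1.0 - prevalence)


def simulate_prevalence(n, prevalence, genotype_effect=0.0, seed=0):
    """Fraction of n simulated individuals whose liability exceeds the threshold.

    genotype_effect is a hypothetical shift in mean liability (in SD units)
    caused by carrying a risk allele; 0.0 reproduces the baseline prevalence.
    """
    rng = random.Random(seed)
    threshold = liability_threshold(prevalence)
    affected = 0
    for _ in range(n):
        liability = rng.gauss(0.0, 1.0) + genotype_effect
        if liability > threshold:
            affected += 1
    return affected / n


if __name__ == "__main__":
    print("threshold for 17% prevalence:", round(liability_threshold(0.17), 3))
    print("baseline:", simulate_prevalence(20000, 0.17))
    print("with risk allele:", simulate_prevalence(20000, 0.17, genotype_effect=0.5))
```

A locus that shifts mean liability changes the proportion of individuals above the threshold without changing the threshold itself, which is why a quantitative proxy for liability can carry information about a binary trait.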

Biben et al.’s study of inbred mouse strains revealed significant variation in atrial septal anatomy correlating with ASA and PFO (Biben et al., 2000), with genetic background as a major determinant. They established quantitative parameters reflective of septal status including length of the septum primum (flap valve length; FVL), orthogonal width of the foramen ovale (FOW) and width of the open corridor in PFO (crescent width; CRW). Mean FVL was strongly negatively correlated with the prevalence of PFO across a variety of genetic backgrounds (Biben et al., 2000), with short FVL, large FOW and large CRW all strongly associated with a higher risk of PFO (Kirk et al., 2006). Thus, these parameters could be mapped as proxies for the PFO liability, and indeed Kirk et al. performed a QTL analysis using an F2 design and mapped the quantitative septal traits to 7 significant (LOD > 4.3) and 6 suggestive (LOD > 2.8) QTL (Kirk et al., 2006).

QTL analysis has the power to detect the most significant QTL underlying complex traits, although the confidence intervals are usually large and each may contain multiple QTL and hundreds of potential candidate genes (Darvasi et al., 1993). Therefore, increasing recombination has become the focus of fine mapping approaches including the use of advanced intercross lines (AIL), generated by intercrossing inbred strains across 10 or more generations (Darvasi and Soller, 1995).

This chapter presents our AIL study, aimed at confirmation and fine mapping of significant QTL identified for FVL and FOW in the previous F2 study, in addition to linkage analysis for heart weight (HW) in both F2 and AIL studies. Significant findings are discussed next.

All mouse breeding, dissections and measurements for the AIL were done by Dr. Edwin Kirk before MM arrived and MM performed virtually everything from that point. During the first steps, including linkage analysis of the F2 data for body weight (BW) and HW, AIL sample selection, and AIL marker selection, he was taught by Dr. Edwin Kirk. MM managed genotyping of the AIL samples. He performed analysis of the AIL data including linkage analysis (assisted by Dr. Peter Thomson) and wrote a manuscript on the AIL data (assisted by Prof. Richard Harvey and currently under revision). This manuscript is different from the one which was published in Animal Genetics and described in chapter 3. During the whole period of this project Prof. Richard Harvey, Dr. Edwin Kirk and Prof. Chris Moran have been major guiding forces to MM.

5.2 Results

5.2.1 Phenotypes

In the previous F2 study (Kirk et al., 2006), adult QSi5 and 129T2/SvEms parental mice and F2 mice were scored for PFO as a binary trait, and for 3 quantitative anatomical parameters of the inter-atrial septum (FVL, FOW, and CRW) that were found to be associated with PFO (see Section 2.2.1) (Biben et al., 2000; Kirk et al., 2006). All traits were influenced by mutation of the homeodomain transcription factor Nkx2-5, which in humans is essential for atrial septation (Schott et al., 1998; Biben et al., 2000). The prevalence of PFO in the QSi5 and 129T2/SvEms parental strains was 4.5% and 80%, respectively. As in the original study, the most relevant quantitative septal parameter in the F14 study (Tables 5.1-5.3 and Figure 5.1) was FVL: it shows the greatest difference in mean length, up to a maximum of 2.5-fold between inbred strains (Biben et al., 2000), and a difference of ~2-fold and 4.8 standard deviations (SD) between parental strains for this study. FVL also shows the strongest (negative) correlation to PFO prevalence among a number of inbred strains (r = -0.97) (Biben et al., 2000), and to PFO risk in both F2 (p < 0.001) and F14 (p < 0.001) generations (Kirk et al., 2006).


FOW and CRW show more subtle differences, and the variation seen among individuals within the parental strains substantially overlapped within one SD (Table 5.1) (Kirk et al., 2006). Nevertheless, such variation in FOW is reflective of up to a 2-fold difference in foramen ovale area (Biben et al., 2000), and is associated with both PFO (p < 0.001) (Table 5.2) and FVL in the F14 mice (r = -0.284; p < 0.001) (Table 5.3), although the correlation with FVL was weaker in the F2 cohort (r = -0.087; p = 0.001) (Kirk et al., 2006). Therefore, both traits (FVL and FOW) were taken into account in selecting mice of extreme phenotypes for inclusion in the AIL study (see 2.2.1.4).

Table 5.1: Phenotypic characteristics of parental strains and F2 mice extracted from Kirk et al. (2006) compared to F14 mice

Parameter               QSi5           129T2/SvEms    F2             F14
N                       66             75             1437*          933
PFO (%)                 4.5            80             17             34
FVL ± SD (mm)           1.13 ± 0.11    0.60 ± 0.11    1.0 ± 0.19     1.01 ± 0.16
FOW ± SD (mm)           0.21 ± 0.06    0.24 ± 0.06    0.21 ± 0.07    0.24 ± 0.07
CRW ± SD (mm)           0.51 ± 0.13    0.44 ± 0.12    0.41 ± 0.12    0.54 ± 0.15
Body Weight ± SD (g)    29.4 ± 2.77    17.5 ± 2.1     26.6 ± 3.3     25.8 ± 2.9
Heart Weight ± SD (g)   0.21 ± 0.02    0.14 ± 0.02    0.21 ± 0.03    0.18 ± 0.03

*Data were incomplete for some mice.


Table 5.2: Relationship between PFO and the quantitative traits in F14 mice with complete data (n = 933).

Trait   PFO       Number of Mice   Mean (mm)   Std. Error   t-test P-value*
FVL     Present   314              0.90        0.138        <0.001
        Absent    619              1.07        0.098
FOW     Present   314              0.29        0.138        <0.001
        Absent    619              0.21        0.098
CRW     Present   314              0.56        0.138        0.069
        Absent    619              0.54        0.098

*t-test compares mean of each trait between mice with and without PFO.

Table 5.3: Inter-trait correlation coefficients (r) in F14 mice with P-values in brackets

        HW               FVL              FOW               CRW
BW      0.640 (<0.001)   0.100 (0.002)    0.005 (0.871)     0.118 (<0.001)
HW                       0.134 (<0.001)   0.072 (0.028)     0.131 (<0.001)
FVL                                       -0.284 (<0.001)   0.005 (0.890)
FOW                                                         0.117 (<0.001)


[Figure 5.1, panels A to C: histograms of the number of mice (y-axis) against FVL (A), FOW (B) and CRW (C) in mm (x-axis), plotted separately for mice with PFO and without PFO.]

Figure 5.1: Histograms of quantitative traits in F14 mice with and without PFO

In the original study in which quantitative septal parameters were defined (Biben et al., 2000), the width of the patent corridor between the septum primum and septum secundum was also determined in cases of PFO, measured at the edge of the septum primum remnant which forms a prominent crescent-shaped ridge. However, since this parameter was constrained to cases of PFO, for QTL analysis only CRW was considered, defined as the length of the prominent crescent irrespective of the presence of PFO (Kirk et al., 2006). While there was indeed a strong statistical effect of PFO on CRW in the F2 study (p < 0.001) (Kirk et al., 2006), this was lost in the F14 cohort (p = 0.069) (Table 5.2). An additional observation was that among the two parental strains, QSi5 has longer mean CRW and lower PFO prevalence (negative association), while in the F2 cohort CRW is positively associated with risk of PFO. For these reasons and because CRW remains an ill-defined anatomical parameter, we did not consider this trait in selecting mice with extremes of phenotype in either the F2 or F14 study. However, post-hoc analysis for CRW QTL in the F2 study revealed a significant QTL on MMU7 (LOD = 4.58) and a suggestive one on MMU3 (LOD = 3.49). We therefore performed a similar post-hoc analysis in the F14 study (see below).

In this AIL study, our main aim was to confirm and fine map QTL found in the F2 study with LOD scores above the threshold for significance of 4.3. This included 3 QTL for FVL and 3 for FOW. We also included a suggestive QTL (2.8 < LOD < 4.3) for FOW located on MMU9 (LOD = 3.43), since its peak covered the T-box transcription factor gene Tbx20, mutations in which are known to cause familial septal defects and severe PFO with permanent shunt (Kirk et al., 2007).

We also analyzed for HW and BW QTL since it is evident that the quantitative septal parameters under study might be influenced by heart size and mass (Table 5.1) (Kirk et al., 2006), and indeed both FVL and FOW were significantly correlated with HW in both F2 and F14 cohorts, although the effects were small (Table 5.3) (Kirk et al., 2006) and HW had no influence on the risk of PFO. In the F2 study, therefore, we did not normalize septal data for HW so as not to mask important QTL (Kirk et al., 2006). However, the possibility that FVL and FOW QTL could be explained by variation in HW or BW had not been formally examined. Thus, prior to analysis of AIL mice, we performed a retrospective linkage analysis for HW and BW on F2 data. The HW of F2 mice was initially adjusted for factors with significant effects (age, sex, and BW) and we used the same LOD score criteria as in the F2 study (2.8 for suggestive and 4.3 for significant linkage) (Lander and Kruglyak, 1995). We discovered a significant QTL for BW normalized for age and sex (LOD = 14.2) overlapping one for HW normalized for age, sex and BW, on MMU11 (LOD = 8.5) (Figure 5.2A). We also found one suggestive QTL (LOD = 3.4) for normalized HW on MMU7, and another on the same chromosome that fell just short of suggestive (Figure 5.2B). The suggestive HW QTL overlapped with previously determined QTL for BW on MMU7, but did not represent a BW QTL in our study. Thus, we included normalized HW as a parameter in selection of F14 mice with extremes of phenotype for further QTL analysis (see 2.2.1.4), and selected markers for fine mapping of the significant normalized HW QTL on MMU11.
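One common way to adjust a trait such as HW for covariates (age, sex and BW) is to regress the trait on the covariates and carry the residuals forward into linkage analysis. The sketch below assumes ordinary least squares with an intercept; the thesis does not state in this section how the adjustment was implemented, so the function names and the example values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def adjust_for_covariates(trait, covariates):
    """Residuals of the trait after least-squares regression on the covariates.

    covariates is a list of tuples, one per animal, e.g. (age, sex, body_weight).
    An intercept column is added automatically.
    """
    rows = [[1.0] + [float(v) for v in cov] for cov in covariates]
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, trait)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(rows, trait)]


if __name__ == "__main__":
    # Hypothetical animals: (age in days, sex coded 0/1, body weight in g)
    covs = [(60, 0, 25), (70, 1, 30), (80, 0, 27), (90, 1, 24), (65, 1, 29), (75, 0, 22)]
    heart_weight = [0.20, 0.23, 0.21, 0.19, 0.22, 0.17]
    print([round(r, 4) for r in adjust_for_covariates(heart_weight, covs)])
```

The residuals have zero mean and are uncorrelated with each covariate, so any linkage signal detected on them cannot be attributed to a linear effect of age, sex or BW.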

5.2.2 Linkage results for atrial septal morphology

We set a LOD score of 2 as a cut-off for significant linkage based on the density of chosen markers (~2 cM) and the size of the genomic regions covered (Lander and Botstein, 1989). From the septal morphology QTL identified in the F2 study (6 significant and 1 suggestive), at least 6 QTL were confirmed and significantly narrowed using AIL data. Five F2 QTL resolved into multiple peaks and several new QTL were also discovered. Overall, the overlap between QTL for different traits was increased.

Table 5.4 describes QTL for FVL, FOW, and CRW identified by the AIL study, and Figure 5.3 presents an example from each chromosome which compares F2 and AIL mapping results.

[Figure 5.2, panels A (MMU11) and B (MMU7): LOD score (y-axis) against genetic position in cM (x-axis), with the thresholds for significant and suggestive linkage marked.]

Figure 5.2: Suggestive and significant QTL for HW identified by F2 study. Y-axis represents LOD scores for HW adjusted for age, sex, and BW.


Table 5.4: Characteristics of the QTL identified by AIL

                                           1 LOD drop-off
Chr.  Trait  Peak position (cM)  Peak LOD  From (bp)    To (bp)      Length (Mb)
1     FVL    9.00                2.15      21409725     33776233     12.4
1     FVL    36.00               2.34      64136253     75299905     11.2
1     FVL    41.25               5.99      78398362     82339854     3.9
1     FOW    36.75               3.06      69797763     80143657     10.3
1     FOW    41.25               4.35      78295971     82503617     4.2

2     FVL    101.00              5.45      176042766    179719583    3.7
2     FOW    72.00               2.70      141232938    148970168    7.7
2     FOW    99.75               5.80      174146375    179012121    4.9
2     CRW    41.25               2.91      67670246     71401682     3.7
2     CRW    44.00               2.31      71401682     76875762     5.5

4     FVL    9.50                2.64      10700299     28146469     17.4
4     FVL    20.50               2.85      34030075     45259069     11.2
4     FVL    43.50               2.74      87452077     96405956     9.0
4     FOW    6.00                2.25      0            16051300     16.1
4     FOW    44.04               5.97      91006973     95917012     4.9
4     CRW    32.25               2.82      56295938     62391898     6.1

8     FVL    55.25               2.22      105110435    114605600    9.5
8     FVL    66.00               3.68      120333646    121500120    1.2
8     FOW    63.72               4.70      117996631    119986390    2.0

9     FVL    19.50               3.76      33350424     39550297     6.2
9     FVL    23.75               4.75      41059283     43993122     2.9
9     FOW    9.25                5.55      19454191     26661789     7.2
9     FOW    19.25               2.53      29466989     39871427     10.4
9     FOW    24.00               7.56      41037719     47690535     6.7
9     FOW    28.25               7.31      49199504     53861153     4.7
9     FOW    35.50               4.61      60054957     68151175     8.1

11    FVL    3.50                3.89      1938262      8764300      6.8
11    FVL    9.50                4.80      13341915     18392190     5.1

13    FVL    4.91                6.09      0            14449862     14.4
13    FVL    7.00                3.57      5789754      22554216     16.8

19    FVL    16.00               3.27      21191916     22778267     1.6
19    FVL    19.25               3.15      22512605     25460897     2.9
19    FVL    21.71               2.14      25684932     28254727     2.6
19    FOW    4.47                2.16      2855329      10726584     7.9
19    FOW    11.50               2.02      13929960     21597921     7.7
19    FOW    16                  2.96      21586323     23039925     1.5
19    FOW    19.75               3.95      23039925     26014137     3.0

[Figure 5.3, panels A to G: LOD score (y-axis) against genetic position in cM (x-axis) for the F2 and AIL populations. Panels and the SNP markers shown within AIL QTL:
A, MMU1, FOW: rs32666041, rs31218676, rs30203203
B, MMU2, FOW: rs33119174, rs29733168, rs27650386, rs29723406
C, MMU4, FOW: rs27658776, rs3712771, rs27682682, rs28099501, rs28106777, rs32870027
D, MMU8, FVL: rs6244767, rs32591452, rs13479958, rs13479995, rs3686697
E, MMU9, FOW (position of Tbx20 marked): rs29990501, rs29594239, rs30087720, rs6324426, rs6331932, rs3666782, rs30331033, rs30471967
F, MMU13, FVL: rs29552398, rs623307, rs29250410
G, MMU19, FVL: rs30653264, rs30323643, rs30464413, rs31057067, rs31090270, rs30518862]

Figure 5.3: Comparison of linkage results from F2 (gray line) and AIL (black line) populations. Y-axes represent LOD scores and x-axes represent genetic map positions. A 1 LOD drop-off for each QTL is shown on the x-axis representing the confidence interval of the QTL. SNP markers within AIL QTL are also shown.
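The 1 LOD drop-off interval used as a confidence interval here can be read from a LOD curve as the contiguous stretch around the peak where the LOD score stays within one unit of the maximum. The following is a minimal sketch of that idea on a grid of positions; the function name and the example curve are hypothetical, and no interpolation between grid points is attempted.

```python
def one_lod_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) positions of the LOD drop-off interval.

    The interval is the contiguous run of grid points around the maximum
    whose LOD score is at least (peak LOD - drop).
    """
    peak_index = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak_index] - drop

    left = peak_index
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1

    right = peak_index
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1

    return positions[left], positions[peak_index], positions[right]


if __name__ == "__main__":
    # Hypothetical LOD curve sampled at a few map positions (cM)
    positions = [36.0, 38.0, 40.0, 41.25, 42.0, 44.0]
    lods = [1.0, 2.1, 3.6, 4.35, 3.9, 2.0]
    print(one_lod_interval(positions, lods))
```

A more conservative convention extends the interval to the first grid points that fall below the cut-off on each side; which convention was applied for Table 5.4 is not stated in this section.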


Although in general the linkage analysis of continuously distributed traits is more sensitive and informative than analysis of binary traits, the analysis of PFO as a binary trait was still of primary importance in our study. Therefore, we also performed direct linkage analysis for the presence or absence of PFO in the F14 mice as a binary trait using a logistic regression model (see chapter 3) (Moradi Marjaneh et al., 2012). This analysis confirmed most of the FVL and FOW QTL. The AIL results for each chromosome for quantitative (FVL, FOW, and CRW) and binary (PFO) atrial septal traits are compared in Figure 5.4.
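To illustrate how a LOD score can be obtained for a binary trait with logistic regression, the sketch below fits a single-marker model with additive genotype coding (0, 1 or 2 copies of one parental allele) and compares it with an intercept-only model. This is an assumption-laden illustration, not the model of chapter 3: the additive coding, the Newton-Raphson fit and all names are hypothetical, and the data must contain both affected and unaffected animals.

```python
import math


def log_likelihood(b0, b1, genotypes, affected):
    """Bernoulli log-likelihood of a logistic model logit(p) = b0 + b1 * genotype."""
    total = 0.0
    for g, y in zip(genotypes, affected):
        eta = b0 + b1 * g
        # Numerically stable log(1 + exp(eta))
        if eta > 0:
            log1pexp = eta + math.log1p(math.exp(-eta))
        else:
            log1pexp = math.log1p(math.exp(eta))
        total += y * eta - log1pexp
    return total


def fit_logistic(genotypes, affected, iterations=25):
    """Maximum-likelihood estimates of (b0, b1) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for g, y in zip(genotypes, affected):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            w = p * (1.0 - p)
            g0 += y - p           # gradient with respect to b0
            g1 += (y - p) * g     # gradient with respect to b1
            h00 += w
            h01 += w * g
            h11 += w * g * g
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def binary_trait_lod(genotypes, affected):
    """LOD = log10 likelihood ratio of the marker model against the null model."""
    n, k = len(affected), sum(affected)
    p = k / n
    ll_null = k * math.log(p) + (n - k) * math.log(1.0 - p)
    b0, b1 = fit_logistic(genotypes, affected)
    return (log_likelihood(b0, b1, genotypes, affected) - ll_null) / math.log(10)


if __name__ == "__main__":
    # Hypothetical marker: PFO frequency rises with genotype class
    genotypes = [0] * 10 + [1] * 10 + [2] * 10
    affected = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
    print(round(binary_trait_lod(genotypes, affected), 2))
```

Repeating the calculation at each marker or map position gives a LOD curve for the binary trait that can be laid over the curves for the quantitative traits.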

5.2.2.1 MMU1

We previously identified a significant QTL for FOW with a peak at 30.8 cM extending across 26.1 cM (1 LOD drop-off) on MMU1 (Figure 5.3A). The linkage analysis of AIL data narrowed this region to 7.5 cM, including two adjacent peaks with LOD scores of 3.1 and 4.4 overlapping in the 1 LOD drop-off. To assess whether the two peaks are distinct, we re-ran the linkage analysis including the marker closest to the distal peak as a fixed term (Figure 2.2A). While this led to a significant reduction in LOD score of the distal peak, the proximal peak did not change and was still significant, indicating that the peaks represent two independent QTL. Unlike the F2 study, the AIL study also showed strong evidence of linkage across this region for FVL (including two FVL QTL with LOD scores of 6 and 2.3 respectively) (Figure 5.4A). Analysis of the AIL also revealed an additional QTL for FVL on MMU1 located at 9 cM, with peak LOD of 2.2, which is clearly separate from the original QTL (Figure 5.4A). The LOD score curve from analysis of PFO as a binary trait followed the pattern produced by FOW and FVL data (Figure 5.4A). In particular, its peak at 41.6 cM (LOD = 4) strongly supported highly significant FOW and FVL QTL in that vicinity. The 1 LOD drop-off of this peak covered approximately 2.4 cM, a very substantial refinement on the F2 study.

[Figure 5.4, panels A to H (MMU1, MMU2, MMU4, MMU8, MMU9, MMU11, MMU13 and MMU19): LOD score (y-axis) against genetic position in cM (x-axis), with separate curves for PFO, FVL, FOW and CRW.]

Figure 5.4: Comparison of AIL results for PFO, FVL, FOW, and CRW. Y-axes represent LOD scores and x-axes represent genetic map positions. Vertical blue lines correspond to the position of the AIL markers.

5.2.2.2 MMU2

The broad F2 QTL peak for FOW (peak at 67.5 cM) was narrowed to a sharp peak at 72 cM, refining the QTL genomic region from 18.2 cM (F2) to 3.6 cM (AIL) (Figure 5.3B). We also observed a new highly significant QTL at the distal end of this chromosome (~100 cM) affecting both FOW and FVL (LOD scores 5.79 and 5.44, respectively) (Figures 5.3B and 5.4B). The binary analysis of PFO supported all QTL for FOW and FVL. While the F2 study was not able to detect QTL for CRW on this chromosome, linkage analysis of the AIL data revealed two possibly distinct QTL close to each other at 41.25 cM and 44 cM (Figure 5.4B) (Figure 2.2B) without pleiotropic effects on the other quantitative traits.


5.2.2.3 MMU4

A very broad F2 FOW QTL, which spanned 23.8 cM of the chromosome (peaking at 25.4 cM), resolved in the AIL as two FOW QTL (6 cM, LOD = 2.2; 44 cM, LOD = 6), leaving the mid portion including the 1 LOD drop-off of the original peak unlinked (Figure 5.3C). Given the sparsity of markers across this region in the F2 study, this outcome likely represents a resolution of the broad F2 QTL region into 2 widely spaced QTL. The AIL data also revealed 3 QTL for FVL, of which the distal one was located at the same position as the distal FOW QTL (Figure 5.4C). The analysis of PFO as a binary trait generated a similar pattern to the FVL and FOW results. A new QTL for CRW was detected at 32.25 cM which, as for the CRW QTL on MMU2, did not affect the other traits.

5.2.2.4 MMU8

A region at the end of this chromosome was found to be linked to FVL in the F2 study. It peaked at 62.5 cM and covered 12.2 cM of the chromosome. On analysis of the AIL data, this QTL resolved into two QTL at 52.25 cM and 66 cM (Figure 5.3D). A highly significant QTL was also detected for FOW at 63.7 cM (Figure 5.4D).

5.2.2.5 MMU9

The F2 study revealed a suggestive QTL for FOW peaking at 20.7 cM (LOD = 3.4) and extending 17.3 cM on MMU9. The AIL resolved this into at least four separate QTL with peaks at 9.25, 19.25, 24, and 28.25 cM (Figure 5.3E). Tbx20, a cardiac transcription factor gene, is located at 10.25 cM, within the 1 LOD drop-off of the first QTL and very close to its peak at 9.25 cM. We also detected a new QTL for FOW, peaking at 35.5 cM with maximum LOD score of 4.6. Analysis of the AIL also identified two peaks at 19.5 cM and 23.75 cM with significant LOD scores for FVL (Figure 5.4E), both overlapping with significant peaks for FOW. The analysis of PFO as a binary trait resulted in a strikingly similar pattern to the FOW results and significantly supported all 4 of the FOW QTL.

5.2.2.6 MMU11

The F2 study did not show linkage of any region of MMU11 to the atrial septal traits. However, this chromosome was included in the AIL study to fine map the HW QTL detected by the analysis of the F2 data (see below). Using the chosen markers for the HW QTL we also discovered at least 2 new QTL underlying FVL, with peak LOD score of 4.8 at 9.5 cM (Figure 5.4F).

5.2.2.7 MMU13

The AIL narrowed down the broad genomic region of the F2 QTL for FVL from 19.4 cM to 8.7 cM (Figure 5.3F). It is notable that the AIL data resulted in a shifting of the QTL peak from 15.3 cM (F2) toward the telomere of the chromosome, including two close peaks at 4.9 and 7 cM (AIL) which were determined to be significantly distinct using the fixing method (see 2.2.1.7) (Figure 2.2C).

5.2.2.8 MMU19

A significant F2 QTL for FVL at 10.2 cM resolved into 3 peaks, with the highest LOD score at 16 cM (Figure 5.3G). The fixing method showed these peaks represented 3 separate QTL (see 2.2.1.7) (Figure 2.2D). We also observed linkage between this region and FOW, with a highest LOD score of 3.9 at 19.75 cM (Figure 5.4H). Binary analysis of PFO showed a similar pattern to that seen for FVL and FOW, strongly supporting the QTL located between 13 cM and 23 cM.


5.2.3 Linkage analysis for Heart Weight

Our retrospective analysis of F2 data revealed a suggestive QTL for normalized HW on MMU7 peaking at 42.1 cM, and a significant QTL at the proximal end of MMU11, with highest LOD score of 8.5 (Figure 5.2). While this is the first report of QTL affecting HW on MMU11, the presence of a HW QTL on MMU7 was recently reported by Reed et al. (Reed et al., 2008), although the precise location of this QTL was not identified. An extra 9 markers covering the region were selected for this AIL study (see 2.2.1.5). We also performed linkage analysis for normalized HW on other chromosomal regions for which markers had been previously chosen for analysis of atrial septal morphology. MMU2, MMU4, MMU9, MMU11, and MMU13 showed significant evidence of linkage to HW (Figure 5.5); of these, only MMU2 has previously been linked with HW, albeit on different genetic backgrounds (Rocha et al., 2004). On MMU11 we observed evidence of linkage distally, with a LOD score of 2.3 peaking at 15 cM (Figure 5.6), which confirmed the original F2 QTL (Figure 5.2).

Comparison of linkage results for HW and atrial septal morphology showed that only the HW QTL on MMU2 and MMU9 had a potential overlap within their 1 LOD drop-off with QTL for atrial septal parameters. In general, however, the chromosomal locations of QTL for HW were different from the distribution for atrial septal parameters.

[Figure 5.5, panels A to E (MMU2, MMU4, MMU9, MMU11 and MMU13): LOD score (y-axis) against genetic position in cM (x-axis), with separate curves for HW, FVL, FOW and CRW.]

Figure 5.5: Comparison of AIL results for HW (red line) and quantitative traits of atrial septum.

[Figure 5.6, MMU11: LOD score (y-axis) against genetic position in cM (x-axis) for the F2 and AIL populations.]

Figure 5.6: Comparison of linkage results of MMU11 for HW from F2 (gray line) and AIL (black line) populations.


5.3 Discussion

ASDII affects approximately 1 out of every 1000 live births (Hoffman et al., 2004). An untreated hemodynamically significant ASDII may result in a range of severe complications including heart failure. It is likely that ASDII exists in an anatomical continuum with PFO (Schott et al., 1998; Biben et al., 2000; Kirk et al., 2007; Posch et al., 2010), a common atrial septal variant which is associated with several clinical conditions including cryptogenic stroke (Lechat et al., 1988; Webster et al., 1988). The genetic complexity of these conditions is unknown. However, the formation of the atrial septum during cardiac development is complex, being controlled by genetically programmed cell growth or cell death, as well as myogenesis, adhesion, migration and matrix deposition, and any genetic element that selectively alters one process relative to another may contribute to a defect.

QTL underlying quantitative parameters of the atrial septum were previously mapped using an F2 intercross design, and here, we applied the AIL approach for confirmation and fine mapping. Comparing F2 and AIL designs with identical parameters including sample size and marker density, the confidence intervals of AIL QTL are t/2 times smaller than those of F2 QTL, where t is the number of AIL generations (Darvasi and Soller, 1995). This indicates that AIL is a powerful method for precise localization of QTL and also separation of linked QTL identified by an F2 design. Indeed, of 7 F2 QTL included in our AIL study, at least 6 were confirmed and substantially narrowed, whereas 5 F2 QTL resolved into multiple peaks.
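As a worked illustration of the t/2 relationship, the sketch below computes the interval width expected for a 14-generation AIL from the F2 widths quoted in section 5.2.2 and prints it beside the observed AIL width. The function names are hypothetical, and the comparison is only indicative because the theoretical factor assumes identical sample size and marker density, which did not hold between the F2 and AIL studies.

```python
def ail_ci_reduction_factor(generations):
    """Expected factor by which an AIL narrows a QTL confidence interval: t / 2."""
    return generations / 2.0


def expected_ail_interval(f2_interval_cm, generations):
    """Interval width (cM) expected in the AIL, given the F2 interval width."""
    return f2_interval_cm / ail_ci_reduction_factor(generations)


# (F2 width, observed AIL width) in cM, as quoted in section 5.2.2
observed = {
    "MMU1 FOW": (26.1, 7.5),
    "MMU2 FOW": (18.2, 3.6),
    "MMU13 FVL": (19.4, 8.7),
}

if __name__ == "__main__":
    for name, (f2_cm, ail_cm) in observed.items():
        expected = expected_ail_interval(f2_cm, generations=14)
        print(f"{name}: F2 {f2_cm} cM, expected AIL {expected:.1f} cM, observed AIL {ail_cm} cM")
```

Where an F2 QTL resolved into several AIL peaks, the observed AIL width quoted in the text spans all of them, so it is expected to exceed the single-QTL prediction.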

Among the 3 quantitative parameters studied here, FVL and FOW showed a similar pattern of linkage on most of the chromosomes. This was not evident in the F2 study. The analysis of PFO as a binary trait strongly supported results for FVL and/or FOW on most chromosomes. As noted, of the traits analyzed, FVL has a larger variation between parental strains (Table 5.1) (Kirk et al., 2006) and shows a stronger (inverse) correlation to PFO prevalence among several inbred and mutant strains (Biben et al., 2000), as well as to PFO risk in both F2 and F14 generations. Collectively, these data suggest that many QTL affect formation of the primary and secondary atrial septa in common, and that FVL is a robust indicator for atrial septal morphology and risk of PFO.

CRW did not show a significant correlation to PFO in the AIL data and was therefore not found to be a satisfactory surrogate for the size of the open corridor in cases of PFO (Biben et al., 2000). Nonetheless, we found 5 significant QTL for CRW, only one of which overlapped with QTL for the other traits, suggesting that CRW is determined by largely different genetic elements to those governing FVL and FOW.

As septal parameters may be influenced by heart size, we performed a linkage analysis for normalized HW across MMU11, where a significant QTL was discovered retrospectively in the F2 data, and indeed across all other chromosomal regions for which markers were selected. We confirmed and refined the position of the HW QTL on MMU11 and also detected HW QTL on MMU2, MMU4, MMU9, and MMU13. Of these, only MMU2 has previously been linked with HW (Rocha et al., 2004).

We studied the relationship between HW and atrial septal morphology by calculating inter-trait correlation coefficients as well as comparing linkage results. HW influenced quantitative septal parameters in both F2 and F14 studies, but only in a minor way (Table 5.3). In addition, comparison of linkage results for HW and atrial septal morphology showed a limited overlap. We propose that HW is mostly determined by ventricular chamber growth as an independent parameter from growth of the “primary” myocardial components of the early heart tube, which have a low proliferative index (Moorman and Christoffels, 2003), and additional mesenchymal and cushion components that contribute to formation of the inter-atrial septum.

The AIL study has resulted in a significant improvement in the genetic complexity map for atrial septal morphology. The improved resolution has enabled us to narrow down the areas of interest and identify candidate genes underlying QTL. However, the 1 LOD drop-off of each QTL identified by the AIL study still contains a large number of candidate genes. Finding sequence variations between the parental strains within QTL, either in exons of protein coding genes, conserved cis-regulatory regions or non-coding RNAs, will help the discovery of genetic variations underlying PFO or ASD. Next-generation sequencing adds considerable power in identifying such variations, and indeed, we performed high throughput sequencing of the genomes of the AIL parental strains (see chapter 6).

The AIL study has revealed a large number of QTL for atrial septal morphology. However, given that the analysis was restricted to the previously found QTL and the selected markers covered only a limited part of the genome, it is likely that the genetic complexity underpinning septal defects in the two inbred strains under study is even greater than revealed here. Extrapolating to the diverse outbred human population, we might conclude that potentially many hundreds of gene variants contribute significantly to atrial septal dysmorphology.


Our results provide the first high-resolution picture of genetic complexity underpinning atrial septal variation in a mouse model. They are clearly consistent with Fisher's 'infinitesimal' model, which assumes a very large number of loci each contributing a small part of the liability to manifestation of a phenotype in common disease (Fisher, 1918), and are consistent with genome-wide studies of human disease, which generally detect loci of small effect with odds ratios of less than two. However, as detected in the F2 study (Kirk et al., 2006), individual loci can contribute significant effects, although how such variants found in mice relate to risk of septal defects in humans remains to be seen. QTL studies in animal models have the potential to significantly inform our understanding of the genetic underpinnings of common human disease, especially since genome-wide studies in humans are generally sufficiently powered to detect only a subset of variants. Concerning the variance in septal parameters that we have observed even in genetically homogeneous mice kept under constant laboratory conditions, our study reveals the extent of stochastic (probabilistic) and fetal environmental contributions to septal dysmorphology. Our quantitative model offers opportunities to study the interface between genetic, environmental and epigenetic inputs to common CHD in greater detail.


6 Identification of candidate genes underlying the AIL QTL using whole genome sequencing of the parental strains

6.1 Introduction

6.1.1 Haplotype mapping and QTL positional cloning

Compared with human populations, laboratory inbred mouse strains show significantly less genetic diversity between strains, mainly due to the fact that they have been generated over a shorter period of time and from only a limited number of founders. These strains originated from three subspecies of Mus musculus (M. m.) including M. m. musculus, M. m. castaneus, and M. m. domesticus which all derived from a common ancestor about 1 million years ago. Much later, about 10,000 years ago, a hybridization of M. m. musculus and M. m. castaneus gave rise to another founder of the current inbred strains, M. m. molossinus (Bonhomme et al., 1987; Frazer et al., 2007). About 87% of the haplotypes from the genomes of the laboratory inbred strains can be attributed to the three Mus musculus subspecies (as mentioned above) and the hybrid M. m. molossinus, with the rest being of unknown origin (Frazer et al., 2007). Genomic analysis of inbred strains reveals a mosaic pattern, consisting of many haplotypes each attributable to the genome of a founder line. Concomitantly, a pairwise comparison of the genomic sequences of two inbred strains shows blocks of low and high SNP density representing regions of shared and unshared ancestry, respectively (Wade et al., 2002).

Analysis of the mouse haplotype map (HapMap) can be used in QTL positional cloning. Consider QTL mapping established by a cross between two inbred strains. The genes underlying the identified QTL are less likely to lie in genomic regions of common ancestry between the two strains, termed identical by descent (IBD) regions, so the regions of unshared ancestry (non-IBD regions) are of greater importance for further cloning. This approach is more effective when multiple crosses identify the same QTL, since the non-IBD regions from different crosses can be compared and only the overlaps selected for further analysis (Wade et al., 2002).
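The comparison of non-IBD regions across crosses amounts to intersecting sets of genomic intervals. The following is a minimal sketch of that idea only, not the procedure used in this thesis: the function name, the representation of regions as sorted half-open (start, end) pairs and the coordinates are all hypothetical.

```python
def intersect_intervals(a, b):
    """Return overlaps between two sorted lists of half-open (start, end) intervals."""
    overlaps = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps


# Hypothetical non-IBD regions (bp) inside the same QTL, from two different crosses.
cross1_non_ibd = [(100, 500), (800, 1200)]
cross2_non_ibd = [(300, 900), (1100, 1500)]

# Only regions that are non-IBD in both crosses remain candidates.
shared_non_ibd = intersect_intervals(cross1_non_ibd, cross2_non_ibd)
```

With more than two crosses, the same function could be applied repeatedly to the running result.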

The available mouse HapMap imputed genotype resources provide genotypes of high density SNPs in several inbred strains. For example, the NIEHS/Perlegen resequencing project has discovered more than 8 million SNPs across mouse genomes. The corresponding imputed genotypes from 94 inbred strains, including the parental strains of our AIL study (QSi5 and 129T2/SvEms), are now available. Half of them are of high quality. This provides the opportunity to perform haplotype analysis and more finely resolve QTL.

It is difficult to successfully identify the alleles underlying QTL using standard mapping techniques, even when supplemented by the haplotype approach described above. Haplotype mapping is often not precise enough to locate the QTL genes. In addition, SNPs can occur in the low SNP density regions due to recent mutations arising after creation of an inbred strain, and may be responsible for part of the variation between two strains. However, there is a risk of ignoring them if haplotypes are used as a mapping tool. Despite these limitations, haplotype mapping is an extremely cost- and time-effective in silico approach with noteworthy benefits in the journey from QTL to gene.

6.1.2 Using whole genome sequencing to prioritize the candidate genes for AIL QTL

As shown in 4.2, our AIL study resulted in a significant improvement in the resolution of the genetic map for atrial septal dysmorphology, with each original QTL identified by the F2 study being resolved into a single but significantly smaller region or split into two or more peaks. Genomic regions for the 1 LOD drop-off of the AIL QTL, representing the confidence interval of each QTL, are shown in table 5.4. On average, the confidence interval of each AIL QTL spanned 7.1 Mb of chromosomal length (range 1.2 Mb to 17.4 Mb) and contained a number of candidate genes. For example, the genomic region defined by the 1 LOD drop-off of the highly significant FVL QTL on MMU1 contains 16 known genes, at least 4 of them annotated as being expressed in the heart (Figure 6.1). Thus, prioritization of the candidate genes was necessary before any further analysis.

[Plot: LOD score (y-axis, 0 to 7) against genetic position in cM (x-axis, 0 to 70), with the following genes labelled: Farsb, Mogat1, Acsl3, Utp14b, Kcne4, Scg2, Ap1s3, Wdfy1, Mrpl44, Serpine2, Fam124b, Cul3, Dock10, 9430031J16Rik, Irs1, Rhbdd1.]

Figure 6.1: Genes located within 1 LOD drop-off of the FVL QTL on MMU1. Those expressed in the heart are highlighted in red.
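For readers unfamiliar with the 1 LOD drop-off interval, the sketch below illustrates one simple way to derive it from a LOD profile. It is a simplified illustration under stated assumptions, not the mapping software used in this study: it takes the contiguous run of scanned positions around the peak whose LOD score stays within 1 LOD unit of the maximum, without interpolating between positions. The function name and the example profile are hypothetical.

```python
def lod_drop_interval(positions, lod_scores, drop=1.0):
    """Return (left, peak, right) positions bounding the LOD drop-off interval.

    The interval is the contiguous run of scanned positions around the peak
    whose LOD score is at least (peak LOD - drop).
    """
    peak_idx = max(range(len(lod_scores)), key=lod_scores.__getitem__)
    threshold = lod_scores[peak_idx] - drop

    left = peak_idx
    while left > 0 and lod_scores[left - 1] >= threshold:
        left -= 1

    right = peak_idx
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= threshold:
        right += 1

    return positions[left], positions[peak_idx], positions[right]


# Hypothetical LOD profile along a chromosome (positions in cM).
positions = [0, 10, 20, 30, 40, 50, 60, 70]
lod = [0.5, 1.2, 3.1, 5.8, 6.4, 5.6, 2.9, 1.0]
interval = lod_drop_interval(positions, lod)
```

In practice, interval bounds are usually refined by interpolation or extended to the flanking markers, so this sketch gives only a first approximation.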

Sequencing of the genomes of the parental strains and identification of the genetic variations between them may accelerate prioritization of candidate genes. A complex phenotype is affected by multiple QTL, each attributed to at least one DNA sequence change with a cumulative, but not necessarily major, effect on gene function. The changes of greatest effect are more likely to be found in the genomic sequences of individuals at either end of the phenotypic spectrum, suggesting that comparing the genomic sequences of individuals with extreme phenotypes, for example the parental strains of an AIL, provides a higher chance of detecting the causative changes. In an interstrain comparison, the majority of QTL genes are likely to show no sequence variation and can be excluded. For example, when two inbred strains are compared, the chance that a gene carries a nonconservative amino acid change in its open reading frame (ORF) is only 10–20%, and therefore the number of candidate genes for a QTL may be reduced to as little as one-tenth by comparing the coding regions (Belknap et al., 2001). This number can be reduced further by functional assays, as only half of the polymorphisms are functionally significant. Given these estimates, a genomic region of 1 cM contains only three to five genes with predicted functional polymorphisms in the ORFs (Belknap et al., 2001).

Sequencing of the whole genomes of the parental strains has the potential to detect all variations existing in the QTL simultaneously. The required cost and time have been dramatically reduced by next generation sequencing technologies. We therefore performed high throughput sequencing of the whole genomes of the AIL parental strains to identify the variations, and the data were then used to prioritize the QTL candidate genes. This chapter first describes identification of the variations between the AIL parental strains using whole genome sequencing. Next, filtering of the variations through a multi-step process is explained. In addition, this chapter presents an analysis of SNP density across the genome using the deep sequencing data, which was then validated by analysis of an available HapMap resource.

Dr. Catherine Suter and Mr. Paul Young prepared the genomic DNA (with help from Ms. Karen Brennan in identifying and preparing the correct animals). The libraries from the genomic DNA were made by Dr. Jeffrey Squires. Dr. David Humphreys performed quality control on the final library preparation in addition to ePCR and sequencing. All of the tasks mentioned here were supervised by Dr. Thomas Preiss. Ms. Tram Doan performed most of the data analysis. However, when she left the Harvey Lab, MM took primary responsibility for the data analysis. For this, MM improved his knowledge and skills in bioinformatics and programming. Ms. Tram Doan and Dr. Mirana Ramialison were of great help to him in this transition.

6.2 Identification of the genetic variations between QSi5 and 129T2/SvEms mouse strains using whole genome sequencing

The full methodology for the whole genome sequencing, the data analysis pipeline and the calling of genetic variants is described in 2.2.2. In summary, two DNA fragment libraries, one from the QSi5 and one from the 129T2/SvEms mouse strain, were prepared according to the standard protocols of the SOLiD system, a next-generation sequencing platform. Next, the genomic DNAs of the two strains were amplified using standard emulsion PCR and the beads were deposited on four slides for sequencing on the SOLiD™ Analyzer. We ran the SOLiD™ Analyzer twice; in each run two slides were analyzed, one from QSi5 and one from 129T2/SvEms. The reads obtained from the two runs were pooled for data analysis.

After assessment of the quality of the reads, they were mapped to the mouse reference sequence using BFAST, a DNA sequence alignment tool. In total, 76.5% of the QSi5 reads and 68.3% of the 129T2/SvEms reads were successfully mapped to the reference genome, giving approximately 8-fold and 6-fold coverage of the QSi5 and 129T2/SvEms genomes, respectively (Figure 6.2).
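As a reminder of how a fold-coverage figure relates to the number of mapped reads, the following minimal sketch computes mean coverage as mapped bases divided by genome size. The read count, read length and genome size used here are round, hypothetical values chosen for illustration; they are not the figures from our sequencing runs.

```python
def mean_coverage(n_mapped_reads, read_length_bp, genome_size_bp):
    """Mean fold coverage = total mapped bases / genome size."""
    return n_mapped_reads * read_length_bp / genome_size_bp


# Hypothetical values: 400 million mapped reads of 50 bp on a 2.5 Gb genome.
example_coverage = mean_coverage(400_000_000, 50, 2_500_000_000)
```

This is an average only; the actual depth varies from site to site, which is why a per-variant depth threshold is applied during filtering.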

Figure 6.2: Coverage of the QSi5 and 129T2/SvEms genomes by the whole genome sequencing reads

Using GATK, the Genome Analysis Toolkit, the genomic sequence from each strain was compared to the reference genome to call genetic variants including SNPs, insertions, deletions, copy number and structural variations. GATK software was also used for variation quality recalibration.

6.3 Filtering the genetic variations between QSi5 and 129T2/SvEms mouse strains

A multi-step process was performed to filter the variants and, consequently, the candidate genes containing the variants. First, the two GATK outputs, one for each strain, were combined and the genome-wide variant calls were filtered using the following criteria:

1. Variants should fall within QTL regions (1 LOD drop-off confidence interval).

2. Sequencing depth for the variants should be at least 8 times.

3. Genotype should differ between the two mouse strains.

4. Variants should come from tranches with estimated false discovery rate (FDR) under 10%.

This resulted in more than 215,000 variants.
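The four criteria above can be expressed as a single predicate applied to each variant call. The sketch below is illustrative only and is not the pipeline code used in this study: the `Variant` record, its field names and the example coordinates are hypothetical, and it assumes that a single depth value per variant is compared against the threshold.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    depth: int          # sequencing depth at the site (assumed single value)
    genotype_a: str     # genotype call in the first strain
    genotype_b: str     # genotype call in the second strain
    tranche_fdr: float  # estimated FDR of the recalibration tranche (0 to 1)


def in_qtl(variant, qtl_intervals):
    """True if the variant lies inside any (chrom, start, end) QTL interval."""
    return any(chrom == variant.chrom and start <= variant.pos <= end
               for chrom, start, end in qtl_intervals)


def passes_filters(variant, qtl_intervals, min_depth=8, max_fdr=0.10):
    """Apply criteria 1 to 4 in order."""
    return (in_qtl(variant, qtl_intervals)
            and variant.depth >= min_depth
            and variant.genotype_a != variant.genotype_b
            and variant.tranche_fdr < max_fdr)


# Hypothetical QTL interval and variant calls.
qtl = [("1", 1000, 2000)]
calls = [
    Variant("1", 1500, 12, "A/A", "G/G", 0.01),  # passes all criteria
    Variant("1", 2500, 12, "A/A", "G/G", 0.01),  # outside the QTL
    Variant("1", 1600, 5, "A/A", "G/G", 0.01),   # depth too low
    Variant("1", 1700, 20, "C/C", "C/C", 0.01),  # same genotype in both strains
    Variant("1", 1800, 20, "C/C", "T/T", 0.50),  # tranche FDR too high
]
kept = [v for v in calls if passes_filters(v, qtl)]
```

Keeping each criterion as a separate clause makes it straightforward to count how many variants are removed at each step.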

Next, the variants were annotated, with the effect of each on the corresponding gene being predicted using two independent tools, SNP Effect Predictor (SnpEff) and Ensembl Variant Effect Predictor (VEP). The two programs gave largely similar results. The annotated variants were significantly shared between the outputs from the two programs (Table 6.1; Chi-Square p-value < 0.001). The majority of the variants annotated by SnpEff were also covered by VEP. On the other hand, VEP gave many more variants than SnpEff, most likely because, unlike SnpEff, VEP also considers indels (insertions or deletions of nucleotides from a sequence) when calling variants. In general, the effects of the variants predicted by the two programs were also in agreement. Out of 143572 variants annotated by both programs, 112647 were known variants, 30902 were novel and displayed concordant results from the two programs, and only 23 were novel with the effect predicted differently by the two programs.

Table 6.1: Numbers of variants annotated by SnpEff and/or VEP

Annotated by SnpEff | Annotated by VEP: Yes (n) | Annotated by VEP: No (n) | Total
Yes (n) | 143572 | 277 | 143849
No (n) | 71690 | 0 | 71690
Total | 215262 | 277 | 215539

To focus on better-validated variants, we selected the variants annotated by both programs and excluded those obtained from only one. These were then filtered for a high or moderate predicted impact, resulting in 927 SNPs spread over 328 genes.

These genes were then filtered based on being related to the heart according to the following criteria:

1. Genes should be expressed in the heart; or

2. Genes should be associated with the cardiovascular system according to the analysis; or

3. Genes should be associated with a known cardiovascular phenotype.

Seventy-five genes containing 251 SNPs met the criteria.

Finally, the genes were filtered to those located in the high SNP density regions of the genome (see 6.1.1). Identification of the low and high SNP density regions is fully described in 6.4. Using this filtering step, the number of candidate genes was reduced to only 63 protein coding genes, representing an approximate average of 1.7 genes per QTL. The final candidate genes are listed in table 6.2. Of the SNPs identified in these genes, only those with high impact on protein function are listed in table 6.3.

Table 6.2: Genes within AIL QTL which passed through all filtering steps

Symbol | Name | Chr
Abcg4 | ATP-binding cassette, sub-family G (WHITE), member 4 | 9
Apoc3 | apolipoprotein C-III | 9
Arid4b | AT rich interactive domain 4B (RBP1-like) | 13
Asph | aspartate-beta-hydroxylase | 4
Atm | ataxia telangiectasia mutated homolog (human) | 9
Bai3 | brain-specific angiogenesis inhibitor 3 | 1
Bcmo1 | beta-carotene 15,15'-monooxygenase | 8
Bmper | BMP-binding endothelial regulator | 9
Calb1 | calbindin 1 | 4
Car9 | carbonic anhydrase 9 | 4
Cbl | Casitas B-lineage lymphoma | 9
Ccm2 | cerebral cavernous malformation 2 | 11
Cdh13 | cadherin 13 | 8
Cdon | cell adhesion molecule-related/down-regulated by oncogenes | 9
Col19a1 | collagen, type XIX, alpha 1 | 1
Col5a3 | collagen, type V, alpha 3 | 9
Csnk1g1 | casein kinase 1, gamma 1 | 9
Cst3 | cystatin C | 2
Des | desmin | 1
Dnajb5 | DnaJ (Hsp40) homolog, subfamily B, member 5 | 4
Dync1i2 | dynein cytoplasmic 1 intermediate chain 2 | 2
Fxn | frataxin | 19
G6pc2 | glucose-6-phosphatase, catalytic, 2 | 2
Gad1 | glutamate decarboxylase 1 | 2
Galt | galactose-1-phosphate uridyl transferase | 4
Gnas | GNAS (guanine nucleotide binding protein, alpha stimulating) complex locus | 2
Gpx6 | glutathione peroxidase 6 | 13
Hecw1 | HECT, C2 and WW domain containing E3 ubiquitin protein ligase 1 | 13
Ift46 | intraflagellar transport 46 | 9
Ikbkap | inhibitor of kappa light polypeptide enhancer in B cells, kinase complex-associated protein | 4
Il11ra1 | interleukin 11 receptor, alpha chain 1 | 4
Itga6 | integrin alpha 6 | 2
Lancl1 | LanC (bacterial lantibiotic synthetase component C)-like 1 | 1
Map2k1 | mitogen-activated protein kinase kinase 1 | 9
Mpzl3 | myelin protein zero-like 3 | 9
Mtap2 | microtubule-associated protein 2 | 1
Mtmr3 | myotubularin related protein 3 | 11
Myl1 | myosin, light polypeptide 1 | 1
Nf2 | neurofibromatosis 2 | 11
Nkx2-2 | NK2 transcription factor related, locus 2 (Drosophila) | 2
Npr2 | natriuretic peptide receptor 2 | 4
Pax1 | paired box gene 1 | 2
Pdk1 | pyruvate dehydrogenase kinase, isoenzyme 1 | 2
Pfkp | phosphofructokinase, platelet | 13
Prune2 | prune homolog 2 (Drosophila) | 19
Rapgef4 | Rap guanine nucleotide exchange factor (GEF) 4 | 2
Rce1 | RCE1 homolog, prenyl protein peptidase (S. cerevisiae) | 19
Rdh8 | retinol dehydrogenase 8 | 9
Reck | reversion-inducing-cysteine-rich protein with kazal motifs | 4
Rnf38 | ring finger protein 38 | 4
Rrbp1 | ribosome binding protein 1 | 2
Sf1 | splicing factor 1 | 19
Sigmar1 | sigma non-opioid intracellular receptor 1 | 4
Slc4a3 | solute carrier family 4 (anion exchanger), member 3 | 1
Sorl1 | sortilin-related receptor, LDLR class A repeats-containing | 9
Tle3 | transducin-like enhancer of split 3, homolog of Drosophila E(spl) | 9
Tln1 | talin 1 | 4
Tpm1 | tropomyosin 1, alpha | 9
Ttn | titin | 2
Uaca | uveal autoantigen with coiled-coil domains and ankyrin repeats | 9
Xbp1 | X-box binding protein 1 | 11
Zfp26 | zinc finger protein 26 | 9
Zfp352 | zinc finger protein 352 | 4

Table 6.3: SNPs identified by the whole genome sequencing with high impact on the candidate genes listed in table 6.2

Chr | Position | Gene_name | Effect | Impact | Old_codon/New_codon
1 | 66449494 | Mtap2 | STOP_GAINED (1) | High | CAG/TAG
1 | 66450672 | Mtap2 | STOP_GAINED | High | TGG/TGA
2 | 71100933 | Dync1i2 | STOP_GAINED | High | TGG/TGA
2 | 71715328 | Pdk1 | SPLICE_SITE_ACCEPTOR (2) | High |
4 | 9549010 | Asph | STOP_LOST (3) | High | TGA/TGG
4 | 9561371 | Asph | STOP_GAINED | High | TGG/TGA
4 | 41701766 | Sigmar1 | STOP_LOST | High | TGA/TTA
4 | 43950254 | Reck | STOP_GAINED | High | CAG/TAG
9 | 20248886 | Zfp26 | STOP_GAINED | High | CGA/TGA
9 | 44590278 | Ift46 | STOP_GAINED | High | CGA/TGA
9 | 44597761 | Ift46 | STOP_GAINED | High | GAA/TAA
13 | 14280024 | Arid4b | STOP_GAINED | High | CAA/TAA

(1) Variant causes a STOP codon.
(2) The variant hits a splice acceptor site (defined as two bases before exon start, except for the first exon).
(3) Variant causes stop codon to be mutated into a non-stop codon.

6.4 Genomic analysis of SNP density

6.4.1 SNP density analysis using data from the whole genome sequencing

As discussed in 6.1.1, in a pairwise comparison of the genomes from two mouse inbred strains, blocks of low and high SNP density can be observed. Having the whole genomic sequence of QSi5 and 129T2/SvEms and the genetic variations between them, we were able to measure the SNP density across the genome and, based on this, partition the genome into segments of low or high SNP density.

First, a 500 kb segment of chromosome 1 was examined to assess the distribution of SNP densities. The SNP density was computed as the number of SNPs divided by the total number of genotyped bases for sliding windows of 10 kb moving in steps of 100 bp across the segment. As shown in Figure 6.3, the distribution of SNP densities was not normal and showed a relatively bimodal pattern, suggesting the existence of two categories of genomic regions that differ significantly in SNP rate, referred to as low and high SNP density regions. The boundary between the two peaks of the distribution was taken as the threshold for classifying genomic segments as either low or high SNP density regions. Next, the SNP density across the whole genome was calculated using 10 kb windows with 100 bp steps, and we identified whether the candidate genes for AIL QTL were located in low or high SNP rate regions (see 6.3).
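The sliding-window calculation described above can be sketched as follows. This is a minimal illustration rather than the script used for the analysis: the function names are hypothetical, SNP and genotyped positions are assumed to be available as lists of coordinates, and the threshold in the example is an arbitrary value standing in for the boundary between the two peaks of the density distribution.

```python
from bisect import bisect_left


def count_in_window(sorted_positions, start, end):
    """Count positions p with start <= p < end in a sorted list."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)


def sliding_snp_density(snp_positions, genotyped_positions,
                        region_start, region_end, window=10_000, step=100):
    """Return (window_start, density) pairs; density is None if nothing was genotyped."""
    snps = sorted(snp_positions)
    genotyped = sorted(genotyped_positions)
    densities = []
    start = region_start
    while start + window <= region_end:
        n_genotyped = count_in_window(genotyped, start, start + window)
        n_snps = count_in_window(snps, start, start + window)
        density = n_snps / n_genotyped if n_genotyped else None
        densities.append((start, density))
        start += step
    return densities


def classify(density, threshold):
    """Label a window as a 'high' or 'low' SNP density region."""
    return "high" if density >= threshold else "low"


# Toy data with a 10 bp window and 5 bp step so the result is easy to follow.
toy_genotyped = list(range(0, 20))
toy_snps = [1, 2, 3, 12]
toy = sliding_snp_density(toy_snps, toy_genotyped, 0, 20, window=10, step=5)
labels = [classify(d, threshold=0.2) for _, d in toy if d is not None]
```

With the default arguments the same function uses 10 kb windows and 100 bp steps, as in the analysis described in the text.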

Figure 6.3: Density plot of SNP densities across a 500 kb segment of chromosome 1 based on the deep sequencing data

6.4.2 Validation of the SNP density analysis using HapMap data

An available dataset of imputed genotypes at 8 million SNPs discovered by the NIEHS/Perlegen resequencing project was used to validate our SNP density analysis. Here, we analysed the same 500 kb segment of chromosome 1 as in the analysis of the whole genome sequencing data (see 6.4.1). The numbers of shared and unshared SNP and non-SNP calls between the two datasets, from the whole genome sequencing and the NIEHS/Perlegen resequencing project, showed a significant overlap (Chi-Square p-value < 0.001) (Table 6.4).

Table 6.4: Comparison of SNP and non-SNP calls by the whole genome sequencing and the NIEHS/Perlegen resequencing project within a 500 kb segment of chromosome 1

Whole genome sequencing | NIEHS/Perlegen resequencing: SNPs (n) | NIEHS/Perlegen resequencing: Non-SNPs (n) | Total
SNPs (n) | 210 | 8 | 218
Non-SNPs (n) | 161 | 1145 | 1306
Total | 371 | 1153 | 1524
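The Chi-Square comparison summarised in Table 6.4 can be reproduced from the four cell counts. The sketch below uses the standard Pearson statistic for a 2 x 2 table without continuity correction; whether a correction was applied in the original analysis is not stated, so this is an assumption, and the function name is hypothetical. The p-value uses the identity that, for one degree of freedom, the upper tail of the Chi-Square distribution equals erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-Square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Cell counts from Table 6.4 (rows: whole genome sequencing; columns: NIEHS/Perlegen).
chi2, p_value = chi_square_2x2(210, 8, 161, 1145)
significant = p_value < 0.001
```

The statistic is far above 10.83, the critical value for p = 0.001 with one degree of freedom, which agrees with the p-value reported in the text.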

In addition, the SNP density for 10 kb windows with 100 bp steps showed a similar pattern to that obtained from the analysis of the whole genome sequencing data (Figure 6.4).

Figure 6.4: Density plot of SNP densities across a 500 kb segment of chromosome 1 based on the HapMap data

6.5 Discussion

Linkage analysis is a powerful approach to identify loci with large effect size underlying complex phenotypes, and indeed, thousands of QTL mapping studies in humans and animals have been conducted so far. However, a large proportion of the QTL have not been attributed to specific responsible genes, which presents two major challenges: locating promising candidate genes in the genomic region of a QTL, and then experimentally validating those candidates.

Despite the high resolution genetic map obtained from our AIL study, the QTL contained many genes and only a small fraction of them may be responsible for atrial septal dysmorphology. The whole genome sequencing approach that we applied here accelerated the process of prioritizing candidate genes in different ways. Firstly, sequence variations across the genomes of the AIL parental strains, including coding sequence changes, were detected and then filtered based on multiple criteria. Secondly, the data were used independently to partition the genome into low and high SNP rate regions, which was applied in the filtering process. Eventually, 63 protein coding genes were suggested for involvement in atrial septal dysmorphology (Table 6.2), although they still need to be validated.

Despite the many advantages of deep sequencing of the whole genome for identifying genetic variations between two strains, it is not always straightforward. Normally a huge amount of data is generated, which necessitates a rigorous filtering process. Briefly, the quality of the generated data should be controlled. The next steps are designed based on the nature of the study. In our study, we aimed to filter for the variants with the highest predictive potential for being involved in atrial septal dysmorphology. However, it is likely that some responsible genes were missed during the filtering process. For example, the SNPs that were filtered out at the quality checking level are not necessarily sequencing artefacts; it is likely that some causative variants are located within low SNP density regions; and there might be some causative variants in genes which have not so far been reported to be related to a heart expression term. What we have preferred here is a practical prioritization. While the SNPs which passed through all the filters are not necessarily the causative SNPs, they are of top priority for any further analysis.

Out of the 63 genes within AIL QTL mentioned above, we briefly describe here those in which we identified SNPs with high impact on the corresponding protein function (Table 6.3). The data on each gene described below were obtained from the MGI (Mouse Genome Informatics) and Ensembl Genome Browser websites.

Mtap2, microtubule-associated protein 2, affects body growth, nervous system, behaviour, and aging process (Lewis et al., 1986; Ohmura et al., 2012). Expression in the heart is also reported (Kalsotra et al., 2008). Mtap2 has one paralogous gene, Mapt (microtubule-associated protein tau), located on MMU11. Although Mapt encodes a different protein, it targets the same anatomical systems as Mtap2.

Dync1i2, dynein cytoplasmic 1 intermediate chain 2, is another candidate gene. Our knowledge about the anatomical systems involved is limited. However, it has been reported to be expressed in the heart (Crackower et al., 1999). Dync1i2 has 6 paralogues.

Pdk1, pyruvate dehydrogenase kinase, isoenzyme 1, is located on MMU2. As with Dync1i2, expression in the heart has been reported (Visel et al., 2004). It has 4 paralogous genes.

Asph, aspartate-beta-hydroxylase, is located on MMU4. Homozygous mutation of this gene results in syndactyly, facial dysmorphology, mild hard palate defects, and reduced female fertility. Expression in the heart has been reported (Hong et al., 2005; Kalsotra et al., 2008). It has 2 paralogous genes.

Sigmar1, sigma non-opioid intracellular receptor 1, is also located on MMU4. The phenotypes of mice homozygous for a gene trapped allele include abnormal motor coordination and increased depressive-like behaviour. Mice homozygous for a knock-out allele exhibit reduced spontaneous activity. Expression in the heart has been reported (Langa et al., 2003). Sigmar1 has no paralogues.

Reck, reversion-inducing-cysteine-rich protein with kazal motifs, is located on MMU4. Homozygous mutation of this gene causes lethality around E10.5-E11.5, with defects in collagen fibrils, basal lamina and vascular development. It is involved in the cardiovascular system. No paralogues have been identified.

Zfp26, zinc finger protein 26, is located on MMU9. The phenotype has not been completely characterised. However, expression in the heart has been reported (Chowdhury et al., 1988). This gene has 17 paralogous genes.

Ift46, intraflagellar transport 46, is a gene on MMU9. Little is known about affected anatomical systems although expression in the heart has been reported (Gouttenoire et al., 2007). No paralogues have been identified.

Arid4b, AT rich interactive domain 4B (RBP1-like), is located on MMU13. Mice homozygous for a null allele die pre-implantation. This gene has 3 paralogues.

Collectively, none of the above genes is among the genes known to be involved in cardiac development. Reck is likely to be a good candidate with which to start further analysis. It is involved in the cardiovascular system. Furthermore, it has no paralogues, indicating that loss of this gene through deleterious mutations may not be compensated for by paralogues.

7 Conclusions and future prospects

Our work on the genetic dissection of atrial septal dysmorphologies including ASD and PFO has resulted in several insights. We document the complex nature of the genetic contribution to common septal variation and risk of PFO in a mouse model, and have made progress towards identification of some of the contributing genes. This was done by combining classical quantitative genetic approaches, including QTL mapping using F2 intercross and AIL designs, with new genomic technologies including high throughput sequencing of the genome. While PFO can be regarded as a mild anatomical variation, and for most individuals of little immediate hemodynamic consequence, it confers a significant risk in and of itself for a growing number of debilitating vascular conditions including stroke and migraine. Since PFO likely exists within an anatomical and pathological continuum that includes more serious defects such as ASD, causative genes for PFO are relevant to the broader picture of CHD pathophysiology. Furthermore, since PFO is common, its overall impact on human health is significant.

Sax’s study on the common bean ~100 years ago was among the first attempts to map linkage between genes underlying certain qualitative traits (seed colour or colour patterns) and a quantitative trait of interest (seed weight) (Sax, 1923). Since then several human disease loci have been mapped using traditional QTL mapping approaches, although many studies have failed to identify the actual causal genes, highlighting three major challenges for this technology: 1) fine localization of QTL; 2) isolation of individual genes responsible for QTL; and 3) proving disease-gene association. Recent advances in genetic technology have revolutionised QTL mapping and contributed a great deal to overcoming such challenges. First, the development of high density SNP-based platforms provided a chance for high resolution genetic mapping; more recently, next generation sequencing technologies have arrived that are cost-, time-, and labor-effective.

The progress we have made in our study of atrial septal dysmorphologies reflects the advances in the genetic mapping technologies mentioned above. While the initial study, using an F2 intercross design, was performed using microsatellite markers, in the AIL the markers were selected from a SNP data set, and eventually we applied an NGS technology for whole genome sequencing of the parental strains. Our project has resulted in a short list of candidate genes with strongly predicted causative roles in atrial septal dysmorphologies. However, no experimental support for causation was available at the time of completing this thesis, and this will be the topic of future work in the laboratory.

There are several approaches to proving a causative role for a candidate gene or, more specifically, for a candidate mutation. In human studies the first step is usually segregation analysis, in which several individuals from an affected family or disease cohort and from the normal population are sequenced for the candidate mutation. Being present in all affected individuals, absent in unaffected family members and rare or absent in the normal population provides strong evidence for a causative role for the mutation (Korstanje and Paigen, 2002). In animals, a different approach may be applied. The number of candidate genes may be reduced to only one or a few by continuing fine mapping and positional cloning of the region. This may be followed by use of transgenics or knockouts, which can provide definitive genetic proof (Korstanje and Paigen, 2002). While each candidate gene or variant can be studied separately, analysis of the genetic elements underlying QTL for a single trait as a network may provide greater insight. This is based on the fact that quantitative trait nucleotides (QTNs), the molecular variations that underlie QTL, may interact within a complex layered network, with layers including transcription, chromatin structure, alternative splicing, microRNA regulation, translation, post-translational modification, catalysis, protein:protein interaction, metabolism and physiology. Indeed, this presents a new challenge: to identify the causal and correlative effects of genetic variants according to these networks (Mackay et al., 2009). To tackle this, systems biology approaches have been proposed which integrate variations from different sources including DNA sequence, transcript abundance, and organismal phenotypes in a mapping population, providing the possibility to interpret quantitative genetic variation according to the causal networks of correlated transcripts (Sieberts and Schadt, 2007; Mackay et al., 2009).

Here, we used the laboratory mouse as an animal model for humans and indeed, given the remarkable level of developmental, anatomical and genetic homology between the two, our findings on disease-associated genes can be easily extended to human diseases. Once causative genes for QTL are identified, the underlying pathologic mechanism may be studied, which may provide insight into human congenital heart disease mechanisms and contribute to thinking about human impact, which could include genetic counselling, surgical choices, fetal therapies, postnatal therapy (which could include stem cell therapy, for example, for hypoplastic chambers) and gene therapy. While many of these are frontiers in themselves, our findings may, in the short term, be most helpful in understanding how disease loci pass down through families and can contribute to disease risk.

8 References

(1999). Freely associating. Nature genetics 22, 1-2.

(2004). "Headache disorders Fact sheet." Retrieved 01/06/2012, 2012.

(2004). The International Classification of Headache Disorders: 2nd edition. Cephalalgia : an international journal of headache 24 Suppl 1, 9-160.

(2008, 26/03/08). "human genome project information: The science behind the human genome project ". Retrieved 01/06/12, 2012, from http://www.ornl.gov/sci/techresources/Human_Genome/project/info.shtml.

(2009, February 2009). "Inbred Laboratory Mouse Haplotype Map." from http://www.broadinstitute.org/mouse/hapmap/.

(2010). Applied Biosystems SOLiD® 4 System Instrument Operation Guide A. Biosystems.

(2010). Applied Biosystems SOLiD™4 System Templated Bead Preparation Guide. A. Biosystems.

(2010). Applied Biosystems SOLiD™ 4 System Library Preparation Guide. A. Biosystems.

(2010). WISQARS Leading Causes of Death Reports, 1999 - 2007. C. f. D. C. a. Prevention, Centers for Disease Control and Prevention.

(2012). "Mouse HapMap Imputed Genotype Resource." Retrieved 27/07/2012, 2012, from http://mouse.cs.ucla.edu/mousehapmap/index.html.

Agmon, Y., Khandheria, B.K., Meissner, I., Gentile, F., Whisnant, J.P., Sicks, J.D., O'Fallon, W.M., Covalt, J.L., Wiebers, D.O., and Seward, J.B. (1999). Frequency of atrial septal aneurysms in patients with cerebral ischemic events. Circulation 99, 1942-1944.

Allanson, J.E. (1987). Noonan syndrome. Journal of medical genetics 24, 9-13.

Anderson, J., and Akkina, R. (2007). Complete knockdown of CCR5 by lentiviral vector- expressed siRNAs and protection of transgenic macrophages against HIV-1 infection. Gene therapy 14, 1287-1297.

Anderson, R.H., Webb, S., Brown, N.A., Lamers, W., and Moorman, A. (2003). Development of the heart: (2) Septation of the atriums and ventricles. Heart 89, 949-958.

Anzola, G.P., Frisoni, G.B., Morandi, E., Casilli, F., and Onorato, E. (2006). Shunt-associated migraine responds favorably to atrial septal repair: a case-control study. Stroke; a journal of cerebral circulation 37, 430-434.

Arquizan, C., Coste, J., Touboul, P.J., and Mas, J.L. (2001). Is patent foramen ovale a family trait? A transcranial Doppler sonographic study. Stroke; a journal of cerebral circulation 32, 1563-1566.

Aslam, F., Shirani, J., and Haque, A.A. (2006). Patent foramen ovale: assessment, clinical significance and therapeutic options. Southern Medical Journal 99, 1367-1372.

Aston, C.E., and Wilson, S.R. (1986). Two-point versus multipoint linkage analysis: a statistical view. Genetic epidemiology. Supplement 1, 113-116.

Azarbal, B., Tobis, J., Suh, W., Chan, V., Dao, C., and Gaster, R. (2005). Association of interatrial shunts and migraine headaches: impact of transcatheter closure. Journal of the American College of Cardiology 45, 489-492.

Bailey, D.W. (1971). Recombinant-inbred strains. An aid to finding identity, linkage, and function of histocompatibility and other genes. Transplantation 11, 325-327.

Baty, B.J., Blackburn, B.L., and Carey, J.C. (1994). Natural history of trisomy 18 and trisomy 13: I. Growth, physical assessment, medical histories, survival, and recurrence risk. American journal of medical genetics 49, 175-188.

Bedo, J., Wenzl, P., Kowalczyk, A., and Kilian, A. (2008). Precision-mapping and statistical validation of quantitative trait loci by machine learning. BMC genetics 9, 35.

Belkin, R.N., Waugh, R.A., and Kisslo, J. (1986). Interatrial shunting in atrial septal aneurysm. The American journal of cardiology 57, 310-312.

Belknap, J.K., Hitzemann, R., Crabbe, J.C., Phillips, T.J., Buck, K.J., and Williams, R.W. (2001). QTL analysis and genomewide mutagenesis in mice: complementary genetic approaches to the dissection of complex traits. Behavior genetics 31, 5-15.

Benson, D.W., Silberbach, G.M., Kavanaugh-McHugh, A., Cottrill, C., Zhang, Y., Riggs, S., Smalls, O., Johnson, M.C., Watson, M.S., Seidman, J.G., Seidman, C.E., Plowden, J., and Kugler, J.D. (1999). Mutations in the cardiac transcription factor NKX2.5 affect diverse cardiac developmental pathways. The Journal of clinical investigation 104, 1567-1573.

Berthet, K., Lavergne, T., Cohen, A., Guize, L., Bousser, M.G., Le Heuzey, J.Y., and Amarenco, P. (2000). Significant association of atrial vulnerability with atrial septal abnormalities in young patients with ischemic stroke of unknown cause. Stroke; a journal of cerebral circulation 31, 398-403.

Biben, C., Weber, R., Kesteven, S., Stanley, E., McDonald, L., Elliott, D.A., Barnett, L., Koentgen, F., Robb, L., Feneley, M., and Harvey, R.P. (2000). Cardiac septal and valvular dysmorphogenesis in mice heterozygous for mutations in the homeobox gene Nkx2-5. Circulation research 87, 888-895.

Birren, B., and Green, E.D. (1997). Genome Analysis: A Laboratory Manual. Cold Spring Harbor, Cold Spring Harbor Laboratory Press.

Bonhomme, F., Guenet, J.L., Dod, B., Moriwaki, K., and Bulfield, G. (1987). The Polyphyletic Origin of Laboratory Inbred Mice and Their Rate of Evolution. Biological Journal of the Linnean Society 30, 51-58.

Botto, L.D., and Correa, A. (2003). Decreasing the burden of congenital heart anomalies: an epidemiologic evaluation of risk factors and survival. Progress in Pediatric Cardiology 18, 111–121.

Bowler, P.J. (1989). The Mendelian revolution: the emergence of hereditarian concepts in modern science and society. Baltimore, Johns Hopkins University Press.

Brent, R.L. (2004). Environmental causes of human congenital malformations: The pediatrician's role in dealing with these complex clinical problems caused by a multiplicity of environmental and genetic factors. Pediatrics 113, 957-968.

Bridges, N.D., Hellenbrand, W., Latson, L., Filiano, J., Newburger, J.W., and Lock, J.E. (1992). Transcatheter closure of patent foramen ovale after presumed paradoxical embolism. Circulation 86, 1902-1908.

Budde, B.S., Binner, P., Waldmuller, S., Hohne, W., Blankenfeldt, W., Hassfeld, S., Bromsen, J., Dermintzoglou, A., Wieczorek, M., May, E., Kirst, E., Selignow, C., Rackebrandt, K., Muller, M., Goody, R.S., Vosberg, H.P., Nurnberg, P., and Scheffold, T. (2007). Noncompaction of the Ventricular Myocardium Is Associated with a De Novo Mutation in the beta-Myosin Heavy Chain Gene. PloS one 2.

Bult, C.J., Eppig, J.T., Kadin, J.A., Richardson, J.E., Blake, J.A., and the Mouse Genome Database Group (2008). The Mouse Genome Database (MGD): mouse biology and model systems. Nucleic Acids Research 36, D724-D728.

Burch, M., Sharland, M., Shinebourne, E., Smith, G., Patton, M., and McKenna, W. (1993). Cardiologic abnormalities in Noonan syndrome: phenotypic diagnosis and echocardiographic assessment of 118 patients. Journal of the American College of Cardiology 22, 1189-1192.

Butler, T.L., Esposito, G., Blue, G.M., Cole, A.D., Costa, M.W., Waddell, L.B., Walizada, G., Sholler, G.F., Kirk, E.P., Feneley, M., Harvey, R.P., and Winlaw, D.S. (2010). GATA4 mutations in 357 unrelated patients with congenital heart malformation. Genetic testing and molecular biomarkers 14, 797-802.

Campbell, H., and Rudan, I. (2002). Interpretation of genetic association studies in complex disease. The pharmacogenomics journal 2, 349-360.

Cavalli-Sforza, L.L., Menozzi, P., and Piazza, A. (1994). The history and geography of human genes. Princeton, NJ, Princeton University Press.

CDC (2001). Morbidity and Mortality Weekly Report (MMWR), Centers for Disease Control and Prevention: 120-125.

CDC (2010). WISQARS Leading Causes of Death Reports, 1999 - 2007, Centers for Disease Control and Prevention.

Cedergren, M.I., and Kallen, B.A. (2003). Maternal obesity and infant heart defects. Obesity research 11, 1065-1071.

Charron, F., and Nemer, M. (1999). GATA transcription factors and cardiac development. Seminars in cell & developmental biology 10, 85-91.

Charron, F., Tsimiklis, G., Arcand, M., Robitaille, L., Liang, Q., Molkentin, J.D., Meloche, S., and Nemer, M. (2001). Tissue-specific GATA factors are transcriptional effectors of the small GTPase RhoA. Genes & development 15, 2702-2719.

Chen, X., Levine, L., and Kwok, P.Y. (1999). Fluorescence polarization in homogeneous nucleic acid analysis. Genome research 9, 492-498.

Cheng, T.O. (1999). Platypnea-orthodeoxia syndrome: Etiology, differential diagnosis, and management. Catheterization and Cardiovascular Interventions 47, 64-66.

Ching, Y.H., Ghosh, T.K., Cross, S.J., Packham, E.A., Honeyman, L., Loughna, S., Robinson, T.E., Dearlove, A.M., Ribas, G., Bonser, A.J., Thomas, N.R., Scotter, A.J., Caves, L.S.D., Tyrrell, G.P., Newbury-Ecob, R.A., Munnich, A., Bonnet, D., and Brook, J.D. (2005). Mutation in myosin heavy chain 6 causes atrial septal defect. Nature genetics 37, 423-428.

Chowdhury, K., Rohdewohld, H., and Gruss, P. (1988). Specific and ubiquitous expression of different Zn finger protein genes in the mouse. Nucleic Acids Research 16, 9995-10011.

Clark, E.B. (2001). Etiology of congenital cardiovascular malformation: epidemiology and genetics. Moss and Adams’ Heart Disease in Infants, Children and Adolescents. D. H. Allen, E. B. Clark, H. P. Gutgesell and D. J. Driscoll. Philadelphia, PA, Lippincott Williams & Wilkins: 64-79.

Cox, A., Ackert-Bicknell, C.L., Dumont, B.L., Ding, Y., Bell, J.T., Brockmann, G.A., Wergedal, J.E., Bult, C., Paigen, B., Flint, J., Tsaih, S.W., Churchill, G.A., and Broman, K.W. (2009). A New Standard Genetic Map for the Laboratory Mouse. Genetics 182, 1335-1344.

Crackower, M.A., Sinasac, D.S., Xia, J., Motoyama, J., Prochazka, M., Rommens, J.M., Scherer, S.W., and Tsui, L.C. (1999). Cloning and characterization of two cytoplasmic dynein intermediate chain genes in mouse and human. Genomics 55, 257-267.

Crispino, J.D., Lodish, M.B., Thurberg, B.L., Litovsky, S.H., Collins, T., Molkentin, J.D., and Orkin, S.H. (2001). Proper coronary vascular development and heart morphogenesis depend on interaction of GATA-4 with FOG cofactors. Genes & development 15, 839-844.

Czeizel, A.E., Rockenbauer, M., Sorensen, H.T., and Olsen, J. (2001). The teratogenic risk of trimethoprim-sulfonamides: A population based case-control study. Reproductive Toxicology 15, 637-646.

Darvasi, A., and Soller, M. (1995). Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 141, 1199-1207.

Darvasi, A., Weinreb, A., Minke, V., Weller, J.I., and Soller, M. (1993). Detecting Marker-Qtl Linkage and Estimating Qtl Gene Effect and Map Location Using a Saturated Genetic-Map. Genetics 134, 943-951.

De Castro, S., Cartoni, D., Fiorelli, M., Rasura, M., Anzini, A., Zanette, E.M., Beccia, M., Colonnese, C., Fedele, F., Fieschi, C., and Pandian, N.G. (2000). Morphological and functional characteristics of patent foramen ovale and their embolic implications. Stroke; a journal of cerebral circulation 31, 2407-2413.

Del Sette, M., Angeli, S., Leandri, M., Ferriero, G., Bruzzone, G.L., Finocchi, C., and Gandolfo, C. (1998). Migraine with aura and right-to-left shunt on transcranial Doppler: a case-control study. Cerebrovascular diseases 8, 327-330.

Dietrich, W.F., Miller, J., Steen, R., Merchant, M.A., Damron-Boles, D., Husain, Z., Dredge, R., Daly, M.J., Ingalls, K.A., O'Connor, T.J., Evans, C.A., DeAngelis, M.M., Levinson, D.M., Kruglyak, L., Goodman, N., Copeland, N.G., Jenkins, N.A., Hawkins, T.L., Stein, L., Page, D.C., and Lander, E.S. (1996). A comprehensive genetic map of the mouse genome. Nature 380, 149-152.

Dietz, H.C., Cutting, G.R., Pyeritz, R.E., Maslen, C.L., Sakai, L.Y., Corson, G.M., Puffenberger, E.G., Hamosh, A., Nanthakumar, E.J., Curristin, S.M., et al. (1991). Marfan syndrome caused by a recurrent de novo missense mutation in the fibrillin gene. Nature 352, 337-339.

Diggle, P.J., Heagerty, P., Liang, K.Y., and Zeger, S.L. (2002). Analysis of longitudinal data. Oxford, Oxford University Press.

Digilio, M.C., Giannotti, A., Marino, B., and Dallapiccola, B. (1993). Atrioventricular canal and 8p- syndrome. American journal of medical genetics 47, 437-438.

Doco-Fenzy, M., Mauran, P., Lebrun, J.M., Bock, S., Bednarek, N., Struski, S., Albuisson, J., Ardalan, A., Collot, N., Schneider, A., Dastot-Le Moal, F., Gaillard, D., and Goossens, M. (2006). Pure direct duplication (12)(q24.1 -> q24.2) in a child with Marcus Gunn phenomenon and multiple congenital anomalies. American Journal of Medical Genetics Part A 140A, 212-221.

Dowson, A., Mullen, M.J., Peatfield, R., Muir, K., Khan, A.A., Wells, C., Lipscombe, S.L., Rees, T., De Giovanni, J.V., Morrison, W.L., Hildick-Smith, D., Elrington, G., Hillis, W.S., Malik, I.S., and Rickards, A. (2008). Migraine Intervention With STARFlex Technology (MIST) trial: a prospective, multicenter, double-blind, sham-controlled trial to evaluate the effectiveness of patent foramen ovale closure with STARFlex septal repair implant to resolve refractory migraine headache. Circulation 117, 1397-1404.

Ericson, A., and Kallen, B.A.J. (2001). Nonsteroidal anti-inflammatory drugs in early pregnancy. Reproductive Toxicology 15, 371-375.

Falconer, D.S. (1965). Inheritance of Liability to Certain Diseases Estimated from Incidence among Relatives. Annals of human genetics 29, 51-&.

Falconer, D.S., and Mackay, T.F. (1996). Introduction to Quantitative Genetics. Harlow, Pearson Education Limited.

Feldt, R.H., Avasthey, P., Yoshimasu, F., Kurland, L.T., and Titus, J.L. (1971). Incidence of congenital heart disease in children born to residents of Olmsted County, Minnesota, 1950-1969. Mayo Clinic proceedings. Mayo Clinic 46, 794-799.

Ferencz, C., Correa-Villasenor, A., and Loffredo, C.A. (1997). Genetic and environmental risk factors of major cardiovascular malformations: the Baltimore-Washington Infant Study: 1981–1989. Armonk, NY.

Festa, P., Lamia, A.A., Murzi, B., and Bini, M.R. (2005). Tetralogy of Fallot with left heart hypoplasia, total anomalous pulmonary venous return, and right lung hypoplasia: Role of magnetic resonance imaging. Pediatric cardiology 26, 467-469.

Fisher, R.A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 399-433.

Frazer, K.A., Eskin, E., Kang, H.M., Bogue, M.A., Hinds, D.A., Beilharz, E.J., Gupta, R.V., Montgomery, J., Morenzoni, M.M., Nilsen, G.B., Pethiyagoda, C.L., Stuve, L.L., Johnson, F.M., Daly, M.J., Wade, C.M., and Cox, D.R. (2007). A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature 448, 1050-1053.

Freeman, S.B., Bean, L.H., Allen, E.G., Tinker, S.W., Locke, A.E., Druschel, C., Hobbs, C.A., Romitti, P.A., Royle, M.H., Torfs, C.P., Dooley, K.J., and Sherman, S.L. (2008). Ethnicity, sex, and the incidence of congenital heart defects: a report from the National Down Syndrome Project. Genetics in medicine : official journal of the American College of Medical Genetics 10, 173-180.

Gallardo, D., Pena, R.N., Amills, M., Varona, L., Ramirez, O., Reixach, J., Diaz, I., Tibau, J., Soler, J., Prat-Cuffi, J.M., Noguera, J.L., and Quintanilla, R. (2008). Mapping of quantitative trait loci for cholesterol, LDL, HDL, and triglyceride serum concentrations in pigs. Physiological Genomics 35, 199-209.

Garg, P., Servoss, S.J., Wu, J.C., Bajwa, Z.H., Selim, M.H., Dineen, A., Kuntz, R.E., Cook, E.F., and Mauri, L. (2010). Lack of Association Between Migraine Headache and Patent Foramen Ovale Results of a Case-Control Study. Circulation 121, 1406-1412.

Garg, V., Kathiriya, I.S., Barnes, R., Schluterman, M.K., King, I.N., Butler, C.A., Rothrock, C.R., Eapen, R.S., Hirayama-Yamada, K., Joo, K., Matsuoka, R., Cohen, J.C., and Srivastava, D. (2003). GATA4 mutations cause human congenital heart defects and reveal an interaction with TBX5. Nature 424, 443-447.

Geiger, J.M., Baudin, M., and Saurat, J.H. (1994). Teratogenic Risk with Etretinate and Acitretin Treatment. Dermatology 189, 109-116.

Gingeras, T.R. (2007). Origin of phenotypes: genes and transcripts. Genome research 17, 682-690.

Gouttenoire, J., Valcourt, U., Bougault, C., Aubert-Foucher, E., Arnaud, E., Giraud, L., and Mallein-Gerin, F. (2007). Knockdown of the intraflagellar transport protein IFT46 stimulates selective gene expression in mouse chondrocytes and affects early development in zebrafish. The Journal of biological chemistry 282, 30960-30973.

Grattapaglia, D., and Sederoff, R. (1994). Genetic-Linkage Maps of Eucalyptus-Grandis and Eucalyptus-Urophylla Using a Pseudo-Testcross - Mapping Strategy and Rapd Markers. Genetics 137, 1121-1137.

Gregg, N.M., Ramsay Brevis, W., and Heseltine, M. (1945). The occurrence of congenital defects in children following maternal rubella during pregnancy. Medical Journal of Australia 2, 122–126.

Gunn, R.M. (1883). Congenital ptosis with peculiar associated movements of the affected lid. Transactions of the ophthalmological societies of the United Kingdom 3, 283–287.

Hackett, C.A., and Weller, J.I. (1995). Genetic mapping of quantitative trait loci for traits with ordinal distributions. Biometrics 51, 1252-1263.

Hagen, P.T., Scholz, D.G., and Edwards, W.D. (1984). Incidence and size of patent foramen ovale during the first 10 decades of life: an autopsy study of 965 normal hearts. Mayo Clinic proceedings. Mayo Clinic 59, 17-20.

Haldane, J.B., and Smith, C.A. (1947). A new estimate of the linkage between the genes for colourblindness and haemophilia in man. Annals of Eugenics 14, 10-31.

Haldane, J.B.S., and Waddington, C.H. (1931). Inbreeding and linkage. Genetics 16, 357-374.

Halpern, J., and Whittemore, A.S. (1999). Multipoint linkage analysis. A cautionary note. Human heredity 49, 194-196.

Handke, M., Harloff, A., Olschewski, M., Hetzel, A., and Geibel, A. (2007). Patent foramen ovale and cryptogenic stroke in older patients. The New England journal of medicine 357, 2262-2268.

Hanley, P.C., Tajik, A.J., Hynes, J.K., Edwards, W.D., Reeder, G.S., Hagler, D.J., and Seward, J.B. (1985). Diagnosis and classification of atrial septal aneurysm by two-dimensional echocardiography: report of 80 consecutive cases. Journal of the American College of Cardiology 6, 1370-1382.

Hanson, J.W. (1986). Teratogen Update - Fetal Hydantoin Effects. Teratology 33, 349-353.

Hassold, T., Abruzzo, M., Adkins, K., Griffin, D., Merrill, M., Millie, E., Saker, D., Shen, J., and Zaragoza, M. (1996). Human aneuploidy: incidence, origin, and etiology. Environmental and molecular mutagenesis 28, 167-175.

Hassold, T., and Hunt, P. (2001). To err (meiotically) is human: the genesis of human aneuploidy. Nature reviews. Genetics 2, 280-291.

Hausmann, D., Mugge, A., Becht, I., and Daniel, W.G. (1992). Diagnosis of patent foramen ovale by transesophageal echocardiography and association with cerebral and peripheral embolic events. The American journal of cardiology 70, 668-672.

Hernandez-Diaz, S., Werler, M.M., Walker, A.M., and Mitchell, A.A. (2000). Folic acid antagonists during pregnancy and the risk of birth defects. New England Journal of Medicine 343, 1608-1614.

Hills, C., Moller, J.H., Finkelstein, M., Lohr, J., and Schimmenti, L. (2006). Cri du chat syndrome and congenital heart disease: a review of previously reported cases and presentation of an additional 21 cases from the Pediatric Cardiac Care Consortium. Pediatrics 117, e924-927.

Hirayama-Yamada, K., Kamisago, M., Akimoto, K., Aotsuka, H., Nakamura, Y., Tomita, H., Furutani, M., Imamura, S., Takao, A., Nakazawa, M., and Matsuoka, R. (2005). Phenotypes with GATA4 or NKX2.5 mutations in familial atrial septal defect. American Journal of Medical Genetics Part A 135A, 47-52.

Hoffman, J.I.E., Kaplan, S., and Liberthson, R.R. (2004). Prevalence of congenital heart disease. American heart journal 147, 425-439.

Hoffmann, K., and Lindner, T.H. (2005). easyLINKAGE-Plus - automated linkage analyses using large-scale SNP data. Bioinformatics 21, 3565-3567.

Holt, M., and Oram, S. (1960). Familial heart disease with skeletal malformations. British heart journal 22, 236-242.

Homma, S., Di Tullio, M.R., Sacco, R.L., Mihalatos, D., Li Mandri, G., and Mohr, J.P. (1994). Characteristics of patent foramen ovale associated with cryptogenic stroke. A biplane transesophageal echocardiographic study. Stroke; a journal of cerebral circulation 25, 582-586.

Homma, S., and Sacco, R.L. (2005). Patent foramen ovale and stroke. Circulation 112, 1063-1072.

Homma, S., Sacco, R.L., Di Tullio, M.R., Sciacca, R.R., and Mohr, J.P. (2002). Effect of medical treatment in stroke patients with patent foramen ovale: patent foramen ovale in Cryptogenic Stroke Study. Circulation 105, 2625-2631.

Homma, S., Sacco, R.L., Di Tullio, M.R., Sciacca, R.R., and Mohr, J.P. (2003). Atrial anatomy in non-cardioembolic stroke patients: effect of medical therapy. Journal of the American College of Cardiology 42, 1066-1072.

Hong, S., Kim, T.W., Choi, I., Woo, J.M., Oh, J., Park, W.J., Kim, D.H., and Cho, C. (2005). Complementary DNA cloning, genomic characterization and expression analysis of a mammalian gene encoding histidine-rich calcium binding protein. Biochimica et biophysica acta 1727, 188-196.

Hubail, Z., Lemler, M., Ramaciotti, C., Moore, J., and Ikemba, C. (2011). Diagnosing a patent foramen ovale in children: is transesophageal echocardiography necessary? Stroke; a journal of cerebral circulation 42, 98-101.

Jansen, R.C. (1992). A General Mixture Model for Mapping Quantitative Trait Loci by Using Molecular Markers. Theoretical and Applied Genetics 85, 252-260.

Jenkins, K.J., Correa, A., Feinstein, J.A., Botto, L., Britt, A.E., Daniels, S.R., Elixson, M., Warnes, C.A., and Webb, C.L. (2007). Noninherited risk factors and congenital cardiovascular defects: current knowledge: a scientific statement from the American Heart Association Council on Cardiovascular Disease in the Young: endorsed by the American Academy of Pediatrics. Circulation 115, 2995-3014.

Johannsen, W. (1909). Elemente der Exakten Erblichkeitslehre. Jena, Gustav Fischer.

Johnson, M.D., He, L.Q., Herman, D., Wakimoto, H., Wallace, C.A., Zidek, V., Mlejnek, P., Musilova, A., Simakova, M., Vorlicek, J., Kren, V., Viklicky, O., Qi, N.R., Wang, J.M., Seidman, C.E., Seidman, J., Kurtz, T.W., Aitman, T.J., and Pravenec, M. (2009). Dissection of Chromosome 18 Blood Pressure and Salt-Sensitivity Quantitative Trait Loci in the Spontaneously Hypertensive Rat. Hypertension 54, 639-U388.

Jones, E.F., Calafiore, P., Donnan, G.A., and Tonkin, A.M. (1994). Evidence that patent foramen ovale is not a risk factor for cerebral ischemia in the elderly. American Journal of Cardiology 74, 596-599.

Jost, C.H.A., Connolly, H.M., Danielson, G.K., Bailey, K.R., Schaff, H.V., Shen, W.K., Warnes, C.A., Seward, J.B., Puga, F.J., and Tajik, A.J. (2005). Sinus venosus atrial septal defect - Long-term postoperative outcome for 115 patients. Circulation 112, 1953-1958.

Kalsotra, A., Xiao, X.S., Ward, A.J., Castle, J.C., Johnson, J.M., Burge, C.B., and Cooper, T.A. (2008). A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proceedings of the National Academy of Sciences of the United States of America 105, 20333-20338.

Kao, C.H., Zeng, Z.B., and Teasdale, R.D. (1999). Multiple interval mapping for quantitative trait loci. Genetics 152, 1203-1216.

Kasahara, H., and Benson, D.W. (2004). Biochemical analyses of eight NKX2.5 homeodomain missense mutations causing atrioventricular block and cardiac anomalies. Cardiovascular Research 64, 40-51.

Kayis, S.A. (2000). A generalised linear model approach to QTL detection of litter size in mice. Sydney, The University of Sydney.

Kearsey, M.J., and Hyne, V. (1994). Qtl Analysis - a Simple Marker-Regression Approach. Theoretical and Applied Genetics 89, 698-702.

Keyte, A., and Hutson, M.R. (2012). The neural crest in cardiac congenital anomalies. Differentiation; research in biological diversity.

Kirk, E.P. (2007). The Genetics of Atrial Septal Defect and Patent Foramen Ovale. Sydney, University of New South Wales. PhD.

Kirk, E.P., Hyun, C., Thomson, P.C., Lai, D., Castro, M.L., Biben, C., Buckley, M.F., Martin, I.C., Moran, C., and Harvey, R.P. (2006). Quantitative trait loci modifying cardiac atrial septal morphology and risk of patent foramen ovale in the mouse. Circulation research 98, 651-658.

Kirk, E.P., Sunde, M., Costa, M.W., Rankin, S.A., Wolstein, O., Castro, M.L., Butler, T.L., Hyun, C., Guo, G., Otway, R., Mackay, J.P., Waddell, L.B., Cole, A.D., Hayward, C., Keogh, A., Macdonald, P., Griffiths, L., Fatkin, D., Sholler, G.F., Zorn, A.M., Feneley, M.P., Winlaw, D.S., and Harvey, R.P. (2007). Mutations in cardiac T-box factor gene TBX20 are associated with diverse cardiac pathologies, including defects of septation and valvulogenesis and cardiomyopathy. American journal of human genetics 81, 280-291.

Kirkham, T.H. (1969). Familial Marcus Gunn Phenomenon. British Journal of Ophthalmology 53, 282-&.

Kliegman, R.M., Stanton, B.M.D., St. Geme, J., Schor, N., and Behrman, R.E. (2011). Nelson Textbook of Pediatrics, W.B. Saunders.

Korstanje, R., and Paigen, B. (2002). From QTL to gene: the harvest begins. Nature genetics 31, 235-236.

Kuo, C.T., Morrisey, E.E., Anandappa, R., Sigrist, K., Lu, M.M., Parmacek, M.S., Soudais, C., and Leiden, J.M. (1997). GATA4 transcription factor is required for ventral morphogenesis and heart tube formation. Genes & development 11, 1048-1060.

Lamy, C., Giannesini, C., Zuber, M., Arquizan, C., Meder, J.F., Trystram, D., Coste, J., and Mas, J.L. (2002). Clinical and imaging findings in cryptogenic stroke patients with and without patent foramen ovale: the PFO-ASA Study. Atrial Septal Aneurysm. Stroke; a journal of cerebral circulation 33, 706-711.

Lander, E., and Kruglyak, L. (1995). Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature genetics 11, 241-247.

Lander, E.S., and Botstein, D. (1989). Mapping Mendelian Factors Underlying Quantitative Traits Using Rflp Linkage Maps. Genetics 121, 185-199.

Langa, F., Codony, X., Tovar, V., Lavado, A., Gimenez, E., Cozar, P., Cantero, M., Dordal, A., Hernandez, E., Perez, R., Monroy, X., Zamanillo, D., Guitart, X., and Montoliu, L. (2003). Generation and phenotypic analysis of sigma receptor type I (sigma 1) knockout mice. The European journal of neuroscience 18, 2188-2196.

Lange, C., and Whittaker, J.C. (2001). Mapping quantitative trait Loci using generalized estimating equations. Genetics 159, 1325-1337.

Laverriere, A.C., MacNeill, C., Mueller, C., Poelmann, R.E., Burch, J.B., and Evans, T. (1994). GATA-4/5/6, a subfamily of three transcription factors transcribed in developing heart and gut. The Journal of biological chemistry 269, 23177-23184.

Lechat, P., Mas, J.L., Lascault, G., Loron, P., Theard, M., Klimczac, M., Drobinski, G., Thomas, D., and Grosgogeat, Y. (1988). Prevalence of Patent Foramen Ovale in Patients with Stroke. New England Journal of Medicine 318, 1148-1152.

Levy, H.L., Guldberg, P., Guttler, F., Hanley, W.B., Matalon, R., Rouse, B.M., Trefz, F., Azen, C., Allred, E.N., de la Cruz, F., and Koch, R. (2001). Congenital heart disease in maternal phenylketonuria: report from the Maternal PKU Collaborative Study. Pediatric Research 49, 636-642.

Lewis, S.A., Villasante, A., Sherline, P., and Cowan, N.J. (1986). Brain-Specific Expression of Map2 Detected Using a Cloned Cdna Probe. Journal of Cell Biology 102, 2098-2105.

Limongelli, G., Pacileo, G., Marino, B., Digilio, M.C., Sarkozy, A., Elliott, P., Versacci, P., Calabro, P., De Zorzi, A., Di Salvo, G., Syrris, P., Patton, M., McKenna, W.J., Dallapiccola, B., and Calabro, R. (2007). Prevalence and clinical significance of cardiovascular abnormalities in patients with the LEOPARD syndrome. The American journal of cardiology 100, 736-741.

Lin, A.E., and Ardinger, H.H. (2005). Genetic epidemiology of cardiovascular malformations. Progress in Pediatric Cardiology 20, 113-126.

Lippi, G., and Guidi, G. (2003). Lipoprotein(a): An emerging cardiovascular risk factor. Critical Reviews in Clinical Laboratory Sciences 40, 1-42.

Liu, C.X., Shen, A.D., Li, X.F., Jiao, W.W., Zhang, X.G., and Li, Z.Z. (2008). T-box transcription factor TBX20 mutations in Chinese patients with congenital heart disease. European journal of medical genetics 51, 580-587.

Lloyd-Jones, D., Adams, R.J., Brown, T.M., Carnethon, M., Dai, S., De Simone, G., Ferguson, T.B., Ford, E., Furie, K., Gillespie, C., Go, A., Greenlund, K., Haase, N., Hailpern, S., Ho, P.M., Howard, V., Kissela, B., Kittner, S., Lackland, D., Lisabeth, L., Marelli, A., McDermott, M.M., Meigs, J., Mozaffarian, D., Mussolino, M., Nichol, G., Roger, V.L., Rosamond, W., Sacco, R., Sorlie, P., Stafford, R., Thom, T., Wasserthiel-Smoller, S., Wong, N.D., and Wylie-Rosett, J. (2010). Executive summary: heart disease and stroke statistics--2010 update: a report from the American Heart Association. Circulation 121, 948-954.

Luermans, J.G., Post, M.C., Temmerman, F., Thijs, V., Schonewille, W.J., Plokker, H.W., Suttorp, M.J., and Budts, W.I. (2008). Closure of a patent foramen ovale is associated with a decrease in prevalence of migraine: a prospective observational study. Acta cardiologica 63, 571-577.

Lynch, M., and Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sunderland, MA, Sinauer Associates, Inc.

Ma, A. (2012). Atrial Septal Defect Linkage Study. Children’s Hospital at Westmead, Sydney.

Mackay, T.F., Stone, E.A., and Ayroles, J.F. (2009). The genetics of quantitative traits: challenges and prospects. Nature reviews. Genetics 10, 565-577.

Marcovina, S.M., and Koschinsky, M.L. (1998). Lipoprotein(a) as a risk factor for coronary artery disease. American Journal of Cardiology 82, 57u-66u.

Marino, B., Digilio, M.C., Toscano, A., Giannotti, A., and Dallapiccola, B. (1999). Congenital heart diseases in children with Noonan syndrome: An expanded cardiac spectrum with high prevalence of atrioventricular canal. The Journal of pediatrics 135, 703-706.

Marino, B., Reale, A., Giannotti, A., Digilio, M.C., and Dallapiccola, B. (1992). Nonrandom association of atrioventricular canal and del (8p) syndrome. American journal of medical genetics 42, 424-427.

Martinson, J.J., Hong, L., Karanicolas, R., Moore, J.P., and Kostrikis, L.G. (2000). Global distribution of the CCR2-64I/CCR5-59653T HIV-1 disease-protective haplotype. AIDS 14, 483-489.

Matsson, H., Eason, J., Bookwalter, C.S., Klar, J., Gustavsson, P., Sunnegardh, J., Enell, H., Jonzon, A., Vikkula, M., Gutierrez, I., Granados-Riveron, J., Pope, M., Bu'Lock, F., Cox, J., Robinson, T.E., Song, F., Brook, D.J., Marston, S., Trybus, K.M., and Dahl, N. (2008). Alpha-cardiac actin mutations produce atrial septal defects. Human molecular genetics 17, 256-265.

McDermott, D.A., Bressan, M.C., He, J., Lee, J.S., Aftimos, S., Brueckner, M., Gilbert, F., Graham, G.E., Hannibal, M.C., Innis, J.W., Pierpont, M.E., Raas-Rothschild, A., Shanske, A.L., Smith, W.E., Spencer, R.H., St John-Sutton, M.G., van Maldergem, L., Waggoner, D.J., Weber, M., and Basson, C.T. (2005). TBX5 genetic testing validates strict clinical criteria for Holt-Oram syndrome. Pediatric Research 58, 981-986.

McElhinney, D.B., Geiger, E., Blinder, J., Benson, D.W., and Goldmuntz, E. (2003). NKX2.5 mutations in patients with congenital heart disease. Journal of the American College of Cardiology 42, 1650-1655.

McLachlan, G.J., and Basford, K.E. (1988). Mixture Models: Inference and Applications to Clustering. New York, NY, M. Dekker.

Mendel, G. (1965). Experiments in Plant Hybridization. British medical journal 1, 370-&.

Misra, C., Sachan, N., McNally, C.R., Koenig, S.N., Nichols, H.A., Guggilam, A., Lucchesi, P.A., Pu, W.T., Srivastava, D., and Garg, V. (2012). Congenital heart disease-causing Gata4 mutation displays functional deficits in vivo. PLoS genetics 8, e1002690.

Molkentin, J.D. (2000). The zinc finger-containing transcription factors GATA-4, -5, and -6. Ubiquitously expressed regulators of tissue-specific gene expression. The Journal of biological chemistry 275, 38949-38952.

Molkentin, J.D., Lin, Q., Duncan, S.A., and Olson, E.N. (1997). Requirement of the transcription factor GATA4 for heart tube formation and ventral morphogenesis. Genes & development 11, 1061-1072.

Monserrat, L., Hermida-Prieto, M., Fernandez, X., Rodriguez, I., Dumont, C., Cazon, L., Cuesta, M.G., Gonzalez-Juanatey, C., Peteiro, J., Alvarez, N., Penas-Lado, M., and Castro-Beiras, A. (2007). Mutation in the alpha-cardiac actin gene associated with apical hypertrophic cardiomyopathy, left ventricular non-compaction, and septal defects. European heart journal 28, 1953-1961.

Moore, K.L., and Persaud, T.V.N. (1998). The Developing Human: Clinically Oriented Embryology. Philadelphia, W.B. Saunders Company.

Moorman, A.F.M., and Christoffels, V.M. (2003). Cardiac chamber formation: Development, genes, and evolution. Physiological Reviews 83, 1223-1267.

Moradi Marjaneh, M., Kirk, E.P., Posch, M.G., Ozcelik, C., Berger, F., Hetzer, R., Otway, R., Butler, T.L., Blue, G.M., Griffiths, L.R., Fatkin, D., Martinson, J.J., Winlaw, D.S., Feneley, M.P., and Harvey, R.P. (2011). Investigation of association between PFO complicated by cryptogenic stroke and a common variant of the cardiac transcription factor GATA4. PloS one 6, e20711.

Moradi Marjaneh, M., Martin, I.C.A., Kirk, E.P., Harvey, R.P., Moran, C., and Thomson, P.C. (2012). QTL mapping of complex binary traits in an advanced intercross line. Animal genetics 43, 97-101.

Morrisey, E.E., Ip, H.S., Tang, Z.H., and Parmacek, M.S. (1997). GATA-4 activates transcription via two novel domains that are conserved within the GATA-4/5/6 subfamily. Journal of Biological Chemistry 272, 8515-8524.

Morton, N.E. (1955). Sequential tests for the detection of linkage. American journal of human genetics 7, 277-318.

Musewe, N.N., Alexander, D.J., Teshima, I., Smallhorn, J.F., and Freedom, R.M. (1990). Echocardiographic evaluation of the spectrum of cardiac anomalies associated with trisomy 13 and trisomy 18. Journal of the American College of Cardiology 15, 673-677.

Musolino, R., La Spina, P., Granata, A., Gallitto, G., Leggiadro, N., Carerj, S., Manganaro, A., Tripodi, F., Epifanio, A., Gangemi, S., and Di Perri, R. (2003). Ischaemic stroke in young people: a prospective and long-term follow-up study. Cerebrovascular diseases 15, 121-128.

Nabulsi, M.M., Tamim, H., Sabbagh, M., Obeid, M.Y., Yunis, K.A., and Bitar, F.F. (2003). Parental consanguinity and congenital heart malformations in a developing country. American journal of medical genetics. Part A 116A, 342-347.

Negi, S., Singh, S.K., Pati, N., Handa, V., Chauhan, R., and Pati, U. (2004). A proximal tissue-specific module and a distal negative regulatory module control apolipoprotein(a) gene transcription. The Biochemical journal 379, 151-159.

Newbury-Ecob, R.A., Leanage, R., Raeburn, J.A., and Young, I.D. (1996). Holt-Oram syndrome: a clinical genetic study. Journal of medical genetics 33, 300-307.

Ng, M.Y.M., Andrew, T., Spector, T.D., Jeffery, S., and Consortium, L. (2005). Linkage to the FOXC2 region of chromosome 16 for varicose veins in otherwise healthy, unselected sibling pairs. Journal of medical genetics 42, 235-239.

Nora, J.J. (1968). Multifactorial Inheritance Hypothesis for Etiology of Congenital Heart Diseases - Genetic-Environmental Interaction. Circulation 38, 604-&.

Ohmura, T., Shioi, G., Hirano, M., and Aizawa, S. (2012). Neural tube defects by NUAK1 and NUAK2 double mutation. Developmental Dynamics 241, 1350-1364.

Oliver, J.M., Gallego, P., Gonzalez, A., Dominguez, F.J., Aroca, A., and Mesa, J.M. (2002). Sinus venosus syndrome: atrial septal defect or anomalous venous connection? A multiplane transoesophageal approach. Heart 88, 634-638.

Olshan, A.F., Schnitzer, P.G., and Baird, P.A. (1994). Paternal age and the risk of congenital heart defects. Teratology 50, 80-84.

Ott, J. (1974). Estimation of the recombination fraction in human pedigrees: efficient computation of the likelihood for human linkage studies. American journal of human genetics 26, 588-597.

Ott, J., and Lathrop, G.M. (1987). Goodness-of-fit tests for locus order in three-point mapping. Genetic epidemiology 4, 51-57.

Overell, J.R., Bone, I., and Lees, K.R. (2000). Interatrial septal abnormalities and stroke: a meta-analysis of case-control studies. Neurology 55, 1172-1179.

Overhauser, J., Huang, X., Gersh, M., Wilson, W., McMahon, J., Bengtsson, U., Rojas, K., Meyer, M., and Wasmuth, J.J. (1994). Molecular and phenotypic mapping of the short arm of chromosome 5: sublocalization of the critical region for the cri-du-chat syndrome. Human molecular genetics 3, 247-252.

Pearson, H. (2006). Genetics: what is a gene? Nature 441, 398-401.

Pehlivan, T., Pober, B.R., Brueckner, M., Garrett, S., Slaugh, R., Van Rheeden, R., Wilson, D.B., Watson, M.S., and Hing, A.V. (1999). GATA4 haploinsufficiency in patients with interstitial deletion of chromosome region 8p23.1 and congenital heart disease. American journal of medical genetics 83, 201-206.

Peterkin, T., Gibson, A., Loose, M., and Patient, R. (2005). The roles of GATA-4, -5 and -6 in vertebrate heart development. Seminars in cell & developmental biology 16, 83-94.

Phillips, P.C. (2008). Epistasis - the essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics 9, 855-867.

Poirier, O., Nicaud, V., McDonagh, T., Dargie, H.J., Desnos, M., Dorent, R., Roizes, G., Schwartz, K., Tiret, L., Komajda, M., and Cambien, F. (2003). Polymorphisms of genes of the cardiac calcineurin pathway and cardiac hypertrophy. European journal of human genetics : EJHG 11, 659-664.

Posch, M.G., Gramlich, M., Sunde, M., Schmitt, K.R., Lee, S.H.Y., Richter, S., Kersten, A., Perrot, A., Panek, A.N., Al Khatib, I.H., Nemer, G., Megarbane, A., Dietz, R., Stiller, B., Berger, F., Harvey, R.P., and Ozcelik, C. (2010). A gain-of-function TBX20 mutation causes congenital atrial septal defects, patent foramen ovale and cardiac valve defects. Journal of medical genetics 47, 230-235.

Posch, M.G., Perrot, A., Schmitt, K., Mittelhaus, S., Esenwein, E.M., Stiller, B., Geier, C., Dietz, R., Gessner, R., Ozcelik, C., and Berger, F. (2008). Mutations in GATA4, NKX2.5, CRELD1, and BMP4 are infrequently found in patients with congenital cardiac septal defects. American Journal of Medical Genetics Part A 146A, 251-253.

Pradat, P. (1992). A Case-Control Study of Major Congenital Heart-Defects in Sweden - 1981-1986. European journal of epidemiology 8, 789-796.

Pratt, S.G., Beyer, C.K., and Johnson, C.C. (1984). The Marcus Gunn Phenomenon - a Review of 71 Cases. Ophthalmology 91, 27-30.

Qian, L., Mohapatra, B., Akasaka, T., Liu, J.D., Ocorr, K., Towbin, J.A., and Bodmer, R. (2008). Transcription factor neuromancer/TBX20 is required for cardiac function in Drosophila with implications for human heart disease. Proceedings of the National Academy of Sciences of the United States of America 105, 19833-19838.

R Development Core Team (2010). R: A language and environment for statistical computing. Vienna, Austria, R Foundation for Statistical Computing.

Rasmussen, B.K. (2001). Epidemiology of headache. Cephalalgia : an international journal of headache 21, 774-777.

Reamon-Buettner, S.M., Hecker, H., Spanel-Borowski, K., Craatz, S., Kuenzel, E., and Borlak, J. (2004). Novel NKX2-5 mutations in diseased heart tissues of patients with cardiac malformations. American Journal of Pathology 164, 2117-2125.

Reed, D.R., McDaniel, A.H., Avigdor, M., and Bachmanov, A.A. (2008). QTL for body composition on chromosome 7 detected using a chromosome substitution mouse strain. Obesity 16, 483-487.

Reisman, M., Christofferson, R.D., Jesurum, J., Olsen, J.V., Spencer, M.P., Krabill, K.A., Diehl, L., Aurora, S., and Gray, W.A. (2005). Migraine headache relief after transcatheter closure of patent foramen ovale. Journal of the American College of Cardiology 45, 493-495.

Rhoades, R.A., and Bell, D.R. (2012). Medical Physiology: Principles for Clinical Medicine. Lippincott Williams & Wilkins.

Risch, N., and Merikangas, K. (1996). The future of genetic studies of complex human diseases. Science 273, 1516-1517.

Robin Marantz, H. (2009). The monk in the garden : the lost and found genius of Gregor Mendel, the father of modern genetics, Houghton Mifflin.

Rocha, J.L., Eisen, E.J., Van Vleck, L.D., and Pomp, D. (2004). A large-sample QTL study in mice: II. Body composition. Mammalian Genome 15, 100-113.

Rudolph, A.M., Heymann, M.A., Teramo, K.A.W., Barrett, C.T., and Räihä, N.C.R. (1971). Studies on the Circulation of the Previable Human Fetus. Pediatric Research 5, 452-465.

Ryan, A.K., Goodship, J.A., Wilson, D.I., Philip, N., Levy, A., Seidel, H., Schuffenhauer, S., Oechsler, H., Belohradsky, B., Prieur, M., Aurias, A., Raymond, F.L., Clayton-Smith, J., Hatchwell, E., McKeown, C., Beemer, F.A., Dallapiccola, B., Novelli, G., Hurst, J.A., Ignatius, J., Green, A.J., Winter, R.M., Brueton, L., Brondum-Nielsen, K., Scambler, P.J., et al. (1997). Spectrum of clinical features associated with interstitial chromosome 22q11 deletions: a European collaborative study. Journal of medical genetics 34, 798-804.

Sacco, R.L., Ellenberg, J.H., Mohr, J.P., Tatemichi, T.K., Hier, D.B., Price, T.R., and Wolf, P.A. (1989). Infarcts of undetermined cause: the NINCDS Stroke Data Bank. Annals of neurology 25, 382-390.

Sadler, T.W. (2004). Langman’s medical embryology. Philadelphia, Lippincott Williams & Wilkins.

Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular cloning: a laboratory manual. New York, NY, Cold Spring Harbor Laboratory Press.

Sarkozy, A., Conti, E., Neri, C., D'Agostino, R., Digilio, M.C., Esposito, G., Toscano, A., Marino, B., Pizzuti, A., and Dallapiccola, B. (2005). Spectrum of atrial septal defects associated with mutations of NKX2.5 and GATA4 transcription factors. Journal of medical genetics 42.

Sax, K. (1923). The Association of Size Differences with Seed-Coat Pattern and Pigmentation in PHASEOLUS VULGARIS. Genetics 8, 552-560.

Scambler, P.J. (2000). The 22q11 deletion syndromes. Human molecular genetics 9, 2421-2426.

Scanlon, K.S., Ferencz, C., Loffredo, C.A., Wilson, P.D., Correa-Villasenor, A., Khoury, M.J., and Willett, W.C. (1998). Preconceptional folate intake and malformations of the cardiac outflow tract. Baltimore-Washington Infant Study Group. Epidemiology 9, 95-98.

Schluterman, M.K., Krysiak, A.E., Kathiriya, I.S., Abate, N., Chandalia, M., Srivastava, D., and Garg, V. (2007). Screening and biochemical analysis of GATA4 sequence variations identified in patients with congenital heart disease. American journal of medical genetics. Part A 143A, 817-823.

Schott, J.J., Benson, D.W., Basson, C.T., Pease, W., Silberbach, G.M., Moak, J.P., Maron, B.J., Seidman, C.E., and Seidman, J.G. (1998). Congenital heart disease caused by mutations in the transcription factor NKX2-5. Science 281, 108-111.

Schuchlenz, H.W., Weihs, W., Horner, S., and Quehenberger, F. (2000). The association between the diameter of a patent foramen ovale and the risk of embolic cerebrovascular events. American Journal of Medicine 109, 456-462.

Schwedt, T.J., Demaerschalk, B.M., and Dodick, D.W. (2008). Patent foramen ovale and migraine: a quantitative systematic review. Cephalalgia : an international journal of headache 28, 531-540.

Scott, M.P., Matsudaira, P., Lodish, H., Darnell, J., Zipursky, L., Kaiser, C.A., Berk, A., and Krieger, M. (2004). Molecular Cell Biology. San Francisco, W. H. Freeman.

Shanoudy, H., Soliman, A., Raggi, P., Liu, J.W., Russell, D.C., and Jarmukli, N.F. (1998). Prevalence of patent foramen ovale and its contribution to hypoxemia in patients with obstructive sleep apnea. Chest 113, 91-96.

Sherin, C., Francesca, F., Karl, P., Brendan, T., Ron, H., and Lyn, G. (2008). Investigation between the S377G3 GATA-4 polymorphism and migraine. The open neurology journal 2, 35-38.

Sieberts, S.K., and Schadt, E.E. (2007). Moving toward a system genetics view of disease. Mammalian genome : official journal of the International Mammalian Genome Society 18, 389-401.

Silberstein, S.D. (2004). Migraine. Lancet 363, 381-391.

Silver, M.D., and Dorsey, J.S. (1978). Aneurysms of the septum primum in adults. Archives of pathology & laboratory medicine 102, 62-65.

Smith, C.A.B. (1953). The Detection of Linkage in Human Genetics. Journal of the Royal Statistical Society. Series B (Methodological) 15, 153-192.

Smithells, R.W., and Newman, C.G.H. (1992). Recognition of Thalidomide Defects. Journal of medical genetics 29, 716-723.

Srivastava, D., and Olson, E.N. (2000). A genetic blueprint for cardiac development. Nature 407, 221-226.

Standring, S. (2008). Gray's Anatomy: The Anatomical Basis of Clinical Practice, Expert Consult Churchill Livingstone.

Stennard, F.A., Costa, M.W., Elliott, D.A., Rankin, S., Haast, S.J.P., Lai, D., McDonald, L.P.A., Niederreither, K., Dolle, P., Bruneau, B.G., Zorn, A.M., and Harvey, R.P. (2003). Cardiac T-box factor Tbx20 directly interacts with Nkx2-5, GATA4, and GATA5 in regulation of gene expression in the developing heart. Developmental Biology 262, 206-224.

Stennard, F.A., Costa, M.W., Lai, D., Biben, C., Furtado, M.B., Solloway, M.J., McCulley, D.J., Leimena, C., Preis, J.I., Dunwoodie, S.L., Elliott, D.E., Prall, O.W.J., Black, B.L., Fatkin, D., and Harvey, R.P. (2005). Murine T-box transcription factor Tbx20 acts as a repressor during heart development, and is essential for adult heart integrity, function and adaptation. Development 132, 2451-2462.

Takahashi, J.S., Pinto, L.H., and Vitaterna, M.H. (1994). Forward and reverse genetic approaches to behavior in the mouse. Science 264, 1724-1733.

Teitel, D.F., Iwamoto, H.S., and Rudolph, A.M. (1990). Changes in the Pulmonary Circulation during Birth-Related Events. Pediatric Research 27, 372-378.

Terwilliger, J.D., and Ott, J. (1994). Handbook of Human Genetic Linkage. Johns Hopkins University Press.

Thomson, P.C. (2003). A generalized estimating equations approach to quantitative trait locus detection of non-normal traits. Genetics, selection, evolution : GSE 35, 257-280.

Thomson, P.C., Brown, S.C., and Raadsma, H.W. (2007). QTL-MLE: a maximum likelihood QTL mapping program for flexible modelling using the R computing environment. Proceedings of the Association for the Advancement of Animal Breeding and Genetics 17, 288-290.

Tikkanen, J., and Heinonen, O.P. (1992). Risk factors for atrial septal defect. European journal of epidemiology 8, 509-515.

Tomita-Mitchell, A., Maslen, C.L., Morris, C.D., Garg, V., and Goldmuntz, E. (2007). GATA4 sequence variants in patients with congenital heart disease. Journal of medical genetics 44, 779-783.

Torfs, C.P., and Christianson, R.E. (1999). Maternal risk factors and major associated defects in infants with Down syndrome. Epidemiology 10, 264-270.

Visel, A., Thaller, C., and Eichele, G. (2004). GenePaint.org: an atlas of gene expression patterns in the mouse embryo. Nucleic Acids Research 32, D552-D556.

Visscher, P.M., Haley, C.S., and Knott, S.A. (1996). Mapping QTLs for binary traits in backcross and F-2 populations. Genetical Research 68, 55-63.

von Kodolitsch, Y., and Robinson, P.N. (2007). Marfan syndrome: an update of genetics, medical and surgical management. Heart 93, 755-760.

Wade, C.M., Kulbokas, E.J., 3rd, Kirby, A.W., Zody, M.C., Mullikin, J.C., Lander, E.S., Lindblad-Toh, K., and Daly, M.J. (2002). The mosaic structure of variation in the laboratory mouse genome. Nature 420, 574-578.

Wahl, A., Meier, B., Haxel, B., Nedeltchev, K., Arnold, M., Eicher, E., Sturzenegger, M., Seiler, C., Mattle, H.P., and Windecker, S. (2001). Prognosis after percutaneous closure of patent foramen ovale for paradoxical embolism. Neurology 57, 1330-1332.

Wain, H.M., Bruford, E.A., Lovering, R.C., Lush, M.J., Wright, M.W., and Povey, S. (2002). Guidelines for human gene nomenclature. Genomics 79, 464-470.

Weaver, R.G., Seaton, A.D., and Jewett, T. (1997). Bilateral Marcus Gunn (jaw-winking) phenomenon occurring with CHARGE association. Journal of Pediatric Ophthalmology & Strabismus 34, 308-309.

Webb, G., and Gatzoulis, M.A. (2006). Atrial septal defects in the adult - Recent progress and overview. Circulation 114, 1645-1653.

Webb, S., Brown, N.A., and Anderson, R.H. (1998). Formation of the atrioventricular septal structures in the normal mouse. Circulation research 82, 645-656.

Webster, M.W., Chancellor, A.M., Smith, H.J., Swift, D.L., Sharpe, D.N., Bass, N.M., and Glasgow, G.L. (1988). Patent foramen ovale in young stroke patients. Lancet 2, 11-12.

WHO (1996). Control of hereditary disorders: Report of WHO Scientific meeting, World Health Organization (WHO).

Williams, R.W., Gu, J., Qi, S., and Lu, L. (2001). The genetic structure of recombinant inbred mice: high-resolution consensus maps for complex trait analysis. Genome Biology 2, RESEARCH0046.

Wilmshurst, P.T., Nightingale, S., Walsh, K.P., and Morrison, W.L. (2000). Effect on migraine of closure of cardiac right-to-left shunts to prevent recurrence of decompression illness or stroke or for haemodynamic reasons. Lancet 356, 1648-1651.

Wilmshurst, P.T., Pearson, M.J., Nightingale, S., Walsh, K.P., and Morrison, W.L. (2004). Inheritance of persistent foramen ovale and atrial septal defects and the relation to familial migraine with aura. Heart 90, 1315-1320.

Wilmshurst, P.T., Pearson, M.J., Walsh, K.P., Morrison, W.L., and Bryson, P. (2001). Relationship between right-to-left shunts and cutaneous decompression illness. Clinical Science 100, 539-542.

Wittenburg, D., Guiard, V., Liese, F., and Reinsch, N. (2007). Linear and generalized linear models for the detection of QTL effects on within-subject variability. Genetical Research 89, 245-257.

Wright, D.D., Gibson, K.D., Barclay, J., Razumovsky, A., Rush, J., and McCollum, C.N. (2010). High prevalence of right-to-left shunt in patients with symptomatic great saphenous incompetence and varicose veins. Journal of Vascular Surgery 51, 104-107.

Wu, R.L., and Lin, M. (2006). Opinion - Functional mapping - how to map and study the genetic architecture of dynamic complex traits. Nature Reviews Genetics 7, 229-237.

Wu, W.R., and Li, W.M. (1994). A New Approach for Mapping Quantitative Trait Loci Using Complete Genetic-Marker Linkage Maps. Theoretical and Applied Genetics 89, 535-539.

Yen Ho, S., Path, F.R.C., McCarthy, K.P., and Rigby, M. (2007). Anatomy of atrial and ventricular septal defects. Journal of Interventional Cardiology 13, 475-486.

Zabalgoitia-Reyes, M., Herrera, C., Gandhi, D.K., Mehlman, D.J., McPherson, D.D., and Talano, J.V. (1990). A possible mechanism for neurologic ischemic events in patients with atrial septal aneurysm. The American journal of cardiology 66, 761-764.

Zeng, Z.B. (1994). Precision mapping of quantitative trait loci. Genetics 136, 1457-1468.

Zhang, J., and Cai, W.W. (1993). Association of the common cold in the first trimester of pregnancy with birth defects. Pediatrics 92, 559-563.

Zhang, W.M., Li, X.F., Shen, A.D., Jiao, W.W., Guan, X.L., and Li, Z.Z. (2008). GATA4 mutations in 486 Chinese patients with congenital heart disease. European journal of medical genetics 51, 527-535.

Zipes, D.P., Libby, P., Bonow, R.O., and Braunwald, E. (2005). Braunwald's Heart Disease: A Textbook of Cardiovascular Medicine. Philadelphia, Elsevier Saunders Company.

9 Supplemental materials

Supplementary Table 1: List of markers with physical and genetic location

No.	Chr.	SNP ID	Physical Position (bp)	Genetic Position (cM)
1	1	rs32517458	17878362	5.4549
2	1	rs31445960	21704030	7.704
3	1	rs13475769	24958696	10.308
4	1	rs30641794	30026645	11.3472
5	1	rs13475802	34647581	13.337
6	1	rs31052524	37358323	15.457
7	1	rs32480633	39640998	18.3493
8	1	rs31075310	42578158	20.7457
9	1	rs30805456	44439262	23.626
10	1	rs30917196	47228346	25.5443
11	1	rs31991963	52065230	26.6713
12	1	rs32480972	59237136	29.333
13	1	rs32055290	63762906	32.307
14	1	rs31604370	68192834	33.9014
15	1	rs32666041	73462407	37.996
16	1	rs31218676	77148626	39.511
17	1	rs30203203	80790247	41.6201
18	1	rs6278395	88645127	43.9953
19	1	rs30315626	92265367	45.4767
20	1	rs31438438	98284433	47.764
21	1	rs31883728	103053044	49.258
22	1	rs6212740	108915586	50.344
23	1	rs31959300	117673197	51.606
24	1	rs31735078	124410502	53.1661
25	1	rs32648932	129563187	55.798
26	1	rs32434650	135869536	58.0582
27	1	rs32424507	139650241	60.524
28	1	rs3663996	143991782	62.3284

29	1	rs6405822	149908841	63.32
30	1	rs30467058	154157774	64.7135
31	2	rs27208240	31043128	21.806
32	2	rs27168898	34880155	23.2912
33	2	rs6171091	42409494	25.237
34	2	rs28320922	47148532	27.8716
35	2	rs28295341	50718471	29.2835
36	2	rs29521586	58052086	33.053
37	2	rs13476542	64093315	37.6423
38	2	rs3695852	67387900	39.315
39	2	rs51010451	72078052	43.026
40	2	rs33005712	77153358	45.937
41	2	rs33146952	80508862	48.3881
42	2	rs27369074	92294227	51.0509
43	2	rs6403952	95681285	52.156
44	2	rs27393348	102153636	54.129
45	2	rs33217144	107389917	56.116
46	2	rs27488478	113024169	57.2433
47	2	rs29971528	117981438	59.3588
48	2	rs32992654	126302207	61.759
49	2	rs27276631	132008491	64.146
50	2	rs13476787	134965497	66.437
51	2	rs33119174	140093642	69.202
52	2	rs29733168	148699444	73.5983
53	2	rs27369770	152240355	74.9854
54	2	rs27338655	155621679	77.262
55	2	rs27304445	158395675	78.715
56	2	rs27316600	162874767	83.9978
57	2	rs33353350	165595453	85.9611
58	2	rs29821300	168185567	88.7165
59	2	rs27620443	172534415	94.899
60	2	rs27650386	174060877	97.874
61	2	rs29723406	178724740	100.7742
62	4	rs27658776	8919600	3.9091

63	4	rs3712771	13922293	6.0006
64	4	rs27682682	19548000	7.559
65	4	rs3671060	29186287	12.9927
66	4	rs27773647	33403738	14.7883
67	4	rs32922690	35631610	17.7134
68	4	rs27766395	37917978	19.8859
69	4	rs13459075	41664425	21.9875
70	4	rs3661213	45208833	23.7715
71	4	rs3699346	48577080	26.133
72	4	rs32666861	52347280	28.314
73	4	rs27805117	55773652	30.4993
74	4	rs27866811	58985174	32.361
75	4	rs27922877	64514168	34.056
76	4	rs32458346	71213322	36.0325
77	4	rs32917213	75830206	37.323
78	4	rs28090194	82225610	39.2684
79	4	rs28099501	86919147	41.044
80	4	rs28106777	95477384	44.0403
81	4	rs32870027	100191865	45.74
82	8	rs32591452	103180549	50.074
83	8	rs13479958	105453023	52.796
84	8	rs13479995	116236688	59.651
85	8	rs3691294	117662932	62.158
86	8	rs3686697	119689334	63.7249
87	8	rs6244767	120831047	65.9844
88	9	rs31494115	4435626	2.458
89	9	rs30194650	16513299	5.3342
90	9	rs29594239	20932553	7.7191
91	9	rs6324426	26722424	11.8784
92	9	rs6352237	28364090	14.3502
93	9	rs6331932	32220062	17.7274
94	9	rs3666782	36894028	20.721
95	9	rs30079126	42017004	23.575
96	9	rs30331033	45242187	24.836

97	9	rs30087720	48368569	26.4443
98	9	rs30524405	56052626	30.13
99	9	rs30471967	59939036	32.5174
100	9	rs30273450	63600100	34.2822
101	9	rs29990501	67283279	36.5258
102	9	rs30084038	70100757	39.4643
103	11	rs26897275	3686755	2.6629
104	11	rs29385812	7963924	4.884
105	11	rs6185261	11911737	7.1966
106	11	rs29408224	14842944	8.623
107	11	rs33850802	18943579	11.4064
108	11	rs29465407	22114857	14.1312
109	11	rs6259125	28060339	16.2052
110	11	rs26839036	31432399	18.4862
111	11	rs26888494	34940140	20.3055
112	13	rs29552398	12612567	4.9149
113	13	rs623307	17509991	5.9688
114	13	rs29250410	20698662	7.3978
115	13	rs6310463	29209975	12.9129
116	13	rs29252384	34405346	14.0842
117	13	rs29637575	37665745	16.9315
118	13	rs29239048	39581608	19.0463
119	13	rs29222774	43048849	21.2576
120	13	rs29534013	45966066	22.8181
121	13	rs29244575	47908419	24.497
122	13	rs29250103	52156808	26.669
123	13	rs29686818	54788137	28.9658
124	13	rs30079865	57767165	30.6219
125	13	rs29831449	62928598	32.744
126	13	rs29245956	70857323	35.97
127	13	rs29885133	77353251	40.977
128	19	rs31292888	3356365	3.115
129	19	rs52111888	6358321	4.473
130	19	rs31091380	10728651	7.0149

131	19	rs30617509	14647451	9.184
132	19	rs30690659	16465896	11.2293
133	19	rs31195994	18586641	13.167
134	19	rs30653264	21574326	14.5949
135	19	rs30323643	22618957	17.0007
136	19	rs30464413	24535024	19.518
137	19	rs31057067	26041377	20.8274
138	19	rs31090270	27236333	21.7132
139	19	rs30518862	28327321	22.8775
140	19	rs3680681	30074291	24.6699
141	19	rs3713134	32068420	26.584
142	19	rs31233231	33025504	29.056
143	19	rs30997146	35142532	29.792
144	19	rs31262554	37174506	32.1327
145	19	rs31045283	47110119	39.1172
146	19	rs30696058	47983028	40.527
147	19	rs3677606	50367797	45.048
148	19	rs6167751	52006078	46.061
149	19	rs30903784	53224126	47.1984
150	19	rs30363136	55847579	51.6054
