Flaquer et al. BMC Genetics 2013, 14:44 http://www.biomedcentral.com/1471-2156/14/44

RESEARCH ARTICLE Open Access Genome-wide linkage analysis of congenital heart defects using MOD score analysis identifies two novel loci Antònia Flaquer1,2*, Clemens Baumbach1,2, Estefania Piñero3, Fernando García Algas4, María Angeles de la Fuente Sanchez4, Jordi Rosell5, Jorge Toquero6, Luis Alonso-Pulpon6, Pablo Garcia-Pavia6, Konstantin Strauch1,2 and Damian Heine-Suñer5

Abstract Background: Congenital heart defects (CHD) is the most common cause of death from a congenital structure abnormality in newborns and is often associated with fetal loss. There are many types of CHD. Human genetic studies have identified that are responsible for the inheritance of a particular type of CHD and for some types of CHD previously thought to be sporadic. However, occasionally different members of the same family might have anatomically distinct defects — for instance, one member with atrial septal defect, one with tetralogy of Fallot, and one with ventricular septal defect. Our objective is to identify susceptibility loci for CHD in families affected by distinct defects. The occurrence of these apparently discordant clinical phenotypes within one family might hint at a genetic framework common to most types of CHD. Results: We performed a genome-wide linkage analysis using MOD score analysis in families with diverse CHD. Significant linkage was obtained in two regions, at 15 (15q26.3, Pempirical = 0.0004) and at chromosome 18 (18q21.2, Pempirical = 0.0005). Conclusions: In these two novel regions four candidate genes are located: SELS, SNRPA1, and PCSK6 on 15q26.3, and TCF4 on 18q21.2. The new loci reported here have not previously been described in connection with CHD. Although further studies in other cohorts are needed to confirm these findings, the results presented here together with recent insight into how the heart normally develops will improve the understanding of CHD. Keywords: Congenital heart defects, Genetics, Linkage analysis, Genome-wide study

Background fects (ASD), ventricular septal defects (VSD), bicuspid Congenital heart defects (CHD) refer to abnormalities in aortic valve (BAV), and Ebstein’s anomaly, among many the heart structure or function that arise at the fetal others. stages and affect approximately 1% of newborns [1]. Normal heart development involves many regula- Multiple surgeries are almost always required to correct tory pathways including receptor-ligand interactions many of the anatomical defects, and quality of life is (JAGGED/NOTCH, TGFB-BMP/TGFBR, VEGF/FLT1- often greatly compromised. There are many types of FLK1, NODAL/ACVRA-ACVRB and RTK/RAS), signal CHD. Examples include transposition of the great ar- transduction (kinases or phosphatases such as MAPK, teries/vessels (TGA/TGV), tetralogy of Fallot (TOF), ERK1/2, calcineurin, or GSK), and transcription factors double outlet right ventricle (DORV), atrial septal de- that determine the expression of cardio-specific genes ( families with a T-box domain TBX1, TBX5, and * Correspondence: [email protected] TBX20, the GATA family GATA4 and FOG2 or homeobox 1 Institute of Medical Informatics, Biometry, and Epidemiology, Chair of domain NKX2.5 and NKX2.6). Mutations that show a Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany 2Institute of Genetic Epidemiology, Helmholtz Zentrum München, German greater penetrance and that therefore approximate a Research Center for Environmental Health, Neuherberg, Germany Full list of author information is available at the end of the article

© 2013 Flaquer et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Flaquer et al. BMC Genetics 2013, 14:44 Page 2 of 8 http://www.biomedcentral.com/1471-2156/14/44

monogenic inheritance are those affecting transcription ful method for detecting effects [18,19]. Especially factors or genes transcribed by them [2]. when looking at rare variants, linkage has the advantage Although the major underlying defects that cause over association of not being prone to allelic heteroge- CHD are thought to be mutations in regulators of heart neity. That is, the combination of many weak association development during embryogenesis, epidemiological data signals obtained in a certain region for single variants by also indicate an environmental influence [2-4]. However, means of collapsing methods is automatically performed these epidemiological studies mostly suggest risk factors in the context of linkage analysis. rather than underlying disease mechanisms. Genetic fac- In this study we performed a detailed genome-wide lin- tors for some of the CHD include Mendelian mutations, kage analysis using MOD scores and 6 families affected by copy number variants, translocations, and single nucleo- different CHD to test for a common genetic background tide polymorphisms (SNPs) [5-7]. among different types of heart defects. A genetic component of CHD diseases was initially im- plicated by recurrence in families, and by studies showing Results and discussion a co-segregation of CHD with the deletion 22q11.2 [8]. It The genome-wide results for MOD scores are plotted in has also been shown that individuals with an affected pa- Figure 1. No significant results were obtained when mo- rent are at twofold greater risk, which is even greater if deling parent-specific heterozygote penetrances, resulting siblings are affected [9]. While many studies have investi- in no evidence of parental genomic imprinting in our set gated the role of genes in the etiology of inherited and of pedigrees. The best results were obtained on chromo- sporadic CHD [10-14] these studies have been focusing some 15 and 18. The results for these two on specific cardiac lesions separately. Recently, rare vari- are plotted in Figures 1A and 1B, respectively. The highest ants have also been reported to be associated with CHD scores (MOD = 3.9) were detected on chromosome 18 [15]. These rare variants are usually present in less than (18q21.2-18q21.3; 51197–52023 kb) comprising 57 SNPs 1% of the normal population but are overrepresented in with Pempirical = (0.0005-0.0006), where Pempirical refers to selected patients. They are typically inherited from an the P value obtained after simulations were performed. asymptomatic parent and are therefore not believed to be The associated parametric model is a recessive mode of a sufficient cause of CHD in these patients; other muta- inheritance with incomplete penetrance with a probability tions need to be present to lead to CHD. of 0.27 to be affected for individuals carrying two copies However, occasionally, different members of one family of the disease allele. Incomplete penetrance refers to the are observed to suffer from anatomically distinct defects — failure of individuals who carry the high risk genotype to for example, one member with ASD, one with TOF, and exhibit the trait. This phenomenon is often observed in one with VSD. These apparently discordant clinical phe- complex diseases. Besides different ways of gene action or notypes arising within one family are difficult to ratio- degrees of expression, incomplete penetrance can arise nalize. Benson et al. [16] observed a coinheritance of from further genetic or environmental factors. The second different NKX2.5 mutations, i.e., compound heterozygotes, significant region was at (15q26.3; in members of families with ASD, VSD, and cardiac con- 99749–99883 kb) comprising 27 SNPs, all of them with a duction abnormalities, suggesting that mutations in the MOD score of 3.5 and Pempirical = (0.0004-0.0005). The same gene can affect different parts of the heart, and best-fitting model points to a recessive mode of inheri- therefore cause different types of CHD. McGregor et al. tance, with a probability of 0.98 to be affected for indivi- [17] reported a statistically significant linkage evidence for duals carrying two copies of the disease allele. It should be a on chromosome 14 (within the HOMEZ gene) in noted that the estimate of the disease allele frequency South Indian cases born to consanguineous parents. How- obtained by a MOD-score analysis has the largest variance ever, the linkage finding was not robust in a genetic of all trait-model parameters. Furthermore, in some cases, association follow-up study in a general United States the estimated disease allele frequency will be markedly population. Despite the successes over the past few years, higher than the true value. This is due to the fact that spe- much remains unknown about the genetics of CHD. It is cifying a higher disease allele frequency can compensate believed that much of the missing heritability is likely to for a general model misspecification and hence lead to be due to rare variants with larger effect sizes as well as robustness in a multipoint analysis [20]. epigenetic effects. Their systematic characterization is be- To corroborate these findings, family-based association yond the means of studies that currently rely on linkage analysis was performed in these two significant regions disequilibrium (LD) patterns such as genome-wide associ- with SNPs displaying a MOD >2.5 (P value ≤ 0.01). Using ation analysis. In such situations taking family information this criterion a total number of 304 SNPs (P values bet- into consideration using a direct mapping approach ween 5 × 10-04 and 1 × 10-02) and 211 SNPs (P values bet- complemented by segregation techniques will be a useful ween 4 × 10-04 and 4 × 10-03)wereincludedinthe strategy. For complex diseases linkage analysis is a power- association analysis for 18q21.2-18q21.3 and 15q26.3, Flaquer et al. BMC Genetics 2013, 14:44 Page 3 of 8 http://www.biomedcentral.com/1471-2156/14/44

− ( ) − ( )

Figure 1 Genome-wide results for the MOD score analyses. XY1 and XY2 represent pseudoautosomal regions 1 and 2, respectively. Figures 1A and 1B illustrate the MOD score results on chromosome 15 and chromosome 18, respectively. In addition, the –log10(P value) of the family-based association test in regions with significant MOD scores are shown as red dots. respectively. A total number of 34 SNPs resulted to be sig- very closely to each other (SELS, SNRPA1,andPCSK6). nificant at a type I error rate of 0.05 on chromosome 18. In addition, one gene is located in the linkage region Nevertheless, none of them was significant after using the identified on chromosome 18 (TCF4). Bonferroni correction to adjust for multiple testing. On the other hand, on chromosome 15 a total number of 39 One gene at 18q21.2: TCF4 SNPs were found to be nominally significant and only one The potentially relevant TCF4 gene is located at the (rs3784481 at 99749 kb) was significant after Bonferroni linkage peak of chromosome 18. This gene encodes tran- correction (P value = 0.0025). scription factor 4, a basic helix-loop-helix transcription To see if one of the two linkage regions may be the factor. To date, no connection between this gene and causal genetic factor for a particular family we CHD has been established. However, it is interesting to performed parametric linkage (LOD score) analysis of note that fairly recently it has been found that in mice individual pedigrees under the best fitted model for the the tcf4 gene is strongly expressed in connective tissue two linked regions. As shown in Table 1, all the pedi- fibroblast. It was demonstrated that connective tissue grees contribute positively to the LOD score in both re- gions (except pedigree 5, chr15). These positive Table 1 Parametric LOD scores for each family separately contributions indicate that all types of CHD present in Family Heart defect Chr15 Chr18 our pedigrees are linked to both regions, supporting the 15q26.3 18q21.2-18q21.3 initial hypothesis of this study of a common genetic 1 VSD 0.615 0.632 background for diverse CHD. The negative contribution of pedigree 5 to the LOD score on chromosome 15 has 2 TOF,TGV 0.324 0.299 to be treated carefully. Due to the small number of mei- 3 TOF 0.545 0.602 oses in this pedigree there is a lack of power to detect 4 ASD 1.670 1.297 linkage, and no conclusion should be drawn when ana- 5 VSD,DORV −0.307 0.602 lyzing this pedigree individually. 6 ASD 0.681 0.496 When looking in detail at the region around the sig- All VSD,TOF,TGV,ASD,DORV 3.529 3.929 nificant SNPs on chromosome 15, three genes are found Flaquer et al. BMC Genetics 2013, 14:44 Page 4 of 8 http://www.biomedcentral.com/1471-2156/14/44

fibroblasts and tcf4 critically regulate two aspects of type 9, which is encoded by the PCSK9 gene, is a key regu- myogenesis: muscle fiber type development and matur- lator of plasma low density lipoprotein cholesterol and has ation [21]. Taken together this finding in mice and our emerged as a promising target for prevention and treat- results could suggest that malfunction of the TCF4 gene ment of coronary heart diseases. A recent genome-wide may produce a malformation of the heart muscle tissue association study further bolstered the importance of which in turn predisposes to a higher risk to develop PCSK9 by establishing a link between a nearby SNP and some type of CHD. early onset myocardial infarction [29].

Three genes at 15q26.3: SELS, SNRPA1, and PCSK6 The SELS gene encodes a selenoprotein, which contains Conclusions a selenocysteine residue at its active site. Studies suggest We have conducted a whole genome linkage analysis for that this protein may regulate cytokine production, and CHD using MOD score analysis. We used a total num- thus play a key role in the control of the inflammatory ber of 177,568 SNPs providing a good coverage of the response decreasing pro-inflammatory cytokines [22]. genome in a sample of 6 families of Spanish origin with Production of cytokines is one of the major mechanisms members affected with anatomically distinct CHD: ASD, employed by T-cells to coordinate immune responses. It VSD, TOF, TGV, and DORV. The aim of this study was has been demonstrated that the developing immune sys- the identification of chromosomal regions that are impli- tem of the neonate not only differs significantly from cated in a possible common genetic framework shared that of an adult, but also varies based on gestational age by distinct types of CHD. Two interesting linkage re- [23]. A malfunction of the SELS gene might provoke de- gions were found, on chromosome 15 and 18. The ficient cytokine production by neonatal lymphocytes, highest linkage peak was obtained at position 18q21.2 and somehow predispose the newborn to a higher risk (LOD = 3.9, Pempirical = 0.0005) and the second highest to develop some CHD. It was also shown that SELS is peak at 15q26.3 (LOD = 3.5, Pempirical = 0.0004) using secreted from hepatoma cells and that it associates with MOD-score analysis. In addition, at 15q26.3 a significant low-density lipoprotein particles [24] suggesting also a association was obtained for a SNP (rs3784481, P value = role in lipid metabolism. Lundell et al. [25] had already 0.0025) using family-based association analysis. Imprinting shown in their study that lipid metabolism was disturbed analysis of the data in the present study did not confirm in infants with CHD. Furthermore, genetic studies in the hypothesis of a possibly imprinted locus. When relation to cardiovascular disease have revealed that vari- looking at these two linkage regions in detail, four genes ation in the SELS locus is associated with coronary heart are located there, three very close to each other at 15q26.3 disease and ischemic stroke in women [26]. and one at 18q21.2. The SNRPA1 gene encodes the U2 small ribonucleopro- As in all research work, our results have some un- tein A’ (U2 snRNP A’). The action of snRNPs is essential avoidable limitations. They have been obtained using to the removal of introns from pre-mRNA, a critical as- linkage techniques with a small sample size. It is well pect of post-transcriptional modification of RNA. To date known that large, multi-generation pedigrees are the no association has been reported between SNRPA1 and most informative for linkage analysis. However, such heart defects. However, post-transcriptional modification pedigrees are difficult to find when investigating com- is one of the multiple mechanisms that regulate the level plex diseases with a high and early mortality such as and activity of transcription factors which are responsible CHD. That might explain why to date there have been for regulating formation and function of the heart. Pertur- so few genome-wide linkage scans involving affected bations of transcription factor expression and regulation families with CHD. Therefore, despite the fact that our disrupt normal heart structure and function [27]. study comprising 6 families is small compared with link- The protein encoded by the PCSK6 gene belongs to the age studies for other disorders, it is, thus far, the largest subtilisin-like proproteinconvertase family. The members family study performed in CHD. Moreover, our findings of this family are proproteinconvertases that process latent on chromosome 15 are corroborated by family-based precursor into their biologically active products. association analysis (with 7 additional trios). Clearly, This encoded protein is a calcium-dependent serine recruiting more families with CHD will help to verify endoprotease that can cleave precursor protein at their this linkage evidence as well as to identify new linkage paired basic amino acid processing sites. Some of its sub- regions. strates are transforming growth factor beta related pro- We report in this study two genetic regions linked to teins, proalbumin, and von Willebrand factor. In 1998, CHD predisposition. There are three candidate genes lo- a study reported abnormalities of von Willebrand factor cated in the 15q26.3 region very close to each other in adults with CHD [28]. Interestingly, a protein from (SELS, SNRPA1 and PCSK6) and one gene at 18q21.2 the same family, proproteinconvertasesubtilisin-like/kexin (TCF4). Only one of these genes, SELS, was previously Flaquer et al. BMC Genetics 2013, 14:44 Page 5 of 8 http://www.biomedcentral.com/1471-2156/14/44

found to be implicated in cardiovascular disorders. The Methods other three genes reported in this study may be considered Subjects as novel candidate loci for CHD. A total number of 6 families comprising 48 individuals While several genome scans have reported susceptibi- (19 females, 29 males) and 7 trios (an affected proband lity genes for specific types of CHD, very few (only one and both parents) of Spanish origin were recruited at the [17], and the one presented here) have identified suscep- University Hospital Son Espases (HUSE) of the Balearic tibility genes that predispose to developing anatomically Islands and at the Puerta del Hierro University Hospital in distinct types of CHD and should form the basis of new Madrid. Pedigree structures are illustrated in Figure 2. analyses of both linkage and association studies on The inclusion criterion for all patients was a CHD present CHD. Future linkage studies may benefit in power from at birth. Patients and their families were asked to partici- larger sample size, and also case–control studies will pate in the study within the clinical practice and blood benefit from direction or corroboration given by linkage samples were taken. All members included in the study scans. For example, the extensive multiple testing prob- were checked for the presence of deletions or duplications lem in genome-wide association studies can be partially using the same SNP-array analysis. Individuals that were alleviated by a weighted Bonferroni correction or Bayesian found to carry a suggestive or clearly causative deletion or analysis that employs linkage data [30]. The results pre- duplication (22q11.2 or other) were excluded from the sented here should facilitate the discovery of both com- analysis. All participants provided their written informed mon and rare variants that confer genetic susceptibility consent to participate in the study. A statement of in- for CHD and help the understanding of heart mecha- formed consent form was signed by the parents of all par- nisms. Understanding the etiology of CHD will also help ticipating children under 18 years old, following protocols to identify possible complications and risk factors for sur- approved by the ethical committee for clinical research of gery or treatment, as patients (babies and children) with the Balearic Islands (IB 963/08; Comité ético de inves- genetic syndromes or extracardiac anomalies are generally tigación clínica de las Islas Baleares; CEIC). The study at higher risk of operative mortality and morbidity. conformed to all ethical guidelines of the institutions

Figure 2 Pedigree structures. Pedigree structures used to perform linkage analysis. In addition, 7 trios (an affected proband and both parents) affected with CHD - DORV, VSD, TOF, TGV, or ASD – were used in family-based association analysis. Flaquer et al. BMC Genetics 2013, 14:44 Page 6 of 8 http://www.biomedcentral.com/1471-2156/14/44

involved. Research at the HUSE and Puerta del Hierro fol- can result in severe biases in linkage results [34]. That is low the guidelines included in the June 1964 Declaration why pairwise LD between each pair of SNP loci was ana- of Helsinki (entitled “Ethical Principles for Medical lyzed, using the procedure ldmax in the GOLD software Research Involving Human Subjects”). All sample identi- [35]. ldmax calculates r2 values from haplotype frequen- fiers were re-encoded in a way that only the clinicians are cies estimated via the expectation maximization (EM) able to access personal information. algorithm of Excoffier and Slatkin [36]. In our data, when- ever a pair of SNPs was in LD (r2 >0.20),oneofthetwo DNA isolation and genotyping SNPs was removed from the analysis. With the resulting All patients had a single tube of blood drawn at the time of data set a genome-wide linkage analysis was performed on enrollment, which was collected centrally for subsequent a total number of 6 pedigrees with sizes varying from 4 to DNA extraction. Genotyping reactions of 52 individuals 13 members. After QC, a total number of 177,568 SNPs were performed at the Spanish genotyping center (CeGen) was analyzed, with 174,475 from the autosomes, 3078 facilities in Barcelona using the Illumina (California, USA) from the X-chromosome, and 15 SNPs from the pseu- Human660W-Quad chip, following the manufacturer’s doautosomal regions (PAR1 and PAR2). instructions. Statistical methods Data quality control In parametric linkage analysis (LOD scores) the trait- Because genotyping errors can lead to incorrect infer- model parameters, i.e., the penetrances and disease allele ences about gene flow in pedigrees and may reduce the frequency, must be specified prior to the analysis. It is effectiveness of linkage analysis, strict quality controls well known that the power to detect linkage decreases (QC) were performed in our data. First of all, the soft- when the specified model is not sufficiently close to the ware Graphical Representation of Relationship Errors true one. Consequently, we calculated multipoint MOD (GRR) was used to identify errors in the structure of the scores, i.e., LOD scores that were maximized over trait-model pedigrees and to eliminate duplicate samples, monozy- parameters using the program GENEHUNTER-MODSCORE gotic twins, and unrelated subjects [31]. In one family (GH-MOD v3.0) [37]. This program calculates MOD GRR detected an inconsistency involving a mother and scores automatically by varying the disease-allele frequency her two children. Eventually it turned out that the chil- and three penetrances. We used the option “modcalc dren’s genotypes were mutually consistent and only the single” to perform a separate maximization for each ge- mother’s genotypes were removed from the analysis. All netic position assumed for the putative disease locus. This individuals were also tested for sex consistency. After- procedure yields a MOD score in conjunction with the wards, the PEDCHECK software was used to detect penetrances and disease allele frequency of the best-fitting marker typing incompatibilities [32]. A total number of trait model for every genetic position. MOD score analysis 3271 (0.6%) incompatibilities were detected and those represents a comprehensive way to analyze linkage data genotypes were removed from the data. Additional QC when the mode of inheritance is unknown; in general this was undertaken to check for unlikely double recom- procedure is particularly well suited for the genetic dissec- binants. Two or more close recombination events on the tion of a complex trait [38]. same chromosome are uncommon due to the chance It has also been observed that the CHD risk for chil- distribution of chiasmata, and because of interference. dren of mothers with a CHD is consistently higher than To detect such problematic genotypes the “error detec- for those of fathers with a CHD [39,40]. One possible tion” option from the MERLIN v1.1.2 software was used explanation for the increased transmission through and the unlikely genotypes were subsequently removed females is a paternal imprinting mode of inheritance, from the data using the “pedwipe” option [33]. Once i.e., offspring will be affected only if the inherited disease Mendelian inconsistencies and unlikely genotypes were allele comes from the mother. Hence, we also checked resolved, SNPs with call rates below 95%, monomorphic for a parent-of-origin effect using two different pene- SNPs, and SNPs with a missing genetic position were trances for heterozygous individuals to take the parent also removed from the analysis. At the end, 93.95% of who transmitted the disease allele into account (see “im- the original SNPs survived the QC. Allele frequencies printing” option of GH-MOD v3.0). were then estimated with MERLIN v1.1.2 using a A MOD-score analysis leads to an inflation of the type maximum-likelihood procedure. I error when compared to LOD scores calculated under a single parametric model. Therefore, in the context of Data preparation for linkage analysis MOD-score analysis, significance criteria for LOD scores In linkage, when analyzing SNPs in pedigrees where some cannot be applied without correction. To assess signifi- parental genotypes are missing it is necessary to model for cance for a MOD score empirical P values need to be marker-marker LD prior to the analysis. Ignoring such LD calculated. We used 5000 replicates, each one generated Flaquer et al. BMC Genetics 2013, 14:44 Page 7 of 8 http://www.biomedcentral.com/1471-2156/14/44

with Monte-Carlo simulations of marker alleles under Acknowledgments the assumption of no linkage and Hardy-Weinberg equi- The authors are grateful to the patients and their families for their participation in the study. This study was supported by the grants from the librium, using the same pedigree structure, affection sta- Instituto de Salud Carlos III (08–1363 and 11–0699) of the Spanish Ministry of tus, marker spacing, and allele frequencies as in the real Health and by grant Str643/4-1 of the Deutsche Forschungsgemeinschaft data set. Replicates were analyzed like the original data. (DFG, German Research Foundation). P The empirical value was obtained as the proportion of Author details replicates showing a MOD score equal to or higher than 1Institute of Medical Informatics, Biometry, and Epidemiology, Chair of the one observed in the real data set. Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany. 2Institute of Genetic Epidemiology, Helmholtz Zentrum München, German We computed MOD and MOD-imprinting for all Research Center for Environmental Health, Neuherberg, Germany. 3Research autosomes. The MOD score has not been implemented Unit, Son Espases University Hospital, Palma de Mallorca, Spain. 4Pediatric so far for the X-chromosome. Therefore, we performed Cardiology, Son Espases University Hospital, Palma de Mallorca, Spain. 5Genetic Section, Son Espases University Hospital, Palma de Mallorca, Spain. LOD-score analysis assuming different genetic models 6Cardiomyopathy Unit, Department of Cardiology, Puerta de Hierro for the X-chromosome, ranging from completely dom- University Hospital, Madrid, Spain. inant to completely recessive modes of inheritance. In Received: 6 February 2013 Accepted: 16 May 2013 addition, the PAR1 and PAR2 were also investigated. Published: 24 May 2013 Genetic markers in the PARs follow the same inheri- tance pattern as autosomal markers, becoming pro- gressively more sex-linked (i.e., recombination events References 1. Hoffman JI, Kaplan S: The incidence of congenital heart disease. J Am Coll are becoming rarer between X and Y) as they ap- Cardiol 2002, 39:1890–1900. proach the pseudoautosomal boundary. When using 2. Wessels MW, Willems PJ: Genetic factors in non-syndromic congenital multipoint linkage analysis in the PARs, it is crucial to heart malformations. Clin Genet 2010, 78:103–123. 3. Burn J, Brennan P, Little J, Holloway S, Coffey R, Somerville J, Dennis NR, use sex-specific maps since there exists a marked dif- Allan L, Arnold R, Deanfield JE, Godman M, Houston A, Keeton B, Oakley C, ference in recombination frequencies between males Scott O, Silove E, Wilkinson J, Pembrey M, Hunter AS: Recurrence risks in and females [41]. We used pseudoautosomal sex- offspring of adults with major heart defects: results from first cohort of British collaborative study. Lancet 1998, 351:311–316. specific genetic coordinates [41] implemented in the 4. Oyen N, Poulsen G, Boyd HA, Wohlfahrt J, Jensen PK, Melbye M: Recurrence Rutgers map [42]. The sex-specific genetic positions were of congenital heart defects in families. Circulation 2009, 120:295–301. interpolated using a local regression with a quadratic fit as 5. Greenway SC, Pereira AC, Lin JC, DePalma SR, Israel SJ, Mesquita SM, Ergul E, “ ” Conta JH, Korn JM, McCarroll SA, Gorham JM, Gabriel S, Altshuler DM, implemented by the locfit package [43] of the statistical Quintanilla-Dieck Mde L, Artunduaga MA, Eavey RD, Plenge RM, Shadick NA, software R. In order to obtain a mapping that is monoton- Weinblatt ME, De Jager PL, Hafler DA, Breitbart RE, Seidman JG, Seidman CE: ous in the positions we used R’s “monoProc” De novo copy number variants identify new genes and loci in isolated sporadic tetralogy of Fallot. Nat Genet 2009, 41:931–935. package which monotonizes the local regression fit [44]. 6. Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, Cawley S, Finally, only in those regions where the most signifi- Hubbell E, Veitch J, Collins PJ, Darvishi K, Lee C, Nizzari MM, Gabriel SB, cant linkage signals were obtained, to corroborate and Purcell S, Daly MJ, Altshuler D: Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and narrow the significant regions, family-based association rare CNVs. Nat Genet 2008, 40:1253–1260. analysis together with the trio families was performed 7. McCarroll SA, Kuruvilla FG, Korn JM, Cawley S, Nemesh J, Wysoker A, using the LAMP software [45]. LAMP uses a maximum Shapero MH, de Bakker PI, Maller JB, Kirby A, Elliott AL, Parkin M, Hubbell E, Webster T, Mei R, Veitch J, Collins PJ, Handsaker R, Lincoln S, Nizzari M, likelihood model to extract the exact information on Blume J, Jones KW, Rava R, Daly MJ, Gabriel SB, Altshuler D: Integrated genetic linkage and association from samples of unre- detection and population-genetic analysis of SNPs and copy number lated individuals, sib pairs, trios, and larger pedigrees. variation. Nat Genet 2008, 40:1166–1174. P 8. Wilson DI, Cross IE, Wren C, Scambler PJ, Burn J, Goodship J: Minimum values obtained by the family-based association ana- prevalence of chromosome 22q11 deletions. Am J Hum Genet 1994, 55: lysis were adjusted using Bonferroni correction (using A169.975. the total number of SNPs in each specific region of link- 9. Murabito JM, Pencina MJ, Nam BH, D’Agostino RB Sr, Wang TJ, Lloyd-Jones D, ’ age) to account for multiple testing. We used an overall Wilson PW, O Donnell CJ: Sibling cardiovascular disease as a risk factor for cardiovascular disease in middle-aged adults. JAMA 2005, 294:3117–3123. type I error rate of 0.05. 10. Basson CT, Bachinsky DR, Lin RC, Levi T, Elkins JA, Soults J, Grayzel D, Kroumpouzou E, Traill TA, Leblanc-Straceski J, Renault B, Kucherlapati R, Competing interests Seidman JG, Seidman CE: Mutations in human TBX5 [corrected] cause The authors declare that they have no competing interests. limb and cardiac malformation in Holt-Oram syndrome. Nat Genet 1997, 15:30–35. Authors’ contributions 11. Li QY, Newbury-Ecob RA, Terrett JA, Wilson DI, Curtis AR, Yi CH, Gebuhr T, AF carried out all statistical analyses, interpretation of the results and wrote the Bullen PJ, Robson SC, Strachan T, Bonnet D, Lyonnet S, Young ID, Raeburn manuscript. CB carried out all the simulation analyses, prepared the figures and JA, Buckler AJ, Law DJ, Brook JD: Holt-Oram syndrome is caused by edited the manuscript. EP carried out the SNP-array DNA preparation and mutations in TBX5, a member of the Brachyury (T) gene family. Nat partial analysis. FGA, MADLF, JT, LAP and PGP characterized clinically heart Genet 1997, 15:21–29. defects of patients. JR characterized clinical dysmorphology and evaluated 12. Garg V, Kathiriya IS, Barnes R, Schluterman MK, King IN, Butler CA, Rothrock CR, inheritance. KS critically revised the manuscript and all statistical content. DHS Eapen RS, Hirayama-Yamada K, Joo K, Matsuoka R, Cohen JC, Srivastava D: provided samples and revised the manuscript critically for important molecular GATA4 mutations cause human congenital heart defects and reveal an content. All authors gave final approval of the version to be published. interaction with TBX5. 2003, 424:443–447. Flaquer et al. BMC Genetics 2013, 14:44 Page 8 of 8 http://www.biomedcentral.com/1471-2156/14/44

13. Schott JJ, Benson DW, Basson CT, Pease W, Silberbach GM, Moak JP, Maron BJ, 36. Excoffier L, Slatkin M: Maximum-likelihood estimation of molecular haplotype Seidman CE, Seidman JG: Congenital heart disease caused by mutations in frequencies in a diploid population. Mol Biol Evol 1995, 12:921–927. the transcription factor NKX2-5. Science 1998, 281:108–111. 37. Mattheisen M, Dietter J, Knapp M, Baur MP, Strauch K: Inferential testing 14. Garg V, Muth AN, Ransom JF, Schluterman MK, Barnes R, King IN, Grossfeld for linkage with GENEHUNTER-MODSCORE: the impact of the pedigree PD, Srivastava D: Mutations in NOTCH1 cause aortic valve disease. Nature structure on the null distribution of multipoint MOD scores. Genetic 2005, 437:270–274. Epidemiol 2008, 32:73–83. 15. Roessler E, Ouspenskaia MV, Karkera JD, Velez JI, Kantipong A, Lacbawan F, 38. Strauch K, Furst R, Ruschendorf F, Windemuth C, Dietter J, Flaquer A, Baur MP, Bowers P, Belmont JW, Towbin JA, Goldmuntz E, Feldman B, Muenke M: Wienker TF: Linkage analysis of alcohol dependence using MOD scores. Reduced NODAL signaling strength via mutation of several pathway BMC Genet 2005, 6(Suppl 1):S162. members including FOXH1 is linked to human heart defects and 39. Gill HK, Splitt M, Sharland GK, Simpson JM: Patterns of recurrence of holoprosencephaly. Am J Hum Genet 2008, 83:18–29. congenital heart disease: an analysis of 6,640 consecutive pregnancies 16. Benson DW, Silberbach GM, Kavanaugh-McHugh A, Cottrill C, Zhang Y, evaluated by detailed fetal echocardiography. J Am Coll Cardiol 2003, Riggs S, Smalls O, Johnson MC, Watson MS, Seidman JG, Seidman CE, 42:923–929. Plowden J, Kugler JD: Mutations in the cardiac transcription factor NKX2.5 40. Romano-Zelekha O, Hirsh R, Blieden L, Green M, Shohat T: The risk for affect diverse cardiac developmental pathways. J Clin Investig 1999, congenital heart defects in offspring of individuals with congenital heart 104:1567–1573. defects. Clin Genet 2001, 59:325–329. 17. McGregor TL, Misri A, Bartlett J, Orabona G, Friedman RD, Sexton D, 41. Flaquer A, Fischer C, Wienker TF: A new sex-specific genetic map of the Maheshwari S, Morgan TM: Consanguinity mapping of congenital heart human pseudoautosomal regions (PAR1 and PAR2). Hum Hered 2009, disease in a South Indian population. PLoS One 2010, 5:e10286. 68:192–200. 18. Ott J, Kamatani Y, Lathrop M: Family-based designs for genome-wide 42. Matise TC, Chen F, Chen W, De La Vega FM, Hansen M, He C, Hyland FC, association studies. Nat Rev Genet 2011, 12:465–474. Kennedy GC, Kong X, Murray SS, Ziegle JS, Stewart WC, Buyske S: A second- 19. Zhu X, Feng T, Li Y, Lu Q, Elston RC: Detecting rare variants for complex traits generation combined linkage physical map of the . using family and unrelated data. Genetic epidemiology 2010, 34:171–187. Genome Res 2007, 17:1783–1786. 20. Risch N, Giuffra L: Model misspecification and multipoint linkage analysis. 43. Loader C: Local regression and likelihood. New York: Springer; 1999. Hum Hered 1992, 42:77–92. 44. Dette H, Neumeyer N, Pilz KF: A simple nonparametric estimator of a – 21. Mathew SJ, Hansen JM, Merrell AJ, Murphy MM, Lawson JA, Hutcheson DA, strictly monotone regression function. Bernoulli 2006, 12:469 490. Hansen MS, Angus-Hill M, Kardon G: Connective tissue fibroblasts and 45. Li M, Boehnke M, Abecasis GR: Efficient study designs for test of genetic Tcf4 regulate myogenesis. Development 2011, 138:371–384. association using sibship data and unrelated cases and controls. Am J – 22. Fradejas N, Serrano-Perez Mdel C, Tranque P, Calvo S: Selenoprotein S Hum Genet 2006, 78:778 792. expression in reactive astrocytes following brain injury. Glia 2011, 59:959–972. doi:10.1186/1471-2156-14-44 23. West LJ: Defining critical windows in the development of the human Cite this article as: Flaquer et al.: Genome-wide linkage analysis of immune system. Hum Exp Toxicol 2002, 21:499–505. congenital heart defects using MOD score analysis identifies two novel 24. Gao Y, Pagnon J, Feng HC, Konstantopolous N, Jowett JB, Walder K, Collier loci. BMC Genetics 2013 14:44. GR: Secretion of the glucose-regulated selenoprotein SEPS1 from hepatoma cells. Biochem Biophys Res Commun 2007, 356:636–641. 25. Lundell KH, Sabel KG, Eriksson BO: Plasma metabolites after a lipid load in infants with congenital heart disease. Acta Paediatr 1999, 88:718–723. 26. Alanne M, Kristiansson K, Auro K, Silander K, Kuulasmaa K, Peltonen L, Salomaa V, Perola M: Variation in the selenoprotein S gene locus is associated with coronary heart disease and ischemic stroke in two independent Finnish cohorts. Hum Genet 2007, 122:355–365. 27. Zhou P, He A, Pu WT: Regulation of GATA4 transcriptional activity in cardiovascular development and disease. Curr Top Dev Biol 2012, 100:143–169. 28. Territo MC, Perloff JK, Rosove MH, Moake JL, Runge A: Acquired von Willebrand factor abnormalities in adults with congenital heart disease: Dependence upon cardiopulmonary pathophysiological subtype. Clin Appl Thromb-Hem 1998, 4:257–261. 29. Kathiresan S, Voight BF, Purcell S, Musunuru K, Ardissino D, Mannucci PM, Anand S, Engert JC, Samani NJ, Schunkert H, Erdmann J, Reilly MP, Rader DJ, Morgan T, Spertus JA, Stoll M, Girelli D, McKeown PP, Patterson CC, Siscovick DS, O'Donnell CJ, Elosua R, Peltonen L, Salomaa V, Schwartz SM, Melander O, Altshuler D, Ardissino D, Merlini PA, Berzuini C, et al: Genome- wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat Genet 2009, 41:334–341. 30. Roeder K, Bacanu SA, Wasserman L, Devlin B: Using linkage genome scans to improve power of association in genome scans. Am J Hum Genet 2006, Submit your next manuscript to BioMed Central 78:243–252. and take full advantage of: 31. Abecasis GR, Cherny SS, Cookson WO, Cardon LR: GRR: graphical representation of relationship errors. Bioinformatics 2001, 17:742–743. • Convenient online submission 32. O’Connell JR, Weeks DE: PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet 1998, • Thorough peer review – 63:259 266. • No space constraints or color figure charges 33. Abecasis GR, Cherny SS, Cookson WO, Cardon LR: Merlin–rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 2002, 30:97–101. • Immediate publication on acceptance 34. Abecasis GR, Wigginton JE: Handling marker-marker linkage • Inclusion in PubMed, CAS, Scopus and Google Scholar disequilibrium: pedigree analysis with clustered markers. Am J Hum • Research which is freely available for redistribution Genet 2005, 77:754–767. 35. Abecasis GR, Cookson WO: GOLD–graphical overview of linkage disequilibrium. Bioinformatics 2000, 16:182–183. Submit your manuscript at www.biomedcentral.com/submit