bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Deep whole genome sequencing of multiple proband tissues and parental blood reveals
the complex genetic etiology of congenital diaphragmatic hernias
EL Bogenschutz1,
ZD Fox1
A Farrell1,2
J Wynn3
B Moore1,2
L Yu3
G Aspelund4,
G Marth1,2
MY Yandell1,2
Y Shen5,6,7
WK Chung3,8,9
G Kardon1
1 Dept of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
2 USTAR Center for Genetic Discovery, University of Utah School of Medicine, Salt Lake City, UT
3 Dept of Pediatrics, Columbia University Irving Medical Center, New York, NY.
4 Dept of Surgery, Columbia University Irving Medical Center, New York, NY
5 Dept of Systems Biology, Columbia University Irving Medical Center, New York, NY.
6 Dept of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY.
7 JP Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY.
8 Dept of Medicine, Columbia University Medical Center Irving Medical Center, New York, NY
9 Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York,
NY
1 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
ABSTRACT
The diaphragm is a mammalian muscle critical for respiration and separation of the thoracic and
abdominal cavities. Defects in the development of the diaphragm are the cause of congenital
diaphragmatic hernia (CDH), a common birth defect. In CDH, weaknesses in the developing diaphragm
allow abdominal contents to herniate into the thoracic cavity and impair lung development, leading to a
high neonatal mortality. The genetic etiology of CDH is complex. Single nucleotide variants (SNVs),
insertion/deletions (indels), and structural/copy number variants in more than 150 genes have been
associated with CDH, although few genes are recurrently mutated in multiple patients and recurrently
mutated genes can be incompletely penetrant. This suggests that multiple genetic variants in
combination, other not yet investigated classes of variants, and/or nongenetic factors contribute to CDH
susceptibility. However, to date no studies have comprehensively investigated the contribution of all
possible classes of variants throughout the genome to the etiology of CDH. In our study, we used a
unique cohort of four patients with isolated CDH with samples from blood, skin, and diaphragm
connective tissue and parental blood samples and deep whole genome sequencing to assess germline
and somatic de novo and inherited variants of various sizes (SNVs, indels, and structural variants) in
exons, introns, UTRs, and intergenic regions. In each patient we found a different mutational landscape
that included germline de novo, and inherited SNVs and indels in multiple genes. We also found in two
patients an inherited 343 bp deletion interrupting an annotated enhancer of the CDH associated gene,
GATA4, and we hypothesize that this common deletion (found in 1-2% of the population) acts as a
sensitizing allele for CDH. Overall, our comprehensive reconstruction of the genetic architecture of four
CDH individuals demonstrates that the etiology of CDH is heterogeneous and multifactorial.
2 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
AUTHOR SUMMARY
Deep whole genome sequencing of family trios shows that etiology of congenital diaphragmatic hernias
is heterogeneous and multifactorial.
INTRODUCTION
The diaphragm is a mammalian-specific muscle critical for respiration and separation of the abdominal
and thoracic cavities [1]. Defects in diaphragm development lead to congenital diaphragmatic hernias
(CDHs), a common structural birth defect [1 in 3000-3500 births; 2, 3-5] in which the barrier function of
the diaphragm is compromised. In CDH, a weakness develops in the diaphragm, allowing the
abdominal contents to herniate into the thoracic cavity and impede lung development. The resulting
lung hypoplasia and pulmonary hypertension are important causes of the neonatal mortality and long-
term morbidity associated with CDH [5-7]. The phenotype of CDH is highly variable and the clinical
outcomes are diverse, in part due to other associated congenital anomalies [7, 8]. Underlying this
phenotypic diversity is a complex genetic etiology [9].
Genetic variants in many chromosomal regions and over 150 genes have been implicated in CDH.
Molecular cytogenetic studies of individuals with CDH have identified multiple aneuploidies,
chromosomal rearrangements, and copy number variants in different chromosomal regions [9, 10].
Chromosomal abnormalities are found in 3.5-13% of CDH cases and are most frequently associated
with complex cases in which hernias appear in conjunction with other comorbidities [9]. In addition, a
multitude of individual genes have been identified through analyses of chromosomal regions commonly
associated with CDH [e.g. 11], recent exome sequencing studies [12-15], and analyses of mouse
mutants [e.g. 16, 17]. Variants in these genes can lead to either isolated or complex CDH. Altogether,
3 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
the diversity of chromosomal regions and genes associated with CDH demonstrates the heterogeneous
genetic origins of CDH.
Most genetic studies of the etiology of CDH have focused on the role of germline de novo variants. The
preponderance of CDH cases that occur sporadically without a family history of CDH [7] and the low
sibling recurrence rate [0.7%; 18] have argued for the importance of this class of genetic variants.
Indeed studies of trios of CDH-affected children and their unaffected parents have identified de novo
pathogenic copy number variants (affecting at least one gene) in 4% of CDH patients [19]. In addition,
trio studies employing exome or genome sequencing have discovered candidate de novo deleterious
gene variants [12-15]. Nevertheless one of these exome sequencing studies [15] estimated that only
15% of sporadic non-isolated CDH cases can be attributed to de novo gene-disrupting or deleterious
missense variants. Particularly given the small number of cases sequenced to date, most identified
genes recur in none or only a few CDH cases [20]. In addition, variants in particular CDH-associated
chromosomal regions or genes are often incompletely penetrant for CDH or cause subtle subclinical
diaphragm defects [11, 21]. Thus, while de novo copy number variants and variants in individual genes
undoubtedly are important, the genetic etiology of CDH is more complex and likely polygenic and
multifactorial.
Another class of variants that may contribute to CDH etiology are somatic de novo variants. A potential
role of somatic variants has been suggested by the discordant appearance of CDH in monozygotic
twins [18, 22] and the finding of tissue-specific genetic mosaicism in CDH patients [23, 24]. More
recently, our functional studies using mouse conditional mutants found that development of localized
muscleless regions leads to CDH and suggest that in humans somatic variants in the diaphragm may
cause muscleless regions that ultimately herniate [17]. However, to date only one human study has
explicitly examined whether somatic variants contribute to CDH [25].
4 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Although less commonly investigated, inherited variants have been linked to CDH. Analyses of families
with multiple members affected by CDH revealed that autosomal recessive alleles can cause CDH [26-
29]. Other familial CDH cases exhibit an inheritance pattern of autosomal dominance with incomplete
penetrance. For instance, two families have been reported with multiple CDH offspring that inherited
either a large deletion or frameshift variant in ZFPM2, but with unaffected carrier parents [30]. In another
case, monoallelic missense variants in GATA4 were inherited in three generations of one family and
associated with a range of diaphragm defects, but only one family member had symptomatic CDH [21].
Thus, these familial cases demonstrate that inherited variants can contribute to CDH etiology, but these
genetic variants often exhibit incomplete penetrance and variable expressivity. This suggests that some
inherited variants may only lead to CDH in the context of other interacting genetic variants.
While human genetics studies have been essential for identifying candidate CDH chromosomal regions
and genes and providing insights into the genetic architecture underlying CDH, experiments with
rodents have been critical for determining mechanistically how the diaphragm and CDH develop and
explicitly testing whether candidate genes cause CDH. Embryological and genetic lineage experiments
[17, 31-33] have shown that the diaphragm develops primarily from two transient embryonic tissues: the
somites and the pleuroperitoneal folds (PPFs). The somites are the source of the diaphragm’s muscle,
as muscle progenitors migrate from cervical somites into the nascent diaphragm [33]. The PPFs give
rise to the diaphragm’s muscle connective tissue and central tendon [17]. Importantly, the PPFs
regulate the development of the diaphragm’s muscle and control overall diaphragm morphogenesis,
which takes place between embryonic day (E) 9.5 and E16.5 in the mouse [corresponding to E30-60 in
humans; 17, 33]. Engineered mutations in the mouse of candidate CDH genes have definitively
established that these genes are functionally important in CDH [17]. In addition, conditional
5 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
mutagenesis experiments indicate that the PPFs are an important cellular source of CDH as
inactivation of Gata4, WT1, or b-catenin in the PPFs results in hernias [17, 34, 35], while Gata4
inactivation in somites does not affect diaphragm development [17]. Furthermore, these experiments
established that mutations in CDH genes initiate aberrations in the earliest development of the PPFs
[by E12.5 in mouse; 17, 34, 35]. In contrast, mutations in the diaphragm’s muscle lead to diaphragms
that are muscle-less or with thin or aberrant muscle, but have never been found to lead to CDH [17, 36-
45]. Altogether these data indicate that the PPFs are critical for diaphragm morphogenesis and a
cellular source of CDH, while a direct role in CDH for genes expressed in the diaphragm’s muscle is
less clear. Given the importance of the PPFs in CDH, prioritization of genes expressed in the early
mouse PPFs is likely to be an effective strategy for evaluating new candidate CDH genes derived from
human genetic studies.
In this study, we take a novel approach to studying the etiology of CDH. Complementing recent studies
using large cohorts of CDH patients that focus on one class of possible variants - de novo germline
variants [12-15], we comprehensively examine the genome of four CDH patients with multiple tissue
samples and their unaffected parents. Using deep whole genome sequencing and a sophisticated
bioinformatics toolkit, we determine the contribution of germline and somatic de novo and inherited
variants to CDH etiology. Our analysis includes variants of different sizes – single nucleotide variants
(SNVs), small insertions and deletions (indels), and larger structural variants (SVs) – in all genomic
regions (exons, introns, UTR, and intergenic). We prioritize implicated genes not only based on their
frequency in the general population and predicted effect on gene function, but on their expression in
early PPFs and diaphragm muscle progenitors, using a newly generated mouse RNA-seq dataset.
Altogether we reconstruct the diverse genetic architecture underlying isolated CDH in four individuals,
revealing the heterogenous and multifactorial genetic etiology of CDH.
6 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
RESULTS
Strongly supported CDH genes are expressed at high levels in early mouse PPF fibroblasts or
diaphragm muscle progenitors
Studies of CDH patients and mouse models have implicated many genes in the etiology of CDH.
However, not all CDH-implicated genes are equally well-supported by evidence. To systematically and
comprehensively analyze published CDH-implicated genes, we compiled a list of 153 CDH-implicated
genes from three large cohort studies [13-15], previous literature reviews [9, 46] and recently published
studies [47, 48] (TableS1). We included genes that either had mouse functional data or human genetic
data found in more than 1 CDH individual, associated with other developmental disorders or structural
birth defects that co-occur in patients with complex CDH, and/or implicated [by 13] via their interaction
with known CDH genes and expression in the developing diaphragm [as determined by 46]. We ranked
each gene based on the nature of the variants, penetrance, and frequency reported in mouse and
human data (for details, see Materials and Methods). For mouse data, genes in which variants resulted
in herniation of abdominal contents into the thoracic cavity were ranked more highly than those which
simply resulted in muscleless regions or entirely muscleless diaphragms. In addition, genes in which
variants lead to highly penetrant phenotypes were more highly ranked. For human data, genes in which
inherited homozygous or biallelic putative deleterious (as described by original publication) variants or
de novo deleterious variants were found in more than one CDH patient (either more than one patient in
one study or across multiple studies) were ranked more highly than genes with putative deleterious
variants of unknown inheritance or found in only one patient. The scores for mouse and human data
were added to produce a final score and ranking. Twenty seven genes had a score of 10-19 and were
deemed highly likely to contribute to CDH etiology, fifty one genes had a score of 5-9, and seventy five
genes had a score of < 5.
7 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Previous mouse studies of the development of CDH demonstrated that the PPFs are a critical cellular
source of CDH, and many CDH-implicated genes are expressed and required in early PPF cells [17,
34, 35, 49]. To test whether the CDH genes identified in the literature are expressed in the PPFs or
associated diaphragm muscle progenitors, we isolated E12.5 PPF fibroblasts and diaphragm muscle
progenitors from wild-type mice and performed RNA sequencing. We found that nearly all 26/27 (96%)
highly ranked genes are expressed at levels of at least 10 transcripts per million (TPM) reads (which
includes 29% of total transcripts), while 45/51 (88%) of moderately ranked genes and 63/75 (84%) of
lowly ranked genes are expressed at this level (Fig 1A). These data suggest that genes involved in
CDH are expressed at levels of at least 10 TPM in E12.5 PPFs or diaphragm muscle progenitors. In
evaluating the significance of newly identified putative CDH-genes, a finding that such genes are
expressed at > 10 TPM increases confidence that such genes indeed are important to CDH etiology.
Analysis of the highly ranked CDH genes reveals gene families and pathways that likely lead to CDH.
To discover protein networks, we inputted the 153 genes into STRING [50] (using all active interaction
sources except text mining and requiring a minimum interaction score of 0.4) to generate a protein
network (Fig 1B). The two highest scoring genes, GATA4 and ZFPM2 (FOG2), encode a transcription
factor and a co-factor that directly interact with each other [51]. In addition, in the heart GATA4 and
ZFPM2 interact with the protein encoded by the highly ranked gene, NR2F2 [52] and GATA4 interacts
with the protein encoded by the highly ranked gene, TBX5 [53, 54]. Thus, not only are variants in
GATA4, GATA6, ZFPM2, NR2F2, and TBX5 highly implicated in CDH, but GATA4 (GATA6), ZFPM2,
NR2F2, and TBX5 proteins may function together in a complex to regulate diaphragm development.
Another class of highly ranked genes are those involved in the extracellular matrix (EFEMP2, FREM1,
FREM2, FRAS1, FBN1, HSPG2, COL3A1, COL20A1, LAMA5, and ELN) as well as genes that modify
matrix components (MMP2, MMP14, NDST1, and LOX). Also, prominent are genes involved in several
8 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
critical developmental pathways: SLIT-ROBO signaling (SLIT3, ROBO1, ROBO2), Retinoic Acid
signaling (RARB, RARA, STRA6), SHH signaling (GLI2, GLI3, KIF7, DISP1, STK36), and WNT
signaling (CTNNB1, FZD2, PORCN), MET signaling (MET, GAB1, PTPN11), and FGF signaling
(FGFRL1, FGFR2).
Present in the list of CDH-implicated genes are those expressed in myogenic cells and critical for
myogenesis (SIX4, SIX1, EYA1, EYA2, MEF2A, MSC, PAX7, PAX3, MYOD1, MYOG, DES, TNNT3).
However, while these genes are important for myogenesis and variants lead to muscle-less regions or
muscle-less diaphragms, it is not clear that they lead to diaphragmatic hernias. For instance, variants in
Pax3 in mouse lead to completely muscle-less diaphragms, and while the diaphragms are highly
domed, they do not allow abdominal contents to herniate into the thoracic cavity [17]. Thus, while
multiple reviews have included these genes as potentially implicated in CDH [e.g. 9, 13, 46], their true
role in CDH is less clear.
Cohort of 4 CDH probands and their parents with multiple proband tissue samples enable
unique genetic insights into CDH etiology
To gain greater insight into the genetic etiology of CDH, we analyzed four probands with CDH and their
parents with a unique array of tissue samples (Fig 2). Our cohort consists of four unrelated individuals
(male probands 411 and 967 and female probands 716 and 809) who had an isolated CDH in which the
diaphragmatic hernias were encased by a connective tissue sac. Reflecting the increased prevalence
of left versus right CDH [7], three probands (411, 809, and 967) had left CDH and one proband (716)
had right CDH. Three of the probands (411, 809 and 967) had hernias < 50% of chest wall devoid of
diaphragmatic tissue, and one (716) had a large hernia (> 50% of chest wall devoid of diaphragmatic
tissue). From each CDH proband, the sac was surgically removed during corrective surgery and saved
9 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
and skin biopsies and blood draws taken. In addition, blood samples were taken from each parent.
Altogether five samples (sac, skin, and blood from CDH proband and blood from 2 parents), from each
family were paired-end, whole genome sequenced to an average coverage > 50 reads across each
genome. Using Peddy [55], for each pentad of samples we confirmed sample quality, sex, relatedness,
and reported ancestry (Table S2). The DNA samples from the connective tissue sac uniquely enable us
to test the hypothesis, suggested by our previous mouse functional studies [17], that somatic variants in
the diaphragm’s connective tissue contribute to the etiology of CDH. In addition, the high depth of
whole genome sequencing and availability of new bioinformatic tools enable us to examine a wide
spectrum of potential genetic variants including SNVs, indels, and SVs in coding and non-coding
regions of the genome.
Analysis of germline de novo variants identifies THSD7A as a novel CDH candidate gene
Because most CDH patients have no family history of CDH, previous human genetic studies on the
etiology of CDH have concentrated on de novo variants [12-15]. Using blood samples from CDH
probands and their parents, these studies have inferred that variants present in the proband and not the
parents arose in the egg or sperm giving rising to the child and thus designated the discovered variants
as germline de novo variants. However, blood samples may also harbor somatic variants that arose
later in development and such somatic variants could be erroneously designated as a germline variant.
Because we have samples from three different tissues of each CDH proband as well as the parents,
variants present in the proband diaphragm sac, skin, and blood but not present in the parents are
confidently identified as germline (or at least early embryonic) variants. To discover such germline de
novo variants, we analyzed whole genome sequences from the pentad of samples using the variant
calling tool RUFUS (https://github.com/jandrewrfarrell/RUFUS). RUFUS subdivides each genome into
a series of k-mers and aligns k-mers from samples of interest to control samples to identify unique
10 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
SNVs and INDELs in the sample of interest. CDH proband genomes were compared with parental
genomes, and variants present in > 20% of reads (with GQ scores >20 and read depths > 15) of
proband diaphragm sac, skin, and blood genomes but not in parental genomes were designated as
germline de novo variants and confirmed by visual inspection in the Integrated Genome Viewer [56].
Germline de novo SNVs and INDELs were found in all four CDH probands (shown in intersection of
diaphragm sac, skin, and blood of 4 CDH probands in Fig 3A and detailed in Table S3). CDH probands
have 48-116 germline de novo variants, falling close to the 70 de novo SNVs expected in each
newborn in an average population [57, 58]. As expected, the largest number of de novo germline
variants were in intergenic regions (24-63 variants) and fewer were in UTR or intronic regions (20-50
variants). In the 4 probands, germline de novo non-synonymous coding (exon) variants in six genes
(PEX6, SCARB1, OLFM3, ZNF792, AR, and THSD7A) were discovered. De novo coding variants
impacting these genes have not been found in other CDH patients, and these genes are not located
within CDH-associated chromosomal regions. All of these coding variants are rare (gnomAD frequency
< 0.0001). The PEX6 variant is a damaging frameshift variant, while the THSD7A variant is a damaging
nonsense variant (stop gain, p.Trp260Ter). SCARB1, OLFM3, ZNF792, and AR are all missense
variants, but only the OLFM3 and ZNF792 variants are predicted to be damaging by PROVEAN (which
takes into account conservation of homologous sequences [59]). THSD7A is the only one of these four
genes (PEX6, THSD7A, OLFM3, and ZNF792) with damaging variants predicted by ExAC Pli [60] to be
haploinsufficient (intolerant of loss of one allele) and also substantially expressed in mouse PPFs or
diaphragm muscle progenitors (TPM of 7.0; Fig 3B). Thus THSD7A is a promising new candidate gene
in which variants lead to CDH. THSD7A (Thrombospondin Type I Domain Containing 7A) is a protein
containing 10 thrombospondin type I repeats and through its co-localization with avb3 integrin and
paxillin has been shown to promote endothelial cell migration during development [61-63]. In diaphragm
11 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
development, THSD7A may similarly regulate diaphragm vascularization or it may regulate PPF
fibroblast migration, which is essential for diaphragm morphogenesis [17].
Somatic de novo variants are present in diaphragm, skin and blood, but diaphragm somatic
variants are unlikely to contribute to CDH in the probands analyzed
Our previous mouse genetic functional study of CDH [17] suggested the hypothesis that somatic
variants in the PPF-derived muscle connective tissue contribute to CDH etiology. Our cohort of CDH
proband samples that include the diaphragm sac, which is composed of PPF-derived connective tissue,
uniquely allow us to test this hypothesis. In addition, because the samples were collected within a few
weeks of birth, we are uniquely positioned to determine the background frequency of somatic variants
in the skin and blood prior to exposure of any potential environmental mutagens (e.g. UV light).
To identify somatic de novo variants, diaphragm sac, skin, or blood CDH proband genomes were
compared via RUFUS with the other 2 genomes derived from the same CDH proband, and variants
present in > 20% of reads (with GQ scores >20 and read depths > 15) in only 1 or 2 tissues of each
proband genome and not present in parental genomes were designated as somatic de novo variants.
Variants were only included in which all three tissue samples had at least 15X coverage and Phred
quality score [64, 65] above 20. Variants were designated as tissue-specific if > 20% alternate variants
reads were present in one or two tissue samples and no alternate reads were present in the other
samples.
Somatic de novo variants were found in all four CDH probands (Fig 3, Table S3). Across both
individuals and tissues, the alternative allele read depth of somatic variants was significantly lower than
germline de novo variants (Fig S1) [66-68]. This indicates that the somatic de novo variants are present
12 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
in a subset of the sampled cells; in the diaphragm, this reflects either that not all PPF-derived
connective cells harbor the variant and/or that the sampled tissue includes several cells types (e.g.
connective tissue fibroblasts and endothelial cells ) of which only one cell type (e.g. fibroblasts) harbors
the variant. An analysis of the mutational spectrum (Fig S2A) reveals that the spectrum was generally
similar across all probands (although proband 411 and 967 harbor a larger number of deletions) and,
as expected, transitions (C/T or A/G changes) are more frequent than transversions (C/G, C/A, A/T,
A/C changes). The mutational spectrum also did not vary widely amongst diaphragm, skin, and blood
(Fig S2B).
Diaphragm sac, skin, and blood were all found to harbor private variants but no variants shared
between two tissues (Fig 3A, Table S3). Intergenic variants were the most common class of somatic
variants with variants in UTR and introns as the next most common. Somatic coding variant were only
found in the blood of Proband 411 (missense variant in MIR4717) and in the skin of Proband 809
(missense variants in DHX57 and TNFAIP8L3 and a synonymous variant in CXorf57). In all four
probands, the diaphragm contained somatic variants, but none of these were in coding regions and
variants in UTR and intron regions did not overlap any annotated enhancers. Thus, while somatic
variants were found in the diaphragm, none of the variants are likely to contribute to the etiology of
CDH in these children.
One striking feature of our analysis was the extremely high number of somatic variants in the
diaphragm sac and skin, but not blood, of proband 809. This high number of variants was not an artifact
of technical issues, as the samples passed all quality control filters of the standardized Utah Genome
Project pipeline (see Materials and Methods). Furthermore, not only does this child harbor more
somatic variants (Fig 3A), but the alternate allele frequency is significantly higher than that found in the
somatic variants in the other probands (Fig S1). This child also harbors a missense germline de novo in
13 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
the androgen receptor gene AR. AR has been well characterized as a tumor suppressor gene,
specifically in prostate cancer [69], and shown to be critical in DNA repair through activation of
transcriptional targets [70, 71]. However, arguing against a causative role of AR is that the particular AR
missense variant in proband 809 is not predicted by Provean to be damaging.
Our analysis of somatic variants also provides insights into how representative variants in the blood are
of germline variants. Blood is the most commonly sampled human tissue and variants in the blood are
typically designated as “germline” variants. However, our analysis shows that blood harbors somatic
variants that would typically be erroneously tallied as germline variants. In the four children analyzed,
an average of 1.5% (range of 1-4%) of variants in the blood are unique to blood. Thus our analysis
suggests that in general 1.5% of variants found in the blood are not germline variants, but instead
somatic variants.
Inherited variants in genes regulating muscle structure and function potentially contribute to
the etiology of CDH in one proband
Damaging variants inherited from parents may also be a genetic source of CDH. As the parents of the
four probands investigated do not have CDH, it is unlikely that an inherited variant of one allele would
directly cause CDH, while homozygous or biallelic variants, in which each parent contributes a gene
damaging allele, are more likely to contribute to CDH. Given this, recessive homozygous and biallelic
inherited variants were identified and prioritized using the GATK variant calling pipeline and the variant
prioritizing tool VAAST3 [72, 73] (Table S4). VAAST identifies and ranks genes based on whether they
are predicted to be damaging based on protein impact and rare compared against a mixed control
population from 1000 Genomes phase 3 [74]. Given that the significance level VAAST assigns to a
gene is limited by the size of the background population, genes were further prioritized as to their
14 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
likelihood to be involved in CDH etiology. Pseudogenes [75], highly mutable genes [76, 77], and genes
with multiple alleles in the variant region reported in gnomAD [78] were removed and genes expressed
at higher than 10 TPM in E12.5 mouse PPFs or diaphragm muscle progenitors prioritized (Fig 4A, B
and Table S4).
Using these criteria, we identified 13 genes with biallelic or homozygous variants in the three CDH
probands (Fig 4A and Table S4 with VAAST, CADD, PROVEAN impact, and ExAC Pli scores). None
of these genes has been identified in previous CDH studies. Of these 13 genes, 3 genes (MPEG1,
MYOF, SACS) harbor 1 deleterious frameshift allele and 1 missense allele predicted by Provean to be
neutral; 5 genes (UPF3A, CCDC136, NOP9, ZNF646, SH3PXD2A) harbor 1 missense allele predicted
to be deleterious and 1 missense allele predicted to be neutral by Provean; and 1 gene (ADAMTS2) is
homozygous for an inherited in-frame insertion, predicted by Provean to be neutral, but predicted by
ExAC to be intolerant of loss-of-function (Pli score of 0.99). Because each of these genes has at least
one predicted functional allele, these genes are unlikely to directly cause CDH, but may act as
sensitizing alleles that act in combination with other genetic variants produce CDH.
Proband 716, with a large right hernia, harbored four genes – ALG2, HRC, AHNAK, and MYO1H – in
which both inherited maternal and paternal alleles were frameshift null or missense predicted damaging
alleles and therefore more likely to contribute to CDH etiology. All four genes contain variants that are
rare compared to the background population (1000 Genomes Phase 3), based on the probabilistic
framework underlying VAAST, and all variants are rare (AF < 0.01) in gnomAD (Table S4).
Interestingly, three of these genes are involved in skeletal muscle function. ALG2 encodes an a1,3-
mannosyltransferase that catalyzes early steps of asparagine-linked glycosylation and is expressed at
neuromuscular junctions [79]. Human ALG2 variants have been found to affect the function of the
neuromuscular junction and mitochondria organization in myofibers, leading to congenital myasthenic
15 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
syndrome (fatigable muscle weakness) and mitochondrial myopathy [79-81]. HRC encodes a histidine-
rich calcium-binding protein that is expressed in the sarcoplasmic reticulum of skeletal, cardiac, and
smooth muscle [82] and in the heart has been shown to regulate calcium cycling [83, 84]. AHNAK
encodes a large nucleoprotein that acts as a structural scaffold in multi-protein complexes [85]. In
particular, AHNAK interacts with dysferlin, which is a transmembrane protein critical for skeletal muscle
membrane repair and loss of dysferlin causes several types of muscular dystrophy [86, 87]. AHNAK is
proposed to play a role in dysferlin-mediated membrane repair [86, 87]. The finding of deleterious
biallelic variants in these three genes suggests that aberrations in the diaphragm muscle’s
neuromuscular junctions, mitochondria, calcium handling, or membrane integrity contributes to the
development of CDH in this child. In addition, proband 716 has one predicted null and one missense
deleterious allele in MYO1H. MYO1H is a motor protein involved in intracellular transport and vesicle
trafficking, expressed in retrotrapezoid neurons critical for sensing C02 and regulating respiration, and
variants in MYO1H cause a recessive form of central hypoventilation [88]. While unlikely to directly
contribute to CDH, the MYO1H variants’ potentially deleterious effect on neuronal regulation of
respiration would have a detrimental impact on a CDH child.
An inherited deletion in intron 2 of Gata4 in two probands is a candidate common sensitizing
allele for CDH
Another important potential source of genetic variants underlying CDH are structural variants [SVs; 47]
and include > 50 bp insertions, deletions, inversions, and translocations. To identify structural variants
that could contribute to the etiology of the four CDH probands, we used the Lumpy smooth pipeline
[89], which uses three copy number variant callers: cn.MOPS [90], CNVkit [91] and CNVnator [92]. We
then determined whether identified SVs were de novo or inherited in the CDH probands using the tool
GQT [93] and confirmed SVs visually using IGV [56]. Though de novo SVs were discovered in multiple
16 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
probands (Table S5), all were common (allele frequency > 0.1) in a large SV dataset of 14,623
ancestrally diverse individuals [94] or located in an intergenic region. Thus, no discovered de novo SVs
are likely to contribute to CDH etiology.
To discover inherited SVs potentially contributing to CDH, we analyzed variants within chromosomal
regions highly associated with CDH [10]. Multiple candidate inherited SVs were identified (with allele
frequencies < 0.1 and not found in intergenic or repetitive regions, Table S5), but in no case were the
probands homozygous for the SV or biallelic with any of the identified SNVs. However, one SV, a 343
base pair deletion within the second intron of the highly ranked CDH-associated transcription factor,
GATA4 [Table S1; 16, 17, 21], was discovered in two probands (Fig 5) and subsequently confirmed by
PCR. In proband 411 the deletion is paternally inherited (Fig 5A), while in proband 967 the deletion is
maternally inherited (Fig 5D).
Comparison of the 343 bp deleted region in human with the orthologous region in mouse reveals that
this region lies within an enhancer (element 2205) annotated in the VISTA enhancer database [Fig 5E;
95]. This enhancer drives reporter expression at E10.5 in the developing heart in a domain similar to
the GATA4 expression domain [96, 97] and demonstrates that this region is a bona fide GATA4
enhancer. Although yet to be tested, this enhancer may also be important for GATA4 expression in the
PPF cells that are critical for diaphragm development. As a parent of proband 411 and 967 possesses
the deletion and does not have any reported CDH, this deletion alone is unlikely to cause CDH in the
two probands. However, this deletion may sensitize the two probands to develop CDH.
Given that two of the CDH patients in this cohort harbor the 343 bp deletion, this deletion within the
GATA4 intron 2 enhancer may be a sensitizing variant enriched in CDH individuals. To test this, we
used PCR to identify the presence of the 343 bp deletion on a small cohort of 141 CDH individuals
17 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
(including the 4 CDH individuals from this study) for which DNA was available. In addition, we
compared the frequency of this deletion in the CDH cohort with that of a larger control population
(which includes an ancestrally diverse population, but also includes individuals with common
cardiovascular, neuropsychiatric, and immune- related diseases) with SVs discovered using a similar,
Lumpy software-based pipeline [94]. We found that CDH individuals had an allele frequency of 0.98%
(4 heterozygous individuals of 204 tested individuals, of which 141 had isolated CDH and 63 had
complex CDH) and in the larger control population an allele frequency of 1.9%. These data suggest that
this deletion has a roughly similar frequency of 1- 2% in CDH and control populations and is a common
variant in the population. This relatively common 343 bp deletion may be a sensitizing variant that, in
conjunction with other variants, contributes to the genetic etiology of CDH.
DISCUSSION
In this study, we comprehensively assessed the contribution of germline and somatic de novo and
inherited SNVs, indels, and SVs to CDH etiology and reconstruct for the first time the genetic
architecture of four individuals with isolated CDH. Our ability to perform such a comprehensive analysis
of CDH, identifying genetic variants of different sizes (SNVs, indels, and SVs) in various genomic
regions (exons, introns, UTR, and intergenic) with different inheritance patterns (inherited and germline
and somatic de novo) was enabled by three factors. First, the unique collection of diaphragm sac, skin
and blood samples from individuals with isolated CDH and blood from their parents allowed us to 1.
confidently identify germline variants (present in all three proband samples and absent in parent
samples), 2. discover de novo somatic variants (present in sac, but not in other samples) in the
diaphragm’s connective tissue, which has previously been shown to be an important cellular source of
CDH [9], and 3. determine which variants are inherited (present in all three proband samples and
present in at least one parent). Second, whole genome sequencing DNA samples to an average of >
18 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
50X coverage was essential for identifying non-coding DNA regions which contribute to CDH etiology
and positively calling somatic variants, which have relatively low numbers of reads. Finally, we
employed a collection of computational tools that uses Illumina pair-end, whole genome sequences to
discover de novo variants (RUFUS), identify and prioritize inherited variants (VAAST), and discover
SVs (Lumpy pipeline). Altogether, the unique cohort of samples, high-depth whole genome sequences,
and computational toolkit were essential to comprehensively interrogate the genetic architecture of the
four CDH individuals. Our pipeline lays the groundwork for future, larger scale studies investigating the
genetic etiology of CDH. In addition, our methodology should be useful for investigating other birth
defects with complex genetic etiologies, such as congenital heart defects [98].
Our analysis of germline de novo variants revealed a potential pitfall of using blood samples to infer
germline de novo variants and also identified a new candidate CDH-causative gene. Previous studies
searching for de novo variants that cause CDH have used DNA from blood samples and then inferred
that such variants are germline de novo variants [12-15]. However, while blood is the most readily
available source of DNA, variants in its DNA may not have originally arisen in the germline, but instead
may have arisen later in somatic cells. Our cohort, with three proband-derived tissue samples, allows
us to explicitly test this alternative hypothesis. We found that an average of 1.5% of the de novo
variants in the proband’s blood were not germline in origin (not present in all three proband tissues) and
is a similar rate as found in other recent papers [58, 99]. In fact, in proband 411 a blood-specific
somatic variant in MIR4717 would have been classified as a germline variant. Thus, researchers should
be cautious about inferring that all de novo variants in the blood are germline in origin. With DNA
samples from three tissues from each CDH proband, we are able to confidently identify germline de
novo variants because such variants will be present in all proband tissues, but not in parental samples.
We found an average of 68 germline de novo variants per proband, and this is similar to the 70
germline de novo variants found in the average population [58, 100-103]. Of these, only a few are in
19 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
gene-coding regions and only one of these genes, THSD7A, harbors a deleterious variant and is
predicted to be haploinsufficient and thus is a strong candidate CDH-causative allele. A limited number
of studies have investigated the function of Thsd7a in mice and zebrafish, focusing on its role in
regulating endothelial cell migration [61-63]. As Thsd7a is expressed in PPF cells in mouse and
migration of PPF cells is critical for normal diaphragm development and aberrant in CDH [17, 49], an
intriguing hypothesis is that Thsd7a regulates PPF migration.
The largely discordant appearance of CDH in monozygotic twins [22, 104] and our previous mouse
genetic studies [17] suggested that somatic variants in the diaphragm’s connective tissue may be a
genetic feature of some CDH individuals. In particular, our mouse conditional mutagenesis experiments
suggested that the connective tissue fibroblasts, which ultimately form a sac around some
diaphragmatic hernias, may harbor somatic variants [17]. Previous studies of the role of somatic
variants in other structural birth defects, such as congenital heart defects, have relied on blood, saliva,
or skin samples and inferred that low frequency (< 30%) alternate alleles represent somatic variants
that may potentially contribute to birth defect etiology [e.g. 105]. In our study, we have DNA directly
derived from the proband tissue, the PPF-derived diaphragm connective sac, hypothesized to harbor
somatic variants. Because we also have DNA derived from skin and blood proband samples, we were
able to positively identify any alternate allele, regardless of its frequency, present in the sac, but not in
skin or blood (or parental blood), as a somatic variant. Using similar logic, we were able to identify
somatic variants in blood and skin. To confidently identify these somatic variants, we conservatively
included alleles present in at least 20% of the reads (and not present in the other tissues). Using this
strategy, we identified in three of the probands 1-7 private somatic variants in the sac, skin, and blood.
Proband 809 has an aberrantly high number of somatic variants, but we currently have no mechanistic
explanation (e.g. variants in DNA repair genes) for this individual’s high somatic mutational load. In all
probands, the frequency of somatic alternate alleles was significantly lower than germline alleles,
20 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
indicating that these variants are present in a subset of sampled cells; in the diaphragm sac, either not
all PPF-derived connective tissue fibroblasts harbor the variants and/or other non-fibroblast cells are
included in the sample. Because no somatic variants were shared between two of the proband tissues,
all of the somatic variants must have arisen after the developmental divergence of diaphragm
connective tissue, skin, and blood. Importantly, no variants are shared between blood and diaphragm.
Thus, blood samples are unlikely to be informative about somatic variants in the diaphragm. Our
analysis of the diaphragm revealed multiple somatic variants in the diaphragm’s connective tissue, but
none in coding or annotated enhancers and so these somatic variants are unlikely to be deleterious. A
previous study [25] also found no evidence of damaging somatic variants, although this study examined
tissue sampled around the periphery of the herniated region and so did not specifically sample the
connective tissue that mouse genetic studies [17, 34, 35] predict to be a cellular source of CDH. While
our study did not find potentially damaging somatic variants in the diaphragm’s connective tissue, we
have established an effective discovery strategy. A more definitive test of the role of somatic variants in
CDH etiology awaits a future larger study.
The role of inherited variants in the etiology of CDH has received relatively little attention. Using
VAAST, we identified multiple compound heterozygous or homozygous inherited, presumably
recessive, SNVs and indel variants in all four probands. However, only a small number of these
variants were rare, predicted damaging, and in genes expressed in mouse PPFs or associated
diaphragm muscle progenitors. Of particular note is proband 716 who inherited multiple damaging
variants, including variants in ALG2, HRC, AHNAK, and MYO1H. AHNAK is involved in skeletal muscle
membrane structure and repair [86, 87], while HRC is functionally important in regulating calcium
cycling in muscle [82-84]. ALG2 regulates the function of the neuromuscular junction and organization
of mitochondria in myofibers and human variants in ALG2 have been linked to congenital myasthenic
syndrome and mitochondrial myopathy [79-81]. Together, variants in these three genes may weaken
21 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
muscle structure and function and lead to CDH. This is a surprising finding as mouse genetic studies
have found that while variants in muscle-specific genes lead to muscle-less or diaphragms with
aberrant muscle, none cause CDH (see Table S1). However, the inherited damaging variants in three
muscle-related genes in proband 716, with an unusually large hernia, suggests that genetic alterations
in muscle may lead to CDH. Another interesting aspect of the genetic profile of proband 716 is the
variant in MYO1H. MYO1H regulates the function of neurons critical for sensing C02 and respiration
[88], and so this MYO1H variant may further compound the respiratory issues introduced by CDH.
In our search for de novo or inherited SVs that could contribute to CDH etiology, we discovered in two
probands a 343 bp deletion in intron 2 of GATA4, a highly ranked CDH-associated gene, that disrupts
an annotated enhancer regulating GATA4 expression [106]. We hypothesize that disruption of this
enhancer leads to lower levels of GATA4 expression. GATA4 has notably dosage sensitive effects on
heart development [107] and likely also on diaphragm development. Given that this deletion is inherited
from unaffected parents and has an allele frequency of 1-2% in the general population, we hypothesize
that this deletion is a relatively common SV that acts as a sensitizing allele for CDH. We hypothesize
that decreased expression of GATA4 expression resulting from the 343 bp deletion confers CDH
susceptibility and in the context of other genetic variants (or environmental factors) leads to CDH. To
test this hypothesis, future experiments in our lab will test in mice whether this 343bp region regulates
GATA4 expression in the PPFs and whether a deletion in this region sensitizes mice to develop CDH.
Our comprehensive analysis of the genomes of four individuals with isolated CDH allows us to
reconstruct the diverse genetic architectures underlying CDH (Fig 6). Proband 809 is the most
enigmatic of the four cases. She harbors no obvious candidate genetic variants leading to CDH, as she
has no predicted damaging inherited recessive variants and one germline de novo missense variant in
AR that is predicted to be neutral. Her genome is particularly unusual in that it contains an abnormally
22 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
high somatic mutational load in her skin and diaphragm connective tissue, but the variants in the
diaphragm do not affect coding or annotated enhancer regions. The source of large number of somatic
mutations is unclear as she harbors no mutations in DNA repair genes. Proband 411 harbors the 343
bp intron 2 GATA4 deletion that we hypothesize acts as a sensitizing CDH allele, but the collaborating
variants that drive CDH are unclear. The germline de novo frameshift variant in PEX6 is deleterious but
PEX6 is not predicted to be haploinsufficient, and only one of the inherited variants in UPF3A is
predicted deleterious and UPF3A is also not predicted to be haploinsufficient. Proband 716 differs from
the other probands in that she has inherited multiple rare and damaging variants in myogenic genes
that likely lead to CDH. Notably, while three of the probands in our cohort have small left hernias, she is
the only proband who has a large (where > 50% of the chest wall is devoid of diaphragm tissue) right
hernia. Potentially, the origin of her atypical large right hernia [most hernias are found on the left side;
7] is linked to the variants in myogenic genes as opposed to genes expressed in PPF fibroblasts.
Proband 967 is the individual for which we have the strongest hypothesis about the genetic origin of
CDH. This individual harbors the 343 bp intron 2 GATA4 deletion that we postulate acts as a sensitizing
CDH allele and a rare germline de novo damaging missense variant in the haploinsufficient-intolerant
gene, THSD7A. These two variants suggest the hypothesis that during the early development of
proband 967 the PPFs of his nascent diaphragm were prone to apoptosis, unable to proliferate
sufficiently [as GATA4 promotes proliferation and survival; 17], and had defects in migration (due to low
expression levels of THSD7A) that ultimately lead to defects in diaphragm morphogenesis and CDH.
In summary, our comprehensive analysis demonstrates that the genetic etiology of CDH is
heterogeneous and likely multifactorial. A challenge for future studies will be to determine whether,
despite a diverse array of initiating genetic variants, a small set of molecular pathways are consistently
impacted in CDH. Identification of a few key molecular pathways common to all CDH patients would be
critical for designing potential in utero therapies to rescue or minimize the severity of herniation.
23 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
MATERIALS AND METHODS
Ranking of CDH-implicated genes
CDH genes reported in the literature were gathered from recent reviews [9, 46], large human patient
cohort studies [12-15, 47] and recent studies [47, 48]. Original publications implicating genes in CDH
were checked for the level of evidence (i.e. variants likely impacting gene function or deleterious as
described in original studies) and the following ranking system used.
Mouse Data Ranking: 10 = CDH, > 80% frequency; 9 = CDH, 40-60% frequency; 8 = CDH, < 40%
frequency; 7 = muscle-less patches, muscle-less diaphragm, thin diaphragm, > 80% frequency; 6 =
muscle-less patches, muscle-less diaphragm, thin diaphragm, 40-60% frequency; 5 = muscle-less
patches, muscle-less diaphragm, thin diaphragm, < 40% frequency. Human Data Ranking:9 = inherited
compound heterozygous or homozygous deleterious variant (2 alleles in 1 gene) + de novo deleterious
variant > 1 patient; 8 = inherited compound heterozygous or homozygous deleterious variant (2 alleles
in 1 gene), > 1 patient; 7 = de novo deleterious variant (1 allele in 1 gene), > 1 patient; 6 = inherited
compound heterozygous or homozygous deleterious variant (2 alleles in 1 gene), 1 patient; 5 =
inherited deleterious variant (1 allele in 1 gene), > 1 patient; 4 = unknown inheritance deleterious
variant (1 allele in 1 gene), > 1 patient; 3 = de novo deleterious variant (1 allele in 1 gene), 1 patient; 2
= inherited deleterious variant (1 allele in 1 gene), 1 patient; 1 = unknown inheritance deleterious
variants (1 allele in 1 gene), 1 patient. The final ranking of each gene was determined as the sum of the
mouse data and the human data ranking and constitutes the order of genes found in Table S1. CDH-
implicated genes were queried for expression in the mouse E12.5 PPF RNA-seq dataset and
expression plotted using the ggplot2 R package [108]. Gene networks within the list of CDH-associated
genes were visualized using STRING [50], visualizing high and medium levels of evidence to connect
gene nodes, using all evidence (except the “text mining” option was not used).
24 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Patient sample collection and DNA isolation
Participants were enrolled in a Columbia University IRB protocol AAAB2063 and provided informed
consent/assent for participation in this study as previously described [15]. Whole genome sequencing
and data analysis at the University of Utah was covered by IRB protocol 00085165. All four probands
had isolated CDH. Probands 411, 809, and 967 had left hernias < 50% of chest wall devoid of
diaphragmatic tissue, while proband 716 had a large (> 50% of chest wall devoid of diaphragmatic
tissue) right hernia. The self-reported ancestries are probands 411 and 809 are European (non-
Hispanic), proband 716 is Asian, and proband 967 is African, Hispanic, and European. The 4 CDH
probands and parents are included in previous exome sequencing analysis [14].
Whole Genome Sequencing
DNA was prepared for sequencing using TruSeq DNA PCR-free libraries (Illumina) and run on the
Illumina HiSeq X Ten System at a minimum of 60× median whole-genome coverage
Whole genome alignment, variant calling and quality checks
Genomic sequencing reads were aligned with BWA-MEM [109] against human GRCh37 reference
genome (including decoy sequences from the GATK resource bundle). Aligned BAM files were de-
duplicated using samblaster [110]. Base quality score recalibration and realignment of small insertions
and deletions was performed with the GATK package [111]. Alignment quality was checked with
samtools [112] ‘stats’ and ‘flagstats’ functions. Variants (SNVs, indels) were called with GATK
Haplotypecaller [111]. Sample relatedness, sex and reported ancestry were confirmed with Peddy [55].
25 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Germline and somatic de novo SNV and indel variant analysis
De novo SNVs and indels were called with RUFUS (https://github.com/jandrewrfarrell/RUFUS) using
standard parameters (25 length k-mers and 40 threads) and realigning k-mers to the human GRCh37
reference genome. Each proband tissue sample was run against both parents as control samples to
call germline de novo variants, and all possible combinations of one proband tissue against the other
two were run to call tissue-specific variants. Variants flagged “DeNovo” by RUFUS were retained and
any variants found in all three proband tissues were called germline de novo. Variants were further
filtered, and only variants were retained with GQ scores >20, read depths >15 at the variant site, and
the variant was found in > 20% of reads in the sample using annotations from the GATK
haplotypecaller [111] via a python script written with the cyvcf2 package [113] and annotations in IGV
[56]. Variant quality, alignment, and sample specificity were confirmed visually in IGV [56]. 5-20bp
indels located at the start or end of single nucleotide repeats were filtered out, as these are potential
false positives due to alignment error. Genetic locations were annotated with the UCSC Known Gene
Annotation [114]. Noncoding variants were intersected against a bed file of enhancers from the VISTA
Enhancer Database [106] using bedtools [115] to determine whether variants were within annotated
enhancers. Coding variant predicted damage was determined by both Provean [59] and CADD scores
[116] and allele frequency within a large, healthy population was determined by gnomAD [78]. Genes
containing coding variants were annotated as intolerant for loss-of-function with ExAC PLi scores [60].
Inherited SNV and indel variant analysis
The VAAST3 pipeline was used to call predicted damaging inherited recessive variants [73, 117]. The
pipeline includes the following steps. First, GATK haplotypecaller [111] identified-variants were
decomposed and normalized using VT [118], and variants with a GQ score < 30 or with > 25% of the
samples in the VCF file not genotyped (“no-call”) were filtered out using vcftools [119]. Variant effect
predicted annotations were added to the filtered VCF file using VEP [120] with version 83 of the hg19
26 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
vep-cache. As a background population, a VCF file with variants from 1000 Genomes phase 3 [74]
were run through the same workflow. A VVP background database was made from the 1000 Genome
filtered variants using the build_background function of VVP [121]. This background database and the
filtered VCF file of variants discovered in this cohort were used as inputs to VVP to prioritize cohort-
discovered variants, which were then passed to VAAST3 to be scored and ranked. Blood samples from
parents and the proband were used in VAAST3’s trio recessive inherited model. Genes with a p-value >
0.05 from VAAST3 were filtered out. The remaining genes were annotated with the predicted effect
from VEP, the location of the variant from VVP, and the parental origin of the allele from VVP. Ensembl
IDs given in the VAAST3 output were converted to gene names using GeneCard [122]. Genes were
then filtered out if 1) the genes were expressed at < 10 TPM in E12.5 mouse PPFs (human gene
names were converted to mouse orthologs using OrthoRetriever
(http://lighthouse.ucsf.edu/orthoretriever/)), 2) if they were a pseudogene annotated by GENCODE [75],
3) if they belonged to a highly mutable gene family, 4) if the allele called is found in > 0.01 of individuals
in gnomAD, or 5) there are multiple alleles at the site in gnomAD. Variant impact was predicted by
Provean [59], CADD [116] and ExAC pli scores [60], then genes were ranked based on the number and
severity of damaging alleles.
Inherited and de novo structural variant analysis
SVs were called with the Lumpy Smooth pipeline [89]. Regions with possible copy number variants
based on read depth were called using CNVkit [91], CNVnator [92] and CN.MOPS [90] with sample
BAM files. Outputs from the three tools were converted into BEDPE files and merged together as one
“deletions” file (evidence of a decrease in copy number) and one “duplications” (evidence of an
increase in copy number) per sample. The copy number call BEDPE files and aligned BAM files were
used as input to Lumpy [89] and called with the lumpy_smooth script (in the LUMPY scripts directory).
Lumpy output variants were used as inputs to SVTYPER [123] to annotate each variant as an insertion,
27 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
deletion, translocation or inversion within a VCF file. De novo and inherited SVs were identified with
GQT [124] using the LUMPY-created VCF file and a PED file describing family relations. We kept only
variants with either spilt or discordant supporting read counts above 8 and but below 400 (to exclude
noisy regions with high read mapping). SVs were confirmed in IGV [56], keeping variants with visible
evidence in read coverage change and discordant reads. de novo and inherited structural variants were
queried in the complete VCF file from [94] to determine allele frequency in the general population, and
we filtered out SVs with an allele frequency > 10%. Variants located in intergenic regions, overlapping
annotated repetitive element or elements [125] or homozygous in parental DNA were excluded. The
discovered 343 bp deletion was confirmed with PCR using primers: forward, 5'-
TTCCTCTACCATTGGGCGTTT-3' and reverse, 5'-AGGTAGTACGGCTGACTTGC-3'.
E12.5 PPF RNA sequencing
Pleuroperitoneal folds (PPFs) were isolated from wild-type E12.5 mouse embryos by cutting embryos
just above the hindlimbs and below the forelimbs and removing the heart and lungs cranially, leaving
the trunk with attached nascent diaphragm. The PPFs were manually dissected and stored in RNAlater
(Thermo Fisher, #AM7020) at -80°c. RNA was isolated using a RNeasy Micro Kit (Qiagen, #74004),
RNA quality confirmed with RNA TapeStation ScreenTape Assay (Agilent, # 5067-5576), and
sequenced using TruSeq Stranded mRNA Library Preparation Kit with polyA selection (Illumina) and
sequenced using HiSeq 50 Cycle Single-Read Sequencing v4 (Illumina) through the High-Throughput
Genomics and Bioinformatic Analysis Shared Resource at Huntsman Cancer Institute at the University
of Utah. Two biological replicates were sequenced (two pooled PPF pairs per replicated) and analyzed.
Sequencing reads were aligned to the mouse genome (mm9) with STAR [126], using standard
parameters. featureCounts [127] was used to count reads per gene and then normalized by TPM
(transcript per million) using R.
28 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Figure creation and statistics
Figures 1A, 3B, 4B, and S1 were created with R package ggplot2 [108]. Fig S2 was created with
Prism 7 (Graphpad). Figures 2, 3A, 4A and 6 were created with Adobe Illustrator. Fig 5 was created
with samplot (https://github.com/ryanlayer/samplot) and Adobe Photoshop. Genome tracks were plotted
using the Gviz R package [128], with UCSC Known Gene [114], GERP conservation scores [129],
ENCODE DNase 1 hypersensitivity clusters, H3K27 acetylation in normal human lung fibroblasts
(NHLF) cells [130] and VISTA enhancer element tracks [106]. Except for the VISTA elements (bed file
downloaded from https://enhancer.lbl.gov/) all data was downloaded from the UCSC Genome Table
Browser. The 343 bp deletion sequence and the homologous sequence in mouse (mm9 reference
genome) were downloaded from the UCSC Genome Browser and aligned with Geneious [131].
Statistics for Fig S1 were generated with Prism 7, using a one-way ANOVA with multiple comparisons
to test differences between somatic or germline de novo variants found across probands or tissues and
unpaired t-tests to compare somatic and germline de novo variants within probands or tissues.
ACKNOWLEDGEMENTS
We would like to thank the patients and their families for their generous contribution. We are grateful for
the technical assistance provided by Patricia Lanzano, Jiangyuan Hu, Jiancheng Guo, and Liyong
Deng. We thank A Quinlan and RM Layer for help with LUMPY, D Neklason at Utah Genome Project
for coordinating sequencing, and CY Chow, LB Jorde, A Quinlan, EM Sefton, and B Collins for critical
reading of the manuscript. The support and resources from the Center for High Performance
Computing at the University of Utah are gratefully acknowledged. EL Bogenschutz was supported by
the University of Utah Genetics Training grant (NIH T32 GM007464). This research was supported by
NIH R01HD087360, March of Dimes 6FY15203, Utah Genome Project, and Wheeler Foundation grants
to GK; NIH R01GM120609 and R03HL138352 to YS; and NIH R01HD057036,UL1 RR024156,
29 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
and P01HD068250 to WKC. Additional funding support was provided by grants from CHERUBS,
CDHUK, and the National Greek Orthodox Ladies Philoptochos Society, Inc. and generous donations
from the Williams Family, Wheeler Foundation, Vanech Family Foundation, Larsen Family, Wilke
Family and many other families.
30 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
REFERENCES
1. Perry SF, Similowski T, Klein W, Codd JR. The evolutionary origin of the mammalian
diaphragm. Respir Physiol Neurobiol. 2010;171(1):1-16. Epub 2010/01/19. doi:
10.1016/j.resp.2010.01.004. PubMed PMID: 20080210.
2. Stege G, Fenton A, Jaffray B. Nihilism in the 1990s: the true mortality of congenital
diaphragmatic hernia. Pediatrics. 2003;112(3 Pt 1):532-5. Epub 2003/09/02. PubMed PMID:
12949279.
3. Torfs CP, Curry CJ, Bateson TF, Honore LH. A population-based study of congenital
diaphragmatic hernia. Teratology. 1992;46(6):555-65. Epub 1992/12/01. doi:
10.1002/tera.1420460605. PubMed PMID: 1290156.
4. Parker SE, Mai CT, Canfield MA, Rickard R, Wang Y, Meyer RE, et al. Updated
National Birth Prevalence estimates for selected birth defects in the United States, 2004-2006.
Birth Defects Res A Clin Mol Teratol. 2010;88(12):1008-16. Epub 2010/09/30. doi:
10.1002/bdra.20735. PubMed PMID: 20878909.
5. Shanmugam H, Brunelli L, Botto LD, Krikov S, Feldkamp ML. Epidemiology and
Prognosis of Congenital Diaphragmatic Hernia: A Population-Based Cohort Study in Utah.
Birth Defects Res. 2017;109(18):1451-9. Epub 2017/09/20. doi: 10.1002/bdr2.1106. PubMed
PMID: 28925604.
6. Lally KP. Congenital diaphragmatic hernia - the past 25 (or so) years. J Pediatr Surg.
2016;51(5):695-8. Epub 2016/03/02. doi: 10.1016/j.jpedsurg.2016.02.005. PubMed PMID:
26926207.
7. Pober BR. Overview of epidemiology, genetics, birth defects, and chromosome
abnormalities associated with CDH. Am J Med Genet C Semin Med Genet. 2007;145C(2):158-
31 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
71. Epub 2007/04/17. doi: 10.1002/ajmg.c.30126. PubMed PMID: 17436298; PubMed Central
PMCID: PMCPMC2891729.
8. Ackerman KG, Vargas SO, Wilson JA, Jennings RW, Kozakewich HP, Pober BR.
Congenital diaphragmatic defects: proposal for a new classification based on observations in
234 patients. Pediatric and developmental pathology : the official journal of the Society for
Pediatric Pathology and the Paediatric Pathology Society. 2012;15(4):265-74. Epub
2012/01/20. doi: 10.2350/11-05-1041-OA.1. PubMed PMID: 22257294; PubMed Central
PMCID: PMC3761363.
9. Kardon G, Ackerman KG, McCulley DJ, Shen Y, Wynn J, Shang L, et al. Congenital
diaphragmatic hernias: from genes to mechanisms to therapies. Dis Model Mech.
2017;10(8):955-70. Epub 2017/08/05. doi: 10.1242/dmm.028365. PubMed PMID: 28768736;
PubMed Central PMCID: PMCPMC5560060.
10. Holder AM, Klaassens M, Tibboel D, de Klein A, Lee B, Scott DA. Genetic factors in
congenital diaphragmatic hernia. Am J Hum Genet. 2007;80(5):825-45. Epub 2007/04/17. doi:
10.1086/513442. PubMed PMID: 17436238; PubMed Central PMCID: PMCPMC1852742.
11. Longoni M, Lage K, Russell MK, Loscertales M, Abdul-Rahman OA, Baynam G, et al.
Congenital diaphragmatic hernia interval on chromosome 8p23.1 characterized by genetics
and protein interaction networks. Am J Med Genet A. 2012;158A(12):3148-58. Epub
2012/11/21. doi: 10.1002/ajmg.a.35665. PubMed PMID: 23165946; PubMed Central PMCID:
PMCPMC3761361.
12. Longoni M, High FA, Qi H, Joy MP, Hila R, Coletti CM, et al. Genome-wide enrichment
of damaging de novo variants in patients with isolated and complex congenital diaphragmatic
32 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
hernia. Hum Genet. 2017;136(6):679-91. Epub 2017/03/18. doi: 10.1007/s00439-017-1774-y.
PubMed PMID: 28303347; PubMed Central PMCID: PMCPMC5453716.
13. Longoni M, High FA, Russell MK, Kashani A, Tracy AA, Coletti CM, et al. Molecular
pathogenesis of congenital diaphragmatic hernia revealed by exome sequencing,
developmental data, and bioinformatics. Proc Natl Acad Sci U S A. 2014;111(34):12450-5.
Epub 2014/08/12. doi: 10.1073/pnas.1412509111. PubMed PMID: 25107291; PubMed Central
PMCID: PMCPMC4151769.
14. Qi H, Yu L, Zhou X, Wynn J, Zhao H, Guo Y, et al. De novo variants in congenital
diaphragmatic hernia identify MYRF as a new syndrome and reveal genetic overlaps with other
developmental disorders. PLoS Genet. 2018;14(12):e1007822. Epub 2018/12/12. doi:
10.1371/journal.pgen.1007822. PubMed PMID: 30532227; PubMed Central PMCID:
PMCPMC6301721.
15. Yu L, Sawle AD, Wynn J, Aspelund G, Stolar CJ, Arkovitz MS, et al. Increased burden
of de novo predicted deleterious variants in complex congenital diaphragmatic hernia. Hum
Mol Genet. 2015;24(16):4764-73. Epub 2015/06/03. doi: 10.1093/hmg/ddv196. PubMed PMID:
26034137; PubMed Central PMCID: PMCPMC4512631.
16. Jay PY, Bielinska M, Erlich JM, Mannisto S, Pu WT, Heikinheimo M, et al. Impaired
mesenchymal cell function in Gata4 mutant mice leads to diaphragmatic hernias and primary
lung defects. Dev Biol. 2007;301(2):602-14. Epub 2006/10/31. doi:
10.1016/j.ydbio.2006.09.050. PubMed PMID: 17069789; PubMed Central PMCID:
PMCPMC1808541.
33 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
17. Merrell AJ, Ellis BJ, Fox ZD, Lawson JA, Weiss JA, Kardon G. Muscle connective tissue
controls development of the diaphragm and is a source of congenital diaphragmatic hernias.
Nat Genet. 2015;47(5):496-504. Epub 2015/03/26. doi: 10.1038/ng.3250. PubMed PMID:
25807280; PubMed Central PMCID: PMCPMC4414795.
18. Pober BR, Lin A, Russell M, Ackerman KG, Chakravorty S, Strauss B, et al. Infants with
Bochdalek diaphragmatic hernia: sibling precurrence and monozygotic twin discordance in a
hospital-based malformation surveillance program. Am J Med Genet A. 2005;138A(2):81-8.
Epub 2005/08/12. doi: 10.1002/ajmg.a.30904. PubMed PMID: 16094667; PubMed Central
PMCID: PMCPMC2891716.
19. Yu L, Wynn J, Ma L, Guha S, Mychaliska GB, Crombleholme TM, et al. De novo copy
number variants are associated with congenital diaphragmatic hernia. J Med Genet.
2012;49(10):650-9. Epub 2012/10/12. doi: 10.1136/jmedgenet-2012-101135. PubMed PMID:
23054247; PubMed Central PMCID: PMCPMC3696999.
20. Yu L, Hernan RR, Wynn J, Chung WK. The influence of genetics in congenital
diaphragmatic hernia. Semin Perinatol. 2019:151169. Epub 2019/08/25. doi:
10.1053/j.semperi.2019.07.008. PubMed PMID: 31443905.
21. Yu L, Wynn J, Cheung YH, Shen Y, Mychaliska GB, Crombleholme TM, et al. Variants
in GATA4 are a rare cause of familial and sporadic congenital diaphragmatic hernia. Hum
Genet. 2013;132(3):285-92. Epub 2012/11/10. doi: 10.1007/s00439-012-1249-0. PubMed
PMID: 23138528; PubMed Central PMCID: PMCPMC3570587.
22. Veenma D, Brosens E, de Jong E, van de Ven C, Meeussen C, Cohen-Overbeek T, et
al. Copy number detection in discordant monozygotic twins of Congenital Diaphragmatic
34 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Hernia (CDH) and Esophageal Atresia (EA) cohorts. Eur J Hum Genet. 2012;20(3):298-304.
Epub 2011/11/11. doi: 10.1038/ejhg.2011.194. PubMed PMID: 22071887; PubMed Central
PMCID: PMCPMC3283183.
23. Kantarci S, Ackerman KG, Russell MK, Longoni M, Sougnez C, Noonan KM, et al.
Characterization of the chromosome 1q41q42.12 region, and the candidate gene DISP1, in
patients with CDH. Am J Med Genet A. 2010;152A(10):2493-504. Epub 2010/08/28. doi:
10.1002/ajmg.a.33618. PubMed PMID: 20799323; PubMed Central PMCID:
PMCPMC3797530.
24. Veenma D, Beurskens N, Douben H, Eussen B, Noomen P, Govaerts L, et al.
Comparable low-level mosaicism in affected and non affected tissue of a complex CDH
patient. PLoS One. 2010;5(12):e15348. Epub 2011/01/05. doi: 10.1371/journal.pone.0015348.
PubMed PMID: 21203572; PubMed Central PMCID: PMCPMC3006223.
25. Matsunami N, Shanmugam H, Baird L, Stevens J, Byrne JL, Barnhart DC, et al.
Germline but not somatic de novo mutations are common in human congenital diaphragmatic
hernia. Birth Defects Res. 2018;110(7):610-7. Epub 2018/03/24. doi: 10.1002/bdr2.1223.
PubMed PMID: 29570242; PubMed Central PMCID: PMCPMC5903934.
26. Farag TI, Bastaki L, Marafie M, al-Awadi SA, Krsz J. Autosomal recessive congenital
diaphragmatic defects in the Arabs. Am J Med Genet. 1994;50(3):300-1. Epub 1994/04/15.
doi: 10.1002/ajmg.1320500316. PubMed PMID: 8042677.
27. Hitch DC, Carson JA, Smith EI, Sarale DC, Rennert OM. Familial congenital
diaphragmatic hernia is an autosomal recessive variant. J Pediatr Surg. 1989;24(9):860-4.
Epub 1989/09/01. doi: 10.1016/s0022-3468(89)80582-2. PubMed PMID: 2674388.
35 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
28. Kantarci S, Al-Gazali L, Hill RS, Donnai D, Black GC, Bieth E, et al. Mutations in LRP2,
which encodes the multiligand receptor megalin, cause Donnai-Barrow and facio-oculo-
acoustico-renal syndromes. Nat Genet. 2007;39(8):957-9. Epub 2007/07/17. doi:
10.1038/ng2063. PubMed PMID: 17632512; PubMed Central PMCID: PMCPMC2891728.
29. Mitchell SJ, Cole T, Redford DH. Congenital diaphragmatic hernia with probable
autosomal recessive inheritance in an extended consanguineous Pakistani pedigree. J Med
Genet. 1997;34(7):601-3. Epub 1997/07/01. doi: 10.1136/jmg.34.7.601. PubMed PMID:
9222974; PubMed Central PMCID: PMCPMC1051006.
30. Longoni M, Russell MK, High FA, Darvishi K, Maalouf FI, Kashani A, et al. Prevalence
and penetrance of ZFPM2 mutations and deletions causing congenital diaphragmatic hernia.
Clin Genet. 2015;87(4):362-7. Epub 2014/04/08. doi: 10.1111/cge.12395. PubMed PMID:
24702427; PubMed Central PMCID: PMCPMC4410767.
31. Allan DW, Greer JJ. Embryogenesis of the phrenic nerve and diaphragm in the fetal rat.
J Comp Neurol. 1997;382(4):459-68. Epub 1997/06/16. PubMed PMID: 9184993.
32. Babiuk RP, Zhang W, Clugston R, Allan DW, Greer JJ. Embryological origins and
development of the rat diaphragm. J Comp Neurol. 2003;455(4):477-87. Epub 2003/01/01. doi:
10.1002/cne.10503. PubMed PMID: 12508321.
33. Sefton EM, Gallardo M, Kardon G. Developmental origin and morphogenesis of the
diaphragm, an essential mammalian muscle. Dev Biol. 2018;440(2):64-73. Epub 2018/04/22.
doi: 10.1016/j.ydbio.2018.04.010. PubMed PMID: 29679560; PubMed Central PMCID:
PMCPMC6089379.
36 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
34. Carmona R, Canete A, Cano E, Ariza L, Rojas A, Munoz-Chapuli R. Conditional
deletion of WT1 in the septum transversum mesenchyme causes congenital diaphragmatic
hernia in mice. Elife. 2016;5. Epub 2016/09/20. doi: 10.7554/eLife.16009. PubMed PMID:
27642710; PubMed Central PMCID: PMCPMC5028188.
35. Paris ND, Coles GL, Ackerman KG. Wt1 and beta-catenin cooperatively regulate
diaphragm development in the mouse. Dev Biol. 2015;407(1):40-56. Epub 2015/08/19. doi:
10.1016/j.ydbio.2015.08.009. PubMed PMID: 26278035; PubMed Central PMCID:
PMCPMC4641796.
36. Grifone R, Demignon J, Giordani J, Niro C, Souil E, Bertin F, et al. Eya1 and Eya2
proteins are required for hypaxial somitic myogenesis in the mouse embryo. Dev Biol.
2007;302(2):602-16. Epub 2006/11/14. doi: 10.1016/j.ydbio.2006.08.059. PubMed PMID:
17098221.
37. Grifone R, Demignon J, Houbron C, Souil E, Niro C, Seller MJ, et al. Six1 and Six4
homeoproteins are required for Pax3 and Mrf expression during myogenesis in the mouse
embryo. Development. 2005;132(9):2235-49. Epub 2005/03/25. doi: 10.1242/dev.01773.
PubMed PMID: 15788460.
38. Inanlou MR, Dhillon GS, Belliveau AC, Reid GA, Ying C, Rudnicki MA, et al. A
significant reduction of the diaphragm in mdx:MyoD-/-(9th) embryos suggests a role for MyoD
in the diaphragm development. Dev Biol. 2003;261(2):324-36. Epub 2003/09/23. PubMed
PMID: 14499644.
39. Ju Y, Li J, Xie C, Ritchlin CT, Xing L, Hilton MJ, et al. Troponin T3 expression in skeletal
and smooth muscle is required for growth and postnatal survival: characterization of
37 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Tnnt3(tm2a(KOMP)Wtsi) mice. Genesis. 2013;51(9):667-75. Epub 2013/06/19. doi:
10.1002/dvg.22407. PubMed PMID: 23775847; PubMed Central PMCID: PMCPMC3787964.
40. Laclef C, Hamard G, Demignon J, Souil E, Houbron C, Maire P. Altered myogenesis in
Six1-deficient mice. Development. 2003;130(10):2239-52. Epub 2003/04/02. PubMed PMID:
12668636.
41. Li J, Liu KC, Jin F, Lu MM, Epstein JA. Transgenic rescue of congenital heart disease
and spina bifida in Splotch mice. Development. 1999;126(11):2495-503. Epub 1999/05/05.
PubMed PMID: 10226008.
42. Li Z, Colucci-Guyon E, Pincon-Raymond M, Mericskay M, Pournin S, Paulin D, et al.
Cardiovascular lesions and skeletal myopathy in mice lacking desmin. Dev Biol.
1996;175(2):362-6. Epub 1996/05/01. doi: 10.1006/dbio.1996.0122. PubMed PMID: 8626040.
43. Lu JR, Bassel-Duby R, Hawkins A, Chang P, Valdez R, Wu H, et al. Control of facial
muscle development by MyoR and capsulin. Science. 2002;298(5602):2378-81. Epub
2002/12/21. doi: 10.1126/science.1078273. PubMed PMID: 12493912.
44. Seale P, Sabourin LA, Girgis-Gabardo A, Mansouri A, Gruss P, Rudnicki MA. Pax7 is
required for the specification of myogenic satellite cells. Cell. 2000;102(6):777-86. Epub
2000/10/13. PubMed PMID: 11030621.
45. Tseng BS, Cavin ST, Booth FW, Olson EN, Marin MC, McDonnell TJ, et al. Pulmonary
hypoplasia in the myogenin null mouse embryo. Am J Respir Cell Mol Biol. 2000;22(3):304-15.
Epub 2000/03/01. doi: 10.1165/ajrcmb.22.3.3708. PubMed PMID: 10696067.
46. Russell MK, Longoni M, Wells J, Maalouf FI, Tracy AA, Loscertales M, et al. Congenital
diaphragmatic hernia candidate genes derived from embryonic transcriptomes. Proc Natl Acad
38 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Sci U S A. 2012;109(8):2978-83. Epub 2012/02/09. doi: 10.1073/pnas.1121621109. PubMed
PMID: 22315423; PubMed Central PMCID: PMCPMC3286948.
47. Zhu Q, High FA, Zhang C, Cerveira E, Russell MK, Longoni M, et al. Systematic
analysis of copy number variation associated with congenital diaphragmatic hernia. Proc Natl
Acad Sci U S A. 2018;115(20):5247-52. Epub 2018/05/02. doi: 10.1073/pnas.1714885115.
PubMed PMID: 29712845; PubMed Central PMCID: PMCPMC5960281.
48. Jordan VK, Beck TF, Hernandez-Garcia A, Kundert PN, Kim BJ, Jhangiani SN, et al.
The Role of FREM2 and FRAS1 in the Development of Congenital Diaphragmatic Hernia. Hum
Mol Genet. 2018. Epub 2018/04/05. doi: 10.1093/hmg/ddy110. PubMed PMID: 29618029.
49. Clugston RD, Zhang W, Greer JJ. Early development of the primordial mammalian
diaphragm and cellular mechanisms of nitrofen-induced congenital diaphragmatic hernia. Birth
Defects Res A Clin Mol Teratol. 2010;88(1):15-24. Epub 2009/08/28. doi: 10.1002/bdra.20613.
PubMed PMID: 19711422.
50. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING
v11: protein-protein association networks with increased coverage, supporting functional
discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607-D13.
Epub 2018/11/27. doi: 10.1093/nar/gky1131. PubMed PMID: 30476243; PubMed Central
PMCID: PMCPMC6323986.
51. Chlon TM, Crispino JD. Combinatorial regulation of tissue specification by GATA and
FOG factors. Development. 2012;139(21):3905-16. Epub 2012/10/11. doi:
10.1242/dev.080440. PubMed PMID: 23048181; PubMed Central PMCID: PMCPMC3472596.
39 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
52. Huggins GS, Bacani CJ, Boltax J, Aikawa R, Leiden JM. Friend of GATA 2 physically
interacts with chicken ovalbumin upstream promoter-TF2 (COUP-TF2) and COUP-TF3 and
represses COUP-TF2-dependent activation of the atrial natriuretic factor promoter. J Biol
Chem. 2001;276(30):28029-36. Epub 2001/05/31. doi: 10.1074/jbc.M103577200. PubMed
PMID: 11382775.
53. Ang YS, Rivas RN, Ribeiro AJS, Srivas R, Rivera J, Stone NR, et al. Disease Model of
GATA4 Mutation Reveals Transcription Factor Cooperativity in Human Cardiogenesis. Cell.
2016;167(7):1734-49 e22. Epub 2016/12/17. doi: 10.1016/j.cell.2016.11.033. PubMed PMID:
27984724; PubMed Central PMCID: PMCPMC5180611.
54. Luna-Zurita L, Stirnimann CU, Glatt S, Kaynak BL, Thomas S, Baudin F, et al. Complex
Interdependence Regulates Heterotypic Transcription Factor Distribution and Coordinates
Cardiogenesis. Cell. 2016;164(5):999-1014. Epub 2016/02/16. doi: 10.1016/j.cell.2016.01.004.
PubMed PMID: 26875865; PubMed Central PMCID: PMCPMC4769693.
55. Pedersen BS, Quinlan AR. Who's Who? Detecting and Resolving Sample Anomalies in
Human DNA Sequencing Studies with Peddy. Am J Hum Genet. 2017;100(3):406-13. Epub
2017/02/14. doi: 10.1016/j.ajhg.2017.01.017. PubMed PMID: 28190455; PubMed Central
PMCID: PMCPMC5339084.
56. Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-
performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178-92.
Epub 2012/04/21. doi: 10.1093/bib/bbs017. PubMed PMID: 22517427; PubMed Central
PMCID: PMCPMC3603213.
40 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
57. Besenbacher S, Liu S, Izarzugaza JM, Grove J, Belling K, Bork-Jensen J, et al. Novel
variation and de novo mutation rates in population-wide de novo assembled Danish trios. Nat
Commun. 2015;6:5969. Epub 2015/01/20. doi: 10.1038/ncomms6969. PubMed PMID:
25597990; PubMed Central PMCID: PMCPMC4309431.
58. Sasani TA, Pedersen BS, Gao Z, Baird L, Przeworski M, Jorde LB, et al. Large, three-
generation human families reveal post-zygotic mosaicism and variability in germline mutation
accumulation. Elife. 2019;8. Epub 2019/09/25. doi: 10.7554/eLife.46922. PubMed PMID:
31549960; PubMed Central PMCID: PMCPMC6759356.
59. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of
amino acid substitutions and indels. PLoS One. 2012;7(10):e46688. Epub 2012/10/12. doi:
10.1371/journal.pone.0046688. PubMed PMID: 23056405; PubMed Central PMCID:
PMCPMC3466303.
60. Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al. Analysis of
protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285-91. Epub
2016/08/19. doi: 10.1038/nature19057. PubMed PMID: 27535533; PubMed Central PMCID:
PMCPMC5018207.
61. Kuo MW, Wang CH, Wu HC, Chang SJ, Chuang YJ. Soluble THSD7A is an N-
glycoprotein that promotes endothelial cell migration and tube formation in angiogenesis. PLoS
One. 2011;6(12):e29000. Epub 2011/12/24. doi: 10.1371/journal.pone.0029000. PubMed
PMID: 22194972; PubMed Central PMCID: PMCPMC3237571.
41 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
62. Wang CH, Chen IH, Kuo MW, Su PT, Lai ZY, Wang CH, et al. Zebrafish Thsd7a is a
neural protein required for angiogenic patterning during development. Dev Dyn.
2011;240(6):1412-21. Epub 2011/04/27. doi: 10.1002/dvdy.22641. PubMed PMID: 21520329.
63. Wang CH, Su PT, Du XY, Kuo MW, Lin CY, Yang CC, et al. Thrombospondin type I
domain containing 7A (THSD7A) mediates endothelial cell migration and tube formation. J Cell
Physiol. 2010;222(3):685-94. Epub 2009/12/19. doi: 10.1002/jcp.21990. PubMed PMID:
20020485.
64. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error
probabilities. Genome Res. 1998;8(3):186-94. Epub 1998/05/16. PubMed PMID: 9521922.
65. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces
using phred. I. Accuracy assessment. Genome Res. 1998;8(3):175-85. Epub 1998/05/16. doi:
10.1101/gr.8.3.175. PubMed PMID: 9521921.
66. Dong X, Zhang L, Milholland B, Lee M, Maslov AY, Wang T, et al. Accurate
identification of single-nucleotide variants in whole-genome-amplified single cells. Nat
Methods. 2017;14(5):491-3. Epub 2017/03/21. doi: 10.1038/nmeth.4227. PubMed PMID:
28319112; PubMed Central PMCID: PMCPMC5408311.
67. Behjati S, Huch M, van Boxtel R, Karthaus W, Wedge DC, Tamuri AU, et al. Genome
sequencing of normal cells reveals developmental lineages and mutational processes. Nature.
2014;513(7518):422-5. Epub 2014/07/22. doi: 10.1038/nature13448. PubMed PMID:
25043003; PubMed Central PMCID: PMCPMC4227286.
68. Vijg J, Dong X, Zhang L. A high-fidelity method for genomic sequencing of single
somatic cells reveals a very high mutational burden. Exp Biol Med (Maywood).
42 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2017;242(13):1318-24. Epub 2017/07/25. doi: 10.1177/1535370217717696. PubMed PMID:
28737476; PubMed Central PMCID: PMCPMC5529006.
69. Culig Z, Santer FR. Androgen receptor signaling in prostate cancer. Cancer Metastasis
Rev. 2014;33(2-3):413-27. Epub 2014/01/05. doi: 10.1007/s10555-013-9474-0. PubMed
PMID: 24384911.
70. Polkinghorn WR, Parker JS, Lee MX, Kass EM, Spratt DE, Iaquinta PJ, et al. Androgen
receptor signaling regulates DNA repair in prostate cancers. Cancer Discov. 2013;3(11):1245-
53. Epub 2013/09/13. doi: 10.1158/2159-8290.CD-13-0172. PubMed PMID: 24027196;
PubMed Central PMCID: PMCPMC3888815.
71. Schiewer MJ, Goodwin JF, Han S, Brenner JC, Augello MA, Dean JL, et al. Dual roles
of PARP-1 promote cancer growth and progression. Cancer Discov. 2012;2(12):1134-49. Epub
2012/09/21. doi: 10.1158/2159-8290.CD-12-0120. PubMed PMID: 22993403; PubMed Central
PMCID: PMCPMC3519969.
72. Hu H, Roach JC, Coon H, Guthery SL, Voelkerding KV, Margraf RL, et al. A unified test
of linkage analysis and rare-variant association for analysis of pedigree sequence data. Nat
Biotechnol. 2014;32(7):663-9. Epub 2014/05/20. doi: 10.1038/nbt.2895. PubMed PMID:
24837662; PubMed Central PMCID: PMCPMC4157619.
73. Yandell M, Huff C, Hu H, Singleton M, Moore B, Xing J, et al. A probabilistic disease-
gene finder for personal genomes. Genome Res. 2011;21(9):1529-42. Epub 2011/06/28. doi:
10.1101/gr.123158.111. PubMed PMID: 21700766; PubMed Central PMCID:
PMCPMC3166837.
43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
74. Genomes Project C, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A
global reference for human genetic variation. Nature. 2015;526(7571):68-74. Epub 2015/10/04.
doi: 10.1038/nature15393. PubMed PMID: 26432245; PubMed Central PMCID:
PMCPMC4750478.
75. Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, et al.
GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res.
2012;22(9):1760-74. Epub 2012/09/08. doi: 10.1101/gr.135350.111. PubMed PMID:
22955987; PubMed Central PMCID: PMCPMC3431492.
76. Lenz TL, Spirin V, Jordan DM, Sunyaev SR. Excess of Deleterious Mutations around
HLA Genes Reveals Evolutionary Cost of Balancing Selection. Mol Biol Evol.
2016;33(10):2555-64. Epub 2016/07/21. doi: 10.1093/molbev/msw127. PubMed PMID:
27436009; PubMed Central PMCID: PMCPMC5026253.
77. Shyr C, Tarailo-Graovac M, Gottlieb M, Lee JJ, van Karnebeek C, Wasserman WW.
FLAGS, frequently mutated genes in public exomes. BMC Med Genomics. 2014;7:64. Epub
2014/12/04. doi: 10.1186/s12920-014-0064-y. PubMed PMID: 25466818; PubMed Central
PMCID: PMCPMC4267152.
78. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. Variation
across 141,456 human exomes and genomes reveals the spectrum of loss-of-function
intolerance across human protein-coding genes. bioRxiv. 2019:531210. doi: 10.1101/531210.
79. Cossins J, Belaya K, Hicks D, Salih MA, Finlayson S, Carboni N, et al. Congenital
myasthenic syndromes due to mutations in ALG2 and ALG14. Brain. 2013;136(Pt 3):944-56.
44 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Epub 2013/02/14. doi: 10.1093/brain/awt010. PubMed PMID: 23404334; PubMed Central
PMCID: PMCPMC3580273.
80. Monies DM, Al-Hindi HN, Al-Muhaizea MA, Jaroudi DJ, Al-Younes B, Naim EA, et al.
Clinical and pathological heterogeneity of a congenital disorder of glycosylation manifesting as
a myasthenic/myopathic syndrome. Neuromuscul Disord. 2014;24(4):353-9. Epub 2014/01/28.
doi: 10.1016/j.nmd.2013.12.010. PubMed PMID: 24461433.
81. Thiel C, Schwarz M, Peng J, Grzmil M, Hasilik M, Braulke T, et al. A new type of
congenital disorders of glycosylation (CDG-Ii) provides new insights into the early steps of
dolichol-linked oligosaccharide biosynthesis. J Biol Chem. 2003;278(25):22498-505. Epub
2003/04/10. doi: 10.1074/jbc.M302850200. PubMed PMID: 12684507.
82. Pathak RK, Anderson RG, Hofmann SL. Histidine-rich calcium binding protein, a
sarcoplasmic reticulum protein of striated muscle, is also abundant in arteriolar smooth muscle
cells. J Muscle Res Cell Motil. 1992;13(3):366-76. Epub 1992/06/01. doi: 10.1007/bf01766464.
PubMed PMID: 1527222.
83. Arvanitis DA, Vafiadaki E, Fan GC, Mitton BA, Gregory KN, Del Monte F, et al.
Histidine-rich Ca-binding protein interacts with sarcoplasmic reticulum Ca-ATPase. Am J
Physiol Heart Circ Physiol. 2007;293(3):H1581-9. Epub 2007/05/29. doi:
10.1152/ajpheart.00278.2007. PubMed PMID: 17526652.
84. Tzimas C, Johnson DM, Santiago DJ, Vafiadaki E, Arvanitis DA, Davos CH, et al.
Impaired calcium homeostasis is associated with sudden cardiac death and arrhythmias in a
genetic equivalent mouse model of the human HRC-Ser96Ala variant. Cardiovasc Res.
2017;113(11):1403-17. Epub 2017/09/02. doi: 10.1093/cvr/cvx113. PubMed PMID: 28859293.
45 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
85. Davis TA, Loos B, Engelbrecht AM. AHNAK: the giant jack of all trades. Cell Signal.
2014;26(12):2683-93. Epub 2014/08/31. doi: 10.1016/j.cellsig.2014.08.017. PubMed PMID:
25172424.
86. Huang Y, Laval SH, van Remoortere A, Baudier J, Benaud C, Anderson LV, et al.
AHNAK, a novel component of the dysferlin protein complex, redistributes to the cytoplasm
with dysferlin during skeletal muscle regeneration. FASEB J. 2007;21(3):732-42. Epub
2006/12/23. doi: 10.1096/fj.06-6628com. PubMed PMID: 17185750.
87. Zacharias U, Purfurst B, Schowel V, Morano I, Spuler S, Haase H. Ahnak1 abnormally
localizes in muscular dystrophies and contributes to muscle vesicle release. J Muscle Res Cell
Motil. 2011;32(4-5):271-80. Epub 2011/11/08. doi: 10.1007/s10974-011-9271-8. PubMed
PMID: 22057634; PubMed Central PMCID: PMCPMC3230764.
88. Spielmann M, Hernandez-Miranda LR, Ceccherini I, Weese-Mayer DE, Kragesteen BK,
Harabula I, et al. Mutations in MYO1H cause a recessive form of central hypoventilation with
autonomic dysfunction. J Med Genet. 2017;54(11):754-61. Epub 2017/08/06. doi:
10.1136/jmedgenet-2017-104765. PubMed PMID: 28779001.
89. Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for
structural variant discovery. Genome Biol. 2014;15(6):R84. Epub 2014/06/28. doi: 10.1186/gb-
2014-15-6-r84. PubMed PMID: 24970577; PubMed Central PMCID: PMCPMC4197822.
90. Klambauer G, Schwarzbauer K, Mayr A, Clevert DA, Mitterecker A, Bodenhofer U, et al.
cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation
sequencing data with a low false discovery rate. Nucleic Acids Res. 2012;40(9):e69. Epub
46 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2012/02/04. doi: 10.1093/nar/gks003. PubMed PMID: 22302147; PubMed Central PMCID:
PMCPMC3351174.
91. Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: Genome-Wide Copy Number
Detection and Visualization from Targeted DNA Sequencing. PLoS Comput Biol.
2016;12(4):e1004873. Epub 2016/04/23. doi: 10.1371/journal.pcbi.1004873. PubMed PMID:
27100738; PubMed Central PMCID: PMCPMC4839673.
92. Abyzov A, Urban AE, Snyder M, Gerstein M. CNVnator: an approach to discover,
genotype, and characterize typical and atypical CNVs from family and population genome
sequencing. Genome Res. 2011;21(6):974-84. Epub 2011/02/18. doi: 10.1101/gr.114876.110.
PubMed PMID: 21324876; PubMed Central PMCID: PMCPMC3106330.
93. Layer RM, Kindlon N, Karczewski KJ, Exome Aggregation C, Quinlan AR. Efficient
genotype compression and analysis of large genetic-variation data sets. Nat Methods.
2016;13(1):63-5. Epub 2015/11/10. doi: 10.1038/nmeth.3654. PubMed PMID: 26550772;
PubMed Central PMCID: PMCPMC4697868.
94. Abel HJ, Larson DE, Chiang C, Das I, Kanchi KL, Layer RM, et al. Mapping and
characterization of structural variation in 17,795 deeply sequenced human genomes. bioRxiv.
2018:508515. doi: 10.1101/508515.
95. Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, Shoukry M, et al. In
vivo enhancer analysis of human conserved non-coding sequences. Nature.
2006;444(7118):499-502. Epub 2006/11/07. doi: 10.1038/nature05295. PubMed PMID:
17086198.
47 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
96. Watt AJ, Battle MA, Li J, Duncan SA. GATA4 is essential for formation of the
proepicardium and regulates cardiogenesis. Proc Natl Acad Sci U S A. 2004;101(34):12573-8.
Epub 2004/08/18. doi: 10.1073/pnas.0400752101. PubMed PMID: 15310850; PubMed Central
PMCID: PMCPMC515098.
97. Kuo CT, Morrisey EE, Anandappa R, Sigrist K, Lu MM, Parmacek MS, et al. GATA4
transcription factor is required for ventral morphogenesis and heart tube formation. Genes
Dev. 1997;11(8):1048-60. Epub 1997/04/15. PubMed PMID: 9136932.
98. Homsy J, Zaidi S, Shen Y, Ware JS, Samocha KE, Karczewski KJ, et al. De novo
mutations in congenital heart disease with neurodevelopmental and other congenital
anomalies. Science. 2015;350(6265):1262-6. Epub 2016/01/20. doi: 10.1126/science.aac9396.
PubMed PMID: 26785492; PubMed Central PMCID: PMCPMC4890146.
99. Hsieh A, Morton SU, Willcox JAL, Gorham JM, Tai AC, Qi H, et al. Early post-zygotic
mutations contribute to congenital heart disease. bioRxiv. 2019.
100. Besenbacher S, Sulem P, Helgason A, Helgason H, Kristjansson H, Jonasdottir A, et al.
Multi-nucleotide de novo Mutations in Humans. PLoS Genet. 2016;12(11):e1006315. Epub
2016/11/16. doi: 10.1371/journal.pgen.1006315. PubMed PMID: 27846220; PubMed Central
PMCID: PMCPMC5147774.
101. Jonsson H, Sulem P, Arnadottir GA, Palsson G, Eggertsson HP, Kristmundsdottir S, et
al. Multiple transmissions of de novo mutations in families. Nat Genet. 2018;50(12):1674-80.
Epub 2018/11/07. doi: 10.1038/s41588-018-0259-9. PubMed PMID: 30397338.
102. Kong A, Frigge ML, Masson G, Besenbacher S, Sulem P, Magnusson G, et al. Rate of
de novo mutations and the importance of father's age to disease risk. Nature.
48 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2012;488(7412):471-5. Epub 2012/08/24. doi: 10.1038/nature11396. PubMed PMID:
22914163; PubMed Central PMCID: PMCPMC3548427.
103. Rahbari R, Wuster A, Lindsay SJ, Hardwick RJ, Alexandrov LB, Turki SA, et al. Timing,
rates and spectra of human germline mutation. Nat Genet. 2016;48(2):126-33. Epub
2015/12/15. doi: 10.1038/ng.3469. PubMed PMID: 26656846; PubMed Central PMCID:
PMCPMC4731925.
104. Wat MJ, Shchelochkov OA, Holder AM, Breman AM, Dagli A, Bacino C, et al.
Chromosome 8p23.1 deletions as a cause of complex congenital heart defects and
diaphragmatic hernia. Am J Med Genet A. 2009;149A(8):1661-77. Epub 2009/07/17. doi:
10.1002/ajmg.a.32896. PubMed PMID: 19606479; PubMed Central PMCID:
PMCPMC2765374.
105. Manheimer KB, Richter F, Edelmann LJ, D'Souza SL, Shi L, Shen Y, et al. Robust
identification of mosaic variants in congenital heart disease. Hum Genet. 2018;137(2):183-93.
Epub 2018/02/09. doi: 10.1007/s00439-018-1871-6. PubMed PMID: 29417219; PubMed
Central PMCID: PMCPMC5997246.
106. Visel A, Minovitsky S, Dubchak I, Pennacchio LA. VISTA Enhancer Browser--a
database of tissue-specific human enhancers. Nucleic Acids Res. 2007;35(Database
issue):D88-92. Epub 2006/11/30. doi: 10.1093/nar/gkl822. PubMed PMID: 17130149; PubMed
Central PMCID: PMCPMC1716724.
107. Pu WT, Ishiwata T, Juraszek AL, Ma Q, Izumo S. GATA4 is a dosage-sensitive
regulator of cardiac morphogenesis. Dev Biol. 2004;275(1):235-44. Epub 2004/10/07. doi:
10.1016/j.ydbio.2004.08.008. PubMed PMID: 15464586.
49 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
108. Wickham H, Sievert C. ggplot2 : Elegant Graphics for Data Analysis. 2nd ed. Cham:
Springer International Publishing; 2016. 1 online resource (266 pages) p.
109. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
arXiv. 2013. doi: 1303.3997.
110. Faust GG, Hall IM. SAMBLASTER: fast duplicate marking and structural variant read
extraction. Bioinformatics. 2014;30(17):2503-5. Epub 2014/05/09. doi:
10.1093/bioinformatics/btu314. PubMed PMID: 24812344; PubMed Central PMCID:
PMCPMC4147885.
111. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework
for variation discovery and genotyping using next-generation DNA sequencing data. Nat
Genet. 2011;43(5):491-8. Epub 2011/04/12. doi: 10.1038/ng.806. PubMed PMID: 21478889;
PubMed Central PMCID: PMCPMC3083463.
112. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence
Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078-9. Epub 2009/06/10.
doi: 10.1093/bioinformatics/btp352. PubMed PMID: 19505943; PubMed Central PMCID:
PMCPMC2723002.
113. Pedersen BS, Quinlan AR. cyvcf2: fast, flexible variant analysis with Python.
Bioinformatics. 2017;33(12):1867-9. Epub 2017/02/07. doi: 10.1093/bioinformatics/btx057.
PubMed PMID: 28165109; PubMed Central PMCID: PMCPMC5870853.
114. Hsu F, Kent WJ, Clawson H, Kuhn RM, Diekhans M, Haussler D. The UCSC Known
Genes. Bioinformatics. 2006;22(9):1036-46. Epub 2006/02/28. doi:
10.1093/bioinformatics/btl048. PubMed PMID: 16500937.
50 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
115. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic
features. Bioinformatics. 2010;26(6):841-2. Epub 2010/01/30. doi:
10.1093/bioinformatics/btq033. PubMed PMID: 20110278; PubMed Central PMCID:
PMCPMC2832824.
116. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the
deleteriousness of variants throughout the human genome. Nucleic Acids Res.
2019;47(D1):D886-D94. Epub 2018/10/30. doi: 10.1093/nar/gky1016. PubMed PMID:
30371827; PubMed Central PMCID: PMCPMC6323892.
117. Hu H, Huff CD, Moore B, Flygare S, Reese MG, Yandell M. VAAST 2.0: improved
variant classification and disease-gene identification using a conservation-controlled amino
acid substitution matrix. Genet Epidemiol. 2013;37(6):622-34. Epub 2013/07/10. doi:
10.1002/gepi.21743. PubMed PMID: 23836555; PubMed Central PMCID: PMCPMC3791556.
118. Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants.
Bioinformatics. 2015;31(13):2202-4. Epub 2015/02/24. doi: 10.1093/bioinformatics/btv112.
PubMed PMID: 25701572; PubMed Central PMCID: PMCPMC4481842.
119. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant
call format and VCFtools. Bioinformatics. 2011;27(15):2156-8. Epub 2011/06/10. doi:
10.1093/bioinformatics/btr330. PubMed PMID: 21653522; PubMed Central PMCID:
PMCPMC3137218.
120. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, et al. The Ensembl
Variant Effect Predictor. Genome Biol. 2016;17(1):122. Epub 2016/06/09. doi:
51 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PubMed Central PMCID:
PMCPMC4893825.
121. Flygare S, Hernandez EJ, Phan L, Moore B, Li M, Fejes A, et al. The VAAST Variant
Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool. BMC
Bioinformatics. 2018;19(1):57. Epub 2018/02/22. doi: 10.1186/s12859-018-2056-y. PubMed
PMID: 29463208; PubMed Central PMCID: PMCPMC5819680.
122. Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, et al. GeneCards
Version 3: the human gene integrator. Database (Oxford). 2010;2010:baq020. Epub
2010/08/07. doi: 10.1093/database/baq020. PubMed PMID: 20689021; PubMed Central
PMCID: PMCPMC2938269.
123. Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP, et al. SpeedSeq:
ultra-fast personal genome analysis and interpretation. Nat Methods. 2015;12(10):966-8. Epub
2015/08/11. doi: 10.1038/nmeth.3505. PubMed PMID: 26258291; PubMed Central PMCID:
PMCPMC4589466.
124. Layer RM, Kindlon N, Karczewski KJ, Quinlan AR, Consortium EA. Efficient genotype
compression and analysis of large genetic-variation data sets. Nat Methods. 2016;13(1):63-+.
doi: 10.1038/Nmeth.3654. PubMed PMID: WOS:000367463600029.
125. Jurka J. Repbase update: a database and an electronic journal of repetitive elements.
Trends Genet. 2000;16(9):418-20. Epub 2000/09/06. doi: 10.1016/s0168-9525(00)02093-x.
PubMed PMID: 10973072.
126. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast
universal RNA-seq aligner. Bioinformatics. 2013;29(1):15-21. Epub 2012/10/30. doi:
52 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
10.1093/bioinformatics/bts635. PubMed PMID: 23104886; PubMed Central PMCID:
PMCPMC3530905.
127. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for
assigning sequence reads to genomic features. Bioinformatics. 2014;30(7):923-30. Epub
2013/11/15. doi: 10.1093/bioinformatics/btt656. PubMed PMID: 24227677.
128. Hahne F, Ivanek R. Visualizing Genomic Data Using Gviz and Bioconductor. Methods
Mol Biol. 2016;1418:335-51. Epub 2016/03/24. doi: 10.1007/978-1-4939-3578-9_16. PubMed
PMID: 27008022.
129. Davydov EV, Goode DL, Sirota M, Cooper GM, Sidow A, Batzoglou S. Identifying a high
fraction of the human genome to be under selective constraint using GERP++. PLoS Comput
Biol. 2010;6(12):e1001025. Epub 2010/12/15. doi: 10.1371/journal.pcbi.1001025. PubMed
PMID: 21152010; PubMed Central PMCID: PMCPMC2996323.
130. Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, et al. Architecture
of the human regulatory network derived from ENCODE data. Nature. 2012;489(7414):91-100.
Epub 2012/09/08. doi: 10.1038/nature11245. PubMed PMID: 22955619; PubMed Central
PMCID: PMCPMC4154057.
131. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious
Basic: an integrated and extendable desktop software platform for the organization and
analysis of sequence data. Bioinformatics. 2012;28(12):1647-9. Epub 2012/05/01. doi:
10.1093/bioinformatics/bts199. PubMed PMID: 22543367; PubMed Central PMCID:
PMCPMC3371832.
53 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
132. Hucthagowder V, Sausgruber N, Kim KH, Angle B, Marmorstein LY, Urban Z. Fibulin-4:
a novel gene for an autosomal recessive cutis laxa syndrome. Am J Hum Genet.
2006;78(6):1075-80. Epub 2006/05/11. doi: 10.1086/504304. PubMed PMID: 16685658;
PubMed Central PMCID: PMCPMC1474103.
133. Devriendt K, Deloof E, Moerman P, Legius E, Vanhole C, de Zegher F, et al.
Diaphragmatic hernia in Denys-Drash syndrome. Am J Med Genet. 1995;57(1):97-101. Epub
1995/05/22. doi: 10.1002/ajmg.1320570120. PubMed PMID: 7645607.
134. Slavotinek AM, Moshrefi A, Lopez Jiminez N, Chao R, Mendell A, Shaw GM, et al.
Sequence variants in the HLX gene at chromosome 1q41-1q42 in patients with diaphragmatic
hernia. Clin Genet. 2009;75(5):429-39. Epub 2009/05/23. doi: 10.1111/j.1399-
0004.2009.01182.x. PubMed PMID: 19459883; PubMed Central PMCID: PMCPMC2874832.
135. Wat MJ, Beck TF, Hernandez-Garcia A, Yu Z, Veenma D, Garcia M, et al. Mouse model
reveals the role of SOX7 in the development of congenital diaphragmatic hernia associated
with recurrent deletions of 8p23.1. Hum Mol Genet. 2012;21(18):4115-25. Epub 2012/06/23.
doi: 10.1093/hmg/dds241. PubMed PMID: 22723016; PubMed Central PMCID:
PMCPMC3428158.
136. High FA, Bhayani P, Wilson JM, Bult CJ, Donahoe PK, Longoni M. De novo frameshift
mutation in COUP-TFII (NR2F2) in human congenital diaphragmatic hernia. Am J Med Genet
A. 2016;170(9):2457-61. Epub 2016/07/02. doi: 10.1002/ajmg.a.37830. PubMed PMID:
27363585; PubMed Central PMCID: PMCPMC5003181.
137. Tautz J, Veenma D, Eussen B, Joosen L, Poddighe P, Tibboel D, et al. Congenital
diaphragmatic hernia and a complex heart defect in association with Wolf-Hirschhorn
54 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
syndrome. Am J Med Genet A. 2010;152A(11):2891-4. Epub 2010/09/11. doi:
10.1002/ajmg.a.33660. PubMed PMID: 20830802.
138. Srour M, Chitayat D, Caron V, Chassaing N, Bitoun P, Patry L, et al. Recessive and
dominant mutations in retinoic acid receptor beta in cases with microphthalmia and
diaphragmatic hernia. Am J Hum Genet. 2013;93(4):765-72. Epub 2013/10/01. doi:
10.1016/j.ajhg.2013.08.014. PubMed PMID: 24075189; PubMed Central PMCID:
PMCPMC3791254.
139. Jacobs AM, Toudjarska I, Racine A, Tsipouras P, Kilpatrick MW, Shanske A. A
recurring FBN1 gene mutation in neonatal Marfan syndrome. Arch Pediatr Adolesc Med.
2002;156(11):1081-5. Epub 2002/11/07. PubMed PMID: 12413333.
140. Bleyl SB, Moshrefi A, Shaw GM, Saijoh Y, Schoenwolf GC, Pennacchio LA, et al.
Candidate genes for congenital diaphragmatic hernia from animal models: sequencing of
FOG2 and PDGFRalpha reveals rare variants in diaphragmatic hernia patients. Eur J Hum
Genet. 2007;15(9):950-8. Epub 2007/06/15. doi: 10.1038/sj.ejhg.5201872. PubMed PMID:
17568391.
141. Chassaing N, Ragge N, Kariminejad A, Buffet A, Ghaderi-Sohi S, Martinovic J, et al.
Mutation analysis of the STRA6 gene in isolated and non-isolated
anophthalmia/microphthalmia. Clin Genet. 2013;83(3):244-50. Epub 2012/06/13. doi:
10.1111/j.1399-0004.2012.01904.x. PubMed PMID: 22686418.
142. Urban Z, Hucthagowder V, Schurmann N, Todorovic V, Zilberberg L, Choi J, et al.
Mutations in LTBP4 cause a syndrome of impaired pulmonary, gastrointestinal, genitourinary,
musculoskeletal, and dermal development. Am J Hum Genet. 2009;85(5):593-605. Epub
55 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2009/10/20. doi: 10.1016/j.ajhg.2009.09.013. PubMed PMID: 19836010; PubMed Central
PMCID: PMCPMC2775835.
143. Hogue J, Shankar S, Perry H, Patel R, Vargervik K, Slavotinek A. A novel EFNB1
mutation (c.712delG) in a family with craniofrontonasal syndrome and diaphragmatic hernia.
Am J Med Genet A. 2010;152A(10):2574-7. Epub 2010/08/25. doi: 10.1002/ajmg.a.33596.
PubMed PMID: 20734337.
144. Maas SM, Lombardi MP, van Essen AJ, Wakeling EL, Castle B, Temple IK, et al.
Phenotype and genotype in 17 patients with Goltz-Gorlin syndrome. J Med Genet.
2009;46(10):716-20. Epub 2009/07/10. doi: 10.1136/jmg.2009.068403. PubMed PMID:
19586929.
145. Tuzovic L, Yu L, Zeng W, Li X, Lu H, Lu HM, et al. A human de novo mutation in
MYH10 phenocopies the loss of function mutation in mice. Rare Dis. 2013;1:e26144. Epub
2013/01/01. doi: 10.4161/rdis.26144. PubMed PMID: 25003005; PubMed Central PMCID:
PMCPMC3927488.
146. Yano S, Baskin B, Bagheri A, Watanabe Y, Moseley K, Nishimura A, et al. Familial
Simpson-Golabi-Behmel syndrome: studies of X-chromosome inactivation and clinical
phenotypes in two female individuals with GPC3 mutations. Clin Genet. 2011;80(5):466-71.
Epub 2010/10/19. doi: 10.1111/j.1399-0004.2010.01554.x. PubMed PMID: 20950395.
147. Sarajlija A, Magner M, Djordjevic M, Kecman B, Grujic B, Tesarova M, et al. Late-
presenting congenital diaphragmatic hernia in a child with TMEM70 deficiency. Congenit Anom
(Kyoto). 2017;57(2):64-5. Epub 2016/09/21. doi: 10.1111/cga.12194. PubMed PMID:
27649480.
56 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
148. Kammoun M, Souche E, Brady P, Ding J, Cosemans N, Gratacos E, et al. Genetic
profile of isolated congenital diaphragmatic hernia revealed by targeted next-generation
sequencing. Prenat Diagn. 2018;38(9):654-63. Epub 2018/07/03. doi: 10.1002/pd.5327.
PubMed PMID: 29966037.
149. Jongmans MC, Admiraal RJ, van der Donk KP, Vissers LE, Baas AF, Kapusta L, et al.
CHARGE syndrome: the phenotypic spectrum of mutations in the CHD7 gene. J Med Genet.
2006;43(4):306-14. Epub 2005/09/13. doi: 10.1136/jmg.2005.036061. PubMed PMID:
16155193; PubMed Central PMCID: PMCPMC2563221.
150. Qidwai K, Pearson DM, Patel GS, Pober BR, Immken LL, Cheung SW, et al. Deletions
of Xp provide evidence for the role of holocytochrome C-type synthase (HCCS) in congenital
diaphragmatic hernia. Am J Med Genet A. 2010;152A(6):1588-90. Epub 2010/05/27. doi:
10.1002/ajmg.a.33410. PubMed PMID: 20503342; PubMed Central PMCID:
PMCPMC2909827.
151. Yu L, Bennett JT, Wynn J, Carvill GL, Cheung YH, Shen Y, et al. Whole exome
sequencing identifies de novo mutations in GATA6 associated with congenital diaphragmatic
hernia. J Med Genet. 2014;51(3):197-202. Epub 2014/01/05. doi: 10.1136/jmedgenet-2013-
101989. PubMed PMID: 24385578; PubMed Central PMCID: PMCPMC3955383.
152. Lin IC, Ko SF, Shieh CS, Huang CF, Chien SJ, Liang CD. Recurrent congenital
diaphragmatic hernia in Ehlers-Danlos syndrome. Cardiovasc Intervent Radiol.
2006;29(5):920-3. doi: 10.1007/s00270-005-0154-5. PubMed PMID: 16447004.
57 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
153. Bulfamante G, Gana S, Avagliano L, Fabietti I, Gentilin B, Lalatta F. Congenital
diaphragmatic hernia as prenatal presentation of Apert syndrome. Prenat Diagn.
2011;31(9):910-1. Epub 2011/06/28. doi: 10.1002/pd.2788. PubMed PMID: 21706505.
154. McVeigh TP, Banka S, Reardon W. Kabuki syndrome: expanding the phenotype to
include microphthalmia and anophthalmia. Clin Dysmorphol. 2015;24(4):135-9. Epub
2015/06/08. doi: 10.1097/MCD.0000000000000092. PubMed PMID: 26049589.
155. Reis LM, Tyler RC, Schilter KF, Abdul-Rahman O, Innis JW, Kozel BA, et al. BMP4
loss-of-function mutations in developmental eye disorders including SHORT syndrome. Hum
Genet. 2011;130(4):495-504. Epub 2011/02/23. doi: 10.1007/s00439-011-0968-y. PubMed
PMID: 21340693; PubMed Central PMCID: PMCPMC3178759.
156. Wat MJ, Veenma D, Hogue J, Holder AM, Yu Z, Wat JJ, et al. Genomic alterations that
contribute to the development of isolated and non-isolated congenital diaphragmatic hernia. J
Med Genet. 2011;48(5):299-307. Epub 2011/04/29. doi: 10.1136/jmg.2011.089680. PubMed
PMID: 21525063; PubMed Central PMCID: PMCPMC3227222.
157. Wilmink FA, Papatsonis DN, Grijseels EW, Wessels MW. Cornelia de lange syndrome:
a recognizable fetal phenotype. Fetal Diagn Ther. 2009;26(1):50-3. Epub 2009/10/10. doi:
10.1159/000236361. PubMed PMID: 19816032.
158. Piard J, Collet C, Arbez-Gindre F, Nirhy-Lanto A, Van Maldergem L. Coronal
craniosynostosis and radial ray hypoplasia: a third report of Twist mutation in a 33 weeks fetus
with diaphragmatic hernia. Eur J Med Genet. 2012;55(12):719-22. Epub 2012/09/18. doi:
10.1016/j.ejmg.2012.08.007. PubMed PMID: 22982246.
58 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
159. Graul-Neumann LM, Hausser I, Essayie M, Rauch A, Kraus C. Highly variable cutis laxa
resulting from a dominant splicing mutation of the elastin gene. Am J Med Genet A.
2008;146A(8):977-83. Epub 2008/03/19. doi: 10.1002/ajmg.a.32242. PubMed PMID:
18348261.
160. Niemi AK, Northrup H, Hudgins L, Bernstein JA. Horseshoe kidney and a rare TSC2
variant in two unrelated individuals with tuberous sclerosis complex. Am J Med Genet A.
2011;155A(10):2534-7. Epub 2011/09/13. doi: 10.1002/ajmg.a.34197. PubMed PMID:
21910228.
161. Zayed H, Chao R, Moshrefi A, Lopezjimenez N, Delaney A, Chen J, et al. A maternally
inherited chromosome 18q22.1 deletion in a male with late-presenting diaphragmatic hernia
and microphthalmia-evaluation of DSEL as a candidate gene for the diaphragmatic defect. Am
J Med Genet A. 2010;152A(4):916-23. Epub 2010/04/02. doi: 10.1002/ajmg.a.33341. PubMed
PMID: 20358601; PubMed Central PMCID: PMCPMC2922899.
162. Bulman MP, Kusumi K, Frayling TM, McKeown C, Garrett C, Lander ES, et al.
Mutations in the human delta homologue, DLL3, cause axial skeletal defects in spondylocostal
dysostosis. Nat Genet. 2000;24(4):438-41. Epub 2000/03/31. doi: 10.1038/74307. PubMed
PMID: 10742114.
163. Race H, Hall CM, Harrison MG, Quarrell OW, Wakeling EL. A distinct autosomal
recessive disorder of limb development with preaxial brachydactyly, phalangeal duplication,
symphalangism and hyperphalangism. Clin Dysmorphol. 2010;19(1):23-7. Epub 2009/12/03.
doi: 10.1097/MCD.0b013e328334557e. PubMed PMID: 19952732.
59 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
164. Salonen R, Herva R. Hydrolethalus syndrome. J Med Genet. 1990;27(12):756-9. Epub
1990/12/01. PubMed PMID: 2074561; PubMed Central PMCID: PMCPMC1017280.
165. Taylor J, Aftimos S. Congenital diaphragmatic hernia is a feature of Opitz G/BBB
syndrome. Clin Dysmorphol. 2010;19(4):225-6. Epub 2010/09/09. doi:
10.1097/MCD.0b013e32833b2bd3. PubMed PMID: 20823703.
166. Gupta N, Shastri S, Singh PK, Jana M, Mridha A, Verma G, et al. Nasopharyngeal
teratoma, congenital diaphragmatic hernia and Dandy-Walker malformation - a yet
uncharacterized syndrome. Clin Genet. 2016;90(5):470-1. Epub 2016/10/26. doi:
10.1111/cge.12830. PubMed PMID: 27506516.
167. Ackerman KG, Herron BJ, Vargas SO, Huang H, Tevosian SG, Kochilas L, et al. Fog2
is required for normal diaphragm and lung development in mice and humans. PLoS Genet.
2005;1(1):58-65. Epub 2005/08/17. doi: 10.1371/journal.pgen.0010010. PubMed PMID:
16103912; PubMed Central PMCID: PMCPMC1183529.
168. Antonius T, van Bon B, Eggink A, van der Burgt I, Noordam K, van Heijst A. Denys-
Drash syndrome and congenital diaphragmatic hernia: another case with the 1097G >
A(Arg366His) mutation. Am J Med Genet A. 2008;146A(4):496-9. Epub 2008/01/19. doi:
10.1002/ajmg.a.32168. PubMed PMID: 18203154.
169. van Dooren MF, Brooks AS, Hoogeboom AJ, van den Hoonaard TL, de Klein JE,
Wouters CH, et al. Early diagnosis of Wolf-Hirschhorn syndrome triggered by a life-threatening
event: congenital diaphragmatic hernia. Am J Med Genet A. 2004;127A(2):194-6. Epub
2004/04/27. doi: 10.1002/ajmg.a.20613. PubMed PMID: 15108210.
60 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
170. Revencu N, Quenum G, Detaille T, Verellen G, De Paepe A, Verellen-Dumoulin C.
Congenital diaphragmatic eventration and bilateral uretero-hydronephrosis in a patient with
neonatal Marfan syndrome caused by a mutation in exon 25 of the FBN1 gene and review of
the literature. Eur J Pediatr. 2004;163(1):33-7. Epub 2003/10/31. doi: 10.1007/s00431-003-
1330-8. PubMed PMID: 14586646.
171. Pasutto F, Sticht H, Hammersen G, Gillessen-Kaesbach G, Fitzpatrick DR, Nurnberg G,
et al. Mutations in STRA6 cause a broad spectrum of malformations including anophthalmia,
congenital heart defects, diaphragmatic hernia, alveolar capillary dysplasia, lung hypoplasia,
and mental retardation. Am J Hum Genet. 2007;80(3):550-60. Epub 2007/02/03. doi:
10.1086/512203. PubMed PMID: 17273977; PubMed Central PMCID: PMCPMC1821097.
172. Twigg SR, Kan R, Babbs C, Bochukova EG, Robertson SP, Wall SA, et al. Mutations of
ephrin-B1 (EFNB1), a marker of tissue boundary formation, cause craniofrontonasal
syndrome. Proc Natl Acad Sci U S A. 2004;101(23):8652-7. Epub 2004/05/29. doi:
10.1073/pnas.0402819101. PubMed PMID: 15166289; PubMed Central PMCID:
PMCPMC423250.
173. Smigiel R, Jakubiak A, Lombardi MP, Jaworski W, Slezak R, Patkowski D, et al. Co-
occurrence of severe Goltz-Gorlin syndrome and pentalogy of Cantrell - Case report and
review of the literature. Am J Med Genet A. 2011;155A(5):1102-5. Epub 2011/04/13. doi:
10.1002/ajmg.a.33895. PubMed PMID: 21484999.
174. Hughes-Benzie RM, Pilia G, Xuan JY, Hunter AG, Chen E, Golabi M, et al. Simpson-
Golabi-Behmel syndrome: genotype/phenotype analysis of 18 affected males from 7 unrelated
61 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
families. Am J Med Genet. 1996;66(2):227-34. Epub 1996/12/11. doi: 10.1002/(SICI)1096-
8628(19961211)66:2<227::AID-AJMG20>3.0.CO;2-U. PubMed PMID: 8958336.
175. Catteruccia M, Verrigni D, Martinelli D, Torraco A, Agovino T, Bonafe L, et al. Persistent
pulmonary arterial hypertension in the newborn (PPHN): a frequent manifestation of TMEM70
defective patients. Mol Genet Metab. 2014;111(3):353-9. Epub 2014/02/04. doi:
10.1016/j.ymgme.2014.01.001. PubMed PMID: 24485043.
176. Hosokawa S, Takahashi N, Kitajima H, Nakayama M, Kosaki K, Okamoto N.
Brachmann-de Lange syndrome with congenital diaphragmatic hernia and NIPBL gene
mutation. Congenit Anom (Kyoto). 2010;50(2):129-32. Epub 2010/02/17. doi: 10.1111/j.1741-
4520.2010.00270.x. PubMed PMID: 20156239.
177. Suri M, Kelehan P, O'Neill D, Vadeyar S, Grant J, Ahmed SF, et al. WT1 mutations in
Meacham syndrome suggest a coelomic mesothelial origin of the cardiac and diaphragmatic
malformations. Am J Med Genet A. 2007;143A(19):2312-20. Epub 2007/09/14. doi:
10.1002/ajmg.a.31924. PubMed PMID: 17853480.
178. Stheneur C, Faivre L, Collod-Beroud G, Gautier E, Binquet C, Bonithon-Kopp C, et al.
Prognosis factors in probands with an FBN1 mutation diagnosed before the age of 1 year.
Pediatr Res. 2011;69(3):265-70. Epub 2010/12/08. doi: 10.1203/PDR.0b013e3182097219.
PubMed PMID: 21135753.
179. Vasudevan PC, Twigg SR, Mulliken JB, Cook JA, Quarrell OW, Wilkie AO. Expanding
the phenotype of craniofrontonasal syndrome: two unrelated boys with EFNB1 mutations and
congenital diaphragmatic hernia. Eur J Hum Genet. 2006;14(7):884-7. Epub 2006/04/28. doi:
10.1038/sj.ejhg.5201633. PubMed PMID: 16639408.
62 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
180. Dias C, Basto J, Pinho O, Barbedo C, Martins M, Bornholdt D, et al. A nonsense porcn
mutation in severe focal dermal hypoplasia with natal teeth. Fetal Pediatr Pathol.
2010;29(5):305-13. Epub 2010/08/14. doi: 10.3109/15513811003796912. PubMed PMID:
20704476.
181. Veugelers M, Cat BD, Muyldermans SY, Reekmans G, Delande N, Frints S, et al.
Mutational analysis of the GPC3/GPC4 glypican gene cluster on Xq26 in patients with
Simpson-Golabi-Behmel syndrome: identification of loss-of-function mutations in the GPC3
gene. Hum Mol Genet. 2000;9(9):1321-8. Epub 2000/05/18. PubMed PMID: 10814714.
182. Denamur E, Bocquet N, Baudouin V, Da Silva F, Veitia R, Peuchmaur M, et al. WT1
splice-site mutations are rarely associated with primary steroid-resistant focal and segmental
glomerulosclerosis. Kidney Int. 2000;57(5):1868-72. Epub 2000/05/03. doi: 10.1046/j.1523-
1755.2000.00036.x. PubMed PMID: 10792605.
183. Chassaing N, Golzio C, Odent S, Lequeux L, Vigouroux A, Martinovic-Bouriel J, et al.
Phenotypic spectrum of STRA6 mutations: from Matthew-Wood syndrome to non-lethal
anophthalmia. Hum Mutat. 2009;30(5):E673-81. Epub 2009/03/25. doi: 10.1002/humu.21023.
PubMed PMID: 19309693.
184. Twigg SR, Matsumoto K, Kidd AM, Goriely A, Taylor IB, Fisher RB, et al. The origin of
EFNB1 mutations in craniofrontonasal syndrome: frequent somatic mosaicism and explanation
of the paucity of carrier males. Am J Hum Genet. 2006;78(6):999-1010. Epub 2006/05/11. doi:
10.1086/504440. PubMed PMID: 16685650; PubMed Central PMCID: PMCPMC1474108.
185. Li M, Shuman C, Fei YL, Cutiongco E, Bender HA, Stevens C, et al. GPC3 mutation
analysis in a spectrum of patients with overgrowth expands the phenotype of Simpson-Golabi-
63 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Behmel syndrome. Am J Med Genet. 2001;102(2):161-8. Epub 2001/07/31. PubMed PMID:
11477610.
186. Horiguchi M, Inoue T, Ohbayashi T, Hirai M, Noda K, Marmorstein LY, et al. Fibulin-4
conducts proper elastogenesis via interaction with cross-linking enzyme lysyl oxidase. Proc
Natl Acad Sci U S A. 2009;106(45):19029-34. Epub 2009/10/27. doi:
10.1073/pnas.0908268106. PubMed PMID: 19855011; PubMed Central PMCID:
PMCPMC2776456.
187. Kreidberg JA, Sariola H, Loring JM, Maeda M, Pelletier J, Housman D, et al. WT-1 is
required for early kidney development. Cell. 1993;74(4):679-91. Epub 1993/08/27. PubMed
PMID: 8395349.
188. Clugston RD, Klattig J, Englert C, Clagett-Dame M, Martinovic J, Benachi A, et al.
Teratogen-induced, dietary and genetic models of congenital diaphragmatic hernia share a
common mechanism of pathogenesis. Am J Pathol. 2006;169(5):1541-9. Epub 2006/10/31.
doi: 10.2353/ajpath.2006.060445. PubMed PMID: 17071579; PubMed Central PMCID:
PMCPMC1780206.
189. Yuan W, Rao Y, Babiuk RP, Greer JJ, Wu JY, Ornitz DM. A genetic model for a central
(septum transversum) congenital diaphragmatic hernia in mice lacking Slit3. Proc Natl Acad
Sci U S A. 2003;100(9):5217-22. Epub 2003/04/19. doi: 10.1073/pnas.0730709100. PubMed
PMID: 12702769; PubMed Central PMCID: PMCPMC154325.
190. Coles GL, Ackerman KG. Kif7 is required for the patterning and differentiation of the
diaphragm in a model of syndromic congenital diaphragmatic hernia. Proc Natl Acad Sci U S
64 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
A. 2013;110(21):E1898-905. Epub 2013/05/08. doi: 10.1073/pnas.1222797110. PubMed
PMID: 23650387; PubMed Central PMCID: PMCPMC3666741.
191. Hentsch B, Lyons I, Li R, Hartley L, Lints TJ, Adams JM, et al. Hlx homeo box gene is
essential for an inductive tissue interaction that drives expansion of embryonic liver and gut.
Genes Dev. 1996;10(1):70-9. Epub 1996/01/01. PubMed PMID: 8557196.
192. Kim PC, Mo R, Hui Cc C. Murine models of VACTERL syndrome: Role of sonic
hedgehog signaling pathway. J Pediatr Surg. 2001;36(2):381-4. Epub 2001/02/15. PubMed
PMID: 11172440.
193. Beck TF, Veenma D, Shchelochkov OA, Yu Z, Kim BJ, Zaveri HP, et al. Deficiency of
FRAS1-related extracellular matrix 1 (FREM1) causes congenital diaphragmatic hernia in
humans and mice. Hum Mol Genet. 2013;22(5):1026-38. Epub 2012/12/12. doi:
10.1093/hmg/dds507. PubMed PMID: 23221805; PubMed Central PMCID: PMCPMC3561915.
194. You LR, Takamoto N, Yu CT, Tanaka T, Kodama T, Demayo FJ, et al. Mouse lacking
COUP-TFII as an animal model of Bochdalek-type congenital diaphragmatic hernia. Proc Natl
Acad Sci U S A. 2005;102(45):16351-6. Epub 2005/10/28. doi: 10.1073/pnas.0507832102.
PubMed PMID: 16251273; PubMed Central PMCID: PMCPMC1283449.
195. Babiuk RP, Greer JJ. Diaphragm defects occur in a CDH hernia model independently of
myogenesis and lung formation. Am J Physiol Lung Cell Mol Physiol. 2002;283(6):L1310-4.
Epub 2002/10/22. doi: 10.1152/ajplung.00257.2002. PubMed PMID: 12388344.
196. Baertschi S, Zhuang L, Trueb B. Mice with a targeted disruption of the Fgfrl1 gene die at
birth due to alterations in the diaphragm. FEBS J. 2007;274(23):6241-53. Epub 2007/11/08.
doi: 10.1111/j.1742-4658.2007.06143.x. PubMed PMID: 17986259.
65 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
197. Mendelsohn C, Lohnes D, Decimo D, Lufkin T, LeMeur M, Chambon P, et al. Function
of the retinoic acid receptors (RARs) during development (II). Multiple abnormalities at various
stages of organogenesis in RAR double mutants. Development. 1994;120(10):2749-71. Epub
1994/10/01. PubMed PMID: 7607068.
198. Pereira L, Lee SY, Gayraud B, Andrikopoulos K, Shapiro SD, Bunton T, et al.
Pathogenetic sequence for aneurysm revealed in mice underexpressing fibrillin-1. Proc Natl
Acad Sci U S A. 1999;96(7):3819-23. Epub 1999/03/31. PubMed PMID: 10097121; PubMed
Central PMCID: PMCPMC22378.
199. Ramirez-Solis R, Zheng H, Whiting J, Krumlauf R, Bradley A. Hoxb-4 (Hox-2.6) mutant
mice show homeotic transformation of a cervical vertebra and defects in the closure of the
sternal rudiments. Cell. 1993;73(2):279-94. Epub 1993/04/23. PubMed PMID: 8097432.
200. Liu Y, Oppenheim RW, Sugiura Y, Lin W. Abnormal development of the neuromuscular
junction in Nedd4-deficient mice. Dev Biol. 2009;330(1):153-66. Epub 2009/04/07. doi:
10.1016/j.ydbio.2009.03.023. PubMed PMID: 19345204; PubMed Central PMCID:
PMCPMC2810636.
201. Hildebrand JD, Soriano P. Overlapping and unique roles for C-terminal binding protein 1
(CtBP1) and CtBP2 during mouse development. Mol Cell Biol. 2002;22(15):5296-307. Epub
2002/07/09. PubMed PMID: 12101226; PubMed Central PMCID: PMCPMC133942.
202. Valasek P, Theis S, DeLaurier A, Hinits Y, Luke GN, Otto AM, et al. Cellular and
molecular investigations into the development of the pectoral girdle. Dev Biol.
2011;357(1):108-16. Epub 2011/07/12. doi: 10.1016/j.ydbio.2011.06.031. PubMed PMID:
21741963.
66 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
203. Misgeld T, Burgess RW, Lewis RM, Cunningham JM, Lichtman JW, Sanes JR. Roles of
neurotransmitter in synapse formation: development of neuromuscular junctions lacking
choline acetyltransferase. Neuron. 2002;36(4):635-48. Epub 2002/11/21. PubMed PMID:
12441053.
204. Domyan ET, Branchfield K, Gibson DA, Naiche LA, Lewandoski M, Tessier-Lavigne M,
et al. Roundabout receptors are critical for foregut separation from the body wall. Dev Cell.
2013;24(1):52-63. Epub 2013/01/19. doi: 10.1016/j.devcel.2012.11.018. PubMed PMID:
23328398; PubMed Central PMCID: PMCPMC3551250.
205. Oh J, Takahashi R, Adachi E, Kondo S, Kuratomi S, Noma A, et al. Mutations in two
matrix metalloproteinase genes, MMP-2 and MT1-MMP, are synthetic lethal in mice.
Oncogene. 2004;23(29):5041-8. Epub 2004/04/06. doi: 10.1038/sj.onc.1207688. PubMed
PMID: 15064723.
206. Zhang B, Xiao W, Qiu H, Zhang F, Moniz HA, Jaworski A, et al. Heparan sulfate
deficiency disrupts developmental angiogenesis and causes congenital diaphragmatic hernia.
J Clin Invest. 2014;124(1):209-21. Epub 2013/12/21. doi: 10.1172/JCI71090. PubMed PMID:
24355925; PubMed Central PMCID: PMCPMC3871243.
207. Krieser RJ, MacLea KS, Longnecker DS, Fields JL, Fiering S, Eastman A.
Deoxyribonuclease IIalpha is required during the phagocytic phase of apoptosis and its loss
causes perinatal lethality. Cell Death Differ. 2002;9(9):956-62. Epub 2002/08/16. doi:
10.1038/sj.cdd.4401056. PubMed PMID: 12181746.
208. Hornstra IK, Birge S, Starcher B, Bailey AJ, Mecham RP, Shapiro SD. Lysyl oxidase is
required for vascular and diaphragmatic development in mice. J Biol Chem.
67 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2003;278(16):14387-93. Epub 2002/12/11. doi: 10.1074/jbc.M210144200. PubMed PMID:
12473682.
209. Shi L, Zhao G, Qiu D, Godfrey WR, Vogel H, Rando TA, et al. NF90 regulates cell cycle
exit and terminal myogenic differentiation by direct binding to the 3'-untranslated region of
MyoD and p21WAF1/CIP1 mRNAs. J Biol Chem. 2005;280(19):18981-9. Epub 2005/03/05.
doi: 10.1074/jbc.M411034200. PubMed PMID: 15746098.
210. Mill P, Lockhart PJ, Fitzpatrick E, Mountford HS, Hall EA, Reijns MA, et al. Human and
mouse mutations in WDR35 cause short-rib polydactyly syndromes due to abnormal
ciliogenesis. Am J Hum Genet. 2011;88(4):508-15. Epub 2011/04/09. doi:
10.1016/j.ajhg.2011.03.015. PubMed PMID: 21473986; PubMed Central PMCID:
PMCPMC3071922.
211. Kim Y, Sharov AA, McDole K, Cheng M, Hao H, Fan CM, et al. Mouse B-type lamins
are required for proper organogenesis but not by embryonic stem cells. Science.
2011;334(6063):1706-10. Epub 2011/11/26. doi: 10.1126/science.1211222. PubMed PMID:
22116031; PubMed Central PMCID: PMCPMC3306219.
212. Arber S, Han B, Mendelsohn M, Smith M, Jessell TM, Sockanathan S. Requirement for
the homeobox gene Hb9 in the consolidation of motor neuron identity. Neuron.
1999;23(4):659-74. Epub 1999/09/11. PubMed PMID: 10482234.
213. Pacifici PG, Peter C, Yampolsky P, Koenen M, McArdle JJ, Witzemann V. Novel mouse
model reveals distinct activity-dependent and -independent contributions to synapse
development. PLoS One. 2011;6(1):e16469. Epub 2011/02/10. doi:
68 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
10.1371/journal.pone.0016469. PubMed PMID: 21305030; PubMed Central PMCID:
PMCPMC3031568.
214. Ibdah JA, Paul H, Zhao Y, Binford S, Salleng K, Cline M, et al. Lack of mitochondrial
trifunctional protein in mice causes neonatal hypoglycemia and sudden death. J Clin Invest.
2001;107(11):1403-9. Epub 2001/06/08. doi: 10.1172/JCI12590. PubMed PMID: 11390422;
PubMed Central PMCID: PMCPMC209324.
215. Sachs M, Brohmann H, Zechner D, Muller T, Hulsken J, Walther I, et al. Essential role
of Gab1 for signaling by the c-Met receptor in vivo. J Cell Biol. 2000;150(6):1375-84. Epub
2000/09/20. PubMed PMID: 10995442; PubMed Central PMCID: PMCPMC2150711.
216. Goshu E, Jin H, Fasnacht R, Sepenski M, Michaud JL, Fan CM. Sim2 mutants have
developmental defects not overlapping with those of Sim1 mutants. Mol Cell Biol.
2002;22(12):4147-57. Epub 2002/05/25. PubMed PMID: 12024028; PubMed Central PMCID:
PMCPMC133848.
217. Kurohara K, Komatsu K, Kurisaki T, Masuda A, Irie N, Asano M, et al. Essential roles of
Meltrin beta (ADAM19) in heart development. Dev Biol. 2004;267(1):14-28. Epub 2004/02/21.
doi: 10.1016/j.ydbio.2003.10.021. PubMed PMID: 14975714.
218. Laurin M, Fradet N, Blangy A, Hall A, Vuori K, Cote JF. The atypical Rac activator
Dock180 (Dock1) regulates myoblast fusion in vivo. Proc Natl Acad Sci U S A.
2008;105(40):15446-51. Epub 2008/09/30. doi: 10.1073/pnas.0805546105. PubMed PMID:
18820033; PubMed Central PMCID: PMCPMC2563090.
219. Meech R, Gonzalez KN, Barro M, Gromova A, Zhuang L, Hulin JA, et al. Barx2 is
expressed in satellite cells and is required for normal muscle growth and regeneration. Stem
69 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Cells. 2012;30(2):253-65. Epub 2011/11/15. doi: 10.1002/stem.777. PubMed PMID: 22076929;
PubMed Central PMCID: PMCPMC4547831.
220. Chevessier F, Girard E, Molgo J, Bartling S, Koenig J, Hantai D, et al. A mouse model
for congenital myasthenic syndrome due to MuSK mutations reveals defects in structure and
function of neuromuscular junctions. Hum Mol Genet. 2008;17(22):3577-95. Epub 2008/08/23.
doi: 10.1093/hmg/ddn251. PubMed PMID: 18718936.
221. Zvaritch E, Depreux F, Kraeva N, Loy RE, Goonasekera SA, Boncompagni S, et al. An
Ryr1I4895T mutation abolishes Ca2+ release channel function and delays development in
homozygous offspring of a mutant mouse line. Proc Natl Acad Sci U S A. 2007;104(47):18537-
42. Epub 2007/11/16. doi: 10.1073/pnas.0709312104. PubMed PMID: 18003898; PubMed
Central PMCID: PMCPMC2141812.
222. Li S, Czubryt MP, McAnally J, Bassel-Duby R, Richardson JA, Wiebel FF, et al.
Requirement for serum response factor for skeletal muscle growth and maturation revealed by
tissue-specific gene deletion in mice. Proc Natl Acad Sci U S A. 2005;102(4):1082-7. Epub
2005/01/14. doi: 10.1073/pnas.0409103102. PubMed PMID: 15647354; PubMed Central
PMCID: PMCPMC545866.
223. Nagata K, Kiryu-Seo S, Maeda M, Yoshida K, Morita T, Kiyama H. Damage-induced
neuronal endopeptidase is critical for presynaptic formation of neuromuscular junctions. J
Neurosci. 2010;30(20):6954-62. Epub 2010/05/21. doi: 10.1523/JNEUROSCI.4521-09.2010.
PubMed PMID: 20484637.
224. Uetani N, Chagnon MJ, Kennedy TE, Iwakura Y, Tremblay ML. Mammalian motoneuron
axon targeting requires receptor protein tyrosine phosphatases sigma and delta. J Neurosci.
70 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
2006;26(22):5872-80. Epub 2006/06/02. doi: 10.1523/JNEUROSCI.0386-06.2006. PubMed
PMID: 16738228.
225. Reinholt BM, Ge X, Cong X, Gerrard DE, Jiang H. Stac3 is a novel regulator of skeletal
muscle development in mice. PLoS One. 2013;8(4):e62760. Epub 2013/04/30. doi:
10.1371/journal.pone.0062760. PubMed PMID: 23626854; PubMed Central PMCID:
PMCPMC3633831.
226. Zhang P, Liegeois NJ, Wong C, Finegold M, Hou H, Thompson JC, et al. Altered cell
differentiation and proliferation in mice lacking p57KIP2 indicates a role in Beckwith-
Wiedemann syndrome. Nature. 1997;387(6629):151-8. Epub 1997/05/08. doi:
10.1038/387151a0. PubMed PMID: 9144284.
227. Hicks AN, Lorenzetti D, Gilley J, Lu B, Andersson KE, Miligan C, et al. Nicotinamide
mononucleotide adenylyltransferase 2 (Nmnat2) regulates axon integrity in the mouse embryo.
PLoS One. 2012;7(10):e47869. Epub 2012/10/20. doi: 10.1371/journal.pone.0047869.
PubMed PMID: 23082226; PubMed Central PMCID: PMCPMC3474723.
228. Xian J, Clark KJ, Fordham R, Pannell R, Rabbitts TH, Rabbitts PH. Inadequate lung
development and bronchial hyperplasia in mice with a targeted deletion in the Dutt1/Robo1
gene. Proc Natl Acad Sci U S A. 2001;98(26):15062-6. Epub 2001/12/06. doi:
10.1073/pnas.251407098. PubMed PMID: 11734623; PubMed Central PMCID:
PMCPMC64983.
229. Bladt F, Riethmacher D, Isenmann S, Aguzzi A, Birchmeier C. Essential role for the c-
met receptor in the migration of myogenic precursor cells into the limb bud. Nature.
1995;376(6543):768-71. Epub 1995/08/31. doi: 10.1038/376768a0. PubMed PMID: 7651534.
71 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
230. Catela C, Bilbao-Cortes D, Slonimsky E, Kratsios P, Rosenthal N, Te Welscher P.
Multiple congenital malformations of Wolf-Hirschhorn syndrome are recapitulated in Fgfrl1 null
mice. Dis Model Mech. 2009;2(5-6):283-94. Epub 2009/04/23. doi: 10.1242/dmm.002287.
PubMed PMID: 19383940; PubMed Central PMCID: PMCPMC2675798.
231. Hasty P, Bradley A, Morris JH, Edmondson DG, Venuti JM, Olson EN, et al. Muscle
deficiency and neonatal death in mice with a targeted mutation in the myogenin gene. Nature.
1993;364(6437):501-6. Epub 1993/08/05. doi: 10.1038/364501a0. PubMed PMID: 8393145.
72 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
SUPPORTING INFORMATION
Figure S1: Alternative allele read depth of somatic de novo variants is significantly lower than
germline de novo variants across CDH probands (A) and tissues (B).
Figure S2: The spectrum of de novo variants varies between the different CDH probands (A), but
does not vary between the three tissues sampled from the probands (B).
Table S1: Genes associated with CDH in the literature. Genes were collated from multiple studies,
including family and single patient case studies, large human patient sequencing studies and functional
studies using mouse genetics. A gene from human studies is included if multiple patients harbor
variants in the gene, the gene is associated with other implicated genes and expressed in the
developing diaphragm, and/or the gene is associated with other developmental disorders or structural
birth defects that co-occur in patients with complex CDH. A gene from mouse studies is included if a
variant in the gene leads to a diaphragm defect. Genes were then ranked based on the amount and
type of evidence implicating their contribution to CDH (see Materials and Methods). Literature cited [11-
17, 21, 23, 28, 30, 34-46, 48, 132-231]. Genes involved in GATA/TBX transcriptional network are
highlighted in blue; in Hedgehog signaling in cyan; in Retinoic Acid signaling in dark green; in
ROBO/SLIT signaling in purple; in MET signaling in orange; in WNT/bCATENIN signaling in yellow; and
in FGF signaling in pink and ECM proteins are in light green and in muscle-related transcription factors
or structural/functional genes in red.
Table S2: DNA samples of CDH probands and their parents are of high quality and expected
ancestry and relatedness. Sheet 1 reports results of Peddy analysis of heterozygosity and indicates
that sequences were of high quality and ancestry congruent with self-reported ancestry. Sheet 2 reports
73 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
results of Peddy sex analysis and shows that sex of DNA samples is congruent with known sex of
sample origins. Sheet 3 reports results of Peddy analysis of relatedness and shows that relatedness is
as expected.
Table S3: Germline de novo and somatic de novo variants identified by rufus in CDH probands.
True positive variants were identified by rufus and then filtered based on read depth (>15 reads total in
all samples compared in the rufus call), call quality (>20 GQ score) and alternate allele frequency
(sample of interest >20% alternate read frequency and other samples in comparison <20% alternate
read frequency). Sheet 1: summary of all de novo variants identified. Sheet 2: de novo coding variants
identified. Sheets 3-5: de novo gene-associated variants identified in diaphragm, skin, and blood.
Sheets 6-8: de novo non-gene associated variants identified in diaphragm, skin, and blood.
Table S4: Predicted gene damaging variants inherited from parents of CDH probands. VAAST3
recessive model comparing CDH patient families to 1000 Genome phase 3 individuals was used to call
predicted damaging, rare recessive variants. Genes harboring variants with a VAAST p-value <0.05
were saved as candidate genes potentially contributing to CDH (Sheet 2 shows unfiltered genes).
Genes were then further filtered based on expression in PPFs (genes needed to be expressed at > 10
TPM in PPFs), and pseudogenes and highly mutable genes were removed (Sheet 1 shows filtered
genes).
Table S5. Germline de novo and inherited structural variants possibly contributing to CDH.
Structural variants were identified using the Lumpy Smooth workflow. Germline de novo variants were
identified by querying variants unique to the CDH proband compared to the parents, filtered based on
reads that Lumpy scored as supporting the variant (>8 reads either split or discordant) and finally
visually inspected in IGV to determine whether parents had evidence of the same variant. Inherited
74 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
structural variants were filtered based on chromosomal regions highly associated with CDH [10] .
Structural variants were visually inspected and filtered out if allele frequency > 0.1 for inherited SVs or >
0.01 for de novo SVs [94], if SV was located in an intergenic region, or overlapped a repetitive element
[125].
FIGURE LEGENDS
Figure 1: Genes associated with CDH in the literature were collected and ranked based on the
amount of human CDH patient and mouse model functional evidence (see Table S1 for
rankings). A) CDH-associated genes overlaid on all genes expressed in mouse PPFs at E12.5. RNA-
seq reads normalized using TPM. Black line denotes TPM = 0 and red dashed line TPM = 10 (Log1).
Black dots, CDH genes highly supported by human and mouse data; dark grey dots, genes with
moderate data support; light grey dots, CDH genes with modest data support. B) STRING protein
network of CDH-associated genes. Within the STRING tool all active interaction sources except text-
mining was used, with a minimum required interaction score of 0.4. Edges are based on the strength of
data support. Nodes without edges are genes that do not interact with any other listed genes. Thick
edges, interaction score > 0.9; thinner edges, 0.9 - 0.7 interaction score; thinnest edges, interaction
score < 0.4. Prominent GATA/TBX transcriptional network, extracellular matrix, and muscle-related
genes are highlighted as well as Hedgehog, Retinoic Acid, ROBO/SLIT, MET, WNT/b-CATENIN, and
FGF signaling pathways. In both RNA-seq expression and protein network genes ranked 1-27 (with a
total score of > 10) are shown in black, genes ranked 28- 88 (with a total score 5-9) are shown in dark
gray, and genes ranked 89- 153 (with a total score <5) are shown in light gray. Details on rankings are
described in Table S1 and Materials and Methods.
75 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Figure 2: Sample cohorts analyzed in this study. Each cohort consists of DNA samples from
diaphragm sac (yellow), skin (blue), and blood (red) of CDH proband and blood from proband’s mother
and father.
Figure 3: Germline and somatic de novo variants, identified via RUFUS, in CDH probands. A) De
novo germline and somatic variants unique to a single tissue and shared between two tissues (coding,
gene-associated non-coding, or non-gene associated variants) were found in the 4 CDH probands. 5
genes (PEX6, SCARB1, OLFM3, ZNF792, AR, and THSD7A) contained germline coding variants.
Genes with coding variants are highlighted in red. #/#/# = exon (coding) variants/ UTR+intron variants/
intergenic variants. B) Five genes with germline de novo coding variants overlaid on all genes
expressed in mouse PPFs at E12.5 (ZNF792 has no mouse ortholog). RNA-seq reads normalized
using TPM. Black line denotes TPM = 0 and red dashed line TPM = 10. Only SCARB1, PEX6, and
THSD7A are expressed in the PPFs at approximately > 10 TPM.
Figure 4: Inherited variants potentially contributing to etiology of CDH probands. A) Compound
heterozygous (boxed in dark purple) or homozygous (boxed in light purple) inherited variants highly
ranked as damaging and rare by VAAST, with pseudogenes, highly mutable genes, and genes
expressed at low levels in PPFs filtered out. B) Candidate genes with inherited variants overlaid on all
genes expressed in mouse PPFs at E12.5. RNA-seq reads normalized using TPM. Black line denotes
TPM = 0 and red dashed line TPM = 10.
Figure 5: Inherited deletion in intron 2 of GATA4, in a region which overlaps an enhancer
conserved in mouse and humans, found in CDH Probands 411 and 967. A-D) Samplot figures
show coverage in grey (right y-axis) and Insert size between pair-end reads (left y-axis), with split and
discordant reads with mis-matched insert size above 500 shown as bars. Paternal inheritance in
76 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Bogenschutz et al.
Proband 411 (A) and maternal inheritance in Proband 967 (D) of intron 2 deletion. E) GATA4 intron 2
and 343 bp deleted region (red box) with tracks of gene location, conservation, DNase 1
hypersensitivity and H3K27 acetylation in human lung fibroblasts. Bottom track shows enhancer
element 2205 from the VISTA enhancer database within which the deletion resides. Built on the hg19
human reference genome. F) Pairwise alignment of 343 bp region deletion in CDH families to the
homologous region in the mouse genome (mm9 reference genome). Track 1, consensus sequence;
track 2, base pair identity; track 3, human (hg19) sequence; track 4, mouse minus strand sequence.
Figure 6: Models of four CDH probands with inherited, germline and somatic de novo variants,
including coding, gene-associated (but non-coding), and non-gene associated SNVs, indels,
and SVs. Each model highlights the genetic complexity and heterogeneity underlying CDH. Z, zygote;
SD, skin-diaphragm progenitor; SB, skin-blood progenitor; DB, diaphragm-blood progenitor; S, skin cell;
D, diaphragm cell; B, blood cell.
77 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bioRxiv preprint doi: https://doi.org/10.1101/2020.04.03.024398; this version posted April 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.