bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 SORBS2 is a genetic factor contributing to cardiac malformation of 4q deletion syndrome

2

3 Fei Liang1, *, Xiaoqing Zhang2, *, Bo Wang2, *, Juan Geng2, Guoling You2, Jingjing Fa2, Huiying

4 Sun3, Huiwen Chen4, #, Qihua Fu2, #, Zhen Zhang1, #.

5 1Shanghai Pediatric Congenital Disease Institute and Pediatric Translational Medicine

6 Institute, Shanghai Children’s Medical Center, Shanghai Jiao Tong University School of

7 Medicine, Shanghai 200127, China.

8 2Department of Laboratory Medicine, Shanghai Children’s Medical Center, Shanghai Jiao

9 Tong University School of Medicine, Shanghai 200127, China.

10 3Key Laboratory of Pediatric Hematology and Oncology Ministry of Health and Pediatric

11 Translational Medicine Institute, Shanghai Children’s Medical Center, Shanghai Jiao Tong

12 University School of Medicine, Shanghai 200127, China.

13 4Department of thoracic and cardiac surgery, Shanghai Children’s Medical Center, Shanghai

14 Jiao Tong University School of Medicine, Shanghai 200127, China.

15

16 *These authors contributed equally.

17 #Correspondence to: Zhen Zhang, [email protected]; Qihua Fu, [email protected];

18 Huiwen Chen, [email protected]

19

20 Running title: SORBS2 and congenital heart disease

21 22

1

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

23 Abstract

24

25 4q deletion is one of the most frequently detected genomic imbalance events

26 in congenital heart disease (CHD) patients. However, a portion of CHD-associated 4q deletions

27 do not include known CHD . Alignment of those 4q deletions defined a minimal

28 overlapping region including only one -SORBS2. Histological analysis of Sorbs2-/- heart

29 revealed atrial septal hypoplasia/aplasia or double atrial septum. Mechanistically, SORBS2 had

30 a dual role in maintaining sarcomeric integrity of cardiomyocytes and specifying the fate of

31 second heart field (SHF) progenitors through c-ABL/NOTCH/SHH axis. In a targeted

32 sequencing of a panel of known and candidate CHD genes on 300 CHD cases, we found that

33 rare SORBS2 variants were significantly enriched in CHD patients. Our findings indicate that

34 SORBS2 is a regulator of cardiac development and its haploinsufficiency may contribute to

35 cardiac phenotype of 4q deletion syndrome. The presence of double atrial septum in Sorbs2-/-

36 reveals the first molecular etiology of this rare anomaly linked to paradoxical

37 thromboembolism.

38

39

40 Keywords: 4q deletion syndrome, Congenital heart disease, SORBS2, second heart field,

41 Notch1

42

2

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

43 Introduction

44 Congenital heart disease (CHD) is present in ~1% newborns (1). Although CHD is the most

45 common birth defect, the underlying causes of most CHDs remain elusive. Genetic

46 heterogeneity and gene-environment interaction limit the genotype-phenotype correlation and

47 complicate interpretation of CHD genetics. Anatomic rectification through interventional and

48 surgical treatments have greatly improved survival rate of CHD patients (2). However, the same

49 medical treatment often yields different outcomes in different CHD patients, suggesting the

50 underlying genetic lesion impacts CHD prognosis. Therefore, elucidation of CHD genetic basis

51 is becoming ever more important in both reducing short-term therapeutic complications and

52 improving long-term care of CHD patients. With the advances in genetic diagnosis techniques,

53 the discovery of genetic variants is no longer a limit (3). The interpretation of the association

54 of identified genetic variants with phenotype becomes the imperative need to address the

55 genetic basis of CHD.

56 CNV is a chromosomal structural aberration comprising of deletion or duplication with a

57 size larger than 1kb. Due to the abundance of low-copy repeats and retrotransposons, CNVs

58 are quite common genetic variants in human genomes. Previous studies have estimated that

59 CNVs contribute to 10-15% CHD (4, 5). The most common chromosomal deletion is

60 Del22q11.2, which is the shared genetic cause of 3 clinical syndromes—DiGeorge syndrome,

61 velocardiofacial syndrome and conotruncal anomaly face syndrome (6). The common forms of

62 Del22q11.2 are ~3 or 1.5 Mbs deletion encompassing dozens of genes (6). The successful

63 identification of the major CHD gene TBX1 has greatly advanced our understanding of the

64 pathogenesis of 22q11.2 deletion syndrome (22q11.2DS) (7-10). The gained knowledge also

3

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

65 leads to the discovery of potential therapeutic measures to rescue cardiovascular defect through

66 pharmaceutical manipulation of TBX1-involved biological pathway (11-13). Interestingly,

67 untypical 22q11.2 deletions outside of the 1.5 Mbs critical region are also associated with

68 CHD(14-16). Studies have shown that CRKL and LZTR1 within these untypical deletions

69 regulate cardiac development and may contribute to heart malformations of patients with

70 typical 3Mbs deletion(15, 17, 18). Therefore, the identification of all the disease genes within

71 CNV intervals is crucial for having a complete view of the pathogenesis of related phenotypes.

72 We previously profiled CNVs distribution in 514 CHD cases and have found that

73 chromosome 4q deletion is the most frequent CHD-related CNV in our cohort(19). The

74 majority of terminal 4q deletion we identified does not include HAND2 that are considered as

75 the culprit gene responsible for CHD in 4q deletion syndrome. Through comparing our findings

76 with reported 4q deletions, we hypothesized that SORBS2 within the minimal overlapping

77 region may be a CHD gene. To this end, we deleted Sorbs2 in mice. Histological analyses of

78 Sorbs2-/- mouse hearts revealed atrial septal anomalies in 40% (12/30) mice. SORBS2

79 knockdown in in vitro human cardiomyocyte differentiation model indicated a dual role of

80 SORBS2 in cardiogenesis. The encoded not only maintains the sarcomeric integrity as

81 a structural component, but also promotes the differentiation of the second heart field (SHF)

82 progenitors through C-/Notch1/Shh axis. To further validate its pathogenicity in CHD, we

83 performed targeted sequencing of 104 candidate and known CHD genes on 300 complex CHD

84 cases. After comparing with 220 Han Chinese subjects from 1000 genomes, we found that

85 SORBS2 and known CHD gene KMT2D had significantly higher mutational burden in the CHD

86 group. Taken together, our data indicate that SORBS2 is a novel CHD gene contributing to the

4

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

87 cardiac phenotype of 4q deletion syndrome.

88

89 Results

90 SORBS2 is a potential CHD disease gene

91 We aligned CHD-associated terminal 4q deletions identified in our previous study with

92 reported 4q deletions. We noted that more than half of them do not include the known CHD

93 gene HAND2 (Figure 1A). The minimal overlapping region among those deletions includes

94 only one gene-SORBS2. It was proposed as a candidate CHD gene in the study identifying this

95 unusual small deletion(20). However, the pathogenicity of SORBS2 deficiency on CHD is still

96 unclear.

97 SORBS2 encodes a cytoskeleton associated adaptor protein that regulates cytoskeletal

98 organization and intracellular signaling (21). It is highly expressed in the heart and participates

99 in structural organization of cardiomyocyte Z-discs (22). In analyzing RNA-seq data of human

100 fetal heart tissues (96 days -147 days) from the ENCODE database (23), we found that SORBS2

101 expression was highly correlated with those of multiple known CHD genes (Figure 1B). Taken

102 together, we thought that SORBS2 might be required for cardiac development.

103

104 Sorbs2-/- mice have atrial septal defect

105 The entire SORBS2 gene is absent in terminal 4q deletion, hence we used Sorbs2 knockout

106 mice to study its role in cardiac development. In previous report, about 40-60% Sorbs2-/- mice

107 die within 1 week after birth (24). We wonder whether structural heart defect(s) contributes to

108 the early lethality. To this end, we intercrossed Sorbs2+/- mice and collected 137 embryos at

5

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

109 E18.5. The ratio of genotype distribution among embryos is consistent with Mendel's law

110 (Table S1), suggesting no embryo loss in early development stage. We dissected 30 Sorbs2-/-

111 embryos and none of them showed conotruncal defect from gross examination (Figure 2A).

112 Next, we performed paraffin sectioning for intracardiac morphological analysis. No ventricular

113 septal defect (VSD) was noted in all the Sorbs2-/- hearts (Figure 2B). However, we found that

114 about 40% (12/30) Sorbs2-/- hearts had atrial septal defect (ASD) with 10 being the absence of

115 primary septum and 2 being DAS (Figure 2C). The presence of ASD in Sorbs2-/- mice indicates

116 that SORBS2 deficiency contributes to CHD pathogenesis.

117

118 SORBS2-knockdown cardiomyocytes have disrupted sarcomeric structure but no

119 obvious electrophysiological abnormality

120 Having known that Sorbs2 knockout caused ASD in mice, we used in vitro stem cell-based

121 cardiogenesis to study whether SORBS2 has a conserved role in human. Since 4q deletion

122 causes SORBS2 haploinsufficiency, we approached the question by knocking down SORBS2 in

123 H1 human ES cell lines (H1-hESC). We used two different shRNAs to knock down SORBS2

124 and similar knockdown efficiencies (~40% of wild type expression level) were achieved

125 (Figure S1A). Compared to scramble-shRNA controls, there was no significant difference in

126 the growth rate and clone morphology of SORBS2-knockdown hESCs. To verify this

127 observation, we examined the expression of pluripotency markers Oct4, Sox2, Nanog, TRA-

128 1-60, and no significant difference was detected on both RNA and protein levels between

129 control and SORBS2-knockdown groups (Figure S1B-S1E). We used a well-established 2-

130 dimensional monolayer differentiation protocol to perform cardiomyocyte differentiation (25)

6

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

131 (Figure S2A). Since SORBS2-knockdown ES cell lines from different shRNAs had similar

132 phenotypes (Figure S2B, Video S1-S3), we only used SORBS2-shRNA1 for further analyses.

133 Spontaneous beating started to appear at differentiation D8 and robust contraction occurred at

134 D10 in control hESCs. Although spontaneous beating also appeared at D8 in SORBS2-

135 knockdown group, differentiated cardiomyocytes contracted much weaker (Video EV1-EV3).

136 Since SORBS2 is a structural component of sarcomeric Z-line (26), we examined the

137 structure of D30 cardiomyocytes through anti-cTnI and anti-α-actinin immunostaining, which

138 can visualize sarcomeric length and Z line, respectively. The majority of cardiomyocytes in

139 SORBS2-knockdown group presented a round or oval shape instead of polygonal or spindle-

140 like outlines in control group (Figure 3A). Close lookup of myofilaments showed that the

141 sarcomaric structures in cells with abnormal shapes were disrupted (Figure 3A). Overall, the

142 percentage of cardiomyocytes with well-organized and normal shape was much

143 lower in SORBS2-knockdown group (Figure 3B). In addition, we examined the ultrastructural

144 organization of sarcomere of D30 cardiomyocytes by transmission electron microscopy. The

145 disrupted were also noted in SORBS2-knockdown cardiomyocytes (Figure 3C).

146 The weakened beating force of SORBS2-knockdown cardiomyocytes may also be derived from

147 abnormal electrophysiology. To this end, we examined the electrical activities of dissociated

148 D30 cardiomyocytes by patch clamping. Three types of spontaneous action potentials

149 (ventricular-like, atrial-like, and nodal-like) were all observed in both control and SORBS2-

150 knockdown cardiomyocytes. The dominant type of cardiomyocytes is ventricular-like (Figure

151 3D). Statistical analyses on action potential parameters of ventricular-like cells, including

152 average action potential (AP) duration at 90% repolarization, average AP frequency, peak

7

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

153 amplitude, and resting potential showed no difference between two groups (Figure 3E-3H).

154 These data indicate that SORBS2-knockdown did not significantly affect cardiomyocyte

155 electrophysiology, albeit its indispensability in sarcomeric organization.

156

157 SORBS2 promotes cardiomyocyte differentiation through enhancing SHF fate

158 commitment

159 Cardiomyocyte differentiation efficiency was assessed by flow cytometry analysis of cTnT+

160 cells. The proportion of cTnT+ cells was significantly decreased in SORBS2-konckdown group

161 at D15 (Figure 4A and 4B). It indicates that SORBS2 might have an early role in regulating

162 cardiomyocyte differentiation in addition to its role in maintaining sarcomeric integrity of

163 cardiomyocytes. Consistently, we detected significantly decreased expression of sarcomeric

164 genes like TNNT2, MLC-2A, MYH6 and MYH7 upon the onset of cardiomyocyte formation in

165 SORBS2-konckdown group (Figure 4C-4F).

166 We examined SORBS2 expression dynamics to help identify the critical time when SORBS2

167 regulates cardiomyocyte differentiation. SORBS2 was expressed in hESCs, but the expression

168 was quickly turned off at the mesodermal cell stage (D2-D3). Interestingly, SORBS2 began to

169 re-express at the cardiac progenitor stage D5 and continuously increased with cardiomyocyte

170 differentiation (Figure 4G). In SORBS2-knockdown cells, SORBS2 expression was

171 significantly lower at all the time points (Figure 4G). As previously shown, SORBS2-

172 knockdown didn’t significantly affect hESC pluoripotency. In the initial mesodermal

173 commitment stage, the expression levels of mesoderm markers, T and MESP1, were not

174 changed by SORBS2-knockdown (Figure S3A-S3B), which was consistent with absent

8

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

175 SORBS2 expression at this stage.

176 The reactivation of SORBS2 expression in cardiac progenitor specification stage indicated a

177 potential role in regulating cardiac progenitor formation. There are two sets of molecularly

178 distinct cardiac progenitors during mammalian heart development, referred to as the first and

179 second heart fields (FHF and SHF). These two populations share common ancestors and

180 contribute to distinct anatomical structures of heart (27, 28). Interestingly, we found

181 significantly increased expression of FHF markers (TBX5, HCN4, HAND1) while significantly

182 decreased expression of SHF markers (TBX1, ISL1, MEF2C) in SORBS2-knockdown cells

183 (Figure 4H-4M). The inverse correlation of FHF and SHF markers suggests that SORBS2 is a

184 critical regulator to maintain the balance between FHF and SHF cells when derived from

185 common progenitors. SHF gives rise to cardiac outflow tract, right ventricle, and inflow tract

186 (28). Defects in these embryonic structures often lead to CHDs commonly seen in 4q deletion

187 syndrome.

188

189 Transcriptomic analysis reveals signaling pathway alterations that tilt cardiac progenitor

190 specification towards SHF

191 To understand how SORBS2 regulates cardiac progenitor differentiation, we collected D5 cells

192 for RNA-seq. There were 1406 differentially expressed genes between two groups (padj < 0.05,

193 Figure 5A). Using a stringent threshold (padj. < 0.05, |fold change| > 2), we selected out 160

194 down-regulated and 104 up-regulated genes (Table S2-S3) for Go enrichment analysis. Results

195 showed that the up-regulated genes were enriched in biological processes like cell adhesion,

196 and collagen fibril organization etc. (Figure 5B), which might be a compensatory reaction to

9

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

197 reduced SORBS2 as a cytoskeleton component. The down-regulated genes were enriched in

198 biological processes like heart development, pattern specification process etc. (Figure 5C). It

199 suggests that SORBS2 positively regulates cardiac progenitor specification.

200 We showed earlier that Sorb2-/- mice had ASD. Previous studies have shown that Shh signaling

201 is necessary for the proper differentiation and migration of atrial septal progenitors in posterior

202 SHF to form the primary atrial septum (27, 29). As expected, we noticed the reduced expression

203 of SHH and SHH signaling targets PTCH1 and GLI1 (Figure 5D). These results were further

204 confirmed by qPCR (Figure 5E). In addition, we also noted reduced expression of genes (FZD5,

205 SOX9, NKD2, FOXF1) involved in canonical WNT signaling and elevated expression of genes

206 (BMP4, SMAD3, ID2, MSX1) involved in BMP signaling (Figure 5D). Inhibition of canonical

207 WNT signaling and upregulation of BMP signaling promote FHF progenitor formation and

208 meanwhile impair SHF progenitor differentiation(30). Therefore, these expression changes

209 may collectively contribute to the disrupted balance of FHF and SHF progenitors. We

210 confirmed these results by qPCR (Figure S3C). FGF signaling is required for both FHF and

211 SHF progenitor expansion (28, 31). We noted the reduced expression of FGF8 and FGF18 in

212 SORBS2-knockdown cells (Figure 5D), which may contribute to the reduced efficiency of

213 cardiomyocyte differentiation. We also validated these changes by qPCR (Figure S3C).

214

215 SORBS2 promotes SHH signaling through inhibiting c-ABL-mediated NOTCH1

216 degradation

217 To understand how SORBS2 regulates SHH signaling, we looked for SORBS2 function besides

218 its role as a cytoskeleton component. SORBS2 can interact with the non-receptor tyrosine

10

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

219 kinase c-ABL as SH3 domain-containing adaptor (21). The binding of SORBS2 to c-ABL

220 triggers the recruitment of ubiquitin ligase CBL and leads to the ubiquitination of c-ABL (32).

221 Indeed, we noted that c-ABL protein level was significantly elevated in SORBS2-knockdown

222 cells (Figure 5F-5G). c-Abl can promote Notch endocytosis to modulate Notch protein level.

223 Hence we examined NOTCH1 protein level at D5 cells. As expected, NOTCH1 protein level

224 decreased significantly in SORBS2-knockdown group (Figure 5H-5I). Consistently, we noted

225 that the expression of NOTCH signaling target genes HEY1, HEYL, NARAP and HES4, was

226 also significantly reduced in RNA-seq and GO analysis (Figure 5C-5D). The expression

227 reduction of HEY1, HEYL and NARAP was further validated by qPCR (Figure 5J). In contrast,

228 there was no change in the transcriptional level of NOTCH1 (Figure 5D), suggesting that the

229 regulation of SORBS2 on NOTCH1 signaling is through modulating protein level. Notch

230 signaling is a well-known molecular mechanism promoting Smo accumulation in cilia and

231 enhancing cellular response to Shh (33, 34). Therefore, the reduced SHH signaling in SORBS2-

232 knockdown cells may be a consequence of the decreased NOTCH1 activity.

233

234 Rare SORBS2 variants are significantly enriched in CHD patients

235 To validate the genetic association of previously identified CNV genes with CHD, we

236 performed targeted sequencing on 300 complex CHD cases. Besides 23 candidate CNV genes

237 containing SORBS2, the targeted panel also includes 81 known CHD genes from literature. For

238 298 CHD patients included in our study, 98% of the target regions were covered at least 100

239 times. 220 subjects of Han Chinese descent without CHD from the 1000 genome project were

240 used as controls. The ethnic backgrounds of cases and controls were compared by principal

11

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

241 component analysis (PCA) with SNP genotype data from all the participants of this study.

242 Results indicated that CHD cases were clustered together with controls, indicating a matched

243 ethnicity between two groups (Figure S4). A total of 1560 exonic variants from CHD and

244 controls passed quality control and were included for further analyses. 54.29 % of these

245 variants (n=847) were nonsynonymous SNV, 0.38% (n=6) stop-gain and 45.32% (n= 707)

246 synonymous (Figure 6A). Distribution of exonic variants in the breakdown categories of CHD

247 and control groups was shown in Table S4.

248 In current CHD disease model, rare variants contribute significantly to the pathogenesis of

249 CHD (3). Of the 847 nonsynonymous variants, 43.57% (n=369) variants had a minor allele

250 frequency (MAF) below 1% across the public control database (Exome Aggregation

251 Consortium, ExAC), and were adjudicated as “damaging” by at least two algorithms

252 (PolyPhen2, SIFT, or MutationTaster). These rare damaging variants are located in 81 genes,

253 including 65 known CHD genes and 16 candidate genes. We applied gene-based statistic tests

254 to evaluate the cumulative effects of rare damaging variants (MAF<1%) on CHDs. Genes with

255 at least two rare damaging variants (n=57) were included for analysis (Table S5). One-tailed

256 Fisher’s exact test was performed to compare the mutational load of each gene between two

257 groups. 4 out of 57 genes (SORBS2, KMT2D, EVC2 and SH3PXD2B) had a p-value lower than

258 0.05 (one-tailed Fisher’s exact) and 2 of them (SORBS2 and KMT2D) had a statistically

259 significant mutation burden after the correction for multiple testing (q<0.20) (Figure 6B, Table

260 S5). KMT2D is a well-established CHD gene (35, 36). Our data showed that SORBS2 variants

261 had similar level of enrichment in CHDs as the known CHD genes. The distribution of rare

262 SORBS2 damaging variants in CHD patients spread throughout the gene (Figure 6C). Although

12

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

263 we didn’t detect SORBS2 nonsense variants in CHD patients, the excessive burden of SORBS2

264 missense mutations in CHD group suggests a role in CHD development. Indeed, the majority

265 of patients with rare SORBS2 variants had ASD (Table S6), the defect seen in Sorbs2-/- mouse

266 heart. Overall, these data further support that SORBS2 contributes to the pathogenesis of

267 cardiac defects of 4q deletion syndrome.

268

269 Discussion

270 Chromosome 4q deletion syndrome is a genetic disease resulting from chromosomal

271 aberration that causes a portion of long arm missing (20). Chromosomal

272 deletions are interstitial or terminal. Although deletion size and position are highly variable

273 among patients, some breakpoints occur more frequently. With the advent of high-resolution

274 chromosomal array, more submicroscopic microdeletions are being identified, particularly the

275 clustering terminal 4q deletions(19, 37). These microdeletions greatly facilitate the

276 identification of disease genes underpinning various defects. 4q deletion syndrome patients

277 have a spectrum of clinical manifestations including craniofacial, skeletal, cardiovascular,

278 gastrointestinal abnormalities, mental and growth deficiencies (20). Among them, CHD is a

279 common defect seen in about half patients (20). A study narrowed the cardiovascular critical

280 region to 4q32.2–q34.3, which contains TLL1, HPGD and HAND2 that participate in

281 cardiovascular development (38). Over-represented right-sided CHDs in 4q deletion syndrome

282 patients suggest that HAND2, an essential regulator of SHF development, is mainly responsible

283 for CHD phenotype (38, 39). However, a portion of 4q deletions associated with CHD that we

284 and others have discovered do not cover this defined region(19, 37, 40, 41). SORBS2 within

13

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

285 terminal 4q deletions has been proposed as a candidate CHD gene due to a rare small deletion

286 event and its high expression level in heart (37). The follow-up work to sequence CHD patients

287 for SORBS2 mutations, stated in this article, has never been published. It leaves this single-

288 case association in an uncertain situation. Here we presented evidence from animal model, in

289 vitro cardiogenesis and mutation analysis to demonstrate that SORBS2 functions not only as a

290 sarcomeric component to maintain cardiomyocyte function, but also as an adaptor protein to

291 promote SHF development through c-ABL/NOTCH/SHH axis and its haploinsufficiency may

292 contribute to the overall pathogenicity of chromosome 4q deletion.

293 The common CHDs in 4q deletion syndrome include ASD, ventricular septal defect

294 (VSD), pulmonary stenosis/atresia, and tetralogy of Fallot etc. (20, 42, 43). The affected

295 structures are atrial septum and cardiac outflow tract, which are derived from posterior and

296 anterior SHF, respectively (27, 44). Although ASD is detected in Sorbs2-/- mouse hearts, which

297 supports that Sorbs2 regulates SHF development in vivo and the role is conserved across

298 species, no conotruncal defect is present in mouse mutant hearts. It is different from the high

299 penetrance of conotruncal defects seen in human patients. The discrepancy may be due to

300 genetic modifiers that, together with SORBS2 haploinsufficiency, cause the developmental

301 defect in cardiac outflow tract. An obvious genetic modifier is the HAND2 gene, which is co-

302 missing with SORBS2 in large 4q deletions. Previous studies have shown that CHD is observed

303 more frequently in patients with the terminal deletion at 4q31, which includes HAND2, than in

304 patients with the terminal deletion at 4q34 or 4q35 (42, 43). Hand2 is essential for anterior SHF

305 progenitor to proliferate and survive (45) and functions downstream of Notch signaling to

306 regulate cardiac OFT development (46). Therefore, SORBS2 and HAND2 may have synergic

14

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

307 effects at both SHF development and NOTCH signaling levels to control OFT development.

308 HELT within 4q35.1 might be another genetic modifier of SORBS2, particular for deletion not

309 including HAND2. Helt encodes a Hey-related bHLH transcription factor that is expressed in

310 both brain and heart, and mediates Notch signaling (47). Therefore, SORBS2 and HELT might

311 synergistically regulate OFT development through modulating NOTCH signaling. The

312 relatively high incidence of rare SORBS2 variants in CHD group suggests that it may act as a

313 genetic contributor to the pathogenesis of isolated CHDs, albeit a low penetrance by itself. Our

314 findings show that SORBS2 regulates SHH and NOTCH signaling. It might interact with other

315 components of SHH and NOTCH signaling pathways to modify cardiac phenotype, which is

316 consistent with a recent report indicating cilia-related genes are significantly enriched for

317 damaging rare recessive and compound heterozygous genotypes(48).

318 There are two types of ASD in Sorbs2-/- heart, including atrial septal aplasia/hypoplasia

319 and DAS. It is surprising that the opposite cellular changes occur in the same type of mutants.

320 Lineage tracing has shown that primary atrial septum is mainly derived from Shh-responsive

321 posterior SHF progenitors (27, 29). Knockout of Shh receptor Smo in SHF or Shh-responsive

322 cells impairs the differentiation and migration of SHF progenitors and consequently leads to

323 atrial septal aplasia (27, 29). In Smo mutants, more Shh-responsive progenitor-derived cells are

324 directed into atrial free wall instead of primary atrial septum(29). Sorbs2-knockdown tunes

325 down Notch signaling. It is well established that Notch signaling facilitates Smo accumulation

326 in primary cilia (33, 34). Hence Sorbs2 knockout may dampen Shh signaling through reducing

327 Smo level within primary cilia. In comparison to Smo mutant mice, the extent of Shh signaling

328 reduction in Sorbs2-/- mice may be much milder, which is supported by the phenotypic

15

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

329 difference between two mutants. There is severe cardiac OFT defect in Smo mutants but not in

330 Sorbs2-/- mice. If the migration of Shh-responsive SHF progenitor is more dependent on Shh

331 signaling than progenitor differentiation does, Shh signaling dampening would hit cell

332 migration first before it can affect their differentiation. In such a condition, SHF progenitors in

333 Sorbs2-/- mice might have normal cell number but abnormal cell migration. If the direction of

334 unguided migration is randomized, there is a small chance that more Shh-responsive SHF

335 progenitors, including those destined for atrial free wall, migrate into atrial septal area, thereby

336 producing an extra atrial septum in a small percentage of Sorbs2-/- embryos.

337 DAS, also called Cor triatriatum type C in the original report (49), is a very rare CHD

338 characterized by an extra septal structure to the right side of primary atrial septum (50). The

339 anatomic abnormality is implicated as a cause of paradoxical thromboembolic event to stroke

340 or heart attack (51, 52). However, its etiology and pathogenesis are entirely unknown. Our data

341 reveal the first genetic etiology of DAS. Interestingly, Cor triatriatum, another type of abnormal

342 atrial septation, have been reported in a patient with a terminal 4q34.3 deletion (53), which

343 includes SORBS2 but not HAND2. It has been speculated that the additional atrial septum might

344 result from persistence of embryologic structures or abnormal duplication of atrial septum. The

345 impaired cardiogenesis in SORBS2-knockdown cells suggests that the latter scenario may be

346 the underlying pathogenesis.

347 Cardiomyocytes are derived from two different cardiac progenitor populations, FHF and

348 SHF. These two populations are derived from Mesp1+ mesodermal progenitors during

349 gastrulation (54, 55). In in vitro cardiogenesis, molecularly distinctive FHF and SHF

350 progenitors simultaneously appear during differentiation process (56). However, what controls

16

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

351 the fate commitment to FHF or SHF remains unclear. SORBS2 is not expressed at mesodermal

352 stage, but its expression is switched on at cardiac progenitor stage. The temporal expression

353 pattern suggests a role in the fate choice of cardiac progenitor. Interestingly, SORBS2-

354 knockdown causes a tilted balance between FHF and SHF progenitors, which leans towards

355 FHF fate. It is similar to a molecular switch in the chordate model Ciona. Tbx1/10-Dach

356 pathway maintains the separate lineages of FHF and SHF in Ciona. Deletion of Tbx1/10 or

357 Dach causes ectopic FHF marker expression in SHF cells (57). Interestingly, a recent study has

358 shown that SORBS2 mutation is associated with arrhythmogenic right ventricular

359 cardiomyopathy (58), which is considered as a disease of disrupted differentiation of cardiac

360 progenitor cells(59).

361 Overall, our data reveal that SORBS2 has a dual role in regulating cardiac development and

362 function. The discovered role of SORBS2 in promoting SHF development substantiates that

363 SORBS2 haploinsufficiency contributes to CHD seen in 4q deletion syndrome.

364

365 Material and Methods

366 Mouse lines and breeding

367 Mice were housed under specific pathogen-free conditions at the animal facility of Shanghai Children’s

368 Medical Center. Sorbs2flox/flox mice (24) were a gift from Dr. Guoping Feng’s lab (McGovern

369 Institute for Brain Research, MIT, Cambridge). Sorbs2- allele was obtained by breeding

370 Sorbs2flox allele into CMV-Cre mouse (60). The strains were backcrossed with C57BL/6 to

371 maintain the lines ever since we obtained them. 2-6 months old males and females were used for

372 timed mating and embryos were collected at E18.5. Neither anesthetic nor analgesic agent was

17

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

373 applied. CO2 gas in a closed chamber was used for euthanasia of pregnant dam and cervical

374 dislocation was followed. Isolated fetuses were euthanized by cervical dislocation. Animal care

375 and use were in accordance with the NIH guidelines for the Care and Use of Laboratory

376 Animals and approved by the Institutional Animal Care and Use Committee of Shanghai

377 Children’s Medical Center.

378 Histological analysis

379 For cardiac phenotype analysis, embryos were collected at E18.5 and were fixed in 10%

380 formalin overnight. Isolated hearts were processed for paraffin embedding, sectioned at a

381 thickness of 4 μm and stained with Hematoxylin and Eosin. Stained sections were imaged using

382 a Leica DM6000 microscope.

383 H1 hESCs cell cultures and cardiomyocytes differentiation

384 Undifferentiated H1 hESC lines were maintained in a feeder-free culture system. Briefly, we

385 precoated the well plates with Matrigel (354277, BD Biosciences), and then seeded and

386 cultured cells with TeSR™-E8™ medium (05840,Stemcell). When cells reached 80%

387 confluence, they were passaged routinely with Accutase (07920, Stemcell). For cardiomyocyte

388 differentiation, cells were induced using a chemically defined medium consisting of three

389 components (CDM3): the basal medium RPMI 1640 (C14065500, Gibco), L-ascorbic acid 2-

390 phosphate(213µg/ml,113170-55-1,Sigma) and Oryza sativa-derived recombinant human

391 albumin(500µg/m,HY100M1,Healthgen Biotechnology Corp). In brief, single cell suspensions

392 were prepared using Accutase and were seeded in 12-well Matrigel-coated plate at a density of

393 4×105 cells/well. When cells reached 80%-90% confluence (Day 0), cells were fed by 2ml

394 CDM3 basal medium supplemented with CHIR99021 (6μM, 72052, Stem cell). 48 hours later

18

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

395 (Day 2), medium was replaced with 2ml CDM3 supplemented with Wnt-C59 (2μM, 1248913,

396 Peprotech Biogems). After 96 hours (day 4), the medium was replaced with CDM3 basal

397 medium every other day until the appearance of cell beating.

398 Lentiviral shRNA knockdown of SORBS2 in hESCs

399 shRNA plasmid vectors (U6-MCS-Ubiquitin-Cherry-IRES-puromycin) expressing target-

400 specific sequences against human SORBS2 and non-target scrambled control were purchased

401 from Shanghai Genechem Co. The targeting sequences of SORBS2 shRNA-1 and shRNA-2 are

402 5′-TCCTTGTATCAGTCCTCTA-3′ and 5′-TCGATTCCACAGACACATA-3′, respectively.

403 The sequence of scrambled shRNA is 5′-TTCTCCGAACGTGTCACGT-3′. Lentiviral particles

404 were produced by transfecting human embryonic kidney (HEK) 293FT cells with shRNA,

405 psPAX2 (#12260, Addgene), and pMD2.G (#12259, Addgene) plasmids. Efficiency of gene

406 knockdown was examined using qRT-PCR.

407 Western blot

408 Cells were lysed in RIPA buffer (P0013B, Beyotime) containing protease inhibitors (P1010,

409 Beyotime). Protein concentrations were determined with the BCA™ protein assay kit

410 (Thermo). Protein was separated via 8% SDS-PAGE and afterwards transferred to

411 polyvinylidene difluoride (PVDF) membrane. After blocking by 5% non-fat milk for 1 hour,

412 primary antibodies were incubated overnight at 4℃. The membrane was washed with TBST

413 and incubated with second antibodies for 0.5 hour at room temperature. Bands were detected

414 with the Immobilon ECL Ultra Western HRP Substrate (WBULS0500, Sigma) and band

415 intensity was analyzed by ImageJ software. Antibodies: c-ABL (1:1000; A0282, Abclonal),

416 NOTCH1 (1:1000; 3608s, CST), GAPDH (1:1000; ab8245, Abcam). Goat Anti-Rabbit IgG

19

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

417 H&L (HRP) (1:5000; ab6721, Abcam), Goat Anti-Mouse IgG H&L (HRP) (1:5000;

418 ab205719, Abcam).

419 Immunofluorescent staining

420 Cells were fixed in 4% paraformaldehyde for 10 min, permeabilized in 0.5% Triton X-100/PBS

421 for 20 min, and then blocked in 5% bovine serum albumin/PBS for 30 min. Fixed cells were

422 stained with the following primary antibodies: TRI-1-60 (1:200; Ab-16288, Abcam), OCT4

423 (1:200; Ab-18976, Abcam), SOX2 (1:200; Ab-80520, Abcam), cTnI (1:200; Ab-92408, Abcam)

424 and α-actinin (1:200; A-9776, Sigma). These primary antibodies were visualized with

425 AlexaFluor 488 (1:1000; A-11029, Invitrogen) or AlexaFluor 633 (1:1000; A-21071,

426 Invitrogen). Nuclei were stained with DAPI. Fluorescence images were acquired using a Laser

427 confocal microscope (Leica TCS SP8).

428 Flow cytometric analysis

429 In brief, D15 cells were harvested in 0.25% trypsin/EDTA at 37 ℃ for 15mins and

430 subsequently neutralized by 10% FBS in DMEM. Then cells were centrifuged at 1000 rpm for

431 5min and resuspended in Invitrogen™ FIX&PERM solution and kept at 4°C for 30min. After

432 washing, cells were incubated with anti-cTnT antibody (1:100; ab105439, Abcam) in washing

433 buffer on ice in the dark for 45 min. Cells were centrifuged, washed and resuspended for

434 detection. Data were collected by BD FACSCanto™ flow cytometer and analyzed by BD

435 FACS software.

436 Electron Microscopy (EM)

437 D30 cardiomyocytes were harvested using 0.25% trypsin/EDTA and prefixed with 2.5%

438 glutaraldehyde in 0.2M phosphate buffer overnight at 4°C. Samples were washed and then

20

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

439 post-fixed with 1% osmium tetroxide for 1.5 hours. Next, cells were routinely dehydrated in

440 an ethanol series of 30%, 50%, 70%, 80% and 95% ethanol for 15 min each, 100% ethanol and

441 acetone twice for 20 min each at room temperature and then embedded in epoxy resin. Sections

442 (70 nm thick) were poststained in uranyl acetate and lead citrate and visualized on Hitachi 7650

443 microscope.

444 Electrophysiological recordings

445 D30 h1ESCs-CMs were digested by Accutase (7920, STEMCELL Technologies, Canada),

446 washed cells one time with baseline extracellular fluid and then moved to the stage of an

447 inverted microscope (ECLIPSE Ti-U, Nikon, Japan) for patch-clamp recording. h1ESCs-CMs

448 were continuously perfused by extracellular solution through a “Y-tube” system with solution

449 exchange time of 1 minute. Whole-cell patch-clamp recordings were performed using

450 Axopatch 700B (Axon Instruments, Inc., Union City, CA, USA) amplifiers under an invert

451 microscope at room temperature (22-25℃). Glass pipettes were prepared using borosilicate

452 glasses with a filament (Sutter Instruments Co, Novato, CA, USA) using the Flaming/Brown

453 micropipette puller P97 (Sutter Instruments Co, Novato, CA, USA). The final resistance

454 parameters of patch pipette tips were about 2-4 MΩ after heat polish and internal solution filling.

455 After forming “gigaseal” between the patch pipette and cell membranes, a gentle suction was

456 operated to rupture the cell membrane and establish whole-cell configuration. All current

457 signals were digitized with a sampling rate of 10 kHz and filtered at a cutoff frequency of 2

458 kHz (Digidata 1550A, Axon Instruments, Inc., Union City, CA, USA). The spontaneous action

459 potentials were recorded by gap free mode with a sampling rate of 1 kHz and filtered at a cutoff

460 frequency of 0.5 kHz. If the series resistance was more than 10 MΩ or changed significantly

21

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

461 during the experiments, the recordings were discarded from further analysis. The pipette

462 internal solution for action potential recording contained (in mM): KCl 150, NaCl 5, CaCl2 2,

463 EGTA 5, HEPES 10, MgATP 5 (pH 7.2, KOH), and baseline extracellular solution and

464 extracellular solution for action potential recording contained (in mM): NaCl 140, KCl 5, CaCl2

465 1, MgCl2 1, Glucose 10, HEPES 10 (pH 7.4, NaOH). Data were analyzed by using Clampfit

466 10.5 and Origin 8.0 (OriginLab, Northampton, MA, USA).

467 Quantitative real-time PCR analysis

468 Total RNA was extracted from D0, D2, D3, D5 and D10 cells using Trizol reagent (Thermo

469 Fisher Scientific, 15596018). Reverse transcription was accomplished with Reverse

470 Transcription Kit (Takara, RR037A) according to the manufacturer's instructions. qPCR was

471 performed with the SYBR Fast qPCR Mix (Takara, RR430A) in the Applied Biosystems 7900

472 Real-Time PCR System. Primers are listed in Table S7.

473 RNA-Seq

474 Total RNA of D5 cells was isolated using TRizol reagent (Thermo Fisher Scientific, 15596018).

475 Library preparation and transcriptome sequencing on an Illumina HiSeq platform were

476 performed by Novogene Bioinformatics Technology Co., Ltd. to generate 100-bp paired-end

477 reads. HTSeq v0.6.0 was used to count the read numbers mapped to each gene, and fragments

478 per kilobase of transcript per million fragments mapped (FPKM) of each gene were calculated.

479 We used FastQC to control the quality of transcriptome sequencing data. Next, we compared

480 the sequencing data to the human reference genome (hg19) by STAR. The expression level of

481 each gene under different treatment conditions is obtained by HTSeq-count after

482 standardization. The differentially expressed genes were analyzed by DESeq2 software

22

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

483 according to the standard that the p-value was under 0.05 after correction and the fold change

484 was more than 2 times. Functional enrichment of differentially expressed genes is analyzed on

485 Toppgene website.

486 Network and analysis

487 From ENCODE database (23), 316 human fetal heart specific genes including SORBS2 were

488 selected and their expression coefficients were computed. The highly co-regulated

489 transcriptional networks (correlation efficient≥0.8) were constructed and visualized with

490 BioLayout Express3D. The interconnected gene clusters were detected using the MCL

491 (Markov Cluster) algorithm and illustrated with different colors.

492 Patient samples

493 A total of 300 children with complex CHD, including ASD, conotruncal defect etc., were

494 enrolled in our study from November 2011 to January 2014 in Shanghai Children’s Medical

495 Center (Table S8). Patients carrying 22q11.2 deletion and gross chromosomal aberrations were

496 excluded from our study. The mean age of included probands was 10 months with a range of 3

497 days to 17 years. 188 (62.7%) were boys and 112 (37.3%) were girls. CHD diagnosis was

498 confirmed by reviewing patient history, physical examinations, and medical records. Patients

499 carrying 22q11.2 deletion and gross chromosomal aberrations (e.g., trisomy 21, trisomy 13 and

500 trisomy 18) were excluded from our study. The study conformed to the principles outlined in

501 the Declaration of Helsinki and approval for human subject research was obtained from the

502 Institutional Review Board of Shanghai Children's Medical Center. Written informed consents

503 were obtained from parents or legal guardians of all patients.

504 Control cohort

23

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

505 Exome sequences for 220 control subjects of Han Chinese descent were derived from the 1000

506 genome project (http://www.internationalgenome. org/data). Raw sequence data in the form of

507 bam or fastq files was re-aligned and re-analyzed with the same bioinformatic pipelines with

508 CHD patients. The ethnicity of cases and controls were investigated by performing PCA with

509 SNP genotype data from all the participants of this study.

510 Gene selection and targeted sequencing

511 We used a customized capture panel of 104 targeted genes, which included 81 known CHD

512 genes (Table S9) and 23 CHD candidate genes (Table S10). The CHD genes were selected

513 through a comprehensive literature search and had been reported by other research groups to

514 be associated with CHD in either human patients or mouse models (1, 61, 62). CHD candidate

515 genes were prioritized from pathogenic CNVs and likely pathogenic CNVs identified in CHD

516 patients in our previous study(19). The coding regions of selected genes and their flanking

517 sequences were covered by the Agilent SureSelect Capture panel and sequenced on the Illumina

518 Genome Analyzer IIx platform according to the protocols recommended by the manufacturers.

519 Variant calling and quality control

520 We used the best practice pipeline of Broad Institute’s genome analysis toolkit (GATK 3.7) to

521 obtain genetic variants from the target sequencing data of 300 CHD cases together with the

522 raw data of 220 Han Chinese control samples. Briefly, Burrows-Wheeler Aligner (BWA,

523 version 0.7.17) was used to align the Fastq format sequences to reference

524 (hg38). De-duplication was performed using Picard and the base quality score recalibration

525 (BQSR) was performed to generate analysis-ready reads. HaplotypeCaller implemented in

526 GATK was used for variant calling in GVCF mode. All samples were then genotyped jointly.

24

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

527 We then excluded the variants based on the following rules: (1) >2 alternative alleles; (2) low

528 genotype call rate <90%; (3) deviation from Hardy-Weinberg Equilibrium in control samples

529 (p < 10-7); (4) differential missingness between cases and controls (p < 10-6). After alignment

530 and variant calling, we removed 2 subjects with low-quality data, leaving a total of 298 cases

531 and 220 controls.

532 Variant enrichment analysis

533 Given that severe mutations are generally present at low frequencies in the population, we set

534 the relevant variants filtering criteria as follows: (1) variants that located in exonic region, (2)

535 exclude synonymous variants, (3) variants with minor allele frequency (MAF) below 1%

536 according to the public control database (Exome Aggregation Consortium, ExAC), and (4)

537 damaging missense variants predicted to be deleterious by at least two algorithms (Polyphen2

538 ≥0.95/MutationTaster_pred:D/SIFT≤0.05). All relevant variants following these criteria

539 were hereafter called “rare damaging” variants. The number of rare damaging variant carriers

540 in each gene was counted in CHD patients and controls. We hypothesized that rare damaging

541 variants should be enriched in CHD patients, hence the carrier and non-carrier groups were

542 compared between CHD patients and controls using a one-tailed Fisher’s exact test. The odds

543 ratio (OR) was calculated. Only genes with at least two variants were retained and multiple

544 testing correction was performed using Benjamini-Hochberg procedure (q-value, adjusted p-

545 value after Benjamini-Hochberg testing).

546 Statistical analysis

547 Statistical significance was performed using a two-tailed Student’s t test or χ2 test as appropriate.

548 Statistical significance is indicated by *, where p < 0.05, **, where p < 0.01.

25

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

549

550

551 Funding

552 The work was supported by National Natural Science Foundation of China (81371893,

553 81741031, 81871717, 81672090, 31371465 and 31771612), Shanghai Municipal Education

554 Commission-Gaofeng Clinical Medicine Grant Support (20171925), Innovative Research

555 Team of High-level Local Universities in Shanghai (SSMU-ZDCX20180200), Shanghai

556 Sailing Program (18YF1414800).

557

558 Acknowledgements

559 We thank Dr. Guoping Feng’s at Massachusetts Institute of Technology for Sorbs2flox/flox mice,

560 Dr. Bingshan Li at Vanderbilt University and Dr. Hao Mei at University of Mississippi Medical

561 Center for advice on genetic data analyses.

562

563 Author contributions

564 F. L. performed mouse and in vitro cardiogenesis experiments; X. Z., B. W., J. G. and G. Y.

565 did targeted sequencing and analyzed genomic data; H. S. analyzed RNA-seq data; H. S.

566 analyzed CHD phenotype and collected samples. Q. F. and Z.Z. conceived and directed the

567 project. F. L., X. Z. and Z.Z wrote the manuscript. Q. F., H. C. and J. F. commented on the

568 written manuscript.

569

570 Conflict of Interests

26

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

571 The authors declare no competing interests.

572

573 Data and code availability

574 Targeted sequencing raw data of CHD patients have been deposited NCBI’s Sequence Read

575 Archive (PRJNA579193). RNA-seq data have been deposited in NCBI’s

576 Omnibus (GSE137090). R scripts will be available upon request.

577

578 References

579 1. Andersen TA, Troelsen Kde L, Larsen LA. Of mice and men: molecular genetics of congenital heart 580 disease. Cell Mol Life Sci. 2014;71(8):1327-52. 581 2. Triedman JK, Newburger JW. Trends in Congenital Heart Disease: The Next Decade. Circulation. 582 2016;133(25):2716-33. 583 3. Blue GM, Kirk EP, Giannoulatou E, Sholler GF, Dunwoodie SL, Harvey RP, Winlaw DS. Advances in the 584 Genetics of Congenital Heart Disease: A Clinician's Guide. Journal of the American College of Cardiology. 585 2017;69(7):859-70. 586 4. Glessner JT, Bick AG, Ito K, Homsy J, Rodriguez-Murillo L, Fromer M, Mazaika E, Vardarajan B, Italia M, 587 Leipzig J, DePalma SR, Golhar R, Sanders SJ, Yamrom B, Ronemus M, Iossifov I, Willsey AJ, State MW, 588 Kaltman JR, White PS, Shen Y, Warburton D, Brueckner M, Seidman C, Goldmuntz E, Gelb BD, Lifton R, 589 Seidman J, Hakonarson H, Chung WK. Increased frequency of de novo copy number variants in congenital 590 heart disease by integrative analysis of single nucleotide polymorphism array and exome sequence data. 591 Circ Res. 2014;115(10):884-96. 592 5. Kim DS, Kim JH, Burt AA, Crosslin DR, Burnham N, Kim CE, McDonald-McGinn DM, Zackai EH, Nicolson 593 SC, Spray TL, Stanaway IB, Nickerson DA, Heagerty PJ, Hakonarson H, Gaynor JW, Jarvik GP. Burden of 594 potentially pathologic copy number variants is higher in children with isolated congenital heart disease 595 and significantly impairs covariate-adjusted transplant-free survival. J Thorac Cardiovasc Surg. 596 2016;151(4):1147-51 e4. 597 6. McDonald-McGinn DM, Sullivan KE, Marino B, Philip N, Swillen A, Vorstman JA, Zackai EH, Emanuel 598 BS, Vermeesch JR, Morrow BE, Scambler PJ, Bassett AS. 22q11.2 deletion syndrome. Nature reviews Disease 599 primers. 2015;1:15071. 600 7. Lindsay EA, Vitelli F, Su H, Morishima M, Huynh T, Pramparo T, Jurecic V, Ogunrinu G, Sutherland HF, 601 Scambler PJ, Bradley A, Baldini A. Tbx1 haploinsufficieny in the DiGeorge syndrome region causes aortic 602 arch defects in mice. Nature. 2001;410(6824):97-101. 603 8. Rana MS, Theveniau-Ruissy M, De Bono C, Mesbah K, Francou A, Rammah M, Dominguez JN, Roux 604 M, Laforest B, Anderson RH, Mohun T, Zaffran S, Christoffels VM, Kelly RG. Tbx1 coordinates addition of 605 posterior second heart field progenitor cells to the arterial and venous poles of the heart. Circ Res. 606 2014;115(9):790-9. 607 9. Yagi H, Furutani Y, Hamada H, Sasaki T, Asakawa S, Minoshima S, Ichida F, Joo K, Kimura M, Imamura

27

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

608 S, Kamatani N, Momma K, Takao A, Nakazawa M, Shimizu N, Matsuoka R. Role of TBX1 in human 609 del22q11.2 syndrome. Lancet. 2003;362(9393):1366-73. 610 10. Zhang Z, Huynh T, Baldini A. Mesodermal expression of Tbx1 is necessary and sufficient for pharyngeal 611 arch and cardiac outflow tract development. Development (Cambridge, England). 2006;133(18):3587-95. 612 11. Caprio C, Baldini A. p53 Suppression partially rescues the mutant phenotype in mouse models of 613 DiGeorge syndrome. Proceedings of the National Academy of Sciences of the United States of America. 614 2014;111(37):13385-90. 615 12. Fulcoli FG, Franzese M, Liu X, Zhang Z, Angelini C, Baldini A. Rebalancing gene haploinsufficiency in 616 vivo by targeting chromatin. Nature communications. 2016;7:11688. 617 13. Lania G, Bresciani A, Bisbocci M, Francone A, Colonna V, Altamura S, Baldini A. Vitamin B12 ameliorates 618 the phenotype of a mouse model of DiGeorge syndrome. Human molecular genetics. 2016;25(20):4369- 619 75. 620 14. Kurahashi H, Nakayama T, Osugi Y, Tsuda E, Masuno M, Imaizumi K, Kamiya T, Sano T, Okada S, 621 Nishisho I. Deletion mapping of 22q11 in CATCH22 syndrome: identification of a second critical region. 622 American journal of human genetics. 1996;58(6):1377-81. 623 15. Racedo SE, McDonald-McGinn DM, Chung JH, Goldmuntz E, Zackai E, Emanuel BS, Zhou B, Funke B, 624 Morrow BE. Mouse and human CRKL is dosage sensitive for cardiac outflow tract formation. American 625 journal of human genetics. 2015;96(2):235-44. 626 16. Verhagen JM, Diderich KE, Oudesluijs G, Mancini GM, Eggink AJ, Verkleij-Hagoort AC, Groenenberg 627 IA, Willems PJ, du Plessis FA, de Man SA, Srebniak MI, van Opstal D, Hulsman LO, van Zutven LJ, Wessels 628 MW. Phenotypic variability of atypical 22q11.2 deletions not including TBX1. American journal of medical 629 genetics Part A. 2012;158A(10):2412-20. 630 17. Guris DL, Fantes J, Tara D, Druker BJ, Imamoto A. Mice lacking the homologue of the human 22q11.2 631 gene CRKL phenocopy neurocristopathies of DiGeorge syndrome. Nature genetics. 2001;27(3):293-8. 632 18. Yamamoto GL, Aguena M, Gos M, Hung C, Pilch J, Fahiminiya S, Abramowicz A, Cristian I, Buscarilli M, 633 Naslavsky MS, Malaquias AC, Zatz M, Bodamer O, Majewski J, Jorge AA, Pereira AC, Kim CA, Passos-Bueno 634 MR, Bertola DR. Rare variants in SOS2 and LZTR1 are associated with Noonan syndrome. Journal of medical 635 genetics. 2015;52(6):413-21. 636 19. Geng J, Picker J, Zheng Z, Zhang X, Wang J, Hisama F, Brown DW, Mullen MP, Harris D, Stoler J, Seman 637 A, Miller DT, Fu Q, Roberts AE, Shen Y. Chromosome microarray testing for patients with congenital heart 638 defects reveals novel disease causing loci and high diagnostic yield. BMC genomics. 2014;15:1127. 639 20. Strehle EM, Bantock HM. The phenotype of patients with 4q-syndrome. Genetic counseling. 640 2003;14(2):195-205. 641 21. Kioka N, Ueda K, Amachi T. Vinexin, CAP/ponsin, ArgBP2: a novel adaptor protein family regulating 642 cytoskeletal organization and signal transduction. Cell structure and function. 2002;27(1):1-7. 643 22. Kakimoto Y, Ito S, Abiru H, Kotani H, Ozeki M, Tamaki K, Tsuruyama T. Sorbin and SH3 domain- 644 containing protein 2 is released from infarcted heart in the very early phase: proteomic analysis of cardiac 645 tissues from patients. Journal of the American Heart Association. 2013;2(6):e000565. 646 23. Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, 647 Alexander R, Min R, Alves P, Abyzov A, Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A, Chen DZ, 648 Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski 649 M, Lacroute P, Leng JJ, Lian J, Monahan H, O'Geen H, Ouyang Z, Partridge EC, Patacsil D, Pauli F, Raha D, 650 Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou 651 S, Sidow A, Farnham PJ, Myers RM, Weissman SM, Snyder M. Architecture of the human regulatory network

28

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

652 derived from ENCODE data. Nature. 2012;489(7414):91-100. 653 24. Zhang Q, Gao X, Li C, Feliciano C, Wang D, Zhou D, Mei Y, Monteiro P, Anand M, Itohara S, Dong X, 654 Fu Z, Feng G. Impaired Dendritic Development and Memory in Sorbs2 Knock-Out Mice. The Journal of 655 neuroscience : the official journal of the Society for Neuroscience. 2016;36(7):2247-60. 656 25. Burridge PW, Matsa E, Shukla P, Lin ZC, Churko JM, Ebert AD, Lan F, Diecke S, Huber B, Mordwinkin 657 NM, Plews JR, Abilez OJ, Cui B, Gold JD, Wu JC. Chemically defined generation of human cardiomyocytes. 658 Nat Methods. 2014;11(8):855-60. 659 26. Sanger JM, Wang J, Gleason LM, Chowrashi P, Dube DK, Mittal B, Zhukareva V, Sanger JW. Arg/Abl- 660 binding protein, a Z-body and Z-band protein, binds sarcomeric, costameric, and signaling molecules. 661 Cytoskeleton. 2010;67(12):808-23. 662 27. Goddeeris MM, Rho S, Petiet A, Davenport CL, Johnson GA, Meyers EN, Klingensmith J. Intracardiac 663 septation requires hedgehog-dependent cellular contributions from outside the heart. Development 664 (Cambridge, England). 2008;135(10):1887-95. 665 28. Kelly RG. The second heart field. Current topics in developmental biology. 2012;100:33-65. 666 29. Hoffmann AD, Peterson MA, Friedland-Little JM, Anderson SA, Moskowitz IP. sonic hedgehog is 667 required in pulmonary endoderm for atrial septation. Development (Cambridge, England). 668 2009;136(10):1761-70. 669 30. Brade T, Pane LS, Moretti A, Chien KR, Laugwitz KL. Embryonic heart progenitors and cardiogenesis. 670 Cold Spring Harbor perspectives in medicine. 2013;3(10):a013847. 671 31. Reifers F, Walsh EC, Leger S, Stainier DY, Brand M. Induction and differentiation of the zebrafish heart 672 requires fibroblast growth factor 8 (fgf8/acerebellar). Development (Cambridge, England). 673 2000;127(2):225-35. 674 32. Soubeyran P, Barac A, Szymkiewicz I, Dikic I. Cbl-ArgBP2 complex mediates ubiquitination and 675 degradation of c-Abl. The Biochemical journal. 2003;370(Pt 1):29-34. 676 33. Kong JH, Yang L, Dessaud E, Chuang K, Moore DM, Rohatgi R, Briscoe J, Novitch BG. Notch activity 677 modulates the responsiveness of neural progenitors to sonic hedgehog signaling. Developmental cell. 678 2015;33(4):373-87. 679 34. Stasiulewicz M, Gray SD, Mastromina I, Silva JC, Bjorklund M, Seymour PA, Booth D, Thompson C, 680 Green RJ, Hall EA, Serup P, Dale JK. A conserved role for Notch signaling in priming the cellular response 681 to Shh through ciliary localisation of the key Shh transducer Smo. Development (Cambridge, England). 682 2015;142(13):2291-303. 683 35. Ang SY, Uebersohn A, Spencer CI, Huang Y, Lee JE, Ge K, Bruneau BG. KMT2D regulates specific 684 programs in heart development via histone H3 lysine 4 di-methylation. Development (Cambridge, 685 England). 2016;143(5):810-21. 686 36. Jin SC, Homsy J, Zaidi S, Lu Q, Morton S, DePalma SR, Zeng X, Qi H, Chang W, Sierant MC, Hung WC, 687 Haider S, Zhang J, Knight J, Bjornson RD, Castaldi C, Tikhonoa IR, Bilguvar K, Mane SM, Sanders SJ, Mital S, 688 Russell MW, Gaynor JW, Deanfield J, Giardini A, Porter GA, Jr., Srivastava D, Lo CW, Shen Y, Watkins WS, 689 Yandell M, Yost HJ, Tristani-Firouzi M, Newburger JW, Roberts AE, Kim R, Zhao H, Kaltman JR, Goldmuntz 690 E, Chung WK, Seidman JG, Gelb BD, Seidman CE, Lifton RP, Brueckner M. Contribution of rare inherited 691 and de novo variants in 2,871 congenital heart disease probands. Nature genetics. 2017;49(11):1593-601. 692 37. Strehle EM, Yu L, Rosenfeld JA, Donkervoort S, Zhou Y, Chen TJ, Martinez JE, Fan YS, Barbouth D, Zhu 693 H, Vaglio A, Smith R, Stevens CA, Curry CJ, Ladda RL, Fan ZJ, Fox JE, Martin JA, Abdel-Hamid HZ, McCracken 694 EA, McGillivray BC, Masser-Frye D, Huang T. Genotype-phenotype analysis of 4q deletion syndrome: 695 proposal of a critical region. American journal of medical genetics Part A. 2012;158A(9):2139-51.

29

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

696 38. Xu W, Ahmad A, Dagenais S, Iyer RK, Innis JW. Chromosome 4q deletion syndrome: narrowing the 697 cardiovascular critical region to 4q32.2-q34.3. American journal of medical genetics Part A. 698 2012;158A(3):635-40. 699 39. Huang T, Lin AE, Cox GF, Golden WL, Feldman GL, Ute M, Schrander-Stumpel C, Kamisago M, 700 Vermeulen SJ. Cardiac phenotypes in chromosome 4q- syndrome with and without a deletion of the 701 dHAND gene. Genetics in medicine : official journal of the American College of Medical Genetics. 702 2002;4(6):464-7. 703 40. Maurin ML, Labrune P, Brisset S, Le Lorc'h M, Pineau D, Castel C, Romana S, Tachdjian G. Molecular 704 cytogenetic characterization of a 4p15.1-pter duplication and a 4q35.1-qter deletion in a recombinant of 705 chromosome 4 pericentric inversion. American journal of medical genetics Part A. 2009;149A(2):226-31. 706 41. Tsai CH, Van Dyke DL, Feldman GL. Child with velocardiofacial syndrome and del (4)(q34.2): another 707 critical region associated with a velocardiofacial syndrome-like phenotype. American journal of medical 708 genetics. 1999;82(4):336-9. 709 42. Lin AE, Garver KL, Diggans G, Clemens M, Wenger SL, Steele MW, Jones MC, Israel J. Interstitial and 710 terminal deletions of the long arm of chromosome 4: further delineation of phenotypes. American journal 711 of medical genetics. 1988;31(3):533-48. 712 43. Mitchell JA, Packman S, Loughman WD, Fineman RM, Zackai E, Patil SR, Emanual B, Bartley JA, Hanson 713 JW. Deletions of different segments of the long arm of chromosome 4. American journal of medical 714 genetics. 1981;8(1):73-89. 715 44. Dyer LA, Kirby ML. The role of secondary heart field in cardiac development. Developmental biology. 716 2009;336(2):137-44. 717 45. Tsuchihashi T, Maeda J, Shin CH, Ivey KN, Black BL, Olson EN, Yamagishi H, Srivastava D. Hand2 718 function in second heart field progenitors is essential for cardiogenesis. Developmental biology. 719 2011;351(1):62-9. 720 46. Klaus A, Muller M, Schulz H, Saga Y, Martin JF, Birchmeier W. Wnt/beta-catenin and Bmp signals 721 control distinct sets of transcription factors in cardiac progenitor cells. Proceedings of the National 722 Academy of Sciences of the United States of America. 2012;109(27):10921-6. 723 47. Nakatani T, Mizuhara E, Minaki Y, Sakamoto Y, Ono Y. Helt, a novel basic-helix-loop-helix 724 transcriptional repressor expressed in the developing central nervous system. The Journal of biological 725 chemistry. 2004;279(16):16356-67. 726 48. Watkins WS, Hernandez EJ, Wesolowski S, Bisgrove BW, Sunderland RT, Lin E, Lemmon G, Demarest 727 BL, Miller TA, Bernstein D, Brueckner M, Chung WK, Gelb BD, Goldmuntz E, Newburger JW, Seidman CE, 728 Shen Y, Yost HJ, Yandell M, Tristani-Firouzi M. De novo and recessive forms of congenital heart disease 729 have distinct genetic and phenotypic landscapes. Nature communications. 2019;10(1):4722. 730 49. Thilenius OG, Bharati S, Lev M. Subdivided left atrium: an expanded concept of cor triatriatum 731 sinistrum. The American journal of cardiology. 1976;37(5):743-52. 732 50. Roberson DA, Javois AJ, Cui W, Madronero LF, Cuneo BF, Muangmingsuk S. Double atrial septum with 733 persistent interatrial space: echocardiographic features of a rare atrial septal malformation. Journal of the 734 American Society of Echocardiography : official publication of the American Society of Echocardiography. 735 2006;19(9):1175-81. 736 51. Breithardt OA, Papavassiliu T, Borggrefe M. A coronary embolus originating from the interatrial 737 septum. European heart journal. 2006;27(23):2745. 738 52. Seyfert H, Bohlscheid V, Bauer B. Double atrial septum with persistent interatrial space and transient 739 ischaemic attack. European journal of echocardiography : the journal of the Working Group on

30

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

740 Echocardiography of the European Society of Cardiology. 2008;9(5):707-8. 741 53. Marci M, Guarina A, Castiglione MC, Sanfilippo N. Different Cardiac Anomalies in Mother and Son 742 with 4q-Syndrome. Case reports in genetics. 2015;2015:932651. 743 54. Devine WP, Wythe JD, George M, Koshiba-Takeuchi K, Bruneau BG. Early patterning and specification 744 of cardiac progenitors in gastrulating mesoderm. eLife. 2014;3. 745 55. Lescroart F, Chabab S, Lin X, Rulands S, Paulissen C, Rodolosse A, Auer H, Achouri Y, Dubois C, Bondue 746 A, Simons BD, Blanpain C. Early lineage restriction in temporally distinct populations of Mesp1 progenitors 747 during mammalian heart development. Nature cell biology. 2014;16(9):829-40. 748 56. Andersen P, Tampakakis E, Jimenez DV, Kannan S, Miyamoto M, Shin HK, Saberi A, Murphy S, Sulistio 749 E, Chelko SP, Kwon C. Precardiac organoids form two heart fields via Bmp/Wnt signaling. Nature 750 communications. 2018;9(1):3140. 751 57. Wang W, Niu X, Stuart T, Jullian E, Mauck WM, 3rd, Kelly RG, Satija R, Christiaen L. A single-cell 752 transcriptional roadmap for cardiopharyngeal fate diversification. Nature cell biology. 2019;21(6):674-86. 753 58. Ding Y, Yang J, Chen P, Lu T, Jiao K, Tester D, Jiang K, Ackerman MJ, Li Y, Wang DW, Wang DW, Lee 754 H-C, Xu X. SORBS2 is a susceptibility gene to arrhythmogenic right ventricular 755 cardiomyopathy. bioRxiv. 2019:725077. 756 59. Lombardi R, Marian AJ. Arrhythmogenic right ventricular cardiomyopathy is a disease of cardiac stem 757 cells. Current opinion in cardiology. 2010;25(3):222-8. 758 60. Schwenk F, Baron U, Rajewsky K. A cre-transgenic mouse strain for the ubiquitous deletion of loxP- 759 flanked gene segments including deletion in germ cells. Nucleic acids research. 1995;23(24):5080-1. 760 61. Barriot R BJ, Thienpont B, Brohée S, Van Vooren S, Coessens B, Tranchevent LC, Van Loo P, Gewillig 761 M, Devriendt K, Moreau Y. Collaboratively charting the gene-to-phenotype network of human congenital 762 heart defects. Genome Med. 2010;2(3). 763 62. Fahed AC, Gelb BD, Seidman JG, Seidman CE. Genetics of congenital heart disease: the glass half 764 empty. Circ Res. 2013;112(4):707-20. 765 63. Cuturilo G, Menten B, Krstic A, Drakulic D, Jovanovic I, Parezanovic V, Stevanovic M. 4q34.1-q35.2 766 deletion in a boy with phenotype resembling 22q11.2 deletion syndrome. European journal of pediatrics. 767 2011;170(11):1465-70. 768 64. Vona B, Nanda I, Neuner C, Schroder J, Kalscheuer VM, Shehata-Dieler W, Haaf T. Terminal 769 chromosome 4q deletion syndrome in an infant with hearing impairment and moderate syndromic features: 770 review of literature. BMC medical genetics. 2014;15:72.

771

772

31

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

773 Figures

774

775 Figure 1. SORBS2 is a candidate CHD gene

776 A. Alignment of 4q deletions identified by us and others(19, 37, 38, 40, 63, 64). Asterisk, 4q

777 deletions not including HAND2, the known cardiac development regulator responsible for

778 CHD of 4q deletion syndrome. Green box, the HAND2 gene. Red box, the SORBS2 gene within 32

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

779 the minimal overlapping region among 4q deletions indicated by asterisks.

780 B. Co-regulation network of SORBS2 with multiple known CHD-related genes. SORBS2 is

781 highlighted in red box. Color-coded dots represent different clusters.

782

783

33

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

784 785 Figure 2. Cardiac phenotype of Sorbs2-/- mice.

786 A. Gross view of embryos at E18.5.

787 B. HE-stained paraffin sections of E18.5 hearts in conotruncal area.

788 C. HE-stained paraffin sections of E18.5 heart in atrial septum area. Asterisk indicates the

789 absence of PAS. Two sections in the right are from the same heart with DAS. The rightmost

790 section is dorsal to the other. AO, aorta. PT, pulmonary trunk. LSA, left subclavian artery.

791 RSA, right subclavian artery. LCA, left common carotid artery. RCA, right common carotid

792 artery. LA, left atrium. RA, right atrium. LV, left ventricle. RV, right ventricle. PAS, primary

793 atrial septum. AAS, accessory atrial septum. IVS, interventricular septum. Scale bar: 1mm in

794 A, 200μm in B and C.

795

796

34

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

797

798 Figure 3. Defective sarcomeric structure but normal electrophysiology of SORBS2-

799 knockdown cardiomyocytes.

800 A. Immunostaining of D30 cells with anti-cardiac troponin I (cTnl, red) and anti-α-actinin

801 (green) antibodies. Boxed areas are magnified in the lower panels. Scale bar: 25μm in upper

802 panel and 5μm in lower panel.

803 B. Quantification of cardiomyocytes with well-organized sarcomeres (Control: n=211,

804 SORBS2-knockdown: n=197). **, p <0.01.

805 C. Transmission electron microscopy images of D30 cardiomyocytes. ZL, Z line.

806 D. Representative action potentials of hESCs-derived ventricular-like cardiomyocytes.

807 E-H. Quantification of action potential parameters of ventricular-like cardiomyocytes

808 (Control, n =6; SORBS2-knockdown, n = 6). Frequency (E), action potential duration (F), the

809 peak amplitude (G), and the resting membrane potential (H).

35

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

810

811 Figure 4. Impair cardiomyocyte differentiation and SHF specification in in vitro

812 cardiogenesis of SORBS2-knockdown hESCs.

813 A. Flow cytometry analysis of cardiomyocytes at D15. P3 indicates cTnT+ population.

814 B. Quantification of cTnT+ cells (n = 4). **, p <0.01.

815 C-F. qPCR quantification of cardiogenic marker expression at different differentiation time

816 points (n = 3 for each time point). *, p <0.05, **, p <0.01.

817 G. qPCR quantification of SORBS2 expression dynamics (n = 3 for each time point). **, p

36

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

818 <0.01.

819 H-M. qPCR quantification of cardiac progenitor marker expression at different time points (n

820 = 3 for each time point). *, p <0.05, **, p <0.01.

821

822

37

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

823

824 Figure 5. SORBS2 promotes SHF specification through c-ABL/NOTCH/SHH axis.

825 A. Volcano plot illustrates the differential gene expression from D5 RNA-seq data. Orange,

826 down-regulated genes. Blue, up-regulated genes. (|log2(fold change)| > 1 and padj. <0.05) B-

827 C. GO biological process analysis of differentially expressed genes. Up-regulated pathways

828 (B). Down-regulated pathways (C).

829 D. Heatmap illustrates gene expression changes of critical signaling pathways. Color tints

830 correspond to expression levels. *, padj<0.05. **, padj<0.01. ***, padj<0.001.

831 E. qPCR quantification of SHH signaling targets at different time points (n = 3 for each time

832 point). *, p <0.05, **, p <0.01.

833 F. Western blot analysis of c-ABL expression in D5 cells.

834 G. Quantification of Western blot (n=3). *, p <0.05.

835 H. qPCR quantification of NOTCH1 signaling targets at different time points (n = 3 for each

38

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

836 time point). *, p <0.05.

837 I. Western blot analysis of NOTCH1 expression in D5 cells.

838 J. Quantification of Western blot (n=3). **, p <0.01.

839 K. Diagram depicting the role of SORBS2 in cardiogenesis.

840

841

39

bioRxiv preprint doi: https://doi.org/10.1101/2020.05.12.087452; this version posted May 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

842

843 Figure 6. Rare SORBS2 variants are significantly enriched in CHD patients.

844 A. Descriptive statistics of the identified exonic variants.

845 B. Manhattan plot of gene level Fisher’s exact test of rare damaging variant counts between

846 CHD and control groups. Raw p values of 0.05 and 0.01 were indicated by a blue line and a

847 grey dash line, respectively. Genes (SORBS2, KMT2D) with q value lower than 0.2 were

848 highlighted in red. Genes (EVC2, SH3PXD2B) with p < 0.05 but q > 0.2 were highlighted in

849 green.

850 C. Illustration of rare damaging variants in SORBS2. Most variants are indicated in the main

851 SORBS2 isoform (SORBS2-201). Three isoform-specific variants are shown in the

852 corresponding exons. Variants in the CHD or control groups are indicated by orange or pink

853 dots, respectively. Variants appearing in both groups are indicated by grey dots.

40