
bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

The extracellular matrix phenome across species

Cyril Statzer1 and Collin Y. Ewald1*

1 Eidgenössische Technische Hochschule Zürich, Department of Health Sciences and Technology, Institute of Translational Medicine, Schwerzenbach-Zürich CH-8603, Switzerland

*Corresponding authors: [email protected] (CYE)

Keywords: Phenome, genotype-to-phenotype, matrisome, extracellular matrix, data mining.

Highlights

• 7.6% of the human phenome originates from variations in matrisome genes

• 11’671 phenotypes are linked to matrisome genes of humans, mice, zebrafish, Drosophila, and C. elegans

• Expected top ECM phenotypes are developmental, morphological, and structural phenotypes

• Nonobvious top ECM phenotypes include immune system, stress resilience, and age-related phenotypes


Abstract

Extracellular matrices are essential for cellular and organismal function. Recent genome-wide and phenome-wide association studies have started to reveal a broad spectrum of phenotypes associated with genetic variants. However, the phenome, or spectrum of all phenotypes associated with genetic variants in extracellular matrix genes, is unknown. Here, we analyzed over two million recorded genotype-to-phenotype relationships across hundreds of species to define their extracellular matrix phenomes. By using previously defined matrisomes of humans, mice, zebrafish, Drosophila, and C. elegans, we found that the extracellular matrix phenome comprises 3-10% of the entire phenome. Collagens (COL1A1, COL2A1) and fibrillin (FBN1) are each associated with more than 150 distinct phenotypes in humans, whereas collagen COL4A1 and Wnt- and sonic hedgehog (shh) signaling are predominantly associated with phenotypes in other species. We determined the phenotypic fingerprints of matrisome genes and found that MSTN, CTSD, LAMB2, HSPG2, and COL11A2 and their corresponding orthologues have the most phenotypes across species. From the 42’558 unique matrisome genotype-to-phenotype relationships across the five species with defined matrisomes, we constructed interaction networks to identify the underlying molecular components connecting with orthologous phenotypes and with novel phenotypes. Thus, our networks provide a framework to predict unassessed phenotypes and their potential underlying molecular interactions. These frameworks inform on matrisome genotype-to-phenotype relationships and potentially provide a sophisticated choice of biological model system to study human phenotypes and diseases.


Introduction

A principal goal in biology is to link molecular mechanisms and biological pathways to phenotypes. Recent systems-level approaches, namely genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS), have been harnessed to generate and map genotype-to-phenotype relationships. These efforts are invaluable for clinical research, spearheaded by linking electronic health records to medically relevant PheWAS, thereby associating genetic variants with human phenotypes, pathologies, and diseases [1-4]. Assuming that conserved genes and pathways result in comparable phenotypes across species, knowing the genotype-to-phenotype relationship in one species could inform on the existence of a similar genotype-to-phenotype relationship in another species. The phenome is the entire set of phenotypes resulting from all possible genetic variations [5,6]. To make trans-species genome-phenome inferences, the phenomes of several species need to be known. The human phenome is far from reaching saturation [6]. We estimate that the current human phenome consists of 7’037 unique phenotypes (Source: Database Monarch Initiative; timestamp 23.08.2019). From the entire human phenome, several sub-phenomes have been characterized and defined, including the behavioral phenome [7], the aging phenome [8,9], and the disease phenome [10-12]. These genotype-to-phenotype relationships can be used to build networks and pathways.

In particular, the human disease phenome consists of over 80’000 genetic variants associated with human diseases, of which cardiovascular diseases and skin and connective tissue diseases share the most pathways [12]. These diseases are enriched in variants located in or close to collagen or other extracellular matrix genes. Cells secrete proteins, such as collagens, glycoproteins, and proteoglycans, that integrate and form matrices in the extracellular space. The extracellular matrix (ECM) provides structural support and is important for cell-cell communication and cellular homeostasis [13,14]. The ECM has emerged as a key feature for overall health or disease, with several clinical trials aiming to modify the ECM [15-18]. Furthermore, recent approaches that combine PheWAS from mice to humans or from zebrafish to humans revealed clinically relevant variants in collagen Col6a5 or in Ric1, which is important for collagen secretion [19,20]. This indicates that a phenome-based approach across species can be of translational value. However, a defined phenome of collagens and other extracellular matrix genes is currently missing for any species.

Here, we mine publicly available databases to establish the phenome of extracellular matrices across species. We take advantage of the large collection of over two million genotype-to-phenotype associations provided by the Monarch Initiative, which integrates data across hundreds of species from more than 30 different curated databases [21]. For the ECM phenome, we include all genes that make up the matrisome. The matrisome is the compendium of all possible gene products that either form, remodel, or associate with the extracellular matrix [22]. The matrisome of humans consists of 1027 proteins [23], that of mice 1110 proteins [23], zebrafish 1002 proteins [24], Drosophila 641 proteins [25], and C. elegans 719 proteins [26]. Thus, roughly 4% of their genomes are dedicated to ECM genes. We find a similar contribution of 3-10% (overall 6.7%) of the phenome, consisting of 42’558 ECM-specific gene-to-phenotype associations. Across these five species, 639’431 gene-phenotype associations were recorded, consisting of 34’818 distinct phenotypes. We identify the phenotypic landscape of the matrisome. Our genotype-to-phenotype interaction maps for matrisome genes reveal unstudied interactions when compared across species.
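The overall 6.7% figure follows directly from the two association counts reported in this paragraph. A minimal arithmetic check in Python, where the variable names are ours and purely illustrative:

```python
# Counts as reported in the text for the five species with defined matrisomes.
ecm_associations = 42_558    # ECM-specific gene-to-phenotype associations
all_associations = 639_431   # all recorded gene-phenotype associations

# Share of all associations that involve a matrisome gene, in percent.
overall_share = 100 * ecm_associations / all_associations
print(f"{overall_share:.1f}%")  # prints 6.7%
```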


Methods and Materials

Matrisome phenotype-to-genotype relationships

The phenotype-gene association table was obtained from the Monarch Initiative (Source: Database Monarch Initiative; timestamp 23.08.2019) [21]. Duplicated species-gene-phenotype relationships (< 0.002% in total) and entries from unspecified organisms were removed from this table (Supplementary Table 1). The genotype-to-phenotype information was then combined with the defined matrisomes of five species: humans [23], mice [23], zebrafish [24], Drosophila [25], and C. elegans [26]. To enable cross-species comparisons among the species with defined and undefined matrisomes, we mapped all genes documented in the Monarch Initiative database to their corresponding orthologues. The homology mapping was achieved by using the homologene database of the National Center for Biotechnology Information (NCBI) as implemented in the homologene R package (Ogan Mancarci (2019). homologene: Quick Access to Homologene and Gene Annotation Updates. R package version 1.4.68.19.3.27. https://CRAN.R-project.org/package=homologene). The human homology information subset and its associated phenome were then further analyzed. All data analysis was performed using the purrr and dplyr R packages (Lionel Henry and Hadley Wickham (2019). purrr: Functional Programming Tools. R package version 0.3.3. https://CRAN.R-project.org/package=purrr; Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2019). dplyr: A Grammar of Data Manipulation. R package version 0.8.3. https://CRAN.R-project.org/package=dplyr).
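The cleaning and annotation steps described above can be sketched as follows. This is a minimal Python illustration, not the R pipeline used in the study: the rows, the `matrisome` sets, and the `to_human` orthologue table are hypothetical stand-ins for the Monarch Initiative table, the published matrisome lists [23-26], and the homologene mapping.

```python
# Toy association rows: (species, gene, phenotype). All values are illustrative.
associations = [
    ("Homo sapiens", "COL1A1", "Scoliosis"),
    ("Homo sapiens", "COL1A1", "Scoliosis"),      # exact duplicate, dropped below
    ("Homo sapiens", "TP53", "Neoplasm"),
    ("Mus musculus", "Col1a1", "decreased body size"),
    ("", "GENE_X", "some phenotype"),             # unspecified organism, dropped below
]

# Hypothetical matrisome gene lists per species.
matrisome = {
    "Homo sapiens": {"COL1A1", "FBN1"},
    "Mus musculus": {"Col1a1", "Fbn1"},
}

# Hypothetical orthologue table: (species, gene) -> human gene symbol.
to_human = {
    ("Homo sapiens", "COL1A1"): "COL1A1",
    ("Mus musculus", "Col1a1"): "COL1A1",
}

def clean(rows):
    """Drop rows without a species and duplicate triples, keeping first-seen order."""
    seen = set()
    out = []
    for row in rows:
        species = row[0]
        if not species or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out

def annotate(rows):
    """Attach a matrisome flag and the human orthologue (None if unmapped)."""
    annotated = []
    for species, gene, phenotype in rows:
        annotated.append({
            "species": species,
            "gene": gene,
            "phenotype": phenotype,
            "is_matrisome": gene in matrisome.get(species, set()),
            "human_orthologue": to_human.get((species, gene)),
        })
    return annotated

cleaned = clean(associations)
annotated = annotate(cleaned)
matrisome_rows = [r for r in annotated if r["is_matrisome"]]
```

Keeping the human orthologue as a separate column, rather than overwriting the gene name, leaves the species-specific identifier available for later per-species summaries.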

Grouping of orthologous phenotypes

To facilitate the comparison between species, the phenotypes were simplified and grouped to form larger phenotype collections (Supplementary Table 2). These phenotype collections both enable cross-species comparisons by avoiding species-specific terminology and give a broader overview of the processes involved. Substring matching, implemented with the stringr R package (Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. https://CRAN.R-project.org/package=stringr), was used to assign similar phenotypes to phenotype groups, while ensuring that each original phenotype belonged to only a single phenotype group. Capitalization and terminal white spaces were ignored.
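One way to realize such a single-assignment substring grouping is a first-match-wins lookup over an ordered keyword list. The Python sketch below is an assumption about how this could be done, not the stringr code used in the study. The "altered body size" keywords are a subset of those named in the Results; the "poor viability" keywords are hypothetical, and the actual lists are in Supplementary Table 2.

```python
# Ordered (group, keywords) pairs. Order matters: the first group with a
# matching keyword wins, so each phenotype lands in at most one group.
GROUP_KEYWORDS = [
    ("altered body size",
     ["body weight", "body size", "body length", "dwarf", "dumpy", "stunted"]),
    ("poor viability", ["lethal", "premature death"]),  # hypothetical keywords
]

def assign_group(phenotype):
    """Return the first matching group name, or None if no keyword matches."""
    # Ignore capitalization and terminal white space, as described in the text.
    normalized = phenotype.strip().lower()
    for group, keywords in GROUP_KEYWORDS:
        if any(keyword in normalized for keyword in keywords):
            return group
    return None

examples = ["Decreased body weight ", "dumpy", "embryonic lethal", "Scoliosis"]
grouped = {phenotype: assign_group(phenotype) for phenotype in examples}
```

Phenotypes that match no keyword stay ungrouped (`None` here), which corresponds to the fraction of associations that could not be assigned to a collection.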

Data analysis and visualization

The circular dendrograms were generated with a single hub gene as the central node, connected to the species in which it was observed, and the phenotypes as terminal nodes. The interaction network for each collection of gene-phenotype interactions was computed by extracting the genes associated with the largest number of phenotypes, followed by selecting the phenotypes associated with the largest number of genes. The edges of the generated network were then faceted by species, and their weight was set according to the number of species in which a phenotype was observed for the orthologues of a particular gene. Network analysis and visualization were performed using the igraph and ggraph R packages (Csardi G, Nepusz T: The igraph software package for complex network research, InterJournal, Complex Systems 1695. 2006. http://igraph.org; Thomas Lin Pedersen (2018). ggraph: An Implementation of Grammar of Graphics for Graphs and Networks. R package version 1.0.2. https://CRAN.R-project.org/package=ggraph). In addition, the R packages ggplot2, ggforce, and ggpubr were used (H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016; Thomas Lin Pedersen (2019). ggforce: Accelerating 'ggplot2'. R package version 0.3.1. https://CRAN.R-project.org/package=ggforce; Alboukadel Kassambara (2019). ggpubr: 'ggplot2' Based Publication Ready Plots. R package version 0.2.2. https://CRAN.R-project.org/package=ggpubr). Illustrations were generated using Inkscape version 0.92.
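The network construction described above (select the most connected genes, then the most connected phenotypes among them, then weight edges by species count) can be sketched in plain Python. The function names, the tie-breaking rule, and the toy rows are our assumptions; the study itself used igraph and ggraph in R.

```python
from collections import defaultdict

# Toy rows: (species, human-orthologue gene, phenotype group). Illustrative only.
rows = [
    ("human", "MSTN", "muscle"), ("mouse", "MSTN", "muscle"),
    ("zebrafish", "MSTN", "muscle"),
    ("human", "MSTN", "body size"), ("mouse", "MSTN", "body size"),
    ("mouse", "MMP9", "muscle"), ("mouse", "MMP9", "immune"),
    ("human", "FBN1", "eye"),
]

def top_keys(pairs, n):
    """Rank keys by number of distinct partners; ties are broken alphabetically."""
    partners = defaultdict(set)
    for key, partner in pairs:
        partners[key].add(partner)
    ranked = sorted(partners, key=lambda k: (-len(partners[k]), k))
    return ranked[:n]

def build_network(rows, n_genes, n_phenotypes):
    # Step 1: genes associated with the largest number of distinct phenotypes.
    genes = set(top_keys(((g, p) for _, g, p in rows), n_genes))
    kept = [r for r in rows if r[1] in genes]
    # Step 2: among those rows, phenotypes linked to the largest number of genes.
    phenotypes = set(top_keys(((p, g) for _, g, p in kept), n_phenotypes))
    # Step 3: edge weight = number of species in which the pair was observed.
    species_per_edge = defaultdict(set)
    for species, gene, phenotype in kept:
        if phenotype in phenotypes:
            species_per_edge[(gene, phenotype)].add(species)
    return {edge: len(species) for edge, species in species_per_edge.items()}

edges = build_network(rows, n_genes=2, n_phenotypes=2)
```

The resulting dictionary maps each (gene, phenotype) edge to its weight and could be handed to any graph library for layout and plotting.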


Results

The matrisome-phenome across species

To define the extracellular matrix phenome, we sought to identify the overall contribution of matrisome genes to the phenome for each species. To assess the suitability of this approach, we first plotted the phenotypic landscape in relation to genotypes across species (Fig. 1A). We found 7’037 unique phenotypes comprising the human phenome, with 3’809 unique genes linked to these phenotypes (Supplementary Fig. 1). Fortunately, the species with a good phenotype-to-genotype ratio are the ones that have a fully defined and characterized matrisome [23-26]. We used these defined matrisome gene lists to determine their contribution to the phenomes of humans, mice, zebrafish, Drosophila, and C. elegans (Fig. 1B-F). The contribution of the matrisome to the human phenome is 7.6%, with a comparable 2.8-9.5% across species (Fig. 1B-F). Thus, the matrisome is involved in a number of phenotypes but corresponds to at most 10% of the overall recorded phenome across species.
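A per-species contribution of this kind can be computed as the share of gene-phenotype associations that involve a matrisome gene. The Python sketch below assumes this association-based definition; the rows and their matrisome flags are toy values, not data from the study.

```python
from collections import Counter

# Toy rows: (species, gene, phenotype, is_matrisome). Illustrative only.
rows = [
    ("human", "COL1A1", "Scoliosis", True),
    ("human", "TP53", "Neoplasm", False),
    ("human", "BRCA1", "Neoplasm", False),
    ("human", "FBN1", "Myopia", True),
    ("worm", "emb-9", "embryonic lethal", True),
    ("worm", "unc-54", "paralyzed", False),
    ("worm", "daf-2", "dauer", False),
    ("worm", "daf-16", "short lived", False),
]

def matrisome_share(rows):
    """Percentage of associations per species that involve a matrisome gene."""
    total = Counter()
    ecm = Counter()
    for species, _gene, _phenotype, is_matrisome in rows:
        total[species] += 1
        if is_matrisome:
            ecm[species] += 1
    return {species: 100 * ecm[species] / total[species] for species in total}

shares = matrisome_share(rows)
```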

The phenotypic signatures of matrisome genes across species

Next, we determined which phenotypes are associated with the greatest number of matrisome genes. For humans, the top three phenotypes are “scoliosis”, “short stature”, and “micrognathia” (Supplementary Fig. 2). By contrast, the top three phenotypes are “decreased body weight”, “immune system”, and “premature death” for mice; “decreased eye size”, “decreased length whole organism”, and “lethal” for zebrafish; “uncharacterized phenotype”, “viable”, and “lethal” for Drosophila; and “locomotion variant”, “embryonic lethal”, and “dumpy” for C. elegans (Supplementary Fig. 2). Given that certain phenotypes are species-specific (for instance, “dumpy” describes short and fat C. elegans), we realized that we cannot compare the phenotypic landscape across species using this conventional phenotypic nomenclature. To overcome this obstacle, we grouped phenotypes into novel phenotype collections. For instance, the phenotypic category “altered body size” includes phenotypes with the keywords body weight, body size, body length, growth phenotype, dwarf, small, dumpy, stunted, tall, and others (Methods and Materials, Supplementary Tables 1 and 2). With this approach, we were able to group 86.8% of the total 716’644 species-gene-phenotype associations (Supplementary Table 2). We identified sterility, development, and altered body size as the most dominant phenotypic categories across species, together with, for instance, connective tissue and skin phenotypes for vertebrates and poor viability for invertebrates (Fig. 2). Similarly, besides skin and connective tissue, many morphological phenotypes affecting bone, eye, brain, muscle, and the cardiovascular system are represented in these top categories (Fig. 2). Unexpectedly, aging-related phenotypes, stress resilience, and immune system phenotypes ranked among these top ECM phenotypic categories across species (Fig. 2). Since these observations are based on five species, we extended our analysis to incorporate all known genotype-phenotype associations across 44 species. Thereby, we were able to extend the matrisome by homology mapping to three additional species: Bos taurus (cattle), Rattus norvegicus (rat), and Canis lupus familiaris (dog) (Supplementary Table 3). Then, we used these matrisome input gene lists to define their matrisome-phenome. We found similar top phenotypes for both the ungrouped phenotypes and our comparable phenotypic categories (Supplementary Fig. 3 and 4). Thus, our analysis revealed known ECM phenotypes but also unexpected phenotypes related to the ECM with high penetrance across species.


Top-ranked matrisome gene signature of the ECM-phenomes

To understand which genes are the major drivers of multiple distinct phenotypes, we ranked matrisome genes based on their number of associated phenotypes (Fig. 3, Supplementary Fig. 5). For humans, collagens (COL2A1, COL1A1, COL5A1), fibrillin (FBN1), and growth differentiation factor (GDF5) are each associated with more than 150 distinct phenotypes (Fig. 3A). By contrast, for mice, zebrafish, Drosophila, and C. elegans, Wnt- and sonic hedgehog (shh) signaling and other matrisome-secreted factors are predominantly associated with the greatest number of distinct phenotypes (Fig. 3B-E, Supplementary Fig. 5). Except for zebrafish, the highly conserved collagen type IV (COL4A1/emb-9) is associated with a range of 20-100 phenotypes across species (Fig. 3). Overall, members of each matrisome category are represented, and comparable conserved gene families are linked with numerous phenotypes.
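Ranking genes by their number of distinct associated phenotypes reduces to counting unique phenotypes per gene within one species. A minimal Python sketch with toy pairs; the gene-phenotype pairings below are illustrative and not taken from the dataset:

```python
from collections import defaultdict

# Toy (gene, phenotype) pairs for a single species. Illustrative only.
pairs = [
    ("COL2A1", "Short stature"), ("COL2A1", "Myopia"), ("COL2A1", "Cleft palate"),
    ("FBN1", "Scoliosis"), ("FBN1", "Myopia"),
    ("GDF5", "Short stature"),
    ("COL2A1", "Myopia"),  # repeated pair, counted only once
]

def rank_genes(pairs):
    """Return (gene, n_distinct_phenotypes) sorted by count, then by gene name."""
    phenotypes = defaultdict(set)
    for gene, phenotype in pairs:
        phenotypes[gene].add(phenotype)
    counts = ((gene, len(found)) for gene, found in phenotypes.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))

ranking = rank_genes(pairs)
```

Using a set per gene ensures that repeated records of the same association do not inflate a gene's rank.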

Phenotypic fingerprints associated with matrisome genes

Using our phenotype grouping approach, we observed that certain genes were associated with comparable phenotypes across species. For this comparison, we used our phenotypic categories and plotted all matrisome genes as dendrograms connecting the gene with its phenotypes through the species in which they were observed (Fig. 4, Supplementary Fig. 6). The most phenotypically conserved gene is the secreted muscle-growth-controlling myostatin (MSTN), showing muscle, fat, and body size phenotypes in seven species (Fig. 4A, B). This is not surprising, since either loss or pharmacological inhibition of myostatin leads to almost a doubling of muscle mass, a trait applied in livestock and a potential target for treating muscle wasting diseases [27]. Next is the cathepsin D (CTSD) gene, which encodes a lysosomal aspartyl protease that can be secreted into the extracellular space to remodel the ECM and is involved in many physiological functions, including skin and neuronal development, lipofuscin (age-pigment) removal, and apoptosis [28]. Our analysis of CTSD showed the full phenotypic spectrum across six species, with many conserved phenotypes but also species-specific phenotypes (Fig. 4C, Supplementary Fig. 6). Similarly, laminin (LAMB2), heparan sulfate proteoglycan (HSPG2), and collagen (COL11A2) showed comparable orthologous phenotypes across multiple species, ranging from altered body size to aging-related phenotypes (Fig. 4D-F). The phenotypic fingerprints of all matrisome genes are shown in Supplementary Fig. 6 and 7. Thus, we have defined the phenotypic landscape of matrisome genes across species.
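The gene-to-species-to-phenotype structure behind such a fingerprint can be represented as a nested mapping, from which phenotype groups shared by all species are easy to extract. The following Python sketch uses toy rows and hypothetical function names; it illustrates the data structure only, not the dendrogram plotting.

```python
from collections import defaultdict

# Toy rows: (human-orthologue gene, species, phenotype group). Illustrative only.
rows = [
    ("MSTN", "human", "muscle"), ("MSTN", "mouse", "muscle"),
    ("MSTN", "cattle", "muscle"),
    ("MSTN", "mouse", "fat"), ("MSTN", "cattle", "body size"),
    ("CTSD", "human", "brain"), ("CTSD", "mouse", "brain"),
]

def fingerprints(rows):
    """Build gene -> species -> set of phenotype groups (hub, branches, leaves)."""
    tree = defaultdict(lambda: defaultdict(set))
    for gene, species, group in rows:
        tree[gene][species].add(group)
    return tree

def conserved_groups(tree, gene):
    """Phenotype groups observed for the gene in every species that records it."""
    per_species = list(tree[gene].values())
    return set.intersection(*per_species) if per_species else set()

tree = fingerprints(rows)
mstn_conserved = conserved_groups(tree, "MSTN")
```

Here the gene is the hub, the species are the intermediate nodes, and the phenotype groups are the terminal nodes, mirroring the dendrogram layout described in the Methods.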

Network analysis to identify novel phenotypes and underlying molecular mechanisms

We next asked whether we could use this phenotypic landscape of matrisome genes to inform on potential phenotypes not assessed in humans. Furthermore, could we use gene-phenotype pairs in one species to infer novel interactions in another species? To achieve this, we built matrisome-gene-to-phenotype networks across species for all matrisome categories and divisions. These networks are shown in Supplementary Fig. 8 and 9. We chose the three core-matrisome categories, which include ECM glycoproteins, collagens, and proteoglycans, to build a network of the five most abundant interspecies gene-to-phenotype associations for H. sapiens, M. musculus, and D. rerio (Fig. 5). We positioned these top five conserved genes and phenotypes for each species at the same location in our graphical network (Fig. 5). We identified the conserved gene-to-phenotype interactions (bold arrows), but also interactions that only occur in a single species (light arrows; Fig. 5). For instance, genetic variants in the protein agrin are associated in humans solely with a “sterility” phenotype, whereas in mice, in addition to sterility, agrin is associated with altered body size, morphology, and cardiovascular impairments (Fig. 5). Next, we selected the most connected matrisome genes and mapped the genotype-to-phenotype relationships across species (Fig. 6). Myostatin (MSTN) and collagen type VII (COL7A1) are the most conserved drivers for muscle and skin phenotypes, respectively (Fig. 6). The laminin (LAMA2), heparan sulfate proteoglycan (HSPG2), and sonic hedgehog (SHH) genes build a strong interaction network among brain, muscle, and cardiovascular phenotypes (Fig. 6). Curiously, only in mice is matrix metallopeptidase 9 (MMP9) implicated in these three phenotypes (brain, muscle, and cardiovascular), as well as in immune system and proliferation phenotypes (Fig. 6). Taken together, our graphical networks build a framework to predict unassessed phenotypes and their potential underlying molecular interactions.
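The idea of using one species to suggest unassessed phenotypes in another can be expressed as a set difference over gene-phenotype pairs. The Python sketch below follows the agrin example from the text (using the gene symbol AGRN); the data structure and function name are our assumptions, and a candidate pair is only a hypothesis to be tested, not evidence of an association.

```python
# Toy observations per species: sets of (human-orthologue gene, phenotype group).
observed = {
    "human": {("AGRN", "sterility")},
    "mouse": {
        ("AGRN", "sterility"),
        ("AGRN", "altered body size"),
        ("AGRN", "cardiovascular"),
    },
}

def candidate_phenotypes(observed, target):
    """Pairs seen in other species but not recorded in the target species.

    Returns a dict mapping each candidate pair to the set of supporting species.
    """
    known = observed.get(target, set())
    candidates = {}
    for species, pairs in observed.items():
        if species == target:
            continue
        for pair in pairs - known:
            candidates.setdefault(pair, set()).add(species)
    return candidates

human_candidates = candidate_phenotypes(observed, "human")
```

Recording the supporting species for each candidate makes it possible to prioritize pairs observed in several organisms over those seen in only one.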


Figure Legends

Fig. 1: The contribution of the matrisome to the phenome across species

The relationship between the number of genes associated with at least one phenotype and the total number of characterized phenotypes (phenome) linked to at least one gene is shown across species (A). The species with defined matrisomes are depicted in separate panels, with the number of genes associated with each phenotype shown on the y-axis and the 40 phenotypes with the highest number of total genes involved shown on the x-axis, for humans (B), mice (C), zebrafish (D), Drosophila (E), and C. elegans (F). The fraction of genes that are members of each species’ matrisome is highlighted in color. The right panel provides an overview of the gene-to-phenotype distribution, while the left panel highlights the contribution of the matrisome by cropping the y-axis. The fraction of matrisome genes associated with the phenotypes compared to all genes is shown as a percentage.

Supplementary Fig. 1: The phenome and number of genes associated with phenotypes

(A) The phenome across species plotted as number of phenotypes. (B) The number of unique genes associated with phenotypes across species.

Fig. 2: The matrisome-phenome across species

Using the matrisome composition for each species, we show the associated phenome space for humans (A), mice (B), zebrafish (C), Drosophila (D), and C. elegans (E). Each species is depicted in a separate panel with the 30 largest phenotype groups on the y-axis and the number of unique gene/phenotype associations contained in each group on the x-axis. Subsets of phenotypes and analogous phenotypes across species are color-coded to ease the comparison (e.g., “altered body size” in purple). For illustration, we connected the “altered body size” phenotype with a purple line. The phenotype groups carry slightly different meanings across species due to different morphologies; for example, the C. elegans eye phenotype refers to anything related to phototaxis (e.g., UV-light sensing), while the limbs-related phenotype captures phenotypes related to the animal’s tail.


[Supplementary Fig. 2 plot. Panels: A Homo sapiens, B Mus musculus, C Danio rerio, D Drosophila melanogaster, E Caenorhabditis elegans. y-axis: the 30 largest ungrouped phenotypes per species; x-axis: number of unique gene/phenotype associations.]

Supplementary Fig. 2: Ungrouped phenotype space across species

Using the matrisome composition for each species and the ungrouped phenotypes, the associated phenome space is shown for humans (A), mice (B), zebrafish (C), Drosophila (D), and C. elegans (E). Each species is depicted in a separate panel, with the 30 largest phenotype/gene association groups on the y-axis and the number of unique gene/phenotype associations contained in each group on the x-axis.
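The counting described in this legend can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `top_phenotype_groups`, the `(gene, phenotype)` tuple layout, and the toy records are all assumptions made for the example.

```python
from collections import Counter

def top_phenotype_groups(associations, n=30):
    """Return the n phenotypes with the most unique gene associations.

    `associations` is an iterable of (gene, phenotype) tuples
    (hypothetical input layout, not taken from the paper).
    """
    # Deduplicate so that each gene/phenotype pair is counted once.
    unique_pairs = set(associations)
    # Count how many distinct genes are linked to each phenotype.
    counts = Counter(phenotype for _, phenotype in unique_pairs)
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

# Toy records with made-up gene names; the duplicate pair is counted once.
toy = [
    ("geneA", "dumpy"),
    ("geneA", "dumpy"),
    ("geneB", "dumpy"),
    ("geneB", "molt defect"),
]
print(top_phenotype_groups(toy))
```

Deduplicating before counting is what makes the x-axis value a count of unique associations rather than a count of database records.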


[Supplementary Fig. 3, panels A to L: A Caenorhabditis elegans, B Canis lupus familiaris, C Danio rerio, D Drosophila melanogaster, E Homo sapiens, F Mus musculus, G Mus musculus domesticus, H Rattus norvegicus, I Saccharomyces cerevisiae S288C, J Felis catus, K Bos taurus, L Ovis aries. Each panel is a horizontal bar chart; y-axis: Phenotype; x-axis: # occurrences; colour scale: count.]

Supplementary Fig. 3: The whole matrisome genome-phenome across species (grouped)

Gene/phenotype associations are shown for each of the 44 species. Within each species, the number of unique gene-to-phenotype associations in every phenotype group is shown, arranged in descending order.
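A grouped count of this kind could be computed roughly as shown below. The sketch rests on stated assumptions: the lookup table `phenotype_to_group`, the function name, and the toy records are hypothetical, and leaving unmapped phenotypes under their own name is a guess based on the individual terms that appear alongside the groups in the panels.

```python
from collections import Counter

def grouped_association_counts(associations, phenotype_to_group):
    """Count unique gene/phenotype associations per phenotype group.

    `associations` is an iterable of (gene, phenotype) tuples and
    `phenotype_to_group` maps a phenotype term to its group label
    (both are hypothetical structures for illustration).
    """
    # Each distinct gene/phenotype pair contributes once to its group.
    unique_pairs = set(associations)
    # Assumption: phenotypes without a group keep their own name.
    counts = Counter(
        phenotype_to_group.get(phenotype, phenotype)
        for _, phenotype in unique_pairs
    )
    # Descending order, as in the figure; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy data with made-up gene names and an illustrative grouping.
groups = {"tremor": "Motor phenotype", "ataxia": "Motor phenotype"}
records = [
    ("geneA", "tremor"),
    ("geneA", "ataxia"),
    ("geneB", "ataxia"),
    ("geneB", "ataxia"),
    ("geneC", "blue irides"),
]
print(grouped_association_counts(records, groups))
```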


[Figure panels A Bos taurus, B Caenorhabditis elegans, C Canis lupus familiaris: per-species horizontal bar charts of individual phenotype terms; y-axis: Phenotype; colour scale: count.]
cartilage morphology mRNA surveillance defective abnormal atrioventricular cushion morphology Abnormal aortic morphology nonsense mRNA accumulation Abnormal appendicular skeleton morphology abnormal aorta elastic tissue morphology cord commissures fail to reach target abnormal amino acid level Abdominal distention Abnormal mitochondrial morphology Abnormal activity of mitochondrial respiratory chain 0 1 2 3 4 5 0 1000 2000 3000 0 5 10 # occurances # occurances # occurances D Danio rerio E Drosophila melanogaster F Felis catus

abnormal(ly) decreased size eye Uncharacterized phenotype Gait disturbance abnormal(ly) edematous pericardium lethal Failure to thrive abnormal(ly) decreased size head viable Microtia abnormal(ly) decreased length whole organism visible eye opacity abnormal(ly) viability whole organism fertile Broad face abnormal(ly) lethal (sensu genetics) whole organism Drosophila wing phenotype weakness abnormal(ly) dead whole organism neuroanatomy defective Tremor abnormal(ly) edematous heart female sterile thick skin abnormal(ly) quality nervous system partially lethal − majority die Short nose abnormal(ly) decreased size whole organism male sterile increased creatine kinase activity abnormal(ly) quality heart lethal − all die before end of larval stage Decreased cervical spine mobility abnormal(ly) hypoplastic gut some die during embryonic stage decreased body size abnormal(ly) disrupted blood circulation lethal − all die before end of embryonic stage Anemia abnormal(ly) necrotic central nervous system lethal − all die before end of pupal stage Abnormality of the mouth abnormal(ly) disrupted heart looping neurophysiology defective abnormal protein level abnormal(ly) quality pigment cell eye phenotype Abnormal facial shape abnormal(ly) quality gut some die during larval stage abdomen swellings abnormal(ly) hypoplastic liver some die during pupal stage skeleton phenotype abnormal(ly) curved post−vent region locomotor behavior defective Skeletal muscle hypertrophy heart development phenotype partially lethal Retinal degeneration abnormal(ly) quality liver meiotic cell cycle defective Renal insufficiency abnormal(ly) increased occurrence apoptotic process short lived Renal cyst abnormal(ly) decreased rate heart contraction developmental rate defective reflex phenotype abnormal(ly) quality brain Drosophila ommatidium phenotype reduced foot pad pigmentation developmental pigmentation phenotype mitotic cell cycle defective Pulmonary edema brain development phenotype Drosophila 
embryonic/first instar larval cuticle phenotype Prominent forehead abnormal(ly) decreased length post−vent region Drosophila macrochaeta phenotype premature death cilium morphogenesis phenotype lethal − all die before end of P−stage paresis abnormal(ly) decreased amount nucleate erythrocyte wing vein phenotype Paralysis liver development phenotype Drosophila oocyte phenotype Osteopenia abnormal(ly) shape whole organism NMJ bouton phenotype obsolete neuromuscular junction phenotype abnormal(ly) decreased life span whole organism partially lethal − majority live muscular atrophy abnormal(ly) decreased length whole organism anterior−posterior axis egg chamber phenotype Mitral regurgitation abnormal(ly) cystic pronephros male fertile Metabolic acidosis chordate embryonic development phenotype Drosophila embryo phenotype long tongue abnormal(ly) morphology brain Drosophila egg phenotype Lethargy abnormal(ly) necrotic brain cytokinesis defective Left ventricular hypertrophy abnormal(ly) hydrocephalic brain some die during third instar larval stage Jaundice abnormal(ly) quality sensory system some die during pharate adult stage Increased inflammatory response abnormal(ly) curved ventral post−vent region leg phenotype Increased head circumference heart looping phenotype female fertile Increased blood urea nitrogen angiogenesis phenotype Drosophila imaginal disc phenotype increased aggression abnormal(ly) decreased thickness extension Drosophila embryonic/larval neuromuscular junction phenotype Impaired proprioception abnormal(ly) decreased size liver small body impaired limb coordination abnormal(ly) decreased size brain lethal − all die before end of first instar larval stage impaired balance abnormal(ly) decreased thickness trunk sterile Hypothyroidism abnormal(ly) quality pharyngeal arch 3−7 skeleton count size defective count Hyperreflexia count abnormal(ly) morphology heart ovary phenotype 20000 Hyperphosphatemia 4 determination of left/right symmetry phenotype nurse 
cell phenotype Hyperactivity 600 15000 abnormal(ly) curved trunk some die during first instar larval stage Hydrocephalus 3 abnormal(ly) disrupted determination of left/right symmetry 400 dorsal appendage phenotype 10000 Hip dysplasia abnormal(ly) disrupted thigmotaxis chemical sensitive Hip dislocation

Phenotype Phenotype Phenotype 2 abnormal(ly) morphology somite 200 lethal − all die before end of second instar larval stage 5000 hindlimb paralysis embryonic viscerocranium morphogenesis phenotype increased cell death Hepatic cysts 1 retina development in camera−type eye phenotype some die during second instar larval stage Hearing impairment cartilage development phenotype embryonic/larval salivary gland phenotype Global developmental delay abnormal(ly) circular yolk decreased cell number Generalized tonic−clonic seizures abnormal(ly) disrupted embryo development Drosophila embryonic head phenotype Flat face abnormal(ly) present in fewer numbers in organism melanocyte embryonic/larval neuroblast phenotype Epiphyseal dysplasia abnormal(ly) quality hindbrain Drosophila pigment cell phenotype enlarged paws abnormal(ly) curved ventral whole organism immune response defective Dysostosis multiplex abnormal(ly) delayed embryo development Drosophila wing disc phenotype Difficulty walking hemopoiesis phenotype female semi−sterile Degeneration of anterior horn cells heart contraction phenotype paralytic decreased UDP−N−acetylglucosamine−1−phosphotransferase level digestive tract development phenotype Drosophila melanotic mass phenotype decreased N−acetylgalactosamine−4−sulfatase protein level abnormal(ly) morphology whole organism Drosophila spermatid phenotype decreased circulating factor XII level abnormal(ly) disrupted retina layer formation Drosophila testis phenotype darkened coat color abnormal(ly) U−shaped somite planar polarity defective Corneal opacity pronephros development phenotype wing margin phenotype Congestive heart failure abnormal(ly) quality eye embryonic/larval brain phenotype Cataract abnormal(ly) curved ventral trunk courtship behavior defective bone cell phenotype abnormal(ly) bent post−vent region eye photoreceptor cell phenotype Blepharophimosis abnormal(ly) decreased size mandibular arch skeleton retina phenotype Biliary hyperplasia abnormal(ly) 
decreased length Kupffer's vesicle cilium denticle belt phenotype Behavioral abnormality abnormal(ly) increased width notochord neuromuscular junction phenotype behavior/neurological phenotype abnormal(ly) decreased pigmentation melanocyte eye color defective Ataxia ventricular system development phenotype Drosophila germline cell phenotype absent beta−galactosidase protein abnormal(ly) decreased pigmentation whole organism decreased occurrence of cell division Abnormality of the rib cage abnormal(ly) decreased length trunk uncoordinated Abnormality of the pinna fin regeneration phenotype interommatidial bristle phenotype Abnormality of the lumbar spine sprouting angiogenesis phenotype increased cell number Abnormality of skeletal morphology somitogenesis phenotype Drosophila intersegmental nerve phenotype Abnormality of dental color abnormal(ly) increased width somite male semi−sterile abnormal zygomatic bone morphology abnormal(ly) grey yolk locomotor rhythm defective Abnormal vestibulo−ocular reflex abnormal(ly) decreased size otic vesicle behavior defective abnormal urine osmolality abnormal(ly) disrupted angiogenesis chorion phenotype Abnormal urinary color vasculature development phenotype dendrite phenotype abnormal reflex dorsal/ventral pattern formation phenotype nucleus phenotype abnormal myocardial fiber morphology abnormal(ly) process quality heart looping centrosome phenotype Abnormal muscle morphology abnormal(ly) morphology intersegmental vessel Drosophila border follicle cell phenotype abnormal muscle electrophysiology abnormal(ly) shape somite flight defective abnormal lysosome physiology abnormal(ly) curved whole organism anterior−posterior axis spindle phenotype abnormal long bone epiphyseal plate morphology abnormal(ly) morphology eye Drosophila scutellar bristle phenotype Abnormal leukocyte morphology abnormal(ly) kinked post−vent region Drosophila embryonic/larval dorsal trunk phenotype abnormal kidney morphology abnormal(ly) disrupted 
determination of heart left/right asymmetry dorsocentral bristle phenotype Abnormal heart valve morphology abnormal(ly) necrotic whole organism Drosophila chaeta phenotype abnormal glycosaminoglycan level abnormal(ly) increased curvature post−vent region microchaeta phenotype abnormal foot pad morphology abnormal(ly) dorsalized whole organism wing margin bristle phenotype abnormal cervical vertebrae morphology abnormal(ly) decreased mobility whole organism Drosophila embryonic/larval somatic muscle phenotype abnormal cardiac muscle tissue morphology abnormal(ly) bent whole organism adult brain phenotype Abdominal distention 0 200 400 600 800 0 5000 10000 15000 20000 0 1 2 3 4 # occurances # occurances # occurances G Homo sapiens H Mus musculus I Mus musculus domesticus

Intellectual disability no abnormal phenotype detected decreased body size Global developmental delay Decreased body weight kinked tail Seizures preweaning lethality, complete penetrance Tremor Muscular hypotonia premature death Decreased body weight Short stature decreased body size Male infertility Generalized hypotonia nervous system phenotype Double outlet right ventricle Nystagmus immune system phenotype Gait disturbance Microcephaly behavior/neurological phenotype prenatal lethality, complete penetrance Failure to thrive Failure to thrive Microphthalmia Scoliosis homeostasis/metabolism phenotype Hearing impairment Strabismus reproductive system phenotype premature death Ataxia Male infertility diluted coat color Sensorineural hearing impairment postnatal lethality, incomplete penetrance belly spot Spasticity preweaning lethality, incomplete penetrance circling Hearing impairment embryonic lethality during organogenesis, complete penetrance Hyperactivity Cataract embryonic growth retardation Cleft palate Hyperreflexia Mortality/Aging short tail Hypertelorism neonatal lethality, complete penetrance Abnormality of hair pigmentation Ptosis Hyperactivity Micrognathia Micrognathia hypoactivity Ventricular septal defect Cryptorchidism increased or absent threshold for auditory brainstem response Short nose Optic atrophy decreased embryo size atrioventricular septal defect Visual impairment postnatal lethality, complete penetrance increased or absent threshold for auditory brainstem response High palate Behavioral abnormality Polydactyly Hepatomegaly Adipose tissue loss Syndactyly Dysarthria cardiovascular system phenotype Left ventricular hypertrophy Low−set ears decreased body length Right ventricular hypertrophy Cleft palate Cardiomegaly Ataxia Respiratory insufficiency decreased lean body mass embryonic lethality during organogenesis, complete penetrance Anteverted nares impaired coordination Cataract Growth delay Abnormal heart morphology Right aortic arch 
Muscle weakness Gait disturbance hearing/vestibular/ear phenotype Wide nasal bridge Osteopenia head bobbing Intrauterine growth retardation Splenomegaly decreased birth body size Depressed nasal bridge decreased grip strength Anophthalmia Ventriculomegaly embryonic lethality, complete penetrance abnormal craniofacial morphology Motor delay growth/size/body region phenotype decreased mean corpuscular volume Feeding difficulties decreased circulating glucose level curly vibrissae Epicanthus hematopoietic system phenotype preweaning lethality, complete penetrance Feeding difficulties in infancy decreased vertical activity curly tail Anemia Female infertility Sparse hair Skeletal muscle atrophy Decreased testicular size limb grasping Downslanted palpebral fissures obsolete Glucose intolerance Increased HDL cholesterol concentration Delayed speech and language development decreased bone mineral content Hydronephrosis Splenomegaly Abnormal bone structure Curly hair Gait disturbance prenatal lethality, incomplete penetrance Increased body weight Recurrent respiratory infections count Hypoinsulinemia count postnatal lethality, complete penetrance count Frontal bossing decreased litter size 2000 perinatal lethality, complete penetrance 200 Macrocephaly 1000 prenatal lethality, complete penetrance Heterotaxy Cognitive impairment neonatal lethality, incomplete penetrance 1500 postnatal lethality, incomplete penetrance 150 Abnormality of cardiovascular system morphology 750 increased lean body mass postnatal lethality Brachydactyly Tremor 1000 head tossing 100

Phenotype 500 Phenotype Phenotype Talipes equinovarus Abnormality of cell physiology Dextrocardia 500 50 Hyporeflexia 250 Increased body weight Decreased HDL cholesterol concentration Ventricular septal defect Abnormal bleeding impaired balance Delayed skeletal maturation B lymphocytopenia Elevated tissue non−specific alkaline phosphatase Cerebellar atrophy abnormal kidney morphology absent startle reflex Short neck reduced female fertility abnormal tail morphology Tremor improved glucose tolerance increased percent body fat/body weight Dystonia Ataxia Anisocytosis Myopia increased total body fat amount reproductive system phenotype Hypoplasia of the corpus callosum Edema Obesity Glaucoma vision/eye phenotype Muscular ventricular septal defect Hydrocephalus abnormal embryo size Exencephaly Dysphagia perinatal lethality, incomplete penetrance Vascular ring Hypoplasia of penis Anemia Hypocholesterolemia Hypertension reduced male fertility embryonic lethality between implantation and somite formation, complete penetrance Obesity decreased testis weight domed cranium Abnormality of the dentition Abnormality of the liver decreased survivor rate Long philtrum perinatal lethality, complete penetrance Abnormal aortic arch morphology Flexion contracture embryonic lethality during organogenesis, incomplete penetrance Oligospermia Clinodactyly of the 5th finger embryonic lethality between implantation and somite formation, complete penetrance Increased blood urea nitrogen Atrial septal defect Hypotriglyceridemia Hypoplasia of the thymus Short nose Abnormality of the spleen Hypercholesterolemia Kyphosis Weight loss Abnormal eye morphology Conductive hearing impairment Hypocholesterolemia abnormal coat appearance Abnormality of retinal pigmentation embryonic growth arrest prenatal lethality, incomplete penetrance Intellectual disability, severe limb grasping Overriding aorta Cerebral atrophy embryonic lethality prior to organogenesis Osteopenia Renal insufficiency Elevated 
tissue non−specific alkaline phosphatase increased foot pad pigmentation Hypospadias Abnormal B cell morphology impaired limb coordination Microphthalmia skeleton phenotype Hypotrichosis Developmental regression increased heart weight Hypophosphatemia Hypertrophic cardiomyopathy Abnormal glucose homeostasis Hyperglycemia Abnormality of metabolism/homeostasis Decrease in T cell count Female infertility Peripheral neuropathy Recurrent bacterial infections Edema Fatigue Hepatic steatosis decreased mean corpuscular hemoglobin Blindness Oligospermia Abnormality of digit Hypogonadism lethality throughout fetal growth and development, complete penetrance Renal cyst EEG abnormality Hypercholesterolemia prenatal lethality Thrombocytopenia Decreased proportion of CD8−positive T cells increased systemic arterial blood pressure Cerebral cortical atrophy embryonic lethality prior to tooth bud stage Globozoospermia Photophobia decreased CD4−positive, alpha beta T cell number double outlet right ventricle with atrioventricular septal defect Pes cavus Cyanosis Corneal opacity Ophthalmoplegia Retinal dysplasia Abnormality of hair growth Elevated serum creatine kinase decreased cell proliferation decreased litter size Agenesis of corpus callosum Microphthalmia asthenozoospermia Babinski sign Leukopenia Abnormality of the hair Myopathy Pallor Abnormal spermatogenesis Umbilical hernia cellular phenotype Abnormal auditory evoked potentials 0 250 500 750 1000 1250 0 500 1000 1500 2000 0 50 100 150 200 # occurances # occurances # occurances J Rattus norvegicus K Saccharomyces cerevisiae S288C L Ovis aries

cardiac hypertrophy resistance to chemicals:decreased increased litter size salt−sensitive hypertension viable Decreased body weight competitive fitness:decreased Seizures Hypercholesterolemia resistance to chemicals:increased decreased susceptibility to hypertension vegetative growth:decreased rate premature death hypoactivity vacuolar morphology:abnormal Hypertriglyceridemia competitive fitness:increased abnormal ovulation Decrease in T cell count haploinsufficient abnormal circadian regulation of systemic arterial blood pressure heat sensitivity:increased weakness premature death chemical compound accumulation:increased Increased body weight chemical compound accumulation:abnormal Visual impairment decreased urine albumin level chemical compound accumulation:decreased Proteinuria inviable Tetraparesis Neurodegeneration vegetative growth:decreased increased systemic arterial blood pressure utilization of nitrogen source:decreased rate subcutaneous edema increased neutrophil cell number chronological lifespan:decreased increased heart weight desiccation resistance:decreased spongiform encephalopathy Increased blood urea nitrogen haploproficient Impaired proprioception starvation resistance:decreased small horns Hepatic fibrosis oxidative stress resistance:decreased Glomerulosclerosis toxin resistance:decreased Scoliosis Gait disturbance protein/peptide distribution:abnormal embryonic lethality chromosome/plasmid maintenance:decreased protein catabolic process phenotype decreased erythrocyte cell number cell death:decreased vascular smooth muscle hypertrophy protein/peptide accumulation:increased Prominent eyelashes Tubulointerstitial fibrosis metal resistance:decreased Renal cyst respiratory growth:decreased rate Paraparesis Reduced natural killer cell count auxotrophy pathological neovascularization respiratory growth:absent Myotonia oxidative stress invasive growth:increased obsolete Glucose intolerance colony sectoring:increased long limbs Monocytopenia innate 
thermotolerance:decreased Lymphopenia toxin resistance:increased limb joint phenotype Lymphocytosis chemical compound excretion:increased loss of dopaminergic neurons utilization of carbon source:decreased rate large horns Leukopenia stress resistance:increased Left ventricular hypertrophy chromosome/plasmid maintenance:abnormal Infertility increased serotonin level protein/peptide accumulation:decreased increased prepulse inhibition innate thermotolerance:increased increased skeletal muscle size increased liver triglyceride level endocytosis:decreased increased insulin secretion stress resistance:decreased increased skeletal muscle mass increased grooming behavior resistance to chemicals:normal increased gnawing activity hyperosmotic stress resistance:decreased Impaired proprioception increased dopamine level acid pH resistance:decreased impaired balance increased creatine kinase level replicative lifespan:increased Increased circulating free fatty acid level chronological lifespan:increased Ganglioside accumulation increased circulating aspartate transaminase level count oxidative stress resistance:increased count count increased circulating alanine transaminase level protein/peptide modification:decreased 4000 4 Gait disturbance increased cell proliferation 7.5 utilization of carbon source:decreased increased alcohol consumption mitochondrial morphology:abnormal 3000 3 Fragile skin Hypoplastic spleen UV resistance:decreased 5.0 2000 Hypoplasia of the thymus cell size:increased

Phenotype Phenotype Phenotype Failure to thrive 2 Hypertension 2.5 anaerobic growth:decreased 1000 Hyperinsulinemia respiratory growth:decreased extended life span 1 enlarged alveolar lamellar bodies sporulation efficiency:decreased Elevated tissue non−specific alkaline phosphatase RNA accumulation:decreased Exercise intolerance Elevated serum creatinine transposable element transposition:decreased Elevated circulating luteinizing hormone level killer toxin resistance:increased Distichiasis decreased vertical activity invasive growth:absent decreased urine protein level protein activity:increased Diminished movement decreased testis weight filamentous growth:decreased Decreased testicular size killer toxin resistance:decreased decreased glycogen catabolism rate Decreased serum leptin biofilm formation:decreased decreased metastatic potential sporulation:decreased darkened coat color decreased mean systemic arterial blood pressure resistance to enzymatic treatment:decreased decreased litter size fermentative growth:decreased rate Cognitive impairment decreased hematocrit RNA accumulation:increased decreased eosinophil cell number cell size:decreased Brain atrophy decreased circulating glucose level bud morphology:abnormal decreased brain tyrosine 3−monooxygenase activity cold sensitivity:increased Blindness decreased body length colony appearance:abnormal B lymphocytopenia telomere length:decreased Ataxia Albuminuria filamentous growth:increased Adipocyte hypertrophy cell cycle progression:abnormal absent horns abnormal vasodilation protein/peptide modification:increased abnormal spatial working memory sporulation:absent absent beta−galactosidase protein abnormal social play behavior metal resistance:increased abnormal response to transplant lipid particle morphology:abnormal Abnormality of the ovary Abnormal muscle tone alkaline pH resistance:decreased abnormal locomotor behavior sporulation:normal Abnormality of skin morphology abnormal 
learning/memory/conditioning mutation frequency:increased abnormal estrous cycle competitive fitness:normal Abnormality of central motor function abnormal depression−related behavior biofilm formation:increased abnormal blood−brain barrier function cell shape:abnormal abnormal translation abnormal exorbital lacrimal gland morphology transposable element transposition:increased abnormal dopamine level cellular morphology:abnormal abnormal spike wave discharge abnormal dendrite morphology ionic stress resistance:decreased abnormal craniofacial development growth in exponential phase:decreased rate abnormal sleep behavior abnormal craniofacial bone morphology nuclear morphology:abnormal abnormal pruritus abnormal conditioned taste aversion behavior cell cycle progression in G1 phase:increased duration abnormal conditioned emotional response replicative lifespan:decreased abnormal lipid level Abnormal circulating protein level vegetative growth:abnormal abnormal circulating free fatty acids level budding pattern:abnormal abnormal lipid homeostasis abnormal chloride level telomere length:increased abnormal cellular respiration mitochondrial genome maintenance:abnormal abnormal fertility/fecundity abnormal cardiac muscle relaxation viability:decreased abnormal brain development utilization of carbon source:absent Abnormal external genitalia abnormal body size respiratory growth:increased rate abnormal apoptosis actin cytoskeleton morphology:abnormal abnormal enzyme/coenzyme level abnormal amino acid level resistance to enzymatic treatment:increased 0.0 2.5 5.0 7.5 0 1000 2000 3000 4000 0 1 2 3 4 295 # occurances # occurances # occurances

Supplementary Fig. 4: The whole matrisome genome-phenome across species (ungrouped)

All gene-phenotype associations are shown for each of the 44 species. For every phenotype, the number of unique gene-to-phenotype associations is shown, arranged in descending order and without assembling phenotypes into groups.


Fig. 3: The degree of phenotypic impact of matrisome genes.

The number of unique phenotypes linked to each individual matrisome gene is shown in decreasing order for the top 50 genes. The gene name is displayed on the x-axis and the number of unique phenotypes on the y-axis. The color of each bar indicates the matrisome category of the gene.
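The ranking described in this legend can be illustrated with a minimal sketch. It assumes the associations are available as (gene, phenotype) pairs and that a gene-to-category lookup exists. The function name, the gene and phenotype labels, and the toy records are hypothetical placeholders, not the data or code used for the figure.

```python
from collections import defaultdict


def top_genes_by_phenotype_count(records, categories, top_n=50):
    """Rank genes by their number of unique phenotypes (hypothetical helper).

    records: iterable of (gene, phenotype) pairs; duplicate pairs count once.
    categories: dict mapping gene -> matrisome category (used for bar color).
    Returns a list of (gene, n_unique_phenotypes, category), in decreasing order.
    """
    phenotypes_by_gene = defaultdict(set)
    for gene, phenotype in records:
        phenotypes_by_gene[gene].add(phenotype)

    # Decreasing by count; ties are broken by gene name for reproducibility.
    ranked = sorted(
        phenotypes_by_gene.items(),
        key=lambda item: (-len(item[1]), item[0]),
    )
    return [
        (gene, len(phenotypes), categories.get(gene, "unknown"))
        for gene, phenotypes in ranked[:top_n]
    ]


# Toy input with placeholder names (illustration only).
toy_records = [
    ("geneA", "phenotype 1"),
    ("geneA", "phenotype 2"),
    ("geneA", "phenotype 2"),  # duplicate association, counted once
    ("geneB", "phenotype 1"),
    ("geneC", "phenotype 1"),
    ("geneC", "phenotype 2"),
    ("geneC", "phenotype 3"),
]
toy_categories = {"geneA": "Collagens", "geneC": "Secreted Factors"}

top_two = top_genes_by_phenotype_count(toy_records, toy_categories, top_n=2)
```

Sets are used so that repeated records of the same gene-phenotype pair do not inflate the count, which matches the legend's wording of "unique phenotypes".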


[Figure: Supplementary Fig. 5. Bar charts, one panel per species: A Homo sapiens, B Mus musculus, C Danio rerio, D Drosophila melanogaster, E Caenorhabditis elegans. x-axis: gene name; y-axis: "# distinct phenotypes this gene is involved in". Legend (category): Collagens, ECM Glycoproteins, ECM Regulators, ECM-affiliated Proteins, Proteoglycans, Secreted Factors.]

Supplementary Fig. 5: Degree of phenotypic impact of matrisome genes (grouped)

The number of phenotypes associated with each matrisome gene is shown in decreasing order for the top 50 genes. This graph is complementary to Fig. 3 and shows the impact of phenotypic grouping on the number of phenotypes. The gene name is displayed on the x-axis and the number of unique phenotypes on the y-axis. The color of each bar indicates the matrisome category of the gene.


Fig. 4: Phenotypic fingerprint of the most conserved matrisome members

The matrisome genes associated with phenotypes in the largest number of species are shown in decreasing order (A). The table displays the human gene name, the number of species presenting a phenotype associated with the matrisome gene, the overall phenotypic signature observed for this gene, and the matrisome category and division. Circular dendrograms depict the five most conserved matrisome genes in more detail (B-F). The human orthologue is shown in the middle and is connected to each species in which the gene was associated with a phenotype; each species is in turn connected to terminal nodes depicting the grouped phenotypes. Phenotypes that were observed in more than one species are highlighted in red, and the size of each terminal node reflects the number of species in which the phenotype was observed. At the bottom right of each circular dendrogram panel, a graphical summary highlights the species in which, by homology, a phenotype group was observed (dark grey) or absent (light grey).
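The node attributes described in this legend (number of species per grouped phenotype, and whether a phenotype is shared across species) can be sketched as follows. This is a minimal illustration under the assumption that, for one human gene, the grouped phenotypes of its orthologues are available per species. The function name and the species and group labels are hypothetical placeholders, not the authors' implementation.

```python
from collections import defaultdict


def phenotype_fingerprint(groups_by_species):
    """Summarize grouped phenotypes of one gene across species (hypothetical helper).

    groups_by_species: dict mapping species -> iterable of grouped phenotypes
    observed for the orthologue of the gene in that species.
    Returns {group: {"n_species": int, "shared": bool}}, where "shared" marks
    groups observed in more than one species (the red nodes in the legend) and
    "n_species" corresponds to the terminal node size.
    """
    species_by_group = defaultdict(set)
    for species, groups in groups_by_species.items():
        for group in groups:
            species_by_group[group].add(species)

    return {
        group: {"n_species": len(species), "shared": len(species) > 1}
        for group, species in species_by_group.items()
    }


# Toy input with placeholder labels (illustration only).
toy_gene = {
    "species 1": ["group X", "group Y"],
    "species 2": ["group X"],
    "species 3": ["group Z", "group X"],
}

fingerprint = phenotype_fingerprint(toy_gene)
shared_groups = sorted(g for g, info in fingerprint.items() if info["shared"])
conservation = len(toy_gene)  # number of species with any phenotype for this gene
```

Here `conservation` plays the role of the species count used to rank genes in panel A, under the assumption that every species in the input has at least one recorded phenotype.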


[Supplementary Fig. 6 image: six circular dendrograms (panels A-F) of grouped phenotypes, centered on the human orthologues MSTN, CTSD, LAMB2, PLOD3, WNT4, and ADAM17. Species nodes include Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Canis lupus familiaris, Danio rerio, Drosophila melanogaster, and Caenorhabditis elegans.]

Supplementary Fig. 6: Phenotypic fingerprint of the most conserved matrisome members

Circular dendrograms depicting the cross-species phenotypic fingerprint of all human matrisome genes for which at least one phenotype has been studied in at least one other species. The human orthologue is shown in the center, connected to the species in which the gene was associated with a phenotype; each species is in turn connected to terminal nodes depicting the grouped phenotypes. Phenotypes that were observed in more than one species are highlighted in red, and the size of the terminal node reflects the number of species in which the phenotype was observed.


[Supplementary Fig. 7 image: six circular dendrograms (panels A-F) of ungrouped phenotypes, centered on the human orthologues MSTN, CTSD, LAMB2, PLOD3, WNT4, and ADAM17. Species nodes include Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Canis lupus familiaris, Danio rerio, Drosophila melanogaster, and Caenorhabditis elegans.]

Supplementary Fig. 7: Phenotypic fingerprint of the most conserved matrisome members (ungrouped)

Circular dendrograms depicting the cross-species phenotypic fingerprint of all human matrisome genes for which at least one phenotype has been studied in at least one other species. This figure is complementary to Supplementary Fig. 6 but shows the phenotypic fingerprint for ungrouped phenotypes.


Fig. 5: Phenotypic implications of the structural matrisome categories on vertebrates

The matrisome is grouped into multiple categories according to the localization and composition of the extracellular proteins. The ECM glycoproteins, collagens, and proteoglycans constitute the foundation of the matrisome and are responsible for the structural integrity of the extracellular matrix. Here, the most abundant inter-species gene-to-phenotype associations are highlighted for H. sapiens, M. musculus, and D. rerio. Genes are shown as human orthologues in colored labels, while phenotype groups are given as plain text. Arrows connect a gene to a phenotype group if the gene has been associated with at least one phenotype belonging to that group. The width and opacity of each connection reflect the degree of conservation of the gene-to-phenotype association across species.


Fig. 6: Phenotypic implications of the entire matrisome

A selected subset of the most connected genes of the entire matrisome and their phenotypes across species is displayed. Genes are shown as human orthologues color-coded by matrisome division, while phenotype groups are shown as plain text. Arrows connect a gene to a phenotype group when the gene is associated with at least one phenotype of that group. The width and opacity of each connection reflect the degree of conservation of the gene-to-phenotype association across species.
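The arrow encoding described in this caption can be illustrated with a short Python sketch. It assumes that the "degree of conservation" of a gene-to-phenotype-group link is measured as the number of distinct species in which the link was recorded, and that width and opacity scale linearly with that count. Both choices, like the function names and the toy records with placeholder genes, are assumptions made for illustration and not the authors' implementation.

```python
from collections import defaultdict

# Invented toy records (NOT data from the study):
# (human orthologue, species, phenotype group)
toy_links = [
    ("GENE_A", "Homo sapiens", "Morphology altered"),
    ("GENE_A", "Homo sapiens", "Morphology altered"),  # second phenotype, same group
    ("GENE_A", "Mus musculus", "Morphology altered"),
    ("GENE_A", "Danio rerio", "Morphology altered"),
    ("GENE_A", "Homo sapiens", "Blood phenotype"),
    ("GENE_B", "Mus musculus", "Blood phenotype"),
]


def edge_conservation(records):
    """Count distinct species per (gene, phenotype group) edge.

    An edge exists as soon as one phenotype of the group is recorded;
    further phenotypes of the same group in the same species add nothing.
    """
    species = defaultdict(set)
    for gene, sp, group in records:
        species[(gene, group)].add(sp)
    return {edge: len(sps) for edge, sps in species.items()}


def edge_style(conservation, max_width=5.0):
    """Map species counts to arrow width and opacity (assumed linear scaling)."""
    if not conservation:
        return {}
    top = max(conservation.values())
    return {
        edge: {"width": max_width * n / top, "opacity": n / top}
        for edge, n in conservation.items()
    }
```

With the toy records, the `GENE_A` to "Morphology altered" edge is seen in three species and therefore receives the maximum width and full opacity, while the two "Blood phenotype" edges, each seen in one species, are drawn at one third of both.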


[Supplementary Fig. 8 image: gene-to-phenotype-group networks for six matrisome categories (ECM Glycoproteins, ECM Regulators, ECM-affiliated Proteins, Secreted Factors, Collagens, and Proteoglycans). Each category is drawn once per species; depending on the category, the species panels are Homo sapiens, Mus musculus, Bos taurus, Danio rerio, Drosophila melanogaster, and Caenorhabditis elegans.]

Supplementary Fig. 8: Phenotypic implications of the matrisome categories across species

For each matrisome category, the five genes and phenotypes with the highest connectivity across species are highlighted for each taxon. Grouped phenotypes are shown as plain text, while genes are shown as colored labels that always refer to their human orthologues. A gene is connected to a phenotype group with an arrow if it has been associated with at least one phenotype belonging to that group. The degree of conservation of the gene-to-phenotype association is reflected by the width and opacity of the connecting arrow.
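The selection of the five most connected genes and phenotype groups per matrisome category can be sketched as follows. The caption does not define "connectivity"; this sketch assumes it is the number of distinct gene, species, and phenotype-group associations in which a gene (or a phenotype group) takes part. That definition, the function name, and the toy records with placeholder genes are hypothetical.

```python
from collections import Counter, defaultdict

# Invented toy records (NOT data from the study):
# (matrisome category, human orthologue, species, phenotype group)
toy_associations = [
    ("Collagens", "GENE_A", "Homo sapiens", "Bone phenotype"),
    ("Collagens", "GENE_A", "Mus musculus", "Bone phenotype"),
    ("Collagens", "GENE_A", "Danio rerio", "Eye phenotype"),
    ("Collagens", "GENE_B", "Homo sapiens", "Eye phenotype"),
    ("Collagens", "GENE_C", "Homo sapiens", "Bone phenotype"),
    ("Collagens", "GENE_C", "Mus musculus", "Blood phenotype"),
    ("Proteoglycans", "GENE_D", "Homo sapiens", "Eye phenotype"),
]


def top_connected(records, k=5):
    """Return the k most connected genes and phenotype groups per category."""
    gene_links = defaultdict(Counter)
    group_links = defaultdict(Counter)
    # Deduplicate so that a repeated association is counted only once.
    for category, gene, _species, group in set(records):
        gene_links[category][gene] += 1
        group_links[category][group] += 1

    def top(counter):
        # Decreasing connectivity; ties broken alphabetically.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _count in ranked[:k]]

    return {
        category: {
            "genes": top(gene_links[category]),
            "phenotype_groups": top(group_links[category]),
        }
        for category in gene_links
    }
```

For the figure itself `k` would be 5; with the small toy set, `top_connected(toy_associations, k=2)` is enough to show the ranking.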


[Supplementary Fig. 9 image. Panels: Core matrisome (Homo sapiens, Mus musculus, Bos taurus, Danio rerio, Drosophila melanogaster, Caenorhabditis elegans) and Matrisome-associated (Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio, Drosophila melanogaster).

Core matrisome gene labels: COL4A1, LAMC1, COL5A1, COL3A1, HSPG2, COL1A1, LAMA2, COL2A1, FBN1, COL1A2. Phenotype groups: Blood phenotype, Eye phenotype, Morphology altered, Muscle tissue phenotype, Developmental phenotype, Sterility / reproductive potential, Cardiovascular impairments, Altered body size, Connective tissue phenotype, Bone phenotype.

Matrisome-associated gene labels: LEP, TNF, TGFB1, BMP2, FGF8, SHH, WNT1, FGF10, BMP4, WNT5A. Phenotype groups: Immune system phenotype, Blood phenotype, Cellular level phenotype, Nervous and neural tissues affected, Sterility / reproductive potential, Morphology altered, Cardiovascular impairments, Developmental phenotype, Eye phenotype, Altered body size.]

Supplementary Fig. 9: Phenotypic implications of the matrisome divisions across species

For each matrisome division, the ten genes and phenotypes with the highest connectivity across species are highlighted for each taxon. Grouped phenotypes are shown as plain text, while genes are shown as colored labels that always refer to their human orthologues. If a gene has been associated with at least one phenotype belonging to a phenotype group, the gene and the group are connected by an arrow. The degree of conservation of the gene-to-phenotype association is reflected by the width and opacity of the connecting arrow.


Discussion

Recent collective scientific efforts focusing on deep phenotyping and phenome-wide association studies have revealed many novel genotype-to-phenotype relationships [2,3,29]. Extracellular matrices are important for cellular and organismal function [13,17]. However, the contribution of ECM gene variants to the phenotypic landscape is unknown.

Here, we defined the ECM phenome of humans, mice, zebrafish, Drosophila, and C. elegans, with extrapolations to three other species. We found over 42’558 matrisome genotype-to-phenotype relationships across the five species with a defined matrisome and an additional 306 genotype-to-phenotype relationships in the three species without a defined matrisome. We identified the phenotypic landscape of the matrisome that is conserved among species, as well as species-specific phenomes. Our analysis of the phenotypic fingerprints of genes revealed that the matrisome genes MSTN, CTSD, LAMB2, HSPG2, and COL11A2 bear the most associated phenotypes. From these genotype-to-phenotype interactions, we have built networks linking analogous phenotypes that might be driven by similar underlying molecular mechanisms involving matrisome genes. We provide the ECM phenome as a platform for using information gained from model organisms to suggest novel genotype-to-phenotype relationships for humans.
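One way to picture how phenotypes from different species could be linked through shared matrisome genes is the sketch below. It is an illustration only: the Jaccard index, the threshold, the placeholder phenotype names, and the gene sets are assumptions chosen for the example and are not taken from the study.

```python
from itertools import combinations

# Hypothetical toy input: phenotype label -> set of associated genes,
# each expressed as the human orthologue. Phenotype names are placeholders.
phenotype_genes = {
    "human:P1": {"FBN1", "COL1A1", "COL2A1"},
    "mouse:P2": {"FBN1", "COL2A1", "ACAN"},
    "zebrafish:P3": {"COL2A1", "LAMC1"},
    "worm:P4": {"HSPG2"},
}


def jaccard(a, b):
    """Jaccard index of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_phenotypes(phenotype_genes, threshold):
    """Return (phenotype, phenotype, score) for pairs whose gene overlap meets the threshold."""
    links = []
    for p, q in combinations(sorted(phenotype_genes), 2):
        score = jaccard(phenotype_genes[p], phenotype_genes[q])
        if score >= threshold:
            links.append((p, q, score))
    return links


links = link_phenotypes(phenotype_genes, threshold=0.4)
```

With these toy sets, only the human and mouse phenotypes share enough genes (two of four) to be linked; lowering the threshold would also connect the zebrafish phenotype.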

One limitation of our analysis is that our collection of phenotype categories is not fully complete: it includes the most frequently occurring phenotypes but omits some infrequent ones, because the large number of phenotypes and the sometimes jargon-specific phenotype naming limited our manual curation. We aimed to group phenotypes functionally so that they become more comparable across species, including species-specific phenotypes. We manually added 156 broadly defined phenotype categories, which captured 86.8 % of the original phenome. However, while the grouping achieves its primary goal of unifying the most frequent phenotypes, some of the less abundant phenotypes might be falsely grouped or remain ungrouped. Modifications of the grouping system improved the assignment of misclassified infrequent phenotypes, with little or no change in the top phenotypic categories, which adds reliability to the findings and interpretations of the present study. All analyses, grouped and ungrouped, are provided in the supplementary sections, allowing researchers to build upon them.
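The idea of assigning raw phenotype names to broad categories and measuring how much of the phenome is captured can be sketched as below. The keyword-matching rule, the three keywords, and the four phenotype names are assumptions made for illustration; the grouping in the study was curated manually, and only the category names are taken from the figures.

```python
# Hypothetical keyword -> category map; the category names appear in the
# figures, but the keywords and the matching rule are assumptions.
keyword_to_category = {
    "eye": "Eye phenotype",
    "bone": "Bone phenotype",
    "body size": "Altered body size",
}


def assign_category(phenotype, keyword_to_category):
    """Return the first category whose keyword occurs in the phenotype name, else None."""
    name = phenotype.lower()
    for keyword, category in keyword_to_category.items():
        if keyword in name:
            return category
    return None


def coverage(phenotypes, keyword_to_category):
    """Fraction of phenotype records that receive a category."""
    if not phenotypes:
        return 0.0
    grouped = sum(
        assign_category(p, keyword_to_category) is not None for p in phenotypes
    )
    return grouped / len(phenotypes)


# Toy phenotype names; "dumpy" stands in for a jargon-specific name that
# a simple keyword rule leaves ungrouped.
phenotypes = [
    "Abnormal eye morphology",
    "decreased bone density",
    "Reduced body size",
    "dumpy",
]
captured = coverage(phenotypes, keyword_to_category)
```

Here three of the four toy records are grouped, so `captured` is 0.75; the ungrouped jargon term illustrates why infrequent phenotypes can be missed.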

A strength of the present study is our network of matrisome genes mapped to recorded phenotypes, which offers a conceptual framework for further investigation. This phenome-based network allows the identification of conserved underlying molecular interactions of matrisome genes across comparable phenotypes and species. In particular, it enables queries on given human phenotypes or ECM genes, which return genotype-to-phenotype relationships across other species and may suggest novel molecular targets or read-outs for phenotypic high-throughput screens. For instance, early developmental or lethal genes that lead to spontaneous abortion in humans can be studied through their orthologues in other species [10]. We provide the most conserved subset of the matrisome-interactome dataset in graphical format, highlighting the genes and phenotypes with the highest degree of connectivity (Fig. 5 and 6; Supplementary Fig. 8 and 9). Such graphical interaction frameworks have been invaluable for all known human gene-associated phenotypes [10].
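A query of the kind described in this paragraph, starting from a human ECM gene and retrieving phenotypes recorded for its orthologues in other species, might look like the following sketch. The record layout, the placeholder phenotype names, and the function names `build_index` and `query_gene` are hypothetical and do not describe the authors' dataset or tooling.

```python
from collections import defaultdict

# Hypothetical toy records: (species, human orthologue, phenotype).
records = [
    ("Homo sapiens", "LAMC1", "phenotype_h1"),
    ("Danio rerio", "LAMC1", "phenotype_z1"),
    ("Danio rerio", "LAMC1", "phenotype_z2"),
    ("Caenorhabditis elegans", "LAMC1", "phenotype_c1"),
    ("Mus musculus", "FBN1", "phenotype_m1"),
]


def build_index(records):
    """Index records as gene -> species -> set of phenotypes."""
    by_gene = defaultdict(lambda: defaultdict(set))
    for species, gene, phenotype in records:
        by_gene[gene][species].add(phenotype)
    return by_gene


def query_gene(index, gene, exclude_species="Homo sapiens"):
    """Phenotypes recorded for a gene's orthologues in all species except exclude_species."""
    hits = index.get(gene, {})
    return {
        species: sorted(phenotypes)
        for species, phenotypes in hits.items()
        if species != exclude_species
    }


index = build_index(records)
model_organism_hits = query_gene(index, "LAMC1")
```

Keying every record by the human orthologue keeps the lookup a single dictionary access, at the cost of requiring an orthology mapping upstream.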

In our phenome analysis across species, we identified novel aging-related phenotypes associated with matrisome gene variants in the top categories (Fig. 2). Decline and remodeling of the matrisome is a major driver of aging and longevity [18,30-32]. Recently, the human aging phenome has been defined, revealing that phenotypes associated with collagens or the matrisome, such as facial wrinkles, kyphosis, arthritis, and osteoporosis, are among the most prevalent phenotypes recorded from 77 million elderly individuals [8]. Furthermore, connective tissue diseases, such as Marfan syndrome, which is caused by mutations in FBN1, and Ehlers-Danlos syndrome, clustered with features of premature aging diseases (progeria) and aging [8]. In our analysis, we found that the FBN1 gene has the most associated human phenotypes (Fig. 3 and Supplementary Fig. 5).

The breadth of the phenotypic fingerprints of some matrisome genes hints at the conserved modularity of gene systems. Some of these orthologous phenotypes across species suggest a repurposing of the underlying molecular systems through evolution. Our network of matrisome genes associated with these orthologous phenotypes facilitates studying human phenotypes, or even diseases, in nonobvious model systems. Thus, our computational analysis provides a systems-level approach to facilitate mechanistic discoveries underlying ECM phenomes across species.


Author contributions

All authors participated in analyzing and interpreting the data. CYE and CS designed the computational analysis and wrote the manuscript.


Author Information

The authors have no competing interests to declare. Correspondence should be addressed to C. Y. E.


Acknowledgement

We thank members of the Ewald lab for critical reading of the manuscript, Davide Vitiello for his help with the phenotypic grouping, and the Monarch Initiative (https://monarchinitiative.org) for providing the indispensable resources for the genotype-phenotype analysis performed in this work. This study was funded by the Swiss National Science Foundation (grant number PP00P3_163898 to CS and CYE).


References

[1] J.C. Denny, L. Bastarache, D.M. Roden, Phenome-Wide Association Studies as a Tool to Advance Precision Medicine, Annual Review of Genomics and Human Genetics. 17 (2016) 353–373. doi:10.1146/annurev-genom-090314-024956.
[2] D.M. Roden, Phenome-wide association studies: a new method for functional genomics in humans, J. Physiol. (Lond.). 595 (2017) 4109–4115. doi:10.1113/JP273122.
[3] W.S. Bush, M.T. Oetjens, D.C. Crawford, Unravelling the human genome-phenome relationship using phenome-wide association studies, Nat Rev Genet. 17 (2016) 129–145. doi:10.1038/nrg.2015.36.
[4] I. Lappalainen, J. Almeida-King, V. Kumanduri, A. Senf, J.D. Spalding, S. Ur-Rehman, et al., The European Genome-phenome Archive of human data consented for biomedical research, Nat Genet. 47 (2015) 692–695. doi:10.1038/ng.3312.
[5] N. Freimer, C. Sabatti, The human phenome project, Nat Genet. 34 (2003) 15–21. doi:10.1038/ng0503-15.
[6] M.E. Samuels, Saturation of the human phenome, Curr. Genomics. 11 (2010) 482–499. doi:10.2174/138920210793175886.
[7] E. Krapohl, J. Euesden, D. Zabaneh, J.-B. Pingault, K. Rimfeld, S. von Stumm, et al., Phenome-wide analysis of genome-wide polygenic scores, Molecular Psychiatry. 21 (2016) 1188–1193. doi:10.1038/mp.2015.126.
[8] S.N. Andreassen, M. Ben Ezra, M. Scheibye-Knudsen, A defined human aging phenome, Aging (Albany NY). 11 (2019) 5786–5806. doi:10.18632/aging.102166.
[9] P. Eline Slagboom, N. van den Berg, J. Deelen, Phenome and genome based studies into human ageing and longevity: An overview, Biochim Biophys Acta Mol Basis Dis. 1864 (2018) 2742–2751. doi:10.1016/j.bbadis.2017.09.017.
[10] K.-I. Goh, M.E. Cusick, D. Valle, B. Childs, M. Vidal, A.-L. Barabási, The human disease network, Proc Natl Acad Sci USA. 104 (2007) 8685–8690. doi:10.1073/pnas.0701361104.
[11] T.A. Peterson, E. Doughty, M.G. Kann, Towards precision medicine: advances in computational approaches for the analysis of human variants, Journal of Molecular Biology. 425 (2013) 4047–4063. doi:10.1016/j.jmb.2013.08.008.
[12] A.G. Cirincione, K.L. Clark, M.G. Kann, Pathway networks generated from human disease phenome, BMC Medical Genomics. 11 (2018) 75. doi:10.1186/s12920-018-0386-2.
[13] R.O. Hynes, The extracellular matrix: not just pretty fibrils, Science. 326 (2009) 1216–1219. doi:10.1126/science.1176009.
[14] C. Frantz, K.M. Stewart, V.M. Weaver, The extracellular matrix at a glance, Journal of Cell Science. 123 (2010) 4195–4200. doi:10.1242/jcs.023820.
[15] M.C. Lampi, C.A. Reinhart-King, Targeting extracellular matrix stiffness to attenuate disease: From molecular mechanisms to clinical trials, Sci Transl Med. 10 (2018) eaao0475. doi:10.1126/scitranslmed.aao0475.


[16] I.N. Taha, A. Naba, Exploring the extracellular matrix in health and disease using proteomics, Essays Biochem. 63 (2019) 417–432. doi:10.1042/EBC20190001.
[17] C. Bonnans, J. Chou, Z. Werb, Remodelling the extracellular matrix in development and disease, Nat Rev Mol Cell Biol. 15 (2014) 786–801. doi:10.1038/nrm3904.
[18] C.Y. Ewald, The Matrisome during Aging and Longevity: A Systems-Level Approach toward Defining Matreotypes Promoting Healthy Aging, Gerontology. (2019) 1–9. doi:10.1159/000504295.
[19] X. Wang, A.K. Pandey, M.K. Mulligan, E.G. Williams, K. Mozhui, Z. Li, et al., Joint mouse-human phenome-wide association to test gene function and disease risk, Nat Commun. 7 (2016) 10464. doi:10.1038/ncomms10464.
[20] G. Unlu, X. Qi, E.R. Gamazon, D.B. Melville, N. Patel, A.R. Rushing, et al., Phenome-based approach identifies RIC1-linked Mendelian syndrome through zebrafish models, biobank associations and clinical studies, Nat Med. 26 (2020) 98–109. doi:10.1038/s41591-019-0705-y.
[21] K.A. Shefchek, N.L. Harris, M. Gargano, N. Matentzoglu, D. Unni, M. Brush, et al., The Monarch Initiative in 2019: an integrative data and analytic platform connecting phenotypes to genotypes across species, Nucleic Acids Res. 48 (2020) D704–D715. doi:10.1093/nar/gkz997.
[22] R.O. Hynes, A. Naba, Overview of the matrisome--an inventory of extracellular matrix constituents and functions, Cold Spring Harb Perspect Biol. 4 (2012) a004903–a004903. doi:10.1101/cshperspect.a004903.
[23] A. Naba, K.R. Clauser, H. Ding, C.A. Whittaker, S.A. Carr, R.O. Hynes, The extracellular matrix: Tools and insights for the “omics” era, Matrix Biol. 49 (2016) 10–24. doi:10.1016/j.matbio.2015.06.003.
[24] P. Nauroy, S. Hughes, A. Naba, F. Ruggiero, The in-silico zebrafish matrisome: A new tool to study extracellular matrix gene and protein functions, Matrix Biol. 65 (2018) 5–13. doi:10.1016/j.matbio.2017.07.001.
[25] M.N. Davis, S. Horne-Badovinac, A. Naba, In-silico definition of the Drosophila melanogaster matrisome, Matrix Biology Plus. 4 (2019) 100015. doi:10.1016/j.mbplus.2019.100015.
[26] A.C. Teuscher, E. Jongsma, M.N. Davis, C. Statzer, J.M. Gebauer, A. Naba, et al., The in-silico characterization of the Caenorhabditis elegans matrisome and proposal of a novel collagen classification, Matrix Biology Plus. (2019) 1–13. doi:10.1016/j.mbplus.2018.11.001.
[27] B.D. Rodgers, D.K. Garikipati, Clinical, agricultural, and evolutionary biology of myostatin: a comparative review, Endocr. Rev. 29 (2008) 513–534. doi:10.1210/er.2008-0003.
[28] P. Benes, V. Vetvicka, M. Fusek, Cathepsin D--many functions of one aspartic protease, Crit. Rev. Oncol. Hematol. 68 (2008) 12–28. doi:10.1016/j.critrevonc.2008.02.008.
[29] H. Li, J. Auwerx, Mouse Systems Genetics as a Prelude to Precision Medicine, Trends Genet. (2020). doi:10.1016/j.tig.2020.01.004.
[30] C.Y. Ewald, J.N. Landis, J. Porter Abate, C.T. Murphy, T.K. Blackwell, Dauer-independent insulin/IGF-1-signalling implicates collagen remodelling in longevity, Nature. 519 (2015) 97–101. doi:10.1038/nature14021.


[31] D. Bakula, A.M. Aliper, P. Mamoshina, M.A. Petr, A. Teklu, J.A. Baur, et al., Aging and drug discovery, Aging (Albany NY). 10 (2018) 3079–3088. doi:10.18632/aging.101646.
[32] A.C. Teuscher, C. Statzer, S. Pantasis, M.R. Bordoli, C.Y. Ewald, Assessing Collagen Deposition During Aging in Mammalian Tissue and in Caenorhabditis elegans, Methods Mol Biol. 1944 (2019) 169–188. doi:10.1007/978-1-4939-9095-5_13.
