bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
The extracellular matrix phenome across species
Cyril Statzer1 and Collin Y. Ewald1*
1 Eidgenössische Technische Hochschule Zürich, Department of Health Sciences and
Technology, Institute of Translational Medicine, Schwerzenbach-Zürich CH-8603,
Switzerland
*Corresponding authors: [email protected] (CYE)
Keywords: Phenome, genotype-to-phenotype, matrisome, extracellular matrix,
collagen, data mining.
Highlights
• 7.6% of the human phenome originates from variations in matrisome genes
• 11’671 phenotypes are linked to matrisome genes of humans, mice, zebrafish,
Drosophila, and C. elegans
• Expected top ECM phenotypes are developmental, morphological and
structural phenotypes
• Nonobvious top ECM phenotypes include immune system, stress resilience,
and age-related phenotypes
1 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
1 Abstract
2 Extracellular matrices are essential for cellular and organismal function. Recent
3 genome-wide and phenome-wide association studies started to reveal a broad
4 spectrum of phenotypes associated with genetic variants. However, the phenome or
5 spectrum of all phenotypes associated with genetic variants in extracellular matrix
6 genes is unknown. Here, we analyzed over two million recorded genotype-to-
7 phenotype relationships across hundreds of species to define their extracellular matrix
8 phenomes. By using previously defined matrisomes of humans, mice, zebrafish,
9 Drosophila, and C. elegans, we found that the extracellular matrix phenome comprises
10 of 3-10% of the entire phenome. Collagens (COL1A1, COL2A1) and fibrillin (FBN1)
11 are each associated with more than 150 distinct phenotypes in humans, whereas
12 collagen COL4A1, Wnt- and sonic hedgehog (shh) signaling are predominantly
13 associated in other species. We determined the phenotypic fingerprints of matrisome
14 genes and found that MSTN, CTSD, LAMB2, HSPG2, and COL11A2 and their
15 corresponding orthologues have the most phenotypes across species. Out of the
16 42’558 unique matrisome genotype-to-phenotype relationships across the five species
17 with defined matrisomes, we have constructed interaction networks to identify the
18 underlying molecular components connecting with orthologues phenotypes and with
19 novel phenotypes. Thus, our networks provide a framework to predict unassessed
20 phenotypes and their potential underlying molecular interactions. These frameworks
21 inform on matrisome genotype-to-phenotype relationships and potentially provide a
22 sophisticated choice of biological model system to study human phenotypes and
23 diseases.
24
2 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
25 Introduction
26 A principle goal in biology is to link molecular mechanisms and biological pathways to
27 phenotypes. Recent systems-level approaches of genome-wide association studies
28 (GWAS) and phenome-wide association studies (PheWAS) have been harnessed to
29 generate and map genotype-to-phenotype relationships. These efforts are invaluable
30 for clinical research spearheaded by linking electronic health records to medically
31 relevant PheWAS, thereby associating genetic variants to human phenotypes,
32 pathologies, and diseases [1-4]. Assuming that conserved genes and pathways result
33 in comparable phenotypes across species, knowing the genotype-to-phenotype
34 relationship in one species could inform on the existence of a similar genotype-to-
35 phenotype relationship in another species. The phenome is the entire set of
36 phenotypes resulting from all possible genetic variations [5,6]. In order to make trans-
37 species genome-phenome inferences, the phenomes of several species need to be
38 known. The human phenome is far from reaching saturation [6]. We estimate that the
39 current human phenome consists of 7’037 unique phenotypes (Source: Database
40 Monarch Initiative; timestamp 23.08.2019). From the entire human phenome, several
41 sub-phenomes have been characterized and defined, including the behavioral
42 phenome [7], the aging phenome [8,9], and the disease phenome [10-12]. These
43 genotype-to-phenotype relationships can be used to build networks and pathways.
44 In particular, the human disease phenome consists of over 80’000 genetic
45 variants associated with human diseases, of which cardiovascular diseases and skin
46 and connective diseases share the most pathways [12]. These diseases are enriched
47 in variants closely located or found in collagens or extracellular matrix genes. Cells
48 secrete proteins, such as collagens, glycoproteins, and proteoglycans that integrate
49 and form matrices in the extracellular space. The extracellular matrix (ECM) provides
3 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
50 structural support, is important for cell-cell communication, and cellular homeostasis
51 [13,14]. The ECM has emerged as a key feature for overall health or disease, with
52 several clinical trials targeting to modify the ECM [15-18]. Furthermore, recent
53 approaches that combine PheWAS from mice to human or zebrafish to human
54 revealed clinically relevant variants in collagen Col6a5 or in Ric1 important for
55 collagen-secretion [19,20]. This indicates that a phenome-based approach across
56 species can be of translational value. However, a defined phenome of collagens and
57 other extracellular matrix genes is currently missing for any species.
58 Here, we mine publicly available databases to establish the phenome of
59 extracellular matrices across species. We take advantage of the large collection of
60 over two million genotype-to-phenotype associations that integrates data across
61 hundreds of species from more than 30 different curated databases provided by the
62 Monarch Initiative [21]. For the ECM phenome we include all genes that assemble the
63 matrisome. The matrisome is the compendium of all possible gene products that either
64 form, remodel, or associate with the extracellular matrix [22]. The matrisome of
65 humans consists of 1027 proteins [23], for mice 1110 proteins [23], for zebrafish 1002
66 proteins [24], for Drosophila 641 proteins [25], and for C. elegans 719 proteins [26].
67 This corresponds to roughly 4% of their genomes are dedicated to ECM genes. We
68 find a similar contribution of 3-10% (overall 6.7 %) of the phenome consisting of 42’558
69 ECM-specific gene-to-phenotype associations. Across these five species, 639’431
70 gene-phenotype associations were recorded consisting of 34’818 distinct phenotypes.
71 We identify the phenotypic landscape of the matrisome. Our genotype-to-phenotype
72 interaction maps for matrisome genes reveal unstudied interaction when compared
73 across species.
74
4 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
75 Methods and Materials
76 Matrisome phenotype-to-genotype relationships
77 The phenotype-gene association table was obtained from the Monarch Initiative
78 (Source: Database Monarch Initiative; timestamp 23.08.2019) [21]. This phenotype-
79 gene association table was modified to remove duplicated species-gene-phenotype
80 relationships (< 0.002% in total) and unspecified organisms were removed
81 (Supplementary Table 1). The genotype-to-phenotype information was then combined
82 with the defined matrisomes of five species: the matrisome of humans [23], mice [23],
83 zebrafish [24], Drosophila [25], and for C. elegans [26]). To enable cross-species
84 comparisons among the species with defined and undefined matrisomes, we mapped
85 all genes documented in the Monarch Initiative database to their corresponding
86 orthologues. The homology mapping was achieved by using the homologene database
87 by the National Center for Biotechnology Information (NCBI) implemented in the
88 homologene R package (Ogan Mancarci (2019). homologene: Quick Access to
89 Homologene and Gene Annotation Updates. R package version 1.4.68.19.3.27.
90 https://CRAN.R-project.org/package=homologene). The human homology information
91 subset and its associated phenome was then further analyzed. All data analysis was
92 performed using the purrr and dplyr R packages (Lionel Henry and Hadley Wickham
93 (2019). purrr: Functional Programming Tools. R package version 0.3.3.
94 https://CRAN.R-project.org/package=purrr; Hadley Wickham, Romain François, Lionel
95 Henry and Kirill Müller (2019). dplyr: A Grammar of Data Manipulation. R package
96 version 0.8.3. https://CRAN.R-project.org/package=dplyr).
97
98 Grouping of orthologous phenotypes
5 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
99 To facilitate the comparison between species, the phenotypes were simplified and
100 grouped to form larger phenotype collections (Supplementary Table 2). These
101 phenotype collections enable both cross-species comparisons by avoiding species-
102 specific terminology, as well as, generating a broader overview of the processes
103 involved. Substring matching was used to assimilate similar phenotypes to phenotype
104 groups, while assuring that each original phenotype only belonged to a single
105 phenotype group using the stringr R package (Hadley Wickham (2019). stringr: Simple,
106 Consistent Wrappers for Common String Operations. R package version 1.4.0.
107 https://CRAN.R-project.org/package=stringr). Capitalization and terminal white spaces
108 were ignored.
109
110 Data analysis and visualization
111 The circular dendrograms were generated with a single hub gene as central node
112 connected to the species in which it was observed and the phenotypes as terminal
113 nodes. The interaction network for each collection of gene-phenotype interactions was
114 computed by extracting the genes, which are associated with the largest number of
115 phenotypes followed by selecting the phenotypes, which comprise the largest number
116 of genes. The edges of the generated network were then facetted by species and their
117 weight set according to the number of species in which a phenotype was observed for
118 the orthologues of a particular gene. Network analysis and visualization was performed
119 using the igraph and ggraph R packages (Csardi G, Nepusz T: The igraph software
120 package for complex network research, InterJournal, Complex Systems 1695. 2006.
121 http://igraph.org; Thomas Lin Pedersen (2018). ggraph: An Implementation of
122 Grammar of Graphics for Graphs and Networks. R package version 1.0.2.
123 https://CRAN.R-project.org/package=ggraph). In addition, the R packages ggplot2,
6 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
124 ggforce and ggpubr were used (H. Wickham. ggplot2: Elegant Graphics for Data
125 Analysis. Springer-Verlag New York, 2016; Thomas Lin Pedersen (2019). ggforce:
126 Accelerating 'ggplot2'. R package version 0.3.1. https://CRAN.R-
127 project.org/package=ggforce; Alboukadel Kassambara (2019). ggpubr: 'ggplot2'
128 Based Publication Ready Plots. R package version 0.2.2. https://CRAN.R-
129 project.org/package=ggpubr). Illustrations were generated using Inkscape version
130 0.92.
131
7 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
132 Results
133 The matrisome-phenome across species
134 To define the extracellular matrix phenome, we sought to identify the overall
135 contribution of matrisome genes to the phenome for each species. In particular, to
136 assess the suitability for this approach, we first plotted the phenotypic landscape in
137 relation to genotypes across species (Fig. 1A). We found 7’037 unique phenotypes
138 comprising the human phenome, from which 3’809 unique genes are linked to these
139 phenotypes (Supplementary Fig. 1). Fortunately, the species with a good phenotype-
140 to-genotype ratio are the ones that have a fully defined and characterized matrisome
141 [23-26]. We used these defined matrisome gene lists to determine their contribution to
142 the phenomes of humans, mice, zebrafish, Drosophila, and C. elegans (Fig. 1B-1F).
143 The contribution of the matrisome to the human phenome is 7.6%, with a comparable
144 2.8-9.5% across species (Fig 1B-F). Thus, the matrisome is involved in a number of
145 phenotypes, but only corresponds to a maximum of 10% of the overall recorded
146 phenome across species.
147
148 The phenotypic signatures of matrisome genes across species
149 Next, we determined which phenotypes are associated with the greatest number of
150 matrisome genes. For humans the top three phenotypes are “scoliosis”, “short stature”,
151 and “micrognathia” (Supplementary Fig. 2). By contrast, the top three phenotypes for
152 mice are “decreased body weight”, “immune system”, “premature death”, for zebra fish
153 (“decreased eye size”, “decreased length whole organism”, “lethal”), for Drosophila
154 (“uncharacterized phenotype”, “viable”, “lethal”), and for C. elegans (“locomotion
155 variant”, “embryonic lethal”, “dumpy”; Supplementary Fig. 2). Given that certain
156 phenotypes are species specific, for instance “dumpy” stands for short and fat C.
8 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
157 elegans, we realized that we cannot compare the phenotypic landscape across
158 species using this conventional phenotypic nomenclature. To overcome this obstacle,
159 we grouped phenotypes into novel phenotype collections. For instance, the phenotypic
160 category “altered body size” includes phenotypes with the key words: body weight,
161 body size, body length, growth phenotype, dwarf, small, dumpy, stunted, tall and others
162 (Materials and Methods, Supplementary Table 1 and 2). With this approach, we were
163 able to group 86.8% of the total 716’644 species-gene-phenotype associations
164 (Supplementary Table 2). We identified sterility, development, and altered body size
165 as the most dominant phenotypic categories across species, and for instance
166 connective tissue and skin phenotypes for vertebrate and poor viability for
167 invertebrates (Fig. 2). Similarly, besides skin and connective tissue, many
168 morphological phenotypes affecting bone, eye, brain, muscle, and the cardiovascular
169 system are represented in these top categories (Fig. 2). Unexpectedly, aging-related
170 phenotypes, stress resilience, and immune systems phenotypes were ranked across
171 species among these top ECM phenotypic categories (Fig. 2). Since we base our
172 observations on five species, we extended our analysis to incorporate all known
173 genotype-phenotype associations across 44 species. Thereby, we were able to extend
174 the matrisome by homology mapping to three additional species (Bos taurus (cattle),
175 Rattus norvegicus (rat), Canis lupus familiaris (dog) (Supplementary Table 3). Then,
176 we used these matrisome input gene lists to define their matrisome-phenome. We
177 found similar top phenotypes either for the ungrouped or with our comparable
178 phenotypic categories (Supplementary Fig. 3 and 4). Thus, our analysis revealed
179 known ECM phenotypes but also unexpected phenotypes related to the ECM with high
180 penetrance across species.
181
9 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
182 Top ranked-matrisome gene signature of the ECM-phenomes
183 To understand which genes are the major drivers of multiple distinct phenotypes, we
184 ranked matrisome genes based on their number of associated phenotypes (Fig. 3,
185 Supplementary Fig. 5). For humans, collagens (COL2A1, COL1A1, COL5A1), fibrillin
186 (FBN1), and growth differentiation factor (GDF5) are each associated with more than
187 150 distinct phenotypes (Fig. 3A). By contrast, for mice, zebrafish, Drosophila, and C.
188 elegans, Wnt- and sonic hedgehog (shh) signaling and other matrisome-secreted
189 factors are predominantly associated with the greatest number of distinct phenotypes
190 (Fig. 3B-E, Supplementary Fig. 5). Except for zebrafish, the highly conserved collagen
191 type IV (COL4A1/emb-9) is associated with range of 20-100 phenotypes across
192 species (Fig. 3). Overall, members of each matrisome category are represented and
193 comparable conserved gene families are linked with numerous phenotypes.
194
195 Phenotypic finger prints associated with matrisome genes
196 Using our phenotype grouping approach, we observed that certain genes were
197 associated with comparable phenotypes across species. For this comparison, we used
198 our phenotypic categories and plotted all matrisome genes as dendrograms
199 connecting the gene with its phenotypes through the species in which they were
200 observed (Fig. 4, Supplementary Fig. 6). The most phenotypically conserved gene is
201 the secreted muscle growth controlling myostatin (MSTN) protein showing muscle, fat,
202 and body size phenotypes in seven species (Fig. 4A, B). This is not surprising, since
203 either loss or pharmacological inhibition of myostatin leads to almost a doubling of
204 muscle mass, a trait applied in livestock and a potential target for treating muscle
205 wasting diseases [27]. Next is the cathepsin D (CTSD) gene, which is a lysosomal
206 aspartyl protease that can be secreted into the extracellular space to remodel ECM
10 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
207 and is involved in many physiological functions, including skin and neuronal
208 development, lipofuscin (age-pigment) removal, and apoptosis [28]. Our analysis of
209 CTSD showed the full phenotypic spectrum across six species with many conserved
210 phenotypes but also species-specific phenotypes (Fig. 4C, Supplementary Fig. 6).
211 Similarly, laminin (LAMB2), heparan sulfate proteoglycan (HSPG2), collagen
212 (COL11A2) showed comparable orthologues phenotypes across multiple species
213 ranging from altered body size to aging-related phenotypes (Fig. 4D-F). The
214 phenotypic finger prints of all matrisome genes is shown in Supplementary Fig. 6 and
215 7. Thus, we have defined the phenotypic landscape of matrisome genes across
216 species.
217
218 Network analysis to identify novel phenotypes and underlying molecular
219 mechanisms
220 We next asked whether we could use this phenotypic landscape of matrisome genes
221 to inform on potential phenotypes not assessed in humans. Furthermore, could we use
222 gene-phenotype pairs in one species to infer novel interactions in another species? In
223 order to achieve this, we built matrisome-genes-to-phenotype networks across species
224 for all matrisome categories and divisions. These networks are shown in
225 Supplementary Fig. 8 and 9. We chose the three core-matrisome categories, which
226 include ECM glycoproteins, collagens, and proteoglycans, to build a network of the five
227 most abundant interspecies gene-to-phenotype associations for H. sapiens, M.
228 musculus and D. rerio (Fig. 5). We positioned these top five conserved genes and
229 phenotypes for each species at the same location in our graphical network (Fig. 5). We
230 identified the conserved gene-to-phenotypes interactions (bold arrows), but also
231 interactions that only occur in a single species (light arrows; Fig. 5). For instance,
11 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
232 genetic variants in neuromuscular junction protein agrin are associated in humans
233 solely with a “sterility” phenotype, whereas in mouse in addition to sterility, agrin is
234 associated with altered body size, morphology, and cardiovascular impairments (Fig.
235 5). Next, we selected the most connected matrisome genes and mapped the genotype-
236 to-phenotype relationships across species (Fig. 6). Myostatin (MSTN) and collagen
237 type VII (COL7A1) are the most conserved drivers for muscle or skin phenotypes,
238 respectively (Fig. 6). Laminin (LAMA2), heparan sulfate proteoglycan (HSPG2), and
239 sonic hedgehog (SHH) gene build a strong interaction network among brain, muscle,
240 and cardiovascular phenotypes (Fig. 6). Curiously, only in mice is matrix
241 metallopeptidase 9 (MMP9) implicated with these three phenotypes (brain, muscle,
242 and cardiovascular), but also with immune system and proliferation phenotypes (Fig.
243 6). Taken together, our graphical networks build a framework to predict unassessed
244 phenotypes and their potential underlying molecular interactions.
245
12 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
246 Figure Legends
247
248 Fig. 1: The contribution of the matrisome to the phenome across species
249 The relationship between the number of genes associated with at least one phenotype
250 and the total number of characterized phenotypes (phenome) linked to at least one
251 gene are shown across species (A). The species with defined matrisomes are depicted
252 in separate panels with the number of genes associated with each phenotype shown
253 on the y-axis and the 40 phenotypes with the highest number of total genes involved
254 are shown on the x-axis for humans (B), for mouse (C), for zebrafish (D), for Drosophila
255 (E), and for C. elegans (F).The fraction of genes, which are members of each species’
256 matrisome are highlighted in color. The right panel provides an overview of the gene-
13 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
257 to-phenotype distribution, while the left panel highlights the contribution of the
258 matrisome by cropping the y-axis. The fraction of matrisome genes associated with the
259 phenotypes compared to all genes is shown as a percentage.
260
261
262 Supplementary Fig. 1: The phenome and number of genes associated with
263 phenotypes
264 (A) The phenome across species plotted as number of phenotypes. (B) The number
265 of unique genes associated with phenotypes across species.
266
14 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
267
268
269 Fig. 2: The matrisome-phenome across species
270 Using the matrisome composition for each species we show the phenome space
271 associated with it for humans (A), mice (B), zebra fish (C), Drosophila (D), and C.
272 elegans (E). Each species is depicted in a separate panel with the 30 largest
273 phenotype groups on the y-axis and the number of unique gene/phenotype
274 associations contained in each group on the x-axis. Subsets of phenotypes and analog
275 phenotypes across species are color-coded to ease the comparison (i.e., “altered body
276 size” in purple). For illustration we connected the “altered body size” phenotype with a
277 purple line. The phenotypes meaning wear slightly different meaning across species
278 due to different morphologies as i.e. C. elegans eye phenotype, which refers to
279 anything related to phototaxis (e.g., UV-light sensing), while the limbs related
280 phenotype captures phenotypes related to the animal’s tail.
281
15 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A Homo sapiens B Mus musculus C Danio rerio Scoliosis no abnormal phenotype detected abnormal(ly) decreased size eye Short stature Decreased body weight abnormal(ly) decreased length whole organism Micrognathia immune system phenotype abnormal(ly) edematous pericardium Seizures premature death abnormal(ly) lethal (sensu genetics) whole organism Intellectual disability decreased body size abnormal(ly) disrupted blood circulation Hypertelorism homeostasis/metabolism phenotype notochord development phenotype Hearing impairment reproductive system phenotype fin regeneration phenotype Muscular hypotonia nervous system phenotype abnormal(ly) decreased size head Respiratory insufficiency neonatal lethality, complete penetrance abnormal(ly) viability whole organism Cleft palate Mortality/Aging abnormal(ly) undulate notochord Sensorineural hearing impairment postnatal lethality, incomplete penetrance abnormal(ly) morphology somite High palate cardiovascular system phenotype cartilage development phenotype Low−set ears Failure to thrive abnormal(ly) quality nervous system Depressed nasal bridge skeleton phenotype abnormal(ly) decreased length post−vent region Cataract Recurrent bacterial infections sprouting angiogenesis phenotype Nystagmus postnatal lethality, complete penetrance dorsal/ventral pattern formation phenotype Myopia Osteopenia abnormal(ly) shape whole organism Talipes equinovarus behavior/neurological phenotype abnormal(ly) quality sensory system Kyphosis preweaning lethality, complete penetrance embryonic viscerocranium morphogenesis phenotype Strabismus increased or absent threshold for auditory brainstem response angiogenesis phenotype Ptosis Cardiomegaly abnormal(ly) morphology notochord Global developmental delay Abnormal heart morphology abnormal(ly) increased occurrence apoptotic process Abnormality of cardiovascular system morphology decreased litter size abnormal(ly) disrupted angiogenesis Pes planus Abnormal trabecular bone morphology determination of left/right symmetry phenotype inguinal hernia Abnormal lung morphology abnormal(ly) morphology whole organism Failure to thrive growth/size/body region phenotype abnormal(ly) morphology brain Cryptorchidism hematopoietic system phenotype abnormal(ly) decreased length whole organism anterior−posterior axis Muscle weakness decreased vertical activity abnormal(ly) curved ventral post−vent region Joint stiffness abnormal kidney morphology abnormal(ly) decreased length intersegmental vessel Anteverted nares Abnormal bleeding abnormal(ly) curved post−vent region 0 20 40 60 0 50 100 150 0 10 20 30 D Drosophila melanogaster E Caenorhabditis elegans Uncharacterized phenotype locomotion variant viable embryonic lethal lethal dumpy fertile abnormal(ly) arrested larval development visible Failure to thrive Drosophila wing phenotype larval lethal neuroanatomy defective adult body morphology variant partially lethal − majority die organism development variant lethal − all die before end of pupal stage transgene subcellular localization variant female sterile molt defect some die during embryonic stage reduced brood size lethal − all die before end of embryonic stage transgene expression increased male sterile sterile some die during pupal stage small lethal − all die before end of larval stage maternal sterile eye phenotype lethal neurophysiology defective pattern of transgene expression variant locomotor behavior defective dauer lifespan extended short lived transgene expression reduced developmental rate defective sick embryonic/larval salivary gland phenotype protruding vulva Drosophila embryonic/first instar larval cuticle phenotype protein aggregation variant some die during larval stage exploded through vulva Drosophila ommatidium phenotype receptor mediated endocytosis defective Drosophila embryo phenotype extended life span partially lethal paralyzed some die during third instar larval stage germ cell compartment expansion variant partially lethal − majority live clear mitotic cell cycle defective pathogen susceptibility increased size defective avoids bacterial lawn 282 0 200 400 0 25 50 75
283 Supplementary Fig. 2: Ungrouped phenotype space across species:
284 Using the matrisome composition for each species and the ungrouped phenotypes,
285 the associated phenome space for humans (A), mice (B), zebra fish (C), Drosophila
286 (D), and C. elegans (E) are shown. Each species is depicted in a separate panel with
287 the 30 largest phenotype/gene association groups on the y axis and the number of
288 unique gene/phenotype associations contained in each group on the x-axis.
16 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A Caenorhabditis elegans B Canis lupus familiaris C Danio rerio
Sterility / reproductive potential Eye phenotype Nervous and neural tissues affected Poor viability Nervous and neural tissues affected Eye phenotype Developmental phenotype Systemic health decrease Developmental phenotype Expression variant Complex brain phenotype Cardiovascular phenotype Genetic information phenotype Psychological phenotype Muscle tissue phenotype Cellular level phenotype Muscle tissue phenotype Cranial deformations Systemic health decrease Oropharingeal related phenotype Cardiovascular impairments Morphology altered Immune system phenotype Brain phenotype Aging−related phenotype Ataxia and immobility Blood phenotype Localization variant Altered body size Connective tissue phenotype Stress resilience phenotype Reflexes and coordination Cellular level phenotype Kinestetic derived phenotypes Urinary system phenotype Oropharingeal related phenotype Altered body size Kinestetic derived phenotypes Morphology altered Adipose tissue phenotype Integumentary and thricocutaneus Edema occurence Eye phenotype Blood phenotype Limbs related phenotype Aggregation Visual impairment Kinestetic derived phenotypes Immune system phenotype tremor Size variant Muscle tissue phenotype weakness Liver and gall phenotype avoids bacterial lawn Morphology altered Coloration variant Growth phenotype Lung and respiratory system Epithelial tissue phenotype Apoptosis Liver and gall phenotype Intestine phenotype Membrane phenotype Brain phenotype Vacuoles and vesicles Lysosomal phenotype Bone phenotype Poor viability Mitochondria affected Apoptosis Psychological phenotype Intestine phenotype Anemia Skin phenotype rachis narrow Sterility / reproductive potential Genetic information phenotype pachytene region organization variant Glucose metabolism related Angiogenetic and blood vessels formation molt defect Cranial deformations Sensory phenotype Pathological condition Cardiovascular impairments Kidney phenotype Increased tissue transparency Posture phenotype Altered body size Transgene variant Kidney phenotype Differentiation phenotype Size variant impaired proprioception Bone phenotype Nervous and neural tissues affected Facial deformations Pancreas phenotype Oropharingeal related phenotype Cellular level phenotype Integumentary and thricocutaneus Paralysis abnormal blinking Spinal chord phenotype protein degradation variant Turn−over variant Turn−over variant Sex−specific phenotype Thyroid phenotype Signaling phenotype sluggish Skin phenotype Localization variant pleiotropic defects severe early emb Sensory phenotype Cystic−like features Circulating and secreted agents phenotype PH−phenotype abnormal(ly) dead whole organism cord commissures fail to reach target neutropenia Circulating and secreted agents phenotype Coloration variant Lymphatic phenotype Sterility / reproductive potential Polarity variant Limbs related phenotype Facial deformations General metabolism phenotype increased aggression Complex brain phenotype age associated fluorescence increased impaired exercise endurance abnormal(ly) curved post−vent region Cancer General metabolism phenotype Aging−related phenotype rachis absent count Fibrosis count abnormal(ly) decreased length post−vent region count 10000 engulfment variant dysmetria 30 Activity variant 4000 Bone phenotype Connective tissue phenotype Adipose tissue phenotype 7500 axon fasciculation variant cognitive impairment abnormal(ly) shape whole organism 3000 20 Phosphorylation altered 5000 circling abnormal(ly) decreased length whole organism anterior−posterior axis 2000 Limbs related phenotype Activity variant Neurotransmiters related phenotype Phenotype Phenotype 10 Phenotype excessive blebbing early emb 2500 Vitamin phenotype Apoptosis 1000 Turn−over variant Spleen phenotype abnormal(ly) curved ventral post−vent region sister chromatid segregation defective early emb Spinal chord phenotype abnormal(ly) decreased thickness extension ectopic cleavage furrows early emb Skeletal phenotype abnormal(ly) decreased thickness trunk Temperature phenotype Shortening phenotype determination of left/right symmetry phenotype Cytoskeleton affected Podalic, digitalis phenotype abnormal(ly) curved trunk blistered Paralysis abnormal(ly) disrupted determination of left/right symmetry Vacuoles and vesicles Inflamation phenotype Aggregation cleavage furrow initiation defective early emb Hyperactivity phenotype Shortening phenotype Shortening phenotype Heart function impairment abnormal(ly) curved ventral whole organism Differentiation phenotype diarrhea hemopoiesis phenotype Oxygen and nitrogen metabolism Developmental phenotype Thorax−related phenotypes Accumulation phenotype Cystic−like features Thymus phenotype breaks in alae confusion Immune system phenotype Activity variant coma abnormal(ly) u−shaped somite Epithelial tissue phenotype Circulating and secreted agents phenotype Glucose metabolism related Sensory phenotype Cardiovascular phenotype Cytoskeleton affected abnormal(ly) process quality axon guidance Blood vessel narrowing abnormal(ly) curved ventral trunk population fitness variant Atrophy phenotype abnormal(ly) bent post−vent region complex phenotype early emb abnormal vocalization Hypoplasia phenotype Skin phenotype abnormal exercise endurance abnormal(ly) increased width notochord failure to hatch impaired balance Membrane phenotype pale ichthyosis abnormal(ly) decreased length trunk lipid composition variant hyperkeratosis somitogenesis phenotype exaggerated asynchrony early emb hypercapnia abnormal(ly) increased width somite anaphase bridging Genetic information phenotype abnormal(ly) grey yolk Interaction with chemicals phenotype flexion contracture dorsal/ventral pattern formation phenotype cortical dynamics defective early emb decreased lacrimation Metal ion homeostasis phenotype transposon silencing variant decreased collagen level abnormal(ly) shape somite cleavage furrow defective early emb cognition phenotype abnormal(ly) curved whole organism anterior−posterior axis abnormal(ly) process quality gastrulation Cancer abnormal(ly) kinked post−vent region Skeletal phenotype Breathing problems abnormal(ly) undulate notochord cleavage furrow termination defective early emb bradycardia abnormal(ly) necrotic whole organism rays displaced astasia abnormal(ly) increased curvature post−vent region Cranial deformations Aneurysm abnormal(ly) dorsalized whole organism ovulation variant Adipose tissue phenotype abnormal(ly) decreased mobility whole organism fertility reduced absent galactosylceramidase abnormal(ly) bent whole organism ems anterior extension fails early emb absent eccrine glands abnormal(ly) quality yolk unfolded protein response variant abnormality of the voice abnormal(ly) low saturation iridophore protein transport variant abnormality of the patella abnormal(ly) increased curvature whole organism ventral enclosure variant abnormal wound healing abnormal(ly) curled post−vent region phagosome maturation defective abnormal testis physiology Lymphatic phenotype ventral closure defective abnormal rod electrophysiology Adhesion affected pace of p lineage defective early emb abnormal protein level abnormal(ly) curved whole organism obsolete ray loss abnormal nursing abnormal(ly) bent trunk lipid depleted Abnormal gait phenotype Abberant proliferation alae variant abnormal food intake abnormal(ly) quality inner ear alae absent abnormal bleeding abnormal(ly) curved dorsal post−vent region 0 2500 5000 7500 10000 0 10 20 30 0 1000 2000 3000 4000 # occurances # occurances # occurances D Drosophila melanogaster E Homo sapiens F Mus musculus
Poor viability Complex brain phenotype Sterility / reproductive potential Uncharacterized phenotype Eye phenotype Blood phenotype Sterility / reproductive potential Oropharingeal related phenotype Cellular level phenotype Nervous and neural tissues affected Muscle tissue phenotype Altered body size Drosophila−specific phenotype Podalic, digitalis phenotype Morphology altered Developmental phenotype Sterility / reproductive potential Nervous and neural tissues affected Viability Facial deformations Bone phenotype Podalic, digitalis phenotype Bone phenotype Immune system phenotype Genetic information phenotype Cardiovascular impairments Cardiovascular impairments Eye phenotype Blood phenotype Developmental phenotype visible Developmental phenotype Poor viability Muscle tissue phenotype Nervous and neural tissues affected Eye phenotype fertile Atrophy phenotype Muscle tissue phenotype Cellular level phenotype Altered body size Brain phenotype Kinestetic derived phenotypes Skin phenotype Cardiovascular phenotype Abdominal phenotype Kidney phenotype Integumentary and thricocutaneus Oropharingeal related phenotype Integumentary and thricocutaneus Adipose tissue phenotype Sex−specific phenotype Hearing impairment Complex brain phenotype Stress resilience phenotype Lung and respiratory system Cancer Thorax−related phenotypes Cranial deformations Kidney phenotype Sensory phenotype Spinal chord phenotype Psychological phenotype Aging−related phenotype Inflamation phenotype Skin phenotype Altered body size Connective tissue phenotype Lung and respiratory system Cranial deformations Cancer Liver and gall phenotype Brain phenotype Kinestetic derived phenotypes no abnormal phenotype detected Coloration variant Brain phenotype Oropharingeal related phenotype Size variant Visual impairment Circulating and secreted agents phenotype nmj bouton phenotype Major articulations Systemic health decrease Apoptosis Morphology altered Spleen phenotype Cystic−like features Muscle weakness phenotype Glucose metabolism related Limbs related phenotype Psychological phenotype Spinal chord phenotype Psychological phenotype Reflexes and coordination Apoptosis Cytoskeleton affected Paralysis Cranial deformations Polarity variant Urinary system phenotype Lymphatic phenotype dorsal appendage phenotype Liver and gall phenotype Stress resilience phenotype Integumentary and thricocutaneus Cardiovascular phenotype Urinary system phenotype Morphology altered Growth phenotype Kinestetic derived phenotypes Immune system phenotype Ataxia and immobility Limbs related phenotype Complex brain phenotype Breathing problems Inflamation phenotype Paralysis Systemic health decrease Diabetes related phenotype Reflexes and coordination Hypoplasia phenotype Genetic information phenotype interommatidial bristle phenotype Intestine phenotype Intestine phenotype chorion phenotype Shortening phenotype Reflexes and coordination dendrite phenotype Anemia Aging−related phenotype Facial deformations Cystic−like features Turn−over variant Mitochondria affected Diabetes related phenotype Hearing impairment dorsocentral bristle phenotype count Immune system phenotype count Podalic, digitalis phenotype count microchaeta phenotype spasticity 8000 General metabolism phenotype 10000 Adipose tissue phenotype 30000 Blood vessel narrowing Connective tissue phenotype Intestine phenotype Abberant proliferation 6000 Pancreas phenotype 7500 axon phenotype 20000 hypertelorism Epithelial tissue phenotype 4000 5000 crossvein phenotype ptosis Shortening phenotype Phenotype Phenotype Phenotype Lamina affected 10000 Limbs related phenotype 2000 Activity variant 2500 suppressor of variegation micrognathia Sex−specific phenotype bouton phenotype Skeletal phenotype Major articulations Vacuoles and vesicles Motor phenotype Differentiation phenotype tarsal segment phenotype Spleen phenotype Breathing problems mushroom body beta−lobe phenotype high palate Sensory phenotype karyosome phenotype Thorax−related phenotypes Thymus phenotype Epithelial tissue phenotype cleft palate Size variant Cardiovascular phenotype Thyroid phenotype Anemia Skin phenotype Coloration variant Hyperactivity phenotype type i bouton phenotype anteverted nares Cystic−like features non−suppressor of variegation Cellular level phenotype Edema occurence dorsal trunk primordium phenotype General metabolism phenotype Skeletal phenotype arista phenotype epicanthus Facial deformations Turn−over variant Edema occurence Hypoactivity phenotype commissure phenotype Immunodeficiency Growth phenotype enhancer of variegation PH−phenotype Fibrosis Lysosomal phenotype downslanted palpebral fissures Atrophy phenotype Kidney phenotype Obesity related phenotype Weight loss grandchildless Lymphatic phenotype Mitochondria affected Hypoactivity phenotype Fatigue Endocrine phenotype Blood phenotype Abdominal phenotype PH−phenotype non−enhancer of variegation cognitive impairment Fatty acid metabolism mushroom body alpha−lobe phenotype Fractures Thyroid phenotype notopleural bristle phenotype Sensory phenotype Temperature phenotype germarium phenotype tremor Adrenal gland phenotype femur phenotype dystonia Neurotransmiters related phenotype Growth phenotype Nervous system defect Obesity related phenotype Hearing impairment Fibrosis Hypoplasia phenotype Membrane phenotype Aplasia/Hypoplasia of bone tremor aster phenotype Sclerosis abnormal bleeding touch response defective hydrocephalus Stomach and gastric related phenotype Cardiovascular impairments hypertension Necrosis related phenotype Lymphatic phenotype Hyperactivity phenotype Ataxia and immobility Neurotransmiters related phenotype flexion contracture Posture phenotype Cancer kyphosis Vacuoles and vesicles Hyperplasia phenotype hypospadias Abberant proliferation pain response defective Agenesis decreased testis weight unguis phenotype Mental Depression Oxygen and nitrogen metabolism temperature response defective eeg abnormality Thorax−related phenotypes ring gland phenotype Adrenal gland phenotype leukopenia Skeletal phenotype pes cavus pallor electrophoretic variant babinski sign Angiogenetic and blood vessels formation cuticle phenotype Circulating and secreted agents phenotype decreased survivor rate adult cuticle phenotype pes planus kyphosis ventral furrow phenotype posteriorly rotated ears abnormal macrophage physiology terminal bouton phenotype pectus excavatum Metal ion homeostasis phenotype large body inguinal hernia decreased prepulse inhibition 0 10000 20000 30000 0 2000 4000 6000 8000 0 3000 6000 9000 # occurances # occurances # occurances G Mus musculus domesticus H Rattus norvegicus I Saccharomyces cerevisiae S288C
Sterility / reproductive potential Blood phenotype Stress resilience phenotype Cardiovascular impairments Cardiovascular phenotype Accumulation phenotype Altered body size Altered body size Growth phenotype Integumentary and thricocutaneus Cellular level phenotype Genetic information phenotype Limbs related phenotype Psychological phenotype Systemic health decrease Eye phenotype Circulating and secreted agents phenotype Viability Blood phenotype Stress resilience phenotype Vacuoles and vesicles Podalic, digitalis phenotype Sterility / reproductive potential competitive fitness:increased Morphology altered Complex brain phenotype Aging−related phenotype Cranial deformations Adipose tissue phenotype Cellular level phenotype Poor viability Urinary system phenotype Lung and respiratory system Developmental phenotype Cardiovascular impairments Oxygen and nitrogen metabolism Bone phenotype Immune system phenotype Temperature phenotype Nervous and neural tissues affected Neurotransmiters related phenotype Abberant proliferation Cardiovascular phenotype Muscle tissue phenotype Apoptosis Hearing impairment Morphology altered Morphology altered Cellular level phenotype Lymphatic phenotype protein/peptide distribution:abnormal Complex brain phenotype Brain phenotype Mitochondria affected Spinal chord phenotype Nervous and neural tissues affected Size variant Brain phenotype Kidney phenotype auxotrophy Psychological phenotype Glucose metabolism related colony sectoring:increased Kidney phenotype Fibrosis Interaction with chemicals phenotype Skin phenotype decreased susceptibility to hypertension utilization of carbon source:decreased rate Facial deformations Hypoactivity phenotype Activity variant Oropharingeal related phenotype Diabetes related phenotype protein/peptide modification:decreased tremor Cancer utilization of carbon source:decreased Systemic health decrease Systemic health decrease Turn−over variant Kinestetic derived phenotypes Spleen phenotype sporulation efficiency:decreased Immune system phenotype Sclerosis transposable element transposition:decreased Lung and respiratory system Lung and respiratory system biofilm formation:decreased Muscle tissue phenotype Liver and gall phenotype Sterility / reproductive potential Reflexes and coordination Kinestetic derived phenotypes sporulation:decreased Hyperactivity phenotype Fatty acid metabolism colony appearance:abnormal belly spot Developmental phenotype protein/peptide modification:increased circling Apoptosis sporulation:absent cleft palate Activity variant Diet−related phenotype Spleen phenotype Thymus phenotype sporulation:normal micrognathia Temperature phenotype competitive fitness:normal Sensory phenotype Sensory phenotype biofilm formation:increased Urinary system phenotype Poor viability transposable element transposition:increased Circulating and secreted agents phenotype PH−phenotype Circulating and secreted agents phenotype PH−phenotype impaired proprioception Skeletal phenotype Cystic−like features Cranial deformations budding pattern:abnormal Ataxia and immobility Size variant Poor viability Edema occurence Podalic, digitalis phenotype utilization of carbon source:absent Thymus phenotype Pathological condition Peroxisome phenotype Glucose metabolism related count Oropharingeal related phenotype count sporulation efficiency:increased count Skeletal phenotype Mitochondria affected protein transport:abnormal 500 50 Shortening phenotype leukopenia sporulation:abnormal 15000 Lymphatic phenotype 400 Integumentary and thricocutaneus 40 chitin deposition:increased decreased mean corpuscular volume 300 Inflamation phenotype 30 protein/peptide distribution:increased 10000 increased hdl cholesterol concentration 200 increased prepulse inhibition 20 protein/peptide modification:absent Phenotype Phenotype Phenotype Sex−specific phenotype increased creatine kinase level survival rate in stationary phase:decreased 5000 100 10 heterotaxy increased alcohol consumption silencing:decreased Coloration variant hypertension chitin deposition:normal Cancer decreased testis weight budding index:abnormal Obesity related phenotype Cystic−like features spore germination:decreased dextrocardia Connective tissue phenotype prion formation:decreased decreased hdl cholesterol concentration Atrophy phenotype Kinestetic derived phenotypes impaired balance abnormal response to transplant Altered body size Adipose tissue phenotype abnormal estrous cycle budding index:increased Posture phenotype increased tibialis anterior weight silencing:abnormal Liver and gall phenotype increased susceptibility to weight gain Polarity variant Intestine phenotype increased fluid intake prion formation:increased decreased survivor rate increased defecation amount budding:abnormal Turn−over variant increased creatinine clearance budding index:decreased Connective tissue phenotype increased body temperature Localization variant Breathing problems increased bile salt level petite Genetic information phenotype impaired balance General metabolism phenotype Blood vessel narrowing impaired acrosome reaction utilization of iron source:decreased Vacuoles and vesicles hypoperistalsis petite−negative Stomach and gastric related phenotype hyperresponsive to tactile stimuli utilization of phosphorus source:decreased Size variant Eye phenotype spore wall formation:abnormal Inflamation phenotype excessive scratching redox state:abnormal Hypoactivity phenotype Epithelial tissue phenotype silencing:increased hydrocephalus Endocrine phenotype flocculation:increased Epithelial tissue phenotype decreased tibialis anterior weight rna modification:decreased General metabolism phenotype decreased prepulse inhibition rna modification:abnormal Anemia decreased food intake axial budding pattern:decreased Aging−related phenotype decreased exploration in new environment rna modification:absent Agenesis decreased b wave amplitude chitin deposition:decreased reduced fertility cognitive inflexibility prion loss:increased mesocardia chondrodystrophy protein/peptide distribution:decreased Differentiation phenotype Bone phenotype Major articulations decreased testis weight belly spot liquid culture appearance:abnormal cleft upper lip Anemia protein/peptide modification:abnormal chylous ascites Aging−related phenotype pexophagy:absent brachycephaly Adrenal gland phenotype protein transport:decreased Apoptosis absent sexual maturation pheromone production:decreased white spotting abnormal survival survival rate in stationary phase:increased Weight loss abnormal response to social novelty shmoo formation:absent striated fur abnormal pruritus chitin deposition:abnormal impaired righting response abnormal nociception after inflammation protein/peptide modification:normal Hypoplasia phenotype abnormal mechanical nociception shmoo formation:abnormal Heart function impairment abnormal lipid level Adhesion affected Diabetes related phenotype abnormal hemostasis spore germination:absent coloboma abnormal food preference Golgi apparatus affected Atrophy phenotype abnormal conditioned emotional response colony shape:abnormal Abdominal phenotype abnormal chloride level budding:absent abnormal cone electrophysiology Abberant proliferation reticulophagy:decreased 0 200 400 0 20 40 0 5000 10000 15000 # occurances # occurances # occurances J Felis catus K Bos taurus L Ovis aries
Facial deformations Genetic information phenotype Sterility / reproductive potential Eye phenotype Poor viability
Muscle tissue phenotype Connective tissue phenotype Systemic health decrease Systemic health decrease Systemic health decrease
Complex brain phenotype Spinal chord phenotype Nervous and neural tissues affected
Bone phenotype Skin phenotype Complex brain phenotype Spinal chord phenotype Muscle tissue phenotype
Oropharingeal related phenotype Altered body size Visual impairment Skin phenotype Sterility / reproductive potential Reflexes and coordination Oropharingeal related phenotype Skin phenotype microtia Cardiovascular impairments Kinestetic derived phenotypes Muscle tissue phenotype Thyroid phenotype Abdominal phenotype Morphology altered weakness Altered body size Integumentary and thricocutaneus tremor Complex brain phenotype abnormal ovulation Skeletal phenotype tremor Podalic, digitalis phenotype rigidity weakness Paralysis Psychological phenotype Nervous and neural tissues affected Turn−over variant Nervous and neural tissues affected Major articulations Cranial deformations Kidney phenotype Spinal chord phenotype Visual impairment Glucose metabolism related Turn−over variant Psychological phenotype Cystic−like features Podalic, digitalis phenotype Cranial deformations protein catabolic process phenotype Limbs related phenotype Cardiovascular phenotype impaired balance Blood phenotype Motor phenotype Eye phenotype Anemia
Altered body size Coloration variant Limbs related phenotype
Activity variant Cardiovascular phenotype large horns abnormal protein level Brain phenotype
Weight loss Ataxia and immobility Kinestetic derived phenotypes Urinary system phenotype astasia count count count unresponsive to tactile stimuli white spotting 8 10.0 Integumentary and thricocutaneus 8 Turn−over variant weakness 6 7.5 6 Thyroid phenotype Water homeostasis altered impaired proprioception 5.0 Psychological phenotype 4 Shortening phenotype 4 Phenotype Phenotype Phenotype impaired balance Morphology altered 2 PH−phenotype 2.5 2 Lysosomal phenotype Paralysis Glucose metabolism related Limbs related phenotype micrognathia jaundice microfibril phenotype Eye phenotype Integumentary and thricocutaneus Kidney phenotype
Inflamation phenotype increased aggression exercise intolerance increased inflammatory response impaired exercise endurance increased aggression Enzyme activity heterochromia iridis impaired proprioception Hearing impairment Edema occurence impaired balance General metabolism phenotype Immune system phenotype Facial deformations distichiasis Hyperplasia phenotype ectopia lentis Hyperactivity phenotype Connective tissue phenotype dilatation hydrocephalus Developmental phenotype Heart function impairment cognitive impairment decreased beta−mannosidase level Hearing impairment conjunctiva phenotype Brain phenotype General metabolism phenotype coloboma Fibrosis cleft palate Atrophy phenotype enlarged paws Cancer Embolus and / or thrombus Ataxia and immobility brachycephaly Edema occurence blue irides dysostosis multiplex Aging−related phenotype Blood phenotype Developmental phenotype blepharophimosis absent horns Coloration variant Atrophy phenotype Circulating and secreted agents phenotype absent beta−galactosidase protein Cardiovascular impairments absent horns
blepharophimosis abortion, rnaseh2b−related in cattle abnormal translation Atrophy phenotype abnormal pruritus
Ataxia and immobility abnormal protein level abnormal spike wave discharge
Adipose tissue phenotype abnormal primary sex determination
absent beta−galactosidase protein abnormal parturition abnormal pruritus
abnormality of the rib cage abnormal ear position abnormal lipid level abnormality of the pinna Abdominal phenotype
0 2 4 6 8 0.0 2.5 5.0 7.5 10.0 0 2 4 6 8 289 # occurances # occurances # occurances
290 Supplementary Fig. 3: The whole matrisome genome-phenome across species
291 (grouped)
292 All gene/phenotype associations of every species are shown for 44 species. The
293 number of unique gene-to-phenotype associations within every phenotype group are
294 shown arranged in descending order.
17 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A Bos taurus B Caenorhabditis elegans C Canis lupus familiaris
premature death embryonic lethal premature death Tremor Failure to thrive Ataxia thin body abnormal(ly) arrested larval development Tremor Rigidity sterile weakness Limb joint contracture locomotion variant Retinal dysplasia Joint hypermobility transgene subcellular localization variant Recurrent infections decreased body size transgene expression increased impaired coordination abnormal spine curvature reduced brood size Gait disturbance Relatively short spine maternal sterile Blindness Protruding tongue pattern of transgene expression variant Paraparesis postnatal lethality sterile progeny increased neuron apoptosis increased skeletal muscle mass larval lethal Abnormality of movement impaired balance protruding vulva Abnormal neuron morphology Gait disturbance transgene expression reduced Abnormal eating behavior Behavioral abnormality receptor mediated endocytosis defective Strabismus Ataxia dauer lifespan extended Seizures astasia sick increased salivation head/neck swellings extended life span Impaired proprioception Haplotype JH1 lethal abnormal postnatal growth/weight/body size Haplotype HHD organism development variant abnormal blinking Haplotype HH5 protein aggregation variant Retinal detachment Haplotype HH4 fat content reduced Retinal degeneration Haplotype HH1 avoids bacterial lawn Neutropenia Haplotype HH0 shortened life span muscle degeneration Goiter germ cell compartment expansion variant Lethargy Fragile skin adult body morphology variant increased aggression Failure to thrive exploded through vulva impaired exercise endurance Enlarged kidney small Hepatomegaly embryonic lethality oocyte number decreased Failure to thrive Ectopia lentis cell membrane organization biogenesis variant Dysmetria Dysphagia oocyte morphology variant decreased body size Dry skin germ cell compartment morphology variant Cognitive impairment domed cranium dumpy circling diluted coat color RAB−11 recycling endosome morphology variant Behavioral abnormality Dilation of lateral ventricles no induction of antimicrobial peptide expression after infection Anemia Dilatation rachis morphology variant Abnormality of dental morphology Dementia lysosome−related organelle morphology variant abnormal sensory capabilities/reflexes/nociception Dehydration egg laying variant Visual impairment deformed nails germline nuclear positioning variant Splenomegaly decreased triiodothyronine level rachis narrow Respiratory insufficiency decreased cardiac muscle contractility cytoplasmic processing body variant Recurrent bacterial infections decreased branched−chain alpha−keto acid dehydrogenase level abnormal(ly) process quality gonad development Peripheral demyelination Decreased body weight pachytene region organization variant Paralysis decreased beta−mannosidase level molt defect Mitral regurgitation decreased argininosuccinate synthetase level fat content increased increased glycogen level darkened coat color accumulated germline cell corpses Hyperactivity conjunctiva phenotype count growth variant count Hemolytic anemia count Congestive heart failure 5 clear 3000 Hematuria Coloboma abnormal cell nucleus morphology Dysphagia 4 Cleft palate protein expression reduced 2000 domed cranium 10 Brain atrophy 3 aldicarb resistant Diarrhea Brachycephaly apoptosis reduced Dementia
Phenotype Phenotype 1000 Phenotype 5 Blue irides 2 abnormal(ly) process quality chromosome condensation Degeneration of anterior horn cells Blindness RAB−11 recycling endosome localization variant Confusion 1 Blepharophimosis gonad vesiculated Coma Biparietal narrowing thin Cataract behavior/neurological phenotype protein subcellular localization variant bone marrow phenotype Arthrogryposis multiplex congenita no oocytes behavior/neurological phenotype Aplasia/Hypoplasia of the lens diplotene region organization variant Atrophy of the spinal cord absent horns multivulva Aciduria absent eye pigmentation oogenesis variant Absent hair absent coat pigmentation early larval arrest Abnormality of vision Abortion, RNASEH2B−related in cattle abnormal(ly) process quality apoptotic process Abnormality of the dentition Abortion due to haplotype MH2 in cattle transgene expression variant Abnormality of the cerebellum Abortion due to haplotype MH1 in cattle life span variant abnormal vocalization Abortion due to haplotype FH4 in cattle germ cell compartment size variant abnormal vitamin B12 level Abortion due to haplotype BH2 in cattle gonad morphology variant abnormal reflex Abortion (embryonic lethality), SNAPC4−related in cattle germ cell compartment multinucleate Abnormal posturing Abortion (embryonic lethality), RPIA−related in cattle oocyte septum formation variant abnormal postural reflex Abortion (embryonic lethality), RNF20−related in cattle nuclei small abnormal neuron physiology Abortion (embryonic lethality), RABGGTB−related in cattle abnormal(ly) process quality germ cell development Abnormal joint morphology Abortion (embryonic lethality), OBFC1−related in cattle foraging behavior variant abnormal exercise endurance Abortion (embryonic lethality), MYH6−related in cattle distal tip cell migration variant abnormal ependyma morphology Abortion (embryonic lethality), MED22−related in cattle abnormal(ly) increased rate apoptotic process Abnormal heart morphology Abortion (embryonic lethality), EXOSC4−related in cattle body wall muscle myosin organization defective Abnormal head movements Abnormality of skull size protein degradation variant Abnormal glucose homeostasis Abnormality of skin pigmentation pore forming toxin hypersensitive abnormal forelimb morphology Abnormality of serum amino acid level male somatic gonad development variant abnormal foot pad morphology Abnormality of metabolism/homeostasis paralyzed abnormal food intake Abnormality of joint mobility embryo osmotic integrity defective early emb Abnormal facial shape Abnormality of ion homeostasis sluggish Abnormal eyelid morphology Abnormal vertebral morphology oocytes small Abnormal erythrocyte morphology abnormal pruritus Bacillus thuringiensis toxin hypersensitive Abnormal enzyme/coenzyme activity abnormal protein level abnormal testis development Abnormal emotion/affect behavior abnormal primary sex determination mitochondria alignment variant abnormal developmental patterning abnormal pigmentation pattern abnormal(ly) arrested embryo development Abnormal circulating methionine concentration abnormal parturition multiple nuclei early emb abnormal cerebral hemisphere morphology abnormal outer ear morphology pleiotropic defects severe early emb abnormal cerebellum fissure morphology Abnormal neuron morphology meiosis variant abnormal cerebellar foliation abnormal milk composition high incidence male progeny abnormal calcium ion homeostasis abnormal horn pigmentation germ cell compartment anucleate abnormal bone mineralization Abnormal hand morphology dauer constitutive abnormal bone marrow morphology abnormal ear position oocytes lack nucleus Abnormal bleeding abnormal depression−related behavior RNAi resistant abnormal behavioral response to light abnormal circulating thyroxine level body wall muscle sarcomere morphology variant abnormal basal metabolism abnormal cerebellum white matter morphology pathogen susceptibility increased abnormal axon morphology Abnormal cartilage morphology mRNA surveillance defective abnormal atrioventricular cushion morphology Abnormal aortic morphology nonsense mRNA accumulation Abnormal appendicular skeleton morphology abnormal aorta elastic tissue morphology cord commissures fail to reach target abnormal amino acid level Abdominal distention Abnormal mitochondrial morphology Abnormal activity of mitochondrial respiratory chain 0 1 2 3 4 5 0 1000 2000 3000 0 5 10 # occurances # occurances # occurances D Danio rerio E Drosophila melanogaster F Felis catus
abnormal(ly) decreased size eye Uncharacterized phenotype Gait disturbance abnormal(ly) edematous pericardium lethal Failure to thrive abnormal(ly) decreased size head viable Microtia abnormal(ly) decreased length whole organism visible eye opacity abnormal(ly) viability whole organism fertile Broad face abnormal(ly) lethal (sensu genetics) whole organism Drosophila wing phenotype weakness abnormal(ly) dead whole organism neuroanatomy defective Tremor abnormal(ly) edematous heart female sterile thick skin abnormal(ly) quality nervous system partially lethal − majority die Short nose abnormal(ly) decreased size whole organism male sterile increased creatine kinase activity abnormal(ly) quality heart lethal − all die before end of larval stage Decreased cervical spine mobility abnormal(ly) hypoplastic gut some die during embryonic stage decreased body size abnormal(ly) disrupted blood circulation lethal − all die before end of embryonic stage Anemia abnormal(ly) necrotic central nervous system lethal − all die before end of pupal stage Abnormality of the mouth abnormal(ly) disrupted heart looping neurophysiology defective abnormal protein level abnormal(ly) quality pigment cell eye phenotype Abnormal facial shape abnormal(ly) quality gut some die during larval stage abdomen swellings abnormal(ly) hypoplastic liver some die during pupal stage skeleton phenotype abnormal(ly) curved post−vent region locomotor behavior defective Skeletal muscle hypertrophy heart development phenotype partially lethal Retinal degeneration abnormal(ly) quality liver meiotic cell cycle defective Renal insufficiency abnormal(ly) increased occurrence apoptotic process short lived Renal cyst abnormal(ly) decreased rate heart contraction developmental rate defective reflex phenotype abnormal(ly) quality brain Drosophila ommatidium phenotype reduced foot pad pigmentation developmental pigmentation phenotype mitotic cell cycle defective Pulmonary edema brain development phenotype Drosophila embryonic/first instar larval cuticle phenotype Prominent forehead abnormal(ly) decreased length post−vent region Drosophila macrochaeta phenotype premature death cilium morphogenesis phenotype lethal − all die before end of P−stage paresis abnormal(ly) decreased amount nucleate erythrocyte wing vein phenotype Paralysis liver development phenotype Drosophila oocyte phenotype Osteopenia abnormal(ly) shape whole organism NMJ bouton phenotype obsolete neuromuscular junction phenotype abnormal(ly) decreased life span whole organism partially lethal − majority live muscular atrophy abnormal(ly) decreased length whole organism anterior−posterior axis egg chamber phenotype Mitral regurgitation abnormal(ly) cystic pronephros male fertile Metabolic acidosis chordate embryonic development phenotype Drosophila embryo phenotype long tongue abnormal(ly) morphology brain Drosophila egg phenotype Lethargy abnormal(ly) necrotic brain cytokinesis defective Left ventricular hypertrophy abnormal(ly) hydrocephalic brain some die during third instar larval stage Jaundice abnormal(ly) quality sensory system some die during pharate adult stage Increased inflammatory response abnormal(ly) curved ventral post−vent region leg phenotype Increased head circumference heart looping phenotype female fertile Increased blood urea nitrogen angiogenesis phenotype Drosophila imaginal disc phenotype increased aggression abnormal(ly) decreased thickness extension Drosophila embryonic/larval neuromuscular junction phenotype Impaired proprioception abnormal(ly) decreased size liver small body impaired limb coordination abnormal(ly) decreased size brain lethal − all die before end of first instar larval stage impaired balance abnormal(ly) decreased thickness trunk sterile Hypothyroidism abnormal(ly) quality pharyngeal arch 3−7 skeleton count size defective count Hyperreflexia count abnormal(ly) morphology heart ovary phenotype 20000 Hyperphosphatemia 4 determination of left/right symmetry phenotype nurse cell phenotype Hyperactivity 600 15000 abnormal(ly) curved trunk some die during first instar larval stage Hydrocephalus 3 abnormal(ly) disrupted determination of left/right symmetry 400 dorsal appendage phenotype 10000 Hip dysplasia abnormal(ly) disrupted thigmotaxis chemical sensitive Hip dislocation
Phenotype Phenotype Phenotype 2 abnormal(ly) morphology somite 200 lethal − all die before end of second instar larval stage 5000 hindlimb paralysis embryonic viscerocranium morphogenesis phenotype increased cell death Hepatic cysts 1 retina development in camera−type eye phenotype some die during second instar larval stage Hearing impairment cartilage development phenotype embryonic/larval salivary gland phenotype Global developmental delay abnormal(ly) circular yolk decreased cell number Generalized tonic−clonic seizures abnormal(ly) disrupted embryo development Drosophila embryonic head phenotype Flat face abnormal(ly) present in fewer numbers in organism melanocyte embryonic/larval neuroblast phenotype Epiphyseal dysplasia abnormal(ly) quality hindbrain Drosophila pigment cell phenotype enlarged paws abnormal(ly) curved ventral whole organism immune response defective Dysostosis multiplex abnormal(ly) delayed embryo development Drosophila wing disc phenotype Difficulty walking hemopoiesis phenotype female semi−sterile Degeneration of anterior horn cells heart contraction phenotype paralytic decreased UDP−N−acetylglucosamine−1−phosphotransferase level digestive tract development phenotype Drosophila melanotic mass phenotype decreased N−acetylgalactosamine−4−sulfatase protein level abnormal(ly) morphology whole organism Drosophila spermatid phenotype decreased circulating factor XII level abnormal(ly) disrupted retina layer formation Drosophila testis phenotype darkened coat color abnormal(ly) U−shaped somite planar polarity defective Corneal opacity pronephros development phenotype wing margin phenotype Congestive heart failure abnormal(ly) quality eye embryonic/larval brain phenotype Cataract abnormal(ly) curved ventral trunk courtship behavior defective bone cell phenotype abnormal(ly) bent post−vent region eye photoreceptor cell phenotype Blepharophimosis abnormal(ly) decreased size mandibular arch skeleton retina phenotype Biliary hyperplasia abnormal(ly) decreased length Kupffer's vesicle cilium denticle belt phenotype Behavioral abnormality abnormal(ly) increased width notochord neuromuscular junction phenotype behavior/neurological phenotype abnormal(ly) decreased pigmentation melanocyte eye color defective Ataxia ventricular system development phenotype Drosophila germline cell phenotype absent beta−galactosidase protein abnormal(ly) decreased pigmentation whole organism decreased occurrence of cell division Abnormality of the rib cage abnormal(ly) decreased length trunk uncoordinated Abnormality of the pinna fin regeneration phenotype interommatidial bristle phenotype Abnormality of the lumbar spine sprouting angiogenesis phenotype increased cell number Abnormality of skeletal morphology somitogenesis phenotype Drosophila intersegmental nerve phenotype Abnormality of dental color abnormal(ly) increased width somite male semi−sterile abnormal zygomatic bone morphology abnormal(ly) grey yolk locomotor rhythm defective Abnormal vestibulo−ocular reflex abnormal(ly) decreased size otic vesicle behavior defective abnormal urine osmolality abnormal(ly) disrupted angiogenesis chorion phenotype Abnormal urinary color vasculature development phenotype dendrite phenotype abnormal reflex dorsal/ventral pattern formation phenotype nucleus phenotype abnormal myocardial fiber morphology abnormal(ly) process quality heart looping centrosome phenotype Abnormal muscle morphology abnormal(ly) morphology intersegmental vessel Drosophila border follicle cell phenotype abnormal muscle electrophysiology abnormal(ly) shape somite flight defective abnormal lysosome physiology abnormal(ly) curved whole organism anterior−posterior axis spindle phenotype abnormal long bone epiphyseal plate morphology abnormal(ly) morphology eye Drosophila scutellar bristle phenotype Abnormal leukocyte morphology abnormal(ly) kinked post−vent region Drosophila embryonic/larval dorsal trunk phenotype abnormal kidney morphology abnormal(ly) disrupted determination of heart left/right asymmetry dorsocentral bristle phenotype Abnormal heart valve morphology abnormal(ly) necrotic whole organism Drosophila chaeta phenotype abnormal glycosaminoglycan level abnormal(ly) increased curvature post−vent region microchaeta phenotype abnormal foot pad morphology abnormal(ly) dorsalized whole organism wing margin bristle phenotype abnormal cervical vertebrae morphology abnormal(ly) decreased mobility whole organism Drosophila embryonic/larval somatic muscle phenotype abnormal cardiac muscle tissue morphology abnormal(ly) bent whole organism adult brain phenotype Abdominal distention 0 200 400 600 800 0 5000 10000 15000 20000 0 1 2 3 4 # occurances # occurances # occurances G Homo sapiens H Mus musculus I Mus musculus domesticus
Intellectual disability no abnormal phenotype detected decreased body size Global developmental delay Decreased body weight kinked tail Seizures preweaning lethality, complete penetrance Tremor Muscular hypotonia premature death Decreased body weight Short stature decreased body size Male infertility Generalized hypotonia nervous system phenotype Double outlet right ventricle Nystagmus immune system phenotype Gait disturbance Microcephaly behavior/neurological phenotype prenatal lethality, complete penetrance Failure to thrive Failure to thrive Microphthalmia Scoliosis homeostasis/metabolism phenotype Hearing impairment Strabismus reproductive system phenotype premature death Ataxia Male infertility diluted coat color Sensorineural hearing impairment postnatal lethality, incomplete penetrance belly spot Spasticity preweaning lethality, incomplete penetrance circling Hearing impairment embryonic lethality during organogenesis, complete penetrance Hyperactivity Cataract embryonic growth retardation Cleft palate Hyperreflexia Mortality/Aging short tail Hypertelorism neonatal lethality, complete penetrance Abnormality of hair pigmentation Ptosis Hyperactivity Micrognathia Micrognathia hypoactivity Ventricular septal defect Cryptorchidism increased or absent threshold for auditory brainstem response Short nose Optic atrophy decreased embryo size atrioventricular septal defect Visual impairment postnatal lethality, complete penetrance increased or absent threshold for auditory brainstem response High palate Behavioral abnormality Polydactyly Hepatomegaly Adipose tissue loss Syndactyly Dysarthria cardiovascular system phenotype Left ventricular hypertrophy Low−set ears decreased body length Right ventricular hypertrophy Cleft palate Cardiomegaly Ataxia Respiratory insufficiency decreased lean body mass embryonic lethality during organogenesis, complete penetrance Anteverted nares impaired coordination Cataract Growth delay Abnormal heart morphology Right aortic arch Muscle weakness Gait disturbance hearing/vestibular/ear phenotype Wide nasal bridge Osteopenia head bobbing Intrauterine growth retardation Splenomegaly decreased birth body size Depressed nasal bridge decreased grip strength Anophthalmia Ventriculomegaly embryonic lethality, complete penetrance abnormal craniofacial morphology Motor delay growth/size/body region phenotype decreased mean corpuscular volume Feeding difficulties decreased circulating glucose level curly vibrissae Epicanthus hematopoietic system phenotype preweaning lethality, complete penetrance Feeding difficulties in infancy decreased vertical activity curly tail Anemia Female infertility Sparse hair Skeletal muscle atrophy Decreased testicular size limb grasping Downslanted palpebral fissures obsolete Glucose intolerance Increased HDL cholesterol concentration Delayed speech and language development decreased bone mineral content Hydronephrosis Splenomegaly Abnormal bone structure Curly hair Gait disturbance prenatal lethality, incomplete penetrance Increased body weight Recurrent respiratory infections count Hypoinsulinemia count postnatal lethality, complete penetrance count Frontal bossing decreased litter size 2000 perinatal lethality, complete penetrance 200 Macrocephaly 1000 prenatal lethality, complete penetrance Heterotaxy Cognitive impairment neonatal lethality, incomplete penetrance 1500 postnatal lethality, incomplete penetrance 150 Abnormality of cardiovascular system morphology 750 increased lean body mass postnatal lethality Brachydactyly Tremor 1000 head tossing 100
Phenotype 500 Phenotype Phenotype Talipes equinovarus Abnormality of cell physiology Dextrocardia 500 50 Hyporeflexia 250 Increased body weight Decreased HDL cholesterol concentration Ventricular septal defect Abnormal bleeding impaired balance Delayed skeletal maturation B lymphocytopenia Elevated tissue non−specific alkaline phosphatase Cerebellar atrophy abnormal kidney morphology absent startle reflex Short neck reduced female fertility abnormal tail morphology Tremor improved glucose tolerance increased percent body fat/body weight Dystonia Ataxia Anisocytosis Myopia increased total body fat amount reproductive system phenotype Hypoplasia of the corpus callosum Edema Obesity Glaucoma vision/eye phenotype Muscular ventricular septal defect Hydrocephalus abnormal embryo size Exencephaly Dysphagia perinatal lethality, incomplete penetrance Vascular ring Hypoplasia of penis Anemia Hypocholesterolemia Hypertension reduced male fertility embryonic lethality between implantation and somite formation, complete penetrance Obesity decreased testis weight domed cranium Abnormality of the dentition Abnormality of the liver decreased survivor rate Long philtrum perinatal lethality, complete penetrance Abnormal aortic arch morphology Flexion contracture embryonic lethality during organogenesis, incomplete penetrance Oligospermia Clinodactyly of the 5th finger embryonic lethality between implantation and somite formation, complete penetrance Increased blood urea nitrogen Atrial septal defect Hypotriglyceridemia Hypoplasia of the thymus Short nose Abnormality of the spleen Hypercholesterolemia Kyphosis Weight loss Abnormal eye morphology Conductive hearing impairment Hypocholesterolemia abnormal coat appearance Abnormality of retinal pigmentation embryonic growth arrest prenatal lethality, incomplete penetrance Intellectual disability, severe limb grasping Overriding aorta Cerebral atrophy embryonic lethality prior to organogenesis Osteopenia Renal insufficiency Elevated tissue non−specific alkaline phosphatase increased foot pad pigmentation Hypospadias Abnormal B cell morphology impaired limb coordination Microphthalmia skeleton phenotype Hypotrichosis Developmental regression increased heart weight Hypophosphatemia Hypertrophic cardiomyopathy Abnormal glucose homeostasis Hyperglycemia Abnormality of metabolism/homeostasis Decrease in T cell count Female infertility Peripheral neuropathy Recurrent bacterial infections Edema Fatigue Hepatic steatosis decreased mean corpuscular hemoglobin Blindness Oligospermia Abnormality of digit Hypogonadism lethality throughout fetal growth and development, complete penetrance Renal cyst EEG abnormality Hypercholesterolemia prenatal lethality Thrombocytopenia Decreased proportion of CD8−positive T cells increased systemic arterial blood pressure Cerebral cortical atrophy embryonic lethality prior to tooth bud stage Globozoospermia Photophobia decreased CD4−positive, alpha beta T cell number double outlet right ventricle with atrioventricular septal defect Pes cavus Cyanosis Corneal opacity Ophthalmoplegia Retinal dysplasia Abnormality of hair growth Elevated serum creatine kinase decreased cell proliferation decreased litter size Agenesis of corpus callosum Microphthalmia asthenozoospermia Babinski sign Leukopenia Abnormality of the hair Myopathy Pallor Abnormal spermatogenesis Umbilical hernia cellular phenotype Abnormal auditory evoked potentials 0 250 500 750 1000 1250 0 500 1000 1500 2000 0 50 100 150 200 # occurances # occurances # occurances J Rattus norvegicus K Saccharomyces cerevisiae S288C L Ovis aries
cardiac hypertrophy resistance to chemicals:decreased increased litter size salt−sensitive hypertension viable Decreased body weight competitive fitness:decreased Seizures Hypercholesterolemia resistance to chemicals:increased decreased susceptibility to hypertension vegetative growth:decreased rate premature death hypoactivity vacuolar morphology:abnormal Hypertriglyceridemia competitive fitness:increased abnormal ovulation Decrease in T cell count haploinsufficient abnormal circadian regulation of systemic arterial blood pressure heat sensitivity:increased weakness premature death chemical compound accumulation:increased Increased body weight chemical compound accumulation:abnormal Visual impairment decreased urine albumin level chemical compound accumulation:decreased Proteinuria inviable Tetraparesis Neurodegeneration vegetative growth:decreased increased systemic arterial blood pressure utilization of nitrogen source:decreased rate subcutaneous edema increased neutrophil cell number chronological lifespan:decreased increased heart weight desiccation resistance:decreased spongiform encephalopathy Increased blood urea nitrogen haploproficient Impaired proprioception starvation resistance:decreased small horns Hepatic fibrosis oxidative stress resistance:decreased Glomerulosclerosis toxin resistance:decreased Scoliosis Gait disturbance protein/peptide distribution:abnormal embryonic lethality chromosome/plasmid maintenance:decreased protein catabolic process phenotype decreased erythrocyte cell number cell death:decreased vascular smooth muscle hypertrophy protein/peptide accumulation:increased Prominent eyelashes Tubulointerstitial fibrosis metal resistance:decreased Renal cyst respiratory growth:decreased rate Paraparesis Reduced natural killer cell count auxotrophy pathological neovascularization respiratory growth:absent Myotonia oxidative stress invasive growth:increased obsolete Glucose intolerance colony sectoring:increased long limbs Monocytopenia innate thermotolerance:decreased Lymphopenia toxin resistance:increased limb joint phenotype Lymphocytosis chemical compound excretion:increased loss of dopaminergic neurons utilization of carbon source:decreased rate large horns Leukopenia stress resistance:increased Left ventricular hypertrophy chromosome/plasmid maintenance:abnormal Infertility increased serotonin level protein/peptide accumulation:decreased increased prepulse inhibition innate thermotolerance:increased increased skeletal muscle size increased liver triglyceride level endocytosis:decreased increased insulin secretion stress resistance:decreased increased skeletal muscle mass increased grooming behavior resistance to chemicals:normal increased gnawing activity hyperosmotic stress resistance:decreased Impaired proprioception increased dopamine level acid pH resistance:decreased impaired balance increased creatine kinase level replicative lifespan:increased Increased circulating free fatty acid level chronological lifespan:increased Ganglioside accumulation increased circulating aspartate transaminase level count oxidative stress resistance:increased count count increased circulating alanine transaminase level protein/peptide modification:decreased 4000 4 Gait disturbance increased cell proliferation 7.5 utilization of carbon source:decreased increased alcohol consumption mitochondrial morphology:abnormal 3000 3 Fragile skin Hypoplastic spleen UV resistance:decreased 5.0 2000 Hypoplasia of the thymus cell size:increased
Phenotype Phenotype Phenotype Failure to thrive 2 Hypertension 2.5 anaerobic growth:decreased 1000 Hyperinsulinemia respiratory growth:decreased extended life span 1 enlarged alveolar lamellar bodies sporulation efficiency:decreased Elevated tissue non−specific alkaline phosphatase RNA accumulation:decreased Exercise intolerance Elevated serum creatinine transposable element transposition:decreased Elevated circulating luteinizing hormone level killer toxin resistance:increased Distichiasis decreased vertical activity invasive growth:absent decreased urine protein level protein activity:increased Diminished movement decreased testis weight filamentous growth:decreased Decreased testicular size killer toxin resistance:decreased decreased glycogen catabolism rate Decreased serum leptin biofilm formation:decreased decreased metastatic potential sporulation:decreased darkened coat color decreased mean systemic arterial blood pressure resistance to enzymatic treatment:decreased decreased litter size fermentative growth:decreased rate Cognitive impairment decreased hematocrit RNA accumulation:increased decreased eosinophil cell number cell size:decreased Brain atrophy decreased circulating glucose level bud morphology:abnormal decreased brain tyrosine 3−monooxygenase activity cold sensitivity:increased Blindness decreased body length colony appearance:abnormal B lymphocytopenia telomere length:decreased Ataxia Albuminuria filamentous growth:increased Adipocyte hypertrophy cell cycle progression:abnormal absent horns abnormal vasodilation protein/peptide modification:increased abnormal spatial working memory sporulation:absent absent beta−galactosidase protein abnormal social play behavior metal resistance:increased abnormal response to transplant lipid particle morphology:abnormal Abnormality of the ovary Abnormal muscle tone alkaline pH resistance:decreased abnormal locomotor behavior sporulation:normal Abnormality of skin morphology abnormal learning/memory/conditioning mutation frequency:increased abnormal estrous cycle competitive fitness:normal Abnormality of central motor function abnormal depression−related behavior biofilm formation:increased abnormal blood−brain barrier function cell shape:abnormal abnormal translation abnormal exorbital lacrimal gland morphology transposable element transposition:increased abnormal dopamine level cellular morphology:abnormal abnormal spike wave discharge abnormal dendrite morphology ionic stress resistance:decreased abnormal craniofacial development growth in exponential phase:decreased rate abnormal sleep behavior abnormal craniofacial bone morphology nuclear morphology:abnormal abnormal pruritus abnormal conditioned taste aversion behavior cell cycle progression in G1 phase:increased duration abnormal conditioned emotional response replicative lifespan:decreased abnormal lipid level Abnormal circulating protein level vegetative growth:abnormal abnormal circulating free fatty acids level budding pattern:abnormal abnormal lipid homeostasis abnormal chloride level telomere length:increased abnormal cellular respiration mitochondrial genome maintenance:abnormal abnormal fertility/fecundity abnormal cardiac muscle relaxation viability:decreased abnormal brain development utilization of carbon source:absent Abnormal external genitalia abnormal body size respiratory growth:increased rate abnormal apoptosis actin cytoskeleton morphology:abnormal abnormal enzyme/coenzyme level abnormal amino acid level resistance to enzymatic treatment:increased 0.0 2.5 5.0 7.5 0 1000 2000 3000 4000 0 1 2 3 4 295 # occurances # occurances # occurances
296 Supplementary Fig. 4: The whole matrisome genome-phenome across species
297 (ungrouped)
298 All gene/phenotype associations of every species are shown for 44 species. The
299 number of unique gene-to-phenotype associations with every phenotype are shown
300 arranged in descending order without assembling phenotypes into groups.
18 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
301
302
303 Fig. 3: The degree of phenotypic impact of matrisome genes.
304 The number of unique phenotypes linked to individual matrisome gene is shown in
305 decreasing order for the top 50 genes. The gene name is displayed on the x-axis and
306 the number of unique phenotypes on the y-axis. The color of the bars refers to the
307 matrisome category for each gene.
308
19 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A Homo sapiens B Mus musculus
120
75 90
50 60
25 30 # distinct phenotypes this gene is involved in # distinct phenotypes this gene is involved in # distinct phenotypes this gene is involved
0 0
Il6 Il4 Il2 F2 Kitl INS Tnf Lep Il10 Ifng Ctsl Plg Il13 EDA Shh Fgf9 Ccl2 Plau Fgf8 Ntf3 Ins2 Fasl SHH IGF2 Bdnf Csf1 Chrd Fbn1 Spp1 Fstl1 Nrg1 FBN1 FGF8 GPC3 KAL1 TNXB GDF5 BMP4 CTSKGDF6 Vegfa Bmp4 Tgfb1 Fgf10 Nodal Tgfb2 Crim1 Bmp5 AGRN Hspg2Lama2 Wnt5a Mmp9 Lama5 Megf8Timp3 HSPG2 TGFB3 CCBE1 TGFB2 FGF17FRAS1 TGFB1 A2ML1 FBLN5 NGLY1 MASP1PLOD1 LAMA2LAMA3 Col1a1 Col2a1 Tnfsf11 Mmp14 Adipoq Col4a1Col4a2 WNT5A SMOC1MEGF8 Adam17 Fam20c COL1A1COL5A1COL2A1COL1A2 SEMA3ECOL3A1 COL4A1 PLXND1 COL6A3 FAM20C EFEMP2 COL6A2 COL11A1 COL13A1 COL11A2 COLEC11 ADAMTS2SERPINA1 C Danio rerio D Drosophila melanogaster
60
60
40
40 category
Collagens ECM Glycoproteins ECM Regulators ECM−affiliated Proteins Proteoglycans 20 20 Secreted Factors # distinct phenotypes this gene is involved in # distinct phenotypes this gene is involved in # distinct phenotypes this gene is involved
0 0
hh sli vn tsl trk tld tfc tll1 wg spi bnl pip dlp eys trol grk fog Mp pio chd fgf3 loxa dpp aos gbb dpy sca kuz Sdc knk Cp1 scw spz gpc4 shha gdf3 ndr2ndr1 agrn hhipigf2b fn1a igf2a ihha sulf1 sdc2 ctgfa lepa otog dally serp Burs tdgf1 fgf8a bmp4 sparc fbn2b gdf6a ntn1a LanA verm upd3 upd1amon Wnt5 wnt5b wnt11lamc1 lama1 wnt8a fgf10a bmper hspg2lama2 lama5 wnt3a Mmp1Mmp2 fs(1)N bmp2b bmp7a vegfaa bmp1acol6a1 plxnd1 Col4a1 LanB2 lamb1a col8a1a scube2 crispld2cxcl12b Cpr92A mmp14a Cpr100A CG6337CG6347CG6357 CG11828 dsx−c73A
E Caenorhabditis elegans
20
15
10
5 # distinct phenotypes this gene is involved in # distinct phenotypes this gene is involved
0
bli−5 lpr−5 cpl−1 acn−1 apl−ced1 −1adt−2epi−lam1 −1mlt−8wrt−4 lam−2mlt−7 sqt−dpy3 −2 mig−6 sqt−1 wrt−1 fbn−1 wrt−asp7 −asp3 −4 dig−1 mup−qua4 −1 zmp−2 emb−9 mua−pan3 −1 mlt−11 col−35col−92col−93col−94 gon−1 clec−1 mom−2 noahunc−2 −mfap52 −1 let−805 dpy−10 unc−71 col−131 noah−1 col−125col−154col−155 clec−178 309 Y65B4A.2
310 Supplementary Fig. 5: Degree of phenotypic impact of matrisome genes
311 (grouped)
312 The number of phenotypes that associate with each matrisome gene are shown in
313 decreasing order for the top 50 genes. This graph is complementary to Fig. 3., to show
314 the impact of phenotypic grouping on the number of phenotypes. The gene name is
315 displayed on the x-axis and the number of unique phenotypes on the y-axis. The color
316 of the bars refers to the matrisome category for each gene.
20 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
317
318
319 Fig. 4: Phenotypic fingerprint of the most conserved matrisome members
320 The most conserved genes displaying phenotypes across the most species are shown
321 in decreasing order (A). The table displays the human gene name, the number of
322 species presenting a phenotype associated with the matrisome gene, the overall
21 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
323 phenotypic signature observed for this gene, as well as the matrisome category and
324 division are shown. Circular dendrograms depict the five most conserved matrisome
325 genes in more detail (B-F). The human orthologue is shown in the middle connected
326 to the species, in which the gene was associated with a phenotype, and then each
327 connected to terminal nodes depicting the grouped phenotypes. Phenotypes that were
328 observed in more than one species are highlighted in red and the size of the terminal
329 node reflects the number of species this phenotype was observed in. At the bottom
330 right of each circular dendrogram panel a graphical summary is provided highlighting
331 the species in which by homology a phenotype group was observed (dark grey) or
332 absent (light grey).
333
22 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A B
Muscle tissue phenotype (Rattus norvegicus)
Integumentary and thricocutaneus
Adipose tissue phenotype (Rattus norvegicus)
Immune system phenotype
Complex brain phenotype (Mus musculus) Eye phenotype (Mus musculus)
Hearing impairment
Cellular level phenotype (Mus musculus)
Aggregation
abnormal parturition Apoptosis (Caenorhabditis elegans) gliosis
Localization variant ● ● Nervous and neural tissues affected (Caenorhabditis elegans)
Altered body size (Bos taurus) ● Poor viability (Caenorhabditis elegans)
● ● ● ● protein degradation variant Blood phenotype Atrophy phenotype (Mus musculus) axonal dystrophy ● Stress resilience phenotype
● Muscle tissue phenotype (Bos taurus) Ataxia and immobility (Mus musculus) Apoptosis (Canis lupus familiaris) ● ●
Ataxia and immobility (Canis lupus familiaris) Altered body size (Mus musculus) ● ●
Apoptosis (Mus musculus) dysmetria Rattus norvegicus ● Bos taurus Muscle tissue phenotype (Canis lupus familiaris) ● ● Altered body size (Mus musculus) Kinestetic derived phenotypes (Canis lupus familiaris) ● Caenorhabditis elegans ● Mus musculus Nervous and neural tissues affected (Canis lupus familiaris) ● ●
Adipose tissue phenotype (Mus musculus) rigidity ● Psychological phenotype (Canis lupus familiaris) Canis lupus familiaris ● abnormal(ly) increased weight whole organism Psychological phenotype (Homo sapiens) Canis lupus familiaris
● Systemic health decrease
Mus musculus Mstn Oropharingeal related phenotype Ctsd abnormal(ly) bent whole organism
● abnormal(ly) dead whole organism
Cardiovascular impairments Nervous and neural tissues affected (Homo sapiens) Homo sapiens abnormal lipid level Danio rerio ● Altered body size (Danio rerio)
Lung and respiratory system Danio rerio Circulating and secreted agents phenotype
Homo sapiens General metabolism phenotype Coloration variant Cardiovascular phenotype Drosophila melanogaster
Facial deformations Drosophila melanogaster Epithelial tissue phenotype
abnormal gluconeogenesis ● ● Cranial deformations Eye phenotype (Danio rerio)
●
Kinestetic derived phenotypes (Danio rerio) Immune system phenotype ●
Muscle tissue phenotype ● ● Breathing problems
Poor viability (Danio rerio) ● ● Complex brain phenotype (Homo sapiens) abnormal body composition Size variant Muscle tissue phenotype (Danio rerio) ●
Apoptosis (Drosophila melanogaster) Viability ● ● Brain phenotype
Developmental phenotype Cellular level phenotype (Drosophila melanogaster)
Cystic−like features Atrophy phenotype (Homo sapiens)
fertile
Ataxia and immobility (Homo sapiens)
Uncharacterized phenotype
Sterility / reproductive potential Uncharacterized phenotype Muscle tissue phenotype (Homo sapiens)
Eye phenotype (Drosophila melanogaster)
C D
Nervous and neural tissues affected (Mus musculus)
Kidney phenotype (Mus musculus)
Poor viability (Mus musculus)
Morphology altered (Mus musculus)
Eye phenotype (Mus musculus)
Inflamation phenotype
Developmental phenotype (Mus musculus)
Epithelial tissue phenotype
Membrane phenotype Cellular level phenotype (Mus musculus)
Altered body size (Caenorhabditis elegans) Lung and respiratory system Cellular level phenotype
Cellular level phenotype (Caenorhabditis elegans) Circulating and secreted agents phenotype
● ● Developmental phenotype (Caenorhabditis elegans) ● ● ● Expression variant ● ● Genetic information phenotype ● Morphology altered (Caenorhabditis elegans)
Blood phenotype (Mus musculus) ● Increased tissue transparency Hearing impairment
Poor viability (Caenorhabditis elegans) ● Eye phenotype (Mus musculus) ● Kinestetic derived phenotypes (Caenorhabditis elegans)
● ● Sterility / reproductive potential (Caenorhabditis elegans)
Developmental phenotype (Mus musculus) ● Morphology altered (Caenorhabditis elegans) ● ● abnormal(ly) quality somite Apoptosis ● Altered body size (Mus musculus) Muscle tissue phenotype (Caenorhabditis elegans) ● abnormal(ly) undulate notochord ● Caenorhabditis elegans
Poor viability Cranial deformations ● abnormal miniature endplate potential Caenorhabditis elegans Mus musculus Brain phenotype (Danio rerio) Mus musculus ● Size variant
cleft palate Kinestetic derived phenotypes Visual impairment Sterility / reproductive potential
Cardiovascular phenotype (Mus musculus) ● Danio rerio ● Morphology altered (Danio rerio) Urinary system phenotype abnormal(ly) disorganized somite Lamb2 Plod3
Muscle tissue phenotype Cardiovascular impairments Sclerosis abnormal(ly) quality notochord Danio rerio
● ● ● Nervous and neural tissues affected (Danio rerio) Homo sapiens Developmental phenotype (Danio rerio)
Brain phenotype (Mus musculus) Reflexes and coordination Drosophila melanogaster
● fertile Kinestetic derived phenotypes (Danio rerio)
Growth phenotype posterior lenticonus ● ● Membrane phenotype Sterility / reproductive potential (Drosophila melanogaster) Drosophila melanogaster Homo sapiens Fractures
●
Morphology altered (Danio rerio) Uncharacterized phenotype
microcoria ● Muscle tissue phenotype (Homo sapiens) ● ● Muscle tissue phenotype (Danio rerio) Facial deformations Viability
● Adhesion affected abnormality of the pinna ● ● Nervous and neuralNervous tissues affected
Developmental phenotype (Drosophila melanogaster) anteverted nares
Eye phenotype (Homo sapiens) Kidney phenotype (Homo sapiens) ● Drosophila−specific phenotype ● Hypoplasia phenotype ● Atrophy phenotype
Heart function impairment
Bone phenotype
Muscle tissue phenotype (Drosophila melanogaster)
Breathing problems
Edema occurence decreased palmar creases
Eye phenotype (Homo sapiens) Developmental phenotype (Homo sapiens)
Connective tissue phenotype
bruising susceptibility
Blood phenotype (Homo sapiens)
Oropharingeal related phenotype
Cardiovascular phenotype (Homo sapiens)
E F
Connective tissue phenotype (Mus musculus)
Cranial deformations (Mus musculus)
Cellular level phenotype (Mus musculus)
Blood phenotype (Mus musculus)
Altered body size (Mus musculus) anterior synechiae of the anterior chamber
Bone phenotype
Expression variant aphakia Adipose tissue phenotype
Nervous and neural tissues affected (Caenorhabditis elegans)
Adrenal gland phenotype (Mus musculus)Agenesis (Mus musculus) ● Poor viability (Caenorhabditis elegans) ● ● ● Cellular level phenotype
● ● Sterility / reproductive potential
abnormal(ly) aplastic habenular commissure
● Developmental phenotype
abnormal(ly) decreased process quality habenular commissure axon extension involved in axon guidance ● absent mullerian ducts abnormal(ly) decreased thickness habenular commissure Anemia ● abnormal secondary sex determination Sterility / reproductive potential
abnormal(ly) discontinuous habenular commissure
abnormal(ly) disrupted convergent extension Altered body size
Facial deformations Caenorhabditis elegans Systemic health decrease
Mus musculus Cellular level phenotype (Danio rerio) ● Caenorhabditis elegans
Endocrine phenotype
Connective tissue phenotype (Danio rerio) ● Aging−related phenotype
pain response defective Danio rerio Mus musculus cubitus valgus ● Cranial deformations (Danio rerio) Drosophila melanogaster
cleft palate Wnt4 ● Developmental phenotype (Danio rerio) Adam17
Atrophy phenotype absent schlemm's canal
cleft lip Oropharingeal related phenotype
Homo sapiens anterior commissure phenotype
● Cardiovascular impairments Blood phenotype (Homo sapiens)
commissure phenotype Drosophila melanogaster absent meibomian glands
Breathing problems Homo sapiens dendrite phenotype
diarrhea ● Blood vessel narrowing Developmental phenotype (Drosophila melanogaster)
Blood phenotype Drosophila−specific phenotype abnormal macrophage physiology ● erythema amenorrhea fertile ●
Kidney phenotype ● and thricocutaneus Integumentary Immune system phenotype abnormal chemokine level Morphology altered ● pustule ● Muscle tissue phenotype Inflamation phenotype
mushroom body alpha−lobe phenotype acne
mushroom body beta−lobe phenotype Altered body size (Homo sapiens)
Agenesis (Homo sapiens)
paronychia
Adrenal gland phenotype (Homo sapiens)
Poor viability (Drosophila melanogaster)
Nervous and neural tissues affected (Drosophila melanogaster) 334
335 Supplementary Fig. 6: Phenotypic fingerprint of the most conserved matrisome
336 members
337 Circular dendrograms depicting the cross-species phenotypic fingerprint of all human
338 matrisome genes for which at least one phenotype has been studied in at least one
339 other species. The human orthologue is shown in the middle connected to the species
340 in which the gene was associated with a phenotype, and then each connected to
341 terminal nodes depicting the grouped phenotypes. Phenotypes that were observed in
342 more than one species are highlighted in red and the size of the terminal node reflects
343 the number of species this phenotype was observed in.
23 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
A B
decreased abdominal adipose tissue amount
increased muscle weight
cellular phenotype
Ataxia (Mus musculus)
Decreased body weight Bradykinesia axonal dystrophy
Cachexia abnormal thymus morphology
Blindness
Decreased adiponectin level
cell death variant
abnormal parturition abnormal stomach wall morphology hypoosmotic shock hypersensitive Astrocytosis lethal
increased skeletal muscle mass necrotic cell death variant abnormal spleen white pulp morphology abnormal thrombosis neuron degeneration
protein aggregation variant cardiovascular system phenotype skeletal muscle tissue phenotype (Bos taurus) abnormal spleen B cell follicle morphology protein degradation variant ● ● transgene subcellular localization variant
skeletal muscle tissue phenotype (Canis lupus familiaris) Abnormal neuron morphology ● abnormal microglial cell morphology Abnormality of movement
abnormal intestinal mucosa morphology abnormal(ly) has extra parts of type heart towards cardiac muscle cell Ataxia (Canis lupus familiaris) ● Adipose tissue loss Dysmetria
increased neuron apoptosis abnormal ileum morphology abnormal(ly) increased size atrium Abnormality of muscle physiology Paraparesis Rattus norvegicus abnormal hair cycle abnormal crypts of Lieberkuhn morphology premature death
abnormal(ly) increased size cardiac ventricle Bos taurus Caenorhabditis elegans Psychomotor deterioration
abnormal cochlear ganglion morphology Mus musculus Canis lupus familiaris abnormal(ly) bent whole organism Abnormality of muscle fibers Canis lupus familiaris
abnormal(ly) dead whole organism abnormal(ly) increased size heart abnormal apoptosis abnormal(ly) decreased length whole organism Danio rerio abnormal antigen presentation abnormal(ly) decreased size eye Abnormality of femur morphology abnormal(ly) increased thickness cardiac ventricle Spasticity abnormal(ly) decreased size otolith
MSTN Sloping forehead CTSD abnormal(ly) decreased size semicircular canal
abnormal(ly) increased weight whole organism Danio rerio abnormal(ly) decreased thickness myoseptum abnormal trochanter morphology Sensory axonal neuropathy
abnormal(ly) disorganized trunk musculature muscle cell Mus musculus Rod−cone dystrophy abnormal(ly) disrupted myotube cell development immune response phenotype Rigidity Homo sapiens Drosophila melanogaster abnormal(ly) disrupted swim bladder inflation
abnormal skeletal muscle morphology Retinal atrophy Homo sapiens abnormal(ly) increased pigmentation integument
developmental rate defective abnormal(ly) lacks all parts of type retina towards retinal pigmented epithelium microvillus Respiratory insufficiency
Drosophila melanogaster abnormal(ly) morphology trunk musculature muscle
Premature closure of fontanelles abnormal(ly) uninflated swim bladder abnormal postnatal growth/weight/body size Uncharacterized phenotype
Microcephaly abnormal(ly) viability whole organism
Neuronal loss in central nervous system muscle structure development phenotype Skeletal muscle hypertrophy
Mental deterioration retinal pigment epithelium development phenotype
abnormal myocardium layer morphology Low−set ears adult brain phenotype
abnormal body composition cell death defective
decreased cell death Abnormal muscle morphology ● Drosophila spermatogonial cyst phenotype
abnormal circulating tumor necrosis factor level nurse cell phenotype Intellectual disability, severe egg chamber phenotype
abnormal lipid level eye phenotype abnormal enzyme/coenzyme level
fertile
increased cell death
abnormal gluconeogenesis Cerebral atrophy male germline cell phenotype
Intellectual disability, progressive Apnea
viable Cerebellar atrophy
abnormal hormone level
Ataxia (Homo sapiens)
Increased neuronal autofluorescent lipopigment
Uncharacterized phenotype
Abnormal glucose homeostasis
abnormal heart left ventricle morphology Abnormality of metabolism/homeostasis
C D
embryonic lethality during organogenesis, incomplete penetrance
embryonic lethality during organogenesis, complete penetrance
embryonic lethality prior to tooth bud stage
abnormal retinal outer plexiform layer morphology abnormal retinal photoreceptor morphology decreased threshold for auditory brainstem response
decreased total retina thickness
decreased amacrine cell number abnormal retinal rod cell morphology
abnormal retina inner limiting membrane morphology
Decreased body weight
embryonic growth retardation abnormal renal tubule epithelium morphology abnormal renal glomerulus basement membrane morphology abnormal retinal development Exencephaly
Albuminuria
decreased embryo size
abnormal cell nucleus morphology adult body morphology variant
abnormal(ly) process quality cell migration cell secretion variant abnormal(ly) process quality chromosome condensation
body wall muscle development variant embryonic lethal
abnormal neuromuscular synapse morphology clear Abnormal renal physiology Abnormality of skeletal morphology exploded through vulva dumpy
larval lethal embryonic lethal
germ cell compartment expansion variant Abnormality of epidermal morphology pattern of transgene expression variant Cleft palate germ cell compartment morphology variant abnormal vascular endothelial cell morphology receptor mediated endocytosis defective abnormal Muller cell morphology germ cell compartment multinucleate
abnormal(ly) absent MiP motor neuron axon abnormal miniature endplate potential germ cell compartment size variant
gonad morphology variant abnormal Reichert's membrane morphology abnormal(ly) behavioral quality of a process locomotion
larval lethal abnormal glomerular filtrationabnormal barrier kidney function morphology abnormal(ly) contractility trunk musculature locomotion variant abnormal preimplantation embryo development abnormal(ly) decreased length CaP motoneuron axon maternal sterile
muscle arm development defective abnormal(ly) degenerate notochord Abnormal eye physiology
myopodia present abnormal pericardium morphology Caenorhabditis elegans abnormal eye electrophysiology abnormal(ly) disrupted skeletal muscle contraction nuclei enlarged abnormal amacrine cell morphology organism development variant Abnormal neural tube morphology abnormal(ly) disrupted thigmotaxis Mus musculus Caenorhabditis elegans rachis morphology variant abnormal(ly) lacks parts or has fewer parts of type neural crest cell towards neural crest cell filopodium Strabismus Mus musculus abnormal(ly) broken skeletal muscle cell sarcolemma Abnormal lung morphology Stage 5 chronic kidney disease abnormal(ly) lethal (sensu genetics) whole organism abnormal(ly) decreased rate larval locomotory behavior
Abnormal eye morphology Proteinuria abnormal(ly) decreased rate thigmotaxis abnormal(ly) mislocalised CaP motoneuron axon Danio rerio Posterior lenticonus abnormal(ly) degenerate myotome abnormal embryo size abnormal(ly) mislocalised MiP motor neuron axon
Nystagmus LAMB2 abnormal(ly) degenerate somite PLOD3
abnormal(ly) mislocalised RoP motor neuron axon abnormal(ly) degeneration myotome skeletal muscle cell Nephrotic syndrome Homo sapiens abnormal cutaneous collagen fibril morphology
Myopia Danio rerio abnormal(ly) detached from somite myofibril towards somite abnormal(ly) morphology hindbrain abnormal cell morphology abnormal(ly) disorganized somite Muscular hypotonia abnormal(ly) morphology somite abnormal(ly) disrupted musculoskeletal movement Low−set ears Microcoria
abnormal(ly) dystrophic muscle abnormal(ly) movement behavioral quality whole organism Long philtrum Hypoproteinemia abnormal(ly) lethal (sensu genetics) whole organism Homo sapiens abnormal(ly) process quality motor neuron axon guidance Drosophila melanogaster Drosophila melanogaster abnormal(ly) malformed vertical myoseptum Hypoplasia of the iris J−shaped sella turcica abnormal(ly) process quality neural crest cell migration abnormal(ly) malformed vertical myoseptum basement membrane
Hypoplasia of the ciliary body abnormal(ly) mislocalised myotome myoseptum abnormal(ly) process quality spinal cord motor neuron differentiation Edema Generalized hypotonia abnormal(ly) morphology notochord Intrauterine growth retardation abnormal(ly) quality brain
abnormal(ly) present in fewer numbers in organism myotome skeletal muscle cell
abnormal(ly) quality hindbrain abnormal(ly) process quality muscle attachment Blindness Hearing impairment abnormal(ly) quality notochord Drosophila egg phenotype Diffuse mesangial sclerosis Areflexia Hypoplasia of the capital femoral epiphysis
abnormal(ly) quality trunk musculature Flat face fertile abnormal(ly) refractivity myotome
Drosophila endoderm phenotype Drosophila endoderm Uncharacterized phenotype cell adhesion defective Global developmental delay
Drosophila A1−7 ventral longitudinal muscle 1 phenotype viable
Drosophila chaeta phenotype Failure to thrive Abnormality of the pinna Drosophila embryonic Malpighian tubule phenotype Cataract Drosophila embryonic/larval dorsal vessel phenotype Anteverted nares
Drosophila embryonic/larval heart phenotype Arterial rupture
Drosophila embryonic/larval midgut primordium phenotype
Bruising susceptibility Drosophila embryonic/larval pericardial cell phenotype
Elbow flexion contracture embryonic/larval hypodermalembryonic/larval muscle phenotype cardial cell phenotype
Coarse hair
embryonic/larval alary muscle phenotype Downturned corners of mouth Drosophila wing phenotype embryonic somatic muscle phenotype
Diaphragmatic eventration Dilatation of the cerebral artery
embryonic midgut constriction phenotype Decreased palmar creases
Drosophila gonad phenotype
Drosophila visceral mesoderm phenotype Drosophila macrophage phenotype
Drosophila presumptive embryonic salivary gland phenotype
Drosophila metathoracic ventral longitudinal muscle 1 phenotype
E F
Abnormality of the ovary (Mus musculus)
abnormal enterocyte proliferation
Abnormal eye morphology
axial skeleton hypoplasia abnormal embryonic tissue morphology abnormal secondary sex determinationAbnormality of adrenal physiology abnormal reproductive system development
absent ovary capsule absent Mullerian ducts
abnormal dendritic cell physiology
abnormal pubis morphology
embryonic lethal
abnormal(ly) process quality cell fate specification neuron number variant
transgene expression increased
abnormal ovary development maternal sterile abnormal nephrogenic mesenchyme morphogenesis transgene expression reduced abnormal corneal epithelium morphology
vulval cell induction variant
● abnormal(ly) aplastic habenular commissure multiple anchor cells
abnormal(ly) decreased occurrence pharyngeal pouch protein localization to plasma membrane abnormal metanephric mesenchyme morphology abnormal oogenesis
abnormal(ly) decreased process quality embryonic viscerocranium morphogenesis organism development variant
abnormal(ly) decreased process quality habenular commissure axon extension involved in axon guidance
abnormal kidney mesenchyme morphology abnormal(ly) decreased thickness habenular commissure Abnormal cornea morphology sick
abnormal(ly) discontinuous habenular commissure
abnormal(ly) disrupted convergent extension abnormal kidney development pain response defective abnormal conjunctival sac morphology abnormal(ly) increased speed pharyngeal pouch cell migration abnormal female germ cell morphology
abnormal(ly) malformed ceratobranchial cartilage
abnormal craniofacial bone morphology Caenorhabditis elegans Blepharitis abnormal(ly) malformed pharyngeal pouch Caenorhabditis elegans Mus musculus abnormal circulating chemokine level abnormal chondrocyte morphology cell migration involved in gastrulation phenotype Drosophila melanogaster
Diarrhea Danio rerio embryonic viscerocranium morphogenesis phenotype Abnormal cartilage morphology
habenula development phenotype
abnormal bone marrow morphology abnormal chemokine level adult antennal lobe projection neuron phenotype Eosinophilia
abnormal adenohypophysis morphology WNT4 antennal lobe commissure phenotype ADAM17 Mus musculus antennal lobe phenotype Facial hirsutism Erythema abnormal canal of Schlemm morphology
anterior commissure phenotype Cubitus valgus Homo sapiens commissure phenotype Drosophila melanogaster Erythroderma Congenital diaphragmatic hernia Homo sapiens dendrite phenotype Cleft palate abnormal bronchiole epithelium morphology Drosophila adult antennal lobe phenotype Cleft lip Hematochezia
Drosophila alpha/beta Kenyon cell phenotype
Brachydactyly Drosophila antennal olfactory receptor neuron phenotype
Paronychia Drosophila fascicle phenotype abnormal branching involved in lung morphogenesis Bilateral lung agenesis Drosophila glomerulus phenotype
Pustule Drosophila intermediate longitudinal fascicle phenotype
Drosophila MP1 neuron phenotype abnormal brain vasculature morphology Aplasia of the vagina Aplasia/Hypoplasia of the fallopian tube Drosophila MP1 tract phenotype Thick nail
Aplasia of the uterus Amenorrhea Drosophila presumptive embryonic salivary gland phenotype abnormal bone remodeling
Acne ● Drosophila presumptive embryonic/larval central nervous system phenotype Villous atrophy fertile
lateral longitudinal fascicle phenotype Abnormal anterior eye segment morphology
lateral transverse muscle phenotype
Adrenal gland agenesis mushroom body alpha−lobe phenotype Abnormal aortic valve cusp morphology
Abnormality of cardiovascular system morphology
Abnormality of the vagina abnormal bone marrow cavity morphology Abnormality of the penis
abnormal aortic valve flow
abnormal atrioventricular valve morphology abnormal aortic valve morphology
Abnormality of the genital system
Abnormality of the adrenal glands
Abnormality of the endocrine system
Abnormality of the ovary (Homo sapiens)
344
345 Supplementary Fig. 7: Phenotypic fingerprint of the most conserved matrisome
346 members (ungrouped)
347 Circular dendrograms depicting the cross-species phenotypic fingerprint of all human
348 matrisome genes for which at least one phenotype has been studied in at least one
349 other species. This figure is complementary to Supplementary Fig. 6, but in contrast is
350 showing the phenotypic fingerprint for ungrouped phenotypes.
351
24 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
352
353 Fig. 5: Phenotypic implications of the structural matrisome categories on
354 vertebrates
355 The matrisome is grouped into multiple categories according to the localization and
356 composition of the extracellular proteins. The members of the ECM glycoprotein
357 category, collagens, and proteoglycans constitute the foundation of the matrisome
358 responsible for the structural integrity of the extracellular matrix. Here the most
359 abundant inter-species gene-to-phenotype associations are highlighted for H. sapiens,
360 M. musculus and D. rerio. Genes are shown as human orthologues in colored labels
25 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
361 while phenotype groups are given as plain text. The arrows connect genes to
362 phenotype groups if the gene has been associated with at least one phenotype
363 belonging to the phenotype group. The width and opacity of the connection reflects the
364 degree of conservation of the gene-to-phenotype association across species.
365
26 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
366
367 Fig. 6: Phenotypic implications of the entire matrisome
368 A selected subset of the most connected genes of the entire matrisome and
369 phenotypes across species are displayed. Genes are shown as human orthologues
370 color-coded by matrisome division, while phenotype groups are shown as plain text.
371 The arrows connect genes to phenotype groups when the gene is associated with at
372 least one phenotype of the phenotype group. The width and opacity of the connection
27 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
373 reflects the degree of conservation of the gene-to-phenotype association across
374 species.
375
28 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
ECM Glycoproteins ECM Regulators
Homo sapiens Mus musculus Bos taurus Homo sapiens Mus musculus Danio rerio
Morphology altered Morphology altered Morphology altered FAM20C FAM20C FAM20C
Altered body size Altered body size Altered body size
FRAS1 FN1 FRAS1 FN1 FRAS1 FN1
LAMC1 LAMC1 LAMC1 ADAM17 ADAM17 ADAM17 Morphology altered Morphology altered Morphology altered
Altered body size Altered body size Altered body size
Blood phenotype Blood phenotype Blood phenotype Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential
CTSV CTSV CTSV
Blood phenotype Blood phenotype Blood phenotype
FBN1 FBN1 FBN1 Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential
F2 F2 F2 PLG PLG PLG Cardiovascular impairments Cardiovascular impairments Cardiovascular impairments
LAMA2 LAMA2 LAMA2 Cardiovascular impairments Cardiovascular impairments Cardiovascular impairments
Danio rerio Drosophila melanogaster Caenorhabditis elegans Drosophila melanogaster Caenorhabditis elegans
Morphology altered Morphology altered Morphology altered FAM20C FAM20C
Altered body size Altered body size
FRAS1 FN1 FRAS1 FN1 FRAS1 FN1
LAMC1 LAMC1 LAMC1 ADAM17 ADAM17 Morphology altered Morphology altered
Altered body size Altered body size Altered body size
Blood phenotype Blood phenotype Sterility / reproductive potential Sterility / reproductive potential
CTSV CTSV
Blood phenotype Blood phenotype Blood phenotype
FBN1 FBN1 FBN1 Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential
F2 F2 PLG PLG Cardiovascular impairments Cardiovascular impairments Cardiovascular impairments
LAMA2 LAMA2 LAMA2 Cardiovascular impairments Cardiovascular impairments
ECM−affiliated Proteins Secreted Factors
Homo sapiens Mus musculus Homo sapiens Mus musculus
Immune system phenotype Immune system phenotype BMP4 BMP4
SHH SHH
Cellular level phenotype Cellular level phenotype SEMA3E SEMA3E
Altered body size Altered body size GPC3 GPC3
Blood phenotype Blood phenotype
Morphology altered Morphology altered
Sterility / reproductive potential Sterility / reproductive potential
PLXND1 PLXND1
Nervous and neural tissues affected Nervous and neural tissues affected Morphology altered Morphology altered
FGF8 FGF8
BMP2 BMP2 SEMA3A GPC4 SEMA3A GPC4
TGFB1 TGFB1
Cellular level phenotype Cellular level phenotype Blood phenotype Blood phenotype
Danio rerio Drosophila melanogaster Danio rerio Drosophila melanogaster
Immune system phenotype Immune system phenotype BMP4 BMP4
SHH SHH
Cellular level phenotype Cellular level phenotype SEMA3E SEMA3E
Altered body size Altered body size GPC3 GPC3
Blood phenotype Blood phenotype
Morphology altered Morphology altered
Sterility / reproductive potential Sterility / reproductive potential
PLXND1 PLXND1
Nervous and neural tissues affected Nervous and neural tissues affected Morphology altered Morphology altered
FGF8 FGF8
BMP2 BMP2 SEMA3A GPC4 SEMA3A GPC4
TGFB1 TGFB1
Cellular level phenotype Cellular level phenotype Blood phenotype Blood phenotype
Collagens Proteoglycans
Homo sapiens Mus musculus Danio rerio Homo sapiens Mus musculus Danio rerio
COL5A1 COL5A1 COL5A1 Connective tissue phenotype Connective tissue phenotype Connective tissue phenotype
BGN BGN BGN
Muscle tissue phenotype Muscle tissue phenotype Muscle tissue phenotype DCN DCN DCN
COL1A2 COL1A2 COL1A2
Morphology altered Morphology altered Morphology altered
Eye phenotype Eye phenotype Eye phenotype
HSPG2 HSPG2 HSPG2
Eye phenotype Eye phenotype Eye phenotype
COL1A1 COL1A1 COL1A1 ACAN ACAN ACAN
Morphology altered Morphology altered Morphology altered
Cardiovascular impairments Cardiovascular impairments Cardiovascular impairments
Blood phenotype Blood phenotype Blood phenotype
Developmental phenotype Developmental phenotype Developmental phenotype
Altered body size Altered body size Altered body size
COL3A1 COL3A1 COL3A1
COL2A1 COL2A1 COL2A1 IMPG2 IMPG2 IMPG2 376
377 Supplementary Fig. 8: Phenotypic implications of the matrisome categories
378 across species
379 For each matrisome category, the five genes and phenotypes with the highest
380 connectivity across species are highlighted for each taxon. Grouped phenotypes are
381 shown as plain text while genes are shown as colored labels always referring to their
29 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
382 human orthologues. If a gene has been associated with at least one phenotype
383 associated with the phenotype group they are connected with an arrow. The degree of
384 conservation of the gene-to-phenotype association is reflected by the width and opacity
385 of the connecting arrow.
386
30 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
Core matrisome
Homo sapiens Mus musculus Bos taurus
Blood phenotype Blood phenotype Blood phenotype COL4A1 COL4A1 COL4A1
Eye phenotype Eye phenotype Eye phenotype
LAMC1 Morphology altered LAMC1 Morphology altered LAMC1 Morphology altered
Muscle tissue phenotype Muscle tissue phenotype Muscle tissue phenotype
COL5A1 COL5A1 COL5A1
COL3A1 Developmental phenotype COL3A1 Developmental phenotype COL3A1 Developmental phenotype
HSPG2 HSPG2 HSPG2
Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential COL1A1 Cardiovascular impairments COL1A1 Cardiovascular impairments COL1A1 Cardiovascular impairments
LAMA2 LAMA2 LAMA2
COL2A1 Altered body size COL2A1 Altered body size COL2A1 Altered body size
FBN1 FBN1 FBN1
COL1A2 COL1A2 COL1A2
Connective tissue phenotype Connective tissue phenotype Connective tissue phenotype
Bone phenotype Bone phenotype Bone phenotype
Danio rerio Drosophila melanogaster Caenorhabditis elegans
Blood phenotype Blood phenotype Blood phenotype COL4A1 COL4A1 COL4A1
Eye phenotype Eye phenotype Eye phenotype
LAMC1 Morphology altered LAMC1 Morphology altered LAMC1 Morphology altered
Muscle tissue phenotype Muscle tissue phenotype Muscle tissue phenotype
COL5A1 COL5A1 COL5A1
COL3A1 Developmental phenotype COL3A1 Developmental phenotype COL3A1 Developmental phenotype
HSPG2 HSPG2 HSPG2
Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential COL1A1 Cardiovascular impairments COL1A1 Cardiovascular impairments COL1A1 Cardiovascular impairments
LAMA2 LAMA2 LAMA2
COL2A1 Altered body size COL2A1 Altered body size COL2A1 Altered body size
FBN1 FBN1 FBN1
COL1A2 COL1A2 COL1A2
Connective tissue phenotype Connective tissue phenotype Connective tissue phenotype
Bone phenotype Bone phenotype Bone phenotype
Matrisome−associated
Homo sapiens Mus musculus Rattus norvegicus
Immune system phenotype Immune system phenotype Immune system phenotype
LEP LEP LEP
Blood phenotype Blood phenotype Blood phenotype
TNF TNF TNF
TGFB1 TGFB1 TGFB1
BMP2 BMP2 BMP2
Cellular level phenotype Cellular level phenotype Cellular level phenotype
FGF8 Nervous and neural tissues affected FGF8 Nervous and neural tissues affected FGF8 Nervous and neural tissues affected
Sterility / reproductive potential Sterility / reproductive potential Sterility / reproductive potential
Morphology altered Morphology altered Morphology altered
Cardiovascular impairments Cardiovascular impairments Cardiovascular impairments
SHH SHH SHH Developmental phenotype Developmental phenotype Developmental phenotype WNT1 WNT1 WNT1 FGF10 FGF10 FGF10
BMP4 BMP4 BMP4 Eye phenotype Eye phenotype Eye phenotype Altered body size Altered body size Altered body size
WNT5A WNT5A WNT5A
Danio rerio Drosophila melanogaster
Immune system phenotype Immune system phenotype
LEP LEP
Blood phenotype Blood phenotype
TNF TNF
TGFB1 TGFB1
BMP2 BMP2
Cellular level phenotype Cellular level phenotype
FGF8 Nervous and neural tissues affected FGF8 Nervous and neural tissues affected
Sterility / reproductive potential Sterility / reproductive potential
Morphology altered Morphology altered
Cardiovascular impairments Cardiovascular impairments
SHH SHH Developmental phenotype Developmental phenotype WNT1 WNT1 FGF10 FGF10
BMP4 BMP4 Eye phenotype Eye phenotype Altered body size Altered body size
387 WNT5A WNT5A
388 Supplementary Fig. 9: Phenotypic implications of the matrisome divisions
389 across species
390 For each matrisome division, the ten genes and phenotypes with the highest
391 connectivity across species are highlighted for each taxon. Grouped phenotypes are
392 shown as plain text while genes are shown as colored labels always referring to their
31 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
393 human orthologues. If a gene has been associated with at least one phenotype
394 associated with the phenotype group they are connected with an arrow. The degree of
395 conservation of the gene-to-phenotype association is reflected by the width and opacity
396 of the connecting arrow.
397
32 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
398 Discussion
399 The recent collective scientific efforts focusing on deep phenotyping and phenome-
400 wide association studies revealed many novel genotype-to-phenotype relationships
401 [2,3,29]. Extracellular matrices are important for cellular and organismal function
402 [13,17]. However, the contribution of ECM gene variants to the phenotypic landscape
403 is unknown.
404 Here, we defined the ECM phenome of humans, mice, zebrafish, Drosophila,
405 and C. elegans, with extrapolations to three other species. We found over 42’558
406 matrisome genotype-to-phenotype relationships across the five defined matrisome
407 species and an additional 306 genotype-to-phenotype relationships in the three
408 undefined-matrisome species. We identified the phenotypic landscape of the
409 matrisome that is conserved among species and also species-specific phenomes. Our
410 analysis of the phenotypic fingerprints of genes revealed MSTN, CTSD, LAMB2,
411 HSPG2, and COL11A2 matrisome genes bearing the most associated phenotypes.
412 From these genotype-to-phenotype interactions, we have built networks linking
413 analogue phenotypes that might be driven by similar underlying molecular
414 mechanisms involving matrisome genes. We provide the ECM phenome as a platform
415 to use information gained from model organisms to implicate novel genotype-to-
416 phenotype relationships for humans.
417 One limitation of our analysis are our nearly-completed phenotype categorical
418 collections, which include the most frequent occurring phenotypes, but ignores some
419 infrequently occurring phenotypes during our manual curation process due to the large
420 number phenotypes and sometimes jargon-specific phenotype naming. We aimed to
421 functionally group phenotypes so that these become more comparable across species.
422 This includes also species-specific phenotypes. We manually added 156 broadly
33 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
423 defined phenotype categories, which captured 86.8 % of the original phenome.
424 However, while the grouping achieves its primary goal to unify the most frequent
425 phenotypes some of the less abundant phenotypes might be falsely grouped or remain
426 ungrouped. Modifications in the grouping system achieved a better assignment of
427 misclassified infrequent phenotypes, with little or no changes in the top phenotypic
428 categories, which adds reliability to our findings and interpretations in this present
429 study. All analysis including grouped or ungrouped are provided in the supplementary
430 sections, allowing researchers to build upon.
431 The strong points in the present study are our networks of matrisome genes
432 mapped to recorded phenotypes, which offers a conceptual framework for further
433 investigation. This phenome-based network allows the identification of conserved
434 underlying molecular interactions of matrisome genes across comparable phenotypes
435 and species. In particular, it enables queries on given human phenotypes or ECM
436 genes, which are provided with genotype-to-phenotype relationships across other
437 species, suggesting potential novel molecular targets or read-outs for phenotypic high-
438 throughput screens. For instance, early developmental or lethal genes that lead to
439 spontaneous abortion in humans, can be studied with their orthologue gene
440 counterparts in other species [10]. We provide the most conserved subset of the
441 matrisome-interactome dataset in graphical format highlighting the genes and
442 phenotypes with the highest degree of connectivity (Fig. 5 and 6; Supplementary Fig.
443 8 and 9). Such graphical interaction frameworks have been invaluable for all known
444 human gene associated phenotypes [10].
445 In our phenome analysis across species, we identified novel aging-related
446 phenotypes associated with matrisome gene variants in the top categories (Fig. 2).
447 Decline and remodeling of the matrisome is a major driver of aging and longevity
34 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
448 [18,30-32]. Recently, the human aging phenome has been defined revealing that
449 phenotypes associated with collagens or matrisome, such as facial wrinkles, kyphosis,
450 arthritis, and osteoporosis are under the most prevalent phenotypes recorded from 77
451 million elderly [8]. Furthermore, connective tissue diseases, such as Marfan syndrome
452 caused by mutation in FBN1 and Ehlers-Danlos syndrome, were clustered with
453 features of premature aging diseases (progeria) and aging [8]. In our analysis, we
454 found that the FBN1 gene has the most associated human phenotypes (Fig. 3 and
455 Supplementary Fig. 5).
456 The breath of the phenotypic fingerprints of some matrisome genes hints at the
457 conserved modularity of gene systems. Some of these orthologue phenotypes across
458 species suggest a repurposing of the underlying molecular systems through evolution.
459 Our network of these matrisome genes associated with these orthologues phenotypes
460 facilitate studying human phenotypes or even diseases in nonobvious model systems.
461 Thus, our computational analysis provides a systems-level approach to facilitate
462 mechanistic discoveries underlying ECM phenomes across species.
463
464
465
35 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
466 Author contributions
467 All authors participated in analyzing and interpreting the data. CYE and CS designed
468 the computational analysis and wrote the manuscript.
469
470 Author Information
471 The authors have no competing interests to declare. Correspondence should be
472 addressed to C. Y. E.
473
474 Acknowledgement
475 We thank members of the Ewald lab for critical reading of the manuscript,
476 Davide Vitiello for his help with the phenotypic grouping, and the Monarch
477 Initiative (https://monarchinitiative.org) for providing the indispensable
478 resources for the genotype-phenotype analysis performed in this work. Funding
479 for this study was from the Swiss National Science Foundation grant number
480 PP00P3_163898 to CS and CYE.
481
36 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
482 References
483 [1] J.C. Denny, L. Bastarache, D.M. Roden, Phenome-Wide Association Studies 484 as a Tool to Advance Precision Medicine, Annual Review of Genomics and 485 Human Genetics. 17 (2016) 353–373. doi:10.1146/annurev-genom-090314- 486 024956. 487 [2] D.M. Roden, Phenome-wide association studies: a new method for functional 488 genomics in humans, J. Physiol. (Lond.). 595 (2017) 4109–4115. 489 doi:10.1113/JP273122. 490 [3] W.S. Bush, M.T. Oetjens, D.C. Crawford, Unravelling the human genome- 491 phenome relationship using phenome-wide association studies, Nat Rev 492 Genet. 17 (2016) 129–145. doi:10.1038/nrg.2015.36. 493 [4] I. Lappalainen, J. Almeida-King, V. Kumanduri, A. Senf, J.D. Spalding, S. Ur- 494 Rehman, et al., The European Genome-phenome Archive of human data 495 consented for biomedical research, Nat Genet. 47 (2015) 692–695. 496 doi:10.1038/ng.3312. 497 [5] N. Freimer, C. Sabatti, The human phenome project, Nat Genet. 34 (2003) 498 15–21. doi:10.1038/ng0503-15. 499 [6] M.E. Samuels, Saturation of the human phenome, Curr. Genomics. 11 (2010) 500 482–499. doi:10.2174/138920210793175886. 501 [7] E. Krapohl, J. Euesden, D. Zabaneh, J.-B. Pingault, K. Rimfeld, S. von 502 Stumm, et al., Phenome-wide analysis of genome-wide polygenic scores, 503 Molecular Psychiatry. 21 (2016) 1188–1193. doi:10.1038/mp.2015.126. 504 [8] S.N. Andreassen, M. Ben Ezra, M. Scheibye-Knudsen, A defined human 505 aging phenome, Aging (Albany NY). 11 (2019) 5786–5806. 506 doi:10.18632/aging.102166. 507 [9] P. Eline Slagboom, N. van den Berg, J. Deelen, Phenome and genome 508 based studies into human ageing and longevity: An overview, Biochim 509 Biophys Acta Mol Basis Dis. 1864 (2018) 2742–2751. 510 doi:10.1016/j.bbadis.2017.09.017. 511 [10] K.-I. Goh, M.E. Cusick, D. Valle, B. Childs, M. Vidal, A.-L. Barabási, The 512 human disease network, Proc Natl Acad Sci USA. 104 (2007) 8685–8690. 513 doi:10.1073/pnas.0701361104. 514 [11] T.A. Peterson, E. Doughty, M.G. Kann, Towards precision medicine: 515 advances in computational approaches for the analysis of human variants, 516 Journal of Molecular Biology. 425 (2013) 4047–4063. 517 doi:10.1016/j.jmb.2013.08.008. 518 [12] A.G. Cirincione, K.L. Clark, M.G. Kann, Pathway networks generated from 519 human disease phenome, BMC Medical Genomics. 11 (2018) 75. 520 doi:10.1186/s12920-018-0386-2. 521 [13] R.O. Hynes, The extracellular matrix: not just pretty fibrils, Science. 326 522 (2009) 1216–1219. doi:10.1126/science.1176009. 523 [14] C. Frantz, K.M. Stewart, V.M. Weaver, The extracellular matrix at a glance, 524 Journal of Cell Science. 123 (2010) 4195–4200. doi:10.1242/jcs.023820. 525 [15] M.C. Lampi, C.A. Reinhart-King, Targeting extracellular matrix stiffness to 526 attenuate disease: From molecular mechanisms to clinical trials, Sci Transl 527 Med. 10 (2018) eaao0475. doi:10.1126/scitranslmed.aao0475.
37 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
528 [16] I.N. Taha, A. Naba, Exploring the extracellular matrix in health and disease 529 using proteomics, Essays Biochem. 63 (2019) 417–432. 530 doi:10.1042/EBC20190001. 531 [17] C. Bonnans, J. Chou, Z. Werb, Remodelling the extracellular matrix in 532 development and disease, Nat Rev Mol Cell Biol. 15 (2014) 786–801. 533 doi:10.1038/nrm3904. 534 [18] C.Y. Ewald, The Matrisome during Aging and Longevity: A Systems-Level 535 Approach toward Defining Matreotypes Promoting Healthy Aging, 536 Gerontology. (2019) 1–9. doi:10.1159/000504295. 537 [19] X. Wang, A.K. Pandey, M.K. Mulligan, E.G. Williams, K. Mozhui, Z. Li, et al., 538 Joint mouse-human phenome-wide association to test gene function and 539 disease risk, Nat Commun. 7 (2016) 10464. doi:10.1038/ncomms10464. 540 [20] G. Unlu, X. Qi, E.R. Gamazon, D.B. Melville, N. Patel, A.R. Rushing, et al., 541 Phenome-based approach identifies RIC1-linked Mendelian syndrome 542 through zebrafish models, biobank associations and clinical studies, Nat Med. 543 26 (2020) 98–109. doi:10.1038/s41591-019-0705-y. 544 [21] K.A. Shefchek, N.L. Harris, M. Gargano, N. Matentzoglu, D. Unni, M. Brush, 545 et al., The Monarch Initiative in 2019: an integrative data and analytic platform 546 connecting phenotypes to genotypes across species, Nucleic Acids Res. 48 547 (2020) D704–D715. doi:10.1093/nar/gkz997. 548 [22] R.O. Hynes, A. Naba, Overview of the matrisome--an inventory of 549 extracellular matrix constituents and functions, Cold Spring Harb Perspect 550 Biol. 4 (2012) a004903–a004903. doi:10.1101/cshperspect.a004903. 551 [23] A. Naba, K.R. Clauser, H. Ding, C.A. Whittaker, S.A. Carr, R.O. Hynes, The 552 extracellular matrix: Tools and insights for the “omics” era, Matrix Biol. 49 553 (2016) 10–24. doi:10.1016/j.matbio.2015.06.003. 554 [24] P. Nauroy, S. Hughes, A. Naba, F. Ruggiero, The in-silico zebrafish 555 matrisome: A new tool to study extracellular matrix gene and protein 556 functions, Matrix Biol. 65 (2018) 5–13. doi:10.1016/j.matbio.2017.07.001. 557 [25] M.N. Davis, S. Horne-Badovinac, A. Naba, In-silico definition of the 558 Drosophila melanogaster matrisome, Matrix Biology Plus. 4 (2019) 100015. 559 doi:10.1016/j.mbplus.2019.100015. 560 [26] A.C. Teuscher, E. Jongsma, M.N. Davis, C. Statzer, J.M. Gebauer, A. Naba, 561 et al., The in-silico characterization of the Caenorhabditis elegans matrisome 562 and proposal of a novel collagen classification, Matrix Biology Plus. (2019) 1– 563 13. doi:10.1016/j.mbplus.2018.11.001. 564 [27] B.D. Rodgers, D.K. Garikipati, Clinical, agricultural, and evolutionary biology 565 of myostatin: a comparative review, Endocr. Rev. 29 (2008) 513–534. 566 doi:10.1210/er.2008-0003. 567 [28] P. Benes, V. Vetvicka, M. Fusek, Cathepsin D--many functions of one 568 aspartic protease, Crit. Rev. Oncol. Hematol. 68 (2008) 12–28. 569 doi:10.1016/j.critrevonc.2008.02.008. 570 [29] H. Li, J. Auwerx, Mouse Systems Genetics as a Prelude to Precision 571 Medicine, Trends Genet. (2020). doi:10.1016/j.tig.2020.01.004. 572 [30] C.Y. Ewald, J.N. Landis, J. Porter Abate, C.T. Murphy, T.K. Blackwell, Dauer- 573 independent insulin/IGF-1-signalling implicates collagen remodelling in 574 longevity, Nature. 519 (2015) 97–101. doi:10.1038/nature14021.
38 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.06.980169; this version posted March 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
575 [31] D. Bakula, A.M. Aliper, P. Mamoshina, M.A. Petr, A. Teklu, J.A. Baur, et al., 576 Aging and drug discovery, Aging (Albany NY). 10 (2018) 3079–3088. 577 doi:10.18632/aging.101646. 578 [32] A.C. Teuscher, C. Statzer, S. Pantasis, M.R. Bordoli, C.Y. Ewald, Assessing 579 Collagen Deposition During Aging in Mammalian Tissue and in 580 Caenorhabditis elegans, Methods Mol Biol. 1944 (2019) 169–188. 581 doi:10.1007/978-1-4939-9095-5_13. 582
39