Figure S1. Status of the Functional Annotation of the 28392 Arabidopsis Thaliana Genes on November 2015. A) the Pie Charts Show
[Figure S1 panels. A: pie charts for Biological Process, Cellular Component and Molecular Function; legend: Experimentally validated, Predicted, Unknown. B: Venn diagram of Biological Process, Cellular Component and Molecular Function. Data labels in extraction order: 3212, 1136, 3177, 1771, 8799, 16381, 6117, 19098, 9481, 17775, 1896, 1750, 3382, 502, 3720, 483.]

Figure S1. Status of the functional annotation of the 28392 Arabidopsis thaliana genes in November 2015. A) The pie charts show the fraction of genes with experimentally validated (green), predicted (blue) and unknown (gray) function for biological process (BP), cellular component (CC) and molecular function (MF). B) Venn diagram showing the overlap of functionally characterized genes in the three domains of Gene Ontology (GO).

[Figure S2 panels: Biological process, Cellular Component, Molecular function. x-axis: Number of made predictions; y-axis: F-measure; color scale: Number of integrated networks.]

Figure S2. Combinatorial analysis of the ten gene co-function networks for BP (top), CC (middle) and MF (bottom). The x-axis indicates the number of predictions made, while the y-axis shows the F-measure. Point color indicates the number of integrated networks, ranging from one network (dark blue) to ten networks (bright yellow).

[Figure S3 panels: Biological process, Cellular Component, Molecular function. Left column y-axis: F-measure; right column y-axis: Made predictions; x-axis in all panels: % of genes with Gene Ontology term.]

Figure S3. Influence of available functional information on the F-measure and the number of predictions made. Left: relationship between the F-measure (y-axis) and the % of genes with available GO terms (x-axis) for BP (top left), CC (middle left) and MF (bottom left). Right: relationship between the number of predictions made (y-axis) and the % of genes with available GO terms (x-axis) for BP (top right), CC (middle right) and MF (bottom right). The lines show bootstrapped 68% confidence intervals from 100 permutations.

[Figure S4 panels, all for Biological process: ATTED-II, AraNet v2, GeneMANIA, BIOGRID, RNAseq. x-axis: Phylostrata (Green plants, Land plants, Vascular plants, Angiosperms, Eudicots, Rosids, Malvids, Brassicales, Arabidopsis, Arabidopsis thaliana); y-axis: % of genes with a neighbor.]

Figure S4. Relationship between the percentage of genes with a neighbor in a network (y-axis) and the phylostrata of the genes (x-axis). The example is given for ATTED-II (top), AraNet v2 (middle) and GeneMANIA (bottom), BP.

[Figure S5 panels. A: Genes 1-3; B: Genes A-C. Legend in both panels: Experimentally verified function, Network 1 prediction, Network 2 prediction, Network 3 prediction.]

Figure S5. Estimating the performance of Gene Ontology term predictions. A) An example of a GO term that can be predicted well. Three genes (1-3) experimentally assigned to the term are predicted to be assigned to this term in 9 cases (indicated by green +). B) An example of a GO term that cannot be predicted well. In only 2 out of 9 cases did a network predict that the experimentally assigned genes A-C are assigned to this term.

[Figure S6 panel. x-axis: Gene Ontology Jaccard Index value; y-axis: Number of gene pairs.]

Figure S6. Estimating the significant Gene Ontology Jaccard Index (GOJI) value. The histogram shows the relationship between GOJI values (x-axis) and the number of random gene pairs with each value (y-axis). The red line indicates the GOJI threshold (0.178) below which 99% of the random pairs are found.
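The thresholding idea behind Figure S6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes GOJI is the plain Jaccard index between the GO term sets of two genes, and that the threshold is the 99th percentile of GOJI values over randomly sampled gene pairs. The function names (`goji`, `goji_threshold`), the parameters and the toy annotations are hypothetical and do not come from the original.

```python
import math
import random


def goji(terms_a, terms_b):
    """Jaccard index between two GO term sets (assumed definition of GOJI)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        # Two unannotated genes share nothing; treat the similarity as 0.
        return 0.0
    return len(a & b) / len(union)


def goji_threshold(annotations, n_pairs=10000, quantile=0.99, seed=0):
    """Empirical quantile of GOJI over random gene pairs.

    `annotations` maps a gene id to its set of GO terms. The sampling
    scheme and the default parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    genes = sorted(annotations)
    values = []
    for _ in range(n_pairs):
        g1, g2 = rng.sample(genes, 2)  # two distinct random genes
        values.append(goji(annotations[g1], annotations[g2]))
    values.sort()
    # Smallest sampled value with at least `quantile` of the pairs at or below it.
    index = min(len(values) - 1, max(0, math.ceil(quantile * len(values)) - 1))
    return values[index]


# Toy annotations with made-up gene ids and GO ids, for illustration only.
toy_annotations = {
    "gene1": {"GO:1", "GO:2"},
    "gene2": {"GO:2", "GO:3"},
    "gene3": {"GO:4"},
    "gene4": set(),
}
toy_threshold = goji_threshold(toy_annotations, n_pairs=1000)
```

Under these assumptions, a gene pair whose GOJI exceeds the returned threshold would be treated as more functionally similar than 99% of random pairs; the value 0.178 reported in the caption comes from the authors' data and is not reproduced by the toy example.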