Neuron, Volume 105

Supplemental Information

A Genetic Model of the Connectome

Dániel L. Barabási and Albert-László Barabási

Organism    Neurons (N)  Synapses      TF     b = ⌈log2(N)⌉  Data Source
C. elegans  302          6398          934    9              (White et al., 1986; Reece-Hoyes et al., 2005, 2011; Varshney et al., 2011)
Fruit Fly   100,000      10^7          627    17             (Lagercrantz et al., 2010; Zhang et al., 2011; Zheng et al., 2018)
Mouse       7.09*10^6    1.28*10^11    1,457  23             (Ananthanarayanan et al., 2009; Zhang et al., 2011)
Rat         2*10^8       4.48*10^11    1,371  28             (Herculano-Houzel and Lent, 2005; Ananthanarayanan et al., 2009; Zhang et al., 2011)
Cat         7.63*10^8    6.1*10^12     887    30             (Ananthanarayanan et al., 2009; Zhang et al., 2011)
Human       8.1*10^9     1.64*10^14    1,391  33             (Tang et al., 2001; Azevedo et al., 2009; Vaquerizas et al., 2009)

Supplementary Table 1: Neurons, synapses, and transcription factors. (Related to Results: "Encoding Neuronal Identity" and Star Methods: "Brain Sizes Across Organisms") We compiled from the literature the number of neurons, synapses, and transcription factors (TFs) for various organisms. For each organism, we also show b = ⌈log2(N)⌉, the minimum number of TFs required to give every neuron in a brain of N neurons a unique identity. Notice that the number of TFs in each organism exceeds b, indicating that TF combinations can reasonably offer a unique cellular identity to each neuron.
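The b column can be rechecked directly from the neuron counts. The sketch below assumes that b is the ceiling of log2(N), which is consistent with every row of the table; the dictionary layout and the helper name `min_identity_bits` are illustrative choices, not part of the original analysis.

```python
import math

# (neurons N, transcription factors) as listed in Supplementary Table 1
organisms = {
    "C. elegans": (302, 934),
    "Fruit fly": (100_000, 627),
    "Mouse": (7.09e6, 1457),
    "Rat": (2e8, 1371),
    "Cat": (7.63e8, 887),
    "Human": (8.1e9, 1391),
}

def min_identity_bits(n_neurons):
    """Smallest b with 2**b >= n_neurons (assumed reading of b = [log2(N)])."""
    return math.ceil(math.log2(n_neurons))

bits = {name: min_identity_bits(n) for name, (n, _) in organisms.items()}

for name, (n, tfs) in organisms.items():
    # With b binary on/off TFs, 2**b distinct combinations are available.
    print(f"{name:10s}  b = {bits[name]:2d}  TFs = {tfs:5d}  TFs > b: {tfs > bits[name]}")
```

In every organism the TF count exceeds b by one to two orders of magnitude, which is the observation the caption draws on.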

a)
                                             Bicliques
Connection type    Dataset                   Connectome (number)  Random (number)  Z-Score
Chemical Synapses  Varshney et al.  7.86     2,968                2,723.6±44.0     5.5573
Chemical Synapses  Cook et al.      12.6     9,430                7,569.4±92.3     20.149
Gap Junctions      Varshney et al.  3.70     344                  314.34±8.90      3.334
Gap Junctions      Cook et al.      7.49     1,706                1,059.5±82.01    7.883

b), c) [Figure panels: heatmaps of Source Size versus Destination Size. Legend: Z > 3.4; Z < -3.4; -3.4 < Z < 3.4; No BCs.]

Supplementary Figure 1: Bicliques in C. elegans connectome reconstructions. (Related to Star Methods: "Quantification and Statistical Analysis") a) Biclique Numbers in Each Studied Connectome. The entries show the number of maximal bicliques found in the real (Connectome) and randomized (Random) networks, together with the overall Z-score. b) Biclique Size Distributions Under Degree-Preserving Randomization. Z-scores of maximal biclique sizes under degree-preserving randomization. Blue squares show maximal bicliques that are underrepresented in the real connectome compared to the random reference (Z < -3.4); these are small maximal bicliques (2 → n or n → 2) that emerge frequently by chance. Red squares show maximal bicliques that are overrepresented in the real data (Z > 3.4). Black squares mark maximal bicliques that exist but whose counts do not differ significantly from the random reference (-3.4 < Z < 3.4). White regions correspond to maximal biclique sizes that are absent from the connectome. Significance was set at Z = 3.4 to correct for multiple testing across the biclique types, with the most stringent cutoff used for all datasets for consistency. The higher density of the Cook datasets relative to the Varshney reconstruction is apparent in the larger maximal bicliques found, as well as in the reduced number of non-significant maximal bicliques. Although the Cook matrices are less sparse, their larger maximal bicliques are more significant, as expected. c) Biclique Sizes Under Erdős-Rényi (ER) Distributions. Z-scores of maximal biclique sizes compared to ER random networks with matching density and node number. Because ER random networks have little structure, the increased significance of maximal biclique sizes is expected.
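The Z-scores in panel (a) can be approximately recovered from the reported counts. The sketch below assumes the standard definition Z = (observed - mean of random) / (standard deviation of random); this reproduces the tabulated values up to small differences that plausibly come from rounding of the reported means and deviations. The `classify` helper and its color labels mirror the legend of panels (b) and (c) and are illustrative, not the authors' code.

```python
def z_score(observed, random_mean, random_std):
    """Standard Z-score of an observed count against a random reference."""
    return (observed - random_mean) / random_std

def classify(z, cutoff=3.4):
    """Map a per-size Z-score to the legend colors of panels (b) and (c).

    z is None when no maximal biclique of that size exists ("No BCs").
    """
    if z is None:
        return "white"
    if z > cutoff:
        return "red"    # overrepresented in the real connectome
    if z < -cutoff:
        return "blue"   # underrepresented in the real connectome
    return "black"      # present, but not significant

# (observed, random mean, random std, reported Z) from panel (a)
panel_a = {
    ("Chemical Synapses", "Varshney et al."): (2968, 2723.6, 44.0, 5.5573),
    ("Chemical Synapses", "Cook et al."): (9430, 7569.4, 92.3, 20.149),
    ("Gap Junctions", "Varshney et al."): (344, 314.34, 8.90, 3.334),
    ("Gap Junctions", "Cook et al."): (1706, 1059.5, 82.01, 7.883),
}

for (kind, dataset), (obs, mu, sd, reported) in panel_a.items():
    z = z_score(obs, mu, sd)
    print(f"{kind:18s} {dataset:16s} Z = {z:7.3f} (reported {reported})")
```

The same `classify` call with `cutoff=3.7` would correspond to the thresholds shown in the legend of Supplementary Figure 2.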

a), b) [Figure panels: heatmaps of Source Size versus Destination Size. Legend: Z > 3.7; Z < -3.7; -3.7 < Z < 3.7; No BCs.]

Supplementary Figure 2: Bicliques in connectomes of other organisms. (Related to Star Methods: "Biclique Motifs in Other Organisms") (a) Biclique sizes under degree-preserving randomization for Ciona intestinalis. Blue squares indicate Z-scores below -3.7, red squares indicate Z-scores above 3.7, and black squares indicate non-significant Z-scores (-3.7 < Z < 3.7).
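The captions rely on degree-preserving randomization as the null reference. One common way to build such a reference for a directed network is repeated edge swapping, sketched below; whether the original analysis used exactly this procedure is not stated here, so the function names, the swap count, and the toy graph are all assumptions made for illustration.

```python
import random
from collections import Counter

def degree_sequences(edges):
    """Return (out-degree, in-degree) counters for a directed edge list."""
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    return out_deg, in_deg

def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Rewire a directed simple graph while keeping every in- and out-degree.

    Each accepted swap turns a->b, c->d into a->d, c->b. Swaps that would
    create a self-loop or a duplicate edge are rejected.
    """
    rng = random.Random(seed)
    edge_list = sorted(set(edges))
    edge_set = set(edge_list)
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # guard against graphs with no valid swap
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set.difference_update([(a, b), (c, d)])
        edge_set.update([(a, d), (c, b)])
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list

# Hypothetical toy graph: 10 nodes, each with out-degree 2 and in-degree 2.
toy = [(i, (i + 1) % 10) for i in range(10)] + [(i, (i + 3) % 10) for i in range(10)]
shuffled = degree_preserving_shuffle(toy, n_swaps=50, seed=1)
print(degree_sequences(shuffled) == degree_sequences(toy))
```

Counting maximal bicliques in many such shuffled copies would give the mean and standard deviation against which the real counts are compared.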

a)-f) [Figure panels: heatmaps of Source Size versus Destination Size. Legend: Z > 3.4; Z < -3.4; -3.4 < Z < 3.4; No BCs.]

Supplementary Figure 3: Bicliques in network models. (Related to Results: "Bicliques in C. elegans" and Star Methods: "Bicliques in Network Models") (a) Bicliques in Erdős-Rényi model. We generated a random graph of 279 nodes and 4.5% density to match the size and average degree of the chemical connectome of C. elegans. As expected, we observe no significant maximal bicliques compared to degree-preserving randomizations. Z-score significance cutoffs for all of Supplementary Figure 3 were set to a more stringent 3.4 standard deviations to match the thresholds in Figure 2 and Supplementary Figure 1, even though far fewer biclique types were found in all plots of Supplementary Figure 3. (b) EDR model of C. elegans connectome. An exponential distance rule (EDR) network fit to the C. elegans connectome with λ = 10.88, compared to degree-preserving randomizations. (c-e) Networks generated using the scale-free model with N = 279 to match the size of the C. elegans connectome. The panels correspond to different densities, generated with (c) m = 5, (d) m = 8, and (e) m = 13, demonstrating the non-stationary nature of the resulting maximal bicliques. (f) Coreless Scale-Free Network. We removed the first m = 13 nodes (the core) from the network profiled in (e). The resulting network has fewer large maximal bicliques, and all maximal bicliques are statistically underrepresented or non-significant.
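The model networks in panels (a) and (c-f) can be approximated with a few lines of standard-library Python. This is a minimal sketch under stated assumptions: the ER graph is taken to be directed, the scale-free model is taken to be plain preferential attachment in which each new node links to m distinct existing nodes, and the "core" is taken to be the first m nodes. The function names and seeds are illustrative and the generator used for the original figure may differ in detail.

```python
import random

def erdos_renyi_directed(n, p, seed=0):
    """Directed G(n, p): each ordered pair u != v is linked with probability p."""
    rng = random.Random(seed)
    return {(u, v) for u in range(n) for v in range(n)
            if u != v and rng.random() < p}

def scale_free(n, m, seed=0):
    """Preferential attachment: every new node links to m distinct earlier nodes.

    Edges are stored as (new_node, earlier_node); treating them as directed
    or undirected is left to the caller (an assumption of this sketch).
    """
    rng = random.Random(seed)
    edges = set()
    targets = list(range(m))   # the first m nodes form the initial core
    degree_pool = []           # each node appears once per edge endpoint
    for new in range(m, n):
        edges.update((new, t) for t in targets)
        degree_pool.extend(targets)
        degree_pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # sample proportionally to current degree
            chosen.add(rng.choice(degree_pool))
        targets = list(chosen)
    return edges

def remove_core(edges, m):
    """Drop the first m nodes and every edge touching them."""
    return {(u, v) for (u, v) in edges if u >= m and v >= m}

er = erdos_renyi_directed(279, 0.045, seed=1)            # panel (a)
sf = {m: scale_free(279, m, seed=1) for m in (5, 8, 13)}  # panels (c)-(e)
coreless = remove_core(sf[13], 13)                        # panel (f)

print(f"ER density: {len(er) / (279 * 278):.4f}")
for m, edges in sf.items():
    print(f"scale-free m = {m:2d}: {len(edges)} edges")
print(f"coreless (m = 13): {len(coreless)} edges")
```

Each generated edge set could then be passed through the same biclique counting and degree-preserving comparison that is applied to the real connectome.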
