I. Evidence for the Weakness of a Pure Ageing Effect

Swire et al. The cellular geometry of growth drives the amino acid economy of Caenorhabditis elegans

SUPPORTING INFORMATION

I. Evidence for the weakness of a pure ageing effect.

We have shown that amino acid usage is driven by geometry rather than age per se: when we controlled for the effects of age-response we found that the correlation between the change in amino acid usage observed between cytoplasmic and membrane proteins and the change between early and late proteins was maintained ($r^2_{\text{age controlled}} = 0.88$, $p = 1.6 \times 10^{-10}$). By contrast, when we controlled for the effects of compartment this correlation collapsed ($r^2_{\text{compartment controlled}} = 0.19$, $p = 0.021$) (Figures 2c, d). In order to examine the robustness of these results we resampled our dataset 100,000 times. When we resample $r_{\text{compartment controlled}}$ we find the median correlation coefficient is 0.32 (p = 0.056, one-tailed Spearman’s rank correlation; the original unresampled correlation was 0.43) with widely separated 5th and 95th percentiles of -0.02 and 0.59.

Conversely, when we resample $r_{\text{age controlled}}$ the median correlation coefficient is 0.92 ($p = 1.9 \times 10^{-9}$, original correlation 0.94) with tight 5th and 95th percentiles of 0.87 and 0.96. In other words, it is not simply that after controlling for compartment the average differences between late and early proteins are small and only weakly similar (as in Figure 2d); it is also the case that individual early proteins differ greatly in composition from one another and individual late proteins differ greatly in composition from one another. This explains the very broad distribution of correlations in Figure S1, blue bars. Any pure ageing effect must, therefore, be weak and barely visible against the backdrop of other effects. Conversely, after controlling for protein age we see not only that the average differences between cytoplasmic and membrane proteins are large and strongly similar (as in Figure 2c), but also that individual cytoplasmic proteins are relatively similar in composition to one another and individual membrane proteins are relatively similar to one another (Figure S1, red bars). We conclude that ageing, in so far as it affects amino acid composition, is largely a by-product of shifting geometry.

Figure S1. Distributions following resampling of the Pearson’s correlation coefficients presented in Figures 2c and d. 99 genes were drawn randomly without replacement in each of the categories early-membrane, late-membrane, early-cytoplasm, late-cytoplasm. n=100,000.
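The resampling behind Figure S1 can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the data are synthetic, the function and key names (`resample_correlations`, `early_cyt`, and so on) are hypothetical, and it assumes that the age-controlled correlation compares log(cytoplasmic/membrane) computed within early genes against the same ratio within late genes, with the compartment-controlled correlation defined analogously. The iteration count is reduced from 100,000 to keep the example quick.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_ratio(genes_a, genes_b):
    """Per amino acid: log(mean usage in genes_a / mean usage in genes_b)."""
    k = len(genes_a[0])
    mean_a = [sum(g[i] for g in genes_a) / len(genes_a) for i in range(k)]
    mean_b = [sum(g[i] for g in genes_b) / len(genes_b) for i in range(k)]
    return [math.log(a / b) for a, b in zip(mean_a, mean_b)]


def resample_correlations(groups, n_draw, n_iter, rng):
    """Draw n_draw genes without replacement from each group, n_iter times.

    Returns two lists of correlations (ASSUMED definitions, see lead-in):
    age controlled and compartment controlled.
    """
    r_age, r_comp = [], []
    for _ in range(n_iter):
        s = {name: rng.sample(genes, n_draw) for name, genes in groups.items()}
        # Age controlled: compartment effect among early vs among late genes.
        r_age.append(pearson(log_ratio(s["early_cyt"], s["early_mem"]),
                             log_ratio(s["late_cyt"], s["late_mem"])))
        # Compartment controlled: age effect in cytoplasm vs in membrane.
        r_comp.append(pearson(log_ratio(s["late_cyt"], s["early_cyt"]),
                              log_ratio(s["late_mem"], s["early_mem"])))
    return r_age, r_comp


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def make_genes(rng, base, n_genes):
    """Synthetic genes: noisy copies of a base composition, renormalised."""
    genes = []
    for _ in range(n_genes):
        raw = [b * math.exp(rng.gauss(0, 0.3)) for b in base]
        total = sum(raw)
        genes.append([v / total for v in raw])
    return genes


# Synthetic data: compartments differ in composition, age classes do not.
rng = random.Random(1)
cyt_base = [rng.uniform(0.5, 1.5) for _ in range(20)]
mem_base = [rng.uniform(0.5, 1.5) for _ in range(20)]
groups = {
    "early_cyt": make_genes(rng, cyt_base, 150),
    "late_cyt": make_genes(rng, cyt_base, 150),
    "early_mem": make_genes(rng, mem_base, 150),
    "late_mem": make_genes(rng, mem_base, 150),
}
r_age, r_comp = resample_correlations(groups, n_draw=99, n_iter=200, rng=rng)
for label, rs in (("age controlled", r_age), ("compartment controlled", r_comp)):
    print(label, percentile(rs, 5), percentile(rs, 50), percentile(rs, 95))
```

Because the synthetic data contain a compartment effect but no ageing effect, the sketch only illustrates the mechanics of the procedure; it says nothing about the values reported above.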

II. Results of the four regression models for the metabolome correlations.

When we estimate the rate of change in the relative prevalence of an amino acid in the metabolome we are forced to model the relationship with time. As in the case of the transcriptomic data, we used four models (see Materials and Methods); here we present the results for all four (Table S1). We prefer the linear model (Figure 4), both because it is the most intuitive and because it presented the best fit for more amino acids than did any of the other models (Table S1, column 2). Note that the power and exponential models give stronger results (with respect to our hypotheses) than does the linear.

| model | nbest | $r^2_{\text{age}}$ | p, age | $r^2_{\text{geometry}}$ | p, geometry |
|---|---|---|---|---|---|
| linear | 5 | 0.39 | 0.0087, 0.0061 | 0.41 | 0.0067, 0.00092 |
| log | 4 | 0.35 | 0.014, 0.019 | 0.38 | 0.0096, 0.0044 |
| power | 3 | 0.39 | 0.012, 0.0052 | 0.54 | 0.0021, 0.00056 |
| exponential | 2 | 0.49 | 0.0040, 0.0044 | 0.63 | 0.00060, 0.00042 |

TABLE S1. Results of four models for the relation between amino acid metabolome levels and time. nbest gives the number of the 14 amino acids for which the particular model gives the best fit in Figure 3. $r^2_{\text{age}}$ gives the correlation coefficient between (i) the slope between amino acid pool sizes and time (according to the particular time model) and (ii) the log(late/early) transcriptomic vector. In the case of the linear model, this was presented as Figure 4a. $r^2_{\text{geometry}}$ gives the correlation coefficient with the log(cytoplasmic/membrane) transcriptomic vector (i.e. Figure 4b). p values are one-tailed Pearson correlations followed by one-tailed Spearman Rank correlations.
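To illustrate how four time models can be compared for a single amino acid, here is a minimal Python sketch. The exact model definitions are given in the paper's Materials and Methods and are not reproduced here, so the functional forms below (each fitted by ordinary least squares after the stated transformation), the use of r² on the transformed scale as the measure of fit, and the time points and pool sizes are all assumptions made for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope, sxy ** 2 / (sxx * syy)


def identity(v):
    return v


# ASSUMED model forms (transform of time, transform of pool size):
#   linear:       y = a + b*t
#   log:          y = a + b*ln(t)
#   power:        ln(y) = a + b*ln(t)
#   exponential:  ln(y) = a + b*t
TRANSFORMS = {
    "linear": (identity, identity),
    "log": (math.log, identity),
    "power": (math.log, math.log),
    "exponential": (identity, math.log),
}


def fit_models(times, levels):
    """Fit all four models; times and levels must be strictly positive."""
    results = {}
    for name, (f_time, f_level) in TRANSFORMS.items():
        a, b, r2 = fit_line([f_time(t) for t in times],
                            [f_level(v) for v in levels])
        results[name] = {"intercept": a, "slope": b, "r2": r2}
    return results


def best_model(results):
    """Name of the model with the highest r² on its own transformed scale."""
    return max(results, key=lambda name: results[name]["r2"])


# Hypothetical example: five time points, pool sizes that grow exponentially.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
levels = [2.0 * math.exp(0.3 * t) for t in times]
fits = fit_models(times, levels)
best = best_model(fits)
print(best, fits[best]["slope"])
```

In Table S1 the slopes from a single model, fitted to every amino acid, are what is correlated with the transcriptomic vectors; `best_model` is meant to mirror only the tally in the nbest column.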

III. Predicting the transcriptome is more reliable than predicting the metabolome.

It appears that the geometric hypothesis explains less of the variation in pools of free amino acids (Figure 4b) than it does variation in the amino acid proportions of mRNAs (Figure 2b). To a certain extent we expect this: after all, free pools of amino acids will not solely be used for protein synthesis. Moreover, protein expression levels vary over five orders of magnitude, so a relatively small group of highly-expressed proteins will be responsible for most of the demand for free amino acids. However, our cytoplasm/membrane and late/early measures necessarily treat each protein as making an equal contribution, and thus fail to recognise any distinctive contribution from highly-expressed proteins.

Given this, it is remarkable that the correlations in Figure 4 are as high as they are, and this suggests that the idiosyncratic requirements of a few highly-expressed proteins do not swamp the average changes captured in our ratios.

In any case, we should be careful not to overemphasise the distinction in correlations, for there are two sources of noise that will diminish the correlations in Figure 4 compared to that in Figure 2b. First, the Y-axis in Figure 4 involves the estimate of a slope on the basis of five time-point readings, and this will be inherently less accurate than the amino acid usage proportions, averaged over thousands of proteins, which form the Y-axis in Figure 2b. Second, when estimating the Y-axis values in Figure 4, we must measure slopes; this entails choosing one model, and the linear model that we have chosen is best for only 5/14 amino acids (Table S1), which means that the Y-axis estimates for the other 9 will be especially noisy. By contrast, the Y-axis values in Figure 2b are based on the consensus of four models, as we do not have to use measurements of slopes directly, but rather can take the consensus of four models as to the direction (+/-) of slopes to divide the data into the two classes of late/early genes.

IV. Selection for metabolic efficiency does not explain the change in amino acid usage during worm ontogeny

Our hypothesis is that the ontogeny of amino acid usage is driven by changing cellular geometry. An alternative hypothesis is that it is driven by the changing energy requirements of the worm. If energy becomes more or less of a constraint on the worm as it grows, then the power of cost selection on its amino acid usage will wax or wane. If this is so, then amino acid synthesis costs should predict the change in amino acid usage with age (i.e. age-response, log(late/early)). And indeed they do: r = -0.59, p = 0.0058 (two-tailed).

However, cellular geometry (aka compartment) has far greater explanatory power than synthesis cost, since in a multiple correlation analysis in which age-response is the dependent variable, and synthesis cost and compartment, i.e. log(cytoplasmic/membrane), are independent variables, synthesis cost is no longer significant (two-tailed partial regression p values: compartment p = 0.00010, cost p = 0.57, interaction p = 0.36; and without interaction, compartment p = 0.000029, cost p = 0.73). Thus, there is no evidence for a cost effect independent of cellular geometry.
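The logic of this comparison can be illustrated with a partial correlation, a simpler relative of the multiple regression reported above and not the analysis the authors ran. The Python sketch below uses synthetic values for 20 amino acids (the variable names and the way the data are generated are assumptions) in which cost is related to age-response only through compartment, and compares the raw correlation between cost and age-response with the correlation that remains once compartment is held fixed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def residuals(y, z):
    """Residuals of the least-squares line y = a + b*z."""
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    b = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
         / sum((zi - mz) ** 2 for zi in z))
    return [yi - (my + b * (zi - mz)) for zi, yi in zip(z, y)]


# Synthetic data (NOT the paper's values): both age-response and cost
# depend on compartment, but not directly on each other.
rng = random.Random(0)
compartment = [rng.gauss(0, 1) for _ in range(20)]
age_response = [c + rng.gauss(0, 0.3) for c in compartment]
cost = [-c + rng.gauss(0, 1.0) for c in compartment]

raw = pearson(cost, age_response)
partial = partial_corr(cost, age_response, compartment)
print("raw r(cost, age-response):", raw)
print("partial r, compartment held fixed:", partial)
```

The partial correlation equals the correlation between the residuals of cost and of age-response after each has been regressed on compartment, which is the sense in which compartment is "controlled for" here.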