
Supplementary Information

Enhanced production in cyanobacteria reveals photosynthesis limitations

Xin Wang, Wei Liu, Changpeng Xin, Yi Zheng, Yanbing Cheng, Su Sun, Runze Li, Xin-Guang Zhu, Susie Y. Dai, Peter M. Rentzepis¹ and Joshua S. Yuan¹

¹To whom correspondence should be addressed

Peter M. Rentzepis: [email protected] (979)845-7250
Joshua S. Yuan: [email protected] (979)845-3016

Table of Contents:
Supplementary Methods
Supplementary Figures 1-6
Supplementary Dataset 1
Supplementary Tables 1-3
Supplementary Files 1-4
References


Supplementary Methods

Strains and plasmids construction. Strains and plasmids used in this study are summarized in Table S1. The maps and sequences of pWX1118 and pWX121 can be found in Files S2 and S3. Plasmids were constructed using Gibson Assembly (NEB, Ipswich, MA), and primers were designed using the NEBuilder Assembly Tool (http://nebuilder.neb.com/). Limonene synthase (LS) was codon optimized for S. elongatus expression and synthesized by IDT (Coralville, IA). S. elongatus genome neutral site I targeting plasmids were constructed based on pAM2991 (from Golden lab), and neutral site II targeting plasmids were constructed based on pAM1579 (from Addgene).

Cell absorbance spectra scan and optical density. 200 µL of wild-type or limonene-producing cells was used to measure whole-cell absorbance wavelength scans using an Epoch Microplate Spectrophotometer (BioTek Instruments Inc., Vermont). The absorbance scans were normalized by cell density (OD730). All measurements were determined by averaging triplicates of independent cultures. Error bars in figures represent standard deviations.

Model description and simulations. A kinetics model of MEP biosynthesis was developed by extending the C3 photosynthesis kinetics model developed by Xin et al. (1). The current model was built with SimBiology in MATLAB (version 2008a). The SUNDIALS solver of SimBiology was chosen to solve the system. The ODE solution provided by SUNDIALS is the time evolution of the concentration of each metabolite in each compartment, which in turn allows calculation of the different reaction fluxes. In our model, the carbon input into terpene synthesis was channeled from photorespiratory pathways through malate synthase. The glyoxylate was first converted to malate, and further to pyruvate by malate dehydrogenase (Fig. S6). All pyruvate channeled from photorespiration was used for terpene synthesis through a simplified MEP-derived terpene biosynthesis pathway (Fig. S6). The initial concentrations of metabolites and the kinetic parameters of the additional reactions used in this model are listed in the supplemental tables (Tables S2 and S3). The reaction rates of the additional enzymes were described using Michaelis-Menten kinetics (File S4). To control carbon flux for terpene synthesis, we used glycolate dehydrogenase to control carbon input, and assumed that the pyruvate used for the MEP pathway was generated only from photorespiration. By changing the Vmax of glycolate dehydrogenase, the carbon flux into the MEP pathway could be manipulated in the simulations.

Protein preparation for proteomics. S. elongatus wild-type and limonene-producing lines were grown to log phase in 1 L Roux culture bottles as described above. 100 mL of each line was collected in triplicate and centrifuged at 8000 rpm for 15 min at 4 °C. Cell pellets were resuspended in 2 mL Tris buffer (50 mM Tris-HCl, 10 mM CaCl2, 0.1 % Nonidet P-40, pH 7.6) supplemented with 1X Halt™ protease and phosphatase inhibitor cocktail (Thermo Scientific, Rockford, IL). Cells were lysed through 10 cycles of sonication (30 seconds on and 2 min off) on ice, and centrifuged at 13,000 × g for 30 min at 4 °C. The supernatant was collected and transferred into a new centrifuge tube. Protein concentration was determined by Bradford assay (Thermo Scientific, Rockford, IL) following the manufacturer's instructions.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS). Multidimensional Protein Identification Technology (MudPIT) based shotgun proteomics was performed similarly to Washburn et al. (2). For each sample, 100 μg of total protein was first incubated with 8 M urea and 5 mM dithiothreitol (DTT) at 37 °C for an hour. The protein was then diluted 4-fold using Tris buffer (50 mM Tris-HCl, 10 mM CaCl2, pH 7.6), and subjected to trypsin digestion using Mass Spectrometry Grade Trypsin Gold (Promega, Madison, WI) at 1:100 (w/w) at 37 °C for 18 hours. The digested peptides were cleaned up using a Sep-Pak C18 plus desalting column (Waters Corporation, Milford, MA). A biphasic strong cation exchange/reversed phase capillary column was prepared and loaded with the peptides using a pressure cell, followed by a two-dimensional peptide separation as described previously (3). Peptide fractions were analyzed on an LTQ ion trap mass spectrometer (Thermo Finnigan, San Jose, CA) operated in the data-dependent acquisition mode. An MS2 method was used to record the full mass spectra over the range of 300-1700 m/z. The five most abundant peaks were subjected to collision-induced dissociation fragmentation for MS/MS analysis. The proteomics analyses were carried out with triplicate biological samples and consolidated in the final data analysis.

Database search and spectral validation. A data processing pipeline was developed to analyze the peptide information. An MS2 file of tandem mass spectra was first extracted from the raw data, and subsequently used for database search by ProLuCID (4) (version 1.0). A protein database was generated that included S. elongatus proteins (2657 sequences) obtained from the UniProt database, 37 common contaminants, and their reversed sequences (5388 sequences in total). The reverse database was included as a quality control system to limit the false-positive discovery rate to 0.05. The ProLuCID search was performed using a precursor mass tolerance of 100 ppm, and no peptide modifications were included in the search. The peptide/spectrum matches were filtered using DTASelect (5) (v. 2.0) with the criteria of DelCN ≥ 0.08 coupled with XCorr ≥ 1.8 for charge state +1, ≥ 2.5 for charge state +2, and ≥ 3.5 for charge state +3. A minimum of two peptides was required to identify a protein.

Analysis of protein differential expression. Protein differential expression was analyzed in PatternLab (6) (version 4.0.0.38) with spectral counts indicating protein abundance. The normalized spectral abundance factor (NSAF) (7) was applied to the biological triplicates to minimize the noise caused by protein length and sample-to-sample variation. The TFold test was used to pinpoint differentially expressed proteins, with the BH q-value and L-stringency both set to 0.05. The variable fold-change cutoff was determined by the optimized F-stringency parameter.
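As an illustration of the NSAF normalization mentioned above, the following minimal Python sketch computes NSAF for one sample, assuming the usual definition from reference (7): each protein's spectral count is divided by its length, and the result is divided by the sum of those ratios over all proteins in the sample. This is not the PatternLab implementation. The protein names and spectral counts are hypothetical; only the limonene synthase length (543 amino acids, see Fig. S3) comes from this document.

```python
def nsaf(spectral_counts, lengths):
    """Return the NSAF of every protein in one sample.

    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    """
    # Spectral abundance factor: spectral count divided by protein length.
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    # Dividing by the sample total makes values comparable between samples.
    return {name: value / total for name, value in saf.items()}


# Hypothetical spectral counts and lengths; only the LS length is from Fig. S3.
counts = {"LS": 30, "protA": 120, "protB": 60}
lengths = {"LS": 543, "protA": 400, "protB": 150}

values = nsaf(counts, lengths)
for name, value in sorted(values.items()):
    print(f"{name}: {value:.4f}")
```

Because every sample is scaled to sum to one, a long protein with many spectra does not automatically rank above a short protein with fewer spectra.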

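The database size and the filtering thresholds given in the methods can be checked with a short Python sketch. It only mirrors the stated criteria (DelCN ≥ 0.08, charge-dependent XCorr cutoffs, at least two peptides per protein) and is not DTASelect code. The record layout, the example matches, the rejection of charge states other than +1 to +3, and the reading of "two peptides" as two distinct peptide sequences are all assumptions made for illustration.

```python
def database_size(n_targets, n_contaminants):
    """Forward entries plus one reversed (decoy) copy of each."""
    forward = n_targets + n_contaminants
    return 2 * forward


# Thresholds quoted in the methods.
XCORR_MIN = {1: 1.8, 2: 2.5, 3: 3.5}
DELCN_MIN = 0.08


def passes_filter(psm):
    """Apply the DelCN and charge-dependent XCorr cutoffs to one match."""
    threshold = XCORR_MIN.get(psm["charge"])
    if threshold is None:
        return False  # assumption: unlisted charge states are rejected
    return psm["delcn"] >= DELCN_MIN and psm["xcorr"] >= threshold


def identified_proteins(psms, min_peptides=2):
    """Proteins supported by at least min_peptides distinct passing peptides."""
    peptides = {}
    for psm in psms:
        if passes_filter(psm):
            peptides.setdefault(psm["protein"], set()).add(psm["peptide"])
    return sorted(p for p, seqs in peptides.items() if len(seqs) >= min_peptides)


# Hypothetical peptide/spectrum matches.
example_psms = [
    {"protein": "P1", "peptide": "AAK", "charge": 2, "xcorr": 3.1, "delcn": 0.12},
    {"protein": "P1", "peptide": "GGR", "charge": 3, "xcorr": 3.6, "delcn": 0.09},
    {"protein": "P2", "peptide": "LLK", "charge": 2, "xcorr": 2.4, "delcn": 0.20},
    {"protein": "P2", "peptide": "VVR", "charge": 1, "xcorr": 2.0, "delcn": 0.10},
    {"protein": "P3", "peptide": "TTK", "charge": 2, "xcorr": 4.0, "delcn": 0.05},
]

print(database_size(2657, 37))
print(identified_proteins(example_psms))
```

With 2657 S. elongatus sequences and 37 contaminants, doubling the forward set for the reversed entries reproduces the 5388 sequences stated in the methods.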

Fig. S1. Limonene production in S. elongatus through stepwise metabolic engineering. (A) MEP pathway for IPP/DMAPP generation. S. elongatus does not encode a limonene synthase (LS) or a geranyl pyrophosphate synthase (GPPS). Heterologous LS was introduced into the S. elongatus genome to enable limonene synthesis. (B) Specific limonene productivity in different strains. Strain L111: ls driven by the Ptrc promoter and integrated in neutral site I of the S. elongatus genome; Strain L113: ls driven by the Ptrc promoter, with the native RBS replaced by a synthetic RBS sequence; Strain L114: ls/gpps in a single operon driven by the Ptrc promoter; Strain L114/dxs: dxs driven by the PLlacO1 promoter and integrated in neutral site II of the strain L114 genome.


Fig. S2. GC-MS analysis of limonene production in engineered cyanobacteria. The figure shows a typical elution from the hydrocarbon absorption trap, with cedrene as the internal standard.


Fig. S3. Normalized spectral count for limonene synthase in L1115 and L1118 cells. The normalized spectral count was calculated by dividing the peptide spectral counts by the protein length of limonene synthase (543 amino acids). Data were obtained from biological triplicates, with error bars indicating standard deviations. A two-tailed t-test showed a significant difference between L1115 and L1118 LS levels (p value = 0.0005).


Fig. S4. Cell growth and limonene productivity of L1118. (A) Growth of wild-type and L1118 cells. (B) Daily limonene productivity in L1118 cells. Error bars indicate standard deviations. One-way ANOVA (p < 0.0001) showed that limonene production differs significantly between at least two time points.


Fig. S5. Standard curve used to calculate limonene recovery rate. Limonene concentrations of 10, 50, 100, and 250 µg/mL were used to spike the S. elongatus wild type cells. Limonene was collected and measured as described in the methods.
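A standard curve of this kind is usually evaluated with a straight-line fit, which can be sketched in Python as follows. The spiked concentrations are those listed in the caption. The recovered values are made-up numbers chosen only to make the example run, and treating the slope as the recovery fraction is an assumption, not the authors' stated calculation.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Spiked limonene concentrations from the caption (µg/mL).
spiked = [10.0, 50.0, 100.0, 250.0]
# Made-up recovered concentrations (µg/mL); NOT measured data.
recovered = [6.0, 30.0, 60.0, 150.0]

slope, intercept = linear_fit(spiked, recovered)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Under that assumption, a measured amount would be divided by the fitted slope to estimate the amount actually produced.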


Fig. S6. Pathway in kinetics modeling. The current model is an expansion of a previous C3 photosynthesis kinetics model (1). Carbon flux into MEP-derived terpene biosynthesis was controlled by pyruvate generation through glycolate dehydrogenase in photorespiration. A simplified downstream terpene biosynthesis pathway was used to simulate the downstream terpene response to MEP carbon flux.


Dataset S1. Differential expression of proteins in WT and L1118 cells. The statistical analysis of changes in protein expression was conducted by PatternLab (see methods), and ranked by protein fold change in the dataset. Each row is a separate protein, and the columns from left to right show the UniProt ID of the protein, fold change between WT and L1118 proteins, p-value calculated from PatternLab showing the statistical difference between WT and L1118 proteins, the normalized spectral abundance factor (NSAF) (see methods) for L1118, NSAF for WT, and description of proteins, respectively.

See the Excel file for Dataset S1.


Table S1. Plasmids and strains used in this study.

Strains                           Genotype                                                       Reference
Synechococcus elongatus PCC 7942  Wild type                                                      From S. Golden
L111                              ls at NSI site of PCC 7942                                     This study
L113                              ls (synthetic RBS) at NSI site of PCC 7942                     This study
L114                              ls-gpps at NSI site of PCC 7942                                This study
L114/dxs                          ls-gpps at NSI site of PCC 7942; dxs at NSII site of PCC 7942  This study
L1115                             ls (PCC 7942 native promoter) at NSI site of PCC 7942          This study
L1118                             ls (Ptrc promoter) at NSI site of PCC 7942                     This study

Plasmids  Genotype                              Reference
pAM2991   Targeting PCC7942 NSI; Ptrc           (8)
pAM1579   Targeting PCC7942 NSII; PLlacO1       (9)
pWX111    Ptrc: ls; Sp/SmR                      This study
pWX113    Ptrc: ls (synthetic RBS); Sp/SmR      This study
pWX114    Ptrc: ls-gpps; Sp/SmR                 This study
pWX1115   PPHbs: ls; Sp/SmR                     This study
pWX1118   PpsbA: ls; Sp/SmR                     This study
pWX121    PLlacO1: dxs; KmR                     This study


Table S2. Enzyme kinetic parameters used in the model. In cases where parameters were assumed, a flux control coefficient was determined by increasing and decreasing the value by 20%.

EC                 Parameters                   Reference

2.5.1.29           KmIPP = 0.02 mM              (10)
                   KmDMAPP = 0.02 mM            (10)
                   Vmax = 1.0 μmol mL⁻¹ s⁻¹     Assumed

2.5.1.10           KmGPP = 0.002 mM             (11)
                   KmIPP = 0.015 mM             (11)
                   Vmax = 1.0 μmol mL⁻¹ s⁻¹     Assumed

1.17.1.2           KmMEP = 0.03 mM              (12)
                   Vmax = 0.3 μmol mL⁻¹ s⁻¹     Assumed

5.3.3.2            Ke = 10
                   KmIPP = 0.01 mM              (13)
                   KmDMAPP = 0.02 mM            (14)
                   Vmax = 0.3 μmol mL⁻¹ s⁻¹     Assumed

1.1.1.267          Ke = 8
                   KmNADPH = 0.03 mM            (15)
                   KmDXP = 0.132 mM             (15)
                   KmNADP+ = 0.47 mM            (15)
                   KmMEP = 0.972 mM             (15)
                   Vmax = 0.3 μmol mL⁻¹ s⁻¹     (15)

2.2.1.7            KmG3P = 0.47 mM              (16)
                   KmPYR = 0.7 mM               (16)
                   Vmax = 1 μmol mL⁻¹ s⁻¹       Assumed

2.3.3.9            KmGOA = 2 mM                 (17)
                   KmAceCoA = 0.01 mM           (17)
                   KiGCA = 0.15 mM              (17)
                   Vmax = 1 μmol mL⁻¹ s⁻¹       Assumed

1.1.1.40           Ke = 0.051                   (18)
                   KmMAL = 0.23 mM              (19)
                   KmNADP+ = 0.0102 mM          (19)
                   KmPYR = 26.3 mM              (19)
                   Vmax = 0.3 μmol mL⁻¹ s⁻¹     Assumed

4.2.3.16           KmGPP = 0.0018 mM            (20)
                   Vmax = 0.01 μmol mL⁻¹ s⁻¹    (21)

Squalene synthase  KmFPP = 0.0025 mM            (22)
                   Vmax = 0.51 μmol mL⁻¹ s⁻¹    Assumed
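The ±20% perturbation described in the caption of Table S2 can be illustrated with a small Python sketch. It estimates a flux control coefficient as the relative change in flux divided by the relative change in the parameter, using a central difference between the +20% and −20% values. The exact formula used by the authors is not given, so this form is an assumption. The single Michaelis-Menten step stands in for the full model, the GPP concentration is hypothetical, and KmGPP and Vmax are the EC 4.2.3.16 values from Table S2.

```python
def michaelis_menten(s, vmax, km):
    """Irreversible one-substrate Michaelis-Menten rate."""
    return vmax * s / (s + km)


def control_coefficient(flux_of, value, rel_step=0.20):
    """Central-difference estimate of (dJ/J) / (dp/p).

    flux_of maps a parameter value to a flux; value is the reference
    parameter value; rel_step is the fractional perturbation (20%).
    """
    j_ref = flux_of(value)
    j_up = flux_of(value * (1.0 + rel_step))
    j_down = flux_of(value * (1.0 - rel_step))
    return ((j_up - j_down) / j_ref) / (2.0 * rel_step)


GPP = 0.005   # mM, hypothetical substrate concentration
VMAX = 0.01   # μmol mL^-1 s^-1, EC 4.2.3.16 in Table S2
KM = 0.0018   # mM, KmGPP for EC 4.2.3.16 in Table S2

c_vmax = control_coefficient(lambda v: michaelis_menten(GPP, v, KM), VMAX)
c_km = control_coefficient(lambda k: michaelis_menten(GPP, VMAX, k), KM)
print(f"C(Vmax) = {c_vmax:.3f}, C(Km) = {c_km:.3f}")
```

For an isolated step the flux is proportional to Vmax, so its coefficient is one. In a pathway model the flux would instead be the simulated steady-state flux, and the coefficient of an assumed Vmax indicates how strongly that assumption affects the result.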


Table S3. Initial values of metabolite concentrations used in the model.

Metabolite  Concentration (mM)  Reference
Malate      3.0                 (23)
PYR*        0                   Assumed
DXP*        0                   Assumed
MEP*        0                   Assumed
DMAPP*      0                   Assumed
IPP*        0                   Assumed
GPP*        0                   Assumed
FPP*        0                   Assumed
Limonene    0                   Assumed
Β-          0                   Assumed

*Abbreviations for metabolites: PYR (pyruvate), DXP (1-deoxy-D-xylulose 5-phosphate), MEP (2-C-methylerythritol 4-phosphate), DMAPP (dimethylallyl pyrophosphate), IPP (isopentenyl pyrophosphate), GPP (geranyl pyrophosphate), FPP (farnesyl pyrophosphate).


File S1. Promoter sequences (RBS underlined) in genome targeting plasmids.

pWX111:
CGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCATCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATC
ACTGCATAATTCGTGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCTGG
CAAATATTCTGAAATGAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATT
TCACACAGGAAACAGACC

pWX113:
CGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCATCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATC
ACTGCATAATTCGTGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCTGG
CAAATATTCTGAAATGAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATT
TCAACTATTCTAAAGGAGGTAAACT

pWX1115:
GCGCCACTTCCTTGCACAATTCATGGCTACGACCCCTGCTGGCCTTGCGGTCACAGGGGTTTCTTGTGGGCAGCAGA
CCGAGTGGCGGTGATCTCAACTCCCCTGTCTCCTCGCGCGATCGCTCGGCAGAGAGCCAAGTCAGGGCCTGAATCTT
GGGGAAATTGGAGCCAGCGATCGCTGCTCCTATGGCTCATCTAGACCGTCTAGGGATGCTCTTGTAACCATTTCTAC
AGAAAAGAACCCTGAAAAAGCAGTCGCCGCAAGACTTTAGGCCTTCCACTTCAAAAAAGAGTGCTGTATTATTTGCG
AGACCCGCTCAAATCTACTTTTCATC

pWX1118:
AAAGTACTATTCAGATAGAACGAGAAATGAGCTTGTTCTATCCGCCCGGGgctgagggGAATTCGATCTCAATGAAT
ATTGGTTGACACGGGCGTATAAGACATGTTATACTGTTGAATAACAAGTTTACCGTTCCCAAAAATAAAGAAGGAGG
AACAGT

pWX121:
TAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCACCTCGAGAA
TTGTGAGCGGATAACAAGATACTGAGCACATCAGCAGGACGCACTGACCGAATTCATTAAAGAGGAGAAA


File S2. Map and sequence of pWX1118.

Sequence CGCCGGGGCTGGCAGCTTAGTCCTGCGCAATCTCTACTACATCTGCCAACCCAGTGAAATTTTGATCTTTGCTGGCA GTAGTCGCCGCAGTAGTGATGGCCGCCGAGTTGGCTATCGCTTGGTCAAGGGCGGCAGCAGCCTGCGGGTACCTCTG CTGGAAAAAGCGCTCCGCATGGATCTGACCAACATGATCATTGAGTTGCGCGTTTCCAATGCCTTCTCCAAGGGCGG CATTCCCCTGACTGTTGAAGGCGTTGCCAATATCAAGATTGCTGGGGAAGAACCGACCATCCACAACGCGATCGAGC GGCTGCTTGGCAAAAACCGTAAGGAAATCGAGCAAATTGCCAAGGAGACCCTCGAAGGCAACTTGCGTGGTGTTTTA GCCAGCCTCACGCCGGAGCAGATCAACGAGGACAAAATTGCCTTTGCCAAAAGTCTGCTGGAAGAGGCGGAGGATGA CCTTGAGCAGCTGGGTCTAGTCCTCGATACGCTGCAAGTCCAGAACATTTCCGATGAGGTCGGTTATCTCTCGGCTA GTGGACGCAAGCAGCGGGCTGATCTGCAGCGAGATGCCCGAATTGCTGAAGCCGATGCCCAGGCTGCCTCTGCGATC CAAACGGCCGAAAATGACAAGATCACGGCCCTGCGTCGGATCGATCGCGATGTAGCGATCGCCCAAGCCGAGGCCGA GCGCCGGATTCAGGATGCGTTGACGCGGCGCGAAGCGGTGGTGGCCGAAGCTGAAGCGGACATTGCTACCGAAGTCG CTCGTAGCCAAGCAGAACTCCCTGTGCAGCAGGAGCGGATCAAACAGGTGCAGCAGCAACTTCAAGCCGATGTGATC GCCCCAGCTGAGGCAGCTTGTAAACGGGCGATCGCGGAAGCGCGGGGGGCCGCCGCCCGTATCGTCGAAGATGGAAA AGCTCAAGCGGAAGGGACCCAACGGCTGGCGGAGGCTTGGCAGACCGCTGGTGCTAATGCCCGCGACATCTTCCTGC TCCAGAAGCTCGAAATTCGAGCTCGGTACCCGGGGATCTGGGCCGCAAAGTACTATTCAGATAGAACGAGAAATGAG CTTGTTCTATCCGCCCGGGgctgagggGAATTCGATCTCAATGAATATTGGTTGACACGGGCGTATAAGACATGTTA TACTGTTGAATAACAAGTTTACCGTTCCCAAAAATAAAGAAGGAGGAACAGTATGCGACGCTCCGGCAATTATAATC CGTCTCGCTGGGATGTTAATTTCATCCAGTCCCTCTTGTCGGACTACAAAGAGGACAAACATGTCATCCGGGCCAGC GAGCTGGTCACACTGGTTAAAATGGAATTGGAGAAAGAGACGGACCAGATTCGACAGCTGGAGCTCATTGACGATTT GCAACGCATGGGACTCTCGGATCATTTCCAGAACGAATTTAAAGAGATCCTGTCTAGCATTTACCTGGACCACCATT ACTATAAGAACCCATTCCCTAAAGAGGAACGTGACCTCTACTCTACGTCGTTGGCGTTCCGACTCCTGCGCGAGCAT GGCTTCCAAGTTGCCCAGGAAGTCTTTGATAGTTTCAAGAATGAGGAAGGCGAGTTTAAAGAGAGCCTGTCGGACGA TACGCGAGGCTTGTTGCAGTTGTACGAGGCTTCGTTCCTGCTCACGGAAGGAGAAACCACTCTCGAATCTGCCCGCG AGTTTGCAACGAAATTCTTGGAAGAAAAGGTGAACGAAGGTGGCGTGGATGGAGACCTCCTGACTCGTATCGCGTAC TCGCTGGATATTCCGTTGCACTGGCGCATCAAGCGCCCCAATGCCCCGGTCTGGATCGAATGGTATCGTAAACGACC AGATATGAACCCGGTTGTTCTCGAGCTGGCTATCCTGGACCTGAATATTGTGCAGGCACAATTTCAAGAGGAGTTGA 
AGGAGTCTTTTCGCTGGTGGCGCAATACAGGCTTTGTGGAAAAACTGCCATTCGCGCGCGATCGCCTCGTGGAATGC


TACTTCTGGAACACTGGTATTATCGAGCCGCGTCAGCATGCGTCGGCGCGCATCATGATGGGAAAGGTCAATGCTCT GATCACGGTGATCGACGACATCTATGATGTCTACGGTACCCTGGAGGAGCTGGAACAATTCACAGATCTGATTCGGC GCTGGGATATTAATAGCATTGATCAGCTGCCTGATTACATGCAACTGTGCTTTTTGGCTCTCAATAATTTTGTTGAC GATACCTCCTATGATGTTATGAAAGAAAAAGGCGTGAATGTTATCCCCTATCTCCGCCAATCTTGGGTTGACTTGGC AGATAAGTATATGGTTGAGGCGCGGTGGTTTTACGGTGGGCACAAGCCAAGTCTGGAAGAATATTTGGAGAACAGTT GGCAGAGTATTTCTGGCCCTTGCATGCTCACGCACATCTTTTTTCGAGTGACGGACAGTTTTACCAAGGAGACGGTG GATAGCCTGTATAAATATCATGATCTCGTCCGTTGGAGCTCTTTCGTGTTGCGCCTGGCAGATGACCTGGGTACTAG CGTGGAGGAGGTTTCGCGCGGTGACGTGCCTAAAAGCCTGCAGTGCTATATGAGTGATTACAACGCCTCGGAAGCAG AGGCTCGGAAACACGTCAAATGGCTGATCGCTGAGGTGTGGAAAAAAATGAATGCTGAGCGCGTGAGTAAGGACAGC CCATTCGGCAAAGATTTTATTGGCTGCGCGGTGGATCTGGGCCGCATGGCCCAACTGATGTACCACAATGGTGATGG CCACGGCACACAACACCCAATCATCCATCAACAGATGACCCGGACCCTCTTTGAACCTTTCGCATAAGAATTCGAGC TCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGGCTGTTTTGGCGGATGAGAGAAGATTTTC AGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGT CCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAG TAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTC GGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGC GGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTT CTACAAACTCTTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGGCGGCCCAGATCCGCGGCCGCC GATCCTCTAGTATGCTTGTAAACCGTTTTGTGAAAAAATTTTTAAAATAAAAAAGGGGACCTCTAGGGTCCCCAATT AATTAGTAATATAATCTATTAAAGGTCATTCAAAAGGTCATCCACCGGATCAGCTTAGTAAAGCCCTCGCTAGATTT TAATGCGGATGTTGCGATTACTTCGCCAACTATTGCGATAACAAGAAAAAGCCAGCCTTTCATGATATATCTCCCAA TTTGTGTAGGGCTTATTATGCACGCTTAAAAATAATAAAAGCAGACTTGACCTGATAGTTTGGCTGTGAGCAATTAT GTGCTTAGTGCATCTAACGCTTGAGTTAAGCCGCGCCGCGAAGCGGCGTCGGCTTGAACGAATTGTTAGACATTATT TGCCGACTACCTTGGTGATCTCGCCTTTCACGTAGTGGACAAATTCTTCCAACTGATCTGCGCGCGAGGCCAAGCGA TCTTCTTCTTGTCCAAGATAAGCCTGTCTAGCTTCAAGTATGACGGGCTGATACTGGGCCGGCAGGCGCTCCATTGC 
CCAGTCGGCAGCGACATCCTTCGGCGCGATTTTGCCGGTTACTGCGCTGTACCAAATGCGGGACAACGTAAGCACTA CATTTCGCTCATCGCCAGCCCAGTCGGGCGGCGAGTTCCATAGCGTTAAGGTTTCATTTAGCGCCTCAAATAGATCC TGTTCAGGAACCGGATCAAAGAGTTCCTCCGCCGCTGGACCTACCAAGGCAACGCTATGTTCTCTTGCTTTTGTCAG CAAGATAGCCAGATCAATGTCGATCGTGGCTGGCTCGAAGATACCTGCAAGAATGTCATTGCGCTGCCATTCTCCAA ATTGCAGTTCGCGCTTAGCTGGATAACGCCACGGAATGATGTCGTCGTGCACAACAATGGTGACTTCTACAGCGCGG AGAATCTCGCTCTCTCCAGGGGAAGCCGAAGTTTCCAAAAGGTCGTTGATCAAAGCTCGCCGCGTTGTTTCATCAAG CCTTACGGTCACCGTAACCAGCAAATCAATATCACTGTGTGGCTTCAGGCCGCCATCCACTGCGGAGCCGTACAAAT GTACGGCCAGCAACGTCGGTTCGAGATGGCGCTCGATGACGCCAACTACCTCTGATAGTTGAGTCGATACTTCGGCG ATCACCGCTTCCCTCATGATGTTTAACTTTGTTTTAGGGCGACTGCCCTGCTGCGTAACATCGTTGCTGCTCCATAA CATCAAACATCGACCCACGGCGTAACGCGCTTGCTGCTTGGATGCCCGAGGCATAGACTGTACCCCAAAAAAACAGT CATAACAAGCCATGAAAACCGCCACTGCGCCGTTACCACCGCTGCGTTCGGTCAAGGTTCTGGACCAGTTGCGTGAG CGCATACGCTACTTGCATTACAGCTTACGAACCGAACAGGCTTATGTCCACTGGGTTCGTGCCTTCATCCGTTTCCA CGGTGTGCGTCACCCGGCAACCTTGGGCAGCAGCGAAGTCGAGGCATTTCTGTCCTGGCTGGCGAACGAGCGCAAGG TTTCGGTCTCCACGCATCGTCAGGCATTGGCGGCCTTGCTGTTCTTCTACGGCAAGGTGCTGTGCACGGATCTGCCC TGGCTTCAGGAGATCGGAAGACCTCGGCCGTCGCGGCGCTTGCCGGTGGTGCTGACCCCGGATGAAGTGGTTCGCAT CCTCGGTTTTCTGGAAGGCGAGCATCGTTTGTTCGCCCAGCTTCTGTATGGAACGGGCATGCGGATCAGTGAGGGTT TGCAACTGCGGGTCAAGGATCTGGATTTCGATCACGGCACGATCATCGTGCGGGAGGGCAAGGGCTCCAAGGATCGG GCCTTGATGTTACCCGAGAGCTTGGCACCCAGCCTGCGCGAGCAGGGGAATTGATCCGGTGGATGACCTTTTGAATG ACCTTTAATAGATTATATTACTAATTAATTGGGGACCCTAGAGGTCCCCTTTTTTATTTTAAAAATTTTTTCACAAA ACGGTTTACAAGCATAAAGCTCTAGAGTCGACCTGCAGGCATGCAAGCTTCGAGTCCCTGCTCGTCACGCTTTCAGG CACCGTGCCAGATATCGACGTGGAGTCGATCACTGTGATTGGCGAAGGGGAAGGCAGCGCTACCCAAATCGCTAGCT TGCTGGAGAAGCTGAAACAAACCACGGGCATTGATCTGGCGAAATCCCTACCGGGTCAATCCGACTCGCCCGCTGCG AAGTCCTAAGAGATAGCGATGTGACCGCGATCGCTTGTCAAGAATCCCAGTGATCCCGAACCATAGGAAGGCAAGCT CAATGCTTGCCTCGTCTTGAGGACTATCTAGATGTCTGTGGAACGCACATTTATTGCCATCAAGCCCGATGGCGTTC AGCGGGGTTTGGTCGGTACGATCATCGGCCGCTTTGAGCAAAAAGGCTTCAAACTGGTGGGCCTAAAGCAGCTGAAG 
CCCAGTCGCGAGCTGGCCGAACAGCACTATGCTGTCCACCGCGAGCGCCCCTTCTTCAATGGCCTCGTCGAGTTCAT CACCTCTGGGCCGATCGTGGCGATCGTCTTGGAAGGCGAAGGCGTTGTGGCGGCTGCTCGCAAGTTGATCGGCGCTA CCAATCCGCTGACGGCAGAACCGGGCACCATCCGTGGTGATTTTGGTGTCAATATTGGCCGCAACATCATCCATGGC TCGGATGCAATCGAAACAGCACAACAGGAAATTGCTCTCTGGTTTAGCCCAGCAGAGCTAAGTGATTGGACCCCCAC GATTCAACCCTGGCTGTACGAATAAGGTCTGCATTCCTTCAGAGAGACATTGCCATGCCCGTGCTGCGATCGCCCTT CCAAGCTGCCTTGCCCCGCTGTTTCGGGCTGGCAGCCCTGGCGTTGGGGCTGGCGACCGCTTGCCAAGAAAGCAGCG CTCCGCCGGCTGCCGGATCGATCCTGCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCC 16

GGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCG GGTGTCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGAGC AGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGC TCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGC GGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGA ACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCA AGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCC TGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCAC GCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGAC CGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCAC TGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACA CTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCC GGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCA AGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGA GATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAG TAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCAT AGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATAC CGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGT CCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAG TTTGCGCAACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCG GTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATC GTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCC ATCCGTAAGATGCTTTTCTGTGACTGGTGAGT


File S3. Map and sequence of pWX121.

Sequence AGCTTGTCATCTGCCGGATGAGGCAAAACCCTGCCTACGGCGCGATTACATCGTCCCAGCGCGATCGCTCTTACTGT TGATGGCTCGTGCTTAAAAACAATGCAAACTTCACCGTTTCAGCTGGTGATTTTCGACTGTGATGGTGTGCTTGTTG ATAGCGAACGCATCACTAATCGCGTCTTTGCAGACATGCTCAATGAACTGGGTCTGTTGGTGACTTTGGATGACATG TTTGAGCAGTTTGTGGGTCATTCCATGGCTGACTGTCTCAAACTAATTGAGCGACGGTTAGGCAATCCTCCACCCCC TGACTTTGTTCAGCACTATCAACGCCGTACCCGTATCGCGTTAGAAACGCATCTACAAGCCGTTCCTGGGGTTGAAG AGGCTTTGGATGCTCTTGAATTGCCCTACTGTGTTGCGTCCAGTGGTGATCATCAAAAGATGCGAACCACACTGAGC CTGACGAAGCTCTGGCCACGATTTGAGGGACGAATCTTCAGCGTGACTGAAGTACCTCGCGGCAAGCCATTTCCCGA TGTCTTTTTGTTGGCCGCCGATCGCTTCGGGGTTAATCCTACGGCCTGCGCTGTGATCGAAGACACCCCCTTGGGAG TAGCGGCAGGCGTGGCGGCAGGAATGCAAGTGTTTGGCTACGCGGGTTCCATGCCCGCTTGGCGTCTGCAAGAAGCC GGTGCCCATCTCATTTTTGACGATATGCGACTGCTGCCCAGTCTGCTCCAATCGTCGCCAAAAGATAACTCCACAGC ATTGCCCAATCCCTAACCCCTGCTCGCGCCGCAACTACACACTAAACCGTTCCTGCGCGATCGCTCTTACTGTTGAT GGCTCGTGCTTAAAAACAATGCAACCCTAACCGTTTCAGCTGGTGATTTTCGGACGATTTGGCTTACAGGGATAACT GAGAGTCAACAGCCTCTGTCCGTCATTGCACACCCATCCATGCACTGGGGACTTGACTCATGCTGAATCACATTTCC CTTGTCCATTGGGCGAGAGGGGAGGGGAATCTTCTGGACTCTTCACTAAGCGGCGATCGCAGGTTCTTCTACCCAAG CAGTGGCGATCGCTTGATTGCAGTCTTCAATGCTGGCCTCTGCAGCCATCGCCGCCACCAAAGCATCGTAGGCGGGA CGTTGTTGCTCCAGTAAAGTCTTCGCCCGTAACAATCCCCAGCGACTGCGTAAATCCGCTTCGGCAGGATTGCGATC GAGTTGCCGCCACAGTTGTTTCCACTGGGCGCGATCGTCAGCTCCCCCTTCCACGTTGCCGTAGACCAGTTGCTCTG CCGCTGCACCGGCCATCAACACCTGACACCACTGTTCCAGCGATCGCTGACTGAGTTGCCCCTGTGCGGCTTCGGCT TCTAGCGCAGCTGCTTGGAACTGCACACCCCCGCGACCAGGTTGTCCTTGGCGCAGCGCTTCCCACGCTGAGAGGGT GTAGCCCGTCACGGGTAACCGATTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGG CCCTTTCGTCTTCACCTCGAGAATTGTGAGCGGATAACAAGATACTGAGCACATCAGCAGGACGCACTGACCGAATT CATTAAAGAGGAGAAAgatatcATGGCGGATGCAATCGTCAGAGAAGAAGTTAACTTGCAGACAAGCCCAGATAAGA TAACCGTGGATGAGATTGAACTATGGCCAGCAAAGGGGGGACCAGAAACGCCATATTTGGATAAAGTCAAGGTCCCT GCGCACTTGAAGAGCTTCCGTAAGGATGAGTTGAAGACAGTCTGCAGAGAGCTGCGAGCTGAGATCATCAACGCCGT GTCTGCAACGGGAGGGCACTTGGGTTCATCCCTGGGAGTTGTGGAGTTGACTGTTGCTATACATTATGTCTTCGACT


GCCCGGAGGACAAGCTTGTTTGGGATGTGGGCCACCAAGCATATGGTCACAAAATTTTGACTGGGCGGAGGGATAAA ATGCACACCATACGACAACGAGATGGCTTGTCTGGGTTCACCAACCGGTCTGAAAGTGAGTACGACGCATTTGGTGC CGGGCACAGTTCTACTTCCATCTCTGCTGCGTTAGGAATGGCGGTTGGACGGGATCTTCTGGGAAAGGATAACCACT GCATTGCCGTCATTGGGGATGGAGCTATTACTGGTGGTATGGCGTATGAGGCCTTAAACCATGCTGGCTTCCTGGGG AGTGAGAAAATCCAAGGGAAGGGGAACAACATTGGCAGAACGATTGTCATCCTAAATGACAACCAGCAGGTGTCGCT CCCAACACAATTCAATGGAGAGAAACAAAAGCCTGTTGGGGCTTTGGCAGATGCCCTCTCCCAAGTTGCATCTAAAC AATTAAGTAATTCCGCCAAAGAAATTGTGAAGCAGCTGCCTGAGCCCTTGCAAGCTGTTGGAGGCCAATTGGAGAAG GTTGTTAGCACAATCGGGGGAAATAACACCTTCTTTGATGAGTTGGGTGTGGCCCATGTTGGCCCTATTGATGGCCA CAACGTGGAGGACCTCGTGAACGTCTTGGAATGGATTAAGAGTCAGAAGGACGGCGGGCCCGTCATTGTCCATATCT TGACAGAAAAAGGATATGGCTATGAGTTTGCTGAGAAGGCTTCTGATCGTATGCATGGAGTCGCGAAGTATGATGTC ACATCTGGCAAGCAAGTCAAGAGTTCCAGCAAGGTGGCCAGTTACACCACTTACTTTGCAGACTCATTAATCGCGGA AGCAGAACGGGACGGCCGGGTCATTGGGATCCATGCAGCCATGGGTGGGGGTACTGGCATGAACCGCTTTGCCAAGC GATTCCCCAAGCGCACTTTTGATGTTGGCATTGCTGAACAGCATGCGGTGACTTTCGCTGCAGGCCTGGCATGTGAA GGGCTTATCCCAATGTGCGCAATTTACTCCTCCTTCCTACAGCGTGCCTATGATCAAGTCATCCATGACGTTGCCCT CCAAAACCTCCCAGTGCGCTTTGCCATGGATCGCGCGGGCCTCGTTGGGGCGGATGGGGCAACGCACAGTGGATTTG CAGATGTCACCTACATGGCATGTGTCCCCAACATGATCGTAATGGCACCCTCTAACGAGGCAGAGCTGTGCAATGCT GTGGCAACTTCAATTGCTATCGATTTTGCTCCATCTTGCTTCAGATTCCCTCGTGGGAATGGAATTGGCGTCGACCT GGCTGAGTATGGGGTGCAACCCAACTTCAAGGGCACTCCTTGGGAGATTGGGAAAGGCAAAATCAGGCGGAACGGGT CGAACAATAAGCCGGAAGGAGATGTAGCTCTTCTGGGGTATGGCACTGTGGTCAATGATTGCCTGGCAGCGGCCGAG ATGCTAGAAGCGCAAGGAATTAAGACCACAGTAGCTGATATGCGTTTCTGCAAGCCTTTGGATGAGGAGCTCATTGT GCAGCTGGCCAAGAACCACCCAGTCGTGATTACTGTTGAGGAGAACACTGTTGGTGGCTTTGCCTCCCATGTCCTGC ACTTCATGACAGAGCAGGGCTTCCTGGATGGAAAGATGAAGCTCCGGACATTATTGCTCCCCGATCGTTTCATTGAA CACGGCACCCAGCAGCAACAGCTAGAAGAGGCACTCCTGACAGCTAATGACATTGTAGAGACAGCGGTCAATGCTTT GGGGATTGCTCCCGTTCCTCAAATCATCATCCCCCCTCAGCAGGTTGTTACGCCCCCCCAGGCCGTCAGCCCGCCCC AGGTCGCCACTCCCACGCCAGCACCCTTAAATTGAGGTACCATCGTCGACAGGCCTCTAGACCCGGGCTCGAGCTAG 
CAAGCTTGGCCGGATCCGGCCGGATCCGGAGTTTGTAGAAACGCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTT AATTTGATGCCTGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTGCTTCGCAACGTTCAAATCCG CTCCCGGCGGATTTGTCCTACTCAGGAGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTCTTTCGAC TGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACTCTCGCATGGGGAGACCCCACACTACCATCGGCGCTAC GGCGTTTCACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTGCCGCCAGGCAAATTCTGTTTTATC AGCCGTTACCCCACCTACTAGCTAATCCCATCTGGGCACATCCGATGGCAAGAGGCCCGAAGGTCCCCCTCTTTGGT CTTGCGACGTTATGCGGTATTAGCTACCGTTTCCAGTAGTTATCCCCCTCCATCAGGCAGTTTCCCAGACATTACTC ACCCGTCCGCCACTCGTCAGCAAAGAAGCAAGCTTAGATCGACCTGCAGGGGGGGGGGGGAAAGCCACGTTGTGTCT CAAAATCTCTGATGTTACATTGCACAAGATAAAAATATATCATCATGAACAATAAAACTGTCTGCTTACATAAACAG TAATACAAGGGGTGTTATGAGCCATATTCAACGGGAAACGTCTTGCTCGAGGCCGCGATTAAATTCCAACATGGATG CTGATTTATATGGGTATAAATGGGCTCGCGATAATGTCGGGCAATCAGGTGCGACAATCTATCGATTGTATGGGAAG CCCGATGCGCCAGAGTTGTTTCTGAAACATGGCAAAGGTAGCGTTGCCAATGATGTTACAGATGAGATGGTCAGACT AAACTGGCTGACGGAATTTATGCCTCTTCCGACCATCAAGCATTTTATCCGTACTCCTGATGATGCATGGTTACTCA CCACTGCGATCCCCGGGAAAACAGCATTCCAGGTATTAGAAGAATATCCTGATTCAGGTGAAAATATTGTTGATGCG CTGGCAGTGTTCCTGCGCCGGTTGCATTCGATTCCTGTTTGTAATTGTCCTTTTAACAGCGATCGCGTATTTCGTCT CGCTCAGGCGCAATCACGAATGAATAACGGTTTGGTTGATGCGAGTGATTTTGATGACGAGCGTAATGGCTGGCCTG TTGAACAAGTCTGGAAAGAAATGCATAAGCTTTTGCCATTCTCACCGGATTCAGTCGTCACTCATGGTGATTTCTCA CTTGATAACCTTATTTTTGACGAGGGGAAATTAATAGGTTGTATTGATGTTGGACGAGTCGGAATCGCAGACCGATA CCAGGATCTTGCCATCCTATGGAACTGCCTCGGTGAGTTTTCTCCTTCATTACAGAAACGGCTTTTTCAAAAATATG GTATTGATAATCCTGATATGAATAAATTGCAGTTTCATTTGATGCTCGATGAGTTTTTCTAATCAGAATTGGTTAAT TGGTTGTAACACTGGCAGAGCATTACGCTGACTTGACGGGACGGCGGCTTTGTTGAATAAATCGAACTTTTGCTGAG TTGAAGGATCAGATCACGCATCTTCCCGACAACGCAGACCGTTCCGTGGCAAAGCAAAAGTTCAAAATCACCAACTG GTCCACCTACAACAAAGCTCTCATCAACCGTGGCTCCCTCACTTTCTGGCTGGATGATGGGGCGATTCAGGCCTGGT ATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCCCCCCCCCCCTGCAGGTCGATCTGGTAACCCCAGC GCGGTTGCTACCAAGTAGTGACCCGCTTCGTGATGCAAAATCCGCTGACGATATTCGGGCGATCGCTGCTGAATGCC 
ATCGAGCAGTAACGTGGCACCCCGCCCCTGCCAAGTCACCGCATCCAGACTGAACAGCACCAAGAGGCTAAAACCCA ATCCCGCCGGTAGCAGCGGAGAACTACCCAGCATTGGTCCCACCAAAGCTAATGCCGTCGTGGTAAAAATCGCGATC GCCGTCAGACTCAAGCCCAGTTCGCTCATGCTTCCTCATCTAGGTCACAGTCTTCGGCGATCGCATCGATCTGATGC TGCAGCAAGCGTTTTCCATACCGGCGATCGCGCCGTCGCCCTTTCGCTGCCGTGGCCCGCTTACGAGCTCGTTTATC GACCACGATCGCATCCAAATCCGCGATCGCTTCCCAGTCCGGCAATTCAGTCTGGGGCGTCCGTTTCATTAATCCTG ATCAGGCACGAAATTGCTGTGCGTAGTATCGCGCATAGCGGCCAGCCTCTGCCAACAGCGCATCGTGATTGCCTGCC TCAACAATCTGGCCGCGCTCCATCACCAAGATGCGGCTGGCATTACGAACCGTAGCCAGACGGTGAGCAATGATAAA 19

GACCGTCCGTCCCTGCATCACCCGTTCTAGGGCCTCTTGCACCAAGGTTTCGGACTCGGAATCAAGCGCCGAAGTCG CCTCATCCAGAATTAAAATGCGTGGATCCTCTACGCCGGACGCATCGTGGCCGGCATCACCGGCGCCACAGGTGCGG TTGCTGGCGCCTATATCGCCGACATCACCGATGGGGAAGATCGGGCTCGCCACTTCGGGCTCATGAGCGCTTGTTTC GGCGTGGGTATGGTGGCAGGCCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCCTTGCATGCACCATTCCTTGCGGC GGCGGTGCTCAACGGCCTCAACCTACTACTGGGCTGCTTCCTAATGCAGGAGTCGCATAAGGGAGAGCGTCGATCGA CCGATGCCCTTGAGAGCCTTCAACCCAGTCAGCTCCTTCCGGTGGGCGCGGGGCATGACTATCGTCGCCGCACTTAT GACTGTCTTCTTTATCATGCAACTCGTAGGACAGGTGCCGGCAGCGCTCTGGGTCATTTTCGGCGAGGACCGCTTTC GCTGGAGCGCGACGATGATCGGCCTGTCGCTTGCGGTATTCGGAATCTTGCACGCCCTCGCTCAAGCCTTCGTCACT GGTCCCGCCACCAAACGTTTCGGCGAGAAGCAGGCCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTT GCTGGCGTTCGCGACGCGAGGCTGGATGGCCTTCCCCATTATGATTCTTCTCGCTTCCGGCGGCATCGGGATGCCCG CGTTGCAGGCCATGCTGTCCAGGCAGGTAGATGACGACCATCAGGGACAGCTTCAAGGATCGCTCGCGGCTCTTACC AGCCTAACTTCGATCACTGGACCGCTGATCGTCACGGCGATTTATGCCGCCTCGGCGAGCACATGGAACGGGTTGGC ATGGATTGTAGGCGCCGCCCTATACCTTGTCTGCCTCCCCGCGTTGCGTCGCGGTGCATGGAGCCGGGCCACCTCGA CCTGAATGGAAGCCGGCGGCACCTCGCTAACGGATTCACCACTCCAAGAATTGGAGCCAATCAATTCTTGCGGAGAA CTGTGAATGCGCAAACCAACCCTTGGCAGAACATATCCATCGCGTCCGCCATCTCCAGCAGCCGCACGCGGCGCATC TCGGGCAGCGTTGGGTCCTGGCCACGGGTGCGCATGATCGTGCTCCTGTCGTTGAGGACCCGGCTAGGCTGGCGGGG TTGCCTTACTGGTTAGCAGAATGAATCACCGATACGCGAGCGAACGTGAAGCGACTGCTGCTGCAAAACGTCTGCGA CCTGAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAAAGTCTGGAAACGCGGAAGTCAGCGCCCTGCACC ATTATGTTCCGGATCTGCATCGCAGGATGCTGCTGGCTACCCTGTGGAACACCTACATCTGTATTAACGAAGCGCTG GCATTGACCCTGAGTGATTTTTCTCTGGTCCCGCCGCATCCATACCGCCAGTTGTTTACCCTCACAACGTTCCAGTA ACCGGGCATGTTCATCATCAGTAACCCGTATCGTGAGCATCCTCTCTCGTTTCATCGGTATCATTACCCCCATGAAC AGAAATCCCCCTTACACGGAGGCATCAGTGACCAAACAGGAAAAAACCGCCCTTAACATGGCCCGCTTTATCAGAAG CCAGACATTAACGCTTCTGGAGAAACTCAACGAGCTGGACGCGGATGAACAGGCAGACATCTGTGAATCGCTTCACG ACCACGCTGATGAGCTTTACCGCAGCTGCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTC CCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGG 
CGGGTGTCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGA GCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGC GCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAG GCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAG GAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCT CAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCT CCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTC ACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCG ACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCC ACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTA CACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGAT CCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCT CAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCAT GAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATG AGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCC ATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGAT ACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTG GTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAAT AGTTTGCGCAACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGA TCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATG CCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAG TTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAAC GTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAAC TGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGG 
AATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATT GTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAA GTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCG TCTTCAAGAATT


File S4. List of additional enzymatic rate equations added to the model in this study. Michaelis-Menten kinetics were used to describe the concentration-dependent rates in all cases.

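EC 2.3.3.9
\[ V = V_{max} \frac{GOA \cdot AceCoA}{\left(GOA + K_{mGOA}\left(1 + \frac{GCA}{K_{mGCA}}\right)\right)\left(AceCoA + K_{mAceCoA}\right)} \]

EC 1.1.1.40
\[ V = V_{max} \frac{Mal \cdot NADP^{+} - \frac{PYR \cdot NADPH \cdot CO_{2}}{K_{e}}}{K_{mMal} \cdot K_{mNADP^{+}} \left(1 + \frac{Mal}{K_{mMal}} + \frac{NADP^{+}}{K_{mNADP^{+}}} + \frac{PYR}{K_{mPYR}} + \frac{Mal \cdot NADP^{+}}{K_{mMal} \cdot K_{mNADP^{+}}}\right)} \]

EC 2.2.1.7
\[ V = V_{max} \frac{PYR \cdot GAP}{\left(GAP + K_{mGAP}\right)\left(PYR + K_{mPYR}\right)} \]

EC 1.1.1.267
\[ V = V_{max} \frac{DXP \cdot NADPH - \frac{MEP \cdot NADP^{+}}{K_{e}}}{K_{mMEP} \cdot K_{mNADP^{+}} \left(1 + \frac{NADP^{+}}{K_{mNADP^{+}}} + \frac{MEP}{K_{mMEP}} + \frac{DXP}{K_{mDXP}} + \frac{NADPH}{K_{mNADPH}} + \frac{NADP^{+} \cdot MEP}{K_{mMEP} \cdot K_{mNADP^{+}}} + \frac{DXP \cdot NADPH}{K_{mDXP} \cdot K_{mNADPH}}\right)} \]

EC 1.17.1.2
\[ V = V_{max} \frac{MEP - \frac{IPP}{K_{e}}}{K_{mMEP} \cdot K_{mIPP} \left(1 + \frac{MEP}{K_{mMEP}} + \frac{IPP}{K_{mIPP}}\right)} \]

EC 5.3.3.2
\[ V = V_{max} \frac{IPP - \frac{DMAPP}{K_{e}}}{K_{mDMAPP} \cdot K_{mIPP} \left(1 + \frac{DMAPP}{K_{mDMAPP}} + \frac{IPP}{K_{mIPP}}\right)} \]

EC 2.5.1.29
\[ V = V_{max} \frac{DMAPP \cdot IPP}{\left(DMAPP + K_{mDMAPP}\right)\left(IPP + K_{mIPP}\right)} \]

EC 2.5.1.10
\[ V = V_{max} \frac{GPP \cdot IPP}{\left(GPP + K_{mGPP}\right)\left(IPP + K_{mIPP}\right)} \]

EC 4.2.3.16
\[ V = V_{max} \frac{GPP}{GPP + K_{mGPP}} \]

EC 2.5.1.21
\[ V = V_{max} \frac{FPP}{FPP + K_{mFPP}} \]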
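The irreversible rate laws in File S4 share a small number of functional forms. The following is a minimal Python sketch of three of them, not the SimBiology implementation used in the study. The function names and all numerical values are hypothetical and chosen only for illustration; the one-substrate form corresponds to EC 4.2.3.16 and EC 2.5.1.21, the two-substrate form to EC 2.2.1.7, EC 2.5.1.29 and EC 2.5.1.10, and the inhibited form to EC 2.3.3.9, assuming GCA acts as a competitive inhibitor with respect to GOA.

```python
def mm_single(vmax, s, km):
    """One-substrate irreversible Michaelis-Menten rate: Vmax*S/(S + Km)."""
    return vmax * s / (s + km)


def mm_bisubstrate(vmax, a, b, km_a, km_b):
    """Two-substrate irreversible rate: Vmax*A*B/((A + KmA)*(B + KmB))."""
    return vmax * a * b / ((a + km_a) * (b + km_b))


def mm_bisubstrate_inhibited(vmax, a, b, km_a, km_b, inhibitor, km_i):
    """Two-substrate rate where an inhibitor scales the apparent Km of A.

    The apparent Km of A becomes KmA*(1 + I/KmI), as in the EC 2.3.3.9 form.
    """
    km_a_apparent = km_a * (1.0 + inhibitor / km_i)
    return vmax * a * b / ((a + km_a_apparent) * (b + km_b))


if __name__ == "__main__":
    # Hypothetical values, in arbitrary but consistent units.
    print(mm_single(vmax=10.0, s=2.0, km=2.0))
    print(mm_bisubstrate(vmax=8.0, a=1.0, b=3.0, km_a=1.0, km_b=3.0))
    print(mm_bisubstrate_inhibited(8.0, 1.0, 3.0, 1.0, 3.0, inhibitor=1.0, km_i=1.0))
```

With the inhibitor concentration set to zero, the inhibited form reduces to the plain two-substrate form, which is a quick consistency check on any implementation of these equations.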

References
1. Xin C, Tholen D, Devloo V, Zhu XG (2015) The benefits of photorespiratory bypasses: how can they work? Plant Physiol 167(2):574-585.
2. Washburn MP, Wolters D, Yates JR, 3rd (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol 19(3):242-247.
3. Zhang Y, Liu S, Dai SY, Yuan JS (2012) Integration of shot-gun proteomics and bioinformatics analysis to explore plant hormone responses. BMC Bioinformatics 13 Suppl 15:S8.
4. Xu T, et al. (2006) ProLuCID, a fast and sensitive tandem mass spectra-based protein identification program. Mol Cell Proteomics 5(10):S174-S174.
5. Tabb DL, McDonald WH, Yates JR, 3rd (2002) DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics. J Proteome Res 1(1):21-26.
6. Carvalho PC, Fischer JS, Chen EI, Yates JR, Barbosa VC (2008) PatternLab for proteomics: a tool for differential shotgun proteomics. BMC Bioinformatics 9.
7. Zybailov B, et al. (2006) Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J Proteome Res 5(9):2339-2347.
8. Ivleva NB, Bramlett MR, Lindahl PA, Golden SS (2005) LdpA: a component of the circadian clock senses redox state of the cell. EMBO J 24(6):1202-1210.
9. Andersson CR, et al. (2000) Application of bioluminescence to the study of circadian rhythms in cyanobacteria. Methods Enzymol 305:527-542.
10. Orlova I, et al. (2009) The small subunit of snapdragon geranyl diphosphate synthase modifies the chain length specificity of tobacco geranylgeranyl diphosphate synthase in planta. Plant Cell 21(12):4002-4017.
11. Tholl D, Croteau R, Gershenzon J (2001) Partial purification and characterization of the short-chain prenyltransferases, geranyl diphosphate synthase and farnesyl diphosphate synthase, from Abies grandis (grand fir). Arch Biochem Biophys 386(2):233-242.
12. Grawert T, et al. (2004) IspH protein of Escherichia coli: studies on iron-sulfur cluster implementation and catalysis. J Am Chem Soc 126(40):12847-12855.
13. de Ruyck J, Durisotti V, Oudjama Y, Wouters J (2006) Structural role for Tyr-104 in Escherichia coli isopentenyl-diphosphate isomerase: site-directed mutagenesis, enzymology, and protein crystallography. J Biol Chem 281(26):17864-17869.
14. Ramos-Valdivia AC, van der Heijden R, Verpoorte R, Camara B (1997) Purification and characterization of two isoforms of isopentenyl-diphosphate isomerase from elicitor-treated Cinchona robusta cells. Eur J Biochem 249(1):161-170.
15. Rohdich F, et al. (2006) Isoprenoid biosynthesis in plants - 2C-methyl-D-erythritol-4-phosphate synthase (IspC protein) of Arabidopsis thaliana. FEBS J 273(19):4446-4458.
16. Matsushima D, et al. (2012) The single cellular green microalga Botryococcus braunii, race B possesses three distinct 1-deoxy-D-xylulose 5-phosphate synthases. Plant Sci 185-186:309-320.
17. Bowden L, Lord JM (1978) Purification and comparative properties of microsomal and glyoxysomal malate synthase from castor bean endosperm. Plant Physiol 61(2):259-265.
18. Harary I, Korey SR, Ochoa S (1953) Biosynthesis of dicarboxylic acids by carbon dioxide fixation. VII. Equilibrium of malic enzyme reaction. J Biol Chem 203(2):595-604.
19. Wheeler MC, et al. (2005) A comprehensive analysis of the NADP-malic enzyme gene family of Arabidopsis. Plant Physiol 139(1):39-51.


20. Rajaonarivony JI, Gershenzon J, Croteau R (1992) Characterization and mechanism of (4S)-limonene synthase, a cyclase from the glandular trichomes of peppermint (Mentha x piperita). Arch Biochem Biophys 296(1):49-57.
21. Alonso WR, Rajaonarivony JI, Gershenzon J, Croteau R (1992) Purification of 4S-limonene synthase, a monoterpene cyclase from the glandular trichomes of peppermint (Mentha x piperita) and spearmint (Mentha spicata). J Biol Chem 267(11):7582-7587.
22. LoGrasso PV, Soltis DA, Boettcher BR (1993) Overexpression, purification, and kinetic characterization of a carboxyl-terminal-truncated yeast synthetase. Arch Biochem Biophys 307(1):193-199.
23. Heineke D, et al. (1991) Redox transfer across the inner chloroplast envelope membrane. Plant Physiol 95(4):1131-1137.
