US008709766B2

(12) United States Patent (10) Patent No.: US 8,709,766 B2 Radakovits et al. (45) Date of Patent: Apr. 29, 2014

(54) USE OF ENDOGENOUS PROMOTERS IN GENETIC ENGINEERING OF NANNOCHLOROPSIS GADITANA

(71) Applicant: Colorado School of Mines, Golden, CO (US)

(72) Inventors: Randor Radakovits, Denver, CO (US); Robert Jinkerson, Golden, CO (US); Matthew Posewitz, Golden, CO (US)

(73) Assignee: Colorado School of Mines, Golden, CO (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 13/654,347

(22) Filed: Oct. 17, 2012

(65) Prior Publication Data
US 2013/0102040 A1 Apr. 25, 2013

Related U.S. Application Data

(60) Provisional application No. 61/548,157, filed on Oct. 17, 2011, provisional application No. 61/578,110, filed on Dec. 20, 2011.

(51) Int. Cl.
C12P 7/64 (2006.01)
C12N 9/10 (2006.01)
C12N 1/13 (2006.01)
C07H 21/04 (2006.01)

(52) U.S. Cl.
USPC ... 435/134; 435/193; 435/257.2; 435/173.5; 435/470; 536/23.7; 536/24.1

(58) Field of Classification Search
None
See application file for complete search history.

(56) References Cited

PUBLICATIONS

Armbrust et al., "The genome of the diatom Thalassiosira pseudonana: Ecology, evolution, and [...]", Science, pp. 79-86 (2004).
Atsumi et al., "Direct photosynthetic recycling of carbon dioxide to isobutyraldehyde", Nature Biotechnology, pp. 1177-1180 (2009).
Blanc et al., "The Chlorella variabilis NC64A genome reveals adaptation to photosymbiosis, coevolution with viruses, and cryptic sex", The Plant Cell (2010).
Blüthgen et al., "Biological Profiling of Gene groups utilizing Gene ontology - A statistical and software framework" (2004), Gossip software, http://www.microdiscovery.de/.
Blüthgen et al., "Biological Profiling of Gene Groups utilizing Gene Ontology", Genome Informatics 16 (1), pp. 106-115 (2005).
Bowler et al., "The Phaeodactylum genome reveals the evolutionary history of diatom genomes", Nature, pp. 239-244 (2008).
Chen et al., "Conditional production of a functional fish growth hormone in the transgenic line of Nannochloropsis oculata (Eustigmatophyceae)", Journal of Phycology, pp. 768-776 (2008).
Cock et al., "The Ectocarpus genome and the independent evolution of multicellularity in brown algae", Nature, pp. 617-621 (2010).
Conesa et al., "Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research", Bioinformatics 21, pp. 3674-3676 (2005).
Gobler et al., "Niche of harmful alga Aureococcus anophagefferens revealed through ecogenomics", Proceedings of the National Academy of Sciences, pp. 4352-4357 (2011).
Götz et al., "B2G-FAR, a species centered GO annotation repository", Bioinformatics (2011).
Götz et al., "High-throughput functional annotation and data mining with the Blast2GO suite", Nucleic Acids Research 36, pp. 3420-3435 (2008).
Gouveia et al., "Microalgae as a raw material for biofuels production", Journal of Industrial Microbiology & Biotechnology, pp. 269-274 (2009).
Hu et al., "Microalgal triacylglycerols as feedstocks for biofuel production: perspectives and advances", The [...] Journal, pp. 621-639 (2008).
Karpowicz et al., "The GreenCut2 resource, a phylogenomically derived inventory of [...] specific to the plant lineage", Journal of Biological Chemistry, pp. 21427-21439 (2011).
Kindle, K.L., "High-frequency nuclear transformation of Chlamydomonas reinhardtii", Proceedings of the National Academy of Sciences, pp. 1228-1232 (1990).
Le Corguille et al., "Plastid genomes of two brown algae, Ectocarpus siliculosus and Fucus vesiculosus: further insights on the evolution of red-algal derived plastids", BMC Evolutionary Biology 9, p. 253 (2009).
Li et al., "Chloroplast-encoded chlB is required for light-independent protochlorophyllide reductase activity in Chlamydomonas reinhardtii", The Plant [...] 5, pp. 1817-1829 (1993).
Li et al., "Chlamydomonas starchless mutant defective in ADP-glucose pyrophosphorylase hyper-accumulates triacylglycerol", Metabolic Engineering, pp. 387-391 (2010).
Marchler-Bauer et al., "CDD: a Conserved Domain Database for the functional annotation of proteins", Nucleic Acids Research 39, pp. 225-229 (2011).
Matsuzaki et al., "Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D", Nature, pp. 653-657 (2004).
Merchant et al., "The Chlamydomonas genome reveals the evolution of key [...] and plant functions", Science, pp. 245-250 (2007).
Oudot-Le Secq et al., "Chloroplast genomes of the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana: comparison with other plastid genomes of the red lineage", Molecular Genetics and Genomics 277, pp. 427-439 (2007).

(Continued)

Primary Examiner: David T Fox
Assistant Examiner: Matthew Keogh
(74) Attorney, Agent, or Firm: Dorsey & Whitney LLP

(57) ABSTRACT

The present disclosure is directed to novel polynucleotide sequences for use in Nannochloropsis gaditana. The novel polynucleotide sequences include control sequences and coding sequences. Also disclosed are novel gene expression constructs wherein N. gaditana promoters/control regions are operatively linked to N. gaditana or non-N. gaditana coding sequences. These novel polynucleotide sequences and expression constructs can be introduced into N. gaditana and can recombine into the N. gaditana genome. Expression from these polynucleotide sequences and expression constructs can enhance N. gaditana biomass and/or lipid biosynthesis. Also disclosed are methods for modifying N. gaditana, for example by stably transforming N. gaditana with nucleic acid sequences, growing the modified N. gaditana, and obtaining biomass and biofuels from the modified N. gaditana.

13 Claims, 198 Drawing Sheets

(56) References Cited

PUBLICATIONS

Pal et al., "The effect of light, salinity, and nitrogen availability on lipid production by Nannochloropsis sp.", Applied Microbiology and Biotechnology, pp. 1429-1441 (2011).
Radakovits et al., "Genetic engineering of fatty acid chain length in Phaeodactylum tricornutum", Metabolic Engineering, pp. 1-7 (2010).
Radakovits et al., "Genetic engineering of algae for enhanced biofuel production", Eukaryotic Cell 9, pp. 486-501 (2010).
Samstag, Antisense Nucleic Acid Drug Dev 6, pp. 153-156 (1996).
Steen et al., "Microbial production of fatty-acid-derived fuels and chemicals from plant biomass", Nature 463, pp. 559-562 (2010).
Wang et al., "Algal Lipid Bodies: Stress induction, purification, and biochemical characterization in wild-type and starchless Chlamydomonas reinhardtii", Eukaryotic Cell, pp. 1856-1868 (2009).
Work et al., "Increased lipid accumulation in the Chlamydomonas reinhardtii sta7-10 starchless isoamylase mutant and increased carbohydrate synthesis in complemented strains", Eukaryotic Cell, pp. 1251-1261 (2010).
Zaslavskaia et al., "Transformation of the diatom Phaeodactylum tricornutum (Bacillariophyceae) with a variety of selectable marker and reporter genes", Journal of Phycology, pp. 379-386 (2000).
Zou et al., "Production of cell mass and eicosapentaenoic acid (EPA) in ultrahigh cell density cultures of Nannochloropsis sp. (Eustigmatophyceae)", European Journal of Phycology, pp. 127-133 (2000).

U.S. Patent Apr. 29, 2014 Sheet 1 of 198 US 8,709,766 B2

Supplementary Table 3. [...] (tetrapyrrole), carotenoid and sterol biosynthesis genes

Gene | Description | EC number | N. gaditana model | Transcript Support | Location

Tetrapyrrole Synthesis
GS | glutamyl-tRNA synthetase | 6.1.1.17 | Nga04989
GTS | glutamyl-tRNA synthetase | 6.1.1.24 | Nga02834
GTR | glutamyl-tRNA reductase | 1.2.1.70 | Nga02604
GSA | glutamate-1-semialdehyde aminotransferase / glutamate-1-semialdehyde 2,1-aminomutase | 5.4.3.8 | Nga30045
ALAD (HemB) | 5-aminolevulinic acid dehydratase / porphobilinogen synthase | 4.2.1.24 | Nga00585
PBGD (HemC) | porphobilinogen deaminase / hydroxymethylbilane synthase | 2.5.1.61 | Nga03248
UROS (HemD) | uroporphyrinogen synthase | 4.2.1.75 | Nga00807
UROD | uroporphyrinogen I decarboxylase | 4.1.1.37 | Nga04120
UROD | uroporphyrinogen decarboxylase | 4.1.1.37 | Nga05706
[...] | coproporphyrinogen oxidase | 1.3.3.3 | Nga05151
CPX1 (HemF) | coproporphyrinogen oxidase | 1.3.3.3 | Nga04278 (partial)
PPX | protoporphyrinogen IX oxidase | 1.3.3.4 | Nga03873
ChlD | protoporphyrin IX Mg-chelatase subunit D | 6.6.1.1 | Nga30773
ChlI | protoporphyrin IX Mg-chelatase subunit I | 6.6.1.1 | Nga40092
ChlH-1 | protoporphyrin IX Mg-chelatase subunit H | 6.6.1.1 | Nga30995
ChlH-2 | protoporphyrin IX Mg-chelatase subunit H | 6.6.1.1 | Nga06242
PPMT (ChlM) | Mg-protoporphyrin IX methyltransferase | 2.1.1.11 | Nga04808
ACSF (ycf59) | Mg-protoporphyrin IX monomethylester (oxidative) cyclase | 1.14.13.81 | Nga40091
DVR | divinyl protochlorophyllide a 8-vinyl-reductase | 1.3.1.75 | Nga05945
POR | light-dependent NADPH-protochlorophyllide oxidoreductase | 1.3.1.33 | Nga04959
POR | light-dependent NADPH-protochlorophyllide oxidoreductase | 1.3.1.33 | Nga00683
ChlB | light-independent protochlorophyllide oxidoreductase subunit B | 1.18.-.- | Nga40089
ChlL | light-independent protochlorophyllide oxidoreductase subunit L | 1.18.-.- | Nga40044
ChlN | light-independent protochlorophyllide oxidoreductase subunit N | 1.18.-.- | Nga40045
CHS (ChlG) | chlorophyll synthase | 2.5.1.62 | Nga31097

F.G.R. A U.S. Patent Apr. 29, 2014 Sheet 2 of 198 US 8,709,766 B2

W. gadiana Transcript eene Description ECner mode Support location GGR (ChiP) geranylgerary reductase 3. Nga04895 UM uroporphyrinogen C-methyltransferase 2.1.1.107 NgaO5160 Sir B sirohydrochlorin ferrochelatase 4.99.14 NgaO0339 FC ferrochelatase 4.99.1.1 NgaO0748 Caroteroid Biosynthesis 1-deoxy-D-xylulose-5-phosphate DXS synthase 22.7 NgaO2203 Y 1-deoxy-D-xylulose-5-phosphate DXR (IspC) reductoisonerase 1.1.1.267 Nga30771 2-C-methyl-D-erythritol 4-phosphate MCT (spD) cytidyltransferase 2.7.7.60 NgaO698 4-(cytidine 5'-diphospho)-2-C-methyl CMK (spe) D-erythritoi kinase 2.7.1. 148 Nga04584 2-C-methyl-D-erythrito 24 MDS (spf) cyclodiphosphate synthase 4.6.1.1 2 NgaO2651 4-hydroxy-3-methylbut-2-enyi HDS (sp6) diphosphate synthase 1.17.4.3 Nga30806 4-hydroxy-3-methylbut-2-enyl HDR (IspH) diphosphate reductase 1.17.1.2. NgaO5308 isopentenyl diphosphate:dimethylatty Nga03838 isopentenyl diphosphate:dimethylaty NCaOOO2 diphosphate type 9 GGPPS (CrtE) expany pyrophosphate 5.1.29 Nga02636 PSY (CritB) phytoene synthase NgaO2957 PDS (CrtP) phytoene desaturase NgaO5064 ZDS (CrtQ) Zeta-caroteine desaturase 1993 NgaO7310 No CRSO caroteroid isomerase Honolog LCYB (CftL-b) tycopene 3-cyclase NgaO0640 cytochrome P450 enzyme related to CYP97E CYP97A carotene 3-hydroxylase Nga30077 cytochrome P450 enzyme related to CYP97F CYP97A carotene 8-hydroxylase Nga00100 1143.9 ZE) zeaxanthin epoxidase O NgaO1534 WDE violaxanthin de-epoxidase 1.10.99.3 Nga02700 NSY neoxanthin synthase 5.3.99.9 Nga10001 Sterol Synthesis ACA acetyl-CoA C-acetyltransferase 2.3.1.9 Nga20998 ACA acetyl-CoA C-acetyltransferase 2.3, 1.9 Nga30830 HMGS hydroxymethylglutary-CoA synthase 2.3.3.10 NgaO6246 GRE 3 U.S. Patent Apr. 29, 2014 Sheet 3 Of 198 US 8,709,766 B2

N. gadiana Transcript Gene Description EC number modes Sup port location HMGR hydroxymethylglutary-CoA reductase 11.34 newakonate kinase 2.7.1.36 Honolog No PMK phosphomevalonate kinase 2.74.2 Honolog No MWO diphosphomevalonate decarboxylase 4.1.1.33 Honolog geranyl-disphosphate synthase GPPS dimethylatytranstransferase 2.5.1.1 NgaO2865 geranyl-disphosphate synthase 1 GPPS dimethylaitytranstransferase 2.5.1.1 NgaO 1978 FPPS farnesyl-diphosphate synthase 25.168 NgaO2874 isopentenyi diphosphate:dimethylally D diphosphate isomerase type 5.3.3.2 NgaO3838 isopentenyi diphosphate:dimethylally O diphosphate isomerase type 5.3.3.2 Ngat 0002 squalene monoxygenase f squalene SQE (SQP) epoxidase 14.99.7 NgaO 1590 CAS cycioarteno synthase 54,998 Nga30790 CYP51 C4-demethylase (sterol 4 14.13.7 NgaO3733 demethylase) O .1 FACKEL D4-Sterol reductase 13.170 Nga00758 SMO Steroi methyl-oxidase (C4 143.7 No methylstero morioxygenase) 2 Homolog C4-decarboxylase (sterol-4-alpha HSD carboxylate 3-dehydrogenase) 1117O NgaO6114 sterol-C24-methyl (sterol SMT 24-C-methyltransferase) 2.1. 141 Nga05943 steroi-C24-methyltransferase (sterol SMT 24-C-methyltransferase) 2.1. 141 Nga02534 D8-D7-sterol-isomerase (cholesteno! No HYD1 deta-isonerase) 5.3.3.5 Homolog DWF7 (STE 1) C5-desaturase (lathosterol oxidase) 1421.6 NgaO2795 D7-stero reductase (7- DWF5 dehydrochoestero reductase) 1.3.1.21 Nga03254 D24-stero reductase (24 DWF1 dehydrochoestero reductase) 13.172 NgaO3764 D24-steroi reductase (24 DWF dehydrochoestero reductase) 13.172 Nga05293 cyclopropy sterol isomerase CP (CCI) (cycoeucalenoi cycloisonerase) 5.5.19 Nga301.96 DE 2 D5-stero reductase 3.1.30 Nga00656

GREC U.S. Patent Apr. 29, 2014 Sheet 4 of 198 US 8,709,766 B2


Nannochloropsis gaditana chloroplast genome, 114,875 bp

Legend: photosystem I; photosystem II; cytochrome b/f complex; RuBisCO subunit; chlorophyll biosynthesis; metabolism and quality control; ATP synthase; RNA polymerase; ribosomal RNAs; ribosomal proteins (SSU); ribosomal proteins (LSU); transfer RNAs; protein assembly / membrane insertion; hypothetical chloroplast reading frames (ycf); other genes

FIGURE 4

Nannochloropsis gaditana mitochondrial genome, 42,067 bp

Legend: complex I (NADH dehydrogenase); complex III (ubiquinol cytochrome c reductase); complex IV (cytochrome c oxidase); ATP synthase; ribosomal proteins (SSU); ribosomal proteins (LSU); other genes; transfer RNAs; ribosomal RNAs; repeat region

FIGURE 5

FIGURES 6A-6E: table with columns Gene Name, Guillardia theta, Odontella sinensis, Phaeodactylum tricornutum, Thalassiosira pseudonana, Emiliania huxleyi, Ectocarpus siliculosus, N. gaditana. Legible row labels include petB.

U.S. Patent Apr. 29, 2014 Sheet 14 of 198 US 8,709,766 B2

N. gaditana model | Gene model description

Nga02737 | phospholipid diacylglycerol acyltransferase
Nga02743 | serine palmitoyltransferase
Nga02741 | pyruvate dehydrogenase component X

Nga00713 | guanine deaminase
Nga00717 | cytohesin-1 isoform 2
Nga00702 | serine threonine-protein kinase tousled-like 1 isoform 2
Nga00818 | monogalactosyldiacylglycerol synthase
Nga00817 | fatty acid desaturase
Nga05525 | anthranilate synthase
Nga05530 | cystathionine beta-synthase
Nga05522 | dihydroxy-acid dehydratase

Nga03876 | protoporphyrinogen oxidase
Nga03879 | protoporphyrinogen oxidase
Nga03873 | protoporphyrinogen oxidase

Nga20 rrna p g prot p Nga30171 u4 u6 small nuclear ribonucleoprotein prp3 Nga30821 u4 u6 smal nuclear ribonucleoprotein prp4 NgaO5511 prp4 pre-mina processing factor 4 homolog NCa3O351 4 u6 ib 4

NgaO3294 tfih Subunit Nga30114 protein NgaO3302 polyadenlyte binding protein b chain structure of the mile domain of poly-binding protein NgaO3296 in Complex with the binding region of DaiO2

NgaO3486 abc atp-binding permease protein NCaO3479 to-bindi t fami Nga03304 S-adenosylmethionine synthetase NgaO3314 glutamyl-tra amidotransferase subunit a Nga03312 cytidine and deoxycytidylate deaminase family protein NgaO3305 rina pseudouridylate synthase family protein Nga03303 carbamoyl-phosphate small subunit NgaO3313 GRE 8A U.S. Patent Apr. 29, 2014 Sheet 15 Of 198 US 8,709,766 B2

Nga03316 aspartate aminotransferase Nga03306 thymidylate synthase NgaO3310 dihydrofolate reductase Nga03311 threonyl-trna synthetase

Ng Nga03669 Uw-damaged dna-binding Nga30761 gtip-binding protein Nga30709 dra damage-binding

Ng aSp-g polypep Nga03202 dria mismatch repair protein msh2 Nga03203 dna mismatch repair protein msh2 Nga30317 dna mismatch repair protein msh2 Nga30528 protein

NgaO1560 of inner mitochondrial membrane 13 homolog Nga01563 lipid a export atp-binding permease protein msba Nga20851 at2g36910-like protein Nga20388 atp-binding sub-family b (mdr tap) member 10 NgaO4041 dihydrolipoamide acetyltransferase NgaO4043 branched-chain alpha-keto acid dehydrogenase subunit e2 NgaO4040 branched-chain alpha-keto acid dehydrogenase subunit e2 Nga30513 dna replication licensing factor mom3 NgaO1778 dna replication licensing factor mom3 Nga04063 dihydroxy-acid dehydratase Nga30327 dihydroxy-acid dehydratase 44 iaideh

Nga301.42 glutamate-cysteine tigase catalytic subunit Nga30765 glutamate-cysteine catalytic subunit

NgaO2623.2 molybdenum synthesis NgaO1769.1 5-oxoprolinase Nga01765.01 CyStathionine gamma-

NgaO0538 smail nuclear ribonucleoprotein associated protein b Nga00552 translesion dna polymerase-rev 1 deoxycytidyl transferase GRE 83 U.S. Patent Apr. 29, 2014 Sheet 16 of 198 US 8,709,766 B2

NgaO0540 ash (or homeotic)-Eike Nga2O782 50s ribosomal protein 9 Nga00535 histone-lysine n-methyltransferase NgaO0543 ef- guanine nucleotide exchange domain-containing Nga00558 ribosomal proteins 16 NgaOO583 cg9383-pa NgaO0593 polyribonucleotide nucleotidyltransferase Nga2110 polyribonucleotide nucleotidyltransferase NgaO0596 rna helicase rnase NgaO0595 dicer-like protein 2 Nga20178 dice-1 NgaO0568 Cdc2-like protein kinase NgaO0562 5-3 exoribonuclease 2 Nga00557 glycy-trna synthetase NgaO0581.0 dna polymerase v family Nga00566.01 protein bud31 homolog

Table notes: Gene ontology term that defines gene cluster. Gene ontology term description. Name of contig that gene cluster is found on. N. gaditana gene model. N. gaditana gene model description. Gene location in cluster.

FIGURE 8C


N. gadiana Gene Conserved Aigal Found in Gene Namea pontain of conserved donain description crop Greencate Exopolyphosphatase-related proteins, Nga00879 SSS 8. cystathionine beta-synthase (CBS) pair Superfamily

codOO353; Ribosomal protein S15 (prokaryotic)/S13 Nga01062 COO349 (eukaryotic), S5/NS/EPRS RNA-binding domain NgaO241 COO999 YC-related donai BD NgaO20 cC5363 Oxygen evolving enhancer protein 3 (PsbQ) BD STARTIRHO alpha C/PTP/Bet y1/CoxG! Nga02601.01 c 14843 CalC (SRPBCC) ligand-binding domain BD superfamily NgaO2755 CO900 Axonemal dynein light chain BO NgaO2799 BO NgaO3348 CCOO.204 repeats BO c 12031, Esterase lipase superfamily; alpha/beta Nga03586 pfam00561 1 fold B NgaO3779 CO430 N-Acyltransferase superfamily BD Nga03868 BD Nga04312.0 c10460 Sulfatase superfamily BO NgaO4890 c2O3 Esterase lipase superfamily BO Glycosyltransferase family A (GT-A) type Nga05077 C1 394 superfamily BD NgaO5710 BO Nga05790.1 coC674 Major Facilitator Superfamily (MFS) Nga05820 BD Nga05889 PRK00107 16S rRNA methyltransferase GidB BO Nga06056 BO Nga06213 CO460 Sulfatase B) Nga06321 BC Nga06681 C1238 GDP-fucose protein O-fucosyltransferase BO Nga067.17 CO927 Ribonuclease 2-C B Nga20030 C10638 Protein of unknown function (DUF726) Nga20064 CO2660 TAZ zinc finger Nga20089 CO2576 bZIP transcription factor Nga20138

FIGURE A. U.S. Patent Apr. 29, 2014 Sheet 20 Of 198 US 8,709,766 B2

Wigadiana Gene Conserved Alga! Found in Gene Name Domain D Conserved Domain Description group Greenicut;2 Crotonase/Enoyl-Coenzyme A (CoA) Nga2029 c 4780 hydratase like superfamily BD Nga2033 BD von Hippel-Landau (pVHL) tumor Nga20943 CO3381 suppressor protein BO Nga21039 C 12207 MgtC family (unknown function) B c0993; Rossmann-fold NAD(P)(+)-binding proteins, Nga21144 COG0644 Dehydrogenases (flavoproteins) (FixC) BD Nga30278 COO 161 Dihydrofolate reductase (DHFR) BD S-adenosylmethionine-dependent Nga30454 C2O1 methyltransferases (SAM or AdoMet Miase), class superfamily Nga30564 B START/RHO alpha C/PITP/Bet v1/CoxG? Nga30653 C 4643 CatC (SRPBCC) ligand-binding domain BO Superfamily c 14603; C2 superfamily; Rossmann-fold NAD(P)(+)- B Nga30880 CO993 binding proteins D CdO005; EF-hand, calcium binding motif: COP Nga30943 CO726 associated protein Nga40029 CO3567 Yofa superfamily PLNO2584; 5'-methylthioadenosine nucleosidase; Nga00093 COO303 Phosphorylase superfamily BDG GreenCut 2

COG5070; Nucleotide-sugar transporter (VRG4); Nga00146 CO)1037 EamA-like transporter family BDG GreenCut2 Nga00382 BDG GreenCut2 Nga00539 COO 3 Glycosyltransferase GTB type superfamily BDG GreenCit2 Ferredoxin thioredoxin reductase catalytic NgaO0957 CO1977 beta chain BDG GreenCut2 band 7 domain of flotiin (reggie) Eike Nga01759.01 CCO34O7 proteins BDG GreenCit2

pheophorbide a oxygenase, Rieske 2Fe PLNO258; 2S cluster binding domain; NgaO2012 c00938; START/RHO alpha CPTP/Bet y1/CoxG BDG GreenCut2 CA1643 CatC (SRPBCC) ligand-binding domain superfamily

GRE 3 U.S. Patent Apr. 29, 2014 Sheet 21 Of 198 US 8,709,766 B2

W. gadiana Gene Conserved Aigal Found in Gene D Name Domain D Conserved Domain Description group GreenCut2. Rieske 2Fe-2S cluster binding domain; cOO938: pheophorbidea oxygenase; Nga02567 PLN02518; START/RHO alpha C/PTP/Bet y1/CoxG, BDG GreenCut2 c.14643 CatC (SRPBCC) tigand-binding domain superfamily NgaO2700.0: CO6253 Wiolaxanthin de-epoxidase (VDE) BDG GreenCut2 Nga02877 a bandproteins 7 domain (lipid raft-associated) of fiottlin (reggie) like BG GreenCut2 Nga03047 BOG GreenCut2 Nga03349 COG 114 Ferredoxi BG GreenCit2 cdO6660; Aido-keto reductases, Predicted NgaO336: COGO667 BDG GreenCut2 c{O1154; CreA superfamity; Nuclear transport factor 2 Nga03364 CO909 superfamily BDG GreenCit2 PLN02657; 3,8-divinyl protochlorophyllide a 8-viny Nga05945 DVR cdO5243: reductase; atypical (a) SORs, subgroup 5. BG GreenCut2 CO9931 Rossmann-fold NAD(P)-(+)-binding proteins Nga20955 BDG GreenCut2 Nga20977.1 BBG GreenCut2 Nga21 90 B DG GreenCut2 Nga30820 C 34.79 Protein of unknown function (DUF3529) BDG GreenCut2

Nga30882 COO8O proteinRetina pigment epithelia 65 membrane BG GreenCut2 Nga40028 psal CO365 Photosystem reaction centre subunit XI BOG GreenCut2

cdO5243: atypical (a) SDRs, subgroup 5, Rossmann Nga00206 c09931; fold NAD(P)(+)-binding proteins; NAD BORG GreerCit2 pfam0370 dependent epimerase? dehydratase family C-terminal processing peptidase, serine Nga00244 CdO756C protease family S4 BDRG Green Cut2 PLNO2679; hydrolase, alpha/beta fold family protein, Nga00352 C2O31 Esterase ipase superfamily BORG GreenCut2

Nga00432 PRK13474 cytochrome b6-fcomplex iron-sulfur subunit BDRG GreenCut2 Nga00448.0 CO2879 Chlorophy: A-B binding protein BORG GreenCut2 NgaOQ462 CCOO2O7 2Fe-2S iron-sulfur cluster binding domain BORG GreerCit2 (fer2) superfamily Nga00522.01 CO2879 Chlorophy: A-B binding protein BRG GreenCut2 Nga00633 c1841 photosystem protein Psb27 BRG GreenCut2 FGURE C U.S. Patent Apr. 29, 2014 Sheet 22 Of 198 US 8,709,766 B2

N. gadiana Gene Conserved Alga Found in Gene Name Domain of Conserved Domain Description group GreenCut2. LCYB NgaO0640 (CrtL CO993 Rossmann-fold NAD(P)(+)-binding proteins BORG GreenCut2 b S-adenosylmethionine-dependent NgaOO679.01 C12O11 methyltransferases (SAM or AdoMet BRG GreeCult2 MTase), class superfamily Retina; pigment epithelia 65 membrane NgaO0694 c10080: , Lignosti bene BORG GreenCut2 COG3670 alpha, beta-dioxygenase and related NgaOO774.01 CO2879 Chlorophy: A-B binding protein BRG GreeCut2 Nga00782 Co02947;COO388 RX family. Thioredoxin like superfamily BDRG GreeCult2 UROS NgaO0807 (Hem co06578 Uroporphyrinogen-I synthase (hemD) BDRG GreenCut2 D) monogalactosyidiacylglycerol (MGDG) NgaO0818 MGD PNO2605 synthase BORG GreenCit? NgaOO935.01 COGO534: Na+-driven multidrug efflux pump (NorM); BDRG GreeCut2 c10513 Mate superfamily coiO5265; atypical (a) SDRs, subgroup 1, Rossmann Nga00964 clO9931 fold NAD(P)-(+)-binding proteins. BDRG GreeCit? COGO451 Nucleoside-diphosphate-sugar epimerases Manganese-stabilising protein NgaO0965.01 CO3326 photosystem li polypeptide BDRG GreenCit? cyclophilin-type peptidylproly cis-trans Nga00983.01 cOO1.97 isornease BORG GreenCut2 Nga01068 COO484 Nifty-like donnair BORG GreenCut2 Protein of unknown function (DUF2470); Nga01167 c11228; Pyridoxine 5'-phosphate (PNP) oxidase-like BORG GreeCut2 COO381 proteins

TIGRO1292 thioredoxin-disulfide reductase, Nga01264 clOO388: hioredoxin like superfamily; Rossmann BDRG GreenCut2 CO993 fold NAD(P)(+)-binding proteins C-terminal processing peptidase, Serine Nga01344 CO7560 protease family S4 BORG GreenCut2 clOO484. Nifty-like domain; GFY-YIG nuclease domain BORG GreenCut2 NgaO 1483 C15257 superfamily NgaO 1705 CO1535 Domain of unknown function (DUF477) BDRG GreeCult2 NgaO1771.01 cO9611 Domain of unknown function (DUF1995) BORG GreenClt2 NgaO1898.01 cO2879 Chlorophy: A-B binding protein BORG GreenCut2 FGRE O U.S. Patent Apr. 29, 2014 Sheet 23 Of 198 US 8,709,766 B2

N. gaditana Gene Conserved Aigal Ford in Gene Name Domain D Conserved Domain Description group: GreenCut2. Nga01936 c 1587 FKBP-type peptidyl-prolyl cis-trans BRG Greer Cit2 ISOe3Se COG0546; Predicted phosphatases (Gph); Haloacid Nga01968 CC0427 dehalogenase-like BORG GreenClt2

PLN.02824, hydrolase, alpha/beta fold family protein; Nga01980.0 C12O31 Esterase ipase superfamily BDRG GreenCut2

Nga02139 Roos 2-dehydropantoate(apbA?panE) 2-reductase BRG Greer Cit2 NgaO2243 ClOO337 UbiA prenyltransferase famity BORG GreenCut2

PLN.02775; Probable dihydrodipicolinate reductase; NgaO2246 clO993; Rossmann-fold NAD(P)-(+)-binding proteins; BORG GreenCut2 ClO4965 Dihydrodipicolinate reductase, C-terminus

GROO225 C-terminai peptidase (pro); PDZ domain of NgaO2340 codOO988 C-terminai processing-, tail-specific-, and BORG GreeCut? tricorn proteases

Predicted SAM-dependent methyltransferases; S-adenosylmethionine COG1092. dependent methyltransferases (SAM or Nga02348.0 cc.02440, AdoMet-MTase), class superfamily, BDRG GreenCut2 clOO607 Pseudouridine synthase and Archaeosine transglycosylase domain

NgaO2385 PLNO2679 hydrolase, alpha/beta fold family protein BDRG GreenCut2

Guanosine polyphosphate pyrophosphohydrolases/synthetases COGO317 Nga02586 Cd5399 (Spoil); Nucleotidyltransferase (NT) domain BDRG GreenCut2 of RelA- and Spot-like ppGpp synthetases and hydrolases GUN4-like (involved in plastid-to-nucleus Nga02777 ClO542 signaling) BRG GreenCut2 Nga02785 CO2879 Chlorophyl: A-B binding protein BORG GreenC2 Nga02790 CO2879 Chlorophy: A-B binding protein BORG GreenCut2 COG2267 Lysophospholipase, Predicted Nga02810 Cocos acetyltransferases and hydrolases with the BDRG GreenCut2 alpha/beta hydrolase fold clO9931 Rossmann-fold NAD(P)-(+)-binding proteins, Nga03075 COG1233 Phytoene dehydrogenase and related BORG GreenCut2 proteins Nga03116 CO2879 Chlorophy: A-B binding protein BORG GreeCit?

FCRE E U.S. Patent Apr. 29, 2014 Sheet 24 of 198 US 8,709,766 B2

N. gaitana Gere Conserved Aga Fond ir Gene Nane Domain be conserved Domain Description group Greenicut cciO5243; atypical (a) SDRs, subgroup 5; Rossmann Nga03131. c0993, fold NAD(P)+)-binding proteins; Predicted BORG GreenCut2 COGO702 nucleoside-diphosphate-sugar epimerases

c00388, Protein Disulfide Oxidoreductases and Nga03355 CO729 Other Proteins with a hioredoxin fold: BORG Green Cut2 Vitamin Kepoxide reductase family; TIGR02009 beta-phosphogluconutase family hydroiase, NgaO3454.0 cdO1427 Haloacid dehalogenase-like hydrolases BDRG GreenCut2

c. 12031. Esterase lipase superfamily; Predicted Nga03465.0 COG 1075 acetyltransferases and hydrolases with the BORG GreenCut2 alpha/beta hydrolase fold Nga03470 COGO534 Na+-driven multidrug efflux pump BCRG GreenCut2 Alternative oxidase, ferriti-ike diron Nga03545 AOX COO 1053 binding domain BRG GreenCut2 NgaO3581 CO2879 -B binding protein BORG GreenClt2 S-adenosylmethionine-dependent Nga03884 c1201 methyltransferases (SAM or AdoMet BORG GreenCut2 Mase), class Nga03886 c. 414 Cytochrome C superfamily BRG GreenCut2 NgaO4175 CO2879 Chlorophy A-B binding protein BORG GreenCut2 Nga04177 CO2879 Chlorophy A-B binding protein BORG GreenCut2 Nga04484 CEO.4786 SOUL heme-binding protein BORG GreerCit? Nga04536.0 CO2879 Chlorophyll A-B binding protein BORG Green Cut2 COG 1748; Saccharopine dehydrogenase and related Nga04565 CCOO590 proteins, RNA recognition motif (RRM) BORG GreerCit? C-terminal processing peptidase, serine Nga04644 CCO7560 protease family S41 BORG GreenCut2 Mg-protoporphyrin IX methyltransferase, S PRKO7580; adenosylmethionine-dependent Nga04808.0 c 1201 methyltransferases (SAM or AdoMet BORG GreenClt2 Mase), class superfamily PDS Nga05064 PLNO2612: phytoene desaturase, Rossmann-fold BRG GreenCut2 (CrtP) CO993 NAD(P)-(+)-binding proteins Protein of unknown function (DUF1092) NgaO5492 CO5808 superfamily BORG Green Cut2

Ribosomat protein S-like RNA-binding Nga05523 donain BORG GreenCut2

EIGRE . . . U.S. Patent Apr. 29, 2014 Sheet 25 Of 198 US 8,709,766 B2

N. gaitana Gere Conserved Algal Forcin Gene Nane Domain D Conserved Domain Description group GreenCut2. Guanosine polyphosphate Nga05636 COGO37 pyrophosphohydrolases/synthetases (Spot) BDRG GreenCut2

NgaO5671 COG1357 Pentapeptide repeats containing protein BDRG Green Cit2 Nga05697 CO2879 Chlorophy I A-B binding protein BDRG GreeCut2 GSTN family, 2 repeats of the N-terminal NgaO5732 cdO3041, domain of soluble GSTs (2 GSTN) BORG GreeCit? COO388 subfamily; Thioredoxin. Eike superfamily Nga05831.0 CO297 Pentapeptide repeats BORG GreenClt2 cyclophilin-type peptidylproly cis-trans NgaO5995.1 CCO1924 (cyclophilins) similar to the BORG GreenCut2 Spinach thylakoid lumen protein TLP40 Carboxy-terminal processing protease, C NgaO6121 PLN.00049; terminal processing peptidase; serine BORG GreenCut2 CdC7560 protease family S4 Nga06454 CO2879 Chlorophy A-B binding protein BDRG GreeCult2 cyclophitin-type peptidylpfoly cis-trans Nga06512 COO 97 Somerases BDRG GreenC2

Nga06539 c 1228; Protein of unknown function (DUF2470); BDRG GreenCut2 COGO748 Putative heme iron utilization protein pfamO1458 Uncharacterized (UPF0051); Nga06745 cysteine desulfurase activator complex BDRG GreenCut2 ciO3223 subunit (SufB) NgaO6853 CO2879 Chlorophyll A-B binding protein BDRG GreenCit2

NgaO7310 ZDS PLNO2487: Zeta-caroterie desaturase, Rossmann-fold BDRG GreenClt2 (CrtC) CO993 NAD(P)-(+)-binding proteins Nga21031 C00508 YGGT family BORG GreeCit? ciO993; Rossmann-fold NAD(P)(+)-binding proteins; Nga30083 COG045 Nucleoside-diphosphate-sugar epimerases BDRG GreenClt2

Uncharacterized protein conserved in Nga301.45 cO1792: bacteria (DUF2256), S-adenosylmethionine BDRG GreenCut2 C12O11 dependent methyltransferases (SAM or AdoMet-MTase), class superfamily

Nga30371 CO9931 Rossmann-fold NAD(P)-(+)-binding proteins BORG Green Cut2

Nga30374 PRKOO122 16S rRNA-processing protein Rim M BDRG GreenCut2

FCURE C. U.S. Patent Apr. 29, 2014 Sheet 26 of 198 US 8,709,766 B2

N. gadiana Gene Conserved Alga Founcil Gene Name Domain ID Conserved Domain Description group GreenCut2. Nga30421 PLNO2578 hydrolase BDRG GreenCut2 cc.(7560; C-terminal processing peptidase; serine Nga30571 CCOO988 protease family S41, PDZ domain of C- BDRG GreenCut2 terminal processing pfamO1593 Flavin containing amine oxidoreductase, Nga30762 c10993 Rossmann-fold NAD(P)-(+)-binding proteins BDRG GreenCut2

ciO9099; P-loop containing Nucleoside Triphosphate Nga30773 Ch. TGRO2O3O Hydrolases, magnesium chelatase ATPase BDRG GreenCut2 subunit Fenritin-like superfamily of diron-containing Nga30837 AOX COO264 four-helix-bundle proteins BDRG GreenCu2 Nga30961 PLNO2578 hydrolase BDRG GreenCut2 CHS Nga31097 Chi G COO337 UbiA prenyltransferase family BORG GreeCut? 2Fe-2S iron-sulfur cluster binding domain Nga40001 petF Cd 0.02O7 (fer2) superfamily BDRG GreenCut2 Photosystem reaction centre subunit IWF Nga40006 psaE ciO3585 PsaE BDRG GreenCut2 Nga40042 psa co3639 Psal superfamily BDRG GreenCut2 Ribulose bisphosphate Nga40047 rocS CdCl3527 carboxylasefoxygenase (Rubisco), small BORG GreenC2 subunit superfamily sufB cysteine desulfurase activator complex Nga40072 (ycf24 SEGas subunit SufB, putative ABC transporter BDRG GreenCut2 ) (ycf24) Nga40087 psar c03627 (Psaf)Photosystem reaction centre subunit BDRG GreenCut?

cO9099; P-loop containing Nucleoside Triphosphate Nga40092 Ch TGRO2O3O Hydrolases; magnesium chelatase ATPase BDRG GreenCut2 subunit (Bchi-Chi) Nga00022.01 CO3831 Haemolysin-I related BDG Nga00046 pfam12697 Alpha/beta hydrolase 6 family BDG PRK 13557; histidine kinase; bZIP transcription factor, NgaOO157.01 clO2576; Motif C-terminai to PAS motifs (likely to BDG CO9965 contribute to PAS structural domain) NgaO0185 CO6868 Ferredoxin reductase (FNR like superfamily) BDG Nga00317 CCOO3O PAS domain B DG

FIGURE U.S. Patent Apr. 29, 2014 Sheet 27 of 198 US 8,709,766 B2

N, gadiana Gene Conserved Alga Found in Gene D. Name Domain D Conserved Domain Description group GreenCut2 c 14813; Peptidase Gluzincin family, Ubiquitin fusion NgaOO763 pfam03152 degradation protein UFD1 BDG Transmembrane amino acid transporter NgaOO791 CO524 protein NgaO0897 COO927 Formate/nitrite transporter

Polyphosphate(polyP) polymerase domain cdO775: of yeast vacuolar transport chaperone c 1964; (VC) protein WTC4 CYTH-like (also known NgaO 1225 ciO2238; as triphosphate tunnel metalloenzyme BDG CO995. (TTM)-like) Phosphatases: SPX domain; Domain of unknown function (DUF202)

Glycine? D-amino acid oxidases Nga01418.0 COGO665 (deaminating) (DadA) acetoin dehydrogenase E2 subunit Nga01437.01 PRK 14875 dihydroipoyllysine-residue acetyltransferase ci 2031 Esterase ipase superfamily, Alpha/beta NgaO1497 pfam12695 hydrolase family BDG Nga01527 cO 108 Yeet/YedEfamily (DUF395) BDG Glycosyltransferase family 0 Nga01585 CO2988 (fucosyltransferase) BG NgaO 1597.01 COO447 Nudix Hydroiase superfamily BDG 2-polyprenyl-6-methoxyphenol hydroxylase Nga01790 COG0654, and related FAE)-dependent CO993 oxidoreductases (Ubih), Rossmann-fold NAD(P)(+)-binding proteins NgaO2655 COO218 glycosyl hydrolase famity 16 BDG PLN.02378, glutathione S-transferase DHAR1; Nga0282 COO388 Thioredoxin like superfamily BDG Nga02918 CC0430 N-Acyltransferase superfamily BOG Nga03200 BDG NgaO3278 BG cd O4730; 2-Nitropropane dioxygenase (NPD)-like; Nga03403 CO9108 TIM phosphate binding superfamily NgaO3436 CO2813 protein BDG NgaO3614.01 CO3589 Chalcone-flavanone isonerase BDG Domain of unknow function 202 NgaO3952.01 CO9954 superfamily BDG Nga04174. PRK09525 beta-D-galactosidase (acz) BDG

FIGURE U.S. Patent Apr. 29, 2014 Sheet 28 Of 198 US 8,709,766 B2

N. gradiana Gene conserved Algal Found in Gene D Name Domain D Conserved Domain Description group GreenCut2. Nga04243 pfam 12697 Alpha/beta hydrolase 6 family BDG Nga0441 CO283 Mitochondria carrier protein superfamily BDG NgaO47541 cO7.118 RAP domain superfamily BDG 2-polyprenyl-6-methoxyphenot hydroxylase Nga04844,01 COGO654, and related FAD-dependent CO993 oxidoreductases (Ubih); Rossmann-fold NAD(P)(+)-binding proteins Bestrophin, RFP-TM, channel Nga04864 O1544 Superfamily O3951; Codc37 N terminal kinase binding; Cdc37 Nga05116 O7248; Hsp90 binding domain; Cdc27 C terminal BDG O7247 domain Nga05302 OO447 Nudix Hydrolase superfamily B Bestrophin, RFP-TM, chloride channe Nga05344 O544 superfamily BDG NgaO5424 COO473 BAX inhibitor (B)-1 FYoCA-like protein family BDG Nga05532 COG2850 Uncharacterized conserved protein BDG NgaO5999 PRK 13559, hypothetical protein, bZIP transcription BDG ciO2576 factor superfamily NgaO6125 BC Nga06 186 COOA58 :solute family BDG Nga06677 COO927 Fornate/nitrite transporter BDG Nuclear transport factor 2 (NTF2-like) Nga06705 CO909 Superfamily BDG Nga06712 BD Nga06740 pfam08449 UAA transporter family BDG Nga06994 CO2879 Chlorophy: A-B binding protein BDG

PRK1752: putative S-transferase; Giutathione S Nga07089 transferase (GST) C-terminal domain family, BDG Cd 0292; Ygh U-like subfamily; Thioredoxin like COO388 Superfamily NgaO7273 COO447 Nudix. Hydrolase superfamily BG Violaxanthin de-epoxidase-like protein of Nga20078 C06253 unknown function BDG Nga20139 CO)12O6 2OG-Fe(II) oxygenase superfamily BCG Nga20168.1 BDG Nga20308.1 BDG Glutathione S-transferase (GST) family, C Nga20591 CO2776 terminal alpha helica domain superfamily BDG

FGrE . . . U.S. Patent Apr. 29, 2014 Sheet 29 Of 198 US 8,709,766 B2

N. gaolitana Gene Conserved Alga Found in Gene Name Domainly Conserved Domain Description group Green cut S-adenosylmethionine-dependent Nga20879 C2O methyltransferases (SAM or AdoMet MTase), class I superfamily

Nga21269 COO938 Rieske 2Fe-2S cluster binding domain BDG Nga30150 BOG Nga301.86 CO2:44 TLD superfamily BDG CP Nga301.96 CC BDG FKBP-type peptidyl-prolyi cis-trans BOG Nga30242 c 11587 isomerase Nga30516 COO731 Domain of unknown function (DUF179) BBG Caspase, interleukin-1 beta Converting Nga30567 COOO42 enzyme (ICE) homologues Transmembrane amino acid transporter Nga30598 COO524 protein Caspase, interleukin-1 beta converting Nga306.79 COOO42 enzyme (ICE) homologues Nga40030 psbE CHLOOO64 photosystem protein V (psbE) B DG Nga40063 atpH COO486 ATP synthase subunit C B G Nga00901 BOR RaiA"ribosome-associated inhibitor A", also Nga01689 CoOO552 known as Protein Y (PY), YFA, and Spot'Y BR Nga01690 BDR Nga02056 CEO993 Rossmann-fold NAD(P)(+)-binding proteins BDR 2-desacetyl-2-hydroxyethy bacteriochlorophyide and other MDR Cd 08255; family members; Medium chain NgaO2342 C 4614 reductasetdehydrogenase (MOR)/zinc dependent alcohol dehydrogenase-like family

Nga03038.01 BR Nga03117 B DR NgaO3879 CO993 Rossmann-fold NAD(P)(+)-binding proteins BDR

NgaO4065 CO 139 Protein of unknown function, DUF399 BBR

FIGURE ... K U.S. Patent Apr. 29, 2014 Sheet 30 Of 198 US 8,709,766 B2

N. gradiana Gene Conserved Alga Found in Gene D Name Domain of Conserved Domain Description group GreenCut2. NgaO4422 BDR Nga04841 CO2879 Chlorophyt: A-B binding protein superfamily BDR ATP-dependent Clp protease adaptor Nga05526 OO933 protein CipS BOR Nga20136 GRO2352 glycine oxidase ThiO; Malate:guinone : cl14881 oxidoreductase (MgO) Nga20468 CC04728 Thiazole synthase (ThiG) Ribulose bisphosphate carboxylase large Nga20657 CO8232 chain superfamily BDR Photosystem psaApsaB protein Nga20670 CO8224 superfamily Ribulose bisphosphate carboxylase large Nga20759 CO8232 chain superfamily BR

Nga20849 CO9294 Smalt protein B (SmpB) superfamily Nga20900 CO8223 Photosystem protein Ribulose bisphosphate carboxylase large Nga2 024 CO8232 chain superfamily BDR

Nga2037 COO24 Putative small multi-drug export protein BDR Nga21157 BOR Ribulose bisphosphate carboxylase large Nga21279 CO8232 chain superfamily BDR

Nga30886 ciO2526 C-terminal processing peptidase family S41 BOR

Nga30899 pfam06514 Photosystem 12 kDa extrinsic protein BDR (Psbu) Nga40021 psbW C 1434 Cytochrome c BDR

Nga40O82 thiG cc.04728; Thiazole synthase (Thi(G); cO9108 TIM phosphate binding superfamily

NgaOOO68 COS685 Ribosomal protein S1-like RNA-binding BDRG domain (S1 ex) NgaO0223.01 GRO1172 serine O-acetyltransferase (cysE), Serine BDRG co03354 acetyltransferase (SAT) cdO3013: Peroxiredoxin 5 like Thioredoxin like BORG NgaO0251.01 cOO388 superfamily Nga00361 C10O3 Glycosyltransferase GiB type superfamily BDRG

FIGURE U.S. Patent Apr. 29, 2014 Sheet 31 Of 198 US 8,709,766 B2

N. gadiana Gene Conserved Agai For in Gene Name Domain D Conserved Domain Description group GreenCut? FABZ Nga00482.01 CCO)1288 beta-hydroxyacyl-acy carrier protein (ACP) BORG 2 dehydratase (FabZ) Nga00950 2022 Ribosomal protein 18e/15 BDRG NgaO 1002 CoO3467 Rieske [2Fe-2S cluster binding domain BDRG NgaO1031 COO630 YdcF-like superfamity BDRG NgaO 1037 CEO 1132 Fatty acid hydroxylase superfamily BDRG Nga01210 BDRG NgaO1230 TGRO3156 GTP-binding protein HFX; Ras-like GTPase BDRG ; c. 10444 superfamily NgaO 1305.1 CO3940 TLC ATPIADP transporter BDRG Nga01316 cdOO320 Chaperonin 10 Kod subunit (cpn 10 or COG3435 GroES), Gentisate 1,2-dioxygenase

TGRO084 Nga01515 Bile Acid:Na+ Symporter (BASS) Family: BDRG ; c.091 7 Membrane transport protein

NgaO1605 cdO3013: Peroxiredoxin (PRX) family, PRX5-like COO388 subfamily; Thioredoxin like superfamily

Nga01793.0 CO9807 Conserved hypothetical protein (Lin O512 fam) NgaO1805 CO3940 TLC ATPIADP transporter superfamily

NgaO1914 C1468 integral TerC famity

NgaO1942 Cd 03013: Peroxiredoxin (PRX) family, PRX5-like COO388 subfamity; Thioredoxin like superfamily EamA-like transporter family, Permeases of Nga0260 c:01.037; the drug? metabolite transporter (DMT) BORG COGO697 superfamily (Rhai) Patatin-like phospholipase domain Nga02379 cdO721 1, containing protein 8; Patatins and c.11396 Phospholipases MECDP synthase (2-C-methyl-D-erythritol Nga02651.01 cCOO554 2.4-cyclodiphosphate synthase) (spf) BDRG Nga02662 CO849 Sigma-70 region 2

NgaO2686 SQD1 cO9931 Rossmann-fold NAD(P)(+)-binding proteins; BDRG PNO2572 UDP-sulfoguinovose synthase

FIGURE y U.S. Patent Apr. 29, 2014 Sheet 32 Of 198 US 8,709,766 B2

N. gaolitana Gene Conserved Aigal Fourt in Gene Name Domain it Conserved bonain Description group' GreenCut2. Sec14p-like tipid-binding domain, cdOO 170, Acetyltransferases (Rin); N NgaO2690 COGO456, Acyltransferase superfamily; Various BDRG CoO4301 enzymes that characteristically catalyze the transfer of an acyl group to a

NgaO2746 CoOO498 Heat shock protein 33-like chaperonin BRG Nga02754 TIGRO2997 RNA polymerase sigma factor 70-like BORG NgaO2798 CO14.67 Uncharacterized conserved protein BORG NgaO2900 pfam06325 Riema protein L11 methyltransferase BRG Trans-soprenyi Diphosphate Synthases, cdOO683: NgaO2957 COO2O head-to-head; soprenoid Biosynthesis enzymes, Class 1 Nucleotidyltransferase (NT) domain of RetA NgaO3147 CoO85399 and SpoT-ike ppGpp synthetases and hydrolases., c}00951, Fe-S metabolism associated domain, Suf Nga0327 COO386 related proteins, BoA-like protein S-adenosylmethionine-dependent NgaO334 CO2440 methyltransferases (SAM or AdoMet MTase), class cdO1517: PAP-phosphatase like domain; FG, NgaO3442 c}00289 FBPasef MPase/glpX-like domain

Nga03565 PRK00517 ribosoma protein L1 methyltransferase BDRG NgaO3664 CO2879 Chlorophy: A-B binding protein BORG Nga03732 TIGRO2227 signal peptidase , bacteriai typ BORG NgaO3876 CO9931 Rossmann-fold NAD(P)(+)-binding proteins BORG Nga03950 CCOF233 Glyoxalase BDRG cO917: Membrane transport protein, Bile Acid:Na+ Nga04.047. TIGR00841 Symporter (BASS) Family BORG Nga04073 CO9925 Protein Kinases, catalytic domain NgaO428 CO8419 Sigma-70 region 2 Nga04326.01 c10468 integral nembrane protein TerC family

FIGURE N U.S. Patent Apr. 29, 2014 Sheet 33 Of 198 US 8,709,766 B2

N. gadttana eene Conserved Alga Found in Gene Name Domain D Conserved Domain Description group GreenCert? Nga04512 CCOO83 Chalcone and stilbene synthases like BDRG Protein Kinases, catalytic domain-like Nga04526 CO9925 superfamily BDRG 4-diphosphocytidyl-2C-methyl-D-erythritoi 2 Nga04584.1 COG947 phosphate synthase (spE) BDRG

Dipeptidyl aminopeptidases/acylaminoacyl COG1506; peptidases (DAP2), Estease lipase Nga04742 c12031 superfamily; type glutamine COOO2O amidotransferase (GATase)-like domain

cc.00948, Fructose-1,6-bisphosphate aldolase, TM Nga04777.1 CO908 phosphate binding superfamily BDRG NgaO4801 CEO 1037 EamA-like transporter family NgaO4846 c 12050 TraB superfamily GGR Nga04895 PLN.00093: geranylgerany diphosphate reductase, BDRG (ChiP) co9931 Rossmann-fold NAD(P)(+)-binding proteins Nga05141 COG 842 Phage shock protein A (IM30) BDRG -OR 1-hydroxy-2-methyl-2-(E)-butenyl 4 Nga05308 (IspH) PLN.02821 diphosphate reductase (HDR) BDRG Nga05494.01 PRK0484 DNA topoisomerase W subunit B BDRG Peroxiredoxin (PRX) family, Bacterioferritin Nga05568 cdO3O47: comigratory protein (BCP) subfamily; BDRG COO388 Thioredoxin like superfamily Tryptophan-rich sensory protein (spO)f Nga05781 CO1379 MBR family BDRG Nga05803 CO96 Domain of unknown function (DUF1995) BORG

Nga(5850 PLN.00033; photosystem stabilitylassembly factor; BDRG C101.30 Yofa.8-like protein Nga05921 c 4656 Phosphoenolpyruvate carboxylase BDRG NgaO6059.1 COO489 60Kc inner membrane protein BDRG Tryptophan-rich sensory protein (1 spO). Nga061.49 CEO 1379 MBR family BDRG Nga06200 CO3940 TLC ATP/ADP transporter BDRG Nga06242 Ch-2 PLNO32A1 ?nagnesium chelatase subunit H BORG Nga06284 CO1697 Niru protein (exact function unclear) BDRG Translation initiation Factor F, S-like Nga06431 CoO445 RNA-binding domain FORE - O U.S. Patent Apr. 29, 2014 Sheet 34 of 198 US 8,709,766 B2

N. gadiana Gene Conserved Algal Ford in Gene Name Domain of Conserved Domain Description group GreenCust? NgaO6432 PRK12902 preprotein transiocase subunit (SecA) BDRG Nga0656 TGRO0691 (p)ppGpp synthetase, RelA/Spot family BORG S-adenosylmethionine-dependent NgaO6559 CCO2440 methyltransferases (SAM or AdoMet MTase), class superfamily NgaO6584 CO3O37 HCO3- transporter family Patatin-like phospholipase domain NgaO7147 CCO7211 containing protein 8 PLN.0324t, magnesium chelatase subunit H. Domain of Nga20052 C 3412 unknown function (DUF3479) BORG Nga20056 pfamO9258 Glycosy transferase family 64 domain Nga20192 pfamO2536 mitochondrial transcription termination factor (mTERF) Nga20325 PRK00131 shikimate kinase (arok) Nga20417 CO8224 Photosystem psaAlpsaB protein Nga20459 CO8224 Photosystem psaApsaB protein Ribulose bisphosphate carboxylase large Nga20492 CO8232 chain superfamily BORG

S-adenosylmethionine-dependent Nga20546.1 C2O11 methyltransferases (SAM or AdoMet Mase), class Superfamily Photosystem psaApsaB protein Nga20664 CO8224 superfamily Ribulose bisphosphate carboxylase large Nga2O720 CO8232 chain superfamily BORG Photosystem psaApsaB protein Nga2O725 CO8224 Superfamily BORG Photosystem psaApsaB protein Nga20738 CO8224 superfamily BORG Nga20923 CO8223 Photosystem it protein BORG Nga20940 pfam09258 Glycosy transferase family 64 domain BORG Nga21049 CO8224 Photosystem psaApsaB protein BORG Nga2O63 CO8224 Photosystem psaApsaB protein BDRG Nga2O66 CO8224 Photosystem psaApsaB protein BORG Nga21084 CO8224 Photosystem psaApsaB protein BORG GRE U.S. Patent Apr. 29, 2014 Sheet 35 of 198 US 8,709,766 B2

N. graditana eene Alga Forcin Gere. Nane ConservesDorai Conserved Domain Description group GreenCut2. Nga21158 CO8224 Photosystern psaApsaB protein BDRG Ribulose bisphosphate carboxylase arge Nga211.78 CO8232 chain superfamily BDRG Mga21247 CO3224 Photosystern psa ApsaB protein Nga21271 CoO9289 D1 subunit of photosystem smart0048 DEAD-like helicases superfamily (DEXDC); 7; Helicase superfamily C-terminal domain Nga30090 codOOO79; (HECC); Transcription-repair coupling COG 1197 factor (Mfd) Nga3025 CoOO42 Inorganic pyrophosphatase BDRG Nga3026 CO3940 TLC ATPIADP transporter BDRG c0993; Rossmann-fold NAD(P)-(+)-binding proteins, Nga301.94 PRKO7208 hypothetical protein BDRG Nga30232 CO2879 Chlorophyll A-B binding protein BDRG Nga30264 CO9925 Protein Kinases, catalytic domain BDRG Sec14p-like lipid-binding domain Nga30614 CCOO 170 superfamily BDRG RNA polymerase signa factor, Nga30622 GRO2997 cyanobacterial RpoD-like family: Signa-70 BDRG ciO8419 region 2 PRK07379; coproporphyrinogen if oxidase; HemN C Nga30671 CO650 terminal region BDRG Nga30751 PRK11132 Serine acetyltransferase (SAT) BDRG COG2262, BORG Nga30761 CdO1878 GTPases; HfiX subfamily c 2138 Thylakoid formation protein (THF1); Nga30766 TGRO3060 photosystem biogenesis protein Psp29 BDRG Squalene cyclase (SQCY) domain Nga30790 CAS CO2892 Subgroup HOS 4-hydroxy-3-methylbut-2-en-1-yl Nga30806 (spG) PNO2925 diphosphate synthase (spG) GRO2997 RNA polymerase sigma factor, Nga30840 cyanobacterial Rpot-like family; Sigma-70 BDRG ciO8419 region 2 CdO3013: Peroxiredoxin (PRX) family, PRX5-like Nga30855 COO388 subfamily: Thioredoxin like superfamily

Nga30862 COO938 Rieske [2Fe-2S cluster binding domain

FIGURE Q U.S. Patent Apr. 29, 2014 Sheet 36 of 198 US 8,709,766 B2

N.gadiana Gene Conserved Aigal For in Gene Namea Domain of Conserved Domain description group GreenCut2. S-adenosylmethionine-dependent Nga30939 CO2440 methyltransferases (SAM or AdoMet Mase), class superfamily

Chi- PRK12493; magnesium chelatase subunit h; domain of BORG Nga30995 c1342 unknown function (DUF3479) Nga31026 PLNO2536 Diaminopimetate epimerase Nga40002 psbB ciO8223 Photosystem protein Nga40005 psbH ciO2951 Photosystem 10 kDa phosphoprotein Nga40009 petB CCOO284 Cytochrome b (N-terminus)/b6/petB Nga4000 petD CCOO290 Cytochrome b(C-terminus)ib6/petD Nga4007 rps 14 co0355 Ribosomal protein S14pfS29e Nga40018 psaB c{08224 Photosystem psaApsaB protein Nga40019 pSaA c08224 Photosystem psaA?psaB protein Nga40020 pet c. 44 Cytochrome c Nga4.0024 rp 19 COO4O6 Ribosomal protein L. 9 Nga'006 PsbA CdO9289;co8220 D2D1 subunitsubunits ofof photosystemphotosystemissils li (PS II), D1, BORG Ribulose bisphosphate carboxylase large Miga40048 bCL CO8212 chain, Form superfamily BORG Nga40050 petA ciO368 Apocytochrome F, C-terminal BORG Sec-independent protein translocase Nga4005 tatC COO521 protein (TatC) BDRG Nga4.0054 ycf3 CHL.00033 photosystem assembly protein (Yof3) BORG Nga4.0056 rpl33 COO383 Ribosoma protein L33 BDRG DNA-directed RNA polymerase, beta' Nga40060 rpoC2 TIGRO2388 subunit (rpoC2 cyan) BDRG Nga4006 rps2 Cd 0.1425 Ribosomal protein S2 (RPS2) BDRG Nga4.0062 atpi COO413 ATP synthase A chain BDRG Nga40073 CCSA COO.504 Cytochrome Cassembly protein BRG Nga40078 secA CHL00122 preprotein translocase subunit SecA BDRG Nga40084 psaC ChLO0065 photosystem subunit V (psaC) BORG Nga40093 psbC Ross Photosystem protein; psbC BORG Nga'009 PsbD CdO9288;06226 D2 subunitisofoStemissils of photosystem li (PS II), D1, BDRG FIGURE R. U.S. Patent Apr. 29, 2014 Sheet 37 Of 198 US 8,709,766 B2

N. graditana Gene Conserved o Aigal Fond in Gene Name Domain D Conserved Domain Description group GreenCut2. Nga4013 SecY CHL001.61 preprotein translocase subunit SecY BDRG Nag50013 BORG

Name given to the N. gaditana gene model by manual curation. Conserved domain ID(s) assigned from NCBI-curated domains, Pfam, SMART, COG, PRK, or TIGRFAM databases. Description(s) of the conserved protein domain given. Algal lineages with homologs to the N. gaditana gene model: B, brown; D, diatom; R, red; G, green. Indicates whether the N. gaditana gene model has homology to a GreenCut2 gene.

U.S. Patent Apr. 29, 2014 Sheet 38 of 198 US 8,709,766 B2
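The one-letter lineage codes defined in the legend above can be expanded mechanically. The following is a minimal sketch, assuming the codes appear as a compact string such as BDRG; the function name `decode_algal_group` is hypothetical and is not part of the patent.

```python
# Lineage letters as defined in the table legend (B, D, R, G).
LINEAGES = {"B": "brown", "D": "diatom", "R": "red", "G": "green"}


def decode_algal_group(code: str) -> list[str]:
    """Expand a group code such as 'BDRG' into lineage names.

    Spaces are ignored. Any letter outside B/D/R/G raises ValueError so that
    damaged codes (for example 'BORG') are flagged rather than silently guessed.
    """
    names = []
    for letter in code.replace(" ", "").upper():
        if letter not in LINEAGES:
            raise ValueError(f"unknown lineage code {letter!r} in {code!r}")
        names.append(LINEAGES[letter])
    return names


if __name__ == "__main__":
    print(decode_algal_group("BDRG"))
    print(decode_algal_group("BDG"))
```

Raising on unknown letters is a deliberate choice in this sketch: a scanned table may contain misread codes, and rejecting them is safer than mapping them to a lineage.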



[Bar chart: number of sequences assigned to each Gene Ontology biological-process term, for example translation, ribosome biogenesis, photosynthesis, lipid biosynthetic process, and fatty acid metabolic process. Axis: Number of Sequences.]

FIGURE 4 U.S. Patent Apr. 29, 2014 Sheet 41 of 198 US 8,709,766 B2

Enzyme | Description | EC number | N. gaditana gene model | Transcript support

Fatty acid biosynthesis
ACS | acetyl-CoA synthetase | 6.2.1.1 | Nga30020 | Y
ACS | acetyl-CoA synthetase | 6.2.1.1 | Nga02166 | Y
ACC | acetyl-CoA carboxylase | 6.4.1.2 | Nga30028 | Y
ACC | acetyl-CoA carboxylase | 6.4.1.2 | Nga30783 | Y
MAT | malonyl-CoA:ACP transacylase | 2.3.1.39 | Nga00899 | Y
KAS | 3-oxoacyl-ACP synthase II | 2.3.1.180 | Nga30106 | Y
KAS | 3-oxoacyl-ACP synthase | 2.3.1.179 | Nga02138 | Y
KAS | 3-oxoacyl-ACP synthase | 2.3.1.179 | Nga30755 | Y
KAS | 3-oxoacyl-ACP synthase | 2.3.1.179 | Nga04201.01 | Y
KAS | fatty acid synthase | 2.3.1.- | Nga05827 | Y
KAR | 3-oxoacyl-ACP reductase | 1.1.1.100 | Nga20022 | Y
KAR | 3-oxoacyl-ACP reductase | 1.1.1.100 | Nga01542 | Y
HD | 3-hydroxyacyl-CoA dehydratase | 4.2.1.- | Nga05985.01 | Y
ENR | enoyl-ACP reductase | 1.3.1.9 | Nga05091 | Y
ENR | enoyl-ACP reductase | 1.3.1.9 | Nga10004 | Y
KAR | 3-oxoacyl-ACP reductase | 1.1.1.100 | Nga00602 | Y
KAR | 3-oxoacyl-ACP reductase | 1.1.1.100 | Nga05627 | Y
FAT | oleoyl-ACP hydrolase | 3.1.2.14 | Nga01045.01 | Y
ELOV | elongation of very long chain fatty acids | 2.3.1.- | Nga03084 | Y
ELOV | elongation of very long chain fatty acids | 2.3.1.- | Nga00451 | Y
DESA | acyl-ACP desaturase | 1.14.19.2 | Nga01458.01 | Y
DESC | stearoyl-CoA desaturase | 1.14.19.1 | Nga00524 | Y
DES | omega-6 fatty acid desaturase delta-12 | 1.14.19 | Nga02019 | Y
DES | omega-6 fatty acid desaturase delta-12 | 1.14.19 | Nga00817 | Y
DEGS | sphingolipid delta-4 desaturase | 1.14.-.- | Nga06739.1 | Y
FAD7 | glycerolipid omega-3 fatty acid desaturase | 1.14.19 | Nga3038 | Y

TAG assembly
G3PDH | glycerol-3-phosphate dehydrogenase | | Nga06665.1 |
G3PDH | glycerol-3-phosphate dehydrogenase | | Nga01869 |
G3PDH | glycerol-3-phosphate dehydrogenase | | Nga04914 |
G3PDH | glycerol-3-phosphate dehydrogenase | | Nga05226 |
G3PDH | glycerol-3-phosphate dehydrogenase | | Nga30015 |
GPAT | glycerol-3-phosphate acyltransferase | | Nga04759 |
GPAT | glycerol-3-phosphate acyltransferase | | Nga10005 |
GPAT | glycerol-3-phosphate acyltransferase | | Nga04850 |

FIGURE 5A U.S. Patent Apr. 29, 2014 Sheet 42 of 198 US 8,709,766 B2

Enzyme | Description | EC number | N. gaditana gene model | Transcript support
LPAAT | 1-acylglycerol-3-phosphate O-acyltransferase | 2.3.1.51 | Nga00059 | Y
LPAAT | 1-acylglycerol-3-phosphate O-acyltransferase | 2.3.1.51 | Nga30581 |
LPAAT | 1-acylglycerol-3-phosphate O-acyltransferase | 2.3.1.51 | Nga30809 |
LPAAT | 1-acylglycerol-3-phosphate O-acyltransferase | 2.3.1.51 | Nga02265 |
LPAAT | 1-acylglycerol-3-phosphate O-acyltransferase | 2.3.1.51 | Nga21122.1 |
PAP | phosphatidic acid phosphatase | 3.1.3.4 | Nga21116 |
DAGK | diacylglycerol kinase | 2.7.1.107 | Nga02796 |
DAGK | diacylglycerol kinase | 2.7.1.107 | Nga05586.1 |
PDAT | phospholipid:diacylglycerol acyltransferase | 2.3.1.158 | Nga02737 |
DGAT | diacylglycerol acyltransferase | 2.3.1.20 | Nga0003 |
DGAT | diacylglycerol acyltransferase | 2.3.1.20 | Nga30544 |
PA | lysophosphatidylglycerol acyltransferase | 2.3.1.- | Nga05465 |
PKS5 | polyketide synthase 5 | KO K2433 | Nga00335 |
MGD | monogalactosyldiacylglycerol synthase | 2.4.1.46 | Nga00818 |
SQD2 | sulfoquinovosyltransferase | 2.4.1.- | Nga00561 |
SQD1 | UDP-sulfoquinovose synthase | 3.13.1.1 | Nga02686 |

Lipid activation
TAGL | TAG-lipase | 3.1.1.3 | Nga30958 |
TAGL | TAG-lipase | 3.1.1.3 | Nga30749 |
AASDH | acyl-CoA synthetase | 6.2.1.- | Nga03422 |
AASDH | acyl-CoA synthase | 6.2.1.- | Nga05597 |
ACSL | long-chain acyl-CoA synthetase | 6.2.1.3 | Nga30170 |
ACSL | long-chain acyl-CoA synthetase | 6.2.1.3 | Nga03113 |
ACSL | long-chain acyl-CoA synthetase | 6.2.1.3 | Nga00675 |
ACSL | long-chain acyl-CoA synthetase | 6.2.1.3 | Nga30631 |
ACSL | long-chain acyl-CoA | 6.2.1.3 | Nga00919 |
ACSL | long-chain acyl-CoA ligase | 6.2.1.3 | Nga02299 |
FADD9 | long-chain acyl-CoA ligase | 6.2.1.- | Nga00006 |
ACSL | long-chain acyl-CoA ligase | 6.2.1.3 | Nga30568 |
HADH | 3-hydroxyacyl-CoA dehydrogenase | 1.1.1.35 | Nga00482.01 |
HADH | 3-hydroxyacyl-CoA dehydrogenase | 1.1.1.35 | Nga02679 |
HADH | 3-hydroxyacyl-CoA dehydrogenase | 1.1.1.35 | Nga06144.1 |
LCD | long-chain hydroxyacyl-CoA dehydrogenase | 1.1.1.211 | Nga03480 |
ACADS | short-chain acyl-CoA dehydrogenase | 1.3.8.1 | Nga31130 |
ACADSB | short-branched-chain acyl-CoA dehydrogenase | 1.3.99.12 | Nga05705 |
ACDH | acyl-CoA dehydrogenase | 1.3.99.3 | Nga04204.0 |

FIGURE 5B U.S. Patent Apr. 29, 2014 Sheet 43 of 198 US 8,709,766 B2

Description | N. gaditana gene model | Transcript support
acyl-CoA oxidase | Nga03053 | Y
peroxisomal trans-2-enoyl-CoA reductase | Nga06128 | Y
alternative oxidase | Nga03545 | Y
alternative oxidase | Nga30837 | Y
alternative oxidase | Nga03289 | Y
acyl-CoA oxidase | Nga04370. | Y
acyl-CoA oxidase | Nga30819 | Y
enoyl-CoA hydratase | Nga01761.01 | Y
enoyl-CoA hydratase | Nga00171 | Y
enoyl-CoA hydratase | Nga20152 | Y
enoyl-CoA hydratase | Nga06135 | Y
beta-ketoacyl-CoA thiolase | Nga01710 | Y
beta-ketoacyl-CoA thiolase | Nga04504.01 | Y
beta-ketoacyl-CoA thiolase | Nga30830 | Y
acetyl-CoA acetyltransferase | Nga20998 | Y
beta-ketoacyl-CoA thiolase | Nga30830 | Y
(fatty acid transporter) | Nga06551 | Y
peroxisomal 2,4-dienoyl-CoA reductase | Nga05502.01 | Y
patatin-like phospholipase | Nga03028.01 | Y

Candidate N. gaditana gene model encoding the corresponding enzyme. Indicates whether the given gene model has transcript support from RNA-seq of a pool of conditions including: +/- nitrate, logarithmic phase, stationary phase, heat-shocked culture (2 h at 37°C), cold-treated culture (2 h at 4°C), culture after 12 h dark, and +/- CO2.

FIGURE 5C U.S. Patent Apr. 29, 2014 Sheet 44 of 198 US 8,709,766 B2

[Pathway diagram of fatty acid biosynthesis and TAG assembly. Metabolite labels: acetyl-CoA, malonyl-CoA, malonyl-ACP, 3-ketoacyl-ACP, 3-hydroxyacyl-ACP, acyl-ACP, acyl-CoA, glycerol 3-phosphate, lysophosphatidic acid, phosphatidic acid, diacylglycerol, TAG. Enzyme labels: ACC, KAS, G3PDH, GPAT, LPAAT, PAP, DAGK, DGAT. Species legend: N. gaditana, E. siliculosus, P. tricornutum, C. merolae, C. reinhardtii.]

GRE 8 U.S. Patent Apr. 29, 2014 Sheet 45 of 198 US 8,709,766 B2

 | N. gaditana | E. siliculosus | P. tricornutum | C. merolae | C. reinhardtii
Total number of genes | 9,052 | 36,256 | 10,402 | 5,331 | 15,143
Fatty acid biosynthesis: ACC, MAT, KAS, KAS/, KAR, HD, ENR, Total
TAG assembly: G3PDH, GPAT, LPAAT, PAP, DAGK, PDAT, DGAT, Total
Lipid degradation: TAGL, AASDH, ACSL, FADD9, HADH, LCD, ACADS, ACADSB, ACDH, ACOX, AOX, EC, KCT1, KCT2, KCT3, ACAT, Total
Total in all categories

FIGURE 7

U.S. Patent Apr. 29, 2014 Sheet 48 of 198 US 8,709,766 B2

Ngadiana Fold: Conserved Conserved Domain node regulation Description Domain D definition NgaO6526 NgaO4678 P-loop containing Nucleoside Triphosphate clO9099; Hydrolases, AAA ATPase NgaO2936 midasin-like protein COG5271 containing von Wilebrand factor type A (WA) domain Nga31211 Nga01342.01 Nga2O676.1 Nga20915 Nga20972 Nga04280 Nga2O680 Nga06280.01 NgaO4146 atp-dependent rna AAA ATPase containing Nga20823 midasin-like protein COG5271 won Wilebrand factor type A (WA) domain NgaO1883.01 . no hit Nga30731 ribonuclease hi Cl 14782 Ribonuclease H (RNase NgaO3789 Medium chain reductaseldehydrogenas Nga2O714 polyketide synthase C14614 e (MDR)/zinc-dependent alcohol dehydrogenase Bike fami monovalent cation: proton NgaO3637 -1 family GROO831 Na+/H+ antiporter NgaO3295

C2 domain of PEN Nga07000 formin ike protein C11068 tumour-suppressor protein

FIGURE 20A U.S. Patent Apr. 29, 2014 Sheet 49 of 198 US 8,709,766 B2

N. gadiana Foci Conserved Conserved Domain model egulatio Description Domain D definition hypothetical protein NgaO2041 FB2170 06015 Maribacter sp. HTCC2170 NgaO7101 rime-like gtpase atpase Nga2265 without a C-terrnia eh domain Nga06490 Nga00341 3-5 exoribonucleaserna Nga20547 CO314 3' exoribonuclease family, binding protein dona NgaO7033 NgaO0173 PN connai of cdO9857; Exonuclease- H3TH Nga03960 protein c 14815 domains of structure specific 5' nucleases Nga2O524, dha-directed na polymerase RNA polymerase Rpb1, iii largest domain NgaO0028.01 NgaO1948 Nga20592 alpha-galactosidase C1 402 Glycosyl hydrolase family 31 (GH31) Nga04495 fatty acid elongase CO3120 GNS1 (SUR4 family Nga30495 NgaO3619 phd zinc finger-containing NgaO0231.01 protein c5348 RING-finger NgaO2632 Nga04597 Nga06236 Nga0424 Nga00280 Nga30460 Nga31210

Nga20586

FGRE 3 U.S. Patent Apr. 29, 2014 Sheet 50 of 198 US 8,709,766 B2

N. gaitana Fold Conserved Conserved Domain model regulation Description Domain D definition predicted protein Nga05509 Thalassiosira pseudonana COO 117 PDZ domain CCMP1335) Nga2074 fbox protein Nga04500 Nga30473 Nga20390 rna polymerase i elongator CO2587 Nga06889 NgaO7058 Nga20704 NgaO5678

Nga30260 nuclear Condensin complex CO5797 SMC proteins Flexible subunit smcA Hinge Domain

notch-regulated ankyrin NgaO 1032 repeat-containing protein b CCOO2O4. Ankyrin repeats tetratricopeptide repeat Nga20904 family protein NgaO686 NgaO4596 Nga31101 tetratricopeptide repeat

Nga21.59

NgaO2039 Nga2 141 protein c10757 Guanyly Cycase Nga21257 no hit vacuolar protein Sorting Nga21076 CO)1388 2-hydroxycarboxylate associated protein Vps13 transporter family NgaO2799 protein GRE 20 C U.S. Patent Apr. 29, 2014 Sheet 51 of 198 US 8,709,766 B2

N. gaditana Conserved Conservec Domain mode Description Domain it definition Nga40094 hypothetical protein cur. 2694 Nga04295 Thermomonospora curvata DSM 43183 Nga00509 calcineurin-like Nga30431 CEO950 EF-hand, calcium binding phosphoesterase notif NgaO7039 Nga20583.1 NgaO7207 zinc finger fyve domain Nga20327 Containing protein CCOOO65 FYWE donair NgaO4524

NgaO5615.1

Nga40046

Nga04641 auxin efflux carrier-like Membrane transport NgaO1578.0 protein CO917 protein protein kinase domain Nga20668. containing protein

Nga20029.1 abc subfamily abcg C1417 ABC-2 type transporter Nga2O678 Nga2O579 Nga03015 CO9925 Protein Kinases, catalytic protein kinase weel. donair

GRE 20 U.S. Patent Apr. 29, 2014 Sheet 52 of 198 US 8,709,766 B2

N. gaitana Fold se Conserved Conserved Domain node regulation Description Domain D definitio

NgaO4225

Nga40088 Nga05473

Nga06871

Nga20458 chaperone protein CC06257 Ona domain or 3-domain Nga30457 stip-like protein CO261 G-patch domain

Nga2047

Nga00448.0

Nga03116

: myb-like dina-binding domain 'SW3. ADA2, N-CoR and Nga20981 r COO 132 TFIIB' DNA-binding Containing protein domains NgaO1043 hypothetical protein uncultured delta Nga40023 proteobacterium HFO130 19C2O Nga30860 NgaO6456 C-terminal, alpha hetical domain of the Gutathione S-transferase family; Nga03667 glutathione S-transferase cO2776:COO388 Prote stifice Oxidoreductases and Other Proteins with a Thoredoxin fod Nga40124 30S ribosoma protein s10 COO34 Ribosomal protein S20e Nga00253.1 NgaO2520.0 phytochelatin Synthase pfam()5O23 Phytochelatin synthase Nga40090 FGRE 20 U.S. Patent Apr. 29, 2014 Sheet 53 of 198 US 8,709,766 B2

N. gaditana Fo Conserved Conserved Dontain mode regulation Description Domain D def NgaO2790 NgaO3527 Nga30169 Effector domain of the cap family transcription CAP family of NgaO0347 Co00038, transcription factors; factor COO446 Metalo-beta-lactamase Superfami Nga01885.01 NgaO4558 Nga00243.01 Uncharacterized Nga30680 zinc finger protein 598 COG5236 conserved protein, contains RING Zn-finger Nga20514 peptide chain release factor 2 CO2875 R - Onair Nga01.194

Nga01460.01 riken Cdna 6720467C03 isoform Crata NgaO7186 NgaO5697 Nga06201 Nga04536.01

Nga30207

Proteir of unknown Nga06069 protein CO84 function (DUF1499) nadh dehydrogenase subunit Nga50010 COO539 Respiratory-chain NADH 9 dehydrogenase predicted protein Nga06056 Thalassiosira pseudonana CCMP1335

Nga05289

FGFRE 2 U.S. Patent Apr. 29, 2014 Sheet 54 Of 198 US 8,709,766 B2

N. gaolitana Fold Conserved Conserved Domain node regulation Description Domain D definition wd40 repeat-containing Nga20601 protein CO2567 WD40 donain Nga05030 ankyrin repeat and kh NgaO6911. domain-containing protein 1 COO.204 Ankyrin repeats Nga30730 NgaO1282 NgaOO736 ROSSnar n-fold Nga21288.1 udp-sulfoguinovose synthase CO993 NAD(P)(+)-binding proteins Nga40020 cytochrome c553 c 1414 Cytochrome c Nga02267 NgaO2416

Nga07010

NgaO2785

NgaO23250 CCOO 105 Khomology RNA-binding rna binding protein domain, type poly-binding protein 3 CCOO 105 K homology RNA-binding NgaO1 100.1 isoform 2 domain, type

Nga40044

Nga03568

Nga40018

myo-inositol-1-phosphate Myo-inositol-1-phosphate Nga20911.1 Synthase COO554 synthase Khomology RNA-binding Nga50007 c{00098: domain, type ; 30S ribosomal protein S3 CO2819 Ribosomal protein S3, C termina domain Tetratricopeptide repeat NgaO6064 protein CO2429 domain

GRE 23 G U.S. Patent Apr. 29, 2014 Sheet 55 of 198 US 8,709,766 B2

N. gaditana model   Fold regulation   Description   Conserved Domain ID   Conserved Domain definition

Nga20756

Nga001 24 Nga05257 Nga30862 rieske (2fe-2s) region protein COO938 Rieske domain Nga04958

N. gaditana gene models differentially regulated during nitrogen deprivation. Fold regulation of gene: >1 signifies up-regulation, <1 signifies down-regulation. N. gaditana gene model description: a green label indicates a function in photosynthesis; a blue label indicates a function in nitrogen utilization or protein degradation/recycling. Conserved protein domain ID assigned from NCBI-curated domains, Pfam, SMART, COG, PRK, or TIGRFAM databases. Description of conserved protein domain given.
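As an illustration of the fold-regulation convention in the legend above, the following is a minimal sketch in Python. It assumes, since the legend does not give the formula, that fold regulation is the ratio of expression under nitrogen deprivation (-N) to expression under nitrogen-replete conditions (+N). The function names and the example values are hypothetical and are not taken from the figure data.

```python
def fold_regulation(plus_n: float, minus_n: float) -> float:
    """Return the -N / +N expression ratio (assumed definition of fold regulation)."""
    if plus_n <= 0:
        raise ValueError("+N expression must be positive to form a ratio")
    return minus_n / plus_n


def classify(fold: float) -> str:
    """Apply the legend's convention: >1 is up-regulation, <1 is down-regulation."""
    if fold > 1:
        return "up-regulated"
    if fold < 1:
        return "down-regulated"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical expression values for two gene models (not from the patent).
    examples = {"model_A": (100.0, 250.0), "model_B": (200.0, 50.0)}
    for name, (plus_n, minus_n) in examples.items():
        fold = fold_regulation(plus_n, minus_n)
        print(name, fold, classify(fold))
```

A ratio is used here only because the legend places the threshold at 1; a log-scaled measure would place it at 0 instead.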

GRE O - U.S. Patent Apr. 29, 2014 Sheet 56 of 198 US 8,709,766 B2


U.S. Patent Apr. 29, 2014 Sheet 58 of 198 US 8,709,766 B2

pepit Nga model description Chlorophy, carotenoid and sterobiosynthesis genes SECRO NO; 3722. 59 Nga02834 glutamyl-tRNA synthetase SEOE NO: 873 50 Nga02604 glutanyl-tRNA reductase SECRED NO: 8804. 141 Nga3004.5 glutamate-1-semialdehyde aminotransferase ( glutamate-1-semialdehyde 21-aminomutase SECONO: 8678 15 NgaO0585 5-aminolevulinic acid dehydratase porphobinogen synthase SEC 3D NO: 8730 67 NgaO3243 porphobilinogen deaminase f hydroxymethylbilane synthase SECRED NO: 8586 23 NgaO0807 uroporphyrinogen synthase SEO O NO; 3745 82 Nga04120 uroporphyrinogen Si decarboxylase SEOE NO: 8774 11. Nga05706 Uroporphyrinogen El decarboxylase SECREO NO: 8763 100 Nga05151 coproporphyrinogen El oxidase SECR D NO: 8749 86 NgaO4278 Coproporphyrinogen oxidase SEC 3D NO: 8744: 81. NgaO3873 protoporphyrinogen X oxidase SEO O NO: 888 155 Nga30773 protoporphyrin XMg-chelatase subunit D SECONO: 8838 175 Nga40092 protoporphyrin XMG-chelatase subuin; : SECRED NO: 8831 68 Nga30995 protoporphyrin XMg-chelatase subunit H SEC in O: 8785 122 Nga06242 protoporphyrin XMg-chelatase subunit - SE: NO: 883, 174 Nga40091 Mg-protoporphyrin EX mo?omethyl ester oxidative) cyclase SEC) No: 877 114 Nga05945 divinyl protochlorophylide a 8-vinyl-reductase SEC NO: 8759 96 Nga(4959 light-dependent NADPH:protochlorophyllide oxidoreductase SECRO NO: 8683. 20 Nga00683 light-dependent NADPH:protochiorophyllide oxidoreductase SECRED NO: 8836 173 ga40089 figh-independent:protochlorophytide oxidoreduciase subunit B SEO O NO: 8834, 171 Nga40044 light-independent protochlorophyllide oxidoreductase subunit SECE NO: 8835 172 Nga40045 light-independent protochlorophytide oxidoreductase subunit N SECRED NO: 8832 169 Nga31097 chlorophyll synthase SFOD NO: 8757 94. Nga04895 geranylgeranyl reductase SEO 3D NO: 8764. 101 NgaO560 troporphyrinogen El C-methyltransferase SEO in NO: 8671. 
8 Nga00339 sirohydrochlorin ferrochelatase SEO O NO: 863A 21 NgaO.0748 farrochelatase SEC E RO: 88.7 154 Nga30771 1-deoxy-D-xylulose-5-phosphate reductoisomerase SEOD NO: 8784. 121 NgaO6198 2-C-methyl-D-erythritol 4-phosphate cytidyltransferase SEOD NO: 882. 158 Nga30806 4-hydroxy-3-methylbut-2-enyl diphosphate synthase SECREED NO: 8787 104 NgaO5308 4-hydroxy-3-methylbut-2-enyi diphosphate reductase SEC. FO: 342 9 Nga03838 isopentenyl diphosphate:dimethylallyl diphosphate isomerase type SEOE NO: 874 5. Nga02636 geranylgeranyl pyrophosphate synthase SEQED NO: 8725 62 NgaO2957 phytoene synthase SECR D NO: 876, 98 Nga(5064 phytoene desaturase SEC3D NO: 8792. 129 NgaO730 zeta-caroteine desaturase SEO ED NO: 86.80 17 NgaOO640 yCopene 3-cyclase SEOD NO: 8805 142 Nga30.077 Cytochrome P450 enzyrne related to CYP97A carotene 3-hydroxylase SECRED NO: 8686 3 NgaOO 100 cytochrome P450 enzyme related to CYP97A caroteine 3-hydroxylase SEO O NO; 8797 134 Nga20998 acetyl-CoA C-acetyltransferase SEO ED NO: 8824 16 Nga30830 acetyl-CoA C-acetyltransferase SECRED NO: 8786 123 Nga06246 hydroxymethylglutaryl-CoA synthase SECD NO; 8.723 60 NgaO2865 geranyl-disphosphate synthase f dimethylaliyitranstransferase SEO 3D NO: 8724 61 gaO2874 farresyl-digiospiate synthase SEO in No: 87A2 F9 NgaO3838 isopenterlyi diphosphate:dirtiethylallyl diphosphate isofrerase type SEC No: 8698 35 NigaO1590 Squaene inor oxygenase squalene epoxidase SEC D. NO: 8820 157 Nga30790 cycloartenol synthase SEC) No: 874 77 Nga03733.1 C 14-demethylase (sterol 14-demethylase) SEOE NO: 8685 22 Nga00758 D14-sterol reductase SEC No: 8779 116 Nga06:14 C4-decarboxylase (sterol-4-alpha-carboxylate 3-dehydrogenase) SFOD NO: 8776. 113 Nga05943 sterol-C24-methyl transferase (sterol 24-C-methyltransferase) SEC 3: NO: 872 49 NgaO2534 sterol-C24-methyltransferase (sterol. 
24-C-methyltransferase) SEO iO NO: 8720 NgaO2795 C5-desaturase (lathosterol oxidase) SEO O NO: 8731 68 NgaO3254 D7-sterol reductase (7-dehydrocholesterol reductase) SEC 3D NO: 8766 103 NgaO5293 D24-sterol reductase (24-dehydrocholesterot reductase) SEO DO: 8809 146 Nga30196 cyclopropy sterol isomerase (cycoeucalenol cycloisomerase) SEOD NO: 863. 3 NigaOO656 O5-sterol reductase lipid metabolic pathway genes SEC FO: 8802. 39 Nga30.020 acetyl-CoA synthetase SEO NO: 8706 43 NgaO2.68 acetyl-CoA synthetase SEQEO NO: 8803 AO Nga30028 acetyl-CoA carboxylase SEO NO: 889 156 Nga30783 acetyl-CoA carboxylase SEC3D NO: 8590 27 NgaOO899 malonyl-CoA:ACP-trans-acylase SEO E) NO: 8806 A3 Nga301 06 3-oxoacyl-ACP synthase SEC D. No: 8705 42 NgaO2;38 3-oxoacyl-ACP synthase I SEC O&O: 886 53 Nga30755 3-oxoacyl-ACP synthase it SEC to so; 37A7 3A NgaO42O.O3-oxoacyl-ACP synthase Gr. 23A U.S. Patent Apr. 29, 2014 Sheet 59 of 198 US 8,709,766 B2

pep: Nga rode description SEC NO; 877S 12 NgaO5827 fatty acid synthase SEO O NO 893 30 Nga20022 3-oxoacyl-ACP reductase SEO NO; 8697 34 NgaO1542 3-oxoacyl-ACP reductase SEC NO; 8778 1S NgaO5985.013-hydroxyacyl-CoA dehydratase SEO ONO; 862 99 Nga05091 enoyl-ACP reductase SEC NO; 8879 6 NgaO0802 3-oxoacyl-ACP reductase SEC. NO; 8772 09 Nga05627 3-oxoacyl-ACP reductase SEO O NO; 869. 29 NgaO1045.01 oeyl-ACP hydrolase SEC NO: 828 8S NgaO3084 elongation of very long chain fatty acids SEC O NO: 8673 NgaOO451 elongation of very long chain fatty acids SEC O NO; 8695 32 Nga01458.01 acyl-ACP desaturase SEC NO: 8675 32 NgaOO524 stearoyl-CoA desaturasa SEC D NO; 3704 41 NgaO2019 omega-6 fatty acid desaturase delta-2 SEO DFO: 888 24. Nga00817 oriega-6 fatty acid desaturase deta-12 SEO NO: 89 23 Nga06739.1 sphingolipid delta-4 desaturase SEC DO; 387 A. Nga30138 glycerolipid omega-3 fatty acid desaturase SEO DNO: 8789. 26 tga06665.1 glycerol-3-phosphate dehydrogenase SEO O NO; 8O 39 NgaO1869 glycero-3-phosphate dehydrogenase SEC O; 3758 95 NgaO4914 glycerol-3-phosphate dehydrogenase SEO ON: 8765 O2 tigaO5228 glycerol-3-phosphate dehydrogenase SEO NO; 880 .38 Nga30015 glycerol-3-phosphate dehydrogenase SEC NO; 3754 91 Nga04759 glycerol-3-phosphate acyltransferase SEO O NO: 36 93 Nga04850 glycerol-3-phosphate acyltransferase SEQ NO; 8665. 2 NgaO0059 1-acylglycerol-3-phosphate O-acyltransferase SEC NO; 383 50 Nga30581 -acylglycerol-3-phosphate O-acyltransferase SEO O NO: 8822 SS Mga30809 1-acylglycerol-3-phosphate O-acyltransferase SEO NO; 808 45 NgaO2265 1-acylglycerol-3-phosphate O-acyltransferase SEC NO; 879. 
135 Nga21116 phosphatidic acid phosphatase SEO O NO: 872 58 Nga02796 diacylglycerol kinase SEC NO; 877O O7 NgaO5586.1 diacylglycerol kinase SEO NO; 879 56 NgaO2737 phospholipid-diacylglycerol acyltransferase SEO O NO: 88: 43 Nga30544 diacylglycerol acyltransferase SEC NC: 868 O5 NgaO5465 lysophosphatidylglycerol acyltransferase SEO O NO: 867O Nga00335 polyketide synthase 5 SEQ NO: 8688 25 NgaO0818 monogalactosyidiacylglycero synthase SFO NC; 8877 & Nga00561 Suttoquinovosyltransferase SEO D.N.O. 8717 54. NgaO2886 UDP-sutroduinovose synthase SEO O. 883O 67 Nga30958 TAG-lipase SFO NC: 88.15 52 Nga30749 SAG-lipase SEO O NO: 8734 1 NgaO3422 acyl-CoA synthetase SEO NC: 877 08 NgaO5597 acyl-CoA synthase SEO NO; 8808 45 Nga30.70 ong-chain acyl-CoA synthetase SEO O NO: 829 86 NgaO3113 tong-chain acyl-CoA synthetase SEO O. 8682 9 NgaO0675 long-chain acyl-CoA synthetase SEO NC; 834 S. Nga30631 long-chain acyl-CoA synthetase SEO O NO: 869. 28 NgaOO919 tong-chain acyl-CoA ligase SEO NO 3. 8 NgaO2299 long-chain acyl-CoA Eigase SEO NC; 8864 NgaOO006 long-chain acyl-CoA ligase SEO NO; 882 49 Nga30568 tong-chair acyl-CoA ligase SEO 3 NO; 867. 1 NgaOO482,013-hydroxyacyl-CoA dehydrogenase SEO N: 86 S3 NgaO2679 3-hydroxyacyl-CoA dehydrogenase SEO NO 8782. 13 Nga06144. 3-hydroxyacyl-CoA dehydrogenase SEO 3 NO; 8735 72 NgaO3480 long-chain hydroxyacyl-CoA dehydrogenase SEO s: 8833 70 Nga31 30 short-chain acyl-CoA dehydrogenase SEO O NO: 873 1C Nga05705 short-branched-chain acyl-CoA dehydrogenase SEQ NO; 848 85 NgaO42O4,CAcyl-CoA dehydrogenase SEO NO: 827 84. Nga03053 acyl-CoA oxidase SEO ONO; 880 17 NgaO6128 peroxisomal trans-2-enoyl-CoA reductase SEC NO; 8736 73 NgaO3545 alternative oxidase SEO ONO. 8827 84 Nga30837 alternative oxidase SEO NO; 832. 
89 NgaO3289 alternative oxidase SEC D NO; 87SO 87 NgaO4370, acyl-CoA oxidase SEO ONC 8823 SO Nga30819 acyl-CoA oxidase SEO O NO; 8 38 NgaO1781.01 enyol-CoA hydratase SEC NO; 8668 5 NgaO071 enlyol-CoA hydratase SEO O NO; 89A 31 Nga2O52 enyol-CoA hydratase SEO NC; 88: 13 Nga06135 enyol-CoA hydratase SEC NO; 8699 36 Nga01710 beta-ketoacyl-CoA thiolase GRE 33 3 U.S. Patent Apr. 29, 2014 Sheet 60 of 198 US 8,709,766 B2

pep ID   Nga model   Description
SEQ ID NO: 8752   89   Nga04504.01   beta-ketoacyl-CoA thiolase
SEQ ID NO: 8824   161   Nga30830   beta-ketoacyl-CoA thiolase
SEQ ID NO: 8797   134   Nga20998   acetyl-CoA acetyltransferase
SEQ ID NO: 8824   161   Nga30830   beta-ketoacyl-CoA thiolase
SEQ ID NO: 8787   124   Nga06551   solute carrier family (fatty acid transporter)
SEQ ID NO: 8769   106   Nga05502.01   peroxisomal 2,4-dienoyl-CoA reductase
SEQ ID NO: 8726   63   Nga03028.01   patatin-like phospholipase
Carbon assimilation genes
SEQ ID NO: 8693   30   NgaO240   Carbonic anhydrase
SEQ ID NO: 8700   37   Nga01717   Carbonic anhydrase
SEQ ID NO: 8739   76   Nga03728   Carbonic anhydrase
SEQ ID NO: 8828   165   Nga30848   Carbonic anhydrase
SEQ ID NO: 8800   137   Nga21222   Carbonic anhydrase
SEQ ID NO: 8667   4   Nga00165.01   Bicarbonate transporter
SEQ ID NO: 8788   125   Nga06584   Bicarbonate transporter
Nitrate assimilation genes
SEQ ID NO: 87O   37   Nga02268   NAD(P)H nitrate reductase
SEQ ID NO: 8709   46   Nga02267   Ferredoxin nitrite reductase
SEQ ID NO: 8738   75   Nga03713   Urease
SEQ ID NO: 8669   6   NgaO2O7.01   Urease accessory protein ureG
SEQ ID NO: 8694   31   Nga01342.01   Urease accessory protein ureD
SEQ ID NO: 8829   166   Nga30904   Nitrate high affinity transporter
SEQ ID NO: 8746   83   Nga04130   Putative nitrate peptide transporter
SEQ ID NO: 8689   26   Nga00897   Nitrite transporter
SEQ ID NO: 8790   127   Nga06677   Nitrite transporter
SEQ ID NO: 8796   133   Nga20972   Ammonium transporter
SEQ ID NO: 8810   147   Nga30207   Ammonium transporter
SEQ ID NO: 8783   120   Nga06136   Urea Na+ high affinity symporter
SEQ ID NO: 8672   9   NgaOC38   carbamoyl-phosphate synthetase large subunit
SEQ ID NO: 8733   70   Nga03303   carbamoyl-phosphate synthetase small subunit
SEQ ID NO: 8795   132   Nga20220   ornithine carbamoyltransferase
SEQ ID NO: 8751   88   Nga04487   argininosuccinate synthetase
SEQ ID NO: 8737   74   Nga03647   argininosuccinate lyase
SEQ ID NO: 8676   13   Nga00526   arginase

FIGURE 23 C U.S. Patent Apr. 29, 2014 Sheet 61 of 198 US 8,709,766 B2

i Nga model Nrpkb -N rpkb GO SECD NO: 1 NgaOO398 2O7.6225 207.99664 beta adaptin SEO ED NO; 2 Nga QG392 784,4507 773.59444 glutamine-traigase SECD NO: 3 Nga00399 457.2264 413.26353 glutathione synthetase SEO NO: 4 NgaOC4O1 202.9731. 154.21684 protein SECD NO: 5 NgaOC43 95608 S83.13373 ---NA--- SEO NO: 6 NgaOC407 97.71078 92.2351.62 epidermal growth factor receptor pathway substrate 15 SECD No. 7 Nga00423 561.6046 534.892O2 ridine diphosphate-in-acetylglucosanine transporter hut1. SEO. No 8 Nga.00426.2 1419.718 1439.0992 chromatin modification-related protein eaf3 SECD NO: 9 Nga00396 393.3464 576.59419 protein SEO. O. Nd: 1 NgaO(394 739.9037 81.52243 puronycin-sensitive aminopeptidase SECD NO: 1 Nga00393 6558,025 6146.3496 pumpkin fruit trypsin inhibitor SEO ED NO; 2 Mga.00405 438.0791 383.47787 S-adenosylmethionine-dependent methyltransferase domain-containing protein SEOE NO; 13 Nga 20048.i. 923.9281 319.54214 rudix hydrolase SECONO: 14 Nga00391 586.678; 62.63382 protein SEO ED NO; 25 Nga G04.04 353.5354 363.76.34 domain-containing histone demethylation protein 3c SECD NO; 16 NgaOO400.3. 37.655 342.24876 cedivision control protein 45 SEO EO NO; 17 NgaCO395 472.393. 462.218 protein SEOD NO: 18 NgaOC408 1374,384 29,3429 ---NA--- SEO EO NO; 19 Nga00397.01. i44,845 365.12295 fidgetin-like protein SEOD NO: 20 Nga20104 607.9952 643.08936 protein SEQEO NO. 23. NgaOC4O6 286.456 325.90.33 ---NA--- SEO NO: 22 Nga2C131.3 392.5466 476.35383 hypothetical protein AURANDRAF 63258 Aureococcusanophagefferens SEC) NO: 23 NgaO394.6.2 7492188 SO36633 ---NA--- SEO ED NO; 24 NgaO3460 47.35376 56.525299 phosphatidylinositol kinase (pik-4) SEC) NO: 25 Nga03458.01. 2169.955 2504,3321 ---NA SEO, ED NO; 28 NgaO3461 38.439 396.48 ---NA--- SECONO: 27 Ngao3453.01. 
210.8067 224.78066 structural maintenance of chromosomes protein 3 SEO ED NO; 28 Nga03459 428.2946 300.19355 membrane protein SEC to NO: 29 NgaO3456 397.3903 370.07281 transcription elongation factor SEO ED NO; 30 NgaO34S5 262.334 436,004.5 ankyrin partial SEOD NO; 3. Ng303454.01 65.7944 688.1845 protein SEOD NO; 32 NgaO34S7.3 32.216 83.56478 protein SEOD NO; 33 Nga21127 205,5514237.28755 solute carrier family member 27 SEQED NO; 34. NgaO5422 O8.483 0.03284. ---NA.-- SEOD NO:35 NgaO3210.02 259,3477 279.21066 protein SECD NO; 38 Nga03209,02 3i.5789 98.187872 ---NA SEO NO:37 NgaO5414.01. 21.74.783 2062.85.42 peptide methionine sulfoxide reductase SECONO:38 Nga05435.0 324.324 401094 protein SEC) NO 39 Nga.05420.01 31.10.33 33438.712 ribosomal protein 27 SECONO: 40 Nga05433 8O8,2132 788.9366 ---NA--- SEO NO: A1 Nga.05416 386.2816297.20498 protein SECONO: 42 Nga.05417 755.0471 727.98764 tryptophan synthetase SEO NO: A3 Nga.05421 873.007 2142.0519 protein SECONO. 44. Nga05424 1797,619 1782. 1777 nmda receptor glutarnate-binding chain SEO MO: A5 Nga.05426 293,050. 258.476 --NA--- SECD NO: 46 Nga04327,02 178.0576. 23.06177 protein SEO ED NO; AJ NgaG5764 305.34 1340,187 membrane-associated protein in eicosanoid and glutathione metabolism SEO EO NC: A8 Nga.05771. 37.3737 3.46.61954 lysophosphatidic acid SEC) NO: 49 Nga2O723 93.75 74.472333 serine threonine protein kinase SEOE NC 50 Nga2O343 94.6565. 202.58955 traf2 and nick-interacting protein kinase-like SEC D NO: 5 Nga04326,02 428.3109 438.60149 protein SEOE NC: S2 Nga GS768.i. 229.0076219.4031 probabie serca-type calcium atpase SECD NO: 53 Nga20086, 274.6067 437,6752 serine threonine proteir SECEO NC: 54. 
Nga GS767 562.2776 438.2905 protein SECD NO:55 NgaO2263.02 1320.72 0950.608 fructose-bisphosphate aidolase SECEO NO: S6 NigaOS770 2.349.4 3449.9245 nucleosome chromatin assembly complex protein SEC to NO: 57 NgaO2264.02 153.238S, 168,560 cyclopropane-fatty-acyl-phospholipid synthase SEO ED NO: E8 Nga 04304.02 390.411 650.43499 protein SECD NO. 59 Nga.04302.02 234,5426 377.93588 paktipi protein SEC Nd: 60 NgaOSO96.3 346.491 2 404.62905 retinoblastoma binding protein 4 SECD NO: 6 Nga04303.02 490.0872 462.586.42 n-ethylmaleimide-sensitive fusion protein

FIGURE 24 A U.S. Patent Apr. 29, 2014 Sheet 62 of 198 US 8,709,766 B2

Nga model N: at N pke GO SEC O NO: 62 FigaO5101 683.6068 73.43837 ---A--- SEOE NO: 63 NgaO4401.02 420.6799 494-05287 ---NA--- SEC DNO; 64. Figa21200.1 802,2388 1023,9525 c190rf60 homolog SEC NC: 65 NgaO4400.2 159.6107 17500422 minichronosome maintenance protein 2 SEC D NO: 86 NgaO5100.1 272.91.04 285.80795 dra replication licensing factor mom2 SECR D NO: 67 iga2O2S5. 331, 1966 290.48261 probable palmitoyltransferase 2dhhcil-like SECONO: 68 NgaO1865.2 128,6667 363.9294 protein SEC) NO: 69 FigaO434(3.2 661.3295 756.3190.9 mitochondrion protein SEC NO: 70 NgaO5117 666,6667 797.81.039 protein SEC No. 7 NgaO51.31.1 333.648 322.89.032 calcium-transporting atpase type 2c merber 1-ike SEO NO: 72 Nga05125 9.230 93.74.399 ---NA-...-- SEO NO: 73 NgaO1335.02 703,7594 567,67974 protein SEO ED NO: 74 Riga2OC59.1. 1173.2 1059.2993 protein SECONO: 75 NgaO1334,02 5031.935 6632.5672 40S ribosonal protein s2 SEC DNO: 76 Nga(1088.02 700,4539 686.64.905 protein SECD NC: 77 Nga20927.1 5178.976 5806.8.78 protein SECD NO: 78 NgaO51.18 1110.123993.5883 cidc37 protein SEC NEO: 79 Riga.05122.1. 256,906. 266.3995 ---A--- SECONO: 80 NgaOS123 665.07 S68.304 --NA--- SEC DNO: 8 NgaClO37.2 93.937 779,73326 ---NA--- SEO NO: 82 NgaO5124 596 682.43738 sister chromatid cohesion protein dcci SECD NO: 83 FigaO536 623.6324 711.8846 cytochrome b5 dorrain-containing SEC OMO; 84 NgaO2421.02 1065.574 1197.392 phosphoglycerate kinase SEC NO: 8S Mga20253.1 907, 1895 738.61598 irr and pyd domains-containing protein 3-like SEC OMO: 86 8igaO4768.02 315.6342 389. 83.64 ---NA SECD NO: 87 Nga05137 2510.606 2473.3842 predicted protein Phaeodactylum tricornutum CCAP 30553) SEC DNO: 88 Nga2O778 292.102.8 226.74735 transducin wi-40 repeat SEC D. NC: 89 NgaO51.32 505.1.238 572.60616 ca2:cation antiporter family SECD NO; 90 NgaO2203,02. 
2106,762 1790.8269 protein SECONC: 9 Nga20097.1 19C.1932 218.09539 type i inositof polyphosphate 5 SECONO 92 Nga06060 463.51.93 538.51759 protease SEC DNO: 93 NigaO6063 1310.484 1311,8965CS ribosomal protein 3 SECONO: 94 NgaO6056 3280.692 1172.6975 predicted protein Thalassiosia pseudonana CCMP1335 SEC NO: 95 NgaO6058 923.295 82.403 ---NA--- SEC O NO: 96 Nga06059.1. 1347.354 1430,8389 protein SEC No. 97 NgaO6061 3.233.25 1223.7322 ce: adhesion domain-containing protein SECR DNC; 98 igaO6062 1048.626 1144,6359 rieske (2fe-2s) domain protein SEC NO: 99 NgaO605? 1101.551 1,477.9243 phenylalanine hydroxylase SEC NO: GO NgaC4O12.02 Z85.3695 857.9792 t-complex protein 1subunit alpha SEO D NO: 101 Nga20296 342,8928 329.56245 conserved unknown protein Ectocarpus siliculosus SEQ D NO: 102 NgaO6468 353,4994 345.65828 apicomplexan specific region near n-terminus SEC D NO: 103 Nga.06469 948.6166 1161.8589 mate efflux family protein SEQD NO: 304 NgaO6581 533,7079 416,8525 syntaxin 6 SEO D NO: 105 Nga00063.2 543.497 530.88726 acao transport system atpase SEC O NO: 106 Nga0658A 356.3238 54C, O606 SEO D NO: 107 NgaOOO70,02 910,2384 770,26593 ---NA SEC NC: 108 Nga00057.02 412,6568 515.55493 conserved hypothetical protein Phytophthora infestans T30-4 SEQ D NO: 109 Nga.00061.02 734, 1954 684.80307 ---NA SECR O NO; 20 \ga.00052,02 725,0415 63.31406 isoleucyl-trina synthetase SEC D NO: 311 NgaO005A.02 3,480.33 451.604 gamma-tubulin SEC D NO: 12 NigaOOO53.02 6413.408 6296,655 expressed unknown protein Ectocarpus siliculosus SEC NC: 113 NgaOOO62.2 197.3856 174.26703 biopterin transporter SEC D NO: 114 Nga20370.1 283,0189 324.45922 autophagy ubiquitin-activating enzyme SEO C MEC: 115 NgaO1899.2 238.9046 3.01.23928 autophagy-related protein SEQ D NO: 16 NgaO6248 3098.947 4042,4285 hydroxymethylglutary-synthase SEO D NO: 117 Riga.01898.02 25470.99 15573.132 light-harvesting protein SEO E) NO: 318 NgaO190C.O2 3330.409 3329.94 E4 protein SECR D NO: 119 NgaO6249 696.7509 875.972.57 acyl- wax 
alcohol acyltransferase 2 SEO } NC: 120 NgaO4989.02 994.8893 388.8546S protein SEQ D NO: 121 NgaO6270 122.3097 151.25471 pathogenesis-reated transcriptional factor and ef SECR D NO: 122 KigaO6272.1. 183,6115 330.39606 ---NA--- SEC O NO. 323 Nga20972 5904.439 20343.027 armonium transportar SEO D NO: 124 NgaO6269.03. 1489,596 1356,3627 ---NA--- SEO C MO: 25 NgaO6267 469.29825.42.9744; like protein n-terminal domain of phosphotransacetylase-like protein

FIGURE 24 B U.S. Patent Apr. 29, 2014 Sheet 63 of 198 US 8,709,766 B2

Nga model Nrpkb -N rpkb GO SECR D NO: 126 NgaO6271,01. 2306.054, 2335.2678 thiol-disulfide oxidoreductase doc SEONC: 127 NgaO6273.01. 77,77778 96.287.461 ---NA--- SEC D NO: 128 NgaO6266.01. 780.5841. 1109.7377 gist lipase acylhydrolase family protein SEO EBNC 129 NgaO6280.01 E3.1C044 42.572533 ---NA--- SEO D NO: 130 NgaO6268 SS36.43 FSO.7538. --NA--- SEQED NO: 133 Nga.06433 476, 19083.3882 transation initiation factor if-E SEO D NO: 132 Nga2O399 130,2521 220,743.05 stelar k+ outward rectifying channel SECD NO: 133 NgaO6437 272.7273 247.01.036 protein SEO O NO: 134 Nga2O161 289.2263 449.53516 shaker-like potassium channe SEC O NO: 135 Nga21210, 396.8254 554,181966-phosphofructo-2-kinase fructose--biphosphatase-1-ike protein SEO NO: 136 NgaO6438 4389,199 3892.1786 cyclophiin SECD NO: 137 NgaO6436 173.461.2 89,63088 rina family SEO ED NO: 138 NgaO6434 867.7897856.52342 phosphataseptc7 family protein SEO D NO: 139 Ngai)6435 4999.031 61.18.3824 -ascorbate peroxidase SEQEB NO: 140 Riga.06433 8604.491 8276.7305 60s ribosoma protein 44 SEC D NO: 143 NgaO8432 1385.586 3034.33 preprotein transiocase subunit SEQ O NO: 142 Nga20950. 1581.604. 1600.1298 conserved unknown protein Ectocarpus siliculosus SEO ED NO: 143 NgaO1308.3. 383.3764. 359.2226 methionyl-trina synthetase SEC D NO: 144 NgaO1304 1096.552 1065.0911 protein SEO ER NO 145 Nga01309,01. 1077.343 825.89107 protein SECR O NO: 146 NgaO1314.01. S37.064. 329.8O351 conserved unknow protein Ectocarpassificulosus SEQED NO; 147 Nga01305.3 2472.452 2697.7859 carrier protein SECR D NO: 148 NgaO1306 183.6036 88.643018 phospho serine aminotransferase SEQ -D NO: 149 NgaO1307 1448.664. 
1523.91.87 madh:ubiquinone oxidoreductase complex i intenediate-associated protein 30 SEC D NO: 150 Nga.00790 A9763 4377.6823 ---NA SEC NC: 153 NgaOO792.0i 88.2394 94.43459 inda receptor regulated 2-like SEC D NO: 152 Nga20855.3 288.7597 417,75883 --NA SEQED NO: 153 NgaOO796 10O2.509 (43.8.189 copper ion binding protein SEO D NO: 154 NgaOO79i 137.0433 1.9.21086 tryptophan tyrosine permease family protein SEQED NO: 155 Nga 00797 236.33 253.43404 cyclin 21 SEC D NO: 156 NgaOO795 426.21.02 470.63765 probable alpha-ketoglutarate-dependent dioxygenase abhé-fike SEQ D NO; 1S7 NgaOO794 1639.733 1699.1.275 fatty acid desaturase domain protein SEO O NO: 158 NgaOO793 9758.47 9081.4072 SOS ribosoma protein 28 SEQ D NO: 159 NgaO1153 233.9565. 256,16649 amino acid transport protein SEO EO NO: 160 Ngao1154.01. 513,587 528.36S ass 3 SECD NO: 16 NgaO1352.01. 187.301.5 257,92.284 --NA--- SEO EO NO: 162 NgaO1150. 730.5062 797.23647 -diaminopimelate arminotransferase SECR D NO: 163 Ngaii.55.0, 3923.423 41.63.9854 s-adenosylmethionine mitochondria carrier protein SEQEB NO: 164 Riga.01.156.01 3.09.6192 286.54685 --NA--- SEC O NO; 165 NgaO352.O 2026.278 2082,6096 ribosome biogenesis proteinsa2 homolog SEC NO: 166 Nga.01149.01. 371.1584. 514,72818 short chain dehydrogenase SEO ED NO: 167 NgaO1515 643.8095 701.52293 sodiur? bile acid symporter family protein SEC D NO: 168 NgaO1524 377.3585, 245,265. ---NA--- SEO EO NO 169 NgaO1525,01 170.5202 169.05963 ---NA--- SECRE NO: 170 Nga2O615. 1.37.4935 i41.41435 c-myc promoter-binding protein irb SEQED NO; 17 Nga01520 963.1206 1084.7704 glutaredoxin 2 SECR D NO: 172 Nga2O595. 4.81481, 56.67686 --NA--- SEQ D NO; 173 NgaO6576.2 178.2274 370,10321 brefed in a-inhibited guanine nucleotide-exchange SEC D NO: 174 NgaO1532 14.4.4665 14141,623 oligopeptidase to SECRED NO: 175 Nga01522. 
173.0959 392,324.92 brefed in a-inhibited guanine nucleotide-exchange SEQED NO: 176 NgaO1518 66.289 98.55339 ---A--- SEC D NO: 177 NgaO1516 427.4892 480,65575 nitroreductase-like grotein SEQED NO: 178 NgaO1523. 278.9474 225.19864 inosine-uridine preferring nucleoside hydrolase SEO D NO: 179 NgaO1517 1802.653. 1682.336 fructose--bisphosphate 2-phosphatase SEQ 3 NO: 180 Nga O1523 15 65.8702 ---NA--- SEC D NO: 183 Nga2O384 220,5607 19842416 predicted protein Thalassiosira pseudonana CCMP1335} SECR D NO; 182 Nga20236 221.458 84,76067 conserved unknown protein Ectocarpus siliculosus SEC) NO 183 NgaO1519 86,439 39.66S--...-NA--- SEQ D NO: 184 NgaO1193 695,181. 483,0983 - --NA SEO No. 185 Nga.01.194 12,689 a.39.18 ---NA--- SEQ -D NO: 186 Nga.01.192 555.0558 532,22066 w-nyb myeloblastosis via oncogene homologue SEO EO NO: 187 NgaO1191. 3554.609 3521.5591 nadh-cytochrome b5 reductase 2-Eike isoform 2 SEO D NO: 188 NgaO 130, 23.90.024 2.293.2585 Eumina binding protein FCURE 24 C U.S. Patent Apr. 29, 2014 Sheet 64 of 198 US 8,709,766 B2

Nga node +Nrpkb -Nrpkb GO SEO ED NO: 189 Nga.01137 332,632 273,071,553-mercaptopyruvates furtransferase SEO Ed NO; 190 Nga01. 138 323.4323 303,87751 3-mercaptopyruvate sulfur transferase SECR DNC; 391 NgaO1.13 1779.286 2313.264 protein SEC 3 NO 92 Nga2000 690.5738 654,3238 diency-reductase SEO ED NO: 393 Nga20721 37.3653. 350,23726 ---NA--- SECR D NO: 194 Nga2O697 222.7273 256.0371 protein SEC E NO: 395 Nga20437 303.6364. 289.51889 serine threonine protein kinase SEO DNC): 196 Ngao. 265 622.87 53.345. ---A--- SEQED NO. 397 NgaO2.264 356,9372 433,6616 thioredoxin reductase SECR D NO: 198 NgaO1262 E57.8947 375.94122 sphingosine-3-phosphateyase SEC : NO: 199 NgaO.263 218.3238 224,881.9 phosphoadenosine phosphosulfate reductase family protein SEO DNC; 200 NgaO266.01. 264.835 275, 18421 dhhczinc finger domain-containing protein SEC 3 NO. 20. NgaOOS. 104,9223 leg,46168 retrograde transporter SEO ED NO: 202 Nga00747.01 1671.614 1771.8947 sugar transporter SEO D NO. 203 NgaOOS66,02 653.7572 609.10.279 protein bud31 homolog SEC NC; 204 NgaOG750 1565.502 i293,36621-like protein SECR D NO. 205 NgaOO748 662.9297 707.5833 ferrochelatase SEO ED NO: 206 NgaOG745 97.52.931: 9653,146. --NA--- SECR D NO. 207 Nga00746, EO6.934 339.20609 pre-rina-processing protein tsri SEC : NO. 208 Nga2O529 439.7:63 455.36333 cathrin-adaptor gamina chain SEO D NO: 209 Nga2O705 312.2066 300.05072 ap-1 complex subunit gamma-1 SEQED NO. 210 NgaO0833 595,6933 507,40958 glucose-6-phosphate 2-dehydrogenase SER D NO: 211. NgaO0835 64-58797 77.2128 ---NA--- SEC E NO. 212 NgaO0834 276,7422 162.34866 y8236 dicciame: fulletpr-containing protein did b. go280363 SEC D NO; 213 Nga014.89 204.4502268.62S25 conserved unknown protein Ectocarpus siliculosus SEQ. O NO; 214 NgaO2,487 39.237 103,871.75 zinc transporter SEO ED NO: 215 Nga.01438 553.35.53, S46,97951 protein SEO D NO: 216 Nga014.86 589.6546 693.57237 bifunctional aspa?tokinase homeserine dehydrogenase SEC D. NO; 217 Nga20977.1. 
692.3783 601,79663 protein SECR D NO 213 NgaO0861.2 321.6031. 379,07894 ferrochelatase Actinomyces viscosus C505 SEO ED NO: 219 Mga.01.106.01 .262.956 2069,795 chalcone isomerase-like protein SER D NO: 220 Nga2O761.3 418.5293 517.04204 serine threonine-protein kinase sing SEC NC: 221 Nga01.07.01 S35.2381. 453.8633 isoarinyacetate-hydrolyzing esterase 1 honolog cerevisiae) SEO D NO: 222 Ngao. 108.01 2068,654 1876.8227 predicted protein Thalassiosira pseudonana CCMP1335) SEQED NO; 223 Nga22,000 326,6453 37,31973 ania-6a type cycin SEO D NO: 224 NgaOEO4. 528.6329 500.56093 alpha-glucosidase ii SEC E NO: 225 Nga(1044 983.6852 1033,2553 protein SEQ D NO; 226 NgaO1040 419.0317 484.65225 ubiquitin-conjugating enzyme SEO 3D NO; 227 NgaO2,042.0L 735,3333 615.27638 rael-like protein SEO ED NO: 228 Nga01045.01 3785.006 3574.5526 microcystin synthetase-associated thioesterase SEC D NO. 229 NgaO1043 52.87SS 23.SS9368. ---NA--- SEC NC: 230 Nga21.140.1 134.7068 iS1,06908 d-2-hydroxyglutarate mitochondrial precursor SEQ}E NO. 231 Nga2O552, 3 25.32s 3.78.3208 ---NA--- SEO D NO. 232 Ng420751.1 250,6427 261,75833 protein SECR DNC; 233 NgaO153G 556.2588 622.36437 sjogren syndrome antiger b SEC) O NO. 234 NgaO1509 637.621. 55585033 asparty aminopeptidase SEOD NO: 235 Nga21163.1 233.0827 54.74771 inyb-like dra-binding SEQED NO. 236 NgaO2.480 2.645.643 1647,7741 protein SEO D NO. 237 Nga20898. 116.3265 132.64089 ---NA--- SEO. :D NO. 238 NgaO1479.01. 3896.409 4264. 1115 translationally contro?ied tumor protein SEC D NO: 239 Nga 04160,01. 2642,633 2382.096 brain protein 44-like protein SEC3D NO; 240 NgaO463 50.929 .3S489 ---NA--- SEO D NC: 241 Nga.04.157 199,329 997,9442 ---NA-...- SEQ D NO. 242 Nga04159 1494,46 1292.5291. 26S proteason?e subunit pr&a SEC DNC; 243 NgaO416 348.0315 411,97007 meta: family SEO 3D NO: 244 Nga20961. 
443.3333 528.0765S magnesiurn and cobalt transport protein SEO ED NO: 245 Ng304158 419.5046 360,51904 glyoxalase bleomycin resistance protein dioxygenase SER DNC: 246 NgaO4162,01 35.3519 272.241.33 ---NA SEC : NO: 247 Nga2O375.i. 288.8:47 339.97993 protein SEO D NO; 248 Nga03952,01 470,342 A386.2837 protein SEQ: NO. 249 Nga03956.01. 679.941, 627,892.24 pyridoxamine 5-phosphate oxidase SECR D NO: 250 NgaO3954 4982.488 SS52.9077 choline-phosphate cytidylyltransferase b SEO. :D NO: 2S1 NgaO39S5 16857.14, 18321,732 alcohol dehydrogenase SEQ D NO; 252 Ng303953 809.2567. 12.8S96 ---NA-...-- FIGURE 2 A D U.S. Patent Apr. 29, 2014 Sheet 65 Of 198 US 8,709,766 B2

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
253 | NgaOSS01 | 777.78 | 845.07612 | protein
254 | Nga03503 | 199.074 | 260.77854 | ---NA---
255 | Nga03500 | 611.1888 | 645.39533 | S deoxy cytosolic type C protein domain containing protein
256 | Nga03502 | 371.3592 | 233.95453 | dtp1a10 protein
257 | Nga02234.02 | 420.6049 | 304.03363 | ---NA---
258 | Nga03537 | 131.2335 | 85.2940 | ---NA---
259 | Nga03536 | 1913.173 | 1979.2887 | serine threonine-protein phosphatase pp1-gamma catalytic subunit
260 | Nga03534 | 2398.05 | 2536.6:55 | ring zinc finger-containing protein
261 | Nga03535 | 746.0908 | 904.17368 | dual specificity phosphatase 10
262 | Nga21251.3 | 3252.33 | E.28468 | ---NA---
263 | Nga04223.01 | 669.0939 | 1243.6357 | trna guanine-in-)-methyltransferase-like
264 | Nga04222.02 | 558.6457 | 647.05873 | protein
265 | Nga20765 | 608. 209 | 633.5706 | traf and tnf receptor associated protein
266 | Nga04221 | 907.8014 | 882.20326 | protein
267 | Nga03914 | 1839.734 | 1903.2596 | oxidoreductase domain protein
268 | Nga039;3 | 757.7808 | 749.929E5 | replication factor c subunit 2
269 | Nga03629 | 1434.009 | 2048.7684 | protein phosphatase pp2a regulatory subunit b
270 | Nga03630 | 25.9542 | 89.924955 | protein
271 | Nga21380.3 | 252.2523 | 243.97363 | gtp-binding protein parf-like
272 | Nga20137.1 | 310.0522 | 349.29345 | ras superfamily gtpase
273 | Nga03043.02 | 2702.461 | 2792.9018 | mgc84239 protein
274 | Nga03C20.02 | 2188.825 | 1943.360 | golgi snap receptor complex member
275 | Nga20668.4 | 585.3659 | 29C.62374 | protein kinase domain containing protein
276 | Nga21011 | 964.775 | 1131.990 | copia trrider
277 | Nga21010.3 | 1209.77 | (48.9938 | gag-pol polyprotein
278 | Nga2010 | 353.9463 | 274.1337 | nad-dependent epimerase dehydratase
279 | Nga04230 | Q39.88 | 786.2356 | protein
280 | Nga03893 | 1742.857 | 1535.441 | fkbp-type peptidyl-prolyl cis-trans isomerase
281 | Nga03895 | 876.7507 | 903.17782 | protein
282 | Nga03896 | 1639.269 | 1283.558 | protein
283 | Nga03894 | 3576.208 | 253.4529 | ---NA---
284 | Nga03892 | 1051.075 | 1026.4515 | st-like protein
285 | Nga01896.02 | 626.2712 | 569.15682 | arrestin domain protein
286 | Nga20588 | 572.327 | 552.83616 | conserved unknown protein Ectocarpus siliculosus
287 | Nga03971 | 42.2843 | 427.5999 | oc10014585 relate
288 | Nga03970 | 5975.518 | 8133.8058 | expressed unknown protein Ectocarpus siliculosus
289 | Nga03725 | 657.3072 | 885.1363 | ef hand domain protein
290 | Nga21362 | 119.4539 | 29.39655 | lysine ornithine decarboxylase
291 | Nga03729 | 201.909 | 203.60344 | folic acid synthesis protein
292 | Nga03724 | 3312.798 | 3436.8335 | actin depolymerizing factor 8
293 | Nga04192 | 382.775 | 266.40299 | ---NA---
294 | Nga04190 | 1860.047 | 1186.9848 | mitochondrial phosphate carrier protein
295 | Nga02088.02 | 2302.264 | 2571.7791 | protein
296 | Nga20976 | 455.053 | 418.62844 | conserved hypothetical protein Phytophthora infestans T30-4
297 | Nga03785 | 836.3496 | 1910.3047 | ring finger protein 13
298 | Nga20229 | 155.5324 | 131.16403 | trna (uracil-5-)-methyltransferase
299 | Nga20749 | 272.1713 | 228.5723 | trna (uracil-5-)-methyltransferase
300 | Nga03783 | 576.6962 | 43G.5775 | tor repeat-containing protein
301 | Nga21044 | 347.1338 | 312.42546 | polymerase dna directed epsilon 3 p17 subunit
302 | Nga20783 | 772.9779 | 852.25023 | short-chain dehydrogenase reductase sdr
303 | Nga05796.2 | 149.2537 | 234.1935 | n -dimethylguanosine trna methyltransferase
304 | Nga03852.0 | 338.7755 | 446.55767 | ---NA---
305 | Nga03853 | 89.20188 | 62848 | ---NA---
306 | Nga03851 | 23.9522 | 27.2893 | ---NA---
307 | Nga03850 | 226.463 | 222.3438 | deoxyhypusine synthase
308 | Nga03750 | 133.9995 | 198.1778 | dna binding
309 | Nga21172 | 295.203 | 299.78799 | protein
310 | Nga20575 | 325.9005 | 343.78633 | sulfotransferase member-like
311 | Nga20603 | 355.224 | 334.58963 | kini 7 protein
312 | Nga20272 | 275.5299 | 296.376 | ---NA---
313 | Nga20934 | 68 SO2 | 9851CO | ---NA---
314 | Nga20367 | 215.8648 | 195.7993 | transcription factor e2 dimerization partner protein
315 | Nga03491.02 | 1636.64; | 1347.3407 | ---NA---
316 | Nga04018.1 | 2458.194 | 2300.5136 | type fieffector protein
FIGURE 24E

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
317 | Nga01851 | 1371.485 | 1378.1402 | protein kinase
318 | NgaO850 | O3 | 1237.3832 | ---NA---
319 | Nga02849.1 | 65.4275 | 245.6404 | ---NA---
320 | Nga01773 | 41.99354 | 73.981329 | pentatricopeptide repeat-containing protein
321 | Nga01771.01 | 653.7196 | 566.93284 | protein
322 | Nga04098.2 | 440.7895 | 793.71829 | phosphoribosylformylglycinamidine cyclo-ligase
323 | Nga01754 | 369.0037 | 279.80212 | protein
324 | Nga01757.01 | 527.2109 | 633.7287 | 2-oxoisovalerate dehydrogenase alpha mitochondrial expressed
325 | Nga02755.01 | .547,7583 | 73.59812 | ---NA---
326 | Nga01756 | 631.8632 | 583.05481 | protein
327 | Nga01855 | 656.8978 | 579.17031 | loc495188 protein
328 | NigaO)999 | 98.2406 | 17.28247 | fan partial
329 | Nga02597.02 | .519.4488 | 640.39068 | nucleoside diphosphate-linked moiety x motif 6
330 | Nga07239.2 | 581.0306 | 773.39182 | splicing factor 3b
331 | Nga02160 | 40.99 | 75.56968 | protein
332 | Nga02159 | 2853.456 | 3606.0456 | natural killer enhancing factor
333 | Nga02176 | 406.0797 | 460.77184 | intron-binding protein aquarius
334 | Nga02174 | 2205.747 | 2269.8199 | expressed unknown protein Ectocarpus siliculosus
335 | Nga20457 | 24.88 | 21.07985 | ---NA---
336 | Nga21075 | 102.0408 | 99.480663 | protein
337 | Nga20717 | 25.9343 | 8S 2941 | ---NA---
338 | Nga02193.01 | 3.625,663 | 479.539 | upf0414 c20orf30-like protein
339 | Nga02194 | 148.4552 | 132.24107 | nucleic acid binding
340 | Nga02195.01 | 382.659 | 314.4192 | pyruvate kinase
341 | Nga01669 | 409.3908 | 538.88472 | kh domain protein
342 | Nga01736 | 873.107 | 334.47723 | ubiquitin family
343 | Nga20243 | 351.4392 | 396.43753 | ---NA---
344 | Nga20374 | 313.6247 | 364.79086 | grip and coiled-coil domain-containing protein 1
345 | Nga02052.01 | 393.8547 | 450.84338 | ---NA---
346 | Nga21186 | 431.2977 | 351.43986 | nucleotidyltransferase domain-containing protein
347 | Nga02050 | 206.0988 | 232.36564 | protein
348 | Nga03605.2 | 115.0342 | 112.2734 | mitochondrial carrier protein
349 | Nga01673 | 33832S4 | 0.26932 | ---NA---
350 | Nga01672 | 523.591; | 638.86668 | cell division cycle protein 123 homolog
351 | Nga02287 | C3S.33 | 937.33458 | ---NA---
352 | Nga02288 | 364.726 | 443.30978 | zinc finger protein
353 | Nga20830.1 | 873.5294 | 270.80849 | retrovirus-related pol polyprotein from transposon tnt 1-94
354 | Nga06718.2 | 2232. O3 | 2335.0459 | elongation factor 3
355 | Nga027G9.1 | 39C.2676 | 366.34929 | protein
356 | Nga01712 | 6.8415 | 14.36859 | ---NA---
357 | Nga02133 | 596.577 | 458.18942 | predicted protein Thalassiosira pseudonana CCMP1335
358 | Nga0213 | 383.771 | 504.23741 | asparagine synthetase
359 | Nga02132 | 327.3273 | 357.32503 | dynamin sks protein
360 | Nga00253.2 | 3745.673 | 1601.4958 | ---NA---
361 | Nga02325.01 | 763.4409 | 183.45091 | rna binding protein
362 | Nga02324 | 353.829 | 369.25774 | ---NA---
363 | Nga01800.02 | 3475.895 | 4617.9188 | gcn5-related n-acetyltransferase
364 | Nga01825.1 | 561.6883 | 637.74832 | conserved hypothetical protein Phytophthora infestans T30-4
365 | Nga01829 | 93.232 | 9446805 | ---NA---
366 | Nga02333.2 | 36.0652324.0874.5 | | dna polymerase delta subunit
367 | Nga01704 | 482.4825 | 4.3.12526 | short-chain dehydrogenase reductase sdr
368 | Nga01707 | 393, | 1582.0233 | ---NA---
369 | Nga01705 | 835.6998 | 689.92993 | protein
370 | Nga02125 | 3000.459 | 832.95067 | hypothetical protein PPL(A68. Polysphondylium pallidum PN500
371 | Nga02123 | 773.239 | 542.59374 | nadh:ubiquinone oxidoreductase complex intermediate-associated protein 30
372 | Nga02124 | 336.7232 | 305.99829 | protein
373 | Nga02042 | 663.75 | 686.49951 | protein
374 | Nga02038 | i3523. | 351.0798 | ---NA---
375 | Nga03606.2 | 1402.083 | 1990.4424 | agc ndr protein kinase
376 | Nga02411 | 8.333333 | 6.09863 | ---NA---
377 | Nga05034 | A.O.543. | 474.1315 | ---NA---
378 | Nga04270.01 | 32.81.25 | 93.09.0437 | ---NA---
379 | Nga04915 | 852.984 | 465.34 | ---NA---
FIGURE 24F

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
380 | Nga20305 | 522.0555 | 450.64968 | glycosyl group 2 family protein
381 | Nga04536.02 | 8371.486 | 3099.6352 | light-harvesting protein
382 | Nga04607.0 | 98.1567 | 239.649 | ---NA---
383 | Nga04606.1 | 9.6129 | 101.33479 | intraflagellar transport protein 52
384 | Nga04635 | 691.1905 | 725.84043 | protein
385 | Nga04634 | 6594.635 | 5790.7007 | protein
386 | Nga04633 | 3442.59 | 3280.3653 | aspartic protease
387 | Nga21002.1 | 133.5227 | 184.64215 | phytochelatin synthase
388 | Nga04697 | 66.6638 | 48.4331 | ---NA---
389 | Nga04696.1 | 255.4865 | 258.36339 | pheromone-regulated membrane protein
390 | Nga04523 | 732.789 | 1689.3433 | glyoxalase domain-containing protein 4-like
391 | Nga04524 | 55.55555 | ...f523 | ---NA---
392 | Nga04525 | 290.4452 | 32.03889 | conserved unknown protein Ectocarpus siliculosus
393 | Nga20151 | 480.8531 | 453.195 | protein phosphatase 2c-related protein
394 | Nga20221 | 202.5478 | 164.20999 | protein
395 | Nga20745.1 | 496.535 | 545.4045 | ---NA---
396 | Nga21300.1 | 473.4694 | 530.56356 | udp-n-acetylglucosamine pyrophosphorylase
397 | Ngag4993 | 243.2141 | 301.09058 | udp-n-acetylglucosamine pyrophosphorylase
398 | Nga04997 | 1405.233 | 450.3014 | protein
399 | Nga03706.02 | 591.2863 | 553.97752 | protein
400 | Nga04405 | 587.8565 | 658.70988 | conserved unknown protein Ectocarpus siliculosus
401 | Nga04407 | 214.221 | 239.83035 | ---NA---
402 | Nga20779 | 453.1316 | 502.26733 | perq amino acid-rich with gyf domain-containing protein 1
403 | Nga20093.1 | 189.7856 | 87.1413 | exportin 4
404 | Nga21159 | 34.5919 | 74.94C13 | ribosome biogenesis atpase rix7
405 | Nga20583.1 | 42.32804 | 85.970948 | protein
406 | Nga20490.1 | 861C. | 494O13 | ---NA---
407 | Nga20359 | 231995. | 1664385 | ---NA---
408 | Nga04678 | 26.03896 | 13.331 | ---NA---
409 | Nga04590 | 765A32, | 80328A | ---NA---
410 | Nga04591 | 2558.065 | 2291.098 | ef-1 guanine nucleotide exchange domain-containing
411 | Nga0475) | 347.2222 | 28.83843 | ---NA---
412 | Nga04446 | 362.1622 | 93885098 | ---NA---
413 | Nga04441 | 11.922 | OSS.53S | ---NA---
414 | Nga04440 | 10.5 | s.948 | ---NA---
415 | Nga04443 | 315.395 | 459.39932 | ---NA---
416 | Nga04902 | 450.8475 | 36.3789 | ---NA---
417 | Nga04903 | 320.8396 | 302.0323 | tetracycline resistance protein
418 | Nga04904.1 | 134.3206 | 174.9553 | cysteine protease family
419 | Nga0250.02 | 11:49.22 | E3737.34 | ---NA---
420 | Nga04581 | 63.38389 | 93.278478 | selenoprotein t
421 | Nga04705 | 35.238 | 165.06422 | ---NA---
422 | Nga01343.02 | 695.6522 | 897.88395 | ---NA---
423 | Nga04491 | 19832 | 78.544 | ---NA---
424 | Nga20981 | 14.357 | 53.856944 | myb-like dna-binding domain containing protein
425 | Nga04831 | 4350.14 | 3858.3761 | conserved unknown protein Ectocarpus siliculosus
426 | Nga20929 | 248.269 | 6437.489 | pap2 haloperoxidase domain-containing protein
427 | Nga04522.4 | 578.1742 | 558.09849 | ---NA---
428 | Nga20871.1 | 1056.358 | 926.69724 | pterin-4-alpha-carbinolamine dehydratase
429 | Nga04554 | 28.183 | 3447.568 | ---NA---
430 | Nga20772 | 554.8246 | 701.9641 | protein
431 | Nga20330 | 822.9842 | 936.66:04 | uncharacterized udp-glucosyltransferase
432 | Nga04735.1 | 258.0486 | 349.25578 | aldehyde dehydrogenase family
433 | Ngag4381 | 8.364342 | 6.5436995 | protein
434 | Nga02230.02 | 23.58 | 958.3446 | ---NA---
435 | Nga04973 | 902.7778 | (79.762 | px domain containing protein
436 | Nga0320.02 | 325.5172 | 340.6584 | trigger factor
437 | Nga04972.1 | 98.7578 | 224.27204 | conserved unknown protein Ectocarpus siliculosus
438 | Nga03844.02 | 2072.273 | 1940.1417 | 10 kda heat shock mitochondrial
439 | Nga0439.1 | 223.254 | 293.1976 | dead box rna helicase
440 | Nga04760.1 | 284.9462 | 241.68929 | uncharacterized protein
441 | Nga07054.1 | a3.25 | 438.3945 | ---NA---
442 | Nga03620.02 | 513.8554 | 634.CSO08 | endonuclease exonuclease phosphatase
443 | Nga20555 | 3400.729 | 1298.3022 | retrotransposon ty1-copia subclass
FIGURE 24G

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
444 | Nga05790.2 | 953.020 | 993.34998 | major facilitator superfamily infs 3
445 | Nga06926 | 47.8873 | 228.85224 | ---NA---
446 | Nga07217 | 19856O2 | 150.37523 | egf-like protein
447 | Nga0726 | 28.8229 | 296.202 | ---NA---
448 | Nga07183 | 935.683 | 854.03583 | conserved unknown protein Ectocarpus siliculosus
449 | Nga07184 | 352 | 349.7 | ---NA---
450 | Nga06894 | 18.948 | 87.228 | ---NA---
451 | Nga20032.3 | 659.72 | 541.63697 | glutathione s-transferase
452 | Nga00239.32 | 3.09.AOC31 | 285.5939 | integral membrane protein gp3SS-like
453 | Nga0764 | 902.3a9 | 858.9549 | ---NA---
454 | Nga21013 | 80.3857 | 2023.3933 | hypothetical protein WE SWOO4538 Vitis vinifera
455 | Nga07115 | 6.931.9 | 58.238384 | ---NA---
456 | Nga06941 | 608.1871 | 57434626 | ---NA---
457 | Nga21017.1 | 32.827 | 316.17234 | nad-dependent epimerase dehydratase
458 | Nga07291 | 108.225 | 79.1855 | ---NA---
459 | Nga07292 | 4.6383 | 493.23716 | ribosome biogenesis protein p24
460 | Nga07181.1 | 133.99 | 40.87377 | glycosyl hydrolase
461 | Nga2058 | 502.2288 | 350.8841 | duf563 domain protein
462 | Nga0739 | 483.3333 | 53.300.46 | ---NA---
463 | Nga07214 | Ol.82s | 82.24.462 | ---NA---
464 | Nga0723 | 327.232 | 25.3884 | ---NA---
465 | Nga07028 | 292.654 | 252.84015 | conserved unknown protein Ectocarpus siliculosus
466 | Nga07065 | 3O46. | 2738.8149 | purple acid phosphatase
467 | Nga07066 | 356.725 | 1265.8845 | ---NA---
468 | Nga07356.3 | 329.2563 | 355.64313 | protein
469 | Nga07178 | 99.291.3 | 710.07254 | prefoldin subunit 5
470 | Nga0755 | 233.8028 | 288.86238 | protein
471 | Nga03591.02 | 688.023 | 743.30646 | possible sulfotransferase
472 | Nga05479.2 | 23.5789 | 228.04925 | ---NA---
473 | Nga20629 | 49.8566 | 3.13973 | ---NA---
474 | Nga07277 | 40.9481 | 771.07446 | protein
475 | Nga0690 | 234.S9. | 279.7284 | protein
476 | Nga06953 | 558.8526 | 43.57893 | ---NA---
477 | Nga07091 | 529.2259 | 62.634.58 | ---NA---
478 | Nga07172 | 204.955 | 224.45388 | ---NA---
479 | Nga06912 | A46S5 | 198.20291 | wd repeat
480 | Nga06944 | 228.8238 | 208.3397 | uncharacterized protein
481 | Nga06920 | 3.99.13 | 8529AC1. | ---NA---
482 | Nga06793 | 1735.88 | 1399.9269 | proteasome (macropain) beta 6
483 | Nga07042 | RSS.OF94. | 238.99923 | ---NA---
484 | Nga07050 | 447.9638 | 5390.99 | ---NA---
485 | Nga07290 | 523.388 | 89.83399 | ---NA---
486 | Nga20687.2 | 9.81221 | 3.36985 | ---NA---
487 | Nga20924 | 8.337 | 74.339584 | ---NA---
488 | Nga07279 | 2O2.15CS | 17.38562 | ---NA---
489 | Nga06811 | 73.23924 | 125.10589 | ---NA---
490 | Nga06900 | 645.9948 | 579.442 | ---NA---
491 | Nga07302 | 829.235 | 925.6732 | atp-binding cassette superfamily
492 | Nga07000 | 64.86486 | 189.32197 | formin like protein
493 | Nga06977 | 173574 | 12.895.642 | ryb domain-containing protein
494 | Nga06897 | 18.9923 | 184.6893 | adenylate kinase
495 | Nga06832 | 453.556 | 430.53605 | aldehyde reductase i
496 | Nga00443 | 95.285 | 257.8767 | serine threonine protein
497 | Nga00437 | 225.667 | 1155.4495 | high mobility group protein b3
498 | Nga00434 | 1589.892 | 1562.5805 | hypothetical protein OsI_22534 Oryza sativa Indica Group
499 | Nga00442 | 326.936 | 360.00812 | sh2 domain containing protein
500 | Nga00432 | 23O43, | 20055.906 | cytochrome b6-f complex iron-sulfur subunit
501 | Nga00438 | 15569. | 1784.3299 | poic acid synthetase
502 | Nga00471 | 36.364 | 49.237906 | ribosomal large subunit pseudouridine synthase d
503 | Nga00445 | 38.9087 | 38.5989 | ---NA---
504 | Nga2019 | 573.272 | 501.07724 | protein
505 | Nga00436 | 288.955 | 1058.6635 | ddb1- and cul4-associated factor 7
506 | Nga0044 | 382.6 | 596.78165 | 3-5 exonuclease domain-containing protein
507 | Nga20469 | 534.08 | 409.436 | conserved unknown protein Ectocarpus siliculosus
FIGURE 24H

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
508 | Nga00430 | 1072.738 | 1048.006 | methionine aminopeptidase 2b
509 | Nga00470 | 445.9907 | 484.54799 | cys met metabolism pyridoxal-phosphate-dependent enzyme
510 | Nga0043 | 469.3391 | 425.2709 | atp-binding cassette superfamily
511 | Nga00435 | 838.1692 | 750.33982 | n-myristoyltransferase 2
512 | Nga00440 | 196.0784 | 217.30032 | dual specificity phosphatase 1.5
513 | Nga00439 | 292.9624 | 234.54836 | arp1 actin-related protein homolog centractin beta
514 | Nga00433 | 2908.56 | 2649.076 | us snrna-associated sm-like protein sm.5
515 | Nga00429.01 | 7422.336 | 5958.4819 | thioredoxin f
516 | Nga00444 | 89.405 | 888.40967 | ---NA---
517 | Nga20025 | 394.893 | 249.23987 | ---NA---
518 | Nga21267 | 49.532 | B23.682. | ---NA---
519 | Nga21266 | 73.3945 | 39.13096 | trna modification gtpase
520 | Nga00259 | 77.8423 | 750.62269 | protein
521 | Nga00275 | 3345.168 | 3519.7442 | conserved unknown protein Ectocarpus siliculosus
522 | Nga00277 | 222.0249 | 219.9879 | C transferase
523 | Nga00276 | 734.334 | 7549.7158 | beta-lactamase
524 | Nga21179 | 17.372 | 300.08503 | pentatricopeptide repeat-containing
525 | Nga20706 | 295.0495 | 375.378 | salt-inducible protein
526 | Nga00274 | 127.5862 | 159.37235 | fidgetin-like 1
527 | Nga00265 | 1312.033 | 1549.5427 | t-complex protein 1 subunit delta
528 | Nga00280 | 275.8621 | 65.23 | ---NA---
529 | Nga00270 | 481.5972 | 740.58598 | protein
530 | Nga00273 | 278.5388 | Af,3369 | ---NA---
531 | Nga00278 | 5G4.3597 | 493.55335 | novel protein vertebrate deah asp-glu-ala-asp his box polypeptide 57
532 | Nga00282 | 166.7969 | 134.9811 | conserved hypothetical protein Capsaspora owczarzaki ATCC 30864
533 | Nga00250 | 18877.38 | 11804.309 | luminal binding protein
534 | Nga00259 | 493.7676 | 533.55914 | nuclear pore complex protein
535 | Nga00268 | 723.055 | 516.48842 | protein
536 | Nga21168 | 7.4408 | A38028 | ---NA---
537 | Nga00283 | 256.351 | 28.85764 | ---NA---
538 | Nga00262 | 1462.753 | 1667.934 | 30s ribosomal protein s1
539 | Nga20639 | 57.85.24 | 7161873 | ---NA---
540 | Nga00271 | 795.6587 | 552.6971 | protein
541 | Nga00272 | 484.5447 | 460.62329 | hypothetical protein NAR i9477 Lentisphaera araneosa HTCC2155
542 | Nga00263 | 2198.137 | 21774745 | phosphoglycerate mutase
543 | Nga00264.01 | 515.7685 | 588.96792 | ccr4-not transcription complex subunit 10
544 | Nga00256 | 1552.022 | 1528.06 | 5-methyltetrahydropteroyltriglutamate-homocysteine s-methyltransferase
545 | Nga00284 | 94.3128 | 12.94382 | ---NA---
546 | Nga2065 | 330.304 | 249.80549 | protein
547 | Nga00283 | 11729.49 | 12331.093 | protein
548 | Nga00279 | 462.212 | 589.68255 | protein
549 | Nga00257.01 | 818.779 | 835.86483 | px domain containing protein
550 | Nga20441 | 245.253 | 211.21553 | mannosyloligosaccharide alpha-mannosidase
551 | Nga00251 | 362.656 | 316.48702 | translation initiation factor eif-2b subunit delta isoform
552 | Nga00435 | 854.C034 | 626.1732 | ---NA---
553 | Nga00486 | 632 | 623.9425 | ---NA---
554 | Nga00507 | 972.2719 | 968.90335 | peptidyl-prolyl cis-trans isomerase 10
555 | Nga00480 | 7953.103 | 8292.3426 | ribosomal protein s18
556 | Nga00473 | 314.657 | 235.9004:3 | dihydrodipicolinate synthase
557 | Nga00510 | 365.4224 | 223.4569 | small rna degrading nuclease 5
558 | Nga00475 | 70. 1193 | 787.41486 | transmembrane protein
559 | Nga20859 | 237.755 | 2024.1553 | conserved unknown protein Ectocarpus siliculosus
560 | Nga00515 | 239.0164 | 23.0952 | ---NA---
561 | Nga00481 | 719.7279 | 791.42398 | fat-free protein
562 | Nga00476.01 | 72.374 | 546.54973 | 4-hydroxyphenylpyruvate dioxygenase
563 | Nga00511 | 243.7407 | 259.4422 | ---NA---
564 | Nga00498 | 639.7849 | 68.38909 | ---NA---
565 | Nga00432.01 | 3534.884 | 2768.9643 | beta-hydroxyacyl-acp dehydratase precursor
566 | Nga20150 | 42.7796 | 132.89129 | rrna biogenesis protein rrp5
567 | Nga00499.01 | 3887.179 | 3369.1353 | ---NA---
568 | Nga00487 | 4363.636 | 3749.2428 | phosphatidylinositol n-acetylglucosaminyltransferase subunit h-like
569 | Nga00513 | 998.651 | 99.305.02 | ---NA---
570 | Nga00503 | 3,485.666 | 622.7198 | p-type atpase
FIGURE 24I

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
571 | Nga00501 | 8380.399 | 881.3366 | protein
572 | Nga00472 | 5835.945 | 3934.2108 | permease
573 | Nga00534 | 339.736 | 388.853? | ---NA---
574 | Nga20028 | 78.94737 | 57.02313 | atp-dependent deadh dna helicase recq
575 | Nga00477.01 | 134.8601 | 100.80258 | conserved hypothetical protein Phytophthora infestans T30-4
576 | Nga00478 | 459.695 | 441.3353 | rrna biogenesis protein rrp5
577 | Nga20628 | 207.2829 | 221.5062 | atp-dependent dna family expressed
578 | Nga00512.01 | 74.11424 | 82.849564 | ---NA---
579 | Nga00474 | 549.564 | 52.92.99 | ---NA---
580 | Nga00514 | 700.854 | 338.83557 | family
581 | Nga20483 | 190.9722 | 206.86759 | prematurely terminated mrna decay factor-like
582 | Nga00506 | 890.5752 | 951.10254 | splicing factor 3b subunit 1
583 | Nga00509 | 7855626 | 63.9908 | ---NA---
584 | Nga00533 | 214.2305 | 196.87546 | protein
585 | Nga00438 | 1245.965 | 113.2806 | predicted protein Phaeodactylum tricornutum CCAP 1055/
586 | Nga00483 | 235.851 | 299.74696 | ubiquitin-protein ligase
587 | Nga00484 | 101.582 | 189.4638 | eukaryotic translation initiation factor 3 subunit
588 | Nga00500.01 | 4996.238 | 4992.331 | glutaryl- mitochondrial precursor
589 | Nga00508.01 | 6980.9 | 6051.5275 | glutathione peroxidase
590 | Nga00504 | 258.1903 | 306.71913 | low-density lipoprotein
591 | Nga00479 | 491.973 | 451.23534 | atp-binding cassette sub-family b member 9
592 | Nga00505 | 2501.88 | 1518.2477 | amino acid-polyamine-organocation family
593 | Nga00502 | 3516.592 | 5094.725 | c14.c577g 12199
594 | Nga00663 | 2639.663 | 2428.067 | chaperonin
595 | Nga20429.1 | 246.0137 | 199.86777 | elongator complex protein 4
596 | Nga00671 | 227.4939 | 202.946 | ---NA---
597 | Nga00688.1 | 444.6123 | 587.43351 | atp-dependent na helicase
598 | Nga20297 | 335.4762 | 214.92737 | mitochondrial protein
599 | Nga00564 | 109.2896 | 136.14416 | phd finger protein 3
600 | Nga00673 | 98.88 | 108.254 | ---NA---
601 | Nga20068 | 320.2576 | 393.2238 | pentatricopeptide repeat:0
602 | Nga00668 | 1857.707 | 3043.3426 | s-adenosylmethionine-dependent methyltransferase domain-containing protein
603 | Nga00669.01 | 324.263 | 434.276 | sumo ligase
604 | Nga00656 | 2106.164 | 2243.0814 | 3-oxo-5-alpha-steroid 4-dehydrogenase
605 | Nga00662 | 2:18,433 | 270.9606 | nadp-dependent malic enzyme
606 | Nga00657 | 2076.51 | 1615.8539 | zinc finger hit domain-containing protein 1
607 | Nga00658 | 4937.87 | 4748.4959 | mpv17-like protein
608 | Nga00670 | 26.1064 | 379.23359 | ---NA---
609 | Nga00659 | 825.2662 | 824.74685 | cobalamin synthesis protein pa2k
610 | Nga00654 | 82.391 | 8.353 | ---NA---
611 | Nga2199 | 1297.583 | 1143.686 | upf0587 protein c1orf123 homolog
612 | Nga00661 | 607.4534 | 527.48783 | protein
613 | Nga00667.1 | 895.9508 | 1007.724 | lipoate-protein ligase b
614 | Nga00672 | 62.39303 | 84.02443G | flagellar outer dynein arm-docking complex
615 | Nga00665 | 183.3539 | 208.8807 | uncharacterized protein corf24-like
616 | Nga00660 | 93.75 | 9082.3422 | ribosomal protein 32
617 | Nga00681.2 | 1095.077 | 1240.4002 | ---NA---
618 | Nga20187 | 228.1803 | 304.0223 | protein
619 | Nga20024 | 326.5027 | 279.68745 | conserved hypothetical protein Albugo laibachii Nc14
620 | Nga00666 | 994.763 | 919.28809 | protein
621 | Nga00011 | 449.3464 | 430.10759 | tili-like protein
622 | Nga00018 | 88.88839 | 100.29944 | dynein light chain
623 | Nga00015.01 | 3272.989 | 2182.0316 | ferredoxin
624 | Nga00004 | 1060.413 | 715.55438 | methyltransferase family
625 | Nga00009 | 435.1088 | 468.07408 | protein
626 | Nga00001 | 5567.879 | 5567.0727 | 40s ribosomal protein s4
627 | Nga00010 | 7307.191 | 80.26 | ---NA---
628 | Nga00005 | 765 | 895.224 | homoserine kinase
629 | Nga00012 | 123.89.3 | 15.93688 | phosphoethanolamine n
630 | Nga00038 | 250.4105 | 287.26155 | protein
631 | Nga0004 | 65.3024 | 104.52257 | histone deacetylase
632 | Nga00008 | 43.847254SCO2O3 | | repeat-containing protein a 04.
633 | Nga00006 | 297.7194 | 266.64633 | fatty-acid-ligase add9
FIGURE 24J

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
634 | Nga20828 | 266.923 | 198.3:514 | ag2 protein
635 | Nga00015 | 2230.867 | 1908.3945 | 2fe-2s iron-sulfur cluster binding domain protein
636 | Nga00017 | 80.71473 | 67.410122 | rna recognition motif-containing protein
637 | Nga00007 | 583.786 | 483.56618 | methyltransferase family
638 | Nga00013 | 294.7368 | 341.19576 | protein
639 | Nga00003.01 | 13782.2 | 16785.777 | ethylmalonic encephalopathy 1
640 | Nga00002 | 344.1842 | 437.33581 | protein
641 | Nga20792 | 263.238 | 328.53422 | ---NA---
642 | Nga02883 | 168.368 | 220.4505 | conserved hypothetical protein Albugo laibachii Nc14
643 | Nga02882 | 224.4898 | 231.3167 | fina domain containing protein
644 | Nga20557 | 409.7065 | 448.69848 | dhhc zinc finger domain containing protein
645 | Nga02880.2 | 960.1407 | 1295.3091 | ras-related protein rab-2-a
646 | Nga02872 | 486.309 | 394.340 | histone 2
647 | Nga02881 | 1240.246 | 1118.8227 | conserved unknown protein Ectocarpus siliculosus
648 | Nga20407 | 1363.897 | 1145.3104 | protein
649 | Nga20428 | 237.1542 | 181.25258 | alanyl-trna synthetase
650 | Nga20209 | 1389.6C | 1134.3614 | protein
651 | Nga02874 | O32.063 | 922.34856 | farnesyl diphosphate synthase
652 | Nga02871 | 1024.259 | 873.03064 | ubiquitin-conjugating enzyme e2 i
653 | Nga02876 | 3048.303 | 231.9 1954 | calcium-dependent protein
654 | Nga02873 | 1300.49 | 1416.8789 | protein
655 | Nga02878 | 1588.435 | 1447.9984 | protein
656 | Nga02879 | 1459.848 | 2282.77 | hypothetical protein Esi 0209C949 Ectocarpus siliculosus
657 | Nga02870 | 1575.112 | 1360.7216 | protein
658 | Nga02884 | 65.237 | (69.4476 | ---NA---
659 | Nga02877 | 339.838 | 3433.8954 | hypersensitive-induced response protein
660 | Nga02189.2 | 538.6473 | 556.26941 | violaxanthin de-epoxidase-related protein
661 | Nga00161.01 | 760.492 | 1253.322 | had-superfamily subfamily ia hydrolase
662 | Nga00150 | 234.2342 | 231.77303 | gamma complex associated protein 2
663 | Nga00155.01 | 1802.676 | 1561.4509 | anion exchanger family
664 | Nga21119 | 2242.798 | 1887.8583 | retino retinadehyde reductase
665 | Nga00166 | 23.396 | 273.58359 | ---NA---
666 | Nga00143.01 | 842.3303 | 97.8007 | conserved uncharacterized protein
667 | Nga20068.1 | 799.1845 | 721.05175 | receptor-interacting serine-threonine kinase 4
668 | Nga00194 | 3.73 | 157.8445 | ---NA---
669 | Nga00149.01 | 1285.985 | 1033.998 | ---NA---
670 | Nga00163 | 1542.049 | 1652.793 | nuclear receptor coactivator 7
671 | Nga00167 | 2613.397 | 2935.4726 | -aspartate oxidase
672 | Nga00170 | 1444.444 | 1633.572 | peroxiredoxin-like protein
673 | Nga00168 | 4567.293 | 4185.3861 | molecular chaperone
674 | Nga00172 | 4215.359 | 4080.022 | conserved unknown protein Ectocarpus siliculosus
675 | Nga00193 | 1161.348 | 1054.4245 | zinc finger protein
676 | Nga20242 | 463.0503 | 445.23788 | rna polymerase ii ctd phosphatase
677 | Nga00171 | 1501.247 | 140.3937 | enoyl-hydratase isomerase
678 | Nga20286 | 491.3605 | 530.2808 | rna polymerase ii ctd phosphatase
679 | Nga00169 | 85.7343 | 219.46038 | tumor suppressor candidate 4
680 | Nga20422 | 139.8376 | 186.0285 | conserved unknown protein Ectocarpus siliculosus
681 | Nga0044 | 99.2767 | 605.279 | caltractin
682 | Nga20351 | 152.5888 | 152.4988 | pentatricopeptide repeat-containing protein
683 | Nga21250 | 137.4408 | 97.542393 | protein
684 | Nga20271 | 157.5037 | 188.89236 | ---NA---
685 | Nga00147 | 1786.268 | 1309.9573 | abc subfamily abcg
686 | Nga00151 | E85.0534 | 185.0364 | copine family protein
687 | Nga00164 | 19765.55 | 15222.287 | heat shock protein hsp20
688 | Nga00145 | 177.2535 | 226.73882 | u3 small nucleolar rna-interacting protein 2
689 | Nga00162 | 979.8658 | 984.17983 | serine threonine-protein kinase ctr1
690 | Nga0016 | 62.4689 | 479.0589 | protein
691 | Nga00173 | 38.f5989 | 10.78595 | ---NA---
692 | Nga00174 | 78.2 | 184787 | ---NA---
693 | Nga20762 | 73.5632 | 94.627333 | fina domain protein
694 | Nga03257 | 622.009 | 645.79401 | rna binding motif protein 7
695 | Nga03264 | 84.50704 | 45.773448 | metal ion transporter family
696 | Nga03260 | 559.6591 | 518.02381 | protein
697 | Nga03262 | 345.3608 | 352.88358 | protein
FIGURE 24K

SEQ ID NO | Nga model | +N rpkm | -N rpkm | GO
698 | Nga03267 | 43.9768 | 408.3882 | ---NA---
699 | Nga20709.1 | 107.6923 | 168.50306 | protein
700 | Nga20491.1 | 112.5731 | 358.36754 | atp-dependent clp protease adaptor protein containing protein
701 | Nga2.277 | 577.111 | 627.4394 | pentatricopeptide repeat-containing
702 | Nga03254 | 577.2931 | 533.62647 | 7-dehydrocholesterol reductase
703 | Nga03261 | 491.9499 | 434.0687 | rna recognition motif-containing protein
704 | Nga03263 | 394.7484 | 65.922C5 | ---NA---
705 | Nga03256 | 77.5281 | 964.33099 | trehalase-like isoform
706 | Nga03266 | 95.63994 | 82.464967 | ---NA---
707 | Nga20643 | 38.3339 | 22.3823 | ---NA---
708 | Nga03258 | 843.2584 | 65.548 | ---NA---
709 | Nga03265 | 447.6651 | 415.15246 | protein
710 | Nga03259 | 718.6761 | 824.84438 | immature colon carcinoma transcript 3
711 | Nga03255 | 363.9333 | 198.36441 | fad nad -binding oxidoreductase family protein
712 | Nga20184 | 193.514 | 22.35185 | conserved unknown protein Ectocarpus siliculosus
713 | Nga0098 | 315.2318 | 359.4043 | single-stranded nucleic acid binding r3h protein
714 | Nga00196.0 | 469.4271 | 500.67626 | ---NA---
715 | Nga00202 | 149.4642 | 365.57044 | pentafunctional arom polypeptide
716 | Nga00197.0 | 501.9973 | 375.74227 | prolyl 4-hydroxylase
717 | Nga00205 | 753.7609 | 807.06503 | transcription factorie
718 | Nga00207.0 | 2368.566 | 2338.2092 | urease accessory protein ureg
719 | Nga00206 | 359.589 | 395.70.9 | nad-binding rossmann-fold-containing protein
720 | NgaO02.2 | 764.879 | 839.66 | ---NA---
721 | Nga00209 | 805.5556 | 926.33696 | protein
722 | Nga00200.0 | 608.4316 | 561.54786 | dna gyrase subunit a
723 | Nga00233.02 | 360.0424 | 132.60574 | nr card domain containing 3
724 | Nga00208 | 1932.836 | 1980.5897 | protein disulfide-isomerase
725 | Nga00203.0 | 242.0765 | 225.52575 | beta-adaptin-like protein a
726 | Nga20546.1 | 3033.708 | 983.05587 | n6-adenine-specific methylase
727 | Nga00195.91 | 87.215 | 608.06253 | conserved hypothetical protein Phytophthora infestans T30-4
728 | Nga00201.01 | 99.67846 | 313.42429 | dimc1
729 | Nga00204.0 | 2006.969929.4435 | | atp-binding cassette transporter
730 | Nga20535.1 | 648.855 | 61.374069 | pseudouridine synthase
731 | Nga20949.1 | 223.9289 | 383.0202 | protein
732 | Nga20254 | 389.5112 | 359.05565 | protein
733 | Nga20878.1 | 554.003 | 444.79159 | aspartate carbamoyltransferase
734 | Nga00199.0 | 1643.221 | 3667.3632 | dna-directed rna polymerase is 39 kda polypeptide
735 | NgaOC2C.O. | 208.4632 | 259.396 | uncharacterized protein c16orf7 homolog
736 | Nga02630 | 338.9728 | 30.9044 | hypothetical protein Esil 9:160070 Ectocarpus siliculosus
737 | Nga0767.02 | 2941.606 | 3178.54G5 | haca ribonucleoprotein complex subunit 2-like protein
738 | Nga02623 | 884.0852 | 809.7058 | phd zn finger-containing protein
739 | NigaOil 85.82 | 492.6185 | 522.6793 | cystathionine gamma-lyase
740 | Nga02645 | 25.959 | 88.263SGS | ---NA---
741 | Nga02522 | 847.3261 | 902.36789 | serine threonine phosphatase 2c ptc2
742 | Nga01764.02 | 4929.6 | 5797.468 | protein
743 | Nga02624 | 4731.634 | 453.7501 | endonuclease exonuclease phosphatase family protein
744 | Nga0252 | 596.2644 | 566.00013 | atp-dependent dna helicase pif1
745 | Nga0263 | 336.4594 | 38.12.58 | amino acid permease-associated region
746 | Nga0262G | 4806.452 | 4279.607 | ubiquinol-cytochrome c reductase subunit 7
747 | Nga02523 | 3232.196 | 371.7164 | molybdenum cofactor synthesis 1
748 | Nga02832 | 23.5581 | 56.8345 | ---NA---
749 | Nga02625 | 1965.352 | 2243.0015 | sulfate permease family
750 | Nga0262 | 142.4522 | 37.96627 | mrna decapping protein 2
751 | Nga02626 | 379.37 | 430.44395 | lysine-specific demethylase 5c
752 | Nga0503 | 1530.34 | 372.4792 | predicted protein Naegleria gruberi
753 | NgaCEO88 | 22.3022 | 83.25866 | serine threonine-protein
754 | Nga05084 | 59.5205 | 170.8037 | protein
755 | Nga05083 | 3769.884 | 3502.37 | succinyl- ligase subunit mitochondrial precursor
756 | Nga20444 | 224.2424 | 439.85863 | cytochrome p450
757 | Nga20671 | 295.082 | 399.5535 | cholesterol 24-hydroxylase-like
758 | Nga05036 | 3.375.543. | 434.3075 | succinic semialdehyde dehydrogenase
759 | Nga20512 | 539.6825 | 406.92915 | protein
760 | Nga05085 | 345.9583 | 330.6975 | isoleucyl-trna synthetase
761 | Nga20053 | 644.2335 | 742.32228 | oxidation resistance
FIGURE 24L

Nga model +N rpkb -N rpkb GO
SEQ ID NO: 762 Nga20290 44S.754 4S0.5972 2-dependent protease with chaperone function
SEQ ID NO: 763 Nga02969 1224.479 1329.283 cina-like seces
SEQ ID NO: 764 Nga02979 272,723. 28.502. ---NA---
SEQ ID NO: 765 Nga20152 91.1.3.303 894.43907 enoyl-hydratase isomerase
SEQ ID NO: 766 Nga02975 922.9885 973.0429 stomatal cytokinesis defective scd1 protein
SEQ ID NO: 767 Nga02968 587.3606. 464.43487 protein
SEQ ID NO: 768 Nga02967 493.5755 556.32722 conserved hypothetical protein Albugo laibachii Nc14
SEQ ID NO: 769 Nga02977 70,6SSS 83.8565S ---NA---
SEQ ID NO: 770 Nga02966 1955.729 3932.3314 glutaredoxin
SEQ ID NO: 771 Nga02976 204.4025 173.72622-octaprenyl-3-methyl-6-methoxy--benzoquinol hydroxylase
SEQ ID NO: 772 NigaO297. 1862.572 1669,653 protein
SEQ ID NO: 773 Nga02974 2705.443 2.519.3645 gamma-glutamyl phosphate reductase
SEQ ID NO: 774 Nga02980 805.456 396.394 ---NA---
SEQ ID NO: 775 Nga02972 32.23 4.SaS ---NA---
SEQ ID NO: 776 Nga02993 98.76543 120.35933 S-adenosylmethionine:trna ribosyltransferase-isomerase
SEQ ID NO: 777 Nga02995 825 23.636 ---NA---
SEQ ID NO: 778 Nga20080 740,9639 703.12324 hypothetical protein Esi G058 (057 Ectocarpus siliculosus
SEQ ID NO: 779 Nga02978 6,6667 3.3S.O.3a. ---NA---
SEQ ID NO: 780 Nga02970 431.6.163 4561.4488 leucine rich repeat protein
SEQ ID NO: 781 Nga02973 1285.75i 224,703 ccr4-not transcription complex subunit 3
SEQ ID NO: 782 Nga03007 172.923 3396,891. ---NA---
SEQ ID NO: 783 NgaO574.02 1614.892 1529.24 kazal-type serine protease inhibitor domain
SEQ ID NO: 784 NgaO3C01. 1710.496 i762.930 protein binding protein
SEQ ID NO: 785 Nga03003 32.27 2.93.8233S ---NA---
SEQ ID NO: 786 Nga2E166,1. 195.7295 231,2955 aec family transporter: auxin efflux
SEQ ID NO: 787 Nga01578.02 240.8027 19.5S425 auxin efflux carrier-like protein
SEQ ID NO: 788 Nga03008 593.992 585.9002 ---NA---
SEQ ID NO: 789 Nga03006 298.679 334849SS ---NA---
SEQ ID NO: 790 Nga03002 58.875 A2852 ---NA---
SEQ ID NO: 791 NgaO3COO 34.81.353 292.585 ---NA---
SEQ ID NO: 792 Nga02998 376,3838 331.75533 protein
SEQ ID NO: 793 Nga.0575.02 767,9924 915.00.443 tumor protein p53 inducible protein 3
SEQ ID NO: 794 Nga02997 O4.63 929.28665 coatomer subunit beta
SEQ ID NO: 795 Nga01577.02 349,4624 423.68424 ---NA---
SEQ ID NO: 796 Nga03377 313,4524, 361.07798 hypothetical protein AURANDRAFT 63229 Aureococcus anophagefferens
SEQ ID NO: 797 Nga00344.02 102.04.08 51.532569 ---NA---
SEQ ID NO: 798 Ngao:33.54 2333.97: 2325.9497 homoaconitate hydratase family protein
SEQ ID NO: 799 Nga03353.1 115.1762 9.627969 -degrading enzyme
SEQ ID NO: 800 Nga03355 537.2168 788.76258 vitamin k epoxide reductase family
SEQ ID NO: 801 Nga03357 16.85393 10.14264 sperm associated antigen .
SEQ ID NO: 802 Nga03352 147.6744 67.52339 major facilitator superfamily
SEQ ID NO: 803 NgaO084.02 1149.094 1359.2105 inorganic pyrophosphatase
SEQ ID NO: 804 Ng303125,03. 120,2186 2.39.10381 ---NA---
SEQ ID NO: 805 Nga20189 3.17.338 385.91406 conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 806 Nga03119 3781.176 3466.3486 expressed unknown protein Ectocarpus siliculosus
SEQ ID NO: 807 Nga03120 437,4463. 426.904 syntaxin-like protein
SEQ ID NO: 808 Nga03122 1355.675 3270,0652 dcp1-like decapping family protein
SEQ ID NO: 809 Nga03124 230,657 652.405 thioredoxin
SEQ ID NO: 810 NgaO317 3725.264 3056.1557 protein
SEQ ID NO: 811 NgaO318 2968,661. 677.0068 peptidase membrane alanine aminopeptidase
SEQ ID NO: 812 NgaO311.6 i5449.28 7019.2364 light-harvesting protein
SEQ ID NO: 813 Nga20975 2033.856 2185.2523 metallo-beta-lactamase family protein
SEQ ID NO: 814 Nga2 152 1194.376 1S(.4321 short-chain dehydrogenase reductase family protein
SEQ ID NO: 815 Nga2OOSE 3.64.859 160.56613 n-acetylglucosaminyltransferase-like protein
SEQ ID NO: 816 Nga03127 67.5 7S39.3582 ---NA---
SEQ ID NO: 817 Nga20113.1 756.9767 779.67652 transcription initiation factor tfiid subunit
SEQ ID NO: 818 Nga03121 357.875 271.74879 dhhc zinc finger domain containing protein
SEQ ID NO: 819 Nga03128 6,6557 6.89.908 ---NA---
SEQ ID NO: 820 Nga20750 237,6682 247.73512 short-chain dehydrogenase reductase sdr
SEQ ID NO: 821 Nga03126.0: 9669.35S 7501.5398 protein
SEQ ID NO: 822 Nga20990 432,8358 136.52711 aerobic respiration control sensor protein
SEQ ID NO: 823 Nga2094. 342.437 46.45338 aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase
FIGURE 24M
U.S. Patent Apr. 29, 2014 Sheet 74 of 198 US 8,709,766 B2

Nesmodel Nreh No 68 SECR DNQ: 824 NgaO2551. 6,924 83.95792 --NA--- SECR D No: 825 Mga.21222 239,3443 251.45234 mitochondria protein 18 kda SECR D No: 828 Ngao.2554.2 84.4,611. 826.95277 protein SEC ID No: 827 Ngaio.2552 1049.405 182.3404 fam49 family protein SEC ID No: 828 Ngao2546 S53.2915 516.14909 ohcu decarboxylase SERE NO: 829 Nga2000 406,4039 344.71386 conserved oligomeric golgi complex subunit 2 SERE NO: 83C NgaO648.2 95.6935 S13.82738 protein SECR DNG: 831 Nga2O68. 609,004 S36,2846 component of oligomeric golgi complex2 SEQD NO 832 NigaO25SO 92.4 724.46686 - NA--- SECD NO; 333 NgaO2554 1555.27 1434.2254 tetratricopeptide repeat family SECD NO: 334 NgaO2547.01 .214.516 E5.3729 26s proteasome non-atoase regulatory subunit 2 SECD NO: 835 NgaO2549 7600 6319.0374 voltage-dependent anion-selective charine protein 2 SECD NO: 336 NigaO2548.04 817.3913 679.85089 peroxisomal membrane 22kdampvil pimp.22) family protein SECD NO:337 Nga2G978.1 205.2023. 219.153.38 rate efflux nuti antimicrobias extrusion family SECD NO; 338 Nga2.293.1. 263,9692. 265.0688. ---NA SECRL) NO: 339. Nga2O832.1 327.645; 223.6737 ultidrug oigosaccharidyl-ipid polysaccharide flipase superfamily SEC NO:340 NgaO2553 386.803. 3233,5833 integral membrane protein SEQD NO: 341 NgaO2695 303.67C2 386.13015 grotein SECD NO: 342 NgaO2626.01 i319.672 1235.2862 hypothetical protein Dole 04.19 EResuiococcus oleoworans Hxd3} SECD NO: 343 NgaO2698 1454.07S 1396.4329 protein SEOD NO: 844 NgaO2700.0 149.425 1155.1728 violaxanthin de-epoxidase SECR D & O: 845 NgaO2702 368.92OS 29.04884 ---NA--- SEC D. 
& O: 346 NgaO270 434.4942 388.9223 myo inositol monophosphatase SEC L8:O: 84 Nga2O690 776,9857 (381.028 set domain protein SECR L & O: 848 Nga23033 1092.325 1265.3236 set domain-containing protein 3 SECR D &O: 849 NgaO2703 833,0658 814.30909 solute carrier family member b2 SECR D &O: 850 NgaO2699 393,501.8 348,04267 expressed unknown protein Ectocarpus siliculosus SEC D&O: 851 Nga2,139.1 3.13.253 107,01829 pot protection of telomeres 1 homolog SEC No: 852 NgaO2320.02 509,7907 517, 11.438 duf3688 domain-containing protein SECR D NO: 853 NgaO2705 385,654. 207,20087 transcription factor SECR D NO: 854 Nga20944, 3347.24, 3327.7288 dra-directed rna polymerase iii subunit rpc10 SECD &O: 855 Nga2O568 514.84.38 S27.23027 oxidoreductase domain protein SECR D \O: 856 NgaO2697 489.9598 S14.20985 eukaryotic translation initiation factor 4e member 2 SECR D \O: 857 NgaO2704 324.976 3A347 ---NA--- SECR DNO: 858 NgaO3236.02 290.8257 319.00743 ---NA--- SEC D. No: 859 Nga20853 O55.8S 77.4 reast cancer 2 Eike SEC ID No: 860 NgaO3233.2 466.959 SA953 toue cheese SECR DNC: 861. NgaO3223.2 3.36 2S8876 ---NA--- SECR DNC: 862 NgaO532S, 974.0562 1918.3058 plastidatpado transiocase SECR DNC: 863 Nga20719.1 255.137 34.29289 coiled-coil domain containing 78 SECR DNC: 864 NgaO3228.02 227.8594. 231.398.44 calcium-dependent protein SEQD NO: 865 NgaO5323.1 876.5432 084.5713 protein SEOD NC: 865 Ng303225.02 843.5606 355.2734 madph--cytochrome pas reductase SEOD No: 867 Ngao5327 4.5a5OS 34.58025 ---NA--- SEOD No: 868 Nga21130 230.6397 187.83349 protein SEC ID No: 869 NgaO5337 2Cs23 3AS37 ---A-...- SEC ID NO: 870 NgaO43A.S.O2 633.42.5 637.35789 anine oxidase SEC D. No: 871 NgaOS348 63.6.323 S.A3S46 ---A- SEC ID No: 872 NgaOS339 3347.973 2995,7274 beta-lactamase SEO D NO: 873 Nga20320 420.2335 356.16058 nuclease domain containing grotein SEO D NO: 874 NgaOS341. 
458785 53.833 ---NA--- SEO DNC: 875 NgaO3059 437.81.09 420.71872 protein SEO D No: 876 NigaO3047 62.7763 72.790256 protein SEQDNC: 877 NigaO3051.01. 599,352 581,59032 sentrin-specific protease 8 SEO DNC: 378 NigaO3052 202,8302 63.5070 uncharacterized protein SEO DNQ: 89 NigaO3061. 60,2739 84,086553 --NA--- SEO DNC: 880 NgaO3057 3.445.333 1458.755 iron-sulfur cluster assembly 2 mitochondria-like SEOD NO: 881 NgaO3062 60.60606 82,0637 ---NA--- SEOD NO: 882 NgaO3050 1356.15 1422,1086 a do keto reductase family protein SEC ID No: 883 NgaO3053 323,581.6 277.5.3068 peroxisomal acyl-coenzyme a oxidase 1 SEOD NO: 884. NgaO3046 S87.5262 637.27993 protein SEOID NO: 885 NgaO3345.01. 422.5352 411.93403 proteasome subunit alpha SEOID NO: 886 Ngai)3058.01. 293,4473 271.580O2 transmembrane protein 144 SECR DNQ: 887 NgaO3O49.1 3.050.495 123.2402 iron-sulfur cluster scaffold protein influ-like protein EIGURE 24 N U.S. Patent Apr. 29, 2014 Sheet 75 Of 198 US 8,709,766 B2

i Nga model +N rpka Nrpkh GO SECE NO; 888 NgaO3055 650,6922 585.24G64 tim21-like mitochondrial precursor SERE NO: 8.33 NgaO3056 3S4833 336,6788 to corrain nenoer 20 SECRE NO: 39 Nga(3044 56.265 559.26364 chaperone protein dnai SERENE): 891. Nga03060 282.2088 299.05233 ato-binding cassette superfamily SEO 3D 80: 892 NgaC3048 278.834 283,16213 protein SEC NO: 893 Nga(3054 25.7078 206,09474 soluble insi attachment protein receptor dihydroipoyllysine-residue acetyltransferase component of pyruvate dehydrogenase SEOE NO: 894 NgaCO357.01. 29.83 57.8587 fitochondria SE: NO; 395 Nga00372 34S2855 40.2796i5 ---NA--- SEC NO: 396 NgaCO37. 132,2307 193.84186 conserved hypothetical protein Phytophthora infestans 30-4 SEC NO 397 NgaOC35.91 8O3,557 7858.4813 atp synthase gamma SECC: 398 NgaCO339.01. SS303 164.2635 eukaryotic translation initiation factor 2-alpha kinase 1 SEED NO: 899 NgaOC368.1 282.2523 3O4.32702 eukaryotic translation initiation factor 2-alpha kinase 1 SE: NO: 900 NgaOC367 675.6453 686.8993.9 malate dehydrogenase SEED NO; 98). NgaOC353 58.428 842.26506 phosphatidate cytidylyltransferase SEO NO; 92 NgaOC390.01 2517388 33.722 ---NA--- SEC NC: 93 NgaOG362 93,0934 1038,692 ato-dependent clip protease atp-binding subunit SEC to NC: 94 Nga2O758 163.043S 1543339S ---NA--- SENC: 95 MgaOO3S3 491,891 516,42144 phero note processing carboxypeptidase SEC D NO: 9. NgaCO352 i3S,6882 584.77273 protein SEC E NC: 97 NgaOO365 37.32 493.884.74 serine protease family s9x SERE NO: 908 NgaCO359 3S48S3S 919,55848 protein SEC & O : 99 NgaOO369 993.5691 2033.33 carbon-nitrogen family protein SER 8): 9. Nga00353 23.98 24.5 ---NA..... a chain crystal structure of the protein bh0493 from bacius haloduransc-125 SEO NO: 9. 
NgaCO355 S383,33 58.398 complexed with zin SEO 3D 80: 92 NgaCO356 37.586 816,42755 peptidy-prody cis-trans isomerase SE: NEO: 93 NgaCO364 A70.841 44.77974 protein SEO NO; 94 Nga2O736 A65.163 577.3O49.50S ribosona: protein 27 SE: NO; 915 NgaCO366 Self.276 5297 he chain SEC NO; 96 NgaCO370 86,631 1996322 -NA SEC NO: 97 NgaCO354. 28,453 423.76023 set and zf-mynd domain-containing protein SEC NO; 918 Nga0036 555.4342 543.91277 gycosygroup SEC NO; 99 NgaO3360 34,472 886,5754 eukaryotic translation initiation factor 5 SEC O NO: 920 NgaO248 1609.689 269,4384. ---8A--- SEO Ed NO; 92 NgaO2482.01 43.488 468.300 4 pre-mirna-splicing factor Syf SECEO NO; 922 NgaO2486.1. 75.30a 706.73992 atp-hinding sub-family c (cft? mp) member 3 SECONC; 923 NgaO2485.1 17862 1876.66 protein SEO ED NO: 24 NgaO2328.02 3O4.96a:5 320.61676 signal recognition particle 72kda SEC to NC: s MgaO24.83.01. 1542.85 1293.5908 peptide methionine sulfoxide reductase SEO NO: 9. NgaO2484.1 36.9A3 G558.3 Aardvark Ectocarpus siliculosus SEO Ed No: 2 NgaO2593 690.834.S 637.19644 trafficking protein particle cornplex subunit 3 SERENC: 928 NgaO2604 259,909 2356,664 gutamyl-tra reductase SECRE NO: 929 Nga(2600 49.702 519.36567 tiki family protein kinase SEC NO: 93. NgaO2605 3395.3 3:28.488 isoform SECB NO: 931. NgaO2603 68.9S 56S.734 ---NA SECE NO: 932 Nga(2599 A23.3838 449,5333 two-pare calcium channel SERENEO: 933 Nga(2596 285.71.43 257,9234 dnai hornoog subfamily b member 12 SEOE NO: 934 Nga(2597. 555,331 626,051.56 Lusmall nuclear ribonucleoprotein auxiliary factor u2af SE: NO; 935 NgaO2595 29.4623 62.315073. dna polymerase epsilon catalytic subunit a SEC 3 O: 936 NgaO26O7 34,2489 3368. --NA--- SEC NO 937 NgaO2592.91. 324.4253 366.05837 nonophosphate synthase SECC 938 NgaO2598 250,573 2299.5796 si: (novel grotein vertebrate udp-gakatose transporters) SEO NO : 9S NgaO260. 1569.83 1348.636 protein SE: NO: 940 NgaO6487.2 139,5349 78.393 transportin 3 SE: NO; 94. 
Nga2.391 254,975 300.72784 mitochondrial inner membrane protein oxali-Eike SEO NO; ga Nga02606 87.1303 87.344997 protein SEC NC: 943 NgaO2602 28. 82 422.3036 dead box atp-dependent rina helicase SEO ED NO : 944 NgaOG539 99.885 988,08505 digalactosyldiacylglycerol synthase 1 SEC is NC : 945 MgaOG564 82.2 1633.2732 hsp90-like protein SENC : 94. NgaOO556 1823.432 1430,349 nuclear transport SEC E No: Nga20874 423,39s i45.7357 casein kinase 2 subunit beta SERENC: 948 Nga2O782 1705.882 1532,685 50S ribosoma: grotein 19 SECRE NO: 949 Nga(O537 SOAA-3 554.76382 protein F (GEJRE 24 O U.S. Patent Apr. 29, 2014 Sheet 76 of 198 US 8,709,766 B2

Nga model +N ph NP G2 SECR D NO: 959 NgaCOS63 2255.952 1841,9276 peptide methionine sulfoxide reductase b5 SEOD NO: 95). Nga2O573 4:04.893 339.32529 uncharacterized protein SEO ED NO: 952 Nga20405 3.11.2339 383.02.195 uncharacterized protein SEO ED NO: 953 NgaOC568 316,842 285,667 ccdc2-like protein kinase SEQ (NO: 954 Nga,23,265 154,5455 423.448 rmei-like gipase atase without a c-termina eh domain SEQ D NO: 955 NgaOO569 355,0725, 382,01004 rmel-like gtpase atpase without 3 C-terminaleh domain SEO ED NO: 956 NgaOC586.2 347.536. 33.748 recredoxin SEO D NO: 957 NgaOOS65 246.6509 222.7928 transcriptional sir2 family SEO 3D NO; 958 NgaOO557 847,185 1039.6723 glycy-trina synthetase SECR DNC; 959 Nga2O19 1761.337 3789.064 mirna 2 -o-methyladerosine-n -- SEO ED NO: 960 Nga005.94 572.687 635.28006 wid repeat domain 18 SEC D NO: 961 Nga.00572 612.6533 561.54736 protein SEC ID NO: 962 NgaOO535 827,0295 750.96892 histone-lysine in-methyltransferase SEC D NO: 963 NgaOG560 531.42O1 569,7564 tetratricopeptide repeat protein 30a SEQ D NO: 964 NgaOO538 1563.501 3538.88.01 small nuclear ribonucleoprotein associated protein b SEO 3D NO: 96S Nga20185 A64.5197584.18943 hnrnp arginine r-methyltransferase SEC D. N.O. 966 NgaOO558 12029,89 2677.572 ribosoma protein S16 SEQED NO: 967 Nga00570 42.7733 343.91025 retrograde transporter SEO 3D NO: 963 NgaOC596 465.0485 92,80883 na helicasernase SEO D No. 969 Nga20986 255.319). 237.06032 wo40 repeat domain-containing protein SEO ED NO: 970 NgaOQ541 169,878 190.9547 neural precursorce developmentally down-regated 3 SEQ -D NO: 971 NgaOC595 289.1434. 298,36288 dicer-like protein 2 SECR D NO: 92. NgaOO562 691.5094 80S.272025 -3 exoribonuclease 2 SEQ D NO 973 NgaOQ573 462.664. 133.3899 taurine catabolism dioxygenase SEO ED NO: 974 NgaOC559 A.2105 100.36 ---NA--- SEO D NO: 975 Nga2O1.78 475,773. 433,8S dicer. SEO 3D NO; 976 NgaOO536 S23.2323 088.7048 ---NA--- SEC D. N.O. 977 NgaOG561. 
3143.882 2S44.8223 gycosy group 1 family protein SEO ED NO: 978 NgaOO542 369,532. 35,767 ---NA--- SEC D NO: 979 NgaOO571 56.4356 218.798. ---MA--- SEC D. NC: 980 NgaO3381 450,0805. 394.2204 trna-dihydrouridine synthase 1-like SEC D NO: 981 NgaO3384 729.8637 71,07528 cytosolic iron-sulfur protein assembly 1 horroog cerevisiae) SEQ D NO: 982 Ng3.03382 908,09868,60503 ---NA--- SEC3D NO: 983 Nga20887 39.834, 2153.733 helicase now SEOD NO: 984 NgaO3383 58,848 90.177 raelicase SEQED NO: 985 NgaO3385 43.39.38 3538.8396 conserved unknown protein Ectocarpus siticulosus SEO 3D NO: 986 NgaO3380 1726.364. 1951.2643 protein SEO D No. 987 NgaO3388 388,8463 33.6926 ---A--- SEO ED NO: 988 NgaO3379.0 1003.876 948.87934 conserved hypothetical protein Phytophthora infestans 30-4 SEQ -D NO: 989 Nga.03386 523,0927. 627,40351 protein SECR D NO: 990 NgaO3387 A381,694. 4938.0SS ---MA--- SEG D NO: 99). Nga2O108 248.2447 344,41841 eukaryotic translation initiation factor 4) SEO ED NO: 992 Nga03389 6,34483 0272.46 ---NA--- SEO 3D NO: 993 NgaO429.02 3957.888 4347.07.05 xylulose kinase SECD NO: 994 NgaO2900 487.857 518.0684 ribosoma protein 11 methyltransferase SECR D NO; 995 NgaO2908.1. 1334.945. 1391.3009 protein SEO ED NO: 996 NgaO1433.02 155.354 150.8.197 ---NA--- SEC D NO: 997 Nga.01432.02 137.793 204,38376 amidophosphoribosyltransferase SEC ID NO: 998 NgaO2906 459,8062. 421,3802), arabinose 5-phosphate isomerase SEC D. N.O. 999 NgaCi427.02 422.5533 47499438 mitochondria inner merbrane protease ap23 homolog SEQ D NO: 1000 Ng30.1426.02 551.3035 429.801.96 ucip-glucose 4-epimerase SEO 3D NO: 1001 NgaO425.02 .500,775, 1598.8197 protein SECD NO: 1002 NgaO3431.02 8,438839 7,3129718 protein SEO ED NO: 1003 NgaO2903 6260.047 7722.63 cytochfore c. SEO 3D NO: 1004 NgaO428.02 796.5363 725,67295 snf7 family protein SEO D No. 
100S Ngao.290A 3660.081 1685.31, calcium-dependent protein kinase SEO ED NO: 1008 NgaO4498.02 165,4135 222.61951 conserved hypothetical protein Phytophthora infestans 30-4 SECR D NO: 1007 NgaO6724 513,5135 576,33055 ipase class 3 SECR D NO: 10O8 NgaO6728 719,0476 964,594.03 serine threonine protein kinase SEC D. N.O. 1009 Ngata 729 14.35 285.1024 conserved unknown protein Ectocarpus siliculosus SEO ED NO: 101G NgaO6726 2934.808 2558,651.5 malate Synthase SEO D NO: 1011 Nga,06725 1471.335 3440.2214 atp-dependent rna helicase SECD NO: 102 Nga2O336 552.4862 560.56858 conserved plasmodium protein SEC D.N.O. 1013 Nga20947 539.5778. 477.30889 mitochondria protein FGFRE 24 P U.S. Patent Apr. 29, 2014 Sheet 77 of 198 US 8,709,766 B2

Nga mode +Nrpkb Rrpkb GO SEO EDMO: O3.4 Nga2O2C5 500.4292 577.4.483 protein mero1. SEC ED NO: 1015 Nga2O380 2.445. 33.2944. ...N.A.-- SEC ID No. 106 Nga2.293 459.4595 653.8439. ---NEA--- SEO ED NO: 1017 Nga20064 2607.6022232,0582 conserved unknown protein Ectocarpus sticulosus SEC EO NO: 108 Nga20046 S60.0248 658.25231 imidazoleglycero-phosphate dehydratase SEO ED NO: 3019 Nga23.89 22.591 387.7952 ---A--- SFO ED NO: 1020 Nga21068.1 1560976 195.51052 glutamate carboxypeptidase 2 SEQED NO: 1021 Nga20403.1 232.8767 204.03379 glutamate carboxypeptidase SEO ED NO: 3022 Nga2O577.1. 202.27 343-38296 ---NA--- SEQED NO; 3323 NgaO8109 553 A-3898---NA SEO ED NO: 1024 NgaOS113 556.7568 543.24345 hypothetical protein RHA1 roo41.71 Rhodococcus jostii RHA SEO HD NO: 3.025 Nga}61.38 SAS392 S65.3S37 ---NA--- SEO ED NO: 1026 NgaO6114 198.7871 381.2640S had dependent steroid dehydrogenase-like SEO EO NO: 102 NigaO615 250.2005 246.961.66 c2h2-type zinc finger-containing protein SEQED NO: 1023 NgaC6111 6255.613 6012,3537 protein SEC ID NO. 3029 NgaO6110 746.8847 70.34S histidyl-trina synthetase SEQED NO: 1030 NgaO3152.1 39.5793 293,594.96 metalo-beta-lactamase domain-containing protein SEC D NO: 1031 NgaO3154 88 87.019793 protein SECRED NO: 3032 NgaO3144 968,478 27.279 ---NA--- SEO ED NO: 1033 NgaO3147 353.6836 370.08688 ppgpin synthetase SECR DNC: 034 NigaO31.45 358.8744 342,5616 vacuolar arrino acid SEO ED NO: 1035 NgaO31.57 30.4336 299.43352 ---NEA-...- SECRED NO: iO38 NgaO3153 237,6812. 345.3.894 tra (guanine---frethyltransferase SEC ED NO: 1037 Nga2:223 313.2184 289.43493 thioesterase family protein SECRED NO: 3038 NgaO3155 289.5377 271.46739 protein SEO ED NO: 1039 Nga23154 581.1733 610.33509 mediator of irra polymerase itranscription subunit 3. SEC ED NO: 104C NigaO3148 42.64871 52.277283 kinesin-Sike protein SEO ED NO: 1041, Nga(3150 33.626 3.87.60635 toxin biosynthesis SFQED NO: 1042 Nga03151. 691.2669 767.45726 dha-3-methyladenine glycosylase SEQED NO: 1043 NgaO3156 332.3782. 
364.818 lysine decarboxylase SEQED NO: 104A Nga.03149 46.2204 67.82544 protein SEO EDMO: 1945 Nga2CO73 668.51.17792.83629 abi-philin partia SECRED NO: 1046 NgaO1458,02 7968.763 91.66.4674 delta-9 acy-desaturase SEQ HD NO: 1047 NgaC1459.02 828.3642 775.308 rethiorine amopeptidase SEO ED NO: 3048 Nga22.224. 93.61702 89.142592 conserved unknown protein Ectocarpus siculosus SEO EDMO: 1049 NigaO3480.02 97.3286 36.960738 riker cina 572O467c03 isoform Crata SEOD NO: 1050 MgaC5263 6.3053 40,523.4 ...-NA--- SECRED NO: 3051. NgaO5276 39S.1477 364.54188 protein SEO ED NO: 1052 Nga2O764.1 329.5924 273.84768 tubulin tyrosine ligase SECR DNC: 1053 NgaO5260 875.3103 650.33583 nicotinamide nucleotide transhydrogenase SEO ED NO: 1054 NgaO5261 93.5.1751842.5623.2 protein SEQED NO: 2055 NgaO4994.02 756.4935 695, 19235 stromal cell-derived factor 2 SEQED NO: 1056 NgaO5636 2555.225 2637.344.7 ppgpp synthetase SEO ED NO: 1057 NgaO5641. 235.51.23 34.9.5275 mgcolipin-like proteir SEGED NO: iO58 NgaO4806.02 6873.897 6980.2338 60s ribosoma protein 39 SEC ED NO: 1059 NgaO563S 1593.084 2002.2593 pkd domain-containing protein SEO ED NO: 306 NgaZG957.1, 523.0461 366.86683 cytochrome c oxidase asserrhiy proteir cox19 SFO ED NO: 1061. Nga20109.1 316.7128 290.08264 folate-binding protein SECRED NO: 1082 NigaO5640 3.24.592 9.6564. ---A--- SECD NO: 3063 NgaC4995.02 2294.228 2284,0334 nad-dependent epimerase dehydratase SEO ED NO: 1964 NgaO4996.02 2387.654. 1424,8908 into pyrophosphohydrolase SEO ED NO: 1065 NgaO2527 941,8239 137.7363 predicted protein halassiosira pseudonana CCMP13353 SEO HD NO: 3.066 NgaO2522 iSO.734 1237,7223 membrarie protein SEO ED NO: 1067 NgaO2S23 2026.3S4. 1517.7963 expressed unknown protein Ectocarpus siliculosus SEQED NO: 1068 Ngao2523 4.07.0568 542.54939 intraflagear transport protein 7274 SEQD NO: 1069 NgaC2524.01 i8205.19 18242.249 predicted protein Phaeodactylum tricornutum CCAP 1055/ SEC ID No. 
1070 NgaO2545 224,379 29.96S ---NA SEQED NO: Of1 NgaO2526 204.467 213.6362 ---NA--- SEC D. No: 1072 NgaO2529 74985, 85.858933 ...NA--- SECRED NO: O73 NgaO2533. 394.44 83.53899 ---NA--- SEQED NO: 1074 NgaO2530 350.6173 230.02003 ubiquitin-specific protease, putative Phytophthora infestans 30-4 SECD NO: 1075 NgaO2525 282.0373 343.40447 dead-cx atp-dependent na helicase 33 SEOD NO: 1076, Nga2C966 408,882 502.22665 ubiquitir carboxyl-termina hydroase 24 SEO EC NO: O7 NgaO5.370 508.7719 459.4523.50s ribosoma protein 15 FIGURE 2 A Q U.S. Patent Apr. 29, 2014 Sheet 78 of 198 US 8,709,766 B2

Nga model -Nrpkb GO SED NO; 1078 NgaOSO71 86.9985 508,4803 acetylornithine deacetylase SECED NO; 09 NgaOSO72 1534,29 i215.3988 anion-transporting atase SEED 80: O8O NgaO48.4.2 297.8036 20.082 ---NA SEGED NO: 8. NgaOSO31 129,2S17 162.1664 subtilisi-like serine peptidase SECD O: 1082 NgaOSO74 134741. 93.365389 r-ethylmaleimide-sensitive factor attachment gamma SEOD NO: 83 NgaOSO73 38.232 2932S8---NA--- SECD NO: 38.4 NgaO5056 266.6667 238.33.147 neuro-oncologica wentral antigen 2 SECD NO: SS NgaOSO53.01. 3648.88 3444.6372 vacuolar hi-hatipase b subunit SEO ED NO; 1086 NgaOSO55 SC2,399 558.74187 secoheptulose- -bisphosphatase SECED 80: O8 NgaO4453.02 2007.277 23208,969 inorganic phosphate SEED FO: O3S NgaOSO52 2484.892 24.67.2796 ---NA--- SEO D NO: 189 Nga05057 24-7AS27 21.286256 inorganic phosphate SECR D NO: 190 NgaOS620.01 133398 1892.5052 polyubiquitin-like protein SEC D. N.O. 9. Nga05622 A62.65 599.17794 trina pseudouridine synthase a SECD NO; 92 NgaO5623 3O4. 358.3936 tata-box binding protein SEC D. NO; 33 NgaOS621. 888.82 747.05789 crystal protein SECED FO: O94 NgaOS624 66.579 6088.2492 ribosoma protein s5a SECD NO: 95 NgaO445.2 934.6697 532.933 death (asp-gu-aa-his) box polypeptide 36 SEO DO: 96 NgaO6094 136.065 1292.4346 mitotic spindle-associated immx.d cornplex subunit mig18 SECR D NO: 197 NgaO6098 14,4682 278,72756 methyltransferase type 11. SEC D. N.O. 198 NgaO6096 134.322 150.2512 ctosynthase SECD NO; 99 NgaO6095 1733.326 2393.5881 6-phosphofructo-2-kinase fructose--bisphosphatase short form SEOiD NO; NgaO6097 6.5.3i 61165,061 protein SEO ED NO: 3. Nga(62O3 16079 89.09633 protein SECD NO; 12 Ngao6199 358,484 406,7777 pre-inna-processing protein 40a SEO DO: 13 NgaO6198 343.48.35 242.7S175 protein SEQID NO; 104. NgaO6204 1927.28 2155.64. 
--NA SEO D NO: OS NgaO620 78.21 753.86581 carrier protei SECR D NO: Os NgaO6197 16414 1029.9971 sery-trina synthetase SEOD NO; 7 NgaO62O2 434.806 1354,763 protein SEO ED 80: 08 Nga(16201 1S286 566.57895 ---NA.-- SEOD NO: 39 NgaO8196 272.6436 33s.60632 ---NA--- SE DO; O NgaO4202.02 F433 699,28622 udg-glucuronate decarboxylase 1 SECR D NO; 1. NgaO4201,02 9,259 756,974:93-oxoacyl-acyl-carrier-grotein synthase 2 SEO D NO: 11.2 NgaO6687 A83,641. 55.393 ---NA--- SECR D NO: 13 NgaO4206.02 898.427 47.9424 ---NA SECD NO: 14 NgaO42O3.02 234.97.8 2695A. ...NA-. SECRED EO: KigaO6689 240.919 265.38161 exosome component :O SEO ED NO: . Nga(6686 323.92. 324.67899 protein high chlorophy; fuorescent 107 SEOD NO: NgaO2345.02 S2.884 1729.2092 glutamate dehydrogenase SEQID NO; 18 Ngade,372 2308.13 22502.43 ---NA--- SEO D NO: 19 NgaO6375.3 985,9296 3.047.8519 protein SECR D NO; 2 Nga06376 33901s 4130.802 tpainf: won willebrand factor SEC:DO; 12. NgaO6371 2529.524 i893.3113 gap-mannose-epiferase SEED NO: 2 NgaO6377 13.85 84.67347 ---NA--- SEED NO: 1.23 Nga2O771 3.454 297.55121 suppression of tumorigenicity 5 SECD EO: 24 NgaO6374 75.2519 565.4793. ---NA--- SECD NO; S NgaOS670 744f7 634,815 --NA--- SEO D NO; 26 NgaO5667 580.462 484,95324 protein SECRD NO; 2 NgaO5675 9.a652S 66.335829 ---NA--- SEO 3D NO; 128 NgaO5674.01. 694.783 1273.9773 light harvesting complex protein SECRED NO: 129 NgaO5669 969.274 108.8866 cysteine subfamily SE: NO: 3. NgaOS672 479,224 354.94502 endoplasmic reticulum protein SEOD NO: 3. NgaO3541.2 738.3513 652.699...NA... SECD NO; 32 NgaoS666.3. 126.6 3.299.2888 protein SEOD NO; 33 Ngao S68 OE,035 325,39042 hypothetical protein Partial Ectocarpus siliciosus SECR DNO; 34 Nga0.539.2 338.326 SO7.73809 hypothetical protein GRG 1977 Gomerella gaminicola M1,003 SEC D. 
NO; NgaO5653 528,571.4 394,60665 nudix (nucleoside diphosphate linked noiety x)-type notif isoform crata SEED 80: NgaO280 93.3 914.63097 protein SEED NO: Nga21290, 2242.385 1954.699 transmembrane protein 222 SEOD NO: NgaO2806 54.5Ga5 472,86134 ud-n-acetylglucosamine transporter SEOD NO; 39 Ngao280 AC,8Saf 492.5474 zinc finger nyid domain-containing protein 30 SECD NO: AC NgaO2803.01. 532,258 2007.3153 x-pro dipeptidyl-peptidase domain-containing protein

FIGURE 24R
U.S. Patent Apr. 29, 2014 Sheet 79 of 198 US 8,709,766 B2

Ngarrodel +N rpkb -Nrpkb GO SEQED NO: 1141 NgaO2304 1168.84 1122.6378 protein SEC O NO: 1142 NgaO2808.O. 152.068 125, 1927 cysteine synthase SEC E NO: 143 NgaO2809 448.9572 575.50519 vesicle-associated membrane protein 4 SEQED NO: 1144 Nga20044.1 243,6364. 319,97064 protein SEO D No: 1145 Nga20890 399.4828 462.6409 iws C-terminus family protein SEC EO NO: 1146 Nga20954 138.5224. 15.7543 atp-dependent rra SEQED NO: 1147 Nga2O617 859.4164 959.68206 u5 small nuclear ribonucleoprotein 40 kda protein SEO ED NO: 1148 Nga2O41 557.6756 A.05.36042 bzip transcription factor SEC E NO: 1.49 Nga20204 170.0913 210.21663 deah (asp-gu-aa-his) box polypeptide 6 SEQED NO: 1150 Nga2C197 213.3373 198.23889 pre-infra-suicing factor atp-degendent na helicase dhx16 SEO ED NO: 1151. Nga20933 2507,644 2549.7885 pas pac sensor hybrid histidine kinase SEO Ed NO: 1152 Nga21085 1820.79 63.79.915 f-type h-atpase beta subunit SECRE) NC: 1153 NgaziG38 685.5576 695.0916 becEin-i-like protein SEO ED NO; 1154 Nga20049 1594,843 2070.0444 transmembrane and coiled-coidomains 4 SEO Ed NO: 155 Nga20078 848.3111 74.66093 violaxanthin de-epoxidase SEQ 3) NO: 1156 Nga2O799.1 224.765. 242.0209 competence-like protein SEO NO: 57 ga2.4 410.8619. 358.09386 protein SEC EO NO; 1.58 Nga20812 553.92.47 622.5096 phd zinc finger-containing protein SECB NO: 1159 Nga20906 327.7512 3O8.38.478 ---MA--- SEO Ed NO: 1160 Ngag5829.1. 946.2368 938.39763 nyb-like dna-bindig domain containing protein SEC EO NO; 161 NgaOS825 425.2593 386.628? ---NA--- SEC) NC: 1162 NgaO5826.1. 245.4268 257.04789 signal recognition particle receptor SEO E NO; 1163 NgaO5827 935.4868 11.6.180i beta-ketoacylsynthase SEC E MO: 1.84 Ngaio.4858.2 42.SS7 152.42417 endoribonuclease SECD NO: 1165 NgaO4357. 
C2 2.722.667 2435.8524 signal peptidase cornpiex catalytic subunit sec1a SFO NO; 1165 gaO5839 33.9807 1238.7233 at-binding cassette SEC E MO: 1.87 Nga20872.1 585.3031 720.01687 ubiquitin carboxy-termina: hydrolase 10 SEO D NO: 1168 NgaO6401 1986.053 1589.3474 pre-mina-processing factor 37 SFO NO; 11.69 gaO5400 283.3333. 252.75459 predicted protein Phaeodactylum tricornutum CCAP 1055/13 SEQEE NO: 1170 NgaC1385.02 3Q82.192 12313874 - NA SEO D NO: 1171. NgaO6404 977.8963 1145.9822 tryptophany-trina synthetase SEO NO: 1172 ga064.06.1 47.46489 4117.57 fatty-acy SEQED NO: 1173 NgaO1884.2 524.6727 603.25.13 S-adenosyl-l-methionine-dependent methyltransferase domain-containing protein SEO ED NO: 1174 NgaOS405 63.242 1.87.34012 non-ribosomal peptide synthase SEC E NO: 1175 Nga03329 200 248.3512 peroxin 3 SEC is NC: 1176 NgaO3326 1255.692 1327.3252 sery-trina synthetase SEO ED NO; 1177 NigaO3321 4297,498 2.343.SSS7 .NEA--- SECE NO: 1178 NgaO3320 98.4.2276 102.7797 conserved unknown protein Ectocarpus siliculosus SEO El No: 1179 NgaO3318 475.5678 547.95385 protein SEO EE NO; 1130 NgaC3330 6.382 588.52888 -...--NA--- SEC E NO; 1181 NgaO3325 591.3367 502,5616 thioredoxin domain containing 9 SEC is NO: 1132 Nga33.22.0: 2959.643. 2797.94.96 ubiquitin-conjugating enzyme SEO EE NO: 1183 NgaO3323 1413502 137.40927 dra binding protein SECRE NO: 184 Ngao.3317.1 646,7331 730.55312 spicing factor subunit 49kda SEO HD No. 
1185 NgaO3319 498.7076 494.95297 protein SEO ENO; 1186 NgaO3327 50.05688 65.34447 uncharacterized protein SEC E NO; 1187 NgaO3324 371,7949 443.62615 S-adenosylmethionine mitochondria carrier protein SEO NO: 1188 NgaO3323 132.357 .457.2848 radicalsam cfr family SEO EO NO: 1189 NgaO5176 1786.078 1919.308 atp-dependent clip protease proteolytic subunit SECRE NO: 1130 NgaO51.77 327.8167 435.08633 serine hydroxyrethyltransferase SEO D NO: 1191, Nga2.108 173.3615 60,30946 protein arginine n SEO EE NO: 1192 Nga2O154 198.3095 174.6697; protein arginine -methyltransferase 5 SECRE NC: 1193 NgaOS 17 359.7734. 452.11464 methionyl-tria formyltransferase SEO D NO: 1194 Nga05181 362.5378 356.7345 anaphase-promoting complex subunit 11 SEOE NO: 1195 NgaO517s S82.3666 676.07872 conserved unknown protein Ectocarpus siliculosus SECE NO: 1196 NgaOS 1.78 230.93.68 233.61585 voltage-gated ion channel superfamily SEC FD NO: 1197 NgaO5280 322.4044: 888.99.29 ---NA--- SEO ED NO: 1198 NgaOO607 3.18.23 1343,6321 ferredoxin-dependent glutamate synthase SECE NO: 1199 NgaOO603 1580.412 1756,998 mitochondria import inner membrane translocase subunit tim23 SECR DNC: 1290 NgaO3608 747.0862 62.29373 subfamily member 9 SEO Ed NO: 1201 NgaOO67.C. 240.0794. 208.47955 hexose-6-phosphate dehydrogenase glucose 3-dehydrogenase) SEO El No: 1202 Nga21227.1 293.5636 255.07864 hexose-6-phosphate dehydrogenase glucose 3-dehydrogenase) SECR DNC: 1293 NgaOO609 563.702S 540.7720i folate biopterin transporter EGEJRE 24 S U.S. Patent Apr. 29, 2014 Sheet 80 Of 198 US 8,709,766 B2

Nga model +N rpkb -N rpkb GO SEC D. No: 1204 NgaOO601.0: 1353,053 973.7079 protein SEO ED NO: 12C5 NigaOO613 434.903 414.8943 hypothetical protein AURANERAF66563 Aureococcusanophagefferens SECRED NO: 1208 NgaOQ627 1028,249 389.2322 ---NA--- SEQ -D No: 1207 Nga00602 2057.908 1588.8055 short-chain dehydrogenase reductase acting with naco nadp as acceptor SEQ OMO: 1208 Ng320886 612.3596 702.88494 n-acetylglucosamiyiphosphatidylinosito de-in-acetylase family protein SEC D NO: 1209 NgaOO612 2266,667 2563.6537 at-dependent cp protease proteolytic subunit SEC D. No: 1230 Nga2O353 2054.28 893.27225 in-acetylglucosaminylphosphatidylinositode-n-acetylase family protein SEO EDNO: 1211. Nga2O647 79.096OS 73.439589 hypothetical protein AURANDRAF66583 Aureococcusainophagefferens} SEQ}D No. 1212 Nga21,169 19.469 119.82676 hypothetical protein AURANRAF 69090 Aureococcusanophagefferens SEO ED NO: 1213 Nga2O6i A 283.7838 204.93635 hypothetical protein AURANERAF66563 Aureococcusanophagefferens; SEC 3 NO; 24 Nga00653 1924,554 i907.982 proteasotile component do fair protein SEQ}D No: 1235 Nga00604 95.2471 396.92945 glutathione peroxidase SECR D No. 1236 NgaOO65 258.2799 247,373.4 phosphatidylinositokinase SEO ED NO: 1217 NgaOO63. 668.631 6815.069 exo-beta- -glucanase SEC D. NO: 1218 Nga.00620 2981.68S 3436.1926 hypothetical protein DFA 07107 Dictyostelium fasciculatum SEC 3 NO: 1219 Nga006.8 998.386 2S6, ---NA--- SEQ}D No: 1220 Nga00600 2286.137 1933.5212 fructose--bisphosphatase SECR DNC: 1221 NgaOO625 457.0528 463,5145 protein kinase SEO ED NO 1222 NgaOO593 825.7669 688.48489 305 ribosomal protein 56 SEO ED NO: 1223 NgaOOS99 1840.718 3780.6835 dina binding protein SEC 3D NO: 1224 Nga00623 1570,336 3.49864 -m-NA--- SEQ -D No: 1225 Nga006.8.1. 305.6332 288.66042 conserved unknown protein putative Albugo aibachii Nc34 SEC D. No. 
1226 NgaOO606 375.305s 354,8985 ---NA--- SEO ED NO 1227 NgaOO633 230, 1943 255.83104 protein SEC :D NO: 1228 NgaOO60 2349.867 2328.3085 protein SEC 3 NO: 1229 NgaOO622 286.2903 323,223O3 at-binding cassette superfamily SEQ}D No: 1230 Nga00624 1312.399 1465.24A furnaryacetoacetase SECR D No. 1231 NgaOO64 438.2494 553,04788 atp-binding cassette superfamily SEO ED NO 1232 NgaOO621. 1819.741 1482.8738 fumar’yacetoacetate hydrolase SEC D NO: 1233 NgaOO626 938,603 88.6SOS3 -...-NA--- SEC D NO: 1234 Nga20907.1, 373.1988 415.18765 butyrylchoinesterase precursor SEO 3D NO: 1235 NgaO5927 105.25 933479 adose SEC D No. 1236 Nga.05926 303.046 328.7OS4 is rances SEO DNO: 1237 Nga2O285 23,6752 398,12997 e3 ubiquitin-protei Eigase ubr2 SEC D NO: 1238 Nga21181 1.94 69.63482.--NA--- SEC NO: 1239 Nga21105 338.1309 410.0570 protein SEO 3D NO: 1240 Nga2CO23 f$7.961.7 73.61.7843 cell cycle switch protein SECR D NO.1241. Nga21,053 47.24409 85.294.011 protein SEO DNO: 1242. NgaO5925 978.97 98.488. ---NA--- SEO ED NO: 1243 Nga2O75A 269,911.5 396.51589 golgi transport complex subunit cog6 SEC D NO: 1244. Nga2O347 134,6154 i56.23566 component of oligomeric golgi complex 6 SEO 3D NO: 1245 Nga2C223 22E.8845 238.70657 component of oligomeric golgi complex 6 SEC D. No. 1246 Nga21133 305.2109 253.41669 conserved unknown protein Ectocarpus siliculosus SEO O NO: 1247 NgaO5930 325,5237 347,7482 protein SEC D NO: 1248 Nga.05924 2836.425 3850.6866 phosphoinos to transporter SEC NO: 1249 NgaOS928 358.41 i484,0636 chaperonin containing to theta subunit SEO 3D NO: 1250 NgaO5929 472.4221. 488.36446 pre-mrna-splicing factor cw.c25-like protein SEQ D No. 1251 Ng305229.2 755.841 843,21832 glycosyl hydrolase family 81 protein SEO DNO: 1252. Nga2O860, 1766,637 2124.0.4 isowatery-dehydrogenase SEO ED NO: 1253 Nga.05995.1 1085.29 843,36289 peptidy-prolyi cis-trans isomerase SEC NO: 1254 Nga2O235.1 36 38.996422 protein SEO 3D NO: 1255 NgaO60.05 343.5115 223.261.96 mekha domain protein SECR D No. 12S6 Nga2O456. 
69,2934 517543 ribokinase SEQ ID NO: 1257 Nga05996 576.412 509,22792 cytosolic phosphoglucose isomerase SEQ ID NO: 1258 Nga05999 807,0761. 93.93589 protein SEQ ID NO: 1259 Nga06000 155.6225 174.01.348 protein SEQ ID NO: 1260 Nga05998 10922.08 .2835,353 nadp-dependent glyceraldehyde-3-phosphate dehydrogenase FIGURE 24 T U.S. Patent Apr. 29, 2014 Sheet 81 of 198 US 8,709,766 B2

i Nga node +N rpkb -N rpkh GO SEO ED NO; 1262. Nga,20596. 62.055 100.76595 vacuolar protein sorting-associated SEQED NO: 1262 Nga05973 1132.97. 951.79553 adenylate kinase SEO ED NO: 1263 Nga20598 1377,953 .264,0572 adenylate kinase 3 SEQED NO: 1264 Nga208.84 S31.3653 636.S.4983 hsp70-binding protein 1-like SEOD NO: 1265 Nga2O777 426,1662 524.42515 vacuolar protein sorting-associated SEO ED NO; 1266 NgaOS970 704,0404 827.3.9633 conserved unknown protein Ectocarpus siliculosus SEQED NO: 1267 NgaO5987 1652,556 1988.2363 sorting nexin-29-ike SEQED NO: 1268 NgaOS972 2S3 45.597 ---NA SEO ED NO: 1269 Nga2O553 443.4524 399.7649 cathepsin a SEQED NO: 1270 NgaO5971. 500,8013 481.72663 protein SEO ED NO: 1271. NgaO2181.2 494,4348 529.25369 vesicle transport through interaction with t-shares homologia SEQED NO: 1272 NgaO5969 200.3968 200.59388 deoxyribodipyrimidine photo-lyase SEC, D MO: 1273 NgaOS855 67.963.17 1865,035 pdx1 c-termina: inhibiting factor . SEO ED NO: 1274 Nga2O675 3.358 SO9534 ---A--- SECRED NO: 1275 NgaO5841 17633.25 17364,735 iron-sulfur custer scaffold homolog coi} SEO ED NO: 1276 Nga21174 A643983. 3.864 ---NA--- SEQED NO: 1277 NgaO5840 279.762. 1240.5608 selenoprotein h SEO DNC): 1278 Nga05842 1333.02 .450.7987 transferring glycosy SEO ED NO; 1279 Nga20135 108.7786 109.56374 phdzinc finger-containing protein SEQED NO: 1280 Nga05843 163.8955. 137.65562 activating signal cointegrator 1 cornplex subunit 3 SEO ED NO: 1281. NgaOS847 210,6383 186,635 set domain-containing protein SEQED NO: 1282 NgaO5846 3655.48 4091952 ---NA--- SEOD NO: 1283 Nga2O450 166,6667 24.3.2056 glycoside hydrolase family 31 protein SEO ED NO: 1284 NgaOS844 22,7544. 
229.56958 glycoside hydrolase farily 31 protein SEQED NO: 128S Nga.05845 739.6789 702.4875 noplete game: full-?hucleolar protein 16 SEQED NO: 1286 NgaOS219 2880.39 2944.8875 ---NA SEO ED NO: 1287 NgaO5215 44.83.48 470.07304 protein SECD NO: 1288 NgaOS216 4SS,7O 343.34.863 ---NA--- SEO ED NO: 1289 NgaOS221 211.0572 228,63515 serine threonine protein kinase SEQED NO: 1290 NgaO5209 1098.325 912,0339. ---NA-. SEC ID MO: 1291 NgaOS220 1600.624 1757.6899 mybdna binding protein transcription factor-like protein SEO ED NO: 1292 NgaO5213 65.9174724.81878 conserved unknown protein Ectocarpus siliculosus SEQED NO: 1293 NgaO5211 982.5i. 1063.4336 cad protein SEO ED NO: 1294 NgaOS214 13241.8 2013,542 inoleoyl- desaturase SEO ED NO: 1295 Nga20988 S42.725 32.42546 ---NA--- SEOD NO: 1296 NgaoS217 2O3.4588 1574. --NA--- SEO ED NO; 1297 NgaO5212 1979,6s 979,4347 ---NA--- SEQED NO: 1298 Nga05218 150.8333 187.76055 potentia inositol polyphosphate-5-phosphatase inp51p SEO ED NO: 1299 NgaOS210 1682,886 1932.63.54 glitathione reductase SEQED NO: 1300 Nga20980 2636.542 292.965 conserved unknown protein Ectocarpus siliculosus SECD NO: 1301 NgaO2763 15.65558 28.371.874 kitesin faimily-like protein SECED NO: 1302 NgaO2753 706.23.47 625,7565 to Eiday junction resolvase SEQED NO: 1303 NgaO2749 89.3976 759,92308 peptidased SECD NO: 1304 NgaO2764 38.4932 49.54418. ---NA--- SEO ED NO: 1305 NgaO2750 776.3394. 732.93443 boia-like protein SEQED NO: 1306 NgaO2752 79,4339 397.53669 rimp proma ame: fulieribosome maturation factor rimp SEO ED NO: 1307 NgaO2754 92.7726 740.35362 rna polymerase sigma factor SEQED NO: 1308 NgaO2743 747.8738 628,932.9 suppressor enhancer of in-12 protein 9 SEQD NO: 1309 Nga02751 153,9644 284.68777 cina mismatch repair protein SEO ED NO: 1310 NgaO2745 10663.19904.O.S.A. 
--A--- SECRED NO: 131 Nga2O177 462,6866 398.64686 peptidyl-proyl cis-trans isomerase cyclophiin type SEO ED NO: 1312 NgaO2755 635.0388 441.31753 protein SEQED NO: 1313 NgaO2746 639.379 613.85305 chaperonin SEO DNC): 34 NgaO2747 2937.345 2970.4793 histone family protein dia-binding protein SEO ED NO; 1315 NgaO1017 268.585 361.94337 ribosome maturation proteinsbds SEQED NO: 1316 Nga01.016 82.9457 83.6508 queuine trna-ribosyltransferase SEO ED NO: 1317 NgaO1014 6748,933 6773.2135 ipocalin protein SEQED NO: 1318 NgaO.C.19 443,4524. 498,0948 ---NA. SEQ -D NO: 1319 NgaO4018 596,5217 570.81719 conserved hypothetical protein Phytophthora infestans 30-4: SEO ED NO: 1320 NgaOG21 S2739. SS-93.579 ---FA--- SECRED NO: 1321. NgaO2C2O 449.012 46S.2272 ubiquitin proteinigage.e3a SECRED NO: 1322 NgaO3015 10992.84 913.288 glycerophosphory diester phosphodiesterase SEO ED NO: 1323 NgaO623? 77.972; 105.57836 atpbinding cassette superfamily SEOD NO: 1324 NgaOA419.02 1432.924 .395.58 is trigger factor EIGURE 24 U U.S. Patent Apr. 29, 2014 Sheet 82 of 198 US 8,709,766 B2

Riga model Nrpkh -N rpkb GO SEO O NO: 1325 NgaO6233 465,5397 65.56987 adg-ribose pyrophosphatase SEO MO: 1326 NgaO6234 400.6734 322,7833 potentia: dna binding component of sof SEC E NO: 1327 NgaO6236 22.5639 54.2944 ---NA SEO NO: 1328 Ngat6235.03 963.35 G8 921.59958 signal peptidase SEQ NO: 1329 Nga20848 105.5546 149,6839 hypothetical protein PFSGOOO45 Salpingoeca sp., AECC 50818 SEC ID NO: 1330 Nga209CS 239.304 259,0342 ring finger-ike protein SEC ID NO; 1331 NgaO6622 6324.859 6432,084 rpe repeat protein SEC ID NO 1332 NgaO6623 350,3528 2S225996 ame oxidase SEO D NO: 1333 NgaO667 S7.5 183.24708 protein SEQID NO: 1334. Ngao.66:9 63.88839 48.14373}. c2h2 finger domain-containing protein SECB NO: 1335 NgaO1978.02 1568.285. 1993.29G7 geranyl diphosphate synthase SEO O NO: 1336 Ngatsb2O 74.25543 87.085315 sulphonylurea receptor 2b SEO is NO: 1337 Nga2O618.1 38.7S 230.82. --NA--- SEC NO: 1338 NgaO1980,02 748.172 702.51574 alphabeta fold family protein SEO, ID NO: 1339 Nga(1981.02 2059.333 2352,2456 --NA--- SEC E RO: 1340 NgaO5980.0. A64.0523 445,02608 gons-like n-acetyltransferase SEO NO: 1343 Nga21036, 1 S68.5686. 455.41367 eg nine honolog 1 elegans) SEO NO: 1342. NgaO5979.03. 323.0303 344.9936 peptidase r1 membrarie alanine aminopeptidase SEC ID NO: 1343 NgaO1175 2CO 244.305.05 -...-NA--- SEQID NO: 1344 NgaO1173 1280.093 A9.545 ---NA--- SEO E NO: 1345 Ng32038. 392.63O2 352.36O72 mkiaaC609 protein SEC ID NO 1346 Nga01174. 424.8826 419.56244 protein SEC ID NO: 1347 Nga01172 289.948 39.8258 protein phosphatase SEQID NO: 1348 NigaO2657 524.1433 650,44655 methyltransfer with n-terminal ankyrin repeats SEO D NO: 1349 Nga20173 379.3478 346.363.39 se-1 suppressor of in-32-3ke 2 elegans) SEO O NO: 135C. 
Ngat?652 563.5634 562.4.4839 carboxyesterase type b SEQID NO: 1351 Ngag2659 87.458887,639 ---NA.-- SEO, ID NO: 1352 NgaO2649 279,223 233,7993 S-adenosylmethionine-dependent methyltransferase methyltransferase thiopurine S methyltransferase SEC D NO: 1353 NgaO2658 AOS2,305 3925.7627 uncharacterized oxidoreductase yesn SEC ID NO: 1354 Nga02647 248.564. 23.36.04 fibronectin type iii domain protein SEC ID NO; 1355 Nga2O787 2104.753 2365,419 -related protein kinase 2 SEQD NO: 1356 NgaO2656 186,4407 147,491.18 protein SEO, ID NO: 1357 NgaO2654 853.264 808.53.43 protein SEO E NO: 1358 NgaO265.0: 1413.223 1325.01.162c-methyl-d-erythrito -cyclodiphosphate synthase SEC D NO: 1359 NgaO2655 701.4925 59.88046 beta- -glucan-binding SEO D NO: 1360 NgaO2653 806.6784 S70.12012 heterogeneous nuclear SEQID NO: 1361 NgaO2646 3S1628 3S9.033---NA--- SEC D NO: 1362 NgaO2648 1573.273 1899.4808 sphingarine ca-hydroxylase SEO, ID NO: 1363 Nga(2650 1679,654 1795,4251 protein SEC D&O; 1364 Nga05S81 20,673 3329.7464 -...-NA--- SEO, ID NO: 1365 Nga05579 5212,963 5971.1599 kid repeat family protein SEO NO: 1366 NgaOS577 1723.338 1445.119 conserved protein SEC E NO: 1367 NgaO2O25.02 940.1709 742.98738 signageptidase complex subunit 1. SEQD NO: 1368 NgaO2O24.2 257.94.73 334.51366 histone acetyltransferase SEC ID NO: 1369 NgaO5578 1580.867. 1853.3644 peptidyl-proyl cis-trans SEC ID NO 1370 Nga21120.1 134,6154 180.2473 bromodomain phdfinger transcription factor SEC ID NO: 1371 NgaO5373 624.8042 604,97O6rra polymerase iisecond largest subunit SEOID NO: 1372 NigaO5374 452.2989 61,964.92 guanine nucleotide binding 2 SEC, D NO: 1373 NgaO5375 631.5789 6:4.4664 protein SEOD NO: 1374 NgaO5372 4386.19 12259,282 expressed unknown protein Ectocarpus siliculosus SEQE) NO: 1375 Ngag1064 580.0866 537.38879 actin related protein SEC E NO: 1376 NgaQ1063 400.2883 403.91.774 cop9 signaosome corplex subunit 6a SEO NO: 1377 NgaQ1066.03. 143.6782. 
343,1861 ---NA--- SEQ ID NO: 1378 Nga01065.01 307.4627 219.88032 ---NA--- SEQ ID NO: 1379 Nga01062 6518,284. 5480,9230S ribosomal protein 5.5 SEQ ID NO: 1380 Nga0095 141,733 132,2572. ---NA--- SEQ ID NO: 1381 Nga00950 3.20.5467621.2365 ribosomal protein 15 SEQ ID NO: 1382 Nga21308 9.52381 10.316524 dynein heavy chain SEQ ID NO: 1383 Nga00952 1712.367. 1980.7217 Eike protein SEQ ID NO: 1384 Nga00953 (2.95 A38GS ---NA--- SEQ ID NO: 1385 Nga0654 25.307. 13.55759 cyclin delta-3 SEQ ID NO: 1386 Ngagó15. 688,732 648.3873 histone 2 isoform 2 SEQ ID NO: 1387 Nga20642 234.8438 186.38033 regulator of ribonuclease activity a FIGURE 24 V U.S. Patent Apr. 29, 2014 Sheet 83 of 198 US 8,709,766 B2

Nanods iNFP Nipt. So SEC ENO: 1388 Nga2O73i 341.7722 37.21929 regulator of ribonuclease activity a SEO C NO; 1389 NgaO2495.02 994.5227 955.24621 hernesteroid binding domain-containing SEO EE NO: 1390 NgaO2497.02 3070.802 1097.487 protein SER D NO: 1391 Nga2O38 594.3971 79.07734 n-acetyltransferase subunit SEC EO NO: 1392. NgaO2499.02 9351.421927.2852 ribosomal protein 35a SEO EG NO: 1393 Nga06149 3.27.001. 101.1294 mbir-related protein SEC E NO: 1394 Niga20436.1 76.31579 51.311081 --NA SECHNO: 1395 Nga2O650.1. 4.38993 49.575924 retiral pigment epithelia membrane protein SEO EO NO; 1396 NigaO2SO5.2 27,5322 75.8G6427 nucleolar mitag domain-containing protein . SEC EO NO: 1397 NgaO6559 3.746,078 1799.0179 methyltransferase type 11 SEC 8 No: 1398 NgaO6564 3.260. 3.11.747226s proteasore non-atpase regulatory SEC EO NO; 1399 NigaO6S65 920,6939 1239,9477 purine nucleoside phosphorylase SEQEE NO: 1400 NigaOS567.01. 3.93. 4389 -...-NA--- SEC E NO: 1401 Nga.06562 291.2374. 296.93851 polyadenylate-binding protein 1-like SEO C NO: 1402. NgaOS30.02 73S.O427 O1.S8172 tar repeat-containing protein SEO ED NO: 1403 Nga07224.2 347,4954 65.8O111 protein SEQ - NO: 1404 NgaO6563 3.74.4548 5.35991 trars remos are protein 184c SEO D No. 1405 Nga?ö56. 1.35.752 903.890.57 glutamine cyclotransferase SEO EO NO: 1406 Nga03782,02 964, 1898 95.4636 subunit of proteaseone activator SEED NO: 1407 Nga20282.1 245.4545 256.93711 cigi-01 protein isoform 1 SEC E NO: 1408 Nga20207.1 175.3333 375.81902 methyltransferase-like protein 3 SEC EO NO: 1409 NigaOS604 32,7854 35.18543 predicted proteinhalassiosia pseudorana CCVPi33S SEC EO NO: 1430 NgaOS605 404,9439 335.09278 soluble nsf attachment protein receptor SEC CNO; 14:1. NgaO5607 240.73 67.56863 kinesin family member 1. SEO D No. 1412 Nga05606 5832.871. 
7229.1029 acoitate mitochondria SEO EO NO: 43.3 NigaOS60 390,946S 87.22562 cycin h SEC ENO: 1414 Nga20964.1 406,8323 363.3297 hsbp3-like protein SEO E NO: 145 Nga20264 588.6983 61.51686 fumarate hydratase SEO EO NO: 1436 Nga20993 755,597 679.04237 26S protease regulatory subunit 6a SEC a NO: 1417 Nga20136 792.8496 741.51976 26S protease regulatory subunit 6a SEO E NO: 148 NgaO4755.02 367.3966 342.62874 short-chair dehydrogenase reductase sor SEED NO: 1429 NigaOS609 333,1671. 327.40138 wo40 domain-containing protein SEC ENO: 1420 Nga2O33 58.26 30,4856 -...-NA--- SEC ENO: 1421. Nga.074 CC 93.33O2S ---NA--- SEC E NO: 1422. NgaO5403 292.6182 326.75352 zinc cchc domain containing 17 SEO EO NO: 1423 NgaOS396 3669,568 421.4024 gueupheval dehydrogenase SEC O NO: 1424 NgaO5399 1183,333 11:3.3238 d-glycerate 3-kinase SEO ED NO; 1425 Nga,05393 936.8351. 108.4128 inapseudouridylate synthase domain-containing protein SEO EO NO: 1426 NigaOS395 641,9647 653.52718 phosphatidyiositol-4-phosphate 5-kinase SEO El No: 1427 NigaOS397 1760.336 3.628.514 electron transfer alpha subunit SEC E NO: 1428 NgaO5398 3380.285 3922.8945 glucokinase SEO ED NO: 1429 Nga054G0.1 1037.525 1019,0529 proliferation-associated protein 234 SEQED NO: 1430 NgaOS401. 384.53S 3845229 ---NA--- SEC E No. 1431. NgaQ4774.02 81.19.134 33:5.7961 mosc domain protein SEO CNO; 1432 Niga20854 222,5352 36.1336 hypothetical protein CHLNCORAF 136762 Chorella variabilis SEQEE NO: 1433 NgaOS4C2 395.75:1. 266.28816 hypothetical protein WOCAERAF 105929 Voivox carterif. nagar ensis SEC. ONO: 1434 Nga,05438 589.2099 832.44274 protein SEC to NO: 1435 NgaO54.42 384.7981 398.81535 udp-n-acetylglucosamine transferase subunit ag13 homolog isoform a SEO EO NO: 1436 Nga2O620 338,639 7.2O89 ---NA--- SEC E NO: 1437 NgaO54.46 S.O.32 449.4SS ---NA... SECHNO: 1438 NgaO5444 850.385 933,29519 protein SEO C MO: 1439 NgaO5440 10219.82 8949.7022 adip-ribosylation factor SEC EO NO: 1440 NigaOS443 258.4635 373.299.08 ---NA--- SEC E No. 
1441 Nga05439 286,9234.4854. ---NA--- SEQ ID NO: 1442 Nga05445 S6 86853S ---NA--- SEQ ID NO: 1443 Nga05443 3:48.68 3235.5908 peptidase caspase catalytic subunit p2 SEQ ID NO: 1444 Nga06036.1 598,4556 58.61393 protein SEQ ID NO: 1445 Nga06034 22O8.7O6 2585.6801 carboxyl transferase SEQ ID NO: 1446 Nga06033.01 3056.995 .338.0282 gtp-binding protein SEQ ID NO: 1447 Nga06037 3.366.432 2.74.1925 26S protease regulatory subunit s10b SEQ ID NO: 1448 Nga06038 150.5532 78.39433 rho gtpase activating protein 2 SEQ ID NO: 1449 Nga06035 374,6287 57.85384 nad synthetase SEQ ID NO: 1450 Nga01543 257.8445. 255.66094 alcohol dehydrogenase SEQ ID NO: 1451 Nga01546.1 686.(405 739.20739 solute carrier family 29 (nucleoside transporters) member 2 FIGURE 24 W U.S. Patent Apr. 29, 2014 Sheet 84 of 198 US 8,709,766 B2

Nga node +Nrpkb -N rpkb GO SEC DNO: 1452. Nga2C2.42 367.2727 392.59024 phosphatidylinositokinase SEC D NO: 1453 NgaO1551. 353.7778 435.21933 phosphatidylinositokinase (pik-5) SEO D NO: 1454 NgaO1544 4987.429 4585.434 peptidy prolyi isomerasef (cyclophi in f) SEQ D NO: 145S NgaO1542 3103.521. 2735,8946 3-oxoacyl-[acyl-carrier-protein) redictase SEO ED NO: 1456 Ngao1378 564.5798 829.07434 folate-biopterin transporter famiy SEQED NO: 1457 NgaO4873.2 A269.746. 30988931. ---NA--- SECRE NO: 1458 NgaO1376.01. 265.651.4. 284.09689 flap endonuclease SECR D NO: 1459 NgaO1377.0 A21,0526 445.23901 flap endonuclease SEC D NO: 1460 NgaO1375,01 437.799 290.2445 flap endonuclease 1. SEO D NO: 1461 Nga2O676.1. 24,393SS S35Si6 ---NA-... SEO E NO: 1452 Riga,007.77 340.4255 76.69774 transcription factor SEO D NO: 1463 Ngao(776 2372,873. 2142.9331 undecaprenyl diphosphate synthase SEO ED NO: 1464 NgaOO775 838.9313 785.551.33 anthranitate phosphoribosyltransferase SEO ED NO: 1465 NgaOO778 154.3568 76,19407 ankyrin repeat-containing protein SEO ED NO: 1466 NgaOC774.01 7425.419 5973,5619 ight-harvesting protein SEO ED NO: 1467 Nga00784 254.2735 296.28911 short-chain dehydrogenase reductase scir SEO D NO: 1468 Nga2O471 391,0306 343,73633 wo repeat and himg-box dina-binding protein 1 SEQ D NO: 1459 Nga2O781.1 866.3559 77.8147 protein SEO ED NO; 1470 NgaOS503.2 1697,917 1885.0306 glycoprotease m22 family protein SEO ED NO: 1473 NgaOS893 5327.88 488.8035 beta alanine-pyruvate transaminase SEO E) NO 1472 Niga05586,O2 360.9467 303.25441 ---MA--- SEO ED NO: 1473 NgaO5392 759,6458 670.76915 3-phosphoinositide-dependent protein SEO ED NO: 1474 Nga20875.1 1243.391 935,47627 protein SEO E NO; 1475 NgaO5502.02 1425,928 1289.8508 peroxisomal-dienoyl- reductase SEQ: NO: 1476 Nga05890 201.7837 38.87635 ud-sugar transporter ust 74c (fringe connection protein) SEO ED NO: 1477 NigaOS895.1 22.284 A34,622 ---NA--- SEO ED NO: 1478 aga05889 2055.18 2218.921.8 ribosomarna smal subunit methyltransferaseg SEO 3D 
NO: 1479 Nga21035 1109.005 944,621.07 protein SEO ED NO: 430 NgaO5499.02 1387.155 1469.01:9 peptidase dimerization domain-containing protein SEC NO: 1483 NgaOSSOO.02 1136.11.1973.32252 rna (guanine-9-)-methyltransferase domain-containing SEO ED NO; 1482 NgaoS888 762.2413 842,272.26 protein SEQ D NO: 1483 Nga2:22.É. 467.1875 504.3808 ysocardiolipin acyltransferase 1 SEO ED NO: 1484 NigaO6829.2 48.85057 54.472971 protein shi honolog isoform 2 SEC D. NO: 1435 NgaO1353 398.24, 323.69 ---NA.-- SEC D NO: 1486 NgaO1351.01 1830.59 21.3.1.6756 eukaryotic translation initiation factor 3 subunit g SEO ED NO: 1487 NgaO1354 631.8052 557.13608 expressed unknown protein Ectocarpus silicudosus SEQ D NO: 1488 KigaO2827, 233.5827 222.40963 ubx domain-containing protein 7 SEC D. No: 1489 NigaO2825 2231.366 23S4.8564gtip-binding protein ypt.1 SEQED NO: 1490 Nga202,57 393.7322 75.90979 phosphatidylinositol--trisphosphate 5-phosphatase SECRE NO: 1491 Nga20958 1066.978 1097.5758 protein arginine serine-rich 45 SEO ED NO: 1492 NgaO2328 306.1752 117.40154 protein SEQ D NO; 1493 NgaO2822 927.1565 340,28499 protein SEO D NO: 1494 NigaO2323 2258.128 2138.4427 nucleoside diphosphate kinase SEQ D NO: 1495 &ga02824 353.4229 403.96287 5-enolpyruvylshikimate-3-phosphate synthase SEO D NO: 1496 Ngao3090 663,356S 496.97525 F-box protein, putative Phytophthofa infestans 30-4} SEC D NO: 1497 Nga21086 781.94.73 798.32657 protein sel-2 homolog 1 SEO E NO: 1498 Nga03087 36.23 82,675 ---MA--- SEO ED NO: 1499 NgaO3.096 608.7418 585.32253 asparty-trina synthetase SEO E) No: 1500 Nga03097 471.3764 802.96944 pre-mrna-spicing factor slui SEO D NO: 150: NigaO3985 809.1658. 809,80754 otu-like cysteine type protease SEQ D NO: 1502 NgaO3O89 3624.242 744.16 conserved unknown protein Ectocarpus siliculosus SEO ED NO; SO3 NgaO3092 117,222 1255.6426 ribosomal rina assembly protein mis3 SEO ED NO: 1504 NgaO3094 334-2199 2.99.37 a C 3 SEO E) No. 
1505 Nga03099 532.9768 714,43237 hypothetical protein Es:0081 0013 Ectocarpus siliculosus SEQ ID NO: 1506 Nga03100 844.375 75i.49355 exodeoxyribonuclease iii SEQ ID NO: 1507 Nga07097.2 7O6.486 893.8O14 gtp-binding protein SEQ ID NO: 1508 Nga03098 654.3148 680,431.39 ---NA--- SEQ ID NO: 1509 Nga03084 669.S07 630.29315 gns1 sur4 family protein SEQ ID NO: 1510 Nga03095 G02,153 933,06392 nad-dependent epimerase dehydratase SEQ ID NO: 1511 Nga03091 301.3242 418.563.03 serine threonine protein kinase SEQ ID NO: 1512 Nga03086 1336.815 975.75903 ferredoxin (2fe-2s) SEQ ID NO: 1513 Nga03088 538.6289 486.30698 cytochrome b-561 domain containing 2 SEQ ID NO: 1514 Nga05303 359.944 390.3926 conserved unknown protein Ectocarpus siliculosus SEQ ID NO: 1515 Nga0530 1665.58. 1779.094 propionyl- alpha subunit FIGURE 24 X U.S. Patent Apr. 29, 2014 Sheet 85 of 198 US 8,709,766 B2

Nga model -N rpks GO SEC O NO: S NgaO5304 266.67 212.74.324 ap-binding cassette sub-family b member mitochondria SEONC: 57 Nga20922 189,9346 3.04.4383 abc transporter SEC NC: 53 NgaOS297 2164.634. 2012.6406 mannito dehydrogenase SEC ID NO: 59 Nga05302 236,46 326.50668 rudix hydrolase SEC, D MO: 52 Ng305293 57OGA75 576.449 wite SEC NC: 52 NgaOS299 3S74.2O2 2930,692 endosoma p24a protein SEC NO S22 NgaOS300 O436.2 109G2.839 cazy family gt2 SECONO: S23 NgaO5305 S3.23.9 84.451289 e3 ubiquitin-protein ligase shorh SEOE NC: 5. Nga03.223 208.458 194.E.7s ---NA--- SECD NO 525 NgaO1227 187.579.2 194.954.65 hypothetical protein PTSGO3392 Salpingoeca sp. ACC 50818 SEC D. NO: S26 Nga 0.225 8943,338 9082,9394 vacuolar transporter chaperone SECE NC: 57 Nga03229 3633436 3039.3994 conserved unknown protein Ectocarpus Siculosus SEC NC: 528 NgaO1226 12.2 1167.7304 hypothetical membrane spanning protein SEC D NO: 529 Nga2O334 96.2482 189.13608 mya-like dna-binding domain-containing protein SEC NC: 53 Nga2O421.1. 323,8866 350.845 ---NA--- SEO DNC: 53. NgaO348.0 92.9023 666,07849 protein SEC NO: 532 NgaO439. 357.543 4:06.42397 cutlin-associated and neddylation-dissociated 1 SEC O NO: S33 iNg301.436 A13.749S 415.7498S trna 5-methylaminonethyl-2-thiouridylate)-methyltransferase SEC NO: 534 Nga0347 39.653 277.930 lysine decarboxylase domain-containing protein SEC is NO 53S NgaO1420 AS3.252 70.59325 ---NA--- SECONO: S36 Nga.05988. 2O3357 2206,452S protein SEC NC: 1537 Nga.05990 39.038 131,55878 kinesin-related protein kipa-like protein SEOD NO; 538 NgaOS989 O90.932 1097,1524 aminopeptidase in SEO NO: 539 NgaO5861 644.4159 522.34462 ---A--- SEC OMC; 54 NgaO5859 33,443 903,16633 S-adenosyl-methyltransferase SEC BNC: S4 NgaOS857 1S63.716 1448,848 protein SEO NO: Sa2 NgaO5863 1,383 18.847 ---NA--- SEQED NO: 543 .Ng805862 3278,88 9992,012S ---NA--- SEC NC: 544 NgaO5856 336.763 322,58353 udp-galactopyranose mutase SEC DNO: 545 Nga2O654 486,623. 
336.92561 protein SECONC: S46 Nga.05858 82.49 826.985OS transmembrane protein SECB NC: 147 NgaO5860 357.923 786,08452 eukaryotic translation initiation factor 3 subunit 12 SECDNC; 548 NgaO1928 72072O7 4.4573 - --NA--- SEC NO: 549 Nga2O866. 5862O7 1755.588 ---NA-. SEO NC: SS Nga.0029.3 376,3713 480,82789 ---NA--- SEC NC: 55 Nga20896.3 343.532 68.22462 - - -NA SEC DNO: 552 NgaO2O89.C2 282,918 2368,0285 rhamnose biosynthetic enzyme expressed SEC NC: SS3 Nga(2093.02 595.3309 S60.94.072 short-chain dehydrogenase SECE MO: NgaO8174 82893 464.2432 sysdicciarrie: full-protein sys1 homolog SECONO SSS NgaO2O90.2 532O2O6 83.647409 nudix hydrolase SECD NC: S56 Ng302091.02 2241.83 166.397 ---NA--- SEC NC; SS7 Nga.061.78 3588. 339,23859 abc transporter family protein SECONO 558 Nga2O908 4.73834. 14.S.232 . . .NA--- SEC O NO: SSS Nga2O662 30.3a. 97.69343 ---A-. SECONC: 56 NgaO618O 473,824 446.5239 ---NA-...- SEC NC: 53 Nga2O52. 383.0645 464.451 protein SEC D NO: 562 Nga2171 6888. 621.77329 conserved hypothetical protein Phytophthora infestans 30-4 SECD NC: 563 Ng306529 233.8889 330.988S ribosome biogenesis protein nobi SEC NC: 554. igaO8527 35414 1696,9093 eukaryotic translation initiation factor.3 SEC ENO: 56s Nga08.538 373.583 394.6.463 pyridoxarine 5-phosphate oxidase- frri?-binding SECONO: S66 Ng806526 7.14s 1240.2708 copper amine oxidase SEOE NC: 557 Nga06532 SOC.795 589.9423 ---NA--- SECONO 568 NgaO5531 33.8235 407.22213 inorganic phosphate SEC O NO: S63 Nga20089 2,669 1285,6063 protein SEOE) NC: 570 NgaO6530 34:1033 352.9418 alany-tina synthetase SEO MO: 5. NgaO6528 22S36 1408.3241 potential zinc ring finger protein SEC ID NO: 572 Nga.06533 87.6923 795.06594 protein SEC D. NC: 573 NgaO6359 a 3.4625 453.55288 conserved unknown protein Ectocarpus siliciosus SECD NC: 54 Nga20953 285.4223 259.0022 multidrug resistance protein 2 atp-binding cassette protein C) SEC NO 575 Nga2E13S 2s. 
43a;6 235.5798 atp-dependent bile acid permease SEQ ID NO: 1576 Nga20835 418.392 372.8414 abc transporter c family protein SEQ ID NO: 1577 Nga20587 62.3.4.89 159,02398 impact homolog SEQ ID NO: 1578 Nga06363 9,462 151.6324 ---NA--- SEQ ID NO: 1579 Nga06361 O32.932 891.81921. 26S proteasome regulatory subunit 7 FIGURE 24 Y U.S. Patent Apr. 29, 2014 Sheet 86 of 198 US 8,709,766 B2

Namodel Nript Neb se SEC ONO; 1530 NgaO6362 3324.299 4638,671 transation initiation factor eif-Sa SEC NO : 18 NgaOS360 30938.2 360.8242 erolaSe SEC ENO S3 NgaO6632 93,0698 94.302153 sc complex subunit SEC DNO : 583 NgaO2313,02 77.9. 633.2742 zinc fyve domain containing 23 SEQ NC: 1584 NgaO2316.2 51338 32.95.082 transducin wa-O domain-containing protein SEC N 585 NgaO6630 3.736 356.561.2 trf1 subunit of rina po?ymerase iii transcription initiation factor iiib SEC ONO : 86 Nga20132 S3.0658 198.33885 n-terminal asparagine amidohydrolase SEC O NO 158 NgaO6631. 256.853 237.64238 translation initiation factor eif-2b alpha subunit SEC NC ; 1588 Nga2026 2,2654 104,70627 gas-like protein SEC NO 1589 NgaCO325 338,39S 26S0.92 ---NA--- SEC N : 1590 NgaOO322 56.6003 S84.OS442 zinc finger (c3hc4-type ring finger family protein SEO NO : 59. NgaCO316 37.8237 340,323.73 conserved protein SEC ONC : 1592 NgaOO347 8,2452. 2.5368.476 cap family transcription factor SEC NO 1593 NgaOG319 77.39. 1479,9523 prote SEC N : 1594 NgaOO312 69.63 it.05.94.92 nuclear in factor interactor-interacting protein cleavage-specific form SEO ENO 595 NgaOO324 293.853 299.37832 ectonucleoside triphosphate diphosphohydrolase 4 SEO ONO : 96 NgaCO309 738,678 757.743 atp synthase mitochondria fi complex assembly factor 2 SEC DNC 1597 NgaOO310 7792.723 8849,7944 serine hydroxymethyltransferase SEC O NO 1593 NgaO934.1 772.2772 646.0583 protein SEC NO: 1599 NgaCO318.03. 
35.42 373,77233 expressed unknown protein Ectocarpus siliculosus SEO NC: 6OO NgaOO3.3 459,3873 569.67.562 dead death box helicase SEC NO : SO NgaGO326 58.9384 532,204.99 ---NA--- SEQ ON : OR NgaOO37 3370.263 4661,3348 aureochronel-like protein SEG NC 603 NgaOO320 5.299 1941.2968 protein SEC ONO ; 1604 MgaCO321 558.9 1453.8498 early-responsive to dehydration stress-related protein SEO O NO 60 NgaOO350 3.3O303 A9.3790s ---NA-...- SEO O6 NgaCO315 393,5428 397,941.98 see racerase SEC NO 16O7 NgaCO346 33.65074 59,376.4 kinesin motor protein SEC 1608 NgaOO323 2O3.835 2368.37 heat shock protein SEC 1809 MgaCO311 8 715,59091 fgfroncogene partner SEC DNC 50 Nga041.97.01. 880.8 2549.735 prote ir SEC ON 15. Nga2O607.1 24,629. 208.93.235 e3 ubiquitin-protein igase abra. SEQ N : 62 Nga2O308.1 25.67 386.0685 prote in SEO ENO 163 Nga20168, 528.392 655.2994 prote i SEC ONO : 34 NgaO2949 53.8889 500,78077 anaphase promoting complex subunit 1. SEC NO : 15 NgaO2948 13.75.478 9.1933834 ari-gap with coiled- ank repeat and ph domain-containing protein . SEC NO ; 16.6 NgaO2938 487.3884 454.95826 protein SEC NO : 67 NgaO2945 549.92 432,65732 deri-ike domain nember SEC ENO 1.68 NgaO2942 389.33 419.07366 phosphoribosy or nylglycinamidine Synthase SEC O NO 89 NgaO2943 22.2867 1748.346 pepticase s3 s33 satilisin kexir Sedoisi? SEQ NC : 2O NgaO2944 3888 403.22716 meta: dependent phosphohydrolase SEC N 52. NgaO2946.1 2234-2 1934.7062 dina-formamidopyrimidine glycosylasa SEC DNO ; 1622 NgaO2947 75,0988 68.5493---NA--- SEC O NO : 1623 NgaO2950 SO19225 579,90341 prote SEO NC 64 NgaO2939 24,0304 277,03683 ric kinase SEC NO 1625 NgaO2941 234,5779 240,03479 dna-directed rna polymerase isubunit pa2 SEC N 1626 NgaO2951 145.0777 51. 
SACS ---NA--- SEQ ID NO: 1627 Nga0294 23.827 195,75526 chaperone -domain containing protein SEQ ID NO: 1628 Nga06743.2 377.8 458,393 ---NA--- SEQ ID NO: 1629 Nga06739.1 392,53 1334.6489 sphingolipid delta 4 desaturase C-4 hydroxylase protein des2 SEQ ID NO: 1630 Nga06744 377.7778 349,42O5 ---NA--- SEQ ID NO: 1631 Nga06740 625.7036 463,33978 solute carrier family member b3 SEQ ID NO: 1632 Nga05741 3428O8 3151. dihydrolipoyllysine-residue acetyltransferase component 1 of pyruvate dehydrogenase complex SEQ ID NO: 1633 Nga06743.1 377,7006 453.86993 ---NA--- SEQ ID NO: 1634 Nga05738 4765. 4385.913 hypothetical protein 3OALK 09702160 Gordonia alkanivorans NBRC 16433 SEQ ID NO: 1635 Nga02678 3.4448 343.01.161 replication factor c subunit 4 SEQ ID NO: 1636 Nga02677 27.234 483,755 na helicase SEQ ID NO: 1637 Nga02671 S59.43 634.28228 nucleoprotein tor SEQ ID NO: 1638 Nga02679 S44.385 824.87975 protein SEQ ID NO: 1639 Nga02676 54.6489 593.7.63 protein SEQ ID NO: 1640 Nga02681 6,778 526,37146 conserved unknown protein Ectocarpus siliculosus SEQ ID NO: 1641 Nga02674 779,438 835.46986 conserved unknown protein Ectocarpus siliculosus

FIGURE 24 Z U.S. Patent Apr. 29, 2014 Sheet 87 of 198 US 8,709,766 B2

Nga mode +N rpka Nrpkh GO SEO NO; 1642 Nga20519 AS35,698 50848.357 ---NA. -- SECE REO; 1643 NgaO2573 527.4295 662.36495 protein SECR D NO 1644 NgaO2675 225,3993 239.83.37 phosphatidylserie receptor SEC EO NO: 1645 NgaO268C 522.646 457.69539 protein SEQE NO: 1646 NgaO2682 64.35644. 94.738282 conserved unknown protein Ectocarpus siliculosus SEC E3 NG: 1647 Nga (2672 1097.561 3355.5867 abc transporter atp-birding protein SEC D. NO; 1643 Ngates 48 1944.215 i797.1836 proteasome maturation factor umpifany protein SEO ED NO: 1649 gaO3952.02 4710.342 4386.2837 protein SEC NO: 1650 NgaO3956.02 679.941. 627.892.24 pyridoxamine 5-phosphate oxidase SECE REC: 1652. NgaO6347 1766.458 3845.5726 protein SEC to NO: 1652 NgaQ6346 1403.037 i762.0306 serine threonine-protein kinase tick3-like SEC EO NO: 1653 Nga (6344 93.2.0952 1003.0474 retino retinaldehyde reductase SEQE NO: 1654 NgaO1357; 879.9571 986.35908 --NA SEO ED NO: 1655 NgaO1358.01 440.5405 482.57584 dm3 gyrase subunit b SEC to NO: 1656 Nga21070, 576.6284 442.0925 trafficking protein particle complex subunit 4 SEC EO NO; 1657 ga01356 3829.361 3494.9845 udp-glucose 6 SEC D NO: 1658 Nga01359 132.3764. 153.7604.8 atp-binding cassette superfamily SECD NO: 1659 Nga20289 9.33489 86.25282 atp-hinding cassette sub-family g member 2 SFO i NO 1660 Nga20029.1 184,7015 80.9431.48 abo subfamily abcg SEC EO NO: 1662. NgaO2467.02 721.598: 742.59.434 50S ribosoma protein 325 general stress protein cric SECD NO: 1662 Nga01056.02 26.21723 14, 199696 ---NA--- SEO D No. 1663 Ngao1053.02. 266.0819 19.62472 ---NA--- SEO is NO: 1664 Nga (1050.02 211.8877 198.52334 lipase domain-containing protein SEC to No: 1665 Ngao.6.297 75.72016, 73.1071.47 dra rina non-specific nuclease SECO NO: 1666 NgaO6306 83.10249 68.0.14689 tata-binding protein-associated factor 72 SEOE NO: 1667 Nga.06307 82.253.08 82.06377 tata-binding protein-associated factor 372 SEO NO 1688 ga(452 379.349 365.2246 ---NA--- SEO ED NO: 1669 NgaO4156 83.9461. 
0.4082 hrhardonuclease SECD NO; 1670 Nga04151 295.841 1376.2752 serine carboxypeptidase SEO is MO: 1671 NgaQ450 698.2025. 893.5299 acetylserotonin o-methyltransferase-like SECRED NO: 1672. Nga (3826 62.83 262,3261 ---NA--- SEQED NO: 1673 NgaO3820. 594.737 774,64747 predicted protein halassiosira pseudonana CCVP1335 SECR NO; 1674 NgaO6633.2 47,843 A2i.2883 ---NA. -- SEO is NO: 1675 NgaO3828.03. 401.531.7 394,65744 cop9 compiex subunit 7a SEO ED NO 1676 Nga (387.01. 595.4495 658.23594 soluble pyridine nucleotide transhydrogenase SEQED NO: 1677 NgaO6678.2 3382.787 2908.7495 protein SEQE NO: 1678 NgaO3Q43.01. 2702.461. 2792.918 ngc34239 protein SEC E RO: 1679 NgaO3O21 74.972 9.SS61 ---NA--- SECRED NO: 1680 Nga 3020.01. 23.88.825 1948.360 golgi Snap receptor complex merr her i SECD NO: 1682. Nga20872 632.90.32 37.09354 protein kinase domain protein SEQE NO: 1682 NgaO3Q23.01. 1230.971. 1157.1554 protein SEC is 80: 1683 NgaQ3026 C96.096 155.343 ---NA--- SECRED NO:1684 NgaO3O22 2083.91. 2078.21.49 abc subfamily abcg SEQ NO: 1685 NgaO3Q27.01. 2032.492 1021.2307 protein SEQE NO; 1638 NgaO3O29 22.212 364.283S ---NA--- SEG 8.0: 1687 Nga2O536 245.531.4 261.972.2 dnaigase SECRED NO: 1683 Nga 3024 328.231.3 288.3G971 dra iigasei SEQD NO: 1639 Nga03025.01. 1242.206 1453.4038 protein SECE NO; 1690 Nga20i48.i. 242.963. 224.67C74 dra-aparinic or apyrimidinic site lyase 2 SEO E &O; 1691 NgaQ3028.01 324, 1913 305.46392 protein SEO ED NO 1692. Ngao.6.27 a 1906 53.7797 ---NA--- SEQ in NO: 1693 Nga.06126 260.8696 260.22352 nudix hydrolase SEOE NO: 1694 NgaO6124. 
Z47.967 674.76448 mitochondrial phosphate carrier protein SEQ ID NO: 1695 Nga05123.31 752.53S5 799.79139 mitochondrial protein translocase family SEQ ID NO: 1696 Ngao.625 82.3944 180.79327 cyclic nucleotide-binding domain protein SEQ ID NO: 1697 Nga20217 258.6873 313.67778 gamma complex associated protein 3 SEQ ID NO: 1698 Nga20091 158.8527 93.584 abc subfamily abcg SEQ ID NO: 1699 Nga20499 99.7919 206.2766 ---NA--- SEQ ID NO: 1700 Nga06600 4222.222 3849.702 nadh dehydrogenase SEQ ID NO: 1701 Nga21177 25.7:43 113.43835 ser thr protein phosphatase family SEQ ID NO: 1702 Nga21231.1 5:3.8811 534.79941 rna binding protein SEQ ID NO: 1703 Ngao,598 226.1905 34.837.54 adp-ribosylation factor-like protein 2 SEQ ID NO: 1704 Nga06599 2.76.8293 249.95.032 chromosome region maintenance protein 5 exportin SEQ ID NO: 1705 Nga06597 505,933 549,44586 tho complex 7 FIGURE 24 AA U.S. Patent Apr. 29, 2014 Sheet 88 of 198 US 8,709,766 B2

Nga model -N rpkb GO SEO O NO: S Nga03239. 84.49 208.3258 casein kinase SEO 3 NO: 170 Nga20497 44,6384 157,482.55 atp-binding sub-famiy c (cfer mp3 member 2 SEC O NO: O8 NgaO3243 3.3584 270.98241 multidrug-resistance like protein isoforme SEO C: 1709 NgaO3245.0. 46.313 41.3.63996 e3 ubiquitin-proteinigasernfa SEC ENO: O NgaO3242 6992 668.40862 conserved unknown protein Ectocarpus silicudosus SEO NO: 7. NgaO3238 702.9973 80.14072 transmembrane protein i5 SEC NO: 2 NgaO324: 5336,549 5478.8649 vacuolar h--atpase a subunit SEO NC: 73 Nga03244 222222 678.5257gtip-binding protein era SEO O NO: 74 NgaO3240 5486.374 51.59.6143 inadh dehydrogenase subunit 10 SEC ONEO: S Nga05902.2 20569. 1804.9896 ---NA--- SEC 3 NO: 75 NgaO5498.01 343998 2738.2545 ---NA. SEC NO: Mga.05489 2S385 042.552 ---NA.-- SEO C. : 78 NgaO5497.01 68s. 924 1441.5215 doichyl-diphosphooligosaccharide--protein glycosyltransferase subunit dad SEO O NO; 1719 NgaO5493 248.227 223,27295 unc45 family protein SEC O: 720 Nga20081. 131,985 943.394.49 pseudouridine synthase homolog 1 coli SEO ONC: 2. Nga05492 O3,874. 985.38658 rna binding protein SEO NO: 72 Nga20055.3 243.28 1900.2292 gonserved unknown protein Ectocarpus siliculosus SEO NO: 23 NgaO5495 376.63 549,07042 mitochondrial carrier domain-containing protein SEO N: F24 Nga2O3O4.3 2625.498 2593.71.95 fritochondria carrier domain-containing protein SEC O NO: 25 Nga2O76 628.723 130.44178 vacuolar protei sorting-associated protein woS3.3 SEO NC: 26 NgaO5509 95.429 239.7866 predicted protein Thalassiosira pseudonana CCMP1335 SEC NO; 727 NgaO5397.2 895.2703 834.3829 ---NA-...- SEO ONC: 1723 Nga05S08 46.99. 198.4333 Syts protein SEC NO: 729 NgaO5494.01. 967-684. 874,49586 topoisomerase 6 subunit b SEO ONC: 3. NgaO5496 7.3 326.02137 vacuolar protein sorting-associated protein 13 family protein SEO NO: 73. NgaO4128.02. 257.08.2S 33515.408 fructose-bisphosphate aldolase SEC O NO: 32 NgaO41.29.03 23S423 24.62.OCS ---A--- SEO NC: 733 NgaO5546.2 360,441. 
368,0452 ---NA--- SE O NO: 34 NgaO42.33 48.532.5 769,049 ---NA--- SEO NO; 1735 Nga04132 41,623 479.101.43 nono-ordiacylglycero acyltransferase type 2 SEC DNO: 35 NgaO41.30 92.28O 23:08.991 nitrate transporter SEO NO: 1737 Nga20404 323.93 207.22736 dra topoisomerase SEC NO: 38 Nga2O335 26S, 3824 145,28079 dra topoisomerase SE NO: 39 NgaO3611 292.6S5 1408.2041 serine threonine protein kinase SEO NO: 7) NgaO3610 1274.63 1592.0304 at p-binding cassette sub-family g member 2 SEC NO: A1 NgaO3608 248.68 29.38482 protein SEO R NO; 742 NgaO3609 413,666 483,77963 protein SEC D NO: 743 Nga21043 54 353.73855 ser thrkase SEO NC; 744 Nga20895 SOO 362,34938ser thrkinase SEC NO; AS Nga2CO 588,337 465.724.6 ser th kase SEO ONC: 46 NgaO2SO5.i. 27.5322 75,806427 nucleolar riflg domain-containing protein 1. SEC NO: af NgaO2503.02 A341S 5050.3668 dihydrolipoy dehydrogenase SEO NO: 48 NgaO2498 49.42.38 4534.3062 histone ha. SEO NO: F49 NgaO2500 O8.3 107.2083 cycin deta-3 SEC O NO: 50 NgaO2501.03. 434.15 5050.0668 dihydrolipoy dehydrogenase SEO N: S. NgaO2495.0 99.527 955.24621 hemesteroid binding domain-containing SEC NO; 752 NgaC2507 28.367 113.37729 conserved unknown protein Ectocarpus silicudosus SEO NC: 1753 NgaO2SO4 16.86 853,53815 protein SEC NO; 754. Nga02497.01. O78O2 1097.487 protein SEO NC: ASS NgaO2499.01 351,42 923.2852 ribosomal protein 35a SEO NC: 1756 NgaO2502 aO.O34 S44.968 isoform a SEO O NO: SF NgaO2496 34.54 2940.2064 nadh dehydrogenase SEO : 7S3 Nga20863 9914. 1066.1629 protein SEC O NO: 59 NgaO2505 383,7 4t G.80C19 protein SEO NO; 178 NgaO1606 SS3O888 449.98663 ués Snrna-associated sm-like protein sm3 SEC NO: 6 NgaO1610 342.892 388.45479 asparty: giutamyl-tra amidotransferase subunit a SEC NO: 1782 Nga.01536 27.025 55.02454 - --NA--- SEQ DO: f63 NgaO2.609 1903 1526.8025 protein SE? 
ID NO: 1764 Nga20814 322.1258 287.84416 probable methyltransferase bcdin3d-like
SEQ ID NO: 1765 Nga01608 2833O88 23337.319 expressed unknown protein Ectocarpus siliculosus
SEQ ID NO: 1766 NgaO607 97.522, 2.73 hoacontage
SEQ ID NO: 1767 Nga07299.2 2197.878 2503.6802 solute carrier family 35 member b3
SEQ ID NO: 1768 Nga06703.3 92.343 104.6472 uncharacterized protein
FIGURE 24 AB U.S. Patent Apr. 29, 2014 Sheet 89 of 198 US 8,709,766 B2

Nga model +N rpkb -Nrpkb GO SEC NC: 1789 NgaO6699 16S-29 901.851.37 mitochondria ribosoma protein 11 SEO NO; 1770 NgaO6697 3.23.333 330.4.357 ---NA--- SEO MO: 771 NgaO6700 64,6465 773.73853 endonuclease SEQED NO; Nga2O71 225,2986 184,06744 fibronectin-binding a domain-containing protein SEC NC: AJ3 Nga2O386 2SO 273.12309 fibrojectin-birding a domain-containing protein SECR DNC; 7A. NgaO6702 2292 1808.2785 protein SEOE NO: 1775 NgaO670 342.5358 258.03948 proteasane subunit beta type-6 SECE NO; 1776 Nga2O669 29. 90SS855 ---MA--- SECENO 1777 Nga0383 36.48 331.1275 coq family protein SEOD NO; 1778 Nga03838 466,469 464.9416i sauaiene synthase SEO EO NO: 1779 Nga2O34 234.58 1354,0424 wacuolar sorting protein 9 domain-containing protein SEC) NC: 1780 NgaO1200 633,88 836.22255 potentiairn mp coffplex component SEC NO; 181 Nga0.398.01. 54.66 22404 ---A- SEOD NO: 1782 NgaO 97.01. 297,435 2518,5728 peptide deformylase SEOE NO: 1783 NgaOt199 O929 1382.4944 pas-domain protein SECD NO; 1784 Nga2026 578.9474 S27.36389 na methylase SEO DNC; 1785 NgaOS0. 28,909 351.05917 cox11 cytochrome c oxidase assembly protein SOE NO: i88 NgaOiSOO.G1 38.903 33s,344.94 ---NA-- SEO) NO: 18 Nga(4084.2 139,663 262.367 pyridoxakinase SECD NO: 1788 NgaO5539.3 4.815 348.0980.5 cysteinyl-tra synthetase SECR DNC; 1789 Nga0.5544 466,667 398.4824 protein SEO EO NO: 1793 Nga0436.2 6,38 235.407 protein SEC NC: 1791 Nga05542 53S36 S819,536 transmembrane bax inhibitor motif-containing protein 4 SEO EDMO: 179 Nga043.34.02 36.563 560.23029 activating transcription factor SEN: 93 NgaOSS43 83.800 644,234.92 protein SECEO NO: 1794. Nga20225 377.070; 329.79989 inaguanylyltransferase and 5-phosphatase SEC NC: 1795 NgaO5545 58.609 207.85532 homolog of cois closely related to ydip SEO NO: 796 Nga05,538 6.391. 690.88149 conserved hypothetical protein Phytophthora infestans 30-4 SEOD NO: 1797 NgaO5541 414.90 342.48826 6-phosphofructo-2- isoform a SEO ED NO; 98. 
NgaO1218 1S24.8 1274,4608 abc transporterg family member 7 SECD NO; 1799 Nga2O368.3. 223.30. 39.3237 transthyretin family protein SECONO; 8O Nga03.22.O. 12.069 93.382.236 ---NA--- SEC NC: 801 NgaO1220.1 4,845 85.67239 ---NA SECE NO; 18O2 NgaO129 90.5263 41.39054 ---NA--- SEC E MO: 1833 Nga06334 987.3979 053,839 ---NA--- SEOD NO; 130A. Nga06133.3 325.7373 374-6305 dra replication complex gins protein psf2 SEOE NO: 305 NgaO6137 35399. 366,6379 mitochondrial carrier SEC NC: 1806 NgaO635 17646 1978.8983 3-ydroxyacyl-dehydrogenase SECN: 18O NgaO6342 52.868. 488.801.55 transporter belonging to the ims superfamily SEOD NO: 308 Nga21238 28.8989 28.03823 ---A--- SECEO: 1809 NgaO641 S43.352 53.02096 major facilitator superfamity protein SECD NO: 1810 Nga06.36 319.3S 3234.3223 fumarate hydratase SEC NO; 1811 Nga2203. 522.309 735.58.156 protein SEO EO NO: 1832 Nga06:38 7934A4 1958,833 ---NA--- SEO NO: 1833 NgaO4845.02 295.9223 316.76.705 radph dependent diflavin oxidoreductase. SECD NO 1814 Nga04844.02 24.888 28.0347 salicylate hydroxylase SEOD NO; 1815 NgaO6733 3534 ---NA--- SEO NO: 1836 Nga06779 A436.236 A418,5799 protein--isoaspartated-aspartate o-methyltransferase SECENO: 1817 Nga2O295.3 S697413 470.33555 jmic domain-containing protein 4-like SEOD NO; 88 Nga06780 90.6S 1027.3607 protein fucu homolog SECONO; 183 Nga04328.02 44.37A3 A38.89967 ---NA--- SEOE NO: 1828 NgaO4329.02 8.884 62.6994. ---NA--- SEC NC: 1821 Ng306781. 6653.82 8686,5077 ribosoma protein SEOR NO; 1822 Nga2116. 
24S.4955 351.33,912 naidh dehydrogenase SEOD NO: 823 NgaO6789 463.97 A50.931:49 inadh dehydrogenase SEO EFO: 182 NgaO3624 8.033S 137.36638 h-acetylglucosamine-1-phosphate transferase SECD NO; 1825 NgaO3625 3CO 241.57836 ketose-bisphosphate adolase class-it-like protein SECRO NO; 1828 NgaO2.834.02 6.4213 St.8587 ---NA--- SECE NO: 32 Nga20952 24.293S 266.76657 tetratricopeptide repeat protein 26 SECD NO: 1828 NgaOC882 2295918 2300.4092 protein SEC E MO: 1829 NgaOO884.3 3,558 3.88.8029 dra replication atp-dependent helicase dina2 SECD NO; 1830 Nga00881.01 3447.703 3984.5643 succinate fumarate mitochondria transporter SEO NO: 1831 NgaO1403 286 2660.65 ---NA--- SEOD NO: 1832 NgaO1404 372,993 268.99 --NA-- FIGURE 24 AC U.S. Patent Apr. 29, 2014 Sheet 90 Of 198 US 8,709,766 B2

Nga made Nrpkb -Nrpkh so SEO NO: 3833 Nga2O494 37.308 292,5387 short-chain dehydrogenase reductase scir SEO ED NO: 1834 NgaO1400 377.873 1625,3273 transraembrane protein 56 SEC EE MO: 1835 Nga(1402 624.1747 670.62579 hsp70 hsp90 organizing protein SEC D. No: 1836 Ng321,131 63168 05.44.755 ---NA-. SEC NO; 383 NgaO401 328,152 43.78367 ufrn-conjugating enzyme SEO ED FO: 1838 NigaOE.238 475.1Q32 618.99082 phyhc1 protein SEC NO: 1839 NgaQ1239 982.7246 037.2525 uncharacterized protein SECRE NO: 3840 Ngao.237 716.8142 752.5.207 homeobox prox 1 SEO ED NO; 1841 NgaO3808 2240.779 2098.7058 protein SEC E MO: 1842 Nga2006 63.888 598.972.77 dina primase SEC No: 1843 NgaO6320 2054.688 1970.0198 mannito i-phosphate dehydrogenase SECRENO. 3844 Nga (63.19 282.2832 34.93.61 phosphatidylinositol 3-kinase SEO EO NO: 1845 NgaO631.4 47,2277 476.16763 translation initiation factor eif-2 subunitheta SEC NO: 1846 NgaQ6326 255.2422 260.03883 sic3Oas protein SEC No. 3847 NgaO6317 CCS.623 ass82 ---NA--- SEO ED NO: 1848 Nga.06318.1 203.6082 234,51456 protein SEO Ed NO: 1849 NigaO63.5 384,4508 431.90283 bifunctional coenzyme a synthase SECR D NO: 1859 NgaO13.7 310,0437 264.89554 didehydroglucorate eductase SEQ - No. 3851 Nga (3.316 294.2933026.6982 protein SEO ED NO: 1852. Nga01315 1364.583 1351.478 methionine synthase SEO Ed NO: 1853 Nga21150, 299.8776 234.67859 protein SEC E NO: 1854 Nga Q6047 942.32 438.9882 ---NA--. SEQ - No: 855 Nga06048 615.4432 636.24893 myb-like dra-binding SEC E MO: 1856 NgaO6049 356 255.909sa ---NA--- SEC) NO: 1857 Nga2018O 73.1957 92.80337 rina polymerase ii associated protein 3 SEC DNC; 1858 Nga2O2C8 0.0939 134.5.147 inad-specific glutamate dehydrogenase encoded in antisense gene pair with dnak SEO EO NO: 1859 NgaO27316, 1963.609 1938.3593 hypothetica: pirotein WOLCAERAF 89,537 Volvox carterif. nagariensis: SECR D NO: 1860 Nga20216 536,6492 561.4668. 
ubiquitin carboxyl-terminal hydrolase and or f-box SECR D NO: 3861 NgaO2735.0, 456.218 452.26672 conserved unknown protein Ectocarpus siliculosus SEO ED NO: 1862 NgaO2729 25,399 250.22O7 isfor a SEC NO: 1863 Nga2O748 758.5677S7.02842 ubiquitin carboxyl-terminal hydrolase and or f-box SEC E NO: 3864 NgaO6813.2 3.1.6.026 1004.3523 protein SEC E NO: 1865 Nga.02725.C1 9S4.0785 CO2,0732 - - -NA SECREE 80: 1866 NigaO2723.0. 76.7956 214.2529 expressed unknown protein Ectocarpus siliculosus SEC D No: 1867 NgaO2728 3966.667 4339.0319 ht-transiocating pyrophosphatase family SEC DNC; 1868 NgaO2726 447.5806. 48.92233 sugar fermentation stimulation protein SEO EE MO: 1869 NgaO2733 439.4753 397.89631 leucy Eaminopeptidase SEQED NO: 1870 NgaO2734 408,7452 333.61958 dihydrofolate reductase-thymidylate synthase SEC D No: 1871 Nga.02727 2608.964. 2331.0574 phosphoglucomutase SEO ED NO; 1872. NgaO2724 28C.S32 48489 - - -NA... SEC E RO: 1873 Nga (2730 702.29 1590.3969 protein SECR DNC: 1874 Nga (1462 174,3827 185.55395 tubulin-specific chaperone e SEO NO: 3875 Nga (3.461 2782.575 2.91.1.295 pyruvate dehydrogenase e1 component alpha subunit SEO EDO: 1876 Nga2022 22,766 20.597.79 ---NA--- SEO NO: 1877 NgaQ1464 3.04.3478 290.82478 rad-dependent deacetylase SEC No. 3878 Ngao.465 107.784 1312.9637 conserved unknown protein Ectocarpus silicudosus SEO D No: 1879 Nga014.63 408.2251 426.72852 cytochrome p450 SEC E RO: 1880 Nga (1467 328,8293. 33.8636 u3 small nucleolar ribonucleoprotein protein mpp.O SEC : NO: 2881 Nga20838 43,0473 207.68983 u3 granucleolar ribonucleoprotein protein mpp.10 SEO NO: 3882. Nga (4006 3.40.94.32 164.28468 breast cancer type 2 susceptibility protein homolog SEC NO: 1883 NgaO4004 894,9275 772,19666 peptide chain release factor 1. SEQED NO: 1884. NgaQ4005.03 iO76.387 3.047.7632 protease required for anti-signa degradation SEC E NO: 3885 Nga (4140.2 262.3336 318.947S uncharacterized membrane protein SEQD No. 
1886 NgaO3645 301.9432 305.83532 protein SEC ENO: 1887 NgaO3643 1371.827 .351.9671 protein SECR DNQ: 2883 Nga29563 73,7405756.60997 protein SERENO. 3889 Nga (33644 619.8663 707,646 protein SEO ED NO: 1890 Nga00807 282,4742 304.86893 uroporphyrinogen-ii synthase SEC D. KO: 1891 Nga00305 3169.899 3540.469 stafite ferredoxin dependent SECR D NO: 3892. NgaCO303 4237.933 4515,033: rnp-1}ike rna-binding protein SECR DNC; 1893 Nga21102 2.78.94 937.537 expressed unknown protein Ectocarpus siliculosus SEO ED MO: 1894 Nga00804. 46.439 91.143 ---NA--- SECEO NO: 1895 NgaO0806 73.21773 139.8394S conserved hypothetical protein Ricinus communis FIGURE 24 AD U.S. Patent Apr. 29, 2014 Sheet 91 of 198 US 8,709,766 B2

Nga node +N rpkb -Nrpkh GO SEC O NO: 1896 NgaO41.25 4.59.1492 468.47491 protein SECR D NO: 1897 NgaO43.24. 341.5233 460.44096 transcription antitermination protein fusg SEC EO NO: 1898 NgaO4123,0E 522.122 655.66041 ---NA--- SEQED NO: 1899 Nga20959 549.8688 6O2.74435 serine threonine-protein kinase eg2 SEO ED NO: 900 Nga20328 18.2689 206.27444 component of oligomeric golgi complex 8 SEO EO NO: 1901 NgaO2573 555.3838 643,46663 eukaryotic trainstation initiation factor 2a SEO ED NO: 1902 Nga20425 203.S532 1.98.05397 conserved oligomeric golgi complex subunit Swis?f-related matrix-associated actin-dependent regulator of chromatial isoform a SECR D NO: 1903 NgaO2108.2 580.3595 681.99438 isoform 3 SEQED EO; 1904 Nga2O362 230,5227 200,37844 serine threonine protein kinase SECD NO: 1905 Nga20813 444.4024. 427.7272 protein kinase SEO NO: 1906 NgaO2.09.02 1408,063. 1229,1563 guaynate kinase-1 SEQEE NO: 1907 Nga20883 385.8447 576i.24089 mitochondrial ribosoma 836 protein SEO ED NO 1908 NgaO2110.02 210.356 183,1882 phosphoinositol transporter SEQD NO: 1909 NgaO2578.1 520.91.65 520.55348 protein SEO ED NO: 1910 NgaO2580 285.4167 293,37586 protein SEC D.N.O. 1911. NgaO2574 429.4809 403.15249 eukotiene 34 hydrolase SEQD NO: 1912. NgaO2575 1068,664. 246, 1924 methyltransferase sma} SEC EO NO: 1913 NgaO2581 251.168. 293,52145 conserved unknown protein Ectocarpus siculosus SEC NO: 1914. NgaO3684 496.182 600.11632 histone acetylase complex subunit paf400 SEC ENO: 1915 NgaO34C2 508,6022 462.99.515 ketch repeat SECONO; 1916 Nga2O038 2371,219 2215.548 c2h2-type zinc finger-containing grotein SEC EO NO: 1917 Nga(3406 3095.548 2954,4882 phosphoinositol transporter SECB) NO: 1918 NgaO3401 400.8097 A26.36142 folylpolyglutamate synthase SEGEE NC: 1919. NgaO3405 283.906 321.2484 abc. protein SEO EO NO: 1920 NgaO3404 S80.S8, S476406 protein SEC is NO: 1921. Nga.03408 4850.06 A488,957 ---NA SEOD NO: 1922. NgaO3400 8735.66 7629.91.09 protein SEC EO NO; 1923 NgaO3399 6236.41. 
4757.7242 cellulase 2 SEO ED NO: 1924 NgaO3437 1545.671. 1497,943 protein SECR D NO: 1925 Nga?e481.1. 95.6985 913.32738 glutathione-regulated potassium-efflux System protein SEC EO NO: 1926 NgaO2547.02 1214.516 1151.3729 26S proteasome non-aipase regulatory subunit 2 SECR D NO 1927 NgaO264.02 352.1657 411,9577 choine ethanolaminephosphotransferase. SEQ O NO: 1928 NgaO2548.02 87.391.3 679.83089 paroxisomal membrane 22 kida (impvil pimp.22 family protein SEO EO NO; 1929 NgaO2603,02. 5491.917 5974,477 expressed unknown protein (Ectocarpus siliculosus: SEC E NO: 1930 NgaO6477 93.5636 139.12257 ribonicleoside-diphosphate reductase large subunit SEO ED NO: 1931. Nga06475 716,831.7. 589.87987 novel protein SEO ED NO: 1932 Nga.06655 S.2 (3.953964. ---NA SEQED NO: 1933 Nga06660 105.73 98.8453 ---NA--- SECR DNC 1934 NgaO6656 1025.258 1004.2361 actin binding protein SEO, ED NO; 1935 NgaO6023.2 1939,289 2664,398S protein SEC EO NO: 1936 NgaO6657 375.409 356.09.268 budi3 homolog SECONO 1937 NgaO6658 1810.726 1684,6509 protein SECR D NO: 1938 Nga2O2SO 334.2287 322.78714 polyrna polymerase SECEO NO: 1939 NgaOOO46 1486.328 1419,999 conserved unknown protein Ectocarpus siculosus SEC D NO: 194g NgaOOO49 392.5128 471.37956 drug metabolite transporter superfarily SECONO: 1941. NgaOOO42.01 832.649 913.49007 dra replication licensing factor nom9 SECEO NO: 1942. NgaGOO45 22S7.962 2228.56.41 nucleoside diphosphate kinase a SEC EO.K.O. 1943 Nga2O293 O37.52 1005.3712 inositol 5-phosphatase SEQED NO: 1944 Nga.00050 948.7399 845.3577 ---NEA... SEO ED NO: 1945 Nga.00040 1443.492. 1629.4945 peptidase SEQED NO: 1946 NgaOOO41 536.2.822 617.63932 exocyst gomplex SEQE) NO: 1947 Nga20074.1 1:08.252 1092.1733 hypothetical protein CHNCBRAF.136236 Chorella variabilis SECRE NO: 1948 NgaO004.4 4009 S841.5249 ubiquitin conjugating enzyme SEO Et NO: 1949 NgaOOO48 571.9468. 561.91942 macrophage erythroblast attacher SEC HD NO 1950 NgaOGO43.01. 365.3846. 399.81712 myosin ight chain kinase SEQED NO: 1951. 
Nga21:25 274.404.3 295.57882 tpr repeat nuclear phosphoprotein
SEQ ID NO: 1952 Nga20797 299.4393 360,80802 ctr, protein
SEQ ID NO: 1953 Nga20955 406.2827 389,0568 protein
SEQ ID NO: 1954 Nga01535 174.7212 240.27122 afg1 family
SEQ ID NO: 1955 Nga20562.1 52.468 90.04.04 ---NA---
SEQ ID NO: 1956 Nga01534.1 3648.79 2991,1035 zeaxanthin epoxidase
SEQ ID NO: 1957 Nga06644 932.4.222 952.32628 fe-s oxidoreductase
SEQ ID NO: 1958 Nga20916.1 595.982. 459,40725 protein
FIGURE 24 AE U.S. Patent Apr. 29, 2014 Sheet 92 of 198 US 8,709,766 B2

Nga model +N rpkb -N rpkb G SEO ED NO: 1959 NgaO6647 O8,085 98.58.872 to c1 consin member 3 SEOD NO: 1983 NgaO6645 549,2578 554.04273 prolycarboxypeptidase ike protein SECD NO: 1361 NgaO6646 1916.94 1977.0499 protein SEQED NO: 1962 Nga06643 33.322 78.4926 chorismate nase SEO ED NO: 1963 NgaO6642 3.342.894. 138.9378 at p-dependent metalioprotease SEOD NO: 1984 Nga2O543 638.79 481,86563 glutamyl-trina amidotransferase subunit c SEQED NO: 1965 Nga2O599 95.4023. 24.9582 ---NA--- SEO ED NO: 1966 Nga2094. 287,0873 2.77.927.53 sterol 3-beta SEO ED NO: 1967 NgaOO705 62.8 S43.423 ---NA--- SECD NO 1963 Nga(30692 526.8293 6(7.68782 exosome component 8 SEOD NO: 1969 NgaOO711 8S4,453 708.36778 -...--NA--- SEO ED NO: 1970 NgaOO707 33.38839 (5.044916 low quality protein: 60 kida lysophospholipase-like SEO ED NO: 1971 NgaOO701 38.332 90,3393 ---NA-...- SECR D No: 1972. NgaOO697 848.763 713,65492 carbamoyltrainsferase SEO ED NO: 1973 NgaOO696 23297.5, 384,629 erolase SEO ED FO: 1974 NgaOO703 319.7625 309.6728 protein SECR D NO: 1975 NgaCO706 275.2902 377,24565 ap-dependent protease a SECR D NO: 1976 NgaOO710. 880 765.48532 cytochrome c oxidase biogenesis protein cmci-ike protein SEO ED NO: 1977 NigaOO708 2069.3G7 2049,8325 cyclin-dependent kinase 2 SEO ED&O: 1978 NgaO0699.0 356.6667 92.2742 phosphatidylinositol kinase (pik-k) SEO D NO: 1979 NgaOO693 SS7.082S 589.7098: rah family gipase SEQED NO: 1980 NgaOO895 6SG,092. 
FO3.20435 pictein SEGED NEC: 1981 NgaOO700 380.7732 232.04.073 phosphatidylinositol 4-kinase SECR DNC; 1982 NgaOO698 438.317 466.0016 riboflawir kinase SECR D No: 1983 NgaOO702 32.499 2.93.42037 serine threonine-protein kinase tousied-like 3 isoform 2 SEO ED NO: 1984 NgaOO694 44.1.1.303 459.14154 retinal pigment epithelia membrane protein SEQED NO; 1985 NgaO0709 287.9652 245.42709 atpase aaa domain containing 5 SECD NO: 1986 NgaCO704 2O3].308 99.8772 adg-ribosylation factor 3 SECR DN: 1987 NgaO5395.2 38.13S59 32.12982 elegans protein partially confirmed by transcript evidence SEO ED NO: 1988 NigaOO943.01. 72.56236 58.951507 dra repair and recombination protein radS4 SEO ED NO: 1989 NgaOO932 SO4.9248 617.70728 protein SEOD NO: 1990 NgaCO935.01. 21.2.3552 201.45084 mate efflux family protein SEQED NO: 1991 NgaOS 396.2 3.579.558. 1479,2228 sperinidine synthase SEO ED NEC: 1992 NigaOO934 27757.33. 24.354.446 heat shock protein 90 SEO ED NO: 1993 NgaOO933 1057.098 1094,618 diacylglycero acyltransferase family protein SECD No: 1994 Ngao(938 537.2993 387.67296 predicted protein Phaeodactylum tricornutt in CCAP 105.5/1 SEO ED NO; 1995 Nga2O51. A88,8399 SO8.9085. ---NA SEQED NO: 1996 Nga20927 213.7405 236.43786 protein SECD NO: 1997 NgaO3963 S38.322 S99.2O3 cels face SEQD No: 1998 Nga03964 904.8132936.58588 exported nucleotide-binding protein SEO ED NO: 1999 NgaO3966 225.6569 220.99983 prefodir Suhuriit 3 SEOD NO 2000 NgaO3965 O6.703 94,096847 atp-binding cassette superfamily SEOD NO: 2001. NgaC73.10.2 99.2857. 177,95986 bronodomain containing protein SEQED NO: 2CO2 NgaO1056.01. 28.23.723. 14.199696 -NA SEO ED NO: 2003 Ngao (353.0). 266.039 91.62472. --NA SEOD NO 2004 Nga01050.0 21.18877 98.52334 lipase domain-containing protein SEOD NO: 2005 NgaOO5. SS3,303 74.7749.--NA--- SEQED NO: 2006 NgaO1054 3.369.528 1263,0545 otu domain-containing protein 7a SEQED.N.O. 2007 NigaO1952 1038.439. 2053.7669 eukaryotic translation initiation factor subunite SECD NO. 
2003 NgaC5705 194.636. 1318.i.42i isobutyry- mitochondrial SEQID No. 2009 NgaO5703 3.34.4.229 1906.8875 phosphomarnomutase SEO ED NO: 2010 NgaO5707 3.23.2068 91.4121.47 methyltransferase type 11. SECD NO: 2011. NgaC5706 438.648i 375.6:053 probable doichy pyroptosphategicians c2 alpha- -glucosyltransferase SEED NO: 2012 NgaO5704 281.818 27S.19997 guanylate-binding protein SEO ED NO: 2013 NgaOS702 335.139 3.58.49885 ankyrin repeat family protein SEC D.N.O. 2014 Nga2O864. 1480.652 138.0681 uncharacterized protein SECR DNQ: 2015 NgaO5708 4.393 4.941.58 ---NA--- SEQELY NO: 2G16 NgaO2451.Q1. FSF.247 324.33216 zinc transporter SERED NO 2017 NigaO2460 658.027 2003,4759 conserved unknown protein Ectocarpus siliculosus SECD NO. 2013 NgaO2A61 32.47 2,382 ---NA--- SECD NO: 2019 NgaO2462 34,647 33.7045 ---NA--- SEO ED NO: 2020 NgaO2459,01, 69.37799 76.0647 metalo-beta-lactamase superfamily protein SEOD NO: 2021. Nga2O655 35.4167 57.97162 gutathioes-transferase Fi. Gjiri, 24 AF U.S. Patent Apr. 29, 2014 Sheet 93 of 198 US 8,709,766 B2

Nanode Nip. NPs 69 SEO E NO: 2022 NgaO2453.01 857.3643 95.290998 protein SEO Ed NO: 2023 NgaO4049.2 838.436. 39.8033 S-3 exorionuclease SEOiD NO: 2024 NgaO2452 92.15956 95.36.0347 n-acetyltransferase mak3-ike protein SEO Es NC: 2O2S NgaC2456 85.77 42483 ---NA--- SEO Ed NO 2026 NgaO2448 5468.355 5239,6878 uqcrx acre like ubiquinol-cytochrome c reductase family protein SEO D NO: 2027 NgaO2457 215.73.2 254,57098 ubiquitin carboxyl-termina hydrolase 4 SEO ED NO: 2028 NgaO2463 3.81.6362 257.73936 rha polymerase ii elongator SEQD NO 2029 Nga20287 20,3933. 170.33654 elongator complex protein 2 SEOD NO: 2030 NgaO2.454 494.3705 487.84.333 potential 4 histone acetyltransferase complex component yng2 SEO E3 NC; 2031 Nga2O390 366,362. 378,24074 rha polymerase is elongator SEQ to NO: 2032 Nga20077 225.0242 239,983.09 wid-40 repeat-containing protein SECD NO: 2033 Nga2O372 243.1694. 236.77245 1-acyl-sn-glycerol-3-phosphate acyltransferase SEO E3 NC: 2034. Nga2O245 227.4775 24,53189 protein SEO DNG; 2035 Nga21CO 334,239. 34.45418 protein SEO Ed NO: 2036 NgaO247? 153.1823 179.95472 p-type atpase SEO. :D NC: 2037 NgaO2449 6466,234 5714,0444 citrate synthase SEQ. O NG: 2O38 NgaO2479 2O3.5928 259.45723 e1-2 atpase family protein SEO D NO: 2039 NgaO2458 311,7326 308.86207 ca-transporting atpase SEC in NC: 2040 NgaO2455 1803,379 933.8839 protein SEOiD NO: 2041 Ngai 1622 2013.298 (64.5078 mitogen-activated protein kinase kinase i-interacting protein 1 SEC D NO. 2042 NgaO1629 43,81, 8.255 ---NA--- SEO, O NO: 2043 NgaO6228 243. 297.5406Es ---NA-. SEO Es NC: 2044 FigaOS227 227.9849. 320.11289 gp-binding protein otg SEO D NO: 2045 NgaO6229 350.8647, 358.255.05 madph:adrenodoxin mitochondria SEC D NO: 2O46 NgaOO417.02 2296.524. 
2372.4847 ubiquitin domain containing protein SEOE NO: 2047 Nga06226 356.032 383 ---NA--- SECR DNC; 2048 NgaO119 777.1783 666,477Q1 serine threonine-protein phosphatase 4 catalytic subunit SECR D NO: 2049 Ngao.1.21 259.7938 415.4258 diacylglycerol acyltransferase family protein SEO. :D NC: 2050 NgaO1120 73.5499 24,3407S ---NA--- SEO E NO: 2051. NgaO378). 43.902 85,223-5 exoribonuclease SEC, D NO: 2052 Nga03772 1148.87 302,515 protein SEO D NO: 2053 NgaO3774 34.34 36.4.374 - --NAs. SEO ED NO: 2054 Nga03776 542,533 1374.0075 GrpE Atopobium rimae ATCC 49.626 SEQ to NO. 2055 NgaO3773 9570.216 13926.603 pyruvate decarboxylase SEO NC: 2056 NgaO3775.03 595.6679 43,00283 (2)-bisphosphate nucleotidase SEO E NC: 2057 NigaO2,000 481.7098: 549.40842 inh repeat containing 2 SEO ENO: 2058 NgaO1CO 338,5214, 18.2448 muscular protein 20 SEO NC: 2059 NgaOO999 1278.698 1386.3949 folate biopterin transporter SEO E3 NC: 2060 NgaOC94 2086,988. 209,861 peptidase maste24p SEC to NC: 2061 NgaC1097.1 511.3772 527,99547 conserved unknown protein Ectocarpus siliculosus SECD NO: 2062 Nga(1095 595.3421 693.77429 protein SEQE3 NC: 2063 NgaC1096.1 560.466 6.98.43 dra topoisorTerase SEO Es NG: 2064 Ngai 1098 1.21.9S12 2.0S491.53 dynein heavy chain SECR D NO: 2065 Nga01099 6.856667 4.3266055 dyneir heavy chain SEO. :D NC: 2066 NgaO4654.02 1314,455 1350.636 protein SEO Es NG: 2057 NgaO4655.02 722.3464 900,66642 ---NA--- SEO D NO. 2068 NgaO5808 743.8636 745.13365 abdi family protein SEQ: NC: 2069 Nga20416.1 3.01.204 3.89 ---NA--- SEO E NO: 2070 Nga(5810 A962.825 S5063721 protein SEC E NO: 2O71 NgaOS816 28,498 384,436 ---NA--- SEQD NO: 2072. Nga20877 320.6349 333.58728 conserved c2h2 zinc finger protein SEOiD NC: 2073 NgaO5814 651.0264. 627,38623 exocyst complex component 6 SEO D NO: 2074 NgaO5809 1872.35 35.67 ---NA--- SEC ID NO: 2075 Nga2O638 671.6438. 
659.35979 exocyst complex component 6b SEOiD NO: 2076 Nga20852 99.45387 106,79771 uncharacterized protein SEO D NO: 2077 NgaOS813 27709.13 305S9.299 glyceraldehyde 3-phosphate SEO NO: 2078 Nga21005 24332.97 18319.07 glyceraldehyde-3-phosphate dehydrogenase SEO. :D NC: 2079 NgaO5812 299.360. 333.1885 myo cina binding protein transcription factor-like protein SEO ENG: 2080 NgaOO912 18.577 115.69287 hypothetical conserved protein SEO D NO. 2081 NgaOC910 509,839 433.23.223 ---NA--- SEC DNC: 2032 NgaOO909 715.0585 638.523 oracance overexpressed i SEO E NO: 2083 NgaOO91. 64.975 68.50306 rad-dependent epimerase dehydratase SEC NO. 2084 NgaOC918 449.24O6 430.43396 protein phosphatase SEC NO: 2035 NgaO1324.3 613.4507 665,19143 upfo760 protein c2orf29-like FGURE 2A AG U.S. Patent Apr. 29, 2014 Sheet 94 of 198 US 8,709,766 B2

Nga model +N pk -N rpkh O SEOD NO : 2O86 NigaO327.0. 342.309 362.697.6 phosphopantothenoylcysteine decarboxylase SEO EO NO : 288 NgaO3.325.03. CSSO2 237,855 --NA.-- SEC3D NO : 088 Nga03.328,0i 1233,58S 1115,473 heat shock protein Oi SEO NO : 2039 NgaO323 176.222 1464,359726s proteasome non-atpase regulatory SEO ENO : 209 NgaO2326 24,504 276.4004 microtubule interacting and transport domain-containing protein SEO ED NO : 209 Nga0328,02 3.585 1115,473 heat shock protein 30 SEO Ed NEO : 292 tigaCO324 653,341. 74.7.05789 --NA--- SECRED NO 2093 NgaO0839.03. S90.078 455.592 uncharacterized protein conserved in bacteria with a cystatin-like fold SECONO 2094. NigaO0823 352.3037 334.61153 pre-mina-spicing regulatoriennae-etha: d SEO EFEO : 2095 NgaCO822 58,421. 189.7 ---NA--- SECRENO O95 NgaO088 OfSO87 1E C3.6427 monogalactosydiacylglycerol synthase SEO DNC : 2097 ENga00821. S8PSA 64-6683 - ...N.A. -- SEC EO NO : 2098 NgaO087 333,299 651.26949 fatty acid desaturase SECR DNC : 93 NigaO082O 32,69 29298839 ---NA--- SEO EO NO : 2OC fig3852.43 a.489 94.2858 ---NA--- SECRED NO O NgaO3614.02 8.89 7259.0974 protein SECD NO : 22 NigaOS238.02 22.2109 193.5924 sorbitol dehydrogenase SEC ENEO : 213 Riga (524.0 18522 1338.5336 histore deacetylase complex subisit SEED NO 204 NigaO363,02 396.282 4055.8798 calcium-dependent protein SFOD NO ; 25 fig3.05232 87,914 786.28772 multidrug oligosaccharidyl-ipid polysaccharide flippase SECRED NO ; 2106 Riga.05237 7,648 178.4907 ---NA--- SEOD NO : 20 NigaOS234 O64,363 899.02546 stage iv sporulation protein ib SEO EO NO : 28 Nga C5238.03. 22,219 2.91.5924 sorbito dehydrogenase SEC3D NO : 109 Nga0524. 18O, 95.24 263,071. --A--- SEO NO : 2113 NigaOS242 3.325.301 3258,425 ---NA.-- SEO ENO : 21. 
NgaO5236 463,6763 495.40392 tor repeat-containing protein SEO ED NO : 12 NigaO6288 35.8O25 203.94.219 protein SEO Ed NEO : 2.13 figa C6288 28.306 244.15749 autophagy related protein SEQED NO 214 NgaO0970.02 30303 2547,4792 gutaredoxin typei SECONO : 235 NigaO0973.2 2O4.509 216.8543 gutamine-fructose-6-phosphate transaminase SEO EFEO : 28 Nga C6282. 8O3.59 236C.2429 biotin and thian in synthesis associated SECRENO NigaO6287.1 265.5367 329 --NA--- SEC DNC : 23 Mga.06239 913,689. 109.7303 stragtip-birding protein SEC EO NO : 29 Nga20974 95.38 2292,042 smalgtp-binding protein SECR DNC : 2 NigaO6284 6748.98 34.83 S-cis-zeta-caroene sorrerase SEO EO NO 2. fig386.285 3.29. 1419.2279 chs donair-containing protein SECRED NO 22 NgaOO979 1a84.127 1628.2897 predicted protein Phaeodactylurn tricornutum CCAP 1055, SECD NO ; 223 NigaOO977 4860s SOO.80676 rrp 15-like protein SECRED NO ; 2124 NgaO0976 36.263 324.97.033 serine threonine-protein phosphatase 6 regulatory ankyrin repeat subunit c SFOD NO ; 25 fig32020. 5,678 150.90855 transcription factoric-garima subunit SECRED NO ; 2126 RigaO0973 422,8758 569.93581 topoisonerase G subunit b SEOD NO ; 227 Niga20277 2SC 189.6769 cyclin-dependent kinase d12 SEO EO NO ; 228 NgaO0980 528,239 64,58353 - --NA.-- SEC3D NO : 129 Nga2O246.1 1850,04 21.2G.1638 uncharacterized protein c3orf85 honolog SEO NO : 233 NgaO274 366,3342 371.511.14 protein SEO ENO : 213 NgaO3.275 OS.5 90.74446-guionolactone oxidase-like SEO ED NO 32 NigaO283 3446SO2 50.14979 cycline SEO Ed NEO : 233 tigaO1273 2766.871. 
2698.154 haca ribonucleoprotein complex subunit 3 SECRED NO ; 2134 Nga2013.6 36.609 183.1.0733 genomes uncouped 1 protein SECONO : 235 Nga20122 274.3083 248.3303 conserved unknown protein Ectocarpus siculosus SEO EFEO : 235 Nga20084 42.695 374.4.9335 conserved unknown protein Ectocarpus siliculosus SECRENO 37 NigaO3708 263.3797 27C.69436 dra repair helicase rad25 SEO DNC : 233 Mga03707 89.3239 895.2720226s protease regulatory.sutrit 7 SEC EO NO : 239 NgaOO893 3378,571. 327945. --NA--- SECR DNC : 40 Nga2009S 484.928 1463.3038 protein SEC ENO : 214 Nga.00890 780,9355 856.83869 orotidine-5-phosphate decarboxylase orotate phosphoribosyltransferase SEO O NO : 2A12 Mga2O527 588,3224 548,034 dtwdomair containing protein SEC EO NO : 243 tigaO089. 339,298 357.57237 abci family protein SECR DNC : 44 NigaOO892 48S.158 49.35904 ---NA--- SEO EO NO : 2A15 fig33227 529 OOS 634,5362 ---NA--- SEED NO as NgaO2269 376.374. 2033.5436 -threonine-o-3-phosphate decarboxylase

FIGURE 24 AH U.S. Patent Apr. 29, 2014 Sheet 95 of 198 US 8,709,766 B2

SEQ ID NO | Nga model | +N rpkb | -N rpkb | GO
SEQ ID NO: 2147 | Nga1745 | 228,673 | 2.961.039 | triosephosphate isomerase glyceraldehyde-3-phosphate dehydrogenase
SEQ ID NO: 2148 | Nga01746 | 2391 | 233.S2326 | dihydrolipoamide S-acetyltransferase
SEQ ID NO: 2149 | NgaOlaq. | 757.835 | 694,46947 | tricarboxylate transport mitochondrial precursor
SEQ ID NO: 2150 | Nga20985.1 | 1s2 | 18.9833 | uncharacterized protein
SEQ ID NO: 2151 | Nga05796.1 | 49,263 | 234,1933.5 | n-dimethylguanosine trna methyltransferase
SEQ ID NO: 2152 | Nga05798 | S359.47 | 5.31.7334 | heat shock protein
SEQ ID NO: 2153 | Nga05882 | 2O29.72 | 249.424 | dihydrolipoamide s-succinyltransferase
SEQ ID NO: 2154 | Nga05795.02 | 342O.st | 3770.0325 | dihydrolipoamide succinyltransferase
SEQ ID NO: 2155 | Nga05799 | 245,733 | 244,098.08 | cyclin-dependent kinase
SEQ ID NO: 2156 | Ngai)580.0.1 | 3.297 | 1509.5021 | electron transfer flavoprotein subunit
SEQ ID NO: 2157 | Nga05797 | 739,682 | 777,86513 | cell division cycle 2
SEQ ID NO: 2158 | Nga05795.01 | 342.67 | 3770,0325 | dihydrolipoamide succinyltransferase
SEQ ID NO: 2159 | NgaoS390.2 | 3.297 | 1509.5031 | electron-transfer-beta polypeptide
SEQ ID NO: 2160 | Nga20040 | 35.58.26 | 4257.430s | ---NA---
SEQ ID NO: 2161 | Nga00079 | 23154 | 1224,769 | dna-directed rna polymerase it subunit po4
SEQ ID NO: 2162 | Ngao(038 | 994,444 | 1567,6802 | ---NA---
SEQ ID NO: 2163 | Nga00081.03 | 453981 | 552,43489 | cyclin-dependent kinase
SEQ ID NO: 2164 | Nga00084 | 656.5C08 | 635.21905 | malate synthase a
SEQ ID NO: 2165 | Nga00071 | 384.S3s | 3974,4332 | cdgsh iron sulfur domain
SEQ ID NO: 2166 | Nga00072 | 363.641 | 3547.3567 | glucose-6-phosphate isomerase
SEQ ID NO: 2167 | Nga00083 | 620.95 | 55C.A6692 | phosphatidylinositide phosphatase sac1
SEQ ID NO: 2168 | Nga20882 | 43,103 | 855.381.28 | ap-4 complex subunit sigma
SEQ ID NO: 2169 | Nga00086 | 4356 | 4235.67 | 60S ribosomal protein .3a
SEQ ID NO: 2170 | NgaOCO37 | f7.379 | 189.991.29 | regulator of chromosome condensation-like protein
SEQ ID NO: 2171 | NgaOCO78 | 465.32 | 16839,859 | cell division protein
SEQ ID NO: 2172 | Nga00089 | 5387,294 | 4742,5398 | ---NA---
SEQ ID NO: 2173 | Nga00074 | 0.03. S64 | 987,71067 | dnaj subfamily b member 5
SEQ ID NO: 2174 | NgaOG134 | 98.875 | 1138.G.366 | cystinosin
SEQ ID NO: 2175 | Nga00082 | 2O55.558 | 1843,9425 | glycolate oxidase
SEQ ID NO: 2176 | Nga20784 | 97.221 | 1250.6423 | ---NA---
SEQ ID NO: 2177 | Nga00085 | 29,9684 | 31.3.15467 | protein
SEQ ID NO: 2178 | Nga00073 | 95,0579 | 844.83883 | prefoldin subunit 2
SEQ ID NO: 2179 | Nga00076 | 2O75.3S1 | 3335.1659 | imp dehydrogenase
SEQ ID NO: 2180 | NgaCOO75 | 42.652 | 1135.6527 | translation elongation factor p
SEQ ID NO: 2181 | Nga00080 | 523,774. | 557,67461 | dra ac-dependent
SEQ ID NO: 2182 | Nga00090 | 292.83. | 274.13688 | postreplication repair e3 ubiquitin-protein ligase rad18
SEQ ID NO: 2183 | Nga00077 | 436,302 | 469,30685 | protein
SEQ ID NO: 2184 | Nga02030 | 978 | 206.9453 | cystatin b
SEQ ID NO: 2185 | Nga02031 | 55.6989 | 479.884.28 | aldehyde dehydrogenase
SEQ ID NO: 2186 | Nga02032 | 382,826S | 299.391.33 | protein
SEQ ID NO: 2187 | Nga01783.02 | 3437.26 | 1556,7434 | tin1.0-like protein
SEQ ID NO: 2188 | Nga01781.02 | 05471 | 838,98776 | endomembrane protein 70 containing expressed
SEQ ID NO: 2189 | Nga01782.2 | 59,494 | 158,62549 | nad-dependent histone deacetylase sir2-like protein
SEQ ID NO: 2190 | Ngai)1677.1 | 53.24. | 1322.8277 | molecular chaperone superfamily
SEQ ID NO: 2191 | Nga01678.33 | ASS,299 | 824.8G835 | mitochondrial-processing peptidase subunit beta
SEQ ID NO: 2192 | Nga01679.03 | 56577 | 630.07333 | pleiotropic regulator 1
SEQ ID NO: 2193 | Nga01680 | 3S5.28 | 427.29643 | protein
SEQ ID NO: 2194 | Nga20780 | 336,82O1 | 243.99202 | protein
SEQ ID NO: 2195 | NgaOS766 | 53.521 | 274,62269 | dna glycosylase
SEQ ID NO: 2196 | Nga08759 | 27.773 | 28.75C24 | ---NA---
SEQ ID NO: 2197 | Nga06758 | 85.7388 | 837.55589 | phospholipase did hid
SEQ ID NO: 2198 | Nga20206 | 237,202 | 271.881.7 | mitogen-activated protein kinase kinase 2
SEQ ID NO: 2199 | Ngaos.760 | Os3.04. | 1989.2519 | fl10769 protein
SEQ ID NO: 2200 | Nga06757 | 6,399 | 57.7309 | cellulase 2
SEQ ID NO: 2201 | Nga01656 | 558,5488 | 618,41541 | myth domain-containing protein
SEQ ID NO: 2202 | Nga01658 | 3.96296 | 85.2S4523 | glutathione s-transferase C-terminal domain-containing
SEQ ID NO: 2203 | Nga06236 | S89, SWS9 | 543.23O38 | transmembrane protein 10
SEQ ID NO: 2204 | Nga06212 | 637,647 | 579.381,09 | uncharacterized protein c3ot2.6-like
SEQ ID NO: 2205 | Nga06213 | 31,246 | 237,93904 | arylsulfatase
SEQ ID NO: 2206 | Nga06237 | 373.374 | 368,73721 | protein
SEQ ID NO: 2207 | Nga06218 | 317.7S7 | 354.32888 | ---NA---
SEQ ID NO: 2208 | Nga06234 | 415, SS | 543,61697 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2209 | Nga06235.1 | Si6.7336 | 467.533 | conserved unknown protein Ectocarpus siliculosus
FIGURE 24 AI

SEQ ID NO | Nga model | +N rpkb | -N rpkb | GO
SEQ ID NO: 2210 | Nga20085 | 54.95 | 159.53.454 | subunit of golgi mannosyltransferase complex
SEQ ID NO: 2211 | NgaO3S72.2 | 794.279 | 844.684.77 | protein
SEQ ID NO: 2212 | Nga20635 | 66.666 | 6.638 | ---NA---
SEQ ID NO: 2213 | Nga20691 | 29.2929 | 14.4.4.3119 | kinesin family member 2b
SEQ ID NO: 2214 | NgaO3572, | F94,279 | 844.68477 | protein
SEQ ID NO: 2215 | NgaO357...i | 3O37.58 | 960.85C87 | protein
SEQ ID NO: 2216 | Nga03570.1 | 262.18 | 255.57.01.2 | integral membrane mpv17 pmp22
SEQ ID NO: 2217 | NgaC2313,0t | 77.494. | 61.3.2742 | zinc fyve domain containing 2
SEQ ID NO: 2218 | Nga02316 | S.338 | 32.9500.82 | transducin wd40 domain-containing protein
SEQ ID NO: 2219 | Nga02315 | 3352.59 | 2942.5698 | carine-like rab-type snaig protein
SEQ ID NO: 2220 | Nga02314 | 8992.326 | 6693.9366 | ---NA---
SEQ ID NO: 2221 | NgaC206 | 752.076 | A8.6438. | ---NA---
SEQ ID NO: 2222 | Nga20237 | 33.536A. | 291.9768 | protein kinase domain protein
SEQ ID NO: 2223 | Nga02068 | 76.a419 | 98.2a.699 | ---NA---
SEQ ID NO: 2224 | Nga02066 | 32.430 | 216.64.679 | acetyl-coenzyme a transporter it.
SEQ ID NO: 2225 | NgaOC87. | 874,82 | 1851.3453 | small gtp-binding protein
SEQ ID NO: 2226 | Nga00873 | 976,7442 | 1070.33 | phosphoribosylpyrophosphate synthetase
SEQ ID NO: 2227 | Nga00872 | 669,07 | 743.7787 | ---NA---
SEQ ID NO: 2228 | Nga00869 | 1942O3 | 1132.0022 | oligopeptidase b
SEQ ID NO: 2229 | NgaOC870 | S38. Of | 419.73024 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2230 | Nga00876 | 352,857. | 340.96078 | alpha- -mannosyltransferase
SEQ ID NO: 2231 | Nga00874 | A63.5368 | 469.57089 | protein
SEQ ID NO: 2232 | Nga00875 | S243.399 | 635.465 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2233 | Nga03846 | 216.3934 | 200.664.65 | aaa family atpase
SEQ ID NO: 2234 | Nga01845 | 31,455A. | 76.2848 | diphosphate fructose-6-phosphate 3-phosphotransferase
SEQ ID NO: 2235 | Nga03843 | 85.5 | 156.82343 | pyrophosphate-dependent alpha subunit
SEQ ID NO: 2236 | Mga.04.265.2 | 2.936 | 6.8432 | ---NA---
SEQ ID NO: 2237 | Nga03844 | 426.59 | 381,2827 | peroxisomal membrane protein pnp47b
SEQ ID NO: 2238 | Nga03842 | 534,586 | 738,804 | glutathione s-transferase iru 3-like
SEQ ID NO: 2239 | NgaO239 | Sesa.487 | 757.02912 | ribose-phosphate pyrophosphokinase
SEQ ID NO: 2240 | Nga02390.1 | 68.3S4. | 536.6958i | gtp cyclohydrolase ii
SEQ ID NO: 2241 | Nga02392.01 | 639.0328 | 692.22203 | cid4-specific ankyrin repeat protein
SEQ ID NO: 2242 | Nga03983.32 | 527.555 | 1594.8829 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2243 | Nga02022.1 | 988.503 | 906.68.476 | in-aaa protease afgs ytai-like protein
SEQ ID NO: 2244 | Nga20730 | 3. | 90.9965i | epsilon 1
SEQ ID NO: 2245 | Nga20703 | 95.238 | 103.16S14 | epsilon 1
SEQ ID NO: 2246 | Nga02082 | 26,063 | 134.3034 | epsilon tubulin
SEQ ID NO: 2247 | Nga02083 | 2184369 | 2097.002 | protein atypical group
SEQ ID NO: 2248 | Nga20646.3 | 37.968 | 195,62446 | ---NA---
SEQ ID NO: 2249 | Nga03645.01 | SO4,392 | 504.308.05 | lysophospholipase-like protein
SEQ ID NO: 2250 | Nga03647 | 75.6325. | 109.2338 | ---NA---
SEQ ID NO: 2251 | Nga02646 | 423.3 | 1713.422 | ioap-like protein
SEQ ID NO: 2252 | Nga05879 | 2O3.5836 | 230.56787 | cell division cycle 5-like protein
SEQ ID NO: 2253 | Nga05878 | 349,537 | 3.19898 | ---NA---
SEQ ID NO: 2254 | NgaOS877 | 219.SSS | 572.2935 | hypothetical protein Pnar PFWARO26561 Perkinsus marinus ATCC 50983
SEQ ID NO: 2255 | Nga05880 | 297,935 | 217.28586 | ---NA---
SEQ ID NO: 2256 | Nga05876 | 7.94 | 445.46946 | cell division cycle 5-like protein
SEQ ID NO: 2257 | Nga05872 | 916. | 7577.8C17 | glutathione peroxidase
SEQ ID NO: 2258 | NigaOS875 | 426.64 | 1155.3728 | nad-dependent epimerase dehydratase
SEQ ID NO: 2259 | Nga20683 | 54.34783 | 64,75855 | protein
SEQ ID NO: 2260 | Nga23034 | 386.388 | 139.43699 | hypothetical protein CaO.9.1645 Candida albicans SC5314
SEQ ID NO: 2261 | NgaOS873 | 226,4236 | 352,05103 | myotubularin-related protein
SEQ ID NO: 2262 | Nga05874 | 60.974 | 564.77822 | oc79510 protein
SEQ ID NO: 2263 | NgaC5873 | 323.37.43 | 432.7859: | lysyl-trna synthetase
SEQ ID NO: 2264 | Nga03925.01 | 104.4 | 174764 | ---NA---
SEQ ID NO: 2265 | Nga03923 | a867.703 | 4.165.322 | ribosomal proteins/
SEQ ID NO: 2266 | Nga03920.01 | 772.536 | 741.5553 | cytochrome c oxidase assembly mitochondria
SEQ ID NO: 2267 | Nga03924 | E6.703 | 36.84633 | branched-chain-amino-acid transaminase
SEQ ID NO: 2268 | Nga03922 | 3.329 | 660.36722 | protein
SEQ ID NO: 2269 | Nga03926 | 88.48 | G.718A. | ---NA---
SEQ ID NO: 2270 | Nga03923 | 2248.792 | 227,4398 | periplasmic binding protein
SEQ ID NO: 2271 | Nga02252 | 539.3508 | 669,853.96 | atp-binding cassette protein cA-3ike
SEQ ID NO: 2272 | Nga02253 | 212.212 | 3O8,3892 | ---NA---
SEQ ID NO: 2273 | Nga02248 | 2S5.914 | 186.36283 | ---NA---
FIGURE 24 AJ

SEQ ID NO | Nga model | +N rpkb | -N rpkb | GO
SEQ ID NO: 2274 | Nga02247.1 | 69S.0445 | 664.80558 | histone deacetylase superfamily protein
SEQ ID NO: 2275 | Nga02249 | 56,068 | 89,375.38 | ---NA---
SEQ ID NO: 2276 | Nga01202.02 | 703.7857 | 658.98.73 | magnesium-dependent phosphatase
SEQ ID NO: 2277 | Nga03425 | 845.9445 | 722,35252 | cdc25 protein
SEQ ID NO: 2278 | Nga03428 | 3385.37 | 3359,2379 | ---NA---
SEQ ID NO: 2279 | Nga03430.1 | 1044.76 | 980.39865 | fra domain containing protein
SEQ ID NO: 2280 | Nga03438 | 469.879S | 349,7673 | pseudouridine synthase and archaeosine transglycosylase domain-containing protein
SEQ ID NO: 2281 | Nga03432 | 897,471 | 97.57438 | 4fe-4s iron-sulfur binding protein
SEQ ID NO: 2282 | Nga20095 | 260.87 | 224.2576 | a keto reductase
SEQ ID NO: 2283 | Nga03431 | 542.2427 | 554.9236 | methyltransferase type 11
SEQ ID NO: 2284 | Nga03437 | 555.81.58 | 7 6.23665 | tom1 protein
SEQ ID NO: 2285 | Nga03426 | 426.1874. | 490.8623 | ubiquitin-conjugating enzyme e2t
SEQ ID NO: 2286 | Nga03436 | 390.7886 | 372.66876 | protein
SEQ ID NO: 2287 | Nga03434 | 400.995 | 437.50496 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2288 | Nga.0343S | 810,8298 | 737.01373 | dead deah box helicase
SEQ ID NO: 2289 | Nga20396 | 228,5966 | 237.168 | cap-specific mrna (nucleoside-2-o-)-methyltransferase 1
SEQ ID NO: 2290 | Nga03452 | 3679.8. | 2759.95 | protein
SEQ ID NO: 2291 | Nga03427 | 32704.55 | 15826,424 | ypt26Ow-like protein
SEQ ID NO: 2292 | Nga03429 | 1538.311 | 1284.8998 | hect e3 ubiquitin
SEQ ID NO: 2293 | Nga21287 | 1525 | 11.5338 | ---NA---
SEQ ID NO: 2294 | Nga20716 | 38.46154 | 53.56554 | dna helicase
SEQ ID NO: 2295 | Nga02368 | 879.2627 | 936.47323 | hypothetical protein WOCAERAFE 1039C Volvox carteri f. nagariensis
SEQ ID NO: 2296 | Nga21187 | 1138.261 | 1208.8263 | sucrose-phosphate phosphatase
SEQ ID NO: 2297 | Nga02367 | 337.5598 | 321.34233 | sulfate permease
SEQ ID NO: 2298 | Nga02366.1 | 3.34.64. | 379.77923 | ---NA---
SEQ ID NO: 2299 | Nga02244 | 6853.85 | 586.45. | ---NA---
SEQ ID NO: 2300 | Nga02243 | 740.0722 | 732.5842 | homogentisate solanesyltransferase
SEQ ID NO: 2301 | Nga21059 | 619.929S | 685.857.2 | exosome complex exonuclease rp41
SEQ ID NO: 2302 | Nga02242.01 | 502.994 | 507.88753 | drug metabolite transporter superfamily
SEQ ID NO: 2303 | NgaC41.75 | 31271.32 | 18893,283 | light-harvesting protein
SEQ ID NO: 2304 | Nga0476 | O22.435 | 1013,1865 | histone acetyltransferase
SEQ ID NO: 2305 | Nga04174 | 664.962 | 7G2.65287 | beta-galactosidase
SEQ ID NO: 2306 | Nga04173 | 1576.88 | 1531.846 | ap-2 complex subunit mu
SEQ ID NO: 2307 | NgaCO239.01 | 3.09.400 | 235.51939 | integral membrane protein gpr155-like
SEQ ID NO: 2308 | NgaCO2430. | 5108.949 | 1993.6562 | ---NA---
SEQ ID NO: 2309 | Nga00236 | 68.205 | 396.8899 | ---NA---
SEQ ID NO: 2310 | Nga00249.01 | 21.97.105 | 1293.993 | ---NA---
SEQ ID NO: 2311 | NgaCO237 | 93.9372 | 123877 | ---NA---
SEQ ID NO: 2312 | Nga00242.1 | 425.603 | 253.866 | ---NA---
SEQ ID NO: 2313 | NgaCO238.03. | 4637,013. | 3068.5203 | ---NA---
SEQ ID NO: 2314 | NgaCO246 | O9.56. | 165.23 | ---NA---
SEQ ID NO: 2315 | Nga00241 | 508.4823 | 540.56936 | propionyl- carboxylase
SEQ ID NO: 2316 | Nga00247 | 7.543 | 63.43338. | ---NA---
SEQ ID NO: 2317 | NgaCO23S | 242.978. | 299,96967 | conserved unknown protein Ectocarpus siliculosus
SEQ ID NO: 2318 | NgaCO248 | 65.999 | 49,93 | ---NA---
SEQ ID NO: 2319 | Nga00245 | OS6 | 16S-233 | ---NA---
SEQ ID NO: 2320 | Nga00240 | 979.4195 | 975.76799 | hypothetical protein CYO:10 14003 Cyanothece sp. CCY0110
SEQ ID NO: 2321 | Nga00244 | 1148.058 | 1196.2899 | carboxy-terminal protease
SEQ ID NO: 2322 | Nga06410.2 | 353.6379 | 377.57393 | nicotinate-nucleotide pyrophosphorylase
SEQ ID NO: 2323 | Nga00968.0 | 54.57227 | 59.114537 | hypothetical protein PSG (7517 Salpingoeca sp. ATCC 50318
SEQ ID NO: 2324 | NgaCO966.1 | 738.2107 | 652.13643 | mitogen-activated protein kinase
SEQ ID NO: 2325 | NgaOO9650. | 26928.68. | 20938.344 | oxygen-evolving enhancer protein
SEQ ID NO: 2326 | NgaOC964 | 2335.029 | 1750.2552 | nad-dependent epimerase dehydratase
SEQ ID NO: 2327 | NgaOO967.0i | 484.4479 | 507.03.52 | dna polymerase beta
SEQ ID NO: 2328 | Nga03051.02 | 599.3S12 | 58.5932 | sentrin-specific protease 8
SEQ ID NO: 2329 | Nga08386 | 252.877 | 252.8343 | hypothetical protein SKA58,432 Sphingomonas sp. SKA58
SEQ ID NO: 2330 | Nga21046 | 432.195i | 425.83588 | tho complex subunit 5 homolog
SEQ ID NO: 2331 | Nga20043 | 436,5869 | 522.OO729 | nucleoporin 98kd
SEQ ID NO: 2332 | Nga20534 | 252.338A: | 87-883. | gtpase activator nb4s evi5 (contains tbc domain) calmodulin-binding protein pollux (contains ptb and tbc domains)
SEQ ID NO: 2333 | Nga05196.1 | 1679.558. | 1479.2228 |
SEQ ID NO: 2334 | Nga05198 | 264,263. | 252.73866 | ---NA---
SEQ ID NO: 2335 | Nga05197 | 3.09.A. | 19.9666 | ---NA---
FIGURE 24 AK

SEQ ID NO | Nga model | +N rpkb | -N rpkb | GO
SEQ ID NO: 2336 | NgaoS194 | 5,93039 | 36.107798 | phosphatidylinositol n-acetylglucosaminyltransferase subunit
SEQ ID NO: 2337 | Nga05199 | 3.6329 | 329.02577 | protein
SEQ ID NO: 2338 | NgaOS193 | 734619 | 782.50948 | dna photolyase
SEQ ID NO: 2339 | Nga05190 | 182.795 | 1643.6429 | protein
SEQ ID NO: 2340 | FigaOS191. | 25.8536 | 2C2.286.52 | probable phospholipid-transporting atpase ia isoform 1
SEQ ID NO: 2341 | Nga05195 | 38.3559 | 32,12982 | dna repair and recombination protein rad26
SEQ ID NO: 2342 | Nga00943.02 | 2.6236 | 58.951507 | dna repair and recombination protein rad54
SEQ ID NO: 2343 | Nga00935.02 | 22.3552 | 20.45084 | mate efflux family protein
SEQ ID NO: 2344 | Nga20392 | 79.3 | i6,83032 | methyltransferase sma:
SEQ ID NO: 2345 | Nga04290 | 546,714.9 | 749,6992.1 | protein
SEQ ID NO: 2346 | Nga02441 | 562.08 | 601.86729 | atp-dependent rna helicase
SEQ ID NO: 2347 | Nga02442 | 2252 | 34.637 | ---NA---
SEQ ID NO: 2348 | Nga01803 | 97.236 | 82.4SO37 | ---NA---
SEQ ID NO: 2349 | NgaO.804 | 6029.874. | 5975.4656 | 40s ribosomal protein s23
SEQ ID NO: 2350 | Nga01805 | 39,8407 | 428,89164 | carrier protein
SEQ ID NO: 2351 | Nga01802 | 343.21. | 3678.6535 | mitochondrial carrier protein
SEQ ID NO: 2352 | Nga04343.01 | 18888.2 | 34.15 | ---NA---
SEQ ID NO: 2353 | Nga04342.01 | 57,936 | 528.1.2473 | dna-directed rna polymerases and iii kcia polypeptide
SEQ ID NO: 2354 | Nga04344 | 770.7359 | 707,25823 | trypsin 5g
SEQ ID NO: 2355 | Nga04580 | 281439 | 343.7741 | retinol dehydrogenase 14
SEQ ID NO: 2356 | Nga04618 | 62.233 | 204.67522 | ---NA---
SEQ ID NO: 2357 | Nga04617 | ORO534 | S520.98 | ---NA---
SEQ ID NO: 2358 | Nga01794.01 | 78.825 | 947 | ---NA---
SEQ ID NO: 2359 | Nga01792.0 | 814,887 | 1902.2904 | expressed unknown protein Ectocarpus siliculosus
SEQ ID NO: 2360 | Nga01793.01 | 333,8789 | 304,93656 | protein
SEQ ID NO: 2361 | Nga02791 | 2943,84. | 3333,4838 | beta-aspartyl asparaginyl family
SEQ ID NO: 2362 | Nga04213 | 256.413 | 273.7844 | sulfate permease family
SEQ ID NO: 2363 | Nga04212 | 41,0749 | 4.3297 | ---NA---
SEQ ID NO: 2364 | Nga0374. | 73.543. | 85382S9 | ---NA---
SEQ ID NO: 2365 | Nga03740 | 38.486 | 948.04774 | solute carrier family member b2
SEQ ID NO: 2366 | Nga03742 | 885.5932 | 900.94639 | radical sam cfr family
SEQ ID NO: 2367 | Nga20943 | 352.472 | 4644.2734 | protein
SEQ ID NO: 2368 | Nga20613 | 362.287 | 676.22938 | protein
SEQ ID NO: 2369 | Nga03739.0 | 723O.66 | 8437.329 | malate synthase
SEQ ID NO: 2370 | Riga.0386.5 | 38.43 | 376,081.24 | nucleolar complex protein 3 homolog
SEQ ID NO: 2371 | Nga03858 | 598.4556 | 588,04128 | cystathionine beta-lyase
SEQ ID NO: 2372 | Nga03859 | 28,2435 | i43.78.255 | atp-dependent dna helicase
SEQ ID NO: 2373 | Nga03850 | 5855 | 5448.0921 | 60S ribosomal protein 43
SEQ ID NO: 2374 | Nga03881 | 2O3,439 | 232,05756 | es2 protein
SEQ ID NO: 2375 | Nga04683 | 7.529 | 994.3037 | histidine kinase
SEQ ID NO: 2376 | NgaO4S11 | 49.465 | 158.38764 | mak16-like protein rbm13
SEQ ID NO: 2377 | NgaO4S10 | 2S,3399 | 204.57943 | smad nuclear interacting protein
SEQ ID NO: 2378 | Nga02200 | O85.37 | 1277,0594 | mitochondria ingort orf22 homolog
SEQ ID NO: 2379 | Nga02201.1 | A74.533 | 494,30996 | gpn-loop gtpase 2-like
SEQ ID NO: 2380 | Nga02199 | O6.61.36 | 822,52523 | pyruvate dehydrogenase
SEQ ID NO: 2381 | Nga02202 | 9,979 | 189,8626 | rwp-rk domain-containing protein
SEQ ID NO: 2382 | Nga06333 | 96.454 | 415.34.404 | solute carrier family 25 member 4.5-like
SEQ ID NO: 2383 | Nga2.292 | 38.327.5 | 95.SS485 | ---NA---
SEQ ID NO: 2384 | Nga21291 | 4,895. | SS,6538 | ---NA---
SEQ ID NO: 2385 | Nga06337 | 69.0939 | 96.951.63 | protein
SEQ ID NO: 2386 | Nga20480 | 58.43972 | 852.75853 | phosphoglucomutase phosphomannomutase alpha beta alpha domain i
SEQ ID NO: 2387 | Nga20323 | 143,065 | 209.841,3 | ---NA---
SEQ ID NO: 2388 | Nga2235 | 5S80977 | 820,97987 | phosphoglucomutase phosphomannomutase
SEQ ID NO: 2389 | Nga21226 | 7S2.7594 | 1162.345 | phosphoglucomutase phosphomannomutase family protein
SEQ ID NO: 2390 | Nga06335 | 58.357 | 114.18705 | snf2-related domain-containing protein
SEQ ID NO: 2391 | Nga20439 | 225,656 | 372.16963 | oligomeric golgi complex component
SEQ ID NO: 2392 | Nga20337 | 57.31.87 | 7189435 | ---NA---
SEQ ID NO: 2393 | Nga2006 | 398,2264 | 270.24875 | uncharacterized protein
SEQ ID NO: 2394 | Nga06331 | 798.92 | 7233,8804 | fatty-acyl elongase
SEQ ID NO: 2395 | Nga06336 | 98.338 | 932.04377 | fatty acid elongation protein 3
SEQ ID NO: 2396 | Nga06332 | 782,587 | 1893.2242 | rhodanese domain protein
SEQ ID NO: 2397 | Nga06334 | 58.341.5 | 553,8489 | protein
SEQ ID NO: 2398 | NigaO3803 | SC.77033 | 70.85O175 | dynein heavy chain
FIGURE 24 A