US007947478B2

(12) United States Patent
Melis

(10) Patent No.: US 7,947,478 B2
(45) Date of Patent: May 24, 2011

(54) SHORT CHAIN VOLATILE HYDROCARBON PRODUCTION USING GENETICALLY ENGINEERED MICROALGAE, CYANOBACTERIA OR BACTERIA

(75) Inventor: Anastasios Melis, El Cerrito, CA (US)

(73) Assignee: The Regents of the University of California, Oakland, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 245 days.

(21) Appl. No.: 11/770,412

(22) Filed: Jun. 28, 2007

(65) Prior Publication Data
US 2008/0038805 A1, Feb. 14, 2008

Related U.S. Application Data
(60) Provisional application No. 60/806,244, filed on Jun. 29, 2006.

(51) Int. Cl.
C12P 5/02 (2006.01)
C12N 1/21 (2006.01)
C12N 1/13 (2006.01)

(52) U.S. Cl.: 435/167; 435/252.3; 435/257.2

(58) Field of Classification Search: None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
5,849,970 A, 12/1998, Fall et al.
6,916,972 B2, 7/2005, Falco et al.
2002/0119546 A1, 8/2002, Falco et al.
2003/0041338 A1, 2/2003, Falco et al.
2003/0219798 A1, 11/2003, Gokarn et al.
2005/0183163 A1, 8/2005, Falco et al.
2005/0183164 A1, 8/2005, Falco et al.

FOREIGN PATENT DOCUMENTS
WO 98/25550 A2, 7/1997
WO 98/02550, 1/1998
WO 02/086094 A2, 10/2002
WO 2007/140339, 12/2007
WO 2008/137092 A2, 11/2008

OTHER PUBLICATIONS
Steve Davidson (ECOS Magazine. Oct.-Dec. 2003; 117: 10-12).*
Stevens et al. (J. Physiol. 1997; 33: 713-722).*
Sasaki et al. (FEBS Letters 579. 2005; 2514-2518).*
Broadgate, W.J. et al., "Isoprene and other non-methane hydrocarbons from seaweeds: a source of reactive hydrocarbons to the atmosphere"; 2004, Marine Chemistry, vol. 88, pp. 61-73.
Davidson, Steve; "Light Factories"; 2003, ECOS, vol. 117, pp. 10-12.
Logan, Barry A. et al., "Biochemistry and physiology of foliar isoprene production"; 2000, Trends in Plant Science, vol. 5, No. 11, pp. 477-481.
McKay, W.A. et al.; "Emissions of Hydrocarbons from Marine Phytoplankton: some results from controlled Laboratory Experiments"; 1996, Atmospheric Environment, vol. 30, No. 14, pp. 2583-2593.
Miller, Barbara et al., "First isolation of an isoprene synthase gene from poplar and successful expression of the gene in Escherichia coli"; 2001, Planta, vol. 213, pp. 483-487.
Sasaki, Kanako et al., "Gene expression and characterization of isoprene synthase from Populus alba"; 2005, FEBS Letters, vol. 579, pp. 2514-2518.
Sasaki, Kanako et al.; "Plants Utilize Isoprene Emission as a Thermotolerance Mechanism"; 2007, Plant Cell Physiol., vol. 48, No. 9, pp. 1254-1262.
Sharkey, Thomas D. et al.; "Evolution of the Isoprene Biosynthetic Pathway in Kudzu"; 2005, Plant Physiology, vol. 137, pp. 700-712.
GenBank accession No. AM410988, 2 pages.
Davidson, S. (Oct.-Dec. 2003). "Light Factories." ECOS, published by CSIRO, 117:10-12; also available at http://www.publish.csiro.au/?act=view_file&file_id=EC117p10.pdf, last visited on Apr. 7, 2008.
Miller, B. et al. (Jul. 2001). "First Isolation of an Isoprene Synthase Gene from Poplar and Successful Expression of the Gene in Escherichia coli," Planta, Fraunhofer Institut fur Atmospharische Umweltforschung, Germany, 213(3):483-487.
Sasaki, K. et al. "Gene Expression and Characterization of Isoprene Synthase from Populus alba," FEBS Letters, Laboratory of Plant Gene Expression, Research Institute for Sustainable Humanosphere, Kyoto University, Gokasho, Japan, 579(11):2514-2518, (2005).
Sharkey, T. D. et al. (Feb. 2005). "Evolution of the Isoprene Biosynthetic Pathway in Kudzu," Plant Physiology, Department of Botany, University of Wisconsin, Madison, Wisconsin and Protemix Corporation, University of Auckland, Auckland City, New Zealand, 137:700-712; also available at http://www.plantphysiol.org/cgi/reprint/137/2/700.pdf, last visited on Apr. 7, 2008.
Ladygina, N. et al., "A review on microbial synthesis of hydrocarbons"; 2006, Process Biochemistry, vol. 41, pp. 1001-1014.
Lindberg, P. et al., "Engineering a platform for photosynthetic isoprene production in cyanobacteria, using Synechocystis as the model organism"; 2010, Metabolic Engineering, vol. 12, pp. 70-79.
Miller, Barbara; "Erstmalige Isolierung eines Isoprensynthase-Gens und heterologe Expression des aus der Pappel stammenden Gens sowie Charakterisierung der Eingangsgene des Mevalonat-unabhangigen Isoprenoidbiosyntheseweges aus dem Cyanobakterium Synechococcus"; 2003, Internet Citation, URL: http://kups.ub.uni-koeln.de/volltexte/2003/883/pdf/millerbarbara.pdf, pp. 1-2.

* cited by examiner

Primary Examiner: Scott Long
(74) Attorney, Agent, or Firm: Kilpatrick Townsend & Stockton LLP

(57) ABSTRACT
The present invention provides methods and compositions for producing isoprene hydrocarbons from microalgae, cyanobacteria, and photosynthetic and non-photosynthetic bacteria.

8 Claims, 14 Drawing Sheets

U.S. Patent, May 24, 2011, Sheet 1 of 14, US 7,947,478 B2

Figure 1 (Sheet 1 of 14). Abbreviations used: RuBP = ribulose bis-phosphate; 3-PGA = 3-phosphoglyceric acid; GA-3-P = glyceraldehyde-3-phosphate; DMAPP = dimethylallyl-pyrophosphate.

Figure 2 (Sheet 2 of 14). Isoprene synthase: DMAPP to isoprene (C5H8), a volatile compound produced and emitted from herbaceous plants and deciduous trees. MBO synthase: DMAPP to 2-methyl-3-buten-2-ol (MBO, C5H10O), a volatile compound produced and emitted from US pines.

Figure 3 (Sheet 3 of 14). Pathway labels: GA-3-P and pyruvate; Dxs (DXP synthase); deoxy-xylulose phosphate (DXP); Dxr/IspC (DXP reductoisomerase); methyl-erythritol-P; 5 enzymes (IspD-IspH). Abbreviations used: GA-3-P = glyceraldehyde-3-phosphate; IPP = isopentenyl pyrophosphate; DMAPP = dimethylallyl-pyrophosphate.

Figure 4 (Sheet 4 of 14). Labels: spectinomycin resistant 16S rRNA.

Figure 5 (Sheet 5 of 14). Labels: Primer N; Primer C; numeric labels 27, 42, 111, 115, 117, 110, 112, 113, 115, 119.

Figures 6A and 6B (Sheet 6 of 14). Labels: BamHI; kbp; M; C#7; #9.

Figures 7A and 7B (Sheet 7 of 14). Panel A labels: codon optimized 3x HA tag. Panel B labels: soluble fraction; lanes C#7 and #9; 10 and 20 µg Chl; anti-HA; size markers 72, 55, 42, 34 and 26 kD.

Figure 8 (Sheet 8 of 14).

Figure 9 (Sheet 9 of 14). Labels: PpsbA3; psbA3.

Figure 10 (Sheet 10 of 14).

Figure 11 (Sheet 11 of 14).

Figure 12 (Sheet 12 of 14). Expected size of His-IspS is 65 kD (asterisk).
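The 65 kD size noted for Figure 12 can be sanity-checked with a rough calculation. The sketch below is illustrative only and is not part of the patent: it assumes the common rule of thumb of about 110 Da per amino acid residue, and it assumes a polypeptide of 595 residues, the full length implied by the text's "residues 53-595 of SEQ ID NO:2". The exact length of the His-tagged construct is not stated in this section, and the function name is hypothetical.

```python
# Rough protein mass estimate from residue count.
# Assumption (not from the patent): ~110 Da average mass per amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kd(n_residues: int,
                     avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Return an approximate protein mass in kD for a chain of n_residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    # Assumed length: 595 residues (full-length SEQ ID NO:2 as implied by the text).
    full_length = 595
    # Residues 53-595 inclusive, i.e. the form lacking the transit peptide.
    without_transit = 595 - 53 + 1
    print(f"full length:     ~{estimate_mass_kd(full_length):.1f} kD")
    print(f"residues 53-595: ~{estimate_mass_kd(without_transit):.1f} kD")
```

Under these assumptions the full-length estimate falls close to the 65 kD figure, while the form without the transit peptide comes out somewhat smaller. A histidine tag would add a small amount to either value.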

SHORT CHAIN VOLATILE HYDROCARBON PRODUCTION USING GENETICALLY ENGINEERED MICROALGAE, CYANOBACTERIA OR BACTERIA

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. provisional application No. 60/806,244, filed Jun. 29, 2006, which application is herein incorporated by reference.

BACKGROUND OF THE INVENTION

A variety of herbaceous, deciduous and conifer plants are known to possess the genetic and enzymatic capability for the synthesis and release of short-chain isoprenoids (e.g., isoprene (C5H8) and methyl-butenol (C5H10O)) into the surrounding environment. Such short-chain isoprenoids are derived from the early Calvin-cycle products of photosynthesis, and can be synthesized in the chloroplast of herbaceous, deciduous and conifer plants via the so-called DXP-MEP pathway at substantial rates under certain environmental stress conditions. Heat-stress of the organism is particularly important for the induction of this process in plants, and the resulting hydrocarbon pollution of the atmosphere has been the focus of the prior art in this field.

Emission of isoprene from herbaceous, deciduous, and conifer plants is due to the presence of an isoprene synthase (IspS) gene, a nuclear gene encoding a chloroplast-localized protein that catalyzes the conversion of dimethylallyl diphosphate (DMAPP) to isoprene. As noted above, isoprenoids are synthesized in the chloroplast from the early products of the Calvin cycle (carbon fixation and reduction, see FIG. 1). 5-carbon isoprenoids, e.g. isoprene (C5H8) and methyl-butenol (C5H10O), are relatively small hydrophobic molecules, synthesized directly from DMAPP (FIG. 2). These isoprenoids are volatile molecules that easily go through cellular membranes and thereby are emitted from the leaves into the atmosphere. The process of heat stress-induction and emission of short-chain hydrocarbons by plants has been discussed as undesirable pollution of the atmosphere in the literature. There has been no description of the mass generation, harvesting and sequestration of these hydrocarbons from the leaves of herbaceous, deciduous and conifer plants.

There is an urgent need for the development of renewable biofuels that will help meet global demands for energy but without contributing to climate change. The current invention addresses this need by providing methods and compositions to generate volatile short-chain hydrocarbons that are derived entirely from sunlight, carbon dioxide (CO2) and water (H2O). These hydrocarbons can serve as biofuel or feedstock in the synthetic chemistry industry.

BRIEF SUMMARY OF THE INVENTION

The invention is based, in part, on the discovery that microalgae, cyanobacteria and prokaryotic photosynthesis can be employed, upon suitable modification, to produce 5-carbon isoprenoids (e.g. FIG. 3). The DXP-MEP isoprenoid biosynthetic pathway is absolutely required in plants and algae, as it leads to the synthesis of many essential longer-chain cellular compounds. Unicellular green algae specifically express this pathway in their chloroplast and utilize the corresponding enzymes for the biosynthesis of a great variety of molecules (carotenoids, tocopherols, phytol, sterols, hormones, among many others). The present invention relates to methods and compositions for the use of genetically modified microalgae, cyanobacteria, and photosynthetic and non-photosynthetic bacteria in the production and harvesting of 5-carbon volatile isoprenoid compounds, e.g., isoprene and methyl-butenol. Such genetically modified organisms can be used commercially in an enclosed mass culture system, e.g., to provide a source of renewable fuel for internal combustion engines or, upon on-board reformation, in fuel-cell operated engines; or to provide a source of isoprene for uses in other chemical processes such as chemical synthesis.

Microalgae, cyanobacteria, and photosynthetic and non-photosynthetic bacteria do not possess an isoprene synthase or a methyl-butenol synthase gene, whose products catalyze the last committed step in isoprene (C5H8) and methyl-butenol (C5H10O) biosynthesis, respectively. This invention therefore provides methods and compositions to genetically modify microorganisms to express an isoprene synthase gene, e.g., a codon-adjusted poplar isoprene synthase gene, so as to confer isoprene (C5H8) production to the organism.

In additional aspects, the invention also provides methods and compositions for the genetic modification of microalgae, cyanobacteria, and photosynthetic and non-photosynthetic bacteria to confer to these microorganisms over-expression of endogenous genes and proteins encoding the first committed step in isoprenoid biosynthesis. The invention can thus further comprise increasing expression of native Dxs and Dxr genes in the microorganism, e.g., green algae such as Chlamydomonas reinhardtii; cyanobacteria such as Synechocystis sp.; photosynthetic bacteria such as Rhodospirillum rubrum; or non-photosynthetic bacteria such as Escherichia coli. Dxs and Dxr encode enzymes that catalyze the first committed steps in isoprenoid biosynthesis.

In some embodiments, microalgae are employed. Microalgae are factories of photosynthesis, with the chloroplast occupying ~70% of the cell volume; green algal chloroplast contains over 3 million electron transport circuits, each being capable of delivering 100 electrons per second to the Calvin cycle for CO2 conversion to GA-3-P; microalgae have no roots, stems, leaves, or flowers on which to invest photosynthetic resources, thus a greater fraction of photosynthetic product can be directed toward volatile isoprenoid generation; microalgae grow and reproduce faster than any other terrestrial or aquatic plant, doubling of biomass per day; and microalgae are non-toxic and non-polluting, thus environmentally friendly for mass cultivation and commercial exploitation. Accordingly, in some embodiments, the invention provides a process to modify the highly efficient process of microalgal photosynthesis to generate, in high volume, short-chain isoprene hydrocarbons (e.g., C5H8) from sunlight, CO2 and H2O. Such modified microalgae can be grown, e.g., in large capacity (e.g., 1,000-1,000,000 liters) fully enclosed photoreactors for the production and harvesting of volatile short-chain isoprene hydrocarbons.

The invention will help eliminate a number of current barriers in the commercial production, storage and utilization of renewable energy, including, but not limited to: (a) Lowering the cost of production and storage of fuel. (b) Improving fuel weight/volume ratios. (c) Improving the efficiency of fuel production/storage. (d) Increasing the durability of fuel storage. (e) Minimizing auto-refueling time. (f) Offering sufficient fuel storage for acceptable vehicle range. (g) Producing a fuel amenable to regeneration process. (h) Fuel is not subject to interference by oxygen in either production or storage stage.

In one aspect, the invention provides a method of producing isoprene hydrocarbons in a microorganism selected from the group consisting of microalgae, cyanobacteria, or photosynthetic bacteria, the method comprising: introducing an expression cassette that comprises a nucleic acid sequence encoding isoprene synthase into the microorganism; and culturing the microorganism under conditions in which the nucleic acid encoding isoprene synthase is expressed. In some embodiments, the microorganism is a microalgae such as green algae, e.g., Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris or Dunaliella salina. In alternative embodiments, the microorganism is a cyanobacteria, such as a Synechocystis sp. In other embodiments the microorganism is a photosynthetic bacteria such as Rhodospirillum rubrum. Alternatively, in some embodiments, the microorganism can be a non-photosynthetic bacteria, such as Escherichia coli.

In some embodiments, the nucleic acid introduced into the microorganism comprises a sequence that encodes an isoprene synthase polypeptide that has the sequence set forth in SEQ ID NO:2, or has the sequence set forth in SEQ ID NO:2, but lacks a transit peptide region. The isoprene synthase polypeptide can, e.g., comprise amino acid residues 53-595 of SEQ ID NO:2, or residues 38-595 of SEQ ID NO:2. In some embodiments, the nucleic acid comprises the sequence set forth in SEQ ID NO:1. In other embodiments, the nucleic acid comprises the nucleotide coding sequence for isoprene synthase set forth in SEQ ID NO:3; or the nucleic acid comprises the isoprene coding sequence as set forth in SEQ ID NO:5.

In another aspect, the invention provides a microorganism selected from the group consisting of a microalgae cell, a cyanobacteria cell, and a photosynthetic bacterial cell or non-photosynthetic bacterial cell, wherein the microorganism comprises a heterologous nucleic acid that encodes isoprene synthase and is operably linked to a promoter. The promoter can be a constitutive promoter or an inducible promoter. In some embodiments, the microorganism is a green algae, such as Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris or Dunaliella salina. In other embodiments, the microorganism is a cyanobacteria, such as Synechocystis sp. In other embodiments, the microorganism is a photosynthetic bacteria, such as Rhodospirillum rubrum. In some embodiments, the heterologous nucleic acid comprises a sequence that encodes an isoprene synthase gene that has the sequence set forth in SEQ ID NO:2, or has the sequence set forth in SEQ ID NO:2, but lacks the transit peptide. The isoprene synthase polypeptide can, e.g., comprise amino acid residues 53-595 of SEQ ID NO:2, or residues 38-595 of SEQ ID NO:2. In some embodiments, the nucleic acid comprises the sequence set forth in SEQ ID NO:1. In other embodiments, the nucleic acid comprises the nucleotide coding sequence for isoprene synthase set forth in SEQ ID NO:3; or the nucleic acid comprises the isoprene coding sequence as set forth in SEQ ID NO:5.

In a further aspect, the invention provides a method of producing isoprene hydrocarbons in a microorganism that comprises a heterologous gene that encodes isoprene synthase and that is selected from the group consisting of microalgae, cyanobacteria, photosynthetic bacteria, and non-photosynthetic bacteria, the method comprising: mass-culturing the microorganism in an enclosed bioreactor under conditions in which the isoprene synthase gene is expressed; and harvesting isoprene hydrocarbons produced by the microorganism. In some embodiments, the microorganism is a microalgae that is a green microalgae, such as Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris or Dunaliella salina. Alternatively, the microorganism can be a cyanobacteria, such as a Synechocystis sp. In other embodiments, the microorganism is a photosynthetic bacteria, such as Rhodospirillum rubrum. In still other embodiments, the microorganism is a non-photosynthetic bacteria, such as Escherichia coli.

In some embodiments of the mass-culture methods of the invention, the heterologous gene that encodes isoprene synthase comprises a sequence that encodes an isoprene synthase gene that has the sequence set forth in SEQ ID NO:2 or has the sequence set forth in SEQ ID NO:2, but lacks the transit peptide. The isoprene synthase polypeptide can, e.g., comprise amino acid residues 53-595 of SEQ ID NO:2 or residues 38-595 of SEQ ID NO:2. The nucleic acid can, e.g., comprise the sequence set forth in SEQ ID NO:1. In other embodiments, the nucleic acid comprises the nucleotide coding sequence for isoprene synthase set forth in SEQ ID NO:3; or the isoprene coding sequence as set forth in SEQ ID NO:5.

In some embodiments of the methods and compositions of the invention, the IspS nucleic acid encodes a protein that comprises the amino acid sequence of SEQ ID NO:8 or that comprises amino acids 46-608 of SEQ ID NO:8.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Schematic pathway of carbon dioxide fixation and reduction in the Calvin cycle of photosynthesis and of the channeling of organic carbon from the ubiquitous glyceraldehyde-3-phosphate (GA-3-P) via the deoxy-xylulose/methyl-erythritol (DXP/MEP) biosynthetic pathway to isoprenoids.

FIG. 2. (Left panel) Single step enzymatic reaction for the biosynthesis of isoprene and methyl-butenol in the chloroplast of herbaceous/deciduous trees and US pines, respectively. (Right panel) Chemical formulae of isoprene (C5H8) and methyl-butenol (C5H10O).

FIG. 3. The DXP/MEP biosynthetic pathway leading to the formation of volatile isoprenoids from the abundant chloroplast metabolites GA-3-P (glyceraldehyde-3-phosphate) and pyruvate. Seven distinct enzymatic reactions are needed to synthesize isoprene from GA-3-P and pyruvate. Unicellular green algae, cyanobacteria, photosynthetic and certain non-photosynthetic bacteria possess the first six of these genes, but lack the isoprene synthase or methyl-butenol synthase genes.

FIG. 4. Co-transformation and homologous recombination of green algal, e.g. Chlamydomonas reinhardtii, chloroplast DNA with novel Cr-IspS gene. This construct contains the atpA promoter (PatpA), fused to the 5' UTR end of a codon optimized three-copy hemagglutinin (HA) epitope tag DNA. The DNA sequence is followed by the Cr-IspS coding region, followed by the atpA 3' UTR.

FIG. 5. Screening for C. reinhardtii IspS (Cr-IspS) transformants by genomic DNA PCR. Primers N and C represent the primer set used for amplification, and their annealing locations are shown in FIG. 4.

FIG. 6. A. Cr-IspS transgene integrity tested by genomic DNA Southern blot analysis. Filters probed with the Cr-IspS DNA probe. Hybridization with a radio-labeled NdeI/XbaI fragment of the Cr-IspS coding region identified a 3.0 kbp band exclusively in the Cr-IspS transformant line #9, whereas no detectable band could be observed in the control line #7 lane. B. Ethidium bromide staining to test for equal amounts of DNA loading in A.

FIG. 7. A. Schematic representation of the codon optimized 3xHA tagged Cr-IspS gene. B. Validation of Cr-IspS gene expression. Soluble protein fractions, which correspond to 10 or 20 µg chlorophyll, were subjected to SDS-PAGE and Western blot analysis with specific polyclonal anti-HA antibodies.

FIG. 8. Components and structure of the pIspS plasmid. Novel isoprene synthase gene (SS-IspS) with codon usage designed for expression in cyanobacteria, e.g. Synechocystis, which includes an Ampicillin resistance gene. The novel SS-IspS DNA sequence was designed on the basis of the amino acid sequence template of the poplar isoprene synthase protein, with criteria designed to conform to the Synechocystis codon preferences. Restriction sites were introduced to facilitate cloning. The novel SS-IspS DNA sequence was synthesized and cloned into plasmid pIspS for propagation in E. coli.

FIG. 9. Construction of pAIGA plasmid for transformation of cyanobacteria, e.g., Synechocystis. Flanking sequences from the psbA3 gene of Synechocystis were used for homologous recombination of the plasmid and to subsequently drive expression of the SS-IspS gene with a strong promoter. A Gentamycin resistance cassette was introduced in the plasmid at the 3' end of the SS-IspS gene to serve as selectable marker. The SS-IspS gene was cloned between the NcoI and PstI restriction sites.

FIG. 10. Double homologous recombination. Schematic showing the principle of Synechocystis sp. transformation by double-homologous recombination and replacement of the native psbA3 gene by the SS-IspS Gm-resistance construct.

FIG. 11. Structure of a His-tagged SS-IspS-containing plasmid for recombinant protein over-expression in bacteria, e.g. Escherichia coli. The N-terminal histidine-tag was introduced to facilitate purification of recombinant protein. E. coli expression was induced upon addition of IPTG to the liquid cell culture.

FIG. 12. Evidence of expression of the His-tagged SS-IspS recombinant protein in bacteria, e.g. E. coli. Coomassie stained SDS-PAGE of electrophoretically separated total protein from cell extracts of E. coli carrying the pETIspS plasmid. Lane 1: Non-induced control culture. Lanes 2-4 and 6-10: Induced E. coli cultures. Lane 5: Molecular weight protein size markers.

FIG. 13. Clustal alignment of four known isoprene synthase proteins (SEQ ID NOS:2, 9, 10 and 8, respectively).

DETAILED DESCRIPTION OF THE INVENTION

Definitions

"Microalgae", "alga" or the like, refer to plants belonging to the subphylum Algae of the phylum Thallophyta. The algae are unicellular, photosynthetic, oxygenic algae and are non-parasitic plants without roots, stems or leaves; they contain chlorophyll and have a great variety in size, from microscopic to large seaweeds. Green algae, belonging to Eukaryota-Viridiplantae-Chlorophyta-Chlorophyceae, can be used in the invention. However, algae useful in the invention may also be blue-green, red, or brown, so long as the algae is able to perform the steps necessary to provide a substrate to produce isoprene.

A "volatile isoprene hydrocarbon" in the context of this invention refers to a 5-carbon, short chain isoprenoid, e.g., isoprene or methyl-butenol.

The terms "nucleic acid" and "polynucleotide" are used synonymously and refer to a single or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogs may be used that may have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones. Thus, nucleic acids or polynucleotides may also include modified nucleotides that permit correct read through by a polymerase. "Polynucleotide sequence" or "nucleic acid sequence" includes both the sense and antisense strands of a nucleic acid as either individual single strands or in a duplex. As will be appreciated by those in the art, the depiction of a single strand also defines the sequence of the complementary strand; thus the sequences described herein also provide the complement of the sequence. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.

The phrase "a nucleic acid sequence encoding" refers to a nucleic acid which contains sequence information for a structural RNA such as rRNA, a tRNA, or the primary amino acid sequence of a specific protein or peptide, or a binding site for a trans-acting regulatory agent. This phrase specifically encompasses degenerate codons (i.e., different codons which encode a single amino acid) of the native sequence or sequences that may be introduced to conform with codon preference in a specific host cell. In the context of this invention, the term "IspS coding region" when used with reference to a nucleic acid reference sequence such as SEQ ID NO:3, 5, or 7 refers to the region of the nucleic acid that encodes the protein.

An IspS "gene" in the context of this invention refers to a nucleic acid that encodes an IspS protein, or fragment thereof. Thus, such a gene is often a cDNA sequence that encodes IspS. In other embodiments, an IspS gene may include sequences, such as introns, that are not present in a cDNA.

The term "promoter" or "regulatory element" refers to a region or sequence determinants located upstream or downstream from the start of transcription that direct transcription. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal elements, which can be located as much as several thousand base pairs from the start site of transcription. A "constitutive" promoter is a promoter that is active under most environmental and developmental conditions. An "inducible" promoter is a promoter that is active under environmental or developmental regulation. The term "operably linked" refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter) and a second nucleic acid sequence, such as an IspS gene, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.

An "algae promoter" or "bacterial promoter" is a promoter capable of initiating transcription in algae and/or bacterial cells, respectively. Such a promoter is therefore active in a microalgae, cyanobacteria, or bacteria cell, but need not originate from that organism. It is understood that limited modifications can be made without destroying the biological function of a regulatory element and that such limited modifications can result in algal regulatory elements that have substantially equivalent or enhanced function as compared to a wild type algal regulatory element. These modifications can be deliberate, as through site-directed mutagenesis, or can be accidental such as through mutation in hosts harboring the regulatory element. All such modified nucleotide sequences are included in the definition of an algal regulatory element as long as the ability to confer expression in unicellular green algae is substantially retained.

"Increased" or "enhanced" activity or expression of a Dxs or Dxr gene refers to a change in Dxs or Dxr activity. Examples of such increased activity or expression include the following. Dxs or Dxr activity or expression of a Dxs or Dxr gene is increased above the level of that in wild-type, non-transgenic control microorganism (i.e., the quantity of Dxs or Dxr activity or expression of Dxs or Dxr gene is increased). Dxs or Dxr activity or expression of a Dxs or Dxr gene is in a cell where it is not normally detected in wild-type, non-transgenic cells (i.e., expression of the Dxs or Dxr gene is increased). Dxs or Dxr activity or expression is also increased when Dxs or Dxr activity or expression of the Dxs or Dxr gene is present in a cell for a longer period than in wild-type, non-transgenic controls (i.e., duration of Dxs or Dxr activity or expression of the Dxs or Dxr gene is increased).

"Expression" of an IspS gene in the context of this invention typically refers to introducing an IspS gene into a cell, e.g., microalgae, such as green microalgae, cyanobacteria, or photosynthetic or non-photosynthetic bacteria, in which it is not normally expressed. Accordingly, an "increase" in IspS activity or expression is generally determined relative to wild type cells, e.g., microalgae, cyanobacteria or photosynthetic or non-photosynthetic bacteria, that have no IspS activity.

A polynucleotide sequence is "heterologous to" a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified by human action from its original form. For example, a promoter operably linked to a heterologous coding sequence refers to a coding sequence from a species different from that from which the promoter was derived, or, if from the same species, a coding sequence which is different from any naturally occurring allelic variants.

An "IspS polynucleotide" is a nucleic acid sequence of SEQ ID NO:1 or SEQ ID NO:7, or the IspS coding regions of SEQ ID NO:3 or SEQ ID NO:5; or a nucleic acid sequence that is substantially similar to SEQ ID NO:1 or the IspS coding regions of SEQ ID NO:3 or SEQ ID NO:5; or a nucleic acid sequence that encodes a polypeptide of SEQ ID NO:2 or SEQ ID NO:8, or a polypeptide that is substantially similar to SEQ ID NO:2 or SEQ ID NO:8, or a fragment or domain thereof. Thus, an IspS polynucleotide: 1) comprises a region of about 15 to about 50, 100, 150, 200, 300, 500, 1,000, 1500, or 2,000 or more nucleotides, sometimes from about 20, or about 50, to about 1800 nucleotides and sometimes from about 200 to about 600 or about 1500 nucleotides of SEQ ID NO:1 or SEQ ID NO:7, or the IspS coding region of SEQ ID NOs: 3 or 5; or 2) hybridizes to SEQ ID NO:1 or SEQ ID NO:7 or to the IspS coding region of SEQ ID NO:3 or SEQ ID NO:5, or the complements thereof, under stringent conditions, or 3) encodes an IspS polypeptide or fragment of at least 50 contiguous amino acids, typically of at least 100, 150, 200, 250, 300, 350, 400, 450, 500, or 550, or more contiguous residues of an IspS polypeptide, e.g., SEQ ID NO:2 or SEQ ID NO:8; or 4) encodes an IspS polypeptide or fragment that has at least 55%, often at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or greater identity to SEQ ID NO:2 or SEQ ID NO:8, or over a comparison window of at least 100, 200, 300, 400, 500, or 550 amino acid residues of SEQ ID NO:2 or SEQ ID NO:8; or 5) has a nucleic acid sequence that has greater than about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or higher nucleotide sequence identity to SEQ ID NO:1 or SEQ ID NO:7, at least 80%, 85%, 90%, or at least 95%, 96%, 97%, 98%, 99% or greater identity over a comparison window of at least about 50, 100, 200, 500, 1000, or more nucleotides of SEQ ID NO:1 or SEQ ID NO:7, or the IspS coding region of SEQ ID NO:3 or SEQ ID NO:5; or 6) is amplified by primers to SEQ ID NO:1 or SEQ ID NO:7, or the IspS coding region of SEQ ID NO:3 or SEQ ID NO:5. The term "IspS polynucleotide" refers to double stranded or single stranded nucleic acids. The IspS nucleic acids for use in the invention encode an active IspS that catalyzes the conversion of a dimethylallyl diphosphate substrate to isoprene.

An "IspS polypeptide" is an amino acid sequence that has the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:8, or is substantially similar to SEQ ID NO:2 or SEQ ID NO:8, or a fragment or domain thereof. Thus, an IspS polypeptide can: 1) have at least 55% identity, typically at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or greater identity to SEQ ID NO:2 or SEQ ID NO:8, or over a comparison window of at least 100, 200, 250, 300, 250, 400, 450, 500, or 550 amino acids of SEQ ID NO:2 or 8; or 2) comprise at least 100, typically at least 200, 250, 300, 350, 400, 450, 500, 550, or more contiguous amino acids of SEQ ID NO:2 or 8; or 3) bind to antibodies raised against an immunogen comprising an amino acid sequence of SEQ ID NO:2 or 8 and conservatively modified variants thereof. An IspS polypeptide in the context of this invention is a functional protein that catalyzes the conversion of a dimethylallyl diphosphate substrate to isoprene.

As used herein, a homolog or ortholog of a particular IspS gene (e.g., SEQ ID NO:1) is a second gene in the same plant type or in a different plant type that is substantially identical (determined as described below) to a sequence in the first gene.

The terms "Dxs" and "Dxr" nucleic acids and polypeptide refer to fragments, variants, and the like. Exemplary Dxs and Dxr sequences include the nucleic acid and polypeptide Dxs and Dxr sequences disclosed in U.S. Patent Application Publication No. 20030219798, e.g., Chlamydomonas sequences. The Dxs and Dxr sequences of U.S. Patent Application Publication No. 20030219798 are herein incorporated by reference.

An "expression cassette" refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of a RNA or polypeptide, respectively.

In the case of expression of transgenes one of skill will recognize that the inserted polynucleotide sequence need not be identical and may be "substantially identical" to a sequence of the gene from which it was derived. As explained below, these variants are specifically covered by this term.

In the case where the inserted polynucleotide sequence is transcribed and translated to produce a functional polypeptide, one of skill will recognize that because of codon degeneracy a number of polynucleotide sequences will encode the same polypeptide. These variants are specifically covered by the term "IspS polynucleotide sequence" or "IspS gene".

Two nucleic acid sequences or polypeptides are said to be "identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The term "complementary to" is used herein to mean that the sequence is complementary to all or a portion of a reference polynucleotide sequence.

Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.)

will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An
85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis.), or by inspection.

"Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. A "comparison window", as used herein, includes reference to a segment of any one of the number of contiguous positions, e.g., 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.

The term "substantial identity" in the context of polynucleotide or amino acid sequences means that a polynucleotide or polypeptide comprises a sequence that has at least 50% sequence identity to a reference sequence. Alternatively, percent identity can be any integer from 50% to 100%. Exemplary embodiments include at least: 55%, 57%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below. Accordingly, IspS sequences of the invention include nucleic acid sequences that have substantial identity to SEQ ID NO:1 or SEQ ID NO:7 or to the IspS coding regions of SEQ ID NO:3 or SEQ ID NO:5. As noted above, IspS polypeptide sequences of the invention include polypeptide sequences having substantial identity to SEQ ID NO:2 or SEQ ID NO:8.

Polypeptides that are "substantially similar" share sequences as noted above except that residue positions that are not identical may differ by conservative amino acid changes. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine.

Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other, or a third nucleic acid, under stringent conditions. The phrase "stringent hybridization conditions" refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence-dependent and

extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays" (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, optionally 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 55° C., 60° C., or 65° C. Such washes can be performed for 5, 15, 30, 60, 120, or more minutes.

Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides that they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize under moderately stringent hybridization conditions. For example, an IspS polynucleotide can also be identified by its ability to hybridize under stringency conditions (e.g., Tm ~40° C.) to nucleic acid probes having the sequence of SEQ ID NO:1 or SEQ ID NO:7. Such an IspS nucleic acid sequence can have, e.g., about 25-30% base pair mismatches or less relative to the selected nucleic acid probe. SEQ ID NO:1 is an exemplary IspS polynucleotide sequence. Exemplary "moderately stringent hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 1×SSC at 45° C. Such washes can be performed for 5, 15, 30, 60, 120, or more minutes. A positive hybridization is at least twice background. Those of ordinary skill will readily recognize that alternative hybridization and wash conditions can be utilized to provide conditions of similar stringency.

The term "isolated", when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It is preferably in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified. In particular, an isolated gene is separated from open reading frames that flank the gene and encode a protein other than the gene of interest.

As used herein, "mass-culturing" refers to growing large quantities of microalgae, cyanobacteria, or photosynthetic or non-photosynthetic bacteria that have been modified to express an IspS gene. A "large quantity" is generally in the range of about 100 liters to about 1,500,000 liters, or more.
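As a rough illustration of the "percentage of sequence identity" calculation defined earlier in this section, the following minimal Python sketch counts matched positions over a comparison window, divides by the total number of positions in the window, and multiplies by 100. It assumes the two sequences have already been optimally aligned by one of the cited programs and that gaps are written as "-". The function names, the gap symbol, and the example sequences are hypothetical and are not part of the specification; the 50% default threshold mirrors the minimum given for "substantial identity".

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over one comparison window.

    Both inputs are assumed to be pre-aligned strings of equal length,
    with `gap` (a hypothetical convention) marking additions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must span the same window length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Matched positions: the identical base or residue occurs in both sequences.
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, then scale to percent.
    return 100.0 * matched / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 50.0) -> bool:
    """True when identity meets the threshold (default: at least 50%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Illustrative 10-position window: 8 matches, 1 gap, 1 mismatch.
    window_a = "ACGT-ACGTA"
    window_b = "ACGTTACGAA"
    print(percent_identity(window_a, window_b))            # 80.0
    print(is_substantially_identical(window_a, window_b))  # True
```

In this sketch a gapped position counts toward the window length but never toward the matches, which is one reasonable reading of the definition; a different convention for gaps would change the result.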
In some embodiments, the organisms are cultured in large quantities in modular bioreactors, each having a capacity of about 1,000 to about 1,000,000 liters.

A "bioreactor" in the context of this invention is any enclosed large-capacity vessel in which microalgae, cyanobacteria or photosynthetic or non-photosynthetic bacteria are grown. A "large-capacity vessel" in the context of this invention can hold about 100 liters, often about 500 liters, or about 1,000 liters to about 1,000,000 liters, or more.

As used herein, "harvesting volatile isoprene hydrocarbons" refers to capturing and sequestering such hydrocarbons in a closed or contained environment.

IspS, Dxr, or Dxs Nucleic Acid Sequences

The invention employs various routine recombinant nucleic acid techniques. Generally, the nomenclature and the laboratory procedures in recombinant DNA technology described below are those well known and commonly employed in the art. Many manuals that provide direction for performing recombinant DNA manipulations are available, e.g., Sambrook & Russell, Molecular Cloning, A Laboratory Manual (3rd Ed. 2001); and Current Protocols in Molecular Biology (Ausubel et al., eds., 1994-1999).

IspS nucleic acid and polypeptide sequences are known in the art. IspS genes have been isolated and sequenced from poplar and aspen (two related trees), and kudzu (a vine). The species involved and the sequences available in the NCBI database are given below by accession number, each of which is incorporated by reference:

Populus alba (white poplar) IspS mRNA for isoprene synthase: ACCESSION No AB198180;
Populus tremuloides (quaking aspen) isoprene synthase (IspS): ACCESSION No AY341431 (complete cds);
Populus alba x Populus tremula IspS mRNA: ACCESSION No AJ294819;
Populus nigra (Lombardy poplar) mRNA for isoprene synthase (IspS gene): ACCESSION No AM410988;
Pueraria montana var. lobata (kudzu vine) isoprene synthase (IspS): ACCESSION No AY316691 (complete cds).

Examination of these IspS sequences reveals a high degree of nucleotide and amino acid sequence identities; for example, hybrid poplar and aspen cDNA sequences are 98% identical at the polypeptide and nucleotide level (see, e.g., Sharkey et al., Plant Physiol. 137:700-712, 2005). The aspen isoprene synthase nucleotide coding sequence is 65% identical to the kudzu gene, while the protein sequences (without the chloroplast transit peptide) are 57% identical.

The poplar IspS protein has a high density of cysteine and histidine amino acids in the carboxy-terminal half of the protein. For example, considering the 591 amino acid sequence of the Cr-IspS protein (SEQ ID NO:4), cysteine moieties are found at positions 34, 326, 378, 413, 484, 505 and 559, i.e., six out of the seven cysteines are found in the lower 45% of the protein. Additional clustering of histidines in various positions of the C-terminal half of the protein is also observed. Cysteine and histidine amino acids are known to participate in proper folding and catalytic site structure of proteins and can be important components for enzyme activity. An alignment of four known IspS proteins showing the high conservation of Cys in the C-terminal part of the molecule is provided in FIG. 13. In one case, the kudzu protein has substituted an otherwise conserved Cys with Ser (Cys-509-Ser of the alba or nigra or tremuloides sequence in the Clustal alignment in FIG. 13). Serine is a highly conservative substitution for cysteine, as the only difference between the two amino acids is an -OH group in the place of the -SH group. In fact, examination of the four IspS sequences reveals the additional property of many conserved serines in the C-terminal half of the protein. Accordingly, in some embodiments, a nucleic acid for use in the invention encodes an IspS polypeptide that comprises the carboxyl-terminal 45% of SEQ ID NO:2 and retains the catalytic activity in converting DMAPP to isoprene. Other examples exist where a related protein in one microorganism, such as a green microalgae, lacks a substantial portion of the N-terminal portion of the protein (relative to the form of the protein present in another microorganism such as bacteria) without adverse effect on activity (see, e.g., Melis and Happe, Plant Physiol. 127:740-748, 2001). Accordingly, in some embodiments, an IspS nucleic acid for use in the invention encodes a polypeptide that comprises from about amino acid residue 330 through the C-terminus of SEQ ID NO:2 or SEQ ID NO:8. In some embodiments, the IspS polypeptide encoded by the IspS nucleic acid comprises from about amino acid residue 300 through the C-terminus of SEQ ID NO:2 or SEQ ID NO:8. In some embodiments, the IspS sequence can additionally lack the last 10, 15, or 20 residues of SEQ ID NO:2 or SEQ ID NO:8.

The transit peptide of the IspS protein includes, minimally, amino acids 1-37 for poplar and aspen and 1-45 for kudzu. On the basis of this analysis, the mature protein begins with the amino acid sequence "CSVSTEN . . ." (SEQ ID NO:11) etc. IspS nucleic acid sequences for use in the invention need not include sequences that encode a transit polypeptide and can further omit additional N-terminal sequence. For example, the SS-IspS construct set forth in the EXAMPLES section lacks 52 amino acids from the encoding synthetic gene DNA. This has had no effect on IspS protein synthesis and accumulation.

In some embodiments of the invention, a nucleic acid sequence that encodes a poplar or aspen IspS polypeptide (e.g., SEQ ID NO:2) is used. In other embodiments, a nucleic acid sequence that encodes a kudzu IspS polypeptide (e.g., SEQ ID NO:8) is used. The IspS polypeptides encoded by the nucleic acids employed in the methods of the invention have the catalytic activity of converting DMAPP to isoprene. Typically, the level of activity is equivalent to the activity exhibited by a poplar or aspen IspS polypeptide (e.g., encoded by SEQ ID NO:1) or a natural kudzu IspS polypeptide (e.g., encoded by SEQ ID NO:7).

Exemplary Dxs and Dxr sequences include the nucleic acid and polypeptide Dxs and Dxr sequences disclosed in U.S. Patent Application Publication No. 20030219798, e.g., Chlamydomonas sequences. The Dxs and Dxr sequences of U.S. Patent Application Publication No. 20030219798 are herein incorporated by reference.

Isolation or generation of IspS, Dxr, or Dxs polynucleotide sequences can be accomplished by a number of techniques. Cloning and expression by such techniques will be addressed in the context of IspS genes. However, the same techniques can be used to isolate and express Dxr or Dxs polynucleotides. For instance, oligonucleotide probes based on the sequences disclosed here can be used to identify the desired polynucleotide in a cDNA or genomic DNA library from a desired plant species. Such a cDNA or genomic library can then be screened using a probe based upon the sequence of a cloned IspS gene, e.g., SEQ ID NO:1 or SEQ ID NO:7. Probes may be used to hybridize with genomic DNA or cDNA sequences to isolate homologous genes in the same or different plant species.

nitrite reductase gene (Back et al., Plant Mol. Biol.

Alternatively, the nucleic acids of interest can be amplified from nucleic acid samples using amplification techniques. For instance, PCR may be used to amplify the sequences of the genes directly from mRNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other in vitro amplification methods may also be useful, for example, to clone
nucleic acid sequences that code for proteins to be expressed, to make nucleic acids to use as probes for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing, or for other purposes.

Appropriate primers and probes for identifying an IspS gene from plant cells, e.g., poplar or another deciduous tree, can be generated from comparisons of the sequences provided herein. For a general overview of PCR see PCR Protocols: A Guide to Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). An exemplary PCR for amplifying an IspS nucleic acid sequence is provided in the examples.

The genus of IspS nucleic acid sequences for use in the invention includes genes and gene products identified and characterized by techniques such as hybridization and/or sequence analysis using exemplary nucleic acid sequences, e.g., SEQ ID NO:1 or SEQ ID NO:7 and protein sequences, e.g., SEQ ID NO:2 or SEQ ID NO:8.

Preparation of Recombinant Vectors

To use isolated sequences in the above techniques, recombinant DNA vectors suitable for transformation of green microalgae, cyanobacteria, and photosynthetic or non-photosynthetic bacterial cells, are prepared. Techniques for transformation are well known and described in the technical and scientific literature. For example, a DNA sequence encoding an IspS gene (described in further detail below), can be combined with transcriptional and other regulatory sequences which will direct the transcription of the sequence from the gene in the intended cells of the transformed algae, cyanobacteria, or bacteria. In some embodiments, an expression vector that comprises an expression cassette that comprises the IspS gene further comprises a promoter operably linked to the IspS gene. In other embodiments, a promoter and/or other regulatory elements that direct transcription of the IspS gene are endogenous to the microorganism and the expression cassette comprising the IspS gene is introduced, e.g., by homologous recombination, such that the heterologous IspS gene is operably linked to an endogenous promoter and its expression is driven by the endogenous promoter.

Regulatory sequences include promoters, which may be either constitutive or inducible. In some embodiments, a promoter can be used to direct expression of IspS nucleic acids under the influence of changing environmental conditions. Examples of environmental conditions that may effect transcription by inducible promoters include anaerobic conditions, elevated temperature, or the presence of light. Promoters that are inducible upon exposure to chemical reagents are also used to express IspS nucleic acids. Other useful inducible regulatory elements include copper-inducible regulatory elements (Mett et al., Proc. Natl. Acad. Sci. USA 90:4567-4571 (1993); Furst et al., Cell 55:705-717 (1988)); tetracycline and chlor-tetracycline-inducible regulatory elements (Gatz et al., Plant J. 2:397-404 (1992); Röder et al., Mol. Gen. Genet. 243:32-38 (1994); Gatz, Meth. Cell Biol. 50:411-424 (1995)); ecdysone inducible regulatory elements (Christopherson et al., Proc. Natl. Acad. Sci. USA 89:6314-6318 (1992); Kreutzweiser et al., Ecotoxicol. Environ. Safety 28:14-24 (1994)); heat shock inducible regulatory elements (Takahashi et al., Plant Physiol. 99:383-390 (1992); Yabe et al., Plant Cell Physiol. 35:1207-1219 (1994); Ueda et al., Mol. Gen. Genet. 250:533-539 (1996)); and lac operon elements, which are used in combination with a constitutively expressed lac repressor to confer, for example, IPTG-inducible expression (Wilde et al., EMBO J. 11:1251-1259 (1992)). An inducible regulatory element also can be, for example, a nitrate-inducible promoter, e.g., derived from the spinach

reinhardtii, which is classified as Volvocales-Chlamydomonadaceae.

17:9 (1991)), or a light-inducible promoter, such as that associated with the small subunit of RuBP carboxylase or the LHCP gene families (Feinbaum et al., Mol. Gen. Genet. 226:449 (1991); Lam and Chua, Science 248:471 (1990)), or a light.

In one example, a promoter sequence that is responsive to light may be used to drive expression of an IspS nucleic acid construct that is introduced into Chlamydomonas that is exposed to light (e.g., Hahn, Curr Genet. 34:459-66, 1999; Loppes, Plant Mol Biol 45:215-27, 2001; Villand, Biochem J 327:51-7, 1997). Other light-inducible promoter systems may also be used, such as the phytochrome/PIF3 system (Shimizu-Sato, Nat Biotechnol 20:1041-4, 2002). Further, a promoter that is also responsive to heat can be employed to drive expression in algae such as Chlamydomonas (Muller, Gene 111:165-73, 1992; von Gromoff, Mol Cell Biol 9:3911-8, 1989). Additional promoters, e.g., for expression in algae such as green microalgae, include the RbcS2 and PsaD promoters (see, e.g., Stevens et al., Mol. Gen. Genet. 251:23-30, 1996; Fischer & Rochaix, Mol Genet Genomics 265:888-94, 2001).

In some embodiments, the promoter may be from a gene associated with photosynthesis in the species to be transformed or another species. For example, such a promoter from one species may be used to direct expression of a protein in transformed algal cells or cells of another photosynthetic marine organism. Suitable promoters may be isolated from or synthesized based on known sequences from other photosynthetic organisms. Preferred promoters are those for genes from other photosynthetic species that are homologous to the photosynthetic genes of the algal host to be transformed. For example, a series of light harvesting promoters from the fucoxanthin chlorophyll binding protein have been identified in Phaeodactylum tricornutum (see, e.g., Apt et al., Mol. Gen. Genet. 252:572-579, 1996). In other embodiments, a carotenoid chlorophyll binding protein promoter, such as that of peridinin chlorophyll binding protein, can be used.

In some embodiments, a promoter used to drive expression of a heterologous IspS gene is a constitutive promoter. Examples of constitutive strong promoters for use in microalgae include, e.g., the promoters of the atpA, atpB, and rbcL genes. Various promoters that are active in cyanobacteria are also known. These include promoters such as the (constitutive) promoter of the psbA3 gene in cyanobacteria and promoters such as those set forth in U.S. Patent Application Publication No. 20020164706, which is incorporated by reference. Other promoters that are operative in plants, e.g., promoters derived from plant viruses, such as the CaMV35S promoters, can also be employed in algae.

In some embodiments, promoters are identified by analyzing the 5' sequences of a genomic clone corresponding to an IspS gene. Sequences characteristic of promoter sequences can be used to identify the promoter.

A promoter can be evaluated, e.g., by testing the ability of the promoter to drive expression in plant cells, e.g., green algae, in which it is desirable to introduce an IspS expression construct.

A vector comprising IspS nucleic acid sequences will typically comprise a marker gene that confers a selectable phenotype on algae or bacterial cells. Such markers are known. For example, the marker may encode antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, and the like. In some embodiments, selectable markers for use in Chlamydomonas can be markers that provide spectinomycin resistance (Fargo, Mol Cell Biol 19:6980-90, 1999), kanamycin and amikacin resistance (Bateman, Mol Gen Genet. 263:404-10, 2000), zeomycin and phleomycin resistance (Stevens, Mol Gen Genet. 251:23-30, 1996), and paromomycin and neomycin resistance (Sizova, Gene 277:
221-9, 2001).

IspS nucleic acid sequences of the invention are expressed recombinantly in microorganisms, e.g., microalgae, cyanobacteria, or photosynthetic or non-photosynthetic bacteria. As appreciated by one of skill in the art, expression constructs can be designed taking into account such properties as codon usage frequencies of the organism in which the IspS nucleic acid is to be expressed. Codon usage frequencies can be tabulated using known methods (see, e.g., Nakamura et al., Nucl. Acids Res. 28:292, 2000). Codon usage frequency tables, including those for microalgae and cyanobacteria, are also available in the art (e.g., in codon usage databases of the Department of Plant Genome Research, Kazusa DNA Research Institute (www.kazusa.or.jp/codon)).

Cell transformation methods and selectable markers for bacteria and cyanobacteria are well known in the art (Wirth, Mol Gen Genet. 1989 March; 216(1):175-7; Koksharova, Appl Microbiol Biotechnol 2002 February; 58(2):123-37; Thelwell). Transformation methods and selectable markers for use in bacteria are well known (see, e.g., Sambrook et al., supra).

In microalgae, e.g., green microalgae, the nuclear, mitochondrial, and chloroplast genomes are transformed through a variety of known methods, including by microparticle bombardment, or using a glass bead method (see, e.g., Kindle, J Cell Biol 109:2589-601, 1989; Kindle, Proc Natl Acad Sci USA 87:1228-32, 1990; Kindle, Proc Natl Acad Sci USA 88:1721-5, 1991; Shimogawara, Genetics 148:1821-8, 1998; Boynton, Science 240:1534-8, 1988; Boynton, Methods Enzymol 264:279-96, 1996; Randolph-Anderson, Mol Gen Genet. 236:235-44, 1993). In some embodiments, an IspS gene is introduced into the chloroplast of a microalgae. In other embodiments, an IspS gene is introduced into the nucleus.

The techniques described herein for obtaining and expressing IspS nucleic acid sequences in microalgae, cyanobacteria or photosynthetic or non-photosynthetic bacteria can also be employed to express Dxr or Dxs nucleic acid sequences.

Microorganisms that can be Targeted

IspS can be expressed in any number of microalgae, e.g., green algae, or cyanobacteria, or photosynthetic or non-photosynthetic bacteria where it is desirable to produce isoprene. Transformed microalgae, cyanobacteria, or bacteria (photosynthetic bacteria or non-photosynthetic bacteria) that express a heterologous IspS gene are grown under mass culture conditions for the production of hydrocarbons, e.g., to be used as a fuel source or as feedstock in synthetic chemistry. The transformed organisms are grown in bioreactors or fermentors that provide an enclosed environment to contain the hydrocarbons. In typical embodiments for mass culture, the microalgae, cyanobacteria, or bacteria are grown in enclosed reactors in quantities of at least about 500 liters, often of at least about 1000 liters or greater, and in some embodiments in quantities of about 1,000,000 liters or more.

In some embodiments, IspS is expressed in microalgae. Algae, alga or the like, refer to plants belonging to the subphylum Algae of the phylum Thallophyta. The algae are unicellular, photosynthetic, oxygenic algae and are non-parasitic plants without roots, stems or leaves; they contain chlorophyll and have a great variety in size, from microscopic to large seaweeds. Green algae, which are single cell eukaryotic organisms of oxygenic photosynthesis endowed with chlorophyll a and chlorophyll b belonging to Eukaryota-Viridiplantae-Chlorophyta-Chlorophyceae, are often a preferred target. For example, IspS can be expressed in C.

Algae strains that may be used in this invention include, e.g., Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris, Botryococcus braunii, Botryococcus sudeticus, Dunaliella salina, and Haematococcus pluvialis.

Methods of mass-culturing algae are known. For example, algae can be grown in high density photobioreactors (see, e.g., Lee et al., Biotech. Bioengineering 44:1161-1167, 1994; Chaumont, J Appl. Phycology 5:593-604, 1990), bioreactors such as those for sewage and waste water treatments (e.g., Sawayama et al., Appl. Micro. Biotech., 41:729-731, 1994; Lincoln, Bulletin De L'institut Oceanographique (Monaco), 12:109-115, 1993), mass-cultured for the elimination of heavy metals from contaminated water (e.g., Wilkinson, Biotech. Letters, 11:861-864, 1989), mass-cultured for the production of β-carotene (e.g., Yamaoka, Seibutsu-Kogaku Kaishi, 72:111-114, 1994), (e.g., U.S. Patent Application Publication No. 20030162273), and pharmaceutical compounds (e.g., Cannell, 1990), as well as nutritional supplements for both humans and animals (Becker, 1993, Bulletin De L'institut Oceanographique (Monaco), 12, 141-155) and for the production of other compounds of nutritional value.

Conditions for growing IspS-expressing algae or bacteria for the exemplary purposes illustrated above are known in the art (see, e.g., the exemplary references cited herein). Volatile isoprene hydrocarbons produced by the modified microorganisms can be harvested using known techniques. Isoprene hydrocarbons are not miscible in water and they rise to and float at the surface of the microorganism growth medium. They are siphoned off from the surface and sequestered in suitable containers. In addition, and depending on the prevailing temperature during the mass cultivation of the microorganisms, isoprene can exist in vapor form above the water medium in the bioreactor container (isoprene boiling temperature T=34° C.). Isoprene vapor is piped off the bioreactor container and condensed into liquid fuel form upon cooling or low-level compression.

EXAMPLES

The examples described herein are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially similar results.

Example 1

Design and Expression of Novel Cr-IspS Gene for Isoprene Hydrocarbon Production in Microalgae

A codon-adjusted synthetic DNA construct was generated based on the known nuclear-encoded "isoprene synthase" IspS protein sequence of Populus alba (poplar). This amino acid sequence (SEQ ID NO:2) was used as a template for the de novo design of an IspS DNA sequence for expression of the gene in the chloroplast of model microalga Chlamydomonas reinhardtii. For the purposes of this invention, this gene has been termed Cr-IspS. Features of this gene included: (1) Codon usage was different from that of poplar and specifically selected to fit the codon usage of the Chlamydomonas reinhardtii chloroplast; (2) The poplar chloroplast targeting sequence of the protein was omitted from the design of the new Cr-IspS gene. (3) Three copies of a codon optimized gene encoding the hemagglutinin (HA) epitope tag were fused upstream of the IspS gene.

The Cr-IspS DNA sequence (SEQ ID NO:3) was designed to encode for the isoprene synthase protein (SEQ ID NO:4) specifically in the chloroplast of microalgae, e.g., Chlamydomonas reinhardtii. Codon usage adjustments for gene expression in the chloroplast of Chlamydomonas were made on the basis of the codon usage table for the Chlamydomonas reinhardtii chloroplast 6803, listed in the following URL: http://www.bio.net/bionet/mm/chlamy/1997-March/000843.html.

SEQ ID NO:4 also contains three copies of the hemagglutinin tag, which are underlined in the N-terminal side of the sequence. Restriction enzyme recognition sites were introduced at the ends of the Cr-IspS DNA sequence to facilitate cloning of the gene, and the entire sequence was synthesized and cloned in a carrier-plasmid.

A transgenic Chlamydomonas reinhardtii chloroplast was generated that expressed the codon-optimized recombinant isoprene synthase gene (Cr-IspS). This was accomplished by constructing a chimeric gene (FIG. 4 top, Cr-IspS) containing the atpA promoter (PatpA), fused to the 5' UTR end of a codon optimized three-copy hemagglutinin (HA) epitope tag DNA (FIG. 4). This DNA sequence was followed by the Cr-IspS coding region (FIG. 4), followed by the atpA 3'UTR (FIG. 4, TatpA).

whereas no detectable band could be observed in the control line #7 lane (FIG. 6A). These results validated the stable integration of Cr-IspS in the chloroplast genome of Chlamydomonas reinhardtii transformant line #9, and are consistent with the results of the PCR analysis (FIG. 5). Ethidium bromide staining of the agarose gel (FIG. 6B) tested for the equal amount of DNA loading. Similar results were obtained with the Cr-IspS transformant line #20 (not shown).

Cr-IspS protein accumulation in the Chlamydomonas reinhardtii transgenic line #9 was verified by Western blot analysis (FIG. 7) in order to demonstrate Cr-IspS gene expression. Anti-HA tag antibody (α-HA) was used to assay for the presence of the recombinant Cr-IspS protein and its cellular concentration. Three copies of hemagglutinin (HA) tag were introduced into a position preceding the Cr-IspS gene that encodes the mature protein (FIG. 7A), to serve as a convenient epitope for the detection of Cr-IspS protein accumulation. Chlamydomonas reinhardtii cells were concentrated to 500x10 cells/ml in 50 mM HEPES buffer (pH 7.0) and broken by glass bead agitation for 5 min to release the soluble fraction of chloroplast. Soluble protein fractions, which correspond to 10 or 20 μg chlorophyll, were subjected to SDS-PAGE and Western blot analysis with specific polyclonal anti-HA antibodies. A clear antibody-protein cross-reaction was observed at about the 67 kD band in the lanes loaded with
Integration of the constructed chimeric gene into the sample from the Chlamydomonas reinhardtii transformant Chlamydomonas reinhardtii chloroplast genome was line #9, but not in the control (C #7) (FIG. 7B). In addition, achieved using biolistic transformation and homologous antibody-protein cross-reactions were observed at about 38 recombination, requiring sequence homology between the kD, indicated by asterisk in FIG. 5B. Accumulation of Cr transforming vector and the chloroplast genome (Boynton et 30 IspS protein as a 38 kD band might indicate a premature al., Science, 240:1534-1538, 1988). For this purpose, the termination of Cr-IspS mRNA translation, or a specific deg vector p322 was employed, which contains a partial C. rein radation activity over the recombinant protein. There was no hardtii chloroplast genome for the target of homologous detectable 67 kD or 38 kD bands in the control lane (C #7). recombination (Franklin et al., Plant J., 30.733-744, 2002). The apparent cross-reaction corresponding to a 34 kD protein As shown in the diagram of FIG.4, the chimeric Cr-IspS gene 35 is probably a non-specific binding of the primary or second was ligated into the BamHI site of p322 to generate plasmid ary antibody to a Chlamydomonas reinhardtii protein. pApISAt. The pApISAt construct was co-transformed into Expression of the Cr-IspS protein was also detected in trans the C. reinhardtii strain CC503 chloroplast by means of par formant line #20 (not shown). ticle bombardment (Boynton et al., Science, 240: 1534-1538, 1988), along with plasmid p228, containing a modified 16S 40 Example 2 ribosomal gene conferring spectinomycin resistance (Frank linet al., Plant J.,30:733-744, 2002). Primers N and Cin FIG. Design and Expression of a SS-IspS Gene for 4 mark the annealing location of primers that were used for Isoprene Hydrocarbon Production in Cyanobacteria the Subsequent PCR screening among isolated spectinomycin resistant transformants. 
45 In order to express isoprene hydrocarbon production in FIG. 5 provides an example of the genomic PCR screening cyanobacteria, a codon-adjusted synthetic DNA construct analysis of primary transformants that were selected on was generated, based on the known isoprene synthase IspS media containing spectinomycin for the presence of either N protein sequence of Populus alba (poplar). This amino acid or C-terminal regions of the chimeric Cr-IspS gene in order to sequence was used as a template for the de novo design of an screen for C. reinhardtii Cr-IspS transformants. Over one 50 IspS DNA sequence for expression of the gene in cyanobac hundred spectinomycin resistant transformant colonies of teria, e.g., Synechocystis sp. Codon usage adjustments for Chlamydomonas reinhardtii were isolated and tested, among gene expression in cyanobacteria were made on the basis of which two independent lines (#9 and #20) were found to the codon usage Table for Synechocystis PCC 6803, listed in unequivocally contain the stably integrated Cr-IspS gene in the following URL: http://gib.genes.nig.ac.jp/single?codon/ their chloroplast DNA. A spectinomycin resistant transfor 55 main.php?spid=Syne PCC6803. mant (#7, not shown) was used as negative control for the The codon-adjusted gene is referred to herein as SS-IspS. PCR analysis and the pApISAt plasmid served as a template Features of this gene include: (1) Codon usage was different for the positive control. Primers N and C represent the primer from that of poplar and specifically selected to fit the codon set used for amplification, and their annealing locations are usage of Synechocystis; (2) The poplar chloroplast targeting shown in FIG. 4. 60 sequence of the protein was omitted from the design of the Genomic DNA from Chlamydomonas reinhardtii control new SS-IspS gene. 
The DNA sequence was designed to (#7) and the putative Cr-IspS transformant line #9 were encode the isoprene synthase protein specifically in cyano digested with BamHI, separated on an Agarose gel, and Sub bacteria, e.g. Synechocystis. The first underlined sequence of jected to Southern blot analysis in order to test for Cr-IspS SEQID NO:5 represents the (reverse compliment) beta-lac transgene integrity. Hybridization with a radio-labeled Ndel/ 65 tamase gene, whereas the second underlined sequence is the Xbal fragment of the Cr-IspS coding region identified a -3.0 SS-IspS DNA. Additionally, the italicized sequences are start kbp band exclusively in the Cr-IspS transformant line #9, and stop codons, and the bold sequences are cloning restric US 7,947,478 B2 19 20 tion sites. Restriction enzyme recognition sites were intro duced at the ends of the newly designed SS-IspS DNA IspS F NdeI, sequence to facilitate cloning of the gene, and the entire 5 - CTGGGTCATATGGAAGCTCGACGAA-3 (SEQ ID NO: 12), sequence was synthesized and cloned in a carrier-plasmid (FIG. 8). and The codon-optimized, length-adjusted and chemically IspS R HindIII, synthesized SS-IspS gene was cloned downstream of the 5'-ATGGAAAACCTGAAGCTTTTAACGTTCAA-3 '' (SEO ID NO : 13), psbA3 promoter region of Synechocystis, in frame with the ATG start codon of the psbA3 gene. The SS-IspS gene was NO:13), introducing an Nde I-site and a HindIII-site in the 5' followed by a transcriptional terminator and a gentamicin 10 and 3' end of the gene, respectively. These sites were used to resistance cassette and, thereafter, by the Synechocystis clone the gene into the pET1529 expression vector forming sequence immediately downstream of psbA3 gene (FIG. 8). vectorpETIspS, which carries a His-tag in the N-terminal end This new construct allowed for homologous recombina tion, i.e., insertion of the SS-IspS DNA sequence into the of the protein (FIG. 11). 
Synechocystis genome by replacement of the endogenous In order to demonstrate recombinant His-tagged SS-IspS psbA3 gene via double homologous recombination (FIG.10). 15 expression in Escherichia coli, E. coli bacteria were trans Selection of Synechocystis transformants could be made formed with the pETIspS plasmid, which contains the SS using gentamicin (Gm) as the selectable marker, and the IspS gene and a His-tag-encoding DNA in the 5' end of the strong psbA3 promoter drove expression of the SS-IspS gene. SS-IspS gene. Successful expression of this His-Ss-IspS gene In order to transform Synechocystis with the SS-IspS con struct, Synechocystis sp. cells were grown in a basic BG 11 in E. coli was induced upon addition of 0.1 mM IPTG to the growth medium in the presence of 5 mM glucose, until cell liquid cell culture. Cells were harvested and their protein density reached about 50x10 cells ml (ODo-0.5). Cells content was analyzed by SDS-PAGE and Coomassie staining were then harvested and concentrated to 10" cells ml, (FIG. 12). It was demonstrated that all clones carrying the mixed with the pAIGA plasmid for transformation and incu pETIspS plasmid were expressing the ~65 kD His-Ss-IspS bated for 4-6 hprior to spreading of the mixture onto filters on 25 protein (FIG. 12, -65 kD band). top of BG 11-containing agarplates, also containing 0.5ug/ml A similar undertaking and demonstration of expression of Gm, 0.3% sodium thiosulfate, and 10 mM TES-NaOH, pH8.0. the IspS gene and accumulation of the recombinant IspS The Petri plates were kept under low light intensity for 1-2 protein in bacteria, e.g. Escherichia coli, was successfully days and thereafter moved to normal growth conditions. Fil conducted with the Cr-IspS gene, codon-optimized for ters were transferred to fresh Gm-containing plates once a 30 expression in unicellular green algae, e.g. Chlamydomonas week. es that formed in the presence of the Gm selectable reinhardtii (results not shown). 
marker were isolated and re-streaked on fresh filters, fol All publications, accession numbers, and patent applica lowed by transfer to liquid BG 11 growth media under con tinued selective conditions in the presence of Gm. tions cited in this specification are herein incorporated by 35 reference as if each individual publication or patent applica Example 3 tion were specifically and individually indicated to be incor porated by reference. Expression of His-tagged IspS in Escherichia coli In order to construct the vector for expression of His tagged IspS gene in E. coli, SS-IspS DNA, codon optimized 40 for expression in cyanobacteria, was amplified by PCR using Exemplary IspS Sequences primers:
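The restriction sites attributed to the Example 3 primers (SEQ ID NO:12 and SEQ ID NO:13) can be checked with a short script. This is a minimal sketch, not part of the disclosed method: the primer strings are copied from Example 3 as printed, the recognition sequences are the standard ones for NdeI (CATATG) and HindIII (AAGCTT), and the names `find_sites`, `reverse_complement`, and `PRIMERS` are hypothetical helpers introduced only for illustration.

```python
# Recognition sequences for the two enzymes named in Example 3.
RECOGNITION_SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}

# Primer sequences as printed in Example 3 (SEQ ID NO:12 and SEQ ID NO:13).
PRIMERS = {
    "IspS_F_NdeI": "CTGGGTCATATGGAAGCTCGACGAA",
    "IspS_R_HindIII": "ATGGAAAACCTGAAGCTTTTAACGTTCAA",
}


def find_sites(seq, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Both recognition sequences are palindromic, so each site reads the same on either strand; `reverse_complement` is included only so that this can be confirmed directly.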

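The relationship between the Cr-IspS DNA of SEQ ID NO:3 and the HA-tagged N-terminus of SEQ ID NO:4 can be illustrated by translating the start of the coding sequence. The sketch below makes several assumptions: it uses the standard genetic code, it takes only the first 102 nucleotides of SEQ ID NO:3 as read from the sequence listing (with extraction spacing removed), and the names `translate`, `tag_ranges`, and `CR_ISPS_5PRIME` are hypothetical and introduced only for this example.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 102 nucleotides of SEQ ID NO:3 as read from the listing (assumption:
# spacing introduced by text extraction has been removed).
CR_ISPS_5PRIME = (
    "ATGTATCCTTATGATGTTCCAGACTACGCA"
    "GGTTATCCTTATGATGTACCAGACTATGCA"
    "GGTTATCCTTACGATGTACCTGATTACGCT"
    "GGTCCATGGTGT"
)

# Nine-residue hemagglutinin epitope as it appears in SEQ ID NO:4.
HA_TAG = "YPYDVPDYA"


def translate(dna):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def tag_ranges(protein, tag=HA_TAG):
    """Return 1-based inclusive (start, end) positions of each tag copy."""
    ranges = []
    start = protein.find(tag)
    while start != -1:
        ranges.append((start + 1, start + len(tag)))
        start = protein.find(tag, start + 1)
    return ranges


if __name__ == "__main__":
    protein = translate(CR_ISPS_5PRIME)
    print(protein)
    print(tag_ranges(protein))
```

Under these assumptions the three tag copies fall at residues 2-10, 12-20, and 22-30, and the 34th codon (nucleotides 100-102) is the TGT that begins the IspS portion, which agrees with the feature positions given in the sequence listing for SEQ ID NO:3 and SEQ ID NO:4.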
SEQ ID NO : 1 Populus alba cDNA for isoprene synthase, Accession No. AB19818O atggcaactgaattattgtg cittgcaccgt ccaatctoac togacacacaa acttitt caga 61 aatcc cttac ctaaagt cat coaggcc act coctita actt togaaacticag atgttctgta 121 agcacagaala acgtcagctt cacagaalaca gaalacagaag ccagacggtc. tccaatt at 181 gaaccaaata gctgggatta to attatttg ctgtc.ttcag acactgacga at cattgag 241 gtatacaaag acaaggccaa aaagctggag gCtgaggtga galagagagat taacaatgaa 301 aaggcagagt ttittgactict gcttgaactg at agataatg tccalaaggitt aggattgggit 361 taccggttcg agagtgacat aaggggagcc cttgatagat ttgtttct tc aggaggattit 421 gatgctgtta caaaaac tag cctt catggit actgct citta gott caggct tct cagacag 481 catggittittg aggtotctica agaag.cgttc agtggattica aggatcaaaa toggcaattitc 541 ttggaaaacc ttaaggagga catcaaggca at actaagcc tatatgaagc tit catttctt 6O1 goattagaag gagaaaat at Cttggatgag gC calaggtgt ttgcaat atc acatctaaaa 661 gagct cagcg aagaaaagat tgaaaagag ctggc.cgaac aggtgaat catcattggag 721 ctitccattgc atcgcaggac goaaagact a galagctgttt ggagcattga agcataccgt US 7,947,478 B2 21 22 - Continued 781 aaaaaggaag atgcaaatca agtactgcta gaacttgcta tattggacta caacatgatt

841 caatcagtat accalaagaga tott.cgc gag acat Caaggt ggtggaggcg agtgggtctt 901 gcaacaaagt tgcattttgc tagagacagg ttaattgaaa gottt tact g ggcagttgga 961 gttgcgttcg agcct caata cagtgattgc cgtaatticag tagcaaaaat gttitt cattt O21 gtaacaatca ttgatgatat ctatatgtt tatgg tactic tigacgagtt ggagct attt O81 acagatgctg ttgagagatg ggatgttaat gccat caatig at citt.ccgga ttatatgaag

141 citctgct tcc tagct ct cita Calacact at C aatgagatag ct tatgacaa totgaaggac

2O1 aagggggaala a cattct tcc. attacctaa.ca aaag.cgtggg cagattt atg caatgcatt C 261 ctacaagaag caaaatggitt gtacaataag to cacaccala catttgatga ct attt cqga

321 aatgcatgga aatcatCct C agggcct citt caact agttt ttgcct actt togcc.gtggitt

381 caaaa.catca agaaagagga aattgaaaac ttacaaaagt at catgatac catcagtagg

441 cct tcc caca tottt cqtct ttgcaacgac Ctggctt cag catcggctga gatagagaga 501 ggtgaaa.ca.g cga attctgt at catgctac atgcgtacaa aaggcattt C taggagctt 561 gct actgaat cc.gtaatgaa cittgatcgac gaaac Ctgga aaaagatgaa caaagaaaag

621 cttggtggct Ctttgtttgc aaaac Cttitt gtcqaaacag ct attaacct togcacggcaa.

681 toccattgca Cttat catala cggagatgcg catactt cac cagacgagct aactaggaaa 74.1 c.gtgtcc tdt cagtaat cac agagcctatt ct acc ctittg agagataa SEQ ID NO: 2 Populus alba polypeptide sequence for isoprene synthase (from Accession No. AB19818 O). The underlined portion of the protein denotes a chloroplast transit peptide. MATELLCLHRPISLTH KLFRNPLPKVIQATPLTLKLRCSVSTENVSFTETETEARRSANYEPNSWDYDYL

LSSDTDESIEWYKDKA KKLEAEVRREINNEKAEFLTLLELIDNWORLGLGYRFESDIRGALDRFWSSGGF

DAWTKTSLHGTALSFR (LROHGFEWSOEAFSGFKDONGNFLENLKEDIKAILSLYEASFLALEGENILDE

AKVFAISHLKELSEEKIGKELAEQVNHALELPLHRRTORLEAVWSIEAYRKKEDANOWLLELAILDYNMI

OSVYORDLRETSRWWR RVGLATKLHFARDRLIESFYWAVGVAFEPOYSDCRNSVAKMFSFWTIIDDIYDV

YGTLDELELFTDAWERWDWNAINDLPDYMIKLCFLALYNTINEIAYDNLKDKGENILPYLTKAWADLCNAF

LOEAKWLYNKSTPTFD DYFGNAWKSSSGPLOLVFAYFAVWONIKKEEIENLOKYHDTISRPSHIFRLCND

LASASAEIARGETANSWSCYMRTKGISEELATESWMNLIDETWKKMNKEKLGGSLFAKPFWETAINLARO

SHCTYHNGDAHTSPDE TRKRWLSWIT EPILPFER

SEQ ID NO: 3 Cr-IspS gene and hemagglutinin tag for transformation/expression in unicellular green algae. The IspS nucleotide sequence starts with the underlined TGT codon.

ATGTATCCTTATGATGTTCCAGACTACGCAGGTTATCCT ATGATGTACCAGACTATGCAGGTTATCCTT

ACGATGTACCTGATTACGCTGGTCCATGGTGTTCTGTTAGTACTGAAAATGTTTCATTTAC GAAACAGA

AACAGAAGCACGTAGATCAGCAAATTATGAGCCAAATAGTTGGGATTATGACTATTTATTA CTAGTGAT

ACAGATGAATCTATTGAAGTATATAAAGATAAAGCAAAAAAATTAGAAGCAGAAGTACGTCGTGAAATTA.

ATAACGAAAAAGCAGAATTTCTTAC TTATTAGAATTAATTGATAATGTACAACGTTTAGG TTAGGTTA

TCGTTTTGAATCAGACATTCGTGGTGCATTAGATCGTTT GTATCAAGTGGTGGTTTTGATGCTGTTACA

AAAACTAGTTTACATGGTACTGC TT AAGTTTTCGTTTACTTCGTCAACATGGTTTTGAAG TAAGTCAAG

AAGCTTTTTCTGGTTTTAAAGATCAAAATGGTAATTTCTTAGAAAATTTAAAAGAAGATAT AAAGCTAT

TTTAAGTTTATACGAAGCATCATTT TAGCTTTAGAAGG GAAAATATTTTAGATGAAGCTAAAGTATTT

GCTATTTCTCACTTAAAAGAATTATCAGAAGAAAAAATTGGTAAAGAATTAGCTGAACAAG AAACCATG

CATTAGAATTACCATTACATCGTCG ACACAACGTTTAGAAGCAGTTTGGTCTATTGAAGC TATCGTAA

AAAAGAAGATGCTAATCAAGTTTTATTAGAATTAGCAATTTTAGATTATAATATGATTCAATCAGTATAC

CAACGTGATTTACGTGAAACAAGTCGTTGGTGGCGTCGTGTAGGTTTAGCTACTAAATTACATTTTGCTC

GTGATCGTTTAATTGAAAGTTTTTATTGGGCAGTTGGTGTAGCTTTTGAACCACAATATTCAGATTGTCG

TAATTCAGTTGCAAAAATGTTTTCATTTGTAACTATTATTGATGATATTTATGATGTTTACGGTACATTA

GATGAATTAGAATTATTCACTGATGCAGTAGAACGTTGGGATGTTAATGCTATTAATGATTTACCAGATT

ATATGAAATTATGTTTTCTTGCTTTATATAACACTATTAATGAAATTGCTTATGATAACTTAAAAGATAA

AGGTGAAAATATTTTACCATATTTAACAAAAGCTTGGGCTGATTTATGTAATGCTTTTTTACAAGAAGCT

AAATGGTTATATAATAAATCAACACCAACATTTGATGATTATTTTGGTAATGCTTGGAAAAGTTCATCTG

GTCCATTACAATTAGTTTTTGCTTATTTTGCTGTTGTTCAAAATATTAAAAAAGAAGAAATTGAAAATTT

ACAAAAATATCATGATACAATTTCACGTCCATCACATATTTTTCGTTTATGTAATGATTTAGCTTCAGCT

TCAGCTGAAATTGCACGTGGTGAAACAGCAAATTCAGTTTCATGTTATATGCGTACAAAAGGTATTTCTG

AAGAATTAGCTACAGAATCAGTTATGAATTTAATTGATGAAACATGGAAAAAAATGAATAAAGAAAAATT

AGGTGGTTCTTTATTTGCTAAACCATTTGTTGAAACTGCTATTAATTTAGCACGTCAATCACATTGTACT

TATCATAATGGTGATGCTCATACATCACCAGATGAATTAACACGTAAACGTGTTTTATCAGTTATTACAG

AACCAATTTTACCATTTGAACGTTAA

SEQ ID NO: 4 Polypeptide sequence for Cr-IspS isoprene synthase gene. The three copies of the hemagglutinin HA tag are underlined. The isoprene synthase sequence lacks a chloroplast targeting sequence of the poplar IspS protein sequence. The IspS sequence starts with "CS . . .", indicated by the change of font.

MYPYDVPDYAGYPYDVPDYAGYPYDVPDYAGPWCSWSTENWSFTETETEARRSANYEPNSWDYDY

LLSSDTDESIEVYKDKAKKLEAEVRREINNEKAEFLTLLELIDNWORLGLGYRFESDIRGALDRFWSSGG

FDAVTKTSLHGTALSFRLLROHGFEWSOEAFSGFKDONGNFLENLKEDIKAILSLYEASFLALEGENILD

EAKVFAISHLKELSEEKIGKELAEQVNHALELPLHRRTORLEAVWSIEAYRKKEDANOWLLELAILDYNM

IOSVYORDLRETSRWWRRVGLATKLHFARDRLIESFYWAVGVAFEPOYSDCRNSVAKMFSFWTIIDDIYD

WYGTLDELELFTDAWERWDWNAINDLPDYMIKLCFLALYNTINEIAYDNLKDKGENILPYLTKAWADLCNA

FLOEAKWLYNKSTP TFDDYFGNAWKSSSGPLOLVFAYFAVVONIKKEEIENLOKYHDTISRPSHIFRLCN

DLASASAEIARGETANSWSCYMRTKGISEELATESWMNLIDETWKKMNKEKLGGSLFAKPFWETAINLAR

OSHCTYHNGDAHTSPDELTRKRVLSVITEPILPFER SEQ ID NO: 5 Nucleotide sequence of Ss -IspS DNA and plasmid plspS for cyanobacteria The first underlined sequence of SEQ ID NO: 5 represents the (reverse complement) beta-lactamase gene, whereas the second underlined sequence is the Ss - IspS DNA. Additionally, the italicized sequences are start and stop codons, and the bold sequences are cloning restriction sites. >pIsps aaaaag cattgct catcaatttgttgcaacgaac aggt cactatoagtcaaaataaaatcattatttaaaagggg cc.cgagct talagactggc.cgt.cgttitt acaacacagaaagagtttgtagaaacgcaaaaaggcc at CC9tcaggg gcct tctgct tagtttgatgcctggcagttcc ct actict cqcct tcc.gct tcc togct cactgact cqctg.cgct cggit cqtt cqgctg.cggcgagcggitat cagct cact caaaggcggtaatacggittatccacaga at Caggggata acgcaggaaagaacatgtgagcaaaaggc.ca.gcaaaaggcCagga accgtaaaaaggcc.gcgttgctggcgttitt tccataggct cogCCCC cctgacgagcatcacaaaaatcgacgct Caagt cagaggtggcga aaccc.gacaggac tataaagataccagg.cgtttcc.ccctggaagctic cct cqtgcgct ct cotgttcc.gaccctg.ccgcttaccggat acctgtcc.gc.ctttctic cctt.cgggaagcgtggcgctttct catagct cacgctgtaggitat ct cagttcggtgt aggit cqtt cqct coaa.gctgggctgtgtgcacgaac ccc.ccgttcagc.ccgaccgctg.cgc.ctt at CC9gtaact atcgt.cttgagt cca accc.gg talagacacgactitat cqccactggcagcagcc actggtaac aggattagcagag cgagg tatgtaggcggtgctacagagttcttgaagtggtgggctaactacggctacactagaagaacagtatttg

US 7,947,478 B2 27 28 - Continued ttgttggaalacc.gc.gattaatttggctcgc.caaagttcattgtacctat cacaatggtgatgct cacaccagt coco atgaatta acccdt aaacgagttctgttctgttgat tact galacc cattittgccCtttgaact taaaagtaa.ca99 ttitt coatgttgtcgtctgcaagaacactgcagagcctgctttitttgtacaaagttggcattata SEQ ID NO: 6 Amino acid sequence of the expected 65kD translated Ss -IspS protein from cyanobacteria plasmid. MEARRSANYE PNSWDYDFLL SSDTDESIEW YKDKAKKLEA EWRREINNEK

51. AEFLTLLELI DNWORLGLGY RFESDIRRAL DRFWSSGGFD GVTKTSLHAT

101 ALSFRLLROH GFEWSOEAFS GFKDONGNFL ENLKEDTKAI LSLYEASFLA

151 LEGENILDEA RVFAISHLKE LSEEKIGKEL, AEOVNHALEL PLHRRTORLE

2O1 AVWSIEAYRK. KEDANOVLLE LAILDYNMIQ SVYORDLRET SRWWRRVGLA

251 TKLHFAKDRL, IESFYWAVGV AFEPOYSDCR NSVAKMFSFV TIIDDIYDVY

3 O1 GTLDELELFT DAWERWDWNA INDLPDYMKL CFLALYNTIN EIAYDNLKDK

351 GENILPYLTK AWADLCNAFL OEAKWLYNKS TPTFDDYFGN AWKSSSGPLO

4O1 LIFAYFAVVO NIKKEEIENL OKYHDIISRP SHIFRLCNDL ASASAEIARG

451 ETANSWSCYM RTKGISEELA TESWMNLIDE TCKKMNKEKL GGSLFAKPFW

5O1 ETAINLAROS HCTYHNGDAH TSPDELTRKR VLSVITEPIL PFER* SEQ ID NO : 7 Pueraria montana var. lobata (kudzu vine) isoprene synthase (IspS) ; ACCESSION No AY316691 (complete cas.) The atg start codon is underlined and indicates the start of the protein-coding region of the cDNA. aat caatata taatatttac ggaagatttg atgcc titt.cc tdattittaat ttatttittat 61 ccctgcataa aataattgttg gtcaccgtac actgttcttg to acttggac aagaaatttg 121 act agcaa.gc aaggtataat catt catcta aact tatggit gatttattgc cccacct cat 181 caattitt cqt gtgttittatt ttagtgtc.ct tdgatcc tog titccaatata aaaggagaac 241 atggcatcgc aattittagag catat cattgaaaagtcatc. gcaaccaacc ttt tatgctt 301 g totaataaa ttatcgt.ccc ccacaccaac accaagtact agattitccac aaagtaagaa 361 citt catcaca caaaaaa.cat citcttgccaa toccaaacct togg.cgagitta tttgttgctac 421 gagct ct caa tttacccaaa taacagaaca taatagt cqg cqtt cagcta attaccago c 481 aaacctctgg aattittgaat ttctgcagtic totggaaaat gaccittaagg togattataca 541 tatatt coag ttaatttitt c ttitttitt citt ttgttgattitt taaggaatca tttagtttgg 6O1 gaaagtattt tttittatttg cacttittaat tataaaaatgttatat catt tt cactttitt 661 totatt catt ttcaaaattt tacatagaaa acagtaaatt ttittatttitt tittattittct 721 attitt catta titt citcaaat caaacggitat taaag cataa acaaagaaat taat attgtt 781 ctitttaattt tattitttitta caataatggg aacgattata tattaggct g acct taataa 841 gttatttittt ttittataata ttgttctitat tigtaacctaa cqac aggtgg aaaaactaga 901 agaga aggca acaaagctag aggaggaggit acgatgcatg at Caacagag taga cacaca 961 accattaagc titact agaat tdatcgacga tigtc.cagcgt ctaggattga cctacaagtt 1021 tagaaggac ataatcaaag cccttgagala tattgttittg ctggatgaga at aagaaaaa 1081 taaaagtgac ct c catgcta citgct ct cag ct tcc.gttta cittagacaac atggctittga 1141 gottt cocaa got atttatg tatatatatgttacc cactt agcaa.catat atatatatat 1201 at attatgat t cactgacca to atgtggit gcagatgtgt ttgagagatt taagga caag 1261 gagggaggitt t cagtggtga acttaaaggit gatgtgcaag ggttgctgag totatatgaa 1321 goat cct at C ttggctittga gggagaaaat ct cttggagg aggcaaggac attittcaata 
US 7,947,478 B2 29 30 - Continued 381 acacatctoa agaacaacct aaaagaagga at aaacacca aagtggcaga acaagttagt 441 catgcactgg aact tcc cta toatcaaaga ttgcatagac tagaagcacg atggttcctt 5O1 gacaaatatgaac caaagga accccaccat cagttactac to gagcttgc aaagctagat 561 ttcaatatgg togcaaacatt goaccagaaa gaactgcaag acctgtcaag gttagaaatt 621 totaattct ca agtaattatt acct cataag aaattaaata acaataacaa tattgagtgt 681 agagattitcc aattaaaaat taa catacga gaggat caat at at attctt agg tatgtgg 741 tactaatgaa atatatgct a ggtggtggac ggagatgggg ctagdalagca agct agacitt 801 tt Cogaga C agattaatgg aagtgtattt ttgggcgttg ggaatggcac ctgatcctica 861 attcggtgaatgtcgtaaag ctgtcactaa aatgtttgga ttggtcacca to atcqatga 921 totatatgac gtt tatggta ctittggatga got acaactic tt cactgatg ctgttgaga.g 981 gttcgtaatt gattt cagtic togatticagt tdgaatttaa ttattgctta attaataata 2041 acttgcgtac atgcatacac acagatggga cqtgaatgcc at aaacacac titccagacta 2101 catgaagttg togct tcc tag cactittataa caccqtcaat gacacgt.ctt at agcatcct 2161 taaagaaaaa gogacacaa.ca acctitt.cct a tittgacaaaa totgtacata tatactaatt 2221 atctoctitgg ttgattaatt agtttagttt agtttagttg g tatgtcaac acaattaatt 2281 aat attatat atggatgttg acagtggcgt gagittatgca aag catt cot to aagaagca 2341 aaatggit cqa acaacaaaat cattccagoa tittagcaagt acctggaaaa tigcatcggtg 24O1 toctic ct cog gtgtggctitt gottgct cot to ctact tct cagtgttgcca acaacaagaa 24 61 gatat ct cag accatgctict tcgttctitta actgattitcc atggccttgt gcgctic ctica 2521 tdcgt cattt toagacitctg caatgatttg gctacct cag cqgtgttgtaa ttaattacct 2581 taattaattt gtaac acttig titagactaat atatataggit gtgtctgtta attact acag 2641 gotgagctag agaggggtga gacgacaaat tcaataatat cittatatgca tagaatgac 2701 ggc acttctgaagagcaa.gc acgtgaggag titgagaaaat tat catgc agagtggaag 2761 aagatgalacc gagagcgagt ttcagattct acact acticc caaaagcttt tatggaaata 2821 gotgttaa.ca tot cagt titcgcattgc acataccaat atggagacgg acttggaagg

2881 ccagacitacg ccacagagaa tagaatcaag ttgct actta tag accc citt to caat caat 2941 caactaatgt acgtgtaa.ca acacaatata aa cacttitt c tacaagtata tatttgttta 3 OO1 attitcggtgt tdaattaggg gtcaacacag atatatatac ttcaatggac caact caacc 3 O61 aatctgataa gagaaaaaaa ataaaaataa ggittaggitta actttgtata aatccaagtt 3121 agatat caag titt SEQ ID NO: 8 Pueraria montana var. lobata (kudzu vine) isoprene synthase polypept ide sequence MATNLLCLSNKLSSPTPTPSTRFPOSKNFITOKTSLAN PWRVICATS SOFTOITEHNSRRSANYOPNL

WNFEFLOS ENDLKWEKLEEKATKLEEEWRCMINRWDT LSLLELIDDWORLGLTYKFEKDIIKALENI

WLLDENKKNKSDLHATALSFRLLROHGFEWSODWFERF KEGGFSGELKGDVOGLLSLYEASYLGFEGE

NLLEEART FSITHLKNNLKEGINTKVAEOVSHALELPY RLHRLEARWFLDKYEPKEPHHOLLLELAKL

DFNMVOTL HOKELODLSRWWTEMGLASKLDFVRDRLME FWALGMAPDPOFGECRKAVTKMFGLVTIID

DWYDWYGT LDELOLFTDAVERWDVNAINTLPDYMKLCF L.A. LYNTWNDTSYSILKEKGHNNLSYLTKSWRE

LCKAFLQEAKWSNNKIIPAFSKYLENASVSSSGWALLA PSYFSVCOOOEDISDHALRSLTDFHGLVRSSC

WIFRLCND ATSAAELERGETTNSIISYMHENDGTSEE REELRKLIDAEWKKMNRERWSDSTLLPKAF

MEIAWNMA RVSHCTYOYGDGLGRPDYATENRIKLLLID PINOLMYV

<223> OTHER INFORMATION: white poplar isoprene synthase (IspS)
<220> FEATURE:
<221> NAME/KEY: PEPTIDE
<222> LOCATION: (1)...(37)
<223> OTHER INFORMATION: chloroplast transit peptide

<4 OOs, SEQUENCE: 2 Met Ala Thr Glu Lieu Lleu. Cys Lieu. His Arg Pro Ile Ser Lieu. Thr His 1. 5 1O 15 Llys Lieu. Phe Arg Asn Pro Lieu Pro Llys Val Ile Glin Ala Thr Pro Lieu. 2O 25 3 O Thr Lieu Lys Lieu. Arg Cys Ser Val Ser Thr Glu Asn Val Ser Phe Thr 35 4 O 45 Glu Thr Glu Thr Glu Ala Arg Arg Ser Ala Asn Tyr Glu Pro Asn. Ser SO 55 60 Trp Asp Tyr Asp Tyr Lieu Lleu Ser Ser Asp Thr Asp Glu Ser Ile Glu 65 70 7s 8O Val Tyr Lys Asp Lys Ala Lys Llys Lieu. Glu Ala Glu Val Arg Arg Glu 85 90 95 Ile Asin Asn. Glu Lys Ala Glu Phe Lieu. Thir Lieu. Lieu. Glu Lieu. Ile Asp 1OO 105 11 O Asn Val Glin Arg Lieu. Gly Lieu. Gly Tyr Arg Phe Glu Ser Asp Ile Arg 115 12 O 125 Gly Ala Lieu. Asp Arg Phe Val Ser Ser Gly Gly Phe Asp Ala Val Thr 13 O 135 14 O Llys Thir Ser Lieu. His Gly Thr Ala Lieu. Ser Phe Arg Lieu. Lieu. Arg Glin 145 150 155 160 His Gly Phe Glu Val Ser Glin Glu Ala Phe Ser Gly Phe Lys Asp Glin 1.65 17O 17s Asn Gly Asn. Phe Lieu. Glu Asn Lieu Lys Glu Asp Ile Lys Ala Ile Lieu 18O 185 19 O Ser Lieu. Tyr Glu Ala Ser Phe Lieu Ala Lieu. Glu Gly Glu Asn. Ile Lieu. 195 2OO 2O5 Asp Glu Ala Lys Val Phe Ala Ile Ser His Lieu Lys Glu Lieu. Ser Glu 21 O 215 22O Glu Lys Ile Gly Lys Glu Lieu Ala Glu Glin Val Asn His Ala Lieu. Glu 225 23 O 235 24 O Lieu Pro Lieu. His Arg Arg Thr Glin Arg Lieu. Glu Ala Val Trp Ser Ile 245 250 255 Glu Ala Tyr Arg Llys Lys Glu Asp Ala Asn. Glin Val Lieu. Lieu. Glu Lieu. 26 O 265 27 O Ala Ile Lieu. Asp Tyr Asn Met Ile Glin Ser Val Tyr Glin Arg Asp Lieu 27s 28O 285 Arg Glu Thir Ser Arg Trp Trp Arg Arg Val Gly Lieu Ala Thir Lys Lieu 29 O 295 3 OO His Phe Ala Arg Asp Arg Lieu. Ile Glu Ser Phe Tyr Trp Ala Val Gly 3. OS 310 315 32O Val Ala Phe Glu Pro Glin Tyr Ser Asp Cys Arg Asn. Ser Val Ala Lys 3.25 330 335 Met Phe Ser Phe Val Thr Ile Ile Asp Asp Ile Tyr Asp Val Tyr Gly 34 O 345 35. O Thir Lieu. Asp Glu Lieu. Glu Lieu. Phe Thr Asp Ala Val Glu Arg Trp Asp 355 360 365 Val Asn Ala Ile Asn Asp Lieu Pro Asp Tyr Met Llys Lieu. 
Cys Phe Lieu 37 O 375 38O US 7,947,478 B2 35 36 - Continued Ala Luell Asn. Thir Ile Asn. Glu Ile Ala Tyr Asp Asn Lieu Lys Asp 385 390 395 4 OO

Gly Glu Asin Ile Leu Pro Tyr Lieu. Thir Lys Ala Trp Ala Asp Lieu. 4 OS 41O 415

Asn Ala Phe Lieu. Glin Glu Ala Llys Trp Lieu. Tyr Asn Lys Ser Thr 425 43 O

Pro Thir Phe Asp Asp Tyr Phe Gly Asn Ala Trp Ser Ser Ser Gly 435 44 O 445

Pro Luell Glin Leu Val Phe Ala Tyr Phe Ala Wall Wall Glin Asn Ile Llys 450 45.5 460

Lys Glu Glu Ile Glu Asn Lieu. Glin Asp Thir Ile Ser Arg 465 470 48O

Pro Ser His Ile Phe Arg Lieu. Cys Asn Asp Lieu. Ala Ser Ala Ser Ala 485 490 495

Glu Ile Ala Arg Gly Glu Thir Ala Asn Ser Wall Ser Tyr Met Arg SOO 505

Thir Gly Ile Ser Glu Glu Lieu Ala Thr Glu Ser Wall Met Asn Luell 515 525

Ile Asp Glu Thir Trp Llys Lys Met Asn Lys Glu Lys Lell Gly Gly Ser 53 O 535 54 O

Lell Phe Ala Llys Pro Phe Val Glu Thir Ala Ile Asn Lell Ala Arg Glin 5.45 550 555 560

Ser His Thr Tyr His Asn Gly Asp Ala His Thir Ser Pro Asp Glu 565 st O sts

Lell Thir Arg Lys Arg Val Lieu. Ser Wall Ile Thr Glu Pro Ile Leul Pro 585 59 O

Phe Glu Arg 595

<210> SEQ ID NO 3
<211> LENGTH: 1776
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: codon-adjusted synthetic DNA construct for expression of Populus alba IspS gene with chloroplast targeting sequence omitted and hemagglutinin (HA) epitope tag in microalga Chlamydomonas reinhardtii (Cr-IspS)
<220> FEATURE:
<221> NAME/KEY: gene
<222> LOCATION: (100)..(1776)
<223> OTHER INFORMATION: Populus alba IspS gene with chloroplast targeting sequence omitted codon-optimized for expression in microalga Chlamydomonas reinhardtii chloroplast (Cr-IspS)

SEQUENCE: 3 atgitat cott atgatgttcc agacitacgca ggittatcctt atgatgtacc agactatgca 6 O ggittatcc tt acgatgtacc tgattacgct ggtc.catggit gttctgttag tactgaaaat 12 O gttt cattta Ctgaaacaga alacagaagca cgtagat cag caaattatga gcc aaat agt 18O tgggattatg act atttatt atc tagtgat acagatgaat ctattgaagt atataaagat 24 O aaagcaaaaa aattagaa.gc agalagtacgt. cgtgaaatta ataacgaaaa agcagaattit 3OO cit tact titat tagaattaat tgataatgta caacgtttag gtttaggitta tcqttittgaa 360 t cagacattc gtggtgcatt agat.cgttitt gtat caagtg gtggttittga tgctgttaca aaaact agtt tacatggtac tgctittaagt titt cqtttac titcgt caa.ca tggittittgaa 48O gtaagttcaag aagctttitt c tggittittaaa gatcaaaatg gtaattt citt agaaaattta 54 O US 7,947,478 B2 37 38 - Continued aaagaagata ttaaagctat tittaagttta tacgaag cat catttittagc tittagaaggit gaaaat attt tagatgaagc taaagtattt gct attt citc acttaaaaga attat cagaa 660 gaaaaaattg gtaaagaatt agctgaacaa gtaalaccatg cattagaatt acCattacat 72 O cgt.cgtacac aacgtttaga agcagtttgg tctattgaag cittatcqtaa aaaagaagat 78O gctaat Caag ttitt attaga attagcaatt ttagattata atatgattica atcagtatac 84 O caacgtgatt tacgtgaaac aagtcgttgg tgg.cgt.cgtg taggitttagc tactalaatta 9 OO cattttgctic gtgatcgttt aattgaaagt ttittattggg Cagttggtgt agcttittgaa 96.O

CCaCalatatt Cagattgtcg taatt cagtt gcaaaaatgt titt catttgt aact attatt O2O gatgat attt atgatgttta cgg tacatta gatgaattag aattatt CaC tgatgcagta O8O galacgttggg atgttaatgc tattaatgat ttaccagatt atatgaaatt atgttittctt 14 O gctittatata acact attaa. tgaaattgct tatgataact taaaagataa aggtgaaaat 2OO attittacCat atttalacaaa. agcttgggct gattitatgta atgctttittt acaagaa.gct 26 O aaatggittat ataataaatc. acac caa.ca tittgatgatt attittgg taa tgcttggaaa 32O agttcatctg gtc. cattaca attagtttitt gct tattittg ctgttgttca aaat attaala 38O aaagaagaaa ttgaaaattit acaaaaat at catgatacaa titt cacgt.cc at Cacatatt 44 O tttcgtttat gtaatgattit agctt cagct t cagctgaaa ttgcacgtgg tgaaa.ca.gca SOO aatticagttt catgttatat gcgtacaaaa ggt atttctg aagaattagc tacagaatca 560 gttatgaatt taattgatga alacatggaaa aaaatgaata aagaaaaatt aggtggttct 62O ttatttgcta aac catttgt tgaaactgct attaatttag cacgtcaatc acattgtact 68O tat cataatg gtgatgctica taCat Cacca gatgaattaa cacgtaaacg tgttittatca 74 O gttattacag alaccalattitt accatttgaa cgittaa 776

<210> SEQ ID NO 4
<211> LENGTH: 591
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Populus alba isoprene synthase (IspS) with chloroplast targeting sequence omitted and hemagglutinin (HA) epitope tag fusion protein for expression in microalga Chlamydomonas reinhardtii chloroplast
<220> FEATURE:
<221> NAME/KEY: PEPTIDE
<222> LOCATION: (2)...(10)
<223> OTHER INFORMATION: hemagglutinin (HA) epitope tag peptide
<220> FEATURE:
<221> NAME/KEY: PEPTIDE
<222> LOCATION: (12)..(20)
<223> OTHER INFORMATION: hemagglutinin (HA) epitope tag peptide
<220> FEATURE:
<221> NAME/KEY: PEPTIDE
<222> LOCATION: (22)..(30)
<223> OTHER INFORMATION: hemagglutinin (HA) epitope tag peptide
<220> FEATURE:
<221> NAME/KEY: PEPTIDE
<222> LOCATION: (34)..(591)
<223> OTHER INFORMATION: Populus alba isoprene synthase (IspS) with chloroplast targeting sequence omitted

<400> SEQUENCE: 4

Met Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Gly Tyr Pro Tyr Asp Val
1               5                   10                  15

Pro Asp Tyr Ala Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Gly Pro
            20                  25                  30

Trp Cys Ser Val Ser Thr Glu Asn Val Ser Phe Thr Glu Thr Glu Thr
        35                  40                  45

Glu Ala Arg Arg Ser Ala Asn Glu Pro ASn Ser Trp Asp Tyr Asp SO 55 60

Tyr Luell Luell Ser Ser Asp Thir Asp Glu Ser Ile Glu Wall Tyr Lys Asp 65 70

Ala Lell Glu Ala Glu Wall Arg Arg Glu Ile Asn Asn Glu 85 90 95

Ala Glu Phe Lell Thir Lell Luell Glu Luell Ile Asp Asn Wall Glin Arg 105 11 O

Lell Gly Luell Gly Tyr Arg Phe Glu Ser Asp Ile Arg Gly Ala Luell Asp 115 12 O 125

Arg Phe Wall Ser Ser Gly Gly Phe Asp Ala Wall Thir Thir Ser Luell 13 O 135 14 O

His Gly Thir Ala Lell Ser Phe Arg Luell Luell Arg Glin His Gly Phe Glu 145 150 155 160

Wall Ser Glin Glu Ala Phe Ser Gly Phe Lys Asp Glin Asn Gly Asn Phe 1.65 17s

Lell Glu Asn Luell Lys Glu Asp Ile Lys Ala Ile Lell Ser Luell Glu 18O 185 19 O

Ala Ser Phe Luell Ala Lell Glu Gly Glu Asn Ile Lell Asp Glu Ala 195

Wall Phe Ala Ile Ser His Lell Glu Luell Ser Glu Glu Ile Gly 21 O 215

Lys Glu Luell Ala Glu Glin Wall Asn His Ala Luell Glu Lell Pro Luell His 225 23 O 235 24 O

Arg Arg Thir Glin Arg Lell Glu Ala Wall Trp Ser Ile Glu Ala Tyr Arg 245 250 255

Glu Asp Ala Asn Glin Wall Luell Luell Glu Lell Ala Ile Luell Asp 26 O 265 27 O

Asn Met Ile Glin Ser Wall Tyr Glin Arg Asp Lell Arg Glu Thir Ser 285

Arg Trp Trp Arg Arg Wall Gly Luell Ala Thir Lell His Phe Ala Arg 29 O 295 3 OO

Asp Arg Luell Ile Glu Ser Phe Trp Ala Wall Gly Wall Ala Phe Glu 3. OS 310 315

Pro Glin Ser Asp Arg Asn Ser Wall Ala Met Phe Ser Phe 3.25 330 335

Wall Thir Ile Ile Asp Asp Ile Asp Wall Gly Thir Luell Asp Glu 34 O 345 35. O

Lell Glu Luell Phe Thir Asp Ala Wall Glu Arg Trp Asp Wall Asn Ala Ile 355 360 365

Asn Asp Luell Pro Asp Met Luell Phe Lell Ala Luell Asn 37 O 375

Thir Ile Asn Glu Ile Ala Asp Asn Luell Lys Asp Gly Glu Asn 385 390 395 4 OO

Ile Luell Pro Lell Thir Ala Trp Ala Asp Lell Asn Ala Phe 4 OS 41O 415

Lell Glin Glu Ala Lys Trp Lell Asn Ser Thir Pro Thir Phe Asp 42O 425 43 O

Asp Phe Gly Asn Ala Trp Lys Ser Ser Ser Gly Pro Luell Glin Luell 435 44 O 445

Wall Phe Ala Phe Ala Wall Wall Glin Asn Ile Lys Glu Glu Ile 450 45.5 460

Glu Asn Leu Gln Lys His Asp Thr Ile Ser Arg Pro Ser His Ile

465 470

Phe Arg Lieu. Cys Asn Asp Lieu Ala Ser Ala Ser Ala Glu Ile Ala Arg 485 490 495

Gly Glu Thir Ala Asn. Ser Wal Ser Cys Tyr Met Arg Thir Lys Gly Ile SOO 505

Ser Glu Glu Lieu Ala Thr Glu Ser Wal Met Asn Lell Ile Asp Glu Thir 515 525

Trp Llys Llys Met Asn Lys Glu Lys Lieu. Gly Gly Ser Lell Phe Ala Lys 53 O 535 54 O

Pro Phe Wall Glu Thir Ala Ile Asn Lieu Ala Arg Glin Ser His Cys Thr 5.45 550 555 560

His Asn Gly Asp Ala His Thr Ser Pro Asp Glu Lell Thir Arg Llys 565 st O sts

Arg Wall Lieu Ser Wall Ile Thr Glu Pro Ile Lieu. Pro Phe Glu Arg 585 59 O

<210> SEQ ID NO 5
<211> LENGTH: 3965
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: codon-optimized synthetic DNA construct for expression of Populus alba IspS gene with chloroplast targeting sequence omitted in Synechocystis sp. cyanobacteria (SS-IspS) with beta-lactamase gene (reverse complement)
<220> FEATURE:
<221> NAME/KEY: gene
<222> LOCATION: Complement (1084)..(1944)
<223> OTHER INFORMATION: beta-lactamase gene (reverse complement)
<220> FEATURE:
<221> NAME/KEY: gene
<222> LOCATION: (2256)..(3890)
<223> OTHER INFORMATION: Populus alba IspS gene with chloroplast targeting sequence omitted codon-optimized for expression in Synechocystis sp. cyanobacteria (SS-IspS)
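The two gene features of SEQ ID NO:5 are given as 1-based inclusive coordinate ranges, one of them on the complementary strand. The sketch below is offered only as an illustration of the arithmetic those entries imply: the length of each feature, the number of residues a coding feature would encode if it ends in a single stop codon (an assumption), and how a complement-strand feature is read. The function names are hypothetical; the coordinates and the 544-residue length of SEQ ID NO:6 are the values stated in the listing.

```python
def feature_length(start, end):
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def reverse_complement(dna):
    """Reverse complement of a lowercase DNA string (a, c, g, t only)."""
    pairs = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in reversed(dna.lower()))


bla_length = feature_length(1084, 1944)    # beta-lactamase gene, complement strand
isps_length = feature_length(2256, 3890)   # SS-IspS gene, forward strand

# Assumption: the coding feature is an open reading frame ending in one stop codon.
isps_residues = isps_length // 3 - 1

print(bla_length, isps_length, isps_residues)

# A complement-strand feature is read from the reverse complement of the slice.
print(reverse_complement("atggaa"))
```

Under that assumption the forward-strand range corresponds to 544 residues, which agrees with the LENGTH entry of SEQ ID NO:6.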

<4 OOs, SEQUENCE: 5 aaaaag catt gct cat caat ttgttgcaac gaacaggtoa citat cagtica aaataaaatc. 6 O attatttaala aggggc.ccga gcttalagact ttaca acaca gaaagagttt 12 O gtagaaacgc aaaaaggc.ca tcc.gt Caggg gcc ttctgct tagtttgatg Cctggcagtt 18O coct actic to gcct tcc.gct t cct cqctica ctgacticgct gttcggctgc 24 O ggcgagcggt atcagotcac tcaaaggcgg taatacggitt atccacagaa t caggggata 3OO acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc Cagga accgt. aaaaaggcc.g 360

gtttitt coat aggct Cogcc cc cct gacga gcatcacaaa aatcgacgct

Caagttcagag gtggcgaaac ccgacaggac tataaagata cCaggcgttt c cc cctdgaa gctic cct cqt gcqct citcct gttcc gaccc tgcc.gcttac cggatacctg to cqcctitt c 54 O t ccctt.cggg citt tot cata gct cacgctg taggitat citc agttcggtgt aggit cqtt cq ctic caa.gctg acgaacc ccc cgttcagcc c gaccgctgcg 660 c ctitat cogg taactatogt cittgagtc.ca acccggtaag acacgacitta tcgcc actgg 72 O

Cagcagccac tggtaa.cagg attagcagag cgagg tatgt agg.cggtgct acagagttct tgaagtggtg ggctaactac ggctacacta gaagaac agt atttggitatic tgcgctctgc 84 O tgaa.gc.cagt tacctt cqga aaaagagttg gtagcticttg atc.cggcaaa caaaccaccg 9 OO

Ctggtagcgg tggitttittitt agcagattac gcgcagaaaa aaaggat.ctic 96.O aagaagat CC tttgatctitt tctacggggt ctgacgctica gtggaacgac gcgc.gcgtaa

US 7,947,478 B2 45 46 - Continued gcaatgcctg gaagagcagc agcgggcctic tccaact gat ttttgctitat tittgcgg tag 3480 tacaaaaCat taagaaagaa gagattgaaa atttgcaaaa gtaccatgac att attagt c 354 O ggcc cagt ca tattitt cogc ttgttgcaacg acctggcatc cgctagtgcc gaaattgcgc 36OO gtggcgaaac agctaatagt gtgagttgtt acatgcgcac aaagggcatt tcc.gaagaac 366 O tagctacgga aagtgt catg aacct gattg acgagacittg Caagaaaatg aataaggaaa 372 O aattgggcgg gtc. cct attt gccaaaccct ttgttggaaac cgcgattaat ttggctcgc.c 378 O aaagt cattg tacctatoac aatggtgatg ct cacaccag tcc.cgatgaa tta accc.gta 384 O aacgagttct gtctgtgatt actgaaccca ttttgcc citt tgaacgittaa aagtaac agg 3900 ttitt coatgt aagaacactg cagagcctgc tttitttgtac aaagttggca 396 O ttata 3.965

<210> SEQ ID NO 6
<211> LENGTH: 544
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: 65 kD Populus alba isoprene synthase (IspS) gene with chloroplast targeting sequence omitted fusion protein from Synechocystis sp. cyanobacteria (SS-IspS) expression plasmid pIsps

<400> SEQUENCE: 6

Met Glu Ala Arg Arg Ser Ala Asn Tyr Glu Pro Asn Ser Trp Asp Tyr
1               5                   10                  15

Asp Phe Luell Leu Ser Ser Asp Thr Asp Glu Ser Ile Glu Wall 25 3 O

Asp Ala Llys Llys Lieu. Glu Ala Glu Val Arg Arg Glu Ile Asn. Asn 35 4 O 45

Glu Lys Ala Glu Phe Lieu. Thir Lieu. Lieu. Glu Lieu Ile Asp Asn Wall Glin SO 55 60

Arg Luell Gly Lieu. Gly Tyr Arg Phe Glu Ser Asp Ile Arg Arg Ala Lieu 65 70 7s 8O

Asp Arg Phe Val Ser Ser Gly Gly Phe Asp Gly Val Thr Lys Thir Ser 85 90 95

Lell His Ala Thir Ala Leu Ser Phe Arg Lieu. Lieu. Arg Gln His Gly Phe 105 11 O

Glu Wall Ser Glin Glu Ala Phe Ser Gly Phe Lys Asp Glin Asn Gly Asn 115 12 O 125

Phe Luell Glu Asn Lieu Lys Glu Asp Thir Lys Ala Ile Lieu. Ser Leu Tyr 13 O 135 14 O

Glu Ala Ser Phe Lieu. Ala Lieu. Glu Gly Glu Asn Ile Lieu. Asp Glu Ala 145 150 155 160

Arg Wall Phe Ala Ile Ser His Lieu. Lys Glu Lieu. Ser Glu Glu Lys Ile 1.65 17O 17s

Gly Glu Lieu Ala Glu Glin Wall Asn His Ala Lieu. Glu Lieu. Pro Leu 18O 185 19 O

His Arg Arg Thr Glin Arg Lieu. Glu Ala Val Trp Ser Ile Glu Ala Tyr 195 2O5

Arg Lys Glu Asp Ala Asn. Glin Wall Lieu. Lieu. Glu Lieu. Ala Ile Lieu. 21 O 215 22O

Asp Asn Met Ile Glin Ser Wall Tyr Glin Arg Asp Lieu. Arg Glu Thir 225 23 O 235 24 O

Ser Arg Trp Trp Arg Arg Val Gly Leu Ala Thr Lys Leu His Phe Ala

245 250 255

Asp Arg Lieu. Ile Glu Ser Phe Tyr Trp Ala Wall Gly Wall Ala Phe 26 O 265 27 O

Glu Pro Glin Tyr Ser Asp Cys Arg Asn Ser Wall Ala Lys Met Phe Ser 285

Phe Wall Thir Ile Ile Asp Asp Ile Tyr Asp Wall Tyr Gly Thir Lieu. Asp 29 O 295 3 OO

Glu Luell Glu Lieu. Phe Thr Asp Ala Val Glu Arg Trp Asp Wall Asn Ala 3. OS 310 315 32O

Ile Asn Asp Leu Pro Asp Tyr Met Llys Lieu. Cys Phe Lell Ala Leu Tyr 3.25 330 335

Asn Thir Ile Asin Glu Ile Ala Tyr Asp Asn Lieu. Asp Lys Gly Glu 34 O 345 35. O

Asn Ile Luell Pro Tyr Lieu. Thir Lys Ala Trp Ala Asp Lell Asn Ala 355 360 365

Phe Luell Glin Glu Ala Lys Trp Lieu. Tyr Asn Lys Ser Thir Pro Thir Phe 37 O 375

Asp Asp Phe Gly Asn Ala Trip Llys Ser Ser Ser Gly Pro Lieu. Glin 385 390 395 4 OO

Lell Ile Phe Ala Tyr Phe Ala Val Wall Glin Asn Ile Glu Glu 4 OS 41O 415

Ile Glu Asn Lieu. Glin Llys Tyr His Asp Ile Ile Ser Arg Pro Ser His 425 43 O

Ile Phe Arg Lieu. Cys Asn Asp Lieu Ala Ser Ala Ser Ala Glu Ile Ala 435 44 O 445

Arg Gly Glu Thir Ala Asn. Ser Wall Ser Cys Tyr Met Arg Thir 450 45.5 460

Ile Ser Glu Glu Lieu Ala Thr Glu Ser Wal Met Asn Lell Ile Asp Glu 465 470 48O

Thir Llys Met Asn Lys Glu Llys Lieu. Gly Gly Ser Luell Phe Ala 485 490 495

Pro Phe Wall Glu. Thir Ala Ile Asn Lieu Ala Arg Glin Ser His Cys SOO 505

Thir His Asin Gly Asp Ala His Thir Ser Pro Asp Glu Luell Thr Arg 515 525

Arg Wall Lieu. Ser Wall Ile Thr Glu Pro Ile Lell Pro Phe Glu Arg 53 O 535 54 O

SEO ID NO 7 LENGTH: 31.33 TYPE: DNA ORGANISM: Pueraria montana war. lobata FEATURE: OTHER INFORMATION: kudzu vine isoprene synthase (IspS) cDNA SEQUENCE: 7 aatcaatata taatatttac ggaagatttg atgcc titt.cc tgattittaat t tattitt tatt 6 O c cct gcataa aataattgttg gtcaccgtac actgttcttg t cacttggac aagaaatttg 12 O actagdaagc alaggtataat Catticatcta aact tatggit gatttattgc CCC acct Cat 18O caattitt.cgt. gtgttittatt ttagtgtc.ct tggat.cctic titcCalatata aaaggagaac 24 O atggcatcgc aattittagag catat cattg aaaagt catg gcaac caacc ttt tatgctt 3OO gtctaataaa ttatcgt.ccc CCaCaccaac accaagtact agattitccac aaagtaagaa 360

Cttcat caca CaaaaaaCat citc.ttgccaa toccaaacct tggcgagitta tttgttgctac US 7,947,478 B2 49 50 - Continued gagct citcaa tttacccaaa taacagaa.ca taatagt cqg cqttcagcta attaccago c 48O aaac citctgg aattittgaat ttctgcagtic tictoggaaaat gaccittaagg tdattataca 54 O tatatt coag ttaatttitt c ttitttittctt ttgttgattitt taaggaatca tttagtttgg 6OO gaaagtattt tttittatttg cacttittaat tataaaaatgttatat catt tt cactttitt 660 tctatt catt ttcaaaattt tacatagaaa acagtaaatt ttittatttitt tittattittct 72 O attitt catta titt citcaaat caaacggitat taaag cataa acaaagaaat taatattgtt 78O cittittaattt tatttittitta caataatggg aacgattata tattaggct g accittaataa 84 O gttatttittt ttittataata ttgttctitat td taacctaa cqacaggtgg aaaaact aga 9 OO agagaaggca acaaagctag aggaggaggt acgatgcatg at Caacagag tag acacaca 96.O accattaa.gc titact agaat tdatcgacga tigt ccagcgt. c taggattga cct acaagtt O2O tgagaaggac ataatcaaag C ccttgagaa tattgttittg Ctggatgaga ataagaaaaa O8O taaaagtgac ctic catgcta citgct ct cag citt cogittta cittagacaac atggctittga 14 O ggtttcc.caa got atttatg tatatatatgttacc cactt agcaa.catat atatatatat 2OO at attatgat t cactgacca to atgtggit gcagatgtgt ttgagagatt taaggacaag 26 O gagggaggitt t cagtggtga acttaaaggit gatgtgcaag ggttgctgag tictatatgaa 32O gcatcCtatic ttggctittga gggagaaaat Ctcttggagg aggcaaggac attitt calata 38O acacatctica agaacaacct aaaagaagga ataaacacca aagtggcaga acaagttagt 44 O catgcactgg aacttic ccta t catcaaaga ttgcatagac tagaa.gcacg atggttccitt SOO gacaaatatgaac caaagga acc ccaccat cagttac tact cagcttgc aaagctagat 560 ttcaatatgg togcaaacatt gcaccagaaa gaactgcaag acctgtcaag gttagaaatt 62O t caattctica agtaattatt acct cataag aaattaaata acaataacaa tattgagtgt 68O agagatttico aattaaaaat taa catacga gaggat caat atatatt citt agg tatgtgg 74 O tactaatgaa atatatgcta ggtggtggac ggagatgggg Ctagdaagca agctagacitt 8OO tgtc.cgagac agattaatgg aagtgtattt ttgggcgttg ggaatggcac Ctgat cotca 86 O attcggtgaa titcgtaaag Ctgtcactaa aatgtttgga ttggt cacca t catcgatga 92 O tgtatatgac gtt tatggta 
Ctttggatga gct acaactic titcactgatg Ctgttgaga.g 98 O gttcgtaatt gattitcagtic ticgatticagt tdgaatttaa ttattgctta attaataata 2O4. O acttgcgtac atgcatacac acagatggga cqtgaatgcc ataaacacac titccagacita 21OO catgaagttg togctitcc tag cactittataa caccgtcaat gacacgt.ctt atagoat cot 216 O taaagaaaaa gogacacaa.ca acctitt.ccta tittgacaaaa totgtacata tatactaatt 222 O atct cottgg ttgattaatt agtttagttt agtttagttg g tatgtcaac acaattaatt 228O aatatt at at atggatgttg acagtggcgt gag titatgca aag catt cot tdaagaa.gca 234 O aaatggtcga acaacaaaat catt C cagca tttagcaagt acctggaaaa to atcggtg 24 OO t cct cotc.cg gtgtggctitt gcttgctic ct tcc tact tct cagtgtgcca acaacaagaa 246 O gatat ct cag accatgct ct tcgttctitta actgattitcc atggccttgt gcgct cotca 252O tgcgtcattt toagacitctg caatgatttg gctacct cag cqgtgtgtaa ttaattacct 2580 taattaattt gtaacacttg ttagactaat atatataggit gtgtctgtta attact acag 264 O gctgagctag agaggggtga gacgacaa at t caataatat Cttatatgca tagaatgac 27 OO ggcactitctgaagagcaa.gc acgtgaggag ttgagaaaat tdatcgatgc agagtggaag 276 O aagatgaacc gagagcgagt ttcagattct acact acticc caaaagctitt tatggaaata 282O US 7,947,478 B2 51 - Continued gctgttaa.ca tggctcgagt titcgcattgc acataccalat atggaga.cgg acttggalagg 288O ccagacitacg ccacagagaa tagaatcaag ttgct actta tag accc citt to Caatcaat 294 O caactaatgt acgtgtaa.ca acacalatata aac acttittc. tacaagtata tatttgttta 3 OOO

tgaattaggg gtcaiacacag CtatatataC ttcaatggac Caact Calacc 3 O 6 O aatctgataa gagaaaaaaa ataaaaataa. ggittaggitta actttgtata aatccaagtt 312 O agatat caag titt 31.33

<210> SEQ ID NO 8
<211> LENGTH: 608
<212> TYPE: PRT
<213> ORGANISM: Pueraria montana var. lobata
<220> FEATURE:
<223> OTHER INFORMATION: kudzu vine isoprene synthase (IspS)

<400> SEQUENCE: 8

Met Ala Thr Asn Leu Leu Cys Leu Ser Asn Lys Leu Ser Ser Pro Thr
1               5                   10                  15

Pro Thir Pro Ser Thr Arg Phe Pro Glin Ser Lys Asn Phe Ile Thr Gin 25 3 O

Thir Ser Lieu Ala ASn Pro Llys Pro Trp Arg Wall Ile Ala Thr 35 4 O 45

Ser Ser Glin Phe Thr Glin Ile Thr Glu. His Asn Ser Arg Arg Ser Ala SO 55 60

Asn Glin Pro Asn Lieu. Trp Asn Phe Glu Phe Lell Glin Ser Lieu. Glu 65 70 7s 8O

Asn Asp Luell Llys Val Glu Lys Lieu Glu Glu Lys Ala Thir Lieu. Glu 85 90 95

Glu Glu Wall Arg Cys Met Ile Asn Arg Val Asp Thir Glin Pro Luell Ser 105 11 O

Lell Luell Glu Lieu. Ile Asp Asp Wall Glin Arg Lieu. Gly Lell Thir 115 12 O 125

Phe Glu Asp Ile Ile Lys Ala Lieu. Glu Asn Ile Wall Luell Lieu. Asp 13 O 135 14 O

Glu Asn Lys Asn Llys Ser Asp Lieu. His Ala Thir Ala Luell Ser Phe 145 150 155 160

Arg Luell Luell Arg Gln His Gly Phe Glu Wal Ser Glin Asp Wall Phe Glu 1.65 17O 17s

Arg Phe Asp Llys Glu Gly Gly Phe Ser Gly Glu Lell Lys Gly Asp 18O 185 19 O

Wall Glin Gly Lieu Lleu Ser Lieu. Tyr Glu Ala Ser Lell Gly Phe Glu 195

Gly Glu Asn Lieu. Lieu. Glu Glu Ala Arg Thr Phe Ser Ile Thir His Lieu. 21 O 215 22O

Lys Asn Asn Lieu Lys Glu Gly Ile Asn. Thir Lys Wall Ala Glu Glin Wall 225 23 O 235 24 O

Ser His Ala Lieu. Glu Lieu Pro Tyr His Glin Arg Lell His Arg Lieu. Glu 245 250 255

Ala Arg Trp Phe Lieu. Asp Llys Tyr Glu Pro Llys Glu Pro His His Glin 26 O 265 27 O

Lell Luell Luell Glu Lieu Ala Lys Lieu. Asp Phe Asn Met Wall Glin Thir Lieu. 27s 285

His Glin Glu Lieu. Glin Asp Lieu. Ser Arg Trp Trp Thir Glu Met Gly 29 O 295 3 OO

Leu Ala Ser Lys Leu Asp Phe Val Arg Asp Arg Leu Met Glu Val Tyr
305                 310                 315                 320

Phe Trp Ala Leu Gly Met Ala Pro Asp Pro Gln Phe Gly Glu Cys Arg
                325                 330                 335

Lys Ala Val Thr Lys Met Phe Gly Leu Val Thr Ile Ile Asp Asp Val
            340                 345                 350

Tyr Asp Val Tyr Gly Thr Leu Asp Glu Leu Gln Leu Phe Thr Asp Ala
        355                 360                 365

Val Glu Arg Trp Asp Val Asn Ala Ile Asn Thr Leu Pro Asp Tyr Met
    370                 375                 380

Lys Leu Cys Phe Leu Ala Leu Tyr Asn Thr Val Asn Asp Thr Ser Tyr
385                 390                 395                 400

Ser Ile Leu Lys Glu Lys Gly His Asn Asn Leu Ser Tyr Leu Thr Lys
                405                 410                 415

Ser Trp Arg Glu Leu Cys Lys Ala Phe Leu Gln Glu Ala Lys Trp Ser
            420                 425                 430

Asn Asn Lys Ile Ile Pro Ala Phe Ser Lys Tyr Leu Glu Asn Ala Ser
        435                 440                 445

Val Ser Ser Ser Gly Val Ala Leu Leu Ala Pro Ser Tyr Phe Ser Val
    450                 455                 460

Cys Gln Gln Gln Glu Asp Ile Ser Asp His Ala Leu Arg Ser Leu Thr
465                 470                 475                 480

Asp Phe His Gly Leu Val Arg Ser Ser Cys Val Ile Phe Arg Leu Cys
                485                 490                 495

Asn Asp Leu Ala Thr Ser Ala Ala Glu Leu Glu Arg Gly Glu Thr Thr
            500                 505                 510

Asn Ser Ile Ile Ser Tyr Met His Glu Asn Asp Gly Thr Ser Glu Glu
        515                 520                 525

Gln Ala Arg Glu Glu Leu Arg Lys Leu Ile Asp Ala Glu Trp Lys Lys
    530                 535                 540

Met Asn Arg Glu Arg Val Ser Asp Ser Thr Leu Leu Pro Lys Ala Phe
545                 550                 555                 560

Met Glu Ile Ala Val Asn Met Ala Arg Val Ser His Cys Thr Tyr Gln
                565                 570                 575

Tyr Gly Asp Gly Leu Gly Arg Pro Asp Tyr Ala Thr Glu Asn Arg Ile
            580                 585                 590

Lys Leu Leu Leu Ile Asp Pro Phe Pro Ile Asn Gln Leu Met Tyr Val
        595                 600                 605

<210> SEQ ID NO 9
<211> LENGTH: 595
<212> TYPE: PRT
<213> ORGANISM: Populus tremuloides
<220> FEATURE:
<223> OTHER INFORMATION: quaking aspen isoprene synthase (IspS)

<400> SEQUENCE: 9

Met Ala Thr Glu Leu Leu Cys Leu His Arg Pro Ile Ser Leu Thr His
1               5                   10                  15

Lys Leu Phe Arg Asn Pro Leu Pro Lys Val Ile Gln Ala Thr Pro Leu
            20                  25                  30

Thr Leu Lys Leu Arg Cys Ser Val Ser Thr Glu Asn Val Ser Phe Ser
        35                  40                  45

Glu Thr Glu Thr Glu Thr Arg Arg Ser Ala Asn Tyr Glu Pro Asn Ser
    50                  55                  60

Trp Asp Tyr Asp Tyr Leu Leu Ser Ser Asp Thr Asp Glu Ser Ile Glu
65                  70                  75                  80

Wall His Asp Lys Ala Luell Glu Ala Glu Val Arg Arg Glu 85 90 95

Ile Asn Asn Glu Ala Glu Phe Luell Thir Luell Lell Glu Luell Ile Asp 105 11 O

Asn Wall Glin Arg Lell Gly Lell Gly Tyr Phe Glu Ser Asp Ile Arg 115 12 O 125

Arg Ala Luell Asp Arg Phe Wall Ser Ser Gly Gly Phe Asp Gly Wall Thir 13 O 135 14 O

Lys Thir Ser Luell His Gly Thir Ala Luell Ser Phe Arg Lell Luell Arg Glin 145 150 155 160

His Gly Phe Glu Wall Ser Glin Glu Ala Phe Ser Gly Phe Asp Glin 1.65 17O 17s

Asn Gly Asn Phe Lell Glu Asn Luell Lys Glu Asp Ile Ala Ile Luell 18O 185 19 O

Ser Luell Tyr Glu Ala Ser Phe Luell Ala Luell Glu Gly Glu Asn Ile Luell 195

Asp Glu Ala Wall Phe Ala Ile Ser His Luell Lys Glu Luell Ser Glu 21 O 215 22O

Glu Ile Gly Glu Lell Ala Glu Glin Wall Ser His Ala Luell Glu 225 23 O 235 24 O

Lell Pro Luell His Arg Arg Thir Glin Arg Luell Glu Ala Wall Trp Ser Ile 245 250 255

Glu Ala Arg Lys Glu Asp Ala Asn Glin Wall Lell Luell Glu Luell 26 O 265 27 O

Ala Ile Luell Asp Tyr Asn Met Ile Glin Ser Wall Glin Arg Asp Luell 285

Arg Glu Thir Ser Arg Trp Trp Arg Arg Wall Gly Lell Ala Thir Luell 29 O 295 3 OO

His Phe Ala Arg Asp Arg Lell Ile Glu Ser Phe Trp Ala Wall Gly 3. OS 310 315

Wall Ala Phe Glu Pro Glin Ser Asp Cys Arg Asn Ser Wall Ala 3.25 330 335

Met Phe Ser Phe Wall Thir Ile Ile Asp Asp Ile Asp Wall Gly 34 O 345 35. O

Thir Luell Asp Glu Lell Glu Lell Phe Thir Asp Ala Wall Glu Arg Trp Asp 355 360 365

Wall Asn Ala Ile Asn Asp Lell Pro Asp Met Lys Lell Phe Luell 37 O 375

Ala Luell Tyr Asn Thir Ile Asn Glu Ile Ala Tyr Asp Asn Luell Asp 385 390 395 4 OO

Gly Glu Asn Ile Lell Pro Luell Thir Ala Trp Ala Asp Luell 4 OS 415

Asn Ala Phe Lell Glin Glu Ala Lys Trp Luell Tyr Asn Lys Ser Thir 425 43 O

Pro Thir Phe Asp Asp Phe Gly Asn Ala Trp Ser Ser Ser Gly 435 44 O 445

Pro Luell Glin Luell Ile Phe Ala Phe Ala Wall Wall Glin Asn Ile 450 45.5 460

Lys Glu Glu Ile Glu Asn Lell Glin Tyr His Asp Ile Ile Ser Arg 465 470 48O

Pro Ser His Ile Phe Arg Lell Asn Asp Luell Ala Ser Ala Ser Ala 485 490 495

Glu Ile Ala Arg Gly Glu Thr Ala Asn Ser Val Ser Met Arg
            500                 505                 510

Thr Lys Gly Ile Ser Glu Glu Leu Ala Thr Glu Ser Val Met Asn Leu
        515                 520                 525

Ile Asp Glu Thr Trp Lys Lys Met Asn Lys Glu Lys Leu Gly Gly Ser
    530                 535                 540

Leu Phe Ala Lys Pro Phe Val Glu Thr Ala Ile Asn Leu Ala Arg Gln
545                 550                 555                 560

Ser His Cys Thr Tyr His Asn Gly Asp Ala His Thr Ser Pro Asp Glu
                565                 570                 575

Leu Thr Arg Lys Arg Val Leu Ser Val Ile Thr Glu Pro Ile Leu Pro
            580                 585                 590

Phe Glu Arg
        595

<210> SEQ ID NO 10
<211> LENGTH: 595
<212> TYPE: PRT
<213> ORGANISM: Populus nigra
<220> FEATURE:
<223> OTHER INFORMATION: Lombardy poplar isoprene synthase (IspS)

<400> SEQUENCE: 10

Met Ala Thr Glu Leu Leu Cys Leu His Arg Pro Ile Ser Leu Thr His
1               5                   10                  15

Lys Leu Phe Arg Asn Pro Leu Pro Lys Val Ile Gln Ala Thr Pro Leu
            20                  25                  30

Thr Leu Lys Leu Arg Cys Ser Val Ser Thr Glu Asn Val Ser Phe Thr
        35                  40                  45

Glu Thr Glu Thr Glu Thr Arg Arg Ser Ala Asn Tyr Glu Pro Asn Ser
    50                  55                  60

Trp Asp Tyr Asp Tyr Leu Leu Ser Ser Asp Thr Asp Glu Ser Ile Glu
65                  70                  75                  80

Val Tyr Lys Asp Lys Ala Lys Lys Leu Glu Ala Glu Val Arg Arg Glu
                85                  90                  95

Ile Asn Asn Glu Lys Ala Glu Phe Leu Thr Leu Pro Glu Leu Ile Asp
            100                 105                 110

Asn Val Gln Arg Leu Gly Leu Gly Tyr Arg Phe Glu Ser Asp Ile Arg
        115                 120                 125

Arg Ala Leu Asp Arg Phe Val Ser Ser Gly Gly Phe Asp Ala Val Thr
    130                 135                 140

Lys Thr Ser Leu His Ala Thr Ala Leu Ser Phe Arg Leu Leu Arg Gln
145                 150                 155                 160

His Gly Phe Glu Val Ser Gln Glu Ala Phe Ser Gly Phe Lys Asp Gln
                165                 170                 175

Asn Gly Asn Phe Leu Lys Asn Leu Lys Glu Asp Ile Lys Ala Ile Leu
            180                 185                 190

Ser Leu Tyr Glu Ala Ser Phe Leu Ala Leu Glu Gly Glu Asn Ile Leu
        195                 200                 205

Asp Glu Ala Lys Val Phe Ala Ile Ser His Leu Lys Glu Leu Ser Glu
    210                 215                 220

Glu Lys Ile Gly Lys Asp Leu Ala Glu Gln Val Asn His Ala Leu Glu
225                 230                 235                 240

Leu Pro Leu His Arg Arg Thr Gln Arg Leu Glu Ala Val Trp Ser Ile
                245                 250                 255

Glu Ala Tyr Arg Lys Lys Glu Asp Ala Asp Gln Val Leu Leu Glu Leu
            260                 265                 270

Ala Ile Leu Asp Tyr Asn Met Ile Gln Ser Val Tyr Gln Arg Asp Leu
        275                 280                 285

Arg Glu Thr Ser Arg Trp Trp Arg Arg Val Gly Leu Ala Thr Lys Leu
    290                 295                 300

His Phe Ala Arg Asp Arg Leu Ile Glu Ser Phe Tyr Trp Ala Val Gly
305                 310                 315                 320

Val Ala Phe Glu Pro Gln Tyr Ser Asp Cys Arg Asn Ser Val Ala Lys
                325                 330                 335

Met Phe Ser Phe Val Thr Ile Ile Asp Asp Ile Tyr Asp Val Tyr Gly
            340                 345                 350

Thr Leu Asp Glu Leu Glu Leu Phe Thr Asp Ala Val Glu Arg Trp Asp
        355                 360                 365

Val Asn Ala Ile Asp Asp Leu Pro Asp Tyr Met Lys Leu Cys Phe Leu
    370                 375                 380

Ala Leu Tyr Asn Thr Ile Asn Glu Ile Ala Tyr Asp Asn Leu Lys Asp
385                 390                 395                 400

Lys Gly Glu Asn Ile Leu Pro Tyr Leu Thr Lys Ala Trp Ala Asp Leu
                405                 410                 415

Cys Asn Ala Phe Leu Gln Glu Ala Lys Trp Leu Tyr Asn Lys Ser Thr
            420                 425                 430

Pro Thr Phe Asp Glu Tyr Phe Gly Asn Ala Trp Lys Ser Ser Ser Gly
        435                 440                 445

Pro Leu Gln Leu Val Phe Ala Tyr Phe Ala Val Val Gln Asn Ile Lys
    450                 455                 460

Lys Glu Glu Ile Asp Asn Leu Gln Lys Tyr His Asp Ile Ile Ser Arg
465                 470                 475                 480

Pro Ser His Ile Phe Arg Leu Cys Asn Asp Leu Ala Ser Ala Ser Ala
                485                 490                 495

Glu Ile Ala Arg Gly Glu Thr Ala Asn Ser Val Ser Cys Tyr Met Arg
            500                 505                 510

Thr Lys Gly Ile Ser Glu Glu Leu Ala Thr Glu Ser Val Met Asn Leu
        515                 520                 525

Ile Asp Glu Thr Trp Lys Lys Met Asn Lys Glu Lys Leu Gly Gly Ser
    530                 535                 540

Leu Phe Ala Lys Pro Phe Val Glu Thr Ala Ile Asn Leu Ala Arg Gln
545                 550                 555                 560

Ser His Cys Thr Tyr His Asn Gly Asp Ala His Thr Ser Pro Asp Glu
                565                 570                 575

Leu Thr Arg Lys Arg Val Leu Ser Val Ile Thr Glu Pro Ile Leu Pro
            580                 585                 590

Phe Glu Arg
        595

<210> SEQ ID NO 11
<211> LENGTH: 7
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: beginning sequence of white poplar isoprene synthase (IspS) mature protein with chloroplast transit peptide omitted

<400> SEQUENCE: 11

Cys Ser Val Ser Thr Glu Asn
1               5

<210> SEQ ID NO 12
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: PCR amplification primer IspS F NdeI

<400> SEQUENCE: 12

Ctgggt cata tigaagctic acgaa 25

<210> SEQ ID NO 13
<211> LENGTH: 29
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: PCR amplification primer IspS R HindIII

<400> SEQUENCE: 13

atggaaaacc tgaagctttt aacgttcaa 29

What is claimed is:

1. A method of producing isoprene hydrocarbons in a green microalgae, the method comprising:
   introducing an expression cassette that comprises a nucleic acid encoding an isoprene synthase that comprises amino acid residues 53-595 of SEQ ID NO:2 into the chloroplast or nuclear genome of the green microalgae, wherein the nucleic acid is codon-optimized for expression in the green microalgae and comprises at least 95% identity to the isoprene synthase coding region of SEQ ID NO:3; and
   culturing the green microalgae under conditions in which the nucleic acid encoding isoprene synthase is expressed and produces isoprene.

2. The method of claim 1, wherein the green microalgae is selected from the group consisting of Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris and Dunaliella salina.

3. The method of claim 1, wherein the nucleic acid comprises the isoprene synthase coding region of SEQ ID NO:3.

4. A method of producing isoprene hydrocarbons in a green microalgae that comprises a heterologous nucleic acid in the chloroplast or nuclear genome that encodes isoprene synthase, the method comprising:
   mass-culturing a green microalgae in an enclosed bioreactor under conditions in which the isoprene synthase gene is expressed and produces isoprene,
   wherein the green microalgae comprises a heterologous nucleic acid encoding an isoprene synthase in the chloroplast or nuclear genome, wherein the heterologous nucleic acid encodes amino acid residues 53-595 of SEQ ID NO:2 and the nucleic acid: (1) is codon-optimized for expression in the green microalgae and (2) comprises at least 95% identity to the isoprene synthase coding region of SEQ ID NO:3; and
   harvesting volatile isoprene hydrocarbons produced by the green microalgae.

5. The method of claim 4, wherein the green microalgae is selected from the group consisting of Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris and Dunaliella salina.

6. The method of claim 4, wherein the heterologous nucleic acid comprises the isoprene synthase coding region of SEQ ID NO:3.

7. The method of claim 1, wherein the nucleic acid comprises at least 97% identity to the isoprene synthase coding region of SEQ ID NO:3.

8. The method of claim 4, wherein the nucleic acid comprises at least 97% identity to the isoprene synthase coding region of SEQ ID NO:3.
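The claims recite thresholds of at least 95% and at least 97% identity to the isoprene synthase coding region of SEQ ID NO:3. This section does not state how identity is to be computed, so the following is only a simplified sketch under stated assumptions: the two sequences have already been aligned to equal length, gaps are written as "-", and identity is the fraction of alignment columns that match. The function name percent_identity and both toy sequences are hypothetical and are not taken from the listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap characters.

    Assumes the inputs are pre-aligned strings of equal length with '-'
    marking gaps. Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy 20-nucleotide example (invented): one substitution at position 10.
reference = "atggaagctcgtcgttctgc"
variant = "atggaagcttgtcgttctgc"

identity = percent_identity(reference, variant)
print(identity, identity >= 95.0, identity >= 97.0)
```

In practice the alignment method and the region compared determine the result, so a calculation like this only illustrates the arithmetic behind a percentage threshold.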