ABSTRACT

MURPHREE, COLIN A. Exploration of Nitrogen Recycling in Algal Cultivation & Novel Methods of Plastid Genome Modification. (Under the direction of Heike Sederoff).

Two separate ideas are condensed within the text of this document. The first three chapters address the challenge of nitrogen fertilizer use in microalgal production systems for biofuel production, and the various means by which nitrogen can be recovered or recycled. The genus Dunaliella is presented as a model system for investigating the mechanisms of organic nitrogen use. We determined that, among amino acids, only histidine could be transported across the cell membrane. In contrast, other forms of amino acids were intrinsically unstable and produced ammonium. This finding yielded a study in which the transcriptional response of the Dunaliella species D. viridis to ammonium supplementation was examined. This study shows that Dunaliella responds to ammonium released from glutamine quickly compared to KNO3 fertilizer positive control and nitrogen starvation negative control conditions. The fourth and final chapter presents a framework for developing a new set of plastid engineering tools based on sequence specific , retroviral features, viroid fusions, and a condensed Non-Homologous End Joining pathway for integration of a heterologous transgene into plastid chromosomes and for plastid chromosome editing. Plastid engineering tools were generated in transgenic Arabidopsis thaliana, and recombinant reverse transcriptase produced by transgenic A. thaliana was shown to possess genuine reverse transcriptase activity. This work suggested that the effort should be extended to Nicotiana benthamiana, which will be utilized in the future to troubleshoot technical aspects of novel plastid engineering mechanisms.

© Copyright 2018 by Colin A. Murphree

All Rights Reserved

Exploration of Nitrogen Recycling in Algal Cultivation & Novel Methods of Plastid Genome Modification

by Colin A. Murphree

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Plant Biology

Raleigh, North Carolina

2018

APPROVED BY:

Dr. Amy Grunden

Dr. Jose Alonso

Dr. Dominique Robertson

Dr. Colleen Doherty

Dr. Heike Sederoff, Committee Chair

BIOGRAPHY

Colin began his scientific career via the National Science Foundation’s Research Experience for Undergraduate Students program (NSF REU). In 2008 he was accepted into and attended the NSF REU at University of California, Riverside, studying plant pathology under Dr. A.L.N. Rao. In 2009 he attended a second NSF REU at North Carolina State University with Dr. Dominique Robertson. During this time, he learned many essential concepts and techniques in molecular biology and produced Gossypium hirsutum with lowered transcript levels of the cell wall expansins using virus-induced gene silencing (VIGS). Colin graduated from Transylvania University with a B.A. in Biology in 2010 and worked for 2 years as a technician under Dr. Luke Moe at the University of Kentucky. During this time, he managed a project in collaboration with a bioethanol consulting company to identify mechanisms by which bacteria that contaminate ethanol fermentations accumulated antibiotic resistance. The outcomes of this project were two publications and an interest in learning about emerging technologies in biofuel production. This is how he came to begin a Ph.D. in Dr. Heike Sederoff’s lab working on the sustainability of algal biofuels. After publication of this work, Colin transitioned to developing new plastid genome modification technologies.

ACKNOWLEDGMENTS

Multiple funding sources were instrumental in allowing this work to proceed. I received assistance from the following:

NCSU Chancellor's Innovation Fund

NSF EFRI Award # 1332341

DOE BER award DE-SC0018269

NCSU Biotechnology Teaching Assistantship

A great many people were instrumental in producing the material here. I did not do this alone, and so I would like to thank the following people:

Danielle Young

Sihui Ni

Soundarya Srirangan

Melodi Charles

Nicole Khoshnoodie

Jacob Dums

Naresh Vasani

Jeff Macdonald

Andrey Tikunov

Guillaume Pilot

Chengsong Zhao

Siddharth Jain

Heike Sederoff

Jose Alonso

Amy Grunden

Dominique Robertson

Colleen Doherty

TABLE OF CONTENTS

LIST OF TABLES ...... x

LIST OF FIGURES ...... xii

CHAPTER 1: Nitrogen Recycling as a Framework for Addressing Nitrogen Fertilizer Use in Algal Cultivation ...... 1

Abstract ...... 1

1.1 Algal cultivation is industrial agriculture ...... 1

1.2 Pollution is an avoidable consequence of algal cultivation ...... 2

1.3 Technologies enabling nitrogen recycling ...... 4

1.4 Nitrogen metabolism and scavenging in chlorophyte algae ...... 13

1.5 Dunaliella spp. biofuel production as a system for modeling nitrogen recycling ...... 25

Conclusion ...... 26

References ...... 27

CHAPTER 2: Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth ...... 44

Contributions Statement: ...... 44

Publication “Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth” ...... 45

CHAPTER 3: Dunaliella viridis Metabolism and Transcription show Partial Nitrogen Starvation when Grown on Glutamine ...... 63

Contributions statement ...... 63

Abstract ...... 64

Introduction ...... 66

Material and Methods ...... 69

Growth conditions ...... 69

Cell density/diameter/pH measurements ...... 70

Chlorophyll ...... 70

Soluble protein ...... 71

Total carbohydrate and starch ...... 71

Neutral lipids ...... 72

Reference transcriptome annotation ...... 73

RNA extraction, sequencing, and quality control ...... 73

Mapping reads ...... 74

Differential expression (DE) analysis ...... 75

Pathway analysis ...... 76

Results ...... 77

Glutamine decreases cell diameter, but not cell density ...... 77

Active cell division resulted in higher media pH...... 78

Nitrogen starvation and glutamine result in less chlorophyll and soluble protein than nitrate ...... 78

Glutamine supplementation increases carbohydrate content compared to nitrate controls ...... 79

Nitrogen starvation results in significant transcriptional changes ...... 80

Nitrogen assimilation genes are DE in all nitrogen conditions over time ...... 81

Lipid and triacylglycerol synthesis gene expression in nitrogen starvation ...... 83

Central carbon metabolism shows high transcriptional activity around pyruvate ...... 84

Transcriptional shift by glutamine at 48 hours ...... 85

Discussion ...... 104

Glutamine supplemented cultures undergo phenotypic and transcriptomic changes comparable to nitrogen starvation ...... 104

Dunaliella viridis responds to the presence of NH4+ ...... 106

Starch synthesis is higher than lipid synthesis during nitrogen starvation in D. viridis ...... 107

pH differences are associated with changes in the transcription patterns of batch cultures ...... 111

The carbon concentrating mechanism (CCM) is perturbed due to pH changes and nitrogen starvation ...... 112

Conclusion ...... 114

Acknowledgments...... 115

Author contributions statement ...... 115

Conflict of interest ...... 116

References ...... 117

Supplementary Figures ...... 124

Supplementary Tables ...... 125

CHAPTER 4: Novel Methods of Plastid Genome Modification ...... 133

Abstract ...... 133

Introduction ...... 133

Applications, methods, and limitations of Plastid Engineering ...... 133

Plastid DNA repair and implications for genome modification ...... 140

Novel plastid modification tools ...... 142

Novel methods of plastid genome modification ...... 153

Implementation of systems in plastid using nuclear transgene expression ...... 155

Materials and Methods ...... 162

Results ...... 172

Design and construction of vectors ...... 172

Replication of the use of ELVd as a plastid localizing UTR ...... 178

Selection and genotyping ...... 179

Reverse transcriptase produced in planta is a functional enzyme...... 188

Combination of PTEC and RT in A. thaliana ...... 188

N. benthamiana callus ...... 193

Discussion ...... 194

The use of ELVd as a 5’UTR conferring chloroplast localization and transgene expression was replicated in N. benthamiana ...... 195

PTEC expression may reduce seed viability ...... 196

Reverse transcriptase is functional in plants ...... 197

RT + PTEC technology does not work in A. thaliana ...... 198

Redesign of the system and implementation in N. benthamiana ...... 199

Conclusion ...... 201

Prospects ...... 201

References ...... 208

APPENDICES ...... 223

APPENDIX A: Methods and compositions for modification of plastid genomes ...... 224

APPENDIX B: Vector sequence and synthetic constructs ...... 231

APPENDIX C: Primers...... 258

APPENDIX D: Recipes ...... 259

APPENDIX E: Seed viability of PTECv3 transgenics ...... 261

APPENDIX F: Screening of lines carrying PTEC and RT transgenes ...... 262

LIST OF TABLES

Chapter 1

Table 1.1 End to end pollution resulting from fertilizer use ...... 5

Chapter 2

Table 2.1 Variation in the ability of Dunaliella spp. to use urea as a nitrogen source ...... 51

Table 2.2 Growth of four Dunaliella strains supplemented with Gln, Cys, His, and Trp ...... 52

Table 2.3 Amino acid content of D. viridis ...... 56

Chapter 3

Table 3.1 Summary of gene transcription patterns for differentially expressed genes at 51 hours between glutamine and nitrate ...... 103

Supplementary Table 1 Shared differentially expressed genes between nitrogen conditions ...... 125

Supplementary Table 2 Shared differentially expressed genes between time points ...... 126

Supplementary Table 3 Expression data for nitrogen assimilation genes ...... 127

Supplementary Table 4 Fatty acid synthesis gene expression under nitrogen starvation ...... 128

Supplementary Table 5 Triacylglycerol metabolism gene expression under nitrogen starvation ...... 129

Supplementary Table 6 Carbonic anhydrase gene expression in comparison to the inoculum ...... 130

Supplementary Table 7 Carbon metabolism gene expression under nitrogen starvation at 51 hours ...... 131

Chapter 4

Table B1 Vectors used in this study ...... 231

Table C1 Primers used in this study ...... 258

Table E1 Comparison of transgenic seed viability ...... 261

Table F1 Transgene screening summary ...... 265

LIST OF FIGURES

Chapter 1

Figure 1.1 Global nitrogen emissions...... 5

Figure 1.2 Proposed readjustment of the nitrogen cycle in biofuel production ...... 6

Figure 1.3 Reaction pathways of hydrothermal biomass degradation in the presence of proteins ...... 9

Figure 1.4 Growth of Dunaliella spp. using urea, purines, and acetamide as a sole nitrogen source ...... 22

Chapter 2

Figure 2.1 Diagram of a microalgae production system featuring in vitro biomass recycling ...... 47

Figure 2.2 Growth of Dunaliella viridis supplemented with nitrogen substrates ...... 50

Figure 2.3 Growth of Dunaliella spp. using NH4Cl as the sole nitrogen source ...... 51

Figure 2.4 Metabolite productivity and content of D. salina and D. viridis ...... 53

Figure 2.5 Uptake of radiolabeled amino acids by D. viridis ...... 54

Figure 2.6 NH4+ released from amino acids in sterile media ...... 54

Figure 2.7 Spectral analysis of spent growth medium containing glutamine ...... 55

Figure 2.8 Pathways of nitrogen acquisition from amino acids by Dunaliella spp. ...... 57

Figure S1 Growth of D. viridis dumsii on ribonucleosides and nucleobases as sole N sources ...... 61

Chapter 3

Figure 3.1 Experimental and sampling design ...... 87

Figure 3.2 Growth on glutamine results in increased cell density over nitrate, but the same cell diameter as nitrogen starved cells ...... 88

Figure 3.3 Growth on glutamine results in a smaller increase in pH in comparison to nitrate ...... 89

Figure 3.4 Chlorophyll content in the glutamine culture reflects the nitrate culture until 51 hours where it decreases ...... 90

Figure 3.5 Soluble protein content in the glutamine culture reflects the nitrate culture until 51 hours where it decreases to the nitrogen starved level ...... 91

Figure 3.6 Total carbohydrate content in the glutamine culture reflects the nitrate culture until 48 hours where it increases ...... 92

Figure 3.7 Nitrogen starvation results in neutral lipid accumulation ...... 93

Figure 3.8 Area proportional Euler diagram of the rapidly differentially expressed genes after inoculation ...... 94

Figure 3.9 Area proportional Euler diagrams of differentially expressed genes by nitrogen source ...... 95

Figure 3.10 Area proportional Euler diagrams of differentially expressed genes over time ...... 96

Figure 3.11 Area proportional Euler diagrams of differentially expressed genes scaled to total number of unique transcripts in each comparison ...... 97

Figure 3.12 Differential expression in nitrogen assimilation genes in different nitrogen conditions over time ...... 98

Figure 3.13 Expression of fatty acid synthesis genes during nitrogen starvation ...... 99

Figure 3.14 Expression of triacylglycerol synthesis genes during nitrogen starvation ...... 100

Figure 3.15 Differential expression of core carbon metabolism genes under nitrogen starvation ...... 102

Supplementary Figure 1 Starch content over time on different nitrogen sources ...... 124

Chapter 4

Figure 4.1 Current Approaches to plastid genome modification ...... 139

Figure 4.2 Prokaryotic DNA repair by homologous recombination and relevance to plastid transformation ...... 143

Figure 4.3 Plastid localization of Avsunviroidae RNA ...... 146

Figure 4.4 Reverse transcription of retroviral genomic RNA into double stranded proviral DNA via the strong stop strand transfer model ...... 148

Figure 4.5 Application of CRISPR-Cas9 systems to generate sequence specific double stranded breaks ...... 150

Figure 4.6 Unique sites in the plastid genome of N. tabacum ...... 153

Figure 4.7 Chloroplast genome engineering ...... 156

Figure 4.8 Design of genetic constructs used in chloroplast genome engineering ...... 174

Figure 4.9 Demonstration of ELVd 5’UTR use in N. benthamiana leaf tissue ...... 180

Figure 4.10 Selection and propagation of transgenic Lines ...... 183

Figure 4.11 Genotyping of transgenic A. thaliana lines ...... 185

Figure 4.12 Transient RNA expression ...... 187

Figure 4.13 In planta protein expression of rbcSCTP-M-MLV reverse transcriptase and rbcSCTP-Cas9 ...... 189

Figure 4.14 Enhanced detection of M-MLV reverse transcriptase in transgenic A. thaliana lines ...... 190

Figure 4.15 Immunoprecipitation of M-MLV reverse transcriptase from A. thaliana rosette leaf tissue ...... 191

Figure 4.16 In vitro activity assays of MMLV reverse transcriptase produced in planta ...... 192

Figure 4.17 Attempted identification of integrated chloroplast transgenes ...... 193

Figure 4.18 Transgenic Nicotiana benthamiana callus ...... 194

Figure 4.19 Comparison of CRISPR/Cas9 and CRISPR/Cpf1 systems ...... 205

Figure 4.20 Structure of plant LTR retrotransposons ...... 207

Figure B1 cpCas9 ...... 232

Figure B2 cpNCP10 ...... 233

Figure B3 cpMMLV reverse transcriptase ...... 234

Figure B4 PTEC 1 - N. benthamiana single integration site ...... 236

Figure B5 PTEC 2 - N. benthamiana double integration sites...... 239

Figure B6 PTEC 3 - A. thaliana single integration site ...... 242

Figure B7 PTEC 4 – A. thaliana double integration sites ...... 245

Figure B8 PTEC 5 – N. benthamiana single integration site, First Strand Synthesis Only ...... 248

Figure B9 5’ ELVd UTR-sgRNA ...... 252

Figure B10 3’ ELVd UTR-sgRNA ...... 253

Figure B11 5’ELVd UTR-eGFP ...... 254

Figure B12 5’EIF4E UTR-eGFP ...... 255

Figure B13 cpeGFP ...... 256

Figure B14 Chloroplast peptide cloning vector gene fragment ...... 257

Figure F1 Fluorescence screen and planting of lines carrying PTEC and reverse transcriptase transgenes ...... 262

Figure F2 PTEC screen...... 263

Figure F3 Reverse transcriptase screen ...... 264

Figure F4 Secondary PCR screens for integration ...... 264

CHAPTER 1: Nitrogen Recycling as a Framework for Addressing Nitrogen Fertilizer Use in Algal Cultivation

Abstract

Nitrogen fertilizer is an expensive and polluting input in agriculture and algae production. Synthetic nitrogen fertilizer is created by the energy intensive Haber-Bosch process, which fixes atmospheric N2 using high temperature, high pressure, and H2 derived from fossil fuels. However, the environmental impact of fertilizer can be reduced using nutrient recycling. This chapter addresses the challenge of nitrogen fertilizer use in microalgal production systems, and the various means by which nitrogen can be recovered or recycled. The biological mechanisms of nitrogen utilization by algae are reviewed. The genus Dunaliella is presented as a model system for investigating the mechanisms of organic nitrogen use.

1.1 Algal Cultivation is Industrial Agriculture

Agriculture is the basis of human civilization. Humans have been planting crops and utilizing them as resources to produce food, feed for livestock, building materials, medicine, and clothing for thousands of years. However, within the past 200 years the practice of agriculture has continually changed: from the mechanization of labor that came about via the industrial revolution, to the introduction of synthetic fertilizer that allowed a massive population explosion, to the rapid development of breeding after the discovery of genetics by Gregor Mendel, and the eventual application of modern breeding techniques employing mutagenesis and genetic engineering.

Photosynthetic microalgae have been proposed as a 21st century platform for agriculture (Rasala and Mayfield 2015; Guarnieri and Pienkos 2015; Christenson and Sims 2011). Like crops, these organisms can utilize solar energy to produce many specialized resources such as biodiesel derived from triacylglycerols (TAGs) (Guschina and Harwood 2006; Amin 2009), metabolites like beta-carotene and astaxanthin (Ben-Amotz 1995; Kobayashi et al. 1991), and recombinant proteins like vaccines (Specht and Mayfield 2014). However, compared to traditional crops these organisms are typically more productive per unit area (Cooney et al. 2009). Arable land is unnecessary for algal growth; these organisms can potentially be grown in areas as barren as arid deserts, polar ice caps, and the sides of buildings (Pagliolico et al. 2017; Moody et al. 2014; Franz et al. 2012). Another important advantage of photosynthetic microalgae is that many species do not require fresh water, meaning that cultivation could instead employ waste, brackish, or even oceanic water (Converti et al. 2009; R. W. Davis et al. 2015).

1.2 Pollution is an Avoidable Consequence of Algal Cultivation

All life depends on nitrogen, but only 2% of nitrogen contained on earth is accessible to life (Mackenzie 2011). Nitrogen is an essential atom in most of the components that are considered fundamental to life: protein, DNA, RNA, amino acids, and nucleic acids. The atoms of nitrogen in these molecules are ultimately derived from gaseous N2, which comprises 78% of the atmosphere (Erisman et al. 2008). In the absence of human activity, N2 becomes accessible to living organisms primarily by the action of nitrogen-fixing organisms, which reduce this gas into ammonia (NH3) (Franche et al. 2009). A marginal amount of nitrogen is also introduced into the biosphere by lightning in the troposphere, which produces NOx species (Lelieveld and Dentener 2000; Ehhalt et al. 2001). Both NH3 and NOx are incorporated into organisms via the GS-GOGAT nitrogen assimilation pathway (Miflin and Lea 1976), and ultimately released back into the atmosphere as N2 via the action of denitrifying bacteria (Galloway et al. 2004). The capture and release of nitrogen in this way is referred to as the nitrogen cycle.

In 1908 the nitrogen cycle accelerated with the invention of synthetic fertilizer (Smil 2001). Fritz Haber and Carl Bosch developed a method to produce ammonia (NH3) from N2 and H2 gas using an iron catalyst, high pressure (150-300 bar), and high temperature (500 °C) (“Process of Producing Ammonia.” 1908). An unfortunate side effect of this process is carbon pollution. For every 1 kg of ammonia produced, 57 MJ of energy is required and 2.5 kg of CO2 is released into the atmosphere (Hesq et al. 2010). The process itself also generates CO2, as the H2 is either derived directly from burning fossil fuels like coal, naphtha, and natural gas, or by the electrolysis of water, which is itself an energy requiring process (Pach 2007). The current rate of synthetic nitrogen fertilizer usage in agriculture requires approximately 2.5% of the world’s energy supply yearly (Huo and Wernick 2012), and is responsible for 0.93% of global GHG emissions (“Fertilizers, Climate Change and Enhancing Agricultural Productivity Sustainably” 2009). Because nearly half the world’s population is dependent on food derived from synthetic nitrogen, the pursuit of biofuels is expected to exacerbate the problems posed by the energetic needs of synthetic nitrogen production (Erisman et al. 2010).
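The per-kilogram figures quoted above can be scaled to a fertilizer demand of any size. The following is a minimal sketch of that arithmetic, assuming that the 57 MJ and 2.5 kg CO2 values scale linearly with ammonia mass; the conversion from nitrogen mass to ammonia mass uses standard atomic masses. The function and constant names are illustrative and do not come from the cited sources.

```python
# Figures quoted in the text, per kg of NH3 produced.
ENERGY_MJ_PER_KG_NH3 = 57.0
CO2_KG_PER_KG_NH3 = 2.5

# Standard atomic masses (g/mol), used to find the N mass fraction of NH3.
N_MASS = 14.007
H_MASS = 1.008
N_FRACTION_IN_NH3 = N_MASS / (N_MASS + 3 * H_MASS)


def haber_bosch_footprint(kg_nitrogen):
    """Return (kg NH3, MJ energy, kg CO2) needed to supply kg_nitrogen of N.

    Assumes the quoted per-kg values scale linearly (an assumption of this
    sketch, not a claim made by the sources).
    """
    kg_nh3 = kg_nitrogen / N_FRACTION_IN_NH3
    energy_mj = kg_nh3 * ENERGY_MJ_PER_KG_NH3
    co2_kg = kg_nh3 * CO2_KG_PER_KG_NH3
    return kg_nh3, energy_mj, co2_kg


if __name__ == "__main__":
    nh3, energy, co2 = haber_bosch_footprint(100.0)
    print(f"NH3: {nh3:.1f} kg, energy: {energy:.0f} MJ, CO2: {co2:.0f} kg")
```

Because ammonia is roughly 82% nitrogen by mass, the cost per kilogram of nitrogen delivered is somewhat higher than the cost per kilogram of ammonia.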

Complicating the use of synthetic fertilizers is that their application to agriculture itself generates environmental pollution and greenhouse gas emissions (Mosier et al. 1998; Shcherbak et al. 2014; Reay et al. 2012; Snyder et al. 2009) (Figures 1.1, 1.2). Excess fertilizer applied to crops washes into waterways and causes eutrophication, inducing algal blooms that strip oxygen out of the environment and kill other organisms (Howarth 2008). Furthermore, when organic matter whose nitrogen is essentially derived from synthetic fertilizer decomposes, it releases that nitrogen in the form of N2O and NOx. Both are greenhouse gases, and N2O has a global warming potential nearly 300 times that of CO2 (Crutzen et al. 2016).

The paradigm of nitrogen fertilizer application and pollution applies equally to algal cultivation and the products derived from it. Microalgae utilize the same fertilizer sources, and therefore produce the same kinds of waste as crops. However, there are now attempts to address the impact of pollution from fertilizer via nitrogen recycling methods (Wang and Brown 2013; Lardon et al. 2009; Collet et al. 2011; Huo and Wernick 2012). Key to this concept is the treatment of waste biomass as a source of fertilizer that remains within the production system. This approach has the benefit of creating a closed system, wherein fertilizer inputs and nitrogen waste are minimized. Nitrogen recycling would be particularly valuable in the production of biofuels, as these products do not contain nitrogen, and therefore could be extracted without any nitrogen loss.

1.3 Technologies Enabling Nitrogen Recycling

Nitrogen is not typically recovered in microalgal production systems; instead, the nitrogen-rich biomass is sold as an enriched feedstock for other organisms (Crutzen et al. 2016; Erisman et al. 2008; Bleakley and Hayes 2017). This is done because selling protein rich algal mass generates revenue that increases the financial viability of using algae to produce biofuel. The practice of selling this mass has the consequence of most nitrogen leaving the algae production system and eventually returning to the atmosphere. The other logical consequence of this paradigm is that fertilizer must be synthesized and bought every time algae are produced. While functional, this system is not optimal from an energetic or monetary perspective. An optimal system would instead recycle as much nitrogen as possible within the production system (Figure 1.2). The methods explored to achieve this system can be broadly split into mechanisms that are either thermochemical or biological.

Table 1.1 | End to End Pollution resulting from fertilizer use. Figure modified from (S. Wood and Cowie 2004). CO2 emissions from the production of ammonia by the Haber-Bosch process are shown.

Figure 1.1 | Global Nitrogen Emissions. Figure modified from (Fowler et al. 2013). Anthropogenic and natural emissions of reactive nitrogen are shown (units Tg yr−1). Black values represent total emission while red values indicate only the anthropogenic contribution.

Figure 1.2 | Proposed Readjustment of the Nitrogen cycle in biofuel production. Figure modified from (Huo and Wernick 2012). A. Biofuel production from plant or algal feedstocks generates both carbon and nitrogen pollution. B. Recycling nitrogen biomass closes nitrogen and carbon loops, reducing the overall environmental impact of biofuel production.

Thermochemical Conversion of Biomass

Thermochemical conversion of biomass consists of deliberately engineered processes that involve chemical, thermal, pressure, and light treatments to convert whole biomass into other forms. These methods generally have the advantage that they simultaneously generate a desired product and liberate nutrients. However, many of these methods are inefficient means of recovering nitrogen because they tend to volatilize nitrogen mass into inert N2, create harmful nitrogen pollution either by themselves or via the combustion of their products, or generate undesired or unusable nitrogen products.

Hydrothermal Liquefaction

Hydrothermal liquefaction (HTL) employs high temperatures (250 to 374 °C) and pressure (4 to 22 MPa) to hydrolyze biomass, including algae, into a form known as biocrude oil (Canabarro et al. 2013; Gollakota et al. 2018; Mu et al. 2017; López Barreiro et al. 2013; Shuping et al. 2010; Brown et al. 2010; Akhtar and Amin 2011; Biller and Ross 2011; Toor et al. 2011; Cho and Park 2018). Compared to other thermochemical conversion methods it is advantageous because it can be used without dewatering biomass and produces fewer gaseous byproducts than combustion methods like pyrolysis, incineration, or gasification. This process also creates a nutrient rich aqueous waste stream containing ammonia derived from the input biomass. Generally, about 40-75% of input nitrogen can be recovered in this way (Garcia Alba et al. 2013; Chen et al. 2017; Edmundson et al. 2017; Leng et al. 2018; Biller et al. 2012). As a proof-of-concept, it was shown that microalgae used in this process can undergo 5 cycles of growth and processing with only recovered nutrients serving as fertilizer (Garcia Alba et al. 2013). However, waste stream recovery via this process cannot yet be used indefinitely, because of the buildup of inhibitory compounds and a systematic decrease in micronutrients.
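To give a sense of why incomplete recovery limits repeated recycling, the sketch below compounds a fixed per-cycle recovery fraction over several cycles. It is a deliberately simplified mass balance, assuming a constant recovery fraction in every cycle and no fresh fertilizer input; it does not model the inhibitory compounds or micronutrient losses mentioned above, and the function name is hypothetical.

```python
def nitrogen_remaining(recovery_fraction, cycles):
    """Fraction of the initial nitrogen still in the system after `cycles`
    rounds of growth and processing, with no fresh fertilizer added.

    Simplifying assumption: the same fraction is recovered in every cycle.
    """
    if not 0.0 <= recovery_fraction <= 1.0:
        raise ValueError("recovery_fraction must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return recovery_fraction ** cycles


if __name__ == "__main__":
    # The 40% and 75% bounds are the recovery range quoted in the text.
    for recovery in (0.40, 0.75):
        for cycle in range(1, 6):
            left = nitrogen_remaining(recovery, cycle)
            print(f"recovery={recovery:.2f} cycle={cycle} remaining={left:.3f}")
```

Under these assumptions, even the upper end of the quoted recovery range leaves less than a quarter of the original nitrogen after five cycles, so some make-up fertilizer would be needed to hold productivity constant.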

The biocrude oil produced from HTL is high in nitrogen because of Maillard products generated during thermochemical conversion (Figure 1.3) (Selvaratnam et al. 2015; Garcia Alba et al. 2013; L. Chen et al. 2017; Edmundson et al. 2017; Leng et al. 2018; G. Yu et al. 2011; Biller et al. 2012). As a consequence, biocrude oil is unsuitable as a drop-in fuel: it would produce hazardous and tightly regulated NOx gas waste if combusted, and therefore requires chemical upgrading and refinement to reduce or remove nitrogen content before it can be used as a source of energy (US EPA n.d.). The clear challenges for the implementation of HTL as a recycling system are to reduce the amount of nitrogen remaining in biocrude in a cheap and energetically efficient way, and to increase the proportion of nitrogen in the aqueous waste stream by reducing the amount of nitrogen in the gaseous waste phase.

Rapid Hydrothermal Liquefaction (R-HTL or Flash Hydrolysis)

Flash hydrolysis is a derivative of HTL that utilizes shorter retention times. This procedure generates two distinct product phases, a solid and a liquid. The solid phase is a biocrude intermediate that is enriched in carbon, while the liquid phase contains several small organic nitrogen compounds such as peptides, amino acids, and ammonia. Typically, greater than 60% of input nitrogen can be recovered via this process, and like waste streams from HTL, waste from flash hydrolysis has also been used as a fertilizer for algal cultivation, although some algal species are not amenable to this process (Garcia-Moscoso et al. 2013; Talbot et al. 2016; Barbera et al. 2016, 2017; Bessette et al. 2018; Garcia-Moscoso et al. 2015; S. Kumar et al. 2014).

However, there are two issues with the use of flash hydrolysis. Foremost is that, as with HTL, the solid phase biofuel intermediate contains nitrogen derived from Maillard products that must be removed before it can be used as a fuel source (Bessette et al. 2018). This is particularly an issue because the solid phase biofuel intermediates typically contain much higher proportions of nitrogen than HTL biocrude. The second issue is that nitrogen recovered in the aqueous phase is in the form of organic acids and peptides (Garcia-Moscoso et al. 2013; Talbot et al. 2016; Barbera et al. 2016, 2017), which unlike ammonia and nitrate may or may not be a usable form of nitrogen depending on the species of algae exposed to them.

Figure 1.3 | Reaction pathways of hydrothermal biomass degradation in the presence of proteins. Figure modified from (Toor et al. 2011). Crosslinking of amino groups to sugar molecules sequesters reactive nitrogen in the form of Maillard products. These products become part of biocrude oil, which, if burned, would produce NOx pollution.

Biomass Gasification

Biomass gasification is the process of converting dried biomass into a gas form using several thermochemical conversion steps. This process results in the production of combustible syngas, which is used to provide energy (Anastas and Crabtree 2009). During the gasification process, some of the biomass nitrogen is converted into ammonia, which can be distilled away and reused. However, a great deal of biomass nitrogen remains in char and tar byproducts, or is released in the form of gaseous N2, prussic acid (HCN) and undesirable cyclic nitrogen species (Onwudili et al. 2013; Guan et al. 2012; Stucki et al. 2009; Ajay Kumar, Jones, and Hanna 2009; Goldschmidt et al. 2001).

Compared to HTL techniques, biomass gasification would be simultaneously more polluting and more efficient at recycling algal biomass as a usable form of nitrogen fertilizer. However, a key drawback of gasification is that the input required is dewatered mass. This is essentially an untenable requirement in algal production, as the energy required to concentrate and dewater algae is impractically high (Sharma et al. 2013; Uduman et al. 2010).

Acid Hydrolysis

Acid catalyzed biomass pretreatment, otherwise known as acid hydrolysis, is a process used to fractionate algal biomass before conversion of that mass into a fuel source (Kumar et al. 2009; Harun et al. 2011; Li et al. 2014; Laurens et al. 2015; Choi et al. 2010). This essentially involves high temperature and intense acid treatment of dewatered algal paste to liberate carbohydrates, lipids, and protein rich biomass as separate fractions. The carbohydrates can be separately fermented to produce ethanol, while enriched and relatively nitrogen free lipids can be removed by solvent extraction. This process has the relative advantage among conversion technologies that more carbon is converted into clean fuels. Another advantage of this process is that protein remains relatively intact and enriched, and can be used as a feed source or a substrate for anaerobic digestion.

The obvious issues are that the capacity of this system to recycle nitrogen depends entirely on the fate of the nitrogen rich biomass, and that, like biomass gasification, this process requires a dewatered biomass input. As is the case for flash hydrolysis, the ability of this nitrogen biomass to be reassimilated by the organism used for production is an essential barrier.

Wastewater Recycling

Manmade and animal derived wastewater contains nitrogen, phosphorus, and carbon (Cai et al. 2013). These sources of fertilizer have been previously used to grow cultures of microalgae, which has the added benefit of detoxifying the waste (Abdel-Raouf et al. 2012; Ganeshkumar et al. 2018; Hülsen et al. 2018; Kuo et al. 2015; Toh et al. 1990). Particularly as wastewater grown algae can be further utilized in anaerobic digestion or other conversion methods to produce fuel, obtaining fertilizer via water treatment is an attractive approach (Li et al. 2016; Collet et al. 2011; Cai et al. 2013). However, by itself wastewater treatment does not address the downstream recovery of nitrogen from biomass fuel production. In addition, the drawback of using wastewater as a fertilizer source is that there simply is not enough of it. The 330 km³ of municipal wastewater generated worldwide every year could supply fertilizer to roughly 40 million hectares or 15% of all irrigated lands (Mateo-Sagasta et al. 2015). For comparison, 1540 million hectares of corn would be needed to meet half of the annual fuel need of just the United States (Chisti 2007). Although microalgae are orders of magnitude more productive per unit area, it is estimated that only 3% of annual fuel need could be met using algae fertilized with wastewater (Chisti 2013).
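The scale mismatch described above can be checked with simple arithmetic. The sketch below uses only the figures quoted in this section; the variable names are illustrative, and the two derived values (the implied total irrigated area and the corn-to-wastewater area ratio) are back-calculations made here for orientation, not numbers reported by the cited sources.

```python
# Figures quoted in the text (Mateo-Sagasta et al. 2015; Chisti 2007).
wastewater_fertilized_mha = 40.0         # million ha that 330 km^3 of wastewater could fertilize
share_of_irrigated_land = 0.15           # the same area as a fraction of all irrigated land
corn_area_for_half_us_fuel_mha = 1540.0  # million ha of corn for half of US annual fuel need

# Derived values (illustrative back-calculations, not reported by the sources).
implied_irrigated_mha = wastewater_fertilized_mha / share_of_irrigated_land
corn_to_wastewater_ratio = corn_area_for_half_us_fuel_mha / wastewater_fertilized_mha

print(f"Implied irrigated land: {implied_irrigated_mha:.0f} million ha")
print(f"Corn area vs. wastewater-fertilized area: {corn_to_wastewater_ratio:.1f}x")
```

Under these quoted figures, the corn area needed for half of US fuel demand is well over an order of magnitude larger than the area that global municipal wastewater could fertilize.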

Anaerobic Digestion and Fermentation

Anaerobic digestion is a process that involves feeding algal biomass to microorganisms to produce either methane or ethanol. The key advantages of this process are generally that the methane or alcohol produced can be used to power on site facilities used for growth and harvesting, and that an ammonia rich digestate is produced as a byproduct of digestion (Huo and Wernick 2012; Abdel-Raouf et al. 2012; Collet et al. 2011; Cai et al. 2013; Ward et al. 2014; Cho and Park 2018). Furthermore, it has been shown that a wide variety of algae can be grown on anaerobic digestate derived from waste activated sludge, food waste, algal biomass, manure, sewage, and wastewater (Chen et al. 2018; Ayala-Parra et al. 2017; Pleissner et al. 2013; Cai et al. 2013; Cai et al. 2013; Cho and Park 2018). This integrated recycling concept is known to reduce the overall operating cost of producing combustible products, and may be able to recycle half of the nitrogen in the input biomass (Ward et al. 2014; Chen et al. 2018).

However, anaerobic digestion is of limited use in the growth and recovery of marine algae. It has been shown that salt and the high levels of alkaline earth metals present in saline water are toxic to the microorganisms used for anaerobic digestion, and inhibit the formation of methane (Chen et al. 2008; Zhang et al. 2012; Vergara-Fernández et al. 2008; Lettinga 1995; Parkin and Owen 1986). Furthermore, anaerobic digestion takes place on the order of days to weeks, and must be highly controlled to prevent the buildup of toxic ammonia released from nitrogen dense algae (Chen et al. 2008; Lettinga 1995; Parkin and Owen 1986).

Enzymatic Digestion

In the production of bioethanol, enzymes are used to convert complex starch polymers from corn or sorghum mash into fermentable monomer sugar subunits (Bothast and Schlicher 2005). A similar concept could be applied to proteins and nucleic acids resulting from algal biomass (Chapter 2). This is also similar to digestion, in which proteases cleave proteins by enzymatic hydrolysis (Northrop 1939). Clearly these forms of nitrogen are suitable for bacteria and yeast, and several algae have been shown to use organic nitrogen sources (Ammann and Lynch 1964; Birdsey and Lynch 1962; Cain 1965) (Section 1.4). In principle, this method should have the advantage that algal biomass could be used in its least reduced form, and the nitrogen would be recoverable with little to no energy required. However, the obvious drawback is that it is unknown whether the autotrophic microalgae used for fuel production can use nitrogen biomass to the same extent that specialized heterotrophic organisms like E. coli and yeast can.

1.4 Nitrogen Metabolism and Scavenging in Chlorophyte Algae

To make use of fertilizer or nitrogen recovered from algal production, an alga must have some biological means of retrieving and converting nitrogen mass into a usable form. A majority of microalgae have been shown to be able to assimilate inorganic nitrogen like ammonium and nitrate through the use of specialized transporters and reducing enzymes. However, the ability of microalgae to use organically bound nitrogen like amino acids, nucleobases, and biopolymers like peptides is less common. An illustration of the mechanisms of nitrogen assimilation common to these organisms is provided here with special attention given to those means by which microalgae would be able to recover nitrogen from nitrogen recycling processes.

Assimilation of Inorganic Nitrogen Fertilizer

The most common forms of inorganic nitrogen fertilizer are nitrate (NO3⁻), nitrite (NO2⁻), and ammonia/ammonium (NH3/NH4⁺). These are transported across the cell membrane via transporter proteins and integrated into glutamate using the GS-GOGAT pathway (Miflin and Lea 1976). However, the transport of these ions is generally coupled to the movement of hydrogen ions and/or requires expenditure of cell energy (ATP). Additionally, more oxidized nitrate and nitrite require Nitrate Reductase and Nitrite Reductase enzymes for reduction to ammonium (Joy and Hageman 1966; Solomonson and Barber 1990). Ammonium is the terminal inorganic form of nitrogen used by cells. Ammonium is fixed to carbon molecules either by being combined with glutamate by glutamine synthetase (GS) to yield glutamine, or by the reversible action of glutamate dehydrogenase to yield glutamate (Masclaux-Daubresse et al. 2010; Srivastava and Singh 1987; Miflin and Habash 2002).
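For orientation, the reduction and assimilation steps named in this paragraph can be written as net reactions. These are standard textbook stoichiometries rather than equations taken from the sources cited here, and the electron donors shown are assumptions: depending on the organism and isoform, the reductant may be NADH, NADPH, or reduced ferredoxin (Fd), so the GOGAT step is written with generic reducing equivalents.

```latex
\begin{align*}
\text{Nitrate reductase:}\quad & \mathrm{NO_3^- + NAD(P)H + H^+ \longrightarrow NO_2^- + NAD(P)^+ + H_2O} \\
\text{Nitrite reductase:}\quad & \mathrm{NO_2^- + 6\,Fd_{red} + 8\,H^+ \longrightarrow NH_4^+ + 6\,Fd_{ox} + 2\,H_2O} \\
\text{GS:}\quad & \text{glutamate} + \mathrm{NH_4^+ + ATP} \longrightarrow \text{glutamine} + \mathrm{ADP + P_i} \\
\text{GOGAT:}\quad & \text{glutamine} + \text{2-oxoglutarate} + \mathrm{2\,e^- + 2\,H^+} \longrightarrow 2\ \text{glutamate} \\
\text{GDH:}\quad & \text{2-oxoglutarate} + \mathrm{NH_4^+ + NAD(P)H} \rightleftharpoons \text{glutamate} + \mathrm{NAD(P)^+ + H_2O}
\end{align*}
```

Written this way, the cost difference between nitrogen sources is visible: nitrate requires eight electrons in total to reach ammonium (two at nitrate reductase, six at nitrite reductase), whereas ammonium enters the GS step directly.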


Transport of Nitrogen

Nitrate

Two major families of proteins participate in nitrate transport in algae, Nitrate/Nitrite Porters (NNP) and NRT1/Peptide Transporters (NPF) (Tsay et al. 2007; Forde 2000; Léran et al. 2014). NNPs, also called NRT2s, are usually associated with high affinity uptake and NPFs with low affinity transport. An accessory protein called NAR2 is required for the transport ability of some NNP proteins but has no independent transport capability (Orsel et al. 2006; Zhou et al. 2000; Galván et al. 1996; Quesada et al. 1994). NPFs have broad specificity and are known to take up peptides, amino acids, auxin, and nitrate (Zhou et al. 1998; Jeong et al. 2004; Kanno et al. 2012; Nour-Eldin et al. 2012). Both families are found ubiquitously in algae, although algae that lack a nitrate reductase may also lack the NNP family.
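The distinction between high and low affinity uptake is commonly described with Michaelis-Menten kinetics, in which a high affinity system has a low half-saturation constant (Km) and a low affinity system has a high one. The following is a minimal sketch under that assumption. The function name, the Vmax and Km values, and the pairing of each parameter set with a transporter family are hypothetical and chosen only for illustration; they are not measurements from any alga or from the cited studies.

```python
def uptake_rate(substrate_um, vmax, km_um):
    """Michaelis-Menten uptake rate for a single saturable transport system."""
    return vmax * substrate_um / (km_um + substrate_um)

# Hypothetical parameters in arbitrary rate units (NOT measured values).
HIGH_AFFINITY = {"vmax": 1.0, "km_um": 20.0}     # stands in for an NNP/NRT2-like system
LOW_AFFINITY = {"vmax": 10.0, "km_um": 5000.0}   # stands in for an NPF-like system

for nitrate_um in (10.0, 100.0, 1000.0, 10000.0):
    high = uptake_rate(nitrate_um, **HIGH_AFFINITY)
    low = uptake_rate(nitrate_um, **LOW_AFFINITY)
    dominant = "high affinity" if high > low else "low affinity"
    print(f"{nitrate_um:>8.0f} uM nitrate: high={high:.3f}, low={low:.3f} -> {dominant}")
```

With these illustrative constants the high affinity system dominates at low external nitrate and the low affinity system dominates once nitrate is abundant, which is the qualitative behavior the two family labels are meant to convey.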

Nitrite

Nitrite transport is accomplished by NNPs, Formate/Nitrite Transporters (FNT), and HPP nitrite transporters. The only characterized member of the FNT family (also known as NAR1) in algae is LCIA, which transports nitrite and bicarbonate, and is a critical part of the Chlamydomonas reinhardtii carbon concentrating mechanism (Yamano et al. 2015; Mariscal et al. 2006; Mackinder et al. 2017). FNT transporters are common in red, green, and glaucophyte algae, but are missing from land plants. HPP nitrite transporters were recently discovered in cyanobacteria and shown to transport nitrite into the chloroplast of A. thaliana (Maeda et al. 2014). No algal homologs have been characterized, although one of the HPP homologs of Chlamydomonas (Cre06.g295826) shows upregulation under nitrogen starvation (Park et al. 2015).

Ammonia

Ammonia and ammonium are taken up by cells by the action of transport proteins, either those specific for ammonia transport (AMTs), or those used to transport similar ions because of competitive binding to a ligand recognition site (McDonald and Ward 2016; Martinelle and Häggström 1993). These transporters are found in most organisms, and high affinity transport of ammonia/ammonium is accomplished by the AMT/Mep/Rh protein family (McDonald and Ward 2016). Ammonia transport by major intrinsic proteins (MIP) and other channel proteins has been reported (Di Giorgio et al. 2014; Chiasson et al. 2014), but these channels tend to have broad substrate specificity and only contribute to low affinity uptake. Algal AMTs are not well characterized; rather, function is assumed because of sequence conservation within the AMT/Mep/Rh family (McDonald and Ward 2016). However, some algal AMT activity is known (Yan 2013; Hildebrand 2005; Song et al. 2011).

Urea

Urea is technically an organic form of nitrogen because of the presence of a carbon atom; however, the bulk of urea used as a fertilizer is derived from ammonia and CO2, and must be hydrolyzed to ammonium and CO2 before it can be assimilated by the cell (Erisman et al. 2008; Witte 2011). Many algae are capable of urea utilization, and some organisms like Nannochloropsis salina appear to preferentially use urea over inorganic sources (Birdsey and Lynch 1962; Campos et al. 2014; Dong et al. 2014; Kim et al. 2016a; Kirk and Kirk 1978a; Kirk and Kirk 1978b; Marudhupandi et al. 2016). Mechanistic preferences for or against urea may be due to competitive pressure from denitrifying organisms commonly found in the environment, which rapidly produce CO2 and ammonia via the action of the urease enzyme (Bouwman et al. 1997). Urea uptake is mediated by DUR3 homologs (ElBerry et al. 1993; Bagnasco 2005), which are ubiquitous among microbes, algae, and plants. Homologs of the DUR3 transporter have been detected in many microalgae (Rentsch et al. 2007), but none of them are functionally characterized. However, there are microalgae that do not take up urea or possess a DUR3 homolog (Chapter 2).
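The release of ammonia and CO2 from urea by urease, mentioned above, corresponds to the following net reaction. It is given here as the standard textbook stoichiometry and is not taken from the cited sources; the enzymatic route used by any particular alga is not specified by it.

```latex
\begin{equation*}
\mathrm{CO(NH_2)_2 + H_2O \longrightarrow CO_2 + 2\,NH_3}
\end{equation*}
```

Each urea molecule therefore yields two nitrogen atoms as ammonia, which can enter the same assimilation route as ammonium taken up directly.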

Ammonia Scavenging by Deaminases

Deaminating enzymes recycle and partition nitrogen within the cell. The general action of these enzymes is to remove amino groups from amino acids, liberating ammonium and the corresponding keto acids (Yu and Qiao 2012; Lukasheva et al. 2011). However, some organisms produce extracellular L-amino acid oxidases (LAAOs) that oxidize amine-containing compounds and produce free NH3 (Pollegioni et al. 2013).
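The oxidative deamination catalyzed by an L-amino acid oxidase can be summarized by the general reaction below, where R is the amino acid side chain. This is the standard reaction assigned to this enzyme class and is shown as a reading aid; it is not a result from the studies cited here.

```latex
\begin{equation*}
\mathrm{R{-}CH(NH_2){-}COOH + O_2 + H_2O \longrightarrow R{-}CO{-}COOH + NH_3 + H_2O_2}
\end{equation*}
```

Because the ammonia is released outside the cell, an organism with an extracellular LAAO needs only an ammonium transporter, and not an amino acid transporter, to benefit from external amino acids.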

Among microalgae, three functional extracellular LAAO/amine oxidases have been found, in Chlamydomonas reinhardtii, Pleurochrysis carterae, and Prymnesium parvum (Palenik and Morel 1990; Palenik and Morel 1991; Piedras et al. 1992; Vallon et al. 1993). The LAAO from Chlamydomonas is the only algal extracellular LAAO to be isolated and characterized. It produces ammonia and keto acids from 13 amino acids, and its expression is repressed by ammonium and induced under nitrogen starvation (Schmollinger et al. 2014; Park et al. 2015; Piedras et al. 1992; Vallon et al. 1993). The putative LAAO(s) of Pleurochrysis carterae oxidizes 9 amino acids and some other amine compounds, while the putative LAAO(s) of Prymnesium parvum only oxidizes several amine compounds and no amino acids (Palenik and Morel 1990).

Assimilation of Organic Nitrogen

In contrast to the use of inorganic forms of nitrogen, the ability of microalgae to use organically bound nitrogen in the form of amino acids, nucleotides, peptides, and other biopolymers is less common. Many have found this puzzling, because dissolved organic nitrogen is relatively abundant in many of the bodies of water where algae can be found (Sipler and Bronk 2015; Berman and Bronk 2003; Aluwihare et al. 2005). Although several algal species have been shown to utilize organic nitrogen, the mechanism by which these are transported or assimilated is not known.


Uptake of Organic Nitrogen Molecules

Amino Acids

The ability of microalgae to use amino acids as a nitrogen source has been demonstrated or inferred from indirect tests using amino acids or cell lysates to rescue growth in media that would otherwise be nitrogen depleted. However, the ability of algae to transport amino acids across the cell membrane has been demonstrated only occasionally (Kirk and Kirk 1978a; Kirk and Kirk 1978b; Sauer et al. 1983; Ietswaart et al. 1994; Hellio and Le Gal 1998; Garcia-Moscoso et al. 2015; Mulholland and Lee 2009), and there is little empirical evidence suggesting a biochemical route of amino acid assimilation (Hellio and Le Gal 1999; Hellio and Le Gal 1998; Hellio et al. 2004).

Amino acid uptake can be mediated by amino acid transporters from at least three transporter families: the Drug Metabolite Transporters (DMT), the Amino Acid-Polyamine-Organocation transporters (APC), and the Amino Acid/Auxin Permease (AAAP) family. Many eukaryotes have amino acid transporters to shuttle these molecules across cell membranes as a consequence of primary metabolism or to facilitate protein synthesis or catabolism. The DMT family transporters facilitate the movement of a wide variety of solutes (Pratelli and Pilot 2014). A subgroup of DMT transporters known as Usually Multiple Acids Move In and Out Transporters (UMAMIT) have been shown to be amino acid exporters and exchangers in Arabidopsis thaliana (Besnard et al. 2016). The APC superfamily is split into the APC family and the Amino Acid/Auxin Permease (AAAP) family. Both families have multiple sub-families that are specific for amino acid transport such as the polyamine:H+ symporters (PHS), cationic amino acid transporters (CAT), and amino acid permeases (AAP) (Wong et al. 2012). While phylogenetic and functional conservation has been shown within these families in bacteria, yeast, and plant systems, their existence and function in algae has not been demonstrated (Tegeder and Masclaux-Daubresse 2018; Wong et al. 2012; Paulsen et al. 2000; Tegeder and Ward 2012; Pratelli and Pilot 2014).

Members of these transporter families can be found in many algae; however, functional conservation and localization can be low within sub-family groups. For example, the Arabidopsis thaliana homolog ANT1, which corresponds to Saccharomyces cerevisiae AVT3, localizes to the vacuole rather than the cell membrane (Fujiki et al. 2017). This makes it difficult to predict transporter localization and capability in a given organism from gene sequence alone. Because there have been no amino acid transporters functionally characterized from algae, the question of which transporters are responsible for transport of specific metabolites remains open.

Nucleic Acids

Few authors have investigated nucleic acids as a nitrogen source in algae (Palenik and Henson 1997; Berman and Bronk 2003; Berman 1999), and fewer still have shown that these organisms can transport and assimilate nucleic acids and nucleobases (Birdsey and Lynch 1962; Ammann and Lynch 1964; Pérez-Vicente et al. 1991, 1995). There are reports of uracil auxotrophic algae (Fujiwara et al. 2013; Sakaguchi et al. 2011; Kasai et al. 2015), and a purine transporter has been shown to exist in Chlamydomonas reinhardtii (Pérez-Vicente et al. 1995; Pérez-Vicente et al. 1991; Schein et al. 2013).

Specific records of algae utilizing nucleic acids as a nitrogen source are rare. Chlamydomonas, Chlorella, and Scenedesmus spp. were found to transport a variety of purines (Ammann and Lynch 1964; Birdsey and Lynch 1962; Cain 1965; Piedras et al. 1995; Pérez-Vicente et al. 1995; Pérez-Vicente et al. 1991; Pineda et al. 1984). Dunaliella viridis was shown to be incapable of utilizing individual nucleobases or ribonucleosides as sole nitrogen sources (Chapter 2); however, acetamide, xanthine, and hypoxanthine do appear to facilitate recovery from nitrogen starvation in some Dunaliella spp. (Figure 1.4).

Peptides

The ability of algae to utilize peptides may be mediated by transporters from the peptide transporter (PTR) and oligopeptide transporter (OPT) families (Lubkowitz 2011; Tegeder and Rentsch 2010; Rentsch et al. 2007). In vascular plants, PTR and OPT homologs have been characterized and the families show broad uptake ability of oligopeptides, dipeptides, auxin, amino acids, nitrate, and peptides chelated with metal ions. Algal homologs may serve the same purpose. Genome database mining shows the presence of these families in algae, but no members have been functionally characterized (Polle et al. 2017; Radakovits et al. 2012; Merchant et al. 2007; McDonald et al. 2010; Hanikenne et al. 2005).
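The transporter families named across Section 1.4 can be collected into a small lookup table, which may help when screening an annotated genome for candidate uptake systems. The sketch below records only the family names given in the text; the dictionary and function names are hypothetical, and, as stressed above, the presence of a family member in a genome does not establish its function or localization in algae.

```python
# Transporter families named in this section, keyed by nitrogen substrate.
# The groupings follow the text; they are a reading aid, not functional evidence.
TRANSPORTER_FAMILIES = {
    "nitrate": ["NNP (NRT2)", "NPF (NRT1/Peptide Transporters)"],
    "nitrite": ["NNP (NRT2)", "FNT (NAR1)", "HPP"],
    "ammonia/ammonium": ["AMT/Mep/Rh", "MIP"],
    "urea": ["DUR3"],
    "amino acids": ["DMT", "APC", "AAAP"],
    "peptides": ["PTR", "OPT"],
}


def candidate_families(substrate):
    """Return the families listed for a substrate, or an empty list if none is listed."""
    return TRANSPORTER_FAMILIES.get(substrate.lower(), [])


def substrates_for(family_keyword):
    """Return the substrates whose listed families mention the given keyword."""
    return [
        substrate
        for substrate, families in TRANSPORTER_FAMILIES.items()
        if any(family_keyword in family for family in families)
    ]


print(candidate_families("Nitrite"))
print(substrates_for("NNP"))
```

A lookup of this kind only narrows the list of candidates; functional characterization would still be needed to assign a substrate to any individual algal transporter.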

Algal utilization of peptides as a nitrogen source has been analyzed indirectly, drawing conclusions from growth of algae strains on a cell lysate such as yeast extract. For example, it was shown that Tetraselmis sp. Ganghwa selectively utilized nitrogen from yeast extracts over inorganic nitrogen sources (Kim et al. 2016b). However, yeast extract contains peptides and amino acids, confounding the identity of the nitrogen source being utilized. On the contrary, if an organism does not utilize trypsin and pepsin enzyme extracts as a nitrogen source, it is likely that most organic nitrogen sources are not utilized by that organism. Specific work to show utilization of peptides is rarer, although this activity is common among bacteria (Armstead and Ling 1993; Goodell and Higgins 1987; Gänzle et al. 2007; Ling and Armstead 1995), yeast (Holm et al. 2005; Becker and Naider 1977; Benjdia et al. 2006), mammals (Daniel et al. 2006), and plants (Tegeder and Rentsch 2010). Mulholland and Lee screened different algal communities for their ability to uptake and/or hydrolyze dipeptides and showed the ability of dominant species to utilize exogenously provided organic nitrogen (Mulholland and Lee 2009). Dunaliella viridis is unable to utilize exogenously provided alanine-glutamine or glycine-glutamine peptides (Chapter 2).

Figure 1.4 | Growth of Dunaliella spp. using urea, purines, and acetamide as a sole nitrogen source. (Murphree and Sederoff unpublished). Total cell density of cultures grown for 144 hours on mBA-N (-), mBA (+), or mBA-N with 5 mM urea (5), 50 mM urea (50), 5 mM Acetamide (Ace), 5 mM Allantoin (Allantoin), 5 mM Xanthine (Xanthine), or 5 mM Hypoxanthine (Hypoxanthine) (see Chapter 2 for methods and composition of mBA-N/mBA). Average cell density was measured from four biological replicates. Error bars represent one standard deviation. Significant differences not shown. The total cell density of cultures of D. salina, D. tertiolecta, and D. tertiolecta was increased compared to a nitrogen starvation control (-) when supplemented with Urea, Acetamide, Xanthine, and Hypoxanthine. In contrast, supplementation of these nitrogen sources had no significant effects on the growth of D. viridis. The ability to use different nitrogen sources therefore represents a discrete and salient source of diversity within Dunaliella.

Extracellular Depolymerization of Nitrogen Polymers

Many organisms employ intracellular proteases for protein catabolism and recycling of amino acids for protein synthesis (Bond and Butler 1987). Some also excrete extracellular proteases and nucleases to assist in the harvesting of nitrogen from the environment (Benedik and Strych 1998; Cezairliyan and Ausubel 2017; Cuatrecasas et al. 1967; Dhapeau et al. 1972; Heun et al. 2012; Joo et al. 2002; Lee et al. 2000; Nascimento and Martins 2004; Nestle and Roberts 1969). The model of this activity is pepsin and trypsin digestion of proteins in the digestive tracts of animals, a process that liberates peptides, dipeptides, and amino acids that are taken up by epithelial cells and commensal bacteria like Escherichia coli.

Reports of characterized extracellular proteases in algae are limited, and the role of these enzymes in nitrogen scavenging is unclear (Liu et al. 2016). A calcium dependent protease from Chlorella sphaerckii was shown to function at a wide variety of pH levels and temperatures, but no further characterization was pursued (Kellam and Walker 1987). The dinoflagellate Peridinium gatunense was shown to excrete a protease used to sensitize other Peridinium gatunense cells to oxidative damage similarly to programmed cell death (Vardi et al. 2007). The diatom Chaetoceros didymus produces extracellular proteases to defend itself against the lytic proteases of the bacterium Kordia algicida, which lyses other diatoms (Paul and Pohnert 2013).

Phagotrophy

Phagotrophy is the use of phagocytosis, in which one cell engulfs another cell or debris, as a source of energy and nutrients. Phagotrophy is not widely found in photosynthetic algae; it occurs within the Prymnesiophyceae, Chrysophyceae, and Pelagophyceae families of algae and in dinoflagellates (Raven et al. 2009; Kamennaya et al. 2018). There are three reports of phagotrophic green algae, Micromonas, Pyramimonas, and Cymbomonas (McKie-Krisberg and Sanders 2014; Bell and Laybourn-Parry 2003; Maruyama and Kim 2013). All are prasinophytes that consume small bacterial cells. Some work done on the phagotrophic alga Ochromonas danica has shown it is capable of producing harvestable lipids while feeding on bacteria and waste activated sludge (Lin and Ju 2017; Li et al. 2016; Li and Ju 2018). The efficiency of phagotrophy is variable and can be dependent on nutrient and other environmental conditions. Compared to nutrient transport or enzymatic degradation of biomass, this process necessarily requires a great deal of morphological and biological specialization.


1.5 Dunaliella spp. Biofuel Production as a System for Modeling Nitrogen Recycling

I have explored the feasibility of nitrogen recycling within an algal production system using the genus Dunaliella as a model. Dunaliella are unicellular photoautotrophic algae that are used to produce high value carotenoids and treat wastewater, and that have the potential to produce TAG lipids suitable as a drop-in biofuel (Dong et al. 2016; Slocombe et al. 2015; Davis et al. 2015; Ben-Amotz et al. 2009). These are halophilic organisms that occupy a specialized niche in extreme salt and alkaline environments like the Great Salt Lake, solar salterns, and the Dead Sea (Ben-Amotz et al. 2009). As biofuel producers these organisms are particularly attractive because they can be grown optimally using brackish and salty water sources instead of freshwater. However, as is the case with other algae, nitrogen recycling is a key component of creating an environmentally and financially sustainable production system.

Of key interest is the ability of Dunaliella to utilize organic forms of nitrogen. This was explored because extracting products from these organisms typically does not entail thermochemical conversion or anaerobic digestion, either because these processes would destroy high value carotenoid byproducts or because the high salt content of cultures renders them impractical. Instead, solvent extraction of ruptured algal lysates is used, which leaves algal biomass intact. This avenue has become even more compelling with the implementation of ultrasonic cavitation (Krehbiel et al. 2014; M. Wang et al. 2014), which promises to make algal lysate formation energetically and financially feasible. Algal biomass resulting from the process is assumed to be constituted of proteins and other nitrogen containing biopolymers, which we believe can be converted to organic nitrogen monomers either by acid hydrolysis or enzymatic conversion. However, prior to these investigations, the capacity of Dunaliella to use these sources of nitrogen had been little explored (Hellio and Le Gal 1999; Hellio and Le Gal 1998).

Conclusion

Synthetic nitrogen is a double-edged sword. While synthetic fertilizer supports an efficient agricultural infrastructure, it is implicitly connected to reliance on fossil fuels. Because of a reliance on substantial amounts of fertilizer, biofuel production from microalgae is not feasible without some means of nitrogen recycling. However, we know little about algal nitrogen scavenging and uptake mechanisms, and a central challenge in the development of algae as a form of agriculture will be the creation of production systems that account for and recycle nitrogen. Thus, there is a clear need to understand the mechanisms and capacity of algal species to use inorganic and organic nitrogen substrates. This knowledge will inform our ability to create systematic improvements in the capacity of algal species to serve as a production platform, either through the application of deliberate mixtures of algal organisms, or the use of genetic engineering to customize an algal chassis for the explicit purpose of nitrogen recycling.


References

Abdel-Raouf, N., A. A. Al-Homaidan, and I. B. M. Ibraheem. "Microalgae and wastewater treatment." Saudi journal of biological sciences 19, no. 3 (2012): 257-275.
Akhtar, Javaid, and Nor Aishah Saidina Amin. “A Review on Process Conditions for Optimum Bio-Oil Yield in Hydrothermal Liquefaction of Biomass.” Renewable and Sustainable Energy Reviews, vol. 15, no. 3, Pergamon, 2011, pp. 1615–24.
Aluwihare, Lihini I., Daniel J. Repeta, Silvio Pantoja, and Carl G. Johnson. "Two chemically distinct pools of organic nitrogen accumulate in the ocean." Science 308, no. 5724 (2005): 1007-1010.
Amin, Sarmidi. “Review on Biofuel Oil and Gas Production Processes from Microalgae.” Energy Conversion and Management, vol. 50, no. 7, Pergamon, 2009, pp. 1834–40.
Ammann, Elizabeth C. B., and Victoria H. Lynch. “Purine Metabolism by Unicellular Algae II. Adenine, Hypoxanthine, and Xanthine Degradation by Chlorella Pyrenoidosa.” Biochimica et Biophysica Acta (BBA) - Specialized Section on Nucleic Acids and Related Subjects, vol. 87, no. 3, Elsevier, 1964, pp. 370–79.
Anastas, Paul T., and Robert H. Crabtree. Handbook of Green Chemistry. Green Catalysis. Wiley-VCH, 2009.
Armstead, I. P., and J. R. Ling. “Variations in the Uptake and Metabolism of Peptides and Amino Acids by Mixed Ruminal Bacteria in Vitro.” Applied and Environmental Microbiology, vol. 59, no. 10, American Society for Microbiology, 1993, pp. 3360–66.
Ayala-Parra, Pedro, Yuanzhe Liu, Jim A. Field, and Reyes Sierra-Alvarez. "Nutrient recovery and biogas generation from the anaerobic digestion of waste biomass from algal biofuel production." Renewable Energy 108 (2017): 410-416.
Bagnasco, Serena M. “Role and Regulation of Urea Transporters.” Pflügers Archiv - European Journal of Physiology, vol. 450, no. 4, Springer-Verlag, 2005, pp. 217–26.
Barbera, Elena, Eleonora Sforza, Sandeep Kumar, Tomas Morosinotto, and Alberto Bertucco. "Cultivation of Scenedesmus obliquus in liquid hydrolysate from flash hydrolysis for nutrient recycling." Bioresource technology 207 (2016): 59-66.
Barbera, Elena, Ali Teymouri, Alberto Bertucco, Ben J. Stuart, and Sandeep Kumar. "Recycling Minerals in Microalgae Cultivation through a Combined Flash Hydrolysis–Precipitation Process." ACS Sustainable Chemistry & Engineering 5, no. 1 (2016): 929-935.
Becker, Jeffrey M., and Fred Naider. “Peptide Transport in Yeast: Uptake of Radioactive Trimethionine in Saccharomyces Cerevisiae.” Archives of Biochemistry and Biophysics, vol. 178, no. 1, Academic Press, 1977, pp. 245–55.


Bell, Elanor M., and Johanna Laybourn-Parry. “Mixotrophy in the Antarctic Phytoflagellate, Pyramimonas gelidicola (Chlorophyta: Prasinophyceae).” Journal of Phycology, vol. 39, no. 4, Blackwell Science Inc, 2003, pp. 644–49.
Ben-Amotz, Ami. “New Mode of Dunaliella Biotechnology: Two-Phase Growth for β-Carotene Production.” Journal of Applied Phycology, vol. 7, no. 1, Kluwer Academic Publishers, 1995, pp. 65–68.
Ben-Amotz, Ami, Jürgen EW Polle, and DV Subba Rao, eds. The Alga Dunaliella. CRC Press, 2009.
Benedik, Michael J., and Ulrich Strych. “Serratia Marcescens and Its Extracellular Nuclease.” FEMS Microbiology Letters, vol. 165, no. 1, Oxford University Press, 1998, pp. 1–13.
Benjdia, Mariam, Enno Rikirsch, Tobias Müller, Mélanie Morel, Claire Corratgé, Sabine Zimmermann, Michel Chalot, Wolf B. Frommer, and Daniel Wipf. "Peptide uptake in the ectomycorrhizal fungus Hebeloma cylindrosporum: characterization of two di‐and tripeptide transporters (HcPTR2A and B)." New Phytologist 170, no. 2 (2006): 401-410.
Berman, T. “Algal Growth on Organic Compounds as Nitrogen Sources.” Journal of Plankton Research, vol. 21, no. 8, Oxford University Press, 1999, pp. 1423–37.
Berman, T., and DA Bronk. “Dissolved Organic Nitrogen: A Dynamic Participant in Aquatic Ecosystems.” Aquatic Microbial Ecology, vol. 31, no. 3, 2003, pp. 279–305.
Besnard, Julien, Réjane Pratelli, Chengsong Zhao, Unnati Sonawala, Eva Collakova, Guillaume Pilot, and Sakiko Okumoto. "UMAMIT14 is an amino acid exporter involved in phloem unloading in Arabidopsis roots." Journal of experimental botany 67, no. 22 (2016): 6385-6397.
Bessette, Andrew P., Ali Teymouri, Mason J. Martin, Ben J. Stuart, Eleazer P. Resurreccion, and Sandeep Kumar. "Life Cycle Impacts and Techno-economic Implications of Flash Hydrolysis in Algae Processing." ACS Sustainable Chemistry & Engineering 6, no. 3 (2018): 3580-3588.
Biller, Patrick, Andrew B. Ross, S. C. Skill, A. Lea-Langton, B. Balasundaram, C. Hall, R. Riley, and C. A. Llewellyn. "Nutrient recycling of aqueous phase for microalgae cultivation from the hydrothermal liquefaction process." Algal Research 1, no. 1 (2012): 70-76.
Biller, P., and A. B. Ross. “Potential Yields and Properties of Oil from the Hydrothermal Liquefaction of Microalgae with Different Biochemical Content.” Bioresource Technology, vol. 102, no. 1, Elsevier, 2011, pp. 215–25.
Birdsey, E. C., and V. H. Lynch. “Utilization of Nitrogen Compounds by Unicellular Algae.” Science (New York, N.Y.), vol. 137, no. 3532, American Association for the Advancement of Science, 1962, pp. 763–64.
Bleakley, Stephen, and Maria Hayes. “Algal Proteins: Extraction, Application, and Challenges Concerning Production.” Foods, vol. 6, no. 12, Multidisciplinary Digital Publishing Institute, 2017, p. 33.


Bond, J. S., and P. E. Butler. “Intracellular Proteases.” Annual Review of Biochemistry, vol. 56, no. 1, Annual Reviews, 1987, pp. 333–64.
Bothast, R. J., and M. A. Schlicher. “Biotechnological Processes for Conversion of Corn into Ethanol.” Applied Microbiology and Biotechnology, vol. 67, no. 1, Springer-Verlag, 2005, pp. 19–25.
Bouwman, A. F., D. S. Lee, W. A. H. Asman, F. J. Dentener, K. W. Van Der Hoek, and J. G. J. Olivier. "A global high‐resolution emission inventory for ammonia." Global biogeochemical cycles 11, no. 4 (1997): 561-587.
Brown, Tylisha M., Peigao Duan, and Phillip E. Savage. "Hydrothermal liquefaction and gasification of Nannochloropsis sp." Energy & Fuels 24, no. 6 (2010): 3639-3646.
Cai, Ting, Stephen Y. Park, Ratanachat Racharaks, and Yebo Li. "Cultivation of Nannochloropsis salina using anaerobic digestion effluent as a nutrient source for biofuel production." Applied energy 108 (2013): 486-492.
Cai, Ting, Stephen Y. Park, and Yebo Li. “Nutrient Recovery from Wastewater Streams by Microalgae: Status and Prospects.” Renewable and Sustainable Energy Reviews, vol. 19, Pergamon, 2013, pp. 360–69.
Cain, Brother Joseph. “Nitrogen utilization in 38 freshwater Chlamydomonas algae.” Canadian Journal of Botany, vol. 43, no. 11, NRC Research Press Ottawa, Canada, 1965, pp. 1367–78.
Campos, Herman, Wiebke J. Boeing, Barry N. Dungan, and Tanner Schaub. "Cultivating the marine microalga Nannochloropsis salina under various nitrogen sources: effect on biovolume yields, lipid content and composition, and invasive organisms." Biomass and Bioenergy 66 (2014): 301-307.
Canabarro, Nicholas, Juliana F. Soares, Chayene G. Anchieta, Camila S. Kelling, and Marcio A. Mazutti. "Thermochemical processes for biofuels production from biomass." Sustainable Chemical Processes 1, no. 1 (2013): 22.
Cezairliyan, Brent, and Frederick M. Ausubel. “Investment in Secreted Enzymes during Nutrient-Limited Growth Is Utility Dependent.” Proceedings of the National Academy of Sciences, vol. 114, no. 37, 2017, pp. E7796–802.
Chen, Limei, Tao Zhu, Jose Salomon Martinez Fernandez, Shulin Chen, and Demao Li. "Recycling nutrients from a sequential hydrothermal liquefaction process for microalgae culture." Algal Research 27 (2017): 311-317.
Chen, Ye, Jay J. Cheng, and Kurt S. Creamer. "Inhibition of anaerobic digestion process: a review." Bioresource technology 99, no. 10 (2008): 4044-4064.
Ho, Shih-Hsin, Dillirani Nagarajan, Nan-qi Ren, and Jo-Shu Chang. "Waste biorefineries—integrating anaerobic digestion and microalgae cultivation for bioenergy production." Current opinion in biotechnology 50 (2018): 101-110.


Chiasson, David M., Patrick C. Loughlin, Danielle Mazurkiewicz, Manijeh Mohammadidehcheshmeh, Elena E. Fedorova, Mamoru Okamoto, Elizabeth McLean et al. "Soybean SAT1 (Symbiotic Ammonium Transporter 1) encodes a bHLH transcription factor involved in nodule growth and NH4+ transport." Proceedings of the National Academy of Sciences 111, no. 13 (2014): 4814-4819. Chisti, Yusuf. "Biodiesel from microalgae." Biotechnology advances 25, no. 3 (2007): 294-306. Chisti, Yusuf. "Constraints to commercialization of algal fuels." Journal of biotechnology 167, no. 3 (2013): 201-214. Cho, Hyun Uk, and Jong Moon Park. “Biodiesel Production by Various Oleaginous Microorganisms from Organic Wastes.” Bioresource Technology, Elsevier, 2018. Choi, Seung Phill, Minh Thu Nguyen, and Sang Jun Sim. "Enzymatic pretreatment of Chlamydomonas reinhardtii biomass for ethanol production." Bioresource technology 101, no. 14 (2010): 5330-5336. Christenson, Logan, and Ronald Sims. “Production and Harvesting of Microalgae for Wastewater Treatment, Biofuels, and Bioproducts.” Biotechnology Advances, vol. 29, no. 6, Elsevier, 2011, pp. 686–702. Collet, Pierre, Arnaud Hélias, Laurent Lardon, Monique Ras, Romy-Alice Goy, and Jean- Philippe Steyer. "Life-cycle assessment of microalgae culture coupled to biogas production." Bioresource technology 102, no. 1 (2011): 207-214. Converti, Attilio, Alessandro A. Casazza, Erika Y. Ortiz, Patrizia Perego, and Marco Del Borghi. "Effect of temperature and nitrogen concentration on the growth and lipid content of Nannochloropsis oculata and Chlorella vulgaris for biodiesel production." Chemical Engineering and Processing: Process Intensification 48, no. 6 (2009): 1146-1151. Cooney, Michael, Greg Young, and Nick Nagle. "Extraction of bio‐oils from microalgae." Separation & Purification Reviews 38, no. 4 (2009): 291-325. Crutzen, Paul J., Arvin R. Mosier, Keith A. Smith, and Wilfried Winiwarter. 
"N 2 O release from agro-biofuel production negates global warming reduction by replacing fossil fuels." In Paul J. Crutzen: A pioneer on atmospheric chemistry and climate change in the anthropocene, pp. 227-238. Springer, Cham, 2016. Cuatrecasas, Pedro, Sara Fuchs, and Christian B. Anfinsen. "Catalytic properties and specificity of the extracellular nuclease of Staphylococcus aureus." Journal of Biological Chemistry 242, no. 7 (1967): 1541-1547. Daniel, Hannelore, Britta Spanier, Gabor Kottra, and Dietmar Weitz. "From bacteria to man: archaic proton-dependent peptide transporters at work." Physiology 21, no. 2 (2006): 93-102. Davis, Ryan W., Benjamin J. Carvalho, Howland DT Jones, and Seema Singh. "The role of photo-osmotic adaptation in semi-continuous culture and lipid particle release from Dunaliella viridis." Journal of applied phycology 27, no. 1 (2015): 109-123.


Drapeau, Gabriel R., Yves Boily, and Jean Houmard. "Purification and properties of an extracellular protease of Staphylococcus aureus." Journal of Biological Chemistry 247, no. 20 (1972): 6720-6726. Dong, Hong-Po, Kai-Xuan Huang, Hua-Long Wang, Song-Hui Lu, Jing-Yi Cen, and Yue-Lei Dong. "Understanding strategy of nitrate and urea assimilation in a Chinese strain of Aureococcus anophagefferens through RNA-Seq analysis." PloS one 9, no. 10 (2014): e111069. Dong, Tao, Eric P. Knoshaug, Philip T. Pienkos, and Lieve ML Laurens. "Lipid recovery from wet oleaginous microbial biomass for biofuel production: a critical review." Applied Energy 177 (2016): 879-895. Dove, Alan. “Uncorking the Biomanufacturing Bottleneck.” Nature Biotechnology, vol. 20, no. 8, 2002, pp. 777–79. Edmundson, S., M. Huesemann, Robert Kruk, T. Lemmon, J. Billing, A. Schmidt, and D. Anderson. "Phosphorus and nitrogen recycle following algal bio-crude production via continuous hydrothermal liquefaction." Algal Research 26 (2017): 415-421. ElBerry, H. M., MONA L. Majumdar, THOMAS S. Cunningham, R. A. Sumrada, and T. G. Cooper. "Regulation of the urea active transporter gene (DUR3) in Saccharomyces cerevisiae." Journal of Bacteriology 175, no. 15 (1993): 4688-4698. Erisman, Jan Willem, Mark A. Sutton, James Galloway, Zbigniew Klimont, and Wilfried Winiwarter. "How a century of ammonia synthesis changed the world." Nature Geoscience 1, no. 10 (2008): 636. Erisman, Jan Willem, Hans van Grinsven, Adrian Leip, Arvin Mosier, and Albert Bleeker. "Nitrogen and biofuels; an overview of the current state of knowledge." Nutrient Cycling in Agroecosystems 86, no. 2 (2010): 211-223. International Fertilizer Industry Association. "Fertilizers, climate change and enhancing agricultural productivity sustainably." Paris, France: International Fertilizer Industry Association (2009). Forde, Brian G. 
“Nitrate Transporters in Plants: Structure, Function and Regulation.” Biochimica et Biophysica Acta (BBA) - Biomembranes, vol. 1465, no. 1–2, Elsevier, 2000, pp. 219–35. Fowler, David, Mhairi Coyle, Ute Skiba, Mark A. Sutton, J. Neil Cape, Stefan Reis, Lucy J. Sheppard et al. "The global nitrogen cycle in the twenty-first century." Phil. Trans. R. Soc. B 368, no. 1621 (2013): 20130164. Franche, Claudine, Kristina Lindström, and Claudine Elmerich. "Nitrogen-fixing bacteria associated with leguminous and non-leguminous plants." Plant and soil 321, no. 1-2 (2009): 35-59. Franz, Anette, Florian Lehr, Clemens Posten, and Georg Schaub. "Modeling microalgae cultivation productivities in different geographic locations–estimation method for idealized photobioreactors." Biotechnology journal 7, no. 4 (2012): 546-557.


Fujiki, Yuki, Hiromitsu Teshima, Shinji Kashiwao, Miyuki Kawano‐Kawada, Yoshinori Ohsumi, Yoshimi Kakinuma, and Takayuki Sekito. "Functional identification of AtAVT3, a family of vacuolar amino acid transporters, in Arabidopsis." FEBS letters 591, no. 1 (2017): 5-15. Fujiwara, Takayuki, Mio Ohnuma, Masaki Yoshida, Tsuneyoshi Kuroiwa, and Tatsuya Hirano. "Gene targeting in the red alga Cyanidioschyzon merolae: single-and multi-copy insertion using authentic and chimeric selection markers." PLoS One 8, no. 9 (2013): e73608. Galloway, James N., Frank J. Dentener, Douglas G. Capone, Elisabeth W. Boyer, Robert W. Howarth, Sybil P. Seitzinger, Gregory P. Asner et al. "Nitrogen cycles: past, present, and future." Biogeochemistry 70, no. 2 (2004): 153-226. Ganeshkumar, Vimalkumar, Suresh R. Subashchandrabose, Rajarathnam Dharmarajan, Kadiyala Venkateswarlu, Ravi Naidu, and Mallavarapu Megharaj. "Use of mixed wastewaters from piggery and winery for nutrient removal and lipid production by Chlorella sp. MM3." Bioresource technology 256 (2018): 254-258. Gänzle, Michael G., Nicoline Vermeulen, and Rudi F. Vogel. "Carbohydrate, peptide and lipid metabolism of lactic acid bacteria in sourdough." Food microbiology 24, no. 2 (2007): 128- 138. Garcia Alba, Laura, Mathijs P. Vos, Cristian Torri, Daniele Fabbri, Sascha RA Kersten, and Derk WF Brilman. "Recycling nutrients in algae biorefinery." ChemSusChem 6, no. 8 (2013): 1330-1333. Garcia-Moscoso, Jose L., Ali Teymouri, and Sandeep Kumar. "Kinetics of peptides and arginine production from microalgae (Scenedesmus sp.) by flash hydrolysis." Industrial & Engineering Chemistry Research 54, no. 7 (2015): 2048-2058. Garcia-Moscoso, Jose Luis, Wassim Obeid, Sandeep Kumar, and Patrick G. Hatcher. "Flash hydrolysis of microalgae (Scenedesmus sp.) for protein extraction and production of biofuels intermediates." The Journal of Supercritical Fluids 82 (2013): 183-190. 
Goldschmidt, Barbara, Nader Padban, Michael Cannon, Greg Kelsall, Magnus Neergaard, Krister Ståhl, and Ingemar Odenbrand. "Ammonia Formation and NOx Emissions with Various Biomass and Waste Fuels at the Värnamo 18 MWth IGCC Plant." Progress in thermochemical biomass conversion (2001): 524-535. Gollakota, A. R. K., Nanda Kishore, and Sai Gu. "A review on hydrothermal liquefaction of biomass." Renewable and Sustainable Energy Reviews (2017). Goodell, E. William, and Christopher F. Higgins. "Uptake of cell wall peptides by Salmonella typhimurium and Escherichia coli." Journal of bacteriology 169, no. 8 (1987): 3861-3865. Guan, Qingqing, Phillip E. Savage, and Chaohai Wei. "Gasification of alga Nannochloropsis sp. in supercritical water." The Journal of Supercritical Fluids 61 (2012): 139-145.


Guarnieri, Michael T., and Philip T. Pienkos. “Algal Omics: Unlocking Bioproduct Diversity in Algae Cell Factories.” Photosynthesis Research, vol. 123, no. 3, Springer Netherlands, 2015, pp. 255–63. Guschina, Irina A., and John L. Harwood. “Lipids and Lipid Metabolism in Eukaryotic Algae.” Progress in Lipid Research, vol. 45, no. 2, Pergamon, 2006, pp. 160–86, Hanikenne, Marc, Ute Krämer, Vincent Demoulin, and Denis Baurain. "A comparative inventory of metal transporters in the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschizon merolae." Plant Physiology 137, no. 2 (2005): 428-446. Harun, Razif, W. S. Y. Jason, Tamara Cherrington, and Michael K. Danquah. "Exploring alkaline pre-treatment of microalgal biomass for bioethanol production." Applied Energy 88, no. 10 (2011): 3464-3467. Hellio, C., and Y. Le Gal. “Histidase from the Unicellular Green Alga Dunaliella Tertiolecta : Purification and Partial Characterization.” Eur. J. Phycol, vol. 34, 1999, pp. 71–78. Hellio, Claire, Benoit Veron, and Yves Le Gal. "Amino acid utilization by Chlamydomonas reinhardtii: specific study of histidine." Plant Physiology and Biochemistry 42, no. 3 (2004): 257-264. Hellio, Claire, and Yves Le Gal. “Histidine Utilization by the Unicellular Alga Dunaliella Tertiolecta.” Comparative Biochemistry and Physiology Part A: Molecular & Integrative Physiology, vol. 119, no. 3, Pergamon, 1998, pp. 753–58. HESQ, Yara, and Jan-Petter Fossum. "Calculation of Carbon Footprint of Fertilizer Production." (2010). Heun, Magnus, Lucas Binnenkade, Maximilian Kreienbaum, and Kai M. Thormann. "Functional specificity of extracellular nucleases of Shewanella oneidensis MR-1." Applied and environmental microbiology 78, no. 12 (2012): 4400-4411. Hildebrand, Mark. “Cloning and functional characterization of ammonium transporters from the marine diatom cylindrotheca fusiformis (bacillariophyceae).” Journal of Phycology, vol. 41, no. 1, Blackwell Science Inc, 2005, pp. 105–13. 
Holm, Tina, Semharai Netzereab, Mats Hansen, Ülo Langel, and Mattias Hällbrink. "Uptake of cell‐penetrating peptides in yeasts." FEBS letters 579, no. 23 (2005): 5217-5222. Howarth, Robert W. “Coastal Nitrogen Pollution: A Review of Sources and Trends Globally and Regionally.” Harmful Algae, vol. 8, no. 1, Elsevier, 2008, pp. 14–20. Hülsen, Tim, Kent Hsieh, Yang Lu, Stephan Tait, and Damien J. Batstone. "Simultaneous treatment and single cell protein production from agri-industrial wastewaters using purple phototrophic bacteria or microalgae–A comparison." Bioresource technology 254 (2018): 214-223. Huo, Yi-Xin, David G. Wernick, and James C. Liao. "Toward nitrogen neutral biofuel production." Current opinion in biotechnology 23, no. 3 (2012): 406-413.


Ietswaart, T., P. J. Schneider, and R. A. Prins. "Utilization of organic nitrogen sources by two phytoplankton species and a bacterial isolate in pure and mixed cultures." Applied and environmental microbiology 60, no. 5 (1994): 1554-1560. Joo, Han-Seung, C. Ganesh Kumar, Gun-Chun Park, Ki Tae Kim, Seung R. Paik, and Chung- Soon Chang. "Optimization of the production of an extracellular alkaline protease from Bacillus horikoshii." Process Biochemistry 38, no. 2 (2002): 155-159. Joy, K. W., and R. H. Hageman. “The Purification and Properties of Nitrite Reductase from Higher Plants, and Its Dependence on Ferredoxin.” The Biochemical Journal, vol. 100, no. 1, Portland Press Ltd, 1966, pp. 263–73. Kamennaya, Nina A., Gabrielle Kennaway, Bernhard M. Fuchs, and Mikhail V. Zubkov. "“Pomacytosis”—Semi-extracellular phagocytosis of cyanobacteria by the smallest marine algae." PLoS biology 16, no. 1 (2018): e2003502. Kasai, Yuki, Kohei Oshima, Fukiko Ikeda, Jun Abe, Yuya Yoshimitsu, and Shigeaki Harayama. "Construction of a self-cloning system in the unicellular green alga Pseudochoricystis ellipsoidea." Biotechnology for biofuels 8, no. 1 (2015): 94. Kellam, Stephen J., and John M. Walker. “An Extracellular Protease from the Alga Chlorella Sphaerkii.” Biochemical Society Transactions, vol. 15, no. 3, Portland Press Limited, 1987, pp. 520–21. Kim, Garam, Ghulam Mujtaba, and Kisay Lee. "Effects of nitrogen sources on cell growth and biochemical composition of marine chlorophyte Tetraselmis sp. for lipid production." Algae 31, no. 3 (2016): 257-266. Kirk, D. L., and M. M. Kirk. “Carrier-Mediated Uptake of Arginine and Urea by Chlamydomonas Reinhardtii.” Plant Physiology, vol. 61, no. 4, 1978, pp. 556–60, Kirk, M. M., and D. L. Kirk. “Carrier-Mediated Uptake of Arginine and Urea by Volvox Carteri F. Nagariensis.” Plant Physiology, vol. 61, no. 4, American Society of Plant Biologists, 1978, pp. 549–55. Kobayashi, Makio, Toshihide Kakizono, and Shiro Nagai. 
"Astaxanthin production by a green alga, Haematococcus pluvialis accompanied with morphological changes in acetate media." Journal of Fermentation and Bioengineering 71, no. 5 (1991): 335-339. Krehbiel, Joel D., Lance C. Schideman, Daniel A. King, and Jonathan B. Freund. "Algal cell disruption using microbubbles to localize ultrasonic energy." Bioresource technology 173 (2014): 448-451. Kumar, Ajay, David D. Jones, and Milford A. Hanna. "Thermochemical biomass gasification: a review of the current status of the technology." Energies 2, no. 3 (2009): 556-581. Kumar, Parveen, Diane M. Barrett, Michael J. Delwiche, and Pieter Stroeve. "Methods for pretreatment of lignocellulosic biomass for efficient hydrolysis and biofuel production." Industrial & engineering chemistry research 48, no. 8 (2009): 3713-3729.


Kumar, Sandeep, Elodie Hablot, Jose Luis Garcia Moscoso, Wassim Obeid, Patrick G. Hatcher, Brandon Michael DuQuette, Daniel Graiver, Ramani Narayan, and Venkatesh Balan. "Polyurethanes preparation using proteins obtained from microalgae." Journal of materials science 49, no. 22 (2014): 7824-7833. Kuo, Chiu-Mei, Tsai-Yu Chen, Tsung-Hsien Lin, Chien-Ya Kao, Jinn-Tsyy Lai, Jo-Shu Chang, and Chih-Sheng Lin. "Cultivation of Chlorella sp. GD using piggery wastewater for biomass and lipid production." Bioresource technology 194 (2015): 326-333. Lardon, Laurent, Arnaud Helias, Bruno Sialve, Jean-Philippe Steyer, and Olivier Bernard. "Life- cycle assessment of biodiesel production from microalgae." (2009): 6475-6481. Laurens, L. M. L., N. Nagle, R. Davis, N. Sweeney, S. Van Wychen, A. Lowell, and P. T. Pienkos. "Acid-catalyzed algal biomass pretreatment for integrated lipid and carbohydrate- based biofuels production." Green Chemistry 17, no. 2 (2015): 1145-1158. Ehhalt, D., M. Prather, F. Dentener, R. Derwent, Edward J. Dlugokencky, E. Holland, I. Isaksen et al. Atmospheric chemistry and greenhouse gases. No. PNNL-SA-39647. Pacific Northwest National Laboratory (PNNL), Richland, WA (US), 2001. Lee, Sun-og, Junichi Kato, Noboru Takiguchi, Akio Kuroda, Tsukasa Ikeda, Atsushi Mitsutani, and Hisao Ohtake. "Involvement of an extracellular protease in algicidal activity of the marine bacterium pseudoalteromonassp. strain A28." Applied and Environmental Microbiology 66, no. 10 (2000): 4334-4339. Lelieveld, Jos, and Frank J. Dentener. “What Controls Tropospheric Ozone?” Journal of Geophysical Research: Atmospheres, vol. 105, no. D3, 2000, pp. 3531–51. Leng, Lijian, Jun Li, Zhiyou Wen, and Wenguang Zhou. "Use of microalgae to recycle nutrients in aqueous phase derived from hydrothermal liquefaction process." Bioresource technology (2018). Léran, Sophie, Kranthi Varala, Jean-Christophe Boyer, Maurizio Chiurazzi, Nigel Crawford, Françoise Daniel-Vedele, Laure David et al. 
"A unified nomenclature of Nitrate Transporter 1/Peptide Transporter family members in plants." Trends in plant science 19, no. 1 (2014): 5- 9. Lettinga, G. “Anaerobic Digestion and Wastewater Treatment Systems.” Antonie van Leeuwenhoek, vol. 67, no. 1, Kluwer Academic Publishers, 1995, pp. 3–28. Li, Cong, Suo Xiao, and Lu-Kwang Ju. "Cultivation of phagotrophic algae with waste activated sludge as a fast approach to reclaim waste organics." Water research 91 (2016): 195-202. Li, Cong, and Lu-Kwang Ju. “Enhancement of Resource Recovery and Sludge Digestion by Cultivation of Phagotrophic Algae with Alkali-Pretreated Waste Activated Sludge and Waste Ketchup.” Process Safety and Environmental Protection, vol. 113, Elsevier, 2018, pp. 233– 41. Li, Kexun, Shun Liu, and Xianhua Liu. "An overview of algae bioethanol production." International Journal of Energy Research 38, no. 8 (2014): 965-977.


Lin, Zhongye, and Lu-Kwang Ju. “Growth and Lipid Production of a Phagotrophic Alga Feeding on Escherichia Coli Cells: A New Approach for Algal Biomass and Lipid Production from Wastewater Bacteria.” Environmental Engineering Science, vol. 34, no. 7, Mary Ann Liebert, Inc. 140 Huguenot Street, 3rd Floor New Rochelle, NY 10801 USA, 2017, pp. 461–68. Ling, J. R., and I. P. Armstead. “The in Vitro Uptake and Metabolism of Peptides and Amino Acids by Five Species of Rumen Bacteria.” Journal of Applied Bacteriology, vol. 78, no. 2, Blackwell Publishing Ltd, 1995, pp. 116–24. Lisa, T. A., P. Piedras, J. Cardenas, and M. Pineda. "Utilization of adenine and guanine as nitrogen sources by Chlamydomonas reinhardtii cells." Plant, Cell & Environment 18, no. 5 (1995): 583-588. Liu, Lu, Georg Pohnert, and Dong Wei. "Extracellular metabolites from industrial microalgae and their biotechnological potential." Marine drugs 14, no. 10 (2016): 191. Barreiro, Diego López, Wolter Prins, Frederik Ronsse, and Wim Brilman. "Hydrothermal liquefaction (HTL) of microalgae for biofuel production: state of the art review and future prospects." Biomass and Bioenergy 53 (2013): 113-127. Lubkowitz, Mark. “The Oligopeptide Transporters: A Small Gene Family with a Diverse Group of Substrates and Functions?” Molecular Plant, vol. 4, no. 3, 2011, pp. 407–15. Lukasheva, E. V., A. A. Efremova, E. M. Treshalina, A. Yu Arinbasarova, A. G. Medentzev, and T. T. Berezov. "L-Amino acid oxidases: properties and molecular mechanisms of action." Biochemistry (Moscow) Supplement Series B: Biomedical Chemistry 5, no. 4 (2011): 337- 345. Mackenzie, Fred T. Our Changing Planet : An Introduction to Earth System Science and Global Environmental Change. Prentice Hall, 2011. Mackinder, Luke CM, Chris Chen, Ryan D. Leib, Weronika Patena, Sean R. Blum, Matthew Rodman, Silvia Ramundo, Christopher M. Adams, and Martin C. Jonikas. 
"A Spatial Interactome Reveals the Protein Organization of the Algal CO 2-Concentrating Mechanism." Cell 171, no. 1 (2017): 133-147. Maeda, Shin-ichi, Mineko Konishi, Shuichi Yanagisawa, and Tatsuo Omata. "Nitrite transport activity of a novel HPP family protein conserved in cyanobacteria and chloroplasts." Plant and Cell Physiology 55, no. 7 (2014): 1311-1324. Mariscal, Vicente, Pascale Moulin, Mathilde Orsel, Anthony J. Miller, Emilio Fernández, and Aurora Galván. "Differential regulation of the Chlamydomonas Nar1 gene family by carbon and nitrogen." Protist 157, no. 4 (2006): 421-433. Martinelle, Kristina, and Lena Häggström. “Mechanisms of Ammonia and Ammonium Ion Toxicity in Animal Cells: Transport across Cell Membranes.” Journal of Biotechnology, vol. 30, no. 3, Elsevier, 1993, pp. 339–50.


Marudhupandi, Thangapandi, Ramamoorthy Sathishkumar, and Thipramalai Thankappan Ajith Kumar. "Heterotrophic cultivation of Nannochloropsis salina for enhancing biomass and lipid production." Biotechnology Reports 10 (2016): 8-16. Maruyama, Shinichiro, and Eunsoo Kim. “A Modern Descendant of Early Green Algal Phagotrophs.” Current Biology: CB, vol. 23, no. 12, Elsevier, 2013, pp. 1081–84. Masclaux-Daubresse, Céline, Françoise Daniel-Vedele, Julie Dechorgnat, Fabien Chardon, Laure Gaufichon, and Akira Suzuki. "Nitrogen uptake, assimilation and remobilization in plants: challenges for sustainable and productive agriculture." Annals of botany 105, no. 7 (2010): 1141-1157. Mateo-Sagasta, Javier, Liqa Raschid-Sally, and Anne Thebo. "Global wastewater and sludge production, treatment and use." In Wastewater, pp. 15-38. Springer, Dordrecht, 2015. McDonald, Sarah M., Joshua N. Plant, and Alexandra Z. Worden. "The mixed lineage nature of nitrogen transport and assimilation in marine eukaryotic phytoplankton: a case study of Micromonas." Molecular biology and evolution 27, no. 10 (2010): 2268-2283. McDonald, Tami R., and John M. Ward. “Evolution of Electrogenic Ammonium Transporters (AMTs).” Frontiers in Plant Science, vol. 7, Frontiers Media SA, 2016, p. 352. McKie-Krisberg, Zaid M., and Robert W. Sanders. “Phagotrophy by the Picoeukaryotic Green Alga Micromonas: Implications for Arctic Oceans.” The ISME Journal, vol. 8, no. 10, Nature Publishing Group, 2014, pp. 1953–61. Merchant, Sabeeha S., Simon E. Prochnik, Olivier Vallon, Elizabeth H. Harris, Steven J. Karpowicz, George B. Witman, Astrid Terry et al. "The Chlamydomonas genome reveals the evolution of key animal and plant functions." Science 318, no. 5848 (2007): 245-250. Miflin, Ben J., and Dimah Z. Habash. “The Role of Glutamine Synthetase and Glutamate Dehydrogenase in Nitrogen Assimilation and Possibilities for Improvement in the Nitrogen Utilization of Crops.” Journal of Experimental Botany, vol. 53, no. 
370, Oxford University Press, 2002, pp. 979–87. Miflin, Benjamin J., and Peter J. Lea. “The Pathway of Nitrogen Assimilation in Plants.” Phytochemistry, vol. 15, no. 6, Pergamon, 1976, pp. 873–85. Moody, Jeffrey W., Christopher M. McGinty, and Jason C. Quinn. "Global evaluation of biofuel potential from microalgae." Proceedings of the National Academy of Sciences 111, no. 23 (2014): 8691-8696. Mosier, Arvin, Carolien Kroeze, Cindy Nevison, Oene Oenema, Sybil Seitzinger, and Oswald Van Cleemput. "Closing the global N2O budget: nitrous oxide emissions through the agricultural nitrogen cycle." Nutrient cycling in Agroecosystems 52, no. 2-3 (1998): 225-248. Mu, Dongyan, Roger Ruan, Min Addy, Sarah Mack, Paul Chen, and Yong Zhou. "Life cycle assessment and nutrient analysis of various processing pathways in algal biofuel production." Bioresource technology 230 (2017): 33-42.


Mulholland, Margaret R., and Cindy Lee. “Peptide Hydrolysis and the Uptake of Dipeptides by Phytoplankton.” Limnology and Oceanography, vol. 54, no. 3, 2009, pp. 856–68. Nascimento, Wellingta Cristina Almeida do, and Meire Lelis Leal Martins. “Production and Properties of an Extracellular Protease from Thermophilic Bacillus Sp.” Brazilian Journal of Microbiology, vol. 35, no. 1–2, SBM, 2004, pp. 91–96. Nestle, Marion, and W. K. Roberts. “An Extracellular Nuclease from Serratia Marcescens I. Purification and Some Properties of the Enzyme.” The Journal of Biological Chemistry, vol. 244, no. 19, 1969, pp. 5213–18. Northrop, J. H. Crystalline Enzymes: The Chemistry of Pepsin, Trypsin, and Bacteriophage. Columbia University Press, 1939. Onwudili, Jude A., Amanda R. Lea-Langton, Andrew B. Ross, and Paul T. Williams. "Catalytic hydrothermal gasification of algae for hydrogen production: composition of reaction products and potential for nutrient recycling." Bioresource technology 127 (2013): 72-80. Pérez-Vicente, R., M. I. Burón, J. A. González-Reyes, J. Cárdenas, and M. Pineda. "Xanthine accumulation and vacuolization in Chlamydomonas reinhardtii cells." Protoplasma 186, no. 1-2 (1995): 93-98. Pach, John D. "Ammonia production: energy efficiency, CO2 balances and environmental impact." In Proceedings-International Fertiliser Society, no. 601, pp. 1-26. International Fertiliser Society, 2007. Pagliolico, Simonetta L., Valerio RM Lo Verso, Francesca Bosco, Chiara Mollea, and Cinzia La Forgia. "A novel photo-bioreactor application for microalgae production as a shading system in buildings." Energy Procedia 111 (2017): 151-160. Palenik, B., and F. M. Morel. “Amine Oxidases of Marine Phytoplankton.” Applied and Environmental Microbiology, vol. 57, no. 8, American Society for Microbiology (ASM), Aug. 1991, pp. 2440–43, http://www.ncbi.nlm.nih.gov/pubmed/16348545. Palenik, B., and F. M. M. Morel.
"Amino acid utilization by marine phytoplankton: a novel mechanism." Limnology and Oceanography 35, no. 2 (1990): 260-269. Palenik, Brian, and Sarah E. Henson. “The Use of Amides and Other Organic Nitrogen Sources by the Phytoplankton Emiliania Huxleyi.” Limnology and Oceanography, vol. 42, no. 7, 1997, pp. 1544–51. Park, Jeong‐Jin, Hongxia Wang, Mahmoud Gargouri, Rahul R. Deshpande, Jeremy N. Skepper, F. Omar Holguin, Matthew T. Juergens, Yair Shachar‐Hill, Leslie M. Hicks, and David R. Gang. "The response of Chlamydomonas reinhardtii to nitrogen deprivation: a systems biology analysis." The Plant Journal 81, no. 4 (2015): 611-624. Parkin, Gene F., and William F. Owen. “Fundamentals of Anaerobic Digestion of Wastewater Sludges.” Journal of Environmental Engineering, vol. 112, no. 5, 1986, pp. 867–920,


Paul, Carsten, and Georg Pohnert. “Induction of Protease Release of the Resistant Diatom Chaetoceros Didymus in Response to Lytic Enzymes from an Algicidal Bacterium.” PLoS ONE, edited by Vladimir N. Uversky, vol. 8, no. 3, Public Library of Science, 2013, p. e57577. Jack, Donald L., Ian T. Paulsen, and Milton H. Saier. "The amino acid/polyamine/organocation (APC) superfamily of transporters specific for amino acids, polyamines and organocations." Microbiology 146, no. 8 (2000): 1797-1814. Pérez-Vicente, Rafael, Jacobo Cárdenas, and Manuel Pineda. "Distinction between hypoxanthine and xanthine transport in Chlamydomonas reinhardtii." Plant physiology 95, no. 1 (1991): 126-130. Di Giorgio, Juliana Perez, Gabriela Soto, Karina Alleva, Cintia Jozefkowicz, Gabriela Amodeo, Jorge Prometeo Muschietti, and Nicolás Daniel Ayub. "Prediction of aquaporin function by integrating evolutionary and functional analyses." The Journal of membrane biology 247, no. 2 (2014): 107-125. Piedras, P., M. Pineda, J. Munoz, and J. Cardenas. "Purification and characterization of an L-amino-acid oxidase from Chlamydomonas reinhardtii." Planta 188, no. 1 (1992): 13-18. Pineda, Manuel, Emilio Fernández, and Jacobo Cárdenas. "Urate oxidase of Chlamydomonas reinhardii." Physiologia Plantarum 62, no. 3 (1984): 453-457. Pleissner, Daniel, Wan Chi Lam, Zheng Sun, and Carol Sze Ki Lin. "Food waste as nutrient source in heterotrophic microalgae cultivation." Bioresource technology 137 (2013): 139-146. Polle, Juergen EW, Kerrie Barry, John Cushman, Jeremy Schmutz, Duc Tran, Leyla T. Hathwaik, Won C. Yim et al. "Draft Nuclear Genome Sequence of the Halophilic and Beta-Carotene-Accumulating Green Alga Dunaliella salina Strain CCAP19/18." Genome announcements 5, no. 43 (2017): e01105-17. Pollegioni, Loredano, Paolo Motta, and Gianluca Molla. "L-Amino acid oxidase as biocatalyst: a dream too far?" Applied microbiology and biotechnology 97, no. 21 (2013): 9323-9341. Pratelli, Réjane, and Guillaume Pilot. 
“Regulation of Amino Acid Metabolic Enzymes and Transporters in Plants.” Journal of Experimental Botany, vol. 65, no. 19, 2014, pp. 5535–56. Process of Producing Ammonia. Mar. 1908, https://patents.google.com/patent/US990191A/en. Radakovits, Randor, Robert E. Jinkerson, Susan I. Fuerstenberg, Hongseok Tae, Robert E. Settlage, Jeffrey L. Boore, and Matthew C. Posewitz. "Draft genome sequence and genetic transformation of the oleaginous alga Nannochloropsis gaditana." Nature communications 3 (2012): 686. Rasala, Beth A., and Stephen P. Mayfield. “Photosynthetic Biomanufacturing in Green Algae; Production of Recombinant Proteins for Industrial, Nutritional, and Medical Uses.” Photosynthesis Research, vol. 123, no. 3, Springer Netherlands, 2015, pp. 227–39.


Raven, John A., John Beardall, Kevin J. Flynn, and Stephen C. Maberly. "Phagotrophy in the origins of photosynthesis in eukaryotes and as a complementary mode of nutrition in phototrophs: relation to Darwin's insectivorous plants." Journal of experimental botany 60, no. 14 (2009): 3975-3987. Reay, Dave S., Eric A. Davidson, Keith A. Smith, Pete Smith, Jerry M. Melillo, Frank Dentener, and Paul J. Crutzen. "Global agriculture and nitrous oxide emissions." Nature climate change 2, no. 6 (2012): 410. Rentsch, Doris, Susanne Schmidt, and Mechthild Tegeder. "Transporters for uptake and allocation of organic nitrogen compounds in plants." FEBS letters 581, no. 12 (2007): 2281- 2289. Sakaguchi, Toshiro, Kensuke Nakajima, and Yusuke Matsuda. "Identification of the UMP synthase gene by establishment of uracil auxotrophic mutants and the phenotypic complementation system in the marine diatom, Phaeodactylum tricornutum." Plant physiology (2011): pp-110. Sauer, Norbert, Ewald Komor, and Widmar Tanner. "Regulation and characterization of two inducible amino-acid transport systems in Chlorella vulgaris." Planta 159, no. 5 (1983): 404- 410. Schein, Jessica R., Kevin A. Hunt, Janet A. Minton, Neil P. Schultes, and George S. Mourad. "The nucleobase cation symporter 1 of Chlamydomonas reinhardtii and that of the evolutionarily distant Arabidopsis thaliana display parallel function and establish a plant- specific solute transport profile." Plant physiology and biochemistry 70 (2013): 52-60. Schmollinger, Stefan, Timo Mühlhaus, Nanette R. Boyle, Ian K. Blaby, David Casero, Tabea Mettler, Jeffrey L. Moseley et al. "Nitrogen-sparing mechanisms in Chlamydomonas affect the transcriptome, the proteome, and photosynthetic metabolism." The Plant Cell 26, no. 4 (2014): 1410-1435. Selvaratnam, T., H. Reddy, Tapaswy Muppaneni, F. O. Holguin, N. Nirmalakhandan, Peter J. Lammers, and S. Deng. 
"Optimizing energy yields from nutrient recycling using sequential hydrothermal liquefaction with Galdieria sulphuraria." Algal Research 12 (2015): 74-79. Sharma, Kalpesh K., Sourabh Garg, Yan Li, Ali Malekizadeh, and Peer M. Schenk. "Critical analysis of current microalgae dewatering techniques." Biofuels 4, no. 4 (2013): 397-407. Shcherbak, Iurii, Neville Millar, and G. Philip Robertson. "Global metaanalysis of the nonlinear response of soil nitrous oxide (N2O) emissions to fertilizer nitrogen." Proceedings of the National Academy of Sciences 111, no. 25 (2014): 9199-9204. Shuping, Zou, Wu Yulong, Yang Mingde, Imdad Kaleem, Li Chun, and Junmao Tong. "Production and characterization of bio-oil from hydrothermal liquefaction of microalgae Dunaliella tertiolecta cake." Energy 35, no. 12 (2010): 5406-5411. Sipler, Rachel E., and Deborah A. Bronk. “Dynamics of Dissolved Organic Nitrogen.” Biogeochemistry of Marine Dissolved Organic Matter, Elsevier, 2015, pp. 127–232.


Slocombe, Stephen P., QianYi Zhang, Michael Ross, Avril Anderson, Naomi J. Thomas, Ángela Lapresa, Cecilia Rad-Menéndez et al. "Unlocking nature’s treasure-chest: screening for oleaginous algae." Scientific reports 5 (2015): 9844. Smil, Vaclav. Enriching the Earth: Fritz Haber, Carl Bosch, and the Transformation of World Food Production. MIT Press, 2001. Snyder, C. S., T. W. Bruulsema, T. L. Jensen, and P. E. Fixen. "Review of greenhouse gas emissions from crop production systems and fertilizer management effects." Agriculture, Ecosystems & Environment 133, no. 3-4 (2009): 247-266. Solomonson, L. P., and M. J. Barber. “Assimilatory Nitrate Reductase: Functional Properties and Regulation.” Annual Review of Plant Physiology and Plant Molecular Biology, vol. 41, no. 1, Annual Reviews 4139 El Camino Way, P.O. Box 10139, Palo Alto, CA 94303-0139, USA, 1990, pp. 225–53. Song, Ting, Qiang Gao, Zhengkai Xu, and Rentao Song. "The cloning and characterization of two ammonium transporters in the salt-resistant green alga, Dunaliella viridis." Molecular biology reports 38, no. 7 (2011): 4797-4804. Specht, Elizabeth A., and Stephen P. Mayfield. “Algae-Based Oral Recombinant Vaccines.” Frontiers in Microbiology, vol. 5, Frontiers, 2014, p. 60. Srivastava, H. S., and Rana P. Singh. “Role and Regulation of L-Glutamate Dehydrogenase Activity in Higher Plants.” Phytochemistry, vol. 26, no. 3, Pergamon, 1987, pp. 597–610. Stucki, Samuel, Frédéric Vogel, Christian Ludwig, Anca G. Haiduc, and Martin Brandenberger. "Catalytic gasification of algae in supercritical water for biofuel production and carbon capture." Energy & Environmental Science 2, no. 5 (2009): 535-541. Talbot, Caleb, Jose Garcia-Moscoso, Hannah Drake, Ben J. Stuart, and Sandeep Kumar. "Cultivation of microalgae using flash hydrolysis nutrient recycle." Algal research 18 (2016): 191-197. Tegeder, Mechthild, and Céline Masclaux-Daubresse. “Source and Sink Mechanisms of Nitrogen Transport and Use.” New Phytologist, vol. 
217, no. 1, 2018, pp. 35–53, Tegeder, Mechthild, and Doris Rentsch. “Uptake and Partitioning of Amino Acids and Peptides.” Molecular Plant, vol. 3, no. 6, Cell Press, 2010, pp. 997–1011. Tegeder, Mechthild, and John M. Ward. “Molecular Evolution of Plant AAP and LHT Amino Acid Transporters.” Frontiers in Plant Science, vol. 3, 2012, p. 21. Toha, J., M. A. Soto, and X. Cuadros. "Waste water treatment using saline cultures of microalgae." Biotechnology techniques 4, no. 6 (1990): 441-444. Toor, Saqib Sohail, Lasse Rosendahl, and Andreas Rudolf. "Hydrothermal liquefaction of biomass: a review of subcritical water technologies." Energy 36, no. 5 (2011): 2328-2342. Tsay, Yi-Fang, Chi-Chou Chiu, Chyn-Bey Tsai, Cheng-Hsun Ho, and Po-Kai Hsu. "Nitrate transporters and peptide transporters." FEBS letters 581, no. 12 (2007): 2290-2300.


Uduman, Nyomi, Ying Qi, Michael K. Danquah, Gareth M. Forde, and Andrew Hoadley. "Dewatering of microalgal cultures: a major bottleneck to algae-based fuels." Journal of renewable and sustainable energy 2, no. 1 (2010): 012701. US EPA, OAR. Nitrogen Dioxide (NO2) Pollution. https://www.epa.gov/no2-pollution. Accessed 24 Feb. 2018. Vallon, Olivier, Laurence Bulte, Richard Kuras, Jacqueline Olive, and Françis‐André Wollman. "Extensive accumulation of an extracellular l‐amino‐acid oxidase during gametogenesis of Chlamydomonas reinhardtii." The FEBS Journal 215, no. 2 (1993): 351-360. Vardi, Assaf, Doron Eisenstadt, Omer Murik, Ilana Berman‐Frank, Tamar Zohary, Alex Levine, and Aaron Kaplan. "Synchronization of cell death in a dinoflagellate population is mediated by an excreted thiol protease." Environmental microbiology 9, no. 2 (2007): 360-369. Vergara-Fernández, Alberto, Gisela Vargas, Nelson Alarcón, and Antonio Velasco. "Evaluation of marine algae as a source of biogas in a two-stage anaerobic reactor system." Biomass and Bioenergy 32, no. 4 (2008): 338-344. Wang, Kaige, and Robert C. Brown. “Catalytic Pyrolysis of Microalgae for Production of Aromatics and Ammonia.” Green Chemistry, vol. 15, no. 3, Royal Society of Chemistry, 2013, p. 675. Wang, Meng, Wenqiao Yuan, Xiaoning Jiang, Yun Jing, and Zhuochen Wang. "Disruption of microalgal cells using high-frequency focused ultrasound." Bioresource technology 153 (2014): 315-321. Ward, A. J., D. M. Lewis, and F. B. Green. "Anaerobic digestion of algae biomass: a review." Algal Research 5 (2014): 204-214. Witte, Claus-Peter. “Urea Metabolism in Plants.” Plant Science, vol. 180, no. 3, Elsevier, 2011, pp. 431–38. Wong, Foon H., Jonathan S. Chen, Vamsee Reddy, Jonathan L. Day, Maksim A. Shlykov, Steven T. Wakabayashi, and Milton H. Saier Jr. "The amino acid-polyamine-organocation superfamily." Journal of molecular microbiology and biotechnology 22, no. 2 (2012): 105- 113. Wood, S. W., and Annette Cowie. 
"A review of greenhouse gas emission factors for fertiliser production." (2004). Yamano, Takashi, Emi Sato, Hiro Iguchi, Yuri Fukuda, and Hideya Fukuzawa. "Characterization of cooperative bicarbonate uptake into chloroplast stroma in the green alga Chlamydomonas reinhardtii." Proceedings of the National Academy of Sciences 112, no. 23 (2015): 7315- 7320. Yan, Dong, Junbiao Dai, and Qingyu Wu. "Characterization of an ammonium transporter in the oleaginous alga Chlorella protothecoides." Applied microbiology and biotechnology 97, no. 2 (2013): 919-928.


Yu, Guo, Yuanhui Zhang, Lance Schideman, Ted Funk, and Zhichao Wang. "Distributions of carbon and nitrogen in the products from hydrothermal liquefaction of low-lipid microalgae." Energy & Environmental Science 4, no. 11 (2011): 4587-4595. Yu, Zhiliang, and Hua Qiao. “Advances in Non-Snake Venom L-Amino Acid Oxidase.” Applied Biochemistry and Biotechnology, vol. 167, no. 1, Springer-Verlag, 2012, pp. 1–13. Zhang, Jingxin, Yaobin Zhang, and Xie Quan. "Electricity assisted anaerobic treatment of salinity wastewater and its effects on microbial communities." Water research 46, no. 11 (2012): 3535-3543.


CHAPTER 2: Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth

Contributions Statement:

I authored, edited, and submitted the following text for publication: Murphree, Colin A., et al. "Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth." Frontiers in Plant Science 8 (2017): 847. I carried out growth experiments and metabolite quantifications, and developed modified methods to quantify carbohydrate and ammonia concentrations in the media used for Dunaliella cultivation. I worked with Chengsong Zhao and Dr. Guillaume Pilot to grow cultures and plan radioactivity uptake experiments that I participated in carrying out and analyzing. Together with Dr. Jeff Macdonald, Andrey Tikunov, and Dr. Heike Sederoff, I helped plan the NMR experiment, and performed the growth, collection, and material processing prior to NMR data collection and analysis.


S1: Growth of D. viridis dumsii on Ribonucleosides and Nucleobases as sole N-source. The total cell density of cultures grown for 72 hours in mBA -N containing each of the above metabolites at the indicated concentrations. Average cell density was measured from four biological replicates. Error bars represent one standard deviation. Significant growth relative to the mBA -N control was assessed using a two-tailed paired Student’s t-test at p ≤ 0.05 (*), p ≤ 0.01 (**), and p ≤ 0.001 (***).


Supplemental Methods

Supplemental Methods 1: Cleaning Procedure to Isolate Pure Culture

The Dunaliella strains used in this study (Dunaliella viridis dumsii, D. tertiolecta CCMP 364, D. primolecta LB 1000, D. salina 19/18) were isolated as pure culture using a modification of a protocol obtained from UTEX (https://utex.org/). Briefly, 50 ml cultures were maintained under autotrophic growth conditions (Methods 2.1) and pelleted at 300 × g for 3 minutes. Pellets were suspended in 1 ml of mBA, transferred to a new tube, and brought to 50 ml with mBA. The resulting cultures were sonicated for 10 seconds. The centrifugation through sonication steps were repeated a total of seven times. Carbenicillin, kanamycin, rifampicin, and spectinomycin (100 µg/ml each) were added to the resulting 50 ml cultures, which were then grown for 72 hours under autotrophic conditions. Recovering cultures were serially diluted to 100,000 cells per ml, and 200 µl of the resulting dilution was plated onto mBA 1% phytoblend agar plates containing 100 µg/ml each of carbenicillin, kanamycin, rifampicin, and spectinomycin. These plates were grown under autotrophic conditions for at least two weeks, and the resulting individual colonies were transferred to new antibiotic-containing plates. Individual colonies on these subsequent plates were used to inoculate pure cultures. The presence or absence of contamination was established by PCR targeting 16S rRNA.


CHAPTER 3: Metabolic and Transcriptional Profiles of Dunaliella viridis Supplemented with Ammonium derived from Glutamine

Contributions statement

My participation in the following work was co-inception of the idea to sequence Dunaliella in response to amino acid feeding, helping to plan the growth and collection of metabolites and genetic material for the sequencing experiment, and partially collecting and analyzing data for Figures 3.1-3.7. In particular, Jacob Dums and I worked together to plan a growth experiment involving the feeding of glutamine to Dunaliella as a nitrogen source. We both contributed to preparing materials and seed cultures that were necessary for the growth experiment. I specifically collected and analyzed total carbohydrate, protein, starch, and neutral lipid data presented in this study. I also developed the pipeline used for processing Dunaliella biomass into RNA appropriate for RNA sequencing using ribosomal depletion.


Metabolic and Transcriptional Profiles of Dunaliella viridis Supplemented with Ammonium derived from Glutamine

Jacob Dums1, Colin Murphree1, Naresh Vasani1,a, Danielle Young1,b, Heike Sederoff1*

1 Department of Plant and Microbial Biology, North Carolina State University, Raleigh, North Carolina, United States

* Correspondence: Dr. Heike Sederoff [email protected]

a Current Address: Memorial Sloan Kettering Cancer Center, New York City, NY, United States

b Current Address: Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States

Keywords: Dunaliella, nitrogen starvation, ammonium, glutamine, biofuel, recycling.

Abstract

Algal biofuel production requires an input of synthetic nitrogen fertilizer. Fertilizer is synthesized via the Haber-Bosch process, which produces CO2 as a waste byproduct and represents a significant financial and energy investment. Reliance on synthetic fertilizer reduces the environmental benefit and economic viability of algae production systems. To lower fertilizer input, the waste streams of algal production systems can be recycled to provide alternative sources of nitrogen, such as amino acids, to the algae. The halophytic green alga Dunaliella viridis can use ammonium (NH4+) derived from the abiotic degradation of amino acids. Supplementation of NH4+ from glutamine degradation supports acceptable levels of growth and increased neutral lipid production compared to nitrate. To understand the effect of glutamine-released NH4+ on algae growth and physiology, we analyzed metabolite levels, growth parameters, and transcript profiles of D. viridis cultures in a time course after transition from medium containing nitrate as the sole N source to medium containing glutamine, glutamate, or an N-depleted medium. Growth parameters were similar between glutamine (NH4+) and nitrate supplemented cultures; however, metabolite data showed that the glutamine supplemented cultures (NH4+) more closely resembled cultures under nitrogen starvation (N-depleted and glutamate supplementation). Neutral lipid accumulation was the same in nitrate and glutamine-derived NH4+ cultures. However, glutamine-derived NH4+ caused a transcriptional response in the immediate hours after inoculation of the culture. The strong initial response of cultures to NH4+ changed over the course of days to closely resemble that of nitrogen starvation. These observations suggest that release of NH4+ from glutamine was sufficient to maintain growth, but not high enough to trigger a cell transition to a nitrogen replete state. Comparison of nitrogen starved cultures and the nitrate positive control indicates downregulation of fatty acid synthesis and mixed regulation of the TCA cycle under nitrogen starvation, which also coincides with accumulation of more starch than neutral lipids. Our results indicate that a continuous, amino acid derived slow release of NH4+ to algae cultures could reduce the amount of synthetic nitrogen needed for growth.


Introduction

Most nitrogen fertilizer used in agriculture and industrial algae production is produced by the Haber-Bosch process, which converts N2 from air into ammonia using high pressure and temperature. Hydrogen for the Haber-Bosch process originates from methane, resulting in the release of CO2 (Huo et al., 2012). Approximately 2.5% of the world’s energy output is used for the Haber-Bosch process. Reducing the use of synthetic nitrogen fertilizer would lower the environmental impact and improve the feasibility of algae production.

After oil extraction, spent algae biomass contains nitrogen that can be recycled. The major nitrogen-containing components in this biomass are proteins and polynucleotides. Digestion of the proteins can provide significant amounts of amino acids. However, some oil producing algae are selective in their ability to grow on amino acids as a nitrogen source (Huo et al., 2012; Murphree et al., 2017). We have previously shown that Dunaliella viridis is unable to take up free amino acids from aqueous media, with the exception of histidine. However, D. viridis can utilize NH4+ released from abiotic degradation of glutamine, tryptophan, and cysteine. Histidine-supplemented growth medium supports algae growth at lower rates than nitrate fertilizer controls, while glutamine supplementation of algae cultures enables growth comparable to nitrate fertilizer controls with a relatively higher lipid content in the cells (Murphree et al., 2017). The ability to provide NH4+ derived from deaminated amino acids like glutamine, tryptophan, and cysteine can contribute to the recycling of organic nitrogen from the culture and reduce the input of synthetic nitrogen fertilizer. D. viridis, like other Dunaliella species, is sensitive to ammonia toxicity (Gutierrez et al., 2016; Murphree et al., 2017) and shows growth inhibition and cell death at concentrations above 1 mM (Murphree et al., 2017). Therefore, using amino acid supplementation to slowly release ammonium in culture is an amendment that could limit ammonia toxicity.

Dunaliella viridis is a halophytic green alga that can accumulate significant amounts of triacylglycerides (TAG) for biofuel production. Extraction of TAGs can be achieved by osmotic shock (Wang et al., 2013). Dunaliella spp. are a desirable feedstock because they lack a cell wall and have a short generation time (>1 division per day), low freshwater requirements, and inducible lipid and starch production (Hosseini Tafreshi and Shariati, 2009; Srirangan et al., 2015). Dunaliella spp. have been physiologically characterized for years (Oren, 2005; Ben-Amotz et al., 2009); however, only recently have there been attempts to explore the transcriptomes of these algae under different environmental conditions such as nitrogen starvation (Shin et al., 2015; Tan et al., 2016), salinity change (Fang et al., 2017), temperature (Srirangan et al., 2015), light (Srirangan et al., 2015; Shin et al., 2015), and heavy metals (Puente-Sanchez et al., 2016). Other transcriptomic studies have been used to explore gene content, but not transcriptional regulation (Rismani-Yazdi et al., 2011; Keeling et al., 2014; Matasci et al., 2014; Hong et al., 2017; Yao et al., 2017). As nitrogen starvation is a key inducer of lipid production in algae, changes in transcriptome profiles in response to changes in N-supply have been investigated in other algae like Chlamydomonas or Nannochloropsis (Vieler et al., 2012; Merchant et al., 2012; Alipanah et al., 2015). Dunaliella spp. transcriptomes have only been characterized in response to NO3-, but not NH4+ (Shin et al., 2015; Tan et al., 2016). We have shown previously that glutamine degrades over time into pyroglutamate and NH4+ and can support favorable growth and neutral lipid production in D. viridis (Murphree et al., 2017). However, the transcriptional regulation that differentiates NO3- and glutamine-derived ammonium (NH4+) supported growth is unknown.


Our experiments were designed to distinguish physiological, metabolic, and transcriptional changes in D. viridis cultures in response to a transition to different nitrogen sources in the medium, specifically amino acid (glutamine) derived ammonium. Because amino acid-derived ammonium release is slow, the algae can survive and grow on, for example, glutamine-containing medium, while cultures supplemented with the same concentration of ammonium do not survive. We therefore did not include an ammonium control, because it would have been lethal. Algae cultures supplemented with glutamine produced relatively high levels of triacylglycerides (TAGs) due to a starvation response that followed an initial growth response.

Our work suggests that in principle nitrogen from spent biomass could be recycled to reduce the need for synthetic fertilizer in algae biofuel production.


Material and Methods

Growth conditions

Dunaliella viridis strain dumsii was grown in a modified Ben-Amotz medium as reported previously (Murphree et al., 2017). Experiments were performed in a Percival E-36L growth chamber under continuous cool white fluorescent light (125 µmol photons m⁻² s⁻¹) at a temperature of 23.5 °C. Four experimental media were created with nitrogen sources of 5 mM KNO3, 5 mM L-glutamine (Sigma-Aldrich, Raleigh, NC), 5 mM L-glutamate (Sigma-Aldrich, Raleigh, NC), or no addition of KNO3 (Figure 3.1). The L-glutamate condition was used to control for nitrogen starvation in the presence of an unused and stable amino acid (Murphree et al., 2017). Cultures for inoculation were grown for a week in experimental conditions and were harvested by centrifugation (3500 × g, 6 min) in mid-logarithmic phase. These cells were suspended in 450 ml of the proper experimental media in 1 liter narrow-necked Erlenmeyer flasks to a concentration of 1 million cells/ml. Each of the 4 experimental cultures was duplicated to give 2 technical replicates and the entire experiment was repeated 3 times to provide 3 biological replicates.

Samples were taken at 3 hours, 27 hours, and 51 hours after inoculation. In addition to the samples taken from the experimental cultures, half volume samples were taken in duplicate from the inoculum culture right before inoculation. Sample collection consisted of aliquots taken for cell density/diameter/pH measurements (1 ml), RNA extraction (30 ml), neutral lipid fluorescence assay (5 ml), carbohydrate/starch detection (100 ml), soluble protein assay (5 ml), and chlorophyll content measurement (1 ml) (Figure 3.1). The aliquots for RNA, carbohydrate/starch, and protein measurements were pelleted and flash frozen in liquid nitrogen for bulk processing. Cell density/diameter/pH, neutral lipid fluorescence assay, and chlorophyll content samples were processed immediately using the following protocols:

Cell density/diameter/pH measurements

Cell density and diameter were determined using a TC10 Automated Cell Counter (Bio-Rad, Hercules, CA). Cell diameters were calculated from histogram data provided by the counter. Cells were grouped into 2 µm wide bins and an approximate average cell diameter was calculated from those data. The pH was measured with a Glass Semi-Micro Combination pH Electrode (Beckman Coulter, Brea, CA) from the 1 ml aliquot used for cell density measurements.
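The averaging of binned diameters can be illustrated with a short sketch. It is not the counter's own calculation: it assumes that each 2 µm bin is represented by its midpoint and that counts are listed in order of increasing diameter. The function name, the starting diameter of the first bin, and the example counts are hypothetical.

```python
def mean_diameter_from_bins(bin_counts, bin_width_um=2.0, first_bin_start_um=0.0):
    """Approximate mean cell diameter (µm) from counts in equal-width bins.

    Assumption of this sketch: every cell in a bin has the bin's midpoint
    diameter. bin_counts is ordered from the smallest to the largest bin.
    """
    total_cells = sum(bin_counts)
    if total_cells == 0:
        raise ValueError("no cells counted")
    weighted_sum = 0.0
    for index, count in enumerate(bin_counts):
        midpoint = first_bin_start_um + (index + 0.5) * bin_width_um
        weighted_sum += midpoint * count
    return weighted_sum / total_cells


# Hypothetical histogram with bins 4-6, 6-8, 8-10, and 10-12 µm.
example_counts = [10, 40, 40, 10]
example_mean = mean_diameter_from_bins(example_counts, first_bin_start_um=4.0)
```

A midpoint-weighted mean is only as precise as the bin width allows, which is consistent with the text describing the result as an approximate average.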

Chlorophyll

Chlorophyll content was quantified as described in (Srirangan et al., 2015). A 1 ml aliquot of cells was pelleted and 0.75 ml of supernatant was removed. 1 ml of 100% ethanol was added to the 250 μl pellet and the samples were vortexed. After incubating at room temperature for 10 min, the samples were vortexed again and centrifuged for 10 min at 13000 rpm. The absorbance of the supernatant (1 ml) was read at 652 nm in a BioSpectrometer basic (Eppendorf, Hamburg, Germany). The total chlorophyll concentration (in mg/ml) was calculated as A652nm/36, where 36 is the extinction coefficient for chlorophyll a and b in ethanol (Winter, 1993).
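The A652nm/36 calculation can be sketched as follows. The concentration formula is taken from the text; the per-cell normalization, the extract volume, and the example cell number are assumptions added for illustration and are not stated in the methods.

```python
EXTINCTION_COEFFICIENT = 36.0  # from the text: chlorophyll a + b in ethanol


def total_chlorophyll_mg_per_ml(a652):
    """Total chlorophyll concentration of the extract, in mg/ml."""
    if a652 < 0:
        raise ValueError("absorbance must be non-negative")
    return a652 / EXTINCTION_COEFFICIENT


def chlorophyll_pg_per_cell(a652, extract_volume_ml, cells_extracted):
    """Hypothetical normalization: chlorophyll per cell in picograms.

    extract_volume_ml and cells_extracted are assumptions of this sketch.
    """
    total_mg = total_chlorophyll_mg_per_ml(a652) * extract_volume_ml
    return total_mg * 1e9 / cells_extracted  # 1 mg = 1e9 pg


# Hypothetical reading: A652 = 0.36, 1.25 ml of extract, 2 million cells.
example_pg_per_cell = chlorophyll_pg_per_cell(0.36, 1.25, 2_000_000)
```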


Soluble protein

Frozen cell pellets from the 5 ml aliquots were suspended in 1 ml of deionized water and soluble protein was extracted as described previously (Murphree et al., 2017). Briefly, frozen cells suspended in water were pelleted and the supernatant removed. The pellet was re-extracted with 1 ml of 0.1 N NaOH, incubated at room temperature with intermittent vortexing for 60 minutes, and centrifuged, and the supernatants were combined. Protein was quantified using the Micro BCA™ Protein Assay Kit (Cat#23235, ThermoFisher, Waltham, MA) on a plate reader (Synergy HT, Biotek, Winooski, VT) (Srirangan et al., 2015). Protein concentrations were calculated using linear regression analysis against a bovine serum albumin standard curve ranging from 0 to 200 µg/ml.
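A standard-curve calculation of this kind can be sketched with an ordinary least-squares fit. This is an illustration only: the absorbance values are invented, the function names are hypothetical, and the plate reader software or spreadsheet actually used may have handled blanks and replicates differently.

```python
def fit_line(concentrations, absorbances):
    """Ordinary least-squares fit of absorbance = slope * conc + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired standards")
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate concentration (µg/ml)."""
    return (absorbance - intercept) / slope


# Hypothetical BSA standards spanning the 0 to 200 µg/ml range in the text.
bsa_ug_per_ml = [0, 50, 100, 150, 200]
bsa_absorbance = [0.05, 0.25, 0.45, 0.65, 0.85]  # invented readings
slope, intercept = fit_line(bsa_ug_per_ml, bsa_absorbance)
sample_ug_per_ml = concentration_from_absorbance(0.45, slope, intercept)
```

Samples reading above the highest standard would fall outside the fitted range and would normally be diluted and re-read instead of extrapolated.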

Total carbohydrate and starch

Total carbohydrates and starch content were quantified from 100 ml frozen pellets which were suspended in 1 ml of water. For total carbohydrates, the pellet was diluted 1:100 with water before carbohydrates were extracted and quantified using a microplate-adapted version of the Dubois method as described in (Murphree et al., 2017). Briefly, 40 µl of cell dilutions were placed in 96-well polystyrene plates (Genesee Scientific, Morrisville, NC) and then 40 µl of 5% w/v crystalline phenol/H2O was added and mixed. Plates were incubated for 15 minutes and then 200 µl of 95–98% H2SO4 was mixed into the samples. Absorption of the samples at 490 nm was detected in a plate reader (Synergy HT, Biotek, Winooski, VT). All samples were prepared in triplicate and a standard curve of 5 to 500 µg/ml sucrose in H2O was used to calculate total carbohydrate content (Murphree et al., 2017).

Starch extraction and quantification was performed on the full pellet as described in (Srirangan et al., 2015). Briefly, pellets suspended in 1 ml of H2O were extracted with 2:1, v/v methanol:chloroform for 75 min with occasional vortexing. Samples were pelleted, the supernatant discarded, and the pellet resuspended in extraction buffer. Then, samples were vortexed, pelleted, and the organic phase removed. After another round of extraction, the pellet was dried and solubilized with 0.2 M KOH. Glacial acetic acid was used to adjust the pH and samples were sonicated in a water bath. Samples were enzymatically hydrolyzed to glucose using amylase and amyloglucosidase (Sigma-Aldrich, Raleigh, NC). Glucose was detected by the production of NADPH at an absorbance of 334 nm. All samples were prepared in triplicate. Starch quantification was standardized using purified potato starch (Sigma-Aldrich, Raleigh, NC) (Srirangan et al., 2015).

Neutral lipids

Neutral lipids were detected in whole cells using a Nile Red dye-based assay as described in (Murphree et al., 2017). Briefly, 5 ml culture samples were pelleted and suspended in 600 µl of fresh modified Ben-Amotz media. Samples were divided into 200 µl triplicates in 96-well polystyrene plates (Genesee Scientific, Morrisville, NC) and Nile Red (9-diethylamino-5H-benzo(α)phenoxazine-5-one, Sigma-Aldrich, Raleigh, NC) dye was added to a final concentration of 0.26 µM. Samples were incubated at room temperature in the dark for 15 minutes and then excited at 485 nm and detected at 590 nm in a plate reader (Synergy HT, Biotek, Winooski, VT). Coconut oil standards of 5–100 µg/ml were used to calculate neutral lipid concentrations (Murphree et al., 2017).

Reference transcriptome annotation

Original transcriptome annotations for D. viridis strain dumsii were from a 2012 Blast2GO annotation run (Srirangan et al., 2015). Due to the advancements in the number of genomes sequenced and the expansion of annotation databases, the reference transcriptome annotation was out-of-date. The newly released Dunaliella salina CCAP 19/18 genome, available through Phytozome (Polle et al., 2017), was used to re-annotate the transcriptome based upon similarity of D. viridis transcripts to D. salina proteins. The D. salina proteome was obtained from Phytozome on June 11, 2017 and converted to a Blast database within the CLC Genomics Workbench version 10.0 (https://www.qiagenbioinformatics.com/). Blastx was used to extract the best protein hit for each D. viridis transcript using the following parameters: maximum number of blast hits = 1; e-value cutoff = 10⁻¹; low complexity filter = off. D. salina annotations were then ascribed to the D. viridis transcripts. Previous manual annotations were kept as they had already been confirmed. This rough annotation increased the number of annotated genes in D. viridis strain dumsii from 9400 to 13273.
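The annotation transfer described above, with manual annotations taking precedence, can be sketched as a dictionary merge. This is a hypothetical illustration and not the CLC Genomics Workbench workflow: the data structures, identifiers, and function name are invented, and the cutoff of 0.1 assumes that the stated e-value cutoff is to be read as ten to the power of minus one.

```python
def transfer_annotations(best_hits, salina_annotations, manual_annotations,
                         evalue_cutoff=0.1):
    """Assign D. salina annotations to D. viridis transcripts.

    best_hits: transcript_id -> (protein_id, evalue), one best hit each.
    salina_annotations: protein_id -> annotation text.
    manual_annotations: transcript_id -> curated annotation (kept as-is).
    All three inputs are hypothetical structures for this sketch.
    """
    merged = {}
    for transcript_id, (protein_id, evalue) in best_hits.items():
        if evalue <= evalue_cutoff and protein_id in salina_annotations:
            merged[transcript_id] = salina_annotations[protein_id]
    # Previously confirmed manual annotations override transferred ones.
    merged.update(manual_annotations)
    return merged


# Invented example identifiers and annotations.
example_hits = {"t1": ("p1", 1e-30), "t2": ("p2", 0.5), "t3": ("p3", 1e-5)}
example_salina = {"p1": "nitrate reductase", "p2": "unknown protein",
                  "p3": "glutamine synthetase"}
example_manual = {"t3": "curated: glutamine synthetase 2"}
example_result = transfer_annotations(example_hits, example_salina,
                                      example_manual)
```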

RNA extraction, sequencing, and quality control

Total RNA was extracted from the 30 ml frozen pellets of the 39 samples using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) following standard procedures for use with a QIAcube (Qiagen). DNA was removed from total RNA using RNase-Free DNase (Qiagen), and rRNA was depleted from the samples using the Ribo-Zero™ Magnetic Kit (Plant Leaf) (Cat. No. MRZPL1224, Epicentre, Charlotte, NC). After each processing step, RNA quality was confirmed using a Bioanalyzer RNA Pico Chip (Agilent, Santa Clara, CA). Strand specific cDNA libraries with 300–450 bp inserts were generated with the NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina® (New England BioLabs, Ipswich, MA). Sample libraries were barcoded using the TruSeq DNA LT Sample Prep Kit, Indexed Adapter Sequences (Illumina, San Diego, CA). Final quality of the libraries was confirmed using a Bioanalyzer DNA high sensitivity chip (Agilent, Santa Clara, CA). Libraries were split into 2 groups and submitted for single end 125 bp Illumina HiSeq 2500 sequencing runs in 2 flow cells at the North Carolina State University Genome Sciences Laboratory (GSL). Quality control, filtering, and trimming were performed using FastQC and the FASTX-Toolkit and resulted in an average loss of 3% of reads.

Mapping reads

Reads from each sample were mapped to the previously de novo assembled reference transcriptome of Dunaliella viridis strain dumsii (Srirangan et al., 2015) using the RNA-Seq Analysis module of the CLC Genomics Workbench version 10.0 (https://www.qiagenbioinformatics.com/). The reference transcriptome was modified to eliminate seven contaminating transcripts from tomato, and six transcripts were trimmed or eliminated due to previous adapter/index contamination in the original assembly. The RNA-Seq Analysis module, which uses an expectation-maximization (EM) algorithm like RSEM (B. Li and Dewey, 2011), was run with default parameters (mismatch cost = 2; insertion cost = 3; deletion cost = 3; length fraction = 0.8; similarity fraction = 0.8). Since putative alternate splice transcripts are represented individually in the reference transcriptome, the “maximum number of hits for a read” was increased to 30 so that reads that matched multiple transcripts were not discarded. As these putative transcripts are only predicted and not verified for D. viridis, gene counts are preferable to transcript differences. To obtain gene specific data, raw counts for each transcript were exported from the CLC Genomics Workbench, and the counts for each putative transcript of each gene were summed to give a total raw read count per gene. The total gene counts were then imported back into CLC Genomics Workbench for differential expression analysis using the Microarray Tools in CLC.
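The summation of transcript counts into gene counts can be sketched as follows. The transcript-to-gene mapping, the identifiers, and the function name are hypothetical; the actual export format from the CLC Genomics Workbench is not described in the text.

```python
from collections import defaultdict


def sum_counts_per_gene(transcript_counts, transcript_to_gene):
    """Sum raw read counts of all putative transcripts of each gene.

    transcript_counts: transcript_id -> raw read count for one sample.
    transcript_to_gene: transcript_id -> gene_id (hypothetical mapping).
    A transcript without a mapping is treated as its own gene, which is
    an assumption of this sketch.
    """
    gene_counts = defaultdict(int)
    for transcript_id, count in transcript_counts.items():
        gene_id = transcript_to_gene.get(transcript_id, transcript_id)
        gene_counts[gene_id] += count
    return dict(gene_counts)


# Invented example: two putative splice variants of geneA, one of geneB.
example_counts = {"geneA.t1": 120, "geneA.t2": 30, "geneB.t1": 75}
example_map = {"geneA.t1": "geneA", "geneA.t2": "geneA", "geneB.t1": "geneB"}
example_gene_counts = sum_counts_per_gene(example_counts, example_map)
```

Summing raw counts before differential expression analysis keeps the input on the count scale that count-based tests expect.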

Differential expression (DE) analysis

Differential expression analysis was used to determine significant transcriptional differences between the four nitrogen sources (NO3, STAR, GLU, GLN) at each of the three time points (3 h, 27 h, and 51 h). Comparisons were also made between the time points within each nitrogen source to determine transcriptional differences over time for a single nitrogen source. All samples were also compared to the inoculum culture (INOC) to provide a common comparison point (Figure 3.1). Differential expression analysis was performed using the “Empirical analysis of DGE” tool in CLC, which utilizes the edgeR version 3.4.0 implementation of Fisher’s Exact Test for two-group comparisons (Robinson et al., 2010). Only genes that had a total number of read counts >5 within each comparison group were used to estimate tagwise dispersions, and the counts from biological replicates were averaged. Initial analysis showed that none of the comparisons between the GLU and STAR conditions were statistically different, and thus all comparisons were regenerated without GLU data. Differential expression was considered significant based on the following parameters: False Discovery Rate (FDR) corrected p-value ≤ 0.05, |log2 fold change| ≥ 1, and total read count for the gene within the comparison > 75. Raw read files and raw counts are available through the Gene Expression Omnibus accession GSE111548.
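The three significance criteria can be expressed as a single filter. The thresholds come from the text; the function name and argument layout are hypothetical, and the fold-change criterion is read here as an absolute log2 fold change of at least 1.

```python
def is_significant(fdr_p_value, log2_fold_change, total_read_count,
                   fdr_cutoff=0.05, min_abs_log2_fc=1.0, min_total_reads=75):
    """Return True when a gene passes all three criteria in a comparison.

    Criteria: FDR-corrected p-value at or below the cutoff, absolute log2
    fold change at or above the minimum, and total read count strictly
    above the read threshold.
    """
    return (fdr_p_value <= fdr_cutoff
            and abs(log2_fold_change) >= min_abs_log2_fc
            and total_read_count > min_total_reads)


# Invented examples: a strongly downregulated gene and a low-count gene.
example_down = is_significant(0.01, -1.5, 200)
example_low_count = is_significant(0.01, 2.0, 75)
```

Using the absolute value treats up- and downregulation symmetrically, so a gene with a log2 fold change of -1.5 passes the same test as one with +1.5.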

Pathway analysis

The differential expression of key pathways was examined to determine any key regulatory points. The genes for lipid synthesis, triacylglycerol synthesis, and starch metabolism pathways were extracted from previous work (Srirangan et al., 2015). Genes for nitrogen assimilation, the Calvin-Benson Cycle, and the Krebs Cycle were extracted using KEGG Orthology (KO) terms for those respective pathways (Kanehisa et al., 2017). Missing genes were manually obtained from the D. viridis transcriptome (Srirangan et al., 2015) using Chlamydomonas reinhardtii orthologs from the KEGG Orthology Database (November 27, 2017). Putative D. viridis mRNA sequences were manually translated with the ExPASy translate tool (Artimo et al., 2012) and submitted for KO annotation using BlastKOALA (Kanehisa et al., 2016) to confirm that they were appropriate orthologs.


Results

As had been predicted from previous work (Murphree et al., 2017), cultures supplemented with glutamate as the sole nitrogen source showed the same physiological, metabolic, and transcriptome phenotypes and profiles as cultures grown on N-deplete medium. Glutamate is not a source of nitrogen and does not serve as a slow-release source for ammonium. The nitrogen sufficiency conditions of nitrate and glutamine (NH4+) had no significant differences in the growth rates and metabolic data at 27 hours but showed many significant differences at 51 hours. This general pattern was also reflected in the transcriptional data.

Glutamine decreases cell diameter, but not cell density

Cell density and diameter were measured to determine changes in cell division rates and cell volumes under each nitrogen treatment. The glutamate (GLU) and no-nitrogen starvation (STAR) treatments showed identical responses with a ~90% increase in cell density from 3 hours to 27 hours and a slight increase by 51 hours (Figure 3.2). Glutamine (GLN) and nitrate (NO3) supplementation showed a ~110% increase in cell density from 3 hours to 27 hours, which is significantly different from the nitrogen starvation (GLU, STAR) conditions. Cells in both nitrogen treatments (GLN, NO3) continued to divide until 51 hours with the glutamine (GLN) treatment having a small but significant increase over the nitrate (NO3) treatment (Figure 3.2).

Cultures containing glutamine (GLN) and nitrate (NO3) showed an increase in cell size at 27 hours over nitrogen-starved (GLU, STAR) cells; however, at 51 hours cells provided with glutamine (GLN) were the same size as starved (GLU, STAR) cells, while the nitrate-grown (NO3) cells were bigger (Figure 3.2).

Active cell division resulted in higher media pH

Medium pH was monitored to determine whether the nitrogen source had any effect on medium alkalinity. All cultures experienced an initial decrease in pH as the inoculum culture was at a higher pH than the fresh media. Each culture condition resulted in increased pH over time. At 51 hours the nitrogen-starvation conditions (GLU, STAR) had lower pH levels than the nitrogen-sufficient conditions (NO3, GLN) with nitrate (NO3) showing an increased pH beyond that of the glutamine treatment (GLN) (Figure 3.3).

Nitrogen starvation and glutamine result in less chlorophyll and soluble protein than nitrate

D. viridis cells grown with glutamine (GLN) showed an intermediate amount of chlorophyll per cell in comparison to nitrate (NO3) and nitrogen starvation (GLU, STAR) conditions. The nitrogen-starvation conditions (GLU, STAR) show a trend of decreasing cellular chlorophyll content whereas the nitrogen-sufficient (NO3, GLN) conditions show greater chlorophyll content per cell than the nitrogen starved (GLU, STAR) conditions (Figure 3.4). At 27 hours, glutamine (GLN) and nitrate (NO3) supplemented cultures have ~67% more chlorophyll per cell than the starvation conditions (GLU, STAR). At 51 hours, cells in the nitrate (NO3) supplemented culture maintain their chlorophyll content while cells growing in the glutamine supplemented culture (GLN) show a decrease in chlorophyll content to a level similar to that at 3 hours after inoculation. Glutamine-grown (GLN) cells still maintain more chlorophyll than nitrogen-starved cells (GLU, STAR) (Figure 3.4).

Changes in cellular soluble protein content during growth in different media showed a similar pattern to chlorophyll content. Nitrogen starvation conditions (GLU, STAR) resulted in a decrease in cellular protein content over time, while nitrogen sufficiency (NO3, GLN) showed increased protein content. However, unlike chlorophyll, the protein content in glutamine (GLN) grown cells at 51 hours was reduced to the equivalent level of the nitrogen-starved cells (Figure 3.5).

Glutamine supplementation increases carbohydrate content compared to nitrate controls

Changes in carbon storage molecules are common under variable nitrogen conditions, especially the increase in carbon storage under nitrogen starvation. We observed a similar response pattern at 27 hours after inoculation for total carbohydrate (Figure 3.6), starch (Supplementary Figure 1), and neutral lipids (Figure 3.7), where all storage components showed a doubled or higher quantity when the cells were grown under nitrogen starvation (GLU, STAR).

After 51 hours in the respective culture media, the cells showed differences in the accumulation of carbon storage components. The total carbohydrate content of glutamine-grown cells (GLN) increased, while the nitrate-grown cells (NO3) showed a decrease in total carbohydrate levels (Figure 3.6). The glutamine-grown cells (GLN) accumulated an intermediate carbohydrate quantity between nitrogen starvation and nitrate conditions, similar to chlorophyll content (Figure 3.4). Starch content at 51 hours maintained the pattern of nitrogen-starved cells (GLU, STAR) having more starch than nitrogen-replete (GLN, NO3) cells. However, the starch content of the no nitrogen and glutamine conditions (STAR and GLN) specifically was not significantly different, due to the large variability in the starch content of the replicates of the nitrogen-deplete (STAR) cultures (Supplementary Figure 1). For neutral lipid content, there were slight decreases from 27 hours to 51 hours in the nitrogen-replete conditions (NO3, GLN), while the cells grown under nitrogen-starvation conditions (GLU, STAR) continued to accumulate neutral lipids. Neutral lipid content in the cells of the nitrate (NO3) or glutamine (GLN) supplemented cultures was not different at either 27 or 51 hours (Figure 3.7).

Nitrogen starvation results in significant transcriptional changes

Across all comparisons, 7019 transcripts, or 38% of the Dunaliella viridis transcriptome, were differentially expressed (DE). We compared the different transcriptome profiles over time (INOC, 3 hrs, 27 hrs, 51 hrs) and between treatments at each time point (INOC, NO3, GLN, GLU). Within the comparisons made between the different nitrogen treatments at 3 hours there were 64 uniquely DE genes (Figure 3.8). Of the total of 7 comparisons made, this is the comparison with the smallest number of DE genes (Figure 3.8). The inoculum (INOC) and nitrate (NO3) comparisons were larger with 467 and 468 DE genes, respectively. The glutamine (GLN), starvation (STAR), 27 hour, and 51 hour comparisons showed a high degree of change between transcript profiles with 2371, 3015, 3927, and 4048 DE genes, respectively. The 51 hour comparison represented a DE of 21.9% of the 18495 total genes (Figure 3.8). Within the inoculum comparison (INOC to 3 hours), 94 genes were DE in GLN, NO3, and STAR, which most likely represent genes active due to inoculation, and 192 genes were unique to GLN (Figure 3.9). The large number of GLN-responsive genes was also reflected in the 3 hour comparison, where NO3 and STAR were identical and transcriptional differences were only detected in response to GLN (NH4+) (Figure 3.10). At the 27 hour time point, only 20 DE genes were identified between GLN and NO3 treatments, with 6 DE genes involved in nitrogen metabolism. However, the differences between STAR and GLN or NO3 were large: 3562 and 2748 DE genes, respectively (Figure 3.10). This pattern changed at 51 hours, where DE genes between GLN and NO3 increased from 20 to 694 genes and DE genes between GLN and STAR decreased from 2748 to 1330 genes. NO3 and STAR maintained a very large number of DE genes at 3719 (Figure 3.10).
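The percentages quoted for the share of the transcriptome that was DE follow directly from the gene counts. The short sketch below only redoes that arithmetic with the numbers given in the text; the function and variable names are illustrative and not part of the original analysis.

```python
# Counts quoted in the Results text.
TOTAL_GENES = 18495
de_counts = {
    "all comparisons": 7019,
    "51 hour comparison": 4048,
}

def percent_of_transcriptome(n_de, total=TOTAL_GENES):
    """Return the share of genes that are DE, as a percentage."""
    return 100.0 * n_de / total

for label, n_de in de_counts.items():
    print(f"{label}: {percent_of_transcriptome(n_de):.1f}%")
```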

Distinctive patterns of expression were also identified when comparing changes in DE over time within each nitrogen treatment. GLN showed large DE shifts over time from 3 hours to 27 hours (696 genes) and from 27 hours to 51 hours (711 genes) (Figure 3.11). NO3 treatment, on the other hand, induced fewer changes, with 333 DE genes from 3 to 27 hours and only 61 DE genes from 27 to 51 hours. STAR treatment was similar to nitrate (NO3) treatment in that the number of DE genes from 27 to 51 hours was small (28 genes), but the initial set of DE genes from 3 to 27 hours was very large at 1855 transcripts (Figure 3.11).

Nitrogen assimilation genes are DE in all nitrogen conditions over time

When analyzing the genes involved in nitrogen assimilation, DE was determined by comparison of all transcript profiles to the inoculum culture so that the magnitude of response could be compared within nitrogen sources and across time. Overall, GLN and STAR showed more dynamic responses over time than NO3, and since the inoculum culture was grown with NO3-, this response was expected. Over time the nitrate condition showed little DE for the nitrogen assimilation genes, with the most differences in transcript abundance identified at the 3 hour time point, including upregulation of the ammonium transporter gene AMT1.2 (Figure 3.12).

The STAR condition at 3 hours resulted in upregulation of several nitrate/nitrite porters (NNP), all ammonium transporters, NAR2, and the ROC40 transcription factor (Figure 3.12). At 27 hours, all the NNP genes and Rh were downregulated in comparison to the 3 hour time point.

The AMT transporters, cytosolic glutamine synthetase (GS), NADH-dependent GOGAT, and several regulatory proteins and transcription factors were upregulated at 27 hours. This upregulation was either maintained or increased for these genes at the 51 hour time point. For the previously downregulated genes at 27 hours the pattern was similar with downregulation maintained below expression levels of the inoculum (INOC).

The presence of glutamine (GLN), or more accurately glutamine-derived NH4+, resulted in downregulation of several NNPs, formate/nitrite transporter 2, nitrate reductase, nitrite reductase, NAR2, and AMT1.1 at 3 hours (Figure 3.12). The only genes upregulated were Rh and ROC40. For GLN at 27 hours, most of the transcriptional repression was released, and many of the genes returned to their levels at inoculation or were upregulated above inoculum levels. ROC40 and Rh were both downregulated back to inoculum levels, which were maintained through 51 hours. For genes upregulated at 27 hours, most were downregulated by 51 hours, which represented a return to expression levels that were similar to their levels at the 3 hour time point. The AMT1.1 gene is the only gene that was upregulated further at 51 hours. There were several additional genes, all related to NH4+ metabolism, that were upregulated at 51 hours. These genes were AMT1.1, AMT1.2, both glutamine synthetases, NADH-GOGAT, and the PII nitrogen regulator (Figure 3.12).

Lipid and triacylglycerol synthesis gene expression in nitrogen starvation

Since nitrogen starvation is a known inducer of lipid accumulation in algae, transcriptional regulation of triacylglycerol synthesis genes was analyzed for DE at all time points in STAR versus NO3 samples. Genes for fatty acid synthesis were downregulated over time or remained unchanged under starvation, with malonyl-CoA ACP transacylase (MCT) and ketoacyl-ACP synthase III (KASIII) being the most downregulated at 51 hours (Figure 3.13). Genes involved in triacylglycerol metabolism showed a mixed response with 4 downregulated genes and 7 upregulated genes (Figure 3.14). The fatty acid desaturase genes FAD6-2 and FAD7, as well as UDP-sulfoquinovose synthase and glycerol-3-P acyltransferase, were reduced to half their transcript abundance. The upregulated genes encode digalactosyldiacylglycerol synthase, major lipid droplet protein, phosphatidate phosphatase, diacylglycerol kinase, and three . The phosphatidate phosphatase gene PAP2 shows a large increase in transcript abundance of about 180-fold at 27 hours (Log2FC 7.48) and 128-fold at 51 hours (Log2FC 7) (Figure 3.14), which makes it the most DE gene in the lipid/triacylglycerol metabolism pathway, although its overall expression is low.
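Fold changes and Log2FC values are quoted side by side throughout this chapter; the two are related by fold = 2^Log2FC. The following minimal sketch applies that conversion to Log2FC values quoted in the text. The helper name is illustrative only and is not taken from the analysis pipeline.

```python
def log2fc_to_fold(log2fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc

# Log2FC values quoted in this chapter (STAR over NO3).
examples = {
    "PAP2, 27 h": 7.48,
    "PAP2, 51 h": 7.0,
    "R5PI1, 51 h": -2.4,
    "ALDO1, 51 h": 3.1,
    "GAPDH1, 51 h": 4.8,
}

for gene, lfc in examples.items():
    print(f"{gene}: log2FC {lfc:+.2f} -> {log2fc_to_fold(lfc):.2f}-fold")
```

A negative Log2FC therefore corresponds to a fold change between 0 and 1, which is why downregulated genes are reported with values such as 0.2-fold.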


Central carbon metabolism shows high transcriptional activity around pyruvate

As we were interested in the regulation of carbon metabolism under nitrogen starvation, we looked for DE at the 51 hour time point in STAR over NO3. Expression levels between 27 and 51 hours were nearly identical, so only the 51 hour time point was analyzed. The Calvin-Benson cycle showed mixed regulation, with phosphoglycerate kinase (PGK), aldolases (ALDO3, ALDO4), sedoheptulose 1,7-bisphosphatase (SBP), ribose 5-phosphate isomerase (R5PI1), and ribulose 5-phosphate 3-epimerases (Ru5PE1, Ru5PE2) being downregulated and glyceraldehyde 3-phosphate dehydrogenase (GAPDH1) and aldolase (ALDO1) being upregulated. R5PI1, ALDO1, and GAPDH1 were the most highly DE genes of the Calvin cycle, with Log2FC values of -2.4, 3.1, and 4.8, respectively (0.2-, 8.6-, and 28-fold changes in the STAR culture over the NO3 culture cells) (Figure 3.15).

Regarding the TCA cycle, most differentially expressed genes encoded enzymes involved in phosphoenolpyruvate, pyruvate, and oxaloacetate conversion (Figure 3.15). We observed downregulation of pyruvate kinase (PYK2), mitochondrial malate dehydrogenase (M-MDH1), and malic enzyme (ME3), although other isoforms of the same genes showed no change. More genes were upregulated, with pyruvate phosphate dikinase (PPDK), PEP carboxylase (PEPC2), PEP carboxykinase (PEPCK), and mitochondrial and cytosolic malate dehydrogenases (M-MDH3, C-MDH1) showing increases in transcription. Additionally, in the TCA cycle, aconitase and fumarate hydratase were upregulated and one subunit of 2-oxoglutarate dehydrogenase was downregulated (Figure 3.15).

Lipid and starch metabolism branch off the Calvin and Krebs cycles. The entry point for lipid metabolism, ACCase, shows downregulation in STAR (Figure 3.15, Figure 3.13). Starch synthesis and degradation are upregulated in STAR, with the synthesis enzymes starch synthase and starch branching enzyme being more highly upregulated than the degradation enzymes α-amylase and starch phosphorylase (data not shown).

Carbonic anhydrase (CA) transcription was also examined over time in comparison to the inoculum culture in both NO3 and STAR. Universally, α-CAs and θ-CAs were downregulated between inoculation and the 3 hour time point (Figure 3.15). For α-CAs, downregulation continued under starvation (STAR), while in nitrate (NO3) the α-CAs were upregulated at 27 and 51 hours. The largest changes were observed for α-CA1 at 51 hours, which reached a Log2FC of -6.4 (0.01-fold) under starvation and a Log2FC of 1.9 (3.7-fold) in nitrate. The θ-CAs showed different patterns of expression, but in general by 51 hours, expression of θ-CAs was higher in NO3 than STAR cultures (Figure 3.15).

Transcriptional shift by glutamine at 51 hours

From 27 to 51 hours, the number of DE genes between GLN and NO3 increased from 20 to 694 genes (Figure 3.10). At this juncture many metabolic shifts were observed, warranting a closer look at patterns of expression of the 694 genes to discover what might be causing these shifts. In total, 350 genes changed in response to GLN, whereas only 34 changed for NO3 (Table 3.1). Of the 34 DE NO3 genes, 23 were upregulated in NO3 but did not change in GLN. These genes consisted of heavy metal transporters and chaperones as well as carbonic anhydrases (CA). Eleven of the 34 genes were DE in both NO3 and GLN, with 3 downregulated in both, 3 with opposite regulation, and 5 upregulated in both. For all 11 genes, the expression level at 51 hours was lower in GLN than NO3. Of the 339 genes that were only DE in GLN, 113 were downregulated and 226 were upregulated. The downregulated genes consisted of some nitrogen assimilation genes, protein synthesis and degradation genes, stress tolerance genes, porphyrin adjacent proteins, and lipid transport. The upregulated genes consisted of nitrogen assimilation genes, arginine biosynthesis, glycolysis, and protein degradation (Table 3.1).


Figure 3.1 | Experimental and sampling design. Experimental workflow consisted of four nitrogen source growth treatments (nitrate, glutamine, glutamate, or no nitrogen [starvation]), which were duplicated to create technical replicates for a total of eight cultures. Samples of culture were harvested in 24 hour intervals starting 3 hours after inoculation (3 h, 27 h, 51 h). The 3-hour wait was to allow the cells to recover somewhat from transfer shock. Samples were also taken from the inoculum to establish baseline culture measurements. As a result, thirteen unique treatment and time sampling points were obtained per experiment. At each sampling point, aliquots of culture were taken to assess growth parameters (cell density and diameter, pH of culture media), metabolite levels (chlorophyll, soluble protein, total carbohydrates, starch, neutral lipids), and transcription levels (RNA). The inoculum culture was maintained with nitrate and all cells were grown under continuous light. The experiment was repeated three times for three biological replicates.


Figure 3.2 | Growth (A) and cell diameter (B) over time on different nitrogen sources. The INOC time point represents cell density (A) and diameter (B) immediately after inoculation while the 3 h, 27 h, and 51 h time points are the cell densities at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a); glutamine and glutamate/no nitrogen (b), and nitrate and glutamine (c).
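The single factor ANOVA named in the figure captions compares the variance between treatment means with the variance within treatments. The sketch below computes the F statistic for one time point using made-up measurements (three replicates for three of the conditions, for brevity); it is an illustration of the test only and does not reproduce the software or data used in this work. Tukey's HSD post hoc comparison is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements for one time point, three replicates each.
nitrate = [10.0, 11.0, 12.0]
glutamine = [10.0, 12.0, 11.0]
starved = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([nitrate, glutamine, starved])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With 2 and 6 degrees of freedom the 5% critical value of F is about 5.14, so an F statistic above that value would be taken as a significant treatment effect at p≤0.05 before any pairwise comparison.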


Figure 3.3 | pH over time on different nitrogen sources. The INOC time point represents the pH of the inoculum culture immediately before inoculation while the 3 h, 27 h, and 51 h time points are the pH at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a), glutamine and glutamate/no nitrogen (b), nitrate and glutamine (c), and glutamine and glutamate (d).


Figure 3.4 | Chlorophyll content over time on different nitrogen sources. The INOC time point represents the chlorophyll content of the inoculum culture immediately before inoculation while the 3 h, 27 h, and 51 h time points are the chlorophyll content at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a); glutamine and glutamate/no nitrogen (b), and nitrate and glutamine (c).


Figure 3.5 | Soluble protein content over time on different nitrogen sources. The INOC time point represents the protein content of the inoculum culture immediately before inoculation while the 3 h, 27 h, and 51 h time points are the protein content at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a); glutamine and glutamate/no nitrogen (b), and nitrate and glutamine (c).


Figure 3.6 | Total carbohydrate content over time on different nitrogen sources. The INOC time point represents the total carbohydrate content of the inoculum culture immediately before inoculation while the 3 h, 27 h, and 51 h time points are the total carbohydrate content at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a); glutamine and glutamate/no nitrogen (b), and nitrate and glutamine (c).


Figure 3.7 | Neutral lipid content over time on different nitrogen sources. The INOC time point represents the neutral lipid content of the inoculum culture immediately before inoculation while the 3 h, 27 h, and 51 h time points are the neutral lipid content at the time of the metabolite collections. Averages are based upon 3 biological replicates and error bars represent one standard deviation. Statistical difference at each time point was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant difference within each time point is represented as follows: between nitrate and glutamate/no nitrogen (a); glutamine and glutamate/no nitrogen (b), and nitrate and glutamine (c).


Figure 3.8 | Size comparison of Euler diagrams of DE genes. The area-proportional Euler diagrams are scaled to be approximately proportional to the 51 hour comparison, which had 4048 unique differentially expressed genes. The total number of differentially expressed genes in each Euler diagram is displayed underneath the comparison label. The inoculum comparison is the differential expression in the short-term transition (3 hours) from the inoculum culture to each of the three nitrogen sources at 0 hours and is expanded in Figure 3.9. The 3, 27, and 51 hour comparisons are differential expression between nitrogen conditions at each time point and are expanded in Figure 3.10. The GLN, NO3, STAR comparisons are differential expression over time in each nitrogen condition and are expanded in Figure 3.11. The color scheme represents the same time point comparison made within each nitrogen source.


Figure 3.9 | Euler diagram of the rapidly DE genes after inoculation. The area-proportional Euler diagram represents differential expression in the short-term transition (3 hours) from the inoculum culture to each of the three nitrogen sources (Figure 3.1). The total number of differentially expressed genes for each nitrogen source is displayed under each label. The number of shared genes for each comparison is displayed within the overlap region between comparisons. Euler diagrams created using eulerAPE (Micallef and Rodgers, 2014). Abbreviations: GLN, Glutamine; NO3, nitrate; STAR, nitrogen starvation.
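The shared and unique regions of an Euler diagram are set intersections and differences of the DE gene lists. The sketch below shows how such counts could be derived; the gene identifiers are hypothetical placeholders and the helper is not part of eulerAPE or of the original analysis.

```python
# Hypothetical gene identifiers; the real sets come from the DE analysis.
de_genes = {
    "GLN": {"g1", "g2", "g3", "g4", "g5"},
    "NO3": {"g1", "g2", "g6"},
    "STAR": {"g1", "g3", "g7"},
}

def euler_regions(sets):
    """Return genes shared by all conditions and genes unique to each one."""
    shared_by_all = set.intersection(*sets.values())
    unique = {}
    for name, genes in sets.items():
        # Union of every other condition's DE genes.
        others = set().union(*(g for n, g in sets.items() if n != name))
        unique[name] = genes - others
    return shared_by_all, unique

shared, unique = euler_regions(de_genes)
print(len(shared), {name: len(genes) for name, genes in unique.items()})
```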


Figure 3.10 | Euler diagrams of DE genes by nitrogen source. Each area-proportional Euler diagram represents differential expression between nitrogen conditions at each time point (3 Hour, 27 Hour, 51 Hour) (Figure 3.1). The total number of differentially expressed genes in each comparison is displayed under each label. The number of shared genes for each comparison is displayed within the overlap region between comparisons. The color scheme represents the same nitrogen source comparison made among each time point. The 27 Hour diagram does not have all numbers displayed due to the small number of differentially expressed genes for the GLN vs NO3 condition. Numerical output is also in Supplementary Table 1. Euler diagrams created using eulerAPE (Micallef and Rodgers, 2014). Abbreviations: GLN, Glutamine; NO3, nitrate; STAR, nitrogen starvation.


Figure 3.11 | Euler diagrams of DE genes over time. Each area-proportional Euler diagram represents differential expression over time in each condition (GLN, NO3, STAR) (Figure 3.1). The total number of differentially expressed genes in each comparison is displayed under each label. The number of shared genes for each comparison is displayed within the overlap region between comparisons. The color scheme represents the same time point comparison made within each nitrogen source. Numerical output is also in Supplementary Table 2. Euler diagrams created using eulerAPE (Micallef and Rodgers, 2014). Abbreviations: GLN, Glutamine; NO3, nitrate; STAR, nitrogen starvation.


Figure 3.12 | Differential expression of nitrogen assimilation genes over time. The log2 fold change values are presented as each value over the inoculum culture value. Comparisons are shown in each condition (NO3, GLN, STAR) for the entire time course (3, 27, 51 hours). Blue represents downregulation in comparison to the inoculum and red is upregulation in comparison to the inoculum. Colored boxes with # symbols are not statistically significant. Expression data and accession numbers are in Supplementary Table 3. Gene abbreviations are NNP, nitrate/nitrite porter; NR, nitrate reductase; FNT, Formate/Nitrite Transporter Family; NiR, nitrite reductase; AMT, ammonium transporter; Rh, Rhesus protein; GS, glutamine synthetase; GOGAT, glutamine oxoglutarate aminotransferase; PII, Nitrogen regulatory protein PII; NAR2, nitrate assimilation related 2; NIT2, transcription factor NIT2; NRR1, nitrogen response regulator 1; ROC40, MYB-related transcription factor.


Figure 3.13 | Expression of fatty acid synthesis genes during nitrogen starvation. The log2 fold change values are presented as STAR over NO3 for the 3 hour, 27 hour, and 51 hour time points (left to right). Blue represents downregulation in comparison to NO3 and red is upregulation in comparison to NO3. White represents no significant differential expression. Fatty acids are labeled by the number of carbons:number of unsaturated bonds. Expression data and accession numbers are in Supplementary Table 4. Abbreviations are as follows: CoA, coenzyme A; ACP, acyl carrier protein; ACCase, acetyl-CoA carboxylase; α-CT, α-carboxyltransferase subunit; β-CT, β-carboxyltransferase subunit; BC, biotin carboxylase subunit; BCCP, biotin carboxyl carrier protein subunit; MCT, malonyl-CoA ACP transacylase; FAS, fatty acid synthase; KAS, ketoacyl-ACP synthase; KAR, ketoacyl-ACP reductase; HAD, hydroxyacyl-ACP dehydrase; EAR, enoyl-ACP reductase; FAB2, stearoyl-ACP-Δ9-desaturase; FAT, acyl-ACP thioesterase. Modified from (Srirangan et al., 2015).


Figure 3.14 | Expression of triacylglycerol metabolism genes during nitrogen starvation. The log2 fold change values are presented as STAR over NO3 for the 3 hour, 27 hour, and 51 hour time points (left to right). Blue represents downregulation in comparison to NO3 and red is upregulation in comparison to NO3. White represents no significant differential expression. Black represents transcripts that lack sufficient raw reads to be trustworthy. Expression data and accession numbers are in Supplementary Table 5. Compounds include: PL, polar lipid; G-3-P, glycerol-3-phosphate; Lyso-PA, lyso-phosphatidic acid; PA, phosphatidic acid; PG, phosphatidylglycerol; SQDG, sulfoquinovosyldiacylglycerol; MGDG, monogalactosyldiacylglycerol; DGDG, digalactosyldiacylglycerol; DAG, diacylglycerol; TAG, triacylglycerol. Enzymes include: GPAT, glycerol-3-P acyltransferase; LPAT, lysophosphatidic acid acyltransferase; PAP2 and PAH, phosphatidate phosphatase; DAGK, diacylglycerol kinase; DGAT and DGTT, diacylglycerol acyltransferase; PCT, CDP-diacylglycerol synthase; PGP, phosphatidylglycerolphosphate synthase; SQD1, UDP-sulfoquinovose synthase; SQD2, sulfolipid synthase; MGD1, monogalactosyldiacylglycerol synthase; DGD1, digalactosyldiacylglycerol synthase; PGD1, plastid galactoglycerolipid degradation 1; GGGT, galactolipid:galactolipid galactosyltransferase; PDAT1, phospholipid:diacylglycerol acyltransferase; MLDP, major lipid droplet protein; KCS, β-ketoacyl-CoA synthase; CHAD, 3-hydroxyacyl-CoA dehydrogenase; TER, trans-2-enoyl-CoA reductase; PCH, palmitoyl-CoA hydrolase; FAD5, MGDG-specific palmitate Δ7 desaturase; FAD6, ω-6 fatty acid desaturase; FAD7, ω-3 fatty acid desaturase; TAGL, triacylglycerol lipase; LCIII, class 3 lipase; FAP, class 3 lipase; LIPG, lipase. Modified from (Srirangan et al., 2015).


Figure 3.15 | Differential expression of core carbon metabolism genes under nitrogen starvation. Metabolites are displayed between pathway arrows. Enzymes are italicized and colored according to the log2FC value of STAR over NO3 at 51 hours. Red represents log2FC of greater than 2, orange is 1 to 2, black is no significant change, blue is -1 to -2, and purple is less than -2. Expression of θ and α carbonic anhydrase (CA) genes is shown over time in the nitrogen starvation (black) and nitrate (blue) conditions. Each line pattern represents a different carbonic anhydrase gene. All points are represented as log2FC values in comparison to the inoculum culture. Expression data and accession numbers are in Supplementary Table 6 and Supplementary Table 7. Metabolite abbreviations are as follows: 3PG, 3-phosphoglycerate; 1,3BPG, 1,3-bisphosphoglycerate; GAP, glyceraldehyde 3-phosphate; DHAP, dihydroxyacetone phosphate; FBP, fructose 1,6-bisphosphate; F6P, fructose 6-phosphate; E4P, erythrose 4-phosphate; S1,7P, sedoheptulose 1,7-bisphosphate; S7P, sedoheptulose 7-phosphate; X5P, xylulose 5-phosphate; R5P, ribose 5-phosphate; Ru5P, ribulose 5-phosphate; RuBP, ribulose 1,5-bisphosphate; G6P, glucose 6-phosphate; 2PG, 2-phosphoglycerate; PEP, phosphoenolpyruvate; PYR, pyruvate. Enzyme abbreviations are as follows: rbcL, RuBisCO large subunit; rbcS, RuBisCO small subunit; PGK, phosphoglycerate kinase; GAPDH, glyceraldehyde 3-phosphate dehydrogenase; TPI, triose phosphate isomerase; ALDO, aldolase; FBP, fructose 1,6-bisphosphatase; SBP, sedoheptulose 1,7-bisphosphatase; TRK, transketolase; R5PI, ribose 5-phosphate isomerase; Ru5PE, ribulose 5-phosphate 3-epimerase; PRK, phosphoribulokinase; PGI, phosphoglucose isomerase; PGM, phosphoglycerate mutase; ENO, enolase; PYK, pyruvate kinase; PPDK, pyruvate phosphate dikinase; PDHα/β, pyruvate dehydrogenase; PDHDA, pyruvate dehydrogenase complex dihydrolipoamide acetyltransferase; ACCase, acetyl-CoA carboxylase; PEPC, phosphoenolpyruvate carboxylase; PEPCK, phosphoenolpyruvate carboxykinase; CS, citrate synthase; ACH, aconitate hydratase; ICDH, isocitrate dehydrogenase; OGC, 2-oxoglutarate dehydrogenase; SCL, succinyl-CoA ligase; SDH, succinate dehydrogenase; FUH, fumarate hydratase; MDH, malate dehydrogenase, cytosolic (C), plastidic (P), mitochondrial (M); ME, malic enzyme.


Table 3.1 | Transcription patterns for DE genes at 51 hours between GLN and NO3. After removing genes with >75 counts in a comparison and those genes that were DE between GLN and NO3 at 27 hours, there remained 641 genes of the total 694 transcripts. There are 9 combinations of DE direction (51 h/27 h) for the 2 nitrogen treatments. DE direction is displayed as down (blue), up (red), or – (no significant change). The number of genes that display each pattern and some gene names and general functions are stated.

GLN (51 hr/27 hr) | NO3 (51 hr/27 hr) | Sum | Functions
– | down | 0 |
– | up | 23 | 2 θ-CAs, 2 α-CAs, 4 heavy metal transporters or chaperones, 1 flagellar associated membrane protein, 1 zinc-containing alcohol dehydrogenase, 13 no/ambiguous annotation
– | – | 268 |
down | – | 113 | NR, FNT2, 3 NNPs, ribosome subunits, porphyrin creating or dependent proteins, stress/ROS, proteases, ABCG lipid transporter, 35 no/ambiguous annotation
up | – | 226 | GS, GOGAT, PII, arginine biosynthesis, glycolysis, proteases, 102 with no/ambiguous annotation
down | down | 3 | low iron induced protein, multicopper ferroxidase, bacterial flavodoxin
down | up | 3 | flagellar flavodoxin, α-CA, unknown protein
up | down | 0 |
up | up | 5 | 3 bestrophin chloride channels, θ-CA, unknown protein
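The pattern counts in Table 3.1 can be cross-checked against the totals quoted in the Results (641 genes retained, 350 DE in GLN, 34 DE in NO3, 11 DE in both). The sketch below is only such a tally over the numbers printed in the table; the dictionary layout is an illustrative choice, with "-" standing for no significant change.

```python
# (GLN direction, NO3 direction) -> number of genes, as listed in Table 3.1.
pattern_counts = {
    ("-", "down"): 0,
    ("-", "up"): 23,
    ("-", "-"): 268,
    ("down", "-"): 113,
    ("up", "-"): 226,
    ("down", "down"): 3,
    ("down", "up"): 3,
    ("up", "down"): 0,
    ("up", "up"): 5,
}

total = sum(pattern_counts.values())
de_in_gln = sum(n for (gln, _), n in pattern_counts.items() if gln != "-")
de_in_no3 = sum(n for (_, no3), n in pattern_counts.items() if no3 != "-")
de_in_both = sum(
    n for (gln, no3), n in pattern_counts.items() if gln != "-" and no3 != "-"
)

print(total, de_in_gln, de_in_no3, de_in_both)
```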


Discussion

Transcriptional responses to nitrogen starvation are well described in algae, but differential transcriptome regulation between variable nitrogen sources has not been investigated. Our interest was in transcriptional differences that caused the metabolic phenotypes between glutamine-grown and nitrate-grown algae. As discussed earlier, glutamine is not directly taken up by the algae, but decays into ammonium and a carbon skeleton (Murphree et al., 2017). Glutamine is therefore a suitable source for ammonium and serves here as a comparison for D. viridis growth and metabolic/transcriptional responses to nitrate as the sole nitrogen source in the medium.

Glutamine supplemented cultures undergo phenotypic and transcriptomic changes comparable to nitrogen starvation

A pattern emerged from the metabolic data: at 3 hours all treatments were the same, at 27 hours GLN and NO3 were equal and separate from STAR, and at 51 hours NO3 and STAR maintained separation, but GLN shifted from the equal of NO3 towards the STAR condition. This was reflected in the cell diameter, pH, chlorophyll, protein, and carbohydrate data (Figure 3.2, Figure 3.3, Figure 3.4, Figure 3.5, and Figure 3.6). Cell density and neutral lipid data maintained the separation of the 27 hour time point (Figure 3.2 and Figure 3.7). Similar results were demonstrated previously at 144 hours after inoculation (Murphree et al., 2017). While much of the metabolic data showed a starvation response at 51 hours for GLN, the equal cell density between GLN and NO3 suggested that NH4+ (GLN) grown cells had altered metabolism.


NH4+ has been shown to change the metabolism of Dunaliella salina (Giordano and Bowes, 1997), and the transcriptional data at first suggested that this was true for D. viridis (Figure 3.10). Over time, the physiological, metabolic, and transcriptome profiles of GLN-grown cultures resembled less the patterns of the NO3- grown cultures and more those of the N-starved cultures STAR and GLU.

Transcriptional repression by NH4+ is well known. D. viridis showed a fast response to the glutamine-released ammonium through regulation of nitrogen assimilation genes, specifically NNPs, FNT nitrite transporters, AMT1.1, NR, NiR, and NAR2 within the first 3 hours (Figure 3.12). However, by 27 hours the repression had been released and the transcriptional levels of the genes resembled those of the starvation STAR condition at 3 hours (Figure 3.12). This suggests the cells had taken up all available NH4+ from the media and had initiated the nitrogen starvation response. By 51 hours in GLN, there was an interesting mix of transcriptional patterns where both starvation and NH4+ repression responses were present. Although not as strong as at 3 hours, the NNPs, FNTs, NR, NiR, and NAR2 all showed the NH4+ repression response at 51 hours, which is likely due to continuing degradation of glutamine. However, for AMT1.1 the nitrogen starvation response induction was stronger; thus, AMT1.1 is upregulated by nitrogen starvation along with AMT1.2, cytosolic GS, NADH-GOGAT, and PII (Figure 3.12).

Contrary to previous assumptions, the transcriptional regulation of the nitrogen assimilation genes suggested that glutamine was a delayed nitrogen starvation response. This is further suggested by the expression levels of most genes in the GLN at 51 hr which had reached the level of starvation or were moving towards the starvation level (Supplementary Dataset 1).

The down regulation of the photosynthetic apparatus and chlorophyll biosynthesis is a hallmark of nitrogen deficiency (Schmollinger et al., 2014; Park et al., 2015; Tan et al., 2016) and is


represented in the STAR and GLN conditions but not in NO3 (Table S). The upregulation of arginine biosynthesis genes is also a distinctive nitrogen starvation response reported in

Chlamydomonas reinhardtii (Park et al., 2015). Even though arginine metabolite levels decreased, Park et al. suggest that cells are transcriptionally prepared to direct nitrogen into arginine once nitrogen is restored to the system. Arginine biosynthesis is upregulated in STAR and GLN

(Table 3.1), which suggests that this mechanism is conserved across both species and is a

hallmark of nitrogen starvation from both NO3- and NH4+. Taken together, the transcriptional and metabolic data indicate that the metabolic effects seen in the GLN condition were due to

nitrogen starvation rather than NH4+.

Dunaliella viridis responds to the presence of NH4+

Some of the most transcriptionally active sets of genes under these variable nitrogen

conditions were the nitrogen assimilation genes. The modulation by NH4+ and nitrogen starvation of nitrogen assimilation genes is a well-known phenomenon, especially in C. reinhardtii (Camargo et al., 2007; Giordano and Raven, 2014; Sanz-Luque et al., 2015).

However, the transcriptional regulation of the nitrogen assimilation genes in Dunaliella spp. is relatively unknown except for some limited qPCR data (J. Li et al., 2007; Song et al., 2011).

Additionally, Dunaliella spp. are primarily found in alkaline environments where the NH4+/NH3 equilibrium shifts toward NH3. NH3 is an uncharged molecule capable of freely diffusing across the plasma membrane and disrupting pH gradients within cells. This makes it a very risky nitrogen

source in alkaline environments even though NH4+ requires less energy than NO3- to incorporate into metabolism (Arnold et al., 2015).
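
The pH dependence of the NH4+/NH3 equilibrium can be illustrated with the Henderson-Hasselbalch relationship. The short sketch below is not part of the original analysis; it assumes a pKa of 9.25 for NH4+ (an approximate value for dilute solution at 25 °C, which will differ somewhat in the saline growth medium) and evaluates the NH3 fraction at a few pH values, including the 8.2 and 9.2 values observed in these cultures. The function name nh3_fraction is chosen here for illustration only.

```python
def nh3_fraction(ph: float, pka: float = 9.25) -> float:
    """Fraction of total ammonia nitrogen (NH4+ plus NH3) present as NH3.

    Derived from Henderson-Hasselbalch: [NH3]/[NH4+] = 10 ** (pH - pKa).
    The default pKa is an assumed dilute-solution value at 25 degrees C.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Compare a neutral medium with the culture pH range reported in this chapter.
for ph in (7.0, 8.2, 9.2, 10.0):
    print(f"pH {ph:4.1f}: {100.0 * nh3_fraction(ph):5.1f}% of ammonia N as NH3")
```

Under these assumptions, the NH3 fraction rises from under 1% at pH 7 to nearly half of the total at pH 9.2, which is consistent with the argument that ammonium becomes a riskier nitrogen source as the medium alkalizes.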


Dunaliella viridis clearly favors NH4+ transport and assimilation when nitrogen starved.

Although NNPs, AMTs, and the NNP regulatory subunit NAR2 were upregulated at 3 hours by starvation, the NNPs were downregulated or returned to inoculum expression levels by 27 hours, while the AMTs, cytosolic GS, NADH-GOGAT, and regulators PII, NAR2, and NIT2 were upregulated (Figure 3.12). The short-term upregulation of NNPs was most likely an attempt to

scavenge NO3-/NO2-, but with no additional NO3-/NO2-, they were downregulated in favor of the

NH4+ uptake and metabolism genes (Schmollinger et al., 2014; Arnold et al., 2015). Since NH4+ is less costly for the cells to incorporate, this is not unexpected. It is plausible that the NIT2 transcription factor regulates these genes under nitrogen starvation as shown in C. reinhardtii

(Camargo et al., 2007), although it should be noted that D. viridis NIT2 is not repressed by NH4+ as it is in C. reinhardtii (Camargo et al., 2007).

Starch synthesis is higher than lipid synthesis during nitrogen starvation in D. viridis

As has been reported for Dunaliella tertiolecta (Tan et al., 2016), nitrogen starvation leads to an increase in both carbohydrate and neutral lipid content in D. viridis (Figure 3.6,

Figure 3.7). Also like D. tertiolecta, D. viridis accumulates carbohydrates (45 µg/10^6 cells) at a higher quantity than neutral lipids (3 µg/10^6 cells) when nitrogen starved (Figure 3.6, Figure

3.7). At 51 hours, carbohydrate levels are significantly higher in GLN than in NO3, while neutral lipid levels are not significantly different (Figure 3.6, Figure 3.7). The delayed nitrogen starvation response in the GLN cultures suggests that carbohydrate accumulation is faster than neutral lipid accumulation. However, a time-course with higher temporal resolution would be


necessary to identify the metabolic shifts caused by these different treatments.

Nitrogen starvation regulation of carbon storage is similar to other green algae

It is still unclear how green algae control the partitioning of carbon storage under nitrogen starvation (Goncalves, Wilkie et al., 2016). C. reinhardtii and D. tertiolecta are the most closely related species to D. viridis that have been studied for carbon accumulation under nitrogen starvation. Overall transcriptional patterns have been conserved, but there are some notable differences.

One point of conservation is that the accumulation of TAG in D. viridis is associated with transcriptional upregulation of the major lipid droplet protein (MLDP) (Figure 3.14) as well as the two transcription factors, NRR1 and ROC40 (Figure 3.12). These genes have been shown to be upregulated in nitrogen starvation-linked TAG accumulation in Chlamydomonas (Moellering and Benning, 2010; Boyle et al., 2012; Goncalves, Koh et al., 2016). These genes were not analyzed in D. tertiolecta, but are conserved between C. reinhardtii and D. viridis.

An interesting difference between C. reinhardtii and D. tertiolecta is in the regulation of fatty acid synthesis under nitrogen starvation. C. reinhardtii upregulates several fatty acid synthesis genes while D. tertiolecta and D. viridis downregulate fatty acid synthesis genes (Figure 3.13) (Tan et al., 2016). This would suggest that fewer fatty acids are synthesized, and thus that fewer fatty acids are available for synthesis of neutral lipids or triacylglycerides (TAG). This is not the case for the Dunaliella spp., since neutral lipids accumulate under nitrogen starvation, although starch is the major carbon storage component for both species (Figure 3.6, Figure 3.7) (Tan et al., 2016).

D. tertiolecta showed an increase in neutral lipids but no increase in total lipids, which suggested that increased TAG came from recycling membrane lipids into TAG (Tan et al., 2016). Under nitrogen starvation, algae are known to break down chloroplast membranes (Preininger et al.,


2015), which are a source of fatty acids for TAG synthesis. These reactions are catalyzed by enzymes encoded by diacylglycerol acyltransferase and related genes (DGTT, DGAT, PDAT, PGD), but no D. viridis homologs of these genes show any change in expression (Figure 3.14). However, there are increases in the expression of putative lipase genes (Figure 3.14), which could be mobilizing fatty acids for incorporation into

TAG (Boyle et al., 2012). Unfortunately, since only neutral lipid levels were measured in D. viridis, the difference between membrane recycling and de novo lipid synthesis cannot be distinguished, although it is certainly possible that both strategies are utilized.

Carbohydrates have been reported to be the major carbon storage compound in

Dunaliella spp. (Slocombe et al., 2015; Tan et al., 2016) and this is also true of D. viridis (Figure

3.6). The anabolic enzymes starch synthase and 1,4-α-glucan branching enzyme are upregulated as are the catabolic enzymes α-amylase and starch phosphorylase. The anabolic enzymes are slightly more upregulated than the catabolic enzymes, which may explain the maintenance of accumulated starch in STAR (Figure 3.6 and Supplementary Figure 1). This maintenance of accumulated starch is also found in D. tertiolecta where starch is primarily synthesized in the beginning of nitrogen starvation and then is degraded for fatty acid synthesis and TAG production under continuing nitrogen starvation (Pick and Avidan, 2017). This major turnover between starch and lipids however took place after 72 hours of starvation (Pick and Avidan,

2017), so our data at 51 hours potentially reflect the state of the cells before this turnover.

Additionally, all starch metabolism pathway genes are upregulated under nitrogen starvation in

D. tertiolecta (Tan et al., 2016), whereas only a small subset is upregulated in D. viridis (Table

S). This may be a consequence of the earlier time point used in our experiment, but could also stem from our use of continuous light as a growth condition, which has been shown to be


disruptive of the typical diurnal starch metabolism regulation in D. viridis (Srirangan et al.,

2015).

The Krebs Cycle is a common regulatory hub for carbon partitioning as it serves to provide most carbon skeletons for amino acid synthesis and degradation. In D. tertiolecta and C. reinhardtii, all enzymes of the Krebs Cycle are upregulated under nitrogen starvation (Park et al.,

2015; Tan et al., 2016). However, in D. viridis, 4 of the 8 steps show no change in expression, 1 subunit of 2-oxoglutarate dehydrogenase and 1 isoform of malate dehydrogenase are downregulated, and fumarate hydratase, aconitate hydratase, and 2 isoforms of malate dehydrogenase are upregulated (Figure 3.15). It is unclear where this mixed regulation originates, although differences in experimental design could be a factor. Under continuous light, D. viridis shows highly modified metabolism, as continuous light eliminates diurnal cycles and also desynchronizes the cell cycle (Srirangan et al., 2015), so an interaction between light and nitrogen stress could cause the differences in Krebs Cycle regulation.

Two key metabolites derived from glycolysis, phosphoenolpyruvate (PEP) and pyruvate

(PYR), are used to create acetyl-CoA, which can feed into the Krebs Cycle or fatty acid synthesis. As such they are important regulatory molecules in partitioning of carbon into carbohydrates or lipids (Polle et al., 2014). Under nitrogen starvation, D. tertiolecta and C. reinhardtii both show an upregulation of the pyruvate dehydrogenase complex which creates acetyl-CoA from PYR and allows for the synthesis of fatty acids (Park et al., 2015; Tan et al.,

2016). However, D. viridis does not show this regulatory pattern and the pyruvate dehydrogenase complex shows no change in transcription (Figure 3.15). In fact, when looking at the transcriptional program around PEP and PYR it appears that D. viridis favors starch production by avoiding the build-up of PYR which can be converted into acetyl-CoA and


eventually fatty acids. This avoidance of PYR appears to involve several strategies. The first is stopping PYR production by downregulation of pyruvate kinase and malic enzymes (Figure

3.15), which create PYR from PEP and malate, respectively. The upregulation of pyruvate phosphate dikinase suggests that if PYR is created it can be converted into PEP to avoid conversion into acetyl-CoA. Another strategy seems to be converting PEP into oxaloacetate, which feeds into the Krebs Cycle, by upregulating PEP carboxylase (Figure 3.15). The concurrent upregulation of PEP carboxykinase suggests that oxaloacetate being converted to PEP could be used to prompt gluconeogenesis and eventual starch synthesis (Figure 3.15). Although

PYR is a critical metabolite, the transcriptional data suggest that D. viridis is attempting to limit the pool of PYR as much as possible by stopping PYR production or converting it into alternate metabolites other than acetyl-CoA.

pH differences are associated with changes in the transcription patterns of batch cultures

One of the disadvantages of batch culture is that conditions in the culture flask change over time as the organism interacts with the environment. Dunaliella is known to alkalize the environment as part of the dissolved inorganic carbon uptake mechanism (Shiraiwa et al., 1993).

This was reflected in our experiment by the pH becoming alkaline over time as cell density increased (Figure 3.2, Figure 3.3). Within the subset of DE genes at 51 hours between GLN and

NO3, several genes appeared to be regulated by the difference in pH between cultures, as they were DE in all conditions from inoculum to 3 hours, when the pH fell from 9.2 to 8.2 (Figure 3.3).

They were also DE in the 51 hour over 27 hour comparison of the NO3 and GLN conditions

(Table 3.1).


The carbon concentrating mechanism (CCM) is perturbed due to pH changes and nitrogen starvation

Carbonic anhydrases are major constituents of the carbon concentrating mechanism

(CCM) in algae (DiMario et al., 2018). In Chlamydomonas, carbonic anhydrases (CAs) are upregulated under low CO2 conditions (Mackinder et al., 2017). In Dunaliella spp., low CO2 and high salt induce CA expression (Jeon et al., 2016) and in D. viridis continuous light induces α-

CA expression (Srirangan et al., 2015). Nitrogen starvation also appears to be a regulating factor in D. viridis as the three α-CA genes were all downregulated by starvation in the STAR and

GLN samples (Figure 3.15, Supplementary Figure 6). Reducing CA levels could help slow the uptake of CO2, which nitrogen starved cells will struggle to fix when photosynthesis is downregulated.

θ-CAs are homologs of the Chlamydomonas LCIB/LCIC proteins which are chloroplast stroma-localized and predicted to convert escaped CO2 from the pyrenoid back into HCO3- under low CO2 levels (Wang et al., 2015; Mackinder et al., 2017). Much like the α-CAs the θ-

CAs were downregulated from the inoculum to the 3 hour time point. The STAR condition maintained the repression while the GLN and NO3 showed upregulation at 51 hours

(Supplementary Table 6). It is likely that the θ-CAs are responding to the pH because both GLN and NO3 had high pH values even though GLN had less nitrogen availability (Figure 3.3).

Interestingly, this increase in CA expression over time suggests CO2 starvation, even though the cells only grew for 51 hours in a medium containing 50 mM HCO3-. It is possible that as the pH rises D. viridis interprets the shift in carbon species from H2CO3 to HCO3- to CO32- as CO2 starvation.
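
The shift in inorganic carbon species with pH can be made concrete with a simple equilibrium calculation. The sketch below is illustrative only and was not part of the original analysis; it assumes a closed system with a fixed total of 50 mM dissolved inorganic carbon, no gas exchange, and dilute-solution dissociation constants at 25 °C (pKa1 of about 6.35 and pKa2 of about 10.33). The high-salt growth medium would shift these constants, so the absolute numbers should be read as approximate. The function name carbonate_fractions is hypothetical.

```python
def carbonate_fractions(ph: float, pka1: float = 6.35, pka2: float = 10.33) -> dict:
    """Equilibrium fractions of dissolved inorganic carbon at a given pH.

    Returns the fractions present as CO2(aq)/H2CO3, HCO3-, and CO3 2-.
    The default pKa values are assumed dilute-solution values at 25 degrees C.
    """
    h = 10.0 ** (-ph)
    k1 = 10.0 ** (-pka1)
    k2 = 10.0 ** (-pka2)
    denom = h * h + k1 * h + k1 * k2
    return {
        "CO2": h * h / denom,
        "HCO3-": k1 * h / denom,
        "CO3 2-": k1 * k2 / denom,
    }


TOTAL_DIC_MM = 50.0  # total inorganic carbon, taken from the 50 mM HCO3- medium

for ph in (8.2, 9.2):
    fractions = carbonate_fractions(ph)
    summary = ", ".join(
        f"{name} {TOTAL_DIC_MM * frac:.3f} mM" for name, frac in fractions.items()
    )
    print(f"pH {ph}: {summary}")
```

Under these assumptions, bicarbonate remains the dominant species at both pH values, but the dissolved CO2 pool falls by roughly an order of magnitude between pH 8.2 and pH 9.2. This is compatible with the idea that cells could register a rising pH as a decline in CO2 availability even in a bicarbonate-rich medium.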


The CCM also requires the transport of inorganic carbon across multiple membranes. In

Chlamydomonas, the HLA3/LCI1 complex and LCIA reside respectively on the plasma and chloroplast membranes and are critical for proper transport of HCO3- (Mariscal et al., 2006;

Yamano et al., 2015; Mackinder et al., 2017). However, the HLA3 homolog in D. viridis shows steady expression over time, and there is no LCI1 homolog in the D. viridis transcriptome.

LCIA, a putative formate/nitrite transporter family protein, can function as a HCO3- and nitrite transporter and is upregulated under CO2 deprivation in Chlamydomonas (Mariscal et al., 2006;

Yamano et al., 2015). However, the D. viridis homolog, FNT2, and the other FNT proteins are repressed in the GLN condition like other nitrogen assimilation genes, and FNT does not appear to react to the pH of the media (Figure 3.12). Additionally, other putative bicarbonate transporters, Cl-/HCO3- antiporters (Bonar and Casey, 2008) and a CIA8 homolog (Machingura et al., 2017) show no difference in transcription in all conditions. This suggests that increasing pH may not be interpreted as CO2 limitation by D. viridis or that the regulation of the CCM is different between C. reinhardtii and D. viridis. It would not be surprising to find different regulation of the CCM considering the native environments of both algae.

Recently, two thylakoid-localized bestrophin chloride channels were found to interact with LCIB/LCIC (θ-CAs) (Mackinder et al., 2017). Since human bestrophins are permeable to

HCO3- and Cl- (Qu and Hartzell, 2008), Mackinder predicted that the bestrophins could function as the unknown HCO3- transporter from the thylakoid into the pyrenoid (Mackinder et al.,

2017). Three D. viridis bestrophin homologs are upregulated over time as the pH increases in conjunction with the θ-CA homologs (Table 3.1) suggesting that this association and mechanism may also exist for the D. viridis CCM. A chloroplast-localized bestrophin homolog in

Arabidopsis thaliana was shown to activate photoprotective mechanisms under high light stress


through modulation of the proton motive force (Herdean et al., 2016); however, since D. viridis was grown under continuous light with no light fluctuation, we do not suspect that light stress caused the increase in bestrophin transcription. It is entirely plausible that D. viridis also may localize one or several bestrophins to the plasma membrane to import Cl- to exchange for HCO3- in conjunction with the rise in pH that the cells experienced by 51 hours (Figure 3.3). Further study will be required to determine both the localization and purpose of the bestrophin genes as part of the CCM.

Conclusion

Dunaliella viridis displays many of the canonical signs of nitrogen starvation when grown on glutamine as a slow-release source of ammonium. As shown for D. tertiolecta, starch is the major carbon storage form under N-limitation and not lipids. Our transcriptional data suggest that Dunaliella may avoid lipid production by inhibiting acetyl-CoA synthesis and

limiting pyruvate availability. Although the GLN (NH4+) data are difficult to interpret at later time points due to constant nutrient flux, 192 genes are regulated exclusively by the switch from

NO3- to NH4+ in the first 3 hours (Figure 3.9). However, as all the metabolic and growth

parameters are the same at 27 hours, the NH4+ did not have sufficient time to affect phenotype.

Glutamine provided to Dunaliella at 5 mM was not sufficient to sustain nitrogen replete growth.

A higher concentration of glutamine or a strategy such as heterologous expression of an extracellular deaminase to accelerate degradation of glutamine could improve the use of glutamine as a replete nitrogen source. It should be noted that partial nitrogen starvation could be advantageous for biofuel production by supporting simultaneous cell growth and carbon


storage (Lardon et al., 2009); however, it was beyond the bounds of this work to determine this applicability for glutamine. The regulation of nitrogen assimilation genes also appears to be

tailored to the alkaline environment, where NH4+ is favored over NO3-. D. viridis represents an interesting study subject for not only halophytic but also alkali-tolerant survival mechanisms, which can be exploited for bioenergy production.

Acknowledgments

We would like to thank Dr. Soundarya Srirangan for advice and thoughtful discourse. This work was supported by an NSF Emerging Frontiers in Research and Innovation (EFRI) grant [Award

Abstract #1332341].

Author contributions statement

JD, CM, and HS were involved in the conception and design of this project. JD, CM, NV, and DY were involved in collection and assembly of data. JD, CM, NV, DY, and HS were involved in the analysis and interpretation of the data. JD wrote the initial draft of the manuscript. JD, CM, and HS were responsible for critical revision of the article for important intellectual content. All authors were involved in final approval of the article.


Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


References

Alipanah, L., Rohloff, J., Winge, P., Bones, A.M., and Brembu, T. (2015). Whole-cell response to nitrogen deprivation in the diatom Phaeodactylum tricornutum. J. Exp. Bot. 66, 6281-6296, doi/10.1093/jxb/erv340 [doi].

Arnold, A., Sajitz-Hermstein, M., and Nikoloski, Z. (2015). Effects of Varying Nitrogen Sources on Amino Acid Synthesis Costs in Arabidopsis thaliana under Different Light and Carbon-Source Conditions. PLoS One 10, e0116536. doi:10.1371/journal.pone.0116536, doi/PONE-D-14-29214 [pii].

Artimo, P., Jonnalagedda, M., Arnold, K., Baratin, D., Csardi, G., de Castro, E., Duvaud, S., Flegel, V., Fortier, A., Gasteiger, E., Grosdidier, A., Hernandez, C., Ioannidis, V., Kuznetsov, D., Liechti, R., Moretti, S., Mostaguir, K., Redaschi, N., Rossier, G., Xenarios, I., and Stockinger, H. (2012). ExPASy: SIB bioinformatics resource portal. Nucleic Acids Res. 40, 597, doi/10.1093/nar/gks400 [doi].

Ben-Amotz, A., Polle, J.E., and Rao, D.S. (2009). The Alga Dunaliella: Biodiversity, Physiology, Genomics and Biotechnology (Enfield, NH: Science Publishers).

Bonar, P.T. and Casey, J.R. (2008). Plasma membrane Cl(-)/HCO(3)(-) exchangers: structure, mechanism and physiology. Channels (Austin) 2, 337-345, doi/6899 [pii].

Boyle, N.R., Page, M.D., Liu, B., Blaby, I.K., Casero, D., Kropat, J., Cokus, S.J., Hong-Hermesdorf, A., Shaw, J., Karpowicz, S.J., Gallaher, S.D., Johnson, S., Benning, C., Pellegrini, M., Grossman, A., and Merchant, S.S. (2012). Three acyltransferases and nitrogen-responsive regulator are implicated in nitrogen starvation-induced triacylglycerol accumulation in Chlamydomonas. J. Biol. Chem. 287, 15811-15825, doi/10.1074/jbc.M111.334052 [doi].

Camargo, A., Llamas, A., Schnell, R.A., Higuera, J.J., Gonzalez-Ballester, D., Lefebvre, P.A., Fernandez, E., and Galvan, A. (2007). Nitrate signaling by the regulatory gene NIT2 in Chlamydomonas. Plant Cell 19, 3491-3503, doi/tpc.106.045922 [pii].

DiMario, R.J., Machingura, M.C., Waldrop, G.L., and Moroney, J.V. (2018). The many types of carbonic anhydrases in photosynthetic organisms. Plant Sci. 268, 11-17, doi:10.1016/j.plantsci.2017.12.002.

Fang, L., Qi, S., Xu, Z., Wang, W., He, J., Chen, X., and Liu, J. (2017). De novo transcriptomic profiling of Dunaliella salina reveals concordant flows of glycerol metabolic pathways upon reciprocal salinity changes. Algal Res. 23, 135-149, doi:10.1016/j.algal.2017.01.017.


Giordano, M. and Bowes, G. (1997). Gas Exchange and C Allocation in Dunaliella salina Cells in Response to the N Source and CO2 Concentration Used for Growth. Plant Physiol. 115, 1049-1056, doi/115/3/1049 [pii].

Giordano, M. and Raven, J.A. (2014). Nitrogen and sulfur assimilation in plants and algae. Aquat. Bot. 118, 45-61, doi:10.1016/j.aquabot.2014.06.012.

Goncalves, E.C., Koh, J., Zhu, N., Yoo, M.J., Chen, S., Matsuo, T., Johnson, J.V., and Rathinasabapathi, B. (2016). Nitrogen starvation-induced accumulation of triacylglycerol in the green algae: evidence for a role for ROC40, a transcription factor involved in circadian rhythm. Plant J. 85, 743-757, doi/10.1111/tpj.13144 [doi].

Goncalves, E.C., Wilkie, A.C., Kirst, M., and Rathinasabapathi, B. (2016). Metabolic regulation of triacylglycerol accumulation in the green algae: identification of potential targets for engineering to improve oil yield. Plant. Biotechnol. J. 14, 1649-1660, doi/10.1111/pbi.12523 [doi].

Gutierrez, J., Kwan, T.A., Zimmerman, J.B., and Peccia, J. (2016). Ammonia inhibition in oleaginous microalgae. Algal Res. 19, 123-127, doi:10.1016/j.algal.2016.07.016.

Herdean, A., Teardo, E., Nilsson, A.K., Pfeil, B.E., Johansson, O.N., Unnep, R., Nagy, G., Zsiros, O., Dana, S., Solymosi, K., Garab, G., Szabo, I., Spetea, C., and Lundin, B. (2016). A voltage-dependent chloride channel fine-tunes photosynthesis in plants. Nat. Commun. 7, 11654, doi/10.1038/ncomms11654 [doi].

Hong, L., Liu, J.L., Midoun, S.Z., and Miller, P.C. (2017). Transcriptome sequencing and annotation of the halophytic microalga Dunaliella salina. J. Zhejiang Univ. Sci. B. 18, 833-844, doi/10.1631/jzus.B1700088 [doi].

Hosseini Tafreshi, A. and Shariati, M. (2009). Dunaliella biotechnology: methods and applications. J. Appl. Microbiol. 107, 14-35, doi/10.1111/j.1365-2672.2009.04153.x [doi].

Huo, Y., Wernick, D.G., and Liao, J.C. (2012). Toward nitrogen neutral biofuel production. Curr. Opin. Biotechnol. 23, 406-413, doi:10.1016/j.copbio.2011.10.005.

Iomini, C., Li, L., Mo, W., Dutcher, S.K., and Piperno, G. (2006). Two flagellar genes, AGG2 and AGG3, mediate orientation to light in Chlamydomonas. Curr. Biol. 16, 1147-1153, doi/S0960-9822(06)01496-5 [pii].

Jeon, H., Jeong, J., Baek, K., McKie-Krisberg, Z., Polle, J.E.W., and Jin, E. (2016). Identification of the carbonic anhydrases from the unicellular green alga Dunaliella salina strain CCAP 19/18. Algal Res. 19, 12-20, doi:10.1016/j.algal.2016.07.010.


Kanehisa, M., Sato, Y., and Morishima, K. (2016). BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences. J. Mol. Biol. 428, 726-731, doi/S0022-2836(15)00649-X [pii].

Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y., and Morishima, K. (2017). KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, D361, doi/10.1093/nar/gkw1092 [doi].

Keeling, P.J., Burki, F., Wilcox, H.M., Allam, B., Allen, E.E., Amaral-Zettler, L.A., Armbrust, E.V., Archibald, J.M., Bharti, A.K., Bell, C.J., Beszteri, B., Bidle, K.D., Cameron, C.T., Campbell, L., Caron, D.A., Cattolico, R.A., Collier, J.L., Coyne, K., Davy, S.K., Deschamps, P., Dyhrman, S.T., Edvardsen, B., Gates, R.D., Gobler, C.J., Greenwood, S.J., Guida, S.M., Jacobi, J.L., Jakobsen, K.S., James, E.R., Jenkins, B., John, U., Johnson, M.D., Juhl, A.R., Kamp, A., Katz, L.A., Kiene, R., Kudryavtsev, A., Leander, B.S., Lin, S., Lovejoy, C., Lynn, D., Marchetti, A., McManus, G., Nedelcu, A.M., Menden-Deuer, S., Miceli, C., Mock, T., Montresor, M., Moran, M.A., Murray, S., Nadathur, G., Nagai, S., Ngam, P.B., Palenik, B., Pawlowski, J., Petroni, G., Piganeau, G., Posewitz, M.C., Rengefors, K., Romano, G., Rumpho, M.E., Rynearson, T., Schilling, K.B., Schroeder, D.C., Simpson, A.G., Slamovits, C.H., Smith, D.R., Smith, G.J., Smith, S.R., Sosik, H.M., Stief, P., Theriot, E., Twary, S.N., Umale, P.E., Vaulot, D., Wawrik, B., Wheeler, G.L., Wilson, W.H., Xu, Y., Zingone, A., and Worden, A.Z. (2014). The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing. PLoS Biol. 12, e1001889, doi/10.1371/journal.pbio.1001889 [doi].

Lardon, L., Helias, A., Sialve, B., Steyer, J.P., and Bernard, O. (2009). Life-cycle assessment of biodiesel production from microalgae. Environ. Sci. Technol. 43, 6475-6481.

Li, B. and Dewey, C.N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323, doi/10.1186/1471-2105-12-323 [doi].

Li, J., Xue, L., Yan, H., Wang, L., Liu, L., Lu, Y., and Xie, H. (2007). The nitrate reductase gene-switch: a system for regulated expression in transformed cells of Dunaliella salina. 403, 132-142, doi/S0378-1119(07)00429-5 [pii].

Machingura, M.C., Bajsa-Hirschel, J., Laborde, S.M., Schwartzenburg, J.B., Mukherjee, B., Mukherjee, A., Pollock, S.V., Forster, B., Price, G.D., and Moroney, J.V. (2017). Identification and characterization of a solute carrier, CIA8, involved in inorganic carbon acclimation in Chlamydomonas reinhardtii. J. Exp. Bot. 68, 3879-3890, doi/10.1093/jxb/erx189 [doi].

Mackinder, L.C.M., Chen, C., Leib, R.D., Patena, W., Blum, S.R., Rodman, M., Ramundo, S., Adams, C.M., and Jonikas, M.C. (2017). A Spatial Interactome Reveals the Protein Organization of the Algal CO2-Concentrating Mechanism. 171, 147.e14, doi/S0092-8674(17)31002-4 [pii].


Mariscal, V., Moulin, P., Orsel, M., Miller, A.J., Fernández, E., and Galván, A. (2006). Differential Regulation of the Chlamydomonas Nar1 Gene Family by Carbon and Nitrogen. Protist 157, 421-433, doi:10.1016/j.protis.2006.06.003.

Matasci, N., Hung, L., Yan, Z., Carpenter, E.J., Wickett, N.J., Mirarab, S., Nguyen, N., Warnow, T., Ayyampalayam, S., Barker, M., Burleigh, J.G., Gitzendanner, M.A., Wafula, E., Der, J.P., dePamphilis, C.W., Roure, B., Philippe, H., Ruhfel, B.R., Miles, N.W., Graham, S.W., Mathews, S., Surek, B., Melkonian, M., Soltis, D.E., Soltis, P.S., Rothfels, C., Pokorny, L., Shaw, J.A., DeGironimo, L., Stevenson, D.W., Villarreal, J.C., Chen, T., Kutchan, T.M., Rolf, M., Baucom, R.S., Deyholos, M.K., Samudrala, R., Tian, Z., Wu, X., Sun, X., Zhang, Y., Wang, J., Leebens-Mack, J., and Wong, G.K. (2014). Data access for the 1,000 Plants (1KP) project. 3, 17, doi/10.1186/2047-217X-3-17.

Merchant, S.S., Kropat, J., Liu, B., Shaw, J., and Warakanont, J. (2012). TAG, you're it! Chlamydomonas as a reference organism for understanding algal triacylglycerol accumulation. Curr. Opin. Biotechnol. 23, 352-363, doi/10.1016/j.copbio.2011.12.001.

Micallef, L. and Rodgers, P. (2014). eulerAPE: drawing area-proportional 3-Venn diagrams using ellipses. PLoS One 9, e101717, doi/10.1371/journal.pone.0101717 [doi].

Moellering, E.R. and Benning, C. (2010). RNA interference silencing of a major lipid droplet protein affects lipid droplet size in Chlamydomonas reinhardtii. Eukaryot. Cell. 9, 97-106, doi/10.1128/EC.00203-09 [doi].

Morgan, B. and Lahav, O. (2007). The effect of pH on the kinetics of spontaneous Fe(II) oxidation by O2 in aqueous solution – basic principles and a simple heuristic description. Chemosphere 68, 2080-2084, doi:10.1016/j.chemosphere.2007.02.015.

Murphree, C.A., Dums, J.T., Jain, S.K., Zhao, C., Young, D.Y., Khoshnoodi, N., Tikunov, A., Macdonald, J., Pilot, G., and Sederoff, H. (2017). Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth. Front. Plant. Sci. 8, 847.

Oren, A. (2005). A hundred years of Dunaliella research: 1905–2005. 1, 2.

Park, J.J., Wang, H., Gargouri, M., Deshpande, R.R., Skepper, J.N., Holguin, F.O., Juergens, M.T., Shachar-Hill, Y., Hicks, L.M., and Gang, D.R. (2015). The response of Chlamydomonas reinhardtii to nitrogen deprivation: a systems biology analysis. Plant J. 81, 611-624, doi/10.1111/tpj.12747 [doi].

Paz, Y., Katz, A., and Pick, U. (2007). A multicopper ferroxidase involved in iron binding to transferrins in Dunaliella salina plasma membranes. J. Biol. Chem. 282, 8658-8666, doi/M609756200 [pii].

Pick, U. and Avidan, O. (2017). Triacylglycerol is produced from starch and polar lipids in the green alga Dunaliella tertiolecta. J. Exp. Bot. 68, 4939-4950, doi/10.1093/jxb/erx280 [doi].


Polle, J.E.W., Neofotis, P., Huang, A., Chang, W., Sury, K., and Wiech, E.M. (2014). Carbon Partitioning in Green Algae (Chlorophyta) and the Enolase Enzyme. 4, 612-628, doi/10.3390/metabo4030612 [doi].

Polle, J.E.W., Barry, K., Cushman, J., Schmutz, J., Tran, D., Hathwaik, L.T., Yim, W.C., Jenkins, J., McKie-Krisberg, Z., Prochnik, S., Lindquist, E., Dockter, R.B., Adam, C., Molina, H., Bunkenborg, J., Jin, E., Buchheim, M., and Magnuson, J. (2017). Draft Nuclear Genome Sequence of the Halophilic and Beta-Carotene-Accumulating Green Alga Dunaliella salina Strain CCAP19/18. Genome Announc 5, 17, doi/e01105-17 [pii].

Preininger, E., Kosa, A., Lorincz, Z.S., Nyitrai, P., Simon, J., Boddi, B., Keresztes, A., and Gyurjan, I. (2015). Structural and functional changes in the photosynthetic apparatus of Chlamydomonas reinhardtii during nitrogen deprivation and replenishment. 53, 369-377, doi/10.1007/s11099-015-0129-y.

Puente-Sanchez, F., Olsson, S., and Aguilera, A. (2016). Comparative Transcriptomic Analysis of the Response of Dunaliella acidophila (Chlorophyta) to Short-Term Cadmium and Chronic Natural Metal-Rich Water Exposures. Microb. Ecol. 72, 595-607, doi/10.1007/s00248-016-0824-7 [doi].

Qu, Z. and Hartzell, H.C. (2008). Bestrophin Cl(−) Channels are Highly Permeable to HCO(3)(−). Am. J. Physiol. Cell. Physiol. 294, 1371, doi/10.1152/ajpcell.00398.2007 [doi].

Rismani-Yazdi, H., Haznedaroglu, B.Z., Bibby, K., and Peccia, J. (2011). Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: pathway description and gene discovery for production of next-generation biofuels. BMC Genomics 12, 148, doi/10.1186/1471-2164-12-148 [doi].

Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140, doi/10.1093/bioinformatics/btp616 [doi].

Roekens, E.J. and Van Grieken, R. (1983). Kinetics of iron(II) oxidation in seawater of various pH. Mar. Chem. 13, 195-202, doi:10.1016/0304-4203(83)90014-2.

Sanz-Luque, E., Chamizo-Ampudia, A., Llamas, A., Galvan, A., and Fernandez, E. (2015). Understanding nitrate assimilation and its regulation in microalgae. Front. Plant. Sci. 6, 10.3389/fpls.2015.00899, doi/10.3389/fpls.2015.00899 [doi].

Schmollinger, S., Muhlhaus, T., Boyle, N.R., Blaby, I.K., Casero, D., Mettler, T., Moseley, J.L., Kropat, J., Sommer, F., Strenkert, D., Hemme, D., Pellegrini, M., Grossman, A.R., Stitt, M., Schroda, M., and Merchant, S.S. (2014). Nitrogen-Sparing Mechanisms in Chlamydomonas Affect the Transcriptome, the Proteome, and Photosynthetic Metabolism. Plant Cell, doi/tpc.113.122523 [pii].

Schwarz, M., Sal-Man, N., Zamir, A., and Pick, U. (2003). A transferrin-like protein that does not bind iron is induced by iron deficiency in the alga Dunaliella salina. Biochim. Biophys. Acta 1649, 190-200, doi/S1570963903001857 [pii].

Shin, H., Hong, S.J., Kim, H., Yoo, C., Lee, H., Choi, H.K., Lee, C.G., and Cho, B.K. (2015). Elucidation of the growth delimitation of Dunaliella tertiolecta under nitrogen stress by integrating transcriptome and peptidome analysis. Bioresour. Technol. 194, 57-66, doi/S0960-8524(15)00955-4 [pii].

Shiraiwa, Y., Goyal, A., and Tolbert, N.E. (1993). Alkalization of the Medium by Unicellular Green Algae during Uptake Dissolved Inorganic Carbon. Plant. and Cell. Physiology 34, 649-657, doi/10.1093/oxfordjournals.pcp.a078467 [doi].

Slocombe, S.P., Zhang, Q., Ross, M., Anderson, A., Thomas, N.J., Lapresa, Á, Rad-Menéndez, C., Campbell, C.N., Black, K.D., Stanley, M.S., and Day, J.G. (2015). Unlocking nature’s treasure-chest: screening for oleaginous algae. Scientific Reports 5, 9844.

Song, T., Gao, Q., Xu, Z., and Song, R. (2011). The cloning and characterization of two ammonium transporters in the salt-resistant green alga, Dunaliella viridis. Mol. Biol. Rep. 38, 4797-4804, doi/10.1007/s11033-010-0621-1 [doi].

Srirangan, S., Sauer, M.L., Howard, B., Dvora, M., Dums, J., Backman, P., and Sederoff, H. (2015). Interaction of Temperature and Photoperiod Increases Growth and Oil Content in the Marine Microalgae Dunaliella viridis. PLoS One 10, e0127562, doi/10.1371/journal.pone.0127562 [doi].

Tan, K.W.M., Lin, H., Shen, H., and Lee, Y.K. (2016). Nitrogen-induced metabolic changes and molecular determinants of carbon allocation in Dunaliella tertiolecta. Scientific Reports 6, 37235.

Vieler, A., Wu, G., Tsai, C.H., Bullard, B., Cornish, A.J., Harvey, C., Reca, I.B., Thornburg, C., Achawanantakun, R., Buehl, C.J., Campbell, M.S., Cavalier, D., Childs, K.L., Clark, T.J., Deshpande, R., Erickson, E., Armenia Ferguson, A., Handee, W., Kong, Q., Li, X., Liu, B., Lundback, S., Peng, C., Roston, R.L., Sanjaya, Simpson, J.P., Terbush, A., Warakanont, J., Zauner, S., Farre, E.M., Hegg, E.L., Jiang, N., Kuo, M.H., Lu, Y., Niyogi, K.K., Ohlrogge, J., Osteryoung, K.W., Shachar-Hill, Y., Sears, B.B., Sun, Y., Takahashi, H., Yandell, M., Shiu, S.H., and Benning, C. (2012). Genome, Functional Gene Annotation, and Nuclear Transformation of the Heterokont Oleaginous Alga Nannochloropsis oceanica CCMP1779. PLoS Genet. 8, e1003064, doi/10.1371/journal.pgen.1003064.

Wang, Y., Stessman, D.J., and Spalding, M.H. (2015). The CO2 concentrating mechanism and photosynthetic carbon assimilation in limiting CO2 : how Chlamydomonas works against the gradient. Plant J. 82, 429-448, doi/10.1111/tpj.12829 [doi].

Winter, H. (1993). Untersuchungen zur Akkumulation und Translokation von Assimilaten (University of Göttingen).

Yamano, T., Sato, E., Iguchi, H., Fukuda, Y., and Fukuzawa, H. (2015). Characterization of cooperative bicarbonate uptake into chloroplast stroma in the green alga Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. U. S. A. 112, 7315-7320, doi/10.1073/pnas.1501659112 [doi].

Yao, L., Tan, K.W., Tan, T.W., and Lee, Y.K. (2017). Exploring the transcriptome of non-model oleaginous microalga Dunaliella tertiolecta through high-throughput sequencing and high performance computing. BMC Bioinformatics 18, x.

Supplementary Figures

Supplementary Figure 1 | Starch content over time on different nitrogen sources. The INOC timepoint represents the total starch content of the inoculum culture immediately before inoculation, while the 3, 27, and 51 hour timepoints are the starch content at the time of the metabolite collections. Averages are based on 3 biological replicates and error bars represent one standard deviation, except for no nitrogen and glutamate at 0 hours and no nitrogen at 51 hours, which have only 2 replicates. Substantial variance and missing data make these data suspect, although they do reflect the carbohydrate levels found in Figure 3.6. Statistical difference at each timepoint was determined using single factor ANOVA (p≤0.05) followed by Tukey’s HSD. Significant differences within each time point are represented as follows: nitrate and glutamate (a); glutamine and glutamate (b); nitrate and no nitrogen (c); and glutamine and no nitrogen (d).
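As a rough illustration of the first step of the statistical comparison named in this caption, the sketch below computes the single factor ANOVA F statistic for replicate measurements grouped by nitrogen source. The starch values and the function name `one_way_anova_f` are hypothetical and are not data or code from this work. Obtaining the p-value and running Tukey's HSD would additionally require a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate lists.

    Groups may have unequal sizes, as when a replicate is missing.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical starch measurements (arbitrary units) at a single timepoint.
starch = {
    "nitrate": [1.0, 2.0, 3.0],
    "glutamine": [2.0, 3.0, 4.0],
    "no_nitrogen": [5.0, 6.0],  # only 2 replicates, as in some conditions
}
f_stat, df_b, df_w = one_way_anova_f(list(starch.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic alone does not identify which pairs of nitrogen sources differ; that is the role of the post hoc test.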

Supplementary Tables

Supplementary Table 1 | Shared differentially expressed genes between nitrogen conditions. Numbers represent how many genes are differentially expressed and shared among the different comparisons made at each time point for each nitrogen source. This is the numerical output from the Figure 3.9 euler diagrams.

Shared Comparisons, Nitrogen Source  3 Hour  27 Hour  51 Hour
[GLN v NO3]  9  3  103
[GLN v STAR]  25  361  204
[NO3 v STAR]  1171  2181
[GLN v NO3] and [NO3 v STAR]  5  434
[GLN v STAR] and [NO3 v STAR]  2375  969
[GLN v NO3] and [GLN v STAR] and [NO3 v STAR]  11  135
[GLN v NO3] and [GLN v STAR]  30  1  22
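Counts of this kind can be derived from lists of differentially expressed genes with plain set operations. The following is a minimal sketch, assuming that each number is the size of one exclusive region of the Euler diagram (genes found in exactly that combination of comparisons). The gene identifiers and the function name `euler_regions` are hypothetical.

```python
def euler_regions(gene_sets):
    """Count genes in each exclusive region of an Euler diagram.

    gene_sets maps a comparison name to the set of genes that are
    differentially expressed in that comparison. The result maps a tuple
    of comparison names to the number of genes found in exactly those
    comparisons and no others.
    """
    names = list(gene_sets)
    counts = {}
    for gene in set().union(*gene_sets.values()):
        membership = tuple(name for name in names if gene in gene_sets[name])
        counts[membership] = counts.get(membership, 0) + 1
    return counts


# Hypothetical transcript identifiers for one time point.
de_genes = {
    "GLN v NO3": {"t1", "t2", "t3"},
    "GLN v STAR": {"t2", "t3", "t4"},
    "NO3 v STAR": {"t3", "t5"},
}
for region, n in sorted(euler_regions(de_genes).items()):
    print(" and ".join(f"[{name}]" for name in region), n)
```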

Supplementary Table 2 | Shared differentially expressed genes between time points. Numbers represent how many genes are differentially expressed and shared among the different comparisons made between time points for each nitrogen source. This is the numerical output from the Figure 3.10 euler diagrams.

Shared Comparisons, Time Point           GLN   NO3  STAR
[3 v 27]                                 278   216   460
[3 v 51]                                1110    94  1147
[27 v 51]                                178     9     2
[3 v 27] and [27 v 51]                    75    15     2
[3 v 51] and [27 v 51]                   387    32    11
[3 v 27] and [3 v 51] and [27 v 51]       71     5    13
[3 v 27] and [3 v 51]                    272    97  1380

Supplementary Table 3 | Expression data for nitrogen assimilation genes. Data is presented as the Log2FC of the experimental condition over the inoculum (INOC). Data is separated by nitrogen condition and time point. Data used to create Figure 3.12. Abbreviations: GLN, Glutamine; NO3, nitrate; STAR, nitrogen starvation.

                                                                                  GLN               NO3              STAR
Transcript  Description                                           Name           3 hr 27 hr 51 hr  3 hr 27 hr 51 hr  3 hr 27 hr 51 hr
55          ammonium transporter                                  AMT1.1         -3.6   1.6   3.3   0.8  -0.3   0.1   1.2   3.1   3.2
289         ammonium transporter                                  AMT1.2         -0.7   1.1   1.4   1.2   0.3   0.3   1.2   1.1   1.1
80          formate nitrite transporter                           FNT1            0.3   0.2   0.0   0.1   0.3   0.2   0.0   0.1  -0.2
840         formate nitrite transporter                           FNT2           -5.5   1.2  -0.5   0.2   0.3   0.8   0.5   0.6   0.9
1899        formate nitrite transporter                           FNT3           -1.2  -0.9  -0.3  -0.7  -0.8   0.2  -0.6  -1.1  -0.7
2824        formate nitrite transporter                           FNT4           -0.5  -0.4   0.0  -0.5  -0.5  -0.6  -0.4   0.2   0.6
13187       formate nitrite transporter                           FNT5           -0.4  -0.4   0.0  -0.6  -0.5  -0.4   0.0   0.1   0.3
1622        ferredoxin-dependent glutamate synthase               GOGAT, Fer      0.1   0.1   0.2   0.2   0.2   0.2   0.1   0.2   0.1
6877        glutamate synthase, NADH-dependent                    GOGAT, NADH    -0.4   0.4   2.8   0.3  -0.6  -1.1   0.6   2.9   3.2
555         glutamine synthetase, cytosolic                       GS, C          -0.3   0.6   1.6   0.3   0.1   0.4   0.2   1.3   1.4
597         glutamine synthetase, plastidic                       GS, P           0.1   0.5   1.4   0.2   0.5   0.5   0.0   1.0   0.4
1727        high affinity nitrate transporter accessory           NAR2           -3.9   1.0  -1.3   0.7  -0.3   0.1   1.1   1.1   0.7
1093        ferredoxin-nitrite reductase                          NiR            -5.5   0.9  -2.1   0.2  -0.1   0.3   0.5   0.4   0.4
7439        nitrate assimilation regulatory transcription factor  NIT2           -0.4  -0.2   0.7  -0.1  -0.3  -0.4  -0.1   1.4   1.0
363         nitrate/nitrite porter                                NNP1           -6.5   1.4  -2.1   0.7   0.1   0.4   1.4   0.5   0.2
1654        nitrate/nitrite porter                                NNP2           -4.0   1.0  -3.6   0.8   0.3   0.8   1.4  -1.6  -1.4
1115        nitrate/nitrite porter                                NNP3           -4.1   0.8  -4.7   0.7   0.0   0.6   1.3  -4.0  -3.4
2058        nitrate/nitrite porter                                NNP4           -3.4   0.8  -1.1   0.5   0.4   0.7   1.3   0.4   0.3
14213       nitrate/nitrite porter                                NNP5           -0.3   0.4  -0.7  -0.1   0.8   0.3  -0.3  -2.5  -2.2
1095        nitrate reductase                                     NR             -4.5   0.5  -1.2   0.5  -0.4   0.0   0.7  -0.1  -0.4
14987       nitrogen response regulator 1                         NRR1            0.2   0.2   0.2   0.5  -0.2   0.3   0.5   1.2   2.2
2313        nitrogen regulatory protein PII                       PII             0.0   0.8   2.1   0.4   0.5   0.2   0.4   1.9   1.8
282         rhesus protein                                        Rh              1.4   0.9  -0.8   1.3   1.3   0.8   1.1  -2.0  -1.8
14411       MYB transcription factor                              ROC40           1.9  -0.8   1.0   2.1  -0.5  -0.3   1.9   1.5   1.6
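For readers less familiar with the Log2FC scale used in these tables, the sketch below shows the underlying arithmetic: the base 2 logarithm of the ratio of expression in a condition to expression in the reference (here, the inoculum). The function names and input values are hypothetical, and dedicated differential expression packages apply normalization and statistical modelling that this sketch omits.

```python
import math


def log2_fold_change(condition, reference):
    """Log2 of the ratio of a condition's expression to the reference's."""
    if condition <= 0 or reference <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(condition / reference)


def fold_change(log2fc):
    """Convert a Log2FC value back to a linear ratio (condition/reference)."""
    return 2.0 ** log2fc


# Hypothetical normalized expression values for one transcript.
print(log2_fold_change(condition=8.0, reference=1.0))  # positive: higher than reference
print(log2_fold_change(condition=1.0, reference=4.0))  # negative: lower than reference
print(fold_change(-3.6))  # linear ratio implied by a Log2FC of -3.6
```

On this scale a value of 0 means no change, each unit is a doubling or halving, and a Log2FC of -3.6 corresponds to roughly a 12-fold lower level than the reference.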

Supplementary Table 4 | Fatty acid synthesis gene expression under nitrogen starvation. Data is presented as the Log2FC of the starvation condition over nitrate at each time point. Data used to create Figure 3.13. Abbreviations: NO3, nitrate; STAR, nitrogen starvation.

                                                               Log2FC(STAR/NO3)
Transcript  Description                              Name         3 hr  27 hr  51 hr
1160        α-carboxyltransferase subunit            α-CT        -0.31  -0.93  -1.77
3002        biotin carboxylase subunit               BC          -0.14  -1.74  -1.81
1407        biotin carboxyl carrier protein subunit  BCCP        -0.03  -1.98  -2.22
3474        β-carboxyltransferase subunit            β-CT        -0.28  -1.26  -1.90
3109        enoyl-ACP reductase                      EAR         -0.21  -1.34  -1.39
1900        stearoyl-ACP-D9-desaturase 2-1           FAB2-1      -0.08   0.09   0.19
68          stearoyl-ACP-D9-desaturase 2-2           FAB2-2       0.00  -0.11  -0.04
4670        acyl-ACP thioesterase                    FAT          0.05   0.00  -0.24
172         hydroxyacyl-ACP dehydrase                HAD         -0.12  -1.54  -1.41
200         ketoacyl-ACP reductase                   KAR         -0.20  -1.57  -1.53
3765        ketoacyl-ACP synthase I                  KAS I       -0.10  -1.42  -1.44
4759        ketoacyl-ACP synthase II                 KAS II      -0.12  -0.86  -0.71
3487        ketoacyl-ACP synthase III                KAS III     -0.34  -1.99  -2.01
874         malonyl-CoA ACP transacylase             MCT         -0.20  -2.90  -3.00

Supplementary Table 5 | Triacylglycerol metabolism gene expression under nitrogen starvation. Data is presented as the Log2FC of the starvation condition over nitrate at each time point. Data used to create Figure 3.14. Abbreviations: NO3, nitrate; STAR, nitrogen starvation.

                                                                       Log2FC(STAR/NO3)
Transcript  Description                                      Name         3 hr  27 hr  51 hr
12579       3-hydroxyacyl-CoA dehydrogenase                  CHAD         0.08   3.68   2.74
4102        diacylglycerol kinase                            DAGK         0.07   1.41   1.62
6208        diacylglycerol acyltransferase 1                 DGAT1        0.09   0.03   0.33
1408        digalactosyldiacylglycerol synthase 1            DGD1         0.03   1.29   1.04
3780        diacylglycerol acyltransferase 1                 DGTT1-1     -0.14  -0.88  -0.48
6053        diacylglycerol acyl                              DGTT1-2     -0.02  -0.82  -0.63
9514        diacylglycerol acyl transferase                  DGTT1-3     -0.12  -0.02   0.15
11338       diacylglycerol acyl transferase                  DGTT1-4      0.12  -0.42  -0.14
2006        diacylglycerol acyltransferase 2                 DGTT2       -0.07  -0.22  -0.22
7672        diacylglycerol acyltransferase 4                 DGTT4       -0.20  -0.84  -0.52
2217        MGDG specific palmitate delta-7 desaturase       FAD5        -0.08  -0.32  -0.73
3113        omega-6 fatty acid desaturase 2                  FAD6-2      -0.16  -1.42  -1.37
4592        omega-3 fatty acid desaturase                    FAD7        -0.06  -1.16  -1.49
6253        class 3 lipase                                   FAP          0.08   0.45   0.71
306         galactolipid:galactolipid galactosyltransferase  GGGT        -0.18   0.64   0.51
1955        glycerol-3-P acyltransferase                     GPAT        -0.25  -0.94  -1.07
3395        β-ketoacyl-CoA synthase                          KCS         -0.12  -0.66  -0.76
4740        class 3 lipase                                   LCIII        0.57   0.61   0.64
1438        lipase                                           LIPG1        0.15   0.00  -0.23
3183        lipase                                           LIPG2        0.25   1.10   0.39
230         lipase                                           LIPG3        0.37   3.14   1.87
7995        lipase                                           LIPG4       -0.30   1.13   0.79
6126        lysophosphatidic acid acyltransferase            LPAT        -0.09   0.17   0.42
5621        monogalactosyldiacylglycerol synthase 1          MGD1         0.01  -0.36  -0.43
5461        major lipid droplet protein                      MLDP         0.03   2.05   2.22
9078        phosphatidate phosphatase                        PAH         -0.12   0.75   0.54
16678       phosphatidate phosphatase 2                      PAP2         2.28   7.48   6.96
3762        Palmitoyl-CoA hydrolase                          PCH          0.08  -0.16  -0.27
3641        CDP-diacylglycerol synthase                      PCT         -0.05  -0.08   0.40
6787        phospholipid:diacylglycerol acyltransferase      PDAT1        0.27   0.24   0.14
10725       plastid galactoglycerolipid degradation 1        PGD1         0.32   0.90   0.92
2765        phosphatidylglycerolphosphate synthase           PGP         -0.13  -0.27  -0.37
3668        UDP-sulfoquinovose synthase 1                    SQD1        -0.05  -1.20  -0.87
953         sulfolipid synthase                              SQD2         0.11  -0.09   0.09
3414        triacylglycerol lipase                           TAGL        -0.04  -0.33  -0.42
5523        Trans-2-enoyl-CoA reductase                      TER         -0.02  -0.69  -0.80

Supplementary Table 6 | Carbonic anhydrase gene expression in comparison to the inoculum. Data is presented as the Log2FC of the experimental condition over the inoculum. Data is separated by nitrogen condition and time point. Data used to create graphs in Figure 3.15. Abbreviations: GLN, Glutamine; NO3, nitrate; STAR, nitrogen starvation.

                                              GLN                  NO3                 STAR
Transcript  Description           Name       3 hr  27 hr  51 hr   3 hr  27 hr  51 hr   3 hr  27 hr  51 hr
1065        α carbonic anhydrase  α-CA1     -0.94   -0.5  -1.79   -1.0    0.2    1.9   -1.3   -5.9   -6.4
1098        α carbonic anhydrase  α-CA2     -0.45  -0.59  -0.46   -0.4   -0.3    1.0   -0.7   -2.1   -2.5
1180        α carbonic anhydrase  α-CA3     -0.93  -0.53  -1.12   -0.9    0.0    1.9   -1.3   -3.4   -3.9
1014        θ carbonic anhydrase  θ-CA1     -1.27  -1.72  -0.83   -1.0   -1.4    0.7   -1.1   -2.7   -2.2
1717        θ carbonic anhydrase  θ-CA2        -3  -2.72  -0.29   -2.7   -2.6    2.0   -2.6   -2.4   -1.6
5345        θ carbonic anhydrase  θ-CA3     -2.19  0.452  -0.34   -0.9   -0.8    1.7   -0.9    0.1    0.3
5895        θ carbonic anhydrase  θ-CA4     -1.95  -0.56  -0.37   -1.4   -0.6    0.7   -1.1   -1.1   -1.4

Supplementary Table 7 | Carbon metabolism gene expression under nitrogen starvation at 51 hours. Data is presented as the Log2FC of the starvation condition over nitrate at 51 hours. Data used to create Figure 3.15. Abbreviations: NO3, nitrate; STAR, nitrogen starvation.

Transcript  Description                                      Name       51 Hr

Purpose: Calvin Cycle
17933       ribulose-1,5-bisphosphate carboxylase/oxygenase  rbcL       -0.57
            large subunit
100         ribulose-1,5-bisphosphate carboxylase/oxygenase  rbcS       -0.57
            small subunit
144         phosphoglycerate kinase                          PGK        -1.53
203         glyceraldehyde-3-phosphate dehydrogenase         GAPDH1      4.78
206         glyceraldehyde-3-phosphate dehydrogenase         GAPDH2     -0.40
7744        glyceraldehyde-3-phosphate dehydrogenase         GAPDH3     -0.24
1119        triosephosphate isomerase                        TPI        -0.78
304         fructose-1,6-bisphosphate aldolase               ALDO1       3.07
407         fructose-1,6-bisphosphate aldolase               ALDO2       0.28
1356        fructose-1,6-bisphosphate aldolase               ALDO3      -1.15
9728        fructose-1,6-bisphosphate aldolase               ALDO4      -1.45
770         fructose-1,6-bisphosphatase                      FBP1       -0.65
4937        fructose-1,6-bisphosphatase                      FBP2       -0.47
62          sedoheptulose-1,7-bisphosphatase                 SBP        -1.43
320         transketolase                                    TRK        -0.81
361         ribose-5-phosphate isomerase                     R5PI1      -2.37
2221        ribose-5-phosphate isomerase                     R5PI2      -0.91
1346        ribulose phosphate-3-epimerase                   Ru5PE1     -1.07
4419        ribulose phosphate-3-epimerase                   Ru5PE2     -1.37
19          phosphoribulokinase                              PRK        -0.20

Purpose: Starch
14023       phosphoglucose isomerase                         PGI1       -0.53
13991       phosphoglucose isomerase                         PGI2       -0.10
1845        phosphoglucose isomerase                         PGI3        0.50

Purpose: Glycolysis
1976        phosphoglycerate mutase                          PGM1        0.13
11132       phosphoglycerate mutase                          PGM2        0.69
13939       phosphoglycerate mutase                          PGM3       -0.18
250         enolase                                          ENO        -0.19
399         pyruvate kinase                                  PYK1        0.15
479         pyruvate kinase                                  PYK2       -1.08
1859        pyruvate kinase                                  PYK3        0.38
8078        pyruvate kinase                                  PYK4       -0.24

Purpose: Fatty Acid Metabolism
1267        pyruvate dehydrogenase e1 α subunit              PDHα1      -0.13
2501        pyruvate dehydrogenase e1 α subunit              PDHα2      -0.91
909         pyruvate dehydrogenase e1 β subunit              PDHβ1       0.06
4421        pyruvate dehydrogenase e1 β subunit              PDHβ2      -0.90
7517        pyruvate dehydrogenase complex                   PDHDA       0.84
            dihydrolipoamide acetyltransferase
1160        α-carboxyltransferase subunit                    α-CT       -1.77
3002        biotin carboxylase subunit                       BC         -1.81
1407        biotin carboxyl carrier protein subunit          BCCP       -2.22
3474        β-carboxyltransferase subunit                    β-CT       -1.90

Purpose: Pyruvate Hub Enzymes
1578        phosphoenolpyruvate carboxylase                  PEPC1       0.91
3505        phosphoenolpyruvate carboxylase                  PEPC2       1.51
5920        phosphoenolpyruvate carboxykinase                PEPCK       1.13
13192       pyruvate phosphate dikinase                      PPDK        4.50
3233        malic enzyme                                     ME1        -0.11

Supplementary Table 7 (continued).

Transcript  Description                                      Name       51 Hr

Purpose: Pyruvate Hub Enzymes (continued)
3795        malic enzyme                                     ME2         0.34
6244        malic enzyme                                     ME3        -1.11
7054        malic enzyme                                     ME4        -0.64
8393        malic enzyme                                     ME5         0.19

Purpose: Krebs Cycle
1473        citrate synthase, mitochondrial                  CS1         0.26
3268        citrate synthase, glyoxysomal                    CS2         0.49
1626        aconitase                                        ACH1        1.28
1618        isocitrate dehydrogenase                         ICDH1       0.04
3312        isocitrate dehydrogenase                         ICDH2      -0.93
938         2-oxoglutarate dehydrogenase e1 subunit          OGC         0.63
998         dihydrolipoamide dehydrogenase                   OGC        -0.46
5114        dihydrolipoamide dehydrogenase                   OGC        -1.95
11313       2-oxoglutarate dehydrogenase e2 subunit          OGC         0.68
12162       2-oxoglutarate dehydrogenase e2 subunit          OGC         0.74
4434        succinyl-CoA ligase, α subunit                   SCLα        0.21
5159        succinyl-CoA ligase, β subunit                   SCLβ        0.37
4677        succinate dehydrogenase subunit A                SDHa        0.42
4059        succinate dehydrogenase subunit B                SDHb        0.13
654         succinate dehydrogenase subunit C                SDHc       -0.52
1059        fumarate hydratase                               FUH         1.38
14084       malate dehydrogenase, cytosolic                  C-MDH1      2.22
460         malate dehydrogenase, mitochondrial              M-MDH1     -1.68
822         malate dehydrogenase, mitochondrial              M-MDH2      0.64
2867        malate dehydrogenase, mitochondrial              M-MDH3      1.61
219         malate dehydrogenase, chloroplast                P-MDH1     -0.77

CHAPTER 4: Novel Methods of Plastid Genome Modification

Abstract

The ability to engineer plastid genomes is valuable because of the unique genetic and biological capabilities of this organelle. However, the methods available for plastid engineering are limited, generally because they are either technically infeasible or prohibitively time intensive. A framework for developing a new set of plastid engineering tools is proposed based on sequence specific nucleases, retroviral features, viroid fusions, and a condensed Non-Homologous End Joining (NHEJ) pathway. Although initial attempts to implement some tools were unsuccessful, other components were produced in transgenic plants and shown to be active.

Introduction

Applications, Methods, and Limitations of Plastid Engineering

The use of genetically modified (GM) crops is acknowledged as a robust means of reducing pesticide use, providing a means of food security by allowing the control of plant disease, and decreasing the ecological impact of agriculture (National Academies of Sciences, Engineering, and Medicine 2016). GM crops also generate enormous economic activity, as the global value of genetically engineered seed is expected to reach $113.28 billion by 2022 (marketsandmarket.com 2016). However, despite these advantages and the potential of GM crops, no traits have been commercialized that improve yield or water use, and public perception of GM crops remains negative, particularly among younger Americans (Frewer et al. 2013; Pew Research Center 2016). Partly this is due to the regulatory burdens involved in commercializing GM crops. Regulatory concerns are exacerbated because pollen can inadvertently spread transgenes to non-transgenic plants of the same species; this issue in itself has fueled public controversy and animosity between conventional and organic farmers and raised major environmental and regulatory concerns (National Academies of Sciences, Engineering, and Medicine 2016).

Plastid genome engineering may be a solution to the transgene containment issues that plague conventional GM crops and is itself a potentially versatile and valuable tool for crop improvement. Plant cells possess multiple plastids and hundreds to thousands of plastid chromosome copies (Bock 2007a). Plastids, and the genomes therein, are maternally inherited, which is a convenient feature that could be utilized to prevent or reduce the spread of transgenes into the environment (Gao and Zhao 2014). Relatively high levels of transgene expression can be attained in these organelles, enabling the production of high quantities of recombinant protein. This can be advantageous for the production of industrial enzymes or pharmaceutical/therapeutic proteins like vaccines or antibodies (Ahmad et al. 2010; Rigano et al. 2009; Scotti and Cardi 2012; Waheed et al. 2015; Ma et al. 2005; Bock 2007b; Specht et al. 2010; Mayfield and Franklin 2005; Mayfield et al. 2007). The prokaryotic genetic machinery of plastids also likely lacks RNA interference mechanisms, a distinguishing feature that allows for the expression of a variety of useful RNA transgenes, such as double-stranded RNAs that confer insect resistance (J. Zhang et al. 2015).

Applications

Although the plastid genome is small and contains only a few dozen protein coding genes (Bock 2007a), plastid transformation and targeted random mutagenesis are currently the only viable means to study the function of plastid genes and the plastid genome (Bock 2014). Plastid transformation can be used to mutagenize or create knockouts in the plastid genome. This system is useful because of the fidelity of homologous recombination in the chloroplast, and the viability of transforming multiple genetic constructs into a single chloroplast (Bock 2007b). However, the salient limitation in genome modification using current methods of plastid transformation is the requirement of linking sequence alteration to the expression of an antibiotic or other resistance marker (Daniell et al. 2001; Day and Goldschmidt-Clermont 2011). Because many plastid genes are expressed in an operon, inclusion of a resistance marker must also be compatible with operonic transcription.

The prokaryotic-like regulatory environment of the chloroplast allows for high levels of transgene expression, and the expression of unique traits that would otherwise be unattainable in plant nuclear expression systems (J. Zhang et al. 2015; Bock 2007b, 2014). Implementation of classical herbicide resistance traits (glufosinate, glyphosate, D-amino acids) and insect resistance traits (Bt toxin) is achievable and enhanced using plastid expression (J. Zhang et al. 2015; Bock 2007b, 2014; Scotti et al. 2013). Comparatively high levels of transgene expression are directly associated with higher levels of resistance to herbicide application or insect pressure (Bock 2007b); however, this level of expression can itself be a source of undesirable phenotypic side effects (Rigano et al. 2012; Ahmad et al. 2016). Implementation of novel and desirable resistance traits is also achievable via chloroplast transgene expression because, unlike RNA transcribed from nuclear genes, plastid RNA is not subject to RNA interference and processing (J. Zhang et al. 2015). This situation allows for the expression of RNA, such as dsRNA, that otherwise would induce silencing or be processed in a manner that would render it nonfunctional. The implementation of chloroplast expression to produce dsRNA as a means of highly specific pest resistance is particularly attractive as it is a novel mechanism of pest management that could be tailored to discriminate at the level of sequence (J. Zhang et al. 2015). If developed further, it might provide a means of species or even variety specific resistance with less collateral damage than chemical pesticides or Bt toxin.

High levels of expression conferred by plastid expression are also rapidly gaining interest because of the potential to use plants as a vehicle for large scale manufacture of industrial and pharmaceutical proteins and enzymes (Bock 2014; Ahmad et al. 2010; Rigano et al. 2009; Scotti and Cardi 2012; Waheed et al. 2015; Ma et al. 2005; Bock 2007b; Specht et al. 2010; Mayfield and Franklin 2005; Mayfield et al. 2007). Among many products, transgenic chloroplasts have been utilized to produce vaccine antigens, antibody fragments, blood proteins, antibacterial proteins, and thermophilic enzymes used in biomass degradation for biofuel processing (Scotti et al. 2013). Membrane containment of the stroma allows for the segregation and storage of high levels of proteins away from the rest of the cell. This situation makes possible the production and overaccumulation of potentially harmful or lethal proteins that would otherwise arrest growth.

Methods

Plastid transformation is achieved through the deliberate introduction of foreign DNA into the plastid of a living cell (Day and Goldschmidt-Clermont 2011; Bock 2014; Svab et al. 1990; Bock 2007b). This is necessary because transformation of the plastid genome is entirely reliant on integration of foreign DNA via the plastid’s innate homologous recombination DNA repair pathway (Kowalczykowski et al. 1994; Takata et al. 1998; Cerutti et al. 1995; Seisuke and Sakaguchi 2006; Day and Madesis 2007; Boesch et al. 2011; Cerutti et al. 1992). DNA is introduced into plastids mainly through the use of DNA coated ballistic pellets shot via a gene gun (Bock 2014; Yu et al. 2017; Bock 2007b) (Figure 4.1). PEG transformation is a viable, although less efficient, means of DNA transfer (Kindle 1990; Maliga 2004; Nugent et al. 2006; O’Neill et al. 1993; Golds et al. 1993). While transplastomic plants can also be generated via deliberate protoplast fusion or multispecies grafting, these methods have well known practical limitations and do not themselves accomplish the fundamental plastid transformation event (Thyssen et al. 2012; Stegemann et al. 2012).

The current model of plastid transformation holds that DNA integrates into one or several plastid genomes through homologous recombination, which in the plastid likely proceeds via RecA mediated strand invasion (Cerutti et al. 1995, 1992; Boesch et al. 2011; Day and Madesis 2007) (Figure 4.2). Plastids carrying these events are identified and maintained via the use of antibiotic selection (Maliga 2004; Bock 2007b) (Figure 4.1). Once integrated, transgenes are subject to genetic drift that renders their presence fundamentally unstable (Bock 2007b). The heteroplasmic state, a mixture of nontransgenic and transgenic chromosomes, will tend either to revert or to proceed to a homoplasmic state (Bock 2007a). Antibiotic selection has traditionally been used to enforce the existence of an integrated chloroplast transgene, as plastids in antibiotic pressured cells can only survive by expressing the correct antibiotic resistance gene (Figure 4.1). This antibiotic pressure maintains the presence of an integrated transgene in the heteroplasmic state, creating a situation of biased stochastic drift that increases the number of transgene bearing plastid chromosomes until only transgene bearing plastid chromosomes exist (homoplasmy).
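The idea of biased stochastic drift can be illustrated with a toy simulation. The sketch below is not a model taken from this work or from the cited literature: it assumes a Wright-Fisher style resampling of a fixed pool of plastid genome copies each generation, and it represents antibiotic selection as a simple replication advantage for transgene bearing copies. The function name and all parameter values are hypothetical.

```python
import random


def simulate_sorting(n_copies=1000, start_transgenic=1, advantage=0.0,
                     max_generations=10000, rng=None):
    """Follow transgenic plastid genome copies until loss or homoplasmy.

    Returns (outcome, generations), where outcome is True for transgenic
    homoplasmy, False for loss of the transgene, and None if neither
    state was reached within max_generations.
    """
    rng = rng or random.Random()
    k = start_transgenic
    for generation in range(1, max_generations + 1):
        # Selection is modelled as extra sampling weight for transgenic
        # copies (an assumption of this sketch); advantage=0 is pure drift.
        weight = k * (1.0 + advantage)
        p = weight / (weight + (n_copies - k))
        # Resample the whole pool of copies for the next generation.
        k = sum(1 for _ in range(n_copies) if rng.random() < p)
        if k == 0:
            return False, generation
        if k == n_copies:
            return True, generation
    return None, max_generations


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed so repeated runs agree
    for advantage in (0.0, 0.5):
        runs = [simulate_sorting(n_copies=100, start_transgenic=1,
                                 advantage=advantage, rng=rng)
                for _ in range(200)]
        fixed = sum(1 for outcome, _ in runs if outcome)
        print(f"advantage={advantage}: {fixed}/200 runs reached homoplasmy")
```

Comparing runs with and without the advantage term gives an intuition for why a single integration event is usually lost in the absence of selection, while sustained selection biases the same random process toward homoplasmy.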

Limitations

Stable transgenic homoplasmy is achieved only after multiple and lengthy rounds of regenerating and propagating cells subject to antibiotic pressure in this way (Rigano et al. 2012; Svab et al. 1990; Day and Madesis 2007; Maliga 2004; Bock 2014) (Figure 4.1). This process takes place over the scale of months, as each of a cell’s potentially ~1000 plastid genome copies must be converted to match as few as one chromosome resulting from the initial transformation event (Bock 2007b) (Figure 4.1). The time and resources needed to create, maintain, and track plants transformed in this way are an obvious bottleneck, likely limited by the rate at which recombination occurs in the plastid. The amount of time needed to establish homoplasmy can also produce false positive chimeras resulting from gene conversion (Bock 2007b).

Plastid transformation is primarily limited by the requirements of antibiotic selection and the need for a species and tissue specific protocol (Bock 2007b, 2014). Tissue differentiation and endopolyploidization prevent regeneration of fertile individuals (Bock 2007b, 2014). Because all of the plastid transformation methods entail regenerating individuals from callus, there is a basic need to establish that a species to be transformed can in fact survive and propagate in cell culture and produce fertile offspring. However, in some cases tissue culture fails to regenerate fertile plantlets, and this may simply mean that chloroplast transformation by tissue regeneration is not tractable in some species.

Figure 4.1 | Current Approaches to Plastid Genome Modification. Figure from Bock (2007b). Major steps in the tissue culture of plastid transformation of potato plants (Solanum tuberosum) are shown. (a) Aseptic leaves are spread evenly to maximize surface area for particle bombardment. (b) Putative transformants are transferred to media containing antibiotic selection. (c) Transplastomic lines resistant to antibiotics emerge after 11 weeks of selection (white arrow). (d) Resistant plantlets from (c) regenerated again into shoots (picture taken after another 11 weeks). (e) Alternative regeneration round initiated from stem sections of (c) (picture taken after 11 weeks). (f) Homoplasmic plants with differentiated cell types (tuber, root, stem).

Antibiotics are not universally effective, and are not effective at all against many plant species, including most cereals (Maliga 2004; Day and Goldschmidt-Clermont 2011; Rigano et al. 2012; Bock 2014, 2007b). The consequence of a reliance on antibiotics for selection is that plastid transformation has not been, and currently cannot be, pursued in many organisms. Recently, however, the possibility of using mutant varieties more susceptible to antibiotic selection was demonstrated in A. thaliana (Yu et al. 2017). This route enables the creation of transplastomic varieties in species where these methods could not previously be applied, but carries the caveat that each variety to be transformed would have to be in a homozygous background that is appropriately mutagenized. Two obvious drawbacks to this strategy are that it may be intractable to generate functionally homozygous mutations in polyploid organisms, and that after transplastomic lines have been established, the mutagenized background would probably need to undergo additional backcrossing.

Summary

Although great strides have been made in implementing plastid transformation, fundamental limitations remain to be resolved. The reliance on antibiotic selection and the need for a lengthy regeneration process make plastid transformation difficult or impossible to implement in many cases. Presented here is a series of newer concepts, and tools derived from them, that allow for improvements to the plastid genome modification process. The outcome of this discussion is a set of discrete rational designs providing novel methods of plastid transformation, plastid genome editing, and analysis of plastid genetic elements.

Plastid DNA Repair and Implications for Genome Modification

Modification of a genome by transformation or editing necessarily involves pathways of DNA repair that render a sequence change stable and permanent. Plastids possess several mechanisms of DNA repair, most of which serve a role in correcting mutagenesis due to ultraviolet light and the oxygen and nitrogen radicals produced in the highly oxidizing environment of the stroma (Boesch et al. 2011). Plastid transformation is assumed to occur via a homologous recombination repair pathway that entails RecA mediated homologous pairing and strand invasion, but produces several distinct products (Kowalczykowski et al. 1994; Takata et al. 1998; Cerutti et al. 1995; Seisuke and Sakaguchi 2006; Day and Madesis 2007; Boesch et al. 2011; Cerutti et al. 1992) (Figure 4.2). Unfortunately, there is little direct evidence about the circumstances or exact proteins that facilitate major events in plastid homologous recombination (Day and Madesis 2007). Models of plastid DNA repair and replication are instead largely derived from bacterial models, the existence of homologous plastid targeted DNA repair proteins, and observed variation in plastid genome content and structure (Day and Madesis 2007).

Plastids possess several proteins that are implicated in the repair of single base pair mutations resulting from ultraviolet light and oxidizing radicals (Boesch et al. 2011). However, homologous recombination is widely thought to be the only mechanism capable of repairing double stranded DNA breaks in the plastid (Cerutti et al. 1995, 1992; Kowalczykowski et al. 1994; Day and Madesis 2007; Boesch et al. 2011) (Figure 4.2). Homologous recombination in the plastid is assumed to occur via RecA mediated base pairing and strand invasion/exchange (Cerutti et al. 1995, 1992; Kowalczykowski et al. 1994; Day and Madesis 2007; Boesch et al. 2011) (Figure 4.2). Strong evidence for this is provided by the degenerative lethality of RecA mutants, and complementation experiments utilizing E. coli RecA (Cerutti et al. 1995, 1992). Furthermore, reciprocal recombination and gene conversion products are observed as the outcome of plastid genome transformation (Day and Madesis 2007; Bock 2007a, 2014). The minimal homologous sequence length necessary for plastid genome transformation has been established using repeat sequences as a parameter (Day and Madesis 2007). Therefore, the existence of these homologous sequences is assumed here to be the necessary criterion for integration of foreign DNA into a plastid genome (Figure 4.2).

Novel Plastid Modification Tools

The technologies presented in this chapter are reliant on novel tools that allow for transport of RNA from nucleus to plastid, de novo DNA synthesis, and highly specific genome modification. The useful manifestations of these elements are presented in the latter portion of the chapter, but the central biological features underlying their function are presented here.

RNA transit to organelles

That RNA can be synthesized by a nuclear genome and localized to organelles is a well-established principle (Schneider and Maréchal-Drouard 2000; Schatz et al. 1996; Tan et al. 2002; Tarassov et al. 1995; Salinas et al. 2008). The synthesis, regulation, and transport of tRNA derived from the nucleus but localized to organelles is common in a variety of eukaryotes (Schneider and Maréchal-Drouard 2000; Schatz et al. 1996; Tan et al. 2002; Tarassov et al. 1995; Salinas et al. 2008). Evidence of cis elements of mRNA acting as protein zip codes and associated with cotranslational import similarly appears to be present in multiple evolutionary lineages (Fujiki and Verner 1993; Verner 1993; Marc et al. 2002; Weis et al. 2013).



Figure 4.2 | Prokaryotic DNA repair by homologous recombination and relevance to plastid transformation. A. Double-strand-break model of homologous recombination (Kowalczykowski 2000; Szostak et al. 1983). DNA is unwound by a topoisomerase (helicase) and removed on the leading strands emanating from a double stranded break by an exonuclease. Exposed 3’ ends are coated in RecA, which facilitates base pairing between homologous sequence and strand invasion. DNA polymerase adds nucleotides 5’ to 3’ on both the leading strand annealed to an emerging D loop template and the invading lagging strand annealing to the opposite strand. Expansion of the D loop results in the formation of a second Holliday junction. Holliday junctions are cleaved by resolvase and result in two distinct recombination products. B. Integration of DNA into the plastid genome by homologous recombination. Homologous sequence can integrate into the plastid genome either by reciprocal recombination or gene conversion (Shen et al. 1986; Day and Madesis 2007).

However, there are now multiple lines of evidence suggesting that in plant lineages, nuclear encoded or viral mRNA can localize to plastid stroma (Ahmad 2016; Gómez and Pallas 2012; Daròs 2016; Gómez and Pallás 2010a; Baek et al. 2017; Fadda et al. 2003; Nicolaï et al. 2007). For example, RNA viroids of the Avsunviroidae family localize and replicate within plastids (Ahmad 2016; Flores et al. 2000; Daròs 2016; Fadda et al. 2003) (Figure 4.3).

Although the nature of the transport mechanisms by which these viroids enter the plastid is unclear, the demonstrated stromal localization of viroid RNA and reliance on species specific chloroplast tRNA ligase and other proteins for completion of the viral life cycle are compelling evidence of the association between viroid RNA and plastid stroma (Flores et al. 2000; Daròs 2016; Fadda et al. 2003) (Figure 4.3). A more recent observation is that a plant’s own mRNA coding for the Eif4E gene may be an in cis acting RNA localization signal (Nicolaï et al. 2007). However, replication of this result has not been forthcoming.

Importantly, it has been shown that one member of Avsunviroidae, Eggplant latent viroid (ELVd), can be trafficked to a plastid even in a truncated form, or transcriptionally fused as a 5’UTR to an mRNA coding for eGFP (Daròs 2016; Gómez and Pallas 2012; Gómez and Pallás 2010a) (Figure 4.3). Remarkably, expression of the 5’ ELVd UTR:eGFP fusion in a non-host species produced functional eGFP protein localized to plastids (Gómez and Pallas 2012; Gómez and Pallás 2010a) (Figure 4.3). This is despite the absence of sequence coding for a canonical chloroplast transit peptide, and despite the fact that the eGFP coding sequence is out of frame with prior start codons in the UTR sequence. There are two possible explanations for this. The first would be that the truncated ELVd RNA is also a cis acting zip code that promotes cotranslational import (Weis et al. 2013). The second and more tantalizing possibility is that GFP is translated by stromal ribosomes. The truncated ELVd sequence contains only one canonical plastid ribosome binding site / Shine-Dalgarno sequence (Gómez and Pallas 2012; Gómez and Pallás 2010a). Unfortunately, this site is 76 bp upstream of the nearest start codon, which is itself out of frame with eGFP. However, the requirement for a canonical ribosome binding site to recruit a ribosome and initiate translation in plastids is not absolute (Peled-Zehavi and Danon 2007; Subramanian et al. 1991) (Appendix B).
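The frame and spacing argument above can be checked mechanically for any candidate 5’UTR. The following is a minimal sketch, not the analysis used in this work: the function name `find_upstream_starts` and the default Shine-Dalgarno-like motif `GGAGG` are assumptions chosen for illustration, and the ELVd sequence itself is not reproduced here. For each ATG in a UTR, the sketch reports whether it is in frame with the downstream coding sequence and how far it lies from the nearest upstream motif.

```python
import re

def find_upstream_starts(utr: str, sd_motif: str = "GGAGG"):
    """List each ATG in a 5' UTR with its frame relative to the CDS start.

    The CDS start codon is assumed to begin immediately after the UTR,
    i.e. at index len(utr). `sd_motif` is a hypothetical Shine-Dalgarno-like
    motif; substitute whichever motif is appropriate.
    """
    utr = utr.upper().replace("U", "T")
    # Lookahead patterns give zero-width matches, so overlapping hits are found.
    sd_positions = [m.start() for m in re.finditer(f"(?={sd_motif})", utr)]
    results = []
    for m in re.finditer("(?=ATG)", utr):
        pos = m.start()
        # In frame when the distance to the CDS start is a multiple of three.
        in_frame = (len(utr) - pos) % 3 == 0
        # Motifs that end at or before this ATG; the last one is the nearest.
        upstream = [p for p in sd_positions if p + len(sd_motif) <= pos]
        spacing = pos - (upstream[-1] + len(sd_motif)) if upstream else None
        results.append({"position": pos, "in_frame": in_frame, "sd_spacing": spacing})
    return results

if __name__ == "__main__":
    # Toy UTR: one motif followed five bases later by an out-of-frame ATG.
    for hit in find_upstream_starts("AAGGAGGAAAAAATGCC"):
        print(hit)
```

A start codon that is out of frame and far from any ribosome binding motif, as described for the truncated ELVd sequence, would appear here with `in_frame` set to false and a large `sd_spacing`.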

The implication of using ELVd as a tool for biotechnology is that as a UTR it might be able to facilitate stromal localization of any desired RNA. This could extend to other protein coding mRNAs besides eGFP, or to non-coding RNA.


Useful Features of Retroviral Replication

To transform a plastid genome via homologous recombination, a source of homologous foreign DNA is necessary. Traditionally the DNA supplied is synthetic, manufactured either by PCR or replication in E. coli (Svab et al. 1990; Maliga 2004; Day and Goldschmidt-Clermont 2011; Bock 2014). However, an alternative route to supply DNA would be to create DNA de novo, via an in vivo process. The known mechanisms of DNA synthesis are limited in this regard. Replication via DNA polymerase is the predominant method by which DNA is synthesized in vivo (Garfinkel, Boeke, and Fink 1985; Mak and Kleiman 1997; Negroni and Buc 2001). However, this process takes place at the scale of chromosomes and whole genomes, and is both tightly controlled and localized to nuclear and nucleoid regions (Day and Madesis 2007; Bock 2007a). Importantly, there is also no clear mechanism of DNA transfer between nucleus and organelles, and plastids themselves only appear to receive DNA through horizontal gene transfer resulting from rare plastid fusion events during mating (Bock and Timmis 2008; Bock 2007a).

Reverse transcription, on the other hand, is a common process among transposable and viral RNA elements (Garfinkel et al. 1985; Mak and Kleiman 1997; Negroni and Buc 2001). Conversion of RNA to DNA via reverse transcription occurs at scales of several bp to several kb, which are similar in size to plastid transgenes. Furthermore, the mechanisms of reverse transcription are well understood, and can be applied and manipulated in vivo (Robertson et al. 1986; Culver et al. 1992; Miller et al. 1990; Ryder et al. 1990). Importantly, as RNA transit to plastids appears to occur in several plant species, reverse transcription would potentially be a convenient way to generate DNA de novo in plastids from an RNA sent from a plant nucleus.

Figure 4.3 | Plastid Localization of Avsunviroidae RNA. A. Replication pathway of Avsunviroidae. Avsunviroidae can localize in plastids through an unknown mechanism. In the plastid environment, genomic RNA is replicated into a – strand RNA concatemer by rolling circle replication, which is self-processed by cis acting ribozymes. Processed RNA is then reannealed by a host tRNA ligase, and an RNA concatemer of the original + strand is synthesized by rolling circle replication (Gago-Zachert 2016). The + strand is then self-processed by cis acting ribozymes, and the resulting individual genomes are ligated again by the host’s own tRNA ligase. The resulting nascent viroids are then able to exit the plastid by an unknown mechanism. B. Expression of viroid fused GFP transcript generates GFP signal in plastids. Confocal microscope images of N. benthamiana expressing cytosol localized GFP (left panels), a transcriptional fusion between a truncated Eggplant Latent Viroid (ELVd) and an out of frame GFP (central panels), and a GFP protein localized to plastid by translational fusion to an N-terminal transit peptide (right panels) (Gómez and Pallás 2010b).

There are several known processes that involve reverse transcription: retrotransposition, telomere lengthening, propagation of prokaryotic retrons, and replication of a variety of viruses (Mak and Kleiman 1997; Negroni and Buc 2001). However, the most appropriate use of reverse transcription is the de novo synthesis of DNA from RNA. This process is described in the Strong-stop Strand Transfer model (Mak and Kleiman 1997; Negroni and Buc 2001) (Figure 4.4), which entails the formation of double stranded DNA from an RNA template via reverse transcription. The reverse transcription process is initiated by annealing a cell’s own tRNA as a primer to a complementary primer binding sequence (PBS) by the action of reverse transcriptase. Reverse transcriptase then synthesizes DNA in a 5’ to 3’ manner from both RNA and DNA templates, and engages in a series of RNase H mediated degradations of the RNA template and strand transfers. These are accomplished via the existence of complementary viral sequence features: PBS, R, U5, U3, and Poly Purine Tract (PPT) sequences that orient the action of reverse transcriptase.
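The net effect of the two strand transfers on sequence layout can be illustrated with a short sketch. It assumes the standard retroviral arrangement, in which the genomic RNA reads R-U5-PBS-(body)-PPT-U3-R and the double stranded product carries a complete U3-R-U5 long terminal repeat at each end. The class name `RetroviralTemplate` and the placeholder strings are hypothetical and are not the PTEC constructs described in this work.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetroviralTemplate:
    """Plus-strand sequence features of a retroviral-like RNA (illustrative)."""
    r: str
    u5: str
    pbs: str
    body: str
    ppt: str
    u3: str

    def genomic_rna(self) -> str:
        # Layout of the RNA template before reverse transcription.
        return self.r + self.u5 + self.pbs + self.body + self.ppt + self.u3 + self.r

    def proviral_dna(self) -> str:
        # After both strand transfers each end carries a full U3-R-U5 repeat.
        ltr = self.u3 + self.r + self.u5
        return ltr + self.pbs + self.body + self.ppt + ltr

if __name__ == "__main__":
    # Placeholder strings stand in for real sequence so the layout is visible.
    t = RetroviralTemplate(r="RR", u5="55555", pbs="PPP", body="gene", ppt="TT", u3="3333")
    print(t.genomic_rna())
    print(t.proviral_dna())
```

The DNA product is longer than the RNA template by the combined length of U3 and U5, because each of these features is duplicated during the strand transfers.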

Every step entailed by the Strong-stop Strand Transfer Model is mediated by a derivative of the gag protein, the nucleocapsid polypeptide (NC) (Darlix et al. 2014; Cornille et al. 1990; Mak and Kleiman 1997; Negroni and Buc 2001). Relaxation of the secondary structure and binding of tRNA to a retroviral RNA is thought to be due to the action of NC, which has high affinity for single stranded nucleic acids, and acts as an RNA chaperone. NC protein is implicated in all strand transfer and DNA/RNA displacement steps, and in vivo is the most abundant component of the retrovirus, coating the entire retroviral RNA template. In vivo and in vitro studies have shown an absolute requirement for this peptide in retroviral replication, highlighting its ubiquitous and critical roles (Tsuchihashi and Brown 1994; De Rocquigny et al. 1992; Darlix et al. 1995; Rice et al. 1995; Darlix et al. 2014).

Implementation of viral sequence features and Reverse Transcriptase and NC proteins is proposed here to form the minimal set of transgenic features necessary to perform reverse transcription in a plastid in vivo. This is because a plastid possesses its own set of tRNA, and all of the nucleic acid raw materials necessary for reverse transcription (Peled-Zehavi and Danon 2007). A caveat is that there does not appear to be any evidence of reverse transcription in plastids, and unlike nuclear and prokaryotic genomes there do not appear to be any transposable elements. To date there are no published reports of implementation of reverse transcription in plastids, and it remains an open question whether this process could be made functional using naked RNA, as is the case for retroviruses, or whether LTR retrotransposon virus-like particles would be required to protect the reverse transcription complex of RNA, NC, and reverse transcriptase proteins.

Figure 4.4 | Reverse transcription of retroviral genomic RNA into double stranded proviral DNA via the strong stop strand transfer model. This figure is modified from (Mak and Kleiman 1997). Step 1: Primer tRNA is annealed to the PBS, and minus-strand strong-stop cDNA is synthesized. Positive strand R and U5 RNA is degraded by the RNase H activity of reverse transcriptase. Step 2: cDNA synthesized in step 1 is annealed to the 3’ terminus of the genomic RNA via complementarity between identical R and U5 sequence. Steps 3 and 4: Synthesis of minus-strand cDNA, and degradation of + strand RNA by RNase H. Only the polypurine tract (PPT) remains, because it is also the primer for plus-strand strong-stop cDNA synthesis (step 5). Step 5: plus-strand strong-stop cDNA is synthesized, ending in a 3’ PBS sequence. Primer tRNA is released. Step 6: The + strand cDNA is transferred to the 3’ end of the – strand, annealing at complementary R, U5, and PBS sites. Step 7: DNA is synthesized 5’ to 3’ in both directions by reverse transcriptase. Primer tRNAs for Reverse Transcription


Genome Editing and Modification of Expression in Plastid Using CRISPR-Cas Systems

Genome editing via the use of targeted systems has exploded as a technology in recent years (Mali et al. 2013; Horvath et al. 2010; A. J. Wood et al. 2011; Belhaj et al. 2015; Gaj et al. 2013). However, these systems have not to date been employed in manipulating the plastid genome. In the past this was likely due to the high technical and financial cost of developing TALEN and ZFN systems, and also partly due to the lack of an NHEJ pathway in the plastid, which is necessary for the use of targeted nucleases as a deliberate mutagen (Gaj et al. 2013). However, the development of CRISPR-Cas systems has enabled the rapid and inexpensive application of genome editing to a variety of organisms. Presented here is a basic overview of the function and implementation of CRISPR-Cas systems, with a specific focus on applications in prokaryotic environments.

DNA modification and modulation by CRISPR-Cas systems is facilitated via the combination of a targeting RNA, the CRISPR RNA (crRNA), and a complementary Cas protein, typically Cas9 (Horvath et al. 2010; Gaj et al. 2013) (Figure 4.5). The association of Cas9 and a transcriptionally fused crRNA-tracrRNA called an sgRNA has been demonstrated to localize and act on specific sequences in a variety of different systems and cell types (Horvath et al. 2010; Ma et al. 2015; Mali et al. 2013; Ran et al. 2013; Belhaj et al. 2015). Furthermore, construction of these systems is practically convenient because a specific sequence target can be readily altered by standard cloning methods. These features have led to the widespread adoption of CRISPR-Cas systems in many organisms, including plants, with most manifestations being the use of CRISPR-Cas9 mediated cleavage and either mutation of target sequences or gene replacement in the nuclear genome (Ran et al. 2013) (Figure 4.5). Mutation is a highly efficient process via CRISPR-Cas9 cleavage in eukaryotic nuclear genomes because the NHEJ repair pathway is dominant (Ran et al. 2013). However, targeted cleavage in this way is flexible in that it can also be used to introduce deliberate mutations or even transgenes via a cell’s innate homologous recombination mechanism (Ran et al. 2013) (Figure 4.5).
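Because the sequence target is set by the guide RNA alone, candidate target sites can be enumerated directly from a sequence. The sketch below assumes the commonly used Streptococcus pyogenes Cas9 convention of a 20 nt protospacer followed immediately by an NGG protospacer adjacent motif; that convention is not specified in this chapter, and the function names are hypothetical. The sequence is treated as linear, so sites spanning the origin of a circular genome are not reported.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_protospacers(seq: str, length: int = 20):
    """Return (strand, start, protospacer, pam) for each NGG-adjacent site.

    For the minus strand, `start` is given in reverse-complement coordinates.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The last valid start leaves room for the protospacer plus a 3 nt PAM.
        for i in range(len(s) - length - 2):
            pam = s[i + length : i + length + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, s[i : i + length], pam))
    return hits

if __name__ == "__main__":
    # Toy input: a single plus-strand site with PAM "TGG".
    print(find_protospacers("A" * 20 + "TGG"))
```

In practice candidate sites would also be screened for uniqueness within the target genome, which this sketch does not attempt.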

Figure 4.5 | Application of CRISPR-Cas9 systems to generate sequence specific double stranded breaks. CRISPR-Cas systems generate double stranded breaks that are repaired in one of two ways. Non-Homologous End Joining (NHEJ) (left) entails error prone processing of 5’ and 3’ overhanging ends at a double stranded break site. Strands are ligated by endogenous DNA repair machinery. This can result in a number of mutations, including indel formation or base pair replacement. Alternatively, repair is accomplished by homologous recombination using a template complementary to the sites of the double stranded break (right). Any sufficiently homologous sequence can serve as a repair template: double stranded or single stranded DNA, linear or circular (Ran et al. 2013).

Challenges arise when CRISPR-Cas systems are implemented in prokaryotes. This is because unlike eukaryote nuclear genomes, most prokaryotes repair DNA obligately through homologous recombination (Szostak et al. 1983; Takata et al. 1998; Kowalczykowski 2000; Shen et al. 1986; Chayot et al. 2010). Implementation of CRISPR-Cas systems tends to be lethal if expressed in a living bacterial cell, implying that there are no repair alternatives capable of ameliorating a high level of double stranded breaks (Selle and Barrangou 2015; Gomaa et al. 2014). As a consequence, CRISPR-Cas systems in bacteria are used for gene knockdowns rather than gene knockouts, mediated by catalytically inactive Cas9 (Zhao et al. 2015; Liu et al. 2017; Larson et al. 2013; Chavez et al. 2015). This system is known as CRISPRi.

Plastid and prokaryote genetics are similar in that neither possesses an NHEJ pathway by which CRISPR/Cas mediated double stranded breaks could be repaired. This is clearly an impediment to the usefulness of these systems. Both plastids and prokaryotes typically possess hundreds to thousands of identical genome copies, each of which can serve as a template for homology driven repair. However, the CRISPR-Cas nucleoprotein complex can potentially act on all of these potential genome targets (Horvath et al. 2010; Gaj et al. 2013). These systems could therefore be used to generate double stranded DNA breaks in most or all plastid chromosomes, which may limit or eliminate viable templates for the homologous repair pathway.

There are several alternative sequence specific nucleases that do not generate double stranded breaks. These tools could therefore be implemented in plastids with less deleterious consequences. CRISPRi has already been mentioned, but coupling of Cas9 in particular to transcriptional activators, otherwise known as CRISPR activator systems, could also be implemented to interrogate chloroplast genomic elements (Chavez et al. 2016, 2015; Gilbert et al. 2014; Tanenbaum et al. 2014; Konermann et al. 2014; Z. Li et al. 2017). There are also newer CRISPR-Cas systems in place that cleave or bind RNA, which could be used to interrogate plastid gene expression (O’Connell et al. 2014).

Implementation of CRISPR-Cas systems in plastids could be accomplished using a biolistic approach. The obvious caveat is that implementation needs to be tailored in such a way that it results in a viable plastid. Application of a sequence specific nuclease in plastids was demonstrated previously using an inducible promoter system, which caused partial plastid chromosome deletions and resulted in dramatic morphological changes (Kwon et al. 2010).


CRISPR-Cas systems have not been implemented using traditional plastid transformation, although there are no obvious practical barriers to doing so. In most circumstances this would likely involve the use of an alternative Cas protein that did not cleave the plastid genome. However, an intriguing potential use of active CRISPR-Cas would be its implementation in plastid gene drive (Webber et al. 2015; Akbari et al. 2015; Gantz et al. 2015; Hammond et al. 2016; Sinkins and Gould 2006). The primary reason for doing this would be to accelerate gene conversion in the heteroplasmic state of plastid DNA transformation. A constitutively active CRISPR-Cas system would have the effect of targeted cleavage of all non-transgenic genome copies, but transgenic genomes would lack the target sequence, protecting them from cleavage. Although it is unclear if cleavage to a linear form would prevent replication, it should favor repair using the transgenic genome as a template. Additionally, the cleavage site could be designed such that it prevents expression of an essential gene, with a transgene template providing a synonymous mutation that prevents cleavage. This would provide direct pressure favoring the incorporation of a transgene.
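The synonymous mutation strategy described above can be explored computationally. The following minimal sketch lists single-base substitutions within a chosen window of a coding sequence that leave the encoded protein unchanged; any such change that falls within a nuclease recognition site is a candidate protective mutation for a transgene template. The function names are hypothetical. The codon assignments are those of the standard genetic code, which is assumed here to apply to plastid genes, and the coding sequence is assumed to begin in frame at index 0.

```python
from itertools import product

_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}

def translate(cds: str) -> str:
    """Translate complete codons of an in-frame uppercase DNA sequence."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i : i + 3]] for i in range(0, usable, 3))

def synonymous_substitutions(cds: str, start: int, end: int):
    """Return (position, original_base, new_base) for silent changes in [start, end)."""
    cds = cds.upper()
    protein = translate(cds)
    silent = []
    for pos in range(start, end):
        for base in "ACGT":
            if base == cds[pos]:
                continue
            mutant = cds[:pos] + base + cds[pos + 1 :]
            if translate(mutant) == protein:
                silent.append((pos, cds[pos], base))
    return silent

if __name__ == "__main__":
    # Toy example: Met-Gly; only the glycine wobble position can change silently.
    print(synonymous_substitutions("ATGGGG", 0, 6))
```

Whether a given silent change actually abolishes cleavage depends on the nuclease and would need to be confirmed experimentally.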

In principle, any nuclease could be used for the purpose of creating gene drive: CRISPR-Cas systems, TALEN, ZFN, Homing endonuclease, or even restriction enzymes (Gupta et al. 2012; Wood et al. 2011; Bedell et al. 2012; Brouns et al. 2008; Hockemeyer et al. 2011; Mussolino et al. 2011; Christian et al. 2010; Gaj et al. 2013; Miller et al. 2011; Smith and Nathans 1973; Horton et al. 1989; Stoddard 2011, 2006; Stoddard et al. 1998; Duan et al. 1997) (Figure 4.6). However, whether or not these systems actually function or would achieve a desired acceleration of gene conversion in vivo remains to be proven.


Novel Methods of Plastid Genome Modification

To overcome the limitations of traditional plastid genome engineering, an alternative set of synthetic genome modification tools is presented. One of the outcomes of this discussion was the creation of intellectual property derived from the ideas outlined (Appendix A). The character of the tools presented is that they are intended to be expressed in a transgenic plant using Agrobacterium or biolistic mediated nuclear genome transformation. This is because nuclear transformation by these methods is more common and tractable, and may be a route to modify species that are recalcitrant to traditional plastid modification techniques. Additionally, these tools rely on the assumption that any RNA can be artificially designed to localize to chloroplast stroma.

Figure 4.6 | Unique Restriction Enzyme Sites in the plastid genome of N. tabacum. Unique restriction enzyme sites in the plastid genome of N. tabacum (NC_001879.2) were identified using NEBcutter (Vincze et al. 2003) (http://nc2.neb.com/NEBcutter2/). Arrows denote open reading frames.
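The principle behind a search for unique sites can be sketched in a few lines. This is not the NEBcutter algorithm and is not the analysis used to produce Figure 4.6. It assumes a circular genome supplied as a plain string, and uses a small illustrative panel of six enzymes whose recognition sequences are palindromic, so that scanning one strand is sufficient. The function names and the toy sequence are hypothetical.

```python
# Illustrative panel of palindromic recognition sequences (not exhaustive).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "SacI": "GAGCTC",
    "SphI": "GCATGC",
}

def count_sites_circular(genome: str, site: str) -> int:
    """Count occurrences of `site` in a circular sequence, including the origin."""
    genome = genome.upper()
    # Append the first len(site) - 1 bases so matches spanning the origin are seen.
    wrapped = genome + genome[: len(site) - 1]
    return sum(1 for i in range(len(genome)) if wrapped.startswith(site, i))

def unique_cutters(genome: str, sites=SITES):
    """Names of enzymes whose site occurs exactly once in the circular genome."""
    return sorted(name for name, site in sites.items()
                  if count_sites_circular(genome, site) == 1)

if __name__ == "__main__":
    # Toy circular sequence: one BamHI site, one EcoRI site spanning the origin.
    print(unique_cutters("TTCAAAGGATCCAAAGAA"))
```

For a real plastid genome the sequence would be read from the corresponding accession record, and non-palindromic sites would require scanning both strands.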

Implementation of two generic genome modification tools in plastids using nuclear transgene expression is presented here. The first is a method for modifying the plastid genome by integration of DNA generated de novo in plastids using a synthetic retroviral-like RNA and reverse transcriptase pair. The second is a method for implementing CRISPR-Cas systems in plastids using nuclear transgene expression.

Plastid Transformation via de novo DNA synthesis

It may be possible to transform plastid genomes using DNA generated de novo in a plastid. DNA would be generated by the deliberate and coupled action of a reverse transcriptase protein and a paired RNA (Figure 4.7). The idea is that reverse transcriptase would act on an RNA localized to plastid stroma and could be designed to produce either single stranded or double stranded DNA using the plastid’s tRNA and nucleotides. Key considerations of this idea are that both reverse transcriptase and a paired RNA can be localized to a plastid, that key activities of reverse transcription are active in the stromal environment, and that a paired RNA has sufficient features to recruit and orient reverse transcriptase.

Localization of reverse transcriptase should be straightforward; this would merely entail the translational fusion of an N-terminal transit peptide sequence to a reverse transcriptase transgene (Bruce 2000; Emanuelsson et al. 2018; Gavel and von Heijne 1990; Karlin‐Neumann and Tobin 1986; Chotewutmontri et al. 2017; Jarvis and Soll 2001). However, localization of a paired RNA is likely the riskier and more difficult component to achieve. A key insight is that a paired RNA could be localized by fusion with a plastid viroid sequence (Avsunviroidae) at the 5’ UTR (Gómez and Pallás 2010a). Importantly, the use of a particular viroid sequence may need to be paired with a specific plant species (Flores et al. 2000; Gago-Zachert 2016; Daròs 2016).

Replication of plastid viroids is thought to be species specific, and among the reasons for this may be that the ability to be localized to a plastid is also species specific. In this work, a truncated eggplant latent viroid (ELVd) was used to enable plastid localization of paired RNA. This sequence was chosen because among plastid viroids, there is evidence that full length or truncated ELVd can localize to the plastids of N. benthamiana as well as eggplant (Gómez and Pallás 2010a; Daròs 2016; Fadda et al. 2003; Gómez and Pallas 2012). Additionally, the use of ELVd as a transcriptional fusion mediating chloroplast localization of a transgene was demonstrated previously (Gómez and Pallás 2010a; Gómez and Pallas 2012). It was therefore reasonable to assume that this sequence could be repurposed for other transgenes.

Implementation of Nuclease Systems in Plastid Using Nuclear Transgene Expression

It is feasible to implement nuclease systems in plastids using nuclear transgene expression. As is the case for reverse transcriptase, translational fusion of an N-terminal transit peptide is both necessary and sufficient for localization of any nuclease protein to plastid stroma (Gavel and von Heijne 1990; Bruce 2000; Emanuelsson et al. 2018). Whether or not these proteins would act on plastid genomes compacted into nucleoids is a separate question.

Importantly, nuclear transgene expression could be used with CRISPR-Cas systems by achieving plastid localization of paired crRNA or sgRNA (Figure 4.7). As is the case for template RNA, this would probably be mediated by transcriptional fusion of a plastid viroid as a 5’UTR (Horvath et al. 2010; Gómez and Pallás 2010a; Gómez and Pallas 2012).


Figure 4.7 | Chloroplast genome engineering. A. Chloroplast genome transformation via de novo DNA synthesis. A reverse transcriptase is used to convert an RNA template provided by the PTEC transgene into single or double stranded DNA in vivo. The newly constructed DNA is then integrated into a chloroplast genome via homologous recombination. B. CRISPR/Cas9 modification of chloroplast genomes. Cas9/sgRNA ribonucleoproteins formed in the chloroplast selectively cleave circular chloroplast genomes at a desired target site, producing linearized genome fragments. C. Chloroplast genome editing via CRISPR/Cas9 and a synthetic NHEJ pathway. Linearized genome fragments produced by the action of CRISPR/Cas9 are ligated via the coordinated action of Ku/LigD, producing a range of mutation products (triangles). D. Enhanced chloroplast genome transformation via the use of CRISPR/Cas9 modification as an artificial genetic selection. CRISPR/Cas9 is used to selectively linearize chloroplast genomes that do not possess a chloroplast transformation event. These linearized genomes will not replicate, and in successive generations only transformed chloroplast genomes will survive.


Nuclease systems could be implemented as independent systems, or in tandem with a variety of other tools (Figure 4.7). As an independent system, nucleases could be implemented in inactive forms such as CRISPRi and CRISPR activator systems. These manifestations would be useful, for example, in the study of chloroplast genetics and regulation. Importantly, a key use of nuclease proteins could be to utilize them in tandem with other tools to accomplish new tasks. Any nuclease system could potentially be used for this: CRISPR-Cas systems, ZFNs, TALENs, Homing Endonuclease, or even Restriction Enzymes (Figure 4.6).

An intriguing use of nuclear expressed nuclease would be as a form of direct genetic selection in driving the process of homoplasmy (Figure 4.7). Functionally this concept is similar to the use of nuclease expressed in plastid as a gene drive system. This concept could be utilized in tandem with either traditional plastid genome modification in a dual transformation system, or plastid transformation via de novo DNA synthesis. The hypothesis behind this approach is that a targeted nuclease and plastid transgene can be deliberately paired to selectively generate double stranded breaks in non-transgenic plastid chromosomes, creating an artificial selective bias for transgenic plastid chromosomes. This process is explicitly intended to accelerate the rate at which plastid transgenes achieve homoplasmy, ideally eliminating the additional regenerations needed to achieve that state. Given that these additional regeneration steps can add 10 to 20 weeks to the process of generating stable transplastomic lines, this technology would provide an obvious benefit (Bock 2007b).
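The selective bias hypothesized above can be illustrated with a deliberately simple toy model. It is a sketch under stated assumptions and makes no quantitative prediction: in each round a fixed fraction (`cleavage`) of non-transgenic genome copies is cut, every cut copy is repaired by gene conversion, and the repair template is drawn in proportion to the intact copies present. Copy number dynamics, replication, and segregation are ignored, and all names and parameter values are hypothetical.

```python
def transgenic_fraction(f0: float, cleavage: float, rounds: int):
    """Toy recurrence for the transgenic fraction of plastid genome copies.

    f0       -- starting fraction of transgenic copies (0 < f0 <= 1)
    cleavage -- fraction of non-transgenic copies cut per round (0..1)
    rounds   -- number of cleavage and repair rounds to simulate
    """
    if not 0 < f0 <= 1 or not 0 <= cleavage <= 1 or rounds < 0:
        raise ValueError("expected 0 < f0 <= 1, 0 <= cleavage <= 1, rounds >= 0")
    f = f0
    history = [f]
    for _ in range(rounds):
        wild = 1 - f
        cut = cleavage * wild          # non-transgenic copies that are cleaved
        intact = f + wild - cut        # copies available as repair templates
        # Cut copies converted to transgenic in proportion to intact templates.
        f = min(1.0, f + cut * (f / intact))
        history.append(f)
    return history

if __name__ == "__main__":
    for label, c in (("no nuclease", 0.0), ("partial cleavage", 0.5)):
        print(label, [round(x, 3) for x in transgenic_fraction(0.1, c, 5)])
```

Under these assumptions the transgenic fraction never decreases and rises faster with higher cleavage, which is the qualitative behavior the hypothesis predicts; whether plastids behave this way in vivo remains to be tested.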

There may also be a potential to use nuclear expressed nuclease to induce point mutations without the use of a plastid transgene (Figure 4.7). Because the plastid does not possess a pathway of NHEJ DNA repair, there is not an efficient means of generating indel or point mutations using nuclease in the way that it is implemented in eukaryotic nuclear systems. However, it may be possible to create a synthetic condensed NHEJ pathway in plastids. Mycobacterium Ku and LigD proteins can generate a viable NHEJ pathway in E. coli, which similarly to plastids does not possess innate NHEJ (Chayot et al. 2010; Stephanou et al. 2007; Shuman and Glickman 2007; Della et al. 2004; Gong et al. 2005; Pitcher et al. 2007). This set of two transgenes could similarly be implemented in plastids using transit peptides (Karlin‐Neumann and Tobin 1986; Bruce 2000; Gavel and von Heijne 1990).

There are several biological and practical considerations that must be addressed to implement this system, the key concerns being the possible outcomes of double stranded DNA breakage. Gene conversion of mutants to wild type sequence may occur because of the efficiency of RecA. However, if target sites of a nuclease remain intact, they will continue to be cleaved. The model outcome proposed is that combination of a nuclease and synthetic NHEJ system will tend to create a heteroplasmic variety of mutations which are more stable than wild type genomes. This is because mutation of a nuclease target site would relieve a chromosome from the pressure of double stranded breakage, and the only viable gene conversions would be between other mutagenized chromosomes. A central problem posed by this model would thereby be the resolution of a heteroplasmic population of varied mutations to a single homoplasmic mutation that is the result of gene conversion, which would likely have to be achieved by multiple rounds of tissue regeneration. An additional problem with this system is that unlike mutations introduced via transgene integration, the deliberate formation of indel and base conversion would be difficult to control. However, the formation of multiple mutations is not necessarily a drawback if the targeted gene is essential, as these are only viable in a heteroplasmic state in which some viable gene copies are available. This state is achievable, as synonymous mutations are among the possible outcomes. Finally, this system may be useful in species for which plastid transformation is intractable, but nuclear transgene expression is achievable.


Materials and Methods

Plant materials

A. thaliana and N. benthamiana plants grown in soil (Sungro 2p, supplemented with Osmocote 14-14-14 [Scotts Miracle-Gro Company, Marysville, OH]) or on agar plates were kept at 25 ℃ with a 16:8 light:dark period in an environmental growth chamber. A. thaliana Col-0 seed was vernalized at 4 ℃ for 48 hours after planting on sterile agar plates or on the surface of soil.

Agroinfiltration

N. benthamiana plants were infiltrated as previously described (Kapila et al. 1997) and maintained in environmental growth chambers (25°C, 16 h of light). Expression was analyzed at 72 hours after agroinfiltration.

Plasmid construction

Transformations

E. coli strain DH5α was used to propagate recombinant DNA in this work (Lucigen Corporation, Middleton, Wisconsin). Chemically competent DH5α was used for transformation, and E. coli clones corresponding to all vectors were validated by sequencing (Genewiz, Morrisville, North Carolina; Eurofins Genomics, Louisville, Kentucky). The resulting binary vectors were used to transform A. tumefaciens GV3101 by electroporation (Gene Pulser Xcell™ Electroporation Systems; BIO-RAD Laboratories, Inc., Hercules, CA).

Reverse Transcriptase & PTEC constructs

PTEC and M-MLV reverse transcriptase constructs (Appendix B) embedded within pUC57 were obtained by gene synthesis (Genscript, Piscataway, NJ). These constructs were cloned into PCGW series binary vectors (Appendix B) using LR clonase (Appendix B), except for PTEC 5, which was cloned into PCGW-KAN using meganuclease digestion and ligation (I-CeuI and PI-SceI from New England Biolabs Inc. [NEB], Ipswich, Massachusetts; T4 DNA ligase from NEB, Ipswich, MA). A synthetic gene construct for the expression of Nucleocapsid 10 translationally fused to the transit peptide of A. thaliana rbcS and driven by a nopaline synthase promoter, embedded within pUC57, was obtained by gene synthesis (Appendix B) (Genscript, Piscataway, NJ). The construct was then removed from pUC57 and cloned into a previously made PCGW-HYG bearing M-MLV by restriction digestion and ligation (I-CeuI from New England Biolabs Inc. [NEB], Ipswich, Massachusetts; T4 DNA ligase from NEB, Ipswich, MA).

Cas9 & sgRNA constructs

A construct corresponding to the translational fusion of the transit peptide of A. thaliana rbcS (Appendix B) with Cas9 codon optimized for Homo sapiens (Appendix B) was made via the ligation of a gBlocks® Gene Fragment (Integrated DNA Technologies [IDT], Coralville, Iowa) (Appendix B) and copies of Cas9 generated by PCR (Appendix C; SacI and SphI from NEB, Ipswich, MA). This fragment was subcloned into pDONR221 using BP clonase (Thermo-Fisher Scientific, Waltham, MA) (Appendix C). The resulting subclone was amplified by PCR using M13 primers and cloned into binary vectors using LR clonase (Thermo-Fisher Scientific, Waltham, MA) (Appendix B, Appendix C).

Constructs corresponding to the transcriptional fusion of sgRNA with ELVd at the 5’ or 3’ end, driven by a nopaline synthase promoter/terminator (Appendix B), were synthesized as gBlocks® Gene Fragments (Integrated DNA Technologies [IDT], Coralville, Iowa). These fragments were cloned into pUC19 (Appendix B) (HindIII and BamHI from New England Biolabs Inc. [NEB], Ipswich, Massachusetts; T4 DNA ligase from NEB, Ipswich, MA), and the resulting plasmids were linearized by removing the protospacer region using BpiI (BbsI) (Thermo-Fisher Scientific, Waltham, MA). Complementary primers corresponding to a desired targeting sequence were annealed and cloned into the protospacer region by ligation (Appendix C). Assembled ELVd-sgRNA gene cassettes were then amplified by PCR and cloned into PCGW-KAN, or into a previously made PCGW-KAN bearing cpCas9, using meganuclease digestion and ligation (I-CeuI and PI-SceI from New England Biolabs Inc. [NEB], Ipswich, Massachusetts; T4 DNA ligase from NEB, Ipswich, MA).

Fluorescence constructs

The design of fluorescence constructs was based on previously published materials (Gómez and Pallás 2010b, 2010a; Gómez and Pallás 2012). PCGW:eGFP (Dalal et al. 2015) was used as a reference for cytosol localized eGFP. Constructs corresponding to the transcriptional fusion of eGFP with ELVd (Appendix B) and the transcriptional fusion of eGFP with eIF4E were synthesized as gBlocks® Gene Fragments (Integrated DNA Technologies [IDT], Coralville, Iowa). These fragments were cloned into two PCGW series binary vectors using forced cloning (Appendix B) (XbaI and BamHI from New England Biolabs Inc. [NEB], Ipswich, Massachusetts; T4 DNA ligase from NEB, Ipswich, MA).

A construct corresponding to the translational fusion of the transit peptide of A. thaliana rbcS with eGFP (Appendix B) was made via the ligation of a gBlocks® Gene Fragment (Integrated DNA Technologies [IDT], Coralville, Iowa) (Appendix B) and copies of eGFP generated by PCR (Primers Appendix B, SacI and SphI from NEB, Ipswich, MA). This fragment was subcloned into pDONR221 using BP clonase (Thermo-Fisher Scientific, Waltham, MA) (Appendix C). The resulting subclone was amplified by PCR and cloned into binary vectors using LR clonase (Thermo-Fisher Scientific, Waltham, MA) (Appendix C, Appendix B).

Microscopy

N. benthamiana leaf tissue was cut from control and agroinfiltrated transgenic plants and placed on glass slides with the abaxial surface facing up. Whole leaf tissue was used for confocal microscopy images, whereas abaxial leaf peels were used for fluorescent microscopy. Fluorescent images were captured by Qcapture software 1394 using a Nikon Eclipse E800 fluorescent microscope and illumination provided by a mercury-vapor lamp. Green fluorescence was revealed using a GFP long pass filter. Red fluorescence was revealed using a D/F/T multi band filter.

ImageJ was used to edit images.

For confocal microscopy of N. benthamiana, slides containing plant tissue were inverted and placed on the platform of a confocal microscope with Airyscan (LSM 710, ZEISS, Oberkochen, Germany). The laser excitation wavelength was set to 488 nm, and the emission window for the lambda scan ran from 503 nm to 619 nm with a 9-nm gap between scans.

Transgenic plants selection

Arabidopsis thaliana Col-0 plants were transformed via floral dip, and the resulting seed was stored at 4 °C. Wild type Col-0 was used as a negative control for fluorescence and antibiotic selection. Transgenic seed expressing mCherry was identified by fluorescent microscopy, using illumination with a mercury halide lamp and a D/F/T filter. Transgenic seed expressing phosphinothricin acetyltransferase (BAR) was screened on MS plates containing 20 μg/mL glufosinate-ammonium (Sigma-Aldrich Corporation, Raleigh, NC). Phosphinothricin resistant seedlings were identified by survival on agar and the presence of true leaves. Screened fluorescent seed was planted directly in soil, whereas phosphinothricin resistant seedlings were transferred to soil after the emergence of true leaves.

Genotyping

Transgenic lines were identified using PCR. Genomic DNA was extracted from 100 mg of rosette leaf tissue from A. thaliana T1 plants using cetyltrimethylammonium bromide (CTAB) extraction buffer (Appendix F) as described previously. Transgene specific primers were used to amplify the desired sequence, which was evaluated by agarose gel electrophoresis (Appendix C) and DNA sequencing. Col-0 genomic DNA was used as a negative control for detection. The binary vector used for transformation of Agrobacterium was used as a positive control for detection.


RNA expression, RT-PCR

Transgenic lines carrying the correct transgene DNA sequence were further screened for the expression of RNA. RNA was extracted from 100 mg of rosette leaf tissue using TRIZOL reagent (Sigma-Aldrich Corporation, Raleigh, NC). The resulting RNA was treated with DNase I (TURBO DNA-free™ Kit, TURBO™ DNase Treatment and Removal Reagents; Invitrogen, Carlsbad, CA) and converted to cDNA with random primers (RNA to cDNA EcoDry™ Premix (Random Hexamers); Clontech Laboratories Inc., Mountain View, CA). Transgene specific primers were used to amplify the desired sequence, which was evaluated by agarose gel electrophoresis (Appendix C). Col-0 genomic DNA and DNase treated RNA corresponding to each transgenic line were used as negative controls for detection. The binary vector used for transformation of Agrobacterium was used as a positive control for detection.

qRT-PCR was used to determine the expression levels of reverse transcriptase in transgenic lines. Quantification was performed with gene specific primers, and actin was used as a control for expression level. Transcript level was determined by fluorescence using SYBR green dye and Taq polymerase (Invitrogen, Carlsbad, CA). Three independent biological replicates were used per sample.
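As an illustration of how a transcript level can be expressed relative to an actin reference, the following is a minimal sketch of the widely used 2^-ΔΔCt calculation. The text does not state which quantification model was applied, so the method, the function name, and all Ct values below are assumptions chosen for illustration only, and the calculation assumes roughly 100% primer efficiency.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^-ddCt method (assumes ~100% primer efficiency)."""
    delta_ct_sample = ct_target - ct_reference              # normalize sample to actin
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # normalize calibrator to actin
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)

# Hypothetical Ct values for three biological replicates of one transgenic line.
rt_ct = [22.0, 22.5, 21.5]      # reverse transcriptase amplicon
actin_ct = [18.0, 18.5, 17.5]   # actin reference amplicon
# Hypothetical calibrator sample (for example, a low-expressing line).
cal_rt_ct, cal_actin_ct = 30.0, 18.0

fold_changes = [
    relative_expression(t, a, cal_rt_ct, cal_actin_ct)
    for t, a in zip(rt_ct, actin_ct)
]
mean_fold_change = sum(fold_changes) / len(fold_changes)
print(fold_changes, mean_fold_change)
```

Normalizing each replicate to its own actin Ct before averaging keeps differences in RNA input from being mistaken for differences in transgene expression.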

Protein Extraction, SDS-PAGE, and Western blotting

Transgenic lines carrying the transgene DNA sequence and expressing transgene RNA for Cas9 and M-MLV reverse transcriptase were screened for the expression of recombinant protein. Protein samples were prepared from 100 mg of either A. thaliana rosette leaf tissue or N. benthamiana leaf tissue that had been frozen in liquid N2. Frozen samples were ground into a fine white powder using ceramic beads and a bead mill (MM400, Retsch, Haan, Germany) (30 Hz, 45 s intervals, 4 runs). Subsequent protocols for extraction, SDS-PAGE, and Western blotting were tailored for detection of either Reverse Transcriptase or Cas9.

Reverse Transcriptase

Frozen powder samples were then mixed with 200 microliters of Phosphate Buffered Saline (PBS) (Appendix D) and ground using a bead mill (MM400, Retsch, Haan, Germany) (30 Hz, 45 s intervals, 1 run) at 4 °C. The resulting homogenate was centrifuged at 4 °C for 10 minutes at 10,000 x g. 150 microliters of supernatant was transferred to a new tube, and samples were stored at -20 °C. Prior to SDS-PAGE, protein samples were diluted in Laemmli buffer at a sample:buffer ratio of 4:1 and heated to 80 °C for 15 minutes. The resulting denatured samples were clarified by centrifugation at room temperature at 10,000 x g for 30 seconds. 40 microliters of the cleared protein samples were loaded onto NuPAGE™ 4-12% Bis-Tris Protein Gels (1.5 mm, 10 well, Invitrogen, Carlsbad, CA). SDS-PAGE was carried out in MES-SDS buffer (Appendix D) at 200 V for 30 minutes. Protein gels were then either stained with Coomassie R-250 for analysis, or transferred onto PVDF membranes using electrode cassettes submerged in Tris Glycine buffer (Appendix D), run at 50 V at 4 °C overnight.

PVDF blots were stained with Ponceau Red for analysis, and blocked with TBS-T containing 5% BSA for one hour. Blots were probed with an anti-FLAG tag monoclonal antibody (THE™ DYKDDDDK Antibody, Genscript, Grand Island, NY) at a dilution of 1:5,000 (v/v) in TBS-T containing 5% BSA at room temperature overnight. Blots were then incubated with Goat anti-Mouse IgG (H+L) Secondary Antibody, HRP (Invitrogen, Carlsbad, CA) at a dilution of 1:20,000 (v/v) in TBS-T containing 5% BSA at room temperature for at least one hour. Blots were incubated with either SuperSignal™ West Pico PLUS Chemiluminescent Substrate (Invitrogen, Carlsbad, CA) or SuperSignal™ West Femto Chemiluminescent Substrate (Invitrogen, Carlsbad, CA), and luminescence was detected using CL-XPosure™ Film (Thermo-Fisher Scientific, Waltham, MA) developed with an SRX-101A Tabletop Processor (Konica Minolta Healthcare Americas, Inc., Wayne, NJ).
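The dilutions quoted in this protocol (a 4:1 sample:buffer mix and 1:5,000 or 1:20,000 antibody dilutions) reduce to simple volume arithmetic. The sketch below is illustrative only: the helper names and the working volumes (50 microliters of sample mix, 10 mL of antibody solution) are assumptions, and a 1:N (v/v) dilution is interpreted here as one part stock in N parts total.

```python
def dilution_volumes(final_volume_ul, dilution_factor):
    """Return (stock, diluent) volumes for a 1:dilution_factor (v/v) dilution.

    Assumes 1:N means one part stock in N parts total volume.
    """
    stock = final_volume_ul / dilution_factor
    return stock, final_volume_ul - stock

def sample_buffer_mix(total_ul, sample_parts=4, buffer_parts=1):
    """Split a total volume according to a sample:buffer ratio (default 4:1)."""
    part = total_ul / (sample_parts + buffer_parts)
    return sample_parts * part, buffer_parts * part

# Hypothetical working volumes, chosen only to illustrate the arithmetic.
primary = dilution_volumes(10_000, 5_000)     # 10 mL of 1:5,000 primary antibody
secondary = dilution_volumes(10_000, 20_000)  # 10 mL of 1:20,000 secondary antibody
mix = sample_buffer_mix(50)                   # 50 microliters at 4:1 sample:buffer

print(primary, secondary, mix)
```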

Cas9

Frozen powder samples were then mixed with 200 microliters of Laemmli Buffer (Appendix D) and ground using a bead beater (MM400, Retsch, Haan, Germany) (30 Hz, 45 s intervals, 1 run) at 4 °C. The resulting homogenate was centrifuged at 4 °C for 10 minutes at 10,000 x g. 150 microliters of supernatant was transferred to a new tube, and samples were stored at -20 °C. Prior to SDS-PAGE, protein samples were heated to 80 °C for 15 minutes, and denatured samples were clarified by centrifugation at room temperature at 10,000 x g for 30 seconds. 40 microliters of the cleared protein samples were loaded onto Tris acetate gels (1.5 mm, 10 well, Invitrogen, Carlsbad, CA). SDS-PAGE was carried out in Tris-acetate-SDS buffer (Appendix D) at 200 V for 60 minutes. Protein gels were then either stained with Coomassie R-250 for analysis, or transferred onto PVDF membranes using electrode cassettes submerged in Tris-Glycine-SDS buffer (Appendix D), run at 100 V at 4 °C overnight.

PVDF blots were stained with Ponceau Red for analysis, and blocked with TBS-T containing 3% BSA for one hour. Blots were probed with an anti-Cas9 monoclonal antibody (Invitrogen, Carlsbad, CA) at a dilution of 1:1,000 (v/v) in TBS-T containing 2% BSA at room temperature overnight. Blots were then incubated with Goat anti-Mouse IgG (H+L) Secondary Antibody, HRP (Invitrogen, Carlsbad, CA) at a dilution of 1:5,000 (v/v) in TBS-T containing 3% BSA at room temperature for at least one hour. Blots were incubated with SuperSignal™ West Pico PLUS Chemiluminescent Substrate (Invitrogen, Carlsbad, CA), and luminescence was detected using CL-XPosure™ Film (Thermo-Fisher Scientific, Waltham, MA) developed with an SRX-101A Tabletop Processor (Konica Minolta Healthcare Americas, Inc., Wayne, NJ).

Immunoprecipitation

For the immunoprecipitation (Co-IP) assay, a total of 10 g of rosette leaf material from T3 plants expressing recombinant M-MLV RT was used. Leaf material was ground into a fine white powder using liquid N2 and a mortar and pestle. Crude extracts were obtained by homogenizing leaf powder in 50 ml of Tris Buffered Saline (TBS) supplemented with protease inhibitor cocktail (Appendix D). Extracts were clarified by centrifugation at 10,000 x g for 10 minutes at 4 °C. Immunoprecipitation was performed with 100 microliters of Anti-FLAG M2 magnetic beads using the manufacturer's protocol with modifications. Both the TBS used to equilibrate the magnetic resin and the clarified protein extracts were supplemented with 1% BSA. All solutions were supplemented with protease inhibitor cocktail (Sigma-Aldrich Corporation, Raleigh, NC). Elution was optimal using a FLAG peptide (Sigma-Aldrich Corporation, Raleigh, NC) diluted to 100 micrograms/ml in TBS containing protease inhibitor cocktail.


RT activity Assays

Reverse Transcriptase activity was quantified using a PicoGreen based assay that uses fluorescence as a reporter for the formation of RNA-DNA duplexes produced during first strand synthesis (EnzChek Reverse Transcriptase Assay Kit, Thermo-Fisher Scientific, Waltham, MA). Samples were supplemented with RNase inhibitor.

Identification of transgenic chloroplasts in lines carrying PTEC and RT constructs

Transgenic Arabidopsis thaliana expressing both PTEC RNA and M-MLV reverse transcriptase was created by re-transformation of the T3 homozygous PCGWBARRT line 18 via floral dip, and the resulting seed was stored at 4 °C. Seed was screened based on mCherry fluorescence and glufosinate selection. Fluorescent and resistant seedlings were transferred to soil after the emergence of true leaves. Lines carrying both PTEC and M-MLV transgenes were identified by PCR of genomic DNA. The resulting double transgenic lines were screened for the existence of integrated chloroplast transgenes. Integrated chloroplast transgenes were screened by PCR of genomic DNA, with primers annealing to both chloroplast specific and transgene specific regions.
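The logic of a junction PCR screen can be sketched in silico: if one primer anneals only to chloroplast sequence and the other only to the transgene, a product is expected only from a template in which the two are joined. The example below is a simplified illustration of that idea, not the primers or sequences used in this work. All sequences are short, made-up placeholders, the primer pairing is an assumed reading of the screen described above, and exact string matching stands in for primer annealing.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon_length(template, forward_primer, reverse_primer):
    """Product length if both primers anneal in convergent orientation, else None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(site) - start

# Hypothetical toy sequences (not real chloroplast or transgene sequence).
cp_flank_left = "GATTACAGGCTTAACCGTAT"
transgene = "CCATGGTGAGCAAGGGCGAG"
cp_flank_right = "TTGACCGTAAGCTTGGATCA"

wild_type = cp_flank_left + cp_flank_right
integrated = cp_flank_left + transgene + cp_flank_right

forward_primer = cp_flank_left[:10]                   # chloroplast specific
reverse_primer = reverse_complement(transgene[-10:])  # transgene specific

print(amplicon_length(integrated, forward_primer, reverse_primer))
print(amplicon_length(wild_type, forward_primer, reverse_primer))
```

In a real screen the chloroplast-specific primer would need to anneal outside any homology arm carried by the transgene construct, otherwise unintegrated template could also yield a product.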


Generation of Transgenic N. benthamiana

Leaf disc transformation was used to transform N. benthamiana using transgenic Agrobacterium tumefaciens GV3101. Transgenic callus was selected using kanamycin and hygromycin.

Results

Design and construction of vectors

Constructs used to express chloroplast localized reverse transcriptase and Cas9

Although the use of CRISPR-Cas systems in plants has been previously reported, implementation in plastids has not previously been achieved. Most available CRISPR-Cas systems are designed for nuclear genome editing, and hence feature N- and C-terminal nuclear localization sequences (Mali et al. 2013; Ran et al. 2013; Belhaj et al. 2015; Ma et al. 2015). To localize Cas9 to plastids, an expression cassette was designed that removes the nuclear localization sequences and provides an N-terminal translational fusion of the transit peptide of A. thaliana rbcS (Figure 4.8). Use of the rbcS transit peptide to localize recombinant proteins in plants has been described previously (Williamson et al. 1994; Miras et al. 2002), as well as within this study (Figure 4.8, Figure 4.9). Furthermore, a C-terminal 6x HIS tag was added to facilitate downstream identification and purification of Cas9 expressed in transgenic plants (Figure 4.8).


Reverse Transcriptase from M-MLV was chosen for this study because it is capable of synthesizing several kb of cDNA during first strand synthesis, and because its interactions with RNA, and the RNA sequences needed for double stranded DNA synthesis from RNA, are well known (Kotewicz et al. 1988; Gerard and Grandgenett 1975; Verma 1975; Shinnick et al. 1981; Mak and Kleiman 1997; Negroni and Buc 2001). A synthetic M-MLV reverse transcriptase gene was constructed to express chloroplast localized M-MLV RT. The protein coding sequence of M-MLV RT was codon optimized for A. thaliana nuclear expression, and features an N-terminal A. thaliana rbcS transit peptide and a C-terminal 3X FLAG affinity tag (Figure 4.8).

During the course of this work it was determined that, in addition to reverse transcriptase, a nucleocapsid peptide would be necessary to initiate reverse transcription using a plant chloroplast tRNA (Henderson et al. 1981; Cornille et al. 1990; Pager et al. 1994; Davis and Rueckert 1972; Darlix et al. 2014). Nucleocapsid 10 (NCP10) was chosen because of its role in relaxing the structure of primer tRNA, as well as in facilitating the folding of RNA and DNA templates during reverse transcription. A synthetic NCP10 gene featuring an N-terminal A. thaliana rbcS transit peptide was constructed to express chloroplast localized NCP10 (Appendix B).

PTEC Design and Construction

Plastid Transgene expression cassettes (PTECs) are designed to be chloroplast localized templates for reverse transcription, resulting in double stranded DNA synthesis. To localize these long RNAs to plastids, 5’ transcriptional fusions of ELVd are employed (Figure 4.8). Reverse transcription of the PTEC is facilitated by features mimicking elements of the M-MLV RNA virus required for recognition and replication by reverse transcriptase (Mak and Kleiman 1997; Shinnick et al. 1981; Negroni and Buc 2001). These features are termed viral recognition sequence I (VRS I) and viral recognition sequence II (VRS II) (Figure 4.8). The function of VRS I is primarily to recruit reverse transcriptase bound to a primer tRNA, and also to facilitate downstream stages of double stranded DNA synthesis during reverse transcription. VRS I recruits reverse transcriptase via a tRNA primer binding site (PBS) on the 3’ end. The R sequence and the 5’ UTR are necessary for later stages of reverse transcription and double stranded DNA synthesis. VRS II is required for double stranded DNA synthesis via the strong stop DNA transfer model. VRS II includes a polypurine tract (PPT), a 3’ viral UTR, and a second R sequence, as these are the minimal components needed for double stranded DNA synthesis. Based on the role of the R sequence as a homologous region used for annealing during multiple steps of strong stop DNA synthesis, VRS II also includes a copy of all sequence upstream of the first R sequence in VRS I. This 5’ copy sequence is placed between the R sequence and the 3’ viral UTR in VRS II (Figure 4.8).

Figure 4.8 | Design of genetic constructs used in chloroplast genome engineering. Abbreviations: NUCpro: A nuclear RNA pol II promoter, e.g. CaMV 35S. NUCterm: A nuclear RNA pol II terminator. CTP: A chloroplast transit peptide used to localize protein to chloroplast stroma. U6pro: A nuclear U6 RNA pol III promoter. TTTTTT: A string of thymidine residues used as a terminator for genes expressed by RNA pol III. ELVd: Truncated eggplant latent viroid sequence used to localize RNA to chloroplast stroma. VRS I: Viral recognition sequence I. VRS II: Viral recognition sequence II. LHA/RHA: Left and right homology arms, used to facilitate homologous recombination with a chloroplast genome. GOI: Gene of interest. A. Chloroplast localized Reverse Transcriptase. M-MLV reverse transcriptase is localized to chloroplast stroma via translational fusion of a chloroplast transit peptide. A 3X FLAG peptide is used as an affinity tag. B. Chloroplast localized Cas9. Cas9 is localized to chloroplast stroma via translational fusion of a chloroplast transit peptide. A 6X HIS peptide is used as an affinity tag. C. & D. Chloroplast localized sgRNA. ELVd is transcriptionally fused to sgRNA at the 5’ (C.) or 3’ (D.) end in order to localize sgRNA to chloroplast stroma. E. Contents of sgRNA, including a 15-25 bp protospacer target, an NGG protospacer adjacent motif, and tracrRNA. F. Design of PTEC constructs. ELVd is used to localize PTEC RNA to chloroplast stroma. VRS I and VRS II are viral RNA motifs used by reverse transcriptase to anneal to the RNA molecule and facilitate double stranded DNA synthesis. The plastid modification cassette is the portion of the PTEC that integrates with a chloroplast genome. G. Plastid modification cassette. This sequence features LHA and RHA sequences that are identical to chloroplast genome sequence and facilitate homologous recombination. Embedded between LHA and RHA is a gene designed to be expressed in a chloroplast. Black ovals: Ribosome binding sequence.

Additionally, a simpler PTEC design was constructed that lacks the VRS I and VRS II sequences (Appendix B). The purpose of this PTEC is to generate single stranded DNA rather than double stranded DNA. This is achieved by including only the primer binding site (PBS) sequence of VRS I on the 3’ end of the PTEC. The intention is to utilize only the RNA dependent DNA polymerase (first strand synthesis) and RNase H functions of reverse transcriptase. The product is a single stranded cDNA copy of the PTEC RNA, which should be amenable to homologous recombination via the RecA pathway.

The product of reverse transcription in chloroplasts is meant to be recombinant DNA that can be integrated into a chloroplast genome by homologous recombination via the RecA pathway. All PTECs feature a plastid modification cassette, based on the design of plastid transgene constructs used in traditional chloroplast transformation (Figure 4.8). The plastid modification cassette shares the features of traditional chloroplast transgene constructs, including several kb of sequence homologous to the chloroplast genome flanking a chloroplast transgene. The different PTEC constructs used in this study vary in the homologous sequence used. Homologous sequence was chosen based on the sequences used in prior work, and on the desire to have the transgene integrate at single or multiple locations (Yu et al. 2017; Sikdar et al. 1998; Khan and Maliga 1999; Ruf et al. 2001; Svab et al. 1990; Svab and Maliga 1993a). The chloroplast transgenes used in this study contain a chloroplast promoter and terminator, 5’ and 3’ UTR features needed to maintain stability and expression of a chloroplast mRNA, and a coding sequence for a spectinomycin resistance protein (aadA) (Figure 4.8).

sgRNA Design and Construction

To generate chloroplast localized sgRNA, transcriptional fusions of ELVd were employed. However, a central design consideration was the placement of ELVd. There are several reports that 5’ and 3’ attachments to sgRNA can affect the function of sgRNA/Cas9 RNP complexes (Gao and Zhao 2014; Haurwitz et al. 2010). Two designs for sgRNA were used, one with ELVd as a 5’ UTR, and one with ELVd as a 3’ UTR relative to the sgRNA cassette (Figure 4.8). Implementation of ELVd as a 3’ UTR has not been shown previously, but was justified because it was unknown whether a 5’ ELVd UTR would prevent the function of sgRNA. A second justification for this design choice is that RNA viroids are circular; we hypothesized that a 3’ ELVd UTR should be functionally equivalent to a 5’ ELVd UTR for the purpose of chloroplast localization of RNA.

In contrast to other designs for the expression of sgRNA, chloroplast localized sgRNAs are regulated by promoters and terminators specific for RNA pol II, rather than RNA pol III (Figure 4.8). The RNA pol III specific U6 promoter is typically employed in CRISPR-Cas systems because it allows for seamless expression of sgRNA, with only six 3’ thymidine residues accessory to the sgRNA sequence (Ma et al. 2014). Use of RNA pol III is also advantageous for nuclear genome editing because the synthesized RNA is not mRNA and does not possess the 5’ 7-methylguanylate cap needed for nuclear export, constraining these RNAs to the nucleus (Ma et al. 2014; Latchman and Latchman 2008; Turowski and Tollervey 2016; Willis 1994; Lee and Young 2000; Thomas and Chiang 2006; Hirose and Manley 2000; Hampsey 1998). An RNA pol II promoter and terminator from nopaline synthase were therefore chosen because they should allow nuclear export of sgRNA as mRNA, and because these regulatory sequences have been previously used to express nuclear transgenes (Figure 4.8) (Bevan 1984; Ebert et al. 1987; An et al. 1990; An 1986).

sgRNA protospacers were designed to target sequences specific to the chloroplast genome, specifically the 5’ coding sequences of psbA and rbcL. Because of sequence conservation, the same target sequences could be used in both A. thaliana and N. benthamiana (Appendix C).
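The search for protospacers usable in two species can be sketched as a scan for sequences immediately 5’ of an NGG protospacer adjacent motif, followed by an intersection of the candidates found in each species. This is a minimal illustration and not the procedure used to select the targets in Appendix C. The function names, the 20 nt protospacer length, and both input sequences are hypothetical placeholders rather than real psbA or rbcL sequence.

```python
import re

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_protospacers(seq, length=20):
    """Return (strand, start, protospacer, pam) for each site followed by an NGG PAM.

    For the minus strand, `start` is given relative to the reverse complement.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead reports overlapping candidate sites as separate hits.
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

def shared_protospacers(seq_a, seq_b, length=20):
    """Protospacers present (with a PAM) in both sequences."""
    a = {hit[2] for hit in find_protospacers(seq_a, length)}
    b = {hit[2] for hit in find_protospacers(seq_b, length)}
    return sorted(a & b)

# Hypothetical toy sequences standing in for homologous regions of two species.
species_a = "TTACGTTGCAATCGATTACGCATGGAT"
species_b = "CAACGTTGCAATCGATTACGCAAGGTC"

print(find_protospacers(species_a))
print(shared_protospacers(species_a, species_b))
```

Requiring the PAM in each species separately matters because a protospacer can be conserved while the adjacent PAM is not.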

Replication of the use of ELVd as a plastid localizing UTR

The use of an ELVd fragment as an artificial UTR to localize RNA to the chloroplasts of plant cells was replicated in N. benthamiana by using green fluorescence as a proxy for localization (Figure 4.9). Control fluorescence constructs that are known to express cytosol localized (PCGW-eGFP) and chloroplast localized (CTP-eGFP) eGFP signal were used to control for the existence and location of green fluorescence. Untransformed N. benthamiana was used as a negative control for green fluorescence. Transient expression of ELVd sequence transcriptionally fused to eGFP produced green fluorescence that colocalized with chlorophyll autofluorescence and was detectable using both fluorescent and confocal microscopy (Figure 4.9). However, localization of fluorescence in CTP-eGFP and ELVd-eGFP expressing tissue was often leaky, showing marginal signal in both cytosol and nucleus in addition to plastid colocalization. The use of eIF4E as an artificial UTR to localize RNA to the chloroplasts of plant cells was also replicated, and as was reported previously this construct produced no detectable fluorescence (Gómez and Pallás 2010a) (Figure 4.9). These results indicate that the use of ELVd as a plastid localizing UTR in N. benthamiana is reproducible, and might be extended to RNA sequences other than GFP.

Selection and Genotyping

Expression of transgenes was explored initially using transient transformation of infiltrated N. benthamiana to confirm the function of transgenes (Figure 4.12, Figure 4.13). mRNA corresponding to PTEC, sgRNA, Reverse transcriptase, and Cas9 transgenes was detectable at 72 hours post infiltration (Figure 4.12). Reverse transcriptase and Cas9 protein expression was also detectable in N. benthamiana (Figure 4.13). Based on these results, the binary vectors used in N. benthamiana were used to establish lines of A. thaliana by floral dip.


Figure 4.9 | Demonstration of ELVd 5’UTR use in N. benthamiana leaf tissue. eGFP: a construct expressing eGFP localized to cytosol (Appendix B), rbcSCTP-eGFP: a construct expressing eGFP localized to chloroplast via the use of a transit peptide (Appendix B), ELVd 5’UTR:eGFP: a construct expressing an RNA localized to chloroplast containing an out of frame eGFP coding sequence, eif4E 5’UTR:eGFP: a construct expressing an RNA localized to chloroplast containing an out of frame eGFP coding sequence A. Fluorescent microscopy of N. benthamiana abaxial leaf peels. Size bar indicates a length of 10 µm. Green fluorescence is colored cyan under a GFP filter, but is false colored green when merged with chloroplast red autofluorescence. Overlap between green and red fluorescence produces a yellow coloring. B. Confocal microscopy of N. benthamiana leaf. Size bar indicates a length of 10 µm. WT N. benthamiana is used as a negative control.

[Figure 4.9 image panels A. and B. Columns: GFP, Autofluorescence, Merged. Rows: eGFP, rbcSCTP-eGFP, ELVd 5’UTR:eGFP, eif4E 5’UTR:eGFP; panel B additionally includes Wild Type Nicotiana benthamiana.]


Figure 4.10 | Selection and Propagation of Transgenic Lines. A. Glufosinate ammonium (phosphinothricin, BASTA) selection of transgenic A. thaliana seed used to identify lines expressing reverse transcriptase. Resistant seedlings are identified by survival and production of true leaves. B. mCherry fluorescence screening of transgenic A. thaliana seed used to identify lines expressing PTEC and Cas9 constructs. Transgenic seeds can be segregated from non-transgenic seed by the level of red fluorescence emitted. C., D., E., F. Phenotypically indistinguishable lines of A. thaliana bearing Reverse Transcriptase (C.), single target PTEC (D.), double target PTEC (E.), and Cas9 (F.)


Constructs for expression of PTECs, Reverse Transcriptase, and Cas9 were transformed into A. thaliana by floral dip, and the resulting seed produced transgenic lines carrying appropriate functional selectable markers (Figure 4.10). None of the lines used in this study produced any obvious or identifiable unintended phenotypic traits, and selectable marker traits exhibited Mendelian segregation. One exception occurred: transgenic seed carrying PTECv4 had diminished viability in multiple lines and multiple generations (Appendix E). This was not true of PTECv5, and was not true of an Empty Vector Control. Furthermore, within the seed pool containing a mixture of transgenic and nontransgenic seed from PTECv4 T1 and T2 lines, nonfluorescent (nontransgenic) seed did not exhibit any reduction in viability.
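One common way to check whether marker counts are consistent with Mendelian segregation is a chi-square goodness-of-fit test. The sketch below is illustrative only: the text does not report seed counts or name a test, so the 3:1 expectation (a single dominant insertion locus in a selfed hemizygous parent), the counts, and the function name are all assumptions.

```python
def chi_square_segregation(observed_marker, observed_no_marker, expected_ratio=(3, 1)):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = observed_marker + observed_no_marker
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    observed = [observed_marker, observed_no_marker]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value of the chi-square distribution for 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_DF1_P05 = 3.841

# Hypothetical counts: 152 seeds with the marker trait, 48 without.
statistic = chi_square_segregation(152, 48)
consistent_with_3_to_1 = statistic < CRITICAL_VALUE_DF1_P05
print(round(statistic, 4), consistent_with_3_to_1)
```

A reduced-viability line such as the one described for PTECv4 would be expected to show a deficit in the marker class and therefore a larger statistic, although this sketch does not model that case.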

Putative transgenic lines identified by selection were further analyzed to establish that they were the correct genotype and that a desired RNA or protein was in fact expressed (Figure 4.11). In total, 27 lines of PCGW BAR RT, 6 lines of CGWmCherryPTECv4, 4 lines of CGWmCherryPTECv5, and 22 lines of PCGWmCherry Cas9 were identified by screening genomic DNA for a partial transgene with sequence specific primers and for full-length sequence with nonspecific primers (Figure 4.11) (Appendix C). Lines that carried full-length transgene sequence were further evaluated for the expression of transgene RNA (Figure 4.11). Of these, 26 lines of PCGW BAR RT, 3 lines of CGWmCherryPTECv4, 3 lines of CGWmCherryPTECv5, and 19 lines of PCGWmCherry Cas9 were identified by screening cDNA for a partial transgene with sequence specific primers (Figure 4.11) (Appendix C). Lines that expressed a desired RNA were further evaluated for the expression of a desired protein (Figure 4.13). In an initial round of screening, 5 lines were found to express detectable Cas9, and only one line expressed detectable M-MLV Reverse Transcriptase.
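The line counts reported above form a screening funnel (genomic DNA, then RNA, then protein), and the fraction of lines retained at each step can be tabulated directly. The sketch below uses only the counts stated in the text for the initial round of screening; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Counts taken from the text above; PTEC lines were not screened for protein.
screening = {
    "PCGW BAR RT": {"genomic DNA": 27, "RNA": 26, "protein": 1},
    "CGWmCherryPTECv4": {"genomic DNA": 6, "RNA": 3},
    "CGWmCherryPTECv5": {"genomic DNA": 4, "RNA": 3},
    "PCGWmCherry Cas9": {"genomic DNA": 22, "RNA": 19, "protein": 5},
}

def retention(counts):
    """Fraction of lines retained between each pair of consecutive screening stages."""
    stages = list(counts.items())
    return {
        f"{prev}->{nxt}": n_next / n_prev
        for (prev, n_prev), (nxt, n_next) in zip(stages, stages[1:])
    }

for construct, counts in screening.items():
    summary = ", ".join(f"{step}: {frac:.0%}" for step, frac in retention(counts).items())
    print(f"{construct}: {summary}")
```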


Figure 4.11 | Genotyping of transgenic A. thaliana lines. Abbreviations: L: Thermofisher 1 kb Plus DNA ladder, NT: No template negative control, WT: Col-0 background control, r: DNase treated RNA, c: cDNA. Transgenic lines of A. thaliana were confirmed by PCR showing the existence of DNA sequence and RNA expression corresponding to Reverse Transcriptase (A. & B.), PTECs (C. & D.), and Cas9 (E. & F.). A. & B. Transgenic lines bearing reverse transcriptase were identified using gene specific primers that bind within the reverse transcriptase coding sequence. RT: Reverse Transcriptase line 18, used as a positive control. C. & D. Transgenic lines bearing a PTEC transgene were identified using gene specific primers that bind within the aadA (spectinomycin resistance) coding sequence of the plastid modification cassette. Binary vectors containing PTECs were used as positive controls. E. & F. Transgenic lines bearing Cas9 were identified using gene specific primers that bind within the Cas9 coding sequence. The binary vector coding for chloroplast localized Cas9 was used as a positive control.

[Figure 4.11 image panels A-F: agarose gels of genomic DNA PCR and of DNase treated RNA (r) and cDNA (c) for putative T1 lines, with ladder (L; sizes in kb), NT, WT, RT, and plasmid control lanes.]

[Figure 4.12 image panels. A. PTEC: mRNA abundance for WT and aadA replicates 1-4. B. Reverse Transcriptase: mRNA abundance for WT and RT replicates 1-4. C. Gel with lanes L, NT, WT, EV, and ELVd 5’ UTR: sgRNA; ladder sizes in kb.]

Figure 4.12 | Transient RNA Expression. Figure 4.12 A & B are the work of Dr. Soundarya Srirangan. Abbreviations: L: Thermofisher 1 kb Plus DNA ladder, NT: No template negative control, WT: N. benthamiana background, EV: Empty Vector negative control. qRT-PCR (A. & B.) and RT-PCR of RNA derived from transgenic N. benthamiana expressing (A.) PTEC (B.) Reverse Transcriptase or (C.) ELVd 5’ UTR: sgRNA. Wild type (WT) N. benthamiana was used as a negative control for the expression of transgenes. Actin was used as a control to estimate mRNA abundance (A. & B.), against which other genes were measured. 4 biological replicates were used to compare PTEC and Reverse Transcriptase RNA expression (A. & B.), while 3 biological replicates were used to compare ELVd 5’ UTR: sgRNA expression.

Because M-MLV reverse transcriptase protein was detected in only one of the 26 lines that expressed the mRNA for this transgene, additional analyses were employed to examine variance in expression. qRT-PCR revealed that the expression level of transgene mRNA varied significantly among lines, and was several orders of magnitude higher in T2 lines that produced detectable protein (Figure 4.14). Interestingly, T1 plants that also produced detectable protein had mRNA expression orders of magnitude lower than the T2 lines. Analysis of protein expression was repeated using a more sensitive chemiluminescent substrate, which revealed that reverse transcriptase protein was produced in at least three additional lines (Figure 4.14).
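The normalization of transgene mRNA to actin described for these qRT-PCR measurements can be sketched as follows. This is a minimal illustration that assumes the common 2^(-ΔCt) calculation with equal amplification efficiency for both primer pairs. The exact formula used in this work is not stated here, the function name is hypothetical, and the Ct values are placeholders rather than measured data.

```python
def relative_abundance(ct_target: float, ct_reference: float) -> float:
    """Transgene mRNA abundance relative to a reference gene (here actin).

    Assumes roughly 100% amplification efficiency, so that each cycle of
    difference corresponds to a two-fold difference in starting template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for illustration only (not data from this study).
samples = {
    "wild type": {"target": 35.0, "actin": 20.0},
    "transgenic line": {"target": 18.0, "actin": 20.0},
}

for name, ct in samples.items():
    print(name, relative_abundance(ct["target"], ct["actin"]))
```

Under this assumption, a transgene that crosses threshold two cycles earlier than actin is reported as four-fold more abundant than actin, which is how differences of several orders of magnitude between lines can arise from Ct differences of roughly ten cycles or more.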

Reverse Transcriptase produced in planta is a functional enzyme

RNA-dependent DNA polymerase activity of recombinant M-MLV reverse transcriptase produced in A. thaliana and N. benthamiana was observed using a fluorescent reporter assay (Figure 4.16). Crude extracts of transgenic A. thaliana and N. benthamiana produced significantly more fluorescence than crude extracts of wild type plants. With recombinant M-MLV reverse transcriptase purified from A. thaliana by immunoprecipitation, fluorescence increased with the volume of eluate added, and this increase was not observed in samples lacking a viable template for reverse transcription (Figure 4.15, Figure 4.16). These results imply that recombinant M-MLV reverse transcriptase produced in plants retains bona fide RNA-dependent DNA polymerase (reverse transcriptase) activity.
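One way to summarize the volume dependence described above is to compare the slope of fluorescence against eluate volume with and without template. The sketch below uses an ordinary least-squares slope; the function name and all readings are hypothetical and are not data from this study.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical readings for illustration only (not data from this study).
volumes_ul = [0, 10, 20, 30, 40]
with_template = [0, 15, 30, 45, 60]   # fluorescence units
without_template = [0, 1, 0, 1, 0]    # no-template control

# Fluorescence gained per microliter of eluate in each condition.
print(least_squares_slope(volumes_ul, with_template))
print(least_squares_slope(volumes_ul, without_template))
```

A clearly positive slope only in the presence of template is the pattern that would be expected if the signal reflects reverse transcription rather than background fluorescence of the eluate.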

Combination of PTEC and RT in A. thaliana

One overexpressing line carrying PCGWBARRT was retransformed with two PTEC constructs (Appendix F). The resulting seed was screened, and multiple individuals were identified that carried both mCherry fluorescence and phosphinothricin resistance traits. These

[Figure 4.13 image: western blots. A: Reverse Transcriptase in A. thaliana (Col-0 control and transgenic lines) and N. benthamiana (control and transgenics); E: empty lane. B: Cas9 in transgenic N. benthamiana extracted by sonication or with detergent, with lanes L, -, and +; marker sizes 180 and 130 kDa. C: Cas9 in A. thaliana T1 lines, with lanes +, E, and -.]

Figure 4.13 | In planta protein expression of rbcSCTP-M-MLV Reverse Transcriptase and rbcSCTP-Cas9. E: Empty lane. A. Western blot showing expression of rbcSCTP-M-MLV Reverse Transcriptase in N. benthamiana and A. thaliana. B. Western blot showing expression of rbcSCTP-Cas9 extracted from N. benthamiana. Transgenic tissue was generally extracted with Laemmli buffer and ground using a bead beater, except for one sample that was instead sonicated (methods). -: N. benthamiana tissue, +: Commercial Cas9 protein diluted 1:100 in Milli-Q H2O (platinum cas9 thermo), L: Spectra high range protein ladder (thermo). C. Western blot showing expression of rbcSCTP-Cas9 extracted from transgenic lines of A. thaliana. Tissue was ground in phosphate buffered saline (methods). -: Col-0 tissue, +: Commercial Cas9 protein diluted 1:100 in Milli-Q H2O (platinum cas9 thermo).

individuals also carried both PTEC and reverse transcriptase transgenes (Appendix F). These lines were then screened by PCR to establish whether the combination of reverse transcriptase and a PTEC RNA in A. thaliana yields an integrated chloroplast transgene. An initial PCR screen identified an appropriately sized band in several lines (Figure 4.17). However, this band also appeared in both reverse transcriptase and PTEC background controls. Additionally, repeating this PCR with identical parameters, using either the same genomic DNA or DNA excised from the agarose gel of the initial screen, did not reproduce these bands (Appendix F).

A secondary PCR screen failed to amplify any product resembling an integration event

[Figure 4.14 image: A: bar chart of reverse transcriptase mRNA abundance (y-axis 0 to 300) for Col-0, Line 17 T1, Line 18 T1, Line 18 T2, Line 3 T1, Line 4 T1, and Line 5 T1. B: western blot for Reverse Transcriptase with lanes Col-0, E, 18, E, 3, 4, 5.]

Figure 4.14 | Enhanced detection of M-MLV reverse transcriptase in transgenic A. thaliana lines. A. Relative expression of reverse transcriptase was measured in transgenic lines of A. thaliana using qRT-PCR. Actin was used as a control to estimate mRNA abundance against which other genes were measured. Col-0 was used as a negative control for the expression of transgenes. 3 biological replicates were used to compare Reverse Transcriptase RNA expression. B. Western blot showing expression of rbcSCTP-M-MLV Reverse Transcriptase in lines of A. thaliana. Detection was achieved with a high sensitivity chemiluminescent substrate (methods). Col-0 was used as a negative control for reverse transcriptase expression. A 1:100 dilution of tissue from the line 18 T1 plant was used as a positive control. E: Empty Lane.

[Figure 4.15 image: A: Ponceau stain and B: western blot, each with lanes L, CL, FT, W1, W2, W3, MR, Gl, Mg, and 3X; marker sizes 93, 72, and 57 kDa.]

Figure 4.15 | Immunoprecipitation of M-MLV reverse transcriptase from A. thaliana rosette leaf tissue. Paired Ponceau stain (A.) and Western Blot (B.) of fractions recovered from immunoprecipitation using ANTI-FLAG M2 magnetic beads. Abbreviations: L: Flash protein ladder. CL: Crude soluble protein lysate in Tris buffered saline. FT: Flow through fraction of lysate after incubation with ANTI-FLAG M2 magnetic beads. W1, W2, W3: sequential bead wash fractions of Tris buffered saline. MR: washed ANTI-FLAG M2 magnetic beads prior to elution. Gl: elution from magnetic beads using 0.1 M glycine, pH 3.0. Mg: elution using 4 M MgCl2. 3X: elution using 3X FLAG peptide (Sigma F4799).

[Figure 4.16 image: A: Coomassie-stained gel with lanes L, +, and Elu; marker sizes 93, 72, and 57 kDa. B: fluorescence units by sample: Commercial RT, 90; wild type A. thaliana, 0; wild type N. benthamiana, 0; transgenic A. thaliana, 9; transgenic N. benthamiana, 10. C: plot of elution volume vs reverse transcriptase activity, with fluorescence units (0 to 80) against volume (0 to 40 microliters).]

Figure 4.16 | In vitro activity assays of M-MLV reverse transcriptase produced in planta. A. Coomassie stain comparing L: Flash protein ladder, +: ThermoFisher M-MLV produced in E. coli, and Elu: Protein recovered from immunoprecipitation of transgenic A. thaliana expressing rbcSCTP-M-MLV Reverse Transcriptase in the 3X FLAG peptide elution fraction (Figure 5.10). B. & C. RNA-dependent DNA polymerase (reverse transcription) activity of commercial reverse transcriptase (ThermoFisher M-MLV), crude extracts of N. benthamiana and A. thaliana leaf tissue, and protein recovered from immunoprecipitation (C.). Untransformed tissue of N. benthamiana and A. thaliana is used as a control against transgenic plants either transiently (N. benthamiana) or stably (A. thaliana) expressing rbcSCTP-M-MLV Reverse Transcriptase as determined by western blotting (Figure 5.8). Fluorescence resulting from PicoGreen binding of RNA/DNA heteroduplexes is used as a proxy for first strand synthesis (reverse transcription).

(Appendix F). Based on these findings it was concluded that none of the A. thaliana lines carrying both PTEC and RT transgenes possessed an integrated chloroplast transgene.
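The logic of this integration screen, in which a transgene-specific forward primer is paired with a reverse primer in the chloroplast genome, can be illustrated with a small in silico PCR. The sketch below is a toy model under stated assumptions: exact primer matching, a single linear template, and short made-up sequences. All sequences and names are hypothetical; a real integration event in this work would be expected to give a band of about 2.5 kb (Figure 4.17).

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str):
    """Expected product length, or None if either primer has no binding site.

    The forward primer must match the top strand, and the reverse primer
    must match the bottom strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical primers and templates for illustration only.
forward_primer = "ATGGCTAGCTAGGATCC"   # anneals within the transgene
reverse_primer = "TGACTGGTCAAGGTACC"   # anneals to the chloroplast genome
chloroplast_site = reverse_complement(reverse_primer)

integrated = "TTTT" + forward_primer + "ACGT" * 5 + chloroplast_site + "AAAA"
wild_type = "TTTT" + "ACGT" * 5 + chloroplast_site + "AAAA"

print(amplicon_length(integrated, forward_primer, reverse_primer))
print(amplicon_length(wild_type, forward_primer, reverse_primer))
```

In this model a product is only possible when both primer sites lie on the same molecule, which is why a band of the expected size in controls carrying only one component points to a nonspecific product rather than integration.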

N. benthamiana Callus

Transgenic N. benthamiana callus has been established using antibiotic selection (Figure 4.18). Lines of N. benthamiana transformed with either control vectors (PCGW-KAN) or vectors for expression of Reverse Transcriptase (PCGW-HYG-rbcSCTP-M-MLV Reverse Transcriptase), Cas9 (PCGW-HYG-rbcSCTP-Cas9), or Cas9 and a viroid-fused sgRNA (PCGW-HYG-rbcSCTP-Cas9 + ELVd 5’ UTR: psbA sgRNA) have developed mature, differentiated callus able to establish roots, but the presence and expression of the transgenes have not yet been confirmed.

[Figure 4.17 image: agarose gels with ladder sizes of 3 and 2 kb marked. A: lanes NT, WT, Reverse Transcriptase background, PTEC background, Reverse Transcriptase + PTEC hybrid lines, and L. B: two gels of Reverse Transcriptase + PTEC, each with lanes NT, WT, and hybrid lines.]

Figure 4.17 | Attempted identification of integrated chloroplast transgenes. NT: No template negative control, WT: col-0 background control, L: Thermofisher 1 kb Plus DNA ladder (cat: 10787018). A. & B. Identification of integrated chloroplast transgenes was attempted using PCR. The forward primer was designed to anneal to a transgene specific sequence embedded within the PTEC and the reverse primer was designed to anneal to the chloroplast genome. A positive event should yield a band of ~2.5 kb (highlighted in red). Control plants expressing only reverse transcriptase or the PTEC yielded this band (A) in addition to multiple lines expressing both components (A, B).

[Figure 4.18 image: photographs of callus, panels A to D.]

Figure 4.18 | Transgenic Nicotiana benthamiana callus. Six-week-old callus resulting from Agrobacterium transformation of Nicotiana benthamiana leaf tissue is shown above for vectors A. PCGW-Kan, B. PCGW-HYG-rbcSCTP-Cas9, C. PCGW-HYG-rbcSCTP-M-MLV Reverse Transcriptase, and D. PCGW-HYG-rbcSCTP-Cas9 + ELVd 5’ UTR: psbA sgRNA.

Discussion

Although modification of plastid genomes has not been demonstrated using the tools presented here, there is no concrete evidence that they should not or would not work in principle. Expression of plastid genome modification tools could be observed and yielded viable and fertile plants, and expression of a recombinant reverse transcriptase yielded a protein that was at least partially functional in vitro. There is therefore a continued need to determine which components of this work do or do not function as described, so that the problems encountered can be resolved through troubleshooting.

The use of ELVd as a 5’UTR conferring chloroplast localization and transgene expression was replicated in N. benthamiana.

The use of ELVd as a UTR was replicated here because, although there have been a few reports of its use, most of these publications were from the same set of authors (Gómez and Pallás 2012; Gómez and Pallás 2010a, 2010b). There is in fact a dearth of publications relating to the use of RNA as a UTR for transit and localization to the plastid, meaning that this technology was inherently risky. Given the critical role of the ELVd UTR in the project design, it seemed necessary to validate that its use was reproducible.

Therefore, the use of ELVd as a 5’UTR was reproduced using conditions and materials similar to those of the original study. Just as Gómez and Pallás originally found, chloroplast-localized green fluorescence was observed in samples expressing an ELVd fragment fused as a 5’ UTR to an out-of-frame eGFP protein coding sequence (Gómez and Pallás 2010a).

This effect was captured in two different experiments and with two different microscopes, which also confirmed the validity of positive fluorescent and negative non-fluorescent controls. An important caveat is that detection of green fluorescence in ELVd-eGFP expressing samples was rare compared to chloroplast transit peptide localized eGFP controls. There is therefore a lingering need to establish the exact conditions under which ELVd transcriptional fusions are a reproducible and efficient tool. However, based on the replication here and elsewhere, the use of ELVd to localize RNA to chloroplasts appears to be an authentic technology (Ahmad 2016; Baek et al. 2017).

However, ELVd UTRs also appear to have limitations. Expression of GFP in chloroplasts could not be observed in transgenic A. thaliana. This observation is consistent with the biology of chloroplast viroids in the family Avsunviroidae. Avsunviroidae have a demonstrably limited host range, and are believed to replicate only in hosts related to the species in which they were discovered (Fadda et al. 2003; Daròs 2016; Flores et al. 2000). ELVd in particular appears to be able to replicate only in eggplant, to the exclusion of other members of Solanaceae (Fadda et al. 2003). It is therefore reasonable that ELVd would not function to localize RNA in A. thaliana, which is more distantly related to eggplant than N. benthamiana is.

The underlying reasons for the species specificity of replication and chloroplast localization are unknown, but probing the diversity of the known avsunviroids may offer clues to mechanisms of RNA localization and translocation into the chloroplast. It may be possible to use these viroids to identify proteins similar to the TIC/TOC complex that facilitate recognition and translocation of RNA, rather than protein (Jarvis and Soll 2001; Chotewutmontri et al. 2017). Knowledge about this complex and the features necessary for recognition and translocation could then be applied to other organelles and species, possibly via a transgenic approach. If existing RNA localization and transit features are insufficient in a species like A. thaliana, it may be necessary to introduce them.

PTEC expression may reduce seed viability

Expression of PTEC, Cas9, and Reverse Transcriptase is observable in viable lines of A. thaliana, a requirement for the creation of lines carrying multiple traits and for the functioning of chloroplast transformation. However, PTECv4 consistently produced seed with reduced viability. This contrasts with PTECv5, which does not produce seed with any significant reduction in viability compared to controls. There were no observable differences in mature plants, and transgenic plants known to express the PTECv4 RNA were fertile. Given that the reduction in seed viability is confined to the transgenic seed pool, it is highly likely that expression of PTECv4 has a negative effect on seed health.

The reason for the toxicity of this construct remains cryptic. Homology arm sequence is the only salient difference between PTECv4 and PTECv5. The homology arms used in PTECv4 cover the entire regulatory and coding regions of rbcL and accD, whereas the homology arms used in PTECv5 cover a noncoding region within the two inverted repeat regions of the chloroplast genome. It is unlikely that the RNA can be localized to the chloroplast, because the ELVd 5’UTR probably does not function in A. thaliana. Therefore, there are potentially several reasons why expression of PTECv4 may reduce seed viability, all of which remain speculative. Whether or not this is also the case in N. benthamiana will be established for comparison.

Reverse Transcriptase is functional in plants

Although several dozen transgenic A. thaliana lines carried the reverse transcriptase transgene, only one was found to overexpress the protein. At least three additional lines were found to express the protein, but this was only detectable after increasing the sensitivity of western blotting by an order of magnitude. This contrasts with Cas9, for which at least five overexpressing lines were found. Importantly, the level of RT protein may not be tied to the amount of transcript present in tissues. There was substantial variation in the level of reverse transcriptase transcript, and this seemed to have little bearing on the amount of protein observed on a western blot.

Reverse transcriptase produced in planta possesses RNA-dependent DNA polymerase (reverse transcriptase) activity in vitro. Although implied, it is unknown whether this activity is present in vivo, as no evidence has yet shown synthesis of DNA from RNA in the chloroplast of a living plant cell. This is complicated by the requirement that M-MLV use a tRNA primer in vivo, and by the need for additional activities of M-MLV that have not been probed. In addition to reverse transcriptase activity, M-MLV should also possess RNase H and DNA polymerase functions. These functions are absolutely required for synthesis of double stranded DNA from PTEC RNA.

Furthermore, although M-MLV is fused to an N-terminal chloroplast transit peptide, the localization of this protein within the plant cell is assumed but not certain. There is therefore a need to establish that M-MLV is indeed localized to the plastid and has RNase H and DNA polymerase functions.

RT + PTEC technology does not work in A. thaliana

The combination of PTEC RNA and reverse transcriptase expression within lines of A. thaliana does not appear to produce an integrated chloroplast transgene. Three attempts to retransform T3 lines expressing reverse transcriptase with PTECv4 and PTECv5 yielded mostly negative outcomes, and the few potential transformants were false positives. This outcome is not entirely unexpected, as it is unlikely that the ELVd UTR functions in A. thaliana to localize PTEC RNA to the chloroplast. Use of an ELVd 5’UTR does not appear to produce chloroplast-localized GFP signal in A. thaliana, although it is unclear whether this is because the synthetic mRNA is not translocated or merely not translated. As the replication of Avsunviroidae is somewhat species specific, it is likely that ELVd specifically is an unsuitable UTR to localize RNA to the chloroplast in A. thaliana. A potential alternative would be the use of eIF4E as a UTR to localize RNA in this system.

Additional explanations for the lack of chloroplast transgene integration include the need for an additional protein involved in tRNA primer annealing to the PTEC sequence, and potentially inhibited M-MLV RT activity due to the temperature at which plants grow. The need for a nucleocapsid protein to relax the secondary structure of a chloroplast tRNA so that it can anneal to a primer binding sequence was overlooked until it was apparent that no chloroplast transgenes were present in lines of A. thaliana. The simple remedy was to include, in conjunction with M-MLV RT, a chloroplast-localized nucleocapsid protein known to be involved in annealing tRNA in vivo. Therefore, the construct used to express chloroplast-localized M-MLV was modified to also express recombinant NCP10 fused to an N-terminal chloroplast transit peptide.

Optimal activity of M-MLV RT is between 37 and 42 °C when used in vitro. Plants, however, are generally kept between 16 and 25 °C, so a temperature regime may be necessary to activate reverse transcription in the chloroplast. It may be the case that for reverse transcription to proceed, the plant must be held at 37-42 °C for at least some period. Otherwise, a reverse transcriptase with optimal activity in the range of 16-25 °C is likely necessary.

Redesign of the system and implementation in N. benthamiana

Combining Reverse Transcriptase and PTEC components in A. thaliana did not result in transplastomic plants. However, there is no evidence to suggest that the mechanism of de novo DNA synthesis and integration in plants is itself nonfunctional. Many questions remain about why no transplastomic plants were observed. Therefore, additional experiments are being pursued to evaluate the components proposed here to modify chloroplast genomes. The aim of these experiments is to establish the function or nonfunction of individual components, so that improvements can be made to their design.

This work is now being implemented in N. benthamiana, in which ELVd has been established as a viable synthetic 5’UTR. Furthermore, the integration of chloroplast transgenes and their selection in tissue culture are well established. For these reasons, N. benthamiana is an inherently less risky plant system in which to establish chloroplast modification technologies.

After review of the implementation of reverse transcriptase and PTEC systems in A. thaliana, revised versions of these systems are being employed. The addition of a nucleocapsid protein is probably necessary for reverse transcriptase to anneal tRNAPro as an RNA primer in the chloroplast of a plant. Therefore, new vectors need to be made that express both M-MLV Reverse Transcriptase and M-MLV NCP10 localized to the plastid.

Additionally, a new version of the PTEC will be evaluated that reduces the complexity of this component and is intended to generate single stranded DNA rather than double stranded DNA. The design of this PTEC is similar to other PTEC constructs, except that VRS sequences have been removed. To facilitate reverse transcription, a single primer binding site is included at the 3’ end. Single stranded DNA of up to 6 kb can be synthesized by M-MLV reverse transcriptase (Kotewicz et al. 1988). The product of reverse transcription with this design should be an RNA-DNA heteroduplex, the RNA portion of which can be acted on by the RNase H function of M-MLV to release single stranded DNA (Gerard and Grandgenett 1975; Verma 1975; Molling et al. 1971).

Conclusion

A new mechanism is proposed here to generate transplastomic plants; however, challenges in its implementation remain. Expression of reverse transcriptase, Cas9, and PTEC components resulted in viable organisms. Recombinant reverse transcriptase produced in planta has reverse transcriptase activity. However, transplastomic A. thaliana were not obtained by the implementation of reverse transcriptase and PTEC traits.

The challenge now is to evaluate if transplastomic N. benthamiana can be obtained by the implementation of reverse transcriptase and PTEC traits, with the addition of nucleocapsid proteins, temperature elevation, and single stranded PTEC design. As they are still unknown, the function and localization of all components in N. benthamiana will be evaluated in vivo and in vitro so that improvements can be made to create a working chloroplast modification system.

Prospects

In addition to implementation of plastid modification systems in N. benthamiana, many alternative uses of these systems can be envisioned, as well as further considerations for improvement of design. To troubleshoot de novo DNA synthesis, a simpler and more straightforward proof of principle could be accomplished using a smaller PTEC in the form of a transgene dummy, and plastid localization would be easier to prove or disprove using derivatives of that proof of principle system that lacked, say, a 5’UTR necessary for plastid localization. In addition to the forms described previously here, implementation of nucleases in the plastid could be accomplished via double transformation, in which plastid and nuclear transgene cassettes are simultaneously introduced into a single plant cell. Consideration of CRISPR-Cpf1 as an alternative system to CRISPR-Cas9 is also highly relevant, as the former has several salient advantages over the latter. Finally, de novo DNA synthesis may be more efficient in plastids if the proteins and RNA design are derived from plant LTR retrotransposons rather than a mammalian retrovirus.

Authentic implementation of reverse transcription in a plastid is likely to entail a great deal of failure. To this end, there is a real need to reduce the number of variables and the level of risk in order to identify points of failure, or designs that are nonfunctional. One of the key sources of both complexity and risk is the PTEC cassette. RNA localization and reverse transcription features are essentially assumed to function based on rational design, but there is no evidence yet that they function to either localize PTEC RNA or produce DNA in vivo. The complexity of this RNA should be intentionally reduced to assess the function or nonfunction of each RNA element. This could be achieved through the creation of a much smaller and simpler dummy PTEC. This would entail using much smaller homology arms, 300-400 bp each, simplifying retroviral features to just a PBS and U5 sequence at the 3’ terminus, and reducing the transgene cassette to a short sequence (20-30 bp) whose sole purpose is to create an efficient and unique annealing site for a PCR primer. This dummy construct would have the advantage that it could be made nearly the same size as an eGFP coding sequence, which is known to be able to localize to the plastid. Given that no PTECs yet exist that lack a transcriptionally fused 5’ ELVd sequence, construction of a dummy would also allow for the construction of a control for plastid localization. These smaller constructs would be ideal tools to evaluate plastid localization and DNA synthesis, both in vivo and in vitro.
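The size comparison between the proposed dummy PTEC and an eGFP coding sequence can be checked with simple arithmetic. The sketch below uses the arm and tag lengths given above. The PBS length (18 nt), U5 length (75 nt), and eGFP coding sequence length (720 bp including the stop codon) are assumed values for illustration, the function name is hypothetical, and the 5’ UTR is not counted.

```python
EGFP_CDS_BP = 720  # assumed eGFP coding sequence length, including stop codon


def dummy_ptec_length(arm_bp: int, tag_bp: int, pbs_bp: int, u5_bp: int) -> int:
    """Total length of two homology arms, a primer tag, and the PBS and U5."""
    return 2 * arm_bp + tag_bp + pbs_bp + u5_bp


# Arm and tag ranges follow the text; PBS and U5 lengths are assumptions.
smallest = dummy_ptec_length(arm_bp=300, tag_bp=20, pbs_bp=18, u5_bp=75)
largest = dummy_ptec_length(arm_bp=400, tag_bp=30, pbs_bp=18, u5_bp=75)

print(smallest, largest, EGFP_CDS_BP)
print(smallest - EGFP_CDS_BP, largest - EGFP_CDS_BP)
```

With these assumed feature lengths, the shortest design is within a few base pairs of the eGFP coding sequence, while the longest is roughly 200 bp larger, so arms near 300 bp would best match the size of the RNA already shown to localize.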

In addition to implementation of a nuclease in the plastid as a gene drive, traditional plastid transformation methods could be enhanced via the use of a nuclease expressed from the nuclear genome. The benefits of this system are similar to those of its use as a plastid gene drive, in that the purpose is to favor gene conversion towards a transgene product. This is also practically feasible, as simultaneous transformation of plastid and nuclear genomes is efficient using biolistic methods. Essentially, this would entail the use of already existing plastid transformation cassettes, with the addition of a nuclear transgene encoding a sequence specific nuclease fused to an N-terminal plastid transit peptide. There is value in this idea, as it would be easier to construct this system independently than to redesign plastid transgene cassettes to include a nuclease.

In either case, there is a downstream need to remove both antibiotic markers and nucleases via backcrossing and/or Cre/lox excision systems (Day and Goldschmidt-Clermont 2011). This is of particular concern when plastid transformation is translated into real world applications, as the inclusion of selectable marker transgenes entails a significant regulatory burden. The amount of additional work is comparable for either route if Cre/lox systems are used for marker gene excision. However, dual transformation used in this way is potentially an even more efficient means of plastid transformation, as inclusion of a selectable marker as part of a plastid transgene cassette could be avoided entirely. If the use of a plastid-localized nuclease is lethal or otherwise functionally deleterious in the absence of a plastid transgene, it could itself be used as a form of negative selection. By coupling a nuclear antibiotic resistance marker to expression of a plastid-localized nuclease, integration of foreign DNA in the plastid could itself be used as a criterion for selection. This would apply to both transgene and mutagenesis approaches, with the intended benefits being that homoplasmy is achieved without additional regeneration, and that selectable markers can be removed via a single backcross.

Implementation of CRISPR-Cas9 systems in the plastid via nuclear transgene expression is inherently risky because of the requirement for sgRNA expression via an RNA pol II promoter and plastid localization of the resulting mRNA. The addition of a 5’ 7-methylguanylate cap and polyadenylation of the 3’ end are features that likely interfere with binding or recognition of crRNA by Cas proteins. For these reasons crRNA are typically expressed using an RNA pol III promoter, which also conveniently confines crRNA to the inside of the nuclear envelope. Furthermore, 5’ and 3’ synthetic additions to sgRNA inhibit Cas9-mediated cleavage. Transcriptional fusion of ELVd at either the 5’ or 3’ end may therefore render any sgRNA nonfunctional. There is thus a clear need for a CRISPR-Cas system that can utilize a crRNA despite synthetic attachment of an artificial viroid UTR. CRISPR-Cpf1 systems are a compelling alternative that may be able to overcome steric hindrance due to the length and secondary structure of an ELVd fusion. Unlike Cas9, Cpf1 can recognize arrays of variable crRNA, processing them into individual crRNA (Fonfara et al. 2016; Zetsche et al. 2015, 2016) (Figure 4.19). This would be an obvious advantage for implementation in the plastid, as Cpf1 should be able to process at least a portion of a crRNA array attached to an ELVd UTR. This means that a discrete and seamless crRNA could be processed away from other portions of the plastid-localized RNA, at both 5’ and 3’ ends, providing a function similar to that of self-cleaving ribozymes, intentional host processing of artificial tRNA fusions, and Csy4 cleavage systems (Port and Bullock 2016; Xu et al. 2017). Importantly, whether this is achievable can be robustly evaluated in vitro.
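The idea that Cpf1 could release discrete crRNA from a longer, UTR-fused transcript can be illustrated with a toy string model. The sketch below assumes that each direct repeat marks the start of a crRNA, that a terminal repeat is present so the last spacer is separated from any 3’ extension, and that cleavage is exact. The function name and every sequence shown are hypothetical placeholders, not real repeat, spacer, or ELVd sequences.

```python
def split_crrna_array(pre_crrna: str, repeat: str):
    """Split a pre-crRNA array into repeat+spacer units.

    Toy model: only segments bounded by a direct repeat on both sides are
    returned. Sequence upstream of the first repeat (for example a fused
    5' UTR) and sequence from the last repeat onward are discarded.
    """
    starts = []
    position = pre_crrna.find(repeat)
    while position != -1:
        starts.append(position)
        position = pre_crrna.find(repeat, position + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(starts, starts[1:])]


# Hypothetical placeholder sequences for illustration only.
leader = "ACGUACGA"          # stands in for a fused 5' UTR
repeat = "UUAGCUAGCAUU"      # stands in for the Cpf1 direct repeat
spacers = ["CCCCAAAA", "GGGGCCAA"]
trailer = "ACACACAC"         # stands in for any 3' extension

array = leader + "".join(repeat + s for s in spacers) + repeat + trailer
for unit in split_crrna_array(array, repeat):
    print(unit)
```

In this model the leader and trailer never appear in the released units, which is the property that would make array processing attractive for a guide that has to be transcribed as part of a longer plastid-targeted RNA.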

Figure 4.19 | Comparison of CRISPR/Cas9 and CRISPR/Cpf1 systems. crRNA processing by Cas9 and Cpf1 is compared above. Cas9 proteins recognize and bind a tracrRNA fused to a crRNA that includes a guide; this fusion is known as an sgRNA. A full sgRNA must be synthesized and recognized by Cas9 for each target. In contrast, Cpf1 can process arrays of crRNA (pre-crRNA) made by transcriptional fusion. This process separates individual crRNA at 5’ and 3’ ends for loading into individual Cpf1 proteins (Ellie Castano 2017).

Innate biological factors of the plastid environment may render the model of DNA synthesis in vivo using a mammalian retrovirus-like system intractable. Two dominant reasons for this are that mammalian retroviral reverse transcriptases are optimally active at 37°C, and that there may be protease or nuclease factors present in the plastid that degrade a template RNA or reverse transcriptase before reverse transcription can be completed. An alternative model to a mammalian retrovirus that would likely be viable in this instance would be a plant-derived LTR retrotransposon synthetically tailored for plastid localization. Plant LTR retrotransposons use the same Strong Stop Strand Transfer model of replication, meaning that the considerations for template RNA and associated proteins would be similar to the M-MLV-like system implemented here (Kumar et al. 1999; Curcio et al. 2015; Havecker et al. 2004) (Figure 4.20). Importantly, reverse transcription of these retrotransposons is functional at temperatures used for plant growth, as low as 20°C, instead of 37°C (Malagon and Jensen 2008; Checkley et al. 2010; Doh et al. 2014). Plant LTR retrotransposons are also potentially useful because of the ability to perform reverse transcription within an enclosed virus-like particle (VLP), which could also be produced via the introduction of plastid-targeted transgenes using a transit peptide (Emanuelsson et al. 2018; Bruce 2000; Gavel and von Heijne 1990; Karlin‐Neumann and Tobin 1986; Roth 2000; Müller et al. 1987; Eichinger and Boeke 1988; Garfinkel et al. 1985; Mellor et al. 1985). Formation of VLP in a non-host environment is also known to be achievable, as these structures assembled in recombinant E. coli (Luschnig et al. 1995). Enclosure within the VLP complex facilitates the process of reverse transcription, and creates a porous environment that admits small proteins and nucleotide raw materials while excluding larger proteins that might degrade the VLP complex or nascent RNA/cDNA hybrid (Garfinkel et al. 1985; Eichinger and Boeke 1988; Mellor et al. 1985; Al-Khayat et al. 1999; Burns et al. 1992; Brookman et al. 1995; Curcio et al. 1988; Curcio and Garfinkel 1991). However, expression of LTR retrotransposon-like components is likely to be difficult, because these elements are targets of both epigenetic gene silencing by DNA methylation and posttranscriptional silencing (Kashkush et al. 2002; Hammond et al. 2001; Bennetzen et al. 1994, 1993; SanMiguel et al. 1998; SanMiguel et al. 1996; Neyers et al. 1986; Yoder et al. 1997). On the other hand, this may be a minor issue, as silencing tends to result from terminal repeats and read-through transcription, which can be avoided with careful design of an LTR-like transgene system.

Figure 4.20 | Structure of Plant LTR retrotransposons. The organization of retrotransposon genomic RNA is shown above. Salient features are comparable to retrovirus, including terminal features used in the process of reverse transcription by the strong stop strand transfer model (Figure 4.4). Ty1 and Ty3 – like retrotransposons differ mainly because of the order of protein domains of the pol polypeptide. Figure modified from (Kumar et al. 1999).

References

Ahmad, A., Pereira, E. O., Conley, A. J., Richman, A. S., & Menassa, R. (2010). Green Biofactories: Recombinant Protein Production in Plants. Recent Patents on Biotechnology, 4(3), 242–259.
Ahmad, N., Michoux, F., Lössl, A. G., & Nixon, P. J. (2016). Challenges and perspectives in commercializing plastid transformation technology. Journal of Experimental Botany, 67(21), 5945–5960.
Ahmad, T. (2016). Translocation of Virus-derived Nucleic Acids to Chloroplasts and Mitochondria in Plants.
Akbari, O. S., Bellen, H. J., Bier, E., Bullock, S. L., Burt, A., Church, G. M., Cook, K. R., et al. (2015). Safeguarding gene drive experiments in the laboratory. Science, aac7932.
Al-Khayat, H. A., Bhella, D., Kenney, J. M., Roth, J.-F., Kingsman, A. J., Martin-Rendon, E., & Saibil, H. R. (1999). Yeast Ty retrotransposons assemble into virus-like particles whose T-numbers depend on the C-terminal length of the capsid protein. Journal of Molecular Biology, 292(1), 65–73.
Fujiki, M., & Verner, K. (1993). Coupling of cytosolic protein synthesis and mitochondrial protein import in yeast. Evidence for cotranslational import in vivo. Journal of Biological Chemistry, 268(3), 1914–1920.
An, G. (1986). Development of plant promoter expression vectors and their use for analysis of differential activity of nopaline synthase promoter in transformed tobacco cells. Plant Physiology, 81(1), 86–91.
An, G., Costa, M. A., & Ha, S. B. (1990). Nopaline synthase promoter is wound inducible and auxin inducible. The Plant Cell, 2(3), 225–233.
Baek, E., Park, M., Yoon, J.-Y., & Palukaitis, P. (2017). Chrysanthemum Chlorotic Mottle Viroid-Mediated Trafficking of Foreign mRNA into Chloroplasts. Research in Plant Disease, 23(3), 288–293.
Bedell, V. M., Wang, Y., Campbell, J. M., Poshusta, T. L., Starker, C. G., Krug, R. G., II, Tan, W., et al. (2012). In vivo genome editing using a high-efficiency TALEN system. Nature, 491(7422), 114.
Belhaj, K., Chaparro-Garcia, A., Kamoun, S., Patron, N. J., & Nekrasov, V. (2015). Editing plant genomes with CRISPR/Cas9. Current Opinion in Biotechnology, 32, 76–84.


Bennetzen, J. L., Schrick, K., Springer, P. S., Brown, W. E., & SanMiguel, P. (1994). Active maize genes are unmodified and flanked by diverse classes of modified, highly repetitive DNA. Genome, 37(4), 565–576.
Bennetzen, J. L., Springer, P. S., Cresse, A. D., & Hendrickx, M. (1993). Specificity and regulation of the mutator transposable element system in maize. Critical Reviews in Plant Sciences, 12(1–2), 57–95.
Bevan, M. (1984). Binary Agrobacterium vectors for plant transformation. Nucleic Acids Research, 12(22), 8711–8721.
Bock, R. (2007a). Plastid biotechnology: prospects for herbicide and insect resistance, metabolic engineering and molecular farming. Current Opinion in Biotechnology, 18(2), 100–106.
Bock, R. (2007b). Structure, function, and inheritance of plastid genomes (pp. 29–63). Springer, Berlin, Heidelberg.
Bock, R. (2014). Genetic engineering of the chloroplast: novel tools and new applications. Current Opinion in Biotechnology, 26, 7–13.
Bock, R., & Timmis, J. N. (2008). Reconstructing evolution: Gene transfer from plastids to the nucleus. BioEssays, 30(6), 556–566.
Boesch, Pierre, Frédérique Weber-Lotfi, Noha Ibrahim, Vladislav Tarasenko, Anne Cosset, François Paulus, Robert N. Lightowlers, and André Dietrich. "DNA repair in organelles: pathways, organization, regulation, relevance in disease and aging." Biochimica et Biophysica Acta (BBA)-Molecular Cell Research 1813, no. 1 (2011): 186-200.
Brookman, J. L., Stott, A. J., Cheeseman, P. J., Burns, N. R., Adams, S. E., Kingsman, A. J., & Gull, K. (1995). An Immunological Analysis of Ty1 Virus-like Particle Structure. Virology, 207(1), 59–67.
Brouns, Stan JJ, Matthijs M. Jore, Magnus Lundgren, Edze R. Westra, Rik JH Slijkhuis, Ambrosius PL Snijders, Mark J. Dickman, Kira S. Makarova, Eugene V. Koonin, and John Van Der Oost. "Small CRISPR RNAs guide antiviral defense in prokaryotes." Science 321, no. 5891 (2008): 960-964.
Bruce, B. D. (2000). Chloroplast transit peptides: structure, function and evolution. Trends in Cell Biology, 10(10), 440–447.
Burns, N. R., H. R. Saibil, N. S. White, J. F. Pardon, P. A. Timmins, S. M. Richardson, B. M. Richards, S. E. Adams, S. M. Kingsman, and A. J. Kingsman. "Symmetry, flexibility and permeability in the structure of yeast retrotransposon virus‐like particles." The EMBO journal 11, no. 3 (1992): 1155-1164.
Cerutti, H., Johnson, A. M., Boynton, J. E., & Gillham, N. W. (1995). Inhibition of chloroplast DNA recombination and repair by dominant negative mutants of Escherichia coli RecA. Molecular and Cellular Biology, 15(6), 3003–3011.


Cerutti, H., Osman, M., Grandoni, P., & Jagendorf, A. T. (1992). A homolog of Escherichia coli RecA protein in plastids of higher plants. Proceedings of the National Academy of Sciences of the United States of America, 89(17), 8068–8072.
Chavez, Alejandro, Jonathan Scheiman, Suhani Vora, Benjamin W. Pruitt, Marcelle Tuttle, Eswar PR Iyer, Shuailiang Lin et al. "Highly efficient Cas9-mediated transcriptional programming." Nature methods 12, no. 4 (2015): 326.
Chavez, Alejandro, Marcelle Tuttle, Benjamin W. Pruitt, Ben Ewen-Campen, Raj Chari, Dmitry Ter-Ovanesyan, Sabina J. Haque et al. "Comparison of Cas9 activators in multiple species." Nature methods 13, no. 7 (2016): 563.
Chayot, R., Montagne, B., Mazel, D., & Ricchetti, M. (2010). An end-joining repair mechanism in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America, 107(5), 2141–2146.
Checkley, M. A., Nagashima, K., Lockett, S. J., Nyswaner, K. M., & Garfinkel, D. J. (2010). P-Body Components Are Required for Ty1 Retrotransposition during Assembly of Retrotransposition-Competent Virus-Like Particles. Molecular and Cellular Biology, 30(2), 382–398.
Chotewutmontri, P., Holbrook, K., & Bruce, B. D. (2017). Plastid Protein Targeting: Preprotein Recognition and Translocation. International Review of Cell and Molecular Biology, 330, 227–294.
Christian, Michelle, Tomas Cermak, Erin L. Doyle, Clarice Schmidt, Feng Zhang, Aaron Hummel, Adam J. Bogdanove, and Daniel F. Voytas. "Targeting DNA double-strand breaks with TAL effector nucleases." Genetics 186, no. 2 (2010): 757-761.
Cornille, Fabrice, Yves Mely, Damien Ficheux, Isabelle Savignol, Dominique Gerard, Jean‐Luc Darlix, Marie‐Claude Fournie‐Zaluski, and Bernard P. Roques. "Solid phase synthesis of the retroviral nucleocapsid protein NCp10 of Moloney Murine Leukaemia virus and related “zinc‐fingers” in free SH forms." Chemical Biology & Drug Design 36, no. 6 (1990): 551-558.
Culver, K. W., Ram, Z., Wallbridge, S., Ishii, H., Oldfield, E. H., & Blaese, R. M. (1992). In vivo gene transfer with retroviral vector-producer cells for treatment of experimental brain tumors. Science (New York, N.Y.), 256(5063), 1550–1552.
Curcio, M. J., & Garfinkel, D. J. (1991). Single-step selection for Ty1 element retrotransposition. Proceedings of the National Academy of Sciences of the United States of America, 88(3), 936–940.
Curcio, M. J., Lutz, S., & Lesage, P. (2015). The Ty1 LTR-retrotransposon of budding yeast, Saccharomyces cerevisiae. Microbiology Spectrum, 3(2), 1–35.
Curcio, M. J., Sanders, N. J., & Garfinkel, D. J. (1988). Transpositional competence and transcription of endogenous Ty elements in Saccharomyces cerevisiae: implications for regulation of transposition. Molecular and Cellular Biology, 8(9), 3571–3581.


Dalal, Jyoti, Roopa Yalamanchili, Christophe La Hovary, Mikyoung Ji, Maria Rodriguez-Welsh, Denise Aslett, Sowmya Ganapathy, Amy Grunden, Heike Sederoff, and Rongda Qu. "A novel gateway-compatible binary vector series (PC-GW) for flexible cloning of multiple genes for genetic transformation of plants." Plasmid 81 (2015): 55-62.
Daniell, H., Muthukumar, B., & Lee, S. B. (2001). Marker free transgenic plants: engineering the chloroplast genome without the use of antibiotic selection. Current Genetics, 39(2), 109–116.
Darlix, J.-L., de Rocquigny, H., Mauffret, O., & Mély, Y. (2014). Retrospective on the all-in-one retroviral nucleocapsid protein. Virus Research, 193, 2–15.
Darlix, J.-L., Lapadat-Tapolsky, M., de Rocquigny, H., & Roques, B. P. (1995). First Glimpses at Structure-function Relationships of the Nucleocapsid Protein of Retroviruses. Journal of Molecular Biology, 254(4), 523–537.
Daròs, J.-A. (2016). Eggplant latent viroid: a friendly experimental system in the family Avsunviroidae. Molecular Plant Pathology, 17(8), 1170–1177.
Davis, N. L., & Rueckert, R. R. (1972). Properties of a ribonucleoprotein particle isolated from Nonidet P-40-treated Rous sarcoma virus. Journal of Virology, 10(5), 1010–1020.
Day, A., & Goldschmidt-Clermont, M. (2011). The chloroplast transformation toolbox: selectable markers and marker removal. Plant Biotechnology Journal, 9(5), 540–553.
Day, A., & Madesis, P. (2007). DNA replication, recombination, and repair in plastids (pp. 65–119). Springer, Berlin, Heidelberg.
De Rocquigny, H., Gabus, C., Vincent, A., Fournié-Zaluski, M. C., Roques, B., & Darlix, J. L. (1992). Viral RNA annealing activities of human immunodeficiency virus type 1 nucleocapsid protein require only peptide domains outside the zinc fingers. Proceedings of the National Academy of Sciences of the United States of America, 89(14), 6472–6476.
Della, Marina, Phillip L. Palmbos, Hui-Min Tseng, Louise M. Tonkin, James M. Daley, Leana M. Topper, Robert S. Pitcher, Alan E. Tomkinson, Thomas E. Wilson, and Aidan J. Doherty. "Mycobacterial Ku and ligase proteins constitute a two-component NHEJ repair machine." Science 306, no. 5696 (2004): 683-685.
Doh, J. H., Lutz, S., & Curcio, M. J. (2014). Co-translational Localization of an LTR-Retrotransposon RNA to the Endoplasmic Reticulum Nucleates Virus-Like Particle Assembly Sites. PLoS Genetics, 10(3), e1004219.
Duan, X., Gimble, F. S., & Quiocho, F. A. (1997). Crystal Structure of PI-SceI, a Homing Endonuclease with Protein Splicing Activity. Cell, 89(4), 555–564.
Ebert, P. R., Ha, S. B., & An, G. (1987). Identification of an essential upstream element in the nopaline synthase promoter by stable and transient assays. Proceedings of the National Academy of Sciences of the United States of America, 84(16), 5745–5749.


Eichinger, D. J., & Boeke, J. D. (1988). The DNA intermediate in yeast Ty1 element transposition copurifies with virus-like particles: cell-free Ty1 transposition. Cell, 54(7), 955–966.
Ellie Castano. (2017). UMMS researchers receive amfAR grant to eliminate HIV reservoirs using gene editing. Retrieved April 5, 2018, from https://www.umassmed.edu/news/news-archives/2017/04/umms-researchers-receive-amfar-grant-to-eliminate-hiv-reservoirs-using-gene-editing/
Emanuelsson, Olof, Henrik Nielsen, and Gunnar Von Heijne. "ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites." Protein Science 8, no. 5 (1999): 978-984.
Fadda, Z., Daròs, J. A., Fagoaga, C., Flores, R., & Duran-Vila, N. (2003). Eggplant latent viroid, the candidate type species for a new genus within the family Avsunviroidae (hammerhead viroids). Journal of Virology, 77(11), 6528–6532.
Flores, R., Daròs, J. A., & Hernández, C. (2000). Avsunviroidae family: viroids containing hammerhead ribozymes. Advances in Virus Research, 55, 271–323.
Fonfara, I., Richter, H., Bratovič, M., Le Rhun, A., & Charpentier, E. (2016). The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature, 532(7600), 517–521.
Frewer, Lynn J., Ivo A. van der Lans, Arnout RH Fischer, Machiel J. Reinders, Davide Menozzi, Xiaoyong Zhang, Isabelle van den Berg, and Karin L. Zimmermann. "Public perceptions of agri-food applications of genetic modification–a systematic review and meta-analysis." Trends in Food Science & Technology 30, no. 2 (2013): 142-152.
Gago-Zachert, S. (2016). Viroids, infectious long non-coding RNAs with autonomous replication. Virus Research, 212, 12–24.
Gaj, T., Gersbach, C. A., & Barbas, C. F. (2013). ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends in Biotechnology, 31(7), 397–405.
Gantz, V. M., Jasinskiene, N., Tatarenkova, O., Fazekas, A., Macias, V. M., Bier, E., & James, A. A. (2015). Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proceedings of the National Academy of Sciences of the United States of America, 112(49), E6736-43.
Gao, Y., & Zhao, Y. (2014). Self-processing of ribozyme-flanked RNAs into guide RNAs in vitro and in vivo for CRISPR-mediated genome editing. Journal of Integrative Plant Biology, 56(4), 343–349.
Garfinkel, D. J., Boeke, J. D., & Fink, G. R. (1985). Ty element transposition: reverse transcriptase and virus-like particles. Cell, 42(2), 507–517.
Gateway® pDONR™ Vectors. (2012).


Gavel, Y., & von Heijne, G. (1990). A conserved cleavage-site motif in chloroplast transit peptides. FEBS Letters, 261(2), 455–458.
Gerard, G. F., & Grandgenett, D. P. (1975). Purification and characterization of the DNA polymerase and RNase H activities in Moloney murine sarcoma-leukemia virus. Journal of Virology, 15(4), 785–797.
Gilbert, Luke A., Max A. Horlbeck, Britt Adamson, Jacqueline E. Villalta, Yuwen Chen, Evan H. Whitehead, Carla Guimaraes et al. "Genome-scale CRISPR-mediated control of gene repression and activation." Cell 159, no. 3 (2014): 647-661.
Golds, T., Maliga, P., & Koop, H.-U. (1993). Stable Plastid Transformation in PEG-treated Protoplasts of Nicotiana tabacum. Nature Biotechnology, 11(1), 95–97.
Gomaa, A. A., Klumpe, H. E., Luo, M. L., Selle, K., Barrangou, R., & Beisel, C. L. (2014). Programmable removal of bacterial strains by use of genome-targeting CRISPR-Cas systems. mBio, 5(1), e00928-13.
Gómez, G., & Pallás, V. (2012). Studies on subcellular compartmentalization of plant pathogenic noncoding RNAs give new insights into the intracellular RNA-traffic mechanisms. Plant Physiology, 159(2), 558–564.
Gómez, G., & Pallás, V. (2010a). Can the import of mRNA into chloroplasts be mediated by a secondary structure of a small non-coding RNA? Plant Signaling & Behavior, 5(11), 1517–1519.
Gómez, G., & Pallás, V. (2010b). Noncoding RNA Mediated Traffic of Foreign mRNA into Chloroplasts Reveals a Novel Signaling Mechanism in Plants. PLoS ONE, 5(8), e12269.
Gong, C., Bongiorno, P., Martins, A., Stephanou, N. C., Zhu, H., Shuman, S., & Glickman, M. S. (2005). Mechanism of nonhomologous end-joining in mycobacteria: a low-fidelity repair system driven by Ku, ligase D and ligase C. Nature Structural & Molecular Biology, 12(4), 304–312.
Gupta, A., Christensen, R. G., Rayla, A. L., Lakshmanan, A., Stormo, G. D., & Wolfe, S. A. (2012). An optimized two-finger archive for ZFN-mediated gene targeting. Nature Methods, 9(6), 588–590.
Hammond, Andrew, Roberto Galizi, Kyros Kyrou, Alekos Simoni, Carla Siniscalchi, Dimitris Katsanos, Matthew Gribble et al. "A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae." Nature biotechnology 34, no. 1 (2016): 78.
Hammond, S. M., Caudy, A. A., & Hannon, G. J. (2001). Post-transcriptional gene silencing by double-stranded RNA. Nature Reviews Genetics, 2(2), 110–119.
Hampsey, M. (1998). Molecular genetics of the RNA polymerase II general transcriptional machinery. Microbiology and Molecular Biology Reviews: MMBR, 62(2), 465–503.


Haurwitz, Rachel E., Martin Jinek, Blake Wiedenheft, Kaihong Zhou, and Jennifer A. Doudna. "Sequence-and structure-specific RNA processing by a CRISPR endonuclease." Science 329, no. 5997 (2010): 1355-1358.
Havecker, E. R., Gao, X., & Voytas, D. F. (2004). The diversity of LTR retrotransposons. Genome Biology, 5(6), 225.
Henderson, L. E., Copeland, T. D., Sowder, R. C., Smythers, G. W., & Oroszlan, S. (1981). Primary structure of the low molecular weight nucleic acid-binding proteins of murine leukemia viruses. The Journal of Biological Chemistry, 256(16), 8400–8406.
Hirose, Y., & Manley, J. L. (2000). RNA polymerase II and the integration of nuclear events. Genes & Development, 14(12), 1415–1429.
Hockemeyer, Dirk, Haoyi Wang, Samira Kiani, Christine S. Lai, Qing Gao, John P. Cassady, Gregory J. Cost et al. "Genetic engineering of human pluripotent cells using TALE nucleases." Nature biotechnology 29, no. 8 (2011): 731.
Horton, R. M., Hunt, H. D., Ho, S. N., Pullen, J. K., & Pease, L. R. (1989). Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene, 77(1), 61–68.
Horvath, Philippe, and Rodolphe Barrangou. "CRISPR/Cas, the immune system of bacteria and archaea." Science 327, no. 5962 (2010): 167-170.
Jarvis, P., & Soll, J. (2001). Toc, Tic, and chloroplast protein import. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research, 1541(1–2), 64–79.
Kapila, Jyoti, Riet De Rycke, Marc Van Montagu, and Geert Angenon. "An Agrobacterium-mediated transient gene expression system for intact leaves." Plant science 122, no. 1 (1997): 101-108.
Karlin‐Neumann, G. A., & Tobin, E. M. (1986). Transit peptides of nuclear‐encoded chloroplast proteins share a common amino acid framework. The EMBO Journal, 5(1), 9–13.
Kashkush, K., Feldman, M., & Levy, A. A. (2002). Transcriptional activation of retrotransposons alters the expression of adjacent genes in wheat. Nature Genetics, 33(1), 102–106.
Khan, M. S., & Maliga, P. (1999). Fluorescent antibiotic resistance marker for tracking plastid transformation in higher plants. Nature Biotechnology, 17(9), 910–915.
Kimura, Seisuke, and Kengo Sakaguchi. "DNA repair in plants." Chemical reviews 106, no. 2 (2006): 753-766.
Kindle, K. L. (1990). High-frequency nuclear transformation of Chlamydomonas reinhardtii. Proceedings of the National Academy of Sciences of the United States of America, 87(3), 1228–1232.
Konermann, Silvana, Mark D. Brigham, Alexandro E. Trevino, Julia Joung, Omar O. Abudayyeh, Clea Barcena, Patrick D. Hsu et al. "Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex." Nature 517, no. 7536 (2015): 583.


Kotewicz, M. L., Sampson, C. M., D’Alessio, J. M., & Gerard, G. F. (1988). Isolation of cloned Moloney murine leukemia virus reverse transcriptase lacking ribonuclease H activity. Nucleic Acids Research, (1).
Kowalczykowski, S. C. (2000). Initiation of genetic recombination and recombination-dependent replication. Trends in Biochemical Sciences, 25(4), 156–165.
Kowalczykowski, S. C., Dixon, D. A., Eggleston, A. K., Lauder, S. D., & Rehrauer, W. M. (1994). Biochemistry of homologous recombination in Escherichia coli. Microbiological Reviews, 58(3), 401–465.
Kumar, Amar, and Jeffrey L. Bennetzen. "Plant retrotransposons." Annual review of genetics 33, no. 1 (1999): 479-532.
Kwon, Taegun, Enamul Huq, and David L. Herrin. "Microhomology-mediated and nonhomologous repair of a double-strand break in the chloroplast genome of Arabidopsis." Proceedings of the National Academy of Sciences 107, no. 31 (2010): 13954-13959.
Larson, M. H., Gilbert, L. A., Wang, X., Lim, W. A., Weissman, J. S., & Qi, L. S. (2013). CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nature Protocols, 8(11), 2180–2196.
Latchman, D. S., & Latchman, D. S. (2008). RNA Polymerases and the basal transcriptional complex. In Eukaryotic Transcription Factors (p. 68–I). Elsevier.
Lee, T. I., & Young, R. A. (2000). Transcription of Eukaryotic Protein-Coding Genes. Annual Review of Genetics, 34(1), 77–137.
Li, Z., Zhang, D., Xiong, X., Yan, B., Xie, W., Sheen, J., & Li, J.-F. (2017). A potent Cas9-derived gene activator for plant and mammalian cells. Nature Plants, 3(12), 930–936.
Liu, S. John, Max A. Horlbeck, Seung Woo Cho, Harjus S. Birk, Martina Malatesta, Daniel He, Frank J. Attenello et al. "CRISPRi-based genome-scale identification of functional long noncoding RNA loci in human cells." Science 355, no. 6320 (2017): eaah7111.
Luschnig, C., Hess, M., Pusch, O., Brookman, J., & Bachmair, A. (1995). The gag homologue of retrotransposon Ty1 assembles into spherical particles in Escherichia coli. European Journal of Biochemistry, 228(3), 739–744.
Ma, H., Wu, Y., Dang, Y., Choi, J.-G., Zhang, J., & Wu, H. (2014). Pol III Promoters to Express Small RNAs: Delineation of Transcription Initiation. Molecular Therapy. Nucleic Acids, 3(5), e161.
Ma, Julian K‐C., Eugenia Barros, Ralph Bock, Paul Christou, Philip J. Dale, Philip J. Dix, Rainer Fischer et al. "Molecular farming for new drugs and vaccines: current perspectives on the production of pharmaceuticals in transgenic plants." EMBO reports 6, no. 7 (2005): 593-599.
Ma, Xingliang, Qunyu Zhang, Qinlong Zhu, Wei Liu, Yan Chen, Rong Qiu, Bin Wang et al. "A robust CRISPR/Cas9 system for convenient, high-efficiency multiplex genome editing in monocot and dicot plants." Molecular plant 8, no. 8 (2015): 1274-1284.


Mak, J., & Kleiman, L. (1997). Primer tRNAs for reverse transcription. Journal of Virology, 71(11), 8087–8095.
Malagon, F., & Jensen, T. H. (2008). The T body, a new cytoplasmic RNA granule in Saccharomyces cerevisiae. Molecular and Cellular Biology, 28(19), 6022–6032.
Mali, Prashant, Luhan Yang, Kevin M. Esvelt, John Aach, Marc Guell, James E. DiCarlo, Julie E. Norville, and George M. Church. "RNA-guided human genome engineering via Cas9." Science 339, no. 6121 (2013): 823-826.
Maliga, P. (2004). Plastid transformation in higher plants. Annual Review of Plant Biology, 55(1), 289–313.
Marc, P., Margeot, A., Devaux, F., Blugeon, C., Corral-Debrinski, M., & Jacq, C. (2002). Genome-wide analysis of mRNAs targeted to yeast mitochondria. EMBO Reports, 3(2), 159–164.
Marketsandmarkets.com. (2016). Seed Market by Type (GM Seed, Conventional Seed), Crop Type (Cereals & Grains, Fruits & Vegetables, Oilseeds), Seed Treatment (Treated, Non-treated), Trait (Herbicide-tolerant, Insecticide-resistant), and Region - Global Forecast to 2022. Retrieved May 4, 2018, from https://www.marketsandmarkets.com/Market-Reports/seed-market-126130457.html
Mayfield, S. P., & Franklin, S. E. (2005). Expression of human antibodies in eukaryotic micro-algae. Vaccine, 23(15), 1828–1832.
Mayfield, Stephen P., Andrea L. Manuell, Stephen Chen, Joann Wu, Miller Tran, David Siefker, Machiko Muto, and Julia Marin-Navarro. "Chlamydomonas reinhardtii chloroplasts as protein factories." Current opinion in biotechnology 18, no. 2 (2007): 126-133.
Mellor, Jane, Michael H. Malim, Keith Gull, Mick F. Tuite, Shirley McCready, Teresa Dibbayawan, Susan M. Kingsman, and Alan J. Kingsman. "Reverse transcriptase activity and Ty RNA are associated with virus-like particles in yeast." Nature 318, no. 6046 (1985): 583.
Miller, D. G., Adam, M. A., & Miller, A. D. (1990). Gene transfer by retrovirus vectors occurs only in cells that are actively replicating at the time of infection. Molecular and Cellular Biology, 10(8), 4239–4242.
Miller, Jeffrey C., Siyuan Tan, Guijuan Qiao, Kyle A. Barlow, Jianbin Wang, Danny F. Xia, Xiangdong Meng et al. "A TALE nuclease architecture for efficient genome editing." Nature biotechnology 29, no. 2 (2011): 143.
Miras, S., Salvi, D., Ferro, M., Grunwald, D., Garin, J., Joyard, J., & Rolland, N. (2002). Non-canonical transit peptide for import into the chloroplast. The Journal of Biological Chemistry, 277(49), 47770–47778.
Molling, K., Bolognesi, D. P., Bauer, H., Busen, W., Plassmann, H. W., & Hausen, P. (1971). Association of Viral Reverse Transcriptase with an Enzyme degrading the RNA Moiety of RNA-DNA Hybrids. Nature New Biology, 234(51), 240–243.


Müller, F., Brühl, K. H., Freidel, K., Kowallik, K. V, & Ciriacy, M. (1987). Processing of TY1 proteins and formation of Ty1 virus-like particles in Saccharomyces cerevisiae. Molecular & General Genetics: MGG, 207(2–3), 421–429.
Mussolino, C., Morbitzer, R., Lütge, F., Dannemann, N., Lahaye, T., & Cathomen, T. (2011). A novel TALE nuclease scaffold enables high genome editing activity in combination with low toxicity. Nucleic Acids Research, 39(21), 9283–9293.
National Academies of Sciences, Engineering, and Medicine. 2016. Genetically Engineered Crops: Experiences and Prospects. Washington, DC: The National Academies Press.
Negroni, M., & Buc, H. (2001). Mechanisms of Retroviral Recombination. Annual Review of Genetics, 35(1), 275–302.
Neyers, P., Shepherd, N. S., & Saedler, H. (1986). Plant Transposable Elements. Advances in Botanical Research, 12, 103–203.
Nicolaï, M., Duprat, A., Sormani, R., Rodriguez, C., Roncato, M.-A., Rolland, N., & Robaglia, C. (2007). Higher plant chloroplasts import the mRNA coding for the eucaryotic translation initiation factor 4E. FEBS Letters, 581(21), 3921–3926.
Nugent, G. D., Coyne, S., Nguyen, T. T., Kavanagh, T. A., & Dix, P. J. (2006). Nuclear and plastid transformation of Brassica oleracea var. botrytis (cauliflower) using PEG-mediated uptake of DNA into protoplasts. Plant Science, 170(1), 135–142.
O’Connell, M. R., Oakes, B. L., Sternberg, S. H., East-Seletsky, A., Kaplan, M., & Doudna, J. A. (2014). Programmable RNA recognition and cleavage by CRISPR/Cas9. Nature, 516(7530), 263–266.
O’Neill, C., Horváth, G. V., Horváth, É., Dix, P. J., & Medgyesy, P. (1993). Chloroplast transformation in plants: polyethylene glycol (PEG) treatment of protoplasts is an alternative to biolistic delivery systems. The Plant Journal, 3(5), 729–738.
Pager, J., Coulaud, D., & Delain, E. (1994). Electron microscopy of the nucleocapsid from disrupted Moloney murine leukemia virus and of associated type VI collagen-like filaments. Journal of Virology, 68(1), 223–232.
Peled-Zehavi, H., & Danon, A. (2007). Translation and translational regulation in chloroplasts (pp. 249–281). Springer, Berlin, Heidelberg.
Pew Research Center, December, 2016, “The New Food Fights: U.S. Public Divides Over Food Science.”
Pitcher, R. S., Green, A. J., Brzostek, A., Korycka-Machala, M., Dziadek, J., & Doherty, A. J. (2007). NHEJ protects mycobacteria in stationary phase against the harmful effects of desiccation. DNA Repair, 6(9), 1271–1276.
Port, Fillip, and Simon L. Bullock. "Expansion of the CRISPR toolbox in an animal with tRNA-flanked Cas9 and Cpf1 gRNAs." bioRxiv (2016): 046417.


Ran, F. Ann, Patrick D. Hsu, Chie-Yu Lin, Jonathan S. Gootenberg, Silvana Konermann, Alexandro E. Trevino, David A. Scott et al. "Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity." Cell 154, no. 6 (2013): 1380-1389.
Ran, F. A., Hsu, P. D., Wright, J., Agarwala, V., Scott, D. A., & Zhang, F. (2013). Genome engineering using the CRISPR-Cas9 system. Nature Protocols, 8(11), 2281–2308.
Rice, William G., Jeffrey G. Supko, Louis Malspeis, Robert W. Buckheit, David Clanton, Ming Bu, Lisa Graham et al. "Inhibitors of HIV nucleocapsid protein zinc fingers as candidates for the treatment of AIDS." Science 270, no. 5239 (1995): 1194-1197.
Rigano, M. M., Manna, C., Giulini, A., Vitale, A., & Cardi, T. (2009). Plants as biofactories for the production of subunit vaccines against bio-security-related bacteria and viruses. Vaccine, 27(25–26), 3463–3466.
Rigano, M. M., Scotti, N., & Cardi, T. (2012). Unsolved problems in plastid transformation. Bioengineered, 3(6), 329–333.
Robertson, E., Bradley, A., Kuehn, M., & Evans, M. (1986). Germ-line transmission of genes introduced into cultured pluripotential cells by retroviral vector. Nature, 323(6087), 445–448.
Roth, J.-F. (2000). The yeast Ty virus-like particles. Yeast, 16(9), 785–795.
Ruf, S., Hermann, M., Berger, I. J., Carrer, H., & Bock, R. (2001). Stable genetic transformation of tomato plastids and expression of a foreign protein in fruit. Nature Biotechnology, 19(9), 870–875.
Ryder, E. F., Snyder, E. Y., & Cepko, C. L. (1990). Establishment and characterization of multipotent neural cell lines using retrovirus vector-mediated oncogene transfer. Journal of Neurobiology, 21(2), 356–375.
Salinas, T., Duchêne, A.-M., & Maréchal-Drouard, L. (2008). Recent advances in tRNA mitochondrial import. Trends in Biochemical Sciences, 33(7), 320–329.
SanMiguel, P., Gaut, B. S., Tikhonov, A., Nakajima, Y., & Bennetzen, J. L. (1998). The paleontology of intergene retrotransposons of maize. Nature Genetics, 20(1), 43–45.
SanMiguel, Phillip, Alexander Tikhonov, Young-Kwan Jin, Natasha Motchoulskaia, Dmitrii Zakharov, Admasu Melake-Berhan, Patricia S. Springer et al. "Nested retrotransposons in the intergenic regions of the maize genome." Science 274, no. 5288 (1996): 765-768.
Schatz, G., Dobberstein, B., Mireau, H., Fox, T. D., Martin, R. P., & Tarassov, I. A. (1996). Common Principles of Protein Translocation Across Membranes. Science, 271(5255), 1519–1526.
Schneider, A., & Maréchal-Drouard, L. (2000). Mitochondrial tRNA import: are there distinct mechanisms? Trends in Cell Biology, 10(12), 509–513.
Scotti, N., Bellucci, M., & Cardi, T. (2013). The Chloroplasts as Platform for Recombinant Proteins Production. In Translation in Mitochondria and Other Organelles (pp. 225–262). Berlin, Heidelberg: Springer Berlin Heidelberg.


Scotti, N., & Cardi, T. (2012). Plastid Transformation as an Expression Tool for Plant-Derived Biopharmaceuticals (pp. 451–466). Humana Press.
Selle, K., & Barrangou, R. (2015). Harnessing CRISPR–Cas systems for bacterial genome editing. Trends in Microbiology, 23(4), 225–232.
Shen, Ping, and Henry V. Huang. "Homologous recombination in Escherichia coli: dependence on substrate length and homology." Genetics 112, no. 3 (1986): 441-457.
Shinnick, T. M., Lerner, R. A., & Sutcliffe, J. G. (1981). Nucleotide sequence of Moloney murine leukaemia virus. Nature, 293(5833), 543–548.
Shuman, S., & Glickman, M. S. (2007). Bacterial DNA repair by non-homologous end joining. Nature Reviews Microbiology, 5(11), 852–861.
Sikdar, S. R., Serino, G., Chaudhuri, S., & Maliga, P. (1998). Plastid transformation in Arabidopsis thaliana. Plant Cell Reports, 18(1–2), 20–24.
Sinkins, S. P., & Gould, F. (2006). Gene drive systems for insect disease vectors. Nature Reviews Genetics, 7(6), 427–435.
Smith, H. O., & Nathans, D. (1973). A Suggested nomenclature for bacterial host modification and restriction systems and their enzymes. Journal of Molecular Biology, 81(3), 419–423.
Specht, E., Miyake-Stoner, S., & Mayfield, S. (2010). Micro-algae come of age as a platform for recombinant protein production. Biotechnology Letters, 32(10), 1373–1383.
Stegemann, S., Keuthe, M., Greiner, S., & Bock, R. (2012). Horizontal transfer of chloroplast genomes between plant species. Proceedings of the National Academy of Sciences of the United States of America, 109(7), 2434–2438.
Stephanou, N. C., Gao, F., Bongiorno, P., Ehrt, S., Schnappinger, D., Shuman, S., & Glickman, M. S. (2007). Mycobacterial nonhomologous end joining mediates mutagenic repair of chromosomal double-strand DNA breaks. Journal of Bacteriology, 189(14), 5237–5246.
Stoddard, B. L. (2006). Homing endonuclease structure and function. Quarterly Reviews of Biophysics, 38(1), 49.
Stoddard, B. L. (2011). Homing Endonucleases: From Microbial Genetic Invaders to Reagents for Targeted DNA Modification. Structure, 19(1), 7–15.
Stoddard, B. L., Flick, K. E., Jurica, M. S., & Monnat, R. J. (1998). DNA binding and cleavage by the nuclear intron-encoded homing endonuclease I-PpoI. Nature, 394(6688), 96–101.
Subramanian, A. R., Stahl, D., & Prombona, A. (1991). Ribosomal Proteins, Ribosomes, and Translation in Plastids. In The Molecular Biology of Plastids (pp. 191–215). Elsevier.
Svab, Z., Hajdukiewicz, P., & Maliga, P. (1990). Stable transformation of plastids in higher plants. Proceedings of the National Academy of Sciences of the United States of America, 87(21), 8526–8530.


Svab, Z., & Maliga, P. (1993a). High-frequency plastid transformation in tobacco by selection for a chimeric aadA gene. Proceedings of the National Academy of Sciences of the United States of America, 90(3), 913–917.
Svab, Z., & Maliga, P. (1993b). High-frequency plastid transformation in tobacco by selection for a chimeric aadA gene. Proceedings of the National Academy of Sciences of the United States of America, 90(3), 913–917.
Szostak, J. W., Orr-Weaver, T. L., Rothstein, R. J., & Stahl, F. W. (1983). The double-strand-break repair model for recombination. Cell, 33(1), 25–35.
Takata, Minoru, Masao S. Sasaki, Eiichiro Sonoda, Ciaran Morrison, Mitsumasa Hashimoto, Hiroshi Utsumi, Yuko Yamaguchi‐Iwai, Akira Shinohara, and Shunichi Takeda. "Homologous recombination and non‐homologous end‐joining pathways of DNA double‐strand break repair have overlapping roles in the maintenance of chromosomal integrity in vertebrate cells." The EMBO journal 17, no. 18 (1998): 5497-5508.
Tan, T. H. P., Pach, R., Crausaz, A., Ivens, A., & Schneider, A. (2002). tRNAs in Trypanosoma brucei: genomic organization, expression, and mitochondrial import. Molecular and Cellular Biology, 22(11), 3707–3717.
Tanenbaum, M. E., Gilbert, L. A., Qi, L. S., Weissman, J. S., & Vale, R. D. (2014). A protein-tagging system for signal amplification in gene expression and fluorescence imaging. Cell, 159(3), 635–646.
Tarassov, I., Entelis, N., & Martin, R. P. (1995). An Intact Protein Translocating Machinery is Required for Mitochondrial Import of a Yeast Cytoplasmic tRNA. Journal of Molecular Biology, 245(4), 315–323.
Tarassov, I., Entelis, N., & Martin, R. P. (1995). Mitochondrial import of a cytoplasmic lysine‐tRNA in yeast is mediated by cooperation of cytoplasmic and mitochondrial lysyl‐tRNA synthetases. The EMBO Journal, 14(14), 3461–3471.
Thomas, M. C., & Chiang, C.-M. (2006). The General Transcription Machinery and General Cofactors. Critical Reviews in Biochemistry and Molecular Biology, 41(3), 105–178.
Thyssen, G., Svab, Z., & Maliga, P. (2012). Cell-to-cell movement of plastids in plants. Proceedings of the National Academy of Sciences of the United States of America, 109(7), 2439–2443.
Tsuchihashi, Z., & Brown, P. O. (1994). DNA strand exchange and selective DNA annealing promoted by the human immunodeficiency virus type 1 nucleocapsid protein. Journal of Virology, 68(9), 5863–5870.
Turowski, T. W., & Tollervey, D. (2016). Transcription by RNA polymerase III: insights into mechanism and regulation. Biochemical Society Transactions, 44(5), 1367–1375.
Verma, I. M. (1975). Studies on reverse transcriptase of RNA tumor viruses III. Properties of purified Moloney murine leukemia virus DNA polymerase and associated RNase H. Journal of Virology, 15(4), 843–854.

Verner, K. (1993). Co-translational protein import into mitochondria: an alternative view. Trends in Biochemical Sciences, 18(10), 366–371.
Vincze, T., Posfai, J., & Roberts, R. J. (2003). NEBcutter: a program to cleave DNA with restriction enzymes. Nucleic Acids Research, 31(13), 3688–3691.
Waheed, M. T., Ismail, H., Gottschamel, J., Mirza, B., & Lössl, A. G. (2015). Plastids: The Green Frontiers for Vaccine Production. Frontiers in Plant Science, 6, 1005.
Webber, B. L., Raghu, S., & Edwards, O. R. (2015). Opinion: Is CRISPR-based gene drive a biocontrol silver bullet or global conservation threat? Proceedings of the National Academy of Sciences of the United States of America, 112(34), 10565–10567.
Weis, B. L., Schleiff, E., & Zerges, W. (2013). Protein targeting to subcellular organelles via mRNA localization. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research, 1833(2), 260–273.
Williamson, C. L., Slocum, R. D., Lee, G., Lee, K. H., Kim, S., Cheong, G.-W., & Hwang, I. (1994). Molecular Cloning and Characterization of the pyrB1 and pyrB2 Genes Encoding Aspartate Transcarbamoylase in Pea (Pisum sativum L.). Plant Physiology, 105(1), 377–384.
Willis, I. M. (1994). RNA polymerase III. In EJB Reviews 1993 (pp. 29–39). Berlin, Heidelberg: Springer Berlin Heidelberg.
Wood, A. J., Lo, T.-W., Zeitler, B., Pickle, C. S., Ralston, E. J., Lee, A. H., Amora, R., et al. (2011). Targeted genome editing across species using ZFNs and TALENs. Science, 333(6040), 307.
Xu, L., Zhao, L., Gao, Y., Xu, J., & Han, R. (2017). Empower multiplex cell and tissue-specific CRISPR-mediated gene manipulation with self-cleaving ribozymes and tRNA. Nucleic Acids Research, 45(5), e28.
Yanisch-Perron, C., Vieira, J., & Messing, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33(1), 103–119.
Yoder, J. A., Walsh, C. P., & Bestor, T. H. (1997). Cytosine methylation and the ecology of intragenomic parasites. Trends in Genetics, 13(8), 335–340.
Yu, Q., Lutz, K. A., & Maliga, P. (2017). Efficient Plastid Transformation in Arabidopsis. Plant Physiology, 175(1), 186–193.
Zetsche, B., Gootenberg, J. S., Abudayyeh, O. O., Slaymaker, I. M., Makarova, K. S., Essletzbichler, P., Volz, S. E., et al. (2015). Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell, 163(3), 759–771.
Zetsche, B., Heidenreich, M., Mohanraju, P., Fedorova, I., Kneppers, J., DeGennaro, E. M., Winblad, N., et al. (2017). Multiplex gene editing by CRISPR–Cpf1 using a single crRNA array. Nature Biotechnology, 35(1), 31.

Zhang, J., Khan, S. A., Hasse, C., Ruf, S., Heckel, D. G., & Bock, R. (2015). Full crop protection from an insect pest by expression of long double-stranded RNAs in plastids. Science, 347(6225), 991–994.
Zhao, Y., Dai, Z., Liang, Y., Yin, M., Ma, K., He, M., Ouyang, H., & Teng, C.-B. (2014). Sequence-specific inhibition of microRNA via CRISPR/CRISPRi system. Scientific Reports, 4, 3943.

APPENDICES

APPENDIX A: Methods and Compositions for Modification of Plastid Genomes

The abstract and claims of “METHODS AND COMPOSITIONS FOR MODIFICATION OF PLASTID GENOMES,” which was produced as an outcome of this work, are shown below. Full details and text are available at: https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2018045321

METHODS AND COMPOSITIONS FOR MODIFICATION OF PLASTID GENOMES

ABSTRACT

The invention relates to methods of modifying a plastid genome using sequence specific nucleases, as well as reverse transcriptase polypeptides and plastid modification cassettes. The invention further relates to methods of modifying a plastid or a mitochondrial genome using ATP-dependent DNA ligase D (LigD) and DNA-binding protein Ku (Ku). Also included are the plants, plant cells, and seeds produced by these methods.

THAT WHICH IS CLAIMED IS:

1. A method of modifying a plastid genome of a plant cell, comprising introducing into a plant cell: (a) a polynucleotide encoding a reverse transcriptase (RT) polypeptide fused to a plastid transit peptide; (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localization sequence operably located 5’ of the first recognition site; and (c) a polynucleotide encoding a sequence-specific nuclease fused to a plastid transit peptide, thereby modifying the plastid genome of said plant cell.

2. A method of modifying a plastid genome of a plant cell, comprising introducing into a plant cell a polynucleotide encoding an ATP-dependent DNA ligase D (LigD) fused to a plastid transit peptide and a polynucleotide encoding a DNA-binding protein Ku (Ku) fused to a plastid transit peptide, thereby modifying the plastid genome of said plant cell.

3. The method of claim 2, further comprising introducing into the plant cell a polynucleotide encoding a sequence-specific nuclease fused to a plastid transit peptide.

4. A method of modifying a plastid genome of a plant cell, comprising: introducing into a plant cell (a) a recombinant nucleic acid linked to a plastid localization sequence and comprising a polynucleotide encoding a reverse transcriptase polypeptide and a sequence-specific nuclease; and (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localizing sequence operably located 5’ of the first recognition site, thereby modifying the plastid genome of the plant cell.

5. The method of claim 3 or claim 4, wherein the sequence-specific nuclease is a Cas9 nuclease or a Cpf1 nuclease, the method further comprising introducing a guide nucleic acid linked to a plastid localization sequence.

6. The method of claim 3 or claim 4, wherein the sequence-specific nuclease is a Cpf1 nuclease, transcription activator-like (TAL) effector nuclease (TALEN), a zinc-finger nuclease (ZFN), and/or a meganuclease.

7. A method of producing a plant cell having a modified plastid genome, comprising introducing into a plant cell: (a) a polynucleotide encoding a reverse transcriptase (RT) polypeptide fused to a plastid transit peptide; (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localization sequence operably located 5’ of the first recognition site; and (c) a polynucleotide encoding a sequence-specific nuclease fused to a plastid transit peptide, thereby producing a plant cell having a modified plastid genome.

8. A method of producing a plant cell having a modified plastid genome, comprising introducing into a plant cell a polynucleotide encoding an ATP-dependent DNA ligase D (LigD) fused to a plastid transit peptide and a polynucleotide encoding a DNA-binding protein Ku (Ku) fused to a plastid transit peptide, thereby modifying the plastid genome of said plant cell.

9. The method of claim 8, further comprising introducing into said plant cell a sequence-specific nuclease fused to a plastid transit peptide.

10. A method of producing a plant cell having a modified plastid genome, comprising: introducing into a plant cell (a) a recombinant nucleic acid linked to a plastid localization sequence and comprising a polynucleotide encoding a reverse transcriptase polypeptide and a sequence-specific nuclease; and (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localizing sequence operably located 5’ of the first recognition site, thereby modifying the plastid genome of the plant cell.

11. The method of claim 9 or claim 10, wherein the sequence-specific nuclease is a Cas9 nuclease or a Cpf1 nuclease, the method further comprising introducing a guide nucleic acid linked to a plastid localization sequence.

12. The method of claim 10, wherein the sequence-specific nuclease is a Cpf1 nuclease, transcription activator-like (TAL) effector nuclease (TALEN), a zinc-finger nuclease (ZFN), and/or a meganuclease.

13. A method of expressing a polynucleotide sequence of interest (POI) in a plastid, comprising introducing into a plant cell: (a) a polynucleotide encoding a reverse transcriptase polypeptide fused to a plastid transit peptide and a sequence-specific nuclease fused to a plastid transit peptide, or a polynucleotide linked to a plastid localization sequence and encoding a reverse transcriptase polypeptide and a sequence-specific nuclease; and (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localization sequence operably located 5’ of the first recognition site, wherein said plastid modification cassette comprises a POI, thereby expressing the POI in a plastid.

14. A method of transforming a plastid genome, comprising: introducing into a plant cell: (a) a polynucleotide encoding a reverse transcriptase polypeptide fused to a plastid transit peptide and a sequence-specific nuclease fused to a plastid transit peptide, or a polynucleotide linked to a plastid localization sequence and encoding a reverse transcriptase polypeptide and a sequence-specific nuclease; and (b) a recombinant nucleic acid comprising (i) a plastid modification cassette, (ii) a first recognition site located immediately 5’ of the plastid modification cassette, (iii) a second recognition site located immediately 3’ of the plastid modification cassette, and (iv) a plastid localization sequence operably located 5’ of the first recognition site, wherein said plastid modification cassette comprises a POI, thereby transforming said plastid genome.

15. The method of any one of claims 1, 3, 7, 9, 13 or 14, wherein introducing a sequence-specific nuclease comprises introducing: a polynucleotide encoding a Cpf1 nuclease fused to a plastid transit peptide or encoding a Cas9 nuclease fused to a plastid transit peptide; and a guide nucleic acid linked to a plastid localization sequence, thereby modifying the plastid genome of said plant cell.

16. The method of any one of claims 1, 3, 7, 9, 13 or 14, wherein introducing a sequence-specific nuclease comprises introducing: a polynucleotide encoding a transcription activator-like (TAL) effector nuclease (TALEN) fused to a plastid transit peptide, wherein the TALEN comprises a TAL effector DNA-binding domain fused to a DNA cleavage domain.

17. The method of any one of claims 1, 3, 7, 9, 13 or 14, wherein introducing a sequence-specific nuclease comprises introducing: a polynucleotide encoding a zinc-finger nuclease (ZFN) fused to a plastid transit peptide, wherein the ZFN comprises a zinc finger DNA-binding domain fused to a DNA-cleavage domain.

18. The method of any one of claims 1, 3, 7, 9, 13 or 14, wherein introducing a sequence-specific nuclease comprises introducing: a polynucleotide encoding a meganuclease fused to a plastid transit peptide.

19. The method of any one of claims 1 to 18, wherein the polynucleotide encoding a sequence-specific nuclease, the polynucleotide encoding a RT polypeptide, the polynucleotide encoding LigD, and/or the polynucleotide encoding Ku are operably linked to one or more promoters and optionally, operably linked to one or more terminators.

20. The method of any one of claims 1, 4, 5, 7, 10, 11, or 15, wherein the recombinant nucleic acid and/or the guide nucleic acid are each operably linked to a promoter and optionally, operably linked to a terminator.

21. The method of any one of claims 5, 11, 15 or 20, wherein the guide nucleic acid comprises a recombinant CRISPR array or a recombinant CRISPR array and a recombinant trans-activating CRISPR (tracr) nucleic acid.

22. The method of claim 21, wherein the recombinant CRISPR array and the recombinant tracr nucleic acid are fused to form a single guide (sg) nucleic acid.

23. The method of claim 21 or claim 22, wherein the recombinant CRISPR array comprises at least one CRISPR spacer-repeat nucleic acid comprising: (a) a spacer sequence comprising a 5’ end and a 3’ end; and (b) a Type II CRISPR repeat sequence or a Type V repeat sequence, comprising a 5’ end and a 3’ end, wherein the spacer sequence is linked at its 3’end to the 5’ end of the repeat.

24. The method of any one of claims 1 to 3, 7 to 9, or 13 to 21, wherein the plastid transit peptide is a transit peptide from ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit (rbcS), chlorophyll a/b binding protein, biotin carboxyl carrier protein, ferredoxin-dependent glutamate synthase 2 and/or protochlorophyllide A.

25. The method of any one of claims 1, 4, 5 to 7, 10 to 12, or 13 to 21, wherein the plastid localization sequence is an Eggplant Latent Viroid non-coding RNA sequence, an Avsunviroidae family non-coding RNA sequence, an Avocado sunblotch viroid (ASBVd) non-coding RNA sequence, a Peach latent mosaic viroid (PLMVd) non-coding RNA sequence, a Chrysanthemum chlorotic mottle viroid (CChMVd) non-coding RNA sequence, a (eIF4E) eukaryotic initiation factor 4E, and/or any combination thereof.

26. The method of any one of claims 1, 4 to 7, 10 to 21, wherein the plastid modification cassette comprises a first homology arm and a second homology arm, optionally wherein the plastid modification cassette further comprises an intervening synthetic nucleotide sequence up to about 10 kb in size located between the first and second homology arms.

27. The method of any one of claims 1 to 26, further comprising regenerating a plant from said plant cell having a modified plastid genome.

28. A plant produced by the method of claim 27.

29. A seed produced from the plant of claim 28.

30. A crop comprising a plurality of the plants of Claim 28 planted together in an agricultural field, a golf course, a residential lawn, a road side, an athletic field, and/or a recreational field.

31. A product produced from the plant of claim 28, the seed of claim 29, or the crop of claim 30.

APPENDIX B: Vector Sequence and Synthetic Constructs

Table B1 | Vectors used in this study. Vectors used for cloning genes or for transgene expression in plants are listed below.

Vectors used in this study

Name             GenBank accession   Reference
PC-GW–Kan        KP826769.1          (Dalal et al. 2015)
PC-GW–Hyg        KP826770.1          (Dalal et al. 2015)
PC-GW–mCherry    KP826771.1          (Dalal et al. 2015)
PC-GW–EGFP       KP826772.1          (Dalal et al. 2015)
PC-GW–BAR        KP826773.1          (Dalal et al. 2015)
pUC19            L09137.2            (Yanisch-Perron et al. 1985)
pDONR221         N/A                 (“Gateway® pDONR™ Vectors” 2012)

Synthetic Constructs

Sequences of the constructs used in this work are shown below. Critical sequence features are highlighted in different colors, which correspond to the named features in the key preceding each sequence.

EcoRI-NotI-rbcS-GGGS polylinker-SacI-hsCas9-SphI-GGGS polylinker-6x HIS tag-AscI-HindIII

GaattcGCGGCCGCatggcttcctctatgctctcttccgctactatggttgcctctccggctcaggccactatggtcgctcctttcaacggacttaagtcctccgctgccttcccagccac ccgcaaggctaacaacgacattacttccatcacaagcaacggcggaagagttaactgcggtggaggcggttcagagctcggtaccgacaagaagtacagcatcggcctggacatcg gcaccaactctgtgggctgggccgtgatcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccggcacagcatcaagaagaacct gatcggagccctgctgttcgacagcggcgaaacagccgaggccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgctatct gcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacagactggaagagtccttcctggtggaagaggataagaagcacgagcggcacccca tcttcggcaacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaagaaactggtggacagcaccgacaaggccgacctgcggctg atctatctggccctggcccacatgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgacgtggacaagctgttcatccagctggtgc agacctacaaccagctgttcgaggaaaaccccatcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcagacggctggaaaatctg atcgcccagctgcccggcgagaagaagaatggcctgttcggaaacctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgaggat gccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcccagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtc cgacgccatcctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctctatgatcaagagatacgacgagcaccaccaggacctga ccctgctgaaagctctcgtgcggcagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgccggctacattgacggcggagccagcc aggaagagttctacaagttcatcaagcccatcctggaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcggaagcagcggac cttcgacaacggcagcatcccccaccagatccacctgggagagctgcacgccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagat cgagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagcagattcgcctggatgaccagaaagagcgaggaaaccatcaccccct ggaacttcgaggaagtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataagaacctgcccaacgagaaggtgctgcccaagcac agcctgctgtacgagtacttcaccgtgtataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctgagcggcgagcagaaaaaggc catcgtggacctgctgttcaagaccaaccggaaagtgaccgtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatctccggcgtg 
gaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaattatcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgt gctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcccacctgttcgacgacaaagtgatgaagcagctgaagcggcggagat acaccggctggggcaggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctggatttcctgaagtccgacggcttcgccaacag aaacttcatgcagctgatccacgacgacagcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctgcacgagcacattgccaatct ggccggcagccccgccattaagaagggcatcctgcagacagtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtgatcga aatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagagaatgaagcggatcgaagagggcatcaaagagctgggcagccagatcctg aaagaacaccccgtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgggatatgtacgtggaccaggaactggacatcaaccg gctgtccgactacgatgtggaccatatcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagcgacaagaaccggggcaagagcg acaacgtgccctccgaagaggtcgtgaagaagatgaagaactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctgaccaaggcc gagagaggcggcctgagcgaactggataaggccggcttcatcaagagacagctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccgg atgaacactaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtccaagctggtgtccgatttccggaaggatttccagttttacaaagtg cgcgagatcaacaactaccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaagtaccctaagctggaaagcgagttcgtgtacgg cgactacaaggtgtacgacgtgcggaagatgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagcaacatcatgaactttttcaag accgagattaccctggccaacggcgagatccggaagcggcctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggattttgccaccg tgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtgcagacaggcggcttcagcaaagagtctatcctgcccaagaggaacagcgataag ctgatcgccagaaagaaggactgggaccctaagaagtacggcggcttcgacagccccaccgtggcctattctgtgctggtggtggccaaagtggaaaagggcaagtc caagaaactgaagagtgtgaaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgactttctggaagccaagggctacaaagaagt gaaaaaggacctgatcatcaagctgcctaagtactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccggcgaactgcagaagggaaacgaac 
tggccctgccctccaaatatgtgaacttcctgtacctggccagccactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaacagcac aagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtgatcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgg gataagcccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcccctgccgccttcaagtactttgacaccaccatcgaccggaaga ggtacaccagcaccaaagaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatcgacctgtctcagctgggaggcgacgtcgac ctgcaggcatgcggtggaggcggttcacatcatcaccatcaccactaaGGCGCGCCaagctt

Figure B1 | cpCas9. A synthetic derivative of Cas9 codon optimized for expression in Homo sapiens (KM099237.1) was cloned by PCR, restriction digestion, and ligation into a synthetic gene fragment featuring an N-terminal rbcS plastid transit peptide and a C-terminal 6x HIS affinity tag.
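The feature key for cpCas9 can be checked against the sequence termini with a few lines of Python. The sketch below is illustrative only: the two excerpts are copied from the 5’ and 3’ ends of the sequence shown with Figure B1 (the interior is omitted), the recognition sequences are the standard sites for the named enzymes, and the function names and the reduced codon table are assumptions introduced here. It also assumes the 3’ excerpt begins in frame with the linker.

```python
# Standard recognition sequences for the enzymes named in the feature key.
SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC", "AscI": "GGCGCGCC", "HindIII": "AAGCTT"}

# Reduced codon table: only the codons that occur in the 3' excerpt below.
CODONS = {"GGT": "G", "GGA": "G", "GGC": "G", "TCA": "S",
          "CAT": "H", "CAC": "H", "TAA": "*"}

# Termini copied from the cpCas9 sequence; the interior is omitted.
FIVE_PRIME = "GaattcGCGGCCGCatggcttcc"
THREE_PRIME = "ggtggaggcggttcacatcatcaccatcaccactaaGGCGCGCCaagctt"


def starts_with_sites(seq, names):
    """Return True if seq begins with the named sites placed back to back."""
    expected = "".join(SITES[name] for name in names)
    return seq.upper().startswith(expected)


def ends_with_sites(seq, names):
    """Return True if seq ends with the named sites placed back to back."""
    expected = "".join(SITES[name] for name in names)
    return seq.upper().endswith(expected)


def translate(dna):
    """Translate an in-frame DNA string using the reduced codon table."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


# Coding part of the 3' excerpt: everything before the AscI and HindIII sites.
TAIL_CODING = THREE_PRIME[:-len(SITES["AscI"] + SITES["HindIII"])]

if __name__ == "__main__":
    print(starts_with_sites(FIVE_PRIME, ["EcoRI", "NotI"]))
    print(ends_with_sites(THREE_PRIME, ["AscI", "HindIII"]))
    print(translate(TAIL_CODING))
```

For these excerpts, both site checks pass, and the coding tail translates to four glycines and a serine, followed by six histidines and a stop codon.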

I-CEU-I-HindIII-Nospro-Kozak-rbcS-NCP10-Nosterm- BamHI- I-CEU-I

GGGGGGTAACTATAACGGTCCTAAGGTAGCGAAAATTAAAAGCTTGATCATGAGCGGAGAATTAAGG GAGTCACGTTATGACCCCCGCCGATGACGCGGGACAAGCCGTTTTACGTTTGGAACTGACAGAACCGC AACGTTGAAGGAGCCACTCAGCCGCGGGTTTCTGGAGTTTAATGAGCTAAGCACATACGTCAGAAACC ATTATTGCGCGTTCAAAAGTCGCCTAAGGTCACTATCAGCTAGCAAATATTTCTTGTCAAAAATGCTCC ACTGACGTTCCATAAATTCCCCTCGGTATCCAATTAGAGTCTCATATTCACTCTCAATCCAAATAATCT GCACCGGATCTGGATCGTTTCGCGGCCGCCGCCATGGCTTCCTCTATGCTCTCTTCCGCTACTATGGTT GCCTCTCCGGCTCAGGCCACTATGGTCGCTCCTTTCAACGGACTTAAGTCCTCCGCTGCCTTCCCAGCC ACCCGCAAGGCTAACAACGACATTACTTCCATCACAAGCAACGGCGGAAGAGTTAACTGCGCTACCGT TGTAAGCGGCCAAAAGCAAGACCGACAGGGGGGGGAGAGAAGGCGTAGTCAGTTGGACAGAGACCA ATGCGCCTATTGTAAAGAAAAAGGCCACTGGGCTAAAGATTGCCCTAAAAAGCCGCGTGGTCCACGTG GACCGCGACCCCAAACATCTTTATTGTAAACCAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAAT AAAGTTTCTTAAGATTGAATCCTGTTGCCGGTCTTGCGATGATTATCATATAATTTCTGTTGAATTACGT TAAGCATGTAATAATTAACATGTAATGCATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCC GCAATTATACATTTAATACGCGATAGAAAACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCG CGGTGTCATCTATGTTACTAGATCGGGAATTGGATCCGGGGGGTAACTATAACGGTCCTAAGGTAGCG AAGGGGGG

Figure B2 | cpNCP10. A synthetic gene corresponding to the NCP10 protein of M-MLV (NP_955586.1) was codon optimized for expression in N. benthamiana. The synthetic protein was constructed to feature a fused N-terminal rbcS transit peptide. Expression is driven by a nopaline synthase promoter and terminator (KX648714.1), and cloning into binary vectors is achieved by inclusion of I-CEU-I homing endonuclease sites.

EcoRI-attL1-stuffer-NotI-rbcS-RT-3X-flag-NotI-stuffer- attL2- EcoRI gaattccaaataatgattttattttgactgatagtgacctgttcgttgcaacaaattgatgagcaatgcttttttataatgccaactttgtacaaaaaagcaggctcgccggccgc caccttcggcgcggccgcatggcttcctctatgctctcttccgctactatggttgcctctccggctcaggccactatggtcgctcctttcaacggacttaagtcctccgctgc cttcccagccacccgcaaggctaacaacgacattacttccatcacaagcaacggcggaagagttaactgcatgttgaatattgaagacgagcatagattacatgagaca tctaaggagccagatgtttctttgggcagtacatggttatcagattttccacaagcatgggccgagacaggtggaatggggcttgcggttcgacaagcgccgctcataatc cctctaaaagctacttctactccagtctctattaagcagtatccgatgtcacaagaagccagactcggcattaaaccacatatccaaagattgctggatcagggaatcttggt cccctgtcaaagtccctggaacacacctttattgccggtcaagaagcctgggacaaacgattatcgtccagtgcaagatcttcgagaagttaacaaaagggttgaggatat ccatccaactgtaccaaatccttataacttactctccggcttacctcctagtcatcaatggtatacggttttggacctcaaagatgcctttttttgtctacgacttcatcctacctcc cagcctttgtttgcattcgagtggagggatccagaaatgggtataagtggccaattaacttggacgaggttgcctcaaggatttaaaaactctccgactttattcgatgaagct ttgcacagagatctcgctgattttcgtatccaacatccagatcttattcttctacagtatgttgacgatttgttactagctgcaacaagtgaactggattgccaacaaggaacaa gagctctattgcagaccctaggaaaccttggttacagagcatcggcaaagaaagctcaaatatgtcagaagcaggtgaagtatttgggatacttacttaaggaggggcaa agatggcttaccgaagcgagaaaagaaacggttatgggacagccgacaccgaagacgccacgtcaactcagagagttcttgggtactgcgggtttctgtaggctttgga ttcctggattcgcagagatggcggctccgttatacccattgactaaaacggggacgctgtttaactggggtcccgatcaacagaaagcataccaagaaataaaacaagcc ctgctaactgctcccgcgctgggactccctgatttaacgaagcccttcgaactgtttgtcgacgaaaagcagggttatgctaagggtgttctgacacagaaattaggtcctt ggcgtaggcctgttgcatatttgtcgaaaaaattggatccagtggctgctggttggccgccatgtctccgtatggtggctgcgatagctgtcttgactaaggatgctggtaaa ttaaccatgggccagcccctcgttatcttggcaccacacgctgtggaagctcttgttaaacaaccaccagatagatggttgtcaaatgctcgtatgactcattatcaagcact cttgttggatacagatagagttcaattcgggccggtggttgctttgaaccctgcaactttgcttcctttacctgaagagggactccagcataattgtctggatatacttgctgag gcacacggtactcgaccggatttgaccgatcaaccattgcctgatgcagaccacacttggtacacggacggttcgtcattacttcaggagggtcaacgaaaggcaggtg 
ctgcagtcacaacagagaccgaggtcatctgggctaaagctcttcctgctggaacatctgcgcaaagagcagagttaattgctcttacacaggccttgaagatggccgaa ggcaaaaagcttaacgtttacacggactcacgttacgcctttgctacagctcacatacatggagaaatttacagaagaagaggactactaacttcagaaggtaaagaaatc aagaataaggacgaaattttggccctccttaaggctctcttcctgccaaagagactttcaattattcactgcccaggacatcaaaaagggcatagcgccgaggccagggg aaatagaatggctgatcaggccgctaggaaggcagcaatcaccgaaacccctgatactagtactttgttaatcatggattacaaggaccacgacggggattacaagga ccacgacattgattacaaggatgatgatgacaagtaagcggccgcaagcgcgctcggccgcggcgacccagctttcttgtacaaagttggcattataagaaagcattg cttatcaatttgttgcaacgaacaggtcactatcagtcaaaataaaatcattatttggaattc

Figure B3 | cpMMLV Reverse Transcriptase. A synthetic gene corresponding to the reverse transcriptase portion of the pol protein of M-MLV (NP_955591.1) was codon optimized for expression in N. benthamiana. The synthetic protein was constructed to feature a fused N-terminal rbcS transit peptide and a C-terminal 3X FLAG affinity tag. Cloning into binary vectors is achieved by inclusion of Gateway 5’ attL1 and 3’ attL2 sites.

PTEC constructs

PTEC constructs were made by gene synthesis and share many features. Cloning into binary vectors is achieved by inclusion of Gateway 5’ attL1 and 3’ attL2 sites. A 5’ ELVd sequence is included to localize the PTEC RNA to the chloroplast, and VRS 1 and VRS 2 were constructed based on the genome sequence of M-MLV (NC_001501.1) and tRNApro in N. benthamiana and A. thaliana (NC_001879.2, NC_000932.1). Homology arms were chosen based on prior publications (Svab and Maliga 1993b) and available plastid genome sequences (NC_001879.2, NC_000932.1). The plastid transgene sequence, including the fused promoter and UTR sequences and the aadA spectinomycin resistance gene, was based on a prior publication (Svab and Maliga 1993b).
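The modular layout described above can be read directly from a construct's feature key. The following sketch is illustrative only: it parses the PTEC 2 feature key given later in this appendix (spacing normalized) into an ordered list and reports which elements sit between each pair of matching restriction sites. The function names are hypothetical, and the code works on the key text alone, not on the nucleotide sequence.

```python
# PTEC 2 feature key, copied from the key preceding the PTEC 2 sequence
# (spacing around hyphens normalized).
PTEC2_KEY = (
    "HindIII-attL1-stuffer 1-AscI-ELVd-VRS 1-AscI-NotI-Left flank seq-"
    "Prrn-5’UTR-aadA-psbA3’UTR-right flank seq-NotI-BglII-VRS 2-BglII-"
    "stuffer 2-attL2-HindIII"
)


def parse_key(key):
    """Split a hyphen-delimited feature key into an ordered list of names."""
    return [part.strip() for part in key.split("-") if part.strip()]


def between(features, site):
    """Return the features enclosed by the two occurrences of a paired site."""
    positions = [i for i, name in enumerate(features) if name == site]
    if len(positions) != 2:
        raise ValueError(f"expected exactly two {site} entries, found {len(positions)}")
    first, second = positions
    return features[first + 1:second]


if __name__ == "__main__":
    features = parse_key(PTEC2_KEY)
    for site in ("AscI", "NotI", "BglII"):
        print(site, "encloses:", ", ".join(between(features, site)))
```

Read this way, the key shows three independently excisable modules: the AscI pair encloses ELVd and VRS 1, the NotI pair encloses the flanking sequences and the aadA expression cassette, and the BglII pair encloses VRS 2.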

Figure B4 | PTEC 1 - N. benthamiana single integration site. A synthetic gene corresponding to a PTEC that integrates between the rbcL and accD genes of the N. tabacum plastid genome is shown above.

HindIII-attL1-stuffer 1-AscI-ELVd-VRS 1- AscI-NotI- RbcL(Left homology arm) - Prrn-5’UTR- SCA1-aadA- SCA1-psbA3’UTR-accD (Right Homology Arm)- NotI- EcoR1- VRS 2- EcoR1-stuffer 2- attL2- HindIII aagcttcaaataatgattttattttgactgatagtgacctgttcgttgcaacaaattgatgagcaatgcttttttataatgccaactttgtacaaaaaa gcaggctcgccggccgccaccttcggcggcgcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacaccaccc tatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaattcgga ggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcggactc atcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccgcgcca gtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcatccgacttgtggtctcgctgttccttgggagg gtctcctctgagtgattgactacccgtcagcgggggtctttcatttggggggcgttccgagaatggtagggatgacaggatttgggcgcgcc gcggccgcCCACAAACAGAGACTAAAGCAAGTGTTGGATTCAAAGCTGGTGTTAAAGAGTACAAATTGA CTTATTATACTCCTGAGTACCAAACCAAGGATACTGATATATTGGCAGCATTCCGAGTAACTCCTCAAC CTGGAGTTCCACCTGAAGAAGCAGGGGCCGCGGTAGCTGCCGAATCTTCTACTGGTACATGGACAACT GTATGGACCGATGGACTTACCAGCCTTGATCGTTACAAAGGGCGATGCTACCGCATCGAGCGTGTTGT TGGAGAAAAAGATCAATATATTGCTTATGTAGCTTACCCTTTAGACCTTTTTGAAGAAGGTTCTGTTAC CAACATGTTTACTTCCATTGTAGGTAACGTATTTGGGTTCAAAGCCCTGCGCGCTCTACGTCTGGAAGA TCTGCGAATCCCTCCTGCTTATGTTAAAACTTTCCAAGGTCCGCCTCATGGGATCCAAGTTGAAAGAGA TAAATTGAACAAGTATGGTCGTCCCCTGTTGGGATGTACTATTAAACCTAAATTGGGGTTATCTGCTAA AAACTACGGTAGAGCTGTTTATGAATGTCTTCGCGGTGGACTTGATTTTACCAAAGATGATGAGAACG TGAACTCACAACCATTTATGCGTTGGAGAGATCGTTTCTTATTTTGTGCCGAAGCACTTTATAAAGCAC AGGCTGAAACAGGTGAAATCAAAGGGCATTACTTGAATGCTACTGCAGGTACATGCGAAGAAATGAT CAAAAGAGCTGTATTTGCTAGAGAATTGGGCGTTCCGATCGTAATGCATGACTACTTAACGGGGGGAT TCACCGCAAATACTAGCTTGGCTCATTATTGCCGAGATAATGGTCTACTTCTTCACATCCACCGTGCAA TGCATGCGGTTATTGATAGACAGAAGAATCATGGTATCCACTTCCGGGTATTAGCAAAAGCGTTACGT ATGTCTGGTGGAGATCATATTCACTCTGGTACCGTAGTAGGTAAACTTGAAGGTGAAAGAGACATAAC TTTGGGCTTTGTTGATTTACTGCGTGATGATTTTGTTGAACAAGATCGAAGTCGCGGTATTTATTTCACT 
CAAGATTGGGTCTCTTTACCAGGTGTTCTACCCGTGGCTTCAGGAGGTATTCACGTTTGGCATATGCCT GCTCTGACCGAGATCTTTGGGGATGATTCCGTACTACAGTTCGGTGGAGGAACTTTAGGACATCCTTG GGGTAATGCGCCAGGTGCCGTAGCTAATCGAGTAGCTCTAGAAGCATGTGTAAAAGCTCGTAATGAAG GACGTGATCTTGCTCAGGAAGGTAATGAAATTATTCGCGAGGCTTGCAAATGGAGCCCGGAACTAGCT GCTGCTTGTGAAGTATGGAAAGAGATCGTATTTAATTTTGCAGCAGTGGACGTTTTGGATAAGTAAAA ACAGTAGACATTAGCAGATAAATTAGCAGGAAATAAAGAAGGATAAGGAGAAAGAACTCAAGTAATT ATCCTTCGTTCTCTTAATTGAATTGCAATTAAACTCGGCCCAATCTTTTACTAAAAGGATTGAGCCGAA TACAACAAAGATTCTATTGCATATATTTTGACTAAGTATATACTTACCTAGATATACAAGATTTGAAAT ACAAAATCTAGAAAACTAAATCAAAATCTAAGACTCAAATCTTTCTATTGTTGTCTTGGATCCACAATT AATCCTACGGATCCTTAGGATTGGTATATTCTTTTCTATCCTGTAGTTTGTAGTTTCCCTGAATCAAGCC AAGTATCACACCTCTTTCTACCCATCCTGTATATTGTGctcccccgccgtcgttcaatgagaatggataagaggctcgt gggattgacgtgagggggcagggatggctatatttctgggagcgaactccgggcgaatacgaagcgcttggatacagttgtagggaggg atttAGTACTatggcagaagcggtgatcgccgaagtatcgactcaactatcagaggtagttggcgtcatcgagcgccatctcgaaccgac gttgctggccgtacatttgtacggctccgcagtggatggcggcctgaagccacacagtgatattgatttgctggttacggtgaccgtaaggct tgatgaaacaacgcggcgagctttgatcaacgaccttttggaaacttcggcttcccctggagagagcgagattctccgcgctgtagaagtca ccattgttgtgcacgacgacatcattccgtggcgttatccagctaagcgcgaactgcaatttggagaatggcagcgcaatgacattcttgcag gtatcttcgagccagccacgatcgacattgatctggctatcttgctgacaaaagcaagagaacatagcgttgccttggtaggtccagcggcg gaggaactctttgatccggttcttgaacaggatctatttgaggcgctaaatgaaaccttaacgctatggaactcgccgcccgactgggctggc gatgagcgaaatgtagtgcttacgttgtcccgcatttggtacagcgcagtaaccggcaaaatcgcgccgaaggatgtcgctgccgactggg caatggagcgcctgccggcccagtatcagcccgtcatacttgaagctagacaggcttatcttggacaagaagaagatcgcttggcctcgcg cgcagatcagttggaagaatttgtccactacgtgaaaggcgagatcaccaaggtagtcggcaaataaAGTACTgatcctggcctagtcta taggaggttttgaaaagaaaggagcaataatcattttcttgttctatcaagagggtgctattgctcctttctttttttctttttatttatttactagtatttta cttacatagacttttttgtttacattatagaaaaagaaggagaggttattttcttgcatttattcatgAGAGGACAAATCTCTTTTTTCGA TGCGAATTTGACACGACATAGGAGAAGCCGCCCTTTATTAAAAATTATATTATTTTAAATAATATAAA

GGGGGTTCCAACATATTAATATATAGTGAAGTGTTCCCCCAGATTCAGAACTTTTTTTCAATACTCACA ATCCTTATTAGTTAATAATCCTAGTGATTGGATTTCTATGCTTAGTCTGATAGGAAATAAGATATTCAA ATAAATAATTTTATAGCGAATGACTATTCATCTATTGTATTTTCATGCAAATAGGGGGCAAGAAAACTC TATGGAAAGATGGTGGTTTAATTCGATGTTGTTTAAGAAGGAGTTCGAACGCAGGTGTGGGCTAAATA AATCAATGGGCAGTCTTGGTCCTATTGAAAATACCAATGAAGATCCAAATCGAAAAGTGAAAAACATT CATAGTTGGAGGAATCGTGACAATTCTAGTTGCAGTAATGTTGATTATTTATTCGGCGTTAAAGACATT CGGAATTTCATCTCTGATGACACTTTTTTAGTTAGTGATAGGAATGGAGACAGTTATTCCATCTATTTT GATATTGAAAATCATATTTTTGAGATTGACAACGATCATTCTTTTCTGAGTGAACTAGAAAGTTCTTTT TATAGTTATCGAAACTCGAATTATCGGAATAATGGATTTAGGGGCGAAGATCCCTACTATAATTCTTAC ATGTATGATACTCAATATAGTTGGAATAATCACATTAATAGTTGCATTGATAGTTATCTTCAGTCTCAA ATCTGTATAGATACTTCCATTATAAGTGGTAGTGAGAATTACGGTGACAGTTACATTTATAGGGCCGTT TGTGGTGGTGAAAGTCGAAATAGTAGTGAAAACGAGGGTTCCAGTAGACGAACTCGCACGAAGGGCA GTGATTTAACTATAAGAGAAAGTTCTAATGATCTCGAGGTAACTCAAAAATACAGGCATTTGTGGGTT CAATGCGAAAATTGTTATGGATTAAATTATAAGAAATTTTTGAAATCAAAAATGAATATTTGTGAACA ATGTGgcggccgcgaattcaaaaaggggggaatgaaagaccccacctgtaggtttggcaagctagcttaagtaacgccattttgcaagg catggaaaaatacataactgagaatagagaagttcagatcaaggtcaggaacagatggaacagctgaatatgggccaaacaggatatctgt ggtaagcagttcctgccccggctcagggccaagaacagatggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctg ccccggctcagggccaagaacagatggtccccagatgcggtccagccctcagcagtttctagagaaccatcagatgtttccagggtgccc caaggacctgaaatgaccctgtgccttatttgaactaaccaatcagttcgcttctcgcttctgttcgcgcgcttctgctccccgagctcaataaa agagcccacaacccctcactcgggtatataaggaagttcatttcatttggagagaacacggctgcaggaaagacggttctagaatccgctta agacctcctaggtccaacgcgttttctactagttacattgagagagcgagacgtcccggtgtgtgtgagagagcagaatcgccggccgcca ccttcggcgcagcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacaccaccctatggggagaggtcgtcctct atctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaattcggaggattcgtcctttaaacgttcct ccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcggactcatcagggaaagtacacacttt ccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccgcgccagtcctccgattgactgagtcgc 
ccgggtacccgtgtatccaataaaccctcttgcagttgcagaattcaagcgcgctcggccgcggcgacccagctttcttgtacaaagttggc attataagaaagcattgcttatcaatttgttgcaacgaacaggtcactatcagtcaaaataaaatcattatttgaagctt

Figure B5 | PTEC 2 – N. benthamiana double integration sites. A synthetic gene corresponding to a PTEC that integrates in the noncoding inverted repeat regions of the N. tabacum plastid genome is shown below.

HindIII-attL1-stuffer 1-AscI-ELVd-VRS 1- AscI-NotI-Left flank seq-Prrn-5’UTR-aadA- psbA3’UTR-right flank seq- NotI-BglII- VRS 2- BglII-stuffer 2- attL2- HindIII

aagcttcaaataatgattttattttgactgatagtgacctgttcgttgcaacaaattgatgagcaatgcttttttataatgccaactttgtacaaaaaa gcaggctcgccggccgccaccttcggcggcgcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacaccaccc tatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaattcgga ggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcggactc atcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccgcgcca gtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcatccgacttgtggtctcgctgttccttgggagg gtctcctctgagtgattgactacccgtcagcgggggtctttcatttggggggcgttccgagaatggtagggatgacaggatttgggcgcgcc gcggccgcaattcaccgccgtatggctgaccggcgattactagcgattccggcttcatgcaggcgagttgcagcctgcaatccgaactgag gacgggtttttggggttagctcaccctcgcgggatcgcgaccctttgtcccggccattgtagcacgtgtgtcgcccagggcataaggggcat gatgacttgacgtcatcctcaccttcctccggcttatcaccggcagtctgttcagggttccaaactcaacgatggcaactaaacacgagggtt gcgctcgttgcgggacttaacccaacaccttacggcacgagctgacgacagccatgcaccacctgtgtccgcgttcccgaaggcacccct ctctttcaagaggattcgcggcatgtcaagccctggtaaggttcttcgctttgcatcgaattaaaccacatgctccaccgcttgtgcgggcccc cgtcaattcctttgagtttcattcttgcgaacgtactccccaggcgggatacttaacgcgttagctacagcactgcacgggtcgatacgcacag cgcctagtatccatcgtttacggctaggactactggggtatctaatcccattcgctcccctagctttcgtctctcagtgtcagtgtcggcccagc agagtgctttcgccgttggtgttctttccgatctctacgcatttcaccgctccaccggaaattccctctgcccctaccgtactccagcttggtagtt tccaccgcctgtccagggttgagccctgggatttgacggcggacttaaaaagccacctacagacgctttacgcccaatcattccggataacg cttgcatcctctgtattaccgcggctgctggcacagagttagccgatgcttattccccagataccgtcattgcttcttctccgggaaaagaagtt cacgacccgtgggccttctacctccacgcggcattgctccgtcaggctttcgcccattgcggaaaattccccactgctgcctcccgtaggag tctgggccgtgtctcagtcccagtgtggctgatcatcctctcggaccagctactgatcatcgccttggtaagctattgcctcaccaactagcta atcagacgcgagcccctcctcgggcggattcctccttttgctcctcagcctacggggtattagcagccgtttccagctgttgttcccctcccaa gggcaggttcttacgcgttactcacccgtccgccactggaaacaccacttcccgtccgacttgcatgtgttaagcatgccgccagcgttcatc 
ctgagccaggatcgaactctccatgagattcatagttgcattacttatagcttccttgttcgtagacaaagcggattcggaattgtctttcattcca aggcataacttgtatccatgcgcttcatattcgcccggagttcgctcccagaaatatagccatccctgccccctcacgtcaatcccacgagcct cttatccattctcattgaacgacggcgggggagctttcgaggcctcgaaatccaactagaaaaactcacattgggcttagggataatcaggct cgaactgatgacttccaccacgtcaaggtgacactctaccgctgagttatatcccttccccgccccatcgagaaatagaactgactaatccta agtcaaagggtcgagaaactcaacgccactattcttgaacaacttggagccgggccttcttttcgcactattacggatatgaaaataatggtca aaatcggattcaattgtcgctcccccgccgtcgttcaatgagaatggataagaggctcgtgggattgacgtgagggggcagggatggctat atttctgggagcgaactccgggcgaatacgaagcgcttggatacagttgtagggagggatttatggcagaagcggtgatcgccgaagtatc gactcaactatcagaggtagttggcgtcatcgagcgccatctcgaaccgacgttgctggccgtacatttgtacggctccgcagtggatggcg gcctgaagccacacagtgatattgatttgctggttacggtgaccgtaaggcttgatgaaacaacgcggcgagctttgatcaacgaccttttgg aaacttcggcttcccctggagagagcgagattctccgcgctgtagaagtcaccattgttgtgcacgacgacatcattccgtggcgttatccag ctaagcgcgaactgcaatttggagaatggcagcgcaatgacattcttgcaggtatcttcgagccagccacgatcgacattgatctggctatct tgctgacaaaagcaagagaacatagcgttgccttggtaggtccagcggcggaggaactctttgatccggttcttgaacaggatctatttgag gcgctaaatgaaaccttaacgctatggaactcgccgcccgactgggctggcgatgagcgaaatgtagtgcttacgttgtcccgcatttggta cagcgcagtaaccggcaaaatcgcgccgaaggatgtcgctgccgactgggcaatggagcgcctgccggcccagtatcagcccgtcata cttgaagctagacaggcttatcttggacaagaagaagatcgcttggcctcgcgcgcagatcagttggaagaatttgtccactacgtgaaagg cgagatcaccaaggtagtcggcaaataagatcctggcctagtctataggaggttttgaaaagaaaggagcaataatcattttcttgttctatca agagggtgctattgctcctttctttttttctttttatttatttactagtattttacttacatagacttttttgtttacattatagaaaaagaaggagaggttat tttcttgcatttattcatgaactgcccctatcggaaataggattgactaccgattccgaaggaactggagttacatctcttttccattcaagagttct tatgcgtttccacgcccctttgagaccccgaaaaatggacaaattccttttcttaggaacacatacaagattcgtcactacaaaaaggataatg gtaaccctaccattaactacttcatttatgaatttcatagtaatagaaatacatgtcctaccgagacagaatttggaacttgctatcctcttgcctag caggcaaagatttacctccgtggaaaggatgattcattcggatcgacatgagagtccaactacattgccagaatccatgttgtatatttgaaag 
aggttgacctccttgcttctctcatggtacactcctcttcccgccgagccccttttctcctcggtccacagagacaaaatgtaggactggtgcca

acaattcatcagactcactaagtcgggatcactaactaatactaatctaatataatagtctaatatatctaatataatagaaaatactaatataatag aaaagaactgtcttttctgtatactttccccggttccgttgctaccgcgggctttacgcaatcgatcggattagatagatatcccttcaacataggt catcgaaaggatctcggagacccaccaaagtacgaaagccaggatctttcagaaaacggattcctattcaaagagtgcataaccgcatgga taagctcacactaacccgtcaatttgggatccaaattcgagattttccttgggaggtatcgggaaggatttggaatggaataatatcgattcata cagaagaaaaggttctctattgattcaaacactgtacctaacctatgggatagggatcgaggaaggggaaaaaccgaagatttcacatggta cttttatcaatctgatttatttcgtacctttcgttcaatgagaaaatgggtcaaattctacaggatcaaacctatgggacttaaggaatgatataaaa aaaagagagggaaaatattcatattaaataaatatgaagtagaagaacccagattccaaatgaacaaattcaaacttgaaaaggatcttcctta ttcttgaagaatgaggggcaaagggattgatcaagaaagatcgcggccgcagatctaaaaaggggggaatgaaagaccccacctgtagg tttggcaagctagcttaagtaacgccattttgcaaggcatggaaaaatacataactgagaatagagaagttcagatcaaggtcaggaacaga tggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccggctcagggccaagaacagatggaacagctgaata tgggccaaacaggatatctgtggtaagcagttcctgccccggctcagggccaagaacagatggtccccagatgcggtccagccctcagca gtttctagagaaccatcagatgtttccagggtgccccaaggacctgaaatgaccctgtgccttatttgaactaaccaatcagttcgcttctcgct tctgttcgcgcgcttctgctccccgagctcaataaaagagcccacaacccctcactcgggtatataaggaagttcatttcatttggagagaaca cggctgcaggaaagacggttctagaatccgcttaagacctcctaggtccaacgcgttttctactagttacattgagagagcgagacgtcccg gtgtgtgtgagagagcagaatcgccggccgccaccttcggcgcagcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtg gcacacaccaccctatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgg gactttaaattcggaggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgttt cgtcctttcggactcatcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggat ttgttcccgcgccagtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcaagatctaagcgcgctcg gccgcggcgacccagctttcttgtacaaagttggcattataagaaagcattgcttatcaatttgttgcaacgaacaggtcactatcagtcaaaat aaaatcattatttgaagctt

Figure B6 | PTEC 3 – A. thaliana single integration site. A synthetic gene corresponding to a PTEC that integrates between the rbcL and accD genes of the A. thaliana plastid genome is shown below.

5’-EcoRI-attL1-stuffer 1-AscI-ELVd-VRS 1-AscI-NotI-rbcL (left homology arm)-Prrn-5’UTR-ScaI-aadA-ScaI-psbA3’UTR-right flank seq-NotI-XhoI-VRS 2-XhoI-stuffer 2-attL2-EcoRI-3’

GAATTCcaaataatgattttattttgactgatagtgacctgttcgttgcaacaaattgatgagcaatgcttttttataatgccaactttgtacaa aaaagcaggctcgccggccgccaccttcggcggcgcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacacc accctatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaatt cggaggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcg gactcatcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccgc gccagtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcatccgacttgtggtctcgctgttccttgg gagggtctcctctgagtgattgactacccgtcagcgggggtctttcatttggggggcgttccgagaatggtagggatgacaggatttgggcg cgccgcggccgcATGTCACCACAAACAGAGACTAAAGCAAGTGTTGGGTTCAAAGCTGGTGTTAAAGAGT ATAAATTGACTTACTATACTCCTGAATATGAAACCAAGGATACTGATATCTTGGCAGCATTCCGAGTA ACTCCTCAACCTGGAGTTCCACCTGAAGAAGCAGGGGCTGCGGTAGCTGCTGAATCTTCTACTGGTAC ATGGACAACTGTGTGGACCGATGGGCTTACCAGCCTTGATCGTTACAAAGGACGATGCTACCACATCG AGCCCGTTCCAGGAGAAGAAACTCAATTTATTGCGTATGTAGCTTATCCCTTAGACCTTTTTGAAGAAG GTTCGGTTACTAACATGTTTACCTCGATTGTGGGTAATGTATTTGGGTTCAAAGCCCTGGCTGCTCTAC GTCTAGAGGATCTGCGAATCCCTCCTGCTTATACTAAAACTTTCCAAGGACCACCTCATGGTATCCAAG TTGAAAGAGATAAATTGAACAAGTATGGACGTCCCCTATTAGGATGTACTATTAAACCAAAATTGGGG TTATCCGCGAAAAACTATGGTAGAGCAGTTTATGAATGTCTACGTGGTGGACTTGATTTTACCAAAGA TGATGAGAATGTGAACTCCCAACCATTTATGCGTTGGAGAGACCGTTTCTTATTTTGTGCCGAAGCTAT TTATAAATCACAGGCTGAAACAGGTGAAATCAAAGGGCATTATTTGAATGCTACTGCGGGTACATGCG AAGAAATGATCAAAAGAGCTGTATTTGCCAGAGAATTGGGAGTTCCTATCGTAATGCATGACTACTTA ACAGGGGGATTCACCGCAAATACTAGTTTGTCTCATTATTGCCGAGATAATGGCCTACTTCTTCACATC CACCGTGCAATGCACGCTGTTATTGATAGACAGAAGAATCATGGTATGCACTTCCGTGTACTAGCTAA AGCTTTACGTCTATCTGGTGGAGATCATATTCACGCGGGTACAGTAGTAGGTAAACTTGAAGGAGACA GGGAGTCAACTTTGGGCTTTGTTGATTTACTGCGCGATGATTATGTTGAAAAAGATCGAAGCCGCGGT ATCTTTTTCACTCAAGATTGGGTCTCACTACCTGGTGTTCTGCCTGTGGCTTCAGGGGGTATTCACGTTT GGCATATGCCTGCTTTGACCGAGATCTTTGGAGATGATTCTGTACTACAATTCGGTGGAGGAACTTTAG GCCACCCTTGGGGAAATGCACCGGGTGCCGTAGCCAACCGAGTAGCTCTGGAAGCATGTGTACAAGCT 
CGTAATGAGGGACGTGATCTTGCAGTCGAGGGTAATGAAATTATCCGTGAAGCTTGCAAATGGAGTCC TGAACTAGCTGCTGCTTGTGAAGTATGGAAAGAGATCACATTTAACTTCCCAACCATCGATAAATTAG ATGGCCAAGAGTAGATGAATTAGATTTAGTAATTCACGTTTGTTTTATTAGTTTAATTGCACTCGGCTC AATCTTTTTTTTACTAAAAAAGATTGAGCCGAGGTTATCTGTTGTATATACTATTTTTTTTGATAGATAC ATACTTAAATTTAGATAGAAAAAAAACTCTTCAATAAAAAAAAGAAGATTAAACACAACTACAATTTT GTTATTGTAGTGTTGTGTGctcccccgccgtcgttcaatgagaatggataagaggctcgtgggattgacgtgagggggcaggga tggctatatttctgggagcgaactccgggcgaatacgaagcgcttggatacagttgtagggagggatttAGTACTatggcagaagcggt gatcgccgaagtatcgactcaactatcagaggtagttggcgtcatcgagcgccatctcgaaccgacgttgctggccgtacatttgtacggct ccgcagtggatggcggcctgaagccacacagtgatattgatttgctggttacggtgaccgtaaggcttgatgaaacaacgcggcgagcttt gatcaacgaccttttggaaacttcggcttcccctggagagagcgagattctccgcgctgtagaagtcaccattgttgtgcacgacgacatcat tccgtggcgttatccagctaagcgcgaactgcaatttggagaatggcagcgcaatgacattcttgcaggtatcttcgagccagccacgatcg acattgatctggctatcttgctgacaaaagcaagagaacatagcgttgccttggtaggtccagcggcggaggaactctttgatccggttcttg aacaggatctatttgaggcgctaaatgaaaccttaacgctatggaactcgccgcccgactgggctggcgatgagcgaaatgtagtgcttacg ttgtcccgcatttggtacagcgcagtaaccggcaaaatcgcgccgaaggatgtcgctgccgactgggcaatggagcgcctgccggccca gtatcagcccgtcatacttgaagctagacaggcttatcttggacaagaagaagatcgcttggcctcgcgcgcagatcagttggaagaatttgt ccactacgtgaaaggcgagatcaccaaggtagtcggcaaataaAGTACTgatcctggcctagtctataggaggttttgaaaagaaagga gcaataatcattttcttgttctatcaagagggtgctattgctcctttctttttttctttttatttatttactagtattttacttacatagacttttttgtttacatta tagaaaaagaaggagaggttattttcttgcatttattcatgAGTGTTATATTCTTTCGTGTCAGGGCTTGAACCAAGTATC CCCGCTTCTTCTACCCCATCCTGCATGTTGTCCTTTTCTTTTCATTCCGTATTGGAATAAAAAAAGTTTA GGCCAGAAGCTCTATGGAAAAATCGTGGTTCAATTTTATGTTTTCTAAGGGAGAATTGGAATACAGAG GTGAGCTAAGTAAAGCAATGGATAGTTTTGCTCCTGGTGAAAAGACTACTATAAGTCAAGACCGTTTT

ATATATGATATGGATAAAAACTTTTATGGTTGGGATGAGCGTTCTAGTTATTCTTCTAGTTATTCCAAT AATGTTGATCTTTTAGTTAGCTCCAAGGACATTCGCAATTTCATATCGGATGACACCTTTTTTGTTAGG GATAGTAATAAGAATAGTTATTCTATATTTTTTGATAAAAAAAAAAAAATTTTTGAGATTGACAATGA TTTTAGTGACCTAGAAAAATTTTTTTATAGTTATTGTAGTTCTAGTTATCTAAATAATAGATCTAAAGG TGACAACGATCTGCACTATGATCCTTACATTAAGGATACTAAATATAATTGTACTAATCACATTAATAG TTGCATTGATTCTTATTTTCGTTCTTACATCTGTATTGATAATAACTTTTTAATCGATAGTAATAATTTT AATGAAAGTTACATTTATAATTTCATTTGTAGTGAAAGCGGAAAGATTCGTGAAAGTAAAAATTACAA GATAAGAACTAATAGGAATCGTAGTAATTTAATAAGTTCTAAGGATTTCGATATAACTCAAAACTACA ATCAATTGTGGATTCAATGCGACAATTGTTATGGATTAATGTATAAGAAAGTCAAAATGAATGTTTGT GAACAATGTGGACATTATTTGAAAATGAGTAGTTCAGAAAGAATCGAGCTTTCGATTGATCCGGGTAC TTGGAATCCTATGGATGAAGACATGGTCTCTGCGGATCCCATTAAATTTCATTCGAAGGAGGAACCTT ATAAAAACCGTATTGACTCTGCGCAAAAAACTACAGGATTGACTGACGCTGTTCAAACAGGTACAGGT CAACTAAACGGTATTCCGGTAGCCCTTGGGGTTATGGATTTTCGGTTTATGGGGGGTAGTATGGGATCC GTAGTAGGCGAAAAAATAACTCGTTTGATCGAGTATGCTACCAATCAATGTTTACCTCTTATTTTAGTG TGTTCTTCCGGAGGAGCACGAATGCAAGAAGGAAGTTTAAGTTTGATGCAAATGGCTAAAATTTCTTC GGTTTTATGTGATTATCAATCAAGTAAAAAGTTATTCTATATATCAATTCTTACATCTCCTACTACCGGT GGAGTGACAGCTAGTTTTGGTATGTTGGGGGATATCATTATTGCCGAACCCTATGCCTATATTGCATTT GCGGGTAAAAGAGTAATTGAACAAACATTGAAAAAAGCCGTGCCTGAAGGTTCACAAGCGGCTGAAT CTTTATTACGTAAGGGCTTATTGGATGCAATTGTACCACGTAATCTTTTAAAAGGTGTTCTGAGCGAGT TATTTCAGCTCCATGCTTTTTTTCCTTTGAACACAAATTAAgcggccgcctcgagaaaaaggggggaatgaaagac cccacctgtaggtttggcaagctagcttaagtaacgccattttgcaaggcatggaaaaatacataactgagaatagagaagttcagatcaag gtcaggaacagatggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccggctcagggccaagaacagatg gaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccggctcagggccaagaacagatggtccccagatgcggt ccagccctcagcagtttctagagaaccatcagatgtttccagggtgccccaaggacctgaaatgaccctgtgccttatttgaactaaccaatc agttcgcttctcgcttctgttcgcgcgcttctgctccccgagctcaataaaagagcccacaacccctcactcgggtatataaggaagttcatttc atttggagagaacacggctgcaggaaagacggttctagaatccgcttaagacctcctaggtccaacgcgttttctactagttacattgagaga 
gcgagacgtcccggtgtgtgtgagagagcagaatcgccggccgccaccttcggcgcagcgccttggcgaaaccccatttcgacctttcgg tctcatcaggggtggcacacaccaccctatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacaccc acccatgggtcgggactttaaattcggaggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcgg cgaatgtaccgtttcgtcctttcggactcatcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactat cccctttccaggatttgttcccgcgccagtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcactcg agaagcgcgctcggccgcggcgacccagctttcttgtacaaagttggcattataagaaagcattgcttatcaatttgttgcaacgaacaggtc actatcagtcaaaataaaatcattatttgGAATTC
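The construct maps in this appendix name the restriction sites expected at each end of a sequence (EcoRI, GAATTC, for the construct above). A minimal Python sketch of how such a listing could be checked against its map is given here. The function names are hypothetical, the example string is an abbreviated placeholder rather than the full construct, and whitespace is stripped first because the printed sequences are wrapped across lines.

```python
def normalize(seq: str) -> str:
    """Remove whitespace from a printed sequence and convert it to upper case."""
    return "".join(seq.split()).upper()


def has_flanking_sites(seq: str, site: str) -> bool:
    """Return True if the sequence both starts and ends with the given site."""
    s = normalize(seq)
    site = site.upper()
    return s.startswith(site) and s.endswith(site)


def internal_sites(seq: str, site: str) -> list[int]:
    """Return 0-based positions of the site, excluding the two terminal copies."""
    s = normalize(seq)
    site = site.upper()
    positions = []
    start = s.find(site)
    while start != -1:
        positions.append(start)
        start = s.find(site, start + 1)
    last = len(s) - len(site)
    return [p for p in positions if p not in (0, last)]


if __name__ == "__main__":
    # Abbreviated placeholder: only the first and last bases of a construct,
    # wrapped over two lines the way the appendix prints sequences.
    example = "GAATTCcaaataatgatttt attttgactg\natcattatttgGAATTC"
    print(has_flanking_sites(example, "GAATTC"))
    print(internal_sites(example, "GAATTC"))
```

Any position reported by `internal_sites` would indicate an additional copy of the flanking site inside the construct, which matters when that enzyme is used to excise the fragment.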

Figure B7 | PTEC 4 – A. thaliana double integration sites. A synthetic gene corresponding to a PTEC that integrates in the noncoding inverted repeat regions of the A. thaliana plastid genome is shown below.

XhoI-attL1-stuffer 1-AscI-ELVd-VRS 1-AscI-NotI-Left Homology Arm-Prrn-5’UTR-NsiI-aadA-NsiI-psbA3’UTR-Right Homology Arm-NotI-SbfI-VRS 2-SbfI-stuffer 2-attL2-XhoI

CTCGAGcaaataatgattttattttgactgatagtgacctgttcgttgcaacaaattgatgagcaatgcttttttataatgccaactttgtacaaaaaagcag gctcgccggccgccaccttcggcggcgcgccttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacaccaccctatggggagagg tcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaattcggaggattcgtcctttaaacgttcct ccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcggactcatcagggaaagtacacactttccgacggt gggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccgcgccagtcctccgattgactgagtcgcccgggtacccgtgtatc caataaaccctcttgcagttgcatccgacttgtggtctcgctgttccttgggagggtctcctctgagtgattgactacccgtcagcgggggtctttcatttggg gggcgttccgagaatggtagggatgacaggatttgggcgcgccgcggccgcAGATCTTTCTCAATCAATCTCTTTGCCCCTCAT TCTTCGAGAATCAGAAAGAGACTTTTTCAAGTTTGAATTTGTTCATTTGGAATCTGGGTTCTTCTACTTC ATTTTGATTTACTTATTTTTTCTTTATTTTCTCTCTCTTTTCTTTATTTGATTTCTTTTTTGATTTTATTCCC TTCCATCATTCTTAAGTCCCATAAGTTTGATCCTATAGAATCTGACCCATTTTCTCATTGAGCGAAGGG GTACGAAATAAATTCAATCAGATTTATTTTTGATCAAAAAAAAAAAAAAAAATCACTATGTGAAATCT TCGTTTTTTTTTTTCTCTTTCTCTATCGCTTTCCCATAAGTACAGCACTTGTTGAATCGATAGAGAACCT TTTCTTCTGTATCGATATGAATCCATTATGAATCGATATTATTACATTCCAATTCCTTACCAATATCCCT CAAGGAAAATCCCGAATTGGATCCCAAATTGACGGGTTAGTGTGAGCTTATCCATGCGGTTATGCACT CTTCGAATAGGAATTCATTTTCTGAAAGATTCTGGCTTTCGTGCTTTGGCGGGTCTCCGAGATCCTTTC GACGACCTATGTTGTGTTGAAGGGATATCTAGATGATCCGATCAATTGCGTAAAGCCCGCAGTAGCAA CGGAACCGGGGAAAGTATACATAAGTATACAGAAAAGACAGTTCTTTTCTTTTCTATTATATTAGGATT TTCTATTCTATTAGATTAGTTAGTGATCTTGGCGCAGTGAGTCCTTTCTTCCGTGATTAACTGTTGGCAC CAGTCCTACATTTTGTCTCTGCGGACCGAGAAGAAAGGAGGCTCCGCGGGAAGAGGATTGTACCATAG AAGCACGGAGGTAAACCTCTTTCCAATATATAAATTCTGGCAATGTAGTTGGGCTTTCATGTTGATCCG AATGAATCATCTTTTTCGCGGAGTGAAATCTTTGCCTGCTAGGCAAGATGATAGGATAGCAAGTTACA AATTCTGTTTCGGTAGGACATGTATTTCTATTACTATGAAATTCATAAATAAAATAGTTAATCGTGGGG TTACCATTCTCTCTTTTTTTTTTTCGTTATCTCGCATGTGGTCCTAAGAAAAGGGAATTTGTCGATTTTTC GGGGTCTTAAAGGGGCGTGGAAACACATAAGAACTCTTGAATGGAAATGGAAAAGAGATGTAACTCC AGTTCCTTTGGAAATAGGAAGATCTTTGGCGCAAGAATAAAGGATTAATCCGTATCATCTTGACTTGG 
TTCTTATTTCTCTATTTTTTTAAGTTTAAGAAAAGAATACCGTTTCTCCTACCCGTATCGAATAGAACAT GCCGAGTCAAATCTTCTTCATGTAAAACCGGCTTGATTTAGATCGGGAGAATCGTACGGTTTTATGAA ACCATGTGCTATGGCTCGAATCCGTAGTCAATCCTATTTCCGATAGGAGTAGTTGctcccccgccgtcgttcaat gagaatggataagaggctcgtgggattgacgtgagggggcagggatggctatatttctgggagcgaactccgggcgaatacgaagcgcttggatacag ttgtagggagggatttATGCATatggcagaagcggtgatcgccgaagtatcgactcaactatcagaggtagttggcgtcatcgagcgccatctcga accgacgttgctggccgtacatttgtacggctccgcagtggatggcggcctgaagccacacagtgatattgatttgctggttacggtgaccgtaaggcttg atgaaacaacgcggcgagctttgatcaacgaccttttggaaacttcggcttcccctggagagagcgagattctccgcgctgtagaagtcaccattgttgtg cacgacgacatcattccgtggcgttatccagctaagcgcgaactgcaatttggagaatggcagcgcaatgacattcttgcaggtatcttcgagccagcca cgatcgacattgatctggctatcttgctgacaaaagcaagagaacatagcgttgccttggtaggtccagcggcggaggaactctttgatccggttcttgaac aggatctatttgaggcgctaaatgaaaccttaacgctatggaactcgccgcccgactgggctggcgatgagcgaaatgtagtgcttacgttgtcccgcattt ggtacagcgcagtaaccggcaaaatcgcgccgaaggatgtcgctgccgactgggcaatggagcgcctgccggcccagtatcagcccgtcatacttga agctagacaggcttatcttggacaagaagaagatcgcttggcctcgcgcgcagatcagttggaagaatttgtccactacgtgaaaggcgagatcaccaa ggtagtcggcaaataaATGCATgatcctggcctagtctataggaggttttgaaaagaaaggagcaataatcattttcttgttctatcaagagggtgctat tgctcctttctttttttctttttatttatttactagtattttacttacatagacttttttgtttacattatagaaaaagaaggagaggttattttcttgcatttattcatgCAA GTTGTTCAAGAATAGTGGCCTTGAGTTTCTCGACCCTTTGACTTAGGATTAGTCAGTTCTATTTCTTGAT GGGGGAAGGGATATAACTCAGCGGTAGAGTGTCACCTTGACGTGGTGGAAGTCATCAGTTCGAGCCTG ATTATCCCTAAACCCAATGAATGTGAGTTTTTCTATTTTGACTTGCTCCCTCGCTGTGATCGAATAAGA ATGGATAAGAGGCTCGTGGGATTGACGTGAGGGGGTAGGGGTAGCTATATTTCTGGGAGCGAACTCC ATGCGAATATGAAGCGCATGGATACAAGTTATGACTTGGAATGAAAGACAATTCCGAATCAGCTTTGT CTACGAAGAAGGAAGCTATAAGTAATGCAACTATGAATCTCATGGAGAGTTCGATCCTGGCTCAGGAT GAACGCTGGCGGCATGCTTAACACATGCAAGTCGGACGGGAAGTGGTGTTTCCAGTGGCGGACGGGT GAGTAACGCGTAAGAACCTGCCCTTGGGAGGGGAACAACAGCTGGAAACGGCTGCTAATACCCCGTA GGCTGAGGAGCAAAAGGAGGAATCCGCCCGAGGAGGGGCTCGCGTCTGATTAGCTAGTTGGTGAGGC 
AATAGCTTACCAAGGCGATGATCAGTAGCTGGTCCGAGAGGATGATCAGCCACACTGGGACTGAGAC

ACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATTTTCCGCAATGGGCGAAAGCCTGACGGAGC AATGCCGCGTGGAGGTAGAAGGCCTACGGGTCCTGAACTTCTTTTCCCAGAGAAGAAGCAATGACGGT ATCTGGGGAATAAGCATCGGCTAACTCTGTGCCAGCAGCCGCGGTAATACAGAGGATGCAAGCGTTAT CCGGAATGATTGGGCGTAAAGCGTCTGTAGGTGGCTTTTTAAGTCCGCCGTCAAATCCCAGGGCTCAA CCCTGGACAGGCGGTGGAAACTACCAAGCTTGAGTACGGTAGGGGCAGAGGGAATTTCCGGTGGAGC GGTGAAATGCGTAGAGATCGGAAAGAACACCAACGGCGAAAGCACTCTGCTGGGCCGACACTGACAC TGAGAGACGAAAGCTAGGGGAGCGAATGGGATTAGATACCCCAGTAGTCCTAGCCGTAAACGATGGA TACTAGGCGCTGTGCGTATCGACCCGTGCAGTGCTGTAGCTAACGCGTTAAGTATCCCGCCTGGGGAG TACGTTCGCAAGAATGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTT AATTCGATGCAAAGCGAAGAACCTTACCAGGGCTTGACATGCCGCGAATCCTCTTGAAAGAGAGGGG TGCCTTCGGGAACGCGGACACAGGTGGTGCATGGCTGTCGTCAGCTCGTGCCGTAAGGTGTTGGGTTA AGTCCCGCAACGAGCGCAACCCTCGTGTTTAGTTGCCACCGTTGAGTTTGGAACCCTGAACAGACTGC CGGTGATAAGCCGGAGGAAGGTGAGGATGACGTCAAGTCATCATGCCCCTTATGCCCTGGGCGACAC ACGTGCTACAATGGCCGGGACAAAGGGTCGCGATCCCGCGAGGGTGAGCTAACTCCAAAAACCCGTC CTCAGTTCGGATTGCAGGCTGCAACTCGCCTGCATGAAGCCGGAATCGCTAGTAATCGCCGGTCAGCC ATACGGCGGTGAATTgcggccgccctgcaggaaaaaggggggaatgaaagaccccacctgtaggtttggcaagctagcttaagtaacgcc attttgcaaggcatggaaaaatacataactgagaatagagaagttcagatcaaggtcaggaacagatggaacagctgaatatgggccaaacaggatatct gtggtaagcagttcctgccccggctcagggccaagaacagatggaacagctgaatatgggccaaacaggatatctgtggtaagcagttcctgccccgg ctcagggccaagaacagatggtccccagatgcggtccagccctcagcagtttctagagaaccatcagatgtttccagggtgccccaaggacctgaaatg accctgtgccttatttgaactaaccaatcagttcgcttctcgcttctgttcgcgcgcttctgctccccgagctcaataaaagagcccacaacccctcactcgg gtatataaggaagttcatttcatttggagagaacacggctgcaggaaagacggttctagaatccgcttaagacctcctaggtccaacgcgttttctactagtt acattgagagagcgagacgtcccggtgtgtgtgagagagcagaatcgccggccgccaccttcggcgcagcgccttggcgaaaccccatttcgaccttt cggtctcatcaggggtggcacacaccaccctatggggagaggtcgtcctctatctctcctggaaggccggagcaatccaaaagaggtacacccaccca tgggtcgggactttaaattcggaggattcgtcctttaaacgttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgttt 
cgtcctttcggactcatcagggaaagtacacactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccg cgccagtcctccgattgactgagtcgcccgggtacccgtgtatccaataaaccctcttgcagttgcacctgcaggaagcgcgctcggccgcggcgaccc agctttcttgtacaaagttggcattataagaaagcattgcttatcaatttgttgcaacgaacaggtcactatcagtcaaaataaaatcattatttgCTCGAG

Figure B8 | PTEC 5 – N. benthamiana single integration site, first-strand synthesis only. PTEC 5 is distinguished from the other PTEC constructs in that it does not feature the VRS I and VRS II sequences. Instead, it carries a single 3’ 18 bp primer binding sequence corresponding to N. benthamiana tRNAPro (NC_001879.2). Additionally, this PTEC was redesigned to minimize the number of base pairs transcribed upstream of the ELVd sequence. Prior constructs would transcribe at least a hundred additional base pairs because of the position of the Gateway and restriction sites relative to the endogenous promoters of the desired binary vectors. Unlike the other PTECs, PTEC 5 was cloned via the I-CeuI and PI-SceI homing endonucleases to remove the endogenous 35S promoter while retaining the endogenous 35S terminator of the PCGW series vectors (Dalal et al. 2015).

I-CeuI-35Spro-ELVd-Left Homology Arm-Prrn-5’UTR-aadA-psbA3’UTR-Right Homology Arm-Primer Binding Sequence-PI-SceI

TAACTATAACGGTCCTAAGGTAGCGAAAGATTAGCCTTTTCAATTTCAGAAAGAATG CTAACCCACAGATGGTTAGAGAGGCTTACGCAGCAGGTCTCATCAAGACGATCTAC CCGAGCAATAATCTCCAGGAAATCAAATACCTTCCCAAGAAGGTTAAAGATGCAGT CAAAAGATTCAGGACTAACTGCATCAAGAACACAGAGAAAGATATATTTCTCAAGA TCAGAAGTACTATTCCAGTATGGACGATTCAAGGCTTGCTTCACAAACCAAGGCAAG TAATAGAGATTGGAGTCTCTAAAAAGGTAGTTCCCACTGAATCAAAGGCGATGGAG TCAAAGATTCAAATAGAGGACCTAACAGAACTCGCCGTAAAGACTGGCGAACAGTT CATACAGAGTCTCTTACGACTCAATGACAAGAAGAAAATCTTCGTCAACATGGTGG AGCACGACACACTTGTCTACTCCAAAAATATCAAAGATACAGTCTCAGAAGACCAA AGGGCAATTGAGACTTTTCAACAAAGGGTAATATCCGGAAACCTCCTCGGATTCCAT TGCCCAGCTATCTGTCACTTTATTGTGAAGATAGTGGAAAAGGAAGGTGGCTCCTAC AAATGCCATCATTGCGATAAAGGAAAGGCCATCGTTGAAGATGCCTCTGCCGACAG TGGTCCCAAAGATGGACCCCCACCCACGAGGAGCATCGTGGAAAAAGAAGACGTTC CAACCACGTCTTCAAAGCAAGTGGATTGATGTGATATCTCCACTGACGTAAGGGATG ACGCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTC ATTTGGAGAGAACACGGTTGGCGAAACCCCATTTCGACCTTTCGGTCTCATCAGGGG TGGCACACACCACCCTATGGGGAGAGGTCGTCCTCTATCTCTCCTGGAAGGCCGGAG CAATCCAAAAGAGGTACACCCACCCATGGGTCGGGACTTTAAATTCGGAGGATTCG TCCTTTAAACGTTCCTCCAAGAGTCCCTTCCCCAAACCCTTACTTTGTAAGTGTGGTT CGGCGAATGTACCGTTTCGTCCTTTCGGACTCATCAGGGAAAGTACACACTTTCCGA CGGTGGGTTCGTCGACACCTCTCCCCCTCCCAGGTACTATCCCCTTTCCAGGATTTGT TCCCAATTCACCGCCGTATGGCTGACCGGCGATTACTAGCGATTCCGGCTTCATGCA GGCGAGTTGCAGCCTGCAATCCGAACTGAGGACGGGTTTTTGGGGTTAGCTCACCCT CGCGGGATCGCGACCCTTTGTCCCGGCCATTGTAGCACGTGTGTCGCCCAGGGCATA AGGGGCATGATGACTTGACGTCATCCTCACCTTCCTCCGGCTTATCACCGGCAGTCT GTTCAGGGTTCCAAACTCAACGATGGCAACTAAACACGAGGGTTGCGCTCGTTGCG GGACTTAACCCAACACCTTACGGCACGAGCTGACGACAGCCATGCACCACCTGTGT CCGCGTTCCCGAAGGCACCCCTCTCTTTCAAGAGGATTCGCGGCATGTCAAGCCCTG GTAAGGTTCTTCGCTTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCC CCCGTCAATTCCTTTGAGTTTCATTCTTGCGAACGTACTCCCCAGGCGGGATACTTAA CGCGTTAGCTACAGCACTGCACGGGTCGATACGCACAGCGCCTAGTATCCATCGTTT ACGGCTAGGACTACTGGGGTATCTAATCCCATTCGCTCCCCTAGCTTTCGTCTCTCAG TGTCAGTGTCGGCCCAGCAGAGTGCTTTCGCCGTTGGTGTTCTTTCCGATCTCTACGC ATTTCACCGCTCCACCGGAAATTCCCTCTGCCCCTACCGTACTCCAGCTTGGTAGTTT 
CCACCGCCTGTCCAGGGTTGAGCCCTGGGATTTGACGGCGGACTTAAAAAGCCACCT ACAGACGCTTTACGCCCAATCATTCCGGATAACGCTTGCATCCTCTGTATTACCGCG GCTGCTGGCACAGAGTTAGCCGATGCTTATTCCCCAGATACCGTCATTGCTTCTTCTC CGGGAAAAGAAGTTCACGACCCGTGGGCCTTCTACCTCCACGCGGCATTGCTCCGTC AGGCTTTCGCCCATTGCGGAAAATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGGCC GTGTCTCAGTCCCAGTGTGGCTGATCATCCTCTCGGACCAGCTACTGATCATCGCCTT GGTAAGCTATTGCCTCACCAACTAGCTAATCAGACGCGAGCCCCTCCTCGGGCGGAT TCCTCCTTTTGCTCCTCAGCCTACGGGGTATTAGCAGCCGTTTCCAGCTGTTGTTCCC CTCCCAAGGGCAGGTTCTTACGCGTTACTCACCCGTCCGCCACTGGAAACACCACTT CCCGTCCGACTTGCATGTGTTAAGCATGCCGCCAGCGTTCATCCTGAGCCAGGATCG

AACTCTCCATGAGATTCATAGTTGCATTACTTATAGCTTCCTTGTTCGTAGACAAAGC GGATTCGGAATTGTCTTTCATTCCAAGGCATAACTTGTATCCATGCGCTTCATATTCG CCCGGAGTTCGCTCCCAGAAATATAGCCATCCCTGCCCCCTCACGTCAATCCCACGA GCCTCTTATCCATTCTCATTGAACGACGGCGGGGGAGCTTTCGAGGCCTCGAAATCC AACTAGAAAAACTCACATTGGGCTTAGGGATAATCAGGCTCGAACTGATGACTTCC ACCACGTCAAGGTGACACTCTACCGCTGAGTTATATCCCTTCCCCGCCCCATCGAGA AATAGAACTGACTAATCCTAAGTCAAAGGGTCGAGAAACTCAACGCCACTATTCTTG AACAACTTGGAGCCGGGCCTTCTTTTCGCACTATTACGGATATGAAAATAATGGTCA AAATCGGATTCAATTGTCGCTCCCCCGCCGTCGTTCAATGAGAATGGATAAGAGGCT CGTGGGATTGACGTGAGGGGGCAGGGATGGCTATATTTCTGGGAGCGAACTCCGGG CGAATACGAAGCGCTTGGATACAGTTGTAGGGAGGGATTTATGGCAGAAGCGGTGA TCGCCGAAGTATCGACTCAACTATCAGAGGTAGTTGGCGTCATCGAGCGCCATCTCG AACCGACGTTGCTGGCCGTACATTTGTACGGCTCCGCAGTGGATGGCGGCCTGAAGC CACACAGTGATATTGATTTGCTGGTTACGGTGACCGTAAGGCTTGATGAAACAACGC GGCGAGCTTTGATCAACGACCTTTTGGAAACTTCGGCTTCCCCTGGAGAGAGCGAGA TTCTCCGCGCTGTAGAAGTCACCATTGTTGTGCACGACGACATCATTCCGTGGCGTT ATCCAGCTAAGCGCGAACTGCAATTTGGAGAATGGCAGCGCAATGACATTCTTGCA GGTATCTTCGAGCCAGCCACGATCGACATTGATCTGGCTATCTTGCTGACAAAAGCA AGAGAACATAGCGTTGCCTTGGTAGGTCCAGCGGCGGAGGAACTCTTTGATCCGGTT CTTGAACAGGATCTATTTGAGGCGCTAAATGAAACCTTAACGCTATGGAACTCGCCG CCCGACTGGGCTGGCGATGAGCGAAATGTAGTGCTTACGTTGTCCCGCATTTGGTAC AGCGCAGTAACCGGCAAAATCGCGCCGAAGGATGTCGCTGCCGACTGGGCAATGGA GCGCCTGCCGGCCCAGTATCAGCCCGTCATACTTGAAGCTAGACAGGCTTATCTTGG ACAAGAAGAAGATCGCTTGGCCTCGCGCGCAGATCAGTTGGAAGAATTTGTCCACT ACGTGAAAGGCGAGATCACCAAGGTAGTCGGCAAATAAGATCCTGGCCTAGTCTAT AGGAGGTTTTGAAAAGAAAGGAGCAATAATCATTTTCTTGTTCTATCAAGAGGGTGC TATTGCTCCTTTCTTTTTTTCTTTTTATTTATTTACTAGTATTTTACTTACATAGACTTT TTTGTTTACATTATAGAAAAAGAAGGAGAGGTTATTTTCTTGCATTTATTCATGAACT GCCCCTATCGGAAATAGGATTGACTACCGATTCCGAAGGAACTGGAGTTACATCTCT TTTCCATTCAAGAGTTCTTATGCGTTTCCACGCCCCTTTGAGACCCCGAAAAATGGA CAAATTCCTTTTCTTAGGAACACATACAAGATTCGTCACTACAAAAAGGATAATGGT AACCCTACCATTAACTACTTCATTTATGAATTTCATAGTAATAGAAATACATGTCCTA CCGAGACAGAATTTGGAACTTGCTATCCTCTTGCCTAGCAGGCAAAGATTTACCTCC GTGGAAAGGATGATTCATTCGGATCGACATGAGAGTCCAACTACATTGCCAGAATC 
CATGTTGTATATTTGAAAGAGGTTGACCTCCTTGCTTCTCTCATGGTACACTCCTCTT CCCGCCGAGCCCCTTTTCTCCTCGGTCCACAGAGACAAAATGTAGGACTGGTGCCAA CAATTCATCAGACTCACTAAGTCGGGATCACTAACTAATACTAATCTAATATAATAG TCTAATATATCTAATATAATAGAAAATACTAATATAATAGAAAAGAACTGTCTTTTC TGTATACTTTCCCCGGTTCCGTTGCTACCGCGGGCTTTACGCAATCGATCGGATTAGA TAGATATCCCTTCAACATAGGTCATCGAAAGGATCTCGGAGACCCACCAAAGTACG AAAGCCAGGATCTTTCAGAAAACGGATTCCTATTCAAAGAGTGCATAACCGCATGG ATAAGCTCACACTAACCCGTCAATTTGGGATCCAAATTCGAGATTTTCCTTGGGAGG TATCGGGAAGGATTTGGAATGGAATAATATCGATTCATACAGAAGAAAAGGTTCTC TATTGATTCAAACACTGTACCTAACCTATGGGATAGGGATCGAGGAAGGGGAAAAA CCGAAGATTTCACATGGTACTTTTATCAATCTGATTTATTTCGTACCTTTCGTTCAAT GAGAAAATGGGTCAAATTCTACAGGATCAAACCTATGGGACTTAAGGAATGATATA

AAAAAAAGAGAGGGAAAATATTCATATTAAATAAATATGAAGTAGAAGAACCCAG ATTCCAAATGAACAAATTCAAACTTGAAAAGGATCTTCCTTATTCTTGAAGAATGAG GGGCAAAGGGATTGATCAAGAAAGATCGGTAGGGATGACAGGATTATCTATGTCGG GTGCGGAGAAAGAGGTAATGAAATGG

Chloroplast localized sgRNA

HindIII-Nospro-ELVd-Dual BbsI/BpiI sites-gRNA scaffold-Nosterm-EcoRI

AATTAAAAGCTTGATCATGAGCGGAGAATTAAGGGAGTCACGTTATGACCCCCGCC GATGACGCGGGACAAGCCGTTTTACGTTTGGAACTGACAGAACCGCAACGTTGAAG GAGCCACTCAGCCGCGGGTTTCTGGAGTTTAATGAGCTAAGCACATACGTCAGAAA CCATTATTGCGCGTTCAAAAGTCGCCTAAGGTCACTATCAGCTAGCAAATATTTCTT GTCAAAAATGCTCCACTGACGTTCCATAAATTCCCCTCGGTATCCAATTAGAGTCTC ATATTCACTCTCAATCCAAATAATCTGCACCGGATCTGGATCGTTTCGCGTTGGCGA AACCCCATTTCGACCTTTCGGTCTCATCAGGGGTGGCACACACCACCCTATGGGGAG AGGTCGTCCTCTATCTCTCCTGGAAGGCCGGAGCAATCCAAAAGAGGTACACCCAC CCATGGGTCGGGACTTTAAATTCGGAGGATTCGTCCTTTAAACGTTCCTCCAAGAGT CCCTTCCCCAAACCCTTACTTTGTAAGTGTGGTTCGGCGAATGTACCGTTTCGTCCTT TCGGACTCATCAGGGAAAGTACACACTTTCCGACGGTGGGTTCGTCGACACCTCTCC CCCTCCCAGGTACTATCCCCTTTCCAGGATTTGTTCCCGGGTCTTCGAGAAGACCTGT TTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAA GTGGCACCGAGTCGGTGCACCAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAA TAAAGTTTCTTAAGATTGAATCCTGTTGCCGGTCTTGCGATGATTATCATATAATTTC TGTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGCATGACGTTATTTATGA GATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACGCGATAGAAAACA AAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAG ATCGGGAATTGGATCCGGGGGG

Figure B9 | 5’ ELVd UTR-sgRNA. A construct encoding an sgRNA recognized by Cas9, transcriptionally fused to a 5’ ELVd sequence as a means of chloroplast localization, was synthesized as a gene fragment based on previous designs (Gómez and Pallás 2010a, 2010b). This construct features dual BbsI sites that allow any target sequence of interest to be cloned. In contrast to other sgRNA designs, this construct is driven by a nopaline synthase promoter, which recruits RNA pol II.
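The dual BbsI sites described in this caption can be located programmatically. The Python sketch below is illustrative only: the function names are hypothetical, and the example string is a short excerpt copied from the junction between the ELVd sequence and the gRNA scaffold in the sequence above. BbsI recognizes the non-palindromic sequence GAAGAC, so a site on the opposite strand appears as GTCTTC when read along the top strand.

```python
BBSI_FORWARD = "GAAGAC"   # recognition sequence read on the top strand
BBSI_REVERSE = "GTCTTC"   # reverse complement: site lies on the bottom strand


def find_all(seq: str, motif: str) -> list[int]:
    """Return every 0-based start position of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def find_bbsi_sites(seq: str) -> dict[str, list[int]]:
    """Locate BbsI recognition sites in both orientations on the top strand."""
    s = "".join(seq.split()).upper()
    return {
        "forward": find_all(s, BBSI_FORWARD),
        "reverse": find_all(s, BBSI_REVERSE),
    }


if __name__ == "__main__":
    # Excerpt spanning the dual-site cloning cassette of the construct above.
    cassette = "TTCCCGGGTCTTCGAGAAGACCTGTTTTAGAGCTAG"
    print(find_bbsi_sites(cassette))
```

One site in each orientation, facing outward from a short spacer, is the arrangement expected for a cassette in which digestion removes both recognition sites and leaves overhangs for inserting an annealed target oligonucleotide pair.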

HindIII-Nospro-Dual BbsI/BpiI sites-gRNA scaffold-ELVd-Nosterm-EcoRI

AATTAAAAGCTTGATCATGAGCGGAGAATTAAGGGAGTCACGTTATGACCCCCGCC GATGACGCGGGACAAGCCGTTTTACGTTTGGAACTGACAGAACCGCAACGTTGAAG GAGCCACTCAGCCGCGGGTTTCTGGAGTTTAATGAGCTAAGCACATACGTCAGAAA CCATTATTGCGCGTTCAAAAGTCGCCTAAGGTCACTATCAGCTAGCAAATATTTCTT GTCAAAAATGCTCCACTGACGTTCCATAAATTCCCCTCGGTATCCAATTAGAGTCTC ATATTCACTCTCAATCCAAATAATCTGCACCGGATCTGGATCGTTTCGCGGGGTCTTC GAGAAGACCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTAT CAACTTGAAAAAGTGGCACCGAGTCGGTGCGTTGGCGAAACCCCATTTCGACCTTTC GGTCTCATCAGGGGTGGCACACACCACCCTATGGGGAGAGGTCGTCCTCTATCTCTC CTGGAAGGCCGGAGCAATCCAAAAGAGGTACACCCACCCATGGGTCGGGACTTTAA ATTCGGAGGATTCGTCCTTTAAACGTTCCTCCAAGAGTCCCTTCCCCAAACCCTTACT TTGTAAGTGTGGTTCGGCGAATGTACCGTTTCGTCCTTTCGGACTCATCAGGGAAAG TACACACTTTCCGACGGTGGGTTCGTCGACACCTCTCCCCCTCCCAGGTACTATCCCC TTTCCAGGATTTGTTCCCACCAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAAT AAAGTTTCTTAAGATTGAATCCTGTTGCCGGTCTTGCGATGATTATCATATAATTTCT GTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGCATGACGTTATTTATGAG ATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACGCGATAGAAAACAA AATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAGA TCGGGAATTGGATCCGGGGGG

Figure B10 | 3’ ELVd UTR-sgRNA. A construct encoding an sgRNA recognized by Cas9, transcriptionally fused to a 3’ ELVd sequence as a means of chloroplast localization, was synthesized as a gene fragment based on previous designs (Gómez and Pallás 2010a, 2010b). This construct features dual BbsI sites that allow any target sequence of interest to be cloned. In contrast to other sgRNA designs, this construct is driven by a nopaline synthase promoter, which recruits RNA pol II.

Fluorescence

XbaI- ELVd Fragment-eGFP-AscI-SacI-SbfI-BamHI

AATTAATCTAGAGttggcgaaaccccatttcgacctttcggtctcatcaggggtggcacacaccaccctatggggagaggtcgt cctctatctctcctggaaggccggagcaatccaaaagaggtacacccacccatgggtcgggactttaaattcggaggattcgtcctttaaac gttcctccaagagtcccttccccaaacccttactttgtaagtgtggttcggcgaatgtaccgtttcgtcctttcggactcatcagggaaagtaca cactttccgacggtgggttcgtcgacacctctccccctcccaggtactatcccctttccaggatttgttcccATGGTGAGCAAGG GCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTA AACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAA GCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCT CGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAA GCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCA TCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGC GACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAA CATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGG CCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAG GACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGG CCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAG ACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGG ATCACTCTCGGCATGGACGAGCTGTACAAGTAAGGCGCGCCGAGCTCCCTGCAGGG GATCCAATTAA

Figure B11 | 5’ ELVd UTR-eGFP. A construct encoding eGFP, transcriptionally fused to a 5’ ELVd sequence as a means of chloroplast localization, was synthesized as a gene fragment based on previous designs (Gómez and Pallás 2010a, 2010b). This construct additionally features a 3’ multiple cloning site.

XbaI -EIF4E -eGFP- BamHI

AATTAATCTAGAGAGTAATTTAGGCAGTTCGGAGAAACAATGGCGGTAGAAGACAC TCCCAAATCTGTTGTAACGGAAGAAGCTAAGCCTAATTCAATAGAGAATCCGATTGA TCGATACCATGAGGAAGGTGATGATGCCGAAGAAGGAGAGATCGCCGGAGGAGAA GGAGACGGAAACGTTGACGAATCGAGCAAATCCGGTGTTCCTGAATCGCATCCTCT GGAACATTCATGGACTTTCTGGTTCGATAATCCTGCTGTGAAATCGAAACAAACCTC TTGGGGAAGTTCCTTGCGACCCGTGTTTACGTTTTCAACTGTTGAGGAATTTTGGAGT TTGTACAACAACATGAAGCATCCGAGCAAGTTAGCTCACGGAGCTGACTTCTACTGT TTCAAACACATCATTGAACCTAAGTGGGAGGGTCCTATTTGTGCTAATGGAGGAAAA TGGACTATGACTTTCCCTAAGGAGAAGTCTGATAAGAGCTGGCTCTACACTTTGCTT GCATTGATTGGAGAGCAGTTTGATCATGGAGATGAAATATGTGGAGCAGTTGTCAA CATTAGAGGAAAGCAAGAAAGGATATCTATTTGGACTAAAAATGCTTCAAACGAAG CTGCTCAGGTGAGCATTGGAAAACAATGGAAGGAGTTTCTCGATTACAACAACAGC ATAGGTTTCATCATCCATGAGGATGCGAAGAAGCTCGACAGGAATGCAAAGAACGC TTACACCGCTAACCTCTCAAATCTTTGCATTGGTTTCAATTACAGTTTGGTATGTGAG AGATCTCTATTTATCTAAACATGACTTGGACAGTCTGTCTTTGCTAGTGTCTGATTGC TCACGACGCTCTAACATTTCATTTAGTAATACACTAGTATGGTTCCTCATAACATGGT GAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACG GCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACC TACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGG CCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGAC CACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGA GCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGT TCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG GACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTA TATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACA ACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATC GGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTG AGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGC CGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAGGCGCGCCGAGCTCCC TGCAGGGGATCCAATTAA

Figure B12 | 5’EIF4E UTR-eGFP. A construct in which eGFP is transcriptionally fused to a 5’ eIF4E sequence, intended to confer chloroplast localization, was synthesized as a gene fragment based on previous designs (Nicolaï et al. 2007). This construct also features a 3’ multiple cloning site.


EcoRI-rbcS-gggs polylinker-SacI-eGFP-SphI-gggs polylinker-6x his tag-HindIII

Gaattcatggcttcctctatgctctcttccgctactatggttgcctctccggctcaggccactatggtcgctcctttcaacggacttaagtcctccgctgccttcccagccac ccgcaaggctaacaacgacattacttccatcacaagcaacggcggaagagttaactgcggtggaggcggttcagagctcATGGTGAGCAAGGGC GAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAA CGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGC TGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCG TGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGC AGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCT TCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGAC ACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACAT CCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCG ACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGAC GGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCC CGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACC CCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATC ACTCTCGGCATGGACGAGCTGTACAAGTAAgcatgcggtggaggcggttcacatcatcaccatcaccactaaaagctt

Figure B13 | cpeGFP. The eGFP sequence was amplified by PCR using PCGW-EGFP as a template, then cloned by restriction digestion and ligation into a synthetic gene fragment featuring an N-terminal rbcS plastid transit peptide and a C-terminal 6x His affinity tag.


EcoRI-SacI-rbcS-gggs polylinker-MCS-gggs polylinker-6x his tag-SphI-HindIII

AAAAAAGaattcGCGGCCGCatggcttcctctatgctctcttccgctactatggttgcctctccggctcaggccactatggtcgctcctttcaacggacttaag tcctccgctgccttcccagccacccgcaaggctaacaacgacattacttccatcacaagcaacggcggaagagttaactgcggtggaggcggttcagagctcggtaccc ggggatcctctagagtcgacctgcaggcatgcggtggaggcggttcacatcatcaccatcaccactaaGGCGCGCCaagcttAAAAAA

Figure B14 | Chloroplast peptide cloning vector gene fragment. This sequence was synthesized as a gene fragment and was used to attach an N-terminal rbcS transit peptide and a C-terminal 6x His tag to the eGFP and Cas9 proteins.
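The tag-encoding segments of the Figure B14 fragment can be checked by translating them in frame. The sketch below is illustrative only: it assumes the standard genetic code and uses three short segments copied from the sequence (the start of the rbcS transit peptide, one linker, and the His tag with its stop codon). The function and variable names are hypothetical and are not part of the original work.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Segments copied from the Figure B14 sequence (labels are illustrative).
segments = {
    "rbcS start": "atggcttcctctatgctctcttcc",
    "linker": "ggtggaggcggttca",
    "6x His + stop": "catcatcaccatcaccactaa",
}

translated = {name: translate(seq) for name, seq in segments.items()}
for name, protein in translated.items():
    print(f"{name}: {protein}")
```

Under these assumptions the linker segment translates to a Gly-Gly-Gly-Gly-Ser unit, and the His tag segment to six histidines followed by a stop codon.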


APPENDIX C: Primers

Table C1 | Primers used in this study. The names and sequences of the primers used to clone and screen the genes in this study are shown. The sequences targeted by sgRNA, and the GenBank accessions of the genome sequences used to design these targets, are also shown.

Primers used in this study

Name/Target              Primer sequence (5'-3')                  Amplicon length  GenBank accessions used for primer design
psbA*                    GCGAAAGCGAAAGCCTATGGGG                   N/A              NC_001879.2, NC_000932.1
rbcL*                    GAGACTAAAGCAAGTGTTGG                     N/A              NC_001879.2, NC_000932.1
35S                      F: AAAGGGGGATGTGCTGCAAG                  Variable         KP826773.1
                         R: GCTCGTATGTTGTGTGGAATT
aada                     F: CCTAAGTCAAAGGGTCGAGAAA                1326 bp          N/A
                         R: GGGACAACGTAAGCACTACA
cpCas9                   F: TGATCGAGACAAACGGCGAAACC               900 bp           N/A
                         R: AAAAAAGCATGCGTCGATCCGTGTCTCGTACAGG
cpReverse Transcriptase  F: GCCGCTCATAATCCCTCTAAA                 1902 bp          N/A
                         R: CCGTCGTGGTCCTTGTAATC
M13                      F: GTAAAACGACGGCCAG                      Variable         N/A
                         R: CAGGAAACAGCTATGACC
eGFP cloning             F: AAAAAAGAGCTCCCCATGGTGAGCAAGGGCGAGGAG  720 bp           N/A
                         R: AAAAAAGCATGCTTACTTGTACAGCTCGTCCA
hsCas9 cloning           F: AAAAAAGAGCTCGACAAGAAGTACAGCATCGGCCTG  4113 bp          N/A
                         R: AAAAAAGCATGCGTCGATCCGTGTCTCGTACAGG
pDONR221 BP Rxn          F: CACCATGGCTTCCTCTATGCTCTCTT            Variable         N/A
                         R: GCCTTAGTGGTGATGGTGATGATG

*psbA and rbcL primers were synthesized and annealed as dimers in the construction of sgRNA. These primers were not used in PCR.
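The cloning primers in Table C1 carry 5' tails that appear to add restriction sites matching the construct maps in Appendix B. As a rough illustration, the sketch below scans the eGFP and hsCas9 cloning primers for the recognition sequences of the enzymes named in those maps. The recognition sequences are standard; the function and dictionary names are hypothetical, and the scan is a simple substring search rather than a full restriction analysis.

```python
# Recognition sequences (5'-3') of the enzymes named in the Appendix B construct maps.
# All six sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "SacI": "GAGCTC",
    "SphI": "GCATGC",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
    "HindIII": "AAGCTT",
}

# Cloning primers copied from Table C1.
PRIMERS = {
    "eGFP cloning F": "AAAAAAGAGCTCCCCATGGTGAGCAAGGGCGAGGAG",
    "eGFP cloning R": "AAAAAAGCATGCTTACTTGTACAGCTCGTCCA",
    "hsCas9 cloning F": "AAAAAAGAGCTCGACAAGAAGTACAGCATCGGCCTG",
    "hsCas9 cloning R": "AAAAAAGCATGCGTCGATCCGTGTCTCGTACAGG",
}


def find_sites(seq: str) -> dict:
    """Return {enzyme: 0-based index of first match} for each site present in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


site_report = {name: find_sites(seq) for name, seq in PRIMERS.items()}
for name, hits in site_report.items():
    print(f"{name}: {hits}")
```

In this sketch both forward primers contain a SacI site and both reverse primers contain an SphI site, each directly after the six-adenine 5' tail.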


APPENDIX D: Recipes

Cetyltrimethylammonium bromide (CTAB) extraction buffer
D-Sorbitol                                  140 mM
Tris Base                                   220 mM
EDTA                                        22 mM
NaCl                                        801 mM
Cetyltrimethylammonium bromide              22 mM
Sodium lauroyl sarcosinate (Sarkosyl)       34 mM

Phosphate Buffered Saline (PBS) - pH 7.4
NaCl                                        137 mM
KCl                                         2.7 mM
Na2HPO4                                     10 mM
KH2PO4                                      1.8 mM

MES-SDS – pH 7.3
MES                                         50 mM
Tris Base                                   50 mM
SDS                                         3.5 mM
EDTA                                        1 mM

Tris-acetate-SDS – pH 7.3
Tricine                                     50 mM
Tris Base                                   50 mM
SDS                                         3.5 mM
EDTA                                        1 mM

Coomassie R-250 Stain
Coomassie R-250                             0.3% w/v
Methanol                                    50% v/v
Glacial Acetic Acid                         10% v/v


Coomassie R-250 destain
Methanol                                    10% v/v
Glacial Acetic Acid                         10% v/v

Ponceau Stain
Ponceau S                                   0.3% w/v

Tris Glycine buffer
Glycine                                     192 mM
Tris Base                                   24.7 mM

Tris-Glycine-SDS buffer
Glycine                                     192 mM
Tris Base                                   24.7 mM
SDS                                         3.5 mM

Laemmli Buffer
SDS                                         139 mM
Glycerol                                    2.741 M
Tris-HCl                                    120 mM
Bromophenol Blue                            3 mM

Tris Buffered Saline supplemented with protease inhibitor – pH 7.5
Tris-HCl                                    50 mM
NaCl                                        150 mM
Sigma Protease Inhibitor Cocktail for plant cell and tissue culture extracts (P9599)    1% v/v
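The recipes in this appendix are given as final molar concentrations. As a worked example of converting them to weighed masses, the sketch below computes grams per liter for the PBS recipe. The molar masses assume anhydrous salts, which the recipe does not state; a hydrated form of any salt would need its own molar mass. All names in the code are illustrative.

```python
# Molar masses in g/mol. ASSUMPTION: anhydrous salts (not stated in the recipe).
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
}

# Final concentrations in mM, copied from the PBS recipe in this appendix.
PBS_MM = {
    "NaCl": 137,
    "KCl": 2.7,
    "Na2HPO4": 10,
    "KH2PO4": 1.8,
}


def grams_needed(conc_mm: float, molar_mass: float, volume_l: float = 1.0) -> float:
    """Mass in grams for a given concentration (mM), molar mass (g/mol), and volume (L)."""
    return conc_mm / 1000 * molar_mass * volume_l


pbs_grams = {salt: grams_needed(mm, MOLAR_MASS[salt]) for salt, mm in PBS_MM.items()}
for salt, grams in pbs_grams.items():
    print(f"{salt}: {grams:.2f} g per liter")
```

The same function applies to the other molar recipes once the appropriate molar masses are supplied; the percentage-based recipes (stains and destain) do not need this conversion.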


APPENDIX E: Seed Viability of PTECv3 Transgenics

Table E1 | Comparison of Transgenic Seed Viability. The viability of transgenic A. thaliana seed was assessed by quantifying visible germination on sterile media. Seed used in this experiment was pooled from multiple transgenic T3 or T1 lines. Pooled seed was then screened for the red fluorescence conferred by the mCherry transgene marker, and equal numbers of fluorescent and nonfluorescent seed were collected from each seed pool. Each value was derived from a population of 40 individual seeds: 40 seeds of each sample type (row) were plated on sterile media without selection (MS) and 40 on media supplemented with 3% sucrose (+ 3% Sucrose), for 80 seeds per row. Plated seed was organized into structured sections on each plate, with 4 seeds of each sample type (row) on each plate (10 plates per condition, 20 plates total). Seed was stratified for 48 hours at 4°C and then grown for 14 days in an environmental growth chamber (25°C, 16 h of light).

                                     + 3% Sucrose         MS
                                     7     9     14       7     9     14
col 0                                88%   88%   88%      85%   90%   30%
cpReverse Transcriptase T4 Seed      95%   95%   95%      98%   98%   53%
PTEC 3 T2 Seed
  Non-Fluorescent                    70%   73%   73%      88%   85%   23%
  Fluorescent                        3%    3%    3%       3%    3%    3%
cpReverse Transcriptase T4 Seed Retransformed with PTEC 3
  Non-Fluorescent                    63%   60%   68%      68%   73%   10%
  Fluorescent                        30%   28%   30%      38%   40%   0%
cpReverse Transcriptase T4 Seed Retransformed with PTEC 4
  Non-Fluorescent                    58%   58%   58%      55%   58%   5%
  Fluorescent                        50%   55%   55%      68%   68%   3%
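Because each value in Table E1 is derived from 40 seeds, the reported percentages can be mapped back to seed counts. The sketch below does this under one assumption that the source does not state: that each percentage is the germinated count divided by 40, rounded to a whole number with halves rounded up. The function names are hypothetical.

```python
SEEDS_PER_CELL = 40  # stated in the Table E1 caption


def percent_half_up(count: int, n: int = SEEDS_PER_CELL) -> int:
    """Whole-number percentage of count/n, rounding halves up (integer arithmetic)."""
    return (200 * count + n) // (2 * n)


def counts_matching(percent: int, n: int = SEEDS_PER_CELL) -> list:
    """All seed counts out of n that would be reported as the given percentage."""
    return [c for c in range(n + 1) if percent_half_up(c, n) == percent]


# A few percentages taken from Table E1.
for reported in (88, 73, 53, 3, 0):
    print(f"{reported}% -> count(s) out of {SEEDS_PER_CELL}: {counts_matching(reported)}")
```

Under this rounding assumption, a reported 3% corresponds to a single germinated seed out of 40, which helps put the low values for fluorescent PTEC 3 seed in context.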


APPENDIX F: Screening of lines carrying PTEC and RT Transgenes


Figure F1 | Fluorescence screen and planting of lines carrying PTEC and reverse transcriptase transgenes. Seed resulting from floral dip of T3 lines expressing cpReverse Transcriptase protein is shown (A, B). Transgenic seed from this second transformation event was screened for mCherry fluorescence and then either planted on soil, or sterilized and plated on MS containing 3% sucrose (C, D). The seed and plants shown are the outcomes of two separate transformations with either PTECv3 (A, C) or PTECv4 (B, D).



Figure F2 | PTEC Screen. Plants grown from fluorescent seed were genotyped to confirm the presence of the PTEC transgene (A-E). Positive transformants yielded a single band of ~1 kb. A no-template control (NT) and col 0 material were used as negative controls for amplification. The plasmids used for transformation (4, 5) were used as positive controls.



Figure F3 | Reverse Transcriptase Screen. Plants grown from fluorescent seed were genotyped to confirm the presence of the cpReverse Transcriptase transgene (A-D). Positive transformants yielded a single band of ~2.5 kb. A no-template control (NT) and col 0 material were used as negative controls for amplification. The plasmid used for transformation (+) was used as a positive control.


Table F1 | Transgene Screening Summary. The table below summarizes the results of genotyping putative transgenics. Only plants that amplified both the PTEC and Reverse Transcriptase genes were screened for the presence of an integrated plastid transgene.

                                                                         RT   AADA   both
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on SOIL, #1    N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #1    Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #2    Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #3    Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #4    Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #5    N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #6    Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #7    N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #8    Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #9    Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #10   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #11   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #12   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #13   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #14   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #15   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #16   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #17   N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #18   N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #19   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #20   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #21   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #22   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #23   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #24   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #25   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #26   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #27   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on SOIL, #28   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #1   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #2   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #3   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #4   N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #5   N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #6   N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #7   N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #11  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #14  N    Y      N


Table F1 (continued).

                                                                         RT   AADA   both
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #17  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #21  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #22  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #31  N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #33  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #34  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #35  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #36  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv3 T1, germinated on sugar, #37  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #8   Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #9   Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #10  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #12  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #13  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #15  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #16  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #18  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #19  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #20  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #23  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #24  N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #25  N    Y      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #26  N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #27  N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #28  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #29  N    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #30  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #32  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #38  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #39  Y    Y      Y
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #40  Y    N      N
PCGW BAR RT line 18 T4 Crossed with PTECv4 T1, germinated on sugar, #41  Y    N      N
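In every row of Table F1 the "both" column is Y only when the RT and AADA results are both Y. A minimal sketch of that rule, applied to a short excerpt of the table (the first eight soil-germinated PTECv4 lines), is given below. The shortened row labels and all function names are illustrative only, and the tally covers the excerpt rather than the full table.

```python
from collections import Counter

# (shortened label, RT result, AADA result) for PTECv4 soil-germinated lines #1-#8.
rows = [
    ("PTECv4 soil #1", "Y", "Y"),
    ("PTECv4 soil #2", "Y", "N"),
    ("PTECv4 soil #3", "Y", "Y"),
    ("PTECv4 soil #4", "Y", "Y"),
    ("PTECv4 soil #5", "N", "Y"),
    ("PTECv4 soil #6", "Y", "Y"),
    ("PTECv4 soil #7", "N", "N"),
    ("PTECv4 soil #8", "Y", "Y"),
]


def both(rt: str, aada: str) -> str:
    """Y only if both screens were positive, otherwise N."""
    return "Y" if rt == "Y" and aada == "Y" else "N"


tally = Counter(both(rt, aada) for _, rt, aada in rows)
candidates = [label for label, rt, aada in rows if both(rt, aada) == "Y"]

print(dict(tally))
print(candidates)
```

Extending the `rows` list to the full table would give the complete set of lines carried forward to the secondary integration screen.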



Figure F4 | Secondary PCR screens for integration. Lines confirmed to carry both the PTEC and Reverse Transcriptase transgenes were screened with several additional primer sets to detect integrated plastid transgenes (A-F). A positive event should yield a band of ~2.5 kb. However, the only samples that produced this band were wild-type or transgenic-background negative controls.