RICE UNIVERSITY

By

Anna Guseva

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE

Master of Science

APPROVED, THESIS COMMITTEE

Jonathon Silberg Jonathon Silberg (Jun 8, 2020 14:43 CDT) Joff Silberg

George Bennett George Bennett (Jun 11, 2020 21:17 CDT)

George Bennett

Caroline Ajo-Franklin

HOUSTON, TEXAS June 2020

ABSTRACT

Flavodoxin protein electron carriers: bioinformatic analysis and interactions with sulfite reductases

by Anna Guseva

Flavodoxins (Flds) are that distribute electrons to different

metabolic pathways through interactions with an array of partner proteins. The aim

of my thesis is to understand Fld evolution, establish whether Flds are encoded

within the same genomes as Fd-dependent sulfite reductases (SIRs), and

demonstrate that a cellular assay can monitor Fld electron transfer (ET) to SIRs.

Using bioinformatics, I identify numerous microbes whose genomes encode both

Fld and SIR genes. Additionally, I show that Flds can support ET to SIR using a synthetic pathway where protein-mediated ET is monitored using the growth of an

Escherichia coli auxotroph that depends upon Fld transferring electrons from a

Fd:NADP+ reductase to SIR. My results represent the first evidence that Flds support ET to assimilatory SIRs. Additionally, they show how a synthetic ET pathway in cells can be leveraged to rapidly compare the ET efficiencies of different Flds.

ii

Acknowledgments

While my advisor, Dr. Joff Silberg, is known for saying that PhD is not a sprint but a marathon, my accelerated Master’s program sometimes felt like a marathon that you run as if it were a sprint. Balancing my research with classes and other activities in order to complete this thesis would not be possible without the support of many mentors, members of the lab, friends, and family.

First, I would like to acknowledge my advisor, Dr. Silberg, who guided me through the design and implementation of my research project, and was always open to discussing my ideas, questions, and concerns. I would especially like to thank Dr. Silberg for constantly being more optimistic about my results than I was. In our meetings, he helped me learn a lot about extracting useful information from the data that did not fit my expectations and adjusting my plans based on new results. I am very grateful that he also supported me in pursuing other projects in addition to my research, such as my participation in iGEM

(International Genetically Engineered Machine) competition.

Another person without whom I would not have even started my research project is PhD student Jordan Bluford, who has been my graduate student mentor. His mentorship allowed me to learn many essential skills and develop independence as a researcher. He has been remarkably patient in helping me fix my mistakes, improve my writing and presentation skills, and develop confidence in my work. Jordan also gave me a piece of advice that became, without a doubt, the most useful strategy in my thesis work: “Sometimes, when we float in the oceans of despair, it’s ok to just take a break.”

iii

At all stages of my project, I received valuable advice and mentorship from other members of the lab. I would especially like to thank Dr. Ian Campbell and

Dr. Josh Atkinson for teaching me many of the techniques central to my thesis, and being extremely patient in helping me troubleshoot the problems with my experiments. I also thank Ilenne Del Valle for her amazing advice on data visualization and science communication. Other lab members and alumni also provided great help and support throughout my project: Tasfia Azim, Emm Fulk,

Annelise Goldman, Dongkuk Huh, Jing Jin, Dimithree Kahanda, Prashant

Kalvapalle, Jinyoung Kim, Jayaker Kolli, Li Chieh Lu, Andrea Padron, Kiara

Reyes, Swetha Sridhar, Emily Thomas, Albert Truong, Jennifer Vu, and Liz

Windham. I also want to thank Shyam Bhakta and Dr. David Zong from the

Bennett Lab at Rice for teaching me many useful lab techniques.

Outside of the lab, I would like to thank my committee members Dr.

George Bennett, Dr. Caroline Ajo-Franklin, and Dr. Matthew Bennett. I also want to acknowledge the BioSciences department administrators Dr. Susan Cates and

Dr. Rachael Eaton, who helped me keep track of completing the degree requirements. In addition, I want to thank many teachers, who helped me develop skills needed to complete this thesis. Especially Dr. Beth Beason and Dr.

Collin Thomas, for individually caring about each student and making their classes extremely interesting.

Finally, I thank my family and friends for supporting me throughout my project.

iv

My last and special thanks goes to the friends who helped maintain the creative atmosphere of our research building, Keck Hall, by founding and contributing to the Keck Museum of Modern Arts, as well as Keck Hall Book club

– some of the many initiatives that provided great comfort and inspiration.

v

Contents

Acknowledgments ...... iii Contents ...... vi List of Figures ...... viii List of Tables ...... x Abbreviations ...... xi Introduction ...... 1 1.1. Flavodoxins are electron carriers with diverse metabolic roles ...... 1 1.2. Flavodoxins evolved to replace ferredoxins in iron limitation and oxidative stress conditions ...... 5 1.3. In E. coli, flavodoxins have diverse roles in metabolism ...... 9 1.4. Insights into flavodoxin structure and interactions have come from in vitro studies ...... 12 1.5. Fld is a promising component for biotechnology and bioelectronics ...... 14 1.6. Selections can be used to accelerate studies of protein electron carriers 15 1.7. Hypothesis tested in this research ...... 18 Methods ...... 21 2.1. Materials ...... 21 2.2. Plasmid construction ...... 22 2.3. Auxotroph growth assay ...... 22 2.4. Flow cytometry ...... 23 2.5. Construction of the sequence similarity network and analysis of organism distribution ...... 24 2.6. Statistics ...... 26 Results ...... 27 3.1. Bioinformatics: identifying flavodoxins that might interact with sulfite reductases ...... 27 3.2. Constructs for testing flavodoxin-mediated electron transfer to sulfite reductases ...... 39 3.3. Cyanobacterial flavodoxins support electron transfer to ferredoxin- dependent sulfite reductases ...... 41

vi

vii

3.4. RFP fusion can be used to measure intracellular flavodoxin concentrations ...... 45 Future directions ...... 56 4.1. Investigating the interactions of flavodoxins with cyanobacterial sulfite reductases ...... 56 4.2. Designing protein switches using flavodoxins ...... 57 4.3 Investigating the relationship between cycling efficiency and electron transfer rates measured in vitro ...... 58 References ...... 60 Appendix A...... 70 Appendix B...... 74 Appendix C...... 75 Appendix D...... 76

vii

List of Figures

Figure 1. The mechanism of ET by FMN ...... 3

Figure 2. Structural and electrostatic comparison of (A) Fd and (B) Fld ...... 8

Figure 3. Flavodoxin partners in E. coli ...... 11

Figure 4. E. coli EW11 sulfide auxotroph assay ...... 17

Figure 5. Fld sequence similarity network ...... 32

Figure 6. Sub-network of Flds ...... 35

Figure 7. Distribution of proteins from the results of P. marinus SIR BLAST across genomes containing Flds from the sub-network ...... 38

Figure 8. Constructs used for expressing the components of the synthetic pathway ...... 40

Figure 9. N-terminal fusion of RFP to Fld...... 42

Figure 10. The workflow for assessing auxotroph growth complementation with Flds ...... 44

Figure 11. Growth complementation observed for different constructs with the concentration of aTc of 100 ng/mL ...... 47

Figure 12. Comparison of growth complementation for cells with 100 ng/mL aTc and uninduced controls ...... 48

Figure 13. The effect of varying aTc concentrations on the auxotroph growth ...... 49

Figure 14. The effect of varying aTc concentrations on the auxotroph growth for different RBSs ...... 51

Figure 15. Workflow for calibrating the flow cytometry data ...... 52

Figure 16. Fluorescence distributions for cells expressing RFP::Fld at different aTc concentrations ...... 54

viii

ix

Figure 17. aTc-dependent growth and OD/fluorescence ratios for Flds from Synechocystis sp., A, marina, and Nostoc sp. measured after 48 hours of growth in M9sa media ...... 55

Figure 18. Radial phylogram showing the distribution of organisms from cluster (1) of the Fld sub-network ...... 70

Figure 19. Radial phylogram showing the distribution of organisms from cluster (2) of the Fld sub-network ...... 71

Figure 20. Radial phylogram showing the distribution of organisms from cluster (3) of the Fld sub-network ...... 72

Figure 21. Radial phylogram showing the distribution of organisms from cluster (4) of the Fld sub-network ...... 73

Figure 22. Sequence alignment of cyanobacterial Flds and E. coli Fld ...... 75

Figure 23. Sequence alignment of cyanobacterial SIRs (A), cyanobacterial Flds (B), and (C) Flds from Nostoc punctiforme ...... 76

ix

List of Tables

Table 1. The diversity of oxidoreductases that have been shown to couple with Flds...... 4

Table 2. Number of Flds, Fds, SIRs, and FNRs across the organisms containing both Fld and Fd-dependent SIR...... 30

Table 3. Alignment scores of SIRs from organisms containing Flds with SIR from Z. mays ...... 74

x

Abbreviations

aTc anhydrotetracycline

ET electron transfer

Fd ferredoxin

Fld flavodoxin

FMN flavin mononucleotide

FNR ferredoxin-NADP+ reductase

OD optical density

PEC protein electron carrier

PSI photosystem I

RBS ribosome

RFP red fluorescent protein

RNR

SIR sulfite reductase

SSN sequence similarity network

TIR translation initiation rate

xi

Chapter 1

Introduction

1.1. Flavodoxins are electron carriers with diverse metabolic

roles

Flds are small (15-22 kDa) PECs that participate in photosynthesis and

other important metabolic processes in bacteria, archaea, and lower eukaryotes

such as algae (Pierella et al., 2014). The electron transfer (ET) mediated by Flds

depends upon a flavin mononucleotide (FMN), which binds to Flds to

create a functional holoprotein (Frago et al., 2010). In cells, FMN is made from riboflavin by riboflavin kinase, and consists of isoalloxazine ring system, the ribitol group, and a phosphate. While the isoalloxazine ring is able to accept up to two protons and two electrons (Figure 1), forming reduced FMNH2, experimental

studies showed that FMN in Flds typically functions as a single-electron carrier

(Lodeyro et al., 2012). The midpoint reduction potentials (E°) of Flds are most

commonly found in the range of -300 to -450 mV, similar to plant-type ferredoxins 1

(Fds), although higher (E° = -226 mV for Citrobacter braakii Fld) and lower (E° = -

522 mv for Azotobacter chroococcum Fld) potentials have been observed

(Deistung and Thorneley, 1986, Kimmich et al., 2007). The wide range of redox

potentials reflects the compatibility of Fld with a large diversity of partner proteins.

Flds evolved to support ET to a large variety of partner oxidoreductases

(Table 1). Some examples of cellular processes supported by Flds include nitrogen fixation (Pence et al, 2017), carbon fixation (Nakayama et al, 2013), and amino

acid synthesis (Vigara et al., 1998). Given their low midpoint potential, they may

be able to interact with additional partner oxidoreductases beyond those directly

demonstrated through biochemical and cellular studies. Fd electron carriers, which

present a similar low midpoint potential, have been found to support ET to >80

different oxidoreductases (Atkinson et al., 2016). Flds may transfer electrons to a

range of Fd partners in cells, such as Fd-dependent SIR, CO dehydrogenase,

glycine reductase, heme oxygenase, and others (Atkinson et al., 2016). In some

cases, this coupling could occur in nature, and in others it may be possible using

synthetic biology.

2

Figure 1. The mechanism of ET by FMN. FMN can accept two electrons, but due to the Fld stabilization of semiquinone form, Fld typically transitions between simiquinone and hydroquinone states and functions as a single-electron carrier.

3

Table 1. The diversity of oxidoreductases that have been shown to couple with

Flds. PFOR: pyruvate ferredoxin ; FNR: ferredoxin-NADP+ reductase; FQR: flavodoxin: quinone reductase.

Partner Organism example Reference

ribonucleotide reductase E. coli Bianchi (1993) pyruvate formate- activating E. coli Crain (2013) cobalamin-dependent methionine synthase E. coli Hall (2000)

P450c17 E. coli Jenkins (1994)

biotin reductase E. coli Birch (2000)

GcpE E. coli Puan (2005)

PFOR E. coli Nakayama (2013)

PSI Synechococcus sp. Mühlenhoff (1996)

nitrite reductase Chlorella fusca Vigara (1998)

nitrate reductase Azotobacter vinelandii Gangeswaran (1996)

Azotobacter vinelandii Pence (2017)

bisulfite reductase Desulfovibrio vulgaris Drake (1977)

Glutamate synthetase Chlorella fusca Vigara (1998)

FNR Bacillus cereus Gudim (2018)

acyl lipid desaturase Bacillus subtilis Chazarreta-Cifre (2011)

[NiFe]-Hydrogenase Synechocystis sp. Gutekunst (2014)

FQR Helicobacter pylori St Maurice (2007)

4

1.2. Flavodoxins evolved to replace ferredoxins in iron

limitation and oxidative stress conditions

Flds have a fascinating evolutionary history. Like iron-sulfur cluster

containing Fds, Flds are thought to have arisen in the early anaerobic environment

of the Earth (Pierella et al., 2014). The rising levels of oxygen in the early Earth

would have been detrimental to the Fe-S cluster of Fds, so Flds with their organic

FMN cofactor became more widespread as a defense mechanism against iron limitation and oxidative stress that arose (Lodeyro 2012). While there is evidence that Fds evolved from small iron-sulfur cluster binding peptides, the mechanism by which Flds evolved is less clear. There is emerging evidence that the Flds and Fds shared a common ancestral fold that diverged into the two different classes of

PECs within ancient microbes (Raanan, 2020).

Fld distribution across the tree of life likely arose through horizontal gene

transfer among Bacteria, Archaea, and Eukarya (Pierella et al., 2015).

Phylogenetic analysis demonstrates that during their evolution, Flds diverged into

two groups, including long-chain and short chain Flds. These Fld family members

are distinguished by the presence or absence of a 20-amino acid loop (Sancho,

2006), which is suggested to play a role in partner recognition (Pierella et al.,

2014). Among eukaryotes, Flds are most common in algae, where they are

localized in chloroplasts (Pierella et al., 2015). In bacteria, Flds are most

widespread in Cyanobacteria, Proteobacteria, Bacteroidetes, and Firmicutes

(Campbell et al., 2019), where they are thought to play a role in protection against

5

environmental stress, replacing Fd or participating in metabolism as essential proteins.

In many organisms, Fld expression is controlled by iron availability or other environmental conditions that downregulate Fd expression. In Anabaena sp., a cyanobacterium, Fld expression is controlled by the FurA transcriptional regulator, whose levels increase when iron is scarce (González et al., 2014). In E. coli, Fld is controlled by the SoxRS regulon, and its expression is induced by SoxR transcription factor when bacteria experiences oxidative stress that can affect iron availability for oxidoreductases like Fds. Fld is thought to mitigate oxidative stress by delivering electrons to reduce the oxidised Fe-S clusters of such as ribonucleotide reductase and methionine synthase (Duval and Lister, 2013).

Despite the advantage of protecting the organisms against environmental stress,

Flds were not retained during the evolution of terrestrial plants and are absent from higher eukaryotes. This can be contrasted with Fds, which are common in all domains of life. The lack of Flds in plants may appear surprising, since Fd levels in plants also decline in response to iron-scarce conditions. Moreover, studies examining the expression of recombinant Flds in plants have demonstrated that they improve resistance to iron limitation and oxidative stress (Peña et al., 2010,

Zurbriggen et al., 2008, Tognetti et al., 2006). The explanation for this trend in Fld evolution can be inferred from the distribution of Fld-containing organisms.

Bacteria and algae expressing Flds are typically found in iron-scarce habitats, while organisms found in areas with higher levels of iron usually lack Flds. Based on this observation, the currently accepted hypothesis explaining Fld distribution

6

refers to the theory that plants evolved from phototrophs inhabiting iron-rich coastal

regions, where selective pressure to retain Flds was low (Pierella et al., 2014).

Despite having an overlapping set of partner proteins and similar functions,

Fds and Flds differ in their sizes, primary structures, and tertiary structures (Figure

2). This observation raises an interesting question about what features allow this

overlap in partner oxidoreductases despite structural differences. To explore this question, a study by Ullmann and coworkers performed electrostatic alignment between Fd and Fld from Anabaena (Ullmann et al., 2009). Atomic charges on the crystal structures were used to calculate the distribution of electrostatic potentials for Fld and Fd. The values from the two arrays were used to calculate Hodgkin index (Hodgkin and Graham, 1987). Maximizing the index allowed the researchers to find the optimal electrostatic alignments. Among the alignments with a similar

Hodgkin index, two were selected based on the largest structural overlap. The results of the study demonstrated that these alignments result in the overlap of the

active sites for Fd and Fld interaction with FNR. Their conclusions suggested that

electrostatic interactions may play a key role in Fld partner recognition and binding,

and may be the basis of functional equivalence of Fd and Fld.

7

Figure 2. Structural and electrostatic comparison of (A) Fd and (B) Fld.

Electrostatic potential is in dimensionless units of kT/e, where k is Boltzmann's constant, T is temperature (298 K), and e is the charge of an electron. Potentials were calculated using APBS (Baker et al., 2001). Red highlight indicates negatively charged amino acids, and blue highlight indicates positively charged amino acids.

Fd is from Z. mays (3b2f) and Fld is from Nostoc sp. (1obo).

8

1.3. In E. coli, flavodoxins have diverse roles in metabolism

In cyanobacteria and algae, Fld is most commonly used as an alternative

PEC to Fd. In contrast, Flds can be essential in some proteobacteria for cell

survival (Puan et al., 2005, Freigang et al., 2002). The five Flds in E. coli are

relatively well-characterized compared with those of other organisms (Figure 3), although gaps in our knowledge about a subset of the E. coli Flds and their functions remain. The network of metabolic processes mediated by E. coli Flds

highlights their importance as ET hubs in cells. The most extensively studied Fld

in E. coli is FldA (long-chain variant), which is essential for cell growth because it

is needed to support the mevalonate pathway for isoprenoid biosynthesis (Puan et

al., 2005). In this pathway, FldA is thought to transfer electrons to enzyme GcpE,

which synthesizes isoprenoid precursor HMBPP ((E)-4-Hydroxy-3-methyl-but-2-

enyl pyrophosphate). FldA also provides reducing equivalents to a number of other

partners, such as methionine synthase (Hall et al., 2005) and anaerobic

ribonucleotide reductase (Bianchi et al., 1993). Electron donors to FldA include

Flavodoxin-NADP reductase Fpr (Krapp et al., 2002) and pyruvate:flavodoxin

oxidoreductase Ydbk. The other E. coli Flds are less well-characterized. MioC is a

short-chain Fld (Birch et al., 2000) that is thought to transfer electrons to biotin

synthase, which converts dethiobiotin to biotin. NrdI is another short-chain Fld,

which is thought to transfer electrons to diferric-tyrosyl radical cofactor of class Ib

(aerobic) ribonucleotide reductase (RNR) (Cotruvo et al., 2008). The electron

donor to NrdI is unknown. The functions of two other E.coli Flds, FldB (Ye and Fu,

2014) and Yqca (Ye and Hu, 2014), have not been established.

9

The genomic context of some of the E. coli Flds suggests a role in the oxidative stress response. While the role of FldB remains unknown, it was proposed that some of the FldA partner interactions help cells recover from oxidative stress (Gaudu and Weiss, 2000). Another Fld, NrdI, is found in the

NrdHIEF operon, and the RNR in this operon (NrdEF; class Ib type) interacts with

NrdI under conditions of iron limitation and oxidative stress, replacing a class Ia

RNR (NrdAB), which is maintained by Fd. The cross-reactivity of E. coli Flds is still not fully established. While it was shown that FldA cannot be replaced by other native Flds, further studies are needed to determine the specificity of the other four

E. coli Flds.

Flds are essential in other bacteria beyond E. coli, such as Bacillus subtilis

(Gudim et al., 2018) and Helicobacter pylori (Freigang et al., 2002). This observation makes Fld a promising drug target (Salillas and Sancho, 2020), and it suggests the possibility of creating auxotrophic strains to investigate growth complementation for different Fld types and their specificity in ET reactions. For example, in E. coli, FldA knockout strain can be used to study the interactions between heterologously expressed Flds and GcpE. Since ET to GcpE is required in the pathway for the biosynthesis of isoprenoid precursor, FldA knockout strain will be unable to grow in minimum media, and introducing alternative Flds may be able to rescue cell growth. Growth rates can be measured as a proxy for Fld ET efficiency to GcpE.

10

Figure 3. Fld partners in E. coli. FldA is an essential protein in E. coli, where it participates in isoprenoid biosynthesis, oxidative stress response, and other pathways. MioC functions to maintain the activity of biotin synthase, and NrdI provides electrons to RNR Ib. The functions of other Flds are unknown.

11

1.4. Insights into flavodoxin structure and interactions have

come from in vitro studies

Much of our understanding of Fld biochemistry has come from in vitro

studies. One question investigated by these studies has been the role of the

FMN in Fld-mediated ET. These efforts demonstrated that while Flds may in theory accept two electrons, they usually function as a single-electron carrier, with FMN transitioning between semiquinone and hydroquinone forms (Lodeyro et al., 2012). Mutational studies showed that FMN binding to Fld is driven by hydrophobic interactions, mainly centered at the isoalloxazine ring of the FMN

(Sancho, 2006). To achieve single-electron transfer, Flds modulate the redox potential of the FMN by having higher binding affinity to the semiquinone FMN

(Lostao et al., 1997). These results were obtained by observing the effect of mutating different residues at the FMN-binding site and evaluating the details of

FMN binding to Fld using X-ray crystallography. In addition, a number of Fld structures have been solved to date, including Flds from Nostoc sp. (Lostao et al., 2003), Clostridium beijerinckii (Ludwig et al., 1997), Desulfovibrio desulfuricans (Romero et al., 1996), and Helicobacter pylori (Freigang et al.,

2002).

Previous studies also assessed the physicochemical properties that control

Fld-partner interactions. The most intensively studied Fld partners are FNR and photosystem I (PSI), whose molecular mechanisms of binding have been characterized (Nogués et al., 2003, Frago et al., 2010, Goñi et al., 2009). For example, Nogués and coworkers analyzed the role of two Fld hydrophobic

12

residues in the vicinity of FMN. Using stopped-flow measurements, the authors compared how mutating those residues affects the kinetics of the Fld-FMN interaction. Laser-flash photolysis was used to study the interaction of Fld partners with PSI. Frago and coworkers studied the effect of varying the Fld redox potentials using FMN derivatives, where some methyl groups were replaced by chlorine and hydrogen. Goñi and coworkers analyzed the effect of multiple charge reversal mutations of the Fld. Taken together, these studies demonstrated that electrostatic and hydrophobic interactions play the major role in establishing Fld-FNR and Fld-

PSI interactions necessary for the ET, with Fld binding to positively charged areas of FNR or PSI (Nogués et al., 2005). They also showed that altering the FMN- binding residues can be used to modulate Fld midpoint potentials and change the

ET efficiency (Goñi et al., 2009).

In vitro studies provided important information about Fld functions and properties. However, many gaps in our knowledge about Flds still remain. While many Flds can interact with a wide range of partners, others appear to me more specific, and the reasons behind this variation in specificity remain unknown. A list of characterized Fld partners (Table 1) is still limited, and further studies are needed to determine whether Fld can replace Fd in interaction with other Fd partners (Atkinson et al., 2016). In addition, most studies have focused on Flds from bacteria, and Flds from other domains remain poorly understood. For example, there is only one study discussing Fld from archaea (Prakash et al.,

2019). Finally, more in vivo studies are required to investigate Fld behavior in living

13

cells and the applications of engineered Flds in biotechnology and metabolic engineering.

1.5. Fld is a promising component for biotechnology and

bioelectronics

A better understanding of Fld-mediated ET is critical to enable the future use of Flds to manipulate ET in cells for applications in metabolic engineering, bioelectrochemical systems, and plant biotechnology. In metabolic engineering, yield depends not only on enzymes expressed in a pathway, but also on the availability of cofactors that provide reducing equivalents to support chemical reactions (Zhao et al., 2017). Maintaining redox balance and homeostasis is therefore necessary for efficient product synthesis. Tuning the expression of redox proteins can be used to improve redox balance and product output in applications such as production of valuable chemicals and fuels under fermentative conditions

(Zhao et al., 2017, Catlett et al., 2015, Kallio et al., 2014). For example, Catlett and coworkers demonstrated that the overexpression of heterodisulfide reductase in

Methanosarcina acetivorans allows to bypass the rate-limiting redox reactions and increases the rate of methane output. PECs, such as Fds, have also been used for pathway optimization. Kallio and coworkers showed that the expression of a cyanobacterial Fd in E. coli within a synthetic pathway for propane biosynthesis enhances the electron supply to aldehyde deformylating oxygenase and increases the efficiency of propane synthesis. Despite these efforts, no rules for choosing specific PECs for heterologous expression have been developed. More studies on

14

partner interactions are needed to develop improved approaches for controlling the electron fluxome by expressing non-native PECs in biotechnology.

Flds are promising candidates for future applications in biotechnology because they are able to interact with a wide range of partner proteins and display varying specificities. The use of Flds in biotechnology has been demonstrated in plants, where their expression has been used to improve stress tolerance (Ceccoli et al., 2012, Peña et al., 2010, Tognetti et al., 2006). In plants, Fld expression confers resistance to oxidative stress by allowing for ET under conditions where

Fds are not sufficiently active, such as conditions of iron limitation, heat, and drought (Zurbriggen et al., 2008). Under stress conditions, Fd is downregulated, so Fld targeted to chloroplasts takes its role by receiving electrons from PSI.

Following their reduction by PSI, reduced Flds can then interact with FNR, providing electrons for NADPH regeneration, which is required for many essential metabolic processes (Giró et al., 2011). Apart from applications in plants, diverse roles of Flds have been found in carbon, nitrogen, and sulfur metabolism in bacteria, suggesting that engineered Flds could be useful in a wide range of biosynthetic pathways.

1.6. Selections can be used to accelerate studies of protein

electron carriers

To enable the studies of ET-mediated by PECs in living cells, an E. coli strain with a growth defect was previously developed (Barstow et al., 2011,

Atkinson et al., 2019). With this EW11 strain, the growth depends upon PEC ET

15

because growth in minimal medium containing sulfate as a sulfur source requires the transfer of electrons from FNR to a Fd-dependent SIR. E. coli EW11 is unable to produce sulfide required for the synthesis of essential metabolites such as cysteine and methionine when grown on sulfate or sulfite unless the synthetic pathway is functional (Figure 4). To minimize the electron leak to off-pathway routes, E. coli EW11 has knockouts of genes encoding major Fd partner proteins, including ferredoxin-NAD+ reductase, pyruvate-flavodoxin oxidoreductase,

flavodoxin-NADP+ reductase, hydrogenase, and others (listed in Barstow et al.,

2011). Previous results showed that Fds from different organisms, such as

Mastigocladus laminosus, Zea mays, and Chlamydomonas reinhardtii, are able to

complement auxotroph growth when coexpressed with corn FNR and SIR

(Atkinson et al., 2019). This finding illustrates the power of using this cellular assay

to study the partner specificity of natural and non-natural partner proteins.

It is currently unknown whether Flds can support ET to an assimilatory SIR

such as Fd-dependent SIR expressed in the pathway; it was also not directly tested

whether Fld is able to accept electrons from Z. mays FNR. However, it was

previously shown that Fld plays a role in a dissimilatory pathway for bisulfite

reduction in sulfate-reducing bacteria, like Desulfovibrio vulgaris (Drake et al.,

1977) and Desulfovibrio gigas (Hatchikian et al., 1972). In these organisms, Fld

supports ET to bisulfite reductase. The ET between Flds and bacterial FNRs is well characterized. Flds have been shown to interact with FNR from tobacco

(Tognetti, 2006) and mammalian cells (Zöllner 2004).

16

Figure 4. E. coli EW11 sulfide auxotroph assay. The sulfate provided in the media is converted to sulfite through a series of intermediates (APS and PAPS).

PEC transfers electrons from FNR to SIR, which catalyzes the conversion of sulfite to sulfide necessary for the production of essential compounds such as cysteine, methionine, thiamine, Fe-S clusters, and others, leading to cell growth.

17

The ability of Fld to interact with those structurally distinct FNRs suggests that it

may be capable of accepting electrons from Z. mays FNR.

1.7. Hypothesis tested in this research

In my thesis, I focus on the interaction between cyanobacterial Flds and

SIRs. A number of Flds from cyanobacteria, such as Flds from Nostoc sp. and

Synechocystis sp. (Bottin, 1992, Kutzki, 1998), have been extensively characterized. Crystal structures of Flds from several cyanobacteria, including

Nostoc sp. (Rao et al., 1992) and Synechococcus elongatus (Drennan et al., 1999) were solved. It was shown that Nostoc sp., Synechocystis sp., and S.elongatus have similar redox potentials around -430 mV (Hoover et al., 1999, Bottin et al.,

1992, Paulsen et al., 1990). Partner interactions were studied for Flds from

Synechocystis sp. and Nostoc sp. It was shown that Fld from Synechocystis sp. is able to interact with NiFe hydrogenase (Gutekunst et al., 2014) and PSI (Meimberg et al., 1998), and Flds from Nostoc sp. and S. elongatus interact with FNR and PSI

(Nogués et al., 2005), (Mühlenhoff et al., 1996). In addition, it was shown that Flds from some cyanobacteria participate in nitrogen fixation and hydrogen metabolism

(Lodeyro et al., 2012).

Given their importance as ET hubs and their ability to confer stress resistance, cyanobacterial Flds can serve as useful tools in metabolic engineering and other applications. However, their in vivo characterization is lacking, and no rules for choosing specific Flds for applications in different synthetic pathways have been developed. Our knowledge of metabolic roles of the cyanobacterial Flds

18

is also limited. While many studies explored their function in photosynthesis and nitrogen assimilation, the role of Flds in other pathways like sulfur assimilation are unknown.

The aim of my research is to develop a high-throughput method for measuring ET efficiency of Flds in living cells. To achieve this, I have utilized the synthetic pathway expressed in E. coli sulfide auxotroph (Figure 4), which links Fld activity to cell growth. Studies of Flds in this pathway have allowed me to test the hypothesis that cyanobacterial Flds evolved to support ET to Fd-dependent SIR.

Fd-dependent SIR plays a central role in the pathway for sulfur assimilation in cyanobacteria (Schmidt et al., 1988). I hypothesized that Flds can replace Fd in supporting ET to Fd-dependent SIR to maintain critical sulfur metabolism reactions under iron-limited conditions. To evaluate whether Flds could have evolved to interact with Fd-dependent SIR, I used a genome database search and analysis of sequence similarity networks to identify the organisms that co-express Flds and

Fd-dependent SIRs. To test whether these proteins interact, I then expressed cyanobacterial Flds in E. coli EW11. My results show that this cellular assay can be used as a simple strategy to identify whether Fld expression is able to rescue auxotroph growth.

The final aim of my thesis is to develop a quantitative metric for assessing in vivo ET efficiencies of Flds. This allows me to not only screen for Flds that are able to interact with Fd-dependent SIRs, but also to compare the estimated ET rates among different Flds. The metric I propose is defined as “cycling efficiency,” which represents Fld growth normalized over intracellular Fld concentration. To

19

quantify the concentration of Fld in a cell, I build RFP-Fld fusion constructs, which allows me to estimate Fld expression levels using flow cytometry. I am comparing the cellular cycling efficiencies of Flds with divergent sequences but similar midpoint potentials. Achieving this serves as the first demonstration of in vivo

characterization of Fld ET. The strategy of using growth selection in combination

with measuring intracellular PEC levels can be leveraged to compare ET

efficiencies among different types of PECs, for example Fds and Flds.

20

Chapter 2

Methods

2.1. Materials

Oligonucleotides for PCR were ordered from Integrated DNA

Technologies. Chloramphenicol, kanamycin, and streptomycin were from

Research Products International. T4 DNA and BsaI restriction enzyme were from New England Biolabs. Phusion DNA polymerase was from Thermo Fisher

Scientific. Anhydrotetracycline (aTc) and all other chemicals were purchased from

Sigma-Aldrich. Flow cytometry Spherotech Rainbow Calibration Particles were from Spherotech. E. coli EW11 was a gift from Pam Silver (Harvard University)

(Barstow et al., 2011).

21

2.2. Plasmid construction

DNA sequences for Synechocystis sp. Fld, A. marina Fld, E. coli Fld, and

Nostoc sp. Fld were synthesized as G-blocks by Integrated DNA Technologies.

The sequences for Z. mays FNR and Z. mays SIR were provided by Pamela Silver laboratory at Harvard University (Barstow et al., 2011). The plasmids for Fld expression were constructed using Golden Gate assembly (Engler 2008) from

DNA fragments that were PCR amplified using Phusion DNA polymerase (Thermo

Fisher). All plasmids were sequence verified.

2.3. Auxotroph growth assay

E. coli EW11 competent cells were prepared using a standard protocol

(Sambrook et al., 2006). The cells were transformed with a plasmid expressing

Z.mays FNR and SIR, and a second plasmid expressing Fld, Fd, or a negative

control mutant Fd C47A. The colonies were inoculated into a deep-well 96-well

plate with nonselective M9c medium containing sulfide source, with 34 μg/mL

chloramphenicol and 100 μg/mL streptomycin. The composition of M9c medium

includes sodium chloride (0.5 g/L), sodium phosphate, monobasic (3 g/L), 2%,

sodium phosphate, dibasic (6.8 g/L), calcium chloride (0.1 mM), glucose,

ammonium chloride (1 g/L), ferric citrate (24 mg/L), magnesium sulfate (10 mM),

inositol (20 mg/L), p-aminobenzoic acid (2 mg/L), adenine (5 mg/L), uracil (20

mg/L), tryptophan (40 mg/L), tyrosine (1.2 mg/L), and the other 18 amino acids (80

mg/L). The cells were grown for 24 hours in a 37 °C shaking incubator, 250 r.p.m.

22

The cells were then inoculated 1:100 into M9sa media, with the same composition as M9c but lacking cysteine and methionine, in a flat-bottom shallow-well 96-well

plate. Fld or Fd expression was induced with aTc concentration gradient and cells

were incubated in Infinite m1000 Pro plate reader (Tecan), shaking at 3 mm

amplitude. Optical density at 600 nm (OD600) was recorded every 10 min. For

constructs expressing RFP, OD600 was measured after 48 hours of incubation

and bulk fluorescence was also measured at excitation wavelength of 558 nm and

emission wavelength of 583 nm.

2.4. Flow cytometry

Cells for the flow cytometry were obtained from the 96-well plate following

the 48 hours of incubation in M9sa media. OD600 was measured and cells were

diluted to the OD600 of 0.001 in phosphate buffer saline (PBS) in a 96-well plate.

Kanamycin was added to the concentration of 500 ng/mL to arrest cell growth. The

plate was incubated for 1 hour in 37 °C shaking incubator to allow for RFP

maturation. Flow cytometry was performed using a Millipore Guava HT Cyte Flow

Cytometer. For each sample, 15000 events were collected. Measurements for the

standard curve were collected using Spherotech Rainbow Calibration Particles (8

peaks) (Spherotech, RCP-30–5A, Lot number EAK01). The results were exported

as FCS3.0 files. The files were analyzed using FlowCal software (Castillo-Hair et

al., 2016). Fluorescence values were converted from arbitrary units (AU) to

molecules of equivalent fluorophore (M.E.F) based on the standard curve

generated using the calibration particles. The gate of 50 percent was applied to

23

extract the region of the highest density in the forward and side scatter using a

FlowCal function. Means and standard deviations were calculated from the data

using FlowCal functions.

2.5. Construction of the sequence similarity network and

analysis of organism distribution

To create SSN that visualizes Fld sequence relationships, genomes with

“finished” tag (n = 10,051) were selected from the Integrated Microbial Genomes

and Microbiomes. The search was performed across those genomes to identify

organisms that contain Flds (IPR008254). Among the list of Flds, all sequences

longer than 200 amino acids were eliminated to remove proteins with FMN-binding motifs that are not Flds. The Enzyme Function Initiative-Enzyme Similarity Tool

(EFI-EST) was used (Gerlt, 2017) to obtain the sequence similarity scores and

create a file that can be used for SSN generation. SSN was visualized using

Cytoscape (Shannon, 2003) by applying a perfuse force directed layout. Among

the genomes that encode Flds, a search was performed to identify organisms that

also contain Fd-dependent SIRs (IPR011787). The final list of genomes was also

searched for 2Fe-2S Fds (IPR001041) to identify the number of Fds in these

organisms. The Flds from genomes with both Fld and SIR were annotated on the

SSN to visualize the relationships among Flds from organisms that have both Flds

and Fd-dependent SIRs. Sequences >= 160 amino acids were annotated as “long”

and sequences < 160 amino acids were annotated as “short” To isolate the sub-

24

network with sequences annotated as Flds, Cytoscape Network Analyzer tool was

used to extract the connected component containing the sub-network.

To visualize the distribution of Fd-dependent SIRs, NADPH-dependent

SIRs, NIRs, and anaerobic SIRs, BLAST was performed using Fd-dependent SIR from Prochlorococcus marinus MIT9211 across the 731 genomes corresponding to the Flds from the sub-network. IMG genome BLAST function was used for the search. Genome IMG IDs corresponding to the genes from the BLAST results were exported, and a Python script was used to create a list of Fld gene IDs corresponding to either Fd-dependent SIRs, NADPH-dependent SIRs, NIRs, or anaerobic SIRs. The IDs were used to annotate the SSN. For the Flds from organisms that contained more than one type of these proteins, annotations were added for all of them.

To analyze the distribution of organisms in each cluster of the Fld sub- network, organism names were exported from each cluster using Cytoscape.

Python script was used to narrow down the list of organisms: for each genus present in a given cluster, only one organism was selected to create a tree that visualizes the genera present in the cluster. The script also matched the selected organisms with their IMG IDs. These IDs were searched in the IMG database, and their genomes were selected for analysis using the IMG Genome Clustering tool. Using hierarchical clustering, the genomes were clustered by taxonomy

(class), and their relationships were visualized using a radial phylogram.

25

2.6. Statistics

Error bars represent standard deviation for three biologically independent

replicates. To compare the OD600 for uninduced cells and cells where Fld expression was induced with 100 ng/mL aTc, independent two-tailed t-tests were performed with α=0.05.

26

Chapter 3

Results

3.1. Bioinformatics: identifying flavodoxins that might

interact with sulfite reductases

Only a subset of Fd-dependent oxidoreductases have been shown to interact with Flds, even though some may couple with both Flds and Fds. Among the more than 80 partner oxidoreductases shown to couple with Fds (Atkinson et al., 2016), fewer than 20 have been shown to couple with Flds through biochemical and genetic assays (Pierella et al., 2014). To better understand potential coupling of Flds with these oxidoreductases, one first needs to assess whether genomes encode both Flds and these Fd-dependent oxidoreductases. Among its functions in different metabolic pathways, the role of Fld in sulfur metabolism is one of the least understood, and it is unknown whether Fld can replace Fd to mediate sulfur assimilation in iron-limited conditions. To investigate this possibility, I analyzed the distribution of Flds and Fd-dependent SIRs across different organisms and identified the organisms that co-express Fld and SIR.

27

To identify Flds that might be capable of transferring electrons to Fd- dependent SIR, I searched for organisms that have genomes that encode at least one Fld and one SIR. Genomes annotated with the “finished” tag were retrieved from the Integrated Microbial Genomes and Microbiomes (IMG) database of Joint

Genome Institute (JGI), and those containing a Fd-dependent SIR (IPR011787) were first identified. Protein search in the IMG database was performed based on their InterPro ID (Hunter et al., 2009). The subset of organisms having a Fd- dependent SIR was searched to identify those also encoding a Fld (IPR008254).

Proteins larger than 200 amino acids were removed to restrict the analysis to small

PECs and eliminate multi-domain proteins containing Fld-like domains, such as nitric oxide synthases. This analysis revealed 23 photosynthetic organisms with genomes that contain both Flds and Fd-dependent SIRs, including 21 cyanobacteria, one green algae (Ostreococcus lucimarinus), and one marine diatom (Thalassiosira pseudonana) (Table 2). All of these organisms contained a single Fd-dependent SIR, while only 68% contained a single Fld. The largest number of Fld paralogs was found in Nostoc punctiforme, which contained four family members.

In addition to these organisms, the search also yielded Z. mays. However, this organism was excluded because Flds are not found in plants (Pierella et al.,

2014). Most of the obtained sequences from Z. mays corresponded to proteins with FMN-binding motifs that were not Flds. These genes were annotated as encoding NADPH:quinone oxidoreductases or subunits of sulfite reductase.

BLAST search for the only sequence annotated as Fld suggested that it was an

28

ankyrin repeat family protein. The reason why this protein is annotated as Fld is unknown; it is shorter than Flds (132 amino acids), does not have an FMN-binding motif, and displays no similarity when aligned to E. coli Fld.

To compare the prevalence of three-component pathways that have the potential to support SIR reduction in cyanobacteria and other organisms that contain both Fld and SIR, the JGI database was also searched for proteins that couple NADPH oxidation to Fld reduction (IPR015701) and 2Fe-2S Fds

(IPR001041). Like in the Fld search, a 200 amino acid cutoff was applied to the

Fds. This genome mining revealed that these organisms contain larger numbers of Fd paralogs (5 to 16) than Fld paralogs (1 to 4) (Table 2). Additionally, a vast majority had one FNR, with the exception of Acaryochloris marina and

Thalassiosira pseudonana, which had three FNRs. Our results are consistent with previous studies which show that Fd-dependent SIRs are common in photosynthetic organisms (Hirasawa et al., 2004). Since Flds in photosynthetic prokaryotes are important under the conditions of iron limitation and oxidative stress to support Fd-dependent reactions, Flds may be able to transfer electrons to Fd-dependent SIRs. My results also reflect the fact that bacteria often have multiple Fd and Fld variants in their genome (Campbell et al., 2019). These paralogs are thought to evolve different partner specificities that allow them to support ET between a wide diversity of metabolic pathways (Campbell et al.,

2019).

29

Table 2. Number of Flds, Fds, SIRs, and FNRs across the organisms containing both Fld and Fd-dependent SIR.

Strain Fld Fd SIR FNR

Acaryochloris marina MBIC11017 1 16 1 3

Anabaena variabilis ATCC 29413 2 9 1 1

Crocosphaera subtropica BH68 2 12 1 1

Gloeothece citriformis PCC 7424 2 9 1 1

Nostoc punctiforme PCC 73102 4 12 1 1

Nostoc sp. PCC 7120 1 7 1 1

Ostreococcus lucimarinus CCE9901 2 7 1 1

Prochlorococcus marinus AS9601 1 5 1 1

Prochlorococcus marinus MIT9211 1 4 1 1

Prochlorococcus marinus MIT9303 1 4 1 1

Prochlorococcus marinus MIT9313 1 4 1 1

Prochlorococcus marinus MIT9515 1 4 1 1

Synechococcus elongatus PCC 6301 1 5 1 1

Synechococcus elongatus PCC 7942 1 5 1 1

Synechococcus sp. CC9605 2 9 1 1

Synechococcus sp. JA-2-3B'a(2-13) 1 7 1 1

Synechococcus sp. JA-3-3Ab 1 7 1 1

Synechocystis sp. GT-S, PCC 6803 1 8 1 1

Synechocystis sp. PCC 6803 1 8 1 1

Synechocystis sp. PCC 6803 GT-I 1 8 1 1

Synechocystis sp. PCC 6803 PCC-N 1 8 1 1

Thalassiosira pseudonana CCMP 1335 2 5 1 3

Trichodesmium erythraeum IMS101 2 8 1 1

30

To better understand the evolutionary relationships between cyanobacterial

Flds, the major group of organisms with both Fld and SIR, and other Fld families,

a sequence similarity network (SSN) was created. Prior studies have shown that

SSNs represent a simple way to visualize how the primary structures of protein

paralogs relate to one another (Atkinson et al., 2009). They also provide a frame

of reference for establishing how the cyanobacterial Flds in Table 2 may have

evolved from other Fld family members.

The SSN was constructed using EFI-EST (Gerlt et al., 2015) using Flds from

genomes in JGI annotated as “finished” and visualized with Cytoscape (Shannon

et al., 2003) (Figure 5). The SSN nodes represent unique Fld family members, and

two nodes are connected by an edge if their alignment score is higher than the

cutoff, which can be varied. This alignment score is calculated by the EFI-EST

based on the BLAST score, which is a measure of similarity in pairwise protein

sequence alignment (Gerlt, 2015). This network was generated with the cutoff of

20. Any Fld sequences exhibiting pairwise sequence identity >95% were grouped

together and are shown as single nodes to simplify network visualization.

The Fld SSN contains over three dozen clusters, which vary in density and

connectivity to the other clusters. In this SSN, Flds long and short chain family

members frequently partition into separate clusters, although some clusters contain mixtures of both lengths. Proteins > 160 amino acids were annotated as

“long” and proteins < 160 amino acids were annotated as “short.” This cutoff was

selected based on the observation that short-chain Flds have an average length

of ~150 amino acids while long-chain Flds have an average length of ~170 amino

31

Figure 5. Fld sequence similarity network. Cutoff = 20. In the major cluster groups, most proteins are annotated as: A. Flds; B. Fld-like multimeric proteins

WrbA; C. MioC proteins; D. hotoporphirogen oxidases HemG. Flds from organisms that contain both Fld and Fd-dependent SIR are shown in yellow. Long Flds (above

160 amino acids) are shown in purple. Short Flds (below 160 amino acids) are shown in green.

32

acids (Campbell et al., 2019). Major groups of connected clusters represent different Flds and Fld-like proteins. Group A represents the group of most well- studied Flds from bacteria and archaea. Other clusters consist of FMN-binding proteins, and they are not always referred to as Flds in literature.

Group B contains Fld-like multimeric proteins WrbA, NADPH:quinone oxidoreductases which bind FMN and NADPH (Andrade, 2007). Group C consists of MioC proteins, which play a role in ET to biotin synthase and are typically classified as short-chain Flds (Birch et al., 2000). Group D contains photoporphirogen oxidases HemG, which exhibits high sequence similarity with

Fld. HemG may have evolved from long-chain Flds (Kobayashi, 2014), but this protein is currently not sufficiently well-studied to determine whether it should be classified as Fld.

Flds found in genomes that contain Fd-dependent SIR have high sequence similarity, with a vast majority localizing to a single cluster that also contains cyanobacterial and proteobacterial Flds. Flds from organisms that do not belong to cyanobacteria, Ostreococcus lucimarinus and Thalassiosira pseudonana, are found in different clusters, reflecting their more distant evolutionary relationship to cyanobacterial Flds. Other exceptions included Fld from Nostoc punctiforme and

Fld from Anabaena variabilis. Both of those organisms also have Flds in the cluster with other cyanobacteria Flds, suggesting that these additional Flds in other clusters may have evolved differences in sequences to interact with another set of partners.

33

The Flds encoded in the majority of organisms harboring Fd-dependent

SIRs are found in one dense cluster (cluster (1)) that is connected to three other

clusters (Figure 6). While the organisms from the groups of closely related

sequences in each cluster are not uniform, some trends of domains and families that are most abundant in each cluster are observed. In cluster (1), the most

abundant organisms are Proteobacteria. Among the different genera in this cluster,

>50% are Proteobacteria (with the majority of Gammaproteobacteria), and approximately 20% are Cyanobacteria (Appendix A, Figure 18). In contrast to

cluster (1), which contains a wide diversity of organisms, cluster (2) is more uniform

and contains primarily short-chain Flds from only six genera: Bacillus, Aerococcus,

Anoxybacillus, Amphibacillus, Aneurinibacillus, and Alkaliphilus (Appendix A,

Figure 19). The majority of Flds in this cluster (94%) are Flds from Bacilli. This

suggests that Bacilli evolved distinct short-chain Flds that are different from Flds

in other clusters. Among the organisms in cluster (3), >60% belong to Firmicutes.

Across the remaining organisms, the majority are Actinobacteria, in addition to a

small fraction of Spirochaetes, and one Archaeon (Appendix A, Figure 20). While

Firmicutes tend to have both long-chain and short-chain Flds, Actinobacteria

primarily display short-chain Flds. Cluster (4) consists of methanogenic Archaea

from Euryarchaeota phylum (Appendix A, Figure 21). The small number of

organisms in this cluster reflects the fact that methanogenic Archaea are not very

well-studied, and only one Fld from Archaea has been characterized (Prakash et

al., 2019). This Fld from Methanosarcina is long-chain, while all other genera of

organisms in cluster (4) contain short-chain Flds.

34

Figure 6. Sub-network of Flds. Cutoff = 20. Flds that contain Fld and Fd- dependent SIR are shown in yellow. Long Flds (above 160 amino acids) are shown in purple. Short Flds (below 160 amino acids) are shown in green. Annotations represent the most abundant phyla in each cluster.

35

While the major classification currently used for Flds separates them into

short-chain and long-chain variants, my analysis suggests that the relationship

among the different Fld groups is more complex, and that that Flds in different

phyla evolved varying sequences. This SSN also provides a framework for

studying Fld interaction with SIR. The candidate Flds most likely to interact with

SIR come from the organisms that contain both Flds and SIRs, but SSN reveals

neighboring Flds that may also be able to support ET to SIR.

Interpro annotations of Fd-dependent SIRs are incomplete, which explains

the limited number of organisms co-expressing Fld and SIR found through the

database search. To obtain a broader view on the distribution of SIR-containing organisms in the SSN, 731 genomes, which contained the Flds from the SSN subcluster (Figure 7), were searched for genes encoding Fd-dependent SIRs. This

BLAST search was performed with Fd-dependent SIR from Prochlorococcus marinus MIT9211. This SIR was used because it was one of the SIRs found through Interpro annotations, and it was also previously described in a study which explored the interaction of this SIR with Fd (Campbell et al., 2020). BLAST results included four families of proteins, including (i) Fd-dependent SIRs (Yonekura-

Sakakibara et al., 2000), (ii) NADPH-dependent SIRs (Askenasy et al., 2018), (iii)

Fd-dependent nitrite reductases (NIRs) (Bellissimo and Privalle, 1995), and (iv) subunit C of the NADH-dependent anaerobic sulfite reductases, which participate in dissimilatory sulfate reduction in some organisms (Anantharaman et al., 2018).

36

SSN was annotated using the larger set of organisms found to encode

partner proteins. Four different SSNs were generated to visualize the distribution

of Flds in organisms found to contain the different proteins identified in the BLAST

search. The first SSN (Figure 8a) shows the organisms that contain genes encoding both Flds and Fd-dependent SIRs. With this approach, we found that all three of the four major clusters in this portion of the SSN contain both genes.

Cyanobacteria constituted > 50% of those organisms, 10% were Firmicutes, and

10% were proteobacteria. A small fraction of organisms from other clusters was also present, including Chlorophyta, Thaumarchaeota, Bacteroides, Hapista, and

Fusobacteria. The second SSN (Figure 8b) shows the distribution of Flds from organisms containing NADPH-dependent SIRs. More than 75% of these organisms are Proteobacteria, and the remainder are Firmicutes. The third SSN

(Figure 8c) corresponds to the distribution of anaerobic SIRs. The distribution of organisms is: 69 % Firmicutes (the majority of which belong to Clostridia), 19%

Proteobacteria (Aeromonas and Fusobacterium), and 12% Fusobacteria. The fourth SSN (Figure 8d) shows the distribution of Flds from organisms containing

Fd-dependent NIRs. Similar to the Fd-dependent SIRs SSN, the major phylum containing NIRs is Cyanobacteria (~50%), and the remaining 50% of organisms are Firmicutes (with the exception of several organisms from Proteobacteria,

Planctomycetes, and Chlorophyta).

37

Figure 7. Distribution of proteins from the results of P. marinus SIR BLAST across genomes containing Flds from the sub-network. (A) Fd-dependent

SIRs; (B) NADPH-dependent SIRs; (C) anaerobic SIRs (subunit C); (D) Fd- dependent NIRs. Flds from organisms containing each type of protein are shown in yellow.

38

The results of BLAST search confirm that Cyanobacteria are the major

group of organisms containing both Flds and Fd-dependent SIRs, but in addition

reveal that there are other phyla for which Flds may have evolved to interact with

Fd-dependent SIRs. While my thesis focuses only on the interaction between

cyanobacterial Flds and SIRs, future studies can use bioinformatic results

presented here to guide the selection of Fld-SIR pairs that can participate in ET.

3.2. Constructs for testing flavodoxin-mediated electron

transfer to sulfite reductases

Among the strains of cyanobacteria found to co-express Fld and Fd-

dependent SIR, Flds from Synechocystis sp., Acaryochloris marina, and Nostoc

sp. were selected to be tested in the assay. In addition, E. coli Fld was included to

investigate whether the native protein may be able to participate in the ET within

the synthetic pathway. To express the components of the synthetic pathway in E.

coli EW11, I used a plasmid for constitutive expression of FNR and SIR from Z. mays, and constructed plasmids for aTc-inducible Fld expression (Figure 8). The plasmids for Fld expression contained genes for TetR repressor, and Fld under aTc-inducible promoter. To build these constructs, a backbone from a plasmid previously made for Fd expression was PCR-amplified, and Fld sequence was inserted using Golden Gate assembly (Engler and Marillonnet, 2013).

39

Figure 8. Constructs used for expressing the components of the synthetic pathway. (A) FNR and SIR are constitutively expressed from a single plasmid. (B)

Fld is expressed from a second plasmid containing TetR repressor, and Fld induction levels can be controlled by varying the concentration of aTc.

40

RBS sequence was also changed to provide an estimated translation initiation rate

(TIRs) of 6,000 (designed using Salis Lab RBS calculator). Fld sequences used

for these constructs were previously codon-optimized for E.coli and synthesized as DNA fragments. Sequencing results confirmed the successful replacement of

Fd sequence with Fld and the insertion of the new RBS.

In subsequent experiments, fusion of RFP to Fld was used to evaluate the

intracellular Fld concentrations. In these constructs, RFP was fused to the N-

terminal of the Fld using a serine-glycine linker which does not contain any

cleavage sites for known E. coli proteases, as predicted using ExPASy server

(Wilkins et al., 1999) (Figure 9). To build these constructs, I used the same process

as described above, with an addition of RFP sequence amplified from a plasmid

for constitutive RFP expression. The plasmid backbone, Fld sequence, and RFP

sequence were combined in the Golden Gate assembly. The 10-amino acid linkers

were incorporated in the primer overhangs. All constructs were sequence-verified.

3.3. Cyanobacterial flavodoxins support electron transfer to

ferredoxin-dependent sulfite reductases

To determine whether Flds from Cyanobacteria with genomes also

containing SIRs are able to interact with SIR in vivo, I am using a cellular assay

based on E. coli EW11 sulfide auxotroph. This strain only grows in medium with

sulfate as the only sulfur source in presence of a synthetic pathway, where Fld

transfers electrons from FNR to SIR. The Flds studied using the cellular assay

41

Figure 9. N-terminal fusion of RFP to Fld. (A) Structures of RFP (2vad) and

Fld from Nostoc sp. (1 obo) connected by a G-S linker with no known E. coli

protease sites. Linker cartoon was added manually to the protein structures obtained from PDB. (B) Construct used for the expression of RFP fused to Fld.

RFP-Fld levels can be controlled by varying the concentration of the aTc

inducer.

42

were selected from the bioinformatic results of searching for organisms that

contain both Fld and Fd-dependent SIRs. Multiple sequence alignment was

performed between the SIRs from these organisms and Z. mays SIRs (the

alignment was performed using CLUSTALW (Thompson et al., 2002)). Among the

organisms corresponding to the highest alignment score, I selected those that had

a single Fld (Appendix B, Table 3). Based on this, Flds from three organisms were

selected for further studies: Nostoc sp., Synechocystis sp., and A. marina. E. coli

Fld was added to test for the possibility of native Fld interaction with SIR.

To identify whether the selected Flds are able to support SIR activity, these

Flds were tested for their ability to complement E. coli EW11 growth (Figure 10).

E. coli EW11 cells were transformed with a plasmid for aTc-inducible Fld expression and a plasmid for constitutive expression of FNR and SIR. These cells were grown in complete M9 (M9c) media (3 replicates per plate) for 24 hours in

37°C shaking incubator. The cultures were then used to inoculate fresh cultures in selective M9 (M9sa) media (minimal media lacking cysteine and methionine).

Serial dilutions of aTc were performed to induce Fld expression. M9sa cultures were grown in Tecan at 37°C, and absorbance was measured as time course at

600 nm using a plate reader.

To compare Fld growth complementation with results previously published for Fd (Atkinson et al., 2019), the assay was also performed for Mastigocladus laminosus Fd and M. laminosus C42A mutant. It was previously shown that M.

laminosus Fd allows E. coli EW11 to have high growth rate while M. laminosus

43

Figure 10. The workflow for assessing auxotroph growth complementation with Flds. E. coli EW11 cells are transformed with plasmids for the expression of

Fld, FNR, and SIR. Cells are inoculated into M9c media, and then diluted into

M9sa media after 24 hours of growth. OD600 is measured to assess cell growth in

M9sa.

44

C42A fails to complement auxotroph growth as C42A mutant lacks a cysteine that is involved in Fe-S cluster binding, resulting in a non-functional protein.

Growth data for Flds and Fds induced with 100 ng/mL aTc demonstrates that these PECs result in different rates of auxotroph growth (Figure 11). Fd growth

complementation is consistent with previous results (Atkinson et al., 2019). Flds

from Synechocystis sp., A. marina, and Nostoc sp are able to complement

auxotroph growth, while no growth was observed for E. coli Fld. This result may

be caused by the sequence difference observed for E. coli Fld (Appendix C).

For Flds from Synechocystis sp., A. marina, and Nostoc sp, significant

growth is observed compared to uninduced cells (Figure 12). The results of

measuring aTc-dependent growth (Figure 13) demonstrate that growth

complementation depends on Fld expression levels, and the highest expression at

100 ng/mL aTc corresponds to the highest measured OD600. These results support

the hypothesis that Flds from cyanobacteria, but not from E. coli, can interact with

SIR and demonstrate a simple selection to identify Flds that are able to participate

in ET to SIR.

3.4. RFP fusion can be used to measure intracellular

flavodoxin concentrations

The results of EW11 growth assay provide evidence that Flds are able to

interact with FNR and SIR partners in the synthetic pathway. However, while cell

growth is correlated with Fld activity in the pathway, it does not provide a reliable

proxy for Fld ET efficiency since growth is dependent on other factors such as the

45

intracellular concentration of Fld. To address this concern, I propose a metric

called “cycling efficiency,” defined as final OD600 of E. coli EW11 cells in the assay

normalized over the intracellular Fld concentration.

To develop a high-throughput method for identification of Fld cycling

efficiency, a simple way of measuring intracellular Fld concentration is needed. I

hypothesized that this can be achieved by creating RFP::Fld fusion proteins and

using flow cytometry to determine the number of RFP molecules in a cell. FlowCal

software can be used to calibrate flow cytometry data from arbitrary units to

molecules of equivalent fluorophore (MEF) (Castillo-Hair et al., 2016). I chose to

fuse dsRed monomeric RFP to the N terminal of the Fld (Figure 9) since it would

allow for the standardization of RBS context, and the same RBS can be used to

achieve similar expression levels. The use of a flexible serine-glycine 10-amino

acid linker between RFP and Fld allows to assume that the concentration of Fld

can be estimated based on RFP levels.

Since the optimal levels of Fld expression in the assay are unknown, I selected three RBSs predicted to have different orders of magnitude for the translation initiation rate when placed upstream of RFP (1,000, 6,000, and

100,000). To confirm that RFP fusion does not interfere with Fld auxotroph growth complementation, growth assay was performed for the three Fld constructs and the negative control following the protocol described above. Varying aTc concentrations were used in M9sa media to induce RFP::Fld production.

Absorbance measurements were taken after 48 hours of incubation. The strongest

RBS resulted in a very low OD600 below 0.2 for all aTc concentrations (Figure 14).

46

Figure 11. Growth complementation observed for different constructs with the concentration of aTc of 100 ng/mL. The experiments for Fd and Flds were performed on different days. Cells expressing Fd were grown for 32 hours, and cells expressing Flds were grown for 48 hours. Data up to 32 hours is shown.

The bars represent the means and the error bars represent standard deviations

(n=3 biologically independent samples).

47

Figure 12. Comparison of growth complementation for cells with 100 ng/mL aTc and uninduced controls. Significant growth was observed for Synechocystis

sp. Fld (Ss), Nostoc sp. Fld (Ns), A. marina Fld (Am), but not E. coli Fld (Ec).

Significant growth is indicated with asterisks (Ss: p = 3 x 10-7 ; Ns: p =4 x 10-5; Am: p = 9 x 10-3; Ec: p = 5 x 10-1). The bars represent the means and the error bars represent standard deviations (n=3 biologically independent samples).

48

Figure 13. The effect of varying aTc concentrations on the auxotroph growth. The experiments for Fd and Flds were performed on different days. Data at 32 hours is shown. The symbols represent the means and the error bars represent standard deviations (n=3 biologically independent samples).

49

This may be caused by the toxicity of extreme Fld overexpression, which inhibits

cell growth in stressful sulfide-limiting conditions. Highest growth complementation corresponded to the medium RBS, and this RBS was selected for the use in subsequent experiments. These results demonstrate that fusion to RFP does not prevent Fld ET from FNR to SIR, and RFP::Fld is able to achieve growth complementation comparable to unfused Fld.

To identify whether Flds from Synechocystis sp., A. marina, and Nostoc sp.

have different efficiencies in the pathway, I first evaluated the intracellular

concentrations of these Flds fused to RFP using flow cytometry. Flow cytometry

was performed on cells obtained from the growth assay after 48 hours of

incubation in M9sa media. The cultures were diluted to the OD600 of 0.001 and

growth was arrested by the addition of kanamycin. For each sample, flow

cytometry was performed to measure the fluorescence signal for each cell. To

convert the data from arbitrary units (AU) to molecules of equivalent fluorophore

(MEF), the peaks of the calibration particles in the range of the instrument settings

were used (Figure 15). Data analysis was performed using FlowCal software.

Fluorescence peaks demonstrate that the constructs successfully express RFP

and aTc-depended difference in expression is observed for Flds from all three

organisms (Figure 16). Based on the results obtained from flow cytometry

measurements, I analyzed the relationship between growth rate, fluorescence

(representative of Fld concentration) and OD600/(mean fluorescence) ratio (Figure

17). At aTc concentrations below 10 ng/mL, there is a steep rise in cell growth, but

OD600/(mean fluorescence) remains almost constant.

50

Figure 14. The effect of varying aTc concentrations on the auxotroph growth for different RBSs. TIRs corresponding to these RBSs are: 1,000 (weak), 6,000

(medium), 100,000 (strong) predicted by the RBS calculator. OD600 was measured

after 48 hours. The symbols represent the means and the error bars represent

standard deviations (n=3 biologically independent samples). These

measurements were performed on different days.

51

Figure 15. Workflow for calibrating the flow cytometry data. Samples

prepared using cell cultures from the growth assay are passed through the flow

cytometer to measure the fluorescence signal per cell. Fluorescent calibration

particles with a known number of fluorophores per bead are used to obtain the standard curve. FlowCal software is used to calibrate the data based on the standard curve.

52

This observation suggests that at this range, Fld is rate-limiting for growth (OD600

increases in a fixed proportion to the increase of Fld concentration). At higher induction levels, other factors, such as the concentration of partner proteins, limit cell growth. These results suggest that in 0-10 ng/mL range of aTc concentrations

cell growth is directly linked to Fld activity in transferring electrons from FNR to

SIR. For all Flds in this regions, the value of OD/fluorescence x 105 has the same

order of magnitude, with similar values between 20-40. These results suggest that

Flds tested have a similar efficiencies in the pathway, as they show similar growth

when normalized to the intracellular Fld concentration.

Taken together, these results represent a proof of concept that flow

cytometry can be used to determine aTc-dependent differences in intracellular Fld

concentration and provide an estimation about the region of aTc concentrations

where Fld is rate-limiting for growth.

The results presented in my thesis demonstrate that Flds evolved varying sequences, which can be represented as distinct clusters on the SSN. They also

show that microbes from diverse phyla have genomes that encode both Flds and

Fd-dependent SIRs. My bioinformatic analysis guided the selection of Flds that may interact with Fd-dependent SIRs, and I demonstrated how a cellular assay

can be used to study Fld ET to SIR. In my experiments, I demonstrated a high-

throughput method for measuring Fld cycling efficiency and obtained the results

that show for the first time that Flds are able to support ET to Fd-dependent SIR.

53

Figure 16. Fluorescence distributions for cells expressing RFP::Fld at different aTc concentrations. Flow cytometry data was calibrated using FlowCal and gated by applying 50% gate. From left to right, peaks correspond to the aTc concentrations: 1.6 ng/mL, 3.1 ng/mL, 6.2 ng/mL, 12.5 ng/mL, 25.0 ng/mL, 50.0 ng/mL, 100.0 ng/mL.

54

Figure 17. aTc-dependent growth and OD/fluorescence ratios for Flds from

Synechocystis sp., A, marina, and Nostoc sp. measured after 48 hours of

growth in M9sa media. Fluorescence is in MEF and corresponds to the average

RFP::Fld concentration per cell. The symbols represent the means and the error

bars represent standard deviations (n=3 biologically independent samples).

55

Chapter 4

Future directions

4.1. Investigating the interactions of flavodoxins with

cyanobacterial sulfite reductases

Bioinformatic analysis demonstrated that cyanobacteria are the major group

of organisms with genomes containing both Flds and Fd-dependent SIRs. My

results show that cyanobacterial Flds support ET to SIR from Z. mays. This result

suggests that Flds from these organisms may also be able to interact with native

SIRs. Sequence alignment of SIRs from cyanobacteria and Z. mays SIRs

demonstrates that many charged residues of these proteins are conserved

(Appendix D, Figure 18A). In addition, previous studies demonstrated that most

residues that participate in SIR-Fd binding are also conserved (Campbell et al.,

2020). Since electrostatic interactions play a major role in Fld partner interactions,

these observations suggest that Flds may be able to interact with cyanobacterial

SIRs, in addition to Z. mays SIR. In future experiments, the auxotroph growth assay can be used to test a larger set of Flds and SIRs, which can help improve our knowledge of the factors that allow for Fld-SIR interaction.

56

The SSN that I generated can be used to guide the selection of Flds for these studies, as Flds with increasing sequence variations can be selected to identify the sequence differences between Flds that interact with SIRs, and those that are not able to rescue auxotroph growth. Flds from organisms that contain multiple Flds, such as Nostoc punctiforme, can be of particular interest because those Flds often display large sequence variations (Appendix D, Figure 18C).

Demonstrating that some of these Flds are not able to interact with SIR can serve as evidence that Flds have varying specificities for ET to SIRs. These experiments, in combination with structural studies to identify contacting residues between Fld and SIR, can generate valuable insights into the specificity of Fld-SIR interactions.

4.2. Designing protein switches using flavodoxins

Achieving direct control over ET in cells requires methods that can allow us to modulate ET in response to external signals. The simplest strategy to achieve this goal involves transcriptional regulation, where protein ET components can be expressed from inducible promoters (Dundas and Keitz, 2019). In synthetic pathways for applications in biosensors and bioelectronics, however, post- translational regulation can provide a faster and a more robust method to manipulate electron flow in cells. Protein switches that control ET were previously developed for Fds (Atkinson et al., 2019). In a study by Atkinson and coworkers,

Fd was split, and the fragments were fused to FKBP12 and the FKBP-rapamycin binding domain. The addition of rapamycin promoted the interaction of these proteins, resulting in dimerization of Fd fragments. The complete Fd was then able

57

to participate in ET. Another study by Wu and coworkers demonstrated how

inserting the ligand binding domain from the human estrogen receptor allows Fd

ET to be controlled by the addition of 4-hydroxytamoxifen (4-HT) (Wu et al., 2020).

Using Flds to create similar protein switches can allow us to build synthetic

pathways that can function in environments where cells undergo iron limitation and

oxidative stress. In addition, there is a potential to use Flds to create switches with

more controlled behavior. Since the synthesis and degradation of Fds depends on

complex Fe-S metabolism, where they participate in Fe-S cluster transfer, and get

degraded to release iron for use in other cellular pathways (Blanc et al., 2015),

these processes may affect the dynamics of Fd protein switches.

In addition to rapamycin and 4-HT, other ligands or signals relevant for

medical and environmental applications can be explored. For example, Fld

fragments can be fused to the components of light-switchable systems, such as

phytochrome B and phytochrome interacting factor 6, which dimerize when

induced with red light (Olson and Tabor, 2012).

4.3 Investigating the relationship between cycling efficiency

and electron transfer rates measured in vitro

In my thesis, I make an assumption that cycling efficiency is correlated with

Fld ET rate. To further explore this correlation, future studies can focus on investigating the relationship between cycling efficiency of a given Fld and its ET

rate to SIR obtained using standard in vitro measurements. Previous studies

evaluated the ET rate to SIR by measuring the rate of sulfite to sulfide conversion

58

(Yonekura-Sakakibara et al., 2000, Atkinson et al., 2019). Using purified FNR, Fld,

and SIR, the rate of sulfide production can be measured using a sulfide electrode,

where current produced on the working electrode is proportional to sulfide

concentration (Jeroschewski et al., 1996). Alternatively, sulfide production can be

measured by evaluating the concentrations of sulfite and sulfide in E. coli EW11

cell lysates at different time points in the course of the sulfide auxotroph assay

using ion chromatography (Ohira and Toda, 2006). Demonstrating the correlation

between cycling efficiency and ET rates measured in vitro will serve as evidence that cycling efficiency is a reliable method to evaluate in vivo Fld electron transfer rates.

59

References

Anantharaman, K., Hausmann, B., Jungbluth, S.P., Kantor, R.S., Lavy, A., Warren, L.A., Rappé, M.S., Pester, M., Loy, A., Thomas, B.C., et al. (2018). Expanded diversity of microbial groups that shape the dissimilatory sulfur cycle. The ISME Journal 12, 1715–1728.

Andrade, S.L.A., Patridge, E.V., Ferry, J.G., and Einsle, O. (2007). Crystal Structure of the NADH:Quinone Oxidoreductase WrbA from Escherichia coli. Journal of Bacteriology 189, 9101–9107.

Askenasy, I., Murray, D.T., Andrews, R.M., Uversky, V.N., He, H., and Stroupe, M.E. (2018). Structure–Function Relationships in the Oligomeric NADPH-Dependent Assimilatory Sulfite Reductase. Biochemistry 57, 3764– 3772.

Atkinson, H.J., Morris, J.H., Ferrin, T.E., and Babbitt, P.C. (2009). Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies. PLoS One 4.

Atkinson, J.T., Campbell, I., Bennett, G.N., and Silberg, J.J. (2016). Cellular Assays for Ferredoxins: A Strategy for Understanding Electron Flow through Protein Carriers That Link Metabolic Pathways. Biochemistry 55, 7047–7064.

Atkinson, J. T., Campbell, I. J., Thomas, E. E., Bonitatibus, S. C., Elliott, S. J., Bennett, G. N., & Silberg, J. J. (2019). Metalloprotein switches that display chemical-dependent electron transfer in cells. Nature Chemical Biology.15(2), 189–195.

Baker, N.A., Sept, D., Joseph, S., Holst, M.J., and McCammon, J.A. (2001). Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc Natl Acad Sci U S A 98, 10037–10041.

Barstow, B., Agapakis, C. M., Boyle, P. M., Grandl, G., Silver, P. A., & Wintermute, E. H. (2011). A synthetic system links FeFe-hydrogenases to essential E. coli sulfur metabolism. Journal of Biological Engineering. 5(1), 7.

Bellissimo, D.B., and Privalle, L.S. (1995). Expression of Spinach Nitrite Reductase in Escherichia coli:Site-Directed Mutagenesis of Predicted Amino Acids. Archives of Biochemistry and Biophysics 323, 155–163.

Bianchi, V., Eliasson, R., Fontecave, M., Mulliez, E., Hoover, D.M., Matthews, R. G., Reichard, P. (1993). Flavodoxin is required for the activation of the anaerobic ribonucleotide reductase. Biochem Biophys Res Commun 197, 792–797.

60

Birch, O.M., Hewitson, K.S., Fuhrmann, M., Burgdorf, K., Baldwin, J.E., Roach, P.L., and Shaw, N.M. (2000). MioC is an FMN-binding protein that is essential for Escherichia coli biotin synthase activity in vitro. J. Biol. Chem. 275, 32277–32280.

Blanc, B., Gerez, C., and Ollagnier de Choudens, S. (2015). Assembly of Fe/S proteins in bacterial systems: Biochemistry of the bacterial ISC system. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1853, 1436– 1447.

Bottin, H., and Lagoutte, B. (1992). Ferredoxin and flavodoxin from the cyanobacterium Synechocystis sp PCC 6803. Biochim. Biophys. Acta 1101, 48–56.

Campbell, I. (2020). Understanding and modulating electron transfer through ferredoxins. Rice University.

Campbell, I.J., Bennett, G.N., and Silberg, J.J. (2019). Evolutionary Relationships Between Low Potential Ferredoxin and Flavodoxin Electron Carriers. Front. Energy Res. 7.

Campbell, I.J., Olmos, J.L., Xu, W., Kahanda, D., Atkinson, J.T., Sparks, O.N., Miller, M.D., Phillips, G.N., Bennett, G.N., and Silberg, J.J. (2020). Prochlorococcus phage ferredoxin: structural characterization and interactions with cyanobacterial sulfite reductases. BioRxiv 2020.02.07.937771.

Castillo-Hair, S.M., Sexton, J.T., Landry, B.P., Olson, E.J., Igoshin, O.A., and Tabor, J.J. (2016). FlowCal: A User-Friendly, Open Source Software Tool for Automatically Converting Flow Cytometry Data from Arbitrary to Calibrated Units. ACS Synth. Biol. 5, 774–780.

Ceccoli, R.D., Blanco, N.E., Segretin, M.E., Melzer, M., Hanke, G.T., Scheibe, R., Hajirezaei, M.-R., Bravo-Almonacid, F.F., and Carrillo, N. (2012). Flavodoxin displays dose-dependent effects on photosynthesis and stress tolerance when expressed in transgenic tobacco plants. Planta 236, 1447– 1458.

Charon, M.-H., Volbeda, A., Chabriere, E., Pieulle, L., and Fontecilla-Camps, J.C. (1999). Structure and electron transfer mechanism of pyruvate:ferredoxin oxidoreductase. Current Opinion in Structural Biology 9, 663–669.

Chazarreta-Cifre, L., Martiarena, L., de Mendoza, D., and Altabe, S.G. (2011). Role of Ferredoxin and Flavodoxins in Bacillus subtilis Fatty Acid Desaturation. J Bacteriol 193, 4043–4048.

61

Cotruvo, J.A., and Stubbe, J. (2008). NrdI, a flavodoxin involved in maintenance of the diferric-tyrosyl radical cofactor in Escherichia coli class Ib ribonucleotide reductase. PNAS 105, 14383–14388.

Crain, A.V., and Broderick, J.B. (2013). Flavodoxin Cofactor Binding Induces Structural Changes that are Required for Protein-Protein Interactions with NADP+ Oxidoreductase and Pyruvate Formate-Lyase Activating Enzyme. Biochim Biophys Acta 1834, 2512–2519.

Deistung, J., and Thorneley, R.N. (1986). Electron transfer to nitrogenase. Characterization of flavodoxin from Azotobacter chroococcum and comparison of its redox potentials with those of flavodoxins from Azotobacter vinelandii and Klebsiella pneumoniae (nifF-gene product). Biochem J 239, 69–75.

Drake, H.L., and Akagi, J.M. (1977). Bisulfite reductase of Desulfovibrio vulgaris: explanation for product formation. J Bacteriol 132, 139–143.

Drennan, C.L., Pattridge, K.A., Weber, C.H., Metzger, A.L., Hoover, D.M., and Ludwig, M.L. (1999). Refined structures of oxidized flavodoxin from Anacystis nidulans. J. Mol. Biol. 294, 711–724.

Dundas, C.M., and Keitz, B.K. (2019). Tuning Extracellular Electron Transfer by Shewanella oneidensis Using Transcriptional Logic Gates. BioRxiv 2019.12.18.881623.

Duval, V., and Lister, I.M. (2013). MarA, SoxS and Rob of Escherichia coli – Global regulators of multidrug resistance, virulence and stress response. Int J Biotechnol Wellness Ind 2, 101–124.

Engler, C., and Marillonnet, S. (2013). Combinatorial DNA assembly using Golden Gate cloning. Methods Mol. Biol. 1073, 141–156.

Engler, R. Kandzia, S. Marillonnet, A One Pot, One Step, Precision Cloning Method with High Throughput Capability. PLoS One 3 (2008).

Frago, S., Lans, I., Navarro, J.A., Hervás, M., Edmondson, D.E., De la Rosa, M.A., Gómez-Moreno, C., Mayhew, S.G., and Medina, M. (2010). Dual role of FMN in flavodoxin function: Electron transfer cofactor and modulation of the protein–protein interaction surface. Biochimica et Biophysica Acta (BBA) - Bioenergetics 1797, 262–271.

Freigang, J., Diederichs, K., Schäfer, K.P., Welte, W., and Paul, R. (2002). Crystal structure of oxidized flavodoxin, an essential protein in Helicobacter pylori. Protein Sci 11, 253–261.

62

Gangeswaran, R., and Eady, R.R. (1996). Flavodoxin 1 of Azotobacter vinelandii: characterization and role in electron donation to purified assimilatory nitrate reductase. Biochem J 317, 103–108.

Gaudu, P., and Weiss, B. (2000). Flavodoxin Mutants of Escherichia coli K- 12. J Bacteriol 182, 1788–1793.

Gerlt, J.A., Bouvier, J.T., Davidson, D.B., Imker, H.J., Sadkhin, B., Slater, D.R., and Whalen, K.L. (2015). Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks. Biochim. Biophys. Acta 1854, 1019–1037.

Giró, M., Ceccoli, R.D., Poli, H.O., Carrillo, N., and Lodeyro, A.F. (2011). An in vivo system involving co-expression of cyanobacterial flavodoxin and ferredoxin–NADP+ reductase confers increased tolerance to oxidative stress in plants. FEBS Open Bio 1, 7–13.

Goñi, G., Herguedas, B., Hervás, M., Peregrina, J.R., De la Rosa, M.A., Gómez-Moreno, C., Navarro, J.A., Hermoso, J.A., Martínez-Júlvez, M., and Medina, M. (2009). Flavodoxin: A compromise between efficiency and versatility in the electron transfer from Photosystem I to Ferredoxin-NADP+ reductase. Biochimica et Biophysica Acta (BBA) - Bioenergetics 1787, 144– 154.

González, A., Angarica, V.E., Sancho, J., and Fillat, M.F. (2014). The FurA regulon in Anabaena sp. PCC 7120: in silico prediction and experimental validation of novel target genes. Nucleic Acids Res. 42, 4833–4846.

Gudim, I., Hammerstad, M., Lofstad, M., and Hersleth, H.-P. (2018). The Characterization of Different Flavodoxin Reductase-Flavodoxin (FNR-Fld) Interactions Reveals an Efficient FNR-Fld Redox Pair and Identifies a Novel FNR Subclass. Biochemistry 57, 5427–5436.

Gutekunst, K., Chen, X., Schreiber, K., Kaspar, U., Makam, S., and Appel, J. (2014). The bidirectional NiFe-hydrogenase in Synechocystis sp. PCC 6803 is reduced by flavodoxin and ferredoxin and is essential under mixotrophic, nitrate-limiting conditions. J. Biol. Chem. 289, 1930–1937.

Hall, D.A., Jordan-Starck, T.C., Loo, R.O., Ludwig, M.L., and Matthews, R.G. (2000). Interaction of Flavodoxin with Cobalamin-Dependent Methionine Synthase. Biochemistry 39, 10711–10719.

Hatchikian, E.C., Le Gall, J., Bruschi, M., and Dubourdieu, M. (1972). Regulation of the reduction of sulfite and thiosulfate by ferredoxin, flavodoxin and cytochrome cc′3 in extracts of the sulfate reducer Desulfovibrio gigas. Biochimica et Biophysica Acta (BBA) - Enzymology 258, 701–708.

63

Hirasawa, M., Nakayama, M., Hase, T., and Knaff, D.B. (2004). Oxidation- reduction properties of maize ferredoxin:sulfite oxidoreductase. Biochimica et Biophysica Acta (BBA) - Bioenergetics 1608, 140–148.

Hodgkin, E.E., and Richards, W.G. (1987). Molecular similarity based on electrostatic potential and electric field. International Journal of Quantum Chemistry 32, 105–110.

Hoover, D.M., Drennan, C.L., Metzger, A.L., Osborne, C., Weber, C.H., Pattridge, K.A., and Ludwig, M.L. (1999). Comparisons of wild-type and mutant flavodoxins from Anacystis nidulans. Structural determinants of the redox potentials. J. Mol. Biol. 294, 725–743.

Hunter, S., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bork, P., Das, U., Daugherty, L., Duquenne, L., et al. (2009). InterPro: the integrative protein signature database. Nucleic Acids Res 37, D211–D215.

J. L. Catlett, A. M. Ortiz, N. R. Buan, Rerouting Cellular Electron Flux To Increase the Rate of Biological Methane Production. Appl. Environ. Microbiol. 81, 6528–6537 (2015).

Jenkins, C.M., and Waterman, M.R. (1994). Flavodoxin and NADPH- flavodoxin reductase from Escherichia coli support bovine cytochrome P450c17 hydroxylase activities. J. Biol. Chem. 269, 27401–27408.

Jeroschewski, P., Steuckart, C., and Kühl, M. (1996). An Amperometric Microsensor for the Determination of H2S in Aquatic Environments. Anal. Chem. 68, 4351–4357. K. Singh, L. M. McIntyre, L. A. Sherman, Microarray analysis of the genome- wide response to iron deficiency and iron reconstitution in the cyanobacterium Synechocystis sp. PCC 6803. Plant Physiol. 132, 1825–1839 (2003). Kimmich, N., Das, A., Sevrioukova, I., Meharenna, Y., Sligar, S.G., and Poulos, T.L. (2007). Electron transfer between cytochrome P450cin and its FMN-containing redox partner, cindoxin. J. Biol. Chem. 282, 27006–27011.

Kobayashi, K., Masuda, T., Tajima, N., Wada, H., and Sato, N. (2014). Molecular Phylogeny and Intricate Evolutionary History of the Three Isofunctional Enzymes Involved in the Oxidation of Protoporphyrinogen IX. Genome Biol Evol 6, 2141–2155.

Krapp, A.R., Rodriguez, R.E., Poli, H.O., Paladini, D.H., Palatnik, J.F., and Carrillo, N. (2002). The Flavoenzyme Ferredoxin (Flavodoxin)-NADP(H) Reductase Modulates NADP(H) Homeostasis during the soxRS Response of Escherichia coli. J Bacteriol 184, 1474–1480.

64

Kutzki, C., Masepohl, B., and Böhme, H. (1998). The isiB gene encoding flavodoxin is not essential for photoautotrophic iron limited growth of the cyanobacterium Synechocystis sp. strain PCC 6803. FEMS Microbiol. Lett. 160, 231–235.

Lodeyro, A.F., Ceccoli, R.D., Pierella Karlusich, J.J., and Carrillo, N. (2012). The importance of flavodoxin for environmental stress tolerance in photosynthetic microorganisms and transgenic plants. Mechanism, evolution and biotechnological potential. FEBS Letters 586, 2917–2924.

Lostao, A., Daoudi, F., Irún, M.P., Ramon, A., Fernández-Cabrera, C., Romero, A., and Sancho, J. (2003). How FMN binds to anabaena apoflavodoxin: a hydrophobic encounter at an open binding site. J. Biol. Chem. 278, 24053–24061.

Lostao, A., Gómez-Moreno, C., Mayhew, S.G., and Sancho, J. (1997). Differential Stabilization of the Three FMN Redox Forms by Tyrosine 94 and Tryptophan 57 in Flavodoxin from Anabaena and Its Influence on the Redox Potentials. Biochemistry 36, 14334–14344.

Ludwig, M.L., Pattridge, K.A., Metzger, A.L., Dixon, M.M., Eren, M., Feng, Y., and Swenson, R.P. (1997). Control of oxidation-reduction potentials in flavodoxin from Clostridium beijerinckii: the role of conformation changes. Biochemistry 36, 1259–1280.

Meimberg, K., Lagoutte, B., Bottin, H., and Mühlenhoff, U. (1998). The PsaE Subunit Is Required for Complex Formation between Photosystem I and Flavodoxin from the Cyanobacterium Synechocystis sp. PCC 6803. Biochemistry 37, 9759–9767.

Montellano, P.R.O. de (2015). Cytochrome P450: Structure, Mechanism, and Biochemistry (Springer).

Mühlenhoff, U., Zhao, J., and Bryant, D.A. (1996). Interaction between Photosystem I and Flavodoxin from the Cyanobacterium Synechococcus sp. PCC 7002 as Revealed by Chemical Cross-Linking. European Journal of Biochemistry 235, 324–331.

Nakayama, T., Yonekura, S.-I., Yonei, S., and Zhang-Akiyama, Q.-M. (2013). Escherichia coli pyruvate:flavodoxin oxidoreductase, YdbK - regulation of expression and biological roles in protection against oxidative stress. Genes Genet. Syst. 88, 175–188.

Nogués, I., Hervás, M., Peregrina, J.R., Navarro, J.A., de la Rosa, M.A., Gómez-Moreno, C., and Medina, M. (2005). Anabaena flavodoxin as an electron carrier from photosystem I to ferredoxin-NADP+ reductase. Role of

65

flavodoxin residues in protein-protein interaction and electron transfer. Biochemistry 44, 97–104.

Nogués, I., Martínez-Júlvez, M., Navarro, J.A., Hervás, M., Armenteros, L., de la Rosa, M.A., Brodie, T.B., Hurley, J.K., Tollin, G., Gómez-Moreno, C., et al. (2003). Role of Hydrophobic Interactions in the Flavodoxin Mediated Electron Transfer from Photosystem I to Ferredoxin-NADP+ Reductase in Anabaena PCC 7119. Biochemistry 42, 2036–2045.

Ohira, S.-I., and Toda, K. (2006). Ion chromatographic measurement of sulfide, methanethiolate, sulfite and sulfate in aqueous and air samples. J Chromatogr A 1121, 280–284.

Olson, E.J., and Tabor, J.J. (2012). Post-translational tools expand the scope of synthetic biology. Current Opinion in Chemical Biology 16, 300–306. P. Kallio, A. Pásztor, K. Thiel, M. K. Akhtar, P. R. Jones, An engineered pathway for the biosynthesis of renewable propane. Nature Communications 5, 1–8 (2014).

Paulsen, K.E., Stankovich, M.T., Stockman, B.J., and Markley, J.L. (1990). Redox and spectral properties of flavodoxin from Anabaena 7120. Arch. Biochem. Biophys. 280, 68–73.

Peña, T.C. de la, Redondo, F.J., Manrique, E., Lucas, M.M., and Pueyo, J.J. (2010). Nitrogen fixation persists under conditions of salt stress in transgenic Medicago truncatula plants expressing a cyanobacterial flavodoxin. Plant Biotechnology Journal 8, 954–965.

Pence, N., Tokmina-Lukaszewska, M., Yang, Z.-Y., Ledbetter, R.N., Seefeldt, L.C., Bothner, B., and Peters, J.W. (2017). Unraveling the interactions of the physiological reductant flavodoxin with the different conformations of the Fe protein in the nitrogenase cycle. J. Biol. Chem. 292, 15661–15669.

Pierella Karlusich, J.J., and Carrillo, N. (2017). Evolution of the acceptor side of photosystem I: ferredoxin, flavodoxin, and ferredoxin-NADP+ oxidoreductase. Photosyn. Res. 134, 235–250.

Pierella Karlusich, J.J., Ceccoli, R.D., Graña, M., Romero, H., and Carrillo, N. (2015). Environmental Selection Pressures Related to Iron Utilization Are Involved in the Loss of the Flavodoxin Gene from the Plant Genome. Genome Biol Evol 7, 750–767.

Pierella Karlusich, J.J., Lodeyro, A.F., and Carrillo, N. (2014). The long goodbye: the rise and fall of flavodoxin during plant evolution. J. Exp. Bot. 65, 5161–5178.

66

Prakash, D., Iyer, P.R., Suharti, S., Walters, K.A., Santiago-Martinez, M.G., Golbeck, J.H., Murakami, K.S., and Ferry, J.G. (2019). Structure and function of an unusual flavodoxin from the domain Archaea. PNAS 116, 25917–25922.

Puan, K.-J., Wang, H., Dairi, T., Kuzuyama, T., and Morita, C.T. (2005). fldA is an essential gene required in the 2-C-methyl-D-erythritol 4-phosphate pathway for isoprenoid biosynthesis. FEBS Lett. 579, 3802–3806.

Raanan, H., Poudel, S., Pike, D.H., Nanda, V., and Falkowski, P.G. (2020). Small protein folds at the root of an ancient metabolic network. PNAS 117, 7193–7199.

Rao, S.T., Shaffie, F., Yu, C., Satyshur, K.A., Stockman, B.J., Markley, J.L., and Sundarlingam, M. (1992). Structure of the oxidized long-chain flavodoxin from Anabaena 7120 at 2 A resolution. Protein Sci. 1, 1413–1427.

Romero, A., Caldeira, J., Legall, J., Moura, I., Moura, J.J., and Romao, M.J. (1996). Crystal structure of flavodoxin from Desulfovibrio desulfuricans ATCC 27774 in two oxidation states. Eur. J. Biochem. 239, 190–196.

Salillas, S., and Sancho, J. (2020). Flavodoxins as Novel Therapeutic Targets against Helicobacter pylori and Other Gastric Pathogens. Int J Mol Sci 21.

Sambrook, J., and Russell, D.W. (2006). Transformation of E. coli by Electroporation. Cold Spring Harb Protoc 2006, pdb.prot3933.

Sancho, J. (2006). Flavodoxins: sequence, folding, binding, function and beyond. Cell. Mol. Life Sci. 63, 855–864. Schmidt, A. (1988). Sulfur metabolism in cyanobacteria. In Methods in Enzymology, (Academic Press), pp. 572–583.

Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504.

St Maurice, M., Cremades, N., Croxen, M.A., Sisson, G., Sancho, J., and Hoffman, P.S. (2007). Flavodoxin:quinone reductase (FqrB): a redox partner of pyruvate:ferredoxin oxidoreductase that reversibly couples pyruvate oxidation to NADPH production in Helicobacter pylori and Campylobacter jejuni. J. Bacteriol. 189, 4764–4773.

Thompson, J.D., Gibson, T.J., and Higgins, D.G. (2002). Multiple sequence alignment using ClustalW and ClustalX. Curr Protoc Bioinformatics Chapter 2, Unit 2.3.

67

Tognetti, V.B., Palatnik, J.F., Fillat, M.F., Melzer, M., Hajirezaei, M.-R., Valle, E.M., and Carrillo, N. (2006). Functional Replacement of Ferredoxin by a Cyanobacterial Flavodoxin in Tobacco Confers Broad-Range Stress Tolerance. Plant Cell 18, 2035–2050.

Ullmann, G.M., Hauswald, M., Jensen, A., and Knapp, E.-W. (2000). Structural alignment of ferredoxin and flavodoxin based on electrostatic potentials: Implications for their interactions with photosystem I and ferredoxin-NADP reductase. Proteins: Structure, Function, and Bioinformatics 38, 301–309.

Ullmann, G.M., Hauswald, M., Jensen, A., Kostić, N.M., and Knapp, E.-W. (1997). Comparison of the Physiologically Equivalent Proteins Cytochrome c6 and Plastocyanin on the Basis of Their Electrostatic Potentials. Tryptophan 63 in Cytochrome c6 May Be Isofunctional with Tyrosine 83 in Plastocyanin. Biochemistry 36, 16187–16196.

Vigara, A.J., Inda, L.A., Vega, J.M., Gómez‐Moreno, C., and Peleato, M.L. (1998). Flavodoxin as an Electronic Donor in Photosynthetic Inorganic Nitrogen Assimilation by Iron-deficient Chlorella fusca Cells. Photochemistry and Photobiology 67, 446–449.

Wilkins, M.R., Gasteiger, E., Bairoch, A., Sanchez, J.C., Williams, K.L., Appel, R.D., and Hochstrasser, D.F. (1999). Protein identification and analysis tools in the ExPASy server. Methods Mol. Biol. 112, 531–552.

Wu, B., Atkinson, J.T., Kahanda, D., Bennett, G.N., and Silberg, J.J. (2020). Combinatorial design of chemical-dependent protein switches for controlling intracellular electron transfer. AIChE Journal 66, e16796.

Ye, Q., Fu, W., Hu, Y., and Jin, C. (2014). Long-chain flavodoxin FldB from Escherichia coli. J Biomol NMR 60, 283–288.

Ye, Q., Hu, Y., and Jin, C. (2014). Conformational Dynamics of Escherichia coli Flavodoxins in Apo- and Holo-States by Solution NMR Spectroscopy. PLOS ONE 9, e103936.

Yonekura-Sakakibara, K., Onda, Y., Ashikari, T., Tanaka, Y., Kusumi, T., and Hase, T. (2000). Analysis of Reductant Supply Systems for Ferredoxin- Dependent Sulfite Reductase in Photosynthetic and Nonphotosynthetic Organs of Maize. Plant Physiol 122, 887–894.

Zhao, C., Zhao, Q., Li, Y., and Zhang, Y. (2017). Engineering redox homeostasis to develop efficient alcohol-producing microbial cell factories. Microbial Cell Factories 16, 115.

68

Zöllner, A., Nogués, I., Heinz, A., Medina, M., Gómez-Moreno, C., and Bernhardt, R. (2004). Analysis of the interaction of a hybrid system consisting of bovine adrenodoxin reductase and flavodoxin from the cyanobacterium Anabaena PCC 7119. Bioelectrochemistry 63, 61–65.

Zurbriggen, M.D., Tognetti, V.B., Fillat, M.F., Hajirezaei, M.-R., Valle, E.M., and Carrillo, N. (2008). Combating stress with flavodoxin: a promising route for crop improvement. Trends Biotechnol. 26, 531–537.

69

Appendix A

Figure 18. Radial phylogram showing the distribution of organisms from cluster (1) of the Fld sub-network. Genomes were clustered based on

organism class using IMG Genome Clustering tool.

70

Aneurinibacillus_sp__XH2 Amphibacillus_xylanus_NBRC_15112

Figure 19. Radial phylogram showing the distribution of organisms from cluster (2) of the Fld sub-network. Genomes were clustered based on organism class using IMG Genome Clustering tool.

71

Figure 20. Radial phylogram showing the distribution of organisms from cluster (3) of the Fld sub-network. Genomes were clustered based on organism class using IMG Genome Clustering tool.

72

Amphibacillus_xylanus_NBRC_15112 Aneurinibacillus_sp__XH2

Figure 21. Radial phylogram showing the distribution of organisms from cluster (4) of the Fld sub-network. Genomes were clustered based on organism class using IMG Genome Clustering tool.

73

Appendix B

Table 3. Alignment scores of SIRs from organisms containing Flds with SIR from

Z. mays

Fld SIR Alignment score with Strain count Z. mays SIR Acaryochloris marina MBIC11017 1 49 Anabaena variabilis ATCC 29413 2 50 Crocosphaera subtropica BH68 2 49 Gloeothece citriformis PCC 7424 2 49 Nostoc punctiforme PCC 73102 4 49 Nostoc sp. PCC 7120 1 50 Ostreococcus lucimarinus CCE9901 2 53 Prochlorococcus marinus AS9601 1 44 Prochlorococcus marinus MIT9211 1 44 Prochlorococcus marinus MIT9303 1 44 Prochlorococcus marinus MIT9313 1 44 Prochlorococcus marinus MIT9515 1 42 Synechococcus elongatus PCC 6301 1 45 Synechococcus elongatus PCC 7942 1 46 Synechococcus sp. CC9605 2 45 Synechococcus sp. JA-2-3B'a(2-13) 1 46 Synechococcus sp. JA-3-3Ab 1 46 Synechocystis sp. GT-S, PCC 6803 1 46 Synechocystis sp. PCC 6803 1 46 Synechocystis sp. PCC 6803 GT-I 1 46 Synechocystis sp. PCC 6803 PCC-N 1 46 Thalassiosira pseudonana CCMP 1335 2 45 Trichodesmium erythraeum IMS101 2 48

74

Appendix C

Figure 22. Sequence alignment of cyanobacterial Flds and E. coli Fld.

Black arrows indicate sequence differences characterized by presence/absence of a residue or a charge difference for E. coli Fld compared to cyanobacterial

Flds. Red indicates negatively charged residues and purple indicates positively charged residues. Multiple sequence alignment was performed using

CLUSTALW.

75

Appendix D

Figure 23. Sequence alignment of cyanobacterial SIRs (A), cyanobacterial

Flds (B), and (C) Flds from Nostoc punctiforme. Red indicates negatively

charged residues and purple indicates positively charged residues. Residues that

were shown to interact with Fd for Z. mays SIR are shown with asterisks. Multiple

sequence alignment was performed using CLUSTALW.

76