<<

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1953

Genome evolution of a bee- associated bacterium

ANDREA GARCÍA-MONTANER

ACTA UNIVERSITATIS UPSALIENSIS ISSN 1651-6214 ISBN 978-91-513-0986-6 UPPSALA urn:nbn:se:uu:diva-416847 2020 Dissertation presented at Uppsala University to be publicly examined in A1:111a, Biomedicinskt centrum (BMC), Husargatan 3, Uppsala, Thursday, 24 September 2020 at 09:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Professor Matthias Horn (Department of and Ecosystem Science, Division of Microbial Ecology).

Abstract García-Montaner, A. 2020. Genome evolution of a bee-associated bacterium. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1953. 80 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0986-6.

The use of large-scale comparative genomics allows us to explore the genetic diversity and mechanisms of evolution of related organisms. This thesis has focused on the application of such approaches to study kunkeei, a bacterial inhabitant of the honeybee gut. We produced 102 novel complete genomes from L. kunkeei, which were used in four large comparative studies. In the first study, 41 bacterial strains were isolated from the crop of honeybees whose populations were geographically isolated. Their genome sequences revealed differences in gene contents, including the mobilome, which were mostly phylogroup- specific. However, differences in strain diversity and co-occurrence between both locations were observed. In the second study, we obtained 61 bacterial isolates from neighboring hives at different timepoints during the summer. We observed that strain diversity seemed hive- specific and relatively constant in time. Surprisingly, the observed mobilome also showed hive specificity and was maintained through the summer. The novel genome data were combined with previously published genomes, allowing us to perform deep comparative analyses on the evolution of the species using a total of 126 genomes. We determined that, despite the large number of sequenced genomes, L. kunkeei has an open pangenome. Besides, we evaluated the effects of recombination on the species core genome, and concluded that it mainly evolves through mutation events. In the last study we described the mechanisms of evolution of a cluster of 5 giant genes (about 90 kb long in total) that are unique to L. kunkeei and the closest sister species. Their patterns of evolution do not reflect those of the species core genome. We concluded that they originated by duplication events, and have diverged by accumulation of both mutations and recombination events. We predicted a potential interaction between the proteins encoded by two of them, and we hypothesized a role in host-specific interaction for another protein. In conclusion, these studies have provided novel and cohesive knowledge on the composition and dynamics of different populations of L. kunkeei, and may have contributed to better understand its ecological niche.

Keywords: Bacterial evolution, Genomics, , Honeybees

Andrea García-Montaner, Department of Cell and Molecular , Molecular Evolution, Box 596, Uppsala University, SE-752 37 Uppsala, Sweden.

© Andrea García-Montaner 2020

ISSN 1651-6214 ISBN 978-91-513-0986-6 urn:nbn:se:uu:diva-416847 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-416847)

Para los que más quiero.

Cover art by Tomás Montaner Carbó, my dear grandpa, who beautifully captured the dynamic nature of bacterial genomes (front) and the incredible diversity enclosed within a single species (back), two key aspects of this doctoral thesis.

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Garcia-Montaner, A., Tamarit, D., Näslund, K., Webster, M.T., Andersson, S.G.E. (2020) Comparative genomics of Lactobacil- lus kunkeei in two insular populations of honeybees. Manuscript. II Garcia-Montaner, A.*, Dyrhage, K.*, Näslund, K., Vasquez, A., Andersson, S.G.E. (2020). Strain diversity and evolution of Lactobacillus kunkeei throughout the summer months. Manu- script. III Garcia-Montaner, A.*, Tamarit, D.*, Andersson, S.G.E. (2020) Role of recombination and mutation to the evolution of the core genome in Lactobacillus kunkeei. Manuscript. IV Garcia-Montaner, A., Andersson, S.G.E. (2020). Molecular evolution of giant genes in Lactobacillus isolated from social bees. Manuscript.

(*) Equal contribution

Reprints were made with permission from the respective publishers.

Contents

Introduction to this thesis ...... 11 Chapter 1: Genome evolution in bacteria ...... 12 Genomic organization ...... 12 Sources of genetic innovation ...... 14 Birth of new genes ...... 14 Nucleotide substitutions ...... 14 Duplications, deletions and rearrangements ...... 15 Horizontal gene transfer ...... 15 Recombination ...... 17 Forces of evolution...... 19 Chapter 2: Host-microbe systems ...... 20 Host-associated lifestyles ...... 21 Genome reduction ...... 22 Surface interaction in bacterial symbionts...... 23 Animal ...... 28 Bacterial composition ...... 29 Beneficial roles of the ...... 30 Model systems for microbiota studies ...... 31 Chapter 3: The honeybee as a model system ...... 33 Ecology and nutrition ...... 33 Health and disease...... 34 ...... 35 Chapter 4: Lactobacillus kunkeei ...... 39 Taxonomic context: The genus Lactobacillus ...... 39 Evolutionary context ...... 41 Ecological context ...... 43 Potential symbiotic role ...... 43 Metabolic profile ...... 45 Genomic traits ...... 48 Aims ...... 51 Results ...... 52 Paper I ...... 52

Paper II ...... 53 Paper III ...... 55 Paper IV...... 56 Perspectives...... 58 Svensk sammanfattning ...... 61 Resumen en español ...... 63 Acknowledgements ...... 66 References ...... 70

Abbreviations

6-PG/PK 6-phosphogluconate/phosphoketolase ADH Alcohol dehydrogenase ALDH Acetaldehyde dehydrogenase AMP Antimicrobial peptides ANI Average nucleotide identity CCD Colony collapse disorder dN Non-synonymous nucleotide substitution rate dS Synonymous nucleotide substitutions rate EMP Embden-Meyerhof-Parnas FLAB Fructophilic HGT Horizontal gene transfer LAB Lactic acid bacteria MEP Methylerythritol 4-phosphate MGE Mobile genetic element MVA Mevalonate MYA Million years ago r/m Recombination rate over mutation rate

Introduction to this thesis

- Life is hard, Mr. Scoresby, but we cling to it all the same. - And this journey we’re on? Is that folly or wisdom? - The greatest wisdom I know. ― Philip Pullman, His Dark Materials

I have been fascinated by the animal world since an early age. As you probably guessed, I wanted to become a vet. Yes, very original of me. However, my interest in nature kept growing and expanding as I grew older. During my school years, and maybe inspired by wild life documentaries, I started to shift my interest to the observation of nature itself. The idea of becoming a reporter, traveling around the world, learning about animals and teaching others about them, seemed like a dream life. However, that changed as soon as I started to dig into cell biology during my secondary education. When the time came, choosing a career path might have been the easiest decision in my life: I wanted to study Biology. During my years at university, I took another detour within the field, and fell in love with . It was so fascinating…! A whole new world to me. My main interest was on the huge diversity of microbial life, and its clin- ical and biotechnological implications. A very practical approach. Finally, when I was deciding where (and in what topic) I would do my PhD, I saw the light. I don’t even know why it happened then, as if I didn’t know about it before. I did, and I liked it. It simply hadn’t clicked. But suddenly it did, and I reoriented my interest towards microbial ecology and evolution. And that’s how I ended up at Uppsala University, learning about genome evolution of a bacterium associated to honeybees. This PhD has brought me a deeper understanding on bacterial evolution, and how that affects animal-microbe symbiotic systems. I have learnt how bacterial populations vary across time and between different hosts, and poten- tial mechanisms of interaction between them. In the following sections I will introduce the main topics that contextualize this work, and summarize the main results of the studies carried out. This introduction is followed by the four manuscripts generated in this doctoral thesis. The supplementary material for each manuscript will be available on the provided links until the date of the defense. Enjoy the reading!

11 Chapter 1: Genome evolution in bacteria

Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that! — Lewis Carrol, Through the looking-glass.

Evolution.

A word that encloses so many wonders. Life itself. Or better, the way life has come to be how we know it today. As a metaphor, genomes could be seen as a book, where the information is encoded and maintained over time. However, that information is under constant change, so it is not only telling us how or- ganisms are today; we can infer how they have changed in the past, and how they change almost in real time. Comparative genomics studies are extremely powerful tools to interpret the information contained in the many pages of these complex books that all organisms are. Genomic evolution among bacteria is the main focus of this thesis, and the topic of this first chapter. I will firstly introduce some basic concepts on how genomes are organized, both functionally and structurally. Then, there will be an overview on genetic variation dynamics, and the main sources of innova- tion on bacterial genomes that have been relevant for the work on this thesis.

Genomic organization From a structural point of view, bacterial genomes are relatively simple— es- pecially if compared to eukaryotes. Most of the genetic information is encoded in the chromosome, which is vertically transmitted during replication. The other structural unit found within bacterial genomes is the plasmid. Plasmids are small circular DNA molecules that replicate independently from the chro- mosome, and may carry adaptive genes. Functionally speaking, genes encoded in bacterial genomes can be divided into several groups that, together, constitute the pangenome, which represents the whole set of genes of a species (Tettelin et al. 2005) (figure 1A). The core genes are those that encode fundamental cellular processes. Hence, they con- stitute the core genome, and are shared by all the strains that comprise a spe- cies. On the other hand, accessory genes are involved in non-essential

12 functions that may increase fitness under specific conditions, or contribute to specific adaptations. The accessory genome is therefore highly variable, and contributes to define the different strains within a species. Within the acces- sory genome it is also possible to distinguish the strain-specific genes, which are unique to each strain. Most bacterial genomes are highly dynamic, with high rates of gene gain and loss. However, not all genes are equally mobile (Rankin et al. 2011). Evolutionary studies have revealed that changes in the accessory genome are often associated with horizontal gene transfer and site- specific recombination, often driven by mobile genetic elements. On the other hand, changes in the core genome generally involve vertical transmission and homologous recombination (González-Torres et al. 2019). These mechanisms will be further described in this chapter.

Figure 1. The species pangenome. (A) Types of genes that comprise the pangenome, and how they are shared among all the strains within a species (each oval represents one genome. The size of the pangenome (B) increases with the number of genomes added to the comparison, while the core genome size decreases. If the total number of genes in the pangenome reaches saturation, the species has a closed pangenome, and the genetic diversity is well represented by the available genomes.

Exploring a species pangenome is like peering at the metaphoric books that represent all the genomes within a species. The more books we read, the more we know on that species ecology, evolution, metabolism, etc. It might happen, though, that we experience a fast learning curve as we dive into it. However, at some point, the new books we read might start only adding small details, very specific, which do not add so much relevant information to the big picture of the whole species. Then, we consider that the species pangenome is closed, and the addition of new genomes will not significantly increase the genetic diversity on the species (figure 1B). That is why pangenomic studies are a good approach to also help us direct our investigation efforts more strategi- cally, and focus on digging where we can still obtain novel and relevant knowledge. Studies based on pangenomic analyses are a powerful approach in compar- ative genomics, and have multiple applications. Given the great amount of genomic data accumulated over the last decades, it is not surprising to

13 encounter taxonomical reclassifications motivated by new molecular and ge- nomic evidence (Huang et al. 2020; Suresh et al. 2019). The field of microbial ecology benefits greatly from these kinds of analyses, since they can provide valuable knowledge about lifestyle, niche adaptation and metabolism (Peeters et al. 2019; Chaudhry & Patil 2020). Of course, we shall not forget the contri- butions to areas of great impact for our society, such as biomedicine and bio- technology. Pangenomic studies have revealed clinically relevant adaptions on different opportunistic and pathogenic bacteria (Iversen et al. 2020) that allow to better understand disease development and potential treatments. Un- derstanding the genetic diversity and metabolic potential enclosed in a given species is also relevant for the discovery of new ways of using microbes for biotechnological applications. For instance, finding strains that may be used as probiotics (Sulthana et al. 2019; Sharma et al. 2018). As the most relevant application in the context of this thesis, pangenomic studies provide extensive insights into bacterial populations structure, genome dynamics and evolution of a species (Freschi et al. 2019; Ying et al. 2019; Stevens et al. 2019).

Sources of genetic innovation Gene innovation is essential for evolution, and it comes in different forms. Several processes take place within bacterial genomes that alter their size, spa- cial organization and genetic diversity. Gene innovation can also come from outside the genome, introducing genes of different origin (prokaryotic and eu- karyotic) that may be eventually integrated. Some of these processes that lead to genetic innovation will be briefly introduced in this section.

Birth of new genes Protein-coding genes can emerge from non-coding DNA, and are known as de novo genes. These genes are usually short, expressed at low levels and have a short lifespan, although some of them are retained and remain functional (Schlötterer 2015). Although de novo genes have raised skepticism in the past about their nature as real genes, now there is enough genetic evidence to con- firm their existence and biological relevance. However, evolutionary, func- tional and structural studies are still needed in the field to fully understand how new genes are generated from scratch (Schmitz & Bornberg-Bauer 2017).

Nucleotide substitutions Point mutations are one of the main sources of genetic variation in extant genes. They occur during the DNA replication process, changing the identity of some nucleotides by others. Two types of nucleotide substitutions can take place. Transitions are nucleotide changes between purines ([Aßà G]) or

14 pyrimidines ([CßàT]). On the other hand, transversions happen between pu- rines and pyrimidines. Although mutation is a relatively stochastic process, it is affected by certain biases. For instance, transitions are more frequent than transitions. Besides, it exists a universal mutation bias that favors AT-rich ge- nomes among bacteria (Hershberg & Petrov 2010). Observations of clonal bacterial populations evolving under relaxed selection indicated that mutation is consistently biased towards AT. Nucleotide substitutions can have great consequences at the protein level, and hence, in cell fitness and viability. However, not all nucleotide changes are reflected at the protein level in the same way. Some amino acids are en- coded by several alternative codons. This means that they can still be trans- lated into the same amino acid even if the identity of a nucleotide changes. Point mutations leading to an alternative codon for a given amino acid are called synonymous substitutions — they do not change the encoded amino acid. If, however, the point mutation leads to a different amino acid being en- coded, it is called a non-synonymous substitution. In the latter scenario, two things can happen. If the newly encoded amino acid has similar biochemical properties to the original one, the conformation and/or function of the protein may not be greatly affected. On the other hand, if the newly encoded amino acid differs greatly from the original one, this will have strong consequences at the protein level. For instance, it is possible that the protein is not properly folded, or that it is not able to identify certain substrate, so it is not functional anymore.

Duplications, deletions and rearrangements Duplication events can occur on individual genes and on larger regions in the chromosome. The different gene copies resulting from a duplication often fol- low distinct evolutionary processes via differential accumulation of point mu- tations. New copies can either rapidly diverge and acquire new functions (neofunctionalization) or specialize in either of the functions encoded in the original gene (subfunctionalization). The accumulation of deleterious muta- tions leads to function loss (pseudogenization). Any gene can become a pseudogene if its sequence is truncated, and they will eventually be removed from the gene pool. Finally, rearrangements can affect larger regions in the chromosome, and can alter the chromosomal organization at large extents. They can appear as inversions, when the sequence encoded on a certain region is exchanged between strands, or as translocations, when a region in the chro- mosome is moved into another location preserving the original orientation.

Horizontal gene transfer The ability to exchange genetic material between bacteria has been known since Joshua Lederberg’s pioneering work (Zinder & Lederberg 1952;

15 Lederberg & Tatum 1946). During a horizontal gene transfer (HGT) event, a recipient genome acquires genetic material from a donor genome. The newly acquired DNA integrates into the recipient cell and is inherited by its descend- ants. Hence, it is an effective mechanism to introduce genetic variability, and a strong driver of evolution. HGT takes place through several mechanisms. Conjugation is the mecha- nism that some bacteria use to transfer genetic material such as plasmids or transposons via cell-to-cell contact. In order to be transferable, the plasmid or transposon needs to encode certain genetic systems that allow a successful DNA transfer. Transduction is another mechanism that introduces genetic ma- terial into bacterial genomes, and it takes place via , which are that infect bacteria. Bacteriophages use transduction to introduce their genetic material into the host and, once inside, be integrated into the bacterial DNA. Bacteria can also uptake free DNA directly from the environment in a process known as transformation. The ability to transform is referred to as natural competence, and also depends on a battery of genes that coordinate the process to allow the cell to uptake and integrate foreign DNA. Those are the general mechanisms for introducing new DNA into bacterial cells, and the following are the elements that vectorize the process.

Mobile genetic elements The collection of mobile genetic elements (MGE), which includes plasmids, phages and transposons, constitutes the mobilome. It plays an important role in increasing the genetic diversity of bacterial genomes by introducing foreign DNA into the cells. Plasmids are small DNA molecules that replicate independently from the chromosome, and encode non-essential functions that may add fitness ad- vantages under certain conditions. As previously mentioned, they are an im- portant source of genetic diversity within a bacterial population, as they can be exchanged between bacterial cells. When these genes are shared among members within the population, they contribute to increase the survival chances of the population in a given niche. Common functions found in plas- mids include antibiotic resistance, antimicrobial production, virulence, etc. Bacteriophages can be found in two different formats when infecting bac- terial cells. As mentioned in the previous section, they can be integrated into the chromosome and replicate together with the bacterial DNA (they are then referred to as prophages). Some examples of plasmids carrying phage-like genes have been reported as well, in what seems to be a hybrid element con- taining both types of genes (Galetti et al. 2019; Chen et al. 2012). Once they infect the cell, bacteriophages can enter into two different reproductive cycles, the lytic cycle or the lysogenic cycle, with different consequences to the host. If the lytic cycle is activated, the phage will be replicated and translated, giv- ing rise to multiple copies of phage DNA that will be encapsulated. The newly synthetized phages will be released to the environment by a lytic process,

16 hence, killing the cell to spread the infection to new hosts. In the lysogenic cycle, the phage DNA remains dormant and inactivated within the bacterial chromosome, being replicated and vertically transferred to the offspring. Lys- ogenic bacteriophages have, therefore, a stronger influence on the evolution of bacterial genomes. They are for instance responsible for increasing genetic diversity, since they may provide hosts with adaptive traits often associated with biotic interactions (Touchon et al. 2017). Prophages can also influence and regulate bacterial gene expression by mechanisms such as transcription factors, sRNAs, DNA rearrangements, and even controlled bacterial lysis (Argov et al. 2017). In some cases, prophages may serve as vehicles for the horizontal transfer of random fragments of bacterial DNA that is packed to- gether with phage DNA. These are the gene transfer agents, and also partici- pate in the horizontal exchange of genetic material at the population level (Québatte et al. 2017). Despite their phage origin, some of these transfer sys- tems have evolved into domesticated bacteriophages that serve to transfer larger amounts of bacterial DNA (Tamarit et al. 2018). Finally, one of the simplest types of MGEs is the transposon. A transposon is a genetic element that can replicate independently of the chromosome, and be inserted in different parts of the genome. It is one of the so-called selfish genetic elements, which replicate and spread within and across genomes using the cell’s machinery. A transposon structure is characterized by one or two transposase genes that are responsible for cutting the DNA and integrating the transposon molecule in a new position. In simple terms, transposons replica- tion can happen via copy-paste, hence increasing their copy number in the genome, or cut-paste, by simply changing position without generating copies of themselves. Sometimes they can carry extra genes that may or may not add advantages to the host cell. Transposons play a major role in genome plastic- ity, and are responsible of insertion, deletion and rearrangement of chromoso- mal regions, at the same time that are a key mechanism of horizontal gene transfer (Chandler 2017).

Recombination The process of recombination involves the exchange of genomic material be- tween different organisms. It takes place within and between eukaryotes, pro- karyotes and viruses through two different mechanisms, being a strong force of evolution in all groups. Homologous recombination allows a given genomic fragment to be replaced by the corresponding homologous fragment from a donor genome. On the other hand, non-homologous recombination adds new material into the recipient genome by introducing DNA into specific insertion sites — the already mentioned HGT (Didelot et al. 2012). Only homologous recombination will be more profoundly addressed in this thesis, more specif- ically, prokaryotic homologous recombination.

17 Homologous recombination takes place between organisms of the same or different populations, and also between species. In this way, it can enhance the adaptive potential of organisms by favoring functional divergence and spreading beneficial mutations, which may have a strong effect on the evolu- tion of the population (Bay & Bielawski 2011; Kalinina et al. 2016). Moreo- ver, its effect on shaping genetic diversity amongst prokaryotic species can even exceed that of mutation (Vos & Didelot 2009). The ratio of recombina- tion over mutation rates (r/m) is a direct measure of the relative impact of recombination on sequence divergence (Guttman & Dykhuizen 1994). For that reason, it is widely used when studying genomes evolution. The rates of homologous recombination across prokaryotic genomes varies widely be- tween species. The differences in r/m can go up to three orders of magnitude, according to some observations (Vos & Didelot 2009). Homologous recom- bination is expected to increase the rate at which bacteria adapt to their envi- ronment, and the recombination rate might represent a major determinant of the speed of adaption (Lin & Kussell 2017). Some studies have demonstrated that phylogenetic position is not correlated with homologous recombination rates variation. In other words, closely related species do not necessarily re- combine at a higher rate per se. Instead, it seems that species that share a sim- ilar lifestyle and/or ecological context display higher associated recombina- tion rates (Vos & Didelot 2009; González-Torres et al. 2019), and are thus more prone to recombine, even when substantial genetic divergence exists be- tween them (Gilbert et al. 2018). Recombination and mutation rates vary at different chromosomal positions due to the way the bacterial nucleoid is packed, and the involvement of differ- ent proteins that contribute to its structure and stability (Kivisaar 2020). Four different dynamical macrodomains have been described in the nucleoid of E. coli (Valens et al. 2004) upon analyzing the frequency of site-recombination across the chromosome. It was observed that some regions displayed high in- ternal recombination rates, while it seemed that interaction with other regions was highly limited. These observations were based on E. coli as a model ge- nomic system. However, this might not be directly applicable to all cases if different genomic organizations are found, but it sets the basis for similar con- formational and structural limitations that make certain genomic areas more prone to recombination. Regardless of where in the chromosome recombination events take place, they have strong effects on genome content and evolution. Frequent recombi- nation events of large size might lead to sequence homogenization (Engel et al. 2014) and a consequent loss of the recombination signal, making it difficult to detect using alignment-based algorithms. On the other hand, frequent re- combination events of shorter fragments might increase genetic variation up to the extent that sequences become so divergent that they will no longer re- combine anymore. Then, as genetic distance increases between sequences, the

18 less probable is that they recombine. Recombination therefore ultimately par- ticipates in the process of genetic diversification. The evolutionary benefits of homologous recombination are still under de- bate. However, three hypotheses are the preferred options by several authors. The DNA repair hypothesis states that foreign DNA serves as a template to repair double-stranded breaks. In similar lines, another possibility would be that it is related to the removal of deleterious mutations and the combination of beneficial mutations — similar to sex in eukaryotic organisms. Finally, the food hypothesis states that the incorporation of foreign DNA in the genome through recombination is a by-product of the uptake of DNA for metabolism (Vos & Didelot 2009). Whichever was the origin of recombination and why it has been maintained might still be difficult to answer. However, it is undeni- able the importance that it has towards evolution of bacterial species.

Forces of evolution As we have seen, genomes are dynamic entities whose gene content is influ- enced by both internal and external processes. All the sources of innovation that have been reviewed in this chapter allow bacterial populations to adapt to their environment and keep their fitness. However, it is essential to regulate such genetic changes. Since mutations are a stochastic process, they could jeopardize the survival of the cell by affecting essential functions in an unde- sired way. Therefore, a balance between conservation and innovation is needed. Such equilibrium is maintained by evolutionary forces such as selection and drift. Selection is especially eficient in large populations, where it acts in different ways. Purifying selection tends to remove those genes that accumu- late disadvantageous and/or deleterious mutations, and strongly contributes to function preservation. If, on the contrary, sequence diversification is advanta- geous, diversifying selection allows nucleotide substitutions that strongly af- fect the coded protein. Finally, if there is no selective pressure acting at the molecular level, the genetic changes that take place in the genome are driven by genetic drift. Under such scenario, fixation of mutations occurs by random sampling of genotypes in the population. Genetic drift has a stronger effect on smaller populations, or those that experience extreme declines in size — i.e. population bottlenecks. As a result, the genetic diversity of the population ex- periences a strong decrease.

19 Chapter 2: Host-microbe systems

As a net is made up of a series of ties, so everything in this world is con- nected by a series of ties. If anyone thinks that the mesh of a net is an inde- pendent, isolated thing, he is mistaken. It is called a net because it is made up of a series of interconnected meshes, and each mesh has its place and respon- sibility in relation to other meshes. — Gautama Buddha

Host-associated organisms are also referred to as symbionts, and the organism to which they are closely associated is the host. In the context of this thesis, I will mainly be talking about bacteria and animals, respectively. The term sym- biosis comes from the Greek συµβίωσις "living together", and since it was first described in 1878 by Anton de Bary its definition has varied in several ways. De Bary defined it as “the living together of unlike named organisms”, which is quite a generalist definition but also very appropriate because it does not exclude any kind of interaction. This is useful because symbiotic associations can be classified under very different criteria, which results in a great variety of interactions and lifestyles depending on which characteristics we are con- sidering. The association of two or more organisms as an evolutionary adaptation strategy is almost as old as life itself. Eukaryotic organisms evolved from the symbiotic association of several unicellular , who gave rise to the nucleus and organelles such as the chloroplast and mitochondria (Margulis 1975). This example represents an extraordinary case of symbiosis, and since then, symbiosis has proven to be an advantageous strategy of evolution that countless organisms have adopted at different levels of association extent. We could say that symbiosis, in all its shapes, is one of the most conserved strat- egies for adaptation, and it is found wherever we look in nature. In this chapter I will first review some basic concepts of symbiosis, before focusing on conserved traits common to bacterial symbionts. I will finish de- veloping the concept of the microbiota and its implications on host health and physiology.

20 Host-associated lifestyles The broadest level of classification for symbiotic interactions refers to the kind of relationship between the organisms and the resulting cost-benefit outcome. Under this point of view, an interaction can be mutualistic, commensal or par- asitic, following a benefit-decreasing order. In ecological interactions, when- ever the resulting benefit goes in both directions (towards both organisms par- ticipating in the association) it is called mutualism. While, if the benefit goes just in one direction and one of the participants has rather a passive role, it is called commensalism. Finally, if there is clear harm inflicted on one of the members, it is a case of parasitism. Depending on where the symbionts estab- lish their niche, it is also possible to distinguish between endosymbionts (those that are intracellular and live inside the host cells) and ecto- symbionts (those organisms that live on the host surface). Some bacterial symbionts are heritable, meaning that they are vertically transferred to offspring. Others are non-heritable, and are instead transferred horizontally between members of the host population, or are acquired from the environment. Mixed-modes of symbiont transfer exist as well. This mech- anism combines both vertical and horizontal transfers, which includes both acquisition from the environment, and transfers between host individuals. Finally, different types of symbioses can also be classified by how strict the interaction is between the symbiont and the host. Both the symbiont and the interaction can be classified as obligate or facultative. Obligate symbionts tend to be vertically transmitted endosymbionts, who have often lost a great amount of genetic potential (with a subsequent reduction in the genome size) and become metabolically dependent on the host. In such cases, it is common for both the host and the symbiont to show traces of co-evolution. That is, to find correlations between host and endosymbiont evolutionary histories. How- ever, this is only clearly visible in those systems where the symbionts are strictly inherited vertically, so one can always trace back through time. Facul- tative symbionts do not depend on the host to survive, and can be found and/or cultured independently When talking about symbiotic systems per se, we refer to the bacterial com- munity in close association to a host as its microbiota. This term represents the collection of microbial populations associated to another organism (the host), where the different strains of each species contribute genetically and functionally to the community. We can also talk about the when refereeing not only to the microorganisms associated to certain niche, but also to their genomes and the surrounding environmental conditions. This is a wider concept that considers also the biotic and abiotic factors included in a given environment that contribute to its composition.

21 Genome reduction The minimal genome, considered as the minimal gene set necessary for life, is a concept that has fascinated evolutionary for a long time. Since the first bacterial genomes started to be sequenced, the lower limit for genome size has kept decreasing. Perhaps unsurprisingly, bacteria encoding the small- est known genomes have host-associated lifestyles. One of the most conserved characteristics of bacterial symbionts is the re- duction of their genome size compared to free-living close relatives. In partic- ular, the most extreme cases of reduction in genome size are observed in ob- ligate intracellular endosymbionts. Due to the high stability provided by the host and the constant access to nutrients, bacterial symbionts rely less and less on their own genetic potential to produce essential products, and take ad- vantage of what the host can offer. Since many genes are not needed to be expressed, selection acts less efficiently on them. Relaxed selection and ran- dom genetic drift also favor an increase in mutation fixation rate, leading to the inactivation and eventual loss of non-essential genes (Kuo et al. 2009; Novichkov et al. 2009). Therefore, subsequent gene losses result in genome shrinkage. Genome reduction is a progressive process in which dispensable genes are eventually lost. The degree of genetic loss is correlated to the level of depend- ence towards the host. Hence, recently evolved symbionts have larger ge- nomes than those strains in more ancient symbiotic relationships. The most extreme cases of genome reduction can be observed in eukaryotic organelles (mitochondria and plastids), which represent the ultimate degree of associa- tion between a host and an endosymbiont according to the en- dosymbiotic theory (Margulis 1993). However, different stages can be ob- served in the process of genome reduction in-between free-living bacteria and endosymbiont organelles. Considering symbiotic interactions as the result of long-term adaptations, we may explain the process of genome reduction as a linear continuum. Initial stages are characterized by the proliferation of mobile elements, chromosome rearrangements, gene inactivation, accumulation of pseudogenes and deletions (McCutcheon & Moran 2012; Latorre & Manzano- Marín 2017). As gene loss continues, pseudogenes and mobile elements are also lost at the same time together with non-essential genes. Based on genomic studies of bacterial symbionts with the smallest characterized genomes, the most commonly lost functions involve cell envelope biogenesis, regulation of gene expression, and DNA repair and recombination (McCutcheon & Moran 2012; Latorre & Manzano-Marín 2017). On the other hand, maintained genes are those that provide essential functions to the host. In other words, the sym- biotic function is retained to maintain the co-dependence relationship. Conse- quently, different bacteria possess a particular set of retained genes. The losses are determined by the specific host's needs, but can also reflect the particular processes of gene loss undergone by each symbiotic lineage, giving rise to

22 differentially retained genes with similar or equivalent functions (Latorre & Manzano-Marín 2017). The universal mutational bias ([ G | C ] à [ A | T ]) favors AT-rich ge- nomes, but selection counter acts promoting bacterial GC-rich genomes (Bobay & Ochman 2017). The fact that most reduced genomes break this trend may indicate that selection acts less efficiently on these, as expected in small populations where drift is the major driver of evolution (McCutcheon & Moran 2012). The AT-rich composition of small genomes is a result of differ- ent factors that are a consequence of genomic shrinkage. The lack of DNA repair and recombination machinery contributes to the accumulation of muta- tions, resulting in more A and T substitutions (McCutcheon & Moran 2012). However, recent studies suggest that selection may represent a strong evolu- tionary force driving the genomes of intracellular genetic elements (including symbionts) towards an AT rich composition (Dietel et al. 2019).

Surface interaction in bacterial symbionts Given the ancient capacity of microbes to interact with other organisms, we could expect that the fundamental molecular principles of these interactions will be conserved. This section contains an overview of some of these adap- tations, with a focus on those that have a special relevance in the frame of this thesis. Notice, though, that many of these traits are accompanied by high spec- ificity towards the host. So, with this in mind, I will try to keep this content as general as possible to provide a broad idea of the repertoire of mechanisms for surfaces interaction of both pathogenic and beneficial symbionts. Note that some structures or strategies presented in the following subsections may play more than one role, and could be included in more than one category (e.g. facilitating adhesion and also protecting the cells).

23 Figure 2. Structures and mechanisms for surface interaction. Bacterial popula- tions display a wide diversity of strategies to attach and establish in their niche. They can modify their extracellular surface by displaying adhesins to mediate attachment; or by producing capsules and s-layers, to protect themselves from the environment and the host’s immune system. They can also form aggregates, like , which is an efficient attachment mechanism and provides great competitive advantages. Some species are also able to integrate into the host’s cell and reproduce in there. Besides, there is active communication and traffic of products among the whole symbiotic sys- tem, involving microbial and host cells.

Adhesion mechanisms One of the most critical steps in the establishment of host-bacterial symbiosis is the interaction between the bacterial surface and the host tissue. The first contact of bacteria with host surfaces relies on non-specific interactions, in- volving hydrophobicity, charge or other surface properties. These forces pro- vide initial contact, but must be supplemented by specific receptor interactions to allow colonization (Abraham et al. 2015). There is a wide variety of extra- cellular proteins, also called adhesins, that recognize and bind different sub- strates from the host surface. A classic mechanism of adhesion involves the use pili, or fimbriae (figure 2). They are long proteinaceous filaments that are composed of a series of pilin proteins tightly packed in a helix conformation, covalently (in Gram-posi- tives) or non-covalently (in Gram-negatives) attached. Pili also carry non- structural extra pilin proteins that act as adhesins, with different receptor spec- ificity. These adhesins can be involved not only in surface adhesion, but also in a variety of functions such as formation, colonization, phage trans- duction, DNA uptake and twitching motility (Proft & Baker 2009; Piepenbrink & Sundberg 2016).

24 Other non-fimbrial adhesins include molecules with proteinaceous or pol- ysaccharide structure, which are extremely important for host-symbiont inter- actions. These adhesins recognize various classes of host molecules ¾includ- ing transmembrane proteins such as integrins or cadherins¾ or components of the extracellular matrix such as collagen, fibronectin, laminin, or elastin (Ribet & Cossart 2015). There is a huge diversity of non-fimbrial adhesins, and a whole chapter, at least, would be needed to cover entirely this topic. Another common strategy in surface attachment is the ability to form ag- gregates. This facilitates interaction and adhesion to surfaces of different na- ture, and, at the same time, excludes potential competitors by preventing their attachment (Nishiyama et al. 2015; Oliveira et al. 2015). Biofilms are a form of aggregation, in which several bacteria are embedded within an extracellular matrix (figure 2). This matrix composed by one or more polymeric substances such as proteins, polysaccharides and/or extracellular DNA. Sometimes other molecules are also found in biofilms, such as those involved in cell-to-cell communication (Flemming & Wingender 2010). It has been observed that an extreme variability of genetic and biochemical mechanisms underlay biofilm formation, both across strains and growth conditions (Oliveira et al. 2015). Thus, its adaptive advantages are similarly diverse. Although biofilms con- tribute to increase competitive advantage against unwanted microbial popula- tions, they are also cooperative aggregations. By the exchange of diverse com- pounds, biofilm-forming communities are able to communicate and regulate their gene expression, and even to complementarily contribute to synthetize compounds that define the extracellular matrix (Dragoš et al. 2018). Finally, biofilms not only protect bacterial communities from external damage; they also provide dynamic spaces to share nutrient resources (Sivadon et al. 2019). Biofilm formation represents a serious problem in the medical field because many bacterial have the ability to form these kinds of aggregates, and their resistance to antibiotics is becomes greatly increased. They are also problematic when they grow on industrial systems, being extremely difficult to remove and hence contaminating various processes. Therefore, there are important ongoing efforts to understanding biofilm formation and dynamics, and to find efficient ways to eradicate them (Sharma et al. 2019; Vuotto & Donelli 2019; Cattò & Cappitelli 2019; Azeredo et al. 2017).

Protection from the environment Not all bacterial symbionts are adapted to survive the sometimes-harsh extra- cellular environment of the host in the same way. Host defense mechanisms, in addition to abiotic conditions, such as pH and mechanical factors, have pro- moted the development of different protection strategies among symbiont bac- teria. These strategies are not only to protect themselves from external dam- age, but also to reinforce their niche establishment and to gain a competitive advantage against co-habiting species.

25 Some species are able to modify the whole cell surface by producing a pol- ysaccharide capsule (figure 2). This strategy is a common trait among patho- gens. Some bacteria can produce capsules, which play a key role in immune response evasion by preventing phagocytosis by macrophages in animal hosts. Indeed, there is a great diversity in capsule composition that is driven by im- mune selection from mammalian hosts (Wen & Zhang 2015). Capsules also contribute to surface adherence and protection from physical and biological stress such as pH, dissection and viruses attack, among others. Despite opin- ions of capsules contributing to bacterial isolation in terms of communication, recent studies have observed an increased rate of horizontal gene transfer and recombination among capsule-forming strains (Rendueles et al. 2018). There- fore, they may also contribute to increasing genetic diversity among the pop- ulation. Capsules are firmly attached to the cell surface and are difficult to wash off under laboratory procedures. In contrast, slime layers, or s-layers, are released into the medium, forming a sort of halo around bacterial cells that is loosely associated to them (figure 2). They have a similar polysaccharide composition to capsules, but with a bi-dimensional structure, and serve similar protection and adhesion functions (Gerbino et al. 2015; Fagan & Fairweather 2014).

Invasion of host cells Some bacterial symbionts have gone one step further in protecting themselves from the environment and from the host immune system, and have developed the ability to invade host cells, being hence considered as intracellular symbi- onts (figure 2). They do this for replication and/or dissemination purposes, as well as to easily acquire nutrients from the host. The mechanisms for host invasion vary among species, and their engulfment can take place upon inter- action with host-cell receptors or through translocation of bacterial proteins into the cell (Pizarro-Cerdá & Cossart 2006). To improve their ability to col- onize and survive intracellularly, intracellular symbionts may use eukaryotic- like proteins that mimic and manipulate host cellular processes. These molec- ular strategies are shared both by pathogens and cooperative bacteria interact- ing with eukaryotes. The spread of such genes among beneficial and patho- genic bacteria may be the result of either horizontal gene transfer or conver- gent evolution (Frank 2019).

Immune system evasion Even in the case of mutualistic symbiotic interactions, it is still a bacterial infection what the animal host detects, and its immune system will react ac- cordingly. Both beneficial and pathogenic symbionts have evolved a variety of adaptions to prevent immune system recognition. All bacterial extracellular structures that interact with the host’s tissues are usually under strong diversi- fying selection in order to avoid being recognized by the host immune system. Immune system evasion strategies include resistance against immune

26 effectors, lack of immune elicitors and negative regulation of the immune sys- tem (Douglas 2014). However, the immune system is not an automatic mech- anism that will respond to the presence of any bacterium in the same way. Eukaryotic hosts and their immune systems have cohabited with all sorts of bacteria for a long time, and hence are able to modulate their response depend- ing on the bacterial population they may encounter. There are several exam- ples of adaptations that allow the host to simultaneously tolerate and/or pro- mote growth of beneficial microbiota while protecting itself against patho- gens. The relationship between immunity and the microbiome reaches far be- yond simple recognition and includes complex cross talk between the host and microbes, and can include direct microbiome-mediated protection against pathogens (Morella & Koskella 2017).

Cell-to-cell communication and exchange As a result of the close association between host and symbiont, there is active traffic of different types of molecules between them, and also within the sym- biont population. Cell-to-cell communication within symbiont populations mainly takes place through the secretion (and detection) of small signaling molecules. The process of communication among the bacterial cells of a given population is called . Through the secretion of autoinducers, this mechanism allows bacteria to monitor the environment and to react ac- cording to external stimuli. It was once thought to be a relatively simple and automatic process. Now we know that it is instead a coordinated behavior that regulates such diverse and essential functions as bioluminescence, virulence factor production, secondary metabolites production, competence for DNA uptake and biofilm production (Mukherjee & Bassler 2019). The collective- ness of quorum sensing is precisely what ensures the successful outcome of these essential functions. Bacterial symbionts not only communicate between themselves, but also with the host, and the host with them. Host factors are important modulators of bacterial populations, and they are involved in the promotion of symbiont infection or the restriction of colonization, as previously mentioned. Small RNA molecules (sRNA) have been found to be commonly exchanged between all kinds of organisms in order to mediate gene expression. Symbiotic systems are not an exception, and movement of sRNA molecules between hosts and symbionts has been reported in both directions on close associations involving all domains of life (Knip et al. 2014). The sRNA molecules may be used by symbionts to establish their niche within host tissues, or by the host to accept or reject that establishment and regulate the population. This mutual regulation of their expression through different mechanisms is a result of long- term evolution, and is extremely specific. One key signature of beneficial symbiotic interactions is the horizontal ex- change of DNA. Horizontal gene transfer from bacteria to eukaryotic hosts has been extensively described (Sitaraman 2018; Husnik & McCutcheon

27 2018), and it may be classified into two broad types. Namely, HGTs that main- tain pre-existing functions, and those that provide the recipient with new func- tionality. The latter includes advantageous adaptions such as altered nutrition, protection and adaptation to extreme environments (Husnik & McCutcheon 2018). Bacterial symbionts can also be recipients of DNA, which may con- tribute to their fitness and symbiotic role (Waterworth et al. 2020; Pinto-Carbó et al. 2016; López-Madrigal & Gil 2017), even when the tendency is towards a reduction of their genome size. Finally, active transport of metabolic compounds is also a common process in symbiotic interactions. Bacterial cells use a diverse set of transporters and/or secretion systems to secrete products to the environment or directly into the host cell, and to receive them as well when they are provided by the host or other microbes. Although most of the communication between host and microbiome takes place through the release of biochemical compounds (including the aforementioned sRNAs), biomechanical forces should not be excluded as another possible type of communication (Douglas 2019).

Animal microbiomes Within host-microbe systems, there is a strong bias among published data to- wards animal-bacteria systems. This bias is partly due to uncertain methods for genomic DNA isolation, variations in PCR efficiencies, and insufficient reference databases (Paterson et al. 2017; Tkacz et al. 2018; Richard & Sokol 2019) for both non-animal hosts and non-bacterial symbionts (i.e., and microbial eukaryotes), but efforts are being made to breach this gap. However, there are studies that acknowledge the presence and relevance of archaea and microbial eukaryotes as members of various animal microbiota communities ¾mainly the human gut. However, even then, research tends to be biased to- wards their role as pathogens, and little is known about their positive and/or neutral interactions with the host and other microorganisms. However, there are a few studies exploring positive interactions of non-bacterial symbionts. For instance, a recent study found that the presence of the opportunistic path- ogen Blastocystis (the most common human gut ) was correlated with shifts in the gut bacterial and eukaryotic microbiota, in the absence of gastro- intestinal disease or inflammation (Nieves-Ramírez et al. 2018), indicating that this protist (and potentially others) are drivers of community diversity and composition. In a similar line, another study reported an increase of the mu- cosal host defenses upon colonization by the protist Tritichomonas muculis, reducing the risk of infections (Chudnovskiy et al. 2016). Similar efforts have recently been performed to better understand the role of archaea and fungi as members of the gut microbiota (Sereme et al. 2019; Moissl-Eichinger et al. 2018; Bang & Schmitz 2015; Paterson et al. 2017; Bradford & Ravel 2017; Richard & Sokol 2019). New insights come with every study, but there is still

28 a lot to be done. Only when all the members in play are under the focus of study will we achieve a full understanding of a host’s microbiome ¾other- wise, relevant interactions and contributions are left behind. A great majority of microbiome studies are focused on humans as a host, and provide detailed explorations of the bacterial microbiome composition and its effects on the body. A major focus has been drawn to the gut microbi- ota. It has been accepted for a long time that the bacteria that dwell in our guts play a key role in digesting a great part of the food we intake. Over the last years, more evidence has come to light showing a direct connection between gut bacteria and different aspects of human physiology. Stunning examples include bacterial influence on the immune system response and autoimmune disorders (Takiishi et al. 2017; Thaiss et al. 2016; Fritsch & Abreu 2019; Nishida et al. 2018), nutrition (Miyamoto et al. 2019; Dao & Clément 2018; de Clercq et al. 2016) several other diseases (Li & Tang 2017; Meng et al. 2018; Sun & Shen 2018; Quigley 2017) and even brain development and men- tal health (Lach et al. 2018). Their presence is therefore not simply translated in aiding at nutrient digestion, but also in the absorption and production of a variety of metabolites with wide physiological effects. Similar effects on the host’s health and physiology are likewise associated with the gut microbiota of other organisms.

Bacterial composition The taxonomic composition of the microbiome of most animals is highly var- iable. The type of microorganisms that colonize the different tissues of a mul- ticellular eukaryotic host can depend on physical properties, such as pH, oxy- gen availability, humidity, accessible nutrients, etc. However, the story is more complicated, and biotic factors also come into play. Despite reproducible differences among hosts of different developmental ages or consuming different diets, the basis for much of the variation between host individuals or within one host over time is not well understood (Douglas 2019). Which microorganisms may be available to an individual host is influ- enced by their distribution and abundance in the external environment. The behavior of the host is also of relevance, especially among animals whose populations are based on social interactions, such as primates or certain in- sects, such as honeybees (Suryanarayanan et al. 2018; Dill-McFarland et al. 2019; Moeller et al. 2016). Both competitive and cooperative interactions among bacteria are also a key factor in shaping the composition of the microbiota. Competition can be driven by access to space and/or nutrients. Many bacterial species produce toxins and antimicrobial compounds that interfere with the growth and fitness of neighboring microbes. In other cases, bacterial populations within the same host co-exist either by cross-feeding or cooperation. The difference between these two concepts is that in the first case of cross-feeding, the microbes grow

29 on by-products produced by others such as fermentation products, amino acids or digested sugars. On the other hand, cooperation has evolved as an adapta- tion to specifically increase the fitness of one another (Coyte & Rakoff- Nahoum 2019). The balance between interactions will determine the stability of the microbiota. Now, what about the host? How much does it have to say about who lives within it? The answer is that, it depends. Animal hosts have different ways to manage the presence and establishment of microorganisms, but the actual ef- fect that those have in affecting the taxonomic composition of the microbiota is rather variable. In simple associations where few microbial taxa are in- volved, there is a predominant role of the host in controlling its bacterial part- ners. Under this scenario, immune factors, host-derived nutrients, and me- chanical factors play an important role, and vary between host taxa. On the other hand, when microbial diversity increases, the host losses a greater part of that control, making it challenging to understand the dynamics behind the composition of complex microbiomes (Douglas 2019). There are, however, indications based on different animal models that much of the observed vari- ability between individuals might be stochastic. Therefore, in these cases it could be more dependent on passive dispersal, loss and gain in both directions between the host and the environment (Obadia et al. 2017; Vega & Gore 2017; Burns et al. 2016; Sieber et al. 2019). Finally, an unbalance in the microbial composition translates into homeo- stasis problems that can greatly affect the hosts’ health. This state is known as , and will be discussed in more detail in the next chapter.

Beneficial roles of the microbiota One of the goals of microbiome studies is to decipher the role of each micro- bial population within the symbiotic system. In communities with a reduced number of taxa, this is something relatively easy to do, as it is to explore the underlying adaptive mechanisms of these host-microbe associations. I will fo- cus on two specific types of beneficial interactions that microbes establish with their hosts that are most relevant towards this thesis work, both associated with the pea aphid. A classic host-microbe symbiosis is one in which the symbiont provides essential nutrients to the host. The example of the pea aphid (Acyrthosiphon pisum) and its primary symbiont, Buchnera aphidicola, is a classic system to represent nutritional symbioses. The pea aphid belongs to the order Hemip- tera, and has specialized its diet exclusively to phloem sap, which is rich in sugars, but poor in essential amino acids. To compensate this metabolic un- balance, the pea aphid established a mutualistic association, around 160-280 MYA (Moran et al. 1993), with the aforementioned bacterium, which contrib- utes up to 90% of the required essential amino acids (Douglas 2006). The bac- terium B. aphidicola is an obligate endosymbiont species that inhabits

30 bacteriocytes ¾ specialized cells in insects developed exclusively to harbor symbiont bacteria. Both B. aphidicola and the pea aphid have gone through a long co-evolutionary process that has led to a complete metabolic dependence of one towards the other. Genomic data has been extremely useful to decipher the metabolic and regulatory landscapes of each species, showing clear inti- macy and complementarity at the metabolic level (Feng et al. 2019; Ramsey et al. 2010; Shigenobu & Wilson 2011). The genome of B. aphidicola has experienced massive reduction as a consequence of this metabolic depend- ence, with only a 641 Kb chromosome and two plasmids (Shigenobu & Wilson 2011). Other bacteria grow within the bacteriocytes of the pea aphid, including Wolbachia species (widely spread among insects) and Hamiltonella defensa. The latter invites us to speak about a second type of beneficial sym- biosis. Defensive symbioses, as can be inferred from their name, help the host to defend themselves against different threats that they may be exposed to. In the case of the pea aphid, H. defensa specifically plays a key role in protecting host against the attack of parasitoid wasps (Moran, Russell, et al. 2005). It is a facultative symbiont that is maternally transmitted (though occasionally transmitted horizontally) and has a genome of about 2 Mb (Moran, Degnan, et al. 2005). Its defensive role consists of blocking the larval development of certain parasitoid wasp species, which is largely dependent on the symbiont strain. As if to honor the mesh metaphor that opened this chapter, here we encounter one more participant in this symbiotic system. The presence on the H. defensa genome of a temperate lambda-like , that encodes a battery of toxins, is decisive for effective protection of the aphid (Degnan et al. 2009; Oliver & Higashi 2019). Finally, the parasitoid wasps, appear to gain virulence over time when exposed to aphids infected with the defensive sec- ondary symbiont (Dion et al. 2011). Therefore, the populations of all compo- nents of this symbiotic system are in a dynamic equilibrium, which is driven by selection. There is just one more thing I would like to highlight about H. defensa. It encodes a series of pathogenic traits, such as secretion systems, that enable the bacteria to infect the host’s cells and evade the immune system (Degnan et al. 2009; Chevignon et al. 2018). This exemplifies the thin line that sometimes separates mutualists from pathogens, which evolve similar molecular adaptations for host interaction.

Model systems for microbiota studies Understanding how host-microbe systems are established and maintained is of great relevance for areas such as biomedicine, conservation biology and biotechnology. The challenge in all cases, though, is to decipher the complex network of interactions that exists between the host and its microbiota. Sym- biotic interactions are highly specific to the host-microbe system, since they result from adaptation processes between them. Therefore, we can think that

31 only by studying specific systems it is possible to get to know it. Although this is partly true, the molecular bases of those adaptations are universal, and it is easier to first approach the understanding of a complex system from the perspective of a simpler one. Some organisms are great model systems to study symbiosis from a molec- ular and ecological point of view, and are widely used among evolutionary biologists. They all share certain characteristics that make them suitable for such studies. For instance, they should be easily accessible in nature, and/or easy to grow under laboratory conditions. Ideally, their associated microbial communities should fulfill these requirements too. Besides, the possibility to manipulate the microbiota can offer valuable insights into the role they per- form. Therefore, model systems that allow us to work with germ-free host individuals is also a plus. Invertebrate animals and their associated microbiota provide appealing systems to work with, because in addition to all aforemen- tioned reasons, there is a great diversity of hosts that yields an even greater diversity of symbionts and types of interactions. One of those model systems are honeybees, which will be addressed in detail in the following chapter.

32 Chapter 3: The honeybee as a model system

I am large, I contain multitudes. ¾ Walt Whitman, Song of Myself

Honeybees are one of the most important pollinators in nature, and are also of great relevance in the production of different goods for human consumption. There are seven honeybee species within the genus Apis, and they can roughly be classified into two main groups: eastern and western honeybees. Only one species from each group has been domesticated: A. cerana and A. mellifera, respectively. The latter is the most commonly and widely used in apiculture, and it is also referred to as the European, western, or common honeybee. An important reason why honeybees are great model systems for microbi- ota studies is the reduced number of taxa members they harbor in their gut. In addition, all of them are cultivable, allowing us to get a wide understanding of their biology, evolution and interactions with the host from various ap- proaches (Romero et al. 2019). The possibility to grow microbes-free bees and inoculate them with defined communities is also a powerful tool for under- standing the effects that the microbiota has on the host’s physiology. Another powerful reason for the success of honeybees as model systems is how well known their biology is. Honeybees have been of great importance for human societies for a long time, and hence many efforts have been made to study them.

Ecology and nutrition Honeybees are eusocial insects whose populations are hierarchically orga- nized in different classes: workers (including forager and nurse bees), drones (whose solely purpose is to assist with reproduction) and the queen. It is worker bees that are mostly used to study the honeybee gut microbiota, and it is interesting to note that their bacterial community varies according to the behavioral task that they perform (Jones et al. 2018). Young worker bees start by performing nurse tasks, and are in charge of feeding the larvae and cleaning the hive. When they grow older, they become foragers. Forager bees collect nectar and pollen during the warm months. During winter, the bees stay inside, keeping the hive under warm conditions, and feeding on the resources previ- ously stored.

33 Pollen is the main source of protein for honeybees, which they break down into amino acids to be used for their various metabolic and physiological needs. However, not all types of pollen have an acceptable quality to fulfill these needs. Pollen quality is measured in terms of the nutritional value that it provides in the form of amino acids, and this varies according to the different species (Taha et al. 2019). Therefore, it is essential for honeybees to maintain a balanced diet, and there are efforts on understanding how they dif- ferentiate and select their food to fulfill their needs. It seems that it is a com- plex and multisensory strategy that depends on visual, olfactory and chemo- tactile abilities (Ruedenauer et al. 2018; Simcock et al. 2014). However, pol- len availability depends on floral biodiversity, which, in turn, is influenced by geography, weather conditions and seasonality (Flo et al. 2018; Danner et al. 2016; DeGrandi-Hoffman et al. 2018). Alongside amino acids, pollen is also an important source of lipids, vitamins and minerals, which are essential for colony survival and development. On the other hand, carbohydrates and water are acquired from the nectar. The main carbohydrates found in nectar are glucose and fructose, which are simple sugars and immediately ready to use. Finally, water is not only used for its natural purpose of hydration and keeping osmotic pressure. It is also used to prepare some of the bee products and also to regulate the humidity inside the hive through evaporation, which is important for egg development (Mitchell 2019).

Health and disease The worrying decline of the worldwide honeybee population, especially dur- ing the last decade, is well known. Not only is this decline worrying from an ecological point of view, but also from an economic perspective because of the impact that honeybees have on global . For these reasons, there are more and more efforts aimed at understanding honeybee ecology and evo- lution. These have resulted in increased comprehension about honeybee dis- ease dynamics and their control. Parasites are the main drivers of disease in honeybees, and have been known as a major problem in apiculture for a long time. Varroa destructor (also known as the Varroa mite) is an ectoparasitic mite, and an important pathogenic agent (Anderson & Trueman 2000). It feeds on hemolymph of both adult bees and larvae, debilitating them and making them more susceptible to other pathogenic infections. The Varroa mite is also a vector for several RNA viruses that are considered to be a major contributing cause to the worldwide collapse of honeybee colonies (Martin 2002; Thoms et al. 2019; Kang et al. 2016). Nosema ceranae is another important parasite of honeybees (Fries et al. 1996; Higes et al. 2006). It is a microsporidian , and, upon ingestion, its spores germinate in the gut. Infection shows profound effects on honeybee

34 metabolism, flying behavior, pheromone profiles and profound negative ef- fects on life span (Higes et al. 2006; Paris et al. 2018). Finally, Paenibacillus larvae is probably the most important bacterial pathogen in honeybee colonies (Katznelson 1950; Heyndrickx et al. 1996). It is a Gram-positive spore-form- ing bacterium that infects the gut of larvae upon transmission through contam- inated food (Poppinga & Genersch 2015). This pathogenic bacterium is the causative agent of American foulbrood disease, one of the major disease prob- lems affecting worldwide beekeeping colonies. The spread of disease among honeybees becomes an even more complex matter when considering that infected individuals can shift colonies. This phe- nomenon is known as drifting behavior, and it is described as individuals that do not come back to their original colony, and who instead enter and establish in another colony in the vicinity. Since pioneer studies back in the 50’s that observed these colony drifting patterns (Free 1958), more studies have con- tributed to a better understanding of this intriguing yet relatively common be- havior. Up to 13-42 % of the individuals in a given colony have been reported as “alien” (Pfeiffer & Crailsheim 1998). The percentage of alien individuals varies and is of course influenced by the hive density on the area, the distance between them and also the time of season. Consequently, drifting has a critical effect on how pathogens and disease spread. A more recent study observed that honeybees that were infected with the Varroa mite were more prone to accepting drifting individuals into their colony. This might facilitate not only the infestation with Varroa mite, but also the uptake of other pests and para- sites (Forfert et al. 2015). All in all, there are many factors that affect the health of honeybees, and many efforts are being made to approach the problem of declining popula- tions. The gut microbiome has proven to greatly influence the health and phys- iology of honeybees, and therefore is a main focus of study.

Gut microbiota The honeybee gut is colonized at early stages through horizontal transmission facilitated by the honeybees’ social behavior (Powell et al. 2014). After that, the bacterial composition of the microbiota varies throughout the different de- velopmental stages, showing more variability among larvae individuals (Martinson et al. 2012; Hroncova et al. 2015; Vojvodic et al. 2013; Dong et al. 2020). The reference composition is that from adult worker bees because it remains relatively stable despite changes in the diet and physiology between nurses and foragers (Zheng et al. 2018). Thus, it has been established that the gut microbiota is composed of 5-9 bacterial taxa that are present in honeybees worldwide (Martinson et al. 2011; Zheng et al. 2018). Each taxon corresponds to a species or a cluster of closely related species, and these are considered as core members of the gut microbiota. However, there is a huge intraspecies

35 diversity that remains unnoticed when using metagenomic approaches based on the 16S rRNA gene (Zheng et al. 2019; Engel et al. 2014; Ellegaard & Engel 2019). An elegant example is that observed in two honeybee species, A. mellifera and A. cerana. Although they diverged about 6 MYA, they are largely colonized by the same core bacterial species. However, there is a clear host specificity at the strain level that is evident when analyzed with genomic resolution (Ellegaard et al. 2020). This specialization might be a result of di- vergent ecological contexts, and could have consequences on the microbiota functioning and host-symbiont interactions. The gut of honeybees is compartmentalized into three main areas: the fore- gut (represented by the crop), the midgut and the hindgut, which contains the ileum and the rectum (figure 3). The crop (or honey stomach) is the most prox- imal part of the digestive track. Nectar is collected here to be later transformed into honey in the beehive. Few bacterial species inhabit the crop, and its mi- crobiota is strongly biased towards lactic acid bacteria (LAB), which consti- tutes a group of bacteria with a characteristic fermentative metabolism that will be further discussed in the next chapter. Bacteria in the crop vary numer- ically across seasons, but are consistent across Apis species (Olofsson & Vásquez 2008; Olofsson et al. 2011). The crop microbiota was observed to form biofilms and networks that seemed to attach to the crop wall by extracel- lular polymeric substances (Vásquez et al. 2012). Later studies also character- ized the production of extracellular glucansucrases by Lactobacillus kunkeei, involved in the formation of highly branched polymers as part of biofilm struc- tures (Vasileva et al. 2017). Lactobacillus kunkeei is the most abundant spe- cies in the crop (Tamarit et al. 2015; Vásquez et al. 2012), and although the crop microbiota in general showed antimicrobial activity, that of L. kunkeei was especially remarkable (Vásquez et al. 2012; Iorizzo et al. 2020; Berríos et al. 2018). Besides, L. kunkeei has also shown positive effects on the honey- bee’s health in cases of infection caused by P. larvae and N. ceranae (Arredondo et al. 2018; Daisley et al. 2020). However, despite beneficial ef- fects towards the bee’s health, L. kunkeei is sometimes considered as transient members of the bee gut, and even opportunistic pathogens (Anderson et al. 2013; Anderson & Ricigliano 2017). These matters will be further discussed in more detail in Chapter 4.

36 Figure 3. Structure of the honeybee gut and its associated microbiota. The gut is divided in three main sections: the foregut, the midgut and the hindgut (which contains both the ileum and the rectum). The honey crop is in the foregut, and contains fewer bacterial cells than the most distal parts of the gut. The most dominant species in the crop is L. kunkeei. The midgut also contains few bacteria and does not have a stable characterized microbiota. The ileum contains a high density of bacterial cells that mostly belong to three species (F. perrara, S. alvi and G. apicola). Finally, the rectum contains an even greater bacterial cells density, dominated by Lactobacillus and Bifidobacterium species. Image adapted from Raymann & Moran (2018).

The midgut connects the honey crop with the rest of the digestive track. It also has few bacterial species, and its composition mainly resembles that of the ileum. The ileum is the most proximal part of the hindgut, and starts to display a richer microbiota, both in diversity and abundance. Here, the three dominant species found are Snodgrasella alvi, Gilliamella apicola and Frischella per- rara. They display particular distributions within the gut. Both S. alvi and G. apicola are found in close association with each other, with the first one firmly attached to the gut epithelium and G. apicola layered on top of it. The location of F. perrara is limited to the pylorous ¾the region connecting the ileum to the midgut. Finally, most Lactobacillus species are found in the rectum, and belong to the formerly named Firm-4 and Firm-5 clades. Each clade forms an independent monophyletic group comprising species isolated only from the bee gut. Firm-4 includes Lactobacillus mellis, Lactobacillus mellifer and Lac- tobacillus bombi. Firm-5 includes Lactobacillus apis, Lactobacillus helsing- borgensis, Lactobacillus melliventris, Lactobacillus kimbladii, Lactobacillus kullabergensis and Lactobacillus panisapium (Heo et al. 2020). Lactobacilli within these clades co-dominate the microbiota composition together with Bifidobacterium species, where B. asteroids is the most abundant (Moran 2015; Zheng et al. 2018). The various effects of the gut microbiome upon honeybee health and phys- iology are the object of extensive studies. For instance, it has been reported that the gut microbiota influences the sugar response and insulin/insulin-like signaling, and also affect body/gut weight gain (Zheng et al. 2017). There are

37 also correlations that describe a role of the gut microbiota and pathogen re- sistance (Forsgren et al. 2010; Schwarz et al. 2016; Kwong et al. 2017). Be- sides, bacteria within the gut ferment sugars and polysaccharides from the host diet that cannot be otherwise digested (Engel et al. 2012; Zheng et al. 2016), and they contribute to reducing the pH and redox potential (Zheng et al. 2017). Multiple stressors are correlated with states of dysbiosis in the gut. Expo- sure to toxic compounds such as pesticides, and also to pathogenic infections, have been shown to affect the composition of the microbiota, for instance by strongly decreasing the abundance of Lactobacillus spp. and Bifidobacterium spp. (Rouzé et al. 2019). The same populations, together with S. alvi, showed similar patterns of decrease upon exposure to antibiotic treatment, and sur- vival rates of honeybees dropped dramatically a few days later (Raymann et al. 2017). Increased mortality, together with impaired development, was also reported in experiments where bees where fed with aged diets (Maes et al. 2016). This study observed altered abundances of core microbiota members that were positively correlated with the presence of pathogenic species. Other diet compounds, together with climate conditions and age/caste of the bee were also correlated with dysbiosis (Raymann & Moran 2018) with negative consequences on gut homeostasis and host physiology.

38 Chapter 4: Lactobacillus kunkeei

Your reading will be even better then, after a lifetime of thought and effort, because it will come from conscious understanding. Grace attained like that is deeper and fuller than grace that comes freely, and furthermore, once you’ve gained it, it will never leave you. ― Philip Pullman, His Dark Materials

Lactobacillus kunkeei was firstly identified on spoiled wine (Edwards et al. 1998; Huang et al. 1996). Since then, it has drawn an increasing interest among the scientific community as a honeybee-associated symbiont, and also as a potential probiotic. In this chapter, I will provide an overview of what is known about this bacterium, as an introduction to the organism on which this doctoral thesis focuses on.

Taxonomic context: The genus Lactobacillus Lactobacillus kunkeei belongs to a large and diverse group of bacteria that has experienced extensive taxonomic rearrangements for more than a century. The genus Lactobacillus was firstly described by Martinus Beijerinck, Dutch bot- anist and , back in 1901. It was described as a group of Gram- positive, fermentative, facultatively anaerobic and non-spore-forming bacte- ria. The genus is included in the phylum Firmicutes, class , order Lac- tobacillales, family ¾which currently contains the genera Lactobacillus, Paralactobacillus, Pediococcus and Sharpea. The closest rela- tives to Lactobacillaceae are Leuconostocaceae, which includes the genera Convivina, Fructobacillus, Leuconostoc, Oenococcus and Weissella (Zheng et al. 2020). The first efforts to classify lactobacilli were based on the study of pheno- typic traits, such as growth conditions and fermentation pathways (Orla- Jensen 1919). This approach turned out to be strongly inconsistent under the light of the phylogeny-based methods that became common practice by the end of the century (Hammes & Vogel 1995). The sequencing era entailed an increased access to data and tools from which taxonomical studies have greatly benefited. Single-gene phylogenetic trees (including 16S rRNA) al- lowed the grouping of species into consistent clades of closely related organ- isms, or phylogroups (Salvetti et al. 2012). Phylogenomic analyses, based on

39 core genome reconstructions, provided robust evidence that all lactobacilli could be subdivided in at least 24 phylogenetic groups, being Pediococcus placed within the genus Lactobacillus (Zheng et al. 2015; Salvetti et al. 2012). The Lactobacillus genus complex was proposed to include Lactobacillus, Pediococcus, Weisella, Leuconostoc Oenococcus and Fructobacillus, based on phylogenomic analyses using 16 proteins shared by more than 200 strains (Sun et al. 2015). The number of known Lactobacillus species has experienced an exponen- tial growth for over one century, remarkably accelerated during the last ten years by numerous large-scale genomic studies (Salvetti et al. 2018; Duar et al. 2017; Zheng et al. 2015; Sun et al. 2015). As a result, the incredible amount of available data has shown that the general level of genetic diversity within the Lactobacillus genus complex exceeds by far what is generally found in bacterial genera, and even bacterial families. However, the species that form the stable phylogroups display a phylogenetic and physiological diversity that match the diversity of other bacterial genera. Based on functional and metabolic evidence, Lactobacillus species could be divided in three non-monophyletic groups associated to different lifestyles and habitats (Duar et al. 2017). Free-living species are characterized by being mostly or exclusively found on , encoding large and metabolically ver- satile genomes and having an optimal growth in environmental conditions. Host-associated species are mainly found in insects, birds, humans and farm animals. Interestingly, in all cases of host-associated species, the preferred habitat seems to be food storage/processing organs (Walter 2008). Four sub- groups are also recognized within the insect-associated lactobacilli. They tend to encode small genomes that display restricted carbohydrate fermentation abilities, which might reflect adaptations to the host’s diet (Duar et al. 2017). Finally, a third group is defined based on a nomadic lifestyle, meaning that these species do not form stable populations neither as free-living nor associ- ated to animal hosts (Duar et al. 2017). They possess large genomes with flex- ible metabolic and enzymatic repertories that allow them to survive in differ- ent environments and transition between them. The latest taxonomical proposal to reclassify the genus Lactobacillus (261 species) came in April 2020 with the aim of defining a biologically meaning- ful classification (Zheng et al. 2020). In this study, the authors re-evaluated how closely related Lactobacillaceae and Leuconotoccaceae are, and the sub- divisions within each family. Several criteria were analyzed: average nucleo- tide identity, average amino acid identity, core-gene average amino acid iden- tity, core genome phylogenies, signature genes, and metabolic or ecological criteria. As a result, they identified 26 lineages within Lactobacillaceae, which corresponded to the phylogroups observed by previous phylogenomic classifications (Salvetti et al. 2012; Zheng et al. 2015; Sun et al. 2015). These 26 phylogenetic groups are now proposed as new independent genera,

40 comprising Lactobacillus, Paralactobacillus, Pediococcus and 23 new genera consisting of species previously assigned to the genus Lactobacillus. Under the light of the latest proposed classification, Lactobacillus kunkeei would be renamed and classified into the new genus Apilactobacillus, which also includes the bee-adapted species A. apinorum, A. timberlakei, A. mich- eneri, A. quenuiae, A. ozensis and A. kosoi. However, this taxonomical reas- signment has been proposed only recently, and all the work included in this thesis has been performed on the former nomenclature. Therefore, for con- sistency with the rest of the thesis, they will still be referred to as Lactobacillus species.

Evolutionary context Lactobacillus kunkeei is closely related to other lactobacilli isolated from bees and flowers, and together they form the so-called “L. kunkeei phylogroup” (figure 4). The closest species to L. kunkeei is L. apinorum, which was also isolated from the gut of honeybees (Olofsson et al. 2014). Lactobacillus bombintestini, recently isolated from the gut of bumblebees, was shown to cluster as a sister group to L. kunkeei and L. apinorum (Heo et al. 2020). A monophyletic clade encompassing the species L. micheneri, L. timberlakei and L. quenuiae represent the next related sister clade. These species were isolated from flowers of Abutilon species and wild bees from the families Megachilidae and Halictidae (McFrederick et al. 2018). The species L. kosoi lies within the same clade as the latter, with whom shares major physiological properties (although it was isolated from a fermented beverage (Chiou et al. 2018). The most distant member within the L. kunkeei group is L. ozensis, which was isolated from flowers from multiple plant species (Kawasaki et al. 2011).

41 Figure 4. Currently supported species tree of L. kunkeei and related species. Sub- tree of a genome phylogeny of all Lactobacillus species, calculated with 114 single- copy core genes by Zheng et al. (2020). Colored branches indicate different phy- logroups, and representative species are highlighted in bold. The asterisks (*) indicate the presence of species whose lifestyle is not clear. Main isolation sources of host- associated species are indicated with symbols, and the honeybee category includes also the hive environment. Adapted from (Zheng et al. 2020).

The sister clade to the L. kunkeei phylogroup is the L. fructivorans phy- logroup, which includes isolates from insect intestinal microbiota (L. fruc- tivorans), flowers (L. florum) and fermented foods (L. sanfranciscensis). Closely related to the aforementioned phylogroups are the L. buchneri, L. col- linoides and L. brevis groups. Most of these latter species have larger genomes and free-living or undefined lifestyles (Duar et al. 2017; Zheng et al. 2020). All the species within the L. kunkeei phylogroup have recently been in- cluded within the newly proposed genus Apilactobacillus (Zheng et al. 2020), and seemingly have an insect-associated lifestyle ¾ specifically, associated to bee species, as the newly proposed name indicates. Although not all have been isolated from bees, the genomic, metabolic and ecological characteristics they share suggest such a lifestyle (Duar et al. 2017). The same reasoning ap- plies to the L. fructivorans phylogroup, which now constitutes the new genus Fructilactobacillus, whose members are also considered to be insect-associ- ated. Therefore, in the light of these observations, it is likely that the transition from free-living to a host-associated lifestyle took place in the ancestor of the L. kunkeei-L. fructivorans phylogroups.

42 Ecological context Lactobacillus kunkeei is frequently isolated from the honey crop of honeybees and stingless bees, where is the dominant species and a major component of the biofilm produced by the resident bacterial populations (Vásquez et al. 2012; Tamarit et al. 2015; Olofsson et al. 2016). It has also been isolated from other parts of the gut (although in lower abundance, and less frequently), pol- len, nectar, fruits and bee products such as honey or beebread (Tamarit et al. 2015; Endo et al. 2012; Neveling et al. 2012; Moran 2015). Although L. kunkeei was first described over two decades ago, its ecologi- cal role has not been completely unraveled yet. The variety of isolation sources, and its low abundance in gut regions other than the crop, has ques- tioned its honeybee-associated lifestyle (Moran 2015; Corby-Harris et al. 2014; Ellegaard & Engel 2019). As explained in the previous chapter, other Lactobacillus species in the gut are consistently found in high abundance in the hindgut, which is why they considered true members of the gut microbiota. Compared to them, the presence of L. kunkeei outside the gut is sometimes considered an indicative of a predominantly environmental or hive-associated lifestyle (Corby-Harris et al. 2014; Anderson & Ricigliano 2017) rather than bee-associated. Species considered as hive bacteria, such as Parasaccharibac- ter apium, are adapted to changing conditions across different micro-niches, including stored food (royal jelly, honey and bee bread) and the gut of bee individuals under different life stages and diets (larvae, nurses, foragers). They are suggested to be opportunistic pathogens, since their abundance in worker bees is associated to disease states, diminishing microbiota structure and func- tion (Anderson & Ricigliano 2017). However, recent studies based on genomic and experimental data propose L. kunkeei as a honeybee-associated symbiont, which may play beneficial roles towards the host fitness.

Potential symbiotic role The main potential role attributed to L. kunkeei is as a protective symbiont. One of the metabolic characteristics of L. kunkeei is its ability to produce and secrete antimicrobial compounds. Experimental evidence has shown that LAB species from the crop are able to inhibit the growth of various environmental microbes found in flowers (Vásquez et al. 2012) and even pathogenic organ- isms (Olofsson et al. 2016). From all the LAB species included in the crop microbiota, L. kunkeei was the one showing the strongest inhibition effects against all tested flower microbes. Among the bioactive compounds with an- timicrobial properties, L. kunkeei produces lactic, acetic and formic acids, vol- atiles such as benzene and xylene, and various enzymes and bacteriocins (Olofsson et al. 2016). Bacteriocins are peptides with different bactericidal and bacteriostatic activity. Most bacteriocins are used against

43 phylogenetically related bacteria and against some pathogens such as Bacil- lus, Clostridium, and Listeria. Although the reported spectrum of action of bacteriocins is mainly against Gram-positive bacteria, recent studies showed evidence of their ability to act also against Gram-negative bacteria (Kareb & Aïder 2020). Particularly, L. kunkeei strain FF30-6 is able to inhibit the growth of Melissocccus plutonius, a major bacterial pathogen in bee larvae, by the production of bacteriocins (Endo & Salminen 2013). The antimicrobial activ- ity of this strain showed an interesting degree of specificity, since other hon- eybee-associated cultures were not negatively affected. Several other strains have shown antifungal activity against cultures of Acosphaera apis, another pathogen affecting larvae (Iorizzo et al. 2020). Genomic studies have shown that some strains also encode several lysozymes or lysozyme-activity en- zymes, which also have antimicrobial functions (Djukic et al. 2015; Butler et al. 2013). However, it is also possible that the production of such compounds might be a mechanism of niche protection against pathogens and from com- petitor species rather than protection towards the bee. In addition to antimicrobial properties, L. kunkeei has also been proposed as a potential probiotic species because of the positive effects on honeybee health (Duong et al. 2020). For example, a mix of different lactobacilli, which included L. kunkeei, administered as a food supplement proved to be effective against P. larvae infection by reducing the pathogen load, upregulating the expression of key immune genes and overall improving the survival rates of infected larvae (Daisley et al. 2020). Another study reported the effectiveness of a mix composed only by L. kunkeei strains isolated from the midgut, which also reduced the number of larvae infected by P. larvae, as well as the number of spores produced by N. ceranae in adult bees, with no negative effects on the bee individuals (Arredondo et al. 2018). Similar effects against the infec- tion by N. ceranae were observed as well when supplemented as part of a LAB mixture containing other Lactobacillus and Bifidobacterium species (Baffoni et al. 2016). Experimental evidence on other properties of L. kunkeei such as biofilm formation and auto-aggregation abilities, as well as high hydrophobi- city, strongly support its use as a probiotic to improve honeybee health (Iorizzo et al. 2020). Besides, L. kunkeei is a good candidate to be used as a paratransgenic organism to fight disease in the honeybee. This strategy con- sists of genetically modifying an endogenous microorganism and reintroduc- ing it into the host to express molecules against pathogen development. Lac- tobacillus kunkeei proved to be able to transform, and hence integrate the de- sired DNA into its genome, and to survive well in the gut environment upon reintroduction without negative effects on the host (Rangberg et al. 2015). This opens a world of possibilities to approach honeybee health and treatment of diseases from a microbial perspective. Regardless of the biological role of L. kunkeei in the context of honeybees, some described traits suggest a close association to the bee. Experimental and genomic studies support the biofilm formation potential of L. kunkeei by the

44 production of large surface proteins or extracellular matrix-binding proteins that might bind to epithelial cells (Djukic et al. 2015; Iorizzo et al. 2020). Ge- nome-based studies have also revealed the presence of putative polysaccha- ride biosynthesis proteins in several strains, which may participate in the pro- duction of capsule or s-layer structures (Djukic et al. 2015). As described in previous chapters, these kinds of extracellular structures actively participate in avoidance of detection by the immune system of the host, and also contrib- ute to surface attachment. They also confer protection from various abiotic factors, so the formation of capsules could explain the isolation of L. kunkeei from environments other than the bee gut, such as the hive, flowers or fruits. All in all, the ecological context of L. kunkeei is still object of intense de- bate, and more work will be needed to provide an answer. The work in this thesis has not focused specifically on this matter, but some of the observations may contribute to feed the discussion about the true niche of L. kunkeei.

Metabolic profile As mentioned beforehand, L. kunkeei is a lactic acid bacterium. Bacteria within this group produce lactate as the major product resulting from the fer- mentation of carbon sources. Depending on the fermentation pathway they use, LAB can be classified as homofermentative or heterofermentative (figure 5). The first group uses the Embden-Meyerhof-Parnas (EMP) pathway, which produces lactate as sole product of the fermentation process. Heterofermenta- tive bacteria use the 6-phosphogluconate/phosphoketolase pathway (6- PG/PK), which produces both lactate and ethanol. The lactate is released ex- tracellularly, contributing to lowering the pH of the environment, hindering the growth of many microorganisms that do not withstand such acidic condi- tions.

45 Figure 5. Lactic acid metabolism. Depending on the products obtained from the fer- mentation of carbon sources, LAB are classified as homofermentative (red pathway) if they produce lactate, or heterofermentative (yellow pathway) if apart from lactate, they produce other products as ethanol. A subdivision within the last group is repre- sented by fructophilic bacteria, which have a partial or complete deletion of the adhE gene and do not produce ethanol (or not as efficiently). Instead, together with lactate, they produce acetate with the acetate kinase (ack).

Fructophilic lactic acid bacteria (FLAB) are a subdivision within heterofer- mentative LAB that share some unique characteristics. Their optimum growth substrate is fructose, although the addition of electron acceptor substrates such as oxygen, fructose, and pyruvate can enhance their growth on glucose. For the same reason, although they can grow on anaerobic conditions, they grow better in the presence of oxygen. Another peculiarity related to their heterofer- mentative metabolism is that they produce CO2 and acetate as extra accessory products, together with lactate. Within FLAB, species are classified as facul- tative or obligate fructophilic based on their ability to grow on glucose with or without the presence of external electrons acceptors. The biochemical cause behind this is the absence of alcohol and/or acetaldehyde dehydrogenase ac- tivity (ADH and ALDH, respectively). Obligate FLAB show a partial or total deletion of the bifunctional ADH/ALDH gene (adhE) and are not able to pro- duce ethanol (figure 5). As a consequence, there is an imbalance of NAD+/NADH ratio, and they need extra co-substrates to oxidize NADH (Endo et al. 2018, 2009; Filannino et al. 2019; Maeno et al. 2016). The unique characteristics in FLAB are suggested to be the result of genome reductive evolution linked to the adaptation to specific niches (Maeno et al. 2017).

46 FLAB species are limited to the genera Lactobacillus and Fructobacillus, and have been isolated from fructose-rich environments such as fructose-feeding insects, flowers, fruits and fermented foods derived from the latter (Endo & Salminen 2013; Endo 2012; Filannino et al. 2019). Within the Lactobacillus genus, L. kunkeei, L. apinorum, L. kosoi and L. florum are the only fructophilic species described up to date (Filannino et al. 2019). L. kunkeei in particular has been described as having an obligate fructophilic heterofermentative me- tabolism, but it still possesses ALDH activity as a result of a partial deletion on its adhE gene (Maeno et al. 2016). Comparative studies between L. kunkeei strains and close relatives reported large losses of gene families involved in biosynthesis of amino acids and car- bohydrate metabolism and transport. In the context of the genome reduction experienced by L. kunkeei, these patterns are consistent with a shift to a nutri- tionally-rich growth environment (Tamarit et al. 2015). Additionally, other observations reported an enrichment of genes involved in amino acids metab- olism and transport in the accessory genome (Asenjo et al. 2016; Maeno et al. 2016). The complete genome of strain MP2 also revealed auxotrophy for me- thionine and cysteine, and the presence of a D-methionine transport system, indicating that L. kunkeei acquires these amino acids from the environment (Asenjo et al. 2016). Similarly, a lack of glutamine transport system was also reported (Maeno et al. 2016). Nevertheless, there are differences regarding amino acid biosynthetic potential between strains, and the presence/absence gene patterns do not correlate with the species phylogeny (Tamarit et al. 2015). Besides, many of these biosynthetic genes are located in the same ge- nomic regions across the strains, showing syntenic arrangements, as an indic- ative of independent losses rather than parallel gains across the species (Tamarit et al. 2015). Therefore, larger comparative studies might provide more information on the biosynthetic capabilities of L. kunkeei strains and its relation with their ecological context. Other metabolic characteristics of L. kunkeei include an overrepresentation of genes involved in fatty acid biosynthesis and metabolism, especially when compared to other lactobacilli with small genomes (Maeno et al. 2016). Strain MP2 encodes two pathways for isoprenoid biosynthesis, the mevalonate (MVA) and the methylerythritol 4-phosphate (MEP) pathways. These path- ways are non-homologous, and have distinct distributions across the tree of life. Archaea and eukaryotes are known to possess the MVA pathway, while bacteria generally harbor the MEP pathway. Although some bacteria do en- code genes of the MVA pathway that are more similar to the eukaryotic ver- sions than to the archaeal ones, their patchy taxonomical distributions suggest that they were acquired by HGT (Hoshino & Gaucher 2018).

47 Genomic traits With an AT-rich, small genome of about 1.56 Mb, L. kunkeei lies within the range corresponding to bacteria with a host-associated lifestyle (Moran & Bennett 2014; Duar et al. 2017). The reduction of the genome size has been attributed, as previously mentioned, to niche adaptation on a nutritionally rich environment, and also to seasonal bottlenecks in the population of L. kunkeei in the crop, which drastically declines during fall and winter (Tamarit et al. 2015; Maeno et al. 2016, 2017). On the other hand, a recent study suggests an alternative mechanism for genome reduction driven by the loss of the mutY gene (Heo et al. 2020). This gene encodes an A/G-specific glycosylase in- volved in DNA repair processes. Therefore, the genomic reduction would be a consequence of the loss of this gene, favoring an accumulation of AT muta- tions and increasing the genome instability, allowing the genetic drift to act by reducing the genome size. Therefore, the evolutionary adaptations behind genome reduction of L. kunkeei are not fully understood, and more work would be needed to determine which are the causes behind. Up to date, the only complete genome publicly available is that of strain MP2 (Asenjo et al. 2016), which has a size of 1.6 Mb and 36.9% GC content. Other draft genomes have been published, but the use of PacBio long reads for MP2 allowed in this case to close the gaps that cannot be resolved on Illu- mina-based assemblies. One of the most relevant observations on this closed genome is that L. kunkeei encodes 5 copies of the ribosomal operon, which allows rapid growth with doubling times of 55 minutes (Tamarit et al. 2015). Although there is not a strong correlation between genome size and number of rRNA operons, genomes of this size normally have fewer copies (Vogel et al. 2011). Besides, Tamarit et al. also observed a near identical sequence iden- tity at the 16S rRNA gene among all ten L. kunkeei strains included in their study. Such level of sequence similarity hinders the detection of intra-species diversity on microbiome studies, which can be salvaged by genome-base com- parative studies (i.e. phylogenomics).

48 Figure 6. Functionally biased chromosome architecture of L. kunkeei. The chro- mosome is here represented by a black circle, with the black dot indicating the position of the origin of replication. The colored strokes represent regions in the chromosome where different types of genes accumulate. Most of the species-specific and amino acid metabolism genes (blue and orange, respectively) are located in the origin half of the chromosome, and so are a group of genes encoding proteins secreted in response to contact with pathogenic bacteria (green). In the opposite half accumulate the ma- jority of single-copy orthologous genes shared by all L. kunkeei strains (purple), as well as genes involved in translation (red). This distribution pattern is not common among bacteria. Image adapted from Tamarit et al. (2015)

One of the most striking genomic traits of L. kunkeei is its functionally biased chromosomal architecture, which was firstly observed by Tamarit et al. (2015) based on the analysis of 10 L. kunkeei strains obtained from different bee spe- cies (figure 6). They observed that core genes were mainly located around the terminus of replication, while metabolic genes (specially related to amino ac- ids) accumulated around the origin. The later region also accumulated most of the newly acquired and/or strain specific genes, as well as genes encoding secreted products (Tamarit et al. 2015). Although the mechanisms leading to this biased organization are not known, it has only been reported in L. kunkeei and close relatives, and more work is needed to understand the biological origin and relevance of such patterns. Additionally, all the L. kunkeei genomes contain a region of about 90 kb that only contains 4-5 large genes in tandem organization (Djukic et al. 2015; Tamarit et al. 2015). Based on the identification of a signal peptide at the N- terminal end and secondary structure predictions, they are thought to be se- creted proteins that are attached to the cell wall. Experimental evidence indi- cates that they are also expressed, since their products have been identified with PCR amplification (Djukic et al. 2015). They have no gene homologs in any sequenced microbial genome, but show distant and/or structural similarity to conserved domains found in adhesion proteins encoded by Staphylococcus,

49 Streptococcus, Burkholderia, Weissella, Mannheimia, Marinomonas, Pedio- coccus and also in Lactobacillus species (Djukic et al. 2015). Their location near the origin of replication and their high GC content suggests that these genes were horizontally acquired (Tamarit et al. 2015). Preliminary analyses also indicate that these large genes are subject to recombination processes (Tamarit et al. 2015). These observations predict a potential role of the large genes of L. kunkeei in surface attachment.

50 Aims

By large-scale comparative genomic studies, this PhD thesis has aimed to:

• Explore the genomic diversity enclosed within the L. kunkeei spe- cies • Evaluate evolution mechanisms that shape the genome • Determine whether L. kunkeei is a stable member of the honeybee crop microbiota • Contribute to unravel the mechanisms of interaction between L. kunkeei and the honeybee

To reach such goals, we performed the following studies, which were moti- vated by the following questions:

Paper I: • Does L. kunkeei diversity vary between isolated populations? • How does the chromosomal gene content vary between them? • Is the composition of the mobilome correlated to geographical origin, or to phylogenetic position?

Paper II: • Does the population of L. kunkeei change through time? • Do neighboring hives contain similar L. kunkeei populations? • Does pollen availability affect the bacterial composition of the bee population during the summer?

Paper III: • Is the current L. kunkeei data representative of the species genomic diversity? • How does the complete species phylogeny look like? • Are recombination events a major driver of evolution in the core genome of L. kunkeei?

Paper IV: • How do the giant genes vary within and between genomes? • By which mechanisms have the giant genes evolved? • What do the giant genes interact with?

51 Results

Paper I In this study, we compared the genome and mobilome diversity of Lactoba- cillus kunkeei populations isolated from the crop of honeybees from two geo- graphical origins — Åland and Gotland, two islands in the Baltic Sea. We sampled multiple L. kunkeei isolates from 1-5 bees from each popula- tion, and obtained 41 new genomes using PacBio sequencing technology. The use of long reads allowed us to produce closed genomes that revealed the pres- ence of 5 copies of the rRNA operon. All the genomes were assembled in one contig, and had a size of about 1.5 Mb, encoding an average of 1500 genes. We placed the newly sequenced genomes in an evolutionary context by inferring a phylogenomic tree using other L. kunkeei genomes. An ancestral reconstruction revealed the acquisition and loss of relevant functions by dif- ferent subgroups of the two insular populations. We screened in more detail some of these variable functions. We also explored the mobilome of the spe- cies, with emphasis on our newly sequenced strains. We characterized a num- ber of integrated prophages and insertion elements, as well as extrachromoso- mal elements that included both plasmids and phages.

Different strain diversity between the two L. kunkeei populations We reconstructed a phylogenomic tree using 899 single-copy pan-orthologous genes that were shared by L. kunkeei and L. apinorum Fhon13, the closest outgroup. Most of the newly sequenced genomes were classified into the pre- viously defined phylogroups, and also formed a new and divergent clade that we designated as phylogroup E. A large majority of isolates from Gotland bees were classified into phylogroup A, and had extremely low levels of sequence divergence between them. Only three isolates from Gotland were classified into phylogroup B, again with low divergence levels between them. On the other hand, isolates from Åland bees displayed a richer strain diversity, and were classified into phylogroups A, B and E. Therefore, we concluded that L. kunkeei population in the honey crop of bees from Åland and Gotland varies greatly in strain diversity and co-occurrence within individual bees, being richer among Åland bees isolates.

52 Variable amino acids biosynthetic potential All of our newly sequenced L. kunkeei strains displayed complete biosynthetic pathways to 9 amino acids, but only 2 of the 10 considered as essential. Some pathways were completely absent, while others contained only some genes that are probably used in other metabolic steps. From these results, and the presence of multiple amino acids membrane transporters, we concluded that L. kunkeei depends on the supply of amino acids from other sources, such as pollen.

L. kunkeei has a diverse and dynamic mobilome The identification and characterization of mobile elements integrated in the chromosome of L. kunkeei revealed the presence of components of different nature. Most of the genomes contained a high number of insertion sequences identified as transposase genes, which tend to accumulate near the origin of replication. The chromosomal location of such elements tends to be conserved within phylogroups, indicating strain-specific transposon proliferation pat- terns. Moreover, one prophage was found in several strains across the phylog- eny, with different levels of conservation. We identified two insertion spots for the prophage at either side of the terminus of replication. Finally, the pres- ence of a conserved region encoding either CRISPR-Cas or restriction-modi- fication system indicated that L. kunkeei is frequently exposed to foreign DNA, and has a conserved mechanism to protect their cells from the invasion of mobile elements. We also characterized the multiple extrachromosomal elements that were found in many of the newly sequenced genomes. Some of them were identi- fied as phages, are were homologous to the integrated prophages found in some of the chromosomes of L. kunkeei, indicating that it keeps active infec- tions within the bacterial population. Other extrachromosomal elements were identified as putative plasmids, and where classified into 7 types that were phylogroup-specific. The nature of these replicons is rather unknown, since they genetically resemble both plasmids and transposable elements.

Paper II In this study, we have obtained L. kunkeei isolates from the crop of honeybees that belonged to neighboring hives in the Swedish region of Helsingborg. In order to compare how the bacterial population changes with time, we collected bee samples at different dates between May and August. We collected 1-5 bees from four hives at different timepoints during the summer months, and obtained 1-4 L. kunkeei isolates from each bee, in aver- age. In total, we obtained 61 isolates, 57 of which distributed between hives 3 and 4. We used PacBio technology and obtained closed genomes, many of

53 them including a variable number of extra contigs of shorter length. The ge- nome size and number of encoded genes resembled other complete genomes of L. kunkeei. We performed a phylogenomic analysis with the newly sequenced ge- nomes and other available L. kunkeei, as well as closely related Lactobacillus species found in the honeybees, wild bees and flowers. We also explored the amino acids biosynthetic potential of all species to note any correlation with time, under the hypothesis that the fluctuating pollen availability during the season might be reflected at the biosynthetic potential level of the population. We were also interested in the diversity and stability of extrachromosomal elements encoded by the L. kunkeei population of our dataset.

The L. kunkeei population is stable through time, and hive-specific We identified 546 genes that were shared in one copy by our novel 61 new genomes, other available L. kunkeei genomes, and the selected outgroup spe- cies — L. apinorum, L. micheneri, L. timberlakei, L. kosoi, L. ozensis, L. flo- rum and L. sanfranciscensi. The phylogenomic tree indicated that the new L. kunkeei genomes were mostly classified within the existing phylogroups A and C, and phylogroup B at a lesser extent. Isolates from hive 3 mainly be- longed to phylogroup A, while most isolates from hive 4 clustered within phy- logroup C. Therefore, our observations indicate that the L. kunkeei population in the crop of these bees shows certain degree of hive-specificity. Besides, we did not observe co-occurrence of the most abundant strains in any bee. With regards to seasonality, we did not observe any pattern suggesting that the pop- ulation changes through time, since strain diversity did not show any temporal shift within the hives. This distribution was supported by principal component analyses.

Amino acid biosynthetic profiles within hives do not change during the summer The new genomes of L. kunkeei showed amino acid biosynthesis ab- sence/presence patterns comparable to those observed in similar studies. Given the different phylogenetic positioning of the strains belonging to each hive, we observed striking differences with regards to biosynthetic potential of the most variable amino acids (i.e. methionine, tyrosine and tryptophan). Therefore, the L. kunkeei population from different hives shows functional variability in their potential to synthetize certain amino acids, which is not strongly affected by seasonality.

The extrachromosomal elements are maintained through time Of the extrachromosomal elements encoded by the 61 genomes obtained in this study, we focused our analyses in two of them that where present in about 50% of the isolates. One of them, a putative plasmid named pSAM, was pre- dominant among strains from phylogroup C, but could also be found in a few

54 isolates from phylogroups A and B. Among others, this element encoded genes for the degradation of different compounds, including methionine and tryptophan. Interestingly, many of the isolates carrying pSAM were able to synthetize methionine and tryptophan from chromosomal genes. The second putative plasmid was named pNisin, and was identified only in phylogroup A. Its name comes from some of the genes that it encodes, involved in the pro- duction and mobilization of nisin, an antimicrobial peptide. Again, these ele- ments were at large extent phylogroup and hive-specific, and maintained at all timepoints throughout the summer.

Paper III In the light of the large amount of L. kunkeei genomes that have been se- quenced in the last years, we performed a large-scale analysis to explore the intra-species diversity and the main forces of evolution acting on the core ge- nome. We gathered all publicly available genomes of L. kunkeei, and added 103 unpublished novel and complete genomes resulting from our own research projects (including the resequenced strain Fhon2). In total, 134 genomes, in- cluding 126 L. kunkeei and 8 close outgroup species to provide an evolution- ary frame. We performed extensive phylogenomic analyses based on single-copy panorthologous genes that represented the core genome of the complete da- taset. We observed that some clades in the phylogeny were not fully resolved, and evaluated the effect of recombination as a potential mechanism behind. Finally, we calculated the species pangenome to evaluate whether the cur- rently available set of 126 genomes is representative of the species genetic diversity.

Lactobacillus kunkeei has an open pangenome Upon examination of the 126 genomes of L. kunkeei from a pangenomic per- spective, we determined that the species core genome is composed of 774 genes, which are shared by more than 98% of the strains. The rest of the genes are present in different subsets of strains, the majority of which (2850 genes) are only present in less than 14% of the strains. This indicates that great part of the accessory genome is strain or clade-specific. Overall, given the increas- ing trend of the cumulative number of non-core genes upon the addition of new genomes, we concluded that L. kunkeei has an open pangenome, reflect- ing the high diversity of the species and its ecological plasticity.

The phylogeny of L. kunkeei is largely stable and congruent We used two methods (Fasttree and IQ-Tree) to calculate a phylogenomic re- construction based on the set of single-copy genes shared by L. kunkeei and

55 the closest related species. The 126 L. kunkeei genomes were classified into 6 phylogroups (A-F), which were monophyletic and fully supported by both methods. However, the placement of a few strains differed between the two reconstructions.

Recombination does not largely affect the evolution of the core genome We examined the relative effect of recombination to gene divergence and phy- logenetic relationships within L. kunkeei. The calculation of the recombination to mutation rates (r/m) on the species core genes that were larger than 300 nucleotides was rather low (average about 0.17), indicating that the core genes mainly evolve by the accumulation of mutations. However, other recombina- tion prediction software detected recombination signals on many genes, sug- gesting multiple intragenic recombination events both within and between phylogroups. From our multiple approach, we identified 56 core genes that tested positive for recombination. Phylogenomic reconstructions excluding such recombinant genes produce trees that were largely consistent with the species tree. Whole-gene recombination events were also accounted for by examining single-protein trees compared to the phylogenomic reconstruc- tions. Again, the results confirmed frequent recombination events both within and between phylogroups. We concluded that recombination takes place at low levels, and although it is possible to detect its effects at the phylogenetic level, it does not greatly affect the generation of robust species tree.

Paper IV In this study, we explored the mechanisms of evolution of a cluster of 5 giant genes (9-25 kb long, each) that are conserved in all L. kunkeei and L. apinorum genomes, and have no homologs in any other species. Based on distant struc- tural similarity and the presence of a signal peptide and cell-achoring motif, they are predicted to be extracellular proteins potentially involved in surface attachment. We collected a representative dataset comprising a total of 83 L. kunkeei strains (isolated from A. mellifera and also other bee species) and L. apinorum Fhon13. First, we explored the sequence composition and similarity, and the spatial organization of the giant genes. After providing an evolutionary frame- work to the genomes, we also studied the evolution of the orthologous gene families in which the giant genes are classified. Finally, we evaluated the ef- fect of nucleotide substitutions and recombination events in the sequence evo- lution and diversification patterns.

The giant genes originated by multiple duplication events We determined that there are 5 giant genes families; 4 of them are mostly conserved in the whole genome dataset, and one is only present in a few

56 isolates, replacing the first gene in the cluster. Homology searches at the pro- tein level indicated that the first three genes in the giant genes cluster display higher levels of similarity between them than to the last two genes. Moreover, the last two genes in the cluster are only similar to genes in their same orthol- ogous family. This, together with phylogenetic patterns, indicate that the giant genes originated by subsequent gene duplications and posterior accumulation of changes that led to high sequence divergence, only preserving a short region at the C-terminal end that is conserved in all.

Both mutation and recombination drive sequence evolution None of the giant gene phylogenies resembled the species phylogeny. Dis- tantly related strains formed diverse and well supported clades for all the giant genes, with overall different tree topologies between them and compared to the species tree. This was interpreted as non-vertical processes shaping the sequence diversification, such as recombination. Intragenic recombination events were confirmed for the 5 gene families by different methods, although the r/m ratio did not indicate strong effects of recombination. Although the values obtained from nucleotide substitutions rates suggested that, generally, these genes evolve under purifying selection, we observed extreme values of sequence divergence (represented by synonymous substitutions rates) be- tween different pairs of genes. By looking more closely at how the sequence similarity varies across the genes, we noted peaks of high divergence, indica- tive of regional divergence within the genes.

Different putative functions predicted for the giant genes All genes have a conserved LPxTG region at the C-terminal end. This domain is involved in anchoring the protein to the cell wall, which means that the rest of the protein is exposed to the outside. In general, we observed higher se- quence divergence accumulating closer to the N-terminal region, which is pre- dicted to be exposed to the extracellular environment and hence involved in interaction and attachment. The second gene in the cluster showed the strong- est such pattern, and also showed extreme sequence divergence among L. kun- keei strains isolated from bees other than A. mellifera species. This suggests that this gene might be involved in the interaction with the bee surface, and such interaction might be host-specific. The last two genes in the giant genes cluster seem to have co-evolved, as observed by their almost-identical phylogenetic diversification patterns. This suggests that they might interact with each other, forming a protein complex. Yet, the structure and function of such complex is still unknown.

57 Perspectives

The work presented in this thesis represents a comprehensive study on the evolution of a specific inhabitant of the honeybee gut. The gut microbiota of honeybees has been extensively explored from different perspectives and us- ing various methodologies. The advantages of experimental studies focused on the whole microbiota community are obvious to understand how symbiotic systems are established and maintained. However, the knowledge that can be gathered on single species by performing large-scale comparative studies at the genome level can also be extremely valuable. Here, applying such an ap- proach, we have explored in detail the genetic potential and mechanisms of evolution of a honeybee-associated bacterium whose ecological niche and in- teraction with its potential host are not fully understood. One of the advantages of working with L. kunkeei has been the possibility of cultivation to obtain pure cultures prior to whole-genome sequencing. The small size of its genome and the use of long-read sequencing technologies have allowed us to obtain closed genomes that have proven to be extremely useful for our comparative analyses. In total, we have obtained 102 novel ge- nomes, which have greatly expanded the known intra-species diversity of L. kunkeei providing large amounts of genomic data for future studies. The data generated in these studies have added up to our understanding of how L. kunkeei populations vary between groups of geographically distant honeybees (Paper I). The case of the Åland and Gotland honeybee populations is extremely interesting. The ecological context of each of these populations has been shaped by different biotic and abiotic factors, such as climate condi- tions and the structure and composition of the floral community, which play key roles in honeybees’ fitness and survival. However, the most notable dif- ference is the presence or absence of the Varroa mite. The bees from the island of Gotland are naturally resistant to the infestation of the parasite, while the ones from Åland have not been in contact with Varroa for decades. We ob- served striking differences regarding strain diversity of L. kunkeei isolated from these two populations, surprisingly reduced among the mite-resistant bees from Gotland. When analyzing the strain diversity of honeybees from our third sampling site (Paper II), we observed similar strain diversity levels to those from Åland. It is impossible not to speculate whether the low diversity of L. kunkeei among these bees is a consequence of the presence of the Var- roa-mite, or the resistant bees themselves. Another question would be whether the two main strains found among these bees confer any advantage in such an

58 environment, which could explain why they are so abundant. Our studies hence open the doors for future investigations focused on exploring the L. kunkeei population from tolerant and susceptible bees from the island of Got- land, and whether the reduced intra-species diversity is also found among other gut bacteria. One of the main questions on the ecology of L. kunkeei is whether it is a member of the bee or the hive microbiota, even a contaminant from the flow- ers. Its presence on different bee products found in the hive, and the lack of consistency for isolation among adult bees, can be interpreted as a sign of weak association with the honeybee. However, our seasonal study performed on two hives that were thoroughly sampled during the summer suggested oth- erwise (Paper II). According to our results, the L. kunkeei population remained relatively stable through time on both hives. The similarity between isolates seemed to correlate more strongly with the hive than with the time they were sampled. Therefore, it seems that they establish stable populations throughout the summer. These observations argue against the potential acquisition of L. kunkeei from the flowers. However, they do not dismiss the possibility of L. kunkeei true niche being the beehive instead of the honey crop. More compar- ative studies are needed to understand the population dynamics of this species within the bees and also the beehive, by expanding the sampling to different areas of the hive and throughout the entire year to follow in detail all the fluc- tuations. To answer such questions, whole-genome metagenomic analyses would allow to perform quantitative estimations on the population size, while we assemble the genomes to perform comparative studies. The access to the great amount of currently available L. kunkeei complete genomes would strongly contribute to increase the quality of future genomes obtained with this approach. Another relevant aspect covered by this thesis has been the study of the giant genes (Paper IV). Bacterial genes this large are not common, but they are of great biological relevance. The five genes encoded by L. kunkeei, and L. apinorum, are predicted to be involved in adhesion. However, the lack of homologs in the databases and the absence of conserved domains make the task of determining their specific function extremely difficult. We used a large portion of the available genomes to study the mechanisms of evolution behind these genes. Apart from confirming previous observations on the recombino- genic nature of these genes (Tamarit et al. 2015), we also found extreme di- vergence patterns that showed regional distribution across the genes. Struc- tural analyses may be of help here, since we would then be able to determine whether the most conserved and divergent regions correspond to functional or structural domains. However, despite the need for complementary knowledge to infer more about the role of the giant genes, we extracted some relevant conclusions and hypotheses to be tested in the future. We observed that two of those genes have evolved in concert to each other, suggesting a potential interaction of their protein products. Another giant gene showed extreme

59 levels of divergence among L. kunkeei isolates obtained from different bee species. Interestingly, all its homologs showed striking differences in se- quence similarity in the region putatively exposed to the extracellular envi- ronment. Our interpretation is that this protein might attach to the bee surface, and this interaction might be host-specific. However, structural and experi- mental data would be needed to prove these hypotheses and to decipher the ultimate role of the giant genes. The access to more than one hundred genomes has given us the opportunity to explore the pangenomic landscape of L. kunkeei (Paper III). To our surprise, despite the large amounts of genomes that were used, our results indicated that L. kunkeei has a open pangenome, meaning that we have not covered yet the genetic diversity enclosed within this species. Such gene content variability between strains suggest a high ecological plasticity, and invites for further studies. The use of such great number of genomes was also valuable to per- form deep phylogenetic studies and to explore the mechanisms of evolution acting on the species core genome (Paper III). Although we determined that mutation is the main force driving sequence divergence among the core genes, we observed that recombination events occur within and between phy- logroups. Besides, those could be behind some of the phylogenetic inconsist- encies observed across the different genome topologies that we have obtained. Although it has not been a focus of this thesis, L. apinorum, the closest related species to L. kunkeei, might be of great relevance to complement future investigations. This bacterium is also isolated from the honey crop and has antimicrobial properties that inhibit the growth of different pathogens (Vásquez et al. 2012). Besides, it also encodes the set of unknown giant genes (Tamarit et al. 2015). However, only one genome is currently available (Olofsson et al. 2014), and little is known about this species. It would be in- teresting to perform similar large-scale analyses that could provide more in- formation on how the populations are structured, and how they differ with time and between different bee populations. Determining their abundance and diversity within the crop would provide valuable knowledge on the honeybee- associated bacterial communities. Taken together, the genomic data from our studies allowed us to explore the intra-species diversity of L. kunkeei with the use of almost 150 genomes, most of them complete. With such amount of data at hand, the studies that can be carried out are only limited by our imagination. The possibility to establish a genetic system with L. kunkeei will also allow to answer many of these ques- tions and others that might arise.

60 Svensk sammanfattning

Något som Evolutionen har lärt oss, är att “en individ” inte alltid är ett enkelt begrepp. När vi till exempel tänker på en människa, så kanske vi ser en enskild person framför oss. Detta är ju självklart inte fel, men vad vi ofta glömmer bort är alla de mikroorganismer som finns i, och på, vår kropp. Dessa mikro- organismer har funnit med oss under miljontals år, och har blivit en essentiell komponent i vårt välbefinnande. Med andra ord så kan man säga att det finns ett symbiotiskt förhållande mellan oss själva, och dessa mikroorganismer. Så vi är inte bara är den där enskilda personen, vi är oss, och vårt mikrobiom. Ett mikrobiom är de mikroorganismer som lever i tät symbios tillsammans med en annan organism, dess värd. Varje djur, växt och svampart har ett di- stinkt mikrobiom associerat till just sig. Dessa består av olika prokaryoter (bakterier och arkéer), encelliga eukaryoter (protister) och till och med , vilka alla lever tillsammans. Dock är de flesta studier som utförts på mikro- biom, ofta inriktade på just bakterier. Detta då de är lättare att odla vid ett laboratorium, vilket har lett till att vi också vet mest, samt har den största da- tamängden, om just den här gruppen. Denna överrepresentation av bakterier i dessa typer av studier har dock snedvridit vår förståelse om mikrobiom, men då fler och fler studier utförs, på ett mer komplett underlag, så kommer vi förhoppningsvis att ha en mer komplett förståelse om sammansättningen och betydelsen av organismers mikrobiom i framtiden. Mikrobiomen i sig kan utföra många viktiga funktioner för sin värd. Dessa inkluderar nedbrytandet av olika födoämnen, frigörande av olika metaboliter, skydd mot patogener och till och med som kamouflage via bioluminiscens, samt mycket mer. Ju mer vi studerar sammansättningen och funktionen av dessa mikrobiom, desto mer upptäcker vi om hur komplexa och evolutionärt viktiga dessa system faktiskt är. För att återgå till vårt mänskliga exempel så finns det flera distinkta mikrobiom knutna till oss. Dessa finns i munnen, tarm- systemet, vaginan och till och med på huden. Många studier har visat vilken viktig roll dessa mikrobiom har för at upprätthålla homeostasen i dessa områ- den, och för vårt allmänna välbefinnande. Några av dessa mikrobiella grupper är oerhört komplexa i sin sammansättning, framför allt de vi kan hitta i tarm- systemet. På grund av detta är det ofta en stor utmaning att förstå alla inter- aktioner dessa mikroorganismer har med vår kropp. Genom att studera mo- dellorganismer med enklare mikrobiom så kan vi dra paralleller, samt skapa oss en förståelse för många av dessa processer, och extrapolera denna nya kunskap till andra, mer komplicerade, system.

61 Insekter har visat sig vara utmärkta modellsystem för att studera hur mi- krobiom interagerar. Många insektsarter har bakteriella mikrobiom som end- ast består av ett fåtal arter, vilket är en god förutsättning för en studie. Det är dessutom ofta relativt enkelt att odla dessa bakterier vid ett laboratorium, vil- ket öppnar upp för en uppsjö av olika studier. Det finns många sätt vi kan använda oss av för att studera dessa bakterier, men denna avhandling fokuse- rar på bioinformatiska metoder. Det första steget för denna typ av analys är att extrahera, samt sekvensera, bakteriens arvsmassa. Detta ger oss tillgång till all genetisk information, och öppnar för mycket kraftfulla jämförelser. Om vi se- dan gör detta för ett stort antal bakterier, så kan vi börja svar på frågor om hur dessa organismer interagerar med sin omgivning, med sin värd, samt lära oss mer om deras evolutionära historia. I denna avhandling så har fokus legat på en specifik bakterieart, Lactobacil- lus kunkeei, vilken är en del av det mikrobiom man kan hitta i honungsmagen hos honungsbin. Till skillnad mot andra bakterier från bin så vet vi mycket lite om denna specifika art. För att försöka lära oss så mycket som möjlig om denna organism så isolerade, samt sekvenserade, vi ca hundra olika stammar av L. kunkeei från honungsbin. En av de huvudsakliga fynd vi gjorde var att olika populationer av bin kan husera mycket olika typer av L. kunkeei. Exakt vad detta har för biologiska konsekvenser är dock ännu inte helt kartlagt. Ge- nom att studera mikrobiomen hos honungsbin över en hel sommar kunde vi se att de typer av L. kunkeei som är knutna till bina är stabilt över tid, trots den stora variationen som existerar mellan olika bikupor. Denna stabilitet tyder på att det finns en stark sammankoppling mellan dessa bakterier och honungsbin. Vi har också studerat den evolutionära historien hos fem stycken ovanligt stora gener, mellan 9 och 25 kilo-baspar i storlek. Dessa gener är mycket in- tressanta då de är unika för L. kunkeei (och dess närmaste besläktade arter), och då vi inte vet någonting om deras funktion. Dessa gener uppvisar evolut- ionära mönster med stor variation, drivna av olika evolutionära mekanismer. Våra resultat antyder också att åtminstone en av dem är en del av interaktionen mellan bakterien och dess bi-värd, samt att denna interaktion är beroende av biart. Sammantaget så visar arbetet i denna avhandling vilket kraftfullt verktyg jämförande genomik är i studier av bakteriell evolution och ekologi. De över hundra nya genomen som genererats kommer att vara till gagn för framtida studier, och kan hjälpa oss att bättre förstå hur honungsbins mikrobiom fun- gerar. Förhoppningsvis kan detta leda till större insikter om bin, samt deras hälsa, och utgöra grunden för utvecklingen av nya strategier för att bemöta de problem med bihälsa som finns på en global nivå.

62 Resumen en español

Una de las lecciones que puede enseñarnos la Evolución es que el concepto de “individuo” es, de alguna manera, algo difuso. Por ejemplo, cuando pensa- mos en el ser humano, nos imaginamos a una persona. Lo cual no es inco- rrecto, por supuesto. Sin embargo, algo que no solemos incluir en esa imagen mental son todos los microorganismos que viven en nuestro cuerpo. Estos mi- croorganismos han estado con nosotros durante miles de años, y han llegado a convertirse en algo esencial para el correcto funcionamiento de nuestro or- ganismo. En otras palabras, se ha llegado a establecer una relación de simbio- sis entre ambas partes. Por tanto, en conjunto, no somos simplemente noso- tros, somos nosotros y nuestra microbiota. La microbiota es el conjunto de microorganismos que está estrechamente asociado a otro organismo, el cual actúa como hospedador. Todas las especies de animales, plantas y hongos tienen una microbiota característica asociada a ellos. Estas comunidades microbianas están compuestas por organismos pro- cariotas ( y arqueas), eucariotas unicelulares (protistas) e incluso vi- rus, y todos viven en un equilibrio dinámico. Sin embargo, la mayoría de es- tudios sobre microbiota están centrados en bacterias —entre otras razones, porque son más fáciles de cultivar, hay mucha información genética sobre ellos y han sido estudiados durante mucho tiempo. Este sesgo es contraprodu- cente a la hora de alcanzar un grado de conocimiento global sobre la compo- sición de la microbiota, y la relevancia que esta tiene para el hospedador. Sin embargo, cada vez hay más estudios centrados en otros microbios aparte de las bacterias, por lo que, con el tiempo, tendremos una idea más realista de la composición y la importancia de la microbiota en conjunto. Las comunidades microbianas juegan papeles tan diversos para el hospe- dador como la digestión de ciertos compuestos de la dieta, la secreción de productos esenciales, protección frente a organismos patógenos e incluso ca- muflage a través de la producción de bioluminiscencia, entre otros. Cuanto más estudiamos la composición y el papel de la microbiota en diferentes or- ganismos, más descubrimos la complejidad y la importancia evolutiva de di- chas asociaciones. Volviendo al ejemplo de los humanos, en nuestro cuerpo hay varias áreas que albergan una microbiota característica, como la boca, el intestino, la vagina e incluso la piel. Hay muchos estudios que muestran el papel tan importante que las bacterias desarrollan para mantener dicha zona, e incluso el cuerpo entero, en un estado óptimo de salud. Sin embargo, algunas de estas comunidades microbianas son extremadamente complejas en cuanto

63 a su composición, especialmente la del intestino. En este tipo de casos, desen- trañar las interacciones que existen entre las bacterias y nuestro cuerpo se con- vierte en todo un reto. El estudio de otros organismos modelo con más simples nos permite entender los mecanismos que existen detrás de esas interacciones, y, a partir de ahí, extrapolar parte de ese conocimiento a otros sistemas parecidos. Los insectos son un buen sistema modelo para entender cómo funciona la microbiota. Hay muchas especies que albergan comunidades bacterianas com- puestas de unas pocas especies, lo cual es un buen comienzo para simplificar las cosas. Además, en muchos casos, es posible aislar dichas bacterias y cre- cerlas en el laboratorio, y aquí nos encontramos ante todo un mundo de posi- bilidades. Existen muchas estrategias para estudiar la microbiota, pero esta tesis se ha centrado en el uso de técnicas bioinformáticas. El primer paso con- siste en extraer el material genético en forma de ADN, para después secuen- ciarlo y poder acceder a toda la información genética que se encuentra codifi- cada en el genoma de la bacteria. Esta es una técnica extremadamente útil porque podemos, literalmente, leer su ADN y conocer todo sobre esa bacteria (desde un punto de vista genético). El poder de la genómica comparada viene cuando somos capaces de estudiar los genomas de un gran número de bacte- rias; es entonces cuando podemos responder a preguntas sobre su evolución, cómo se han adaptado al ambiente en el que viven y qué tipos de interacciones establecen con su hospedador. El trabajo de esta tesis se ha centrado en el estudio de una bacteria en con- creto, Lactobacillus kunkeei, que forma parte de la microbiota de las abejas — concretamente, del estómago de la miel. Sin embargo, en comparación con otras bacterias que viven asociadas con las abejas, se sabe muy poco de esta. Con el fin de conocer todo lo posible sobre esta especie, hemos obtenido unas cien cepas diferentes de L. kunkeei, extraídas a partir de diversas abejas. Una de las conclusiones principales de nuestros estudios es que diferentes pobla- ciones de abejas pueden albergar poblaciones de L. kunkeei extremadamente diferentes. Sin embargo, las implicaciones biológicas de estas diferencias no se conocen todavía. A pesar de las variaciones que observamos en cuanto a la diversidad de cepas entre abejas, la población de L. kunkeei parece mantenerse estable en el tiempo. Esta estabilidad, junto con una aparente especificidad de cepas asociada a las diferentes colmenas, parecen sugerir que hay una fuerte asociación entre L. kunkeei y las abejas y/o colmenas. También hemos estu- diado la evolución de cinco genes gigantes, que ocupan casi un 10% del ge- noma. Estos genes están presentes únicamente en L. kunkeei y la especie más cercana, y no se sabe nada sobre la función que desempeñan (aunque se hipo- tetiza que participan en la adhesión de las bacterias a algún tipo de superficie). Según nuestros estudios, muestran unos patrones de evolución muy caracterí- sitcos, acumulando una gran variabilidad a nivel de secuencia como resultado de varios mecanismos al mismo tiempo. Nuestros resultados sugieren que al menos una de las proteínas codificadas por estos genes podría estar implicada

64 en la interacción con la abeja, y que esta interacción podría ser dependiente de la especie de la abeja. En conclusión, el trabajo llevado a cabo en esta tesis muestra la utilidad de la genómica comparada para estudiar evolución en bacterias y para descifrar el contexto ecológico de especies determinadas. Además, la obtención de más de cien nuevos genomas de L. kunkeei será de gran utilidad para futuros estu- dios genómicos, y contribuirá a profundizar en nuestro entendimiento sobre la microbiota de las abejas. Estos estudios pueden también contribuir a entender mejor la salud de las abejas, y a desarrollar nuevas estrategias de apicultura para hacer frente a las enfermedades que afectan globalmente a esta especie tan amenazada.

65 Acknowledgements

This PhD has been a long journey that started long time ago. What I’m most grateful for is to have been lucky enough to share this experience with so many people. People without whom the outcome of this adventure would have been extremely different, both at the scientific and personal level. People whose support as mentors, colleagues and friends has pushed me forward and made the whole experience much more enjoyable. Siv, I have to thank you for giving me the opportunity to fulfill the goal for which I started to prepare more than 10 years ago, when I discovered that research was the way to go for me. I’ve learnt much more than I expected in many different ways, and I think working with you has made me grow both as a scientist and as a person. The results of this thesis would have never seen the light without the amazing work of Kristina. Thank you for growing and pre- paring more than 100 L. kunkeei isolates, and thank you as well for letting me participate in part of that process. It was nice to take a break from the bioin- formatics and go back to the wet lab! I really loved working with you. I would also like to thank Matthew for always having the door of his office open for me, and for providing us with so much material to explore and discover the wonders of L. kunkeei. I’m also very happy that you brought me sampling bees one day, it was very exciting to pretend to be a beekeeper for a day! Along the same lines, thank you as well, Alejandra, for providing us with so many sam- ples that have been such a relevant part of this work. I have had the chance to be inspired by other great scientists around me, that have added up to make this experience even richer. Days would always brighten up with Lisa’s laughter filtering through open doors (and sometimes closed ones as well J). I keep many good moments in my memory about se- rious and not so serious conversations that have been extremely valuable to me. I specially appreciate how much I’ve learnt from you about teaching, a discipline that I love more than I thought I would. Jan, I admire your dedica- tion to all your students, and I thank you for the time you always dedicate to everyone. Your advice on results interpretations was also very useful and highly appreciated! I have also learnt many things, directly and indirectly, from Thijs. I think it is amazing the curiosity you have and how driven you are to pursue good and cool science, and the great team work environment that you create. I have been lucky enough to work in a lab full of amazing and multidisci- plinary people that have taught me a lot. Dani, thank you for encouraging me

66 to apply for this PhD student position, and for believing in me from the first to the last day. Working next to you has been really fun and stimulating, and our friendship behind has added an extra value to this whole experience. Karl, working with you has being really great. Not only because of your magic scripts and nice plots, but also because how clearly ideas seem always to be in your mind. And you are an amazing cook! Christian, the years we shared office were great, even though you didn’t want to sneak in some bread for me from bikupan J. I have learnt a lot with you, not only about right and wrong, but also about cool science and life in general. Erik, I’m impressed of the amazing work you have done “experimenting” with our beloved L. kunkeei in the lab, I’m sure great things will come thanks to that. Emil, you always bring up interesting papers about weird stuff, and thanks to you I got to try yoga with goats!! Mayank, it has been fun to teach with you and to talk about any sort of random stuff! Wei, thank you for always caring about me, and for being such a strong woman who has inspired me so much during these last years. The corridor hasn’t been the same since you left, but every now and then I think I hear a “miau” around the corner… Anna, I’ve enjoyed very much shar- ing office with you as well, even though “your side” made “my side” look (even more) messy. Thanks for all the laughs, trips to the vending machine and the Swedish lessons. Tom, I will never forget your oven-cooked Spanish omelet, it was surprisingly good! Next time you have to try with paella… or better not. Thanks also for your eternal smile and for always spreading so much joy. Gaëlle, it took me some time to get to know the cool person you are. I love your sense of humor and your artistic side! Minna, I miss your energy around, and I’m still amazed of the strong and passionate person you are. Other wonderful people have been around as well in Molevo. All the #TastyTuesdays, corridor hockey championships, mushroom/berries picking, fikas, beerclubs, nights out, board games sessions, parties, sports and races, trips, conferences, workshops… and the rest of things we’ve done together have definitely contributed to make this whole PhD much better. Ellie, you have the most incredible and fun stories to tell, and I’m really hoping that one day you compile them in a book. Thank you as well for all your advice, and for dragging me to parkrun!! Guilhe, what can I tell you? I’m amazed by the curious person you are, and how you always try to seize the moment and make the most out of it. You have taught me many valuable lessons. Alejandro, otro que tal baila! Thank you for always listening to me and encouraging me, and for your patience when I had none left. I really loved having shared so many things with you, especially concerts and climbing! Zeynep, I hope you keep up with your nice figures and presentations, I’m sure great things will come from your PhD! Eva F., you are such an amazing person, and an even greater friend. It is impossible not to smile remembering all the moments we have shared from day 1 (or, morally, even before that). Thank you for been always there, tía figa. Jennah, it has been great to be roommates, I loved to

67 share all those moments with you! As much as I love your hugs J. Anders, I never get tired of talking to you, you always find the exciting and interesting side of everything. Thank you for being a passionate person, and to transmit that feeling to everyone around. Joran, I miss your jokes, and our Spanish lessons!! You definitely make a lot of progress, and I also remember the basic Dutch stuff as well (enkeltje!). Courtney, I love that you always spice up con- versations with some witty comment, and that I almost always learn something new from you! Laura, “at least you are not going to Paris” xD Thanks for all the laughs and for being such a kind and positive person, I feel energized when you are around! Kasia, thank you for always having super great input to give about science stuff! Thanks also to everyone else with whom I’ve shared dif- ferent moments over fika, beers and whose company I’ve also enjoyed (Jon- athan, Feifei, Kirsten, Henning, Felix, Max, Will, Anja, Jun-Hoe). I also want to metion all the master students that have being around, it has been great to meet you and share so many things with all of you! Specially Aleksei, who I had the chance to supervise for some months. It was a very rich experience and learnt a lot, thank you for that! One last thank you note to the colleagues who took the time to proofread different parts of this thesis (or help with Swe- dish translations); Guilhe, Alejandro, Eva, Dani, Jennah, Anders, thank you so much!! Of course, not everything is about work, and luckily, I’ve been surrounded by many other amazing people outside the lab. Ana Roooosaaaaaaaaa I’m so happy that you also moved to Uppsala, life here would have been so differ- ent without you. Thank you for your craziness, for caring, for being like fam- ily, and for all the fun things we’ve done together. Giulia, I will never under- stand where you get all that energy from, but I wouldn’t change it for anything in the world. Thank you for making the world brighter. Mahsa, you are one of the most caring people I know, and you are always there ready to help, either with a hug or good advice. Carmen, thank you for your constant sup- port and for always knowing what to say to make me feel better. Gianni, thank you for been cool when I showed up to your birthday party when we didn’t even know each other, it has been a long way since then and we’ve shared tones of good moments and experiences together. Sophia, you make the best bread ever (sorry, Guilhe). Well, the best bread and the best whatever, you have a natural skill to create stuff with your hands that reflects all the good things you have inside. I should thank you for many things, but now I can only think on your constant reminders during the last months of thesis writing to take intermittent breaks. Thanks for taking care of me! Emilio, it is always a pleasure to talk about crazy stuff with you, you always cheer me up. Also, thanks for teaching me the obscure world of vim! I still don’t believe how lucky I was when I moved to Uppsala to find a place to leave that soon enough I could call home. Svea, you are the definition of fun and responsibility, com- bined with craziness and care. Thank you for being my roommate and my friend, despite my murder attempts when cooking chorizo. Adam, I don’t

68 know if I’ll forget you for blaming us for leaving the teabags in the sink, but I appreciate that you didn’t make too much fun of my “zombie mode” every morning! I also thank the universe for showing me that post about some sec- ond-hand stuff that brought me to meet you, Ana I. Thank you for bringing so much laugh, joy and warmth to my life over the last year. Life in Uppsala has also been spiced up by some Mediterranean and not so Mediterranean vibes. Maria C., Alberto, Ana Z., Jesús, Laura, Fernando, Ana M., Guille, David O., Miguel Ángel, Miguel, Eva G., Guillaume, Wiwi, David W., Ana P., Jennifer, Karin, Maria T., Bárbara, Sergio, Alice, Dennis, Laia, Mercè, Javier, Davide and Ilektra. Thank you all for everything! I don’t forget about my friends from Valencia who have been part of this journey from the very earliest stages. I still don’t believe how lucky I was during my bachelor to find a group of people so kind, intelligent, honest, fun and inspiring. Martina, Anselm, Silvia, Dani M., Marta, Ximo, Carmen, Jose, Ana D., Bortx, Nuria; it doesn’t matter where you are or how often we see each other, I love you all. To the people from my masters, I also want to say that we ended up into that madness together. Roser, Jaume, Alberto, Pauli, Chiri, David; you don’t know how important you became to me in such a short time, and how much I learnt from you (not only programming!). Eva. I can’t think of any important moment where you haven’t been with me, either physically or mentally. Thank you for never leaving me alone and for always encouraging me when I needed it. Ale, I literally have no words to describe how grateful I am to have shared great part of this adventure with you. Thank you for filling these years with love, care, laughter, dreams and unconditional support. Lastly, I want to thank my family, because they have been a true inspiration and motivation to bring me where I am today, and I’d like everyone to know. I’m grateful to my grandma, who never ever stopped learning new things and challenging herself. Also, to my grandpa, an example of patience and perse- verance. My dad taught me about nature and the wonders of knowledge from an early age, and my mum always encouraged me to pursue my career dreams. And I’m also grateful to my sister, who always believes in me and what I do. Por ultimo, quiero agradecer todo esto a mi familia, porque siempre han sido una gran inspiración y motivación que me ha llevado donde estoy hoy. Gracias a mi yaya, quien nunca dejó de aprender cosas nuevas y enfrentarse a nuevos desafíos. Una mujer fuerte e independiente de quien aprendí muchí- simo. Te echo de menos. Gracias también a mi yayo, todo un ejemplo de pa- ciencia y constancia, de experiencia y ganas de compartir. Gracias por todo lo que me has enseñado. Gracias a mi padre por inculcarme el interés por la naturaleza desde bien pequeña, y por siempre sacarme una sonrisa con tanta facilidad. Gracias a mi madre por animarme siempre a perseguir mis sueños, y por convencerme de que por difícil que sea, si me lo propongo de verdad, puedo con todo. Por último, gracias a mi hermana, por animarme en todo momento, aunque, según tú, no entiendas de qué va de lo que hago J.

69 References

Abraham S.N., Sharon N., Ofek I., Schwartzman J.D. 2015. Adhesion and Colonization. In: Molecular Medical Microbiology. Academic Press pp. 409– 421. Anderson D.L., Trueman J.W. 2000. Varroa jacobsoni (Acari: Varroidae) is more than one species. Experimental & Applied Acarology. 24:165–89. Anderson K.E et al. 2013. Microbial ecology of the hive and pollination landscape: Bacterial associates from floral nectar, the alimentary tract and stored food of honey bees (Apis mellifera). PLoS ONE. 8. Anderson K.E., Ricigliano V.A. 2017. Honey bee gut dysbiosis: a novel context of disease ecology. Current Opinion in Insect Science. 22:125–132 Argov T. et al. 2017. Temperate bacteriophages as regulators of host behavior. Current Opinion in Microbiology. 38:81–87. Arredondo D. et al. 2018. Lactobacillus kunkeei strains decreased the infection by honey bee pathogens Paenibacillus larvae and Nosema ceranae. Beneficial Microbes. 9:279–290. Asenjo F. et al. 2016. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate. PeerJ. 4:e1950. Azeredo J. et al. 2017. Critical review on biofilm methods. Critical Reviews in Microbiology. 43:313–351. Baffoni L. et al. 2016. Effect of dietary supplementation of Bifidobacterium and Lactobacillus strains in Apis mellifera L. against Nosema ceranae. Beneficial Microbes. 7:45–51. Bang C., Schmitz R. 2015. Archaea associated with human surfaces: not to be underestimated. FEMS Microbiology Reviews. 39:631–648. Bay R., Bielawski J. 2011. Recombination detection under evolutionary scenarios relevant to functional divergence. Journal of Molecular Evolution. 73:273-286. Berríos P. et al. 2018. Inhibitory effect of biofilm-forming Lactobacillus kunkeei strains against virulent Pseudomonas aeruginosa in vitro and in honeycomb moth (Galleria mellonella) infection model. Beneficial Microbes. 9:257–268. Bobay L.M., Ochman H. 2017. Impact of recombination on the base composition of bacteria and archaea. Molecular Biology and Evolution. 34:2627–2636. Bradford L., Ravel J. 2017. The vaginal mycobiome: a contemporary perspective on fungi in women’s health and diseases. Virulence. 8:342–351. Burns A.R et al. 2016. Contribution of neutral processes to the assembly of gut microbial communities in the zebrafish over host development. The ISME Journal. 10:655–64. Butler È. et al. 2013. Proteins of novel lactic acid bacteria from Apis mellifera mellifera: An insight into the production of known extra-cellular proteins during microbial stress. BMC Microbiology. 13:235 Cattò C., Cappitelli F. 2019. Testing anti-Biofilm polymeric surfaces: where to start? International journal of molecular sciences. 20.

70 Chandler M. 2017. Prokaryotic DNA transposons: classes and mechanism. eLS. 1– 16. Chaudhry V., Patil P.B. 2020. Evolutionary insights into adaptation of ı to human and non-human niches. Genomics. 112:2052–2062. Chen Z, Zhong L, Shen M, Fang P, Qin Z. 2012. Characterization of Streptomyces plasmid-phage pFP4 and its evolutionary implications. Plasmid. 68:170–178. Chevignon G., Boyd B.M., Brandt J.W., Oliver K.M., Strand M.R. 2018. Culture- facilitated comparative genomics of the facultative symbiont Hamiltonella defensa. Genome Biology and Evolution. 10:786–802. Chiou T.-Y. et al. 2018. Lactobacillus kosoi sp. nov., a fructophilic species isolated from kôso, a Japanese sugar-vegetable fermented beverage. . 111:1149–1156. Chudnovskiy A. et al. 2016. Host-protozoan interactions protect from mucosal infections through activation of the inflammasome. Cell. 167:444-456. de Clercq N.C., Groen A.K., Romijn J.A., Nieuwdorp M. 2016. Gut microbiota in obesity and undernutrition. Advances in Nutrition. 7:1080–1089. Corby-Harris V., Maes P., Anderson K.E. 2014. The bacterial communities associated with honey bee (Apis mellifera) foragers. PLoS ONE. 9. Coyte K.Z., Rakoff-Nahoum S. 2019. Understanding competition and cooperation within the mammalian gut microbiome. Current Biology : CB. 29:R538–R544. Daisley B.A. et al. 2020. Novel probiotic approach to counter Paenibacillus larvae infection in honey bees. The ISME Journal. 14:476–491. Danner N., Molitor A.M., Schiele S., Härtel S., Steffan-Dewenter I. 2016. Season and landscape composition affect pollen foraging distances and habitat use of honey bees. Ecological Applications. 26:1920–1929. Dao M.C., Clément K. 2018. Gut microbiota and obesity: Concepts relevant to clinical care. European journal of internal medicine. 48:18–24. Degnan P.H., Yu Y., Sisneros N., Wing R.A., Moran N.A. 2009. Hamiltonella defensa, genome evolution of protective bacterial endosymbiont from pathogenic ancestors. Proceedings of the National Academy of Sciences of the United States of America. 106:9063–8. DeGrandi-Hoffman G. et al. 2018. Connecting the nutrient composition of seasonal pollens with changing nutritional needs of honey bee (Apis mellifera L.) colonies. Journal of Insect Physiology. 109:114–124. Didelot X., Falush D. 2007. Inference of bacterial microevolution using multilocus sequence data. Genetics. 175:1251–1266. Didelot X., Méric G., Falush D., Darling A.E. 2012. Impact of homologous and non- homologous recombination in the genomic evolution of Escherichia coli. BMC Genomics. 13:256. Dietel A.K., Merker H., Kaltenpoth M., Kost C. 2019. Selective advantages favour high genomic AT-contents in intracellular elements. PLoS Genetics. 15:e1007778. Dill-McFarland K.A. et al. 2019. Close social relationships correlate with human gut microbiota composition. Scientific reports. 9:703. Dion E., Zélé F., Simon J-C., Outreman Y. 2011. Rapid evolution of parasitoids when faced with the symbiont-mediated resistance of their hosts. Journal of Evolutionary Biology. 24:741–50. Djukic M. et al. 2015. High quality draft genome of Lactobacillus kunkeei EFB6, isolated from a German European foulbrood outbreak of honeybees. Standards in Genomic Sciences. 10:16.

71 Dong Z-X. et al. 2020. Colonization of the gut microbiota of honey bee (Apis mellifera) workers at different developmental stages. Microbiological Research. 231:126370. Douglas A.E. 2006. Phloem-sap feeding by animals: problems and solutions. Journal of Experimental . 57:747–754. Douglas A.E. 2019. Simple animal models for microbiome research. Nature Reviews Microbiology. 17:764–775. Douglas A.E. 2014. The molecular basis of bacterial-insect symbiosis. Journal of Molecular Biology. 426:3830–7. Dragoš A. et al. 2018. Division of labor during biofilm matrix production. Current Biology : CB. 28:1903-1913.e5. Duar R.M. et al. 2017. Lifestyles in transition: evolution and of the genus Lactobacillus. FEMS Microbiology Reviews. 41:S27–S48. Duong B.T.T. et al. 2020. Investigation of the gut microbiome of Apis cerana honeybees from Vietnam. Biotechnology Letters. 1–9. Edwards, Haag, Collins, Hutson, Huang. 1998. Lactobacillus kunkeei sp. nov. : a spoilage organism associated with grape juice fermentations. Journal of Applied Microbiology. 84:698–702. Ellegaard K.M., Engel P. 2019. Genomic diversity landscape of the honey bee gut microbiota. Nature Communications. 10:446. Ellegaard K.M., Suenami S., Miyazaki R., Engel P. 2020. Vast differences in strain- level diversity in the gut microbiota of two closely related honey bee species. Current Biology. Endo A. et al. 2012. Characterization and emended description of Lactobacillus kunkeei as a fructophilic lactic acid bacterium. International Journal of Systematic and Evolutionary Microbiology. 62:500–504. Endo A. et al. 2018. Fructophilic lactic acid bacteria, a unique group of fructose- fermenting microbes. Applied and Environmental Microbiology. 84. Endo A. 2012. Fructophilic lactic acid bacteria inhabit fructose-rich niches in nature. Microbial ecology in health and disease. 23. Endo A., Futagawa-Endo Y., Dicks L.M.T. 2009. Isolation and characterization of fructophilic lactic acid bacteria from fructose-rich niches. Systematic and Applied Microbiology. 32:593–600. Endo A., Salminen S. 2013. Honeybees and beehives are rich sources for fructophilic lactic acid bacteria. Systematic and Applied Microbiology. 36:444–448. Engel P., Martinson V.G., Moran N.A. 2012. Functional diversity within the simple gut microbiota of the honey bee. Proceedings of the National Academy of Sciences of the United States of America. 109:11002–7. Engel P., Stepanauskas R., Moran N.A. 2014. Hidden diversity in honey bee gut symbionts detected by single-cell genomics. PLoS genetics. 10:e1004596. Fagan R.P., Fairweather N.F. 2014. Biogenesis and functions of bacterial S-layers. Nature Reviews Microbiology. 12:211–222. Feng H. et al. 2019. Trading amino acids at the aphid-Buchnera symbiotic interface. Proceedings of the National Academy of Sciences of the United States of America. 116:16003–16011. Flemming H-C., Wingender J. 2010. The biofilm matrix. Nature Reviews Microbiology. 8:623–633. Flo V. et al. 2018. Yearly fluctuations of flower landscape in a Mediterranean scrubland: Consequences for floral resource availability. PloS one. 13:e0191268. Forfert N. et al. 2015. Parasites and pathogens of the honeybee (Apis mellifera) and their influence on inter-colonial transmission. PLoS ONE. 10.

72 Forsgren E., Olofsson T.C., Vásquez A., Fries I. 2010. Novel lactic acid bacteria inhibiting Paenibacillus larvae in honey bee larvae. Apidologie. 41:99–108. Frank A.C. 2019. Molecular host mimicry and manipulation in bacterial symbionts. FEMS Microbiology Letters. 366. Free J.B. 1958. The drifting of honey-bees. The Journal of Agricultural Science. 51:294–306. Freschi L. et al. 2019. The Pseudomonas aeruginosa pan-Genome provides new insights on its population structure, horizontal gene transfer, and pathogenicity. Genome Biology and Evolution. 11:109. Fries I., Feng F., da Silva A., Slemenda S.B., Pieniazek N.J. 1996. Nosema ceranae n. sp. (Microspora, Nosematidae), morphological and molecular characterization of a microsporidian parasite of the asian honey bee Apis cerana (Hymenoptera, Apidae). European Journal of Protistology. 32:356–365. Fritsch J., Abreu M.T. 2019. The microbiota and the immune response: What is the chicken and what is the egg? Gastrointestinal endoscopy clinics of North America. 29:381–393. Galetti R., Andrade L.N., Varani A.M., Darini A.L.C. 2019. A phage-like plasmid carrying blaKPC-2Gene in carbapenem-resistant pseudomonas aeruginosa. Frontiers in Microbiology. 10:2–6. Gerbino E., Carasi P., Mobili P., Serradell M.A., Gómez-Zavaglia A. 2015. Role of S-layer proteins in bacteria. World Journal of Microbiology and Biotechnology. 31:1877–1887. Gilbert M.J., Duim B., van der Graaf-van Bloois L., Wagenaar J.A., Zomer A.L. 2018. Homologous recombination between genetically divergent Campylobacter fetus lineages supports host-associated speciation. Genome biology and evolution. 10:716–722. González-Torres P., Rodríguez-Mateos F., Antón J., Gabaldón T. 2019. Impact of homologous recombination on the evolution of prokaryotic core genomes. mBio. 10. Guttman D.S., Dykhuizen D.E. 1994. Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science. 266:1380–3. Hammes W.P., Vogel R.F. 1995. The genus Lactobacillus. In: The Genera of Lactic Acid Bacteria. Springer US: Boston, pp. 19–54. Heo J., Kim S.J., Kim J.S., Hong S.B., Kwon S.W. 2020. Comparative genomics of Lactobacillus species as bee symbionts and description of Lactobacillus bombintestini sp. nov., isolated from the gut of Bombus ignitus. Journal of Microbiology. 58:445–455. Hershberg R. & Petrov D.A. 2010. Evidence that mutation is universally biased to- wards AT in bacteria. PLOS Genetics. 6: e1001115 Heyndrickx M. et al. 1996. Reclassification of Paenibacillus (formerly Bacillus) pulvifaciens (Nakamura 1984) Ash et al. 1994, a later subjective synonym of Paenibacillus (formerly Bacillus) larvae (White 1906) Ash et al. 1994, as a subspecies of P. larvae, with emended descriptions of P. larvae as P. larvae subsp. larvae and P. larvae subsp. pulvifaciens. International Journal of Systematic Bacteriology. 46:270–9. Higes M., Martín R., Meana A. 2006. Nosema ceranae, a new microsporidian parasite in honeybees in Europe. Journal of invertebrate pathology. 92:93–5. Hoshino Y., Gaucher E.A. 2018. On the origin of isoprenoid biosynthesis. Molecular Biology and Evolution. 35:2185–2197. Hroncova Z. et al. 2015. Variation in honey bee gut microbial diversity affected by ontogenetic stage, age and geographic location. PLoS ONE. 10:1–17.

73 Huang C-H. et al. 2020. Genome-based reclassification of Lactobacillus casei: emended classification and description of the species Lactobacillus zeae. International Journal of Systematic and Evolutionary Microbiology. ijsem003969. Huang Y-C., Edwards C.G., Peterson J.C., Haag K.M. 1996. Relationship between sluggish fermentations and the antagonism of yeast by lactic acid bacteria. American Journal of Enology and Viticulture. 47:1 LP – 10. Husnik F., McCutcheon J.P. 2018. Functional horizontal gene transfer from bacteria to eukaryotes. Nature Reviews Microbiology. 16:67–79. Iorizzo M. et al. 2020. Antagonistic activity against Ascosphaera apis and functional properties of Lactobacillus kunkeei strains. Antibiotics. 9. Iversen K.H. et al. 2020. Similar genomic patterns of clinical infective endocarditis and oral isolates of Streptococcus sanguinis and Streptococcus gordonii. Scientific Reports. 10. Jones J.C. et al. 2018. The gut microbiome is associated with behavioural task in honey bees. Insectes Sociaux. 65:419–429. Kalinina A., Suvorikova A., Spokoiny V., Gelfand M. 2016. Detection of homologous recombination in closely related strains. Journal of bioinformatics and computational biology. 14. Kang Y., Blanco K., Davis T., Wang Y., DeGrandi-Hoffman G. 2016. Disease dynamics of honeybees with Varroa destructor as parasite and virus vector. Mathematical biosciences. 275:71–92. Kareb O., Aïder M. 2020. Quorum sensing circuits in the communicating mechanisms of bacteria and its implication in the biosynthesis of bacteriocins by lactic acid bacteria: a Review. Probiotics and Antimicrobial Proteins. 12:5–17. Katznelson H. 1950. The influence of antibiotics and sulfa drugs on Bacillus larvae, cause of American foulbrood of the honeybee, in vitro and in vivo. Journal of bacteriology. 59:471–9. Kawasaki S. et al. 2011. Lactobacillus ozensis sp. nov., isolated from mountain flowers. International Journal of Systematic and Evolutionary Microbiology. 61:2435–2438. Kivisaar M. 2020. Mutation and recombination rates vary across bacterial chromosome. Microorganisms. 8. Knip M., Constantin M.E., Thordal-Christensen H. 2014. Trans-kingdom cross-talk: small RNAs on the move. PLoS genetics. 10:e1004602. Kuo C-H., Moran N.A., Ochman H. 2009. The consequences of genetic drift for bacterial genome complexity. Genome Research. 19:1450. Kwong W.K., Mancenido A.L., Moran N.A. 2017. Immune system stimulation by the native gut microbiota of honey bees. Royal Society Open Science. 4:170003. Lach G., Schellekens H., Dinan T.G., Cryan J.F. 2018. Anxiety, depression, and the microbiome: a role for gut peptides. Neurotherapeutics : the journal of the American Society for Experimental NeuroTherapeutics. 15:36–59. Latorre A., Manzano-Marín A. 2017. Dissecting genome reduction and trait loss in insect endosymbionts. Annals of the New York Academy of Sciences. 1389:52– 75. Lederberg J., Tatum E. 1946. Gene recombination in Escherichia coli. Nature. 158:558. Li D.Y., Tang W.H.W. 2017. Gut microbiota and atherosclerosis. Current atherosclerosis reports. 19:39. Lin M., Kussell E. 2017. Correlated mutations and homologous recombination within bacterial populations. Genetics. 205:891.

74 López-Madrigal S., Gil R. 2017. Et tu, brute? Not even intracellular mutualistic symbionts escape horizontal gene transfer. Genes. 8. Maeno S. et al. 2016. Genomic characterization of a fructophilic bee symbiont Lactobacillus kunkeei reveals its niche-specific adaptation. Systematic and Applied Microbiology. 39:516–526. Maeno S., Dicks L., Nakagawa J., Endo A. 2017. Lactobacillus apinorum belongs to the fructophilic lactic acid bacteria. Bioscience of Microbiota, Food and Health. 36:147–149. Maes P.W., Rodrigues P.A.P., Oliver R., Mott B.M., Anderson K.E. 2016. Diet- related gut bacterial dysbiosis correlates with impaired development, increased mortality and Nosema disease in the honeybee (Apis mellifera). Molecular Ecology. 25:5439–5450. Margulis L. 1993. Symbiosis in cell evolution : microbial communities in the Archean and Proterozoic eons. 2nd ed. New York: W.H. Freeman & Co. Ltd. Margulis L. 1975. Symbiotic theory of the origin of eukaryotic organelles; criteria for proof. Symposia of the Society for Experimental Biology. 21–38. Martin S.J. 2002. The role of Varroa and viral pathogens in the collapse of honeybee colonies: a modelling approach. Journal of Applied Ecology. 38:1082–1093. Martinson V.G. et al. 2011. A simple and distinctive microbiota associated with honey bees and bumble bees. Molecular Ecology. 20:619–628. Martinson V.G., Moy J., Moran N.A. 2012. Establishment of characteristic gut bacteria during development of the honeybee worker. Applied and environmental microbiology. 78:2830–40. McCutcheon J.P., Moran N.A. 2012. Extreme genome reduction in symbiotic bacteria. Nature Reviews Microbiology. 10:13–26. McFrederick Q.S., Vuong H.Q., Rothman J.A. 2018. Lactobacillus micheneri sp. nov., Lactobacillus timberlakei sp. nov. and Lactobacillus quenuiae sp. nov., lactic acid bacteria isolated from wild bees and flowers. International Journal of Systematic and Evolutionary Microbiology. 68:1879–1884. Meng C., Bai C., Brown T.D., Hood L.E., Tian Q. 2018. Human gut microbiota and gastrointestinal cancer. Genomics, proteomics & bioinformatics. 16:33–49. Mitchell D. 2019. Nectar, humidity, honey bees (Apis mellifera) and varroa in summer: a theoretical thermofluid analysis of the fate of water vapour from honey ripening and its implications on the control of Varroa destructor. Journal of The Royal Society Interface. 16. Miyamoto J. et al. 2019. Gut microbiota confers host resistance to obesity by metabolizing dietary polyunsaturated fatty acids. Nature communications. 10:4007. Moeller A.H et al. 2016. Social behavior shapes the chimpanzee pan-microbiome. Science advances. 2:e1500997. Moissl-Eichinger C. et al. 2018. Archaea are interactive components of complex microbiomes. Trends in microbiology. 26:70–85. Moran N.A, Munson M, Baumann P, Ishikawa H. 1993. A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proceedings of the Royal Society B: Biological Sciences. 253:167–171. Moran N.A. 2015. Genomics of the honey bee microbiome. Current Opinion in Insect Science. 10:22–28. Moran N.A., Bennett G.M. 2014. The tiniest tiny genomes. Annual Review of Microbiology. 68:195–215. Moran N.A., Degnan P.H., Santos S.R., Dunbar H.E., Ochman H. 2005. The players in a mutualistic symbiosis: insects, bacteria, viruses, and virulence genes.

75 Proceedings of the National Academy of Sciences of the United States of America. 102:16919–26. Moran NA, Russell JA, Koga R, Fukatsu T. 2005. Evolutionary relationships of three new species of Enterobacteriaceae living as symbionts of aphids and other insects. Applied and environmental microbiology. 71:3302–10. Morella N.M., Koskella B. 2017. The value of a comparative approach to understand the complex interplay between microbiota and host immunity. Frontiers in immunology. 8:1114. Mukherjee S., Bassler B.L. 2019. Bacterial quorum sensing in complex and dynamically changing environments. Nature reviews. Microbiology. 17:371. Neveling D.P., Endo A., Dicks L.M.T. 2012. Fructophilic Lactobacillus kunkeei and Lactobacillus brevis isolated from fresh flowers, bees and bee-hives. Current Microbiology. 65:507–515. Nieves-Ramírez M.E. et al. 2018. Asymptomatic intestinal colonization with protist Blastocystis is strongly associated with distinct microbiome ecological patterns. mSystems. 3. Nishida A. et al. 2018. Gut microbiota in the pathogenesis of inflammatory bowel disease. Clinical journal of gastroenterology. 11:1–10. Nishiyama K. et al. 2015. Cell surface-associated aggregation-promoting factor from Lactobacillus gasseri SBT2055 facilitates host colonization and competitive exclusion of Campylobacter jejuni. Molecular Microbiology. 98:712–726. Novichkov P.S., Wolf Y.I., Dubchak I., Koonin E.V. 2009. Trends in prokaryotic evolution revealed by comparison of closely related bacterial and archaeal genomes. Journal of bacteriology. 191:65–73. Obadia B. et al. 2017. Probabilistic invasion underlies natural gut microbiome stability. Current biology : CB. 27:1999-2006.e8. Oliveira N.M. et al. 2015. Biofilm formation as a response to ecological competition. PLOS Biology. 13:e1002191. Oliver K.M., Higashi C.H. 2019. Variations on a protective theme: Hamiltonella defensa infections in aphids variably impact parasitoid success. Current Opinion in Insect Science. 32:1–7. Olofsson T.C. et al. 2016. Lactic acid bacterial symbionts in honeybees - an unknown key to honey’s antimicrobial and therapeutic activities. International Wound Journal. 13:668–679. Olofsson T.C., Alsterfjord M., Nilson B., Butler È., Vásquez A. 2014. Lactobacillus apinorum sp. nov., Lactobacillus mellifer sp. nov., Lactobacillus mellis sp. nov., Lactobacillus melliventris sp. nov., Lactobacillus kimbladii sp. nov., Lactobacillus helsingborgensis sp. nov. and Lactobacillus kullabergensis sp. nov., isolated from the honey stomach of the honeybee Apis mellifera. International Journal of Systematic and Evolutionary Microbiology. 64:3109. Olofsson T.C., Vásquez A. 2008. Detection and identification of a novel lactic acid bacterial flora within the honey stomach of the honeybee Apis mellifera. Current Microbiology. 57:356–363. Olofsson T.C., Vásquez A., Sammataro D., Macharia J. 2011. A scientific note on the lactic acid bacterial flora within the honeybee subspecies Apis mellifera (Buckfast), A. m. scutellata, A. m. mellifera, and A. m. monticola. Apidologie. 42:696–699. Orla-Jensen S. 1919. The lactic acid bacteria. In: Mémoires de l’Académie Royale des Sciences et des Lettres de Danemark. Andr. Fred. Host and Son, Copenhagen. Paris L., El Alaoui H., Delbac F., Diogon M. 2018. Effects of the gut parasite Nosema ceranae on honey bee physiology and behavior. Current Opinion in Insect Science. 26:149–154.

76 Filannino, P. et al. 2019. Fructose-rich niches traced the evolution of lactic acid bacteria toward fructophilic species. Critical Reviews In Microbiology. 45:65– 81. Paterson M.J., Oh S., Underhill D.M. 2017. Host-microbe interactions: commensal fungi in the gut. Current opinion in microbiology. 40:131–137. Peeters C. et al. 2019. Comparative genomics of Pandoraea, a genus enriched in xenobiotic biodegradation and metabolism. Frontiers in microbiology. 10:2556. Pfeiffer K.J., Crailsheim K. 1998. Drifting of honeybees. Insectes Sociaux. 45:151– 167. Piepenbrink K.H., Sundberg E.J. 2016. Motility and adhesion through type IV pili in Gram-positive bacteria. Biochemical Society transactions. 44:1659–1666. Pinto-Carbó M. et al. 2016. Evidence of horizontal gene transfer between obligate leaf nodule symbionts. The ISME journal. 10. Pizarro-Cerdá J., Cossart P. 2006. Leading edge review bacterial adhesion and entry into host cells. Cell. 124:715–727. Poppinga L., Genersch E. 2015. Molecular pathogenesis of american foulbrood: how Paenibacillus larvae kills honey bee larvae. Current opinion in insect science. 10:29–36. Powell J.E., Martinson V.G., Urban-Mead K., Moran N.A. 2014. Routes of acquisition of the gut microbiota of the honey bee Apis mellifera. Applied and Environmental Microbiology. 80:7378–7387. Proft T., Baker E.N. 2009. Pili in Gram-negative and Gram-positive bacteria — structure, assembly and their role in disease. Cellular and Molecular Life Sciences. 66:613–635. Québatte M. et al. 2017. Gene transfer agent promotes evolvability within the fittest subpopulation of a bacterial pathogen. Cell systems. 4:611-621.e6. Quigley E.M.M. 2017. Microbiota-brain-gut axis and neurodegenerative diseases. Current neurology and neuroscience reports. 17:94. Ramsey J.S. et al. 2010. Genomic evidence for complementary purine metabolism in the pea aphid, Acyrthosiphon pisum, and its symbiotic bacterium Buchnera aphidicola. Insect Molecular Biology. 19:241–248. Rangberg A., Mathiesen G., Amdam G.V., Diep D.B. 2015. The paratransgenic potential of Lactobacillus kunkeei in the honey bee Apis mellifera. Beneficial Microbes. 6:513–523. Rankin D.J., Rocha E.P.C., Brown S.P. 2011. What traits are carried on mobile genetic elements, and why. Heredity. 106:1–10. Raymann K., Moran N.A. 2018. The role of the gut microbiome in health and disease of adult honey bee workers. Current Opinion in Insect Science. 26:97–104. Raymann K., Shaffer Z., Moran N.A. 2017. Antibiotic exposure perturbs the gut microbiota and elevates mortality in honeybees. PLoS biology. 15:e2001861. Rendueles O., de Sousa J.A.M., Bernheim A., Touchon M., Rocha E.P.C. 2018. Genetic exchanges are more frequent in bacteria encoding capsules. PLoS genetics. 14:e1007862. Richard M.L., Sokol H. 2019. The gut mycobiota: insights into analysis, environmental interactions and role in gastrointestinal diseases. Nature Reviews Gastroenterology & Hepatology. 16:331–345. Romero S., Nastasa A., Chapman A., Kwong W.K., Foster L.J. 2019. The honey bee gut microbiota: strategies for study and characterization. Insect Molecular Biology. 28:455–472. Rouzé R., Moné A., Delbac F., Belzunces L., Blot N. 2019. The honeybee gut microbiota is altered after chronic exposure to different families of insecticides and infection by Nosema ceranae. Microbes and Environments. 34:226.

77 Salvetti E., Harris H.M.B., Felis G.E., O’Toole P.W. 2018. Comparative genomics of the genus Lactobacillus reveals robust phylogroups that provide the basis for reclassification. Applied and Environmental Microbiology. 84. Salvetti E., Torriani S., Felis G.E. 2012. The genus Lactobacillus: a taxonomic update. Probiotics and Antimicrobial Proteins. 4:217–226. Schlötterer C. 2015. Genes from scratch--the evolutionary fate of de novo genes. Trends in genetics : TIG. 31:215–9. Schmitz J., Bornberg-Bauer E. 2017. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA. F1000Research. 6:57. Schwarz R.S., Moran N.A., Evans J.D. 2016. Early gut colonizers shape parasite susceptibility and microbiota composition in honey bee workers. Proceedings of the National Academy of Sciences. 113:9345–9350. Sereme Y. et al. 2019. Methanogenic archaea: emerging partners in the field of allergic diseases. Clinical reviews in allergy & immunology. 57:456–466. Sharma D., Misba L., Khan A.U. 2019. Antibiotics versus biofilm: An emerging battleground in microbial communities. Antimicrobial Resistance and Infection Control. 8:76. Sharma V., Mobeen F., Prakash T. 2018. Exploration of survival traits, probiotic determinants, host interactions, and functional evolution of bifidobacterial genomes using comparative genomics. Genes. 9. Shigenobu S., Wilson A.C.C. 2011. Genomic revelations of a mutualism: the pea aphid and its obligate bacterial symbiont. Cellular and molecular life sciences : CMLS. 68:1297–309. Sieber M. et al. 2019. Neutrality in the Metaorganism. PLoS biology. 17:e3000298. Simcock N.K., Gray H.E., Wright G.A. 2014. Single amino acids in sucrose rewards modulate feeding and associative learning in the honeybee. Journal of Insect Physiology. 69:41. Sitaraman R. 2018. Prokaryotic horizontal gene transfer within the human : ecological-evolutionary inferences, implications and possibilities. Microbiome. 6:163. Sivadon P., Barnier C., Urios L., Grimaud R.. 2019. Biofilm formation as a microbial strategy to assimilate particulate substrates. Environmental Microbiology Reports. 11:1758-2229.12785. Stevens M.J.A. et al. 2019. Whole-genome-based phylogeny of Bacillus cytotoxicus reveals different clades within the species and provides clues on ecology and evolution. Scientific reports. 9:1984. Sulthana A., Lakshmi S.G., Madempudi R.S. 2019. High-quality draft genome and characterization of commercially potent probiotic Lactobacillus strains. Genomics & Informatics. 17. Sun M-F., Shen Y-Q. 2018. Dysbiosis of gut microbiota and microbial metabolites in Parkinson’s Disease. Ageing research reviews. 45:53–61. Sun Z. et al. 2015. Expanding the biotechnology potential of lactobacilli through comparative genomics of 213 strains and associated genera. Nature Communications. 6:8322. Suresh G., Lodha T.D., Indu B., Sasikala C., Ramana C.V. 2019. Taxogenomics resolves conflict in the genus Rhodobacter: a two and half decades pending thought to reclassify the genus Rhodobacter. Frontiers in microbiology. 10:2480. Suryanarayanan S. et al. 2018. Collaboration matters: honey bee health as a transdisciplinary model for understanding real-world complexity. Bioscience. 68:990–995.

78 Taha E.-K.A., Al-Kahtani S., Taha R. 2019. Protein content and amino acids composition of bee-pollens from major floral sources in Al-Ahsa, eastern Saudi Arabia. Saudi Journal of Biological Sciences. 26:232. Takiishi T., Fenero C.I.M., Câmara N.O.S. 2017. Intestinal barrier and gut microbiota: Shaping our immune responses throughout life. Tissue barriers. 5:e1373208. Tamarit D. et al. 2015. Functionally structured genomes in Lactobacillus kunkeei colonizing the honey crop and food products of honeybees and stingless bees. Genome Biology and Evolution. 7:1455–1473. Tamarit D., Neuvonen M-M., Engel P., Guy L., Andersson S.G.E. 2018. Origin and evolution of the Bartonella gene transfer agent. Molecular Biology and Evolution. 35:451–464. Tettelin H. et al. 2005. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”. Proceedings of the National Academy of Sciences of the United States of America. 102:13950. Thaiss C.A., Zmora N., Levy M., Elinav E. 2016. The microbiome and innate immunity. Nature. 535:65–74. Thoms C.A. et al. 2019. Beekeeper stewardship, colony loss, and Varroa destructor management. Ambio. 48:1209–1218. Tkacz A., Hortala M., Poole P.S. 2018. Absolute quantitation of microbiota abundance in environmental samples. Microbiome. 6:110. Touchon M., Moura de Sousa J.A., Rocha E.P. 2017. Embracing the enemy: The diversification of microbial gene repertoires by phage-mediated horizontal gene transfer. Current Opinion in Microbiology. 38:66–73. Valens M., Penaud S., Rossignol M., Cornet F., Boccard F. 2004. Macrodomain organization of the Escherichia coli chromosome. The EMBO journal. 23:4330– 41. Vasileva T. et al. 2017. Glucansucrases produced by fructophilic lactic acid bacteria Lactobacillus kunkeei H3 and H25 isolated from honeybees. Journal of Basic Microbiology. 57:68–77. Vásquez A., Forsgren E., Fries I., Paxton Robert J., et al. 2012. Symbionts as major modulators of insect health: lactic acid bacteria and honeybees. PloS one. 7:e33188. Vásquez A., Forsgren E., Fries I., Paxton Robert J., et al. 2012. Symbionts as major modulators of insect health: lactic acid bacteria and honeybees. PLoS ONE. 7:e33188. Vega N.M., Gore J. 2017. Stochastic assembly produces heterogeneous communities in the Caenorhabditis elegans intestine. PLoS biology. 15:e2000633. Vogel R.F. et al. 2011. Genomic analysis reveals Lactobacillus sanfranciscensis as stable element in traditional sourdoughs. Microbial Cell Factories. 10:S6. Vojvodic S., Rehan S.M., Anderson K.E. 2013. Microbial gut diversity of Africanized and European honey bee larval instars. PloS one. 8:e72106. Vos M., Didelot X. 2009. A comparison of homologous recombination rates in bacteria and archaea. ISME Journal. 3:199–208. Vuotto C., Donelli G. 2019. Novel treatment strategies for biofilm-based onfections. Drugs. 79:1635–1655. Walter J. 2008. Ecological role of lactobacilli in the gastrointestinal tract: implications for fundamental and biomedical research. Applied and environmental microbiology. 74:4985–96. Waterworth S.C. et al. 2020. Horizontal gene transfer to a defensive symbiont with a reduced genome in a multipartite beetle microbiome. mBio. 11.

79 Wen Z., Zhang J-R. 2015. Bacterial capsules. In: Molecular medical microbiology. Academic Press pp. 33–53. Ying J. et al. 2019. Comparative genomic analysis of Rhodococcus equi: an insight into genomic diversity and genome evolution. International Journal of Genomics. 2019. Zheng H. et al. 2019. Division of labor in honey bee gut microbiota for plant polysaccharide digestion. Proceedings of the National Academy of Sciences. 116:25909–25916. Zheng H. et al. 2016. Metabolism of toxic sugars by strains of the bee gut symbiont Gilliamella apicola. mBio. 7. Zheng H., Powell J.E., Steele M.I., Dietrich C., Moran N.A. 2017. Honeybee gut microbiota promotes host weight gain via bacterial metabolism and hormonal signaling. Proceedings of the National Academy of Sciences. 114:4775–4780. Zheng H., Steele M.I., Leonard S.P., Motta E.V.S., Moran N.A. 2018. Honey bees as models for gut microbiota research. Lab Animal. 47:317–325. Zheng J. et al. 2020. A taxonomic note on the genus Lactobacillus: Description of 23 novel genera, emended description of the genus Lactobacillus Beijerinck 1901, and union of Lactobacillaceae and Leuconostocaceae. International Journal of Systematic and Evolutionary Microbiology. 70:2782–2858. Zheng J., Ruan L., Sun M., Gänzle M. 2015. A genomic view of lactobacilli and pediococci demonstrates that phylogeny matches ecology and physiology Björkroth, J, editor. Applied and Environmental Microbiology. 81:7233–7243. Zinder N., Lederberg J. 1952. Genetic exchange in Salmonella. Journal of bacteriology. 64:679–99.

80

Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1953 Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology”.)

ACTA UNIVERSITATIS UPSALIENSIS Distribution: publications.uu.se UPPSALA urn:nbn:se:uu:diva-416847 2020