GBE
Evolutionary Dynamics of Pathoadaptation Revealed by Three Independent Acquisitions of the VirB/D4 Type IV Secretion System in Bartonella
Alexander Harms1,6, Francisca H.I.D. Segers2,7, Maxime Quebatte1, Claudia Mistl1, Pablo Manfredi1, Jonas Körner1, Bruno B. Chomel3, Michael Kosoy4, Soichi Maruyama5, Philipp Engel2, and Christoph Dehio1,*
1Focal Area Infection Biology, Biozentrum, University of Basel, Switzerland
2Department of Fundamental Microbiology, University of Lausanne, Switzerland
3Department of Population Health and Reproduction, School of Veterinary Medicine, University of California Davis
4Bacterial Diseases Branch, Division of Vector-Borne Disease, Centers for Disease Control and Prevention, Fort Collins, Colorado
5Laboratory of Veterinary Public Health, Department of Veterinary Medicine, College of Bioresource Sciences, Nihon University, Tokyo, Japan
6Present address: Department of Biology, Center of Excellence for Bacterial Stress Response and Persistence, University of Copenhagen, Denmark
7Present address: Institute of Zoology, Johannes Gutenberg University Mainz, Germany
*Corresponding author: E-mail: [email protected].
Accepted: March 3, 2017
Data deposition: The genome sequences of all (re)sequenced or reannotated bartonellae have been deposited in the NCBI GenBank database under BioProject accession PRJNA360985 with accession numbers KY583505 (B. ancashensis 20.00), CP019789 (genome of B. schoenbuchensis R1), CP019790 (pVbh of B. schoenbuchensis R1), CP019489 (B. sp. 1-1c), CP019780 (B. sp. A1379B), MUYE00000000 (B. sp. AR15-3), CP019782 (B. sp. CDCskunk), CP019785 (B. sp. Coyote22sub2), CP019784 (B. sp. HoopaDog 114), CP019783 (B. sp. HoopaFox 11B), CP019787 (B. sp. JB15), CP019788 (B. sp. JB63), CP019786 (B. sp. Raccoon60), MUBG00000000 (B. sp. SikaDeer WD12.1), CP019781 (B. sp. SikaDeer WD16.2), and MUYW00000000 (B. taylorii IBS296). All sequence alignments are available from the corresponding author upon request.
Abstract
The α-proteobacterial genus Bartonella comprises a group of ubiquitous mammalian pathogens that are studied as a model for the evolution of bacterial pathogenesis. The vast abundance of two particular phylogenetic lineages of Bartonella has been linked to enhanced host adaptability enabled by lineage-specific acquisition of a VirB/D4 type IV secretion system (T4SS) and parallel evolution of complex effector repertoires. However, the limited availability of genome sequences from one of those lineages as well as from other, remote branches of Bartonella has so far hampered a comprehensive understanding of how the VirB/D4 T4SS and its effectors, called Beps, have shaped Bartonella evolution. Here, we report the discovery of a third repertoire of Beps associated with the VirB/D4 T4SS of B. ancashensis, a novel human pathogen that lacks any signs of host adaptability and is only distantly related to the two species-rich lineages encoding a VirB/D4 T4SS. Furthermore, sequencing of ten new Bartonella isolates from undersampled lineages enabled combined in silico analyses and wet lab experiments that suggest several parallel layers of functional diversification during evolution of the three Bep repertoires from a single ancestral effector. Our analyses show that the Beps of B. ancashensis share many features with the two other repertoires, but may represent a more ancestral state that has not yet unleashed the adaptive potential of such an effector set. We anticipate that the effectors of B. ancashensis will enable future studies to dissect the evolutionary history of Bartonella effectors and help unravel the evolutionary forces underlying bacterial host adaptation.
Key words: filamentation induced by cAMP, AMPylation, parallel evolution, bacterial effector.
© The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]
Genome Biol. Evol. 9(3):761–776. doi:10.1093/gbe/evx042 Advance Access publication March 7, 2017
Introduction

The successful infection of eukaryotic hosts by many bacteria depends on the subversion of host cell functioning by effector proteins that are translocated into target cells via macromolecular machineries like the type IV secretion system (T4SS) (Costa et al. 2015; Gonzalez-Rivera et al. 2016). The evolution of these machineries from ancestors with functions in genuine bacterial processes is well established (Abby and Rocha 2012; Frank et al. 2005; Guglielmini et al. 2013). However, the evolutionary trajectories of secreted effectors have proven difficult to study, because they are frequently blurred by horizontal gene transfer (Brown and Finlay 2011; Burstein et al. 2016). One major exception is the α-proteobacterial genus Bartonella, where two distinct effector sets secreted by the VirB/D4 T4SS have evolved from a single common ancestor and are thought to promote host adaptability (Engel et al. 2011; Saenz et al. 2007). These bacteria therefore provide an ideal setting to study the molecular basis of host adaptation and the evolution of bacterial effector proteins.

The characteristic stealth infection strategy of Bartonella is hallmarked by arthropod transmission and persistent, hemotropic infections in their respective mammalian reservoir hosts (Harms and Dehio 2012). Host-specific adaptation is critical for successful implementation of the Bartonella stealth infection strategy, because reservoir host infections are typically long-lasting but do not cause obvious disease symptoms. In contrast, infections of incidental hosts are often self-limiting but can be accompanied by considerable morbidity (Angelakis and Raoult 2014; Harms and Dehio 2012). The phylogeny of Bartonella is divided into four major lineages (L1–L4) of which L3 and L4 together comprise the overwhelming majority of different reservoir hosts (Engel et al. 2011). Conversely, B. bacilliformis (L1) is a strict human pathogen, whereas the species of L2 seem to be limited to ruminant hosts (Harms and Dehio 2012; Minnick et al. 2014). The diversity of species in L3 and L4 is the result of two parallel adaptive radiations that have been linked to the acquisition of the VirB/D4 T4SS and the vast potential of its secreted effectors to manipulate host cell functions (Engel et al. 2011; Saenz et al. 2007). Previous work showed that the VirB/D4 T4SS had been independently acquired in L3 and L4 together with a single primordial effector that, via parallel series of effector gene duplication and functional diversification, evolved into complex effector repertoires (Engel et al. 2011). The modular evolvability and functional diversity of these effector repertoires are thought to have enabled the surge of host adaptability that sparked the adaptive radiations of L3 and L4 (Engel et al. 2011; Guy et al. 2013). This view is supported by the observation that neither B. bacilliformis (L1) nor any species of L2 encodes a host-interacting T4SS or shows any evidence of recent host shifts (Engel et al. 2011; Saenz et al. 2007). Surprisingly, a recent study on B. ancashensis, a new species of L1 that shares its human reservoir host with B. bacilliformis, reported the discovery of a VirB/D4 T4SS in the genome of this organism, but no further functional or evolutionary analyses were carried out (Hang et al. 2015).

All Bartonella effector proteins (Beps) share a bipartite secretion signal composed of a C-terminal Bep intracellular delivery (BID) domain and a positively charged tail (Schulein et al. 2005; Siamer and Dehio 2015). In addition, the vast majority of Beps contain an N-terminal filamentation induced by cAMP (FIC) domain. The FIC–BID domain structure is thought to reflect the ancestral state of these effector proteins from which other effector architectures have derived throughout the course of evolution (Engel et al. 2011). FIC domains are enzymatic domains that mediate the posttranslational modification of target proteins via AMPylation, which typically results in target protein inactivation (Harms et al. 2016). The AMPylation activity of FIC domains is dependent on the integrity of an HPFX[D/E]GNGRXXR signature motif that forms part of the active site. Alternative sequences at the signature motif are indicative of divergent molecular activities, e.g., in the case of the Legionella pneumophila effector AnkX that mediates target protein phosphocholination or the Doc toxin of bacteriophage P1 that is a kinase (Castro-Roa et al. 2013; Mukherjee et al. 2011). With regard to the Beps, it has been shown that Bep1 of B. rochalimae (L3) AMPylates a specific subset of small host GTPases (Dietz et al., in revision), that Bep2 of the same organism AMPylates the host intermediate filament protein vimentin (Pieles et al. 2014), and that BepA of B. henselae (L4) causes the AMPylation of two unknown host proteins of ca. 40–50 kDa (Palanivelu et al. 2010). Apart from effectors harboring a FIC domain, novel effector types have evolved independently in both L3 and L4. These derived effector architectures consist of one or several BID domains and/or tyrosine-containing motifs phosphorylated by host kinases to subvert host cell functioning (Engel et al. 2011; Siamer and Dehio 2015).

In this work, we report the discovery of a third set of Beps linked to a VirB/D4 T4SS in B. ancashensis, a human pathogen of L1 that is clinically indistinguishable from its sister species B. bacilliformis (Blazes et al. 2013; Minnick et al. 2014; Mullins et al. 2013). We performed an extensive analysis of the Beps of B. ancashensis, L3, and L4 that revealed signs of functional diversification in all three effector repertoires. However, our results suggest that the Beps of B. ancashensis may not yet have explored the full adaptive potential as present in the other lineages. Furthermore, we present the genome sequences of eight additional isolates of L3 that enabled a comprehensive view on the Beps of this lineage for which only a few genomes have previously been available (Engel et al. 2011). Taken together, our analyses shed new light on the
Table 1
Genomic Features of Bartonella Genomes That Were (Re)Sequenced or Reannotated as Part of This Study

Organism | Reservoir Host | Lineage | Size (bp)a | Contigsa | GC %a | CDS | Analysis | Source of Strain/Sequence
B. ancashensis 20.00 | Human (Homo sapiens) | 1 | 1,467,789 | 1 | 37.1 | 1,251 | reannotated | ATCC BAA-2694/(Hang et al. 2015)
B. schoenbuchensis R1 | Roe deer (Capreolus capreolus) | 2 | 1,672,626/55,761 | 1/1 | 37.9/35.2 | 1,555/84 | resequenced | Dehio collection/this study
B. sp. Sika Deer WD12.1 | Sika deer (Cervus nippon) | 2 | 1,739,423/49,553 | 2/1 | 37.9/34.9 | 1,565/71 | sequenced | (Sato et al. 2012b)/this study
B. sp. Sika Deer WD16.2 | Sika deer (Cervus nippon) | 2 | 1,762,996 | 1 | 37.6 | 1,526 | sequenced | (Sato et al. 2012b)/this study
B. sp. CDC_skunk | Striped skunk (Mephitis mephitis) | 3 | 1,615,223 | 1 | 36.1 | 1,374 | sequenced | this study/this study
B. sp. JB15 | Japanese badger (Meles anakuma) | 3 | 1,494,018 | 1 | 35.3 | 1,283 | sequenced | (Sato et al. 2012a)/this study
B. sp. JB63 | Japanese badger (Meles anakuma) | 3 | 1,493,693 | 1 | 35.3 | 1,255 | sequenced | (Sato et al. 2012a)/this study
B. sp. Hoopa Fox 11B | Gray fox (Urocyon cinereoargenteus) | 3 | 1,579,438 | 1 | 36 | 1,316 | sequenced | (Henn et al. 2007)/this study
B. sp. Hoopa Dog 114 | Dog (Canis canis) | 3 | 1,571,582 | 1 | 35.9 | 1,348 | sequenced | (Henn et al. 2007)/this study
B. sp. A1379B | Red fox (Vulpes vulpes) | 3 | 1,541,976 | 1 | 35.8 | 1,289 | sequenced | (Henn et al. 2009)/this study
B. sp. Coyote22sub2 | Coyote (Canis latrans) | 3 | 1,561,331 | 1 | 35.9 | 1,325 | sequenced | (Henn et al. 2009)/this study
B. sp. Racoon60 | Raccoon (Procyon lotor) | 3 | 1,615,700 | 1 | 36.1 | 1,400 | sequenced | (Henn et al. 2009)/this study
B. sp. AR15-3 | American red squirrel (Tamiasciurus hudsonicus) | 3 | 1,630,082 | 2 | 35.9 | 1,447 | resequenced | Dehio collection/(Engel et al. 2011)
B. sp. 1-1c | Rat (Rattus norvegicus) | 3 | 1,600,621 | 1 | 35.9 | 1,358 | resequenced | Dehio collection/(Engel et al. 2011)
B. taylorii IBS325 | Vole (Microtus sp.) | 4 | 1,677,768 | 194 | 38.8 | 1,336 | sequenced | Dr. Y. Piemont, Dr. R. Heller/this study

NOTE: Relevant features of the Bartonella genomes sequenced, resequenced, or reannotated as part of this study. A comprehensive list of all Bartonella genomes used throughout this work is presented in supplementary table S1, Supplementary Material online.
a: The first number refers to the main chromosome, the second number refers to plasmids.
evolutionary trajectories of Bartonella effector proteins and highlight their value as a model for the evolution of bacterial pathogenesis.

Materials and Methods

Bacterial Strains and Growth Conditions

Escherichia coli strains were routinely grown in LB medium. Plasmids were moved into Bartonella by conjugation using an E. coli donor strain with chromosomal RP4 machinery (Jonas Körner, unpublished). All bartonellae were routinely grown for 3–7 days on nutrient agar containing 5% defibrinated sheep blood in a water-saturated atmosphere with 5% CO₂ at 35 °C. Most species and isolates were grown on heart infusion agar (Oxoid), but B. birtlesii and B. taylorii were cultured on Columbia agar (Oxoid) and the two isolates from Sika deer were cultured on tryptic soy agar (Oxoid). Derivatives of pCD366 were selected with kanamycin at 50 µg ml⁻¹ (E. coli) or at 30 µg ml⁻¹ (Bartonella), and the rpsL genotype of the B. henselae Houston-1 lab strain RSE247 was selected with streptomycin at 100 µg ml⁻¹.

Bartonella Isolates and Genome Sequences

Bartonella organisms were from our laboratory collection or were obtained from different sources as listed in table 1. Bartonella taylorii IBS296 isolated from a vole (Microtus sp.) was obtained from Dr. Yves Piemont and Dr. Remy Heller. Bartonella sp. CDCskunk was isolated from a striped skunk (Mephitis mephitis) at the Janos Biosphere Reserve, Mexico. Genome sequences of relevant Bartonella species and other organisms were downloaded from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov; last accessed March 9, 2017). The genome sequences of additional Bartonella isolates were determined as described below. Comprehensive information regarding all genomes
used in this study is assembled in supplementary table S1, Supplementary Material online.

Plasmid Construction

Plasmids were constructed using standard restriction-based procedures. Derivatives of pCD366 (an RSF1010 derivative encoding promoterless gfpmut2 downstream of a multiple cloning site; Seubert et al. 2003) were used for PvirB2::gfpmut2 promoter probes as described previously and designated the pAH196 series (Quebatte et al. 2010). All oligonucleotide primers are listed in supplementary table S2, Supplementary Material online. All vectors and details of their construction are listed in supplementary table S3, Supplementary Material online.

Genome Sequencing, Assembly, and Annotation

Genomic DNA was isolated from about 0.25 g of bacterial material using the Power Soil DNA isolation kit (Mobio). SMRT sequencing (Pacific Biosciences) and assembly with HGAP (Chin et al. 2013) were carried out at the Yale Center for Genome Analysis. For all newly sequenced genomes, we obtained 1–3 contigs representing the entire chromosome of a given species. In the case of complete genomes (i.e., one contig with overlapping ends), the origin of replication was set at the change in sign of the GC skew and annotations were subsequently carried out using the Integrated Microbial Genomes (IMG) platform (Markowitz et al. 2014). The genome of B. taylorii IBS296 was sequenced separately (see below). Beps were identified and manually annotated in all genomes based on the presence of a BID domain secretion signal as described below. Frameshifts were identified in coding regions using the Microbial Genome Submission Check tool on NCBI (https://www.ncbi.nlm.nih.gov/genomes/frameshifts/frameshifts.cgi; last accessed March 9, 2017). Frameshifts close to homopolymeric sequence stretches were corrected manually by inserting or deleting a single base in the homopolymeric stretch. These stretches are prone to SMRT sequencing errors, and their correction resulted in all cases in a single continuous ORF. Inconsistencies identified in the previously published genome of B. ancashensis (Hang et al. 2015) were confirmed by resequencing the corresponding locus using the Microsynth Barcode Economy Run service.

Sequencing of Bartonella taylorii IBS296

Bartonella taylorii IBS296 was sequenced at the Quantitative Genomics Facility of the Department of Biosystems Science and Engineering (D-BSSE) of the ETH Zürich in Basel, Switzerland. Two read sets were generated from a 50 bp single-read run on a HiSeq2000 and a 150 bp paired-end run on a MiSeq sequencer (Illumina), achieving an average coverage above 400×. Mapping against the genome sequence of the B. taylorii 8TBB reference strain using Bowtie2 (Langmead and Salzberg 2012) was complemented by a de novo assembly of mis- and unmapped reads using SOAPdenovo2 (Luo et al. 2012).

Identification and Protein Domain Annotation of Bartonella Effector Proteins

Bep and biaA genes in the genomes of all organisms shown in figure 1 were systematically identified in silico using BLAST implemented in Geneious v9.1.5 with amino acid query sequences against the translation of a local genome sequence database in all six reading frames (tBLASTn). Beps were identified with the bipartite secretion signal (terminal BID domain and C-terminal tail) of all effectors of B. henselae (L4) and B. sp. 1-1c (L3) where the effectors had been annotated previously (Engel et al. 2011; Schulein et al. 2005). BiaA orthologs were identified analogously with BLAST searches using the homologous FicA antitoxins VbhA of B. schoenbuchensis and YeFicA of Y. enterocolitica as query sequences (Harms et al. 2015).

FIC domains were manually annotated in all effectors using the protein analysis and classification tool of InterPro (http://www.ebi.ac.uk/interpro/; last accessed March 9, 2017) implemented in Geneious v9.1.5. BID domains of all effectors were manually annotated based on BLAST searches with all BID domains (including internal ones) of the effectors of B. henselae (L4) and B. sp. 1-1c (L3). Eukaryotic-like tyrosine phosphorylation motifs were identified and annotated in all effectors using ScanSite3 (http://scansite3.mit.edu/; last accessed March 9, 2017) with medium stringency and 0.25% fidelity cutoff. Sequence logos were determined using WebLogo (http://weblogo.berkeley.edu/; last accessed March 9, 2017).

Phylogenetic Analyses of Bartonella Species

The maximum likelihood phylogeny of figure 1 was created from 29 bartonellae and 5 outgroup species based on a concatenated alignment of the nucleotide sequence (fig. 1) or amino acid sequence (supplementary fig. S1B, Supplementary Material online) of 509 core gene orthologs. Orthologous gene families were identified using the OrthoFinder software (Emms and Kelly 2015). From the resulting 4981 protein families (excluding singletons), we selected families which contained exactly one gene from each genome in our analysis. This resulted in 494 core genes for tree construction, to which we added 14 ribosomal protein genes that are duplicated in B. bacilliformis, and the housekeeping gene groEL, which was clustered by OrthoFinder with multiple orthologs for some outgroup species. For each of these additional genes, we used the homolog that was most similar to its counterpart in the other strains. We aligned the protein sequences with ClustalW2 (Larkin et al. 2007). The nucleotide sequences were aligned in codons according to the protein alignments with an in-house Python script. From all alignments, we removed the columns with gaps in over 50% of the sequences with trimAl v1.2 (Capella-Gutierrez et al. 2009).
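Two of the steps in this paragraph, selecting families with exactly one gene per genome and removing alignment columns with gaps in over 50% of the sequences, can be illustrated with a minimal Python sketch. This is not the authors' in-house script and it does not reproduce OrthoFinder or trimAl; the function names and the input layout (dictionaries mapping family IDs to gene IDs, gene IDs to genome names, and sequence names to aligned strings) are assumptions made purely for illustration.

```python
from typing import Dict, List


def single_copy_families(families: Dict[str, List[str]],
                         genomes: List[str],
                         genome_of: Dict[str, str]) -> List[str]:
    """Return IDs of families that hold exactly one gene from every genome.

    `families` maps a (hypothetical) family ID to its gene IDs and
    `genome_of` maps each gene ID to the genome it comes from.
    Genome names are assumed to be unique.
    """
    wanted = sorted(genomes)
    selected = []
    for family_id, genes in families.items():
        owners = sorted(genome_of[gene] for gene in genes)
        # Equal sorted lists mean: every genome present, none duplicated.
        if owners == wanted:
            selected.append(family_id)
    return selected


def drop_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Remove columns whose gap fraction is above `max_gap_fraction`.

    `alignment` maps sequence names to equally long aligned strings that
    use '-' as the gap character.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = [
        i for i in range(length)
        if sum(alignment[name][i] == "-" for name in names) / len(names)
        <= max_gap_fraction
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}
```

With the default threshold, a column is dropped only when more than half of the sequences carry a gap there, so a column with gaps in exactly 50% of the sequences is retained.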
FIG. 1. Comprehensive phylogeny of the genus Bartonella. Maximum likelihood phylogeny of Bartonella based on the nucleotide sequence alignment of 509 concatenated core genes. Branch labels show bootstrap support (top; 100 replicates) and % support by single gene phylogenies (bottom). Lineages encoding a VirB/D4 T4SS are highlighted by grey shading. Reservoir hosts are shown on the right-hand side of the illustration (simplified; for details, see table 1 and supplementary table S1, Supplementary Material online). A representation of the phylogeny with full outgroup as well as an analogous phylogeny based on concatenated protein sequence are shown in supplementary figures S1A and S1B, Supplementary Material online.
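The "% support by single gene phylogenies" shown in the branch labels can be illustrated with a small Python sketch: for every clade of a reference tree, count the fraction of gene trees that contain the same clade. This is a simplified illustration and not the authors' script (the study reports using an in-house Python script together with the APE package in R). Trees are assumed to be rooted and written as nested tuples of leaf names, and the function names are hypothetical.

```python
def clades(tree):
    """Collect the leaf set of every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is just its name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # record the clade of this node
        return members

    leaves(tree)
    return found


def gene_tree_support(reference, gene_trees):
    """Percentage of gene trees containing each clade of the reference tree."""
    gene_clades = [clades(tree) for tree in gene_trees]
    support = {}
    for clade in clades(reference):
        hits = sum(clade in tree_clades for tree_clades in gene_clades)
        support[clade] = 100.0 * hits / len(gene_trees)
    return support


# Toy example with four taxa and three gene trees.
reference = ((("A", "B"), "C"), "D")
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
    (("A", "B"), ("C", "D")),
]
support = gene_tree_support(reference, gene_trees)
```

For unrooted trees, congruence would be assessed on bipartitions rather than rooted clades; the rooted version is used here only to keep the sketch short.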
The best model of evolution for each gene was predicted with Prottest-3.4 (Darriba et al. 2011) and MEGA v6.0 (Tamura et al. 2013) for protein and nucleotide sequences, respectively. Invariable site models were not taken into consideration as recommended by the author of RAxML (Stamatakis 2014). We inferred single gene trees for all protein and DNA sequence alignments with RAxML v8.0.0. The trimmed protein and DNA alignments were concatenated with an in-house Python script. Next, genome-wide trees were inferred with RAxML v8.0.0 using the model of evolution that was the best model for most single gene alignments (GTR + G for DNA; JTT + G for protein). To assess the robustness of the phylogenies, we performed 100 bootstrap replicates and counted the number of single gene trees that were congruent with the genome-wide phylogenies at each branch using an in-house Python script and the package APE v3.4 (Paradis et al. 2004) in R v3.2.0.

Whole-Genome Alignments

Whole genome alignments were generated as pairwise tBLASTx files using the DoubleACT tool (http://www.hpa-bioinfotools.org.uk; last accessed March 9, 2017) and visualized using the genoPlotR package in R v3.2.0 (Guy et al. 2010).

Phylogenetic Analyses of VirB/D4 T4SS and Beps

For phylogenies of Bartonella effector proteins, Bep sequences of all species shown in figure 1 were aligned using MAFFT implemented in Geneious v9.1.5. This alignment was manually curated to remove regions of largely gapped or obviously nonhomologous sequence. The optimal model for maximum likelihood analysis was determined using ProtTest3 and found to be JTT in all cases (Darriba et al. 2011). Maximum likelihood phylogenies were constructed from the alignments using
FIG. 2. The VirB/D4 T4SS of B. ancashensis was acquired independently from its homologs in L3 and L4. (A) Whole-genome alignment of B. bacilliformis KC583 (lacking any VirB/D4 T4SS), B. ancashensis, B. clarridgeiae (representing L3), and B. henselae (representing L4) with grey bands indicating tBLASTx hits. virB/bep and bep loci are colored in black and green, respectively. Although the virB/D4/bep locus of B. ancashensis is clearly at a distinct position from any related locus in L3, it is found close to the virB/D4/bep locus of L4 at a different position of the secretion system cassette (SSC) region that encodes various horizontally acquired virulence factors (Guy et al. 2013) (more details shown in supplementary fig. S2, Supplementary Information online). (B) Maximum likelihood phylogeny of the Bartonella VirB/D4 T4SS and related machineries based on concatenated protein alignments of all components (with rhizobial conjugation systems as outgroup). Branch numbers represent bootstrap support when >80 (of 100 replicates).