Evolution and molecular characterization of tick-borne Anaplasmataceae and implications for pathogen diagnostics and control

Alejandro Cabezas-Cruz

Doctoral Thesis

Work presented by Alejandro Cabezas-Cruz, M.Sc., to qualify for the degree of Doctor from the Universidad de Castilla-La Mancha.

DEPARTMENT:

Ciencia y Tecnología Agroforestal y Genética

CENTER:

Instituto de Investigación en Recursos Cinegéticos IREC,

Grupo de Sanidad y Biotecnología, SaBio (Health and Biotechnology)

DOCTORAL PROGRAM:

Investigación básica y aplicada en recursos cinegéticos (Basic and applied research in game resources).

SUPERVISORS:

José de la Fuente García

Libor Grubhoffer

Approval (Vº Bº) of the supervisors:

Signed: José de la Fuente García    Signed: Libor Grubhoffer

The doctoral candidate:

Alejandro Cabezas-Cruz

UNIVERSIDAD DE CASTILLA-LA MANCHA

INSTITUTO DE INVESTIGACIÓN EN RECURSOS CINEGÉTICOS

(CSIC-UCLM-JCCM)

Ciudad Real, 2014

This work was made possible by the following projects:

Alejandro Cabezas-Cruz was supported by the EU Marie Curie actions FP7-PEOPLE-ITN program: POSTICK ITN (Post-graduate training network for capacity building to control ticks and tick-borne diseases; EU Grant No. 238511) and by a grant from the Ministère de l’Education Supérieure et de la Recherche of France. This work was also supported by grants BFU2011-23896 and EU FP7 ANTIGONE project number 278976.

To my parents, Lissette and Leonardo, for what I carry of them within me,

to my grandmothers, who gave me infinite love, to my daughters Carmen Alicia, Carmen Sofia and Chloe, who are an infinite motivation,

and to my wife, who accompanies me with love on my path through this life.

Table of contents

Organization of the thesis
Chapter I Introduction
    Anaplasmataceae family: diagnostic, immunological control and evolution
        Abstract
        Introduction
        E. canis and A. marginale genomes and surface proteins
        Evolution
        Diagnostics
        Immunological control
        Concluding remarks
        References
Hypothesis and Objectives
    Hypothesis
    Objectives
    Development of objectives
Chapter II The gene msp1a is a relevant genetic tool for the characterization of A. marginale diversity worldwide
    Chapter II.I Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis
        Abstract
        Introduction
        Results and Discussion
            Classification of A. marginale Strains Using MSP1a Sequence Data
            The Biological Implications of Sequence Variation of MSP1a Tandem Repeats
            Analysis of B Cell Epitope in MSP1a Tandem Repeats
        Methods
        References
    Chapter II.II Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil
        Abstract
        Introduction
        Materials and Methods
        Results
            Detection of A. marginale infections
            MSP1a microsatellite analysis
            MSP1a tandem repeats and phylogenetic analysis
        Discussion
        Conclusions
        References
    Chapter II.III Epidemiology and evolution of genetic variability of Anaplasma marginale in South Africa
        Abstract
        Introduction
        Materials and Methods
        Results and Discussion
            Molecular evidence of A. marginale prevalence in South Africa
            A. marginale prevalence and msp1a genetic diversity
            Evolution of msp1a genetic diversity
            Amino acid variability and low variable MSP1a peptides
        Conclusions
        References
    Chapter II.IV Low genetic diversity associated with low prevalence of Anaplasma marginale in water buffaloes
        Abstract
        Introduction
        Materials and Methods
        Results and Discussion
        Conclusions
        References
Chapter III The evolution of A. marginale msp1a is affected by the ecology supporting different tick vectors
    Abstract
    Introduction
    Results and Discussion
        Different rates of evolution among A. marginale strains related to different ecoregions
        Emergence of new MSP1a genotypes
        Selective pressures acting on A. marginale msp1a tandem repeats
        MSP1a evolutionary fingerprinting
        Directional and convergent evolution of A. marginale MSP1a related to tick vectors
        Effect of the presence of tick vectors on MSP1a tandem repeats amino acid variability and coevolution of amino acid sites
    Concluding remarks
    Materials and Methods
    References
Chapter IV In vitro culture of three new strains of E. canis and evaluation of genetic diversity using gp36 gene
    Abstract
    Introduction
    Materials and Methods
    Results
        E. canis cultures
        Sequence analysis of 16S rRNA
        Sequence analysis of gp36
        Putative glycosylation of gp36 and phylogenetic Clusters A and B
        Theoretical pI and secondary structure of gp36
    Discussion
    References
Chapter V Ehrlichia mineirensis, a new species of the genus Ehrlichia closely related to E. canis, presents a divergent gp36 ortholog
    Chapter V.I New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats
        Abstract
        Introduction
        Methods
        Results
            Sequence analysis of 16S rRNA
            Sequence analysis of dsb
            Sequence analysis of groESL operon
            Sequence analysis of gltA gene
            Sequence analysis of the gp36 gene and the putative encoded protein sequence
            B cell epitopes analysis
        Discussion
        Conclusions
        References
    Chapter V.II Ultrastructure of Ehrlichia mineirensis, a new member of the Ehrlichia genus
        Abstract
        Introduction
        Materials and Methods
        Results and Discussion
            Ultrastructure of E. mineirensis
        References
    Chapter V.III In vitro culture of a novel genotype of Ehrlichia sp. from Brazil
        Abstract
        Introduction
        Materials and Methods
        Results
            Ehrlichia sp. in tick cell culture
            Infection of mammalian cells
            Phylogenetic analysis of 16S rRNA
        Discussion
        References
Chapter VI General Discussion
    Study of the genetic diversity of tick-borne pathogens of the family Anaplasmataceae
        Abstract
        Bacteria classification
        Classical molecular taxonomy applied to Anaplasmataceae
        Housekeeping genes for taxonomy
        Immunoreactive proteins and molecular taxonomy
        Beyond molecular taxonomy: polyphasic taxonomy
        Implications for pathogen molecular diagnostics and control
        Concluding remarks
        References
General Conclusions
    Conclusions
Summary
Appendices

Organization of the thesis

This thesis focuses on the molecular characterization of recognized and new species of tick-borne pathogens in order to describe their genetic diversity and evolution. Two bacteria of the family Anaplasmataceae are used as model organisms: Anaplasma marginale and Ehrlichia canis. A new organism, named Ehrlichia mineirensis, was isolated and characterized.

Genetic variability is the essential mechanism by which new pathogenic strains of existing pathogens are generated, with an impact on pathogen diagnostics and control. To address this problem, we study here the genetic variability and evolution of major surface proteins from A. marginale and E. canis.

Chapter I is a general introduction to the family Anaplasmataceae, focusing on A. marginale and E. canis.

Chapter II describes the genetic variability of A. marginale in cattle from South Africa and Brazil as well as the genetic variability of this pathogen in water buffaloes from one region in Brazil. The gene msp1a is presented as a relevant genetic tool for the characterization of A. marginale diversity. The genetic variability of A. marginale is discussed in the epidemiologic context of . To further characterize the genetic diversity of A. marginale worldwide, 224 A. marginale strains were classified using MSP1a sequences available at GenBank.

Chapter III summarizes our recent findings on the evolution of the A. marginale gene msp1a. Phylogenetic and evolutionary analyses show that msp1a evolves under diversifying and convergent evolution. Different tick vectors seem to exert different evolutionary pressures on this gene, which encodes MSP1a, a protein involved in pathogen-tick interactions. MSP1a from A. marginale strains that evolved in ecoregions optimal for R. microplus development presents high diversity and strong positive selection on different regions of the gene. In contrast, msp1a from A. marginale strains that evolved in regions where Dermacentor ticks are the main vectors presents neutral evolution, with low amino acid variability and low evolutionary pressure.

Chapter IV describes the genetic diversity of E. canis in three new strains isolated in Spain and South Africa. In vitro culture of these strains was carried out using tick cells and dog macrophages to complete the life cycle of E. canis. The genetic diversity of E. canis was evaluated using the glycoprotein gp36, which, in contrast to MSP1a from A. marginale, contains a variable number of highly conserved tandem repeats. Other E. canis strains for which gp36 sequences were available in GenBank were also analyzed. Two different phylogenetic clusters of E. canis were identified and, based on the molecular characteristics of their gp36, we propose that members of each cluster share antigenic and probably infective properties.
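As an illustration of what "a variable number of highly conserved tandem repeats" means in practice, the following minimal Python sketch counts the longest run of back-to-back copies of a repeat motif in a protein sequence and compares that count between strains. It is not the method used in the thesis. The function name `count_tandem_repeats`, the motif `ASTED`, the flanking residues and the strain names are all hypothetical placeholders chosen for the example, not real gp36 data.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    longest = 0
    start = sequence.find(motif)
    while start != -1:
        # Count consecutive copies beginning at this occurrence.
        copies = 0
        position = start
        while sequence.startswith(motif, position):
            copies += 1
            position += len(motif)
        longest = max(longest, copies)
        # Look for the next occurrence, which may start another run.
        start = sequence.find(motif, start + 1)
    return longest


# Hypothetical toy data: the motif and the flanking residues are placeholders,
# not real gp36 sequences.
TOY_MOTIF = "ASTED"
toy_strains = {
    "strain_x": "MK" + TOY_MOTIF * 3 + "GG",
    "strain_y": "MK" + TOY_MOTIF * 5 + "GG",
}

repeat_counts = {
    name: count_tandem_repeats(seq, TOY_MOTIF) for name, seq in toy_strains.items()
}
print(repeat_counts)
```

In this toy case the two placeholder strains carry the same conserved motif but differ in copy number (3 versus 5). Exact string matching is a deliberate simplification: real repeat units may differ by a few residues, and a tolerant comparison would then be needed.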

Chapter V presents a complete molecular, cellular and ultrastructural characterization of E. mineirensis, a new species of the genus Ehrlichia isolated from the hemolymph of Rhipicephalus microplus from Brazil.

Chapter VI discusses the relevance of the results presented in the thesis and their implications for the diagnosis and control of tick-borne diseases.

Chapter I

Introduction

Anaplasmataceae family: diagnostic, immunological control and evolution.

Cabezas-Cruz A., de la Fuente J. Anaplasmataceae family: diagnostic, immunological control and evolution.

Anaplasmataceae family: diagnostic, immunological control and evolution.

Alejandro Cabezas-Cruz (1,2), José de la Fuente (2,3)

(1) Center for Infection and Immunity of Lille (CIIL), INSERM U1019 – CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France.

(2) SaBio. Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, Ronda de Toledo s/n, 13005 Ciudad Real, Spain.

(3) Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, USA.

Abstract

The family Anaplasmataceae includes intracellular bacteria such as Anaplasma marginale and Ehrlichia canis, which are important rickettsial pathogens transmitted by ixodid ticks. A. marginale colonizes erythrocytes and causes bovine anaplasmosis, an important veterinary disease in tropical and subtropical regions of the world. E. canis causes canine ehrlichiosis, a fatal infection in dogs, and human infections have also been reported. The diagnosis and control of these tick-borne diseases remain challenging. In the last few years, the Anaplasmataceae family has expanded with an increasing number of organisms phylogenetically related to some of its members, including A. marginale and E. canis. Here, we discuss recent advances in A. marginale and E. canis diagnostics and control, as well as our present understanding of their genetic diversity and evolution.

Keywords: A. marginale and E. canis diagnostics, control and evolution.

Introduction

Ehrlichia and Anaplasma are small, coccoid to pleomorphic, Gram-negative bacteria in the order Rickettsiales, family Anaplasmataceae (Dumler et al, 2001). Besides the genera Ehrlichia and Anaplasma, which are sister taxa, the family Anaplasmataceae also includes other genera, such as Neoehrlichia. Ehrlichia and Anaplasma share significant biological similarities, including ixodid tick transmission and similar adaptations to intracellular parasitism. The taxonomically valid species of Ehrlichia are E. canis, E. chaffeensis, E. ewingii, E. muris and E. ruminantium. In addition, numerous candidate entities have been reported (Ehrlichia walkerii, Ehrlichia shimanensis, Ixodes ovatus ehrlichia, Panola Mountain ehrlichia), all isolated from hard ticks and mainly characterized by PCR sequencing (Telford III et al, 2011). Anaplasma comprises four main species: A. marginale, A. bovis, A. phagocytophilum and A. platys (Dumler et al, 2001). Three of the five Ehrlichia species can potentially cause human ehrlichiosis, among them Ehrlichia canis and Ehrlichia chaffeensis (Telford III et al, 2011), while A. phagocytophilum is a well-known example of a tick-borne bacterium causing human granulocytic anaplasmosis (HGA).

E. canis has a tropism for canine monocytes and macrophages, and the disease it causes is therefore known as canine monocytic ehrlichiosis (CME). E. canis has also been identified as the cause of human ehrlichiosis in patients from Venezuela (Perez et al, 1996; Perez et al, 2006). The brown dog tick Rhipicephalus sanguineus is an efficient vector of E. canis (Groves et al, 1975; Bremer et al, 2005), but experimental transmission of E. canis has also been demonstrated through the ixodid tick Dermacentor variabilis (Johnson et al, 1998). A. marginale infects the red blood cells not only of cattle but also of other domestic and wild ruminants, such as water buffaloes (Bubalus bubalis), and it is transmitted biologically by R. microplus.

E. canis and A. marginale genomes and surface proteins

At the genomic level these two agents are very similar: both have a circular genome, of 1,315,030 bp and 1,197,687 bp, encoding 925 and 949 proteins in E. canis (Mavromatis et al, 2006) and A. marginale (Brayton et al, 2005), respectively. A. marginale has an unusually high G+C content of 49.8%, whereas most sequenced Rickettsiales average 31%; E. canis, for example, has 28.9% (Mavromatis et al, 2006).

A small subset of the proteins encoded in the E. canis genome react strongly with antibodies and are considered to be major immunoreactive proteins (McBride et al, 2003). The genes encoding these proteins may exhibit a high level of diversity as a result of increased selective pressure by the immune system. In fact, the study of several of these proteins, including gp200, gp140, gp19 and gp36, has shown high genetic diversity among geographically distant isolates of E. canis (Zhang et al, 2008). Of these genes, E. canis gp36 was the most divergent (Zhang et al, 2008). E. canis gp36 is an acidic serine-rich glycoprotein that contains a tandem repeat region in which major antibody epitopes are located (Doyle et al, 2006). This glycoprotein has been successfully used for the characterization of the genetic variability of E. canis (Hsieh et al, 2010; Kamani et al, 2013) as well as for a diagnostic ELISA system for E. canis (Cardenas et al, 2007). Further, based on amino acid homology and genomic synteny analyses, it was determined that E. canis gp36 has orthologous proteins in E. chaffeensis (gp47) and E. ruminantium (mucin-like protein) (Doyle et al, 2006).

In A. marginale, of the 949 encoded proteins, 62 are predicted to be outer membrane proteins, and of these 49 belong to the msp2 or msp1 superfamilies (Brayton et al, 2005). Three major surface proteins (MSPs), MSP5, MSP4 and MSP1a, have been extensively used for the molecular identification and characterization of A. marginale (Aubry and Geale, 2011). The msp5 gene is highly conserved in A. marginale, which makes it difficult to use for differentiating among bacterial strains (Visser et al, 1992). The msp4 gene is less conserved than msp5, but its function is unknown and it does not accumulate enough genetic changes to identify local genetic diversity of A. marginale strains (de la Fuente et al, 2003a). However, analyses of msp1α (major surface protein 1 alpha) gene sequences have allowed the identification of A. marginale strains worldwide (Cabezas-Cruz et al, 2013) and, despite the genetic diversity of msp1α, this gene is considered a stable genetic marker that is conserved during acute and persistent rickettsemia in cattle and also during multiplication in ticks (Palmer et al, 2001; Bowie et al, 2002; de la Fuente et al, 2003b). Furthermore, MSP1a is an adhesin for bovine erythrocytes and tick cells, and the tandem repeat region of this protein binds tick cell extract, providing information regarding the tick transmissibility phenotypes of A. marginale strains (de la Fuente et al, 2003c).

Evolution

The group to which Anaplasmataceae belongs represents a paradigm of reductive evolution (Blanc et al, 2007). The reductive evolution model has been proposed to explain the genomic similarities seen in all obligate endosymbionts, including obligate intracellular parasites (Wernegreen, 2005). Genome reduction is associated with massive gene loss in highly vertebrate-pathogenic Rickettsia compared to less virulent or endosymbiotic species (Merhej and Raoult, 2011). Typically, bacterial species with reduced genomes show rapid DNA sequence evolution, strong nucleotide compositional biases, low gene acquisition from foreign sources, high specialization to their particular host and genome dynamics different from those of their free-living counterparts (Wernegreen, 2005). Interestingly, a decrease in the rate of gene loss was associated with an increase in genomic divergence among Rickettsia (Blanc et al, 2007), suggesting a common trend of losing certain genes while the genomes are reduced during evolution. It has been suggested that the first association occurred between the ancestor of intracellular bacteria and invertebrate eukaryotic cells (Darby et al, 2007), meaning, for A. marginale and E. canis, that the pathogen-vertebrate association was secondary to the tick-pathogen association. This is in sharp contrast with the fact that some strains of A. marginale are transmitted by ticks while others are not (reviewed in de la Fuente et al, 2003c).

Diagnostics

Because of the similarities between E. canis and A. marginale, the diagnostic methods are very similar. The diagnostic techniques (excluding clinical findings) used to identify E. canis and A. marginale can be classified as microscopic, serologic and molecular. In addition, the in vitro culture of the agent in tick or mammalian cell lines, as well as the experimental infection of susceptible animals with blood from suspect animals, could also be considered diagnostic means.

Classical microscopic diagnosis of Ehrlichia spp. and Anaplasma spp. infection relies on identification of morulae in circulating monocytes or erythrocytes in Giemsa-stained blood smears. However, morulae are usually difficult to find in blood smears, even during the acute stage of the disease.

Cell culture isolation, which some consider a gold standard for confirming these infections, is laborious, expensive, time-consuming and available only in specialized research laboratories. Despite the advances in cell culture techniques, some ehrlichial agents have never been cultured in vitro; two examples are E. ewingii and A. platys. Fortunately, the in vitro culture of E. canis and A. marginale is an available tool for pathogen identification and characterization (Bell-Sakyi et al, 2007). Recent developments in tick cell culture technologies have made possible the in vitro culture of several isolates of different tick-borne pathogens, including E. canis and A. marginale (Bell-Sakyi et al, 2007). The first successful in vitro isolation and propagation of E. canis in Ixodes scapularis-derived IDE8 tick cell cultures was reported by Ewing and colleagues (1995), whereas infected monocytes obtained from infected dogs had already been successfully maintained in culture in 1971 (Nyindo et al, 1971). However, the macrophage cell line DH82, obtained from a dog suffering from malignant histiocytosis, is the cell line most successfully used for the continuous propagation of E. canis in vitro (Dawson et al, 1991).

Serology may be helpful in identifying the presence of antibodies to Ehrlichia spp. and Anaplasma spp. However, there are three main problems with these techniques in general. First, they may not detect early infections during the acute phase of disease (e.g. Cardenas et al, 2007). Second, owing to the normal dynamics of the vertebrate host immune response, antibody titers may persist for months to years even after resolution of clinical signs or disease, meaning that animals with positive antibodies may not be infected at the moment of detection. Finally, cross-reaction among Ehrlichia spp. and Anaplasma spp. is commonly recognized, and nonspecific antibodies may develop due to infection with related agents (Al-Adhami et al, 2011), masking the real causal agent.

Several serological tests have been extensively used in the epidemiology of anaplasmosis (reviewed in detail in Aubry and Geale, 2011): the complement fixation (CF) test, the capillary agglutination assay, the card agglutination test (CAT) and the indirect fluorescent antibody (IFA) test, as well as various enzyme-linked immunosorbent assays (ELISA) such as competitive ELISA (cELISA), indirect ELISA and dot ELISA. The preferred serological tests are the cELISA and the CAT.

A cELISA using the major surface protein 5 (MSP5) from A. marginale is the most accurate serological test currently available for identifying Anaplasma-infected cattle. This cELISA uses a monoclonal antibody specific to MSP5 and has a sensitivity of 95% and a specificity of 98% (reviewed in Aubry and Geale, 2011). Recently, critical motifs present in the MSP1a tandem repeats were obtained through phage display technology against the neutralizing monoclonal antibody 15D2 (Santos et al, 2012). The antibody 15D2 is specific to one neutralization epitope found in the MSP1a tandem repeats (Palmer et al, 1987; Allred et al, 1990). This epitope was incorporated onto a graphite bioelectrode, and direct serum detection was demonstrated by impedance, differential pulse voltammetry and atomic force microscopy (Santos et al, 2012).

Regarding E. canis, IFA is currently considered the gold standard and is the most widely used method for the diagnosis of CME (Waner et al, 2001). In addition, the major immunoreactive antigens gp36, gp19, gp200 and p28 have been used to develop an ELISA that exhibited 100% sensitivity and specificity for immunodiagnosis (Cardenas et al, 2007). Moreover, using gp36 and gp19, antibodies against E. canis in experimentally infected dogs were detected as much as 15 days earlier than with the IFA test (Cardenas et al, 2007).

The molecular diagnosis of Ehrlichia and Anaplasma infection via polymerase chain reaction (PCR) of whole blood samples has become readily available. Various genes are useful for organism identification at the genus and species level. Among the most used and phylogenetically reliable are 16S rRNA (Warner and Dawson, 1996; Wen et al, 2002), gltA (Inokuma et al, 2001) and groEL (Yu et al, 2001). Additionally, msp4 and msp5 have been used for identification of A. marginale (reviewed in Aubry and Geale, 2011). Amplification of related organisms by nonspecific primers has been shown to result in false-positive reactions. However, sequencing and sequence comparison may solve this problem. Conversely, false negatives may occur if extraction procedures fail to remove PCR inhibitors present in a blood sample or if the level of circulating rickettsemia falls below the detection limit of the assay.

The main problem with the above genes is that they are conserved at the species level, so strain identification becomes challenging or impossible using these genes. An alternative is the use of variable genes. The msp1a gene for A. marginale (Palmer et al, 2001; Almazán et al, 2008; Ruybal et al, 2009) and gp36 for E. canis (Zhang et al, 2008; Hsieh et al, 2010; Kamani et al, 2013) have become the two most used tools for testing strain variability in these two pathogens.

Immunological control

Vaccination is an economical, promising and relatively effective option to control anaplasmosis (Aubry and Geale, 2011). The development of a vaccine able to prevent A. marginale infection in the field has become challenging due to the existence of a wide range of A. marginale strains that present antigenic variation. A killed vaccine was available on the market until 1999 (Kocan et al, 2003). A live vaccine based on A. marginale subsp. centrale (A. centrale), a subspecies of relatively low pathogenicity that is closely related to A. marginale, is available in Africa, Australia, Israel and Latin American countries. However, neither vaccine has been able to prevent cattle from becoming persistently infected with A. marginale (Kocan et al, 2003). In addition, the live vaccine has proved to be only partially effective, and reports of vaccination failure are not uncommon (Kocan et al, 2003). In fact, while anaplasmosis still constitutes a problem in South Africa, sales of the vaccine there fell from 800 000 to 200 000 doses over a period of 22 years (1976-1998) (De Waal, 2000).

Among the major surface proteins of A. marginale, MSP1a in particular has been shown to have potential as a vaccine antigen because this protein contains both neutralization-sensitive (Palmer et al, 1987; Allred et al, 1990) and immunodominant epitopes (Garcia-Garcia et al, 2004). Recently, the use of MSP1a for vaccine development against A. marginale has attracted renewed attention. The N-terminal tandem repeat region of this protein was used in immunization trials against A. marginale in cattle (Torina et al, 2014) and in laboratory animal models (Santos et al, 2013; Silvestre et al, 2014), with promising results.

No commercial vaccine for CME is currently available. However, attempts to develop a vaccine against E. canis have also been made, and the evaluation of inactivated (Mahan et al, 2005) and attenuated (Rudoler et al, 2012) vaccines has been reported. Inoculation with the attenuated E. canis strain produced only transient thrombocytopenia in vaccinated dogs. After a challenge with a wild virulent strain of E. canis, only 3 out of 8 vaccinated dogs suffered mild transient fever, while all the dogs of the control group suffered severe disease. Four genes, p30, gp19, VirB4 and VirB9, encoding immunoreactive proteins and virulence factors, were found to be conserved between the vaccine strain and the challenge strain (Rudoler et al, 2012).

Concluding remarks

The diagnosis and control of E. canis and A. marginale constitute a challenging problem. We need molecular tools suitable for evaluating the genetic diversity of E. canis and A. marginale. Understanding the molecular mechanisms underlying the generation of genetic diversity in these pathogens is crucial in order to implement control measures that do not stay one step behind pathogen evolution.

References

Dumler, J., Barbet, A., Bekker, C., Dasch, G., Palmer, G., Ray, S., Rikihisa, Y., Rurangirwa, F. (2001) Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int J Syst Evol Microbiol, 51, 2145–2165.

Telford III, S., Goethert, H., Cunningham, J. (2011) Prevalence of Ehrlichia muris in Wisconsin deer ticks collected during the mid-1990s. Open Microbiol J, 5, 18–20.

Perez, M., Rikihisa, Y., Wen, B. (1996) Ehrlichia canis-like agent isolated from a man in Venezuela: antigenic and genetic characterization. J Clin Microbiol, 34, 2133–2139.

Perez, M., Bodor, M., Zhang, C., Xiong, Q., Rikihisa, Y. (2006) Human infection with Ehrlichia canis accompanied by clinical signs in Venezuela. Ann N Y Acad Sci, 1078, 110–117.

Groves, MG., Dennis, GL., Amyx, HL., Huxsoll, DL. (1975) Transmission of Ehrlichia canis to dogs by ticks (Rhipicephalus sanguineus). Am J Vet Res, 36, 937–940.

Bremer, WG., Schaefer, JJ., Wagner, ER., Ewing, SA., Rikihisa, Y., Needham, GR., Jittapalapong, S., Moore, DL., Stich, RW. (2005) Transstadial and intrastadial experimental transmission of Ehrlichia canis by male Rhipicephalus sanguineus. Vet Parasitol, 131, 95–105.

Johnson, E., Ewing, S., Barker, R., Fox, J., Crow, D., Kocan, K. (1998) Experimental transmission of Ehrlichia canis (Rickettsiales: Ehrlichieae) by Dermacentor variabilis (Acari: Ixodidae). Vet Parasitol, 74, 277–288.

Mavromatis, K., Doyle, C., Lykidis, A., Ivanova, N., Francino, M., Chain, P., Shin, M., Malfatti, S., Larimer, F., Copeland, A., Detter, J., Land, M., Richardson, P., Yu, X., Walker, D., McBride, J., Kyrpides, N. (2006) The genome of the obligately intracellular bacterium Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies. J Bacteriol, 188, 4015–23.

Brayton, K., Kappmeyer, L., Herndon, D., Dark, M., Tibbals, D., Palmer, G., McGuire, T., Knowles, D. (2005) Complete genome sequencing of Anaplasma marginale reveals that the surface is skewed to two superfamilies of outer membrane proteins. Proc Natl Acad Sci USA, 102, 844–9.

McBride, J., Corstvet, R., Gaunt, S., Boudreaux, C., Guedry, T., Walker, D.H. (2003) Kinetics of antibody response to Ehrlichia canis immunoreactive proteins. Infect Immun, 71, 2516–2524.

Zhang, X., Luo, T., Keysary, A., Baneth, G., Miyashiro, S., Strenger, C., Waner, T., McBride, J. (2008) Genetic and Antigenic Diversities of Major Immunoreactive Proteins in Globally Distributed Ehrlichia canis Strains. Clin Vaccine Immunol, 15, 1080–1088.

Doyle, CK., Nethery, KA., Popov, VL., McBride, JW. (2006) Differentially expressed and secreted major immunoreactive protein orthologs of Ehrlichia canis and E. chaffeensis elicit early antibody responses to epitopes on glycosylated tandem repeats. Infect Immun, 74, 711–720.

Hsieh, YC., Lee, CC., Tsang, CL., Chung, YT. (2010) Detection and characterization of four novel genotypes of Ehrlichia canis from dogs. Vet Microbiol, 146, 70–75.

Kamani, J., Lee, CC., Haruna, AM., Chung, PJ., Weka, PR., Chung, YT. (2013) First detection and molecular characterization of Ehrlichia canis from dogs in Nigeria. Res Vet Sci, 94, 27–32.

Cardenas, AM., Doyle, CK., Zhang, X., Nethery, K., Corstvet, RE., Walker, DH., McBride, JW. (2007) Enzyme-linked immunosorbent assay with conserved immunoreactive glycoproteins gp36 and gp19 has enhanced sensitivity and provides species-specific immunodiagnosis of Ehrlichia canis infection. Clin Vaccine Immunol, 14, 123–128.

Aubry, P., Geale, DW. (2011) A review of bovine anaplasmosis. Transbound Emerg Dis, 58, 1–30.

Visser, ES., McGuire, TC., Palmer, GH., Davis, WC., Shkap, V., Pipano, E., Knowles, DP. (1992) The Anaplasma marginale msp5 gene encodes a 19-kilodalton protein conserved in all recognized Anaplasma species. Infect Immun, 60, 5139–5144.

de la Fuente, J., Golsteyn, TEJ., van den Bussche, RA., Hamilton, RG., Tanaka, EE., Druhan, SE., Kocan, KM. (2003a) Characterization of Anaplasma marginale isolated from North American bison. Appl Environ Microbiol, 69, 5001–5005.

Cabezas-Cruz, A., Passos, LMF., Lis, K., Kenneil, R., Valdés, JJ., Ferrolho, J., Tonk, M., Pohl, AE., Grubhoffer, L., Zweygarth, E., Shkap, V., Ribeiro, MFB., Estrada-Peña, A., Kocan, KM., de la Fuente, J. (2013) Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS One, 8, 1–13.

Palmer, GH., Rurangirwa, FR., McElwain, TF. (2001) Strain composition of the ehrlichia Anaplasma marginale within persistently infected cattle, a mammalian reservoir for tick transmission. J Clin Microbiol, 39, 631–635.

Bowie, MV., de la Fuente, J., Kocan, KM., Blouin, EF., Barbet, AF. (2002) Conservation of major surface protein 1 genes of Anaplasma marginale during cyclic transmission between ticks and cattle. Gene, 282, 95–102.

de la Fuente, J., van Den Bussche, RA., Prado, T., Kocan, KM. (2003b) Anaplasma marginale major surface protein 1a genotypes evolved under positive selection pressure but are not markers for geographic strains. J Clin Microbiol, 41, 1609–1616.

de la Fuente, J., Garcia-Garcia, JC., Blouin, EF., Kocan, KM. (2003c) Characterization of the functional domain of major surface protein 1a involved in adhesion of the rickettsia Anaplasma marginale to host cells. Vet Microbiol, 91, 265–283.

Blanc, G., Ogata, H., Robert, C., Audic, S., Suhre, K., Vestris, G., Claverie, JM., Raoult, D. (2007) Reductive genome evolution from the mother of Rickettsia. PLoS Genet, 3, e14.

Wernegreen, JJ. (2005) For better or worse: genomic consequences of intracellular mutualism and parasitism. Curr Opin Genet Dev, 15, 572–83.

Merhej, V., Raoult, D. (2011) Rickettsial evolution in the light of comparative genomics. Biol Rev Camb Philos Soc, 86, 379–405.

Darby, AC., Cho, NH., Fuxelius, HH., Westberg, J., Andersson, SG. (2007) Intracellular pathogens go extreme: genome evolution in the Rickettsiales. Trends Genet, 23, 511–20.

Bell-Sakyi, L., Zweygarth, E., Blouin, EF., Gould, EA., Jongejan, F. (2007) Tick cell lines: tools for tick and tick-borne disease research. Trends Parasitol, 23, 450–7.

Ewing, SA., Munderloh, UG., Blouin, EF., Kocan, KM., Kurtti, TJ. (1995) Ehrlichia canis in tick cell culture. In: Proceedings of the 76th Conference of Research Workers in Animal Diseases, Chicago, USA, 13-14 November 1995, Iowa State University Press, Ames, abstract no. 165.

Nyindo, MBA., Ristic, M., Huxsoll, DL., Smith, AR. (1971) Tropical canine pancytopenia: in vitro cultivation of the causative agent – Ehrlichia canis. Am J Vet Res, 32, 1651–1658.

Dawson, JE., Rikihisa, Y., Ewing, SA., Fishbein, DB. (1991) Serologic diagnosis of human ehrlichiosis using two Ehrlichia canis isolates. J Infect Dis, 163, 564–567.

Al-Adhami, B., Scandrett, WB., Lobanov, VA., Gajadhar, AA. (2011) Serological cross-reactivity between Anaplasma marginale and an Ehrlichia species in naturally and experimentally infected cattle. J Vet Diagn Invest, 23, 1181–1188.

Santos, PS., Nascimento, R., Rodrigues, LP., Santos, FA., Faria, PC., Martins, JR., Brito-Madurro, AG., Madurro, JM., Goulart, LR. (2012) Functional epitope core motif of the Anaplasma marginale major surface protein 1a and its incorporation onto bioelectrodes for antibody detection. PLoS One, 7, e33045.

Palmer, GH., Waghela, SD., Barbet, AF., Davis, WC., McGuire, TC. (1987) Characterization of a neutralization-sensitive epitope on the Am 105 surface protein of Anaplasma marginale. Int J Parasitol, 17, 1279–1285.

Allred, DR., McGuire, TC., Palmer, GH., Leib, SR., Harkins, TM., McElwain, TF., Barbet, AF. (1990) Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc Natl Acad Sci USA, 87, 3220–3224.

... marginale strains from an outbreak of bovine anaplasmosis in an endemic area. Vet Parasitol, 158, 103–109.

Ruybal, P., Moretta, R., Perez, A., Petrigh, R., Zimmer, P., Alcaraz, E., Echaide, I., Torioni de Echaide, S., Kocan, KM., de la Fuente, J., Farber, M. (2009) Genetic diversity of Anaplasma marginale in Argentina. Vet Parasitol, 162, 176–180.

Kocan, KM., de la Fuente, J., Guglielmone, AA., Melendez, RD. (2003) Antigens and alternatives for control of Anaplasma marginale infection in cattle. Clin Microbiol Rev, 16, 698–712.

De Waal, DT. (2000) Anaplasmosis Control and Diagnosis in South Africa. Ann N Y Acad Sci, 916, 474–483.

Waner, T., Harrus, S., Jongejan, F., Bark, H., Garcia-Garcia, JC., de la Fuente, J., Kocan, KM., Keysary A., Cornelissen, AW. (2001) Significance Blouin, EF., Halbur, T., Onet, VC., Saliki, JT. of serological testing for ehrlichial diseases in dogs (2004) Mapping of B-cell epitopes in the N- with special emphasis on the diagnosis of canine terminal repeated peptides of Anaplasma monocytic ehrlichiosis caused by Ehrlichia canis. marginale major surface protein 1a and Vet Parasitol, 95,1-15. characterization of the humoral immune response of cattle immunized with recombinant and whole Warner, CK., Dawson, JE. (1996) Genus- and organism antigens. Vet Immunol Immunopathol, species-level identification of Ehrlichia species by 98,137–151. PCR and sequencing. In PCR protocols for emerging infectious diseases. Edited by Persing DH. Washington DC: ASM Press; 1996:100–105. Torina, A., Moreno-Cid, JA., Blanda, V., Fernández de Mera, IG., de la Lastra, JM., Wen, B., Jian, R., Zhang, Y., Chen, R. (2002) Scimeca, S., Blanda, M., Scariano, ME., Briganò, Simultaneous detection of Anaplasma marginale S., Disclafani, R., Piazza, A., Vicente, J., Gortázar, and a new Ehrclihia species closely related to C., Caracappa, S., Lelli, RC., de la Fuente, J. by sequence analyses of 16S (2014) Control of tick infestations and pathogen ribosomal DNA in Boophilus microplus ticks from prevalence in cattle and sheep farms vaccinated Tibet. J Clin Microbiol, 40,3286–3290. with the recombinant Subolesin-Major Surface Protein 1a chimeric antigen. Parasit Vectors, 7,10. Inokuma, H., Brouqui, P., Drancourt, M., Raoult, D. (2001) Citrate synthase gene sequence: a new tool for phylogenetic analysis and identification of Ehrlichia. J Clin Microbiol, 39,3031-3039. Santos, PS., Sena, AA., Nascimento, R., Araújo, TG., Mendes, MM., Martins, JR., Mineo, TW., Yu, XJ., Zhang, XF., McBride, JW., Zhang, Y., Mineo, JR., Goulart, LR. (2013) Epitope-based Walker, DH. 
(2001) Phylogenetic relationships of vaccines with the Anaplasma marginale MSP1a Anaplasma marginale and ‘Ehrlichia platys’ to functional motif induce a balanced humoral and other Ehrlichia species determined by GroEL aa cellular immune response in mice. PLoS One, 8, sequences. Int J Syst Evol Microbiol, 51,1143– e60311. 1146.

Almazán, C., Medrano, C., Ortiz, M., de la Fuente, J. (2008) Genetic diversity of Anaplasma

-11- Silvestre, BT., Rabelo, ÉM., Versiani, AF., da Mahan, S., Kelly, PJ., Mahan, SM. (2005) A Fonseca, FG., Silveira, JA., Bueno, LL., Fujiwara, preliminary study to evaluate the immune RT., Ribeiro, MF. (2014) Evaluation of humoral responses induced by immunization of dogs with and cellular immune response of BALB/c mice inactivated Ehrlichia canis organisms. immunized with a recombinant fragment of MSP1a Onderstepoort J Vet Res, 72,119-128. from Anaplasma marginale using carbon nanotubes as a carrier molecule. Vaccine, 32,2160- Rudoler, N., Baneth, G., Eyal, O., van Straten, M., 2166. Harrus, S. (2012) Evaluation of an attenuated strain of Ehrlichia canis as a vaccine for canine monocytic ehrlichiosis. Vaccine. 31,226-233.

Hypothesis and Objectives

Hypothesis:

Surface proteins in Anaplasmataceae reflect bacterial genetic diversity and the evolution of pathogen-host/vector interactions, and can be used in epidemiological studies and for the development of diagnostic and control measures.

Objectives:

1. To characterize genetic diversity in Anaplasma marginale strains using the msp1a gene encoding major surface protein 1a (MSP1a).

2. To characterize the evolution of Anaplasma marginale strains based on the role of MSP1a in pathogen-tick interactions.

3. To characterize the genetic diversity of Ehrlichia canis strains using the glycoprotein gp36 gene.

4. To characterize at the molecular, cellular and ultrastructural levels a new species of the genus Ehrlichia, E. mineirensis, isolated from the hemolymph of Rhipicephalus microplus.

Development of objectives

Objectives 1 and 2 focus on A. marginale and are addressed in Chapters II and III, which characterize the A. marginale msp1a gene as a relevant tool to evaluate genetic variability and evolution in this species. Three studies were performed in anaplasmosis-endemic regions of the world (Brazil and South Africa) to evaluate the genetic diversity of A. marginale using the msp1a gene. Additionally, a global analysis of all the msp1a sequences available in GenBank gave us a general picture of the variability of this gene as well as the different patterns of sequence variability in different regions of the world. Subsequently, using remotely sensed information, we divided the world into ecoregions suitable for different tick vectors and linked the location of each A. marginale strain to these ecoregions. Chapter III describes this strategy and the evaluation of the rates of evolution of msp1a in these ecoregions.

Objectives 3 and 4 focus on Ehrlichia species and are addressed in Chapters IV and V, which describe the isolation of three new strains of E. canis as well as a new species of the genus Ehrlichia, E. mineirensis, closely related to E. canis. All organisms were isolated and propagated in vitro using tick and mammalian cell lines. Different genes were used for the molecular characterization of the three new strains of E. canis (16S rRNA and gp36) and for E. mineirensis (16S rRNA, groESL operon, groEL, gltA, dsb and gp36). These two objectives also focus on the use of the gp36 gene in the characterization of E. canis genetic variability. All the gp36 sequences available in GenBank were analyzed and phylogenetic studies were performed.

Chapter II

The gene msp1a is a relevant genetic tool for the characterization of A. marginale diversity worldwide.

Cabezas-Cruz A., Passos L M F., Lis K., Kenneil R., Valdés J J., Ferrolho J., Tonk M., Pohl A E., Grubhoffer L., Zweygarth E., Shkap V., Ribeiro M F B., Estrada-Peña A., Kocan K M., de la Fuente J. 2013. Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS ONE. 8(6): e65243.

Pohl A E., Cabezas-Cruz A., Ribeiro M F B., Silveira J A G., Silaghi C., Pfister K., Passos L M F. 2013. Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Brazilian journal of veterinary parasitology. 22(1): 1-7.

Mutshembele A M., Cabezas-Cruz A., Mtshali M S., Thekisoe O M M., Galindo R C., de la Fuente J. 2014. Epidemiology and evolution of genetic variability of Anaplasma marginale in South Africa. Ticks and Tick-borne Diseases. In press.

Silva J B., Fonseca A H., Barbosa J D., Cabezas-Cruz A., de la Fuente J. 2014. Low genetic diversity associated to low prevalence of Anaplasma marginale in water buffaloes. Ticks and Tick-borne Diseases. In press.

Chapter II.I

Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis

Cabezas-Cruz A., Passos L M F., Lis K., Kenneil R., Valdés J J., Ferrolho J., Tonk M., Pohl A E., Grubhoffer L., Zweygarth E., Shkap V., Ribeiro M F B., Estrada-Peña A., Kocan K M., de la Fuente J. 2013. Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS ONE. 8(6): e65243.


Alejandro Cabezas-Cruz1, Lygia M. F. Passos2,3, Katarzyna Lis2, Rachel Kenneil2, James J. Valdés1, Joana Ferrolho4, Miray Tonk1, Anna E. Pohl2, Libor Grubhoffer1, Erich Zweygarth2, Varda Shkap5, Mucio F. B. Ribeiro6, Agustín Estrada-Peña7, Katherine M. Kocan8, José de la Fuente8,9*

1 University of South Bohemia, Faculty of Science and Biology Centre of the Academy of Sciences of the Czech Republic, Parasitology Institute, České Budějovice, Czech Republic, 2 Institute for Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität, München, Germany, 3 Departamento de Medicina Veterinária Preventiva, Escola de Veterinária, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil, 4 Pirbright Laboratory, The Pirbright Institute, Pirbright, Surrey, United Kingdom, 5 Division of Parasitology, Kimron Veterinary Institute, Bet Dagan, Israel, 6 Departamento de Parasitologia, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil, 7 Facultad de Veterinaria, Universidad de Zaragoza, Zaragoza, Spain, 8 Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, Oklahoma, United States of America, 9 Sanidad y Biotecnología, Instituto de Investigación de Recursos Cinegéticos, IREC-CSIC-UCLM-JCCM, Ciudad Real, Spain

Abstract

Bovine anaplasmosis is caused by cattle infection with the tick-borne bacterium, Anaplasma marginale. The major surface protein 1a (MSP1a) has been used as a genetic marker for identifying A. marginale strains based on N-terminal tandem repeats and a 5′-UTR microsatellite located in the msp1a gene. The MSP1a tandem repeats contain immune relevant elements and functional domains that bind to bovine erythrocytes and tick cells, thus providing information about the evolution of host-pathogen and vector-pathogen interactions. Here we propose one nomenclature for A. marginale strain classification based on MSP1a. All tandem repeats among A. marginale strains were classified and the amino acid variability/frequency in each position was determined. The sequence variation at immunodominant B cell epitopes was determined and the secondary (2D) structure of the tandem repeats was modeled. A total of 224 different strains of A. marginale were classified, showing 11 genotypes based on the 5′-UTR microsatellite and 193 different tandem repeats with high amino acid variability per position. Our results showed phylogenetic correlation between MSP1a sequence, secondary structure, B-cell epitope composition and tick transmissibility of A. marginale strains. The analysis of MSP1a sequences provides relevant information about the biology of A. marginale to design vaccines with a cross-protective capacity based on MSP1a B-cell epitopes.

Citation: Cabezas-Cruz A, Passos LMF, Lis K, Kenneil R, Valdés JJ, et al. (2013) Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS ONE 8(6): e65243. doi:10.1371/journal.pone.0065243

Editor: Roman Ganta, Kansas State University, United States of America

Received February 16, 2013; Accepted April 22, 2013; Published June 11, 2013

Copyright: © 2013 Cabezas-Cruz et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This research was supported by POSTICK ITN (Post-graduate training network for capacity building to control ticks and tick-borne diseases) within the FP7-PEOPLE-ITN programme (EU Grant No. 238511) and BFU2011-23896 grant to JF. JJV was sponsored by project CZ.1.07/2.3.00/30.0032, co-financed by the European Social Fund and the state budget of the Czech Republic. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: [email protected]

Introduction

Bovine anaplasmosis, caused by the intraerythrocytic rickettsia Anaplasma marginale (Rickettsiales: Anaplasmataceae), is an economically important disease of cattle which is endemic in tropical and subtropical regions of the world [1,2]. This obligate intracellular pathogen can be transmitted biologically by ticks, mechanically by transfer of infective blood on fomites or the mouthparts of biting insects [1,2], and, less commonly, by transplacental transmission from dams to their calves [3].

Many geographic strains of A. marginale have been identified worldwide which differ in morphology, protein sequence, antigenic characteristics and their ability to be transmitted by ticks [1,2,4–15]. The genetic diversity of A. marginale strains derived from bovine erythrocytes has been characterized based on the sequence of major surface protein (MSP) genes, several of which have been shown to be involved in host cell/pathogen interactions [16]. MSP1a, one of six MSPs described previously on A. marginale, is a 70–100 kDa protein encoded by a single-copy gene, msp1a, which is conserved during the multiplication in cattle and ticks [17]. MSP1a is involved in adhesion of A. marginale to bovine erythrocytes and tick cells and therefore is a determinant of infection for cattle and transmission of A. marginale by ticks. MSP1a has also been shown to be involved in development of bovine immunity against A. marginale [3]. Strains of A. marginale were originally identified by differences in the molecular weight of MSP1a because of a variable number of 23–31 amino acid serine-rich tandem repeats located in the N-terminal region of the protein, which is continuous with a highly conserved C-terminal region [6,11,14]. Because the number and sequence of tandem repeats remained the same in a given strain, the msp1a gene was recognized as a stable genetic marker for geographic strain identity [9,12,15,18–20]. Phylogenetic analyses of A. marginale strains using MSPs were reported by de la Fuente et al. [14,21–23]. While sequence analysis of MSP4 provided phylogeographic information, MSP1a did not prove to be as suitable for these studies [24]. However, MSP1a repeat sequence analysis contributed to the understanding of the genetic diversity of A. marginale within specific regions, as well as providing insight into the evolution of host-pathogen-vector interactions [14,21–23,25].

MSP1a also contains neutralization-sensitive T- and B-cell epitopes required for development of a protective immune response [8,10,26–29]. One B-cell epitope within the MSP1a tandem repeat ((Q/E)ASTSS) was recognized by a monoclonal antibody that neutralized A. marginale in vitro [6]. This neutralization-sensitive epitope was found to be conserved among heterologous A. marginale strains [29,30]. An additional linear B-cell epitope (SSAGGQQQESS) was found to be immunodominant [26,28,31]. Cattle immunized with MSP1 were partially protected against challenge with homologous and heterologous strains [32–34]. Furthermore, MSP1a antibodies reduced the infectivity of A. marginale for cultured tick cells [35] and infection and transmission of A. marginale by D. variabilis [1].

MSP1a is relevant to many facets of A. marginale research. Strain classification enables a comprehensive study of the extensive worldwide diversity of A. marginale. As reported herein, development of a unified nomenclature of MSP1a from A. marginale strains based on all available sequence data allowed for review and characterization of the worldwide genetic diversity of A. marginale. The information generated from these studies will be fundamental toward understanding the functional and immune relevance of A. marginale MSP1a and in formulating vaccines that will be cross-protective among these diverse strains.

Results and Discussion

Classification of A. marginale Strains Using MSP1a Sequence Data

In this study we propose a unified nomenclature for the classification of A. marginale strains based on the sequences of the MSP1a tandem repeats and the 5′-UTR microsatellite. This approach was supported by the following considerations: (i) the availability of numerous A. marginale MSP1a sequences in GenBank, (ii) the fact that MSP1a is encoded by a single-copy gene [1], (iii) the tandem repeat structure and sequence vary among strains from different geographic locations, while the remaining portion of the protein is highly conserved [14], (iv) the tandem repeat structure is a stable genetic marker that is conserved within a strain during the acute and persistent chronic phases of the A. marginale infection in cattle and after passage and transmission by ticks [1], (v) the tandem repeats contain functional domains that serve as adhesins for bovine erythrocytes and tick cells, a prerequisite for infection of host cells [10,36], (vi) the tandem repeats contain relevant B cell epitopes and neutralization epitopes important for natural or induced immune protection in cattle [6,31], and (vii) a microsatellite which has been implicated in the regulation of MSP1a expression levels is located in the 5′-UTR of the msp1a gene [25].

In this study, 193 different MSP1a tandem repeats were identified, 79 of which were published in GenBank but not formally classified (Fig. 1). Two new microsatellite structures were described in our analysis and named J and K (J: m = 1, n = 8, d = 21; K: m = 2, n = 8, d = 25) after Estrada-Peña et al. [25]. Unique A. marginale strains (224; 77% of all sequences found) are based on differences in geographic location, the number and structure of the MSP1a tandem repeats and microsatellites when available. These A. marginale strains came from 17 world regions providing a global MSP1a diversity (Fig. 2), and were classified following our proposed nomenclature (Table S1). The majority of A. marginale strains had more than one MSP1a tandem repeat and the maximum number of repeats was 10. No strains were reported with 9 tandem repeats (Table S1 and Fig. 3). Table 1 provides a list of the most commonly reported strains and tandem repeats. The majority of strains were seen in only a given region, although several strains were isolated from multiple South American countries (Argentina/Chaco/2 (t, 22, 13, 18) from Argentina and Mexico; Brazil/Parana/2 (t, 10, 15) from Brazil and Argentina; Mexico/Pichucalco/E - (a, b, b, C2) from Argentina, Brazil and Mexico; and Mexico/Tamaulipas/2 (64, 65, D, 65, E) from Mexico and Venezuela). The strain Argentina/Santa Fe/2 (a, b3, C) was the only strain found in more than one continent, and was reported in Argentina, Mexico, and Taiwan. Most of the MSP1a tandem repeats were shared between different strains, and repeat B, the most common tandem repeat sequence, occurred in 43 strains (Table 1). While some tandem repeats were unique to one country (repeat 72 was only reported in Brazil) or continent (repeat B was found throughout the American continent), some repeats appeared to be distributed worldwide (repeat M was reported in Israel, Italy, USA and South America). This weak association between specific tandem repeat sequences and particular geographic regions was reported previously by de la Fuente et al. [14] and may be attributed to worldwide cattle movement, among other factors. Notably, in Australia, in which introduction of cattle has been limited, only one MSP1a genotype has been reported [37].

The Biological Implications of Sequence Variation of MSP1a Tandem Repeats

The tandem repeated portion of the N-terminal region of the A. marginale MSP1a has been shown to be an adhesin for bovine erythrocytes and tick cells, and thus is involved in pathogen infection of host cells and transmission by ticks [10,36,38]. In contrast, the MSP1a N-terminal tandem repeats are absent in A. marginale subsp. centrale. Although A. centrale can be transmitted by Rhipicephalus simus, the tick species from which this organism was initially isolated, this Anaplasma sp. cannot be transmitted by other tick species that are known to be A. marginale vectors [20,39].

These analyses provided information on the range and frequency of variations in the A. marginale MSP1a tandem repeats. Herein, we present the sequence variation data and discuss biological implications of these findings, including O-glycosylation, amino acids at position 20 for binding to tick cell extract (TCE), protein conformation, pathogen-environmental relationships, and combination of these factors.

O-glycosylation. MSP1a tandem repeats were found to have a high variability across almost all the 31 amino acid positions, suggesting considerable evolutionary pressure on this molecule (Fig. 4A). Four positions were totally conserved: serine (S)4 and S25, alanine (A)22 and glycine (G)31 (Fig. 4A). MSP1a has been shown to be O-glycosylated, with S/threonine (T) regions present in the tandem repeats as the target site for this type of glycosylation [31]. Furthermore, the binding capacity of MSP1a to tick cells diminished after deglycosylation [31]. The conservation of S4 and S25 among all the tandem repeats included in this study could indicate that the O-glycosylation at these two positions is highly relevant for A. marginale infection. Several bacterial glycoproteins have also been reported to play a role in bacterial adhesion, invasion and pathogenesis [40].
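The per-position variability and the notion of a totally conserved position can be illustrated with a short sketch. It uses the formula given in the Figure 4 caption of this chapter (number of different amino acids at a position divided by the frequency of the most common amino acid at that position). This is only an illustration under stated assumptions, not the authors' analysis pipeline: the function names, the toy aligned sequences, the treatment of frequency as a fraction, and the decision to ignore gap characters are all hypothetical choices.

```python
from collections import Counter

def position_variability(aligned_repeats, gap="-"):
    """Return one variability value per alignment column.

    Variability = (number of different amino acids at the position)
                  / (frequency of the most common amino acid there).
    Frequency is taken as a fraction of the non-gap residues (assumption).
    """
    length = len(aligned_repeats[0])
    if any(len(r) != length for r in aligned_repeats):
        raise ValueError("all repeats must be aligned to the same length")
    values = []
    for col in range(length):
        residues = [r[col] for r in aligned_repeats if r[col] != gap]
        if not residues:
            values.append(0.0)  # column made only of gaps
            continue
        counts = Counter(residues)
        top_fraction = counts.most_common(1)[0][1] / len(residues)
        values.append(len(counts) / top_fraction)
    return values

def conserved_positions(aligned_repeats, gap="-"):
    """1-based positions where all non-gap residues are identical."""
    scores = position_variability(aligned_repeats, gap)
    return [i + 1 for i, v in enumerate(scores) if v == 1.0]

# Hypothetical toy alignment (not real MSP1a repeats).
toy = ["ADSSG", "TDSSG", "ADSAG", "AESSG"]
variability = position_variability(toy)
conserved = conserved_positions(toy)
```

In this toy alignment, columns 3 and 5 carry a single residue and therefore score exactly 1, the minimum of the formula; any column with two or more residues scores higher.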


Figure 1. MSP1a tandem repeat sequences in A. marginale strains. The one-letter amino acid code was used to depict MSP1a repeat sequences. Dots indicate identical amino acids and gaps indicate deletions/insertions. The ID of each repeat form follows the nomenclature proposed by de la Fuente et al. (2007) [14]. Sequences 114 through 161 are newly classified. doi:10.1371/journal.pone.0065243.g001
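The dot notation described in the caption can be expanded back into full sequences with a few lines of code. The sketch below assumes that each dot stands for the residue found in the same column of a reference repeat and that `-` marks a gap; the function name `expand_dots` and the toy sequences are hypothetical and are not taken from Figure 1.

```python
def expand_dots(reference, row, dot=".", gap="-"):
    """Replace each dot in `row` with the reference residue in that column.

    Gap characters and explicit residues are copied unchanged.
    """
    if len(reference) != len(row):
        raise ValueError("row must be aligned to the reference")
    return "".join(ref if ch == dot else ch for ref, ch in zip(reference, row))

# Toy reference and rows for illustration only (not real repeat sequences).
reference = "ADSSSAGG"
rows = {"r1": "T.......", "r2": ".E..-..."}
expanded = {name: expand_dots(reference, row) for name, row in rows.items()}
```

Expanded rows of this kind are what a per-position comparison needs as input, since dots carry no residue information on their own.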

Relevance of amino acids at position 20 for binding to tick both tick-transmissible and not transmissible A. marginale strains cell extract (TCE). Within the MSP1a tandem repeats, the and, at least for some strains, the presence of TCE-binding with negatively charged amino acids, aspartic acid (D) and glutamic tandem repeats correlated with strains that were transmissible by acid (E), at position 20 were shown to be essential for binding of Dermacentor sp. ticks [10]. In all strains, the first MSP1a tandem MSP1a to TCE. When glycine (G) was located at position 20, repeat (R1) contained 67 (34.7%) different sequences. However, binding was not observed [10]. This result suggested that the R1 tandem repeats had less amino acid variability and 6 conserved amino acid at position 20 may be essential for A. marginale binding positions when compared to non-R1 tandem repeats, in which to tick cells, a prerequisite for pathogen infection and transmission only 4 conserved amino acid positions were found (Fig. 4B). These by ticks. In fact, previous experiments confirmed the existence of results suggested that the R1 tandem repeat may play a role in A.

-20- Chapter II.I

Figure 2. World A. marginale MSP1a molecular map. The worldwide molecular characterization of A. marginale MSP1a sequences is shown. The number of A. marginale strains (S), tandem repeats (TR), tandem repeat 2D structures (TR-2D), functional tandem repeats (FTR) containing D and E at position 20 and B cell epitope types (BCE) and microsatellites (MS) are represented for each country. Primary data is depicted in figures 1, 3 and 6. The information on 59 UTR microsatellites is not available (NA) for some sequences. doi:10.1371/journal.pone.0065243.g002 marginale infection and transmission. We found 87 tandem repeats analysis revealed that the amino acid at position 20 correlated with containing D20 (71%) or E20 (29%) (Fig. 1). In total, 161 A. specific 2-D structure changes in the tandem repeat. When D or E marginale strains contained one of these tandem repeats at least amino acids were at this position, the structure of the tandem once and in 114 (71%) of these strains, the D20 or E20 was found repeat was predominantly long a-helical structures (Model types in the R1 tandem repeat. Surprisingly, the highest variable amino 39, A, 13 and s), but when a G was in this position, the repeat was acid was at position 20 (Fig. 4A), suggesting greater evolutionary a short a helix, b-strand or coiled 2-D structure (Model types 4, pressure at this amino acid position. From our findings, G was the 10, a and 48) (Fig. 5). The other four amino acids that were found most frequent amino acid at position 20 (Fig. 4C), in both R1 and at lower frequencies at position 20, (I, Y, T and S; Fig. 4C), except non-R1 tandem repeats (data not shown), but only 4 amino acids for Y, retained the a-helical 2-D structure (Fig. 5). were found at position 20 in R1 (from highest to lowest frequency: Our results suggest that the MSP1a tandem repeat 2-D G, D, E and serine [S]) while 7 different amino acids were found at structure also correlated with tick transmissibility (Table 2). 
Strains position 20 in non-R1 tandem repeats (G, D, E, S, T, isoleucine [I] reported previously that were not transmitted by Dermacentor sp. and tyrosine [Y]) (Fig. 4C). In previous experiments, non-R1 had a predominant pattern for 2-D structure of tandem repeats of tandem repeats had a phylogenetic correlation with tick-transmis- b strand, short a-helix or coiled structures, regardless of whether sible strains, but this correlation was not seen with R1 tandem or not they had TCE-binding tandem repeats (Table 2). In repeats [9]. We propose that non-R1 tandems are also involved in contrast, abundant a-helices were found in tandem repeats of A. marginale-tick interactions which require more genetic variabil- strains transmitted by ticks (Table 2). In the last case, as shown for ity, because more than 20 different tick species have been reported the USA/Florida/G - (A, B7) strain, the presence of all seven to transmit A. marginale [24]. TCE-binding tandem repeats did not correlate with tick- Protein conformation. As proposed previously both amino transmissibility; this Florida isolate was clearly shown to be non- acid sequence and protein conformation may contribute to the infective for ticks or cultured tick cells (Table 2). However, the 2-D function of MSP1a as adhesin [10]. Herein, we explored this structure appeared to be a determinant for the biological hypothesis by predicting the 2-D structure of all the MSP1a transmission of A. marginale, because the Israel/Israel tailed/F - tandem repeats. We found that 14 models explained all of the (1, F, M, 3) strains, while not having TCE-binding repeats but did variability of 2-D structure among the 193 tandem repeats (Fig. 5). have a-helices as 2-D structure, were tick transmissible (Table 2). Three a-helical 2-D structure models, differing in the length and As listed in Table 2, the data collected thus far regarding A. 
amount of a helixes in the tandem repeat, described 68% of the 2- marginale transmissibility by ticks is related to the major vector D structure variation (presented as A, s and F in Fig. 5). The Dermacentor sp. The complexity of the relationship between the 2-D

-21- Chapter II.I

Figure 3. Number of tandem repeats among A. marginale strains. The strains classified in our study were organized by the number of MSP1a tandem repeats. The percent of A. marginale strains (external numbers) containing different numbers of tandem repeats (internal numbers) is shown. The most common numbers of MSP1a tandem repeats among strains were 3 (yellow), 4 (light blue) and 5 (violet). doi:10.1371/journal.pone.0065243.g003

structure, TCE-binding repeats and tick transmissibility was also seen with the Brazil/Minas Gerais/E – (13, 42, 13, 18) strain, which does not contain β-strands and is not transmissible by Rhipicephalus (Boophilus) microplus [13]. This example demonstrated a different pattern from that observed with A. marginale strains that are not transmissible by Dermacentor sp. The 2-D structure data presented in the present study are in agreement with an analysis performed recently on A. marginale MSP2 variants in tick or mammalian cells [41]. The 2-D structure analysis using PSIPRED demonstrated that MSP2 variants expressed in ticks were predominantly α-helices, while β-strands were present in MSP2 variants expressed only in mammalian cells [41,42].

Pathogen-environmental relationships. A. marginale was recorded in four eco-region clusters defined in our study (Table 3). Eco-region Cluster 1 extended over large areas of central Africa and central South America, primarily Argentina and southern Brazil, and was a region with medium to high Normalized Difference Vegetation Index (NDVI) values and a well-defined seasonal decrease between June and September. The highest recorded temperature and annual rainfall of approximately 1,000 mm occur in Eco-region Cluster 1. Eco-region Cluster 2 included vast areas of the Mesoamerican corridor, northern South America and a small territory of eastern South Africa, and included zones with high NDVI throughout the year without seasonal variability. The temperature values in Eco-region Cluster 2 were similar to those in Eco-region Cluster 1, but with an annual rainfall of approximately 1,500 mm. Eco-region Cluster 3 extended over central South Africa and scattered parts of the

Table 1. Geographical occurrence of the most common A. marginale strains.

Strains | Structure of MSP1a tandem repeats | Number of strains | World occurrence
Most common | t 22 13 18 | 7 | 4x Argentina, 3x Mexico
Most common | a b bbC | 7 | 4x Argentina, 2x Mexico, 1x Taiwan
Second common | 34 13 4 37 | 6 | 6x South Africa
Third common | B B M | 5 | 5x Argentina
Third common | F M M | 5 | 4x Argentina, 1x Mexico

The most frequent A. marginale strains and their geographical occurrence are shown. The most common tandem repeats found among all the A. marginale strains are underlined; they were found more than 60 (M), 80 (b) and 90 (B) times. doi:10.1371/journal.pone.0065243.t001



Figure 4. Amino acid variability and frequency in A. marginale MSP1a tandem repeats. The amino acid variability (A), comparison of the variability between tandem repeats at positions R1 and non-R1 (B) and frequency (C) were calculated per amino acid position in the MSP1a tandem repeats using the formula Variability = number of different amino acids at a given position/frequency of the most common amino acid at that position [50]. The one letter amino acid code was used to name the amino acids in (C) and the most frequent amino acids per position are colored in gray. doi:10.1371/journal.pone.0065243.g004

southern USA and Mexico, and had the lowest NDVI values with minimal change across the year. This eco-region had lower temperature values and minimum rainfall. Finally, Eco-region Cluster 4 extended over large areas of the USA and had a clear NDVI signature that was low between November and March and then rose to maximum levels in July. This area was the coldest among the four eco-region clusters, with an annual rainfall of approximately 800 mm. The results of this study demonstrated that 82% of MSP1a R1 unique sequences were associated with only one eco-region cluster (Table 3). Seventeen R1 unique sequences (27% of the total number of R1 sequences) were reported exclusively in Eco-region Cluster 1 and shared 16 out of 31 amino acids (51.6% of the total number of amino acids) (Table 3). Sixteen R1 unique sequences (17%) were reported only in Eco-region Cluster 2, which had 64.5% identical amino acids (Table 3). Twenty-five R1 unique sequences (32%) were only found in Eco-region Cluster 3, of which 64.5% of their amino acids were shared (Table 3). Only five R1 sequences were exclusively associated with Eco-region Cluster 4, which had 77.4% identical amino acids (Table 3). Eight R1 sequences were found simultaneously in more than one of the eco-region clusters (Table 3). These results confirmed that A. marginale MSP1a R1 sequences clustered according to a pattern of abiotic (climate) factors, and are related to both the species of tick vector and the performance of this tick vector in the eco-region [25]. Higher variability in R1 repeat sequences appeared in areas where several tick species are candidate vectors (i.e. USA and Canada) or where mechanical transmission is common (i.e. central Argentina). Remarkably, only one A. marginale MSP1a genotype has been recorded in Australia (Table S1) along with a single tick vector species, Rhipicephalus australis [43]. As reported previously, the hypothesis of strain geographic association was rejected [25]. Mantel's test on R1 sequences was 0.82 (P<0.001) when applied to eco-region clusters using only unique sequences. The same test provided a value of 0.31 (P = 0.145) for the distance matrix based on geographical association of strains. All the A. marginale MSP1a R1 sequences within each eco-region cluster appeared to be under positive selection as shown by dN/dS indexes of 1.83, 1.61, 1.54 and 1.21 for Eco-region Clusters 1 to 4, respectively. Therefore, these results confirmed the hypothesis that A. marginale strains are associated with factors that drive the biological performance of tick vectors in each region [25].

Influence of a combination of factors. A phylogenetic correlation was found among A. marginale strains between MSP1a tandem repeats 2-D structure, transmissibility by ticks and the presence of TCE-binding tandem repeats (Fig. 6). Notably, cluster b contains all non-tick-transmissible A. marginale strains, abundant β-strand tandem repeat 2-D structure, and a low proportion of TCE-binding repeats (Fig. 6). The exception to this rule is the USA/St. Maries/G – (J, B2) strain, which is tick-transmissible [34,44] but falls into this cluster. This position of the USA/St. Maries/G – (J, B2) strain in the phylogenetic tree suggests that A. marginale tick-transmissible strains may evolve from non-tick-transmissible strains. The cluster a-2 contains tick-transmissible strains with the highest proportion of α-helices and all TCE-binding tandem repeats. In contrast, strains in cluster b-a-c have a more variable 2-D structure and a high proportion of TC non-binding tandem repeats. The high β-strand content and short α-helices in MSP1a tandem repeats appear to be associated with a non-tick-transmissible phenotype, similar to the results reported recently in an MSP2 sequence study [41]. However, variable 2-D structures such as those in cluster b-a-c may be required in order to bypass the absence of TCE-binding tandem repeats and maintain the tick-transmission phenotype. The presence of TCE-binding tandem repeats could contribute to the organization of the MSP1a molecule, as seen in cluster a-1, where a high content of α-helices correlated only with the presence of TCE-binding tandem repeats. Additionally, the analysis using the GeneSilico Metaserver predicted that tandem repeats have a protein disorder across the whole tandem repeat (data not shown). Intrinsically disordered proteins demonstrated better molecular recognition due to a higher specificity, larger interacting surfaces and different folding patterns upon binding [45].

Analysis of B Cell Epitope in MSP1a Tandem Repeats

Variation in A. marginale outer membrane proteins, such as MSP1a, is a major challenge in developing vaccines that can provide cross-protection between the diversity of strains worldwide. MSP1a has long been investigated as a vaccine candidate [68,32–34] due to the presence of a conserved neutralization-sensitive B-cell epitope at position 20–26 of tandem repeats [6,29]. However, a study [31] of the antibody response to the strain USA/Oklahoma/G - (K, C, H) demonstrated that after vaccination with whole A. marginale or recombinant MSP1a, a different MSP1a B-cell epitope was immunodominant, SSAGGQQQESS, a linear epitope at amino acid positions 4 to 14 of the tandem repeat. As the antibody response is of principal importance in anaplasmosis, strain to strain variation in tandem repeat B-cell epitopes would be an important consideration in development of an MSP1a recombinant vaccine [46–48]. We therefore characterized the diversity of the immunodominant position 4–14 B-cell epitope among sequenced strains.

This epitope showed high sequence variability among all MSP1a sequences reported to date (Fig. 4A). From the 172 MSP1a tandem repeats included in the B-cell epitope analysis, 53 sequence variants were found; nevertheless 5 of those variants covered 64% of the total epitope variability (Figs. 7A and 7B). These 5 variants formed 2 phylogenetic clusters (Fig. 7C); variants in cluster 2 share the same antibody recognition site, while those in cluster 1, types 1 and 11, have different antibody recognition sites (data not shown). All B-cell epitope types were surface exposed (data not shown) as was previously predicted for the Type 1 B-cell epitope using the TMHMM2 algorithm [31].

Seven of the 53 B-cell epitope variants gave a 0 score in both B-cell epitope prediction servers BCEPRED and BCPREDS (data not shown), suggesting that some amino acid changes in the immunodominant B-cell epitope (amino acids 4–14) could be the determining factor for the loss of this epitope. Analysis by VaxiJen, a predictor of protective antigens [49], demonstrated that the highest VaxiJen score belongs to the type model B-cell epitope, while types 1, 10, 11 and 17 have VaxiJen scores lower than the type model but higher than the average for all 53 epitopes (Fig. 7D). Among the main types of B-cell epitopes, a linear but negative correlation was observed between VaxiJen and



Figure 5. Changes in putative 2D structure and disorder analysis of A. marginale MSP1a tandem repeats. The PSIPRED web server was used to predict the 2D structure. The tandem repeats were grouped into fourteen 2D structure models. Tandem repeats shown represent prototypes of the corresponding tandem repeat 2D structures. The second column (model presented) shows the ID of the tandem repeat presented as the prototype. Model IDs in red represent tandem repeats in the R1 position (the first tandem repeat in the MSP1a sequence). doi:10.1371/journal.pone.0065243.g005
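The Mantel statistics reported in the Results for R1 sequences (0.82 against eco-region clusters, 0.31 against geographic association) compare two distance matrices built over the same set of sequences. As a rough illustration only, the sketch below implements a simple permutation Mantel test in plain Python. The source does not state which correlation coefficient, number of permutations or software was used, so the Pearson correlation, the 999 permutations, the function names and the toy matrices are all assumptions made for this example.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the cells above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices of equal size.

    Rows and columns of dist_b are permuted together; p is the share of
    permutations whose correlation is at least as large as the observed
    one (one-sided test, with the usual +1 correction).
    """
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical toy data: pairwise sequence distances for four sequences
# and a 0/1 matrix saying whether two sequences come from different
# eco-region clusters. These are made-up numbers, not study data.
sequence_dist = [[0, 1, 4, 5], [1, 0, 3, 4], [4, 3, 0, 1], [5, 4, 1, 0]]
ecoregion_dist = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
r, p = mantel_test(sequence_dist, ecoregion_dist)
print(f"Mantel r = {r:.2f}, permutation p = {p:.3f}")
```

With only four items the number of distinct permutations is small, so the p-value from this toy input is coarse; the example is meant to show the mechanics, not to reproduce the reported values.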

BCPREDS scores and between Blastp and BCPREDS scores (Figs. 7E and 7F), suggesting a relationship between sequence identity and immune properties among the B-cell epitopes. Overall, these results suggested that different immune properties exist among the different MSP1a types of the B-cell epitopes.

As this is an immunodominant epitope [31], tandem repeats with epitopes predicted to be recognised by different antibodies could be a factor in the frequent lack of cross-protection between heterologous strains. Conversely, strains which share the same type of antibody recognition site may be more likely to be cross-protective.

A correlation (R² = 0.69) was found between the number of 2-D structure models present in a given geographic location and the number of B-cell epitope types in the same region (Fig. 2). Therefore, we explored the hypothesis that there was a link between 2-D structure and B-cell epitopes among the MSP1a tandem repeats. An α-helical structure was seen in 88% of the tandem repeats containing type 1 B-cell epitopes and in 100% of tandem repeats containing types 10, 11 or 17 B-cell epitopes. In contrast, 69% of the tandem repeats containing type model B-cell epitopes had β-strand structures. Interestingly, a correlation was found between tick transmissibility and the type of B-cell epitopes present on MSP1a repeats, possibly due to these structural differences between epitope types. 71% of the MSP1a tandem repeats present in non-tick-transmissible A. marginale strains were found to have type model B-cell epitopes, whereas 87% of the tandem repeats in tick-transmissible strains contained type 1 B-cell epitopes. These data suggest antigenic differences between tick-transmissible and non-transmissible A. marginale strains, and agree with the finding that type 1 and model type epitopes fall into different phylogenetic clusters (Fig. 7C) presenting different putative antibody recognition sites. Both epitopes had the highest VaxiJen and BCPRED scores among the 5 most common B-cell epitopes, but shared low identity as shown by Blastp score (data not shown).

Collectively, the results of these studies demonstrate that the unified nomenclature proposed herein using MSP1a sequences provides information about A. marginale strain world distribution, transmissibility by ticks, infective potential, antigenic variability and putative utility for MSP1a vaccine development. The structural and immune analyses of MSP1a revealed a phylogenetic correlation between A. marginale tick transmissibility, the 2-D structure adopted by the tandem repeats and the type of B-cell epitopes present in the tandem repeats. These results are fundamental information for the design of MSP1a structure-based vaccines which would be cross-protective against multiple A. marginale strains, and for the development of serodiagnostic methods based on differential B-cell epitopes for epidemiological characterization of field strains.

Table 2. Effect of putative MSP1a tandem repeat 2-D structure on A. marginale tick transmission phenotype.

Strains | MSP1a tandem repeats 2D structure | Transmission by ticks: Dermacentor spp. | R. microplus | R. sanguineus or H. excavatum
USA/Idaho/C - (D5,E) | (a-a, a-a, a-a, a-a, a-a, a-a) | Yes(*) | ND | ND
Puerto Rico/Puerto Rico/C - (E, w5) | (a-a, a-a, a-a, a-a, a-a, a-a) | Yes(***) | Yes(***) | ND
USA/Virginia/G - (A, B) | (a-a, b-a) | Yes(*) | ND | ND
USA/St.Maries/G - (J, B2) | (a-a, b-a, b-a) | Yes(*) | Yes(***) | ND
USA/Oklahoma/G - (U) | (a-a) | Yes(+) | ND | ND
USA/Missisippi/D - (D4,E) | (a-a, a-a, a-a, a-a, a-a) | Yes(*) | ND | ND
USA/Rassmusen/2 (A, F, H) | (a-a, a-c, a-c) | Yes(*) | ND | ND
USA/Kansas/2 (E, M, w) | (a-a, a-c, a-a) | Yes(2) | ND | ND
Nigeria/Zaria/2 (54, 55, F) | (b-b, a-c, a-c) | Yes(**) | ND | ND
Israel/Israel tailed/F - (1, F, M, 3) | (a-c, a-c, a-c, a-c) | ND | Yes(****) | Yes(****)
Israel/Israel non tailed/G - (1, 4) | (a-c, a-b) | ND | Yes(****) | No(****)
USA/Florida/G - (A, B7) | (a-a, b-a, b-a, b-a, b-a, b-a, b-a, b-a) | No(*) | ND | ND
USA/California/G - (B2,C) | (b-a, b-a, b-c) | No(*) | ND | ND
USA/Okeechobee/G - (L, B, C, B, C) | (a-a, b-a, b-c, b-a, b-c) | No(*) | ND | ND
USA/Illinois/G - (M, N, B, M, H) | (a-c, a-a, b-a, a-c, a-c) | No(*) | ND | ND

The information about transmission of A. marginale strains by ticks was collected from (*) de la Fuente et al. (2003) [10], (**) Zivkovic et al. (2007) [65], (***) Futse et al. (2003) [44], (****) Shkap et al. (2009) [39], (2) Leverich et al. (2008) [66], and (+) Barbet et al. (2001) [67]. TCE-binding tandem repeats are underlined. Abbreviation: ND, not determined. doi:10.1371/journal.pone.0065243.t002


Table 3. Association of A. marginale MSP1a R1 sequences with world ecoregions.

Ecoregion | R1 sequences(a) | Other R1 sequences(b)
1: central Africa and central South America, primarily Argentina and southern Brazil | M, 4, 8, 12, 16, 56, 60, 64, 67, 69, 72, 78, 93, 132, c, p, t | A, B, D, T, 13, 23, a
2: Mesoamerican corridor, northern South America and a small territory of eastern South Africa | E, F, 28, 37, 48, 53, 54, 84, 85, 101, 117, 121, 126, 129, 136, e | A, B, L, T, 13, 23, a
3: central South Africa and scattered parts of southern USA and Mexico | M, O, Q, U, 1, 3, 5, 6, 7, 27, 33, 34, 39, 40, 42, 74, 77, 82, 141, 142, 143, 147, 151, 154, 155 | A, D
4: USA | I, J, K, O, U, 19 | A, B, L, a

World ecoregions were built upon temporal series of NDVI values. (a)R1 sequences recorded in one ecoregion only. (b)R1 sequences that have been reported in other ecoregions. doi:10.1371/journal.pone.0065243.t003

Methods

Anaplasma marginale Strains Classification

A total of 289 A. marginale MSP1a sequences with complete tandem repeat regions included in this study were obtained from published research and the GenBank sequence database [http://www.ncbi.nlm.nih.gov/]. These sequences were analyzed and classified, and the tandem repeats were named (or renamed) following the nomenclature proposed by Allred et al. [6] and de la Fuente et al. [14]. When microsatellite sequences were included in the msp1a published nucleotide sequence, they were used to assign a genotype following the system of Estrada-Peña et al. [25]. Briefly, the 5′-UTR microsatellite located between the putative Shine-Dalgarno (SD) sequence (GTAGG) and the translation initiation codon (ATG), GTAGG (G/A TTT)m(GT)nT ATG (microsatellite sequence is shown in bold letters) and the SD-ATG distance (d) calculated in nucleotides as (4 × m) + (2 × n) + 1 were used. We propose one nomenclature for A. marginale strains based on MSP1a with the following structure: country/locality/microsatellite genotype - (structure of tandem repeat), and all MSP1a sequences were classified using this nomenclature. When multiple strains had 100% amino acid sequence similarity across tandem repeats, they were listed under one strain name, with geographical information taken from the isolate with the most complete information. When this information was equal between isolates, information was used from the isolate first submitted to GenBank.

Amino Acid Variability within MSP1a Tandem Repeats

Tandem repeat sequences were aligned using Clustalw, and each amino acid position was numbered from 1 to 31. The amino acid variability was determined using the formula of Kuby et al. [50]. The variability was equal to the number of different amino acids at a given position/frequency of the most common amino acid at that position.

Correlation Analysis between MSP1a Tandem Repeats and World Ecological Regions

The analysis was conducted as described previously, assuming that (i) eco-regions could be delineated by quantitative abiotic characters based on well-recognized and repeatable attributes and (ii) A. marginale strains were associated with each eco-region and subjected to different environmental conditions that could be analyzed by multivariate geographic clustering [25]. The feature selected to build the eco-regions was the NDVI, which is a variable that reflects vegetation stress and summarizes information about the ecological background for the performance of tick populations [25]. A 0.1° resolution series of monthly NDVI data was obtained

Figure 6. Phylogenetic tree based on MSP1a tandem repeat amino acid sequences. The MSP1a sequences from tick-transmissible and non-transmissible strains (Table 2) were included in the phylogenetic analysis. The phylogenetic tree was reconstructed using the neighbor joining and maximum likelihood methods. Reliability of internal branches was assessed using the bootstrapping method with 1000 bootstrap replicates. Bootstrap values are shown as % on the internal branches. The tree shows four phylogenetic clusters containing different patterns of MSP1a tandem repeat 2D structures. Cluster b-a-c (blue), cluster a-1 and cluster a-2 (beige) contain tick-transmissible A. marginale strains, while the non-tick-transmissible strains fall in cluster b (red). doi:10.1371/journal.pone.0065243.g006
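The variability index used in the Methods and in Figure 4 (number of different amino acids at a position divided by the frequency of the most common amino acid at that position) can be illustrated with a short sketch. It assumes that "frequency" means the fraction of aligned repeats carrying the most common residue, as in the classical Wu-Kabat index; the function name and the toy fragments are hypothetical and are not taken from the study data.

```python
from collections import Counter


def position_variability(repeats):
    """Return the variability index for each column of aligned repeats.

    Variability = (number of different amino acids at the position)
                  / (frequency of the most common amino acid there).
    Frequency is taken as a fraction of the sequences (an assumption).
    Gap characters, if any, are counted like any other symbol.
    """
    if not repeats:
        raise ValueError("need at least one aligned repeat")
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("repeats must be aligned to the same length")

    result = []
    for column in zip(*repeats):
        counts = Counter(column)
        most_common_count = counts.most_common(1)[0][1]
        frequency = most_common_count / len(column)
        result.append(len(counts) / frequency)
    return result


# Made-up four-residue fragments, used only to exercise the function.
toy_repeats = ["SSAG", "SSAG", "STAG", "SQAD"]
for position, value in enumerate(position_variability(toy_repeats), start=1):
    print(position, round(value, 2))
```

A fully conserved column gives the minimum value of 1, and the value grows both with the number of distinct residues and with how evenly they are spread, which is why a few highly variable positions stand out in a profile such as Figure 4A.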


Figure 7. B-cell epitope analysis in A. marginale MSP1a tandem repeats. The B-cell epitopes were predicted using BCPRED server. The type 1 B-cell epitope was used as reference (Model) for comparisons. (A) Clustalw alignment and amino acid changes in the 5 most represented MSP1a tandem repeat B cell epitopes. B-cell epitope types model (light violet), 1 (blue), 10 (yellow), 11 (dark violet) and 17 (red) are shown. (B) Percent of tandem repeats containing each type of B cell epitopes. (C) Neighbor joining phylogenetic tree based on B cell epitope amino acid sequences showing the two clusters formed by the 5 most represented B cell epitopes. Cluster-1: Types 1 and 11 and Cluster-2: Types Model, 10 and 17. Correlations between VaxiJen/Blastp (D), BCPRED/Blastp scores (E) and VaxiJen/BCPRED (F) scores are shown. These correlations suggest that the epitopes with higher homology (Blastp score) share in common the immunogenic properties represented by VaxiJen/BCPRED. doi:10.1371/journal.pone.0065243.g007

for the period 1986–2006. The 12 averaged monthly images were subjected to Principal Components Analysis (PCA) to obtain decomposition into the main axes representing the most significant, non-redundant information. The strongest principal axes were chosen using Cattell's Scree Test [25]. The PCA analysis retained three principal axes, including 92% of the total variance. A hierarchical agglomerative clustering on PCA values was then used to classify multiple geographical areas into a single common set of discrete regions. Mahalanobis distance was used as a measure of dissimilarity and the weighted pair-group average was used as the amalgamation method. A value of 0.05 was used as the cut-off probability for assignment to a given eco-region.

Bioinformatics

Secondary structure was predicted using the position-specific scoring matrices method [51] from the PSIPRED server [52], and protein disorder was predicted using the GeneSilico Metaserver [53].

The immunodominant B-cell epitope SSAGGQQQESS (amino acid positions 4–14), previously mapped in the A. marginale strain USA/Oklahoma/G - (K, C, H) MSP1a sequence [31] will be referred to as epitope ‘‘Type 10. The variability among MSP1a tandem repeats within this B-cell epitope (amino acid positions 4–14) was evaluated. The percent of amino acid identity and Blastp score among the B-cell epitopes had a linear correlation (R² = 0.85), so the Blastp score was used as an identity index in the analysis. Prediction/score of B-cell epitope was determined using BCPREDS server [54] and the protective potential of the B-cell epitope was predicted using the VaxiJen server [55]. Prediction of physicochemical properties of the B-cell epitope was assayed using BCEPRED server [56]. PepSurf algorithm [57], implemented in the PEPITOPE server [58], was used to determine the structure/position of the affinity-selected B-cell epitopes in a model protein. The 3D analysis of MSP1a tandem repeat B-cell epitopes was performed using a model of the crystal structure of the Fv corresponding with the anti-blood group A antibody AC1001 (PDB ID: 1JV5) [59].

For phylogenetic analysis, sequences were aligned with MUSCLE (v3.7) configured for the highest accuracy [60]. After alignment, ambiguous regions (i.e., containing gaps and/or poorly aligned) were removed with Gblocks (v0.91b) [61]. The phylogenetic tree was reconstructed using the neighbor joining (NJ) and maximum likelihood methods implemented in PHYLIP package (v3.66), NJ distances were calculated using FastDist [62,63]. Reliability for internal branch was assessed using the bootstrapping method (1000 bootstrap replicates). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) [64].

Supporting Information

Table S1 Classification of A. marginale strains based on the proposed nomenclature. A total of 289 MSP1a

sequences were analyzed. A total of 224 unique A. marginale strains were classified using the nomenclature proposed in our study: Country/Locality/microsatellite genotype - (structure of tandem repeat). The 5′-UTR microsatellite genotype was included when available. The structure of tandem repeats was represented following the nomenclature previously proposed [14] (Fig. 1). When the same repeat was present more than one time, a superscript was used to represent the copy number for this repeat. (PDF)

Author Contributions

Conceived and designed the experiments: JdlF AC-C JJV LMFP LG. Performed the experiments: JJV RK KL JF MT AE-P AC-C. Analyzed the data: JJV RK KL JF MT AE-P AC-C. Contributed reagents/materials/analysis tools: EZ VS MR. Wrote the paper: JdlF KMK JF RK KL JJV AE-P MT AC-C.
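As a small worked illustration of the classification scheme described in the Methods, the sketch below computes the SD-ATG distance from the microsatellite repeat counts, d = (4 × m) + (2 × n) + 1, and assembles a strain name in the form country/locality/microsatellite genotype - (structure of tandem repeat). The function names are hypothetical, and the mapping from m and n to a genotype letter (defined in [25]) is deliberately not reproduced here.

```python
def sd_atg_distance(m, n):
    """SD-ATG distance in nucleotides: d = (4 * m) + (2 * n) + 1.

    m counts the (G/A TTT) units and n counts the (GT) units of the
    5'-UTR microsatellite.
    """
    if m < 0 or n < 0:
        raise ValueError("repeat counts must be non-negative")
    return 4 * m + 2 * n + 1


def strain_name(country, locality, genotype, repeats):
    """Format a strain name following the proposed nomenclature.

    `genotype` is the microsatellite genotype letter and `repeats` is
    the ordered list of tandem repeat names.
    """
    return f"{country}/{locality}/{genotype} - ({', '.join(repeats)})"


# m = 2 and n = 7 are arbitrary example counts, not a documented genotype.
print(sd_atg_distance(2, 7))
# Rebuilds the name of a strain mentioned in the text.
print(strain_name("USA", "Oklahoma", "G", ["K", "C", "H"]))
```

Keeping the distance calculation separate from the name formatting mirrors the text, where the genotype is assigned first and the strain name is then composed from it.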

References

1. Kocan KM, de la Fuente J, Guglielmone AA, Melendez RD (2003) Antigens and alternatives for control of Anaplasma marginale infection in cattle. Clin Microbiol Rev 16: 698–712.
2. Kocan KM, de la Fuente J, Blouin EF, Garcia-Garcia JC (2004) Anaplasma marginale (Rickettsiales: Anaplasmataceae): recent advances in defining host-pathogen adaptations of a tick-borne rickettsia. Parasitology 129 Suppl: S285–300.
3. Aubry P, Geale DW (2011) A review of bovine anaplasmosis. Transbound Emerg Dis 58: 1–30.
4. Smith RD, Levy MG, Kuhlenschmidt MS, Adams JH, Rzechula DL, et al. (1986) Isolate of Anaplasma marginale not transmitted by ticks. Am J Vet Res 47: 127–129.
5. Wickwire KB, Kocan KM, Barron SJ, Ewing SA, Smith RD, et al. (1987) Infectivity of three Anaplasma marginale isolates for Dermacentor andersoni. Am J Vet Res 48: 96–99.
6. Allred DR, McGuire TC, Palmer GH, Leib SR, Harkins TM, et al. (1990) Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc Natl Acad Sci U S A 87: 3220–3224.
7. Rodriguez SD, Garcia Ortiz MA, Hernandez Salgado G, Santos Cerda NA, Aboytes Torre R, et al. (2000) Anaplasma marginale inactivated vaccine: dose titration against a homologous challenge. Comp Immunol Microbiol Infect Dis 23: 239–252.
8. de la Fuente J, Garcia-Garcia JC, Blouin EF, McEwen BR, Clawson D, et al. (2001) Major surface protein 1a effects tick infection and transmission of Anaplasma marginale. Int J Parasitol 31: 1705–1714.
9. de La Fuente J, Garcia-Garcia JC, Blouin EF, Rodriguez SD, Garcia MA, et al. (2001) Evolution and function of tandem repeats in the major surface protein 1a of the ehrlichial pathogen Anaplasma marginale. Anim Health Res Rev 2: 163–173.
10. de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM (2003) Characterization of the functional domain of major surface protein 1a involved in adhesion of the rickettsia Anaplasma marginale to host cells. Vet Microbiol 91: 265–283.
11. de la Fuente J, Torina A, Naranjo V, Caracappa S, Vicente J, et al. (2005) Genetic diversity of Anaplasma marginale strains from cattle farms in the province of Palermo, Sicily. J Vet Med B Infect Dis Vet Public Health 52: 226–229.
12. Palmer GH, Rurangirwa FR, McElwain TF (2001) Strain composition of the ehrlichia Anaplasma marginale within persistently infected cattle, a mammalian reservoir for tick transmission. J Clin Microbiol 39: 631–635.
13. Ruiz PM, Passos LM, Ribeiro MF (2005) Lack of infectivity of a Brazilian Anaplasma marginale isolate for Boophilus microplus ticks. Vet Parasitol 128: 325–331.
14. de la Fuente J, Ruybal P, Mtshali MS, Naranjo V, Shuqing L, et al. (2007) Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet Microbiol 119: 382–390.
15. Barbet AF, Palmer GH, Myler PJ, McGuire TC (1987) Characterization of an immunoprotective protein complex of Anaplasma marginale by cloning and expression of the gene coding for polypeptide Am105L. Infect Immun 55: 2428–2435.
16. Palmer GH, Rurangirwa FR, Kocan KM, Brown WC (1999) Molecular basis for vaccine development against the ehrlichial pathogen Anaplasma marginale. Parasitol Today 15: 281–286.
17. Kocan KM, de la Fuente J (2003) Co-feeding studies of ticks infected with Anaplasma marginale. Vet Parasitol 112: 295–305.
18. Barbet AF, Blentlinger R, Yi J, Lundgren AM, Blouin EF, et al. (1999) Comparison of surface proteins of Anaplasma marginale grown in tick cell culture, tick salivary glands, and cattle. Infect Immun 67: 102–107.
19. Bowie MV, de la Fuente J, Kocan KM, Blouin EF, Barbet AF (2002) Conservation of major surface protein 1 genes of Anaplasma marginale during cyclic transmission between ticks and cattle. Gene 282: 95–102.
20. Ueti MW, Reagan JO, Jr., Knowles DP, Jr., Scoles GA, Shkap V, et al. (2007) Identification of midgut and salivary glands as specific and distinct barriers to efficient tick-borne transmission of Anaplasma marginale. Infect Immun 75: 2959–2964.
21. de La Fuente J, Passos LM, Van Den Bussche RA, Ribeiro MF, Facury-Filho EJ, et al. (2004) Genetic diversity and molecular phylogeny of Anaplasma marginale isolates from Minas Gerais, Brazil. Vet Parasitol 121: 307–316.
22. de la Fuente J, Lew A, Lutz H, Meli ML, Hofmann-Lehmann R, et al. (2005) Genetic diversity of anaplasma species major surface proteins and implications for anaplasmosis serodiagnosis and vaccine development. Anim Health Res Rev 6: 75–89.
23. de la Fuente J, Kocan KM, Blouin EF, Zivkovic Z, Naranjo V, et al. (2010) Functional genomics and evolution of tick-Anaplasma interactions and vaccine development. Vet Parasitol 167: 175–186.
24. Kocan KM, de la Fuente J, Blouin EF, Coetzee JF, Ewing SA (2010) The natural history of Anaplasma marginale. Vet Parasitol 167: 95–107.
25. Estrada-Peña A, Naranjo V, Acevedo-Whitehouse K, Mangold AJ, Kocan KM, et al. (2009) Phylogeographic analysis reveals association of tick-borne pathogen, Anaplasma marginale, MSP1a sequences with ecological traits affecting tick vector performance. BMC Biol 7: 57.
26. Brown WC, Palmer GH, Lewin HA, McGuire TC (2001) CD4(+) T lymphocytes from calves immunized with Anaplasma marginale major surface protein 1 (MSP1), a heteromeric complex of MSP1a and MSP1b, preferentially recognize the MSP1a carboxyl terminus that is conserved among strains. Infect Immun 69: 6853–6862.
27. Brown WC, McGuire TC, Zhu D, Lewin HA, Sosnow J, et al. (2001) Highly conserved regions of the immunodominant major surface protein 2 of the genogroup II ehrlichial pathogen Anaplasma marginale are rich in naturally derived CD4+ T lymphocyte epitopes that elicit strong recall responses. J Immunol 166: 1114–1124.
28. Brown WC, McGuire TC, Mwangi W, Kegerreis KA, Macmillan H, et al. (2002) Major histocompatibility complex class II DR-restricted memory CD4(+) T lymphocytes recognize conserved immunodominant epitopes of Anaplasma marginale major surface protein 1a. Infect Immun 70: 5521–5532.
29. Palmer GH, Waghela SD, Barbet AF, Davis WC, McGuire TC (1987) Characterization of a neutralization-sensitive epitope on the Am 105 surface protein of Anaplasma marginale. Int J Parasitol 17: 1279–1285.
30. Oberle SM, Palmer GH, Barbet AF, McGuire TC (1988) Molecular size variations in an immunoprotective protein complex among isolates of Anaplasma marginale. Infect Immun 56: 1567–1573.
31. Garcia-Garcia JC, de la Fuente J, Kocan KM, Blouin EF, Halbur T, et al. (2004) Mapping of B-cell epitopes in the N-terminal repeated peptides of Anaplasma marginale major surface protein 1a and characterization of the humoral immune response of cattle immunized with recombinant and whole organism antigens. Vet Immunol Immunopathol 98: 137–151.
32. Palmer GH, Oberle SM, Barbet AF, Goff WL, Davis WC, et al. (1988) Immunization of cattle with a 36-kilodalton surface protein induces protection against homologous and heterologous Anaplasma marginale challenge. Infect Immun 56: 1526–1531.
33. Palmer GH, Barbet AF, Cantor GH, McGuire TC (1989) Immunization of cattle with the MSP-1 surface protein complex induces protection against a structurally variant Anaplasma marginale isolate. Infect Immun 57: 3666–3669.
34. de la Fuente J, Kocan KM, Garcia-Garcia JC, Blouin EF, Halbur T, et al. (2003) Immunization Against Anaplasma marginale Major Surface Protein 1a Reduces Infectivity for Ticks. The International Journal of Applied Research in Veterinary Medicine 1.
35. Blouin EF, Saliki JT, de la Fuente J, Garcia-Garcia JC, Kocan KM (2003) Antibodies to Anaplasma marginale major surface proteins 1a and 1b inhibit infectivity for cultured tick cells. Vet Parasitol 111: 247–260.
36. McGarey DJ, Barbet AF, Palmer GH, McGuire TC, Allred DR (1994) Putative adhesins of Anaplasma marginale: major surface polypeptides 1a and 1b. Infect Immun 62: 4594–4601.
37. Lew AE, Bock RE, Minchin CM, Masaka S (2002) A msp1alpha polymerase chain reaction assay for specific detection and differentiation of Anaplasma marginale isolates. Vet Microbiol 86: 325–335.
38. de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM (2001) Differential adhesion of major surface proteins 1a and 1b of the ehrlichial cattle pathogen Anaplasma marginale to bovine erythrocytes and tick cells. Int J Parasitol 31: 145–153.
39. Shkap V, Kocan K, Molad T, Mazuz M, Leibovich B, et al. (2009) Experimental transmission of field Anaplasma marginale and the A. centrale vaccine strain by Hyalomma excavatum, Rhipicephalus sanguineus and Rhipicephalus (Boophilus) annulatus ticks. Vet Microbiol 134: 254–260.
40. Benz I, Schmidt MA (2002) Never say never again: protein glycosylation in . Mol Microbiol 45: 267–276.
41. Chavez AS, Felsheim RF, Kurtti TJ, Ku PS, Brayton KA, et al. (2012) Expression patterns of Anaplasma marginale Msp2 variants change in response to growth in cattle, and tick cells versus mammalian cells. PLoS One 7: e36012.


42. Futse JE, Brayton KA, Nydam SD, Palmer GH (2009) Generation of antigenic variants via gene conversion: Evidence for recombination fitness selection at the locus level in Anaplasma marginale. Infect Immun 77: 3181–3187.
43. Estrada-Peña A, Venzal JM, Nava S, Mangold A, Guglielmone AA, et al. (2012) Reinstatement of Rhipicephalus (Boophilus) australis (Acari: Ixodidae) with redescription of the adult and larval stages. J Med Entomol 49: 794–802.
44. Futse JE, Ueti MW, Knowles DP, Jr., Palmer GH (2003) Transmission of Anaplasma marginale by Boophilus microplus: retention of vector competence in the absence of vector-pathogen interaction. J Clin Microbiol 41: 3829–3834.
45. Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradovic Z (2002) Intrinsic disorder and protein function. Biochemistry 41: 6573–6582.
46. Valdez RA, McGuire TC, Brown WC, Davis WC, Jordan JM, et al. (2002) Selective in vivo depletion of CD4(+) T lymphocytes with anti-CD4 monoclonal antibody during acute infection of calves with Anaplasma marginale. Clin Diagn Lab Immunol 9: 417–424.
47. Santos PS, Nascimento R, Rodrigues LP, Santos FA, Faria PC, et al. (2012) Functional epitope core motif of the Anaplasma marginale major surface protein 1a and its incorporation onto bioelectrodes for antibody detection. PLoS One 7: e33045.
48. Suarez CE, Noh S (2011) Emerging perspectives in the research of bovine babesiosis and anaplasmosis. Vet Parasitol 180: 109–125.
49. Maritz-Olivier C, van Zyl W, Stutzer C (2012) A systematic, functional genomics, and reverse vaccinology approach to the identification of vaccine candidates in the cattle tick, Rhipicephalus microplus. Ticks Tick Borne Dis 3: 179–187.
50. Kindt TJ, Goldsby RA, Osborne BA, Kuby J (2007) Kuby immunology. New York: W.H. Freeman. xxii, 574, A-531, G-512, AN-527, I-527 p.
51. Jones DT (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292: 195–202.
52. Buchan DW, Ward SM, Lobley AE, Nugent TC, Bryson K, et al. (2010) Protein annotation and modelling servers at University College London. Nucleic Acids Res 38: W563–568.
53. Kurowski MA, Bujnicki JM (2003) GeneSilico protein structure prediction meta-server. Nucleic Acids Res 31: 3305–3307.
54. El-Manzalawy Y, Dobbs D, Honavar V (2008) Predicting linear B-cell epitopes using string kernels. J Mol Recognit 21: 243–255.
55. Doytchinova IA, Flower DR (2007) VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinformatics 8: 4.
56. Saha S, Raghava GP (2006) Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins 65: 40–48.
57. Mayrose I, Shlomi T, Rubinstein ND, Gershoni JM, Ruppin E, et al. (2007) Epitope mapping using combinatorial phage-display libraries: a graph-based algorithm. Nucleic Acids Res 35: 69–78.
58. Mayrose I, Penn O, Erez E, Rubinstein ND, Shlomi T, et al. (2007) Pepitope: epitope mapping from affinity-selected peptides. Bioinformatics 23: 3244–3246.
59. Thomas R, Patenaude SI, MacKenzie CR, To R, Hirama T, et al. (2002) Structure of an anti-blood group A Fv and improvement of its binding affinity without loss of specificity. J Biol Chem 277: 2059–2064.
60. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797.
61. Castresana J (2000) Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 17: 540–552.
62. Elias I, Lagergren J (2007) Fast computation of distance estimators. BMC Bioinformatics 8: 89.
63. Felsenstein J (1989) Mathematics vs. Evolution: Mathematical Evolutionary Theory. Science 246: 941–942.
64. Chevenet F, Brun C, Banuls AL, Jacq B, Christen R (2006) TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 7: 439.
65. Zivkovic Z, Nijhof AM, de la Fuente J, Kocan KM, Jongejan F (2007) Experimental transmission of Anaplasma marginale by male Dermacentor reticulatus. BMC Vet Res 3: 32.
66. Leverich CK, Palmer GH, Knowles DP, Jr., Brayton KA (2008) Tick-borne transmission of two genetically distinct Anaplasma marginale strains following superinfection of the mammalian reservoir host. Infect Immun 76: 4066–4070.
67. Barbet AF, Yi J, Lundgren A, McEwen BR, Blouin EF, et al. (2001) Antigenic variation of Anaplasma marginale: major surface protein 2 diversity during cyclic transmission between ticks and cattle. Infect Immun 69: 3057–3066.
68. Palmer GH, Barbet AF, Davis WC, McGuire TC (1986) Immunization with an isolate-common surface protein protects cattle against anaplasmosis. Science 231: 1299–1302.

Chapter II.II

Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil

Pohl A.E., Cabezas-Cruz A., Ribeiro M.F.B., Silveira J.A.G., Silaghi C., Pfister K., Passos L.M.F. 2013. Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Brazilian Journal of Veterinary Parasitology. 22(1): 1-7.

Rev. Bras. Parasitol. Vet., Jaboticabal, v. 22, n. 1, p. 1-7, jan.-mar. 2013
ISSN 0103-846X (print) / ISSN 1984-2961 (electronic)

Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil

Anna Elisabeth Pohl (1); Alejandro Cabezas-Cruz (2); Múcio Flávio Barbosa Ribeiro (3); Júlia Angélica Gonçalves da Silveira (3); Cornelia Silaghi; Kurt Pfister (1); Lygia Maria Friche Passos (1,4)*

(1) Institute for Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität – LMU, 80752, München, Germany
(2) Biology Centre of the Academy of Sciences of the Czech Republic, v. v. i., Parasitology Institute – IP, Faculty of Science, University of South Bohemia, 370 05, České Budějovice, Czech Republic
(3) Departamento de Parasitologia, Instituto de Ciências Biológicas – ICB, Universidade Federal de Minas Gerais – UFMG, CEP 31270-901, Belo Horizonte, Brazil
(4) Departamento de Medicina Veterinária Preventiva, Instituto Nacional de Ciência e Tecnologia – INCT/Informação Genético-sanitária da Pecuária Brasileira, Escola de Veterinária, Universidade Federal de Minas Gerais – UFMG, CEP 30123-070, Belo Horizonte, MG, Brazil

Received December 6, 2012
Accepted February 18, 2013

Abstract

Bovine anaplasmosis, caused by the tick-borne rickettsia Anaplasma marginale, is endemic in tropical and subtropical regions of the world and results in economic losses in the cattle industry. Major surface proteins (MSPs) have been used as markers for the genetic characterization of A. marginale strains and demonstrate that many isolates may occur in a given geographic area. However, in Brazil, little is known about the genetic diversity of A. marginale isolates within individual herds. This study was designed to examine the genetic variation among A. marginale infecting calves on a farm in the south of Minas Gerais State, Brazil. Blood samples collected from 100 calves were used to prepare Giemsa-stained smears that were microscopically examined for the presence of A. marginale. From each blood sample, DNA was extracted and analyzed by a polymerase chain reaction (PCR), followed by sequencing to determine diversity among the isolates. Examination of blood smears showed that 48% of the calves were infected with A. marginale, while the real-time PCR detected 70.2% positivity. Congenital infections were found in four calves. The microsatellite and tandem repeat analyses showed high genetic diversity among the isolates.

Keywords: Anaplasma marginale, MSP1a, DNA sequencing, microsatellites, tandem repeats, Brazil.


*Corresponding author: Lygia Maria Friche Passos Institute for Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität – LMU, 80752, München, Germany e-mail: [email protected]

Introduction

Anaplasmosis is a tick-borne disease of ruminants, caused by the obligate intra-erythrocytic bacterium Anaplasma marginale with a widespread distribution in tropical-to-temperate climates (AUBRY; GEALE, 2011). Ticks are the biological vectors of A. marginale and the one-host tick Rhipicephalus (Boophilus) microplus is considered to be the main vector in Brazil (KESSLER; SCHENK, 1998). The pathogen can also be transmitted mechanically by biting flies, blood-contaminated fomites, or congenitally by transplacental transmission (ZAUGG; KUTTLER, 1984; PASSOS; LIMA, 1984). Transplacental transmission of A. marginale may therefore contribute to the epidemiology of this disease in some regions (KOCAN et al., 2003). Cattle of all ages can become infected and remain persistently infected carriers for life, with clinical signs varying from asymptomatic to acute cases with fever, anaemia, abortion, weight loss, lowered milk production or death (KESSLER; SCHENK, 1998; RIBEIRO; PASSOS, 2002).

Direct diagnosis can be made by microscopic examination of Giemsa-stained blood smears, but this method can only detect levels of >10^7 infected erythrocytes per ml of blood (PALMER et al., 2000). For epidemiological surveys, indirect and direct methods, such as the Indirect Fluorescent Antibody Test (IFAT) and the real-time PCR, are more appropriate to reveal the status of a herd.

Brazil is considered to be an endemic area where calves, being less susceptible to clinical disease than adults, acquire the infection shortly after birth, which seems to be an important factor for endemic stability (AUBRY; GEALE, 2011). However, although anaplasmosis is endemic in Minas Gerais State, the occurrence of outbreaks causes huge economic losses to the cattle industry (RIBEIRO; REIS, 1981).

Many different geographic strains of A. marginale have been identified, which differ in biology, genetic characteristics and transmissibility by ticks (DE LA FUENTE et al., 2005, 2007). The A. marginale major surface protein 1a (MSP1a) was shown to be involved in vector-pathogen and host-pathogen interactions and to have evolved under positive selection pressure (DE LA FUENTE et al., 2003a). Among different strains, the MSP1a differs in molecular weight due to a variable number of tandem 23- to 31-amino-acid repeats, and it has been proven to be a stable marker of strain identity (ESTRADA-PEÑA et al., 2009). The understanding of A. marginale epidemiology, including the characterization of the genetic diversity of strains in a region, provides knowledge for the development and implementation of control measures.

Therefore the aim of the present study was to determine the occurrence of genetic diversity within a herd in a dairy farm in Brazil.

Materials and Methods

1. Samples and microscopic examination

The study was carried out from December 2010 to January 2011 on a dairy farm located near Cordislândia (latitude 21° 79’, longitude 45° 67’, 816 m), in Minas Gerais State, Brazil. Blood samples were collected from the jugular vein into EDTA from all the calves (100) on the farm, which were of the breeds Holstein, Jersey and crossbreed. The calves were classified into 4 groups according to their ages: Group 1 (17 calves from 1 to 7 days old), Group 2 (12 calves from 8 to 30 days old), Group 3 (15 calves from 31 to 107 days old) and Group 4 (56 calves from 108 to 381 days old). Giemsa-stained blood smears were prepared and bacteremia was calculated as the percentage of A. marginale infected erythrocytes detected in 20 microscopic fields.

2. Genomic DNA isolation and PCR

DNA was extracted from 94 blood samples using the commercial Wizard® Genomic DNA Purification kit according to the manufacturer’s instructions (Promega, Madison, USA). DNA concentrations were determined using the spectrophotometer NanoDrop® ND-1000 (PepLab, Erlangen, Germany). For the initial screening a real-time PCR was used, as reported by CARELLI et al. (2007) with modifications for targeting the msp1β gene. All amplifications were performed in a 7500 Fast Real-Time PCR System (Applied Biosystems, Darmstadt, Germany) and were carried out in a 25 µL reaction mixture containing 5 µL of DNA template, 15 µL TaqMan® Gene Expression Master Mix (Applied Biosystems, USA), 2.25 µL (10 µM) of each primer (AM-forward: 5′ TTGGCAAGGCAGCAGCTT 3′ and AM-reverse: 5′ TTCCGCGAGCATGTGCAT 3′) and 0.50 µL (10 µM) of the probe (AM-probe: 6FAM-5′-TCGGTCTAACATCTCCAGGCTTTCAT-3′-BHQ1). Cycling was performed under the following conditions: 2 min/50 °C; 10 min/95 °C and 40 cycles of 15 sec/95 °C and 1 min/60 °C.

For sequencing, 13 positive samples from calves of different ages were selected and further analyzed by a hemi-nested PCR targeting the msp1α gene following the protocol of Lew et al. (2002). Reactions were performed in an automated DNA thermal cycler (Eppendorf Mastercycler® gradient) using the primers 1733F (5′ TGTGCTTATGGCAGACATTTCC 3′), 3134R (5′ TCACGGTCAAAACCTTTGCTTACC 3′) and 2957R (5′ AAACCTTGTAGCCCCAACTTATCC 3′). The primary amplification cycle, following an initial denaturation at 94 °C for 5 minutes, consisted of 40 cycles of 30 seconds at 94 °C, 1 minute at 55 °C and 2 minutes at 72 °C, followed by a final extension step for 7 min at 72 °C. The same conditions were used in the second amplification cycle except that the annealing temperature was changed to 60 °C. For visualization, PCR products were electrophoresed on a 2% agarose gel stained with Gel Red® (Biotium, USA). In addition, amplified fragments were purified using a commercial kit (QIAquick PCR purification Kit, Qiagen, Hilden, Germany) and sent for sequencing of both strands (Eurofins, MWG, Operon, Ebersberg).

3. DNA sequence analysis

All MSP1a sequences of A. marginale isolates from Minas Gerais available in GenBank were included for the genetic diversity analysis. A microsatellite was located in the MSP1a

5’UTR between the putative Shine-Dalgarno sequence (GTAGG) and the translation initiation codon (ATG). The structure was GTAGG (G/A TTT)m (GT)n T ATG, where the microsatellite is the (G/A TTT)m (GT)n T segment (ESTRADA-PEÑA et al., 2009). The UFMG-1 and UFMG-2 isolates were not included in the microsatellite analysis because the 5’UTR was missing. The tandem repeat analysis was performed following the nomenclature proposed by DE LA FUENTE et al. (2007).

Homologies with our sequences were searched in the nucleotide collection (nr/nt) database using Megablast (optimized for highly similar sequences) on the BLAST server (http://blast.ncbi.nlm.nih.gov/). Nucleotide sequences were aligned using BLAST (ZHANG et al., 2000) and protein sequences were aligned using the multiple-alignment program CLUSTALW (THOMPSON et al., 1994). Nucleotide sequences were translated to amino acid (aa) sequences with the ExPASy translation tool of the Swiss Institute of Bioinformatics (EXPASY TRANSLATION TOOL, 2011). The phylogenetic analysis was performed as follows: nucleotide sequences were aligned with MUSCLE (v3.7) configured for highest accuracy (EDGAR, 2004). After alignment, ambiguous regions (i.e., containing gaps and/or poorly aligned) were removed with Gblocks (v0.91b) (CASTRESANA, 2000). The phylogenetic tree was reconstructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) (GUINDON; GASCUEL, 2003, ANISIMOVA; GASCUEL, 2006). Reliability of internal branches was assessed using the bootstrapping method (100 bootstrap replicates). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) (CHEVENET et al., 2006).

Results

1. Detection of A. marginale infections

A. marginale could be detected by microscopic examination of blood smears in 48 out of 100 animals (48%) with bacteremia ranging from 0.1 to 16.1%. The results obtained by real-time PCR showed that 66 out of 94 animals (70.2%) were positive for A. marginale; this method was thus more sensitive than direct examination of blood smears. A. marginale infection was detected in all the age groups.

Unexpectedly, congenital infection was detected in four 1-day old calves. All four calves were positive in the hemi-nested PCR and in the real-time PCR. Only one of them was negative by blood smear examination.

Table 1 shows the percentage of animals in each age group that were positive by blood smear and in the real-time PCR. From the MSP1a PCR an amplicon of approximately 0.8 kb was isolated, corresponding to the expected size of the targeted msp1α gene fragment (Table 2).

Table 1. Positivity (%) for Anaplasma marginale detected by direct examination (blood smears) and by molecular analysis (RT-PCR) from Minas Gerais State.

Group (age range in days)    + Blood smears (%)    + RT-PCR (%)
1 (1-7)                      17.6                  25
2 (8-30)                     8.3                   16.7
3 (31-107)                   46.7                  61.5
4 (108-381)                  67.9                  98.1

2. MSP1a microsatellite analysis

The analysis of MSP1a microsatellite sequences resulted in five different genotypes amongst A. marginale strains from Minas Gerais (Table 3). The different microsatellite sequences produced SD-ATG distances between 19 and 23 nucleotides (Table 3). The predominant genotype was E and it was distributed among animals of different ages and breeds (Table 4). In sequences previously reported from Minas Gerais, the genotypes G, D, C and E were found (DE LA FUENTE et al., 2002). Genotype C was not found in the present study and genotype B is reported for the first time in Minas Gerais isolates.

3. MSP1a tandem repeats and phylogenetic analysis

Differences were found in the tandem repeat sequences and in the structure of the gene msp1α among the different isolates from Minas Gerais. One MSP1a tandem repeat resulted in a new sequence with amino acid changes as shown in Figure 1. Twenty four different tandem repeats were found in all the analyzed sequences (Figure 2). Nine of them were not previously reported

in Minas Gerais, but had been detected in Argentina, Mexico, South Africa and Israel (DE LA FUENTE et al., 2007), and one was a new sequence when compared with the tandem repeats reported worldwide for A. marginale MSP1a. Interestingly, the tandem repeats σ, C, F, N, µ and 42, previously reported in Minas Gerais (DE LA FUENTE et al., 2007), were not found in the MSP1a sequences analyzed in this study (Figure 2).

Figure 1. New sequence of MSP1a tandem repeat (72) found in A. marginale isolates from Minas Gerais. The one letter amino acid code was used to depict the differences found in MSP1a repeats. Asterisks indicate identical amino acids and gaps indicate deletion/insertions. The repeat form B was taken as a model to compare the new repeat.

Table 2. Isolates of A. marginale from Minas Gerais, Brazil included in the study.

Isolate      Age (days)    Breed       Age group    GenBank accession number
Minas-1      0             Holstein    1            JX844205
Minas-2      1             Jersey      1            JX844206
Minas-3      1             Holstein    1            JX844207
Minas-4      1             Jersey      1            JX844208
Minas-5      14            Jersey      2            JX844209
Minas-6      29            Jersey      2            JX844210
Minas-7      40            Jersey      2            JX844216
Minas-8      78            Jersey      3            JX844211
Minas-9      78            Holstein    3            JX844217
Minas-10     103           Holstein    3            JX844212
Minas-11     253           Jersey      4            JX844213
Minas-12     280           Jersey      4            JX844214
Minas-13     355           Jersey      4            JX844215
Brazil       N/D           N/D         N/D          AF428092
Brazil-5     N/D           N/D         N/D          AY283198
Brazil-9     N/D           N/D         N/D          AY283199
Brazil-12    N/D           N/D         N/D          AY283200
UFMG-1       N/D           N/D         N/D          EU676175
UFMG-2       N/D           N/D         N/D          EU676176

N/D: Not Defined.

Figure 2. The structure of the MSP1a repeat regions, according to the nomenclature proposed by De la Fuente et al. (2007).

Table 3. The msp1α microsatellite sequences were analyzed in 13 A. marginale isolates. The microsatellite was located between the Shine-Dalgarno sequence (SD; GTAGG) and the translation initiation codon (ATG), with the structure GTAGG (G/A TTT)m (GT)n T ATG. The SD-ATG distance was calculated in nucleotides as (4 x m) + (2 x n) + 1.

Isolates     Genotype    m*    n**    SD-ATG distance (nucleotides)
Minas-1      E           2     7      23
Minas-2      E           2     7      23
Minas-3      E           2     7      23
Minas-4      G           3     5      23
Minas-5      B           1     9      23
Minas-6      E           2     7      23
Minas-7      D           2     6      21
Minas-8      E           2     7      23
Minas-9      D           2     6      21
Minas-10     E           2     7      23
Minas-11     E           2     7      23
Minas-12     E           2     7      23
Minas-13     E           2     7      23
Brazil       G           3     5      23
Brazil-5     D           2     6      21
Brazil-9     C           2     5      19
Brazil-12    E           2     7      23

*m is the number of repetitions of the nucleotide sequence G/A TTT.
**n is the number of repetitions of the nucleotide sequence GT.
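The SD-ATG distance rule stated with Table 3 can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the regular expression are assumptions introduced here (they are not part of the published analysis), and the example 5'UTR string is synthetic, not a sequence from the study.

```python
import re

# Assumed pattern for the structure GTAGG (G/A TTT)m (GT)n T ATG.
# Group 1 captures the (G/A TTT) repeats, group 2 the (GT) repeats.
MICROSATELLITE = re.compile(r"GTAGG((?:[GA]TTT)+)((?:GT)+)TATG")


def sd_atg_distance(m: int, n: int) -> int:
    """SD-ATG distance in nucleotides: (4 x m) + (2 x n) + 1."""
    return 4 * m + 2 * n + 1


def parse_microsatellite(utr: str):
    """Return (m, n, distance) for a 5'UTR string, or None if no match."""
    match = MICROSATELLITE.search(utr.upper().replace(" ", ""))
    if match is None:
        return None
    m = len(match.group(1)) // 4  # each G/A TTT unit is 4 nt long
    n = len(match.group(2)) // 2  # each GT unit is 2 nt long
    return m, n, sd_atg_distance(m, n)


# (m, n) pairs per genotype, as listed in Table 3.
GENOTYPES = {"B": (1, 9), "C": (2, 5), "D": (2, 6), "E": (2, 7), "G": (3, 5)}

for name, (m, n) in sorted(GENOTYPES.items()):
    print(name, sd_atg_distance(m, n))

# Synthetic 5'UTR built for illustration only (m = 2, n = 7).
example = "GTAGG" + "GTTT" + "ATTT" + "GT" * 7 + "T" + "ATG"
print(parse_microsatellite(example))
```

Applied to the (m, n) values in Table 3, the formula gives the tabulated distances for all five genotypes, so genotypes B, E and G share the 23-nucleotide spacing even though their repeat counts differ.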

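The breed column of Table 2 and the genotype column of Table 3 can be cross-tabulated to obtain per-genotype frequencies like those summarised in Table 4. The sketch below is a hedged illustration, not the authors' procedure: the function name and the data layout are assumptions, and only the 13 isolates sequenced in this study are included.

```python
from collections import Counter, defaultdict

# (isolate, breed, genotype) transcribed from Table 2 (breed)
# and Table 3 (genotype) for the 13 isolates from this herd.
ISOLATES = [
    ("Minas-1", "Holstein", "E"),
    ("Minas-2", "Jersey", "E"),
    ("Minas-3", "Holstein", "E"),
    ("Minas-4", "Jersey", "G"),
    ("Minas-5", "Jersey", "B"),
    ("Minas-6", "Jersey", "E"),
    ("Minas-7", "Jersey", "D"),
    ("Minas-8", "Jersey", "E"),
    ("Minas-9", "Holstein", "D"),
    ("Minas-10", "Holstein", "E"),
    ("Minas-11", "Jersey", "E"),
    ("Minas-12", "Jersey", "E"),
    ("Minas-13", "Jersey", "E"),
]


def frequency_by_breed(records):
    """Map genotype -> {breed: fraction of that genotype's isolates}."""
    counts = defaultdict(Counter)
    for _isolate, breed, genotype in records:
        counts[genotype][breed] += 1
    return {
        genotype: {
            breed: k / sum(by_breed.values()) for breed, k in by_breed.items()
        }
        for genotype, by_breed in counts.items()
    }


for genotype, freqs in sorted(frequency_by_breed(ISOLATES).items()):
    print(genotype, {breed: round(f, 2) for breed, f in sorted(freqs.items())})
```

Breeds with no isolates of a given genotype are simply absent from the inner dictionary, which corresponds to a frequency of 0.00; small differences from published values can arise from rounding versus truncation.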
Using the 0.8 kb fragment of MSP1a a maximum likelihood tree was built. The tree shows clusters among the different isolates from Minas Gerais and the phylogenetic relationship between them. Four clusters of MSP1a genes were found as shown in the phylogenetic tree (Figure 3).

Figure 3. Unrooted phylogenetic tree based on the MSP1a protein sequences of A. marginale isolates from Minas Gerais. The tree shows the clusters of MSP1a. Bootstrap values are shown as % on the internal branches (only values equal to or higher than 50%).

Table 4. Frequency of A. marginale genotypes in infected calves from Minas Gerais State, by breed and by age group.

                        Breed                 Group (age range in days)
Genotype    Isolates    Holstein    Jersey    1 (1-7)    2 (8-30)    3 (31-107)    4 (108-381)
B           1           0.00        1.00      0.00       1.00        0.00          0.00
D           2           0.50        0.50      0.00       0.50        0.50          0.00
E           9           0.33        0.66      0.33       0.11        0.22          0.33
G           1           0.00        1.00      1.00       0.00        0.00          0.00

Discussion

The analysis of MSP1a repeat sequences, which has provided evolutionary information about geographically distinct A. marginale strains (DE LA FUENTE et al., 2001, 2005, 2007), was used in the present study to characterize pathogen genetic diversity within a Brazilian dairy farm.

The results reported here confirm the presence of A. marginale infections among all age groups tested, and demonstrate that calves acquire the infection shortly after birth, suggesting that there is endemic stability for anaplasmosis in this dairy farm. On the other hand, the results reveal the presence of several A. marginale msp1a genotypes within the herd (Table 2). These results are consistent with the findings of high A. marginale genetic diversity in endemic areas worldwide (DE LA FUENTE et al., 2007).

MSP1a contains a variable number of tandemly repeated peptides in the amino-terminal region, while the remainder of the protein is highly conserved between isolates. The number of repeats varies among geographic isolates of A. marginale but is constant within an isolate and has been used as a stable genetic marker of isolate identity (DE LA FUENTE et al., 2003b). Based on analysis of MSP1a tandem repeats, we found fourteen different strains in isolates from Minas Gerais (Figure 2), eight of them reported for the first time. Comparisons with the tandem repeats previously reported for MSP1a showed one new tandem repeat sequence in this study, which has been named 72 (Figure 1), following the nomenclature proposed by De la Fuente et al. (2007). One common structure (72-62-61) for MSP1a tandem repeats was found in six of the Minas Gerais isolates (Minas 6-10 and

Minas 12), representing the most frequent strain in our study. The maximum likelihood phylogenetic tree shows that the MSP1a from Minas Gerais falls into four separate clusters. The tandem repeats 72-62-61 and C-F-N, contained in the isolate Brazil-5, reveal minor changes in amino acids (data not shown) and the phylogenetic tree shows that they fall in the same cluster (Cluster 2), suggesting a common ancestor for both isolates. All the isolates containing at least one tandem repeat type β fall in the MSP1a Cluster 1 based on the phylogenetic tree, supporting the idea that strains containing the tandem repeat type β have common origins in Minas Gerais. Two of the strains containing tandem repeat type β found in our study, Minas-1 and Minas-3, belonged to animals with congenital A. marginale.

The phylogeographic clustering of some genotypes of the MSP1a microsatellite has been reported. The genotype B was exclusively associated with areas of southern Brazil but genotypes C, D, E and G have also been found in other geographic regions (ESTRADA-PEÑA et al., 2009). The five genotypes found in Minas Gerais isolates were previously reported for this geographic region. The predominant genotype E was present in all of the phylogenetic clusters of MSP1a except for Cluster 3. The length of the MSP1a microsatellite could have affected the expression of MSP1a, thus influencing the infection and transmission of A. marginale. Higher expression levels were reported for SD-ATG distances of 23 and 29 nucleotides and lower expression for 19 nucleotides (ESTRADA-PEÑA et al., 2009). Almost all the SD-ATG distances found in the present study were 21 or 23 nucleotides long. It is noteworthy that all the SD-ATG distances in the congenitally transmitted A. marginale were 23 nucleotides. This suggests a high capacity for infection and transmission of the A. marginale strains found in this area and particularly of the isolates transmitted congenitally.

Based on the analysis of MSP1a microsatellite and tandem repeat structure we found a high genetic diversity in the Minas Gerais A. marginale isolates. The genetic diversity of A. marginale MSP1a could be explained not only by evolutionary pressures exerted by ligand-receptor and host-parasite interactions (DE LA FUENTE et al., 2001) but also by constant movements of cattle or independent transmission (DE LA FUENTE et al., 2005). In the case of this farm, evolutionary pressure and independent transmission should be the main explanations for the genetic diversity, since no new animals had been introduced into the herd over the previous 20 years. Nevertheless, one animal with congenital A. marginale (isolate Minas-3) had a tandem repeat structure previously reported in Mexico (DE LA FUENTE et al., 2001).

Regarding congenital transmission, each of the four infected newborn calves carried a different strain of A. marginale (Minas-1 to -4) (Figure 2), but only two genotypes, E and G, based on the msp1α microsatellites (Table 3). The genotype E predominated in this and the other age groups, showing it to be the predominant genotype at the study site. This genotype was found in one sequence previously reported in Minas Gerais (Brazil-12) (Table 3).

Conclusions

From the results presented here we can conclude that (1) genetic diversity of A. marginale is high in Minas Gerais and (2) congenital transmission appears to be an important phenomenon that contributes to the persistence of several A. marginale strains within a herd. Further investigations should address the influence of such elevated genetic variation of A. marginale strains on epidemiological aspects of bovine anaplasmosis. This might contribute to a better understanding of the epidemiology of the disease and consequently to the development and implementation of appropriate control and prevention measures.

Acknowledgements

The authors are grateful to Camila V. Bastos (UFMG, Belo Horizonte, Brazil), Evelyn Overzier, Claudia Thiel, Tim Tiedemann and Katarzyna Lis (Institute for Comparative Tropical Medicine and Parasitology, LMU-München, Germany) for their excellent technical assistance, and to Lesley Bell-Sakyi (The Pirbright Institute, Surrey, UK) for revising the English language of the manuscript. Alejandro Cabezas-Cruz is a Marie Curie Early Stage Researcher supported by the POSTICK ITN (Post-graduate training network for capacity building to control ticks and tick-borne diseases) within the FP7-PEOPLE – ITN programme (EU Grant No. 238511).

References

Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol 2006; 55(4): 539-552. PMid:16785212. http://dx.doi.org/10.1080/10635150600755453

Aubry P, Geale DW. A Review of Bovine Anaplasmosis. Transbound Emerg Dis 2011; 58(1): 1-30. PMid:21040409.

Carelli G, Decaro N, Lorusso A, Elia G, Lorusso E, Mari V, et al. Detection and quantification of Anaplasma marginale DNA in blood samples of cattle by real-time PCR. Vet Microbiol 2007; 124(1-2): 107-114. PMid:17466470.

Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000; 17(4): 540-552. PMid:10742046. http://dx.doi.org/10.1093/oxfordjournals.molbev.a026334

Chevenet F, Brun C, Bañuls AL, Jacq B, Christen R. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinf 2006; 10(7): 439. PMid:17032440 PMCid:1615880. http://dx.doi.org/10.1186/1471-2105-7-439

De la Fuente J, Garcia-Garcia JC, Blouin EF, Rodriguez SD, García MA, Kocan KM. Evolution and function of tandem repeats in the major surface protein 1a of the ehrlichial pathogen Anaplasma marginale. Anim Health Res Rev 2001; 2(2): 163-173. PMid:11831437.

De la Fuente J, Van Den Bussche RA, Garcia-Garcia JC, Rodríguez SD, García MA, Guglielmone AA, et al. Phylogeography of New World isolates of Anaplasma marginale based on major surface protein sequences. Vet Microbiol 2002; 88(3): 275-285. PMid:12151201.

De la Fuente J, Van Den Bussche RA, Prado TM, Kocan KM. Anaplasma marginale msp1a genotypes evolved under positive selection pressure but are not markers for geographic strains. J Clin Microbiol 2003a; 41(4): 1609-1616. PMid:12682152.

De la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM. Characterization of the functional domain of major surface protein 1a involved in

adhesion of the rickettsia Anaplasma marginale to host cells. Vet Microbiol 2003b; 91(2-3): 265-283. PMid:12458174.

De la Fuente J, Lew A, Lutz H, Meli ML, Hofmann-Lehmann R, Shkap V, et al. Genetic diversity of Anaplasma species major surface proteins and implications for anaplasmosis serodiagnosis and vaccine development. Anim Health Res Rev 2005; 6(1): 75-89. PMid:16164010.

De la Fuente J, Ruybal P, Mtshali MS, Naranjo V, Shuqing L, Mangold AJ, et al. Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet Microbiol 2007; 119(2-4): 382-390. PMid:17084044.

Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004; 32(5): 1792-1797. PMid:15034147 PMCid:390337. http://dx.doi.org/10.1093/nar/gkh340

Estrada-Peña A, Naranjo V, Acevedo-Whitehouse K, Mangold AJ, Kocan KM, De la Fuente J. Phylogeographic analysis reveals association of tick-borne pathogen, Anaplasma marginale, MSP1a sequences with ecological traits affecting tick vector performance. BMC Biol 2009; 7: 57. PMid:19723295 PMCid:2741432. http://dx.doi.org/10.1186/1741-7007-7-57

ExPaSy Translation Tool 2011. Available from: http://expasy.hcuge.ch/tools/dna.html.

Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003; 52(5): 696-704. PMid:14530136. http://dx.doi.org/10.1080/10635150390235520

Kessler RH, Schenk MAM. Carrapato, tristeza parasitária e tripanossomose dos bovinos. EMBRAPA; 1998.

Kocan KM, De la Fuente J, Guglielmone AA, Melendez RD. Antigens and alternatives for control of Anaplasma marginale infection in cattle. Clin Microbiol Rev 2003; 16(4): 698-712. PMid:14557295 PMCid:207124. http://dx.doi.org/10.1128/CMR.16.4.698-712.2003

Lew AE, Bock RE, Minchin CM, Masaka S. A msp1a polymerase chain reaction assay for specific detection and differentiation of Anaplasma marginale isolates. Vet Microbiol 2002; 86(4): 325-335. PMid:11955782.

Palmer GH, Brown WC, Rurangirwa FR. Antigenic variation in the persistence and transmission of the ehrlichia Anaplasma marginale. Microbes Infect 2000; 2(2): 167-176. PMid:10742689.

Passos LMF, Lima JD. Diagnóstico de anaplasmose bovina congênita em Minas Gerais. Arq Bras Med Vet Zootec 1984; 36(6): 743-744.

Ribeiro MFB, Reis R. Natural exposure of calves to Anaplasma marginale in endemic areas of Minas Gerais. Arq Bras Med Vet Zootec 1981; 33(1): 63-66.

Ribeiro MFB, Passos LMF. Tristeza parasitária bovina. Cad Tec Vet Zootec 2002; 39: 36-52.

Thompson JD, Higgins DG, Gibson TJ. CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994; 22(22): 4673-4680. PMid:7984417 PMCid:308517. http://dx.doi.org/10.1093/nar/22.22.4673

Zaugg JL, Kuttler KL. Bovine anaplasmosis: in utero transmission and the immunologic significance of ingested colostral antibodies. Am J Vet Res 1984; 45(3): 440-443. PMid:6711971.

Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol 2000; 7(1-2): 203-214. PMid:10890397. http://dx.doi.org/10.1089/10665270050081478

-38- Chapter II.III

Epidemiology and evolution of genetic variability of Anaplasma marginale in South Africa.

Mutshembele A.M., Cabezas-Cruz A., Mtshali M.S., Thekisoe O.M.M., Galindo R.C., de la Fuente J. 2014. Epidemiology and evolution of genetic variability of Anaplasma marginale in South Africa. Ticks and Tick-borne Diseases. In press.


Ticks and Tick-borne Diseases

journal homepage: www.elsevier.com/locate/ttbdis

Original article

Epidemiology and evolution of the genetic variability of Anaplasma marginale in South Africa

Awelani M. Mutshembele a,b, Alejandro Cabezas-Cruz c,d,∗, Moses S. Mtshali a,b, Oriel M.M. Thekisoe b, Ruth C. Galindo c, José de la Fuente c,e

a Research and Scientific Services Department, National Zoological Gardens of South Africa, P.O. Box 754, Pretoria 0001, South Africa
b Department of Zoology and Entomology, University of the Free State, QwaQwa Campus, Private Bag x13, Phuthaditjhaba 9866, South Africa
c SaBio, Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, Ronda de Toledo s/n, 13005 Ciudad Real, Spain
d Center for Infection and Immunity of Lille (CIIL), INSERM U1019, CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France
e Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, USA

Article info

Article history: Received 5 March 2014. Received in revised form 2 April 2014. Accepted 17 April 2014. Available online xxx.

Keywords: A. marginale; MSP1a; Prevalence; Selective pressures; MSP1a vaccines

Abstract

Bovine anaplasmosis caused by infection of cattle with Anaplasma marginale has been considered to be endemic in South Africa, an assumption based primarily on the distribution of the tick vectors of A. marginale and serological studies on the prevalence of anaplasmosis in Limpopo, Free State, and North West. However, molecular evidence of the distribution of anaplasmosis has only been reported in the Free State province. In order to establish effective control measures for anaplasmosis, epidemiological surveys are needed to define the prevalence and distribution of A. marginale in South Africa. In addition, a proposed control strategy for anaplasmosis is the development of an A. marginale major surface protein 1a (MSP1a)-based vaccine. Nevertheless, regional variations of this gene would need to be characterized prior to vaccine development for South Africa. The objectives of the present study were therefore to conduct a national survey of the prevalence of A. marginale in South Africa, followed by an evaluation of the diversity and evolution of msp1a in South African strains of A. marginale. To accomplish these objectives, species-specific PCR was used to test 250 blood samples from cattle collected from all South African provinces (including 26 districts and municipalities), except the Free State province where similar studies were reported previously. The prevalence of A. marginale ranged from 65% to 100%, except in Northern Cape province where A. marginale was not detected. A correlation was found between the prevalence and genetic diversity of A. marginale MSP1a. Additionally, the genetic diversity of the A. marginale MSP1a was found to evolve under negative and positive selection, and 23 new tandem repeats in South Africa were shown to have evolved from the extant tandem repeat 4. Despite the MSP1a genetic variability, some types of tandem repeats were found to be conserved among the A. marginale strains, and low-variable peptides in MSP1a tandem repeats were subsequently identified. The results of this research confirmed that anaplasmosis is endemic in South Africa. The results of the molecular characterization of the MSP1a can then be used as the basis for development of novel vaccines for anaplasmosis control in South Africa. © 2014 Elsevier GmbH. All rights reserved.

∗ Corresponding author at: Center for Infection and Immunity of Lille (CIIL), INSERM U1019, CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France. Tel.: +33 631 235 191. E-mail address: [email protected] (A. Cabezas-Cruz).
http://dx.doi.org/10.1016/j.ttbdis.2014.04.011
1877-959X/© 2014 Elsevier GmbH. All rights reserved.

Introduction

Bovine anaplasmosis is a non-contagious tick-borne disease caused by infection of cattle with Anaplasma marginale, an obligate intraerythrocytic bacterium classified in the family Anaplasmataceae, order Rickettsiales (Dumler et al., 2001). This pathogen is transmitted biologically by ticks, mechanically by biting insects and blood-contaminated fomites, and from cow to calf via transplacental transmission (Aubry and Geale, 2011). Five tick species have been shown experimentally to transmit A. marginale in South Africa: Rhipicephalus microplus, R. decoloratus, R. evertsi evertsi, R. simus, and Hyalomma marginatum rufipes (as reviewed by de Waal, 2000). Acute disease in cattle is characterized by weight loss, fever, abortion, low milk production, and in some cases death. The animals that recover from the disease become persistently infected and serve as a reservoir of infection for mechanical transmission and biological transmission by ticks (Kocan et al., 2003). Anaplasmosis is widespread in South Africa and, as estimated by de Waal (2000), 99% of the total cattle population is at risk of acquiring A. marginale infection.

Currently, antimicrobial drugs are not available for the elimination of persistent infections in cattle. Although the World Organization for Animal Health proposed the use of enrofloxacin, imidocarb, and oxytetracycline for the elimination of persistent A. marginale infections in cattle, these antimicrobial drugs were not found to eliminate the persistent A. marginale infections (Coetzee et al., 2005, 2006). Vaccines have been used as an alternative method for control of anaplasmosis. A live vaccine using A. marginale ssp. centrale (A. centrale), a subspecies of relatively low pathogenicity, is available in South Africa, Australia, Israel, and Latin America, but this vaccine has proven to be only partially effective, and reports of vaccination failure are not uncommon (Kocan et al., 2003). While anaplasmosis still constitutes a problem for cattle production in South Africa, sales of vaccines for bovine anaplasmosis in South Africa fell from 800,000 to 200,000 doses over 22 years (1976–1998) (de Waal, 2000).

MSP1a has been shown to have potential use as a vaccine antigen because this protein contains both neutralization-sensitive (Palmer et al., 1987; Allred et al., 1990) and immunodominant epitopes (Garcia-Garcia et al., 2004). Recently, the use of MSP1a for vaccine development for A. marginale has regained attention. The N-terminal tandem repeat region of this protein was used in immunization trials against A. marginale in cattle (Torina et al., 2014) and in laboratory animal models (Santos et al., 2013; Silvestre et al., 2014) and demonstrated promising results. MSP1a is one of 6 MSPs that have been described in A. marginale. This protein is encoded by a single-copy gene, msp1a, which is conserved during the multiplication of the parasite in cattle and ticks (Kocan et al., 2003) and has been useful for epidemiological studies of A. marginale in various regions of the world (Ruybal et al., 2009; Almazán et al., 2008). MSP1a contains tandem repeats at the N terminus of the protein, which present functional residues that serve as adhesins for bovine erythrocytes and tick cells, a prerequisite for infection of host cells (McGarey et al., 1994; de la Fuente et al., 2003a). While the tandem repeats of MSP1a are highly variable, some repeats are commonly represented among worldwide strains (Cabezas-Cruz et al., 2013) and have been shown to evolve under positive selection (de la Fuente et al., 2003b). However, the specific codon positions that evolve under positive or negative selection have not been reported. For development of MSP1a-based vaccines, some characteristics of MSP1a should be taken into consideration, including (i) the extant genetic variability of MSP1a, (ii) the evolution of MSP1a genetic diversity, and (iii) the conservative nature of some tandem repeats among different isolates.

In this study, molecular evidence is provided regarding the prevalence of A. marginale in 8 of the 9 South African provinces. Additionally, evolution of the genetic diversity of A. marginale msp1a in South Africa was studied, demonstrating that different codon positions of this gene evolved under positive or negative selection, likely due to immune selection and transmission fitness. Finally, these studies demonstrated the low variability of some tandem repeats commonly found among A. marginale strains. Collectively, results of this research will contribute toward development of novel vaccines for control of bovine anaplasmosis in South Africa and other regions of the world.

Materials and methods

Study site and samples collection

Blood samples were collected for these studies from May 2011 to July 2013 in 26 districts and municipalities from 8 South African provinces (Fig. 1). Coordinates of the collection sites are provided in Table 1, as well as the farm production systems. Cattle for these studies were randomly selected, and blood samples were collected only from adult animals after the owner's consent. Neither the age nor the sex of the animals was recorded, and vaccination histories were not available for these animals. Blood samples were collected by tail venipuncture using an 18-gauge needle and sterile 10 ml vacutainer EDTA tubes and stored at 4 °C.

Fig. 1. Map of South African areas included in the study. Map of South Africa showing the provinces that were included in the study (Limpopo, Mpumalanga, North West, Gauteng, KwaZulu-Natal, Eastern Cape, Western Cape, and Northern Cape). Endemic, epidemic, and A. marginale-free areas are coloured differently (data collected from de Waal, 2000). The main tick species involved in the transmission of A. marginale in the sampled areas are shown: Rhipicephalus microplus, R. decoloratus, and R. evertsi evertsi (data collected from de Waal, 2000).

Table 1 Sampling sites, farming system, and coordinates in the South African provinces.

| South African provinces | Farming system | Sampling sites | Map coordinates |
|---|---|---|---|
| Limpopo | Communal | Capricorn Dist., Aganang Municipality | 23°40′ S, 29°5′ E |
| Mpumalanga | Communal | Ehlanzeni South Dist. | 25°27′00″ S, 30°58′59″ E |
| Gauteng | Commercial | West Rand Dist., Merafong Municipality, Khutsong South, Carltonville | 26°20′1″ S, 27°19′39″ E; 26°22′ S, 27°24′ E |
| North West | Communal | Moretele Dist., Maubane | 25°16′ S, 28°15′ E |
| KwaZulu-Natal | Communal | Pietermaritzburg Chota, Umhlati Dist., Albert Falls, Shallow drift, Umgungundlovu Dist, Richmond Municipality, Ndaleni Dip Tank | 29°29′17.16″ S, 30°26′27.78″ E; 29°28′33.5″ S, 30°26′11.3″ E; 29°52′55.3″ S, 30°40′56.2″ E |
| Eastern Cape | Mix | Amathole Dist., Nkokobe Municipality, Middledrift, Nxuba Municipality | 26°38′39″ E, 32°46′52″ S; 26°51′25″ E, 32°41′12″ S; 27°11′58″ E, 32°52′37″ S; 26°26′40″ E, 32°44′25″ S; 26°18′18″ E, 32°42′04″ S |
| Western Cape province | Mix | Stellenbosch Dist., Boland | 33°44′28.6″ S, 18°59′53″ E |
| Northern Cape | Communal | John Taolo Gaetsewe Dist., Ga-Segonyana Municipality, Kuruman, Zero Farm | 27°18′53″ S, 23°42′15″ E |

DNA extraction

Genomic DNA was extracted from cattle blood samples using ZR Genomic DNA™ Tissue Miniprep (Zymo Research, CA, USA). DNA was resuspended in DNA elution buffer and stored at −20 °C. The concentration of DNA was determined using the NanoDrop® ND-1000 (NanoDrop Technologies Inc., Wilmington, USA).

A. marginale species-specific PCR

A specific set of primers was used to amplify the msp1a gene: 1733F (5′-TGTGCTTATGGCAGACATTTCC-3′) and 2957R (5′-AAACCTTGTAGCCCCAACTTATCC-3′) (Lew et al., 2002). Primers were synthesized and supplied by Inqaba Biotech [Inqaba Biotechnical Industries (Pty) Ltd., Pretoria, Gauteng, South Africa]. PCR reactions were prepared using Master Mix (Thermo Scientific DreamTaq Green PCR), 0.1–1.0 μM of forward and reverse primers, and 1 μg of DNA template in a 25 μl final volume. The amplification consisted of 40 cycles of 1 min at 94 °C, 1 min at 65 °C, and 2 min at 72 °C. Amplified products were separated in a 1% agarose gel in TBE (89 mM Tris–Borate, 2 mM EDTA, pH 8) using a 1 kb ladder as a DNA size marker (1 kb DNA ladder, Life Sciences, Fermentas GmbH, Germany). DNA was visualized by gel staining in 0.5 μg/ml GelRed (Fermentas GmbH, Germany) under UV illumination and photographed.

DNA sequencing

PCR products containing only single amplification products of msp1a were sequenced at Inqaba Biotech facilities [Inqaba Biotechnical Industries (Pty) Ltd., Pretoria, Gauteng, South Africa]. The termination reactions were performed using BigDye VER3.1 (ABI, Life Technologies, CA, USA) according to the manufacturer's instructions. The labelled fragments were purified using the Zymo Research sequencing clean-up kit (Zymo Research, CA, USA) and subsequently analyzed on a 3500xL Genetic Analyzer (ABI, Life Technologies, CA, USA). The msp1a sequences obtained in this study were submitted to GenBank under accession numbers KC470153–KC470196.

Table 2 Observed prevalences of Anaplasma marginale in different provinces of South Africa.

| South African provinces | No. of blood samples collected per province | Prevalence of A. marginale, msp1a-positive PCR (%) | No. of msp1a sequenced |
|---|---|---|---|
| Limpopo | 20 | 13 (65) | 8 |
| Mpumalanga | 21 | 21 (100) | 2 |
| Gauteng | 20 | 20 (100) | 5 |
| North West | 33 | 24 (72) | 7 |
| KwaZulu-Natal | 55 | 50 (90) | 9 |
| Eastern Cape | 40 | 40 (100) | 3 |
| Western Cape | 16 | 14 (87.5) | 11 |
| Northern Cape | 45 | 0 (0) | 0 |
| Free State^a | 215 | 129 (51.3%) | 29 |

^a Data collected from Mtshali et al. (2007).

Sequence analysis

To identify the msp1a gene sequences obtained in our study, the Nucleotide collection (nr/nt) database was searched using Megablast (optimized for highly similar sequences) on the BLAST server (Zhang et al., 2000). Protein homology and identity analyses were performed using the multiple-alignment program ClustalW (Thompson et al., 1994). The MSP1a tandem repeats found in this study were reported previously (Cabezas-Cruz et al., 2013).

Codon-based phylogenetic analysis of tandem repeats

Codon-based alignment was performed using the codon suite server (Schneider et al., 2005, 2007). Selection pressure on individual codons was detected using 2 methods, single likelihood ancestor counting (SLAC) and fixed effects likelihood (FEL) (Pond and Frost, 2005), as implemented in the Datamonkey webserver (Delport et al., 2010; Pond and Frost, 2005). Positive and negative selection were assigned to codons where the ratio ω = dN (non-synonymous substitutions)/dS (synonymous substitutions) was higher or lower than 1, respectively. The reconstruction of the ancestral amino acid sequence was performed using a neighbour joining tree rooted in tandem repeat 83 under the Dayhoff model of substitutions, which was estimated to be the best model fitting the actual data. Three reconstruction methods were used: joint (Pupko et al., 2000), marginal (Yang et al., 1995), and sample (Nielsen, 2002), which are also available in the Datamonkey webserver.

Amino acid variability, composition and genetic diversity index (GDI)

The amino acid variability was calculated in the variability server (Garcia-Boronat et al., 2008) using the Shannon entropy (H) formula (Shannon, 1948) as follows:

$$H = -\sum_{i=1}^{M} P_i \log_2 P_i$$

where $P_i$ is the fraction of residues of amino acid type $i$, and $M$ is the number of amino acid types at a position. The proportion of variable over conserved positions was calculated as the number of positions with a Shannon variability greater than 0 divided by the number of positions with a Shannon variability of 0. Amino acids were also classified according to their biochemical properties as negatively charged, positively charged, uncharged-polar, and non-polar. For comparison purposes, Shannon variability was additionally calculated in 28 and 43 MSP1a sequences available in GenBank from Venezuela and the USA, respectively. To further characterize the genetic diversity of MSP1a, a GDI was calculated for each A. marginale strain as follows: number of different MSP1a tandem repeats divided by the total number of tandem repeats per strain. Values of 1 and 0 were considered maximum and minimum genetic diversity, respectively.
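The Shannon entropy and the variable-over-conserved proportion described in "Amino acid variability, composition and genetic diversity index (GDI)" can be illustrated with a minimal Python sketch. This is not the code of the variability server used in the study. The function names and the toy alignment are hypothetical, and treating every character in a column (including a gap symbol) as a residue type is an assumption of this example.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy H (in bits) of one alignment column.

    `column` holds the residue observed at one position in each aligned
    tandem repeat. Every distinct character counts as one residue type
    (an assumption of this sketch).
    """
    counts = Counter(column)
    total = len(column)
    # H = -sum(P_i * log2(P_i)), written as sum(P_i * log2(1 / P_i))
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def variability_profile(repeats):
    """Per-position entropy for a list of equal-length aligned repeats."""
    if len({len(r) for r in repeats}) != 1:
        raise ValueError("aligned repeats must all have the same length")
    return [shannon_entropy("".join(col)) for col in zip(*repeats)]


def variable_over_conserved(profile):
    """Positions with H > 0 divided by positions with H == 0."""
    variable = sum(1 for h in profile if h > 0)
    conserved = len(profile) - variable
    if conserved == 0:
        raise ValueError("no conserved positions; the ratio is undefined")
    return variable / conserved


# Toy alignment of four hypothetical 4-residue repeats (not MSP1a data).
toy_repeats = ["ADSS", "ADGS", "ADSA", "ADGA"]
profile = variability_profile(toy_repeats)
print(profile)                           # [0.0, 0.0, 1.0, 1.0]
print(variable_over_conserved(profile))  # 1.0
```

In the toy alignment the first two positions are fully conserved (H = 0) and the last two each hold two residue types at equal frequency (H = 1 bit), so the proportion of variable over conserved positions is 2/2 = 1.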

Results and discussion

Molecular evidence of A. marginale prevalence in South Africa

Bovine anaplasmosis has been considered to be endemic in South Africa, an assumption based primarily on the distribution of the tick vectors (Ndou et al., 2010) and the seroprevalence of A. marginale, which was determined only in the Free State (Dreyer et al., 1998), Limpopo (Rikhotso et al., 2005), and North West (Ndou et al., 2010) provinces. Molecular evidence of endemic bovine anaplasmosis has not been reported in most of South Africa. This study was therefore designed to investigate the molecular evidence of A. marginale infection in cattle from Mpumalanga, Gauteng, Limpopo, North West, KwaZulu-Natal, Eastern Cape, Western Cape, and Northern Cape provinces (Table 2). Molecular diagnosis of anaplasmosis was determined previously in Free State province (Mtshali et al., 2007). In the present study, species-specific msp1a primers were used for PCR assays on 250 DNA samples obtained from cattle blood. The results of these PCR studies confirmed that A. marginale is widespread in South Africa, but with a variable prevalence in all the sampled provinces, except for Northern Cape in which no positive samples were detected (Table 2). The absence of A. marginale-positive PCR in Northern Cape is not surprising because this area is considered to be free of tick vectors (Mtshali and Mtshali, 2013), and the prevalence of Babesia spp., another tick-borne pathogen, was also found to be very low in this province (Mtshali and Mtshali, 2013). The provinces with the highest prevalences of A. marginale were Mpumalanga, Gauteng, and Eastern Cape with an infection rate of 100%. In the other provinces, A. marginale infections in cattle ranged from 65% to 90% (Table 2 and Fig. 3). Differences in the prevalence of A. marginale were not observed between commercial and communal farming systems. The prevalence of A. marginale in cattle from South Africa can be considered high when compared to the prevalence of A. marginale in cattle from other regions of the world. For example, in Brazil, a recent study found 70% prevalence of A. marginale in a cattle herd (Pohl et al., 2013). A study by de la Fuente et al. (2005) showed a range from 25% to 100% of A. marginale prevalence in cattle herds from Italy. Analysis of the prevalence and genetic diversity of A. marginale MSP1a in different geographic regions constitutes an important step toward development of effective MSP1a-based vaccines because the antigenic composition of the vaccine should contain MSP1a variants present in different regions of the world.

Fig. 2. Newly reported sequences of MSP1a tandem repeats. The one-letter amino acid code was used to depict MSP1a repeat sequences. Dots indicate identical amino acids, and gaps indicate deletions/insertions. The ID of each repeat form was assigned previously in Cabezas-Cruz et al. (2013). Tandem repeat A was used as a model for amino acid comparison.

A. marginale prevalence and MSP1a genetic diversity

The sequence of A. marginale MSP1a was analyzed in 44 strains (Table 3). We found 52 different types of tandem repeats among South African strains (including those reported by Mtshali et al. (2007) from Free State province), 23 of which were described for the first time (Fig. 2). Using a GDI, the genetic diversity per A. marginale strain was described (Table 3), and the average genetic diversity was calculated per province. A polynomial correlation (R² = 0.76) occurred between the GDI and the prevalence of anaplasmosis per province (Fig. 3). Interestingly, provinces with 100% A. marginale prevalence were not the provinces with the highest or lowest GDI; rather, these provinces fell within a narrow range of genetic diversity (0.82–0.87). Two driving forces, immune selection and transmission fitness, have been suggested to impact genetic diversification of A. marginale (Palmer and Brayton, 2013). Immune pressure in cattle may induce greater genetic diversity in A. marginale populations in order to ensure persistent infection. At the same time, the development of antigenic variation was suggested to have a transmission cost (Palmer and Brayton, 2013). Anaplasma marginale strains with lower transmission fitness due to high genetic variability would be at risk of elimination from the population (Palmer and Brayton, 2013). Furthermore, high incidence of ticks correlated with increased MSP1a genetic variability in Argentina (Ruybal et al., 2009). These apparently contradictory aspects may be reconciled if increased tick transmission contributes to greater circulation of A. marginale strains in the cattle population, favouring the interaction between A. marginale strains and potentially resistant hosts. In this study, the observed range of GDI in which 100% prevalence (an indicator of transmission efficiency) exists may reflect a positive balance between genetic diversity and biological transmission by ticks, taking into account that, as shown in Fig. 1, all the studied areas except for Northern Cape are infested by ticks. Notably, most of the new tandem repeats shown in Fig. 2 were sequenced from KwaZulu-Natal province, which has 90% prevalence of A. marginale. When KwaZulu-Natal was excluded from the correlation analysis between A. marginale prevalence and GDI (Fig. 3), the R² of the polynomial regression increased from 0.76 to 0.98. A possible explanation for this result may be that the tandem repeat diversification found among KwaZulu-Natal MSP1a provided a degree of transmission fitness to the strains present in that area. Another explanation could be that cattle immunity is increasing in the area, inducing A. marginale diversification for antigenic variation. One interesting question that remains unanswered is how genetic diversity of A. marginale msp1a emerges in a specific region.

Evolution of MSP1a genetic diversity

In order to analyze the evolution of MSP1a genetic diversity observed in our samples, the tandem repeats (Table 3) were classified as "frequent" (present more than 22 times) or "rare" (present fewer than 10 times) based on the frequency of their appearance among South African A. marginale strains, including those from Free State province reported by Mtshali et al. (2007). The unique tandem repeats found in this study were classified as rare; most of them occurred only once, in a single strain (Fig. 2). In contrast, tandem repeats 3, 4, 13, 34, Q, and 37 had a high frequency (Table 3) and were also reported in A. marginale strains from Israel (3, 4), South America (4, 13), and Europe (Q). Repeat sequences 34 and 37 were abundant only in South Africa with rare exceptions (as reviewed by Cabezas-Cruz et al., 2013). Considering this, we wanted to test whether the tandem repeats newly described in this study (Fig. 2) originated from extant A. marginale MSP1a tandem repeats or had evolved from a tandem repeat that was lost after tandem repeat differentiation. To test this hypothesis, we reconstructed the ancestral state (see 'Materials and methods') of the new tandem repeats presented in Fig. 2. Surprisingly, the ancestral state of all the new tandem repeats was tandem repeat 4 (Fig. 4), suggesting that all of the newly described tandem repeats from South African A. marginale strains evolved from this tandem repeat. In addition, this evidence suggests that these repeated sequences may constitute a group of recently evolved tandem repeats specific to South Africa and, in fact, these repeats have not been reported elsewhere (Cabezas-Cruz et al., 2013). The mechanism for the

Table 3 Anaplasma marginale strains and MSP1a tandem repeats organization.

| A. marginale strain | Origin | Structure of MSP1a tandem repeats | GDI^a | AVE-GDI/STDEV |
|---|---|---|---|---|
| LP-7 | Limpopo | 34 159 | 1 | 0.917/0.144 |
| LP-10 | Limpopo | 27 13 3 36 | 1 | |
| LP-30 | Limpopo | 27 13 3 | 1 | |
| LP-34 | Limpopo | 34 13 3 38 | 1 | |
| LP-37 | Limpopo | 27 13 13 37 | 0.75 | |
| LP-46 | Limpopo | 3 38 | 1 | |
| LP-50 | Limpopo | 34 13 13 | 0.667 | |
| MP-C2 | Mpumalanga | 34 13 158 37 | 1 | 0.875/0.177 |
| MP-C5 | Mpumalanga | 15 15 100 83 | 0.75 | |
| NW-C2 | North West | 27 13 4 4 37 | 0.8 | 0.960/0.089 |
| NW-C4 | North West | 27 13 4 37 | 1 | |
| NW-C5 | North West | 82 13 79 4 37 | 1 | |
| NW-C1-160312 | North West | 34 13 3 36 38 | 1 | |
| NW-C4-160312 | North West | 34 36 38 3 | 1 | |
| GP-C1 | Gauteng | 82 13 4 4 37 | 0.8 | 0.826/0.173 |
| GP-C2 | Gauteng | 34 27 3 38 13 3 38 | 0.714 | |
| GP-C5 | Gauteng | 3 4 4 4 37 | 0.6 | |
| GP-C1112105 | Gauteng | 34 37 | 1 | |
| GP-C4117105 | Gauteng | 3 36 38 | 1 | |
| GP-C7117105 | Gauteng | 34 13 13 | 0.667 | |
| GP-C1817105 | Gauteng | 34 13 37 | 1 | |
| KZN-D | KwaZulu-Natal | 42 43 25 161 31 | 1 | 0.919/0.128 |
| KZN-F | KwaZulu-Natal | 42 43 25 31 31 | 0.8 | |
| KZN-K | KwaZulu-Natal | 27 13 4 4 37 | 0.8 | |
| KZN-Y | KwaZulu-Natal | 143 144 145 146 | 1 | |
| KZN-MM | KwaZulu-Natal | 42 43 25 31 | 1 | |
| KZN-14 | KwaZulu-Natal | 142 43 25 31 | 1 | |
| KZN-19 | KwaZulu-Natal | 141 140 140 | 0.667 | |
| KZN-49 | KwaZulu-Natal | 147 148 149 150 | 1 | |
| KZN-51 | KwaZulu-Natal | 147 | 1 | |
| EC-22 | Eastern Cape | 27 13 4 4 37 | 0.8 | 0.867/0.115 |
| EC-23 | Eastern Cape | 151 152 4 4 153 | 0.8 | |
| EC-24 | Eastern Cape | 27 13 4 | 1 | |
| WC-4 | Western Cape | 40 Q Q m | 0.75 | 0.741/0.286 |
| WC-6 | Western Cape | 3 4 4 37 | 0.75 | |
| WC-7 | Western Cape | M M M M | 0.25 | |
| WC-8 | Western Cape | 34 4 37 | 1 | |
| WC-10 | Western Cape | 154 | 1 | |
| WC-11 | Western Cape | 40 Q Q Q Q 37 | 0.429 | |
| WC-12 | Western Cape | 27 13 37 | 1 | |
| WC-13 | Western Cape | M Q M Q M | 0.4 | |
| WC-14 | Western Cape | 155 36 38 | 1 | |
| WC-15 | Western Cape | 160 13 37 4 161 | 1 | |
| WC-16 | Western Cape | 34 13 4 13 13 4 37 | 0.571 | |

The most common tandem repeats are, in decreasing order of abundance, 13, 4, and 37, followed by 3 and 34, which have the same frequency.
^a Genetic diversity index of MSP1a (GDI) was calculated as follows: number of different MSP1a tandem repeats/total number of tandem repeats per strain. 1 is maximum genetic diversity. The average of GDI (AVE-GDI) and standard deviation (STDEV) for each region is shown.
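The GDI defined in the footnote of Table 3 can be reproduced with a short Python sketch. The repeat structures below are copied from Table 3 for the Mpumalanga and Eastern Cape strains. The function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption that happens to match the STDEV values printed for these two provinces.

```python
from statistics import mean, stdev


def gdi(repeat_structure):
    """Genetic diversity index of one strain.

    `repeat_structure` is a whitespace-separated string of tandem repeat
    IDs, e.g. "27 13 13 37". GDI = distinct repeats / total repeats.
    """
    repeats = repeat_structure.split()
    if not repeats:
        raise ValueError("a strain needs at least one tandem repeat")
    return len(set(repeats)) / len(repeats)


def province_summary(structures):
    """Average GDI and sample standard deviation, rounded to 3 decimals."""
    values = [gdi(s) for s in structures]
    return round(mean(values), 3), round(stdev(values), 3)


# Repeat structures taken from Table 3.
mpumalanga = ["34 13 158 37", "15 15 100 83"]                      # MP-C2, MP-C5
eastern_cape = ["27 13 4 4 37", "151 152 4 4 153", "27 13 4"]      # EC-22, EC-23, EC-24

print(gdi("27 13 13 37"))              # LP-37: 3 distinct / 4 total = 0.75
print(province_summary(mpumalanga))    # (0.875, 0.177)
print(province_summary(eastern_cape))  # (0.867, 0.115)
```

A strain whose repeats are all different scores 1, while a strain made of one repeat copied several times approaches 0, so the index rewards diversity within a strain rather than the number of repeats.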

Fig. 3. Correlation between msp1a genetic diversity and A. marginale prevalence in South Africa. The prevalences of anaplasmosis in different South African provinces were plotted against the average of genetic diversity index (GDI). GDI was calculated for each strain as the number of different tandem repeats divided by the total number of tandem repeats. A polynomial correlation was found between these 2 parameters with R² = 0.76. Provinces with 100% prevalence (Gauteng, Eastern Cape, and Mpumalanga) are in the range of 0.82–0.87 GDI while regions having less than 100% prevalence are out of this range.

generation of genetic diversity of A. marginale could be gene duplication followed by either mutation or mismatch repair as proposed by Palmer and Brayton (2013). The variable number of tandem repeats found among MSP1a sequences suggests that tandem repeat duplication due to homologous recombination may occur in order to give a "substrate" for "testing" competitive advantage of new tandem repeat variants. As demonstrated by this research, new tandem repeats shown in Fig. 2 evolved from an initial tandem repeat 4 which served as a template for genetic variation. To test whether tandem repeat 4 evolved to the new forms under selective pressures, the ratio ω (see 'Materials and methods') was calculated for each codon position of the new tandem repeats from South Africa. We found that the diversification of tandem repeat 4 in South African strains occurred under both positive and negative selective pressures (Table 4 and Fig. 4). Surprisingly, the positions evolving under negative selection (Fig. 4, negative signs), 8 and 10, were reported before to be present in an immunodominant B-cell epitope present in MSP1a (Fig. 4, first boxed area, Garcia-Garcia et al., 2004), and position 25, also evolving under negative selection, was found previously in a neutralization-sensitive epitope (Fig. 4, second boxed area; Palmer et al., 1987; Allred et al., 1990). In contrast, one of the positions found to be evolving under positive

Fig. 4. Reconstruction of ancestral amino acid sequence and amino acid positions under positive and negative selection. The reconstruction of the ancestral state (TR 4: Tandem repeat 4, sequence marked with *) of the new tandem repeats found in South Africa (Fig. 2) was performed using 3 reconstruction methods, namely: joint, marginal, and sample (see ‘Materials and methods’ for details). Positions that evolve under negative (−) and positive (+) selection are shown (see Table 4). Amino acid at position 20 is indicated (arrow). The residues of the immunodominant B-cell epitope (Garcia-Garcia et al., 2004) (first box) and the neutralization-sensitive epitope (Palmer et al., 1987; Allred et al., 1990) (second box) are also shown.

Table 4 Sites that evolved under positive and negative selection in the new tandem repeats from South Africa.

| Codons | SLAC ω (dN/dS) | SLAC p value | FEL ω (dN/dS) | FEL p value | Type of selection |
|---|---|---|---|---|---|
| 1 | 5.49 | 0.043 | Infinite | 0.012 | Positive |
| 6 | 1.58 | 0.311 | Infinite | 0.093 | Positive |
| 16 | 1.39 | 0.371 | Infinite | 0.196 | Positive |
| 20 | 2.86 | 0.172 | Infinite | 0.133 | Positive |
| 28 | 2.22 | 0.197 | Infinite | 0.079 | Positive |
| 8 | −1.66 | 0.373 | −5.14 | 0.236 | Negative |
| 10 | −6.81 | 0.037 | −13.89 | 0.016 | Negative |
| 25 | −3.57 | 0.134 | −6.56 | 0.110 | Negative |
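The assignment rule stated in 'Materials and methods' (positive selection where ω = dN/dS is above 1, negative selection where it is below 1) can be sketched in Python as follows. This only illustrates the rule and is not the SLAC or FEL implementation of the Datamonkey webserver. The function names and the per-codon rates are hypothetical, and returning an infinite ratio when dS is 0 is an assumption of this example.

```python
import math


def omega(dn, ds):
    """Ratio of non-synonymous (dN) to synonymous (dS) substitution rates."""
    if dn < 0 or ds < 0:
        raise ValueError("substitution rates cannot be negative")
    if ds == 0:
        # Assumption of this sketch: no synonymous change gives an
        # infinite ratio when dN > 0 and an undefined one otherwise.
        return math.inf if dn > 0 else math.nan
    return dn / ds


def selection_type(w):
    """Classify a codon from its omega value."""
    if math.isnan(w) or w == 1:
        return "undetermined"
    return "positive" if w > 1 else "negative"


# Hypothetical per-codon rates (dN, dS); these are not values from this study.
rates = {"codon_a": (1.5, 0.5), "codon_b": (0.25, 1.0), "codon_c": (0.5, 0.0)}
for codon, (dn, ds) in rates.items():
    w = omega(dn, ds)
    print(codon, w, selection_type(w))
```

The rule only gives the direction of selection. Whether a codon is reported as selected also depends on the p value returned by each method, which this sketch does not model.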

-45- Chapter II.III selection was the amino acid position 20 (Fig. 4, arrow), which has been implicated in the binding of MSP1a to tick cells extract (de la Fuente et al., 2003a). These findings support the hypothesis dis- cussed above that immune escape and tick transmission are both driving forces of A. marginale MSP1a genetic diversification and sug- gest that tick transmission and immune escape would be triggers of the observed diversification of tandem repeat 4 in South Africa. This trend of purifying selection/negative selection (or deletion of the unfit) for sites involved in immune recognition by the host was confirmed, in wider frame, by the following data: (i) Most of the residues of the MSP1a immunodominant B-cell epitope (Garcia- Garcia et al., 2004) were deleted in the tandem repeats forms ␣ and 108. The tandem repeat ␣ is widespread in strains from Mexico, Brazil, Venezuela, Argentina, and Taiwan and is also present in the most common A. marginale strains of the world which have the tan- dem repeat composition ␣, ␤, ␤, ␤,  (Cabezas-Cruz et al., 2013). (ii) The first glutamine (Q) of the neutralization-sensitive epitope (Palmer et al., 1987; Allred et al., 1990) is deleted in several tan- dem repeats (A, D, E, ␥, , ␸, 5–9, 14, 31, 36, 52, 57, 60–66, 69–72, 76, 84–86, 95–99, 105, 107, 116–119, 129–131, 136–139) from A. marginale strains reported previously (Cabezas-Cruz et al., 2013). As seen in Fig. 4, Q was deleted in some of South African new tandem repeats (146, 147, 148, 149, and 150). Collectively, this evidence suggests that purifying selection is most likely one of the mecha- nisms that A. marginale had evolved to escape immune recognition toward MSP1a. Tandem repeat 4 from MSP1a was also found in A. marginale strains isolated from an anaplasmosis outbreak in Mexico where R. microplus was implicated as tick vector (Almazán et al., 2008), and new tandem repeats were reported also in this study. 
It should be interesting to test whether these newly described Mexican MSP1a variants evolved from tandem repeat 4. We consider these findings to be relevant to the development of MSP1a-based vaccines and suggest that MSP1a-based vaccination should be combined with tick control strategies in order to minimize genetic diversity of A. marginale MSP1a. A recent study combined the use of a tick-protective antigen with MSP1a in the same vaccine formulation, directed toward the control of both tick infestations and anaplasmosis (Torina et al., 2014).

Amino acid variability and low-variable MSP1a peptides

Genetic variability could also be analyzed by calculating the amino acid variability in each position of the tandem repeat. In order to compare the MSP1a genetic variability in South Africa with other regions of the world, we calculated the amino acid variability for 29 amino acid positions from all the MSP1a tandem repeats found in South Africa, in the USA, and in Venezuela which were available in GenBank and collected by Cabezas-Cruz et al. (2013). The amino acid variability of MSP1a tandem repeats from South Africa can be seen in Fig. 5 in comparison with those from the USA (low) and Venezuela (high). In agreement with this, the average of amino acid variability was 0.30, 0.42, and 0.72 for USA, South Africa, and Venezuela, respectively (Fig. 5A–C). In comparison, the proportion of variable over conserved positions in MSP1a tandem repeats was higher in Venezuela (29) compared to those of both the USA (0.85) and South Africa (2.2). Epitopes from MSP1a were found to induce a protective immune response against A. marginale (Santos et al., 2013). Considering this, using the variability server (Garcia-Boronat et al., 2008), we explored whether some low-variable peptides could be found in MSP1a from South Africa, the USA, and Venezuela. We observed that the amino acid variability of MSP1a tandem repeats from South Africa and the USA supported the existence of low-variable peptides (Fig. 5), which may be useful in the development of peptide-based MSP1a vaccines, while the high amino acid variability of MSP1a tandem repeats from Venezuela does not support this type of low-variable peptide. Interestingly, the low-variable peptide in South Africa overlaps with the position of the immunodominant B-cell epitope reported previously for MSP1a (Garcia-Garcia et al., 2004).

Fig. 5. MSP1a amino acid variability, composition, and low-variable peptides. The figure shows the amino acid variability and composition among the tandem repeats in 3 different countries: Venezuela (A), USA (B), and South Africa (C). Venezuela shows a high proportion of variable/conserved sites and a high average of amino acid variability while South Africa and the USA show middle and lower values, respectively. Different colours in columns depict different biochemical properties in the amino acid composition: negative (green), positive (red), uncharged-polar (beige), and non-polar (blue); proportion of deleted positions is shown in yellow. Consensus sequences of low-variable peptides are shown (*) for the USA and South Africa. The region of the immunodominant B-cell epitope from A. marginale (Garcia-Garcia et al., 2004) is boxed in the low-variable peptide from South Africa.

Conclusions

In this molecular study, bovine anaplasmosis was confirmed to be widespread in South Africa. The genetic diversity of A. marginale MSP1a was found to occur through the evolution of extant tandem repeats which, under positive and negative selection, diversify to new tandem repeat variants that are likely constricted by 2 forces: the host immune system and tick transmission. Furthermore, common MSP1a tandem repeats were present among A. marginale strains in South Africa and other regions of the world which in combination with the use of low-variable MSP1a peptides will contribute to the development of MSP1a-based vaccines for the control of anaplasmosis.
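The per-position amino acid variability used in this chapter can be illustrated with a short calculation. The sketch below assumes that variability is measured as Shannon entropy per alignment column (Shannon, 1948, appears in the reference list, but the exact settings used with the variability server are not stated here). The toy sequences and the function names are hypothetical and are not MSP1a data.

```python
import math
from collections import Counter


def shannon_variability(column):
    """Shannon entropy (in bits) of one alignment column; '-' counts as a symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def per_position_variability(repeats):
    """Entropy of every column in a set of aligned, equally long repeats."""
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("repeats must be aligned to the same length")
    return [shannon_variability([r[i] for r in repeats]) for i in range(length)]


# Hypothetical toy alignment (NOT real MSP1a repeats); '-' marks a deleted position.
toy_repeats = ["ADSS", "ADSS", "TDS-", "ADGS"]

scores = per_position_variability(toy_repeats)
average = sum(scores) / len(scores)
variable = sum(1 for s in scores if s > 0)   # columns with more than one symbol
conserved = len(scores) - variable           # columns with a single symbol

print([round(s, 3) for s in scores], round(average, 3), variable, conserved)
```

Applied to a real set of aligned tandem repeats, the average of `scores` and the ratio of `variable` to `conserved` columns would correspond to the two summary values compared between countries, under the entropy assumption stated above.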


Conflict of interest

The authors declare no conflict of interest.

Acknowledgements

This research was funded by National Research Foundation. The authors are grateful to the Directors of Department of Agriculture and Veterinary Institute from Mpumalanga, North West, Kwa-Zulu Natal, Eastern Cape, and Western Cape (Prof Nomfundo Mnisi, Dr Mlilo, Dr Songelwayo Chisi, Dr L Mrwebi, Dr Ronald Sinclair) with the assistance from the Veterinary Technicians (Mr Toypat Mdluli, Ms Keneilwe Constance Kaotane, Mr Jan Masethe Maime, Mr Anil Suresh, Mr Erwin Lucas) and the farmers who facilitated sample collection. Dr Chris Marufu for sample donations from Eastern Cape. Ms Sinesipho Ntanta (DST intern) for her wonderful help in the laboratory.

References

Allred, D.R., McGuire, T.C., Palmer, G.H., Leib, S.R., Harkins, T.M., McElwain, T.F., Barbet, A.F., 1990. Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc. Natl. Acad. Sci. USA 87, 3220–3224.
Almazán, C., Medrano, C., Ortiz, M., de la Fuente, J., 2008. Genetic diversity of Anaplasma marginale strains from an outbreak of bovine anaplasmosis in an endemic area. Vet. Parasitol. 158, 103–109.
Aubry, P., Geale, D.W., 2011. A review of bovine anaplasmosis. Transbound. Emerg. Dis. 58, 1–30.
Cabezas-Cruz, A., Passos, L.M.F., Lis, K., Kenneil, R., Valdés, J.J., Ferrolho, J., Tonk, M., Pohl, A.E., Grubhoffer, L., Zweygarth, E., Shkap, V., Ribeiro, M.F.B., Estrada-Peña, A., Kocan, K.M., de la Fuente, J., 2013. Functional and immunological relevance of Anaplasma marginale major surface protein 1a sequence and structural analysis. PLoS ONE 8, e65243.
Coetzee, J.F., Apley, M.D., Kocan, K.M., Rurangirwa, F.R., Van Donkersgoed, J., 2005. Comparison of three oxytetracycline regimes for the treatment of persistent Anaplasma marginale infections in beef cattle. Vet. Parasitol. 127, 61–73.
Coetzee, J.F., Apley, M.D., Kocan, K.M., 2006. Comparison of the efficacy of enrofloxacin, imidocarb, and oxytetracycline for clearance of persistent Anaplasma marginale infections in cattle. Vet. Ther. 7, 347–360.
de la Fuente, J., Garcia-Garcia, J.C., Blouin, E.F., Kocan, K.M., 2003a. Characterization of the functional domain of major surface protein 1a involved in adhesion of the rickettsia Anaplasma marginale to host cells. Vet. Microbiol. 91, 265–283.
de la Fuente, J., Van Den Bussche, R.A., Prado, T.M., Kocan, K.M., 2003b. Anaplasma marginale msp1α genotypes evolved under positive selection pressure but are not markers for geographic isolates. J. Clin. Microbiol. 41, 1609–1616.
de la Fuente, J., Torina, A., Naranjo, V., Caracappa, S., Vicente, J., Mangold, A.J., Vicari, D., Alongi, A., Scimeca, S., Kocan, K.M., 2005. Genetic diversity of Anaplasma marginale strains from cattle farms in the province of Palermo, Sicily. J. Vet. Med. B. Infect. Dis. Vet. Public Health 52, 226–229.
de Waal, D.T., 2000. Anaplasmosis control and diagnosis in South Africa. Ann. N.Y. Acad. Sci. 916, 474–483.
Dreyer, K., Fourie, L.J., Kok, D.J., 1998. Epidemiology of tick-borne diseases of cattle in Botshabelo and Thaba Nchu in the Free State Province. Onderstepoort J. Vet. Res. 65, 285–289.
Delport, W., Poon, A.F., Frost, S.D., Kosakovsky Pond, S.L., 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26, 2455–2457.
Dumler, J.S., Barbet, A.F., Bekker, C.P.J., Dasch, G.A., Palmer, G.H., Ray, S.C., Rikihisa, Y., Rurangirwa, F.R., 2001. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165.
Garcia-Boronat, M., Diez-Rivero, C.M., Reinherz, E.L., Reche, P.A., 2008. PVS: a web server for protein sequence variability analysis tuned to facilitate conserved epitope discovery. Nucl. Acids Res. 36, 35–41.
Garcia-Garcia, J.C., de la Fuente, J., Kocan, K.M., Blouin, E.F., Halbur, T., Onet, V.C., Saliki, J.T., 2004. Mapping of B-cell epitopes in the N-terminal repeated peptides of Anaplasma marginale major surface protein 1a and characterization of the humoral immune response of cattle immunized with recombinant and whole organism antigens. Vet. Immunol. Immunopathol. 98, 137–151.
Kocan, K.M., de la Fuente, J., Guglielmone, A.A., Melendéz, R.D., 2003. Antigens and alternatives for control of Anaplasma marginale infection in cattle. Clin. Microbiol. Rev. 16, 698–712.
Lew, A.E., Bock, R.E., Minchin, C.M., Masaka, S., 2002. A msp1a polymerase chain reaction assay for specific detection and differentiation of Anaplasma marginale isolates. Vet. Microbiol. 86, 325–335.
McGarey, D.J., Barbet, A.F., Palmer, G.H., McGuire, T.C., Allred, D.R., 1994. Putative adhesins of Anaplasma marginale: major surface polypeptides 1a and 1b. Infect. Immun. 62, 4594–4601.
Mtshali, M.S., de la Fuente, J., Ruybal, P., Kocan, K.M., Vicente, J., Mbati, P.A., Shkap, V., Blouin, E.F., Mohale, N.E., Moloi, T.P., Spickett, A.M., Latif, A.A., 2007. Prevalence and genetic diversity of Anaplasma marginale strains in cattle in South Africa. Zoonoses Public Health 54, 23–30.
Mtshali, M.S., Mtshali, P.S., 2013. Molecular diagnosis and phylogenetic analysis of Babesia bigemina and Babesia bovis hemoparasites from cattle in South Africa. BMC Vet. Res. 9, 154.
Ndou, R.V., Diphahe, T.P., Dzoma, B.M., Motsei, L.E., 2010. The seroprevalence and endemic stability of anaplasmosis in cattle around Mafikeng in the North West Province, South Africa. Vet. Res. 1, 1–3.
Nielsen, R., 2002. Mapping mutations on phylogenies. Syst. Biol. 51, 729–739.
Palmer, G.H., Brayton, K.A., 2013. Antigenic variation and transmission fitness as drivers of bacterial strain structure. Cell. Microbiol. 15, 1969–1975.
Palmer, G.H., Waghela, S.D., Barbet, A.F., Davis, W.C., McGuire, T.C., 1987. Characterization of a neutralization-sensitive epitope on the Am 105 surface protein of Anaplasma marginale. Int. J. Parasitol. 17, 1279–1285.
Pohl, A.E., Cabezas-Cruz, A., Ribeiro, M.F.B., Silveira, J.A.G., Silaghi, C., Pfister, K., Passos, L.M.F., 2013. Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Rev. Bras. Parasitol. Vet. 22, 129–135.
Pond, S.L., Frost, S.D., 2005. Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 21, 2531–2533.
Pupko, T., Shamir, I.P.R., Graur, D., 2000. A fast algorithm for joint reconstruction of ancestral amino acid sequences. Mol. Biol. Evol. 17, 890–896.
Rikhotso, B.O., Stoltsz, W.H., Bryson, N.R., Sommerville, J.E., 2005. The impact of 2 dipping systems on endemic stability to bovine babesiosis and anaplasmosis in cattle in 4 communally grazed areas in Limpopo Province, South Africa. J. S. Afr. Vet. Assoc. 76, 217–223.
Ruybal, P., Moretta, R., Perez, A., Petrigh, R., Zimmer, P., Alcaraz, E., Echaide, I., Torioni de Echaide, S., Kocan, K.M., de la Fuente, J., Farber, M., 2009. Genetic diversity of Anaplasma marginale in Argentina. Vet. Parasitol. 162, 176–180.
Santos, P.S., Sena, A.A., Nascimento, R., Araújo, T.G., Mendes, M.M., Martins, J.R., Mineo, T.W., Mineo, J.R., Goulart, L.R., 2013. Epitope-based vaccines with the Anaplasma marginale MSP1a functional motif induce a balanced humoral and cellular immune response in mice. PLoS ONE 8, e60311.
Schneider, A., Cannarozzi, G.M., Gonnet, G.H., 2005. Empirical codon substitution matrix. BMC Bioinformatics 6, 134.
Schneider, A., Gonnet, G., Cannarozzi, G., 2007. SynPAM – a distance measure based on synonymous codon substitutions. IEEE/ACM Trans. Comput. Biol. Bioinform. 4, 553–560.
Shannon, C.E., 1948. The mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423.
Silvestre, B.T., Rabelo, E.M., Versiani, A.F., da Fonseca, F.G., Silveira, J.A., Bueno, L.L., Fujiwara, R.T., Ribeiro, M.F., 2014. Evaluation of humoral and cellular immune response of BALB/c mice immunized with a recombinant fragment of MSP1a from Anaplasma marginale using carbon nanotubes as a carrier molecule. Vaccine, http://dx.doi.org/10.1016/j.vaccine.2014.02.062 (Epub ahead of print).
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucl. Acid. Res. 22, 4673–4680.
Torina, A., Moreno-Cid, J.A., Blanda, V., Fernández de Mera, I.G., de la Lastra, J.M., Scimeca, S., Blanda, M., Scariano, M.E., Briganò, S., Disclafani, R., Piazza, A., Vicente, J., Gortázar, C., Caracappa, S., Lelli, R.C., de la Fuente, J., 2014. Control of tick infestations and pathogen prevalence in cattle and sheep farms vaccinated with the recombinant subolesin-major surface protein 1a chimeric antigen. Parasit. Vectors 7, 10.
Yang, Z., Kumar, S., Nei, M., 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 14, 1641–1650.
Zhang, Z., Schwartz, S., Wagner, L., Miller, W., 2000. A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203–214.

-47- Chapter II.IV

Low genetic diversity associated to low prevalence of Anaplasma

marginale in water buffaloes.

Silva J B., Fonseca A H., Barbosa J D., Cabezas-Cruz A., de la Fuente J. 2014. Low genetic diversity associated to low prevalence of Anaplasma marginale in water buffaloes. Ticks and Tick-borne Diseases. In press.


Low genetic diversity associated with low prevalence of Anaplasma

marginale in water buffaloes in Marajó Island, Brazil

Jenevaldo B. Silva1, Adivaldo H. Fonseca2, José D. Barbosa3, Alejandro Cabezas-Cruz4,5, José de la Fuente4,6,*

1Departamento de Patologia Veterinária, Universidade Estadual Paulista, Jaboticabal, São Paulo, Brasil.

2Departamento de Clínica Veterinária, Universidade Federal do Pará, Castanhal, Pará, Brasil

3Departamento de Parasitologia, Universidade Federal Rural do Rio de Janeiro, Seropédica, Rio de Janeiro, Brasil.

4SaBio. Instituto de Investigación en Recursos Cinegéticos IREC, CSIC-UCLM-JCCM, Ronda de Toledo s/n, 13005 Ciudad Real, Spain.

5Center for Infection and Immunity of Lille (CIIL), INSERM U1019 – CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France.

6Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, USA.

* Corresponding author: José de la Fuente. SaBio. Instituto de Investigación en Recursos Cinegéticos, Ronda de Toledo s/n, 13005 Ciudad Real, Spain. Phone: +34 926295450. E-mail: [email protected]

Abstract

The rickettsia Anaplasma marginale is the etiologic agent of bovine anaplasmosis, an important tick-borne disease affecting cattle in tropical and subtropical regions of the world. In endemic regions, the genetic diversity of this pathogen is usually related to the high prevalence of the disease in cattle. The major surface protein 1 alpha (MSP1a) has been used as a marker to characterize the genetic diversity and for geographical identification of A. marginale strains. The present study reports the characterization of A. marginale MSP1a diversity in water buffaloes. Blood samples were collected from 200 water buffaloes on Marajó Island, Brazil, where the largest buffalo herd in the Western hemisphere is located. Fifteen buffaloes (7.5%) were positive for A. marginale msp1α by PCR. Four different strains of A. marginale with MSP1a tandem repeat structures (4-63-27), (162-63-27), (78-24-24-25-31) and (τ-10-10-15) were found, with (4-63-27) being the most common. MSP1a tandem repeat composition in buffaloes and phylogenetic analysis using the msp1α gene showed that the A. marginale strains identified in buffaloes are closely related to A. marginale strains from cattle. The results demonstrated low genetic diversity of A. marginale associated with low bacterial prevalence in buffaloes and suggested that buffaloes may be reservoirs of this pathogen for cattle living in the same area. The results also suggested that mechanical transmission, and not biological transmission by ticks, might be playing the major role in pathogen circulation among water buffaloes on Marajó Island, Brazil.

Keywords: Anaplasma marginale, Buffalo, MSP1a, genetics, anaplasmosis


Introduction

Anaplasma marginale (Rickettsiales: Anaplasmataceae) is the most prevalent pathogen transmitted by ticks worldwide, distributed on the six continents and responsible for high morbidity and mortality in cattle in temperate, subtropical, and tropical regions (Vidotto et al., 1998; Kocan et al., 2010). Bacteria of the genus Anaplasma are obligate intracellular pathogens that can be transmitted biologically by ticks, mechanically by hematophagous insects and blood-contaminated fomites and less frequently transplacentally (Kocan et al., 2010).

The global distribution and high pathogenicity of A. marginale is due to the diversity and genetic variability of this bacterium (de la Fuente et al., 2007). This pathogen has over 20 proteins capable of inducing protective immunity (Suarez and Noh, 2011) from which major surface proteins (MSPs) have been extensively characterized (Kocan et al., 2010). Among the major surface proteins (MSPs), special attention has been directed to MSP1a because it is involved in the interaction of the bacterium with vertebrate and invertebrate host cells (de la Fuente et al., 2010). Several strains of A. marginale have been identified worldwide and these strains differ in their morphology, MSP1a amino acid sequence, antigenic characteristics, and ability to be transmitted by ticks (de la Fuente et al., 2007; Estrada-Peña et al., 2009; Cabezas-Cruz et al., 2013).

The primary host for A. marginale is cattle, but other ruminants such as deer and buffaloes can also be infected (Kocan et al., 2010). Approximately 300,000 buffaloes are geographically isolated on Marajó Island, Brazil, representing the largest buffalo herd in the Western hemisphere, and these animals have been used as a primary source of meat, milk, and leather, besides being used to plow the land and to transport people and crops (IBGE, 2012). Serological and molecular detection of A. marginale in water buffaloes in Brazil have shown a prevalence of 49.0% and 5.4%, respectively (Silva et al., 2014). However, although the A. marginale msp1α genetic diversity has been characterized in Brazilian cattle (de la Fuente et al., 2007; Estrada-Peña et al., 2009; Cabezas-Cruz et al., 2013; Pohl et al., 2013), a similar study has not been conducted in buffaloes.

In this study, we characterized the A. marginale msp1α genetic diversity in naturally infected water buffaloes on Marajó Island, Brazil. The results demonstrated low genetic diversity of A. marginale associated to low prevalence of the bacteria in water buffaloes and suggested that buffaloes may be a reservoir of this pathogen for cattle living in the same area. The results also suggested that mechanical transmission and not biological transmission by ticks might be playing an essential role for pathogen circulation among water buffaloes in Marajó Island, Brazil.

Materials and Methods

Experimental design and study site

A cross-sectional molecular study was conducted sampling buffalo herds in four provinces of Marajó Island, Brazil (Soure, Salvaterra, Muaná, and Chaves) between January and December 2012. The Marajó Island hosts the largest water buffalo population in the Western hemisphere. The vegetation on this island is predominantly provided by the Amazon tropical rainforest (Furtado et al., 2009). The buffaloes are vaccinated against and foot-and-mouth disease, but endo and ectoparasite control is
rarely used. Large areas of bog and grassland along the floodplains of rivers are found on Marajó Island (Furtado et al., 2009). These animals are reared using an extensive system. The main tick species found on animals are Amblyomma cajennense, Rhipicephalus (Boophilus) microplus, Dermacentor nitens and A. maculatum. These tick species can be found on buffaloes with low infestation rates throughout the year (Silva et al., 2014).

Sample collection and DNA extraction

Two hundred female water buffaloes were randomly selected in at least three farms from each province included in the study. Blood samples were collected from the caudal or jugular veins of individual animals. DNA extraction was performed using the QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA, USA) following the manufacturer's recommendations.

A. marginale msp1α PCR and DNA sequencing

The primers 1733F (5' TGTGCTTATGGCAGACATTTCC 3'), 3134R (5' TCACGGTCAAAACCTTTGCTTACC 3'), and 2957R (5' AAACCTTGTAGCCCCAACTTATCC 3') were used to amplify A. marginale msp1α as reported previously (Lew et al., 2002). Briefly, primers 1733F and 3134R were used in the first PCR amplification, while 1733F and 2957R were used in a nested-PCR reaction. For both reactions, 12.5 μl PCR Master Mix (Qiagen, Valencia, CA, USA), 20 pmol of each primer and 5 μl genomic DNA (first reaction) were used in a final volume of 25 μl. For the second reaction 1 μl of the DNA amplified in the first reaction was used as template. Control reactions were performed in a similar way but without DNA added to it. After the PCR reaction, amplicons were purified with the Silica Bead DNA Gel Extraction Kit (Fermentas Life Sciences, Sao Paulo, Brazil) following the manufacturer's instructions and sequenced. The A. marginale msp1α sequences obtained in this study from water buffaloes are available in GenBank with accession numbers KJ575588 - KJ575602.

A. marginale msp1α sequence analysis

A microsatellite is located at the msp1α 5' untranslated region (UTR) between the putative Shine-Dalgarno (SD) sequence (GTAGG) and the start codon (ATG). The general microsatellite structure is as previously reported GTAGG (G/A TTT)m (GT)n T ATG (Estrada-Peña et al., 2009) where microsatellite sequence is in bold letters. The SD-ATG distance was calculated according to the equation (4 × m) + (2 × n) + 1. Based on the structure of this microsatellite, eleven genotypes (named with Latin alphabet letters from A to K) of A. marginale msp1α have been previously identified (Estrada-Peña et al., 2009; Cabezas-Cruz et al., 2013). Theoretical translation of msp1α DNA into amino acid sequences was performed using the Expasy Translation Tool (http://expasy.hcuge.ch/tools/dna.html). Tandem repeats were identified and named according to the nomenclature proposed by de la Fuente et al. (2007) and updated by Cabezas-Cruz et al. (2013). Tandem repeat sequences were aligned using MUSCLE (v3.7) (Edgar, 2004). Codon-based alignment was performed using the codon suite server (Schneider et al., 2007). Detection of selection pressure on individual codons was calculated using two methods, single likelihood ancestor counting (SLAC) and fixed effects likelihood (FEL), implemented in the Datamonkey webserver (Delport et al., 2010).
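The SD-ATG distance formula given in the sequence analysis can be sketched as follows. The example assumes the 5' UTR follows the reported structure GTAGG (G/A TTT)m (GT)n T ATG exactly; the input sequence, the regular expression and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
import re

# Reported structure: GTAGG (G/A TTT)m (GT)n T ATG
MICROSATELLITE = re.compile(r"GTAGG((?:[GA]TTT)+)((?:GT)+)TATG")


def sd_atg_distance(sequence):
    """Return (m, n, distance) for the first microsatellite found in sequence."""
    match = MICROSATELLITE.search(sequence.upper())
    if match is None:
        raise ValueError("microsatellite structure not found")
    m = len(match.group(1)) // 4  # number of (G/A TTT) units
    n = len(match.group(2)) // 2  # number of (GT) units
    distance = 4 * m + 2 * n + 1  # the final +1 is the single T before ATG
    return m, n, distance


# Hypothetical 5' UTR fragment with m = 2 and n = 7 (illustration only).
example_utr = "CC" + "GTAGG" + "GTTT" + "ATTT" + "GT" * 7 + "T" + "ATG" + "GCA"
m, n, distance = sd_atg_distance(example_utr)
print(m, n, distance)  # prints: 2 7 23
```

The distance equals the number of nucleotides between the SD sequence and the start codon, so it can also be checked by counting characters in the matched region.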


Positive or negative selection was assigned to codons where the ratio ω = dN (non-synonymous substitutions)/dS (synonymous substitutions) was higher or lower than 1, respectively. As recommended in the Datamonkey webserver (Delport et al., 2010), only sites with p-value < 0.25 were considered to be under selection.

Phylogenetic analysis

For msp1α phylogenetic analysis, nucleotide sequences were aligned with MUSCLE (v3.7) configured for high precision (Edgar, 2004) followed by removal of the ambiguous regions with Gblocks (v0.91b) (Castresana, 2000). The phylogenetic tree was constructed using the neighbor joining method implemented in Neighbor from the PHYLIP package (v3.66) (Felsenstein, 1989). Internal branch confidence was assessed by the bootstrapping method using 1000 bootstrap replicates. Sequences of A. marginale msp1α previously reported in cattle from Brazil and the USA were obtained from GenBank and used as outgroups.

Results and discussion

Low prevalence of A. marginale was recently reported in buffaloes in Marajó Island, Brazil, using the major surface antigen 5 (msp5) gene marker (Silva et al., 2014). The results obtained in the present study using msp1α agreed with those reported by Silva et al. (2014) and showed 7.5% (15 positive samples) prevalence of A. marginale in water buffaloes from Marajó Island, Brazil. This prevalence could be considered low when compared with the prevalence of A. marginale in cattle from Brazil. For example, using msp1α, a recent study showed 70% prevalence of A. marginale in a herd of Brazilian cattle (Pohl et al., 2013). Water buffaloes with clinical anaplasmosis were not registered in the present study. The pathogenic significance of A. marginale for water buffaloes remains to be elucidated, but the fact that buffaloes can carry A. marginale raises concerns regarding the role of this species as a reservoir of A. marginale for cattle living in the same area (Silva et al., 2014). Phylogenetic analysis using msp1α shows that A. marginale strains found in buffaloes are closely related to strains isolated previously from cattle in Brazil (Fig. 1A), suggesting that buffaloes can be infected with the same strains that infect cattle and thus buffaloes could constitute reservoir hosts for A. marginale in cattle. Further research is needed to elucidate the role of water buffaloes as reservoir hosts for A. marginale in cattle in this or other regions where both species share the same space.
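The rule used in this chapter to assign selection to a codon (ω = dN/dS above or below 1, accepted only when p < 0.25) can be written as a small function. This is a sketch only: the per-site values are invented placeholders rather than Datamonkey output, and the function name is hypothetical.

```python
def classify_site(dn, ds, p_value, alpha=0.25):
    """Label one codon site from its dN, dS and p-value.

    dN and dS are compared directly, which is equivalent to testing
    omega = dN/dS against 1 but avoids dividing by zero when dS is 0.
    """
    if p_value >= alpha:
        return "not significant"
    if dn > ds:
        return "positive selection"  # omega > 1
    if dn < ds:
        return "negative selection"  # omega < 1
    return "neutral"                 # omega == 1


# Hypothetical (dN, dS, p-value) triples keyed by codon position.
hypothetical_sites = {
    10: (0.0, 2.1, 0.08),
    20: (1.9, 0.4, 0.12),
    27: (1.2, 0.9, 0.61),
}

for site, (dn, ds, p) in hypothetical_sites.items():
    print(site, classify_site(dn, ds, p))
```

The threshold is exposed as the `alpha` parameter so that a stricter cut-off can be tried without changing the function.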


Figure 1. Characterization of A. marginale msp1α sequences. (A) Neighbor joining phylogenetic tree of A. marginale msp1α. The tree was constructed using the neighbor joining method with A. marginale msp1α sequences from strains identified in water buffaloes and cattle. Bootstrap values are represented as percent on internal branches (1000 replicates). The GenBank accession numbers of the respective sequences used for the phylogenetic analysis are shown. The four different A. marginale strains obtained from water buffaloes in this study are shown (together with tandem repeat structure in parentheses) as Water buffalo 3 (78, 24, 24, 25, 31) (black triangle); Water buffalo 13 (4, 63, 27) (white square); Water buffalo 15 (162, 63, 27) (black square) and Water buffalo 4 (τ, 10, 10, 15) (white triangle). (B) Amino acid differences between tandem repeats 4 and 162 and position evolving under negative selection. The one-letter code is used for the different amino acids of the tandem repeats. Conserved amino acid positions are highlighted with asterisks. Substitution of glutamine (Q) in tandem repeat 4 by leucine (L) in tandem repeat 162 is shown with an arrow. Amino acid at position 10 (-) evolving under negative selection (p < 0.25 using both FEL and SLAC methods) and residues of the immunodominant B-cell epitope (Garcia-Garcia et al., 2004) (box) are also shown.
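The Q-to-L substitution marked in panel B can be checked against the standard genetic code. The sketch below assumes, as stated in the Results and discussion of this chapter, that Q in tandem repeat 4 is encoded by CAA; it lists every single-nucleotide change of that codon that yields leucine. The function name is hypothetical, and codons are written in the DNA alphabet (T corresponds to U in RNA).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_substitutions_to(codon, target):
    """List (position, new_base, new_codon) for one-base changes giving target.

    Positions are 1-based codon positions.
    """
    hits = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == target:
                hits.append((pos + 1, base, mutant))
    return hits


hits = single_substitutions_to("CAA", "L")
print(CODON_TABLE["CAA"], hits)  # prints: Q [(2, 'T', 'CTA')]
```

Under this assumption, the only one-step route from CAA to a leucine codon is a change at the second codon position, giving CTA (CUA in RNA).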

The gene msp1α has been extensively used for the characterization of the genetic diversity of A. marginale in cattle (Palmer et al., 2001; de la Fuente et al., 2007; Ruybal et al., 2009; Estrada-Peña et al., 2009; Cabezas-Cruz et al., 2013; Pohl et al., 2013) but little is known about the genetic diversity of A. marginale in other species of ungulates, including buffaloes (de la Fuente et al., 2004). In order to determine the genetic diversity of A. marginale infecting buffaloes, we sequenced the 15 msp1α positive samples that were obtained in the present study (Table 1). The results showed that the genetic diversity of A. marginale msp1α in buffaloes from Marajó Island is low, with only four different strains identified, showing the microsatellite genotype E (Table 1).

Table 1. Organization of MSP1a tandem repeats in A. marginale strains identified in water buffaloes.

A. marginale strain identification            No. of animals infected with this strain
Brazil/Marajó Island/E - (4, 63, 27)          9
Brazil/Marajó Island/E - (78, 24², 25, 31)    3
Brazil/Marajó Island/E - (τ, 10², 15)         2
Brazil/Marajó Island/E - (162, 63, 27)        1

Strain identification is based on msp1α and includes Country/Locality/microsatellite genotype – (tandem repeats structure). Superscripts represent the number of times that a tandem repeat is repeated. The new MSP1a tandem repeat 162 was named following the system proposed by de la Fuente et al. (2007) and updated by Cabezas-Cruz et al. (2013).
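The counts in Table 1 can be cross-checked against the figures reported in the text (15 positive animals out of 200 sampled, 7.5% prevalence). The following is a minimal sketch of that arithmetic; the dictionary keys are shorthand labels introduced here for illustration, with "tau" standing in for τ.

```python
# Animals per strain, transcribed from Table 1 (all microsatellite genotype E).
strain_counts = {
    "(4, 63, 27)": 9,
    "(78, 24, 24, 25, 31)": 3,
    "(tau, 10, 10, 15)": 2,
    "(162, 63, 27)": 1,
}
sampled = 200  # water buffaloes sampled, as stated in the text

positives = sum(strain_counts.values())
prevalence = 100 * positives / sampled
shares = {s: 100 * c / positives for s, c in strain_counts.items()}

print(positives, prevalence)
for strain, share in shares.items():
    print(strain, round(share, 1))
```

The totals agree with the reported prevalence, and the most common strain accounts for 9 of the 15 sequenced positives.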

In contrast, the results by Pohl et al. (2013) in cattle showed, in 13 sequenced samples, 8 different strains of A. marginale with four different microsatellite genotypes (B, D, E and G). Three possibilities could be considered in order to explain the low genetic diversity of A. marginale in buffaloes in Marajó Island: (a) in bovine anaplasmosis endemic regions, low genetic diversity of A. marginale msp1α has been related to tick absence (Ruybal et al., 2009). Most of the sampled buffaloes in this study were raised on submerged wetlands, where tick attachment is rare (Silva et al., 2014) and thus tick infestation rates are low and transmission of A. marginale is an unlikely event; (b) cattle movement has been proposed as a source of genetic diversity in A. marginale worldwide (de la Fuente et al., 2007). In Marajó Island, the entry of new buffaloes is prohibited, limiting the possibility of the introduction of new strains of A. marginale and consequently bacterial genetic diversity in this area. This phenomenon is in agreement with the fact that cattle movement is limited in Australia where only one strain of A. marginale has so far been identified in cattle (Lew et al., 2002); (c) finally, it could be argued that A. marginale was just recently introduced in this buffalo herd which will result in low genetic diversity. Low genetic diversity of msp1α was reported in a previously uninfected cattle herd where only a single msp1α genotype was found (Palmer et al., 2001).

Despite the low genetic diversity observed for A. marginale in buffaloes, evidence of genetic diversification was found. The A. marginale strains obtained from buffaloes in this study had between 3 and 5 MSP1a repeat sequences (Table 1). Tandem repeat 162 was found for the first time in this study (Table 1 and Fig. 1B). Tandem repeats 162 and 4 only differ in one amino acid at position 27 with glutamine (Q) in tandem repeat 4 and
leucine (L) in tandem repeat 162 (Fig. 1B). In addition, the amino acid Q in tandem repeat 4 is encoded by the codon CAA, and a single mutation to uracil at the second codon position (the first adenine of the CAA codon) will result in the codon CUA, which encodes the amino acid L in tandem repeat 162. This finding suggested that the tandem repeat 162 may have originated recently from tandem repeat 4, providing evidence for genetic diversification of A. marginale in water buffaloes. In agreement with this hypothesis, the phylogenetic analysis using msp1α indicated that the strain Water buffalo 15 (162, 63, 27) possibly evolved from strain Water buffalo 13 (4, 63, 27) (Fig. 1A). In order to determine which selective pressures could be triggering MSP1a diversification in A. marginale from buffaloes, the ratio ω was calculated, showing that the codon at position 10 from tandem repeat 4 was evolving under negative selection (Fig. 1B). Interestingly, this amino acid position is present in an immunodominant B-cell epitope described before for A. marginale MSP1a (Garcia-Garcia et al., 2004) (Fig. 1B). These results suggested that this tandem repeat, which was present in the most common strain of A. marginale found in buffaloes (Table 1), may be under selective pressure by the host immune system (Garcia-Garcia et al., 2004).

Some of the tick species found infesting buffaloes such as Rhipicephalus and Dermacentor spp. have been recognized as vectors of A. marginale (Kocan et al., 2010). However, the low tick infestation rates found in buffaloes in the study area suggested that mechanical and/or transplacental transmission could be playing an important role in A. marginale transmission in this buffalo herd. The sucking louse Haematopinus tuberculatus was implicated recently in A. marginale transmission, and outbreaks of this louse species have been reported in buffaloes (Da Silva et al., 2013). Differential tick transmission fitness has been found among different A. marginale msp1α genotypes (Palmer et al., 2004). Considering that ticks may not be playing an important role in transmission among buffaloes in the study site, the most common strain found in water buffaloes may be adapted to mechanical or transplacental transmission. In agreement with these findings, 60% of the A. marginale MSP1a tandem repeats obtained here presented the amino acid glycine (G) at position 20 and this amino acid was in at least one of the MSP1a repeats in all the A. marginale strains. The negatively charged amino acids aspartic acid (D) and glutamic acid (E) at position 20 were shown to be essential for the binding of MSP1a to tick cells while with a G at this position no binding was observed (de la Fuente et al., 2003). These amino acids affect MSP1a conformation and these conformational changes were suggested to affect A. marginale transmission by ticks (Cabezas-Cruz et al., 2013).

Conclusions

In this study, the genetic diversity of MSP1a in A. marginale was characterized in water buffaloes. The A. marginale genetic diversity was low in buffaloes and correlated with the low bacterial prevalence in this species. One major factor that could be contributing to this low genetic diversity is the ecology of the studied area, which is not suitable for ticks, thus reducing the probability of biological transmission of the pathogen. Mechanical transmission by hematophagous Diptera could be playing a major role in the transmission of A. marginale in the study site. Evidence was found to support the hypothesis that MSP1a is under selective pressure by the host immune
system in buffaloes. Finally, water buffaloes may serve as reservoir hosts of A. marginale for cattle. These results expanded our knowledge of A. marginale strains and provided additional support for the role of MSP1a in pathogen evolution and transmission.

Acknowledgments

We thank the Coordination for the Improvement of Higher Education Personnel (CAPES) foundation and the National Council for Technological and Scientific Development (CNPq) for their financial support.

References

Cabezas-Cruz, A., Passos, L.M.F., Lis, K., Kenneil, R., Valdés, J.J., Ferrolho, J., Tonk, M., Pohl, A.E., Grubhoffer, L., Zweygarth, E., Shkap, V., Ribeiro, M.F.B., Estrada-Peña, A., Kocan, K.M., de la Fuente, J., 2013. Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS ONE 8, e65243.

Castresana, J., 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540-552.

Da Silva, A.S., Lopes, L.S., Diaz, J.D., Tonin, A.A., Stefani, L.M., Araújo, D.N., 2013. Lice outbreak in buffaloes: evidence of Anaplasma marginale transmission by sucking lice Haematopinus tuberculatus. J. Parasitol. 99, 546-547.

de la Fuente, J., Van Den Bussche, R.A., Prado, T., Kocan, K.M., 2003. Anaplasma marginale major surface protein 1a genotypes evolved under positive selection pressure but are not markers for geographic strains. J. Clin. Microbiol. 41, 1609-1616.

de la Fuente, J., Vicente, J., Höfle, U., Ruiz-Fons, F., Fernández de Mera, I.G., Van Den Bussche, R.A., Kocan, K.M., Gortazar, C., 2004. Anaplasma infection in free-ranging Iberian red deer in the region of Castilla - La Mancha, Spain. Vet. Microbiol. 100, 163-173.

de la Fuente, J., Ruybal, P., Mtshali, M.S., Naranjo, V., Shuqing, L., Mangold, A.J., Rodríguez, S.D., Jiménez, R., Vicente, J., Moretta, R., Torina, A., Almazán, C., Mbati, P.M., Torioni de Echaide, S., Farber, M., Rosario-Cruz, R., Gortazar, C., Kocan, K.M., 2007. Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet. Microbiol. 119, 382-390.

de la Fuente, J., Kocan, K.M., Blouin, E.F., Zivkovic, Z., Naranjo, V., Almazán, C., Esteves, E., Jongejan, F., Daffre, S., Mangold, A.J., 2010. Functional genomics and evolution of tick-Anaplasma interactions and vaccine development. Vet. Parasitol. 167, 175-186.

Delport, W., Poon, A.F., Frost, S.D., Kosakovsky Pond, S.L., 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26, 2455-2457.

Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792-1797.

Estrada-Peña, A., Naranjo, V., Acevedo-Whitehouse, K., Mangold, A.J., Kocan, K.M., de la Fuente, J., 2009. Phylogeographic analysis reveals association of tick-borne pathogen, Anaplasma marginale, MSP1a sequences with ecological traits affecting tick vector performance. BMC Biol. 57, 1-13.

Felsenstein, J., 1989. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5, 164-166.

Furtado, A.P., Do Carmo, E.S., Giese, E.G., Vallinoto, A.C., Lanfredi, R.M., Santos, J.N., 2009. Detection of dog filariasis in Marajo Island, Brazil by classical and molecular methods. Parasitol. Res. 105, 1509-1515.

Garcia-Garcia, J.C., de la Fuente, J., Kocan, K.M., Blouin, E.F., Halbur, T., Onet, V.C., Saliki, J.T., 2004. Mapping of B-cell epitopes in the N-terminal repeated peptides of Anaplasma marginale major surface protein 1a and characterization of the humoral immune response of cattle immunized with recombinant and whole organism antigens. Vet. Immunol. Immunopathol. 98, 137-151.

IBGE, 2012. Instituto Brasileiro de Geografia e Estatística [Brazilian Institute of Geography and Statistics] (in Portuguese). Available from: http://www.ibge.gov.br/home/

Kocan, K.M., de la Fuente, J., Blouin, E.F., Coetzee, J.F., Ewing, S.A., 2010. The natural history of Anaplasma marginale. Vet. Parasitol. 167, 95-107.

Lew, A.E., Bock, R.E., Minchin, C.M., Masaka, S., 2002. A msp1α polymerase chain reaction assay for specific detection and differentiation of Anaplasma marginale isolates. Vet. Microbiol. 86, 325-335.

Palmer, G.H., Rurangirwa, F.R., McElwain, T.F., 2001. Strain composition of the ehrlichia Anaplasma marginale within persistently infected cattle, a mammalian reservoir for tick transmission. J. Clin. Microbiol. 39, 631-635.

Palmer, G.H., Knowles Jr., D.P., Rodriguez, J.L., Gnad, D.P., Hollis, L.C., Marston, T., Brayton, K.A., 2004. Stochastic transmission of multiple genotypically distinct Anaplasma marginale strains in a herd with high prevalence of Anaplasma infection. J. Clin. Microbiol. 42, 5381-5384.

Pohl, A.E., Cabezas-Cruz, A., Ribeiro, M.F.B., Silveira, J.A.G., Silaghi, C., Pfister, K., Passos, L.M.F., 2013. Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Rev. Bras. Parasitol. Vet. 22, 129-135.

Ruybal, P., Moretta, R., Perez, A., Petrigh, R., Zimmer, P., Alcaraz, E., Echaide, I., Torioni de Echaide, S., Kocan, K.M., de la Fuente, J., Farber, M., 2009. Genetic diversity of Anaplasma marginale in Argentina. Vet. Parasitol. 162, 176-180.

Schneider, A., Gonnet, G., Cannarozzi, G., 2007. SynPAM-a distance measure based on synonymous codon substitutions. IEEE/ACM Trans. Comput. Biol. Bioinform. 4, 553-560.

Silva, J.B., Vinhote, W.M.S., Oliveira, C.M.C., André, M.R., Fonseca, A.H., Barbosa, J.D., 2014. Molecular and serological prevalence of Anaplasma marginale in water buffaloes in the northern Brazil. Ticks Tick Borne Dis. 5, 100-104.

Suarez, C.E., Noh, E., 2011. Emerging perspectives in the research of bovine babesiosis and anaplasmosis. Vet. Parasitol. 180, 109-125.

Vidotto, M.C., Vidotto, O., Andrade, G.M., Palmer, G., Mcelwain, T., Knowles, D.P., 1998. Seroprevalence of Anaplasma marginale in cattle in Paraná State, Brazil, by MSP-5 competitive ELISA. Ann. N. Y. Acad. Sci. 849, 424-426.

Chapter III

The evolution of A. marginale msp1a is affected by the ecology supporting different tick vectors.

Cabezas-Cruz A., Estrada-Peña A., Valdes J., Kocan K M, de la Fuente J. A. marginale msp1a gene evolved under diversifying and convergent evolution influenced by the presence of different tick vector species.


A. marginale msp1a gene evolved under diversifying and convergent evolution influenced by the presence of different tick vector species.

Alejandro Cabezas-Cruz1,2, Agustin Estrada-Peña3, James Valdés4, Katherine M Kocan5 and José de la Fuente2,5.

1Center for Infection and Immunity of Lille (CIIL), INSERM U1019 – CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France.

2SaBio. Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, Ronda de Toledo s/n, 13005 Ciudad Real, Spain.

3Facultad de Veterinaria, Universidad de Zaragoza, Zaragoza, Spain.

4University of South Bohemia, Faculty of Science and Biology Centre of the ASCR, Parasitology Institute, České Budějovice, Czech Republic.

5Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, USA.

Abstract

Anaplasma marginale is an intracellular rickettsia that switches, during its natural life cycle, between invertebrate (ticks) and vertebrate (ruminants) hosts. This pathogenic parasite has a closed-core genome, which is a paradigm of reductive evolution and for which there is no evidence of gene gain or loss. For such bacteria, genetic variability must play an important role in environmental adaptability. So far, no study has compared the evolution of A. marginale in different ecological environments associated with different tick vectors. Herein, using the msp1a gene as a molecular marker, we show that the rate of evolution of A. marginale is higher in ecoregions where ecological conditions enhance R. microplus performance, while in the USA, where Dermacentor spp are predominant, the evolution is neutral and the diversification is extremely low. Analyses of the ratio of non-synonymous to synonymous substitutions (ω) per codon position show a high number of positions that evolved under positive selection in ecoregions suitable for R. microplus compared to areas suitable for Dermacentor spp. As a result, MSP1a amino acid variability is higher in areas with R. microplus compared to areas with Dermacentor spp.

Keywords: A. marginale, MSP1a, diversifying and convergent evolution, R. microplus, Dermacentor spp.

Introduction

The Rickettsiales are a diverse group of α-proteobacteria that include major pathogens of humans and livestock (Darby et al, 2007). Most species are parasites of arthropods such as ticks, lice or fleas (Azad and Beard, 1998). Others have the capacity to switch from arthropods to vertebrate hosts, establishing complex cycles of transmission and persistent infection (Darby et al, 2007). Recent advances in genome sequencing facilitate genome comparison studies on the Rickettsiales (Wernegreen, 2005; Darby et al, 2007; Blanc et al, 2007). Rickettsiales have small genomes, ranging from 0.9 Mb to 2.1 Mb (Orientia tsutsugamushi), and they are a paradigm of reductive evolution (Darby et al, 2007; Blanc et al, 2007). Little is known about the evolution of Rickettsiales in relation to different vectors and how this may influence the evolutionary rates and genetic diversity of these microorganisms. Herein, we used Anaplasma marginale as a model to study the evolutionary patterns in relation to various environmental niches.

A. marginale is the causative agent of anaplasmosis, a tick-borne disease affecting cattle in tropical and subtropical regions of the world (reviewed in Aubry and Geale, 2011). A. marginale has a “closed-core” genome in which the gene content is conserved among all sensu stricto strains and, after sequencing several strains, no new gene was found (Pierlé et al, 2012). However, A. marginale is a highly variable pathogen (Palmer and Brayton, 2013), for which more than 200 different strains have already been reported (Cabezas-Cruz et al, 2013). A. marginale strain characterization has been performed using major surface proteins (MSP). Among these proteins, msp1a has been widely used to characterize the genetic diversity of A. marginale worldwide (Palmer et al, 2001; de la Fuente et al, 2007; Ruybal et al, 2009; Cabezas-Cruz et al, 2013; Pohl et al, 2013). MSP1a, particularly the region containing the tandem repeats, has been shown to be involved in the interaction with receptors on bovine erythrocytes (McGarey and Allred, 1994; McGarey et al, 1994; de la Fuente et al, 2001). The tandem repeats of MSP1a were also shown to be involved in A. marginale-tick interaction (de la Fuente, 2003). Interestingly, msp1a genotypes evolved under positive selection, but they are not markers for geographic isolates (de la Fuente et al, 2003a).

Does tick-mediated transmission generate genetic diversity in A. marginale or not? This question is as yet unanswered; however, a study by Ruybal and colleagues suggests that transmission by R. microplus ticks generates MSP1a genetic diversity (Ruybal et al, 2009). Another study by Estrada-Peña and colleagues showed the lowest percentage of conserved amino acids in MSP1a from ecoregions with R. microplus ticks (Estrada-Peña et al, 2009). These two studies suggest that transmission by R. microplus may enhance the genetic variability of A. marginale. In sharp contrast, a recent study by Herndon and colleagues showed that after Dermacentor andersoni transmission, the genetic heterogeneity of A. marginale subsp centrale, a close relative of A. marginale, is drastically restricted (Herndon et al, 2013). At first glance, these two results appear contradictory; however, further investigation reveals otherwise. Here, using MSP1a amino acid and nucleotide sequences from 160 isolates of A. marginale available in GenBank, we show that A. marginale follows different evolutionary patterns in environmental niches supporting different vector species: R. microplus, R. annulatus, R. decoloratus and Dermacentor spp.


Results and discussion

Different rates of evolution among A. marginale strains related to different ecoregions

The phylogenetic relationships of MSP1a from A. marginale populations located in Argentina, Brazil, South Africa, USA, Israel, Mexico and Venezuela were inferred using ML, NJ and MrBayes (Appendix 1). Each population presents 2 (Argentina, Brazil and USA), 3 (South Africa, Israel and Mexico) or 5 (Venezuela) clusters of phylogenetically related A. marginale strains. Each lineage was located in an ecoregion related to different tick vectors (Appendix 1 and 2). To estimate the rate of evolution of each lineage among the different clusters we used a modification of the root-to-tip method described by Lanfear and colleagues (2010). We reconstructed the last common ancestor for each cluster (arrows in internal nodes in Appendix 1) and calculated the branch length from root (ancestor) to tip (each lineage) using neighbour joining distances. In addition, we determined the number of nodes through which each lineage has passed. We found a negative relationship between path length and the number of nodes in ecoregions supporting the coexistence of R. annulatus and R. microplus (Figure 1).

Figure 1. Correlation between branch length and number of nodes. The relationship between branch length and number of nodes is shown. We reconstructed the last common ancestor for each phylogenetic cluster of related A. marginale strains in the different ecoregions (Appendix 1) and calculated the branch length from root(ancestor)-to-tip(each lineage) using neighbour joining distances (y axis). In addition, we determined the number of nodes through which each lineage has passed (x axis).
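The root-to-tip quantities plotted in Figure 1 can be illustrated with a minimal Python sketch. The toy tree, its branch lengths and the function name `root_to_tip` are hypothetical and chosen only for illustration; the sketch also assumes that the "number of nodes through which each lineage has passed" counts the internal nodes strictly between the reconstructed ancestor and the tip, which is one possible reading of the text rather than the authors' documented procedure.

```python
# Hypothetical toy tree: child -> (parent, branch length to parent).
# Branch lengths stand in for neighbour joining distances.
TREE = {
    "n1": ("root", 0.010),
    "n2": ("n1", 0.005),
    "tipA": ("n2", 0.020),
    "tipB": ("n2", 0.004),
    "tipC": ("n1", 0.030),
    "tipD": ("root", 0.012),
}


def root_to_tip(tree, tip, ancestor="root"):
    """Walk from a tip up to the chosen ancestor.

    Returns (path_length, internal_nodes), where internal_nodes counts
    the nodes strictly between the ancestor and the tip (assumption).
    Raises KeyError if the ancestor is not on the path to the tip.
    """
    length = 0.0
    nodes = 0
    node = tip
    while node != ancestor:
        parent, branch = tree[node]
        length += branch
        if parent != ancestor:
            nodes += 1
        node = parent
    return length, nodes


if __name__ == "__main__":
    for tip in ("tipA", "tipB", "tipC", "tipD"):
        path, n_nodes = root_to_tip(TREE, tip)
        print(f"{tip}: path length = {path:.3f}, nodes = {n_nodes}")
```

Passing a different `ancestor` (for example the last common ancestor of one cluster) restricts the walk to that cluster, which mirrors the per-cluster reconstruction described above.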

However, there was no evidence of a positive or negative relationship (R2 < 0.01) in ecoregions supporting R. microplus, R. annulatus, R. decoloratus or non-Rhipicephalus ticks. The lack of a positive relationship between path length and the number of nodes is interpreted as evidence of gradual evolution (Pagel et al, 2006). Gradual evolution is a uniform and steady transformation, described for whole lineages (Gould and Eldredge, 1977) but also for particular genes (Pagel et al, 2006). In agreement with this view, it has been hypothesized that changes in msp1a genotypes should be an “infrequent event”, as genotypic variation was not observed after two years of experimental infection of cattle, natural infection or transmission by


Dermacentor ticks (Palmer et al, 2001). Extremely slow, gradual evolution has been reported for the VP8 portion of the VP4 capsid protein of rotavirus in the absence of vaccination pressure. The complete replacement of one VP8 genotype by another took 19 epidemic years (Novikova et al, 2012). Interestingly, VP8 has epitopes sensitive to neutralizing antibodies (Novikova et al, 2012). A. marginale MSP1a also contains neutralization-sensitive (Palmer et al, 1987; Allred et al, 1990) and immunodominant epitopes (Garcia-Garcia et al, 2004).

Interestingly, all lineages in areas where Rhipicephalus ticks are not common have a low rate of diversification from the ancestor (average genetic distance 0.039), despite long evolutionary walks (average of 6.2 nodes). In contrast, lineages present in regions where any of the Rhipicephalus spp included in this study are common have a low number of nodes (e.g., the average number of nodes for R. microplus, R. decoloratus and R. annulatus is 3.7, 4.1 and 2.8, respectively), but higher root-to-tip genetic distances (e.g., the average genetic distance for R. microplus, R. decoloratus and R. annulatus is 0.055, 0.041 and 0.052, respectively). This suggests that A. marginale strains present in ecoregions suitable for Rhipicephalus ticks may evolve faster than those present in areas where Dermacentor spp are the main tick vector. We also observed that the evolution of MSP1a genotypes in all populations was related to a reduction in the number of MSP1a tandem repeats among lineages compared to the ancestor (Figure 2).

Figure 2. Rate of loss in tandem repeats. The relationship between the ratio of tandem repeats (TR) loss per node (y axis) and the number of nodes through which each lineage has passed (x axis) is shown.
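As a rough illustration of the quantity plotted in Figure 2, the sketch below computes a tandem repeat loss per node ratio and the R2 of its linear relationship with the number of nodes. It assumes that the ratio is (repeats in the ancestor minus repeats in the lineage) divided by the number of nodes, which is an interpretation of the text; the lineage values are invented toy numbers, and the function names `loss_per_node` and `linear_fit` are hypothetical.

```python
def loss_per_node(ancestor_repeats, lineage_repeats, nodes):
    """Tandem repeats lost per node passed (assumed definition)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive to form a per-node ratio")
    return (ancestor_repeats - lineage_repeats) / nodes


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Returns (slope, r_squared). For simple linear regression,
    r_squared equals the squared Pearson correlation.
    Both xs and ys must vary, otherwise a division by zero occurs.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Toy data (not from the study): ancestor with 6 repeats;
    # each lineage is (nodes passed, repeats retained).
    ancestor = 6
    lineages = [(1, 3), (2, 3), (3, 4), (4, 4), (6, 5)]
    nodes = [n for n, _ in lineages]
    ratios = [loss_per_node(ancestor, r, n) for n, r in lineages]
    slope, r2 = linear_fit(nodes, ratios)
    print(f"slope = {slope:.3f}, R2 = {r2:.3f}")
```

A negative slope in such a fit corresponds to the pattern described in the text, where lineages that have passed through more nodes lost fewer repeats per node.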

Taking this into account, we calculated the tandem repeat loss/nodes ratio and found that it was negatively related to the number of nodes through which each lineage has passed in all ecoregions except those suitable for R. decoloratus (R2 = 0.06). This result shows that early loss of tandem repeats is related only to specific lineages in the tree, probably due to purifying selection. At the same time, lineages that are more evolved lost fewer tandem repeats, probably due to a fitness advantage of having a higher number of tandem repeats.

Emergence of new MSP1a genotypes.


The genetic diversity of MSP1a is mostly due to amino acid variation in the tandem repeat region of the protein, while the rest of the protein remains conserved among isolates (de la Fuente et al, 2001a). To study the evolution of the diversity of MSP1a tandem repeats among the different regions, we determined the origin of the unique tandem repeats present in geographically related A. marginale strains. For this purpose phylogenetic trees were constructed using the tandem repeats from related A. marginale strains (Appendix 1). This approach allowed us to identify the ancestor sequences from which the region-specific tandem repeats originated. Most of the region-specific tandem repeats originated from extant tandem repeats (Appendix 1), while some originated from tandem repeats that are absent from the A. marginale msp1a tandem repeat pool (Appendix 1 and 3). Subsequently, by comparing the nucleotide and amino acid sequences of the ancestors and the resulting tandem repeats, we were able to identify the genetic changes that originated new tandem repeats in the respective regions. We observed that single nucleotide changes were the main genetic change producing msp1a tandem repeat variation, but we also found codon deletions and one insertion (Appendix 4). This is in agreement with one hypothesis for the origin of different MSP1a tandem repeats, which argues that new tandem repeats emerge after mutations from the ancestor tandem repeat (de la Fuente et al, 2001a). We identified nucleotide substitutions in 96 tandem repeat diversifications (changes from ancestor sequence to descendant sequence). Of these nucleotide substitutions, 206 were non-synonymous and 57 were synonymous. However, the ratio of non-synonymous to synonymous substitutions differs drastically among the ecoregions suitable for R. microplus (3), R. decoloratus (3.8), and coexistence of R. annulatus and R. microplus (18) and in regions where Rhipicephalus spp were not present (2.6). Out of 230 nucleotide substitutions, 127 were transitions and 103 transversions (Appendix 4 and 5), having a global transition/transversion ratio of 0.81. There was a slight base composition bias toward G and C (52% substitutions toward G+C and 48% substitutions toward A+T). This is consistent with the high G+C content of the A. marginale genome (49.8%). Interestingly, transition/transversion ratios varied among A. marginale msp1a sequences located in different ecoregions, and they were notably different in the ecoregion optimal for R. decoloratus compared to the rest (Appendix 6). It is generally assumed that, in metazoans, the number of transitional substitutions is higher than the number of transversional substitutions. However, interspecific (Keller et al, 2007) as well as intraspecific (Seplyarskiy et al, 2012) differences in transition/transversion ratio have been reported.

Selective pressures acting on A. marginale msp1a tandem repeats.

In order to test whether individual codons in the tandem repeats from different regions were under positive or negative selection, the ratio ω (see materials and methods) was calculated for each codon position of the nucleotide sequences of the MSP1a tandem repeats from the different regions. Sites that evolved under positive selection were found in all the regions except in the region where Rhipicephalus spp were not found (Appendix 7). In this region only codon 29 evolved under negative selection. A. marginale msp1a that evolved in ecoregions optimal for R. microplus presents 7 codons that evolved under positive selection, which was the highest number of positively evolving sites (Appendix 7). Finally, in regions where Rhipicephalus spp were eradicated, the number of positively selected sites drops to only 1 site. Regions optimal for development of R. decoloratus, R. annulatus and coexistence of R. annulatus and R. microplus present 4, 2 and 3 sites that evolved under positive selection, respectively. Sites that evolved under negative selection were also found in all groups.

MSP1a evolutionary fingerprinting.

Genes are shaped by natural selection into a unique arrangement of sites evolving rapidly or resisting change. The results above show that A. marginale msp1a presents different patterns of evolution related to the presence of tick vectors and to different tick vector species. The evolutionary fingerprinting results show that the evolution of the msp1a gene varies depending on the tick vector species involved in transmission in a certain ecoregion (Figure 3).

Evolutionary fingerprinting denotes different classes of substitution rates per group of related genetic changes in the DNA sequences. Thus, sites evolving at different substitution rates will be detected. The presence of R. microplus and R. decoloratus produces four rate classes, suggesting that different evolutionary pressures are exerted on different portions of the gene. In these two regions, we found two groups of sites being positively selected and two others being negatively selected (close to the neutral transversal line, as in the case of R. microplus). The evolution is mostly neutral in areas where Rhipicephalus spp are not found.

Directional and convergent evolution of A. marginale MSP1a related to tick vectors.

A. marginale msp1a is highly variable among different isolates (Palmer et al, 2001; de la Fuente et al, 2007; Ruybal et al, 2009; Estrada-Peña et al, 2009; Cabezas-Cruz et al, 2013). This genetic variability is due to variability in the number of tandem repeats among different isolates (from 1 to 10) and also to tandem repeat diversification. As shown before, unique tandem repeats emerge within different ecoregions. At the same time, some non-phylogenetically related A. marginale strains exhibit the same tandem repeats. For example, in Argentina the tandem repeats B and F are found in different phylogenetic clusters (Appendix 1). Cattle movement has been suggested as an explanation for this observation (de la Fuente et al, 2007). Another possibility could be that, under similar selection pressures, the tandem repeats will evolve to some specific form(s). In order to test the hypothesis of directional evolution, we performed a DEPS test of directional evolution. As expected, and in agreement with the above results, the presence of ticks of Rhipicephalus spp induces directional evolution on MSP1a, while in regions not suitable for Rhipicephalus spp, or where this tick genus was eradicated, there is minimal evidence of directional evolution (only one site is under directional evolution) or none at all (Figure 4).
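The substitution bookkeeping reported earlier in this section (non-synonymous versus synonymous changes, and transitions versus transversions, between an ancestral and a descendant tandem repeat) can be illustrated with a minimal Python sketch. It assumes the standard genetic code, in-frame sequences of equal length without gaps, and a naive count of one synonymous or non-synonymous event per changed codon. This is a simplification for illustration and not the SLAC procedure used in the study; the function name `classify_substitutions` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def classify_substitutions(ancestor, descendant):
    """Count substitutions between two aligned, in-frame DNA sequences.

    Nucleotide differences are split into transitions (purine<->purine or
    pyrimidine<->pyrimidine) and transversions. Each changed codon is
    counted once as synonymous or non-synonymous (naive assumption).
    """
    if len(ancestor) != len(descendant) or len(ancestor) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    counts = {"nonsyn": 0, "syn": 0, "transition": 0, "transversion": 0}
    for i in range(0, len(ancestor), 3):
        anc_codon = ancestor[i:i + 3]
        des_codon = descendant[i:i + 3]
        if anc_codon == des_codon:
            continue
        for x, y in zip(anc_codon, des_codon):
            if x != y:
                same_class = (x in PURINES) == (y in PURINES)
                counts["transition" if same_class else "transversion"] += 1
        same_aa = CODON_TABLE[anc_codon] == CODON_TABLE[des_codon]
        counts["syn" if same_aa else "nonsyn"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical two-codon example: CAA (Q) -> CTA (L), GGT (G) -> GGC (G).
    print(classify_substitutions("CAAGGT", "CTAGGC"))
```

For the totals given in the text, 127 transitions and 103 transversions, the quotient 127/103 is about 1.23 while 103/127 is about 0.81, so the reported value of 0.81 corresponds to transversions divided by transitions.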

In order to determine the origin of equal tandem repeats in non-related A. marginale strains, we identified the ancestors of these sequences within their respective regions. Surprisingly, we found that equal tandem repeats in non-related A. marginale strains originated from different tandem repeats (Appendix 3). This is in agreement with the evidence of directional evolution mentioned above. These results suggest that A. marginale strains having the same tandem repeat composition may not be related through the introduction of new genotypes; instead, they may have evolved under similar selective pressures that directionally select specific tandem repeats.

Effect of the presence of tick vectors on MSP1a tandem repeats amino acid variability and coevolution of amino acid sites

Analysing the evolution of MSP1a diversification (mentioned above), we noticed that 70 out of 96 tandem repeat diversifications from different ecoregions were produced by more than one non-synonymous substitution, suggesting generation of variability and coevolution of amino acid sites. To test whether amino acid variability differs among A. marginale MSP1a from different ecoregions and whether this variability may have any structural consequence, we first determined the amino acid variability of all the tandem repeats present in the different ecoregions and secondly predicted the putative coevolution of amino acid sites. A. marginale MSP1a tandem repeat variability is lower in non-Rhipicephalus spp (average 0.26) and R. annulatus (average 0.30) regions (Figure 5).


Figure 5. MSP1a tandem repeats aa variability in different ecological regions. The figure shows the Shannon amino acid variability (y axis) among the aa sites (x axis) in tandem repeats from different ecological regions: R. microplus, R. annulatus, R. decoloratus, coexistence of R. microplus and R. annulatus, non-Rhipicephalus spp ticks and finally regions where Rhipicephalus spp were eradicated 30 to 50 years ago.

In sharp contrast, regions with ecological conditions optimal for R. microplus present the highest variability (average 0.67) (Figure 5). Interestingly, the region where Rhipicephalus spp ticks were eradicated 30 to 50 years ago presents intermediate values of amino acid variability (Figure 5). In order to answer whether the amino acid variability in different ecoregions has any evolutionary meaning, we performed an evolutionary coupling analysis (Poon et al, 2008 and Kosakovsky et al, 2008) to identify co-evolving sites in MSP1a tandem repeat sequences. Residue coupling is maintained during evolution (evolutionary couplings) when one site is structurally or functionally linked to one or many other sites. Thus, evolutionary pressures are exerted on maintaining favourable interactions between certain amino acid sites. The high variability found in tandem repeats from regions with R. microplus is not random, and the pattern of coevolving sites suggests that, under R. microplus transmission, A. marginale MSP1a becomes a highly functional molecule (Figure 6 and Appendix 8) with 5 sites simultaneously coevolving with 3 other sites. Remarkably, positions 16 and 23 are simultaneously coevolving with 5 other sites. In agreement with this, regions where Rhipicephalus spp were eradicated or are absent present an extremely low number of coevolving sites, and no site is coevolving with 3 or 5 sites at the same time (Figure 6 and Appendix 8).


Figure 6. Diagram of coevolution of MSP1a amino acid positions related to tick presence or absence. The coevolving sites in the MSP1a tandem repeats were identified. The molecule is represented as a thick line and the aa positions as circles. Sites coevolving with 0 (green circles), 1 (red circles), 2 (blue circles), 3 (yellow circles) or more (purple circles) aa sites are shown. Dashed and continuous lines connect sites coevolving with fewer or more than 3 sites, respectively.
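The colour classes used in Figure 6 can be derived from a list of coevolving site pairs by counting, for each position, how many distinct partners it has. The following Python sketch shows that counting step only; the pair list is an invented toy example (not the pairs reported in Appendix 8), the tandem repeat length of 28 positions is an assumption, and the names `partner_counts` and `colour_class` are hypothetical.

```python
# Colour classes as described in the Figure 6 caption.
COLOURS = {0: "green", 1: "red", 2: "blue", 3: "yellow"}


def partner_counts(pairs, n_sites):
    """Number of distinct coevolving partners for each site (1-based)."""
    partners = {site: set() for site in range(1, n_sites + 1)}
    for a, b in pairs:
        partners[a].add(b)
        partners[b].add(a)
    return {site: len(linked) for site, linked in partners.items()}


def colour_class(n_partners):
    """Map a partner count to the caption's colour (more than 3 -> purple)."""
    return COLOURS.get(n_partners, "purple")


if __name__ == "__main__":
    # Toy pairs, hypothetical: site 16 is linked to five other sites.
    toy_pairs = [(16, 23), (16, 5), (16, 9), (16, 20), (16, 27), (23, 5)]
    counts = partner_counts(toy_pairs, n_sites=28)
    for site in (1, 5, 9, 16, 23):
        print(site, counts[site], colour_class(counts[site]))
```

Using sets of partners rather than raw pair counts means that a pair listed twice, or in both orders, is only counted once.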

Concluding remarks

Previous studies have suggested that the presence of R. microplus could trigger the genetic diversity in A. marginale populations (Ruybal et al, 2009; Estrada-Peña et al, 2009). On the other hand, other authors have suggested that after passage in Dermacentor andersoni, the genetic diversity of A. marginale subsp centrale (closely related to A. marginale) is reduced. Our results confirm and expand these observations. The mechanisms by which ticks of the genus Rhipicephalus produce genetic diversification in A. marginale populations remain to be discovered.

Materials and Methods

A. marginale strains

The analyses were conducted using A. marginale MSP1a from Argentina, Brazil, South Africa, USA, Israel, Mexico and Venezuela (reviewed in Cabezas-Cruz et al, 2013). The tandem repeats were named as in Cabezas-Cruz et al, 2013. Each A. marginale strain used in this study was correlated to different ecoregions containing ecological traits affecting R. microplus, R. annulatus and R. decoloratus performances. This allowed us to group the A. marginale strains present in ecoregions optimal for R. microplus, R. annulatus, R. decoloratus, coexistence of R. microplus and R. annulatus, as well as in ecoregions not optimal for Rhipicephalus spp performance. Additionally, using literature mining we identified regions where R. microplus ticks were eradicated. Methods used for ecoregion determination are explained below.

The series of data

We used remotely sensed information on features of the ground surface to produce a classification of the world into a set of abiotic categories using harmonic regression. Three series of data were

targeted: the Land Surface Temperature (LST); the Normalised Difference Vegetation Index (NDVI); and the Leaf Area Index (LAI). The first expresses the temperature at the ground surface with a precision of one decimal. The NDVI is a measure of the photosynthetic activity of plants. Its value has been proven in the field of large-scale monitoring of vegetation cover, and it has been extensively used as a descriptive variable of the habitat for medically important arthropods. NDVI thus represents an adequate source of data to cope with the water component of the arthropod life cycle, assessing temporal aspects of vegetation development and quality. The LAI defines an important structural property of a plant canopy, the number of equivalent layers of leaf vegetation relative to a unit of ground area. This feature is important for the abiotic niche of an organism because it measures how the ground is protected against the sun and its evaporative capacities.

Harmonic regression is a mathematical technique used to decompose a complex signal into a series of individual sine and cosine waves, each characterised by a specific amplitude and phase angle. In the process, a series of coefficients describe the cyclical variation of the series, including its seasonal behaviour. Harmonic regression serves as a method of potential application for capturing the abiotic niche of an organism because it describes both the pattern (seasonal components) and the ranges of climate variables between defined time intervals with the coefficients that result from the harmonic regression. All the data were obtained from the NEO (NASA Earth Observations) web server (http://neo.sci.gsfc.nasa.gov/about/). The series of covariates (LST, NDVI, and LAI) were obtained at a resolution of 0.1º, from October 2000 to December 2012 at 8-day intervals. The available set of images has already been processed by the MODIS team, with improved cloud masking and adequate atmospheric correction and satellite orbital drift correction applied. We prepared one-month composites from the 8-day images, using the method of the maximum pixel value, to obtain the largest area without gaps in pixels. One of the problems with applying remotely sensed imagery to the detection of abiotic niche is the existence of gaps at regions near the poles because of the long-lasting accumulation of snow, ice, or clouds. The effects are larger in the northern hemisphere because of the proximity of inhabited lands to the North Pole. The detection of these gaps and filling them with estimated values may be unreliable if the number of consecutive gaps is too long. Some regions in the far North were not included in the final set of images because they were covered by snow, clouds, or ice for periods longer than 4 months.

Monthly values of each variable were subjected to harmonic regression. We performed the harmonic regressions in the R development framework together with the packages “raster” and “TSA”. Five coefficients for each variable were extracted from the annual time series, representing the yearly, 6-month, and 3-month signals of the harmonic regressions. Thus, five layers of coefficients of each variable could reconstruct the complete original time series and constitute the environmental covariates proposed in this paper to describe the abiotic niche of organisms.

We performed an unsupervised classification on these 15 layers of explanatory variables, followed by further clustering of the categories of abiotic features. The purpose of the clustering of the categories was to obtain an evaluation of the similarities among them, to further relate these branches of the “abiotic phylogenies” with the results about the phylogeny and evolution of A. marginale.

Phylogenetic analysis

The phylogenetic analyses were conducted with MSP1a amino acid sequences aligned

-68- Chapter III with MAFFT (v7) configured for the Amino acid (aa) variability, highest accuracy (Katoh and Standley, identification of co-evolving amino acid 2013). After alignment, regions with gaps sites and directional evolution of protein were removed from the alignment. sequences (DEPS) test Phylogenetic trees were reconstructed Shannon entropy analysis (Shannon, 1948) using maximum likelihood (ML), neighbor was used to calculate the aa variability as joining (NJ) and bayesian inference (MB) implemented in Protein Variability Server methods as implemented in PhyML (v3.0 (Garcia-Boronat et al, 2008). The Shannon aLRT) (Guindon and Gascuel, 2003; entropy (H) was calculated for every Anisimova and Gascuel, 2006), PHYLIP position with the following equation: (v3.66) (Felsenstein, 1989) and MrBayes (v3.1.2) (Huelsenbeck and Ronquist, 2001), respectively. The reliability for the internal branches of ML was assessed using the bootstrapping method (1000 Pi is the amount of each aa type (i), and M bootstrap replicates) and the approximate is the number of aa types (maximum 20 likelihood ratio test (aLRT – SH-Like) aa). (Anisimova and Gascuel, 2006). Co-evolving sites and directional evolution Reliability for the NJ tree was assessed of protein sequences (DEPS) test were using bootstrapping method (1000 performed using the methods described in bootstrap replicates). 10 000 generations of (Poon et al, 2008) and (Kosakovsky et al, Markov Chain Monte Carlo (MCMC) 2008) implemented in Datamonkey chains were run for MrBayes. webserver.
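As a rough illustration of the per-position variability calculation described above, the following Python sketch computes Shannon entropy column by column over an alignment. It is not the Protein Variability Server implementation: the function names and the toy sequences are hypothetical, and it assumes that all sequences have equal length and that gap-containing columns have already been removed.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy in bits of one alignment column (a string of residues)."""
    total = len(column)
    counts = Counter(column)
    # Summing the negated terms keeps a fully conserved column at 0.0.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def column_entropies(alignment):
    """Entropy for every position of equal-length aligned sequences."""
    return [shannon_entropy("".join(col)) for col in zip(*alignment)]


# Hypothetical toy alignment, used only to exercise the functions.
toy_alignment = ["DDSSSA", "DDSSSA", "ADSSSA", "TDSGSA"]
entropies = column_entropies(toy_alignment)
for position, h in enumerate(entropies, start=1):
    print(position, round(h, 3))
```

With 20 amino acid types the entropy ranges from 0 (a fully conserved position) to log2(20), about 4.32 bits, when all types are equally frequent.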

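The harmonic regression of monthly composites described in the remote-sensing methods above can be sketched in a few lines. This is only a minimal pure-Python stand-in, not the authors' R workflow with the “raster” and “TSA” packages. It assumes a single pixel's monthly series covering a whole number of years, and it returns a mean plus one cosine/sine pair per period. The text does not specify how its five coefficients map onto the yearly, 6-month and 3-month signals, so the parametrisation, the function names and the synthetic series below are all assumptions.

```python
import math


def harmonic_coefficients(series, periods=(12, 6, 3)):
    """Mean plus (cosine, sine) coefficients for each period, in time steps.

    Uses the discrete Fourier projection, which coincides with the
    least-squares harmonic fit only when the series spans a whole number
    of every period (for example, complete years of monthly values).
    """
    n = len(series)
    if n == 0 or any(n % p for p in periods):
        raise ValueError("series length must be a positive multiple of every period")
    coeffs = {"mean": sum(series) / n}
    for p in periods:
        cos_c = 2.0 / n * sum(y * math.cos(2 * math.pi * t / p) for t, y in enumerate(series))
        sin_c = 2.0 / n * sum(y * math.sin(2 * math.pi * t / p) for t, y in enumerate(series))
        coeffs[p] = (cos_c, sin_c)
    return coeffs


def amplitude_phase(cos_c, sin_c):
    """Convert a*cos(wt) + b*sin(wt) into amplitude A and phase phi of A*cos(wt - phi)."""
    return math.hypot(cos_c, sin_c), math.atan2(sin_c, cos_c)


# Synthetic two-year monthly series: mean 20, annual and semi-annual signals.
monthly = [
    20 + 5 * math.cos(2 * math.pi * t / 12) + 2 * math.sin(2 * math.pi * t / 6)
    for t in range(24)
]
coeffs = harmonic_coefficients(monthly)
print(coeffs["mean"], amplitude_phase(*coeffs[12]), amplitude_phase(*coeffs[6]))
```

Applied pixel by pixel, each coefficient would form one raster layer. A general least-squares fit would be needed for series with gaps or incomplete years.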
Codon-based phylogenetic analysis of tandem repeats

Codon-based alignment was performed using the codon suite server (Schneider et al, 2005, 2007). Selection pressure on individual codons was detected using the single likelihood ancestor counting (SLAC) method (Pond and Frost, 2005), implemented in Datamonkey webserver (Delport et al, 2010; Pond and Frost, 2005). Positive and negative selection were assigned to codons where the ratio ω = dN (non-synonymous substitutions)/dS (synonymous substitutions) was higher or lower than 1, respectively. The reconstruction of the ancestral amino acid sequence was performed using a Neighbor Joining tree constructed using the Dayhoff model of substitutions, which was estimated to be the best model fitting the actual data. Three reconstruction methods were used: joint (Pupko et al, 2000), marginal (Yang et al, 1995) and sample (Nielsen, 2002), which are implemented in Datamonkey webserver.

Evolutionary fingerprinting analysis

Evolutionary fingerprinting is a concept developed by Pond and colleagues (Pond et al, 2010), based on the probability distribution of site-to-site synonymous and nonsynonymous substitution rates in an alignment, under the assumption that codon sites evolve independently of one another. For evolutionary fingerprinting we used alignments of the nucleotide sequences of MSP1a tandem repeats from the different ecoregions previously identified. The evolutionary fingerprinting method is implemented in Datamonkey webserver.

References

Darby AC, Cho NH, Fuxelius HH, Westberg J, Andersson SG. Intracellular pathogens go extreme: genome evolution in the Rickettsiales. Trends Genet. 2007. 23(10):511-520.

Azad AF, Beard CB. Rickettsial pathogens and their arthropod vectors. Emerg Infect Dis. 1998. 4(2):179-189.

Wernegreen JJ. For better or worse: genomic consequences of intracellular mutualism and

parasitism. Curr Opin Genet Dev. 2005. 15(6):572-583.

Blanc G, Ogata H, Robert C, Audic S, Suhre K, Vestris G, Claverie JM, Raoult D. Reductive genome evolution from the mother of Rickettsia. PLoS Genet. 2007. 3(1):e14.

Aubry P, Geale DW. A review of bovine anaplasmosis. Transbound Emerg Dis. 2011. 58(1):1-30.

Pierlé SA, Dark MJ, Dahmen D, Palmer GH, Brayton KA. Comparative genomics and transcriptomics of trait-gene association. BMC Genomics. 2012. 13:669.

Palmer GH, Brayton KA. Antigenic variation and transmission fitness as drivers of bacterial strain structure. Cell Microbiol. 2013. 15(12):1969-1975.

Cabezas-Cruz A, Passos LMF, Lis K, Kenneil R, Valdés JJ, Ferrolho J, Tonk M, Pohl AE, Grubhoffer L, Zweygarth E, Shkap V, Ribeiro MFB, Estrada-Peña A, Kocan KM, de la Fuente J. Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. PLoS One. 2013. 8:1-13.

Palmer GH, Rurangirwa FR, McElwain TF. Strain composition of the ehrlichia Anaplasma marginale within persistently infected cattle, a mammalian reservoir for tick transmission. J Clin Microbiol. 2001. 39(2):631-635.

de la Fuente J, Ruybal P, Mtshali MS, Naranjo V, Shuqing L, Mangold AJ, Rodríguez SD, Jiménez R, Vicente J, Moretta R, Torina A, Almazán C, Mbati PM, Torioni de Echaide S, Farber M, Rosario-Cruz R, Gortazar C, Kocan KM. Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet Microbiol. 2007. 119(2-4):382-390.

Ruybal P, Moretta R, Perez A, Petrigh R, Zimmer P, Alcaraz E, Echaide I, Torioni de Echaide S, Kocan KM, de la Fuente J, Farber M. Genetic diversity of Anaplasma marginale in Argentina. Vet Parasitol. 2009. 162(1-2):176-180.

Pohl AE, Cabezas-Cruz A, Ribeiro MFB, Silveira JAG, Silaghi C, Pfister K, Passos LMF. Detection of genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Rev Bras Parasitol Vet. 2013. 22(1):129-135.

McGarey DJ, Allred DR. Characterization of hemagglutinating components on the Anaplasma marginale initial body surface and identification of possible adhesins. Infect Immun. 1994. 62(10):4587-4593.

McGarey DJ, Barbet AF, Palmer GH, McGuire TC, Allred DR. Putative adhesins of Anaplasma marginale: major surface polypeptides 1a and 1b. Infect Immun. 1994. 62(10):4594-4601.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM. Differential adhesion of major surface proteins 1a and 1b of the ehrlichial cattle pathogen Anaplasma marginale to bovine erythrocytes and tick cells. Int J Parasitol. 2001. 31(2):145-153.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM. Characterization of the functional domain of major surface protein 1a involved in adhesion of the rickettsia Anaplasma marginale to host cells. Vet Microbiol. 2003. 91(2-3):265-283.

de la Fuente J, Van Den Bussche RA, Prado TM, Kocan KM. Anaplasma marginale msp1alpha genotypes evolved under positive selection pressure but are not markers for geographic isolates. J Clin Microbiol. 2003a. 41(4):1609-1616.

Estrada-Peña A, Naranjo V, Acevedo-Whitehouse K, Mangold AJ, Kocan KM, de la Fuente J. Phylogeographic analysis reveals association of tick-borne pathogen, Anaplasma marginale, MSP1α sequences with ecological traits affecting tick vector performance. BMC Biology. 2009. 7(57):1-13.

Herndon DR, Ueti MW, Reif KE, Noh SM, Brayton KA, Agnes JT, Palmer GH. Identification of multilocus genetic heterogeneity in Anaplasma marginale subsp. centrale and its restriction following tick-borne transmission. Infect Immun. 2013. 81(5):1852-1858.

Lanfear R, Welch JJ, Bromham L. Watching the clock: studying variation in rates of molecular evolution between species. Trends Ecol Evol. 2010. 25(9):495-503.

Pagel M, Venditti C, Meade A. Large punctuational contribution of speciation to evolutionary divergence at the molecular level. Science. 2006. 314(5796):119-121.

Gould SJ, Eldredge N. Punctuated equilibria: the tempo and mode of evolution reconsidered. Paleobiology. 1977. 3(2):115-151.

Novikova NA, Morozova OV, Fedorova OF, Epifanova NV, Sashina TA, Efimov EI. Rotavirus infection in children of Nizhny Novgorod, Russia: the gradual change of the virus allele from P[8]-1 to P[8]-3 in the period 1984-2010. Arch Virol. 2012. 157(12):2405-2409.

Palmer GH, Waghela SD, Barbet AF, Davis WC, McGuire TC. Characterization of a neutralization-sensitive epitope on the Am 105 surface protein of Anaplasma marginale. Int J Parasitol. 1987. 17(7):1279-1285.

Allred DR, McGuire TC, Palmer GH, Leib SR, Harkins TM, McElwain TF, Barbet AF. Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc Natl Acad Sci USA. 1990. 87(8):3220-3224.

Garcia-Garcia JC, de la Fuente J, Kocan KM, Blouin EF, Halbur T, Onet VC, Saliki JT. Mapping of B-cell epitopes in the N-terminal repeated peptides of Anaplasma marginale major surface protein 1a and characterization of the humoral immune response of cattle immunized with recombinant and whole organism antigens. Vet Immunol Immunopathol. 2004. 98(3-4):137-151.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Rodríguez SD, García MA, Kocan KM. Evolution and function of tandem repeats in the major surface protein 1a of the ehrlichial pathogen Anaplasma marginale. Anim Health Res Rev. 2001a. 2(2):163-173.

Keller I, Bensasson D, Nichols RA. Transition-transversion bias is not universal: a counter example from grasshopper pseudogenes. PLoS Genet. 2007. 3(2):e22.

Seplyarskiy VB, Kharchenko P, Kondrashov AS, Bazykin GA. Heterogeneity of the transition/transversion ratio in Drosophila and Hominidae genomes. Mol Biol Evol. 2012. 29(8):1943-1955.

Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013. 30(4):772-780.

Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003. 52:696-704.

Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol. 2006. 55(4):539-552.

Felsenstein J. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics. 1989. 5:164-166.

Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001. 17(8):754-755.

Schneider A, Cannarozzi GM, Gonnet GH. Empirical codon substitution matrix. BMC Bioinform. 2005. 6:134.

Schneider A, Gonnet G, Cannarozzi G. SynPAM-a distance measure based on synonymous codon substitutions. IEEE/ACM Trans Comput Biol Bioinform. 2007. 4(4):553-560.

Pond SL, Frost SD. Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics. 2005. 21(10):2531-2533.

Delport W, Poon AF, Frost SD, Kosakovsky Pond SL. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics. 2010. 26(19):2455-2457.

Pupko T, Pe'er I, Shamir R, Graur D. A fast algorithm for joint reconstruction of ancestral amino acid sequences. Mol Biol Evol. 2000. 17(6):890-896.

Yang Z, Kumar S, Nei M. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics. 1995. 141(4):1641-1650.

Nielsen R. Mapping mutations on phylogenies. Syst Biol. 2002. 51(5):729-739.

Shannon CE. A mathematical theory of communication. The Bell System Technical Journal. 1948. 27:379-423.

Garcia-Boronat M, Diez-Rivero CM, Reinherz EL, Reche PA. PVS: a web server for protein sequence variability analysis tuned to facilitate conserved epitope discovery. Nucleic Acids Res. 2008. 36:35-41.

Poon AF, Lewis FI, Frost SD, Kosakovsky Pond SL. Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models. Bioinformatics. 2008. 24(17):1949-1950.

Kosakovsky Pond SL, Poon AF, Leigh Brown AJ, Frost SD. A maximum likelihood method for detecting directional evolution in protein sequences and its application to influenza A virus. Mol Biol Evol. 2008. 25(9):1809-1824.

Pond SL, Scheffler K, Gravenor MB, Poon AF, Frost SD. Evolutionary fingerprinting of genes. Mol Biol Evol. 2010. 27(3):520-536.

Chapter IV

In vitro culture of three new strains of E. canis and evaluation of genetic diversity using the gp36 gene

Zweygarth E, Cabezas-Cruz A, Josemans AI, Oosthuizen MC, Matjila PT, Lis K, Broniszewska M, Schöl H, Ferrolho J, Grubhoffer L, Passos LMF. In vitro culture and major immunoreactive protein gp36 structural differences of geographically distant Ehrlichia canis isolates. Ticks and Tick-borne Diseases. 2014. 5(4):423-431.


Ticks and Tick-borne Diseases

journal homepage: www.elsevier.com/locate/ttbdis

Original article

In vitro culture and structural differences in the major immunoreactive protein gp36 of geographically distant Ehrlichia canis isolates

Erich Zweygarth a,∗, Alejandro Cabezas-Cruz b, Antoinette I. Josemans c, Marinda C. Oosthuizen d, Paul T. Matjila d,e, Katarzyna Lis a, Marzena Broniszewska a, Heidrun Schöl a, Joana Ferrolho f, Libor Grubhoffer b, Lygia M.F. Passos a,g

a Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Leopoldstrasse 5, 80802 Munich, Germany
b University of South Bohemia, Faculty of Science and Biology Centre of the ASCR, Parasitology Institute, České Budějovice, Czech Republic
c Onderstepoort Veterinary Institute, Private Bag X5, Onderstepoort 0110, South Africa
d Department of Veterinary Tropical Diseases, Faculty of Veterinary Science, University of Pretoria, Private Bag X4, Onderstepoort 0110, South Africa
e Department of Life and Consumer Sciences, College of Agriculture and Environmental Sciences, Unisa, Private Bag X06, Florida 1710, South Africa
f The Pirbright Institute, Pirbright Laboratory, Ash Road, Pirbright, Surrey GU24 0NF, UK
g Departamento de Medicina Veterinária Preventiva, Escola de Veterinária-UFMG, Belo Horizonte, Minas Gerais, Brazil

Article info

Article history: Received 28 September 2013. Received in revised form 26 January 2014. Accepted 26 January 2014. Available online xxx.

Keywords: Ehrlichia canis; In vitro culture; IDE8 tick cells; DH82; 16S rRNA; gp36

Abstract

Ehrlichia canis, the etiologic agent of canine ehrlichiosis, is an obligate intracytoplasmic Gram-negative tick-borne bacterium belonging to the Anaplasmataceae family. E. canis is distributed worldwide and can cause serious and fatal infections in dogs. Among strains of E. canis, the 16S rRNA gene DNA sequences are highly conserved. Using this gene to genetically differentiate isolates is therefore difficult. As an alternative, the gene gp36, which encodes for a major immunoreactive protein in E. canis, has been successfully used to characterize the genetic diversity of this pathogen. The present study describes the isolation and continuous propagation of a Spanish and 2 South African isolates of E. canis in IDE8 tick cells. Subsequently, canine DH82 cell cultures were infected using initial bodies obtained from infected IDE8 cultures. It was possible to mimic the life cycle of E. canis in vitro by transferring infection from tick cells to canine cells and back again. To characterize these E. canis strains at the molecular level, the 16S rRNA and gp36 genes were amplified by PCR, sequenced, and aligned with corresponding sequences available in GenBank. All 16S rRNA sequences amplified in this study were identical to previously reported E. canis strains. Maximum likelihood analysis based on the gp36 amino acid sequences showed that the South African and Spanish strains fall into 2 well-defined phylogenetic clusters amongst other E. canis strains. The members of these 2 phylogenetic clusters shared 2 unique molecular properties in the gp36 amino acid sequences: (i) deletion of glycine 117 and (ii) the presence of an additional putative N-linked glycosylation site. We further show correlation between the putative secondary structure and the theoretical isoelectric point (pI) of the gp36 amino acid sequences. A putative role of gp36 as an adhesin in E. canis is discussed. Overall, we report the successful in vitro culture of 3 new E. canis strains which present different molecular properties in their gp36 sequences.

© 2014 Elsevier GmbH. All rights reserved.

∗ Corresponding author. Tel.: +49 89 2180 6837; fax: +49 89 2180 3623. E-mail address: [email protected] (E. Zweygarth).
http://dx.doi.org/10.1016/j.ttbdis.2014.01.011
1877-959X/© 2014 Elsevier GmbH. All rights reserved.

Introduction

Canine ehrlichiosis or tropical canine pancytopenia is an infectious, non-contagious, tick-borne disease of domestic dogs (Donatien and Lestoquard, 1935) and some wild Canidae (Amyx and Huxsoll, 1973; Harvey et al., 1979; van Heerden, 1979; Price and Karstad, 1980) caused by the bacterium Ehrlichia canis. The agent has a tropism for canine monocytes and macrophages, and therefore the disease is known as canine monocytic ehrlichiosis. E. canis has also been identified as cause of human ehrlichiosis (Maeda et al., 1987; Perez et al., 1996, 2006). The brown dog tick Rhipicephalus sanguineus is an efficient vector of E. canis (Donatien and Lestoquard, 1936; Groves et al., 1975; Bremer et al., 2005). Experimental transmission of E. canis has also been demonstrated to occur through the ixodid tick Dermacentor variabilis (Johnson et al., 1998). The disease was first described in Algeria (Donatien and Lestoquard, 1935) and has a worldwide geographical distribution which coincides with the distribution of the vector ticks. Ehrlichiosis in dogs was first described in South Africa in 1938 (Neitz and Thomas, 1938) and was first reported in Spain 50 years later (Font et al., 1988).

E. canis was first propagated in monocyte cell cultures derived from the peripheral blood of acutely infected dogs (Nyindo et al., 1971). Hemelt et al. (1980) further improved this method by adding E. canis-infected monocytes to uninfected monocyte cultures obtained from healthy donor dogs. This technique allowed cultures to be maintained far beyond 100 passages. Due to the fact that normal monocytes do not proliferate and only have a limited life span in culture, both the above-mentioned techniques required a constant supply of monocytes from donor dogs. This changed when cells of a dog suffering from malignant histiocytosis gave rise to a continuous cell line, namely DH82 (Wellman et al., 1988). The DH82 cell line was then successfully used to continuously propagate E. canis in vitro at 37 °C (Dawson et al., 1991) thus minimizing the risk of introducing extraneous biological contaminants. Other heterologous host cells were used for in vitro cultivation of E. canis, including human–canine hybrid cells (Stephenson and Osterman, 1980), immortal human microvascular cells (Dawson et al., 1993), and a continuous BALB/C mouse macrophage cell line (Keysary et al., 2001). The first successful in vitro isolation and propagation of E. canis in Ixodes scapularis-derived IDE8 tick cell cultures was reported by Ewing et al. (1995). Subsequently, E. canis was also propagated in another I. scapularis cell line, ISE6 (Singu et al., 2006) and in an Ixodes ricinus-derived cell line IRE/CTVM18 (Bell-Sakyi et al., 2007).

In order to fully identify and differentiate isolated pathogens, the in vitro culture must be complemented by a variety of other molecular techniques. The molecular characterization of evolutionarily conserved genes such as 16S rRNA is useful to identify bacteria at species level (Woese, 1987), but due to the high degree of conservation such genes provide little information on strain diversity, and more variable genes must be used instead (Zhang et al., 2008). E. canis has a single circular chromosome containing 1,315,030 nucleotides predicted to encode 925 proteins (Mavromatis et al., 2006). A small subset of these proteins react strongly with antibodies and are therefore considered to be major immunoreactive proteins (McBride et al., 2003). The genes encoding these proteins may exhibit a high level of diversity as a result of increased selective pressure by the immune system. In fact, the study of several of these proteins, including gp200, gp140, gp19, and gp36, has shown a high genetic diversity among geographically distant isolates of E. canis (Zhang et al., 2008). Of these genes, E. canis gp36 was the most divergent (Zhang et al., 2008). E. canis gp36 is an acidic serine-rich glycoprotein that contains a tandem repeat region in which major antibody epitopes are located (Doyle et al., 2006). This glycoprotein has been successfully used for the characterization of the genetic variability of E. canis (Hsieh et al., 2010; Kamani et al., 2013) as well as for a diagnostic ELISA system for E. canis (Cardenas et al., 2007). Furthermore, it has been determined by amino acid homology and genomic synteny analysis that E. canis gp36 has ortholog proteins in Ehrlichia chaffeensis (gp47) and Ehrlichia ruminantium (mucin-like protein) (Doyle et al., 2006). In the present study, we report the isolation of 3 new E. canis strains from infected dogs into IDE8 cells, 2 from South Africa and one from Spain. The strains were cultivated in vitro in both IDE8 and DH82 cell lines. The molecular characterization of these strains was carried out using both 16S rRNA and gp36 genes. After performing phylogenetic and molecular analyses, using the gp36 sequences from our strains, but also from 18 gp36 gene sequences from representative E. canis strains available in GenBank, we could conclude (i) that worldwide E. canis strains fall into 2 well-defined phylogenetic clusters and (ii) that correlation exists between the number of tandem repeats, the theoretical isoelectric point (pI), and the putative secondary structure in the E. canis immunoreactive glycoprotein gp36.

Materials and methods

Cell lines and culture media

The tick cell line IDE8 derived from I. scapularis embryos (Munderloh et al., 1994) was maintained at 32 °C in L-15B medium (Munderloh and Kurtti, 1989) supplemented with 5% heat-inactivated fetal bovine serum, 10% tryptose phosphate broth, 0.1% bovine lipoprotein concentrate (MP Biomedicals, Santa Ana, CA, USA), 100 IU/ml penicillin, and 100 µg/ml streptomycin. Infected cell cultures were propagated in a modified L-15B medium as outlined above, but without antibiotics, and further supplemented with 0.1% NaHCO3 and 10 mM HEPES. The pH of the medium was adjusted to approximately 7.5. The modified L-15B medium is referred to as complete culture medium (CCM). Uninfected DH82 cells, originally derived from a dog suffering from malignant histiocytosis (Wellman et al., 1988), were propagated as described elsewhere (Zweygarth and Josemans, 2001), whereas infected DH82 cell cultures were propagated in CCM.

E. canis strains

Blood samples were obtained from 2 dogs diagnosed with ehrlichiosis at the Onderstepoort Veterinary Hospital, Faculty of Veterinary Science, University of Pretoria (South Africa). These samples are referred to as strains South Africa 171 and 222. The blood sample from a Spanish dog, here referred to as Spain 105, was obtained from the Diagnostic Laboratory of the Institute for Comparative Tropical Medicine and Parasitology, University of Munich (Germany). This dog had been imported from Spain to Germany. It was asymptomatic, but had a chronic E. canis infection. At the time of culture initiation, Giemsa-stained blood smears prepared from all the dogs showed typical ehrlichial inclusions in monocytic cells.

E. canis cell cultures

Two different methods of culture initiation were applied for the South African isolates. Blood of strain South Africa 171 was drawn into vacutubes containing EDTA as anticoagulant. After centrifugation for 10 min at 500 × g at room temperature, buffy coat cells were harvested and washed once in physiological phosphate-buffered saline (PBS). The buffy coat and contaminating erythrocytes were layered onto 3 ml Histopaque®-1077 Hybri-Max® (Sigma, St. Louis, MO, USA) and centrifuged at 800 × g for 20 min at room temperature. Cells at the interphase were collected and washed in PBS twice, resuspended in CCM and added to one T25 cell culture flask (TPP, Trasadingen, Switzerland) containing IDE8 cells. The second method was used for strain South Africa 222 because of the very small volume of blood obtained (200 µl). Buffy coat cells were harvested after centrifugation for 10 min at 500 × g at room temperature and washed only once in PBS. Buffy coat cells contaminated with erythrocytes were then used to infect one flask of IDE8 cells. A third method was used for strain Spain 105. A 0.5-ml aliquot of blood was diluted in 10 ml of PBS and subsequently centrifuged for 5 min at 515 × g at 18 °C. The supernatant was removed, and the blood cell pellet was resuspended in 9 ml of double-distilled sterile water for approximately 30 s to lyse most of the erythrocytes. Physiological tonicity was restored by the addition of 1 ml of 10-fold concentrated Hank’s balanced salt solution. The lysate was

centrifuged for 5 min at 290 × g at 18 °C. The pellet was resuspended in CCM and inoculated into one IDE8 culture flask.

All infected cultures were propagated at 34 °C in 25-cm² plastic culture flasks in 5 ml of CCM. Three milliliters of medium were changed twice weekly. IDE8 cultures heavily infected with the respective E. canis strain were subcultured when almost all cells had detached from the substrate. Varying amounts of the cell suspensions were then distributed into uninfected IDE8 culture flasks. To infect DH82 cells, infected IDE8 cell culture suspensions of culture passage 10 were centrifuged at 290 × g for 2 min in order to remove the majority of the tick cells. One milliliter of the supernatant was distributed into culture flasks containing uninfected DH82 cells together with 4 ml CCM and incubated at 34 °C. After 3 days, the medium was replaced with 5 ml of CCM, thereafter 3 ml of medium were changed twice weekly.

Once the E. canis-infected DH82 cultures reached an infection rate of approximately 70%, 1 ml of cell suspension of culture passage 5 was used to reinfect normal IDE8 cells. After 3 days, the medium was replaced with 5 ml of fresh medium. Three milliliters of medium were then changed twice weekly.

Microscopic examinations were carried out to demonstrate the presence of E. canis in the respective cells. Small samples from the cell layer were removed from the cultures using a sterile 21-gauge needle with a bent tip, and smears were prepared. In addition, cytocentrifuged smears were made from cultures in which some of the cells were in suspension. Smears were allowed to dry before being fixed with methanol and stained with Giemsa.

Genomic DNA isolation

Total DNA was extracted from IDE8 cultures highly infected with each of the respective E. canis strains with DNeasy Blood & Tissue Kit (Qiagen, Gaithersburg, CA, USA) following the manufacturer’s instructions. The DNA concentration and purity were determined by measuring the optical density at both 260 nm and 280 nm with a DNA-RNA calculator (NanoDrop® ND-1000, Peqlab, Erlangen, Germany).

PCR, cloning and sequencing

The primers used in this study to amplify 16S rRNA and gp36 genes have been reported previously (Warner and Dawson, 1996; Hsieh et al., 2010) and are shown in Table 1. Two independent PCR reactions were performed for each gene. The PCR reactions were carried out in an Eppendorf thermocycler (Eppendorf, Hamburg, Germany) as described previously (Hsieh et al., 2010). The PCR products were stained using an ethidium bromide-free system and visualized on 0.8% agarose minigels. The size of the amplified fragments was estimated by comparison with the DNA molecular weight marker (100-bp DNA Ladder; Promega, Madison, WI, USA). In each case, the single amplified product was column-purified using the QIAquick PCR Purification Kit (Qiagen, USA) and then ligated into the TOPO Cloning Kit (Invitrogen, Carlsbad, CA, USA) for subsequent cloning and sequencing. For both genes in each strain, five individual clones containing the cloned fragment in the TOPO vector were purified using the QIAprep Spin Miniprep Kit (Qiagen, USA) and prepared for sequencing using an ABI 3130 sequencer (Applied Biosystems, Foster City, CA, USA) and the Big Dye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA). The final sequences were a consensus of the sequences obtained from each of the five clones.

Sequence identity, alignments and phylogenetic analysis

Identity and alignment analyses were performed using BLAST server (Zhang et al., 2000) or the multiple-alignment program ClustalW (Thompson et al., 1994). The phylogenetic analysis was performed as follows: sequences were aligned with MUSCLE (v3.7) configured for highest accuracy (Edgar, 2004). After alignment, ambiguous regions were removed with Gblocks (v0.91b) (Castresana, 2000). The phylogenetic tree was reconstructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) (Guindon and Gascuel, 2003). Reliability for the internal branch was assessed using the bootstrapping method (1000 bootstrap replicates). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) (Chevenet et al., 2006).

Analysis of the gp36 gene and putative amino acid sequences

The gp36 amino acid sequences from the E. canis strains reported in this study were tested for the presence of signal peptide sequences with the computational algorithm SignalP trained on Gram-negative bacteria (Nielsen et al., 1997). The gp36 protein sequence was evaluated for potential mucin-type O-linked glycosylation on serines and threonines with the computational algorithm NetOGlyc v3.1 (Julenius et al., 2005); for N-linked glycosylation the NetNGlyc 1.0 Server (NetNGlyc 1.0 Server, http://www.cbs.dtu.dk/services/NetNGlyc/) was used. The tandem repeats finder (TRF) database (Benson, 1999) was used to predict the presence of tandem repeats in gp36. Secondary structure was predicted using the position-specific scoring matrices method (Jones, 1999) from the PSIPRED server (Buchan et al., 2010). The pI was calculated using the ProtParam tool (Gasteiger et al., 2005). For convenient gp36 sequence comparison, the gp36 genes and predicted amino acid sequences were divided into three regions following the method reported by Hsieh et al. (2010): Region I was the 5′ end pre-tandem repeat region composed of 426–429 bp/142–143 aa at the N terminus of the encoded protein, Region II the tandem repeat region (variable numbers of the 27 bp/9 aa repeat unit depending on the strain), and Region III the 3′ end post-repeat region (81–93 bp/28–30 aa) at the C terminus of the encoded protein.

Sequences used in this study

The following sequences obtained from E. canis were deposited in GenBank, and their accession numbers are: E. canis South Africa 171 16S rRNA (KC479023), gp36 (KC479020); E. canis South Africa 222 16S rRNA (KC479024), gp36 (KC479021); and E. canis Spain 105 16S rRNA (KC479022), gp36 (KC479019). The following E. canis gp36 amino acid sequences obtained from GenBank were used for the phylogenetic tree and molecular analysis: China-TWN1 (ABS82573), China-TWN2 (ABU44524), China-TWN3 (ABV26011), China-TWN4 (ABX71625), Brazil-São Paulo (ABA39257), USA-Louisiana (ABA39254), USA-Oklahoma (AAZ40200), USA-Florida (ABA39255), USA-Jake 2 (AAZ68160), USA-DJ (ABA39256), USA-Jake 1 (AAZ40199), USA-Demon (AAZ40201), Africa-Cameroon (ABA39258), Africa-Nigeria 64 (JN622143), Africa-Nigeria 94 (JN982341), Africa-Nigeria 80 (JN982338), Israel-611 (ABV02078), and Israel-Ranana (ABW91006). Ehrlichia chaffeensis Arkansas gp47 (AAZ40202) was used to root the gp36 phylogenetic tree.

Results

E. canis cultures

Leukocytes isolated from the blood of three naturally infected dogs were used as infective inoculum. All the E. canis strains were successfully established in IDE8 cell cultures and were detected in Giemsa-stained culture smears 10–15 days after initiation. Suspensions of organisms obtained from infected IDE8 cell cultures of


Table 1 Primers used in this study for the amplification of the 16S rRNA and gp36 genes from E. canis strains.

Target      Primer (a)     Sequence                           Expected size (kb)
16S rRNA    8F (b)         5′-AGTTTGATCATGGCTCAG-3′           1.4
            1448R          5′-CCATGGCGTGACGGGCAGTGTG-3′
gp36        EC36-F1 (c)    5′-GTATGTTTCTTTTATATCATGGC-3′      0.8–1.0
            EC36-R1        5′-GGTTATATTTCAGTTATCAGAAG-3′

(a) Primer F is forward and primer R is reverse.
(b) Warner and Dawson (1996).
(c) Hsieh et al. (2010).
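As a quick sanity check on the primer sequences listed in Table 1, the sketch below computes length, GC content, and reverse complement for each primer. The sequences are copied from the table; the helper names (`gc_percent`, `reverse_complement`) are illustrative and are not part of the published protocol.

```python
# Primer sequences copied from Table 1 (5' to 3'); the helpers are illustrative only.
PRIMERS = {
    "8F": "AGTTTGATCATGGCTCAG",
    "1448R": "CCATGGCGTGACGGGCAGTGTG",
    "EC36-F1": "GTATGTTTCTTTTATATCATGGC",
    "EC36-R1": "GGTTATATTTCAGTTATCAGAAG",
}

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
          f"reverse complement {reverse_complement(seq)}")
```

A check of this kind only catches transcription errors such as non-ACGT characters or unexpected lengths; it says nothing about primer specificity.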

passage 10 were used to infect DH82 cells. These cells became positive 11–16 days after infection. The culture data are summarized in Table 2.

Table 2 In vitro culture data of E. canis strains propagated in IDE8 or DH82 cells.

E. canis strain     Route of infection   Prepatent period (days)   Days in culture   No. of passages
Spain 105           Blood → IDE8         15                        184               20
                    IDE8 → DH82          11                        290               18
South Africa 171    Blood → IDE8         10                        220               31
                    IDE8 → DH82          16                        119               10
South Africa 222    Blood → IDE8         12                        202               27
                    IDE8 → DH82          14                        108               4

E. canis-infected DH82 cultures of strains Spain 105 and South Africa 171 were used to reinfect IDE8 cultures. One milliliter of infected DH82 cell suspension of passage 5 was distributed into a culture flask containing IDE8 cells and 4 ml CCM. E. canis Spain 105 and South Africa 171 organisms were detected in Giemsa-stained smears of IDE8 cells after 18 and 10 days, respectively. Cultures were further propagated in IDE8 cells for at least another three passages before they were discarded.

Sequence analysis of 16S rRNA

For the genetic analysis, we used total DNA extracted from heavily infected IDE8 cells at culture passage 15. In order to obtain relevant information from 16S rRNA at the species level, the primers 8F and 1448R were used to amplify a 1.4-kb fragment. An amplicon of approximately 1.4 kb, corresponding to the expected size of the targeted 16S rRNA gene fragment, was obtained from each of the 3 E. canis strains (data not shown). A consensus sequence of 1.384 kb was obtained from 2 independent PCRs and 5 sequenced clones for each strain. We confirmed that the 3 strains were E. canis, as their sequences were 100% identical with other previously reported E. canis 16S rRNA gene sequences. The 16S rRNA from the strain Spain 105 was identical with a previously reported E. canis strain (GenBank accession number EF011110), and both South African strains were identical with the Taiwanese strains reported by Hsieh et al. (2010).

Sequence analysis of gp36

The gp36-based PCR products derived from the strains reported here had different molecular sizes. Amplicons of 1.0, 0.8, and 0.8 kb were obtained for the E. canis strains South Africa 171, Spain 105, and South Africa 222, respectively (data not shown). Subsequent cloning of the PCR amplicons followed by sequencing showed that our amplicons were 0.906, 0.738, and 0.702 kb in length, encoding proteins with 301, 245, and 233 amino acids, respectively. Predicted signal peptides were detected in all of them. The predicted amino acid sequences from our gp36 genes had variable identity among them and also when compared to previously reported gp36 sequences. While the two South African strains shared 88% identity, the strain Spain 105 showed 87% and 78% identity with the strains South Africa 222 and 171, respectively. When comparing our sequences with previously reported gp36 sequences, we found that the most distant strain was South Africa 171, which had only 73% amino acid identity with an isolate from the United States (USA-Florida; GenBank accession number ABA39255). The strain Spain 105 had high identity with the isolate Israel-Ranana (98%) and lower identity with China-TWN1 (83.7%). Previously reported gp36 sequences from the same geographical region showed high identity (Kamani et al., 2013; Hsieh et al., 2010). For example, the isolates from Nigeria and the isolate Cameroon 71 shared 100% identity (Kamani et al., 2013), the 7 gp36 sequences from isolates from the United States used in this study (see section “Materials and methods” for GenBank accession numbers) shared more than 97% identity, and 4 Taiwanese isolates shared more than 99% identity (Hsieh et al., 2010). As already mentioned above, our two South African strains shared only 88% identity, the lowest percentage of identity for isolates from the same geographical region. For the convenience of the molecular analysis of the gp36 gene and the putative encoded protein, we divided the protein into three regions as proposed by Hsieh et al. (2010) (see section “Materials and methods”).

Region I (the 5′ end pre-tandem repeat region) and gp36 phylogenetic clusters

Nucleotide and amino acid alignments of the gp36 gene obtained in this study revealed that the 5′ end pre-tandem repeat region (Region I) differed in length and identity. In most of the gp36 isolates that have been reported so far, this region contained 429 nucleotides (encoding 143 amino acids). Hsieh et al. (2010) noted that the Taiwanese gp36 isolates have one codon less in this region and thus contain 142 amino acids rather than the 143 amino acids found in other strains. Table 3 shows that both of our South African strains, like the Taiwanese, had 426 nucleotides (encoding 142 amino acids) in this region, while the Spanish strain had 429 nucleotides (encoding 143 amino acids) like the rest of the isolates. At this point, we wanted to explore whether this very specific difference among gp36 sequences followed any phylogenetic pattern. After performing a maximum likelihood phylogenetic analysis based on the gp36 amino acid sequences from our strains and also from 18 previously reported gp36 sequences accessible in GenBank (Fig. 1), we noted firstly that our three strains fell into different phylogenetic clusters and secondly that they clustered together with


other gp36 sequences regarding the number of amino acids in Region I and the presence of a putative site for N-linked glycosylation at position 125 (Fig. 2) (see below). The South African strains fell into a separate cluster together with the Taiwanese isolates (Fig. 1, Cluster B), while strain Spain 105 fell into a clade together with the rest of the sequences (Fig. 1, Cluster A). Members of Cluster B contained one deletion of an amino acid at position 117 (glycine, ΔG) when compared with members of Cluster A (Fig. 2, black dot).

Table 3 Length and percentage identity in the Region I of gp36 among different E. canis strains.

Source: E. canis gp36

                        Nucleotide level              Amino acid level
Strain                  Length (a)    Identity (b)    Length (a)    Identity (b)
USA-Louisiana           429           –               143           –
USA-Oklahoma            429           100             143           100
Brazil-São Paulo        429           99              143           97
Israel 611              429           99              143           97
Africa-Nigeria 64       429           99              143           97
Africa-Cameroon         429           99              143           97
Spain 105 (c)           429           98              143           94
South Africa 171 (c)    426           89              142           81
South Africa 222 (c)    426           89              142           81
China-TWN1              426           87              142           79
China-TWN2              426           87              142           79
China-TWN3              426           87              142           79
China-TWN4              426           87              142           79

(a) Length in base pairs and amino acids, respectively.
(b) Percentages of nucleotide and amino acid identity were calculated with ClustalW. All the sequences were compared to E. canis USA-Louisiana.
(c) E. canis strains reported in this study.

Region II (the tandem repeat region)

This region of the gp36 gene contained a variable number of repeated units of 27 nucleotides coding for 9 amino acids, depending on the isolate, as reported before by Doyle et al. (2006) and Hsieh et al. (2010). Our strains contained 7 (South Africa 222), 8 (Spain 105), and 16 (South Africa 171) tandem repeats. Both the nucleotide and amino acid sequences in the tandem repeat region were highly conserved (Table 4). A consensus sequence of nine amino acids (TEDSVSAPA) is encoded by the repeat units, and they rarely present amino acid changes (Zhang et al., 2008). As shown in Table 4, the strains reported in this work had 100% amino acid identity in this region when compared with other isolates.

Region III (the 3′ end post-repeat region)

The length of this region varied between 16 and 30 amino acids. The South African strains 171 and 222 had 16 and 28 amino acids, respectively, while the Spanish strain had 30. The South African strains shared a very low identity in this region (18%), while the Spanish strain and the South African 222 and 171 shared 85% and 6% identity, respectively. Compared with other sequences, the South African strains were closely related to the China-TWN isolates, sharing 89.3% identity, and were less similar to the strain Israel-611 (67.9%). The Spain 105 strain was 100% identical with Israel-Ranana and was more distant from Israel-611 (64.3%). No correlation was found between the grouping based on amino acid

Fig. 1. Phylogenetic tree based on the E. canis gp36. The tree shows that our E. canis strains fall into different phylogenetic clusters related to their different geographical locations. Open square shows the strain Spain 105, squares with horizontal and diagonal lines show the strains South Africa 171 and 222, respectively. Different phylogenetic clusters are shown, Clusters A and B. Bootstrap values are shown as % in the internal branch. Only bootstrap values higher than 50% are shown. Ehrlichia chaffeensis gp47 was used as outgroup.
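The Region I and Region II figures quoted in the text follow from simple codon arithmetic and repeat counting, which the sketch below illustrates. It is not the Tandem Repeats Finder procedure cited in Table 4: it only counts exact, back-to-back copies of the consensus unit, so variant repeats would be missed. The function names and the toy protein are hypothetical.

```python
import re

CONSENSUS = "TEDSVSAPA"  # nine-residue repeat unit quoted in the text


def nt_to_aa(n_nt: int) -> int:
    """Number of codons (amino acids) encoded by n_nt in-frame nucleotides."""
    if n_nt % 3:
        raise ValueError("length is not a whole number of codons")
    return n_nt // 3


def count_tandem_repeats(protein: str, unit: str = CONSENSUS) -> int:
    """Size of the longest run of exact, consecutive copies of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", protein)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical toy protein: invented flanks around seven exact repeat copies.
toy = "MKK" + CONSENSUS * 7 + "GGS"

print(nt_to_aa(429), nt_to_aa(426), nt_to_aa(27))  # 143 142 9
print(count_tandem_repeats(toy))                   # 7
```

Exact matching keeps the example short; real gp36 sequences with substituted repeat units would need an approximate matcher.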


Fig. 2. Deletion of glycine and characteristic sequon among members of E. canis gp36 Clusters A and B. The gp36 amino acid sequence from position 110 to 143 is aligned. Members of both Clusters A and B are shown. Members of Cluster A contain G at position 117 (column under black dot) and P between N and S (open box), which prevents glycosylation of N 125, while members of Cluster B have a deletion at G 117 (column under black dot) and present the sequon N-S-S (open box).

number in this region and the 2 phylogenetic clusters described above.

Putative glycosylation of gp36 and phylogenetic Clusters A and B

Doyle et al. (2006) described gp36 as a glycoprotein in three strains of E. canis from the United States (Jake, Oklahoma, and Demon, included in our phylogenetic analysis). In order to evaluate whether this property of gp36 was present in our strains and in all the strains used for the phylogenetic analyses (Fig. 1), we predicted the presence of putative sites for N-glycosylation and O-glycosylation. We found putative N-glycosylation and O-glycosylation sites in all the sequences analyzed. Putative sites for N-glycosylation and O-glycosylation were identified in the gp36 Regions I and II, respectively. N-linked glycosylation requires asparagine (N) to be present in special motifs called sequons (NXS/T), where X is any amino acid except proline (P), and the third residue is either serine (S) or threonine (T) (Spiro, 2002). One putative site of N-glycosylation at position 85 was found to be common to all the strains (data not shown). The second amino acid of this sequon varied between valine (V) and alanine (A) among the gp36 sequences (data not shown). Nevertheless, a second sequon at position 125 (N-S-S) was specifically found in members of Cluster B, but not in members of Cluster A, due to the change of S to P (Fig. 2, box). O-linked glycosylation occurs on the hydroxyl group of S or T amino acid residues (Spiro, 2002). As expected, the Region

Table 4 Summary of tandem repeats found in gp36 from different E. canis strains.

Source: E. canis gp36

                        Nucleotide level (a)         Amino acid level (b)         Consensus in tandem
Strain                  Length   No.   Identity     Length   No.   Identity     repeat sequence (aa) (c)
USA-Louisiana           27       5     99           9        5     99           TEDSVSAPA
USA-Oklahoma            27       5     100          9        5     100          . . . . .
Brazil-São Paulo        27       18    100          9        18    100          . . . . .
Israel 611              27       11    99           9        11    99           . . P. .T .
Africa-Nigeria 64       27       8     100          9        8     100          . . . . .
Africa-Cameroon         27       16    100          9        16    100          . . . . .
Spain 105 (d)           27       8     100          9        8     100          . . . . .
South Africa 171 (d)    27       16    99           9        16    100          . . . . .
South Africa 222 (d)    27       7     100          9        7     100          . . . . .
China-TWN1              27       13    100          9        13    100          . . . . .
China-TWN2              27       12    100          9        12    100          . . . . .
China-TWN3              27       10    100          9        10    100          . . . . .
China-TWN4              27       14    100          9        14    100          . . . . .

(a) The length in base pairs, number of tandem repeats (No.), and the percentage of identity among the tandem repeats in each gp36 gene were determined using the TRF database.
(b) The length in amino acids and the number of tandem repeats (No.) were analyzed in each gp36 protein sequence. The percentage of amino acid identity among the tandem repeats in each gp36 protein sequence was calculated with ClustalW.
(c) The dots below the tandem repeat amino acid sequence indicate conserved sequence compared to E. canis USA-Louisiana.
(d) E. canis strains reported in this study.
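The sequon rule stated in the text (N-X-S/T, with X any residue except proline) can be expressed as a short pattern scan. The sketch below is only an illustration of that rule, not the prediction tool used in the study; the function name and the two toy peptides are hypothetical and are not gp36 sequences.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping candidate motifs be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of N, motif) for every N-X-S/T with X != P."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Toy peptides mimicking the Cluster B (N-S-S) and Cluster A (N-P-S) motifs.
print(find_sequons("AANSSAA"))  # [(3, 'NSS')]
print(find_sequons("AANPSAA"))  # []
```

A motif hit marks only a potential site; whether the asparagine is actually glycosylated has to be tested experimentally, as the text notes for position 125.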


II (tandem repeat region) of gp36, which is an area rich in S and T residues, was predicted to be highly O-glycosylated in all the strains that were analyzed (data not shown).

Theoretical pI and secondary structure of gp36

A correlation was found between the number of tandem repeats and the theoretical pI of the gp36 protein sequences. Fig. 3A shows that an increase in the number of gp36 tandem repeats (Fig. 3A, y-axis) decreases the pI (Fig. 3A, x-axis) of the protein and vice versa. As shown before (Table 4), the gp36 from our 3 E. canis strains contained different numbers of tandem repeats. Correspondingly, they showed different pIs (Fig. 3A, black diamond, black square, and black triangle). We observed the same pattern when any other previously reported gp36 was considered (Fig. 3A, blue circles). Nevertheless, in the gp36 proteins with more than 12 or fewer than five tandem repeats, the rate of change of pI fell markedly. The resulting plot was a sigmoid curve (R² = 0.94) that resembled a classic pH titration curve (Fig. 3A, sigmoid curve). The behavior of the curve could be due to a “buffer effect” of the pre-tandem repeat region. In agreement with the latter observation, we noticed that a high number of tandem repeats (more acidic proteins) correlated with amino acid changes in Region I of gp36 from acidic to basic amino acids (e.g. tyrosine to histidine and methionine to isoleucine) (data not shown). We also found differences in the predicted secondary structure adopted by the first 40 amino acids of the pre-tandem repeat region under different ranges of pI. While the more acidic proteins (lower pI) adopted only β-strand as secondary structure conformation (Fig. 3B), the more basic proteins (higher pI) adopted combinations of both β-strand and α-helix (Fig. 3C).

Discussion

In vitro isolation and propagation of three strains of E. canis was achieved using IDE8 tick cell cultures as primary host cells. The prepatent period in vitro was at most 15 days, regardless of which of the three methods of isolation of E. canis-infected cells was used. Only the two South African strains were isolated from clinically sick dogs, whereas the strain from Spain was obtained from a carrier dog (data not shown); nevertheless, the in vitro prepatent period of all strains was in the same range. In the present experiments, normal atmospheric conditions were used for culture initiation and continuous propagation. In contrast, the first reported isolation of E. canis into IDE8 tick cell cultures nearly 20 years ago required CO2 and a lowered O2 concentration (Ewing et al., 1995). Initial bodies harvested from IDE8 cultures were successfully used to infect the canine cell line DH82. Once infected, DH82-derived initial bodies were able to reinfect IDE8 tick cells. Thus, it was possible to close the life cycle of E. canis in vitro between a tick vector and a mammalian culture system. Ewing et al. (1995) generated E. canis organisms in IDE8 cell cultures and injected them into dogs, which became infected. Subsequently, E. canis could be reisolated from these dogs into IDE8 cells. The fact that the organisms were propagated in vitro in both tick and canine cells in our experiments most probably indicates that culture-generated organisms may be able to infect both dogs and ticks in vivo. In contrast, Mathew et al. (1996) compared the transmissibility by the tick Rhipicephalus sanguineus of a recent isolate of E. canis (Ebony) with that of another isolate (Oklahoma) that had been passaged in DH82 cell culture. The authors assumed that passage of the Oklahoma isolate of E. canis in cell culture adversely affected its transmissibility by ticks, raising the possibility that cell-cultured isolates of E. canis may lose their affinity for ticks. Contradictory results, however, were reported by Unver et al. (2001) using the Oklahoma isolate. They showed that R. sanguineus ticks fed on dogs infected with DH82-derived E. canis (Oklahoma) acquired an infection, although it was not clear whether the ticks were subsequently able to transmit the organisms to naïve dogs. A switch between a tick and a mammalian cell culture system, as has been used in the present experiments, may help to prevent the selection of organisms losing the ability to infect either of the cell types.

Sequence comparison of the 16S rRNA gene is recognized as one of the most powerful and precise methods for determining the phylogenetic relationships of bacteria (Woese, 1987; Warner and Dawson, 1996; Yu et al., 2001). In the present study, using a nearly full-length sequence of the 16S rRNA gene (1370 bp), we identified our strains to species level as E. canis. High levels of identity (99–100%) have been reported among E. canis 16S rRNA sequences. Our 16S rRNA sequences were 100% identical with previously reported E. canis 16S rRNA.

E. canis gp36 and its orthologs in E. chaffeensis and E. ruminantium are divergent genes, due to high evolutionary pressure exerted on them by the immune system (Zhang et al., 2008; Doyle et al., 2006). This gene has been used to characterize new isolates of E. canis where 16S rRNA was not well suited to discriminate between E. canis isolates. Previous phylogenetic analysis of E. canis gp36 shows some geographical correlation (Hsieh et al., 2010; Kamani et al., 2013), suggesting local evolutionary adaptations. Our results show that this pattern is not consistent: while the South African strains fell into a separate cluster from the Taiwanese isolates, in agreement with the geographical distance between South Africa and Taiwan, Spain 105 fell into the same cluster as Israel-Ranana despite the geographical distance between Spain and Israel. This could be due to the introduction of infected individuals (host or vector or both) to different geographical areas, causing the emergence of non-typical E. canis strains in some regions of the world. This explanation has been previously proposed as the cause of the absence of correlation between locality and genetic identity found in strains of other tick-borne pathogens such as Anaplasma marginale (de la Fuente et al., 2007). A new Taiwanese group of E. canis strains has been recently reported, which falls into a separate cluster from the previously reported E. canis gp36 (Hsieh et al., 2010). In this study, the two South African strains fell into the same cluster as the Taiwanese isolates (Cluster B). The E. canis strains found in this cluster shared 2 characteristics in their gp36 amino acid sequences: (i) the pre-tandem repeat region (Region I) contained 142 amino acids due to a deletion of G 117 when compared with members of Cluster A (Fig. 2, black dot), and (ii) the sequon for a potential site of N-glycosylation at position 125 had the sequence N/S/S in members of Cluster B instead of the N/P/S found among members of Cluster A. The presence of P in the middle amino acid position will prevent glycosylation on N due to conformational hindrance. Both changes could have structural, functional, and antigenic implications for gp36. Firstly, glycine is a unique amino acid that gives high conformational flexibility to protein folding due to the absence of a side chain (Betts and Russell, 2003). Secondly, Doyle et al. (2006) described gp36 as a glycoprotein containing predicted O-glycosylation sites on the serines and threonines of the tandem repeat. Glycosylation plays a crucial role in the immunogenicity of these glycoproteins (Zhang et al., 2008; Doyle et al., 2006). Deglycosylation of the gp36 tandem repeat drastically reduces its immunogenicity (Doyle et al., 2006). The effect of the putative N-glycosylation in the E. canis gp36 pre-tandem repeat region that we reported here has not been experimentally evaluated. Nevertheless, N-linked glycosylation plays an important role in cellular biology, affecting several properties of proteins such as solubility, stability and turnover, secretion, protease resistance, protein–protein interaction/recognition, and immunogenicity (Spiro, 2002). For example, the loss of sequons in the simian immunodeficiency virus gp120 glycoprotein was reported to have


Fig. 3. (A–C) Correlation between theoretical isoelectric point (pI), number of tandem repeats, and secondary structure in E. canis gp36. The theoretical pI of the gp36 amino acid sequences from our 3 strains and from other previously reported E. canis strains (see section “Materials and methods” for GenBank accession numbers) was calculated and the secondary structure predicted. (A) We found a correlation between the number of gp36 tandem repeats (y-axis labeled as “No. of TR”) and the theoretical pI (x-axis labeled as “pI”). Our strains are shown: South African strains 171 and 222 (black diamond and black triangle, respectively) and Spain 105 (black square). Previously reported E. canis gp36 sequences are all shown as blue circles. Changes in the secondary structure adopted by the first 40 amino acids of gp36 were found to be related to the theoretical pI of the protein. (B) The putative secondary structure adopted by the first 40 amino acids in the more acidic gp36 (sequences with more than 10 tandem repeats, symbols to the left of the y-axis). (C) Putative secondary structure adopted by the first 40 amino acids in the less acidic gp36 (sequences with less than 10 tandem repeats, symbols to the right of the y-axis). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

a fitness cost, as it leads to failure of infectivity/immune evasion (Ohgimoto et al., 1998).

E. canis gp36 has been described as an acidic serine-rich protein (McBride et al., 2003). The tandem repeat of gp36 contains a high proportion of acidic amino acids. Therefore, we wanted to explore the correlation between the number of tandem repeats, the theoretical isoelectric properties of gp36, and the consequences for the protein structure. We found differences in the predicted secondary structure of the pre-tandem repeat region upon pI variation. In the pI region between 3.96 and 4.08, we observed exclusively β-strand as secondary structure conformation, but between 4.11 and 4.47, a combination of β-strands and α-helices was observed. It has been reported that proteins with acidic pIs tend to degrade more rapidly than those with neutral or basic pIs (Dice and Goldberg, 1975). These authors also suggested that the native conformations of the acidic polypeptides could be inherently less stable, and that acidic proteins only accumulate in regions of the cells where the degradative enzymes are accumulated. The changes in secondary structure were observed in the first 40 amino acids of the protein, which coincides with the location of the putative signal peptide of gp36. These changes in the secondary structure of the more acidic, larger versions of gp36 could be related to a differential secretion pathway to avoid the degradative machinery of the infected cells. On the other hand, an ortholog of E. canis gp36, the mucin-like protein of E. ruminantium (Doyle et al., 2006), was shown in a recombinant version to be an adhesin for IDE8 cells (de la Fuente et al., 2004). Recently, we found evidence that the putative secondary structure adopted by MSP1a in A. marginale, which is also an adhesin, might play a role in tick transmission (Cabezas-Cruz et al., 2013). Further studies are needed to address the question of whether gp36 is an adhesin and whether the different patterns of secondary structure found in our study are related to this property.

In conclusion, we report the successful in vitro culture of 3 new E. canis strains which present different molecular properties in their gp36 sequences. Extending our analysis to E. canis gp36 sequences available in GenBank, we can conclude (i) that worldwide, E. canis strains fall into 2 well-defined phylogenetic clusters that contain at least 2 specific properties, namely (a) deletion of glycine 117 and (b) the presence of an additional putative N-linked glycosylation site in the Region I of gp36; and (ii) that correlations exist between the number of tandem repeats, the theoretical pI, and the putative secondary structure adopted by the E. canis immunoreactive glycoprotein gp36.

Acknowledgements

We thank Prof. Katherine M. Kocan (Oklahoma State University, USA) and Dr. Ulrike G. Munderloh (University of Minnesota, USA) for the provision of the IDE8 cells. A. Cabezas-Cruz, K. Lis, and J. Ferrolho are Early-Stage Researchers supported by the POSTICK ITN (post-graduate training network for capacity building to control ticks and tick-borne diseases) within the FP7-PEOPLE-ITN program (EU Grant No. 238511).

References

Amyx, H.L., Huxsoll, D.L., 1973. Red and gray foxes-potential reservoir hosts for Ehrlichia canis. J. Wildl. Dis. 9, 47–50.
Bell-Sakyi, L., Zweygarth, E., Blouin, E.F., Gould, E.A., Jongejan, F., 2007. Tick cell lines: tools for tick and tick-borne disease research. Trends Parasitol. 23, 450–457.
Benson, G., 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580.
Betts, M.J., Russell, R.B., 2003. Amino acid properties and consequences of substitutions. In: Barnes, M.R., Gray, I.C. (Eds.), Bioinformatics for Geneticists. John Wiley & Sons, Ltd., pp. 289–316.
Bremer, W.G., Schaefer, J.J., Wagner, E.R., Ewing, S.A., Rikihisa, Y., Needham, G.R., Jittapalapong, S., Moore, D.L., Stich, R.W., 2005. Transstadial and intrastadial experimental transmission of Ehrlichia canis by male Rhipicephalus sanguineus. Vet. Parasitol. 131, 95–105.
Buchan, D.W., Ward, S.M., Lobley, A.E., Nugent, T.C., Bryson, K., Jones, D.T., 2010. Protein annotation and modelling servers at University College London. Nucleic Acids Res. 38, 563–568.
Cabezas-Cruz, A., Passos, L.M.F., Lis, K., Kenneil, R., Valdés, J.J., Ferrolho, J., Tonk, M., Pohl, A.E., Grubhoffer, L., Zweygarth, E., Shkap, V., Ribeiro, M.F.B., Estrada-Peña, A., Kocan, K.M., de la Fuente, J., 2013. Functional and immunological relevance of Anaplasma marginale major surface protein 1a sequence and structural analysis. PLoS ONE 8, e65243.
Cardenas, A.M., Doyle, C.K., Zhang, X., Nethery, K., Corstvet, R.E., Walker, D.H., McBride, J.W., 2007. Enzyme-linked immunosorbent assay with conserved immunoreactive glycoproteins gp36 and gp19 has enhanced sensitivity and provides species-specific immunodiagnosis of Ehrlichia canis infection. Clin. Vaccine Immunol. 14, 123–128.
Castresana, J., 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552.
Chevenet, F., Brun, C., Banuls, A.L., Jacq, B., Christen, R., 2006. TreeDyn: towards dynamic graphics and annotations for analysis of trees. BMC Bioinf. 10, 439.


Dawson, J.E., Rikihisa, Y., Ewing, S.A., Fishbein, D.B., 1991. Serologic diagnosis of human ehrlichiosis using two Ehrlichia canis isolates. J. Infect. Dis. 163, 564–567.
Dawson, J.E., Candal, F.J., George, V.G., Ades, E.W., 1993. Human endothelial cells as an alternate to DH82 cells for the isolation of Ehrlichia chaffeensis, E. canis and Rickettsia rickettsii. Pathobiology 61, 293–296.
de la Fuente, J., Ruybal, P., Mtshali, M.S., Naranjo, V., Shuqing, L., Mangold, A.J., Rodríguez, S.D., Jiménez, R., Vicente, J., Moretta, R., Torina, A., Almazán, C., Mbati, P.M., Torioni de Echaide, S., Farber, M., Rosario-Cruz, R., Gortazar, C., Kocan, K.M., 2007. Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet. Microbiol. 119, 382–390.
de la Fuente, J., Garcia-Garcia, J.C., Barbet, A.F., Blouin, E.F., Kocan, K.M., 2004. Adhesion of outer membrane proteins containing tandem repeats of Anaplasma and Ehrlichia species (Rickettsiales: Anaplasmataceae) to tick cells. Vet. Microbiol. 98, 313–322.
Dice, J.F., Goldberg, A.L., 1975. Relationship between in vivo degradative rates and isoelectric points of proteins. Proc. Natl. Acad. Sci. U.S.A. 72, 3893–3897.
Donatien, A., Lestoquard, F., 1935. Existence en Algérie d’une Rickettsia du chien. Bull. Soc. Pathol. Exot. 28, 418–419.
Donatien, A., Lestoquard, F., 1936. Existence de la prémunition dans la rickettsiose naturelle ou expérimentale du chien. Bull. Soc. Pathol. Exot. 29, 378–383.
Doyle, C.K., Nethery, K.A., Popov, V.L., McBride, J.W., 2006. Differentially expressed and secreted major immunoreactive protein orthologs of Ehrlichia canis and E. chaffeensis elicit early antibody responses to epitopes on glycosylated tandem repeats. Infect. Immun. 74, 711–720.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797.
Ewing, S.A., Munderloh, U.G., Blouin, E.F., Kocan, K.M., Kurtti, T.J., 1995. Ehrlichia canis in tick cell culture. In: Proceedings of the 76th Conference of Research Workers in Animal Diseases, Chicago, USA, November 13–14, 1995. Iowa State University Press, Ames (Abstract no. 165).
Font, J., Cairó, J., Callés, A., 1988. Ehrlichiosis canina. Clin. Vet. Pequeños Anim. 8, 141–148.
Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M.R., Appel, R.D., Bairoch, A., 2005. Protein identification and analysis tools on the ExPASy server. In: Walker, J.M. (Ed.), The Proteomics Protocols Handbook. Humana Press, Inc., Totowa, NJ, pp. 571–607.
Groves, M.G., Dennis, G.L., Amyx, H.L., Huxsoll, D.L., 1975. Transmission of Ehrlichia canis to dogs by ticks (Rhipicephalus sanguineus). Am. J. Vet. Res. 36, 937–940.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Harvey, J.W., Simpson, C.F., Gaskin, J.M., Sameck, J.H., 1979. Ehrlichiosis in wolves, dogs, and wolf–dog crosses. J. Am. Vet. Med. Assoc. 175, 901–905.
Hemelt, I.E., Lewis, G.E., Huxsoll, D.L., Stephenson, E.H., 1980. Serial propagation of Ehrlichia canis in primary canine peripheral blood monocyte cultures. Cornell Vet. 70, 37–42.
Hsieh, Y.C., Lee, C.C., Tsang, C.L., Chung, Y.T., 2010. Detection and characterization of four novel genotypes of Ehrlichia canis from dogs. Vet. Microbiol. 146, 70–75.
Johnson, E.M., Ewing, S.A., Barker, R.W., Fox, J.C., Crow, D.W., Kocan, K.M., 1998. Experimental transmission of Ehrlichia canis (Rickettsiales: Ehrlichieae) by Dermacentor variabilis (Acari: Ixodidae). Vet. Parasitol. 74, 277–288.
Jones, D.T., 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292, 195–202.
Julenius, K., Molgaard, A., Gupta, R., Brunak, S., 2005. Prediction, conservation analysis, and structural characterization of mammalian mucin type O-glycosylation sites. Glycobiology 15, 153–164.
Kamani, J., Lee, C.C., Haruna, A.M., Chung, P.J., Weka, P.R., Chung, Y.T., 2013. First detection and molecular characterization of Ehrlichia canis from dogs in Nigeria. Res. Vet. Sci. 94, 27–32.
Keysary, A., Waner, T., Strenger, C., Harrus, S., 2001. Cultivation of Ehrlichia canis in a continuous BALB/C mouse macrophage cell culture line. J. Vet. Diagn. Invest. 13, 521–523.
Maeda, K., Markowitz, N., Hawley, R.C., Ristic, M., Cox, D., McDade, J.E., 1987. Human infection with Ehrlichia canis, a leukocytic rickettsia. N. Engl. J. Med. 316, 853–856.
Mathew, J.S., Ewing, S.A., Barker, R.W., Fox, J.C., Dawson, J.E., Warner, C.K., Murphy, G.L., Kocan, K.M., 1996. Attempted transmission of Ehrlichia canis by Rhipicephalus sanguineus after passage in cell culture. Am. J. Vet. Res. 57, 1594–1598.
Mavromatis, K., Doyle, C.K., Lykidis, A., Ivanova, N., Francino, M.P., Chain, P., Shin, M., Malfatti, S., Larimer, F., Copeland, A., Detter, J.C., Land, M., Richardson, P.M., Yu, X.J., Walker, D.H., McBride, J.W., Kyrpides, N.C., 2006. The genome of the obligately intracellular bacterium Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies. J. Bacteriol. 188, 4015–4023.
McBride, J.W., Corstvet, R.E., Gaunt, S.D., Boudreaux, C., Guedry, T., Walker, D.H., 2003. Kinetics of antibody response to Ehrlichia canis immunoreactive proteins. Infect. Immun. 71, 2516–2524.
Munderloh, U.G., Kurtti, T.J., 1989. Formulation of medium for tick cell culture. Exp. Appl. Acarol. 7, 219–229.
Munderloh, U.G., Liu, Y., Wang, M., Chen, C., Kurtti, T.J., 1994. Establishment, maintenance and description of cell lines from the tick Ixodes scapularis. J. Parasitol. 80, 533–543.
Neitz, W.O., Thomas, A.D., 1938. Rickettsiosis in the dog. J. S. Afr. Vet. Med. Assoc. 9, 166–174.
Nielsen, H., Engelbrecht, J., Brunak, S., von Heijne, G., 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1–6.
Nyindo, M.B.A., Ristic, M., Huxsoll, D.L., Smith, A.R., 1971. Tropical canine pancytopenia: in vitro cultivation of the causative agent–Ehrlichia canis. Am. J. Vet. Res. 32, 1651–1658.
Ohgimoto, S., Shioda, T., Mori, K., Nakayama, E.E., Hu, H., Nagai, Y., 1998. Location-specific, unequal contribution of the N glycans in simian immunodeficiency virus gp120 to viral infectivity and removal of multiple glycans without disturbing infectivity. J. Virol. 72, 8365–8370.
Perez, M., Bodor, M., Zhang, C., Xiong, Q., Rikihisa, Y., 2006. Human infection with Ehrlichia canis accompanied by clinical signs in Venezuela. Ann. N. Y. Acad. Sci. 1078, 110–117.
Perez, M., Rikihisa, Y., Wen, B., 1996. Ehrlichia canis-like agent isolated from a man in Venezuela: antigenic and genetic characterization. J. Clin. Microbiol. 34, 2133–2139.
Price, J.E., Karstad, L.H., 1980. Free-living jackals (Canis mesomelas) – potential reservoir hosts for Ehrlichia canis in Kenya. J. Wildl. Dis. 16, 469–473.
Singu, V., Peddireddi, L., Sirigireddy, K.R., Cheng, C., Munderloh, U., Ganta, R.R., 2006. Unique macrophage and tick cell-specific protein expression from the p28/p30-outer membrane protein multigene locus in Ehrlichia chaffeensis and Ehrlichia canis. Cell. Microbiol. 8, 1475–1487.
Spiro, R.G., 2002. Protein glycosylation: nature, distribution, enzymatic formation, and disease implications of glycopeptide bonds. Glycobiology 12, 43–56.
Stephenson, E.H., Osterman, J.V., 1980. Somatic cell hybrids of canine peritoneal macrophages and SV-40 transformed human cells: derivation, characterization, and infection with Ehrlichia canis isolates. Am. J. Vet. Res. 41, 234–240.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Unver, A., Ohashi, N., Tajima, T., Stich, R.W., Grover, D., Rikihisa, Y., 2001. Transcriptional analysis of p30 major outer membrane multigene family of Ehrlichia canis in dogs, ticks, and cell culture at different temperatures. Infect. Immun. 69, 6172–6178.
van Heerden, J., 1979. The transmission of canine ehrlichiosis to the wild dog Lycaon pictus (Temminck) and black-backed jackal Canis mesomelas (Schrebber). J. S. Afr. Vet. Assoc. 50, 245–248.
Warner, C., Dawson, J., 1996. Genus- and species-level identification of Ehrlichia species by PCR and sequencing. In: Persing, D.H. (Ed.), PCR Protocols for Emerging Infectious Diseases. ASM Press, Washington, DC, pp. 100–105.
Wellman, M.L., Krakowka, S., Jacobs, R.M., Kociba, G.J., 1988. A macrophage–monocyte cell line from a dog with malignant histiocytosis. In Vitro Cell. Dev. Biol. 24, 223–229.
Woese, C.R., 1987. Bacterial evolution. Microbiol. Rev. 51, 221–271.
Yu, X., Zhang, X., McBride, J., Zhang, Y., Walker, D., 2001. Phylogenetic relationships of Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species determined by GroEL aa sequences. Int. J. Syst. Evol. Microbiol. 51, 1143–1146.
Zhang, X., Luo, T., Keysary, A., Baneth, G., Miyashiro, S., Strenger, C., Waner, T., McBride, J.W., 2008. Genetic and antigenic diversities of major immunoreactive proteins in globally distributed Ehrlichia canis strains. Clin. Vaccine Immunol. 15, 1080–1088.
Zhang, Z., Schwartz, S., Wagner, L., Miller, W., 2000. A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203–214.
Zweygarth, E., Josemans, A.I., 2001. Continuous in vitro propagation of Cowdria ruminantium (Welgevonden stock) in a canine macrophage–monocyte cell line. Onderstepoort J. Vet. Res. 68, 155–157.

Chapter V

Ehrlichia mineirensis, a new species of the genus Ehrlichia closely

related to E. canis, presents a divergent gp36 ortholog.

Cabezas-Cruz A., Zweygarth E., Ribeiro M F B., da Silveira A G J., de la Fuente J., Grubhoffer L., Valdés J J., Passos L M F. 2012. New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats. Parasites & Vectors. 5: 291.

Cabezas-Cruz A., Vancová M., Zweygarth E., Ribeiro M F B., Grubhoffer L., Passos L M F. 2013. Ultrastructure of Ehrlichia mineirensis, a new member of the Ehrlichia genus. Veterinary Microbiology. doi: 10.1016

Zweygarth E., Schöl H., Lis K., Cabezas-Cruz A., Thiel C., Silaghi C., Ribeiro M F B., Passos L M F. 2013. In vitro culture of a novel genotype of Ehrlichia sp. from Brazil. Transboundary and Emerging Diseases. 60 (Suppl. 2): 86–92.

Chapter V.I

New species of Ehrlichia isolated from Rhipicephalus

(Boophilus) microplus shows an ortholog of the E. canis

major immunogenic glycoprotein gp36 with a new sequence

of tandem repeats.

Cabezas-Cruz A., Zweygarth E., Ribeiro M F B., da Silveira A G J., de la Fuente J., Grubhoffer L., Valdés J J., Passos L M F. 2012. New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats. Parasites & Vectors. 5: 291.


RESEARCH Open Access

New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats

Alejandro Cabezas Cruz1, Erich Zweygarth2, Mucio Flavio Barbosa Ribeiro3, Julia Angelica Gonçalves da Silveira3, Jose de la Fuente4,5, Libor Grubhoffer1, James J Valdés1 and Lygia Maria Friche Passos2,6*

Abstract

Background: Ehrlichia species are the etiological agents of emerging and life-threatening tick-borne human zoonoses that inflict serious and fatal infections in companion animals and livestock. The aim of this paper was to phylogenetically characterise a new species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus from Minas Gerais, Brazil.

Methods: The agent was isolated from the hemolymph of Rhipicephalus (B.) microplus engorged females that had been collected from naturally infested cattle on a farm in the state of Minas Gerais, Brazil. This agent was then established and cultured in IDE8 tick cells. The molecular and phylogenetic analysis was based on the 16S rRNA, groEL, dsb, gltA and gp36 genes. We used the maximum likelihood method to construct the phylogenetic trees.

Results: The phylogenetic trees based on 16S rRNA, groEL, dsb and gltA showed that the Ehrlichia spp isolated in this study falls in a clade separated from any previously reported Ehrlichia spp. The molecular analysis of the ortholog of gp36, the major immunoreactive glycoprotein in E. canis and ortholog of the E. chaffeensis gp47, showed a unique tandem repeat of 9 amino acids (VPAASGDAQ) when compared with those reported for E. canis, E. chaffeensis and the related mucin-like protein in E. ruminantium.

Conclusions: Based on the molecular and phylogenetic analysis of the 16S rRNA, groEL, dsb and gltA genes we concluded that this tick-derived microorganism isolated in Brazil is a new species, named E. mineirensis (UFMG-EV), with predicted novel antigenic properties in the gp36 ortholog glycoprotein. Further studies on this new Ehrlichia spp should address questions about its transmissibility by ticks and its pathogenicity for mammalian hosts.

Keywords: Ehrlichia spp, Rhipicephalus (Boophilus) microplus, Phylogenetic analysis, Gp36 major immunogenic protein

* Correspondence: [email protected]
2 Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Munich, Germany
6 Departamento de Medicina Veterinaria Preventiva, INCT-Pecuária, Escola de Veterinária-UFMG, Belo Horizonte, Minas Gerais, Brazil
Full list of author information is available at the end of the article

© 2012 Cruz et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Background

The emergence of multiple Ehrlichia species as etiological agents of newly discovered human zoonoses and the previous recognition of these agents as causing serious disease in companion animals and livestock have intensified the interest in these pathogens. Ehrlichiae are tick-transmitted obligate intracellular gram-negative bacteria that are maintained in nature by persistent infection of mammalian hosts [1]. They are microorganisms residing within the cytoplasmic vacuoles of monocytes, granulocytes, or platelets of humans and animals. Ehrlichia species elicit illnesses with fever, headache, leukopenia, and thrombocytopenia [2].

The obligately intracellular alpha-proteobacterial genus Ehrlichia (Rickettsiales: Anaplasmataceae) is distributed worldwide and comprises five recognized species that are tick-transmitted, with three of the five causing human ehrlichiosis (E. canis, E. chaffeensis, and E. ewingii) [3]. The agent that causes the veterinary disease heartwater (E. ruminantium) can potentially infect humans [2,4] and Ehrlichia muris has not been associated with human infection. In addition, numerous candidate entities have been reported (“E. walkerii”, “E. shimanensis”, “Ixodes ovatus ehrlichia”, “Panola Mountain ehrlichia”, etc.), all isolated from hard ticks and mainly characterized by PCR sequencing [3]. To date, only three species of the genus Ehrlichia have been reported in Brazil: E. canis, E. ewingii and E. chaffeensis [5].

Different hard tick species have been associated with transmitting members of the genus Ehrlichia: Rhipicephalus sanguineus and Dermacentor variabilis (E. canis), Amblyomma americanum [6] and Dermacentor variabilis [5] (E. chaffeensis and E. ewingii), Haemaphysalis spp and Ixodes spp (E. muris) and Amblyomma spp (E. ruminantium) [6].

Polyphasic taxonomy has been advocated to ensure well-balanced determination of taxonomic relationships [7]. Different genes have been proposed to classify ehrlichial agents. The most widely used are 16S rRNA [8,9], the groESL operon [10], the groEL gene [11], gltA [7], dsb [12], gp36 and gp19 [13]. gp36 belongs to the group of major immunogenic antigens in E. canis (gp36) and E. chaffeensis (gp47), and both are orthologs of the mucin-like protein in E. ruminantium. These glycoproteins have tandem repeats that contain major B-cell epitopes with carbohydrate determinants, which contribute substantially to the immunoreactivity of these proteins. Only five types of tandem repeats have been characterized [14]. Of these glycoproteins, gp36 is the most divergent gene among E. canis isolates [15]. Nevertheless, the tandem repeat is highly conserved among different isolates, changing only in the number of repeats [13] and in a few amino acids among E. canis isolates [15].

Recently, we have isolated an organism from hemolymph of R. (B.) microplus engorged females which had been collected from naturally infested cattle in Brazil (unpublished data). This organism has been propagated continuously in vitro, both in a tick cell line (IDE8) and in a monocyte-macrophage cell line from a dog (DH82), and has been initially characterised as a new genotype of Ehrlichia spp (UFMG-EV strain) [16]. In the present study we report further molecular and phylogenetic analyses focusing on five genes (16S rRNA, groESL, gltA, dsb and gp36) of this new organism, hereafter referred to as Ehrlichia mineirensis (UFMG-EV).

Methods

Organism isolation and in vitro cultivation

Eleven R. (B.) microplus engorged females, larger than 4.5 mm in length, were collected from naturally infested calves (4 to 6 months old) from a farm in Minas Gerais, Brazil. The ticks were washed, blotted dry, and disinfected with Germekil (Johnson, Brazil) for 30 minutes at room temperature. After several washes in sterile distilled water, the ticks were individually placed into polystyrene plates and were incubated at 27°C and relative humidity over 83%. After a 10-day incubation period hemolymph was collected to provide material for infecting IDE8 cells [17]. Each tick was held with sterile forceps, the cuticula was again sterilized, as previously described, and the leg cut with a sterile scalpel blade. The hemolymph was collected using a capillary tube to gather the draining fluid. Hemolymph from three ticks was pooled in a tube containing 200 μl of culture medium, which constituted the inoculum to infect one culture flask containing a growing IDE8 cell monolayer.

After infection, the culture flask was monitored daily by examination of cytocentrifuge smears made from 50 μl aliquots taken from the culture suspension. Smears were fixed twice with methanol (for 10 min), stained with an 8% Giemsa solution for 30 min and examined under oil immersion at 1,000x magnification. The first infected cells were detected 28 days after culture initiation.

Maintenance of cultures was carried out with weekly medium changes. Briefly, IDE8 cells were maintained at 32°C in L-15B medium [18], supplemented with 5% heat-inactivated foetal bovine serum, 10% tryptose phosphate broth, 0.1% bovine lipoprotein concentrate (MP Biomedicals, Santa Ana, CA, USA), 100 IU/ml penicillin and 100 μg/ml streptomycin. Infected IDE8 cultures were propagated in a modified L-15B medium as outlined above, further supplemented with 0.1% NaHCO3 and 10 mM HEPES. The pH of the medium was adjusted to 7.5 with 1 N NaOH. Infected cultures were propagated at 34°C in 25 cm² plastic culture flasks in


5 ml of the medium under normal atmospheric conditions.

Genomic DNA isolation

The DNeasy Blood & Tissue Kit (Qiagen Inc. Valencia, Calif.) was used for extraction of DNA from infected IDE8 cells. DNA extraction was performed according to the manufacturer’s instructions. The extracted material was eluted from the columns in 100 μl of sterile double distilled H2O (ddH2O), and the DNA concentration and purity were determined by measuring the optical density at both 260 and 280 nm with a DNA-RNA calculator (NanoDrop ND-1000, Peqlab, Erlangen, Germany). Ten-fold dilutions of the genomic DNA were prepared, divided into 10 μl aliquots and kept frozen until their use in a PCR reaction.

PCR

The primers used in this study are shown in Table 1. The oligonucleotide primers used for the amplification of the dsb and gltA genes were designed for this study using primer design software (PrimerSelect; DNAStar, USA) and information from the E. canis genome [GenBank: CP000107] [19]. Two independent PCR reactions were performed for each gene. For each PCR amplification, 2 μL of extracted DNA was used as the template in a 25 μL reaction mixture containing 20 pmol of each primer and 2X PCR Master Mix (Promega, USA). The reactions were conducted in an Eppendorf thermocycler (Eppendorf Mastercycler personal AG, 22331 Hamburg, Germany) according to the following parameters: 2 min at 94°C followed by 40 cycles of 30 sec at 94°C, 1 min at 45°C, and 1.5 min at 72°C, with a final extension step of 5 min. The PCR products were stained using an ethidium bromide free system, 6X Orange DNA Loading Dye (Thermo Scientific, Germany), and visualized in 0.8% agarose minigels.

Cloning and sequencing

The resulting PCR products were electrophoresed on a 0.8% agarose gel. The size of the amplified fragments was checked by comparison to a DNA molecular weight marker (100-bp DNA Ladder; Promega, USA). In each case, the single amplified product of the expected size was column purified using the QIAquick PCR Purification Kit (Qiagen, USA) and then ligated using the TOPO TA Cloning Kit (Invitrogen, USA) for subsequent transformation in TOP 10 Chemically Competent cells. For each gene, five individual clones containing the cloned fragment in the TOPO vector were purified using the QIAprep Spin Miniprep Kit (Qiagen, USA) and prepared for sequencing using an ABI 3130 sequencer (Applied Biosystems, USA) and the Big Dye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, USA) with the M13F and M13R vector primers. Both the sense and antisense strands of each PCR-amplified product were sequenced, and the sequences were then manually edited to resolve any ambiguities. A consensus sequence was obtained for each amplified PCR product by comparing both the sense and antisense sequences from the five clones.

DNA sequence analysis

To find the homology of our sequences we searched the Nucleotide collection (nr/nt) database using Megablast (optimize for highly similar sequences) from the BLAST server [20]. Nucleotide sequences were aligned using BLAST [20] and protein sequences were aligned using the multiple-alignment program CLUSTALW [21]. The homology between sequences was analyzed using MegAlign (DNAStar, USA). Nucleotide sequences were translated to amino acid (aa) sequences by the ExPASy translation tool of the Swiss Institute of Bioinformatics [22].

The phylogenetic analysis was performed as follows: sequences were aligned with MUSCLE (v3.7) configured

Table 1 Primers used in this study for the amplification of the 16S rRNA, groESL, gltA, dsb and gp36 genes from E. mineirensis (UFMG-EV) genomic DNA

Target | Primers* | Sequence | Expected size (Kb)
16S rRNA | 8F [9] | 5′-AGTTTGATCATGGCTCAG-3′ | 1.4
 | 1448R | 5′-CCATGGCGTGACGGGCAGTGTG-3′ |
groEL | HS1 [10] | 5′-TGGGCTGGTA(A/C)TGAAAT-3′ | 1.4
 | HS6 | 5′-CCICCIGGIACIA(C/T)ACCTTC-3′ |
gltA | gltAF1 | 5′-CTTCTGATAAGATTTGAAGTGTTTG-3′ | 1.5
 | gltAR1 | 5′-CTTTACAGTACCTATGCATATCAATCC-3′ |
dsb | dsbF2 | 5′-CTTAGTAATACTAGTGGCAAGTTTTCCAC-3′ | 0.683
 | dsbR2 | 5′-GTTGATATATCAGCTGCACCACCG-3′ |
gp36 | EC36-F1 [13] | 5′-GTATGTTTCTTTTATATCATGGC-3′ | 1.0
 | EC36-R1 | 5′-GGTTATATTTCAGTTATCAGAAG-3′ |

*Primers F are forward and R reverse.
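As a rough illustration of how the primers in Table 1 can be inspected, the following Python sketch computes the length, GC content and a Wallace-rule melting temperature estimate for the two dsb primers. The helper names are hypothetical and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a coarse approximation introduced here for illustration; it is not part of the study, in which the primers were designed with PrimerSelect.

```python
# Illustrative helpers (hypothetical names); not the authors' primer-design workflow.

# dsb primer sequences as listed in Table 1 (5' to 3').
PRIMERS = {
    "dsbF2": "CTTAGTAATACTAGTGGCAAGTTTTCCAC",
    "dsbR2": "GTTGATATATCAGCTGCACCACCG",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_count(seq: str) -> int:
    """Count G and C bases in an unambiguous DNA sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def gc_percent(seq: str) -> float:
    """GC content as a percentage of the sequence length."""
    return 100.0 * gc_count(seq) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule estimate: 2 degrees per A/T plus 4 degrees per G/C.

    Assumes the sequence contains only A, C, G and T (no degenerate bases).
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"{gc_percent(seq):.1f}% GC", f"Tm~{wallace_tm(seq)} C")
```

Degenerate primers such as HS1 and HS6 would need each alternative base (and inosine) handled explicitly before such estimates make sense, which this sketch deliberately leaves out.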

for highest accuracy [23]. After alignment, ambiguous regions (i.e., containing gaps and/or poorly aligned) were removed with Gblocks (v0.91b) [24]. The phylogenetic tree was reconstructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) [25,26]. Reliability of internal branches was assessed using the bootstrapping method (100 bootstrap replicates). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) [27]. The nomenclature used in the trees is according to Dumler et al. [19]. The same analysis of similarity and phylogenetic relationships was performed for the genes 16S rRNA, groEL, gltA and dsb, with the exception that the dsb tree is unrooted and the rest are rooted.

Analysis of the glycoprotein gp36 gene and putative aa sequence

The gp36 ortholog was tested for the presence of signal peptide sequences with the computational algorithm SignalP trained on gram-negative bacteria [28]. The gp36 protein sequence was evaluated for potential mucin-type O-linked glycosylation on serines and threonines with the computational algorithm NetOGlyc v3.1 [29] and for N-linked glycosylation with the NetNGlyc 1.0 Server [30]. The Tandem Repeats Finder database [31] was used to analyze the tandem repeats. The prediction of continuous B cell epitopes was done using the B cells Epitopes Prediction Tool [32], and the 3D structure of the glycoprotein and the predicted epitopes was obtained using the algorithm contained in the ElliPro epitope modeling tool and sequences available in the ElliPro server [33]. As previously reported [14], for the convenience of sequence comparison the gp36 gene orthologs were divided into three regions: a 5’ end pre-repeat region, a tandem repeat region, and a 3’ end post-repeat region.

Sequences used in this study

The sequences obtained from Ehrlichia mineirensis (UFMG-EV) have been deposited in GenBank, and their accession numbers are: 16S rRNA [GenBank: JX629805], groESL [GenBank: JX629806], dsb [GenBank: JX629808], gltA [GenBank: JX629807] and gp36 [GenBank: JX629809]. The 16S rRNA, groEL, gltA, dsb and gp36 sequences used for the phylogenetic trees or molecular analysis in general were obtained from GenBank and their accession numbers are shown in the Tables and Figures where they are mentioned.

Results

Sequence analysis of 16S rRNA

In order to obtain relevant information from 16S rRNA at the species level, the primers 8F and 1448R were used to isolate a fragment of ~1.4 Kb. An amplicon of approximately 1.4 Kb, corresponding to the expected size of the targeted 16S rRNA gene fragment, was obtained (data not shown). A consensus sequence of 1.384 Kb was obtained from 2 independent PCRs and five clones were sequenced. In total, our sequence had 10 nucleotide changes when compared with E. canis [GenBank: GU810149], with two insertions and three deletions (data not shown). The percent identities with all the members of the Ehrlichia genus are shown in the upper triangle of Table 2. Figure 1A shows the tree built using the maximum likelihood method; it shows that E. mineirensis (UFMG-EV) falls in a clade separated from all the previously reported sequences. The tree built with the neighbour joining method using the Kimura 2-parameter substitution model showed identical results (data not shown).

The 16S rRNA gene has a highly variable region located at the 5’ end of the gene [8]. This fragment is useful in identifying Ehrlichia spp [9]. Figure 2 shows three nucleotide changes in E. mineirensis (UFMG-EV) in comparison with E. canis and seven nucleotide changes when compared with Ehrlichia sp. Tibet, which was isolated from R. microplus [8].

Sequence analysis of dsb

The amplicon obtained from the PCR set up with the primers dsbF2 and dsbR2 gave a band with the expected size of 0.7 Kb. A fragment of 0.683 Kb of the dsb gene was obtained and sequenced. dsb gene sequences for available Ehrlichia spp. were aligned using ClustalW. The alignment shows that the dsb gene is conserved (76.4% - 94.7%) within the genus (Table 2, lower triangle). The aa sequence shows homology from 72.0% to 95.0% with E. ruminantium [GenBank: AF308669, clone 18hw] and E. canis [GenBank: AF403710], respectively. When compared with the complete dsb from E. canis [AF403710], 10 aa changes are observed (data not shown). The changes are concentrated at the carboxyl-terminus of the protein. Different dsb isolates of E. canis share 100% identity among them (Table 3). The phylogenetic tree shows that E. mineirensis (UFMG-EV) dsb is separated from its homologs in other species of the Ehrlichia genus (Figure 3).

Sequence analysis of groESL operon

The amplification with primers HS1-HS6 produced a PCR product of the expected size, 1.4 Kb. The nucleotide sequences of the PCR products amplified from E. mineirensis (UFMG-EV) contained a reading frame corresponding to the 26 aa carboxyl-terminus of groES, 416 aa of the amino-terminal end of groEL, and the spacer between them. The length of the nucleotide sequence of the spacer region in the sequence reported here was 95


bases. Sequence homology analyses were done for each of the nucleotide sequences and the deduced aa sequences from the partial GroES and GroEL reading frames. Nucleotide and aa sequence homologies with other members of the Ehrlichia genus are presented in Table 4. A phylogenetic tree based on multiple sequence alignment of the 1.249 Kb corresponding to groEL is presented in Figure 1B.

Table 2 Identities comparison of 16S rRNA and dsb genes between E. mineirensis (UFMG-EV) and other members of the genus Ehrlichia
Upper triangle: percent of nucleotide similarity of 16S rRNA* (accessions in the column headers). Lower triangle: percent of nucleotide similarity of dsb* (accessions in the row labels).

 | E. mineirensis (UFMG-EV) | E. canis [GU810149] | E. chaffeensis [AF147752] | E. ewingii [U96436] | E. muris [AB013008] | E. ruminantium [AF069758]
Ehrlichia mineirensis (UFMG-EV) | *** | 98.3 | 96.9 | 96.4 | 94.5 | 95.0
Ehrlichia canis [AF403710] | 94.7 | *** | 98.4 | 97.9 | 97.1 | 97.2
Ehrlichia chaffeensis [AF403711] | 82.3 | 83.5 | *** | 98.1 | 97.6 | 96.9
Ehrlichia ewingii [AY428950] | 78.6 | 76.9 | 78.0 | *** | 97.2 | 97.1
Ehrlichia muris [AY236484] | 81.1 | 81.1 | 84.5 | 77.2 | *** | 96.4
Ehrlichia ruminantium [AF308669] | 76.9 | 74.6 | 77.1 | 76.6 | 76.4 | ***

*The values are % of nucleotide sequence similarity for 1.3 Kb (16S rRNA) and determined from pairwise alignment using DNASTAR software (MegAlign; DNASTAR, Inc., Madison, WI). Accession numbers are from GenBank.

Sequence analysis of gltA

Primers gltAF1 and gltAR1 were designed in this study using information from the E. canis genome [GenBank: CP000107] and the E. chaffeensis gltA gene sequence [GenBank: AF304142]. The full length of the gltA gene of E. mineirensis (UFMG-EV) was isolated. A single band of ~1.5 Kb was obtained from the PCR reaction (data not shown). The full length gene of 1.251 Kb was obtained

Figure 1 Phylogenetic trees based on the 16S rRNA (A) and groEL (B) gene sequences from members of the family Anaplasmataceae. The trees show that E. mineirensis (UFMG-EV) falls in a clade separated from all the previously reported sequences. Bootstrap values are shown as % in the internal branches. Only bootstrap values equal to or higher than 50% are shown. The R. prowazekii 16S rRNA sequence was used to root the 16S rRNA tree and the E. coli groEL gene was used to root the groEL tree. The GenBank accession numbers of the sequences used to build the 16S rRNA tree are: E. muris, AB013008; E. chaffeensis, AF147752; E. ruminantium, AF069758; E. ewingii, U96436; A. marginale, M60313; A. phagocytophilum, M73224; A. platys, M82801; N. helminthoeca, U12457; N. sennetsu, M73225; N. risticii, AF036649; E. canis, GU810149; R. prowazekii, NR044656. The GenBank accession numbers of the sequences used to build the groEL tree are: E. muris, AF210459; E. chaffeensis, L10917; E. ruminantium, U13638; E. ewingii, AF195273; A. marginale, AF165812; A. phagocytophilum, U96729; A. platys, AY008300; N. sennetsu, U88092; N. risticii, U96732; E. canis, U96731; E. coli, X07850.
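The Results mention a neighbour joining tree built with the Kimura 2-parameter substitution model. As a hedged illustration of the pairwise distance underlying that model, the sketch below computes d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the toy sequences are assumptions made for this example; this is not the software used in the study and it does not build a tree.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped.
    Raises ValueError when the sequences differ in length, share no comparable
    columns, or are too divergent for the logarithms to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not informative here
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable columns")

    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


if __name__ == "__main__":
    # Toy alignment (hypothetical): one transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such distances is what a neighbour joining implementation would take as input.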


after sequencing and consensus analysis. The putative citrate synthase protein predicted using the E. mineirensis (UFMG-EV) gltA gene was 416 aa. Table 5 shows the nucleotide and aa similarities with other members of the Ehrlichia genus. The gltA gene has been proposed as an alternative tool for the phylogenetic analysis of the genus Ehrlichia [7]. Using the maximum likelihood method we built a phylogenetic tree showing that E. mineirensis (UFMG-EV) falls in a clade apart from any previously reported gltA genes in the family Anaplasmataceae (Figure 4).

Figure 2 A highly variable region of sequence located at the 5’ end of the 16S rRNA gene revealed by multiple alignments of 16S rRNA gene sequences of the Ehrlichia genus. Underlined are the nucleotide differences found between E. canis and E. mineirensis (UFMG-EV). The GenBank accession numbers of the sequences shown in the alignment are: E. muris, AB013008; E. chaffeensis, AF147752; E. ruminantium, AF069758; E. ewingii, U96436 and E. canis, GU810149.

Sequence analysis of the gp36 gene and the putative encoded protein sequence

The gp36-based PCR products derived from the isolate reported here had a molecular size of 1000 base pairs (bp) (data not shown). Subsequent cloning of the PCR amplicons followed by sequencing showed that our gene was 0.948 Kb, encoding a predicted protein with 315 aa and a molecular mass of 31.51 kDa (28.89 kDa without the predicted 23-aa signal peptide). We found that the gp36 protein isolated in our study is a putative glycoprotein. The aa sequence of gp36 in our study has five potential sites of O-glycosylation and two of N-glycosylation. The O-carbohydrates were predicted to be linked to three serines (S) of the tandem repeat region at positions 155, 164 and 173 and two threonines (T) present in the post-repeat region at positions 286 and 289. We also explored the possibility of finding N-glycosylation on putative glycosylated asparagines (N). Two sequons of N-glycosylation (N-Xaa-T/S) were found in the pre-repeat region: NRS (at position 81) and NFS (at position 106).

Differences found in the Region I (the 5′ end pre-repeat region)

Alignment of the gp36 ortholog obtained in this study revealed that our sequence was 422 nucleotides in length, encoding 141 aa (Table 6). The nucleotide and predicted aa sequences exhibited relatively low identities, ranging from 54.9% to 91.2% and from 38.0% to 82.0%, respectively, in comparison with related genes previously published for the gp36 orthologs in E. canis, E. chaffeensis and E. ruminantium [14] (Table 6).

Region II (the tandem repeat region)

Region II in E. mineirensis (UFMG-EV) contains 16 tandem repeats of 27 bp, each encoding nine aa. The single

Table 3 Unique aa changes in the carboxyl terminal of Ehrlichia mineirensis (UFMG-EV) dsb that differ from E. canis dsb available in GenBank

Isolates | Identity %1 | aa position1: 160 | 162 | 168 | 184 | 185 | 204
Ehrlichia canis [AF403710] | 100 | V | Q | H | H | Y | T
Ehrlichia canis Uberlandia [GU586135] | 100 | . | . | . | . | . | .
Ehrlichia canis Sao Paulo [DQ460715] | 100 | . | . | . | . | . | .
Ehrlichia canis Jaboticabal [DQ460716] | 100 | . | . | . | . | . | .
Ehrlichia mineirensis (UFMG-EV) | 94.0 | A | K | Y | N | H | A

1- Positions and % of identities are based on the sequence of E. canis [GenBank: AF403710]. The dots below the aa letters mean conserved positions. Accession numbers are from GenBank.
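Region II of the gp36 ortholog is described as 16 tandem repeats of 27 bp, each encoding nine aa, and the abstract gives the repeat unit as VPAASGDAQ. A minimal Python sketch of how back-to-back copies of a known repeat unit could be counted in a protein string is given below. The function name and the flanking residues in the toy protein are hypothetical, and this simple scan is not the Tandem Repeats Finder algorithm used in the study.

```python
from typing import Tuple

REPEAT_UNIT = "VPAASGDAQ"  # 9-aa unit reported for the E. mineirensis gp36 ortholog
N_REPEATS = 16             # repeat count reported for Region II
CODON_LEN = 3              # nucleotides per amino acid


def longest_tandem_run(protein: str, unit: str) -> Tuple[int, int]:
    """Return (start_index, copies) for the longest run of consecutive copies of unit.

    start_index is 0-based; (-1, 0) is returned when the unit does not occur.
    The scan assumes the unit does not overlap itself, which holds for VPAASGDAQ.
    """
    best_start, best_copies = -1, 0
    i = 0
    while i <= len(protein) - len(unit):
        if protein.startswith(unit, i):
            copies = 0
            j = i
            while protein.startswith(unit, j):
                copies += 1
                j += len(unit)
            if copies > best_copies:
                best_start, best_copies = i, copies
            i = j  # continue scanning after this run
        else:
            i += 1
    return best_start, best_copies


if __name__ == "__main__":
    # Toy protein: hypothetical flanks around 16 copies of the reported unit.
    toy = "MKT" + REPEAT_UNIT * N_REPEATS + "EEL"
    print(longest_tandem_run(toy, REPEAT_UNIT))
    # Length of the repeat region implied by the reported numbers, in bp.
    print(N_REPEATS * len(REPEAT_UNIT) * CODON_LEN)
```

The second value printed is 16 × 9 × 3 = 432 bp, which agrees with 16 repeats of 27 bp.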


Figure 3 Phylogenetic unrooted tree based on the dsb gene sequences from members of the family Anaplasmataceae. The tree shows that E. mineirensis (UFMG-EV) falls in a clade separated from all the previously reported sequences, including the previously reported E. canis dsb sequences. Bootstrap values are shown as % in the internal branches. Only bootstrap values equal to or higher than 50% are shown. The GenBank accession numbers of the dsb sequences used to build the tree are: E. canis, AF403710; E. canis Uberlandia, GU586135; E. canis Jaboticabal, DQ460716; E. canis Sao Paulo, DQ460715; E. muris, AY236484; E. chaffeensis, AF403711; E. ruminantium, AF308669, clone 18hw; E. ewingii, AY428950.

tandem repeat had the sequence VPAASGDAQ and was completely different from the sequences reported for the glycoprotein orthologs E. canis gp36, E. chaffeensis gp47 and the E. ruminantium mucin-like protein (Table 7). The tandem repeat of E. mineirensis (UFMG-EV) is a serine-enriched area of the total protein sequence but does not contain threonine. Its glycoprotein gene shows a high C + G percent in the whole gene (42.0%) and in the tandem repeat region (52.1%).

Region III (the 3′ end post-repeat region)

The comparison of region III among the orthologs shows that it is a quite variable region, presenting differences in length and in nucleotide and aa sequence. It has been extensively reviewed in [14] and [15]. Our sequence was 94 bp in length, which differs from any previously reported (data not shown). The percent identities of nucleotide and aa sequences in this region when compared with E. mineirensis (UFMG-EV) go from 12.2% (E. chaffeensis St Vincent, DQ146157) to 75% (E. canis TWN1, EF551366) and from 10% (E. chaffeensis St Vincent) to 32% (E. canis TWN1), respectively. The E. ruminantium Highway mucin-like protein has 37.3% (bp) and 21% (aa) homology with E. mineirensis (UFMG-EV).

B cell epitopes analysis

The presence of B cell epitopes in the putative gp36 protein was predicted. One continuous B cell epitope was predicted in a highly hydrophobic tandem repeat region of our protein (197–212). Considering that gp36 (E. canis) and gp47 (E. chaffeensis) were the closest orthologs, we attempted to find B cell epitopes in the tandem repeats of these species using the same algorithm employed for E. mineirensis (UFMG-EV). We found continuous B cell epitopes in the tandem repeat of E. canis gp36 [GenBank: EF560599] and E. chaffeensis gp47 [strain Arkansas, DQ085430 and strain St. Vincent, DQ146157]. The continuous epitopes found in these last three sequences were localized between aa positions 139–158, 195–225 and 203–218, respectively. The corresponding primary structures of the epitopes are shown in Figure 5A-E. We then compared the predicted 3D structures of the epitopes found in the gp36 orthologs in E. mineirensis (UFMG-EV), E. canis and the two different strains of E. chaffeensis. We found that all epitopes were exposed on the surface of the predicted 3D structure of each protein. The superposition analysis of the 3D structures of the epitopes showed that they were structurally dissimilar, with a root mean square deviation (rmsd) of 5-6 Å between the epitope of E. mineirensis (UFMG-EV) and the other three (Figure 5A-E). A linear correlation between the rmsd and % (dis)similarities among structures and sequences, respectively, is a valid interpretation for the evolution of homologous proteins [34]. Correlation for the epitopes of E. mineirensis (UFMG-EV) when compared with the other three orthologs gives an R² = 0.77.
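For readers unfamiliar with the rmsd measure quoted above, the following Python sketch shows the basic calculation for two equally sized sets of 3D coordinates. It is a minimal illustration under stated assumptions: the atoms are already paired one-to-one and the structures are already superposed, the function name and the toy coordinates are hypothetical, and the source does not describe the superposition procedure that was actually used.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation between two paired coordinate sets.

    Assumes coords_a[i] corresponds to coords_b[i] and that both sets are
    already in the same frame of reference (no fitting is performed here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms (hypothetical): the second set is the first
    # one shifted by (3, 4, 0), so every atom pair is 5 angstroms apart.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
    print(rmsd(reference, shifted))
```

In practice the two structures would first be optimally superposed, which lowers the rmsd relative to an unfitted comparison.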

Table 4 Identities comparison of groEL gene and putative aa sequence between Ehrlichia mineirensis (UFMG-EV) and other members of Ehrlichia genus

Upper right: percent of nucleotide (nt) similarity*. Lower left: percent of amino acid (aa) similarity*.

                                  E. mineirensis (UFMG-EV)  E. canis    E. chaffeensis  E. ewingii  E. muris    E. ruminantium
Ehrlichia mineirensis (UFMG-EV)   ***                       97.2 (nt)   92.3 (nt)       91.0 (nt)   92.0 (nt)   87.3 (nt)
Ehrlichia canis [U96731]          99.0 (aa)                 ***         92.5 (nt)       90.9 (nt)   92.4 (nt)   87.6 (nt)
Ehrlichia chaffeensis [L10917]    97.0 (aa)                 97.0 (aa)   ***             91.7 (nt)   94.3 (nt)   87.8 (nt)
Ehrlichia ewingii [AF195273]      95.0 (aa)                 95.0 (aa)   96.0 (aa)       ***         91.5 (nt)   88.0 (nt)
Ehrlichia muris [AF210459]        97.0 (aa)                 97.0 (aa)   99.0 (aa)       97.0 (aa)   ***         87.3 (nt)
Ehrlichia ruminantium [U13638]    92.0 (aa)                 92.0 (aa)   93.0 (aa)       92.0 (aa)   93.0 (aa)   ***

*The values shown are % of nucleotide and aa sequence similarity of 1.249 kb determined from pairwise alignment using DNASTAR software (MegAlign; DNASTAR, Inc., Madison, WI) and 416 aa of the amino terminal determined from ClustalW. Accession numbers are from GenBank.

-90- Chapter V.I

Table 5 Identities comparison of gltA gene and putative aa sequence between E. mineirensis (UFMG-EV) and other members of Ehrlichia genus

Upper right: percent of nucleotide (nt) similarity*. Lower left: percent of aa similarity*.

                                  E. mineirensis (UFMG-EV)  E. canis    E. chaffeensis  E. ewingii  E. muris    E. ruminantium
Ehrlichia mineirensis (UFMG-EV)   ***                       94.3 (nt)   84.6 (nt)       80.9 (nt)   84.8 (nt)   77.6 (nt)
Ehrlichia canis [AF304143]        94.0 (aa)                 ***         85.0 (nt)       82.2 (nt)   85.4 (nt)   79.0 (nt)
Ehrlichia chaffeensis [AF304142]  82.0 (aa)                 84.0 (aa)   ***             82.0 (nt)   87.0 (nt)   78.9 (nt)
Ehrlichia ewingii [DQ365879]      79.0 (aa)                 80.0 (aa)   77.0 (aa)       ***         82.5 (nt)   79.4 (nt)
Ehrlichia muris [AF304144]        82.0 (aa)                 84.0 (aa)   85.0 (aa)       78.0 (aa)   ***         79.6 (nt)
Ehrlichia ruminantium [AF304146]  74.0 (aa)                 77.0 (aa)   75.0 (aa)       75.0 (aa)   77.0 (aa)   ***

*The values shown are % of nucleotide and aa sequence similarity of the full length determined from pairwise alignment using DNASTAR software (MegAlign; DNASTAR, Inc., Madison, WI) and the putative encoded aa determined from ClustalW. Accession numbers are from GenBank.
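The similarity values in Tables 4 and 5 come from pairwise alignments, and the Discussion applies a 0.5% divergence cutoff to 16S rRNA comparisons. As a minimal sketch of that arithmetic, the following assumes two sequences that have already been aligned to equal length, with `-` marking gaps. The function names and the choice to skip gapped columns are illustrative assumptions, not the scoring used by MegAlign or ClustalW, which may treat gaps differently.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def exceeds_divergence_threshold(seq_a: str, seq_b: str,
                                 threshold_percent: float = 0.5) -> bool:
    """True when divergence (100 - identity) is above the given cutoff."""
    divergence = 100.0 - percent_identity(seq_a, seq_b)
    return divergence > threshold_percent


# Toy aligned fragments (hypothetical, not real 16S rRNA data).
print(percent_identity("ACGT", "ACGA"))
print(exceeds_divergence_threshold("ACGT", "ACGT"))
```

On real data the aligned strings would come from an alignment program; this sketch only covers the counting step.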

Discussion

Polyphasic taxonomy has been advocated to ensure well-balanced determinations of taxonomic relationships [7]. Different genes have been proposed to classify ehrlichial agents; however, the most widely used are 16S rRNA [8,9], the groESL operon [10], the groEL gene [11], gltA [7], dsb [12], gp36, and gp19 [13].

Sequence comparison of the 16S rRNA gene is recognized as one of the most powerful and precise methods for determining the phylogenetic relationships of bacteria [8,11,35]. Our results were consistent with previous phylogenetic analyses of Ehrlichia spp. using the 16S rRNA gene sequences [9,36]. In this study, our analysis of a relevant fragment of 16S rRNA sequences revealed that the novel agent found in Brazilian R. (B.) microplus ticks was closely related to E. canis [GenBank: GU810149], but was also closely related to E. chaffeensis [GenBank: AF147752], with 98.3% and 96.9% homology, respectively. It is worth noting that the hypervariable region of 16S rRNA is well conserved in members of the same species (data not shown) and differs among members of the Ehrlichia genus [8,9]. However, our hypervariable region of 16S rRNA was different when compared with other members of the Ehrlichia genus.

Since the 16S rRNA gene is known to exhibit a high level of structural conservation with a low evolutionary rate, levels of sequence divergence greater than 0.5% in comparisons with nearly complete 16S rRNA gene sequences of members of the genus Ehrlichia have been considered sufficient to classify organisms as different species [8,35]. The level of divergence of the 16S rRNA sequence between this novel Brazilian ehrlichial agent and the closest member of the Anaplasmataceae, E. canis, was 1.7% in pairwise comparisons of 1384 base sequences (data not shown), and this level of difference should be sufficient to classify the novel ehrlichial agent as a new species of the genus Ehrlichia. Furthermore, the 16S rRNA phylogenetic tree constructed with a maximum likelihood method shows that E. mineirensis (UFMG-EV) falls in a different clade separated from any previously reported Ehrlichia spp.

The genes groEL [11] and gltA [7] have been proposed as an alternative to 16S rRNA for the phylogenetic analysis of the Anaplasmataceae family as they are less

Figure 4 Phylogenetic tree based on the citrate synthase (gltA) gene sequences from members of the family Anaplasmataceae. The tree shows that E. mineirensis (UFMG-EV) falls in a clade separated from all the previously reported sequences. Bootstrap values are shown as % on the internal branches. Only bootstrap values equal to or higher than 50% are shown. The N. risticii gltA sequence was used to root the tree. The GenBank accession numbers of the gltA sequences used to build the tree are as follows: E. canis, AF304143; E. muris, AF304144; E. chaffeensis, AF304142; E. ruminantium, AF304146; E. ewingii, DQ365879; A. marginale, AF304140; A. phagocytophilum, AF304138; A. platys, AY077620.


Table 6 Length and percent of nucleotide and aa homology of the 5′ end pre-repeat region between the orthologs of gp36 in Ehrlichia mineirensis (UFMG-EV) and related genes

Source                                    Strain                  Nucleotide length1  Nucleotide homology2  aa length3  aa homology4
Ehrlichia mineirensis (UFMG-EV)                                   422                 -                     141         -
Ehrlichia canis gp36                      TWN1 [EF551366]         425                 91.2                  142         82
                                          Louisiana [DQ146151]    428                 88.2                  143         78
                                          Sao Paulo [DQ146154]    428                 88.4                  143         78
                                          Cameroon [DQ146155]     428                 88.6                  143         79
Ehrlichia chaffeensis gp47                Arkansas [DQ085430]     471                 61.8                  157         52
                                          Sapulpa [DQ085431]      461                 62.1                  154         53
                                          Jax [DQ146156]          461                 60.7                  154         51
                                          St Vincent [DQ146157]   461                 62.1                  154         53
Ehrlichia ruminantium mucin-like protein  Highway [AF308673]      410                 54.9                  137         38

1 - The lengths were determined using the Tandem Repeats Finder database [30].
2 - Percent of nucleotide homology was calculated with MegAlign, DNAStar, USA, in comparison with E. mineirensis (UFMG-EV).
3 - The length was determined using ClustalW [20] in comparison with Ehrlichia mineirensis (UFMG-EV).
4 - Percent of aa homology was calculated with ClustalW [20], in comparison with E. mineirensis (UFMG-EV).
Accession numbers are from GenBank.

conserved than 16S rRNA among the family members [7], and the dsb gene has been previously used to classify members of the Ehrlichia genus [12]. It is important to note that the spacer of the groESL operon was 95 bp in E. mineirensis (UFMG-EV), which differs from that reported for E. canis, E. chaffeensis and E. ruminantium, with 93, 100 and 96 bp, respectively [10]. The gp36 orthologs are divergent genes in E. canis, E. chaffeensis and E. ruminantium due to their high evolutionary pressure [14,15]. This gene has been used to differentiate new isolates of E. canis where 16S rRNA was not well suited to discriminate between E. canis isolates [13].

In our study the levels of similarity among ehrlichial gltA and dsb were lower than those of 16S rRNA and groEL gene sequences in the genus Ehrlichia. E. canis was the closest Ehrlichia species to E. mineirensis (UFMG-EV) in all the studied genes. Similar phylogenetic relationships are observed between other members of the Ehrlichia genus – i.e., E. chaffeensis/E. muris, N. risticii/N. sennetsu and A. marginale/A. platys.

The architecture of the gltA, groEL and dsb based phylogenetic trees was similar to that of the tree derived from the 16S rRNA gene sequences. However, the trees constructed from gltA and dsb show more divergence

Table 7 Summary of Ehrlichia tandem repeats present in gp36 glycoprotein orthologs

Source                                    Strain                    Length (bp)1  No.1    Homology % (bp)1  Consensus tandem repeat sequence (aa)2
Ehrlichia mineirensis (UFMG-EV)                                     27            16.0    100               VPAASGDAQ
Ehrlichia canis gp36                      TWN1 [EF551366]           27            13.2    100               TEDSVSAPA
                                          Louisiana [DQ146151]      27            5.2     99                ......
                                          Sao Paulo [DQ146154]      27            18.2    100               ......
                                          Cameroon [DQ146155]       27            16.2    100               ......
                                          IS [EF636663]             27            11.2    99                TEDPVSATA
Ehrlichia chaffeensis gp47                Arkansas [DQ085430]       57            7.0     99                ASVSEGDAVVNAVSQETPA
                                          Sapulpa [DQ085431]        99            4.5     99                EGNASEPVVSQEAAPVSESGDAANPVSSSENAS
                                          Jax [DQ146156]            99            4.5     98                ......
                                          St Vincent [DQ146157]     99            3.4     98                ......
Ehrlichia ruminantium mucin-like protein  Highway [AF308673]        27            21.7    99                VTSSPEGSV
                                          Welgevonden [CR767821]    27            56.0    95                ......
                                          Gardel [CR925677]         66            16.9    99                SSEVTESNQGSSASVVGDAGVQ

1 – The length (bp), number of nucleotide repeats and % of homology were determined using the Tandem Repeats Finder database [21].
2 – The dots below the tandems mean conserved aa sequence.
Accession numbers are from GenBank.
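Two quantities recur around Table 7: the C + G percentage of a gene region and the number of copies of a tandem repeat unit. The sketch below shows one simple way to compute both, assuming plain uppercase or lowercase sequence strings. It counts only exact, back-to-back copies of a unit, so it cannot reproduce the fractional copy numbers in Table 7, which come from Tandem Repeats Finder. The function names and the toy protein are hypothetical; only the VPAASGDAQ unit is taken from the table.

```python
def gc_percent(seq: str) -> float:
    """C + G content of a nucleotide sequence as a percentage of A/C/G/T bases."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def max_tandem_copies(seq: str, unit: str) -> int:
    """Longest run of exact, consecutive copies of `unit` found in `seq`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    best = 0
    for start in range(len(seq)):
        copies = 0
        pos = start
        # Walk forward one unit at a time while the unit keeps repeating.
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


# Hypothetical toy protein: three copies of the repeat unit with short flanks.
toy_protein = "MK" + "VPAASGDAQ" * 3 + "LL"
print(max_tandem_copies(toy_protein, "VPAASGDAQ"))
print(gc_percent("GGCCAATT"))
```

A dedicated repeat finder tolerates mismatches and partial copies; the exact-match loop here is only meant to make the idea of a repeat unit and copy number concrete.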


Figure 5 A-E Epitope identification. The modeled 3D structures for E. mineirensis (UFMG-EV) (A), E. canis (B; GenBank: EF560599), and E. chaffeensis (C and D; GenBank: DQ085430, DQ146157, respectively) depict the position of the predicted epitope (→). Protein structures are colored from blue (N-terminus) to red (C-terminus) according to the residue position. An epitope Cα superimposition (E) of E. mineirensis (UFMG-EV) (cyan), E. canis (brown), E. chaffeensis (GenBank: DQ085430; green) and E. chaffeensis (GenBank: DQ146157; yellow) depicts the differences in their overall structures, with E. mineirensis (UFMG-EV) having a 5-6 Å difference compared with the other epitopes.

than that from the 16S rRNA and groEL genes. The difference between E. canis and E. mineirensis (UFMG-EV) was well established in all four trees based on nucleotide sequences. E. mineirensis (UFMG-EV) was well defined, with higher bootstrap values in the gltA (100) and dsb (100) based trees than in the 16S rRNA (97) and groEL (93) based trees.

Based on aa homology and genomic synteny analyses, it has been determined that the mucin-like protein of Ehrlichia ruminantium, gp36 of E. canis and gp47 of E. chaffeensis are orthologs [14]. Identity of 87.2% has been found in the pre-repeat region among geographically distant E. canis isolates [13]. The single tandem repeat was highly conserved among isolates (TEDSVSAPA), with variations in the number of repeats [13-15] and few conservative changes in amino acid sequences [15]. The tandem repeat genetic unit varies among the different orthologs in length (from 27 to 99 bp), in number of repeats (from 3.4 to 56) and in the homology of the nucleotide and the aa sequence encoded in the repeat (Table 7). Our sequence contains a tandem repeat that shares an extremely low homology with the gp36 orthologs reported until now, ranging from 22% (E. ruminantium and E. canis) to 33% (E. chaffeensis). Doyle et al. [14] describe gp36 and gp47 as glycoproteins sharing predicted O-glycosylation sites in the serines and threonines of the tandem repeat. It is noteworthy that the tandem repeat of our sequence does not contain threonine; nevertheless, we predicted three sites of O-glycosylation in the serines of the tandem repeat and two in threonines of the post-repeat region. Two N-glycosylation sites were found in our aa sequence. The analysis for N-glycosylation was also done for the E. ruminantium, E. canis and E. chaffeensis ortholog sequences (data not shown), and potential sites of N-glycosylation were found for these sequences as well. Glycosylation plays a crucial role in the immunogenicity of these glycoproteins [14,15]. Deglycosylation of the gp36 tandem repeat drastically reduces its immunogenicity [14]. Both gp36 and gp47 are described as the major immunoreactive proteins of E. canis and E. chaffeensis, and the tandem repeats contain the major antibody epitope [14,15]. It was found that the tandem repeat of gp36 from E. mineirensis (UFMG-EV) contains the major B cell epitope previously reported for the glycoprotein orthologs. The prediction of the 3D structure of the B cell epitopes present in the tandem repeat shows a high structural divergence among the closest gp36 orthologs in E. mineirensis (UFMG-EV), E. canis and E. chaffeensis. These structural differences may explain the results obtained by Doyle et al. [14] in which neither gp36 nor gp47 reacted with heterologous antisera.
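The structural divergence discussed above is expressed as a root mean square deviation (rmsd) between superimposed epitope Cα atoms, and the Results relate rmsd to sequence dissimilarity through a linear correlation. The following is a minimal sketch of both calculations under stated assumptions: the coordinate lists are already paired atom by atom and already superimposed (the superposition step itself is not implemented here), and `rmsd` and `r_squared` are illustrative names rather than the tools used in the study. The coordinates and data points are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points, paired by index.

    Assumes the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def r_squared(xs, ys):
    """Coefficient of determination of a simple least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical example: two atoms, each displaced by 3 Å along z.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 0, 3)]))
```

In practice the superposition would be done first, for example with a least-squares fitting routine in a structural biology package, and only then would the paired coordinates be passed to a function like `rmsd`.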


The C + G content of the gp36 gene of E. mineirensis (UFMG-EV) is higher than that of the orthologs previously reported (data not shown). The C + G content of specific genes has been used in systematics as support for the classification of organisms [7], and it is known that recombination significantly increases the silent C + G content of a genome in a selectively neutral manner [37].

Although it is well known that Babesia bovis, B. bigemina and Anaplasma marginale are the most common etiological agents transmitted by R. (B.) microplus ticks [38], the detection of any species of Ehrlichia in R. (B.) microplus ticks has been infrequently reported. The first two reports were in China, in the Guangxi Autonomous Region in 1999 [39] and Tibet in 2002 [8]; a third was in Thailand in 2003 [36] and the latest one in Xiamen, China in 2011 [40]. Except for the isolate from Guangxi, E. canis [39], the rest share, based on 16S rRNA, 99.9% homology [36,40] and differ from the ehrlichial species previously reported and classified as Ehrlichia spp. strain Tibet [8]. In the present study, as determined by pairwise alignment, the E. mineirensis (UFMG-EV) isolated from R. (B.) microplus shares 97% similarity with the 16S rRNA sequences of the referred species (data not shown). This is the second report of a new Ehrlichia sp. isolated from R. (B.) microplus, but the first to be reported in the American continent. The identification of E. mineirensis (UFMG-EV) in R. (B.) microplus ticks suggests a potential for infection and transmission of this agent to cattle in the area where infected ticks are present.

Conclusions

Based on the molecular and phylogenetic analysis of the genes 16S rRNA, groEL, dsb and gltA, we concluded that the new microorganism isolated from the hemolymph of R. (B.) microplus is a new species of Ehrlichia with new predicted antigenic properties in the gp36 glycoprotein ortholog. Complementary analysis of the C + G content in the gp36 orthologs, the length of the groESL spacer and the hypervariable region of 16S rRNA supports the conclusion that E. mineirensis (UFMG-EV) is a separate phylogenetic entity. Further studies should address the question of whether R. (B.) microplus is a competent vector for this and other Ehrlichia species and whether this new organism is an emerging pathogen for cattle or an endosymbiont of R. (B.) microplus.

Competing interests

The authors declare that they have no competing interest.

Authors' contributions

AC performed the isolation of the genes and the interpretation of the molecular and in silico immunological data, and drafted the manuscript. EZ performed the in vitro cultivation and maintenance of the microorganism at LMU. MFBR isolated the organism from ticks and established it in vitro. JAGS performed the in vitro cultivation and maintenance of the microorganism at UFMG. JF contributed to the design of the molecular and phylogenetic analyses. LG contributed to the overall design and supervision of the study. JJV performed the 3D structure prediction and contributed to the epitope analysis. LP developed the conception and design of the study and contributed to drafting the manuscript. All authors critically revised the manuscript and have given final approval of the version to be published.

Acknowledgments

The authors thank Dr Ulrike G. Munderloh (University of Minnesota, USA) for permission to use the IDE8 cell line. A. Cabezas Cruz is a Marie Curie Early Stage Researcher (ESR) supported by the POSTICK ITN (Post-graduate training network for capacity building to control ticks and tick-borne diseases) within the FP7-PEOPLE-ITN programme (EU Grant No. 238511).

Author details

1 University of South Bohemia, Faculty of Science, České Budějovice, Czech Republic.
2 Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Munich, Germany.
3 Departamento de Parasitologia, ICB-UFMG, Belo Horizonte, Brazil.
4 Instituto de Investigación de Recursos Cinegéticos, IREC, Ciudad Real, Spain.
5 Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Oklahoma, USA.
6 Departamento de Medicina Veterinaria Preventiva, INCT-Pecuária, Escola de Veterinária-UFMG, Belo Horizonte, Minas Gerais, Brazil.

Received: 17 October 2012 Accepted: 3 December 2012 Published: 11 December 2012

References

1. Doyle CK, Labruna MB, Breitschwerdt EB, Tang YW, Corstvet RE, Hegarty BC, Bloch KC, Li P, Walker DH, McBride JW: Detection of medically important Ehrlichia by quantitative multicolor TaqMan real-time polymerase chain reaction of the dsb gene. J Mol Diagn 2005, 7(4):504–510.
2. Rikihisa Y: Anaplasma phagocytophilum and Ehrlichia chaffeensis: subversive manipulators of host cells. Nat Rev Microbiol 2010, 8:328–339.
3. Telford SR III, Goethert HK, Cunningham JA: Prevalence of Ehrlichia muris in Wisconsin deer ticks collected during the mid 1990s. The Open Microbiol J 2011, 5:18–20.
4. Allsopp MTEP, Louw M, Meyer EC: Ehrlichia ruminantium: an emerging human pathogen? Ann NY Acad Sci 2005, 1063:358–360.
5. Da Costa VRF, Biondo AW, Sá Guimarães AM, Dos Santos AP, Dos Santos RP, Dutra LH, De Paiva DPPV, De Morais HA, Messick JB, Labruna MB, Vidotto O: Ehrlichiosis in Brazil. Rev Bras Parasitol Vet Jaboticabal 2011, 20(1):1–12.
6. Rar V, Golovljova I: Anaplasma, Ehrlichia, and “Candidatus Neoehrlichia” bacteria: pathogenicity, biodiversity, and molecular genetic characteristics, a review. Infect Gen and Evol 2011, 11(8):1842–1861.
7. Inokuma H, Brouqui P, Drancourt M, Raoult D: Citrate synthase gene sequence: a new tool for phylogenetic analysis and identification of Ehrlichia. J Clin Microbiol 2001, 39(9):3031–3039.
8. Wen B, Jian R, Zhang Y, Chen R: Simultaneous detection of Anaplasma marginale and a new Ehrlichia species closely related to Ehrlichia chaffeensis by sequence analyses of 16S ribosomal DNA in Boophilus microplus ticks from Tibet. J Clin Microbiol 2002, 40(9):3286–3290.
9. Warner CK, Dawson JE: Genus- and species-level identification of Ehrlichia species by PCR and sequencing. In PCR protocols for emerging infectious diseases. Edited by Persing DH. Washington DC: ASM Press; 1996:100–105.
10. Sumner JW, Nicholson WL, Massung RF: PCR amplification and comparison of nucleotide sequences from the groESL heat shock operon of Ehrlichia species. J Clin Microbiol 1997, 35(8):2087–2092.
11. Yu XJ, Zhang XF, McBride JW, Zhang Y, Walker DH: Phylogenetic relationships of Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species determined by GroEL aa sequences. Int J Syst Evol Microbiol 2001, 51(3):1143–1146.
12. Sacchi ABV, Duarte JMB, André MR, Machado RZ: Prevalence and molecular characterization of Anaplasmataceae agents in free-ranging Brazilian marsh deer (Blastocerus dichotomus). Comp Immun Microbiol and Inf Dis 2012, 35(4):325–334.
13. Hsieh YC, Lee CC, Tsang CL, Chung YT: Detection and characterization of four novel genotypes of Ehrlichia canis from dogs. Vet Microbiol 2010, 146(1–2):70–75.


14. Doyle CK, Nethery KA, Popov VL, McBride JW: Differentially expressed and secreted major immunoreactive protein orthologs of Ehrlichia canis and E. chaffeensis elicit early antibody responses to epitopes on glycosylated tandem repeats. Infect Immun 2006, 74(1):711–720.
15. Zhang X, Luo T, Keysary A, Baneth G, Miyashiro S, Strenger C, Waner T, McBride JW: Genetic and antigenic diversities of major immunoreactive proteins in globally distributed Ehrlichia canis strains. Clin and Vacc Immun 2008, 15(7):1080–1088.
16. Zweygarth E, Schöl H, Lis K, Cabezas Cruz A, Thiel C, Silaghi C, Ribeiro MFB, Passos LMF: In vitro culture of a new genotype of Ehrlichia sp. from Brazil. Orvieto: Joint Conference on Emerging and Re-emerging Epidemics Affecting Global Health 2012; 2012. Proceedings.
17. Munderloh UG, Liu Y, Wang M, Chen C, Kurtti TJ: Establishment, maintenance and description of cell lines from the tick Ixodes scapularis. J Parasitol 1994, 80(4):533–543.
18. Munderloh UG, Kurtti TJ: Formulation of medium for tick cell culture. Exp Appl Acarol 1989, 7(3):219–229.
19. Dumler JS, et al: Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int J Syst Evol Microbiol 2001, 51(6):2145–2165.
20. Zhang Z, Schwartz S, Wagner L, Miller W: A greedy algorithm for aligning DNA sequences. J Comput Biol 2000, 7(1–2):203–214.
21. Thompson JD, Higgins DG, Gibson TJ: CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673–4680.
22. ExPASy Translation Tool: ExPASy Translation Tool. http://expasy.hcuge.ch/tools/dna.html.
23. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792–1797.
24. Castresana J: Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000, 17(4):540–552.
25. Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52(5):696–704.
26. Anisimova M, Gascuel O: Approximate likelihood ratio test for branches: a fast, accurate and powerful alternative. Syst Biol 2006, 55(4):539–552.
27. Chevenet F, Brun C, Banuls AL, Jacq B, Christen R: TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinf 2006, 10(7):439.
28. Nielsen H, Engelbrecht J, Brunak S, von Heijne G: Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng 1997, 10(1):1–6.
29. Julenius K, Molgaard A, Gupta R, Brunak S: Prediction, conservation analysis, and structural characterization of mammalian mucin type O-glycosylation sites. Glycobiol 2005, 15(2):153–164.
30. NetNGlyc 1.0 Server. http://www.cbs.dtu.dk/services/NetNGlyc/.
31. Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 1999, 27(2):573–580.
32. Ponomarenko JV, Bourne PE: Antibody-protein interactions: benchmark datasets and prediction tools evaluation. BMC Struct Biol 2007, 7:64–83.
33. Ponomarenko J, Bui HH, Li W, Fusseder N, Bourne PE, Sette A, Peters B: ElliPro: a new structure-based tool for the prediction of antibody epitopes. BMC Bioinf 2008, 9:514–521.
34. Wood TC, Pearson WR: Evolution of protein sequences and structures. J Mol Biol 1999, 291:977–995.
35. Woese CR: Bacterial evolution. Microbiol Rev 1987, 51(2):221–271.
36. Parola P, Cornet JP, Sanogo YO, Miller RS, Thien HV, Gonzalez JP, Raoult D, Telford SR III, Wongsrichanalai C: Detection of Ehrlichia spp., Anaplasma spp., Rickettsia spp., and other eubacteria in ticks from the Thai-Myanmar border and Vietnam. J Clin Microbiol 2003, 41(4):1600–1608.
37. Birdsell JA: Integrating genomics, bioinformatics, and classical genetics to study the effects of recombination on genome evolution. Mol Biol Evol 2002, 19(7):1181–1197.
38. Alonso M, Arellano-Sota C, Cereser VH, Cordoves CO, Guglielmone AA, Kessler R, Mangold AJ, Nari A, Patarroyo JH, Solari MA, Vega CA, Vizcaino O, Camus E: Epidemiology of bovine anaplasmosis and babesiosis in Latin America and the Caribbean. Vet Sci Tech Off Int Epiz 1992, 11(3):713–733.
39. Pan H, et al: Amplification of 16S rRNA gene fragments of Ehrlichia canis from ticks in southern China. Chin J Zoon 1999, 15(3):3–6.
40. Jiang BG, Cao WC, Niu JJ, Wang JX, Li HM, Sun Y, Yang H, Richardus JH, Habbema JD: Detection and identification of Ehrlichia species in Rhipicephalus (Boophilus) microplus ticks in cattle from Xiamen China. Vector Borne Zoon Dis 2011, 11(3):325.

doi:10.1186/1756-3305-5-291
Cite this article as: Cruz et al.: New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats. Parasites & Vectors 2012 5:291.

-95- Chapter V.II

Ultrastructure of Ehrlichia mineirensis, a new member of the

Ehrlichia genus.

Cabezas-Cruz A., Vancová M., Zweygarth E., Ribeiro M F B., Grubhoffer L., Passos L M F. 2013. Ultrastructure of Ehrlichia mineirensis, a new member of the Ehrlichia genus. Veterinary Microbiology. doi: 10.1016


Contents lists available at ScienceDirect

Veterinary Microbiology

journal homepage: www.elsevier.com/locate/vetmic

Ultrastructure of Ehrlichia mineirensis, a new member of the

Ehrlichia genus

Alejandro Cabezas-Cruz (a,1), Marie Vancová (a,1), Erich Zweygarth (b), Mucio Flavio Barbosa Ribeiro (c), Libor Grubhoffer (a), Lygia Maria Friche Passos (d,c,*)

a University of South Bohemia, Faculty of Science and Biology Centre of the ASCR, Institute of Parasitology, České Budějovice, Czech Republic

b Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Munich, Germany

c Departamento de Parasitologia, ICB-UFMG, Belo Horizonte, Brazil

d Departamento de Medicina Veterinaria Preventiva, INCT-Pecuária, Escola de Veterinária-UFMG, Belo Horizonte, Minas Gerais, Brazil

ARTICLE INFO

Article history:
Received 6 May 2013
Received in revised form 31 July 2013
Accepted 3 August 2013

Keywords:
Ehrlichia mineirensis
Electron microscopy
In vitro culture

ABSTRACT

Recently, we reported the in vitro isolation and the molecular characterization of a new species of Ehrlichia (Ehrlichia mineirensis) from haemolymph of Brazilian Rhipicephalus (Boophilus) microplus ticks. This organism shows an ortholog of Ehrlichia canis major immunogenic protein gp36 with a new structure of tandem repeats. In the present study, we used electron microscopy (high pressure freezing and freeze substitution preparative techniques) to characterize morphologically this new agent growing in IDE8 tick cells. The results showed that E. mineirensis shares ultrastructural features with other members of the genus Ehrlichia (Ehrlichia muris, E. canis and Ehrlichia chaffeensis); typical parasitophorous vacuoles (morulae) contain electron-dense and reticulated Ehrlichiae embedded inside a fibrillar matrix. We observed the characteristic Gram-negative-type cell wall composed of both cytoplasmic and rippled outer membrane. We found organisms undergoing binary fission and rarely altered cells with unusual invagination of the cytoplasmic membrane.

© 2013 Published by Elsevier B.V.

1. Introduction

Ehrlichiae are obligate intracytoplasmic Gram-negative, tick-borne bacteria belonging to the Anaplasmataceae family. Ehrlichioses are considered as emerging diseases in both humans and animals. At present, the genus Ehrlichia consists of five recognized species, including Ehrlichia canis, Ehrlichia chaffeensis, Ehrlichia ewingii, Ehrlichia muris, and Ehrlichia ruminantium (Dumler et al., 2001). The ultrastructure of members of this genus has been previously characterized (Popov et al., 1998; Bell-Sakyi et al., 2000). Recently, a new species of the genus Ehrlichia was isolated from haemolymph of Rhipicephalus (Boophilus) microplus engorged females that were collected from naturally infested cattle on a farm in the state of Minas Gerais, Brazil and was called Ehrlichia mineirensis (Cabezas-Cruz et al., 2012). The in vitro culture (Zweygarth et al., 2013) and the molecular characterization of this agent were previously reported (Cabezas-Cruz et al., 2012). The present study aimed to further characterize morphologically this new organism through electron microscopy using high pressure freezing and freeze substitution preparative techniques.

* Corresponding author at: Departamento de Medicina Veterinária Preventiva, Escola de Veterinária-UFMG Av. Antonio Carlos, 6627, CP 567, Belo Horizonte, 30123-970, Minas Gerais, Brazil. Tel.: +55 (31) 34092075; fax: +55 (31) 34092080.

E-mail addresses: [email protected] (A. Cabezas-Cruz), [email protected] (M. Vancová), [email protected] (E. Zweygarth), [email protected] (M.F.B. Ribeiro), [email protected] (L. Grubhoffer), [email protected], [email protected] (L.M.F. Passos).

1 Joint first authorship.

0378-1135/$ – see front matter © 2013 Published by Elsevier B.V. http://dx.doi.org/10.1016/j.vetmic.2013.08.001


2. Materials and methods

2.1. In vitro cultivation

Maintenance of E. mineirensis cultures was carried out with medium changes twice a week as previously described (Zweygarth et al., 2013). Briefly, IDE8 cells were maintained at 32 °C in L-15B medium (Munderloh and Kurtti, 1989), supplemented with 5% heat-inactivated foetal bovine serum, 10% tryptose phosphate broth, 0.1% bovine lipoprotein concentrate (MP Biomedicals, Santa Ana, CA, USA), 100 IU/ml penicillin and 100 µg/ml streptomycin. Infected IDE8 cultures were propagated in a modified L-15B medium as outlined above, further supplemented with 0.1% NaHCO3 and 10 mM HEPES. The pH of the medium was adjusted to 7.5 with 1 N NaOH. Infected cultures were propagated at 34 °C in 25 cm² plastic culture flasks in 5 ml of the medium under normal atmospheric conditions. Smears were fixed with methanol, stained with an 8% Giemsa solution for 30 min and examined under oil immersion at 1000× magnification. When the cultures reached approximately 80% of infection, they were collected for electron microscopy sample processing.

2.2. Sample processing and electron microscopy

Cells were centrifuged and the pellet was immersed in a 20% BSA solution. Cells were immediately frozen using a high pressure freezer (EMPACT2, Leica Microsystems, Vienna, Austria). Freeze substitution was performed in a medium containing 2% OsO4 in anhydrous acetone for 96 h at −90 °C. Then the temperature was raised to 4 °C (4 °C/1 h). The samples were rinsed three times in acetone, infiltrated and embedded in Polybed 812 at room temperature. Ultrathin sections were contrasted in ethanolic uranyl acetate and lead citrate solutions, and observed in a JEOL 1010 TEM (JEOL Ltd.) at an accelerating voltage of 80 kV. Images were captured using a Mega View III camera (SIS GmbH).

2.3. PCR

A PCR was carried out (Cabezas-Cruz et al., 2012) on DNA extracted with a commercial kit (Qiagen Inc. Valencia, CA) from uninfected and Ehrlichia-infected IDE8 cultures. The primer set consisting of 8F (5′ AGTTTGATCATGGCTCAG) and 1448R (5′ CCATGGCGTGACGGGCAGTGTG) was used to amplify 1400 base pairs of the E. mineirensis 16S rRNA gene.

3. Results and discussion

3.1. Ultrastructure of E. mineirensis

Infection of cultures was confirmed by direct examination of Giemsa-stained cytocentrifuge smears and PCR. Both microscopic examination and PCR results confirmed that E. mineirensis cells (Fig. 1) and DNA (data not shown) were present in the infected IDE8 culture and absent from the uninfected cells. Uninfected IDE8 cells contained vacuoles and inclusions similar to those previously described as phagolysosomes (Blouin and Kocan, 1998) (data not shown). The phagolysosomes and secondary lysosomes were present in infected IDE8 cells too (Fig. 2A, asterisks); however, infected cells also contained membrane-lined vacuoles containing up to 25 rickettsial organisms 0.4–1.5 µm in diameter (Fig. 2A). We were able to show the presence of both reticulated (Fig. 2A) and electron-dense bodies (Fig. 2B), which have been described previously for other members of the genus Ehrlichia (Popov et al., 1998; Bell-Sakyi et al., 2000). The organisms were round and oval shaped (Fig. 2A and B) and they had typical tri-layered cytoplasmic and outer membranes; in some, the outer membrane was rippled (Fig. 2B inset). It is noteworthy that we did not find high rickettsial polymorphism as was reported previously for E. ruminantium (Bell-Sakyi et al., 2000). We observed numerous reticulate cells of E. mineirensis undergoing binary fission (Fig. 2C). Parasitophorous vacuoles were surrounded by mitochondria (Fig. 2D) and cisterns of rough endoplasmic reticulum (Fig. 2E). In several cases, organelles were in tight contact with the vacuole membrane (Fig. 2D and E), similar to observations made by others (Dedonder et al., 2012). Moreover, we observed bundles of microtubules (Fig. 2D, inset, white arrows) surrounding the membrane of morulae, which may be important for the movement of the rickettsia through the cytoplasm. Some rickettsial colonies also contained tiny vesicles visible in the interrickettsial space (Fig. 2B and D, black arrows), as has been described for Anaplasma marginale in IDE8 cells (Blouin and Kocan, 1998). We also found cells with an unusual structure (Fig. 3) that has not been previously reported; such cells were observed only rarely. Further studies should clarify the relevance of cells with this kind of structure.

Fig. 1. Giemsa-stained cytocentrifuge smear of Ehrlichia mineirensis in IDE8 cells. IDE8 cells heavily infected with E. mineirensis. Rickettsiae contained within cytoplasmic vacuoles as well as rickettsiae released from infected cells (arrows) are shown. Magnification, 1000×.

The establishment of E. mineirensis in tick cell culture provides a source of material for the study of this pathogen (Zweygarth et al., 2013). The use of this culture system will increase our knowledge and understanding of E. mineirensis development in ticks. Here we showed the


Fig. 2. Electron micrograph of Ehrlichia mineirensis – infected IDE8 cells. (A) Cells containing phagolysosomes/secondary lysosomes (white asterisk) and

numerous vacuoles with bacteria. (B) Electron-dense bodies and small vesicles (black arrows) inside membrane-lined vacuoles. The inset shows in detail

the rippled membrane. (C) Reticulate cells undergoing binary fission. (D) The vacuole containing reticulate bodies that have ruffled outer membrane and

small vesicles (black arrows) is surrounded with mitochondria (Mi) and microtubules (detail in inset, white arrows). (E) A cisterna of endoplasmic reticulum

in tight contact with the membrane of the morulae.

successful ultrastructural characterization of this new agent in IDE8 cells using electron microscopy. Even though further studies are needed to clarify the pathogenic potential of this agent, our recent (Cabezas-Cruz et al., 2012) and present studies provide new insights into the biology of this new species of the genus Ehrlichia.

Author’s contributions

AC-C, MV, LMFP, LG, EZ conceived and designed

research; AC-C and MV performed research and analyzed

data; EZ supervised the in vitro culture; AC-C and MV

wrote the paper; EZ, LG, LMFP, MFBR made critical

revisions to the manuscript.

Funding

This research was supported by POSTICK ITN (Post-

graduate training network for capacity building to control

ticks and tick-borne diseases) within the FP7-PEOPLE – ITN

programme (EU Grant No. 238511), by TE 01020118 and by

the GACR Z60220518. The funder had no role in study

design, data collection and analysis, decision to publish, or

preparation of the manuscript.

Competing interests

The authors have declared that no competing interests exist.

Fig. 3. Ehrlichia mineirensis with an unusual morphological structure. The figure shows a cell with an invagination of the membrane, small vesicles are shown (arrow).


Acknowledgements

The authors thank Dr Ulrike G. Munderloh (University of Minnesota, USA) for permission to use the IDE8 cell line.

References

Bell-Sakyi, L., Paxton, E., Munderloh, U., Sumption, K., 2000. Growth of Cowdria ruminantium, the causative agent of heartwater, in a tick cell line. J. Clin. Microbiol. 38, 1238–1240.

Blouin, E., Kocan, K., 1998. Morphology and development of Anaplasma marginale (Rickettsiales: Anaplasmataceae) in cultured Ixodes scapularis (Acari: Ixodidae) cells. J. Med. Entomol. 33, 656–664.

Cabezas-Cruz, A., Zweygarth, E., Ribeiro, M., da Silveira, J., de la Fuente, J., Grubhoffer, L., Valdés, J., Passos, L., 2012. New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats. Parasit. Vectors 5, 291.

Dedonder, S., Cheng, C., Willard, L., Boyle, D., Ganta, R., 2012. Transmission electron microscopy reveals distinct macrophage- and tick cell-specific morphological stages of Ehrlichia chaffeensis. PLoS ONE 7, e36749.

Dumler, S., Barbet, A., Bekker, C., Dasch, G., Palmer, G., Ray, S., Rikihisa, Y., Rurangirwa, F., 2001. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165.

Munderloh, U., Kurtti, T., 1989. Formulation of medium for tick cell culture. Exp. Appl. Acarol. 7, 219–229.

Popov, V., Han, V., Chen, S., Dumler, J., Feng, H., Andreadis, T., Tesh, R., Walker, D., 1998. Ultrastructural differentiation of the genogroups in the genus Ehrlichia. J. Med. Microbiol. 47, 235–251.

Zweygarth, E., Schöl, H., Lis, K., Cabezas-Cruz, A., Thiel, C., Silaghi, C., Ribeiro, M., Passos, L., 2013. In vitro culture of a novel genotype of Ehrlichia sp. from Brazil. Transbound. Emerg. Dis. (in press).
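The primer pair given in Section 2.3 can be inspected with a few lines of Python. This is only a minimal sketch: the two sequences are copied from the text, while the helper names (`reverse_complement`, `gc_content`, `summarize`) are hypothetical and chosen for illustration. It reports primer length, GC fraction and reverse complement, and does not attempt to model annealing behaviour or the PCR protocol itself.

```python
# Primer sequences as printed in Section 2.3, written 5'->3'.
PRIMERS = {
    "8F": "AGTTTGATCATGGCTCAG",
    "1448R": "CCATGGCGTGACGGGCAGTGTG",
}

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Collect length, GC fraction and reverse complement per primer."""
    return {
        name: {
            "length": len(seq),
            "gc": round(gc_content(seq), 3),
            "reverse_complement": reverse_complement(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, info in summarize(PRIMERS).items():
        print(name, info)
```

Under these definitions 8F is 18 nt long with a GC fraction of about 0.44, and 1448R is 22 nt long with a GC fraction of about 0.68.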

Chapter V.III

In vitro culture of a novel genotype of Ehrlichia sp. from

Brazil.

Zweygarth E., Schöl H., Lis K., Cabezas-Cruz A., Thiel C., Silaghi C., Ribeiro M. F. B., Passos L. M. F. 2013. In vitro culture of a novel genotype of Ehrlichia sp. from Brazil. Transboundary and Emerging Diseases. 60 (Suppl. 2) 86–92.


Transboundary and Emerging Diseases

ORIGINAL ARTICLE

In vitro Culture of a Novel Genotype of Ehrlichia sp. from Brazil

E. Zweygarth1, H. Schöl1, K. Lis1, A. Cabezas Cruz2, C. Thiel1, C. Silaghi1, M. F. B. Ribeiro3 and L. M. F. Passos1,4

1 Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Munich, Germany
2 Faculty of Science and Biology Centre of the ASCR, Institute of Parasitology, University of South Bohemia, České Budějovice, Czech Republic
3 Departamento de Parasitologia, ICB-UFMG, Belo Horizonte, Brazil
4 Departamento de Medicina Veterinaria Preventiva, INCT-Pecuaria, Escola de Veterinaria-UFMG, Belo Horizonte, Brazil

Keywords: Ehrlichia; Rhipicephalus (Boophilus) microplus; in vitro culture; tick cell; DH82; endothelial cell; cattle; 16S rRNA; Brazil

Correspondence: E. Zweygarth. Comparative Tropical Medicine and Parasitology, Ludwig-Maximilians-Universität München, Leopoldstrasse 5, Munich D-80802, Germany. Tel.: +49 89 21806837; Fax: +49 89 21803623; E-mail: [email protected]

Received for publication November 15, 2012

doi:10.1111/tbed.12122

Summary

Ehrlichiae are obligate intracytoplasmic Gram-negative, tick-borne bacteria belonging to the Anaplasmataceae family. Ehrlichioses are considered emerging diseases in both humans and animals. Several members of the genus Ehrlichia have been isolated and propagated in vitro. This study describes the continuous propagation of a Brazilian Ehrlichia sp. isolate in IDE8 tick cells, canine DH82 cells and bovine aorta cells. Initially, the organisms were isolated from the haemolymph of a Rhipicephalus (Boophilus) microplus tick into IDE8 cells. Infected IDE8 cells were brought from Brazil to Germany, where the organisms were continuously propagated in IDE8, DH82 and bovine aorta cells. Bovine aorta cells were infected and propagated for 3 months, corresponding to six subcultures, whereas the other two infected cell lines were kept for more than 1 year. During the cultivation period, 36 and 14 subcultures were carried out in IDE8 and DH82 cell cultures, respectively. Reinfection of IDE8 cells with organisms grown in DH82 cells was achieved. Sequence analysis made with a fragment of the 16S rRNA gene showed that this Ehrlichia sp. is closely related to Ehrlichia canis. However, the maximum likelihood phylogenetic tree shows that it falls in a separate phylogenetic clade from E. canis.
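The subculture counts quoted in the Summary can be related to the culture durations listed in Table 1 (365, 365 and 132 days) with a short calculation. The sketch below assumes that the average subculture interval is simply the number of days in culture divided by the number of passages; the article does not state how its averages were derived, so this is an illustrative check rather than the authors' method, and the function name is hypothetical.

```python
# (days in culture, number of passages) per host cell line, as listed in Table 1.
CULTURES = {
    "IDE8": (365, 36),
    "DH82": (365, 14),
    "BA886": (132, 6),
}


def mean_subculture_interval(days_in_culture: int, passages: int) -> float:
    """Average number of days between passages (assumed: days / passages)."""
    if passages <= 0:
        raise ValueError("passages must be a positive integer")
    return days_in_culture / passages


# Mean interval per cell line, rounded to one decimal place.
INTERVALS = {
    line: round(mean_subculture_interval(days, passages), 1)
    for line, (days, passages) in CULTURES.items()
}

if __name__ == "__main__":
    for line, interval in INTERVALS.items():
        print(f"{line}: {interval} days per passage")
```

Under this assumption the values come out at 10.1, 26.1 and 22.0 days for IDE8, DH82 and BA886, matching the averages given in Table 1.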

Introduction

Ehrlichiae are obligate intracytoplasmic Gram-negative, tick-borne bacteria belonging to the Anaplasmataceae family. Ehrlichioses are considered emerging diseases in both humans and animals. At present, the genus Ehrlichia consists of five recognized species: Ehrlichia canis, Ehrlichia chaffeensis, Ehrlichia ewingii, Ehrlichia muris and Ehrlichia ruminantium (Dumler et al., 2001). Four of these species have been propagated in vitro, namely E. ruminantium, the causative agent for heartwater in ruminants (Bezuidenhout et al., 1985), E. canis, which causes tropical canine pancytopenia (Dawson et al., 1991), E. chaffeensis, which causes moderate to severe disease in humans, and E. muris (Munderloh et al., 2009), isolated from a wild mouse and not yet attributed to a human disease. Recently, an Ehrlichia-like agent (Munderloh et al., 2009) and a new pathogenic Ehrlichia species from the United States (Pritt et al., 2011) were isolated using in vitro culture techniques.

New Ehrlichia spp. have been isolated from Rhipicephalus (Boophilus) microplus ticks in Asia and characterized molecularly (Wen et al., 2002; Parola et al., 2003), but these have not yet been propagated in vitro. In Brazil, three Ehrlichia spp. have been reported: E. canis (Costa et al., 1973), E. ewingii (Oliveira et al., 2009) and E. chaffeensis (Machado et al., 2006), of which E. canis was the only species established in cell culture (Torres et al., 2002). Ehrlichia chaffeensis was found in wild herbivores, namely the marsh deer (Blastocerus dichotomus) (Machado et al., 2006) and the brown brocket deer (Mazama gouazoubira) (Silveira et al., 2012).

In this study, we describe the continuous propagation of a Brazilian Ehrlichia sp. isolate in IDE8 tick cells, canine

DH82 cells and bovine aorta endothelial cells. This organism was originally isolated from the haemolymph of R. (B.) microplus ticks into IDE8 cells. DNA samples obtained from the in vitro cultures were used for a partial molecular characterization of a fragment of the 16S rRNA gene.

Materials and Methods

Strain history

Eleven R. (B.) microplus engorged females were collected from a cattle paddock at a farm in Minas Gerais, Brazil, where they had dropped off from 4 to 6 months old calves kept under natural field conditions. To obtain haemolymph, ticks were treated basically as described by Ribeiro et al. (2009). Briefly, engorged females were washed, blotted dry and disinfected before being individually placed into polystyrene plates and incubated at 27°C and a relative humidity over 83%. After a 10-day incubation period, haemolymph was collected to provide material for infecting IDE8 cells. Each tick was held with sterile forceps, the cuticula was again sterilized, and a leg cut with a sterile scalpel blade. The haemolymph was collected using a capillary tube to gather the draining fluid. Haemolymph from three ticks was pooled in a tube containing 200 µl of culture medium, which constituted the inoculum to infect one 25 cm² culture flask containing an IDE8 cell monolayer. The cultures were monitored daily by examination of Giemsa-stained cytocentrifuge smears under oil immersion at 1000× magnification. The first infected cells were detected 28 days after culture initiation.

Four cell culture tubes with passage no. 7 of these cultures were brought from Brazil to the laboratory in Munich. Upon their arrival, the medium of the culture tubes was completely removed and centrifuged (290 g, 5 min). The cell pellets together with 3 ml of fresh medium were transferred back into their respective tubes. The organisms were tested by PCR (De la Fuente et al., 2005) and showed a positive reaction for Ehrlichia spp. Here, the organisms are referred to as Ehrlichia sp. (UFMG-EV).

IDE8 tick cell cultures

The tick cell line IDE8, derived from Ixodes scapularis embryos (Munderloh et al., 1994), was maintained at 32°C in L-15B medium (Munderloh and Kurtti, 1989) supplemented with 5% heat-inactivated foetal bovine serum (FBS), 10% tryptose phosphate broth (TPB), 0.1% bovine lipoprotein concentrate (MP, Santa Ana, CA, USA), 100 IU/ml penicillin and 100 µg/ml streptomycin. Infected IDE8 cultures were propagated in modified L-15B medium as outlined above but further supplemented with 0.1% NaHCO3 and 10 mM HEPES. The pH of the medium was adjusted to approximately 7.5. The modified L-15B medium is referred to as complete culture medium (CCM). Cultures were propagated at 34°C in 25 cm² plastic culture flasks in 5 ml of the CCM.

Mammalian cell cultures

Uninfected DH82 cells, originally derived from a dog suffering from malignant histiocytosis (Wellman et al., 1988), and the bovine aorta BA886 cell line were used (Yunker et al., 1988). Both cell lines were propagated in Dulbecco’s modified Eagle’s medium nutrient mixture Ham F-12 (DME/F-12) (D 0547; Sigma, St. Louis, MO, USA) containing 15 mM HEPES [N-(2-hydroxyethyl)piperazine-N′-(2-ethanesulfonic acid)] and 1.2 g/l sodium bicarbonate. The medium was further supplemented with 10% (v/v) heat-inactivated FBS, 2 mM L-glutamine, 100 IU/ml penicillin and 100 µg/ml streptomycin (Zweygarth et al., 1997).

Infection of mammalian cells

IDE8 cultures heavily infected with Ehrlichia sp. were harvested when more than 50% of the cells had detached from the substrate. The cell culture suspension was then centrifuged at 290 g for 2 min to remove the majority of the host cells. One millilitre of cell suspension was distributed into culture flasks containing DH82 or BA886 cells together with 4 ml CCM and incubated at 34°C. After 3 days, the medium was replaced with 5 ml of fresh medium.

Reinfection of IDE8 cells

Ehrlichia sp.-infected DH82 cultures were harvested when approximately 70% of the cells were infected. The cell culture suspension was centrifuged at 290 g for 2 min to remove the majority of the host cells. One millilitre of supernatant was distributed into a culture flask containing IDE8 cells and 4 ml CCM and incubated at 34°C. After 3 days, the medium was replaced with 5 ml of fresh medium.

Light microscopy

Microscopic examinations were carried out to demonstrate Ehrlichia spp. in the respective cells. Small samples from the cell layer were removed, and smears were prepared. Cytospin® (Thermo Shandon Ltd, Astmoor, UK) smears were made from cultures where some of the cells were in suspension. Smears were allowed to dry before being fixed with methanol and stained with eosin methylene blue.

Genomic DNA extraction, PCR, cloning and sequencing of 16S rRNA

The DNeasy Blood & Tissue kit was used for extraction of DNA from infected IDE8 cells. DNA extraction was

performed according to the manufacturer’s instructions. The universal primers 8F and 1448R were used to amplify 16S rRNA from Ehrlichia spp.: 5′-AGTTTGATCATGGCTCAG-3′ and 5′-CCATGGCGTGACGGGCAGTGTG-3′, respectively, following the protocol proposed by Warner and Dawson (1996). The single amplified product of the expected size was column purified using the QIAquick PCR Purification kit (Qiagen, Gaithersburg, CA 91355, USA) and then ligated into the TOPO TA Cloning kit (Invitrogen, Carlsbad, CA 92008, USA). After E. coli transformation, five clones were purified and sequenced using the M13F and M13R vector primers.

Sequences analysis and phylogenetic tree

The database nucleotide collection (nr/nt) applying Megablast from BLAST (Zhang et al., 2000) was used to find homologies to our sequence. The phylogenetic tree was constructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) (Anisimova and Gascuel, 2006). Reliability of the internal branch was assessed using the bootstrapping method (100 bootstrap replicates). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) (Chevenet et al., 2006). Both PhyML and TreeDyn are available online at www.phylogeny.fr (Dereeper et al., 2008). The 16S rRNA sequence from Rickettsia prowazekii was used to root the tree.

Sequences used in this study

The partial 16S rRNA sequence determined for this new Ehrlichia isolate from Brazil has been deposited in GenBank, under accession number JX629805. The sequences of 16S rRNA used to build the phylogenetic tree were obtained from GenBank with accession numbers as follows: E. canis-TWN: GU810149; E. canis-China: AF162860; E. canis-Japan: AF536827; E. canis-USA: M73221; E. canis-Venezuela: AF373612; E. canis-Brazil: EF195134; E. canis-Peru: DQ915970; E. canis-Turkey: AY621071; E. canis-Israel: U26740; E. canis-Spain: AY394465; E. canis-Greece: EF011110; E. canis-Italy: EU439944; E. muris: AB013008; E. chaffeensis: AF147752; E. ruminantium: AF069758; E. ewingii: U96436; A. marginale: M60313; A. phagocytophilum: M73224; A. platys: M82801; R. prowazekii: NR044656.

Results

Ehrlichia sp. in tick cell culture

Approximately 3 weeks after the arrival of the culture in Munich, one of the tubes showed several colonies in stained cytospin preparations. These cells were transferred into a cell culture flask (25 cm²) containing a layer of uninfected IDE8 cells. At the time of writing, the Ehrlichia sp. (UFMG-EV) isolate had been in culture for more than 1 year, with an average subculture interval of 12 days (range 7–27 days). After culture adaptation, the subculture ratio was 1 : 100 to 1 : 200. A mass of Ehrlichia sp. (UFMG-EV) organisms, released from IDE8 cells, in eosin methylene blue-stained cytospin smears is shown in Fig. 1.

Fig. 1. IDE8 tick cell line infected with Ehrlichia sp. (UFMG-EV) and stained with eosin methylene blue.

Infection of mammalian cells

Suspensions of organisms obtained from infected IDE8 cell cultures were co-cultivated with DH82 and BA886 cells. Endothelial cells were found to harbour small colonies 16 days after infection; DH82 cells were found to be positive 2 days thereafter. Ehrlichia sp. (UFMG-EV) were propagated in DH82 cells for more than 1 year, with an average subculture interval of 26.1 days (range 17–43 days). These cultures are now routinely subcultured at a ratio of 1 : 50. Bovine endothelial cells were propagated for 132 days, with an average subculture interval of 22 days (7–23 days), and a subculture ratio of 1 : 5. Intracellular colonies of Ehrlichia sp. (UFMG-EV) in eosin methylene blue-stained cytospin smears of DH82 and BA886 cells are shown in Figs 2 and 3, respectively. Data of all the infected cultures, including the infected tick cell cultures, are summarized in Table 1.

Phylogenetic analysis of 16S rRNA

An amplicon approximately 1.4 Kb corresponding to the expected size of the targeted 16S rRNA gene fragment was obtained (data not shown). Our sequence showed a 98% and 97% identity with the sequences of 16S rRNA of E. canis (GenBank: GU810149) and E. chaffeensis (Arkansas

strain; GenBank: AF147752), respectively. In total, our sequence had 10 nucleotide changes when compared with E. canis (GenBank: GU810149), with two insertions and three deletions (data not shown). The maximum likelihood phylogenetic tree (Fig. 4) shows that Ehrlichia sp. (UFMG-EV) falls in a clade separated from any previously reported member of the family Anaplasmataceae.

Fig. 2. Colonies of Ehrlichia sp. (UFMG-EV) in DH82 cells stained with eosin methylene blue.

Fig. 3. Colonies of Ehrlichia sp. (UFMG-EV) in BA886 cells stained with eosin methylene blue.

Fig. 4. Phylogenetic tree based on the 16S rRNA gene sequences from some members of the Ehrlichia/Anaplasma genus. The phylogenetic tree was reconstructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT). Reliability for internal branch was assessed using the bootstrapping method (100 bootstrap replicates). Rickettsia prowazekii 16S rRNA was used to root the tree.

Table 1. In vitro propagation data of Ehrlichia sp. (UFMG-EV) in IDE8, DH82 and BA886 cell lines

Host cells   Days in culture   Number of passages   Average subculture intervals [days] (range)
IDE8         365               36                   10.1 (6–14)
DH82         365               14                   26.1 (17–43)
BA886        132               6                    22 (7–23)

Discussion

Here, we report the identification of a novel Ehrlichia sp. isolated from the haemolymph of R. (B.) microplus in Brazil. The organisms gave rise to a continuous culture in I. scapularis IDE8 tick cells. Initially, it was assumed that the growing cultures were A. marginale; firstly, because the organisms were isolated from cattle ticks; secondly, because some of the cattle nearby could have been carriers of A. marginale; and thirdly, because it is hardly possible to distinguish between in vitro cultured Ehrlichia and Anaplasma organisms, macro- or microscopically. For further characterization, a PCR specific for A. marginale was conducted; however, it did not amplify the DNA isolated from the culture (data not shown). These results strongly suggested that another agent had been isolated. At the same time, infection experiments using culture-generated elementary bodies showed that they were able to infect DH82 cells. This cell type supports the growth of E. canis (Dawson et al., 1991), the Ehrlichia species which is closely related to Ehrlichia sp. (UFMG-EV). The DH82 infection experiment and an additional PCR test (De la Fuente et al., 2005) (data not shown) then clearly identified the new agent as an Ehrlichia species.

Ehrlichia sp. (UFMG-EV) elementary bodies infected bovine endothelial cells. Similar results were reported for

E. ruminantium, an agent which primarily targets endothelial cells of diseased ruminants (Cowdry, 1926). From our culture results, it seems unlikely that Ehrlichia sp. (UFMG-EV) prefers endothelial cells as targets, because the development in endothelial cells took quite some time, on average 22 days, whereas E. ruminantium can finish its developmental cycle in cultured endothelial cells within 3 days (Zweygarth, 2006). On the other hand, E. canis was established in a human microvascular endothelial cell line (Dawson et al., 1993), but failed to grow in the bovine endothelial cell line used in the present experiments (own unpublished results). Ehrlichia sp. (UFMG-EV) was isolated from the haemolymph of R. (B.) microplus, which is commonly known as the cattle tick. It is a one-host tick, and from its life cycle, all three feedings of any individual tick occur on the same individual host (Walker et al., 2003). Therefore, if R. (B.) microplus is the vector of the newly identified Ehrlichia sp., then its transmission has to be transovarial. In contrast, E. canis is transmitted very frequently by the three-host tick Rhipicephalus sanguineus (Donatien and Lestoquard, 1936). In transmission experiments, it was found that E. canis was not transmitted transovarially (Groves et al., 1975); similarly, E. ruminantium, which is transmitted by another three-host tick of the genus Amblyomma, is also not transmitted transovarially (Lounsbury, 1902). Ehrlichia spp. have never been found in cattle in Brazil. Nevertheless, it is assumed that the novel Ehrlichia sp. originates from cattle, although there is only circumstantial evidence as outlined above. The only Ehrlichia sp. diagnosed in Brazilian herbivores is E. chaffeensis, which was found in the marsh deer (Blastocerus dichotomus) (Machado et al., 2006) and the brown brocket deer (Mazama gouazoubira) (Silveira et al., 2012). In the Americas, only E. ruminantium has been described in ruminants in the Caribbean (Perreau et al., 1980) and a tick-transmitted Ehrlichia from Georgia, USA, that was closely related to E. ruminantium (Loftis et al., 2006). Very recently, a novel Ehrlichia genotype was detected in naturally infected cattle in Canada (Gajadhar et al., 2010). These organisms were isolated from host animals; on the other hand, several Ehrlichia spp. have been isolated from R. (B.) microplus ticks collected from infested cattle in Asia and characterized molecularly (Wen et al., 2002; Parola et al., 2003).

Phylogenetic analysis using the 16S rRNA gene is recognized as one of the most powerful and precise methods for determining phylogenetic relationships in the bacterial kingdom (Woese, 1987). Our results were consistent with previous phylogenetic analysis of Ehrlichia spp. using the 16S rRNA gene sequences (Parola et al., 2003). In this study, our analysis of a relevant fragment of 16S rRNA sequences revealed that the novel agent found in Brazilian R. (B.) microplus ticks was more closely related to E. canis; nevertheless, the 16S rRNA phylogenetic tree built with the maximum likelihood method shows that Ehrlichia sp. (UFMG-EV) falls in a different clade, separated from any previously reported Ehrlichia spp. As the 16S rRNA gene is known to exhibit a high level of structural conservation with a low evolutionary rate, levels of sequence divergence >0.5% in comparisons with nearly complete 16S rRNA gene sequences of members of the genus Ehrlichia have been considered sufficient to classify organisms as different species (Wen et al., 2002). The levels of divergence of the 16S rRNA sequence between this novel Brazilian ehrlichial agent and the closest member of the Anaplasmataceae, E. canis, were 1.7% in pairwise comparisons of 1384 base sequences (data not shown), and this level of difference should be sufficient to classify the novel ehrlichial agent as a new species of the genus Ehrlichia. Nevertheless, polyphasic taxonomy has been advocated to ensure well-balanced determinations of taxonomic relationships (Inokuma et al., 2001).

In conclusion, the in vitro isolation and propagation of the novel Ehrlichia sp. genotype offers a tremendous advantage as it provides an excellent sustainable source of biological material, whether dead or alive. The pathogenicity and zoonotic potential of these organisms are not yet known, and a future aim of our experiments is to evaluate the validity of one of Koch’s postulates, namely whether the organism can cause disease, or at least infection, when introduced into a healthy bovine, or whether it is simply an endosymbiont of R. (B.) microplus ticks. Additional investigations are still necessary to evaluate the molecular and phylogenetic characteristics of this novel genotype of Ehrlichia sp. from Brazil.

Acknowledgements

The authors thank Dr. Ulrike G. Munderloh (University of Minnesota, USA) for permission to use the IDE8 cell line and Mrs. Marzena Broniszewska for excellent technical support. K. Lis and A. Cabezas Cruz are Early Stage Researchers supported by the POSTICK ITN (Post-graduate training network for capacity building to control ticks and tick-borne diseases) within the FP7-PEOPLE – ITN programme (EU Grant No. 238511).

Conflicts of interest

The authors declare that they have no competing interests.

References

Anisimova, M., and O. Gascuel, 2006: Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst. Biol. 55, 539–552.

Bezuidenhout, J. D., C. L. Paterson, and B. J. Barnard, 1985: In vitro cultivation of Cowdria ruminantium. Onderstepoort J. Vet. Res. 52, 113–120.

Chevenet, F., C. Brun, A. L. Bañuls, B. Jacq, and R. Christen, 2006: TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 10, 7–439.

Costa, J. O., J. A. Jr Batista, M. Silva, and M. P. Guimaraes, 1973: Ehrlichia canis infection in a dog in Belo Horizonte-Brazil. Arq. Esc. Vet. UFMG Belo Horizonte 25, 199–200.

Cowdry, E. V., 1926: Studies on the etiology of heartwater: III. the multiplication of Rickettsia ruminantium within the endothelial cells of infected animals and their discharge into the circulation. J. Exp. Med. 44, 803–814.

Dawson, J. E., Y. Rikihisa, S. A. Ewing, and D. B. Fishbein, 1991: Serologic diagnosis of human ehrlichiosis using two Ehrlichia canis isolates. J. Infect. Dis. 163, 564–567.

Dawson, J. E., F. J. Candal, V. G. George, and E. W. Ades, 1993: Human endothelial cells as an alternative to DH82 cells for isolation of Ehrlichia chaffeensis, E. canis, and Rickettsia rickettsii. Pathobiology 61, 293–296.

De la Fuente, J., V. Naranjo, F. Ruiz-Fons, U. Höfle, I. G. Fernandez de Mera, D. Villanua, C. Almazan, A. Torina, S. Caracappa, K. M. Kocan, and C. Gortazar, 2005: Potential vertebrate reservoir hosts and invertebrate vectors of Anaplasma marginale and A. phagocytophilum in central Spain. Vector Borne Zoonotic Dis. 5, 390–401.

Dereeper, A., V. Guignon, G. Blanc, S. Audic, S. Buffet, F. Chevenet, J.-F. Dufayard, S. Guindon, V. Lefort, M. Lescot, J.-M. Claverie, O. Gascuel. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008 Jul 1; 36 (Web Server Issue): W465–9. Epub 2008 Apr 19.

Donatien, A., and F. Lestoquard, 1936: Existence de la premunition dans la rickettsiose naturelle ou experimentale du chien. Bull. Soc. Pathol. Exot. 29, 378–383.

Dumler, J. S., A. F. Barbet, C. P. J. Bekker, G. A. Dasch, G. H. Palmer, S. C. Ray, Y. Rikihisa, and F. R. Rurangirwa, 2001: Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165.

Gajadhar, A. A., V. Lobanov, W. B. Scandrett, J. Campbell, and B. Al-Adhami, 2010: A novel Ehrlichia genotype detected in naturally infected cattle in North America. Vet. Parasitol. 173, 324–329.

Groves, M. G., G. L. Dennis, H. L. Amyx, and D. L. Huxsoll, 1975: Transmission of Ehrlichia canis to dogs by ticks (Rhipicephalus sanguineus). Am. J. Vet. Res. 36, 937–940.

Inokuma, H., P. Brouqui, M. Drancourt, and D. Raoult, 2001: Citrate synthase gene sequence: a new tool for phylogenetic analysis and identification of Ehrlichia. J. Clin. Microbiol. 39, 3031–3039.

Loftis, A. D., W. K. Reeves, J. P. Spurlock, S. M. Mahan, D. R. Troughton, G. A. Dasch, and M. L. Levin, 2006: Infection of a goat with a tick-transmitted Ehrlichia from Georgia, U.S.A., that is closely related to Ehrlichia ruminantium. J. Vector Ecol. 31, 213–223.

Lounsbury, C. P., 1902: Heartwater in calves. Agric. J. Cape of Good Hope, 21, 165–169.

Machado, R. Z., J. M. Duarte, A. S. Dagnone, and M. P. Szabo, 2006: Detection of Ehrlichia chaffeensis in Brazilian marsh deer (Blastocerus dichotomus). Vet. Parasitol. 139, 262–266.

Munderloh, U. G., and T. J. Kurtti, 1989: Formulation of medium for tick cell culture. Exp. Appl. Acarol. 7, 219–229.

Munderloh, U. G., Y. Liu, M. Wang, C. Chen, and T. J. Kurtti, 1994: Establishment, maintenance and description of cell lines from the tick Ixodes scapularis. J. Parasitol. 80, 533–543.

Munderloh, U. G., D. J. Silverman, K. C. Macnamara, G. G. Ahlstrand, M. Chatterjee, and G. M. Winslow, 2009: Ixodes ovatus Ehrlichia exhibits unique ultrastructural characteristics in mammalian endothelial and tick-derived cells. Ann. N. Y. Acad. Sci. 1166, 112–119.

Oliveira, L. S., K. A. Oliveira, L. C. Mourão, A. M. Pescatore, M. R. Almeida, L. G. Conceição, M. A. Galvão, and C. Mafra, 2009: First report of Ehrlichia ewingii detected by molecular investigation in dogs from Brazil. Clin. Microbiol. Infect. 15 (Suppl 2), 55–56.

Parola, P., J. P. Cornet, Y. O. Sanogo, R. S. Miller, H. V. Thien, J. P. Gonzalez, D. Raoult, S. R. III Telford, and C. Wongsrichanalai, 2003: Detection of Ehrlichia spp., Anaplasma spp., Rickettsia spp., and other eubacteria in ticks from the Thai-Myanmar border and Vietnam. J. Clin. Microbiol. 41, 1600–1608.

Perreau, P., P. C. Morel, N. Barre, and P. Durand, 1980: Cowdriosis (Heartwater) by Cowdria ruminantium in ruminants of French Indies (Guadeloupe) and Mascarene Islands (La Reunion and Mauritius). Rev. Elev. Med. Vet. Pays Trop. 33, 21–22.

Pritt, B. S., L. M. Sloan, D. K. Johnson, U. G. Munderloh, S. M. Paskewitz, K. M. Mcelroy, J. D. Mcfadden, M. J. Binnicker, D. F. Neitzel, G. Liu, W. L. Nicholson, C. M. Nelson, J. J. Franson, S. A. Martin, S. A. Cunningham, C. R. Steward, K. Bogumill, M. E. Bjorgaard, J. P. Davis, J. H. Mcquiston, D. M. Warshauer, M. P. Wilhelm, R. Patel, V. A. Trivedi, and M. E. Eremeeva, 2011: Emergence of a new pathogenic Ehrlichia species, Wisconsin and Minnesota, 2009. N. Engl. J. Med. 365, 422–429.

Ribeiro, M. F., C. V. Bastos, M. M. Vasconcelos, and L. M. Passos, 2009: Babesia bigemina: in vitro multiplication of sporokinetes in Ixodes scapularis (IDE8) cells. Exp. Parasitol. 122, 192–195.

Silveira, J. A., E. M. Rabelo, and M. F. Ribeiro, 2012: Molecular detection of tick-borne pathogens of the family Anaplasmataceae in Brazilian brown brocket deer (Mazama gouazoubira, Fischer, 1814) and marsh deer (Blastocerus dichotomus, Illiger, 1815). Transbound. Emerg. Dis. 59, 353–360.


Torres, H. M., C. L. Massard, M. J. Figueiredo, T. Ferreira, and N. R. Almosny, 2002: Isolamento e propagação da Ehrlichia canis em células DH82 e obtenção de antígeno para a reação de imunofluorescência indireta. Rev. Bras. Ciência Vet. 9, 77–82.

Walker, A. R., A. Bouattour, J.-L. Camicas, A. Estrada-Peña, I. G. Horak, A. A. Latif, R. G. Pegram, and P. M. Preston, 2003: Ticks of Domestic Animals in Africa: A Guide to Identification of Species. Publ. Bioscience Reports, Edinburgh, Scotland, UK.

Warner, C. K., and J. E. Dawson, 1996: Genus- and species-level identification of Ehrlichia species by PCR and sequencing. In: Persing, D. H. (ed), PCR Protocols for Emerging Infectious Diseases, pp. 100–105. ASM Press, Washington, DC.

Wellman, M. L., S. Krakowka, R. M. Jacobs, and G. J. Kociba, 1988: A macrophage-monocyte cell line from a dog with malignant histiocytosis. In Vitro Cell. Dev. Biol. 24, 223–229.

Wen, B., R. Jian, Y. Zhang, and R. Chen, 2002: Simultaneous detection of Anaplasma marginale and a new Ehrlichia species closely related to Ehrlichia chaffeensis by sequence analyses of 16S ribosomal DNA in Boophilus microplus ticks from Tibet. J. Clin. Microbiol. 40, 3286–3290.

Woese, C. R., 1987: Bacterial evolution. Microbiol. Rev. 51, 221–271.

Yunker, C. E., B. Byrom, and S. Semu, 1988: Cultivation of Cowdria ruminantium in bovine vascular endothelial cells. Kenya Vet. 12, 12–16.

Zhang, Z., S. Schwartz, L. Wagner, and W. Miller, 2000: A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203–214.

Zweygarth, E., 2006: In vitro cultivation of Ehrlichia ruminantium and development of an attenuated culture-derived vaccine. PhD Thesis, Utrecht University, Utrecht, The Netherlands.

Zweygarth, E., S. W. Vogel, A. Josemans, and E. Horn, 1997: In vitro isolation and cultivation of Cowdria ruminantium under serum-free culture conditions. Res. Vet. Sci. 63, 161–164.

Chapter VI

General Discussion

Study of the genetic diversity of tick-borne pathogenic bacteria of the family Anaplasmataceae.

Cabezas-Cruz A., de la Fuente J. Study of the genetic diversity of tick-borne pathogenic bacteria of the family Anaplasmataceae.

Study of the genetic diversity of tick-borne pathogenic bacteria of the family Anaplasmataceae.

Alejandro Cabezas-Cruz (1,2), José de la Fuente (2,3)

(1) Center for Infection and Immunity of Lille (CIIL), INSERM U1019 – CNRS UMR 8204, Université Lille Nord de France, Institut Pasteur de Lille, Lille, France.

(2) SaBio. Instituto de Investigación en Recursos Cinegéticos IREC-CSIC-UCLM-JCCM, Ronda de Toledo s/n, 13005 Ciudad Real, Spain.

(3) Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, USA.

Abstract

Classifying bacteria at the species level is challenging, not only because the acquisition of new genes through lateral gene transfer can produce new genotypes with unpredictable phenotypes, but also because bacterial systematics lacks a theory-based conceptual framework for classification. Tick-borne rickettsiae are a diverse group of α-proteobacteria that cause disease in humans and animals worldwide. High genetic diversity among strains of the same rickettsial species is a new challenge that compounds the one mentioned above. Genetic differentiation of strains within the “same species” could give rise to ecologically adapted strains with entirely different patterns of pathogenicity, virulence and transmission. Here we discuss the species and strain classification methods currently used for A. marginale and E. canis, two examples of highly variable pathogens within the family Anaplasmataceae. The results of these studies have implications for pathogen diagnostics and control.

Keywords: A. marginale, E. canis, genetic diversity, control, diagnostics.

Bacteria classification

Bacterial classification is a matter of interest for epidemiologists, evolutionary biologists and taxonomists. A precise classification of pathogenic bacteria is a first step towards understanding pathogen epidemiology and transmission patterns and towards implementing control strategies. However, bacterial classification currently lacks a theoretical framework for species demarcation such as exists for eukaryotes (Cohan and Perry, 2007). This has given rise to an empirically designed concept of bacterial species that is extremely conservative (Rosselló-Mora and Amann, 2001), which may explain the strikingly low number (5000) of prokaryotic species that have been described so far (Rosselló-Mora and Amann, 2001; Merhej and Raoult, 2011). In contrast, the number of eukaryotic organisms described is significantly higher (e.g. 5-10 million arthropod species) (Ødegaard, 2000). The eukaryotic species definition has made it possible to group species according to their morphology, behavior, physiology and ecology (Cohan and Perry, 2007). Thus, species names in the eukaryotic kingdoms provide specific information about an organism's physiology, biochemistry and ecology (Cohan and Perry, 2007). In contrast, the definition of bacterial species currently relies almost exclusively on molecular data (Merhej and Raoult, 2011). This has a major impact on medical taxonomy. Recent studies show that most named bacteria are in fact assemblages of closely related but ecologically distinct microorganisms (López-López et al, 2005; Schloter et al, 2000; Smith et al, 2006).

Classical molecular taxonomy applied to Anaplasmataceae

One advantage of a sequence-based taxonomy is that it allows in silico comparison with every existing sequence in large data repositories such as NCBI. Sequence comparison of the 16S rRNA gene is recognized as one of the most powerful and precise methods for determining the phylogenetic relationships of bacteria (Woese, 1987; Yu et al, 2001; Wen et al, 2002; Woo et al, 2008). There are different criteria for the degree of 16S rRNA divergence between two strains that is needed to consider one of them a new organism. Taking DNA-DNA hybridization as reference, Stackebrandt and Goebel (1994) showed that 2.5% divergence in 16S rRNA between two strains was enough to classify them as different species. In 2002 the threshold of 16S rRNA sequence divergence for recognizing new bacterial species was lowered to 1.3% (Stackebrandt et al, 2002). These parameters were revisited in 2006, when 1% divergence between two 16S rRNA sequences was considered a sufficient criterion to classify two organisms as different species (Stackebrandt and Ebers, 2006). However, because of the high degree of sequence identity of this gene among rickettsial species, it is not suitable for differentiating some rickettsial spp. (Merhej and Raoult, 2011). Finally, a recent study found that phylogenetic analysis of the genus Acinetobacter based on 16S rRNA gene sequences provided unreliable and uninformative results (Chan et al, 2012).

The most up-to-date molecular taxonomy of the family Anaplasmataceae is based on 16S rRNA, groESL and surface proteins (Dumler et al, 2001). The family was divided essentially into the genera Anaplasma, Ehrlichia, Wolbachia and Neorickettsia (Dumler et al, 2001). The 16S rRNA sequence similarity among members of each of these genera is extremely high, with minimum values of 96.1% (Anaplasma), 97.7% (Ehrlichia), 95.6% (Wolbachia) and 94.9% (Neorickettsia). This high sequence similarity makes it difficult to differentiate new species within these genera using this gene. Nevertheless, new species within Ehrlichia have been characterized using 16S rRNA, and 0.5% divergence was considered a good criterion for the classification of new agents (Anderson et al, 1991; Wen et al, 1995, 1996 and 2002). 16S rRNA is highly conserved because nucleotide changes accumulate at very low rates (Woese, 1987). This limits the use of 16S rRNA to accurately differentiating only species that diverged long ago. What should be done with recently evolved species or with different strains within the same species?

Housekeeping genes for taxonomy

Polyphasic taxonomy has been advocated to ensure well-balanced determinations of taxonomic relationships (Inokuma et al, 2001). Different genes, combined with 16S rRNA, have been proposed to classify ehrlichial agents at the species level; the most widely used are the groESL operon (Sumner et al, 1997), the groEL gene (Yu et al, 2001), gltA (Inokuma et al, 2001) and dsb (Sacchi et al, 2012). Some of the genes to which we will refer later in more detail encode major immunogenic proteins specific to E. canis (gp36; Doyle et al, 2006; Hsieh et al, 2010) and A. marginale (MSP1a; Allred et al, 1990).

The citrate synthase gene (gltA) was proposed as a new tool for phylogenetic analysis and identification of members of Anaplasmataceae (Inokuma et al, 2001). Two properties make gltA a good genetic marker for characterizing ehrlichial agents at the species level: (i) the gltA nucleotide identity among Ehrlichia species ranges from 49.7% to 99.8%, whereas the range for 16S rRNA is narrower (83.5% to 99.9%); and (ii) the topology of the phylogenetic trees built from gltA nucleotide or amino acid sequences was similar to that derived from 16S rRNA, but with higher bootstrap values (Inokuma et al, 2001). Similarly, the heat-shock protein gene (groEL) has been successfully used for taxonomic classification of Ehrlichia and Anaplasma (Yu et al, 2001). The genus and species classification currently used for Anaplasmataceae is based, as mentioned before, on 16S rRNA but also on GroEL (Dumler et al, 2001). Some reports, while recognizing that gltA is superior to 16S rRNA, found that GroEL is superior to gltA for differentiating among Rickettsiales (Lee et al, 2003). These authors argue that reliable phylogenetic interrelationships could be drawn only among the rickettsiae that diverged early from the common ancestor of the spotted fever group (SFG) (Fournier et al, 1998; Lee et al, 2003). However, while GroEL shows 6% - 10% sequence divergence within the genus Anaplasma, which allows reliable phylogenetic differentiation among species of this genus, the sequence divergence of GroEL within the genus Ehrlichia is extremely low (0.3% - 8.6%), rendering phylogenetic trees with bootstrap values as low as those of trees based on 16S rRNA (Yu et al, 2001).

To sum up, 16S rRNA, gltA and GroEL, among others, are examples of genes that have accumulated few nucleotide changes. These genes have been useful for classifying members of Anaplasmataceae and Rickettsiales at the family, genus and species levels (at least from a classical view of the species concept). However, it was estimated that the variability within a single bacterial species (i.e. Escherichia coli) may be greater than the recorded diversity among all primates (Lawrence and Ochman, 1998). Are housekeeping genes, then, useful for differentiating among strains that emerged recently? A study by Lew and colleagues (2003) included 13 strains of A. marginale from South Africa, Namibia, Zimbabwe, Israel, USA, Australia and Uruguay; neither 16S rRNA nor GroEL was able to resolve the phylogenetic relationships between these strains (Lew et al, 2003). Another study, by Hsieh and colleagues (2010), included 22 strains of E. canis, also from different countries: 13, 5 and 3 of the 22 strains had 16S rRNA identities of 100%, 99.9% and 99.8%, respectively, which makes it difficult to distinguish them using 16S rRNA (Hsieh et al, 2010) or other highly conserved genes. The housekeeping genes show us the “trunk” of phylogenetic diversity in Anaplasmataceae; to learn about the “leaves”, we may need different tools.
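The divergence thresholds and identity values discussed in the two preceding sections can be made concrete with a small sketch. The Python code below is an illustration only, not a method used in the studies cited: the function names (`percent_divergence`, `thresholds_met`) are hypothetical, the sequences are assumed to be already aligned to equal length, and gap columns are simply skipped, which is a simplifying assumption. The cut-off values are those quoted in the text.

```python
# Illustrative sketch: compare a pairwise 16S rRNA divergence with the
# species-demarcation thresholds quoted in the text (values in percent).
# Function names and gap handling are assumptions made for this example.

THRESHOLDS = {
    "Stackebrandt and Goebel (1994)": 2.5,
    "Stackebrandt et al (2002)": 1.3,
    "Stackebrandt and Ebers (2006)": 1.0,
    "Ehrlichia criterion (Anderson et al, 1991)": 0.5,
}


def percent_divergence(seq_a, seq_b):
    """Percent of mismatching positions between two pre-aligned sequences.

    Columns in which either sequence carries a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (simplifying assumption)
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def thresholds_met(divergence):
    """Names of the criteria under which the two sequences would count as different species."""
    return [name for name, cutoff in THRESHOLDS.items() if divergence >= cutoff]


if __name__ == "__main__":
    # Toy 100-nt alignment with two substitutions (positions 0 and 50).
    reference = "ACGTACGTAC" * 10
    variant = "T" + reference[1:50] + "G" + reference[51:]
    d = percent_divergence(reference, variant)
    print(f"divergence: {d:.1f}%")
    print("criteria met:", thresholds_met(d))

    # E. canis strains sharing 99.8% 16S rRNA identity (0.2% divergence)
    # meet none of the criteria, which is why a more variable marker is needed.
    print("criteria met at 0.2%:", thresholds_met(0.2))
```

In this toy case the two substitutions in 100 compared positions give 2% divergence, which satisfies the three more recent criteria but not the 2.5% threshold of 1994, while strains differing by only 0.2% are not separated by any of them.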

Immunoreactive proteins and molecular taxonomy

In order to evade the host immune system or to adapt to and occupy new ecological niches, the evolution of several pathogens is marked by a dynamic generation of genetic diversity. This continuous process underlies the emergence of strains with changed patterns of infection, pathogenicity and virulence, and could eventually generate new pathogens. A. marginale and E. canis are examples of highly variable intracellular tick-borne Rickettsiales. The challenge remains to identify those genes with higher rates of evolution, which usually reflect an active interaction with the hosts and vectors associated with the microorganisms in question. Two genes have been extensively used to differentiate between strains of E. canis and A. marginale: gp36 (Hsieh et al, 2010) and MSP1a (Cabezas-Cruz et al, 2013), respectively. These two genes have two main similarities: (i) they are exposed to recognition by the vertebrate host immune system, and (ii) they contain a variable number of tandem repeats. The main difference between them is that in E. canis gp36 the repeat unit of 9 amino acids is highly conserved among isolates while the pre- and post-tandem repeat regions are variable (Zweygarth et al, 2014), whereas in A. marginale MSP1a the repeat unit of 23-31 amino acids is highly variable (Cabezas-Cruz et al, 2013) and the rest of the protein is conserved among isolates (de la Fuente et al, 2001).

Using maximum likelihood analysis based on gp36 amino acid sequences, we showed that new isolates of E. canis from South Africa and Spain fall into 2 well-defined phylogenetic clusters among other E. canis strains. In addition, we found that the members of these 2 clusters share 2 unique molecular properties in the gp36 amino acid sequence: (i) deletion of glycine 117 and (ii) the presence of an additional putative N-linked glycosylation site (Zweygarth et al, 2014). Previous phylogenetic analyses of E. canis gp36 showed some geographical correlation (Hsieh et al, 2010; Kamani et al, 2013), suggesting local evolutionary adaptations. Our results show that this pattern is not consistent: while the South African strains fell into a cluster separate from the Taiwanese isolates, in agreement with the geographical distance between South Africa and Taiwan, Spain 105 fell into the same cluster as Israel-Ranana despite the geographical distance between Spain and Israel. This could be due to the introduction of infected individuals (host, vector or both) into different geographical areas, causing the emergence of non-typical E. canis strains in some regions of the world. This explanation has previously been proposed for the absence of correlation between locality and MSP1a genetic identity found in strains of A. marginale (de la Fuente et al, 2007).

Our results on the evolution of MSP1a suggest another reason for the lack of phylogeographic correlation among isolates of E. canis and A. marginale using gp36 and MSP1a, respectively: convergent evolution. Positive and negative selection can generate a systematic signal that is not ancestral in nature (Massey et al, 2008). Similar selective forces acting on proteins under structural or functional constraints may produce convergent or parallel evolution (homoplasy) (Massey et al, 2008). We found that different variants of MSP1a tandem repeats evolved in such a manner that they converged to specific MSP1a tandem repeats, possibly because of functional constraints related to the role of this protein in pathogen-host interactions. MSP1a has been shown to be an adhesin for bovine erythrocytes (McGarey et al, 1994; de la Fuente et al, 2001a) and tick cells (de la Fuente et al, 2001a). In addition, specific tandem repeats were found to be adhesive to tick cell extracts, whereas others did not bind tick cells (de la Fuente et al, 2003). On the other hand, based on genome synteny and amino acid homology, Doyle and colleagues (2006) proposed that E. canis gp36, E. chaffeensis gp47 and the E. ruminantium mucin-like protein are orthologs. The E. ruminantium mucin-like protein was also found to be an adhesin for tick cells (de la Fuente et al, 2004). Taking this into account, we proposed that gp36 may have the same molecular function, acting as an adhesin for tick cells (Zweygarth et al, 2014). This provides an alternative explanation for the lack of a phylogeographic pattern among E. canis and A. marginale isolates using gp36 and MSP1a, respectively.

Beyond molecular taxonomy: polyphasic taxonomy

An informative characterization of a new microorganism, or of new strains of known pathogens, requires that the morphological, physiological and biochemical properties of the new agent be provided together with molecular data. Using a polyphasic taxonomic approach we were able to characterize E. mineirensis, a new member of the genus Ehrlichia. The molecular characterization was first based on four genes: 16S rRNA, groEL, dsb and gltA. Once the pathogen had been identified as a close relative of E. canis using 16S rRNA, groEL and gltA, we included gp36 among the genes to be tested. The result was striking: the organism presented an ortholog of gp36, but with a new tandem repeat sequence (VPAASGDAQ) completely different from the one reported for E. canis (TEDSVSAPA) (Cabezas-Cruz et al, 2012). The difference found in the gp36 tandem repeats was consistent with a 1.7% sequence divergence between the 16S rRNA of E. mineirensis and E. canis. The nucleotide sequence divergence between the groEL of E. canis and E. mineirensis was 2.8%. Taking into account the high 16S rRNA identity among E. canis strains (maximum divergence 0.6%) (Hsieh et al, 2010), E. mineirensis may have diverged from E. canis long ago. However, the drastic change in one surface protein (gp36) that may be involved in pathogen-host interactions suggests an adaptation to different vertebrate and/or invertebrate hosts. In agreement with this, while the common tick vector for E. canis is R. sanguineus (Groves et al, 1975; Bremer et al, 2005), E. mineirensis was isolated from R. microplus hemolymph (Cabezas-Cruz et al, 2012). Reports of Ehrlichia species in R. microplus are scarce. The first two reports were from China, in the Guangxi Autonomous Region in 1999 (Hua et al, 1999) and in Tibet in 2002 (Wen et al, 2002); a further report came from Thailand in 2003 (Parola et al, 2003) and the latest one from Xiamen, China in 2011 (Jiang et al, 2011). Additionally, a new agent closely related to E. mineirensis and pathogenic for cattle has recently been reported in Brazil (Aguiar et al, 2014). This new agent, named Ehrlichia sp. UFMT-BV, shows 99% (based on dsb) and 100% (based on 16S rRNA) sequence similarity to E. mineirensis. Interestingly, Ehrlichia sp. UFMT-BV has a new tandem repeat sequence (VSADSGAAQ) in gp36 that is closely related to the variant present in E. mineirensis. Altogether, this suggests that this new group of organisms evolved from E. canis sensu stricto and became ecologically independent from the parental species. In agreement with the new host association of this group of microorganisms, we found that E. mineirensis was able to grow in the bovine aorta BA886 cell line while E. canis was not (Zweygarth et al, 2013). This in vitro observation supports the above conclusions regarding the host specificity of the new group of phylogenetically related agents. At the ultrastructural level, E. mineirensis shares features with other members of the genus Ehrlichia (E. muris, E. canis and E. chaffeensis). However, we found cells with unusual structures (invaginations of the cellular membrane) for which we do not have an explanation (Cabezas-Cruz et al, 2013a).

Implications for pathogen molecular diagnostics and control.

Both proteins, gp36 and MSP1a, have been used in molecular diagnostics for E. canis and A. marginale, respectively. The genes encoding these proteins have been shown to be good targets for the identification of E. canis and A. marginale as well as for the evaluation of the genetic diversity of these species. Beyond its use in molecular diagnostics, MSP1a has also been used as an antigen for A. marginale vaccination. We consider that immunoreactive variable proteins offer an opportunity to combine diagnostics of genetic diversity with control measures. In the following paragraphs we discuss how this could be achieved, using A. marginale MSP1a as an example.

MSP1a is a potentially good vaccine antigen, containing both neutralization-sensitive (Palmer et al, 1987; Allred et al, 1990) and immunodominant epitopes (Garcia-Garcia et al, 2004). Incubation of A. marginale initial bodies with neutralizing monoclonal antibodies against Florida isolate Am 105 (MSP1a) completely neutralized the infectivity of 10^7 initial bodies of A. marginale in splenectomized calves (Palmer et al, 1986). However, the infectivity was only partially neutralized when 10^8, 10^9 and 10^10 initial bodies of A. marginale were used in the infection experiments (Palmer et al, 1986). Additionally, Palmer and colleagues (1986) showed that these neutralizing monoclonal antibodies recognize antigenic determinants in 8 different A. marginale isolates (Florida; Okanogan (Washington); south Idaho; north Texas; Clarkston (Washington); Virginia; Kansas and Oklahoma). Subsequently, it was shown that cattle immunized with the antigen Am 105 were significantly protected from challenge with 10^8 Florida isolate initial bodies (Palmer et al, 1986). It was later shown that the epitope recognized by the neutralizing monoclonal antibodies above was present in the tandem repeat region of MSP1a, with the amino acid sequence (E/QASTSS) (Allred et al, 1990). In that study, Allred and colleagues (1990) analyzed 5 different tandem repeats (A, B, C, D, E), in which they found the neutralizing epitope to be conserved. Taking this into account, they closed their manuscript by saying: “It is enigmatic that a surface-exposed neutralization sensitive epitope encoded by sequences of potentially high genetic plasticity remains constant despite immune pressure” (Allred et al, 1990).

After analyzing 193 different MSP1a tandem repeats (Cabezas-Cruz et al, 2013), we found an answer to the enigma: A. marginale tandem repeats do evolve under negative selection in codon positions of the neutralization-sensitive epitope. In addition, deletion of the first glutamine (Q) in the neutralization-sensitive epitope was found among newly evolved tandem repeats in South Africa (Mutshembele et al, 2014). Deletion of the first Q and the last S in the neutralization-sensitive epitope prevents the binding of the neutralizing monoclonal antibodies (Allred et al, 1990). Indeed, we found that the first Q of the neutralization-sensitive epitope (Palmer et al, 1987; Allred et al, 1990) is deleted in several tandem repeats (A, D, E, γ, Σ, φ, 5 - 9, 14, 31, 36, 52, 57, 60 - 66, 69 - 72, 76, 84 - 86, 95 - 99, 105, 107, 116 - 119, 129 - 131, 136 - 139) from A. marginale strains reported previously (Cabezas-Cruz et al, 2013).

We also observed sequence variation among the amino acids present in the immunodominant epitope (SSAGGQQQESS) reported by Garcia-Garcia and colleagues (2004). Most of the residues of the MSP1a immunodominant B-cell epitope (Garcia-Garcia et al, 2004) were deleted in the tandem repeat forms α and 108. The tandem repeat α is widespread in strains from Mexico, Brazil, Venezuela, Argentina and Taiwan and is also present in the most common A. marginale strain in the world, which has the tandem repeat composition (α, β, β, β, Γ) (Cabezas-Cruz et al, 2013). Codon positions 8 and 10 of the immunodominant B-cell epitope (Garcia-Garcia et al, 2004) were found to evolve under negative selection (Mutshembele et al, 2014). Collectively, this evidence suggests that purifying selection is most likely one of the evolutionary mechanisms that A. marginale uses to escape immune recognition of MSP1a tandem repeats (Mutshembele et al, 2014).

Variation in the amino acid sequence of immuno-relevant epitopes found in surface molecules could produce immune escape of A. marginale at the population level. An interesting feature of the interaction between A. marginale and cattle is that, once recovered from acute infection, cattle are resistant to challenge with the homologous strain but remain susceptible to infection with heterologous isolates (Kuttler et al, 1984). This allows so-called “strain superinfection”, the ability of one strain to establish an infection in an animal that has previously developed an immune response against a different strain (Palmer and Brayton, 2013). This results in strong selection for genetic diversity of A. marginale at the population level (Palmer and Brayton, 2013). Considering this, one of the priorities in the development of a subunit vaccine for anaplasmosis is the induction of cross-protection.

Despite the genetic variability of MSP1a tandem repeats, there are conserved tandem repeats in specific geographic regions and also peptides of low variability (Cabezas-Cruz et al, 2013; Mutshembele et al, 2014). This offers an opportunity to design peptide-based vaccines using such conserved or low-variability tandem repeats. We consider that the development of MSP1a-based vaccines should take into consideration: (i) the extant genetic variability of MSP1a, (ii) the evolution of MSP1a genetic diversity, and (iii) the conserved nature of some tandem repeats among different isolates. Recently, Silvestre and colleagues (2014) immunized mice with recombinant MSP1a (the N-terminal region containing the tandem repeats) using carbon nanotubes as a carrier molecule. They found an enhanced immune response in mice immunized with the recombinant version of MSP1a attached to the carbon nanotubes (Silvestre et al, 2014). Thus, the approach appears valid; however, the authors used the MSP1a sequence of the A. marginale strain UFMG2 (GenBank EU676175), which has the tandem repeat composition (13-27-27). Tandem repeat 13 was found to be highly represented in A. marginale strains from South Africa (Cabezas-Cruz et al, 2013; Mutshembele et al, 2014). It is also present in some strains from Argentina, Venezuela and Mexico. However, it is of low abundance in A. marginale strains from Brazil (Pohl et al, 2013; Cabezas-Cruz et al, 2013). In addition, Pohl and colleagues (2013) showed that this particular strain (UFMG2) forms a phylogenetic cluster with only 3 other strains of A. marginale present in Brazil, suggesting that it is not closely related to other strains circulating in the country. Something similar happens with tandem repeat 27 of this strain (UFMG2). This suggests that using this strain for vaccine design could have a low impact in a field trial in Brazil.

Using phage display technology, Santos and colleagues (2012; 2013) identified immunodominant epitopes recognized by the neutralizing monoclonal antibody above (Palmer et al, 1986; Allred et al, 1990). They identified two critical epitopes recognized by the antibody: STSSQL and SEASTSSQLGA (Santos et al, 2012). Using these epitopes as antigens for vaccination, they found a reduced number of infected erythrocytes and no mortality in a mouse model (Santos et al, 2013). Unfortunately, this study did not provide details about the MSP1a genotype of the A. marginale strain used in the challenge. As mentioned before, this neutralization-sensitive peptide in MSP1a also shows variation among different A. marginale strains.

Our results and those of other authors (Ruybal et al, 2009; Almazán et al, 2008) show that in ecological regions potentially infested with ticks of the genus Rhipicephalus, the variability of A. marginale is higher than in regions where these ticks are unlikely to occur. We consider these findings to be relevant to the development of MSP1a-based vaccines and suggest that MSP1a-based vaccination should be combined with tick control strategies in order to minimize the genetic diversity of A. marginale MSP1a. A recent study combined a tick protective antigen with MSP1a in the same vaccine formulation, directed toward the control of both tick infestations and anaplasmosis (Torina et al, 2014). However, Torina and colleagues (2014) used a mutated variant of MSP1a from the Oklahoma strain lacking the tandem repeats (Canales et al, 2008), so it is difficult to evaluate the effect of MSP1a tandem repeats in this case. Altogether, our results suggest that control of A. marginale using MSP1a as a vaccine candidate should be combined with molecular diagnostics to evaluate the genetic diversity and MSP1a composition.

The protein gp36 from E. canis has not been used in vaccination against E. canis. However, Doyle and colleagues (2006) demonstrated that the tandem repeat region of gp36 was highly immunoreactive. In addition, a single repeat from gp36 expressed as a recombinant fusion protein was recognized by anti-E. canis dog serum (Doyle et al, 2006). Further studies should address the feasibility of using gp36 as a vaccine candidate.

Concluding remarks

Bacteria exhibit an incredible diversity at the species level that has yet to be explored. Surface proteins with higher rates of molecular evolution than housekeeping genes offer an alternative for diagnostic and epidemiological studies of tick-borne bacteria of the genera Ehrlichia and Anaplasma. The characterization of new agents must combine cellular and molecular approaches in order to obtain meaningful information about the biology of the microorganism. Evaluation of genetic diversity combined with rational design of peptide-based vaccines may be a good approach to control A. marginale, E. canis and related organisms.

References

Aguiar DM, Ziliani TF, Zhang X, Melo AL, Braga IA, Witter R, Freitas LC, Rondelli AL, Luis MA, Sorte EC, Jaune FW, Santarém VA, Horta MC, Pescador CA, Colodel EM, Soares HS, Pacheco RC, Onuma SS, Labruna MB, McBride JW. A novel Ehrlichia genotype strain distinguished by the TRP36 gene naturally infects cattle in Brazil and causes clinical manifestations associated with ehrlichiosis. Ticks Tick Borne Dis. 2014. pii: S1877-959X(14)00069-7 (in press).

Allred DR, McGuire TC, Palmer GH, Leib SR, Harkins TM, McElwain TF, Barbet AF. Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc Natl Acad Sci USA. 1990. 87(8):3220-3224.

Almazán C, Medrano C, Ortiz M, de la Fuente J. Genetic diversity of Anaplasma marginale strains from an outbreak of bovine anaplasmosis in an endemic area. Vet Parasitol. 2008. 158(1-2):103-109.

Anderson BE, Dawson JE, Jones DC, Wilson KH. Ehrlichia chaffeensis, a new species associated with human ehrlichiosis. J Clin Microbiol. 1991. 29(12):2838–2842.

Bremer WG, Schaefer JJ, Wagner ER, Ewing SA, Rikihisa Y, Needham GR, Jittapalapong S, Moore DL, Stich RW. Transstadial and intrastadial experimental transmission of Ehrlichia canis by male Rhipicephalus sanguineus. Vet Parasitol. 2005. 131(1-2):95–105.

Cabezas-Cruz A, Zweygarth E, Ribeiro MFB, da Silveira AGJ, de la Fuente J, Grubhoffer L, Valdés JJ, Passos MFL. New species of Ehrlichia isolated from Rhipicephalus (Boophilus) microplus shows an ortholog of the E. canis major immunogenic glycoprotein gp36 with a new sequence of tandem repeats. Parasit Vectors. 2012. 5(291).

Cabezas-Cruz A, Passos LMF, Lis K, Kenneil R, Valdés JJ, Ferrolho J, Tonk M, Pohl AE, Grubhoffer L, Zweygarth E, Shkap V, Ribeiro MFB, Estrada-Peña A, Kocan KM, de la Fuente J. Functional and Immunological Relevance of Anaplasma marginale Major Surface Protein 1a Sequence and Structural Analysis. Plos One. 2013. 8(6):e65243.

Cabezas-Cruz A, Vancová M, Zweygarth E, Ribeiro MFB, Grubhoffer L, Passos LMF. Ultrastructure of Ehrlichia mineirensis, a new member of the Ehrlichia genus. Vet Microbiol. 2013a. 167(3–4):455–458.

Canales M, Almazán C, Pérez de la Lastra JM, de la Fuente J. Anaplasma marginale major surface protein 1a directs cell surface display of tick BM95 immunogenic peptides on Escherichia coli. J Biotechnol. 2008. 135(4):326-332.

Chan JZ, Halachev MR, Loman NJ, Constantinidou C, Pallen MJ. Defining bacterial species in the genomic era: insights from the genus Acinetobacter. BMC Microbiol. 2012. 12(302).

Cohan FM, Perry EB. A systematics for discovering the fundamental units of bacterial diversity. Curr Biol. 2007. 17(10):373-386.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Rodríguez SD, Garcia MA, Kocan KM. Evolution and function of tandem repeats in the major surface protein 1a of the ehrlichial pathogen Anaplasma marginale. Anim Health Res Rev. 2001. 2(2):163–173.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM. Differential adhesion of major surface proteins 1a and 1b of the ehrlichial cattle pathogen Anaplasma marginale to bovine erythrocytes and tick cells. Int J Parasitol. 2001a. 31(2):145-153.

de la Fuente J, Garcia-Garcia JC, Blouin EF, Kocan KM. Characterization of the functional domain of major surface protein 1a involved in adhesion of the rickettsia Anaplasma marginale to host cells. Vet Microbiol. 2003. 91(2-3):265-283.

de la Fuente J, Garcia-Garcia JC, Barbet AF, Blouin EF, Kocan KM. Adhesion of outer membrane proteins containing tandem repeats of Anaplasma and Ehrlichia species (Rickettsiales: Anaplasmataceae) to tick cells. Vet Microbiol. 2004. 98(3-4):313-322.

de la Fuente J, Ruybal P, Mtshali MS, Naranjo V, Shuqing Li, Mangold AJ, Rodríguez SD, Jiménez R, Vicente J, Moretta R, Torina A, Almazán C, Mbati PM, Torioni de ES, Farber M, Rosario-Cruz R, Gortazar C, Kocan KM. Analysis of world strains of Anaplasma marginale using major surface protein 1a repeat sequences. Vet Microbiol. 2007. 119(2-4):382–390.

Doyle CK, Nethery KA, Popov VL, McBride JW. Differentially expressed and secreted major immunoreactive protein orthologs of Ehrlichia canis and E. chaffeensis elicit early antibody responses to epitopes on glycosylated tandem repeats. Infect Immun. 2006. 74(1):711–720.

Dumler JS, Barbet AF, Bekker CP, Dasch GA, Palmer GH, Ray SC, Rikihisa Y, Rurangirwa FR. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and 'HGE agent' as subjective synonyms of Ehrlichia phagocytophila. Int J Syst Evol Microbiol. 2001. 51(6):2145-2165.

Fournier PE, Roux V, Raoult D. Phylogenetic analysis of group rickettsiae by study of the outer surface protein rOmpA. Int J Syst Bacteriol. 1998. 48(3):839–849.

-118- Chapter VI Garcia-Garcia JC, de la Fuente J, Kocan KM, Blouin EF, Halbur T, Onet VC, Saliki JT. Mapping of B-cell Lew AE, Gale KR, Minchin CM, Shkap V, de Waal epitopes in the N-terminal repeated peptides of DT. Phylogenetic analysis of the erythrocytic Anaplasma marginale major surface protein 1a and Anaplasma species based on 16S rDNA and GroEL characterization of the humoral immune response of (HSP60) sequences of A. marginale, A. centrale, and cattle immunized with recombinant and whole A. ovis and the specific detection of A. centrale organism antigens. Vet Immunol Immunopathol. vaccine strain. Vet Microbiol. 2003. 92(1-2):145-160. 2004. 98(3-4): 137–151. López-López A, Bartual SG, Stal L, Onyshchenko O, Groves MG, Dennis GL, Amyx HL, Huxsoll DL. Rodríguez-Valera F. Genetic analysis of Transmission of Ehrlichiacanis to dogs by ticks housekeeping genes reveals a deep-sea ecotype of (Rhipicephalus sanguineus). Am J Vet Res. 1975. Alteromonas macleodii in the Mediterranean Sea. 36(7):937–940. Environ Microbiol. 2005. 7(5):649-659.

Hsieh YC, Lee CC, Tsang CL, Chung YT. Detection Massey SE, Churbanov A, Rastogi S, Liberles DA. and characterization of four novel genotypes of Characterizing positive and negative selection and Ehrlichia canis from dogs. Vet Microbiol. 2010. their phylogenetic effects. Gene. 2008. 418(1-2):22- 146(1–2):70–75. 26.

Hua P, Xiangrui C, Yuhai M, Yang S, Qiang Y, McGarey DJ, Barbet AF, Palmer GH, McGuire TC, Shide D, Xueying Z, Huan N, Yongguo Z, Dawson J. Allred DR. Putative adhesins of Anaplasma Amplification of 16S rRNA gene fragments of marginale: major surface polypeptides 1a and 1b. Ehrlichia canis from ticks in southern China. Chin J Infect Immun. 1994. 62(10):4594-4601. Zoon. 1999. 15(3):3–6. Merhej V, Raoult D. Rickettsial evolution in the light Inokuma H, Brouqui P, Drancourt M, Raoult D. of comparative genomics. Biol Rev Camb Philos Soc. Citrate synthase gene sequence: a new tool for 2011. 86(2):379-405. phylogenetic analysis and identification of Ehrlichia. J Clin Microbiol. 2001. 39(9):3031–3039. Mutshembele AM, Cabezas-Cruz A, Mtshali MS, Thekisoe OMM, Galindo RC, de la Fuente J. Jiang BG, Cao WC, Niu JJ, Wang JX, Li HM, Sun Y, Epidemiology and evolution of genetic variability of Yang H, Richadus JH, Habbema JD. Detection and Anaplasma marginale in South Africa. Ticks Tick identification of Ehrlichia species in Rhipicephalus Borne Dis. 2014. In press. (Boophilus) microplus ticks in cattle from Xiamen China. Vector Borne Zoon Dis. 2011. 11(3):325-330. Ødegaard F. How many species of arthropods? Erwin's estimate revised. Biol J Linn Soc. 2000. Kamani J, Lee CC, Haruna AM, Chung PJ, Weka 71(4):583–597. PR, Chung YT. First detection and molecular characterization of Ehrlichia canis from dogs in Palmer GH, Barbet AF, Davis WC, McGuire TC. Nigeria. Res Vet Sci. 2013. 94(1):27–32. Immunization with an isolate-common surface protein protects cattle against anaplasmosis. Science. Kuttler KL, Zaugg JL, Johnson LW. Serologic and 1986. 231(4743):1299-302. clinical responses of premunized, vaccinated, and previously infected cattle to challenge exposure by Palmer GH, Waghela SD, Barbet AF, Davis WC, two different Anaplasma marginale isolates. Am J McGuire TC. Characterization of a neutralization- Vet Res. 1984. 45(11):2223-2226. 
sensitive epitope on the Am 105 surface protein of Anaplasma marginale. Int J Parasitol. 1987. Lawrence JG, Ochman H. Molecular archaeology of 17(7):1279–1285. the Escherichia coli genome. Proc Natl Acad Sci USA. 1998. 95(16):9413-9417. Palmer GH, Brayton KA. Antigenic variation and transmission fitness as drivers of bacterial strain Lee JH, Park HS, Jang WJ, Koh SE, Kim JM, Shim structure. Cell Microbiol. 2013. 15(12):1969-1975. SK, Park MY, Kim YW, Kim BJ, Kook YH, Park KH, Lee SH. Differentiation of rickettsiae by groEL Parola P, Cornet JP, Sanogo YO, Miller RS, Thien gene analysis. J Clin Microbiol. 2003. 41(7):2952- HV, Gonzalez JP, Raoult D, Telford SR III, 2960. Wongsrichanalai C. Detection of Ehrlichia spp.,

-119- Chapter VI

Anaplasma spp., Rickettsia spp., and other eubacteria Stackebrandt E, Goebel BM. Taxonomic Note: A in ticks from the Thai-Myanmar border and Vietnam. Place for DNA-DNA Reassociation and 16S rRNA J Clin Microbiol. 2003. 41(4):1600–1608. Sequence Analysis in the Present Species Definition Pohl AE, Cabezas-Cruz A, Ribeiro MFB, Silveira in Bacteriology. Int J Syst Evol Microbiol. 1994. JAG, Silaghi C, Pfister K, Passos LMF. Detection of 44(4):846-849. genetic diversity of Anaplasma marginale isolates in Minas Gerais, Brazil. Rev Bras Parasitol Vet. 2013. Stackebrandt E, Frederiksen W, Garrity GM, 22(1):1-7. Grimont PA, Kämpfer P, Maiden MC, Nesme X, Rosselló-Mora R, Swings J, Trüper HG, Vauterin L, Rosselló-Mora R, Amann R. The species concept for Ward AC, Whitman WB. Report of the ad hoc prokaryotes. FEMS Microbiol Rev. 2001. 25(1):39- committee for the re-evaluation of the species 67. definition in bacteriology. Int J Syst Evol Microbiol. 2002. 52(3):1043-1047. Ruybal P, Moretta R, Perez A, Petrigh R, Zimmer P, Alcaraz E, Echaide I, Torioni de Echaide S, Kocan Stackebrandt E, Ebers J. Taxonomic parameters KM, de la Fuente J, Farber M. Genetic diversity of revisited: tarnished gold standards. Microbiol Today. Anaplasma marginale in Argentina. Vet Parasitol. 2006. 33(4):152-155. 2009. 162(1-2) :176-180. Sumner JW, Nicholson WL, Massung RF. PCR Sacchi ABV, Duarte JMB, André MR, Machado RZ. amplification and comparison of nucleotide Prevalence and molecular characterization of sequences from the groESL heat shock operon of Anaplasmataceae agents in free-ranging Brazilian Ehrlichia species. J Clin Microbiol. 1997. marsh deer (Blastocerus dichotomus). Comp Immun 35(8):2087–2092. Microbiol and Inf Dis. 2012. 35(4):325–334. 
Torina A, Moreno-Cid JA, Blanda V, Fernández de Santos PS, Nascimento R, Rodrigues LP, Santos FA, Mera IG, de la Lastra JM, Scimeca S, Blanda M, Faria PC, Martins JR, Brito-Madurro AG, Madurro Scariano ME, Briganò S, Disclafani R, Piazza A, JM, Goulart LR. Functional epitope core motif of the Vicente J, Gortázar C, Caracappa S, Lelli RC, de la Anaplasma marginale major surface protein 1a and Fuente J. Control of tick infestations and pathogen its incorporation onto bioelectrodes for antibody prevalence in cattle and sheep farms vaccinated with detection. PLoS One. 2012. 7(3):e33045. the recombinant Subolesin-Major Surface Protein 1a chimeric antigen. Parasit Vectors. 2014. 7(10). Santos PS, Sena AA, Nascimento R, Araújo TG, Mendes MM, Martins JR, Mineo TW, Mineo JR, Wen B, Rikihisa Y, Mott J, Fuerst PA, Kawahara M, Goulart LR. Epitope-based vaccines with the Suto C. Ehrlichia muris sp. nov., identified on the Anaplasma marginale MSP1a functional motif basis of 16S rRNA base sequences and serological, induce a balanced humoral and cellular immune morphological, and biological characteristics. Int J response in mice. PLoS One. 2013. 8(4):e60311. Syst Bacteriol. 1995. 45(2):250–254.

Schloter M, Lebuhn M, Heulin T, Hartmann A. Wen B, Rikihisa Y, Yamamoto S, Kawabata N, Ecology and evolution of bacterial microdiversity. Fuerst PA. Characterization of the SF agent, an FEMS Microbiol Rev. 2000. 24(5):647-660. Ehrlichia sp. isolated from the fluke Stellantchasmus falcatus, by 16S rRNA base sequence, serological, Silvestre BT, Rabelo ÉM, Versiani AF, da Fonseca and morphological analyses. Int J Syst Bacteriol. FG, Silveira JA, Bueno LL, Fujiwara RT, Ribeiro 1996. 46(1)149–154. MF. Evaluation of humoral and cellular immune response of BALB/c mice immunized with a Wen B, Jian R, Zhang Y, Chen R. Simultaneous recombinant fragment of MSP1a from Anaplasma detection of Anaplasma marginale and a new marginale using carbon nanotubes as a carrier Ehrlichia species closely related to Ehrlichia molecule. Vaccine. 2014. 32(19):2160-2166. chaffeensis by sequence analyses of 16S ribosomal DNA in Boophilus microplus ticks from Tibet. J Clin Smith NH, Kremer K, Inwald J, Dale J, Driscoll JR, Microbiol. 2002. 40(9):3286–3290. Gordon SV, van Soolingen D, Hewinson RG, Smith JM. Ecotypes of the Mycobacterium tuberculosis Woese CR. Bacterial evolution. Microbiol Rev. 1987. complex. J Theor Biol. 2006. 239(2):220-225. 51(2):221–271.

-120- Chapter VI

Woo PC, Lau SK, Teng JL, Tse H, Yuen KY. Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories. Clin Microbiol Infect. 2008. 14(10):908-934.

Ybañez AP, Ybañez RH, Claveria FG, Cruz-Flores MJ, Xuenan X, Yokoyama N, Inokuma H. High Genetic Diversity of Anaplasma marginale Detected from Philippine Cattle. J Vet Med Sci. 2014. [Epub ahead of print]

Yu XJ, Zhang XF, McBride JW, Zhang Y, Walker DH. Phylogenetic relationships of Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species determined by GroEL aa sequences. Int J Syst Evol Microbiol. 2001. 51(3):1143–1146.

Zweygarth E, Schöl H, Lis K, Cabezas-Cruz A, Thiel C, Silaghi C, Ribeiro MFB., Passos LMF. In vitro culture of a novel genotype of Ehrlichia sp. from Brazil. Transbound Emerg Dis. 2013. 60(2):86–92.

Zweygarth E, Cabezas-Cruz A, Josemans AI, Oosthuizen MC, Matjila PT, Lis K, Broniszewska M, Schöl H, Ferrolho J, Grubhoffer L, Passos LMF. In vitro culture and structural differences in the major immunoreactive protein gp36 of geographically distant Ehrlichia canis isolates. Ticks Tick Borne Dis. 2014. 5(4):423–431.

General conclusions

Conclusions

1. The immunoreactive proteins gp36 and MSP1a from E. canis and A. marginale, respectively, are encoded by genes with variable sequences that are relevant for the characterization of genetic diversity in these two pathogens.

2. High A. marginale prevalence in cattle from endemic regions is associated with high genetic diversity in the gene msp1a.

3. Low A. marginale prevalence in water buffaloes is associated with low genetic diversity in the gene msp1a.

4. The evolution of A. marginale msp1a differs between ecological regions optimal for different tick vectors. High amino acid variability and strong selective pressure are associated with ecoregions optimal for R. microplus, while low amino acid variability and weak selective pressure are associated with ecoregions optimal for Dermacentor spp.

5. The genetic diversity of E. canis gp36 is high among different strains of this pathogen.

6. The gene gp36 is a suitable tool for the identification of new species of the genus Ehrlichia closely related to E. canis.

7. Polyphasic taxonomy, including variable genes such as msp1a and gp36, is necessary in order to diagnose new tick-borne pathogens.

8. Surface proteins in Anaplasmataceae, such as MSP1a, that are involved in pathogen-tick interactions may be used for the development of control measures to reduce pathogen infection and transmission and thus the risk of tick-borne diseases.

Conclusions

1. The immunoreactive proteins gp36 and MSP1a of E. canis and A. marginale, respectively, are encoded by genes with variable sequences that are relevant for characterizing the genetic variability of these two pathogens.

2. The high prevalence of A. marginale in cattle from endemic regions is associated with high genetic variability of the gene msp1a.

3. The low prevalence of A. marginale in water buffaloes is associated with low genetic variability of the gene msp1a.

4. The evolution of the A. marginale gene msp1a differs between ecological regions that are optimal for different tick vector species. High amino acid variability and strong selective pressure are associated with regions optimal for the development of R. microplus, whereas low amino acid variability and weak selective pressure are associated with regions optimal for the development of Dermacentor species.

5. The genetic diversity of the E. canis gene gp36 is high among different strains of this pathogen.

6. The gene gp36 is an appropriate tool for identifying new species of the genus Ehrlichia that are closely related to E. canis.

7. Polyphasic taxonomy, including variable genes such as msp1a and gp36, is necessary for the diagnosis of new tick-borne pathogens.

8. Surface proteins such as MSP1a, which are present in species of the family Anaplasmataceae and are involved in tick-pathogen interactions, can be used to develop control measures that reduce pathogen infection and transmission and thus the risk of acquiring tick-borne diseases.

Summary

Genetic variability underlies the emergence of new strains of microorganisms with different patterns of virulence, transmission and pathogenicity. A. marginale and E. canis constitute two models of highly variable tick-borne pathogens. This variability gives A. marginale great plasticity to infect different vertebrate hosts (several species of ungulates) and invertebrate hosts (20 tick species have been incriminated as vectors of A. marginale). E. canis is also a variable pathogen but has a more restricted host range than A. marginale: it infects mainly dogs and is almost exclusively transmitted by R. sanguineus, although it has been shown that E. canis can infect humans. Classical taxonomy has focused on the classification of families, genera and species of microorganisms. Nevertheless, a huge diversity exists at the strain level, which we have only begun to appreciate. Molecular tools are needed to study the diversity of pathogens at all taxonomic levels. This could have major implications for understanding diseases at the epidemiological level and for designing control strategies. In Chapter I of this thesis we briefly review the diagnostic methods used to detect A. marginale and E. canis. We also include a brief description of the genomes of these organisms, which have been sequenced and offer an excellent opportunity to search for new genes useful for diagnosis and/or control.

Subsequently, in Chapter II we identify msp1a as a highly variable gene that has proven to be a good tool for studying the epidemiology and genetic diversity of A. marginale. Three studies are presented in this chapter, in which we evaluate the genetic diversity of A. marginale in cattle from Brazil and South Africa and in water buffaloes from Brazil. These studies show that A. marginale has a high incidence in cattle from endemic regions, associated with high genetic variability. In contrast, both incidence and genetic diversity are low in water buffaloes.

One of the most interesting and relevant questions regarding the genetic variability observed in these pathogens is: what drives this genetic diversity? For example, during the analysis presented in Chapter II.I, we noticed that A. marginale msp1a is less variable in the USA than in Venezuela. We then studied this phenomenon using the strategy presented in Chapter III. It is generally accepted that intracellular bacteria, such as A. marginale and E. canis, are under less evolutionary pressure than free-living bacteria. Our results show not only that A. marginale msp1a is under evolutionary pressure, but also that this pressure varies depending on the ecological region in which different strains of A. marginale are found.
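The regional comparison of msp1a variability described above can be made concrete with a simple per-position variability score. The sketch below uses Shannon entropy over pre-aligned tandem repeats; this is an assumed, illustrative measure and not necessarily the method used in this thesis. The function names and the toy sequences are hypothetical and are not real MSP1a repeats.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def per_position_entropy(aligned_repeats):
    """Entropy at each position of equal-length, pre-aligned repeats."""
    if len({len(r) for r in aligned_repeats}) != 1:
        raise ValueError("all repeats must be aligned to the same length")
    return [shannon_entropy(col) for col in zip(*aligned_repeats)]


def mean_entropy(aligned_repeats):
    """Average per-position entropy, used here as a crude variability score."""
    scores = per_position_entropy(aligned_repeats)
    return sum(scores) / len(scores)


# Hypothetical toy repeats (assumption: NOT real MSP1a sequences).
region_a = ["ADSSSA", "ADSSSA", "ADSSSA", "ADSSSA"]  # fully conserved
region_b = ["ADSSSA", "TDSSSA", "ADGQSA", "TDGSSG"]  # several variable sites

print("region A mean entropy:", mean_entropy(region_a))
print("region B mean entropy:", mean_entropy(region_b))
```

A higher mean entropy indicates more amino acid variability among the repeats of a region. Entropy alone does not distinguish positive from negative selection, which requires codon-based analyses.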

A gene similar to msp1a, gp36, is found in E. canis and also contains tandem repeats. The main difference between these two genes is that the tandem repeat region of msp1a is highly variable while the rest of the protein is highly conserved; the opposite pattern is observed in gp36. In Chapter IV we present the isolation of three different strains of E. canis with variable gp36 genes. We further studied the gp36 sequences available in GenBank and found that all the strains reported so far, including the three from our study, fall into two phylogenetic clusters. Members of each cluster share two properties: (i) deletion/insertion of glycine 117, and (ii) one putative sequon less/more.
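As an illustration of the second property, the number of putative sequons in a protein sequence can be counted with a short pattern search. The sketch below assumes that the sequon meant here is the N-linked glycosylation consensus N-X-S/T, where X is any residue except proline; the thesis may have relied on dedicated prediction tools instead. The function name and the example sequences are hypothetical and are not real gp36 sequences.

```python
import re

# Assumed N-linked consensus: N, then any residue except P, then S or T.
# The lookahead keeps the match one residue wide, so overlapping sequons
# (for example in "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein):
    """Return the 1-based positions of the N in each putative sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequences (assumption: NOT real gp36 sequences).
strain_x = "MANGSAEELNPTQANATK"
strain_y = "MANGSAEELNPTQADATK"

print(sequon_positions(strain_x))
print(sequon_positions(strain_y))
```

Comparing the length of the two returned lists shows whether one sequence carries one sequon more or fewer than the other.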

As mentioned above, one of the major risks associated with the genetic variability of microorganisms is the emergence of new strains or new pathogenic agents. In Chapter V we present the characterization of a new microorganism, named E. mineirensis, which was isolated from R. microplus in Brazil. This microorganism is closely related to E. canis and has an ortholog of gp36 with a new sequence of tandem repeats. It is worth mentioning that an organism with 100% identity (based on 16S rRNA) to E. mineirensis was recently identified in Brazil and was shown to be pathogenic for cattle. Finally, in Chapter VI we present a general discussion of the results of the thesis, mainly those related to genetic markers for the identification and study of the genetic diversity of A. marginale and E. canis and their implications for the diagnosis and control of these pathogens.

Summary

Genetic variability is the basis for the emergence of new strains of microorganisms with different patterns of virulence, transmission and pathogenicity. A. marginale and E. canis constitute two models of highly variable pathogens. This variability gives A. marginale great plasticity to infect different vertebrate hosts (several species of ungulates) and, at the same time, to be transmitted by a wide variety of tick species (20 species have been incriminated as vectors of A. marginale). E. canis is an example of a variable pathogen that nevertheless infects mainly dogs and is transmitted mostly by Rhipicephalus sanguineus, although cases of human infection with this pathogen have been reported. Classical taxonomy has concentrated mainly on the characterization of families, genera and species. However, great diversity exists at the strain level, which has not yet been explored and about which we have only just begun to learn. Molecular tools are needed to study the diversity of pathogens at all taxonomic levels. This may have major implications for epidemiology and disease control. Chapter I briefly reviews the diagnostic methods used to detect A. marginale and E. canis. It also includes a brief description of their respective genomes, which have been sequenced and offer a wonderful opportunity to search for new genes for diagnosis and/or control.

Subsequently, Chapter II identifies msp1a as a highly variable gene of A. marginale that proves to be a good tool for epidemiological studies and for studies of genetic diversity in this bacterium. Three studies are presented in this chapter, which evaluate the genetic diversity of A. marginale in Brazil and South Africa, in cattle and in water buffaloes. These studies show that A. marginale has a high incidence in regions endemic for this disease, which is related to high genetic variability. They also show that water buffaloes can be infected by A. marginale, but with low prevalence and low genetic diversity. One of the most interesting and relevant questions raised by this genetic variability is: what mechanisms trigger this genetic diversity? For example, during the studies presented in Chapter II, section II.I, we noticed that the genetic variability of A. marginale in the USA is much lower than in Venezuela. We therefore set out to study this phenomenon using a strategy presented in Chapter III. It is generally accepted that intracellular bacteria, such as A. marginale and E. canis, are subject to less evolutionary pressure than their free-living counterparts. Our results show not only that the A. marginale gene msp1a is exposed to evolutionary pressures that produce positive and negative selection at different codons along the tandem repeat but also, most interestingly, that these evolutionary pressures vary depending on the ecological region in which a given strain of A. marginale is found.

A gene similar to msp1a, gp36, is found in E. canis and also contains tandem repeats. The main difference between these two genes is that the tandem repeat region of msp1a is highly variable while the rest of the protein encoded by this gene is highly conserved. The inverse pattern is observed in gp36. Chapter IV of the thesis shows that the gene gp36 varies among different strains of E. canis; nevertheless, the strains reported so far group into two phylogenetic clusters. The members of each cluster share two main properties: (i) absence of a glycine at position 117 and (ii) one fewer putative glycosylation sequon.

As noted above, one of the major risks posed by the genetic variability of microorganisms is the emergence of new strains or new pathogenic agents. In Chapter V we present the characterization of a new microorganism, named E. mineirensis, which was isolated from R. microplus in Brazil. This microorganism is phylogenetically related to E. canis and has an ortholog of gp36 with a new sequence of tandem repeats. It is important to mention that a microorganism 100% identical (based on 16S rRNA) to E. mineirensis was recently identified in Brazil and was shown to be pathogenic for cattle. Finally, Chapter VI gives a general discussion of the results of the thesis, mainly those related to the use of genetic markers for the identification and study of the genetic diversity of A. marginale and E. canis and their implications for the diagnosis and control of these pathogens.

Acknowledgements

This doctorate is not an ordinary doctorate. It is the fruit of the many small contributions that many people have made to complete, in the end, a piece of my life as a scientist. This doctorate also has many "dots" that life has connected, forming one of those incredible networks that sometimes make us exclaim: "What a small world!"

I will begin these acknowledgements with Pepe, who has been a guide since long before I met him, long before even he knew it. "Somewhere in La Mancha…" I had always heard that there was a certain Pepe who was a legend in the world of ticks. Alongside names such as Frans Jongejan, Stephen Wikel, Pat Nuttal and others… he stood as one of the people most passionate about the "things" of these little animals, the ticks. Thank you very much, Pepe, for agreeing to be a teacher to me from the moment we first met.

I would also like to thank, in the first place, Professor Libor Grubhoffer, who has been one of the most special people I have ever met. There are moments in life when an opportunity is decisive for carrying on; Libor gave me that opportunity by welcoming me to his unforgettable České Budějovice. Without him this doctorate would not have been possible either. Libor has many times been like a father to me.

There is no doubt that one of the most special things that has happened to me in life is POSTICK. Suddenly I was part of a group of young researchers interested in the "things" of ticks, and we had the wonderful opportunity to be trained by some of the best scientists in the world in this branch of science, in a setting of scientific workshops, international conferences, intense communication, and other scientific and non-scientific activities; in short, paradise for any student. That is how I see it. Many thanks to all the POSTICK "PIs", and especially to Professor Lygia Pasos for bringing them all together to make that incredible project a reality. Thanks to her also for her eternal smile and for a kindness that never wilted in difficult moments.

Many thanks also to all the students: John, Octavio, Marina, Miray, Joana, Kasia, Rachel, Jun, Leticia, Marta, Annika, Boglan, Claudia and Savigne. I had a great time with all of you and I also learned a lot from you. I miss you. Miray, Joana, Rachel, Kasia and I were part of that first article on MSP1a, which we had enormous fun writing. That brought us even closer. I would have loved to be part of an article that included all the POSTICK students and "PIs". Perhaps in the future.

I want to thank Dr. Erich Zweygarth, who does not hold the official title of Professor but who has been one to me for ages. Erich, besides being my friend, was an indispensable pillar of the work published in this thesis. Without Erich, Ehrlichia mineirensis would not be known today. Erich humbly says that he is not a scientist but an "artist", and I would add that he is one of the finest artists in the science of in vitro culture.

Thanks to Agustin Estrada, my friend of the satellites and the climates and the magical "harmonic regressions"; you will see, one day I will learn them. Without your knowledge of the ecology of ticks and tick-borne diseases, the scope of this thesis would be rather limited. Thanks also for your good humour and for "skyping" every time we need to discuss science.

I cannot fail to mention Jimi here, my brother Jimi. When I arrived in Budejovice I was told: there is another Cuban. I met him one day when he was sitting at the microscope. I can remember it. We have still not finished that first conversation; we keep talking, we keep dreaming together. Jimi has also been an important part of the protein structural analyses carried out for this thesis. Thank you, my brother.

I want to thank my wife, Cami, for her love and support in the realization of this thesis. You have been very important to me. Through Cuba, Germany, the Czech Republic and now, you have always been with me. Sorry, Cami, that I cannot write this in French yet. I also want to thank my little Chloé (my youngest daughter), who without the slightest consideration pulls me away from whatever I am doing because she needs to play. She does not yet know how grateful I am to her for it.

Thanks also to my mum for all her love and protection, always. I will never be able to thank you enough for all the love and time you have devoted to me.

I want to give special thanks to David Castellanos, who has always helped me willingly and cheerfully with the paperwork for this thesis, which has never been little.

This thesis also carries some pain; all happiness comes with it, I believe. As Cesar Vallejo said: "There are blows in life, so hard… I don't know!" My blow is distance. I am far from my two daughters, Carmen Alicia and Carmen Sofia. I am far from my grandmothers, from my parents. I am far away. One day we will be together again… As a friend once told me: "Something will happen as long as you keep trying…"

Many thanks to all of you!

Appendices

Appendix 1

Appendix 1. Diversification of A. marginale MSP1a. A: The figures show the phylogenetic relationships (as a cladogram) of A. marginale strains from Argentina, Brazil, South Africa, USA, Israel, Mexico and Venezuela. Ecoregions with optimal climatic conditions for R. microplus, R. decoloratus and R. annulatus were determined (see Materials and Methods) and A. marginale strains isolated from the same regions were identified. The last common ancestor (see Materials and Methods) of each phylogenetic cluster of related A. marginale strains was determined (arrows at internal nodes). B: The tandem repeat composition of A. marginale MSP1a is presented, with unique tandem repeats identified (*). Unlabelled tandem repeats are common to several isolates from different regions of the world. C: The phylogenetic relationships among tandem repeats from phylogenetically related A. marginale strains were inferred and the ancestral sequences reconstructed. When the ancestor was not an extant tandem repeat, its sequence is shown below the tree (open circles); when the ancestor was an extant tandem repeat, it is shown on the tree (closed circles). In all cases phylogenetic trees were constructed using ML, NJ and MrBayes with MSP1a amino acid sequences available in GenBank (accession numbers are shown). Three statistical methods were used to test the significance of the topologies: bootstrapping (B) for NJ and ML, approximate likelihood ratio (SH) for ML and posterior probabilities (PB) for MrBayes. Numbers on internal branches are the values of the statistical test expressed as percentages. Only values higher than 50 are presented.

Appendix 2

Appendix 2. World ecoregions. A: The spatial range of the 36 categories of abiotic variables produced with remotely sensed information on temperature, NDVI and LAI, sequentially coloured. B to D: Areas of suitable abiotic conditions for R. annulatus, R. microplus and R. decoloratus, respectively. The maps are coded in four colour levels, from dark blue to yellow, ranging from conditions in which persistence is not possible (blue) to the best abiotic conditions for survival (yellow).

Appendix 3

Appendix 3. Evidence of diversification and convergent evolution in A. marginale MSP1a tandem repeats.

The columns Argentina to Venezuela list the different MSP1a tandem repeats that gave origin to the ancestors of unique tandem repeats in the respective regions.

Argentina | Brazil | South Africa | USA | Israel | Mexico | Venezuela | Ancestors of unique tandem repeats | Unique tandem repeats
Cluster β (i*); Cluster 10 (M) | Cluster β (N); Cluster 13 (M) | Not present | Cluster B (B) | Cluster 3.2 (F) | Cluster 9 (ii*); Cluster β (81) | Cluster B (B) | F | 52, 17, 91
Cluster β (β) | Cluster β (81) | Not present | Not present | Not present | Cluster β (81) | Cluster β (β); Cluster ϕ (iii*) | β | 49, 50, 51, 127
Cluster 10 (iv*) | Cluster β (N); Cluster 13 (10) | Not present | Not present | Not present | Not present | Not present | τ | 48
Cluster 10 (v*) | Cluster β (N); Cluster 13 (18) | Cluster 13 (27) | Not present | Not present | Cluster 9 (13) | Cluster 63 (18) | 13 | 152, 158, 30, 23
Cluster β (3); Cluster 10 (v*) | Cluster β (F); Cluster 13 (B) | Cluster Q (m) | Cluster B (B) | Cluster 3.1 (3); Cluster 3.2 (F) | Cluster 9 (ii*) | Not present | M | 72
Cluster 10 (C) | Cluster 13 (M) | Not present | Cluster B (B) | Not present | Cluster B/C (B) | Cluster B (B) | B | µ, 16, σ, V, J, 89, 101, 102, 90, 120, 119, 118, 117
Not present | Not present | Cluster 13 (100); Cluster Q (m) | Not present | Not present | Not present | Cluster 63 (18) | 37 | 83, 153
Not present | Not present | Cluster 13 (100) | Not present | Not present | Not present | Not present | 141 | 140
Not present | Not present | Cluster 13 (13) | Not present | Not present | Not present | Not present | vi* | 79, 82
Cluster 10 (13) | Cluster 13 (18) | Cluster 13 (27) | Not present | Not present | Cluster 9 (59) | Cluster 63 (v) | 27 | 161, 155
Not present | Not present | Cluster 13 (27) | Not present | Not present | Not present | Not present | 34 | 160, 151
Cluster 10 (13) | Cluster 13 (13) | Cluster 31 (vii*) | Not present | Cluster 3.1 (78) | Not present | Not present | 25 | 143, 145
Not present | Not present | Cluster 31 (vii*) | Not present | Not present | Not present | Not present | viii* | 146, 156
Not present | Cluster 13 (18) | Cluster 31 (42) | Not present | Not present | Not present | Not present | 42 | 142, 156
Not present | Not present | Not present | Cluster B (B) | Not present | Cluster B/C (B) | Cluster β (ix*); Cluster B (B) | T | O, K, L
Not present | Not present | Not present | Cluster B (H) | Not present | Not present | Not present | H | P, W
Not present | Not present | Cluster 13 (27) | Not present | Cluster 3.1 (3); Cluster 3.2 (F) | Not present | Not present | 3 | 77, 74
Not present | Not present | Not present | Not present | Not present | Cluster 9 (v*) | Cluster 63 (94) | 63 | 12, 68, 104
Cluster 10 (x*) | Not present | Not present | Not present | Not present | Cluster 9 (x*) | Not present | 11 | 56, 109, 93
Not present | Not present | Cluster 13 (100) | Not present | Cluster 3.1 (3); Cluster 3.2 (F) | Cluster 9 (x*) | Not present | 4 | 67, 14
Not present | Not present | Not present | Not present | Not present | Cluster 9 (13) | Not present | 59 | 9, 58
Cluster 10 (B) | Not present | Not present | Cluster B (B) | Not present | Cluster B/C (B); Cluster β (103) | Cluster B (F) | C | π
Not present | Not present | Not present | Not present | Not present | Cluster β (81) | Not present | 81 | 69
Not present | Cluster β (81) | Not present | Not present | Not present | Cluster β (81) | Cluster 62 (61) | 61 | 70, 71, 84
Cluster β (103) | Not present | Not present | Not present | Not present | Not present | Not present | xi* | 103, 108
Not present | Cluster β (81) | Not present | Not present | Not present | Cluster β (61) | Cluster 62 (61) | 62 | 86, 66
Not present | Not present | Not present | Not present | Not present | Not present | Cluster β (ix*) | ix* | 116, 126
Not present | Not present | Not present | Not present | Not present | Not present | Cluster β (β) | 115 | 114, 128
Not present | Not present | Not present | Not present | Not present | Not present | Cluster β (D) | 139 | 137
Not present | Not present | Not present | Cluster D (D) | Not present | Cluster B/C (D) | Cluster B (D) | D | 138, 95, 136
Not present | Not present | Not present | Not present | Not present | Not present | Cluster 63 (v*) | v* | 92, 87
Not present | Not present | Not present | Not present | Not present | Not present | Cluster 63 (v*) | 94 | 110
Not present | Not present | Cluster 13 (141) | Not present | Not present | Not present | Cluster ϕ (iii*) | 100 | ----
Not present | Not present | Not present | Not present | Not present | Not present | Cluster 63 (18) | xii* | 124
Not present | Not present | Not present | Not present | Not present | Not present | Cluster 63 (xii*) | 124 | 113, 123
Not present | Not present | Not present | Not present | Not present | Not present | Cluster 63 (124) | 123 | 122, 121
Not present | Not present | Not present | Not present | Not present | Not present | Cluster ϕ (xiii*) | xiii* | 129
Not present | Cluster β (N) | Not present | Not present | Not present | Not present | Cluster ϕ (iii*) | ϕ | 130, 131

The ancestors of unique tandem repeats in the different regions are shown inside the brackets.

* Tandem repeats that have been selectively eliminated: i (TDSSSASGQQQESSVSSQSGDASTSSQLG), ii (TDSSSASGQQQESSVLSQSGQASTSSQLGVG), iii (TDSSSASGQQQESSVSSQSGEASTSSQLG), iv (TDSSSASGQQQESSVLPSGQASTSSQLGVG), v (ADSSSASGQQQESSVLSQSGQASTSSQLGVG), vi (ADSSSASGQQQESSVLSQSDLSSTSSQLG), vii (ADSSSAGNQQQESSVSSQSSQASTSSQLG), viii (ADSSSAGNQQQESSVSSQSDASTSSQLGG), ix (TDSSSASGQQQESSVLSSQSDASTSSQLG), x (TDSSSASGQQQESSVLSPSGQASTSSQLGVG), xi (ADSSSASGVLSQSGQASTSSQLGTSSQLG), xii (TDSSSASGQQRESSVLSQSDQASTSSQSG) and xiii (ADSSSASGQHQESSVSSQSEASTSSQLG).
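The eliminated repeats listed above differ from one another at only a few amino acid positions. As a minimal sketch (not the method used in this thesis), the following compares two repeats of equal length position by position; the function name `differing_positions` is hypothetical, and the two sequences are repeats i and iii as given in the footnote.

```python
# Repeats i and iii, copied from the Appendix 3 footnote.
REPEAT_I = "TDSSSASGQQQESSVSSQSGDASTSSQLG"
REPEAT_III = "TDSSSASGQQQESSVSSQSGEASTSSQLG"


def differing_positions(seq_a, seq_b):
    """Return (1-based position, residue in seq_a, residue in seq_b) for each mismatch.

    Only defined for sequences of equal length; repeats of different
    lengths would need an alignment step, which is not sketched here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


print(differing_positions(REPEAT_I, REPEAT_III))
```

For these two repeats the comparison reports a single difference, D versus E at position 21.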

Appendix 4. See note at the end of appendices.

Appendix 5

Appendix 5. Number of transition and transversion substitutions.

Transitions
A to G | G to A | C to T | T to C | TOTAL
50 | 49 | 16 | 12 | 127

Transversions
A to C | C to A | A to T | T to A | G to T | T to G | C to G | G to C | TOTAL
15 | 10 | 20 | 5 | 10 | 11 | 15 | 17 | 103
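The grouping in Appendix 5 follows the standard definition: a transition exchanges two purines (A, G) or two pyrimidines (C, T), and any other substitution is a transversion. The sketch below, which is illustrative and not the software used in this thesis, classifies each substitution and recomputes the two totals from the tabulated counts; the names `substitution_type`, `COUNTS` and `totals` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Substitution counts transcribed from Appendix 5: (from, to) -> count.
COUNTS = {
    ("A", "G"): 50, ("G", "A"): 49, ("C", "T"): 16, ("T", "C"): 12,
    ("A", "C"): 15, ("C", "A"): 10, ("A", "T"): 20, ("T", "A"): 5,
    ("G", "T"): 10, ("T", "G"): 11, ("C", "G"): 15, ("G", "C"): 17,
}


def substitution_type(src, dst):
    """Classify a nucleotide substitution as a transition or a transversion."""
    if src == dst:
        raise ValueError("not a substitution")
    pair = {src, dst}
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine).
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def totals(counts):
    """Sum the counts per substitution type."""
    result = {"transition": 0, "transversion": 0}
    for (src, dst), n in counts.items():
        result[substitution_type(src, dst)] += n
    return result


print(totals(COUNTS))
```

The recomputed sums agree with the totals reported in the table (127 transitions and 103 transversions).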

Appendix 6

Appendix 6. Transition and transversion substitutions in A. marginale msp1a from different ecoregions.

Ecoregion | Total of substitutions | Transitions | Transversions | Ratio transitions/transversions
R. microplus | 161 | 89 | 72 | 1.2
R. decoloratus | 32 | 13 | 19 | 0.68
R. microplus and R. annulatus | 19 | 12 | 7 | 1.7
R. annulatus | 5 | 5 | 0 | ND
Absence of Rhipicephalus spp | 12 | 7 | 5 | 1.4
Rhipicephalus spp eradicated | 7 | 7 | 0 | ND
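The ratios in Appendix 6 can be recomputed from the transition and transversion counts. The sketch below is illustrative only. It assumes that "ND" marks ecoregions where the ratio is undefined because no transversions were observed, an interpretation that is consistent with the table but not stated in it. The names `ROWS` and `format_ratio` are hypothetical.

```python
# Rows transcribed from Appendix 6: (ecoregion, transitions, transversions).
ROWS = [
    ("R. microplus", 89, 72),
    ("R. decoloratus", 13, 19),
    ("R. microplus and R. annulatus", 12, 7),
    ("R. annulatus", 5, 0),
    ("Absence of Rhipicephalus spp", 7, 5),
    ("Rhipicephalus spp eradicated", 7, 0),
]


def format_ratio(transitions, transversions):
    """Return the ratio to two significant figures, or 'ND' when undefined."""
    if transversions == 0:
        # Assumption: ND is used when the denominator is zero.
        return "ND"
    return f"{transitions / transversions:.2g}"


for name, ts, tv in ROWS:
    print(f"{name}: total={ts + tv}, ratio={format_ratio(ts, tv)}")
```

Rounded to two significant figures, the recomputed ratios match the tabulated values, and each total equals the sum of its transitions and transversions.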

Appendix 7. See note at the end of appendices.

Appendix 8. Probability data and coupling of MSP1a tandem repeats amino acid positions related to different ecological regions.

The columns Site 1 and Site 2 give the coevolving sites. The last three columns give posterior probabilities (sites were considered coevolving when P>0.5 for any of the conditionality conditions below).

Ecoregion | Site 1 | Site 2 | Site 2 being conditionally dependent on Site 1 | Site 1 being conditionally dependent on Site 2 | Site 1 and Site 2 being conditionally dependent
R. annulatus | 2 | 19 | 0.301733 | 0.415099 | 0.716832
 | 3 | 8 | 0.309229 | 0.288939 | 0.598168
 | 8 | 7 | 0.55459 | 0.273822 | 0.828412
 | 8 | 14 | 0.48747 | 0.108429 | 0.595899
 | 16 | 21 | 0.146289 | 0.514506 | 0.660795
 | 20 | 21 | 0.437361 | 0.395522 | 0.832883
R. decoloratus | 5 | 6 | 0.473632 | 0.231634 | 0.705266
 | 7 | 16 | 0.544418 | 0.150472 | 0.69489
 | 8 | 7 | 0.548206 | 0.372096 | 0.920302
 | 18 | 10 | 0.263817 | 0.51705 | 0.780867
 | 22 | 10 | 0.123018 | 0.399116 | 0.522134
 | 22 | 21 | 0.523915 | 0.143764 | 0.667679
 | 22 | 25 | 0.229865 | 0.330314 | 0.560179
 | 24 | 6 | 0.520408 | 0.189763 | 0.710171
 | 24 | 23 | 0.429535 | 0.0735722 | 0.503107
 | 28 | 20 | 0.276347 | 0.277185 | 0.553532
 | 29 | 14 | 0.451034 | 0.364647 | 0.815681
Absence of Rhipicephalus spp | 1 | 28 | 0.293701 | 0.37771 | 0.671411
 | 2 | 7 | 0.318991 | 0.28462 | 0.603611
 | 20 | 1 | 0.353189 | 0.380897 | 0.734086
 | 20 | 21 | 0.266579 | 0.433119 | 0.699698
 | 29 | 11 | 0.294364 | 0.502934 | 0.797298
 | 30 | 21 | 0.360167 | 0.2575 | 0.617667
R. microplus | 1 | 8 | 0.00840854 | 0.518813 | 0.527222
 | 1 | 16 | 0.00613617 | 0.829536 | 0.835672
 | 2 | 7 | 0.588338 | 0.126879 | 0.715217
 | 5 | 6 | 0.343091 | 0.397345 | 0.740436
 | 5 | 22 | 0.53648 | 0.217865 | 0.754345
 | 6 | 23 | 0.593461 | 0.16232 | 0.755781
 | 8 | 3 | 0.84781 | 0.0100438 | 0.857854
 | 8 | 7 | 0.94 | 0.06 | 1
 | 8 | 12 | 0.0570501 | 0.892231 | 0.949281
 | 9 | 12 | 0.559631 | 0.439183 | 0.998814
 | 14 | 11 | 0.324172 | 0.5269 | 0.851072
 | 14 | 12 | 0.199546 | 0.695303 | 0.894849
 | 16 | 11 | 0.59026 | 0.208017 | 0.798277
 | 16 | 18 | 0.239961 | 0.759984 | 0.999945
 | 16 | 28 | 0.68012 | 0.307422 | 0.987542
 | 18 | 6 | 0.140288 | 0.750704 | 0.890992
 | 20 | 16 | 0.114804 | 0.807637 | 0.922441
 | 20 | 21 | 0.14 | 0.86 | 1
 | 21 | 8 | 0.827506 | 0.0426712 | 0.870177
 | 21 | 23 | 0.115856 | 0.664965 | 0.780821
 | 24 | 23 | 0.0945503 | 0.741252 | 0.835802
 | 25 | 17 | 0.763093 | 0.118171 | 0.881264
 | 25 | 24 | 0.529157 | 0.0596495 | 0.588807
 | 27 | 11 | 0.031971 | 0.523934 | 0.555905
 | 27 | 23 | 0.0321853 | 0.958824 | 0.991009
 | 29 | 23 | 0.107986 | 0.602504 | 0.71049
 | 30 | 18 | 0.297657 | 0.436509 | 0.734166
Rhipicephalus spp eradicated | 3 | 4 | 0.331338 | 0.350022 | 0.68136
 | 8 | 7 | 0.311332 | 0.254289 | 0.565621
 | 17 | 8 | 0.33533 | 0.220862 | 0.556192
 | 24 | 15 | 0.272844 | 0.288497 | 0.561341
 | 26 | 15 | 0.261667 | 0.312679 | 0.574346
 | 26 | 24 | 0.247974 | 0.284452 | 0.532426
R. microplus and R. annulatus | 3 | 21 | 0.156709 | 0.399208 | 0.555917
 | 8 | 1 | 0.693408 | 0.077893 | 0.771301
 | 8 | 7 | 0.336613 | 0.24152 | 0.578133
 | 10 | 13 | 0.392675 | 0.357828 | 0.750503
 | 13 | 11 | 0.387093 | 0.116816 | 0.503909
 | 14 | 12 | 0.598474 | 0.300587 | 0.899061
 | 16 | 7 | 0.549427 | 0.449105 | 0.998532
 | 16 | 8 | 0.254689 | 0.337031 | 0.59172
 | 18 | 30 | 0.180302 | 0.428639 | 0.608941
 | 21 | 16 | 0.205304 | 0.324946 | 0.53025
 | 21 | 20 | 0.499826 | 0.480096 | 0.979922
 | 23 | 11 | 0.556776 | 0.271839 | 0.828615
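The coevolution criterion stated in the Appendix 8 header (P>0.5 in any of the three probability columns) can be applied mechanically. The sketch below is an illustration rather than the analysis pipeline used in this thesis, and it uses only the R. annulatus rows. It also checks one observation made from the tabulated values, not from any statement in the source: in these rows the third probability equals the sum of the first two. The names `ROWS`, `is_coevolving` and `joint_is_sum` are hypothetical.

```python
import math

# R. annulatus rows from Appendix 8:
# (site 1, site 2, P(site 2 dependent on site 1),
#  P(site 1 dependent on site 2), P(both conditionally dependent)).
ROWS = [
    (2, 19, 0.301733, 0.415099, 0.716832),
    (3, 8, 0.309229, 0.288939, 0.598168),
    (8, 7, 0.55459, 0.273822, 0.828412),
    (8, 14, 0.48747, 0.108429, 0.595899),
    (16, 21, 0.146289, 0.514506, 0.660795),
    (20, 21, 0.437361, 0.395522, 0.832883),
]


def is_coevolving(p_2_on_1, p_1_on_2, p_joint, threshold=0.5):
    """Apply the header criterion: any of the three probabilities above the threshold."""
    return max(p_2_on_1, p_1_on_2, p_joint) > threshold


def joint_is_sum(p_2_on_1, p_1_on_2, p_joint, tol=5e-6):
    """Check the observed pattern that the third column is the sum of the first two."""
    return math.isclose(p_2_on_1 + p_1_on_2, p_joint, abs_tol=tol)


for site1, site2, *probs in ROWS:
    print(site1, site2, is_coevolving(*probs), joint_is_sum(*probs))
```

In this subset every pair passes the criterion, but only two pairs (8 and 7, 16 and 21) exceed 0.5 in a single direction; the remaining pairs qualify through the third column alone.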

Note to Appendices: Appendices 4 and 7 are provided in electronic format on the CD attached to this thesis.
