The Complete Genome Sequence and Analysis of the Human Pathogen Campylobacter lari
FOODBORNE PATHOGENS AND DISEASE
Volume 5, Number 4, 2008
© Mary Ann Liebert, Inc.
DOI: 10.1089/fpd.2008.0101

William G. Miller,1 Guilin Wang,1 Tim T. Binnewies,2 and Craig T. Parker1

1 Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, California.
2 Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark.

Abstract

Campylobacter lari is a member of the epsilon subdivision of the Proteobacteria and is part of the thermotolerant Campylobacter group, a clade that includes the human pathogen C. jejuni. Here we present the complete genome sequence of the human clinical isolate, C. lari RM2100. The genome of strain RM2100 is 1.53 Mb and includes the 46 kb megaplasmid pCL2100. Also present within the strain RM2100 genome is a 36 kb putative prophage, termed CLIE1, which is similar to CJIE4, a putative prophage present within the C. jejuni RM1221 genome. Nearly all (90%) of the gene content in strain RM2100 is similar to genes present in the genomes of other characterized thermotolerant campylobacters. However, several genes involved in amino acid biosynthesis and energy metabolism, identified previously in other Campylobacter genomes, are absent from the C. lari RM2100 genome. Therefore, C. lari RM2100 is predicted to be multiply auxotrophic, unable to synthesize eight different amino acids, acetyl-CoA, and pantothenate. Additionally, strain RM2100 does not contain a complete TCA cycle and is missing the CydAB terminal oxidase of the respiratory chain. Defects in the amino acid biosynthetic pathways in this organism could be potentially compensated by the large number of encoded peptidases. Nevertheless, the apparent absence of certain key enzymatic functions in strain RM2100 would be expected to have an impact on C. lari biology. It is also possible that the reduction in the C. lari metabolic machinery is related to its environmental range and host preference.

Introduction

Campylobacter lari is a member of the thermotolerant clade of the Gram-negative Epsilonproteobacteria. Other members of the phylogenetically related, thermotolerant Campylobacter group include C. jejuni, C. coli, C. upsaliensis, C. helveticus, and C. insulaenigrae. The thermotolerant campylobacters are characterized by the ability of strains within each species to grow at 42°C; optimal growth of these species is supported under microaerobic conditions where the [O2] is <5%.

Most strains of C. lari (formerly C. laridis) were isolated originally from gulls (Skirrow and Benjamin, 1980; Benjamin et al., 1983). However, C. lari strains have also been isolated from cattle (Giacoboni et al., 1993; Aarestrup et al., 1997), dogs (Engvall et al., 2003), pigs (Harvey et al., 1999; Lindblom et al., 1990), poultry (Tresierra-Ayala et al., 1994), and shellfish (Endtz et al., 1997; Van Doorn et al., 1998), as well as water sources (Obiri-Danso and Jones, 1999a; Obiri-Danso et al., 2001). Although associated with birds and food animals, C. lari is isolated infrequently from animal or processed food sources. However, C. lari is detected at moderate to high levels in shellfish; in a study by Endtz et al. (1997), nearly all of the Campylobacter strains isolated from shellfish (37/39) were identified as C. lari. The presence of relatively high levels of C. lari in both seawater and shorebirds (Glunder and Petermann, 1989; Obiri-Danso and Jones, 1999b; Obiri-Danso et al., 2001; Moore et al., 2002; Waldenstrom et al., 2007) suggests that the preferred environmental niche for this organism is the marine shoreline, and would presumably explain the prevalence of C. lari in shellfish. Also consistent with a marine environmental range, C. lari is more halotolerant than other thermotolerant campylobacters (Skirrow and Benjamin, 1980; Smibert, 1984). Additionally, the phylogenetically related species C. insulaenigrae is isolated predominantly from marine mammals (Stoddard, 2005; Stoddard et al., 2007); thus, C. lari and C. insulaenigrae may be members of a group of marine-adapted campylobacters.

Unlike C. jejuni, C. lari is isolated relatively infrequently from clinical samples and only one outbreak has been attributed to this organism (Broczyk et al., 1987). Nevertheless, the clinical symptomatology of C. lari-related campylobacterioses is similar to that of C. jejuni, namely gastroenteritis with abdominal pain, fever, and diarrhea (Broczyk et al., 1987; Lin et al., 1998; Prasad et al., 2001; Otasevic et al., 2004). Campylobacter lari has been associated also with bacteremia in immunocompromised (Nachamkin et al., 1984; Martinot et al., 2001) or otherwise debilitated patients (Morris et al., 1998) and in patients with gastroenteritis (Soderstrom et al., 1991; Skirrow et al., 1993).

Campylobacter lari strains are phenotypically and genotypically diverse and have been subdivided into four major phenotypic groups (Endtz et al., 1997; On and Harrington, 2000; Duim et al., 2004): the nalidixic acid-resistant thermophilic campylobacters (NARTC), the urease-producing thermophilic campylobacters (UPTC), the nalidixic acid-susceptible (NASC) group, and the urease-producing nalidixic acid-susceptible group. Campylobacter lari RM2100 (ATCC-BAA 1060; CDC strain D67, "case 6" [Tauxe et al., 1985]) is a member of the NARTC C. lari subgroup and was isolated from an 8-month-old girl with watery diarrhea. Strain RM2100 (multilocus sequence typing [MLST] ST-3 [Miller et al., 2005a]) is also very closely related phylogenetically to the C. lari type strain CCUG 23947 (NARTC; MLST ST-4 [Miller et al., 2005a]), which was isolated from a gull. Preliminary data on the draft RM2100 genome sequence was presented previously (Fouts et al., 2005); however, many features present in this strain could not be addressed conclusively due to the incomplete nature of the genome sequence.

Campylobacter lari RM2100 was sequenced initially because of both the clinical origin of this isolate and the classification of C. lari as an "emerging" Campylobacter species capable of causing disease in humans. Additionally, solution of the strain RM2100 genome would hopefully provide new insights into several aspects of Campylobacter biology, such as colonization-host adaptation, virulence, and synthesis of surface structures (e.g., capsule). This study presents the completed genomic sequence of the human clinical isolate RM2100. The genomic data revealed that this strain contains defects in multiple metabolic pathways that may pertain to C. lari biology.

Materials and Methods

Growth conditions and chemicals

Campylobacter lari RM2100 was cultured at 37°C on brain heart infusion agar (Becton Dickinson, Sparks, MD) amended with 5% (v/v) laked horse blood (Hema Resource & Supply, Aurora, OR). The incubation atmosphere was 5% H2, 10% CO2, and 85% N2. Polymerase chain reaction (PCR) enzymes and reagents were purchased from New England Biolabs (Beverly, MA) or Epicentre (Madison, WI). All chemicals were purchased from Sigma-Aldrich Chemicals (St. Louis, MO) or Fisher Scientific (Pittsburgh, PA). DNA sequencing chemicals and capillaries were purchased from Applied Biosystems (Foster City, CA).

Polymerase chain reactions

C. lari genomic DNA was prepared as described previously (Miller et al., 2005b). Standard amplifications were performed on a Tetrad thermocycler (Bio-Rad, Hercules, CA) with the following settings: 30 seconds at 94°C; 30 seconds at 53°C; 2 minutes at 72°C (30 cycles). Each amplification mixture contained 50 ng genomic DNA, 1× PCR buffer (Epicentre), 1× PCR enhancer (Epicentre), 2.5 mM MgCl2, 250 µM each dNTP, 50 pmol each primer, and 1 U polymerase (New England Biolabs). Amplicons were purified on a BioRobot 8000 Workstation (Qiagen, Santa Clarita, CA). Sequencing and PCR oligonucleotides were purchased from Qiagen.

DNA sequencing

Cycle sequencing reactions were performed on a 96-well Tetrad thermocycler (Bio-Rad) using the ABI PRISM BigDye terminator cycle sequencing kit (version 3.1) and standard protocols. Amplicon extension products were purified using DyeEx 96 well plates (Qiagen) according to the manufacturer's protocols. DNA sequencing was performed on an ABI PRISM 3130XL Genetic Analyzer (Applied Biosystems) using the POP-7 polymer and ABI PRISM Genetic Analyzer Data Collection and ABI PRISM Genetic Analyzer Sequencing Analysis software.

[...] manually. The final assembly contained 29,574 reads. The final assembly also contained contiguous sequences (>2× coverage/nt) on both strands for an average coverage of 9.6×; ambiguous bases were resequenced at least twice.

Genome analysis

Putative coding sequences were predicted using ORFscan (Rational Genomics, South San Francisco, CA) with the parameters set to include all three potential start codons (ATG, GTG, and TTG) and a coding sequence (CDS) cut-off of 50 amino acids. Spurious CDSs (e.g., CDSs contained within larger CDSs) were deleted manually. Initial annotation was accomplished by comparing the predicted proteins to the NCBI nonredundant (nr) database using BLASTP. The list of putative CDSs was then used to create a preliminary GenBank-formatted (.gbk) file that was entered into Artemis (release 9.0; http://www.sanger.ac.uk/Software/Artemis/ [Rutherford et al., 2000]). Annotation within Artemis included the fusion of split CDSs into pseudogenes