
Supporting Information
Yan et al. 10.1073/pnas.0801093105
www.pnas.org/cgi/content/short/0801093105

SI Methods

Genome Assembly. The initial sequence assembly was carried out by using the Phred/Phrap package (www.phrap.org) (1, 2). Gaps were closed using a combination of primer walking and multiplex PCR. The Consed program (3) was used to facilitate sequence finishing. The overall sequence quality of the genome was further improved by checking the following criteria for each base: (i) coverage by at least two independent valid reads from both strands and (ii) a final consensus quality score generated by Phrap of >40.

Genome Annotation. Coding sequences (CDS) were predicted with the program Glimmer (4), and the locations of start codons were corrected by RBSfinder (5). Overlapping and closely clustered CDS were manually inspected. The functional annotation of the predicted CDS was carried out by a BLASTP (6) search of the translations versus GenBank's nonredundant protein database (NR) and manual curation. Transfer RNA genes were identified by the program tRNAscan-SE (7). Ribosomal RNAs were identified by a BLASTN (6) search against a database of all publicly available rRNA sequences.

Repeat sequences (RPs) were primarily detected by Repseek (8). After manual curation, a nonredundant set of potential RPs was given to RepeatMasker (www.repeatmasker.org) to identify all intact or partial copies within the A1501 genome. The PhageFinder program (9) was used to identify intact or nearly intact prophage regions in the genome. The IslandPath software (10) was first used to screen the A1501 genome for potential anomalous gene clusters, and genomic islands were then identified manually based on general characteristics of typical genomic islands.

Comparative Genomics. Genomic comparisons were carried out by bidirectional BLASTP comparisons of whole-genome protein databases. Ortholog groups among P. stutzeri A1501 and five other Pseudomonas species, P. aeruginosa PAO1 (AE004091), P. putida KT2440 (AE015451), P. entomophila L48 (CT573326), P. fluorescens Pf-5 (CP000076), and P. syringae pv. tomato DC3000 (AE016853, AE016854, AE016855), were identified by the OrthoMCL program (11) with an e value cutoff of 10⁻⁵.

Phylogenetic Analysis. Twelve conserved housekeeping proteins were used for whole-genome-based phylogenetic analysis of Pseudomonas (Fig. S2). The total length of the alignment was 7,003 aa after removing ambiguously aligned regions. Protein sequence alignment was carried out for each individual protein, using ClustalW. The multialignments were then manually checked and trimmed with BioEdit. The dataset with the concatenation of the 12 proteins was fed to the TREE-PUZZLE software to construct the maximum likelihood (ML) tree using the JTT model of amino acid substitution. The rate heterogeneity was estimated by a gamma distribution with eight rate categories, and the alpha parameter was estimated from the dataset. Reliability of the dataset was assessed by bootstrap. One thousand permutation datasets were generated by using the SEQBOOT program from the PHYLIP package. For each of the 1,000 datasets, an ML tree was constructed using the same parameters described above. TREE-PUZZLE was then used with the "Consensus of user defined trees" option to generate a consensus tree.

Microarray Fabrication. PCR fragments used for printing microarray chips were amplified as described in ref. 12 using oligonucleotides designed from the A1501 genome sequence. Each fragment was ≈200–500 bp in length. The microarray featured 3,988 of the 4,146 CDS identified in P. stutzeri A1501, excluding multiple copies of transposases, unamplified genes, low-concentration products, and unpurified products (158 CDSs in total).

PCR products were analyzed on agarose gels to confirm the success of each reaction and subsequently purified by using MultiScreen-PCR plates (Millipore). DNA was then resuspended in 12 µl of spotting solution containing 50% dimethyl sulfoxide. A set of microarrays containing a total of 4,352 spots, comprising 3,988 PCR products and 364 controls in each block (blank, negative, and positive controls), was spotted onto CMT-GAPSII-coated slides (Corning), using a 16-pin configured MicroGrid II array printer (BioRobotics) controlled by the MicroGrid TAS Application Suite, Version 2.2.0.6. Negative control genes consisted of the type III secretion system genes of Shigella flexneri 2a strain 301 (GenBank accession nos. AE005674 and AF386526) and the commercial Arabidopsis genes (SpotReport cDNA Array Validation System, Stratagene). The positive control gene was 16S rRNA of P. stutzeri A1501. Spotted DNA was UV cross-linked to slides, using a CL-1000 UV cross-linker, and subsequently placed in a blocking solution containing 200 mM succinic anhydride and 50 mM N-methylimidazole prepared in 1-methyl-2-pyrrolidinone for 60 min, washed for 2 min in 95°C distilled water, and rinsed five times in 95% ethanol. Slides were spin dried at 185 × g for 1 min and stored for future hybridizations.

cDNA Synthesis, Labeling, and Hybridization. Cells from an overnight culture were centrifuged and resuspended in a 50-ml flask containing 10 ml of N-free minimal medium K at an OD600 of 0.1. The suspensions were incubated for 6 h at 30°C under nitrogen fixation conditions (0.5% O2, 0.1 mM NH4+) or nitrogen excess growth conditions (0.5% O2 and 20 mM NH4+), respectively. Total RNA was extracted using the Promega SV Total RNA isolation system, and then 5 µg of total RNA was used for cDNA synthesis. Genomic DNA and cDNA were fluorescently labeled by using the BioPrime DNA Labeling System (Life Technologies) with random hexamers. Cy5 dye-labeled cDNA and Cy3-labeled genomic DNA samples were mixed and hybridized at 65°C for 16 h. Genomic DNA was used as a universal internal control for the quality of the microarrays, which allowed for the comparison of results across multiple experiments. Biological experiments were carried out three times, which provided three biological repeats.

Microarray Data Analysis. Processed slides were scanned with a GenePix 4000B scanner (Axon). Fluorescent spots and local background intensities were quantified with GenePix Pro 6.0 software (Axon). Before data analysis, signals were normalized using a locally weighted scatterplot smoothing regression (LOWESS) algorithm in the MIDAS software package (www.tigr.org/software/tm4) with the smoothing parameter set to 0.33. The local background value was subtracted from the intensity of each spot. The mean of the signal intensities of the control spots hybridized with labeled reference genomic DNA in each experiment was calculated. The mean Cy5/Cy3 (sample/reference) ratios of signal intensity were calculated for analysis. Genes were considered to be differentially expressed if (i) average expression changed by at least 2-fold in three independent experiments or (ii) the change in gene expression was in the same direction ("increased" or "decreased") in three experiments.

1. Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8:175–185.
2. Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8:186–194.
3. Gordon D, Abajian C, Green P (1998) Consed: A graphical tool for sequence finishing. Genome Res 8:195–202.
4. Delcher AL, Bratke KA, Powers EC, Salzberg SL (2007) Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673–679.
5. Suzek BE, Ermolaeva MD, Schreiber M, Salzberg SL (2001) A probabilistic method for identifying start codons in bacterial genomes. Bioinformatics 17:1123–1130.
6. Altschul SF, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.
7. Lowe TM, Eddy SR (1997) tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955–964.
8. Achaz G, Boyer F, Rocha EP, Viari A, Coissac E (2007) Repseek, a tool to retrieve approximate repeats from large DNA sequences. Bioinformatics 23:119–121.
9. Fouts DE (2006) PhageFinder: Automated identification and classification of prophage regions in complete bacterial genome sequences. Nucleic Acids Res 34:5839–5851.
10. Hsiao W, Wan I, Jones SJ, Brinkman FS (2003) IslandPath: Aiding detection of genomic islands in prokaryotes. Bioinformatics 19:418–420.
11. Li L, Stoeckert CJ, Jr, Roos DS (2003) OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Res 13:2178–2189.
12. Peng J, et al. (2006) The use of comparative genomic hybridization to characterize genome dynamics and diversity among the serotypes of Shigella. BMC Genomics 7:218.

Fig. S1. Pairwise genome comparisons of P. stutzeri A1501 with five other representative Pseudomonas species in dot-matrix style. Orthologs between A1501 (always shown on the x axis) and the five other Pseudomonas species (shown on the y axis) are marked as dots and color-coded for similarity. Red, >80%; blue, 50–80%; cyan, <50%. The start of the KT2440 genome was adjusted to make it begin with dnaA in accordance with the others.

Fig. S2. Maximum likelihood (ML) tree of sequenced Pseudomonas species based on the combined dataset of 12 housekeeping proteins. E. coli K-12 MG1655 is used as the outgroup. Numbers along the branches show the support values of nodes estimated by 1,000 bootstrap replicates.
The 12 housekeeping proteins used for phylogenetic analysis are aspartyl-tRNA synthetase (AspS), ATP synthase subunit B (AtpD), carbamoyl-phosphate synthase small subunit (CarA), glycogen phosphorylase (GlgP), DNA gyrase subunit B (GyrB), uroporphyrinogen decarboxylase (HemE), amidophosphoribosyltransferase (PurF), recombinase A (RecA), DNA-directed RNA polymerase beta subunit (RpoB), RNA polymerase sigma factor (RpoD), seryl-tRNA synthetase (SerS), and dihydrolipoamide acetyltransferase (SucB).
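
The two per-base finishing criteria given under Genome Assembly can be illustrated with a minimal Python sketch. It assumes that the Phrap consensus score follows the Phred convention Q = -10 log10(P), so that a score above 40 corresponds to an estimated error probability below 1 in 10,000. The function names and the reading of criterion (i) as "at least two reads in total, with both strands represented" are illustrative assumptions, not the authors' pipeline.

```python
def phred_error_probability(quality):
    """Estimated error probability for a Phred-scaled quality score."""
    return 10 ** (-quality / 10)


def base_meets_criteria(forward_reads, reverse_reads, consensus_quality,
                        min_reads=2, min_quality=40):
    """Check one consensus base against the two finishing criteria.

    Assumption: criterion (i) is read as at least `min_reads` valid reads
    in total with at least one read on each strand; criterion (ii) is a
    consensus quality strictly greater than `min_quality`.
    """
    enough_reads = (forward_reads + reverse_reads) >= min_reads
    both_strands = forward_reads >= 1 and reverse_reads >= 1
    good_quality = consensus_quality > min_quality
    return enough_reads and both_strands and good_quality
```

For example, `base_meets_criteria(3, 1, 52)` passes both checks, whereas a base covered only on the forward strand, or one with a consensus score of exactly 40, would be flagged for further finishing.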
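
The bidirectional BLASTP comparison described under Comparative Genomics can be approximated by a reciprocal-best-hit filter, sketched below in Python. This is a simplified stand-in and not the OrthoMCL procedure used for the ortholog groups. The input tuples (query, subject, e value, bit score), the gene identifiers, and the choice of bit score to rank hits are all assumptions made for illustration; only the 10⁻⁵ e value cutoff comes from the text.

```python
def best_hits(hits, max_evalue=1e-5):
    """Map each query to its highest-scoring subject below the e value cutoff.

    `hits` is an iterable of (query, subject, evalue, bitscore) tuples,
    e.g. parsed from tabular BLASTP output (hypothetical input format).
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # discard hits weaker than the cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-5):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    backward = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)
```

A pair is kept only when the best hit of protein a in genome B is protein b and the best hit of b in genome A is a again, which is a common first approximation of orthology between two genomes.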
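
The ratio calculation and the two differential-expression criteria given under Microarray Data Analysis can be sketched in Python as follows. The text does not state how the two growth conditions were compared, so the sketch assumes that, for each biological repeat, the Cy5/Cy3 ratio under one condition is divided by the ratio under the other, and that the average is an arithmetic mean. Function names are hypothetical, and LOWESS normalization is omitted.

```python
from statistics import mean


def spot_ratio(cy5, cy5_background, cy3, cy3_background):
    """Background-subtracted Cy5/Cy3 (sample/reference) ratio for one spot.

    Returns None when either channel is not above its local background.
    """
    sample = cy5 - cy5_background
    reference = cy3 - cy3_background
    if sample <= 0 or reference <= 0:
        return None
    return sample / reference


def evaluate_gene(ratios_condition_a, ratios_condition_b, fold=2.0):
    """Evaluate both criteria for one gene across paired biological repeats.

    Assumption: the expression change in each repeat is the ratio under
    condition A divided by the ratio under condition B.
    """
    changes = [a / b for a, b in zip(ratios_condition_a, ratios_condition_b)]
    average = mean(changes)
    return {
        "mean_change": average,
        # criterion (i): average change of at least `fold` in either direction
        "fold_criterion": average >= fold or average <= 1.0 / fold,
        # criterion (ii): every repeat changes in the same direction
        "same_direction": all(c > 1 for c in changes) or all(c < 1 for c in changes),
    }
```

The two criteria are returned separately so that the caller can combine them as required; the text joins them with "or".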