Supporting Information

Pope et al. 10.1073/pnas.1005297107
Pope et al. www.pnas.org/cgi/content/short/1005297107

SI Materials and Methods

Cell Dissociation and DNA Extraction. Before cell dissociation and DNA extraction, subsamples of the digesta samples were pooled; the pooled samples are hereafter referred to as T1 (November 2006) and T2 (May 2007). To desorb and recover those microbes adherent to plant biomass, 5 to 10 g of the pooled samples was centrifuged at 12,000 × g for 2 min, and the pellet was resuspended in 15 mL dissociation buffer [0.1% Tween 80, 1% methanol and 1% tertiary butanol (vol/vol), pH 2] (1). This mixture was vortexed for 30 s and centrifuged at low speed for 20 s to sediment the large plant particles; this process was repeated two to three more times, with the supernatants collected and pooled. Microbial biomass was collected by centrifugation at 12,000 × g for 5 min and the cell pellets were resuspended in 1 mL of sterile 10 mM Tris-HCl (pH 8.0) 1 M NaCl, then subjected to one final low-speed centrifugation for 20 s to remove any residual particulate matter. The cells were finally harvested by centrifugation at 12,000 × g for 5 min.

The cell pellets (∼200 mg wet-weight) were resuspended in 700 μL TE buffer and incubated at 75 °C for 10 min to inactivate nucleases. Cell lysis was performed by adding lysozyme (1 mg/mL)/mutanolysin (20 U) and achromopeptidase (1 mg/mL) to these cell suspensions and incubating at 37 °C for 90 min. Next, SDS was added to give a final concentration of 1% (wt/vol), 0.20 mg proteinase K was also added, and the mixture was incubated at 55 °C for 90 min. Next, NaCl and CTAB were added to give final concentrations of 0.7 M and 2% (wt/vol) respectively, and the mixture was incubated at 70 °C for 10 min. Following phenol:chloroform:isoamyl alcohol and chloroform extractions, the DNA was precipitated with two volumes of 95% ethanol, washed with 70% ethanol, and the pellet was air-dried and resuspended in TE buffer (pH 8.0) at a final concentration of ∼0.5 μg/μL.

16S rRNA Gene PCR Clone Libraries. Two rrs clone libraries were prepared from the pooled metagenomic DNA samples by using two different primer pairs broadly targeting the bacterial domain: 27F (5′-AGA GTT TGA TCC TGG CTC AG-3′) and 1492R (5′-GGT TAC CTT GTT ACG ACT T-3′); and GM3 (5′-AGA GTT TGA TCM TGG C-3′) and GM4 (5′-TAC CTT GTT ACG ACT T-3′) (2). The PCR amplicons were cloned into the vector pCR4-TOPO (Invitrogen Corp.) and plated onto Carbenicillin-containing (150 μg/mL) LB agar plates, and two 384-well microtitre plates were picked. The propagated clones were end-sequenced using vector-based primers, and the sequence reads were trimmed, assembled, and quality-checked using the genelib software package (E. Kirton; Joint Genome Institute, Walnut Creek, CA). Thirty-one putative chimeras were identified using Bellerophon (3) and Chimera Check (4) and excluded from the dataset. A total of 663 near-complete bacterial rrs gene sequences passed the quality and chimera filters and were used in the subsequent analyses.

Metagenome Processing: Shotgun Library Preparation, Sequencing, and Assembly. Shotgun libraries from the Tammar genomic DNA were prepared from each of the pooled samples T1 and T2: a 2- to 4-kb insert library cloned into pUC18 and a roughly 36-kb insert fosmid library cloned in pCC1Fos (Epicentre Corp.). Libraries were sequenced with BigDye Terminators v3.1 and resolved with ABI PRISM 3730 (ABI) sequencers. A total of 121,728 reads (60,672 from T1 and 61,056 from T2) comprising 87.12 megabases (Mb) of phred Q20 sequence were generated from the small insert library. Sequence reads from T1 and T2 were trimmed with LUCY v. 1.19p (5), resulting in a total of 106,913 reads comprising 82.7 Mb, then pooled together and assembled with the Paracel Genome Assembler (PGA version 2.62, www.paracel.com). The resulting pooled assembly consisted of 12,664 contigs, of which the longest was 27.9 kbp long and contained 237 reads (average read depth 7.9×). Approximately 38% of the reads remained as singlets.

Full Fosmid Sequencing and Assembly. Based on a number of functional and hybridization-based screens, 98 fosmids were chosen for sequencing. The individual fosmids were induced to increase their copy number following Epicentre protocols, and the fosmid DNA was purified using Qiagen MiniPrep columns. Equimolar amounts of the fosmids were pooled together (∼20 μg total DNA) and both a 3-kb paired-end library and a 454 standard shotgun library were constructed. Both libraries were directly sequenced with the 454 Life Sciences Genome Sequencer GS FLX and the libraries produced ∼700 Mbp of data with an average read length of 375 bp. Duplicate removal and splitting of paired reads reduced the dataset to 560 Mbp in 2,077,631 reads. The Newbler assembly tool was applied to these data and 33 of the fosmid inserts were completely assembled, another 39 fosmid inserts were reconstructed from two or more contigs linked via paired-end reads, and 26 inserts were partially sequenced. In total, 2.5 Mb of metagenomic DNA sequence was assembled and manually edited from the 98 fosmids selected for sequencing.

Binning. MEGAN was used to determine the phylogenetic distribution of the first batch of 30,000 Sanger reads generated by the CSP program. BLASTX was used to compare all reads against the NCBI-NR (“non-redundant”) protein database. Results of the BLASTX search were subsequently uploaded into MEGAN (6) for hierarchical tree constructions, which uses the BLAST bit-score to assign , as opposed to using percentage identity.

Assembled metagenomic contigs were binned (classified) using PhyloPythia (7). Generic models for the ranks of , phylum, and class were combined with sample-specific models for the clades “uncultured γ-Proteobacteriacea bacterium” (WG-1), “uncultured bacterium” (WG-2), and “uncultured Erysipelotrichaceae bacterium” (WG-3). The generic models represent all clades covered by two or more species at the corresponding ranks among the sequenced microbial isolates. The sample-specific models include classes for the dominant sample populations of WG-1, WG-2, and WG-3, as well as a class “Other.” The sample-specific models for WG-1, WG-2, and WG-3 were each trained on sequence data obtained from contigs assembled from the metagenome, and fully sequenced fosmids identified using phylogenetic marker genes. Five sample-specific support vector machines were created by using fragments of lengths of 3, 5, 10, 15, and 50 kb. All input sequences were extended by their reverse complement before computation of the compositional feature vectors. The parameters w and l were both set to 5 for the sample-specific models. Thirty-three assembled contigs and one sequenced fosmid, assigned unambiguously through analysis of phylogenetic marker genes, were used for the training of the WG-1 sample-specific model (a total of 388,452 bp). Sixteen assembled contigs and four sequenced fosmids, assigned unambiguously through analysis of phylogenetic marker genes, were used for the training of the WG-2 sample-specific model (a total of 208,429 bp). Thirty assembled contigs, assigned unambiguously through analysis of phylogenetic marker genes, were used for the training of the WG-3 sample-specific model (a total of 118,249 bp). Input fragments of a particular length were generated from the fosmids by using a sliding window with a step size of one-tenth of the generated fragment size (for example, 5 kb for 50-kb fragments). For the class “Other,” fragments from 340 sequenced isolates were used. The classifier consisting of the sample-specific and generic clade models was then applied to assign all fragments of more than 1 kb in the sample. In case of conflicting assignments, preference was given to assignments of the sample-specific models. Results of this binning process were loaded into IMG/M-ER to allow independent analysis of the component populations. The resulting metabolic reconstruction for WG-2 is summarized in Fig. S4, with IMG gene object identifiers (oids) for tracking in IMG/M provided in Table S5.
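The fragment generation and compositional features described under Binning can be illustrated with a minimal Python sketch. It is not the PhyloPythia implementation. The function names are hypothetical, and it assumes that w = l = 5 corresponds to counting contiguous 5-mers, which the text does not state. Counting both strands separately is used here as a stand-in for extending each sequence by its reverse complement, so that no k-mer spans an artificial junction.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sliding_fragments(seq, size):
    """Cut seq into windows of `size` bases, stepping by one-tenth of `size`.

    Example from the text: 50-kb fragments use a 5-kb step.
    Sequences shorter than `size` yield no fragments.
    """
    step = max(1, size // 10)
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def kmer_profile(fragment, k=5):
    """Normalized k-mer frequencies over both strands (hypothetical feature vector).

    Assumption: contiguous k-mers with k = 5 stand in for w = l = 5.
    k-mers containing characters other than A, C, G, T are skipped.
    """
    counts = Counter()
    forward = fragment.upper()
    for strand in (forward, reverse_complement(forward)):
        for i in range(len(strand) - k + 1):
            kmer = strand[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    # Fixed lexicographic order so every fragment maps to the same vector layout.
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Under these assumptions, each training contig or fosmid would be passed through `sliding_fragments` once per fragment length (3, 5, 10, 15, and 50 kb), and each fragment's `kmer_profile` would serve as the input vector for the corresponding support vector machine.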

1. Kang S, Denman SE, Morrison M, Yu Z, McSweeney CS (2009) An efficient RNA extraction method for estimating gut microbial diversity by polymerase chain reaction. Curr Microbiol 58:464–471.
2. Warnecke F, et al. (2007) Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite. Nature 450:560–565.
3. Huber T, Faulkner G, Hugenholtz P (2004) Bellerophon: A program to detect chimeric sequences in multiple sequence alignments. Bioinformatics 20:2317–2319.
4. Cole JR, et al.; Ribosomal Database Project (2003) The Ribosomal Database Project (RDP-II): Previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res 31:442–443.
5. Chou H-H, Holmes MH (2001) DNA sequence quality trimming and vector removal. Bioinformatics 17:1093–1104.
6. Huson DH, Auch AF, Qi J, Schuster SC (2007) MEGAN analysis of metagenomic data. Genome Res 17:377–386.
7. McHardy AC, Martín HG, Tsirigos A, Hugenholtz P, Rigoutsos I (2007) Accurate phylogenetic classification of variable-length DNA fragments. Nat Methods 4:63–72.

[Fig. S1 panels. (A) PD rarefaction: y axis, total branch length (0 to 15); x axis, sequences per sample (0 to 700). (B) OTU rarefaction: y axis, number of OTUs (0 to 300); x axis, sequences per sample (0 to 700).]

Sample        Simpson   Observed_species   Chao1
Rumen_FA_8    0.990     265                521.094
Rumen_FA_64   0.990     258                523.863
Rumen_FA_71   0.967     182                430.953
Rumen_PL      0.940     258                567.479
Tammar        0.987     206                496.055
Termite_PL3   0.785     71                 104.394

Fig. S1. Rarefaction analyses using phylogenetic diversity (PD) (A) and operational taxonomic unit (OTU) frequency (B) of rrs gene datasets obtained from the Tammar wallaby foregut, bovine rumen, and termite lumen microbiomes. In both panels, a 97% sequence identity threshold has been employed for the OTU constructions used in these analyses. (A) A “PD_whole tree” has been employed using QIIME, and the PD value on the y axis represents the summation of the branch lengths from the phylogenetic trees constructed from the rrs gene sequences (Rumen_FA_8, dark blue; Rumen_PL, light blue; Rumen_FA_64, green; Tammar, purple; Rumen_FA_71, red; and Termite_PL3, yellow). The α diversity measures (Simpson, observed species, and Chao1 richness estimates) are also shown for each sample. (B) The same data as in A using the number of OTUs observed with each dataset as the y axis.
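The α diversity measures and the OTU rarefaction in Fig. S1 can be sketched in a few lines of Python. This is an illustration only and not the QIIME code used for the figure. It assumes Simpson's index is reported as 1 minus the sum of squared OTU proportions and that Chao1 uses the bias-corrected form; both are assumptions, since the caption does not give the formulas. The function names and the `counts` input (one sequence count per OTU) are hypothetical.

```python
import random


def simpson(counts):
    """Simpson diversity as 1 - sum(p_i^2) over OTU proportions (assumed form)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed form).

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the numbers
    of OTUs observed exactly once and exactly twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def rarefy(counts, depth, seed=0):
    """Number of OTUs seen in a random subsample of `depth` sequences.

    Sequences are drawn without replacement; repeating this over a range of
    depths (and seeds) gives one OTU rarefaction curve.
    """
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    rng = random.Random(seed)
    return len(set(rng.sample(pool, depth)))
```

For a toy community with OTU counts `[5, 3, 1, 1, 2]`, the two singletons and one doubleton raise the Chao1 estimate above the five observed OTUs, which mirrors how the Chao1 values in the table exceed the observed species counts for every sample.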

[Fig. S2 panels. (A) Phylogram of rrs sequences; labelled groups include Lachnospiraceae (345), Mollicutes (71), Staphylococcaceae (4), Acidaminococcaceae (36), Moorella, gut clone group (16), Mogibacterium (5), Bacteroidales (142), Succinivibrionaceae (77), Aeromonadaceae, Actinobacillus (3), Moraxellaceae (2), and TM7 (2), with individual clones SHTP737, SHTP728, SGYA398, and C10661; outgroup Methanobrevibacter gottschalkii str. PG, U55239.1; scale bar 0.10. (B) MEGAN taxonomy tree rooted at "root" and "cellular organisms"; labelled taxa include Proteobacteria (Alpha-, Beta-, and Gammaproteobacteria, delta/epsilon subdivisions), Flavobacteria, Bacteroidales (Prevotellaceae, Bacteroides, Porphyromonadaceae, Rikenellaceae), Chlorobi, Spirochaetes, Firmicutes (Bacilli, Clostridiales, Halanaerobiales, Thermoanaerobacteriales, Mollicutes), Deinococcus-Thermus, Fusobacteria, candidate divisions TM7 and WWE1, Euryarchaeota (Methanobacteria, Methanomicrobia, Methanococci), Eukaryota, Viruses, "Not assigned," and "No hits."]

Fig. S2. Composition of the bacterial community analyzed. (A) Phylogenetic diversity of the Tammar wallaby foregut rrs sequences. From a PCR-based collection and the metagenomic libraries (small-insert and fosmid), 663 almost full-length and 51 partial to full 16S rRNA gene sequences representing six different phyla were analyzed using the maximum-likelihood algorithm. The total number of sequences contained within each grouping is noted in brackets. Red dots indicate where at least one group was represented in both the PCR collection and metagenomic libraries. Blue dots indicate where at least one group was represented by a sequenced fosmid clone. The phylogram was constructed from 1,289 unambiguously aligned nucleotide positions. (Scale bar, a sequence divergence of 10%.) Branching pattern confidence values greater than 50% are shown at nodes. See Figs. S1 and S3 and Table S2 for accession numbers and detailed phylogenetic analysis. (B) Phylogenetic diversity of metagenome sequences (∼30,000) computed by MEGAN based on a BLASTX comparison. The size of the circles is scaled logarithmically to represent the number of reads assigned directly to the taxon.

[Fig. S3. Phylogram of Firmicutes rrs sequences. Labelled Tammar phylotypes include OTU-2.1 (74) and OTU-2.2 (47) (WG-2), OTU-3 (38) (WG-3), OTU-6 (7), and fosmid clone Sc34, together with individual SGYA, SHTP, and C clones; labelled reference groups include Johnsonella (3), Ruminococcus productus (0), Butyrivibrio (28), Clostridium symbiosum (3), Coprococcus (3), Sporobacter (49), Ruminococcus (13), Erysipelothrix (15), Staphylococcaceae (4), Acidaminococcaceae (37), and Mogibacterium (4), within the family-level labels Lachnospiraceae and Mollicutes; outgroup Thermus thermophilus, AY497773.1; scale bar 0.10.]

Fig. S3. Phylogenetic diversity of the Tammar wallaby foregut microbiota within the phylum Firmicutes. From a PCR-based collection and the metagenomic libraries (small-insert and fosmid), 663 almost full-length and 51 partial to full 16S rRNA gene sequences were analyzed using the maximum-likelihood algorithm (RAxML). The number of Tammar foregut community sequences within each grouping is given in brackets; dark blue shading denotes distinct Tammar foregut community phylotypes; red text denotes sequences from the metagenome libraries. Light blue dots indicate where at least one OTU was represented by a sequenced fosmid clone. White shading denotes reference groups having no representation in this collection. The phylogram was constructed from 1,289 unambiguously aligned nucleotide positions. The scale bar represents a sequence divergence of 10%. Branching pattern confidence values greater than 50% are shown at nodes.

[Fig. S4. Metabolic schematic for WG-2. Labelled elements include substrates and transporters (maltose, glucose, fructose, and trehalose via PTS and ABC systems; amino acids, monosaccharide, Co2+, and Zn2+ via ABC systems), glycoside hydrolase families acting on glucans, galactans, cellulose, cellobiose, xylans, and rhamnose (GH2, GH3, GH5, GH16, GH27, GH42, GH43, GH65, GH78, GH94), sugar-phosphate intermediates (βD-G1P, αD-G1P, G6P, F6P, R5P, S7P, E4P, G3P, D-X5P), sulfate reduction (sulfate, APS, sulfite, H2S), the route from PEP and pyruvate through AcCoA to acetate and butyrate (atob, BDH, PTB, BUK), and secretion components (type II secretion system, SecA, SecB, SecD).]

Fig. S4. Schematic representation of selected metabolic features of the WG-2 population as inferred from genomic comparisons (Table S5). Major metabolic pathways and energy-generating systems are shown. Similar shapes indicate similar functions. Broken lines indicate sections of pathways missing; partially broken lines indicate partial sections of pathway identified. Abbreviations: AcCoA, acetyl-CoA; acetyl-P, acetyl phosphate; AK, acetate kinase; APS, adenylylsulfate; atob, acetyl-CoA C-acetyltransferase; αD-G1P, α-D-glucose 1-phosphate; BDH, butyryl-CoA dehydrogenase; BUK, butyrate kinase; butanoyl-P, butanoyl phosphate; CL, citrate lyase; D-X5P, D-xylulose 5-phosphate; E4P, erythrose-4-phosphate; F6P, fructose-6-phosphate; G3P, glyceraldehyde-3-phosphate; G6P, glucose-6-phosphate; galK, galactokinase; GH, glycoside hydrolase family (CAZy); PTB, phosphate butyryltransferase; PTS, phosphotransferase system; R5P, pentose-phosphates; S7P, sedoheptulose-7-phosphate; XK, xylulokinase; xylA, xylose isomerase. Gene identification numbers (IMG gene object identifiers) can be found in Table S5.

[Fig. S5. Phylogram of GH5 sequences. Leaf labels give a gene object identifier or GI number followed by the source: Termite Gut Community, Macropus eugenii Gut Microbiome (including two sequences labelled WG-2), sequenced fosmid clones (Sc17, Sc30, Sc32, Sc36, Sc47, Sc52, Sc58, Sc65, Sc75, Sc78, Sc100, Sc104, Sc106), and reference organisms including Clostridium thermocellum ATCC 27405, Clostridium cellulolyticum H10, Clostridium cellulovorans 743B, Ruminococcus albus, Ruminococcus flavefaciens, Ruminococcus gnavus ATCC 29149, Butyrivibrio fibrisolvens, Fibrobacter succinogenes, Fibrobacter intestinalis str. DR7, Prevotella ruminicola str. 23, Prevotella bryantii str. B14, Bacteroides coprocola DSM 17136, and Bacteroides intestinalis DSM 17393; scale bar 0.50.]

Fig. S5. Phylogenetic analysis of the glycoside hydrolase family 5 (GH5) diversity encoded by the Tammar wallaby foregut microbiome. GH5 sequences from the Tammar foregut metagenome are colored in red, GH5 sequences from the sequenced fosmid clones are colored in yellow, those from the termite gut metagenome in green, and those from various other sources in black. Purple dots denote GH5 sequences recovered from putative polysaccharide utilization loci gene clusters (Table S4). Metagenomic sequences and additional public sequences are identified by their Joint Genome Institute gene object identifier and GenBank GI number, respectively. The Pfam PF00150 (Cellulase – glycosyl hydrolase family 5) has a length of 378 residues, comprising domain sequences with an average length of 227 amino acids and 17% sequence identity.

Other Supporting Information Files

Table S1 (DOC)
Table S2 (DOC)
Table S3 (PDF)
Table S4 (PDF)
Table S5 (PDF)
