
Utilizing “Omics” Based Approaches to Investigate Targeted Microbial Processes

By

Vanessa Lynn Brisson

A dissertation submitted in partial satisfaction of the

requirements for the degree of

Doctor of Philosophy

in

Engineering – Civil and Environmental Engineering

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Lisa Alvarez-Cohen, Chair

Professor Kara Nelson

Professor Fiona Doyle

Spring 2015

Utilizing “Omics” Based Approaches to Investigate Targeted Microbial Processes

Copyright © 2015

By

Vanessa Lynn Brisson

Abstract

Utilizing “Omics” Based Approaches to Investigate Targeted Microbial Processes

by

Vanessa Lynn Brisson

Doctor of Philosophy in Engineering – Civil and Environmental Engineering

University of California, Berkeley

Professor Lisa Alvarez-Cohen, Chair

Metabolomic, genomic, and metagenomic analyses were used to provide insight into two different environmentally relevant microbial processes: bioleaching of rare earth elements from monazite sand and reductive dechlorination of chlorinated ethenes. Although rare earth elements are important for a variety of technologies, current extraction techniques are severely environmentally damaging. The research presented here demonstrates that some microorganisms are capable of biological leaching of rare earth elements from monazite, opening the possibility of a novel, environmentally sustainable bioleaching extraction process. Metabolomic analysis of a monazite bioleaching microorganism was used to further our understanding of the bioleaching process. Chlorinated ethenes are common groundwater contaminants with human health risks. Dehalococcoides mccartyi are the only organisms known to completely reduce chlorinated ethenes to the harmless product ethene. However, D. mccartyi dechlorinate these chemicals more effectively and grow more robustly in mixed microbial communities than in isolation. Genomic and metagenomic analyses were used to advance our understanding of D. mccartyi in a mixed microbial community and in isolation.

Successful isolation and characterization of monazite bioleaching microorganisms provided a proof of concept for monazite bioleaching as an environmentally friendly alternative to conventional extraction of rare earth elements from monazite, a rare earth phosphate mineral. Three fungal strains were found to be capable of bioleaching monazite, utilizing the mineral as a phosphate source and releasing rare earth cations into solution. These organisms include one known phosphate solubilizing fungus, Aspergillus niger ATCC 1015, as well as two newly isolated fungi: an Aspergillus terreus strain ML3-1 and a Paecilomyces spp. strain WE3-F. The rare earth elements were released in proportions similar to those present in the monazite, which was dominated by , , , and . Although monazite also contains the radioactive element thorium, bioleaching by these fungi preferentially solubilized rare earth elements over thorium, leaving the thorium in the solid residual. Adjustments in growth medium composition improved bioleaching performance measured as rare earth release. Cell-free spent medium generated during growth of A. terreus strain ML3-1 and Paecilomyces spp. strain WE3-F in the presence of monazite retained robust bioleaching capacity, indicating that compounds exogenously released by these organisms contribute substantially to leaching activity. Organic acids released by the organisms were identified and quantified. Abiotic leaching with laboratory prepared solutions of the identified organic acids was not as effective as bioleaching or leaching with cell-free spent medium at releasing rare earths from monazite, indicating that compounds other than the identified organic acids contribute to leaching performance.

Metabolomic analysis of a monazite bioleaching microorganism was performed in order to better understand the bioleaching process. Overall metabolite profiling, in combination with biomass accumulation data, identified a lag in growth phase when this organism was grown under phosphate limitation stress. Analysis of the relationships between metabolite concentrations, rare earth element solubilization levels, and bioleaching growth conditions identified several metabolites potentially associated with bioleaching. Further investigation using laboratory prepared solutions of 17 of these metabolites indicated significant leaching contributions from citric and citramalic acids only. These contributions were relatively small compared to bioleaching effectiveness of microbial supernatant, suggesting that other still unknown factors contribute to bioleaching activity. Further investigations of bioleaching supernatant using gel permeation chromatography indicated that the compounds involved in leaching form only weakly held complexes, like those of citric acid, with the solubilized rare earth elements, rather than forming more strongly held complexes.

The phylogenetic composition and gene content of a functionally stable trichloroethene degrading microbial community were examined using metagenomic sequencing and analysis. For phylogenetic classification, contiguous sequences (contigs) longer than 2,500 bp were grouped into classes according to tetranucleotide frequencies and assigned to taxa based on rRNA genes and other phylogenetic marker genes. Classes were identified for Clostridiaceae, Dehalococcoides, , Methanobacterium, Methanospirillum, as well as a Spirochaete, a Synergistete, and an unknown Deltaproteobacterium. D. mccartyi contigs were also identified based on sequence similarity to previously sequenced genomes, allowing the identification of 170 kb on contigs shorter than 2,500 bp. Examination of metagenome sequences affiliated with D. mccartyi revealed 406 genes not found in previously sequenced D. mccartyi genomes, including nine cobalamin biosynthesis genes related to corrin ring synthesis. This is the first time that a D. mccartyi strain has been found to possess genes for synthesizing this cofactor critical to reductive dechlorination. Besides D. mccartyi, several other members of this community appear to have genes for complete or near-complete cobalamin biosynthesis pathways. Seventeen genes for putative reductive dehalogenases were identified, including 11 novel ones, all associated with D. mccartyi. Genes for hydrogenase components (271 in total) were widespread, highlighting the importance of hydrogen metabolism in this community. PhyloChip microarray analysis confirmed the stability of this microbial community over time.

Bioinformatic analyses using genomic and metagenomic data were used to further advance investigations of organisms from the genus Dehalococcoides. In the first of these analyses, metagenomic sequencing data from three dechlorinating microbial communities were used to evaluate the specificity of a genus-wide microarray targeting Dehalococcoides genes from four sequenced Dehalococcoides genomes. Based on this analysis, the microarray was found to detect sequences with a minimum estimated sequence identity of 90 to 95%, remaining highly specific for the target sequences while allowing for small sequence variation. However, the microarray did not detect all genes with > 95% sequence identity, and failed to detect some genes with apparently 100% sequence identity. In the second analysis, a comparative genomics analysis was used to evaluate the prevalence of the recently reported incomplete Wood-Ljungdahl pathway of Dehalococcoides. This analysis revealed that the genetic pattern of genes associated with this incomplete pathway is unique to the Dehalococcoides genus among sequenced bacterial and archaeal genomes.


Dedication

To Gabriel, Abigail, and Madeline


Table of Contents

Abstract
Dedication
Table of Contents
List of Figures
List of Tables
Acknowledgements

Chapter 1: Introduction and Background
1.1 “Omics” techniques
1.2 Targeted microbial processes
1.2.1 Bioleaching of rare earth elements from monazite
1.2.2 Microbial reductive dehalogenation of chlorinated ethenes
1.3 Dissertation overview

Chapter 2: Bioleaching of Rare Earth Elements from Monazite Sand
2.1 Introduction
2.2 Materials and methods
2.2.1 Enrichment and isolation of rare earth element solubilizing microorganisms
2.2.2 DNA extraction, amplification, sequencing, and sequence analysis
2.2.3 Bioleaching growth conditions
2.2.4 Abiotic leaching conditions
2.2.5 Biomass measurements
2.2.6 Analytical methods
2.2.7 Statistical analysis
2.2.7.1 Analysis of biomass growth
2.2.7.2 Analysis of bioleaching performance
2.2.7.3 Analysis of proportional release of rare earth elements and thorium
2.2.7.4 Analysis of abiotic leaching with hydrochloric acid, organic acids, and spent medium
2.3 Results and discussion
2.3.1 Enrichment, isolation, and identification of bioleaching microorganisms
2.3.2 Biomass growth during bioleaching
2.3.3 Bioleaching performance under different growth conditions
2.3.4 Proportional release of rare earth elements and thorium during bioleaching
2.3.5 Organic acid production during bioleaching
2.3.6 Abiotic leaching with hydrochloric acid and organic acids
2.3.7 Abiotic leaching with spent medium from bioleaching
2.3.8 Statistical analysis results

Chapter 3: Metabolomic Analysis of a Monazite Bioleaching Fungus
3.1 Introduction
3.2 Materials and methods
3.2.1 Organism and bioleaching growth conditions
3.2.2 Quantification of rare earth elements, thorium, phosphate, glucose, pH, and biomass
3.2.3 Metabolomic analysis
3.2.4 Identification of metabolites of potential bioleaching importance
3.2.5 Abiotic leaching conditions
3.2.6 Gel permeation chromatographic separation of rare earth element complexes and free rare earth elements
3.3 Results and discussion
3.3.1 Bioleaching performance
3.3.2 Overall metabolomic profile
3.3.3 Identification of metabolites of potential bioleaching importance
3.3.3.1 Metabolites released at higher concentrations when soluble phosphate was not available
3.3.3.2 Metabolites whose concentration correlated with rare earth element concentration
3.3.3.3 High signal intensity metabolites
3.3.4 Abiotic leaching effectiveness of identified metabolites
3.3.5 Gel permeation chromatographic separation of complexed rare earth elements

Chapter 4: Metagenomic Analysis of a Functionally Stable Trichloroethene Degrading Microbial Community
4.1 Introduction
4.2 Materials and methods
4.2.1 ANAS enrichment culture and DNA sample preparation
4.2.2 Metagenome sequencing, assembly, and annotation
4.2.3 Analysis of metagenomic sequence data
4.2.3.1 Identification of Dehalococcoides contigs by sequence similarity
4.2.3.2 Classification of ANAS contigs by tetranucleotide frequencies
4.2.3.3 Comparisons to reference genomes and identification of novel Dhc genes
4.2.4 Confirmation of novel Dehalococcoides genes in Dehalococcoides isolates from ANAS
4.2.5 Trichloroethene dechlorination by Dehalococcoides isolate ANAS2 and ANAS subcultures
4.2.6 PhyloChip assessment of community composition
4.3 Results
4.3.1 ANAS metagenome overview
4.3.2 Dehalococcoides in ANAS
4.3.2.1 Identification of Dehalococcoides contigs
4.3.2.2 Metagenome coverage of Dehalococcoides genes detected by microarray
4.3.2.3 Co-assembly of sequence from distinct Dehalococcoides strains
4.3.2.4 Identification of novel Dehalococcoides genes
4.3.2.5 Trichloroethene dechlorination by Dehalococcoides isolate ANAS2 under different cobalamin conditions
4.3.3 ANAS community structure
4.3.3.1 Tetranucleotide classification of metagenome contigs
4.3.3.2 Comparisons to previously sequenced genomes
4.3.3.3 PhyloChip analysis of ANAS community composition
4.3.4 Metabolic functions in ANAS
4.3.4.1 Metagenome gene content overview
4.3.4.2 Reductive dechlorination
4.3.4.3 Hydrogen production and consumption
4.3.4.4 Cobalamin biosynthesis
4.3.4.5 Trichloroethene dechlorination by ANAS subcultures under different cobalamin conditions
4.4 Discussion

Chapter 5: Evaluation of microarray specificity for detecting Dehalococcoides mccartyi genes in mixed microbial communities using metagenomic sequence data
5.1 Introduction
5.2 Methods
5.2.1.1 Microbial communities
5.2.1.2 Metagenome and microarray datasets
5.2.1.3 Evaluation of microarray specificity through comparison of datasets
5.3 Results and discussion

Chapter 6: Comparative genomics of Wood-Ljungdahl pathways in Dehalococcoides mccartyi and in other fully sequenced bacteria and archaea
6.1 Introduction
6.2 Methods
6.3 Results and discussion

Chapter 7: Conclusions and Suggestions for Future Work
7.1 Bioleaching of rare earth elements from monazite
7.2 Microbial reductive dehalogenation of chlorinated ethenes

References

Appendices
Appendix 1. Calculation of total Nd solubilized from NdPO4 as a function of pH
Appendix 2. Metabolomics signal intensities for all metabolites and time points
Appendix 3. Heatmap showing average levels of all detected metabolites during monazite bioleaching
Appendix 4. Novel ANAS Dehalococcoides genes with product predictions beyond "hypothetical protein"
Appendix 5. Genes for hydrogenase components identified in the ANAS metagenome contigs
Appendix 6. Cobalamin biosynthesis genes identified in the ANAS metagenome contigs
Appendix 7. Bacterial and archaeal sequenced genomes lacking genes for methylene tetrahydrofolate reductase (MTHFR)


List of Figures

Figure 1.1. Total Nd solubilized from NdPO4 at varying pH.

Figure 1.2. Reductive dechlorination of chlorinated ethenes.

Figure 2.1. Initial characterization of rare earth element solubilization from unground monazite by fungal and bacterial isolates.

Figure 2.2. Biomass production measured as volatile solids after six days incubation with different phosphate sources.

Figure 2.3. Bioleaching of rare earth elements from monazite under different growth conditions.

Figure 2.4. Total sugar concentrations during bioleaching of monazite.

Figure 2.5. pH during bioleaching of monazite.

Figure 2.6. Proportions of rare earth elements and thorium in monazite and in bioleaching supernatant after six days of bioleaching.

Figure 2.7. Abiotic leaching of rare earth elements from monazite by hydrochloric acid solutions, organic acids, and bioleaching spent medium.

Figure 2.8. Relationship between pH and solubilization of thorium for abiotic leaching of monazite with solutions of hydrochloric acid.

Figure 2.9. Abiotic leaching of Th by different organic acids and by spent medium from three bioleaching organisms.

Figure 3.1. Bioleaching of monazite in the absence or presence of soluble phosphate (K2HPO4).

Figure 3.2. Heatmap showing average levels of identified metabolites detected during monazite bioleaching for each growth condition and time point.

Figure 3.3. Metabolites of potential bioleaching importance identified by higher concentrations for growth with monazite only than for growth with K2HPO4 and monazite.

Figure 3.4. Correlations between metabolite signal intensities and rare earth element concentrations.

Figure 3.5. Abiotic solubilization of rare earth elements from monazite by selected metabolites.

Figure 3.6. Abiotic solubilization of Th from monazite by selected metabolites.

Figure 3.7. Chromatographic separation of free Nd3+ and EDTA-Nd3+ complexes at circumneutral pH.

Figure 3.8. Chromatographic separation of Nd3+ and Nd3+ complexes at pH 2.5.

Figure 4.1. Comparison of metagenomic Dehalococcoides coverage with ANAS genes detected by microarray.

Figure 4.2. Alignment of ANAS metagenome Dehalococcoides contigs (identified by tetranucleotide frequency and/or sequence similarity) to the Dehalococcoides strain 195 genome.

Figure 4.3. Operon structure for genes for the first (corrin ring synthesis) part of the cobalamin biosynthesis pathway identified in an ANAS metagenome contig associated with Dehalococcoides.

Figure 4.4. Evidence for the association of contig ANASMEC_C6240 (containing cobalamin biosynthesis genes) with Dehalococcoides.

Figure 4.5. Ethene production during trichloroethene degradation by Dehalococcoides isolate ANAS2.

Figure 4.6. Ethene production during trichloroethene dechlorination by ANAS subcultures.

Figure 5.1. Distribution of genes among profile categories.

Figure 5.2. Fraction of genes identified as “Present” as a function of the number of probes for that gene with N mismatches where N = 0, 1, 2, 3, or > 3 (unaligned).

Figure 5.3. Relationships between profile mismatch distributions and microarray “Present”/ “Absent” identification.

Figure 6.1. Identification of targeted Wood-Ljungdahl pathway genes in fully sequenced bacterial and archaeal genomes.

Figure 7.1. Effect of monazite sand grain size on abiotic leaching with 10 mM citric acid.


List of Tables

Table 2.1. Bioleaching growth media compositions.

Table 2.2. Molar ratio of total rare earth elements to phosphate measured after bioleaching with different media compositions.

Table 2.3. Maximum observed concentrations of identified organic acids produced by three fungal isolates during bioleaching and percentage of bioleaching flasks for which each acid was detected.

Table 2.4. P-values for statistical analyses reported in the text for bioleaching and abiotic leaching of monazite.

Table 4.1. PCR primers and annealing temperatures for novel Dehalococcoides genes.

Table 4.2. Classification of contigs by tetranucleotide frequency and identification of contig classes by 16S and 23S BLAST comparisons.

Table 4.3. Comparison of ANAS contig classes to the most similar sequenced genomes.

Table 4.4. Overview of ANAS gene content by clusters of orthologous genes.

Table 4.5. Reductive dehalogenase genes identified in ANAS metagenome contigs.

Table 4.6. Cobalamin biosynthesis genes identified in ANAS metagenome contigs.

Table 5.1. Non-determinant probe set (gene) mismatch profiles.


Acknowledgements

I would like to thank my advisor Lisa Alvarez-Cohen for her guidance over the past five years, along with all the members of the Alvarez-Cohen research group, especially three post-doctoral scholars, Dr. Patrick K. H. Lee, Dr. Wei-Qin Zhuang, and Dr. Shan Yi, for their advice and support.

I would also like to thank a number of individuals for their specific contributions that made this work possible. Dr. Karl Lalonde and Dr. Geoffrey A. Dorn assisted with the collection of monazite samples, including leading me on a collection expedition in the Colorado Rockies. Dr. Negassi Hadgu provided assistance with ICP-MS analysis of rare earth elements and thorium for the bioleaching studies described in Chapters 2 and 3. Several people contributed work that made the metagenomic analysis in Chapter 4 possible. Kimberlee West collected ANAS cell samples and performed nucleic acid extractions. Dr. Susannah G. Tringe and other staff members at the Department of Energy Joint Genome Institute (JGI) performed the metagenome sequencing, assembly, and initial annotation. Kimberlee West and Dr. Eoin Brodie carried out the PhyloChip experiments and performed the initial analyses of them. Dr. Yujie Men provided the microarray and metagenomic sequencing datasets for HiTCEB12 and HiTCE analyzed in Chapter 5.

Additionally, I would like to thank the funding organizations that supported this work. The monazite bioleaching research described in Chapters 2 and 3 was supported by Siemens Corporate Research, a division of Siemens Corporation, through award number UCB_CKI-2012-Industry_IS-001-Doyle. The dechlorination research described in Chapters 4, 5, and 6 was supported by the Strategic Environmental Research and Development Program (SERDP) through grant ER-1587 and the NIEHS Superfund Basic Research Project ES04705-19. Funding for metagenomic sequencing was provided under the JGI Community Sequencing Program of the Department of Energy Office of Biological and Environmental Research. Part of the metagenomics work was performed at Lawrence Berkeley National Lab supported by the Office of Science, U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

This dissertation incorporates material from the following coauthored/previously published studies.

Brisson, Vanessa L., Wei-Qin Zhuang and Lisa Alvarez-Cohen (submitted 2015). "Bioleaching of Rare Earth Elements from Monazite Sand." Biotechnology and Bioengineering.

Brisson, Vanessa L., Kimberlee A. West, Patrick K. H. Lee, Susannah G. Tringe, Eoin L. Brodie and Lisa Alvarez-Cohen (2012). "Metagenomic analysis of a stable trichloroethene-degrading microbial community." The ISME Journal 6(9): 1702-1714.

Zhuang, Wei-Qin, Shan Yi, Markus Bill, Vanessa L. Brisson, Xueyang Feng, Yujie Men, Mark E. Conrad, Yinjie J. Tang and Lisa Alvarez-Cohen (2014). "Incomplete Wood–Ljungdahl pathway facilitates one-carbon metabolism in organohalide-respiring Dehalococcoides mccartyi." Proceedings of the National Academy of Sciences 111(17): 6419-6424.


Chapter 1:

Introduction and Background


1.1 “Omics” techniques

“Omics” refers to the evaluation of the content of a particular class of molecules in an organism or biological system. Technological advances over the past few decades have improved our ability to analyze the contents of a living organism or system in increasingly comprehensive ways. The four interrelated “omics” analyses reviewed here are genomics, transcriptomics, proteomics, and metabolomics, which respectively describe gene content, gene transcription to mRNA, mRNA translation to proteins, and metabolite production/consumption by reactions catalyzed by proteins.

Genomics is the analysis of the total genetic content of an organism, while metagenomics is the extension of that analysis to the genetic content of a community of organisms. Metagenomic data provide a broad view of the genetic composition of a community, including information about the identity and potential metabolic capabilities of community members. Advances in DNA sequencing technologies and analysis tools have facilitated the metagenomic analysis of increasingly complex microbial communities. In addition to sequencing, microarrays, such as the PhyloChip for phylogenetic profiling (Brodie, DeSantis et al. 2006) and the GeoChip for functional genes (He, Gentry et al. 2007), provide another approach to examining the metagenome of a microbial community. Microarray analyses are limited to detecting the targeted genes but can detect them with high sensitivity, while sequencing-based approaches can detect novel gene sequences but are limited in their sensitivity to low abundance genes due to random sampling effects (Zhou, Kang et al. 2008, Zhou, Wu et al. 2011). Genomic/metagenomic sequencing also provides a basis for transcriptomic and proteomic analyses.

Transcriptomic analyses examine the genes that are transcribed from DNA to mRNA. Since this is also an analysis of nucleic acids, transcriptomics relies on similar technologies to genomics and metagenomics. In addition to identifying transcribed genes, these analyses elucidate differences in transcription levels between different growth conditions, providing information on organisms’ response to conditions in terms of transcriptional regulation of genes.

In proteomics, the complement of proteins produced by an organism or group of organisms is analyzed. Like transcriptomics, proteomics can be used to evaluate regulatory responses to different conditions, in this case at the level of translation of mRNA sequences into proteins. The majority of proteomics studies utilize mass spectrometry (MS) techniques to analyze peptides from digested proteins and map those back to a database of proteins or to those proteins predicted from genomic/metagenomic sequences (VerBerkmoes, Denef et al. 2009, Altelaar, Munoz et al. 2013). Improvements in genomic/metagenomic sequencing and analyses have also facilitated proteomic analyses by providing improved reference sets of predicted proteins for analysis of MS results (VerBerkmoes, Denef et al. 2009).

Metabolomics refers to the analysis of small molecules present in or excreted by organisms. Metabolomic analyses applied to excreted metabolites are sometimes referred to as exometabolomics or metabolic footprinting while endometabolomics or metabolic fingerprinting refers to analysis of internal metabolites (Kell, Brown et al. 2005). Similar to transcriptomics and proteomics, metabolomic analyses are useful for comparing responses to differing growth conditions. However, unlike the above analyses, metabolomic analyses do not require genomic/metagenomic sequencing to provide a reference set of predicted transcripts or proteins for comparison (Kell, Brown et al. 2005). The two main analytical tools used in metabolomics are various forms of MS (usually coupled to liquid chromatography or gas chromatography) and nuclear magnetic resonance (NMR) (Patti, Yanes et al. 2012). These analyses can be either targeted (focusing on more detailed measurement of a small predefined set of metabolites) or untargeted (detecting as large a set of metabolites as possible) depending on the desired application (Patti, Yanes et al. 2012).

1.2 Targeted microbial processes

1.2.1 Bioleaching of rare earth elements from monazite

Over recent years, the rare earth elements (REEs) have become increasingly important for their use in a number of different technologies, several of which are related to energy efficiency and alternative energy generation (USDoE 2011, Alonso, Sherman et al. 2012). For instance, permanent magnets, used in wind turbines as well as many other applications, are made with Nd, Pr, and Dy (USDoE 2011). High efficiency batteries used in hybrid electric cars use a variety of REEs including Ce, La, Nd, and Pr (USDoE 2011). Although these are generally considered environmentally beneficial technologies, current processing techniques for extraction of REEs from ores are environmentally damaging due to their high energy inputs and use of harsh chemicals, which result in the production of environmentally damaging waste streams (Gupta and Krishnamurthy 1992, Alonso, Sherman et al. 2012).

The REEs include the naturally occurring elements of the lanthanide series (La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, and Lu) (atomic numbers 57 to 60 and 62 to 71) as well as Y and Sc (atomic numbers 39 and 21), which have similar chemical behavior to the lanthanides (Gupta and Krishnamurthy 1992, Cotton 2006). Pm, also a lanthanide, is not included because all of its isotopes decay radioactively, and thus it is not found naturally (Cotton 2006). The REEs are further subdivided into the light REEs (La-Gd, Sc) and the heavy REEs (Tb-Lu, Y) (Gupta and Krishnamurthy 1992, Cotton 2006). This division is based on atomic size, which decreases with increasing atomic number across the lanthanides, and on the electron configuration (Cotton 2006).

REEs are not truly rare and can be found in many locations around the globe (Gupta and Krishnamurthy 1992, Rudnick and Gao 2003). The three main REE ores that are currently mined for production are bastnasite (REE-FCO3), monazite (light REE-PO4), and xenotime (heavy REE-PO4), together representing approximately 95% of known REE (Gupta and Krishnamurthy 1992, Rosenblum and Fleischer 1995). Of these, monazite and bastnasite are much more abundant than xenotime (Gupta and Krishnamurthy 1992). Monazite is further classified as monazite-Ce, monazite-La, and monazite-Nd, depending on which REE is dominant in the mineral (Rosenblum and Fleischer 1995). In addition to REE-PO4, monazite usually contains Th and sometimes U, both of which are radioactive, presenting a challenge for separation and disposal when they are extracted along with REEs (Gupta and Krishnamurthy 1992). Th is usually present as either cheralite (ThCa(PO4)2) or (ThSiO4) (Nitze 1896, Rosenblum and Fleischer 1995). In addition to the REEs, Th, and U, other elements commonly present in monazite include Si, Ca, Al, Mg, Fe, Mn, and Pb (Nitze 1896, Zhu and O'Nions 1999).

REEs are generally difficult to extract from monazite, and conventional monazite processing uses caustic chemicals to leach REEs from the ore at high temperatures. Since monazite is the focus of the bioleaching process presented in this dissertation, conventional monazite leaching processes are reviewed here briefly. There are two main treatment processes for monazite: acid treatment and alkali treatment (Gupta and Krishnamurthy 1992). In the acid treatment process, concentrated inorganic acid, usually H2SO4, is used to digest the ore at approximately 200 °C. REEs and Th are then recovered separately through a series of neutralization and precipitation steps. In the alkali treatment process, concentrated NaOH solution is used to decompose the ore at approximately 140 °C in order to recover Na3PO4 as a product stream, converting the REEs to REE hydroxides. These are then dissolved with concentrated inorganic acid and further processed to recover the REEs. The use of harsh chemicals and high temperatures results in the production of toxic waste streams and high energy usage. Also, the co-extraction of radioactive elements (Th and U) early in the process necessitates further downstream processing to separate this radioactive material from other process streams and to dispose of it appropriately. Bioleaching offers a potentially more environmentally friendly approach to extraction of REEs from ores.

Phosphate solubilizing microorganisms (PSMs), which include both bacterial and fungal species, are capable of releasing phosphate from otherwise low solubility phosphate minerals (Rodríguez and Fraga 1999). All organisms need phosphate to survive. Phosphate is an important part of the structure of DNA, RNA, and cytoplasmic membranes, and it is also needed for adenosine triphosphate, which stores energy for cells in phosphate bonds (Madigan, Martinko et al. 2008). However, in many environmental systems phosphate is not readily available, but is instead locked away in insoluble minerals (Rodríguez and Fraga 1999). This provides a selective pressure for the evolution of organisms capable of solubilizing these minerals in order to make the phosphates bioavailable.

Most research with PSMs has focused on agricultural applications with the objectives of understanding how microorganisms make phosphate more bioavailable to plants and developing approaches for using microorganisms to enhance the effectiveness of phosphate fertilizers (Asea, Kucey et al. 1988, Illmer and Schinner 1992, Rodríguez and Fraga 1999, Gyaneshwar, Naresh Kumar et al. 2002, Arcand and Schneider 2006, Vassilev, Vassileva et al. 2006, Chuang, Kuo et al. 2007, Morales, Alvear et al. 2007, Osorio and Habte 2009, Chai, Wu et al. 2011, Braz and Nahas 2012). In addition to agriculturally important studies, there has also been research on using PSMs to extract phosphate from ores (Costa, Medronho et al. 1992) and to remove phosphates from iron ores to make these ores more suitable for iron production (Delvasto, Valverde et al. 2008, Delvasto, Ballester et al. 2009, Adeleke, Cloete et al. 2010). Most of these studies have focused on calcium phosphate minerals including tricalcium phosphate, dicalcium phosphate, , and rock phosphate (Illmer and Schinner 1992, Illmer and Schinner 1995, Altomare, Norvell et al. 1999, Rodríguez and Fraga 1999), but a few have addressed other phosphate minerals, including AlPO4, FePO4, and turquoise (CuAl6(PO4)4(OH)8·4H2O) (Illmer, Barbato et al. 1995, Souchie, Azcón et al. 2006, Chuang, Kuo et al. 2007, Delvasto, Valverde et al. 2008, Chai, Wu et al. 2011). For the studied PSMs, solubilization varied between minerals, with FePO4 and turquoise exhibiting much lower solubility than AlPO4 and calcium phosphates.

To the best of our knowledge, no previous research has evaluated the potential for PSM bioleaching of monazite.

Several mechanisms have been proposed to explain phosphate solubilization by PSMs, but the production of organic acids is thought to be a major contributor (Rodrı́guez and Fraga 1999, Nautiyal, Bhadauria et al. 2000, Gyaneshwar, Naresh Kumar et al. 2002, Scervino, Papinutti et al. 2011). In addition to reducing the pH, which somewhat increases the solubility of phosphate minerals, some organic acids can act as chelating agents, forming complexes with the cations released from the phosphate minerals and thus improving overall solubilization (Bolan, Naidu et al. 1994, Gadd 1999, Gyaneshwar, Naresh Kumar et al. 2002, Arcand and Schneider 2006). PSMs have been observed to produce a variety of organic acids including citric, gluconic, oxalic, succinic, acetic, malonic, propionic, 2-ketogluconic, lactic, isovaleric, isobutyric, and glycolic acid (Cunningham and Kuiack 1992, Illmer and Schinner 1995, Rodrı́guez and Fraga 1999, Chen, Rekha et al. 2006, Chuang, Kuo et al. 2007). In some studies, the detected acids predominantly account for the levels of solubilization observed, while in others low production of organic acids indicates that other factors contribute to solubilization (Illmer and Schinner 1992, Illmer, Barbato et al. 1995, Altomare, Norvell et al. 1999, Rodrı́guez and Fraga 1999, Chen, Rekha et al. 2006, Chuang, Kuo et al. 2007). Additionally, some studies have found what appeared to be other organic acids that were not identifiable (Chen, Rekha et al. 2006).

Since monazite is a REE phosphate mineral, we hypothesized that some PSMs may be able to solubilize monazite for the extraction of REEs. However, there are a number of factors that make solubilization of monazite more challenging than solubilization of calcium phosphates typically used in PSM studies. For instance, REE-phosphates are known to have particularly low solubilities in water, on the order of 10⁻¹³ M (10⁻¹¹ g/L) (Firsching and Brune 1991), whereas the solubility of Ca3(PO4)2 is 3.9×10⁻⁶ M (0.0012 g/L) (Haynes ed. 2015). Figure 1.1 shows the predicted relationship between pH and total Nd solubilization for NdPO4 (see Appendix 1 for calculations). From this we can see that even at a pH of 2, the total dissolved Nd is still below 10⁻⁵ M, whereas the corresponding dissolved Ca concentration at this pH is ≥ 1 M for Ca3(PO4)2 and a variety of other calcium phosphates (Akiyama and Kawasaki 2012). Based on these data, those PSMs that rely on acidification alone for solubilization of Ca phosphate minerals can be expected to be much less effective at solubilizing monazite. Thus, the production of effective complexing agents will likely be critical to facilitate monazite solubilization.
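The relationship between the molar and mass-based solubilities quoted above can be checked with a short calculation. The following is a minimal sketch, assuming NdPO4 as a representative REE phosphate and using molar masses computed from standard atomic weights; the function and variable names are illustrative only.

```python
# Approximate molar masses in g/mol, computed from standard atomic weights.
# NdPO4 is used here as a stand-in for REE phosphates in general (assumption).
MOLAR_MASS = {
    "NdPO4": 144.24 + 30.97 + 4 * 16.00,                # about 239.2 g/mol
    "Ca3(PO4)2": 3 * 40.08 + 2 * (30.97 + 4 * 16.00),   # about 310.2 g/mol
}


def molar_to_mass_conc(molar_conc, compound):
    """Convert a concentration in mol/L to g/L for the given compound."""
    return molar_conc * MOLAR_MASS[compound]


ndpo4_g_per_l = molar_to_mass_conc(1e-13, "NdPO4")
ca_phosphate_g_per_l = molar_to_mass_conc(3.9e-6, "Ca3(PO4)2")

print(f"NdPO4:     {ndpo4_g_per_l:.1e} g/L")
print(f"Ca3(PO4)2: {ca_phosphate_g_per_l:.4f} g/L")
```

Under these assumptions the two values differ by roughly seven to eight orders of magnitude, consistent with the comparison made in the text.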

Figure 1.1. Total Nd solubilized (log scale) from NdPO4 over a pH range. Curve was calculated based on equilibrium data from (Puigdomenech 2013). Calculations are shown in Appendix 1.

Some of the organic acids identified in PSM studies have also been shown to form complexes with REEs. For instance, REE citrate complexes have been studied by a number of researchers, and several different complexes between REEs and citrate have been proposed (Wood 1993, Goyne, Brantley et al. 2010). The stability constant for the formation of 1:1 REE citrate complexes has been estimated at about 10⁹ (Martell and Smith 1974, Goyne, Brantley et al. 2010). One study evaluated citrate, along with oxalate, phthalate, and salicylate complexation of REEs from monazite in the context of metal mobilization in soils (Goyne, Brantley et al. 2010). They found citrate to be the most effective at releasing REEs from monazite under their experimental conditions. In addition to organic acids, other chelating molecules could also be involved in solubilization of REEs. For instance, some siderophores, iron complexing molecules produced by many bacteria and fungi, have also been found to form complexes with REEs (Christenson and Schijf 2011).
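To illustrate what a stability constant of about 10⁹ implies, the sketch below computes the fraction of dissolved REE held in a 1:1 complex for a given free ligand concentration, using K = [ML]/([M][L]). This is a deliberately simplified picture: it assumes a single 1:1 complex and treats the free, fully deprotonated citrate concentration as a known input, ignoring citrate protonation, competing cations, and other complex stoichiometries. The example concentration is hypothetical.

```python
def fraction_complexed(stability_constant, free_ligand_molar):
    """Fraction of dissolved metal present as the 1:1 complex ML.

    From K = [ML] / ([M][L]), the ratio [ML]/[M] equals K * [L],
    so the complexed fraction is K[L] / (1 + K[L]).
    """
    ratio = stability_constant * free_ligand_molar
    return ratio / (1.0 + ratio)


K_REE_CITRATE = 1e9          # order of magnitude quoted in the text
free_citrate_molar = 1e-6    # hypothetical free citrate concentration

print(f"{fraction_complexed(K_REE_CITRATE, free_citrate_molar):.4f}")
```

In this idealized case, half of the dissolved REE is complexed when the free ligand concentration equals 1/K, that is, about 10⁻⁹ M.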

Another challenge for REE solubilization is that once solubilized, REEs may be removed from the medium by other processes including re-precipitation or adsorption. For instance, REE oxalates are highly insoluble (Gadd 1999) and therefore, the production of oxalic acid will need to be monitored closely in a bioleaching process to minimize the precipitation of REE oxalates. Also, REEs have been found to adsorb to the cell walls and extracellular polymers of some organisms or be absorbed into cells (Moriwaki and Yamamoto 2013). Such effects could result in the removal of solubilized REEs from the bulk medium.

1.2.2 Microbial reductive dehalogenation of chlorinated ethenes

Chlorinated ethenes are common groundwater contaminants in the United States (McCarty 1997, Moran, Zogorski et al. 2007, US_Dept._of_H&HS 2007, Doherty 2014). Industrial use of tetrachloroethene (PCE) and trichloroethene (TCE), used for their properties as organic solvents, began in the early 1900s (Doherty 2014). Important industrial applications include metal degreasing and dry cleaning (Mohn and Tiedje 1992, McCarty 1997). TCE was also used historically for coffee decaffeination (Doherty 2014). Although use of these chemicals has greatly decreased in recent decades, due to a combination of regulations and public concern, existing contamination is expected to present a persistent problem for decades to come (Doherty 2014). Due to poor disposal practices as well as accidental spills and leaks, chlorinated ethene contamination of groundwater is a widespread problem, with over half of Superfund sites having TCE contamination (US_Dept._of_H&HS 1997).

TCE has been tied to a number of both acute and chronic human health effects including neurological, kidney, liver, reproductive, and immune system effects (US_Dept._of_H&HS 1997, US_EPA 2011). Dichloroethene (cis-DCE and trans-DCE) and vinyl chloride (VC), intermediates of PCE and TCE dechlorination, are also both highly toxic, and VC is a known human carcinogen while PCE and TCE are suspected carcinogens (Kielhorn, Melber et al. 2000, US_Dept._of_H&HS 2005).

A variety of remediation approaches have been studied for the treatment of chlorinated ethene contamination in groundwater. Zero-valent iron particles, which are capable of donating electrons for the reduction of chlorinated organics, represent an important abiotic approach that has been widely studied and applied (Gillham and O'Hannesin 1994, Arnold and Roberts 2000, Liu, Majetich et al. 2005). Biological degradation of TCE can occur co-metabolically with some aerobic microorganisms in which oxygenase enzymes that target other substrates also catalyze the oxidation of TCE due to a lack of enzyme specificity (Bradley 2003). Anaerobic biodegradation of chlorinated ethenes via reductive dehalogenation is another important bioremediation process and is one focus of research presented in this dissertation.

Some anaerobic microorganisms are capable of reductive dechlorination of chlorinated organics like PCE and TCE. In this process, the microorganisms use a chlorinated organic as their terminal electron acceptor for energy metabolism (Smidt and de Vos 2004). The reduction of the chlorinated organic, and the replacement of the chlorine with a hydrogen atom, is coupled to the oxidation of an electron donor, usually hydrogen (Figure 1.2). A number of different microorganisms have been identified that are capable of partially dechlorinating PCE and TCE to the toxic intermediate DCE (Scholz-Muramatsu, Neumann et al. 1995, Sharma and McCarty 1996, Holliger, Hahn et al. 1998, Luijten, de Weert et al. 2003, Löffler, Cole et al. 2004). However, only members of the genus Dehalococcoides (Dhc) have been found to be capable of fully dechlorinating chlorinated ethenes to ethene (Maymo-Gatell, Chien et al. 1997, Smidt and de Vos 2004).

Figure 1.2. Reductive dechlorination of chlorinated ethenes. Each successive chlorine removal step involves the oxidation of one mole of H2 per mole of chlorinated ethene reduced, with the transfer of two moles of electrons. (a) PCE reduction to TCE. (b) TCE reduction to cis-DCE or trans-DCE. (c) cis-DCE or trans-DCE reduction to VC. (d) VC reduction to ethene.
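The stoichiometry described in the caption can be tallied in a few lines. This is an illustrative sketch only: it assumes one mole of H2 and two moles of electrons per chlorine removed, as stated in the caption, and the names used are hypothetical.

```python
# Number of chlorine atoms on each compound in the dechlorination sequence.
CHLORINE_COUNT = {"PCE": 4, "TCE": 3, "DCE": 2, "VC": 1, "ethene": 0}


def reduction_requirements(start, end, moles=1.0):
    """Return dechlorination steps, mol H2, and mol electrons needed."""
    steps = CHLORINE_COUNT[start] - CHLORINE_COUNT[end]
    if steps < 0:
        raise ValueError("end compound must be less chlorinated than start")
    return {
        "steps": steps,
        "mol_H2": steps * moles,             # one mol H2 per chlorine removed
        "mol_electrons": 2 * steps * moles,  # two mol electrons per step
    }


print(reduction_requirements("PCE", "ethene"))
print(reduction_requirements("TCE", "VC", moles=0.5))
```

For example, complete reduction of one mole of PCE to ethene passes through four steps and therefore consumes four moles of H2.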

Dhc are strictly anaerobic bacteria that use chlorinated ethenes and other chlorinated organics as electron acceptors (Maymo-Gatell, Chien et al. 1997, Smidt and de Vos 2004). These reductive dechlorination reactions are catalyzed by membrane associated enzymes called reductive dehalogenases (RDases) (Smidt and de Vos 2004). Genome sequencing of several Dhc strains has revealed a large variety of putative RDase genes. The complement of RDase genes varies greatly between strains and corresponds to variation in dechlorination abilities (Kube, Beck et al. 2005, Seshadri, Adrian et al. 2005, McMurdie, Behrens et al. 2009, Lee, Cheng et al. 2011). However, far fewer Dhc RDase genes have been tied to functional activity, currently numbering six in all (pceA, tceA, vcrA, bvcA, cbrA, and mbrA) (Magnuson, Stern et al. 1998, Magnuson, Romine et al. 2000, Krajmalnik-Brown, Holscher et al. 2004, Muller, Rosner et al. 2004, Adrian, Rahnenfuhrer et al. 2007, Chow, Cheng et al. 2010).

Dhc species have strict metabolic needs for growth and dechlorination. All known Dhc require anaerobic conditions with certain chlorinated organics as terminal electron acceptors, hydrogen as the electron donor, and acetate as a carbon source (Maymo-Gatell, Chien et al. 1997, Adrian, Szewzyk et al. 2000, He, Ritalahti et al. 2003, Smidt and de Vos 2004). Further, although cobalamin is a necessary cofactor for RDases (Smidt and de Vos 2004), no Dhc strains have been reported to be capable of synthesizing cobalamin de novo (Kube, Beck et al. 2005, Seshadri, Adrian et al. 2005, He, Holmes et al. 2007). Previously sequenced Dhc strains have genes encoding enzymes in the second part of the cobalamin biosynthesis pathway, lower ligand attachment and rearrangement (Maymo-Gatell, Chien et al. 1997, Kube, Beck et al. 2005, McMurdie, Behrens et al. 2009), but not for the first part of the pathway, corrin ring synthesis. Additionally, although Dhc can produce all essential amino acids and Dhc strain 195 is capable of nitrogen fixation, Dhc grows more robustly when certain amino acids and fixed nitrogen are available for uptake from the environment (Lee, He et al. 2009, Zhuang, Yi et al. 2011). A recent study also showed that, due to an incomplete Wood-Ljungdahl pathway, Dhc produces carbon monoxide (CO) as a byproduct of acetate assimilation for methionine production (Zhuang, Yi et al. 2014). Without other organisms capable of removing it, CO builds up during Dhc growth, resulting in inhibition of Dhc growth and dechlorination.

Dhc has been shown to grow more robustly and dechlorinate more rapidly when grown in mixed microbial communities or defined consortia, likely due to the ability of other organisms to facilitate the specific growth requirements of Dhc (Maymo-Gatell, Chien et al. 1997, He, Holmes et al. 2007, Lee, Cheng et al. 2011, Men, Feil et al. 2012). The improved performance of Dhc in these communities, along with the greater relevance of these conditions to in situ dechlorination at contaminated sites, makes the study of complex dechlorinating communities important for development of effective bioremediation strategies.

1.3 Dissertation overview

This dissertation describes investigations into the microbial processes discussed above, with a guiding theme of using “omics” based approaches to deepen our understanding of environmentally relevant microbial processes. The remainder of this dissertation is organized into four chapters detailing those investigations followed by an additional chapter summarizing the results and suggesting future research directions based on those findings.

Chapter 2 describes the establishment and characterization of a monazite bioleaching process. This includes the initial enrichment and isolation of bioleaching microorganisms as well as the optimization of bioleaching growth parameters. It also includes an analysis of organic acid production during bioleaching and the potential contribution of those organic acids to overall bioleaching effectiveness.

In Chapter 3, one of the organisms isolated in Chapter 2 was selected for an untargeted metabolomic analysis of the bioleaching supernatant to further understand the bioleaching process. This study investigated the excretion of metabolites by a monazite bioleaching fungus when grown with and without an additional soluble phosphate source (K2HPO4). The gas chromatography time of flight mass spectrometry (GC-TOF-MS) technique employed in this chapter enabled a more comprehensive analysis of excreted metabolites and identification of compounds potentially associated with bioleaching effectiveness.

Chapter 4 switches to an analysis of a more complex mixed microbial community degrading TCE, as opposed to the bioleaching isolates studied in Chapters 2 and 3. This chapter describes a metagenomic sequencing based analysis of the target community, revealing new information about both the phylogenetic makeup of that community and its genetic content. In addition to profiling the whole community, this chapter also has a particular focus on the Dhc strains within that community, which are responsible for the community’s dechlorination activity.

Chapters 5 and 6 contain further metagenomic/genomic sequence based investigations of Dhc strains in microbial communities and as isolates. In Chapter 5, three Dhc containing dechlorinating mixed communities were investigated using both metagenomic sequencing and a Dhc genus wide microarray. The metagenomic sequencing data were then used to evaluate the sensitivity and specificity of the microarray for detecting the targeted Dhc genes. In Chapter 6, fully sequenced bacterial and archaeal genomes were analyzed bioinformatically for patterns of gene presence and absence that parallel the gene pattern associated with the recently identified incomplete Wood-Ljungdahl pathway found in Dhc, in order to determine whether other known organisms share this newly identified version of the pathway.

Conclusions drawn from the above investigations and suggestions for future work are summarized in Chapter 7.

Chapter 2:

Bioleaching of Rare Earth Elements from Monazite Sand

A version of this chapter has been submitted for publication as:

Brisson, Vanessa L., Wei-Qin Zhuang and Lisa Alvarez-Cohen (submitted 2015). "Bioleaching of Rare Earth Elements from Monazite Sand." Biotechnology and Bioengineering.

2.1 Introduction

Rare earth elements (REEs) are increasingly in demand for a variety of technologies including efficient batteries for hybrid and electric vehicles; permanent magnets used in wind turbines; high efficiency electric lights; and a variety of consumer electronics (USDoE 2011, Alonso, Sherman et al. 2012). Unfortunately, current processing techniques applied for extraction of REEs from mineral ores require high energy inputs and use harsh chemicals, producing environmentally damaging waste streams (Gupta and Krishnamurthy 1992, Alonso, Sherman et al. 2012).

Phosphate solubilizing microorganisms (PSMs) can solubilize phosphate from otherwise low solubility phosphate minerals (Rodrı́guez and Fraga 1999). A variety of both bacterial and fungal PSMs have been identified and studied. Most of that work has focused on solubilization of calcium phosphate minerals in the context of agricultural applications with the goals of enhancing phosphate fertilizer effectiveness and promoting plant growth (Asea, Kucey et al. 1988, Illmer and Schinner 1992, Rodrı́guez and Fraga 1999, Gyaneshwar, Naresh Kumar et al. 2002, Arcand and Schneider 2006, Vassilev, Vassileva et al. 2006, Chuang, Kuo et al. 2007, Morales, Alvear et al. 2007, Osorio and Habte 2009, Chai, Wu et al. 2011, Braz and Nahas 2012).

Organic acid production is considered to be a primary contributor to phosphate solubilization by PSMs (Rodrı́guez and Fraga 1999, Nautiyal, Bhadauria et al. 2000, Gyaneshwar, Naresh Kumar et al. 2002, Scervino, Papinutti et al. 2011). This activity is thought to be due to both pH reduction and the formation of complexes between the organic acid and the cations released from the phosphate minerals (Bolan, Naidu et al. 1994, Gadd 1999, Gyaneshwar, Naresh Kumar et al. 2002, Arcand and Schneider 2006).

Some organic acids identified in PSM studies have been analyzed for their ability to form complexes with REEs. For example, the stability constants for the formation of 1:1 REE citrate complexes have been estimated at around 10⁹ (Martell and Smith 1974, Goyne, Brantley et al. 2010), and other REE citrate complexes have been proposed (Wood 1993). Beyond organic acids, other chelating molecules, such as siderophores, can also interact with REEs and may therefore be relevant to REE solubilization (Christenson and Schijf 2011).

Bioleaching offers a potentially more environmentally friendly alternative to conventional extraction of REEs from ores. Since monazite is a REE phosphate mineral, the objectives of this study were to investigate the potential of using PSMs to solubilize monazite for the extraction of REEs and to evaluate the contributions of different organic acids to REE bioleaching. To the best of our knowledge, no previous research has evaluated the potential use of PSMs for this purpose.

2.2 Materials and methods

2.2.1 Enrichment and isolation of REE solubilizing microorganisms

REE-phosphate solubilizing enrichment cultures were established in National Botanical Research Institute phosphate growth medium (NBRIP medium), a commonly used medium for phosphate solubilization studies (Nautiyal 1999). See Table 2.1 for medium composition. Inoculating source material was placed in 50 mL centrifuge tubes, covered with NBRIP medium, and shaken to release cells. Soil and sand particles were allowed to settle and supernatant was collected and used to inoculate enrichment bottles containing NBRIP medium with 10 g/L glucose (added as carbon and energy source) and insoluble NdPO4 (phosphate source) [Sigma-Aldrich, St. Louis, MO]. Enrichment cultures were shaken continuously and approximately 50% of the growth medium was exchanged for fresh medium weekly.

REE-phosphate solubilizing microorganisms were isolated from enrichment cultures on selective plates containing NBRIP medium solidified with 1.5% agar and containing powdered NdPO4 as the phosphate source and 10 g/L glucose as carbon and energy source. Plates were inoculated with 100 µL of enrichment culture and incubated at 30 °C. Once growth was observed, individual colonies were selected and transferred to new plates. 10 µL of sterile water was used to help fungal spores adhere to the sterile loop for transfer. This process was repeated for several transfers to achieve isolated strains. Once fungal strains were isolated, they were maintained on potato dextrose agar plates. Six known PSMs from culture collections (Aspergillus niger ATCC 1015, Burkholderia ferrariae FeG101, Microbacterium ulmi XIL02, Pseudomonas rhizosphaerae IH5, Pseudomonas fluorescens, Streptomyces youssoufiensis X4) were also tested for their ability to grow on NBRIP-NdPO4 selective plates.

Organisms capable of growth on selective plates were screened for their ability to leach REEs from monazite in liquid culture. These tests took place in 250 mL bottles, each containing 100 mL NBRIP medium with 10 g/L glucose and 7.5 g raw (unground) monazite sand. Bottles were stirred continuously at 250 rpm at room temperature (25 to 28 °C) and supernatant REE concentrations were monitored over two weeks of incubation.

2.2.2 DNA extraction, amplification, sequencing, and sequence analysis

For DNA extraction, biomass was collected from cultures grown on potato dextrose agar plates. Biomass was scraped off of the agar, frozen under liquid nitrogen, and crushed with a mortar and pestle to disrupt cell walls. DNA was extracted using the Qiagen DNeasy Plant Minikit. Fungal 18S and 5.8S genes and ITS regions were amplified by PCR using the NS1, NS3, NS4, NS8, ITS1, and ITS4 universal fungal primers (White, Bruns et al. 1990). PCR reactions were carried out using Qiagen Taq DNA polymerase. PCR products were purified by polyethylene glycol precipitation followed by ethanol wash, drying, and re-suspension in sterile water (Sánchez, McFadden et al. 2003). Sanger sequencing was performed at the UC Berkeley DNA sequencing facility.

2.2.3 Bioleaching growth conditions

Bioleaching experiments were carried out in 250 mL Erlenmeyer flasks with foam stoppers. Each flask contained 0.5 g monazite sand [City Chemical LLC, West Haven, CT] ground to finer than 75 µm (200 mesh) and 50 mL growth medium. Chemical digestion and ICP-MS analysis of the sand determined that it contained approximately 23.5% REEs by weight, indicating the presence of other minerals besides monazite. The growth medium composition and carbon source varied with each experiment. Four different basic media were used: NBRIP medium (Nautiyal 1999), modified Pikovskaya medium (PVK medium) (Pikovskaya 1948, Nautiyal 1999), PVK medium without Mn and Fe, and modified ammonium salts medium (AMS medium) (Parales, Adamus et al. 1994). See Table 2.1 for media compositions.

Table 2.1. Bioleaching growth media compositions.

Medium component                  NBRIP medium (a)   PVK medium (b)   PVK medium (b) without Mn and Fe   AMS medium (c)
MgCl2·6H2O                        5 g/L              --               --                                 --
MgSO4·7H2O                        0.25 g/L           0.1 g/L          0.1 g/L                            1.0 g/L
KCl                               0.2 g/L            0.2 g/L          0.2 g/L                            0.2 g/L
(NH4)2SO4                         0.1 g/L            0.5 g/L          0.5 g/L                            0.66 g/L
NaCl                              --                 0.2 g/L          0.2 g/L                            --
MnSO4·H2O                         --                 0.002 g/L        --                                 --
FeSO4·7H2O                        --                 0.002 g/L        --                                 --
Trace elements stock (1000x) (d)  --                 --               --                                 1.0 mL/L
Stock A (1000x) (e)               --                 --               --                                 1.0 mL/L

(a) See reference (Nautiyal 1999).
(b) This is a modification of the original Pikovskaya medium, with yeast extract omitted. See references (Pikovskaya 1948, Nautiyal 1999).
(c) This is a modification of the original ammonium salts medium, with phosphate buffer omitted. See reference (Parales, Adamus et al. 1994).
(d) Trace elements stock contained FeSO4·7H2O (0.5 g/L), ZnSO4·7H2O (0.4 g/L), MnSO4·H2O (0.02 g/L), H3BO3 (0.015 g/L), NiCl2·6H2O (0.01 g/L), EDTA (0.25 g/L), CoCl2·6H2O (0.05 g/L), and CuCl2·2H2O (0.005 g/L).
(e) Stock A contained FeNaEDTA (5 g/L) and NaMoO4·2H2O (2 g/L).

One of five carbon sources (glucose, fructose, sucrose, xylose, or starch) was added to the medium prior to sterilization. After sterilization, flasks were inoculated with conidia (asexual spores) collected from cultures maintained on potato dextrose agar. Conidia were collected from plates by scraping with a sterile loop and suspended in sterile deionized water. Necessary dilutions to achieve a spore concentration of 10⁷ CFU per mL were determined by using calibration curves relating optical density at 600 nm to CFU concentration. Each bioleaching flask was inoculated with 1 mL (10⁷ CFU) of spore suspension. Flasks were incubated at room temperature (25 to 28 °C) and stirred with a one inch magnetic stir bar at 250 rpm for the duration of the six day bioleaching experiments. All bioleaching experiments were conducted with three biological replicates for each condition unless otherwise noted.
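The dilution step described above can be sketched as follows. This is a hypothetical illustration, not the calibration actually used: it assumes a linear calibration through the origin between optical density at 600 nm and CFU concentration, and the slope and optical density values are invented for the example.

```python
TARGET_CFU_PER_ML = 1e7  # target spore concentration stated in the text


def estimate_cfu_per_ml(od600, slope_cfu_per_ml_per_od):
    """Estimate CFU/mL from OD600, assuming a linear calibration (assumption)."""
    return od600 * slope_cfu_per_ml_per_od


def dilution_volumes(stock_cfu_per_ml, final_volume_ml, target=TARGET_CFU_PER_ML):
    """Return (mL of stock, mL of sterile water) needed to reach the target."""
    if stock_cfu_per_ml < target:
        raise ValueError("stock suspension is already below the target")
    stock_ml = final_volume_ml * target / stock_cfu_per_ml
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical calibration slope and OD600 reading, for illustration only.
stock = estimate_cfu_per_ml(od600=0.8, slope_cfu_per_ml_per_od=5e7)
stock_ml, water_ml = dilution_volumes(stock, final_volume_ml=10.0)
print(f"stock: {stock:.1e} CFU/mL, mix {stock_ml:.2f} mL stock + {water_ml:.2f} mL water")
```

A real calibration curve may be nonlinear or organism specific, in which case only the first function would need to change.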

2.2.4 Abiotic leaching conditions

Abiotic leaching experiments were conducted in 50 mL flat bottomed polypropylene tubes. Each tube contained 0.1 g monazite sand ground to finer than 75 µm (200 mesh) and 10 mL of the desired leaching solution. Tubes were incubated at room temperature (25 to 28 °C) and stirred with 0.5 inch stir bars at 250 rpm for 48 hours.

For abiotic leaching with organic acids and hydrochloric acid, acid solutions were prepared with deionized water and filter sterilized through 0.2 µm filters prior to leaching. Acetic, citric, itaconic, oxalic, and succinic acids [Sigma-Aldrich, St. Louis, MO] were tested at two concentrations each: 2 mM and 20 mM, while gluconic acid [Sigma-Aldrich, St. Louis, MO] was tested at 1.8 mM and 18 mM. HCl solutions were prepared to provide a range of pH from 1.8 to 3.7. Three experimental replicates were done for each acid concentration.

For abiotic leaching with spent medium, bioleaching spent medium was collected after six days of bioleaching and filter sterilized through 0.2 µm filters. Filtered spent medium was treated to remove REEs by adding 15 mL spent medium to a 50 mL tube containing 0.5 g Amberlite IR120 resin [Sigma-Aldrich, St. Louis, MO] and shaking horizontally for one hour. 10 mL of treated spent medium was then used for leaching. Six replicates were done for each organism, each from a separate bioleaching flask.

2.2.5 Biomass measurements

Volatile solids (VS) were determined as a measure of biomass production. VS was used rather than dry weight because monazite sand becomes entrapped in the biomass during bioleaching. By measuring VS, the organic portion of the total dry weight could be measured. Samples comprising the entire contents of a bioleaching flask were filtered on glass microfiber filters. Filtered samples were dried overnight at 105 °C, cooled to room temperature, and weighed. Samples were then ashed at 550 °C for four hours, cooled to room temperature, and weighed again. The difference between the dried weight and the ashed weight was determined as VS.
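The VS calculation is a simple difference, sketched below. The weights in the example are hypothetical, the 50 mL culture volume is taken from the bioleaching flask description, and the sketch assumes that the filter itself loses no mass on ashing, so that its weight cancels in the difference.

```python
def volatile_solids_g(dried_weight_g, ashed_weight_g):
    """VS (g) as the mass lost between drying at 105 °C and ashing at 550 °C."""
    if ashed_weight_g > dried_weight_g:
        raise ValueError("ashed weight cannot exceed dried weight")
    return dried_weight_g - ashed_weight_g


def vs_concentration_g_per_l(vs_g, culture_volume_ml=50.0):
    """Express VS per liter of culture (50 mL per flask is assumed)."""
    return vs_g / (culture_volume_ml / 1000.0)


# Hypothetical weights (filter plus retained solids), for illustration only.
vs = volatile_solids_g(dried_weight_g=1.250, ashed_weight_g=1.050)
print(f"VS = {vs:.3f} g, or {vs_concentration_g_per_l(vs):.1f} g/L")
```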

2.2.6 Analytical methods

To quantify REE, Th, and U concentrations, supernatant samples were collected, filtered with 0.2 µm filters, and diluted 100-fold in deionized water acidified with 1.5% nitric acid [70%, trace metals grade, Fisher Scientific, Pittsburgh, PA] and 0.5% hydrochloric acid [36%, ACS Plus grade, Fisher Scientific, Pittsburgh, PA]. Samples were analyzed on an Agilent Technologies 7700 series ICP-MS.
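As an illustration of how a diluted ICP-MS reading relates to the leaching flask, the sketch below back-calculates the supernatant concentration and the corresponding share of the REEs in the sand. It assumes the 100-fold dilution described above together with the flask contents given for the bioleaching experiments (0.5 g sand at approximately 23.5% REEs by weight in 50 mL medium), and it ignores any change in liquid volume during incubation. The measured value is hypothetical.

```python
DILUTION_FACTOR = 100        # 100-fold dilution before ICP-MS analysis
SAND_MASS_G = 0.5            # monazite sand per bioleaching flask
REE_MASS_FRACTION = 0.235    # approximately 23.5% REEs by weight
MEDIUM_VOLUME_L = 0.050      # 50 mL growth medium per flask


def supernatant_conc_mg_per_l(measured_mg_per_l):
    """Undo the analytical dilution to get the supernatant concentration."""
    return measured_mg_per_l * DILUTION_FACTOR


def percent_ree_leached(supernatant_mg_per_l):
    """Percent of the REEs in the sand that are present in solution."""
    total_ree_mg = SAND_MASS_G * REE_MASS_FRACTION * 1000.0
    leached_mg = supernatant_mg_per_l * MEDIUM_VOLUME_L
    return 100.0 * leached_mg / total_ree_mg


measured = 1.0  # hypothetical ICP-MS reading of the diluted sample, mg/L
conc = supernatant_conc_mg_per_l(measured)
print(f"{conc:.0f} mg/L in supernatant, {percent_ree_leached(conc):.2f}% of REEs leached")
```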

To quantify sugars and organic acids, supernatant samples were collected and filtered with 0.2 µm filters. 1.5 mL samples were acidified by adding 10 µL 6 M [ACS grade, Fisher Scientific, Pittsburgh, PA] and analyzed on a Waters 2695 HPLC system using a BioRad Aminex HPX-87H carbohydrate/organic acids analysis column with 5 mM H2SO4 as the mobile phase at a flow rate of 0.6 mL/min. Sugars were detected using a Waters 2414 detector. Organic acids were detected using a Waters 2996 UV absorption detector monitoring absorption at 210 nm. Calibration curves were prepared for concentration ranges of 0.1 to 10 g/L for sugars and 0.5 mM to 20 mM for organic acids, with the exception of acetic acid, which could only be detected to a minimum concentration of 1 mM. Sugar standards prepared included glucose, fructose, sucrose, and xylose. Organic acid standards included acetic, citric, gluconic, itaconic, lactic, oxalic, and succinic acids.

pH was measured using a Hanna Instruments HI1330B glass pH electrode and HI 2210 pH meter. Although this meter is designed to measure pH in the range of -2 to 16, the lowest pH standard allowed by the automatic calibration is 4.01. In order to test the accuracy for more acidic samples, the meter was first calibrated with pH 4.01 and 7.01 standards and then used to measure a pH 1.00 standard. This consistently gave a reading of 0.9, indicating that pH measurements between 1.0 and 4.0 should be within 0.1 pH units of the correct value.

Phosphate concentrations were determined using the BioVision Phosphate Colorimetric Assay Kit according to the manufacturer’s instructions.

2.2.7 Statistical analyses

Statistical analyses were performed in Python using the StatsModels module. A significance level of α = 0.05 was used for all analyses. Details of statistical methods used for each analysis are given below. P-values from statistical analyses are tabulated in Table 2.4 at the end of this chapter. For all analyses involving multiple comparisons, p-values were adjusted using the Šidák correction (Šidák 1967) to maintain a significance level of α = 0.05. The correction is given by:

$$p_{\mathrm{adjusted}} = 1 - \left(1 - p_{\mathrm{unadjusted}}\right)^{n}$$

where n is the number of comparisons performed. Average concentrations and amounts are reported as mean ± standard deviation for three replicates unless otherwise noted.
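The Šidák adjustment given above can be written directly as a small function. This is a minimal standalone sketch of the formula rather than the code used for the analyses; the function name and example values are illustrative.

```python
def sidak_adjust(p_unadjusted, n_comparisons):
    """Šidák-adjusted p-value: 1 - (1 - p)^n for n comparisons."""
    if not 0.0 <= p_unadjusted <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("number of comparisons must be at least 1")
    return 1.0 - (1.0 - p_unadjusted) ** n_comparisons


ALPHA = 0.05
# Hypothetical unadjusted p-values from three pairwise comparisons.
raw_p_values = [0.010, 0.020, 0.300]
adjusted = [sidak_adjust(p, len(raw_p_values)) for p in raw_p_values]
for raw, adj in zip(raw_p_values, adjusted):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.4f}, significant: {adj < ALPHA}")
```

With a single comparison the adjusted value equals the unadjusted one, and the adjustment grows as more comparisons are made.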

2.2.7.1 Analysis of biomass growth

Biomass (measured as VS) results were analyzed by performing pairwise comparisons to positive and negative controls using a two-tailed t-test for independent samples with unequal variance. This analysis was performed on the log transformed data, a common transformation for biomass and cell count data due to the typical positive skew of such data (Olsen 2003, Olsen 2014).
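A t-test for independent samples with unequal variance is commonly known as Welch's t-test. The sketch below computes the Welch test statistic and the Welch-Satterthwaite degrees of freedom on log transformed values using only the Python standard library. It is an illustration with hypothetical VS values, not the analysis code itself; base-10 logarithms are an assumption, although the statistic does not depend on the base. The two-tailed p-value would then be obtained from a t distribution with the computed degrees of freedom, for example through a statistics library.

```python
import math
from statistics import mean, variance


def welch_t_on_logs(sample_a, sample_b):
    """Welch t statistic and degrees of freedom for log10-transformed data."""
    log_a = [math.log10(x) for x in sample_a]
    log_b = [math.log10(x) for x in sample_b]
    # Squared standard error of each mean (sample variance uses n - 1).
    se2_a = variance(log_a) / len(log_a)
    se2_b = variance(log_b) / len(log_b)
    t_stat = (mean(log_a) - mean(log_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(log_a) - 1) + se2_b ** 2 / (len(log_b) - 1)
    )
    return t_stat, df


# Hypothetical VS values (g/L) for three replicates per condition.
monazite_vs = [2.0, 2.5, 3.0]
negative_control_vs = [0.20, 0.25, 0.30]
t_stat, df = welch_t_on_logs(monazite_vs, negative_control_vs)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```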

2.2.7.2 Analysis of bioleaching performance

Differences in performance under different growth conditions were analyzed using a two-tailed t-test for independent samples with unequal variance to compare total REE concentrations at the end of six days of incubation. For each organism, data from different growth conditions (e.g. different medium compositions) were compared by pairwise comparisons.

2.2.7.3 Analysis of proportional release of REEs and thorium

The proportional release of REEs during bioleaching was compared with the proportion of REEs present in monazite by using Hotelling’s T² test for two independent samples with unequal covariance matrices. Each data point represented four measured values, which were the proportions of La, Ce, Pr, and Nd, the dominant REEs in our monazite. Pairwise comparisons were performed to compare leachate from each of the organisms to the monazite composition.

Differences in the release of Th in proportion to REE release were analyzed using a two-tailed Student’s t-test for independent samples with unequal variance. Pairwise comparisons were performed to compare Th in leachate from each of the organisms to the monazite composition and to compare leachate from each of the organisms to each other.

2.2.7.4 Analysis of abiotic leaching with hydrochloric acid, organic acids, and spent medium

For analysis of abiotic leaching of REEs, a weighted linear model was used in order to analyze the effects of one continuous (pH) and one categorical (different acids or spent supernatant from different organisms) independent variable on the dependent variable (REE concentration). The sample variances were first estimated for each acid and for the supernatant from each organism. These variances were then used to determine weights in the model. In order to estimate the sample variances, a least squares linear fit between pH and REE concentration was first performed for the HCl data. The slope of this line was then used to estimate a linear fit for each of the other acids and for the supernatant data. The sum squared error in relation to that linear fit was used to estimate the variance for each and those variance estimates were used to weight the model.
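The variance estimation step described above can be sketched as follows. This is an interpretation under stated assumptions rather than the original analysis code: the slope is fitted to the HCl data by ordinary least squares, each other group keeps that slope but receives its own intercept, the variance is taken as the sum of squared errors divided by n − 1 (the divisor is an assumption), and the model weight is taken as the reciprocal of the variance. All data values and group names are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Least squares slope of y against x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def variance_with_fixed_slope(x, y, slope):
    """Residual variance about a line with the given slope and a fitted intercept."""
    intercept = mean(y) - slope * mean(x)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (len(x) - 1)  # divisor n - 1 is an assumption


def group_weights(hcl_data, groups):
    """Inverse-variance weight for each group, using the HCl-derived slope."""
    slope = ols_slope(*hcl_data)
    return {
        name: 1.0 / variance_with_fixed_slope(x, y, slope)
        for name, (x, y) in groups.items()
    }


# Hypothetical (pH, REE concentration) data, for illustration only.
hcl = ([1.8, 2.5, 3.0, 3.7], [9.0, 6.1, 4.2, 1.0])
groups = {
    "citric acid": ([2.6, 2.7, 2.6], [30.0, 27.5, 31.0]),
    "spent medium A": ([2.4, 2.5, 2.3], [60.0, 52.0, 66.0]),
}
print(group_weights(hcl, groups))
```

Groups with more scatter about their fitted line receive smaller weights, so noisier treatments contribute less to the fitted model. A group with no residual scatter would give a zero variance and would need separate handling.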

For analysis of abiotic leaching of Th, an initial least squares regression analysis of the data from HCl solutions ranging in pH from 1.8 to 3.7 showed no correlation between pH and Th solubilization. Therefore, data were analyzed using a two-tailed Student’s t-test for independent samples with unequal variance to do pairwise comparisons of each organic acid and supernatant solution to the HCl solution. Different concentrations of each organic acid were analyzed separately.

2.3 Results and discussion

2.3.1 Enrichment, isolation and identification of bioleaching microorganisms

Source inoculation materials for establishing REE-phosphate solubilizing enrichment cultures were collected from two different locations: tree root associated soil from the UC Berkeley campus and sand and sediment samples from Mono Lake in California. We hypothesized that root associated soil might yield PSMs because root associated microbial communities are known to support plant growth by improving nutrient availability (Rodríguez and Fraga 1999), and that since Mono Lake is an alkaline salt lake with high concentrations of heavy metals and REEs (Johannesson and Lyons 1994), it might serve as a source of microorganisms that are tolerant of high REE concentrations.

Enrichment cultures capable of utilizing monazite as their sole phosphate source were successfully cultivated from both source materials. Two known PSMs from culture collections (Aspergillus niger ATCC 1015, Burkholderia ferrariae FeG101) were also capable of growth on NBRIP-NDPO4 selective plates. Initial screening of these organisms identified the three most promising bioleaching organisms for further study, all of which were fungi: Aspergillus niger ATCC 1015, isolate ML3-1 from a Mono Lake enrichment culture, and isolate WE3-F from a tree root soil enrichment culture (Figure 2.1).

Figure 2.1. Initial characterization of REE solubilization from unground monazite by fungal and bacterial isolates. Note that these initial characterizations used unground monazite sand. Subsequent experiments were performed with ground monazite.

Sequences of 18S, 5.8S, and ITS regions of ML3-1 and WE3-F were determined and have been deposited in GenBank with accession numbers KM874778, KM874779, KM874780, and KM874781. Based on BLAST comparisons of these sequences to the NCBI nucleotide database, ML3-1 and WE3-F showed high sequence similarity (≥ 99%) to Aspergillus terreus and Paecilomyces sp. respectively. Microscopic observation of ML3-1 and WE3-F showed morphological characteristics consistent with these identifications. Previous studies have reported phosphate solubilizing activity for some strains of both the Aspergillus and Paecilomyces genera (Ahuja, Ghosh et al. 2007, Chuang, Kuo et al. 2007, Braz and Nahas 2012, Mendes, Vassilev et al. 2013).

2.3.2 Biomass growth during bioleaching

Biomass production after six days of growth with monazite as sole phosphate source was compared to growth with 0.4 g/L K2HPO4 as phosphate source (positive control) and to growth without added phosphate (negative control) (Figure 2.2). For all three organisms, growth on monazite resulted in average VS concentrations approximately tenfold greater than the negative control. This difference was statistically significant for all three organisms. Significant differences in VS production were not observed between growth on monazite and growth on K2HPO4, demonstrating that these organisms can utilize monazite as a phosphate source for growth.

Figure 2.2. Biomass production measured as VS after six days of incubation with different phosphate sources: 10 g/L monazite, 0.4 g/L K2HPO4 (positive control), or no added phosphate, i.e. growth on trace phosphate contamination in medium and inoculum (negative control). All incubations were in AMS medium with 10 g/L glucose. Error bars indicate 95% confidence intervals around the geometric means.

2.3.3 Bioleaching performance under different growth conditions

Several previous studies have tried to optimize the solubilization of phosphate minerals by PSMs (Asea, Kucey et al. 1988, Nautiyal 1999, Nautiyal, Bhadauria et al. 2000, Chuang, Kuo et al. 2007, Chai, Wu et al. 2011). Factors addressed include carbon source, nitrogen source, and medium composition, including variations in metals concentrations. In general, carbon source and medium composition were found to have significant effects, which sometimes varied between different organisms in the same study (Nautiyal 1999, Nautiyal, Bhadauria et al. 2000, Chai, Wu et al. 2011). Studies addressing nitrogen source found either minimal effect or a preference for ammonium, particularly for studies involving fungal PSMs (Asea, Kucey et al. 1988, Nautiyal, Bhadauria et al. 2000, Chuang, Kuo et al. 2007, Chai, Wu et al. 2011). Therefore, in this study, medium composition and carbon source were the focus of growth condition optimization while the nitrogen source was fixed as ammonium. Results of these experiments are shown in Figures 2.3, 2.4, and 2.5, and are discussed below.

Figure 2.3. Bioleaching of REEs from monazite under different growth conditions. Error bars indicate standard deviations around the means. (a) Different growth media: PVK medium, PVK medium without Fe or Mn, AMS medium, and NBRIP medium, all containing 10 g/L glucose as carbon source. (b) Different carbon sources: glucose, fructose, sucrose, xylose, and starch, all in AMS medium with initial carbon source concentrations of 10 g/L. (c) Different starting glucose concentrations: 5 g/L, 10 g/L, and 100 g/L, all in AMS medium.

Figure 2.4. Total sugar concentrations during bioleaching of monazite. Error bars indicate standard deviations around the means. (a) Different growth media: PVK medium, PVK medium without Fe or Mn, AMS medium, and NBRIP medium, all containing 10 g/L glucose as carbon source. (b) Different carbon sources: glucose, fructose, sucrose, xylose, and starch, all in AMS medium with initial carbon source concentrations of 10 g/L. For the sucrose medium, data include the sum of glucose, fructose, and sucrose concentrations. For the starch medium, data include only glucose concentrations rather than starch, which could not be determined by the detection method used. (c) Different initial glucose concentrations: 5 g/L, 10 g/L, and 100 g/L, all in AMS medium. For the 100 g/L glucose condition, glucose concentration remained above the scale of the graph throughout bioleaching.

Figure 2.5. pH during bioleaching of monazite. Error bars indicate standard deviations around the means. (a) Different growth media: PVK medium, PVK medium without Fe or Mn, AMS medium, and NBRIP medium, all containing 10 g/L glucose as carbon source. (b) Different carbon sources: glucose, fructose, sucrose, xylose, and starch, all in AMS medium with initial carbon source concentrations of 10 g/L. (c) Different initial glucose concentrations: 5 g/L, 10 g/L, and 100 g/L, all in AMS medium.

Varying growth media influenced REE solubilization performance (Figure 2.3-a), with the highest average REE solubilization for all organisms occurring with AMS medium which resulted in average total REE concentrations after six days of bioleaching of 86 ± 6, 101 ± 27, and 112 ± 16 mg/L for A. niger, ML3-1, and WE3-F respectively. These concentrations correspond to 3, 4, and 5% recovery of the total REEs present in the monazite sand. Growth on NBRIP medium consistently resulted in poor solubilization performance, with average total REE concentrations after six days of bioleaching of 30 ± 2, 28 ± 1, and 30 ± 2 mg/L for A. niger, ML3-1, and WE3-F respectively. In pairwise comparisons, the difference in REE solubilization between AMS medium and NBRIP medium was statistically significant for A. niger and WE3-F, and was marginally significant for ML3-1 (p = 0.065), likely due to high variability and low sample size.

Phosphate concentrations observed during bioleaching were much lower than REE concentrations. The molar ratio of REEs to phosphate in the monazite sand is expected to be ≈ 1. However, the observed molar ratio of REEs to phosphate in solution at the end of bioleaching was well above one, and varied among organisms and media compositions (Table 2.2). This ratio ranged from 5 ± 1 for ML3-1 grown on NBRIP medium to 170 ± 30 for A. niger grown on AMS medium. These data indicate that much of the phosphate associated with REEs in the monazite was either not released or was removed from solution. As indicated by the biomass measurements, some of the phosphate released from the monazite was used for biomass production. Estimates of phosphate incorporated into biomass, made by assuming a biomass phosphorus content of 3% of dry weight (Rittmann and McCarty 2001), suggest that 77 mg/L, 67 mg/L, and 61 mg/L phosphorus (i.e. 230 mg/L, 210 mg/L, and 190 mg/L phosphate) would need to be taken up to support the 2.6 g/L, 2.2 g/L, and 2.0 g/L biomass generated by A. niger, ML3-1, and WE3-F respectively after six days of growth on AMS with 10 g/L glucose. Assuming equimolar release of REEs and phosphate from the monazite, the amount of phosphate required to support biomass growth is two to six times greater than the REE quantities released under these conditions, suggesting that phosphate consumption for growth accounts for the low phosphate concentrations in solution. Altomare et al. observed a similar reduction in phosphate concentration concurrent with an increase in calcium solubilization during growth of Trichoderma harzianum Rifai 1295-22 on hydroxyapatite, which they also attributed to phosphate uptake by the organism (Altomare, Norvell et al. 1999). The apparently low REE concentration compared to the estimated phosphate in the biomass may be due to the inherent uncertainty in the estimation of biomass as VS and the assumption of 3% phosphorus content. This discrepancy may also be due to the removal of REEs from the system by other processes including re-precipitation (e.g. as REE-oxalates) (Gadd 1999) or adhesion to microbial cells (Moriwaki and Yamamoto 2013). If these processes are occurring, they may limit the potential for total recovery of REEs by bioleaching.
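
The biomass phosphorus estimate can be retraced with a short calculation. This is a sketch only: the function name and the molar masses used for the phosphorus to phosphate conversion are supplied here for illustration, and the rounded biomass values quoted in the text are used as inputs, so the results agree with the reported figures only to within rounding.

```python
P_MOLAR_MASS = 30.97    # g/mol, phosphorus
PO4_MOLAR_MASS = 94.97  # g/mol, phosphate (PO4)

def phosphate_demand(biomass_g_per_l, p_fraction=0.03):
    """Return (phosphorus, phosphate) in mg/L needed to build the given biomass.

    p_fraction is the assumed phosphorus content of dry biomass (3% in the text).
    """
    phosphorus = biomass_g_per_l * p_fraction * 1000.0  # g/L to mg/L
    phosphate = phosphorus * PO4_MOLAR_MASS / P_MOLAR_MASS
    return phosphorus, phosphate

# Rounded VS concentrations after six days on AMS medium with 10 g/L glucose.
for organism, biomass in [("A. niger", 2.6), ("ML3-1", 2.2), ("WE3-F", 2.0)]:
    p, po4 = phosphate_demand(biomass)
    print(f"{organism}: {p:.0f} mg/L phosphorus, {po4:.0f} mg/L phosphate")
```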

Table 2.2. Molar ratio of total REEs to phosphate measured after bioleaching with different media compositions (mean ± standard deviation).

Medium                  A. niger    ML3-1     WE3-F
PVK                     29 ± 23     11 ± 1    126 ± 8
PVK without Fe or Mn    62 ± 98     14 ± 2    68 ± 45
AMS                     166 ± 25    9 ± 4     102 ± 78
NBRIP                   8 ± 4       5 ± 1     5 ± 0

Previous PSM studies of Ca3(PO4)2 solubilization have reported phosphate concentrations in liquid cultures on the order of 250 to 520 mg/L (Rodríguez and Fraga 1999, Chen, Rekha et al. 2006, Chuang, Kuo et al. 2007, Chai, Wu et al. 2011, Scervino, Papinutti et al. 2011). In contrast, the maximum phosphate concentration observed in this study during monazite solubilization with different media compositions was 15 mg/L (ML3-1 grown on AMS medium). Differences in solubilization between different phosphate minerals have been previously reported, even when using the same PSMs (Illmer and Schinner 1995, Rodríguez and Fraga 1999, Souchie, Azcón et al. 2006, Chuang, Kuo et al. 2007, Delvasto, Valverde et al. 2008, Adeleke, Cloete et al. 2010). Also, REE-phosphates are known to have particularly low solubilities in water, on the order of 10⁻¹³ M (10⁻¹¹ g/L) (Firsching and Brune 1991), whereas the solubility of Ca3(PO4)2 is 3.9×10⁻⁶ M (0.0012 g/L) (Haynes ed. 2015).

With AMS medium and both versions of PVK medium, glucose was completely or almost completely consumed (≤ 0.6 g/L remaining) by the end of the experiment (Figure 2.4-a). In contrast, with NBRIP medium, glucose concentrations were only reduced to 6.3 ± 0.1, 7.4 ± 0.1, and 7.1 ± 0.0 g/L for A. niger, ML3-1, and WE3-F respectively. Growth on NBRIP medium also resulted in smaller reductions in pH than growth on other media (Figure 2.5-a).

In the study by Nautiyal that introduced NBRIP medium, several versions of the medium were compared with several modifications of Pikovskaya medium, including the yeast extract free version used in this study (PVK) (Nautiyal 1999). They showed significantly enhanced solubilization of phosphate from Ca3(PO4)2 by a variety of bacterial strains (five Pseudomonas and three Bacillus strains) with NBRIP medium. However, the generally poor performance of NBRIP medium in this study with fungi indicates that despite its widespread use in phosphate solubilization studies, NBRIP medium is not well suited for some PSMs and/or solubilization of some phosphate minerals.

Among the five carbon sources tested, there was no clear over-performer (Figure 2.3-b). For ML3-1 and WE3-F, REE solubilization profiles were similar for all carbon sources tested. REE solubilization performance for A. niger was much more variable between replicates with the same carbon source. However, pairwise comparison of REE solubilization revealed an apparent (and statistically significant) preference by A. niger for starch over fructose. In contrast to the variability in REE solubilization, pH and carbon source consumption profiles were similar for A. niger for all carbon sources tested, as they also were for the other two isolates (Figures 2.4-b and 2.5-b).

In the glucose concentration range tested (5 g/L, 10 g/L, and 100 g/L), higher glucose concentrations did not correspond to improved REE solubilization for ML3-1 and WE3-F (Figure 2.3-c). For A. niger, the performance was again quite variable, and although the average REE concentration was highest for 100 g/L glucose, this difference was not statistically significant. The pH reduction was comparable for all glucose concentrations tested (Figure 2.5-c). Interestingly, for the lowest glucose concentration (5 g/L), the glucose was consumed by the fourth day (Figure 2.4-c), but REE concentrations continued to rise through the end of the experiment. For the highest glucose concentration (100 g/L), glucose levels remained above 10 g/L for the entire experiment. These data indicate that glucose availability was not the limiting factor for bioleaching under the conditions tested.

Although we found no previous studies of bioleaching of REEs from monazite, one study examined bioleaching of REEs from red mud, a byproduct of bauxite ore processing for alumina production (Qu and Lian 2013). In that study, a fungus, Penicillium tricolor RM-10, was isolated and its REE leaching abilities were evaluated. For direct bioleaching of the red mud, a similar process to the monazite bioleaching in this study, they reported leaching efficiencies of 20% to 40% for total REEs (10% to 80% for individual REEs) depending on the amount of red mud provided. The highest red mud concentrations corresponded to the lowest efficiency, indicating that the process may have been approaching a solubility limitation at the highest red mud concentration. The leaching efficiency for monazite bioleaching in this study under standard conditions (AMS medium, 10 g/L glucose) was 3% to 5% of total REEs present in the monazite. Although bioleaching efficiency for the red mud was higher than for the monazite, the absolute REE concentrations in the red mud leachate ranged from 20 to 60 mg/L total REEs, compared to 60 to 120 mg/L for monazite bioleaching by ML3-1 and WE3-F in this study.

Given the differences in the two studies, it is not surprising that leaching efficiencies differ. Some of the most obvious differences are the ores (red mud vs. monazite) and the experimental time scales (50 days vs. 6 days). Additionally, for monazite bioleaching, the monazite served as a phosphate source for growth, whereas phosphate was provided in the growth medium for red mud bioleaching. The stress caused by the low phosphate concentration during monazite bioleaching may have induced a different bioleaching mechanism. Differences in pH may have also affected the process. Since the red mud was highly alkaline, the initial pH for red mud bioleaching was between 9 and 11, as compared to 5 for monazite bioleaching. However, during red mud bioleaching, the pH dropped to acidic conditions for all but the highest red mud concentrations tested.

2.3.4 Proportional release of REEs and thorium during bioleaching

Proportions of REEs and Th in monazite (seven replicates) and in bioleaching supernatant (nine replicates for each organism) are shown in Figure 2.6. The monazite sand used in this study is dominated by Ce, La, Nd, and Pr, and the bioleaching supernatant reflected this composition. Release of Th during bioleaching was low in proportion to REEs. For standard growth conditions (AMS medium, 10 g/L glucose), averages for released Th were 0.026 ± 0.046, 0.0003 ± 0.0001, and 0.0028 ± 0.0039 mole Th per mole REEs for A. niger, ML3-1, and WE3-F respectively (nine replicates each). In comparison, the monazite contained 0.11 ± 0.02 mole Th per mole REEs (seven replicates). Differences in Th release between organisms were not statistically significant.

Figure 2.6. Proportions of (a) REEs and (b) Th in monazite and in bioleaching supernatant after six days of bioleaching. Concentrations are normalized to total REE content of samples. Error bars indicate standard deviations around the means. Bioleaching samples are from growth on AMS medium with 10 g/L glucose.

Proportions of REEs in bioleaching supernatant generally reflect the proportions present in the monazite (Figure 2.6-a). Analysis revealed only small, but statistically significant differences between the REE composition of the bioleaching supernatant and that of the monazite. Supernatant from all organisms contained a slightly higher proportion of La as compared to the monazite and slightly lower proportions of Ce and Nd. Some previous studies suggest possible explanations for these variations in REE proportions. For example, in their study of REE release from monazite by organic acids, Goyne et al. found that several organic acids preferentially released Nd over Ce and La (Goyne, Brantley et al. 2010). Another possible contributing factor is the preferential adsorption of some REEs to microbial cell walls after release from monazite, as was shown previously for a number of different bacteria (Moriwaki and Yamamoto 2013).

2.3.5 Organic acid production during bioleaching

Organic acid production was observed for all organisms, with each organism producing a different set of acids, some of which could be identified based on known standards. For a given organism, organic acid production was variable, and not all acids were detected in all biological replicates. Table 2.3 lists the maximum observed concentrations for identified organic acids during bioleaching experiments along with the percentage of bioleaching flasks for which each acid was detected.

Table 2.3. Maximum observed concentrations of identified organic acids produced by three fungal isolates during bioleaching and percentage of bioleaching flasks for which each acid was detected.

                A. niger                      ML3-1                         WE3-F
Organic acid    Maximum        Percentage    Maximum        Percentage    Maximum        Percentage
                concentration  of flasks     concentration  of flasks     concentration  of flasks
Acetic          -              0%            -              0%            3.8 mM         8%
Citric          15.9 mM        78%           -              0%            -              0%
Gluconic        5.3 mM         17%           -              0%            1.2 mM         67%
Itaconic        -              0%            >20 mM         97%           -              0%
Lactic          -              0%            -              0%            -              0%
Oxalic          2.0 mM         17%           -              0%            -              0%
Succinic        1.6 mM         56%           4.0 mM         28%           5.4 mM         11%

A. niger produced citric, gluconic, oxalic, and succinic acids. A. niger is known to produce these acids and is used industrially to produce citric and gluconic acids (Magnuson and Lasure 2004, Papagianni 2004). Optimization of A. niger acid production has revealed that low pH (< 2) favors citric acid production while higher pH (> 4) favors gluconic and oxalic acid production (Magnuson and Lasure 2004, Ramachandran, Fontanille et al. 2006). In this study, the pH of the A. niger bioleaching cultures ranged from 2.0 to 2.8 in the later part of the bioleaching process (t = 4 or 6 days), closer to the optimal conditions for production of citric acid. The production of higher concentrations of oxalic acid corresponded with lower concentrations of REEs, which is consistent with the known low solubility of REE-oxalates (Gadd 1999). Oxalic acid production by A. niger was observed at the end of some bioleaching experiments, and is likely responsible for the decrease in soluble REEs observed on the final day of these experiments.

ML3-1 produced primarily itaconic and succinic acids and WE3-F produced acetic, gluconic, and succinic acids. As noted above, ML3-1 showed high sequence similarity to A. terreus, some strains of which have been used industrially to produce itaconic acid (Magnuson and Lasure 2004). A. niger and WE3-F also produced some compounds that generated large peaks in the HPLC UV absorbance chromatogram, but could not be identified based on the available standards. Other PSM studies have also observed additional compounds presumed to be other organic acids potentially involved in phosphate solubilizing activity (Chen, Rekha et al. 2006).

2.3.6 Abiotic leaching with hydrochloric acid and organic acids

Leaching with inorganic hydrochloric acid solutions representing a range of acidities (five pHs ranging from 1.8 to 3.7) indicated an inverse correlation between pH and REE solubilization that was approximately linear within the range tested (r² = 0.96) (Figure 2.7-a). Leaching with the most acidic solution (pH 1.8) resulted in the greatest REE solubilization, achieving a concentration of 19 ± 2 mg/L. This inverse relationship is consistent with what is expected for this pH range. Oelkers and Poitrasson studied monazite solubility in HCl acidified water (Oelkers and Poitrasson 2002). Although their results cannot be directly compared to the results of this study due to the different experimental methods, they also found an inverse relationship between pH and REE solubilization.

All organic acids tested, with the exception of oxalic acid, leached REEs from monazite to concentrations greater than 1 mg/L (Figure 2.7-a). The low observed REE solubilization with oxalic acid, even at low pH, is consistent with the known insolubility of REE-oxalates. Because this behavior is known and is particular to oxalic acid, oxalic acid was excluded from the statistical analysis of abiotic leaching of REEs by other acids and spent supernatant.

For acetic, gluconic, itaconic, and succinic acids, the solubilization of REEs was not significantly different from what would be expected for the direct effect of pH. However, for citric acid, REE solubilization was slightly higher (approximately 3 mg/L, statistically significant) than would be expected based solely on pH reduction. Goyne et al. studied the ability of several organic acids to dissolve REEs from monazite, and found that citrate leached more REEs than the other acids tested (Goyne, Brantley et al. 2010). However, the observed REE solubilization levels for all organic acids tested (≤ 18 mg/L) were substantially lower than those observed for the active cultures (averages 60-120 mg/L for ML3-1 and WE3-F depending on growth conditions).

Figure 2.7. Abiotic leaching of REEs from monazite by HCl solutions, organic acids, and bioleaching spent medium. Grey lines show the least squares fit to the HCl data (r² = 0.96). (a) Leaching with organic acids compared to HCl. (b) Leaching with spent medium from bioleaching compared to HCl. REE concentrations observed at the end of bioleaching are shown with unfilled markers for comparison.

With respect to solubilization of radioactive Th during abiotic monazite leaching, a correlation was not detected between pH and Th solubilization (Figure 2.8) and solubilization of Th was low overall in the HCl solutions (≤ 0.01 mg/L in 14 of 15 samples). Citric and oxalic acids solubilized Th from monazite significantly more than HCl solutions (1.0 ± 0.1 mg/L, 1.4 ± 0.1 mg/L, 0.5 ± 0.1 mg/L, and 3.2 ± 0.1 mg/L for 2 mM citric, 20 mM citric, 2 mM oxalic, and 20 mM oxalic respectively) (Figure 2.9). Acetic, gluconic, itaconic, and succinic acids did not solubilize Th, resulting in Th concentrations below 0.1 mg/L for each acid tested.

Figure 2.8. Relationship between pH and solubilization of Th for abiotic leaching of monazite with solutions of HCl. (a) All data (b) Data excluding apparent outlier at pH = 2.8, [Th] = 0.41 mg/L. Lines show least squares linear fits to the data. Neither linear fit is statistically significant.

Figure 2.9. Abiotic leaching of Th by different organic acids and by spent medium from three bioleaching organisms. Error bars indicate sample standard deviations around the means. (n = 15 for HCl, n = 3 for each concentration of organic acid, n = 6 for spent supernatant for each organism)

2.3.7 Abiotic leaching with spent medium from bioleaching

After six days of growth with monazite, medium from the fungal bioleaching experiments (six biological replicates for each organism) was filtered to remove cells and treated with Amberlite IR-120 resin to remove REEs from solution, reducing total REE concentrations to less than 0.8 mg/L. The spent medium samples were then tested for monazite solubilization capabilities. Spent medium from ML3-1 and WE3-F solubilized REEs to levels above what would be expected based on the low pH of the spent medium (Figure 2.7-b). Furthermore, citric acid was not detected in medium from bioleaching with these organisms (Table 2.3), so no additional REE solubilization could be attributed to citric acid. Spent medium from A. niger was not effective at leaching REEs from monazite, likely due to the presence of oxalic acid, a known REE precipitant (Gadd 1999). Spent medium from A. niger solubilized Th significantly more than the HCl solutions while spent medium from ML3-1 and WE3-F did not (Figure 2.9).

The ability of spent medium to leach REEs from monazite indicates that the presence of microorganisms is not necessary for at least some portion of the observed solubilization. However, the higher REE concentrations observed for active bioleaching compared to spent medium indicate that the microorganisms’ presence promotes the most effective leaching. One contributing factor may be consumption of phosphate by the microorganisms, which hinders re-precipitation. As noted above, the high molar ratios of REEs to phosphate during bioleaching indicate that the majority of phosphate released from monazite during bioleaching is removed from solution for incorporation into biomass.

These data indicate that both ML3-1 and WE3-F release as yet unidentified compounds into solution that are more effective than the identified organic acids at solubilizing REEs from monazite. Based on these results, ML3-1 and WE3-F are more promising organisms for the development of bioleaching for processing monazite than A. niger. This study provides a proof of concept for such a bioleaching process. Further study is needed to understand bioleaching mechanisms and to optimize the process to achieve an economically viable alternative to conventional REE extraction processes.

2.3.8 Statistical analyses results

Table 2.4. P-values for statistical analyses reported in the text for bioleaching and abiotic leaching of monazite. For analyses involving multiple comparisons, p-values are Šidák adjusted. Unless otherwise noted, only p-values indicating statistical significance (p < 0.05) are given.

Comparison
    Condition            p-value (Šidák adjusted for multiple comparisons)

Growth difference between monazite and negative control (Figure 2.2)
    A. niger             0.0028
    ML3-1                0.0013
    WE3-F                0.017

REE solubilization differences between AMS and NBRIP (Figure 2.3-a)
    A. niger             0.0029
    ML3-1                0.065 (marginally significant)
    WE3-F                0.0018

REE solubilization differences between fructose and starch (Figure 2.3-b)
    A. niger             0.045

Proportional release of Th during bioleaching in comparison to Th content of monazite (Figure 2.6-b)
    A. niger             0.0013
    ML3-1                <0.0001
    WE3-F                <0.0001

Proportional release of different REEs in comparison to REE proportions in monazite (Figure 2.6-a)
    A. niger             0.037
    ML3-1                <0.0001
    WE3-F                <0.0001

Linear correlation between REE solubilization and pH (Figure 2.7-a)
    HCl solutions        <0.0001

REE solubilization differences between organic acids / spent medium and HCl control (Figure 2.7-a and 2.7-b)
    citric acid          0.0001
    ML3-1                0.0003
    WE3-F                <0.0001

Th solubilization difference between organic acids / spent medium and HCl (Figure 2.9)
    2 mM citric acid     0.0079
    20 mM citric acid    0.0005
    2 mM oxalic acid     0.0008
    20 mM oxalic acid    0.0019
    A. niger             0.015
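
For reference, the Šidák adjustment for a family of m comparisons converts an unadjusted p-value p into 1 - (1 - p)^m. The sketch below illustrates that formula only. The function name is hypothetical, and the number of comparisons in each family is not stated in this section, so the value passed as `n_comparisons` is an assumption the reader must supply.

```python
def sidak_adjust(p_value, n_comparisons):
    """Šidák-adjusted p-value for one test out of n_comparisons independent tests."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    # Probability of at least one p-value this small among n_comparisons tests.
    return 1.0 - (1.0 - p_value) ** n_comparisons
```

With a single comparison the adjusted and unadjusted values coincide, which is why analyses with only one comparison need no adjustment.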

Chapter 3:

Metabolomic Analysis of a Monazite Bioleaching Fungus

3.1 Introduction

Bioleaching of monazite by phosphate solubilizing microorganisms (PSMs) offers a possible alternative to conventional monazite extraction, potentially resulting in a more environmentally sustainable extraction process. PSMs are microorganisms that have the ability to solubilize phosphate ions from otherwise insoluble phosphate compounds and minerals (Rodríguez and Fraga 1999). As was demonstrated in Chapter 2, some PSMs are capable of releasing REEs from monazite sand and thus could be useful for a potential monazite bioleaching process. The organism used in this study is a fungal monazite bioleaching PSM, designated as WE3-F and identified as a Paecilomyces species. The isolation, identification, and initial characterization of this organism were described in Chapter 2. This organism was selected for further study based on its consistent bioleaching performance in that study.

Current understanding of the mechanisms of phosphate solubilization by PSMs indicates that two main contributing factors are acidification of the medium and the formation of complexes between organic acids produced by the PSMs and cations associated with phosphate in the mineral and released during solubilization (Bolan, Naidu et al. 1994, Rodríguez and Fraga 1999, Nautiyal, Bhadauria et al. 2000, Gyaneshwar, Naresh Kumar et al. 2002, Arcand and Schneider 2006, Scervino, Papinutti et al. 2011). The investigation of monazite bioleaching described in Chapter 2 indicated that although both acidification and complexation with citric acid were able to contribute to monazite leaching, these contributions did not account for the levels of leaching seen during bioleaching or when leaching with spent bioleaching medium. Therefore, in order to better understand the bioleaching process, another approach was necessary to identify a larger array of small molecules released during bioleaching that might be associated with bioleaching effectiveness.

Untargeted metabolomics technologies provide the opportunity to accurately detect a large number of different organic molecules and compare relative concentrations across different conditions, providing insight into biological processes. Metabolomic analyses applied to excreted metabolites are sometimes referred to as exometabolomics or metabolic footprinting (Kell, Brown et al. 2005). Metabolic footprinting has been applied to investigate other eukaryotic microbial processes including wine production by yeast and microalgae growth in bioreactors (Howell, Cozzolino et al. 2006, Sue, Obolonkin et al. 2011, Richter, Dunn et al. 2013).

In this study an untargeted metabolomics approach using gas chromatography time of flight mass spectrometry (GC-TOF-MS) was used to analyze metabolites excreted into the growth medium during monazite bioleaching under two different growth conditions: growth with monazite as the only phosphate source (using phosphate limitation to force monazite solubilization) and growth with the addition of a soluble phosphate source (relieving the phosphate limitation stress). This analysis had two parallel goals. One was to identify metabolites excreted into solution that may contribute to monazite solubilization, and the second was to examine the effects of phosphate availability on growth and metabolic processes of a bioleaching microorganism.

3.2 Materials and methods

3.2.1 Organism and bioleaching growth conditions

Bioleaching experiments were performed with monazite bioleaching fungal isolate WE3-F, whose isolation and identification as a Paecilomyces species were described in Chapter 2.

Growth conditions were based on those described in Chapter 2 with some modifications. Briefly, bioleaching was conducted in 250 mL Erlenmeyer flasks, each containing 0.5 g ground monazite sand [City Chemical LLC, West Haven, CT] (finer than 200 mesh) and 50 mL modified ammonium salts medium (AMS medium) (Parales, Adamus et al. 1994). AMS medium contained 1.0 g/L MgSO4·7H2O, 0.2 g/L KCl, 0.66 g/L (NH4)2SO4, 1.0 mL/L 1000x trace elements stock solution, and 1.0 mL/L stock A. The 1000x trace elements stock solution contained 0.5 g/L FeSO4·7H2O, 0.4 g/L ZnSO4·7H2O, 0.02 g/L MnSO4·H2O, 0.015 g/L H3BO3, 0.01 g/L NiCl2·6H2O, 0.25 g/L EDTA, 0.05 g/L CoCl2·6H2O, and 0.005 g/L CuCl2·2H2O. Stock A contained 5 g/L FeNaEDTA and 2 g/L Na2MoO4·2H2O. Glucose (10 g/L) was added as a carbon and energy source, and air in the headspace served as the oxygen source. Each flask was inoculated with 1 mL of spore suspension containing approximately 10⁷ CFU and sealed with a foam stopper. Flasks were stirred continuously at 250 RPM and incubated at 28 °C for the duration of the bioleaching experiment.

Two different growth conditions were compared to study the effects of a soluble phosphate source: growth with monazite only and growth with K2HPO4 and monazite. For the K2HPO4 and monazite condition flasks, 0.4 g/L K2HPO4 was added.

3.2.2 Quantification of REEs, Th, phosphate, glucose, pH and biomass

REE, Th, phosphate, glucose, pH, and biomass were quantified by the analytical methods described in Chapter 2. Briefly, REE and Th concentrations were measured by ICP-MS. Phosphate concentration was measured by colorimetric assay. Glucose was measured by HPLC with refractive index detection. pH was measured using a Hanna Instruments HI 2210 pH meter. Biomass was measured as total volatile solids by drying and subsequent ashing of filter collected samples. REE, Th, phosphate, glucose, and pH measurements were taken for six biological replicates for each time point (0, 2, 4, and 6 days after inoculation), while biomass measurements were taken for three biological replicates at time points 2, 4, and 6 days.

3.2.3 Metabolomic analysis

Samples of bioleaching supernatant were collected, filtered through 0.2 µm syringe filters to remove cells, and immediately frozen and stored at -80 ºC. Six replicate samples were collected at each time point (0, 2, 4, and 6 days after inoculation). Metabolomic analysis was performed by the West Coast Metabolomics Center at the University of California, Davis. At the Metabolomics Center, the samples were extracted and a silylation derivatization with N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) was performed prior to analysis by GC-TOF-MS.

Hierarchical clustering of metabolites based on concentration profiles was performed in Python using the SciPy cluster module. Signal intensity data for each metabolite were first centered by subtracting the mean signal intensity for that metabolite, and normalized by dividing by the standard deviation. Hierarchical clustering was performed using the “complete” method, also called the farthest point algorithm, with Euclidean distances.
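The normalization and clustering steps described above can be sketched as follows. This is a minimal illustration and not the analysis script used in this work: the function name, the example intensities, the use of the population standard deviation, and the choice to cut the tree into a fixed number of groups are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_metabolites(intensities, n_clusters):
    """Z-score each metabolite profile, then apply complete-linkage clustering."""
    x = np.asarray(intensities, dtype=float)
    # Center each row (one metabolite) on its mean and scale by its standard deviation.
    # Assumption: population standard deviation (ddof=0); the text does not specify.
    z = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)
    # "complete" is SciPy's farthest point algorithm; distances are Euclidean.
    tree = linkage(z, method="complete", metric="euclidean")
    # Cutting the tree into a fixed number of groups is an illustrative choice.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return z, labels


# Hypothetical signal intensities: rows are metabolites, columns are samples.
example = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [4.0, 3.0, 2.0, 1.0],
    [400.0, 300.0, 200.0, 100.0],
]
z_scores, labels = cluster_metabolites(example, n_clusters=2)
```

Because each profile is scaled by its own standard deviation, metabolites with the same temporal shape group together regardless of absolute signal intensity. A metabolite with constant intensity would have zero standard deviation and would need to be excluded before this step.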

3.2.4 Identification of metabolites of potential bioleaching importance

Metabolites that were potentially relevant to bioleaching performance were identified by three methods. The first method identified metabolites that were released at higher concentrations under the monazite only condition than under the K2HPO4 plus monazite condition. Signal intensities for each metabolite were compared between the two conditions using a two-tailed t-test for independent samples. P-values were corrected for multiple comparisons using the Benjamini-Hochberg correction for false discovery rate for independent samples (Benjamini and Hochberg 1995). For this analysis only, all metabolites whose p-values were marginally significant (p < 0.1) were selected for further study. This less stringent p-value criterion was used at this intermediate stage in order to identify a large number of metabolites for the final set of experiments. This analysis was performed independently for time points 2, 4, and 6 days.
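The false discovery rate correction used in this first method can be illustrated with a short sketch of the Benjamini-Hochberg step-up adjustment. The raw p-values below are hypothetical, and the t-tests that would produce them are assumed to have been run already; the function name is chosen for the example only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four metabolites at one time point.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
# Marginal significance threshold described in the text (p < 0.1).
selected = [i for i, p in enumerate(adj_p) if p < 0.1]
```

In this sketch the selection considers only the adjusted p-value. The analysis described above additionally requires the signal intensity to be higher under the monazite only condition.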

The second approach to selecting metabolites of interest was to identify correlations between metabolite concentration (signal intensity) and REE concentration. This analysis was performed on data from the monazite only condition, using measurements of metabolite concentrations and REE concentrations at each time point. A least squares linear regression was performed to identify correlations. P-values were corrected for multiple comparisons using the Šidák correction (Šidák 1967), as described in Chapter 2. Metabolites whose linear regression had a positive slope and a significant corrected p-value (p < 0.05) were selected for further study.
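The selection rule for this second approach can be sketched as below. The sketch assumes the standard form of the Šidák adjustment, 1 - (1 - p)^m for m comparisons, and takes the regression p-values as given; the slopes and p-values are hypothetical, and the helper names are chosen for the example only.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def sidak_adjust(p_values):
    """Sidak adjustment for m comparisons (assumed standard form)."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Hypothetical regression results for three metabolites.
slopes = [2.0, 1.5, -0.3]
raw_p = [0.001, 0.02, 0.0005]
adj_p = sidak_adjust(raw_p)
# Keep metabolites with a positive slope and a significant corrected p-value.
selected = [i for i, (s, p) in enumerate(zip(slopes, adj_p)) if s > 0 and p < 0.05]
example_slope = least_squares_slope([0, 1, 2, 3], [1, 3, 5, 7])
```

The second metabolite in this example is significant before correction (p = 0.02) but not after it, and the third is excluded because its slope is negative.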

The final approach was to select metabolites with the highest signal intensities. Metabolites whose average signal intensity was greater than 10^5 for any condition and time point were selected for further study.

3.2.5 Abiotic leaching conditions

Abiotic leaching conditions were a modification of those used in Chapter 2. Leaching was conducted in 50 mL flat bottomed polypropylene tubes, each containing 0.1 g ground monazite sand (200 mesh). 10 mL leaching solution was added to autoclaved tubes and stirred for 48 hours at 250 rpm at room temperature (25 to 28 ºC). All leaching solutions were tested in triplicate.

Leaching solutions contained selected metabolites at a concentration of 10 mM, with the exception of stearic acid. Stearic acid, whose solubility in water is extremely low (0.003 g/L or 0.01 mM at 20 ºC) (Anneken, Both et al. 2000), was dissolved in water for 20 minutes with vortexing and filtered to remove undissolved particles. Additionally, a combined leaching solution containing all selected metabolites, each at a concentration of 10 mM (except for stearic acid), was also tested. All leaching solutions were adjusted to pH 2.5 by the addition of HCl in order to mimic the pH observed during bioleaching and to eliminate the effects of variations in pH observed in Chapter 2. Leaching solutions were filter sterilized through 0.2 µm syringe filters prior to leaching experiments.

Statistical significance of leaching effectiveness was determined using a two-tailed t-test for independent samples to compare REEs released by each leaching solution to a control solution of HCl at a pH of 2.5. P-values were corrected for multiple comparisons using the Šidák correction, as described in Chapter 2.

3.2.6 Gel permeation chromatographic separation of REE complexes and free REEs

Conditions for gel permeation chromatography experiments were based on those described by Altomare et al. for separation of iron and manganese complexes (Altomare, Norvell et al. 1999), and modified for application to REE bioleaching samples.

Low pressure chromatography experiments were conducted with Econo-Column glass columns (1 cm diameter, 20 cm length) [Bio-Rad Laboratories Inc., Hercules, CA] packed to a bed height of 15 cm with BioGel P2 Polyacrylamide Gel [Bio-Rad Laboratories Inc., Hercules, CA] according to the manufacturer’s instructions. Two columns were prepared, one at circumneutral pH and one at pH 2.5. The solvent for the circumneutral column was 20 mM NaCl in water. The solvent for the pH 2.5 column contained 20 mM NaCl in water adjusted to pH 2.5 with HCl.

200 µL samples were injected via stopcock and Econo-Column flow adapter [Bio-Rad Laboratories Inc., Hercules, CA]. Solvent flow rate was maintained at 0.250 mL/min using an ISMATEC IPC High Precision Multichannel Dispenser [IDEX Health & Science SA, Glattbrugg, Switzerland]. Effluent was collected in 1.25 mL (5 minute) fractions for 2 hours for each sample. The column was flushed for an additional hour before the next sample was applied.
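The fraction collection scheme above fixes the relationship between fraction number and retention time. The short sketch below works through that arithmetic; it assumes that retention time is counted from the start of fraction collection and that fractions are numbered from 1, neither of which is stated explicitly in the text.

```python
FLOW_ML_PER_MIN = 0.250  # solvent flow rate
FRACTION_MIN = 5         # collection time per fraction
RUN_MIN = 120            # total collection time per sample (2 hours)

fraction_volume_ml = FLOW_ML_PER_MIN * FRACTION_MIN
n_fractions = RUN_MIN // FRACTION_MIN


def fractions_for_window(start_min, end_min):
    """Return 1-based fraction numbers collected within [start_min, end_min)."""
    first = start_min // FRACTION_MIN + 1
    last = (end_min - 1) // FRACTION_MIN + 1
    return list(range(first, last + 1))


# Fractions spanning a retention time window of 80 to 90 minutes.
peak_fractions = fractions_for_window(80, 90)
```

Under these assumptions each sample yields 24 fractions of 1.25 mL, and a peak at 80 to 90 minutes falls in fractions 17 and 18.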

For circumneutral pH experiments, controls contained 2 mM NdCl3 with or without 1 mM disodium EDTA. For pH 2.5 experiments, controls contained 0.1 mM NdCl3. The pH 2.5 citric acid control contained 10 mM citric acid, and the pH 2.5 EDTA control contained 10 mM disodium EDTA. Controls were adjusted to pH 2.5 with HCl. Bioleaching samples were filtered with 0.2 µm syringe filters to remove cells and frozen and stored at -80 ºC prior to chromatography experiments.

3.3 Results and discussion

3.3.1 Bioleaching performance

The results of monazite bioleaching with and without a soluble phosphate source (K2HPO4) are summarized in Figure 3.1. REE solubilization was greater for the monazite only condition, when a soluble phosphate source was not provided, reaching concentrations of 42 ± 15 mg/L after six days of leaching (Figure 3.1-a). This is consistent both with forcing the organisms to solubilize phosphate for growth and with possible re-precipitation of REE-PO4 in the medium that contains K2HPO4 at a relatively high phosphate content. However, some solubilization of REEs did occur in the cultures provided with K2HPO4, reaching concentrations of 14 ± 9 mg/L after six days of bioleaching. Although Th release was small for both conditions, it was consistently greater for the monazite only condition (0.6 ± 0.3 mg/L) than for K2HPO4 plus monazite (0.04 ± 0.02 mg/L).

Figure 3.1. Bioleaching of monazite in the absence or presence of soluble phosphate (K2HPO4). Shown are (a) REE concentrations, (b) Th concentrations, (c) phosphate concentrations, (d) glucose concentrations, (e) pH, and (f) biomass measured as volatile solids. REE, phosphate, glucose, and pH data are for six biological replicates. Biomass data are for three biological replicates. Error bars indicate standard deviations around the mean.

Free phosphate concentrations (Figure 3.1-c) remained very low (maximum observed concentration in a single sample: 0.005 mM) when monazite was the only phosphate source. When K2HPO4 was added to the medium, phosphate levels decreased from their initial concentration but still remained high throughout the experiment (minimum observed concentration in a single sample: 0.68 mM). This indicates that the concentration of K2HPO4 provided was sufficient to avoid phosphate limiting conditions during bioleaching for this growth condition.

Glucose consumption (Figure 3.1-d) for the monazite only growth condition lagged behind glucose consumption when K2HPO4 was provided. The pH was reduced at a faster rate when soluble phosphate was provided (Figure 3.1-e), resulting in a slightly lower pH for this condition on day two of bioleaching despite the higher initial pH of the medium with added K2HPO4. However, by the fourth day, both conditions had similar pHs.

Biomass production (Figure 3.1-f) for the monazite only condition also lagged behind growth with K2HPO4 plus monazite. By the sixth day, however, biomass accumulation was comparable under both growth conditions (2.8 ± 0.03 g/L for monazite only and 2.9 ± 0.2 g/L for K2HPO4 with monazite).

Together, the phosphate, glucose, pH, and biomass data indicate that although low phosphate levels may be slowing initial growth rates, by the end of the six day bioleaching experiment, phosphate is not the limiting factor for growth. The depletion of glucose by the end of the experiment in both cases may suggest a glucose growth limitation. However, as was shown in Chapter 2, increasing the glucose concentration to 100 g/L did not improve bioleaching performance.

Nitrogen availability is another possible growth limiting factor. Some estimates of the nitrogen content of fungal mycelia range from 0.2% to 9% of dry weight (Lahoz, Reyes et al. 1966, Dawson, Maddox et al. 1989, Watkinson, Bebber et al. 2006). Assuming a typical value of 5% nitrogen content, the 0.66 g/L of (NH4)2SO4 (i.e. 0.14 g/L N) provided in AMS medium would correspond to the production of approximately 2.8 g/L of dry biomass, suggesting that nitrogen availability may be limiting biomass production to this level. Scervino et al. found nitrogen limitation to enhance phosphate mineral solubilization by Penicillium purpurogenum, another phosphate solubilizing fungus (Scervino, Papinutti et al. 2011), indicating that nitrogen limitation of growth may be desirable for bioleaching performance. Nitrogen limitation has also been found to enhance citric acid production in some fungi (Cunningham and Kuiack 1992, Papagianni 2007).
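The nitrogen estimate above can be reproduced with a short calculation. The atomic masses are standard values, and the 5% nitrogen content is the assumed typical value stated in the text; the function name is chosen for the example only.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"N": 14.007, "H": 1.008, "S": 32.06, "O": 15.999}


def ammonium_sulfate_nitrogen(conc_g_per_l):
    """Nitrogen (g/L) supplied by a given concentration of (NH4)2SO4."""
    n_mass = 2 * ATOMIC_MASS["N"]
    molar_mass = (
        n_mass
        + 8 * ATOMIC_MASS["H"]
        + ATOMIC_MASS["S"]
        + 4 * ATOMIC_MASS["O"]
    )
    return conc_g_per_l * n_mass / molar_mass


nitrogen_g_per_l = ammonium_sulfate_nitrogen(0.66)
# Assumed typical nitrogen content of fungal biomass: 5% of dry weight.
supportable_biomass_g_per_l = nitrogen_g_per_l / 0.05
```

The calculation gives approximately 0.14 g/L N and approximately 2.8 g/L of dry biomass, which matches the biomass levels observed on day six under both growth conditions.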

3.3.2 Overall metabolomic profile

Metabolomic analyses of the fungal supernatant from the two conditions detected 210 metabolites (Appendix 2). Of these, 87 could be identified as known chemicals. The remaining 123 were identified only with BinBase ID numbers based on their characteristic mass spectra. Concentration profiles of all metabolites identified by name are summarized in Figure 3.2 for all time points and conditions (see Appendix 3 for concentration profiles of all detected metabolites, including those identified only by BinBase IDs). Overall, the lag in growth when monazite was the only phosphate source, observed above in glucose consumption, pH reduction, and biomass growth (Figure 3.1-d, -e, and -f), was paralleled in the concentration time profiles for many metabolites, with metabolite concentrations peaking at earlier time points when K2HPO4 was provided.

Figure 3.2. Heatmap showing average levels of identified metabolites detected during monazite bioleaching for each growth condition and time point. Rows represent different metabolites. Columns represent different conditions and time points. Metabolites are ordered based on hierarchical clustering, with the clustering dendrogram displayed to the left of the heatmap and metabolite names to the right. Heatmap colors indicate standard deviations above (yellow) and below (blue) the overall mean level for each metabolite. Letters between heatmap and metabolite names indicate KEGG pathways involving each metabolite. E = energy metabolism, C = carbohydrate metabolism, L = lipid metabolism, N = nucleic acid metabolism, A = amino acid metabolism, V = metabolism of cofactors and vitamins, O = other KEGG pathways. Metabolites whose names are highlighted in green participate in the TCA cycle. Metabolites whose names are highlighted in purple are long chain (> 8 carbon) fatty acids.

Hierarchical clustering of metabolites based on these profiles identified several groups of metabolites with similar behaviors. For instance, most of the components of the tricarboxylic acid cycle (TCA cycle) (Figure 3.2, yellow highlights), including citric, isocitric, alpha-ketoglutaric, succinic, fumaric, and malic acids, clustered together. The concentrations of these TCA cycle components peaked on day four when K2HPO4 was added and on day six when monazite was the only phosphate source. Aconitic acid, also in the TCA cycle, had a different concentration profile, peaking on day six for both growth conditions, while oxaloacetic acid was not detected. This overall trend is consistent with the earlier depletion of glucose when K2HPO4 was provided (Figure 3.1-d) since glucose, through glycolysis and the TCA cycle, feeds into the production of these metabolites (Madigan, Martinko et al. 2008). Once the glucose is depleted, these TCA cycle components are consumed and not replenished, resulting in the reduced concentrations by day six when K2HPO4 is provided (Figure 3.2). The importance of maintaining high sugar concentrations for the production and excretion of citric acid by Aspergillus niger, a commercially important production process, has been well documented (Magnuson and Lasure 2004, Papagianni 2007).

Long chain fatty acids (longer than eight carbons) (Figure 3.2, blue highlights) also clustered together, with concentrations peaking on day six when K2HPO4 was provided, and remaining at much lower levels when monazite was the only phosphate source. Long chain fatty acids observed were azelaic, capric, lauric, oleic, palmitic, pelargonic, and stearic acids. This increase in long chain fatty acid production corresponds with the depletion of glucose and the leveling off in biomass production (Figure 3.1-d and f), and may be related to a transition from exponential growth to stationary phase. Long chain fatty acids and their derivatives have been associated with changes in fungal physiology and morphology, and specifically with the transition from growth to spore formation (Mysyakina and Feofilova 2011).

3.3.3 Identification of metabolites of potential bioleaching importance

3.3.3.1 Metabolites released at higher concentrations when soluble phosphate was not available

Direct comparison of metabolite levels for the two growth conditions identified metabolites with higher concentrations for the monazite only growth condition. This analysis identified 15 and 13 metabolites for the 2 day and 4 day time points respectively. Three metabolites were identified for both time points. However, none of these had identification beyond BinBase ID numbers (20282, 2044, and 1681). No metabolites were identified from the analysis at the 6 day time point because differences in concentration were not found to be statistically significant, likely due to the high variability at this time point.

Of the eleven metabolites identified by name (eight for 2 days and three for 4 days), seven were selected for further study of their leaching abilities (ribose, ribitol, nicotinic acid, isothreonic acid, gluconic acid, histidine, and citric acid). Sulfuric acid was not considered because the focus of this analysis was organic metabolites. Glucose was not considered because it was the provided substrate rather than a metabolite and its higher concentration in the monazite-only condition at 4 days was already shown (Figure 3.1-d). Fructose, which was tested as a carbon source in Chapter 2 and was found not to have any benefits over glucose in REE solubilization, was also rejected. Ribonic acid was not selected because it was not commercially available.

Figure 3.3. Metabolites of potential bioleaching importance identified by occurrence at higher concentrations in the monazite-only condition compared to K2HPO4 plus monazite. (a) Metabolites identified at 2 days. (b) Metabolites identified at 4 days. Signal intensities are normalized to the maximum observed signal intensity for each metabolite at that time point. Height and error bars indicate mean and standard deviation for six biological replicates.

3.3.3.2 Metabolites whose concentrations correlated with REE concentrations

Fifteen metabolites, eight of which were identified by chemical name, were found to have positive correlations with REE concentrations (Figure 3.4). Six of these (galactinol, citramalic acid, 4-hydroxybenzoic acid, 3,4-dihydroxybenzoic acid, 1-deoxyerythritol, 2-deoxyerythritol) were selected for further leaching studies based on their commercial availability.

Figure 3.4. Correlations between metabolite signal intensities and REE concentrations. Only compounds found to have a significant positive correlation and having identification beyond BinBase ID numbers are shown. P-values are Šidák adjusted.

3.3.3.3 High signal intensity metabolites

Seven named metabolites had high overall signal intensities (average signal intensity > 10^5 for at least one condition and time point). Four of these compounds (sorbitol, glycerol, p-cresol, and stearic acid) were selected for further study, while three (glucose, sulfuric acid, and phosphate) were rejected for previously stated reasons.

3.3.4 Abiotic leaching effectiveness of identified metabolites

Of all the tested metabolites, only citric acid and citramalic acid showed statistically significant improvements in REE solubilization over the pH 2.5 HCl control (p = 0.008 and 0.04 after Šidák correction for citric and citramalic acid respectively) (Figure 3.5). Leaching with a combination of all selected metabolites did not improve solubilization significantly beyond the combined effects of individual metabolites, with increases of only approximately 6.5 and 5.1 mg/L above controls for citric and citramalic acids respectively, and did not approach the REE concentrations achieved by direct bioleaching (42 ± 15 mg/L, see Figure 3.1-a). Although the effect of citric acid appears to be somewhat larger here than that reported in Chapter 2 (6.5 mg/L here as opposed to 3 mg/L in Chapter 2), the experimental protocols were quite different (see Materials and methods section). This experiment supports the overall result from Chapter 2 that citric acid provides some additional REE solubilization, but not sufficient improvements to account for the majority of the bioleaching effectiveness. Citramalic acid has previously been shown to solubilize phosphate from low phosphate soils amended with monocalcium phosphate dihydrate (Ca(H2PO4)2·H2O) (Khorassani, Hettwer et al. 2011).

With regard to Th release during abiotic leaching, only citric acid, citramalic acid, and the combination of all selected metabolites resulted in detectable levels of Th release (Figure 3.6). Notably, these are the same metabolites that contributed to additional REE solubilization. However, citric acid leached significantly more Th than citramalic acid (1.18 ± 0.01 mg/L as opposed to 0.25 ± 0.0 mg/L), indicating that citramalic acid may have more desirable leaching characteristics.

Figure 3.5. Abiotic solubilization of REEs from monazite by selected metabolites. Heights and error bars indicate means and standard deviations for three replicates.

Figure 3.6. Abiotic solubilization of Th from monazite by selected metabolites. Heights and error bars indicate means and standard deviations for three replicates.

3.3.5 Gel permeation chromatographic separation of complexed REEs

In order to determine whether the formation of large, highly stable complexes contributed to bioleaching effectiveness, a gel permeation chromatography approach was developed to separate free REEs from REE complexes. Initial testing of low pressure gel permeation chromatography was successful at separating free Nd3+ from EDTA-Nd3+ complexes at circumneutral pH (Figure 3.7).

Figure 3.7. Chromatographic separation of free Nd3+ and EDTA-Nd3+ complexes at circumneutral pH.

Due to the low pH of the bioleaching cultures, an additional gel permeation chromatography column was prepared and operated at pH 2.5 (Figure 3.8). Unlike the circumneutral results, the NdCl3 plus EDTA control did not show a clear separation of Nd3+ and EDTA-Nd3+ complex under these conditions, instead resulting in a smeared peak between 30 and 100 minutes retention time. The combination of NdCl3 and citric acid, a weaker complexing agent, resulted in Nd release in a single peak between 70 and 100 minutes retention time, peaking between 80 and 90 minutes, the same retention time as the NdCl3 control without any complexing agents. This result is unsurprising for the weaker complexes that dissociate more readily within the chromatography column (Collins 2004). The equilibrium constants for complexes of EDTA with REEs have been estimated at 10^14 to 10^20 (Wheelwright, Spedding et al. 1953), whereas the equilibrium constants for complexes of citrate with REEs are on the order of 10^9 (Martell and Smith 1974, Goyne, Brantley et al. 2010).

Samples from three bioleaching bottles also resulted in single peaks of Nd with retention times of 70 to 100 minutes (peak at 80 to 90 minutes), similar to the NdCl3 and NdCl3 with citric acid controls. These chromatography results indicate that the compounds responsible for REE bioleaching do not form strong complexes, like those of EDTA, which are able to remain somewhat intact during chromatographic separation. Instead, any complexes formed in these bioleaching samples are more similar to the weaker complexes formed between REEs and citric acid.

Figure 3.8. Chromatographic separation of Nd3+ and Nd3+ complexes at pH 2.5.

In combination, the metabolomics analysis and the gel permeation chromatography results provide some insight into the nature of the compounds responsible for bioleaching effectiveness. The chromatography results suggest that any complexes formed are relatively weak. This is consistent with the ability of the Amberlite IR120 resin to remove REEs from bioleaching spent medium as was reported in Chapter 2. The metabolomics analysis and subsequent abiotic leaching experiments indicate that while citric acid and citramalic acid contribute to leaching, they do not completely explain bioleaching effectiveness. Any contributions of other identified metabolites were not great enough to be detected. Together these results suggest that a combination of many compounds forming weak complexes with REEs contributes to fungal bioleaching effectiveness. These may include some of the unknowns identified by BinBase numbers in this metabolomics analysis, but they may also include others not detectable by the GC-TOF-MS approach or not in the reference mass spectra database used by the West Coast Metabolomics Center for this analysis.

Chapter 4:

Metagenomic Analysis of a Functionally Stable Trichloroethene Degrading Microbial Community

A version of the following chapter has been published as:

Brisson, Vanessa L., Kimberlee A. West, Patrick K. H. Lee, Susannah G. Tringe, Eoin L. Brodie and Lisa Alvarez-Cohen (2012). "Metagenomic analysis of a stable trichloroethene-degrading microbial community." The ISME Journal 6(9): 1702-1714.

4.1 Introduction

Chlorinated ethenes are common groundwater contaminants that pose human health risks (McCarty 1997, Moran, Zogorski et al. 2007, US_Dept._of_H&HS 2007). Although several groups of organisms can reductively dechlorinate tetrachloroethene (PCE) and trichloroethene (TCE) to the toxic intermediate dichloroethene (cis-DCE and trans-DCE) (Scholz-Muramatsu, Neumann et al. 1995, Sharma and McCarty 1996, Holliger, Hahn et al. 1998, Luijten, de Weert et al. 2003, Löffler, Cole et al. 2004), Dehalococcoides (Dhc) species are the only organisms known to dechlorinate these compounds completely to the harmless product ethene (Maymo-Gatell, Chien et al. 1997, Smidt and de Vos 2004). Dhc species have been found to grow more robustly and reduce chlorinated organics more effectively when grown in mixed communities rather than in isolation, likely due to Dhc’s stringent metabolic needs (Maymo-Gatell, Chien et al. 1997, He, Holmes et al. 2007).

The dechlorinating community studied here is an enrichment culture that has been stably dechlorinating TCE to ethene for over ten years. This culture was derived from sediment collected at the Alameda Naval Air Station and is referred to as ANAS (Richardson, Bhupathiraju et al. 2002). The phylogenetic composition of ANAS has been studied using clone libraries (Richardson, Bhupathiraju et al. 2002, Lee, Johnson et al. 2006), and the Dhc strains in ANAS have been analyzed using qPCR and whole-genome microarrays (Holmes, He et al. 2006, West, Johnson et al. 2008, Lee, Cheng et al. 2011). ANAS contains two Dhc strains, which have recently been isolated (Holmes, He et al. 2006, Lee, Cheng et al. 2011). A comparative genomics analysis showed these strains to have very similar core genomes, but different RDase genes, with correspondingly different dechlorination abilities (Lee, Cheng et al. 2011).

Metagenomic sequencing analysis was used in this study to examine the Dhc component in the context of the ANAS microbial community. Metagenomic approaches have been used to study a variety of microbial communities, including those inhabiting termite guts, human intestines, wastewater treatment plants, and acid mines (Tyson, Chapman et al. 2004, Gill, Pop et al. 2006, Warnecke, Luginbuhl et al. 2007, Sanapareddy, Hamp et al. 2009). In the case of dechlorinating communities, metagenomic data can provide insights into the organisms that support dechlorination activity (Waller 2009).

In this study DNA sequences of Dhc and other ANAS community members are identified and examined from metagenomic sequence data. This study focuses on three categories of functional genes related to dechlorination activity: genes for reductive dehalogenases (RDases), genes for cobalamin biosynthesis enzymes, and genes for hydrogenases. RDases are the enzymes that catalyze the reductive dehalogenation reactions. Cobalamin biosynthesis was targeted because cobalamin is a required cofactor for RDases (Smidt and de Vos 2004). Hydrogenases, which catalyze the reversible oxidation of molecular hydrogen, were targeted because Dhc couple reductive dechlorination to hydrogen oxidation (Maymo-Gatell, Chien et al. 1997, Adrian, Szewzyk et al. 2000, He, Ritalahti et al. 2003).

4.2 Materials and methods

4.2.1 ANAS enrichment culture and DNA sample preparation

Culture conditions and maintenance procedures for ANAS have been described previously (Richardson, Bhupathiraju et al. 2002). Briefly, 350 mL of culture was grown at 25 to 28 °C and 1.8 atm with a N2-CO2 (90:10) headspace in a 1.5 L continuously stirred semi-batch reactor. The culture was amended with 13 μL TCE and 25 mM lactate every 14 days.

Cells were collected from 30 mL culture samples by vacuum filtration onto hydrophilic Durapore membrane filters (0.22 µm pore size, 47 mm diameter [Millipore, Billerica, MA]), and filters were stored in 2 mL microcentrifuge tubes at –80 °C until further processing. For PhyloChip experiments, samples were collected from the same time point (27 hrs) from three different 14-day cycles of the culture to achieve biological triplication. For metagenomic sequencing, samples from the same time point (27 hrs) from four different feeding cycles were pooled in order to collect enough material for sequencing. Total nucleic acids were extracted from frozen filters using a modified version of the bead beating and phenol extraction method described previously (West, Johnson et al. 2008).

4.2.2 Metagenome sequencing, assembly, and annotation

Metagenome sequencing, assembly, and annotation were performed at the Department of Energy Joint Genome Institute (JGI). A combination of 454-Titanium sequencing (453,944 reads) and paired-end short-insert Sanger sequencing (76,272 mate pairs, approximate insert size 3 kb) was used. 454-Titanium sequencing reads were assembled into contiguous sequences (contigs) using Newbler [454 Life Sciences, Roche Applied Sciences, Branford, CT, USA]. Those contigs were shredded to resemble overlapping Sanger sequencing reads, which were then combined into an assembly with the paired-end Sanger sequencing reads using the Paracel Genome Assembler [Paracel Inc., Pasadena, CA, USA]. Similar methods have been used by other researchers to combine Sanger and 454-Titanium sequencing data (Goldberg, Johnson et al. 2006, Woyke, Xie et al. 2009). The contigs resulting from this second assembly, as well as Sanger reads and Newbler contigs that could not be further assembled, were annotated through a version of the JGI microbial annotation pipeline (Mavromatis, Ivanova et al. 2009) adapted to metagenomes, which includes prediction of protein coding and RNA genes and product naming. Annotation was automated and no manual annotation was performed. Data were loaded into the Integrated Microbial Genomes with Microbiome Samples (IMG/M) database (Markowitz, Chen et al. 2010) and used in the following analyses.

4.2.3 Analysis of metagenomic sequence data

4.2.3.1 Identification of Dhc contigs by sequence similarity

Dhc contigs were identified in a two stage sequence similarity (SS) process. In the first stage, contigs were identified by comparison to previously sequenced Dhc reference genomes (Dhc strains 195, BAV1, CBDB1, VS and GT). Reference genome sequences were retrieved from the National Center for Biotechnology Information (NCBI) genomes database [ftp://ftp.ncbi.nlm.nih.gov/genomes], and blastn (Zhang, Schwartz et al. 2000) was used to compare the reference genome sequences against a database of all metagenome contig sequences. For each reference genome, the top 250 BLAST hit contigs were selected for the second stage comparison (at this cutoff, additional contigs did not expand the useful contig set), where their identities were checked by comparison to the NCBI genomes database using megablast (Zhang, Schwartz et al. 2000). All contigs whose top BLAST hit (lowest expect value) in the genomes database was to Dhc were selected and expect values were checked for significance. For all contigs identified as Dhc, the expect value of the identifying BLAST hit was ≤ 10^-35.

4.2.3.2 Classification of ANAS contigs by tetranucleotide frequencies

ANAS contigs larger than 2,500 bp were grouped by tetranucleotide frequencies (TF) using a procedure based on one described by Dick et al. (Dick, Andersson et al. 2009) with some modifications described here. Clustered regularly interspaced short palindromic repeat (CRISPR) and rRNA gene sequences were removed from contig sequences prior to classification because these sequence regions are known to have atypical nucleotide compositions compared to their genomes (Reva and Tummler 2005, Dick, Andersson et al. 2009). Next, all contig sequences larger than 2,500 bp were selected for classification, with contigs larger than 7,500 bp divided into 5,000 bp fragments. Sequence fragments were classified based on TF using the Databionics ESOM Tools program (Ultsch and Moerchen 2005, Databionics 2006). Dick et al.’s method for clustering of sequences (Dick, Andersson et al. 2009) was used with the following modifications. Online training was used instead of the k-batch algorithm because online training provides more accurate, albeit slower, performance (Databionics 2006). A map size of 120x196 and an initial radius of 60 were selected based on the size of the dataset.
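The preparation of sequence fragments for classification can be sketched as follows. This is an illustration only: the handling of the remainder when a long contig is divided, the use of raw (non-strand-merged) tetranucleotide counts, and all function names are assumptions, and the sketch does not reproduce the preprocessing performed by Dick et al.'s scripts or by the Databionics ESOM Tools.

```python
from collections import Counter
from itertools import product

MIN_LEN = 2500     # contigs must be larger than this to be classified
SPLIT_OVER = 7500  # contigs larger than this are divided
WINDOW = 5000      # fragment size for divided contigs
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def fragment_contig(seq):
    """Return the fragments of one contig that would enter classification."""
    if len(seq) <= MIN_LEN:
        return []
    if len(seq) <= SPLIT_OVER:
        return [seq]
    starts = list(range(0, len(seq) - WINDOW + 1, WINDOW))
    # Assumption: the trailing remainder stays attached to the last window.
    return [
        seq[s:len(seq)] if s == starts[-1] else seq[s:s + WINDOW]
        for s in starts
    ]


def tetranucleotide_frequencies(seq):
    """Relative frequency of each of the 256 tetranucleotides in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[t] for t in TETRAMERS)
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


example_freqs = tetranucleotide_frequencies("ACGTACGT")
fragment_lengths = [len(f) for f in fragment_contig("A" * 12000)]
```

Each fragment is thus represented by a 256-element frequency vector, which is the kind of input on which a self-organizing map can be trained.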

4.2.3.3 Comparisons to reference genomes and identification of novel Dhc genes

To identify regions of similarity and difference between a set of metagenome contigs and a reference genome, each contig in the set was compared to the reference genome using blastn, with an expect value cutoff of 10⁻¹² unless otherwise stated. Based on the results of these searches, aligning and non-aligning regions were identified in the contigs and the reference genome.

Two measures of overall similarity between contigs and reference are reported. The first, contig match, is the percentage of total bases in all contigs that are part of an alignment to the reference genome. The second, reference match, is the percentage of bases in the reference genome that are part of an alignment to some contig in the set.
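Both measures reduce to counting bases covered by at least one alignment. The Python sketch below shows one way to do this, assuming alignment coordinates are 1-based, inclusive, and already ordered so that start ≤ end; the function names and data layout are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def covered_bases(intervals):
    """Number of bases covered by at least one interval."""
    return sum(end - start + 1 for start, end in merge_intervals(intervals))

def percent_match(aligned_by_seq, lengths):
    """Percentage of all bases in `lengths` that lie inside an alignment.

    aligned_by_seq: {sequence name: [(start, end), ...]}
    lengths:        {sequence name: length in bases}
    """
    total = sum(lengths.values())
    covered = sum(covered_bases(aligned_by_seq.get(name, [])) for name in lengths)
    return 100.0 * covered / total
```

Passing contig lengths with contig-side alignment coordinates gives the contig match; passing the reference genome length with reference-side coordinates gives the reference match.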

To identify contig regions containing potentially novel Dhc genes, Dhc metagenome contigs were compared to five sequenced Dhc genomes (strains 195, BAV1, CBDB1, VS, and GT) that were publicly available in August 2010. Contig regions that were not in alignments to any reference genomes and were over 100 bases in length were investigated further. A less stringent expect value cutoff (10⁻⁶) was used to ensure that only low similarity regions were included in the analysis. All annotated genes contained in the non-aligning regions, or overlapping the regions by at least five bases, were identified as novel.
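The selection rule for novel genes can be written as a short Python sketch. It assumes 1-based inclusive coordinates and that alignments to all five reference genomes have been pooled per contig; the function names and gene labels are hypothetical.

```python
def unaligned_regions(contig_length, aligned, longer_than=100):
    """Regions of a contig not covered by any alignment, longer than 100 bases."""
    regions = []
    next_free = 1
    for start, end in sorted(aligned):
        if start > next_free:
            regions.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= contig_length:
        regions.append((next_free, contig_length))
    return [(s, e) for s, e in regions if e - s + 1 > longer_than]

def overlap(a, b):
    """Number of bases shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def novel_genes(genes, regions, min_overlap=5):
    """Genes contained in, or overlapping by at least five bases, any region.

    genes: {gene name: (start, end)} on the same contig as `regions`.
    """
    return [name for name, span in genes.items()
            if any(overlap(span, region) >= min_overlap for region in regions)]
```

For example, a gene ending four bases into a non-aligning region would not be flagged, while one ending five bases into it would.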


4.2.4 Confirmation of novel Dhc genes in Dhc isolates from ANAS

Selected novel Dhc genes identified in the metagenome were amplified and sequenced from Dhc strains previously isolated from ANAS. Primers were designed based on the metagenome gene sequence using Primer3 (Table 4.1). PCR reactions were performed in 0.2 mL tubes using Qiagen Taq DNA Polymerase. The thermocycler program was as follows: 12 minutes at 94 °C; 40 cycles of one minute at 94 °C, 45 seconds at the annealing temperature (Table 4.1), and two minutes at 72 °C; 12 minutes at 72 °C. Genomic DNA (gDNA) from Dhc strains ANAS1 and ANAS2 was used as the template for separate reactions. ANAS metagenomic DNA was used as a positive control template and Dhc strain 195 gDNA was used as a negative control template. PCR products were visualized on agarose gels and purified using the QIAquick PCR Purification Kit. Purified PCR products were sequenced by Sanger sequencing.

Table 4.1. PCR primers and annealing temperatures for novel Dhc genes.

| Target Gene Name | JGI IMG Gene Object ID | Forward Primer Sequence | Reverse Primer Sequence | Annealing Temp. for PCR |
|---|---|---|---|---|
| cbiD | 2014753801 | ACCGCCAGCCTCAGGGTTGA | ACAGCCGCCATGGCACACAG | 59 °C |
| cbiF | 2014753804 | CGCTGTCTGGAAGAAGCCGACC | TGCATGGCGGAGGCCAGATT | 57 °C |
| cbiC | 2014753814 | CGCCGTTGTCCGCCAGCTTA | TTTCACCCGCCGCTTCTGCC | 58 °C |

4.2.5 TCE dechlorination by Dhc Isolate ANAS2 and ANAS Subcultures

Dhc isolate ANAS2 was grown in 120 mL serum bottles with H2/CO2 (80%/20%) in the headspace. Bottles contained 99 mL Bav1 medium (He, Ritalahti et al. 2003) with 5 mM acetate as a carbon source and 7 µL TCE, but no cobalamin. Bottles were either amended with 50 µg/L (37 nM) cobalamin or 5.4 µg/L (37 nM) 5,6-dimethylbenzimidazole (DMB), or were left unamended. Bottles were inoculated with 1 mL of active ANAS2 culture stock, which had been growing on Bav1 medium with 5 mM acetate and 50 mM cobalamin, and incubated at 34 °C until they had completely dechlorinated 7 µL of TCE to ethene.
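The equimolar amendment levels can be checked with a short calculation. The Python sketch below assumes the cobalamin was supplied as cyanocobalamin (about 1355.4 g/mol), which is not stated here, and uses about 146.2 g/mol for DMB; the function name is hypothetical.

```python
def ug_per_l_to_nM(conc_ug_per_l, molar_mass_g_per_mol):
    """Convert a mass concentration in µg/L to a molar concentration in nM."""
    # µg/L divided by g/mol gives µmol/L; multiplying by 1000 gives nmol/L.
    return conc_ug_per_l / molar_mass_g_per_mol * 1000.0

# Assumed molar masses (g/mol); the cobalamin form is an assumption.
CYANOCOBALAMIN = 1355.4
DMB = 146.2

cobalamin_nM = ug_per_l_to_nM(50.0, CYANOCOBALAMIN)
dmb_nM = ug_per_l_to_nM(5.4, DMB)
```

Under these assumptions both amendments round to 37 nM, in agreement with the values given above.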

Subcultures of the ANAS microbial community were grown in 120 mL serum bottles with N2/CO2 (90%/10%) in the headspace. Growth medium was the same as that used in the ANAS culture, but with varying concentrations of cobalamin. Lactate (20 mM) was provided as an electron donor and carbon source, and 2 µL TCE was provided as an electron acceptor. Each bottle received 5 mL of inoculum and 45 mL of growth medium, for a final liquid volume of 50 mL. Inoculum for the first stage subcultures was taken from the ANAS culture. The second stage subcultures were inoculated from first stage subculture bottles with the same cobalamin concentration. Bottles were incubated at room temperature (approximately 25 °C).

Chlorinated ethene concentrations were monitored by gas chromatography on an Agilent Technologies 7890A GC system using a previously described protocol (Lee, Johnson et al. 2006).


4.2.6 PhyloChip assessment of community composition

Metagenomic DNA and RNA extracted from ANAS were applied to separate PhyloChip microarrays to examine the phylogenetic composition of ANAS. The methods for these experiments draw on several previously published methods (Cole, Truong et al. 2004, Brodie, DeSantis et al. 2006, DeSantis, Brodie et al. 2007, West, Johnson et al. 2008).

Total nucleic acids were extracted as previously described (West, Johnson et al. 2008). DNA and RNA were separated using the Qiagen AllPrep DNA/RNA Kit according to manufacturer’s instructions. RNA was further purified using the Qiagen RNase-free DNase Set, per manufacturer’s instructions. RT-qPCR was performed to confirm that RNA samples contained no DNA contamination. The masses of DNA and RNA per volume were quantified using a fluorometer [model TD-700, Turner Designs, Sunnyvale, CA] and the Quant-iT PicoGreen dsDNA and Quant-iT RiboGreen RNA reagents [Invitrogen Molecular Probes, Carlsbad, CA], respectively, according to the manufacturer's instructions.

The bacterial and archaeal 16S rRNA genes were amplified from the extracts using the following primers: bacterial primer 27F (5′-AGRGTTTGATCMTGGCTCAG-3′), archaeal primer 4F (5′-TCCGGTTGATCCTGCCGG-3′), and universal primer 1492R (5′-GGTTACCTTGTTACGACTT-3′). For DNA PhyloChips, PCR was performed using the TaKaRa Ex Taq system [Takara Bio Inc., Japan] and DNA was prepared for the microarrays as previously described (DeSantis, Brodie et al. 2007).

For RNA PhyloChips, a direct hybridization method was employed as follows. 16S rRNA was enriched from total RNA by gel extraction. Direct analysis of rRNA was achieved using a modification of the protocol of Cole et al. (Cole, Truong et al. 2004). To account for technical variation between hybridizations, a set of internal RNA spikes was added to each sample preparation. These spikes consisted of transcripts generated by T7 or T3 mediated in vitro transcription from linearized plasmids pGIBS-LYS (containing Bacillus subtilis lysA, ATCC 87482), pGIBS-PHE (containing Bacillus subtilis Phe gene, ATCC 87483) and pGIBS-THR (containing Bacillus subtilis Thr gene, ATCC 87484). To each RNA fragmentation reaction, 1.35×10¹⁰, 3.13×10¹⁰, and 3.13×10¹¹ transcripts of LysA, Thr, and Phe, respectively, were added in a volume of 1 µL. Combined sample RNA (1 µg) and spike mix were fragmented and dephosphorylated simultaneously using 0.1 U RNase III/µg RNA and 0.2 U shrimp alkaline phosphatase [USB, OH, USA]/µg RNA in a buffer containing 10 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, and 1 mM DTT (pH 7.9) in a final volume of 20 µL. The mixtures were then incubated at 37 °C for 35 min followed by inactivation at 65 °C for 20 min. RNA was labeled with multiple biotin residues using a labeling system that employs T4 RNA ligase to attach a 3′-biotinylated donor molecule [pCp-Biotin3, Trilink Biotech, San Diego, CA, USA] to target RNA (Cole, Truong et al. 2004). Labeling was performed with 20 µL of fragmented/dephosphorylated RNA, 20 U T4 RNA ligase [NEB, MA, USA], and 100 µM pCp-Biotin3 in a buffer containing 50 mM Tris-HCl, 10 mM MgCl2, 10 mM DTT, 1 mM ATP (pH 7.8), and 16% v/v PEG 8000. The final volume was 45 µL. The reaction mixture was incubated at 37 °C for 2 h and inactivated at 65 °C for 15 min. The mixture was then prepared for PhyloChip hybridization without any further purification and was processed according to standard Affymetrix expression analysis technical manual procedures for cDNA.


PhyloChips were hybridized at 50 °C in an Affymetrix hybridization oven for 16 h at 60 rpm. Microarrays were stained according to the Affymetrix protocol and then immediately scanned using a GeneChip Scanner 3000 7G [Affymetrix, Santa Clara, CA]. To process captured fluorescent images into taxon hybridization scores, images were background corrected and probe pairs were scored as previously described (Brodie, DeSantis et al. 2006).

4.3 Results

4.3.1 ANAS metagenome overview

ANAS metagenome sequences were assembled into 26,293 contigs, totaling 41,065,977 bp of DNA sequence. Contigs ranged in length from 78 bp to 921,258 bp, with an N50 length of 2,149 bp. 60,992 protein coding genes and 565 RNA genes were identified. The annotation is available through IMG/M [http://img.jgi.doe.gov/cgi-bin/m/main.cgi] (Taxon Object ID 2014730001) (Kyrpides, Markowitz et al. 2008).
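For readers unfamiliar with the statistic, N50 is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal Python sketch of the usual definition follows; the function name is hypothetical and this is not the assembler's own implementation.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50 requires at least one contig length")
```

An N50 of 2,149 bp alongside a longest contig of 921,258 bp indicates that the assembly combines a small number of very long contigs with many short ones.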

4.3.2 Dhc in ANAS

4.3.2.1 Identification of Dhc contigs

The SS method identified 301 contigs as Dhc. In the TF analysis, one class containing 45 contigs was identified as Dhc based on the presence of 16S and 23S rRNA genes that were 100% and 99% identical to those of Dhc strain 195.

The Dhc contigs identified by SS and by TF were compared to evaluate the two methods. Of the 301 Dhc contigs (1,810,488 bp total) identified by SS, 49 (1,643,099 bp total) were sufficiently long (> 2,500 bp) for classification by TF. Of those, the TF method classified 45 as Dhc (the class of 45 identified above) and one (ANASMEC_C10442) as a Synergistete, leaving three (ANASMEC_C5086, ANASMEC_C818, and ANASMEC_C10029) unclassified.

The four contigs identified by SS but not by TF were further examined to determine possible reasons for the discrepancy. The BLAST alignments identifying these contigs by SS covered 25% or less of each contig’s length. In the non-aligning sequence regions, two contigs (ANASMEC_C5086 and ANASMEC_C818) contained several phage related genes and recombinases, indicating possible horizontal DNA transfer, which could explain non-Dhc TF classification of sequences from a Dhc genome. The other contigs (ANASMEC_C10029 and ANASMEC_C10442) did not contain genes that were obvious indicators of horizontal transfer, although this does not rule out that explanation. Mis-assembly may also be responsible for the presence of both Dhc and non-Dhc sequence in these contigs. Given this uncertainty, the following analyses consider contigs identified by TF and SS separately, and make special note when these four contigs are relevant to a particular analysis.


4.3.2.2 Metagenome coverage of Dhc genes detected by microarray

Metagenome coverage of Dhc genes was assessed by comparison to results from a previous comparative genomics study performed with microarrays targeting 98.6% of annotated genes in Dhc strains 195, BAV1, CBDB1, and VS (Lee, Cheng et al. 2011). Coverage was evaluated by identifying Dhc genes detected in ANAS by the microarray analysis and determining which of those genes were present in the metagenome sequences (Figure 4.1). Presence of Dhc genes in the metagenome was determined by blastn comparisons of the genome sequences of Dhc strains 195, BAV1, CBDB1, and VS to all metagenome contigs (expect value cutoff of 1×10⁻¹²).

Figure 4.1. Comparison of metagenomic Dhc coverage with ANAS genes detected by microarray. Although the analysis was performed for genes from Dhc strains 195, BAV1, CBDB1, and VS, only results for Dhc strain 195 are shown here for simplicity. Circles represent the Dhc strain 195 genome, with the origin of replication at the top. The inner circle shows regions with ANAS metagenome / strain 195 alignments in magenta. The outer circle shows ANAS genes detected by microarray in blue-green.


The metagenome contigs contained 96.2% (1,311 of 1,363) of the genes identified as present by the microarray analysis. Another 3.4% (47 genes) were partially present, overlapping a contig end. Only five of the 1,363 Dhc genes identified by microarray were not found in any of the metagenome contigs. These were all genes from the Dhc strain 195 genome, and include fabG (DET1277), nusB (DET1278), and three genes coding for hypothetical proteins (DET0768, DET1405, and DET1406). Based on the alignment of the metagenome contigs to the Dhc strain 195 genome, these genes appear to fall in gaps between contigs. Blastn comparisons of these genes to the raw sequencing reads revealed that all five genes had significant alignments (expect value < 1×10⁻⁵⁰) to 454-Titanium sequencing reads but not to Sanger sequencing reads, indicating that they were missed by Sanger sequencing.

4.3.2.3 Co-assembly of sequence from distinct Dhc strains

Comparisons to the previous comparative genomics microarray analysis (Lee, Cheng et al. 2011) were also used to determine whether sequences from the two distinct Dhc strains were co-assembled in the metagenome. The presence of two different Dhc strains (ANAS1 and ANAS2) in the ANAS community has been established previously (Holmes, He et al. 2006, Lee, Cheng et al. 2011), and the previous study identified 60 genes distinct to ANAS1 and 36 genes distinct to ANAS2 (Lee, Cheng et al. 2011). The metagenome contigs containing these non-shared genes were identified using BLAST, and identifications were confirmed by BLAST comparison of metagenome sequences to the NCBI non-redundant nucleotide database. Although all genes analyzed had significant (expect value < 10⁻¹²) alignments in the contigs, alignment of at least 75% of gene length was also required for positive identification for this analysis. Five genes distinct to ANAS1 failed this alignment length requirement and were not considered. Six contigs, representing 541,431 bp combined, were found to be co-assembled because each contained at least one gene distinct to ANAS1 and one gene distinct to ANAS2. In total, 17 contigs contained genes distinct to ANAS1, and 15 contigs contained genes distinct to ANAS2.
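The co-assembly criterion amounts to a set intersection after the alignment length filter. The Python sketch below assumes the BLAST hits have been reduced to (gene, contig, aligned length) rows; the function names and the example identifiers in the usage note are hypothetical.

```python
def contigs_with_genes(hits, gene_lengths, min_fraction=0.75):
    """Contigs holding at least one gene aligned over >= 75% of its length.

    hits:         iterable of (gene, contig, aligned_length) rows
    gene_lengths: {gene: full gene length in bases}
    """
    return {contig for gene, contig, aligned_length in hits
            if aligned_length >= min_fraction * gene_lengths[gene]}

def co_assembled(anas1_hits, anas2_hits, gene_lengths):
    """Contigs containing both an ANAS1-distinct and an ANAS2-distinct gene."""
    with_anas1 = contigs_with_genes(anas1_hits, gene_lengths)
    with_anas2 = contigs_with_genes(anas2_hits, gene_lengths)
    return sorted(with_anas1 & with_anas2)
```

For instance, with hypothetical genes "a1" and "b1" of 1,000 bases each, hits ("a1", "contigX", 900) and ("b1", "contigX", 800) would mark "contigX" as co-assembled, while a hit covering only 700 bases would be excluded by the 75% requirement.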

4.3.2.4 Identification of novel Dhc genes

406 novel genes, 184 with annotated functions (Appendix 4), were identified on 26 contigs (15 identified as Dhc by both TF and SS, four identified by SS alone as described above, and seven that were too short for TF analysis but were identified by SS) (Figure 4.2). The most surprising finding was the presence of nine genes predicted to be involved in corrin ring synthesis, the first half of the cobalamin biosynthesis pathway.


Figure 4.2. Alignment of ANAS metagenome Dhc contigs (identified by TF and/or SS) to the Dhc strain 195 genome. The inner circle represents the reference strain 195 genome, with the origin of replication at the top. Magenta areas indicate alignment to ANAS metagenome contigs while grey areas indicate regions with no alignment. Each contiguous bar in the outer circles represents a contig, positioned based on its aligning regions, with contigs plotted on different circles to avoid overlap. Green areas indicate regions with no alignment to the reference genome, potentially containing novel Dhc genes (if they also do not align to other Dhc reference genomes).


The corrin ring synthesis genes are on contig ANASMEC_C6240, which was identified as Dhc by both SS and TF. Eight of the nine genes are oriented in the same direction and appear to be in a single operon, along with seven genes for ATP-binding cassette transporter (ABC-transporter) components (Figure 4.3), some specifically annotated as cobalamin transporters. All regions of this contig aligning to reference Dhc genomes aligned to previously identified High Plasticity Regions (HPRs), which contain much of the variation between sequenced Dhc genomes (McMurdie, Behrens et al. 2009). Based on the TF analysis, the region of this contig containing the cobalamin biosynthesis genes grouped with the Dhc sequences and not with any other contig class (Figure 4.4).

Figure 4.3. Operon structure for genes for the first (corrin ring synthesis) part of the cobalamin biosynthesis pathway identified in an ANAS metagenome contig associated with Dhc. Genes in white are the corrin ring synthesis genes, labeled with the gene name. Genes with hatching are genes for ABC-transporter components.

Figure 4.4. Evidence for the association of contig ANASMEC_C6240 (containing cobalamin biosynthesis genes) with Dhc. (A.) The top bar shows how different segments of the contig were grouped with Dhc based on TF analysis, while (B.) the bottom bar shows which parts of the contig aligned with previously sequenced Dhc genomes (magenta matches Dhc and green does not). The location of the apparent cobalamin biosynthesis operon is indicated and has a Dhc TF composition but does not align to previously sequenced Dhc genomes.

PCR amplification and sequencing were used to confirm the presence of three of the cobalamin biosynthesis genes in Dhc strains previously isolated from ANAS. Genes tested included two from the apparent cobalamin biosynthesis operon (cbiD and cbiF) and one from elsewhere on the same contig (cbiC). All three genes were successfully amplified and sequenced from gDNA from Dhc strain ANAS2 as well as from ANAS metagenomic DNA but not from Dhc strain ANAS1 or strain 195. Sequences had 99.6-100% nucleotide identity with corresponding metagenome sequences. Amplification with the primers for cbiF produced products of a different size than the target sequence when gDNA from Dhc strain ANAS1 or strain 195 was used as the template. Sequencing of PCR products confirmed that these were a different sequence from the target, the result of non-specific primer binding.

Several other groups of genes are well represented among the novel Dhc genes identified here. 15 novel genes for ABC-transporter components were identified, including 13 on the same contig as the corrin ring synthesis genes. 15 genes for phage proteins and 14 genes for recombinases were also present. 11 novel genes for RDases were identified. However, for one RDase, the first third of the gene matched (98% ID) the Dhc strain 195 gene DET0088, an RDase domain gene that is approximately one third the length of a typical RDase gene. Together with the remaining two thirds of the RDase gene, this appears to be a full length novel Dhc RDase gene in the ANAS metagenome.

4.3.2.5 TCE dechlorination by Dhc isolate ANAS2 under different cobalamin conditions

Figure 4.5 shows ethene produced during TCE dechlorination by Dhc isolate ANAS2. ANAS2 was able to completely dechlorinate 60 µmol TCE per bottle to ethene within 20 days of incubation when provided with 50 µg/L cobalamin. However, when provided with no cobalamin, ethene levels remained below 2 µmol ethene per bottle, regardless of whether DMB was present.

Figure 4.5. Ethene production during TCE degradation by Dhc isolate ANAS2.

4.3.3 ANAS community structure

4.3.3.1 TF classification of metagenome contigs

TF was used to analyze all contigs longer than 2,500 bp, comprising 2,323 contigs representing 46% of the total sequence length of all contigs. Of these contigs, 95% were classified into 10 classes (Table 4.2). 141 contigs were left unclassified because they did not cluster with other contigs.


Table 4.2. Classification of contigs by TF and identification of contig classes by 16S and 23S BLAST comparisons.

| Classa | Number of Contigs | Total Sequence Length (bp) | Median Contig Length (bp) | Avg. Read Depth | Taxa | 16S rRNA gene Closest BLAST Hitb (23S when 16S not present) | %ID of Closest BLAST Hitb |
|---|---|---|---|---|---|---|---|
| Class 1 | 13 | 2,279,508 | 39,051 | 53 | Clostridiaceae | Clostridiaceae bacterium SH021 | 95 |
| Class 2 | 45 | 1,483,420 | 14,384 | 39 | Dehalococcoides | Dehalococcoides sp. MB and Dehalococcoides ethenogenes 195 | 100 |
| Class 3 | 77 | 2,654,085 | 27,104 | 18 | Spirochaetes | Spirochaetes bacterium enrichment culture clone DhR^2/LM-B02 | 92 |
| Class 4 | 152 | 2,550,033 | 11,242 | 11 | Methanobacterium | Methanobacterium formicicum strain FCam | 99 |
| Class 5 | 382 | 2,249,123 | 4,714 | 8 | Desulfovibrio | (Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774)c | (96)c |
| Class 6 | 449 | 2,295,796 | 4,122 | 7 | unknown taxa | no rRNA genes | |
| Class 7 | 550 | 2,732,616 | 3,821 | 7 | Synergistetes | Synergistetes bacterium enrichment culture clone DhR^2/LM-F01 | 98 |
| Class 8 | 205 | 791,052 | 3,346 | 6 | Deltaproteobacterium | no rRNA genes | |
| Class 9 | 191 | 675,183 | 3,112 | 6 | unknown taxa | no rRNA genes | |
| Class 10 | 118 | 421,953 | 3,156 | 5 | Methanospirillum | Methanospirillum hungatei JF-1 | 99 |
| Unclassified | 141 | 864,492 | 3,451 | | | | |
| All Classes | 2,323 | 18,997,261 | 4,003 | | | | |

aClasses are ordered by average read depth.
bIdentity and %ID are presented for the top 16S (or 23S) rRNA gene BLAST hit in the NCBI nucleotide database that was identified beyond “uncultured bacterium”. BLAST searches were performed in August, 2010.

Based on 16S and 23S rRNA genes present on the contigs, seven of the 10 classes were attributed to the following taxa: a Clostridiaceae, Dhc, Desulfovibrio (23S only), Methanobacterium, Methanospirillum, a Spirochaete, and a Synergistete. The remaining three classes did not contain 16S or 23S rRNA genes. Notably, an additional contig (ANASMEC_C9204) containing a set of rRNA gene sequences from a Clostridium did not cluster with any contig class, although it was more similar to the Clostridiaceae class than to other contig classes. This 8,722 bp contig also contains genes for subunits of a type IIA topoisomerase and a gene for a hypothetical protein. A partial 23S rRNA gene belonging to a Bacteroides and a 16S gene belonging to a Desulfovibrio were also identified, but were on contigs smaller than the 2,500 bp cutoff used for the TF analysis.

IMG/M Phylogenetic Marker COGs (Markowitz, Ivanova et al. 2008) were used to try to identify the remaining three classes. One class was identified as a Deltaproteobacterium, likely from the order. Marker genes in Class 6 did not give a clear identification, and Class 9 contained no marker genes.

4.3.3.2 Comparisons to previously sequenced reference genomes

Contig classes were compared to relevant reference genomes in the NCBI genomes database (accessed September 2010). Desulfovibrio and Methanospirillum contigs were compared to fully sequenced genomes from the same genus. Methanobacterium contigs were compared to genomes of members of the Methanobacteriaceae family. Clostridiaceae contigs were compared to Clostridium genomes (most similar genus based on 16S and 23S sequences). Comparisons were not performed for the Spirochaete, Synergistete, or unknown Deltaproteobacterium contigs because sufficiently close relatives (same family or genus) could not be identified.

Dhc contigs had the most similarity to reference genomes, while Clostridiaceae and Methanobacterium contigs had < 4% contig match (percent of contig bases in alignments to reference genome) or reference match (percent of reference genome bases in alignments to contigs) (Table 4.3). For comparison, it is useful to consider what these values are for a set of contigs compared to a reference genome that is not closely related. A comparison of the Dhc contigs to seven Desulfovibrio reference genomes results in contig matches and reference matches of 0.1% to 0.2%. For Methanospirillum contigs, the disparity between contig match and reference match is probably due to poor sequencing coverage (0.5 Mbp compared to 3.5 Mbp for Methanospirillum hungatei).


Table 4.3. Comparisons of ANAS metagenome contigs to reference genomes.

| Contig Class | Reference Genome | Contig Matchb | Reference Matchc |
|---|---|---|---|
| Dehalococcoides | Dehalococcoides ethenogenes str. 195 | 81.7% | 82.2% |
| Dehalococcoides | Dehalococcoides str. BAV1 | 73.9% | 78.9% |
| Dehalococcoides | Dehalococcoides str. CBDB1 | 74.7% | 76.9% |
| Dehalococcoides | Dehalococcoides str. VS | 76.0% | 77.4% |
| Dehalococcoides | Dehalococcoides str. GT | 72.4% | 76.6% |
| Desulfovibrio | Desulfovibrio desulfuricans ATCC 27774 | 32.9% | 26.1% |
| Desulfovibrio | Desulfovibrio desulfuricans G20 | 5.1% | 3.3% |
| Desulfovibrio | Desulfovibrio magneticus RS 1 | 3.8% | 1.9% |
| Desulfovibrio | Desulfovibrio salexigens DSM 2638 | 1.6% | 1.2% |
| Desulfovibrio | Desulfovibrio vulgaris DP4 | 6.6% | 4.8% |
| Desulfovibrio | Desulfovibrio vulgaris Hildenborough | 6.5% | 4.5% |
| Desulfovibrio | Desulfovibrio vulgaris Miyazaki | 9.1% | 5.4% |
| Methanobacteria | Methanobrevibacter ruminantium M1 | 0.9% | 0.9% |
| Methanobacteria | Methanobrevibacter smithii ATCC 35061 | 0.8% | 1.4% |
| Methanobacteria | Methanosphaera stadtmanae DSM 3091 | 0.7% | 1.6% |
| Methanobacteria | Methanothermobacter thermautotrophicus | 2.1% | 3.2% |
| Methanospirillum | Methanospirillum hungatei | 46.5% | 6.1% |
| Clostridiaceae | Clostridium acetobutylicum | 0.5% | 1.1% |
| Clostridiaceae | Clostridium beijerinckii | 0.8% | 1.1% |
| Clostridiaceae | Clostridium botulinum A | 0.4% | 1.0% |
| Clostridiaceae | Clostridium botulinum A2 Kyoto | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum A3 Loch Maree | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum A ATCC 19397 | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum A Hall | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum B1 Okra | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum Ba4 657 | 0.4% | 0.9% |
| Clostridiaceae | Clostridium botulinum B Eklund 17B | 0.6% | 1.3% |
| Clostridiaceae | Clostridium botulinum E3 Alaska E43 | 0.6% | 1.3% |
| Clostridiaceae | Clostridium botulinum F 230613 uid47575 | 0.5% | 1.0% |
| Clostridiaceae | Clostridium botulinum F Langeland | 0.5% | 1.0% |
| Clostridiaceae | Clostridium cellulolyticum H10 | 0.8% | 1.0% |
| Clostridiaceae | Clostridium difficile 630 | 0.5% | 1.0% |
| Clostridiaceae | Clostridium difficile R20291 uid38039 | 0.5% | 0.9% |
| Clostridiaceae | Clostridium kluyveri DSM 555 | 1.1% | 1.1% |
| Clostridiaceae | Clostridium kluyveri NBRC 12016 | 1.1% | 1.1% |
| Clostridiaceae | Clostridium ljungdahlii ATCC 49587 | 0.6% | 0.9% |
| Clostridiaceae | Clostridium novyi NT | 0.8% | 2.0% |
| Clostridiaceae | Clostridium perfringens | 0.4% | 1.3% |
| Clostridiaceae | Clostridium perfringens ATCC 13124 | 0.4% | 1.0% |
| Clostridiaceae | Clostridium perfringens SM101 uid12521 | 0.4% | 1.4% |
| Clostridiaceae | Clostridium phytofermentans ISDg | 0.9% | 1.0% |
| Clostridiaceae | Clostridium tetani E88 | 0.4% | 1.0% |
| Clostridiaceae | Clostridium thermocellum ATCC 27405 | 0.5% | 0.6% |

aMost similar sequenced genomes for each contig class are indicated in bold.
bContig match is the percentage of total bases in all contigs in the set that are part of an alignment to the reference genome.
cReference match is the percentage of total bases in the reference genome that are part of an alignment to some contig in the set.

4.3.3.3 PhyloChip analysis of ANAS community composition

PhyloChip analysis of metagenomic DNA identified 1,056 bacterial and archaeal taxa in ANAS (37 bacterial phyla, two archaeal phyla). Of these, 285 taxa were identified as highly active by detection when hybridizing RNA to the PhyloChip (29 bacterial phyla, two archaeal phyla).

The community composition of ANAS as detected by DNA PhyloChip experiments remained stable between the three feeding cycles sampled (mean coefficient of variation for normalized signal intensity: 0.083). The greatest variation was seen for taxa with the lowest average signal intensity. 11 taxa, all among the lowest 5% of average signal intensity, had coefficients of variation ≥ 0.20. The highest coefficient of variation (0.48) was for a Methanosarcinaceae.

The highly active taxa (taxa detected by RNA PhyloChip experiments) were also stable between the three sampling time points (mean coefficient of variation: 0.085). Only four taxa, the four Methanobacteriaceae detected, had coefficients of variation > 0.20 for the RNA PhyloChip experiments. These had coefficients of variation ranging from 0.24 to 0.36 and fell within the lowest 15% of signal intensity.
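The stability measure used for both the DNA and RNA PhyloChip data is the coefficient of variation, the standard deviation divided by the mean, computed per taxon across the three time points. A minimal Python sketch follows; the use of the sample (n − 1) standard deviation is an assumption, since the form used is not stated here, and the signal values in the usage note are invented for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of the values."""
    # Assumption: sample (n - 1) standard deviation.
    return stdev(values) / mean(values)

def mean_cv(signals_by_taxon):
    """Mean coefficient of variation across taxa.

    signals_by_taxon: {taxon: [normalized signal at each time point]}
    """
    return mean(coefficient_of_variation(v) for v in signals_by_taxon.values())
```

For example, illustrative signals of 10, 12, and 14 for one taxon give a coefficient of variation of 2/12, or about 0.17.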

For all contig class taxa (Clostridiaceae, Dhc, Desulfovibrio, , Methanobacterium, Methanospirillum, Spirochaetes, and Synergistetes) except for Methanospirillum, representatives of the same taxa were detected as both present and active by PhyloChip experiments using DNA and RNA respectively. Representatives of all bacterial contig class taxa were among the highest 10% of average signal intensity in DNA PhyloChip experiments, consistent with these taxa being dominant members of the community. However, all Methanobacterium detected were among the lowest 15% of signal intensity, and Methanospirillum were not detected by the PhyloChips. In the RNA PhyloChip experiments, Dhc was the only contig class taxa among the top 10% of signal intensity, although several Clostridiales not identified as Clostridiaceae were also in this most active group. One Spirochaete also appeared in the top 15% of signal intensity.

4.3.4 Metabolic functions in ANAS

4.3.4.1 Metagenome gene content overview

The ANAS metagenome contains 60,992 putative protein coding genes. Of these, 36,101 could be assigned to clusters of orthologous genes (COGs), and 32,520 of those were assigned to categories beyond general function prediction (Table 4.4).


Table 4.4. Overview of ANAS gene content by clusters of orthologous genes.

| COG Categorya | Number of genes | Percentage of genes (out of genes with function prediction beyond general function) |
|---|---|---|
| Amino acid transport and metabolism | 4,046 | 12.4% |
| Energy production and conversion | 3,364 | 10.3% |
| Carbohydrate transport and metabolism | 2,684 | 8.3% |
| Translation, ribosomal structure, and biogenesis | 2,540 | 7.8% |
| Signal transduction mechanisms | 2,499 | 7.7% |
| Replication, recombination, and repair | 2,363 | 7.3% |
| Cell wall / membrane / envelope biogenesis | 2,227 | 6.8% |
| Transcription | 2,201 | 6.8% |
| Coenzyme transport and metabolism | 1,952 | 6.0% |
| Inorganic ion transport and metabolism | 1,922 | 5.9% |
| Posttranslational modification, protein turnover, chaperones | 1,436 | 4.4% |
| Nucleotide transport and metabolism | 1,266 | 3.9% |
| Lipid transport and metabolism | 961 | 3.0% |
| Defense mechanisms | 733 | 2.3% |
| Cell motility | 694 | 2.1% |
| Intracellular trafficking, secretion, and vesicular transport | 670 | 2.1% |
| Secondary metabolites biosynthesis, transport and catabolism | 505 | 1.6% |
| Cell cycle control, cell division, chromosome partitioning | 422 | 1.3% |
| Chromatin structure and dynamics | 33 | 0.1% |
| Cytoskeleton | 1 | 0.0% |
| RNA processing and modification | 1 | 0.0% |
| Function unknown | 2,535 | |
| General function prediction only | 4,577 | |
| Not in COGs | 25,456 | |

aCOG categories are ordered by number of genes in category.

Three types of functional genes related to dechlorination (genes directly involved in dechlorination, genes involved in cobalamin biosynthesis, and genes involved in hydrogen production and consumption) were selected for further analysis because they may provide insight into the dechlorinating abilities of this community and the interactions between community members that result in an efficient dechlorinating consortium.


4.3.4.2 Reductive dechlorination

Fifteen putative RDase genes located on six contigs were identified in the JGI annotation of the ANAS metagenome contigs (Table 4.5). In addition, three RDase genes identified by previous microarray analysis (Lee, Cheng et al. 2011) but not annotated in the JGI annotation were found by BLAST search on an additional contig (ANASMEC_C7898) and gene identities were confirmed by comparison to the NCBI non-redundant nucleotide database. Two of these were present as full length RDase genes. The third, matching Dhc strain 195 gene DET1535, was present as two partial RDase genes disrupted by an apparent frame-shift mutation. Of the seven contigs containing RDase genes, six were identified as Dhc by both SS and TF. The remaining contig (ANASMEC_C818), which contained only one RDase gene, was identified as Dhc by the SS method, but was left unclassified in the TF analysis. This contig had significant sequence similarity to Dhc strain 195 over approximately 25% of its length, including the RDase gene region. The non-aligning contig regions contained recombinases and phage related genes, indicating possible horizontal transfer and perhaps accounting for the atypical tetranucleotide composition.

Of the 17 full length RDase genes identified, seven were matched to putative RDase genes in the NCBI non-redundant protein database (≥ 98% amino acid ID). Together with the partial RDase genes mentioned above that match DET1535 (97% amino acid ID), these correspond to the eight RDase genes identified as present in ANAS (or Dhc isolates from ANAS) by the previous microarray study (Lee, Cheng et al. 2011). These include two (tceA and vcrA) that have been linked to enzymes with demonstrated RDase activity, and one (DET0088) that appears as a truncated RDase gene in Dhc strain 195, but which is extended to a full length novel RDase gene in the ANAS metagenome as noted above. Of the other ten putative RDase genes, one had 91% amino acid identity to another putative RDase and the remaining nine had less than 70% identity to any sequences in the NCBI protein database as of July 2011.


Table 4.5. RDase genes identified in ANAS metagenome contigs.

| Contig Name | Coordinates (strand) | JGI IMG Gene Object ID | Accession Number for MSS (a) | % Identity (Amino Acid) to MSS (a) | Corresponding Microarray Targeted RDase Genes (b) |
|---|---|---|---|---|---|
| ANASMEC_C818 | 19018-20253 (-) | 2014734823 | BAF34980.1 | 98 | DET0079 (tceA) |
| ANASMEC_C6240 | 4113-5522 (+) | 2014753778 | YP_182236.1 | 69 | |
| ANASMEC_C6240 | 12862-14277 (+) | 2014753787 | YP_003757919.1 | 45 | |
| ANASMEC_C6240 | 50278-51759 (-) | 2014753829 | ABY28307.1 | 50 | |
| ANASMEC_C6240 | 52100-53458 (-) | 2014753830 | YP_003758765.1 | 36 | |
| ANASMEC_C6240 | 81651-83084 (+) | 2014753858 | YP_003759128.1 | 51 | |
| ANASMEC_C6240 | 108536-110095 (+) | 2014753885 | YP_003463052.1 | 99 | DhcVS_1291 (vcrA) |
| ANASMEC_C7898 | 32653-34098 (-) | (c) | YP_003330741.1 | 98 | DhcVS_1314 |
| ANASMEC_C7898 (d) | 39035-39556 (-) | (c) | YP_003330743.1 | 97 | DET1535 |
| ANASMEC_C7898 (d) | 39781-40431 (-) | (c) | YP_182233.1 | 97 | DET1535 |
| ANASMEC_C7898 | 57739-59241 (-) | (c) | YP_182243.1 | 100 | DET1545 |
| ANASMEC_C9125 | 6928-8415 (+) | 2014766079 | BAI70456.1 | 69 | |
| ANASMEC_C9422 | 79-1533 (+) | 2014767429 | BAI47828.1 | 51 | |
| ANASMEC_C9422 | 62889-64352 (-) | 2014767507 | YP_307395.1 | 48 | |
| ANASMEC_C9422 | 115180-116547 (-) | 2014767559 | YP_180928.1 | 100 | DET0180 |
| ANASMEC_C9422 | 120535-122028 (-) | 2014767564 | YP_180921.1 | 99 | DET0173 |
| ANASMEC_C9422 | 187506-188984 (-) | 2014767632 | YP_003757807.1 (full length RDase gene); YP_1080839.1 (partial RDase gene) | 40; 99 | DET0088 |
| ANASMEC_C10019 | 2284-3792 (+) | 2014770387 | YP_003330810.1 | 64 | |
| ANASMEC_C10784 | 24153-25724 (+) | 2014774104 | AAT48554.1 | 91 | |

a Most similar sequences (MSSs) determined by blastp comparison of RDase genes from the ANAS metagenome to the NCBI non-redundant protein database.
b Microarray targeted RDase genes identified as present in ANAS (or ANAS Dhc isolates) by Lee et al. (Lee, Cheng et al. 2011) were tied to ANAS metagenome RDases by blastp comparisons of gene sequences to ANAS metagenome contigs.
c Genes on contig ANASMEC_C7898 do not have Gene Object ID numbers because they did not appear in the original JGI annotation.
d Partial RDase genes in ANAS metagenome contigs.


4.3.4.3 Hydrogen production and consumption

Hydrogenases, enzymes that catalyze the reversible oxidation of molecular hydrogen, appear to be widespread in the ANAS community, with 271 genes annotated as hydrogenase components (Appendix 5). Of those, 126 genes were present on contigs that were large enough for classification by TF, spread across all classes except the Methanospirillum class. However, this is likely a false negative result given the low coverage of this genome as described above. Methanospirillum are expected to have genes for hydrogenases used in methanogenesis (Madigan, Martinko et al. 2008). Of the 126 hydrogenase genes in large contigs, the Methanobacterium class had the largest proportion (36 genes), followed by the Desulfovibrio class (26 genes) and Dhc (17 genes). The Clostridiaceae class contained only three genes for hydrogenase components.

4.3.4.4 Cobalamin biosynthesis

In total, twenty genes along the first (corrin ring synthesis) and second (lower ligand attachment and rearrangement) parts of the cobalamin biosynthesis pathway were targeted for analysis (Kanehisa and Goto 2000, Warren, Raux et al. 2002). Near complete cobalamin biosynthesis pathways appear to be present in the Dhc, Methanobacterium, and Clostridiaceae classes (Table 4.6, Appendix 6).

Genes for incomplete biosynthesis pathways were identified in both the ANAS Desulfovibrio and Methanospirillum contigs. However, the total sequence length of these contig classes is significantly smaller than would be expected for a full genome (Desulfovibrio contigs, 2,249,123 bp total, represent 43% to 78% of the length of sequenced Desulfovibrio genomes; Methanospirillum contigs, 421,953 bp total, represent 12% of the length of the Methanospirillum hungatei genome), indicating incomplete coverage.


Table 4.6. Cobalamin biosynthesis genes identified in ANAS metagenome contigs.

Contig classifications: Clostridiaceae; Dehalococcoides; Spirochaetes; Methanobacterium; Desulfovibrio (class 6); Synergistetes; Deltaproteobacterium (class 9); Methanospirillum; unclassified short contigs

Genes (a), with an x for each contig classification in which the gene was identified:

cbiX/cbiK (cobN) (b): x x x x x x
cbiL (cobI) (c): x x x x x x
cbiH (cobJ) (c, d): x x x x x x
cbiF (cobM) (c, d): x x x x x x x
cbiG (c): x x x x x x
cbiD (cobF) (c): x x x x x x x x
cbiJ (cobK) (c): x x x x
cbiE (cobL) (c, d, e): x x x x x x x
cbiT (cobL) (c, d): x x x x x
cbiC (cobH) (c, d): x x x x x x
cbiA (cobB) (d, e): x x x x x x x
cobA (cobO) (d, e): x x x x x
cbiP (cobQ) (d, e): x x x x x x x
cbiB (cobD) (d, e): x x x x x x
cobU (cobP) (d, e): x x x x x x
cobT (cobU) (d, e): x x x x
cobC (cobU) (e): x
cobS (cobV) (d, e): x x x x x x

a Gene names are given for the anaerobic (early cobalt insertion) cobalamin biosynthesis pathway, with the names for the aerobic pathway genes with the same function in parentheses.
b cbiX and cbiK (and cobN) are grouped together because they code for alternative cobaltochelatases.
c Genes involved in corrin ring synthesis, the first part of the cobalamin synthesis pathway.
d Genes present in Hodgkinia cicadicola.
e Genes present in Dhc strain 195.


4.3.4.5 TCE dechlorination by ANAS subcultures under different cobalamin conditions

Figure 4.6 shows ethene production during TCE dechlorination by subcultures of the ANAS microbial community. Subcultures were capable of dechlorinating TCE to ethene, even when additional cobalamin was not provided. However, dechlorination was more rapid when higher concentrations of cobalamin were provided.

Figure 4.6. Ethene production during TCE dechlorination by ANAS subcultures. (a.) First subculture. (b.) Second subculture.

4.4 Discussion

In this study, metagenomic sequencing and analysis were used to examine the phylogenetic composition of ANAS and the genes present in the dominant community members, with a focus on Dhc. Although Dhc and non-Dhc metagenome contigs were classified based on TF (tetranucleotide frequency), an alternative SS (sequence similarity) approach was also used to identify Dhc contigs. Both approaches have advantages: SS can identify smaller contigs, while TF works even when closely related reference genomes are unavailable.
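To make the TF idea concrete, the following minimal sketch computes a tetranucleotide frequency vector for a single sequence. It is illustrative only: the function name, the choice to skip windows containing ambiguous bases, and the toy sequence are assumptions, and the sketch does not reproduce the contig classification procedure used in this study.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
VALID_BASES = set("ACGT")

def tetranucleotide_frequencies(seq):
    """Return a 256-element list of relative tetranucleotide frequencies.

    Windows containing characters other than A, C, G, or T are skipped
    (an assumption made here for simplicity).
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 4]
        for i in range(len(seq) - 3)
        if set(seq[i:i + 4]) <= VALID_BASES
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]

# Hypothetical toy sequence; real contigs are thousands of bases long.
freqs = tetranucleotide_frequencies("ACGTACGTNACGT")
```

Vectors of this kind can then be compared between contigs, which is why the approach does not depend on the availability of closely related reference genomes.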

Metagenomic analysis has provided some insight into the functions and interactions of different community members in the context of overall TCE dechlorination activity. The widespread presence of genes for hydrogenases emphasizes the importance of hydrogen metabolism in this community. In the ANAS bioreactor, lactate is fermented to acetate and hydrogen, which are used by Dhc and by other organisms. Because hydrogenases can catalyze both the formation and degradation of molecular hydrogen, the presence of hydrogenase genes does not differentiate the organisms that are producing hydrogen from those that are consuming it. Based on knowledge of other organisms in these taxonomic groups, however, the Clostridiaceae, the Desulfovibrio, and the Spirochaete are potential fermenters that produce hydrogen, although some may also be homoacetogens, consuming hydrogen and carbon dioxide to produce acetate (Leadbetter, Schmidt et al. 1999, Madigan, Martinko et al. 2008). The methanogens likely consume hydrogen as an electron donor, competing with Dhc (Madigan, Martinko et al. 2008). These different hydrogen producers and consumers (fermenters, homoacetogens, reductive dechlorinators, and methanogens) have different thermodynamic requirements and different hydrogen thresholds. However, in this community they appear to have developed working syntrophic relationships, allowing stable long-term dechlorination activity.

With respect to dechlorination reactions, although other organisms are known to reductively dechlorinate TCE to DCE in many environments (Scholz-Muramatsu, Neumann et al. 1995, Sharma and McCarty 1996, Holliger, Hahn et al. 1998, Löffler, Cole et al. 2004), the association of all RDase genes in the metagenome with Dhc contigs implies that Dhc is the dominant, and possibly sole, dechlorinator in ANAS. Previous studies have indicated that ANAS contains two distinct Dhc strains (Holmes, He et al. 2006, Lee, Cheng et al. 2011). Consequently, the metagenomic dataset was analyzed to determine whether sequences from these strains were co-assembled. Although co-assembly at the domain level has been reported for both real and simulated metagenomic datasets, these errors are expected to be rare and easy to identify (DeLong 2005, Mavromatis, Ivanova et al. 2007). Co-assembly of closely related species or strains is more common and more difficult to detect (Mavromatis, Ivanova et al. 2007, Kunin, Copeland et al. 2008). In this study, co-assembly of sequences from the two Dhc strains was detected for at least six contigs, representing 541,431 bp. Considering the similarity between these two strains (Lee, Cheng et al. 2011), this amount of co-assembly is not surprising. However, it is worth recognizing as a characteristic of this approach, and it highlights the importance of sequencing isolates and/or single cells in parallel with metagenome studies.

Because the medium provided for ANAS contains only 2 µg/L cobalamin, a lower than optimal concentration for Dhc (He, Holmes et al. 2007), cobalamin synthesis in the bioreactor is likely necessary to support the observed dechlorination abilities. Several community members, including Dhc, appear to have genes for complete or near complete cobalamin biosynthesis pathways. Although some genes appear to be missing, not all genes identified in the pathway are necessary for de novo cobalamin synthesis. For example, Hodgkinia cicadicola, an endosymbiont of cicadas with a highly streamlined genome, retains cobalamin synthesis capabilities despite lacking several of the enzymes in the pathway (Table 4.6) (McCutcheon, McDonald et al. 2009). Subcultures of the ANAS microbial community continued to dechlorinate TCE to ethene without additional cobalamin despite 10-fold (subculture 1) and 100-fold (subculture 2) dilutions of residual cobalamin carried over in the ANAS inoculum, confirming that cobalamin production is functional within this microbial community.

Since previously sequenced Dhc do not have these genes and Dhc are assumed to obtain this cofactor from other organisms, the association of genes for corrin ring synthesis (the first part of cobalamin biosynthesis) with Dhc was unexpected (Kube, Beck et al. 2005, Seshadri, Adrian et al. 2005, He, Holmes et al. 2007). The contig regions containing the corrin ring synthesis genes have TF compositions that were grouped with the Dhc sequences and not with any of the other contig classes (Figure 4.4), implying that these genes were not recently horizontally transferred to Dhc, but have been maintained in the ANAS Dhc for some time. Given that Dhc are known to have relatively streamlined genomes (Kube, Beck et al. 2005, Seshadri, Adrian et al. 2005, McMurdie, Behrens et al. 2009), it is interesting that the ANAS Dhc appear to be maintaining genes for this pathway even though other community members appear to be capable of supplying this cofactor and cobalamin has been supplied in the medium, albeit at a low level, for over ten years. Since PCR amplification and sequencing have confirmed the presence of these genes in Dhc strain ANAS2, preliminary experiments were performed to investigate the functionality of the Dhc cobalamin biosynthesis pathway in that strain. In these experiments, DMB was provided to some cultures because DMB is the lower ligand of the cobalamin molecule. The metagenomic analysis did not reveal a DMB synthesis pathway, indicating that exogenous DMB may be necessary for cobalamin production even if the identified corrin ring synthesis genes are functional. Only minimal ethene production was observed when this strain was grown without cobalamin (Figure 4.5), indicating that the predicted cobalamin synthesis pathway was not actively providing cobalamin under these conditions. A previous study showed that Dhc is capable of scavenging and modifying corrinoids from their environment (Yi, Seth et al. 2012) and these newly identified cobalamin synthesis genes may represent an extension of that scavenging system. Further investigations are necessary to determine whether this pathway is utilized under other conditions, either for de novo cobalamin synthesis or for corrinoid scavenging and repair.

The description of the community composition derived from metagenomic analysis is generally consistent with those of previous 16S clone library studies (Richardson, Bhupathiraju et al. 2002, Lee, Johnson et al. 2006) and the PhyloChip study presented here. Overall, data from the clone libraries and metagenome sequencing agreed on the most abundant bacterial taxa, which were also detected by the PhyloChip. The PhyloChip also detected many other taxa because it is more effective at detecting low abundance organisms (Brodie, DeSantis et al. 2006, DeSantis, Brodie et al. 2007), being less sensitive to the random sampling effects that impact sequencing based approaches (Zhou, Kang et al. 2008, Zhou, Wu et al. 2011). With the exception of Methanospirillum, the archaeal taxa detected in the metagenome were also detected by the PhyloChip, along with several other archaea. No archaeal clone libraries have yet been prepared for ANAS.

One notable discrepancy between the bacterial clone libraries and the metagenome was in the relative abundance of taxa detected by the two methods. Specifically, the Spirochaete exhibited only low abundance (1-2% of clones) in both clone library experiments (Richardson, Bhupathiraju et al. 2002, Lee, Johnson et al. 2006). Based on the median contig length and average read depth of Spirochaete contigs (Table 4.2) however, the Spirochaete appears to be one of the more abundant organisms in ANAS. Studies of other Dhc containing dechlorinating microbial communities have also detected Spirochaetes (Gu, Hedlund et al. 2004, Macbeth, Cummings et al. 2004, Duhamel and Edwards 2006). Based on what is known of Spirochaetes in general, they may be fermenters or homoacetogens in these communities (Leadbetter, Schmidt et al. 1999, Madigan, Martinko et al. 2008). Clone libraries are known to be susceptible to PCR and cloning biases (von Wintzingerode, Gobel et al. 1997), and some studies have found Spirochaetes in particular to be underrepresented in some clone libraries (Campbell and Cary 2001, Hongoh, Ohkuma et al. 2003). However, recent studies suggest that estimates of relative abundance based on metagenomic sequencing read depth are also biased (Amend, Seifert et al. 2010, Morgan, Darling et al. 2010).

The notable discrepancies between the metagenome and the PhyloChip results were with the methanogens. The PhyloChip did not detect any Methanospirillum, and although the read depth and contig length of the Methanobacterium contigs indicate that they were dominant community members, their low signal intensity using the PhyloChip suggests otherwise. Because these experiments involved amplification of 16S genes prior to PhyloChip hybridization, the low signal intensity may be due to poor amplification. Methanogens had the highest coefficients of variation in both the PhyloChip DNA and RNA results, lending weight to the explanation that the methanogen population is less stable than the rest of this microbial community.

In this study the metagenome sequences were also compared with a previous comparative genomics study that used microarrays to detect known Dhc genes in ANAS (Lee, Cheng et al. 2011). The agreement between the two approaches in detecting Dhc genes (Figure 4.1) confirms that the coverage of Dhc in the metagenomic sequence data was very high. Most differences between the results of the two methods are regions of the reference Dhc genomes for which no genes were detected in ANAS by microarray, but which had an alignment in the metagenome contigs. This highlights the specificity of the microarray, which detects only very closely matched sequences. In contrast, metagenomic sequencing allows the detection of somewhat more divergent versions of genes as well as unexpected or novel genes.

This analysis of metagenomic sequence data has advanced our understanding of this dechlorinating microbial community. The phylogenetic composition of ANAS described by metagenomic sequencing generally confirms the composition described by PhyloChip and previous 16S clone library studies, with a few discrepancies in the relative abundances of some taxa and possible variability in the methanogen population. More importantly, the analysis of functional genes relevant to dechlorination provides insight into the capabilities of microbial community members. Dhc appear to be the dominant reductive dechlorinators in ANAS since all RDase genes identified were associated with Dhc. Genes related to the synthesis of cobalamin, an important cofactor for reductive dechlorination, are present in several community members, including Dhc, highlighting the importance of this cofactor in the function of ANAS. This is the first time that genes for the first part of the cobalamin biosynthesis pathway have been identified in a Dhc strain, further highlighting the unique adaptation of the ANAS strains to reductive dechlorination, but also suggesting that the non-Dhc community members likely have additional important roles beyond cobalamin biosynthesis.


Chapter 5:

Evaluation of microarray specificity for detecting Dehalococcoides mccartyi genes in mixed microbial communities using metagenomic sequence data


5.1 Introduction

Members of the bacterial species Dehalococcoides mccartyi (Dhc) are the only organisms known to be able to fully dechlorinate the potentially carcinogenic groundwater contaminants tetrachloroethene (PCE) and trichloroethene (TCE) to the harmless end product ethene via reductive dechlorination (Maymo-Gatell, Chien et al. 1997, Smidt and de Vos 2004). Because of this apparently unique capability, Dhc has been heavily studied in isolation, in defined microbial consortia, and in mixed microbial communities using a variety of approaches (Ding and He 2012, Löffler, Ritalahti et al. 2013).

Microarrays have been used to study the presence, gene content, and gene expression of Dhc in communities and in isolation (West, Johnson et al. 2008, Conrad, Brodie et al. 2010, Hug, Salehi et al. 2011, Lee, Cheng et al. 2011, Waller, Hug et al. 2012, Mansfeldt, Rowe et al. 2014). When conducting and interpreting microarray study results, it is helpful to understand the specificity of the microarray for the targeted sequences, especially when targeting organisms in mixed communities whose genetic sequences may not be identical to those used to design the microarray. Previous studies have reported a wide range of microarray specificity depending on type of microarray, probe design, and protocols (Kane, Jatkoe et al. 2000, Koltai and Weingarten-Baror 2008, Oh, Yoder-Himes et al. 2010). However, most previous studies used well defined, known DNA samples to examine microarray specificity and sensitivity (Kane, Jatkoe et al. 2000, Oh, Yoder-Himes et al. 2010). While informative, such studies cannot capture the complexities introduced by using microarrays to profile the genetic content of mixed microbial communities (Dugat-Bony, Peyretaillade et al. 2012).

In this study, microarray specificity in the context of complex, mixed microbial communities was evaluated using metagenomic sequencing data from three communities, including the ANAS community analyzed in Chapter 4. The microarray evaluated here, which targets 98.6% of genes from four sequenced Dhc isolates, has been used previously to profile Dhc genes in dechlorinating microbial communities and un-sequenced Dhc isolates (Lee, Cheng et al. 2011, Lee, Cheng et al. 2013, Men, Lee et al. 2013, West, Lee et al. 2013). This microarray is capable of differentiating between closely related Dhc strains, indicating high specificity (Lee, Cheng et al. 2011).

5.2 Methods

5.2.1 Microbial communities

Three TCE dechlorinating microbial communities containing Dhc were evaluated. The first was the ANAS community described in Chapter 4. The remaining two communities were developed by Dr. Yujie Men and are described in detail in (Men, Lee et al. 2013). Briefly, these were cultures inoculated with groundwater samples and enriched over many generations to ferment lactate and to dechlorinate TCE. Culture HiTCEB12 was enriched with 100 µg/L (74 nM) cobalamin amendment, while culture HiTCE was enriched without exogenous cobalamin.
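The mass and molar concentrations given above can be related with a short calculation. The sketch below assumes the amendment was cyanocobalamin (molar mass about 1355.4 g/mol); the text states only "cobalamin", so the molar mass and the function name are assumptions for illustration.

```python
# Approximate molar mass of cyanocobalamin (C63H88CoN14O14P) in g/mol.
# Assumption: the cobalamin amendment was supplied as cyanocobalamin.
CYANOCOBALAMIN_G_PER_MOL = 1355.4

def ug_per_l_to_nm(conc_ug_per_l, molar_mass_g_per_mol=CYANOCOBALAMIN_G_PER_MOL):
    """Convert a mass concentration in ug/L to a molar concentration in nM."""
    # ug/L divided by g/mol gives umol/L; multiplying by 1000 gives nmol/L.
    return conc_ug_per_l / molar_mass_g_per_mol * 1000.0

hitceb12_nm = ug_per_l_to_nm(100)  # about 74 nM under the stated assumption
```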


5.2.2 Metagenome and microarray datasets

Metagenome sequences for ANAS were described in Chapter 4. All raw sequencing reads were used in this analysis, including 453,944 454-Titanium sequencing reads and 76,272 mate pairs of Sanger sequencing reads.

Metagenome sequences for HiTCEB12 and HiTCE were provided by Dr. Yujie Men. Prior to analysis, raw Illumina sequencing reads were processed to trim adapter contamination sequences and low quality (q < 20) bases using Scythe and Sickle.

The microarray datasets used in this analysis used the microarray described in (Lee, Cheng et al. 2011). Briefly, these were Affymetrix GeneChips targeting 98.6% of genes from four Dhc genomes (strains 195, BAV1, CBDB1 and VS). This microarray targets each gene with a probe set consisting of 11 exact match probes, each 25 bases long, along with 11 corresponding single mismatch probes in which the thirteenth base is a mismatch for the target gene sequence. All microarray datasets were analyzed as previously described (West, Johnson et al. 2008, Lee, Cheng et al. 2011). A gene was deemed “Present” if it had a p-value < 0.05, indicating differential hybridization to exact match probes over mismatch probes, and a signal intensity > 140 for all three replicates (Lee, Cheng et al. 2011, Men, Lee et al. 2013).
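The "Present" rule described above can be sketched as a small function. This is an illustration only: the function and parameter names are hypothetical, and it assumes a single p-value per probe set together with one signal intensity per replicate, which may differ from how the original analysis software organized its data.

```python
def is_present(p_value, replicate_signals, p_cutoff=0.05, signal_cutoff=140):
    """Return True if a probe set meets both criteria for a "Present" call.

    p_value: p-value for differential hybridization to exact match probes
        over mismatch probes (assumed here to be one value per probe set).
    replicate_signals: signal intensities, one per replicate.
    """
    return p_value < p_cutoff and all(s > signal_cutoff for s in replicate_signals)

# Hypothetical values: a low signal in any one replicate yields "Absent",
# even when the p-value criterion is met.
call_a = is_present(0.01, [500, 320, 410])
call_b = is_present(0.01, [500, 120, 410])
```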

Microarray data for ANAS were reported by Lee et al. (Lee, Cheng et al. 2011) and were provided by Dr. Patrick K. H. Lee. Microarray data for HiTCEB12 and HiTCE were reported by Men et al. (Men, Lee et al. 2013) and were provided by Dr. Yujie Men.

5.2.3 Evaluation of microarray specificity through comparison of datasets

Microarray exact match probe sequences were aligned to metagenome sequences using the Bowtie aligner to find the best match between the probe and the metagenome sequences. Bowtie options were set to allow up to three mismatches in an alignment.
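For illustration, the quantity extracted from the alignments (the fewest mismatches at which a probe matches any metagenome sequence, up to three) can be sketched with a brute-force search. This is a simplified stand-in and not Bowtie's algorithm; the function names and toy sequences are hypothetical, and the sketch assumes ungapped matching on either strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_mismatch_count(probe, sequences, max_mismatches=3):
    """Return the fewest mismatches (0..max_mismatches) at which the probe
    matches any window of any sequence on either strand, or None if no
    window is within max_mismatches."""
    best = None
    k = len(probe)
    for seq in sequences:
        for target in (seq, reverse_complement(seq)):
            for i in range(len(target) - k + 1):
                d = hamming(probe, target[i:i + k])
                if d <= max_mismatches and (best is None or d < best):
                    best = d
                    if best == 0:
                        return 0  # an exact match cannot be improved upon
    return best

# Toy 10-base probe for readability; the microarray probes are 25 bases long.
probe = "ACGTACGTAC"
exact = best_mismatch_count(probe, ["TTTACGTACGTACTTT"])
one_off = best_mismatch_count(probe, ["TTTACGTTCGTACTTT"])
unaligned = best_mismatch_count(probe, ["TTTTTTTTTTTTTTTT"])
```

A dedicated short-read aligner is far more efficient than this exhaustive scan, which is shown only to make the definition of a "best match" explicit.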

Based on the alignment results, a probe mismatch profile was determined for each microarray probe set. A probe set’s mismatch profile included five numbers: the number of probes whose best alignment in the metagenomic sequences had zero, one, two, or three mismatches and the number of probes that did not align. For example, a probe set (gene) that had six probes with zero mismatches, one probe with one mismatch, two probes with two mismatches, two probes with three mismatches, and zero unaligned probes would be represented by the profile [6, 1, 2, 2, 0]. Once mismatch profiles were determined, the relationships between these profiles and the presence/absence of genes according to the microarray analysis were evaluated.
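A mismatch profile of this form can be tallied from per-probe alignment results as in the minimal sketch below. The function name and the input representation (a best mismatch count per probe, with None for an unaligned probe) are assumptions for illustration.

```python
def mismatch_profile(best_mismatches, max_mismatches=3):
    """Tally a probe set's mismatch profile.

    best_mismatches: one entry per probe, giving the number of mismatches in
        that probe's best alignment, or None if the probe did not align.
    Returns [n0, n1, n2, n3, n_unaligned] for the default max_mismatches.
    """
    profile = [0] * (max_mismatches + 2)
    for m in best_mismatches:
        if m is None or m > max_mismatches:
            profile[-1] += 1  # unaligned
        else:
            profile[m] += 1
    return profile

# The example from the text: six exact matches, one probe with one mismatch,
# two with two mismatches, two with three mismatches, none unaligned.
example = [0] * 6 + [1] + [2] * 2 + [3] * 2
profile = mismatch_profile(example)
```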

5.3 Results and discussion

The distribution of genes among different categories of profiles is shown in Figure 5.1. Of the 1,365 possible mismatch profiles, 676 were detected for at least one gene in at least one of the datasets (378, 441, and 436 for ANAS, HiTCEB12, and HiTCE datasets respectively). Of these, a majority of profiles (453 profiles, 67%) always corresponded to “Absent” identifications in the microarray analysis. Most of the remaining profiles (200 profiles, 30%) always corresponded to “Present” identifications. Only 23 profiles were non-determinant, corresponding to some genes that were “Absent” and some genes that were “Present”. However, a few of these non-determinant profiles corresponded to a large number of genes, resulting in a nearly even distribution of genes between the always “Present” and non-determinant profile categories (Figure 5.1, Table 5.1).
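The count of 1,365 possible profiles follows from distributing 11 probes among five bins (zero, one, two, or three mismatches, or unaligned), and the three profile categories follow from collecting the microarray calls observed for each profile. The sketch below illustrates both steps; the function name and the input representation are hypothetical.

```python
from collections import defaultdict
from math import comb

N_PROBES = 11  # probes per probe set
N_BINS = 5     # 0, 1, 2, or 3 mismatches, or unaligned

# Number of ways to distribute 11 probes among 5 bins ("stars and bars").
n_possible_profiles = comb(N_PROBES + N_BINS - 1, N_BINS - 1)

def categorize_profiles(genes):
    """Assign each observed profile to a category.

    genes: iterable of (profile, is_present) pairs, one per gene.
    Returns {profile_tuple: category}.
    """
    calls = defaultdict(set)
    for profile, is_present in genes:
        calls[tuple(profile)].add(bool(is_present))
    categories = {}
    for profile, seen in calls.items():
        if seen == {True}:
            categories[profile] = "always Present"
        elif seen == {False}:
            categories[profile] = "always Absent"
        else:
            categories[profile] = "non-determinant"
    return categories

# Hypothetical toy input, not data from this study.
toy = [
    ([11, 0, 0, 0, 0], True),
    ([11, 0, 0, 0, 0], False),
    ([10, 1, 0, 0, 0], True),
    ([0, 0, 0, 0, 11], False),
]
toy_categories = categorize_profiles(toy)
```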

Figure 5.1. Distribution of genes among profile categories. Profile categories: always “Absent”, always “Present”, and non determinant. The large pie chart is for all datasets combined. Small pie charts are for individual datasets. Numbers on the wedges of the small pie charts indicate the number of genes in that dataset represented by that profile category.

Details of the non-determinant profiles are given in Table 5.1. Notably, for both the HiTCEB12 and HiTCE datasets but not for ANAS, there were a small number of genes that were identified as “Absent” in the microarray analysis for which exact matches were found in the metagenomic sequencing reads for all eleven probe sequences. These included six genes for HiTCEB12 (DET_tRNA-Val-1, DET_tRNA-Ala-1, DET_tRNA-Pro-2, DET_tRNA-Val-3, DET_tRNA-Ala-2, and DET1376) and three genes for HiTCE (DET1463, DET_tRNA-Val-3, and DET1376). Further review of the microarray data revealed that the probe sets for these genes all had p-values less than 0.05, an indication of the presence of the gene, but were considered “Absent” due to low signal intensity of one or more replicate samples.

The microarray analysis was highly specific for sequences with low divergence from the target sequence. The fraction of genes identified as “Present” was high when most probes had exact matches or only one mismatch, while that fraction was very low if three or more probes had three mismatches or were unaligned (Figure 5.2). The ANAS dataset showed slightly lower specificity than the other datasets, identifying a larger fraction of genes as “Present” when multiple probes had two mismatches (Figure 5.2 grey squares).

Figure 5.2. Fraction of genes identified as “Present” as a function of the number of probes for that gene with exactly N mismatches where N = 0, 1, 2, 3, or > 3 (unaligned). The large graph is for all datasets combined. Small graphs are for individual datasets.
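The quantity plotted in Figure 5.2 can be computed from per-gene mismatch profiles and microarray calls as sketched below. The function name, the input representation, and the toy data are assumptions for illustration and are not taken from the analysis code used in this study.

```python
from collections import defaultdict

def fraction_present_by_count(genes, n):
    """Fraction of genes called "Present", grouped by the number of probes
    falling in bin n of the mismatch profile.

    genes: iterable of (profile, is_present) pairs, one per gene.
    n: profile index (0 to 3 for that many mismatches, 4 for unaligned).
    Returns {number_of_probes_in_bin_n: fraction_present}.
    """
    totals = defaultdict(int)
    present = defaultdict(int)
    for profile, is_present in genes:
        k = profile[n]
        totals[k] += 1
        present[k] += bool(is_present)
    return {k: present[k] / totals[k] for k in sorted(totals)}

# Hypothetical toy input, not data from this study.
toy_genes = [
    ([11, 0, 0, 0, 0], True),
    ([11, 0, 0, 0, 0], False),
    ([9, 2, 0, 0, 0], True),
    ([0, 0, 3, 4, 4], False),
]
by_exact_matches = fraction_present_by_count(toy_genes, 0)
```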


Table 5.1. Non-determinant probe set (gene) mismatch profiles.

| Profile: num probes with 0 mismatches | Profile: num probes with 1 mismatch | Profile: num probes with 2 mismatches | Profile: num probes with 3 mismatches | Profile: num probes unaligned | ANAS num genes | ANAS fraction "Present" | HiTCEB12 num genes | HiTCEB12 fraction "Present" | HiTCE num genes | HiTCE fraction "Present" | All Datasets num genes | All Datasets fraction "Present" |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | 2 | 0 | 0 | 0 | 137 | 1.00 | 147 | 1.00 | 181 | 0.99 | 465 | 1.00 |
| 11 | 0 | 0 | 0 | 0 | 238 | 1.00 | 417 | 0.99 | 339 | 0.99 | 994 | 0.99 |
| 7 | 4 | 0 | 0 | 0 | 82 | 1.00 | 35 | 0.97 | 54 | 1.00 | 171 | 0.99 |
| 7 | 2 | 2 | 0 | 0 | 20 | 1.00 | 12 | 0.92 | 10 | 1.00 | 42 | 0.98 |
| 8 | 2 | 0 | 1 | 0 | 13 | 1.00 | 6 | 0.83 | 5 | 1.00 | 24 | 0.96 |
| 5 | 6 | 0 | 0 | 0 | 22 | 1.00 | 12 | 0.92 | 10 | 0.90 | 44 | 0.95 |
| 5 | 4 | 1 | 1 | 0 | 8 | 1.00 | 3 | 0.67 | 0 | n/a | 11 | 0.91 |
| 3 | 8 | 0 | 0 | 0 | 7 | 1.00 | 3 | 0.67 | 3 | 0.67 | 13 | 0.85 |
| 4 | 4 | 3 | 0 | 0 | 4 | 1.00 | 4 | 0.50 | 1 | 1.00 | 9 | 0.78 |
| 6 | 1 | 2 | 2 | 0 | 3 | 0.67 | 0 | n/a | 0 | n/a | 3 | 0.67 |
| 3 | 6 | 1 | 1 | 0 | 1 | 1.00 | 0 | n/a | 2 | 0.50 | 3 | 0.67 |
| 3 | 5 | 3 | 0 | 0 | 3 | 1.00 | 1 | 0.00 | 2 | 0.50 | 6 | 0.67 |
| 0 | 11 | 0 | 0 | 0 | 4 | 0.75 | 4 | 0.50 | 3 | 0.67 | 11 | 0.64 |
| 1 | 5 | 2 | 2 | 1 | 2 | 0.50 | 0 | n/a | 0 | n/a | 2 | 0.50 |
| 1 | 2 | 2 | 1 | 5 | 2 | 0.50 | 1 | 0.00 | 0 | n/a | 3 | 0.33 |
| 3 | 4 | 4 | 0 | 0 | 2 | 1.00 | 4 | 0.00 | 3 | 0.33 | 9 | 0.33 |
| 0 | 3 | 0 | 5 | 3 | 3 | 0.33 | 0 | n/a | 0 | n/a | 3 | 0.33 |
| 0 | 3 | 1 | 3 | 4 | 2 | 0.50 | 1 | 0.00 | 1 | 0.00 | 4 | 0.25 |
| 0 | 3 | 8 | 0 | 0 | 2 | 0.50 | 1 | 0.00 | 2 | 0.00 | 5 | 0.20 |
| 0 | 0 | 0 | 7 | 4 | 38 | 0.03 | 4 | 0.00 | 2 | 0.00 | 44 | 0.02 |
| 0 | 0 | 2 | 6 | 3 | 16 | 0.06 | 19 | 0.00 | 26 | 0.00 | 61 | 0.02 |
| 0 | 0 | 1 | 1 | 9 | 73 | 0.01 | 38 | 0.00 | 45 | 0.00 | 156 | 0.01 |
| 0 | 0 | 3 | 4 | 4 | 9 | 0.11 | 63 | 0.00 | 54 | 0.00 | 126 | 0.01 |


Relationships between profile mismatch distributions and microarray “Present”/“Absent” identification are shown in more detail in Figure 5.3. Considering genes for which all probes aligned, the maximum numbers of mismatches observed for a gene that was still considered “Present” were 27, 18, and 15 for the ANAS, HiTCEB12, and HiTCE datasets respectively, corresponding to an estimated 90% to 95% sequence identity (indicated with arrows in Figure 5.3). For all datasets, there were a small number (0.3% to 2%) of genes identified as “Present” for which none of the probes had perfect matches (27, 5, and 4 genes for ANAS, HiTCEB12, and HiTCE respectively) (top row of symbols in Figure 5.3). The HiTCEB12 and HiTCE datasets also contained some genes identified as “Present” with up to four unaligned probes, while the ANAS dataset contained one gene identified as “Present” despite nine unaligned probes.
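One way to arrive at the estimated 90% to 95% identity is to divide the total number of mismatches by the number of bases targeted by a probe set (11 probes of 25 bases each). The sketch below assumes this is how the estimate was made and ignores any overlap between probes; both are assumptions for illustration.

```python
PROBES_PER_SET = 11
PROBE_LENGTH = 25  # bases

def estimated_identity(total_mismatches):
    """Estimate sequence identity over the bases targeted by one probe set.

    Assumes the probes do not overlap, so that they cover
    PROBES_PER_SET * PROBE_LENGTH = 275 bases in total.
    """
    targeted_bases = PROBES_PER_SET * PROBE_LENGTH
    return 1.0 - total_mismatches / targeted_bases

# Maximum total mismatches for a "Present" gene in each dataset (from the text).
max_mismatches = {"ANAS": 27, "HiTCEB12": 18, "HiTCE": 15}
identity_percent = {
    name: round(100 * estimated_identity(m), 1) for name, m in max_mismatches.items()
}
```

Under these assumptions the three values fall between roughly 90% and 95%, consistent with the range quoted above.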

The microarray analysis was highly specific for the targeted gene sequences but did not exclusively require exact matches, requiring a minimum estimated sequence identity of 90-95% in sequences targeted by probes for a gene to be identified as “Present”. This is despite the use of single mismatch probes to account for non-specific hybridization (West, Johnson et al. 2008, Lee, Cheng et al. 2011). The high specificity of this microarray is consistent with its previously demonstrated ability to differentiate between genes from closely related Dhc strains (Lee, Cheng et al. 2011).

The specificity of these microarray analyses may come at the cost of sensitivity by causing some genes to be identified as “Absent” for which exact or very close matches to the probe target sequences are actually present. As noted above, this was seen for an extremely small number of genes (6 and 3 genes respectively) in the HiTCEB12 and HiTCE datasets, and none for the ANAS dataset. Further investigation of the effects of parameters used in microarray analysis on specificity and sensitivity could improve interpretation of microarray results. Previous studies have found that several factors including probe length, GC content, and probe sequence, can affect hybridization efficiencies, thus influencing microarray specificity and sensitivity (Letowski, Brousseau et al. 2004, Harrison, Binder et al. 2013). The location of mismatches, which was not considered in this analysis, has also been shown to be a factor in probe hybridization efficiencies (Letowski, Brousseau et al. 2004).


Figure 5.3. Relationships between profile mismatch distributions and microarray “Present”/“Absent” identification. The large graph is for all datasets combined. Small graphs are for individual datasets. Marker size is proportional to number of genes. Green circles indicate “Present” genes and red Xs indicate “Absent” genes. Arrows indicate the greatest number of mismatches for which any genes were “Present” for each dataset.

This analysis also revealed only small apparent differences between the ANAS dataset and the HiTCEB12 and HiTCE datasets, with the analysis of the ANAS dataset indicating slightly lower microarray specificity. The ANAS dataset resulted in a higher fraction of genes declared “Present” when a majority of probes had either one or two mismatches (Figure 5.2, small graphs), and allowed a higher number of total mismatches to still be identified as “Present” (Figure 5.3, arrows on small graphs). However, these differences affect the results for only a very small portion (2%) of genes, as indicated by the small marker sizes in the relevant regions of Figure 5.3. Differences in metagenomic sequencing or in the microarray experiments may have contributed to the observed differences in the specificity analysis results.

The ANAS metagenome was sequenced using a combination of 454-Titanium and Sanger sequencing, generating a total of 0.3 Gbp of sequence. In comparison, the metagenome datasets for HiTCEB12 and HiTCE totaled 17.3 Gbp and 14.0 Gbp of sequence (after trimming) respectively, generated using Illumina HiSeq. The lower sequence quantity for ANAS implies lower sequencing depth, which could miss low abundance variants in the Dhc population. Microarray approaches are more effective at detecting low abundance variants because they are less susceptible to sampling biases that affect sequencing (Brodie, DeSantis et al. 2006, DeSantis, Brodie et al. 2007, Zhou, Kang et al. 2008, Zhou, Wu et al. 2011). However, the analysis of the ANAS metagenome indicated high sequencing depth for Dhc contigs (Table 4.2) (Brisson, West et al. 2012). Further, when Lee et al. applied DNA from the two Dhc strains isolated from ANAS (ANAS1 and ANAS2) to the same microarray, they found that these two dominant strains entirely account for the Dhc genes identified in the microarray analysis of the ANAS community (Lee, Cheng et al. 2011), indicating that low abundance variants were not responsible for the anomalously declared “Present” genes. This suggests that sequencing differences are unlikely to account for the observed differences in specificity analyses between datasets.

Differences between microarray experiments may also have contributed to the small differences observed in the specificity analyses. Sample preparation and processing for ANAS were performed by different personnel and at different times from the HiTCEB12 and HiTCE samples, which could have contributed to small differences in microarray specificity results. In their study of microarray expression analysis variability, Bammler et al. found variability within and between laboratories for microarray analysis results even with standardized protocols and sample material (Bammler, Beyer et al. 2005).

Chapter 6:

Comparative genomics of Wood-Ljungdahl pathways in Dehalococcoides mccartyi and other fully sequenced bacteria and archaea

A version of the following chapter has been published as part of:

Zhuang, Wei-Qin, Shan Yi, Markus Bill, Vanessa L. Brisson, Xueyang Feng, Yujie Men, Mark E. Conrad, Yinjie J. Tang and Lisa Alvarez-Cohen (2014). "Incomplete Wood–Ljungdahl pathway facilitates one-carbon metabolism in organohalide-respiring Dehalococcoides mccartyi." Proceedings of the National Academy of Sciences 111(17): 6419-6424.

6.1 Introduction

Detailed studies of Dehalococcoides mccartyi (Dhc) in isolation have revealed a variety of capabilities and limitations of Dhc’s metabolism. For example, Dhc’s dependence on cobalamin was discussed in Chapter 4. Dhc strain 195 has been shown to be capable of nitrogen fixation, although growth and dechlorination are more robust when fixed nitrogen is provided (Lee, He et al. 2009). Similarly, although Dhc can produce all essential amino acids, provision of exogenous amino acids has been shown to enhance growth and dechlorination activity (Zhuang, Yi et al. 2011).

Recently, examination of the Dhc genome and subsequent experimental results revealed an incomplete Wood-Ljungdahl pathway not previously reported for other microorganisms (Zhuang, Yi et al. 2014). The Wood-Ljungdahl pathway is used by many microorganisms in various forms for energy metabolism and carbon fixation (Fuchs 1994, Zhuang, Yi et al. 2014). All sequenced Dhc strains have a version of this pathway that is missing key genes (Kube, Beck et al. 2005, Seshadri, Adrian et al. 2005, McMurdie, Behrens et al. 2009). Specifically, the gene for methylene-tetrahydrofolate reductase (MTHFR), which is used for the production of methyl-tetrahydrofolate, a precursor for methionine synthesis (Rüdiger and Jaenicke 1973), is missing. Zhuang et al. showed that Dhc instead produces methyl-tetrahydrofolate by cleaving acetyl-CoA using acetyl-CoA synthase (ACS), using the Wood-Ljungdahl pathway in the reverse direction (Zhuang, Yi et al. 2014). In the process, Dhc produces carbon monoxide, which accumulates (since Dhc also lacks a gene for carbon monoxide dehydrogenase) and inhibits growth unless other organisms are present that can remove carbon monoxide.

In this study, a bioinformatic analysis was performed to determine whether the pattern of genes corresponding to this incomplete version of the pathway (absence of MTHFR and presence of ACS) is present in other known microorganisms.

6.2 Methods

A comparative genomic analysis was performed to evaluate the prevalence of MTHFR genes in sequenced microbial genomes and to identify organisms lacking this gene. The search was performed using all bacterial and archaeal genomes in the National Center for Biotechnology Information (NCBI) genomes database, downloaded in February of 2013. Initially, all genome annotations were searched for identified MTHFR genes. Based on the genes identified in this search, a database of all annotated bacterial and archaeal MTHFR protein sequences was created. To find previously unannotated MTHFR genes, all genomes that lacked annotated MTHFRs were compared with the new MTHFR protein sequence database using blastx. An expect value cutoff of 10^-15 was used to positively identify previously unannotated MTHFR genes. The set of genomes without blast hits to MTHFR genes was manually queried in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Microbesonline databases for genes encoding MTHFR functions, including MTHFR (ferredoxin) (EC 1.5.7.1), MTHFR [NAD(P)H] (EC 1.5.1.20), and a bifunctional homocysteine S-methyltransferase (EC 2.1.1.10).

The genomes lacking MTHFR genes were then searched for acetyl-CoA synthase (ACS) (EC 2.3.1.169) genes using the same process to assess the distribution of incomplete Wood–Ljungdahl pathways in other prokaryotes (Pierce, Xie et al. 2008). Finally, all D. mccartyi strains were searched for homologs of betaine-homocysteine methyltransferase (EC 2.1.1.5) using the bacterial protein sequences in BRENDA (in August of 2013) and elsewhere (Rodionov, Vitreschak et al. 2004).
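The expect value filtering step described above can be illustrated with a minimal Python sketch. This is not the script used in this work. It assumes that the blastx results are available as 12-column tab-separated text (expect value in the eleventh column), that query identifiers follow a hypothetical '<genome_id>|<contig_id>' naming scheme, and that the cutoff is inclusive. The function name and the example records are invented for illustration only.

```python
from typing import Iterable, Set

EVALUE_CUTOFF = 1e-15  # expect value cutoff stated in the Methods


def genomes_with_hits(lines: Iterable[str], cutoff: float = EVALUE_CUTOFF) -> Set[str]:
    """Return genome IDs with at least one hit whose e-value is <= cutoff.

    Assumes 12-column tab-separated BLAST output and query IDs written as
    '<genome_id>|<contig_id>' (a hypothetical naming scheme).
    """
    hits = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        genome_id = fields[0].split("|", 1)[0]
        evalue = float(fields[10])  # eleventh column holds the expect value
        if evalue <= cutoff:
            hits.add(genome_id)
    return hits


# Invented example records: one strong hit and one weak hit.
example = [
    "genomeA|c1\tMTHFR_ref1\t62.1\t280\t100\t3\t10\t849\t5\t284\t3e-40\t150",
    "genomeB|c7\tMTHFR_ref2\t31.0\t90\t60\t2\t1\t270\t12\t101\t2e-06\t48.1",
]
found = genomes_with_hits(example)
print(sorted(found))  # ['genomeA']
```

Genomes absent from the returned set would then be the candidates for the manual KEGG and Microbesonline queries.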

6.3 Results and discussion

Because the replacement of the missing MTHFR function by acetyl-CoA cleavage had not been previously reported, a bioinformatic analysis was performed on the sequenced bacterial and archaeal genomes to determine whether the pattern of genes for this characteristic is present in other microorganisms besides Dhc. Figure 6.1 summarizes the results of this analysis. Of 2,277 bacterial and archaeal genomes in the NCBI genomes database (as of February of 2013), 1,548 were found to have annotated MTHFR genes. A blastx search comparing the remaining 729 genomes to the annotated MTHFR protein sequences identified an additional 303 genomes containing MTHFR homologous genes, and another seven genomes with MTHFR genes were identified by manual curation. MTHFR genes were not identified in 419 genomes (Appendix 7). Many of these genomes belonged to parasitic or symbiotic organisms, whose close association with a host may explain the absence of this functionality. Further analysis of the 419 genomes without MTHFR genes focused on the presence of the ACS gene. Within this group, homologs of this gene were found only in sequenced Dhc strains.

Figure 6.1. Identification of targeted Wood-Ljungdahl pathway genes in fully sequenced bacterial and archaeal genomes.
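The genome counts reported above can be checked with a short calculation. The following Python sketch simply reproduces the arithmetic from the reported totals. The variable names are illustrative, and the final percentage is derived here rather than taken from the text.

```python
# Counts reported in the Results (NCBI genomes database, February 2013).
total_genomes = 2277
annotated_mthfr = 1548   # genomes with annotated MTHFR genes
blastx_homologs = 303    # additional genomes identified by blastx
manual_curation = 7      # additional genomes identified by manual curation

# Genomes carried forward to the blastx search.
unannotated = total_genomes - annotated_mthfr

# Genomes in which no MTHFR gene was identified by any of the three steps.
without_mthfr = unannotated - blastx_homologs - manual_curation

print(unannotated, without_mthfr)  # 729 419
print(f"{without_mthfr / total_genomes:.1%} of genomes lack an identified MTHFR gene")
```

The intermediate and final values agree with the 729 and 419 genomes stated in the text.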

Others have previously suggested that some soil and marine bacteria use an alternative methionine biosynthesis pathway, using betaine instead of CH3-THF as the methyl donor to homocysteine via the activity of betaine-homocysteine methyltransferase (Rodionov, Vitreschak et al. 2004, Barra, Fontenelle et al. 2006, Sowell, Norbeck et al. 2008, Hug, Beiko et al. 2012). Therefore, all Dhc genomes were also searched for homologs of this gene to determine whether this alternative pathway might account for the absence of MTHFR. No homologs of bacterial betaine-homocysteine methyltransferase were found in any of the Dhc genomes, indicating the absence of this alternative pathway in Dhc.

Although variations in C1 metabolism, such as the replacement of tetrahydrofolate by polyglutamate or methanopterins and NAD(P)H instead of ferredoxin as the cofactor for MTHFR (Schauder, Preuß et al. 1988, Thauer, Kaster et al. 2008, Fuchs 2011), have previously been reported for bacteria and archaea, the complete replacement of the MTHFR function with acetyl-CoA cleavage had not been reported prior to its identification in Dhc (Zhuang, Yi et al. 2014). The above comparative genomics analysis suggests that this strategy for generating CH3-THF is not found in other sequenced bacteria and archaea, highlighting the apparent novelty of this pathway. However, it is still unclear whether this strategy has wider distribution in the environment, given the limited numbers of sequenced organisms and the inherent challenges associated with growing carbon monoxide generating organisms in isolation.

Chapter 7:

Conclusions and Suggestions for Future Work

The research presented in this dissertation investigated two different microbial processes: bioleaching of rare earth elements (REEs) from monazite sand and microbial reductive dehalogenation of chlorinated ethenes. These studies utilized metabolomic, metagenomic, and genomic approaches to supplement and support microbiological studies of these processes.

7.1 Bioleaching of rare earth elements from monazite

The work in Chapter 2 demonstrated that some microorganisms are capable of bioleaching rare earth elements (REEs) from monazite sand. A variety of both bacterial and fungal microorganisms were tested for their monazite bioleaching capabilities, including two known phosphate solubilizing microorganisms (PSMs) (Aspergillus niger ATCC 1015 and Burkholderia ferrariae FeG101) as well as nine microorganisms isolated in this study. The most effective bioleaching microorganisms were all fungi and included Aspergillus niger ATCC 1015 and two strains isolated in this study: Aspergillus terreus strain ML3-1 and Paecilomyces spp. strain WE3-F. Bioleaching of monazite had not been previously reported, and this result suggests a possible less environmentally damaging alternative to conventional REE extraction methods.

Further investigations in Chapters 2 and 3 sought to gain an understanding of the mechanisms of monazite solubilization. The analysis of organic acids in Chapter 2 indicated that although the reduction in pH did result in some solubilization, most of the organic acids tested did not achieve significant additional solubilization through complex formation. Citric acid provided some additional solubilizing power, but this effect was small and did not account for the observed bioleaching effectiveness. In contrast, the spent medium experiments showed that other unidentified compounds released by the microorganisms did contribute significantly to bioleaching. The goal of identifying these compounds motivated the exometabolomic analysis described in Chapter 3. In addition to confirming that citric acid contributes somewhat to REE solubilization, the metabolomics analysis also identified citramalic acid as a potential contributor. However, the contributions of citric and citramalic acid were shown to be relatively small. The results of the gel permeation experiments presented in Chapter 3 indicated that large, highly stable complexes, like those of EDTA, were not present in the bioleaching supernatant, suggesting that solubilization is instead potentially driven by the combination of many weaker complexing compounds with interactions more similar to those of citric acid. Further investigation is necessary to identify additional compounds contributing to bioleaching.

Even under the best growth conditions identified in Chapter 2, the maximum recovery of REEs from monazite was still only 5%. Significant process improvements and growth condition optimization will be required to increase REE recovery to make bioleaching an economically viable alternative to conventional processes.

One approach to improving performance would be to do a more extensive search for effective bioleaching microorganisms. The enrichments and isolations described in Chapter 2 were derived from only two environmental source materials, and the most effective bioleaching isolates from these enrichments outperformed known PSMs. Now that the possibility of monazite bioleaching has been established, the enrichment and isolation of organisms from more sources, especially from locations where monazite occurs naturally, could result in the isolation of more effective bioleaching microorganisms. Organisms from sites where monazite occurs may already be adapted to using it as a phosphate source and can also be expected to have improved tolerance for radioactivity from Th.

Further optimization of growth conditions is also necessary for development of a viable process. Characterization of bioleaching performance with different growth media compositions in Chapter 2 resulted in improved REE solubilization. The comparison of growth with and without soluble phosphate presented in Chapter 3 indicated that although low phosphate availability resulted in a lag in growth, phosphate was not ultimately the growth limiting factor. Further investigation suggested that nitrogen may have been the limiting factor for growth. Some previous work has suggested that nitrogen limitation may be desirable for organic acid production and phosphate mineral solubilization by fungi (Cunningham and Kuiack 1992, Papagianni 2007, Scervino, Papinutti et al. 2011). Further investigation of the effects of nitrogen availability on the bioleaching process could be a useful direction for process optimization.

In addition to identifying more effective microorganisms and optimizing their growth conditions, other process improvements could also increase leaching efficiency. For instance, grinding the monazite to a finer grain size may facilitate more effective leaching. Preliminary abiotic leaching experiments using 10 mM citric acid to leach monazite ground to different grain sizes (same abiotic leaching protocol as in Chapter 2) demonstrated improved leaching with more finely ground sand (Figure 7.1). Increasing the leaching time may also be effective. Over six days of bioleaching, REE concentrations did not appear to have leveled off (Figure 2.3), and a longer leaching time could increase REE yield. Alternatively, the same monazite could be leached several times with fresh medium and organisms to extract more REEs, or a continuous flow process could be used in which the monazite is retained via settling while the leachate is continuously recovered. Removal of phosphate from the system, possibly by the use of phosphate accumulating microorganisms, could also help drive the leaching process and prevent re-precipitation of REEs.

Figure 7.1. Effect of monazite sand grain size on abiotic leaching with 10 mM citric acid.

Further identification of the unknown compounds that contribute to REE solubilization would also provide a useful basis for guiding process optimization for the production of desirable metabolites. Of the metabolites identified as potentially associated with bioleaching in Chapter 3, several were identified by BinBase ID but not by chemical name. Further investigation into the mass spectra associated with these metabolites could be done to identify characteristics of these molecules. However, such an analysis will be complicated by the effects of the silylation derivatization performed to prepare the samples for gas chromatography.

In addition to REE solubilization, the fate of Th from monazite is also a critical consideration for development of an alternative monazite bioleaching process. The analysis in Chapter 2 yielded the promising result that the microorganisms preferentially released REEs over Th from monazite. Further results from Chapter 3 found that while citric and citramalic acid both contributed somewhat to REE solubilization, citramalic acid solubilized less Th. This is consistent with the previously published different affinities of various ligands for REEs and Th (Martell and Smith 1974, Yong and Macaskie 1997). Future investigations of bioleaching compounds must continue to examine Th solubilization in order to guide optimization for increased solubilization of REEs while minimizing Th solubilization.

7.2 Microbial reductive dehalogenation of chlorinated ethenes

The metagenomic analysis described in Chapter 4 provided information about the structure of the ANAS microbial community and about the strains of Dehalococcoides mccartyi (Dhc) operating within that community. Metagenome contigs were grouped into ten classes based on tetranucleotide frequency. Based on the presence of phylogenetic marker genes, eight of these classes could be given taxonomic identification: Clostridiaceae, Dhc, Desulfovibrio, Methanobacterium, Methanospirillum, as well as a Spirochaete, a Synergistete, and an unknown Deltaproteobacterium. Clostridiaceae and Dhc had much higher read depths than other contig classes, and thus are likely the most abundant taxa in the community. Reductive dehalogenase genes were only found on contigs associated with Dhc, indicating that Dhc dominates the dechlorination activity of the ANAS culture.
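As an illustration of the sequence signature underlying this grouping, the following Python sketch computes a tetranucleotide frequency vector for a single contig. It is a simplified example and not the procedure used in this work: it counts overlapping 4-mers on one strand only, ignores windows containing ambiguous bases, does not merge reverse complements, and does not perform the clustering step itself. The function name is hypothetical.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq: str) -> dict:
    """Return the relative frequency of each of the 256 tetranucleotides.

    Overlapping windows are counted on the given strand only; windows that
    contain characters other than A, C, G, or T are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


# Toy contig: the windows are ACGT, CGTA, GTAC, TACG, ACGT.
freqs = tetranucleotide_frequencies("ACGTACGT")
print(len(freqs), freqs["ACGT"])  # 256 0.4
```

Vectors of this kind, computed for each contig, are the sort of input that a clustering or self-organizing map approach would operate on.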

Some of the most interesting findings of the metagenomic analysis involved genes related to the biosynthesis of cobalamin, an important cofactor for reductive dehalogenase enzymes. Cobalamin biosynthesis genes were widespread among the different contig classes, including genes for a nearly complete biosynthesis pathway in Dhc, something that had not been previously reported. The presence of these genes was confirmed in Dhc strain ANAS2. However, preliminary experiments were not able to demonstrate the ability of this strain to grow without exogenous cobalamin.

Further study is necessary to investigate the functionality of the cobalamin biosynthesis genes identified in the metagenomic analysis. In order to understand cobalamin production within the ANAS community, a metatranscriptomic analysis focused on these genes should be performed. This analysis should investigate the transcription of all identified cobalamin biosynthesis genes in the metagenomic data over the course of a TCE degradation cycle. This should help to identify which organisms are important for cobalamin production in this community and when they are producing it. Once an initial analysis is performed, a more targeted investigation of selected genes could be performed using RT-qPCR (reverse transcription quantitative polymerase chain reaction) in order to achieve a more quantitative analysis of critical genes for this process.

Additional studies with the Dhc ANAS2 isolate should also be performed to further investigate the functionality of the cobalamin biosynthesis genes identified in this isolate. One possible approach would be defined co-culture experiments with other organisms that cannot synthesize cobalamin but are able to support Dhc growth in other ways. This could produce more optimal growth conditions that would allow Dhc to invest the energy required for cobalamin production. Alternatively, instead of encoding a fully functional cobalamin biosynthesis pathway, these genes may instead represent an extension of Dhc’s previously reported corrinoid scavenging capabilities (Yi, Seth et al. 2012). This possibility could be tested by investigating the ability of this strain to grow and dechlorinate with degraded cobalamin.

In Chapters 5 and 6, additional bioinformatics analyses to support other investigations of Dhc were explored. The comparative analysis, presented in Chapter 6, of the Wood–Ljungdahl pathway genes in Dhc and in genome sequences from other bacteria and archaea helped to support the investigation of this version of the pathway and its novelty among known microorganisms (Zhuang, Yi et al. 2014). In Chapter 5, the use of metagenomic sequencing data to evaluate microarray specificity provided a new assessment of how a microarray performs when applied to a complex microbial community. This analysis indicated that this particular microarray could detect sequences with 90 to 95% sequence identity to the target sequences, but also showed some variation in detection and non-detection of genes having the same level of sequence identity. Re-evaluation of the microarrays with different criteria for gene “Presence”/“Absence” calls followed by a repeat of the analysis in Chapter 5 could shed more light on how selection of these criteria affects the specificity of microarray analyses in the context of complex microbial communities.

References

Adeleke, R., E. Cloete and D. Khasa (2010). "Isolation and identification of iron ore-solubilizing fungus." South African Journal of Science 106(9-10): 43-48.

Adrian, L., J. Rahnenfuhrer, J. Gobom and T. Holscher (2007). "Identification of a chlorobenzene reductive dehalogenase in Dehalococcoides sp strain CBDB1." Applied and Environmental Microbiology 73(23): 7717-7724.

Adrian, L., U. Szewzyk, J. Wecke and H. Gorisch (2000). "Bacterial dehalorespiration with chlorinated benzenes." Nature 408(6812): 580-583.

Ahuja, A., S. B. Ghosh and S. F. D’Souza (2007). "Isolation of a starch utilizing, phosphate solubilizing fungus on buffered medium and its characterization." Bioresource Technology 98(17): 3408-3411.

Akiyama, M. and S. Kawasaki (2012). "Novel grout material comprised of calcium phosphate compounds: In vitro evaluation of precipitation and strength reinforcement." Engineering Geology 125(0): 119-128.

Alonso, E., A. M. Sherman, T. J. Wallington, M. P. Everson, F. R. Field, R. Roth and R. E. Kirchain (2012). "Evaluating Rare Earth Element Availability: A Case with Revolutionary Demand from Clean Technologies." Environmental Science & Technology 46(6): 3406-3414.

Altelaar, A. F. M., J. Munoz and A. J. R. Heck (2013). "Next-generation proteomics: towards an integrative view of proteome dynamics." Nat Rev Genet 14(1): 35-48.

Altomare, C., W. A. Norvell, T. Björkman and G. E. Harman (1999). "Solubilization of Phosphates and Micronutrients by the Plant-Growth-Promoting and Biocontrol Fungus Trichoderma harzianum Rifai 1295-22." Applied and Environmental Microbiology 65(7): 2926-2933.

Amend, A. S., K. A. Seifert and T. D. Bruns (2010). "Quantifying microbial communities with 454 pyrosequencing: does read abundance count?" Molecular Ecology 19(24): 5555-5565.

Anneken, D. J., S. Both, R. Christoph, G. Fieg, U. Steinberner and A. Westfechtel (2000). Fatty Acids. Ullmann's Encyclopedia of Industrial Chemistry, Wiley-VCH Verlag GmbH & Co. KGaA.

Arcand, M. M. and K. D. Schneider (2006). "Plant- and microbial-based mechanisms to improve the agronomic effectiveness of phosphate rock: a review." Anais da Academia Brasileira de Ciências 78: 791-807.

Arnold, W. A. and A. L. Roberts (2000). "Pathways and Kinetics of Chlorinated Ethylene and Chlorinated Acetylene Reaction with Fe(0) Particles." Environmental Science & Technology 34(9): 1794-1805.

Asea, P. E. A., R. M. N. Kucey and J. W. B. Stewart (1988). "Inorganic phosphate solubilization by two Penicillium species in solution culture and soil." Soil Biology and Biochemistry 20(4): 459-464.

Bammler, T., R. P. Beyer, S. Bhattacharya, G. A. Boorman, A. Boyles, B. U. Bradford, R. E. Bumgarner, P. R. Bushel, K. Chaturvedi, D. Choi, M. L. Cunningham, S. Dengs, H. K. Dressman, R. D. Fannin, F. M. Farun, J. H. Freedman, R. C. Fry, A. Harper, M. C. Humble, P. Hurban, T. J. Kavanagh, W. K. Kaufmann, K. F. Kerr, L. Jing, J. A. Lapidus, M. R. Lasarev, J. Li, Y. J. Li, E. K. Lobenhofer, X. Lu, R. L. Malek, S. Milton, S. R. Nagalla, J. P. O'Malley, V. S. Palmer, P. Pattee, R. S. Paules, C. M. Perou, K. Phillips, L. X. Qin, Y. Qiu, S. D. Quigley, M. Rodland, I. Rusyn, L. D. Samson, D. A. Schwartz, Y. Shi, J. L. Shin, S. O. Sieber, S. Slifer, M. C. Speer, P. S. Spencer, D. I. Sproles, J. A. Swenberg, W. A. Suk, R. C. Sullivan, R. Tian, R. W. Tennant, S. A. Todd, C. J. Tucker, B. Van Houten, B. K. Weis, S. Xuan, H. Zarbl and C. Toxicogenomics Res (2005). "Standardizing global gene expression analysis between laboratories and across platforms." Nature Methods 2(5): 351-356.

Barra, L., C. Fontenelle, G. Ermel, A. Trautwetter, G. C. Walker and C. Blanco (2006). "Interrelations between Glycine Betaine Catabolism and Methionine Biosynthesis in Sinorhizobium meliloti Strain 102F34." Journal of Bacteriology 188(20): 7195-7204.

Benjamini, Y. and Y. Hochberg (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing." Journal of the Royal Statistical Society Series B-Methodological 57(1): 289-300.

Bolan, N. S., R. Naidu, S. Mahimairaja and S. Baskaran (1994). "Influence of low-molecular-weight organic acids on the solubilization of phosphates." Biology and Fertility of Soils 18(4): 311-319.

Bradley, P. M. (2003). "History and Ecology of Chloroethene Biodegradation: A Review." Bioremediation Journal 7(2): 81-109.

Braz, R. R. and E. Nahas (2012). "Synergistic action of both Aspergillus niger and Burkholderia cepacea in co-culture increases phosphate solubilization in growth medium." FEMS Microbiology Letters 332(1): 84-90.

Brisson, V. L., P. K. H. Lee and L. Alvarez-Cohen (2012). Comparison of Microarray and Metagenomic Sequencing for Detecting Dehalococcoides Genes. American Society for Microbiology, 112th General Meeting. San Francisco.

Brisson, V. L., K. A. West, P. K. H. Lee, S. G. Tringe, E. L. Brodie and L. Alvarez-Cohen (2012). "Metagenomic analysis of a stable trichloroethene-degrading microbial community." ISME J 6(9): 1702-1714.

Brodie, E. L., T. Z. DeSantis, D. C. Joyner, S. M. Baek, J. T. Larsen, G. L. Andersen, T. C. Hazen, P. M. Richardson, D. J. Herman, T. K. Tokunaga, J. M. M. Wan and M. K. Firestone (2006). "Application of a high-density oligonucleotide microarray approach to study bacterial population dynamics during reduction and reoxidation." Applied and Environmental Microbiology 72(9): 6288-6298.

Campbell, B. J. and S. C. Cary (2001). "Characterization of a novel spirochete associated with the hydrothermal vent polychaete annelid, Alvinella pompejana." Applied and Environmental Microbiology 67(1): 110-117.

Chai, B., Y. Wu, P. Liu, B. Liu and M. Gao (2011). "Isolation and phosphate-solubilizing ability of a fungus, Penicillium sp. from soil of an alum mine." Journal of Basic Microbiology 51(1): 5-14.

Chen, Y. P., P. D. Rekha, A. B. Arun, F. T. Shen, W. A. Lai and C. C. Young (2006). "Phosphate solubilizing bacteria from subtropical soil and their tricalcium phosphate solubilizing abilities." Applied Soil Ecology 34(1): 33-41.

Chow, W. L., D. Cheng, S. Q. Wang and J. Z. He (2010). "Identification and transcriptional analysis of trans-DCE-producing reductive dehalogenases in Dehalococcoides species." ISME J 4(8): 1020-1030.

Christenson, E. A. and J. Schijf (2011). "Stability of YREE complexes with the trihydroxamate siderophore desferrioxamine B at seawater ionic strength." Geochimica et Cosmochimica Acta 75(22): 7047-7062.

Chuang, C.-C., Y.-L. Kuo, C.-C. Chao and W.-L. Chao (2007). "Solubilization of inorganic phosphates and plant growth promotion by Aspergillus niger." Biology and Fertility of Soils 43(5): 575-584.

Cole, K., V. Truong, D. Barone and G. McGall (2004). "Direct labeling of RNA with multiple biotins allows sensitive expression profiling of acute leukemia class predictor genes." Nucleic Acids Research 32(11).

Collins, R. N. (2004). "Separation of low-molecular mass organic acid–metal complexes by high-performance liquid chromatography." Journal of Chromatography A 1059(1–2): 1-12.

Conrad, M. E., E. L. Brodie, C. W. Radtke, M. Bill, M. E. Delwiche, M. H. Lee, D. L. Swift and F. S. Colwell (2010). "Field Evidence for Co-Metabolism of Trichloroethene Stimulated by Addition of Electron Donor to Groundwater." Environmental Science & Technology 44(12): 4697-4704.

Costa, A. C. A., R. A. Medronho and R. P. Pecanha (1992). "Phosphate rock bioleaching." Biotechnology Letters 14(3): 233-238.

Cotton, S. (2006). Introduction to the Lanthanides. Lanthanide and Actinide Chemistry, John Wiley & Sons, Ltd: 1-7.

Cunningham, J. E. and C. Kuiack (1992). "Production of citric and oxalic acids and solubilization of calcium phosphate by Penicillium bilaii." Applied and Environmental Microbiology 58(5): 1451-1458.

Databionics. (2006). "Databionic ESOM Tools User Manual." Retrieved 29 July, 2010, from http://databionic-esom.sourceforge.net/.

Dawson, M. W., I. S. Maddox and J. D. Brooks (1989). "Evidence for nitrogen catabolite repression during citric acid production by Aspergillus niger under phosphate-limited growth conditions." Biotechnology and Bioengineering 33(11): 1500-1504.

DeLong, E. F. (2005). "Microbial community genomics in the ocean." Nature Reviews Microbiology 3(6): 459-469.

Delvasto, P., A. Ballester, J. A. Muñoz, F. González, M. L. Blázquez, J. M. Igual, A. Valverde and C. García-Balboa (2009). "Mobilization of phosphorus from iron ore by the bacterium Burkholderia caribensis FeGL03." Minerals Engineering 22(1): 1-9.

Delvasto, P., A. Valverde, A. Ballester, J. A. Muñoz, F. González, M. L. Blázquez, J. M. Igual and C. García-Balboa (2008). "Diversity and activity of phosphate bioleaching bacteria from a high-phosphorus iron ore." Hydrometallurgy 92(3–4): 124-129.

DeSantis, T. Z., E. L. Brodie, J. P. Moberg, I. X. Zubieta, Y. M. Piceno and G. L. Andersen (2007). "High-density universal 16S rRNA microarray analysis reveals broader diversity than typical clone library when sampling the environment." Microbial Ecology 53(3): 371-383.

Dick, G. J., A. F. Andersson, B. J. Baker, S. L. Simmons, A. P. Yelton and J. F. Banfield (2009). "Community-wide analysis of microbial genome sequence signatures." Genome Biology 10(8).

Ding, C. and J. He (2012). "Molecular techniques in the biotechnological fight against halogenated compounds in anoxic environments." Microbial Biotechnology 5(3): 347-367.

Doherty, R. (2014). History of TCE. Trichloroethylene: Toxicity and Health Risks. K. M. Gilbert and S. J. Blossom, Springer London: 1-14.

Dugat-Bony, E., E. Peyretaillade, N. Parisot, C. Biderre-Petit, F. Jaziri, D. Hill, S. Rimour and P. Peyret (2012). "Detecting unknown sequences with DNA microarrays: explorative probe design strategies." Environmental Microbiology 14(2): 356-371.

Duhamel, M. and E. A. Edwards (2006). "Microbial composition of chlorinated ethene-degrading cultures dominated by Dehalococcoides." FEMS Microbiology Ecology 58(3): 538-549.

Firsching, F. H. and S. N. Brune (1991). "Solubility products of the trivalent rare-earth phosphates." Journal of Chemical & Engineering Data 36(1): 93-95.

Fuchs, G. (1994). Variations of the Acetyl-CoA Pathway in Diversely Related Microorganisms That Are Not Acetogens. Acetogenesis. H. Drake, Springer US: 507-520.

Fuchs, G. (2011). "Alternative Pathways of Carbon Dioxide Fixation: Insights into the Early Evolution of Life?" Annual Review of Microbiology 65(1): 631-658.

Gadd, G. M. (1999). Fungal Production of Citric and Oxalic Acid: Importance in Metal Speciation, Physiology and Biogeochemical Processes. Advances in Microbial Physiology. R. K. Poole, Academic Press. Volume 41: 47-92.

Gill, S. R., M. Pop, R. T. DeBoy, P. B. Eckburg, P. J. Turnbaugh, B. S. Samuel, J. I. Gordon, D. A. Relman, C. M. Fraser-Liggett and K. E. Nelson (2006). "Metagenomic analysis of the human distal gut microbiome." Science 312(5778): 1355-1359.

Gillham, R. W. and S. F. O'Hannesin (1994). "Enhanced Degradation of Halogenated Aliphatics by Zero-Valent Iron." Ground Water 32(6): 958-967.

Goldberg, S. M. D., J. Johnson, D. Busam, T. Feldblyum, S. Ferriera, R. Friedman, A. Halpern, H. Khouri, S. A. Kravitz, F. M. Lauro, K. Li, Y.-H. Rogers, R. Strausberg, G. Sutton, L. Tallon, T. Thomas, E. Venter, M. Frazier and J. C. Venter (2006). "A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes." Proceedings of the National Academy of Sciences of the United States of America 103(30): 11240-11245.

Goyne, K. W., S. L. Brantley and J. Chorover (2010). "Rare earth element release from phosphate minerals in the presence of organic acids." Chemical Geology 278(1–2): 1-14.

Gu, A. Z., B. P. Hedlund, J. T. Staley, S. E. Strand and H. D. Stensel (2004). "Analysis and comparison of the microbial community structures of two enrichment cultures capable of reductively dechlorinating TCE and cis-DCE." Environmental Microbiology 6(1): 45-54.

Gupta, C. K. and N. Krishnamurthy (1992). "Extractive metallurgy of rare earths." International Materials Reviews 37(1): 197-248.

Gyaneshwar, P., G. Naresh Kumar, L. J. Parekh and P. S. Poole (2002). "Role of soil microorganisms in improving P nutrition of plants." Plant and Soil 245(1): 83-93.

Haynes, W. M. ed. (2015). Properties of the Elements and Inorganic Compounds. CRC Handbook of Chemistry and Physics 95th Edition [Online]. W. M. Haynes. Boca Raton, FL, CRC Press/Taylor and Francis.

He, J., V. F. Holmes, P. K. Lee and L. Alvarez-Cohen (2007). "Influence of vitamin B12 and cocultures on the growth of Dehalococcoides isolates in defined medium." Applied and Environmental Microbiology 73(9): 2847-2853.

He, J. Z., K. M. Ritalahti, K. L. Yang, S. S. Koenigsberg and F. E. Loffler (2003). "Detoxification of vinyl chloride to ethene coupled to growth of an anaerobic bacterium." Nature 424(6944): 62-65.

He, Z., T. J. Gentry, C. W. Schadt, L. Wu, J. Liebich, S. C. Chong, Z. Huang, W. Wu, B. Gu, P. Jardine, C. Criddle and J. Zhou (2007). "GeoChip: a comprehensive microarray for investigating biogeochemical, ecological and environmental processes." ISME J 1(1): 67-77.

Holliger, C., D. Hahn, H. Harmsen, W. Ludwig, W. Schumacher, B. Tindall, F. Vazquez, N. Weiss and A. J. B. Zehnder (1998). "Dehalobacter restrictus gen. nov. and sp. nov., a strictly anaerobic bacterium that reductively dechlorinates tetra- and trichloroethene in an anaerobic respiration." Archives of Microbiology 169(4): 313-321.

Holmes, V. F., J. Z. He, P. K. H. Lee and L. Alvarez-Cohen (2006). "Discrimination of multiple Dehalococcoides strains in a trichloroethene enrichment by quantification of their reductive dehalogenase genes." Applied and Environmental Microbiology 72(9): 5877-5883.

Hongoh, Y., M. Ohkuma and T. Kudo (2003). "Molecular analysis of bacterial microbiota in the gut of the termite Reticulitermes speratus (Isoptera; Rhinotermitidae)." FEMS Microbiology Ecology 44(2): 231-242.

Howell, K. S., D. Cozzolino, E. J. Bartowsky, G. H. Fleet and P. A. Henschke (2006). "Metabolic profiling as a tool for revealing Saccharomyces interactions during wine fermentation." FEMS Yeast Research 6(1): 91-101.

Hug, L. A., R. G. Beiko, A. R. Rowe, R. E. Richardson and E. A. Edwards (2012). "Comparative metagenomics of three Dehalococcoides-containing enrichment cultures: the role of the non-dechlorinating community." BMC Genomics 13.

Hug, L. A., M. Salehi, P. Nuin, E. R. Tillier and E. A. Edwards (2011). "Design and Verification of a Pangenome Microarray Oligonucleotide Probe Set for Dehalococcoides spp." Applied and Environmental Microbiology 77(15): 5361-5369.

Illmer, P., A. Barbato and F. Schinner (1995). "Solubilization of hardly-soluble AlPO4 with P-solubilizing microorganisms." Soil Biology and Biochemistry 27(3): 265-270.

Illmer, P. and F. Schinner (1992). "Solubilization of inorganic phosphates by microorganisms isolated from forest soils." Soil Biology and Biochemistry 24(4): 389-395.

Illmer, P. and F. Schinner (1995). "Solubilization of inorganic calcium phosphates— Solubilization mechanisms." Soil Biology and Biochemistry 27(3): 257-263.

Johannesson, K. H. and W. B. Lyons (1994). "The rare earth element geochemistry of Mono Lake water and the importance of carbonate complexing." Limnology and Oceanography 39(5): 1141-1154.

Kane, M. D., T. A. Jatkoe, C. R. Stumpf, J. Lu, J. D. Thomas and S. J. Madore (2000). "Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays." Nucleic Acids Research 28(22): 4552-4557.

Kanehisa, M. and S. Goto (2000). "KEGG: Kyoto Encyclopedia of Genes and Genomes." Nucleic Acids Research 28(1): 27-30.

Kell, D. B., M. Brown, H. M. Davey, W. B. Dunn, I. Spasic and S. G. Oliver (2005). "Metabolic footprinting and systems biology: the medium is the message." Nat Rev Micro 3(7): 557-565.

Khorassani, R., U. Hettwer, A. Ratzinger, B. Steingrobe, P. Karlovsky and N. Claassen (2011). "Citramalic acid and salicylic acid in sugar beet root exudates solubilize soil phosphorus." BMC Plant Biology 11: 8.

Kielhorn, J., C. Melber, U. Wahnschaffe, A. Aitio and I. Mangelsdorf (2000). "Vinyl chloride: Still a cause for concern." Environmental Health Perspectives 108(7): 579-588.

Koltai, H. and C. Weingarten-Baror (2008). "Specificity of DNA microarray hybridization: characterization, effectors and approaches for data correction." Nucleic Acids Research 36(7): 2395-2405.

Krajmalnik-Brown, R., T. Holscher, I. N. Thomson, F. M. Saunders, K. M. Ritalahti and F. E. Loffler (2004). "Genetic identification of a putative vinyl chloride reductase in Dehalococcoides sp strain BAV1." Applied and Environmental Microbiology 70(10): 6347-6351.

Kube, M., A. Beck, S. H. Zinder, H. Kuhl, R. Reinhardt and L. Adrian (2005). "Genome sequence of the chlorinated compound respiring bacterium Dehalococcoides species strain CBDB1." Nature Biotechnology 23(10): 1269-1273.

Kunin, V., A. Copeland, A. Lapidus, K. Mavromatis and P. Hugenholtz (2008). "A Bioinformatician's Guide to Metagenomics." Microbiology and Molecular Biology Reviews 72(4): 557-578.

Kyrpides, N. C., V. M. Markowitz, N. N. Ivanova, E. Szeto, K. Palaniappan, K. Chu, D. Dalevi, I. M. A. Chen, Y. Grechkin, I. Dubchak, I. Anderson, A. Lykidis, K. Mavromatis and P. Hugenholtz (2008). "IMG/M: a data management and analysis system for metagenomes." Nucleic Acids Research 36: D534-D538.

Lahoz, R., F. Reyes and R. Beltra (1966). "Some Chemical Changes in the Mycelium of Aspergillus flavus during Autolysis." Journal of General Microbiology 45(1): 41-49.

Leadbetter, J. R., T. M. Schmidt, J. R. Graber and J. A. Breznak (1999). "Acetogenesis from H2 Plus CO2 by Spirochetes from Termite Guts." Science 283(5402): 686-689.

Lee, P. K. H., D. Cheng, P. Hu, K. A. West, G. J. Dick, E. L. Brodie, G. L. Andersen, S. H. Zinder, J. He and L. Alvarez-Cohen (2011). "Comparative genomics of two newly isolated Dehalococcoides strains and an enrichment using a genus microarray." ISME J.

Lee, P. K. H., D. Cheng, K. A. West, L. Alvarez-Cohen and J. He (2013). "Isolation of two new Dehalococcoides mccartyi strains with dissimilar dechlorination functions and their characterization by comparative genomics via microarray analysis." Environmental Microbiology 15(8): 2293-2305.

Lee, P. K. H., J. He, S. H. Zinder and L. Alvarez-Cohen (2009). "Evidence for Nitrogen Fixation by “Dehalococcoides ethenogenes” Strain 195." Applied and Environmental Microbiology 75(23): 7551-7555.

Lee, P. K. H., D. R. Johnson, V. F. Holmes, J. Z. He and L. Alvarez-Cohen (2006). "Reductive dehalogenase gene expression as a biomarker for physiological activity of Dehalococcoides spp." Applied and Environmental Microbiology 72(9): 6161-6168.

Liu, Y., S. A. Majetich, R. D. Tilton, D. S. Sholl and G. V. Lowry (2005). "TCE Dechlorination Rates, Pathways, and Efficiency of Nanoscale Iron Particles with Different Properties." Environmental Science & Technology 39(5): 1338-1345.

Löffler, F., K. Ritalahti and S. Zinder (2013). Dehalococcoides and Reductive Dechlorination of Chlorinated Solvents. Bioaugmentation for Groundwater Remediation. H. F. Stroo, A. Leeson and C. H. Ward, Springer New York: 39-88.

Löffler, F. E., J. R. Cole, K. M. Ritalahti and J. M. Tiedje (2004). Diversity of Dechlorinating Bacteria. Dehalogenation. M. M. Häggblom and I. D. Bossert, Springer US: 53-87.

Luijten, M. L. G. C., J. de Weert, H. Smidt, H. T. S. Boschker, W. M. de Vos, G. Schraa and A. J. M. Stams (2003). "Description of Sulfurospirillum halorespirans sp nov., an anaerobic, tetrachloroethene-respiring bacterium, and transfer of Dehalospirillum multivorans to the genus Sulfurospirillum as Sulfurospirillum multivorans comb. nov." International Journal of Systematic and Evolutionary Microbiology 53: 787-793.

Macbeth, T. W., D. E. Cummings, S. Spring, L. M. Petzke and K. S. Sorenson, Jr. (2004). "Molecular Characterization of a Dechlorinating Community Resulting from In Situ Biostimulation in a Trichloroethene-Contaminated Deep, Fractured Basalt Aquifer and Comparison to a Derivative Laboratory Culture." Appl. Environ. Microbiol. 70(12): 7329-7341.

Madigan, M. T., J. M. Martinko, P. V. Dunlap and D. P. Clark (2008). Brock biology of microorganisms. San Francisco, Calif. ; London, Pearson Benjamin-Cummings.

Magnuson, J. and L. Lasure (2004). Organic Acid Production by Filamentous Fungi. Advances in Fungal Biotechnology for Industry, Agriculture, and Medicine. J. Tkacz and L. Lange, Springer US: 307-340.

Magnuson, J. K., M. F. Romine, D. R. Burris and M. T. Kingsley (2000). "Trichloroethene reductive dehalogenase from Dehalococcoides ethenogenes: Sequence of tceA and substrate range characterization." Applied and Environmental Microbiology 66(12): 5141-5147.

Magnuson, J. K., R. V. Stern, J. M. Gossett, S. H. Zinder and D. R. Burris (1998). "Reductive dechlorination of tetrachloroethene to ethene by two-component enzyme pathway." Applied and Environmental Microbiology 64(4): 1270-1275.

Mansfeldt, C. B., A. R. Rowe, G. L. W. Heavner, S. H. Zinder and R. E. Richardson (2014). "Meta-Analyses of Dehalococcoides mccartyi Strain 195 Transcriptomic Profiles Identify a Respiration Rate-Related Gene Expression Transition Point and Interoperon Recruitment of a Key Oxidoreductase Subunit." Applied and Environmental Microbiology 80(19): 6062-6072.

Markowitz, V. M., I.-M. A. Chen, K. Palaniappan, K. Chu, E. Szeto, Y. Grechkin, A. Ratner, I. Anderson, A. Lykidis, K. Mavromatis, N. N. Ivanova and N. C. Kyrpides (2010). "The integrated microbial genomes system: an expanding comparative analysis resource." Nucleic Acids Research 38(suppl 1): D382-D390.

Markowitz, V. M., N. Ivanova, I. Anderson, A. Lykidis, K. Mavromatis, A. Pati, E. Szeto, K. Palaniappan, I. M. A. Chen, K. Chu, Y. Grechkin and N. C. Kyrpides (2008). Using IMG-M: Comparative Analysis with the IMG/M System: Addendum to Using IMG, Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory: 10-11.

Martell, A. E. and R. M. Smith (1974). Critical Stability Constants. New York, Plenum Press.

Mavromatis, K., N. Ivanova, K. Barry, H. Shapiro, E. Goltsman, A. C. McHardy, I. Rigoutsos, A. Salamov, F. Korzeniewski, M. Land, A. Lapidus, I. Grigoriev, P. Richardson, P. Hugenholtz and N. C. Kyrpides (2007). "Use of simulated data sets to evaluate the fidelity of metagenomic processing methods." Nature Methods 4(6): 495-500.

Mavromatis, K., N. N. Ivanova, I. M. A. Chen, E. Szeto, V. M. Markowitz and N. C. Kyrpides (2009). "The DOE-JGI Standard Operating Procedure for the Annotations of Microbial Genomes." Standards in Genomic Sciences 1(1): 63-67.

Maymo-Gatell, X., Y. T. Chien, J. M. Gossett and S. H. Zinder (1997). "Isolation of a bacterium that reductively dechlorinates tetrachloroethene to ethene." Science 276(5318): 1568-1571.

McCarty, P. L. (1997). "Microbiology - Breathing with chlorinated solvents." Science 276(5318): 1521-1522.

McCutcheon, J. P., B. R. McDonald and N. A. Moran (2009). "Convergent evolution of metabolic roles in bacterial co-symbionts of insects." Proceedings of the National Academy of Sciences of the United States of America 106(36): 15394-15399.

McMurdie, P. J., S. F. Behrens, J. A. Muller, J. Goke, K. M. Ritalahti, R. Wagner, E. Goltsman, A. Lapidus, S. Holmes, F. E. Loffler and A. M. Spormann (2009). "Localized Plasticity in the Streamlined Genomes of Vinyl Chloride Respiring Dehalococcoides." PLoS Genetics 5(11).

Men, Y., H. Feil, N. C. VerBerkmoes, M. B. Shah, D. R. Johnson, P. K. H. Lee, K. A. West, S. H. Zinder, G. L. Andersen and L. Alvarez-Cohen (2012). "Sustainable syntrophic growth of Dehalococcoides ethenogenes strain 195 with Desulfovibrio vulgaris Hildenborough and Methanobacterium congolense: global transcriptomic and proteomic analyses." ISME J 6(2): 410-421.

Men, Y., P. H. Lee, K. Harding and L. Alvarez-Cohen (2013). "Characterization of four TCE-dechlorinating microbial enrichments grown with different cobalamin stress and methanogenic conditions." Applied Microbiology and Biotechnology 97(14): 6439-6450.

Mendes, G. d. O., N. B. Vassilev, V. H. A. Bonduki, I. R. d. Silva, J. I. Ribeiro Junior and M. D. Costa (2013). "Inhibition of Aspergillus niger phosphate solubilization by fluoride released from rock phosphate." Applied and Environmental Microbiology 79(16): 4906-4913.

Mohn, W. W. and J. M. Tiedje (1992). "Microbial Reductive Dehalogenation." Microbiological Reviews 56(3): 482-507.

Morales, A., M. Alvear, E. Valenzuela, R. Rubio and F. Borie (2007). "Effect of inoculation with Penicillium albidum, a phosphate-solubilizing fungus, on the growth of Trifolium pratense cropped in a volcanic soil." Journal of Basic Microbiology 47(3): 275-280.

Moran, M. J., J. S. Zogorski and P. J. Squillace (2007). "Chlorinated solvents in groundwater of the United States." Environmental Science & Technology 41(1): 74-81.

Morgan, J. L., A. E. Darling and J. A. Eisen (2010). "Metagenomic Sequencing of an In Vitro-Simulated Microbial Community." PLoS One 5(4).

Moriwaki, H. and H. Yamamoto (2013). "Interactions of microorganisms with rare earth ions and their utilization for separation and environmental technology." Applied Microbiology and Biotechnology 97(1): 1-8.

Muller, J. A., B. M. Rosner, G. von Abendroth, G. Meshulam-Simon, P. L. McCarty and A. M. Spormann (2004). "Molecular identification of the catabolic vinyl chloride reductase from Dehalococcoides sp strain VS and its environmental distribution." Applied and Environmental Microbiology 70(8): 4880-4888.

Mysyakina, I. S. and E. P. Feofilova (2011). "The role of lipids in the morphogenetic processes of mycelial fungi." Microbiology 80(3): 297-306.

Nautiyal, C. S. (1999). "An efficient microbiological growth medium for screening phosphate solubilizing microorganisms." FEMS Microbiology Letters 170(1): 265-270.

Nautiyal, C. S., S. Bhadauria, P. Kumar, H. Lal, R. Mondal and D. Verma (2000). "Stress induced phosphate solubilization in bacteria isolated from alkaline soils." FEMS Microbiology Letters 182(2): 291-296.

Nitze, H. B. C. (1896). 16th Annual Report of the USGS Part IV. USGS: 667-693.

Oelkers, E. H. and F. Poitrasson (2002). "An experimental study of the dissolution stoichiometry and rates of a natural monazite as a function of temperature from 50 to 230 °C and pH from 1.5 to 10." Chemical Geology 191(1–3): 73-87.

Oh, S., D. R. Yoder-Himes, J. Tiedje, J. Park and K. T. Konstantinidis (2010). "Evaluating the Performance of Oligonucleotide Microarrays for Bacterial Strains with Increasing Genetic Divergence from the Reference Strain." Applied and Environmental Microbiology 76(9): 2980-2988.

Olsen, C. H. (2003). "Review of the use of statistics in Infection and Immunity." Infection and Immunity 71(12): 6689-6692.

Olsen, C. H. (2014). "Statistics in Infection and Immunity Revisited." Infection and Immunity 82(3): 916-920.

Osorio, N. and M. Habte (2009). Strategies for Utilizing Arbuscular Mycorrhizal Fungi and Phosphate-Solubilizing Microorganisms for Enhanced Phosphate Uptake and Growth of Plants in the Soils of the Tropics. Microbial Strategies for Crop Improvement. M. S. Khan, A. Zaidi and J. Musarrat, Springer Berlin Heidelberg: 325-351.

Papagianni, M. (2004). "Fungal morphology and metabolite production in submerged mycelial processes." Biotechnology Advances 22(3): 189-259.

Papagianni, M. (2007). "Advances in citric acid fermentation by Aspergillus niger: Biochemical aspects, membrane transport and modeling." Biotechnology Advances 25(3): 244-263.

Parales, R. E., J. E. Adamus, N. White and H. D. May (1994). "Degradation of 1,4-dioxane by an actinomycete in pure culture." Applied and Environmental Microbiology 60(12): 4527-4530.

Patti, G. J., O. Yanes and G. Siuzdak (2012). "Innovation: Metabolomics: the apogee of the omics trilogy." Nat Rev Mol Cell Biol 13(4): 263-269.

Pierce, E., G. Xie, R. D. Barabote, E. Saunders, C. S. Han, J. C. Detter, P. Richardson, T. S. Brettin, A. Das, L. G. Ljungdahl and S. W. Ragsdale (2008). "The complete genome sequence of Moorella thermoacetica (f. Clostridium thermoaceticum)." Environmental Microbiology 10(10): 2550-2573.

Pikovskaya, R. I. (1948). "Mobilization of phosphorus in soil in connection with the vital activity of some microbial species." Microbiology 17: 362-370.

Puigdomenech, I. (2013). HYDRA: Hydrochemical Equilibrium Constant Database. Sweden, Royal Institute of Technology.

Qu, Y. and B. Lian (2013). "Bioleaching of rare earth and radioactive elements from red mud using Penicillium tricolor RM-10." Bioresource Technology 136(0): 16-23.

Ramachandran, S., P. Fontanille, A. Pandey and C. Larroche (2006). "Gluconic acid: Properties, applications and microbial production." Food Technology and Biotechnology 44(2): 185-195.

Reva, O. N. and B. Tummler (2005). "Differentiation of regions with atypical oligonucleotide composition in bacterial genomes." BMC Bioinformatics 6.

Richardson, R. E., V. K. Bhupathiraju, D. L. Song, T. A. Goulet and L. Alvarez-Cohen (2002). "Phylogenetic characterization of microbial communities that reductively dechlorinate TCE based upon a combination of molecular techniques." Environmental Science & Technology 36(12): 2652-2662.

Richter, C. L., B. Dunn, G. Sherlock and T. Pugh (2013). "Comparative metabolic footprinting of a large number of commercial wine yeast strains in Chardonnay fermentations." FEMS Yeast Research 13(4): 394-410.

Rittmann, B. E. and P. L. McCarty (2001). Environmental Biotechnology: Principles and Applications. Boston, McGraw Hill.

Rodionov, D. A., A. G. Vitreschak, A. A. Mironov and M. S. Gelfand (2004). "Comparative genomics of the methionine metabolism in Gram-positive bacteria: a variety of regulatory systems." Nucleic Acids Research 32(11): 3340-3353.

Rodríguez, H. and R. Fraga (1999). "Phosphate solubilizing bacteria and their role in plant growth promotion." Biotechnology Advances 17(4–5): 319-339.

Rosenblum, S. and M. Fleischer (1995). The Distribution of Rare-earth Elements in Minerals of the Monazite Family, U.S. Government Printing Office.

Rüdiger, H. and L. Jaenicke (1973). "The biosynthesis of methionine." Molecular and Cellular Biochemistry 1(2): 157-168.

Rudnick, R. L. and S. Gao (2003). 3.01 - Composition of the Continental Crust. Treatise on Geochemistry. H. D. H. K. Turekian. Oxford, Pergamon: 1-64.

Sanapareddy, N., T. J. Hamp, L. C. Gonzalez, H. A. Hilger, S. M. Clinton and A. A. Fodor (2009). "Molecular Diversity of a North Carolina Wastewater Treatment Plant as Revealed by Pyrosequencing." Applied and Environmental Microbiology 75(6): 1688-1696.

Sánchez, J. A., C. S. McFadden, S. C. France and H. R. Lasker (2003). "Molecular phylogenetic analyses of shallow-water Caribbean octocorals." Marine Biology 142(5): 975-987.

Scervino, J. M., V. L. Papinutti, M. S. Godoy, M. A. Rodriguez, I. Della Monica, M. Recchi, M. J. Pettinari and A. M. Godeas (2011). "Medium pH, carbon and nitrogen concentrations modulate the phosphate solubilization efficiency of Penicillium purpurogenum through organic acid production." Journal of Applied Microbiology 110(5): 1215-1223.

Schauder, R., A. Preuß, M. Jetten and G. Fuchs (1988). "Oxidative and reductive acetyl CoA/carbon monoxide dehydrogenase pathway in Desulfobacterium autotrophicum." Archives of Microbiology 151(1): 84-89.

Scholz-Muramatsu, H., A. Neumann, M. Messmer, E. Moore and G. Diekert (1995). "Isolation and Characterization of Dehalospirillum Multivorans Gen-Nov, Sp-Nov, a Tetrachloroethene-Utilizing, Strictly Anaerobic Bacterium." Archives of Microbiology 163(1): 48-56.

Seshadri, R., L. Adrian, D. E. Fouts, J. A. Eisen, A. M. Phillippy, B. A. Methe, N. L. Ward, W. C. Nelson, R. T. Deboy, H. M. Khouri, J. F. Kolonay, R. J. Dodson, S. C. Daugherty, L. M. Brinkac, S. A. Sullivan, R. Madupu, K. T. Nelson, K. H. Kang, M. Impraim, K. Tran, J. M. Robinson, H. A. Forberger, C. M. Fraser, S. H. Zinder and J. F. Heidelberg (2005). "Genome sequence of the PCE-dechlorinating bacterium Dehalococcoides ethenogenes." Science 307(5706): 105-108.

Sharma, P. K. and P. L. McCarty (1996). "Isolation and characterization of a facultatively aerobic bacterium that reductively dehalogenates tetrachloroethene to cis-1,2-dichloroethene." Applied and Environmental Microbiology 62(3): 761-765.

Šidák, Z. (1967). "Rectangular Confidence Regions for the Means of Multivariate Normal Distributions." Journal of the American Statistical Association 62(318): 626-633.

Smidt, H. and W. M. de Vos (2004). "Anaerobic microbial dehalogenation." Annual Review of Microbiology 58: 43-73.

Souchie, E. L., R. Azcón, J. M. Barea, O. J. Saggin-Júnior and E. M. R. d. Silva (2006). "Phosphate solubilization and synergism between P-solubilizing and arbuscular mycorrhizal fungi." Pesquisa Agropecuária Brasileira 41: 1405-1411.

Sowell, S. M., A. D. Norbeck, M. S. Lipton, C. D. Nicora, S. J. Callister, R. D. Smith, D. F. Barofsky and S. J. Giovannoni (2008). "Proteomic Analysis of Stationary Phase in the Marine Bacterium “Candidatus Pelagibacter ubique”." Applied and Environmental Microbiology 74(13): 4091-4100.

Sue, T., V. Obolonkin, H. Griffiths and S. G. Villas-Bôas (2011). "An Exometabolomics Approach to Monitoring Microbial Contamination in Microalgal Fermentation Processes by Using Metabolic Footprint Analysis." Applied and Environmental Microbiology 77(21): 7605-7610.

Thauer, R. K., A. K. Kaster, H. Seedorf, W. Buckel and R. Hedderich (2008). "Methanogenic archaea: ecologically relevant differences in energy conservation." Nature Reviews Microbiology 6(8): 579-591.

Tyson, G. W., J. Chapman, P. Hugenholtz, E. E. Allen, R. J. Ram, P. M. Richardson, V. V. Solovyev, E. M. Rubin, D. S. Rokhsar and J. F. Banfield (2004). "Community structure and metabolism through reconstruction of microbial genomes from the environment." Nature 428(6978): 37-43.

Ultsch, A. and F. Moerchen (2005). ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM, University of Marburg, Germany.

US_Dept._of_H&HS (1997). Toxicological Profile for Trichloroethylene. U. S. D. o. H. a. H. Services.

US_Dept._of_H&HS (2005). Report on Carcinogens, 11th edition. U. S. D. o. H. a. H. Services.

US_Dept._of_H&HS (2007). 2007 CERCLA Priority List of Hazardous Substances. U. S. D. o. H. a. H. Services.

US_EPA (2011). Toxicological Review of Trichloroethylene. U. S. E. P. Agency.

USDoE (2011). 2011 Critical Materials Strategy. U. S. D. o. Energy.

Vassilev, N., M. Vassileva and I. Nikolaeva (2006). "Simultaneous P-solubilizing and biocontrol activity of microorganisms: potentials and future trends." Applied Microbiology and Biotechnology 71(2): 137-144.

VerBerkmoes, N. C., V. J. Denef, R. L. Hettich and J. F. Banfield (2009). "Systems Biology: Functional analysis of natural microbial consortia using community proteomics." Nat Rev Micro 7(3): 196-205.

von Wintzingerode, F., U. B. Gobel and E. Stackebrandt (1997). "Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis." FEMS Microbiology Reviews 21(3): 213-229.

Waller, A. S. (2009). Molecular Investigation of Chloroethene Reductive Dehalogenation by the Mixed Microbial Community KB1. Doctor of Philosophy, University of Toronto.

Waller, A. S., L. A. Hug, K. Mo, D. R. Radford, K. L. Maxwell and E. A. Edwards (2012). "Transcriptional Analysis of a Dehalococcoides-Containing Microbial Consortium Reveals Prophage Activation." Applied and Environmental Microbiology 78(4): 1178-1186.

Warnecke, F., P. Luginbuhl, N. Ivanova, M. Ghassemian, T. H. Richardson, J. T. Stege, M. Cayouette, A. C. McHardy, G. Djordjevic, N. Aboushadi, R. Sorek, S. G. Tringe, M. Podar, H. G. Martin, V. Kunin, D. Dalevi, J. Madejska, E. Kirton, D. Platt, E. Szeto, A. Salamov, K. Barry, N. Mikhailova, N. C. Kyrpides, E. G. Matson, E. A. Ottesen, X. N. Zhang, M. Hernandez, C. Murillo, L. G. Acosta, I. Rigoutsos, G. Tamayo, B. D. Green, C. Chang, E. M. Rubin, E. J. Mathur, D. E. Robertson, P. Hugenholtz and J. R. Leadbetter (2007). "Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite." Nature 450(7169): 560-U517.

Warren, M. J., E. Raux, H. L. Schubert and J. C. Escalante-Semerena (2002). "The biosynthesis of adenosylcobalamin (vitamin B-12)." Natural Product Reports 19(4): 390-412.

Watkinson, S., D. Bebber, P. Darrah, M. Fricker, M. Tlalka and L. Boddy (2006). The role of wood decay fungi in the carbon and nitrogen dynamics of the forest floor. Fungi in Biogeochemical Cycles. G. M. Gadd. Cambridge, New York, Cambridge University Press: 151-181.

West, K. A., D. R. Johnson, P. Hu, T. Z. DeSantis, E. L. Brodie, P. K. H. Lee, H. Feil, G. L. Andersen, S. H. Zinder and L. Alvarez-Cohen (2008). "Comparative genomics of "Dehalococcoides ethenogenes" 195 and an enrichment culture containing unsequenced "Dehalococcoides" strains." Applied and Environmental Microbiology 74(11): 3533-3540.

West, K. A., P. K. H. Lee, D. R. Johnson, S. H. Zinder and L. Alvarez-Cohen (2013). "Global gene expression of Dehalococcoides within a robust dynamic TCE-dechlorinating community under conditions of periodic substrate supply." Biotechnology and Bioengineering 110(5): 1333-1341.

Wheelwright, E. J., F. H. Spedding and G. Schwarzenbach (1953). "The Stability of the Rare Earth Complexes with Ethylenediaminetetraacetic Acid." Journal of the American Chemical Society 75(17): 4196-4201.

White, T. J., T. Bruns, S. Lee and J. Taylor (1990). Amplification and Direct Sequencing of Fungal Ribosomal RNA Genes for Phylogenetics. PCR Protocols: A Guide to Methods and Applications. M. A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. White. New York, Academic Press, Inc.: 315-322.

Wood, S. A. (1993). "The aqueous geochemistry of the rare-earth elements: Critical stability constants for complexes with simple carboxylic acids at 25°C and 1 bar and their application to nuclear waste management." Engineering Geology 34(3–4): 229-259.

Woyke, T., G. Xie, A. Copeland, J. M. Gonzalez, C. Han, H. Kiss, J. H. Saw, P. Senin, C. Yang, S. Chatterji, J.-F. Cheng, J. A. Eisen, M. E. Sieracki and R. Stepanauskas (2009). "Assembling the Marine Metagenome, One Cell at a Time." PLoS One 4(4).

Yi, S., E. C. Seth, Y.-J. Men, S. P. Stabler, R. H. Allen, L. Alvarez-Cohen and M. E. Taga (2012). "Versatility in Corrinoid Salvaging and Remodeling Pathways Supports Corrinoid-Dependent Metabolism in Dehalococcoides mccartyi." Applied and Environmental Microbiology 78(21): 7745-7752.

Yong, P. and L. E. Macaskie (1997). "Removal of lanthanum, uranium and thorium from the citrate complexes by immobilized cells of Citrobacter sp. in a flow-through reactor: implications for the decontamination of solutions containing plutonium." Biotechnology Letters 19(3): 251-256.

Zhang, Z., S. Schwartz, L. Wagner and W. Miller (2000). "A greedy algorithm for aligning DNA sequences." Journal of Computational Biology 7(1-2): 203-214.

Zhou, J., S. Kang, C. W. Schadt and C. T. Garten, Jr. (2008). "Spatial scaling of functional gene diversity across various microbial taxa." Proceedings of the National Academy of Sciences of the United States of America 105(22): 7768-7773.

Zhou, J., L. Wu, Y. Deng, X. Zhi, Y.-H. Jiang, Q. Tu, J. Xie, J. D. Van Nostrand, Z. He and Y. Yang (2011). "Reproducibility and quantitation of amplicon sequencing-based detection." ISME Journal 5(8): 1303-1313.

Zhu, X. K. and R. K. O'Nions (1999). "Monazite chemical composition: some implications for ." Contributions to Mineralogy and Petrology 137(4): 351-363.

Zhuang, W.-Q., S. Yi, M. Bill, V. L. Brisson, X. Feng, Y. Men, M. E. Conrad, Y. J. Tang and L. Alvarez-Cohen (2014). "Incomplete Wood–Ljungdahl pathway facilitates one-carbon metabolism in organohalide-respiring Dehalococcoides mccartyi." Proceedings of the National Academy of Sciences 111(17): 6419-6424.

Zhuang, W.-Q., S. Yi, X. Feng, S. H. Zinder, Y. J. Tang and L. Alvarez-Cohen (2011). "Selective Utilization of Exogenous Amino Acids by Dehalococcoides ethenogenes Strain 195 and Its Effects on Growth and Dechlorination Activity." Applied and Environmental Microbiology 77(21): 7797-7803.

Appendices

Appendix 1. Calculation of total Nd solubilized from NdPO4 as a function of pH.

Equilibrium equations, with equilibrium constants from (Puigdomenech 2013):

Phosphate/Phosphoric Acid

[ ] − + H3PO4 −2.149 1 H3PO4 ⇌ H2PO4 + H Ka1 = − + = 10 [H2PO4 ][H ] [H PO−] − 2− + 2 4 −7.207 2 H2PO4 ⇌ HPO4 + H Ka2 = 2− + = 10 [HPO4 ][H ]

[HPO2−] 3 2− 3− + 4 −12.346 HPO4 ⇌ PO4 + H Ka3 = 3− + = 10 [PO4 ][H ]

Neodymium Hydroxides

[H+][NdOH2+] 4 Nd3+ + H O ⇌ NdOH2++H+ β = = 10−8.16 2 1 [Nd3+]

[H+]2[Nd(OH)+] 5 Nd3+ + 2H O ⇌ Nd(OH)++2H+ β = 2 = 10−17.04 2 2 2 [Nd3+]

[H+]3[Nd(OH)0 ] 3+ 0 + 3(aq) −26.41 6 Nd + 3H2O ⇌ Nd(OH)3(aq)+3H β = = 10 3 [Nd3+]

[H+]4[Nd(OH)−] 7 Nd3+ + 4H O ⇌ Nd(OH)−+4H+ β = 4 = 10−37.1 2 4 4 [Nd3+]

3+ 0 + 3+ [Nd ] 18.1 8 Nd(OH) +3H ⇌ Nd + 3H2O Ksp = = 10 3(s) Nd(OH)3 [H+]3

Neodymium Phosphates

0 3+ 3− 3+ 3− −26.2 9 NdPO4(s) ⇌ Nd + PO4 KspNdPO4 = [Nd ][PO4 ] = 10

[NdPO0 ] 3+ 3− 0 4(aq) 11.8 10 Nd + PO4 ⇌ NdPO4(aq) K 0 = = 10 NdPO4(aq) 3+ 3− [Nd ][PO4 ]

110

( )3− 3+ 3− 3− [Nd PO4 2 ] 19.5 11 Nd + 2PO ⇌ Nd(PO ) K 3− = = 10 4 4 2 Nd(PO4)2 3+ 3− 2 [Nd ][PO4 ]

+ 3+ 3− + + [NdHPO4 ] 18.237 12 Nd + PO + H ⇌ NdHPO K + = = 10 4 4 NdHPO4 3+ 3− + [Nd ][PO4 ][H ] [Nd(HPO )−] 3− − 4 2 33.36 13 3+ + ( ) K − = = 10 Nd + 2PO4 + 2H ⇌ Nd HPO4 2 Nd(HPO4)2 3+ 3− 2 + 2 [Nd ][PO4 ] [H ]

2+ 3+ 3− + 2+ [NdH2PO4 ] 22.284 14 Nd + PO + 2H ⇌ NdH PO K 2+ = = 10 4 2 4 NdH2PO4 3+ 3− + 2 [Nd ][PO4 ][H ]

Mass Balance:

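Assuming no precipitation of $\mathrm{Nd(OH)^{0}_{3(s)}}$, the total concentration of neodymium must be equal to the total concentration of phosphate for dissolution of NdPO4. We will check this assumption at the end.

$$[\mathrm{Nd}]_{tot} = [\mathrm{PO_4}]_{tot}$$

First, we substitute in all dissolved forms of neodymium and phosphate seen in equations 1-14 above.

$$\begin{aligned}
&[\mathrm{Nd^{3+}}] + [\mathrm{NdOH^{2+}}] + [\mathrm{Nd(OH)_2^{+}}] + [\mathrm{Nd(OH)^{0}_{3(aq)}}] + [\mathrm{Nd(OH)_4^{-}}] + [\mathrm{NdPO^{0}_{4(aq)}}] \\
&\quad + [\mathrm{Nd(PO_4)_2^{3-}}] + [\mathrm{NdHPO_4^{+}}] + [\mathrm{Nd(HPO_4)_2^{-}}] + [\mathrm{NdH_2PO_4^{2+}}] \\
&= [\mathrm{PO_4^{3-}}] + [\mathrm{HPO_4^{2-}}] + [\mathrm{H_2PO_4^{-}}] + [\mathrm{H_3PO_4}] + [\mathrm{NdPO^{0}_{4(aq)}}] + 2[\mathrm{Nd(PO_4)_2^{3-}}] \\
&\quad + [\mathrm{NdHPO_4^{+}}] + 2[\mathrm{Nd(HPO_4)_2^{-}}] + [\mathrm{NdH_2PO_4^{2+}}]
\end{aligned}$$

Then we eliminate terms that appear on both sides of the equation.

$$\begin{aligned}
&[\mathrm{Nd^{3+}}] + [\mathrm{NdOH^{2+}}] + [\mathrm{Nd(OH)_2^{+}}] + [\mathrm{Nd(OH)^{0}_{3(aq)}}] + [\mathrm{Nd(OH)_4^{-}}] \\
&= [\mathrm{PO_4^{3-}}] + [\mathrm{HPO_4^{2-}}] + [\mathrm{H_2PO_4^{-}}] + [\mathrm{H_3PO_4}] + [\mathrm{Nd(PO_4)_2^{3-}}] + [\mathrm{Nd(HPO_4)_2^{-}}]
\end{aligned}$$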

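Then we substitute in equations 1-7, 11, and 13 above to get everything in terms of $[\mathrm{Nd^{3+}}]$, $[\mathrm{PO_4^{3-}}]$, and $[\mathrm{H^{+}}]$.

$$\begin{aligned}
&[\mathrm{Nd^{3+}}] + \frac{\beta_1[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]} + \frac{\beta_2[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{2}} + \frac{\beta_3[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{3}} + \frac{\beta_4[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{4}} \\
&= [\mathrm{PO_4^{3-}}] + \frac{[\mathrm{H^{+}}][\mathrm{PO_4^{3-}}]}{K_{a3}} + \frac{[\mathrm{H^{+}}]^{2}[\mathrm{PO_4^{3-}}]}{K_{a2} \cdot K_{a3}} + \frac{[\mathrm{H^{+}}]^{3}[\mathrm{PO_4^{3-}}]}{K_{a1} \cdot K_{a2} \cdot K_{a3}} \\
&\quad + K_{\mathrm{Nd(PO_4)_2^{3-}}}[\mathrm{Nd^{3+}}][\mathrm{PO_4^{3-}}]^{2} + K_{\mathrm{Nd(HPO_4)_2^{-}}}[\mathrm{Nd^{3+}}][\mathrm{PO_4^{3-}}]^{2}[\mathrm{H^{+}}]^{2}
\end{aligned}$$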

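Since we are looking at the dissolution of NdPO4, we will assume that this is in equilibrium with $\mathrm{NdPO^{0}_{4(s)}}$, and use equation 9 to get $[\mathrm{PO_4^{3-}}]$ in terms of $[\mathrm{Nd^{3+}}]$ and then eliminate $[\mathrm{PO_4^{3-}}]$ from the equation.

$$\begin{aligned}
&[\mathrm{Nd^{3+}}] + \frac{\beta_1[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]} + \frac{\beta_2[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{2}} + \frac{\beta_3[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{3}} + \frac{\beta_4[\mathrm{Nd^{3+}}]}{[\mathrm{H^{+}}]^{4}} \\
&= \frac{K_{sp,\mathrm{NdPO_4}}}{[\mathrm{Nd^{3+}}]} + \frac{K_{sp,\mathrm{NdPO_4}}[\mathrm{H^{+}}]}{K_{a3}[\mathrm{Nd^{3+}}]} + \frac{K_{sp,\mathrm{NdPO_4}}[\mathrm{H^{+}}]^{2}}{K_{a2} \cdot K_{a3}[\mathrm{Nd^{3+}}]} + \frac{K_{sp,\mathrm{NdPO_4}}[\mathrm{H^{+}}]^{3}}{K_{a1} \cdot K_{a2} \cdot K_{a3}[\mathrm{Nd^{3+}}]} \\
&\quad + \frac{K_{\mathrm{Nd(PO_4)_2^{3-}}}\left(K_{sp,\mathrm{NdPO_4}}\right)^{2}}{[\mathrm{Nd^{3+}}]} + \frac{K_{\mathrm{Nd(HPO_4)_2^{-}}}\left(K_{sp,\mathrm{NdPO_4}}\right)^{2}[\mathrm{H^{+}}]^{2}}{[\mathrm{Nd^{3+}}]}
\end{aligned}$$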

We then rearrange to solve for [Nd3+] as a function of [H+].

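$$[\mathrm{Nd^{3+}}] = \sqrt{\frac{K_{sp,\mathrm{NdPO_4}}\left(1 + \dfrac{[\mathrm{H^{+}}]}{K_{a3}} + \dfrac{[\mathrm{H^{+}}]^{2}}{K_{a2} \cdot K_{a3}} + \dfrac{[\mathrm{H^{+}}]^{3}}{K_{a1} \cdot K_{a2} \cdot K_{a3}} + K_{\mathrm{Nd(PO_4)_2^{3-}}}K_{sp,\mathrm{NdPO_4}} + K_{\mathrm{Nd(HPO_4)_2^{-}}}K_{sp,\mathrm{NdPO_4}}[\mathrm{H^{+}}]^{2}\right)}{1 + \dfrac{\beta_1}{[\mathrm{H^{+}}]} + \dfrac{\beta_2}{[\mathrm{H^{+}}]^{2}} + \dfrac{\beta_3}{[\mathrm{H^{+}}]^{3}} + \dfrac{\beta_4}{[\mathrm{H^{+}}]^{4}}}}$$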
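The intermediate step behind this rearrangement can be written out as a worked equation. Multiplying both sides of the preceding equation by $[\mathrm{Nd^{3+}}]$ and factoring out the common terms leaves an expression in $[\mathrm{Nd^{3+}}]^{2}$ only, so the positive square root is the single physically meaningful solution:

$$[\mathrm{Nd^{3+}}]^{2}\left(1 + \frac{\beta_1}{[\mathrm{H^{+}}]} + \frac{\beta_2}{[\mathrm{H^{+}}]^{2}} + \frac{\beta_3}{[\mathrm{H^{+}}]^{3}} + \frac{\beta_4}{[\mathrm{H^{+}}]^{4}}\right) = K_{sp,\mathrm{NdPO_4}}\left(1 + \frac{[\mathrm{H^{+}}]}{K_{a3}} + \frac{[\mathrm{H^{+}}]^{2}}{K_{a2} \cdot K_{a3}} + \frac{[\mathrm{H^{+}}]^{3}}{K_{a1} \cdot K_{a2} \cdot K_{a3}} + K_{\mathrm{Nd(PO_4)_2^{3-}}}K_{sp,\mathrm{NdPO_4}} + K_{\mathrm{Nd(HPO_4)_2^{-}}}K_{sp,\mathrm{NdPO_4}}[\mathrm{H^{+}}]^{2}\right)$$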

We then use this to calculate [Nd3+] at a range of pH from 0 to 12. From that, we use equations 4-7 and 10-14 to calculate all other dissolved Nd species. We then sum up the concentrations of all dissolved Nd species to calculate [Nd]tot for plotting Figure 1.2.

Now we need to check that our initial assumption that $\mathrm{Nd(OH)^{0}_{3(s)}}$ does not precipitate was valid. To do this, we use equation 8 and check that the following is satisfied at all pH in range:

$$[\mathrm{Nd^{3+}}] < K_{sp,\mathrm{Nd(OH)_3}}\,[\mathrm{H^{+}}]^{3}$$

Doing this, we find that the above holds for pH < 12.55, so the assumption of no precipitation of $\mathrm{Nd(OH)^{0}_{3(s)}}$ is valid for the pH range of 0 to 12 used to plot Figure 1.2. At higher pH (≥ 12.55), the concentration of hydroxide ions is sufficiently high to make precipitation of $\mathrm{Nd(OH)^{0}_{3(s)}}$ a factor.
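The calculation described in this appendix can be sketched in a few lines of Python. This is a minimal illustration rather than the code used to produce Figure 1.2: the function and variable names are hypothetical, the constants are copied from equations 1-14, and Ka1 to Ka3 are assumed to be acid dissociation constants (the form used in the substituted mass balance). The sketch computes free [Nd3+] from the closed-form solution, sums the dissolved Nd species, and applies the equation 8 precipitation check.

```python
import math

# log10 equilibrium constants copied from equations 1-14 of this appendix.
LOG_KA = (-2.149, -7.207, -12.346)          # Ka1, Ka2, Ka3 (assumed dissociation form)
LOG_BETA = (-8.16, -17.04, -26.41, -37.1)   # beta1..beta4, Nd hydroxide complexes
LOG_KSP_NDOH3 = 18.1                        # equation 8
LOG_KSP_NDPO4 = -26.2                       # equation 9
LOG_K_NDPO4_AQ = 11.8                       # equation 10
LOG_K_ND_PO4_2 = 19.5                       # equation 11
LOG_K_NDHPO4 = 18.237                       # equation 12
LOG_K_ND_HPO4_2 = 33.36                     # equation 13
LOG_K_NDH2PO4 = 22.284                      # equation 14


def free_nd(ph):
    """Free [Nd3+] (mol/L) in equilibrium with NdPO4(s) at the given pH."""
    h = 10.0 ** -ph
    ka1, ka2, ka3 = (10.0 ** v for v in LOG_KA)
    ksp = 10.0 ** LOG_KSP_NDPO4
    numerator = ksp * (
        1.0
        + h / ka3
        + h ** 2 / (ka2 * ka3)
        + h ** 3 / (ka1 * ka2 * ka3)
        + 10.0 ** LOG_K_ND_PO4_2 * ksp
        + 10.0 ** LOG_K_ND_HPO4_2 * ksp * h ** 2
    )
    denominator = 1.0 + sum(10.0 ** b / h ** (i + 1) for i, b in enumerate(LOG_BETA))
    return math.sqrt(numerator / denominator)


def totals(ph):
    """Return ([Nd]tot, [PO4]tot) summed over all dissolved species (mol/L)."""
    h = 10.0 ** -ph
    nd = free_nd(ph)
    po4 = 10.0 ** LOG_KSP_NDPO4 / nd        # equation 9
    ka1, ka2, ka3 = (10.0 ** v for v in LOG_KA)
    hydroxides = sum(10.0 ** b * nd / h ** (i + 1) for i, b in enumerate(LOG_BETA))
    ndpo4_aq = 10.0 ** LOG_K_NDPO4_AQ * nd * po4
    nd_po4_2 = 10.0 ** LOG_K_ND_PO4_2 * nd * po4 ** 2
    ndhpo4 = 10.0 ** LOG_K_NDHPO4 * nd * po4 * h
    nd_hpo4_2 = 10.0 ** LOG_K_ND_HPO4_2 * nd * po4 ** 2 * h ** 2
    ndh2po4 = 10.0 ** LOG_K_NDH2PO4 * nd * po4 * h ** 2
    nd_tot = nd + hydroxides + ndpo4_aq + nd_po4_2 + ndhpo4 + nd_hpo4_2 + ndh2po4
    free_phosphate = po4 * (
        1.0 + h / ka3 + h ** 2 / (ka2 * ka3) + h ** 3 / (ka1 * ka2 * ka3)
    )
    po4_tot = (free_phosphate + ndpo4_aq + 2 * nd_po4_2
               + ndhpo4 + 2 * nd_hpo4_2 + ndh2po4)
    return nd_tot, po4_tot


def no_ndoh3_precipitation(ph):
    """True when [Nd3+] < Ksp,Nd(OH)3 * [H+]^3, i.e. Nd(OH)3(s) should not form."""
    return free_nd(ph) < 10.0 ** LOG_KSP_NDOH3 * (10.0 ** -ph) ** 3


if __name__ == "__main__":
    for ph in range(0, 13):
        nd_tot, _ = totals(ph)
        print(f"pH {ph:2d}  log10[Nd]tot = {math.log10(nd_tot):7.2f}  "
              f"no Nd(OH)3(s): {no_ndoh3_precipitation(ph)}")
```

Because `totals` returns both sums, comparing them at any pH is a quick way to confirm that the closed-form [Nd3+] satisfies the mass balance [Nd]tot = [PO4]tot.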

Appendix 2. Metabolomics signal intensities for all metabolites and time points. Data for each time point are in a separate table.

5

43 72 77 6 73

237 164 109 107 213 545 157 114 522 411 137 918 883 346 164 303

2622 1768

flask 6

45 55 47 67 50 60 44

133 175 147 339 211 118 247 218 102 611 473 118 275

9508 1931 1197

flask 5

72 61 80 93 84 98 76 84 95

129 306 195 549 243 297 384 116 803 730 381 240

2258 1898

flask 4

K2HPO4 and Monazite

56 54 82 78 65 42 56 61

155 155 171 484 139 256 677 102 726 276 490 129 304

2141

10491

flask 3

54 75 55 58 72 63 63 96

204 196 152 355 207 292 224 109 677 764 377 125 230

2043

10441

flask 2

81

83 84 83 92 89 62 84

156 296 228 364 203 119 303 237 149 724 8 384 117 286

2145

10397

flask 1

72 66 61 52 50 65 56 47

255 255 138 401 156 260 184 122 808 237 101 214

9721 1991 1167

flask 6

Signal Intensity at 0 Days

78 95 77 77 64

259 259 106 172 531 205 109 368 222 113 137 831 887 372 215 222

8795 1848

flask 5

0

62 85 45 62 64 64 53 66

202 202 21 395 169 382 175 136 756 798 201 152 196

2126 1593

flask 4

64 88 84 67 98 83

123 169 133 389 314 101 126 321 282 156 790 392 338 113 237

9519 1885

Monazite Only

flask 3

09

67 87 73 77 93

205 264 104 171 439 162 110 373 309 153 117 804 4 287 678

3094 2227 1412

flask 2

63 66 55 73 55 77 74

221 221 210 399 179 363 155 138 110 743 958 318 181 191

9548 1944

flask 1

Metabolite Name or BinBase ID

1-deoxyerythritol; 2-deoxyerythritol; 2-hydroxyadipic acid; 2-hydroxyglutaric acid; 2-isopropylmalic acid; 3,4-dihydroxybenzoic acid; 3,6-anhydro-D-glucose; 3,6-anhydro-d-hexose; 3-deoxyhexitol; 3-hydroxy-3-methylglutaric acid; 3-hydroxypropionic acid; 4-hydroxybenzoate; 5-hydroxymethyl-2-furoic acid; aconitic acid; adipic acid; alanine; alpha-ketoglutarate; azelaic acid; benzoic acid; beta-gentiobiose; butane-2,3-diol; capric acid; cellobiose

69 99 89 74

153 316 394 101 228 124 190 359 898 192

9277 3050 2019 1465 8182 2532 1083

11045

flask 6

462180

66

61 53 54 49 53 59

114 320 392 129 1 252 565 170 626

7722 6914 1888 8514 2786 8288 1618

flask 5

820818

45 93 98 67 40

135 292 327 196 147 278 100 899 182

5795 2381 1415 1049 2178 2137

11094 15065

flask 4

509311

K2HPO4 and Monazite

67 69 76 79 67 65

134 342 383 139 172 298 702 269 520

8755 1791 2837 1837

10466 13692 10468

flask 3

580870

57 89 80 57 49

311 341 313 144 110 131 277 751 249 607

7049 2198 2387 1760

10884 12040 10711

flask 2

710386

96 71 47 44 66 62

139 296 398 191 132 331 775 277 709

8999 5862 2146 3766 1896

11281 11205

flask 1

612896

62 67 79 80 44 57

105 262 241 172 118 270 539 655 213 406

5120 7168 1570 1796 1337

10950

flask 6

292411

Signal Intensity at 0 Days

80 80 69 84

113 411 305 231 235 228 331 792 107 172 622

6729 2857 2805 1090 2038 2044

16047

flask 5

216348

52 81 81 67 75 67 86

265 242 193 127 250 744 168 480

8876 3930 1102 1211 1404 1850

14174

flask 4

235711

Monazite Only

63 60 72 64 57

659 254 358 308 305 139 294 783 319 648

7564 7624 1836 1886 1913 8877 1221

flask 3

373541

91 98 74 79 76

146 356 301 189 110 303 249 771 783 204

8096 5714 1543 4572 1985 1168

15015

flask 2

637431

62 97 61 91 88 51

243 246 147 243 280 254 940 723 181 480

8785 5854 1736 1846 2297 1352

flask 1

273055

Metabolite Name or BinBase ID

citramalic acid; citric acid; dehydroabietic acid; dihydroxyacetone; erythritol; erythronic acid; fructose; fumaric acid; galactinol; gluconic acid; glucose; glucose-1-phosphate; glutaric acid; glyceric acid; glycerol; glycerol-3-galactoside; glycerol-alpha-phosphate; glycolic acid; histidine; hydroxylamine; isocitric acid; isothreonic acid; lactic acid

96 73 77 84 93

127 271 783 439 950 633 105 117 328 637 235

1696 2739 7432 1851 1782

26900 61230

flask 6

96 66 86 73 86 77

185 186 460 204 387 959 102 899 291 204 195

1593 4249 7835 1288

19516 48507

flask 5

77 91 62 45

212 729 746 346 699 127 144 664 113 105 525 219

1147 2646 2018 1818

15468 26634 60084

flask 4

K2HPO4 and Monazite

89 56 97 94

148 235 384 134 102 112 102 352 103 129 269

1019 4750 1845 1460 1202

11005 20423 70494

flask 3

86 67 67 84

149 203 423 345 529 135 216 104 116 345 200

1787 4669 2031 1671 1383

11181 21335 63758

flask 2

80 62 83 69 99

196 222 412 136 700 106 222 326 363 245

1913 4577 2258 1254 1309

10084 22461 61267

flask 1

58 75 27 62 43 77

153 222 450 313 202 429 320 105 313 222

1736 4282 2529 9488 2985 1173

18241

flask 6

Signal Intensity at 0 Days

69 92 70

162 243 647 495 540 817 242 523 126 429 108 135 701 244

2940 6065 1560 2053 1523

29968

flask 5

93

57 58 94

174 220 809 459 258 152 533 121 990 375 109 101 613 284

1195 4067 7548 1933 12

24827

flask 4

54 57 33

111 173 584 481 105 537 208 115 187 222

1055 2448 1035 3213 1619 1718 2841 2102

Monazite Only

17649 20437

flask 3

90 98 79 38

187 958 514 417 509 691 120 108 425 101 417 179

1049 4698 9055 2275 1281 1536

28203

flask 2

47 65 95 87 65

170 166 627 454 376 796 336 103 363 674 181

4420 2974 2983 1031 1717

12816 18153

flask 1

Metabolite Name or BinBase ID

lactitol; lactulose; lauric acid; levoglucosan; lyxitol; lyxose; malic acid; maltose; maltotriose; myo-inositol; nicotinic acid; oleic acid; oxalic acid; palmitic acid; p-cresol; pelargonic acid; phosphate; propane-1,3-diol; putrescine; pyruvic acid; ribitol; ribonic acid; ribose

86 81 60 716

106 470 274 494 238 258 137 349 232 396 314 826

1450 1994 1 1208

31678

flask 6

340650

1000586 1346583

79 83 70 56 63 88

796 136 110 229 423 113 395 442 901 343

1416 1488 1315

37378

flask 5

587029 406600

2171983

51 47 72 70

274 163 287 179 253 518 354 445 403 294 712

1526 1671 2200 1109

84351

flask 4

941948 292150

1084218

K2HPO4 and Monazite

91 89 94 64 85 51 73

100 189 436 124 420 310 412

1018 1628 1577 1534 1086

54932

flask 3

696589 478543

2608019

99 55 72 44

880 154 229 127 156 434 167 344 425 587 952 482

1926 1654 1620

45501

flask 2

670738 390282

2491012

97

82 36 57

124 5 112 234 101 516 150 307 469 282 960 570

2333 1697 1787 1531

32821

flask 1

671307 501828

2484236

68 68 66 70

684 107 407 105 210 344 123 185 400 291 427

1806 1551 1333 1381

32217

flask 6

700086 879748

2348020

61

Signal Intensity at 0 Days

23

107 341 269 452 107 521 102 269 290 372 328 677

1234 1351 1816 1808 1651 1319

25370

flask 5

955475 6926

1071979

92 33

109 568 278 604 153 174 398 316 199 423 346 371 468

1992 1876 1559 1553

49227

flask 4

862874 756861

1718288

92 30 90 35

951 180 585 268 209 454 249 777 408 419 420

1466 1815 1294 1315

Monazite Only

64064

flask 3

608167 840680

2053440

84 96 63

242 300 412 112 199 186 143 282 371 550 908 865

2024 2225 1728 1471

41136

flask 2

874550 333850

2958585

89 31 42 67

828 238 149 215 474 175 411 412 262 443

1238 1729 1457 1508 1010

62928

flask 1

651962 915273

2350308

Metabolite Name or BinBase ID

ribose-5-phosphate; s(-)-willardiine; shikimic acid; sorbitol; stearic acid; succinic acid; sucrose; sulfuric acid; tagatose; threitol; trans-4-hydroxy-L-proline; tyrosol; UDP-glucuronic acid; uracil; xylitol; xylonolactone; xylose; xylulose; 39; 47; 62; 91; 99

73 87

383 280 449 401 552 619 378 297 873 728 399 774 410 737

2484 1171 2574 1249 1253 2426

21335

flask 6

60

92

159 150 432 159 495 613 379 345 387 198 306 759 502 674 307 491 236 162 738

10 6998

55508

flask 5

327 297 450 783 458 287 560 365 947 233 539 819 521 683 646 170 761 126

2461 1215 1467 3329

20126

flask 4

K2HPO4 and Monazite

72

377 457 227 521 965 410 367 376 144 352 621 848 427 719 373 117

2421 1046 1003 8131 1106

flask 3

103905

75

306 379 405 199 590 373 382 617 147 369 638 497 626 537 647 120

1343 1787 1022 1116 8032

flask 2

116819

95 79

357 217 410 795 702 189 574 393 305 371 970 643 542 535 545 155 567

1563 1182 8049

flask 1

151098

264

320 195 258 316 864 415 436 139 313 689 916 384 586 532 105 537 114

2 3150 1241 1073 6645

flask 6

184373

Signal Intensity at 0 Days

74

342 204 375 276 557 774 493 721 294 752 778 322 288 734 118

1081 2498 1190 1334 2042 1189

43865

flask 5

67

322 279 192 546 259 718 594 870 196 497 990 519 633 384 190 689 155

2178 1689 1158 1226 3182

661

flask 4

236 345 284 248 945 633 576 898 458 273 969 722 340 615 444 133 525 131

1443 2436 1192 5085

Monazite Only

77446

flask 3

216 301 400 638 403 778 687 617 150 515 652 853 764 690 227 753 117

1264 1653 1013 4068 1098

46114

flask 2

95

310 129 191 208 221 894 417 893 217 407 850 402 621 489 108 585

2192 2022 1073 1287 6443

flask 1

206087

Metabolite Name or BinBase ID

137 168 257 657 809 892 1064 1173 1673 1681 1690 1715 1870 1872 1875 1878 1913 2044 2061 2081 2095 2097 2821

535 157 406 159 553 907 289 302 248 174 968 234 251 649 315

2652 1146 3391 3714 1156 2392

11761 23905

flask 6

8

842 380 139 424 325 817 172 214 166 219 72 221 176 892 303

4018 8381 5543 1234 2667 4836 1624

12036

flask 5

92

510 168 437 104 382 794 266 237 167 206 248 910 611 319

5431 1099 3356 1130 3523 1190

11267 30445

flask 4

K2HPO4 and Monazite

397 299 282 404 246 206 215 919 101 196 313

4767 1030 6376 4719 3174 1402 1365 3218 1170 3007 1268

10241

flask 3

367 155 296 908 236 218 155 127 209 172 343

4676 1097 9286 6303 1375 1212 1043 3055 1345 3065 1752

16234

flask 2

312 157 801 303 189 188 141 103 195 193 916 319

4281 1139 9587 5982 2333 1107 1061 3494 1043 5418

11116

flask 1

57

809 297 166 692 269 864 903 204 184 148 164 324 588 285

4249 8499 5416 1023 3053 5873 1182

11895

flask 6

Signal Intensity at 0 Days

452 155 456 209 771 355 232 170 117 230 613 885 683 315

5968 1609 3775 1485 1058 5014 1546

11130 12676

flask 5

6

86

313 215 713 306 638 704 281 272 130 286 571 540 870 331

6350 1107 502 9076 1132 4024 2925

10538

flask 4

741 124 363 243 207 266 101 111 930 174 328 748 287

4582 1296 9066 4012 1033 1211 2770 2324 1020

Monazite Only

19770

flask 3

292 285 162 388 967 272 197 908 249 462 345

5744 1307 6023 1474 2233 1021 3309 2186 3667 1285

11366 24297

flask 2

953 386 133 295 348 967 922 238 253 125 104 108 329 749 761 236

4380 8513 5639 9292 1130 4251 4339

flask 1

Metabolite Name or BinBase ID

2944 3083 3173 3247 3442 3781 4541 4543 4713 4732 4795 4797 4819 4937 4976 5346 5576 6104 6330 6646 9320 14694 16561

93 67

469 230 277 874 303 333 633 579 891 646 527 335 312 220

1737 1133 1156 1118 2691

16839 22430

flask 6

45 71 84

320 101 239 429 275 537 860 438 709 443 440 337

2704 1744 1393 1264 3896 1475

19823 11575

flask 5

82 64

498 142 209 910 240 235 491 948 420 752 675 372 347 168

3195 1058 9858 2659 1089 1762

14751

flask 4

K2HPO4 and Monazite

80 56

370 442 115 304 276 165 941 302 918 549 262 524 172

3058 1209 1240 1606 4561 1961

20290 14250

flask 3

80 60

407 426 391 230 604 181 823 253 900 700 312 264 163

2954 1608 1023 1999 4269 2325

22736 15451

flask 2

97 92

376 217 480 711 286 230 183 827 363 889 254 249 130

3004 2337 1407 3851 1483 3923

25907 14640

flask 1

66

401 117 386 455 237 453 130 898 795 249 752 574 291 119

2205 1182 1637 3825 1986 3453

22146 18890

flask 6

Signal Intensity at 0 Days

344 105 388 683 222 234 100 498 102 923 826 742 342 315 254

1709 1137 1116 1189 2679 2687

17585 23151

flask 5

89 88 88

445 707 385 870 196 219 678 368 973 716 569 196 209

1688 1030 2597 1148 1982

10927 26112

flask 4

Monazite Only

91 65

398 181 391 237 424 841 310 778 297 296 162

2848 5757 1047 1075 1480 3016 1460 3098

18501 18156

flask 3

2

71

358 162 510 449 652 958 764 102 37 938 601 396 268

1789 2286 1098 1100 4610 1349 3325

18953 24499

flask 2

89 65

352 101 649 538 846 191 282 996 828 327 653 404 350 208

3402 1600 3487 1557 3314

19640 14813

flask 1

Metabolite Name or BinBase ID

16817 16850 16855 17068 17069 17140 17425 17471 17651 17830 18082 18173 18226 18241 20282 21704 22967 25801 30962 31359 41682 41689 41808

91

254 912 256 161 247 371 369 448 336 135 432 264 212 120 938 123 599 844

1451 4623 3224

12438

flask 6

4

96 92 69

139 473 164 66 179 123 296 296 106 183 197 270 976 540 455 125

3185 3489 1983

19462

flask 5

97

223 627 232 540 213 400 512 350 486 123 192 372 287 134 123 428

1373 1255 4358 2679 1017

13355

flask 4

K2HPO4 and Monazite

68 72

178 779 414 283 203 186 454 358 253 186 390 205 100 611 485 141

1057 3875 3971 2299

17626

flask 3

84 93 87 86

143 889 388 990 114 219 410 410 224 275 373 228 638 131

1242 3771 3742 1678

18247

flask 2

90 73 56 85

203 739 281 928 293 249 314 314 263 194 268 196 509 379 961

1285 3613 3972

16272

flask 1

54 81

178 701 184 831 286 153 235 235 184 105 151 308 266 996 109 533 412

8474 2716 3620 3968

flask 6

Signal Intensity at 0 Days

89

197 203 603 279 443 445 290 332 141 249 309 254 125 672 622 120

1002 1234 6455 4396 3049 3940

flask 5

5

89 90

129 970 232 774 136 231 53 393 275 107 175 274 273 475 104 661

1041 6882 4259 3108 4034

flask 4

Monazite Only

85 73 40

102 605 294 859 164 264 242 242 255 209 269 165 723 833 403 118

2923 3429 5328

10181

flask 3

88 83

171 848 321 333 240 269 332 213 219 125 259 396 282 117 949 551

1315 4049 3749 5135

15520

flask 2

98 91 73 69 85

621 324 893 268 227 382 382 174 198 268 187 902 429 469

7740 3074 3319 2560

flask 1

Metabolite Name or BinBase ID

41811 41938 42205 47170 48522 49382 53724 54643 87877 88911 89221 97326 97332 100768 100869 100880 100908 101299 102223 102616 102661 102662 102679

56 52 91 65

646 316 778 103 153 179 133 152 932 100 162 216 119 629

9244 1564 4281 1153

30033

flask 6

77 91 52 89 81

250 504 120 281 151 256 133 937 169 137 156 384

1104 2343 3686 1651

11574 17523

flask 5

89 73 73 99 91

756 322 754 179 177 154 159 120 927 111 113 100 971 392

9130 1367 3754

24008

flask 4

K2HPO4 and Monazite

3

93 96 69 72

249 531 122 306 245 120 839 125 134 327 916 103 147 558

1212 2867 253

12586 23109

flask 3

40 80 46

352 464 131 301 127 185 187 119 967 129 173 102 151 581

1663 2770 3636 1143

12452 19548

flask 2

72 71 96 99 87

341 528 157 292 146 126 143 905 231 120 168 546

1469 4139 4118 1639

12334 20102

flask 1

7

65 42 96 88 96 72 74

390 50 176 212 133 169 117 137 805

1155 2023 1181 3591 1163

11812 20549

flask 6

Signal Intensity at 0 Days

61

523 294 811 160 111 183 220 118 202 117 119 138 120 138 121 978 780

1320 1085 2672

10962 27265

flask 5

57

94 76 81 80 90 55 71

391 348 865 152 118 127 124 9 128 103 872 968

8479 1728 3364

22266

flask 4

57 91 63 89 54

431 700 902 789 383 197 199 123 129 104 737

1226 3935 1254 3106 1200

Monazite Only

11488 20437

flask 3

73 76 93

429 533 180 186 303 243 176 140 113 168 143 209

1250 1995 1239 3445 1691 1264

10540 25141

flask 2

97 78 93 72 59 65

494 491 137 176 172 183 108 121 100 592

1131 2350 1068 3118 1083

11398 20047

flask 1

Metabolite Name or BinBase ID

102711 102714 102715 102716 102727 102728 102729 102730 102731 102732 102733 102734 102735 102740 102741 102746 102747 102749 102776 102784 102790 102791 102793

660 230 135

flask 6

859 101 125

flask 5

546 142 109

flask 4

K2HPO4 and Monazite

80

893 130

flask 3

80

928 164

flask 2

97

783 115

flask 1

722 117 111

flask 6

Signal Intensity at 0 Days

805 105 131

flask 5

89 78

628

flask 4

90

701 181

Monazite Only

flask 3

162 192

1005

flask 2

72

737 101

flask 1

Metabolite Name or BinBase ID

102808 102809 102821

28

57

814 793 575 564 185 562 498 128 500 704 966 181 219

1167 1565 41 2389 2523 1196 2114

67327 11205

flask 6

80 83 51 80

943 417 164 584 527 207 277 270 846 672

1316 1122 3352 1870 1581 2823 1885 5395

29243

flask 5

5

98 90

399 644 157 986 516 527 160 234 37 765 125 880 202

1055 2033 1641 3423 1436 6159

10153 28922

flask 4

K2HPO4 and Monazite

91 89

899 899 135 665 769 530 333 280 411 810 168 864 232

1046 4535 2813 1540 1301 1525 3247

18057

flask 3

93 89 74 324

476 738 589 478 277 298 365 903 149 203

1095 1168 2417 1947 1944 3250 1074 1809 7

34968

flask 2

37 85 78

916 613 425 228 228 317 639 116 766 177

1057 1057 1233 2764 1968 1601 2483 1569 5227

25048

flask 1

463 463 160 549 185 176 435 129 291 112 153 323 230 195 119 884 967 195

4217 1260 1546 4633 1170

flask 6

1

Signal Intensity at 2 Days

8 52 66 65 98

420 331 589 241 237 363 224 204 288 333 714 750 146

1798 1336 5921 1253

11202

flask 5

91 95 95

493 240 468 271 363 552 253 145 301 351 373 101 718 159

3673 2374 1673 1628 2660

13094

flask 4

63 96 79 65 96 74

279 336 322 150 302 203 313 309 169 515 394 134

2830 2045 1140 4548 1096

Monazite Only

flask 3

84 86 66

489 421 118 536 222 204 413 238 303 451 655 162 837 952 153

3400 2233 1327 7400 1842

flask 2

408 466 144 665 270 302 698 114 264 127 120 400 579 372 115 895 221

4104 2744 1654 9042 1374 2109

flask 1

Metabolite Name or BinBase ID

1-deoxyerythritol; 2-deoxyerythritol; 2-hydroxyadipic acid; 2-hydroxyglutaric acid; 2-isopropylmalic acid; 3,4-dihydroxybenzoic acid; 3,6-anhydro-D-glucose; 3,6-anhydro-d-hexose; 3-deoxyhexitol; 3-hydroxy-3-methylglutaric acid; 3-hydroxypropionic acid; 4-hydroxybenzoate; 5-hydroxymethyl-2-furoic acid; aconitic acid; adipic acid; alanine; alpha-ketoglutarate; azelaic acid; benzoic acid; beta-gentiobiose; butane-2,3-diol; capric acid; cellobiose

615 608 301 225 615 226 611 148 208

1119 1525 4988 3017 4920 2012 1469 6187 3457 8479

11487 10234

flask 6

267025 297282

63

793 298 457 110 114 459 265 216 113

1214 4435 1915 3909 1264 1289 5744 2722 1942

11989 10593

flask 5

399497 352239

2

82 91

980 64 602 105 459 189 282 116

1080 4853 1908 3706 1248 1034 7243 3646 2123

10789 10153

flask 4

216037 302706

K2HPO4 and Monazite

58 92

715 857 379 396 999 163 111 483 378 213

5219 1529 6166 2303 4134 2344 2411 9062

15585

flask 3

500747 270894

4

851 556 467 158 117 544 163 422 113 156

1460 4392 2326 492 1546 1192 8105 2756 2009

13537 10397

flask 2

216064 335937

53

609 540 187 425 168 417 129 160

1023 1183 4407 2075 2656 1315 1140 5082 3092 2099

13158 10542

flask 1

147070 358631

11

60

401 284 391 682 181 147 550 889 625 123 236

1448 2315 1035 71 1723 4623 1345 5383

17049

flask 6

340625 236459

Signal Intensity at 2 Days

86 76 65

381 618 254 940 296 719 596 137 370 234 768 215

1820 6731 2513 1770 4662

10736

flask 5

333969 188016

43

90 66 86

483 320 580 148 528 318 232

1019 2733 1445 6374 1009 2347 33 2675 1227 3906

93149 13540

flask 4

276717

60 54

247 244 632 193 312 178 277 276 114 185

1327 1543 5689 2388 2012 1313 1027 3021

Monazite Only

11103

flask 3

243526 173497

91

356 230 434 854 166 151 465 246 114 226

1230 3048 1119 6233 2127 2960 2231 1405 3301

13580

flask 2

269381 275057

355 279 283 232 122 372 102 358 134 242

1016 2318 1240 5416 1310 1430 2934 1610 1775 6126

21122

flask 1

202366 217568

Metabolite Name or BinBase ID

citramalic acid; citric acid; dehydroabietic acid; dihydroxyacetone; erythritol; erythronic acid; fructose; fumaric acid; galactinol; gluconic acid; glucose; glucose-1-phosphate; glutaric acid; glyceric acid; glycerol; glycerol-3-galactoside; glycerol-alpha-phosphate; glycolic acid; histidine; hydroxylamine; isocitric acid; isothreonic acid; lactic acid

85

138 152 900 103 133 260 577 144

1583 1243 2077 3386 5375 8712 2814 1138 2055 2153

74624 11754 36471 87635

flask 6

57

85 57 82

480 612 398 137 108 338 169 213 148

1129 4300 6109 11 1442 2165

32202 12385 20011 61061 38523

flask 5

85 85 90

520 506 335 109 423 504 157 137

1021 1872 3955 9753 7142 3381 1185 2417 1873

37590 25088 77636

flask 4

K2HPO4 and Monazite

67

3718

232 730 517 103 126 135 457 205 998 191

3159 1177 3071 7138 1065 7827 3757 1702

28118 27584 98283 3

flask 3

88 98

665 546 168 330 136 105 506 309 152

1279 2226 5021 6552 3794 1606 1350 2226

46971 12933 24089 76769

flask 2

49 76 27 92

781 472 645 736 111 462 220

3482 1309 4216 9217 5028 3048 1096 2808 2548

48492 23286 76222

flask 1

78 82 98 50

270 234 856 713 227 785 699 162 219

1165 2112 3562 7242 1645 1007 5323

24530 26391 28981

flask 6

Signal Intensity at 2 Days

59 84

251 181 313 808 475 146 125 477 205 123 188

2082 6508 1167 6986 2582 2226 4541

25620 18395 30877

flask 5

54

211 402 604 206 110 196 238 380 197 617 200

1346 2041 2483 1106 6576 1982

40611 14479 27539 49371 10422

flask 4

49 87 76

205 242 671 666 146 862 177 139 403 331 157

3028 1862 1524 3726 2621

Monazite Only

16809 10979 19504 20406

flask 3

60

74 40

380 253 308 959 6 131 102 733 364 196 185

1571 1884 9195 1951 1672 8939

31780 23290 22982 38597

flask 2

221 227 909 490 126 918 229 140 139 248 618 162 187

1261 3372 9001 7564 1980 1343 6049

49731 32534 61931

flask 1

Metabolite Name or BinBase ID

lactitol; lactulose; lauric acid; levoglucosan; lyxitol; lyxose; malic acid; maltose; maltotriose; myo-inositol; nicotinic acid; oleic acid; oxalic acid; palmitic acid; p-cresol; pelargonic acid; phosphate; propane-1,3-diol; putrescine; pyruvic acid; ribitol; ribonic acid; ribose

110 692 235 232 288 360 102 408 832 754 908

5237 2202 3657 1273 2245 2853 1561

38659 13468

flask 6

506650 455793 883333

80 57

162 243 118 273 263 243 595 307 847 384

4493 1426 1062 2420 1664 1627

25260 12956

flask 5

480270 431934

1788072

69

136 514 125 123 395 199 370 542 440 824 729

3392 1483 1286 1762 1943 2056

45662 11452

flask 4

698789 427324 976281

K2HPO4 and Monazite

77

189 172 108 135 291 207 946 293 636 304 868

1736 8257 2195 1472 2341 1939 1092

34665

flask 3

353996 508424

2639849

738 97 97

186 407 339 161 140 293 571 616 737

3 1587 1681 2231 1984 1694 1072

43079 14003

flask 2

709475 364649 849434

64

136 313 126 400 214 262 264 610 259 869 897

2606 1494 1009 1841 2037 2003

19756 12316

flask 1

736183 433801

1010921

62

82 89 53

410 372 406 118 791 858 426 579 559 710

18 5219 1868 2044 2077 1090

27590

flask 6

132969

1077918 2201845

Signal Intensity at 2 Days

40 55 55

219 115 192 159 558 205 613 381 408

3790 5257 1606 1117 1570 1244 1060

22649

flask 5

300929 975957

2199569

4

84 25 64

417 283 261 193 739 923 854 312 970 478

5382 5852 1863 2658 1025 1704

3076

flask 4

350021 998404

2419444

85 87 54

220 102 194 106 417 490 143 449 249 767 523

1400 2370 1352 1264 1442

Monazite Only

42625

flask 3

621734 779173

2310010

32

57

108 419 110 162 355 158 618 248 838 480 944 402

3031 4243 15 1089 1888 1797

60067

flask 2

322477 851483

2372329

77

151 165 153 121 258 253 669 216 780 608 841

3250 7676 2028 1251 1580 2262 1454

32644

flask 1

466639 809174

1551478

Metabolite Name or BinBase ID

ribose-5-phosphate; s(-)-willardiine; shikimic acid; sorbitol; stearic acid; succinic acid; sucrose; sulfuric acid; tagatose; threitol; trans-4-hydroxy-L-proline; tyrosol; UDP-glucuronic acid; uracil; xylitol; xylonolactone; xylose; xylulose; 39; 47; 62; 91; 99

389 453 728 413 527 541 465 600 257 535 395 889 247 979 213

1236 1166 2636 1233 3774 1852 1512

28704

flask 6

45

226 376 194 575 802 257 442 206 728 480 603 313 518 487 823 105

2248 1837 1465 1952 2551

29900

flask 5

326 218 279 350 536 637 957 394 658 165 931 481 802 386 746 621 205 661 151

1678 2033 2167

40915

flask 4

K2HPO4 and Monazite

358 553 188 484 272 348 280 120 649 613 937 503 196 896 154

1432 1053 1021 1692 1058 2018 3697

61361

flask 3

73

428 290 866 341 736 368 622 222 614 781 602 718 207 620 156

2473 1281 1070 1780 11 2297 2096

30086

flask 2

396 386 260 248 281 342 891 292 246 183 433 877 248 876 524 112 990 166

1016 1766 2162 2025

34242

flask 1

51

288 713 200 231 545 749 600 751 922 381 512 971 580 814 655 206 656

3448 1611 1835 3347

44359

flask 6

Signal Intensity at 2 Days

92

234 539 245 166 269 409 198 181 307 776 557 711 596 436 138 636

1271 1185 1181 1502 6834

flask 5

151494

82

298 936 250 260 436 335 489 529 576 671 601 656 442 210 598

1879 1823 4557 1498 1824 3887

99321

flask 4

36

94 72

376 403 1 737 305 609 493 425 181 328 621 441 990 466 825

3771 1046 1339 1399 4134

Monazite Only

89709

flask 3

301 136 229 193 314 657 547 757 245 472 761 242 657 753 141 931 119

3696 2090 1621 1853 3637

86239

flask 2

408 713 232 838 913 243 681 490 270 757 309 558 198 923 174

1140 2030 1150 2973 3272 1421

13114 45481

flask 1

Metabolite Name or BinBase ID

137 168 257 657 809 892 1064 1173 1673 1681 1690 1715 1870 1872 1875 1878 1913 2044 2061 2081 2095 2097 2821

490

429 309 423 358 349 351 257 309 570 558

7216 1 1427 1097 1107 1913 6300 2825 7607

16535 15763 19682 10324

flask 6

31

963 196 617 991 663 250 225 114 943 179 200 379 344

4285 9268 9783 1059 4243 4041 3816 9088

15539

flask 5

22

495 1658 616 495 212 171 129 178 335 584 310

4993 1252 1182 4667 1282 1162 4411 2232 5846

11241 11676 1

flask 4

K2HPO4 and Monazite

237 237 758 787 235 279 208 235 183 986 407

5843 1396 3932 1298 3295 1202 1062 5127 3319 5102

12587 17452

flask 3

5

53

309 717 343 619 56 394 197 108 469 219 426 335

4976 1541 4840 8572 5636 1261 4725 2381 6384

11110

flask 2

59 50

800 274 362 779 878 437 208 203 632 309 714 379

4588 3745 1222 4811 1152 4329

10447 11862 10439

flask 1

944 226 543 804 397 254 253 137 256 292 387

6176 1564 2368 1314 1445 1052 4842 1113 2883 5482

11055 21743

flask 6

Signal Intensity at 2 Days

87

248 169 407 837 245 196 195 718 199 164 378 278

3686 1134 7946 8602 1644 1018 3665 4137 9694

13741

flask 5

73

651 246 598 977 274 433 242 255 425 426 396

3563 1092 3844 1545 1042 4946 4353

11390 20117 16542 12157

flask 4

64

237 170 261 933 736 243 249 224 114 786 316 538 261

3861 1030 7828 1925 1334 3663 3967 5380

Monazite Only

12442

flask 3

78

429 212 324 588 708 371 325 142 213 417 744 335

4624 1025 3114 1067 4900 2095

10733 12061 10590 11093

flask 2

345

373 213 650 650 470 153 352 278 385 646 292

7399 1 3500 1493 2399 1546 6078 1091 2222 9607

14824 24354

flask 1

Metabolite Name or BinBase ID

2944 3083 3173 3247 3442 3781 4541 4543 4713 4732 4795 4797 4819 4937 4976 5346 5576 6104 6330 6646 9320 14694 16561

350 154 821 440 261 543 686 484 359

2381 1496 5297 1101 6169 9881 1606 1073 1652 1054

14210 13609 55550 20485

flask 6

72

447 461 183 745 750 606 567 457 552 774 306 222

1696 1984 1617 4285 1018 1355

11769 10034 18071 13444

flask 5

410 251 117 509 419 290 317 898 400 140

2167 1021 1445 1906 7641 4670 2115 1124 1087 1046

10657 22835 15406

flask 4

K2HPO4 and Monazite

57

92

516 583 373 384 503 775 766 544 2

3027 1214 7081 4104 1127 1442 1249 1351 1519 3371

16077 14592 17926

flask 3

91

781 848 301 809 781 926 423 479 494 801 427 179

1717 2068 2327 8287 5824 1062 8929 1547

27663 15536

flask 2

85

757 622 287 876 693 403 827 326 258 808 351 244

1474 1583 1635 5586 3994 1267 8919 1113

18951 15089

flask 1

061 70

446 223 635 836 352 297 842 523 999 685 718 252 372

2 1069 7125 1182 7142 1065 5025

28359 14530

flask 6

Signal Intensity at 2 Days

71

225 268 429 941 526 965 780 583 264 603 501 379 191

2152 1250 1458 1237 3300

12990 18078 11603 10215

flask 5

007 77

529 280 899 544 539 244 503 499 203 129

2757 1665 2 1407 1058 1007 1275 1203

15487 15342 18040 14356

flask 4

93 64

198 178 840 826 294 990 426 197 816 488 564 536 195 307

2177 7098 1019 9932 1062

Monazite Only

16777 10249

flask 3

307 127 782 763 171 952 461 223 514 397 507 215

2510 1215 1367 1101 1146 1284 2953

14446 16834 10480 11546

flask 2

427 411 215 572 129 713 914 354 986 648 814 766 447 309

2282 1377 1628 1578 1191

12247 14416 16264 15678

flask 1

Metabolite Name or BinBase ID

16817 16850 16855 17068 17069 17140 17425 17471 17651 17830 18082 18173 18226 18241 20282 21704 22967 25801 30962 31359 41682 41689 41808

280 930 397 383 278 527 659 906 340 878 815 359 411 449 731 359

1204 1787 1448 6162 1565 1816

25931

flask 6

1

204 73 236 352 206 201 490 361 180 250 666 364 278 631 567 458 411 168

1087 3803 1033 1527

34251

flask 5

286 749 307 239 312 326 313 218 158 357 659 415 387 524 619 353 317 997 627

1466 4368 1036

26550

flask 4

K2HPO4 and Monazite

47

191 781 340 234 350 263 581 430 262 350 539 3 120 308 801 147 616 152

1390 4718 1574 1860

40236

flask 3

152 754 454 286 390 474 335 465 319 516 866 452 427 570 520 609 412 760 261

1628 4305 1153

24290

flask 2

51 059 81

121 578 252 326 208 201 288 288 217 398 644 441 421 571 339 462 221

1431 3899 1

24888

flask 1

247 718 307 908 330 146 602 437 286 108 626 346 246 199 953 154 443 113

1360 4038 2798 3203

14036

flask 6

Signal Intensity at 2 Days

164 472 258 974 278 215 245 186 159 113 467 289 156 140 543 159 369 116

1063 3191 2362 3684

16551

flask 5

79 77

173 904 301 241 225 220 149 240 145 432 295 256 202 748 459 470 100

1327 3755 1613

13527

flask 4

93 80

128 558 277 165 108 209 524 524 223 106 330 264 126 849 512 143 453

6629 3192 2103 3604

Monazite Only

flask 3

77

94

105 831 385 271 331 237 319 319 326 333 264 1 190 510 362 470 122

1392 4189 1936 3413

10653

flask 2

68

374 343 611 277 255 676 534 294 171 428 495 313 244 326 807

1132 1489 1004 8540 5294 2158

10226

flask 1

Metabolite Name or BinBase ID

41811 41938 42205 47170 48522 49382 53724 54643 87877 88911 89221 97326 97332 100768 100869 100880 100908 101299 102223 102616 102661 102662 102679

68

462 692 839 194 844

6809 6221 6320 2928 3474 2354 1754 11

11935 81237 34897 20951 32447 59398 11064 11032 31762

flask 6

123803

59

431 942 232 473 628 399

9904 8844 2589 3368 2314 1010 2693 1704 1880 1541

49445 49097 15525 18249 64531 10184

flask 5

1

99

372 274 495 51

9368 6273 3382 2966 1390 2552 1754 2610 2115 1198 1093

12221 53378 55395 15783 22913 44082 23961

flask 4

K2HPO4 and Monazite

87

436 989 192 939 555 193 388 885 456

5060 4175 7579 1537 1887 3734 2524

14120 30136 23420 25101 19833 32394

flask 3

433 355 555 120 580 698

9845 5291 5542 3790 2157 3825 1958 2282 2027 1600

60428 69160 21926 11720 21735 58245 18929

flask 2

318 984 365 491 103 800 416

9748 7022 5651 3028 2132 2832 1616 2718 1781 1347

39652 40611 15965 20966 45057 24072

flask 1

72 97

553 977 624 310 506 261 116 953 230 606 146 441

6576 4954 1160 4308 1886 1722 1378

12832 29619

flask 6

14

Signal Intensity at 2 Days

68 65

3 710 140 356 493 121 219 139 210 387

9605 7301 1755 1278 9062 1477 1306 2542 1290

13839 18395

flask 5

96

744 314 567 873 248 448 141 195 145 517 633

2314 3785 7898 1410 1756 1648 1119

11877 19855 15669 27539

flask 4

59 91 69

300 549 976 560 140 492 124 982 201 221 149 317

9943 7147 5481 3199 1087 1684 1054

Monazite Only

19504

flask 3

89

408 395 333 482 342 305 111 103 297 120 328

8902 1019 2007 4080 1387 1398 1728 1034

10845 11406 22982

flask 2

6

420 531 304 856 21 508 182 293 174 327 172 612

2539 2260 5614 1465 1741 2444 1454

13522 14096 10294 34853

flask 1

Metabolite Name or BinBase ID

102711 102714 102715 102716 102727 102728 102729 102730 102731 102732 102733 102734 102735 102740 102741 102746 102747 102749 102776 102784 102790 102791 102793

681

4333

10750

flask 6

293

3116 1591

flask 5

447

4139 1854

flask 4

K2HPO4 and Monazite

745 212

2350

flask 3

277

4622 2618

flask 2

346

3490 2066

flask 1

223 122

1458

flask 6

Signal Intensity at 2 Days

159 144

2100

flask 5

402 117

3394

flask 4

93 87

1808

Monazite Only

flask 3

187 137

1186

flask 2

286 195

2751

flask 1

Metabolite Name or BinBase ID

102808 102809 102821

241 852 332 943 328 192 179 378 131 323 320 120 578 115

1462 2599 1658 3386 2099 1998 2965

12665 18355

flask 6

938 322 216 222 405 337 170 958 854 146 202 505 414

2551 3988 3603 1566 2089 3557 2081 1557 1832

10013

flask 5

71

302 482 718 672 397 489 561 499

3047 5202 1742 2447 5970 2211 2463 5766 1952 1722 2552 3162 1496 21

11169

flask 4

and Monazite

4

552 501 528 728 213 446 503

3928 7531 1031 2583 9797 6245 1129 1785 5901 1413 1939 3141 1812 7595 1374

34211

flask 3

HPO

2

K

217 753 460 445 300 392 897 103 413 659 105 728 147

2188 3803 1730 3874 2036 3694 9597 6372

10922 29943

flask 2

410 410 389 621 820 381 600 323 965 683

3810 6423 1615 4543 1536 2232 5595 1536 2087 3922 1520 5052

14494

flask 1

82

129 530 337 198 539 123 178 275 890 158 280 155 272 429 170

1258 1303 3797 2302 4468 6062

27131

flask 6

Signal Intensity at 4 Days

68 43 77 48 82

567 785 271 350 436 411 342 126 123 142 260 722 347

1126 1720 3857 3301

20347

flask 5

80 96 53

344 601 351 371 197 670 215 282 579 807 154

1268 1720 2068 2396 1703 4256 4647

29751

flask 4

153062

90 59

135 350 335 335 529 135 136 182 254 198 303 639 176

1286 1370 3382 2062 1873 4678 6879

30497

flask 3

Monazite Only

99 67 77 93 91

498 215 252 427 602 490 205 103 223 582 154 504

1380 2224 2224 1647 3914

17195

flask 2

6

50 64 58 72

627 492 602 424 445 159 160 190 284 379 188

1475 1500 2735 2338 1111 584

26274 12346

flask 1

furoic acid

-

2

-

glucose

hexose

-

-

methylglutaric acid

D

d

-

- -

diol

3

-

-

2,3

-

ketoglutarate

-

gentiobiose

droxypropionic acid

-

dihydroxybenzoic acid anhydro anhydro

- - -

deoxyerythritol deoxyerythritol hydroxyadipic acid hydroxyglutaric acid isopropylmalic acid deoxyhexitol hydroxy hy hydroxybenzoate hydroxymethyl

------

Metabolite Name or BinBase ID

1 2 2 2 2 3,4 3,6 3,6 3 3 3 4 5 aconitic acid adipic acid alanine alpha azelaic acid benzoic acid beta butane capric acid cellobiose


86

732 377 264 652 233 772 131 328 336 138

2726 4586 8395 1003 3375 1068 1717

13517 11460 10143 20479

flask 6

410461

788 328 726 769 489 208 374 162 237 189 339 225

1335 2127 1189 1333 4227 1035 8572 5065 1015 5977 2919

flask 5

971 994 741 272 505 830 728 387

1820 1827 4477 3520 2362 2968 5369 1122 2860 2408 1401 8135 8164

22019

flask 4

125962

and Monazite

4

507 213 438 252 969 613

3086 6171 1237 2159 6402 1121 3139 7646 2920 1425 6353

12296 11439 15698 41273 15584

flask 3

223882

HPO

2

K

20

68

987 430 839 308 191 568 895 451 194

3718 3640 8347 2659 5635 1849 2023

198 36134 29847 19330 11809

flask 2

854456

634 729 712 402 588 298 273 923 360 588

1362 1603 3760 1255 5785 1077 4212 1474 4067 2025 2108

48720 13049

flask 1

73

538 413 815 236 116 649 174 618 976 332 197

2772 6712 1854 2184 2613 4415 3745 2322

15232

flask 6

156120 376218

Signal Intensity at 4 Days

37 37 98

312 247 421 104 404 158 471 154

6327 1958 2509 1636 3700 1259 2630 5727 2521

36962 10226

flask 5

603235

78

560 506 693 950 100 502 412 289 119 174

4311 1203 3999 1132 4363 6124 3866 1445 3909

12374

flask 4

226219 104609

62

521 367 461 299 108 564 238 935 312 152

2879 5750 2514 1497 1774 6296 6481 2615

17267 10173

flask 3

Monazite Only

158472 557186

9

67 41

352 322 287 223 49 286 799 205 109

8328 3333 3353 3403 1302 1374 3060 7152 4643 2322

flask 2

119554 627709

87 78

349 453 333 204 593 256 752 348 121

1920 5702 3018 6654 1451 7688 6143 5959

18916 10242

flask 1

118713 410676

e

phosphate

-

cid

galactoside

phosphate

-

-

alpha

3

1

- -

-

Metabolite Name or BinBase ID

citramalic acid citric acid dehydroabietic acid dihydroxyaceton erythritol erythronic acid fructose fumaric acid galactinol gluconic acid glucose glucose glutaric acid glyceric acid glycerol glycerol glycerol glycolic acid histidine hydroxylamine isocitric acid isothreonic a lactic acid


55 59 60 84

482 536 706 107 306 114 499

1190 1405 1170 4574 1488 2495 8952

87140 23840 20650 57519 10865

flask 6

80

77

293 451 615 247 395 206 337 247 131 503 270 393 486 339

3042 1033 4048 1129 4374

10692 942

flask 5

149056

364 384 581 528 367 784 515 436 298 348 587 312

4041 3257 6918 2211 8679 1975 6882

20537 11680

flask 4

118044 165283

and Monazite

34

4

423 689 249 200 434 699 376 710 589

4049 1020 3955 6775 1090 3417

23729 18279 12159 12300 69102 14946 20839

flask 3

1689

HPO

2

K

76

854 159 494 114 116 484 118 660 140

1051 1972 1084 2250 2013 5491 1721

22424 25789 14768 37635

flask 2

149693 120236

12

352 837 389 335 667 642 534 422 812 327 7 845 592

2717 1048 1143 1015 7164 2493 2907

19426

flask 1

149920 232560

70 70 96 43

841 454 562 520 510 520 433 457 110 197

1499 5566 1887 4578 8106

36371 12207 11493 24501

flask 6

Signal Intensity at 4 Days

41 26 99

936 424 805 387 495 242 133 772 515 269 287 162

1097 2450 7667 8362

64747 11175 10287 13784

flask 5

96 96 44

990 299 388 106 211 354 129 188

7579 1659 7061 7959 2087 5682 2522 7389 5770

55353 25981 33654

flask 4

95 68 92

993 657 523 574 140 872 809 500 138 189 162

2179 1841 4896 2734 7081

40169 10574 12580 22971

flask 3

Monazite Only

7

60 38 83 47

494 398 318 783 216 727 332 593 255 467 110

4291 4605 3107 1055 5047

9671 17219 16825

flask 2

81 72

588 469 718 190 733 121 239 480 107 145 145

1699 1913 1989 4616 1344 6049 6452

61268 22569 21275

flask 1

diol

-

1,3

-

inositol

-

cresol

-

Metabolite Name or BinBase ID

lactitol lactulose lauric acid levoglucosan lyxitol lyxose malic acid maltose maltotriose myo nicotinic acid oleic acid oxalic acid palmitic acid p pelargonic acid phosphate propane putrescine pyruvic acid ribitol ribonic acid ribose


85 44

500 507 901 401 363 162 963 266 696

8274 2743 2048 5745 1833 1654 1014

45627 34348

flask 6

712729 158767 291609

9

110 158 422 632 842 518 218 254 578 200

3177 2950 1495 1946 1655 3181 2447 1823 5085 408 2225

64119

flask 5

1567526

325 400 512 554 269 436 282

1745 2811 1738 5376 1591 3064 6124 2496 2611 7111 5353 2991

64313 22400

flask 4

108544

1190155

and Monazite

1

4

249 855 712 360 564 873

3981 1681 1726 2090 3161 4703 4538 4110 5067 2918 280

14543 72057 86811 12893 39589

flask 3

968944

HPO

2

K

793 113 503 230 105 278 289 542

1490 2127 1765 3824 5050 9782 3114 3015 2081 1372

13877 30863 47930

flask 2

100314 355569

766 538 335 311 542 625 451 340 899

2158 2311 1478 1992 2683 3855 2890 2327 1760 8634 6249 3963

flask 1

103155

1294629

93 21 71

318 252 154 854 237 581 568

5218 2031 1086 3534 2275 1386 1777 1048

20207 28552

flask 6

254626 941264 711606

Signal Intensity at 4 Days

54 98 25

201 112 211 504 140 143 505 341

4461 9976 1010 1061 1446 1170 1270 1112

18143

flask 5

490231 630477 366393

80 51

318 176 878 137 235 404 242 938 666 772 703

5312 1156 2166 1453

87643 30350 12668 27835

flask 4

188729 922416

65 68

230 106 296 205 661 173 854 338 425

4880 1833 1630 2447 1962 1950 1138

20540 26130

flask 3

Monazite Only

242734 754355 534806

62

38 91 50 71

2 127 123 448 840 996 168 635 487

4069 8883 1307 1597 1270

11542 12431

flask 2

427800 567573 522492

93 44

245 184 257 233 209 402 553

5751 1704 1402 1758 2181 1912 1016 1680 1003

21294 44667

flask 1

380224 748015 285503

proline

-

L

-

phosphate

-

hydroxy

5 -

-

glucuronic acid

4

e

-

-

willardiine

-

)

-

Metabolite Name or BinBase ID

ribos s( shikimic acid sorbitol stearic acid succinic acid sucrose sulfuric acid tagatose threitol trans tyrosol UDP uracil xylitol xylonolactone xylose xylulose 39 47 62 91 99


89

314 218 197 746 213 536 399 413 681 436 331 389 125 755 157

2374 1062 1106 1008 1914 2776 1014

flask 6

728 757 247 181 322 289 462 538 397 258

2439 1353 2081 1004 3988 1412 1493 2942 2133 2665 3834 1782 2216

flask 5

951 528 636 590 374 544 518 246

1319 1545 2267 1217 2985 2677 2611 6891 3549 4192 2768 4943 2480 2991

15390

flask 4

and Monazite

4

922 705 994 272 589 738 346 352

1908 1519 3245 3247 1464 1090 1174 1362 4961 1924 3900 2951 2763 1515 1732

flask 3

HPO

2

K

345 852 439 848 650 829 248 854 199 691 684 752 526 560 159 711 216

1123 1034 2510 1057 4331 1023

flask 2

609 199 364 890 613 360 137

1748 4435 1379 4766 4692 1569 3777 1644 4004 6108 3392 4443 2646 5715 3383 3358

flask 1

8

298 940 636 210 499 689 500 168 240 385 725 552 662 315 180 611 147

179 1035 1601 2009 1964

23864

flask 6

Signal Intensity at 4 Days

26 85

137 111 136 119 285 607 356 331 108 120 689 274 513 402 344 234 286

1037 2132 1085

18093

flask 5

221 654 665 502 441 405 689 337 114 305 495 900 432 270 370 199 804 236

1000 1167 2134 2869 1523

flask 4

255 228 505 122 630 318 724 494 409 152 740 321 893 415 623 348 177 525 178

1861 2100 1382

12726

flask 3

Monazite Only

60 52

183 110 304 358 271 989 408 522 216 102 347 362 510 485 668 406 501

1162 2711 1393

22044

flask 2

1

74

258 300 147 442 534 597 469 128 734 232 817 530 939 459 555 108

1784 1000 1774 2409 1125

10635

flask

Metabolite Name or BinBase ID

137 168 257 657 809 892 1064 1173 1673 1681 1690 1715 1870 1872 1875 1878 1913 2044 2061 2081 2095 2097 2821


871 144 538 874 646 632 613 132 125 430 181 185 842 223

4345 3173 2183 2191 1350 1006 5207 2835

10886

flask 6

23

268 235 468 233 611 264 281 403 769 634 351 349

3536 1324 1449 1056 3133 9476 3674 11 1000

16836 33905

flask 5

571 538 607 656 390 941 987 738

7108 1358 1286 5409 1237 2014 1132 4422 1112 6960 7210 1086

24095 52445 18942

flask 4

and Monazite

4

712 875 922 622 732

3664 1319 3059 1793 8670 1204 1824 1186 2912 1867 2783 7842 1074

14469 30945 13474 10008 12940

flask 3

HPO

2

K

206 878 755 539 884 180 181 498 206 250 290

5473 1187 2795 3595 4722 4281 1819 1388 4902 1904 6713

13422

flask 2

439 485 911 199 418 567 439 671 476

6580 1093 1131 2717 3677 1383 5226 1835 1586 4539 2538

30197 65077 18979

flask 1

6

375 106 657 376 168 106 230 296 597 357

5480 1081 8637 1067 3305 1243 1505 1061 4448 1708 9350

11585 17809

flask

Signal Intensity at 4 Days

92 26 82

464 175 132 246 260 511 709 160 217 737 140

2605 5214 7728 7331 2867 1052 1175 2688 6760

flask 5

458 1314 877 555 924 248 346 322 323 237 360

6022 1121 2642 8948 5495 1150 4265 1584

13045 3 23856 26636 21721

flask 4

37

234 399 399 305 226 115 283 350 365 305

5226 1082 1079 4279 1423 1183 4115 1861 9368

10872 10565 14629

flask 3

Monazite Only

90

98

190 141 576 788 311 454 132 642 168 194 402 870 194

3498 1044 7162 6160 7514 2269 10 3201 5907

flask 2

89

957 235 188 797 921 467 541 220 212 251 289 608 301

4622 6753 3831 1196 1082 3859 7854

10071 13318

flask 1

797

Metabolite Name or BinBase ID

2944 3083 3173 3247 3442 3781 4541 4543 4713 4732 4795 4 4819 4937 4976 5346 5576 6104 6330 6646 9320 14694 16561


80

801 585 933 493 638 673 149 183 911 453 782 596 196 271

1129 1051 2084 6377 1813 9886

20998 13394

flask 6

376 320 557 414 491 220 931 177 721 345 177 782 181 337 624

1164 3154 1784 2532 2609 1220 1274 1657

flask 5

443 449 499 718 407 285 833 617 394 213 961

1581 8394 3142 5268 3697 2408 1942 2080 2657 2693 8354

12176

flask 4

and Monazite

4

865

282 305 143 229 607

2266 3210 3744 1433 6206 6744 9646 3104 1 2356 1728 1853 1742 2307 2037

27835 54449 37388

flask 3

HPO

2

K

966 917 703 131 772 212 478 871 945 195 381

1211 2446 1263 5015 1145 1016 1322 1648

19159 11877 27425 16011

flask 2

010

360 563 571 683 729 663 538 617 406 969 319 712

1586 9243 4435 2712 3855 1863 1946 1 1603 3160 1180

flask 1

89

945 291 298 890 936 549 317 702 340 439 654 319 452

1441 6498 4581 2033 8709 2139

12418 53792 24028

flask 6

Signal Intensity at 4 Days

61

807 433 346 236 489 478 264 246 465 391 531 423 187 165

9440 3688 8654 4241 2992 4722 9331

33008

flask 5

37 85

408 173 793 584 355 263 198 628 642 256

1310 6730 4026 2335 4875 6412

14567 36643 68966 10431

flask 4

142987

975 276 535 122 889 573 287 581 744 597 300 218

1380 5875 5616 5083 2653 4592 1532 1731

12114 51038 22082

flask 3

Monazite Only

999 363 149 211 125 668 539 365 378 148 423 656 389 260 237

1411 3910 7813 3056 1695 6143

36337 14119

flask 2

97

911 291 932 663 575 393 863 415 203 176

1370 4785 4612 3285 1057 3716 1258 1273

13195 10102 41398 18861

flask 1

Metabolite Name or BinBase ID

16817 16850 16855 17068 17069 17140 17425 17471 17651 17830 18082 18173 18226 18241 20282 21704 22967 25801 30962 31359 41682 41689 41808


39

123 636 150 197 334 320 236 354 210 811 381 444 974 380 529 772

4578 13 1019 9541 6467 3780 4863

flask 6

489 347 686 376 295 183 222 649

1946 1064 1046 1106 1285 1678 1653 3071 2006 1343 2470 1696 2335 8339

14406

flask 5

748 873 859 308 472 794

2598 1105 1099 1223 1886 2473 2700 6281 3365 2480 4576 1624 2516 2217 5373

11369 19687

flask 4

and Monazite

4

309 797 812 550 325 652

2080 1024 1867 3554 8742 2262 2190 4703 1730 1251 1423 1918 2724 3382 1360

12369 12821

flask 3

HPO

2

K

06

158 615 261 143 220 395 263 391 188 905 667 473 609 367

4928 1628 1183 2494 2511 4338 12 9842

32213

flask 2

671 663 667 406 571 493 650 435 894

5512 1491 1789 1251 1615 1230 7040 6431 3797 4224 6013 3069 2481

27393

flask 1

190 782 288 292 164 252 300 432 210 166 387 342 197 622 404

1766 1033 1686 1617 6441 3707 1489 3479

flask 6

Signal Intensity at 4 Days

37

129 379 165 150 177 101 220 139 102 276 193 140 657 171 269 684 620

1159 2967 1613 2911 1789

flask 5

142 744 219 168 291 244 231 473 249 336 615 256 991 290 564 641 654 613

1981 4524 2142

10166 11403

flask 4

1

142 692 283 265 171 225 164 36 248 196 381 299 946 370 500 268

1386 1849 2067 6206 3966 1027 7841

flask 3

Monazite Only

113 492 116 160 186 206 210 331 123 460 596 269 459 870 110 713 350 846 165

3480 3824 2550 5124

flask 2

2

143 485 247 314 181 294 143 300 203 191 428 113 856 292 335 850 990

1902 3763 165 9295 3361 5022

flask 1

Metabolite Name or BinBase ID

41811 41938 42205 47170 48522 49382 53724 54643 87877 88911 89221 97326 97332 100768 100869 100880 100908 101299 102223 102616 102661 102662 102679


353 435 616 560 121 605

1210 3670 4312 1663 7178 6849 2013 1532 2122 7436 2351 1026 1672

39027 18739 70090 11659

flask 6

464 803 229 416 378 206 256 459 177 227 884 160 225 239 177 301 971 206

1730 4360 4349 3389

79431

flask 5

64

790 5 446 394 282 318 410 554 892 331 282 325 571 364

8804 1171 1138 2608 4592 3221

28406 35021

flask 4

103441

and Monazite

4

955 540 515 436 988

3349 9143 3239 6721 1266 1575 2376 2728 1327 5182 2045

15850 65569 62352 19533 26581 11693 20058

flask 3

HPO

2

K

911

466 906 916 164

5668 4319 7890 3002 1661 1190 1507 5302 4566 2191 2194 2229 1122

23459 35282 13822 14220 26242

flask 2

151

534 865 373 567 998 352 476 402 327 327 344 356 493 402 530 654 443

1632 3317 1122 6506 5843

flask 1

132760

547

938 181 895 138 826

1265 6555 9283 4108 1998 7236 1246 1278 2196 6379 1683 1024

12204 35960 38909 46 24501 11681

flask 6

Signal Intensity at 4 Days

630 971 749 276 793 432 458 372 691 450

5031 6091 2064 1551 6439 1372 1202

23097 34923 39567 13784 16129 10987

flask 5

362

271 970 273 621 113 168 726 489 509 293

1258 4083 5774 7390 2621 8555 1400 1703 1720 1280

28707 48919 28

flask 4

370 755 301 397

1414 9164 9344 2932 1763 6332 1346 2326 1778 8707 1639 1084

10084 46215 53430 33476 20783 12409 11080

flask 3

Monazite Only

0

707 411 540 686 179 979 202 493 561 277 526 602

7740 9738 4292 788 4718 1336 1753 1442

23472 24369 13110

flask 2

939 426 620 979

1652 8155 8922 3109 2206 1102 5536 1491 1628 1499

40512 51657 64781 19533 34455 21928 10696 11484 15584

flask 1

Metabolite Name or BinBase ID

102711 102714 102715 102716 102727 102728 102729 102730 102731 102732 102733 102734 102735 102740 102741 102746 102747 102749 102776 102784 102790 102791 102793


2533 1073

17221

flask 6

923 376 206

flask 5

718 443 790

flask 4

and Monazite

4

4750 4812

10774

flask 3

HPO

2

K

1916 3946

16027

flask 2

360 356

1031

flask 1

917

8454 5301

flask 6

Signal Intensity at 4 Days

806

4937 7675

flask 5

5474 2753

16515

flask 4

699

7860 4658

flask 3

Monazite Only

236

5204 1807

flask 2

6278 1373

10255

flask 1

Metabolite Name or BinBase ID

102808 102809 102821


778 308 332 369 457 873 481 525 338 379 362 366 477 535

4502 6759 3588 1334 1303 1205 3611 1723 5182

flask 6

1

286 221 232 368 16 211 313 244 269 961 167 107 967 201 320

3233 2655 6601 2237 1032 1556 1857 1295

flask 5

551 340 379 554 308 551 691 836 461 535 449 285 406 578

4338 3623 7281 2647 1933 6747 4045 1210 1312

flask 4

and Monazite

4

9

300 323 222 359 212 191 509 431 408 338 889 19 323 189 424

2935 2730 3787 1767 1976 2449 1460 1862

flask 3

HPO

2

K

438 278 187 615 457 763 355 752 314 272 213 283 511 272

4503 3440 6869 2934 1678 1896 2397 2529 2161

flask 2

9

793 775 429 756 943

7877 5319 1073 5114 1316 1428 1708 2109 1176 3826 7457 1325 8157 4592 5254 238 2370

13905

flask 1

93

495 496 473 286 407 105 149 161 294 298 380 304 504 107 555 148

2511 2379 5137 6789

50808 11080

flask 6

Signal Intensity at 6 Days

57

245 938 199 303 299 899 579 321 339 975 164 301

1879 2480 1745 1572 2823 1310 1753 7836 1561

17265

flask 5

01

74

585 157 692 256 630 319 200 223 753 472 124 877 486 240

35 4137 8953 3636 2716 1311 1720 3197

flask 4

97

141 607 642 761 904 565 130 277 270 915 402 390 436 850 710 153

2078 1984 1204 8746 4814

Monazite Only

14447

flask 3

60

114 727 466 989 530 829 200 276 226 258 384 741 166

1957 2574 2876 2967 1006 1038 4955

24061 10072

flask 2

87

109 895 844 804 881 728 229 224 177 747 247 595 198

3798 2784 2988 1436 1104 2265

61019 69238 10692

flask 1

furoic acid

-

2

-

glucose

hexose

-

-

pic acid

methylglutaric acid

D

d

-

- -

diol

3

-

-

2,3

-

ketoglutarate

-

gentiobiose

-

dihydroxybenzoic acid anhydro anhydro

- - -

deoxyerythritol deoxyerythritol hydroxyadi hydroxyglutaric acid isopropylmalic acid deoxyhexitol hydroxy hydroxypropionic acid hydroxybenzoate hydroxymethyl

------

Metabolite Name or BinBase ID

1 2 2 2 2 3,4 3,6 3,6 3 3 3 4 5 aconitic acid adipic acid alanine alpha azelaic acid benzoic acid beta butane capric acid cellobiose


8

288 362 362 755 447 450 474 274 680 31 393 349 372

1303 1608 4928 1845 2616 7514 1191 5280

49168 10983

flask 6

215 324 514 376 441 251 286 182 616 182 146 267 315

1404 1719 2977 2099 4351 5909 1007 5784 2793

46860

flask 5

903

426 547 757 590 469 238 640 379 508 804 429 480

2749 2132 2249 2518 3334 1132 1925 2300 6

58642 13216

flask 4

and Monazite

4

243 305 269 460 406 235 514 160 307 860 442 227 460

1106 1602 4730 2712 4115 1139 4978 9116 1860

58910

flask 3

HPO

2

K

405 228 257 573 293 828 239 228 843 278 366 311

1320 1995 1211 2698 2820 4874 1458 8052 7801 5891

73282

flask 2

7

784 915 616 653 793 663

1157 409 5515 1736 6131 2044 2408 5506 1465 1316 3098 2968

11554 50946 23406 29845 11236

flask 1

71 46

504 401 346 590 356 548 757 682 145

1351 1176 1103 2708 2183 5556 1252

36079 18592 29417 12798

flask 6

217784

Signal Intensity at 6 Days

55

85 51

679 5 811 712 605 581 796 157 471 148

6991 3047 1016 1160 1071 1407 7891

10463 41954 20601

flask 5

557545

308 376 733 429 294 639 139 490 733 137 600 137 468 167 106

1299 1353 2437 8675 1106 3296 2141

58200

flask 4

58

81 66

609 445 755 307 5 738 367 302 509 199

8685 1562 5974 1529 2004 2065 1450 7153

Monazite Only

35369 63963

flask 3

234113

80 97

423 316 788 377 656 669 859 116 301

2821 5565 1337 1893 3780 7278 1209 5466

17148 10054 22277

flask 2

1179118

89 40

442 415 527 649 799 237 361 796 104

1040 1571 1413 5666 2694 1463 3440

47533 11301 11096 46485

flask 1

179742

phosphate

-

galactoside

phosphate

-

-

alpha

3

1

acid

- -

-

Metabolite Name or BinBase ID

citramalic acid citric acid dehydroabietic acid dihydroxyacetone erythritol erythronic acid fructose fumaric acid galactinol gluconic acid glucose glucose glutaric glyceric acid glycerol glycerol glycerol glycolic acid histidine hydroxylamine isocitric acid isothreonic acid lactic acid


44

677 636 460 833 799 508 454 288 342 315 149 477 247 328 413

3168 2173 3970

22139 10817

flask 6

125094 252812

230 240 359 311 568 351 426 597 378 430 286 422 203 159 261 269

2133 4370 1462 3551

14072 74497

flask 5

222037

0

394 761 839 929 488 379 574 363 418 410 289 500 308 590 929 769

3631 9569 3295 9023

19674

flask 4

13330 317164

and Monazite

4

65

356 222 403 951 602 251 801 571 473 263 509 294 390 294 253

1240 5580 1248 6590

14877 90526

flask 3

HPO

284674

2

K

035

451 117 713 428 389 794 438 394 353 288 420 265 368 540

4926 1245 7147 1388 1 8172

14442 94023

flask 2

309771

765 784

1549 2930 5273 2305 1381 2436 1251 4078 2380 2062 1484 1381 1353 5506 1064 1120 9986

50376 17237

flask 1

379456 266580

4

4

889 434 676 207 175 815 103 125 474 326 172 122

3316 2302 4474 1114 2988 6851

41673 15801 11514 19760

flask 6

Signal Intensity at 6 Days

80 91 57

120 562 181 244 534 339 114 108 866 449 112 138

3666 4950 3666 8101

16969 32494 28061 17553

flask 5

58 23

146 593 308 837 277 339 448 175 452 344 297 160 187 569 342

1090 3591 1111 2858

11294 65868

flask 4

68 86

138 198 359 195 183 789 188 878 340 441 169 169

2187 2608 3465 5152 2874 6299

Monazite Only

31549 28367 27508

flask 3

81

938 632 309 807 116 605 526 350 857 518

2690 2775 3401 3600 2906 2241 1079

87727 14639 22701 11402 37756

flask 2

499

72

420 255 123 187 123 481 140 295 153

3764 2147 1003 3975 9753 1672 5354 1374 3915 6317

41357 33 21294

flask 1

diol

-

1,3

-

tic acid

inositol

-

cresol

-

Metabolite Name or BinBase ID

lactitol lactulose lauric acid levoglucosan lyxitol lyxose malic acid maltose maltotriose myo nicotinic acid oleic acid oxalic acid palmi p pelargonic acid phosphate propane putrescine pyruvic acid ribitol ribonic acid ribose


454 951 988 213 261 257 464 447 210 278 596 674

5862 1902 1394 1614 1987 2031 7104 5178 4322

flask 6

128655

1301164

568 554 244 182 221 150 401 211 829 303 514 898 171

1072 1769 1001 1740 1792 5598 3718 1815

flask 5

117808

1440822

67

94

191 769 277 387 441 812 289 265 398 547

1074 2206 1452 1238 1413 1394 9230 6036 39

21665

flask 4

107282

1233788

and Monazite

4

953 982 553 245 222 209 783 320 648 222 261 625 346

5794 1718 1416 2157 1509 6750 4763 1909

77744

flask 3

HPO

1377062

2

K

80

192 363 918 781 540 104 340 602 895 309

1160 8153 2423 1167 1367 2726 5590 4677 2091

77034 26492

flask 2

1304173

79

859 429 887 980

12 4750 7905 1288 1251 4162 1353 1381 1643 1288 6327 3378 1754

11442 20093 14969 10126

flask 1

262633 685773

67 51

203 104 451 219 159 276 959 601

3112 1749 2193 5174 3996 2171 1086 1716

22216 42627

flask 6

322748 163286

1062175

Signal Intensity at 6 Days

82 39 68

449 180 273 199 172 112 206 683 258

2997 1329 1316 1647 9205 1519 2035 1159

27583

flask 5

103647

1035809

175 895 189 148 312 205 463 119 861 187 526 186

1003 1527 1309 1572 1178 4213 3328 1603

86240 20730

flask 4

1693905

74 62 59

421 805 234 345 190 299 500 524

1260 1233 1575 1787 1600 1802 1116

Monazite Only

39687 42950 64089 10025

flask 3

1320695

91

797 205 286 190 106 647 250 847

9418 2751 2350 2496 1417 3350 2489 1585 1644 3385

12281 34482 36079

flask 2

377645

2

51 60

351 123 239 367 266 675 290 972 490

1493 1071 3002 5034 1844 1485 1523

30808 35214 9900 26556

flask 1

1203411

proline

-

L

-

phosphate

-

hydroxy

5 -

-

glucuronic acid

4

-

-

willardiine

-

)

-

Metabolite Name or BinBase ID

ribose s( shikimic acid sorbitol stearic acid succinic acid sucrose sulfuric acid tagatose threitol trans tyrosol UDP uracil xylitol xylonolactone xylose xylulose 39 47 62 91 99


51

775 954 393 278 291 477 718 619 508

1608 1337 4342 1686 3276 1631 2396 5354 4062 4366 2606 5331 2596 31

flask 6

754 633 702 913 758 568 219 269 295 616 443 261

1402 3022 2667 1429 3814 2609 3008 1051 3465 2020 2139

flask 5

457 437 437 367 535 937 734 137

1792 6774 2140 3319 2268 1046 3326 2432 5895 6106 5072 1648 5372 4299 3795

flask 4

and Monazite

4

828

920 951 328 313 333 473 625 530 287

1289 1589 2242 6102 3033 1679 5197 3513 3777 3188 3717 1834 2524

12

flask 3

HPO

2

K

869 674 428 288 425 270 900 462 182

2132 1520 1007 1193 2618 1017 1868 3001 2081 2330 2114 3941 1995 2446

flask 2

071

840

3780 6822 5599 6327 9 6701 5133 2436 1661 1036 1260 5189 8912 9258 5413 2240 7419 1726 9211

12897 13765 14941

flask 1

298 478 251 203 949 477 676 152 448 389 688 279 912 554 441 149 669 139

1381 1047 7324 2461 2040

flask 6

Signal Intensity at 6 Days

44

60

402 430 326 863 357 633 537 2 117 267 544 834 422 112 664 702 120

1026 1202 1890 1702 1302

flask 5

726 430 162 277 477 200 196 688 191 196

2181 1806 1450 1525 1147 1225 2568 1498 1398 1097 1841 1045 1745

flask 4

43

55

314 120 205 278 792 193 519 359 359 787 126 271 777 932 5 330 288 108

1813 2053 1120 1003

Monazite Only

flask 3

355 331 270 649 477 334 309 971 577 277 780 887 451 177 725

4435 2578 1180 9803 1754 1101 2877 1085

flask 2

98 53

247 494 278 249 872 260 794 508 420 255 498 748 314 693 595 489 535 154

1194 1736 1845

flask 1

Metabolite Name or BinBase ID

137 168 257 657 809 892 1064 1173 1673 1681 1690 1715 1870 1872 1875 1878 1913 2044 2061 2081 2095 2097 2821


57

457 420 454 288 420 657 687 921 426 792 332

5889 1026 5862 2010 59 1120 3787 1506 1489

26928 62104 16256

flask 6

593 184 426 311 742 391 261 370 520 522 472 472 299 664 836

4265 6051 2970 3720 9730 2358

15915 34553

flask 5

396

492 496 574 285 742 387 402 933 390 566

5817 1171 6411 3045 1644 1921 6438 1495 4 1566

28146 61289 18623

flask 4

and Monazite

4

790 261 266 349 703 364 377 429 863 232 651 377 359

4864 3826 2563 4743 1002 2645 1341

18689 44723 12425

flask 3

HPO

2

K

433 394 794 324 400 467 412 441 983 493 306

5445 4856 2472 1204 1170 2960 2052 1354 1206

18668 38637 10550

flask 2

1

859

1475 1717 1092 1531 1521 1736 4508 1857 2006 7475 3490 3901 1764 1988 3836

75844 15874 14540 11619 36723 10630

flask

139417

98

791 177 631 760 665 837 579 141 290 185 250 806 291

4356 9599 5474 8436 2002 1117 3758 2740

12357

flask 6

Signal Intensity at 6 Days

74 99

176 349 111 871 605 268 283 126 349 466 156 159

5778 1122 2653 4082 1047 4900 1088 1315

14419

flask 5

279 207 448 137 371 344 456 148 256 342 729 539 902

4486 4322 1378 2417 1063 8245 1646 1603

14136 29474

flask 4

72

61

304 938 634 798 625 213 268 298 158 331

5043 1515 33 2176 2297 1225 1163 4369 1026 9473

Monazite Only

12503 15044

flask 3

68

991 387 365 880 512 515 740 214 900 552 320 403

4286 5161 3653 1541 3195 3281 1155 1031 4649

29658

flask 2

51 67

908 384 405 616 657 876 254 216 207 731 281

4823 1326 1856 8828 1965 4193 8726

10413 12761 10444

flask 1

Metabolite Name or BinBase ID

2944 3083 3173 3247 3442 3781 4541 4543 4713 4732 4795 4797 4819 4937 4976 5346 5576 6104 6330 6646 9320 14694 16561


240 457 589 535 728 775 714 372 575 433 244 910 305 426

1435 2945 3226 4464 3831 2471 1398 1943 1198

flask 6

272 303 549 320 334 357 675 777 194 501 238 194 994 462 230 240 512

1738 1354 1562 3050 2379 1901

flask 5

328 375 593 578 554 551 387 605 387 847 551 472 988

2733 2460 1765 2190 4006 2163 1195 1031 1878 3678

flask 4

and Monazite

4

775 238 354 969 424 364 744 491 305 377 266 305 641 263 276

1684 1974 3327 2774 1155 1519 2289 1072

flask 3

HPO

2

K

241 441 879 272 379 669 254 703 348 140 604 387 400 553

2877 1455 2576 4026 2698 1907 1328 1320 1157

flask 2

933 933

2772 4872 6850 1941 1624 1764 1708 1447 3864 3089 1036 1335 5973 1036 2613 1997 7111 1157 1792

11703 10172

flask 1

48

52

341 608 842 594 229 192 311 426 701 182 265

1450 1019 5477 3532 8796 6822 3042 1076

19815 488 14399

flask 6

Signal Intensity at 6 Days

88

283 245 846 159 301 186 119 783 542 132 354 587 199 564 899 818 300

1222 1069 9100

28598 22561

flask 5

214 733 268 240 297 447 818 195 791 245 121 459 193 265 250

1576 1572 3191 4283 2847 1587 1163 1608

flask 4

79

557 883 300 685 729 321 122 134 684 749 127 338

4551 1440 6835 4814 1027 2189 4618

Monazite Only

42584 14617 18634

flask 3

397 349 511 113 887 590 567 412 559 257

1373 1369 2678 6018 4045 1088 1131 1947 1062 1181

11778 26419 31847

flask 2

7

82

775 696 860 819 338 193 126 506 601 154 184

1186 5685 2228 1638 1259 2530

2359 13163 26362 54547 11992

flask 1

Metabolite Name or BinBase ID

16817 16850 16855 17068 17069 17140 17425 17471 17651 17830 18082 18173 18226 18241 20282 21704 22967 25801 30962 31359 41682 41689 41808


907 636 694 494 596 267 399 291 508 667

3737 2102 1831 1432 2264 1323 1560 2305 5537 2874 1699

14283 22565

flask 6

516 618 336 777 831 581 813 716 495 188 207 224 777 211 203

2465 1245 3039 1613 2676 2204 2812

13550

flask 5

527 691 621 527 410 344 547

1511 3389 1335 1862 1390 1335 1616 1464 2003 6661 3608 6270 1308 3526

25651 10503

flask 4

and Monazite

4

790 592 470 956 956 248 369 346 163 225 245 356

2317 1175 1005 1206 1568 6977 2167 4255 1524 2240

17256

flask 3

HPO

2

K

680 410 656 695 869 908 654 270 223 283 272 545

2594 1424 1328 1754 9539 2501 5238 1702 2576 3564

14924

flask 2

943

1297 7531 2688 2706 4358 2930 2566 5002 3210 6915 8045 2678 1689 1419 7877 1232 3724 2333

11666 14689 55584 19953

flask 1

130 613 250 240 190 306 340 446 109 418 600 414 992 204 441 985

3220 9731 1333 1972 3234 3419 2318

flask 6

Signal Intensity at 6 Days

78

120 676 162 247 400 362 351 514 260 362 821 103 560 719 785

3558 1283 1158 4919 6069 2519

55269

flask 5

2

362 261 23 598 528 396 791 546 535 477 290 290 167 175 330

1632 5183 2212 2165 1353 3055

11768 15642

flask 4

158 594 211 114 228 365 363 511 252 188 471 244 668 110 373 265 796

3555 1000 4105 3166 2732

Monazite Only

22857

flask 3

93 69

126 884 300 334 236 442 338 291 650 450 652 354 540 753 137

1445 1085 2051 5358 2633 3405

flask 2

126 717 165 111 200 217 198 322 345 171 572 817 674 379 438 668

3860 1032 1120 3841 4665 3529

12375

flask 1

Metabolite Name or BinBase ID

41811 41938 42205 47170 48522 49382 53724 54643 87877 88911 89221 97326 97332 100768 100869 100880 100908 101299 102223 102616 102661 102662 102679


369 423 342 335 772 274 305 389 230 301 701 166 298 267 213 630 288

4992 1872 3831 2318 1604

flask 6

112314

453 777 363 437 274 234 165 380 190 157 349 184 246 198 236 389 269

3380 2918 4004 1550 1147

65952

flask 5

445 484 441 269 332 695 566 355 441 301 566 269 445 406 441

5575 3732 1566 1089 6680 2432 1011

flask 4

121126

and Monazite

4

89

346 537 305 276 550 2 186 263 225 318 315 227 227 220 189 455 925 287

2968 2294 6068 1586

80988

flask 3

HPO

2

K

176 348 467 306 143 311 472 272 244 384 135 171 270 223 516 156

6286 3855 1211 4208 3064 1339

81540

flask 2

5

933 943 905 691 784 775 63 961 943

9071 7279 3397 2109 1157 1344 1129 1801 1325 4564 1960 5674

72261

flask 1

379456

457 140 388 906

3068 1757 6886 6744 3190 5765 1169 5128 2182 2425 2108 1409

15145 37408 21369 12359 14508 25451

flask 6

143871

Signal Intensity at 6 Days

68 67 81 55

223 116 360 714 140 218 268 658 361 244 250

3191 5615 1661 4126 1558 1587 1527

28061

flask 5

229 620 506 295 357 270 299 133 128 196 742 106 207 220 184 985 202

2336 3186 3281 1680 1131

56926

flask 4

99 88

583 692 255 457 691 607 644 667

2926 3734 2177 4747 9677 3084 4118 1835 4187 1829 4442

Monazite Only

23634 27508

flask 3

427 838 955 282 254 882

1411 8343 1870 7361 3003 2821 2247 1620 2040 5623 2939 2367

25817 37282 20280 11509 32572

flask 2

698 386 649 237 601 855

3812 1231 1514 2076 8730 4087 1527 8276 2564 1651 8585 2739 2974 5685

47420 25079 32378

flask 1

Metabolite Name or BinBase ID

102711 102714 102715 102716 102727 102728 102729 102730 102731 102732 102733 102734 102735 102740 102741 102746 102747 102749 102776 102784 102790 102791 102793


734 240 318

flask 6

564 272 274

flask 5

726 328 355

flask 4

and Monazite

4

754 238 232

flask 3

HPO

2

K

241 280

1240

flask 2

933

5926 1008

flask 1

5947

7012 3126

1

flask 6

Signal Intensity at 6 Days

518 245 695

flask 5

778 214 164

flask 4

2764 5478 2641

Monazite Only

flask 3

213

5639 1855

flask 2

5563 2052

19058

flask 1

Metabolite Name or BinBase ID

102808 102809 102821


Appendix 3:

Heatmap showing average levels of all detected metabolites during monazite bioleaching


Appendix 3. Heatmap showing average levels of all detected metabolites during monazite bioleaching. Rows represent different conditions and time points. Columns represent different metabolites. Metabolites are ordered based on hierarchical clustering, with the clustering dendrogram displayed at the bottom of the heatmap and metabolite names at the top. Heatmap colors indicate standard deviations below (blue) and above (yellow) the overall mean level for each metabolite. Note: figure covers two pages.
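The color scale described in the caption can be illustrated with a minimal sketch in which each metabolite column is rescaled to standard deviations from its own overall mean. This is only an illustration under stated assumptions: the function name `standardize_columns`, the example values, and the use of the sample standard deviation are hypothetical and are not taken from the analysis that produced the heatmap. The hierarchical clustering used to order the metabolites is not shown.

```python
from statistics import mean, stdev


def standardize_columns(levels):
    """Rescale each column (metabolite) to standard deviations from its mean.

    `levels` is a list of rows (conditions/time points); each row holds one
    averaged level per metabolite. Sample standard deviation is an assumption.
    """
    n_cols = len(levels[0])
    columns = [[row[j] for row in levels] for j in range(n_cols)]
    scaled_columns = []
    for col in columns:
        mu = mean(col)
        sd = stdev(col)
        # A constant column has no spread; report zeros instead of dividing by zero.
        scaled_columns.append([0.0 if sd == 0 else (x - mu) / sd for x in col])
    # Transpose back so that rows are again conditions/time points.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical averaged levels: rows = condition/time point, columns = metabolites.
example_levels = [
    [100.0, 5.0],
    [200.0, 5.0],
    [300.0, 5.0],
]
z = standardize_columns(example_levels)
```

Under this scaling, negative values would map to the blue end of the color scale and positive values to the yellow end.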


Appendix 4:

Novel ANAS Dehalococcoides genes with product predictions beyond "hypothetical protein".


Appendix 4. Novel ANAS Dehalococcoides genes with product predictions beyond "hypothetical protein". a Genes on contigs identified as Dhc by SS but not by TF are highlighted in grey.

Contig Name     JGI IMG Gene Object ID  JGI Predicted Product
ANASMEC_C725    2014734531              Integral membrane protein TIGR01906
ANASMEC_C818    2014734798              Signal transduction histidine kinase
ANASMEC_C818    2014734801              Uncharacterized conserved protein
ANASMEC_C818    2014734803              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C818    2014734804              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C818    2014734805              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C818    2014734808              Adenine-specific DNA methylase
ANASMEC_C818    2014734809              Predicted transcriptional regulators
ANASMEC_C818    2014734832              phage/plasmid primase, P4 family, C-terminal domain
ANASMEC_C818    2014734837              Restriction endonuclease
ANASMEC_C818    2014734838              methionine adenosyltransferase (EC 2.5.1.6)
ANASMEC_C818    2014734839              Predicted transcriptional regulators
ANASMEC_C818    2014734840              DNA modification methylase
ANASMEC_C818    2014734844              Phage terminase-like protein, large subunit
ANASMEC_C818    2014734845              Phage terminase-like protein, large subunit
ANASMEC_C818    2014734846              Phage portal protein, HK97 family
ANASMEC_C818    2014734847              Protease subunit of ATP-dependent Clp proteases
ANASMEC_C818    2014734848              phage major capsid protein, HK97 family
ANASMEC_C818    2014734851              Bacteriophage head-tail adaptor
ANASMEC_C818    2014734852              phage protein, HK97 gp10 family
ANASMEC_C818    2014734858              Phage-related protein
ANASMEC_C818    2014734863              toxin secretion/phage lysis holin
ANASMEC_C818    2014734864              Negative regulator of beta-lactamase expression
ANASMEC_C818    2014734867              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C818    2014734868              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C818    2014734875              Predicted transcriptional regulators
ANASMEC_C4102   2014746223              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C4102   2014746224              Recombinase
ANASMEC_C5086   2014749813              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C5086   2014749814              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C5086   2014749816              Sigma-70, region 4.
ANASMEC_C5086   2014749817              Negative regulator of beta-lactamase expression
ANASMEC_C5086   2014749818              toxin secretion/phage lysis holin
ANASMEC_C5086   2014749826              phage major tail protein, phi13 family
ANASMEC_C5086   2014749828              phage protein, HK97 gp10 family
ANASMEC_C5086   2014749829              phage head-tail adaptor, putative, SPP1 family
ANASMEC_C5086   2014749833              phage major capsid protein, HK97 family
ANASMEC_C5086   2014749834              Protease subunit of ATP-dependent Clp proteases
ANASMEC_C5086   2014749835              phage portal protein, HK97 family
ANASMEC_C5086   2014749836              Phage terminase-like protein, large subunit
ANASMEC_C5086   2014749840              DNA-methyltransferase (dcm)
ANASMEC_C5086   2014749841              DNA modification methylase
ANASMEC_C5086   2014749844              HNH endonuclease
ANASMEC_C5086   2014749846              Superfamily II DNA/RNA helicases, SNF2 family
ANASMEC_C5086   2014749847              VRR-NUC domain.
ANASMEC_C5086   2014749848              Predicted P-loop ATPase and inactivated derivatives
ANASMEC_C5086   2014749852              Uncharacterized phage-encoded protein
ANASMEC_C5086   2014749853              DNA polymerase I - 3'-5' exonuclease and polymerase domains
ANASMEC_C5086   2014749861              Helix-turn-helix.
ANASMEC_C5086   2014749862              Predicted transcriptional regulator
ANASMEC_C5086   2014749863              Superfamily II DNA/RNA helicases, SNF2 family
ANASMEC_C5086   2014749864              Adenine specific DNA methylase Mod
ANASMEC_C5086   2014749865              DNA or RNA helicases of superfamily II
ANASMEC_C5086   2014749866              exonuclease SbcD
ANASMEC_C5086   2014749867              ATPase involved in DNA repair
ANASMEC_C5086   2014749869              ABC-type Mn/Zn transport systems, ATPase component
ANASMEC_C6239   2014753770              Predicted transcriptional regulators
ANASMEC_C6240   2014753777              PAS domain S-box
ANASMEC_C6240   2014753778              Reductive dehalogenase
ANASMEC_C6240   2014753782              Fe-S oxidoreductase
ANASMEC_C6240   2014753783              Transcriptional regulator, MarR family
ANASMEC_C6240   2014753784              PAS domain S-box
ANASMEC_C6240   2014753787              Reductive dehalogenase
ANASMEC_C6240   2014753788              Predicted ATPases of PP-loop superfamily
ANASMEC_C6240   2014753791              ABC-type Fe3+-hydroxamate transport system, periplasmic component
ANASMEC_C6240   2014753792              ABC-type Fe3+-siderophore transport system, permease component
ANASMEC_C6240   2014753793              ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components
ANASMEC_C6240   2014753794              Cobalamin biosynthesis protein CobN and related Mg-chelatases
ANASMEC_C6240   2014753795              hydrogenobyrinic acid a,c-diamide cobaltochelatase (EC 6.6.1.2)
ANASMEC_C6240   2014753796              ABC-type Fe3+-hydroxamate transport system, periplasmic component
ANASMEC_C6240   2014753797              ABC-type Fe3+-siderophore transport system, permease component
ANASMEC_C6240   2014753798              ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components
ANASMEC_C6240   2014753799              Percorrin isomerase
ANASMEC_C6240   2014753800              ABC-type multidrug transport system, ATPase component
ANASMEC_C6240   2014753801              cobalamin biosynthesis protein CbiD
ANASMEC_C6240   2014753802              precorrin-6y C5,15-methyltransferase (decarboxylating), CbiE subunit/precorrin-6Y C5,15-methyltransferase (decarboxylating), CbiT subunit
ANASMEC_C6240   2014753803              precorrin-2 C20-methyltransferase
ANASMEC_C6240   2014753804              precorrin-4 C11-methyltransferase
ANASMEC_C6240   2014753805              precorrin-3B C17-methyltransferase
ANASMEC_C6240   2014753806              ABC-type polysaccharide/polyol phosphate export systems, permease component
ANASMEC_C6240   2014753807              ABC-type Fe3+-hydroxamate transport system, periplasmic component
ANASMEC_C6240   2014753808              Predicted amidohydrolase
ANASMEC_C6240   2014753809              ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components
ANASMEC_C6240   2014753810              ABC-type Fe3+-siderophore transport system, permease component
ANASMEC_C6240   2014753811              ABC-type Fe3+-hydroxamate transport system, periplasmic component
ANASMEC_C6240   2014753812              Mg-chelatase subunit ChlI
ANASMEC_C6240   2014753813              Arylsulfatase regulator (Fe-S oxidoreductase)
ANASMEC_C6240   2014753814              Precorrin isomerase
ANASMEC_C6240   2014753815              Predicted amidohydrolase
ANASMEC_C6240   2014753817              ABC-type metal ion transport system, periplasmic component/surface adhesion
ANASMEC_C6240   2014753819              Putative GTPases (G3E family)
ANASMEC_C6240   2014753821              Fe2+/Zn2+ uptake regulation proteins
ANASMEC_C6240   2014753823              Signal transduction histidine kinase
ANASMEC_C6240   2014753826              ferric uptake regulator, Fur family
ANASMEC_C6240   2014753827              rubrerythrin
ANASMEC_C6240   2014753829              Reductive dehalogenase
ANASMEC_C6240   2014753830              Reductive dehalogenase
ANASMEC_C6240   2014753831              Signal transduction histidine kinase
ANASMEC_C6240   2014753832              Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
ANASMEC_C6240   2014753834              VTC domain
ANASMEC_C6240   2014753837              Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
ANASMEC_C6240   2014753838              Signal transduction histidine kinase
ANASMEC_C6240   2014753846              NUDIX domain
ANASMEC_C6240   2014753847              Predicted phosphoesterase or phosphohydrolase
ANASMEC_C6240   2014753850              ADP-ribosylglycohydrolase
ANASMEC_C6240   2014753851              Uridine kinase
ANASMEC_C6240   2014753852              Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains
ANASMEC_C6240   2014753854              Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain
ANASMEC_C6240   2014753855              Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
ANASMEC_C6240   2014753856              Signal transduction histidine kinase
ANASMEC_C6240   2014753857              Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
ANASMEC_C6240   2014753858              Reductive dehalogenase
ANASMEC_C6240   2014753862              transcriptional regulator/antitoxin, MazE
ANASMEC_C6240   2014753863              transcriptional modulator of MazE/toxin, MazF
ANASMEC_C6240   2014753864              Uncharacterized protein conserved in bacteria
ANASMEC_C6240   2014753870              Uncharacterized conserved protein
ANASMEC_C6240   2014753871              Nucleotidyltransferase/DNA polymerase involved in DNA repair
ANASMEC_C6240   2014753873              DNA polymerase III, alpha subunit
ANASMEC_C6240   2014753874              DNA polymerase III, alpha subunit
ANASMEC_C6240   2014753877              DNA binding domain, excisionase family
ANASMEC_C6240   2014753881              Predicted transcriptional regulators
ANASMEC_C6240   2014753884              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C9125   2014766073              Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
ANASMEC_C9125   2014766076              Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
ANASMEC_C9125   2014766077              PAS domain S-box
ANASMEC_C9125   2014766079              Reductive dehalogenase
ANASMEC_C9125   2014766084              Uncharacterized conserved protein
ANASMEC_C9422   2014767429              Reductive dehalogenase
ANASMEC_C9422   2014767434              FMN-binding domain
ANASMEC_C9422   2014767436              Site-specific recombinases, DNA invertase Pin homologs
ANASMEC_C9422   2014767438              Predicted transcriptional regulators
ANASMEC_C9422   2014767439              Helix-turn-helix
ANASMEC_C9422   2014767445              Preprotein translocase subunit Sec63
ANASMEC_C9422   2014767446              Predicted ATPase involved in replication control, Cdc46/Mcm family
ANASMEC_C9422   2014767450              SpoVT / AbrB like domain
ANASMEC_C9422   2014767462              Subtilisin-like serine proteases
ANASMEC_C9422   2014767467              Site-specific recombinase XerD
ANASMEC_C9422   2014767469              Uncharacterized conserved protein
ANASMEC_C9422   2014767471              MTH538 TIR-like domain (DUF1863).
ANASMEC_C9422   2014767472              MTH538 TIR-like domain (DUF1863).
ANASMEC_C9422   2014767474              MTH538 TIR-like domain (DUF1863).
ANASMEC_C9422   2014767477              DNA polymerase III beta subunit family protein
ANASMEC_C9422   2014767481              Eco57I restriction endonuclease
ANASMEC_C9422   2014767482              Superfamily II DNA and RNA helicases
ANASMEC_C9422   2014767484              Predicted hydrolase of the metallo-beta-lactamase superfamily
ANASMEC_C9422   2014767486              Type I restriction-modification system methyltransferase subunit
ANASMEC_C9422   2014767507              Reductive dehalogenase
ANASMEC_C9422   2014767632              Reductive dehalogenase
ANASMEC_C9422   2014767633              PAS domain S-box
ANASMEC_C9422   2014767635              Site-specific recombinase XerD
ANASMEC_C9422   2014767637              Predicted transcriptional regulator with C-terminal CBS domains
ANASMEC_C10019  2014770383              Site-specific recombinase XerD
ANASMEC_C10019  2014770387              Reductive dehalogenase
ANASMEC_C10029  2014770452              Growth regulator
ANASMEC_C10029  2014770454              Nucleotidyltransferase domain
ANASMEC_C10029  2014770456              Type III restriction enzyme, res subunit
ANASMEC_C10029  2014770458              Adenine specific DNA methylase Mod
ANASMEC_C10029  2014770462              Superfamily II helicase
ANASMCE_C10442  2014772673              ABC-type multidrug transport system, ATPase and permease components
ANASMCE_C10442  2014772674              Putative secretion activating protein
ANASMCE_C10442  2014772675              Deoxyribodipyrimidine photo-lyase type II (EC 4.1.99.3)
ANASMCE_C10442  2014772676              Dihydroorotate dehydrogenase
ANASMCE_C10442  2014772677              GAF domain-containing protein
ANASMCE_C10442  2014772678              diguanylate cyclase (GGDEF) domain
ANASMCE_C10442  2014772679              Putative threonine efflux protein
ANASMCE_C10442  2014772680              Mg2+ and Co2+ transporters
ANASMCE_C10442  2014772681              Permeases of the drug/metabolite transporter (DMT) superfamily
ANASMCE_C10442  2014772682              Predicted integral membrane protein
ANASMCE_C10442  2014772683              Uncharacterized protein conserved in bacteria
ANASMCE_C10442  2014772684              Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes
ANASMCE_C10442  2014772685              Uncharacterized conserved protein
ANASMCE_C10442  2014772686              Glycosyl transferase family 2.
ANASMCE_C10442  2014772687              Outer membrane cobalamin receptor protein
ANASMCE_C10442  2014772688              Outer membrane cobalamin receptor protein
ANASMCE_C10442  2014772689              Sugar phosphate permease
ANASMCE_C10769  2014773969              SOS-response transcriptional repressors (RecA-mediated autopeptidases)
ANASMCE_C10769  2014773980              VRR-NUC domain
ANASMCE_C10769  2014773981              Site-specific DNA methylase
ANASMEC_C10784  2014774099              Transcriptional regulators
ANASMEC_C10784  2014774101              Uncharacterized Fe-S protein
ANASMEC_C10784  2014774103              Transcriptional regulators
ANASMEC_C10784  2014774104              Reductive dehalogenase
ANASMEC_C10784  2014774108              addiction module toxin, RelE/StbE family
FYHO20111_b1    2014778994              dihydrodipicolinate reductase (EC 1.3.1.26)


Appendix 5:

Genes for hydrogenase components identified in the ANAS metagenome contigs.


)

-

)

G

terminal domain terminal domain terminal domain terminal

- - -

, C

JGI Predicted Product Predicted JGI

only (EC:1.6.5.3)

-

reducing hydrogenase, beta subunit hydrogenase, reducing

-

containing hydrogenase containingcomponents hydrogenase 2

-

containing hydrogenase components 1 (EC:1.2.7. components hydrogenase containing

-

reducing hydrogenase subunit hydrogenase reducing subunit A (EC:1.12.1.2) hydrogenase reducing

- -

hydrogenase III subunit large hydrogenase III componenthydrogenase G

non non

hydrogenase I large subunit (EC:1.12.7.2, EC:1.12.99.6 (EC:1.12.7.2, subunit large I hydrogenase

- -

cluster

-

- -

cluster

-

-

S

S

-

-

Iron only hydrogenase large subunit, C only hydrogenase Iron large subunit only hydrogenase Iron large subunit, C only hydrogenase Iron (E) component 4 membrane Hydrogenase Ni,Fe Ni,Fe (E) component 4 membrane Hydrogenase F420 F420 F420 Coenzyme Fe hydrogenases, small(NiFe) (EC:1.12.99.6) subunithydrogenase (hydA) Ni,Fe Fe Fe subunit hydrogenase EC:1.6.5.3)ech A (EC:1.6.99.5,

JGI IMG Gene Object ID 2014756582 2014759808 2014759810 2014737892 2014737897 2014737898 2014737900 2014766382 2014766383 2014767525 2014767574 2014767610 2014767611 2014767618 2014769806 2014774036

Contig Name ANASMEC_C7062 ANASMEC_C7752 ANASMEC_C7752 ANASMEC_C1689 ANASMEC_C1691 ANASMEC_C1691 ANASMEC_C1691 ANASMEC_C9125 ANASMEC_C9125 ANASMEC_C9422 ANASMEC_C9422 ANASMEC_C9422 ANASMEC_C9422 ANASMEC_C9422 ANASMEC_C9983 ANASMEC_C10782

Appendix 5. Genes for hydrogenase components identified in the ANAS metagenome contigs.

Contig Taxa (TF Class) Clostridiaceae, Clostridiaceae, Clostridiaceae, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides


type cytochrome subunit type cytochrome

-

JGI Predicted Product Predicted JGI

reducing hydrogenase, delta subunit hydrogenase, reducing delta subunit hydrogenase, reducing

- -

ase I large subunit

4 membrane component (E) component 4 membrane

containing hydrogenase containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1

- - - -

enase subunitenase A

hydrogenase, b hydrogenase,

hydrogenase I smallhydrogenase subunit I smallhydrogenase subunit hydrogen I largehydrogenase subunit

-

- - - -

cluster cluster cluster cluster

- - - -

S S S S

- - - -

ech hydrogenase subunit hydrogenase ech B subunit hydrogenase ech C subunit hydrogenase ech E subunit hydrogenase ech E Hydrogenase F420 Coenzyme F420 Coenzyme Fe Ni,Fe Ni,Fe Ni,Fe Ni,Fe Ni/Fe Fe Fe Fe hydrog ech subunit hydrogenase ech B (EC:1.6.5.3)

JGI IMG Gene Object ID 2014774037 2014774039 2014774044 2014774045 2014734315 2014735861 2014735882 2014737255 2014746021 2014746022 2014746023 2014746024 2014746025 2014747692 2014747699 2014748557 2014750566 2014750568

Contig Name ANASMEC_C10782 ANASMEC_C10782 ANASMEC_C10782 ANASMEC_C10782 ANASMEC_C650 ANASMEC_C1084 ANASMEC_C1087 ANASMEC_C1513 ANASMEC_C4055 ANASMEC_C4055 ANASMEC_C4055 ANASMEC_C4056 ANASMEC_C4056 ANASMEC_C4501 ANASMEC_C4501 ANASMEC_C4688 ANASMEC_C5298 ANASMEC_C5298

Contig Taxa (TF Class) Dehalococcoides, Dehalococcoides, Dehalococcoides, Dehalococcoides, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio


EC:1.6.5.3)

7.2)

JGI Predicted Product Predicted JGI

only (EC:1.12.

-

containing hydrogenase containingcomponents hydrogenase 2 containingcomponents hydrogenase 1

- -

bound hydrogenase subunit ehaP bound hydrogenase subunit ehaO bound hydrogenase subunit ehaO bound hydrogenase subunit ehaN (EC:1.6.5.3) bound hydrogenase subunit ehaJ (EC:1.6.5.3) bound hydrogenase subunit ehaH

------

ne

hydrogenase III smallhydrogenase subunit III smallhydrogenase subunit III subunit large hydrogenase

- - -

cluster cluster

- -

S S

- -

ech hydrogenase subunit hydrogenase ech C (EC:1.6.5.3) subunit hydrogenase EC:1.6.5.3)ech E (EC:1.6.5.3, subunit hydrogenase ech F (EC:1.6.5.3) bound hydrogenase subunit mbhM ( Membrane bound hydrogenase subunit mbhM Membrane Ni,Fe subunit hydrogenase ech E (EC:1.6.5.3) Fe Fe Fe hydrogenases, Ni,Fe Ni,Fe membrane membrane membrane membrane membra membrane

JGI IMG Gene Object ID 2014750569 2014750571 2014750572 2014752890 2014752891 2014752892 2014752895 2014752898 2014759038 2014763480 2014770923 2014770925 2014745729 2014745730 2014745731 2014745732 2014745736 2014745737

Contig Name ANASMEC_C5298 ANASMEC_C5299 ANASMEC_C5299 ANASMEC_C5961 ANASMEC_C5961 ANASMEC_C5961 ANASMEC_C5961 ANASMEC_C5962 ANASMEC_C7530 ANASMEC_C8446 ANASMEC_C10097 ANASMEC_C10097 ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C3966

Contig Taxa (TF Class) Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Desulfovibrio, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium


drogenase, delta subunit (EC 1.12.98.1)

drogenase subunit ehbD

JGI Predicted Product Predicted JGI

reducing hydrogenase, beta subunit hydrogenase, reducing

-

reducing hydrogenase, alpha subunit(EC 1.12.98.1) reducing hy reducing hydrogenase, gamma subunit (EC 1.12.98.1) reducing hydrogenase, beta subunit (EC 1.12.98.1)

- - - -

bound hydrogenase subunit ehaH bound hydrogenase subunit ehaG bound hydrogenase subunit ehaF bound hydrogenase subunit ehaE bound hydrogenase subunit ehaC bound hydrogenase subunit ehaB bound hydrogenase subunit ehbA bound hy bound hydrogenase subunit ehbQ

reducing hydrogenase subunit D hydrogenase reducing subunit D hydrogenase reducing subunit A hydrogenase reducing subunit G (EC:1.12.98.1) hydrogenase reducing

------

- - - -

non non non non

- - - -

membrane membrane membrane membrane membrane membrane F420 coenzyme F420 (EC:1.12.98.1) coenzyme F420 coenzyme F420 (EC:1.12.98.1) coenzyme F420 (EC:1.12.98.1) membrane membrane F420 F420 F420 membrane F420 Coenzyme

JGI IMG JGI

2014745738 2014745739 2014745740 2014745741 2014745743 2014745744 2014746595 2014748304 2014748305 2014748306 2014748307 2014748427 2014748430 2014751448 2014752722 2014752723 2014756258 2014768131

Gene Object ID Gene

ASMEC_C3966

Contig Name ANASMEC_C3966 ANASMEC_C3966 AN ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C3966 ANASMEC_C4198 ANASMEC_C4653 ANASMEC_C4653 ANASMEC_C4653 ANASMEC_C4653 ANASMEC_C4655 ANASMEC_C4655 ANASMEC_C5543 ANASMEC_C5907 ANASMEC_C5907 ANASMEC_C7038 ANASMEC_C9544

terium

cterium

Taxa Taxa Contig(TF Class) Methanoba Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobacterium Methanobac Methanobacterium Methanobacterium


l domain

1.6.5.3)

terminal domain terminal termina

- -

omponents 2 (EC:1.2.1.2) omponents

JGI Predicted Product Predicted JGI

only (EC:1.6.5.3)

-

reducing hydrogenase, beta subunit hydrogenase, reducing

-

reducinghydrogenase, gamma subunit (EC:1.12.98.1)

-

containing hydrogenase containingcomponents hydrogenase 2

-

containing hydrogenase c hydrogenase containing 2 (EC:1.2.1.2) components hydrogenase containing

bound hydrogenase subunit ehbK bound hydrogenase subunit ehbL bound hydrogenase subunit ehbN (EC:1.6.5.3) bound hydrogenase subunit ehbO (EC:1.6.5.3) bound hydrogenase subunit ehbP

- -

reducing hydrogenase subunit A hydrogenase reducing

- - - - -

-

hydrogenase III smallhydrogenase subunit (EC: III subunit large hydrogenase III smallhydrogenase subunit

non

- - -

cluster

-

cluster cluster

-

- -

S

S S

-

- -

Coenzyme F420 Coenzyme Coenzyme F420 F420 Fe Fe Fe membrane membrane Ni,Fe membrane membrane membrane large subunit, C only hydrogenase Iron large subunit, C only hydrogenase Iron Fe hydrogenases, (E) component 4 membrane Hydrogenase Ni,Fe Ni,Fe

JGI IMG Gene Object ID 2014768132 2014768166 2014768167 2014771049 2014771073 2014771074 2014772216 2014772217 2014772218 2014772219 2014772220 2014772221 2014732339 2014732343 2014744373 2014758594 2014758596 2014758597

Contig Name ANASMEC_C9544 ANASMEC_C9545 ANASMEC_C9545 ANASMEC_C10127 ANASMEC_C10129 ANASMEC_C10129 ANASMEC_C10352 ANASMEC_C10352 ANASMEC_C10352 ANASMEC_C10352 ANASMEC_C10352 ANASMEC_C10352 ANASMEC_C77 ANASMEC_C77 ANASMEC_C3551 ANASMEC_C7420 ANASMEC_C7420 ANASMEC_C7420

Contig Taxa (TF Class) Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Methanobacterium, Spirochaetes, Spirochaetes, Spirochaetes, Spirochaetes, Spirochaetes, Spirochaetes


s 1

terminal domain terminal domain terminal

- -

terminaldomain (EC:1.12.7.2)

-

III component G (EC:1.6.5.3)

JGI Predicted Product Predicted JGI

only (EC:1.6.5.3) only (EC:1.6.5.3)

- -

reducing hydrogenase, subunitalpha hydrogenase, reducing delta subunit hydrogenase, reducing subunitalpha hydrogenase, reducing

- - -

reducinghydrogenase, gamma subunit (EC:1.12.2.1)

-

rogenase large subunit, C rogenase

containing hydrogenase containingcomponents hydrogenase 2 containingcomponent hydrogenase

- -

hydrogenase III subunit large hydrogenase

- -

cluster cluster

- -

S S

- -

hydrogenases, Fe hydrogenases, only hyd Iron Irononlyhydrogenase large subunit, C Fe hydrogenases, bound hydrogenase subunit mbhH (EC:1.6.5.3)Membrane bound hydrogenase subunit mbhJ Membrane Ni,Fe bound hydrogenase subunit mbhL (EC:1.6.5.3) Membrane Ni,Fe large subunit, C only hydrogenase Iron bound hydrogenase subunit mbhD Membrane bound hydrogenase subunit mbhE Membrane Fe Coenzyme F420 F420 Coenzyme F420 Coenzyme Fe F420 Coenzyme

JGI IMG Gene Object ID 2014761181 2014764872 2014769527 2014770160 2014745306 2014745308 2014745309 2014745310 2014745311 2014751345 2014753093 2014753094 2014753408 2014753733 2014753734 2014740617 2014762786 2014753264

Contig Name ANASMEC_C7814 ANASMEC_C8778 ANASMEC_C9969 ANASMEC_C10002 ANASMEC_C3841 ANASMEC_C3841 ANASMEC_C3841 ANASMEC_C3841 ANASMEC_C3841 ANASMEC_C5524 ANASMEC_C6017 ANASMEC_C6017 ANASMEC_C6125 ANASMEC_C6229 ANASMEC_C6229 ANASMEC_C2388 ANASMEC_C8232 ANASMEC_C6076

Contig Taxa (TF Class) Spirochaetes, Spirochaetes, Spirochaetes, Spirochaetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, Synergistetes, unknown Delta-proteobacterium, unknown Delta-proteobacterium, unidentified Class 6


nit

/dehydrogenase, beta N terminus.subunit

JGI Predicted Product Predicted JGI

reducing hydrogenase, gamma subunit hydrogenase, reducing gamma subunit hydrogenase, reducing delta subu hydrogenase, reducing delta subunit hydrogenase, reducing subunitalpha hydrogenase, reducing delta subunit hydrogenase, reducing delta subunit hydrogenase, reducing subunitalpha hydrogenase, reducing delta subunit hydrogenase, reducing

------

reducinghydrogenase, gamma subunit (EC:1.12.98.1)

-

containing hydrogenase containingcomponents hydrogenase 1 containingcomponents hydrogenase 1

- -

containing hydrogenase components 2 (EC:1.2.1.2) components hydrogenase containing 1 (EC:1.2.1.2) components hydrogenase containing

- -

hydrogenase III smallhydrogenase subunit III subunit large hydrogenase

- -

cluster cluster

cluster cluster

- -

- -

rogenase small(NiFe) subunitrogenase (hydA)

S S

S S

- -

- -

Coenzyme F420 Coenzyme F420 Coenzyme F420 Coenzyme F420 Coenzyme Fe F420 Coenzyme Coenzyme F420 Coenzyme F420 hydrogenase F420 Coenzyme F420 Coenzyme Fe F420 Coenzyme hyd Fe F420 Coenzyme Ni,Fe Ni,Fe Fe

JGI IMG Gene Object ID 2014753265 2014753266 2014753267 2014753268 2014753271 2014759312 2014759313 2014763577 2014763578 2014763580 2014763583 2014764214 2014741466 2014741467 2014758628 2014759102 2014759103 2014761270

Contig Name ANASMEC_C6076 ANASMEC_C6076 ANASMEC_C6076 ANASMEC_C6076 ANASMEC_C6076 ANASMEC_C7617 ANASMEC_C7617 ANASMEC_C8475 ANASMEC_C8475 ANASMEC_C8475 ANASMEC_C8475 ANASMEC_C8652 ANASMEC_C2675 ANASMEC_C2675 ANASMEC_C7432 ANASMEC_C7550 ANASMEC_C7550 ANASMEC_C7820

Contig Taxa (TF Class) unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 6, unidentified Class 9, unidentified Class 9, unidentified Class 9, unidentified Class 9, unidentified Class 9, unidentified Class 9


5.3)

terminal domain terminal

-

t

JGI Predicted Product Predicted JGI

hydrogenase subunit mbhJ

only (EC:1.12.7.2, EC:1.6.

-

reducing hydrogenase, delta subunit hydrogenase, reducing beta subunit hydrogenase, reducing

- -

reducinghydrogenase, alpha subunit (EC:1.12.1.2)

20

-

containing hydrogenase containingcomponents hydrogenase 1 containingcomponents hydrogenase 2 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1

------

hydrogenase III smallhydrogenase subunit III subunit large hydrogenase I smallhydrogenase subuni

- - -

cluster cluster cluster cluster cluster cluster cluster cluster

hydrogenase I large subunit (EC:1.12.99.6, EC:1.12.5.1,

------

-

S S S S S S S S

------

Fe Fe F4 Coenzyme Fe Fe Fe large subunit, C only hydrogenase Iron bound Membrane Coenzyme F420 Fe Ni,Fe EC:1.12.5.1) Fe hydrogenases, Fe Ni,Fe Ni,Fe F420 Coenzyme Fe Ni,Fe

JGI IMG Gene Object ID 2014761457 2014763265 2014732173 2014732382 2014733276 2014733911 2014733912 2014734234 2014735094 2014736156 2014736461 2014736703 2014737258 2014737894 2014737895 2014738154 2014738247 2014738467

Contig Name ANASMEC_C7870 ANASMEC_C8392 ANASMEC_C25 ANASMEC_C86 ANASMEC_C341 ANASMEC_C527 ANASMEC_C528 ANASMEC_C627 ANASMEC_C883 ANASMEC_C1189 ANASMEC_C1283 ANASMEC_C1362 ANASMEC_C1514 ANASMEC_C1690 ANASMEC_C1690 ANASMEC_C1735 ANASMEC_C1775 ANASMEC_C1853

Contig Taxa (TF Class) unclassified, unclassified


terminal domain terminal domain terminal domain terminal

- - -

arginine translocation)

-

subunit mbhC

JGI Predicted Product Predicted JGI

only

-

reducing hydrogenase, subunitalpha hydrogenase, reducing subunitalpha hydrogenase, reducing

- -

containing hydrogenase containingcomponents hydrogenase 1 containingcomponents hydrogenase 2 containingcomponents hydrogenase 1

- - -

reducing hydrogenase subunit A hydrogenase reducing

-

genase small subunit./TAT (twin

hydrogenase I largehydrogenase subunit III smallhydrogenase subunit

non

- -

cluster cluster cluster

-

- - -

S S S

drogenase 4 membrane component (E) component 4 membrane drogenase

- - -

Ni,Fe Fe Fe F420 bound hydrogenase Membrane bound hydrogenase subunit mbhD Membrane bound hydrogenase subunit mbhE Membrane F420 Coenzyme subunit hydrogenase ech E large subunit, C only hydrogenase Iron Fe hydrogenases, Ironhydro pathway signal sequence. (EC:1.12.7.2) large subunit, C only hydrogenase Iron large subunit, C only hydrogenase Iron Fe Hy F420 Coenzyme Ni,Fe

JGI IMG Gene Object ID 2014738468 2014738794 2014739384 2014740019 2014740750 2014740751 2014740752 2014741247 2014742389 2014743057 2014743150 2014744681 2014744682 2014744683 2014745388 2014745528 2014745841 2014746481

Contig Name ANASMEC_C1853 ANASMEC_C1941 ANASMEC_C2075 ANASMEC_C2214 ANASMEC_C2429 ANASMEC_C2429 ANASMEC_C2429 ANASMEC_C2599 ANASMEC_C2945 ANASMEC_C3132 ANASMEC_C3166 ANASMEC_C3625 ANASMEC_C3625 ANASMEC_C3625 ANASMEC_C3861 ANASMEC_C3914 ANASMEC_C4001 ANASMEC_C4172

Contig Taxa (TF Class)


terminal domain terminal

-

ydrogenase components ydrogenase 2

hydrogenase subunit D hydrogenase

JGI Predicted Product Predicted JGI

reducing hydrogenase, delta subunit hydrogenase, reducing gamma subunit hydrogenase, reducing

- -

containing hydrogenase containingcomponents hydrogenase 1 containing h containingcomponents hydrogenase 1

- - -

containing hydrogenase components 1 (EC:1.2.1.2) components hydrogenase containing

bound hydrogenase subunit ehbH

-

reducing hydrogenase subunit A hydrogenase reducing reducing

-

- -

hydrogenase I largehydrogenase subunit III componenthydrogenase G (EC:1.6.5.3)

rane

non non

- -

cluster cluster cluster

- -

cluster

- - -

-

S S S

S

- - -

-

Iron only hydrogenase large subunit, C only hydrogenase Iron Ni,Fe memb F420 Coenzyme Fe F420 F420 Coenzyme F420 bound hydrogenase subunit mbhF Membrane bound hydrogenase subunit mbhD Membrane bound hydrogenase subunit mbhC Membrane bound hydrogenase subunit mbhB Membrane Fe Fe Fe (EC:1.2.1.2) bound hydrogenase subunit mbhJ Membrane Ni,Fe bound hydrogenase subunit mbhL Membrane

JGI IMG Gene Object ID 2014746664 2014748921 2014749641 2014749988 2014750888 2014751039 2014751040 2014751041 2014751251 2014751253 2014751254 2014751255 2014752009 2014752392 2014752737 2014753100 2014753102 2014753103

Contig Name ANASMEC_C4206 ANASMEC_C4809 ANASMEC_C5040 ANASMEC_C5109 ANASMEC_C5384 ANASMEC_C5418 ANASMEC_C5418 ANASMEC_C5418 ANASMEC_C5488 ANASMEC_C5488 ANASMEC_C5488 ANASMEC_C5488 ANASMEC_C5706 ANASMEC_C5845 ANASMEC_C5913 ANASMEC_C6018 ANASMEC_C6018 ANASMEC_C6018

Contig Taxa (TF Class)


)

-

.98.1)

t (hydA) (EC:1.12.7.2, EC:1.12.99.6)

JGI Predicted Product Predicted JGI

ning hydrogenase components ning hydrogenase 1

reducing hydrogenase, subunitalpha hydrogenase, reducing

-

reducinghydrogenase, gamma subunit (EC:1.12.2.1) reducinghydrogenase, alpha subunit (EC:1.12.1.2)

reducing hydrogenase, beta subunit (EC 1.12.98.1) reducing hydrogenase, gamma subunit (EC 1.12 reducing hydrogenase, alpha subunit(EC 1.12.98.1) reducing hydrogenase, alpha subunit(EC 1.12.98.1)

- -

- - - -

contai containingcomponents hydrogenase 1 containingcomponents hydrogenase 2 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1

- - - - -

containing hydrogenase components 1 (EC:1.2.7. components hydrogenase containing

-

hydrogenase III subunit large hydrogenase I largehydrogenase subunit I largehydrogenase subunit

- - -

cluster cluster cluster cluster cluster

cluster

- - - - -

-

S S S S S

S

- - - - -

-

Fe Coenzyme F420 Coenzyme F420 Fe Fe Fe Fe Ni,Fe coenzyme F420 coenzyme F420 coenzyme F420 coenzyme F420 Fe subunit hydrogenase ech B hydrogenase (NiFe) small subuni Ni,Fe Ni,Fe F420 Coenzyme

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_C6459                    2014754483
                              ANASMEC_C6917                    2014755821
                              ANASMEC_C6917                    2014755822
                              ANASMEC_C6950                    2014755955
                              ANASMEC_C7056                    2014756330
                              ANASMEC_C7311                    2014757935
                              ANASMEC_C7505                    2014758973
                              ANASMEC_C7681                    2014759518
                              ANASMEC_C7849                    2014761350
                              ANASMEC_C7849                    2014761351
                              ANASMEC_C7850                    2014761352
                              ANASMEC_C7850                    2014761353
                              ANASMEC_C8085                    2014762362
                              ANASMEC_C8241                    2014762816
                              ANASMEC_C8406                    2014763313
                              ANASMEC_C8406                    2014763315
                              ANASMEC_C8406                    2014763316
                              ANASMEC_C8505                    2014763706


JGI Predicted Product Predicted JGI

reducing hydrogenase, gamma subunit hydrogenase, reducing

-

reducinghydrogenase, alpha subunit (EC:1.12.1.2)

-

bound hydrogenase subunit ehaN bound hydrogenase subunit ehaO (EC:1.6.5.3)

se (E) component 4 membrane

reducing hydrogenase subunit D hydrogenase reducing

- -

-

ydrogenase III smallydrogenase subunit

hydrogenase III subunit large hydrogenase III subunit large hydrogenase h I smallhydrogenase subunit

non

- - - -

-

brane

Coenzyme F420 Coenzyme Hydrogena subunit hydrogenase ech E (EC:1.6.5.3) subunit hydrogenase ech D subunit hydrogenase ech C subunit hydrogenase ech B small subunit. hydrogenase Iron Ni,Fe Ni,Fe Ni,Fe Coenzyme F420 bound hydrogenase subunit mbhL Membrane bound hydrogenase subunit mbhM Membrane bound hydrogenase subunit mbhN Membrane F420 mem membrane Ni,Fe

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_C8651                    2014764213
                              ANASMEC_C8694                    2014764376
                              ANASMEC_C8895                    2014765406
                              ANASMEC_C8895                    2014765407
                              ANASMEC_C8895                    2014765408
                              ANASMEC_C8895                    2014765409
                              ANASMEC_C8944                    2014765524
                              ANASMEC_C9000                    2014765702
                              ANASMEC_C9000                    2014765703
                              ANASMEC_C9000                    2014765704
                              ANASMEC_C9032                    2014765793
                              ANASMEC_C9045                    2014765834
                              ANASMEC_C9045                    2014765835
                              ANASMEC_C9045                    2014765837
                              ANASMEC_C9478                    2014767808
                              ANASMEC_C9903                    2014769256
                              ANASMEC_C9903                    2014769257
                              ANASMEC_C9936                    2014769374


omain

l domain

terminal domain terminal domain terminal domain terminal domain terminal d terminal domain terminal termina

------

JGI Predicted Product Predicted JGI

reducing hydrogenase, gamma subunit hydrogenase, reducing subunitalpha hydrogenase, reducing subunitalpha hydrogenase, reducing

- - -

containing hydrogenase containingcomponents hydrogenase 2 containingcomponents hydrogenase 2 containingcomponents hydrogenase 1

- - -

uster

hydrogenase I largehydrogenase subunit (EC:1.12.99.6) I largehydrogenase subunit I largehydrogenase subunit I largehydrogenase subunit I smallhydrogenase subunit

- - - - -

cl cluster cluster

- - -

S S S

- - -

Ni,Fe Ni,Fe Fe Fe large subunit, C only hydrogenase Iron large subunit, C only hydrogenase Iron large subunit, C only hydrogenase Iron large subunit, C only hydrogenase Iron Ni,Fe Ni,Fe Ni,Fe F420 Coenzyme large subunit, C only hydrogenase Iron Fe large subunit, C only hydrogenase Iron F420 Coenzyme F420 Coenzyme large subunit, C only hydrogenase Iron

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_C9936                    2014769375
                              ANASMEC_C10452                   2014772737
                              ANASMEC_C10611                   2014773281
                              ANASMEC_C10649                   2014773484
                              ANASMEC_C10704                   2014773633
                              ANASMEC_C10743                   2014773740
                              ANASMEC_FYHO2833_b1              2014774777
                              ANASMEC_FYHO4035_g1              2014775047
                              ANASMEC_FYHO6120_g1              2014775687
                              ANASMEC_FYHO9267_b1              2014776383
                              ANASMEC_FYHO9267_g1              2014776384
                              ANASMEC_FYHO10722_g1             2014776620
                              ANASMEC_FYHO11347_g1             2014776768
                              ANASMEC_FYHO16482_g1             2014778033
                              ANASMEC_FYHO16995_b1             2014778151
                              ANASMEC_FYHO18538_g1             2014778482
                              ANASMEC_FYHO20884_b1             2014779184
                              ANASMEC_FYHO21096_g1             2014779243


bunit (EC:1.12.2.1,

terminal domain terminal domain terminal domain terminal

- - -

nase nase components 1

ll subunit

JGI Predicted Product Predicted JGI

reducing hydrogenase, beta subunit hydrogenase, reducing gamma subunit hydrogenase, reducing subunitalpha hydrogenase, reducing subunitalpha hydrogenase, reducing delta subunit hydrogenase, reducing

- - - - -

reducinghydrogenase, gamma su

-

containing hydrogenase containingcomponents hydrogenase 2 containingcomponents hydrogenase 1 containing hydroge containingcomponents hydrogenase 1

- - - -

reducing hydrogenase subunit A hydrogenase reducing

-

hydrogenase I sma hydrogenase I largehydrogenase subunit III smallhydrogenase subunit

non

- - -

cluster cluster cluster cluster

-

- - - -

S S S S

- - - -

Coenzyme F420 Coenzyme Fe Ni,Fe Ni,Fe Fe subunit hydrogenase ech C F420 Coenzyme large subunit, C only hydrogenase Iron Fe large subunit, C only hydrogenase Iron F420 Coenzyme F420 Fe Coenzyme F420 EC:1.12.1.2) F420 Coenzyme Ni,Fe large subunit, C only hydrogenase Iron F420 Coenzyme

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_FYHO21118_b1             2014779248
                              ANASMEC_FYHO24152_g1             2014779943
                              ANASMEC_FYHO28140_g1             2014780903
                              ANASMEC_FYHO29406_g1             2014781259
                              ANASMEC_FYHO29496_g1             2014781278
                              ANASMEC_FYHO29612_b1             2014781310
                              ANASMEC_FYHO40291_g1             2014782820
                              ANASMEC_FYHO40769_b1             2014782841
                              ANASMEC_FYHO42060_b1             2014783150
                              ANASMEC_FYHO42412_g1             2014783268
                              ANASMEC_FYHO42621_g1             2014783309
                              ANASMEC_FYHO52982_g1             2014785350
                              ANASMEC_FYHO56076_g1             2014786065
                              ANASMEC_FYHO57762_b1             2014786359
                              ANASMEC_FYHO57762_b1             2014786360
                              ANASMEC_FYHO60390_g1             2014787077
                              ANASMEC_FYHO60505_g1             2014787100
                              ANASMEC_FYHO62192_b1             2014787482


terminal domain terminal domain terminal domain terminal domain terminal domain terminal

- - - - -

drogenase, delta subunitdrogenase,

large subunit, C

JGI Predicted Product Predicted JGI

reducing hy reducing delta subunit hydrogenase, reducing

- -

reducing hydrogenase, alpha subunit(EC 1.12.98.1)

-

containing hydrogenase containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1 containingcomponents hydrogenase 1

- - - -

bound hydrogenase subunit ehaE bound hydrogenase subunit ehaF bound hydrogenase subunit ehaJ

reducing hydrogenase subunit D hydrogenase reducing

- - -

-

non

cluster cluster cluster cluster

-

- - - -

S S S S

- - - -

Iron only hydrogenase large subunit, C only hydrogenase Iron bound hydrogenase subunit mbhH Membrane F420 Coenzyme F420 Coenzyme large subunit, C only hydrogenase Iron Fe large subunit, C only hydrogenase Iron only hydrogenase Iron large subunit, C only hydrogenase Iron Fe membrane membrane membrane subunit hydrogenase ech A Fe coenzyme F420 F420 Fe

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_FYHO63046_g1             2014787675
                              ANASMEC_FYHO65027_g1             2014788107
                              ANASMEC_FYHO67404_b1             2014788638
                              ANASMEC_FYHO68348_b1             2014788846
                              ANASMEC_FYHO69689_b1             2014789187
                              ANASMEC_FYHO70452_g1             2014789360
                              ANASMEC_FYHO70995_g1             2014789526
                              ANASMEC_FYHO72306_b1             2014789825
                              ANASMEC_FYHO74637_b1             2014790391
                              ANASMEC_FYHO74637_b1             2014790392
                              ANASMEC_FYHO74651_b1             2014790394
                              ANASMEC_FYHO74651_b1             2014790395
                              ANASMEC_FYHO74651_g1             2014790397
                              ANASMEC_FYHO77655_g1             2014791062
                              ANASMEC_FYHO82056_g1             2014792104
                              ANASMEC_FYHO82556_g1             2014792247
                              ANASMEC_FYHO82696_g1             2014792282
                              ANASMEC_GCAX03491_c1_1000_100_1  2014793123


terminal domain terminal

-

small subunit

JGI Predicted Product Predicted JGI

hydrogenase III hydrogenase

-

Iron only hydrogenase large subunit, C only hydrogenase Iron subunit hydrogenase ech A Ni,Fe

Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID
                              ANASMEC_GCAX09804_c1_1000_100_1  2014793925
                              ANASMEC_GCAX18540_c1_1000_100_1  2014795068
                              ANASMEC_GCAX20496_c1_1000_100_2  2014795140


Appendix 6:

Cobalamin biosynthesis genes identified in the ANAS metagenome contigs.


Taxa (TF Contig Class)        Contig Name                      JGI IMG Gene Object ID  Gene Name
Clostridiaceae                ANASMEC_C2155                    2014739789              cbiC
Clostridiaceae                ANASMEC_C2155                    2014739790              cbiC
Clostridiaceae                ANASMEC_C2155                    2014739792              cobA
Clostridiaceae                ANASMEC_C2155                    2014739793              cbiP
Clostridiaceae                ANASMEC_C2155                    2014739795              cbiB
Clostridiaceae                ANASMEC_C2155                    2014739797              cobU
Clostridiaceae                ANASMEC_C2155                    2014739798              cobS
Clostridiaceae                ANASMEC_C2155                    2014739799              cobU
Clostridiaceae                ANASMEC_C2155                    2014739800              cobT
Clostridiaceae                ANASMEC_C2155                    2014739801              cbiA
Clostridiaceae                ANASMEC_C2155                    2014739802              cbiJ/E/T
Clostridiaceae                ANASMEC_C2155                    2014739803              cbiH
Clostridiaceae                ANASMEC_C2155                    2014739804              cbiG
Clostridiaceae                ANASMEC_C2155                    2014739805              cbiF
Clostridiaceae                ANASMEC_C2155                    2014739806              cbiL
Clostridiaceae                ANASMEC_C2155                    2014739807              cbiD
Clostridiaceae                ANASMEC_C2155                    2014739808              cbiK
Dehalococcoides               ANASMEC_C2180                    2014739915              cbiP
Dehalococcoides               ANASMEC_C2636                    2014741361              cobA
Dehalococcoides               ANASMEC_C6240                    2014753794              cobN
Dehalococcoides               ANASMEC_C6240                    2014753795              cobN
Dehalococcoides               ANASMEC_C6240                    2014753799              cbiX/C
Dehalococcoides               ANASMEC_C6240                    2014753801              cbiD
Dehalococcoides               ANASMEC_C6240                    2014753802              cbiE/T
Dehalococcoides               ANASMEC_C6240                    2014753803              cbiL
Dehalococcoides               ANASMEC_C6240                    2014753804              cbiF
Dehalococcoides               ANASMEC_C6240                    2014753805              cbiG/H
Dehalococcoides               ANASMEC_C6240                    2014753814              cbiC
Dehalococcoides               ANASMEC_C6240                    2014753861              cbiE
Dehalococcoides               ANASMEC_C9125                    2014766420              cbiB
Dehalococcoides               ANASMEC_C9125                    2014766421              cbiB
Dehalococcoides               ANASMEC_C9125                    2014766424              cobT
Dehalococcoides               ANASMEC_C9125                    2014766425              cobS
Dehalococcoides               ANASMEC_C9125                    2014766426              cobC
Dehalococcoides               ANASMEC_C9125                    2014766427              cobU
Dehalococcoides               ANASMEC_C9422                    2014767493              cbiE
Dehalococcoides               ANASMEC_C9422                    2014767497              cbiB
Dehalococcoides               ANASMEC_C9422                    2014767498              cobA
Dehalococcoides               ANASMEC_C9422                    2014767503              cbiE
Dehalococcoides               ANASMEC_C9422                    2014767593              cbiA
Dehalococcoides               ANASMEC_C9983                    2014769811              cbiB
Dehalococcoides               ANASMEC_C9983                    2014769812              cobA
Desulfovibrio                 ANASMEC_C10219                   2014771567              cbiP
Desulfovibrio                 ANASMEC_C10280                   2014771809              cbiH
Desulfovibrio                 ANASMEC_C10425                   2014772542              cbiK
Desulfovibrio                 ANASMEC_C1452                    2014737023              cobT
Desulfovibrio                 ANASMEC_C2682                    2014741500              cbiK
Desulfovibrio                 ANASMEC_C5098                    2014749951              cbiL
Desulfovibrio                 ANASMEC_C5922                    2014752770              cbiA
Desulfovibrio                 ANASMEC_C6044                    2014753178              cbiB
Desulfovibrio                 ANASMEC_C614                     2014734182              cbiF
Desulfovibrio                 ANASMEC_C614                     2014734183              cbiF
Desulfovibrio                 ANASMEC_C615                     2014734184              cbiE
Desulfovibrio                 ANASMEC_C615                     2014734185              cbiD
Desulfovibrio                 ANASMEC_C7938                    2014761871              cbiG
Desulfovibrio                 ANASMEC_C8971                    2014765623              cbiA
Methanobacteriaceae           ANASMEC_C10286                   2014771933              cbiT
Methanobacteriaceae           ANASMEC_C10466                   2014772804              cbiF
Methanobacteriaceae           ANASMEC_C10466                   2014772805              cbiF
Methanobacteriaceae           ANASMEC_C124                     2014732527              cbiJ
Methanobacteriaceae           ANASMEC_C126                     2014732536              cbiP
Methanobacteriaceae           ANASMEC_C1527                    2014737393              cbiE
Methanobacteriaceae           ANASMEC_C4198                    2014746601              cobU
Methanobacteriaceae           ANASMEC_C4638                    2014748160              cbiC
Methanobacteriaceae           ANASMEC_C4638                    2014748181              cobN
Methanobacteriaceae           ANASMEC_C4638                    2014748182              cobN
Methanobacteriaceae           ANASMEC_C4638                    2014748184              cobN
Methanobacteriaceae           ANASMEC_C4640                    2014748204              cobN
Methanobacteriaceae           ANASMEC_C4641                    2014748207              cobN
Methanobacteriaceae           ANASMEC_C4654                    2014748350              cbiE
Methanobacteriaceae           ANASMEC_C5670                    2014751903              cobN
Methanobacteriaceae           ANASMEC_C5670                    2014751904              cobN
Methanobacteriaceae           ANASMEC_C5670                    2014751905              cobN
Methanobacteriaceae           ANASMEC_C5670                    2014751906              cobN
Methanobacteriaceae           ANASMEC_C7444                    2014758745              cbiD
Methanobacteriaceae           ANASMEC_C7444                    2014758746              cbiD
Methanobacteriaceae           ANASMEC_C9971                    2014769572              cbiL
Methanobacteriaceae           ANASMEC_C9972                    2014769627              cbiH
Methanobacteriaceae           ANASMEC_C9972                    2014769634              cbiG
Methanobacteriaceae           ANASMEC_C9972                    2014769635              cbiG
Methanobacteriaceae           ANASMEC_C9972                    2014769636              cbiB
Methanobacteriaceae           ANASMEC_C9974                    2014769689              cbiA
Methanobacteriaceae           ANASMEC_C9976                    2014769733              cbiA
Methanobacteriaceae           ANASMEC_C9976                    2014769734              cbiA
Methanospirillum              ANASMEC_C3080                    2014742893              cbiE
Methanospirillum              ANASMEC_C3080                    2014742895              cbiF
Methanospirillum              ANASMEC_C3080                    2014742896              cbiG
Methanospirillum              ANASMEC_C3081                    2014742897              cbiH
Methanospirillum              ANASMEC_C3081                    2014742898              cbiC
Methanospirillum              ANASMEC_C3081                    2014742899              cbiD
Methanospirillum              ANASMEC_C3081                    2014742900              cbiE
Methanospirillum              ANASMEC_C4547                    2014747827              cbiP
Methanospirillum              ANASMEC_C5014                    2014749563              cbiA
Methanospirillum              ANASMEC_C5014                    2014749564              cbiA
Methanospirillum              ANASMEC_C6921                    2014755836              cobS
Methanospirillum              ANASMEC_C8670                    2014764303              cbiB
Methanospirillum              ANASMEC_C8670                    2014764304              cbiB
Synergistetes                 ANASMEC_C10090                   2014770870              cobT
Synergistetes                 ANASMEC_C6473                    2014754522              cbiD
Synergistetes                 ANASMEC_C6473                    2014754523              cbiE
Synergistetes                 ANASMEC_C6473                    2014754524              cbiE
Synergistetes                 ANASMEC_C6473                    2014754525              cbiF
Synergistetes                 ANASMEC_C6475                    2014754530              cobS
Synergistetes                 ANASMEC_C6475                    2014754532              cobU
Synergistetes                 ANASMEC_C726                     2014734551              cobT
Synergistetes                 ANASMEC_C726                     2014734552              cobU
unknown Deltaproteobacterium  ANASMEC_C7396                    2014758181              cbiK
unidentified Class 6          ANASMEC_C9945                    2014769400              cobA
unidentified Class 6          ANASMEC_C9946                    2014769403              cobS
unidentified Class 6          ANASMEC_C9946                    2014769404              cobS
unidentified Class 6          ANASMEC_C9946                    2014769406              cobU
unidentified Class 9          ANASMEC_C2876                    2014742219              cobA
unclassified                  ANASMEC_C2768                    2014741815              cbiL
unclassified                  ANASMEC_C2768                    2014741816              cbiL
unclassified                  ANASMEC_C2768                    2014741817              cbiT
unclassified                  ANASMEC_C2768                    2014741818              cbiE
unclassified                  ANASMEC_C2768                    2014741819              cbiD
unclassified                  ANASMEC_C2768                    2014741820              cbiC
unclassified                  ANASMEC_C2768                    2014741821              cbiA
unclassified                  ANASMEC_C2768                    2014741822              cbiP
                              ANASMEC_C202                     2014732832              cbiG
                              ANASMEC_C727                     2014734554              cbiB
                              ANASMEC_C728                     2014734556              cbiP
                              ANASMEC_C1628                    2014737706              cbiA
                              ANASMEC_C2563                    2014741123              cbiB
                              ANASMEC_C2767                    2014741813              cbiG
                              ANASMEC_C2767                    2014741814              cbiF
                              ANASMEC_C2834                    2014741992              cobS
                              ANASMEC_C3069                    2014742870              cbiH
                              ANASMEC_C3069                    2014742871              cbiG
                              ANASMEC_C3069                    2014742873              cbiF
                              ANASMEC_C3311                    2014743545              cbiA
                              ANASMEC_C3511                    2014744204              cbiA
                              ANASMEC_C3511                    2014744205              cbiA
                              ANASMEC_C3520                    2014744222              cobA
                              ANASMEC_C4447                    2014747522              cbiK
                              ANASMEC_C4867                    2014749135              cobN
                              ANASMEC_C4867                    2014749136              cobN
                              ANASMEC_C4867                    2014749138              cobN
                              ANASMEC_C4972                    2014749454              cbiL
                              ANASMEC_C5570                    2014751547              cbiA
                              ANASMEC_C5703                    2014751998              cbiE
                              ANASMEC_C5703                    2014751999              cbiE
                              ANASMEC_C5921                    2014752768              cbiC
                              ANASMEC_C5921                    2014752769              cbiA
                              ANASMEC_C6089                    2014753304              cbiD
                              ANASMEC_C6089                    2014753305              cbiC
                              ANASMEC_C6089                    2014753306              cbiA
                              ANASMEC_C6090                    2014753310              cbiA
                              ANASMEC_C6259                    2014753935              cbiB
                              ANASMEC_C7118                    2014757018              cbiF
                              ANASMEC_C7395                    2014758176              cbiK
                              ANASMEC_C7395                    2014758177              cbiL
                              ANASMEC_C7713                    2014759650              cobU
                              ANASMEC_C8028                    2014762141              cobU
                              ANASMEC_C8188                    2014762649              cbiA
                              ANASMEC_C8222                    2014762750              cbiA
                              ANASMEC_C8223                    2014762751              cbiA
                              ANASMEC_C8244                    2014762824              cobU
                              ANASMEC_C9106                    2014766026              cbiA
                              ANASMEC_C9272                    2014766892              cobA
                              ANASMEC_C9272                    2014766893              cbiP
                              ANASMEC_C9272                    2014766894              cbiP
                              ANASMEC_C9652                    2014768461              cbiP
                              ANASMEC_C9834                    2014769042              cbiC
                              ANASMEC_C10376                   2014772348              cobA
                              ANASMEC_C10376                   2014772349              cobA
                              ANASMEC_C10757                   2014773785              cobN
                              ANASMEC_FYHO897_b1               2014776303              cbiP
                              ANASMEC_FYHO3513_b1              2014774922              cbiK
                              ANASMEC_FYHO5065_b1              2014775389              cobS
                              ANASMEC_FYHO7312_g1              2014775985              cbiK
                              ANASMEC_FYHO8764_g1              2014776259              cbiF
                              ANASMEC_FYHO10859_b1             2014776666              cobN
                              ANASMEC_FYHO10859_g1             2014776668              cobN
                              ANASMEC_FYHO11194_b1             2014776741              cbiE
                              ANASMEC_FYHO11194_g1             2014776742              cbiB
                              ANASMEC_FYHO11194_g1             2014776743              cbiA
                              ANASMEC_FYHO11245_g1             2014776755              cobU
                              ANASMEC_FYHO13996_b1             2014777397              cbiB
                              ANASMEC_FYHO21893_g1             2014779438              cobA
                              ANASMEC_FYHO23079_b1             2014779767              cbiF
                              ANASMEC_FYHO24439_b1             2014780009              cbiC
                              ANASMEC_FYHO24439_g1             2014780010              cbiP
                              ANASMEC_FYHO26853_g1             2014780563              cbiA
                              ANASMEC_FYHO27856_b1             2014780805              cbiD
                              ANASMEC_FYHO30996_b1             2014781682              cbiG
                              ANASMEC_FYHO30996_g1             2014781684              cbiX
                              ANASMEC_FYHO41447_b1             2014783004              cbiB
                              ANASMEC_FYHO45647_g1             2014783570              cbiP
                              ANASMEC_FYHO55308_g1             2014785885              cobA
                              ANASMEC_FYHO55308_g1             2014785886              cbiT
                              ANASMEC_FYHO55674_g1             2014785970              cobU
                              ANASMEC_FYHO64850_b1             2014788059              cbiB
                              ANASMEC_FYHO64873_g3             2014788076              cbiH
                              ANASMEC_FYHO69586_g1             2014789167              cbiA
                              ANASMEC_FYHO71048_g1             2014789547              cbiB
                              ANASMEC_FYHO71871_b1             2014789709              cbiL
                              ANASMEC_FYHO72545_b1             2014789888              cbiA
                              ANASMEC_FYHO72545_g1             2014789889              cbiB
                              ANASMEC_FYHO76378_b1             2014790734              cobA
                              ANASMEC_FYHO81740_g1             2014792002              cobS
                              ANASMEC_GCAX02227_c1_1000_100_1  2014792982              cbiH
                              ANASMEC_GCAX02227_c1_1000_100_1  2014792983              cbiJ
                              ANASMEC_GCAX04797_c1_1000_100_1  2014793287              cbiA
                              ANASMEC_GCAX10475_c1_1000_100_1  2014794041              cobU
                              ANASMEC_GCAX11080_c1_1000_100_1  2014794138              cbiP
                              ANASMEC_GCAX11638_c1_1000_100_1  2014794222              cbiL

Appendix 7:

Bacterial and archaeal sequenced genomes lacking genes for methylene tetrahydrofolate reductase (MTHFR)


Bacteria and Archaea Lacking MTHFR Genes
Acholeplasma_laidlawii_PG_8A
Aciduliprofundum_boonei_T469
Aciduliprofundum_MAR08_339
Actinobacillus_succinogenes_130Z
Aeropyrum_pernix_K1
Aminobacterium_colombiense_DSM_12261
Anaerococcus_prevotii_DSM_20548
Anaplasma_centrale_Israel
Anaplasma_marginale_Florida
Anaplasma_marginale_Maries
Anaplasma_phagocytophilum_HZ
Arthrobacter_arilaitensis_Re117
Aster_yellows_witches_broom_phytoplasma_AYWB
Atopobium_parvulum_DSM_20469
bacterium_BT_1
Bartonella_bacilliformis_KC583
Bartonella_clarridgeiae_73
Bartonella_grahamii_as4aup
Bartonella_henselae_Houston_1
Bartonella_quintana_RM_11
Bartonella_quintana_Toulouse
Bartonella_tribocorum_CIP_105476
Bdellovibrio_bacteriovorus_HD100
Bdellovibrio_bacteriovorus_Tiberius
Borrelia_afzelii_HLJ01
Borrelia_afzelii_PKo
Borrelia_afzelii_PKo
Borrelia_bissettii_DN127
Borrelia_burgdorferi_B31
Borrelia_burgdorferi_JD1
Borrelia_burgdorferi_N40
Borrelia_burgdorferi_ZS7
Borrelia_crocidurae_Achema
Borrelia_duttonii_Ly
Borrelia_garinii_BgVir
Borrelia_garinii_NMJW1
Borrelia_garinii_PBi
Borrelia_hermsii_DAH
Borrelia_recurrentis_A1
Borrelia_turicatae_91E135
Buchnera_aphidicola__Cinara_tujafilina_
Caldisericum_exile_AZM16c01
Campylobacter_hominis_ATCC_BAA_381
Campylobacter_lari_RM2100
Candidatus_Amoebophilus_asiaticus_5a2
Candidatus_Arthromitus_SFB_mouse_Japan
Candidatus_Arthromitus_SFB_mouse_Yit
Candidatus_Arthromitus_SFB_rat_Yit
Candidatus_Cloacamonas_acidaminovorans
Candidatus_Hamiltonella_defensa_5AT__Acyrthosiphon_pisum_
Candidatus_Kinetoplastibacterium_blastocrithidii__ex_Strigomonas_culicis_
Candidatus_Kinetoplastibacterium_crithidii__ex_Angomonas_deanei_ATCC_30255_
Candidatus_Liberibacter_asiaticus_psy62
Candidatus_Liberibacter_solanacearum_CLso_ZC1
Candidatus_Midichloria_mitochondrii_IricVA
Candidatus_Moranella_endobia_PCIT
Candidatus_Mycoplasma_haemolamae_Purdue
Candidatus_Pelagibacter_IMCC9063
Candidatus_Phytoplasma_australiense
Candidatus_Phytoplasma_mali
Candidatus_Protochlamydia_amoebophila_UWE25
Candidatus_Rickettsia_amblyommii_GAT_30V
Candidatus_Riesia_pediculicola_USDA
Candidatus_Sulcia_muelleri_CARI
Candidatus_Sulcia_muelleri_DMIN
Candidatus_Sulcia_muelleri_GWSS
Candidatus_Sulcia_muelleri_SMDSEM
Capnocytophaga_canimorsus_Cc5
Cardinium_endosymbiont_cEper1_of_Encarsia_pergandiella
Chlamydia_muridarum_Nigg
Chlamydia_psittaci_01DC12
Chlamydia_psittaci_84_55
Chlamydia_psittaci_GR9
Chlamydia_psittaci_M56
Chlamydia_psittaci_MN
Chlamydia_psittaci_VS225
Chlamydia_psittaci_WC
Chlamydia_psittaci_WS_RT_E30
Chlamydia_trachomatis_434_Bu
Chlamydia_trachomatis_A_HAR_13
Chlamydia_trachomatis_A2497
Chlamydia_trachomatis_A2497
Chlamydia_trachomatis_B_Jali20_OT
Chlamydia_trachomatis_B_TZ1A828_OT
Chlamydia_trachomatis_D_EC
Chlamydia_trachomatis_D_LC
Chlamydia_trachomatis_D_UW_3_CX
Chlamydia_trachomatis_E_11023
Chlamydia_trachomatis_E_150
Chlamydia_trachomatis_E_SW3
Chlamydia_trachomatis_F_SW4
Chlamydia_trachomatis_F_SW5
Chlamydia_trachomatis_G_11074
Chlamydia_trachomatis_G_11222
Chlamydia_trachomatis_G_9301
Chlamydia_trachomatis_G_9768
Chlamydia_trachomatis_L2b_UCH_1_proctitis
Chlamydia_trachomatis_L2c
Chlamydia_trachomatis_Sweden2
Chlamydophila_abortus_S26_3
Chlamydophila_caviae_GPIC
Chlamydophila_felis_Fe_C_56
Chlamydophila_pecorum_E58
Chlamydophila_pneumoniae_AR39
Chlamydophila_pneumoniae_CWL029
Chlamydophila_pneumoniae_J138
Chlamydophila_pneumoniae_LPCoLN
Chlamydophila_pneumoniae_TW_183
Chlamydophila_psittaci_01DC11
Chlamydophila_psittaci_02DC15
Chlamydophila_psittaci_08DC60
Chlamydophila_psittaci_6BC
Chlamydophila_psittaci_6BC
Chlamydophila_psittaci_C19_98
Chlamydophila_psittaci_CP3
Chlamydophila_psittaci_Mat116
Chlamydophila_psittaci_NJ1
Chlamydophila_psittaci_RD1
Clostridiales_genomosp__BVAB3_UPII9_5
Cryptobacterium_curtum_DSM_15641
cyanobacterium_UCYN_A
Dehalococcoides_BAV1
Dehalococcoides_CBDB1
Dehalococcoides_ethenogenes_195
Dehalococcoides_GT
Dehalococcoides_VS
Desulfurococcus_fermentans_DSM_16532
Desulfurococcus_kamchatkensis_1221n
Desulfurococcus_mucosus_DSM_2162
Eggerthella_lenta_DSM_2243
Eggerthella_YY7918
Ehrlichia_canis_Jake
Ehrlichia_chaffeensis_Arkansas
Ehrlichia_ruminantium_Gardel
Ehrlichia_ruminantium_Welgevonden
Ehrlichia_ruminantium_Welgevonden
Enterococcus_faecalis_62
Enterococcus_faecalis_D32
Enterococcus_faecalis_OG1RF
Enterococcus_faecalis_Symbioflor_1
Enterococcus_faecalis_V583
Enterococcus_faecium_Aus0004
Enterococcus_faecium_DO
Enterococcus_faecium_NRRL_B_2354
Enterococcus_hirae_ATCC_9790
Erysipelothrix_rhusiopathiae_Fujisawa
Fervidicoccus_fontis_Kam940
Filifactor_alocis_ATCC_35896
Finegoldia_magna_ATCC_29328
Flavobacterium_psychrophilum_JIP02_86
Francisella_cf__novicida_3523
Francisella_cf__novicida_Fx1
Francisella_noatunensis_orientalis_Toba_04
Francisella_novicida_U112
Francisella_tularensis_FSC198
Francisella_tularensis_holarctica_F92
Francisella_tularensis_holarctica_FSC200
Francisella_tularensis_holarctica_FTNF002_00
Francisella_tularensis_holarctica_LVS
Francisella_tularensis_holarctica_OSU18
Francisella_tularensis_mediasiatica_FSC147
Francisella_tularensis_NE061598
Francisella_tularensis_SCHU_S4
Francisella_tularensis_TI0902
Francisella_tularensis_TIGB03
Francisella_tularensis_WY96_3418
Haemophilus_ducreyi_35000HP
Halobacterium_NRC_1
Halobacterium_salinarum_R1
Helicobacter_acinonychis_Sheeba
Helicobacter_bizzozeronii_CIII_1
Helicobacter_cetorum_MIT_00_7128
Helicobacter_cetorum_MIT_99_5656
Helicobacter_felis_ATCC_49179
Helicobacter_mustelae_12198
Helicobacter_pylori
Helicobacter_pylori_2017
Helicobacter_pylori_2018
Helicobacter_pylori_26695
Helicobacter_pylori_35A
Helicobacter_pylori_51
Helicobacter_pylori_83
Helicobacter_pylori_908
Helicobacter_pylori_Aklavik117
Helicobacter_pylori_Aklavik86
Helicobacter_pylori_B38
Helicobacter_pylori_B8
Helicobacter_pylori_Cuz20
Helicobacter_pylori_ELS37
Helicobacter_pylori_F16
Helicobacter_pylori_F30
Helicobacter_pylori_F32
Helicobacter_pylori_F57
Helicobacter_pylori_G27
Helicobacter_pylori_Gambia94_24
Helicobacter_pylori_HPAG1
Helicobacter_pylori_HUP_B14
Helicobacter_pylori_India7
Helicobacter_pylori_J99
Helicobacter_pylori_Lithuania75
Helicobacter_pylori_P12
Helicobacter_pylori_PeCan18
Helicobacter_pylori_PeCan4
Helicobacter_pylori_Puno120
Helicobacter_pylori_Puno135
Helicobacter_pylori_Rif1
Helicobacter_pylori_Rif2
Helicobacter_pylori_Sat464
Helicobacter_pylori_Shi112
Helicobacter_pylori_Shi169
Helicobacter_pylori_Shi417
Helicobacter_pylori_Shi470
Helicobacter_pylori_SJM180
Helicobacter_pylori_SNT49
Helicobacter_pylori_SouthAfrica7
Helicobacter_pylori_v225d
Helicobacter_pylori_XZ274
Idiomarina_loihiensis_L2TR
Ignisphaera_aggregans_DSM_17230
Kosmotoga_olearia_TBF_19_5_1
Kytococcus_sedentarius_DSM_20547
Lactobacillus_brevis_ATCC_367
Lactobacillus_gasseri_ATCC_33323
Lactobacillus_johnsonii_DPC_6026
Lactobacillus_johnsonii_FI9785
Lactobacillus_johnsonii_NCC_533
Lactobacillus_reuteri_SD2112
Lactobacillus_ruminis_ATCC_27782
Lactobacillus_sakei_23K
Lactococcus_garvieae_ATCC_49156
Lactococcus_garvieae_Lg2
Lawsonia_intracellularis_N343
Lawsonia_intracellularis_PHE_MN1_00
Legionella_longbeachae_NSW150
Melissococcus_plutonius_ATCC_35311
Mesoplasma_florum_L1
Micavibrio_aeruginosavorus_ARL_13
Micrococcus_luteus_NCTC_2665
Moraxella_catarrhalis_RH4
Mycoplasma_agalactiae
Mycoplasma_agalactiae_PG2
Mycoplasma_arthritidis_158L3_1
Mycoplasma_bovis_HB0801
Mycoplasma_bovis_Hubei_1
Mycoplasma_bovis_PG45
Mycoplasma_capricolum_ATCC_27343
Mycoplasma_conjunctivae_HRC_581
Mycoplasma_crocodyli_MP145
Mycoplasma_cynos_C142
Mycoplasma_fermentans_JER
Mycoplasma_fermentans_M64
Mycoplasma_gallisepticum_CA06_2006_052_5_2P
Mycoplasma_gallisepticum_F
Mycoplasma_gallisepticum_NC06_2006_080_5_2P
Mycoplasma_gallisepticum_NC08_2008_031_4_3P
Mycoplasma_gallisepticum_NC95_13295_2_2P
Mycoplasma_gallisepticum_NC96_1596_4_2P
Mycoplasma_gallisepticum_NY01_2001_047_5_1P
Mycoplasma_gallisepticum_R_high_
Mycoplasma_gallisepticum_R_low_
Mycoplasma_gallisepticum_VA94_7994_1_7P
Mycoplasma_gallisepticum_WI01_2001_043_13_2P
Mycoplasma_genitalium_G37
Mycoplasma_genitalium_M2288
Mycoplasma_genitalium_M2321
Mycoplasma_genitalium_M6282
Mycoplasma_genitalium_M6320
Mycoplasma_haemocanis_Illinois
Mycoplasma_haemofelis_Langford_1
Mycoplasma_haemofelis_Ohio2
Mycoplasma_hominis_ATCC_23114
Mycoplasma_hyopneumoniae_168
Mycoplasma_hyopneumoniae_232
Mycoplasma_hyopneumoniae_7448
Mycoplasma_hyopneumoniae_J
Mycoplasma_hyorhinis_GDL_1
Mycoplasma_hyorhinis_HUB_1
Mycoplasma_hyorhinis_MCLD
Mycoplasma_hyorhinis_SK76
Mycoplasma_leachii_99_014_6
Mycoplasma_leachii_PG50
Mycoplasma_mobile_163K
Mycoplasma_mycoides_capri_LC_95010
Mycoplasma_mycoides_SC_PG1
Mycoplasma_penetrans_HF_2
Mycoplasma_pneumoniae_309
Mycoplasma_pneumoniae_FH
Mycoplasma_pneumoniae_M129
Mycoplasma_pulmonis_UAB_CTIP
Mycoplasma_putrefaciens_KS1
Mycoplasma_suis_Illinois
Mycoplasma_suis_KI3806
Mycoplasma_synoviae_53
Mycoplasma_wenyonii_Massachusetts
Nanoarchaeum_equitans_Kin4_M
Neorickettsia_risticii_Illinois
Neorickettsia_sennetsu_Miyayama
Oenococcus_oeni_PSU_1
Olsenella_uli_DSM_7084
Onion_yellows_phytoplasma_OY_M
Orientia_tsutsugamushi_Boryong
Orientia_tsutsugamushi_Ikeda
Parachlamydia_acanthamoebae_UV7
Pediococcus_pentosaceus_ATCC_25745
Porphyromonas_asaccharolytica_DSM_20707
Porphyromonas_gingivalis_ATCC_33277
Porphyromonas_gingivalis_TDC60
Porphyromonas_gingivalis_W83
Prevotella_denticola_F0289
Prevotella_intermedia_17
Propionibacterium_acnes_266
Propionibacterium_acnes_6609
Propionibacterium_acnes_ATCC_11828
Propionibacterium_acnes_C1
Propionibacterium_acnes_KPA171202
Propionibacterium_acnes_SK137
Propionibacterium_acnes_TypeIA2_P_acn33
Pyrococcus_yayanosii_CH1
Rickettsia_africae_ESF_5
Rickettsia_akari_Hartford
Rickettsia_australis_Cutlack
Rickettsia_bellii_OSU_85_389
Rickettsia_bellii_RML369_C
Rickettsia_canadensis_CA410
Rickettsia_canadensis_McKiel
Rickettsia_conorii_Malish_7
Rickettsia_felis_URRWXCal2
Rickettsia_heilongjiangensis_054
Rickettsia_japonica_YH
Rickettsia_massiliae_AZT80
Rickettsia_massiliae_MTU5
Rickettsia_montanensis_OSU_85_930
Rickettsia_parkeri_Portsmouth
Rickettsia_peacockii_Rustic
Rickettsia_philipii_364D
Rickettsia_prowazekii_BuV67_CWPP
Rickettsia_prowazekii_Chernikova
Rickettsia_prowazekii_Dachau
Rickettsia_prowazekii_GvV257
Rickettsia_prowazekii_Katsinyian
Rickettsia_prowazekii_Madrid_E
Rickettsia_prowazekii_Rp22
Rickettsia_prowazekii_RpGvF24
Rickettsia_rhipicephali_3_7_female6_CWPP
Rickettsia_rickettsii__Sheila_Smith_
Rickettsia_rickettsii_Arizona
Rickettsia_rickettsii_Brazil
Rickettsia_rickettsii_Colombia
Rickettsia_rickettsii_Hauke
Rickettsia_rickettsii_Hino
Rickettsia_rickettsii_Hlp_2
Rickettsia_rickettsii_Iowa
Rickettsia_slovaca_13_B
Rickettsia_slovaca_D_CWPP
Rickettsia_typhi_B9991CWPP
Rickettsia_typhi_TH1527
Rickettsia_typhi_Wilmington
secondary_endosymbiont_of_Ctenarytaina_eucalypti
secondary_endosymbiont_of_Heteropsylla_cubana_Thao2000
Serratia_symbiotica__Cinara_cedri_
Simkania_negevensis_Z
Staphylothermus_marinus_F1
Streptobacillus_moniliformis_DSM_12112
Streptococcus_dysgalactiae_equisimilis_AC_2713
Streptococcus_dysgalactiae_equisimilis_ATCC_12394
Streptococcus_dysgalactiae_equisimilis_GGS_124
Streptococcus_dysgalactiae_equisimilis_RE378
Streptococcus_equi_4047
Streptococcus_equi_zooepidemicus
Streptococcus_equi_zooepidemicus_ATCC_35246
Streptococcus_equi_zooepidemicus_MGCS10565
Streptococcus_intermedius_JTH08
Streptococcus_parauberis_KCTC_11537
Streptococcus_pyogenes_A20
Streptococcus_pyogenes_Alab49
Streptococcus_pyogenes_M1_GAS
Streptococcus_pyogenes_Manfredo
Streptococcus_pyogenes_MGAS10270
Streptococcus_pyogenes_MGAS10394
Streptococcus_pyogenes_MGAS10750
Streptococcus_pyogenes_MGAS15252
Streptococcus_pyogenes_MGAS1882
Streptococcus_pyogenes_MGAS2096
Streptococcus_pyogenes_MGAS315
Streptococcus_pyogenes_MGAS5005
Streptococcus_pyogenes_MGAS6180
Streptococcus_pyogenes_MGAS8232
Streptococcus_pyogenes_MGAS9429
Streptococcus_pyogenes_NZ131
Streptococcus_pyogenes_SSI_1
Streptococcus_uberis_0140J
Taylorella_asinigenitalis_MCE3
Taylorella_equigenitalis_ATCC_35865
Taylorella_equigenitalis_MCE9
Tetragenococcus_halophilus
Thermococcus_4557
Thermococcus_AM4
Thermococcus_barophilus_MP
Thermococcus_CL1
Thermococcus_gammatolerans_EJ3
Thermococcus_onnurineus_NA1
Thermococcus_sibiricus_MM_739
Thermosphaera_aggregans_DSM_11486
Treponema_denticola_ATCC_35405
Tropheryma_whipplei_TW08_27
Tropheryma_whipplei_Twist
Ureaplasma_parvum_serovar_3_ATCC_27815
Ureaplasma_parvum_serovar_3_ATCC_700970
Ureaplasma_urealyticum_serovar_10_ATCC_33699
Waddlia_chondrophila_WSU_86_1044
Weeksella_virosa_DSM_16922
Wigglesworthia_glossinidia_endosymbiont_of_Glossina_brevipalpis
Wigglesworthia_glossinidia_endosymbiont_of_Glossina_morsitans__Yale_colony_
Wolbachia_endosymbiont_of_Culex_quinquefasciatus_Pel
Wolbachia_endosymbiont_of_Drosophila_melanogaster
Wolbachia_endosymbiont_of_Onchocerca_ochengi
Wolbachia_endosymbiont_TRS_of_Brugia_malayi
Wolbachia_wRi
Xylella_fastidiosa_GB514