Genetic and biochemical analysis of host-microbe interactions affecting gut homeostasis: functional and genomic analysis of polyphenol catabolism by from cluster IV

Sian Pottenger

Bachelor of Science with Honours BioVeterinary science

Masters by Research (MRes) Microbial Pathogenesis

A thesis submitted for the degree of Doctor of Philosophy at

The University of Queensland in 2018

University of Queensland Diamantina Institute, Faculty of Medicine

Abstract

Metagenomics offers opportunities to advance our understanding of the complexities of the microbial communities that inhabit the human gastrointestinal tract (GIT). Whilst this research field has great potential, there are limitations and caveats that affect its impact, and the translation of the “microbiome into medicine”. In particular, the number of representative microbes that still remain uncultured is substantial and constrains our capacity to define the functional roles such microbes play in health and disease. New techniques to gain representative isolates of these microbes are required, and it would also be advantageous if these microbes are amenable to genetic manipulation. Techniques in bacterial genetics applied to a broader diversity of human GIT bacteria would allow us to fully assess the functional aspects of their potential interactions with a host, which would expand understanding of their relation to host health and well-being. With this in mind, the aims of my PhD project were to utilize a new approach developed in our lab referred to as metaparental mating, to expand our collection of genetically tractable bacteria considered relevant to GIT homeostasis, and then assess their immunomodulatory capacity. Based on these results and findings, I then chose to undertake a more detailed and integrated culture-based and genomic analysis of two “new” bacterial isolates assigned to poorly populated and relatively uncharacterised lineages within Clostridium Cluster IV: the genera Flavonifractor and Pseudoflavonifractor.

Chapter 1 provides an overview of the current literature, with a focus on the roles of the GIT microbiota in host health and well-being. I provide evidence and rationale for the research undertaken throughout my PhD studies, based on the gaps in our knowledge of the roles that specific members of the microbiota play in the human GIT. Particular attention is given to the increasing interest in polyphenols and their beneficial impacts on the microbiota and the host.

Chapter 2 describes my use of the metaparental mating technique to recover representative isolates of -affiliated bacteria. I validated the utility of the metaparental mating technique to recover a broad diversity of the bacteria present in human stool assigned to these lineages, and in particular the use of a plasmid that contains the evoglow-C-Bs2 bioluminescence reporter gene, which augmented antibiotic resistance selection and the identification of transconjugant strains. My phylogenetic assessment of the isolates I recovered shows that the collection includes bacteria assigned to Enterococcus and to Clostridium clusters IV, XIVa and XVIII. I then assessed 22 of my recovered isolates for their ability to inhibit lipopolysaccharide-stimulated NF-κB activation of the luciferase reporter gene using the RAW 264.7 mouse macrophage cell line. I was able to show that 7/22 of these isolates inhibit NF-κB activation of the reporter gene to a magnitude similar to or greater than Faecalibacterium prausnitzii A2-165. Of particular interest to me was the recovery of two isolates assigned to Flavonifractor and Pseudoflavonifractor, and I chose to focus on these two isolates for the remainder of my PhD studies.

Chapter 3 is focused on my assessment of Flavonifractor sp. AHG0014 with reference to the type strain, F. plautii strain DSMZ 4000T. In particular, my culture-based studies examined quercetin metabolism by both these strains. My results suggest that the growth of both strains does not proceed until quercetin per se is reduced to relatively low concentrations (~5 µM). The genome of strain AHG0014 was sequenced and found to be similar in size and G+C content to the four Flavonifractor genomes available. In addition, I was able to retrieve genome sequences of four more strains that were “unassigned” but should now be considered as representatives of the Flavonifractor genus. My assessments of these 9 genomes showed that the chalcone isomerase (chi) gene implicated in quercetin metabolism is part of the core genome and is contained in a multi-gene locus that most likely encodes both quercetin uptake and metabolism. Using a combination of the 16S rRNA gene and quercetin metabolism genes to screen metagenomics datasets, I found that these genes are significantly more abundant in a cohort of Crohn’s disease patients compared to healthy controls, suggesting that the proposed association of Flavonifractor spp. and GIT homeostasis needs more detailed assessment.

Chapter 4 focuses on my assessment of “Pseudoflavonifractor” sp. AHG0008. I first showed that similar to P. capillosus DSMZ 23940T, AHG0008 does not metabolise quercetin. Interestingly, and despite the similarity between these two bacteria based on 16S rRNA gene analysis, the genome of strain AHG0008 is much smaller and quite different when compared to P. capillosus DSMZ 23940T. I then used a combination of bioinformatics methods to recover more closely related genomes produced from metagenomics datasets (MAGs), and my assessment of the whole-genome-based phylogeny and Average Nucleotide Identity scores confirmed that strain AHG0008 is the first cultured isolate of a divergent branch of the presumptive Pseudoflavonifractor lineage. I also found that strain AHG0008 and the MAGs possess a relatively large percentage of genes encoding amino acid transport and metabolism.

Chapter 5 provides my overview of the findings arising from my PhD research, which I believe has provided a greater awareness and new understanding of these diverse and underrepresented bacterial lineages. I also provide some perspective and suggestions with respect to future research that will provide new insights into the roles these bacteria play in human health and disease.

Declaration by author

This thesis is composed of my original work, and contains no material previously published or written by another person except where due reference has been made in the text. I have clearly stated the contribution by others to jointly-authored works that I have included in my thesis.

I have clearly stated the contribution of others to my thesis as a whole, including statistical assistance, survey design, data analysis, significant technical procedures, professional editorial advice, and any other original research work used or reported in my thesis. The content of my thesis is the result of work I have carried out since the commencement of my research higher degree candidature and does not include a substantial part of work that has been submitted to qualify for the award of any other degree or diploma in any university or other tertiary institution. I have clearly stated which parts of my thesis, if any, have been submitted to qualify for another award.

I acknowledge that an electronic copy of my thesis must be lodged with the University Library and, subject to the policy and procedures of The University of Queensland, the thesis be made available for research and study in accordance with the Copyright Act 1968 unless a period of embargo has been approved by the Dean of the Graduate School.

I acknowledge that copyright of all material contained in my thesis resides with the copyright holder(s) of that material. Where appropriate I have obtained copyright permission from the copyright holder to reproduce material in this thesis.

Publications included in this thesis

Ó Cuív P, Smith WJ, Pottenger S, Burman S, Shanahan ER, and Morrison M. (2015) Isolation of Genetically Tractable Most-Wanted Bacteria by Metaparental Mating. Scientific reports, 5, 13282. Results presented in this manuscript are incorporated in Chapter 2.

Submitted manuscripts included in this thesis

No manuscripts submitted for publication

Other publications during candidature

Published peer-reviewed literature reviews

Burman S, Hoedt EC, Pottenger S, Mohd-Najman NS, Ó Cuív P, and Morrison M (2016) An (Anti)-Inflammatory Microbiota: Defining the Role in Inflammatory Bowel Disease? Dig Dis. 2016; 34(1-2):64-71

Published peer-reviewed Book Chapters

Ó Cuív, P, Burman S, Pottenger S, and Morrison M (2016). Exploring the Bioactive Landscape of the Gut Microbiota to Identify Metabolites Underpinning Human Health. Microbial Metabolomics: Applications in Clinical, Environmental, and Industrial Microbiology. D. J. Beale, K. A. Kouremenos and E. A. Palombo. Cham, Springer International Publishing: 49-82.

Conference Abstracts

Pottenger S, Hoedt E C, Ó Cuív P and Morrison M: Identifying new anti-inflammatory bacteria from Clostridium clusters IV and XIVa. The Australian Society for Medical Research Postgraduate Student Conference. 2016 May. 31; Brisbane, QLD, Australia.

Pottenger S, Ó Cuív P and Morrison M: Growth studies of Flavonifractor spp. SP1. AusME 2017, Australian Microbial Ecology Conference. 2017 Feb. 13-15; Melbourne, VIC, Australia.

Pottenger S, Ó Cuív P and Morrison M: Quercetin metabolism of Flavonifractor plautii DSMZ 4000T and Flavonifractor sp. SP1. School of Biomedical Sciences International Student Symposium, 2017 Oct-Nov. 31-1; Brisbane, QLD, Australia.

Pottenger S, Dekker Nitert M, Ó Cuív P and Morrison M: Genomic and functional insights reveal a novel cluster of bacterial species related to Flavonifractor plautii and Pseudoflavonifractor capillosus strains. Australian Society for Microbiology (ASM) Annual Conference. 2018 Jul. 1-4; Brisbane, QLD, Australia.

Pottenger S, Dekker Nitert M, Ó Cuív P and Morrison M: A novel cluster of Immunomodulatory human gut bacteria related to Flavonifractor and Pseudoflavonifractor spp. Translational Research Symposium. 2018 Jul. 26; Brisbane, QLD, Australia.

Contributions by others to the thesis

Dr Páraic Ó Cuív designed the metaparental mating experiments described in Chapter 2. Dr Emily Hoedt performed the Gas-chromatography analysis of bacterial supernatants in Chapter 2. Ms. Nida Murtaza generated the Metagenome Assembled Genome datasets used in Chapter 4. All results and work presented in this thesis were critically analysed by my primary supervisor Prof. Mark Morrison.

Statement of parts of the thesis submitted to qualify for the award of another degree

No works submitted towards another degree have been included in this thesis

Research Involving Human or Animal subjects

Freshly voided stool samples were collected from pre-adolescent children under Metro Hospital South human research ethics (HREC) approval HREC/13/MHS/27, and were kindly provided by Dr Emma Hamilton-Williams (UQDI).

Acknowledgements

To my supervisors, I am forever grateful to you for giving me this opportunity to travel to Australia and pursue my PhD. Professor Mark Morrison, I am so thankful that you brought me into your group, for sharing your vast knowledge with me and for helping shape me into the scientist I am today. Dr Páraic Ó Cuív, thank you for your knowledge and guidance within the lab on a day-to-day basis; our continued talks always inspired my thinking and guided my way on this journey. Thank you to you both for sticking with me and allowing me to show my true potential. To Dr Marloes Dekker Nitert, I am so thankful to you for joining my supervisory panel for the final 12-18 months of my PhD; your continued support, guidance and knowledge truly helped to boost my confidence as a scientist. To you all, the support you have given me is something I am extremely thankful for, and words cannot truly express just how deep my gratitude extends.

To everyone in the Diamantina Institute at the Translational Research Institute (TRI), I am so grateful for all your support during my time here, especially for making it so easy for me to transition to life Down Under and for being the great friends you have all become. Special thanks go to Dr Emily Hoedt for the guidance and support you provided, particularly when I took that daunting step into the world of bioinformatics and genomics; your knowledge, help, and guidance were extremely valuable. To Dr Erin Shanahan, Dr Diahann Jansen and Dr Martine Boks, thank you so much for your friendship, help and guidance in the brief time I delved into Immunology; your support has made me keen to step back into it during my future career. Rabina Giri, I thank you for the continued conversations we have had regarding your ongoing analysis of the strains I was working on, and for being there for chats in the lab after everyone else had gone home. I gratefully thank everyone within the Morrison Group, past and present, but particularly Nida Murtaza, Richard Linedale and Dr Erwin Berensden for your continued support, scientific chats and coffees, which all helped me throughout my PhD. Finally, to the others on level 6 at TRI, Jeremy Brooks, Meg Donovan, Brooke Geeling, Maggie Veitch, Nicola Pett, Carrie Coggon, Rachel Rollo, Josh Monteith and Jeimy Jimenez: your friendship has meant so much to me during my PhD. The Friday night drinks, weekend wine and cheese nights, as well as the burger and movie nights, all helped keep me sane, happy and smiley when I needed it. To my housemates Renee Morrison and Mc Beagle (best dog in the world), your continued company and support through the last two years of my PhD was much appreciated. You both helped bring a smile to my face, from the warm welcome-home tail-wags (Mc Beagle) to the movie watching and heartfelt chats to wind down at the end of those long days.

To my Family and Friends back in the UK, all the love and support you have provided me has truly helped me get through this rollercoaster of an experience. Special mentions go out to my Mum, Dad, sister Riley, my Grandparents Eileen, Estelle and Jimmy, my Uncle Neil, Aunt Jen, Aunt Estelle and Uncle Darren, my Cousins Ethan, Mia and Sam, and finally my Fairy Godmother Jo. Thank you all for always being there, especially for the continued support, jokes (particularly over me gaining an “Aussie” accent), and encouragement through everything. To my Grandad Charlie, my Grandad Billy and Great Grandma Myra, although you never got to see me make it this far, you have forever been by my side through all the love and support you gave me. I love you all.

Financial support

This research was supported by a University of Queensland Diamantina Institute PhD scholarship, a University of Queensland RHD scholarship and a University of Queensland International tuition fee scholarship.

Keywords

Immunomodulatory, Flavonifractor, Pseudoflavonifractor, polyphenol catabolism, genomics, human, pan-genome, quercetin, flavonoid

Australian and New Zealand Standard Research Classifications (ANZSRC)

ANZSRC code: 060309, Phylogeny and Comparative Analysis, 30%

ANZSRC code: 060503, Microbial Genetics, 30%

ANZSRC code: 060504, Microbial Ecology, 40%

Fields of Research (FoR) Classification

FoR code: 0603, Evolutionary Biology, 30%

FoR code: 0605, Microbiology, 70%

Table of contents

Abstract

Declaration by author

Publications included in this thesis

Submitted manuscripts included in this thesis

Other publications during candidature

Contributions by others to the thesis

Statement of parts of the thesis submitted to qualify for the award of another degree

Research Involving Human or Animal subjects

Acknowledgements

Financial support

Keywords

Australian and New Zealand Standard Research Classifications (ANZSRC)

Fields of Research (FoR) Classification

Table of contents

List of Figures

List of Tables

List of Abbreviations used in the thesis

Chapter 1 General introduction and literature review

1.1 Introduction

1.2 Historical difficulties in studying the GIT microbiota

1.3 The generalised concept of dysbiosis in Inflammatory Bowel Diseases

1.4 Culture-based approaches to study the GIT microbiota

1.5 Host-microbe interactions mediating changes in intestinal barrier function during IBD

1.6 In vitro characterisation of GIT microbiota function

1.7 Dietary polyphenols and their impacts on the GIT physiology and composition of the microbiota

1.8 Summary and research aims

Chapter 2 Isolation of ‘new’ genetically tractable gut bacteria using metaparental mating and their potential role in human health

2.1 Introduction

2.2 Materials and Methods

2.2.1 Bacterial isolations: Metaparental mating

2.2.2 Transconjugant confirmation: microscopy and PCR screening

2.2.3 Transconjugant identification: 16S rRNA sequencing and phylogenetic analysis

2.2.4 Plasmid curing of strain AHG0001

2.2.5 RAW 264.7 macrophage cell culturing and NF-κB activation assays

2.3 Results

2.3.1 The recipient cultures showed variable responses to antibiotics

2.3.2 Isolation and phylogenetic analyses of transconjugant strains

2.3.3 Primary screening of pEHR512112 transconjugant bacteria for the production of immunomodulatory bioactive compounds

2.3.4 Plasmid curing experiments with AHG0001

2.4 Discussion

Chapter 3 Functional and genomic characterisation of Flavonifractor spp. with a focus on polyphenol catabolism

3.1 Introduction

3.2 Materials and Methods

3.2.1 Bacterial strains and growth conditions

3.2.2 Growth and quercetin metabolism studies

3.2.3 Strain AHG0014 DNA extraction, genome sequencing and assembly

3.2.4 Comparative genomics of AHG0014 with other Flavonifractor spp.

3.2.1 MetaQuery analysis to assess the prevalence of Flavonifractor spp. and the chi operon in other metagenomics datasets

3.2.2 Cloning of the AHG0014 chi homolog

3.2.3 Expression of the recombinant AHG0014 chi homolog

3.3 Results

3.3.1 16S rRNA based taxonomic assignment of transconjugant strains AHG0008 and AHG0014

3.3.2 Quercetin has minimal effects on the growth kinetics of F. plautii DSMZ 4000T and strain AHG0014

3.3.3 Both Flavonifractor spp. DSMZ 4000T and AHG0014 rapidly metabolise quercetin

3.3.4 Whole genome analysis of Flavonifractor spp. and AHG0014

3.3.5 Predicted gene functions and metabolism of the F. plautii species

3.3.6 The chalcone isomerase (chi) homolog and contiguous genes of Flavonifractor spp.

3.3.7 The AHG0014 genome contains other genes predicted to be involved in Flavonoid, Flavone and Flavonol biosynthesis

3.3.8 Bioinformatics-based assessment of the prevalence of Flavonifractor spp. using shotgun metagenomics datasets

3.3.9 Cloning of the strain AHG0014 chi homologue in E. coli

3.4 Discussion

Chapter 4 Genomic and functional insights into a novel cluster of Pseudoflavonifractor sp. within Clostridium cluster IV

4.1 Introduction

4.2 Materials and Methods

4.2.1 Bacterial strains, medium and growth conditions

4.2.2 AHG0008 genomic extraction, sequencing and data assembly

4.2.3 Genome-based analyses

4.3 Results

4.3.1 Taxonomic assignment and cell morphology of strain AHG0008

4.3.2 Growth of P. capillosus strain DSMZ 23940T and strain AHG0008 in the presence of quercetin

4.3.3 Whole genome analysis of strain AHG0008

4.3.4 Predicted gene function and metabolism of Pseudoflavonifractor sp. AHG0008 and the three MAGs

4.4 Discussion

Chapter 5 General Discussion

Chapter 6 References

Chapter 7 Appendices

7.1 Media and Solutions used during thesis

7.1.1 M2GSC medium

7.1.2 Reinforced Clostridial medium (RCM)

7.1.3 Brain Heart Infusion Medium (BHI)

7.1.4 Mineral solutions

7.1.5 Resazurin Stock, Ringers solution and anaerobic glycerol stocks

7.2 Primers used during thesis

7.3 16S rRNA sequencing sample preparation

7.4 Full gel image of transconjugant isolate PCR confirmation

7.5 Preparation of Bacterial High Molecular Weight DNA

7.6 Growth curves for strains AHG0008 and AHG0014 in RCM medium

7.7 Plasmid extraction and colony confirmation of cloned vector

7.8 Autoinduction media and solutions

List of Figures

Figure 1.1 The 16S rRNA gene phylogenetic relationships of the Clostridium clusters

Figure 1.2 An overview of some key roles the commensal bacteria provide to the host

Figure 1.3 The differences between Chinese healthy subjects and those suffering from Type 2 diabetes, in terms of their “Metagenome Assembled Genomes”

Figure 1.4 An overview of the two NF-κB activation pathways

Figure 1.5 Pathway of quercetin degradation by F. plautii and Eubacterium ramulus

Figure 1.5 An overview of the possible effect(s) flavonoids impart on the GIT microbiota and epithelial cell layer in the human gastrointestinal tract

Figure 2.1 A schematic representation of the metaparental mating process

Figure 2.2 Analysis of transconjugants carrying pEHR512112 by fluorescence microscopy

Figure 2.3 Confirmation of the conjugative transfer of pEHR512112 to faecal bacteria by metaparental mating

Figure 2.4 Phylogenetic analysis of taxonomic affiliations of 22 transconjugants isolated from a pre-adolescent Australian child

Figure 2.5 Phylogenetic analysis of taxonomic affiliations of 15 transconjugant strains isolated following metaparental mating with E. coli ST18 pEHR513112

Figure 2.6 First-pass screen of culture supernatants harvested at end stage of growth for all 22 bacterial isolates

Figure 2.7 The acetate and butyrate concentrations in the spent supernatant of the 7 AHG strains predicted to be “immunosuppressive”

Figure 2.8 Effect of SCFA concentration on NF-κB activation

Figure 2.9 Plasmid curing of strain AHG0001

Figure 3.1 Pathway of quercetin degradation by F. plautii and Eubacterium ramulus

Figure 3.2 Photomicrograph of F. plautii AHG0014 following Gram staining

Figure 3.3 Phylogeny of strains AHG0008 and AHG0014 based on 16S rRNA gene sequence alignments

Figure 3.4 Growth of Flavonifractor spp. DSMZ 4000T (panel A) and strain AHG0014 (panel B) in the presence of BHI medium with or without 50 µM quercetin

Figure 3.5 Quercetin metabolism by F. plautii DSMZ 4000T

Figure 3.6 Quercetin metabolism by Flavonifractor strain AHG0014

Figure 3.7 The effects of adding 50 µM quercetin to actively growing cultures of either F. plautii DSMZ 4000T (A) or Flavonifractor strain AHG0014 (B)

Figure 3.8 Alignment of the F. plautii YL31 finished genome with the Flavonifractor sp. AHG0014 genome

Figure 3.9 Venn diagram showing the core, shared and unique genes

Figure 3.10 Alignment of the 9 Flavonifractor spp. isolate genomes

Figure 3.11 The average nucleotide identity (ANI) matrix of Flavonifractor sp. AHG0014 and F. plautii isolate genomes

Figure 3.12 The Flavonifractor spp. pangenome atlas

Figure 3.13 Pangenome and core genome development plots of Flavonifractor spp. isolates

Figure 3.14 The predicted gene functions of F. plautii strains AHG0014, YL31 and DSMZ 4000 based on COG usage categories

Figure 3.15 A comprehensive representation of the top 50 PHX genes through analysis of the AHG0014 genome

Figure 3.16 Predicted pathway of Fructose conversion by AHG0014

Figure 3.17 Alignment of the chi homologs from the Flavonifractor strains with those from Eu. ramulus

Figure 3.18 Gene organization and their predicted functions among the 9 Flavonifractor strains flanking the chalcone isomerase (chi) gene

Figure 3.19 The Flavone and Flavonol Biosynthesis pathway of strain AHG0014

Figure 3.20 Abundance of Flavonifractor spp. specific gene counts in the MetaQuery microbiome gene catalogue

Figure 3.21 MetaQuery-based analysis of the relative abundance of chi operon genes

Figure 3.22 Cloning of the chi gene into a pET28b(+) plasmid for transformation and expression in E. coli

Figure 3.23 SDS-PAGE gel confirming expression of CHI protein in E. coli BL21 and Rosetta strains

Figure 4.1 Photomicrograph of Gram-stained cells of Pseudoflavonifractor sp. AHG0008

Figure 4.2 Phylogeny of strains AHG0008 and AHG0014 based on 16S rRNA gene sequence alignments

Figure 4.3 Growth kinetics of P. capillosus DSMZ 23940T in the presence of quercetin

Figure 4.4 Growth kinetics of strain AHG0008 when cultured in the presence of quercetin

Figure 4.5 The effects of adding 50 µM quercetin to actively growing cultures of DSMZ 23940T (A) and AHG0008 (B)

Figure 4.6 Mauve alignment of the draft genome for AHG0008 with the most closely related species based on 16S rRNA gene alignments

Figure 4.7 A pangenome atlas for AHG0008, P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500 and three metagenome assembled genomes (MAGs). The pangenome atlas shows that AHG0008 and three MAGs from Browne et al. display a relatively high similarity to each other, yet they differ substantially from their closest known relatives, P. capillosus DSMZ 23940T and Pseudoflavonifractor sp. ASF500, which had been identified following 16S rRNA gene alignments

Figure 4.8 Mauve alignment of the draft genome sequence for Pseudoflavonifractor sp. AHG0008 and three metagenome assembled genomes

Figure 4.9 A pangenome atlas of AHG0008 and the three MAGs only

Figure 4.10 Reconstructed phylogenetic alignment of 16S rRNA genes for strain AHG0008 with members of Clostridium cluster IV

Figure 4.11 Whole genome phylogeny of Flavonifractor spp., Pseudoflavonifractor spp. and other Clostridium cluster IV isolates

Figure 4.12 Average Nucleotide Identity (ANI) scores of Clostridium cluster IV isolate genomes including the draft genomes for Flavonifractor sp. strain AHG0014 and Pseudoflavonifractor sp. strain AHG0008

Figure 4.13 The Core and Pan-genome development of AHG0008 and related metagenome assembled genomes

Figure 4.14 A COG usage bar chart showing the variations between strain AHG0008, the three MAGs and their closest neighbours

Figure 4.15 A comprehensive representation of the top 50 PHX genes through analysis of the AHG0008 genome

Figure 4.16 Predicted pathway of Ribose utilisation by AHG0008

Figure 4.17 Mauve alignment of AHG0008 with the newly recovered MAG: QIN326

Figure 7.1 The full gel for the confirmation of the conjugative transfer of pEHR512112 to faecal bacteria by metaparental mating

Figure 7.2 Representative gels to assess the quality of the genomic DNA extracted from AHG0008 (A) and AHG0014 (B)

Figure 7.3 Phenotypic growth analysis of Ffr. sp. AHG0014 and Pfr. sp. AHG0008 in RCM medium

Figure 7.4 The KEGG Ortholog (KO) functional categories of the draft genome of AHG0014. The KEGG database was used to assign KO descriptions to the core genes in the draft genome

Figure 7.5 Extraction of plasmid and restriction digests on colonies grown following transformation with cloned pET28b(+)CHI vector

List of Tables

Table 2.1 Colony counts from the four faecal enrichments used to check for background resistance

Table 2.2 The percent identity scores for each of the transconjugant isolates

Table 2.3 The colony counts for AHG0001 recovered following AO curing of the pEHR512112 vector

Table 3.1 Summary statistics for the Flavonifractor spp. genomes

Table 3.2 General genome features for isolate genomes DSMZ 4000T, YL31, DSMZ 6470, AHG0014, ATCC BAA442, UC5.1, VE202 and L35FAA

Table 3.3 PHX analysis for AHG0014 and three Flavonifractor spp. strains

Table 3.4 Gene counts for CAZyme families obtained through annotation of the Flavonifractor genomes by the dbCAN meta server

Table 4.1 CheckM completeness and contamination scores for the draft genome of AHG0008 and phylogenetically related strains

Table 4.2 General genome features for isolate genomes P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500, Ruminococcaceae bacterium D16 and AHG0008

Table 4.3 PHX analysis metrics for AHG0008 and the three MAGs

Table 4.4 IMG predicted metabolism from pathway assertion. The amino acid auxotrophy/prototrophy profile for the draft genome of AHG0008

Table 4.5 Gene counts for CAZyme families obtained through annotation of the AHG0008 genome and the three MAGs by the dbCAN meta server

List of Abbreviations used in the thesis

16S rRNA 16S ribosomal RNA

µg microgram

µl microliter

µM micromolar

AHG Australian Human Genetically Modified Organism strains

ANI Average Nucleotide Identity

AMPs Anti-microbial peptides

AO Acridine Orange

ATCC American Type Culture Collection

BHI Brain Heart Infusion

BLASTn Basic Local Alignment Search Tool – nucleotide

BLASTp Basic Local Alignment Search Tool – protein

bp base pair

Cat Chloramphenicol

CD Crohn’s Disease

cfu colony forming units

CIP Calf Intestinal Phosphatase

CO2 Carbon Dioxide

COG Cluster of Orthologous Groups

DSMZ Deutsche Sammlung von Mikroorganismen und Zellkulturen

DSMZ 23940T Pseudoflavonifractor capillosus

DSMZ 4000T Flavonifractor plautii

EDGAR efficient database framework for comparative genome analyses using BLAST score ratios

Erm Erythromycin

EtOH Ethanol

FBS Foetal Bovine Serum

gDNA genomic DNA

GAP glyceraldehyde 3-phosphate

GIT Gastrointestinal tract

HMP Human Microbiome Project

IBD Inflammatory Bowel Disease

IEC Intestinal Epithelial Cells

IgA Immunoglobulin A

IKKβ Inhibitor of N

IL- Interleukin-

Kb kilo bases

LB Luria Bertani

LPS lipopolysaccharide

M2SC minimal media with starch and cellobiose

MAG Metagenome Assembled Genome

MCP-1 monocyte chemoattractant protein-1

ml millilitre

mM millimolar

NCBI National Center for Biotechnology Information

NF-κB Nuclear Factor Kappa B

NG No growth

nm nanometres

OD600 Optical Density at 600 nanometres

PEP Phosphoenolpyruvate

PCR Polymerase Chain Reaction

pIgR Polymeric Ig Receptor

PPP Pentose-phosphate pathway

PTS Phosphotransferase system

RAST Rapid Annotation using Subsystems Technology

RBB+C repeated bead beating plus column

RCM Reinforced Clostridial Medium

RDP Ribosomal Database Project

RLU Relative Light Units

RPMI Roswell Park Memorial Institute Medium

SCFA Short Chain Fatty Acid

S.E.M. Standard Error of the Mean

sIgA secretory IgA

T2D Type 2 Diabetes

TKL Transketolase

TLR Toll Like Receptor

TNF-α Tumour Necrosis Factor alpha

TNTC too numerous to count

UC Ulcerative Colitis

UQDI University of Queensland Diamantina Institute

vol volume

Chapter 1 General introduction and literature review

1.1 Introduction

The colonisation of humans and other animals by microorganisms begins soon after birth, and in humans, the mode of birth determines which microbes first colonise the gastrointestinal tract (GIT) (Francino, 2018). Babies delivered vaginally rapidly develop a microbiota resembling the mother’s vaginal microbiota, whereas babies delivered via caesarean section initially possess a microbiota similar to the mother’s skin microbiota (Dominguez-Bello et al., 2010). There is also substantial evidence that the development of the GIT microbiota can be influenced by whether a baby is breast-fed or formula-fed in the first few months of life, with the GIT microbiota of breast-fed infants more closely resembling the mother’s GIT microbiota (Harmsen et al., 2000, Palmer et al., 2007, Castanys-Munoz et al., 2016). Nevertheless, by about 1-2 years of age the GIT microbiota stabilises at approximately 10^10 to 10^12 microbes per gram of contents, with the largest numbers located within the large bowel (colon) (Turnbaugh et al., 2007, Gill et al., 2006, Qin et al., 2010, Ottman et al., 2012).

In general terms, the GIT microbiota comprises bacteria, archaea, lower eukaryotes, and viruses, of which the bacteria are the most abundant and diverse. During the last decade, much effort has been directed towards using cultivation-independent approaches (e.g. sequencing techniques and 16S rRNA-targeted FISH) to define whether and how the GIT microbiota changes with gastrointestinal disease and/or inflammation. These studies have consistently reported that, in healthy adult individuals, the five most common phyla found in the large intestine are: Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria and Fusobacteria (Dicksved et al., 2009, Delgado et al., 2013, Mondot and Lepage, 2016). The Firmicutes and Bacteroidetes are the most abundant phyla present in the colonic microbiota (Tap et al., 2009, Lozupone et al., 2012). The Firmicutes phylum consists of low G + C content, Gram-positive bacteria that can be largely split into the Clostridia (predominantly Clostridium cluster IV and Clostridium cluster XIVa) and the Bacilli. The phylum Bacteroidetes is often less diverse and includes Gram-negative anaerobes such as Bacteroides, Prevotella and Porphyromonas (Eckburg et al., 2005, Tap et al., 2009, Mondot and Lepage, 2016). Collectively, most members of these five phyla are referred to as “commensal bacteria”.

As mentioned, the Firmicutes phylum contains the highly diverse Clostridia class. Clostridia mainly consist of Gram-positive, mostly anaerobic, rod-shaped spore-forming bacteria. As well as commensal bacteria, this class also contains the important human pathogens C. botulinum, C. tetani and C. difficile. Collins et al. (1994) originally split the Clostridia into ~20 clusters (Figure 1.1), with the largest cluster, containing the C. butyricum type species, being designated cluster I. Currently, this cluster is referred to as Clostridium sensu stricto, being the cluster believed to contain the true Clostridia species. Due to the extremely diverse nature of the Clostridia, the species have undergone numerous taxonomic reassignments and revisions. As stated previously, the predominant clusters in the GIT microbiota are clusters IV and XIVa. Many of these species have been shown to have beneficial effects on the host and are discussed in more detail later in this Chapter.

Figure 1.1 The 16S rRNA gene phylogenetic relationships of the Clostridium clusters. The depicted phylogeny shows representative members of the Clostridium clusters as described by Collins et al. (1994) and their relationships. Cluster I represents the Clostridium sensu stricto species containing the known human pathogens C. botulinum and C. tetani, whilst clusters IV and XIVa contain species predominant in the GIT of humans. The archaeon Methanobrevibacter smithii was used as an outgroup. Values at branches denote bootstrap analysis following 1000 iterations and the scale bar represents a sequence divergence of 5%.

As shown in Figure 1.2, the commensal microbiota perform a range of beneficial roles for the host. They ferment indigestible products, such as complex carbohydrates and dietary fibre, which helps provide the host with essential nutrients. Their presence contributes to the development of the host immune system through the induction of IgA secretion, and by helping to control inflammation and maintain homeostasis. The commensal bacteria are also able to prevent colonisation of the GIT by pathogenic bacteria by preventing adherence of invaders to the epithelium. All of these roles are vital to the symbiotic relationship between the host and the microbiota.

Figure 1.2 An overview of some key roles the commensal bacteria provide to the host, categorised according to their protective, metabolic and structural contributions. Created with BioRender, and adapted from Grenham S, Clarke G, Cryan JF and Dinan TG (2011) Brain–gut–microbe communication in health and disease. Front. Physio. 2:94. doi: 10.3389/fphys.2011.00094

1.2 Historical difficulties in studying the GIT microbiota

Despite the long-established roles outlined above, it is first important to understand what constitutes a “healthy microbiota” and how it is changed during the course of disease. This type of information provides a better understanding of how the GIT microbiota affects the maintenance or disruption of GIT homeostasis, and the consequences of that on a person’s health. It is well recognised that our understanding of the microbial world has long been hampered by the fact that only about 30% of the bacteria present within the human GIT are amenable to culture (e.g. De Cruz et al., 2012). This limits what can be elucidated about the intricacies and functions of the microbiota, because without cultured representatives our understanding of their physiology and metabolism is restricted. This constraint has been partially overcome by the development and application of nucleic acid sequencing technologies and associated bioinformatics and computational tools. These methods now allow a cultivation-independent and more holistic approach to examine both the structure and functional attributes of these communities. The primary structure of these communities has been most often described using “profiling” methods that amplify and sequence regions of the gene encoding 16S ribosomal RNA (16S rRNA or rrs gene sequencing) (e.g. Peterson et al., 2008, Morgan and Huttenhower, 2012). More recently, methods have further advanced and the production of “shotgun metagenomic” data, representing the collective genomic content of the microbes present in the sample, has become more widespread. Both methods have helped reveal “new” not-yet-cultured groups of GIT bacteria, and the shotgun metagenomic data have also shown that lateral gene transfer and DNA exchange by conjugation have played major roles in shaping the GIT microbiome (Xu et al., 2007, Broaders et al., 2013). 
The ‘GIT microbiome’ is a term recently redefined by Marchesi and Ravel (2015), which is often used when describing the entire habitat, including the microbes, their genomic content and the surrounding environmental conditions. This contrasts with the term GIT microbiota, which refers to the microbes in this ecosystem in terms of our knowledge of them derived from functional studies.

In 2008, the Human Microbiome Project (HMP) was set up to inventory the microbiome of different body sites, so as to more fully understand the functions and properties it provides to the host. One of the main aims of the HMP was to characterise the microbiome by studying samples taken from various sites of healthy volunteers. It also aimed to identify differences within the microbiome to find links and associations with various health and disease states. Finally, this project aimed to create a database of reference genomes and (meta)genomic data, which would allow for further characterisation of the individual organisms present in a healthy microbiome (Human Microbiome Project, 2012b, Human Microbiome Project, 2012a). The production of these datasets further showed that there was a paucity of cultured isolates and/or genomic data for particular microbial groupings. This led to the creation of a ‘most-wanted’ list of bacterial isolates, which would require the development of new culture isolation strategies in order to gain isolates for sequencing. These cultured and sequenced isolates could then be further assessed to determine their functional and ecological relevance to host health status (Fodor et al., 2012).

Around the same time as the initiation of the HMP, a European-led initiative called MetaHIT (Metagenomics of the Human Intestinal Tract) was also established (Ehrlich and Consortium, 2011). The approach taken by the MetaHIT consortium was complementary to the HMP because it provided a catalogue of microbial genes which are prevalent within stool samples from healthy persons, as well as persons suffering from intestinal and/or metabolic disease. Provision of these data allows a broad view of the functions which are important for microbial life in the GIT. It also showed that characterisation of the genetic potential of this complex environment is possible (Qin et al., 2010). Advances in our understanding of the human GIT microbiota arising from the MetaHIT consortium include the existence of enterotypes in the human microbiome. Enterotypes are distinct clusters of bacteria which were found across the different nationalities studied. Three distinct enterotypes were discovered, which differed only in relation to variations in particular genera: Bacteroides (enterotype 1), Prevotella (enterotype 2) and Ruminococcus (enterotype 3). The discovery of enterotypes in the human microbiome may help lead to the discovery of microbial properties which relate to host health status (Arumugam et al., 2011). Wu et al. (2011) further expanded on the concept of enterotypes by showing that long-term dietary patterns such as protein and animal fats vs carbohydrates distinguished between the Bacteroides and Prevotella enterotypes, respectively. Metagenome-wide association studies have also revealed that specific markers and genes can be used to distinguish between cases and controls in relation to diseases such as type-2 diabetes and liver cirrhosis (Qin et al., 2012, Qin et al., 2014). The specific biomarkers have no crossover between the two diseases and so may be used to detect disease in individuals (Qin et al., 2014). 
These collective studies have also resulted in the identification of “Metagenome Linkage Groups” (MLG) or “Metagenome Assembled Genomes” (MAGs): partial genomes assembled from metagenomic datasets. MAGs can represent cultured and/or taxonomically assigned isolates, as well as not-yet-cultured microbes, as represented in Figure 1.3 (Qin et al., 2012). The MAGs in this Figure are denoted as Con or T2D for the respective control and Type-2 diabetes samples from which they were recovered.

Figure 1.3 The differences between Chinese healthy subjects and those suffering from Type 2 diabetes, in terms of the “Metagenome Assembled Genomes” that distinguish diseased patients from controls. Note the identification of ‘new’ microbes (annotated as either “Con” or “T2D”), many of which represent unclassified genera (white circles). The shotgun metagenomic data were also used to infer interactions between the host and the microbiota in these subjects. Reproduced from: Qin et al., (2012) A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490 (7418): 55-60. DOI: 10.1038/nature11450, with permissions from Springer Nature and Copyright Clearance Centre under licence number: 4451410445689.

Whilst the above studies have revealed that many bacteria are currently underrepresented in a cultured form, these techniques are just tools that allow us to obtain information in the form of nucleic acid sequences. This information gives researchers insight into the detection, identification and quantification of microbes in a specific environment (16S rRNA sequencing) and provides markers for genetic potential and activity in a specific ecosystem (metagenomics and metatranscriptomics). Culture-independent sequencing techniques have proved advantageous in identifying not-yet-cultured bacteria that are specifically associated with differences in the GIT between health and disease; however, culture-based techniques also need to be considered and are discussed in more detail later in this Chapter. It is critical that new culture-based isolation techniques are developed to gain representatives of some of these not-yet-cultured bacteria. Gaining axenic representatives of these taxa will mean that they can be functionally assessed for their potential to influence host health. This can be achieved through established techniques such as the in vitro use of mammalian cell lines to assess inflammation-related markers, or through mouse models of functionally relevant human disease states (Sokol et al., 2008, Zhang et al., 2014).

1.3 The generalised concept of dysbiosis in Inflammatory Bowel Diseases

As briefly summarised above and shown in Figure 1.3, many cross-sectional and case-control studies of chronic GIT and metabolic disorders have shown alterations to the GIT microbiota, which are frequently referred to as “dysbiosis”. These changes are best characterised in terms of the prokaryote members of the GIT microbiota (i.e. the bacteria and archaea). Many Inflammatory Bowel Disease (IBD)-related microbiome studies have described a shift in overall species richness in the stool microbiome, with a marked reduction in the abundance of the Firmicutes phylum and, in particular, the Clostridium clusters IV and XIVa (Frank et al., 2007). Another representative bacterial species shown to be reduced in early case-control studies of IBD patients was Faecalibacterium prausnitzii (Sokol et al., 2008, Sokol et al., 2009, Martinez-Medina et al., 2006), which is a member of Clostridium cluster IV. In addition, several recent longitudinal studies of IBD patients have shown that the natural restoration of F. prausnitzii to the GIT is associated with longer term remission and a reduction in relapse rate (Sokol et al., 2008, De Cruz et al., 2012, Varela et al., 2013). Along with these alterations there are also increases in sub-populations of the Bacteroidetes phylum, with some of the major species that increase in abundance being affiliated with Bacteroides fragilis and Bacteroides vulgatus (Takaishi et al., 2008). There are also overrepresentations of the phyla Proteobacteria and Fusobacteria in IBD patients (Frank et al., 2007, Strauss et al., 2011). Interestingly, and as outlined in some detail later, some of these bacteria are capable of producing “anti-inflammatory factors”, suggesting a protective role; others are now referred to as “pathobionts” (Chow et al., 2011), suggesting their role in disease pathogenesis is in response to environmental triggers, which are still largely unknown.

These alterations are now widely acknowledged to occur in many types of GIT disorders and diseases where inflammation may be subclinical or clinical. It should also be noted that medications have been shown to alter the microbiome of patients with inflammatory disorders (Andrews et al., 2011). However, it is still difficult to differentiate whether these changes are a cause or a consequence of the disease. Therefore, being able to investigate and understand a broader diversity of commensal bacteria, and the beneficial or detrimental role(s) they are playing, is important when trying to fully understand the key drivers of disease and to seek innovative approaches to treat and/or prevent IBD.

1.4 Culture-based approaches to study the GIT microbiota

As outlined above, there is an increasing need to further dissect the members of the GIT microbiota to better understand their functional roles. However, efforts to isolate and culture the more fastidious microbes of the human GIT have not kept pace with the use of, and reliance on, DNA sequencing-based approaches. These techniques do provide ample information without the need to isolate and culture individual species, but I believe that culture-dependent approaches will provide the best opportunity to characterise and dissect the functionalities provided by specific bacteria, and in doing so, relate these attributes to health and disease.

In general, the majority of GIT microbes are obligate anaerobes, meaning that they require an oxygen-free environment for growth to proceed. Many of the anaerobic culture techniques used in laboratories world-wide were originally developed by Hungate and colleagues (Eller et al., 1971, Macy et al., 1972, Balch et al., 1979). Since then, culturing and sustaining strictly anaerobic isolates has been greatly advanced through the development and use of anaerobic chambers. These both reduce the risk of oxygen contamination in cultures and allow for the use of many standard microbiological techniques, such as plate streaking and plate spreading, to isolate and target specific bacterial species. Nevertheless, difficulties still arise when trying to culture GIT microbes in these environments. In particular, the human GIT provides specific niches which can be difficult to replicate within the laboratory. This, combined with the fact that many of the nutritional requirements for some “not-yet-cultured” species are still unknown, requires the development of habitat-simulating media and/or new enrichment and isolation strategies. For instance, media supplementation with clarified rumen fluid or faecal extracts, and other types of selective media based on specific substrates, has supported the isolation of bacteria that might otherwise be outgrown by numerically more abundant species with broad utilization profiles (Eller et al., 1971, Barcenilla et al., 2000, McSweeney et al., 2005, Livingston et al., 1978, Isenberg et al., 1970, Ferraris et al., 2010). More recently, efforts have been made to isolate more of the less-dominant populations of bacteria by targeting bacteria on the basis of their capacity to stimulate a specific host response or withstand environmental stress. For instance, Atarashi et al. (2013) targeted and isolated human faecal bacteria capable of inducing regulatory T-cells in mice, whilst Browne et al. (2016) showed the recovery of novel families, genera and species through targeting bacteria that were capable of sporulation. Techniques and robotic workflows that can screen large numbers of microbes under anaerobic conditions have also been fruitful. Stevenson et al. (2004) developed the “plate wash PCR” approach to recover uncultured isolates from agricultural soil samples and the guts of wood-feeding termites. Plate wash PCR has since been used to successfully isolate a Lachnospiraceae bacterium which inhibits Clostridium difficile colonisation in the mouse GIT (Reeves et al., 2012) and has also been adapted by Ma et al. (2014) to isolate human GIT-affiliated representatives of the HMP’s most-wanted taxa. Raoult and colleagues have also developed a “culturomics robotic workflow” which has expanded the representation of culturable isolates from the most numerically predominant phyla (Lagier et al., 2012, Dubourg et al., 2013).

In summary, there have been meaningful steps taken to increase the efficiency of microbial isolation, thereby broadening the representation of GIT microbes available as axenic cultures. However, one key factor that has not been taken advantage of in the aforementioned studies is lateral gene transfer and DNA conjugation, which have been shown to play a major role in shaping the microbiota (Xu et al., 2007, Broaders et al., 2013). To that end, Ó Cuív et al. (2015) developed an innovative technique, termed metaparental mating, to rapidly isolate genetically tractable representatives of Firmicutes-affiliated GIT bacteria. The technique takes advantage of DNA conjugation through the transfer of plasmids from E. coli donor strain(s) to promiscuous members of diverse bacterial consortia derived from real-world samples, e.g. faeces. A key aim of my thesis research was to further improve the efficiency and utility of this method to recover bacteria taxonomically affiliated with Clostridium Clusters IV and XIVa, with a view to defining more of the genetics and molecular biology behind their interaction(s) with their host, and more specifically, within the context of inflammatory bowel diseases.

1.5 Host-microbe interactions mediating changes in intestinal barrier function during IBD

Gaining cultured representatives of the microbial species which inhabit the human GIT will prove invaluable in developing our understanding of their roles in host health. However, to fully understand the nature of these roles, consideration needs to be given to the complexities of the host-microbiota interactions taking place in the GIT. Our understanding of these interactions has been greatly advanced by studies using germ-free (GF) animals, which are microbiologically naïve, and gnotobiotic animals. It has been known since the 1960s that the microbiota influences intestinal barrier function, when it was discovered that the turnover of intestinal epithelial cells (IECs) was much slower in GF animals than in conventionally reared animals (Abrams et al., 1963). More recently, studies have shown that Toll-like receptor (TLR) recognition of the microbiota is essential for the induction of epithelial cell proliferation. This increased proliferation is required to repair the epithelial layer following injury by inflammation (Rakoff-Nahoum et al., 2004). Paneth cells, secretory cells located at the base of the crypts in the small intestinal epithelium, help to maintain intestinal homeostasis by sensing GIT bacteria via TLRs. This microbial detection then triggers the expression of anti-microbial peptides (AMPs), which helps prevent the traversing of the intestinal barrier by commensal and pathogenic bacteria (Vaishnava et al., 2008). In addition, specialised epithelial cells known as goblet cells produce the mucus that covers the villi throughout the small intestine (Ermund et al., 2013, Jakobsson et al., 2015). The mucus layer here is not covalently attached to the GIT epithelium, and thus allows some penetration of bacteria, but most bacteria are unable to make contact with the epithelium due to the presence of the Paneth cell secreted AMPs (Vaishnava et al., 2011).

One of the hallmark features of IBD flares is the reduction of mucin secretion, due to a decrease in goblet cell numbers, which leads to a thinning of the mucus layer (Dorofeyev et al., 2013). Mice that lack the muc-2 gene also produce less mucin and spontaneously develop colitis (Van der Sluis et al., 2006). Therefore, any reduction in mucin thickness increases the likelihood of bacteria reaching the epithelium where they may promote inflammation. Paneth cell dysfunction may also lead to inflammation, potentially due to a loss of production of AMPs. Some studies have shown that reduced Paneth cell production of alpha-defensins leads to a loss of AMP efficiency at the epithelial layer and so exacerbates ileal Crohn’s disease (CD) (Wehkamp et al., 2005). The lack of defensin production has been linked to mutations in the nod2 gene, which is expressed in Paneth cells (Wehkamp et al., 2004). In summary, the combined production of mucins and AMPs by various cell types of the GIT epithelium helps to “restrain” microbes, and by doing so, helps to prevent the host immune system from “overreacting” to commensal microbes, thus preventing inflammatory responses.

In addition to goblet and Paneth cells, the host immune system also influences the composition and function of the microbiota. This is achieved by a combination of innate and adaptive immune responses, with IgA being a major player. Secretory IgA (sIgA) is produced by specialised B-cells, and docks with a transmembrane epithelial protein known as the Polymeric Ig Receptor (pIgR). Here the IgA molecules form large polymers that are secreted into the lumen and mucosal layers (Mostov, 1994). Although sIgA has long been recognised as playing an important role in protection against infection by preventing the attachment of intestinal pathogens to the epithelial wall and cellular invasion (Mantis and Forbes, 2010), sIgA may also recognize and “coat” commensal bacteria (van der Waaij et al., 1996). That sIgA is produced in response to the presence of commensal bacteria is supported by the finding that GF mice produce lesser amounts of sIgA (Fagarasan et al., 2010). Interestingly, mouse models deficient in the B-cells responsible for sIgA production show a reduction in the diversity of GIT bacteria (Shulzhenko et al., 2011). This reduction in bacterial diversity may be due to more bacterial species being able to traverse the epithelial layer and so no longer being confined to the GIT. Additionally, mice which are deficient in activation induced cytidine deaminase (AID-/- mice), a protein involved in antibody diversification, cannot export IgA across the epithelial barrier. This also leads to reduced diversity of bacterial communities, with the same being true of mice lacking pIgR (Johansen et al., 1999, Fagarasan et al., 2002). For these reasons, sIgA plays a significant role in both the protection and expansion of the diversity and balance of intestinal bacteria (Peterson et al., 2007). Palm et al. (2014) recently described a method that combines bacterial cell sorting based on sIgA coating with 16S rRNA gene amplicon sequencing. 
They showed there is strain variation in terms of the level of IgA coating, and that isolates recovered from highly coated groups can drive a stronger colitic response in mouse models. There is also evidence to suggest that IgA binding by some GIT bacteria may alter their gene expression, which might influence their metabolism and persistence within the GIT (Peterson et al., 2007). Taken together, these observations suggest that under specific conditions some bacteria alter their behaviour to promote inflammation and/or act as pathogens. Such bacteria are now referred to as pathobionts, and are discussed in more detail as part of the next section.

1.6 In vitro characterisation of GIT microbiota function

Another key pathway through which microbially-derived factors may affect GIT homeostasis is via an inflammatory response coordinated by the Nuclear Factor-kappa B (NF-κB) complex. The central importance of NF-κB as a key regulator of a cell’s response to various forms of stress, such as various types of microbially-derived stimuli, is illustrated in Figure 1.4. In brief, and as shown in Figure 1.4, the NF-κB complex is sequestered in the cytoplasm by inhibitory kappa B-α protein (IκB-α). Following the binding of a ligand (e.g. LPS, TNF) to specific receptors such as TLRs, antigen receptors or cytokine receptors, the IκB-α Kinase (IKK) complex is activated. The IKK complex is composed of two subunits (IKKα and IKKβ) and a regulatory protein (NF-κB essential modulator, NEMO). The activated IKK complex catalyses the phosphorylation and degradation of IκB-α. This then allows the free NF-κB complex to translocate to the nucleus, where it can interact with regulatory elements in promoters encoding inflammatory genes (e.g. TNF-α, IL-6 and IL-1β). As such, the research with F. prausnitzii by Sokol, Langella and colleagues has confirmed that at least some GIT bacteria mediate their anti-inflammatory effects via suppression of the NF-κB directed activation of immune genes (Sokol et al., 2008, Miquel et al., 2015, Quevrain et al., 2016). This may be mediated via interrupting receptor activation of the signalling pathway described above, or via some other more direct interaction with, and suppression of NF-κB complex release and/or its translocation to the nucleus.

Figure 1.4 An overview of the two NF-κB activation pathways leading to transcription of genes important in inflammatory responses. Created with BioRender and adapted from: Gerondakis, S, Fulford, T. S., Messina, N.L., and Grumont, R. J., (2014) NF-kappaB control of T cell development. Nat Immunol 15(1): 15-25. DOI: 10.1038/ni.2785 with permissions from Springer Nature and Copyright Clearance Centre under licence number: 4451410087634.

Several studies have now used immortalised cell lines and monitoring of NF-κB regulated gene expression to screen candidate strains of bacteria, and extracts thereof, for their immunomodulatory effects. In some cell lines, such as RAW264.7 murine macrophage cells, a reporter gene has been fused with an NF-κB responsive promoter to simplify assay procedures. HT-29 human colonic cells and Caco-2 human intestinal epithelial cells are also used, with cytokines such as IL-8 and TNF-α measured to examine the immunomodulatory response (Jung et al., 1995). Using this approach, inhibition of NF-κB pathways by inducing the nuclear export of complexes formed between NF-κB and peroxisome proliferator-activated receptor-γ (PPAR-γ) was observed in studies of Bacteroides thetaiotaomicron (Kelly et al., 2004). There have also been observations of manipulation of the ubiquitination pathway upstream of IκB-α, in studies of Lactobacillus casei (Tien et al., 2006). Both these studies are examples of how attenuation of cytokine gene expression, and the inflammatory response, can be mediated at different levels of this important signalling pathway.

Sokol, Langella and colleagues expanded the concept of the production of bioactives by commensal bacteria to include those bacteria that have been shown in clinical studies to be lost during periods of active inflammation in the GIT (Sokol et al., 2009). They showed that both whole cells and fluids recovered from axenic cultures of F. prausnitzii possess anti-inflammatory effects which could not be ascribed to butyrate or other SCFAs (Sokol et al., 2008). Over the last decade, these researchers have pursued the source and origin of these anti-inflammatory factors, which has led to their discovery of a collection of peptides referred to as microbial anti-inflammatory molecules (MAM), derived from the mam gene (Quevrain et al., 2016). These peptides were first shown to possess anti-inflammatory effects by the blocking of NF-κB activation and IL-8 production in epithelial cells (Sokol et al., 2008). They were also shown to greatly reduce colitis in mice (Martin et al., 2014) and to suppress IL-17 levels in rat models of colorectal colitis (Zhang et al., 2014). In addition, two other studies by Kaci et al. (2011, 2014) showed that the filter-sterilised fluid from Streptococcus salivarius cultures could inhibit NF-κB pathways in HT-29 cell lines in a dose-dependent manner. The two most efficient strains of S. salivarius were shown to produce a bioactive metabolite of <3 kDa which downregulated proinflammatory IL-8 secretion. Lakhdari et al. (2010) have described the establishment of a robust, high-throughput screening strategy for a fosmid clone library of metagenomic DNA. This screening strategy was able to identify a clone of DNA believed to be of Bacteroides origin, which was capable of stimulating NF-κB reporter gene activity, suggesting the gene(s) encoded may drive a pro-inflammatory response. Lakhdari et al. (2011) later examined NF-κB modulation by filter-sterilized culture fluids from a number of different GIT commensal bacteria using a similar approach, this time using either IEC cell lines (HT-29 and Caco-2) or the monocyte-like THP-1 cell line. Each cell line contained an NF-κB responsive promoter fused with a reporter gene. Here, they reported that Bacteroides uniformis and Clostridium sardiniense were “pro-inflammatory”. Interestingly, the culture fluids from other bacteria such as Blautia coccoides and Bifidobacterium longum produced variable effects, i.e. were suppressive in one cell line but stimulatory in another (Lakhdari et al., 2011). More recently, Ó Cuív et al. (2017) described an ATP binding cassette (ABC) export system and lipoprotein in two strains of B. vulgatus (ATCC 8482 and PC510) which showed significant sequence similarity to the CD-derived metagenomic fosmid clone identified by Lakhdari et al. (2010). Furthermore, they showed that the ABC export system was highly enriched in CD subjects and that both B. vulgatus strains were able to activate NF-κB in a growth phase- and strain-dependent manner using a HT-29/κb-seap-25 enterocyte reporter cell line (Ó Cuív et al., 2017).

In summary, these types of in vitro assays have proven to be powerful for screening candidate microbes, and/or clone libraries of metagenomic DNA, for genes encoding immunomodulatory activities. In particular, these assays have provided an insight that at least some of the bacteria known to be lost from the GIT as a consequence of dysbiosis produce anti-inflammatory factors. However, our understanding of the diversity of immunomodulatory bioactives produced by GIT commensal bacteria remains constrained by the lack of cultured isolates representing those taxa most likely to drive GIT homeostasis from a microbial perspective, and in particular members of Clostridium Clusters IV and XIVa.


1.7 Dietary polyphenols and their impacts on the GIT physiology and composition of the microbiota

The above sections have focused on host-microbe interactions and their impacts on health and disease. However, it is widely accepted that diet also plays a major role in shaping the structure-function relationships inherent to the GIT microbiota, and that dietary-based manipulations of the microbiota can lead to benefits in the treatment of a range of diseases (Jeffery and O'Toole, 2013, Murtaza et al., 2017). It is therefore important that dietary compounds are also considered when trying to further dissect the complexities of the GIT microbiota and host-mediated changes in GIT function. In that regard, there are a wide variety of foods and beverages that contain polyphenols, from fruits and vegetables, to herbs and whole grains, through to coffee, teas and wine (Vinson et al., 2001). Many polyphenols are covalently attached to carbohydrates and/or organic acids, and the polyphenols can be divided into two major groups: flavonoids and non-flavonoids. The flavonoids have a primary structure of two benzene rings connected through a heterocyclic pyrone C-ring. Non-flavonoids are a more diverse range of compounds, ranging from simple benzoic acid structures to more complex stilbenes, lignans and gallotannins. The generalized structural features of flavonoids and non-flavonoids have been extensively reviewed (Selma et al., 2009, Ozdal et al., 2016). This broad and chemically diverse group of phytochemicals has long been viewed as promoting beneficial effects on GIT function; for example as anti-cancer, anti-microbial and anti-inflammatory bioactives (Scalbert et al., 2005). Some studies suggest that only ~5% of the polyphenols ingested are actually absorbed in the small intestine; the remainder reach the colon, where they can accumulate to millimolar concentrations (Faria et al., 2014, Ozdal et al., 2016). As such, some studies have examined how flavonoids, and in particular quercetin, might directly affect host epithelial cell biology.
For instance, several studies have investigated the effect of quercetin on human Caco-2 cells (Amsheh et al., 2008, Suzuki et al., 2009, Carrasco-Pozo et al., 2013, Valenzano et al., 2015) and there appears to be some strengthening of the tight junctions between these cells, reflected in an increase in transepithelial electrical resistance, as well as decreases in mannitol permeability (e.g. Carrasco-Pozo et al., 2013 and Valenzano et al., 2015). Damiano et al. (2018) recently reported that quercetin was able to increase the expression of muc-2 and muc-5AC, which are the genes encoding the main pathways of mucin production by intestinal goblet cells. Taken together, these findings suggest that quercetin not only directly promotes tight barrier junctions between epithelial cells, but also stimulates mucin secretion, both of which are key aspects of sustained GIT homeostasis.


Flavonoid polyphenols have also been proposed to provide prebiotic effects on the GIT microbiota. In other words, polyphenols may elicit selective pressures that influence the composition of the GIT microbiota and, by increasing the abundance of specific commensal/beneficial bacteria, reduce the ability of potentially pathogenic bacteria to colonise the GIT. Some studies have examined how flavonoid polyphenols result in beneficial alterations to the GIT microbiota (Clavel et al., 2005, Tzounis et al., 2008, Hidalgo et al., 2012, Etxeberria et al., 2015a). Clavel et al. (2005) observed that isoflavone supplementation of postmenopausal women was associated with the enrichment of “beneficial” bacteria such as the F. prausnitzii subgroup. Using a batch fermentation model of the distal colon, Tzounis et al. (2008) found that the addition of (+)-catechin to the fermenters resulted in significant increases of Bifidobacterium spp., commensal Escherichia coli, and the Clostridium coccoides-Eubacterium rectale group, whilst inhibiting the growth of pathogenic Clostridium histolyticum. Similar findings were reported by Hidalgo et al. (2012) in response to an intervention with anthocyanidins, which also significantly enhanced the growth of Bifidobacterium and Lactobacillus-Enterococcus spp. The Firmicutes:Bacteroidetes ratio in mice was altered when quercetin was added to a high-fat, high-sugar diet, and in this same study, there were specific reductions in the abundance of Eubacterium cylindroides and spp., which allowed the authors to confirm that both these taxa are implicated in diet-induced obesity, as had been previously described (Ley et al., 2006, Turnbaugh et al., 2008, Etxeberria et al., 2015a, Etxeberria et al., 2015b).

Studies such as those highlighted above suggest that flavonoid polyphenols not only appear to elicit changes to epithelial cell biology and compositional changes to the GIT microbiota, but that these changes are also implicated in altered nutrient metabolism and diet-mediated adiposity. Despite these considerations, how polyphenols are actually metabolized by the GIT microbiota remains poorly understood. Most of the bacteria known to be capable of flavonoid metabolism belong to various families of the Firmicutes phylum, with Eubacterium ramulus and Flavonifractor plautii being the best recognised (Schneider and Blaut, 2000, Schoefer et al., 2003). The genetics and biochemistry of flavonoid metabolism have been best studied for Eu. ramulus (Schoefer et al., 2002, Schoefer et al., 2004, Herles et al., 2004, Elsinghorst et al., 2011, Gall et al., 2014, Thomsen et al., 2015, Braune et al., 2016). Most of these studies have used quercetin as the “model” flavonoid, and its degradation involves an enoate reductase (Gall et al., 2014) and chalcone isomerase (Gall et al., 2014, Braune et al., 2016). A proposed pathway for quercetin degradation was identified by Schoefer et al. (2003) and is shown in Figure 1.5. Quercetin was shown to be first reduced to taxifolin, followed by enzymatic conversion by chalcone isomerase into alphitonin. The five-membered ring of alphitonin is then hydrolysed, leading to the formation of phloroglucinol and 3,4-dihydroxyphenylacetic acid. Phloroglucinol can be further converted into the SCFAs acetate and propionate (Schoefer et al., 2003). On the other hand, F. plautii was proposed by Carlier et al. (2010) to represent a novel bacterial genus, based on the phylogenetic and biochemical assessment of strains initially assigned to either the Clostridium orbiscindens or Eubacterium plautii lineages. In these early descriptions, C. orbiscindens was identified and named for its ability to cleave the flavonoid C-ring (Winter et al., 1991), and further studies with presumptive F. plautii strains have confirmed their collective abilities to metabolise quercetin, neohesperidin, (+)-catechin and (-)-epicatechin (Schoefer et al., 2003, Braune et al., 2005, Kutschera et al., 2011, Takagaki et al., 2014, Takagaki and Nanjo, 2015). Interestingly, dietary supplementation with flavonoids was associated with an expansion of the Eu. ramulus population in the human GIT some time ago (Simmering et al., 2002), but there appears to be relatively little understanding of, and only scant studies reporting on, the prevalence and abundance of F. plautii in the human GIT, or the impacts of diet on these bacteria. As such, the F. plautii lineage represents a potentially important but understudied bacterial species resident in the human GIT. A brief overview of the potential effects of polyphenols is depicted in Figure 1.6.

Figure 1.5 Pathway of quercetin degradation by F. plautii and Eubacterium ramulus. An initial reduction of the double bond at the 2,3 position results in the formation of taxifolin. A chalcone isomerase enzyme then contracts the C-ring of taxifolin to form alphitonin. Finally, the five-membered ring of alphitonin is opened through a hydrolysis step to form phloroglucinol and 3,4-dihydroxyphenylacetic acid. Adapted from Schoefer, L., et al., Anaerobic degradation of flavonoids by Clostridium orbiscindens. Appl Environ Microbiol, 2003. 69(10): p. 5849-54.


Figure 1.6 An overview of the possible effect(s) flavonoids impart on the GIT microbiota and epithelial cell layer in the human gastrointestinal tract. The main aspects of flavonoid interactions depicted here are: i) the direct effects of parent flavonoids on microbial composition; ii) the enhancement of tight barrier junctions and attenuation of pro-inflammatory responses by intestinal epithelial cells and; iii) the production of phenolic acids and other metabolites following microbial metabolism of flavonoids. These byproducts of flavonoid metabolism may be more (or less) efficacious in eliciting local and/or systemic effects than the parent compounds. Created with BioRender.


In summation, the intake of flavonoid polyphenols has been associated with a broad spectrum of positive and negative effects on both the host and the GIT microbiota. As such, the availability and efficacy of polyphenols might well be influenced by microbial metabolism, both in a positive (augmentative) and deleterious (degradative) manner. Perhaps then, it is not surprising that many of the reputed benefits associated with the ingestion of polyphenols remain anecdotal in nature and may well depend upon the nature of an individual’s GIT microbiota in relation to their impact(s) on specific aspects of GIT function, homeostasis, and health. For instance, although some strains of F. plautii do appear to metabolize a broad range of flavonoids, to my knowledge, there has been no genome-wide assessment of Flavonifractor and related bacteria. As such, the ecological and functional impacts of this group of commensal GIT bacteria are understudied but, given their likely involvement with flavonoid metabolism and/or bioavailability in the GIT, need greater attention.


1.8 Summary and research aims

The GIT microbiota has now been thoroughly examined by various cultivation-independent methods, and this has resulted in new insights that advance our understanding of the roles played by the commensal bacteria in health and disease. First, a “dysbiosis”, which can be described as a microbiota profile that is different from those encountered in healthy persons, is a hallmark of many case vs control studies. Subsequently, the depletion or loss of specific species has now been linked to certain disease states, e.g. a reduced abundance of F. prausnitzii in Crohn’s disease patients. Second, these “omic” approaches have revealed there are still plenty of “new” microbes that are yet to be cultured and, furthermore, that some of these are favoured by a healthy GIT, whereas others are more prevalent with disease. I believe the isolation of these “new” microbes will advance our understanding of microbial physiology and can also be translated into new opportunities for the improved diagnosis and treatment of disease.

My PhD thesis has focussed on advancing the use of methods developed in our lab to isolate genetically tractable strains of bacteria assigned to Clostridium Clusters IV and XIVa, and by genomic and culture-based analyses provide a better understanding of how these “new” bacteria might interact with the host and/or with specific dietary components. To that end, the following Chapters of my thesis describe my efforts and findings in relation to:

1) The isolation of “new” genetically tractable GIT bacteria and my initial assessment of their capacity to produce immunomodulatory bioactives.

2) An assessment of quercetin metabolism and its impact on the growth of two “new” bacteria assigned to Flavonifractor and Pseudoflavonifractor spp.

3) The use of various bioinformatics workflows and packages to examine the diversity of these new bacteria, and of Flavonifractor and Pseudoflavonifractor spp. more broadly, to advance our understanding of these understudied bacterial lineages.

4) An assessment of the prevalence of Flavonifractor spp. phylogenetic and functional genes in metagenomics datasets, to reveal new insights into their ecological and/or functional contributions to the GIT microbiota in health and disease.


Chapter 2 Isolation of ‘new’ genetically tractable gut bacteria using metaparental mating and their potential role in human health

2.1 Introduction

As I outlined in Chapter 1, our understanding of the role the GIT microbiota plays in health and disease has been greatly enhanced by the establishment of studies such as the Human Microbiome Project (HMP) (Human Microbiome Project, 2012a, Human Microbiome Project, 2012b) and MetaHIT (Ehrlich and Consortium, 2011). These types of studies used (meta)genomic based techniques to gain information on the structural composition and functional capacity of the microbial communities residing on and within the human body. For example, the production of a reference genome database and (meta)genomic data from the HMP revealed the lack of cultured isolates in collections worldwide, leading to the development of a “most-wanted” list of bacterial isolates (Fodor et al., 2012). Other metagenome-wide association studies have resulted in the identification of “Metagenome-Assembled Genomes” (MAGs): partial genomes assembled from metagenomic data that represent not-yet-cultured microbes (Qin et al., 2012). These studies have justified the need to further isolate “new” microbes, providing reference genomes for both the “most-wanted” taxa and the “MAGs”. By doing so, valuable information should be realised concerning their functional roles in the host, what nutrients and compounds they use, what metabolites they produce, and how their colonization and persistence may affect host health and disease states. Scientific, medical and commercial value will be forthcoming to those groups that recover “new” isolates and assess their functional capacity in ways that directly link genes to function.

One of the limitations that has restricted the development of our understanding of the roles specific microbes play within the gastrointestinal tract (GIT) is that only a limited number of microbes are known to be amenable to genetic manipulation. Ó Cuív et al. (2015) reported the development of a new technique, metaparental mating, which allows for the rapid targeted isolation of genetically tractable microbes from diverse microbial communities. The technique employs the conjugative transfer of a plasmid vector from an Escherichia coli donor strain to permissive strains within complex microbial communities such as those present in faeces, as depicted in Figure 2.1A. In this Chapter, I describe my efforts to advance the development of this technique via the utilisation of a plasmid vector constructed to contain a flavin mononucleotide-based reporter gene (evoglow-C-Bs2), which allows the identification of transconjugant bacteria by a combination of antibiotic resistance and fluorescence (Figure 2.1B). My use of metaparental mating and these vectors resulted in my isolation of a collection of “new” genetically competent GIT bacteria affiliated with the Firmicutes phylum, with some of them confirmed as representative of the HMP’s “most-wanted” taxa (Fodor et al., 2012, Ó Cuív et al., 2015). I also present my efforts to establish a rapid screening approach to assess the immunomodulatory capacity of these isolates, and report the findings arising from these assays.

2.2 Materials and Methods

2.2.1 Bacterial isolations: Metaparental mating

The metaparental mating method is shown in Figure 2.1 and the approaches used for vector construction are described in detail by Ó Cuív et al. (2015). All the isolates referred to in this Chapter were recovered from a faecal sample collected from a healthy pre-adolescent child via Metro Hospital South human research ethics (HREC) approval HREC/13/MHS/27, kindly provided by Dr Emma Hamilton-Williams (UQDI). Approximately 0.15 g of faecal matter was resuspended in 3 ml of sterile, 80% (v/v) anaerobic glycerol solution and stored at -80 °C until used. The sample was thawed and 100 µl was used to inoculate 10 ml of pre-reduced and anaerobically sterilised M2SC medium dispensed into Hungate tubes (Macy et al., 1972, Ó Cuív et al., 2015). The inoculated tubes were laid flat within a small box and incubated at 37 °C with agitation at 60 rpm for 24 hours. Aliquots (3 ml) of the enrichment culture were then added to 3 ml of sterile anaerobically prepared glycerol solution and stored at -80 °C. The recipient cultures for the metaparental matings were then produced by using a 50 µl volume of the enrichment sample to inoculate 10 ml of M2SC medium, and the cultures were incubated as described above.

In preliminary experiments, samples of these overnight faecal enrichment cultures were subjected to ten-fold serial dilution and 100 µl of the 10⁻² through 10⁻⁶ dilutions were plated onto M2SC agar medium or the same medium prepared to contain either 100 µg/ml Erm or 10 µg/ml chloramphenicol. Based on the results shown in Table 2.1, I then chose to perform metaparental matings using two different E. coli ST18 donor strains bearing different plasmids. The first matings were performed using E. coli ST18 bearing pEHR512112, which is a modification of pEHR522121 and encodes both the erythromycin (Erm) resistance and evoglow-C-Bs2 genes, as described by Ó Cuív et al. (2015). The donor strain was cultured using 10 ml of LB broth supplemented with 100 µg/ml δ-aminolevulinic acid and 100 µg/ml Erm, and incubated at 37 °C overnight. The cells were harvested by centrifugation and washed three times with sterile diluent (anaerobically prepared Ringer’s solution, please see Appendix 6 for recipe). The OD600 of the donor and recipient cultures was then measured prior to their transfer into the COY anaerobic chamber. To initiate the matings, volumes of the recipient and donor cultures were mixed together within a 2 ml screw capped tube to produce a 4:1 (donor:recipient) ratio based on their OD600 values. As controls, similar volumes of the donor and recipient cells alone were also transferred to individual tubes. All these mixtures were centrifuged at 13000 rpm for 2 minutes within the anaerobic chamber, resuspended in a minimal volume of residual buffer, and then spotted onto nylon filters that were carefully placed onto plastic petri dishes containing antibiotic-free M2SC agar medium. The second round of matings used E. coli ST18 cells bearing the pEHR513112 vector, which contains the chloramphenicol resistance and evoglow-C-Bs2 genes.
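The OD600-based mixing step can be illustrated numerically. The following Python snippet is a minimal sketch only: it assumes that OD600 is proportional to cell density in both cultures (which may not hold for a mixed faecal enrichment), and the function name and the example OD600 readings are hypothetical rather than taken from the protocol.

```python
def donor_volume_ml(od_donor, od_recipient, recipient_volume_ml, ratio=4.0):
    """Return the donor culture volume (ml) giving a ratio:1 donor:recipient mix.

    Assumption: OD600 x volume is used as a proxy for cell number.
    """
    if od_donor <= 0 or od_recipient <= 0:
        raise ValueError("OD600 readings must be positive")
    recipient_units = od_recipient * recipient_volume_ml  # OD600 x ml
    return ratio * recipient_units / od_donor


# Hypothetical readings: donor OD600 = 2.0, recipient OD600 = 1.0,
# and 0.2 ml of recipient culture in the mating tube.
volume = donor_volume_ml(od_donor=2.0, od_recipient=1.0, recipient_volume_ml=0.2)
print(f"Donor volume required: {volume:.2f} ml")
```

With these illustrative readings the combined volume stays well within the 2 ml tube; denser recipient cultures would require a smaller recipient volume to do so.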

Both sets of mating mixtures, as well as separate samples of the donor and recipient mixtures alone, were subjected to a ten-fold serial dilution as described above, and 100 µl aliquots of each dilution were spread onto M2SC agar plates prepared with the appropriate antibiotic. The plates were incubated anaerobically at 37 °C for one week.


Figure 2.1 A schematic representation of the metaparental mating process. (A) The E. coli ST18 based donor strain is mixed with a faecal enrichment culture, spotted on a nylon filter, then overlaid onto solid medium. Candidate transconjugants are recovered by plating onto medium containing 100 µg/ml erythromycin and checked by fluorescence microscopy. Both individual donor and recipient cultures are processed in the same way as controls. (B) The modular components of the pEHR512112 plasmid vector, showing the flavin mononucleotide-based reporter gene (evoglow-C-Bs2) inserted prior to the E. coli replicon module, followed by the Erm antibiotic resistance marker, origin of transfer, and non-E. coli replicon module. (C) The modular components of the pEHR513112 plasmid vector, showing the evoglow-C-Bs2 gene inserted prior to the E. coli replicon module, followed by the Cat antibiotic resistance marker, origin of transfer, and non-E. coli replicon module. Adapted from: Ó Cuív et al. (2015) (with permissions under creative commons license CC BY 4.0).


2.2.2 Transconjugant confirmation: microscopy and PCR screening

Candidate transconjugants were recovered from the selective plates described above and purified through three rounds of streaking for individual colonies on modified Reinforced Clostridial agar medium (RCM, please see Appendix 6 for recipe). An Olympus BX 63 microscope fitted with a DP80 camera, Xcite LED light source and fluorescence filter cube U-FBN (excitation 470–495 nm, emission 510 nm) was used to screen colonies for the fluorescence conferred by the evoglow-C-Bs2 gene present in the transconjugants containing the pEHR512112 vector (Figure 2.1). Images of fluorescent cells were captured using the Olympus cellSens modular imaging software platform and processed using the ImageJ software package (http://imagej.nih.gov/ij/). Colonies identified as fluorescent following microscopic examination were then screened for the presence of the plasmid vector by PCR analysis. Forward and reverse primers (sequences in Appendix 6) targeted the vector region spanning the erythromycin resistance gene (ermF) and the oriT module, respectively, which should produce an amplicon of ~1500 bp from pEHR512112. The PCR reactions were set up as follows: 1 µl of DNA template, 500 nM of each primer, 10 µM each of the deoxynucleotide triphosphates (dNTPs), 0.02 U Phusion DNA polymerase and 1x HF buffer, topped up to a total reaction volume of 20 µl with nuclease-free water. PCR reactions were performed with 98 °C for 1 min, followed by 30 cycles of 98 °C for 30 s, 60 °C for 30 s and 72 °C for 1 min, and 1 cycle of 72 °C for 5 mins. Agarose gel electrophoresis was then used to visualise amplicons to confirm vector presence.

2.2.3 Transconjugant identification: 16S rRNA sequencing and phylogenetic analysis

Identification of the transconjugants was performed by DNA sequencing of their 16S rRNA genes, which were amplified by PCR using the 27F and 1492R primers (Ó Cuív et al., 2011). The PCR amplified products were first purified of excess nucleotides and reagents using Exonuclease I and Calf Intestinal Phosphatase as per the steps in Appendix 6. Next, the purified products were prepared for Sanger sequencing using BigDye™ reagents, as per the manufacturer’s instructions (see protocol in Appendix 6). The products were sequenced using an Applied Biosystems 3130xl Genetic Analyser located within the University of Queensland’s Centre for Clinical Genomics at the Translational Research Institute. The sequence reads were trimmed and assembled using SeqTrace software (Stucky, 2012) to produce near-full length 16S rRNA gene sequences, which were then compared to the aligned sequences available on the RDP-database. The sequences were then aligned using the online software SILVA (Pruesse et al., 2012, https://www.arb-silva.de/aligner/) and analysed through MEGA 7 (Kumar et al., 2016) to produce phylogenetic trees with their nearest neighbour type strains, and the stability of the phylogenetic tree was evaluated by 1000 bootstrap replications and Kimura 2-parameter modelling (Kimura, 1980).
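For readers unfamiliar with the Kimura 2-parameter (K2P) model, the distance it assigns to a pair of aligned sequences is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. The Python snippet below is a minimal sketch of that formula only; it is not the MEGA 7 implementation, the function name and example sequences are hypothetical, and skipping any site with a gap or ambiguous base is an assumption.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper().replace("U", "T"), seq2.upper().replace("U", "T")):
        if a not in BASES or b not in BASES:
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical 10-base alignment with a single A<->G transition.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```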

2.2.4 Plasmid curing of strain AHG0001

Strain AHG0001 was resuscitated from glycerol stocks using Brain Heart Infusion (BHI, Difco™) agar plates containing 100 µg/ml Erm. A single colony was aseptically transferred into 10 ml of 0.5x BHI broth medium containing 100 µg/ml Erm, and incubated at 37 °C overnight. Then 100 µl aliquots of this culture were used to inoculate 10 ml of BHI broth prepared to contain 0, 2, 4, 6, 8 or 10 µg/ml acridine orange (AO). These cultures were incubated overnight at 37 °C with gentle agitation (60 rpm). Then, 100 µl of the culture with the highest AO concentration that still permitted bacterial growth was transferred to 10 ml of fresh BHI broth and incubated overnight at 37 °C. The resulting culture was sampled and used to produce a ten-fold dilution series, with 100 µl aliquots spread onto BHI agar plates prepared with or without 100 µg/ml erythromycin. The plates were incubated anaerobically until colonies could be observed, and then colonies from the BHI only plates were replica plated onto BHI-Erm agar plates to identify those strains that no longer contained the plasmid (indicated by no growth on these plates). The candidate plasmid-cured strains were then picked from the master plate and cultured using 10 ml of BHI broth, and then stored as anaerobic glycerol stocks at -80 °C, as described above.

2.2.5 RAW 264.7 macrophage cell culturing and NF-κB activation assays

Prior to all assays, RAW 264.7 macrophage cells expressing the ELAM luciferase reporter gene for NF-κB activation (Hume et al., 2001) were cultured in RPMI medium supplemented with heat-inactivated foetal bovine serum (10% vol/vol), penicillin-streptomycin (1% vol/vol) and L-glutamine (1% vol/vol) to support continuous growth of the cells between experiments. For each experiment, macrophage cells were harvested and transferred into a 50 ml falcon tube and centrifuged at 400 x g for 5 mins at room temperature. The supernatant was then removed, and the cells resuspended in 10 ml of fresh medium. An aliquot (10 µl) of this cell preparation was mixed with 10 µl of trypan blue to provide a count of viable cells using a hemocytometer. Based on these counts, the volume of the cell culture was adjusted to provide a final concentration of 4 x 10⁶ cells/ml.
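The cell-count adjustment can be sketched as a short calculation. The Python snippet below is illustrative only: it assumes a standard hemocytometer in which each large counting square holds 10⁻⁴ ml, uses the two-fold dilution implied by mixing 10 µl of cells with 10 µl of trypan blue, and uses hypothetical counts and function names.

```python
def viable_cells_per_ml(counts_per_square, dilution_factor=2.0, square_volume_ml=1e-4):
    """Viable cells/ml from unstained-cell counts in hemocytometer squares.

    Assumptions: each counted square holds 1e-4 ml, and the sample was
    diluted 1:1 with trypan blue (dilution factor 2).
    """
    if not counts_per_square:
        raise ValueError("at least one square must be counted")
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor / square_volume_ml


def final_volume_ml(cells_per_ml, current_volume_ml, target_cells_per_ml=4e6):
    """Volume in which the same cells give the target concentration."""
    return cells_per_ml * current_volume_ml / target_cells_per_ml


# Hypothetical counts from four large squares.
density = viable_cells_per_ml([100, 110, 90, 100])
print(f"{density:.2e} viable cells/ml")
print(f"Resuspend in {final_volume_ml(density, 10.0):.1f} ml for 4 x 10^6 cells/ml")
```

If the calculated final volume is smaller than the current volume, the cells would need to be pelleted again and resuspended in the smaller volume rather than diluted.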

Aliquots (100 µl) of these macrophage cell preparations were transferred to individual wells of a 96-well microtiter plate and then incubated at 37 °C in a 5% CO2 atmosphere for 30 mins. The cells were then challenged with 20 µl of filter-sterilised culture supernatant harvested from end-phase cultures of my transconjugant isolates. In parallel, assays were also set up as described above but with 20 µl of a 20 ng/ml preparation of lipopolysaccharide (LPS), purified from E. coli O111:B4, which was added 30 mins after adding the culture supernatants. The positive control for these assays consisted of the addition of 20 µl of 20 ng/ml LPS alone, and the negative controls consisted of 20 µl of 20 ng/ml LPS with either culture supernatant from Faecalibacterium prausnitzii A2-165 (which was also grown in RCM), 10 mM butyrate, or the NF-κB inhibitor Bay 11-7082. The cells were then incubated for 4 hours before the assay was terminated by adding 50 µl of 1.25X Passive lysis buffer (Promega). The amount of luciferase activity produced was measured following the addition of 30 µl of Luciferin substrate (Steady-Glo Luciferase assay system; Promega) to each well. Luciferase activity was quantified by measuring the luminescence produced using a FLUOstar Optima Luminometer, reading each well for 1 second with an Emission filter gain of 4095 (BMG Labtech, Mornington, VIC, Australia). All these assays were performed in triplicate.

Aliquots of supernatants for each of the potential suppressive isolates were stored at -80 °C for Gas Chromatography analysis of the Volatile Fatty Acids (VFAs). The VFAs were measured by adding 80 µl of orthophosphoric acid/internal standard solution (20% metaphosphoric acid/0.24% 4-methyl valeric acid) to 800 µl of each supernatant sample and mixing thoroughly. The concentrations of acetate and butyrate were determined as previously described, with 4-methyl valerate used as an internal standard (Playne, 1985).

To assess assay quality, a Z-factor was calculated using the following equation:

$$Z\text{-factor} = 1 - \frac{3\sigma_{\text{+ve control}} + 3\sigma_{\text{-ve control}}}{\left|\mu_{\text{+ve control}} - \mu_{\text{-ve control}}\right|}$$

This calculation compares the ability of the positive control (20 ng/ml of LPS) and the negative control (20 µM Bay 11-7082) to stimulate NF-κB. An assay which has a Z-factor of ≥ 0.5 is considered an excellent assay following published criteria for high throughput screening assays (Zhang et al., 1999). To rank isolates and determine the best ‘hits’ with respect to their NF-κB suppressive abilities, the Z-score was calculated according to the following equation (Brideau et al., 2003, Malo et al., 2006):

$$Z\text{-score} = \frac{\mu_{\text{sample}} - \mu_{\text{RCM control}}}{\sigma_{\text{RCM control}}}$$
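The two statistics described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the analysis code used for the thesis: the function names and luminescence readings are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def z_factor(positive, negative):
    """Z-factor: 1 - (3*sd_pos + 3*sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical")
    return 1 - (3 * stdev(positive) + 3 * stdev(negative)) / separation


def z_score(sample, reference):
    """Z-score of a sample relative to the reference (RCM control) wells."""
    return (mean(sample) - mean(reference)) / stdev(reference)


# Hypothetical triplicate luminescence readings (arbitrary units).
lps_only = [100, 110, 90]   # positive control: LPS alone
inhibitor = [10, 11, 9]     # negative control: LPS + Bay 11-7082
rcm_control = [100, 110, 90]
isolate = [40, 50, 60]

print(f"Z-factor: {z_factor(lps_only, inhibitor):.3f}")
print(f"Z-score:  {z_score(isolate, rcm_control):.1f}")
```

A strongly negative Z-score in this sketch corresponds to a supernatant that suppresses the reporter signal relative to the medium-only control.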


2.3 Results

2.3.1 The recipient cultures showed variable responses to antibiotics

Table 2.1 shows the results of the initial tests of the recipient cultures for both total counts and background resistance to erythromycin (Erm) and chloramphenicol (Cat). As expected, the cell densities of the recipient mixtures exceeded 10⁸ cfu/ml and were considered suitable for use in the metaparental matings. The effects of the two antibiotics on these faecal enrichments were quite different. Even though the donor had reported not using antibiotics in the three months prior to collection, the prevalence of Erm resistant bacteria appeared to be very high, whereas Cat resistance was much less prevalent. Based on these results, I chose to use enrichment samples H1974 and H1644 based on their apparent sensitivity to Erm and Cat, respectively. For the metaparental mating performed using E. coli ST18 pEHR512112 and sample H1974, the 10⁻⁵ dilution resulted in 272 ErmR colonies, and 32 of these were chosen for further characterisation. For the second metaparental mating using E. coli ST18 pEHR513112 and sample H1644, the 10⁻² dilution produced 139 CatR colonies, and 15 of these were selected for further characterisation.


Table 2.1 Colony counts from the four faecal enrichments used to check for background resistance to the antibiotics erythromycin and chloramphenicol, which would be used to select for the plasmids introduced during metaparental mating. TNTC = too numerous to count, NG = no growth.

Faecal enrichment    Dilution factor
sample no.           10⁻²    10⁻³    10⁻⁴    10⁻⁵    10⁻⁶    cfu/ml
No antibiotics
  H1974              TNTC    TNTC    TNTC    TNTC    TNTC    TNTC
  H1644              TNTC    TNTC    TNTC    TNTC    TNTC    TNTC
  H1702              TNTC    TNTC    TNTC    652     105     1.05 x 10⁹
  H1588              TNTC    TNTC    TNTC    338     44      4.4 x 10⁸
Erythromycin
  H1974              TNTC    TNTC    112     82      11      1.1 x 10⁸
  H1644              TNTC    TNTC    TNTC    72      12      1.2 x 10⁸
  H1702              TNTC    TNTC    200     31      6       6.0 x 10⁷
  H1588              TNTC    TNTC    TNTC    90      61      6.1 x 10⁸
Chloramphenicol
  H1974              NG      NG      NG      NG      NG      0
  H1644              NG      NG      NG      NG      NG      0
  H1702              10      NG      NG      NG      NG      1.0 x 10⁴
  H1588              TNTC    21      18      NG      NG      1.8 x 10⁶
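The cfu/ml values in Table 2.1 appear consistent with the usual plate-count calculation, i.e. the colony count divided by the product of the dilution and the volume plated (100 µl, as stated in the Methods), applied to the highest dilution that gave colonies. The Python snippet below is a minimal sketch of that calculation; the function name is hypothetical and the choice of which dilution to count is inferred from the table rather than stated in the text.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml=0.1):
    """Colony-forming units per ml of the undiluted culture.

    colonies: colonies counted on the plate
    dilution: dilution plated, e.g. 1e-6 for the 10^-6 dilution
    plated_volume_ml: volume spread on the plate (0.1 ml = 100 µl)
    """
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * plated_volume_ml)


# Sample H1702 without antibiotics: 105 colonies at the 10^-6 dilution.
print(f"{cfu_per_ml(105, 1e-6):.2e} cfu/ml")
# Sample H1588 on chloramphenicol: 18 colonies at the 10^-4 dilution.
print(f"{cfu_per_ml(18, 1e-4):.2e} cfu/ml")
```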


2.3.2 Isolation and phylogenetic analyses of transconjugant strains

From the 32 ErmR colonies selected from the pEHR512112 matings, 22 were subsequently confirmed as fluorescent, and Figure 2.2 depicts representative images of the transconjugant strains. The presence of the pEHR512112 vector in all these isolates was also confirmed by PCR amplification, as described in the Materials and Methods, and a representative image of these results is shown in Figure 2.3.

The taxonomic assignment and phylogenetic placement of these 22 pEHR512112 transconjugant strains are shown in Figure 2.4 and their percent identity scores are shown in Table 2.2. The majority of these new isolates (15/22) were assigned to Clostridium cluster XIVa. Strains AHG0002 and 0004, as well as AHG0001 and 0021, were most closely affiliated with the Clostridium bolteae and Clostridium citroniae type strains (~98-99% identity scores). Interestingly, strains AHG0009, 0010, 0015, 0016, 0019 and 0020 form a distinct cluster with Clostridium clostridioforme, with ~99% identity between each of these isolates. Strains AHG0022 and 0023 were also closely related to this cluster as well as to Clostridium symbiosum, and the remaining 3 isolates (AHG0005, 0011 and 0018) were most closely related to the Clostridium aldenense type strain. Of the remaining 7 strains, 3 were assigned to Clostridium Cluster IV, and 4 to Clostridium Cluster XVIII (Figure 2.4). Strain AHG0017 formed a distinct and deep branch between the Faecalibacterium prausnitzii and Flavonifractor plautii type strains, while strains AHG0008 and 0014 were most closely affiliated with the F. plautii type strain, with sequence identity scores of 95% and 99%, respectively. Of the strains assigned to Cluster XVIII, strain AHG0012 was determined to be most closely related to Clostridium ramosum, whilst strains AHG0003, 0006 and 0007 formed a deeper branching cluster most closely affiliated with C. ramosum.

In contrast, the selected transconjugant strains resulting from the metaparental mating of E. coli ST18 pEHR513112 with the faecal enrichment H1644 produced a much narrower range of bacterial diversity, which is shown in Figure 2.5. Remarkably, 13/15 strains were assigned to the genus Enterococcus, and more specifically to E. durans, E. faecalis and E. faecium, and the remaining 2 strains were both closely related to Clostridium innocuum.

In summary, these metaparental matings differed only in the vector used, and thereby in the antibiotic selection imposed on the recipient stool bacteria, but produced starkly different results in terms of the biodiversity recovered. Because the pEHR512112 transconjugants were representatives of the major Clostridium Clusters commonly present in the human GIT, and formed a much more diverse collection of isolates, I have chosen to focus on these strains in the following sections and Chapters.


Figure 2.2 Analysis of transconjugants carrying pEHR512112 by fluorescence microscopy. The transconjugants were visualised using an Olympus BX 63 microscope. Images were captured using the Olympus cellSens modular imaging software platform and processed using the ImageJ software package. A scale bar of 10 µm is included for reference. Adapted from: Ó Cuív et al. (2015) (with permissions under creative commons license CC BY 4.0).


Figure 2.3: Confirmation of the conjugative transfer of pEHR512112 to faecal bacteria by metaparental mating. A) The Hyperladder™ 1kb Plus DNA marker used to assess the band size of amplified PCR products. B) DNA extracted from select transconjugant strains was used as the template for PCR amplification of the ermF-oriT region, as outlined in the Materials and Methods. The resulting amplification products were subjected to agarose gel electrophoresis (0.7% w/v) and the amplicons visualised by GelRed staining. Lane 1 shows the Hyperladder™ 1kb Plus DNA marker, lane 2 shows the amplicon produced from E. coli ST18 pEHR512112, and lanes 3 through 6 show the results for AHG0001, AHG0008, AHG0014 and AHG0003, respectively. The last lane shows the results produced using DNA extracted from E. coli ST18 bearing no plasmid. The gel image has been modified to show the transconjugants present and the full image can be found in Section 7.4 of Chapter 7.


[Figure 2.4 tree image. Tip labels, by cluster. Cluster XIVa: AHG0001, AHG0002, AHG0004, AHG0005, AHG0009, AHG0010, AHG0011, AHG0015, AHG0016, AHG0018, AHG0019, AHG0020, AHG0021, AHG0022, AHG0023; Clostridium bolteae (T) 16351 (AJ508452), Clostridium citroniae (T) RMA 16102 (DQ279737), Clostridium clostridioforme (T) ATCC 25537 (M59089), Clostridium symbiosum (T) ATCC 14940 (M59112), Clostridium aldenense (T) RMA 9741 (DQ279736), Clostridium indolis (T) DSMZ 755 (Y18184), Clostridium hathewayi (T) DSMZ 13479 (AJ311620). Cluster IV: AHG0008, AHG0014, AHG0017; Faecalibacterium prausnitzii (T) ATCC 27768 (AJ413954), Flavonifractor plautii (T) CCUG 28093 ATCC 29863 (AY724678). Cluster XVIII: AHG0003, AHG0006, AHG0007, AHG0012; Clostridium ramosum (T) DSM 1402 (X73440). Out-group: Methanobrevibacter smithii (T) PS (U55233). Scale bar: 0.050.]

Figure 2.4 Phylogenetic analysis of taxonomic affiliations of 22 transconjugants isolated from a pre-adolescent Australian child. Transconjugants are shown in comparison with known type strains from Clostridium clusters IV, XIVa and XVIII in bold. The archaeon Methanobrevibacter smithii was used as an out-group. Values at branches denote bootstrap analysis following 1000 iterations and the scale bar represents a sequence divergence of 5%. Parentheses designate accession numbers for the sequences used to build the phylogenetic tree.


Table 2.2 The percent identity scores for each of the transconjugant isolates in comparison with their nearest related isolates. Percent identity scores were calculated using BLASTn analysis of the full length 16S rRNA gene sequence recovered from each of the 22 transconjugant isolates.

Transconjugant   Closest Match                     Identity score   Clostridium cluster
AHG0001          C. bolteae                        99.17%           XIVa
AHG0002          C. citroniae                      99.56%           XIVa
AHG0003          C. ramosum                        98.57%           XVIII
AHG0004          C. citroniae                      99.85%           XIVa
AHG0005          C. aldenense                      99.54%           XIVa
AHG0006          C. ramosum                        90.73%           XVIII
AHG0007          C. ramosum                        99.50%           XVIII
AHG0008          Pseudoflavonifractor capillosus   94.86%           IV
AHG0009          C. clostridioforme                98.96%           XIVa
AHG0010          C. clostridioforme                98.57%           XIVa
AHG0011          C. aldenense                      99.20%           XIVa
AHG0012          C. ramosum                        99.86%           XVIII
AHG0014          Flavonifractor plautii            99.77%           IV
AHG0015          C. clostridioforme                99.00%           XIVa
AHG0016          C. clostridioforme                98.85%           XIVa
AHG0017          Eubacterium limosum               82.33%           XV (Collins et al., 1994)
AHG0018          C. aldenense                      96.34%           XIVa
AHG0019          C. clostridioforme                99.05%           XIVa
AHG0020          C. clostridioforme                99.00%           XIVa
AHG0021          C. citroniae                      99.09%           XIVa
AHG0022          C. symbiosum                      98.12%           XIVa
AHG0023          C. symbiosum                      98.12%           XIVa
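As an illustration of how an identity score of the kind listed in Table 2.2 is derived, the following minimal sketch computes percent identity as the number of identical positions divided by the alignment length, which is how BLAST-style identity is usually reported. The function name and the toy sequences are hypothetical and are not taken from the thesis; the actual scores were obtained from BLASTn and not from this code.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-').

    Identity = identical aligned positions / alignment length * 100.
    Positions where either sequence has a gap never count as matches.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Hypothetical toy alignment (not real 16S rRNA gene data): one mismatch in ten positions.
toy_query = "ACGTACGTAC"
toy_subject = "ACGTTCGTAC"
print(f"{percent_identity(toy_query, toy_subject):.2f}%")
```

For a full length 16S rRNA gene alignment of roughly 1,500 positions, a single mismatch changes the score by less than 0.1%, which is why the scores in Table 2.2 are reported to two decimal places.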


[Figure 2.5 tree image. Tip labels: transconjugants CatpMPM 01, 02, 03, 04, 05, 06, 07, 09, 10, 11, 12, 13, 14, 15 and 16; Enterococcus durans (T) DSM20633 (AJ276354), Enterococcus faecium (T) LMG 11423 (AJ301830), Enterococcus faecalis (T) JCM 5803 (AB012212), Enterococcus casseliflavus (T) (AF039903), Enterococcus gallinarum (T) (AF039900), Clostridium innocuum Ulm 12 (DQ440561). Out-group: Methanobrevibacter smithii (T) PS (U55233). Scale bar: 0.050.]

Figure 2.5 Phylogenetic analysis of taxonomic affiliations of 15 transconjugant strains isolated following metaparental mating with E. coli ST18 pEHR513112, encoding chloramphenicol resistance. The phylogenetic placement of the transconjugant strains is shown in comparison with known type strains from the genus Enterococcus, with the archaeon M. smithii used as the out-group. Values at branches denote bootstrap analysis following 1000 iterations and only bootstrap values of greater than 50% are shown. The scale bar represents a sequence divergence of 5% and the accession numbers for these reference strains are provided in parentheses.


2.3.3 Primary screening of pEHR512112 transconjugant bacteria for the production of immunomodulatory bioactive compounds

The immunomodulatory effects attributable to the addition of sterile spent culture fluids to RAW264.7 macrophage cells are shown in Figure 2.6. I calculated Z scores for each strain as described in the Materials and Methods, and have plotted these relative to the effects from the addition of LPS plus sterile RCM medium to RAW264.7 macrophage cells (Z score = 0). Using this approach, the sterile spent culture fluids from 11/22 AHG strains were found to produce negative Z scores and 11/22 produced positive scores. In comparison with the Z score calculated from the effect on NF-κB activation when the culture supernatant of F. prausnitzii strain A2-165 was used in these same assays (-0.62), 7 AHG strains were found to produce a more negative Z score, indicating that these strains produce a stronger immunosuppressive effect than F. prausnitzii. Of the 11 AHG isolates shown to produce a positive Z score in these assays, 3 strains resulted in values approaching 2.0, and are thereby indicative of strains that produce factors with a pro-inflammatory effect on RAW264.7 macrophage cells. Taken together, these results show that the metaparental mating with pEHR512112 not only recovered a broad diversity of Gram-positive Firmicutes, but also an assortment of isolates capable of eliciting a spectrum of immunomodulatory effects.
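To make the scoring concrete, the sketch below shows one way a Z score centred on the LPS plus sterile RCM control (Z score = 0) could be computed from luciferase readings. The exact formula is given in the Materials and Methods, which are not reproduced here, so this is an assumption: each reading is centred on the mean of the control wells and scaled by the sample standard deviation of all screened readings. The function name, strain labels and RLU values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(strain_rlu: dict[str, float], control_rlu: list[float]) -> dict[str, float]:
    """Z score per strain, centred on the control mean (assumed scheme).

    strain_rlu  : luciferase signal (RLU) per strain supernatant
    control_rlu : replicate RLU readings for the LPS + sterile medium control
    """
    if len(strain_rlu) < 2:
        raise ValueError("need at least two strains to estimate the spread")
    centre = mean(control_rlu)                 # the control defines Z = 0
    spread = stdev(list(strain_rlu.values()))  # sample SD across the screen
    if spread == 0:
        raise ValueError("all readings are identical; Z score is undefined")
    return {name: (rlu - centre) / spread for name, rlu in strain_rlu.items()}


# Hypothetical readings, for illustration only.
readings = {"strain_A": 4000.0, "strain_B": 6000.0, "strain_C": 8000.0}
control = [6000.0, 6000.0]
for name, z in z_scores(readings, control).items():
    label = "suppressive" if z < 0 else "stimulatory or neutral"
    print(f"{name}: Z = {z:+.2f} ({label})")
```

Under this scheme a negative Z score means less NF-κB reporter signal than the control, which matches the interpretation of negative scores as immunosuppressive used in this section.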

Based on these findings, I chose to focus on the 7 AHG strains considered to be “immunosuppressive” and determine whether this effect could be attributed to their production and release of SCFA, and in particular acetate and butyrate, into the culture medium during growth. I first measured acetate and butyrate concentrations in the spent culture fluids from all 7 isolates in comparison to either sterile RCM, or the spent culture fluids from F. prausnitzii A2-165. The results of these assays show that acetate was the predominant SCFA produced by all the AHG isolates, with measurable but much smaller (and variable) final concentrations of butyrate also found (Figure 2.7). In comparison, the sterile RCM medium possesses a small initial amount of acetate and a very small concentration of butyrate, which as expected supports the growth of F. prausnitzii and its formation of butyrate from acetate during active growth. Based on these results, I chose to add acetate and butyrate directly to RAW264.7 cells to assess how acetate and/or butyrate might affect NF-κB activation in response to LPS. The results of these assays are shown in Figure 2.8 and support the contention that the concentrations of acetate and butyrate added via sterile RCM medium, the cell culture supernatants prepared from the AHG isolates, or from F. prausnitzii A2-165 are not sufficient to drive the immunosuppressive effects on RAW264.7 macrophage cells. Furthermore, the threshold concentrations of acetate and butyrate giving rise to a suppression of NF-κB transcriptional activity were 16 mM and 8 mM, respectively, both of which are well in excess of the concentrations produced by the AHG strains (Figure 2.7). Based on these results, I conclude that the apparent anti-inflammatory effects produced by these isolates are not explained by their production of SCFA.


Figure 2.6 First-pass screen of culture supernatants harvested at end stage of growth for all 22 bacterial isolates. The calculated Z score when RAW 264.7 cells are first stimulated with 20 ng/µl LPS and then incubated with the 22 cell-free culture supernatants (blue circles). The results of these assays suggest that 7/22 isolates are capable of suppressing NF-κB activation to a similar, if not greater, extent than culture supernatant taken from F. prausnitzii strain A2-165, currently considered to be the “gold standard” anti-inflammatory bacterium (yellow circle). The red circle indicates the NF-κB activation of RAW 264.7 cells incubated with 20 ng/µl of LPS and sterile RCM medium (positive control for NF-κB activation) and the green circle indicates RAW 264.7 cells when incubated with 20 ng/µl of LPS and the NF-κB inhibitor BAY 11-7082 (negative control for NF-κB activation).


Figure 2.7 The acetate and butyrate concentrations in the spent supernatant of the 7 AHG strains predicted to be “immunosuppressive”. Shown are the acetate (A) and butyrate (B) concentrations in the spent supernatant of the 7 bacterial isolates which appeared to have the greatest inhibitory effect on NF-κB activation in the RAW cells. Acetate (A) was the predominant SCFA produced by the 7 AHG isolates and is much higher in the AHG isolates compared to RCM medium only and F. prausnitzii A2-165. The concentrations of butyrate (B) detected were much lower for the 7 AHG strains than for F. prausnitzii A2-165 and appear to be more variable than the concentrations of acetate produced by the 7 strains.


Figure 2.8 Effect of SCFA concentration on NF-κB activation. Butyrate has an apparent dose-dependent inhibitory effect on NF-κB activation in RAW 264.7 cells, whereas acetate was unable to inhibit NF-κB activation at any concentration used in this assay.


2.3.4 Plasmid curing experiments with AHG0001

As noted above, strain AHG0001 is closely related to C. bolteae, and was found to show robust and consistent growth using a variety of anaerobic media. Based on these observations, I selected this strain as a candidate “host strain” that could be used in any later studies using techniques in bacterial genetics (e.g. gene knock-out or knock-in). This would first require curing strain AHG0001 of the pEHR512112 vector, using acridine orange (AO). Table 2.3 shows that strain AHG0001 was still able to grow in the presence of 4 µg/ml AO, and so this culture was used to produce a 10-fold dilution series, which was plated onto BHI agar plates with and without Erm. There were 97 colonies recovered on the BHI plates as compared to 9 colonies recovered on Erm selective plates, suggesting that ~90% of the colonies had been cured of the pEHR512112 vector. I then picked 10 candidate cured colonies (from BHI plates) and 10 colonies from the Erm selective plates to assess for fluorescence, and when loss of fluorescence was confirmed (Figure 2.9) the cured strain was stocked in 3 ml of anaerobic glycerol.

Table 2.3 The colony counts for AHG0001 recovered following AO curing of the pEHR512112 vector. A ten-fold dilution series of AHG0001 cultures from 4 µg/ml AO cultures was plated onto BHI agar medium with and without 100 µg/ml Erm. TNTC = too numerous to count, NG = no growth.

Treatment     Medium      10⁻¹   10⁻²   10⁻³   10⁻⁴   cfu/ml
No AO         BHI         TNTC   TNTC   TNTC   85     8.5 x 10⁶
No AO         BHI (Erm)   TNTC   TNTC   TNTC   50     5.0 x 10⁶
4 µg/ml AO    BHI         TNTC   TNTC   324    97     9.7 x 10⁶
4 µg/ml AO    BHI (Erm)   TNTC   200    33     9      9.0 x 10⁶


[Figure 2.9 image panels. Columns: Brightfield, Merged, Fluorescence. Rows: AHG0001: pEHR512112; AHG0001: cured.]

Figure 2.9 Plasmid curing of strain AHG0001. Fluorescence microscopy images showing the parent AHG0001 pEHR512112 strain and a derivative cured of the pEHR512112 vector following acridine orange treatment, based on the strain no longer being fluorescent or ErmR.


2.4 Discussion

There have been some recent efforts to recover either culturable strains, or metagenome assembled genomes (MAGs), of bacteria included in the HMP’s “most-wanted” list. Ma et al. (2014) utilised a gene-targeted microfluidic cultivation method to successfully recover a representative of a previously uncultured member of the Ruminococcaceae family within Clostridium cluster IV. Almeida et al. (2016) were able to produce ~200 nearly complete genome sequences, of which over half were linked to members of the “high-priority” category of HMP most-wanted taxa. Jeraldo et al. (2016) were able to reconstruct the genome of a novel butyrate-producing strain from metagenomic sequence data. This species is also a member of Clostridium cluster IV, and phylogenetically is most closely related to Eubacterium desmolans and Butyricicoccus pullicaecorum. In this Chapter, I have first described my utilisation of the pEHR512112 and pEHR513112 vectors to isolate genetically tractable bacteria affiliated with the Firmicutes phylum. The metaparental mating approach has the added advantage of isolating genetically tractable strains, which are therefore open to techniques in bacterial genetics to assess their functional properties in relation to host health and disease. In that context, the second part of this Chapter describes my development and use of an assay to examine these strains for their immunomodulatory potential, with a view to identifying new strains capable of eliciting an “immunosuppressive” response through the production of bioactive metabolites.

I believe my results show that metaparental mating and selection using the conjugative transfer of an antibiotic resistance gene is necessary, but not sufficient, for the efficient screening of recipient cultures for transconjugant strains. By using a vector encoding ErmR and evoglow-C-Bs2 fluorescence, I produced a collection of bacteria representing a broad diversity of human GIT bacteria affiliated with Clostridium clusters IV, XIVa and XVIII, even in the (unexpected) presence of a high background of ErmR (Table 2.1). This high background of ErmR in the faecal samples donated for my studies warranted my use of the construct containing the evoglow-C-Bs2 gene as an additional selective marker. Selecting colonies from Erm selective plates that displayed fluorescent cells allowed me to reduce the number of “false positives” due to recovery of GIT bacterial strains that are naturally resistant to antibiotics such as Erm. Seville et al. (2009) reported that there are three Erm resistance genes which are commonly present in metagenomic data collected from human oral and faecal samples, which could explain the background resistance noted in my observations. Furthermore, de Vries et al. (2011) used metagenomic fosmid libraries to determine the diversity of tetracycline resistance (TcR) genes and the bacteria carrying these genes in a mother and her infant child. They determined that tetO, tetW and tetX are carried by a range of bacterial families from the anaerobic Firmicutes and Bacteroidetes phyla in the mother, with tetO and tetW only being detected in the uncloned DNA from the infant faecal sample, suggesting transfer of resistance from the mother to her infant. In contrast, the tetM and tetL resistance genes were found to be solely carried by streptococci in the infant library, potentially suggesting a role for the transfer of skin or oral bacteria in the spread of antibiotic resistance from mother to child. Finally, the authors identified a novel transposon conveying TcR and ErmR that belongs to a family of broad host range conjugative elements, which the authors state could lead to the joint spread of TcR and ErmR in the infant GIT. Indeed, studies such as these support the increasing evidence that the human GIT is a reservoir of bacteria resistant to antibiotics (Modi et al., 2014). Future studies should aim to further profile the microbiota through the use of metagenomic techniques to allow for a greater understanding of the levels of background resistance to antibiotics (reviewed in detail by van Schaik, 2015).

Another interesting finding from my studies is that, in addition to Cat exerting a stronger selective pressure on my faecal enrichment recipient cultures than Erm (Table 2.1), the metaparental matings with the pEHR513112 vector only afforded the recovery of a much narrower diversity of GIT bacteria, restricted to the genus Enterococcus and the Erysipelotrichaceae family. There is an increasing body of research pointing to reservoirs of resistance in the GIT and yet “the human GIT resistome” has a surprising lack of Cat resistance genes reported (Modi et al., 2014, van Schaik, 2015). Cat is not often used in GIT-related infections/disorders, with its more noted uses being for the treatment of meningitis and infections caused by vancomycin-resistant (VanR) Enterococcus spp. (Sills and Boenning, 1999, Ricaurte et al., 2001, Lautenbach et al., 2004). The combined lack of use and lack of resistance genes encoded in the GIT resistome is reflected in the stronger selective pressure described for my enrichment cultures (Table 2.1). Based on my results, it would appear that the use of “uncommon” antibiotics like chloramphenicol may restrict the biodiversity recovered using metaparental mating, whereas selection based on the more commonly used antibiotics is potentially subject to a high background of “false positives”. In future studies, I recommend that more commonly used antibiotics such as Erm be used in conjunction with the evoglow-C-Bs2 gene when researchers require a broad range of recovered isolates. There are also advantages in using a less common antibiotic such as Cat when a less diverse selection of isolates is required. I also believe that screening faecal samples for background resistance to antibiotics is a must and will allow researchers to better tailor the vector constructs to the recipient samples.

Ó Cuív et al. (2015) also reported that the range of isolates recovered can be augmented by simply changing the medium used to plate out mating mixtures. Furthermore, combining this isolation method with studies which have shown the recovery of “most-wanted” genomes, such as those of Almeida et al. (2016) and Jeraldo et al. (2016), could allow us to selectively target these “most-wanted” isolates using different media, thereby gaining cultured representatives. This again would allow us to combine genomic analysis with functional assessments to further expand on how these newly recovered isolates may be important to either host health or disease states.

The goal for my metaparental matings was to create a collection of genetically tractable strains of Firmicutes lineages, which are now recognised to be reservoirs of “anti-inflammatory factors” such as those recently identified and characterised for F. prausnitzii (Miquel et al., 2015, Quevrain et al., 2016) as well as other Clostridia (Atarashi et al., 2011, Atarashi et al., 2013, Narushima et al., 2014). Therefore, I decided to develop an assay to determine whether any of my recovered isolates had the capacity to modulate NF-κB-directed transcription of genes encoding inflammatory pathways. NF-κB is a master regulator of a cell’s response to a variety of stressors, such as various types of microbially-derived stimuli. Normally, the NF-κB complex is sequestered in the cytoplasm by the inhibitory kappa B-α protein (IκB-α). When a ligand, for instance microbially-derived LPS, is bound by specific receptors such as Toll-Like Receptors (TLRs), antigen or cytokine receptors, activation of the IκB-α Kinase (IKK) complex occurs. The IKK complex is composed of the NF-κB essential modulator (NEMO) regulatory protein and two subunits (IKKα and IKKβ). The activated IKK complex catalyses the phosphorylation and degradation of IκB-α, which allows the free NF-κB complex to translocate to the nucleus. NF-κB is then able to interact with regulatory elements in the promoters of inflammatory genes (e.g. TNF-α, IL-6 and IL-1β). Research performed using F. prausnitzii strain A2-165 has confirmed that at least some commensal bacteria mediate their anti-inflammatory effects through suppression of NF-κB directed activation of inflammatory genes (Sokol et al., 2008, Martin et al., 2014, Quevrain et al., 2016). This may be mediated via interrupting receptor activation of the signalling pathway described above, or via some other more direct interaction with, and suppression of, NF-κB complex release and/or its translocation to the nucleus.
More recent studies have now shown that these effects of F. prausnitzii A2-165 whole cells or spent culture fluids arise from the production of 7 peptides derived from a single 15 kDa protein termed microbial anti-inflammatory molecule (MAM). The MAM protein was shown to decrease the activation of NF-κB in a dose-dependent manner in epithelial cells. Other pathways involved in inflammation were unaffected by the presence of MAM, thus highlighting the specificity of its effects (Quevrain et al., 2016). Based on these collective findings, I chose to use the mouse RAW 264.7 macrophage cell line as a reporter cell line and F. prausnitzii A2-165 as the comparator strain for my assay. As shown in Figure 2.6, spent culture fluids from 7/22 pEHR512112 transconjugants appeared to produce a stronger immunosuppressive effect than F. prausnitzii A2-165. Interestingly, I was able to show that the immunostimulatory strains and immunosuppressive strains do not cluster in any particular groups, and it appears that strains within a species can have substantially differing effects on NF-κB activation in vitro. Given my collective results shown in Figure 2.7 and Figure 2.8, I also believe these immunomodulatory activities were not directly attributable to the production of acetate or butyrate by any of these strains, but instead to other factors released into the culture fluids. Propionate was not investigated using this assay because the level of propionate produced by all of the strains, when assessed, was less than 1 mM and this concentration did not suppress NF-κB during initial optimisation experiments.

Despite the opportunities and potential associated with the RAW264.7 assay I used here, there were also a number of technical inconsistencies that affected its efficacy and repeatability. In brief, I found that the values recorded for luciferase activity following NF-κB activation were highly affected by several factors. First, the different passages of cells seemed to have profound effects on the luciferase signal output for NF-κB, with initial passages producing no more than 2,000 relative light units (RLU) and later passages producing higher signals of >10,000 RLU following stimulation with LPS. These effects were not predictable either, as further recovery of cells and subsequent passages would produce the opposite effects. Second, the different types/batches of FBS used to culture the cells produced similar effects to those described above. Third, I found that LPS which had been stored at -80°C compared to LPS stored at 4°C also produced similar differences in the RLU detected. Over a period of several months I attempted to find a balance with these issues that would result in reproducible replicates for my stimulation assays. However, I found that, combined, these factors were producing unsatisfactory variability beyond my initial screens described above. Due to this, I made the decision to refocus my research on my primary interest: assessing the physiology and metabolism of GIT bacteria isolated by metaparental mating.

For the remainder of my thesis I have chosen to focus on two strains whose taxonomic assignments indicated that they clustered closely with F. plautii: strains AHG0008 and AHG0014. Strain AHG0008 is of particular interest as my closed-reference OTU picking showed this strain matches with taxa identified as being medium-priority species from the HMP’s “most-wanted” list (Fodor et al., 2012, Ó Cuív et al., 2015). In addition to very little being known about the role(s) Flavonifractor spp. and related bacteria might play as part of the human GIT microbiota, my results in Figure 2.6 show that while AHG0008 and AHG0014 are closely related phylogenetically, these strains differ dramatically in terms of their immunomodulatory potential, producing Z scores of -2.42 and 0.33, respectively. The following Chapters of my thesis will therefore focus on functional and genomic studies of these two isolates.


Chapter 3 Functional and genomic characterisation of Flavonifractor spp. with a focus on polyphenol catabolism

3.1 Introduction

Polyphenols have been shown to have a wide range of beneficial effects in health and disease (Zhang, 2015). To better understand these beneficial health effects, it is important for us to understand the interactions between polyphenols and different members of the gastrointestinal tract (GIT) microbiota. In that regard, there are a number of different bacteria which have been shown to be capable of degrading polyphenols (Reviewed by Braune and Blaut, 2016). Polyphenols are plant secondary metabolites that are sub-divided into different classes dependent on their structure (Neveu et al., 2010). The main class of plant polyphenols are the flavonoids, which include flavonols, anthocyanins and flavanols. Plants also produce non-flavonoid polyphenols such as hydroxybenzoic acids, stilbenes and ellagitannins (Cardona et al., 2013). It is estimated that the typical intake of flavonoids in a western diet is approximately 190 mg/day (Chun et al., 2007). Quercetin is a flavonol and is estimated to be the most abundant of the dietary polyphenols with an estimated daily intake of up to 30 mg/day (Terao, 2017). As quercetin and other ingested flavonoids and polyphenols pass through the GIT only a small percentage (~5%) are absorbed in the small intestine, with the remainder translocating to the large intestine, where they are thereby presented to the metabolic capacity of the GIT microbiota (Faria et al., 2014, Ozdal et al., 2016). The microbial breakdown of the flavonoids leads to alterations in their bioavailability along with changes in their biological activities (Faria et al., 2014) and as might be expected, flavonoids have also been shown to alter the composition of the GIT microbiota. Evidence exists that flavonoids can affect the GIT microbiota via the inhibition of pathogenic bacteria, as well as a prebiotic effect on beneficial bacteria (Clavel et al., 2005, Tzounis et al., 2008, Hidalgo et al., 2012, Etxeberria et al., 2015a). 
In summation, it may be that the benefits observed following the consumption of polyphenol-rich diets are due to the combined effects of: i) the production of bioactive metabolites; ii) the presence/absence of specific microbes governing these modifications and/or; iii) changes to the structure/function relationships of the GIT microbiota. The interactions between dietary polyphenols and the GIT microbiota therefore require further investigation, if the positive effects that are observed with flavonoid-rich diets are to be realised in a more predictive and personalised manner.

In Chapter 2, I described my isolation of two bacterial strains phylogenetically affiliated with the Flavonifractor plautii species. The F. plautii lineage is a recently described taxon, which includes two reassigned species, Eubacterium plautii (strain DSMZ 4000T) and Clostridium orbiscindens (strain DSMZ 6740), cultured from the human GIT. The C. orbiscindens strain was originally isolated and described based on its ability to cleave the C3-C4 ring of quercetin and other polyphenols (Figure 3.1, Winter et al., 1991). E. plautii was originally described as a Gram-negative non-motile organism and assigned to the Fusobacterium genus (Seguin, 1928), then reassigned to the Eubacterium genus by Hofstad and Aasjord (1982) following electron microscopy analysis, which showed it to be a Gram-positive organism. Carlier et al. (2010) described a number of clinical isolates that, by 16S rRNA gene sequencing, were shown to be phylogenetically related to C. orbiscindens, E. plautii and Bacteroides capillosus. The clustering of these isolates led the authors to reassess the phylogeny of all three strains, and subsequent analyses of the biochemical properties, DNA G:C content, DNA-DNA hybridisation and the quercetin degrading capabilities concluded that C. orbiscindens and E. plautii are members of the same species, Flavonifractor plautii: a Gram-positive, flavonoid-degrading organism of the human GIT. Furthermore, Carlier et al. (2010) also proposed the reassignment of B. capillosus to a different genus, as Pseudoflavonifractor capillosus, because, unlike F. plautii, the species was unable to degrade quercetin.

Beyond these taxonomic studies, there have been only a few studies relating to Flavonifractor and Pseudoflavonifractor spp. Although several studies have described the isolation of new F. plautii strains in conjunction with investigations of the flavonoid/polyphenol degrading abilities of these isolates (Kutschera et al., 2011, Takagaki et al., 2014, Takagaki and Nanjo, 2015) additional studies of Pseudoflavonifractor spp. are non-existent. One reason may be that culture techniques relating to the isolation and propagation of Flavonifractor sp. are largely undeveloped, although recently Browne et al. (2016) did report the recovery of a number of Flavonifractor spp. isolates through ethanol treatment of faecal samples prior to culturing.


Figure 3.1 Pathway of quercetin degradation by F. plautii and Eubacterium ramulus. An initial reduction of the double bond at the 2,3 position results in the formation of taxifolin. A chalcone isomerase enzyme then contracts the C-ring of taxifolin to form alphitonin. Finally, the five-membered ring of alphitonin is opened through a hydrolysis step to form phloroglucinol and 3,4-dihydroxyphenylacetic acid. Adapted from Schoefer, L., et al., Anaerobic degradation of flavonoids by Clostridium orbiscindens. Appl Environ Microbiol, 2003. 69(10): p. 5849-54.

While the capacity of F. plautii to metabolise quercetin is established, there remains little knowledge of the effects that quercetin has on the growth kinetics of F. plautii strains, nor has there been a more detailed assessment of the genomic content of the F. plautii group. With that background, here in Chapter 3, I first compare the growth kinetics and quercetin degradation capacity of the F. plautii type strain (DSMZ 4000T) and my isolate Flavonifractor sp. AHG0014. I also present the results of my comparative analyses of the DSMZ 4000T and AHG0014 genomes with those from other presumptive Flavonifractor spp. isolates to assess their functional relatedness, with a specific focus on flavonoid degradation genes.


3.2 Materials and Methods

3.2.1 Bacterial strains and growth conditions

The F. plautii type strain DSMZ 4000T was purchased from DSMZ GmbH (Germany). Flavonifractor sp. AHG0014 was isolated as described in Chapter 2. Both strains were cultured using Brain-Heart-Infusion (BHI, BD Biosciences, Australia) agar and BHI broth medium. The strains were resuscitated from samples of anaerobic glycerol stocks plated onto BHI agar medium and incubated anaerobically for 48 hrs at 37°C in a Coy vinyl anaerobic chamber filled with a nitrogen/carbon dioxide/hydrogen (85% N2: 10% CO2: 5% H2) atmosphere. The AHG0014 isolate was cultured on agar plates containing 100 µg/ml erythromycin (Erm) to retain pEHR512112. The quercetin and Erm were purchased from Sigma Aldrich and ThermoScientific, respectively. Both chemicals were dissolved in ethanol to the desired concentrations.

3.2.2 Growth and quercetin metabolism studies

Single colonies of strains AHG0014 and DSMZ 4000T were picked from their respective agar plates, aseptically transferred into Hungate tubes containing 10 ml BHI broth (with and without added Erm, respectively) and incubated at 37 °C overnight (~12 hours). These cultures (0.1 ml) were used to inoculate 10 ml BHI broths prepared to contain either 50 µM quercetin or carrier alone (ethanol), and growth at 37 °C was monitored by measuring the absorbance (OD600) of these cultures every hour for 24 hours, using a ThermoScientific Spectronic™ 200E spectrophotometer. To assess the quercetin degradation kinetics of both strains, the same overnight cultures were used to inoculate 50 ml volumes of BHI broth, prepared in serum bottles, with either added quercetin or carrier alone. These bottles were incubated in the water bath alongside the culture tubes described above, and 3 ml was removed via a needle and syringe from these larger volume cultures at 2 hr intervals for a period of 26-28 hrs.

A 1 ml aliquot from each sample was used to measure OD600, then centrifuged at 10 000 x g for 5 mins to remove bacterial cells, and the absorbance of the supernatant at 375 nm was measured for the estimation of residual quercetin. The remaining 2 ml was also subjected to centrifugation as described above, with the supernatants carefully removed and stored in individual screw capped tubes at -80 °C for further analysis. Uninoculated media, containing either EtOH alone or 50 µM quercetin, were used as negative controls. The data from all the growth studies were reported as means ± standard deviation (SD) of two biological replicates, each performed with three technical replicates (n=6). Student's t-tests were performed to assess the final yield of each strain and the degradation of quercetin. Differences were considered significant if P < 0.05. All statistical analyses were performed using GraphPad Prism v.7.03 for Windows (GraphPad Software, La Jolla California USA, www.graphpad.com).
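The conversion of supernatant A375 readings into residual quercetin concentrations relies on a standard curve. The following is a minimal sketch of that calculation, assuming a linear relationship between A375 and quercetin concentration in BHI medium. The standard values are hypothetical and the function names are illustrative only; they are not taken from the analysis performed in GraphPad Prism.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def a375_to_micromolar(a375, slope, intercept):
    """Invert the standard curve to estimate quercetin (µM) from A375."""
    return (a375 - intercept) / slope


# Hypothetical standards: quercetin (µM) in BHI versus measured A375.
standards_um = [0.0, 12.5, 25.0, 50.0]
standards_a375 = [0.05, 0.30, 0.55, 1.05]

slope, intercept = fit_line(standards_um, standards_a375)
estimate_um = a375_to_micromolar(0.55, slope, intercept)
print(f"slope={slope:.4f} A375/µM, intercept={intercept:.3f}, estimate={estimate_um:.1f} µM")
```

A non-zero intercept is expected here because BHI medium itself absorbs at 375 nm, which is why the standards are prepared in the same medium as the cultures.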

3.2.3 Strain AHG0014 DNA extraction, genome sequencing and assembly

Flavonifractor sp. AHG0014 was resuscitated, plated and cultured as described above prior to inoculating 50 ml BHI-Erm broth prepared in serum bottles. These cultures were incubated for ~10 hours until they had reached mid-exponential phase of growth (i.e. OD600 of ~0.2), and the cell biomass was harvested by centrifugation at 3220 x g for 20 minutes using an Eppendorf 5810R benchtop centrifuge.

The genomic DNA (gDNA) was extracted using the repeated bead beating plus column (RBB+C) method (Yu and Morrison, 2004) modified for a larger scale preparation. Briefly, the cell pellets were resuspended in 2 ml of RBB+C lysis buffer, centrifuged again as described above, before resuspension with 800 µl of fresh RBB+C lysis buffer. The mixture was incubated at 75 °C for ~10 minutes, then placed at room temperature. Once the mixtures had reached ambient temperature, 20 µl of lysozyme (200 mg/ml), 2 µl of mutanolysin (20 U/µl) and 10 µl of achromopeptidase (200 U/µl stock) were added and the cell suspension gently mixed. The samples were then incubated at 37 °C for 1 hour. Then 40 µl of 10% (w/v) sodium lauryl sarcosine and 10 µl of proteinase K (20 mg/ml) were added and the samples incubated for 30 mins at 55 °C. The lysed mixtures were then mixed with an equal volume of Phenol:Chloroform:Isoamyl alcohol (25:24:1) and centrifuged at 21,130 x g for 5 mins using an Eppendorf 5424R benchtop centrifuge. The upper aqueous phase was carefully removed, and this step was repeated. Then, 40 µl of 3 M sodium acetate and 400 µl of isopropanol were added to the sample and gently mixed. The precipitated nucleic acids were recovered by centrifugation, as described above, and washed twice with 500 µl of 70% (v/v) ethanol, then vacuum dried. The gDNA samples were then pooled from four preparations by resuspending in 50 µl of TE buffer and treated with 0.5 µl of RNAse A prior to incubation at 37 °C for 1 hour. The quality of the gDNA was assessed through visualisation on a 1% agarose gel and the quantity assessed using the Quantus™ fluorometer and QuantiFluor® dsDNA system as per manufacturer instructions (Promega, Australia, see Appendix 6. for DNA gels and quantities). The suspended gDNA samples were then stored at 4 °C until further use.


Flavonifractor sp. AHG0014 genome sequencing was provided by the Australian Centre for Ecogenomics (ACE) using the Illumina NextSeq 500 platform with 2 x 150 bp High Output kit with V2 chemistry. The genomic sequence data was then trimmed using the Trimmomatic command line version 0.36. The data was then quality checked and de novo assembled using the SPAdes command line version 3.10.1 (Bankevich et al., 2012). The draft genome for AHG0014 was then further assessed for genome completeness and contamination using CheckM version 1.0.7, which is an automated method for examining a draft genome against a broad set of marker genes specific to an inferred lineage within a reference genome tree (Parks et al., 2015). The contigs for AHG0014 were then reordered using Mauve (Darling et al., 2010), with the finished F. plautii YL31 genome as the reference (Uchimura et al., 2016). The genome alignment algorithm used by Mauve allows an anchored alignment of two or more genomes whilst supporting the rearrangement of the anchors to allow for the identification of genome rearrangements. This provides information relating to genome synteny, xenologous regions and genome rearrangements as evolutionary markers.

3.2.4 Comparative genomics of AHG0014 with other Flavonifractor spp.

The draft genome sequence for AHG0014 was further examined using the JGI IMG/ER webserver (https://img.jgi.doe.gov/cgi-bin/mer/main.cgi) to analyse the genomic features and to gain further assignment of genes to KEGG or COG orthologies. A BLAST search of the NCBI genome database using the draft genome for AHG0014 revealed four genomes which are currently mis-assigned but could be included in the comparative genomics of the F. plautii spp. (Table 3.1). These four genomes, along with the four F. plautii genomes (DSMZ 4000T, DSMZ 6470, YL31 and An248) and the draft genome for AHG0014, were uploaded onto the software platform EDGAR (Blom et al., 2009) for genome comparison and generation of an Average Nucleotide Identity (ANI) matrix. The ANI matrix is a measure of nucleotide-level genomic similarity between the coding regions of two (or more) genomes (Arahal, 2014). To determine the development plots for the pan/core genome, the perl script PGap (Zhao et al., 2012) was used on the Bio-Linux Platform v.8.0.5 (http://environmentalomics.org/bio-linux-software-list/). The three files required by this script for each genome were an annotation file (.ptt), nucleotide sequences (.ffn) and protein sequences (.faa). The PGap analysis pipeline provided gene clusters assigned to either the core or pan-genome, and the output was visualised using the software package PanGP (Zhao et al., 2014). All genomes were also uploaded onto rast.nmpdr.org (Aziz et al., 2008, Overbeek et al., 2014, Brettin et al., 2015) and this was used to determine whether all the Flavonifractor spp. genomes contained the homologue of the Eubacterium ramulus chalcone isomerase gene. Data collected from RAST was subsequently used to produce the gene organisation diagram (Figure 3.18).


The phylogeny of the genes encoding homologs of the chalcone isomerase was examined using sequences recovered from RAST and the MEGA7 software. ClustalW was used to align the sequences (Thompson et al., 1994) and the stability of the phylogeny was assessed using 1000 bootstrap replications with Jones-Taylor-Thornton (JTT) modelling (Jones et al., 1992). Multiple sequence alignments of the 9 Flavonifractor chi genes and two E. ramulus chi genes were compared using the online software PRALINE (Simossis and Heringa, 2005).

Usage of clusters of orthologous groups (COGs/NOGs) (Huerta-Cepas et al., 2016) was computed using the online software eggNOG-mapper (Huerta-Cepas et al., 2017). Predicted highly expressed (PHX) and putative alien (PA) gene analysis was performed using software available from the University of Georgia’s Institute of Bioinformatics (http://www.cmbl.uga.edu/software/phxpa.htm) (Karlin and Mrazek, 2000). Briefly, this analysis compares the codon bias of the highly expressed profiles for ribosomal proteins, chaperone proteins and transcription/translation factors to that of all proteins in a given genome. The proteins are then assigned an overall value and bias value according to the highly expressed categories. From this, each protein can be assigned as PHX for those displaying a similar bias or PA for those that show a dissimilar bias. The total ribosomal proteins, chaperone proteins and transcription/translation factors for AHG0008 were placed into text files and then uploaded to the above web-server for analysis. The resultant PHX and PA gene files were then ranked according to their overall bias value. The DataBase for automated Carbohydrate-active enzyme Annotation (dbCAN2; Zhang et al., 2018b) was used to provide automated and comprehensive CAZyme annotation for each of the Flavonifractor spp. genomes.
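The codon-bias comparison that underlies the PHX/PA analysis can be illustrated with a short sketch. It assumes the statistic takes the form B(F|C) = Σ_a p_a(F) Σ_codons |f − c|, where f and c are codon frequencies normalised within each amino acid family for a gene set F and a reference set C, and p_a(F) is the amino acid frequency in F. This is my reading of the Karlin and Mrazek approach rather than the code of the web-server, and the sequences and function names below are hypothetical.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_counts(genes):
    """Count sense codons across in-frame coding sequences."""
    counts = Counter()
    for gene in genes:
        seq = gene.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1
    return counts


def family_frequencies(counts):
    """Codon frequencies normalised to sum to 1 within each amino acid."""
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


def codon_bias(family_genes, reference_genes):
    """Assumed form of B(F|C): amino-acid-weighted sum of |f - c| over codons."""
    f_counts = codon_counts(family_genes)
    total = sum(f_counts.values())
    if total == 0:
        raise ValueError("family_genes contains no sense codons")
    f = family_frequencies(f_counts)
    c = family_frequencies(codon_counts(reference_genes))
    bias = 0.0
    for aa in set(CODON_TABLE.values()) - {"*"}:
        codons = [cd for cd, a in CODON_TABLE.items() if a == aa]
        p_aa = sum(f_counts[cd] for cd in codons) / total
        bias += p_aa * sum(abs(f.get(cd, 0.0) - c.get(cd, 0.0)) for cd in codons)
    return bias


# Hypothetical toy sequences: identical usage gives 0, opposite lysine codons give 2.
same = codon_bias(["AAAAAA"], ["AAAAAA"])
opposite = codon_bias(["AAAAAA"], ["AAGAAG"])
```

Under this definition, a gene whose codon usage is close to that of the highly expressed classes but far from the genome average would be a PHX candidate, whereas a gene that is far from all reference classes would be a PA candidate.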

3.2.5 MetaQuery analysis to assess the prevalence of Flavonifractor spp. and the chi operon in other metagenomic datasets

The MetaQuery server is a web application that enables rapid and quantitative analysis of specific genes, functions and taxa across >2000 publicly available metagenomes (Nayfach et al., 2015). Here, MetaQuery was first used to search for Flavonifractor spp. taxonomic abundance across the metagenomes. The MetaQuery application estimates the abundance of taxonomic groups from high-quality metagenomic datasets using MetaPhlAn2 (Truong et al., 2015) and mOTU (Sunagawa et al., 2013). I also used MetaQuery to assess the abundance of the chi gene and surrounding genes contained in the operon across the metagenomes. For specific genes, MetaQuery maps genes from the integrated catalogue using Bowtie2. The “sensitive local” option was used, where only alignments with >70% nucleotide identity and/or where the read was covered by >80% of its length were retained for analysis.

3.2.6 Cloning of the AHG0014 chi homolog

The AHG0014 chi homologue was amplified by PCR using primers designed to produce the entire open reading frame and support directional in-frame cloning for recombinant gene expression via the pET28b(+) plasmid vector. The chi insert is flanked by unique sites for 6 bp-cutting restriction enzymes to support the easy exchange of the insert into the pET28b(+) vector bearing the compatible restriction sites. The insert was designed to fit in the cloning region between the XhoI and NcoI restriction sites of the pET28b(+) vector, which would allow for His-tagging of the expressed protein. The reverse primer for amplification of the chi homologue was designed to bear the NcoI-compatible restriction site, BspHI, as there were multiple NcoI restriction sites in the chi homologue itself. The primer sequences designed for amplification of the chi insert are shown in Appendix 6. Amplification of the chi homologue with appropriate restriction sites was set up as follows: 1 µl of AHG0014 gDNA, 500 nM of the chi primers, 0.02 U Phusion DNA polymerase, 1X High Fidelity Buffer and topped up to a total reaction volume of 20 µl with nuclease free water. PCR reactions were performed as follows: initial denaturing at 98 °C for 1 min, followed by 30 cycles of 98 °C for 30 secs, 60 °C for 30 secs and 72 °C for 1 min, with a final extension cycle of 72 °C for 5 mins. Following amplification, the PCR products were subjected to agarose (0.7% w/v) gel electrophoresis, and the amplicon was excised from the gel and extracted and purified using the Promega Wizard® SV Gel and PCR clean-up system, according to manufacturer’s specifications. The DNA was then stored at 4 °C until use.
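The choice of BspHI in place of NcoI follows from a scan of the insert for internal recognition sites. A minimal sketch of such a scan is given below. The recognition sequences are the standard ones (NcoI C^CATGG, BspHI T^CATGA, XhoI C^TCGAG; NcoI and BspHI both leave a 5'-CATG overhang, which is why their ends are compatible), whereas the demonstration sequence is hypothetical and is not the AHG0014 chi sequence.

```python
# Recognition sequences; all three are palindromic, so scanning one strand suffices.
SITES = {"NcoI": "CCATGG", "BspHI": "TCATGA", "XhoI": "CTCGAG"}


def find_sites(sequence, site):
    """Return 0-based start positions of every (possibly overlapping) site."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def internal_cutters(sequence):
    """Map each enzyme to the positions at which it would cut inside the insert."""
    return {enzyme: find_sites(sequence, site) for enzyme, site in SITES.items()}


# Hypothetical insert with two internal NcoI sites and no BspHI or XhoI sites.
demo_insert = "ATGGCCATGGTTAAACCATGGCTGACTAA"
hits = internal_cutters(demo_insert)
usable = [enzyme for enzyme, positions in hits.items() if not positions]
print(hits, usable)
```

An enzyme with an empty position list does not cut within the insert, so it can be placed in a primer tail without fragmenting the amplicon during digestion.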

The pET28b(+) plasmid vector was extracted from an E. coli JM109 strain using the Promega Wizard® Plus SV Miniprep DNA Purification system, according to manufacturer’s specifications. The DNA sample was then stored at 4 °C until use.

Prior to restriction digests, both the gene insert and the plasmid were vacuum dried to remove residual ethanol from the extraction kits, which could interfere with cloning steps. The dried gene insert and plasmid were then resuspended in TE buffer. The pET28b(+) plasmid was digested with XhoI and NcoI-HF (New England BioLabs) and the chi amplicon with XhoI and BspHI overnight at 37 °C. The ligation mixture was prepared by mixing 1 µl of the digested plasmid, 7 µl of the digested gene insert, with 1 µl of 10x T4 buffer and 1 µl of T4 DNA ligase (20,000 Units). The ligation mixture was left at room temperature overnight, and then used to transform E. coli MAX efficiency® DH5α T1R chemically competent cells by adding the entire ligation mixture to a 200 µl aliquot of the thawed cells placed within a sterile microfuge tube and mixing gently. This mixture was then incubated on ice for 30 mins before being heat shocked at 42 °C for 30 secs in a waterbath. The tubes were placed on ice for 2 mins, then 200 µl of SOC medium was added to the mixtures followed by incubation at 37 °C for 1 hr. Aliquots (50-150 µl) of the transformation mixture were then spread onto LB agar plates supplemented with 100 µg/ml kanamycin and incubated at 37 °C overnight. Candidate recombinant strains were cultured in LB broth supplemented with kanamycin and plasmid DNA was extracted from these strains as described above. The candidate recombinant plasmids were confirmed by restriction enzyme digestion and selected plasmid DNA was subjected to DNA sequencing by the Australian Genome Research Facility (AGRF) to validate in-frame insertion and recombinant chi (rchi) expression.

3.2.7 Expression of the recombinant AHG0014 chi homolog

Chemically competent BL21 and Rosetta™ strains were transformed with both the pET28b(+)chi and the empty vector. Rosetta™ strains are BL21 derivatives designed to enhance the expression of eukaryotic proteins that contain codons rarely used in E. coli. These strains contain a chloramphenicol-resistant plasmid which supplies tRNAs for AGG, AGA, AUA, CUA, CCC and GGA codons. This helps to avoid potential problems with protein expression relating to codon differences between the Gram-positive parent strain and the Gram-negative cloning host. Briefly, competent cells were thawed on ice and 50 µl was aliquoted into fresh Eppendorf tubes. Then, 1-2 µl of plasmid was added to the cells. Next, cells were incubated on ice for 30 mins, heat shocked at 42 °C for 30 secs and incubated on ice for a further 2 mins. Next, 800 µl of LB broth was added to the cells, which were then incubated at 37 °C for 1 hour. Finally, 50-150 µl of cells were plated onto LB agar plates supplemented with 100 µg/ml kanamycin (BL21 transformants) and/or 25 µg/ml chloramphenicol (Rosetta™ transformants). Plates were incubated overnight at 37 °C. Colonies were then picked into LB broth supplemented with the appropriate antibiotics and cultures were prepared and stored in glycerol at -80 °C.

The expression of Chi by E. coli BL21 and Rosetta™ strains was induced via the auto-induction method described by Studier (2005). Briefly, the autoinduction medium was prepared by supplementing 19.6 ml of ZYM medium with 400 µl of 50x M solution, 40 µl of 1 M MgSO4, 20 µl of Trace Metal Mix and 400 µl of 5052 solution (detailed recipes are provided in Appendix 6.4). Single colonies of transformed E. coli BL21 and Rosetta™ strains were used to inoculate 20 ml of the autoinduction medium supplemented with the appropriate antibiotic(s) and incubated overnight at 37 °C. A cell lysate was prepared from these overnight cultures according to manufacturer’s specifications.


3.3 Results

3.3.1 16S rRNA based taxonomic assignment of transconjugant strains AHG0008 and AHG0014

Microscopic analysis of AHG0014 cultures showed the bacterium as Gram-negative staining, short filamentous rods (Figure 3.2). My observations are consistent with the description by Carlier et al. (2010), who also found that Flavonifractor spp. stain Gram-negative. Additionally, when strain AHG0014 was visualised as wet mounts using phase contrast microscopy, I observed that the bacteria appeared to wiggle across the field of view; this observation was also reported by Carlier et al. (2010). I also observed the presence of spores when using phase-contrast microscopy. My phylogenetic analysis of the Flavonifractor sp. AHG0014 16S rRNA gene sequence and those from related bacteria confirmed that my isolate is very closely related to all the currently sequenced F. plautii genomes (Figure 3.3).


Figure 3.2 Photomicrograph of F. plautii AHG0014 following Gram staining. The image was produced following growth in BHI medium containing 100 µg/ml Erm. A scale bar of 100 µm is included for reference.


These more detailed phylogenetic analyses compared to those presented in Chapter 2 also revealed that strain AHG0008 is not only separated from the Flavonifractor spp. currently available in the databases but is also divergent from P. capillosus and bacterium ASF500, which is a representative species from the Altered Schaedler Flora that is currently assigned as a Pseudoflavonifractor sp. (Figure 3.3) (Wymore Brand et al., 2015). I will further investigate this finding at the genomic level in the next Chapter.

[Figure 3.3 tree image not reproduced. Taxa shown: AHG0014; F. plautii YL31 (KR364773); F. plautii (T) DSMZ 4000 (AY724678); F. plautii An248; F. plautii DSM 6740 (Y18187); ATCC BAA-442; P. capillosus (T) DSMZ 23940 (AY136666); AHG0008; Bacterium ASF500 (AF157051.1); Papillibacter cinnamivorans (T) DSMZ 12816 (AF167711); Sporobacter termitidis (T) DSMZ 10068 (Z49863); Oscillibacter valericigenes (T) DSMZ 18026 (AB238598); Oscillospira guilliermondii OSC3 (AB040497); Clostridium butyricum (T) DSMZ 10702 (AJ458420). Scale bar: 0.020.]

Figure 3.3 Phylogeny of strains AHG0008 and AHG0014 based on 16S rRNA gene sequence alignments as described in the Materials and Methods, and using sequences derived from strains assigned to the most closely related taxa, as identified and described in Chapter 2. Strain AHG0008 is separable from the two type strains representing both Pseudoflavonifractor and Flavonifractor spp. (shown in bold), and by my analyses is instead most closely related to the bacterium ASF500, which is derived from the Altered Schaedler microbiota (Wymore Brand et al., 2015). Strain AHG0014 is assigned to the cluster of isolates that currently encompass the F. plautii group, including the type strain DSMZ 4000T.


3.3.2 Quercetin has minimal effects on the growth kinetics of F. plautii DSMZ 4000T and strain AHG0014

The effects of adding 50 µM quercetin (or carrier alone) on the growth kinetics of F. plautii DSMZ 4000T and AHG0014 are shown in Figure 3.4. In initial studies, the growth of AHG0014 using Reinforced Clostridial Medium (RCM) was found to be quite weak, reaching a maximal OD600 of ~0.25 after 34 hours. I then changed to using BHI medium for further growth studies, in large part because the growth of AHG0014 was much quicker, although both the type strain and AHG0014 showed only weak growth with this medium as well (also see Appendix Figure 6.1). The type strain did appear to reach a greater maximum yield than AHG0014 (~0.6 c.f. ~0.4 OD600, respectively, at 18 hours) and the growth rate was also slightly lower for strain AHG0014 under these growth conditions (Figure 3.4). My results also show that quercetin per se does not appear to provide a preferred or additional nutrient source for these two bacteria, because there were only small changes in the bacterial growth rates and maximum cell yields for both strains with the addition of 50 µM quercetin. Whilst there are no apparent effects of quercetin on the growth rate and final yield of either strain, there does appear to be a slight delay in the onset of growth for both Flavonifractor spp. when grown in the presence of quercetin.
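Growth rates of the kind compared above can be estimated from hourly OD600 readings by fitting a straight line to ln(OD600) against time over the exponential phase. The sketch below illustrates one way to do this; the readings are hypothetical, and the choice of which time points fall within the exponential phase is left to the analyst. It is not the procedure used to produce Figure 3.4.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in units of per hour."""
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(logs) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, logs))
    den = sum((t - mean_t) ** 2 for t in times_h)
    return num / den


def doubling_time_h(mu):
    """Doubling time in hours for a specific growth rate mu (per hour)."""
    return math.log(2) / mu


# Hypothetical exponential-phase readings that increase 1.4-fold each hour.
times = [10, 11, 12, 13, 14]
ods = [0.05, 0.07, 0.098, 0.1372, 0.19208]

mu = specific_growth_rate(times, ods)
td = doubling_time_h(mu)
print(f"mu = {mu:.3f} per hour, doubling time = {td:.2f} h")
```

Restricting the fit to the exponential phase matters here because both strains showed a lag of several hours, and including lag or stationary phase readings would lower the estimated rate.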


Figure 3.4 Growth of Flavonifractor spp. DSMZ 4000T (panel A) and strain AHG0014 (panel B) in the presence of BHI medium with or without 50 µM quercetin. The shaded area represents the growth of the respective strains when cultured in the presence of vehicle alone (EtOH), as compared to the measurements obtained from those cultures containing added quercetin (triangles) and uninoculated medium controls (squares). Although the lag time in measurable growth appears to be shorter for both strains in the presence of the carrier alone, both their growth rates and maximum yields appear to be unaffected by the presence of quercetin. There were no measurable changes in OD600 for sterile uninoculated medium (squares), and the growth data are mean values (± S.E.M.) produced from 2 biological replicates with 3 technical replicates (n=6).


3.3.3 Both Flavonifractor spp. DSMZ 4000T and AHG0014 rapidly metabolise quercetin

The time courses of quercetin metabolism by the type strain and strain AHG0014 are shown in Figures 3.5 and 3.6. As described in the Materials and Methods, the residual quercetin concentration was monitored by the changes (reductions) in A375 measured from clarified culture fluids, as well as from similarly prepared and incubated bottles of sterile uninoculated medium. My results show that while there was no change in A375 values for the uninoculated medium, quercetin appeared to be quickly metabolised by both the type strain and AHG0014, reflected in the reduction of A375 values to <0.1 within 8 hours (Panel A in Figure 3.5 and Figure 3.6). Furthermore, the growth of both bacterial strains was not measurable until ~10 hours incubation (see Figure 3.4), a time by which quercetin appears to be completely metabolised. For both DSMZ 4000T and AHG0014, the detectable level of quercetin present in the medium at 26 h was significantly reduced compared to bacteria-free controls (Panel B in Figure 3.5 and Figure 3.6).

Based on these results, I hypothesised that F. plautii and AHG0014 both rapidly metabolise quercetin to prevent any negative effects on growth. To test this hypothesis, I added either 50 µM quercetin or the carrier alone to actively growing cultures (i.e. OD600 ~0.2). The addition of quercetin and EtOH in this manner did not inhibit the growth of either DSMZ 4000T or AHG0014 (Figure 3.7). Both strains were still able to rapidly reduce the quercetin present in the medium. In fact, the decrease in quercetin happened more rapidly, within 2 hours rather than the 8 hours previously observed when quercetin was added at the same time as the cultures were inoculated. The increase in the rate of quercetin reduction in this instance is most likely because the strains were in the exponential phase of growth, and therefore a higher number of cells were producing enzymes and other factors involved in the breakdown of quercetin.


Figure 3.5 Quercetin metabolism by F. plautii DSMZ 4000T. Panel A shows the longitudinal monitoring of both OD600 and A375, as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (green) and uninoculated BHI medium (grey). As expected, there were virtually no changes in these values during the time course of the experiment for the uninoculated medium. Conversely, quercetin metabolism by the F. plautii DSMZ 4000T cultures was measurable within 2 hours and close to maximal levels within ~8 hours incubation, after which bacterial growth proceeded in a manner similar to that illustrated in Figure 3.4. Panel B shows the quercetin concentrations at 0 and 26 hours, estimated from the A375 values and plotted against a standard curve of quercetin added to BHI medium, as described in the Materials and Methods. The data points represent mean values (± S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6). The asterisks denote statistical significance at P <0.0001.


Figure 3.6 Quercetin metabolism by Flavonifractor strain AHG0014. Panel A shows the longitudinal monitoring of both OD600 and A375, as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (red) and uninoculated BHI medium (grey). As expected, there were virtually no changes in these values during the time course of the experiment for the uninoculated medium. Conversely, quercetin metabolism by Flavonifractor sp. AHG0014 cultures was measurable within 2 hours and close to maximal levels within ~8 hours incubation, after which bacterial growth proceeded in a manner similar to that illustrated in Figure 3.4. Panel B shows the quercetin concentrations at 0 and 26 hours, estimated from the A375 values and plotted against a standard curve of quercetin added to BHI medium, as described in the Materials and Methods. The data points represent mean values (± S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6). The asterisks denote statistical significance at P <0.0001.


Figure 3.7 The effects of adding 50 µM quercetin to actively growing cultures of either F. plautii DSMZ 4000T (A) or Flavonifractor strain AHG0014 (B). The panels show the longitudinal monitoring of both OD600 and A375 as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (green or red symbols) and uninoculated BHI medium (grey symbols). Inoculated cultures are further differentiated between those receiving quercetin (triangles) and those receiving carrier alone (circles). As expected, there were virtually no changes in these values during the time course of the experiment for the uninoculated medium. Conversely, quercetin metabolism by both strains appeared to be immediate, and quercetin was reduced to minimal concentrations within 2-4 hours of its introduction to actively growing cultures. Interestingly, both bacterial strains also appeared to show some slowing in growth until the quercetin concentration was substantially reduced; growth then resumed in a manner similar to those cultures receiving the carrier alone. The data points represent mean values (± S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6).


3.3.4 Whole genome analysis of Flavonifractor spp. and AHG0014

The genomes used for these comparisons with that from strain AHG0014 are listed in Table 3.1. The AHG0014 draft genome was assembled into 798 contigs, with N50 and L50 values of 52560 and 30, respectively. By CheckM analysis, strain AHG0014 is predicted to be 99.33% complete and has a contamination score of 1.65%. The finished genome for F. plautii YL31, as well as the draft genomes for F. plautii strains DSMZ 4000T, DSMZ 6470, and An248, were predicted by CheckM analysis to exceed 98% completeness with <1.0% contamination. Using the draft genome for AHG0014, I also retrieved four additional genomes from the publicly available databases that were similar to my draft genome, but whose taxonomic assignment was only at a family or genus level. These extra genomes are from Clostridium sp. ATCC BAA-442 (Hur et al., 2002), Clostridiales sp. VE202-03 (Atarashi et al., 2013), Clostridium sp. UC5.1_2H11 (Atarashi et al., 2015) and Lachnospiraceae bacterium 7_1_58FAA (Human Microbiome Jumpstart Reference Strains et al., 2010), with CheckM completeness scores ranging from 90-98% and contamination scores between 0-2.6% (Table 3.1). The metrics for the AHG0014 genome and other strains used for genomic comparisons are shown in Table 3.2. In summary, the AHG0014 genome is of a similar size to the majority of the F. plautii genomes identified in this study, with the exception of the genome currently assigned to the family Lachnospiraceae, which appears to be substantially larger. AHG0014 is the second largest of the Flavonifractor genomes.
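For readers unfamiliar with the assembly metrics quoted above, N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length, and L50 is the number of contigs needed to reach that point. A minimal sketch of the calculation is shown below, using hypothetical contig lengths rather than the AHG0014 assembly.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no contigs supplied")


# Hypothetical assembly: total 300 bp, so the halfway point is 150 bp.
example_contigs = [20, 100, 60, 40, 80]
n50, l50 = n50_l50(example_contigs)
print(n50, l50)
```

In these terms, the reported values mean that the 30 longest AHG0014 contigs, each at least 52560 bp, account for at least half of the assembled bases.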

I first re-ordered and compared the genome data for strain AHG0014 with the finished genome of strain YL31 using the Mauve software package, which confirmed that there was a high degree of synteny between these two genomes (Figure 3.8). The draft genome of AHG0014 was then compared to the F. plautii strains DSMZ 4000T and YL31 to determine the core, shared and unique regions across these three strains. These gene counts were based on reciprocal BLASTp analysis of the EDGAR-based predictions of protein-coding genes. These three genomes shared a total of 2632 genes, which included ABC transporters, ribosomal related proteins, hypothetical proteins and genes related to sporulation (Figure 3.9). For AHG0014, there are 367 genes shared only with DSMZ 4000T and 237 genes shared only with YL31. AHG0014 has the largest number of unique genes among the three genomes with 1106 genes, whilst DSMZ 4000T and YL31 have 604 and 498 unique genes, respectively. Next, I used Mauve for genome re-ordering and alignment plots for all the Flavonifractor spp. genomes listed in Table 3.1 and validated that the genomes from all these strains share an extensive amount of synteny but, as expected, some xenologous regions unique to each genome are also present (Figure 3.10). The Average Nucleotide Identity (ANI) scores calculated for the 9 genomes are shown in Figure 3.11 and all values exceed 95%. Taken together, my results suggest all 9 genomes are derived from bacterial strains that are members of the F. plautii lineage within Clostridium cluster IV.


Table 3.1 Summary statistics for the Flavonifractor spp. genomes with respect to genome size, G + C content and CheckM completeness and contamination scores.

Genome ID | Source | Location | G + C content (%) | Genome size (bp) | Completeness (%) | Contamination (%) | Predicted complete genome size (Mbp)
YL31 | Mouse caecum | EUR | 60.9 | 3,818,478 | 99.33 | 0.13 | 3.8
DSMZ 4000T | Human oral | USA | 61.1 | 3,820,124 | 99.33 | 0.13 | 3.8
DSMZ 6740 | Human faecal | USA | 60.3 | 4,383,642 | 98.15 | 0.81 | 4.5
An248 | Chicken caecum | EUR | 61.0 | 3,761,516 | 99.33 | 0.13 | 3.8
ATCC BAA-442 (a) | Human faecal | USA | 59.6 | 4,376,727 | 93.79 | 0.67 | 4.7
VE202-03 (b) | Human faecal | Japan | 59.2 | 4,569,182 | 90.97 | 2.61 | 5.0
UC5.1_2H11 (a) | Human faecal | Japan | 61.6 | 3,530,652 | 96.06 | 0.00 | 3.7
L58FAA (c) | Human faecal | USA | 58.2 | 5,668,091 | 98.43 | 0.13 | 5.8
AHG0014 | Human faecal | AUS | 60.1 | 4,617,074 | 99.33 | 1.65 | 4.6

(a) Clostridium sp.; (b) Clostridiales sp.; (c) Lachnospiraceae bacterium
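The predicted complete genome sizes in Table 3.1 appear to be consistent with dividing the assembly size by the CheckM completeness fraction. This relationship is an assumption inferred from the tabulated values, as the formula is not stated in the text, and the sketch below simply checks it against three rows of the table. The function name is illustrative.

```python
def predicted_complete_size_mbp(assembly_bp, completeness_pct):
    """Assumed estimate: assembly size scaled up by the completeness fraction."""
    return assembly_bp / (completeness_pct / 100) / 1e6


# (genome size in bp, CheckM completeness in %) taken from Table 3.1.
table_3_1_rows = {
    "ATCC BAA-442": (4_376_727, 93.79),
    "VE202-03": (4_569_182, 90.97),
    "AHG0014": (4_617_074, 99.33),
}

predicted = {
    name: round(predicted_complete_size_mbp(bp, pct), 1)
    for name, (bp, pct) in table_3_1_rows.items()
}
print(predicted)
```

Note that this simple scaling does not subtract the estimated contamination, so it may slightly overestimate the true size for genomes with higher contamination scores, such as VE202-03.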


Table 3.2 General genome features for isolate genomes DSMZ 4000T, YL31, DSMZ 6470, AHG0014, ATCC BAA442, UC5.1, VE202, An248 and L35FAA recovered using the JGI IMG/ER annotation engine. Both the number of genes and the proportion of the genome dedicated to different categories are reported.

Feature | DSMZ 4000T | YL31 | DSMZ 6470 | AHG0014 | ATCC BAA442 | UC5_1 | VE202 | An248 | L35FAA
DNA, total number of bases | 3813599 (100.00%) | 3813655 (100.00%) | 4323157 (100.00%) | 4617074 (100.00%) | 4376727 (100.00%) | 3525897 (100.00%) | 4869182 (100.00%) | 3760736 (100%) | 5622443 (100.00%)
DNA coding bases | 3460932 (90.75%) | 3365434 (88.25%) | 3803509 (87.98%) | 3854984 (83.49%) | 3856728 (88.12%) | 3035322 (86.09%) | 3992843 (87.39%) | 3317338 (88.21%) | 4931730 (87.72%)
DNA, G + C content | 239029 (61.07%) | 2322870 (60.91%) | 2616770 (60.53%) | 2774981 (60.10%) | 2609111 (59.61%) | 2171893 (61.60%) | 2706791 (59.24%) | 2293286 (60.98%) | 3271414 (58.18%)
Protein coding genes | 4278 (98.69%) | 3642 (97.96%) | 4309 (98.49%) | 4341 (98.70%) | 4613 (98.65%) | 3847 (98.51%) | 5185 (98.78%) | 3620 (98.24%) | 5907 (98.43%)
RNA genes | 57 (1.31%) | 76 (2.04%) | 66 (1.51%) | 57 (1.30%) | 63 (1.35%) | 58 (1.49%) | 64 (1.22%) | 65 (1.76%) | 94 (1.57%)
rRNA genes | 3 (0.07%) | 6 (0.16%) | 6 (0.14%) | 3 (0.07%) | 3 (0.06%) | 2 (0.05%) | 3 (0.06%) | 5 (0.14%) | 6 (0.10%)
5S rRNA | 1 (0.02%) | 2 (0.05%) | 3 (0.07%) | 1 (0.02%) | 1 (0.02%) | - | 1 (0.02%) | 3 (0.08%) | 4 (0.07%)
16S rRNA | 1 (0.02%) | 2 (0.05%) | 2 (0.05%) | 1 (0.02%) | 1 (0.02%) | 2 (0.03%) | 1 (0.02%) | 1 (0.03%) | 1 (0.02%)
23S rRNA | 1 (0.02%) | 2 (0.05%) | 1 (0.02%) | 1 (0.02%) | 1 (0.02%) | 1 (0.03%) | 1 (0.02%) | 1 (0.03%) | 1 (0.02%)
tRNA genes | 52 (1.20%) | 54 (1.45%) | 53 (1.21%) | 57 (1.30%) | 52 (1.11%) | 49 (1.25%) | 51 (0.97%) | 51 (1.38%) | 88 (1.47%)
Other RNA genes | 2 (0.05%) | 16 (0.43%) | 7 (0.16%) |  | 8 (0.17%) | 7 (0.18%) | 10 (0.19%) | 9 (0.24%) | -
Protein coding genes with function prediction | 2762 (63.71%) | 2894 (77.84%) | 3110 (71.09%) | 3125 (71.06%) | 3214 (68.73%) | 2893 (74.08%) | 3515 (66.97%) | 2814 (76.36%) | 3673 (61.21%)
Without function prediction | 1516 (34.97%) | 748 (20.12%) | 1199 (27.41%) | 1216 (27.65%) | 1399 (29.92%) | 954 (24.43%) | 1670 (31.82%) | 806 (21.87%) | 2234 (31.23%)
Protein coding genes with enzymes | 820 (18.92%) | 817 (21.97%) | 925 (21.14%) | 810 (18.42%) | 910 (19.46%) | 983 (25.17%) | 1085 (20.67%) | 802 (21.76%) | 958 (15.96%)
Protein coding genes connected to KEGG pathways | 925 (21.34%) | 930 (25.01%) | 1051 (24.02%) | 900 (20.46%) | 1007 (77.12%) | 1089 (27.89%) | 1180 (22.48%) | 898 (24.73%) | 1033 (17.21%)
Protein coding genes connected to KEGG Orthology (KO) | 1650 (38.06%) | 1625 (43.71%) | 1835 (41.94%) | 1664 (37.84%) | 1871 (40.01%) | 1869 (47.86%) | 2111 (40.22%) | 1603 (43.50%) | 1928 (32.13%)
Protein coding genes connected to MetaCyc pathways | 716 (16.52%) | 713 (19.18%) | 808 (18.47%) | 711 (16.17%) | 788 (16.85%) | 875 (22.41%) | 947 (80.74%) | 705 (19.13%) | 828 (13.80%)
Protein coding genes with COGs | 2171 (50.08%) | 2261 (60.81%) | 2317 (52.96%) | 2337 (53.14%) | 2298 (49.14%) | 1829 (46.84%) | 2072 (39.47%) | 2193 (59.51%) | 2529 (42.14%)
With KOGs | 539 (12.43%) | 573 (15.41%) | 549 (12.55%) | 549 (12.48%) | 524 (11.21%) | 423 (10.83%) | 433 (8.25%) | 552 (14.98%) | 561 (9.35%)
With Pfam | 2863 (66.04%) | 2907 (78.19%) | 3218 (73.55%) | 3157 (71.78%) | 3329 (71.19%) | 2936 (75.19%) | 3550 (67.63%) | 2829 (76.77%) | 3862 (64.36%)
With TIGRfam | 1027 (23.69%) | 1051 (28.17%) | 1083 (24.75%) | 1041 (23.67%) | 1055 (22.56%) | 975 (24.97%) | 1019 (19.41%) | 1017 (27.60%) | 1133 (18.88%)
With InterPro | 2923 (67.43%) | 1898 (50.91%) | 2127 (48.62%) | 2057 (46.77%) | 2173 (46.47%) | 1941 (49.71%) | 2356 (44.88%) | - | 3984 (66.39%)
Fused protein coding genes | 6 (0.14%) | 225 (6.05%) | 86 (1.97%) | 283 (6.43%) | 86 (1.84%) | 77 (1.97%) | 75 (1.43%) | 194 (5.26%) | 8 (0.13%)
Protein coding genes coding signal peptides | 185 (4.27%) | 211 (5.68%) | 298 (6.81%) | 245 (5.57%) | 301 (6.44%) | 193 (4.94%) | 271 (5.16%) | 226 (6.13%) | 385 (6.42%)
Protein coding genes coding transmembrane proteins | 906 (20.90%) | 879 (23.64%) | 949 (21.69%) | 1005 (22.85%) | 999 (21.36%) | 887 (22.71%) | 1065 (20.29%) | 882 (23.93%) | 1210 (20.16%)
Genes in Biosynthetic Clusters | 87 (2.01%) | - | 48 (1.1%) | - | 40 (0.86%) | 46 (1.18%) | 96 (1.83%) | - | 112 (1.87%)


Figure 3.8 Alignment of the F. plautii YL31 finished genome with the Flavonifractor sp. AHG0014 genome using the Mauve alignment software. The two genomes share an extensive degree of synteny, with limited rearrangements. Briefly, the syntenic regions are depicted by the coloured blocks, with any xenologous regions depicted as blank spaces. Genome rearrangements are depicted as the coloured lines joining the syntenic blocks. The AHG0014 genome is somewhat larger than the reference genome (YL31), which is principally explained by numerous xenologous regions dispersed throughout the assembly.

[Figure 3.9 image: three-way Venn diagram with circles labelled AHG0014, DSMZ 4000T and YL31. Gene counts shown: 1106, 367, 237, 2632, 604, 197 and 498.]

Figure 3.9 Venn diagram showing the core, shared and unique genes present in the draft genome for AHG0014 (A) and the F. plautii genomes DSMZ 4000T (B) and YL31(C). The gene counts are based on reciprocal BLASTp analysis of the EDGAR-based predictions of protein-coding genes from each genome. There are 2632 genes in the core genome for Flavonifractor spp. with a small number of the genes being assigned as shared between the genomes (197-366). The remainder of the genes are assigned as unique genes with the draft genome for AHG0014 containing nearly twice as many unique genes (1106) as compared to both DSMZ 4000T (604) and YL31 (498).
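The arithmetic behind the Figure 3.9 comparison can be checked with a short sketch. It assumes the seven counts are exactly those printed on the Venn diagram (1106, 604 and 498 unique genes, 2632 core genes, and 367, 237 and 197 genes shared between pairs of genomes); the variable names are illustrative only, and the assignment of each pairwise count to a specific genome pair is not recoverable from the text, so the sketch only uses their sum.

```python
# Gene counts as printed on the Figure 3.9 Venn diagram.
unique = {"AHG0014": 1106, "DSMZ 4000T": 604, "YL31": 498}
core = 2632
shared_pairs = [367, 237, 197]  # pairwise-shared counts; pair identity not assumed

# Total number of distinct gene groups across the three genomes.
pan_size = core + sum(unique.values()) + sum(shared_pairs)

# How many times more unique genes AHG0014 carries than each reference strain.
ratios = {
    name: unique["AHG0014"] / count
    for name, count in unique.items()
    if name != "AHG0014"
}

print(pan_size)
for name, ratio in ratios.items():
    print(f"AHG0014 vs {name}: {ratio:.2f}x unique genes")
```

Under these assumptions the three genomes together contain 5641 gene groups, and AHG0014 carries roughly 1.8 to 2.2 times as many unique genes as either F. plautii reference strain, consistent with the "nearly twice as many" statement in the caption.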

[Figure 3.10 image: Mauve alignment. Genomes are labelled, in order, YL31, An248, UC5.1-2H11, DSMZ 4000T, DSMZ 6470, ATCC BAA-442, VE202-03, AHG0014 and L58FAA.]

Figure 3.10 Alignment of the 9 Flavonifractor spp. isolate genomes using the Mauve genome alignment software. The alignment shows a high degree of synteny between all Flavonifractor spp. isolates. An artefact of the Mauve software cuts off some of the base numbers on the scale for some of the genomes. The scale is shown in bp and the numbers represent intervals of 200000 bases.

Figure 3.11 The average nucleotide identity (ANI) matrix of Flavonifractor sp. AHG0014 and F. plautii isolate genomes, calculated from BLAST hits between orthologous genes of the core genome. ANI scores of >95% indicate that all of the strains analysed belong to the same species.

Based on these findings, I then generated a pangenome atlas from all nine Flavonifractor spp. genomes using GView and the finished genome from strain YL31 as the “root” genome, and these results are illustrated in Figure 3.12. The atlas shows there is a great level of gene and nucleotide sequence identity/similarity among the 9 genomes across the initial ~4.0 Mbp of the pangenome, but there are also a number of “YL31-specific” blocks, evidenced by the gaps that are occasionally “closed” by homologous genes present only in strains An248, 7_1_58FAA, and/or AHG0014. There is a much greater degree of strain variation across the remaining ~1.3 Mbp, and while a large proportion of these genes are shared among 2-4 strain combinations, there is a notable collection of 121 (3%) genes that are unique to the strain AHG0014 genome, being absent from the other 8 Flavonifractor spp. genomes. These genes proved difficult to annotate because a large number of them are deemed to encode either “hypothetical proteins”, putative transport proteins, or proteins involved with transcriptional regulation, with undefined specificities.

The development plots used to estimate the core and accessory genomes of the 9 Flavonifractor strains are shown in Figure 3.13; these are estimated to comprise 2209 genes and 8968.8 genes, respectively. The accessory genome continues to trend upwards, suggesting that the pangenome remains open and that more isolates will be needed to provide complete coverage. Conversely, the core genome is predicted to have almost stabilised following the addition of the 9th genome. In addition to the normal housekeeping functions related to DNA replication, transcription, and translation, the core genome is notably characterised by the presence of genes supporting spore formation (e.g. acid-soluble spore protein, spore maturation protein A and sporulation protein YtfJ) and various ABC transporters. The core genome is also remarkable for the presence of a multi-gene cluster including a homolog of the chalcone isomerase gene involved with quercetin metabolism by Eubacterium ramulus (Braune et al., 2016), and I have chosen to focus on this gene cluster in the following sections.

[Figure 3.12 image: pangenome atlas. Ring legend: AHG0014, 7_1_58FAA, UC5.1-2H11, ATCC_BAA442, VE202-03, An248, DSMZ 6470, DSMZ 4000T, YL31, Pangenome, GC content, GC skew.]

Figure 3.12 The Flavonifractor spp. pangenome atlas constructed and visualised using GView (Petkau et al., 2010). The 5.3 Mbp pangenome is represented by the inner purple ring whilst the outer colour-coded rings depict the individual genomes of each isolate. The BLASTn comparison was performed at a cut-off of 80% identity per gene.


Figure 3.13 Pangenome and core genome development plots of Flavonifractor spp. isolates. The core genome (black, $y = 3692.95\,e^{-0.77x} + 2209.79$) and pan-genome (orange, $y = 2674.7\,x^{0.48} + 1289.63$) development plots of Flavonifractor spp. were produced using PGAP (Zhao et al., 2012) in Biolinux and then plotted with the software package PanGP (Zhao et al., 2014).
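The two fitted curves quoted in the Figure 3.13 caption can be evaluated directly. The sketch below assumes the caption's expressions are an exponential decay for the core genome and a power law for the pan-genome, with x the number of genomes included; the function names are illustrative and are not part of PGAP or PanGP.

```python
import math

def core_genome(n):
    """Fitted core-genome size after n genomes (exponential decay to a plateau)."""
    return 3692.95 * math.exp(-0.77 * n) + 2209.79

def pan_genome(n):
    """Fitted pan-genome size after n genomes (power-law growth)."""
    return 2674.7 * n ** 0.48 + 1289.63

# Evaluate both fits for a few genome counts, including the 9 genomes used here.
for n in (1, 5, 9, 20):
    print(n, round(core_genome(n), 1), round(pan_genome(n), 1))
```

At n = 9 the power-law fit gives a value close to 8968.8 and the core-genome fit sits within a few genes of its plateau of 2209.79, which appear to be the two figures quoted in the text. Because the power-law exponent (0.48) is positive, the fitted pan-genome keeps growing as genomes are added, which is the usual reading of an open pangenome.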


3.3.5 Predicted gene functions and metabolism of the F. plautii species

Based on the poor growth kinetics demonstrated in my quercetin degradation experiments, I next decided to use the 9 F. plautii genomes to provide a predicted metabolism profile for the species. The results of the eggNOG-mapper analysis are illustrated as a COG usage bar chart that shows the functional differences between the F. plautii strains (Figure 3.14). Across all genomes a large proportion of genes (8-10%) is assigned to amino acid biosynthesis and transport, with approximately 3-5% of genes assigned to carbohydrate transport and metabolism. This potentially indicates a preference for amino acid sources over carbohydrate sources to support growth. All the Flavonifractor genomes are predicted to encode motility functions, reflected in the identification of genes encoding flagellar biosynthesis proteins (flhAB), the basal-body rod protein (flgG), the export protein (fliJ) and fliMG, whose products form the rotor-mounted switch complex. This information supports my phase-contrast microscopy observations, in which I noted the movement of cells across the field of vision.
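A COG usage profile of this kind amounts to tallying one functional category letter per annotated gene. The sketch below is a minimal illustration of that tally, assuming the category assignments have already been extracted as one string per gene (for example "E" for amino acid transport and metabolism, "G" for carbohydrate transport and metabolism, "N" for cell motility). The function names and the toy input are hypothetical and are not data from this thesis.

```python
from collections import Counter

def cog_usage(categories):
    """Count category assignments per COG letter.

    `categories` holds one string per gene. A multi-letter string such as "EG"
    counts once for each letter; empty or "-" entries count as unassigned.
    """
    counts = Counter()
    for cat in categories:
        letters = [c for c in cat if c.isalpha()]
        if not letters:
            counts["unassigned"] += 1
        for letter in letters:
            counts[letter] += 1
    return counts

def percent(counts, letter):
    """Share of all category assignments that fall in `letter`, in percent."""
    total = sum(counts.values())
    return 100.0 * counts[letter] / total if total else 0.0

# Hypothetical toy input, not taken from any Flavonifractor genome.
example = ["E", "E", "G", "EG", "S", "-", "K", "N"]
usage = cog_usage(example)
print(usage["E"], usage["G"], round(percent(usage, "E"), 1))
```

Counting a multi-letter gene once per letter is one possible design choice; counting such genes fractionally or under a separate "multiple" category would shift the percentages slightly.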


Figure 3.14 The predicted gene functions of F. plautii strains AHG0014, YL31 and DSMZ 4000 based on COG usage categories. The COG usage bar chart shows the number of predicted COG usage genes for each of the F. plautii genomes analysed here. There are a large number of genes assigned as Function Unknown.


The overall metrics from the PHX analysis of four F. plautii genomes (AHG0014, DSMZ 4000T, DSMZ 6470 and YL31) are shown in Table 3.3. Overall, the predicted PHX genes represent approximately 19-23% of the genomic content across all four strains. The PA gene content is estimated at approximately 10-13% across all strains. The top 50 annotated PHX genes from the AHG0014 genome are shown in Figure 3.15. I found that 11/50 PHX genes are predicted to encode amino acid biosynthesis and transport functions. Using the IMG/ER system I was able to identify that AHG0014 was auxotrophic for the amino acids phenylalanine, tyrosine, tryptophan, histidine, arginine, isoleucine, leucine, proline, threonine, and valine. PHX analysis revealed that 6/50 genes are predicted to encode carbohydrate transport and metabolism functions. These genes are annotated as: Pyruvate-flavodoxin, Enolase, Pyruvate,phosphate dikinase, Phosphomannomutase, Endo-1,4-beta-xylanase A and NADP-specific glutamate dehydrogenase. Four of these genes appear to be involved in the conversion of Fructose to Pyruvate as part of Glycolysis. Using the PATRIC webserver I was able to produce a pathway for the conversion of Fructose to Pyruvate by the AHG0014 genome (Figure 3.16). There is a full complement of genes for the conversion of Fructose to glyceraldehyde-3-phosphate (GAP), which is then used in glycolysis and broken down into pyruvate. PATRIC also revealed that AHG0014 contains a full complement of genes involved in converting pyruvate into Acetyl-CoA, which can then be used in the citrate cycle.

Table 3.3 PHX analysis for AHG0014 and three Flavonifractor spp. strains. Depicted are the total numbers of predicted highly expressed (PHX) and putative alien (PA) genes annotated via PHX analysis of the genomes for F. plautii strains AHG0014, DSMZ 4000T, DSMZ 6470 and YL31.

Genome | Total No. of PHX genes | Total PHX genes (%) | Total No. of PA genes | Total PA genes (%) | Total No. genes analysed
AHG0014 | 860 | 21.99 | 504 | 12.89 | 3910
DSMZ 4000T | 640 | 19.26 | 393 | 11.83 | 3322
DSMZ 6470 | 806 | 21.20 | 409 | 10.76 | 3801
YL31 | 766 | 22.70 | 405 | 12.03 | 3366
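As a quick consistency check, the percentages in Table 3.3 can be recomputed from the gene counts. The sketch assumes that each percentage is simply the PHX or PA gene count divided by the number of genes analysed; the tuple layout and the function name are illustrative.

```python
# Rows of Table 3.3: (genome, PHX genes, PHX %, PA genes, PA %, genes analysed)
table_3_3 = [
    ("AHG0014",    860, 21.99, 504, 12.89, 3910),
    ("DSMZ 4000T", 640, 19.26, 393, 11.83, 3322),
    ("DSMZ 6470",  806, 21.20, 409, 10.76, 3801),
    ("YL31",       766, 22.70, 405, 12.03, 3366),
]

def pct(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total

# Print the recomputed percentages next to the tabulated ones.
for genome, phx, phx_pct, pa, pa_pct, total in table_3_3:
    print(f"{genome}: PHX {pct(phx, total):.2f}% (table {phx_pct}), "
          f"PA {pct(pa, total):.2f}% (table {pa_pct})")
```

Under this assumption the recomputed values agree with the tabulated ones to within 0.1 percentage points, and they fall in the 19-23% (PHX) and 10-13% (PA) ranges quoted in the text.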


I next used the dbCAN2 meta server to annotate the CAZyme families that could be identified for each of the Flavonifractor genomes. The dbCAN2 meta server uses three tools, HMMER, DIAMOND and HotPep, to identify CAZyme families within a genome. In total, across the three tools, 68 modules could be annotated for the AHG0014 genome. DIAMOND appeared to be the most sensitive, identifying 60 of the 68 modules. Taking a closer look at the CAZyme families identified, there are 5 families represented across the genomes, 4 of which appear to show a high level of conservation. The first family, defined as Auxiliary Activity (AA), was only identified within the AHG0014 genome and was not represented in any of the other 8 genomes. This module was annotated using RAST as a choline dehydrogenase involved in choline uptake. The remaining 4 families appear to be well represented and highly conserved across the Flavonifractor genomes. For the Carbohydrate Binding Module (CBM) families, 5 were identified across the Flavonifractor genomes. CBM50 was the most abundant and was highly conserved across all Flavonifractor genomes. Members of this family have been associated with the binding of peptidoglycans and chitin (Steen et al and Onaga and Taira). There are also 5 Carbohydrate Esterase (CE) family members identified across all genomes. CE family 9 is the most numerically abundant and has been shown to be involved in the deacetylation of N-acetylglucosamine-6-phosphate to glucosamine-6-phosphate. This deacetylation is important for both amino sugar metabolism and recycling of the peptidoglycan in the cell wall of bacteria (Park). The second most abundant family appears to be the Glycosyltransferases (GT), of which GT2 and GT4 appear to be the most abundant members. These enzymes are involved in the attachment of glycans to proteins, most commonly those found on the cell surface (Jarrell et al., 2010, Magidovich and Eichler, 2009). Finally, the most abundant family was determined to be the Glycoside Hydrolases (GH). GH23 was the most represented across all strains; members of this family are enzymes active on peptidoglycans. Using the RAST server these enzymes were identified as endopeptidases involved in breaking peptide bonds of non-terminal amino acids. Taken together, the CAZyme analysis seemed to reveal a large number of enzymes which are involved in the production of proteins most likely related to cell wall synthesis by Flavonifractor species.


Figure 3.15 A comprehensive representation of the top 50 PHX genes identified through analysis of the AHG0014 genome. Genes were ranked according to their overall bias relative to the highly expressed gene profiles of the ribosomal proteins, translation and transcription factors and the chaperone degradation genes. The most highly expressed gene is ranked as no. 1 in this schematic and genes are colour coded according to the COG categories predicted for each gene.


Figure 3.16 Predicted pathway of Fructose conversion by AHG0014. The predicted pathways were identified using PATRIC and show a full complement of genes for the conversion of Fructose to Pyruvate via glycolysis. Firstly, Fructose is converted into GAP by a fructokinase and one of three enzymes (6-phosphofructokinase, fructose-bisphosphatase or diphosphate-fructose-6-phosphate 1-phosphotransferase). GAP is then utilised during glycolysis to produce pyruvate: GAP is converted to Glycerate-2-phosphate, which is then converted to phosphoenolpyruvate for a final conversion into pyruvate.


Table 3.4 Gene counts for CAZyme families obtained through annotation of the Flavonifractor genomes by the dbCAN meta server. Overall, the profile of the CAZyme families and the total gene counts represented across each genome are virtually the same. This suggests that these functions are conserved throughout the species.

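CAZyme Family | AHG0014 | YL31 | DSMZ 4000T | DSMZ 6470 | An248 | ATCC BAA442 | UC5.1 2H11 | VE202-03 | L58FAA
Auxiliary activity
AA3 | 1 | - | - | - | - | - | - | - | -
Carbohydrate binding module
CBM32 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
CBM40 | 1 | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 1
CBM48 | - | - | - | 1 | 1 | - | - | - | -
CBM50 | 5 | 4 | 5 | 4 | 4 | 4 | 4 | 5 | 4
CBM54 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Carbohydrate Esterase
CE4 | 3 | 2 | 2 | 3 | 2 | 2 | 2 | 2 | 2
CE7 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2
CE9 | 9 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2
CE10 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2
CE11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Glycoside Hydrolase
GH0 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1
GH3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3
GH4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH5 | - | - | 1 | - | - | - | - | - | -
GH6 | 2 | 2 | 2 | 2 | 2 | - | 2 | 1 | 2
GH13 | - | - | - | - | - | - | 1 | -
GH13_9 | 1 | 1 | - | - | 1 | 1 | 1 | - | 1
GH13_11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH13_20 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH13_30 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH18 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2
GH23 | 4 | 1 | 2 | 3 | 2 | 5 | - | 4 | 10
GH25 | 1 | 1 | 1 | 1 | 1 | 1 | - | 1 | 3
GH28 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH31 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3
GH33 | - | 2 | 1 | 3 | 1 | - | 2 | - | 2
GH36 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH65 | 1 | 1 | - | 1 | 1 | 1 | 1 | 1 | 1
GH77 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GH125 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Glycosyltransferase
GT0 | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT1 | 1 | 1 | 1 | - | 1 | 1 | 1 | - | 1
GT2 | 7 | 7 | 7 | 6 | 8 | 7 | 9 | 6 | 7
GT4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 5
GT5 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1
GT10 | 1 | - | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT13 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT26 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | - | 1
GT28 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2
GT35 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT39 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT51 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GT76 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Surface layer homology
SLH | 1 | 1 | - | 2 | - | - | 1 | 1 | 1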


3.3.6 The chalcone isomerase (chi) homolog and contiguous genes of Flavonifractor spp.

The results in Figure 3.17 confirm that all 9 Flavonifractor genomes possess a homolog of the Eu. ramulus chi, which encodes a protein with extensive sequence identity/similarity to the Eu. ramulus Chi, as described by Braune et al. (2016). The notable exception is the UC5.1-2H11 Chi, which lacks the first 45 amino acids, including the histidine residue at position 34 implicated in coordinating the taxifolin-alphitonin conversion.


Figure 3.17 Alignment of the chi homologs from the Flavonifractor strains with those from Eu. ramulus, using the online multiple sequence alignment software PRALINE (Simossis and Heringa, 2005). Beyond the first 45 amino acids all the proteins share an extensive amount of conservation, with the exception of amino acids present only in the Eu. ramulus Chi sequences at positions 112 and 133-134. The Chi sequence from Flavonifractor strain UC5.1-2H11 is predicted to lack the first 45 amino acids, including the histidine residue at position 43 (denoted by *) that is required for the anchoring of taxifolin as part of its enzymatic conversion to alphitonin, as described by Braune et al. (2016).


I also found the genes immediately flanking chi in the Flavonifractor genomes are conserved with respect to their organisation and predicted functions (Figure 3.18). The chi gene is preceded by two genes: the nearest encoding a presumptive iron-sulphur containing flavoprotein, the second predicted to encode a LysR-family transcriptional regulator that is divergently transcribed relative to the other genes in the cluster. The two genes immediately downstream of chi are currently annotated as encoding “hypothetical proteins”. However, the first of these two hypothetical proteins bears limited similarity to some inner membrane-bound proteins, and the second is predicted to encode an LVIVD repeat sequence, which are motifs found in some bacterial and archaeal cell surface proteins (Adindla et al., 2007). This gene cluster was absent from the sequenced Pseudoflavonifractor genomes including strain AHG0008, which are unable to metabolise quercetin (presented in Chapter 4). Based on these findings, I conclude that this multi-gene cluster encodes for proteins that coordinate quercetin metabolism, via the action of a chalcone isomerase, as well as an accessory flavoprotein and two membrane-associated proteins with unknown functions. Furthermore, all these genes are likely to be regulated at the level of transcription via the action of a LysR-family DNA binding protein.

As shown in Figure 3.18, I also found that the genes flanking the presumptive quercetin metabolism operon are highly conserved. Downstream of this operon, all the Flavonifractor genomes possess genes predicted to encode an oxalate/formate antiporter, followed by a hypothetical protein, a putative nucleoside-binding protein and a putative membrane-bound permease. With the exception of the genomes from strains DSMZ 4000T and ATCC BAA442, the chi-containing operon is preceded by two divergently transcribed genes encoding a putative GntR transcriptional regulatory protein and a hypothetical protein, respectively. Beyond this gene pair, all the genomes except those of strains DSMZ 4000T, ATCC BAA442 and VE202-03 also possess genes encoding a presumptive phosphotransbutyrylase and the 4 subunits of the 2-oxoglutarate oxidoreductase complex. In summary, the chi-containing operon is flanked by homologs of a presumptive oxalate/formate antiporter and ABC-transporter that are also part of the core genome, and most strains also share extensive synteny upstream of the chi-containing operon with respect to functions linked with succinate and butyrate formation.


Figure 3.18 Gene organization and predicted functions of the genes flanking the chalcone isomerase (chi) gene among the 9 Flavonifractor strains. The red box denotes the presumptive 4 gene operon that is proposed to be subject to transcriptional control via a divergently transcribed LysR-family DNA binding protein. Downstream of the chi-containing operon, all 9 genomes also encode homologs of a presumptive oxalate/formate antiporter and ABC-transporter, and most strains also share extensive synteny upstream of the chi-containing operon with respect to functions linked with succinate and butyrate formation. A scale bar of 1 Kb is included for reference.


3.3.7 The AHG0014 genome contains other genes predicted to be involved in Flavonoid, Flavone and Flavonol biosynthesis

I next used the PATRIC bacterial bioinformatics resource centre to identify further genes that are predicted to be involved in Flavonoid, Flavone or Flavonol biosynthesis pathways. Figure 3.19 shows the KEGG pathway for Flavone and Flavonol biosynthesis, with the genes encoded in the AHG0014 genome highlighted. The KEGG pathway depicted shows the presence of hexosyltransferases involved in the production of quercetin 3-O-rhamnoside-7-O-glucoside from quercetin. This pathway appears to show a complete degradation of quercetin to quercitrin, with the final product being quercetin 3-O-rhamnoside-7-O-glucoside.


Figure 3.19 The Flavone and Flavonol Biosynthesis pathway of strain AHG0014. The genome was shown to contain hexosyltransferases involved in the biosynthesis of Quercetin 3-O-rhamnoside-7-O-glucoside from Quercetin. The hexosyltransferases also appear to be involved in the biosynthesis of Kaempferin from Kaempferol and Kaempferol 3-O-beta-D-glucosylgalactoside from Trifolin. Image downloaded from PATRIC and adapted from KEGG pathway database (REF).


3.3.8 Bioinformatics-based assessment of the prevalence of Flavonifractor spp. using shotgun metagenomics datasets

Given my confirmation that the chi-containing operon is part of the Flavonifractor core genome, I decided to use a combination of taxonomic (16S rRNA) and functional genes (the 4 genes identified in the chi operon) to examine the prevalence and relative abundance of Flavonifractor spp. captured in shotgun metagenomics data. I used MetaQuery and its associated datasets for this purpose. The results of these analyses using the Flavonifractor core gene(s) determined by MetaPhlAn analysis are shown in Figure 3.20, and suggest there are no significant differences in the relative abundance and prevalence of Flavonifractor spp. between the cohorts recruited for case-control studies of ulcerative colitis (UC), liver cirrhosis, colorectal cancer and Type-2 diabetes. In contrast, my analysis of the MetaQuery datasets suggests that the relative abundance of Flavonifractor spp. is significantly greater in obese persons and patients with Crohn’s disease compared to the healthy control subjects in these studies. Furthermore, the prevalence and relative abundance of Flavonifractor spp. appear to be reduced in subjects with a normal glucose tolerance (NGT) compared to subjects with an impaired glucose tolerance (IGT) and those diagnosed with Type-2 diabetes. In summation, my analyses of these datasets suggest that the relative abundance and prevalence of Flavonifractor spp. are low but may be elevated in obesity and metabolic syndrome (IGT), as well as in patients with Crohn’s disease but not ulcerative colitis.


Figure 3.20 Abundance of Flavonifractor spp. specific gene counts in the MetaQuery microbiome gene catalogue. Flavonifractor spp. abundance is differentially affected across the disease states represented in the microbiome gene catalogue. Panels A and B show that Flavonifractor spp. gene abundance is significantly increased in Crohn’s disease (CD) patients compared to healthy controls (HC), whilst there is no difference in gene abundance between HC and Ulcerative Colitis (UC) patients. Panels C-E and G show that there is no difference in Flavonifractor spp. abundance between HC and patients with Liver Cirrhosis, Type-2 diabetes (T2D), Rheumatoid Arthritis (RA) or colorectal cancer (CRC). Panel F depicts a slight yet significant increase in Flavonifractor spp. abundance in patients with Obesity compared to controls. Finally, panel H shows that the prevalence of Flavonifractor spp. is significantly reduced in subjects with Normal Glucose Tolerance (NGT) compared to patients with Impaired Glucose Tolerance (IGT) and T2D.


I next used MetaQuery to conduct a BLAST-based search using the protein sequences encoded by the other functional genes encoded by the chi-containing operon, against the human microbiome gene catalogue. The results are illustrated in Figure 3.21, and show there is a significant increase in the relative abundance of all these genes in the dataset produced from Spanish patients with Crohn’s disease, as compared to ulcerative colitis and healthy non-IBD subjects.


Figure 3.21 MetaQuery-based analysis of the relative abundance of chi operon genes. Relative abundance of the genes encoding the putative iron-sulphur flavoprotein (A), chalcone isomerase (B), and two “hypothetical” membrane-associated proteins (C and D) in a case-control study of Spanish patients with inflammatory bowel diseases. Both the prevalence and relative abundance of all 4 genes are significantly greater in the Crohn’s disease patients when compared to both healthy control subjects and ulcerative colitis patients.


3.3.9 Cloning of the strain AHG0014 chi homologue in E. coli

A comprehensive restriction digest was performed on a sample of the extracted plasmid to confirm that the plasmid used for the BL21 and Rosetta™ strain transformations was correct when compared to a simulated gel generated using SnapGene® (Figure 3.22 C and D). Next, E. coli BL21 and Rosetta™ strains were transformed with both the empty pET28b(+) vector and the pET28b(+)chi vector. Single colonies were inoculated into auto-induction medium to allow for a high yield of cells expressing the recombinant protein. Colonies carrying the pET28b(+) vector alone were also inoculated as negative controls. Once the cultures had grown, the cells were harvested and lysed to produce cell lysates, which were checked via SDS-PAGE for the presence of the CHI protein. As shown in Figure 3.23, the cell lysates produced from BL21 and Rosetta™ strains containing the pET28b(+)chi vector had an additional ~32 kDa protein, which is the approximate size of the CHI protein (Figure 3.23 B, lanes 3 and 5 respectively). This product is missing from the lysates produced from BL21 and Rosetta™ strains containing the empty pET28b(+) vector (Figure 3.23 B, lanes 2 and 4 respectively).

[Figure 3.22 image: panels A-D. The gel lane labels in panels C and D name the restriction enzymes XbaI, XhoI, BspHI and NcoI, used singly or in double digests with XhoI (e.g. XhoI + XbaI).]

Figure 3.22 Cloning of the chi gene into a pET28b(+) plasmid for transformation and expression in E. coli. (A) The pET28b(+) vector prior to cloning with the PCR amplified chi gene sequence. (B) The pET28b(+) vector following cloning with the chi gene. (C) A gel simulation of restriction digests of the cloned pET28b(+)CHI vector. (D) A 0.7% agarose gel of the restriction digest performed on the cloned pET28b(+) vector containing the chi gene. Plasmid maps and the agarose gel simulation were visualised using the SnapGene® viewer version 4.0.3.

[Figure 3.23 image: panels A and B. Lane key: 1 = SeeBlue® Plus2 protein standard; 2 = BL21 pET28b(+) empty vector; 3 = BL21 pET28b(+) chi vector; 4 = Rosetta pET28b(+) empty vector; 5 = Rosetta pET28b(+) chi vector. The arrow indicates the CHI protein band.]

Figure 3.23 SDS-PAGE gel confirming expression of CHI protein in E. coli BL21 and Rosetta strains. (A) SeeBlue® Plus2 protein standard marker. (B) Coomassie blue stained SDS-PAGE electrophoresis gel showing proteins in the cell lysate of autoinduced BL21 and Rosetta strains. Lane 1 contains the protein standard. Lanes 2 and 3 contain the cell lysates from BL21 strains with either pET28b(+) or pET28b(+)CHI respectively. Lanes 4 and 5 contain the cell lysates from Rosetta strains with either pET28b(+) or pET28b(+)CHI respectively. Arrow indicates a protein of the right size for the CHI protein at ~32 kDa with a slightly more intense band shown in lane 5 as compared to lane 3.


3.4 Discussion

While studies have already shown that Flavonifractor spp. are capable of degrading a number of polyphenolic compounds, there had been little investigation into the effect polyphenols have on the growth and functional activity of this species in vitro. My initial hypothesis was that Flavonifractor spp. would be able to utilise polyphenols as an energy source and therefore the addition of polyphenols to growth medium would result in an enhanced yield of the bacteria in a culture-based setting. To investigate this my initial studies focused on quercetin interactions with Flavonifractor spp. Quercetin is one of the most studied and most abundant polyphenols ingested by humans. To assess these microbe-polyphenol interactions in vitro I chose a concentration that was within the physiological range ingested by humans, but which would be easy to measure in a lab setting. The addition of 50 µM quercetin to cultures of the Flavonifractor spp. DSMZ 4000T and AHG0014 did not result in any enhanced yield of the bacterial strains despite the observed reduction of quercetin in the medium over time suggesting it was being metabolised by both strains. Numerous polyphenol intervention studies and culture-based studies have suggested that the ability of polyphenols to alter the composition of the GIT microbiota is through antimicrobial mediated activity on specific species (Reviewed by Marin et al., 2015). This anti-microbial effect may have led to bacterial evolution to develop enzymes and mechanisms to counteract growth inhibitory/limiting effects from the presence of polyphenols. I further tested the effects of quercetin when added to actively growing cultures of both Flavonifractor spp. isolates to try and test this hypothesis. My results here indicated that both Flavonifractor spp. were slightly inhibited by the addition of quercetin to the medium.

Though my studies did demonstrate the growth inhibitory effect of quercetin on Flavonifractor spp., the full extent of this inhibition may not be apparent due to the poor growth kinetics for both strains.

Neither strain was capable of reaching an OD600 greater than 0.6 in a complex medium such as BHI. This could be limiting the ability to observe the inhibitory effects of quercetin on these strains, and thus the nutritional requirements of Flavonifractor spp. need to be further explored. Nevertheless, my observations have shown the rapidity with which Flavonifractor spp. can break down quercetin, even under these poor growth conditions, potentially indicating the efficiency of this species in the polyphenol-microbe interactions taking place within the GIT.

Further to my functional assessments, and due to the poor growth rates of the Flavonifractor spp. in complex medium such as BHI, I used the Flavonifractor spp. genomes to make predictions as to any components which could be used to help enhance the growth rate and yield of these species in vitro.

I first used eggNOG-mapper to annotate all 9 Flavonifractor spp. genomes and make predictions of the general metabolism of the species. This analysis revealed that there were a large number of genes predicted to be involved in amino acid transport and biosynthesis. Further to this, I was able to show that the Flavonifractor genomes were auxotrophic for a number of essential amino acids required for bacterial metabolism. Combined, these data suggest that the provision of these compounds could lead to improved growth kinetics for the species. My PHX analysis revealed that among the top 50 annotated genes predicted to be highly expressed in the AHG0014 genome, 6 appeared to be involved in Carbohydrate transport and metabolism. A closer look at these genes predicted they would be involved in the conversion of Fructose or Mannose. Using PATRIC I was then able to develop a predicted pathway of Fructose conversion for the production of Pyruvate. Combined, these data can be utilised in future studies to investigate how the supplementation of amino acids and carbohydrate sources such as Fructose affects the growth kinetics of Flavonifractor spp. Indeed, investigating how these different components affect the growth rate and yield of this bacterial species would allow for more detailed analyses of the polyphenol interactions of this strain. Better growth kinetics would also allow further investigations into the immunostimulatory capacity of this strain observed in Chapter 2 of this thesis.

Next I wanted to investigate how the capacity to degrade polyphenols was reflected within the genomes of DSMZ 4000T, AHG0014 and the other Flavonifractor spp. genomes that were available. I believe that the genomic comparisons presented in this Chapter are the first of their kind for the Flavonifractor spp. Collectively, I have expanded upon the genomic representation for Flavonifractor spp. I was able to produce a draft genome for AHG0014 and, through BLASTn analysis, identify a further four genomes on the NCBI database that had previously only been assigned to the family or genus level. The ANI scores generated for the 9 Flavonifractor strains are greater than the generally accepted 95% threshold for species identification, which supports my inclusion of the four previously unassigned genomes (Konstantinidis et al., 2006). Previously, investigation into the enzymatic mechanisms involved in the quercetin degradation pathway shown in Figure 1.5 led to the identification of a chi gene from Eu. ramulus (Braune et al., 2016). This enzyme was shown to be involved in the conversion of the intermediate product taxifolin to alphitonin via cleavage of the C-ring. Through their investigations, Braune et al. (2016) identified a hypothetical protein from F. plautii DSMZ 4000T which showed 50% similarity to the Eu. ramulus Chi protein. Using the online genome annotation server RAST I was able to demonstrate the presence of the chi gene and other genes potentially related to polyphenol degradation contained within an operon across all 9 Flavonifractor spp. genomes. Of note, a search of the genomes of other quercetin degrading bacterial species, such as Eu. ramulus, did not reveal the presence of the same operon, suggesting that this operon may be unique to the Flavonifractor species. Whilst my genome comparisons presented here are preliminary, I believe that the identification of this novel operon within the core genome of Flavonifractor spp. can help to advance further investigations into the enzymatic mechanisms involved in polyphenol degradation.

Along with the identification of the presumptive quercetin metabolism operon, I was able to demonstrate that the collections of genes immediately upstream and downstream of the operon are also relatively conserved across all 9 Flavonifractor spp. These collections of genes may go some way to revealing more details of the nutritional ecology of this species. Downstream of the chi operon, and also part of the core-genome for Flavonifractor spp., there are genes encoding a presumptive oxalate/formate antiporter. The oxalate/formate antiporter protein has been reported to be involved in the transport of oxalate and formate across cell membranes, as part of the oxalate (Ox) decarboxylation process, by members of the GIT microbiota (Anantharam et al., 1989, Ruan et al., 1992, Stewart et al., 2004). This potential metabolic function of Flavonifractor spp. could have beneficial implications for the host. Too much Ox present in the GIT could potentially translocate to the urinary tract and form calcium oxalate (CaOx) stones, which could then block the kidneys. Bacterial species which can exchange Ox and convert this to formate have the potential to reduce the amount of Ox present in the GIT, thereby reducing circulating CaOx. Increasing the abundance of species with these capabilities could be beneficial in patients suffering from CD, as it has been reported that these patients have a 10-100 fold higher chance of developing kidney stones compared to healthy controls and patients with UC (Kim et al., 2015, Gaspar et al., 2016). Further investigation as to whether Flavonifractor spp. contain a full complement of Ox conversion genes is warranted. Obtaining this knowledge will help to establish whether Flavonifractor spp. play a role in regulating the GIT Ox concentrations in humans.

The chi operon was identified as part of the core genome, along with the classical housekeeping genes (DNA polymerases and tRNAs). I was also able to identify that the core genome included genes supporting spore-formation (e.g. acid-soluble spore protein, spore maturation protein A and sporulation protein YtfJ) and various ABC transporters. The presence of spore-formation genes as part of the core genome supports the original description of the species by Carlier et al. (2010), in which the authors demonstrated that spores may or may not be produced by the species, yet the sporulation-specific gene spo0A was detected by PCR analysis. My observations using phase-contrast microscopy of the AHG0014 and DSMZ 4000T strains did reveal the presence of spores. Further studies could be performed using these strains to assess under which conditions spore-formation may be triggered; indeed, the ability to form spores may be an adaptive advantage for this low-abundance species. Recently, Browne et al. (2016) were able to recover Flavonifractor isolates following ethanol treatment of faecal samples, and this was linked back to the presence of extensive sporulation across the range of species isolated. As well as the oxalate/formate antiporter identified within the chi operon, my core genome analysis revealed annotation of ABC transporters. However, a more comprehensive assessment of transporter and protein export functions (the secretome) warrants further attention, initially via the use of TransporterDB in an attempt to characterize the genomic content encoding transmembrane proteins (for example, as predicted by TMHMM) and secreted/exported proteins. This type of analysis should allow better identification of potential transporter genes and identify their specific roles within the Flavonifractor genomes.

My assessment of the ecological distributions of Flavonifractor spp. using the online software MetaQuery revealed that Flavonifractor spp. were more abundant in patients with obesity and impaired glucose tolerance when compared to healthy controls (Figure 3.20). This metadata search supports my initial findings from Chapter 2, in which I showed that AHG0014 appears to increase the LPS-stimulated NF-κB activation of RAW 264.7 cells. Combined, my observations throughout my PhD research appear contradictory to some reports in which Flavonifractor spp. abundance has been identified as an indicator of health (Kasai et al., 2015, Borgo et al., 2018). The most notable significant increase in Flavonifractor spp. abundance was found in the cohort of patients with CD compared to healthy controls. My observations using the MetaQuery platform offer a different view on Flavonifractor spp. abundance and host health status. A key difference between my analyses and the studies of Kasai et al. (2015) and Borgo et al. (2018) is the gene(s) used to identify Flavonifractor spp. Both Kasai et al. (2015) and Borgo et al. (2018) used conventional 16S rRNA gene sequencing analysis to assess differences in microbiota composition between lean and obese individuals. Both of these studies reported higher abundances of OTUs assigned to F. plautii in lean individuals as compared to obese subjects. For my observations I used the MetaQuery taxa search function, which assigns species identification using MetaPhlAn2 (Truong et al., 2015). This method infers the presence and read coverage of clade-specific markers, the set of which was expanded in 2015 to 184 ± 45 markers per bacterial species. I believe that the use of this method for species identification is more accurate due to the use of multiple marker genes as compared to the single 16S rRNA gene. Despite this, my observations have shown that members of Flavonifractor spp. are clearly representative of subdominant species within the GIT, as there are a large number of samples in all of the studies, for both cases and controls, in which their abundance is undetectable. Further investigations into the prevalence/abundance of Flavonifractor spp. in relation to host health status are now warranted for us to fully comprehend the impact this species may have in the GIT.

Finally, having identified the chi operon as part of the Flavonifractor core-genome, I wanted to assess how the genes in this operon may be involved in the ability of Flavonifractor spp. to metabolise quercetin. As the first step in this investigation I successfully cloned the chi gene into the plasmid vector pET28b(+). The pET28b(+)CHI vector was then successfully transformed into two E. coli strains. A consideration for these experiments was potential codon usage issues in the transformed E. coli strains, given that the parent F. plautii strain is a Gram-positive organism. To try to avoid these limiting factors I used the E. coli BL21 strain and its derivative E. coli Rosetta™ for my transformations. The use of Rosetta™ would allow for the expression of genes containing rare codons, due to the presence of a chloramphenicol-resistance plasmid which supplies tRNAs for these rarely used codons. Unfortunately, due to the time constraints of my program I was unable to continue with more detailed studies using the constructs generated or the strains transformed. Further work in this regard will help to fully identify the roles of the genes in the chi operon and expand on our current understanding of how Flavonifractor spp. sense and metabolise quercetin in the GIT.
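The codon usage consideration can be illustrated with a short sketch that counts, in an in-frame coding sequence, the codons for which Rosetta™ strains are commonly described as supplying additional tRNAs. The codon set (AGG, AGA, AUA, CUA, CCC and GGA), the function name and the toy sequence are assumptions for illustration; they are not taken from the constructs described here, and the supplier documentation should be consulted for the definitive list.

```python
from collections import Counter

# Codons (DNA alphabet) commonly listed as supplemented by the Rosetta tRNA plasmid.
# This set is an assumption for illustration; check the supplier documentation.
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def count_rare_codons(cds: str) -> Counter:
    """Count rare codons in an in-frame coding sequence (DNA or RNA, 5'->3')."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return Counter(c for c in codons if c in RARE_CODONS)


# Hypothetical toy sequence, not the chi gene.
example = "ATGAGGCTAGGACCCAAATAA"
counts = count_rare_codons(example)
print(dict(counts), sum(counts.values()))
```

A high count of such codons in a gene of interest would be one reason to prefer a tRNA-supplemented host over the parental BL21 strain.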

In conclusion, the expansion of the genomic representatives of Flavonifractor spp. has allowed for the identification of an operon within the core-genome. This may help to further elucidate the enzymes involved in the quercetin metabolism capabilities of this underrepresented species and help to underpin the roles Flavonifractor spp. play in host health states.

Chapter 4 Genomic and functional insights into a novel cluster of Pseudoflavonifractor sp. within Clostridium cluster IV

4.1 Introduction

Quercetin and other polyphenols exhibit their effects on host health and the gastrointestinal tract (GIT) microbiota in a number of different ways. One of the more consistently reported mechanisms of action for polyphenols is that they elicit antimicrobial effects on members of the resident GIT microbiota, leading to compositional changes (Marin et al., 2015). What is lacking in these studies is clarification as to whether the antimicrobial effects are a result of a direct inhibition from the polyphenols per se on specific bacterial species; or whether the observed effects are due to the production of other metabolites from polyphenols, which are inhibitory to other specific members of the microbiota. Clarifying this knowledge gap will allow for us to more comprehensively understand the implications of the inclusion of different polyphenols in our diets, for instance, to better utilise polyphenols as treatments to beneficially alter the GIT microbiota leading to beneficial health outcomes for the host. In Chapter 3, I presented my studies of the genome-based analyses of F. plautii strains and quercetin metabolism. Both Flavonifractor spp. and members of the closely related genus Pseudoflavonifractor belong to Clostridium cluster IV, which is a key group of bacteria shown to be reduced in association with inflammatory GIT diseases such as IBD (Frank et al., 2007).

The genus Pseudoflavonifractor was originally described and characterised by Carlier et al. (2010), who showed that members assigned to this genus, while phylogenetically closely related to Flavonifractor spp., proved to be incapable of degrading quercetin. My literature searches suggest there is very little known about the genus Pseudoflavonifractor: I was only able to find 11 publications that have mentioned, isolated or detected members of the genus (Klaring et al., 2013, Molina et al., 2014, Oakley et al., 2014, Ó Cuív et al., 2015, Polansky et al., 2015, Louis et al., 2016, Clooney et al., 2016, Borda-Molina et al., 2016, Ricaboni et al., 2017, Sakamoto et al., 2018, Yitbarek et al., 2018). Of these, 6 of the 11 only identify and remark on Pseudoflavonifractor as part of 16S rRNA gene profiling or metagenomics data analysis. The remainder refer specifically to P. capillosus (Klaring et al., 2013, Molina et al., 2014, Clooney et al., 2016, Ó Cuív et al., 2015, Sakamoto et al., 2018), with only 2 providing descriptions of new species closely related to P. capillosus (Klaring et al., 2013, Sakamoto et al., 2018).

The type strain P. capillosus DSMZ 23940T is currently the only cultured isolate for which genome sequence data is also available. Thus, members of the Pseudoflavonifractor genus remain underrepresented in culture collections worldwide, and much remains to be learned about the physiology and metabolism of this taxon. In Chapter 2, I described the isolation of a genetically tractable bacterium (strain AHG0008) shown by 16S rRNA gene phylogenetic analysis to be very closely related to the P. capillosus type strain DSMZ 23940T. Additionally, I was able to show this isolate may be a candidate for the production of “anti-inflammatory” factors, as reflected in the reduction of NF-κB regulated reporter gene activity by RAW264.7 macrophage cells. It is clear the Pseudoflavonifractor genus remains underexplored and that its members are poorly described with respect to their prevalence and abundance in the human GIT. There is also a lack of knowledge with respect to their role in host health or disease states, and the effects of polyphenols such as quercetin on their growth. In that context, P. capillosus has been previously shown to be incapable of degrading quercetin (Carlier et al., 2010), but it remains to be seen whether this holds true for other members of the genus.

With that background, the aims of this Chapter were to first compare the growth kinetics and the potential for quercetin metabolism of P. capillosus strain DSMZ 23940T and strain AHG0008. I next used the draft genome sequence from strain AHG0008 to compare with that from P. capillosus strain DSMZ 23940T and found these two genomes were very different in terms of composition and size. Based on these results, I was able to recover a number of “unclassified” genomes and confirm these to be highly similar to that of strain AHG0008. Collectively, these results expand the genetic and functional diversity represented by this subdivision of Clostridium cluster IV.

4.2 Materials and Methods

4.2.1 Bacterial strains, medium and growth conditions

The P. capillosus type strain DSMZ 23940T was purchased from DSMZ GmbH (Germany). Strain AHG0008 was isolated as described in Chapter 2 of this thesis. Strains were cultured using Brain-Heart-Infusion (BHI) agar and broth media (BD Biosciences, Australia). Both strains were typically streaked from axenic anaerobic glycerol stocks onto agar medium and incubated for 48 hrs at 37 °C in a Coy vinyl anaerobic chamber with an oxygen-free atmosphere of nitrogen/carbon dioxide/hydrogen (85% N2: 10% CO2: 5% H2). Where necessary, quercetin (Sigma Aldrich) was solubilized in ethanol and added to the medium at a final concentration of 50 µM. For strain AHG0008, the media was prepared to also include 100 µg/ml erythromycin (Erm, ThermoScientific) for maintenance of plasmid pEHR512112, as outlined in Chapter 2 (Ó Cuív et al., 2015).

Single colonies of strains DSMZ 23940T and AHG0008 were picked into Hungate tubes containing 10 ml BHI broth and incubated for ~12 hrs overnight (O/N), and then 0.1 ml was used to inoculate 10 ml BHI broth cultures containing 50 µM quercetin or carrier alone. These cultures were incubated within a water bath at 37 °C, with bacterial growth measured hourly (OD600) for up to 24 hours using a ThermoScientific Spectronic™ 200E spectrophotometer. These growth experiments were performed in technical triplicate and with two biological replicates of each strain (i.e. n=6). Uninoculated media (EtOH and 50 µM quercetin supplemented broths) were used as negative controls.
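Hourly OD600 readings of this kind can be summarised as a specific growth rate. The following is a minimal sketch, assuming that only readings from the exponential phase are supplied and that a least-squares fit of ln(OD600) against time is an acceptable estimate; the function name and the readings are hypothetical, and this is not necessarily the calculation used for the results reported in this Chapter.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) against time, in units of per hour."""
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired readings")
    logs = [math.log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


# Hypothetical exponential-phase readings (OD doubling every 2 hours).
times = [0, 1, 2, 3, 4]
ods = [0.05 * 2 ** (t / 2) for t in times]
mu = specific_growth_rate(times, ods)
doubling_time_h = math.log(2) / mu
print(round(mu, 3), round(doubling_time_h, 2))
```

Restricting the fit to the exponential phase matters, because lag and stationary phase readings would bias the slope downwards.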

The results of the growth studies are reported as mean ± standard error of the mean (S.E.M.) of two biological replicates, with three technical replicates per biological replicate. Student's t-tests were performed to assess the final yield of each strain and the degradation of quercetin. Differences were considered significant if P < 0.05. All statistical analyses were performed using GraphPad Prism v.7.03 for Windows (GraphPad Software, La Jolla California USA, www.graphpad.com).
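For reference, the mean ± S.E.M. summary described above can be sketched as follows, treating the six readings (2 biological × 3 technical replicates) as n=6 and using the sample standard deviation. The values shown are hypothetical and are not data from this thesis; the analyses reported here were performed in GraphPad Prism.

```python
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return statistics.mean(values), statistics.stdev(values) / n ** 0.5


# Hypothetical final OD600 values: 2 biological x 3 technical replicates (n=6).
with_quercetin = [0.41, 0.43, 0.40, 0.44, 0.42, 0.42]
m, sem = mean_sem(with_quercetin)
print(f"{m:.3f} ± {sem:.3f}")
```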

4.2.2 AHG0008 genomic extraction, sequencing and data assembly

Pseudoflavonifractor sp. AHG0008 cultures were initiated as described above, and 0.5 ml of each O/N culture was transferred into serum bottles containing 50 ml of BHI supplemented with 100 µg/ml Erm. The cultures were then monitored until they had reached an OD600 of ~0.2 (mid-exponential phase of growth) and the cells were then harvested by centrifugation at 3220 x g for 20 minutes in an Eppendorf 5810R benchtop centrifuge.

Genomic DNA (gDNA) was extracted from the harvested cells using the RBB+C method (Yu and Morrison, 2004) modified for a larger scale preparation. Here, the harvested cells were suspended in 2 ml of RBB+C lysis buffer and then centrifuged as above. The cell pellets were then resuspended in 0.8 ml of RBB+C lysis buffer and incubated at 75 °C for 5-10 mins. The cell suspensions were subsequently placed at room temperature and, once cooled, 20 µl of lysozyme (200 mg/ml stock), 2 µl of mutanolysin (20 U/µl) and 10 µl of achromopeptidase (200 U/µl stock) were added and the cell suspension gently mixed. The samples were then incubated at 37 °C for 1 hour, then 40 µl of 10% sodium lauryl sarcosine and 10 µl of proteinase K (20 mg/ml) were added and the samples incubated for 30 mins at 55 °C. Following this, an equal volume of Phenol:Chloroform:Isoamyl alcohol (PCI, 25:24:1, ~0.4 ml) was added to the lysis mixture, vortexed, and centrifuged at 21,130 x g for 5 mins using an Eppendorf 5424R benchtop centrifuge. The aqueous phase was carefully removed and the PCI extraction repeated. Then 40 µl of 3 M sodium acetate and 400 µl of isopropanol were added and the sample gently mixed, centrifuged as described above, and the supernatant removed. The pelleted gDNA was then washed twice with 500 µl of 70% (v/v) ethanol and then vacuum dried. The gDNA samples were then pooled by resuspending in 50 µl of TE buffer and treated with 0.5 µl of RNAse A prior to incubation at 37 °C for 1 hour. The quality of the gDNA was assessed through visualisation on a 1% agarose gel and the quantity assessed using the Quantus™ fluorometer and QuantiFluor® dsDNA system as per manufacturer instructions (Promega, Australia; see Appendix 6 for DNA gels and quantities). The suspended gDNA samples were then stored at 4 °C until further use.

Prior to submission for sequencing, the gDNA sample was precipitated using ethanol as per the protocol in Appendix 6. Following precipitation the sample was resuspended in 50 µl of Tris-HCl and quantified, as described above, before being further diluted to 100 ng total gDNA for submission. The Pseudoflavonifractor sp. AHG0008 gDNA was submitted to the Australian Centre of Ecogenomics (ACE) and sequenced using an Illumina NextSeq 500 with a 2 x 150bp High Output kit with V2 chemistry. The resulting genomic sequence data was then trimmed using the Trimmomatic command line version 0.36. The data was quality checked and de novo assembled using the SPAdes command line version 3.10.1 (Bankevich et al., 2012) and annotated using Prokka version 1.12 (Seemann, 2014). The draft genome for AHG0008 was then further assessed for genome completeness and contamination using CheckM version 1.0.7 (Parks et al., 2015). The contigs were also reordered using Mauve, as outlined in Chapter 3 (Darling et al., 2010), with the P. capillosus DSMZ 23940T genome (Carlier et al., 2010) as the reference.
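Once a draft assembly is available, a few summary metrics are typically computed before downstream analysis. The sketch below calculates total length, N50 and G + C percentage from a list of contig sequences; N50 is not reported in this Chapter and is included here only as a commonly used summary, and the function name and toy contigs are hypothetical.

```python
def assembly_stats(contigs):
    """Return total length, N50 and G + C percentage for a list of contig sequences."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total = sum(lengths)
    if total == 0:
        raise ValueError("no sequence data")
    # N50: length of the contig at which the cumulative length of the
    # size-sorted contigs first reaches half of the assembly length.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return {"total_bp": total, "n50": n50, "gc_percent": 100.0 * gc / total}


# Hypothetical toy contigs, for illustration only.
toy_contigs = ["GGCC" * 5, "ATGC" * 3, "AT" * 4]
stats = assembly_stats(toy_contigs)
print(stats)
```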

4.2.3 Genome-based analyses

A BLAST search of the NCBI genome database using the draft genome for AHG0008 revealed three metagenome assembled genomes (MAGs) with 99% similarity to AHG0008. The three MAGs, the draft genome for AHG0008 and the genomes for Pseudoflavonifractor sp. DSMZ 23940T and ASF500 were uploaded onto the software platform EDGAR (Blom et al., 2009) for whole genome comparison and generation of Average Nucleotide Identity (ANI) scores. The development plots for the pan/core genome were generated using the Perl script PGap (Zhao et al., 2012) on the Bio-Linux Platform v.8.0.5 (http://environmentalomics.org/bio-linux-software-list/). The PGap analysis pipeline provided gene clusters assigned to either the core or pan-genome, and these were visualised using the software package PanGP (Zhao et al., 2014). All genomes were also uploaded onto rast.nmpdr.org to assess the genomes for the presence/absence of polyphenol degradation genes such as chi (Aziz et al., 2008, Overbeek et al., 2014, Brettin et al., 2015). The draft genome sequence for AHG0008 was further examined through the JGI IMG/ER to analyse the genomic features and to gain further assignment of genes to KEGG or COG orthologies.
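Pairwise ANI scores such as those generated by EDGAR are often interpreted against the ~95% species threshold cited in Chapter 3. A minimal sketch of that interpretation is given below: genomes are joined into the same group when their pairwise ANI meets the threshold (single linkage). The genome names and ANI values are hypothetical placeholders, and the grouping rule is an assumption for illustration rather than part of the EDGAR workflow.

```python
def group_by_ani(names, ani, threshold=95.0):
    """Single-linkage grouping: genomes are joined when pairwise ANI >= threshold."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in ani.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical ANI values (%), for illustration only.
names = ["genome_A", "genome_B", "genome_C", "genome_D"]
ani = {
    ("genome_A", "genome_B"): 98.7,
    ("genome_A", "genome_C"): 74.2,
    ("genome_B", "genome_C"): 73.9,
    ("genome_C", "genome_D"): 75.0,
    ("genome_A", "genome_D"): 72.1,
    ("genome_B", "genome_D"): 71.8,
}
groups = group_by_ani(names, ani)
print(groups)
```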

Usage of clusters of orthologous groups (COGs/NOGs) (Huerta-Cepas et al., 2016) was computed using the online software eggNOG-mapper (Huerta-Cepas et al., 2017). Predicted highly expressed (PHX) and putative alien (PA) gene analysis was performed using software available from the University of Georgia's Institute of Bioinformatics (http://www.cmbl.uga.edu/software/phxpa.htm) (Karlin and Mrazek, 2000). Briefly, this analysis compares the codon bias of highly expressed profiles for ribosomal proteins, chaperone proteins and transcription/translation factors to that of all proteins in a given genome. The proteins are then assigned an overall value and bias value according to the highly expressed categories. From this, each protein can be assigned as PHX for those displaying a similar bias or PA for those that show a dissimilar bias. The total ribosomal proteins, chaperone proteins and transcription/translation factors for AHG0008 were placed into text files and then uploaded to the above web-server for analysis. The resultant PHX and PA gene files were then ranked according to their overall bias value. The DataBase for automated Carbohydrate-active enzyme Annotation (dbCAN2; Zhang et al., 2018b) was used to provide automated and comprehensive CAZyme annotation for each of the Flavonifractor spp. genomes.
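The codon bias comparison that underlies the PHX/PA analysis can be sketched as follows. This is a simplified illustration in the spirit of the measure described by Karlin and Mrazek (2000): codon frequencies are normalised within each amino acid, and the absolute differences between a gene set and a reference set are summed, weighted by the amino acid frequencies of the gene set. It is not the web-server implementation, the function names and toy sequences are hypothetical, and the original publication should be consulted for the exact definition and the PHX/PA thresholds.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(genes):
    """Count sense codons across a list of in-frame coding sequences."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - len(gene) % 3, 3):
            codon = gene[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":  # skip stops and ambiguous codons
                counts[codon] += 1
    return counts


def family_frequencies(counts):
    """Codon frequencies normalised to sum to 1 within each amino acid."""
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


def codon_bias(gene_set, reference_set):
    """Amino-acid-weighted sum of absolute codon frequency differences."""
    f_counts = codon_counts(gene_set)
    f = family_frequencies(f_counts)
    c = family_frequencies(codon_counts(reference_set))
    total = sum(f_counts.values())
    aa_weight = Counter()
    for codon, n in f_counts.items():
        aa_weight[CODON_TABLE[codon]] += n / total
    bias = 0.0
    for codon, aa in CODON_TABLE.items():
        if aa == "*" or aa_weight[aa] == 0:
            continue
        bias += aa_weight[aa] * abs(f.get(codon, 0.0) - c.get(codon, 0.0))
    return bias


# Hypothetical toy sequences, for illustration only.
reference = ["AAAAAGGAAGAG", "AAGAAAGAGGAA"]
candidate = ["AAAAAAGAAGAA"]
print(round(codon_bias(candidate, reference), 3))
```

A gene whose bias relative to the highly expressed classes is low, but whose bias relative to all genes is high, would be the kind of candidate the PHX analysis is designed to flag.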

4.3 Results

4.3.1 Taxonomic assignment and cell morphology of strain AHG0008

Pseudoflavonifractor sp. strain AHG0008 forms short thin rods that stain Gram-positive (Figure 4.1), and the phylogenetic alignments based on full length 16S rRNA gene sequences are shown in Figure 4.2. These results show that strain AHG0008 is phylogenetically most closely related to Pseudoflavonifractor sp. DSMZ 23940T and strain ASF500, which is one of the members of the Altered Schaedler Flora (ASF) species and is currently classified as a Pseudoflavonifractor sp. (Wymore Brand et al., 2015). My BLASTn alignments of the 16S rRNA gene show the gene from strain AHG0008 shares 95% and 97% identity with the same genes from P. capillosus DSMZ 23940T and Pseudoflavonifractor strain ASF500, respectively. These results suggest that all three strains belong to the genus Pseudoflavonifractor, but that strains AHG0008 and ASF500 are divergent from P. capillosus DSMZ 23940T and, perhaps, represent a second species group.
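As an illustration of what a percentage identity between two 16S rRNA gene sequences means, the sketch below computes identity over the ungapped columns of a pairwise alignment. This is an assumption made for simplicity: BLASTn reports identity over the full alignment length, so values can differ where gaps are present. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment ("-" marks a gap), for illustration only.
identity = percent_identity("ACGT-A", "ACGAGA")
print(identity)
```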

Figure 4.1 Photomicrograph of Gram-stained cells of Pseudoflavonifractor sp. AHG0008, following culture with BHI broth medium containing 100 µg/ml of Erm. The scale bar (100 µm) is included for reference.

[Figure 4.2 image: 16S rRNA gene phylogenetic tree. Taxa shown: AHG0014; F. plautii YL31 (KR364773); F. plautii (T) DSMZ 4000 (AY724678); F. plautii An248; F. plautii DSM 6740 (Y18187); ATCC BAA-442; P. capillosus (T) DSMZ 23940 (AY136666); AHG0008; Bacterium ASF500 (AF157051.1); Papillibacter cinnamivorans (T) DSMZ 12816 (AF167711); Sporobacter termitidis (T) DSMZ 10068 (Z49863); Oscillibacter valericigenes (T) DSMZ 18026 (AB238598); Oscillospira guilliermondii OSC3 (AB040497); Clostridium butyricum (T) DSMZ 10702 (AJ458420). Scale bar: 0.020.]

Figure 4.2 Phylogeny of strains AHG0008 and AHG0014 based on 16S rRNA gene sequence alignments as described in the Materials and Methods, and using sequences derived from strains assigned to the most closely related taxa, as identified and described in Chapter 2. Strain AHG0008 is separable from the two type strains representing both Pseudoflavonifractor and Flavonifractor (shown in bold), and by my analyses is instead most closely related to the bacterium ASF500, which is derived from the Altered Schaedler microbiota (Wymore Brand et al., 2015). Strain AHG0014 is assigned to the cluster of isolates that currently encompass the F. plautii group, including the type strain DSMZ 4000T.

4.3.2 Growth of P. capillosus strain DSMZ 23940T and strain AHG0008 in the presence of quercetin

The growth of P. capillosus DSMZ 23940T with BHI medium is more robust than that of Pseudoflavonifractor sp. AHG0008, and no significant reductions in quercetin concentration were detected following the culture of either DSMZ 23940T or strain AHG0008 in the presence of 50 µM quercetin (Figure 4.3 and Figure 4.4). However, the final cell yield (as measured by OD600) for both strains DSMZ 23940T and AHG0008 appeared to be reduced in the presence of 50 µM quercetin, suggesting some inhibitory effect(s) of this flavonol on microbial growth. In that context, when I added quercetin to cultures that had reached mid-logarithmic phase of growth, there was a significant reduction in both the growth rate and final yield of both strains, which was not evident in cultures receiving carrier alone (Figure 4.5). Based on these results, I conclude that, like P. capillosus DSMZ 23940T, strain AHG0008 is not capable of quercetin metabolism, which supports the phylogenetic placement of strain AHG0008 within the genus Pseudoflavonifractor rather than Flavonifractor.

Figure 4.3 Growth kinetics of P. capillosus DSMZ 23940T in the presence of quercetin. Panel A shows the longitudinal monitoring of both OD600 and A375 as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (yellow) and uninoculated BHI medium (blank). As expected, there were virtually no changes in quercetin concentration during the time course of the experiment in both cultures. Notably, the growth rate and yield of DSMZ 23940T are reduced in the presence of quercetin (yellow triangles) as compared to the presence of vehicle alone, DMSO (grey shading). Panel B shows the OD600 values of DSMZ 23940T in the presence of quercetin or vehicle alone after 24 hours of growth; there is a significant reduction in the yield of DSMZ 23940T when grown in the presence of quercetin. The data points represent the mean values (±S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6). The asterisks denote statistical significance of P = 0.003.

Figure 4.4 Growth kinetics of strain AHG0008 when cultured in the presence of quercetin. Panel A shows the longitudinal monitoring of both OD600 and A375 as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (blue) and uninoculated BHI medium (blank). As expected, there were virtually no changes in quercetin concentration during the time course of the experiment in both cultures. There does not appear to be much difference in the growth rates when AHG0008 is cultured in the presence of quercetin (blue triangles) compared to vehicle alone (grey shading). Panel B shows the OD600 values of AHG0008 in the presence of quercetin or vehicle alone after 24 hours of growth; there is a significant reduction in the yield of AHG0008 when grown in the presence of quercetin. The data points represent the mean values (±S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6). The asterisks denote statistical significance of P = 0.003.

Figure 4.5 The effects of adding 50 µM quercetin to actively growing cultures of DSMZ 23940T (A) and AHG0008 (B). Both panels depict the longitudinal monitoring of both OD600 and A375 as a measure of bacterial growth and quercetin metabolism, respectively, for both inoculated (yellow or blue symbols) and uninoculated BHI medium (grey symbols). Inoculated cultures are further differentiated for those receiving quercetin as compared to those receiving vehicle alone. Notably, both strains appear to display active growth much earlier during the time-course compared to when quercetin is present in the medium upon inoculation, as depicted in Figure 4.3 and Figure 4.4. Conversely, there appear to be only minimal differences in the growth rates following the addition of quercetin or vehicle to these actively growing cultures; however, a lower yield is still achieved for both strains in the presence of quercetin. As expected, in both inoculated and uninoculated BHI medium the quercetin concentration remains virtually stable throughout the period measured following its addition to actively growing cultures. The data points represent mean values (± S.E.M.) produced from 2 biological replicates each with 3 technical replicates (n=6).

4.3.3 Whole genome analysis of strain AHG0008

The AHG0008 genome data could be assembled into 208 contiguous sequences using SPAdes (Bankevich et al., 2012), and the CheckM analyses predict the draft genome is 99.3% complete with ~0.7% contamination, well within the accepted standards for quality (Parks et al., 2017). As such, the AHG0008 genome is estimated to be ~2.8 Mbp, and therefore much smaller than the genomes of P. capillosus type strain DSMZ 23940T (~4.3 Mbp) and ASF500 (~3.6 Mbp, Table 4.1). Based on these findings I used the RAST server and identified that, in fact, the genome of Ruminococcaceae bacterium D16 was most similar to that of strain AHG0008. Ruminococcaceae bacterium D16 is an unclassified genome submitted under Bioproject number PRJNA42541 as a direct submission. Ruminococcaceae D16 has been reported to be associated with the presence of diet related genes linked to plant metabolism in animals (Wang et al., 2011, Vital et al., 2015). I then used the Mauve alignment software to attempt to reorder the AHG0008 contigs in reference to the genomes of DSMZ 23940T, Pseudoflavonifractor sp. ASF500 and Ruminococcaceae bacterium D16, and to assess the degree of synteny between these genomes. These results are shown in Figure 4.6 and show there were very few syntenic regions shared among or between any of these genomes. As such, these observations suggest that while the 16S rRNA gene phylogeny predicts these strains are most likely to represent the genus Pseudoflavonifractor, the genome composition and arrangement of its membership is quite divergent, unlike my findings for members of the genus Flavonifractor.

[Figure 4.6 image: Mauve alignment; track labels: DSMZ 23940T, ASF500, Ruminococcaceae D16, AHG0008]

Figure 4.6 Mauve alignment of the draft genome for AHG0008 with the most closely related species based on 16S rRNA gene alignments. The alignments of the genomes for P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500 and Ruminococcaceae bacterium D16 with AHG0008 show little synteny with each other suggesting that at the genome level these species are not as closely related as is suggested by the 16S rRNA gene phylogenetic alignment. The genomes represented were reordered against the P. capillosus DSMZ 23940T strain and the final reordered file for each genome was used in the Mauve alignment to compare all the isolates.

Based on these results, I used BLASTn to search the Whole Genome Reference sequences to identify any genome(s) showing greater identity/similarity scores with strain AHG0008, and three metagenome assembled genomes (MAGs) with a high similarity (~99%) to AHG0008 were recovered. These MAGs are currently described as being derived from members of either Clostridium sp. or Flavonifractor sp. (Browne et al., 2016). The metrics for these MAGs and the other genomes are presented in Table 4.1, which shows that despite their very good CheckM-based estimates of (near) completeness and (low) contamination scores, there are clear differences in genome size and GC content among the genomes. I then used the GView workflow to produce a pangenome atlas from the genomes of Pseudoflavonifractor DSMZ 23940T, ASF500, strain AHG0008, and the three MAGs. The result of this analysis is shown in Figure 4.7, which shows the large amount of gene conservation between the strain AHG0008 genome and the three MAGs. Furthermore, and not unlike the findings of the initial Mauve alignments, these genomes share only a limited amount of gene conservation with those from P. capillosus DSMZ 23940T and strain ASF500. Indeed, when I used Mauve to examine the degree of synteny among the three MAGs and the draft genome from strain AHG0008 (as the reference genome), extensive syntenic regions shared among all four genomes were revealed (Figure 4.8), and the pangenome atlas produced using only these four genomes supports the conclusion that they represent a distinct group (Figure 4.9).

To further ascertain the phylogenetic relationships among all these bacteria, I first revised my 16S rRNA gene analysis to include the sequences recovered from the 3 MAGs, and these results are shown in Figure 4.10. There is now a clear separation of P. capillosus DSMZ 23940T from the remaining strains and, furthermore, Pseudoflavonifractor sp. ASF500 is separated from strain AHG0008 and the three MAGs. Deeper analysis was warranted following this, as the 16S rRNA gene identity cut-offs regularly employed are clearly insufficient to fully separate AHG0008 from ASF500. I then used EDGAR and MUSCLE to produce a set of genes conserved in all the genomes, and these were concatenated and the whole genome phylogeny of the strains was inferred using PHYLIP. The results of these analyses are shown in Figure 4.11 and are very similar to the relationships in the 16S rRNA gene analysis. I also determined the Average Nucleotide Identity (ANI) scores among all these genomes, as well as the Flavonifractor genomes examined in Chapter 3 (Figure 4.12), which further validates the relationships among these bacteria. Both P. capillosus DSMZ 23940T and ASF500 remain separated from the F. plautii strains, and also from the core set of genes recovered from the genomes of AHG0008 and the MAGs, with their ANI similarity scores ranging between 71-76%.


I also attempted to produce core and accessory genome development plots for AHG0008 and the three MAGs using PGAP. These plots were then visualised using PanGP, and the results are shown in Figure 4.13. As might be expected, the accessory genome trends upwards with the addition of each genome, indicating that the coverage of the lineage’s genetic diversity is not extensive. Based on these collective results, it seems reasonable to conclude that the “cluster” of bacterial isolates and MAGs that comprise the Pseudoflavonifractor lineage represent previously unidentified and uncharacterised groups of bacteria; the largest of these now comprises strain AHG0008 and the three MAGs, which could potentially be classified as a Pseudoflavonifractor sp. nov., a member of Clostridium cluster IV.


Table 4.1 CheckM completeness and contamination scores for the draft genome of AHG0008 and phylogenetically related strains: P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500, Ruminococcaceae bacterium D16 and uncultured Flavonifractor sp. and Clostridium sp. from Browne et al. (2016).

Genome ID                      Source  Location  G + C content (%)  Genome size (bp)  Completeness (%)  Contamination (%)  Estimated size (Mbp)
DSMZ 23940T                    Human   USA       59.1               4,241,076         98.32             1.48               4.3
ASF500                         Mouse   CAN       58.8               3,658,722         99.33             0.34               3.6
Ruminococcaceae bacterium D16  Human   USA       56.8               3,197,355         99.33             0                  3.2
AHG0008                        Human   AUS       56.3               2,804,758         99.33             0.69               2.8
2789STDY5834937                Human   UK        56.0               3,071,079         99.33             0                  3.1
2789STDY5834895                Human   UK        56.4               2,891,162         99.33             0                  2.9
2789STDY5608878                Human   UK        56.1               2,736,922         95.83             0                  2.7
QIN326                         Human   China     57.4               2,226,793         81.73             1.34               2.7


Table 4.2 General genome features for isolate genomes P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500, Ruminococcaceae bacterium D16 and AHG0008 recovered using the JGI IMG/ER annotation engine. Both the number of genes and the proportion of the genome dedicated to different categories are reported.

Feature                                                 DSMZ 23940T         ASF500              Bacterium D16       AHG0008
DNA, total number of bases                              4241853 (100.00%)   3658722 (100.00%)   3197155 (100.00%)   2811129 (100.00%)
DNA coding bases                                        3913285 (92.25%)    3194298 (87.31%)    2841065 (88.86%)    246069 (87.53%)
DNA, G + C content                                      2507500 (59.11%)    2150308 (58.77%)    1815989 (56.80%)    1579709 (56.19%)
Protein coding genes                                    4833 (98.49%)       3563 (98.10%)       2994 (98.13%)       2909 (97.00%)
RNA genes                                               74 (1.51%)          69 (1.90%)          57 (1.87%)          69 (2.30%)
rRNA genes                                              7 (0.14%)           10 (0.28%)          4 (0.13%)           7 (0.23%)
5S rRNA                                                 1 (0.02%)           2 (0.06%)           3 (0.10%)           2 (0.07%)
16S rRNA                                                3 (0.06%)           4 (0.11%)           1 (0.03%)           3 (0.10%)
23S rRNA                                                3 (0.06%)           4 (0.11%)           -                   2 (0.07%)
tRNA genes                                              57 (1.16%)          59 (1.62%)          59 (1.74%)          56 (1.87%)
Other RNA genes                                         10 (0.20%)          -                   -                   6 (0.20%)
Protein coding genes with function prediction           2677 (54.55%)       2482 (68.34%)       2228 (73.03%)       180 (66.02%)
Without function prediction                             2156 (43.94%)       1081 (29.76%)       766 (25.11%)        929 (30.98%)
Protein coding genes with enzymes                       817 (16.65%)        636 (17.51%)        675 (22.12%)        614 (20.47%)
Protein coding genes connected to KEGG pathways         844 (17.20%)        682 (18.78%)        703 (23.04%)        650 (21.67%)
Protein coding genes connected to KEGG Orthology (KO)   1618 (32.97%)       1300 (35.79%)       1318 (43.20%)       1171 (39.05%)
Protein coding genes connected to MetaCyc pathways      719 (14.65%)        557 (15.34%)        593 (19.44%)        531 (17.71%)
Protein coding genes with COGs                          2105 (42.90%)       1862 (51.27%)       1813 (59.42%)       2014 (67.16%)
With KOGs                                               509 (10.37%)        384 (10.57%)        442 (14.49%)        -
With Pfam                                               3011 (61.36%)       2610 (71.86%)       2372 (77.75%)       2168 (72.29%)
With TIGRfam                                            1007 (20.52%)       871 (23.98%)        881 (28.88%)        817 (27.24%)
With InterPro                                           3064 (62.44%)       1696 (46.70%)       1555 (50.97%)       -
Fused protein coding genes                              257 (5.24%)         81 (2.40%)          81 (2.65%)          133 (4.43%)
Protein coding genes coding signal peptides             286 (5.83%)         199 (5.48%)         172 (5.64%)         133 (4.43%)
Protein coding genes coding transmembrane proteins      1075 (21.91%)       829 (22.82%)        786 (25.76%)        708 (23.61%)
Genes in biosynthetic clusters                          152 (3.1%)          95 (2.62%)          50 (1.64%)          -


[Figure 4.7 legend labels: 2789STDY5608878; 2789STDY5834895; 2789STDY5834937; AHG0008; ASF500; DSMZ 23940T; Pangenome; GC content; GC skew]

Figure 4.7 A pangenome atlas for AHG0008, P. capillosus DSMZ 23940T, Pseudoflavonifractor sp. ASF500 and three metagenome assembled genomes (MAGs). The pangenome atlas shows that AHG0008 and the three MAGs from Browne et al. display a relatively high similarity to each other, yet they differ substantially from their closest known relatives, P. capillosus DSMZ 23940T and Pseudoflavonifractor sp. ASF500, which had been identified following 16S rRNA gene alignments.


[Figure 4.8 track labels: AHG0008; Uncultured Clostridium sp. 2789STDY5608878; Uncultured Flavonifractor sp. 2789STDY5834937; Uncultured Flavonifractor sp. 2789STDY5834895]

Figure 4.8 Mauve alignment of the draft genome sequence for Pseudoflavonifractor sp. AHG0008 and three metagenome assembled genomes. The four genome sequences show a high degree of synteny (depicted by the individually coloured blocks) with some xenologous regions (represented by blank spaces) present in each genome.


[Figure 4.9 legend labels: 2789STDY560887; 2789STDY583489; 2789STDY583493; AHG0008; Pangenome; GC content; GC skew]

Figure 4.9 A pangenome atlas of AHG0008 and the three MAGs only. The extensive colour blocked regions show the degree of the genomic content which is shared by all four genomes and there are few white regions depicting content which is unique to the individual genomes.


[Figure 4.10 tree taxon labels: F. plautii DSMZ 6740 (Y18187); Clostridiales bacterium VE202-03; Lachnospiraceae 7_1_58FAA; F. plautii An248; F. plautii YL31 (KR364773); F. plautii (T) DSMZ 4000 (AY724678); Clostridia bacterium UC5.1 2H11; ATCC BAA442; Pseudoflavonifractor capillosus (T) DSMZ 23940 (AY136666); Bacterium ASF500; Uncultured Flavonifractor sp 2789STDY5834937; Uncultured Clostridium sp 2789STDY5608878; AHG0008; Uncultured Flavonifractor sp 2789STDY5834895; Clostridium viride (T) DSMZ 6836 (X81125); Oscillibacter valericigenes (T) DSMZ 18026 (AB238598); Oscillospira guilliermondii OSC3 (AB040497); Sporobacter termitidis (T) DSMZ 10068 (Z49863); Papillibacter cinnamivorans (T) DSMZ 12816 (AF167711); Clostridium leptum (T) DSMZ 753T (AJ305238); Clostridium stercorarium (T) DSMZ 2919 (AF266461); Clostridium butyricum (T) DSMZ 10702 (AJ58420). Scale bar: 0.020]

Figure 4.10 Reconstructed phylogenetic alignment of 16S rRNA genes for strain AHG0008 with members of Clostridium cluster IV. The phylogenetic alignment shows that AHG0008 (blue, bold-face) clusters closely with the Pseudoflavonifractor sp. isolates DSMZ 23940T and ASF500 (black, bold-face). The 16S rRNA gene sequences for the three MAGs are also included. Phylogenetic relatedness was inferred using the Maximum Likelihood method based on Kimura-2-parameter modelling with Clostridium butyricum used as an outgroup. Values at branches denote bootstrap analysis following 1000 replications.


[Figure 4.11 tree taxon labels: AHG0008; uncultured Flavonifractor sp. isolate 2789STDY5834895 (FMFR00000000); uncultured Clostridium sp. isolate 2789STDY5608878 (FMET00000000); uncultured Flavonifractor sp. isolate 2789STDY5834937 (FMGL00000000); Firmicutes bacterium ASF500 (AYJP000000001); Pseudoflavonifractor capillosus DSMZ 23940 (AAXG02000057); Clostridia bacterium UC5 1 2H11 NZ (BCAC01000001); AHG0014; Lachnospiraceae bacterium 7 1 58FAA (JH590967); Clostridiales bacterium VE202 03 NZ (BAHQ0200000); Flavonifractor plautii strain YL31 (CP015406); Clostridium sp ATCC BAA 442 (AWSS00000000); Flavonifractor plautii DSMZ 6470 (KN174195); Flavonifractor plautii DSMZ 4000 (JH417693); Flavonifractor plautii strain An248 (NFJM00000000); Papillibacter cinnamivorans DSM 12816 (FWXW00000000); Oscillibacter ruminantium GH1 NZ (BAGW01000040); Oscillibacter valericigenes Sjm18 20 (AP012045); Clostridium butyricum DSM 10702 NZ (AQQF01000207); Sporobacter termitidis DSM 10068 (FQXV00000000). Scale bar: 0.050]

Figure 4.11 Whole genome phylogeny of Flavonifractor spp., Pseudoflavonifractor spp. and other Clostridium cluster IV isolates. Whole genome phylogeny was generated using the online genome comparison tool EDGAR. The core genes of the genomes were computed followed by alignments of the core genes using MUSCLE with non-matching parts of the alignments masked by GBLOCKS and removed. The remaining matching parts were concatenated into one alignment which was used as an input for PHYLIP. The resulting genome phylogenies were downloaded as Newick files from EDGAR and used in MEGA 7 to produce the depicted phylogenetic tree. The tree is for 20 genomes built out of a core of 36 genes per genome (756, total). The core has 12996 AA-residues/bp per genome (272916, total). Strains AHG0008 (blue) and AHG0014 (red) were isolated as part of this thesis. Bold-faced isolates represent the type strains for each species. Parentheses depict the accession numbers of the genomes used in generating the phylogenetic tree.


Figure 4.12 Average Nucleotide Identity (ANI) scores of Clostridium cluster IV isolate genomes including the draft genomes for Flavonifractor sp. strain AHG0014 and Pseudoflavonifractor sp. strain AHG0008. Strain AHG0008 and the three MAGs form a distinct cluster of isolates within the Clostridium cluster IV isolates included in the ANI heatmap. The unique cluster shares the highest ANI scores in comparison with the Flavonifractor spp. isolates and the Pseudoflavonifractor spp. isolates which indicates that its lineage is most closely phylogenetically related to this group of cluster IV isolates, yet is distant enough to be classified as belonging to a unique species.


Figure 4.13 The core and pan-genome development of AHG0008 and related metagenome assembled genomes. The core (blue, $y = 3162.74\,e^{-1.41x} + 1937.09$) and pan-genome (yellow, $y = 1125.52\,x^{0.61} + 1548.93$) development plot of strain AHG0008 and its closely related genomes, clustered using the Biolinux command line PGAP (Zhao et al., 2012) and then plotted in the software package PanGP (Zhao et al., 2014).
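To make the two fitted curves in the Figure 4.13 caption easier to read, the following minimal Python sketch evaluates them for one to four genomes. The coefficients are taken from the caption; the function names are hypothetical, and it is assumed here that x is the number of genomes included and y is a gene (cluster) count.

```python
import math


def core_genome_size(n_genomes: float) -> float:
    """Exponential-decay fit for the core genome (coefficients from the caption)."""
    return 3162.74 * math.exp(-1.41 * n_genomes) + 1937.09


def pan_genome_size(n_genomes: float) -> float:
    """Power-law fit for the pan-genome (coefficients from the caption)."""
    return 1125.52 * n_genomes ** 0.61 + 1548.93


# Evaluate both fits for the four genomes analysed (AHG0008 plus three MAGs).
for n in range(1, 5):
    print(n, round(core_genome_size(n)), round(pan_genome_size(n)))
```

Under these assumptions the core curve decays towards its constant term (1937.09) while the pan-genome curve keeps rising with each added genome, which is consistent with the upward trend of the accessory genome described in the text.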


4.3.4 Predicted gene function and metabolism of Pseudoflavonifractor sp. AHG0008 and the three MAGs

Figure 4.14 shows the results of eggNOG-mapper analysis illustrated as a COG usage bar chart that demonstrates the functional differences between AHG0008, the three MAGs and their nearest phylogenetic neighbours (DSMZ 23940T and ASF500). Notably, 33-39% of genes for each genome remained unclassified (data not shown), indicative of the amount of microbial “dark matter” present in this lineage(s). For those genes that could be assigned to COG categories, the genomes from strain AHG0008 and the three MAGs appear to have a much higher percentage of genes predicted to be involved in carbohydrate and amino acid transport and metabolism, when compared to both P. capillosus DSMZ 23940T and Pseudoflavonifractor sp. ASF500. In contrast, these latter two strains appear to have a greater relative proportion of genes assigned to the replication, recombination and repair COG category. Interestingly, only the Pseudoflavonifractor sp. ASF500 genome is predicted to encode motility functions, reflected in the identification of genes encoding flagellar biosynthesis proteins (flhAB), the basal-body rod protein (flgG), the export protein (fliJ) and fliMG, whose products form the rotor-mounted switch complex.


Figure 4.14 A COG usage bar chart showing the variations between strain AHG0008, the three MAGs and their closest neighbours. There are ~7-9% of the predicted genes that appear to be involved in amino acid transport and metabolism for AHG0008 and the three MAGs in comparison to the percentage of genes involved in transport or metabolism of carbohydrates (~5-6%), lipids (~2%), coenzymes (~2%) and nucleotides (~3%).


Table 4.3 shows the overall metrics from the PHX analyses of the AHG0008 genome and the three MAGs. Overall, the predicted PHX genes represent ~14-16% of the genome content, and the PA gene content is estimated at ~9-12%. The top 50 annotated PHX genes from the AHG0008 genome are shown in Figure 4.15, and I found that 11/50 genes are predicted to encode functions involved with amino acid transport and biosynthesis. Using the Integrated Microbial Genomes and Microbiomes Expert Review (IMG/ER) system, I next identified which amino acids the AHG0008 genome was predicted to be capable of synthesising. Table 4.4 shows the auxotrophic and prototrophic profile for AHG0008, from which I was able to determine that AHG0008 appears auxotrophic for 16 amino acids. PHX analysis also revealed that 11/50 genes were predicted to encode functions involved in carbohydrate metabolism (pyruvate phosphate dikinase, transketolase (TKL) subunits and phosphoenolpyruvate dihydroxyacetone phosphotransferases). Such findings suggest that strain AHG0008 is capable of ribose utilisation via the pentose phosphate pathway (PPP). A deeper look at genes involved in this pathway using the PATRIC webserver reveals that the genome for AHG0008 contains a full complement of genes involved in the non-oxidative phase of the PPP (Figure 4.16). The pathways identified showed that AHG0008 encodes genes involved in the complete conversion of ribose to glyceraldehyde-3-phosphate (GAP) via the non-oxidative phase of the PPP. GAP is then converted to pyruvate via the formation of phosphoenolpyruvate (PEP) in the glycolysis pathway. In that context, the MAGs were predicted to have similar profiles for their top 50 PHX genes (data not shown).
Taken together, the predicted gene functions observed using both COG and PHX analysis appear to point towards strain AHG0008 and the MAGs utilising the pentose phosphate pathway (PPP) and C6 sugars such as fructose as primary energy sources, with a propensity to also use amino acids, during maximal growth.

Finally, I used the CAZyme annotation server dbCAN2 to annotate the genomes of AHG0008 and the three MAGs (Table 4.5). Four CAZyme families were identified across all four genomes and the profiles appear highly conserved. More in-depth analysis shows that the Carbohydrate Binding Module (CBM) and Carbohydrate Esterase (CE) families are the least represented, with only two enzymes from each family being identified across all the genomes. The most abundantly represented enzyme from the CBM family appears to be CBM50, which is involved in binding N-acetylglucosamine residues of bacterial peptidoglycans or chitin (Ref). The CE family plays a role in the deacylation of polysaccharides, and across all genomes CE4 is the most numerically abundant enzyme. The final two families identified are the Glycoside Hydrolase (GH) and Glycosyltransferase (GT) families. These are much more abundant than the CBM and CE families, with 12 GH enzymes and 11 GT enzymes having been identified. Of the GH enzymes, GH6 appears to be the most numerically represented across all Pseudoflavonifractor sp. genomes. GH6 enzymes are involved in the breakdown of cellulose by hydrolysis of 1,4-β-glucosidic linkages of cellulose to release cellobiose (ref). Finally, of the GT genes present across all four genomes, the GT2 and GT4 families are the most abundant. These enzymes are involved in attaching glycans to proteins which are most commonly found in cell surface proteins (Refs: Jarrell et al., 2010, Magidovich and Eichler, 2009).

Table 4.3 PHX analysis metrics for AHG0008 and the three MAGs. Depicted are the total number of predicted highly expressed (PHX) and putative alien (PA) genes annotated via PHX analysis of the genomes for Pseudoflavonifractor sp. AHG0008 and the three MAGs.

Genome           Total No. of PHX genes  Total PHX genes (%)  Total No. of PA genes  Total PA genes (%)  Total No. genes analysed
AHG0008          370                     15.50                254                    10.67               2381
2789STDY5834895  371                     14.79                236                    9.41                2508
2789STDY5608878  365                     15.78                243                    10.51               2313
2789STDY5834937  432                     16.05                304                    11.29               2691
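As a quick consistency check on Table 4.3, the percentage columns can be recomputed from the gene counts. The sketch below assumes that each percentage is simply the PHX (or PA) gene count divided by the total number of genes analysed; the variable and function names are hypothetical. Under that assumption the recomputed values agree with the reported ones to within a few hundredths of a percentage point.

```python
# Counts from Table 4.3: (PHX genes, PA genes, total genes analysed).
phx_counts = {
    "AHG0008": (370, 254, 2381),
    "2789STDY5834895": (371, 236, 2508),
    "2789STDY5608878": (365, 243, 2313),
    "2789STDY5834937": (432, 304, 2691),
}

# Percentages as reported in Table 4.3: (PHX %, PA %).
reported_percent = {
    "AHG0008": (15.50, 10.67),
    "2789STDY5834895": (14.79, 9.41),
    "2789STDY5608878": (15.78, 10.51),
    "2789STDY5834937": (16.05, 11.29),
}


def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Assumed definition: percentage = gene count / total genes analysed.
recomputed_percent = {
    genome: (round(percent(phx, total), 2), round(percent(pa, total), 2))
    for genome, (phx, pa, total) in phx_counts.items()
}

for genome, (phx_pct, pa_pct) in recomputed_percent.items():
    print(genome, phx_pct, pa_pct, "reported:", reported_percent[genome])
```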


Figure 4.15 A comprehensive representation of the top 50 PHX genes through analysis of the AHG0008 genome. Genes were ranked according to their overall bias relative to the highly expressed gene profiles of the ribosomal proteins, translation and transcription processing factors and the chaperone degradation genes. The most highly expressed gene is ranked as no 1 in this schematic. Genes are colour coded according to the COG categories predicted for each gene.


Table 4.4 IMG predicted metabolism from pathway assertion. The amino acid auxotrophy/prototrophy profile for the draft genome of AHG0008

Amino acid       Auxotroph  Prototroph
L-lysine         Y          N
L-alanine        Y          N
L-aspartate      Y          N
L-glutamate      N          Y
L-phenylalanine  Y          N
L-tyrosine       Y          N
L-tryptophan     Y          N
L-histidine      Y          N
Glycine          N          Y
L-arginine       Y          N
L-asparagine     N          Y
L-cysteine       Y          N
L-glutamine      Y          N
L-isoleucine     Y          N
L-leucine        Y          N
L-proline        Y          N
L-serine         Y          N
L-threonine      Y          N
L-valine         Y          N
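The count of 16 auxotrophies quoted in the text can be recovered directly from Table 4.4. The short Python sketch below transcribes the 19 amino acids listed in the table and tallies them; the dictionary and variable names are hypothetical, and the status values are those given in the table.

```python
# Auxotrophy/prototrophy calls transcribed from Table 4.4
# ("Y" in the Auxotroph column -> "auxotroph").
amino_acid_status = {
    "L-lysine": "auxotroph", "L-alanine": "auxotroph", "L-aspartate": "auxotroph",
    "L-glutamate": "prototroph", "L-phenylalanine": "auxotroph",
    "L-tyrosine": "auxotroph", "L-tryptophan": "auxotroph",
    "L-histidine": "auxotroph", "Glycine": "prototroph",
    "L-arginine": "auxotroph", "L-asparagine": "prototroph",
    "L-cysteine": "auxotroph", "L-glutamine": "auxotroph",
    "L-isoleucine": "auxotroph", "L-leucine": "auxotroph",
    "L-proline": "auxotroph", "L-serine": "auxotroph",
    "L-threonine": "auxotroph", "L-valine": "auxotroph",
}

auxotrophs = [aa for aa, status in amino_acid_status.items() if status == "auxotroph"]
prototrophs = [aa for aa, status in amino_acid_status.items() if status == "prototroph"]

print(len(auxotrophs), "predicted auxotrophies;", "predicted prototrophies:", sorted(prototrophs))
```

On this tally, glycine, L-asparagine and L-glutamate are the only amino acids the genome is predicted to synthesise.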


Figure 4.16 Predicted pathway of ribose utilisation by AHG0008. Pathways were identified using the PATRIC web server and show a full complement of genes encoding enzymes involved in the utilisation of ribose via the Pentose Phosphate Pathway (PPP) for the final formation of pyruvate as part of glycolysis. Firstly, as part of the non-oxidative phase of the PPP, ribose is phosphorylated by a ribokinase to form ribose-5-phosphate. This is then isomerised into ribulose-5-phosphate, which is further converted into xylulose-5-phosphate. The transketolase predicted in the PHX top 50 then converts the xylulose-5-phosphate to GAP. GAP is an intermediate of the glycolysis pathway, for which AHG0008 also contains the enzymes for the final conversion of GAP into glycerate-2-phosphate via a GAP dehydrogenase. Glycerate-2-phosphate is then converted into PEP via a phosphopyruvate hydratase. PEP is finally converted into pyruvate via a pyruvate kinase enzyme.


Table 4.5 Gene counts for CAZyme families obtained through annotation of the AHG0008 genome and the three MAGs by the dbCAN meta server. Overall, the profile of the CAZyme families and total gene counts represented across each genome appear to be highly conserved, with only small differences across each genome from each CAZyme family.

CAZyme Family                AHG0008  2789STDY5834937  2789STDY5834895  2789STDY5608878
Carbohydrate binding module
  CBM48                      1        1                1                1
  CBM50                      3        3                3                2
Carbohydrate Esterase
  CE4                        3        3                3                3
  CE14                       -        -                1                -
Glycoside Hydrolase
  GH3                        1        1                1                1
  GH6                        3        2                3                3
  GH13_11                    1        1                1                1
  GH13_20                    2        2                2                2
  GH13_30                    1        1                1                1
  GH18                       1        1                1                1
  GH23                       1        1                4                1
  GH25                       2        2                2                2
  GH28                       1        1                1                1
  GH33                       -        -                1                -
  GH73                       -        1                -                -
  GH77                       1        1                1                1
Glycosyltransferase
  GT0                        1        1                1                1
  GT2                        5        4                5                4
  GT4                        3        3                3                3
  GT5                        1        1                1                1
  GT8                        1        1                1                1
  GT13                       1        1                1                1
  GT26                       1        1                1                1
  GT28                       1        1                1                1
  GT35                       2        2                2                1
  GT51                       1        1                1                1
  GT84                       1        1                1                1
Surface layer homology
  SLH                        -        -                1                -
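The family-level summaries quoted in the text (how many families fall in each CAZyme class, and which family is most numerous in each class) can be reproduced from Table 4.5 with a short tally. The sketch below is illustrative only: the names are hypothetical, and dashes in the table are entered as zero, which is an assumption about what the dash denotes.

```python
import re
from collections import defaultdict

# Column order used for every list of counts below.
GENOMES = ["AHG0008", "2789STDY5834937", "2789STDY5834895", "2789STDY5608878"]

# Gene counts per family from Table 4.5; "-" is entered as 0 (assumption).
CAZYME_COUNTS = {
    "CBM48": [1, 1, 1, 1], "CBM50": [3, 3, 3, 2],
    "CE4": [3, 3, 3, 3], "CE14": [0, 0, 1, 0],
    "GH3": [1, 1, 1, 1], "GH6": [3, 2, 3, 3], "GH13_11": [1, 1, 1, 1],
    "GH13_20": [2, 2, 2, 2], "GH13_30": [1, 1, 1, 1], "GH18": [1, 1, 1, 1],
    "GH23": [1, 1, 4, 1], "GH25": [2, 2, 2, 2], "GH28": [1, 1, 1, 1],
    "GH33": [0, 0, 1, 0], "GH73": [0, 1, 0, 0], "GH77": [1, 1, 1, 1],
    "GT0": [1, 1, 1, 1], "GT2": [5, 4, 5, 4], "GT4": [3, 3, 3, 3],
    "GT5": [1, 1, 1, 1], "GT8": [1, 1, 1, 1], "GT13": [1, 1, 1, 1],
    "GT26": [1, 1, 1, 1], "GT28": [1, 1, 1, 1], "GT35": [2, 2, 2, 1],
    "GT51": [1, 1, 1, 1], "GT84": [1, 1, 1, 1],
    "SLH": [0, 0, 1, 0],
}


def family_class(family: str) -> str:
    """Return the leading letters of a family name, e.g. 'GH13_11' -> 'GH'."""
    return re.match(r"[A-Z]+", family).group(0)


def total_count(family: str) -> int:
    """Sum a family's gene counts over all four genomes."""
    return sum(CAZYME_COUNTS[family])


families_per_class = defaultdict(list)
for family in CAZYME_COUNTS:
    families_per_class[family_class(family)].append(family)

class_sizes = {cls: len(fams) for cls, fams in families_per_class.items()}
most_abundant = {cls: max(fams, key=total_count) for cls, fams in families_per_class.items()}

print(class_sizes)
print(most_abundant)
```

This tally gives 2 CBM, 2 CE, 12 GH and 11 GT families, with CBM50, CE4, GH6 and GT2 as the most numerous family in each class, in line with the description in the text.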


4.4 Discussion

As mentioned previously, the Pseudoflavonifractor genus is extremely underrepresented in both culture collections and genomic data sets worldwide, with only two described species: P. (Eubacterium) capillosus, which was reassigned by Carlier et al. (2010), and P. phocaeensis, which was recently described by Ricaboni et al. (2017). Only the P. capillosus genome has been sequenced. During the final preparations of my thesis, I identified another 7 isolates that have been assigned to the genus Pseudoflavonifractor. Of these, 5/7 strains were isolated from chicken caecal contents (Medvecky et al., 2018, Bioproject PRJNA377666). Further analysis of the Pseudoflavonifractor genus warrants investigation of these 7 genomes.

My results have demonstrated that quercetin can inhibit both the growth rate and yield of P. capillosus DSMZ 23940T, whilst its effect on the growth of strain AHG0008 appears related only to maximum yield. Notably, although neither strain shows any capacity to metabolise quercetin, I also found that the growth of P. capillosus DSMZ 23940T with the basal BHI medium was substantially better than that found for AHG0008, which suggests clear differences between the two strains in terms of their nutrient requirements. Indeed, the initial genome-based comparisons between these two strains showed the extent of dissimilarity between them. My initial assignment of strain AHG0008 was based upon 16S rRNA gene sequencing and alignment, and based on the percent identity scores described by Stackebrandt and Goebel (1994), strain AHG0008 would belong to the same genus as P. capillosus DSMZ 23940T (95% sequence identity) but would be assigned to the same species as Pseudoflavonifractor sp. ASF500 (97% sequence identity). However, as shown in Figure 4.6 and Figure 4.7, the AHG0008 draft genome shared very little gene-based homology or synteny with the P. capillosus DSMZ 23940T or ASF500 genomes. Based on these findings I sought to identify genomes more closely related to that from strain AHG0008 and retrieved three MAGs from the datasets reported by Browne et al. (2016). A combination of whole genome-based phylogeny, synteny, and nucleotide identity scores confirmed that the genome from strain AHG0008 and the MAGs formed a unique cluster within the Pseudoflavonifractor lineage (Figure 4.11 and Figure 4.12). I believe that my efforts described here have further emphasized the preference for using genome sequence data for the phylogenetic assignments of newly recovered strains, which is consistent with the results of several other recent studies (Janda and Abbott, 2007, Woo et al., 2009, Rossi-Tamisier et al., 2015, Edgar, 2018).
Whilst whole genome sequencing may not be an option for all researchers, my recommendation for the future would be to only assign a species-level classification to isolates with >99% identity scores when using 16S rRNA gene sequence data alone.
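The effect of the cut-off choice discussed above can be sketched in a few lines of Python. The function name and its return labels are hypothetical; the default cut-offs (97% for species, 95% for genus) are the conventional values referred to in the text, and the stricter 99% value is the recommendation made here. The sketch is a simplification and not a taxonomic tool.

```python
def rank_from_16s_identity(
    identity_percent: float,
    species_cutoff: float = 97.0,
    genus_cutoff: float = 95.0,
) -> str:
    """Return the lowest shared rank implied by a 16S rRNA gene identity score."""
    if identity_percent >= species_cutoff:
        return "same species"
    if identity_percent >= genus_cutoff:
        return "same genus"
    return "different genus"


# Identity scores reported in the text for strain AHG0008.
vs_asf500 = rank_from_16s_identity(97.0)          # conventional cut-offs
vs_dsmz_23940 = rank_from_16s_identity(95.0)      # conventional cut-offs
vs_asf500_strict = rank_from_16s_identity(97.0, species_cutoff=99.0)

print(vs_asf500, "|", vs_dsmz_23940, "|", vs_asf500_strict)
```

With the conventional cut-offs the 97% match to ASF500 is read as a species-level match, whereas under the stricter 99% cut-off the same score supports only a genus-level assignment, which is the outcome the genome-based analyses favour.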

Browne et al. (2016) originally described the MAGs I have used here as being affiliated with either Flavonifractor or Clostridium spp. The ANI scores I have generated from the genomes examined in this Chapter show that AHG0008 and the MAGs do indeed belong to a distinct species, compared to the other isolates assigned to the genus Pseudoflavonifractor, as well as the Flavonifractor spp. isolates. Interestingly, our lab group has also been analysing the metagenomic data produced from the case-control study of the GIT microbiota and Type 2 diabetes in Chinese subjects by Qin et al. (2012), and I was able to identify a MAG produced from these data with a BLASTn similarity score of 99% to the AHG0008 genome. Although I chose not to include this MAG in the more detailed analyses reported in this Chapter because of the low level of estimated genome coverage (~82%) and time constraints, my Mauve alignments did show an extensive degree of synteny between AHG0008 and this MAG (Figure 4.17).

Having shown that neither strain AHG0008 nor P. capillosus DSMZ 23940T could metabolise quercetin in vitro, I performed a BLAST search of the genes present in the chi operon identified in Chapter 3 using the RAST database. I was unable to find these genes or homologous sequences within the AHG0008 or DSMZ 23940T genomes, thus providing some supporting evidence that these strains are unable to metabolise quercetin in the way that the F. plautii strains can.


[Figure 4.17 track labels: AHG0008; QIN326]

Figure 4.17 Mauve alignment of AHG0008 with the newly recovered MAG: QIN326. The MAG recovered from Qin et al. (2012) shows a high degree of synteny (depicted as coloured blocks) with the genome for AHG0008. There are some xenologous regions depicted by the blank spaces. Many of the xenologous regions appear to occur in the alignment for AHG0008. There is a large xenologous region appearing at the end of the genome for QIN326.


Due to the poor growth of strain AHG0008 in the basal BHI medium, I next wanted to utilise the draft genome for AHG0008 and the MAGs to predict some key genes and metabolic functions. The identification of predicted metabolic functions could prove useful in further investigations of this novel species. Improving the growth conditions for strain AHG0008 could allow for better analysis of the potential bioactive molecule(s) responsible for the anti-inflammatory effect observed during my initial immunomodulatory characterisations of AHG0008 in Chapter 2. In that regard, I have shown that, compared to the strains DSMZ 23940T and ASF500, strain AHG0008 and the MAGs have a much higher proportion of genes predicted to be involved in amino acid transport and biosynthesis. Further, my analysis of the PHX genes for strain AHG0008 shows that 11 of the top 50 most highly expressed genes are predicted to be involved in amino acid transport and biosynthesis. Further analysis using IMG/ER revealed that AHG0008 is auxotrophic for 16 amino acids, suggesting that its growth may be limited in environments with low levels of these 16 amino acids. Amino acids play a vital role in the host; not only do they form the foundations of peptides and proteins, but they also help drive production of bioactive molecules that in turn contribute to signalling and metabolism within the host (Dai et al., 2015). Evenepoel et al. (1999) showed that, in humans, insufficient quantities of amino acids are assimilated in the small intestine, indicating a role for members of the GIT microbiota. Dai et al. (2011) went on to report significant numbers of amino acid fermenting bacteria resident within the human colon, of which Clostridium spp. appear to be key drivers of this activity.
Strain AHG0008 and the MAGs are representatives of Clostridium cluster IV and so it is perhaps not too surprising that their genomes encode genes involved in the biosynthesis and transport of amino acids. Due to time constraints I was unable to investigate this potential metabolic function of strain AHG0008; future studies should consider supplementation of the basal medium with animal or plant proteins to ascertain whether this improves the growth kinetics of this species.

Along with amino acid biosynthesis and transport, the genomes for strain AHG0008 and the MAGs were also predicted to have a higher proportion of genes involved in carbohydrate metabolism and transport than the genomes of DSMZ 23940T and ASF500. The numbers of predicted carbohydrate and amino acid transport and metabolism genes in my top 50 PHX analysis were the same (11/50, Figure 4.15). The predicted carbohydrate metabolism and transport genes appear to be involved in the Pentose-Phosphate Pathway (PPP), a pathway which is a key component of cellular metabolism. This pathway runs in parallel to glycolysis and is known to be important for carbon homeostasis and for the provision of precursors for amino acid biosynthesis (Stincone et al., 2015). The main genes from AHG0008 predicted to be involved in the PPP appear to be a transketolase (TKL) and an aldolase. TKLs use ketose donors such as xylulose 5-phosphate to form products such as glyceraldehyde-3-phosphate (GAP). GAP is an intermediate of glycolysis and is converted by a GAP dehydrogenase, which also appears in the top 50 PHX analysis of AHG0008. A predicted enolase also appears in the top 50 (Rank 40, Figure 4.15); this enzyme is involved in the formation of phosphoenolpyruvate (PEP) as the penultimate step in glycolysis. The PEP potentially produced by strain AHG0008 could be used in two ways: first, it could be further converted to pyruvate, thus completing glycolysis, or second, it could be used in the bacterial phosphotransferase system (PTS). The PTS is a sugar phosphorylating system discovered in E. coli by Kundig et al. (1964). The distinct feature of this system is the use of PEP as the phosphoryl donor during sugar phosphorylation (Saier, 2015). In that context, my PHX analysis predicted the presence of two genes which are predicted to be PEP-dihydroxyacetone-phosphotransferases, which potentially indicates that strain AHG0008 and the MAGs may be able to utilise sugars as part of their core metabolism. My further analysis using the PATRIC webserver did reveal that the AHG0008 genome can be predicted to fully convert ribose to pyruvate via the non-oxidative phase of the PPP and the final steps of the glycolysis pathway. Further work is warranted to investigate whether ribose and any other sugars could be utilised by AHG0008, which, combined with the addition of protein sources, may help to provide improved growth conditions in the lab. Improved growth conditions will enhance further investigations into this novel species and its relevance to host health and homeostasis within the GIT.

Finally, I also performed some computational analyses of the genome with a view to identifying genes encoding functions that may be linked and/or contribute to the immunomodulatory activities found for this strain, and described in Chapter 2. I utilised the antiSMASH online software to assess the genome of AHG0008 and identify biosynthetic gene clusters which could potentially be linked to these immunomodulatory activities. Only 1 biosynthetic cluster was identified in the AHG0008 genome and this cluster did not appear to be involved with any immunomodulatory functions. There has been some continuing work performed by Miss Rabina Giri, an Honours Student, in which she was able to further confirm the immunosuppressive activities of the spent supernatant of AHG0008 in human colonic cell lines, primary cell and organoid cultures (Honours Thesis Student number 43188213). Combined with improvements to the growth conditions for AHG0008 and further analysis using different in vitro and in vivo techniques, it may be possible to identify the bioactive products produced.

In summation, the findings in this Chapter show that the bacterial lineage I have constructed from strain AHG0008 and these four MAGs is representative of a previously undescribed but ubiquitous member of Clostridium Cluster IV, which has been identified as a medium priority “most-wanted” bacterial lineage (Fodor et al., 2012, Ó Cuív et al., 2015). I have also predicted potential key genes and metabolic functions for this novel species which can now be investigated to improve its growth conditions. Together, these can provide new insights into the roles this novel lineage may play with regards to GIT homeostasis.


Chapter 5 General Discussion

To date, there are still many human GIT bacteria that remain elusive to culture or are deemed “unculturable”. This impedes our development in understanding the roles which individual species play in human health and disease. The metagenomics-based era has often proved advantageous in furthering our knowledge and the associations between disease states and the dysbiotic nature of the GIT microbiota. However, to fully understand and dissect the distinct roles specific members of the microbiota may play in health and disease, cultured representatives are still required. Innovative approaches are required if culture-based bacterial analyses are to augment the metagenomics era. I believe that when both approaches are used in an integrative way, then we will deepen our understanding of how individual species function and adapt to different stressors and environments, and we will provide new ways to promote health and treat disease.

To address this issue, my PhD research not only contributed to the development and use of metaparental mating methods, but also resulted in the isolation of ‘new’ bacterial strains from the healthy human microbiota (Chapter 2). Prior to my PhD, I had not worked with anaerobic bacteria, and joining the Morrison lab allowed me to develop new skills in working with some of these more difficult to culture bacterial species. To that end, I was able to successfully generate a collection of genetically tractable representatives of Firmicutes-affiliated bacteria using metaparental mating, from the faecal sample of a healthy pre-adolescent Australian child. By using a modified version of the plasmid vector containing the evoglow-C-Bs2 reporter gene, I was able to more effectively screen and identify the transconjugant strains via fluorescence microscopy, despite the relatively high background of erythromycin resistance. The plasmid vectors I used were designed to contain an origin of replication that should function within and selectively target representative species from the Firmicutes phylum, whose members are often reduced in patients with inflammatory bowel disease (IBD) (Frank et al., 2007). My results in Chapter 2 include the taxonomic assignment and phylogenetic assessment of my isolates using 16S rRNA gene sequencing; and the culture collection I generated includes strains affiliated with Enterococcus, Clostridium cluster IV, XIVa and XVIII, with many of these isolates appearing divergent from their nearest cultured relatives, as new “branches” in these phylogenetic trees. As such, I consider that these “new” genetically competent GIT bacteria provide us with opportunities to use techniques in bacterial genetics to assess their functional and/or ecological relevance in relation to human health and disease, as has been described in detail by Ó Cuív et al. (2016).
For instance, forward genetics techniques including transposon mutagenesis could be used to identify genes underpinning immunomodulatory capacities. Additionally, we should be able to use these isolates to assess their immunomodulatory capacity in high-throughput screening assays, and by reverse genetics identify the gene(s) encoding these key functions.

With these different approaches in mind, I chose to evaluate the immunomodulatory potential of my new isolates, and Chapter 2 also includes my efforts in this regard. Bacteria affiliated with Clostridium clusters IV and XIVa are routinely reported as reduced in patients with IBD (Frank et al., 2007, Sokol et al., 2008, DeGruttola et al., 2016) and some isolates have also been shown to produce peptides or proteins that have “anti-inflammatory” effects (Atarashi et al., 2011, Narushima et al., 2014, Atarashi et al., 2015, Miquel et al., 2015, Quevrain et al., 2016). In particular, studies of F. prausnitzii have shown that anti-inflammatory factors are released into cell culture fluids by this bacterium (Sokol et al., 2008, Sokol et al., 2009), and reverse genetics approaches have identified the peptide-based bioactives and the gene encoding them (Miquel et al., 2015, Quevrain et al., 2016). Prior to starting my PhD, I had very limited knowledge of, and technical experience with, immunology and related methods, so I found the opportunity of using macrophage cell cultures to evaluate my isolates for their production of bioactives affecting NF-κB directed pathways of inflammation to be both exciting and challenging. I first learned how to culture and maintain the RAW 264.7 mouse macrophage cells, then designed the experiments to assess the immunomodulatory effects of products released into the culture fluids by the bacterial isolates. In summary, my efforts described in Chapter 2 suggest that the spent culture fluids from 7 of my 22 isolates appear capable of suppressing lipopolysaccharide-induced NF-κB activity in RAW 264.7 macrophages to the same extent as, or to a greater extent than, similar preparations from F. prausnitzii A2-165. However, despite the promise associated with these initial findings, I found that the use of these macrophage cells was technically very difficult.
I invested much time and effort into trying to standardise my culturing and passaging of the macrophage cells, but found their responses to different batches of fetal bovine serum and lipopolysaccharide, used to culture and stimulate the cells respectively, to be quite variable. Ultimately, I decided that, given the priorities of my PhD program lay more in bacterial genomics and physiology, I would focus my efforts on the research presented in Chapter 3 and Chapter 4. Since my initial screening, five of the isolates have been shown by Dr Ó Cuív and our collaborators to elicit immunosuppressive effects via the NF-κB pathway using human GIT epithelial cell cultures (Caco2 and LS174T cells). In conclusion, my results in Chapter 2 show that metaparental mating can provide a rapid and effective way of recovering new GIT bacteria, both in terms of their phylogenetic relatedness (or not) to other cultured strains, and in terms of their capacity to affect host cellular responses.

As outlined above, Chapter 3 and Chapter 4 of my thesis present my efforts to produce new insights into the prevalence and functional capacity of isolates representing the Flavonifractor and Pseudoflavonifractor genera. These Chapters emphasize the use of culture-based studies with various approaches and workflows to analyse and interpret genome sequence data, an area which was initially daunting for me. I had no experience with these types of computational approaches as my earlier technical background was based entirely on “wet-lab” techniques. I believe the findings and knowledge I have presented in both Chapters show that I was successful in developing my skills to use this integrated approach to microbial physiology, and to generate a greater understanding of the functional attributes and prevalence of these specific bacteria in the human GIT.

The reclassification of bacteria assigned to Flavonifractor and Pseudoflavonifractor was justified to reconcile their phylogenetic divergence from other members of the Eubacterium and Clostridium spp. to which they were initially assigned, as well as their differentiation with respect to the metabolism of diet-derived polyphenols. Polyphenols have long been considered beneficial to human health. In particular, many intervention studies have shown that polyphenol consumption not only leads to reductions in inflammation/disease-associated markers, but can also have a measurable impact on the resident GIT microbiota (Cardona et al., 2013, Anhe et al., 2015, Roopchand et al., 2015, Cueva et al., 2017). Nevertheless, and despite these perceptions, relatively little is known about the intricacies of the polyphenol-microbiota interactions within the GIT. While a few species of human GIT bacteria have been isolated and described as capable of polyphenol metabolism (reviewed by Braune and Blaut, 2016), few studies have specifically investigated the effect of polyphenols on these microbial species. This knowledge gap makes it unclear whether polyphenol metabolism by Flavonifractor spp. supports their growth directly (e.g. as an energy source) and/or indirectly (e.g. by removing the inhibitory effect of polyphenols on growth). My culture-based studies in Chapter 3 of F. plautii DSMZ4000T and strain AHG0014 show that both have very similar (and relatively weak) growth kinetics when using nutrient-rich media typically used for the culture of human GIT bacteria. Quercetin metabolism by both strains was rapid and, at the concentration used to simulate what these strains might encounter in the human GIT (50 µM), quercetin appeared to exert only subtle effects on growth kinetics, in terms of rate or final yield (maximal OD600). However, my growth studies did suggest that active growth of both strains did not proceed until the added quercetin per se had been reduced to concentrations less than 5 µM.
Previously, the pathway of quercetin metabolism by the original C. orbiscindens strain was elucidated. Schoefer et al. (2003) showed that this species degraded quercetin using a three-step pathway. Firstly, quercetin is transformed into taxifolin via a reduction of a double bond in the 2,3-position of quercetin (Figure 3.1). The chi enzyme then contracts the C-ring of taxifolin to form the second intermediate, alphitonin. Alphitonin has been postulated to undergo oxidative decarboxylation for the final conversion into 3,4-dihydroxyphenylacetic acid (3,4-DPH) and phloroglucinol. Whilst 3,4-DPH has been identified as the final end product of quercetin degradation, phloroglucinol has been shown to be further degraded into the SCFAs butyrate and acetate (Schoefer et al., 2003). My observations suggest the metabolism of quercetin by F. plautii and related strains serves primarily to prevent growth inhibitory effects rather than to provide an energy source. Indeed, the results of my subsequent experiments, where I added 50 µM quercetin to actively growing cultures of strains DSMZ4000T and AHG0014, support the contention that this concentration of quercetin inhibits growth (i.e. is bacteriostatic) rather than reducing viability (i.e. is bactericidal). I feel these findings can be further validated by using higher concentrations of quercetin in these growth studies, as higher concentrations may better show the bacteriostatic effect of quercetin on Flavonifractor spp.
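The growth parameters referred to above (growth rate and maximal OD600) can be estimated from optical density readings in several ways. The following is a minimal Python sketch only, and not the analysis used in Chapter 3: it assumes blank-corrected OD600 readings taken at distinct time points, and takes the maximum specific growth rate as the steepest least-squares slope of ln(OD600) over a sliding window. The function names, the window size and the example readings are all hypothetical.

```python
import math


def max_specific_growth_rate(times_h, od600, window=3):
    """Return the steepest least-squares slope of ln(OD600) versus time
    (per hour) found over any run of `window` consecutive readings."""
    if len(times_h) != len(od600):
        raise ValueError("times_h and od600 must have the same length")
    if window < 2 or len(times_h) < window:
        raise ValueError("need at least `window` (>= 2) readings")
    if any(v <= 0 for v in od600):
        raise ValueError("OD600 readings must be positive (blank-corrected)")

    log_od = [math.log(v) for v in od600]
    best = float("-inf")
    for i in range(len(times_h) - window + 1):
        t = times_h[i:i + window]
        y = log_od[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope for this window of readings.
        sxx = sum((a - t_mean) ** 2 for a in t)
        sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        best = max(best, sxy / sxx)
    return best


def summarise_growth(times_h, od600, window=3):
    """Collect the growth rate, doubling time and final yield in one dict."""
    mu = max_specific_growth_rate(times_h, od600, window)
    return {
        "mu_max_per_h": mu,
        "doubling_time_h": math.log(2) / mu if mu > 0 else float("inf"),
        "max_od600": max(od600),
    }


if __name__ == "__main__":
    # Hypothetical readings for illustration only; not data from this thesis.
    times = [0, 2, 4, 6, 8, 10, 12]
    control = [0.02, 0.03, 0.06, 0.12, 0.20, 0.26, 0.28]
    print(summarise_growth(times, control))
```

Comparing such summaries between cultures with and without added quercetin would be one way to express the "subtle effects on growth kinetics" numerically; a longer window trades sensitivity for robustness to noisy readings.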

For the remainder of Chapter 3 my focus turned to a genome-based characterisation and comparison of F. plautii strains. To my knowledge, the results presented in this thesis are the first of their kind for this species. In addition to the genome data for my new isolate (AHG0014), I have retrieved the genome sequences of four “unassigned bacteria” currently available in the NCBI genome database, and I have shown why they should be assigned to the Flavonifractor genus. Using these 9 genomes, I was able to show that the chalcone isomerase (chi) gene implicated in polyphenol metabolism, and originally identified by Braune et al. (2016), is part of the Flavonifractor core genome. Additionally, the chi gene is part of a multi-gene locus that is also conserved across all 9 genomes, and which appears to encode a flavoprotein and membrane-bound proteins. The identification of a multi-gene locus provides clues as to the mechanisms of quercetin metabolism and, perhaps, the conversion of other flavonoids by these species. This is of interest to the wider community as the genes and enzymes involved in converting flavonoids are still being elucidated. When the pathway of quercetin degradation was first elucidated in 2001 it was still speculated that the quercetin-taxifolin conversion may occur spontaneously. I believe that my functional analyses in Chapter 3 refute this hypothesis, because throughout my growth studies the concentration of quercetin remained constant when no bacteria were present, thereby providing evidence that its breakdown under anaerobic conditions requires the presence of bacterial enzymes.

The chi gene appears to be part of an operon, preceded by a putative flavoprotein gene and followed by two more genes downstream of chi that are currently annotated as hypothetical but contain motifs implicated in association with the cell membrane. I believe that further investigations involving these genes will go some way in further elucidating how F. plautii senses and metabolises quercetin and, perhaps, other flavonoids. As the first step along this path, I cloned chi into an E. coli compatible plasmid expression vector, to assess the function of this gene product. This was a new set of skills for me, and I was successful in constructing a plasmid containing the chi gene and in providing initial evidence that the recombinant Chi protein is produced by E. coli following autoinduction. Another consideration is that, given F. plautii is a Gram-positive bacterium, there may be problems with respect to codon usage between the parent and the E. coli cloning host. To address this, I have also used the E. coli Rosetta™ strain, which contains a plasmid supplying tRNAs for codons rarely used in E. coli.

In the future, another way to address some of the issues raised above would be to use a genetically tractable bacterium that is more closely related to strain AHG0014, or strain AHG0014 itself. The former could be achieved by utilising, for instance, AHG0001 (C. bolteae) or AHG0008 (Pseudoflavonifractor sp.) as the host of the pEHR vector series employed in Chapter 2, which could be modified to contain chi and/or the other genes encoded in this operon. To that end, I have also produced a derivative of AHG0001 that is cured of the pEHR plasmid and could be used for these gene “knock-in” experiments. Alternatively, these same genes could be interrupted by disrupting the sequence to be cloned into the pEHR vectors, which could then be used to construct AHG0014 strains that are unable to degrade quercetin. Of specific interest would be an assessment of whether the effects of quercetin change from being bacteriostatic to bactericidal in these mutant strains of AHG0014.

There are scant reports in the published literature that specifically remark on the prevalence and/or abundance of Flavonifractor spp. Kasai et al. (2015) reported that Flavonifractor spp. abundance could be associated with non-obese individuals. Browne et al. (2016) reported its enrichment and recovery from spores following the treatment of faecal samples with ethanol. Fang et al. (2017) recently reported a negative correlation between Flavonifractor spp. abundance and the level of inflammatory chemokines (MIP-1α and MCP-1) in mouse models of liver injury, specifically following probiotic intervention using Bifidobacterium catenulatum. This is perhaps the first report that specifically links Flavonifractor spp. abundance with a host response, in this case an “anti-inflammatory” effect. However, given that the expression of these chemokines is regulated by NF-κB (Goebeler et al., 2001, Liu et al., 2017b) and that my results in Chapter 2 suggest that the spent culture fluids of strain AHG0014 have no inhibitory effects on NF-κB signalling, this proposed link may not be the direct result of increased Flavonifractor spp. abundance. Indeed, I think my findings from the MetaQuery searches provide evidence that Flavonifractor prevalence and abundance are variable in human subjects. However, there is a significantly increased prevalence and abundance of Flavonifractor spp. in Spanish patients with Crohn’s disease as compared to patients with ulcerative colitis and healthy control subjects. The evidence from my MetaQuery analysis suggests that Flavonifractor spp. abundance is more likely to be a marker of patients presenting with GIT inflammation, rather than a biomarker of health as reported in studies such as those of Fang et al. (2017) and Kasai et al. (2015).

I was also surprised by the fact that neither strain displayed more active growth with nutrient-rich media such as BHI. It should also be noted that Flavonifractor spp. are more commonly reported in 16S rRNA and metagenome profiling studies performed in animal subjects (Oakley et al., 2014, Medvecky et al., 2018, Zhang et al., 2018a, Borrelli et al., 2017, Liu et al., 2017a, Liu et al., 2018). Interestingly, my characterisation of the chi-containing operon also revealed that a capacity for oxalate metabolism appears to be encoded within the Flavonifractor spp. genome, owing to the presence of a conserved oxalate/formate antiporter downstream of the chi operon. Oxalate is an organic acid found in many plants, and so oxalate could be a previously unidentified nutritional requirement for this species. Valuable information regarding potential nutritional requirements of Flavonifractor spp. could be gathered from these animal studies. Indeed, many of the animals in these studies are maintained on vegetable-based diets, indicating their microbiota are exposed to a higher level of polyphenols. Altogether, I believe that much more research is required in the field of polyphenol-microbiota interactions as the interface between polyphenol consumption and human health status.

In Chapter 4, I have focused on the functional and genomic analysis of Pseudoflavonifractor capillosus DSMZ 23940T and strain AHG0008. I believe there are four primary outcomes arising from these studies. First, my studies in Chapter 2 suggest this bacterium produces “anti-inflammatory” factors that are perhaps more efficacious than those produced by F. prausnitzii A2-165. Second, neither P. capillosus DSMZ 23940T nor strain AHG0008 is capable of quercetin metabolism, and their yield is slightly reduced in its presence. Third, while my 16S rRNA phylogenetic analyses suggested they are closely related, the AHG0008 genome shares only a small amount of synteny with that of P. capillosus DSMZ 23940T, and is much smaller with a lower GC content (~56%). Fourth, I was then able to show by core genome-based phylogeny and Average Nucleotide Identity (ANI) scores that strain AHG0008 is instead the first cultured isolate of a divergent branch in this lineage, which includes three “uncultured_Clostridium/Flavonifractor” metagenome-assembled genomes (MAGs) reported by Browne et al. (2016). Indeed, the draft genomes of AHG0008 and these MAGs are approximately 50% smaller than the P. capillosus genome, even though they are estimated to be >95% complete. Based on the whole genome phylogeny and ANI scores generated in Chapter 4, it was clear that AHG0008 formed a unique clade with these three MAGs. Ricaboni et al. (2017) and Sakamoto et al. (2018) have recently reported their isolation of strains that are presumptively identified as being Flavonifractor and Pseudoflavonifractor spp., but their studies are limited to 16S rRNA gene analysis and some basic functional assessments. Here, my efforts in generating a draft genome for my isolate AHG0008 and the discovery of the MAGs have allowed me to provide a deeper analysis of a novel species. I believe my findings presented in Chapter 4 have not only highlighted the value of the recovery of MAGs, but also shown the necessity of providing a cultured representative of the MAGs, to confirm and validate the veracity of their genome assemblies. Although the number of genomes available is still small, my expansion of the representatives of this lineage has allowed me to start building their core and accessory genomes, which can then be assessed to gain valuable information regarding their metabolism, and their presumptive “anti-inflammatory” properties. To that end, I have predicted some key genes and metabolic functions which could be important in the future, to improve our cultivation and understanding of their functional roles in the GIT. Indeed, my analyses have shown that, compared to both P. capillosus strain DSMZ 23940T and Pseudoflavonifractor sp. ASF500, a relatively large percentage of both the AHG0008 and MAG genomes encodes amino acid transport and metabolism. Furthermore, my list of Predicted Highly Expressed (PHX) genes for AHG0008 identified that 8 of the 50 most highly expressed genes encode functions involved with amino acid metabolism.
In that context, I think future studies that evaluate how the growth rate and/or yield of AHG0008 is affected by the provision of plant/animal derived proteins, and their hydrolysates, seem warranted.
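The genome size and GC content comparisons summarised above are straightforward to reproduce from a draft assembly. As an illustration only (the function names and the toy contigs are hypothetical, and this is not the pipeline used in Chapter 4), the following Python sketch computes total assembly length and %GC over unambiguous bases, and expresses the size of one assembly relative to another.

```python
def assembly_stats(contigs):
    """Return (total_length_bp, gc_percent) for an iterable of contig sequences.

    GC% is calculated over unambiguous A/C/G/T bases only, so runs of N
    (gaps in a draft assembly) count towards length but not towards GC%.
    """
    total = 0
    gc = 0
    acgt = 0
    for seq in contigs:
        seq = seq.upper()
        total += len(seq)
        g_c = seq.count("G") + seq.count("C")
        gc += g_c
        acgt += g_c + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("no unambiguous bases found")
    return total, 100.0 * gc / acgt


def relative_size_percent(query_bp, reference_bp):
    """Size of the query assembly expressed as a percentage of the reference."""
    return 100.0 * query_bp / reference_bp


if __name__ == "__main__":
    # Toy sequences for illustration only; real input would be the contigs
    # of each draft genome read from a FASTA file.
    draft = ["GGCCGATC", "ccgnnGGA"]
    reference = ["GGCCGATCATAT", "CCGGGGATTTAACC", "ATGC"]
    draft_bp, draft_gc = assembly_stats(draft)
    ref_bp, ref_gc = assembly_stats(reference)
    print(draft_bp, round(draft_gc, 1), ref_bp, round(ref_gc, 1))
    print(round(relative_size_percent(draft_bp, ref_bp), 1))
```

Ignoring ambiguous bases in the GC calculation is a design choice that keeps scaffold gaps from biasing the estimate; completeness estimates such as the >95% figure quoted above need dedicated marker-gene tools and are outside the scope of this sketch.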

To conclude, my thesis presents my accomplishments in terms of building a collection of isolates that expand our representation of the two key lineages of Firmicutes-affiliated bacteria widely regarded to be important members of the GIT microbiota of healthy persons and depleted in those with chronic diseases associated with GIT inflammation. In particular, I have characterised two relatively understudied and low abundance members of this important component of the GIT microbiota, using a combination of culture-based, genome sequencing and bioinformatics approaches. From this foundation, I think my findings will pave the way for future studies that further expand our understanding of the functional attributes of these numerically small and poorly characterized components of the core microbiome of the human GIT, both in terms of diet x microbiota and host-microbe interactions relevant to GIT homeostasis (Pseudoflavonifractor) and inflammation (Flavonifractor).

Chapter 6 References

ABRAMS, G. D., BAUER, H. & SPRINZ, H. 1963. Influence of the normal flora on mucosal morphology and cellular renewal in the ileum. A comparison of germ-free and conventional mice. Lab Invest, 12, 355-64. ADINDLA, S., INAMPUDI, K. K. & GURUPRASAD, L. 2007. Cell surface proteins in archaeal and bacterial genomes comprising "LVIVD", "RIVW" and "LGxL" tandem sequence repeats are predicted to fold as beta-propeller. Int J Biol Macromol, 41, 454-68. ALMEIDA, M., POP, M., LE CHATELIER, E., PRIFTI, E., PONS, N., GHOZLANE, A. & EHRLICH, S. D. 2016. Capturing the most wanted taxa through cross-sample correlations. ISME J, 10, 2459-67. ANANTHARAM, V., ALLISON, M. J. & MALONEY, P. C. 1989. Oxalate:formate exchange. The basis for energy coupling in Oxalobacter. J Biol Chem, 264, 7244-50. ANDREWS, C. N., GRIFFITHS, T. A., KAUFMAN, J., VERGNOLLE, N., SURETTE, M. G. & RIOUX, K. P. 2011. Mesalazine (5-aminosalicylic acid) alters faecal bacterial profiles, but not mucosal proteolytic activity in diarrhoea-predominant irritable bowel syndrome. Alimentary Pharmacology & Therapeutics, 34, 374-383. ANHE, F. F., ROY, D., PILON, G., DUDONNE, S., MATAMOROS, S., VARIN, T. V., GAROFALO, C., MOINE, Q., DESJARDINS, Y., LEVY, E. & MARETTE, A. 2015. A polyphenol-rich cranberry extract protects from diet-induced obesity, insulin resistance and intestinal inflammation in association with increased Akkermansia spp. population in the gut microbiota of mice. Gut, 64, 872-83. ARAHAL, D. R. 2014. Whole-Genome Analyses: Average Nucleotide Identity. New Approaches to Prokaryotic Systematics, 41, 103-122. ARUMUGAM, M., RAES, J., PELLETIER, E., LE PASLIER, D., YAMADA, T., MENDE, D. R., FERNANDES, G. R., TAP, J., BRULS, T., BATTO, J. M., BERTALAN, M., BORRUEL, N., CASELLAS, F., FERNANDEZ, L., GAUTIER, L., HANSEN, T., HATTORI, M., HAYASHI, T., KLEEREBEZEM, M., KUROKAWA, K., LECLERC, M., LEVENEZ, F., MANICHANH, C., NIELSEN, H.
B., NIELSEN, T., PONS, N., POULAIN, J., QIN, J., SICHERITZ-PONTEN, T., TIMS, S., TORRENTS, D., UGARTE, E., ZOETENDAL, E. G., WANG, J., GUARNER, F., PEDERSEN, O., DE VOS, W. M., BRUNAK, S., DORE, J., META, H. I. T. C., ANTOLIN, M., ARTIGUENAVE, F., BLOTTIERE, H. M., ALMEIDA, M., BRECHOT, C., CARA, C., CHERVAUX, C., CULTRONE, A., DELORME, C., DENARIAZ, G., DERVYN, R., FOERSTNER, K. U., FRISS, C., VAN DE GUCHTE, M., GUEDON, E., HAIMET, F., HUBER, W., VAN HYLCKAMA-VLIEG, J., JAMET, A., JUSTE, C., KACI, G., KNOL, J., LAKHDARI, O., LAYEC, S., LE ROUX, K., MAGUIN, E., MERIEUX, A., MELO MINARDI, R., M'RINI, C., MULLER, J., OOZEER, R., PARKHILL, J., RENAULT, P., RESCIGNO, M., SANCHEZ, N., SUNAGAWA, S., TORREJON, A., TURNER, K., VANDEMEULEBROUCK, G., VARELA, E., WINOGRADSKY, Y., ZELLER, G., WEISSENBACH, J., EHRLICH, S. D. & BORK, P. 2011. Enterotypes of the human gut microbiome. Nature, 473, 174-80. ATARASHI, K., TANOUE, T., ANDO, M., KAMADA, N., NAGANO, Y., NARUSHIMA, S., SUDA, W., IMAOKA, A., SETOYAMA, H., NAGAMORI, T., ISHIKAWA, E., SHIMA, T., HARA, T., KADO, S., JINNOHARA, T., OHNO, H., KONDO, T., TOYOOKA, K., WATANABE, E., YOKOYAMA, S., TOKORO, S., MORI, H., NOGUCHI, Y., MORITA, H., IVANOV, II, SUGIYAMA, T., NUNEZ, G., CAMP, J. G., HATTORI, M., UMESAKI, Y. & HONDA, K. 2015. Th17 Cell Induction by Adhesion of Microbes to Intestinal Epithelial Cells. Cell, 163, 367-80.

ATARASHI, K., TANOUE, T., OSHIMA, K., SUDA, W., NAGANO, Y., NISHIKAWA, H., FUKUDA, S., SAITO, T., NARUSHIMA, S., HASE, K., KIM, S., FRITZ, J. V., WILMES, P., UEHA, S., MATSUSHIMA, K., OHNO, H., OLLE, B., SAKAGUCHI, S., TANIGUCHI, T., MORITA, H., HATTORI, M. & HONDA, K. 2013. Treg induction by a rationally selected mixture of Clostridia strains from the human microbiota. Nature, 500, 232-6. ATARASHI, K., TANOUE, T., SHIMA, T., IMAOKA, A., KUWAHARA, T., MOMOSE, Y., CHENG, G., YAMASAKI, S., SAITO, T., OHBA, Y., TANIGUCHI, T., TAKEDA, K., HORI, S., IVANOV, II, UMESAKI, Y., ITOH, K. & HONDA, K. 2011. Induction of colonic regulatory T cells by indigenous Clostridium species. Science, 331, 337-41. AZIZ, R. K., BARTELS, D., BEST, A. A., DEJONGH, M., DISZ, T., EDWARDS, R. A., FORMSMA, K., GERDES, S., GLASS, E. M., KUBAL, M., MEYER, F., OLSEN, G. J., OLSON, R., OSTERMAN, A. L., OVERBEEK, R. A., MCNEIL, L. K., PAARMANN, D., PACZIAN, T., PARRELLO, B., PUSCH, G. D., REICH, C., STEVENS, R., VASSIEVA, O., VONSTEIN, V., WILKE, A. & ZAGNITKO, O. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics, 9, 75. BALCH, W. E., FOX, G. E., MAGRUM, L. J., WOESE, C. R. & WOLFE, R. S. 1979. Methanogens: reevaluation of a unique biological group. Microbiol Rev, 43, 260-96. BANKEVICH, A., NURK, S., ANTIPOV, D., GUREVICH, A. A., DVORKIN, M., KULIKOV, A. S., LESIN, V. M., NIKOLENKO, S. I., PHAM, S., PRJIBELSKI, A. D., PYSHKIN, A. V., SIROTKIN, A. V., VYAHHI, N., TESLER, G., ALEKSEYEV, M. A. & PEVZNER, P. A. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol, 19, 455-77. BARCENILLA, A., PRYDE, S. E., MARTIN, J. C., DUNCAN, S. H., STEWART, C. S., HENDERSON, C. & FLINT, H. J. 2000. Phylogenetic relationships of butyrate-producing bacteria from the human gut. Appl Environ Microbiol, 66, 1654-61. BLOM, J., ALBAUM, S. P., DOPPMEIER, D., PUHLER, A., VORHOLTER, F. J., ZAKRZEWSKI, M. & GOESMANN, A. 2009. 
EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics, 10, 154. BORDA-MOLINA, D., VITAL, M., SOMMERFELD, V., RODEHUTSCORD, M. & CAMARINHA-SILVA, A. 2016. Insights into Broilers' Gut Microbiota Fed with Phosphorus, Calcium, and Phytase Supplemented Diets. Front Microbiol, 7, 2033. BORGO, F., GARBOSSA, S., RIVA, A., SEVERGNINI, M., LUIGIANO, C., BENETTI, A., PONTIROLI, A. E., MORACE, G. & BORGHI, E. 2018. Body Mass Index and Sex Affect Diverse Microbial Niches within the Gut. Front Microbiol, 9, 213. BORRELLI, L., CORETTI, L., DIPINETO, L., BOVERA, F., MENNA, F., CHIARIOTTI, L., NIZZA, A., LEMBO, F. & FIORETTI, A. 2017. Insect-based diet, a promising nutritional source, modulates gut microbiota composition and SCFAs production in laying hens. Sci Rep, 7, 16269. BRAUNE, A. & BLAUT, M. 2016. Bacterial species involved in the conversion of dietary flavonoids in the human gut. Gut Microbes, 7, 216-34. BRAUNE, A., ENGST, W. & BLAUT, M. 2005. Degradation of neohesperidin dihydrochalcone by human intestinal bacteria. J Agric Food Chem, 53, 1782-90. BRAUNE, A., ENGST, W., ELSINGHORST, P. W., FURTMANN, N., BAJORATH, J., GUTSCHOW, M. & BLAUT, M. 2016. Chalcone Isomerase from Eubacterium ramulus Catalyzes the Ring Contraction of Flavanonols. J Bacteriol, 198, 2965-2974. BRETTIN, T., DAVIS, J. J., DISZ, T., EDWARDS, R. A., GERDES, S., OLSEN, G. J., OLSON, R., OVERBEEK, R., PARRELLO, B., PUSCH, G. D., SHUKLA, M., THOMASON, J. A., 3RD, STEVENS, R., VONSTEIN, V., WATTAM, A. R. & XIA, F. 2015. RASTtk: a modular and
extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep, 5, 8365. BRIDEAU, C., GUNTER, B., PIKOUNIS, B. & LIAW, A. 2003. Improved statistical methods for hit selection in high-throughput screening. J Biomol Screen, 8, 634-47. BROADERS, E., GAHAN, C. G. & MARCHESI, J. R. 2013. Mobile genetic elements of the human gastrointestinal tract: potential for spread of antibiotic resistance genes. Gut Microbes, 4, 271- 80. BROWNE, H. P., FORSTER, S. C., ANONYE, B. O., KUMAR, N., NEVILLE, B. A., STARES, M. D., GOULDING, D. & LAWLEY, T. D. 2016. Culturing of 'unculturable' human microbiota reveals novel taxa and extensive sporulation. Nature, 533, 543-546. CARDONA, F., ANDRES-LACUEVA, C., TULIPANI, S., TINAHONES, F. J. & QUEIPO- ORTUNO, M. I. 2013. Benefits of polyphenols on gut microbiota and implications in human health. J Nutr Biochem, 24, 1415-22. CARLIER, J. P., BEDORA-FAURE, M., K'OUAS, G., ALAUZET, C. & MORY, F. 2010. Proposal to unify Clostridium orbiscindens Winter et al. 1991 and Eubacterium plautii (Seguin 1928) Hofstad and Aasjord 1982, with description of Flavonifractor plautii gen. nov., comb. nov., and reassignment of Bacteroides capillosus to Pseudoflavonifractor capillosus gen. nov., comb. nov. Int J Syst Evol Microbiol, 60, 585-90. CASTANYS-MUNOZ, E., MARTIN, M. J. & VAZQUEZ, E. 2016. Building a Beneficial Microbiome from Birth. Adv Nutr, 7, 323-30. CHOW, J., TANG, H. & MAZMANIAN, S. K. 2011. Pathobionts of the gastrointestinal microbiota and inflammatory disease. Curr Opin Immunol, 23, 473-80. CHUN, O. K., CHUNG, S. J. & SONG, W. O. 2007. Estimated dietary flavonoid intake and major food sources of U.S. adults. J Nutr, 137, 1244-52. CLAVEL, T., FALLANI, M., LEPAGE, P., LEVENEZ, F., MATHEY, J., ROCHET, V., SEREZAT, M., SUTREN, M., HENDERSON, G., BENNETAU-PELISSERO, C., TONDU, F., BLAUT, M., DORE, J. & COXAM, V. 2005. 
Isoflavones and functional foods alter the dominant intestinal microbiota in postmenopausal women. J Nutr, 135, 2786-92. CLOONEY, A. G., BERNSTEIN, C. N., LESLIE, W. D., VAGIANOS, K., SARGENT, M., LASERNA-MENDIETA, E. J., CLAESSON, M. J. & TARGOWNIK, L. E. 2016. A comparison of the gut microbiome between long-term users and non-users of proton pump inhibitors. Aliment Pharmacol Ther, 43, 974-84. CUEVA, C., GIL-SÁNCHEZ, I., AYUDA-DURÁN, B., GONZÁLEZ-MANZANO, S., GONZÁLEZ-PARAMÁS, A., SANTOS-BUELGA, C., BARTOLOMÉ, B. & MORENO- ARRIBAS, M. 2017. An Integrated View of the Effects of Wine Polyphenols and Their Relevant Metabolites on Gut and Host Health. Molecules, 22, 99. DAI, Z., WU, Z., HANG, S., ZHU, W. & WU, G. 2015. Amino acid metabolism in intestinal bacteria and its potential implications for mammalian reproduction. Mol Hum Reprod, 21, 389-409. DARLING, A. E., MAU, B. & PERNA, N. T. 2010. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One, 5, e11147. DE CRUZ, P., PRIDEAUX, L., WAGNER, J., NG, S. C., MCSWEENEY, C., KIRKWOOD, C., MORRISON, M. & KAMM, M. A. 2012. Characterization of the gastrointestinal microbiota in health and inflammatory bowel disease. Inflamm Bowel Dis, 18, 372-90. DE VRIES, L. E., VALLES, Y., AGERSO, Y., VAISHAMPAYAN, P. A., GARCIA-MONTANER, A., KUEHL, J. V., CHRISTENSEN, H., BARLOW, M. & FRANCINO, M. P. 2011. The gut as reservoir of antibiotic resistance: microbial diversity of tetracycline resistance in mother and infant. PLoS One, 6, e21644.

DEGRUTTOLA, A. K., LOW, D., MIZOGUCHI, A. & MIZOGUCHI, E. 2016. Current Understanding of Dysbiosis in Disease in Human and Animal Models. Inflamm Bowel Dis, 22, 1137-50. DELGADO, S., CABRERA-RUBIO, R., MIRA, A., SUAREZ, A. & MAYO, B. 2013. Microbiological survey of the human gastric ecosystem using culturing and pyrosequencing methods. Microb Ecol, 65, 763-72. DICKSVED, J., LINDBERG, M., ROSENQUIST, M., ENROTH, H., JANSSON, J. K. & ENGSTRAND, L. 2009. Molecular characterization of the stomach microbiota in patients with gastric cancer and in controls. J Med Microbiol, 58, 509-16. DOMINGUEZ-BELLO, M. G., COSTELLO, E. K., CONTRERAS, M., MAGRIS, M., HIDALGO, G., FIERER, N. & KNIGHT, R. 2010. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc Natl Acad Sci U S A, 107, 11971-5. DOROFEYEV, A. E., VASILENKO, I. V., RASSOKHINA, O. A. & KONDRATIUK, R. B. 2013. Mucosal barrier in ulcerative colitis and Crohn's disease. Gastroenterol Res Pract, 2013, 431231. DUBOURG, G., LAGIER, J. C., ARMOUGOM, F., ROBERT, C., HAMAD, I., BROUQUI, P. & RAOULT, D. 2013. The proof of concept that culturomics can be superior to metagenomics to study atypical stool samples. Eur J Clin Microbiol Infect Dis, 32, 1099. ECKBURG, P. B., BIK, E. M., BERNSTEIN, C. N., PURDOM, E., DETHLEFSEN, L., SARGENT, M., GILL, S. R., NELSON, K. E. & RELMAN, D. A. 2005. Diversity of the human intestinal microbial flora. Science, 308, 1635-8. EDGAR, R. C. 2018. Updating the 97% identity threshold for 16S ribosomal RNA OTUs. Bioinformatics, 34, 2371-2375. EHRLICH, S. D. & CONSORTIUM, M. 2011. MetaHIT: The European Union Project on Metagenomics of the Human Intestinal Tract. Metagenomics of the Human Body, 307-316. ELLER, C., CRABILL, M. R. & BRYANT, M. P. 1971. Anaerobic roll tube media for nonselective enumeration and isolation of bacteria in human feces. Appl Microbiol, 22, 522-9. ELSINGHORST, P. 
W., CAVLAR, T., MULLER, A., BRAUNE, A., BLAUT, M. & GUTSCHOW, M. 2011. The thermal and enzymatic taxifolin-alphitonin rearrangement. J Nat Prod, 74, 2243-9. ERMUND, A., SCHUTTE, A., JOHANSSON, M. E., GUSTAFSSON, J. K. & HANSSON, G. C. 2013. Studies of mucus in mouse stomach, small intestine, and colon. I. Gastrointestinal mucus layers have different properties depending on location as well as over the Peyer's patches. Am J Physiol Gastrointest Liver Physiol, 305, G341-7. ETXEBERRIA, U., ARIAS, N., BOQUE, N., MACARULLA, M. T., PORTILLO, M. P., MARTINEZ, J. A. & MILAGRO, F. I. 2015a. Reshaping faecal gut microbiota composition by the intake of trans-resveratrol and quercetin in high-fat sucrose diet-fed rats. J Nutr Biochem, 26, 651-60. ETXEBERRIA, U., ARIAS, N., BOQUE, N., MACARULLA, M. T., PORTILLO, M. P., MILAGRO, F. I. & MARTINEZ, J. A. 2015b. Shifts in microbiota species and fermentation products in a dietary model enriched in fat and sucrose. Benef Microbes, 6, 97-111. EVENEPOEL, P., CLAUS, D., GEYPENS, B., HIELE, M., GEBOES, K., RUTGEERTS, P. & GHOOS, Y. 1999. Amount and fate of egg protein escaping assimilation in the small intestine of humans. Am J Physiol, 277, G935-43. FAGARASAN, S., KAWAMOTO, S., KANAGAWA, O. & SUZUKI, K. 2010. Adaptive immune regulation in the gut: T cell-dependent and T cell-independent IgA synthesis. Annu Rev Immunol, 28, 243-73.

FAGARASAN, S., MURAMATSU, M., SUZUKI, K., NAGAOKA, H., HIAI, H. & HONJO, T. 2002. Critical roles of activation-induced cytidine deaminase in the homeostasis of gut flora. Science, 298, 1424-7. FANG, D., SHI, D., LV, L., GU, S., WU, W., CHEN, Y., GUO, J., LI, A., HU, X., GUO, F., YE, J., LI, Y. & LI, L. 2017. Bifidobacterium pseudocatenulatum LI09 and Bifidobacterium catenulatum LI10 attenuate D-galactosamine-induced liver injury by modifying the gut microbiota. Sci Rep, 7, 8770. FARIA, A., FERNANDES, I., NORBERTO, S., MATEUS, N. & CALHAU, C. 2014. Interplay between anthocyanins and gut microbiota. J Agric Food Chem, 62, 6898-902. FERRARIS, L., AIRES, J., WALIGORA-DUPRIET, A. J. & BUTEL, M. J. 2010. New selective medium for selection of bifidobacteria from human feces. Anaerobe, 16, 469-71. FODOR, A. A., DESANTIS, T. Z., WYLIE, K. M., BADGER, J. H., YE, Y. Z., HEPBURN, T., HU, P., SODERGREN, E., LIOLIOS, K., HUOT-CREASY, H., BIRREN, B. W. & EARL, A. M. 2012. The "Most Wanted'' Taxa from the Human Microbiome for Whole Genome Sequencing. Plos One, 7. FRANCINO, M. P. 2018. Birth Mode-Related Differences in Gut Microbiota Colonization and Immune System Development. Ann Nutr Metab, 73 Suppl 3, 12-16. FRANK, D. N., ST AMAND, A. L., FELDMAN, R. A., BOEDEKER, E. C., HARPAZ, N. & PACE, N. R. 2007. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci U S A, 104, 13780-5. GALL, M., THOMSEN, M., PETERS, C., PAVLIDIS, I. V., JONCZYK, P., GRUNERT, P. P., BEUTEL, S., SCHEPER, T., GROSS, E., BACKES, M., GEISSLER, T., LEY, J. P., HILMER, J. M., KRAMMER, G., PALM, G. J., HINRICHS, W. & BORNSCHEUER, U. T. 2014. Enzymatic conversion of flavonoids using bacterial chalcone isomerase and enoate reductase. Angew Chem Int Ed Engl, 53, 1439-42. GASPAR, S. R., MENDONCA, T., OLIVEIRA, P., OLIVEIRA, T., DIAS, J. & LOPES, T. 2016. Urolithiasis and crohn's disease. Urol Ann, 8, 297-304. GERONDAKIS, S., FULFORD, T. 
S., MESSINA, N. L. & GRUMONT, R. J. 2014. NF-kappaB control of T cell development. Nat Immunol, 15, 15-25. GILL, S. R., POP, M., DEBOY, R. T., ECKBURG, P. B., TURNBAUGH, P. J., SAMUEL, B. S., GORDON, J. I., RELMAN, D. A., FRASER-LIGGETT, C. M. & NELSON, K. E. 2006. Metagenomic analysis of the human distal gut microbiome. Science, 312, 1355-9. GOEBELER, M., GILLITZER, R., KILIAN, K., UTZEL, K., BROCKER, E. B., RAPP, U. R. & LUDWIG, S. 2001. Multiple signaling pathways regulate NF-kappaB-dependent transcription of the monocyte chemoattractant protein-1 gene in primary endothelial cells. Blood, 97, 46- 55. HARMSEN, H. J., WILDEBOER-VELOO, A. C., RAANGS, G. C., WAGENDORP, A. A., KLIJN, N., BINDELS, J. G. & WELLING, G. W. 2000. Analysis of intestinal flora development in breast-fed and formula-fed infants by using molecular identification and detection methods. J Pediatr Gastroenterol Nutr, 30, 61-7. HERLES, C., BRAUNE, A. & BLAUT, M. 2004. First bacterial chalcone isomerase isolated from Eubacterium ramulus. Arch Microbiol, 181, 428-34. HIDALGO, M., ORUNA-CONCHA, M. J., KOLIDA, S., WALTON, G. E., KALLITHRAKA, S., SPENCER, J. P. & DE PASCUAL-TERESA, S. 2012. Metabolism of anthocyanins by human gut microflora and their influence on gut bacterial growth. J Agric Food Chem, 60, 3882-90. HOFSTAD, T. & AASJORD, P. 1982. Eubacterium-Plautii (Seguin 1928) Comb Nov. International Journal of Systematic Bacteriology, 32, 346-349.

HUERTA-CEPAS, J., FORSLUND, K., COELHO, L. P., SZKLARCZYK, D., JENSEN, L. J., VON MERING, C. & BORK, P. 2017. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Mol Biol Evol, 34, 2115-2122. HUERTA-CEPAS, J., SZKLARCZYK, D., FORSLUND, K., COOK, H., HELLER, D., WALTER, M. C., RATTEI, T., MENDE, D. R., SUNAGAWA, S., KUHN, M., JENSEN, L. J., VON MERING, C. & BORK, P. 2016. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res, 44, D286-93. HUMAN MICROBIOME JUMPSTART REFERENCE STRAINS, C., NELSON, K. E., WEINSTOCK, G. M., HIGHLANDER, S. K., WORLEY, K. C., CREASY, H. H., WORTMAN, J. R., RUSCH, D. B., MITREVA, M., SODERGREN, E., CHINWALLA, A. T., FELDGARDEN, M., GEVERS, D., HAAS, B. J., MADUPU, R., WARD, D. V., BIRREN, B. W., GIBBS, R. A., METHE, B., PETROSINO, J. F., STRAUSBERG, R. L., SUTTON, G. G., WHITE, O. R., WILSON, R. K., DURKIN, S., GIGLIO, M. G., GUJJA, S., HOWARTH, C., KODIRA, C. D., KYRPIDES, N., MEHTA, T., MUZNY, D. M., PEARSON, M., PEPIN, K., PATI, A., QIN, X., YANDAVA, C., ZENG, Q., ZHANG, L., BERLIN, A. M., CHEN, L., HEPBURN, T. A., JOHNSON, J., MCCORRISON, J., MILLER, J., MINX, P., NUSBAUM, C., RUSS, C., SYKES, S. M., TOMLINSON, C. M., YOUNG, S., WARREN, W. C., BADGER, J., CRABTREE, J., MARKOWITZ, V. M., ORVIS, J., CREE, A., FERRIERA, S., FULTON, L. L., FULTON, R. S., GILLIS, M., HEMPHILL, L. D., JOSHI, V., KOVAR, C., TORRALBA, M., WETTERSTRAND, K. A., ABOUELLLEIL, A., WOLLAM, A. M., BUHAY, C. J., DING, Y., DUGAN, S., FITZGERALD, M. G., HOLDER, M., HOSTETLER, J., CLIFTON, S. W., ALLEN-VERCOE, E., EARL, A. M., FARMER, C. N., LIOLIOS, K., SURETTE, M. G., XU, Q., POHL, C., WILCZEK-BONEY, K. & ZHU, D. 2010. A catalog of reference genomes from the human microbiome. Science, 328, 994-9. HUMAN MICROBIOME PROJECT, C. 2012a. A framework for human microbiome research. Nature, 486, 215-21. HUMAN MICROBIOME PROJECT, C. 2012b. 
Structure, function and diversity of the healthy human microbiome. Nature, 486, 207-14. HUR, H. G., BEGER, R. D., HEINZE, T. M., LAY, J. O., JR., FREEMAN, J. P., DORE, J. & RAFII, F. 2002. Isolation of an anaerobic intestinal bacterium capable of cleaving the C-ring of the isoflavonoid daidzein. Arch Microbiol, 178, 8-12. ISENBERG, H. D., GOLDBERG, D. & SAMPSON, J. 1970. Laboratory studies with a selective Enterococcus medium. Appl Microbiol, 20, 433-6. JAKOBSSON, H. E., RODRIGUEZ-PINEIRO, A. M., SCHUTTE, A., ERMUND, A., BOYSEN, P., BEMARK, M., SOMMER, F., BACKHED, F., HANSSON, G. C. & JOHANSSON, M. E. 2015. The composition of the gut microbiota shapes the colon mucus barrier. EMBO Rep, 16, 164-77. JANDA, J. M. & ABBOTT, S. L. 2007. 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: pluses, perils, and pitfalls. J Clin Microbiol, 45, 2761-4. JEFFERY, I. B. & O'TOOLE, P. W. 2013. Diet-microbiota interactions and their implications for healthy living. Nutrients, 5, 234-52. JERALDO, P., HERNANDEZ, A., NIELSEN, H. B., CHEN, X., WHITE, B. A., GOLDENFELD, N., NELSON, H., ALHQUIST, D., BOARDMAN, L. & CHIA, N. 2016. Capturing One of the Human Gut Microbiome's Most Wanted: Reconstructing the Genome of a Novel Butyrate-Producing, Clostridial Scavenger from Metagenomic Sequence Data. Front Microbiol, 7, 783. JOHANSEN, F. E., PEKNA, M., NORDERHAUG, I. N., HANEBERG, B., HIETALA, M. A., KRAJCI, P., BETSHOLTZ, C. & BRANDTZAEG, P. 1999. Absence of epithelial
immunoglobulin A transport, with increased mucosal leakiness, in polymeric immunoglobulin receptor/secretory component-deficient mice. J Exp Med, 190, 915-22. JONES, D. T., TAYLOR, W. R. & THORNTON, J. M. 1992. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci, 8, 275-82. JUNG, H. C., ECKMANN, L., YANG, S. K., PANJA, A., FIERER, J., MORZYCKA- WROBLEWSKA, E. & KAGNOFF, M. F. 1995. A distinct array of proinflammatory cytokines is expressed in human colon epithelial cells in response to bacterial invasion. J Clin Invest, 95, 55-65. KACI, G., GOUDERCOURT, D., DENNIN, V., POT, B., DORE, J., EHRLICH, S. D., RENAULT, P., BLOTTIERE, H. M., DANIEL, C. & DELORME, C. 2014. Anti-inflammatory properties of Streptococcus salivarius, a commensal bacterium of the oral cavity and digestive tract. Appl Environ Microbiol, 80, 928-34. KACI, G., LAKHDARI, O., DORE, J., EHRLICH, S. D., RENAULT, P., BLOTTIERE, H. M. & DELORME, C. 2011. Inhibition of the NF-kappaB pathway in human intestinal epithelial cells by commensal Streptococcus salivarius. Appl Environ Microbiol, 77, 4681-4. KARLIN, S. & MRAZEK, J. 2000. Predicted highly expressed genes of diverse prokaryotic genomes. J Bacteriol, 182, 5238-50. KASAI, C., SUGIMOTO, K., MORITANI, I., TANAKA, J., OYA, Y., INOUE, H., TAMEDA, M., SHIRAKI, K., ITO, M., TAKEI, Y. & TAKASE, K. 2015. Comparison of the gut microbiota composition between obese and non-obese individuals in a Japanese population, as analyzed by terminal restriction fragment length polymorphism and next-generation sequencing. BMC Gastroenterol, 15, 100. KELLY, D., CAMPBELL, J. I., KING, T. P., GRANT, G., JANSSON, E. A., COUTTS, A. G., PETTERSSON, S. & CONWAY, S. 2004. Commensal anaerobic gut bacteria attenuate inflammation by regulating nuclear-cytoplasmic shuttling of PPAR-gamma and RelA. Nat Immunol, 5, 104-12. KIM, M. J., WOO, S. Y., KIM, E. R., HONG, S. N., CHANG, D. K., RHEE, P. L., KIM, J. J., RHEE, J. C. & KIM, Y. H. 2015. 
Incidence and Risk Factors for Urolithiasis in Patients with Crohn's Disease. Urol Int, 95, 314-9. KIMURA, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol, 16, 111-20. KLARING, K., HANSKE, L., BUI, N., CHARRIER, C., BLAUT, M., HALLER, D., PLUGGE, C. M. & CLAVEL, T. 2013. Intestinimonas butyriciproducens gen. nov., sp. nov., a butyrate-producing bacterium from the mouse intestine. Int J Syst Evol Microbiol, 63, 4606-12. KONSTANTINIDIS, K. T., RAMETTE, A. & TIEDJE, J. M. 2006. The bacterial species definition in the genomic era. Philos Trans R Soc Lond B Biol Sci, 361, 1929-40. KUMAR, S., STECHER, G. & TAMURA, K. 2016. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol, 33, 1870-4. KUNDIG, W., GHOSH, S. & ROSEMAN, S. 1964. Phosphate Bound to Histidine in a Protein as an Intermediate in a Novel Phospho-Transferase System. Proc Natl Acad Sci U S A, 52, 1067-74. KUTSCHERA, M., ENGST, W., BLAUT, M. & BRAUNE, A. 2011. Isolation of catechin-converting human intestinal bacteria. J Appl Microbiol, 111, 165-75. LAGIER, J. C., ARMOUGOM, F., MILLION, M., HUGON, P., PAGNIER, I., ROBERT, C., BITTAR, F., FOURNOUS, G., GIMENEZ, G., MARANINCHI, M., TRAPE, J. F., KOONIN, E. V., LA SCOLA, B. & RAOULT, D. 2012. Microbial culturomics: paradigm shift in the human gut microbiome study. Clin Microbiol Infect, 18, 1185-93. LAKHDARI, O., CULTRONE, A., TAP, J., GLOUX, K., BERNARD, F., EHRLICH, S. D., LEFEVRE, F., DORE, J. & BLOTTIERE, H. M. 2010. Functional metagenomics: a high
throughput screening method to decipher microbiota-driven NF-kappaB modulation in the human gut. PLoS One, 5. LAKHDARI, O., TAP, J., BEGUET-CRESPEL, F., LE ROUX, K., DE WOUTERS, T., CULTRONE, A., NEPELSKA, M., LEFEVRE, F., DORE, J. & BLOTTIERE, H. M. 2011. Identification of NF-kappaB modulation capabilities within human intestinal commensal bacteria. J Biomed Biotechnol, 2011, 282356. LAUTENBACH, E., GOULD, C. V., LAROSA, L. A., MARR, A. M., NACHAMKIN, I., BILKER, W. B. & FISHMAN, N. O. 2004. Emergence of resistance to chloramphenicol among vancomycin-resistant enterococcal (VRE) bloodstream isolates. Int J Antimicrob Agents, 23, 200-3. LEY, R. E., TURNBAUGH, P. J., KLEIN, S. & GORDON, J. I. 2006. Microbial ecology: human gut microbes associated with obesity. Nature, 444, 1022-3. LIU, C., MENG, Q., CHEN, Y., XU, M., SHEN, M., GAO, R. & GAN, S. 2017a. Role of Age- Related Shifts in Rumen Bacteria and Methanogens in Methane Production in Cattle. Front Microbiol, 8, 1563. LIU, L., LIN, L., ZHENG, L., TANG, H., FAN, X., XUE, N., LI, M., LIU, M. & LI, X. 2018. Cecal microbiome profile altered by Salmonella enterica, serovar Enteritidis inoculation in chicken. Gut Pathog, 10, 34. LIU, T., ZHANG, L., JOO, D. & SUN, S. C. 2017b. NF-kappaB signaling in inflammation. Signal Transduct Target Ther, 2. LIVINGSTON, S. J., KOMINOS, S. D. & YEE, R. B. 1978. New medium for selection and presumptive identification of the Bacteroides fragilis group. J Clin Microbiol, 7, 448-53. LOUIS, S., TAPPU, R. M., DAMMS-MACHADO, A., HUSON, D. H. & BISCHOFF, S. C. 2016. Characterization of the Gut Microbial Community of Obese Patients Following a Weight- Loss Intervention Using Whole Metagenome Shotgun Sequencing. PLoS One, 11, e0149564. LOZUPONE, C. A., STOMBAUGH, J. I., GORDON, J. I., JANSSON, J. K. & KNIGHT, R. 2012. Diversity, stability and resilience of the human gut microbiota. Nature, 489, 220-30. MA, L., KIM, J., HATZENPICHLER, R., KARYMOV, M. A., HUBERT, N., HANAN, I. M., CHANG, E. B. 
& ISMAGILOV, R. F. 2014. Gene-targeted microfluidic cultivation validated by isolation of a gut bacterium listed in Human Microbiome Project's Most Wanted taxa. Proc Natl Acad Sci U S A, 111, 9768-73. MACY, J. M., SNELLEN, J. E. & HUNGATE, R. E. 1972. Use of syringe methods for anaerobiosis. Am J Clin Nutr, 25, 1318-23. MALO, N., HANLEY, J. A., CERQUOZZI, S., PELLETIER, J. & NADON, R. 2006. Statistical practice in high-throughput screening data analysis. Nat Biotechnol, 24, 167-75. MANTIS, N. J. & FORBES, S. J. 2010. Secretory IgA: arresting microbial pathogens at epithelial borders. Immunol Invest, 39, 383-406. MARCHESI, J. R. & RAVEL, J. 2015. The vocabulary of microbiome research: a proposal. Microbiome, 3, 31. MARIN, L., MIGUELEZ, E. M., VILLAR, C. J. & LOMBO, F. 2015. Bioavailability of dietary polyphenols and gut microbiota metabolism: antimicrobial properties. Biomed Res Int, 2015, 905215. MARTIN, R., CHAIN, F., MIQUEL, S., LU, J., GRATADOUX, J. J., SOKOL, H., VERDU, E. F., BERCIK, P., BERMUDEZ-HUMARAN, L. G. & LANGELLA, P. 2014. The commensal bacterium Faecalibacterium prausnitzii is protective in DNBS-induced chronic moderate and severe colitis models. Inflamm Bowel Dis, 20, 417-30. MARTINEZ-MEDINA, M., ALDEGUER, X., GONZALEZ-HUIX, F., ACERO, D. & GARCIA- GIL, L. J. 2006. Abnormal microbiota composition in the ileocolonic mucosa of Crohn's

disease patients as revealed by polymerase chain reaction-denaturing gradient gel electrophoresis. Inflamm Bowel Dis, 12, 1136-45. MCSWEENEY, C. S., DENMAN, S. E. & MACKIE, R. I. 2005. Rumen bacteria. In: MAKKAR, H. P. S. & MCSWEENEY, C. S. (eds.) Methods in Gut Microbial Ecology for Ruminants. Dordrecht: Springer Netherlands. MEDVECKY, M., CEJKOVA, D., POLANSKY, O., KARASOVA, D., KUBASOVA, T., CIZEK, A. & RYCHLIK, I. 2018. Whole genome sequencing and function prediction of 133 gut anaerobes isolated from chicken caecum in pure cultures. BMC Genomics, 19, 561. MIQUEL, S., LECLERC, M., MARTIN, R., CHAIN, F., LENOIR, M., RAGUIDEAU, S., HUDAULT, S., BRIDONNEAU, C., NORTHEN, T., BOWEN, B., BERMUDEZ- HUMARAN, L. G., SOKOL, H., THOMAS, M. & LANGELLA, P. 2015. Identification of metabolic signatures linked to anti-inflammatory effects of Faecalibacterium prausnitzii. MBio, 6. MODI, S. R., COLLINS, J. J. & RELMAN, D. A. 2014. Antibiotics and the gut microbiota. J Clin Invest, 124, 4212-8. MOLINA, J., BARRANTES, G., QUESADA-GOMEZ, C., RODRIGUEZ, C. & RODRIGUEZ- CAVALLINI, E. 2014. Phenotypic and genotypic characterization of multidrug-resistant Bacteroides, Parabacteroides spp., and Pseudoflavonifractor from a Costa Rican hospital. Microb Drug Resist, 20, 478-84. MONDOT, S. & LEPAGE, P. 2016. The human gut microbiome and its dysfunctions through the meta-omics prism. Ann N Y Acad Sci, 1372, 9-19. MORGAN, X. C. & HUTTENHOWER, C. 2012. Chapter 12: Human microbiome analysis. PLoS Comput Biol, 8, e1002808. MOSTOV, K. E. 1994. Transepithelial transport of immunoglobulins. Annu Rev Immunol, 12, 63-84. MURTAZA, N., P, O. C. & MORRISON, M. 2017. Diet and the Microbiome. Gastroenterol Clin North Am, 46, 49-60. NARUSHIMA, S., SUGIURA, Y., OSHIMA, K., ATARASHI, K., HATTORI, M., SUEMATSU, M. & HONDA, K. 2014. Characterization of the 17 strains of regulatory T cell-inducing human-derived Clostridia. Gut Microbes, 5, 333-9. NAYFACH, S., FISCHBACH, M. A. & POLLARD, K. S. 2015. 
MetaQuery: a web server for rapid annotation and quantitative analysis of specific genes in the human gut microbiome. Bioinformatics, 31, 3368-70. NEVEU, V., PEREZ-JIMENEZ, J., VOS, F., CRESPY, V., DU CHAFFAUT, L., MENNEN, L., KNOX, C., EISNER, R., CRUZ, J., WISHART, D. & SCALBERT, A. 2010. Phenol-Explorer: an online comprehensive database on polyphenol contents in foods. Database (Oxford), 2010, bap024. Ó CUÍV, P., AGUIRRE DE CARCER, D., JONES, M., KLAASSENS, E. S., WORTHLEY, D. L., WHITEHALL, V. L., KANG, S., MCSWEENEY, C. S., LEGGETT, B. A. & MORRISON, M. 2011. The effects from DNA extraction methods on the evaluation of microbial diversity associated with human colonic tissue. Microb Ecol, 61, 353-62. Ó CUÍV, P., BURMAN, S., POTTENGER, S. & MORRISON, M. 2016. Exploring the Bioactive Landscape of the Gut Microbiota to Identify Metabolites Underpinning Human Health. In: BEALE, D. J., KOUREMENOS, K. A. & PALOMBO, E. A. (eds.) Microbial Metabolomics: Applications in Clinical, Environmental, and Industrial Microbiology. Cham: Springer International Publishing. Ó CUÍV, P., DE WOUTERS, T., GIRI, R., MONDOT, S., SMITH, W. J., BLOTTIÈRE, H. M., BEGUN, J. & MORRISON, M. 2017. The gut bacterium and pathobiont Bacteroides vulgatus activates NF-κB in a human gut epithelial cell line in a strain and growth phase dependent manner. Anaerobe, 47, 209-217.

Ó CUÍV, P., SMITH, W. J., POTTENGER, S., BURMAN, S., SHANAHAN, E. R. & MORRISON, M. 2015. Isolation of Genetically Tractable Most-Wanted Bacteria by Metaparental Mating. Sci Rep, 5, 13282. OAKLEY, B. B., BUHR, R. J., RITZ, C. W., KIEPPER, B. H., BERRANG, M. E., SEAL, B. S. & COX, N. A. 2014. Successional changes in the chicken cecal microbiome during 42 days of growth are independent of organic acid feed additives. BMC Vet Res, 10, 282. OTTMAN, N., SMIDT, H., DE VOS, W. M. & BELZER, C. 2012. The function of our microbiota: who is out there and what do they do? Front Cell Infect Microbiol, 2, 104. OVERBEEK, R., OLSON, R., PUSCH, G. D., OLSEN, G. J., DAVIS, J. J., DISZ, T., EDWARDS, R. A., GERDES, S., PARRELLO, B., SHUKLA, M., VONSTEIN, V., WATTAM, A. R., XIA, F. & STEVENS, R. 2014. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res, 42, D206-14. OZDAL, T., SELA, D. A., XIAO, J., BOYACIOGLU, D., CHEN, F. & CAPANOGLU, E. 2016. The Reciprocal Interactions between Polyphenols and Gut Microbiota and Effects on Bioaccessibility. Nutrients, 8, 78. PALM, N. W., DE ZOETE, M. R., CULLEN, T. W., BARRY, N. A., STEFANOWSKI, J., HAO, L., DEGNAN, P. H., HU, J., PETER, I., ZHANG, W., RUGGIERO, E., CHO, J. H., GOODMAN, A. L. & FLAVELL, R. A. 2014. Immunoglobulin A coating identifies colitogenic bacteria in inflammatory bowel disease. Cell, 158, 1000-1010. PALMER, C., BIK, E. M., DIGIULIO, D. B., RELMAN, D. A. & BROWN, P. O. 2007. Development of the human infant intestinal microbiota. PLoS Biol, 5, e177. PARKS, D. H., IMELFORT, M., SKENNERTON, C. T., HUGENHOLTZ, P. & TYSON, G. W. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res, 25, 1043-55. PARKS, D. H., RINKE, C., CHUVOCHINA, M., CHAUMEIL, P. A., WOODCROFT, B. J., EVANS, P. N., HUGENHOLTZ, P. & TYSON, G. W. 2017. 
Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat Microbiol, 2, 1533-1542. PETERSON, D. A., FRANK, D. N., PACE, N. R. & GORDON, J. I. 2008. Metagenomic approaches for defining the pathogenesis of inflammatory bowel diseases. Cell Host Microbe, 3, 417-27. PETERSON, D. A., MCNULTY, N. P., GURUGE, J. L. & GORDON, J. I. 2007. IgA response to symbiotic bacteria as a mediator of gut homeostasis. Cell Host Microbe, 2, 328-39. PETKAU, A., STUART-EDWARDS, M., STOTHARD, P. & VAN DOMSELAAR, G. 2010. Interactive microbial genome visualization with GView. Bioinformatics, 26, 3125-6. PLAYNE, M. J. 1985. Determination of Ethanol, Volatile Fatty-Acids, Lactic and Succinic Acids in Fermentation Liquids by Gas-Chromatography. Journal of the Science of Food and Agriculture, 36, 638-644. POLANSKY, O., SEKELOVA, Z., FALDYNOVA, M., SEBKOVA, A., SISAK, F. & RYCHLIK, I. 2015. Important Metabolic Pathways and Biological Processes Expressed by Chicken Cecal Microbiota. Appl Environ Microbiol, 82, 1569-76. PRUESSE, E., PEPLIES, J. & GLOCKNER, F. O. 2012. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics, 28, 1823-9. QIN, J., LI, R., RAES, J., ARUMUGAM, M., BURGDORF, K. S., MANICHANH, C., NIELSEN, T., PONS, N., LEVENEZ, F., YAMADA, T., MENDE, D. R., LI, J., XU, J., LI, S., LI, D., CAO, J., WANG, B., LIANG, H., ZHENG, H., XIE, Y., TAP, J., LEPAGE, P., BERTALAN, M., BATTO, J. M., HANSEN, T., LE PASLIER, D., LINNEBERG, A., NIELSEN, H. B., PELLETIER, E., RENAULT, P., SICHERITZ-PONTEN, T., TURNER, K., ZHU, H., YU, C., LI, S., JIAN, M., ZHOU, Y., LI, Y., ZHANG, X., LI, S., QIN, N., YANG, H., WANG, J., BRUNAK, S., DORE, J., GUARNER, F., KRISTIANSEN, K., PEDERSEN, O., PARKHILL,
J., WEISSENBACH, J., META, H. I. T. C., BORK, P., EHRLICH, S. D. & WANG, J. 2010. A human gut microbial gene catalogue established by metagenomic sequencing. Nature, 464, 59-65. QIN, J., LI, Y., CAI, Z., LI, S., ZHU, J., ZHANG, F., LIANG, S., ZHANG, W., GUAN, Y., SHEN, D., PENG, Y., ZHANG, D., JIE, Z., WU, W., QIN, Y., XUE, W., LI, J., HAN, L., LU, D., WU, P., DAI, Y., SUN, X., LI, Z., TANG, A., ZHONG, S., LI, X., CHEN, W., XU, R., WANG, M., FENG, Q., GONG, M., YU, J., ZHANG, Y., ZHANG, M., HANSEN, T., SANCHEZ, G., RAES, J., FALONY, G., OKUDA, S., ALMEIDA, M., LECHATELIER, E., RENAULT, P., PONS, N., BATTO, J. M., ZHANG, Z., CHEN, H., YANG, R., ZHENG, W., LI, S., YANG, H., WANG, J., EHRLICH, S. D., NIELSEN, R., PEDERSEN, O., KRISTIANSEN, K. & WANG, J. 2012. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature, 490, 55-60. QIN, N., YANG, F., LI, A., PRIFTI, E., CHEN, Y., SHAO, L., GUO, J., LE CHATELIER, E., YAO, J., WU, L., ZHOU, J., NI, S., LIU, L., PONS, N., BATTO, J. M., KENNEDY, S. P., LEONARD, P., YUAN, C., DING, W., CHEN, Y., HU, X., ZHENG, B., QIAN, G., XU, W., EHRLICH, S. D., ZHENG, S. & LI, L. 2014. Alterations of the human gut microbiome in liver cirrhosis. Nature, 513, 59-64. QUEVRAIN, E., MAUBERT, M. A., MICHON, C., CHAIN, F., MARQUANT, R., TAILHADES, J., MIQUEL, S., CARLIER, L., BERMUDEZ-HUMARAN, L. G., PIGNEUR, B., LEQUIN, O., KHARRAT, P., THOMAS, G., RAINTEAU, D., AUBRY, C., BREYNER, N., AFONSO, C., LAVIELLE, S., GRILL, J. P., CHASSAING, G., CHATEL, J. M., TRUGNAN, G., XAVIER, R., LANGELLA, P., SOKOL, H. & SEKSIK, P. 2016. Identification of an anti- inflammatory protein from Faecalibacterium prausnitzii, a commensal bacterium deficient in Crohn's disease. Gut, 65, 415-425. RAKOFF-NAHOUM, S., PAGLINO, J., ESLAMI-VARZANEH, F., EDBERG, S. & MEDZHITOV, R. 2004. Recognition of commensal microflora by toll-like receptors is required for intestinal homeostasis. Cell, 118, 229-41. REEVES, A. E., KOENIGSKNECHT, M. J., BERGIN, I. L. 
& YOUNG, V. B. 2012. Suppression of Clostridium difficile in the gastrointestinal tracts of germfree mice inoculated with a murine isolate from the family Lachnospiraceae. Infect Immun, 80, 3786-94. RICABONI, D., MAILHE, M., BENEZECH, A., ANDRIEU, C., FOURNIER, P. E. & RAOULT, D. 2017. 'Pseudoflavonifractor phocaeensis' gen. nov., sp. nov., isolated from human left colon. New Microbes New Infect, 17, 15-17. RICAURTE, J. C., BOUCHER, H. W., TURETT, G. S., MOELLERING, R. C., LABOMBARDI, V. J. & KISLAK, J. W. 2001. Chloramphenicol treatment for vancomycin-resistant Enterococcus faecium bacteremia. Clin Microbiol Infect, 7, 17-21. ROOPCHAND, D. E., CARMODY, R. N., KUHN, P., MOSKAL, K., ROJAS-SILVA, P., TURNBAUGH, P. J. & RASKIN, I. 2015. Dietary Polyphenols Promote Growth of the Gut Bacterium Akkermansia muciniphila and Attenuate High-Fat Diet-Induced Metabolic Syndrome. Diabetes, 64, 2847-58. ROSSI-TAMISIER, M., BENAMAR, S., RAOULT, D. & FOURNIER, P. E. 2015. Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species. Int J Syst Evol Microbiol, 65, 1929-34. RUAN, Z. S., ANANTHARAM, V., CRAWFORD, I. T., AMBUDKAR, S. V., RHEE, S. Y., ALLISON, M. J. & MALONEY, P. C. 1992. Identification, purification, and reconstitution of OxlT, the oxalate:formate antiport protein of Oxalobacter formigenes. J Biol Chem, 267, 10537-43. SAIER, M. H., JR. 2015. The Bacterial Phosphotransferase System: New Frontiers 50 Years after Its Discovery. J Mol Microbiol Biotechnol, 25, 73-8.

SAKAMOTO, M., IINO, T., YUKI, M. & OHKUMA, M. 2018. Lawsonibacter asaccharolyticus gen. nov., sp. nov., a butyrate-producing bacterium isolated from human faeces. Int J Syst Evol Microbiol. SCALBERT, A., MANACH, C., MORAND, C., REMESY, C. & JIMENEZ, L. 2005. Dietary polyphenols and the prevention of diseases. Crit Rev Food Sci Nutr, 45, 287-306. SCHNEIDER, H. & BLAUT, M. 2000. Anaerobic degradation of flavonoids by Eubacterium ramulus. Arch Microbiol, 173, 71-5. SCHOEFER, L., BRAUNE, A. & BLAUT, M. 2004. Cloning and expression of a phloretin hydrolase gene from Eubacterium ramulus and characterization of the recombinant enzyme. Appl Environ Microbiol, 70, 6131-7. SCHOEFER, L., MOHAN, R., BRAUNE, A., BIRRINGER, M. & BLAUT, M. 2002. Anaerobic C- ring cleavage of genistein and daidzein by Eubacterium ramulus. FEMS Microbiol Lett, 208, 197-202. SCHOEFER, L., MOHAN, R., SCHWIERTZ, A., BRAUNE, A. & BLAUT, M. 2003. Anaerobic degradation of flavonoids by Clostridium orbiscindens. Appl Environ Microbiol, 69, 5849-54. SEEMANN, T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics, 30, 2068-9. SEGUIN, P. 1928. The Fusobacterium plauti culture, a mobile form of the fusiform bacillus. Comptes Rendus Des Seances De La Societe De Biologie Et De Ses Filiales, 99, 439-442. SELMA, M. V., ESPIN, J. C. & TOMAS-BARBERAN, F. A. 2009. Interaction between phenolics and gut microbiota: role in human health. J Agric Food Chem, 57, 6485-501. SEVILLE, L. A., PATTERSON, A. J., SCOTT, K. P., MULLANY, P., QUAIL, M. A., PARKHILL, J., READY, D., WILSON, M., SPRATT, D. & ROBERTS, A. P. 2009. Distribution of tetracycline and erythromycin resistance genes among human oral and fecal metagenomic DNA. Microb Drug Resist, 15, 159-66. SHULZHENKO, N., MORGUN, A., HSIAO, W., BATTLE, M., YAO, M., GAVRILOVA, O., ORANDLE, M., MAYER, L., MACPHERSON, A. J., MCCOY, K. D., FRASER-LIGGETT, C. & MATZINGER, P. 2011. 
Crosstalk between B lymphocytes, microbiota and the intestinal epithelium governs immunity versus metabolism in the gut. Nat Med, 17, 1585-93. SILLS, M. R. & BOENNING, D. 1999. Chloramphenicol. Pediatrics in Review, 20, 357-358. SIMMERING, R., PFORTE, H., JACOBASCH, G. & BLAUT, M. 2002. The growth of the flavonoid-degrading intestinal bacterium, Eubacterium ramulus, is stimulated by dietary flavonoids in vivo. FEMS Microbiol Ecol, 40, 243-8. SIMOSSIS, V. A. & HERINGA, J. 2005. PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Res, 33, W289-94. SOKOL, H., PIGNEUR, B., WATTERLOT, L., LAKHDARI, O., BERMUDEZ-HUMARAN, L. G., GRATADOUX, J. J., BLUGEON, S., BRIDONNEAU, C., FURET, J. P., CORTHIER, G., GRANGETTE, C., VASQUEZ, N., POCHART, P., TRUGNAN, G., THOMAS, G., BLOTTIERE, H. M., DORE, J., MARTEAU, P., SEKSIK, P. & LANGELLA, P. 2008. Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. Proc Natl Acad Sci U S A, 105, 16731-6. SOKOL, H., SEKSIK, P., FURET, J. P., FIRMESSE, O., NION-LARMURIER, I., BEAUGERIE, L., COSNES, J., CORTHIER, G., MARTEAU, P. & DORE, J. 2009. Low counts of Faecalibacterium prausnitzii in colitis microbiota. Inflamm Bowel Dis, 15, 1183-9. STACKEBRANDT, E. & GOEBEL, B. M. 1994. Taxonomic Note: A Place for DNA-DNA Reassociation and 16S rRNA Sequence Analysis in the Present Species Definition in Bacteriology. International Journal of Systematic and Evolutionary Microbiology, 44, 846- 849.

STEVENSON, B. S., EICHORST, S. A., WERTZ, J. T., SCHMIDT, T. M. & BREZNAK, J. A. 2004. New strategies for cultivation and detection of previously uncultured microbes. Appl Environ Microbiol, 70, 4748-55. STEWART, C. S., DUNCAN, S. H. & CAVE, D. R. 2004. Oxalobacter formigenes and its role in oxalate metabolism in the human gut. FEMS Microbiol Lett, 230, 1-7. STINCONE, A., PRIGIONE, A., CRAMER, T., WAMELINK, M. M., CAMPBELL, K., CHEUNG, E., OLIN-SANDOVAL, V., GRUNING, N. M., KRUGER, A., TAUQEER ALAM, M., KELLER, M. A., BREITENBACH, M., BRINDLE, K. M., RABINOWITZ, J. D. & RALSER, M. 2015. The return of metabolism: biochemistry and physiology of the pentose phosphate pathway. Biol Rev Camb Philos Soc, 90, 927-63. STRAUSS, J., KAPLAN, G. G., BECK, P. L., RIOUX, K., PANACCIONE, R., DEVINNEY, R., LYNCH, T. & ALLEN-VERCOE, E. 2011. Invasive potential of gut mucosa-derived Fusobacterium nucleatum positively correlates with IBD status of the host. Inflamm Bowel Dis, 17, 1971-8. STUCKY, B. J. 2012. SeqTrace: a graphical tool for rapidly processing DNA sequencing chromatograms. J Biomol Tech, 23, 90-3. STUDIER, F. W. 2005. Protein production by auto-induction in high density shaking cultures. Protein Expr Purif, 41, 207-34. SUNAGAWA, S., MENDE, D. R., ZELLER, G., IZQUIERDO-CARRASCO, F., BERGER, S. A., KULTIMA, J. R., COELHO, L. P., ARUMUGAM, M., TAP, J., NIELSEN, H. B., RASMUSSEN, S., BRUNAK, S., PEDERSEN, O., GUARNER, F., DE VOS, W. M., WANG, J., LI, J., DORE, J., EHRLICH, S. D., STAMATAKIS, A. & BORK, P. 2013. Metagenomic species profiling using universal phylogenetic marker genes. Nat Methods, 10, 1196-9. TAKAGAKI, A., KATO, Y. & NANJO, F. 2014. Isolation and characterization of rat intestinal bacteria involved in biotransformation of (-)-epigallocatechin. Arch Microbiol, 196, 681-95. TAKAGAKI, A. & NANJO, F. 2015. Bioconversion of (-)-epicatechin, (+)-epicatechin, (-)-catechin, and (+)-catechin by (-)-epigallocatechin-metabolizing bacteria. Biol Pharm Bull, 38, 789-94. 
TAKAISHI, H., MATSUKI, T., NAKAZAWA, A., TAKADA, T., KADO, S., ASAHARA, T., KAMADA, N., SAKURABA, A., YAJIMA, T., HIGUCHI, H., INOUE, N., OGATA, H., IWAO, Y., NOMOTO, K., TANAKA, R. & HIBI, T. 2008. Imbalance in intestinal microflora constitution could be involved in the pathogenesis of inflammatory bowel disease. Int J Med Microbiol, 298, 463-72. TAP, J., MONDOT, S., LEVENEZ, F., PELLETIER, E., CARON, C., FURET, J. P., UGARTE, E., MUNOZ-TAMAYO, R., PASLIER, D. L., NALIN, R., DORE, J. & LECLERC, M. 2009. Towards the human intestinal microbiota phylogenetic core. Environ Microbiol, 11, 2574-84. TERAO, J. 2017. Factors modulating bioavailability of quercetin-related flavonoids and the consequences of their vascular function. Biochem Pharmacol. THOMPSON, J. D., HIGGINS, D. G. & GIBSON, T. J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22, 4673-80. THOMSEN, M., TUUKKANEN, A., DICKERHOFF, J., PALM, G. J., KRATZAT, H., SVERGUN, D. I., WEISZ, K., BORNSCHEUER, U. T. & HINRICHS, W. 2015. Structure and catalytic mechanism of the evolutionarily unique bacterial chalcone isomerase. Acta Crystallogr D Biol Crystallogr, 71, 907-17. TIEN, M. T., GIRARDIN, S. E., REGNAULT, B., LE BOURHIS, L., DILLIES, M. A., COPPEE, J. Y., BOURDET-SICARD, R., SANSONETTI, P. J. & PEDRON, T. 2006. Anti-inflammatory effect of Lactobacillus casei on Shigella-infected human intestinal epithelial cells. J Immunol, 176, 1228-37.

TRUONG, D. T., FRANZOSA, E. A., TICKLE, T. L., SCHOLZ, M., WEINGART, G., PASOLLI, E., TETT, A., HUTTENHOWER, C. & SEGATA, N. 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat Methods, 12, 902-3. TURNBAUGH, P. J., BACKHED, F., FULTON, L. & GORDON, J. I. 2008. Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell Host Microbe, 3, 213-23. TURNBAUGH, P. J., LEY, R. E., HAMADY, M., FRASER-LIGGETT, C. M., KNIGHT, R. & GORDON, J. I. 2007. The human microbiome project. Nature, 449, 804-10. TZOUNIS, X., VULEVIC, J., KUHNLE, G. G., GEORGE, T., LEONCZAK, J., GIBSON, G. R., KWIK-URIBE, C. & SPENCER, J. P. 2008. Flavanol monomer-induced changes to the human faecal microflora. Br J Nutr, 99, 782-92. UCHIMURA, Y., WYSS, M., BRUGIROUX, S., LIMENITAKIS, J. P., STECHER, B., MCCOY, K. D. & MACPHERSON, A. J. 2016. Complete Genome Sequences of 12 Species of Stable Defined Moderately Diverse Mouse Microbiota 2. Genome Announc, 4. VAISHNAVA, S., BEHRENDT, C. L., ISMAIL, A. S., ECKMANN, L. & HOOPER, L. V. 2008. Paneth cells directly sense gut commensals and maintain homeostasis at the intestinal host- microbial interface. Proc Natl Acad Sci U S A, 105, 20858-63. VAISHNAVA, S., YAMAMOTO, M., SEVERSON, K. M., RUHN, K. A., YU, X., KOREN, O., LEY, R., WAKELAND, E. K. & HOOPER, L. V. 2011. The antibacterial lectin RegIIIgamma promotes the spatial segregation of microbiota and host in the intestine. Science, 334, 255-8. VAN DER SLUIS, M., DE KONING, B. A., DE BRUIJN, A. C., VELCICH, A., MEIJERINK, J. P., VAN GOUDOEVER, J. B., BULLER, H. A., DEKKER, J., VAN SEUNINGEN, I., RENES, I. B. & EINERHAND, A. W. 2006. Muc2-deficient mice spontaneously develop colitis, indicating that MUC2 is critical for colonic protection. Gastroenterology, 131, 117-29. VAN DER WAAIJ, L. A., LIMBURG, P. C., MESANDER, G. & VAN DER WAAIJ, D. 1996. In vivo IgA coating of anaerobic bacteria in human faeces. Gut, 38, 348-54. VAN SCHAIK, W. 2015. 
The human gut resistome. Philos Trans R Soc Lond B Biol Sci, 370, 20140087. VARELA, E., MANICHANH, C., GALLART, M., TORREJON, A., BORRUEL, N., CASELLAS, F., GUARNER, F. & ANTOLIN, M. 2013. Colonisation by Faecalibacterium prausnitzii and maintenance of clinical remission in patients with ulcerative colitis. Aliment Pharmacol Ther, 38, 151-61. VINSON, J. A., SU, X., ZUBIK, L. & BOSE, P. 2001. Phenol antioxidant quantity and quality in foods: fruits. J Agric Food Chem, 49, 5315-21. VITAL, M., GAO, J., RIZZO, M., HARRISON, T. & TIEDJE, J. M. 2015. Diet is a major factor governing the fecal butyrate-producing community structure across Mammalia, Aves and Reptilia. ISME J, 9, 832-43. WANG, G., MENG, K., LUO, H., WANG, Y., HUANG, H., SHI, P., PAN, X., YANG, P. & YAO, B. 2011. Molecular cloning and characterization of a novel SGNH arylesterase from the goat rumen contents. Appl Microbiol Biotechnol, 91, 1561-70. WEHKAMP, J., HARDER, J., WEICHENTHAL, M., SCHWAB, M., SCHAFFELER, E., SCHLEE, M., HERRLINGER, K. R., STALLMACH, A., NOACK, F., FRITZ, P., SCHRODER, J. M., BEVINS, C. L., FELLERMANN, K. & STANGE, E. F. 2004. NOD2 (CARD15) mutations in Crohn's disease are associated with diminished mucosal alpha-defensin expression. Gut, 53, 1658-64. WEHKAMP, J., SCHMID, M., FELLERMANN, K. & STANGE, E. F. 2005. Defensin deficiency, intestinal microbes, and the clinical phenotypes of Crohn's disease. J Leukoc Biol, 77, 460-5.

179

WINTER, J., POPOFF, M. R., GRIMONT, P. & BOKKENHEUSER, V. D. 1991. Clostridium orbiscindens sp. nov., a human intestinal bacterium capable of cleaving the flavonoid C-ring. Int J Syst Bacteriol, 41, 355-7. WOO, P. C., TENG, J. L., WU, J. K., LEUNG, F. P., TSE, H., FUNG, A. M., LAU, S. K. & YUEN, K. Y. 2009. Guidelines for interpretation of 16S rRNA gene sequence-based results for identification of medically important aerobic Gram-positive bacteria. J Med Microbiol, 58, 1030-6. WU, G. D., CHEN, J., HOFFMANN, C., BITTINGER, K., CHEN, Y. Y., KEILBAUGH, S. A., BEWTRA, M., KNIGHTS, D., WALTERS, W. A., KNIGHT, R., SINHA, R., GILROY, E., GUPTA, K., BALDASSANO, R., NESSEL, L., LI, H., BUSHMAN, F. D. & LEWIS, J. D. 2011. Linking long-term dietary patterns with gut microbial enterotypes. Science, 334, 105- 8. WYMORE BRAND, M., WANNEMUEHLER, M. J., PHILLIPS, G. J., PROCTOR, A., OVERSTREET, A. M., JERGENS, A. E., ORCUTT, R. P. & FOX, J. G. 2015. The Altered Schaedler Flora: Continued Applications of a Defined Murine Microbial Community. ILAR J, 56, 169-78. XU, J., MAHOWALD, M. A., LEY, R. E., LOZUPONE, C. A., HAMADY, M., MARTENS, E. C., HENRISSAT, B., COUTINHO, P. M., MINX, P., LATREILLE, P., CORDUM, H., VAN BRUNT, A., KIM, K., FULTON, R. S., FULTON, L. A., CLIFTON, S. W., WILSON, R. K., KNIGHT, R. D. & GORDON, J. I. 2007. Evolution of symbiotic bacteria in the distal human intestine. PLoS Biol, 5, e156. YITBAREK, A., WEESE, J. S., ALKIE, T. N., PARKINSON, J. & SHARIF, S. 2018. Influenza A virus subtype H9N2 infection disrupts the composition of intestinal microbiota of chickens. FEMS Microbiol Ecol, 94. YU, Z. & MORRISON, M. 2004. Improved extraction of PCR-quality community DNA from digesta and fecal samples. Biotechniques, 36, 808-12. ZHANG, H., SHAO, M., HUANG, H., WANG, S., MA, L., WANG, H., HU, L., WEI, K. & ZHU, R. 2018a. The Dynamic Distribution of Small-Tail Han Sheep Microbiota across Different Intestinal Segments. Front Microbiol, 9, 32. 
ZHANG, H., YOHE, T., HUANG, L., ENTWISTLE, S., WU, P., YANG, Z., BUSK, P. K., XU, Y. & YIN, Y. 2018b. dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Research, 46, W95-W101. ZHANG, J. H., CHUNG, T. D. & OLDENBURG, K. R. 1999. A Simple Statistical Parameter for Use in Evaluation and Validation of High Throughput Screening Assays. J Biomol Screen, 4, 67-73. ZHANG, M., QIU, X., ZHANG, H., YANG, X., HONG, N., YANG, Y., CHEN, H. & YU, C. 2014. Faecalibacterium prausnitzii inhibits interleukin-17 to ameliorate colorectal colitis in rats. PLoS One, 9, e109146. ZHANG, P. Y. 2015. Polyphenols in Health and Disease. Cell Biochem Biophys, 73, 649-64. ZHAO, Y., JIA, X., YANG, J., LING, Y., ZHANG, Z., YU, J., WU, J. & XIAO, J. 2014. PanGP: a tool for quickly analyzing bacterial pan-genome profile. Bioinformatics, 30, 1297-9. ZHAO, Y., WU, J., YANG, J., SUN, S., XIAO, J. & YU, J. 2012. PGAP: pan-genomes analysis pipeline. Bioinformatics, 28, 416-8.

180

Chapter 7 Appendices

7.1 Media and Solutions used during thesis

7.1.1 M2GSC medium

The following recipe is for 1 litre of M2GSC medium. Rumen fluid is first centrifuged twice at 23,500 x g for 10 min, and the top layer of fluid is removed and placed into a Schott bottle. The remaining ingredients are then added, except for the cysteine HCl, and the solution is made up to 1 litre with distilled or MilliQ water. The medium is microwaved for 4 minutes to remove oxygen and then gassed with CO2 for 45 min, using foil to seal the bottle during gassing. Cysteine HCl is then added and the medium is taken quickly to a COY vinyl anaerobic chamber, where 10 ml of medium is dispensed into each Hungate tube. The Hungate tubes are sealed with a butyl stopper and plastic screw cap lid. The medium is sterilised by autoclaving for 15 min at 121 °C and 172 kPa, and cooled to room temperature once sterile.

Per 1 litre
Rumen fluid (2x spun)          300 ml
Mineral solution 2             75 ml
Mineral solution 3             75 ml
NZ-amine                       10 g
Yeast extract                  2.5 g
Cellobiose                     2 g
Soluble starch                 2 g
Sodium bicarbonate (NaHCO3)    8 g
Cysteine HCl                   1 g
Resazurin                      1 ml

7.1.2 Reinforced Clostridial medium (RCM)

The following recipe is for 1 litre of RCM. All powdered ingredients (except cysteine HCl) and mineral solutions 2 and 3 are added to a Schott bottle. The volume is made up to 1 litre with distilled water and the medium is microwaved for 4 minutes to remove oxygen. The medium is then gassed with CO2 for 45 min, using foil to seal the bottle. The cysteine HCl is quickly added to the medium, and the bottle is sealed with a lid and taken directly into an anaerobic chamber. Then, 10 ml aliquots are decanted into Hungate tubes and sealed using butyl rubber stoppers and plastic screw cap lids. The medium is autoclaved as described above and allowed to cool to room temperature.

Per 1 litre
Yeast extract          3 g
Lab-Lemco powder       10 g
Glucose                5 g
Starch                 1 g
Sodium chloride        5 g
Sodium acetate         3 g
NaHCO3                 8 g
Mineral solution 2     75 ml
Mineral solution 3     75 ml
Cysteine HCl           1 g
Resazurin              1 ml

7.1.3 Brain Heart Infusion Medium (BHI)

The following recipe is for 1 litre of BHI medium. Powdered BHI and sodium bicarbonate are added to a Schott bottle. Mineral solutions 2 and 3 are then added and the volume is made up to 1 litre using distilled or MilliQ water. The medium is microwaved for 4 minutes to remove oxygen and gassed with CO2 for 45 min to remove the remaining O2. Cysteine HCl is then added to the medium, and the bottle is sealed and taken promptly into an anaerobic chamber. Finally, 10 ml aliquots are decanted into Hungate tubes and sealed using butyl rubber stoppers and plastic screw cap lids. The medium is autoclaved as described above and allowed to cool to room temperature prior to use.

Per 1 litre
BHI powder             37 g
NaHCO3                 8 g
Mineral solution 2     75 ml
Mineral solution 3     75 ml
Cysteine HCl           1 g
Resazurin              1 ml

7.1.4 Mineral solutions

Mineral 3 solution

• KH2PO4 6 g/L
• (NH4)2SO4 6 g/L
• NaCl 12 g/L
• MgSO4.7H2O 2.5 g/L
• CaCl2.2H2O 1.6 g/L

Mineral 2 solution

• K2HPO4 6 g/L

7.1.5 Resazurin Stock, Ringers solution and anaerobic glycerol stocks

Resazurin

Resazurin stock solution was made up as follows: 100 mg in 100 ml MilliQ H2O, giving a 1000x (0.1%) stock solution.

Ringers Solution

                       Per litre    Per 250 ml
Mineral Solution 2     38 ml        9.5 ml
Mineral Solution 3     38 ml        9.5 ml
NaHCO3*                5.6 g        1.4 g
Resazurin              1 ml         0.25 ml
L-cysteine-HCl         1.0 g        0.25 g
H2O to                 1 L          250 ml

Boil in a microwave for 3 minutes, then bubble with CO2 for 60 min. Adjust the pH to 7 with NaOH, then add the L-cysteine-HCl.

Aliquot into 125 ml bottles in the CO2 chamber, then seal and autoclave.

*Sodium bicarbonate
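The 250 ml column of the Ringers recipe is a linear scaling of the per-litre column. The following is a minimal Python sketch of that scaling, assuming quantities scale in direct proportion to the final volume; the function name `scale_recipe` and the dictionary layout are illustrative and not part of the original protocol.

```python
from fractions import Fraction

# Per-litre quantities copied from the Ringers solution recipe above.
RINGERS_PER_LITRE = {
    "Mineral Solution 2 (ml)": Fraction("38"),
    "Mineral Solution 3 (ml)": Fraction("38"),
    "NaHCO3 (g)": Fraction("5.6"),
    "Resazurin (ml)": Fraction("1"),
    "L-cysteine-HCl (g)": Fraction("1.0"),
}


def scale_recipe(per_litre, final_volume_ml):
    """Scale per-litre quantities linearly to the requested final volume."""
    factor = Fraction(final_volume_ml) / 1000
    return {name: float(qty * factor) for name, qty in per_litre.items()}


batch = scale_recipe(RINGERS_PER_LITRE, 250)
for name, qty in batch.items():
    print(f"{name}: {qty:g}")
```

Exact fractions are used for the intermediate arithmetic so that the scaled values reproduce the tabulated 250 ml quantities without floating-point drift.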

Anaerobic Glycerol

Ingredients                                For 1 L
Salt solution #2                           38 ml
Salt solution #3                           38 ml
Sodium bicarbonate                         1 g
Resazurin solution (0.1%)                  1 ml
Glycerol ‘AR’ grade                        300 ml
Water                                      To 1 litre
L-cysteine HCl (add last, after gassing)   0.5 g

Method:

1) Measure liquids and sodium bicarbonate into a flask. Microwave to bring to the boil, then transfer to a magnetic stirrer and allow to cool while bubbling with N2.

2) Add dry ingredients (in order). pH should be ~6.5.

3) Add L-cysteine.

4) Move to the N2 anaerobic chamber and check the pH with a pH strip before dispensing. pH should be ~6.5 to 7.

5) Dispense 3 mL into anaerobic bottles and any extra into larger serum bottles (~100 mL per bottle).

6) Seal with blue Balch stoppers and then remove from the anaerobic chamber.

7) Crimp seals (with tear-away tops).

8) Autoclave at 121 °C for 15 min.

7.2 Primers used during thesis

Table 7.1 Primers used during this thesis

Primer      Nucleotide sequence (5’-3’)          Amplicon (bp)  Annealing temp (°C)  Reference

16S rRNA gene, bacteria specific
27-F        AGAGTTTGATCMTGGCTCAG                 1465           60                   Frank, 2008
1492-R      TACGGYTACCTTGTTACGACTT

pEHR vector (resistance gene and OriT modules)
ermB-F      GATCTACGCAGATAAATAAATACG             1400           60                   Ó Cuív, 2015
oriT-R      CCTCAATCGCTCTTCGTTCG
catP-F      AGTGGGCAAGTTGAAAAATTCAC

16S rRNA gene, bacteria specific (used for sequencing)
530-F       GTGCCAGCMGCCGCGG                     N/A            N/A                  Amann, 1995
907-R       CCGTCAATTCMTTTRAGTTT                                                     Weisburg, 1991
926-R       AAACTYAAAKGAATTGACGG                                                     Turner, 1999

chi gene amplification for cloning
fchi-BspHI  GATCTCATGACCGTGGAATTTCGTCCCATGCGC    850            60                   This study
fchi-XhoI   GATCCTCGAGACGCATGGTGATGTAGCCGCGGA
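Several primers in Table 7.1 contain IUPAC degenerate bases (M, Y, R, K), so each listed sequence stands for a small pool of oligonucleotides. The sketch below, which is an illustration rather than part of the thesis methods, expands a degenerate primer into its explicit variants and computes an IUPAC-aware reverse complement. The helper names are hypothetical. As listed in the table, the 926-R sequence is the reverse complement of 907-R, which the last line checks.

```python
from itertools import product

# IUPAC nucleotide codes: the bases each symbol represents.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}
# Complement of each IUPAC symbol (e.g. M = A/C pairs with K = T/G).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "N": "N",
}


def reverse_complement(seq):
    """Return the reverse complement, preserving degenerate symbols."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


def expand_degenerate(seq):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC_BASES[base] for base in seq.upper())
    return ["".join(variant) for variant in product(*choices)]


# Sequences copied from Table 7.1.
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_907R = "CCGTCAATTCMTTTRAGTTT"
PRIMER_926R = "AAACTYAAAKGAATTGACGG"

print(expand_degenerate(PRIMER_27F))
print(len(expand_degenerate(PRIMER_907R)))
print(reverse_complement(PRIMER_907R) == PRIMER_926R)
```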


7.3 16S rRNA sequencing sample preparation

For sequencing the 16S rRNA gene of bacterial isolates, the BigDye™ Terminator v1.1 cycle sequencing kit was used (reagents supplied with kit, Thermo Fisher Scientific, Australia). Prior to starting the BigDye™ sequencing reaction, PCR-amplified 16S rRNA gene products were purified of excess nucleotides and reagents. The 5X CS Buffer used in the following reactions was made up as follows:

5X CS Buffer
Tris (pH 9.0)    400 mM
MgCl2            10 mM

ExoCIP PCR clean-up

Component                            Volume per reaction (µl)
DNA (from 27F/1492R)                 1
5X CS Buffer                         1
Calf Intestinal Phosphatase (CIP)    0.5
Exonuclease 1 (Exo)                  0.5

The reactions were then subjected to the following thermal cycling conditions:

• 37 °C for 20 min
• 80 °C for 20 min
• Hold at 23 °C

BigDye™ sequencing reactions

The purified PCR products were then prepared for Sanger sequencing as per the manufacturer's instructions. Briefly, BigDye™ reactions were set up as follows:

Component                Volume per reaction (µl)
BigDye™                  0.5
5X CS Buffer             3.75
Sequencing primer        0.3
Sterile distilled H2O    14.45
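When many isolates are sequenced at once, the per-reaction volumes above are usually combined into a master mix. The sketch below shows one way to calculate this. The 10% pipetting overage and the function name `master_mix` are assumptions for illustration and are not stated in the protocol.

```python
from decimal import Decimal

# Per-reaction volumes (µl) copied from the BigDye reaction set-up above.
BIGDYE_MIX_UL = {
    "BigDye": Decimal("0.5"),
    "5X CS Buffer": Decimal("3.75"),
    "Sequencing primer": Decimal("0.3"),
    "Sterile distilled H2O": Decimal("14.45"),
}


def master_mix(per_reaction_ul, n_reactions, overage=Decimal("0.10")):
    """Total volume of each component for n reactions plus an assumed overage."""
    factor = Decimal(n_reactions) * (1 + overage)
    return {name: vol * factor for name, vol in per_reaction_ul.items()}


mix = master_mix(BIGDYE_MIX_UL, n_reactions=8)
for name, vol in mix.items():
    print(f"{name}: {vol:.2f} µl")

# Sum of the listed components only; the template volume is not given above.
print(f"Listed components per reaction: {sum(BIGDYE_MIX_UL.values())} µl")
```

`Decimal` is used so that volumes such as 14.45 µl multiply exactly, which keeps the printed totals free of binary rounding artefacts.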

The BigDye™ reactions were then subjected to the following thermal cycling conditions:

• 96 °C for 1 min
• 60 cycles of 96 °C for 10 sec, 50 °C for 5 sec and 60 °C for 4 min
• 40 °C for 4 min
• Hold at 12 °C
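For planning instrument time, the programmed hold times of the cycling conditions above can be summed. This is a rough sketch that ignores ramp times between temperatures and the final hold, so the real run will take somewhat longer; the variable and function names are illustrative.

```python
# (temperature in °C, hold time in seconds), copied from the conditions above.
INITIAL_DENATURATION = (96, 60)
CYCLE_STEPS = [(96, 10), (50, 5), (60, 240)]
N_CYCLES = 60
FINAL_STEP = (40, 240)


def programmed_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum programmed hold times; ramping and the final hold are excluded."""
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


total_s = programmed_hold_seconds(
    INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_STEP
)
hours, remainder = divmod(total_s, 3600)
print(f"{total_s} s = {hours} h {remainder // 60} min (excluding ramp times)")
```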

Sequencing reactions were then cleaned using the following protocol:

Agencourt AMPure XP sequencing clean up

1. Add 20 µl of AMPure beads to the PCR product and pipette up and down to mix.
2. Incubate at RT for 5 min.
3. Place onto a magnetic plate for 2 min to separate the beads from the supernatant.
4. Remove the supernatant.
5. Add 200 µl of 85% ethanol, incubate for 30 sec and discard the supernatant.
6. Repeat the ethanol wash once more.
7. Remove all ethanol from the wells and leave to dry for 10 min.
8. Resuspend the beads in 50 µl of distilled water and incubate at RT for 5 min.
9. Place onto the magnetic plate for 2 min.
10. Transfer 20 µl to a sequencing plate and seal the plate.
11. Keep on ice under foil to take to the sequencer (Lawrie).

http://nextgen.mgh.harvard.edu/attachments/AMPureXPProtocol_000387v001.pdf

7.4 Full gel image of transconjugant isolate PCR confirmation

Figure 7.1 The full gel for the confirmation of the conjugative transfer of pEHR512112 to faecal bacteria by metaparental mating.


7.5 Preparation of Bacterial High Molecular Weight DNA

The following protocol was modified for a larger scale cell preparation as described in Chapters 3 and 4 of this thesis:

Quantities for 2 mL of starting culture with an OD of 2

1. Microbial cells were removed from the spent medium by centrifugation in a microfuge at > 13,000 rpm for 5 minutes.
2. The supernatant was removed and the cell pellet retained.
3. The cell pellet was resuspended in 500 µL of RBB+C lysis buffer.
4. Samples were incubated at 80 °C for 10 minutes to inactivate nucleases.
5. Samples were cooled to room temperature.
6. Once cooled, 10 µL of lysozyme (200 mg/mL stock) and 1 µL of mutanolysin (20 U/µL stock) were added to the cell suspension and mixed gently.
   a. For difficult-to-lyse cells, 5 µL of achromopeptidase (200 U/µL stock) was added.
7. The sample was then incubated at 37 °C for 1 hour.
8. Following incubation, 20 µL of 10% sodium laurylsarcosine (SLS) and 5 µL of proteinase K (20 mg/mL) were added and the sample was incubated at 55 °C for 30 minutes.
9. The sample was extracted by gentle inversion with 400 µL of phenol:chloroform (collected from below the protective upper layer) and then centrifuged at > 13,000 rpm for 5 minutes.
10. The aqueous phase was carefully removed to a clean microfuge tube.
11. Steps 9 and 10 were repeated.
12. A 1/10 volume (~40 µL) of 3 M sodium acetate (pH 5.2) was added.
13. One volume (~400 µL) of isopropanol was then added, and the sample was gently mixed and incubated on ice for 30 minutes. The DNA was visible as a coiled thread in the solution.
14. The sample was centrifuged at > 13,000 rpm for 5 minutes and the supernatant removed.
15. The DNA pellet was washed by addition of 500 µL of 70% ethanol followed by centrifugation at > 13,000 rpm for 5 minutes.
16. The DNA pellet was allowed to air dry (or dried in a SpeedVac for 3 minutes on high) and then resuspended in 50-100 µL of TE buffer.
17. The sample was treated with 5 µL of RNase (10 mg/ml) and incubated at 37 °C.
18. The quality of the DNA extraction was checked on a 0.7% agarose gel run at 80 volts for 90 min.

Use a 1:10 dilution of stock lambda HindIII ladder (4 µL added; top band = 47 kb).
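The protocol above specifies microfuge speeds in rpm, whereas the media recipes in section 7.1 quote centrifugal force (x g). The two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, with r the rotor radius in cm. The sketch below applies it. The 7 cm radius is a hypothetical value chosen for illustration, since the rotor used is not stated in the protocol.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Hypothetical rotor radius; check the rotor manual for the real value.
ASSUMED_RADIUS_CM = 7.0

print(f"13,000 rpm is about {rcf_from_rpm(13000, ASSUMED_RADIUS_CM):.0f} x g")
print(f"23,500 x g needs about {rpm_from_rcf(23500, ASSUMED_RADIUS_CM):.0f} rpm")
```

Reporting the force in x g alongside the rpm makes a protocol reproducible on microfuges with different rotors.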


RBB+C lysis buffer:

• 500 mM NaCl
• 50 mM Tris-HCl, pH 8.0
• 50 mM EDTA
• 4% Sodium Dodecyl Sulfate (SDS)

TE Buffer:

• 10 mM Tris-HCl
• 1 mM EDTA

Figure 7.2 Representative gels to assess the quality of the genomic DNA extracted from AHG0008 (A) and AHG0014 (B). In both panels, lane 1 shows the Lambda DNA/HindIII marker purchased from Thermo Fisher Scientific, Australia, lane 2 shows 5 µl of the gDNA preparation from AHG0008 (A) or AHG0014 (B), and lane 3 shows 5 µl of the TE buffer used to resuspend the gDNA.

Table 7.2 Concentration of gDNA for strains AHG0008 and AHG0014, calculated using the Quantus™ Fluorometer and QuantiFluor® dsDNA system

7.6 Growth curves for strains AHG0008 and AHG0014 in RCM medium

Figure 7.3 Phenotypic growth analysis of Ffr. sp. AHG0014 and Pfr. sp. AHG0008 in RCM medium (panels A and B). Both isolates grow slowly, taking 52 hrs to reach their maximum optical density.

Assignment of KEGG Ortholog (KO) functional categories to the draft genome of AHG0014 showed that 6.9% of the genes were assigned to carbohydrate metabolism. This was unexpected, as the F. plautii type strain was reported to be asaccharolytic (REF). Interestingly, when all of the Ffr. spp. genomes were analysed using dbCAN, each was shown to possess a substantial number of genes relating to carbohydrate metabolism (149-183 genes) (Table 1.3). Although the presence of these genes was unexpected given the original description of F. plautii as an asaccharolytic organism, it remains to be seen whether they are actively used by members of this species.

KO functional categories                        Percentage (%)
Genetic information and processing              18.0
Cellular processes                              10.9
Amino acid metabolism                           8.0
Carbohydrate metabolism                         6.9
Energy metabolism                               4.1
Metabolism of cofactors and vitamins            4.0
Nucleotide metabolism                           3.3
Environmental information processing            17.9
Enzyme families                                 3.1
Lipid metabolism                                2.2
Glycan biosynthesis and metabolism              1.8
Metabolism of other amino acids                 1.4
Xenobiotics biodegradation and metabolism       1.3
Biosynthesis of other secondary metabolites     0.9
Metabolism of terpenoids and polyketides        0.9
Organismal systems                              0.9
Human Diseases                                  3.2
Unclassified                                    11.1

Figure 7.4 The KEGG Ortholog (KO) functional categories of the draft genome of AHG0014. The KEGG database was used to assign KO descriptions to the core genes in the draft genome.


7.7 Plasmid extraction and colony confirmation of cloned vector

[Gel images, panels A and B. Lanes are labelled 1 Kb+ ladder, pET28b(+) and clones #1 to #6; the plasmid samples in panel B are marked as digested.]

Figure 7.5 Extraction of plasmid and restriction enzyme digests from colonies grown following transformation with the cloned pET28b(+)CHI vector. A) A size shift in plasmids extracted from E. coli DH5α colonies that grew on kanamycin selective plates following transformation with the cloned vector, compared to the pET28b(+) vector alone extracted from the original E. coli JM109 strain. B) The plasmids from the 6 colonies picked in A following digestion with the restriction enzymes XhoI and XbaI. For clones #1 to #5 there is a small band at ~800 bp following digestion, whereas the digested pET28b(+) vector alone does not have a band at ~800 bp.

7.8 Autoinduction media and solutions

The following solutions and quantities are used to make 100 ml of autoinduction medium:

ZYM (basal medium)

Tryptone 1 g

Yeast Extract 0.5 g

dH2O 98 ml

50x 5052 solution (1x 5052; 0.5% glycerol, 0.05% glucose and 0.2% α-lactose)

Glycerol 25%

Glucose 2.5 g

α-lactose monohydrate 10 g

Top-up to 100 ml in a volumetric flask

Once prepared, store at 4 °C

1 M MgSO4

MgSO4-7H2O 24.65 g

Diluted in 87 ml of dH2O

50X M Solution

Na2HPO4 17.75 g

KH2PO4 17.00 g

NH4Cl 13.40 g

Na2SO4 3.55 g

Initially dissolve in 80 ml of sterile distilled water, then top up to 100 ml in a volumetric flask.

The pH of the 50-fold solution should be ~6.7.
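As a cross-check on the 50X M recipe, the listed masses can be converted to molar concentrations. The sketch below assumes the salts are anhydrous, as their formulae are written above, and uses standard molar masses; the names are illustrative.

```python
# Standard molar masses (g/mol) for the anhydrous salts (assumption: anhydrous).
MOLAR_MASS = {
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
    "NH4Cl": 53.49,
    "Na2SO4": 142.04,
}
# Masses per 100 ml copied from the 50X M recipe above.
GRAMS_PER_100_ML = {
    "Na2HPO4": 17.75,
    "KH2PO4": 17.00,
    "NH4Cl": 13.40,
    "Na2SO4": 3.55,
}


def molarity(grams, molar_mass, volume_l):
    """Molar concentration (mol/L) of a solute dissolved to a final volume."""
    return grams / molar_mass / volume_l


stock = {
    salt: molarity(grams, MOLAR_MASS[salt], 0.1)
    for salt, grams in GRAMS_PER_100_ML.items()
}
for salt, conc in stock.items():
    # Dividing the 50X stock by 50 gives the 1X working concentration.
    print(f"{salt}: {conc:.2f} M at 50X, {conc / 50 * 1000:.0f} mM at 1X")
```

Under that assumption the stock works out at roughly 1.25 M Na2HPO4, 1.25 M KH2PO4, 2.5 M NH4Cl and 0.25 M Na2SO4.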

1000x Trace Metal solutions


Trace metal solutions to be prepared as individual solutions then added in to the final solution which is to be filter sterilised prior to addition to ZYM medium:

0.1M FeCl3 in ~0.12 M HCl

1M CaCl2-H2O

1M MnCl2-4H2O

1M ZnSO4-7H2O

0.2M CoCl2

0.1M CuCl2-2H2O

0.2M NiCl2

0.1M Na2MoO4

0.1M Na2SeO3

0.1M H3BO3

Solution will be 1X when added to final ZYM medium
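The fold-concentrated stocks in this section are diluted to 1X in the final medium. A minimal sketch of that dilution arithmetic (C1V1 = C2V2) for the 100 ml batch described above is given below. The function name is illustrative, and the volume of 1 M MgSO4 is not calculated because its final concentration is not stated here.

```python
def stock_volume_ml(stock_fold, final_volume_ml, final_fold=1.0):
    """Volume of a fold-concentrated stock needed for a given final volume."""
    return final_volume_ml * final_fold / stock_fold


FINAL_VOLUME_ML = 100
STOCKS = {"50x 5052": 50, "50X M": 50, "1000x trace metals": 1000}

volumes = {
    name: stock_volume_ml(fold, FINAL_VOLUME_ML) for name, fold in STOCKS.items()
}
for name, vol in volumes.items():
    print(f"{name}: {vol:g} ml per {FINAL_VOLUME_ML} ml of medium")
```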
