IDENTIFICATION OF VIRULENCE DETERMINANTS OF MYCOBACTERIUM TUBERCULOSIS VIA GENETIC COMPARISONS OF A VIRULENT AND AN ATTENUATED STRAIN OF MYCOBACTERIUM TUBERCULOSIS.

by

ALICE HOY LAM LI

B.Sc., The University of British Columbia, 2001

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES

(Pathology)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

MARCH 2008

© Alice Hoy Lam Li, 2008

ABSTRACT

Candidate virulence genes were sought through the genetic analyses of two strains of

Mycobacterium tuberculosis, one virulent, H37Rv, one attenuated, H37Ra. Because both strains derive from the

same parent, H37, genomic differences between them were first examined via two-dimensional

DNA technologies: two-dimensional bacterial genome display, and bacterial comparative genomic hybridisation. The two-dimensional technologies were optimised for mycobacterial use, but failed to yield reproducible genomic differences between the two strains. Expression

differences between strains during their infection of murine bone-marrow-derived macrophages

were then assessed using Bacterial Artificial Chromosome Fingerprint Arrays. This technique

successfully identified expression differences between intracellular M. tuberculosis H37Ra and

H37Rv, and six candidate genes were confirmed via quantitative real-time PCR for their

differential expression at 168 hours post-infection. Genes identified to be upregulated in the

attenuated H37Ra were frdB, frdC, and frdD. Genes upregulated in the virulent H37Rv were

pks2, aceE, and Rv1571. Further qPCR analysis of these genes at 4 and 96h post-infection

revealed that the frd operon (encoding the fumarate reductase complex, FRD) was expressed at higher levels in the virulent H37Rv at earlier time points while the expression of

aceE and pks2 was higher in the virulent strain throughout the course of infection. Assessment of

frd transcripts in oxygen-limited cultures of M. tuberculosis H37Ra and H37Rv showed that the

attenuated strain displayed a lag in frdA and frdB expression at the onset of culture when

compared to microaerophilic cultures of H37Rv and aerated cultures of H37Ra. Furthermore,

inhibition of the fumarate reductase complex in intracellular bacteria resulted in a significant

reduction of intracellular growth. Microarray technology was also applied in the expression

analysis of intracellular bacteria at 168h post-infection. Forty-eight genes were revealed to be

differentially expressed between the H37Ra and H37Rv strains, and a subset was further analysed via qPCR to confirm and validate the microarray data. phoP was expressed at a lower level in the attenuated M. tuberculosis H37Ra, whereas members of the phoPR regulon were up-regulated in the virulent H37Rv. Additionally, a group of genes (Rv3616c-Rv3613c) that may associate with the region of difference 1 was also up-regulated in the virulent H37Rv.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
ACKNOWLEDGEMENTS

CHAPTER 1: Introduction
1.1 Background
1.2 Mycobacterium tuberculosis
    1.2.1 Genomic organisation in M. tuberculosis
    1.2.2 Lipid metabolism
    1.2.3 PE and PPE gene family
1.3 Intracellular life for M. tuberculosis
    1.3.1 Encountering the host
    1.3.2 Adaptations of M. tuberculosis to host defences
1.4 Bacterial model of interest: M. tuberculosis H37Ra and H37Rv
1.5 Difference analyses of M. tuberculosis H37Ra and H37Rv
    1.5.1 Genomic analyses of M. tuberculosis H37Ra versus H37Rv
    1.5.2 Expression analyses examining differences between M. tuberculosis H37Ra and H37Rv
1.6 Project goals
    1.6.1 Hypothesis
    1.6.2 Objective and specific aims of project

CHAPTER 2: Materials and methods
2.1 Growth of mycobacteria
    2.1.1 Aerobic growth of Mycobacterium tuberculosis H37Ra and H37Rv
    2.1.2 Growth of Mycobacterium tuberculosis H37Ra and H37Rv for fumarate reductase studies
2.2 Isolation of nucleic acids from broth culture
    2.2.1 Isolation of genomic DNA from axenic broth cultures of Mycobacterium tuberculosis
    2.2.2 Isolation of RNA from axenic broth cultures of Mycobacterium tuberculosis
    2.2.3 Purification of mycobacterial RNA
2.3 Derivation, culture, and infection of murine bone marrow-derived macrophages (BM-MΦ)
    2.3.1 Derivation and culture of BM-MΦ
    2.3.2 Infection of BM-MΦ with Mycobacterium tuberculosis H37Ra and H37Rv
    2.3.3 Extraction of RNA from macrophage-associated Mycobacterium tuberculosis
2.4 Two-dimensional DNA displays
    2.4.1 Generation of two-dimensional bacterial genome displays (2DBGD)
    2.4.2 Generation of bacterial comparative genomic hybridisation (BCGH) profiles
    2.4.3 Generation and hybridisation of DIG-labelled probes with BCGH profiles
2.5 Bacterial artificial chromosome fingerprint arrays (BACFAs)
    2.5.1 Growth and harvest of bacterial artificial chromosome (BAC) DNA
    2.5.2 Generation of BACFAs
    2.5.3 cDNA synthesis and DIG-labelling of cDNA probes for BACFA analysis
    2.5.4 Hybridisation of BACFAs
    2.5.5 Analysis of BACFAs
    2.5.6 Quantitative real-time PCR analysis of expression differences observed with BACFA
    2.5.7 Extraction of whole cell lysate for fumarate reductase enzyme assessments in cultures mimicking microaerophilic conditions
    2.5.8 Fumarate reductase enzyme assay
    2.5.9 Western blot detection of fumarate reductase in whole-cell lysates of M. tuberculosis
    2.5.10 Evaluation of mercaptopyridine-N-oxide (MPNO) effects on BM-MΦ and on growth of intracellular M. tuberculosis
    2.5.11 Statistical analysis
2.6 DNA microarrays
    2.6.1 Genomic comparisons using microarray technology
    2.6.2 Application of microarray technology for expression analysis
    2.6.3 Analysis of microarray hybridisations
    2.6.4 qPCR analysis of selected candidate genes identified via microarray expression analysis
    2.6.5 Comparing overlap between datasets and generation of Venn diagrams

CHAPTER 3: Two-dimensional DNA analysis
3.1 Introduction
3.2 Rationale
3.3 2DDNA technologies for use with M. tuberculosis
    3.3.1 Two-dimensional bacterial genome display (2DBGD)
    3.3.2 Bacterial comparative genomic hybridisation (BCGH)
3.4 Discussion and summary
3.5 Future directions

CHAPTER 4: Bacterial artificial chromosome fingerprint arrays
4.1 Introduction
4.2 Rationale
4.3 Growth of Mycobacterium tuberculosis in association with bone marrow-derived macrophages
4.4 Optimisation of bacterial artificial chromosome fingerprint array methodology in expression profiling
    4.4.1 Reliability of BACFA technique
    4.4.2 Generation of DIG-labelled probes for use with BACFAs
4.5 Results
    4.5.1 Differences isolated via BACFA include genes reported in expression studies as well as novel differences previously unreported
    4.5.2 Quantitative real-time PCR (qPCR) confirmation of candidates selected after BACFA analysis
    4.5.3 Assessment of candidate gene expression profiles in broth cultures and at 4h and 96h post-infection
    4.5.4 Growth of M. tuberculosis under oxygen-limiting conditions
    4.5.5 Proteomic analysis of the fumarate reductase enzyme complex in cell lysates derived from M. tuberculosis grown under oxygen-limiting conditions
    4.5.6 qPCR analysis of frd transcripts in M. tuberculosis grown under oxygen-limiting conditions
    4.5.7 Growth of M. tuberculosis in association with macrophages treated with mercaptopyridine-N-oxide (MPNO), an inhibitor of fumarate reductase
4.6 Discussion and summary
4.7 Future directions

CHAPTER 5: Microarray-based expression profiling
5.1 Introduction
5.2 Rationale
5.3 Optimisation of microarray expression studies
5.4 Results
    5.4.1 Genomic comparisons
    5.4.2 Expression profile comparisons of broth-grown cultures
    5.4.3 Expression profile comparisons of intracellular Mycobacterium tuberculosis
    5.4.4 qPCR confirmation of microarray data
    5.4.5 Correlation of microarray data to previous expression studies
5.5 Discussion and summary
5.6 Future directions

CHAPTER 6: Final summary and conclusion

REFERENCES

Appendix I: Publications and contributions
Appendix II: Genes upregulated in broth-grown M. tuberculosis H37Ra versus H37Rv
Appendix III: Genes downregulated in broth-grown M. tuberculosis H37Ra versus H37Rv
Appendix IV: Genes upregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv
Appendix V: Genes upregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra
Appendix VI: Genes downregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv
Appendix VII: Genes downregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra

LIST OF TABLES

Table 4.1 Candidate genes identified via BACFA as being differentially expressed in intracellular M. tuberculosis H37Ra and H37Rv
Table 4.2 Primers used in qPCR confirmation of candidate genes
Table 5.1 Genomic differences revealed in microarray comparisons of genomic DNA from M. tuberculosis H37Ra and H37Rv
Table 5.2 Expression differences between strains grown in broth: genes upregulated in broth-grown M. tuberculosis H37Ra versus H37Rv
Table 5.3 Expression differences between strains grown in broth: genes downregulated in broth-grown M. tuberculosis H37Ra versus H37Rv
Table 5.4 Genes upregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv
Table 5.5 Genes upregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra
Table 5.6 Genes upregulated in intracellular M. tuberculosis versus broth-grown bacteria
Table 5.7 Genes downregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv
Table 5.8 Genes downregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra
Table 5.9 Genes induced in intracellular M. tuberculosis H37Ra compared to intracellular H37Rv
Table 5.10 Genes repressed in intracellular M. tuberculosis H37Ra compared to intracellular H37Rv
Table 5.11 Genes mutually induced in intracellular M. tuberculosis
Table 5.12 Genes induced in intracellular M. tuberculosis H37Rv and their requirement for survival within the macrophage

LIST OF FIGURES

Figure 3.1 Growth of M. tuberculosis H37Ra and H37Rv in enriched broth
Figure 3.2 Size range of resolution with 2DBGD
Figure 3.3 Optimisation of 2DBGD for use with M. tuberculosis
Figure 3.4 Comparison of 2DBGD profiles for M. tuberculosis H37Ra and H37Rv using HinfI-digested fragments
Figure 3.5 BCGH array generated after optimisation
Figure 3.6 Efficiency test of labelled DNA probes to be used with BCGH comparisons
Figure 4.1 Generation of BAC fingerprint arrays
Figure 4.2 Growth of Mycobacterium tuberculosis H37Ra and H37Rv in Proskauer and Beck liquid broth
Figure 4.3 Growth of Mycobacterium tuberculosis H37Ra and H37Rv in association with murine bone marrow-derived macrophages
Figure 4.4 Exclusivity of RNA harvested from intracellular mycobacteria
Figure 4.5 Genomic analysis using BACFA
Figure 4.6 Acrylamide gel electrophoresis of PCR reactions using uniprime and various amounts of template DNA
Figure 4.7 Bands of interest seen in BACFA comparisons may represent several kb
Figure 4.8 Quantitative real-time PCR assessment of the expression profiles of selected candidate genes at 168h post-infection
Figure 4.9 Quantitative real-time PCR analysis of frdA, frdB, frdC, frdD, pks2, Rv1571, and aceE identified via BACFA
Figure 4.10 Growth of M. tuberculosis H37Ra and H37Rv under limited oxygen conditions
Figure 4.11 Western blot detecting FRD-A, FRD-B in cell lysates of E. coli and M. tuberculosis
Figure 4.12 Quantitative real-time PCR analysis of genes encoding fumarate reductase (frdA, frdB, frdC, and frdD) in oxygen-limited and aerated broth cultures of M. tuberculosis H37Ra and H37Rv
Figure 4.13 Effect of mercaptopyridine-N-oxide (MPNO) on the growth of intracellular Mycobacterium tuberculosis
Figure 5.1 Genomic comparisons of M. tuberculosis H37Ra and H37Rv via microarray analysis reveal only the RvD2 region of difference
Figure 5.2 Comparisons of gene expression in intracellular M. tuberculosis show a subset of genes mutually induced by virulent and attenuated strains in an intracellular environment
Figure 5.3 Catabolism of fatty acids via β-oxidation and subsequent glyoxylate cycle metabolism
Figure 5.4 qPCR confirmation of microarray trends of genes differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv
Figure 5.5 qPCR assessment of genes differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv
Figure 5.6 qPCR confirmation of genes that were not differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv
Figure 5.7 Comparisons of our expression study with previous studies examining the intracellular profile of M. tuberculosis

ABBREVIATIONS

TB tuberculosis
BCG Bacille Calmette-Guérin
MDR multidrug-resistant
XDR extensively drug-resistant
ORF open reading frame
IS insertion sequence
RFLP restriction fragment length polymorphism
PFGE pulsed-field gel electrophoresis
PDIM phthiocerol dimycocerosate
PE proline-glutamic acid
PPE proline-proline-glutamic acid
PGRS polymorphic GC-rich sequence
IFN-γ interferon-gamma
TNF-α tumour necrosis factor alpha
ROI reactive oxygen intermediates
RNI reactive nitrogen intermediates
CR complement receptor
LAMP lysosome-associated membrane glycoprotein
ManLAM mannosylated lipoarabinomannan
EEA1 early endosome autoantigen 1
RD1 region of difference 1
ESAT-6 6 kDa early secreted antigenic target
rpm revolutions per minute
GM-CSF granulocyte-macrophage colony-stimulating factor
MOI multiplicity of infection
p.i. post-infection
CFU colony-forming units
BM-MΦ bone marrow-derived macrophages
2DDE two-dimensional DNA electrophoresis
2DBGD two-dimensional bacterial genome display
DIG digoxigenin
GTC guanidinium isothiocyanate
SCOTS selective capture of transcribed sequences
BCGH bacterial comparative genomic hybridisation
BAC bacterial artificial chromosome
BACFA bacterial artificial chromosome fingerprint array
BLAST basic local alignment search tool
qPCR quantitative real-time polymerase chain reaction
TCA tricarboxylic acid
FRD fumarate reductase
SDH succinate dehydrogenase
PDC pyruvate dehydrogenase complex
MPNO mercaptopyridine-N-oxide
TDM trehalose 6,6'-dimycolate
PAT polyacyltrehalose
DAT 2,3-di-O-acyltrehalose
SL sulpholipid
TMM trehalose monomycolate
TraSH transposon site hybridisation
PEPCK phosphoenolpyruvate carboxykinase
OAA oxaloacetate
PEP phosphoenolpyruvate

ACKNOWLEDGEMENTS

I would like to thank Chad Malloff and Dr. Edith Dullaghan for discussion and guidance with

regard to two-dimensional DNA technology. The M. tuberculosis H37Rv BAC library was a

generous gift from Drs. Stewart Cole and Roland Brosch at the Pasteur Institute, Paris, France.

RNA isolation protocols were obtained from Dr. Philip Butcher. The FRD-A and FRD-B

antibodies were obtained in a generous aliquot from Dr. Joel Weiner and Gillian McCuaig at the

University of Alberta. I would also like to thank Dr. KG Papavinasasundaram for his valuable input on mycobacterial genetics, and Tyler Hickey and Kirk Bergstrom for their input with regard to protein assays and detection. I would like to thank Dr. Robert Hancock and Manjeet

Bains for allowing us to use their facilities for the microarray study; Dr. Karsten Hokamp for

setting up Arraypipe for use with mycobacterial data, and Drs. Jason Hinds and Simon Waddell

for their help in analysing the microarray data. Lastly, I would also like to thank my family and

friends for their unfailing support.

CHAPTER 1: Introduction

1.1 Background

The genus Mycobacterium comprises bacteria of high G+C genomic content that possess an unusually complex, lipid-rich cellular envelope. Mycobacteria are typically characterised as Gram-positive rods, but because of the lipid-rich outer envelope, Gram staining is often inaccurate as a diagnostic. A more reliable alternative exploits the ability of mycobacteria to retain carbol fuchsin after an acid wash; mycobacteria are therefore also characterised as acid-fast.

This genus is composed mainly of saprophytic bacteria that are usually not of pathological concern. Pathogenic members of the genus are infamous for their virulence, causing leprosy (Mycobacterium leprae), Buruli ulcer (Mycobacterium ulcerans), and tuberculosis (Mycobacterium tuberculosis). As tools for analysing mycobacterial genomes have become more accessible, the species that cause tuberculosis in mammals and share more than 99.9% DNA sequence similarity have been grouped into the M. tuberculosis complex. These include M. tuberculosis, M. bovis, the vaccine strain M. bovis BCG, M. africanum, M. canettii, and M. microti (40).

The intracellular pathogen Mycobacterium tuberculosis is the causative agent of tuberculosis, the leading cause of death from infection by a single bacterial species. It has been estimated that one-third of the global population is infected with M. tuberculosis, and that three deaths occur every minute (46). Owing to a combination of lax preventative measures and increasingly mobile populations, a resurgence of TB has been seen in North America and other industrialised nations (90). The increasing prevalence of HIV/AIDS has also promoted this resurgence, as immunocompromised patients are 500 times more likely to suffer infection and reactivation of disease (200). Additionally, emerging strains of multidrug-resistant (MDR) and extensively drug-resistant (XDR) M. tuberculosis threaten to make this disease incurable (64, 90, 129).

Bacille Calmette-Guérin (BCG) vaccination, which uses the attenuated M. bovis BCG strain, and/or chemotherapy have been used to control the spread of tuberculosis. In children, the BCG vaccine has been observed to offer excellent protection against serious tubercular diseases such as TB meningitis and miliary TB (218). Unfortunately, protection offered by BCG vaccination appears to wane over time, and the vaccine has been found to have a wide range of efficacy (2-80%) in adults, casting doubt on its ability to protect an older individual from the disease (1, 30, 218). Nonetheless, BCG still offers protection and remains the gold standard, as few of the novel candidate vaccines have exceeded BCG in efficacy in animal models.

In the case of chemotherapeutics, the four front-line drugs used to treat TB are isoniazid, pyrazinamide, ethambutol, and rifampin, the last of which was introduced in 1966 (26). Since then, a few others have been introduced, but they are not in widespread use because of cost and toxicity to children, pregnant women and/or HIV-infected persons (26). With the current regimen of chemotherapy, patients are traditionally subjected to a six- to eighteen-month course of drugs, depending on the bacterial strain (62). Because the drugs occasionally have harsh side-effects, and because the remote locations of some patients complicate monitoring, compliance with medication regimens can be problematic. Failure of patients to adhere to the treatment plan can contribute to the emergence of MDR and XDR M. tuberculosis (62, 90, 129).

In light of these problems of resurgence and increasingly ineffective avenues of control and

treatment, there is a pressing need for new approaches to prevent and treat tuberculosis.

1.2 Mycobacterium tuberculosis

Mycobacterium tuberculosis was first isolated and demonstrated to be the aetiological

agent of tuberculosis by Robert Koch in 1882 (104). A slow-growing member of the genus

Mycobacterium that doubles every 12-24 hours, M. tuberculosis has proven difficult to assess on

a genetic basis. In the past two decades, however, dramatic improvements in molecular

biological techniques have greatly advanced genetic assessments of M. tuberculosis, leading to

the sequencing of the M. tuberculosis H37Rv genome in 1998 (32). Genomic sequence

information has allowed the comparative genomics of the M. tuberculosis complex and other

mycobacteria resulting not only in a discourse on the evolution of the tubercle bacterium, but

also elaboration of the minimal gene set required for a strictly intracellular parasite such as M.

leprae (33). Since the original M. tuberculosis H37Rv sequence was published, numerous

mycobacteria have been sequenced, the latest being H37Ra in 2007 (226).

1.2.1 Genomic organisation in M. tuberculosis

Mycobacterium tuberculosis has a genome of 65.6% G+C content and a size of 4.411 Mbp (32). During sequencing and annotation of the M. tuberculosis genome, approximately 4,402 open reading frames (ORFs) were found within the genomic sequence. ORFs are stretches of sequence, usually identified during genome sequencing and annotation, whose codons could potentially encode polypeptides or proteins. Genes are sequences of nucleotides that encode functional proteins; short ORFs may exist between genes, but rarely correspond to genes themselves. Long ORFs, on the other hand, can indicate the presence of a gene in the surrounding sequence. Thus far, 3927 genes have been annotated within the M. tuberculosis genome (24, 32). These nearly 4000 genes fall into eleven functional categories, ranging from intermediary metabolism and respiration, to lipid metabolism, to conserved hypothetical proteins (24, 32). Fully 48% of the genes encode unknown or conserved hypothetical proteins, and 52% have been ascribed a precise or putative function (24). Further analysis indicates that 51% of the genes likely arose through gene duplication or domain shuffling events, whereas 3-4% of the genome is composed of insertion sequences (IS) and prophages (24, 31). IS elements have been implicated in genome plasticity in mycobacteria (55, 228). For example, IS6110 has been described as a mediating factor in a major genomic difference, RvD2, identified between H37Ra and H37Rv (19, 110, 111).
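The ORF definition and the G+C figure given above can be made concrete with a small sketch. The Python below is a toy illustration only, not the annotation procedure used for the H37Rv genome: the function names, the min_codons threshold, and the example sequence are assumptions introduced here. It scans only the forward strand and treats ATG as the sole start codon, whereas real annotation also considers the reverse strand, alternative start codons such as GTG, and codon usage.

```python
# Toy illustration: G+C content and forward-strand ORF scanning.
# All names and thresholds are hypothetical, chosen for this example.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 if seq is empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    'end' is exclusive and includes the stop codon. 'min_codons' counts
    the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == START:
                # Walk codon by codon until a stop codon or the sequence end.
                j = i
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                stop_found = j + 3 <= len(seq)
                if stop_found and (j - i) // 3 >= min_codons:
                    orfs.append((i, j + 3, frame))
                    i = j + 3  # resume scanning after the stop codon
                    continue
            i += 3
    return orfs


if __name__ == "__main__":
    example = "CCATGAAACCCGGGTAGTT"  # made-up sequence, not from M. tuberculosis
    print(find_orfs(example))  # [(2, 17, 2)]
    print(gc_content("ATGC"))  # 0.5
```

Raising min_codons mirrors the point made in the text: short ORFs occur frequently by chance and rarely correspond to genes, so a length threshold is a crude first filter before any further evidence is considered.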

1.2.2 Lipid metabolism

Fully 10% of the M. tuberculosis genome is dedicated to lipid metabolism, with genes involved in both biosynthetic and lipolytic pathways (24, 40). The requirement for genes involved in lipid biosynthesis seems intuitive given that the outer envelope contains an array of lipids, glycolipids, lipoglycans, and polyketides (37). As the cellular envelope is an interface between the bacterium and the host, an understanding of the composition and metabolism of these lipids would be valuable in the study of mycobacterial pathogenesis. Genes involved in lipid metabolism that have been implicated in pathogenesis include a 50 kb cluster of genes involved in the synthesis and transport of phthiocerol dimycocerosate (PDIM), a cell wall-associated lipid found only in pathogenic mycobacteria (23, 36). Additionally, a fused gene (pks1/15), which is required for production of a phenolic glycolipid associated with a decrease in the production of pro-inflammatory cytokines from host immune cells, was found mainly in the Beijing family of hypervirulent strains of M. tuberculosis (35). Due to a 7-bp deletion, pks1 and pks15 are separate pseudogenes in M. tuberculosis H37Rv as well as in another laboratory strain, M. tuberculosis CDC1551. It has been suggested that this phenolic glycolipid may be the reason for the hypervirulent phenotype of strains in the Beijing family (169).

The presence of enzymes involved in lipid oxidation correlates well with previous findings, in which estimated concentrations of substrates available to intracellular pathogens showed an increased availability of lipids and sterols versus carbohydrates (232). Furthermore, mycobacteria recovered from infected tissues have been observed to have the ability to degrade exogenous lipids (232). Multiple copies of genes involved in the various steps of β-oxidation are present in H37Rv: 36 fadD genes which encode acyl-CoA synthases, 36 fadE genes encoding acyl-CoA dehydrogenase, and 21 echA genes that encode enoyl-CoA hydratase/isomerase (32). The numerous copies suggest redundancy and/or the ability of the bacterium to metabolise different structures of fatty acids. However, a recent study has implied novel roles for a subset of the fadD genes (217). Through genetic and subsequent biochemical analysis, Trivedi et al. found a cluster of 12 fadD genes (fadD21, 23-26, 28-32, and fadD34) which were linked to proximal polyketide synthase (pks) genes (217). These fadD genes, it was found, in fact encode fatty acyl-AMP ligases, which convert long-chain fatty acids into acyl-adenylates (217).

1.2.3 PE and PPE gene family

Approximately 10% of the mycobacterial genome is further dedicated to the PE and PPE gene families (24, 32, 40). Proteins encoded by these families of genes typically share a common N-terminus with the motifs Proline-Glutamic acid (PE) or Proline-Proline-Glutamic acid (PPE) at positions 8-9 or 8-10 (24, 32, 40). The C-termini, however, are not conserved and often vary in length and sequence. One of the more common C-terminal motifs found in PE proteins is of the Polymorphic GC-Rich Sequence (PGRS) class, where the motif Asn-Gly-Gly-Ala-Gly-Gly-Ala is frequently found (24, 32, 40). Specific functions of the PE and PPE proteins have been difficult to elucidate, as the redundancy of the genes complicates knock-out studies. However, recent developments have characterised PE_PGRS proteins as cell wall-associated or surface-exposed (5, 15). Further studies have suggested roles for PE_PGRS proteins in interfering with host immunity, including inhibition of antigen presentation (32) and of the host immune response more broadly (16, 42), and even in mediating cell-to-cell adhesion with the host (53, 196). It has been suggested that PE and PPE family proteins could have roles in antigenic variation, a possibility bolstered by genetic characterisations showing that these genes have significantly higher genetic variability than the M. tuberculosis genome as a whole (32).

1.3 Intracellular life for M. tuberculosis

Mycobacterium tuberculosis is a bacterium that can be spread via aerosols generated by

infected individuals when they sneeze, cough, or even speak. Inhaled organisms are deposited

into alveoli where it is supposed that the bacterium encounters and is ingested by alveolar

macrophages (58, 97, 165). Here, the bacteria may multiply and induce host cells to release

cytokines and lymphokines which will recruit monocytes and macrophages to the site of

infection, resulting in the formation of a microscopic lesion (58, 97, 165). The mycobacterial growth is slowed as macrophages are activated by Interferon-γ (IFN-γ) secreted by CD4+ T lymphocytes allowing the macrophages to kill their intracellular cargo (57). Alternatively, infected macrophages can also be killed by CD8+ T lymphocytes (165). Slowly, the infection is cordoned off through the formation of a granuloma composed of macrophages, T and B lymphocytes, fibroblasts, and giant cells. Granulomas are not sterilising structures, and bacteria can survive extracellularly in the centre of the lesions, albeit in a latent or dormant phase (165).

Over time, the granuloma may heal, it may enlarge and shed bacteria into the blood stream and

lymphatic system, or it may liquefy and form a cavity. This last scenario typifies secondary TB

where the liquefaction of the cavity results in uninhibited bacterial growth into the bronchial airways, which facilitates aerosolised transmission of the bacterium (58, 97, 165).

1.3.1 Encountering the host

For non-pathogenic bacteria, the encounter with the host initiates inside the phagosome and usually ends with their degradation inside the phagolysosome (28, 44, 48). The phagosome matures along the endosome-lysosome pathway acquiring and losing markers characteristic of the different stages of maturation (28, 44). With the maturation process, acidification of the vacuolar contents occurs with the delivery of membrane-bound proton pumps to the early and late endosomes (28, 44). Early endosomes typically acidify to pH 6 and late endosomes acidify down to pH 5.5 (28). This acidification is considered an important event as it allows for the activation of proteolytic enzymes (e.g. ) that will degrade the contents of the phagolysosome (48). In addition to the proteolytic enzymes, macrophages activated by IFN-γ also produce a respiratory burst resulting in the generation of reactive oxygen intermediates

(ROIs, e.g. hydrogen peroxide and superoxide radicals) and reactive nitrogen intermediates (RNIs, e.g. peroxynitrite) that inflict oxidative damage on DNA, amino acids, and even lipids (28, 57, 97).

On encountering mycobacteria, the host will typically mount a strong cell-mediated immune response that includes the elicitation of CD4+ and CD8+ T lymphocytes as well as a mycobacteria-specific antibody response (48, 57, 165). IFN-γ secretion by the T cells activates macrophages, spurring a response with ROIs and RNIs (57, 97). Additionally, activated macrophages will secrete tumour necrosis factor alpha (TNF-α), which promotes not only the formation and stabilisation of granulomas but also apoptosis (programmed cell death), which has been demonstrated to benefit the host in controlling mycobacterial growth during infection (98, 99, 199).

1.3.2 Adaptations of M. tuberculosis to host defences

One of the hallmarks of Mycobacterium tuberculosis infection is the arrest of phagolysosome maturation (3, 20). D’Arcy Hart and colleagues were the first to describe and characterise, via microscopy, the failure of M. tuberculosis-containing phagosomes to fuse with lysosomes (3, 20). Subsequently, it was shown that mycobacteria-containing compartments fuse selectively with vesicles containing endosomal markers such as lysosome-associated membrane glycoprotein (LAMP) and cathepsin D, but not vacuolar proton pumps (210). It was suggested that this exclusion may explain the observation that mycobacteria-containing phagosomes do not acidify below pH 6.3-6.5. Selective fusion was further demonstrated when experiments with M. tuberculosis-infected human monocyte-derived macrophages showed functionally active transferrin receptors able to accept delivery of iron-loaded transferrin (29). This observation demonstrated that mycobacteria-harbouring vacuoles are stabilised structures that maintain a recycling relationship with the endosomal machinery of the host cell (209).

In addition to halting phagolysosome maturation, M. tuberculosis has also been observed

to use multiple receptors including complement receptor (CR) 1 and 3 for entry into the host cell

without triggering the oxidative burst (180, 186, 235). M. tuberculosis has also been described

to produce catalase (KatG) and superoxide dismutase (SOD) enzymes which have been observed

to degrade reactive oxygen intermediates (ROIs) (91, 92, 136). Additionally, mannosylated lipoarabinomannan (ManLAM), a cell wall associated glycolipid which is found in M.

tuberculosis (both H37Ra and H37Rv have ManLAM), has been found to attenuate expression of

TNF-α and interleukin (IL)-12 in mononuclear phagocytes. IL-12 is a cytokine secreted by

macrophages and B lymphocytes involved in the differentiation of naïve T cells into Th1 cells,

and has been observed to stimulate the secretion of IFN-γ and TNF-α, all of which have been

shown to be important in the control of M. tuberculosis (81, 145, 208).

ManLAM has also been observed to interfere with the recruitment of early endosome

autoantigen 1 (EEA1), a molecule involved in endosome-endosome fusion, thus shedding light

on another mechanism of halting phagolysosome fusion (61). Other mechanisms used by M.

tuberculosis to modulate the host response include the down-regulation of antigen presentation

molecules (e.g. MHC class II) (149), IFN-γ-mediated activation of macrophages (216), and host

cell apoptosis (82, 198, 222).

1.4 Bacterial model of interest: M. tuberculosis H37Ra and H37Rv

Bacterial strains commonly used for mycobacterial studies include the well established

laboratory strains M. tuberculosis H37Rv and H37Ra, the attenuated vaccine strain M. bovis

BCG, and more recently established laboratory strains derived from clinical isolates, such as M. tuberculosis Erdman, CDC1551, and members of the hypervirulent Beijing family. The term

“strains” as used in this thesis denotes a group of genetically identical organisms that have a documented history of disease and/or utilisation in the laboratory. With the exception of mutant strains of mycobacteria, all strains were originally clinical isolates derived from patient samples.

Genomic differences between strains have been characterised; however, for the purpose of this project we sought to utilise two strains derived from a common parental strain to minimise strain differences that may complicate analyses. The bacterial strains used for the research reported in this thesis were H37Rv (virulent) and H37Ra (attenuated). Both were derived from the parental strain H37 (150, 205). Mycobacterium tuberculosis H37 was isolated from the sputum of a patient suffering from chronic pulmonary tuberculosis in 1906. For most

of the following two decades H37 became a widely used laboratory strain of M. tuberculosis and

was shown to be highly virulent for guinea pigs and only moderately so for rabbits – a

characteristic described as marking the bacterium as human in type (203). However, starting in

1922 and through to 1926, reports surfaced of waning virulence of the bacillus (203). It was

speculated that a change in a particular component of the growth medium may have deprived the

virulent bacterium of essential nutrients such that dissociation into an attenuated variant was

possible (203, 205). Through further studies, it was demonstrated that changes in pH readily

allowed for the dissociation of variants and in 1927, a stable dissociation of the attenuated

H37Ra strain was accomplished (150).

However, it should be emphasized that H37Ra is not completely avirulent, only attenuated in its virulence compared to H37Rv. In common animal models for TB such as mice and guinea pigs, H37Ra does establish an infection, and does multiply within these hosts (34, 95, 112, 121,

148, 204). For mice, both H37Ra and H37Rv show growth inside the host in the first 20 days prior to the control of infection by adaptive immunity. H37Rv-infected mice do eventually

succumb to disease whereas H37Ra-infected mice appear to resolve their infections slowly (34,

95, 112, 121, 148, 204). The same trend is observed in guinea pigs where both H37Ra and

H37Rv can infect and grow, but while H37Rv-infected animals succumb sixty to ninety days after infection due to disseminated disease, H37Ra-infected animals are alive and

well for years after infection (204). Furthermore, persistence of H37Ra in the guinea pigs

allowed for the study of caseous granuloma pathology inside host lungs (204).

1.5 Difference analyses of M. tuberculosis H37Ra and H37Rv

With the publication of the genomic sequence of M. tuberculosis H37Rv, microarray

technology, which relies on having a sequenced reference strain, has become widely accessible

although admittedly not commonplace in all laboratories. Nonetheless, a boom in studies of gene expression

of Mycobacterium tuberculosis under different environmental stressors (e.g. oxygen deprivation,

antibiotic treatments, and nutrient starvation) has been observed (4, 8, 13, 22, 43, 224, 233).

Accounts of intracellular expression of M. tuberculosis have also been published, although the majority of these studies have compared expression profiles of one bacterial strain grown in broth media to the same strain in an intracellular environment (116, 117, 188). Such protocols have a drawback: the vast majority of differences found may reflect only how artificial and dramatically different the two compared environments are, and thus data from these experiments should be interpreted with caution. As this thesis deals mainly with the differences between two strains of M. tuberculosis, the introduction will focus on previous studies that have compared H37Ra and H37Rv.

1.5.1 Genomic analyses of M. tuberculosis H37Ra versus H37Rv

On the assumption that the attenuated variant must necessarily exhibit alterations to the

expression of virulence genes compared to the virulent variant, many studies have been

conducted to assess differences between M. tuberculosis H37Ra and H37Rv to find

mycobacterial virulence factors. Less than 20 years ago, it was suggested that both H37Ra and

H37Rv were highly homologous at the DNA level (89) and only one difference had been

reported when the two strains were compared via restriction fragment length polymorphism

(RFLP) analysis (9). Since then, studies utilising pulsed-field gel electrophoresis (PFGE) (19),

RNA differential display (175), and microarray studies (38) have lent further insight into

genomic and expression differences between the two strains. Some genomic differences isolated

pertain to sections of sequence found in H37Ra that have no counterpart in H37Rv (18, 19, 110,

111). One such “deletion” that has been extensively studied is the RvD2 deletion that maps to a

7.9kb fragment found in H37Ra, but not H37Rv, when both genomes are digested with DraI

(19). This deletion has been attributed to the effects of the insertion element IS6110, which has also been ascribed the role of providing plasticity to the tubercle genome (201). However, genes within the RvD2 region of difference have not been found to be conclusively linked to virulence

(110, 111).

1.5.2 Expression analyses examining differences between M. tuberculosis H37Ra

and H37Rv

Many differences have been found between the two strains via expression studies including early secreted antigenic target (ESAT)-6, a potent T-cell antigen recognised by the sera of TB patients (79, 172, 173). ESAT-6 is an immunogenic molecule originally isolated in short term broth culture filtrates of Mycobacterium tuberculosis whose gene is found within RD1 (2,

32, 70). A knock-out of the region of difference (RD) 1 [which encodes ESAT-6] in H37Rv attenuated its virulence to that of the vaccine strain, M. bovis BCG (74, 79, 163, 164). With the elucidation and annotation of the M. tuberculosis genome, additional proteins were assigned to the

ESAT-6 family. In total, 23 members were ascribed to this family in M. tuberculosis H37Rv

(32).

ESAT-6 has been characterised as a secreted protein, and its mode of export is via the

ESX-1 secretion system, encoded by the RD1 region of difference (Rv3871-Rv3878), and further aided by genes upstream of RD1 (Rv3868-Rv3870) (6, 70, 214). RD1 was originally identified via subtractive hybridisation to be present in wild-type M. bovis strains, but not in the vaccine strain BCG (6, 70). It has subsequently been identified to be present in M. tuberculosis, but it appears that expression of RD1 is lower in broth-grown H37Ra than in broth-grown

H37Rv (142). ESAT-6 has been held up as a virulence factor as it appears to be an important

target for T-cell responses in mice, humans, cattle and guinea pigs (14, 51, 160, 168).

Furthermore, ESAT-6 has been assigned a cytolytic role enabling virulent mycobacteria to

spread intercellularly, thus, resulting in greater tissue damage (86). This hypothesis appears to

be gaining momentum with recent observations of decreased tissue necrosis being elicited in the

murine lung by mutants lacking the entire ESX-1 secretion system (96).

Another putative virulence factor found via RNA studies is FadD33 (172). Recently, a

knock-out mutant of fadD33 was generated in H37Rv and its growth in a murine host assessed

(171). Compared to the virulent parent, H37Rv, the mutant exhibited comparable survival ability

in both the lungs and spleen, but its growth in the liver was approximately one log unit lower

than that of the parent. This pattern of growth was similar to that shown by H37Ra.

Complementation of the mutant and H37Ra with the wild-type fadD33 gene restored growth in

the liver to levels seen with H37Rv (171). Further studies into virulence factors of M.

tuberculosis saw the use of H37Rv cosmid libraries to complement H37Ra (156). Pascopella et

al. isolated a DNA fragment, termed in vivo growth (ivg), that increased survival of H37Ra in spleens but not lungs (157). Unfortunately, no ivg knock-out mutants of H37Rv were generated, and no further studies were initiated to characterise this fragment.

Studies comparing the expression differences between broth-grown H37Ra and H37Rv and/or intracellular H37Ra/Rv with that of broth-grown bacteria have elucidated genes differentially expressed between the strains, some of which have been shown to have roles in mycobacterial virulence (65, 72, 102, 172-174, 230). For example, mycobacteria tend to aggregate into “bunched and braided groups” that are called cords (104, 137). It has been demonstrated in various animal models that cording is associated with mycobacterial virulence

(137). Gao et al. used microarray technology to examine genes involved in the cording and

non-cording phenotypes of H37Rv and H37Ra, respectively (65). Furthermore, Kinger and

Tyagi examined, via differential display, genes differentially expressed between aerobic cultures

of M. tuberculosis H37Ra and H37Rv (102). From this study, the devR/devS (also known as

dosR/dosS) two-component system was identified, and has subsequently been found to have a

role in the survival of M. tuberculosis under anoxic conditions (39, 155, 177, 183, 193). These

studies highlight the value of the H37Rv/H37Ra model in the search for genes that may have a

role in the pathogenesis of M. tuberculosis.

1.6 Project goals

Previous studies examining M. tuberculosis H37Ra and H37Rv have focussed on

genomic differences between strains using low resolution techniques. 2DDE has been used successfully to resolve point-mutations (105, 109, 194), thus, it was applied to identify minor

genomic differences between H37Ra and H37Rv. In expression-based studies, transcriptomes of

broth-grown bacteria subjected to various conditions mimicking host environments have been assessed. Expression profiles of broth-grown bacteria have also been compared with intracellular bacteria. However, our study is the first to directly compare the intracellular expression of M. tuberculosis H37Ra and H37Rv.

Gene expression does not always reflect protein expression (76, 101), and given the post-translational modifications, such as glycosylation and lipid-acylation, that have been documented for mycobacterial proteins (181, 182), gene expression may not be an accurate reflection of the resulting virulent phenotype in M. tuberculosis. However, assessments of gene expression can give insight into initial responses (e.g. gene regulation) by the bacterium to prepare for environmental challenges, particularly those encountered inside the host.

1.6.1 Hypothesis

Mycobacterium tuberculosis possesses genes that enable its survival and replication

within the host. Comparing the genomic makeup of a virulent (H37Rv) and an attenuated

(H37Ra) strain of M. tuberculosis may identify virulence genes. Furthermore, comparing gene

expression profiles of these strains in interaction with murine bone-marrow-derived macrophages

(BM-MΦ) should shed light on the genes that enable intracellular survival.

1.6.2 Objective and specific aims of project

To facilitate the synthesis of novel targets for chemotherapeutics and/or the design of

novel vaccines, we must gain greater understanding of the molecular aspects underpinning

mycobacterial virulence. I sought to analyse genomic and transcriptomic differences between two highly related strains of M. tuberculosis which exhibited different pathogenic phenotypes during interaction with the host. Specific aims of the project included:

i) Compare genomes of M. tuberculosis H37Ra and H37Rv using two-dimensional

DNA gel electrophoresis.

ii) Compare transcriptomes of M. tuberculosis H37Ra and H37Rv using two

methodologies:

1. Bacterial artificial chromosome fingerprint arrays due to limited access to

microarrays, and

2. microarray technology, when available, to assess more global transcriptomic

changes between the two strains.

CHAPTER 2: Materials and methods

2.1 Growth of mycobacteria

2.1.1 Aerobic Growth of Mycobacterium tuberculosis H37Ra and H37Rv.

Mycobacteria were grown either in an enriched medium: Middlebrook 7H9 (Difco)

supplemented with 10% OADC and 0.05% Tween-80, or a minimal medium: Proskauer and

Beck plus Tween (PB+T: per litre – KH2PO4 5.0 g, Asparagine 5.0 g, MgSO4·7H2O 0.6 g,

Magnesium Citrate 1.0 g, .0 ml, glycerol 20 mL and 5 mL of 10% Tween 80 adjusted to pH 7.4

with NaOH). To grow bacteria, frozen stocks of the respective strains were thawed and 0.5mL

of stock culture (approximately 10^8 CFU per millilitre) were used to seed 4.5mL of 7H9 liquid

broth or PB+T in a 14mL polystyrene tube. This culture was allowed to grow with intermittent

shaking (once every two days) until day 7 upon which 2.5 mL were taken to seed 22.5mL of

medium in a 50mL polypropylene conical tube. This expansion scheme of 1-in-10 dilutions was

used to sequentially expand the culture from 5mL to 25mL, 100mL, and finally 300mL cultures.

At day 21, 30mL of the 100mL culture was taken to inoculate 270mL in a 1L Nalgene tissue

culture bottle. Bottles were loosely capped and cultures were grown as aerated roller bottle (3

revolutions per minute) cultures at 37ºC. Growth assessments of the broth cultures were made by taking 2 mL of culture and assessing both the viable count (CFU per mL) and optical density of the culture.
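The 1-in-10 expansion scheme described above amounts to a simple volume split at each step. The following Python sketch is illustrative only: the function name `one_in_ten_split` is hypothetical and not part of the original protocol, and it assumes each step combines one part culture with nine parts fresh medium.

```python
def one_in_ten_split(final_volume_ml, dilution_factor=10):
    """Return (inoculum_ml, fresh_medium_ml) for a 1-in-N dilution.

    final_volume_ml is the total culture volume after seeding.
    """
    if final_volume_ml <= 0:
        raise ValueError("final_volume_ml must be positive")
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    inoculum = final_volume_ml / dilution_factor
    return inoculum, final_volume_ml - inoculum


if __name__ == "__main__":
    # Culture volumes named in the text: 5mL, 25mL, 100mL and 300mL.
    for total in (5, 25, 100, 300):
        inoculum, medium = one_in_ten_split(total)
        print(f"{total} mL culture: {inoculum:g} mL inoculum + {medium:g} mL medium")
```

For the 25mL and 300mL steps this reproduces the volumes quoted in the text (2.5mL into 22.5mL, and 30mL into 270mL).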

2.1.1.1 Preparation of OADC

OADC is used as a supplement for growth of mycobacteria in enriched media such as

Middlebrook 7H9 or 7H10 (Difco). For one litre of OADC: 50g bovine serum albumin

(Serologicals), 8.1g sodium chloride, 20g glucose, and 0.6mL oleic acid pre-dissolved in 30mL

0.12N NaOH in distilled water. Solutions were adjusted to pH 7.0, brought up to volume and filter sterilised. Prior to use, all solutions were monitored for contamination: filtered solutions were incubated at 56ºC (to spur on germination of fungal spores) and then at 37ºC (to check for the presence of bacteria) and examined for cloudiness indicating contamination. Solutions that were able

to clear two rounds of contamination monitoring were stored at 4ºC.
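Because the OADC recipe is stated per litre, smaller or larger batches require proportional scaling. The sketch below shows one way this might be done in Python. The per-litre amounts are taken from the recipe above, whereas the dictionary layout and the function name `scale_recipe` are assumptions made for illustration.

```python
# Amounts per litre of OADC, as listed in Section 2.1.1.1.
OADC_PER_LITRE = {
    "bovine serum albumin (g)": 50.0,
    "sodium chloride (g)": 8.1,
    "glucose (g)": 20.0,
    "oleic acid (mL)": 0.6,
    "0.12N NaOH to pre-dissolve oleic acid (mL)": 30.0,
}


def scale_recipe(per_litre, batch_volume_ml):
    """Scale per-litre amounts linearly to the requested batch volume."""
    if batch_volume_ml <= 0:
        raise ValueError("batch_volume_ml must be positive")
    factor = batch_volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_litre.items()}


if __name__ == "__main__":
    # Hypothetical 500 mL batch.
    for name, amount in scale_recipe(OADC_PER_LITRE, 500).items():
        print(f"{name}: {amount:g}")
```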

2.1.1.2 Growth assessments

One method of growth assessment is to monitor the optical density (OD) of a culture.

Cultures of greater density will have a greater number of cells, which will result in a greater amount of light being scattered through a particular distance. Thus, the lower the light detected,

the higher the optical density; 580nm is traditionally used to monitor culture growth. To monitor

growth of M. tuberculosis, spectrophotometers used were first zeroed with sterile medium then

one to three millilitres of culture were aliquoted into a cuvette to be read by the

spectrophotometer.

To monitor culture growth via colony forming units, 1mL of culture is sonicated for 10s

using a VC50T 50 watt ultrasonic processor and 3 mm probe (Sonics and Materials; Danbury,

CT) to disperse clumps of cells. Alternatively, the horn sonicator attachment (Sonics and

Materials; Danbury, CT) may be used in place of the probe to disperse mycobacterial clumps

with three 30s bursts. Neither treatment results in loss of bacterial viability and, in fact, appears

to increase viability due to dispersal of clumps (207). Serial dilutions up to 10^-8 were plated onto

7H10 agar plates and incubated at 37ºC for 21 days, upon which CFUs were enumerated.
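The viable count is back-calculated from the number of colonies on a countable plate, the dilution plated, and the volume plated. The following Python sketch illustrates that arithmetic. The plated volume is not stated in the text, so the 0.1mL used in the example is a hypothetical value, as is the function name `cfu_per_ml`.

```python
def cfu_per_ml(colonies, dilution_exponent, plated_volume_ml):
    """Estimate CFU per mL of the undiluted culture.

    colonies          -- colonies counted on the plate
    dilution_exponent -- n for a plate of the 10^-n dilution
    plated_volume_ml  -- volume of that dilution spread on the plate
    """
    if colonies < 0 or dilution_exponent < 0:
        raise ValueError("colonies and dilution_exponent must be non-negative")
    if plated_volume_ml <= 0:
        raise ValueError("plated_volume_ml must be positive")
    return colonies * 10 ** dilution_exponent / plated_volume_ml


if __name__ == "__main__":
    # Hypothetical plate: 42 colonies from 0.1 mL of the 10^-5 dilution.
    print(f"{cfu_per_ml(42, 5, 0.1):.2e} CFU per mL")
```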

2.1.1.3 Calculation of doubling time as a means to monitor growth

Doubling times (in hours) were calculated from viable counts collected at time 0 and time t using the following equation: doubling time (in hours) = $(t \times \log_{10} 2) \div (\log_{10} N_t - \log_{10} N_0)$, where $t$ = time elapsed in hours, $N_t$ = number of bacteria at time t, and $N_0$ = number of bacteria at

time 0 (123, 140).
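The doubling-time equation above translates directly into code. The Python sketch below is a minimal illustration. The function name and the example counts are hypothetical, and the input checks are an addition not described in the text.

```python
import math


def doubling_time_hours(n0, nt, elapsed_hours):
    """Doubling time = (t * log10(2)) / (log10(Nt) - log10(N0)).

    n0 and nt are viable counts (e.g. CFU per mL) at time 0 and time t.
    """
    if n0 <= 0 or nt <= 0:
        raise ValueError("viable counts must be positive")
    if nt <= n0:
        raise ValueError("no net growth between the two time points")
    return elapsed_hours * math.log10(2) / (math.log10(nt) - math.log10(n0))


if __name__ == "__main__":
    # Hypothetical counts: an 8-fold increase (three doublings) over 72 h.
    print(f"{doubling_time_hours(1e6, 8e6, 72):.1f} h")
```

In the hypothetical example, an eight-fold increase corresponds to three doublings, so 72 hours of growth gives a doubling time of 24 hours.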

2.1.2 Growth of Mycobacterium tuberculosis H37Ra and H37Rv for fumarate

reductase studies.

Liquid broth cultures of bacteria that had been grown from a frozen stock and expanded to

100mL were diluted 1-in-10 into fresh media (Middlebrook 7H9 or PB+T), and 10mL of the

fresh culture were aliquoted into 14mL polystyrene tubes, which were tightly capped and left

undisturbed at 37ºC. It is supposed that leaving these cultures undisturbed and without aeration

at 37ºC should slowly deprive the 10mL cultures of oxygen throughout the course of the incubation (230). The cultures grown under these conditions only reflect mycobacterial growth

under limited oxygen content, and not true non-replicating persistence at 1% oxygen content. A

model of non-replicating persistence of M. tuberculosis was previously described by Wayne in

1994 (228). Concurrently, roller bottles containing the relevant liquid media were also

inoculated with bacteria to provide a control of aerated cultures grown at 37ºC. Initial growth

assessments of the cultures were done by removing 10mL aliquots at designated time points for

viable counts and optical density readings; growth was assessed up to day 20.

2.2 Isolation of nucleic acids from broth culture

2.2.1 Isolation of genomic DNA from axenic broth cultures of Mycobacterium

tuberculosis.

Genomic DNA was extracted from the respective strains using previously published protocols by Belisle et al (7). Briefly, mycobacteria were grown to mid-log phase, as determined above via growth curve analysis of aerobic cultures, and aliquoted into 50mL Falcon tubes. The cultures were spun down and bacterial pellets were frozen at -20ºC for a minimum of four hours to aid the disruption of the mycobacterial capsule and cell wall. Pellets were subsequently resuspended in 1M Tris pH 9, and subjected to an equal volume of chloroform:methanol (2:1 v/v) extraction to remove lipids. The bacterial band was then resuspended in 500µL of TE (pH

8) and incubated overnight (14-16h) at 37ºC with 50µL of lysozyme (10mg/mL; Sigma-Aldrich).

To the lysozyme slurry, 60µL of 10% SDS and 6µL of Proteinase K

(20mg/mL, Roche Biochemicals) were added and left to incubate at 56ºC for 4h. After incubation, the suspensions were treated with phenol:chloroform:isoamyl alcohol (25:24:1) until the interphase between the aqueous and organic layers was clear. The aqueous phase was then extracted with an equal volume of chloroform to remove phenol from the extractions.

To precipitate DNA, an equal volume of isopropanol and 0.1 volumes of 3M sodium acetate (pH 5.2) were added to the aqueous phase, and left at -20ºC to precipitate overnight.

DNA pellets were recovered by centrifugation, washed once with 70% EtOH, once with 96%

EtOH, air-dried, and then dissolved in 500µl of water.

2.2.2 Isolation of RNA from axenic broth cultures of Mycobacterium tuberculosis.

RNA was extracted from the respective strains using previously published protocols by

Butcher et al (130, 188). Briefly, samples of mid-log broth culture were added to 5M

Guanidinium isothiocyanate (GTC) lysis solution at a ratio of 1:4 (v/v). This lysis buffer is able to penetrate the bacteria and freeze their transcriptomic profile, but it does not lyse them. Bacterial suspensions were centrifuged (15 minutes at 3,100 x g), and bacterial pellets were resuspended and pooled into 1mL of GTC lysis buffer (per litre: 5M guanidinium isothiocyanate, 7mL beta-mercaptoethanol [14.3M, MP Biomedicals], 3.5mL Tween-80, 0.25% sodium lauryl-sulfate, and

25mM tri-sodium citrate). Bacteria were pelleted, then resuspended in 1.2mL of RNApro solution (FastPrep Solutions, MP Biomedicals), an acid-phenol solution to extract RNA. The suspension was then pipetted into a 2ml screw-top microfuge tube containing 1mL of 0.1mm ceramic beads (FastPrep Solutions, MP Biomedicals) and inserted into the QBiogene ribolyser

(MP Biomedicals) to mechanically disrupt (Setting 6.5 for 45s) the mycobacterial cells. This

slurry was then centrifuged for 5 minutes at room temperature at 12,000 rpm (14,000 x g). The

supernatant was combined with 300µL chloroform and the aqueous phase was pipetted into

500µL 96% EtOH to precipitate the RNA. RNA pellets were left in 96% EtOH at -80ºC for at

least one hour, or for storage, prior to purification.
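Centrifugation speeds in this chapter are quoted both in rpm and as relative centrifugal force (x g). The two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, where r is the rotor radius in centimetres. The Python sketch below illustrates the conversion. The rotor radius is not given in the text, so the 8.7cm used in the example is a hypothetical value chosen only because it lands close to the quoted pairing of 12,000 rpm with 14,000 x g.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius_cm positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Speed in rpm needed to reach a given RCF with a given rotor radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius_cm positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.7  # hypothetical rotor radius, not stated in the text
    print(f"12,000 rpm is about {rcf_from_rpm(12000, radius_cm):,.0f} x g")
    print(f"14,000 x g needs about {rpm_from_rcf(14000, radius_cm):,.0f} rpm")
```

Reporting the x g value alongside rpm, as the text does, lets the protocol be reproduced on a rotor of different radius.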

2.2.3 Purification of mycobacterial RNA

RNA pellets were spun down at 15000rpm (21000 x g) for 20 minutes at 4ºC then washed

once in 75% EtOH, and once with 96% EtOH before air drying. Pellets were resuspended in

100µL of RNase-free water, and purified via the RNeasy Purification kit from Qiagen.

Contaminating genomic DNA was removed using the On-column RNase-free DNase kit from

Qiagen. Briefly, RNA was loaded onto the RNeasy column with RLT lysis buffer (Qiagen) and

96% EtOH. The column was then washed with 350µL of RW1 wash buffer (Qiagen). On-column digestion of genomic DNA was accomplished by pipetting 10µL of DNase I mixed with

70µL of RDD dilution buffer (Qiagen) onto the column. The digestion was incubated at room temperature for 15 minutes. Afterward, the column was washed again with 350µL of RW1 wash buffer, and twice with 500µL of RPE wash buffer (Qiagen) before the RNA was eluted with 30-50µL of RNase-free water. Purified RNA was stored at -80ºC in aliquots of 6-10µL for single use only.

2.3 Derivation, culture, and infection of murine bone marrow-derived

macrophages (BM-MΦ).

2.3.1 Derivation and culture of BM-MΦ

BM-MФ were obtained from the femurs, tibia, and humeri of CD-1 mice. Briefly, the bones were dissected out from euthanized CD-1 mice (seven to ten week old females), and the marrow was flushed from the bones using BM-MФ-media (RPMI containing 10% ∆FCS, 10%

L-929 cell media [see section 2.3.1.1], 2mM L-glutamine, and 1mM sodium pyruvate). The cell suspension was left to adhere in a 175cm^2 flask for three hours to deplete non-stem cells. Non-adherent cells (stem cells) were collected and cultured for seven days in BM-MФ-media at 37ºC with 5% CO2 to allow differentiation into macrophages. For growth curves, macrophages were seeded into 24-well plates with each well receiving 2.5 x 10^5 cells. For monolayers used in RNA studies, 3 x 10^7 cells were seeded into 150cm^2 tissue culture flasks. One day before infection,

BM-MФ media was replaced with cRPMI (RPMI with 10% FCS, 2mM L-glutamate, and 1mM sodium pyruvate) and the macrophages were left at 37ºC with 5% CO2 to equilibrate in cRPMI overnight.

2.3.1.1 Preparation of L-929 cell media

L-929 cell medium is added to bone-marrow medium as a source of granulocyte

macrophage colony stimulating factor (GM-CSF) to differentiate the stem cells into

macrophages. Frozen aliquots of L-929 cells were thawed, pelleted at 200x g for 5 minutes and

washed in cold cRPMI with 20% FCS. Cells were again pelleted, resuspended in 10mL of

cRPMI with 20% FCS, and left to adhere overnight at 37ºC with 5% CO2. The next day, media

was replaced to remove excess cells. On day 4, cells were detached using 0.25% trypsin-0.03%

EDTA. Detached cells were washed with cRPMI and pelleted at 200x g for 5 minutes. Cells were then resuspended in cRPMI and seeded into T150 tissue culture flasks at 5000 cells per mL.

Media was harvested when monolayers became confluent.

2.3.2 Infection of BM-MΦ with Mycobacterium tuberculosis H37Ra and H37Rv

To assess bacterial expression profiles 96h and 168h post-infection, monolayers were

incubated at a multiplicity of infection (MOI) of 10 bacteria to 1 MФ, which resulted in an

average infection rate (average CFU at 4h/ MФ per well) of 0.1 bacteria per MФ. After a four-

hour incubation with binding medium (138 mM NaCl, 8.1 mM Na2HPO4, 1.5 mM KH2PO4,

2.7 mM KCl, 0.6 mM CaCl2, 1.0 mM MgCl2 and 5.5 mM D-glucose) containing the requisite

number of bacteria, the monolayers were washed three times with pre-warmed media to rinse off

unbound bacteria, before being submerged again in cRPMI (RPMI 1640 with 10% FCS, 1% L-

glutamate, and 1mM sodium pyruvate) until they were to be processed for RNA. To assess

bacterial expression at 4 hours post-infection, monolayers were infected at an MOI of 60 bacteria

to 1 MФ, resulting in an average infection rate of 1 bacterium per MФ, which represented a similar

average bacterial content as that in MФ infected at 10 to 1 and left for 96 hours. This higher MOI

was used to allow the harvest of a sufficient amount of RNA. H37Ra and H37Rv are bound and

taken up by macrophages at the same rate (1%), if given at similar MOIs (data not shown). At

each time point, cover slips and supernatants were briefly sonicated (10s) to release and disperse

bacteria, and plated on 7H10 agar plates supplemented with 10% OADC to determine colony

forming units (CFU) per mL as per section 2.1.1.2 (180). For all macrophage experiments, three

replicate cover slips were assessed at each time point for each of three independent experiments.
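The relationship between the multiplicity of infection, the number of bacteria required, and the resulting average bacterial load per macrophage can be sketched as a simple proportionality. The Python example below is illustrative only. The function names are hypothetical, and the uptake fraction of 0.01 is taken from the approximate 1% uptake quoted above. It is not claimed to reproduce every infection rate reported in this section.

```python
def bacteria_required(n_macrophages, moi):
    """Number of bacteria to add for a given multiplicity of infection."""
    if n_macrophages < 0 or moi < 0:
        raise ValueError("inputs must be non-negative")
    return n_macrophages * moi


def bacteria_per_macrophage(moi, uptake_fraction):
    """Average bacteria associated with each macrophage after uptake."""
    if not 0 <= uptake_fraction <= 1:
        raise ValueError("uptake_fraction must be between 0 and 1")
    return moi * uptake_fraction


if __name__ == "__main__":
    # 3 x 10^7 macrophages per flask (Section 2.3.1) infected at an MOI of 10.
    print(f"{bacteria_required(3e7, 10):.1e} bacteria per flask")
    # Assumed uptake of about 1% of the bacteria offered.
    print(f"{bacteria_per_macrophage(10, 0.01):.2f} bacteria per macrophage")
```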

2.3.3 Extraction of RNA from macrophage-associated Mycobacterium tuberculosis

After the 4-hour incubation with mycobacteria, all flasks were washed three times with pre-warmed

media. Flasks marked for gene expression analyses of MФ-associated bacteria at 4

hours post-infection were processed immediately for RNA by pouring 50mL of GTC lysis buffer

directly onto the monolayer. For co-cultures that were used to examine M. tuberculosis

expression profiles at 96 and 168 hours post-infection, monolayers were overlaid with fresh cRPMI, and left at 37°C with 5% CO2. 50mL of GTC lysis buffer were added to each flask at

the designated time and rocked gently until Schlieren lines disappeared, indicating homogeneous

mixing of the solutions. For all time points, entire contents of flasks were poured into 50mL

conical tubes, and bacteria were collected at 3700rpm (3,100 x g) for 15 minutes. Pellets were

resuspended and pooled with 1ml of GTC lysis buffer. RNA was extracted from pellets and then

purified as described in Section 2.2.3.

2.4 Two-dimensional DNA displays

2.4.1 Generation of two-dimensional bacterial genome displays (2DBGD)

One microgram of genomic DNA harvested at mid-log growth phase of the respective

strains was digested overnight at 37ºC with 20U of HinfI restriction digest enzyme (NEB

Laboratories). The digested fragments were then dephosphorylated with 20U of calf intestinal

alkaline phosphatase (Invitrogen) for 30 minutes at 37ºC. The fragments were then purified via a Gel Elution column (Qiagen) and eluted into a 20µL volume of 10mM Tris-HCl

(pH 7.5). To this, 10 µCi (370kBq) of [γ-32P]ATP (6000 Ci mmol^-1; Amersham Pharmacia

Biotech) were used to radiolabel the DNA fragments using 10U of T4 kinase (New England

Biolabs). This reaction was placed at 37ºC for 1h.

The radiolabelled DNA fragments were separated by size in the first dimension via a 5%

non-denaturing TAE acrylamide gel run in 2D TAE electrophoresis buffer (40mM Tris, 20mM

sodium acetate, 1mM Na2EDTA, 0.2% v/v glacial acetic acid, pH 7.4) for 1400 volt-hours.

Following the run, gel lanes were cut as precisely as possible as anecdotal evidence suggests that broader strips may induce diffusion of DNA fragments causing blurring of 2D images. These strips were then overlaid onto the second dimension denaturing acrylamide gel. This 6% acrylamide gel was able to separate DNA fragments according to G+C content because it contained an ascending gradient of formamide (10-40% v/v) and urea (1.8-7M) within the gel.

This secondary separation was accomplished using the ISO-DALT electrophoresis system

(Amersham Pharmacia Biotech) in 2D electrophoresis buffer for 1700 volt-hours at a constant temperature of 68.5ºC. Gels to be compared were run in the same tank and corresponding positions to minimise the observation of differences that may arise between electrophoretic runs.

Following successful completion of the secondary run, gels were washed with electrophoresis buffer for 2h to rinse away residual formamide and urea from the gels before drying. Dried gels were exposed to X-ray film (Kodak) for 10-16h to visualise the 2D profiles.

2.4.2 Generation of bacterial comparative genomic hybridisation (BCGH) profiles

Genomic fragments of M. tuberculosis H37Ra and H37Rv were digested and labelled as

described in Section 2.4.1, with one exception: only 5µCi of [γ-32P]ATP were used to generate

BCGH profiles. This reduction in radiolabel was used to allow the user to visualise the run upon completion, as well as to minimise the time required for the radiolabel to decay. It was important to wait until the radiolabel had decayed before proceeding to hybridisation with DIG-labelled probes, as failure to do so would not allow the user to distinguish signals originating from the blot from those of the probes. To generate BCGH profiles, labelled gDNA from both strains was run together in the same lane for the first dimension and was thus used to generate a single 2D-DNA profile as per Section 2.4.1. Following 2D separation of the digested fragments, the gels were washed for 2 hours in 2D-TAE. Southern blots were generated by electroblotting the 2D gels to nylon membranes using the DALT blotting kit (Amersham

Pharmacia Biotech) in the ISO-DALT tank for 2h at 31V. Following completion of the runs, both nylon membranes and dried gels were exposed to film to monitor both the success of fragment separation as well as the transfer of fragments from gel to membrane.
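The benefit of halving the radiolabel can be illustrated with the exponential decay law. The Python sketch below assumes a 32P half-life of approximately 14.3 days, a commonly cited value that is not stated in the text. The residual-activity threshold of 0.1µCi is hypothetical, since the text does not specify how much decay was considered sufficient.

```python
import math

# Approximate half-life of phosphorus-32 in days (assumed, not from the text).
P32_HALF_LIFE_DAYS = 14.3


def remaining_activity(initial_uci, days):
    """Activity (in µCi) remaining after the given number of days."""
    if initial_uci < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return initial_uci * 0.5 ** (days / P32_HALF_LIFE_DAYS)


def days_to_reach(initial_uci, target_uci):
    """Days needed for the activity to fall from initial_uci to target_uci."""
    if initial_uci <= 0 or target_uci <= 0:
        raise ValueError("activities must be positive")
    if target_uci > initial_uci:
        raise ValueError("target_uci must not exceed initial_uci")
    return P32_HALF_LIFE_DAYS * math.log2(initial_uci / target_uci)


if __name__ == "__main__":
    threshold_uci = 0.1  # hypothetical residual activity
    for start in (10, 5):
        wait = days_to_reach(start, threshold_uci)
        print(f"{start} µCi falls to {threshold_uci} µCi in about {wait:.0f} days")
```

Whatever threshold is chosen, starting with 5µCi rather than 10µCi shortens the wait by exactly one half-life.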

2.4.3 Generation and hybridisation of DIG-labelled probes with BCGH profiles

2.4.3.1 Generation of DIG-probes

Three-hundred nanograms of total DNA were labelled with digoxigenin as per

manufacturer’s instructions via the DIG High Prime DNA Labelling and Detection Starter Kit II

(Roche Biochemicals). Briefly, 300ng of DNA in a final volume of 16µL were heat denatured

and quickly chilled in an ice bath. Four microlitres of mixed DIG-High Prime (Roche

Biochemicals) were added to the DNA and the reaction was incubated at 37ºC for 16-20h. After

the incubation, the reaction was stopped by heating the sample to 65ºC for 10 minutes.

2.4.3.2 Determination of DIG-labelling efficiency

Prior to using the labelled probes for hybridisation, yields of labelled probes needed to be assessed to determine if the labelling reactions were indeed successful. As well, it remained to be determined if the probe could be used at specified concentrations. Detection of labelling efficiency was done according to instructions supplied by the manufacturer. Briefly, DNA probe product and DIG-labelled DNA control (Roche Biochemicals) were diluted to a concentration of

1ng/µL from which serial dilutions were made, and spotted onto nylon membranes. The nucleic acids were fixed onto the membrane via UV cross-linking. The membrane was then washed in wash buffer (0.1M maleic acid, 0.15M NaCl; pH 7.5, 0.3% Tween-20), incubated for 30 minutes with blocking buffer (0.1M maleic acid, 0.15M NaCl; pH 7.5, 10% w/v Blocking Reagent), and then incubated for 30 minutes with an anti-DIG-dUTP antibody (1:10,000). The membrane was then washed twice (2 x 15 minutes) with wash buffer, then equilibrated in detection buffer (0.1M

Tris, 0.1M NaCl, pH 9.5) for 5 minutes. The membrane was then covered with CSPD, a chemiluminescent substrate for alkaline phosphatase (conjugated to the anti-DIG antibody), and incubated at room temperature for 5 minutes. Subsequently, excess fluid was squeezed out, and the membrane was further incubated at 37ºC for 10 minutes. The membrane was then exposed to X-ray film to visualise the efficiency of labelling.

2.4.3.3 Hybridisation of Southern blots with DIG-labelled probes

Prior to hybridisation, the Southern blot was equilibrated at the hybridisation temperature with hybridisation buffer (DIG Easy Hyb, Roche Biochemicals) that did not contain any probe.

This equilibration was done at 50ºC for 30 minutes. While waiting for the membrane to equilibrate, DIG-labelled probes were denatured in a boiling water bath for 5 minutes then quickly cooled in an ice-bath. The probes were then added to pre-heated hybridisation buffer,

and slowly poured onto the membrane. The hybridisation then proceeded for 16-18h at 50ºC.

Following hybridisation, DIG-labelled probes were immunologically detected following the manufacturer’s instructions as described in Section 2.4.3.2. Hybridisation was visualised by exposing the blot to X-ray film (2 minutes up to 2h).

2.5 Bacterial artificial chromosome fingerprint arrays (BACFAs)

2.5.1 Growth and harvest of bacterial artificial chromosome (BAC) DNA

A BAC library of M. tuberculosis H37Rv containing 78 BAC clones was obtained from the Stewart Cole Laboratory at the Pasteur Institute in Paris, France. These BACs had been used in genome mapping of M. tuberculosis H37Rv, as well as in comparative genomics of various mycobacterial strains. To harvest BAC DNA to generate BACFAs, Escherichia coli carrying the relevant BACs were grown overnight in 5mL of 2YT Broth (per litre: 16g BactoTryptone, 10g yeast extract, 5g NaCl, pH 7.2) containing 12.5µg/mL of chloramphenicol (MP Biochemicals).

Cultures were pelleted at 5000 rpm (4,500 x g) for 5 minutes at 4ºC. BACs were extracted from pellets via the commercially available PhasePrep BAC Extraction kit from Sigma-Aldrich.

Alternatively, BACs could be extracted via previously published protocols (18, 19). Briefly: bacterial pellets were suspended in 4mL of solution A (50 mM glucose, 10 mM EDTA, 25 mM

Tris [pH 8.0]). To this, 4mL of solution B (0.2 M NaOH, 0.2% sodium dodecyl sulphate) were added to lyse bacteria for 5 minutes at room temperature. Four millilitres of ice-cold solution C

(3 M sodium acetate [pH 4.7]) were added to neutralise the reaction, and bacterial cell debris were spun down (11,000 rpm or 11,500 x g for 15 minutes). Pooled supernatants were precipitated with isopropanol, and the DNA pellets were dissolved in 600µl of RNase solution

(15 mM Tris HCl [pH 8.0], 10µg of RNase A per ml). The mixtures were incubated for 30 min at

37°C and then were extracted with chloroform-isoamyl alcohol (24:1) and precipitated from the

aqueous phase with isopropanol. Pellets were washed with 70% EtOH, air-dried, and then

dissolved in 30µL of water.

2.5.2 Generation of BACFAs

Prior to the generation of BACFAs, an open source digestion programme freely available on the TIGR website (http://cmr.tigr.org/tigr-scripts/CMR/shared/MakeFrontPages.cgi?page=restriction_digest) was used to assess the applicability of enzyme combinations. For this study, the first set of enzymes applied comprised PvuII and StuI. For preliminary BACFA assessment of differential expression between H37Ra and H37Rv, all seventy-eight BACs were double-digested with 10U each of PvuII and StuI (New England Biolabs) overnight at 37ºC, and the products were used to generate BACFAs, which were subsequently hybridised to three different pools of DIG-labelled cDNA to screen for expression differences. After preliminary screening, some BACs were digested with StuI and SalI; here, the BACs of interest

were first digested with SalI overnight at 37ºC, the products cleaned with a Qiagen PCR

purification column (Qiagen) and eluted with 20µL water. Eluted volumes were then digested

overnight at 37ºC with 10U of StuI. Digestion products were run on a 1.5% TAE-agarose gel for

2 hours at 80V. Southern transfer of fragments to nylon membrane was done as follows: the agarose gel was depurinated in 250mM HCl for 15 minutes. The DNA in the gel was then

subjected to a denaturation (0.5M NaOH, 1.5M NaCl) step with two 15 minute washes at room

temperature. Subsequently, the gel was washed twice (2 x 15 minutes) in neutralisation solution

(0.5M Tris-HCl, pH 7.5, 1.5M NaCl), and then left to equilibrate in 20X SSC (3M NaCl, 0.3M

sodium citrate, pH 7). To transfer DNA fragments to nylon membrane, a piece of

chromatography-grade filter paper soaked in 20X SSC was placed on top of a glass plate sitting

astride a dish filled with approximately one inch of 20X SSC. The filter paper was extended into

the reservoir of 20X SSC. Next, the gel was placed atop the filter paper, and the dry nylon membrane was carefully smoothed over the gel, using a pipette to roll out any bubbles trapped

between the gel and nylon membrane. A dry piece of filter paper was then placed on top of the

nylon membrane, and a 1.5-2 inch stack of paper towels was overlaid onto the filter paper.

Finally, another glass plate and a weight of 200-500g completed the transfer assembly. This set-

up was left undisturbed overnight (12-16h) to allow for the Southern transfer of DNA fragments

onto the nylon membrane. Fragments were immobilised onto nylon membranes by UV cross-

linking at 260nm for 5 minutes followed by baking at 80ºC for 2 hours.

2.5.3 cDNA synthesis and DIG-labelling of cDNA probes for BACFA analysis

To generate DIG-labelled cDNA for use in hybridisations with BACFA technology,

several primers were evaluated. These included a portion of an arbitrary primer previously used to generate cDNA libraries (63, 130), random primers, as well as a primer carrying a degenerate

3’ end used previously for transcriptomic profiling of M. tuberculosis H37Rv (72), henceforth called the SCOTS primer.

2.5.3.1 Generation of cDNA probes using the universal arbitrary primer

One to five micrograms of total RNA was reverse transcribed with Superscript II

(Invitrogen Life Technologies) and a universal arbitrary primer (5’

GCCGGAGCTCTGCAGAATTC 3’), henceforth called Uniprime, to generate single stranded cDNA (130). Reverse transcription reactions using Uniprime consisted of the following: 0.8µL of Uniprime stock (50µM), 10U Superscript II (Invitrogen), 1.25µL of 0.1mM DTT, 1.25µL

dNTPs (10mM), 2.5µL of 5X SSII buffer, 40U of RNaseOUT (Invitrogen) and 1-5µg of total

RNA in a final volume of 12.5µL. Prior to initiation of the reaction, RNA was denatured at 95ºC

for 5 minutes. Upon addition of total RNA, the entire volume of the reaction was left at 25ºC for

10 minutes, and then placed at 43ºC for 90 minutes. Eight units of Klenow fragment of DNA

Polymerase I (Invitrogen) were then used to synthesize second stranded cDNA as per previously

published protocols (130). To generate DIG-labelled probes, 2µL of double stranded cDNA

were added to 8µl of water and boiled for 10 minutes and added to the following: 2µL 10X PCR

buffer, 2.5µL 2mM DIG-dUTP:dTTP (3:1 ratio), 2.5µL 2mM dNTPs (dATP, dCTP, dGTP),

0.8µL 50µM Uniprime stock solution, 0.6µL 50mM MgCl2, 1U Taq Polymerase (Invitrogen), to

a final volume of 20µL. DIG-labelling of cDNA was then done via PCR (94ºC for 2 minutes

followed by 30 cycles of 94ºC, 55ºC, and 72ºC for 1, 2, and 3 minutes, respectively).

2.5.3.2 Generation of cDNA probes using random primers or the SCOTS primer

To generate single stranded cDNA with random primers, each reaction contained the

following: 500ng of total RNA (denatured at 85ºC for 15 minutes), 50ng random hexamers

(Invitrogen), 4µL 2.5mM dNTPs, 4µL 5X Superscript II buffer, 2µL 0.1M DTT, 40U

RNaseOUT (Invitrogen). This was mixed, and incubated at 42ºC for 2 minutes after which 10U

of Superscript II enzyme were added. The entire reaction was incubated at 25ºC for 10 minutes,

and then at 42ºC for 50 minutes. Double strand synthesis was done as per Section 2.5.3.1. DIG

labelling of cDNA was attempted using the High-Prime DIG-labelling kit as per Section 2.4.3.1

as well as PCR labelling with random primers as per section 2.5.3.1.

To generate single stranded cDNA with the SCOTS primer, 200ng of total RNA were

heat denatured at 95ºC for 5 minutes and added to the following: 30U of RNaseOUT, 1.25µL

12.5mM dNTPs, 2.5µL 5X Superscript II buffer, 1µL 0.1M DTT, 10U Superscript II enzyme,

1.25µL 40µM SCOTS primer. The entire contents of the reaction were incubated at 37ºC for 60

minutes, and second strand synthesis proceeded as per Section 2.5.3.1. To DIG-label cDNA,

2µL of second stranded reactions were added to 8µL of water, boiled for 2 minutes and added to

the following: 5µL 10X PCR buffer, 2.5µL 40µM SCOTS primer stock, 0.8µL 12.5mM dNTPs,

1U Taq Polymerase (Invitrogen), 1.5µL 50mM MgCl2, and 30µL water. The reaction was run

for 25 cycles: 94ºC for 40s, 57ºC for 60s, and 72ºC for 30s. This reaction was attempted both

with and without 0.1% NP-40 and 1% acetamide.

2.5.3.3 Silver-staining of DNA

As ethidium bromide staining can typically detect only nanogram amounts of nucleic acid, a more sensitive method was required to visualise the banding pattern of amplified cDNA.

Silver staining of nucleic acids has been reported to detect amounts in the picogram range. The

protocol used here was that of Bassam et al., published in Promega Notes

Magazine, No. 45, 1994, p.13. Briefly, the acrylamide gel is fixed for 30 minutes in fixative solution (7.5% acetic acid), then washed thrice in distilled water (3 X 5 minutes). The gel is then impregnated with silver (per litre: 1.5g AgNO3, 0.056% formaldehyde) for 60 minutes, washed in water, then developed (per litre: 30g Na2CO3, 0.056% formaldehyde – use at 4-8ºC) for 5 to

15 minutes. The staining procedure is stopped by rinsing the gel in cold (4ºC) fixative solution.

2.5.4 Hybridisation of BACFAs

Nylon membranes were hybridised as per Section 2.4.3.3 with 2.5µL of PCR reaction per millilitre of hybridisation buffer. Membranes were hybridised 16-18 hours at 50ºC.

Immunological detection of blots was carried out as per Section 2.4.3.3. Three independent

pools of DIG-labelled cDNA were used in hybridisations to screen for expression differences.

For hybridisations, each BACFA was produced in duplicate; one blot was probed with H37Ra cDNA and the other with H37Rv cDNA. The blots were then stripped and probed again with the alternate set of probes. Only profiles generated from the same blot were compared, to avoid blot-to-blot differences. Furthermore, only differences seen over three independent populations (hybridisations done in duplicate) were selected for further analysis.

2.5.4.1 Stripping a membrane of bound DIG-labelled probes

Prior to re-probing a BACFA with DIG-labelled cDNA, probes already hybridised to the membrane were stripped. The membranes were thoroughly rinsed in autoclaved, double-distilled water and then washed twice (2 x 15 minutes) at 37ºC in Stripping Buffer (0.2M NaOH, 0.1% SDS). The membranes were then washed in 2X SSC and were either re-hybridised with labelled probes as per Section 2.4.3.3 or stored in 2X SSC at 4ºC.

2.5.5 Analysis of BACFAs

To isolate differences between strains, BACFAs generated with the enzymes PvuII and

StuI were hybridised with three different pools of DIG-labelled cDNA probes. Presence or absence of bands in the hybridisation profiles and, in some cases, marked changes in band intensity were designated as expression differences in BACFA analysis. Only differences seen in duplicate arrays with all three cDNA probe populations were chosen as candidates. In some cases, a second set of enzymes, StuI and SalI, was used to generate new BACFAs to further investigate candidate differences in large fragments, which would contain several genes. To identify genes in the bands of interest, a programme called Restriction Site Search was used (W.

Lam, unpublished data). This programme allows the user to digest sequences in silico with

restriction enzymes of interest. It provides both a pictorial of how an ideal digest should run on

an agarose gel and a text file of the sizes and sequences of all restriction digest products.
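
The in silico digestion step described above can be sketched in a few lines. This is a minimal illustration only and is not the Restriction Site Search programme itself, whose implementation is unpublished. It assumes a linear input sequence and uses the standard recognition sites of PvuII (CAG^CTG), StuI (AGG^CCT) and SalI (G^TCGAC); the function names cut_positions and digest are hypothetical.

```python
# Minimal in silico restriction digest (illustrative sketch, not the
# Restriction Site Search programme). Assumes a LINEAR DNA sequence.

# Recognition site and the offset within the site at which the top strand
# is cut. All three sites are palindromic, so scanning one strand suffices.
ENZYMES = {
    "PvuII": ("CAGCTG", 3),  # CAG^CTG, blunt
    "StuI": ("AGGCCT", 3),   # AGG^CCT, blunt
    "SalI": ("GTCGAC", 1),   # G^TCGAC, 5' overhang
}


def cut_positions(seq, enzymes):
    """Return the sorted cut coordinates for the named enzymes."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)  # allow overlapping sites
    return sorted(cuts)


def digest(seq, enzymes):
    """Return the fragments produced by a complete (multi-enzyme) digest."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

For example, digest(sequence, ["PvuII", "StuI"]) returns the fragments of a PvuII/StuI double digest, and sorting their lengths in descending order gives the expected banding order on an agarose gel.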

2.5.6 Quantitative real-time PCR analysis of expression differences observed with

BACFA

As hybridisation protocols could potentially allow for non-specific binding resulting in a

false positive, all differences identified via BACFA were assessed via qPCR. Using gene-specific primers, this latter technique allows for unambiguous expression analysis of the candidates. Second-strand cDNA generated with the universal primer (Section 2.5.3) was used in qPCR reactions with SYBR Green as the indicator dye using the DyNAmo SYBR Green qPCR kit (Finnzymes, New England Biolabs). For qPCR reactions, primers were designed with

PrimerQuest, a web-based programme freely available from the IDT website

(http://www.idtdna.com/Scitools/Applications/Primerquest/), as well as Primer Software from

Molecular Biology Tools. Primers were designed with an annealing temperature of 57°C, and used at a final concentration of 300nM in a reaction volume of 20µL. qPCR cycling conditions were as follows: 95ºC for 10 minutes followed by 35 cycles of 94ºC for 30 seconds, 57ºC for 20 seconds, and 72ºC for 30 seconds. Expression determined from qPCR data was reported as the fold difference of expression in H37Ra over that in H37Rv via the 2^(-∆∆Ct) method as described

previously (187) with rrnAP1 (135) and dnaK used as normalising genes with 96h and 168h

transcripts, and rrnAP1 and 16S used as normalising genes for 4h transcripts. 16S was used in

addition to rrnAP1 at 4h post-infection as it was found to be more reliable at the earlier time-

point compared to the later time-point. Two normalising genes were used for each time point

because a previous report showed that normalising genes do not stay constant throughout and

more than one is required for reliable expression analysis (221).
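
The 2^(-∆∆Ct) calculation can be illustrated with a short sketch. How the two normalising genes were combined is not stated here, so the sketch assumes that their Ct values are averaged before subtraction (equivalent to a geometric mean of the reference quantities). The function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the normalising genes.

    Averaging the reference Ct values is an assumption of this sketch.
    """
    return target_ct - mean(reference_cts)


def fold_difference(ra_target, ra_refs, rv_target, rv_refs):
    """Fold difference of expression in H37Ra over H37Rv (2^-ddCt)."""
    ddct = delta_ct(ra_target, ra_refs) - delta_ct(rv_target, rv_refs)
    return 2 ** (-ddct)
```

A value above 1 indicates higher expression in H37Ra and a value below 1 indicates higher expression in H37Rv. The calculation assumes close to 100% amplification efficiency for all primer pairs.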

2.5.7 Extraction of whole cell lysate for fumarate reductase enzyme assessments in

cultures mimicking microaerophilic conditions.

As described in Section 2.1.2, microaerophilic cultures were placed in undisturbed 14mL

polystyrene tubes at 37ºC. At designated time points, aliquots were removed from the incubator,

resuspended through inversion and pooled. Bacterial pellets (3,100 x g for 45 minutes at 4ºC) were resuspended in 600µL of 50mM Tris-HCl (pH 7.5) and transferred to 2mL screw-top tubes containing 0.5mL of 100µm silica beads. Tubes were ribolysed at the maximum setting (6.5

Watts) for 45 seconds then placed on ice for 5 minutes. This was repeated an additional 4 times, whereupon the tubes were spun down (5 minutes at 14,000 x g), and supernatants were collected and placed on ice. The beads and bacterial cell debris were resuspended in 300µL of 50mM

Tris-HCl (pH 7.5), and inverted to ensure homogeneous mixing. The beads were then allowed to settle, the supernatants were collected and combined with that already sitting on ice. This was repeated once more before the entire volume of supernatant was passed through a 0.2µm filter

(PALL).

2.5.8 Fumarate reductase enzyme assay

Fumarate reductase activity was monitored as per the oxidation of the coloured substrate

methyl viologen described previously by Lemire and Weiner (115). It should be noted that

reactions are best done in quartz cuvettes and with as little air-space as possible. Briefly, 25µL

of cell-free enzyme extract were added to 1250µL of 0.5M DTT in 0.2M Na2PO4 (pH 6.8), 50µL

of methyl viologen (2.5mg/mL in 10mM Na2PO4 – pH 5), and 37.5µL of 20mM sodium

dithionite in 0.2M Na2PO4 (pH 6.8) whereupon the cuvette was gently inverted twice. 100µL of

0.5M sodium fumarate (pH 7) were then added, and the contents were inverted once before a

spectrophotometric reading was made at 570nm. One A570 unit was equivalent to the activity of

1mM of fumarate reductase.

2.5.9 Western blot detection of fumarate reductase in whole-cell-lysates of M.

tuberculosis.

2.5.9.1 Polyacrylamide gel electrophoresis of whole-cell lysates

Protein contents of lysates were assessed using the Pierce BCA protein kit, and

concentrations were determined by comparing absorbance readings of lysates to those of the

standard curve obtained when various amounts (0.5-20µg) of BSA were assayed with the BCA

Protein kit. 10% sodium-dodecyl-sulphate polyacrylamide gel electrophoresis gels were cast as

per the Laemmli method. For each lysate to be assessed via Western blotting, 20µg of each

sample were loaded per lane. Gels were run for 2h at 75V.
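
The standard-curve step above can be sketched as a simple least-squares fit. This is an illustration under stated assumptions: the BSA response is treated as linear over the 0.5-20µg range, and the function names are hypothetical. Any absorbance values passed in would be the user's own readings.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate protein (ug) in the assayed volume."""
    return (absorbance - intercept) / slope


def volume_for_load(conc_ug_per_ul, load_ug=20.0):
    """Volume of lysate (uL) that delivers the desired load per lane."""
    return load_ug / conc_ug_per_ul
```

Dividing the estimated protein amount by the lysate volume assayed gives a concentration, from which volume_for_load returns the volume needed for a 20µg lane.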

2.5.9.2 Western blot transfer and immunodetection of proteins

SDS-PAGE gels were equilibrated for 10-30 minutes in transfer buffer (20mM Tris,

150mM glycine, 20% methanol), and proteins were blotted onto nitrocellulose using the mini-

Protean II system (Bio-Rad, Hercules CA) at 75V for 1.5h. After transfer, membranes were rinsed for 5 minutes in TBS (per litre: 8g NaCl, 2.37g Tris-HCl, 0.6g Tris Base, 1mL Tween-20,

pH 7.6) with 0.1% Tween 20 (TBST), and blocked for 1h in TBST containing 5% (w/v) skim

milk powder. Rabbit polyclonal antibodies raised against the E. coli FRD-A and FRD-B

subunits were obtained from Dr. Joel Weiner and G. McCuaig. Antibodies were dissolved in

PBS and used at a dilution of 1:24000 in TBST containing 1% skim milk. Membranes were

incubated with the primary antibody overnight at 4ºC. Following the primary antibody,

membranes were washed twice with TBST, and then incubated with the goat anti-rabbit

secondary antibody at a dilution of 1:2000 for 2 hours. The membranes were then washed several times over a 30 minute period, changing solutions often. Chemiluminescent detection of the membranes was done as follows with the Chemiluminescent Substrate kit from Sigma-Aldrich: membranes were soaked in the chemiluminescent solution for 5 minutes in the dark, the

signal was allowed to weaken for 30 minutes, and then the membranes were exposed to X-ray

film (3-15 seconds).

2.5.10 Evaluation of mercaptopyridine-N-oxide (MPNO) effects on BM-MΦ and on

growth of intracellular M. tuberculosis.

Macrophages were derived as per Section 2.3.1. To assess effects of MPNO on cell

viability, uninfected BM-MΦ were incubated with various concentrations (0.4, 1.2, 2, 2.4, 2.8,

3.6, and 4.8µM) of MPNO over a 7-day period. Every 24 hours, cover slips with adhered

macrophages were stained in trypan blue (0.02%) and cell viability was assessed. Additionally,

a cover slip of macrophages not treated with MPNO was submerged in 70% EtOH for 2 minutes

to fix and kill the cells. These slips were then stained in trypan blue as a positive control for

staining.

To assess the effects of MPNO on intracellular bacteria, macrophages derived as per

Section 2.3.1 were infected as per Section 2.3.2 with an MOI of 10 bacteria to 1 macrophage.

After washing off unbound bacteria, infected macrophages were overlaid with media with or

without 2.4µM of MPNO. CFUs were assessed at 4, 96, and 168h post-infection to determine

effects of the fumarate reductase inhibitor on the growth of intracellular M. tuberculosis H37Ra and H37Rv. Wells marked for CFU assessment at 168h post-infection were replenished with fresh cRPMI with and without 2.4µM MPNO at 120h post-infection. Death rates (cells per hour)

were calculated as previously described using the following formula: death rate = (1/t) × log(a/b), where a is the initial number of bacteria, and b is the number of bacteria at time t (94, 166).
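
As a worked illustration of the death rate formula, a minimal sketch follows. The base of the logarithm is not specified in the text, so it is exposed as a parameter, and the default of base 10 is an assumption of this sketch. The function name is hypothetical.

```python
import math


def death_rate(initial, final, hours, base=10.0):
    """Death rate = (1/t) * log(a/b).

    initial: number of bacteria at the start (a)
    final:   number of bacteria at time t (b)
    hours:   elapsed time t
    base:    logarithm base (assumed to be 10; not stated in the source)
    """
    return (1.0 / hours) * math.log(initial / final, base)
```

A positive value indicates a net decline in CFU over the interval, and a negative value indicates net growth.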

2.5.11 Statistical analysis

Graphical data shown in Chapters 4 and 5 are means ± standard error of the mean.

Statistical significance of comparisons between H37Ra and H37Rv or between two time points was determined with the two-tailed, unpaired Student’s t test. P values <0.05 were marked as significant.
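
The summary statistics named above can be sketched as follows, using only the Python standard library. The sketch computes the standard error of the mean and the unpaired Student's t statistic with pooled variance. It does not compute the P value, which requires the t distribution; a routine such as scipy.stats.ttest_ind would normally be used for that. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """Unpaired Student's t statistic (equal variances assumed).

    Returns the statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2
```

The resulting statistic is compared against the two-tailed critical value for the returned degrees of freedom at the 0.05 level.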

2.6 DNA microarrays

Mycobacterium tuberculosis microarray slides were obtained from Bµg@S at St. George’s

Medical School (http://www.bugs.sgul.ac.uk/). Amplicons on the array slides were generated using PCR primers designed to genes described in the published sequences and annotations of M. tuberculosis H37Rv, M. tuberculosis CDC1551, and M. bovis AF2122/97 for a total of 4410 target genes. Genes were not spotted in duplicate on the array. For genomic comparisons using microarrays, duplicate arrays for each of two populations of genomic DNA from each strain were compared. For expression studies of both broth-grown and intracellular bacteria, three independent populations of RNA from each strain were hybridised in duplicate to array slides.

2.6.1 Genomic comparisons using microarray technology

2.6.1.1 Labelling and purification of DNA

Two micrograms each of M. tuberculosis H37Ra and H37Rv genomic DNA were prepared for labelling with Cy3-dCTP (Amersham Pharmacia) or Cy5-dCTP (Amersham

Pharmacia) dye with each microarray slide (Bµg@S, St. George’s Medical School, UK). To

each tube containing 2µg of DNA, 1µL of random primers (3µg/µL, Invitrogen) and water up to

41.5µL were added. The DNA was heated to 95ºC for 5 minutes and quickly cooled in an ice bath. To label the DNA, each tube received the following: 5µL of 10X React 2 buffer, 1 µL of

Klenow Fragment of DNA Polymerase I (Invitrogen), 1.5µL of Cy3-dCTP or Cy5-dCTP dye, and 1µL of dNTPs (5mM dA/G/TTP, 2mM dCTP). This reaction was left to incubate in the dark at 37ºC for 90 minutes. After incubation, the reactions were purified via the PCR Purification

Kit (Qiagen) and eluted into 30.2µL of water. Both the Cy3 and Cy5 labelling reactions were combined into one tube, mixed with 375µL of PB binding buffer (Qiagen), and loaded onto the

PCR Purification column. The column was spun, washed with 500µL of PE wash buffer

(Qiagen), and then washed again with 250µL of PE. The column was spun twice more to dry, and 30.2µL of water were pipetted onto the column and the labelled DNA was eluted.

2.6.1.2 Hybridisation of microarrays

Prior to hybridisation, microarray slides were prehybridised for 30 minutes at 65°C in prehybridisation buffer (per 50mL: 8.75mL 20X SSC, 250µL 20% SDS, 5mL 100mg/mL BSA).

The slides were then washed thoroughly in hot water (65ºC) for 1 minute, and then rinsed in isopropanol for 1 minute, before spinning at 400 x g for 5 minutes to dry. Dried slides were kept in the dark (<1h) prior to hybridisation.

For hybridisation, 29.2µL of purified labelled DNA were combined with 9µL of filter sterilised 20X SSC, and 6.8µL of filter sterilised 20% SDS. This was heated at 95ºC for 2 minutes, cooled slightly and applied to the microarray slide. The slides were snapped into water-tight hybridisation cassettes and submerged in a hot water bath at 65ºC for 16-20h. After hybridisation was complete, slides were washed with solution A (1X SSC, 0.05% SDS), which had been heated to 65ºC, for 3 minutes. The slides were then washed twice (2 x 2 minutes) in

room temperature solution B (0.06X SSC). The slides were then spun at 400 x g for 5 minutes to

dry and scanned with a PerkinElmer ScanArray Instrument.

2.6.2 Application of microarray technology for expression analysis

For each slide, 2µg of genomic DNA from M. tuberculosis H37Rv were used as the

normalising channel, and 4µg of total RNA from either M. tuberculosis H37Ra or H37Rv were

used for expression analysis. Genomic DNA rather than broth RNA was chosen as the reference

signal for the following reasons: (1) as the microarray chip was based on H37Rv, using genomic

DNA as a reference would nearly guarantee that all spots can be normalised to the control signal,

whereas with RNA, there may be some genes that are not expressed under certain broth-grown

conditions; (2) large amounts of DNA are easily obtained and can be stored without significant

degradation, whereas RNA is considerably more fragile and harder to obtain in large

quantities, requiring several harvests of RNA and thereby introducing variability into the results; (3)

through an open-resource agreement with Colorado State University, we were able to obtain an

amount of genomic DNA that was sufficient for the whole of the microarray experiment.

2.6.2.1 Labelling of RNA

DNA to be used for the microarray hybridisations was labelled as per Section 2.6.1.1.

For each slide, 2µg of DNA were used for normalising purposes, and 4µg of total RNA extracted as per Sections 2.2.2 and 2.3.3 were used for hybridisation to assess expression in axenic broth cultures and intracellular M. tuberculosis, respectively. For all expression studies, the normalising channel (genomic DNA) was labelled with Cy3 dye, and the test channel (total

RNA) was labelled with Cy5 dye.

To each tube containing 4µg of total RNA, 1µL of random primers and water up to 11µL

were added. These tubes were heated at 95ºC for 5 minutes and quickly cooled in an ice bath.

To start the labelling reaction, each tube received the following: 5µL 5X SSII buffer, 2.5µL

0.1M DTT, 2.3µL dNTPs (5mM dA/G/TTP, 2mM dCTP), 1.7µL Cy-5, 2.5µL Superscript II.

This entire reaction was spun down, incubated in the dark at 25°C for 10 minutes, and then in the

dark at 42ºC for 90 minutes. After incubation, the reactions were purified via the PCR

Purification Kit (Qiagen) and eluted into 30.2µL of water. Briefly: both the DNA and RNA

labelling reactions were combined into one tube, mixed with 375µL of PB, and loaded onto the

PCR Purification column. The column was spun, washed with 500µL of PE, and then washed

again with 250µL of PE. The column was spun twice more to dry, and 30.2µL of water were

pipetted onto the column and the labelled DNA/cDNA mixture was eluted. Hybridisation of

the slides was done as per Section 2.6.1.2.

2.6.3 Analysis of microarray hybridisations

2.6.3.1 Image analysis of microarray slides

Microarrays were scanned using the PerkinElmer ScanArray Instrument; slides were

scanned just below saturation – a point where spots of hybridisation are of intense colour, but not so intense that white spots appear instead of coloured spots. Images were analysed using

Imagene Software, and expression analysed using Genespring software. Imagene analysed files of microarray images could also be submitted to the web-based ArrayPIPE expression analysis programme, found here: http://www.pathogenomics.ca/arraypipe/ (84).

2.6.3.2 Expression analysis of microarrays

Two different analysis protocols were applied to all sets of microarray data. In the first method, samples for each experiment were normalised by biological replicates to determine

biological variation within any one population of RNA. Genes were then filtered to select only

genes with reliable expression greater than 1.5-fold over all populations tested. Analysis of

variance (ANOVA) was then performed to assess statistical significance of gene expression.

Although normalisation by biological replicates is useful for determining consistency of

expression trends in several biological samples, it may also add artefacts into the analysis (J.

Hinds, personal communication). Thus, a second protocol of analysis was done to confirm the

results obtained. This method of analysis involved per-array median normalisation, where the

median log ratio is subtracted from the calculated log ratios for each spot on the array. Log

ratios for each spot are calculated with the following formula: log10(xi5) – log10(xi3) where xi5 represents the intensity for the spot on Cy5, and xi3 represents the intensity for the spot on Cy3.

After normalisation, ANOVA was then used on all genes to determine significance of expression

differences. Following ANOVA analysis, false discovery rate adjustment was made using

Benjamini and Hochberg's method (103) where P-values were adjusted to correct for the

occurrence of false positives.
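
The second analysis protocol can be sketched as follows. The sketch covers the log ratio calculation, the per-array median normalisation and the Benjamini and Hochberg adjustment. The ANOVA step is omitted, and the function names are hypothetical. The software packages named in Section 2.6.3.1 may implement these steps differently.

```python
import math
from statistics import median


def log_ratios(cy5, cy3):
    """log10(Cy5 intensity) - log10(Cy3 intensity) for each spot."""
    return [math.log10(r) - math.log10(g) for r, g in zip(cy5, cy3)]


def median_normalise(ratios):
    """Subtract the array's median log ratio from every spot."""
    m = median(ratios)
    return [x - m for x in ratios]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

After normalisation the median log ratio of each array is zero, so arrays with different overall Cy5 or Cy3 signal can be compared on a common scale.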

2.6.4 qPCR analysis of selected candidate genes identified via microarray

expression analysis

One to four micrograms of total RNA were reverse transcribed using random hexamers

as set forth in Section 2.6.2.1, with the following added to each tube of total RNA: 5µL 5X SSII

buffer, 2.5µL 0.1M DTT, 2.3µL dNTPs (5mM dA/G/C/TTP), 1.7µL water, and 2.5µL

Superscript II. The design of gene-specific primers and use of qPCR to confirm microarray data were done as per Section 2.5.6.

2.6.5 Comparing overlap between datasets and generation of Venn diagrams

Using a freely available web-based programme called GeneVenn available at

http://mcbc.usm.edu/genevenn/ (159), genes for each dataset to be compared were submitted into

the relevant fields and commonalities between groups were analysed. The programme generated

Venn diagrams representing the overlaps between datasets.
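
The overlap calculation underlying a two-set Venn diagram reduces to set operations, as in the minimal sketch below. It is not the GeneVenn implementation. The function name is hypothetical, and the gene names in the usage note are placeholders.

```python
def venn_regions(dataset_a, dataset_b):
    """Return the three regions of a two-set Venn diagram."""
    a, b = set(dataset_a), set(dataset_b)
    return {
        "only_a": a - b,   # genes unique to the first dataset
        "shared": a & b,   # genes common to both datasets
        "only_b": b - a,   # genes unique to the second dataset
    }
```

For example, venn_regions(["geneA", "geneB"], ["geneB", "geneC"]) places geneB in the shared region and the other two genes in the dataset-specific regions.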

CHAPTER 3: Two-dimensional DNA analysis

1. Dullaghan EM, Malloff CA, Li AH, Lam WL, Stokes RW. Two-Dimensional Bacterial

Genome Display: A method for the genomic analysis of mycobacteria. Microbiology,

148: 3111-3117. 2002. *Highlighted in “Hot off the Press” in Microbiology Today, 29:

210. 2002.

2. Malloff C, Dullaghan E, Li A, Stokes R, Fernandez R, Lam W. Two-dimensional

displays for comparisons of bacterial genomes. Biol. Proced. Online, 5:143-152. 2003.

3.1 Introduction

With the emergence of sequencing technology, a wealth of information has been

uncovered with regard to gene origins, putative virulence factors, and adaptive mechanisms of

bacteria to various environmental conditions. Sequencing technology has become more

widespread, more efficient, and allows for faster analysis of data. However, accessibility to the equipment used in high-throughput sequencing is still a concern for developing areas of the world where a large number of interesting bacterial isolates relevant to disease may be found.

Instead, technologies such as microarrays designed using a sequenced reference strain may be focussed upon. Unfortunately, these methods allow only for the study of genes that have

complements in the reference strain and not the novel genomic sequences unique to the test

strain. Technologies that would allow for identification of unique sequences such as pulsed-field

gel electrophoresis (PFGE) or restriction fragment length polymorphism (RFLP) analysis are

limited to the discovery of only large insertions and/or deletions (171). Hence, even using these

methodologies would not allow for the identification of minor genomic differences that could

potentially account for dramatic phenotypic differences.

Two-dimensional DNA gel electrophoresis (2DDE) was first introduced by Fisher and

Lerman in 1979 as a means to resolve the genome of E. coli in order to allow for the

identification of the insertion of a lambda phage of approximately 48 kb in length (171). This

technique involves the separation of restriction-digested DNA fragments by size in the first dimension using a non-denaturing polyacrylamide gel, and in the second dimension by sequence composition using a denaturing gradient gel. This latter separation is accomplished on the principle that the electrophoretic mobility of a fragment is altered when part of the molecule is melted or denatured. This melting/denaturation occurs at a specific denaturant concentration and is sequence dependent, i.e. fragments rich in AT sequences are likely to denature at a lower denaturant concentration than fragments rich in GC sequences because A-T base-pairs are weaker than G-C base-pairs. Genetic alterations (deletions, additions) that affect the sequence can ultimately impact the mobility of a fragment as changes in the sequence will alter cleavage sites, which will affect the conditions under which the molecule will melt or denature.

2DDE was initially a challenging and laborious technique that was not widely applied in genome studies, particularly those of bacterial genomes. However, with the advancement of 2D protein apparatus that has allowed for improved ease-of-use, as well as more consistent casting of denaturing gradient polyacrylamide gels, 2DDE has been successfully applied to resolve

genetic alterations in prostate cancer cell lines (105), pedigrees, mutations in breast cancer genes

BRCA1 and BRCA2 (194), as well as mutations in zebrafish (109). Recently, the technique has

been modified to enable the screening of bacterial genomes such as those of Bordetella pertussis

(128, 131). This technique, now called two-dimensional bacterial genome display, or 2DBGD,

has been further modified to allow for the analysis of mycobacterial genomes (49, 127).

Two different populations of bacterial DNA may be run on the same gel to allow for the direct comparison of shared DNA between strains. This method – Bacterial Comparative Genomic Hybridization (BCGH) – can allow for the detection of acquired DNA between non-related bacteria (126, 127). In addition, gene expression can also be studied by BCGH to elucidate the relationship between acquired DNA and phenotype (126, 127). By generating cDNA probes from RNA, 2D profiles can be probed to obtain transcriptomic profiles for any time point.

3.2 Rationale

This study began with the aim of elucidating genomic differences between the sibling M. tuberculosis strains H37Ra and H37Rv. At the outset of this project, the genomic sequence of the virulent M. tuberculosis H37Rv had been completed and annotated (32); however, the genomic sequence of the attenuated H37Ra strain was not available at the time and has only recently become available. Given the high similarity between H37Ra and H37Rv, as well as the paucity of genomic differences discovered between the strains in numerous past studies, there appeared to be only minor genomic differences between the two strains (89). As microarray slides available at the time contained only open reading frames (ORFs) found in H37Rv, 2DDE was viewed as an alternative because it would allow for the analysis of novel genomic fragments present in H37Ra. Additionally, as 2DDE would allow for the analysis of the complete genome, intergenic regions that may affect phenotype, and not just ORFs, could also be examined.

Furthering the appeal of 2DDE was the aforementioned possibility of minor genomic differences between H37Ra and H37Rv. Other methodologies which also enable the study of novel sequences (e.g. RFLP, PFGE) are more suited to the study of large insertions and/or deletions compared to 2DDE which has been demonstrated to have the capability of resolving point

mutations (59, 144). Thus, 2DDE was seen as a viable technique to isolate and study the apparently minor differences between H37Ra and H37Rv.

To examine genomic differences between the strains of interest for this study, two related

2D techniques were applied: two-dimensional bacterial genome display (2DBGD), and bacterial comparative genomic hybridisation (BCGH).

3.3 2DDNA technologies for use with M. tuberculosis

3.3.1 2-dimensional bacterial genome display (2DBGD)

3.3.1.1 Optimisation of 2DBGD

Two-dimensional bacterial genome display had been previously optimised for the genomic profiling of Bordetella pertussis (127), as well as to compare mutant and wild-type strains of Mycobacterium avium (49). Here, the system was optimised for genomic displays of

Mycobacterium tuberculosis.

An important consideration when generating 2DBGD profiles is the harvest of high quality genomic DNA. Nicked or sheared DNA will give rise to small fragments as they are separated on the denaturing gel and will blur the genomic profiles complicating analysis. One consideration was the state of the culture used to extract DNA. Older cultures would allow for the harvest of high amounts of nucleic acid; however, the increased cell mass and the additional steps of purification required to obtain high quality DNA free from contaminating protein (that can also affect 2DBGD profiles) could lead to greater damage to the DNA. In addition, older cultures are also likely to have increased rates of autolysis, or self-degradation of the bacterial cell wall leading to the release of enzymes and DNA. The extracellular DNA released via autolysis would be subject to an abundance of DNA damaging enzymes such as I

whilst in culture medium. This damaged DNA would likely be harvested along with DNA from

intact cells, clouding 2DBGD profiles. Therefore, to minimise the likelihood of encountering

such difficulties, I sought to understand the growth of M. tuberculosis in liquid medium as well

as a method of DNA extraction that would result in minimal damage to the DNA molecule.

Figure 3.1 shows the growth curves of the respective strains in enriched 7H9 broth medium over

the course of eight days at 37ºC. These aerated cultures were observed to reach mid-log growth

at approximately the same day as determined by CFUs, albeit at different OD levels. OD

numbers are inaccurate when assessing growth of mycobacteria as clumping and dead bacteria in

the culture can affect OD readings, thus skewing the perception of a culture’s health.

Additionally, calculating growth rates of the respective cultures on Day 4, the doubling times of

H37Ra and H37Rv were not significantly different (H37Ra: 21.2 ± 0.7 hours, H37Rv 18.5 ± 1.2

hours, P=0.1882; Figure 3.1). Thus, DNA was routinely harvested on Day 4 after initial

inoculation of the aerated cultures. With respect to DNA extraction protocols, the enzymatic method previously published by Belisle and Sonnenberg (7) was chosen as it allowed for the harvest of the maximal amount of DNA with minimal processing and purification to obtain clean DNA.
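For reference, the doubling times quoted above follow from the standard exponential-growth relation t_d = Δt · ln 2 / ln(N2/N1), where N1 and N2 are viable counts taken Δt hours apart during logarithmic growth. The sketch below is a minimal illustration of that relation only; the CFU values in the example are hypothetical and are not data from this study.

```python
import math


def doubling_time(n1: float, n2: float, hours: float) -> float:
    """Doubling time (hours) from two viable counts taken `hours` apart.

    Assumes strictly exponential growth between the two measurements.
    """
    if n1 <= 0 or n2 <= n1 or hours <= 0:
        raise ValueError("need 0 < n1 < n2 and a positive time interval")
    return hours * math.log(2) / math.log(n2 / n1)


# Hypothetical example: an eight-fold increase (three doublings) in 60 h.
example = doubling_time(1e6, 8e6, 60.0)
```

Because the calculation uses CFU counts rather than optical density, it is not affected by the clumping and dead-cell artefacts noted above for OD readings.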

Two-dimensional displays of genomic fragments are limited to those between 200bp and approximately 2072bp (Figure 3.2). Thus, restriction enzymes that would enable the generation of

fragments suitable for resolution via 2DBGD were sought. Taking into consideration the high

GC content of M. tuberculosis (67.6%) as well as the need for an enzyme that would generate enough fragments of the desired length, HinfI (New England Biolabs), a 5-cutter restriction

enzyme was chosen for preliminary examination of 2DBGD applicability to decipher the

genomic differences between M. tuberculosis H37Ra and H37Rv.
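The reasoning behind the enzyme choice can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a simple model in which bases occur independently, with G and C each at half the GC fraction and A and T each at half the remainder; real genomes deviate from this model, so the result is only a rough guide. It uses the GANTC recognition sequence of HinfI and the GC content quoted above; the function names are hypothetical.

```python
def site_probability(site: str, gc: float) -> float:
    """Probability that a random position starts `site` under an
    independent-base model with overall GC fraction `gc`.

    'N' in the site matches any base.
    """
    p_gc = gc / 2.0          # probability of G (or of C)
    p_at = (1.0 - gc) / 2.0  # probability of A (or of T)
    prob = 1.0
    for base in site.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        elif base == "N":
            prob *= 1.0
        else:
            raise ValueError(f"unsupported symbol in site: {base}")
    return prob


def expected_site_spacing(site: str, gc: float) -> float:
    """Mean distance in bp between occurrences of `site` under the same model."""
    return 1.0 / site_probability(site, gc)


# HinfI (GANTC) in a genome with the GC content quoted in the text.
hinfi_spacing = expected_site_spacing("GANTC", 0.676)
```

Under these assumptions the mean HinfI fragment length comes out at a few hundred base pairs, which is consistent with the enzyme producing many fragments within the size window that 2DBGD can display.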

Initial attempts at 2DBGD profiles of M. tuberculosis yielded poorly separated fragments with no discernible differences (Figure 3.3A). Parameters that were examined to improve

genomic profiling included run-time, voltage of run, temperature, buffer composition and pH, amount of DNA used for the profiles, as well as DNA quality (detailed descriptions are found in section 2.1). After much trial and error, and increased familiarity with the technical aspects of this method, satisfactory 2DBGD separations were achieved resulting in reproducible 2D patterns (Figure 3.3B).

To summarise, the most important variable to be controlled is the harvest of good quality

DNA. While run-time, voltage of run and buffer composition are important for clearly resolved profiles, if DNA quality is lacking, profiles that may be used for genomic analysis cannot be generated.

3.3.1.2 Using 2DBGD to isolate genomic differences between H37Ra and H37Rv

Once 2DBGD could be reliably applied to isolate genomic differences between the two strains, further 2DBGD profiles were generated with HinfI to allow for genomic comparisons.

One difference was observed across two independent populations each of H37Ra and H37Rv

(Figure 3.4). Subsequent 2DBGD profiles generated using HinfI digested fragments, however, did not reproduce this or other differences, leading to the conclusion that while reproducible patterns could be generated, no reproducible differences could be isolated by examining profiles generated with HinfI.

3.3.2 Bacterial genomic comparative hybridisation (BCGH)

One drawback to 2DBGD profiling is the need to compare separate profiles generated for

each strain thereby introducing gel-to-gel differences that could complicate reproducible

comparisons between strains. To alleviate such artefacts, bacterial genome comparative hybridisation (BCGH) was considered for genomic and expression comparisons between strains.

Similar to 2DBGD, BCGH is also a 2DDNA-based approach, but rather than generating separate

profiles for the strains to be compared, DNA from both strains is run on the same gel. Transfer of the nucleic acids onto nylon membrane and subsequent hybridisation with labelled DNA (or cDNA in the case of expression studies) from the respective bacteria allows for the generation of strain-specific profiles.

BCGH was therefore attempted as it could be applied both to genomic and expression profiling while minimising complications that may arise from comparing separate 2D gels.

Furthermore, it was seen as an alternative method to microarray technology, which may not be widely available to all laboratories. In addition, microarrays mainly contain open reading frames of sequenced genomes, whereas BCGH would allow for the genomic and expression profiling of genomes that have yet to be sequenced. BCGH also allows intergenic regions of a genome to be used as targets because the complete genomes of the organisms of interest are digested and run on a gel, whereas microarray slides usually contain only ORFs found in the reference strain. Thus, BCGH was an appealing alternative to microarray technology, especially in terms of dissecting out genomic differences between the highly related H37Ra and

H37Rv.

3.3.2.1 Optimisation of BCGH

BCGH gels are run similarly to 2DBGD gels, with an additional electrophoretic Southern transfer of nucleic acids to nylon membranes to generate the arrays. DNA used for BCGH is weakly labelled to allow the user to see (upon exposure to film or imaging equipment) that fragments were properly transferred onto the membrane (Figure 3.5). Subsequently, the blots are left to decay before hybridising to labelled cDNA probes to assess genomic or transcriptomic profiles.

The next round of optimisation involved improving the efficiency of labelling probes to be used in hybridisation to the blot. Previous uses of BCGH to resolve genomic differences between bacteria used radioisotope-labelled DNA probes (128). However, a non-radioactive labelling method was sought for the use of BCGH in resolving differences between H37Ra and

H37Rv. Non-radioactive approaches are a safer alternative to the use of radioactive isotopes.

Furthermore, non-radioactive probes may be stored and re-used thereby reducing the amount of starting material needed – an important factor to consider in expression studies when only a limited amount of RNA can be extracted from intracellular mycobacteria. The non-radioactive approach implemented was that of the digoxigenin (DIG) label. Briefly, digoxigenin is a steroid derived from the flowers of Digitalis purpurea and Digitalis lanata, and due to its small size, can be attached to an array of molecules. Antibodies to DIG are readily available and DIG-labelling of nucleic acid probes can be accomplished via traditional methods used to prepare radioactive probes: random priming, nick-translation, PCR labelling. In this study, 300ng of genomic DNA was used in a random primed labelling reaction to generate DIG-labelled DNA probes for use in

BCGH to isolate genomic differences. However, using the direct detection method of estimating labelled product as recommended by the manufacturer, DIG-incorporation did not result in efficiently labelled probes that would have been sensitive enough to detect differences on the 2D blots (Figure 3.6). Subsequently, a greater amount of DNA was used to generate new BCGH arrays in the hope of increasing the amount of target available for binding to the labelled probes. However, these BCGH arrays did not reveal any identifiable differences between the two strains.

3.4 Discussion and summary

In this chapter, the utilisation of two-dimensional DNA technologies to discern genomic differences between two related strains of Mycobacterium tuberculosis has been described.

Although laborious, two-dimensional DNA technologies are valid methods to dissect genomic and expression differences between bacteria. However, there are technical considerations to be taken into account when applying this technique.

To begin, 500-1000 spots can be resolved on the average 2DBGD profile, translating to

roughly 20% of the genome of M. tuberculosis. As seen in Figures 3.3 and 3.4, there are areas of

poor resolution, particularly the leading edge of the gel where the mass of spots prevents the

resolution of any single spot. Additionally, gel distortion effects limit comparisons of profiles

to areas within a 4cm diameter. To address these limitations, additional 2DBGD profiles need to

be generated with adjustments to the electrophoresis conditions (run time, voltage, temperature),

the denaturant gradient (adjusted for GC content), as well as acrylamide concentration (adjusted

for size ranges) and alternate restriction enzymes allowing insight into a different 20% of the

genome.

To date, only one well-established genomic difference between H37Ra and H37Rv has

been found; RvD2 is a region of difference noted when the respective genomes of H37Ra and

H37Rv were digested with the restriction enzyme DraI (19). It should be noted that DraI

digestions of H37Ra and H37Rv run as 2DBGD profiles would not resolve the RvD2 fragment

because RvD2 lies within a 7.9kb DraI fragment whereas 2DBGD is able to display only

fragments ranging from 0.2 to 2.1kb. In practice, resolution and isolation of spots of difference

is biased towards fragments between 500 and 1,300bp. Digesting the sequence of RvD2 (approx. 5.5kb) with HinfI in silico suggested that 17 different products could have been expected (276, 829, 879, 433, 153, 350, 140, 531, 140, 139, 129, 57, 59, 468, 711, 215bp). While a portion of

these fragments would have been too small to resolve with 2DBGD, others that fell into the

optimal range were not detected in the 2DBGD profiles with reproducible confidence.
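The size argument above can be checked directly against the fragment sizes listed. The sketch below sorts those sizes into the displayable window (0.2 to 2.1kb) and the window in which spots of difference are most readily isolated (500 to 1,300bp). The window boundaries are taken from the text and treated as inclusive, which is an assumption; the variable names are hypothetical.

```python
# HinfI in silico fragment sizes for RvD2, in bp, as listed in the text.
RVD2_HINFI_FRAGMENTS = [276, 829, 879, 433, 153, 350, 140, 531,
                        140, 139, 129, 57, 59, 468, 711, 215]

DISPLAY_RANGE = (200, 2100)  # fragments that 2DBGD can display
OPTIMAL_RANGE = (500, 1300)  # fragments most readily resolved and isolated


def within(size: int, bounds: tuple) -> bool:
    """True if `size` lies inside the inclusive (low, high) window."""
    low, high = bounds
    return low <= size <= high


displayable = [s for s in RVD2_HINFI_FRAGMENTS if within(s, DISPLAY_RANGE)]
optimal = [s for s in RVD2_HINFI_FRAGMENTS if within(s, OPTIMAL_RANGE)]
too_small = [s for s in RVD2_HINFI_FRAGMENTS if s < DISPLAY_RANGE[0]]
total_bp = sum(RVD2_HINFI_FRAGMENTS)
```

On these boundaries only a minority of the listed fragments fall in the optimal window, which illustrates how few spots a single enzyme contributes for a region of this size.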

2DDNA cannot replace other high-throughput genetic comparative techniques that are

currently available, but it may be able to complement techniques such as microarrays or PFGE or

subtractive hybridisation. These other techniques are not designed to detect small genetic

alterations such as deletions, insertions, point mutations, or genetic rearrangements. With

2DDNA, however, the underlying principle allows the user to identify even small changes that

result in a change in the restriction digest profile. Additionally, 2DDNA technologies allow for

the genomic and expression analyses of newly-discovered isolates of a given species that may

contain novel sequences not found in the sequenced reference strains commonly used in

microarray studies. Thus, in locales where high-throughput sequencing technology is

unavailable, methods such as 2DBGD and BCGH that have the capability to resolve novel

genomic differences would be of great value.

In conclusion, no reproducible genomic differences between H37Ra and H37Rv were

identified via 2DDNA technology; however, more profiles generated with alternate sets of restriction enzymes need to be compared before disregarding 2DDNA as a viable

technique in the genetic analysis of Mycobacterium tuberculosis.

3.5 Future directions

Two-dimensional bacterial genome display and its complementary method, bacterial genome comparative hybridisation, were applied to analyse genomic differences between M. tuberculosis H37Ra and H37Rv, but no differences were isolated using the conditions specified.

To utilise the 2DDNA technologies to their full potential, more 2D displays with alternate

enzymes need to be generated and compared. Thus, for subsequent applications of 2DBGD and

BCGH, alternate snapshots of the M. tuberculosis genome will have to be acquired.

With the recent boom of technologies available to assess genomic and expression differences in mycobacteria, 2DDNA technology may be considered too laborious a technique to be utilised. However, 2DDNA can be applied without the specialised software or equipment that the newer technologies may require. Furthermore, in situations where newer and more technologically demanding methods of genome analysis are not available, 2DDNA can be relied upon to assess genomes that have not been sequenced and can therefore allow for the study of novel genomic sequences. These methods are viable alternatives to the high-throughput technologies currently available such as the microarray; however, it should be noted that

2DDNA techniques are complementary and not replacements for the microarray.

Figure 3.1 Growth of M. tuberculosis H37Ra and H37Rv in enriched broth. M. tuberculosis H37Ra and H37Rv were grown as aerated cultures in roller bottles in 7H9 broth medium supplemented with 10% OADC and 0.05% Tween-80. Growth of the respective strains was monitored over a sixteen-day period via CFU and optical density at 580nm. Data are means ± SEM of three independent populations.

Figure 3.2 Size range of resolution with 2DBGD. One microgram of 100bp ladder (Invitrogen) was run on a

2DBGD gel to illustrate size range of fragments that would be included in a 2DBGD comparison. Here, the smallest fragment included in the ladder (100bp) had run off the gel leaving only the fragments ranging from 200-2072bps to be resolved via 2DBGD.

Figure 3.3 Optimisation of 2DBGD for use with M. tuberculosis. M. tuberculosis genomic DNA was digested overnight with 10U of HinfI, labelled with P-32 and used to generate a 2DBGD profile. A) Prior to optimisation and familiarisation with the 2DBGD technique, few spots could be discerned on a 2DBGD profile of H37Rv. B) Following optimisation of various parameters (Section 3.3.1.1), satisfactory 2DBGD profiles were obtained such as to allow for genomic comparisons between strains.

H37Ra H37Rv

Figure 3.4 Comparison of 2DBGD profiles for M. tuberculosis H37Ra and H37Rv using HinfI digested fragments. 2DBGD profiles for M. tuberculosis H37Ra and H37Rv were generated and compared to assess genomic differences between strains. As the close-ups demonstrate, a fragment of difference at approximately

500bp was observed. However, this difference was not reproduced over three independent populations of DNA.

Figure 3.5 BCGH array generated after optimisation. One microgram each of M. tuberculosis H37Ra and

H37Rv were digested with HinfI and run together on one 2DBGD profile. The fragments were electrophoretically transferred onto nylon membrane for 4.5h at 1.5A. (A) represents the BCGH array, following transfer to the membrane, and (B) the gel demonstrates that most of the labelled genomic DNA has been successfully transferred onto the membrane.

Figure 3.6 Efficiency test of labelled DNA probes to be used with BCGH comparisons. 300ng of genomic

DNA from each of H37Ra and H37Rv were labelled with DIG-dUTP via random-primed labelling. The control used in the detection assay was DIG-labelled Control DNA included in the DIG Random-Prime labelling kit (Roche

Biochemicals). Typically, random-primed labelling reactions with 300ng starting material should yield 20ng/µL of labelled probe. Thus, to assess labelling efficiency, labelling reactions are first diluted to the same concentration of the DIG-labelled control DNA (1ng/µL), and then diluted in the same scheme as the control. The control DNA spots contain, from left to right: 1ng, 10pg, 3pg, 1pg, 0.3pg, 0.1pg, and 0.03pg of DIG-labelled DNA. According to the manufacturer’s specifications, efficiently labelled probes should give a signal in the 0.1pg range (indicated by arrow) to detect single-copy genes on Southern blots. Probes that give a signal in the 0.3pg range are acceptable, as long as a greater proportion of the labelled mixture is added to the hybridisation buffer. With the H37Ra and H37Rv probes however, signals of the labelled probes were not sensitive enough to be used in Southern blot detection.

CHAPTER 4: Bacterial artificial chromosome fingerprint arrays

1. Li, AH, Lam, WL, and Stokes, RW. Array-based expression analysis of Mycobacterium

tuberculosis during interaction with macrophages: characterization of genes differentially

expressed in virulent and attenuated M. tuberculosis identifies candidate genes involved

in intracellular growth. Submitted.

2. Li, AH, Lam, WL, and Stokes, RW. Bacterial artificial chromosome fingerprint arrays

for the differentiation of transcriptomic differences in mycobacteria. Manuscript in

preparation.

4.1 Introduction

Bacterial artificial chromosomes (BACs) are vectors that can carry large portions of an organism’s genome, usually up to 200kb. Due to this property, BACs can be used to examine discrete regions of a genome at any one time. A Mycobacterium tuberculosis H37Rv BAC library was generated by the Stewart Cole laboratory (Pasteur Institute, Paris, France) and was used in the sequencing of M. tuberculosis H37Rv (18). This BAC library consisted of approximately 5000 BAC clones, containing inserts ranging from 25 to 104kb, and provided a 70-fold coverage of the M. tuberculosis genome. The minimal overlapping set contains 68 clones, and spans the entire genome, save for 150kb (0.05%). The library that was made available to us contained seventy-eight BACs containing an average of 68kb (18, 70). The M. tuberculosis BAC library was generated on the pBeloBAC11 plasmid backbone and was carried by the DH10B strain of Escherichia coli.

This BAC library has been used to elucidate genomic differences between the mycobacterial strains M. bovis, M. bovis BCG, and M. tuberculosis H37Rv. One of these differences was the RD1 region of difference, subsequently characterised to have roles in mycobacterial virulence (70, 163, 202). Thus, this is a useful molecular tool to decipher genomic, and possibly expression, differences between strains. For the purposes of this thesis, the H37Rv BAC library was used to screen for expression differences between H37Rv and H37Ra via a technique called

BAC Fingerprint Arrays (BACFA).

To study gene expression, BACFA were generated via restriction enzyme digests of each

BAC and agarose gel electrophoresis of the fragments. The Southern blots of the fingerprints were then hybridised with DIG-labelled cDNA probes generated from RNA extracted from intracellular H37Ra and H37Rv to analyse expression (Figure 4.1). A programme called

Restriction Site Search (Dr. Wan Lam, British Columbia Cancer Research Centre) was used to identify differences. This programme generates in silico digests of BACs with enzymes of our choosing, and also provides a text file that lists the sequences of all fragments generated from the digest. When a band of interest is chosen, the identity of the band can be obtained by referencing the text file and comparing the sequence of the band to the M. tuberculosis genome at NCBI (http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?cmd=Retrieve&db=Nucleotide&list_uids=57116681&dopt=GenBank&WebEnv=0KZ4D76MK1JyipZoRGocw4-HA9%4025612E0E6FED5F70_0138SID&WebEnvRq=1) using the Basic Local Alignment Search Tool (BLAST).
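The kind of in silico double digest described above can be sketched as follows. This is not the Restriction Site Search programme itself, whose implementation is not described here; it is a minimal illustration that assumes a linear sequence and uses the known blunt-cutting recognition sites of PvuII (CAG^CTG) and StuI (AGG^CCT). The function names and the toy sequence are hypothetical.

```python
# Recognition site and cut offset (bases from the start of the site).
ENZYMES = {
    "PvuII": ("CAGCTG", 3),  # blunt cutter, CAG^CTG
    "StuI": ("AGGCCT", 3),   # blunt cutter, AGG^CCT
}


def cut_positions(seq: str, site: str, offset: int) -> list:
    """All cut coordinates for one enzyme on a linear sequence."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, enzyme_names) -> list:
    """Digest `seq` with all named enzymes at once.

    Returns (start, end, fragment_sequence) tuples in genome order, so a
    band of interest can be traced back to its sequence and coordinates.
    """
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        cuts.update(cut_positions(seq, site, offset))
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [(a, b, seq[a:b]) for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy sequence containing one PvuII site and one StuI site.
toy = "AAAA" + "CAGCTG" + "TTTTT" + "AGGCCT" + "GG"
fragments = digest(toy, ["PvuII", "StuI"])
```

Each returned fragment sequence could then be compared against the reference genome, for example by BLAST, to identify the genes carried on a given band.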

4.2 Rationale

To gain insight into the attenuated growth of H37Ra inside the macrophage (as compared

to H37Rv) we sought to analyse expression differences between the strains during their

interactions with murine BM-MФ. As microarray technology was not readily available to us at

the time this study was initiated, an alternative method based on the M. tuberculosis H37Rv

BAC library was instead applied. Expression analysis using BACFA is less technically demanding than the 2DDE techniques introduced in Chapter 3, allows for faster analysis than

2DDE techniques, and was modified successfully for use with non-radioactive probes.

4.3 Growth of Mycobacterium tuberculosis in association with bone-marrow

derived macrophages

Before analysis of intracellular expression could proceed, the growth of the respective bacteria in association with host macrophages needed to be assessed to determine if there was indeed a difference between growth rates of the two strains as predicted. Furthermore, enumeration of intracellular bacteria over the course of the infection would also give insight into how much RNA could be harvested from intracellular bacteria, and which time point in the infection would be most amenable to maximal RNA harvest.

When grown in minimal PB +T media, the growth of M. tuberculosis H37Ra and H37Rv was not significantly different when their respective doubling times during logarithmic growth were compared (H37Ra: 23.0 ± 1.1 hours, H37Rv: 24.1 ± 2.4 hours, P=0.6973; Figure 4.2). However, a significant difference in growth was observed between the two strains within murine BM-MФ (Figure 4.3). When growth of intracellular bacteria was assessed at day 7 (168h) following infection of BM-MФ at a multiplicity of infection of 10:1, the doubling time of H37Rv was 31.2 ± 1.3 hours, whereas the doubling time of H37Ra was 51.1 ± 1.4 hours

(P=0.014). While not significant, a difference in growth between strains was observed on day 4

(96h) post-infection (for H37Ra and for H37Rv, P=0.1058, Figure 4.3). This difference represents an inhibition of growth of the attenuated strain when inside macrophages, but not when grown in broth culture. Subsequently, it was decided that the RNA to be used in expression profiling would be harvested at 168h post-infection as this was the time-point in the

infection which represented the greatest difference between the two strains and also facilitated the harvest of an abundant amount of RNA. RNA harvested from intracellular bacteria using a differential lysis protocol by Mangan and Butcher (130) was not contaminated with eukaryotic macrophage RNA (Figure 4.4). It should be noted, however, that even if the total RNA harvested from intracellular bacteria were contaminated with eukaryotic RNA, the main concern would be the eukaryotic RNA competing for reagents (e.g. primers, enzyme and labelled nucleotides) thus decreasing the efficiency with which the bacterial RNA could be reverse transcribed and subsequently labelled. It would be unlikely for these eukaryotic amplicons to find a hybridisation target on the arrays generated from mycobacterial DNA.

4.4 Optimisation of bacterial artificial chromosome fingerprint array

methodology in expression profiling

4.4.1 Reliability of BACFA technique

Before BACs were used to generate BACFAs for use in expression profiling, a scaled-down version of the BACFA was used to determine if genomic profiling could be successfully accomplished. For this purpose, the RvD2 region of difference previously characterised via the genomic comparisons between M. tuberculosis H37Ra and H37Rv was chosen (19). RvD2 was initially identified when genomic sequences of M. bovis BCG and M. tuberculosis H37Rv were compared (70), and this region was subsequently found to be present in H37Ra, albeit with slight nucleotide differences between the H37Ra and BCG sequences (19, 110, 111).

RvD2 in M. tuberculosis was sequenced by Brosch et al. (19), and using primers published in that report, a DIG-labelled probe specific to the RvD2 region in H37Ra was synthesised. This probe was hybridised to BACFAs generated with fragments from PvuII and

StuI digestions of the following: whole genomic DNA from H37Ra and H37Rv, BACs from the

H37Rv library that encompassed the region of the genome where the RvD2 deletion occurred, as

well as BACs from a region of the H37Rv genome wholly separate from the RvD2 region of

difference (Figure 4.5). As shown in Figure 4.5B, other than the lane where an aliquot of the

RvD2 PCR reaction was run, only the lane run with digested H37Ra genomic material gave a hybridisation signal. This result demonstrated that BACFAs could be used to assess differences between strains.

4.4.2 Generation of DIG-labelled probes for use with BACFAs

4.4.2.1 Selection of primers for use in the generation of DIG-labelled probes.

Bacterial expression studies utilising cDNA as a means to compare the transcriptomic

profiles of different experimental conditions commonly utilise random hexamers to generate

cDNA from RNA. The Oligo-dT primers commonly used for eukaryotic expression studies were

demonstrated to be of limited use in mycobacterial expression analysis because the primers

failed to generate representative cDNA populations compared to random primers and arbitrary

primers (108). The theory behind the use of random primers is to generate a pool of cDNA that

would be most representative of the RNA population from which it was derived. Hence, initial attempts at generating DIG-labelled cDNA probes for use in expression profiling using BACFA involved the use of primers that had a defined sequence with a sequence of random hexamers at the 3’ end. This set of primers would offer the advantage of random priming, maximising the representation of the cDNA pool during reverse transcription, while the defined sequence at the 5’ end would allow for labelling the cDNA with DIG-dUTP via PCR.

The primers initially selected for the generation of cDNA probes were those used previously in a study describing the transcriptomic differences between intracellular and broth-grown H37Rv via a method called Selective Capture of Transcribed Sequences or SCOTS (72). Using this set of primers, cDNA was generated from RNA and DIG-labelling was attempted, but was not successful, as no products could be seen via agarose gel electrophoresis when an aliquot of the reaction was analysed. Hybridisation was nevertheless attempted, as it was assumed that a fully representative pool of probes may not necessarily contain enough of any one transcript to allow for visualisation on an agarose gel stained with ethidium bromide. However, no hybridisation profiles were obtained.

The lack of hybridisation was not likely due to poor detection, as reagents used above for the genomic profiling were the same used for expression studies. One reason might have been that while the primers with degenerate 3’ ends enabled the construction of a wholly representative cDNA probe population, this population may be too complex for any one transcript sequence to be labelled efficiently with DIG-dUTP. Thus, I sought to decrease the complexity of the cDNA pool, i.e. reduce the number of unique cDNAs reverse transcribed from

RNA, such that with fewer unique sequences, each transcript now would have an increased chance at being labelled with DIG-dUTP. To reduce complexity, the primer previously used to generate random-amplified PCR fingerprints of M. tuberculosis was applied (130). This was a primer twenty-four nucleotides in length, and was of a defined sequence with no degenerate nucleotides in its sequence, henceforth called Uniprime. Initially, random primers had been used for reverse transcription, and Uniprime for subsequent PCR, but results from such a combination were not satisfactory. Instead, Uniprime was used for both reverse transcription and downstream

PCR, and this combination was successful at generating hybridisation profiles with BACFA using RNA from broth-grown M. tuberculosis.

4.4.2.2 Amount of template into each labelling reaction

The starting amount of template DNA added to a PCR reaction can influence the efficiency of the reaction. Thus, various amounts of template DNA were added to DIG-labelling PCR reactions using the Uniprime primer (3.6ng, 25ng, 50ng, 100ng, and 200ng) to determine the amount of template that would yield a complex population that would still be labelled efficiently. The reaction with

50ng of template DNA gave the most diverse banding pattern (Figure 4.6). Thus, subsequent

DIG-labelling reactions would use 50ng of cDNA to generate DIG-labelled cDNA probes.

4.5 Results

4.5.1 Differences isolated via BACFA include genes reported in expression studies as well as novel differences previously unreported.

To isolate differences between strains, BACFAs generated with the enzymes PvuII and StuI were hybridised in duplicate with three different pools of DIG-labelled cDNA probes derived from intracellular H37Ra and H37Rv. The presence or absence of bands in the hybridisation profiles, which in some cases was associated with marked changes in band intensity, was designated as an expression difference in BACFA analysis. Only differences seen in all three cDNA populations were chosen as candidates.
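The selection rule, retaining only differences seen with all three probe pools, can be illustrated as a set intersection. The following is a minimal sketch in Python and not the procedure used to score the blots; the fragment labels and pool contents are hypothetical placeholders, and `consistent_differences` is a name introduced here purely for illustration.

```python
def consistent_differences(pools):
    """Return the differences that were observed in every probe pool."""
    pool_sets = [set(pool) for pool in pools]
    if not pool_sets:
        return set()
    return set.intersection(*pool_sets)


# Hypothetical scoring: (fragment label, strain showing the stronger band).
pool_1 = {("frag_A", "H37Ra"), ("frag_B", "H37Rv"), ("frag_C", "H37Rv")}
pool_2 = {("frag_A", "H37Ra"), ("frag_B", "H37Rv")}
pool_3 = {("frag_A", "H37Ra"), ("frag_B", "H37Rv"), ("frag_D", "H37Ra")}

candidates = consistent_differences([pool_1, pool_2, pool_3])
print(sorted(candidates))
```

A difference scored in only one or two pools (frag_C and frag_D in this made-up example) is excluded, which mirrors the conservative selection applied before qPCR confirmation.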

In some cases, a second set of enzymes, StuI and SalI, was used to generate new BACFAs to further investigate candidate differences in large fragments which contained several genes. For example, in one case, a 2.4kb fragment displayed a band of higher intensity when hybridised with H37Ra probes versus H37Rv probes (Figure 4.7A). To narrow down the potential candidate genes to be confirmed via qPCR, StuI and SalI were used to generate a second set of BACFAs, which were hybridised to DIG-labelled cDNA probes. Here, a band of 1.49kb was more intense with H37Ra probes than with H37Rv probes (Figure 4.7B). This fragment was found to contain the genes frdB, frdC, frdD, and Rv1556c, which encodes a putative regulatory protein. After hybridisation with three independent pools of DIG-labelled probes (hybridisations performed in duplicate), ten fragments containing twenty genes were consistently observed to be differentially expressed between H37Ra and H37Rv. All twenty genes were then selected for subsequent qPCR analysis to confirm the expression trends seen in BACFA analysis (Table 4.1).

4.5.2 Quantitative real-time PCR (qPCR) confirmation of candidates selected after BACFA analysis

Three independent pools of RNA from intracellular M. tuberculosis at 168h post-infection were reverse transcribed as described for BACFA analysis and assessed via qPCR (Table 4.2). Expression trends as seen in BACFA were confirmed for frdB, frdC, frdD, pks2, aceE, and Rv1571 (Figure 4.8A). frdB, frdC, and frdD all encode subunits of the fumarate reductase enzyme complex, and are found in the frd operon (32). Another component of the operon is frdA, whose gene product, FRD-A, together with FRD-B, forms the catalytic domain of the FRD complex. Even though frdA was not identified via BACFA, it was included in the qPCR analysis. frdA was also found to be expressed at a higher level in H37Ra than in H37Rv at 168h post-infection (Figure 4.8A).

Fumarate reductase catalyses the conversion of fumarate into succinate, and has been demonstrated in vitro to also catalyse the reverse reaction, the oxidation of succinate to fumarate (68), normally carried out by the succinate dehydrogenase (SDH) enzyme complex. Previous studies of sdh mutants in E. coli have suggested that fumarate reductase can partially compensate for a lack of SDH activity (73). However, qPCR assessment of sdh in H37Ra and H37Rv indicated that sdh is expressed in both strains, and that expression of the genes in the sdh operon (sdhA, sdhB, sdhC, and sdhD) did not differ between H37Ra and H37Rv at 168h post-infection (Figure 4.8B).

4.5.3 Assessment of candidate gene expression profiles in broth cultures and at 4h and 96h post-infection.

The 168h post-infection time point was chosen primarily because it was the time point at which a sufficient amount of RNA could be harvested for BACFA analysis. However, there remained the possibility that the significant differences seen between H37Ra and H37Rv at 168h post-infection were dependent on events at earlier time points. The expression profiles of the candidate genes identified as having expression differences at 168h post-infection were therefore also assessed at 4 and 96h post-infection (Figure 4.9A). Expression profiles of the genes were also assessed in broth cultures to determine whether these differences are inherent between strains or are more pronounced during interactions with the host macrophage (Figures 4.9B, 4.9C). Trends revealed via qPCR analysis indicate that genes of the frd operon are indeed expressed at higher levels in H37Ra at 168h post-infection; however, at 4 and 96h these genes are expressed at a higher level in the virulent H37Rv (Figure 4.9A). qPCR analysis also confirmed the upregulation of pks2, Rv1571, and aceE in intracellular H37Rv compared to H37Ra at 168h post-infection (Figure 4.9A).

Expression of all frd operon genes was elevated in bacteria interacting with the macrophage versus bacteria grown in enriched broth, with the exception of frdD, which was expressed at similar levels in both broth-grown and intracellular H37Rv (Figure 4.9C).

The upregulation of pks2 in macrophage-associated M. tuberculosis over that of broth-grown bacteria was maintained in H37Rv from initial interaction throughout the infection (Figure 4.9C). This trend was not detected in macrophage-associated H37Ra at the initial interaction at 4 hours post-infection, but the expression of pks2 was observed to increase over the duration of the infection (Figure 4.9B).

aceE was observed to display higher levels of expression in macrophage-associated bacteria than in broth-grown bacteria, steadily increasing over the infection (Figures 4.9B, 4.9C). Moreover, levels of aceE transcript were seen to be higher in H37Rv versus H37Ra at all time points assessed during the infection, as well as in broth-grown H37Rv (Figure 4.9A).

Lastly, Rv1571 encodes a conserved hypothetical protein. Expression of its transcript is upregulated in intracellular bacteria (Figures 4.9B, 4.9C), and is higher in virulent H37Rv than in H37Ra at 4 and 168h post-infection (Figure 4.9A).

4.5.4 Growth of M. tuberculosis under oxygen-limiting conditions

The intracellular environment encountered by M. tuberculosis presents several challenges to the bacteria, including oxidative molecules, low availability of nutrients, and lower oxygen availability (57, 75). Because fumarate reductase is an important enzyme in fumarate respiration, an energy production pathway relied upon by bacteria exposed to anoxic environments, we proposed to assess the expression of mycobacterial frd when the H37Ra and H37Rv strains are grown in the microaerophilic model previously described by Wayne (229). It was hypothesised that such a study would determine whether frd expression is connected to anoxic conditions rather than to another macrophage effect. Prior to expression profiling, growth of the respective strains under microaerophilic conditions was first assessed to determine whether growth was affected by lowered oxygen. Doubling times of H37Ra and H37Rv grown under these conditions in PB+T (40.7 ± 1.3 hours for H37Ra; 45.9 ± 7.9 hours for H37Rv) differed significantly from those of the same strains grown as aerated roller bottle cultures (23.49 ± 2.4 hours for H37Ra, P = 0.0033; 20.3 ± 1.1 hours for H37Rv, P = 0.0325). However, the growth rates of the oxygen-limited cultures of H37Ra and H37Rv were not significantly different from each other (P = 0.297). All doubling times were calculated at Day 5 (120 hours). This experiment was reproduced with the mycobacterial strains grown in 7H9 enriched broth medium, with similar trends (Figure 4.10).
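As an illustration of how a doubling time can be derived from two measurements, the sketch below assumes simple exponential growth between an initial and a final reading. This formula is an assumption made for illustration, since the exact calculation is not restated in this section; the function name and the input values are hypothetical and are not data from this study.

```python
import math


def doubling_time(n_initial, n_final, elapsed_hours):
    """Doubling time in hours, assuming exponential growth between two readings."""
    if n_initial <= 0 or n_final <= n_initial:
        raise ValueError("readings must be positive and show net growth")
    # Number of doublings is log2(n_final / n_initial).
    doublings = math.log(n_final / n_initial) / math.log(2)
    return elapsed_hours / doublings


# Hypothetical example: an eight-fold increase over 120 hours (Day 5)
# corresponds to three doublings.
example = doubling_time(1.0e6, 8.0e6, 120)
print(round(example, 1))
```

Under this assumption, a slower-growing culture shows fewer doublings over the same 120 hours and therefore a longer doubling time.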

4.5.5 Proteomic analysis of the fumarate reductase enzyme complex in cell lysates derived from M. tuberculosis grown under oxygen-limiting conditions.

4.5.5.1 Biochemical analysis of fumarate reductase enzyme activity

Fumarate reductase is a well-studied enzyme complex in bacteria such as E. coli and Wolinella succinogenes, and its biochemical activity has also been assayed in Mycobacterium phlei (67). Thus, having obtained the growth characteristics of the H37Ra and H37Rv strains under low-oxygen conditions, I sought to biochemically characterise the activity of the FRD complex in whole cell lysates of M. tuberculosis H37Ra and H37Rv grown under both low-oxygen and aerated conditions. Applying assay conditions prescribed by Lemire and Weiner (115), E. coli cell lysates were successfully assessed for the presence of FRD. However, applying the same protocol to mycobacterial cell lysates gave rise to inconsistent results. FRD is an enzyme complex that is sensitive to oxygen inhibition; thus, the enzyme assay should preferably be done in an anoxic environment. Although ambient conditions were described as acceptable, there was a limited time frame before the enzyme would be destabilised in the presence of air. The assay was therefore attempted in vessels with limited head-space, and conducted with minimal manipulations to ensure completion of activity assessments in a timely manner. However, this did not result in reliable assessments of enzymatic activity. Furthermore, it is likely that the preparation, processing, and transport of whole cell lysates from a Containment Level (CL) 3 laboratory setting to CL2 settings introduced oxygen deactivation of the FRD complex at the onset of the study. Additional protocols for biochemical assessments of FRD enzymatic activity relied on far greater amounts of protein than could be feasibly harvested from cultures growing under low-oxygen conditions with the present infrastructure at our CL3 facility.

4.5.5.2 Western blot analysis of fumarate reductase in whole cell lysates of M. tuberculosis

As the biochemical analysis of the FRD enzyme complex was unsuccessful, I turned to using an antibody raised against components of the FRD complex to detect its presence in the cell lysates, in the hope that the presence of FRD, even if inactive, might indicate a difference not only between strains, but also between culturing conditions.

No mycobacterial-specific antibodies to components of FRD are available; however, an antibody was available that had been raised against the FRD-A and FRD-B subunits of E. coli FRD, the subunits that comprise the catalytic domain of the enzyme complex. Unfortunately, E. coli FRD has only 54.7% amino-acid identity to M. tuberculosis FRD. A western blot of E. coli lysates and lysates from both M. tuberculosis H37Ra and H37Rv grown as static cultures (which did show expression of frd transcripts; Figure 4.12) was probed with the antibody (Figure 4.11). A binding pattern was obtained with the antibody; however, an unexpected abundance of bands was seen with the E. coli lysates, not just those expected at 70kDa and 30kDa, corresponding to FRD-A and FRD-B, respectively (Figure 4.11). This was possibly due to degradation products of the complex. Furthermore, faint bands could be noted in the lanes containing M. tuberculosis proteins; however, these bands did not correspond to the expected sizes of M. tuberculosis FRD-A and FRD-B at 64kDa and 27kDa, respectively (Figure 4.11). Thus, western blotting with this antibody did not appear to be able to detect FRD in M. tuberculosis.

4.5.6 qPCR analysis of frd transcripts in M. tuberculosis grown under oxygen-limiting conditions

Concurrent with the growth rate assessments of oxygen-limited cultures, RNA from these cultures and their aerated counterparts was also harvested, and levels of the frd transcripts were assessed. Transcripts from the static cultures used to inoculate the low-oxygen and aerated cultures (henceforth referred to as seed cultures) were also assessed to determine if changes after inoculation were pre-existing and strain-specific. Seed cultures of M. tuberculosis H37Ra displayed higher levels of the frd transcripts compared to seed cultures of H37Rv (Figure 4.12A). frdA and frdB were expressed at a higher level in H37Rv shortly after the new cultures were inoculated, at 4h (Figure 4.12A). frdA was also higher in H37Rv at 24h (Figure 4.12A).

When frd transcripts in oxygen-limited cultures were compared with those in their respective aerated counterparts, frd genes were expressed at a greater level in oxygen-limited cultures of the virulent H37Rv (Figure 4.12C), but such differences were not replicated in the comparison of oxygen-limited versus aerated H37Ra. The increase in frd transcripts in the low-oxygen cultures was more moderate in H37Ra, and levels of frdA and frdB were lower at the initial time points (Figure 4.12B). Using the Student’s t-test, only the expression of frdA at 4h was significantly different between strains (P = 0.036); no other transcript differed significantly between strains.
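For readers unfamiliar with the test, the sketch below shows how a Student's t statistic and a two-tailed P value can be computed for two small groups of replicates. It assumes an unpaired, equal-variance, two-tailed test, which is an assumption, as the variant used is not specified in this section. The replicate values are hypothetical and are not data from this study, and the function names are introduced here for illustration only.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired, equal-variance t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value, integrating the t density with Simpson's rule."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central_area)


# Hypothetical relative expression values from three replicates per strain.
h37ra = [1.0, 1.2, 0.9]
h37rv = [2.1, 2.6, 1.8]
t_stat, dof = student_t(h37ra, h37rv)
print(t_stat, dof, two_tailed_p(t_stat, dof))
```

With only three replicates per group the test has four degrees of freedom, so a fairly large difference in means relative to the replicate scatter is needed before P falls below 0.05.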

4.5.7 Growth of M. tuberculosis in association with macrophages treated with mercaptopyridine-N-oxide (MPNO), an inhibitor of fumarate reductase.

4.5.7.1 Effects of MPNO on macrophage viability.

Mercaptopyridine-N-oxide (MPNO) is a common antimicrobial and antifungal ingredient found in household products. Furthermore, MPNO has been studied and characterised as a fumarate reductase inhibitor in protozoa and is the precursor to L921-021, a drug developed and implemented for the treatment of protozoan infections (143). To examine the effects of MPNO on the intracellular growth of M. tuberculosis H37Ra and H37Rv, we used the concentration of the inhibitor previously used to inhibit the growth of the intracellular parasite Trypanosoma cruzi without adverse effects on mammalian cells (219). However, it was noted in our study that even at this concentration, the BM-MΦ underwent a phenotypic change. The cells no longer displayed the characteristic phagocytic morphology, but rather were rounded up. When uninfected BM-MΦ were stained with trypan blue, it was observed that MPNO resulted in a 25-30% reduction in cell viability. This prompted an assessment of the effects of various concentrations of MPNO on the viability of uninfected BM-MΦ.

A gradient of MPNO concentrations (0.4, 1.2, 2, 2.4, 2.8, 3.6, and 4.8µM) was added to uninfected macrophages, and cell viability under each condition was assessed at 24, 48, 72, 96, 120, and 168h after adding the inhibitor. From this, it was noted that the viability of treated macrophages was highest with 2.4µM, the concentration previously used in the study of intracellular trypanosomes (219). However, when the trypan blue-stained cells treated with MPNO were compared with untreated macrophages soaked in 70% EtOH, which acted as a positive control for the stain, it was noted that the MPNO-treated cells were not stained as intensely, and were destained within one to two minutes compared to the ten-plus minutes required for the fixed macrophages to begin to lose some of the stain. This observation was made at all concentrations of MPNO treatment.

4.5.7.2 Effect of MPNO on intracellular growth of M. tuberculosis.

To assess the importance of fumarate reductase for intracellular growth of mycobacteria, macrophages were infected at an MOI of 10 bacteria to 1 macrophage, and then cultured in media supplemented with 2.4µM of the FRD inhibitor, mercaptopyridine-N-oxide (MPNO) (Figure 4.13). The growth of H37Ra and H37Rv in untreated infected macrophages was similar to that obtained previously (Figure 4.3), with doubling times (in hours) significantly different between the two strains at both 96h (H37Ra 51.5 ± 1.7, H37Rv 32.8 ± 0.9; P = 0.0007) and 168h post-infection (H37Ra 42.1 ± 1.6, H37Rv 31.3 ± 0.8; P = 0.0041). There was a significant reduction of H37Ra and H37Rv CFUs in macrophages treated with MPNO with respect to their untreated counterparts at 96 and 168h post-infection (P < 0.0001 for all populations compared; Figure 4.13). Comparing calculated death rates as previously described (94, 166) revealed a higher rate of killing of intracellular H37Ra than of H37Rv in MPNO-treated macrophages at both 96h (H37Ra death rate, 0.025 ± 0.001; H37Rv death rate, 0.02 ± 0.001) and 168h (H37Ra death rate, 0.013 ± 0.001; H37Rv death rate, 0.009 ± 0.001). However, this difference did not reach statistical significance (96h, P = 0.051; 168h, P = 0.054).

4.6 Discussion and summary

In this chapter, an alternative method of assessing expression differences in mycobacteria, Bacterial Artificial Chromosome Fingerprint Array analysis, has been examined. As demonstrated here, this technique was able to identify both previously reported and novel, previously undescribed differences in expression, which were subsequently confirmed by qPCR. BACFA, due to its nature and cost-effectiveness, could be termed the “poor man’s microarray” in that, like the microarray, it utilises labelled probes to detect target genes immobilised on a solid substrate. However, unlike microarray technology, BACFA can be performed by any laboratory that has access to standard equipment for gel electrophoresis and Southern blots. It does not require highly specialised equipment, expensive reagents, or the large amounts of RNA that would be required for microarray assessments. Furthermore, unlike microarrays, BACFAs can be re-used, as the digoxigenin-dUTP-labelled probes can be stripped and re-used, or the array may be rehybridised with a new probe. Additionally, microarray technology commonly utilises ORFs present only in the reference strains selected by the manufacturers, which may overlook novel genomic sequences present in other strains or intergenic sequences that may be informative. BACFA can allow for the analysis of these novel genomic sequences, provided that BACs spanning the region of interest are available. Overall, BACFAs can be a viable, complementary alternative to microarray technology, but not a substitute, for there are considerations that may limit their use.

For this study, a 24-mer (130) was used as the primer for both reverse transcription and subsequent PCR labelling of BACFA cDNA probes. This particular primer was chosen because random hexamers, as well as the primers with random hexamers at the 3’ end previously used with the SCOTS method (72), did not allow, in this study, the generation of labelled probes that could be used in expression analysis. It is recognised that the primer chosen was of a defined sequence, and this could have biased the results seen here. However, given the differences that were elucidated with this technique, and the distribution of these differences across various aspects of mycobacterial cell function, it could be said that such biases can be tolerated. Further, it is acknowledged that the biases introduced with the use of this primer also reduce the number of differences seen with BACFA analysis. However, to obtain an effectively labelled pool of cDNA probes, complexity in the population of probes had to be sacrificed for efficiency of labelling. As stated above, neither random hexamers nor a primer with degenerate nucleotides at the 3’ end yielded an efficiently labelled pool of cDNA probes for downstream expression analysis. It follows that if a primer of similar length, but different sequence, were to be used, a different snapshot of the transcriptomes could be obtained. One possibility is to use genome-directed primers (211) in separate RT and PCR applications, and pool these probes, enabling a more complex analysis of the differences between strains.

A second consideration that impacts the use of BACFA in expression analyses is that, even after careful choice of an appropriate set of restriction enzymes, it is unavoidable that fragments containing more than one gene are included in the BACFAs. Thus, unlike microarrays, where one signal equates with one gene, a signal in BACFA analysis may be the result of several genes. This complication requires resolution by generating alternate BACFAs with different sets of enzymes. At the same time, this limitation is also an advantage of the BACFA method, as it allows for the identification of genes present on the same fragment that may be co-regulated. In the case of the frd operon isolated in this study, each gene on its own may not have given a signal intense enough to warrant further analysis. However, as three genes of the upregulated operon were present in one fragment, their combined signals led to the isolation of this operon. Ultimately, BACFA analyses can be a viable alternative to, if not a substitute for, other expression analysis techniques, particularly if cost and specialised facilities are of concern.

Differences in the expression of pks2 (a gene whose product is thought to be involved in sulpholipid synthesis as well as other lipid synthesis roles (197)) have been previously reported in transcriptome comparisons of broth-grown versus macrophage-associated H37Rv (72). Similarly, we found that pks2 expression was upregulated in intracellular H37Rv and H37Ra versus their broth-grown counterparts. Additionally, we report for the first time that pks2 expression is higher in intracellular H37Rv than in H37Ra. The expression difference in this particular gene is expected, as it encodes products that may contribute to the colony morphology differences seen between the two strains.

Among the previously unreported differences isolated using BACFA and subsequently confirmed via qPCR analyses were the genes Rv1571, aceE, and frdD. aceE, seen to be upregulated in the virulent M. tuberculosis H37Rv throughout the course of infection, encodes pyruvate decarboxylase, otherwise known as the E1 subunit of the pyruvate dehydrogenase complex (PDC). The PDC catalyses the conversion of pyruvate to acetyl-CoA, which feeds into the tricarboxylic acid (TCA) cycle, one of the main pathways of cellular respiration and biosynthesis in both eukaryotes and prokaryotes (122). However, this E1 subunit can act independently of the PDC in glycolysis, converting pyruvate into acetaldehyde and a molecule of CO2 (122). aceE upregulation in H37Rv is not surprising given previous studies that have shown that H37Rv displays an increased respiration rate compared to H37Ra (80). aceE upregulation can support this difference in respiration through the conversion of pyruvate into acetyl-CoA. Furthermore, although both H37Ra and H37Rv have been shown to rely on both glycolytic and oxidative means of glucose metabolism, H37Rv was seen to rely more heavily on glycolysis (167). Again, aceE upregulation may support this observation, as its gene product, pyruvate decarboxylase, can act independently of the PDC in glycolysis; thus, its higher level of expression in the virulent strain correlates with the greater usage of the glycolytic pathways of glucose metabolism in H37Rv.

An additional difference found with BACFA analysis and confirmed with subsequent qPCR analysis was that of the components of the fumarate reductase complex (frdA, B, C, and D). Fumarate reductase (FRD) is an enzyme complex which catalyses the conversion of fumarate into succinate, and is composed of four subunits: FRD-A and FRD-B, which comprise the catalytic domain, and FRD-C and FRD-D, which comprise the anchoring domain. Fumarate respiration is an alternative means of acquiring energy, utilising fumarate as the terminal electron acceptor when oxygen or nitrate (NO3-) is absent. BACFA analysis of 168h post-infection transcripts revealed an upregulation of frdB, frdC, and frdD in H37Ra versus H37Rv. Drawing on this isolated observation could have led to an interpretation where H37Ra found the intracellular environment far more stressful than did H37Rv, such that it relied on a less favourable method of energy production. Alternatively, H37Ra may have been deficient in its normal complement of TCA cycle enzymes. It was previously found in E. coli that FRD can partially compensate for missing succinate dehydrogenase (SDH) activity (73). PCR analysis of the sdh operon in both H37Ra and H37Rv, however, showed no genomic or expression differences between strains (Figure 4.9B).

In mycobacterial studies, frdA has been found to be upregulated in M. tuberculosis interacting with macrophages versus M. tuberculosis grown in broth (188) and in M. tuberculosis grown under carbon starvation (8). Microarray studies examining the transcriptome in stationary phase M. tuberculosis H37Rv cultures found an increase of frdB and frdC transcripts versus bacteria in exponential growth (87). Additionally, investigations into the respiratory behaviour of M. phlei found that FRD activity increased four-fold when bacteria were grown under low oxygen conditions (67). When we examined the expression of the frd operon over the duration of the infection period at 4, 96 and 168h post-infection, the trend observed was instead one where expression was initiated and peaked earlier in the virulent H37Rv strain than in the attenuated H37Ra. Assessing the entire duration of the infection revealed a scenario where H37Rv responded more quickly to the challenge of the intracellular environment, possibly due to anoxic conditions encountered inside the macrophage (188). In contrast, the H37Ra response was markedly delayed.

To address the role of FRD under anoxic conditions, we sought to characterise FRD activity and frd transcripts in cultures of M. tuberculosis H37Rv and H37Ra grown as both unaerated-static (oxygen-limited) and aerated roller-bottle cultures. Unfortunately, protein could not be harvested in sufficient amounts to analyse fumarate reductase activity by biochemical and western blot methodologies, directing our focus to the analysis of frd transcripts. It is important to note that, although we recognise that RNA transcription correlates poorly with protein translation (76, 101), transcriptomics still offers a glimpse into the initial responses of an organism to an environmental change. When grown under low-oxygen conditions, the attenuated strain displayed a lag in expression of the genes for the catalytic domain of FRD (frdA, frdB) but not of those for the anchoring domain (frdC, frdD). The data do not duplicate exactly the trends seen with intracellular bacteria, likely because the bacterium encounters multiple challenges inside host cells, whereas here the bacteria were subjected to only one environmental challenge. However, the data are still interesting in that, once again, a lag in response to an environmental change is observed with the attenuated H37Ra. As fumarate reductase is a complex composed of four components, all four are necessary for activity of the enzyme. As such, although H37Ra is observed to transcribe the genes encoding the anchoring domain at an even higher level than H37Rv, the lag in transcription of the catalytic domain would limit the function of the complex as a whole.

With the transcriptomic data suggesting a role for fumarate reductase in intracellular survival, we investigated the effect of a fumarate reductase inhibitor, MPNO, on the growth of intracellular M. tuberculosis. Fumarate reductase is an enzyme complex that is not found in mammalian cells; thus, fumarate reductase-specific inhibitors would be expected to have negligible adverse effects on mammalian cells. With regard to other pathogenic organisms, FRD has been considered a target in the treatment of Helicobacter pylori, as it was found to be essential for the establishment of H. pylori colonisation of the mouse stomach (10, 66). Furthermore, FRD has also been a popular and successful target in the treatment of protozoan and helminth infections (21, 27, 153, 162, 219, 220).

One compound, mercaptopyridine-N-oxide (MPNO), has been used successfully to control intracellular growth of Trypanosoma cruzi (219). In that previous study, no adverse effects were reported for the concentration of MPNO used to inhibit FRD in the intracellular parasites (219). In our study, however, an effect was noted using this inhibitor. Cell morphology changed with the addition of MPNO, and exclusion of trypan blue dye was affected. Treated macrophages were seen to slowly excrete the dye rather than retaining it and staining an intense blue, as did the control macrophages fixed with 70% EtOH. Therefore, it was theorised that although treated macrophages were viable, their efflux systems had been affected by MPNO treatment. As MPNO targets an enzyme complex involved in energy production, it may have also affected the function of the macrophage efflux system.

The significant reduction in intracellular bacteria in MPNO-treated macrophages throughout the study was presumably due to the inhibitory effect of MPNO on bacterial fumarate reductase. Therefore, the dramatic effect of the FRD inhibitor on mycobacterial growth further bolsters the hypothesis that fumarate reductase is an important complex that aids M. tuberculosis in its intracellular lifestyle. There remains, however, the question of why fumarate reductase is required. There is the possibility that fumarate respiration is relied on by intracellular bacteria for an additional energy boost, but there is also the possibility that the bacteria are in need of succinate or succinyl-CoA as a substrate for biosynthesis. For example, succinate can be oxidised to oxaloacetate to serve as the carbon skeleton for amino acids, or converted via oxaloacetate and phosphoenolpyruvate to glucose (124). Perhaps Mycobacterium tuberculosis exploits fumarate respiration both as an energy source and as a source of substrates for biosynthesis.

In this chapter, an alternative means to study transcriptomes of intracellular bacteria has been described. BACFA analysis can be utilised in a multitude of applications as it is simple and effective in its ability to resolve expression differences. More importantly, using BACFA, we have isolated differences between virulent and attenuated strains of M. tuberculosis that may ultimately help to explain why these highly related bacteria have such different phenotypes when interacting with the host. Lastly, we have also described the characterisation of an enzyme complex that in other organisms provides energy and substrates under anoxic conditions, and which, as seen in this study, plays a supportive role in M. tuberculosis survival in macrophages.

4.7 Future directions

Using the BACFA approach to examine expression profiles of intracellular mycobacteria, we were successful in identifying genes that could explain, at least in part, the increased fitness of the virulent strain H37Rv versus that of the attenuated H37Ra strain inside the macrophage. To further apply this technique in global expression studies, cDNA should be generated with multiple sets of primers to ensure unbiased reverse transcription and amplification of transcripts. Thus, more candidates may be identified through the use of alternate sets of primers such as genome-directed primers used previously in mycobacterial expression studies (211-213).

Additionally, a direct comparison of BACFA with microarray technology could be conducted using the arbitrary primers described here for BACFA analysis to generate CyDye-labelled probes for microarray studies. It would be interesting to determine if the differences seen here in the BACFA studies would be found if the substrate for hybridisation were changed.

Lastly, given the data obtained so far on the role of fumarate reductase in intracellular survival, further investigation of this enzyme complex would contribute to our understanding of the intracellular lifestyle of M. tuberculosis. Firstly, construction and complementation of site-directed mutants of the frd operon would definitively determine the role of the FRD complex in infection. Interestingly, studies examining genes essential for optimal growth in both mice and murine macrophages did not highlight any of the frd mutants as required for optimal growth (170, 185). Secondly, further characterisation of the effect of fumarate reductase inhibitors on the growth of intracellular mycobacteria, particularly in vivo, could further justify FRD as a target for chemotherapeutic research. Additionally, other FRD inhibitors that have been more extensively studied than MPNO could be tested for their effects on M. tuberculosis and host cells to identify more effective inhibitors of M. tuberculosis with less toxicity to host cells.


Expression       Gene              Function

H37Ra > H37Rv    bioB (Rv1589)     biotin synthetase
                 bioD (Rv1570)     dethiobiotin synthetase
                 frdB (Rv1553)     fumarate reductase (Iron-Sulphur subunit)
                 frdC (Rv1554)     fumarate reductase (membrane anchor subunit)
                 frdD (Rv1555)     fumarate reductase (membrane anchor subunit)
                 rpoA (Rv3457c)    DNA-directed RNA polymerase (alpha chain)
                 Rv1556            Possible regulatory protein

H37Rv > H37Ra    aceE (Rv2241)     E1 subunit of pyruvate dehydrogenase
                 icd2 (Rv0066c)    isocitrate dehydrogenase
                 lpqL (Rv0418)     lipoprotein aminopeptidase
                 lprN (Rv3495c)    Probable Mce-family lipoprotein
                 narX (Rv1736c)    nitrate reductase
                 pks2 (Rv3825c)    polyketide synthase
                 sdaA (Rv0069c)    L-serine dehydratase
                 thiG (Rv0417c)    Thiamine biosynthesis protein
                 trpG (Rv0013)     Glutamine aminotransferase
                 Rv0068c           Probable
                 Rv0421c           Conserved hypothetical protein
                 Rv1571            Conserved hypothetical protein
                 Rv1739c           Probable sulphur transport transmembrane protein

Table 4.1 Candidate genes identified via BACFA as being differentially expressed in intracellular M. tuberculosis H37Ra and H37Rv. After three rounds of hybridisation done in duplicate, twenty genes were identified as being differentially expressed between the virulent and attenuated strains of M. tuberculosis.

Information regarding gene function was obtained through access to the TubercuList Web Server database (http://genolist.pasteur.fr/TubercuList/).


Gene Forward Primer Reverse Primer

aceE (Rv2241) TCC TGG CCA AGA CCA TCA AA TGC GTG TCA CGA AAC TCC TT

bioB (Rv1589) TCG CAA CGA AGT CGA GAT CA CGT TTC GAG GTT GTG GTT GT

bioD (Rv1570) TCA GAT CGT GCG GCT GAT AAC TTG GTG TGG TTG AGG GT

frdA (Rv1552) ATG GGC TAT GAC GAG TGG TT GTC TTG ATG TTC GCG TTG GT

frdB (Rv1553) AGG ATC ACC TCG ACG GAA CA ACA ACG AGA TCG CGG ATC AC

frdC (Rv1554) TGC TGC TGC ATG CTG TTA CC ACC ATC CAG GCA ACG ATC AC

frdD (Rv1555) TGC TGT TGC TGT TCG GAC TC ACC AGG ACC AGC ACA ACA AG

icd2 (Rv0066c) CCA AGC ACC AGG AGC TGT TC GTT CGT GGC AAC GGT GTA GG

lpqL (Rv0418) TGG CTG TGG TCG TCG CAT TC GTT GGC GTT GGC GAT GTC CT

lprN (Rv3495c) ACC AAG GTG GAT TTC GGT GA ACC GAA GTT GGG AAA TGG GA

narX (Rv1736c) TGA CAT GAT GGG CGA ACT CT CCG AAA TGA AAC ATC GGG CT

pks2 (Rv3825c) ACG GCT CCT ACA TCA TCA CC GCA TTC CAC CAC GAC TTC AG

rpoA (Rv3457c) CGG TCC TAC AAC TGC CTC AA TCA CCT CGT CGA TGG ACT TC

sdaA (Rv0069c) CGA GCG AAG GTG TGG TAT GA GTG GAT TGC GTA TGA TCG AC

thiG (Rv0417c) GCC TGA TGC GGT CGA ATT AG CTG CGC AAC CGG TAT CTT CT

trpG (Rv0013) GGC CAC TCG ATA CCA TTC GT ATC GAC TCC GGA TGG AAC TG

Rv0068c CGG CCT GTT GAT TGA TCG AC AAC AGC AGG TTG GCG AGC TT

Rv0421c GTC GAA GCG ATC CAG CTG TG GGA TGG ACC GGA TAG GAG AA

Rv1556 AGT TCG TCG ACC ACC GTA AG TGG ACC GGA AGA TGA GGT AG

Rv1571 CGG GCC AAT GTC GTG TTC AAT TGG TGA CCA CCG ACC C

Rv1739c GTG GTG CAG TTC CGC GAA TA ACG ATC CGA GCA GTG CGT AA

Table 4.2 Primers used in qPCR confirmation of candidate genes. Sequences of primers used in qPCR confirmation of genes selected after 3 rounds of BACFA analysis of transcripts from M. tuberculosis H37Ra and H37Rv at 168h post-infection. Primer sequences are presented 5’→3’.
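As an illustrative aside, basic properties of the primers in Table 4.2 can be checked with a few lines of code. The following is a minimal sketch, not the procedure used in this study: the function name `primer_stats` is hypothetical, and the only data taken from the table are the two aceE primer sequences.

```python
def primer_stats(primer: str) -> tuple[int, float]:
    """Return (length in nucleotides, GC content in percent) for a primer.

    Spaces are ignored so that sequences can be pasted in the
    triplet-grouped form used in Table 4.2.
    """
    seq = primer.replace(" ", "").upper()
    gc = sum(base in "GC" for base in seq)
    return len(seq), 100.0 * gc / len(seq)


# aceE (Rv2241) primers as listed in Table 4.2 (5'->3').
ace_e_forward = "TCC TGG CCA AGA CCA TCA AA"
ace_e_reverse = "TGC GTG TCA CGA AAC TCC TT"

for name, primer in [("aceE forward", ace_e_forward),
                     ("aceE reverse", ace_e_reverse)]:
    length, gc_percent = primer_stats(primer)
    print(f"{name}: {length} nt, {gc_percent:.0f}% GC")
```

Both aceE primers are 20 nucleotides long with 50% GC content. The same function could be applied to the remaining rows of the table.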


Figure 4.1 Generation of BAC fingerprint arrays. Purified BACs of interest were digested with selected restriction enzymes (RE). Digestion products were run on an agarose gel and Southern blotted onto a nylon membrane, generating the BAC fingerprint array. The BACFA were sequentially hybridised to strain-specific probes, and only profiles generated from the same blot were compared to avoid gel-to-gel differences.


Figure 4.2 Growth of Mycobacterium tuberculosis H37Ra and H37Rv in Proskauer and Beck liquid broth. M. tuberculosis H37Ra and H37Rv were grown in aerated roller bottles and growth of the respective cultures was assessed via optical density units at 580nm (A) and CFU numbers (B). Data are means (± SEM) from three independent experiments.


Figure 4.3 Growth of Mycobacterium tuberculosis H37Ra and H37Rv in association with murine bone-marrow-derived macrophages. BM-MΦ were incubated with an MOI of 10 bacteria to 1 macrophage, resulting in a rate of infection of 0.1 bacteria per MΦ. Growth of intracellular bacteria was assessed at 4h (day 0), 96h (day 4), and 168h (day 7) post-infection. Data are means (± SEM) of three independent experiments.


Figure 4.4 Exclusivity of RNA harvested from intracellular mycobacteria. One-microgram aliquots of total RNA from uninfected BM-MΦ (Lanes 1 and 4) and intracellular H37Ra (Lane 2) and H37Rv (Lane 3), respectively, were run on a TAE-agarose gel (1.5%). Lanes run with prokaryotic RNA do not show contamination of 18S or 28S rRNA from macrophages, and the only bands that can be seen are the 16S and 23S rRNA bands specific to bacterial total RNA.


Figure 4.5 Genomic analysis using BACFA. BACs were used to determine if RvD2 could be isolated as a genomic difference between the two strains. Here, a BAC that encompassed the region where the deletion occurred (302), a BAC that is wholly separate from the region (13), and genomic DNA from both M. tuberculosis H37Ra and H37Rv were digested with PvuII and StuI and probed with a DIG-labelled probe specific for a 1.3kb portion of the RvD2 region of difference.


Figure 4.6 Acrylamide gel electrophoresis of PCR reactions using uniprime and various amounts of template DNA. Lanes 1 through 5 are PCR reactions with various amounts of H37Rv DNA template (1: 3.6ng, 2: 25ng, 3: 50ng, 4: 100ng, 5: 200ng). Lanes 6 and 9 are negative reverse transcription controls for both H37Rv and H37Ra subjected to a 15 minute treatment with 1µL of 10µg/mL of RNase A. Lanes 7 and 10 are negative reverse transcription controls for both H37Rv and H37Ra that were not treated with RNase A. Lanes 8 and 11 are PCR reactions with 3.6ng of cDNA from H37Rv and H37Ra. Lastly, lane 12 was run with 500ng of 100bp DNA ladder (Invitrogen).

Figure 4.7 Bands of interest seen in BACFA comparisons may represent several Kb. Preliminary screenings were done with BACFAs generated with the enzymes PvuII and StuI. Only differences that were observed over duplicate hybridisations with all three populations of DIG-labelled probes were marked for qPCR confirmation. Over the course of expression analysis, 10 fragments containing 20 genes were selected for confirmation via qPCR. In A, lane 1 contained 1µg of DIG-labelled DNA markers and lane 14 contained 500ng of dnaK PCR product used as a normalisation signal. Digested BACs carrying inserts spanning positions 1755kbps to 2543kbps of the M. tuberculosis genome were run in lanes 2-13. These blots were hybridised to three sets of strain-specific DIG-labelled cDNA probes generated from three independent populations of RNA. Here a band of interest expressed at a higher level in H37Ra (lane 2) was approximately 2.4Kb and contained several genes. Thus, a second set of BACFAs (B) was generated with the enzymes StuI and SalI to isolate the genes responsible for differential expression seen in A. In B, lane 1 contained HinfI digested total genomic DNA from H37Ra (a smear after hybridisation indicated the probes provided satisfactory representation of the transcripts in the genome), lane 7 contained 1µg of DIG-labelled DNA marker, and lanes 4, 5, 6 contained StuI/SalI digested BACs that were previously digested with PvuII/StuI and run in lanes 9, 4, and 2 in A. The second set of BACFAs was hybridised with the same populations of probes, and new expression profiles were obtained (B). Referencing the Restriction Site Digest programme, the 2.4Kb band in A was predicted to yield a 1.49Kb band (circled in B) containing the genes frdB, frdC, and frdD.
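The kind of fragment-size prediction described in this caption can be illustrated with a small in-silico digest. The following is a minimal sketch under stated assumptions: it is not the Restriction Site Digest programme referred to above, the toy sequence is hypothetical, and the molecule is treated as linear (BACs are circular, so a real prediction would need to join the first and last fragments). The recognition sites used are the standard ones for PvuII (CAG^CTG), StuI (AGG^CCT), and SalI (G^TCGAC).

```python
# Recognition site and top-strand cut offset within the site for each enzyme.
ENZYMES = {
    "PvuII": ("CAGCTG", 3),  # CAG^CTG, blunt
    "StuI": ("AGGCCT", 3),   # AGG^CCT, blunt
    "SalI": ("GTCGAC", 1),   # G^TCGAC
}


def cut_positions(seq, enzymes):
    """Return sorted top-strand cut positions for the named enzymes."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    return sorted(cuts)


def digest(seq, enzymes):
    """Return fragment lengths, in order, for a LINEAR molecule."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 68-nt toy sequence containing one site for each enzyme.
toy = "A" * 10 + "CAGCTG" + "T" * 20 + "AGGCCT" + "A" * 15 + "GTCGAC" + "T" * 5

print(digest(toy, ["PvuII", "StuI"]))  # fragments from the first digest
print(digest(toy, ["StuI", "SalI"]))   # fragments from the second digest
```

Changing the enzyme pair changes which fragments a given gene falls on, which is the reason a second set of BACFAs can narrow a large band of interest down to a smaller one.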


Figure 4.8 Quantitative real-time PCR assessment of selected candidate genes' expression profiles at 168h post-infection. A) Fold change of frdA, frdB, frdC, frdD, pks2, Rv1571, and aceE in intracellular H37Ra expressed as fold change over expression of these genes in intracellular H37Rv. B) Expression of components of the sdh operon in intracellular H37Ra expressed as fold change over expression of these genes in intracellular H37Rv. Line drawn at “1” denotes expression in M. tuberculosis H37Rv. Data are means (+ SEM) of three independent experiments.
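For readers unfamiliar with the summary statistics quoted in these captions, the sketch below shows how a mean and standard error of the mean (SEM) could be obtained from three independent fold-change measurements. It assumes the SEM is the sample standard deviation divided by the square root of the number of experiments. The input values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, SEM), with SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical fold changes (H37Ra over H37Rv) from three experiments.
fold_changes = [2.0, 4.0, 6.0]

avg, sem = mean_and_sem(fold_changes)
print(f"fold change = {avg:.2f} +/- {sem:.2f} (SEM, n = {len(fold_changes)})")
```

Under the convention used in the caption, a mean above the line drawn at 1 would indicate higher expression in H37Ra than in H37Rv, and a mean below 1 the reverse.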


Figure 4.9 Quantitative real-time PCR analysis of frdA, frdB, frdC, frdD, pks2, Rv1571, and aceE identified via BACFA. A) Expression of the respective genes in H37Ra and H37Rv in broth and at days 0, 4 and 7 within macrophages, expressed as fold change of H37Ra expression over that of H37Rv. B) Fold change of the genes in intracellular H37Ra over expression of the respective genes in H37Ra grown in enriched broth. C) Fold change of the respective genes in intracellular H37Rv over expression of the genes in H37Rv grown in enriched broth. For A, line drawn at “1” denotes expression in intracellular M. tuberculosis H37Rv. For B and C, line drawn at “1” denotes expression in the respective strains grown in broth. Data are means (+ SEM) of three independent experiments.


Figure 4.10 Growth of M. tuberculosis H37Ra and H37Rv under limited oxygen conditions. M. tuberculosis H37Ra and H37Rv were grown both as oxygen-limited and aerated roller bottle cultures in PB+T minimal media (A & B) as well as 7H9 enriched media supplemented with 10% OADC and 0.05% Tween-80 (C & D). A and C are assessments of growth via OD580 and B and D are assessments of growth via CFUs. Data are means (+ SEM) of three independent experiments.


Figure 4.11 Western blot detecting FRD-A, FRD-B in cell lysates of E. coli and M. tuberculosis. Western blots run with cell lysates containing 20µg of protein from E. coli (lanes 1 and 2), M. tuberculosis H37Ra (lane 3), and M. tuberculosis H37Rv (lane 4). A was a blot probed with a 1:10000 dilution of the anti-FRD-A, FRD-B polyclonal antibodies. B was a blot probed with a 1:24000 dilution of the polyclonal antibodies.


Figure 4.12 Quantitative real-time PCR analysis of genes encoding for fumarate reductase (frdA, frdB, frdC, and frdD) in oxygen-limited and aerated broth cultures of M. tuberculosis H37Ra and H37Rv. A) Expression of frd genes in oxygen-limited cultures of H37Ra and H37Rv, expressed as fold change of H37Ra over that of H37Rv. Line drawn at “1” denotes expression in M. tuberculosis H37Rv. B) Fold change of frd genes in H37Ra grown under oxygen-limited conditions over that of the respective genes in H37Ra grown in aerated broth cultures. C) Fold change of frd genes in microaerophilic H37Rv over expression of the frd genes in aerated broth cultures of H37Rv. Lines drawn at “1” in both B and C denote expression in aerated broth cultures of the respective strains. Data are means (+ SEM) of three independent experiments.


Figure 4.13 Effect of mercaptopyridine-N-oxide (MPNO) on the growth of intracellular Mycobacterium tuberculosis. Macrophage monolayers were treated with 2.4µM of MPNO after being infected with M. tuberculosis H37Ra or H37Rv. CFU of intracellular bacteria were enumerated at 96h (day 4) and 168h (day 7) and normalised to CFUs of cell-associated bacteria at 4 hours post-infection to assess changes in CFU numbers as a result of macrophages receiving MPNO treatment. Data are means (± SEM) of three independent experiments.

CHAPTER 5: Microarray-based expression profiling

1. Li, AH, Waddell SJ, Hinds, J, Bains, M, Hancock, REW, Butcher, PD, and Stokes RW.

Microarray analysis of intracellular Mycobacterium tuberculosis: biosynthetic pathways

important for intracellular survival. Manuscript in preparation.

5.1 Introduction

Through the advancements in genomic knowledge gained via the numerous sequencing projects in the Mycobacterium genus, microarray technology has become a feasible means of conducting genetic studies into mycobacterial pathogenesis. Microarray technology at its most basic involves printing an array of oligonucleotides or PCR amplicons onto a charged glass surface (223). This array of nucleic acid targets can then be probed with genomic DNA to assess the genomic differences between the reference and novel strains, or with labelled cDNA probes to assess expression differences that arise between strains when particular environmental challenges are presented.

To dissect genes important in mycobacterial virulence, there needs to be an understanding that pathogenesis is often multifactorial, rather than the result of the augmentation of a single gene product. Thus, array technology allows the screening of all genes whose expression could be elicited in response to a particular challenge, for example, in host-pathogen interactions, and could facilitate the generation of a hypothesis for pathogenesis that is rooted in multiple factors (223). Microarray technology in particular, with its representation of every gene in a reference strain, and in some cases, genes from additional strains that have been discovered to have roles in pathogenesis, allows the concurrent evaluation of thousands of genes.

In the last few years, there has been a boom of mycobacterial pathogenesis studies relying on microarray technologies (38, 65, 77, 100, 120, 139, 193, 211-213). What began as a means to dissect and define deletions in the tuberculosis vaccine strain, M. bovis BCG (6, 171) and transcriptomic changes in M. tuberculosis elicited by antibiotic treatment (233) has blossomed into the characterisation of bacterial responses to a multitude of different environmental changes that may challenge M. tuberculosis with respect to homeostasis (206), metabolism (8), and host defences (4, 50, 151). This latter aspect of M. tuberculosis responses has captured much of the field’s attention, for it is widely recognised that dissection of bacterial responses to host defence mechanisms is crucial to improving the efficacy of chemotherapeutics and/or vaccines.

5.2 Rationale

Microarray technologies are widely accepted tools used to drive hypothesis searches and, to date, many studies using microarrays have yielded scores of genes that could potentially have roles in mycobacterial pathogenesis (60, 185, 193). Given the limiting conditions required for BAC array analysis to reveal potential virulence genes, it was acknowledged that additional candidate genes had yet to be elucidated. Microarray-based expression analysis would not require the use of a primer that could potentially limit the complexity of a pool of labelled probes; thus, it was felt that microarrays would isolate further candidates in addition to those found via BACFA. Additional candidates isolated using microarrays could allow further insight into the differences identified via BACFA through the identification of related pathways and/or gene families that may have similar roles.

5.3 Optimisation of microarray expression studies

Microarray slides used in expression studies of intracellular H37Ra and H37Rv were obtained from the BµG@S group at St. George’s Medical School, University of London. Established protocols were also provided for microarray studies; however, it was recommended that amounts of template RNA to be labelled for expression studies first be optimised, as too much template could overwhelm the system, resulting in poor hybridisation profiles due to poorly labelled transcripts. Protocols provided by the BµG@S group suggested using 2-10µg of total RNA for labelling reactions involving the use of microarrays in hybridisations with RNA, and thus, I initiated these studies by adding 7µg of total RNA into the labelling reactions. Hybridisation signals using these labelled cDNAs were weak, and not reliable enough to be used in expression profiling. It was later advised that using lower amounts of total RNA in the labelling reactions would actually increase the efficiency of labelling of the cDNAs with Cy-dye, as well as result in better hybridisation signals (J. Hinds, personal communication). However, I first needed to determine how much cDNA would be recovered after purification via the PCR Purification columns used to purify the labelling reactions. When 1, 4, and 7µg of DNA from H37Ra and H37Rv were treated as per the labelling and purification protocol (Section 2.6), 0.5, 2, and 3µg of DNA were recovered, respectively. Thus, it was decided that 4µg of total RNA would be used in the labelling reactions.
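The recovery figures quoted above can be restated as fractions of the input amount. The short sketch below simply performs that arithmetic on the three input/recovered pairs given in the text. The variable names are illustrative only, and no interpretation beyond the division is implied.

```python
# Input and recovered amounts (µg) as reported in the text.
input_ug = [1.0, 4.0, 7.0]
recovered_ug = [0.5, 2.0, 3.0]

# Fraction of the input recovered after column purification.
recovery = [out / inp for inp, out in zip(input_ug, recovered_ug)]

for inp, out, frac in zip(input_ug, recovered_ug, recovery):
    print(f"{inp:g} µg in -> {out:g} µg out ({frac:.0%} recovered)")
```

Expressed this way, half of the input was recovered at 1µg and 4µg, and a little under half (3/7) at 7µg.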

5.4 Results

5.4.1 Genomic comparisons

To familiarise myself with the technical aspects of microarray technology, I initially used labelled genomic DNA probes to hybridise to microarray slides. Successful application and analysis of the microarray slides should replicate previous findings that indicated a paucity of genomic differences between the highly related sibling strains, M. tuberculosis H37Ra and H37Rv (19, 89). Indeed, the differences identified via this genomic comparison were those that corresponded to the RvD2 region of difference (Figure 5.1, Table 5-1). Recent insight into the sequence of H37Ra has revealed several point mutations compared to H37Rv (114, 227); however, these differences were not recognised by microarray hybridisation.

5.4.2 Expression profile comparisons of broth-grown cultures

A number of previous expression studies comparing M. tuberculosis H37Ra and H37Rv

focussed on the expression of the respective strains grown in enriched broth culture. Axenic

broth culture is a widely-used system that allows the facile harvest of sufficiently large amounts

of RNA required for expression analysis. Moreover, differences seen in broth culture have also

yielded virulence candidates with proven roles in mycobacterial pathogenesis. One of the most

well-known candidates is the devR/S (also known as dosR/S) two-component system, and this

system has been shown to provide M. tuberculosis with a survival advantage under anoxic

conditions (102, 183, 234).

In this study, cultures of H37Ra and H37Rv were grown to mid-logarithmic phase in

enriched 7H9 broth and RNA was then harvested for the generation of CyDye labelled probes to

be used in microarray hybridisations using amplicon microarray slides (for details, please see

Section 2.6) based on the genome of M. tuberculosis H37Rv. Three independent cultures of each

strain were grown and harvested for RNA, each of which was used to hybridise to amplicon

arrays in duplicate. Microarray data comparing the expression profiles of the respective strains

are found in Tables 5-2, 5-3 and Appendices II and III. Genes whose products have been

characterised to have roles in biosynthesis (pks3, moaE1), translation (rpoB, rpmD), and

respiration (nrdH, cydD) were found to be differentially expressed between strains. Specifically,

H37Ra induced to a greater extent (than H37Rv) genes involved in replication and protein

synthesis. It would appear from these data that H37Ra is metabolically more active than H37Rv, although from the growth curve assessments of the two strains, the cultures appear to be similar.

It is possible that H37Ra expresses more genes than H37Rv when grown under optimal

conditions such as enriched broth and one could speculate that this overabundance of

unnecessary transcripts and proteins may hinder the appropriate adaptations to novel

environments. Conversely, genes of the PE, PPE, PE_PGRS family of genes that have been

found to have putative roles in virulence and antigenic variation were more consistently

upregulated in broth-grown H37Rv.

5.4.2.1 Agreement with a previous microarray study examining transcriptomics of broth-grown mycobacteria

One well-characterised difference between H37Ra and H37Rv is the absence in H37Ra

of cord factor, or trehalose 6,6'-dimycolate (TDM), a glycolipid found in the mycobacterial cell

wall which was originally isolated in virulent mycobacteria (11). Upon further analysis,

however, it was discovered that cord factor is not exclusive to virulent mycobacteria, but it is

indeed immunogenic (71). A recent microarray study examining the transcriptomic differences

that may explain this difference in cording using broth-grown H37Ra and H37Rv isolated 22

genes that were upregulated in H37Rv, and which appear to have roles in lipid metabolism (65).

Comparing this list with our list of genes upregulated in broth-grown H37Rv relative to H37Ra, only

one gene was common to both studies: fadD21. fadD21 is a pks3,4-associated fatty acid

activating enzyme involved in polyacyltrehalose (PAT) and 2,3-di-O-acyltrehalose (DAT)

synthesis (69). pks3 and pks4 have been characterised to be regulated by phoP. Recent

characterisation of phoP mutants in M. tuberculosis has revealed a deficiency in cord factor

synthesis, and notably, the absence of PATs, DATs, and sulpholipids in these mutants (158).

Additionally, it was recently shown that complementing H37Ra with a copy of phoP from

H37Rv restored the cording phenotype characteristic of the virulent H37Rv (114). Thus, it has

been speculated that PATs, DATs, and SLs may be involved in the synthesis of cord factor in

mycobacteria, and by inference, differential expression of fadD21 could impact cording (69).

5.4.3 Expression profile comparisons of intracellular Mycobacterium tuberculosis

Comparing transcriptomes of intracellular bacteria versus that of the broth-grown bacteria

highlights genes that may have a role in intracellular survival, as well as genes whose products

allow for adaptation to the environmental challenges faced by intracellular mycobacteria. Here,

three independent infections were done with each strain, and RNA from intracellular bacteria

was used to hybridise amplicon arrays in duplicate. In the datasets of intracellular M. tuberculosis H37Ra and H37Rv (Figure 5.2, Tables 5-4, 5-5, Appendices IV and V), 35 genes were found in both (Table 5-6). These genes appear to be involved in lipid metabolism (fadD33,

fadE5, Rv3229c, Rv1344), intermediary metabolism and respiration (Rv1463, icl), transcription regulators (Rv1460, Rv1395), and responses to oxidative stress (ahpC). Three genes were observed to be down-regulated in both strains of bacteria (Tables 5-7, 5-8, Appendices VI and

VII): Rv3371, Rv3897c (conserved hypothetical proteins), and acn (aconitate hydratase).

5.4.3.1 Intracellular bacteria versus broth grown bacteria: Upregulation

Mycolic acids are long-chain fatty acids found in the mycobacterial cell wall, and are involved in the synthesis of trehalose 6,6'-dimycolate (TDM), or cord factor (11). Genes that

may be involved in the synthesis of mycolic acids (umaA, mmaA) were upregulated when H37Rv

encountered an intracellular environment, but this same response was not duplicated with intracellular H37Ra. Certainly, the role of mycolic acids is not limited to the generation of cord factor, and in fact, the upregulation of these genes may simply indicate a mechanism of the bacterium to maintain cell-wall homeostasis – i.e. maintenance of hydrophobic properties of the cell-wall (223).

Genes involved in fatty acid metabolism were also upregulated en masse in intracellular bacteria, particularly by the virulent H37Rv (Figure 5.3). While H37Ra induced two of these genes intracellularly (fadD21, fadE5), H37Rv induced nine fatty acid CoA synthases (fadD2, 13, 15, 19, 21, 26, 28, 29, and 30), three acyl-CoA dehydrogenases (fadE5, 21, 18), an enoyl-CoA hydratase (echA6), a hydroxybutyryl-CoA dehydrogenase (fadB2), and two acetyl-CoA acetyltransferases (fadA3, 4). The induction of these genes indicates the utilisation of β-oxidation in fatty-acid metabolism. However, five of the fadD genes (fadD21, 26, 28, 29, and 30) are those previously found to encode for fatty-acyl CoA ligases involved in lipid synthesis rather than lipid oxidation, hinting at the need to maintain the cell wall in a challenging environment (217).

Nevertheless, the induction of multiple genes in fatty acid metabolism pathways was still observed in H37Rv, and, as suggested previously (188), such induction may point to the requirement of different isoenzymes to catabolise a variety of different fatty acids.

Furthermore, there are indications that these fatty acid catabolites are broken down further via the TCA and glyoxylate cycles because genes involved in these pathways were also upregulated in H37Rv. gltA and Rv1130, which encode enzymes that catabolise propionyl-CoA (85), likely shuttle the metabolites into the TCA cycle. Isocitrate lyase (icl), an enzyme required for the glyoxylate cycle, converts the fatty acid metabolites into carbohydrates. Previously, icl has been found to be critical for M. tuberculosis persistence within the host (133).

Genes that encode members of the ESAT-6 family of proteins (cfp7, cfp10, Rv3019c, Rv2347, Rv1198, Rv1038c, and Rv3017c) were induced in intracellular H37Rv compared to its broth counterpart (Table 5-4, Appendix IV). This response was lacking in intracellular H37Ra compared to its broth counterpart with only cfp7 and Rv3019c being expressed in the intracellular bacteria (Table 5-5, Appendix V). It was also observed that intracellular growth of H37Rv upregulates genes whose products have associations with or comprise the ESX-1 secretion system (Rv3614c, Rv3868, Rv3870, Rv3871, cfp10, Rv3876, Rv3877) which secretes ESAT-6, the chaperone Cfp-10, Rv3614c, Rv3615c, and EspA, compared to broth-grown H37Rv. Intracellular H37Ra does not respond similarly with respect to broth-grown H37Ra.

A further difference between strains was noted in the upregulation of an operon that has

been tentatively labelled the SUF complex. These genes (Rv1460-Rv1466) have been found to

be conserved across mycobacteria. Their orthologues in E. coli and Erwinia chrysanthemi are involved in the biosynthesis of [Fe-S] clusters when cells are grown under stressful environments such as iron deprivation and oxidative stress (88). [Fe-S] clusters are important components of biological enzymes (e.g. hydrogenases), and thus, it is understandable that enzymes used in the biosynthesis of these components be kept active. Although the entire complex of the suf operon is not induced in intracellular H37Ra (unlike H37Rv), Rv1460 and Rv1463 are induced. This difference may be due to the differing metabolisms of the intracellular bacteria versus their broth counterparts, and in the case of H37Ra, it simply may not require the upkeep of [Fe-S] cluster biosynthesis to the extent of H37Rv. Alternatively, it could be argued that H37Ra is not responding adequately to stresses encountered moving from a broth to an intracellular environment, resulting in decreased growth and metabolism.

Finally, members of the mbt gene cluster (mbtA-J), which encode components necessary for mycobactin biogenesis (32), were also differentially regulated between intracellular and broth-grown bacteria, and again, these expression patterns were different between strains.

Mycobactins are mycobacterial siderophores that act as iron chelators (178). Iron is often limited during infections and microorganisms have developed strategies to sequester and store this valuable commodity for growth and survival, one of which is the use of siderophores (178).

With respect to H37Rv, mbtB, and mbtD-J were all upregulated in intracellular H37Rv compared to broth-grown bacteria; however, in H37Ra, only mbtB, mbtF, mbtH, and mbtI were expressed at a higher level by intracellular bacteria. The upregulation of mycobactin synthesis falls in line with the presumably decreased availability of iron inside the host compared to broth cultures.

The induction by H37Ra of only a few components of mycobactin synthesis calls into question the availability of active mycobactin complexes and thus, the ability of H37Ra to fulfil its intracellular iron requirements. A further note of interest is the presence of two forms of siderophores: mycobactin and exochelin (40, 178). While saprophytic mycobacteria such as M. smegmatis synthesise both, pathogenic species such as M. tuberculosis synthesise mainly mycobactins (56, 191). There also appear to be two forms of mycobactin: mycobactin and carboxymycobactin, which differ in the nature of the acyl-group on the hydroxylated lysine in the middle of the core molecules (41). The more hydrophilic carboxymycobactin also appears to be secreted by the cell to compete directly with iron-binding molecules in the environment (41).

Recently, a second mbt cluster has been found containing 4 genes that are responsible for the alkyl substitutions differentiating the two forms of mycobactins: Rv1344, Rv1347, fadD33, and fadE14 (mbtK, mbtL, mbtM, and mbtN, respectively) (106). Both intracellular M. tuberculosis H37Rv and H37Ra induced two of these genes (Rv1344 and fadD33) compared to their respective broth counterparts.
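The strain-specific induction of the first mbt cluster described above can be restated with simple set operations. The sketch below encodes only the gene names given in the text. It assumes that "mbtD-J" denotes the consecutive genes mbtD through mbtJ, and the variable names are illustrative.

```python
# mbt genes reported as upregulated intracellularly versus broth.
# Assumption: "mbtD-J" expands to mbtD, mbtE, mbtF, mbtG, mbtH, mbtI, mbtJ.
h37rv_induced = {"mbtB"} | {f"mbt{letter}" for letter in "DEFGHIJ"}
h37ra_induced = {"mbtB", "mbtF", "mbtH", "mbtI"}

shared = sorted(h37rv_induced & h37ra_induced)      # induced by both strains
h37rv_only = sorted(h37rv_induced - h37ra_induced)  # induced by H37Rv only
h37ra_only = sorted(h37ra_induced - h37rv_induced)  # induced by H37Ra only

print("Both strains:", shared)
print("H37Rv only:  ", h37rv_only)
print("H37Ra only:  ", h37ra_only)
```

Under that reading, every mbt gene induced by H37Ra is also induced by H37Rv, while mbtD, mbtE, mbtG, and mbtJ are induced by H37Rv alone.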

5.4.3.2 Intracellular versus broth-grown bacteria: Down-regulation

Bacterial responses to a novel environment are not solely restricted to the upregulation of genes, but rather, a balance between the up- and down-regulation of genes that would enable optimal growth of the bacterium inside the host. Thus, a comparison of genes repressed by intracellular H37Ra or H37Rv versus their respective broth-grown counterparts was made (Tables 5-7, 5-8, Appendices VI, VII). With respect to strain-specific down-regulation, H37Ra showed few changes between the two environments, and what changes there were curiously involved genes whose products have roles in iron storage (bfrA and bfrB), and antioxidant defence (sodA). The gene sodA encodes an enzyme that catalyses the dismutation of superoxide to oxygen and hydrogen peroxide. Thus its down-regulation may result from the fact that the attenuated strain is either not subjected to as much stress as the virulent H37Rv, or it is not adapting appropriately to the intracellular environment, resulting in decreased growth within host macrophages. bfrA and bfrB encode bacterioferritin, a molecule used for iron storage, and found previously to be upregulated in M. tuberculosis grown under iron-rich conditions. It had been proposed that bacterioferritin could be used to store excess iron as a means to prevent oxidative damage. The down-regulation of bfrA and bfrB may reflect the iron-deficient intracellular environment. A previous study examining iron sequestration genes of mycobacteria inside the murine lung saw a down-regulation of bfrA coupled with the upregulation of mbtD (179). As a similar response was seen here, the likely explanation for the repression of the bfr genes is the need for the bacteria to obtain and utilise iron for growth rather than for storage.

In contrast to the short list of down-regulated genes for intracellular H37Ra versus

broth-grown bacteria, a sizable list was obtained contrasting intracellular and broth-grown

H37Rv (Table 5-6, Appendix VI). Genes that appear to be down-regulated include some genes involved in fatty acid metabolism (fadD8, 11 and fadE9, 13, 17, 27, 31 and 33), perhaps an

indication that substrates specific to these enzymes were not in abundance inside the host.

Additionally, genes that encode for lipoproteins (lppB, Q, lpqC, O, R, and lprM), as well as an

abundance of PE, PPE, PE_PGRS genes are down-regulated in intracellular bacteria, perhaps an

indication that antigenic variation by way of modulation of lipoproteins on the cellular surface is

of decreased demand once inside the host. More likely, the sheer number of genes in the

PE_PPE family suggests a redundancy in the members, and the lack of necessity for expression

of all these genes inside the host.

Only three genes were concurrently down-regulated by both H37Ra and H37Rv: Rv3371 and Rv3897c, both of which encode conserved hypothetical proteins, and acn, which encodes aconitase, an enzyme in the tricarboxylic acid cycle which catalyses the conversion of citrate to isocitrate. The down-regulation of acn may again signal the decreased metabolism of intracellular bacteria in contrast with their broth-grown counterparts, and thus, the decreased need for metabolites and respiration.

5.4.3.3 Intracellular H37Ra versus intracellular H37Rv

Direct comparisons of M. tuberculosis H37Ra and H37Rv at 168h p.i. revealed 48 genes that were differentially expressed between strains in an intracellular environment (Tables 5-9, 5-10). Of these, 12 genes were upregulated in H37Ra (Table 5-9), and 36 genes were upregulated in H37Rv (Table 5-10).

Of the genes upregulated in the attenuated M. tuberculosis H37Ra (Table 5-4), three were hypothetical proteins (Rv1991c, Rv2662, and Rv2644c), one encoded for a fragment of a

dehydrogenase, and one for an exported protein. Also upregulated was the gene encoding for a

transcriptional regulator of the MerR transcriptional regulator family (Rv1994c), which responds

to environmental stimuli such as oxidative stress, or the presence of heavy metals and antibiotics

(25). Interestingly, desA1 (encoding a protein desaturase which is involved in fatty acid biosynthesis as well as mycolic acid biosynthesis) and ndh (encoding NADH dehydrogenase, which could provide an alternate means of respiration for intracellular bacteria) were also upregulated in H37Ra. Des proteins have been found to be B-cell antigens that are recognised by patient sera (93) and B-cells have recently been observed to play a role in the modulation of

inflammation and maintenance of granulomas in mice (125). Intuition would suggest that these

genes may be considered important for pathogenesis and one might expect that these genes

would have been more appropriately upregulated in the virulent H37Rv. However, H37Ra is not

avirulent as it can still infect and replicate within a host. It would make sense for H37Ra to have

some pathogenic strategies, albeit fewer than the virulent H37Rv. One further interesting

observation was made with regards to phoP, the DNA-binding domain of the PhoPR two-

component regulatory system that appears to correlate with bacterial virulence in other

organisms (83, 138, 141, 215). Recent genomic sequencing of M. tuberculosis H37Ra has

shown that the sequences of phoP in H37Ra and H37Rv differ by one base-pair (114). In our

microarray study, phoP was seen to be induced to a greater level in H37Ra, both in broth and

intracellular comparisons with H37Rv.

In the group of 36 genes induced in intracellular H37Rv (Table 5-10), thirteen of the

genes encoded hypothetical proteins with little to no characterisation with regards to function.

Three genes (Rv3614c-Rv3616c), although designated as conserved hypothetical proteins, have been found to be either part of the ESX-1 secretion system, or associated with products of the

RD1 region (17, 45, 60, 119). There have been observations that the secretion of Rv3625c,

Rv3616c, ESAT-6 and Cfp10 is mutually dependent, but the mechanisms of dependence are unknown (45). Other genes related to the ESAT-6 family that were expressed at greater levels in

H37Rv included cfp2, PE13 (Rv1195) and PPE18 (Rv1196), and Rv1198. It is interesting that

PE13 and PPE18 were upregulated along with Rv1198 as it has been previously reported that genes of the ESAT-6 family are often clustered with PE, PPE genes (32, 214). This organisation is often flanked by conserved hypothetical proteins, and it has been proposed that such genomic arrangement may encode a secretory apparatus for the ESAT-6-like proteins (32, 214).

The gene cysH was also upregulated in the virulent H37Rv; it encodes a 5’-adenosine phosphosulphate reductase required in the synthesis of cysteine and methionine (189). Recently, it was reported that cysH was required for M. tuberculosis survival during chronic infection inside the murine host, thus making it a worthwhile target in either the design of novel chemotherapeutics or vaccines (189). Following that study, a cysH mutant was used to vaccinate mice prior to infection with M. tuberculosis Erdman and was found to provide protection equal to that of M. bovis BCG (190). Given our data showing that the levels of cysH transcripts are lower in intracellular H37Ra versus H37Rv, this difference may be explained either as a result of the differing metabolisms [resulting from different growth rates] of the respective strains, or as a pathogenic strategy employed to greater effect in the virulent H37Rv resulting in the different growth of the bacterial strains.

Lastly, although phoP was induced in intracellular H37Ra compared to H37Rv, members of the phoP regulon (pks3, pks4, mmpL8, mmpL10, lipF, papA3, fadD21, PPE18, PPE19,

Rv1639c, Rv2376c, PE31, and PPE60) were expressed at a greater level in H37Rv. In contrast,

iniB, ndh, and PPE59 were expressed at a greater level in H37Ra – similar to a recent study

comparing the transcriptomes of H37Rv and a phoP mutant (225). Quite a few of these genes, as

first discussed in Section 5.4.2, are involved in the synthesis of cell envelope constituents. The

observed differences may explain the different lipid profiles between the two strains. The base-

pair difference in the phoP gene of H37Ra and its subsequent effect on primary amino acid

structure may affect binding kinetics to phoP resulting in aberrant regulation via PhoPR in

H37Ra (S. Wadell, personal communication). Curiously, in Salmonella typhimurium,

overexpression of phoPQ resulted in an attenuation of bacterial virulence as pathogenesis depended on a fine balance of transcriptional regulation (138). Thus, we propose a similar situation where faulty binding results in increased (and perhaps compensatory) phoP expression

by H37Ra, leading to faulty regulation of genes involved in cell-wall lipid synthesis resulting in

the obvious phenotypic differences between H37Ra and H37Rv.

5.4.4 qPCR confirmation of microarray data

As mentioned in the previous chapter, validation and confirmation of the expression

trends seen in hybridisation experiments need to be carried out to ensure that the trend described

in these experiments is an accurate reflection of the expression of the gene in question. Thus,

qPCR was also used to confirm expression differences noted with the microarray experiments.

As the main goal of the microarray experiments was to assess differences between intracellular

H37Ra and H37Rv, selected genes from that particular dataset were examined via qPCR to confirm microarray results. For confirmation, genes that were close to the threshold of detection

(±1.5-fold difference), genes well above the threshold of detection, genes whose expression fell

somewhere in the middle, and genes that were unchanged were examined to validate microarray

findings.

5.4.4.1 Genes differentially expressed between intracellular bacteria

Genes differentially expressed between H37Ra and H37Rv, and which were selected for validation via qPCR were grouped into the following: close to threshold (phoP, Rv3262), clearly above threshold (iniB, Rv1994c), clearly below threshold (papA3, Rv3822), and comfortably above or below threshold (desA1, Rv3616c). Expression trends for selected genes observed with microarray experiments were reproduced via qPCR analysis (Figure 5.4).
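The comparison of expression trends between the two methods can be sketched as a simple direction check. The following is a minimal illustration only: it assumes that the ±1.5-fold threshold is applied to an H37Ra/H37Rv ratio in either direction, and the gene labels and ratios are hypothetical placeholders rather than values from this study.

```python
THRESHOLD = 1.5  # fold-difference cut-off (assumed to apply in both directions)

def direction(ratio, threshold=THRESHOLD):
    """Classify an H37Ra/H37Rv expression ratio as 'Ra', 'Rv' or 'unchanged'."""
    if ratio >= threshold:
        return "Ra"          # higher in H37Ra
    if ratio <= 1.0 / threshold:
        return "Rv"          # higher in H37Rv
    return "unchanged"

def concordant(array_ratio, qpcr_ratio):
    """True when microarray and qPCR place the gene in the same class."""
    return direction(array_ratio) == direction(qpcr_ratio)

# Hypothetical (microarray, qPCR) ratios, invented for illustration only.
example = {
    "geneA": (2.4, 3.1),   # both methods: higher in H37Ra
    "geneB": (0.4, 0.2),   # both methods: higher in H37Rv
    "geneC": (1.1, 0.9),   # both methods: unchanged
    "geneD": (1.8, 0.5),   # methods disagree
}
results = {gene: concordant(a, q) for gene, (a, q) in example.items()}
print(results)
```

A check of this kind considers only the direction of change; agreement in magnitude between the two methods would need to be assessed separately.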

Additional genes of interest that were investigated with qPCR included those within the

PhoP regulon (pks3, pks4, mmpL10), genes that appeared to be in an operon with the putative transcriptional regulator, 1994c (1992c, 1993c), but were later proven otherwise, and lastly, members of the cluster of genes (3613c-3616c) that had been previously characterised to be important for intracellular survival and for interactions with products from the RD1 region (60,

119, 185).

1992c and 1993c were found to be induced in intracellular H37Ra over H37Rv (Figure

5.5A), but a recent paper detailing the organisational network of mycobacterial operons did not recognise the cluster of genes running from Rv1992c to 1994c as an operon (176). Members of the PhoP regulon were confirmed by qPCR to be upregulated in the virulent M. tuberculosis

H37Rv (Figure 5.5B), even though phoP was found to be expressed at a higher level in the attenuated H37Ra (Figure 5.4A). Lastly, the entire cluster of genes that has associations with members of the RD1 region was also upregulated in H37Rv (Figure 5.5B).

5.4.4.2 Genes unchanged between intracellular bacteria

To conclude the validation of the microarray data, genes whose expression did not reach

or exceed the threshold level were also assessed via qPCR. For these purposes, genes that were

induced in intracellular bacteria, but without a difference between intracellular H37Ra versus

intracellular H37Rv were selected. These included 2 genes encoding conserved hypothetical

proteins (Rv3839, Rv0282), isocitrate lyase (icl) – a gene shown to be important for long term

persistence within the host (133), and Rv1344, a possible acyl carrier thought to be involved in

fatty acid biosynthesis (32). With the exception of Rv1344, which was repressed in intracellular

H37Ra compared to H37Rv, Rv3839, Rv0282, and icl were all expressed by both strains at

similar levels inside the macrophage (Figure 5.6).

5.4.5 Correlation of microarray data to previous expression studies

Numerous microarray studies have been initiated to examine the transcriptomic profile of

Mycobacterium tuberculosis; however the majority of such studies have examined the imposition

of environmental conditions such as acid or anoxic stress on broth-grown cultures in an attempt

to model intracellular conditions. Recently, expression profiling of mycobacteria has included

the assessment of intracellular bacteria using microarrays. One study in particular assessed the

expression of virulent M. tuberculosis (strain 1254) inside murine bone-marrow derived macrophages over a 48-hour infection period (188). Thus, it was of interest to compare the

dataset obtained by Schnappinger et al. with our own examining the expression profiles of intracellular versus broth-grown M. tuberculosis, keeping in mind that length of infection as well as bacterial strains did differ between the two studies. An additional study of interest was the examination of genes required for optimal survival within macrophages, using a library of

mariner transposon mutants (170). This study, by Rengarajan et al., used a technique called

transposon site hybridisation, or TraSH, to identify genes important for survival inside murine

bone-marrow-derived macrophages. TraSH had been previously used to identify genes essential

for growth of M. tuberculosis in broth (184), and subsequently inside a murine host (185).

Briefly, TraSH involves the harvest of surviving mutants from an infection and probing a

microarray with labelled DNA probes generated from the survivors. It is assumed that mutants

not harvested from infections are those with mutations in genes deemed “essential” or at the very

least, required for optimal growth inside a macrophage or host animal. The TraSH study of

interest studied genes required for mycobacterial survival in a murine (C57BL/6) bone-marrow-

derived macrophage over 168 hours.

5.4.5.1 Genes common to all three datasets

Comparing both microarray expression datasets and the list of 156 genes identified via

TraSH in macrophages (Figure 5.7), only one gene, ahpD, was common to all studies. ahpD

encodes an alkylhydroperoxidase which acts in concert with another alkylhydroperoxidase,

AhpC, to provide antioxidant protection for M. tuberculosis (192). Although ahpC was in both

the Schnappinger et al. and our datasets, it was not an absolute requirement for optimal growth

inside a macrophage as defined by the TraSH screen (170). AhpC/AhpD may be important in

isoniazid-resistant strains where the catalase-peroxidase KatG has undergone mutation to provide

a resistant phenotype (192). KatG has been ascribed an important role in virulence as it has been

found to be required for the catabolism of exogenous peroxides generated by the oxidative burst

or peroxynitrites generated by the reaction of superoxide and nitric oxide (146). Interestingly,

katG was not found in our or the Schnappinger et al. datasets to be induced intracellularly, nor

was katG found to be an essential gene for survival in the macrophage (170, 188). One possible

explanation would be the constitutive expression of katG, which would not be detected as a difference.

5.4.5.2 Genes common between expression studies of intracellular bacteria

A direct comparison of genes differentially regulated in intracellular bacteria versus broth-grown bacteria using both our and the Schnappinger et al. data set revealed 53 genes that were commonly up-regulated across the studies and 3 genes that were commonly down-regulated. Of the 3 down-regulated genes, one was Rv3511 (PE_PGRS55), a protein of unknown function, and the remainder were cydA and cydB. Both encode for components of the cytochrome ubiquinol oxidase, an enzyme involved with the aerobic respiratory chain, particularly at low oxygen levels (124). The down-regulation of these genes may indicate a shift towards ever lower oxygen levels, obviating the need for transcription of aerobic respiratory chain enzymes.

With regards to genes commonly upregulated by both intracellular M. tuberculosis

H37Rv and strain 1254, genes involved in fatty acid metabolism (fadB2, fadD19, fadD26,

fadD33, and fadE5), mycobactin synthesis (mbtB, mbtD-J), and [Fe-S] cluster biosynthesis

(Rv1461, Rv1462, csd, Rv1465) were present in both lists. Additionally, both virulent strains up-regulated a multidrug transporter (Rv1348/Rv1349) which a recent study suggested may be involved in the transport of lipids like TMMs to the outer membrane of M. tuberculosis. Finally, pckA, which encodes phosphoenolpyruvate carboxykinase (PEPCK), was also seen to be commonly upregulated in these two virulent strains when inside a macrophage. PEPCK catalyses the interconversion of oxaloacetate (OAA) and phosphoenolpyruvate (PEP) and has been

thought to be involved in mycobacterial virulence either through its contributions to gluconeogenesis for carbohydrate formation or to the maintenance of the TCA cycle through the

conversion of PEP to OAA (124). Knocking out pckA in M. bovis BCG resulted in an attenuated phenotype of the mutant versus wild-type BCG in both aerosol-infected C57BL/6 mice as well as

C57BL/6 macrophages (118).

5.4.5.3 Genes common to our dataset and TraSH

In comparing our list of genes up-regulated in intracellular H37Rv with the list of 126 genes required for optimal growth of H37Rv inside macrophages as defined by TraSH, we wanted to characterise the extent of agreement between the studies, and in particular, how many of the genes in our list would be considered essential. To that end, 18 genes were common between the two datasets. Of these, half were conserved hypothetical genes that have not yet been fully characterised, and others, such as secA2 (107) and phoT (170), have been well characterised to be required for growth in both mice and macrophages. fabG1, involved in the elongation of fatty acids, has recently been described as an essential gene even under normal culture conditions

(132, 154). Also present in both the expression and TraSH screen was the gene clpC, which encodes for a chaperone protein presumably involved in a varied number of roles including secretion, gene regulation, protein folding and degradation (12, 147, 161). A recent study characterising protein-protein interactions in mycobacteria found that ClpC specifically associates with Cfp-10, a component of the ESX-1 secretion system (195). Such associations will likely further the dissection of secretion mechanisms via the ESX-1 secretion system, and in combination with the requirement of clpC for optimal growth, highlight the importance of secretory mechanisms in mycobacterial virulence.
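The dataset comparisons described in Sections 5.4.5.1 to 5.4.5.3 amount to set intersections and differences. The sketch below illustrates the operation using only genes named in the text above; the three sets are deliberately partial subsets chosen for illustration, not the full gene lists, and the variable names are hypothetical.

```python
# Partial, illustrative subsets built from genes named in the text;
# the complete lists contain many more genes.
ours = {"ahpC", "ahpD", "pckA", "mbtB", "secA2", "phoT", "fabG1", "clpC"}
schnappinger = {"ahpC", "ahpD", "pckA", "mbtB"}
trash = {"ahpD", "secA2", "phoT", "fabG1", "clpC"}

# Genes common to all three datasets
all_three = ours & schnappinger & trash

# Genes shared by the two expression studies but absent from the TraSH list
expression_only = (ours & schnappinger) - trash

# Genes shared by our dataset and TraSH but absent from the other expression study
ours_and_trash_only = (ours & trash) - schnappinger

print(sorted(all_three))
print(sorted(expression_only))
print(sorted(ours_and_trash_only))
```

With these subsets, ahpD is the only gene in the three-way intersection, consistent with the comparison shown in Figure 5.7.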

5.5 Discussion and summary

A microarray-based approach was taken to assess and compare expression profiles between two highly related M. tuberculosis strains, H37Ra and H37Rv. In particular, we sought to characterise the transcriptomic differences between the two strains when inside a host macrophage. Genomic differences were also assessed between strains. Our experiments were in agreement with previous studies that described few differences between strains, other than the

RvD2 region of difference. Other differences previously described include restriction site differences, and more recently, base-pair differences (9, 89, 114). We did not detect these alterations made to the H37Ra genome, and the likely reason for that lies in the insensitivity of hybridisation in any array experiment, be it microarray or BAC-array based. A single base-pair alteration may not necessarily affect annealing, and thus, the genomic difference cannot be identified via microarrays.

Examining the transcriptomes of M. tuberculosis H37Ra and H37Rv, expression profiles of the two respective strains grown under different environments were compared and contrasted.

The first element of the study surveyed expression differences between the two strains of

M. tuberculosis grown in enriched broth both as a means to optimise the technique as well as to identify inherent differences between strains. Behaviours of intracellular bacteria were then compared to their broth-grown counterparts to elucidate differences that may arise due to a change in the environment. The main focus of this study, however, was to contrast intracellular expression profiles of the two strains to elucidate genes that may explain the differences seen in virulence between H37Ra and H37Rv in their ability to grow in BM-MΦ (Figure 4.3).

Expression differences between the strains grown in enriched 7H9 liquid broth saw an induction in H37Rv [compared to H37Ra] of genes that may be involved in antigenic variation,

genes whose products may be used in respiratory pathways, as well as genes whose products

may be involved in biosynthesis. H37Ra, however, induced many genes whose products have

roles in protein synthesis and/or replication. Comparing our dataset of differences between

broth-grown H37Ra and H37Rv with that of Gao et al. (65), there was only one gene common to

both studies. This paucity of similarities is likely due to the differing culturing conditions used

by our respective studies to grow bacteria and then the different RNA harvest protocols used. In

both experiments, the bacteria were grown in roller bottle cultures, but the speed of rotation was

different: 3 rpm for our study and 6 rpm for Gao et al. (65). Additionally, rather than maintaining

cultures at mid-logarithmic growth by constant dilution into fresh media, we grew up fresh

cultures every time and harvested at mid-logarithmic growth. Lastly, to harvest RNA, we used a

method of freezing transcription machinery of bacteria even before pelleting and processing for

RNA. Pelleting and subjecting the pelleted bacteria to lowered temperatures before harvesting

for RNA has the potential to introduce novel transcriptomic changes that one may broadly

associate with a particular set of experimental conditions. Differences in the growth of bacteria, as well as the method used to harvest RNA may have affected the consistency between datasets, thus resulting in minimal overlap. Overall, genes differentially induced between the broth-grown strains do not correspond to virulence differences previously found nor do they point to any obvious virulence mechanisms that may distinguish the two strains.

Much more dramatic are differences induced within intracellular bacteria compared to their broth-grown counterparts. Notably, the intracellular bacteria were seen to induce genes that would enable adaptation to the intracellular milieu. For instance, genes involved in fatty acid metabolism were upregulated, as well as genes that could possibly defend the bacterium against host defences. Additionally, genes that encode products providing alternative means of

biosynthesis or respiration were also induced in intracellular bacteria compared to broth-grown bacteria, whereas some genes whose products are involved in aerobic respiration and biosynthesis were repressed by intracellular bacteria, the latter potentially stemming from decreased reliance on aerobic respiration inside the macrophage.

Contrasting the list of responses induced by intracellular H37Rv versus broth with that of intracellular H37Ra versus broth, the most obvious difference is the number of differences in each strain. As mentioned above, both strains were similar in the types of responses of adaptation to the macrophage (e.g. fatty acid metabolism), but the extent of adaptation was greater in the virulent strain. For example, with regards to fatty acid metabolism, while H37Ra did induce some fad genes to levels significantly greater than its broth counterpart, the vast majority of the fad genes were not changed. H37Rv on the other hand, induced all classes of the fad genes involved in fatty acid metabolism as well as alternative pathways that would enable the further breakdown of the products of fatty acid metabolism (Figure 5.3). This same trend was seen for genes responsible for mycobactin (mbt) and Fe-S cluster (Rv1461-Rv1467) synthesis, where intracellular H37Rv significantly induced nearly all of the genes involved in these processes versus its broth counterpart, but H37Ra only induced a small subset of the genes. It is acknowledged that all genes described here are transcribed by H37Ra, but the implication is that the response of H37Ra to the different environments was not as robust as H37Rv. Whereas

H37Rv induction of gene expression inside the macrophage was quite comprehensive and significantly greater than broth-grown bacteria, H37Ra did not duplicate that response. It could be hypothesised that those genes induced by H37Ra to significantly greater levels over its broth counterpart are the key components of the response involved, and/or that this less vigorous

response by H37Ra is in part reflected in its decreased growth inside the macrophage compared

to H37Rv.

Contrasting the transcriptome of intracellular H37Rv [versus broth] with other studies that

also examined global expression of intracellular M. tuberculosis showed overlaps in genes involved in metabolism, as well as genes whose products have been characterised as being associated with pathogenesis. However, the overlap between certain datasets was minimal, and this difference may also stem from experimental procedure, which ultimately may also give insight into mycobacterial responses to different environs. Schnappinger et al. used a different strain of M. tuberculosis (Strain 1254 rather than H37Rv), assessed a different time point, and obtained macrophages from a different strain of mouse (188). As strain 1254 is a low-passage clinical isolate, it is likely more virulent than H37Rv, which may affect the responses that the respective bacteria may exhibit. Interestingly, strain 1254 did not appear to grow in BM-MΦ,

which was markedly different from our findings with both H37Ra and H37Rv (Figure 4.3).

The transcriptomic changes in strain 1254 were also observed at a maximum of 48 hours post-infection, rather than 168 hours examined for our study. Thus, the differences identified in the former examination will likely reflect earlier interactions between the host cell and bacterium.

Despite the differences, it could still be extrapolated that transcription common to both datasets indicates genes whose products are required to sustain intracellular bacteria throughout the course of infection. It has been well characterised that particular strains of inbred mice are either relatively susceptible (CBA, DBA/2, C3H) or more resistant (BALB/c, C57BL/6) to tuberculosis

(134). For all of our expression studies, we utilised an outbred strain (CD-1) that is susceptible, whereas in both of the studies to which we compared our data, the more resistant C57BL/6 mice were used as the source of bone-marrow-derived macrophages.

Correlating our dataset to that of the TraSH study (170), although the strain of M. tuberculosis used and the time-point assessed were the same, the preparation of the bacteria for infection, as well as the treatment of monolayers after infection, differed between our protocols, which may have influenced intracellular behaviours and requirements of the bacteria.

Bacteria used for the TraSH study were sonicated to disperse clumps prior to infection whereas for our experiments, mycobacteria were syringed prior to infection. It was previously found that the two methods of clump dispersal differentially affect the capsular envelope of M. tuberculosis: sonication actually alters the capsule resulting in increased binding of the bacterium to macrophages (207). It could be theorised that other processes which may affect host-pathogen interactions have the potential to be altered with sonication. Furthermore, after inoculating the macrophage monolayer with M. tuberculosis, monolayers were thoroughly washed and then treated with amikacin to remove unbound bacteria. Amikacin is an antibiotic of the aminoglycoside family and along with gentamicin has been commonly used in studies of intracellular pathogens as these antibiotics have been supposed to be excluded from host cells

(52). Increasingly, however, reports have surfaced describing the bactericidal activity of these antibiotics on intracellular bacteria (47, 54, 78, 152). It is likely that washing monolayers thoroughly with PBS, as was done in this study, was sufficient to remove extracellular bacteria.

The treatment of the monolayer with antibiotics to exclude extracellular bacteria could have

potentially introduced an additional bactericidal element that may have affected the mutants

harvested from the screen, and thus, also impacted the correlation of our datasets. Results

common to both studies highlight genes that have roles in survival of bacteria (no matter their

physiological state) inside the murine host macrophage.

The ultimate goal of our microarray study, however, was the comparison and study of transcriptomic profiles of M. tuberculosis H37Ra and H37Rv first to understand reasons for their

phenotypic differences inside the host, and second, to understand the roles of those differences in

mycobacterial pathogenesis.

Among the more interesting observations in recent comparisons between H37Ra and

H37Rv has been the base-pair polymorphism in phoP between the strains (113, 226). From our

studies, it appears that this base-pair difference in H37Ra may have affected the regulation of

this two-component system in that even with higher transcription of phoP in H37Ra, genes under

PhoPR regulation are in fact expressed at higher levels in H37Rv. Previous studies into phoP

mutants have described similarities between the lipid profiles, especially the components of

PAT, DAT and sulpholipids, of the phoP mutants and H37Ra (69, 158, 225) – although it is

unclear if the attenuation of the phoP mutant is on par with H37Ra. The increased transcript

levels of genes involved in PAT, DAT, and sulpholipid synthesis as seen with our study has been

theorised to be required for remodelling of the bacterial cell envelope upon internalisation (69).

Additionally, PhoPR involvement in metabolic processes such as fatty acid metabolism has led

to the supposition that faulty regulation via this two-component system could also affect

bacterial adaptation to host environments in its ability to utilise alternate means of lipid

degradation and synthesis.

It is unlikely, however, that the difference in phoP genomics and transcriptomics is the

sole reason for pathogenic differences between H37Ra and H37Rv. Instead, the ESX-1 secretion

system is also likely to have contributed to the differing phenotypes. As described above,

members of RD1 encode components of ESX-1, which secretes ESAT-6 and its chaperone Cfp-

10. Recently, it was discovered that additional gene products encoded by Rv3614c-Rv3616c

interact with and are also secreted via ESX-1 (60, 119). Given that the expression of these

associated products and an ESAT-6 family member are higher in intracellular H37Rv versus

H37Ra, one could hypothesise that the virulent H37Rv exhibits greater secretory activity.

Additionally, comparing the respective strains' adaptation to the intracellular environment versus

broth culture, H37Rv induced a much more robust transcription of RD1 genes, ESAT-6 family member genes, as well as the associated genes. This same response was not observed for

H37Ra. If the secreted proteins do indeed act as effectors modulating host responses to M.

tuberculosis, it could be reasoned that H37Ra, with its decreased expression of these putative

virulence factors is at a disadvantage inside the host cell compared to H37Rv.

Overall, the data obtained with microarrays are complementary to those obtained with

BACFA analysis in that these data also suggest a scenario where H37Rv is better equipped to

adapt to novel environments. Certainly, these microarray experiments generated a much larger

volume of data than the BACFA analysis, and the question remains if this is due to the fact that

labelled cDNA probes used for these array experiments were generated with random primers

(that should not bias and enrich for specific transcripts) rather than a defined arbitrary sequence.

The fact that the same differences identified using BACFA were not identified with microarrays

does seem to suggest that perhaps length and sequence of primers used in cDNA generation affect the representation of transcripts, but that remains to be defined experimentally.

Specifically, the microarray experiments did not confirm the differential expression of the frd

operon, as detected by BACFA analysis. This had at first seemed an obvious control to compare

the two methods, but instead, this highlights an important advantage of the BACFA technique:

the ability to identify differences in operons. Had the genes of the frd operon been on different

fragments in the fingerprint arrays, the individual signals would have been too weak to have

warranted further analysis even though our additional studies indicate that fumarate reductase

may have a role in mycobacterial pathogenesis. With microarrays, however, each gene has its

own individual spot, and unless programmes are set to cluster signals below significance, genes

whose expression is biologically, if not numerically, significant may be overlooked.

The array data presented here expand upon our understanding of the response of intracellular mycobacteria to the host environment, and allow a dissection of pathogenic

mechanisms as related to the differences between H37Ra and H37Rv. Generally, what can be deduced from the results here is that the virulent H37Rv induces a much more robust transcriptomic profile that includes genes whose products allow it to adequately cope with metabolic challenges faced within the host environment. H37Ra on the other hand, seems to initiate a much more moderate, or, in the case of frd, a delayed response which may not be as

constructive for adapting to the novel host environment, and which may explain, in part, its

diminished virulence compared to H37Rv.

5.6 Future directions

Utilising microarray technologies to assess transcriptomic profiles of intracellular H37Ra

and H37Rv, we were indeed able to isolate many more candidates for mycobacterial virulence

compared to BACFA analysis initiated in Chapter 4. However, as the differences detected using

BACFA analysis were not detected with microarray analysis, the question remains if primer

sequence and length contributed to the differences in results obtained with the respective methodologies. Thus, one future experiment would involve Cy-dye labelling of cDNA transcripts generated with the arbitrary primer, Uniprime. These labelled cDNAs would then be used in microarray hybridisations, and the data obtained could be compared with differences isolated with BACFA analysis to assess correlation between primers.

The modelling of the structural impact of the base-pair difference in H37Ra and H37Rv

phoP sequence could also be examined in future. As the base-pair change results in the

replacement of a polar amino acid (serine) with a non-polar amino acid (isoleucine or leucine), it

is possible that the binding kinetics of PhoP have been altered. Thus, it would be interesting to assess the structural changes elicited by this change in sequence, and how they might have affected the interactions of PhoP. Furthermore, assessing the survival and growth of an H37Rv phoP mutant in a host or host cell alongside those of H37Ra and H37Rv would allow a direct comparison of the attenuation of the phoP mutant with that of H37Ra.

Gene      Fold difference over H37Rv gDNA ± SEM  Gene description
Mb1785c   2.35 ± 0.33                            Conserved hypothetical protein (unknown function), showing similarity with glycosyl transferases, sulfolipid sulfoquinovosyldiacylglycerol synthases, and hypothetical proteins. No equivalent in Mycobacterium tuberculosis strain H37Rv. Belongs to the RvD2 region.
Mb1786    7.83 ± 1.34                            Possible sulfite oxidase involved in the degradation of sulphur containing compounds. No equivalent in Mycobacterium tuberculosis strain H37Rv. Belongs to the RvD2 region.
Mb1787    3.47 ± 0.25                            Probable mmpL14, conserved transmembrane transport protein – unknown function, but thought to be involved in fatty acid transport. No equivalent in Mycobacterium tuberculosis strain H37Rv. Belongs to the RvD2 region.

Table 5.1 Genomic differences revealed in microarray comparisons of genomic DNA from M. tuberculosis H37Ra and H37Rv. Microarray hybridisation profiles were generated using DNA probes from both H37Ra and H37Rv (2 separate populations of each strain hybridised in duplicate). Imagene files for two different populations were compared and analysed using Arraypipe (http://koch.pathogenomics.ca/cgi-bin/pub/arraypipe.pl). Genes were filtered for 1.5-fold difference and then analysed via ANOVA. Only genes significantly different (P<0.05) are presented in this table. Gene details were sought via Bovilist (http://genolist.pasteur.fr/BoviList/).

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-3119      1.90            moaE1        6.69E-05  0.35  PROBABLE MOLYBDENUM BIOSYNTHESIS PROTEIN E MOAE1 (MOLYBDOPTERIN CONVERTING FACTOR LARGE SUBUNIT) (MOLYBDOPTERIN [MPT] CONVERTING FACTOR, SUBUNIT 2)
MtH37Rv-3053c     2.46            nrdH         4.08E-03  1.38  PROBABLE GLUTAREDOXIN ELECTRON TRANSPORT COMPONENT OF NRDEF (GLUTAREDOXIN-LIKE PROTEIN) NRDH
MtH37Rv-0757      2.72            phoP         1.39E-05  0.64  POSSIBLE TWO COMPONENT SYSTEM RESPONSE TRANSCRIPTIONAL POSITIVE REGULATOR PHOP
MtH37Rv-0722      1.87            rpmD         7.46E-04  0.52  PROBABLE 50S RIBOSOMAL PROTEIN L30 RPMD
MtH37Rv-0667      2.19            rpoB         3.83E-03  1.07  DNA-DIRECTED RNA POLYMERASE (BETA CHAIN) RPOB (TRANSCRIPTASE BETA CHAIN) (RNA POLYMERASE BETA SUBUNIT)

Table 5.2 Expression differences between strains grown in broth: genes upregulated in broth-grown M. tuberculosis H37Ra versus H37Rv. Three populations of RNA from each of M. tuberculosis H37Ra and H37Rv grown in 7H9 broth were hybridised in duplicate to M. tuberculosis microarrays supplied by Bµg@S (http://www.bugs.sgul.ac.uk/). Arrays were normalised and expression analysed as specified in Section 2.6.3.2. Statistical significance of fold-difference across all three populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) upregulated in broth-grown H37Ra versus broth-grown H37Rv are listed. SD = standard deviation.

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-1622c     0.50            cydB         2.80E-04  0.09  Probable integral membrane cytochrome D ubiquinol oxidase (subunit II) cydB (Cytochrome BD-I oxidase subunit II)
MtH37Rv-1185c     0.49            fadD21                       PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0834c     0.49            PE_PGRS14    2.55E-03  0.19  PE-PGRS FAMILY PROTEIN
MtH37Rv-1452c     0.26            PE_PGRS28    2.26E-03  0.20  PE-PGRS FAMILY PROTEIN
MtH37Rv-2487c     0.39            PE_PGRS42    1.21E-03  0.17  PE-PGRS FAMILY PROTEIN
MtH37Rv-3590c     0.49            PE_PGRS58    3.49E-03  0.20  PE-PGRS FAMILY PROTEIN
MtH37Rv-3022A     0.42            PE29         5.06E-03  0.19  PE FAMILY PROTEIN
MtH37Rv-3477      0.22            PE31         6.41E-03  0.26  PE FAMILY PROTEIN
MtH37Rv-1180      0.26            pks3         8.82E-03  0.29  PROBABLE POLYKETIDE BETA-KETOACYL SYNTHASE PKS3
MtH37Rv-1361c     0.34            PPE19        1.38E-03  0.18  PPE FAMILY PROTEIN
MtH37Rv-3022c     0.45            PPE48        3.65E-04  0.14  PPE FAMILY PROTEIN

Table 5.3 Expression differences between strains grown in broth: genes downregulated in broth-grown M. tuberculosis H37Ra versus H37Rv. Three populations of RNA from each of M. tuberculosis H37Ra and H37Rv grown in 7H9 broth were hybridised in duplicate to M. tuberculosis microarrays supplied by Bµg@S (http://www.bugs.sgul.ac.uk/). Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all three populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) downregulated in broth-grown H37Ra versus broth-grown H37Rv are listed. SD = standard deviation.
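
The filtering described in these captions (a 1.5-fold cut-off combined with a significance threshold of P<0.05) can be illustrated with a minimal Python sketch. It assumes that the fold differences and ANOVA P-values have already been computed, treats the cut-off as symmetric (a ratio of 1/1.5 or less counts as downregulated, which is an assumption), and is not the Arraypipe implementation. The first four rows are taken from Tables 5.2 and 5.3; geneX and geneY are hypothetical entries added only to show genes that fail one of the two criteria.

```python
# Minimal sketch of a 1.5-fold and P<0.05 filter (assumed symmetric cut-off).
FOLD_CUTOFF = 1.5   # fold-difference threshold quoted in the captions
P_CUTOFF = 0.05     # significance threshold quoted in the captions

# (common name, fold difference H37Ra/H37Rv, ANOVA P-value)
rows = [
    ("phoP", 2.72, 1.39e-05),   # Table 5.2
    ("rpoB", 2.19, 3.83e-03),   # Table 5.2
    ("PE31", 0.22, 6.41e-03),   # Table 5.3
    ("pks3", 0.26, 8.82e-03),   # Table 5.3
    ("geneX", 1.20, 1.00e-03),  # hypothetical: significant, but below the fold cut-off
    ("geneY", 3.00, 2.00e-01),  # hypothetical: large fold difference, but not significant
]


def classify(fold, p_value, fold_cutoff=FOLD_CUTOFF, p_cutoff=P_CUTOFF):
    """Return 'up', 'down', or None for one gene."""
    if p_value >= p_cutoff:
        return None
    if fold >= fold_cutoff:
        return "up"
    if fold <= 1.0 / fold_cutoff:
        return "down"
    return None


upregulated = [gene for gene, fold, p in rows if classify(fold, p) == "up"]
downregulated = [gene for gene, fold, p in rows if classify(fold, p) == "down"]

print("Upregulated in H37Ra:", upregulated)
print("Downregulated in H37Ra:", downregulated)
```

Because a fold-induction value is a ratio, values below 1 denote lower expression in H37Ra, so the downregulated cut-off is expressed here as the reciprocal of the upregulated cut-off.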

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-2428      10.45           ahpC         2.27E-03  0.14  ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
MtH37Rv-2429      4.00            ahpD         1.81E-03  0.17  ALKYL HYDROPEROXIDE REDUCTASE D PROTEIN AHPD (ALKYL HYDROPEROXIDASE D)
MtH37Rv-0288      4.88            cfp7         4.46E-03  0.28  Low molecular weight protein antigen 7 cfp7 (10 kDa antigen) (CFP-7) (Protein TB10.4)
MtH37Rv-3596c     1.79            clpC         4.52E-03  0.22  PROBABLE ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPC
MtH37Rv-1464      2.43            csd          2.15E-02  0.27  PROBABLE CYSTEINE DESULFURASE CSD
MtH37Rv-0905      1.64            echA6        6.54E-03  0.21  POSSIBLE ENOYL-COA HYDRATASE ECHA6 (ENOYL HYDRASE) (UNSATURATED ACYL-COA HYDRATASE) (CROTONASE)
MtH37Rv-1483      1.73            fabG1        2.15E-02  0.27  3-OXOACYL-[ACYL-CARRIER PROTEIN] REDUCTASE FABG1 (3-KETOACYL-ACYL CARRIER PROTEIN REDUCTASE) (MYCOLIC ACID BIOSYNTHESIS A PROTEIN)
MtH37Rv-0242c     1.89            fabG4        1.27E-02  0.29  PROBABLE 3-OXOACYL-[ACYL-CARRIER PROTEIN] REDUCTASE FABG4 (3-KETOACYL-ACYL CARRIER PROTEIN REDUCTASE)
MtH37Rv-1074c     1.82            fadA3        1.75E-02  0.21  PROBABLE BETA-KETOACYL COA THIOLASE FADA3
MtH37Rv-1323      1.97            fadA4        3.08E-02  0.25  PROBABLE ACETYL-COA ACETYLTRANSFERASE FADA4 (ACETOACETYL-COA THIOLASE)
MtH37Rv-0468      1.95            fadB2        1.89E-02  0.28  PROBABLE 3-HYDROXYBUTYRYL-COA DEHYDROGENASE FADB2 (BETA-HYDROXYBUTYRYL-COA DEHYDROGENASE) (BHBD)
MtH37Rv-3089      1.74            fadD13       3.97E-02  0.24  PROBABLE CHAIN -FATTY-ACID-COA LIGASE FADD13 (FATTY-ACYL-COA SYNTHETASE)
MtH37Rv-2187      1.82            fadD15       3.43E-02  0.40  Probable long-chain-fatty-acid-CoA ligase fadD15 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-3515c     2.01            fadD19       3.96E-03  0.14  PROBABLE FATTY-ACID-COA LIGASE FADD19 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0270      1.72            fadD2        3.37E-02  0.23  PROBABLE FATTY-ACID-COA LIGASE FADD2 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1185c     2.03            fadD21       1.33E-02  0.39  PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-2930      2.11            fadD26       1.81E-03  0.16  FATTY-ACID-COA LIGASE FADD26 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-2941      2.01            fadD28       5.79E-03  0.28  FATTY-ACID-CoA LIGASE FADD28 (FATTY-ACID-CoA SYNTHETASE) (FATTY-ACID-CoA SYNTHASE)
MtH37Rv-2950c     1.74            fadD29       1.12E-02  0.21  PROBABLE FATTY-ACID-COA LIGASE FADD29 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1925      2.34            fadD31       4.20E-02  0.58  PROBABLE ACYL-COA LIGASE FADD31 (ACYL-COA SYNTHETASE) (ACYL-COA SYNTHASE)
MtH37Rv-1345      2.07            fadD33       1.81E-03  0.15  POSSIBLE POLYKETIDE SYNTHASE FADD33
MtH37Rv-1933c     2.01            fadE18       4.81E-02  0.35  PROBABLE ACYL-COA DEHYDROGENASE FADE18
MtH37Rv-2789c     1.76            fadE21       2.72E-03  0.15  PROBABLE ACYL-COA DEHYDROGENASE FADE21
MtH37Rv-0244c     1.82            fadE5        7.80E-03  0.17  PROBABLE ACYL-COA DEHYDROGENASE FADE5
MtH37Rv-1131      2.74            gltA1        1.81E-03  0.15  PROBABLE CITRATE SYNTHASE I GLTA1
MtH37Rv-0467      3.30            icl          6.96E-03  0.20  ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
MtH37Rv-3874      2.35            lhp          6.54E-03  0.28  10 KDA CULTURE FILTRATE ANTIGEN LHP (CFP10)
MtH37Rv-2383c     2.97            mbtB         1.81E-03  0.18  PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
MtH37Rv-2381c     3.97            mbtD         1.11E-02  0.49  POLYKETIDE SYNTHETASE MBTD (POLYKETIDE SYNTHASE)
MtH37Rv-2380c     4.05            mbtE         6.54E-03  0.40  PEPTIDE SYNTHETASE MBTE (PEPTIDE SYNTHASE)
MtH37Rv-2379c     4.25            mbtF         1.81E-03  0.43  PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
MtH37Rv-2378c     2.39            mbtG         2.43E-02  0.34  LYSINE-N-OXYGENASE MBTG (L-LYSINE 6-MONOOXYGENASE) (LYSINE N6-HYDROXYLASE)
MtH37Rv-2377c     4.12            mbtH         1.81E-03  0.24  PUTATIVE CONSERVED PROTEIN MBTH
MtH37Rv-2386c     4.95            mbtI         1.81E-03  0.36  PUTATIVE ISOCHORISMATE SYNTHASE MBTI
MtH37Rv-2385      2.53            mbtJ         7.20E-03  0.40  PUTATIVE ACETYL HYDROLASE MBTJ
MtH37Rv-0642c     1.69            mmaA4        4.06E-02  0.42  METHOXY MYCOLIC ACID SYNTHASE 4 MMAA4 (METHYL MYCOLIC ACID SYNTHASE 4) (MMA4) (HYDROXY MYCOLIC ACID SYNTHASE)
MtH37Rv-0211      4.23            pckA         4.55E-03  0.47  PROBABLE PHOSPHOENOLPYRUVATE CARBOXYKINASE [GTP] PCKA (PHOSPHOENOLPYRUVATE CARBOXYLASE) (PEPCK)(PEP CARBOXYKINASE)
MtH37Rv-0820      1.64            phoT         2.74E-02  0.25  PROBABLE PHOSPHATE-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER PHOT
MtH37Rv-0282      3.83            Rv0282       6.54E-03  0.28  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1130      3.67            Rv1130       6.54E-03  0.23  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1344      6.87            Rv1344       1.82E-03  0.30  PROBABLE ACYL CARRIER PROTEIN (ACP)
MtH37Rv-1348      2.25            Rv1348       1.81E-03  0.19  PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1349      2.25            Rv1349       5.93E-03  0.32  PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1395      1.51            Rv1395       4.48E-02  0.18  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1460      2.86            Rv1460       1.67E-03  0.19  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1461      2.70            Rv1461       3.69E-03  0.20  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1462      2.62            Rv1462       2.78E-02  0.48  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1463      4.14            Rv1463       3.41E-03  0.15  PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1465      2.08            Rv1465       1.12E-02  0.30  POSSIBLE NITROGEN FIXATION RELATED PROTEIN
MtH37Rv-3229c     2.30            Rv3229c      3.35E-02  0.47  POSSIBLE LINOLEOYL-COA DESATURASE (DELTA(6)-DESATURASE)
MtH37Rv-3311      1.64            Rv3311       3.47E-02  0.12  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3614c     1.66            Rv3614c      4.96E-02  0.40  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3839      6.87            Rv3839       3.37E-03  0.30  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3868      1.65            Rv3868       1.80E-02  0.22  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3869      1.50            Rv3869       3.47E-02  0.21  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-3870      1.46            Rv3870       2.22E-02  0.09  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-3871      2.52            Rv3871       9.92E-03  0.24  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3876      2.02            Rv3876       2.88E-02  0.21  CONSERVED HYPOTHETICAL PROLINE AND ALANINE RICH PROTEIN
MtH37Rv-3877      1.47            Rv3877       2.31E-02  0.17  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3240c     1.93            secA1        2.13E-02  0.34  PROBABLE PREPROTEIN SECA1 1 SUBUNIT
MtH37Rv-1821      1.98            secA2        1.79E-02  0.26  POSSIBLE PREPROTEIN TRANSLOCASE SECA2
MtH37Rv-0469      2.44            umaA         2.68E-03  0.28  POSSIBLE MYCOLIC ACID SYNTHASE UMAA

Table 5.4 Genes upregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv. Three populations of RNA from each of intracellular and broth-grown H37Rv were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) upregulated in intracellular H37Rv versus broth-grown H37Rv are listed. SD = standard deviation.

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-2428      2.91            ahpC         1.61E-02  0.14  ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
MtH37Rv-0288      3.82            cfp7         4.39E-02  0.28  Low molecular weight protein antigen 7 cfp7 (10 kDa antigen) (CFP-7) (Protein TB10.4)
MtH37Rv-1185c     1.47            fadD21       3.66E-02  0.39  PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1345      3.41            fadD33       8.76E-04  0.15  POSSIBLE POLYKETIDE SYNTHASE FADD33
MtH37Rv-0244c     2.02            fadE5        1.94E-02  0.17  PROBABLE ACYL-COA DEHYDROGENASE FADE5
MtH37Rv-0467      7.29            icl          3.27E-04  0.20  ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
MtH37Rv-2383c     3.47            mbtB         4.61E-03  0.18  PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
MtH37Rv-2379c     4.07            mbtF         4.30E-02  0.43  PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
MtH37Rv-2377c     2.98            mbtH         7.96E-03  0.24  PUTATIVE CONSERVED PROTEIN MBTH
MtH37Rv-2386c     3.88            mbtI         1.67E-03  0.36  PUTATIVE ISOCHORISMATE SYNTHASE MBTI
MtH37Rv-0282      3.63            Rv0282       1.09E-02  0.28  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1344      6.52            Rv1344       2.74E-03  0.30  PROBABLE ACYL CARRIER PROTEIN (ACP)
MtH37Rv-1395      1.93            Rv1395       4.84E-02  0.18  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1460      2.71            Rv1460       8.76E-04  0.19  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1463      3.39            Rv1463       8.76E-04  0.15  PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-3019c     3.40            Rv3019c      1.74E-02  0.40  PUTATIVE SECRETED ESAT-6 LIKE PROTEIN 9
MtH37Rv-3229c     3.50            Rv3229c      4.84E-03  0.47  POSSIBLE LINOLEOYL-COA DESATURASE (DELTA(6)-DESATURASE)
MtH37Rv-3839      11.77           Rv3839       2.58E-04  0.30  CONSERVED HYPOTHETICAL PROTEIN

Table 5.5 Genes upregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra. Three populations of RNA from each of intracellular and broth-grown H37Ra were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) upregulated in intracellular H37Ra versus broth-grown H37Ra are listed. SD = standard deviation.

Gene      Product
MT2421    conserved hypothetical protein
Mb3435c   conserved hypothetical protein
PE28      PE FAMILY PROTEIN
PPE38     PPE FAMILY PROTEIN
PPE4      PPE FAMILY PROTEIN
Rv0146    conserved hypothetical protein
Rv0282    conserved hypothetical protein
Rv0284    POSSIBLE CONSERVED MEMBRANE PROTEIN
Rv0289    conserved hypothetical protein
Rv0290    PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
Rv0292    PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
Rv1344    PROBABLE ACYL CARRIER PROTEIN (ACP)
Rv1352    conserved hypothetical protein
Rv1395    PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
Rv1460    PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
Rv1463    PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
Rv1519    conserved hypothetical protein
Rv2791c   PROBABLE TRANSPOSASE
Rv3019c   PUTATIVE SECRETED ESAT-6 LIKE PROTEIN 9
Rv3229c   POSSIBLE LINOLEOYL-COA DESATURASE (DELTA(6)-DESATURASE)
Rv3402c   conserved hypothetical protein
Rv3588c   CARBONIC ANHYDRASE (CARBONATE DEHYDRATASE) (CARBONIC DEHYDRATASE)
Rv3839    conserved hypothetical protein
ahpC      ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
cfp7      LOW MOLECULAR WEIGHT PROTEIN ANTIGEN 7 CFP7 (10 KDA ANTIGEN) (CFP-7) (PROTEIN TB10.4)
fadD21    PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
fadD33    POSSIBLE POLYKETIDE SYNTHASE FADD33
fadE5     PROBABLE ACYL-COA DEHYDROGENASE FADE5
icl       ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
mbtB      PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
mbtF      PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
mbtH      PUTATIVE CONSERVED PROTEIN MBTH
mbtI      PUTATIVE ISOCHORISMATE SYNTHASE MBTI
mmpL4     PROBABLE CONSERVED TRANSMEMBRANE TRANSPORT PROTEIN MMPL4
trcR      TWO COMPONENT TRANSCRIPTIONAL REGULATOR TRCR

Table 5.6 Genes upregulated in intracellular M. tuberculosis versus broth-grown bacteria. The lists of genes induced by intracellular H37Ra and H37Rv versus their respective broth-grown counterparts (Tables 5.4 and 5.5) were compared; genes induced by both strains in association with the macrophage are listed above.
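
The comparison of gene lists behind Table 5.6 amounts to a set intersection. The Python sketch below is illustrative only: it uses a small subset of the common names from Tables 5.4 and 5.5 rather than the full datasets, and the function name venn_regions is hypothetical.

```python
# Illustrative subsets of the common names listed in Tables 5.4 and 5.5
# (not the full datasets).
induced_h37rv = {"ahpC", "ahpD", "cfp7", "clpC", "icl", "mbtB", "mbtD", "mbtF"}
induced_h37ra = {"ahpC", "cfp7", "icl", "mbtB", "mbtF"}


def venn_regions(set_a, set_b):
    """Split two gene sets into the three regions of a two-circle Venn diagram."""
    return {
        "only_a": sorted(set_a - set_b),   # induced in the first strain only
        "shared": sorted(set_a & set_b),   # induced in both strains
        "only_b": sorted(set_b - set_a),   # induced in the second strain only
    }


regions = venn_regions(induced_h37rv, induced_h37ra)
for name, genes in regions.items():
    print(name, genes)
```

With these subsets, the shared region contains only genes that also appear in Table 5.6, while ahpD, clpC and mbtD fall in the H37Rv-only region.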

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-1475c     0.58            acn          1.42E-02  0.23  PROBABLE ACONITATE HYDRATASE ACN (Citrate hydro-lyase) (Aconitase)
MtH37Rv-1623c     0.48            cydA         2.50E-02  0.83  Probable integral membrane cytochrome D ubiquinol oxidase (subunit I) cydA (Cytochrome BD-I oxidase subunit I)
MtH37Rv-1622c     0.39            cydB         7.18E-03  0.31  Probable integral membrane cytochrome D ubiquinol oxidase (subunit II) cydB (Cytochrome BD-I oxidase subunit II)
MtH37Rv-1550      0.50            fadD11       9.46E-03  0.13  PROBABLE FATTY-ACID-COA LIGASE FADD11 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0551c     0.46            fadD8        2.44E-02  0.31  PROBABLE FATTY-ACID-COA LIGASE FADD8 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0975c     0.45            fadE13       1.12E-02  0.22  PROBABLE ACYL-COA DEHYDROGENASE FADE13
MtH37Rv-1934c     0.66            fadE17       4.35E-02  0.16  PROBABLE ACYL-COA DEHYDROGENASE FADE17
MtH37Rv-3505      0.78            fadE27       3.61E-02  0.23  PROBABLE ACYL-COA DEHYDROGENASE FADE27
MtH37Rv-3562      0.53            fadE31       4.15E-02  0.35  ACYL-COA DEHYDROGENASE FADE31
MtH37Rv-3564      0.44            fadE33       2.24E-02  0.39  ACYL-COA DEHYDROGENASE FADE33
MtH37Rv-0752c     0.61            fadE9        2.42E-02  0.24  ACYL-COA DEHYDROGENASE FADE9
MtH37Rv-2544      0.42            lppB         7.80E-03  0.40  PROBABLE CONSERVED LIPOPROTEIN LPPB
MtH37Rv-2341      0.44            lppQ         1.08E-02  0.13  PROBABLE CONSERVED LIPOPROTEIN LPPQ
MtH37Rv-3298c     0.30            lpqC         7.83E-03  0.34  POSSIBLE LIPOPROTEIN LPQC
MtH37Rv-0604      0.55            lpqO         4.46E-03  0.17  PROBABLE CONSERVED LIPOPROTEIN LPQO
MtH37Rv-0838      0.30            lpqR         3.21E-03  0.24  PROBABLE CONSERVED LIPOPROTEIN LPQR
MtH37Rv-1970      0.58            lprM         4.55E-02  0.47  POSSIBLE MCE-FAMILY LIPOPROTEIN LPRM (MCE-FAMILY LIPOPROTEIN MCE3E)
MtH37Rv-0109      0.62            PE_PGRS1     2.94E-02  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-0754      0.35            PE_PGRS11    2.42E-02  0.36  PE-PGRS FAMILY PROTEIN
MtH37Rv-0834c     0.68            PE_PGRS14    2.36E-02  0.14  PE-PGRS FAMILY PROTEIN
MtH37Rv-0978c     0.30            PE_PGRS17    4.34E-03  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-1396c     0.23            PE_PGRS25    5.38E-03  0.31  PE-PGRS FAMILY PROTEIN
MtH37Rv-1450c     0.29            PE_PGRS27    6.96E-03  0.50  PE-PGRS FAMILY PROTEIN
MtH37Rv-1452c     0.36            PE_PGRS28    4.21E-02  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-1468c     0.66            PE_PGRS29    4.04E-02  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-1651c     0.53            PE_PGRS30    2.56E-02  0.18  PE-PGRS FAMILY PROTEIN
MtH37Rv-1818c     0.51            PE_PGRS33    7.98E-03  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-2098c     0.63            PE_PGRS36    3.14E-02  0.25  conserved hypothetical protein, PE_PGRS family
MbAF212297-0285c  0.55            PE_PGRS3a    1.27E-02  0.23  PE-PGRS FAMILY PROTEIN [FIRST PART]
MtH37Rv-2487c     0.33            PE_PGRS42    2.68E-03  0.24  PE-PGRS FAMILY PROTEIN
MtH37Rv-2634c     0.35            PE_PGRS46    8.93E-03  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-2853      0.40            PE_PGRS48    1.92E-02  0.15  PE-PGRS FAMILY PROTEIN
MbAF212297-3541   0.46            PE_PGRS55    4.48E-02  0.25  PE-PGRS FAMILY PROTEIN
MtH37Rv-3590c     0.58            PE_PGRS58    2.75E-02  0.16  PE-PGRS FAMILY PROTEIN
MtH37Rv-3653      0.32            PE_PGRS61    2.51E-02  0.30  PE-PGRS FAMILY PROTEIN
MbAF212297-0767   0.48            PE_PGRS9     5.79E-03  0.20  PE-PGRS FAMILY PROTEIN
MtH37Rv-0746      0.50            PE_PGRS9     6.52E-03  0.13  PE-PGRS FAMILY PROTEIN
MtH37Rv-0151c     0.53            PE1          2.88E-02  0.20  PE FAMILY PROTEIN
MtH37Rv-3022A     0.56            PE29         2.22E-02  0.23  PE FAMILY PROTEIN
MtH37Rv-3097c     0.34            PE30         3.19E-02  0.49  PE-PGRS FAMILY PROTEIN, PROBABLY TRIACYLGLYCEROL (ESTERASE/LIPASE) (TRIGLYCERIDE LIPASE) (TRIBUTYRASE)
MtH37Rv-3622c     0.54            PE32         4.35E-02  0.12  PE FAMILY PROTEIN
MtH37Rv-0160c     0.45            PE4          1.53E-02  0.14  PE FAMILY PROTEIN
MtH37Rv-0453      0.56            PPE11        1.57E-02  0.18  PPE FAMILY PROTEIN
MtH37Rv-1135c     0.63            PPE16        7.20E-03  0.21  PPE FAMILY PROTEIN
MtH37Rv-1790      0.57            PPE27        1.70E-02  0.15  PPE FAMILY PROTEIN
MtH37Rv-1917c     0.35            PPE34        2.56E-02  0.20  PPE FAMILY PROTEIN
MbAF212297-1951c  0.43            PPE34        4.56E-02  0.41  PPE FAMILY PROTEIN
MtH37Rv-2768c     0.45            PPE43        6.47E-03  0.22  PPE FAMILY PROTEIN
MtH37Rv-2892c     0.53            PPE45        3.03E-02  0.16  PPE FAMILY PROTEIN
MtH37Rv-3022c     0.46            PPE48        1.27E-02  0.18  PPE FAMILY PROTEIN
MtH37Rv-3125c     0.71            PPE49        2.29E-02  0.16  PPE FAMILY PROTEIN
MtH37Rv-3539      0.60            PPE63        3.32E-02  0.23  PPE FAMILY PROTEIN
MtH37Rv-3621c     0.51            PPE65        3.18E-02  0.25  PPE FAMILY PROTEIN
MtH37Rv-3371      0.43            Rv3371       2.88E-02  0.44  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3897c     0.63            Rv3897c      4.46E-03  0.20  CONSERVED HYPOTHETICAL PROTEIN

Table 5.7 Genes downregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv. Three populations of RNA from each of intracellular and broth-grown H37Rv were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) downregulated in intracellular H37Rv versus broth-grown H37Rv are listed. SD = standard deviation.

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-1475c     0.58            acn          1.94E-02  0.23  PROBABLE ACONITATE HYDRATASE ACN (Citrate hydro-lyase) (Aconitase)
MtH37Rv-1876      0.35            bfrA         1.95E-03  0.23  PROBABLE BACTERIOFERRITIN BFRA
MtH37Rv-3841      0.23            bfrB         1.12E-02  1.71  POSSIBLE BACTERIOFERRITIN BFRB
MtH37Rv-3371      0.51            Rv3371       4.84E-02  0.44  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3897c     0.58            Rv3897c      4.77E-02  0.20  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3846      0.27            sodA         1.80E-02  0.50  SUPEROXIDE DISMUTASE [FE] SODA

Table 5.8 Genes downregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra. Three populations of RNA from each of intracellular and broth-grown H37Ra were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) downregulated in intracellular H37Ra versus broth-grown H37Ra are listed. SD = standard deviation.

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-1992c     3.16            ctpG         5.72E-05  3.52  PROBABLE METAL CATION TRANSPORTER P-TYPE ATPASE G CTPG
MtH37Rv-0824c     2.67            desA1        2.60E-03  2.34  PROBABLE ACYL-[ACYL-CARRIER PROTEIN] DESATURASE DESA1 (ACYL-[ACP] DESATURASE) (STEAROYL-ACP DESATURASE) (PROTEIN DES)
MtH37Rv-0341      6.73            iniB         1.01E-04  0.98  ISONIAZID INDUCTIBLE GENE PROTEIN INIB
MtH37Rv-1854c     2.13            ndh          6.91E-04  0.77  PROBABLE NADH DEHYDROGENASE NDH
MtH37Rv-0757      1.71            phoP         2.86E-04  0.92  POSSIBLE TWO COMPONENT SYSTEM RESPONSE TRANSCRIPTIONAL POSITIVE REGULATOR PHOP
MtH37Rv-3429      2.06            PPE59        9.31E-04  0.74  PPE FAMILY PROTEIN
MtH37Rv-0320      2.33            Rv0320       1.09E-03  0.85  POSSIBLE CONSERVED EXPORTED PROTEIN
MtH37Rv-1990A     2.32            Rv1990A      4.23E-03  0.66  POSSIBLE DEHYDROGENASE (FRAGMENT)
MtH37Rv-1991c     2.76            Rv1991c      4.61E-02  1.47  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1994c     3.60            Rv1994c      2.34E-03  0.60  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-2644c     1.78            Rv2644c      1.66E-03  0.48  HYPOTHETICAL PROTEIN
MtH37Rv-2662      1.84            Rv2662       1.05E-05  0.22  HYPOTHETICAL PROTEIN

Table 5.9 Genes induced in intracellular M. tuberculosis H37Ra compared to intracellular H37Rv. Three populations of RNA from each of intracellular H37Ra and H37Rv were reverse transcribed as per section 2.6.2.1 and hybridised to M. tuberculosis microarray slides (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate as described in section 2.6.1.2. Arrays were normalised and expression analysed as specified in Section 2.6.3.2, and statistical significance of fold-difference across all three populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) upregulated in intracellular H37Ra versus intracellular H37Rv are listed. SD = standard deviation.

Systematic Name   Fold-induction  Common Name  P-value   SD    Product
MtH37Rv-2376c     0.42            cfp2         2.37E-05  0.03  LOW MOLECULAR WEIGHT ANTIGEN CFP2 (LOW MOLECULAR WEIGHT PROTEIN ANTIGEN 2) (CFP-2)
MtH37Rv-2392      0.54            cysH         2.37E-05  0.07  PROBABLE 3'-PHOSPHOADENOSINE 5'-PHOSPHOSULFATE REDUCTASE CYSH (PAPS REDUCTASE, THIOREDOXIN DEP.) (PADOPS REDUCTASE) (3'-PHOSPHOADENYLYLSULFATE REDUCTASE) (PAPS SULFOTRANSFERASE).
MtH37Rv-1185c     0.28            fadD21       1.02E-03  0.05  PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-3487c     0.18            lipF         1.26E-05  0.06  PROBABLE ESTERASE/LIPASE LIPF
MbAF212297-3508   0.51            Mb3508       4.47E-07  0.03  HYPOTHETICAL TRANSMEMBRANE PROTEIN [THIRD PART]
MtH37Rv-1183      0.34            mmpL10       1.01E-04  0.08  PROBABLE CONSERVED TRANSMEMBRANE TRANSPORT PROTEIN MMPL10
MtH37Rv-3823c     0.32            mmpL8        3.15E-05  0.07  PROBABLE CONSERVED INTEGRAL MEMBRANE TRANSPORT PROTEIN MMPL8
MtCDC1551-1585.1  0.33            MT1585.1     3.83E-04  0.12  hypothetical protein
MtCDC1551-2467    0.13            MT2467       5.92E-04  0.14  hypothetical protein
MtCDC1551-3580.2  0.09            MT3580.2     2.86E-04  0.08  hypothetical protein
MtCDC1551-3718.1  0.49            MT3718.1     5.55E-05  0.05  hypothetical protein
MtH37Rv-1182      0.15            papA3        3.76E-04  0.12  PROBABLE CONSERVED POLYKETIDE SYNTHASE ASSOCIATED PROTEIN PAPA3
MtH37Rv-1195      0.54            PE13         1.96E-03  0.19  PE FAMILY PROTEIN
MtH37Rv-3477      0.16            PE31         2.37E-05  0.08  PE FAMILY PROTEIN
MtH37Rv-1180      0.26            pks3         3.14E-05  0.06  PROBABLE POLYKETIDE BETA-KETOACYL SYNTHASE PKS3
MtH37Rv-1181      0.28            pks4         1.44E-03  0.07  PROBABLE POLYKETIDE BETA-KETOACYL SYNTHASE PKS4
MtH37Rv-1196      0.29            PPE18        3.83E-04  0.07  PPE FAMILY PROTEIN
MtH37Rv-1361c     0.15            PPE19        1.09E-03  0.18  PPE FAMILY PROTEIN
MtH37Rv-3478      0.27            PPE60        5.65E-03  0.13  PE FAMILY PROTEIN
MtH37Rv-1179c     0.52            Rv1179c      8.74E-04  0.16  HYPOTHETICAL PROTEIN
MtH37Rv-1535      0.41            Rv1535       5.51E-04  0.08  HYPOTHETICAL PROTEIN
MtH37Rv-1638A     0.29            Rv1638A      8.52E-03  0.25  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1639c     0.48            Rv1639c      3.42E-04  0.11  CONSERVED HYPOTHETICAL MEMBRANE PROTEIN
MtH37Rv-2159c     0.54            Rv2159c      2.34E-03  0.18  conserved hypothetical protein
MtH37Rv-2288      0.40            Rv2288       4.81E-03  0.23  hypothetical protein
MtH37Rv-2331      0.48            Rv2331       1.04E-03  0.15  HYPOTHETICAL PROTEIN
MtH37Rv-3263      0.58            Rv3263       9.76E-03  0.24  PROBABLE DNA METHYLASE (MODIFICATION METHYLASE) (METHYLTRANSFERASE)
MtH37Rv-3479      0.29            Rv3479       2.86E-03  0.14  POSSIBLE TRANSMEMBRANE PROTEIN
MtH37Rv-3613c     0.27            Rv3613c      1.05E-03  0.13  HYPOTHETICAL PROTEIN
MtH37Rv-3614c     0.27            Rv3614c      2.77E-02  0.25  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3615c     0.34            Rv3615c      3.99E-04  0.09  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3616c     0.45            Rv3616c      9.46E-04  0.09  CONSERVED HYPOTHETICAL ALANINE AND GLYCINE RICH PROTEIN
MtH37Rv-3686c     0.19            Rv3686c      9.48E-04  0.16  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3822      0.15            Rv3822       4.12E-04  0.11  CONSERVED HYPOTHETICAL PROTEIN

Table 5.10 Genes repressed in intracellular M. tuberculosis H37Ra compared to intracellular H37Rv. Three populations of RNA from each of intracellular H37Ra and H37Rv were reverse transcribed as per section 2.6.2.1 and hybridised to M. tuberculosis microarray slides (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate as described in section 2.6.1.2. Arrays were normalised and expression analysed as specified in Section 2.6.3.2, and statistical significance of fold-difference across all three populations was analysed using ANOVA. Genes whose expression was found to be significantly (P<0.05) downregulated in intracellular H37Ra versus intracellular H37Rv are listed. SD = standard deviation.

Gene      Product
35kd_ag   CONSERVED 35 KDA ALANINE RICH PROTEIN
ahpC      ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
clpP1     PROBABLE ATP-DEPENDENT CLP PROTEASE PROTEOLYTIC SUBUNIT 1 CLPP1 (ENDOPEPTIDASE CLP)
csd       PROBABLE CYSTEINE DESULFURASE CSD
fadB2     PROBABLE 3-HYDROXYBUTYRYL-COA DEHYDROGENASE FADB2 (BETA-HYDROXYBUTYRYL-COA DEHYDROGENASE) (BHBD)
fadD19    PROBABLE FATTY-ACID-COA LIGASE FADD19 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
fadD26    FATTY-ACID-COA LIGASE FADD26 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
fadD33    POSSIBLE POLYKETIDE SYNTHASE FADD33
fadE5     PROBABLE ACYL-COA DEHYDROGENASE FADE5
icl       ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
lprD      PROBABLE CONSERVED LIPOPROTEIN LPRD
mbtB      PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
mbtD      POLYKETIDE SYNTHETASE MBTD (POLYKETIDE SYNTHASE)
mbtE      PEPTIDE SYNTHETASE MBTE (PEPTIDE SYNTHASE)
mbtF      PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
mbtG      LYSINE-N-OXYGENASE MBTG (L-LYSINE 6-MONOOXYGENASE) (LYSINE N6-HYDROXYLASE)
mbtH      PUTATIVE CONSERVED PROTEIN MBTH
mbtI      PUTATIVE ISOCHORISMATE SYNTHASE MBTI
mbtJ      PUTATIVE ACETYL HYDROLASE MBTJ
pckA      PROBABLE PHOSPHOENOLPYRUVATE CARBOXYKINASE [GTP] PCKA (PHOSPHOENOLPYRUVATE CARBOXYLASE) (PEPCK)(PEP CARBOXYKINASE)
sigB      RNA POLYMERASE SIGMA FACTOR SIGB
sigE      ALTERNATIVE RNA POLYMERASE SIGMA FACTOR SIGE
umaA      POSSIBLE MYCOLIC ACID SYNTHASE UMAA
Rv0282    CONSERVED HYPOTHETICAL PROTEIN
Rv0283    POSSIBLE CONSERVED MEMBRANE PROTEIN
Rv0284    POSSIBLE CONSERVED MEMBRANE PROTEIN
Rv0885    CONSERVED HYPOTHETICAL PROTEIN
Rv1072    PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
Rv1130    CONSERVED HYPOTHETICAL PROTEIN
Rv1348    PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
Rv1349    PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
Rv1397c   CONSERVED HYPOTHETICAL PROTEIN
Rv1461    CONSERVED HYPOTHETICAL PROTEIN
Rv1462    CONSERVED HYPOTHETICAL PROTEIN
Rv1465    POSSIBLE NITROGEN FIXATION RELATED PROTEIN
Rv1592c   CONSERVED HYPOTHETICAL PROTEIN
Rv2617c   PROBABLE TRANSMEMBRANE PROTEIN
Rv2619c   CONSERVED HYPOTHETICAL PROTEIN
Rv2660c   HYPOTHETICAL PROTEIN
Rv2706c   HYPOTHETICAL PROTEIN
Rv2791c   PROBABLE TRANSPOSASE
Rv2949c   CONSERVED HYPOTHETICAL PROTEIN
Rv3115    PROBABLE TRANSPOSASE
Rv3197    PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
Rv3402c   CONSERVED HYPOTHETICAL PROTEIN
Rv3403c   HYPOTHETICAL PROTEIN
Rv3526    POSSIBLE OXIDOREDUCTASE
Rv3839    CONSERVED HYPOTHETICAL PROTEIN

Table 5.11 Genes mutually induced in intracellular M. tuberculosis. Our dataset obtained after comparing the expression of intracellular H37Rv and broth-grown H37Rv (Appendix IV) was compared with a dataset published by Schnappinger et al. (188) examining the intracellular profile of M. tuberculosis strain 1254.

Gene      Product
clpC      PROBABLE ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPC
fabG1     3-OXOACYL-[ACYL-CARRIER PROTEIN] REDUCTASE FABG1 (3-KETOACYL-ACYL CARRIER PROTEIN REDUCTASE) (MYCOLIC ACID BIOSYNTHESIS A PROTEIN)
opcA      PUTATIVE OXPP CYCLE PROTEIN OPCA
phoT      PROBABLE PHOSPHATE-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER PHOT
rodA      PROBABLE CELL DIVISION PROTEIN RODA
secA2     POSSIBLE PREPROTEIN TRANSLOCASE SECA2
sigH      ALTERNATIVE RNA POLYMERASE SIGMA-E FACTOR (SIGMA-24) SIGH (RPOE)
Rv0464c   CONSERVED HYPOTHETICAL PROTEIN
Rv1211    CONSERVED HYPOTHETICAL PROTEIN
Rv1331    CONSERVED HYPOTHETICAL PROTEIN
Rv1334    CONSERVED HYPOTHETICAL PROTEIN
Rv2708c   CONSERVED HYPOTHETICAL PROTEIN
Rv3033    HYPOTHETICAL PROTEIN
Rv3269    CONSERVED HYPOTHETICAL PROTEIN
Rv3311    CONSERVED HYPOTHETICAL PROTEIN
Rv3480c   CONSERVED HYPOTHETICAL PROTEIN
Rv3868    CONSERVED HYPOTHETICAL PROTEIN

Table 5.12 Genes induced in intracellular M. tuberculosis H37Rv and their requirement for survival within the macrophage. Our dataset of genes induced in intracellular M. tuberculosis H37Rv (Appendix IV) was compared with the dataset of genes found via TraSH (170) to be required for optimal growth of M. tuberculosis H37Rv inside a macrophage.


Figure 5.1 Genomic comparisons of M. tuberculosis H37Ra and H37Rv via microarray analysis reveal only the RvD2 region of difference. Snapshot of microarray slide showing few genomic differences between H37Ra and H37Rv. Circled spots correspond to genes Mb1785c and Mb1783c in the RvD2 region of difference (Table 5-1). Not shown: Mb1787.


Figure 5.2 Comparisons of gene expression in intracellular M. tuberculosis show a subset of genes mutually induced by virulent and attenuated strains in an intracellular environment. Using GeneVenn (http://mcbc.usm.edu/genevenn/genevenn.htm), datasets (Appendices IV and V) were compared and the overlap of expression data is graphically represented as a Venn diagram.


Figure 5.3 Catabolism of fatty acids via β-oxidation and subsequent glyoxylate cycle metabolism. Taken from Schnappinger et al. 2003 (188).

Figure 5.4 qPCR confirmation of microarray trends of genes differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv. qPCR was used to confirm the expression of A) genes induced in intracellular H37Ra versus H37Rv, and B) genes repressed in intracellular H37Ra versus H37Rv. Line at “1” denotes expression of respective genes in intracellular H37Rv.

Figure 5.5 qPCR assessment of genes differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv (panels A and B). qPCR was used to assess the expression of genes believed to be differentially expressed between H37Ra and H37Rv. Line at “1” denotes expression of the respective genes in intracellular H37Rv.


Figure 5.6 qPCR confirmation of genes that were not differentially expressed between intracellular M. tuberculosis H37Ra and H37Rv. Genes that were not present on the list of genes differentially expressed in intracellular mycobacteria were selected to confirm their expression trends via qPCR. Line at “1” denotes expression of selected genes in intracellular H37Rv.
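The ratios plotted in Figures 5.4 to 5.6 express each gene in intracellular H37Ra relative to intracellular H37Rv, with H37Rv fixed at 1. The following is a minimal sketch of how such a ratio could be computed, assuming the widely used 2^-ΔΔCt calculation with a reference gene and perfect amplification efficiency. The captions do not state which quantification method was used, so the method, the function name, and the Ct values below are illustrative assumptions only.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_cal, ct_ref_cal):
    """Fold change of a target gene in a test sample relative to a calibrator.

    Assumes the 2^-ddCt model: each PCR cycle doubles the product, and the
    reference gene is expressed equally in both samples (hypothetical setup).
    """
    # Normalise the target gene to the reference gene within each sample.
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_cal = ct_target_cal - ct_ref_cal
    # Difference between samples; a lower Ct means more starting template.
    delta_delta_ct = delta_ct_test - delta_ct_cal
    return 2.0 ** (-delta_delta_ct)


# Illustrative Ct values (not thesis data): H37Rv is the calibrator, so its
# own ratio is 1, which corresponds to the line at "1" in the figures.
rv_vs_rv = relative_expression(20.0, 15.0, 20.0, 15.0)
ra_vs_rv = relative_expression(19.0, 15.0, 20.0, 15.0)
```

Under these assumptions a value above 1 would indicate higher expression in H37Ra than in H37Rv, and a value below 1 lower expression.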


Figure 5.7 Comparisons of our expression study with previous studies examining the intracellular profile of M. tuberculosis. Using GeneVenn (http://mcbc.usm.edu/genevenn/genevenn.htm), datasets were compared and the overlap of expression data is graphically represented as a Venn diagram. “AL Rv up” represents our dataset of genes induced in M. tuberculosis H37Rv (Appendix IV), “Schnappinger” represents the dataset previously obtained when examining genes induced in intracellular M. tuberculosis strain 1254 (188), and “TraSH mac” represents the list of genes deemed to be essential for survival inside BM-MΦ (170).
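The overlap shown in a three-set Venn diagram can be reproduced with ordinary set operations. The sketch below is a stand-in for the GeneVenn web tool, not its implementation; the function name and the placeholder gene identifiers are hypothetical and do not come from the thesis datasets.

```python
def venn_regions(a, b, c):
    """Return the seven disjoint regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "a_b_c": a & b & c,
    }


# Placeholder identifiers (illustrative only, not real gene lists).
al_rv_up = ["g1", "g2", "g3"]
schnappinger = ["g2", "g3", "g4"]
trash_mac = ["g3", "g5"]

regions = venn_regions(al_rv_up, schnappinger, trash_mac)
for name, genes in regions.items():
    print(name, sorted(genes))
```

Because the seven regions are disjoint, their sizes sum to the number of distinct genes across the three lists, which gives a quick consistency check on any overlap counts.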

CHAPTER 6: Final summary and conclusion

Mycobacterium tuberculosis, the intracellular pathogen that causes tuberculosis, continues to afflict the global population on a tragic scale. With the increasing prevalence of multidrug-resistant strains, and with extensively drug-resistant strains of M. tuberculosis now gaining ground in previously unaffected areas through increased global migration and travel, it has become more important to improve current chemotherapeutic avenues and/or vaccine strategies. To do so, the pathogenic mechanisms employed by M. tuberculosis in its interactions with the host need to be characterised. To this end, we compared the genomics and transcriptomics of two highly similar strains of M. tuberculosis, H37Ra and H37Rv (which vary in their virulence), in an attempt to gain insight into adaptations that might enable mycobacterial pathogenesis. Genomes were compared using two-dimensional DNA techniques, and expression profiles of intracellular bacteria were compared using both bacterial artificial chromosome fingerprint array analysis and microarrays. All these techniques were successfully applied to the study of mycobacteria, and the results are summarised below.

1) Two-dimensional DNA electrophoresis was successful in generating genomic profiles of M. tuberculosis H37Ra and H37Rv. However, no reproducible genomic differences were identified between the strains.

2) Bacterial artificial chromosome fingerprint arrays (BACFA) were successfully applied to the expression analysis of intracellular M. tuberculosis, and could be used to differentiate expression profiles of different strains of mycobacteria.

3) Overall, differences identified using BACFA and later confirmed via qPCR suggested a more rapid and more extensive adaptation to the intracellular environment by H37Rv as compared to H37Ra.

4) The fumarate reductase enzyme complex (FRD), identified via BACFA, was observed to be expressed at higher levels in H37Rv at 4h post-infection.

5) Genes encoding components of FRD (frdA and frdB) were expressed at a higher level by the virulent H37Rv than by H37Ra in fresh broth cultures, e.g. 4h and 24h after inoculation into fresh medium.

6) Adding mercaptopyridine-N-oxide (an inhibitor of FRD) to infected macrophages inhibited intracellular growth of M. tuberculosis.

7) Differences identified via microarray analysis indicate a more robust response by H37Rv when encountering novel environments.

8) Microarray expression analysis identified the differential expression of phoP between H37Ra and H37Rv. This difference extended to other members of the PhoPR regulon.

9) Members of the PhoPR regulon that are involved in the synthesis of cell envelope constituents may explain the phenotypic differences between H37Ra and H37Rv.

10) Genes encoding and involved with the ESX-1 secretion system (which may be involved in intercellular spread) were also identified as being differentially regulated in intracellular H37Ra and H37Rv.

These data echo previous findings of a paucity of genomic differences between the two M. tuberculosis strains. It was the expression studies that were the most informative in giving insight into the response of intracellular bacteria. Here, expression data from BACFA analysis indicated that the virulent strain was more responsive or reacted more rapidly following intracellular sequestration. Increased expression of enzymes such as fumarate reductase or pyruvate decarboxylase may indicate a requirement for alternate modes of respiration and/or the need for increased synthesis of products that may enable better adaptation to, or defence against, the host. Expression data from the microarrays comparing intracellular and broth-grown bacterial transcriptomes also echo this readiness of the virulent strain. In these datasets, the virulent strain appears to launch a more robust expression profile, i.e. more genes from particular pathways that may be important for adaptation to an intracellular lifestyle (e.g. fatty acid metabolism, mycobactin synthesis, ESAT-6 family proteins) are induced in intracellular H37Rv relative to its broth counterpart than are induced in intracellular H37Ra relative to its broth counterpart.

However, not all the differences in expression were identified as being more robust in H37Rv. The microarray data comparing intracellular profiles of H37Ra and H37Rv revealed that transcription-regulating factors were induced at a greater level in the attenuated strain, which does indicate that factors are expressed by H37Ra in response to stress. However, given that overexpression of phoPQ in Salmonella typhimurium resulted in attenuation of virulence (138), the observation of greater levels of transcription regulators in H37Ra may in fact point to faulty regulation and a misinterpretation of environmental changes, thus contributing, in part, to its attenuated phenotype.

It could be hypothesised that a successful intracellular pathogen will have mechanisms that allow it to quickly sense and respond to environmental changes. It will carefully regulate and transcribe genes whose products enable alternative means of energy production to maintain its survival and replication in the host, as well as the synthesis of substrates that strengthen its armour so as to withstand assaults from the host. Additionally, the successful pathogen may regulate the secretion of proteins with immunomodulatory effects that further enable its continued survival in the host. The apparent deficiencies exhibited by H37Ra in these aspects may have factored into its attenuation compared to H37Rv.

REFERENCES

1. 1996. Randomised controlled trial of single BCG, repeated BCG, or combined BCG and killed Mycobacterium leprae vaccine for prevention of leprosy and tuberculosis in Malawi. Karonga Prevention Trial Group. Lancet 348:17-24.

2. Andersen, P., D. Askgaard, L. Ljungqvist, J. Bennedsen, and I. Heron. 1991. Proteins released from Mycobacterium tuberculosis during growth. Infect Immun 59:1905-10.

3. Armstrong, J. A., and P. D. Hart. 1971. Response of cultured macrophages to Mycobacterium tuberculosis, with observations on fusion of lysosomes with phagosomes. J Exp Med 134:713-40.

4. Bacon, J., B. W. James, L. Wernisch, A. Williams, K. A. Morley, G. J. Hatch, J. A. Mangan, J. Hinds, N. G. Stoker, P. D. Butcher, and P. D. Marsh. 2004. The influence of reduced oxygen availability on pathogenicity and gene expression in Mycobacterium tuberculosis. Tuberculosis (Edinb) 84:205-17.

5. Banu, S., N. Honore, B. Saint-Joanis, D. Philpott, M. C. Prevost, and S. T. Cole. 2002. Are the PE-PGRS proteins of Mycobacterium tuberculosis variable surface antigens? Mol Microbiol 44:9-19.

6. Behr, M. A., M. A. Wilson, W. P. Gill, H. Salamon, G. K. Schoolnik, S. Rane, and P. M. Small. 1999. Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284:1520-3.

7. Belisle, J. T., and M. G. Sonnenberg. 1998. Isolation of genomic DNA from mycobacteria. Methods Mol Biol 101:31-44.

8. Betts, J. C., P. T. Lukey, L. C. Robb, R. A. McAdam, and K. Duncan. 2002. Evaluation of a nutrient starvation model of Mycobacterium tuberculosis persistence by gene and protein expression profiling. Mol Microbiol 43:717-31.

9. Bhargava, S., A. K. Tyagi, and J. S. Tyagi. 1990. tRNA genes in mycobacteria: organization and molecular cloning. J Bacteriol 172:2930-4.

10. Birkholz, S., U. Knipp, E. Lemma, A. Kroger, and W. Opferkuch. 1994. Fumarate reductase of Helicobacter pylori--an immunogenic protein. J Med Microbiol 41:56-62.

11. Bloch, H. 1950. A component of tubercle bacilli concerned with their virulence. Bull N Y Acad Med 26:506-7.

12. Bolhuis, A., A. Matzen, H. L. Hyyrylainen, V. P. Kontinen, R. Meima, J. Chapuis, G. Venema, S. Bron, R. Freudl, and J. M. van Dijl. 1999. Signal peptide peptidase- and ClpP-like proteins of Bacillus subtilis required for efficient translocation and processing of secretory proteins. J Biol Chem 274:24585-92.

13. Boshoff, H. I., T. G. Myers, B. R. Copp, M. R. McNeil, M. A. Wilson, and C. E. Barry, 3rd. 2004. The transcriptional responses of Mycobacterium tuberculosis to inhibitors of metabolism: novel insights into drug mechanisms of action. J Biol Chem 279:40174-84.

14. Brandt, L., T. Oettinger, A. Holm, A. B. Andersen, and P. Andersen. 1996. Key epitopes on the ESAT-6 antigen recognized in mice during the recall of protective immunity to Mycobacterium tuberculosis. J Immunol 157:3527-33.

15. Brennan, M. J., and G. Delogu. 2002. The PE multigene family: a 'molecular mantra' for mycobacteria. Trends Microbiol 10:246-9.

16. Brennan, M. J., G. Delogu, Y. Chen, S. Bardarov, J. Kriakov, M. Alavi, and W. R. Jacobs, Jr. 2001. Evidence that mycobacterial PE_PGRS proteins are cell surface constituents that influence interactions with other cells. Infect Immun 69:7326-33.

17. Brodin, P., I. Rosenkrands, P. Andersen, S. T. Cole, and R. Brosch. 2004. ESAT-6 proteins: protective antigens and virulence factors? Trends Microbiol 12:500-8.

18. Brosch, R., S. V. Gordon, A. Billault, T. Garnier, K. Eiglmeier, C. Soravito, B. G. Barrell, and S. T. Cole. 1998. Use of a Mycobacterium tuberculosis H37Rv bacterial artificial chromosome library for genome mapping, sequencing, and comparative genomics. Infect Immun 66:2221-9.

19. Brosch, R., W. J. Philipp, E. Stavropoulos, M. J. Colston, S. T. Cole, and S. V. Gordon. 1999. Genomic analysis reveals variation between Mycobacterium tuberculosis H37Rv and the attenuated M. tuberculosis H37Ra strain. Infect Immun 67:5768-74.

20. Brown, C. A., P. Draper, and P. D. Hart. 1969. Mycobacteria and lysosomes: a paradox. Nature 221:658-60.

21. Bryant, C., and E. M. Bennet. 1983. Observations on the fumarate reductase system in Haemonchus contortus and their relevance to anthelmintic resistance and to strain variations of energy metabolism. Mol Biochem Parasitol 7:281-92.

22. Cabusora, L., E. Sutton, A. Fulmer, and C. V. Forst. 2005. Differential network expression during drug and stress response. Bioinformatics 21:2898-905.

23. Camacho, L. R., D. Ensergueix, E. Perez, B. Gicquel, and C. Guilhot. 1999. Identification of a virulence gene cluster of Mycobacterium tuberculosis by signature-tagged transposon mutagenesis. Mol Microbiol 34:257-67.

24. Camus, J. C., M. J. Pryor, C. Medigue, and S. T. Cole. 2002. Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv. Microbiology 148:2967-73.

25. Canneva, F., M. Branzoni, G. Riccardi, R. Provvedi, and A. Milano. 2005. Rv2358 and FurB: two transcriptional regulators from Mycobacterium tuberculosis which respond to zinc. J Bacteriol 187:5837-40.

26. Check, E. 2007. After decades of drought, new drug possibilities flood TB pipeline. Nat Med 13:266.

27. Chen, M., L. Zhai, S. B. Christensen, T. G. Theander, and A. Kharazmi. 2001. Inhibition of fumarate reductase in Leishmania major and L. donovani by chalcones. Antimicrob Agents Chemother 45:2023-9.

28. Clemens, D. L. 1996. Characterization of the Mycobacterium tuberculosis phagosome. Trends Microbiol 4:113-8.

29. Clemens, D. L., and M. A. Horwitz. 1996. The Mycobacterium tuberculosis phagosome interacts with early endosomes and is accessible to exogenously administered transferrin. J Exp Med 184:1349-55.

30. Colditz, G. A., T. F. Brewer, C. S. Berkey, M. E. Wilson, E. Burdick, H. V. Fineberg, and F. Mosteller. 1994. Efficacy of BCG vaccine in the prevention of tuberculosis. Meta-analysis of the published literature. Jama 271:698-702.

31. Cole, S. T. 2002. Comparative and functional genomics of the Mycobacterium tuberculosis complex. Microbiology 148:2919-28.

32. Cole, S. T., R. Brosch, J. Parkhill, T. Garnier, C. Churcher, D. Harris, S. V. Gordon, K. Eiglmeier, S. Gas, C. E. Barry, 3rd, F. Tekaia, K. Badcock, D. Basham, D. Brown, T. Chillingworth, R. Connor, R. Davies, K. Devlin, T. Feltwell, S. Gentles, N. Hamlin, S. Holroyd, T. Hornsby, K. Jagels, B. G. Barrell, et al. 1998. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393:537-44.

33. Cole, S. T., K. Eiglmeier, J. Parkhill, K. D. James, N. R. Thomson, P. R. Wheeler, N. Honore, T. Garnier, C. Churcher, D. Harris, K. Mungall, D. Basham, D. Brown, T. Chillingworth, R. Connor, R. M. Davies, K. Devlin, S. Duthoy, T. Feltwell, A. Fraser, N. Hamlin, S. Holroyd, T. Hornsby, K. Jagels, C. Lacroix, J. Maclean, S. Moule, L. Murphy, K. Oliver, M. A. Quail, M. A. Rajandream, K. M. Rutherford, S. Rutter, K. Seeger, S. Simon, M. Simmonds, J. Skelton, R. Squares, S. Squares, K. Stevens, K. Taylor, S. Whitehead, J. R. Woodward, and B. G. Barrell. 2001. Massive gene decay in the leprosy bacillus. Nature 409:1007-11.

34. Collins, F. M., and M. M. Smith. 1969. A comparative study of the virulence of Mycobacterium tuberculosis measured in mice and guinea pigs. Am Rev Respir Dis 100:631-9.

35. Constant, P., E. Perez, W. Malaga, M. A. Laneelle, O. Saurel, M. Daffe, and C. Guilhot. 2002. Role of the pks15/1 gene in the biosynthesis of phenolglycolipids in the Mycobacterium tuberculosis complex. Evidence that all strains synthesize glycosylated p-hydroxybenzoic methyl esters and that strains devoid of phenolglycolipids harbor a frameshift mutation in the pks15/1 gene. J Biol Chem 277:38148-58.

36. Cox, J. S., B. Chen, M. McNeil, and W. R. Jacobs, Jr. 1999. Complex lipid determines tissue-specific replication of Mycobacterium tuberculosis in mice. Nature 402:79-83.

37. Daffe, M., and P. Draper. 1998. The envelope layers of mycobacteria with reference to their pathogenicity. Adv Microb Physiol 39:131-203.

38. Danelishvili, L., J. McGarvey, Y. J. Li, and L. E. Bermudez. 2003. Mycobacterium tuberculosis infection causes different levels of apoptosis and necrosis in human macrophages and alveolar epithelial cells. Cell Microbiol 5:649-60.

39. Dasgupta, N., V. Kapur, K. K. Singh, T. K. Das, S. Sachdeva, K. Jyothisri, and J. S. Tyagi. 2000. Characterization of a two-component system, devR-devS, of Mycobacterium tuberculosis. Tuber Lung Dis 80:141-59.

40. de Jonge, M. I., T. P. Stinear, S. Cole, and R. Brosch (ed.). 2007. The Mycobacteria: a postgenomic view. ASM Press, Washington, D.C.

41. De Voss, J. J., K. Rutter, B. G. Schroeder, H. Su, Y. Zhu, and C. E. Barry, 3rd. 2000. The salicylate-derived mycobactin siderophores of Mycobacterium tuberculosis are essential for growth in macrophages. Proc Natl Acad Sci U S A 97:1252-7.

42. Delogu, G., and M. J. Brennan. 2001. Comparative immune response to PE and PE_PGRS antigens of Mycobacterium tuberculosis. Infect Immun 69:5606-11.

43. Denkin, S., S. Byrne, C. Jie, and Y. Zhang. 2005. Gene expression profiling analysis of Mycobacterium tuberculosis genes in response to salicylate. Arch Microbiol 184:152-7.

44. Desjardins, M. 1995. Biogenesis of phagolysosomes: the 'kiss and run' hypothesis. Trends Cell Biol 5:183-6.

45. DiGiuseppe Champion, P. A., and J. S. Cox. 2007. Protein secretion systems in Mycobacteria. Cell Microbiol 9:1376-84.

46. Dolin, P. J., M. C. Raviglione, and A. Kochi. 1994. Global tuberculosis incidence and mortality during 1990-2000. Bull World Health Organ 72:213-20.

47. Drevets, D. A., B. P. Canono, P. J. Leenen, and P. A. Campbell. 1994. Gentamicin kills intracellular Listeria monocytogenes. Infect Immun 62:2222-8.

48. Dubnau, E., P. Fontan, R. Manganelli, S. Soares-Appel, and I. Smith. 2002. Mycobacterium tuberculosis genes induced during infection of human macrophages. Infect Immun 70:2787-95.

49. Dullaghan, E. M., C. A. Malloff, A. H. Li, W. L. Lam, and R. W. Stokes. 2002. Two-dimensional bacterial genome display: a method for the genomic analysis of mycobacteria. Microbiology 148:3111-7.

50. Ehrt, S., D. Schnappinger, S. Bekiranov, J. Drenkow, S. Shi, T. R. Gingeras, T. Gaasterland, G. Schoolnik, and C. Nathan. 2001. Reprogramming of the macrophage transcriptome in response to interferon-gamma and Mycobacterium tuberculosis: signaling roles of nitric oxide synthase-2 and phagocyte oxidase. J Exp Med 194:1123-40.

51. Elhay, M. J., T. Oettinger, and P. Andersen. 1998. Delayed-type hypersensitivity responses to ESAT-6 and MPT64 from Mycobacterium tuberculosis in the guinea pig. Infect Immun 66:3454-6.

52. Elsinghorst, E. A. 1994. Measurement of invasion by gentamicin resistance. Methods Enzymol 236:405-20.

53. Espitia, C., J. P. Laclette, M. Mondragon-Palomino, A. Amador, J. Campuzano, A. Martens, M. Singh, R. Cicero, Y. Zhang, and C. Moreno. 1999. The PE-PGRS glycine-rich proteins of Mycobacterium tuberculosis: a new family of fibronectin-binding proteins? Microbiology 145 (Pt 12):3487-95.

54. Eze, M. O., L. Yuan, R. M. Crawford, C. M. Paranavitana, T. L. Hadfield, A. K. Bhattacharjee, R. L. Warren, and D. L. Hoover. 2000. Effects of opsonization and gamma interferon on growth of Brucella melitensis 16M in mouse peritoneal macrophages in vitro. Infect Immun 68:257-63.

55. Fang, Z., C. Doig, D. T. Kenna, N. Smittipat, P. Palittapongarnpim, B. Watt, and K. J. Forbes. 1999. IS6110-mediated deletions of wild-type chromosomes of Mycobacterium tuberculosis. J Bacteriol 181:1014-20.

56. Fiss, E. H., S. Yu, and W. R. Jacobs, Jr. 1994. Identification of genes involved in the sequestration of iron in mycobacteria: the ferric exochelin biosynthetic and uptake pathways. Mol Microbiol 14:557-69.

57. Flynn, J. L., and J. Chan. 2003. Immune evasion by Mycobacterium tuberculosis: living with the enemy. Curr Opin Immunol 15:450-5.

58. Flynn, J. L., and J. Chan. 2001. Immunology of tuberculosis. Annu Rev Immunol 19:93-129.

59. Fodde, R., and M. Losekoot. 1994. Mutation detection by denaturing gradient gel electrophoresis (DGGE). Hum Mutat 3:83-94.

60. Fortune, S. M., A. Jaeger, D. A. Sarracino, M. R. Chase, C. M. Sassetti, D. R. Sherman, B. R. Bloom, and E. J. Rubin. 2005. Mutually dependent secretion of proteins required for mycobacterial virulence. Proc Natl Acad Sci U S A 102:10676-81.

61. Fratti, R. A., J. M. Backer, J. Gruenberg, S. Corvera, and V. Deretic. 2001. Role of phosphatidylinositol 3-kinase and Rab5 effectors in phagosomal biogenesis and mycobacterial phagosome maturation arrest. J Cell Biol 154:631-44.

62. Frieden, T. R., T. R. Sterling, S. S. Munsiff, C. J. Watt, and C. Dye. 2003. Tuberculosis. Lancet 362:887-99.

63. Froussard, P. 1992. A random-PCR method (rPCR) to construct whole cDNA library from low amounts of RNA. Nucleic Acids Res 20:2900.

64. Gandhi, N. R., A. Moll, A. W. Sturm, R. Pawinski, T. Govender, U. Lalloo, K. Zeller, J. Andrews, and G. Friedland. 2006. Extensively drug-resistant tuberculosis as a cause of death in patients co-infected with tuberculosis and HIV in a rural area of South Africa. Lancet 368:1575-80.

65. Gao, Q., K. Kripke, Z. Arinc, M. Voskuil, and P. Small. 2004. Comparative expression studies of a complex phenotype: cord formation in Mycobacterium tuberculosis. Tuberculosis (Edinb) 84:188-96.

66. Ge, Z., Y. Feng, C. A. Dangler, S. Xu, N. S. Taylor, and J. G. Fox. 2000. Fumarate reductase is essential for Helicobacter pylori colonization of the mouse stomach. Microb Pathog 29:279-87.

67. Gillespie, J., L. L. Barton, and E. W. Rypka. 1988. Influence of oxygen tension on the respiratory activity of Mycobacterium phlei. J Gen Microbiol 134:247-52.

68. Goldberg, I., K. Lonberg-Holm, E. A. Bagley, and B. Stieglitz. 1983. Improved conversion of fumarate to succinate by Escherichia coli strains amplified for fumarate reductase. Appl Environ Microbiol 45:1838-47.

69. Gonzalo Asensio, J., C. Maia, N. L. Ferrer, N. Barilone, F. Laval, C. Y. Soto, N. Winter, M. Daffe, B. Gicquel, C. Martin, and M. Jackson. 2006. The virulence-associated two-component PhoP-PhoR system controls the biosynthesis of polyketide-derived lipids in Mycobacterium tuberculosis. J Biol Chem 281:1313-6.

70. Gordon, S. V., R. Brosch, A. Billault, T. Garnier, K. Eiglmeier, and S. T. Cole. 1999. Identification of variable regions in the genomes of tubercle bacilli using bacterial artificial chromosome arrays. Mol Microbiol 32:643-55.

71. Goren, M. B. 1972. Mycobacterial lipids: selected topics. Bacteriol Rev 36:33-64.

72. Graham, J. E., and J. E. Clark-Curtiss. 1999. Identification of Mycobacterium tuberculosis RNAs synthesized in response to phagocytosis by human macrophages by selective capture of transcribed sequences (SCOTS). Proc Natl Acad Sci U S A 96:11554-9.

73. Guest, J. R. 1981. Partial replacement of succinate dehydrogenase function by phage- and plasmid-specified fumarate reductase in Escherichia coli. J Gen Microbiol 122:171-9.

74. Guinn, K. M., M. J. Hickey, S. K. Mathur, K. L. Zakel, J. E. Grotzke, D. M. Lewinsohn, S. Smith, and D. R. Sherman. 2004. Individual RD1-region genes are required for export of ESAT-6/CFP-10 and for virulence of Mycobacterium tuberculosis. Mol Microbiol 51:359-70.

75. Guy, L. R., S. Raffel, and C. E. Clifton. 1954. Virulence of the tubercle bacillus. II. Effect of oxygen tension upon growth of virulent and avirulent bacilli. J Infect Dis 94:99-106.

76. Gygi, S. P., Y. Rochon, B. R. Franza, and R. Aebersold. 1999. Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19:1720-30.

77. Hampshire, T., S. Soneji, J. Bacon, B. W. James, J. Hinds, K. Laing, R. A. Stabler, P. D. Marsh, and P. D. Butcher. 2004. Stationary phase gene expression of Mycobacterium tuberculosis following a progressive nutrient depletion: a model for persistent organisms? Tuberculosis (Edinb) 84:228-38.

78. Hamrick, T. S., A. H. Diaz, E. A. Havell, J. R. Horton, and P. E. Orndorff. 2003. Influence of extracellular bactericidal agents on bacteria within macrophages. Infect Immun 71:1016-9.

79. Harboe, M., T. Oettinger, H. G. Wiker, I. Rosenkrands, and P. Andersen. 1996. Evidence for occurrence of the ESAT-6 protein in Mycobacterium tuberculosis and virulent Mycobacterium bovis and for its absence in Mycobacterium bovis BCG. Infect Immun 64:16-22.

80. Heplar, J. Q., C. E. Clifton, S. Raffel, and C. M. Futrelle. 1954. Virulence of the tubercle bacillus. I. Effect of oxygen tension upon respiration of virulent and avirulent bacilli. J Infect Dis 94:90-8.

81. Hickman, S. P., J. Chan, and P. Salgame. 2002. Mycobacterium tuberculosis induces differential cytokine production from dendritic cells and macrophages with divergent effects on naive T cell polarization. J Immunol 168:4636-42.

82. Hinchey, J., S. Lee, B. Y. Jeon, R. J. Basaraba, M. M. Venkataswamy, B. Chen, J. Chan, M. Braunstein, I. M. Orme, S. C. Derrick, S. L. Morris, W. R. Jacobs, and S. A. Porcelli. 2007. Enhanced priming of adaptive immunity by a proapoptotic mutant of Mycobacterium tuberculosis. J Clin Invest 117:2279-2288.

83. Hitchen, P. G., J. L. Prior, P. C. Oyston, M. Panico, B. W. Wren, R. W. Titball, H. R. Morris, and A. Dell. 2002. Structural characterization of lipo-oligosaccharide (LOS) from Yersinia pestis: regulation of LOS structure by the PhoPQ system. Mol Microbiol 44:1637-50.

84. Hokamp, K., F. M. Roche, M. Acab, M. E. Rousseau, B. Kuo, D. Goode, D. Aeschliman, J. Bryan, L. A. Babiuk, R. E. Hancock, and F. S. Brinkman. 2004. ArrayPipe: a flexible processing pipeline for microarray data. Nucleic Acids Res 32:W457-9.

85. Horswill, A. R., and J. C. Escalante-Semerena. 2001. In vitro conversion of propionate to pyruvate by Salmonella enterica enzymes: 2-methylcitrate dehydratase (PrpD) and aconitase enzymes catalyze the conversion of 2-methylcitrate to 2-methylisocitrate. Biochemistry 40:4703-13.

86. Hsu, T., S. M. Hingley-Wilson, B. Chen, M. Chen, A. Z. Dai, P. M. Morin, C. B. Marks, J. Padiyar, C. Goulding, M. Gingery, D. Eisenberg, R. G. Russell, S. C. Derrick, F. M. Collins, S. L. Morris, C. H. King, and W. R. Jacobs, Jr. 2003. The primary mechanism of attenuation of bacillus Calmette-Guerin is a loss of secreted lytic function required for invasion of lung interstitial tissue. Proc Natl Acad Sci U S A 100:12420-5.

87. Hu, Y., and A. R. Coates. 2001. Increased levels of sigJ mRNA in late stationary phase cultures of Mycobacterium tuberculosis detected by DNA array hybridisation. FEMS Microbiol Lett 202:59-65.

88. Huet, G., M. Daffe, and I. Saves. 2005. Identification of the Mycobacterium tuberculosis SUF machinery as the exclusive mycobacterial system of [Fe-S] cluster assembly: evidence for its implication in the pathogen's survival. J Bacteriol 187:6137-46.

89. Imaeda, T. 1985. Deoxyribonucleic acid relatedness among strains of Mycobacterium tuberculosis, Mycobacterium bovis BCG, Mycobacterium microti and Mycobacterium africanum. International Journal of Systematic Bacteriology 35:147-150.

90. Iseman, M. D. 1993. Treatment of multidrug-resistant tuberculosis. N Engl J Med 329:784-91.

91. Jackett, P. S., V. R. Aber, and D. B. Lowrie. 1978. Virulence and resistance to superoxide, low pH and hydrogen peroxide among strains of Mycobacterium tuberculosis. J Gen Microbiol 104:37-45.

92. Jackett, P. S., V. R. Aber, D. A. Mitchison, and D. B. Lowrie. 1981. The contribution of hydrogen peroxide resistance to virulence of Mycobacterium tuberculosis during the first six days after intravenous infection of normal and BCG-vaccinated guinea-pigs. Br J Exp Pathol 62:34-40.

93. Jackson, M., D. Portnoi, D. Catheline, L. Dumail, J. Rauzier, P. Legrand, and B. Gicquel. 1997. Mycobacterium tuberculosis Des protein: an immunodominant target for the humoral response of tuberculous patients. Infect Immun 65:2883-9.

94. Jannasch, H. W. 1969. Estimations of bacterial growth rates in natural waters. J Bacteriol 99:156-60.

95. Jung, Y. J., R. LaCourse, L. Ryan, and R. J. North. 2002. Virulent but not avirulent Mycobacterium tuberculosis can evade the growth inhibitory action of a T helper 1-dependent, nitric oxide synthase 2-independent defense in mice. J Exp Med 196:991-8.

96. Junqueira-Kipnis, A. P., R. J. Basaraba, V. Gruppo, G. Palanisamy, O. C. Turner, T. Hsu, W. R. Jacobs, Jr., S. A. Fulton, S. M. Reba, W. H. Boom, and I. M. Orme. 2006. Mycobacteria lacking the RD1 region do not induce necrosis in the lungs of mice lacking interferon-gamma. Immunology 119:224-31.

97. Kaufmann, S. H. 2001. How can immunology contribute to the control of tuberculosis? Nat Rev Immunol 1:20-30.

98. Keane, J., M. K. Balcewicz-Sablinska, H. G. Remold, G. L. Chupp, B. B. Meek, M. J. Fenton, and H. Kornfeld. 1997. Infection by Mycobacterium tuberculosis promotes human alveolar macrophage apoptosis. Infect Immun 65:298-304.

99. Keane, J., H. G. Remold, and H. Kornfeld. 2000. Virulent Mycobacterium tuberculosis strains evade apoptosis of infected alveolar macrophages. J Immunol 164:2016-20.

100. Kendall, S. L., M. Withers, C. N. Soffair, N. J. Moreland, S. Gurcha, B. Sidders, R. Frita, A. Ten Bokum, G. S. Besra, J. S. Lott, and N. G. Stoker. 2007. A highly conserved transcriptional repressor controls a large regulon involved in lipid degradation in Mycobacterium smegmatis and Mycobacterium tuberculosis. Mol Microbiol 65:684-99.

101. Kim, S. J., K. Park, D. Koeller, K. Y. Kim, L. M. Wakefield, M. B. Sporn, and A. B. Roberts. 1992. Post-transcriptional regulation of the human transforming growth factor-beta 1 gene. J Biol Chem 267:13702-7.

102. Kinger, A. K., and J. S. Tyagi. 1993. Identification and cloning of genes differentially expressed in the virulent strain of Mycobacterium tuberculosis. Gene 131:113-7.

103. Klipper-Aurbach, Y., M. Wasserman, N. Braunspiegel-Weintrob, D. Borstein, S. Peleg, S. Assa, M. Karp, Y. Benjamini, Y. Hochberg, and Z. Laron. 1995. Mathematical formulae for the prediction of the residual beta cell function during the first two years of disease in children and adolescents with insulin-dependent diabetes mellitus. Med Hypotheses 45:486-90.

104. Koch, R. 1882. Die aetiologie der tuberkulose. Berliner Klinischen Wochenschrift 15:221–230.

105. Konishi, N., M. Tao, M. Nakamura, Y. Kitahaori, Y. Hiasa, and H. Nagai. 1996. Genomic alterations in human prostate carcinoma cell lines by two-dimensional gel analysis. Cell Mol Biol (Noisy-le-grand) 42:1129-35.

106. Krithika, R., U. Marathe, P. Saxena, M. Z. Ansari, D. Mohanty, and R. S. Gokhale. 2006. A genetic locus required for iron acquisition in Mycobacterium tuberculosis. Proc Natl Acad Sci U S A 103:2069-74.

107. Kurtz, S., K. P. McKinnon, M. S. Runge, J. P. Ting, and M. Braunstein. 2006. The SecA2 secretion factor of Mycobacterium tuberculosis promotes growth in macrophages and inhibits the host immune response. Infect Immun 74:6855-64.

108. Lakey, D. L., Y. Zhang, A. M. Talaat, B. Samten, L. E. DesJardin, K. D. Eisenach, S. A. Johnston, and P. F. Barnes. 2002. Priming reverse transcription with oligo(dT) does not yield representative samples of Mycobacterium tuberculosis cDNA. Microbiology 148:2567-72.

109. Lam, W. L., T. S. Lee, and W. Gilbert. 1996. Active transposition in zebrafish. Proc Natl Acad Sci U S A 93:10870-5.

110. Lari, N., L. Rindi, and C. Garzelli. 2001. Identification of one insertion site of IS6110 in Mycobacterium tuberculosis H37Ra and analysis of the RvD2 deletion in M. tuberculosis clinical isolates. J Med Microbiol 50:805-11.

111. Lari, N., L. Rindi, C. Lami, and C. Garzelli. 1999. IS6110-based restriction fragment length polymorphism (RFLP) analysis of Mycobacterium tuberculosis H37Rv and H37Ra. Microb Pathog 26:281-6.

112. Larson, C. L., and W. C. Wicht. 1964. Infection of Mice with Mycobacterium Tuberculosis, Strain H37ra. Am Rev Respir Dis 90:742-8.

113. Lee, J. S., R. Krause, J. Schreiber, R. Stein, J.-Y. Kwak, J. Kowall, M.-K. Song, S.-N. Cho, and S. H. E. Kaufmann. 2007. Genomic differences in virulent and attenuated Mycobacterium tuberculosis strains: direct clues for virulence. Abstract #303., Tuberculosis: From lab research to field trials. Keystone Symposia, Vancouver, British Columbia, Canada.

114. Lee, J. S., H. Mollenkopf, R. Krause, J. Schreiber, R. Stein, and S. H. Kaufmann. 2007. A PhoP point mutation discriminates between the virulent H37Rv and avirulent H37Ra strains of Mycobacterium tuberculosis, Tuberculosis: From Lab Research to Field Trials, Vancouver, British Columbia, Canada.

115. Lemire, B. D., and J. H. Weiner. 1986. Fumarate reductase of Escherichia coli. Methods Enzymol 126:377-86.

116. Li, M. S., I. M. Monahan, S. J. Waddell, J. A. Mangan, S. L. Martin, M. J. Everett, and P. D. Butcher. 2001. cDNA-RNA subtractive hybridization reveals increased expression of mycocerosic acid synthase in intracellular Mycobacterium bovis BCG. Microbiology 147:2293-305.

117. Li, M. S., S. J. Waddell, I. M. Monahan, J. A. Mangan, S. L. Martin, M. J. Everett, and P. D. Butcher. 2004. Increased transcription of a potential sigma factor regulatory gene Rv1364c in Mycobacterium bovis BCG while residing in macrophages indicates use of alternative promoters. FEMS Microbiol Lett 233:333-9.

118. Liu, K., J. Yu, and D. G. Russell. 2003. pckA-deficient Mycobacterium bovis BCG shows attenuated virulence in mice and in macrophages. Microbiology 149:1829-35.

119. MacGurn, J. A., S. Raghavan, S. A. Stanley, and J. S. Cox. 2005. A non-RD1 gene cluster is required for Snm secretion in Mycobacterium tuberculosis. Mol Microbiol 57:1653-63.

161 120. Maciag, A., E. Dainese, G. M. Rodriguez, A. Milano, R. Provvedi, M. R. Pasca, I. Smith, G. Palu, G. Riccardi, and R. Manganelli. 2007. Global analysis of the Mycobacterium tuberculosis Zur (FurB) regulon. J Bacteriol 189:730-40. 121. Mackaness, G. B., N. Smith, and A. Q. Wells. 1954. The growth of intracellular tubercle bacilli in relation to their virulence. Am Rev Tuberc 69:479-94. 122. Madigan, M. T., J. M. Martinko, and J. Parker. 1997. Brock biology of microorganisms - Chapter 4: Nutrition and metabolism. Prentice-Hall, Upper Saddle River, New Jersey. 123. Madigan, M. T., J. M. Martinko, and J. Parker. 1997. Brock biology of microorganisms - Chapter 5: Microbial Growth, Eigth ed. Prentice-Hall, Upper Saddle River, New Jersey. 124. Madigan, M. T., J. M. Martinko, and J. Parker. 1997. Brock biology of microorganisms - Chapter 13: Metabolic diversity among the microorganisms, Eigth ed. Prentice-Hall, Upper Saddle River, New Jersey. 125. Maglione, P. J., J. Xu, and J. Chan. 2007. B cells moderate inflammatory progression and enhance bacterial containment upon pulmonary challenge with Mycobacterium tuberculosis. J Immunol 178:7222-34. 126. Malloff, C. A. 2002. Identification of genomic alterations and altered expression profiles using novel bacterial genome display techniques. Master of Science. University of British Columbia, Vancouver. 127. Malloff, C. A., R. C. Fernandez, E. M. Dullaghan, R. W. Stokes, and W. L. Lam. 2002. Two-dimensional display and whole genome comparison of bacterial pathogen genomes of high G+C DNA content. Gene 293:205-11. 128. Malloff, C. A., R. C. Fernandez, and W. L. Lam. 2001. Bacterial comparative genomic hybridization: a method for directly identifying lateral gene transfer. J Mol Biol 312:1-5. 129. Mandavilli, A. 2007. Virtually incurable TB warns of impending disaster. Nat Med 13:271. 130. Mangan, J. A., and P. D. Butcher. 1998. Analysis of mycobacterial differential gene expression by RAP-PCR. Methods Mol Biol 101:307-22. 
131. Marczinek, K., A. Sugiyama, J. Hampe, G. Thiel, K. Lehmann, R. Neumann, W. J. de Leeuw, and P. Nurnberg. 1997. Cloning of minisatellite-containing sequences from two-dimensional DNA fingerprinting gels reveals the identity of genomic alterations in low-grade gliomas of different patients. Electrophoresis 18:1586-91. 132. Marrakchi, H., S. Ducasse, G. Labesse, H. Montrozier, E. Margeat, L. Emorine, X. Charpentier, M. Daffe, and A. Quemard. 2002. MabA (FabG1), a Mycobacterium tuberculosis protein involved in the long-chain fatty acid elongation system FAS-II. Microbiology 148:951-60. 133. McKinney, J. D., K. Honer zu Bentrup, E. J. Munoz-Elias, A. Miczak, B. Chen, W. T. Chan, D. Swenson, J. C. Sacchettini, W. R. Jacobs, Jr., and D. G. Russell. 2000. Persistence of Mycobacterium tuberculosis in macrophages and mice requires the glyoxylate shunt enzyme isocitrate lyase. Nature 406:735-8. 134. Medina, E., and R. J. North. 1998. Resistance ranking of some common inbred mouse strains to Mycobacterium tuberculosis and relationship to major histocompatibility complex haplotype and Nramp1 genotype. Immunology 93:270-4. 135. Menendez Mdel, C., M. J. Rebollo, C. Nunez Mdel, R. A. Cox, and M. J. Garcia. 2005. Analysis of the precursor rRNA fractions of rapidly growing mycobacteria:

162 quantification by methods that include the use of a promoter (rrnA P1) as a novel standard. J Bacteriol 187:534-43. 136. Middlebrook, G. 1954. Isoniazid-resistance and catalase activity of tubercle bacilli; a preliminary report. Am Rev Tuberc 69:471-2. 137. Middlebrook, G., R. J. Dubos, and C. Pierce. 1947. Virulence and morphological characteristics of mammalian tubercle bacilli. J Exp Med 86:175–184. 138. Miller, S. I., and J. J. Mekalanos. 1990. Constitutive expression of the phoP regulon attenuates Salmonella virulence and survival within macrophages. J Bacteriol 172:2485- 90. 139. Mollenkopf, H. J., K. Hahnke, and S. H. Kaufmann. 2006. Transcriptional responses in mouse lungs induced by vaccination with Mycobacterium bovis BCG and infection with Mycobacterium tuberculosis. Microbes Infect 8:136-44. 140. Monod, J. 1949. The growth of bacterial cultures. Annu Rev Microbiol 3:371-394. 141. Moss, J. E., P. E. Fisher, B. Vick, E. A. Groisman, and A. Zychlinsky. 2000. The regulatory protein PhoP controls susceptibility to the host inflammatory response in Shigella flexneri. Cell Microbiol 2:443-52. 142. Mostowy, S., C. Cleto, D. R. Sherman, and M. A. Behr. 2004. The Mycobacterium tuberculosis complex transcriptome of attenuation. Tuberculosis (Edinb) 84:197-204. 143. Mracek, J., S. J. Snyder, U. B. Chavez, and J. F. Turrens. 1991. A soluble fumarate reductase in Trypanosoma brucei procyclic trypomastigotes. J Protozool 38:554-8. 144. Muyzer, G., and K. Smalla. 1998. Application of denaturing gradient gel electrophoresis (DGGE) and temperature gradient gel electrophoresis (TGGE) in microbial ecology. Antonie Van Leeuwenhoek 73:127-41. 145. Nau, G. J., J. F. Richmond, A. Schlesinger, E. G. Jennings, E. S. Lander, and R. A. Young. 2002. Human macrophage activation programs induced by bacterial pathogens. Proc Natl Acad Sci U S A 99:1503-8. 146. Ng, V. H., J. S. Cox, A. O. Sousa, J. D. MacMicking, and J. D. McKinney. 2004. 
Role of KatG catalase-peroxidase in mycobacterial pathogenesis: countering the phagocyte oxidative burst. Mol Microbiol 52:1291-302. 147. Nielsen, E., M. Akita, J. Davila-Aponte, and K. Keegstra. 1997. Stable association of chloroplastic precursors with protein translocation complexes that contain proteins from both envelope membranes and a stromal Hsp100 molecular chaperone. Embo J 16:935- 46. 148. North, R. J., and A. A. Izzo. 1993. Mycobacterial virulence. Virulent strains of Mycobacteria tuberculosis have faster in vivo doubling times and are better equipped to resist growth-inhibiting functions of macrophages in the presence and absence of specific immunity. J Exp Med 177:1723-33. 149. Noss, E. H., C. V. Harding, and W. H. Boom. 2000. Mycobacterium tuberculosis inhibits MHC class II antigen processing in murine bone marrow macrophages. Cell Immunol 201:63-74. 150. Oatway WH, a. S. W. 1936. The pathogenesis and fate of tubercle produced by dissociate variants of tubercle bacilli. Journal of Infectious Diseases 59:306-25. 151. Ohno, H., G. Zhu, V. P. Mohan, D. Chu, S. Kohno, W. R. Jacobs, Jr., and J. Chan. 2003. The effects of reactive nitrogen intermediates on gene expression in Mycobacterium tuberculosis. Cell Microbiol 5:637-48.

163 152. Ohya, S., H. Xiong, Y. Tanabe, M. Arakawa, and M. Mitsuyama. 1998. Killing mechanism of Listeria monocytogenes in activated macrophages as determined by an improved assay system. J Med Microbiol 47:211-5. 153. Omura, S., H. Miyadera, H. Ui, K. Shiomi, Y. Yamaguchi, R. Masuma, T. Nagamitsu, D. Takano, T. Sunazuka, A. Harder, H. Kolbl, M. Namikoshi, H. Miyoshi, K. Sakamoto, and K. Kita. 2001. An anthelmintic compound, nafuredin, shows selective inhibition of complex I in helminth mitochondria. Proc Natl Acad Sci U S A 98:60-2. 154. Parish, T., G. Roberts, F. Laval, M. Schaeffer, M. Daffe, and K. Duncan. 2007. Functional complementation of the essential gene fabG1 of Mycobacterium tuberculosis by Mycobacterium smegmatis fabG but not Escherichia coli fabG. J Bacteriol 189:3721- 8. 155. Park, H. D., K. M. Guinn, M. I. Harrell, R. Liao, M. I. Voskuil, M. Tompa, G. K. Schoolnik, and D. R. Sherman. 2003. Rv3133c/dosR is a transcription factor that mediates the hypoxic response of Mycobacterium tuberculosis. Mol Microbiol 48:833- 43. 156. Pascopella, L., F. M. Collins, J. M. Martin, W. R. Jacobs, Jr., and B. R. Bloom. 1993. Identification of a genomic fragment of Mycobacterium tuberculosis responsible for in vivo growth advantage. Infect Agents Dis 2:282-4. 157. Pascopella, L., F. M. Collins, J. M. Martin, M. H. Lee, G. F. Hatfull, C. K. Stover, B. R. Bloom, and W. R. Jacobs, Jr. 1994. Use of in vivo complementation in Mycobacterium tuberculosis to identify a genomic fragment associated with virulence. Infect Immun 62:1313-9. 158. Perez, E., S. Samper, Y. Bordas, C. Guilhot, B. Gicquel, and C. Martin. 2001. An essential role for phoP in Mycobacterium tuberculosis virulence. Mol Microbiol 41:179- 87. 159. Pirooznia, M., V. Nagarajan, and Y. Deng. 2007. GeneVenn - A web application for comparing gene lists using Venn diagrams. Bioinformation 1:420-2. 160. Pollock, J. M., and P. Andersen. 1997. 
Predominant recognition of the ESAT-6 protein in the first phase of interferon with Mycobacterium bovis in cattle. Infect Immun 65:2587-92. 161. Porankiewicz, J., J. Wang, and A. K. Clarke. 1999. New insights into the ATP- dependent Clp protease: Escherichia coli and beyond. Mol Microbiol 32:449-58. 162. Prichard, R. K. 1973. The fumarate reductase reaction of Haemonchus contortus and the mode of action of some anthelmintics. Int J Parasitol 3:409-17. 163. Pym, A. S., P. Brodin, R. Brosch, M. Huerre, and S. T. Cole. 2002. Loss of RD1 contributed to the attenuation of the live tuberculosis vaccines Mycobacterium bovis BCG and Mycobacterium microti. Mol Microbiol 46:709-17. 164. Pym, A. S., P. Brodin, L. Majlessi, R. Brosch, C. Demangel, A. Williams, K. E. Griffiths, G. Marchal, C. Leclerc, and S. T. Cole. 2003. Recombinant BCG exporting ESAT-6 confers enhanced protection against tuberculosis. Nat Med 9:533-9. 165. Quast, T. M., and R. F. Browning. 2006. Pathogenesis and clinical manifestations of pulmonary tuberculosis. Dis Mon 52:413-9. 166. Rahn, O. 1930. The non-logarithmic order of death of some bacteria. Journal of General Physiology 13:395-407.

164 167. Ramakrishnan, T., M. Indira, and R. K. Maller. 1962. Evaluation of the routes of glucose utilization in virulent and avirulent strains of Mycobacterium tuberculosis. Biochim Biophys Acta 59:529-32. 168. Ravn, P., A. Demissie, T. Eguale, H. Wondwosson, D. Lein, H. A. Amoudy, A. S. Mustafa, A. K. Jensen, A. Holm, I. Rosenkrands, F. Oftung, J. Olobo, F. von Reyn, and P. Andersen. 1999. Human T cell responses to the ESAT-6 antigen from Mycobacterium tuberculosis. J Infect Dis 179:637-45. 169. Reed, M. B., P. Domenech, C. Manca, H. Su, A. K. Barczak, B. N. Kreiswirth, G. Kaplan, and C. E. Barry, 3rd. 2004. A glycolipid of hypervirulent tuberculosis strains that inhibits the innate immune response. Nature 431:84-7. 170. Rengarajan, J., B. R. Bloom, and E. J. Rubin. 2005. Genome-wide requirements for Mycobacterium tuberculosis adaptation and survival in macrophages. Proc Natl Acad Sci U S A 102:8327-32. 171. Rindi, L., L. Fattorini, D. Bonanni, E. Iona, G. Freer, D. Tan, G. Deho, G. Orefici, and C. Garzelli. 2002. Involvement of the fadD33 gene in the growth of Mycobacterium tuberculosis in the liver of BALB/c mice. Microbiology 148:3873-80. 172. Rindi, L., N. Lari, and C. Garzelli. 2001. Genes of Mycobacterium tuberculosis H37Rv downregulated in the attenuated strain H37Ra are restricted to M. tuberculosis complex species. New Microbiol 24:289-94. 173. Rindi, L., N. Lari, and C. Garzelli. 1999. Search for genes potentially involved in Mycobacterium tuberculosis virulence by mRNA differential display. Biochem Biophys Res Commun 258:94-101. 174. Rivera-Marrero, C. A., M. A. Burroughs, R. A. Masse, F. O. Vannberg, D. L. Leimbach, J. Roman, and J. J. Murtagh, Jr. 1998. Identification of genes differentially expressed in Mycobacterium tuberculosis by differential display PCR. Microb Pathog 25:307-16. 175. Rivera-Marrero, C. A., J. D. Ritzenthaler, S. A. Newburn, J. Roman, and R. D. Cummings. 2002. 
Molecular cloning and expression of a novel glycolipid sulfotransferase in Mycobacterium tuberculosis. Microbiology 148:783-92. 176. Roback, P., J. Beard, D. Baumann, C. Gille, K. Henry, S. Krohn, H. Wiste, M. I. Voskuil, C. Rainville, and R. Rutherford. 2007. A predicted operon map for Mycobacterium tuberculosis. Nucleic Acids Res 35:5085-95. 177. Roberts, D. M., R. P. Liao, G. Wisedchaisri, W. G. Hol, and D. R. Sherman. 2004. Two sensor kinases contribute to the hypoxic response of Mycobacterium tuberculosis. J Biol Chem 279:23082-7. 178. Rodriguez, G. M., and I. Smith. 2003. Mechanisms of iron regulation in mycobacteria: role in physiology and virulence. Mol Microbiol 47:1485-94. 179. Rodriguez, G. M., M. I. Voskuil, B. Gold, G. K. Schoolnik, and I. Smith. 2002. ideR, An essential gene in mycobacterium tuberculosis: role of IdeR in iron-dependent gene expression, iron metabolism, and oxidative stress response. Infect Immun 70:3371-81. 180. Rooyakkers, A. W., and R. W. Stokes. 2005. Absence of complement receptor 3 results in reduced binding and ingestion of Mycobacterium tuberculosis but has no significant effect on the induction of reactive oxygen and nitrogen intermediates or on the survival of the bacteria in resident and interferon-gamma activated macrophages. Microb Pathog 39:57-67.

165 181. Rosenkrands, I., A. King, K. Weldingh, M. Moniatte, E. Moertz, and P. Andersen. 2000. Towards the proteome of Mycobacterium tuberculosis. Electrophoresis 21:3740- 56. 182. Rosenkrands, I., K. Weldingh, S. Jacobsen, C. V. Hansen, W. Florio, I. Gianetri, and P. Andersen. 2000. Mapping and identification of Mycobacterium tuberculosis proteins by two-dimensional gel electrophoresis, microsequencing and immunodetection. Electrophoresis 21:935-48. 183. Saini, D. K., V. Malhotra, D. Dey, N. Pant, T. K. Das, and J. S. Tyagi. 2004. DevR- DevS is a bona fide two-component system of Mycobacterium tuberculosis that is hypoxia-responsive in the absence of the DNA-binding domain of DevR. Microbiology 150:865-75. 184. Sassetti, C. M., D. H. Boyd, and E. J. Rubin. 2003. Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol 48:77-84. 185. Sassetti, C. M., and E. J. Rubin. 2003. Genetic requirements for mycobacterial survival during infection. Proc Natl Acad Sci U S A 100:12989-94. 186. Schlesinger, L. S., T. M. Kaufman, S. Iyer, S. R. Hull, and L. K. Marchiando. 1996. Differences in mannose receptor-mediated uptake of lipoarabinomannan from virulent and attenuated strains of Mycobacterium tuberculosis by human macrophages. J Immunol 157:4568-75. 187. Schmittgen, T. D., B. A. Zakrajsek, A. G. Mills, V. Gorn, M. J. Singer, and M. W. Reed. 2000. Quantitative reverse transcription-polymerase chain reaction to study mRNA decay: comparison of endpoint and real-time methods. Anal Biochem 285:194-204. 188. Schnappinger, D., S. Ehrt, M. I. Voskuil, Y. Liu, J. A. Mangan, I. M. Monahan, G. Dolganov, B. Efron, P. D. Butcher, C. Nathan, and G. K. Schoolnik. 2003. Transcriptional Adaptation of Mycobacterium tuberculosis within Macrophages: Insights into the Phagosomal Environment. J Exp Med 198:693-704. 189. Senaratne, R. H., A. D. De Silva, S. J. Williams, J. D. Mougous, J. R. Reader, T. Zhang, S. Chan, B. Sidders, D. H. Lee, J. Chan, C. R. 
Bertozzi, and L. W. Riley. 2006. 5'-Adenosinephosphosulphate reductase (CysH) protects Mycobacterium tuberculosis against free radicals during chronic infection phase in mice. Mol Microbiol 59:1744-53. 190. Senaratne, R. H., J. D. Mougous, J. R. Reader, S. J. Williams, T. Zhang, C. R. Bertozzi, and L. W. Riley. 2007. Vaccine efficacy of an attenuated but persistent Mycobacterium tuberculosis cysH mutant. J Med Microbiol 56:454-8. 191. Sharman, G. J., D. H. Williams, D. F. Ewing, and C. Ratledge. 1995. Determination of the structure of exochelin MN, the extracellular siderophore from Mycobacterium neoaurum. Chem Biol 2:553-61. 192. Sherman, D. R., K. Mdluli, M. J. Hickey, C. E. Barry, 3rd, and C. K. Stover. 1999. AhpC, oxidative stress and drug resistance in Mycobacterium tuberculosis. Biofactors 10:211-7. 193. Sherman, D. R., M. Voskuil, D. Schnappinger, R. Liao, M. I. Harrell, and G. K. Schoolnik. 2001. Regulation of the Mycobacterium tuberculosis hypoxic response gene encoding alpha -crystallin. Proc Natl Acad Sci U S A 98:7534-9. 194. Shih, H. A., F. J. Couch, K. L. Nathanson, M. A. Blackwood, T. R. Rebbeck, K. A. Armstrong, K. Calzone, J. Stopfer, S. Seal, M. R. Stratton, and B. L. Weber. 2002.

166 BRCA1 and BRCA2 mutation frequency in women evaluated in a breast cancer risk evaluation clinic. J Clin Oncol 20:994-9. 195. Singh, C. R., R. A. Moulton, L. Y. Armitige, A. Bidani, M. Snuggs, S. Dhandayuthapani, R. L. Hunter, and C. Jagannath. 2006. Processing and presentation of a mycobacterial antigen 85B epitope by murine macrophages is dependent on the phagosomal acquisition of vacuolar proton ATPase and in situ activation of cathepsin D. J Immunol 177:3250-9. 196. Singh, K. K., X. Zhang, A. S. Patibandla, P. Chien, Jr., and S. Laal. 2001. Antigens of Mycobacterium tuberculosis expressed during preclinical tuberculosis: serological immunodominance of proteins with repetitive amino acid sequences. Infect Immun 69:4185-91. 197. Sirakova, T. D., A. K. Thirumala, V. S. Dubey, H. Sprecher, and P. E. Kolattukudy. 2001. The Mycobacterium tuberculosis pks2 gene encodes the synthase for the hepta- and octamethyl-branched fatty acids required for sulfolipid synthesis. J Biol Chem 276:16833-9. 198. Sly, L. M., S. M. Hingley-Wilson, N. E. Reiner, and W. R. McMaster. 2003. Survival of Mycobacterium tuberculosis in host macrophages involves resistance to apoptosis dependent upon induction of antiapoptotic Bcl-2 family member Mcl-1. J Immunol 170:430-7. 199. Spira, A., J. D. Carroll, G. Liu, Z. Aziz, V. Shah, H. Kornfeld, and J. Keane. 2003. Apoptosis genes in human alveolar macrophages infected with virulent or attenuated Mycobacterium tuberculosis: a pivotal role for tumor necrosis factor. Am J Respir Cell Mol Biol 29:545-51. 200. Spitznagel, J. K. 1999. Mycobacteria: Tuberculosis and Leprosy, p. 230-242. In M. Schaechter, N. C. Engleber, B. I. Eisenstein, and G. Medoff (ed.), Mechanisms of Microbial Disease, 3rd ed. Lippincott Williams & Wilkins, Baltimore, Maryland. 201. Sreevatsan, S., X. Pan, K. E. Stockbauer, N. D. Connell, B. N. Kreiswirth, T. S. Whittam, and J. M. Musser. 1997. 
Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Natl Acad Sci U S A 94:9869-74. 202. Stanley, S. A., S. Raghavan, W. W. Hwang, and J. S. Cox. 2003. Acute infection and macrophage subversion by Mycobacterium tuberculosis require a specialized secretion system. Proc Natl Acad Sci U S A 100:13001-6. 203. Steenken W, a. G. L. 1946. History of H37 strain of tubercle bacillus. American Review of Respiratory Disease 54:62-66. 204. Steenken, W., Jr. 1961. The persistence of tubercle bacilli in caseous lesions in the experimental animal (guinea pig). The American review of respiratory disease. 83:550- 554. 205. Steenken W, O. W., Petroff SA. 1934. Biological studies of the tubercle bacillus III. Dissociation and pathogenicity of the R and S variants of the human tubercle bacillus (H37). Journal of Experimental Medicine 60:515-40. 206. Stewart, G. R., L. Wernisch, R. Stabler, J. A. Mangan, J. Hinds, K. G. Laing, D. B. Young, and P. D. Butcher. 2002. Dissection of the heat-shock response in Mycobacterium tuberculosis using mutants and microarrays. Microbiology 148:3129-38. 207. Stokes, R. W., R. Norris-Jones, D. E. Brooks, T. J. Beveridge, D. Doxsee, and L. M. Thorson. 2004. The glycan-rich outer layer of the cell wall of Mycobacterium

167 tuberculosis acts as an antiphagocytic capsule limiting the association of the bacterium with macrophages. Infect Immun 72:5676-86. 208. Strohmeier, G. R., and M. J. Fenton. 1999. Roles of lipoarabinomannan in the pathogenesis of tuberculosis. Microbes Infect 1:709-17. 209. Sturgill-Koszycki, S., U. E. Schaible, and D. G. Russell. 1996. Mycobacterium- containing phagosomes are accessible to early endosomes and reflect a transitional state in normal phagosome biogenesis. Embo J 15:6960-8. 210. Sturgill-Koszycki, S., P. H. Schlesinger, P. Chakraborty, P. L. Haddix, H. L. Collins, A. K. Fok, R. D. Allen, S. L. Gluck, J. Heuser, and D. G. Russell. 1994. Lack of acidification in Mycobacterium phagosomes produced by exclusion of the vesicular proton-ATPase. Science 263:678-81. 211. Talaat, A. M., P. Hunter, and S. A. Johnston. 2000. Genome-directed primers for selective labeling of bacterial transcripts for DNA microarray analysis. Nat Biotechnol 18:679-82. 212. Talaat, A. M., R. Lyons, S. T. Howard, and S. A. Johnston. 2004. The temporal expression profile of Mycobacterium tuberculosis infection in mice. Proc Natl Acad Sci U S A 101:4602-7. 213. Talaat, A. M., S. K. Ward, C. W. Wu, E. Rondon, C. Tavano, J. P. Bannantine, R. Lyons, and S. A. Johnston. 2007. Mycobacterial bacilli are metabolically active during chronic tuberculosis in murine lungs: insights from genome-wide transcriptional profiling. J Bacteriol 189:4265-74. 214. Tekaia, F., S. V. Gordon, T. Garnier, R. Brosch, B. G. Barrell, and S. T. Cole. 1999. Analysis of the proteome of Mycobacterium tuberculosis in silico. Tuber Lung Dis 79:329-42. 215. Teng, F., L. Wang, K. V. Singh, B. E. Murray, and G. M. Weinstock. 2002. Involvement of PhoP-PhoS homologs in Enterococcus faecalis virulence. Infect Immun 70:1991-6. 216. Ting, L. M., A. C. Kim, A. Cattamanchi, and J. D. Ernst. 1999. Mycobacterium tuberculosis inhibits IFN-gamma transcriptional responses without inhibiting activation of STAT1. 
J Immunol 163:3898-906. 217. Trivedi, O. A., P. Arora, V. Sridharan, R. Tickoo, D. Mohanty, and R. S. Gokhale. 2004. Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria. Nature 428:441-5. 218. Trunz, B. B., P. Fine, and C. Dye. 2006. Effect of BCG vaccination on childhood tuberculous meningitis and miliary tuberculosis worldwide: a meta-analysis and assessment of cost-effectiveness. Lancet 367:1173-80. 219. Turrens, J. F., C. L. Newton, L. Zhong, F. R. Hernandez, J. Whitfield, and R. Docampo. 1999. Mercaptopyridine-N-oxide, an NADH-fumarate reductase inhibitor, blocks Trypanosoma cruzi growth in culture and in infected myoblasts. FEMS Microbiol Lett 175:217-21. 220. Turrens, J. F., B. P. Watts, Jr., L. Zhong, and R. Docampo. 1996. Inhibition of Trypanosoma cruzi and T. brucei NADH fumarate reductase by benznidazole and anthelmintic imidazole derivatives. Mol Biochem Parasitol 82:125-9. 221. Vandesompele, J., A. De Paepe, and F. Speleman. 2002. Elimination of primer-dimer artifacts and genomic coamplification using a two-step SYBR green I real-time RT-PCR. Anal Biochem 303:95-8.

168 222. Velmurugan, K., B. Chen, J. L. Miller, S. Azogue, S. Gurses, T. Hsu, M. Glickman, W. R. Jacobs, S. A. Porcelli, and V. Briken. 2007. Mycobacterium tuberculosis nuoG Is a Virulence Gene That Inhibits Apoptosis of Infected Host Cells. PLoS Pathog 3:e110. 223. Waddell, S. J., and P. D. Butcher. 2007. Microarray analysis of whole genome expression of intracellular Mycobacterium tuberculosis. Curr Mol Med 7:287-96. 224. Waddell, S. J., R. A. Stabler, K. Laing, L. Kremer, R. C. Reynolds, and G. S. Besra. 2004. The use of microarray analysis to determine the gene expression profiles of Mycobacterium tuberculosis in response to anti-bacterial compounds. Tuberculosis (Edinb) 84:263-74. 225. Walters, S. B., E. Dubnau, I. Kolesnikova, F. Laval, M. Daffe, and I. Smith. 2006. The Mycobacterium tuberculosis PhoPR two-component system regulates genes essential for virulence and complex lipid biosynthesis. Mol Microbiol 60:312-30. 226. Wang, S. Y., H. J. Zheng, B. F. Wang, X. L. Zhang, S. Y. Pu, G. F. Zhu, and G. P. Zhao. 2007. Complete genomic sequence of Mycobacterium tuberculosis strain H37Ra, a non-pathogenic variant closely related to the well-characterized pathogenic strain H37Rv. GenBank Accession Number: CP000611, http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=148503909 227. Warren, R. M., S. L. Sampson, M. Richardson, G. D. Van Der Spuy, C. J. Lombard, T. C. Victor, and P. D. van Helden. 2000. Mapping of IS6110 flanking regions in clinical isolates of Mycobacterium tuberculosis demonstrates genome plasticity. Mol Microbiol 37:1405-16. 228. Wayne, L. G. 1994. Dormancy of Mycobacterium tuberculosis and latency of disease. Eur J Clin Microbiol Infect Dis 13:908-14. 229. Wayne, L. G. 1976. Dynamics of submerged growth of Mycobacterium tuberculosis under aerobic and microaerophilic conditions. Am Rev Respir Dis 114:807-11. 230. Wei, J., J. L. Dahl, J. W. Moulder, E. A. Roberts, P. O'Gaora, D. B. Young, and R. L. Friedman. 2000. 
Identification of a Mycobacterium tuberculosis gene that enhances mycobacterial survival in macrophages. J Bacteriol 182:377-84. 231. Wheeler, P. R., K. Bulmer, and C. Ratledge. 1990. Enzymes for biosynthesis de novo and elongation of fatty acids in mycobacteria grown in host cells: is Mycobacterium leprae competent in fatty acid biosynthesis? J Gen Microbiol 136:211-7. 232. Wheeler, P. R., J. G. Raynes, and K. P. McAdam. 1994. Autoantibodies to cerebroside sulphate (sulphatide) in leprosy. Clin Exp Immunol 98:145-50. 233. Wilson, M., J. DeRisi, H. H. Kristensen, P. Imboden, S. Rane, P. O. Brown, and G. K. Schoolnik. 1999. Exploring drug-induced alterations in gene expression in Mycobacterium tuberculosis by microarray hybridization. Proc Natl Acad Sci U S A 96:12833-8. 234. Wisedchaisri, G., M. Wu, A. E. Rice, D. M. Roberts, D. R. Sherman, and W. G. Hol. 2005. Structures of Mycobacterium tuberculosis DosR and DosR-DNA complex involved in gene activation during adaptation to hypoxic latency. J Mol Biol 354:630-41. 235. Zimmerli, S., S. Edwards, and J. D. Ernst. 1996. Selective receptor blockade during phagocytosis does not alter the survival and growth of Mycobacterium tuberculosis in human macrophages. Am J Respir Cell Mol Biol 15:760-70.

Appendix I: Publications and contributions

1) Dullaghan EM, Malloff CA, Li AH, Lam WL, Stokes RW. Two-Dimensional Bacterial Genome Display: A method for the genomic analysis of mycobacteria. Microbiology, 148: 3111-3117. 2002. *Highlighted in “Hot off the Press” in Microbiology Today, 29: 210. 2002.

- I contributed the 2DDE profiles of M. tuberculosis.

2) Malloff C, Dullaghan E, Li A, Stokes R, Fernandez R, Lam W. Two-dimensional displays for comparisons of bacterial genomes. Biol. Proced. Online, 5:143-152. 2003.

- I contributed the 2DDE profiles of M. tuberculosis.

3) Li, AH, Lam, WL, and Stokes, RW. Array-based expression analysis of Mycobacterium tuberculosis during interaction with macrophages: characterization of genes differentially expressed in virulent and attenuated M. tuberculosis identifies candidate genes involved in intracellular growth. Submitted.

4) Li, AH, Lam, WL, and Stokes, RW. Bacterial artificial chromosome fingerprint arrays for the differentiation of transcriptomic differences in mycobacteria. Manuscript in preparation.

5) Li, AH, Waddell SJ, Hinds, J, Bains, M, Hancock, REW, Butcher, PD, and Stokes RW. Microarray analysis of intracellular Mycobacterium tuberculosis: biosynthetic pathways important for intracellular survival. Manuscript in preparation.

Appendix II: Genes upregulated in broth-grown M. tuberculosis H37Ra versus H37Rv

Expanded list of genes upregulated in broth-grown M. tuberculosis H37Ra versus H37Rv (see Table 5-2). Three populations each of M. tuberculosis H37Ra and H37Rv grown in 7H9 broth were hybridised in duplicate to M. tuberculosis microarrays supplied by Bµg@S (http://www.bugs.sgul.ac.uk/). Arrays were normalised and expression was analysed as specified in Section 2.6.3.2. The statistical significance of the fold-difference across all three populations was analysed using ANOVA. Genes found to be significantly (P<0.05) upregulated in broth-grown H37Ra versus broth-grown H37Rv are listed. SD = standard deviation.
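The selection rule described in this legend can be illustrated with a minimal sketch. It assumes that a P-value has already been obtained for each gene (the ANOVA itself is not reproduced here) and that the reported SD is the sample standard deviation of the replicate fold-differences, which the legend does not state. The function names and the example values are hypothetical and are not data from this thesis.

```python
from statistics import mean, stdev

P_CUTOFF = 0.05  # significance threshold given in the legend


def summarise(replicate_folds):
    """Return (mean fold-induction, sample SD) for one gene.

    Assumption: SD is the sample standard deviation across replicates.
    """
    return mean(replicate_folds), stdev(replicate_folds)


def upregulated(genes, p_cutoff=P_CUTOFF):
    """Keep genes with P < cutoff and a mean fold-induction above 1.

    `genes` maps a gene name to (replicate fold-differences, P-value).
    Rows are returned sorted by gene name.
    """
    rows = []
    for name, (folds, p_value) in genes.items():
        fold, sd = summarise(folds)
        if p_value < p_cutoff and fold > 1.0:
            rows.append((name, round(fold, 2), p_value, round(sd, 2)))
    return sorted(rows)


# Hypothetical replicate values for illustration only.
example = {
    "geneA": ([2.0, 2.5, 1.5], 0.001),  # significant and upregulated
    "geneB": ([1.1, 0.9, 1.0], 0.60),   # not significant
    "geneC": ([0.5, 0.4, 0.6], 0.002),  # significant but downregulated
}
selected = upregulated(example)
```

In this sketch only geneA passes both criteria; each retained row carries the same four values reported per gene in the table (name, fold-induction, P-value, SD).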

Systematic Fold- Common Name induction Name P-value SD Product MtH37Rv-0033 1.80 acpA 2.36E-04 0.39 PROBABLE ACYL CARRIER PROTEIN ACPA (ACP) ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN MtH37Rv-2428 3.97 ahpC 1.94E-04 1.63 AHPC (ALKYL HYDROPEROXIDASE C) ALKYL HYDROPEROXIDE REDUCTASE D PROTEIN MtH37Rv-2429 2.03 ahpD 2.30E-04 0.52 AHPD (ALKYL HYDROPEROXIDASE D) MtH37Rv-1876 2.28 bfrA 2.62E-06 0.24 PROBABLE BACTERIOFERRITIN BFRA Low molecular weight protein antigen 7 cfp7 (10 kDa antigen) MtH37Rv-0288 1.88 cfp7 1.69E-03 0.56 (CFP-7) (Protein TB10.4) PROBABLE ATP-DEPENDENT CLP PROTEASE ATP- MtH37Rv-2457c 1.98 clpX 9.75E-04 0.64 BINDING SUBUNIT CLPX MtH37Rv-0058 1.74 dnaB 1.66E-04 0.32 PROBABLE REPLICATIVE DNA HELICASE DNAB FATTY-ACID-COA LIGASE FADD26 (FATTY-ACID-COA MtH37Rv-2930 1.74 fadD26 2.18E-04 0.34 SYNTHETASE) (FATTY-ACID-COA SYNTHASE) SECRETED ANTIGEN 85-C FBPC (85C) (ANTIGEN 85 COMPLEX C) (AG58C) (MYCOLYL 85C) MtH37Rv-0129c 2.55 fbpC 2.86E-03 1.32 (FIBRONECTIN-BINDING PROTEIN C) PROBABLE ELECTRON TRANSFER FLAVOPROTEIN (BETA-SUBUNIT) FIXA (BETA-ETF) (ELECTRON MtH37Rv-3029c 2.00 fixA 1.17E-04 0.52 TRANSFER FLAVOPROTEIN SMALL SUBUNIT) (ETFSS) MtH37Rv-0251c 2.40 hsp 1.54E-03 1.12 PROBABLE HEAT SHOCK PROTEIN HSP MtH37Rv-0342 2.18 iniA 4.42E-05 0.45 ISONIAZID INDUCTIBLE GENE PROTEIN INIA MtH37Rv-1881c 2.12 lppE 3.35E-03 0.90 POSSIBLE CONSERVED LIPOPROTEIN LPPE MtH37Rv-3763 2.47 lpqH 5.21E-03 1.48 19 KDA LIPOPROTEIN ANTIGEN PRECURSOR LPQH MtH37Rv-0179c 1.87 lprO 3.38E-04 0.45 POSSIBLE LIPOPROTEIN LPRO METHOXY MYCOLIC ACID SYNTHASE 4 MMAA4 (METHYL MYCOLIC ACID SYNTHASE 4) (MMA4) MtH37Rv-0642c 2.03 mmaA4 2.70E-06 0.26 (HYDROXY MYCOLIC ACID SYNTHASE) PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN E MOAE1 (MOLYBDOPTERIN CONVERTING FACTOR LARGE SUBUNIT) (MOLYBDOPTERIN [MPT] CONVERTING FACTOR, MtH37Rv-3119 1.90 moaE1 6.69E-05 0.35 SUBUNIT 2) POSSIBLE MOLYBDOPTERIN BIOSYNTHESIS PROTEIN MtH37Rv-2338c 1.83 moeW 2.05E-04 0.39 MOEW IMMUNOGENIC PROTEIN MPT64 (ANTIGEN MtH37Rv-1980c 
2.30 mpt64 1.39E-05 0.43 MPT64/MPB64)

MtH37Rv-0985c | 1.82 | mscL | 2.26E-03 | 0.60 | POSSIBLE LARGE-CONDUCTANCE ION MECHANOSENSITIVE CHANNEL MSCL
MtCDC1551-0196 | 2.93 | MT0196 | 5.06E-03 | 1.72 | hypothetical protein
MtCDC1551-0719.1 | 2.64 | MT0719.1 | 1.27E-03 | 1.22 | hypothetical protein
MtCDC1551-0975 | 1.97 | MT0975 | 2.30E-04 | 0.45 | hypothetical protein
MtCDC1551-1178 | 2.14 | MT1178 | 4.06E-03 | 0.97 | hypothetical protein
MtCDC1551-1356 | 2.41 | MT1356 | 6.76E-03 | 1.47 | hypothetical protein
MtCDC1551-1479.1 | 2.51 | MT1479.1 | 7.46E-04 | 0.96 | hypothetical protein
MtCDC1551-2042.1 | 2.37 | MT2042.1 | 2.70E-06 | 0.35 | hypothetical protein
MtCDC1551-2142 | 1.77 | MT2142 | 1.92E-04 | 0.35 | hypothetical protein
MtCDC1551-2421 | 2.87 | MT2421 | 1.92E-04 | 1.00 | conserved hypothetical protein
MtCDC1551-2455 | 2.01 | MT2455 | 2.02E-04 | 0.50 | hypothetical protein
MtCDC1551-3135 | 4.41 | MT3135 | 8.53E-05 | 1.74 | hypothetical protein
MtCDC1551-3139.1 | 2.42 | MT3139.1 | 4.60E-04 | 0.90 | hypothetical protein
MtCDC1551-3972.1 | 3.98 | MT3972.1 | 6.76E-05 | 1.34 | hypothetical protein
MtH37Rv-3051c | 1.97 | nrdE | 6.29E-03 | 0.90 | RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE (ALPHA CHAIN) NRDE (RIBONUCLEOTIDE REDUCTASE SMALL SUBUNIT) (R1F PROTEIN)
MtH37Rv-3053c | 2.46 | nrdH | 4.08E-03 | 1.38 | PROBABLE GLUTAREDOXIN ELECTRON TRANSPORT COMPONENT OF NRDEF (GLUTAREDOXIN-LIKE PROTEIN) NRDH
MtH37Rv-3052c | 2.12 | nrdI | 1.66E-04 | 0.44 | PROBABLE NRDI PROTEIN
MtH37Rv-3155 | 2.01 | nuoK | 1.66E-04 | 0.47 | PROBABLE NADH DEHYDROGENASE I (CHAIN K) NUOK (NADH-UBIQUINONE OXIDOREDUCTASE CHAIN K)
MtH37Rv-1791 | 2.05 | PE19 | 1.28E-03 | 0.72 | PE FAMILY PROTEIN
MtH37Rv-0757 | 2.72 | phoP | 1.39E-05 | 0.64 | POSSIBLE TWO COMPONENT SYSTEM RESPONSE TRANSCRIPTIONAL POSITIVE REGULATOR PHOP
MtH37Rv-0156 | 2.07 | pntAb | 1.94E-04 | 0.51 | PROBABLE NAD(P) TRANSHYDROGENASE (SUBUNIT ALPHA) PNTAB [SECOND PART; INTEGRAL MEMBRANE PROTEIN] (PYRIDINE NUCLEOTIDE TRANSHYDROGENASE SUBUNIT ALPHA) (NICOTINAMIDE NUCLEOTIDE TRANSHYDROGENASE SUBUNIT ALPHA)
MtH37Rv-1297 | 1.96 | rho | 8.27E-03 | 0.98 | PROBABLE TRANSCRIPTION TERMINATION FACTOR RHO HOMOLOG
MtH37Rv-0704 | 1.89 | rplB | 4.70E-03 | 0.77 | PROBABLE 50S ribosomal protein L2 RPLB
MtH37Rv-0056 | 2.30 | rplI | 1.10E-03 | 0.85 | PROBABLE 50S RIBOSOMAL PROTEIN L9 RPLI
MtH37Rv-3443c | 1.98 | rplM | 1.69E-03 | 0.71 | PROBABLE 50S RIBOSOMAL PROTEIN L13 RPLM
MtH37Rv-0714 | 1.81 | rplN | 4.42E-05 | 0.27 | PROBABLE 50S RIBOSOMAL PROTEIN L14 RPLN
MtH37Rv-0703 | 1.85 | rplW | 4.42E-05 | 0.29 | PROBABLE 50S RIBOSOMAL PROTEIN L23 RPLW
MtH37Rv-0715 | 2.11 | rplX | 2.70E-06 | 0.27 | PROBABLE 50S RIBOSOMAL PROTEIN L24 RPLX
MtH37Rv-0722 | 1.87 | rpmD | 7.46E-04 | 0.52 | PROBABLE 50S RIBOSOMAL PROTEIN L30 RPMD
MtH37Rv-1298 | 2.22 | rpmE | 1.94E-04 | 0.59 | PROBABLE 50S RIBOSOMAL PROTEIN L31 RPME
MtH37Rv-0634B | 3.33 | rpmG2 | 1.16E-05 | 0.86 | PROBABLE 50S RIBOSOMAL PROTEIN L33 RPMG2
MtH37Rv-0667 | 2.19 | rpoB | 3.83E-03 | 1.07 | DNA-DIRECTED RNA POLYMERASE (BETA CHAIN) RPOB (TRANSCRIPTASE BETA CHAIN) (RNA POLYMERASE BETA SUBUNIT)
MtH37Rv-1390 | 1.92 | rpoZ | 1.39E-05 | 0.28 | PROBABLE DNA-DIRECTED RNA POLYMERASE (OMEGA CHAIN) RPOZ (TRANSCRIPTASE OMEGA CHAIN) (RNA POLYMERASE OMEGA SUBUNIT)
MtH37Rv-0683 | 2.27 | rpsG | 8.00E-06 | 0.41 | PROBABLE 30S RIBOSOMAL PROTEIN S7 RPSG
MtH37Rv-0718 | 2.66 | rpsH | 6.63E-04 | 1.11 | PROBABLE 30S RIBOSOMAL PROTEIN S8 RPSH
MtH37Rv-2785c | 2.11 | rpsO | 1.98E-07 | 0.17 | PROBABLE 30S RIBOSOMAL PROTEIN S15 RPSO
MtH37Rv-2909c | 1.78 | rpsP | 4.88E-04 | 0.41 | PROBABLE 30S RIBOSOMAL PROTEIN S16 RPSP
MtH37Rv-0705 | 3.20 | rpsS | 3.65E-04 | 1.37 | PROBABLE 30S RIBOSOMAL PROTEIN S19 RPSS
MtH37Rv-2412 | 2.01 | rpsT | 1.94E-04 | 0.44 | PROBABLE 30S RIBOSOMAL PROTEIN S20 RPST
MtH37Rv-0008c | 1.70 | Rv0008c | 7.55E-04 | 0.40 | POSSIBLE MEMBRANE PROTEIN
MtH37Rv-0190 | 2.17 | Rv0190 | 7.44E-05 | 0.50 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0239 | 1.95 | Rv0239 | 8.02E-05 | 0.36 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0298 | 2.07 | Rv0298 | 1.39E-05 | 0.33 | HYPOTHETICAL PROTEIN
MtH37Rv-0313 | 1.90 | Rv0313 | 1.75E-03 | 0.57 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0430 | 2.16 | Rv0430 | 2.26E-03 | 0.91 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0637 | 2.81 | Rv0637 | 1.39E-05 | 0.68 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0831c | 1.99 | Rv0831c | 3.92E-04 | 0.47 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0883c | 1.93 | Rv0883c | 1.47E-04 | 0.40 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0885 | 2.42 | Rv0885 | 1.54E-03 | 1.02 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0909 | 1.85 | Rv0909 | 2.80E-04 | 0.35 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0948c | 1.92 | Rv0948c | 4.99E-05 | 0.35 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1038c | 2.60 | Rv1038c | 1.97E-04 | 0.72 | Putative ESAT-6 like protein 2
MtH37Rv-1102c | 1.81 | Rv1102c | 6.36E-04 | 0.42 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1109c | 1.94 | Rv1109c | 1.54E-03 | 0.60 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1130 | 2.29 | Rv1130 | 4.95E-04 | 0.70 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1157c | 2.53 | Rv1157c | 2.16E-03 | 1.23 | CONSERVED HYPOTHETICAL ALA-, PRO-RICH PROTEIN
MtH37Rv-1158c | 1.78 | Rv1158c | 1.47E-04 | 0.32 | CONSERVED HYPOTHETICAL ALA-, PRO-RICH PROTEIN
MtH37Rv-1284 | 1.90 | Rv1284 | 7.46E-04 | 0.54 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1507A | 1.81 | Rv1507A | 1.94E-04 | 0.37 | HYPOTHETICAL PROTEIN
MtH37Rv-1792 | 2.60 | Rv1792 | 8.02E-05 | 0.77 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1830 | 2.14 | Rv1830 | 1.29E-03 | 0.82 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1987 | 3.45 | Rv1987 | 2.91E-05 | 1.02 | POSSIBLE CHITINASE
MtH37Rv-2050 | 2.10 | Rv2050 | 3.92E-04 | 0.52 | conserved hypothetical protein
MtH37Rv-2204c | 2.24 | Rv2204c | 3.59E-03 | 1.13 | conserved hypothetical protein
MtH37Rv-2347c | 3.32 | Rv2347c | 4.56E-04 | 1.54 | PUTATIVE ESAT-6 LIKE PROTEIN 7
MtH37Rv-2560 | 2.17 | Rv2560 | 8.50E-06 | 0.34 | PROBABLE PROLINE AND GLYCINE RICH TRANSMEMBRANE PROTEIN
MtH37Rv-2561 | 2.65 | Rv2561 | 6.61E-07 | 0.43 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2699c | 2.66 | Rv2699c | 1.40E-04 | 0.74 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2840c | 2.51 | Rv2840c | 1.39E-05 | 0.51 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2927c | 2.44 | Rv2927c | 9.92E-04 | 0.92 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2959c | 2.69 | Rv2959c | 2.70E-06 | 0.49 | POSSIBLE METHYLTRANSFERASE (METHYLASE)
MtH37Rv-2960c | 1.63 | Rv2960c | 1.94E-04 | 0.28 | HYPOTHETICAL PROTEIN
MtH37Rv-2970A | 2.72 | Rv2970A | 3.65E-04 | 0.97 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2971 | 2.01 | Rv2971 | 4.42E-05 | 0.35 | PROBABLE OXIDOREDUCTASE
MtH37Rv-3049c | 2.73 | Rv3049c | 4.35E-04 | 1.01 | PROBABLE MONOOXYGENASE
MtH37Rv-3258c | 1.51 | Rv3258c | 8.34E-03 | 0.36 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3269 | 2.50 | Rv3269 | 2.26E-03 | 1.25 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3620c | 2.62 | Rv3620c | 1.31E-03 | 1.13 | PUTATIVE ESAT-6 LIKE PROTEIN 10
MtH37Rv-3669 | 1.63 | Rv3669 | 2.30E-04 | 0.28 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3676 | 1.94 | Rv3676 | 2.75E-06 | 0.24 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY CRP/FNR-FAMILY)
MtH37Rv-3706c | 2.49 | Rv3706c | 2.24E-05 | 0.55 | CONSERVED HYPOTHETICAL PROLINE RICH PROTEIN
MtH37Rv-3716c | 1.63 | Rv3716c | 3.10E-03 | 0.28 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3750c | 2.89 | Rv3750c | 1.00E-04 | 0.88 | POSSIBLE EXCISIONASE
MtH37Rv-3764c | 1.91 | Rv3764c | 1.88E-03 | 0.60 | POSSIBLE TWO COMPONENT SENSOR KINASE
MtH37Rv-3921c | 2.02 | Rv3921c | 7.46E-04 | 0.60 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3248c | 1.94 | sahH | 9.75E-03 | 0.94 | PROBABLE ADENOSYLHOMOCYSTEINASE SAHH (S-ADENOSYL-L-HOMOCYSTEINE HYDROLASE) (ADOHCYASE)
MtH37Rv-2703 | 2.09 | sigA | 1.39E-05 | 0.36 | RNA POLYMERASE SIGMA FACTOR SIGA (SIGMA-A)
MtH37Rv-3223c | 1.79 | sigH | 8.02E-05 | 0.30 | ALTERNATIVE RNA POLYMERASE SIGMA-E FACTOR (SIGMA-24) SIGH (RPOE)
MtH37Rv-1636 | 2.61 | TB15.3 | 1.86E-03 | 1.34 | CONSERVED HYPOTHETICAL PROTEIN TB15.3
MtH37Rv-2185c | 2.16 | TB16.3 | 2.70E-06 | 0.25 | conserved hypothetical protein TB16.3
MtH37Rv-2928 | 1.94 | tesA | 1.33E-04 | 0.39 | PROBABLE TESA
MtH37Rv-2462c | 1.90 | tig | 1.16E-04 | 0.35 | PROBABLE TRIGGER FACTOR (TF) PROTEIN TIG
MtH37Rv-0685 | 2.22 | tuf | 2.28E-03 | 0.96 | PROBABLE ELONGATION FACTOR TU TUF (EF-TU)
MtH37Rv-0469 | 2.10 | umaA | 6.71E-04 | 0.63 | POSSIBLE MYCOLIC ACID SYNTHASE UMAA

Appendix III: Genes downregulated in broth-grown M. tuberculosis H37Ra versus H37Rv.

Expanded list of genes downregulated in broth-grown M. tuberculosis H37Ra versus H37Rv (see Table 5-3). Three populations each of M. tuberculosis H37Ra and H37Rv grown in 7H9 broth were hybridised in duplicate to M. tuberculosis microarrays supplied by Bµg@S (http://www.bugs.sgul.ac.uk/). Arrays were normalised and expression analysed as specified in Section 2.6.3.2. Statistical significance of fold-difference across all three populations was analysed using ANOVA. Genes whose expression was statistically significantly (P<0.05) downregulated in broth-grown H37Ra versus broth-grown H37Rv are listed. SD = standard deviation.
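As a rough illustration of the kind of calculation summarised above, the following sketch computes a fold-difference (H37Ra relative to H37Rv) and a one-way ANOVA F statistic for a single gene. It is not the procedure of Section 2.6.3.2, which is not reproduced here: taking the fold-difference as a ratio of group means is an assumption, the function name `fold_and_f_statistic` is hypothetical, and the intensity values are invented purely for illustration. Converting F to a P-value would additionally require the F distribution and is omitted.

```python
from statistics import mean


def fold_and_f_statistic(ra, rv):
    """Return (fold-difference Ra/Rv, one-way ANOVA F statistic) for two groups.

    Assumption: fold-difference is the ratio of group means; the thesis's
    own normalisation and analysis steps are not modelled here.
    """
    groups = [ra, rv]
    grand_mean = mean(ra + rv)
    n_total = len(ra) + len(rv)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return mean(ra) / mean(rv), f_stat


# Hypothetical normalised intensities for one gene, three populations per strain.
h37ra = [0.9, 1.0, 1.1]
h37rv = [1.9, 2.0, 2.1]
fold, f_stat = fold_and_f_statistic(h37ra, h37rv)
print(f"fold = {fold:.2f}, F = {f_stat:.1f}")
```

A fold value below 1 corresponds to lower expression in H37Ra than in H37Rv, which is the direction listed in this appendix.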

Systematic Name | Fold-induction | Common Name | P-value | SD | Product
MtH37Rv-2471 | 0.44 | aglA | 3.53E-04 | 0.14 | PROBABLE ALPHA-GLUCOSIDASE AGLA (MALTASE) (GLUCOINVERTASE) (GLUCOSIDOSUCRASE) (MALTASE-GLUCOAMYLASE) (LYSOSOMAL ALPHA-GLUCOSIDASE) (ACID MALTASE)
MtH37Rv-2376c | 0.33 | cfp2 | 1.66E-04 | 0.11 | LOW MOLECULAR WEIGHT ANTIGEN CFP2 (LOW MOLECULAR WEIGHT PROTEIN ANTIGEN 2) (CFP-2)
MtH37Rv-1622c | 0.50 | cydB | 2.80E-04 | 0.09 | Probable integral membrane cytochrome D ubiquinol oxidase (subunit II) cydB (Cytochrome BD-I oxidase subunit II)
MtCDC1551-2283 | 0.47 | MT2283 | 2.52E-02 | 0.29 | hypothetical protein
MtCDC1551-2466 | 0.11 | MT2466 | 2.85E-04 | 0.09 | hypothetical protein
MtCDC1551-2467 | 0.28 | MT2467 | 1.60E-03 | 0.18 | hypothetical protein
MtCDC1551-3098 | 0.28 | MT3098 | 3.58E-03 | 0.22 | PPE family protein
MtH37Rv-2006 | 0.51 | otsB1 | 1.27E-03 | 0.15 | PROBABLE TREHALOSE-6-PHOSPHATE PHOSPHATASE OTSB1 (TREHALOSE-PHOSPHATASE) (TPP)
MtH37Rv-0834c | 0.49 | PE_PGRS14 | 2.55E-03 | 0.19 | PE-PGRS FAMILY PROTEIN
MtH37Rv-1452c | 0.26 | PE_PGRS28 | 2.26E-03 | 0.20 | PE-PGRS FAMILY PROTEIN
MtH37Rv-2487c | 0.39 | PE_PGRS42 | 1.21E-03 | 0.17 | PE-PGRS FAMILY PROTEIN
MtH37Rv-3590c | 0.49 | PE_PGRS58 | 3.49E-03 | 0.20 | PE-PGRS FAMILY PROTEIN
MtH37Rv-3022A | 0.42 | PE29 | 5.06E-03 | 0.19 | PE FAMILY PROTEIN
MtH37Rv-3477 | 0.22 | PE31 | 6.41E-03 | 0.26 | PE FAMILY PROTEIN
MtH37Rv-1180 | 0.26 | pks3 | 8.82E-03 | 0.29 | PROBABLE POLYKETIDE BETA-KETOACYL SYNTHASE PKS3
MtH37Rv-1361c | 0.34 | PPE19 | 1.38E-03 | 0.18 | PPE FAMILY PROTEIN
MtH37Rv-3022c | 0.45 | PPE48 | 3.65E-04 | 0.14 | PPE FAMILY PROTEIN
MtH37Rv-2577 | 0.41 | Rv2577 | 1.66E-03 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2917 | 0.61 | Rv2917 | 6.63E-04 | 0.12 | CONSERVED HYPOTHETICAL ALANINE AND ARGININE RICH PROTEIN
MtH37Rv-3616c | 0.51 | Rv3616c | 4.43E-02 | 0.37 | CONSERVED HYPOTHETICAL ALANINE AND GLYCINE RICH PROTEIN

Appendix IV: Genes upregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv.

Expanded list of genes upregulated in intracellular versus broth-grown M. tuberculosis H37Rv (see Table 5-4). Three populations of RNA from each of intracellular and broth-grown H37Rv were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per Section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was statistically significantly (P<0.05) upregulated in intracellular H37Rv versus broth-grown H37Rv are listed. SD = standard deviation.
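The selection rule described above (upregulated and P<0.05) can be sketched as a simple filter over rows of this table. This is a minimal illustration, not the software used in the thesis: the `GeneResult` type and `significantly_upregulated` function are hypothetical names, the first two rows are taken from the list below, and the third row is an invented non-significant entry included only to show a gene being excluded. The 1.5-fold pre-filter is not modelled, because how it was applied is specified in Section 2.6.3.2 rather than here.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    systematic_name: str
    fold_induction: float  # intracellular relative to broth-grown H37Rv
    common_name: str
    p_value: float


def significantly_upregulated(results, alpha=0.05):
    """Keep genes with fold-induction above 1 and ANOVA P-value below alpha."""
    return [r for r in results if r.fold_induction > 1.0 and r.p_value < alpha]


rows = [
    GeneResult("MtH37Rv-2428", 10.45, "ahpC", 2.27e-03),  # from this appendix
    GeneResult("MtH37Rv-0467", 3.30, "icl", 6.96e-03),    # from this appendix
    GeneResult("hypothetical-0001", 1.20, "hypX", 0.30),  # invented, not in the thesis
]

selected = significantly_upregulated(rows)
for r in selected:
    print(r.systematic_name, r.common_name, r.fold_induction, r.p_value)
```

Reversing the fold test (`fold_induction < 1.0`) would give the corresponding filter for a list of downregulated genes.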

Systematic Name | Fold-induction | Common Name | P-value | SD | Product
MtH37Rv-2744c | 1.64 | 35kd_ag | 2.53E-02 | 0.22 | CONSERVED 35 KDA ALANINE RICH PROTEIN
MtH37Rv-0033 | 2.04 | acpA | 2.29E-02 | 0.31 | PROBABLE ACYL CARRIER PROTEIN ACPA (ACP)
MtH37Rv-2428 | 10.45 | ahpC | 2.27E-03 | 0.14 | ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
MtH37Rv-2429 | 4.00 | ahpD | 1.81E-03 | 0.17 | ALKYL HYDROPEROXIDE REDUCTASE D PROTEIN AHPD (ALKYL HYDROPEROXIDASE D)
MtH37Rv-2178c | 1.38 | aroG | 2.53E-02 | 0.18 | 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase aroG
MtH37Rv-2684 | 1.54 | arsA | 1.60E-02 | 0.22 | PROBABLE ARSENIC-TRANSPORT INTEGRAL MEMBRANE PROTEIN ARSA
MtH37Rv-2068c | 2.08 | blaC | 4.84E-04 | 0.15 | class A beta-lactamase BlaC
MtH37Rv-2641 | 1.75 | cadI | 4.90E-02 | 0.34 | CADMIUM INDUCIBLE PROTEIN CADI
MtH37Rv-0288 | 4.88 | cfp7 | 4.46E-03 | 0.28 | Low molecular weight protein antigen 7 cfp7 (10 kDa antigen) (CFP-7) (Protein TB10.4)
MtH37Rv-3596c | 1.79 | clpC | 4.52E-03 | 0.22 | PROBABLE ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPC
MtH37Rv-2461c | 2.02 | clpP1 | 1.32E-02 | 0.20 | PROBABLE ATP-DEPENDENT CLP PROTEASE PROTEOLYTIC SUBUNIT 1 CLPP1 (ENDOPEPTIDASE CLP)
MtH37Rv-2460c | 1.82 | clpP2 | 3.62E-02 | 0.36 | PROBABLE ATP-DEPENDENT CLP PROTEASE PROTEOLYTIC SUBUNIT 2 CLPP2
MtH37Rv-1464 | 2.43 | csd | 2.15E-02 | 0.27 | PROBABLE CYSTEINE DESULFURASE CSD
MtH37Rv-3724A | 1.41 | cut5a | 2.32E-02 | 0.11 | PROBABLE PRECURSOR [FIRST PART] CUT5A
MtH37Rv-1880c | 1.75 | cyp140 | 2.15E-02 | 0.18 | Probable cytochrome p450 140 CYP140
MtH37Rv-3117 | 1.44 | cysA3 | 4.23E-02 | 0.25 | PROBABLE THIOSULFATE SULFURTRANSFERASE CYSA3 (RHODANESE-LIKE PROTEIN) (THIOSULFATE CYANIDE TRANSSULFURASE) (THIOSULFATE THIOTRANSFERASE)
MtH37Rv-2399c | 1.45 | cysT | 2.66E-02 | 0.11 | PROBABLE SULFATE-TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER CYST
MtH37Rv-2753c | 1.60 | dapA | 1.74E-02 | 0.22 | PROBABLE DIHYDRODIPICOLINATE SYNTHASE DAPA (DHDPS) (DIHYDRODIPICOLINATE SYNTHETASE)
MtH37Rv-1445c | 2.38 | devB | 4.23E-02 | 0.48 | PROBABLE 6-PHOSPHOGLUCONOLACTONASE DEVB (6PGL)
MtH37Rv-1391 | 1.37 | dfp | 2.69E-02 | 0.21 | PROBABLE DNA/PANTOTHENATE METABOLISM FLAVOPROTEIN HOMOLOG DFP

MtH37Rv-0002 | 1.79 | dnaN | 3.03E-02 | 0.14 | DNA POLYMERASE III (BETA CHAIN) DNAN (DNA NUCLEOTIDYLTRANSFERASE)
MtH37Rv-2936 | 1.96 | drrA | 1.36E-02 | 0.19 | PROBABLE DAUNORUBICIN-DIM-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER DRRA
MtH37Rv-1677 | 1.77 | dsbF | 1.65E-02 | 0.27 | PROBABLE CONSERVED LIPOPROTEIN DSBF
MtH37Rv-0905 | 1.64 | echA6 | 6.54E-03 | 0.21 | POSSIBLE ENOYL-COA HYDRATASE ECHA6 (ENOYL HYDRASE) (UNSATURATED ACYL-COA HYDRATASE) (CROTONASE)
MtH37Rv-3854c | 1.88 | ethA | 4.46E-03 | 0.21 | MONOOXYGENASE ETHA
MtH37Rv-1483 | 1.73 | fabG1 | 2.15E-02 | 0.27 | 3-OXOACYL-[ACYL-CARRIER PROTEIN] REDUCTASE FABG1 (3-KETOACYL-ACYL CARRIER PROTEIN REDUCTASE) (MYCOLIC ACID BIOSYNTHESIS A PROTEIN)
MtH37Rv-0242c | 1.89 | fabG4 | 1.27E-02 | 0.29 | PROBABLE 3-OXOACYL-[ACYL-CARRIER PROTEIN] REDUCTASE FABG4 (3-KETOACYL-ACYL CARRIER PROTEIN REDUCTASE)
MtH37Rv-1074c | 1.82 | fadA3 | 1.75E-02 | 0.21 | PROBABLE BETA-KETOACYL COA THIOLASE FADA3
MtH37Rv-1323 | 1.97 | fadA4 | 3.08E-02 | 0.25 | PROBABLE ACETYL-COA ACETYLTRANSFERASE FADA4 (ACETOACETYL-COA THIOLASE)
MtH37Rv-0468 | 1.95 | fadB2 | 1.89E-02 | 0.28 | PROBABLE 3-HYDROXYBUTYRYL-COA DEHYDROGENASE FADB2 (BETA-HYDROXYBUTYRYL-COA DEHYDROGENASE) (BHBD)
MtH37Rv-3089 | 1.74 | fadD13 | 3.97E-02 | 0.24 | PROBABLE CHAIN -FATTY-ACID-COA LIGASE FADD13 (FATTY-ACYL-COA SYNTHETASE)
MtH37Rv-2187 | 1.82 | fadD15 | 3.43E-02 | 0.40 | Probable long-chain-fatty-acid-CoA ligase fadD15 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-3515c | 2.01 | fadD19 | 3.96E-03 | 0.14 | PROBABLE FATTY-ACID-COA LIGASE FADD19 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0270 | 1.72 | fadD2 | 3.37E-02 | 0.23 | PROBABLE FATTY-ACID-COA LIGASE FADD2 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1185c | 2.03 | fadD21 | 1.33E-02 | 0.39 | PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-2930 | 2.11 | fadD26 | 1.81E-03 | 0.16 | FATTY-ACID-COA LIGASE FADD26 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-2941 | 2.01 | fadD28 | 5.79E-03 | 0.28 | FATTY-ACID-CoA LIGASE FADD28 (FATTY-ACID-CoA SYNTHETASE) (FATTY-ACID-CoA SYNTHASE)
MtH37Rv-2950c | 1.74 | fadD29 | 1.12E-02 | 0.21 | PROBABLE FATTY-ACID-COA LIGASE FADD29 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1925 | 2.34 | fadD31 | 4.20E-02 | 0.58 | PROBABLE ACYL-COA LIGASE FADD31 (ACYL-COA SYNTHETASE) (ACYL-COA SYNTHASE)
MtH37Rv-1345 | 2.07 | fadD33 | 1.81E-03 | 0.15 | POSSIBLE POLYKETIDE SYNTHASE FADD33
MtH37Rv-1933c | 2.01 | fadE18 | 4.81E-02 | 0.35 | PROBABLE ACYL-COA DEHYDROGENASE FADE18
MtH37Rv-2789c | 1.76 | fadE21 | 2.72E-03 | 0.15 | PROBABLE ACYL-COA DEHYDROGENASE FADE21
MtH37Rv-0244c | 1.82 | fadE5 | 7.80E-03 | 0.17 | PROBABLE ACYL-COA DEHYDROGENASE FADE5
MtH37Rv-0855 | 1.51 | far | 2.89E-02 | 0.19 | PROBABLE FATTY-ACID-COA RACEMASE FAR
MtH37Rv-2900c | 1.33 | fdhF | 1.79E-02 | 0.18 | POSSIBLE FORMATE DEHYDROGENASE H FDHF (FORMATE-HYDROGEN-LYASE-LINKED, SELENOCYSTEINE-CONTAINING POLYPEPTIDE) (FORMATE DEHYDROGENASE-H ALPHA SUBUNIT) (FDH-H)
MtH37Rv-0684 | 2.54 | fusA1 | 1.64E-02 | 0.30 | PROBABLE ELONGATION FACTOR G FUSA1 (EF-G)
MtH37Rv-1436 | 1.62 | gap | 2.14E-02 | 0.15 | PROBABLE GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE GAP (GAPDH)
MtH37Rv-1018c | 1.64 | glmU | 3.69E-03 | 0.11 | Probable UDP-N-acetylglucosamine pyrophosphorylase glmU
MtH37Rv-0411c | 1.70 | glnH | 1.80E-02 | 0.24 | PROBABLE GLUTAMINE-BINDING LIPOPROTEIN GLNH (GLNBP)
MtH37Rv-1131 | 2.74 | gltA1 | 1.81E-03 | 0.15 | PROBABLE CITRATE SYNTHASE I GLTA1
MtH37Rv-2357c | 1.64 | glyS | 2.67E-02 | 0.22 | PROBABLE GLYCYL-tRNA SYNTHETASE GLYS (GLYCINE--tRNA LIGASE) (GLYRS)

MtH37Rv-0005 | 2.13 | gyrB | 3.19E-03 | 0.20 | DNA GYRASE (SUBUNIT B) GYRB (DNA TOPOISOMERASE (ATP-HYDROLYSING)) (DNA TOPOISOMERASE II) (TYPE II DNA TOPOISOMERASE)
MtH37Rv-3852 | 1.89 | hns | 1.95E-02 | 0.24 | POSSIBLE HISTONE-LIKE PROTEIN HNS
MtH37Rv-0251c | 2.24 | hsp | 9.94E-03 | 0.36 | PROBABLE HEAT SHOCK PROTEIN HSP
MtH37Rv-1223 | 2.15 | htrA | 1.12E-02 | 0.25 | PROBABLE SERINE PROTEASE HTRA (DEGP PROTEIN)
MtH37Rv-2986c | 2.62 | hupB | 2.36E-02 | 0.62 | PROBABLE DNA-BINDING PROTEIN HU HOMOLOG HUPB (HISTONE-LIKE PROTEIN) (HLP) (21-KDA LAMININ-2-BINDING PROTEIN)
MtH37Rv-0066c | 1.77 | icd2 | 4.52E-02 | 0.32 | PROBABLE ISOCITRATE DEHYDROGENASE [NADP] ICD2 (OXALOSUCCINATE DECARBOXYLASE) (IDH) (NADP+-SPECIFIC ICDH) (IDP)
MtH37Rv-0467 | 3.30 | icl | 6.96E-03 | 0.20 | ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
MtH37Rv-3874 | 2.35 | lhp | 6.54E-03 | 0.28 | 10 KDA CULTURE FILTRATE ANTIGEN LHP (CFP10)
MtH37Rv-2218 | 1.77 | lipA | 1.99E-02 | 0.14 | Probable lipoate biosynthesis protein A LipA
MtH37Rv-2217 | 2.18 | lipB | 3.92E-03 | 0.22 | Probable lipoate biosynthesis protein B LipB
MtH37Rv-1076 | 1.58 | lipU | 2.88E-02 | 0.25 | POSSIBLE LIPASE LIPU
MtH37Rv-1899c | 2.24 | lppD | 3.17E-02 | 0.34 | POSSIBLE LIPOPROTEIN LPPD
MtH37Rv-1270c | 1.83 | lprA | 3.08E-02 | 0.39 | POSSIBLE LIPOPROTEIN LPRA
MtH37Rv-1274 | 1.89 | lprB | 2.56E-02 | 0.17 | POSSIBLE LIPOPROTEIN LPRB
MtH37Rv-1343c | 2.83 | lprD | 1.42E-02 | 0.27 | PROBABLE CONSERVED LIPOPROTEIN LPRD
MtH37Rv-0483 | 2.37 | lprQ | 1.81E-03 | 0.24 | PROBABLE CONSERVED LIPOPROTEIN LPRQ
MtH37Rv-2790c | 1.39 | ltp1 | 2.03E-02 | 0.14 | PROBABLE LIPID-TRANSFER PROTEIN LTP1
MtH37Rv-3264c | 1.46 | manB | 1.96E-02 | 0.22 | ALPHA-D-MANNOSE-1-PHOSPHATE GUANYLYLTRANSFERASE MANB
MbAF212297-2047c | 1.56 | Mb2047c | 2.62E-02 | 0.23 | HYPOTHETICAL PROTEIN
MbAF212297-3435c | 2.92 | Mb3435c | 4.55E-03 | 0.34 | CONSERVED HYPOTHETICAL PROTEIN [SECOND PART]
MtH37Rv-2383c | 2.97 | mbtB | 1.81E-03 | 0.18 | PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
MtH37Rv-2381c | 3.97 | mbtD | 1.11E-02 | 0.49 | POLYKETIDE SYNTHETASE MBTD (POLYKETIDE SYNTHASE)
MtH37Rv-2380c | 4.05 | mbtE | 6.54E-03 | 0.40 | PEPTIDE SYNTHETASE MBTE (PEPTIDE SYNTHASE)
MtH37Rv-2379c | 4.25 | mbtF | 1.81E-03 | 0.43 | PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
MtH37Rv-2378c | 2.39 | mbtG | 2.43E-02 | 0.34 | LYSINE-N-OXYGENASE MBTG (L-LYSINE 6-MONOOXYGENASE) (LYSINE N6-HYDROXYLASE)
MtH37Rv-2377c | 4.12 | mbtH | 1.81E-03 | 0.24 | PUTATIVE CONSERVED PROTEIN MBTH
MtH37Rv-2386c | 4.95 | mbtI | 1.81E-03 | 0.36 | PUTATIVE ISOCHORISMATE SYNTHASE MBTI
MtH37Rv-2385 | 2.53 | mbtJ | 7.20E-03 | 0.40 | PUTATIVE ACETYL HYDROLASE MBTJ
MtH37Rv-3497c | 2.25 | mce4C | 1.30E-02 | 0.24 | MCE-FAMILY PROTEIN MCE4C
MtH37Rv-3853 | 2.80 | menG | 8.87E-03 | 0.23 | PROBABLE S-ADENOSYLMETHIONINE:2-DEMETHYLMENAQUINONE METHYLTRANSFERASE MENG
MtH37Rv-0642c | 1.69 | mmaA4 | 4.06E-02 | 0.42 | METHOXY MYCOLIC ACID SYNTHASE 4 MMAA4 (METHYL MYCOLIC ACID SYNTHASE 4) (MMA4) (HYDROXY MYCOLIC ACID SYNTHASE)
MtH37Rv-0450c | 1.41 | mmpL4 | 3.29E-02 | 0.13 | PROBABLE CONSERVED TRANSMEMBRANE TRANSPORT PROTEIN MMPL4
MtH37Rv-0451c | 2.85 | mmpS4 | 3.37E-03 | 0.22 | PROBABLE CONSERVED MEMBRANE PROTEIN MMPS4
MtH37Rv-0869c | 1.46 | moaA2 | 3.63E-02 | 0.14 | PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN A2 MOAA2
MtH37Rv-3119 | 1.59 | moaE1 | 4.07E-02 | 0.35 | PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN E MOAE1 (MOLYBDOPTERIN CONVERTING FACTOR LARGE SUBUNIT) (MOLYBDOPTERIN [MPT] CONVERTING FACTOR, SUBUNIT 2)
MtH37Rv-3116 | 1.53 | moeB2 | 2.61E-02 | 0.35 | PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN MOEB2 (MPT-SYNTHASE SULFURYLASE) (MOLYBDOPTERIN SYNTHASE SULPHURYLASE)

MtH37Rv-2338c | 1.63 | moeW | 1.30E-02 | 0.14 | POSSIBLE MOLYBDOPTERIN BIOSYNTHESIS PROTEIN MOEW
MtH37Rv-1479 | 1.81 | moxR1 | 5.79E-03 | 0.24 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN MOXR1
MtH37Rv-0981 | 1.57 | mprA | 2.50E-02 | 0.22 | MYCOBACTERIAL PERSISTENCE REGULATOR MRPA (TWO COMPONENT RESPONSE TRANSCRIPTIONAL REGULATORY PROTEIN)
MtH37Rv-1926c | 2.14 | mpt63 | 1.32E-03 | 0.16 | IMMUNOGENIC PROTEIN MPT63 (ANTIGEN MPT63/MPB63) (16 KD IMMUNOPROTECTIVE EXTRACELLULAR PROTEIN)
MtH37Rv-1980c | 2.38 | mpt64 | 1.12E-02 | 0.24 | IMMUNOGENIC PROTEIN MPT64 (ANTIGEN MPT64/MPB64)
MtH37Rv-3441c | 2.13 | mrsA | 2.88E-02 | 0.34 | PROBABLE PHOSPHO-SUGAR MUTASE / MRSA PROTEIN HOMOLOG
MtCDC1551-0031 | 1.51 | MT0031 | 4.56E-02 | 0.26 | hypothetical protein
MtCDC1551-0066.2 | 3.80 | MT0066.2 | 6.96E-03 | 0.24 | hypothetical protein
MtCDC1551-0407.1 | 2.33 | MT0407.1 | 1.32E-02 | 0.30 | hypothetical protein
MtCDC1551-0553 | 2.56 | MT0553 | 1.42E-02 | 0.24 | hypothetical protein
MtCDC1551-0719.1 | 3.38 | MT0719.1 | 1.81E-03 | 0.36 | hypothetical protein
MtCDC1551-0975 | 2.02 | MT0975 | 1.12E-02 | 0.24 | hypothetical protein
MtCDC1551-1025.2 | 3.50 | MT1025.2 | 1.66E-02 | 0.42 | hypothetical protein
MtCDC1551-1083.1 | 1.83 | MT1083.1 | 1.57E-02 | 0.27 | hypothetical protein
MtCDC1551-1083.2 | 1.88 | MT1083.2 | 2.68E-02 | 0.22 | hypothetical protein
MtCDC1551-1178 | 3.60 | MT1178 | 1.74E-02 | 0.47 | hypothetical protein
MtCDC1551-1182 | 1.53 | MT1182 | 1.74E-02 | 0.24 | hypothetical protein
MtCDC1551-1190 | 1.79 | MT1190 | 1.89E-02 | 0.12 | hypothetical protein
MtCDC1551-1285 | 1.80 | MT1285 | 1.36E-02 | 0.31 | hypothetical protein
MtCDC1551-1627 | 3.77 | MT1627 | 1.74E-02 | 0.29 | hypothetical protein
MtCDC1551-1650 | 2.45 | MT1650 | 4.34E-03 | 0.25 | hypothetical protein
MtCDC1551-1650.1 | 1.67 | MT1650.1 | 4.48E-02 | 0.35 | hypothetical protein
MtCDC1551-1771.1 | 1.81 | MT1771.1 | 2.73E-02 | 0.35 | hypothetical protein
MtCDC1551-1839.1 | 2.47 | MT1839.1 | 1.39E-02 | 0.29 | hypothetical protein
MtCDC1551-1924.1 | 4.85 | MT1924.1 | 5.66E-03 | 0.29 | hypothetical protein
MtCDC1551-2420 | 2.88 | MT2420 | 4.56E-02 | 0.83 | conserved hypothetical protein
MtCDC1551-2421 | 4.51 | MT2421 | 5.79E-03 | 0.61 | conserved hypothetical protein
MtCDC1551-2588 | 1.91 | MT2588 | 4.26E-02 | 0.49 | hypothetical protein
MtCDC1551-2625 | 2.43 | MT2625 | 1.81E-03 | 0.23 | hypothetical protein
MtCDC1551-3135 | 2.17 | MT3135 | 5.78E-03 | 0.31 | hypothetical protein
MtCDC1551-3139.1 | 2.94 | MT3139.1 | 4.52E-03 | 0.41 | hypothetical protein
MtCDC1551-3718.1 | 1.93 | MT3718.1 | 2.74E-02 | 0.13 | hypothetical protein
MtCDC1551-3994 | 1.84 | MT3994 | 2.31E-02 | 0.25 | hypothetical protein

MtH37Rv-1161 | 1.69 | narG | 4.27E-02 | 0.31 | PROBABLE RESPIRATORY NITRATE REDUCTASE (ALPHA CHAIN) NARG
MtH37Rv-2329c | 1.45 | narK1 | 4.92E-02 | 0.23 | PROBABLE NITRITE EXTRUSION PROTEIN 1 NARK1 (NITRITE FACILITATOR 1)
MtH37Rv-3053c | 2.42 | nrdH | 2.59E-02 | 0.40 | PROBABLE GLUTAREDOXIN ELECTRON TRANSPORT COMPONENT OF NRDEF (GLUTAREDOXIN-LIKE PROTEIN) NRDH
MtH37Rv-3052c | 1.51 | nrdI | 4.76E-02 | 0.33 | PROBABLE NRDI PROTEIN
MtH37Rv-0639 | 2.44 | nusG | 6.54E-03 | 0.27 | PROBABLE TRANSCRIPTION ANTITERMINATION PROTEIN NUSG
MtH37Rv-1446c | 1.56 | opcA | 7.20E-03 | 0.13 | PUTATIVE OXPP CYCLE PROTEIN OPCA
MtH37Rv-3490 | 1.37 | otsA | 4.36E-02 | 0.21 | PROBABLE ALPHA, ALPHA-TREHALOSE-PHOSPHATE SYNTHASE [UDP-FORMING] OTSA (TREHALOSE-6-PHOSPHATE SYNTHASE) (UDP-GLUCOSE-GLUCOSEPHOSPHATE GLUCOSYLTRANSFERASE) (TREHALOSEPHOSPHATE-UDP GLUCOSYLTRANSFERASE) (TREHALOSE-6-PHOSPHATE SYNTHETASE) (TREHALOSE-PHOSPHATE SYNTHASE) (TREHALOSE-PHOSPHATE SYNTHETASE) (TRANSGLUCOSYLASE) (TREHALOSEPHOSPHATE-UDP GLUCOSYL TRANSFERASE)
MtH37Rv-3820c | 1.52 | papA2 | 2.13E-02 | 0.17 | POSSIBLE CONSERVED POLYKETIDE SYNTHASE ASSOCIATED PROTEIN PAPA2
MtH37Rv-0211 | 4.23 | pckA | 4.55E-03 | 0.47 | PROBABLE PHOSPHOENOLPYRUVATE CARBOXYKINASE [GTP] PCKA (PHOSPHOENOLPYRUVATE CARBOXYLASE) (PEPCK)(PEP CARBOXYKINASE)
MtH37Rv-1172c | 1.73 | PE12 | 3.60E-02 | 0.29 | PE FAMILY PROTEIN
MtH37Rv-1195 | 1.65 | PE13 | 1.39E-02 | 0.25 | PE FAMILY PROTEIN
MtH37Rv-3020c | 4.90 | PE28 | 5.79E-03 | 0.50 | PE FAMILY PROTEIN
MtH37Rv-0285 | 2.88 | PE5 | 6.54E-03 | 0.24 | PE FAMILY PROTEIN
MtH37Rv-0125 | 3.39 | pepA | 3.32E-02 | 0.37 | PROBABLE SERINE PROTEASE PEPA
MtH37Rv-2368c | 1.55 | phoH1 | 1.70E-02 | 0.19 | PROBABLE PHOH-LIKE PROTEIN PHOH1 (PHOSPHATE STARVATION-INDUCIBLE PROTEIN PSIH)
MtH37Rv-0757 | 1.82 | phoP | 2.40E-03 | 0.18 | POSSIBLE TWO COMPONENT SYSTEM RESPONSE TRANSCRIPTIONAL POSITIVE REGULATOR PHOP
MtH37Rv-0820 | 1.64 | phoT | 2.74E-02 | 0.25 | PROBABLE PHOSPHATE-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER PHOT
MtH37Rv-0821c | 1.92 | phoY2 | 2.56E-02 | 0.31 | PROBABLE PHOSPHATE-TRANSPORT SYSTEM TRANSCRIPTIONAL REGULATORY PROTEIN PHOY2
MtH37Rv-0545c | 1.39 | pitA | 3.65E-02 | 0.20 | PROBABLE LOW-AFFINITY INORGANIC PHOSPHATE TRANSPORTER INTEGRAL MEMBRANE PROTEIN PITA
MtH37Rv-0931c | 2.01 | pknD | 1.62E-02 | 0.35 | TRANSMEMBRANE SERINE/THREONINE-PROTEIN KINASE D PKND (PROTEIN KINASE D) (PSTK D)
MtH37Rv-1266c | 1.51 | pknH | 4.44E-02 | 0.17 | PROBABLE TRANSMEMBRANE SERINE/THREONINE-PROTEIN KINASE H PKNH (PROTEIN KINASE H) (PSTK H)
MtH37Rv-3682 | 1.59 | ponA2 | 1.14E-02 | 0.16 | PROBABLE BIFUNCTIONAL MEMBRANE-ASSOCIATED PENICILLIN-BINDING PROTEIN 1A/1B PONA2 (MUREIN POLYMERASE) [INCLUDES: PENICILLIN-INSENSITIVE TRANSGLYCOSYLASE (PEPTIDOGLYCAN TGASE) + PENICILLIN-SENSITIVE TRANSPEPTIDASE (DD-TRANSPEPTIDASE)]
MtH37Rv-1196 | 1.72 | PPE18 | 2.40E-02 | 0.34 | PPE FAMILY PROTEIN
MtH37Rv-0256c | 1.41 | PPE2 | 2.88E-02 | 0.18 | PPE FAMILY PROTEIN
MtH37Rv-2352c | 2.37 | PPE38 | 6.54E-03 | 0.18 | PPE FAMILY PROTEIN
MtH37Rv-0286 | 2.80 | PPE4 | 1.30E-02 | 0.17 | PPE FAMILY PROTEIN
MtH37Rv-2430c | 3.10 | PPE41 | 1.23E-03 | 0.25 | PPE FAMILY PROTEIN
MtH37Rv-3144c | 1.45 | PPE52 | 4.24E-02 | 0.31 | PPE-FAMILY PROTEIN
MtH37Rv-3425 | 1.49 | PPE57 | 1.86E-02 | 0.21 | PPE FAMILY PROTEIN
MtH37Rv-0933 | 1.81 | pstB | 2.94E-02 | 0.36 | PHOSPHATE-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER PSTB

MtH37Rv-0932c | 1.45 | pstS2 | 4.51E-02 | 0.34 | PERIPLASMIC PHOSPHATE-BINDING LIPOPROTEIN PSTS2 (PBP-2) (PSTS2)
MtH37Rv-1699 | 1.90 | pyrG | 2.83E-03 | 0.14 | Probable CTP synthase PyrG
MtH37Rv-1454c | 2.02 | qor | 2.95E-02 | 0.16 | PROBABLE QUINONE REDUCTASE QOR (NADPH:quinone reductase) (Zeta-crystallin homolog protein)
MtH37Rv-3211 | 1.95 | rhlE | 6.47E-03 | 0.23 | PROBABLE ATP-DEPENDENT RNA HELICASE RHLE
MtH37Rv-1297 | 2.38 | rho | 1.65E-02 | 0.32 | PROBABLE TRANSCRIPTION TERMINATION FACTOR RHO HOMOLOG
MtH37Rv-0017c | 1.60 | rodA | 2.13E-02 | 0.23 | PROBABLE CELL DIVISION PROTEIN RODA
MtH37Rv-0641 | 1.91 | rplA | 4.18E-02 | 0.29 | PROBABLE 50S RIBOSOMAL PROTEIN L1 RPLA
MtH37Rv-0702 | 2.20 | rplD | 2.07E-02 | 0.31 | PROBABLE 50S RIBOSOMAL PROTEIN L4 RPLD
MtH37Rv-0056 | 2.16 | rplI | 2.45E-02 | 0.20 | PROBABLE 50S RIBOSOMAL PROTEIN L9 RPLI
MtH37Rv-0651 | 1.51 | rplJ | 1.65E-02 | 0.12 | PROBABLE 50S RIBOSOMAL PROTEIN L10 RPLJ
MtH37Rv-3456c | 2.30 | rplQ | 2.87E-02 | 0.37 | PROBABLE 50S RIBOSOMAL PROTEIN L17 RPLQ
MtH37Rv-1643 | 2.55 | rplT | 1.88E-02 | 0.35 | Probable 50S ribosomal protein L20 RplT
MtH37Rv-1298 | 2.16 | rpmE | 2.01E-02 | 0.40 | PROBABLE 50S RIBOSOMAL PROTEIN L31 RPME
MtH37Rv-3457c | 1.72 | rpoA | 4.51E-02 | 0.26 | PROBABLE DNA-DIRECTED RNA POLYMERASE (ALPHA CHAIN) RPOA (TRANSCRIPTASE ALPHA CHAIN) (RNA POLYMERASE ALPHA SUBUNIT) (DNA-DIRECTED RNA NUCLEOTIDYLTRANSFERASE)
MtH37Rv-1390 | 1.67 | rpoZ | 3.84E-02 | 0.36 | PROBABLE DNA-DIRECTED RNA POLYMERASE (OMEGA CHAIN) RPOZ (TRANSCRIPTASE OMEGA CHAIN) (RNA POLYMERASE OMEGA SUBUNIT)
MtH37Rv-1630 | 2.70 | rpsA | 4.31E-02 | 0.29 | Probable ribosomal protein S1 RpsA
MtH37Rv-3459c | 2.09 | rpsK | 3.35E-02 | 0.47 | PROBABLE 30S RIBOSOMAL PROTEIN S11 RPSK
MtH37Rv-0705 | 1.94 | rpsS | 2.59E-02 | 0.42 | PROBABLE 30S RIBOSOMAL PROTEIN S19 RPSS
MtH37Rv-0029 | 1.80 | Rv0029 | 8.44E-03 | 0.22 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0061 | 2.50 | Rv0061 | 2.09E-03 | 0.25 | HYPOTHETICAL PROTEIN
MtH37Rv-0064 | 2.23 | Rv0064 | 2.85E-02 | 0.38 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-0088 | 1.65 | Rv0088 | 1.70E-02 | 0.27 | HYPOTHETICAL PROTEIN
MtH37Rv-0141c | 1.86 | Rv0141c | 1.42E-02 | 0.30 | HYPOTHETICAL PROTEIN
MtH37Rv-0144 | 1.82 | Rv0144 | 4.28E-02 | 0.42 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (POSSIBLY TETR-FAMILY)
MtH37Rv-0146 | 2.03 | Rv0146 | 6.54E-03 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0165c | 1.93 | Rv0165c | 2.42E-02 | 0.32 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY GNTR-FAMILY)
MtH37Rv-0190 | 2.34 | Rv0190 | 1.12E-02 | 0.35 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0238 | 1.49 | Rv0238 | 2.19E-02 | 0.16 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY TETR-FAMILY)
MtH37Rv-0239 | 1.70 | Rv0239 | 2.29E-02 | 0.36 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0241c | 2.01 | Rv0241c | 2.56E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0248c | 1.92 | Rv0248c | 4.10E-02 | 0.45 | PROBABLE SUCCINATE DEHYDROGENASE [IRON-SULFUR SUBUNIT] (SUCCINIC DEHYDROGENASE)
MtH37Rv-0263c | 1.85 | Rv0263c | 3.95E-02 | 0.31 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0264c | 1.54 | Rv0264c | 5.15E-03 | 0.13 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0275c | 1.79 | Rv0275c | 1.83E-02 | 0.22 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (POSSIBLY TETR-FAMILY)
MtH37Rv-0281 | 1.48 | Rv0281 | 6.54E-03 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0282 | 3.83 | Rv0282 | 6.54E-03 | 0.28 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0283 | 3.52 | Rv0283 | 3.21E-03 | 0.32 | POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0284 | 5.09 | Rv0284 | 1.76E-03 | 0.30 | POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0289 | 4.97 | Rv0289 | 3.10E-03 | 0.45 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0290 | 2.43 | Rv0290 | 1.12E-02 | 0.10 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-0291 | 3.77 | Rv0291 | 3.92E-03 | 0.47 | PROBABLE PROTEASE PRECURSOR
MtH37Rv-0292 | 3.69 | Rv0292 | 9.73E-05 | 0.13 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN

MtH37Rv-0293c | 2.29 | Rv0293c | 1.12E-02 | 0.15 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0298 | 2.65 | Rv0298 | 6.47E-03 | 0.38 | HYPOTHETICAL PROTEIN
MtH37Rv-0300 | 1.48 | Rv0300 | 2.88E-02 | 0.29 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0313 | 1.74 | Rv0313 | 2.15E-03 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0361 | 1.46 | Rv0361 | 4.36E-02 | 0.17 | PROBABLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0364 | 1.64 | Rv0364 | 3.92E-03 | 0.18 | POSSIBLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-0420c | 1.54 | Rv0420c | 1.00E-02 | 0.17 | POSSIBLE TRANSMEMBRANE PROTEIN
MtH37Rv-0424c | 1.84 | Rv0424c | 3.37E-02 | 0.40 | HYPOTHETICAL PROTEIN
MtH37Rv-0431 | 2.36 | Rv0431 | 1.14E-02 | 0.37 | PUTATIVE TUBERCULIN RELATED PEPTIDE
MtH37Rv-0433 | 1.43 | Rv0433 | 1.96E-02 | 0.14 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0460 | 1.75 | Rv0460 | 3.45E-02 | 0.09 | CONSERVED HYDROPHOBIC PROTEIN
MtH37Rv-0463 | 2.04 | Rv0463 | 1.60E-02 | 0.16 | PROBABLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0464c | 2.51 | Rv0464c | 3.61E-03 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0472c | 1.64 | Rv0472c | 2.58E-02 | 0.08 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (POSSIBLY TETR-FAMILY)
MtH37Rv-0479c | 1.73 | Rv0479c | 2.29E-02 | 0.23 | PROBABLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0481c | 1.99 | Rv0481c | 2.30E-02 | 0.41 | HYPOTHETICAL PROTEIN
MtH37Rv-0484c | 1.47 | Rv0484c | 4.14E-02 | 0.29 | PROBABLE SHORT-CHAIN TYPE OXIDOREDUCTASE
MtH37Rv-0500B | 3.20 | Rv0500B | 2.22E-02 | 0.67 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0502 | 2.12 | Rv0502 | 4.91E-02 | 0.36 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0530 | 1.54 | Rv0530 | 3.17E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0543c | 1.38 | Rv0543c | 2.47E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0559c | 1.80 | Rv0559c | 3.12E-02 | 0.34 | POSSIBLE CONSERVED SECRETED PROTEIN
MtH37Rv-0580c | 1.60 | Rv0580c | 2.94E-03 | 0.14 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0586 | 1.63 | Rv0586 | 2.91E-02 | 0.21 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (GNTR-FAMILY)
MtH37Rv-0625c | 1.67 | Rv0625c | 4.70E-02 | 0.23 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-0657c | 1.99 | Rv0657c | 3.47E-02 | 0.40 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0686 | 1.55 | Rv0686 | 2.70E-02 | 0.17 | PROBABLE MEMBRANE PROTEIN
MtH37Rv-0688 | 1.50 | Rv0688 | 4.12E-02 | 0.21 | PUTATIVE FERREDOXIN REDUCTASE
MtH37Rv-0691c | 1.48 | Rv0691c | 2.54E-02 | 0.20 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-0692 | 1.95 | Rv0692 | 1.27E-02 | 0.29 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0755A | 2.12 | Rv0755A | 2.51E-02 | 0.32 | PUTATIVE TRANSPOSASE (FRAGMENT)
MtH37Rv-0756c | 1.32 | Rv0756c | 3.17E-02 | 0.13 | HYPOTHETICAL PROTEIN
MtH37Rv-0787A | 1.31 | Rv0787A | 3.97E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0831c | 1.79 | Rv0831c | 4.93E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0885 | 2.28 | Rv0885 | 1.81E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0901 | 1.52 | Rv0901 | 1.28E-02 | 0.17 | POSSIBLE CONSERVED EXPORTED OR MEMBRANE PROTEIN
MtH37Rv-0909 | 2.57 | Rv0909 | 2.46E-02 | 0.48 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0910 | 1.59 | Rv0910 | 4.39E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0913c | 1.50 | Rv0913c | 4.37E-02 | 0.23 | POSSIBLE DIOXYGENASE
MtH37Rv-0942 | 1.50 | Rv0942 | 4.48E-02 | 0.10 | HYPOTHETICAL PROTEIN
MtH37Rv-0987 | 1.26 | Rv0987 | 3.92E-02 | 0.11 | PROBABLE ADHESION COMPONENT TRANSPORT TRANSMEMBRANE PROTEIN ABC TRANSPORTER
MtH37Rv-0991c | 1.71 | Rv0991c | 4.80E-02 | 0.35 | CONSERVED HYPOTHETICAL SERINE RICH PROTEIN
MtH37Rv-1038c | 1.74 | Rv1038c | 4.82E-02 | 0.44 | Putative ESAT-6 like protein 2
MtH37Rv-1053c | 2.59 | Rv1053c | 4.05E-02 | 0.20 | HYPOTHETICAL PROTEIN
MtH37Rv-1055 | 1.92 | Rv1055 | 4.14E-02 | 0.38 | POSSIBLE INTEGRASE (FRAGMENT)
MtH37Rv-1072 | 2.21 | Rv1072 | 4.74E-02 | 0.58 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-1099c | 1.43 | Rv1099c | 4.10E-02 | 0.13 | CONSERVED HYPOTHETICAL PROTEIN

MtH37Rv-1102c | 1.54 | Rv1102c | 4.95E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1106c | 1.63 | Rv1106c | 3.31E-02 | 0.14 | PROBABLE CHOLESTEROL DEHYDROGENASE
MtH37Rv-1130 | 3.67 | Rv1130 | 6.54E-03 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1159A | 1.43 | Rv1159A | 2.40E-02 | 0.17 | HYPOTHETICAL PROTEIN
MtH37Rv-1194c | 1.39 | Rv1194c | 4.16E-02 | 0.22 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1198 | 2.68 | Rv1198 | 4.00E-02 | 0.53 | PUTATIVE ESAT-6 LIKE PROTEIN 4
MtH37Rv-1211 | 2.13 | Rv1211 | 2.22E-02 | 0.12 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1245c | 1.88 | Rv1245c | 1.81E-03 | 0.14 | PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE
MtH37Rv-1247c | 2.46 | Rv1247c | 4.40E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1261c | 1.68 | Rv1261c | 4.81E-02 | 0.36 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1284 | 1.93 | Rv1284 | 4.31E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1322A | 1.84 | Rv1322A | 1.27E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1324 | 1.99 | Rv1324 | 4.11E-02 | 0.34 | POSSIBLE THIOREDOXIN
MtH37Rv-1331 | 1.99 | Rv1331 | 5.80E-03 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1334 | 1.91 | Rv1334 | 2.74E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1339 | 1.87 | Rv1339 | 6.54E-03 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1344 | 6.87 | Rv1344 | 1.82E-03 | 0.30 | PROBABLE ACYL CARRIER PROTEIN (ACP)
MtH37Rv-1348 | 2.25 | Rv1348 | 1.81E-03 | 0.19 | PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1349 | 2.25 | Rv1349 | 5.93E-03 | 0.32 | PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1351 | 2.15 | Rv1351 | 3.47E-02 | 0.25 | HYPOTHETICAL PROTEIN
MtH37Rv-1352 | 1.91 | Rv1352 | 4.55E-03 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1395 | 1.51 | Rv1395 | 4.48E-02 | 0.18 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1397c | 1.63 | Rv1397c | 2.74E-02 | 0.34 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1460 | 2.86 | Rv1460 | 1.67E-03 | 0.19 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1461 | 2.70 | Rv1461 | 3.69E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1462 | 2.62 | Rv1462 | 2.78E-02 | 0.48 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1463 | 4.14 | Rv1463 | 3.41E-03 | 0.15 | PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1465 | 2.08 | Rv1465 | 1.12E-02 | 0.30 | POSSIBLE NITROGEN FIXATION RELATED PROTEIN
MtH37Rv-1487 | 1.71 | Rv1487 | 5.79E-03 | 0.20 | CONSERVED MEMBRANE PROTEIN
MtH37Rv-1488 | 2.46 | Rv1488 | 1.30E-02 | 0.29 | POSSIBLE EXPORTED CONSERVED PROTEIN
MtH37Rv-1502 | 1.82 | Rv1502 | 7.18E-03 | 0.16 | HYPOTHETICAL PROTEIN
MtH37Rv-1519 | 3.17 | Rv1519 | 1.81E-03 | 0.32 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1520 | 2.11 | Rv1520 | 3.37E-03 | 0.21 | probable sugar transferase
MtH37Rv-1545 | 1.47 | Rv1545 | 1.42E-02 | 0.18 | HYPOTHETICAL PROTEIN
MtH37Rv-1565c | 1.79 | Rv1565c | 3.08E-02 | 0.22 | conserved hypothetical membrane protein
MtH37Rv-1592c | 1.81 | Rv1592c | 3.94E-02 | 0.12 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1615 | 1.92 | Rv1615 | 3.19E-02 | 0.23 | Probable hypothetical membrane protein
MtH37Rv-1626 | 1.75 | Rv1626 | 2.42E-02 | 0.28 | Probable two-component system transcriptional regulator
MtH37Rv-1627c | 1.94 | Rv1627c | 1.42E-02 | 0.33 | Probable nonspecific lipid-transfer protein
MtH37Rv-1707 | 2.04 | Rv1707 | 1.74E-02 | 0.12 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-1724c | 1.46 | Rv1724c | 4.13E-02 | 0.25 | HYPOTHETICAL PROTEIN
MtH37Rv-1747 | 1.42 | Rv1747 | 2.96E-02 | 0.14 | PROBABLE CONSERVED TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1772 | 2.59 | Rv1772 | 3.45E-02 | 0.21 | HYPOTHETICAL PROTEIN
MtH37Rv-1794 | 1.63 | Rv1794 | 3.94E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1796 | 1.89 | Rv1796 | 9.42E-03 | 0.25 | CONSERVED HYPOTHETICAL PRO-RICH PROTEASE
MtH37Rv-1810 | 1.59 | Rv1810 | 2.63E-02 | 0.34 | CONSERVED HYPOTHETICAL PROTEIN

MtH37Rv-1812c  1.94  Rv1812c  6.65E-03  0.16  PROBABLE DEHYDROGENASE
MtH37Rv-1846c  2.65  Rv1846c  1.67E-02  0.51  POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1856c  1.82  Rv1856c  8.44E-03  0.08  HYPOTHETICAL OXIDOREDUCTASE
MtH37Rv-1870c  1.66  Rv1870c  8.57E-03  0.24  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1871c  3.19  Rv1871c  5.79E-03  0.47  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1904  2.15  Rv1904  4.31E-02  0.50  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1920  1.39  Rv1920  3.37E-02  0.18  PROBABLE MEMBRANE PROTEIN
MtH37Rv-1952  1.35  Rv1952  3.94E-02  0.16  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1954c  1.48  Rv1954c  1.81E-03  0.12  HYPOTHETICAL PROTEIN
MtH37Rv-1987  1.85  Rv1987  3.73E-02  0.42  POSSIBLE CHITINASE
MtH37Rv-1988  1.44  Rv1988  4.27E-02  0.24  PROBABLE METHYLTRANSFERASE
MtH37Rv-1990c  1.99  Rv1990c  2.23E-02  0.19  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-2060  1.62  Rv2060  2.22E-02  0.17  Possible conserved integral membrane protein
MtH37Rv-2061c  3.08  Rv2061c  4.05E-02  0.22  conserved hypothetical protein
MtH37Rv-2081c  1.65  Rv2081c  4.08E-02  0.25  Possible transmembrane protein
MtH37Rv-2091c  1.99  Rv2091c  2.49E-02  0.40  Probable membrane protein
MtH37Rv-2114  2.01  Rv2114  2.75E-02  0.23  hypothetical protein
MtH37Rv-2137c  2.20  Rv2137c  1.00E-02  0.28  conserved hypothetical protein
MtH37Rv-2204c  2.39  Rv2204c  2.91E-02  0.47  conserved hypothetical protein
MtH37Rv-2239c  1.45  Rv2239c  2.40E-02  0.21  conserved hypothetical protein
MtH37Rv-2347c  2.72  Rv2347c  4.31E-02  0.79  PUTATIVE ESAT-6 LIKE PROTEIN 7
MtH37Rv-2360c  1.32  Rv2360c  1.32E-02  0.16  HYPOTHETICAL PROTEIN
MtH37Rv-2367c  1.49  Rv2367c  3.99E-02  0.21  FUNCTION UNKNOWN
MtH37Rv-2390c  1.60  Rv2390c  1.12E-02  0.18  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2406c  1.73  Rv2406c  1.84E-02  0.15  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2454c  2.09  Rv2454c  1.12E-02  0.29  PROBABLE OXIDOREDUCTASE (BETA SUBUNIT)
MtH37Rv-2509  1.74  Rv2509  1.81E-03  0.15  PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE
MtH37Rv-2576c  2.11  Rv2576c  1.61E-02  0.26  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-2613c  2.03  Rv2613c  1.30E-02  0.31  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2617c  1.84  Rv2617c  1.42E-02  0.13  PROBABLE TRANSMEMBRANE PROTEIN
MtH37Rv-2619c  1.53  Rv2619c  1.42E-02  0.21  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2660c  1.85  Rv2660c  2.45E-02  0.17  HYPOTHETICAL PROTEIN
MtH37Rv-2663  1.59  Rv2663  1.27E-02  0.19  HYPOTHETICAL PROTEIN
MtH37Rv-2680  1.46  Rv2680  3.45E-02  0.15  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2694c  1.72  Rv2694c  2.15E-02  0.24  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2705c  2.17  Rv2705c  2.45E-03  0.23  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2706c  3.30  Rv2706c  1.32E-03  0.28  HYPOTHETICAL PROTEIN
MtH37Rv-2708c  2.17  Rv2708c  1.11E-02  0.14  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2715  1.87  Rv2715  4.39E-02  0.31  POSSIBLE HYDROLASE
MtH37Rv-2750  1.41  Rv2750  1.89E-02  0.18  PROBABLE DEHYDROGENASE
MtH37Rv-2791c  1.45  Rv2791c  2.88E-02  0.12  PROBABLE TRANSPOSASE
MtH37Rv-2840c  2.30  Rv2840c  2.89E-02  0.51  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2901c  1.41  Rv2901c  2.60E-02  0.21  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2949c  1.74  Rv2949c  4.08E-02  0.38  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2951c  1.65  Rv2951c  2.03E-02  0.22  POSSIBLE OXIDOREDUCTASE
MtH37Rv-2956  1.80  Rv2956  6.54E-03  0.25  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-2971  2.22  Rv2971  5.54E-03  0.21  PROBABLE OXIDOREDUCTASE
MtH37Rv-3017c  1.42  Rv3017c  4.06E-02  0.13  PUTATIVE ESAT-6 LIKE PROTEIN 8

MtH37Rv-3019c  3.45  Rv3019c  5.38E-03  0.40  PUTATIVE SECRETED ESAT-6 LIKE PROTEIN 9
MtH37Rv-3033  1.96  Rv3033  2.29E-02  0.38  HYPOTHETICAL PROTEIN
MtH37Rv-3046c  1.47  Rv3046c  2.40E-02  0.17  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3049c  1.66  Rv3049c  4.91E-02  0.37  PROBABLE MONOOXYGENASE
MtH37Rv-3083  1.73  Rv3083  2.29E-02  0.19  PROBABLE MONOOXYGENASE (HYDROXYLASE)
MtH37Rv-3115  1.53  Rv3115  2.96E-02  0.25  PROBABLE TRANSPOSASE
MtH37Rv-3197  1.51  Rv3197  4.74E-02  0.27  PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-3205c  1.33  Rv3205c  1.95E-02  0.20  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3221A  1.83  Rv3221A  8.58E-03  0.22  POSSIBLE ANTI-SIGMA FACTOR
MtH37Rv-3229c  2.30  Rv3229c  3.35E-02  0.47  POSSIBLE LINOLEOYL-COA DESATURASE (DELTA(6)-DESATURASE)
MtH37Rv-3249c  1.72  Rv3249c  2.74E-02  0.19  POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY TETR-FAMILY)
MtH37Rv-3258c  2.36  Rv3258c  4.40E-02  0.38  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3269  2.94  Rv3269  1.81E-03  0.21  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3311  1.64  Rv3311  3.47E-02  0.12  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3402c  3.88  Rv3402c  2.44E-03  0.17  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3403c  2.22  Rv3403c  5.38E-03  0.24  HYPOTHETICAL PROTEIN
MtH37Rv-3408  1.47  Rv3408  2.53E-02  0.26  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3412  2.25  Rv3412  6.54E-03  0.23  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3435c  1.53  Rv3435c  3.23E-02  0.25  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3437  1.26  Rv3437  2.49E-02  0.19  POSSIBLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3480c  1.29  Rv3480c  2.04E-02  0.16  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3489  2.10  Rv3489  4.82E-02  0.47  HYPOTHETICAL PROTEIN
MtH37Rv-3491  1.34  Rv3491  4.60E-02  0.21  HYPOTHETICAL PROTEIN
MtH37Rv-3519  2.20  Rv3519  1.80E-02  0.29  HYPOTHETICAL PROTEIN
MtH37Rv-3524  1.61  Rv3524  3.99E-02  0.22  PROBABLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-3526  1.87  Rv3526  2.54E-02  0.16  POSSIBLE OXIDOREDUCTASE
MtH37Rv-3528c  2.70  Rv3528c  2.74E-02  0.26  HYPOTHETICAL PROTEIN
MtH37Rv-3583c  2.48  Rv3583c  4.55E-02  0.59  POSSIBLE TRANSCRIPTION FACTOR
MtH37Rv-3588c  2.73  Rv3588c  1.12E-02  0.49  CARBONIC ANHYDRASE (CARBONATE DEHYDRATASE) (CARBONIC DEHYDRATASE)
MtH37Rv-3614c  1.66  Rv3614c  4.96E-02  0.40  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3633  2.08  Rv3633  3.97E-02  0.44  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3669  1.65  Rv3669  2.36E-02  0.29  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3680  3.13  Rv3680  2.30E-02  0.33  PROBABLE ANION TRANSPORTER ATPASE
MtH37Rv-3688c  1.80  Rv3688c  1.37E-02  0.14  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3699  1.98  Rv3699  2.91E-02  0.18  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3717  1.78  Rv3717  1.54E-02  0.29  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3734c  2.05  Rv3734c  3.84E-02  0.34  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3755c  1.38  Rv3755c  1.36E-02  0.11  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3766  1.76  Rv3766  1.88E-02  0.22  HYPOTHETICAL PROTEIN
MtH37Rv-3782  1.56  Rv3782  3.28E-02  0.26  POSSIBLE L-RHAMNOSYLTRANSFERASE
MtH37Rv-3839  6.87  Rv3839  3.37E-03  0.30  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3863  1.55  Rv3863  4.75E-02  0.16  HYPOTHETICAL ALANINE RICH PROTEIN
MtH37Rv-3867  1.92  Rv3867  4.53E-02  0.28  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3868  1.65  Rv3868  1.80E-02  0.22  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3869  1.50  Rv3869  3.47E-02  0.21  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-3870  1.46  Rv3870  2.22E-02  0.09  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-3871  2.52  Rv3871  9.92E-03  0.24  CONSERVED HYPOTHETICAL PROTEIN

MtH37Rv-3876  2.02  Rv3876  2.88E-02  0.21  CONSERVED HYPOTHETICAL PROLINE AND ALANINE RICH PROTEIN
MtH37Rv-3877  1.47  Rv3877  2.31E-02  0.17  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-3880c  1.55  Rv3880c  2.88E-02  0.29  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3240c  1.93  secA1  2.13E-02  0.34  PROBABLE PREPROTEIN TRANSLOCASE SECA1 1 SUBUNIT
MtH37Rv-1821  1.98  secA2  1.79E-02  0.26  POSSIBLE PREPROTEIN TRANSLOCASE SECA2
MtH37Rv-0638  2.13  secE1  1.32E-02  0.24  PROBABLE PREPROTEIN TRANSLOCASE SECE1
MtH37Rv-0732  1.68  secY  1.95E-02  0.24  PROBABLE PREPROTEIN TRANSLOCASE SECY
MtH37Rv-0884c  1.66  serC  2.90E-02  0.22  POSSIBLE PHOSPHOSERINE AMINOTRANSFERASE SERC (PSAT)
MtH37Rv-2703  2.27  sigA  2.47E-02  0.48  RNA POLYMERASE SIGMA FACTOR SIGA (SIGMA-A)
MtH37Rv-2710  3.36  sigB  1.76E-03  0.28  RNA POLYMERASE SIGMA FACTOR SIGB
MtH37Rv-3414c  1.62  sigD  4.05E-02  0.22  PROBABLE ALTERNATIVE RNA POLYMERASE SIGMA-D FACTOR SIGD
MtH37Rv-1221  2.35  sigE  1.81E-03  0.24  ALTERNATIVE RNA POLYMERASE SIGMA FACTOR SIGE
MtH37Rv-3223c  1.48  sigH  4.11E-02  0.31  ALTERNATIVE RNA POLYMERASE SIGMA-E FACTOR (SIGMA-24) SIGH (RPOE)
MtH37Rv-0445c  1.71  sigK  2.01E-02  0.21  PROBABLE ALTERNATIVE RNA POLYMERASE SIGMA FACTOR SIGK
MtH37Rv-0952  1.70  sucD  2.15E-02  0.16  PROBABLE SUCCINYL-COA SYNTHETASE (ALPHA CHAIN) SUCD (SCS-ALPHA)
MtH37Rv-1224  1.57  tatB  2.03E-02  0.22  Probable protein TatB
MtH37Rv-1636  3.16  TB15.3  1.12E-02  0.46  CONSERVED HYPOTHETICAL PROTEIN TB15.3
MtH37Rv-2605c  1.66  tesB2  2.47E-02  0.25  PROBABLE ACYL-COA THIOESTERASE II TESB2 (TEII)
MtH37Rv-0423c  2.44  thiC  1.09E-02  0.38  PROBABLE THIAMINE BIOSYNTHESIS PROTEIN THIC
MtH37Rv-0417  1.59  thiG  4.65E-02  0.18  PROBABLE THIAMIN BIOSYNTHESIS PROTEIN THIG (THIAZOLE BIOSYNTHESIS PROTEIN)
MtH37Rv-1449c  1.52  tkt  1.22E-02  0.20  PROBABLE TRANSKETOLASE TKT (TK)
MtH37Rv-1033c  2.72  trcR  4.46E-03  0.34  TWO COMPONENT TRANSCRIPTIONAL REGULATOR TRCR
MtH37Rv-1032c  1.92  trcS  6.39E-03  0.11  TWO COMPONENT SENSOR HISTIDINE KINASE TRCS
MtH37Rv-1611  2.04  trpC  3.69E-03  0.17  Probable indole-3-glycerol phosphate synthase trpC
MtH37Rv-0013  1.46  trpG  1.62E-02  0.17  POSSIBLE ANTHRANILATE SYNTHASE COMPONENT II TRPG (GLUTAMINE AMIDOTRANSFERASE)
MtH37Rv-1689  1.59  tyrS  4.06E-02  0.23  Probable Tyrosyl-tRNA synthase tyrS (TYRRS)
MtH37Rv-0469  2.44  umaA  2.68E-03  0.28  POSSIBLE MYCOLIC ACID SYNTHASE UMAA
MtH37Rv-3219  3.48  whiB1  4.20E-02  0.67  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN WHIB-LIKE WHIB1
MtH37Rv-1107c  1.62  xseB  2.13E-02  0.26  Probable VII (small subunit) xseB ( VII small subunit)

Appendix V: Genes upregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra.

Expanded list of genes upregulated in intracellular versus broth-grown M. tuberculosis H37Ra (see Table 5-5). Three populations of RNA from each of intracellular and broth-grown H37Ra were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2, and gene expression was filtered for genes whose expression differed by 1.5-fold. The statistical significance of the fold-difference across all populations was analysed using ANOVA. Genes significantly (P<0.05) upregulated in intracellular H37Ra versus broth-grown H37Ra are listed. SD = standard deviation.

Systematic Name  Fold-induction  Common Name  P-value  SD  Product
MtH37Rv-2428  2.91  ahpC  1.61E-02  0.14  ALKYL HYDROPEROXIDE REDUCTASE C PROTEIN AHPC (ALKYL HYDROPEROXIDASE C)
MtH37Rv-0288  3.82  cfp7  4.39E-02  0.28  Low molecular weight protein antigen 7 cfp7 (10 kDa antigen) (CFP-7) (Protein TB10.4)
MtH37Rv-1992c  3.93  ctpG  1.61E-02  0.44  PROBABLE METAL CATION TRANSPORTER P-TYPE ATPASE G CTPG
MtH37Rv-1185c  1.47  fadD21  3.66E-02  0.39  PROBABLE FATTY-ACID--COA LIGASE FADD21 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-1345  3.41  fadD33  8.76E-04  0.15  POSSIBLE POLYKETIDE SYNTHASE FADD33
MtH37Rv-0244c  2.02  fadE5  1.94E-02  0.17  PROBABLE ACYL-COA DEHYDROGENASE FADE5
MtH37Rv-0467  7.29  icl  3.27E-04  0.20  ISOCITRATE LYASE ICL (ISOCITRASE) (ISOCITRATASE)
MbAF212297-3435c  3.97  Mb3435c  8.76E-04  0.34  CONSERVED HYPOTHETICAL PROTEIN [SECOND PART]
MtH37Rv-2383c  3.47  mbtB  4.61E-03  0.18  PHENYLOXAZOLINE SYNTHASE MBTB (PHENYLOXAZOLINE SYNTHETASE)
MtH37Rv-2379c  4.07  mbtF  4.30E-02  0.43  PEPTIDE SYNTHETASE MBTF (PEPTIDE SYNTHASE)
MtH37Rv-2377c  2.98  mbtH  7.96E-03  0.24  PUTATIVE CONSERVED PROTEIN MBTH
MtH37Rv-2386c  3.88  mbtI  1.67E-03  0.36  PUTATIVE ISOCHORISMATE SYNTHASE MBTI
MtH37Rv-0450c  1.80  mmpL4  3.07E-02  0.13  PROBABLE CONSERVED TRANSMEMBRANE TRANSPORT PROTEIN MMPL4
MtH37Rv-1229c  1.51  mrp  4.34E-02  0.14  PROBABLE MRP-RELATED PROTEIN MRP
MtCDC1551-2182.1  4.14  MT2182.1  1.97E-02  0.25  PPE family protein
MtCDC1551-2421  2.03  MT2421  3.51E-02  0.61  conserved hypothetical protein
MtH37Rv-3020c  4.41  PE28  2.40E-02  0.50  PE FAMILY PROTEIN
MtH37Rv-1808  1.59  PPE32  2.00E-02  0.28  PPE FAMILY PROTEIN
MtH37Rv-2352c  3.97  PPE38  1.80E-02  0.18  PPE FAMILY PROTEIN
MtH37Rv-0286  3.35  PPE4  4.61E-03  0.17  PPE FAMILY PROTEIN
MtH37Rv-0146  1.68  Rv0146  1.10E-02  0.16  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0282  3.63  Rv0282  1.09E-02  0.28  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0284  4.28  Rv0284  8.76E-04  0.30  POSSIBLE CONSERVED MEMBRANE PROTEIN
MtH37Rv-0289  3.63  Rv0289  2.00E-02  0.45  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-0290  3.23  Rv0290  2.00E-02  0.10  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-0292  2.79  Rv0292  1.84E-03  0.13  PROBABLE CONSERVED TRANSMEMBRANE PROTEIN
MtH37Rv-1344  6.52  Rv1344  2.74E-03  0.30  PROBABLE ACYL CARRIER PROTEIN (ACP)
MtH37Rv-1352  1.81  Rv1352  4.42E-02  0.17  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1395  1.93  Rv1395  4.84E-02  0.18  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1460  2.71  Rv1460  8.76E-04  0.19  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-1463  3.39  Rv1463  8.76E-04  0.15  PROBABLE CONSERVED ATP-BINDING PROTEIN ABC TRANSPORTER
MtH37Rv-1519  3.50  Rv1519  2.62E-02  0.32  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1628c  1.51  Rv1628c  3.99E-02  0.16  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-1831  2.97  Rv1831  7.02E-03  0.15  HYPOTHETICAL PROTEIN
MtH37Rv-1986  2.85  Rv1986  1.67E-03  0.54  PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN
MtH37Rv-1990A  2.56  Rv1990A  4.42E-02  0.14  POSSIBLE DEHYDROGENASE (FRAGMENT)
MtH37Rv-1994c  5.04  Rv1994c  4.12E-02  0.18  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-2651c  2.64  Rv2651c  4.30E-02  0.22  POSSIBLE phiRv2 PROPHAGE PROTEASE
MtH37Rv-2791c  1.61  Rv2791c  2.91E-02  0.12  PROBABLE TRANSPOSASE
MtH37Rv-2880c  1.80  Rv2880c  4.61E-03  0.14  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3019c  3.40  Rv3019c  1.74E-02  0.40  PUTATIVE SECRETED ESAT-6 LIKE PROTEIN 9
MtH37Rv-3229c  3.50  Rv3229c  4.84E-03  0.47  POSSIBLE LINOLEOYL-COA DESATURASE (DELTA(6)-DESATURASE)
MtH37Rv-3230c  2.27  Rv3230c  4.81E-03  0.27  HYPOTHETICAL OXIDOREDUCTASE
MtH37Rv-3402c  3.75  Rv3402c  1.67E-03  0.17  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3588c  2.85  Rv3588c  3.36E-03  0.49  CARBONIC ANHYDRASE (CARBONATE DEHYDRATASE) (CARBONIC DEHYDRATASE)
MtH37Rv-3839  11.77  Rv3839  2.58E-04  0.30  CONSERVED HYPOTHETICAL PROTEIN
MtH37Rv-3840  1.73  Rv3840  1.49E-02  0.25  POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN
MtH37Rv-0287  3.20  TB9.8  3.55E-02  0.33  CONSERVED HYPOTHETICAL PROTEIN TB9.8
MtH37Rv-1033c  2.00  trcR  4.84E-02  0.34  TWO COMPONENT TRANSCRIPTIONAL REGULATOR TRCR
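
As a minimal sketch of the selection criteria stated in the legend to this appendix (a 1.5-fold expression difference and P<0.05), the following Python fragment filters a list of per-gene results. The `GeneResult` type, the `select_upregulated` function and the third example row are hypothetical and are not part of the thesis workflow. The actual normalisation and filtering followed section 2.6.3.2, and the threshold may not have been applied to the mean fold-induction as assumed here.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    systematic_name: str
    fold_induction: float  # assumed: intracellular / broth-grown expression ratio
    p_value: float         # ANOVA P-value as reported in the table


def select_upregulated(results, min_fold=1.5, alpha=0.05):
    """Keep genes whose ratio is at least min_fold and whose P-value is below alpha.

    Assumption: the fold threshold is applied to the mean ratio.
    """
    return [
        r for r in results
        if r.fold_induction >= min_fold and r.p_value < alpha
    ]


rows = [
    GeneResult("MtH37Rv-0467", 7.29, 3.27e-04),   # icl, values from the table
    GeneResult("MtH37Rv-3839", 11.77, 2.58e-04),  # Rv3839, values from the table
    GeneResult("hypothetical-gene", 2.10, 0.20),  # invented row: not significant
]

selected = select_upregulated(rows)
for r in selected:
    print(r.systematic_name, r.fold_induction, r.p_value)
```

Applied to these three rows, the sketch retains the two rows taken from the table and drops the invented, non-significant one.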

Appendix VI: Genes downregulated in intracellular M. tuberculosis H37Rv versus broth-grown H37Rv.

Expanded list of genes downregulated in intracellular versus broth-grown M. tuberculosis H37Rv (see Table 5-6). Three populations of RNA from each of intracellular and broth-grown H37Rv were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2, and gene expression was filtered for genes whose expression differed by 1.5-fold. The statistical significance of the fold-difference across all populations was analysed using ANOVA. Genes significantly (P<0.05) downregulated in intracellular H37Rv versus broth-grown H37Rv are listed. SD = standard deviation.
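
Because the fold-induction values in this appendix are ratios below 1, it can help to re-express them as a fold-decrease or as a log2 ratio. The sketch below shows only that arithmetic. The function names are hypothetical, and it assumes that fold-induction is the ratio of intracellular to broth-grown expression, which the legend implies but does not state explicitly.

```python
import math


def fold_decrease(fold_induction: float) -> float:
    """Convert a ratio below 1 into the equivalent fold-decrease (1 / ratio)."""
    if fold_induction <= 0:
        raise ValueError("fold-induction must be positive")
    return 1.0 / fold_induction


def log2_ratio(fold_induction: float) -> float:
    """Express the ratio on a log2 scale; negative values mean downregulation."""
    if fold_induction <= 0:
        raise ValueError("fold-induction must be positive")
    return math.log2(fold_induction)


# A ratio of 0.5 corresponds to a 2-fold decrease, i.e. a log2 ratio of -1.
print(fold_decrease(0.5), log2_ratio(0.5))
```

On the log2 scale, up- and downregulation of equal magnitude are symmetric about zero, which makes values from this appendix directly comparable with those in the appendices listing upregulated genes.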

Systematic Name  Fold-induction  Common Name  P-value  SD  Product
MtH37Rv-1475c  0.58  acn  1.42E-02  0.23  PROBABLE ACONITATE HYDRATASE ACN (Citrate hydro-lyase) (Aconitase)
MtH37Rv-2523c  0.68  acpS  3.08E-02  0.15  HOLO-[ACYL-CARRIER PROTEIN] SYNTHASE ACPS (HOLO-ACP SYNTHASE)
MtH37Rv-3391  0.51  acrA1  2.88E-02  0.22  POSSIBLE MULTI-FUNCTIONAL ENZYME WITH ACYL-CoA-REDUCTASE ACTIVITY ACRA1
MtH37Rv-1530  0.47  adh  3.84E-02  0.15  Probable alcohol dehydrogenase adh
MtH37Rv-2471  0.33  aglA  9.43E-03  0.29  PROBABLE ALPHA-GLUCOSIDASE AGLA (MALTASE) (GLUCOINVERTASE) (GLUCOSIDOSUCRASE) (MALTASE-GLUCOAMYLASE) (LYSOSOMAL ALPHA-GLUCOSIDASE) (ACID MALTASE)
MtH37Rv-0768  0.29  aldA  9.42E-03  0.37  PROBABLE ALDEHYDE DEHYDROGENASE NAD DEPENDANT ALDA (ALDEHYDE DEHYDROGENASE [NAD+])
MtH37Rv-3252c  0.44  alkB  2.54E-02  0.32  PROBABLE ALKANE 1-MONOOXYGENASE ALKB (ALKANE 1-HYDROXYLASE) (LAURIC ACID OMEGA-HYDROXYLASE) (OMEGA-HYDROXYLASE) (FATTY ACID OMEGA-HYDROXYLASE) (ALKANE HYDROXYLASE-RUBREDOXIN)
MtH37Rv-3305c  0.41  amiA1  2.14E-02  0.37  POSSIBLE N-ACYL-L-AMINO ACID AMIDOHYDROLASE AMIA1
MtH37Rv-3306c  0.52  amiB1  4.05E-02  0.48  PROBABLE AMIDOHYDROLASE AMIB1
MtH37Rv-3375  0.37  amiD  1.27E-02  0.23  PROBABLE AMIDASE AMID (ACYLAMIDASE) (ACYLASE)
MtH37Rv-2920c  0.58  amt  4.18E-02  0.23  PROBABLE AMMONIUM-TRANSPORT INTEGRAL MEMBRANE PROTEIN AMT
MtH37Rv-3170  0.35  aofH  2.27E-03  0.22  PROBABLE FLAVIN-CONTAINING MONOAMINE OXIDASE AOFH (AMINE OXIDASE) (MAO)
MtH37Rv-2552c  0.51  aroE  1.98E-02  0.23  PROBABLE SHIKIMATE 5-DEHYDROGENASE AROE (5-DEHYDROSHIKIMATE REDUCTASE)
MtH37Rv-2540c  0.41  aroF  2.15E-02  0.31  PROBABLE CHORISMATE SYNTHASE AROF (5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE PHOSPHOLYASE)
MtH37Rv-2539c  0.72  aroK  3.02E-02  0.13  PROBABLE SHIKIMATE KINASE AROK (SK)
MtH37Rv-0917  0.45  betP  1.42E-02  0.21  POSSIBLE GLYCINE BETAINE TRANSPORT INTEGRAL MEMBRANE PROTEIN BETP
MtH37Rv-3569c  0.60  bphD  4.12E-02  0.19  2-HYDROXY-6-OXO-6-PHENYLHEXA-2,4-DIENOATE HYDROLASE BPHD
MtH37Rv-3473c  0.58  bpoA  3.54E-02  0.15  POSSIBLE PEROXIDASE BPOA (NON-HAEM PEROXIDASE)
MtH37Rv-1384  0.50  carB  1.73E-02  0.24  PROBABLE CARBAMOYL-PHOSPHATE SYNTHASE LARGE CHAIN CARB (Carbamoyl-phosphate synthetase ammonia chain)
MtH37Rv-1077  0.70  cbs  4.82E-02  0.19  Probable cystathionine beta-synthase CBS (Serine sulfhydrase) (Beta-thionase) (Hemoprotein H-450)
MtH37Rv-3315c  0.40  cdd  6.39E-03  0.17  PROBABLE CYTIDINE DEAMINASE CDD (CYTIDINE AMINOHYDROLASE) (CYTIDINE NUCLEOSIDE DEAMINASE)
MtH37Rv-1712  0.61  cmk  3.77E-02  0.31  Probable Cytidylate kinase cmk (CMP kinase) (Cytidine monophosphate kinase) (CK)
MtH37Rv-1092c  0.58  coaA  4.06E-02  0.20  Probable pantothenate kinase coaA (Pantothenic acid kinase)
MtH37Rv-2848c  0.42  cobB  1.29E-02  0.18  PROBABLE COBYRINIC ACID A,C-DIAMIDE SYNTHASE COBB
MtH37Rv-0255c  0.57  cobQ1  1.27E-02  0.25  PROBABLE COBYRIC ACID SYNTHASE COBQ1
MtH37Rv-3713  0.53  cobQ2  2.78E-02  0.18  POSSIBLE COBYRIC ACID SYNTHASE COBQ2
MtH37Rv-0806c  0.59  cpsY  4.56E-02  0.19  POSSIBLE UDP-GLUCOSE-4-EPIMERASE CPSY (GALACTOWALDENASE) (UDP-GALACTOSE-4-EPIMERASE) (URIDINE DIPHOSPHATE GALACTOSE-4-EPIMERASE) (URIDINE DIPHOSPHO-GALACTOSE-4-EPIMERASE)
MtH37Rv-3063  0.61  cstA  2.89E-02  0.11  PROBABLE CARBON STARVATION PROTEIN A HOMOLOG CSTA
MtH37Rv-1451  0.49  ctaB  4.18E-02  0.34  PROBABLE CYTOCHROME C OXIDASE ASSEMBLY FACTOR CTAB
MtH37Rv-1704c  0.55  cycA  4.90E-02  0.38  PROBABLE D-SERINE/ALANINE/GLYCINE TRANSPORTER PROTEIN CYCA
MtH37Rv-1623c  0.48  cydA  2.50E-02  0.83  Probable integral membrane cytochrome D ubiquinol oxidase (subunit I) cydA (Cytochrome BD-I oxidase subunit I)
MtH37Rv-1622c  0.39  cydB  7.18E-03  0.31  Probable integral membrane cytochrome D ubiquinol oxidase (subunit II) cydB (Cytochrome BD-I oxidase subunit II)
MtH37Rv-2276  0.45  cyp121  1.27E-02  0.20  Cytochrome P450 121 CYP121
MtH37Rv-2268c  0.53  cyp128  1.79E-02  0.28  Probable cytochrome P450 128 CYP128
MtH37Rv-1256c  0.54  cyp130  1.59E-02  0.24  PROBABLE CYTOCHROME P450 130 CYP130
MtH37Rv-0764c  0.43  cyp51  2.30E-02  0.20  CYTOCHROME P450 51 CYP51 (CYPL1) (P450-L1A1) (STEROL 14-ALPHA DEMETHYLASE) (LANOSTEROL 14-ALPHA DEMETHYLASE) (P450-14DM)
MtH37Rv-2398c  0.68  cysW  1.48E-02  0.20  PROBABLE SULFATE-TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER CYSW
MtH37Rv-3056  0.59  dinP  2.14E-02  0.12  POSSIBLE DNA-DAMAGE-INDUCIBLE PROTEIN P DINP
MtH37Rv-2874  0.47  dipZ  4.21E-02  0.11  POSSIBLE INTEGRAL MEMBRANE C-TYPE CYTOCHROME BIOGENESIS PROTEIN DIPZ
MtH37Rv-3665c  0.40  dppB  2.36E-02  0.24  PROBABLE DIPEPTIDE-TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER DPPB
MtH37Rv-3664c  0.48  dppC  1.74E-02  0.20  PROBABLE DIPEPTIDE-TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER DPPC
MtH37Rv-3379c  0.63  dxs2  2.89E-02  0.16  PROBABLE 1-DEOXY-D-XYLULOSE 5-PHOSPHATE SYNTHASE DXS2 (1-DEOXYXYLULOSE-5-PHOSPHATE SYNTHASE) (DXP SYNTHASE) (DXPS)
MtH37Rv-3617  0.48  ephA  2.85E-02  0.32  PROBABLE EPOXIDE HYDROLASE EPHA (EPOXIDE HYDRATASE) (ARENE-OXIDE HYDRATASE)
MtH37Rv-1814  0.59  erg3  2.35E-02  0.23  C-5 STEROL DESATURASE ERG3 (STEROL-C5-DESATURASE)
MtH37Rv-0649  0.60  fabD2  2.69E-02  0.18  POSSIBLE MALONYL COA-ACYL CARRIER PROTEIN TRANSACYLASE FABD2 (MCT)
MtH37Rv-1550  0.50  fadD11  9.46E-03  0.13  PROBABLE FATTY-ACID-COA LIGASE FADD11 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0551c  0.46  fadD8  2.44E-02  0.31  PROBABLE FATTY-ACID-COA LIGASE FADD8 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE)
MtH37Rv-0975c  0.45  fadE13  1.12E-02  0.22  PROBABLE ACYL-COA DEHYDROGENASE FADE13
MtH37Rv-1934c  0.66  fadE17  4.35E-02  0.16  PROBABLE ACYL-COA DEHYDROGENASE FADE17
MtH37Rv-3505  0.78  fadE27  3.61E-02  0.23  PROBABLE ACYL-COA DEHYDROGENASE FADE27
MtH37Rv-3562  0.53  fadE31  4.15E-02  0.35  ACYL-COA DEHYDROGENASE FADE31
MtH37Rv-3564  0.44  fadE33  2.24E-02  0.39  ACYL-COA DEHYDROGENASE FADE33
MtH37Rv-0752c  0.61  fadE9  2.42E-02  0.24  ACYL-COA DEHYDROGENASE FADE9
MtH37Rv-2899c  0.51  fdhD  7.26E-03  0.16  POSSIBLE FDHD PROTEIN
MtH37Rv-3641c  0.34  fic  4.73E-02  0.26  POSSIBLE CELL FILAMENTATION PROTEIN FIC
MtH37Rv-1553  0.60  frdB  2.74E-02  0.14  PROBABLE FUMARATE REDUCTASE [IRON-SULFUR SUBUNIT] FRDB (FUMARATE DEHYDROGENASE) (FUMARIC HYDROGENASE)
MtH37Rv-3432c  0.78  gadB  4.02E-02  0.20  PROBABLE GLUTAMATE DECARBOXYLASE GADB
MtH37Rv-0620  0.52  galK  2.66E-02  0.25  PROBABLE GALACTOKINASE GALK (GALACTOSE KINASE)
MtH37Rv-2860c  0.57  glnA4  4.05E-02  0.33  PROBABLE GLUTAMINE SYNTHETASE GLNA4 (GLUTAMINE SYNTHASE) (GS-II)
MtH37Rv-2918c  0.30  glnD  1.88E-02  0.27  PROBABLE [PROTEIN-PII] URIDYLYLTRANSFERASE GLND (PII URIDYLYL-TRANSFERASE) (URIDYLYL REMOVING ENZYME) (UTASE)
MtH37Rv-3302c  0.80  glpD2  4.81E-02  0.25  PROBABLE GLYCEROL-3-PHOSPHATE DEHYDROGENASE GLPD2
MtH37Rv-3696c  0.53  glpK  3.65E-02  0.36  GLYCEROL KINASE GLPK (ATP:GLYCEROL 3-PHOSPHOTRANSFERASE) (GLYCEROKINASE) (GK)
MtH37Rv-0564c  0.56  gpdA1  2.57E-02  0.08  PROBABLE GLYCEROL-3-PHOSPHATE DEHYDROGENASE [NAD(P)+] GPDA1 (NAD(P)H-DEPENDENT GLYCEROL-3-PHOSPHATE DEHYDROGENASE) (NAD(P)H-DEPENDENT DIHYDROXYACETONE-PHOSPHATE REDUCTASE)
MtH37Rv-1485  0.62  hemZ  2.50E-02  0.17  FERROCHELATASE HEMZ (PROTOHEME FERRO-LYASE) (HEME SYNTHETASE)
MtH37Rv-2761c  0.54  hsdS  4.04E-02  0.33  POSSIBLE TYPE I RESTRICTION/MODIFICATION SYSTEM SPECIFICITY DETERMINANT HSDS (S PROTEIN)
MtH37Rv-2755c  0.70  hsdS'  3.32E-02  0.23  POSSIBLE TYPE I RESTRICTION/MODIFICATION SYSTEM SPECIFICITY DETERMINANT (FRAGMENT) HSDS' (S PROTEIN)
MtH37Rv-3398c  0.36  idsA1  1.79E-02  0.28  PROBABLE MULTIFUNCTIONAL GERANYLGERANYL PYROPHOSPHATE SYNTHETASE IDSA1 (GGPP SYNTHETASE) (GGPPSASE) (GERANYLGERANYL DIPHOSPHATE SYNTHASE): DIMETHYLALLYLTRANSFERASE (PRENYLTRANSFERASE) (GERANYL-DIPHOSPHATE SYNTHASE) + GERANYLTRANSTRANSFERASE (FARNESYL-DIPHOSPHATE SYNTHASE) (FARNESYL-PYROPHOSPHATE SYNTHETASE) (FARNESYL DIPHOSPHATE SYNTHETASE) (FPP SYNTHETASE) + FARNESYLTRANSTRANSFERASE (GERANYLGERANYL-DIPHOSPHATE SYNTHASE)
MtH37Rv-3383c  0.58  idsB  5.79E-03  0.10  POSSIBLE POLYPRENYL SYNTHETASE IDSB (POLYPRENYL TRANSFERASE) (POLYPRENYL DIPHOSPHATE SYNTHASE)

MtH37Rv-1604  0.54  impA  3.73E-02  0.20  Probable inositol monophosphatase impA
MtH37Rv-3393  0.57  iunH  4.29E-02  0.22  PROBABLE NUCLEOSIDE HYDROLASE IUNH (PURINE NUCLEOSIDASE)
MtH37Rv-1027c  0.63  kdpE  3.73E-02  0.17  PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN KDPE
MtH37Rv-3296  0.38  lhr  1.89E-02  0.12  PROBABLE ATP-DEPENDENT HELICASE LHR (LARGE HELICASE-RELATED PROTEIN)
MtH37Rv-3303c  0.45  lpdA  1.12E-02  0.38  PROBABLE DIHYDROLIPOAMIDE DEHYDROGENASE LPDA (LIPOAMIDE REDUCTASE (NADH)) (LIPOYL DEHYDROGENASE) (DIHYDROLIPOYL DEHYDROGENASE) (DIAPHORASE)
MtH37Rv-2543  0.47  lppA  2.37E-02  0.23  PROBABLE CONSERVED LIPOPROTEIN LPPA
MtH37Rv-2544  0.42  lppB  7.80E-03  0.40  PROBABLE CONSERVED LIPOPROTEIN LPPB
MtH37Rv-2341  0.44  lppQ  1.08E-02  0.13  PROBABLE CONSERVED LIPOPROTEIN LPPQ
MtH37Rv-3298c  0.30  lpqC  7.83E-03  0.34  POSSIBLE ESTERASE LIPOPROTEIN LPQC
MtH37Rv-0604  0.55  lpqO  4.46E-03  0.17  PROBABLE CONSERVED LIPOPROTEIN LPQO
MtH37Rv-0838  0.30  lpqR  3.21E-03  0.24  PROBABLE CONSERVED LIPOPROTEIN LPQR
MtH37Rv-1970  0.58  lprM  4.55E-02  0.47  POSSIBLE MCE-FAMILY LIPOPROTEIN LPRM (MCE-FAMILY LIPOPROTEIN MCE3E)
MtH37Rv-3540c  0.49  ltp2  3.97E-02  0.42  HYPOTHETICAL PROTEIN
MtH37Rv-3382c  0.70  lytB1  1.13E-02  0.11  PROBABLE LYTB-RELATED PROTEIN LYTB1
MbAF212297-2595  0.53  Mb2595  1.74E-02  0.22  LONG CONSERVED HYPOTHETICAL PROTEIN [FIRST PART]
MbAF212297-2836  0.60  Mb2836  1.37E-02  0.27  PUTATIVE TRANSPOSASE [SECOND PART]
MtH37Rv-0594  0.35  mce2F  6.83E-03  0.23  MCE-FAMILY PROTEIN MCE2F
MtH37Rv-1966  0.60  mce3A  3.42E-02  0.23  MCE-FAMILY PROTEIN MCE3A
MtH37Rv-3625c  0.41  mesJ  5.38E-03  0.16  POSSIBLE CELL CYCLE PROTEIN MESJ
MtH37Rv-0362  0.51  mgtE  2.18E-02  0.24  POSSIBLE Mg2+ TRANSPORT TRANSMEMBRANE PROTEIN MGTE
MtH37Rv-2458  0.73  mmuM  2.78E-02  0.14  PROBABLE HOMOCYSTEINE S-METHYLTRANSFERASE MMUM (S-METHYLMETHIONINE:HOMOCYSTEINE METHYLTRANSFERASE) (CYSTEINE METHYLTRANSFERASE)
MtH37Rv-3109  0.38  moaA1  3.32E-02  0.32  PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN A MOAA1
MtH37Rv-3324c  0.35  moaC3  7.48E-03  0.25  PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN C 3 MOAC3
MtH37Rv-3112  0.50  moaD1  1.36E-02  0.28  PROBABLE MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN D MOAD1 (MOLYBDOPTERIN CONVERTING FACTOR SMALL SUBUNIT) (MOLYBDOPTERIN [MPT] CONVERTING FACTOR, SUBUNIT 1)
MtH37Rv-2873  0.56  mpt83  3.69E-03  0.25  CELL SURFACE LIPOPROTEIN MPT83 PRECURSOR (LIPOPROTEIN P23)
MtH37Rv-2528c  0.53  mrr  2.53E-02  0.27  PROBABLE RESTRICTION SYSTEM PROTEIN MRR
MtCDC1551-0099.1  0.70  MT0099.1  2.89E-02  0.36  hypothetical protein
MtCDC1551-0122  0.52  MT0122  1.69E-02  0.27  phosphoheptose isomerase
MtCDC1551-0614  0.59  MT0614  1.80E-02  0.22  hypothetical protein
MtCDC1551-0639  0.55  MT0639  2.49E-02  0.27  hypothetical protein
MtCDC1551-0726  0.21  MT0726  6.39E-03  0.26  hypothetical protein
MtCDC1551-0910.3  0.61  MT0910.3  4.52E-02  0.22  hypothetical protein
MtCDC1551-0991  0.31  MT0991  1.30E-02  0.23  hypothetical protein

MtCDC1551-1305.1  0.36  MT1305.1  1.88E-02  0.13  hypothetical protein
MtCDC1551-1330.1  0.38  MT1330.1  1.27E-02  0.28  hypothetical protein
MtCDC1551-1360  0.44  MT1360  6.54E-03  0.21  adenylate cyclase, putative
MtCDC1551-1367  0.36  MT1367  3.10E-03  0.18  PE_PGRS family protein
MtCDC1551-1490  0.42  MT1490  2.44E-03  0.16  hypothetical protein
MtCDC1551-1499  0.32  MT1499  3.63E-03  0.29  PE_PGRS family protein
MtCDC1551-1560  0.52  MT1560  2.65E-02  0.26  hypothetical protein
MtCDC1551-1759  0.53  MT1759  6.54E-03  0.17  hypothetical protein
MtCDC1551-1760  0.46  MT1760  2.42E-02  0.17  hypothetical protein
MtCDC1551-1777  0.37  MT1777  2.91E-02  0.23  hypothetical protein
MtCDC1551-1968  0.36  MT1968  3.22E-02  0.17  PPE family protein
MtCDC1551-2015  0.64  MT2015  2.22E-02  0.23  hypothetical protein
MtCDC1551-2015.1  0.34  MT2015.1  4.67E-02  0.39  hypothetical protein
MtCDC1551-2165  0.67  MT2165  4.60E-02  0.12  hypothetical protein
MtCDC1551-2246  0.56  MT2246  2.42E-02  0.28  hypothetical protein
MtCDC1551-2283  0.41  MT2283  1.71E-02  0.23  hypothetical protein
MtCDC1551-2291  0.34  MT2291  4.00E-02  0.11  hypothetical protein
MtCDC1551-2370.2  0.52  MT2370.2  6.54E-03  0.28  hypothetical protein
MtCDC1551-2480  0.44  MT2480  2.42E-02  0.20  hypothetical protein
MtCDC1551-2601.1  0.65  MT2601.1  4.05E-02  0.20  hypothetical protein
MtCDC1551-2616  0.63  MT2616  2.40E-02  0.21  hypothetical protein
MtCDC1551-2722  0.49  MT2722  8.57E-03  0.19  hypothetical protein
MtCDC1551-2924  0.50  MT2924  4.36E-02  0.21  hypothetical protein
MtCDC1551-3033  0.28  MT3033  1.71E-02  0.21  hypothetical protein
MtCDC1551-3041.1  0.60  MT3041.1  3.93E-02  0.21  hypothetical protein
MtCDC1551-3098  0.27  MT3098  2.72E-03  0.26  PPE family protein
MtCDC1551-3117  0.58  MT3117  4.77E-02  0.28  hypothetical protein
MtCDC1551-3207  0.49  MT3207  4.35E-02  0.28  hypothetical protein
MtCDC1551-3270.1  0.46  MT3270.1  5.38E-03  0.19  hypothetical protein
MtCDC1551-3437.1  0.70  MT3437.1  4.49E-02  0.13  hypothetical protein
MtCDC1551-3630  0.61  MT3630  4.00E-02  0.13  hypothetical protein
MtCDC1551-3767.3  0.76  MT3767.3  4.67E-02  0.17  hypothetical protein
MtH37Rv-2153c  0.52  murG  3.73E-02  0.38  UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol-N-acetylglucosamine transferase MurG
MtH37Rv-1736c  0.59  narX  1.89E-02  0.39  Probable nitrate reductase NarX

MtH37Rv-0392c  0.54  ndhA  6.54E-03  0.24  PROBABLE MEMBRANE NADH DEHYDROGENASE NDHA
MtH37Rv-2856  0.45  nicT  3.23E-02  0.34  POSSIBLE NICKEL-TRANSPORT INTEGRAL MEMBRANE PROTEIN NICT
MtH37Rv-0253  0.52  nirD  3.04E-03  0.17  PROBABLE NITRITE REDUCTASE [NAD(P)H] SMALL SUBUNIT NIRD
MtH37Rv-0570  0.60  nrdZ  1.37E-02  0.27  PROBABLE RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE (LARGE SUBUNIT) NRDZ (RIBONUCLEOTIDE REDUCTASE)
MtH37Rv-0899  0.71  ompA  1.31E-02  0.17  PROBABLE OUTER MEMBRANE PROTEIN A OMPA
MtH37Rv-0266c  0.56  oplA  1.71E-02  0.15  PROBABLE 5-OXOPROLINASE OPLA (5-OXO-L-PROLINASE) (PYROGLUTAMASE) (5-OPASE)
MtH37Rv-2006  0.45  otsB1  4.23E-02  0.11  PROBABLE TREHALOSE-6-PHOSPHATE PHOSPHATASE OTSB1 (TREHALOSE-PHOSPHATASE) (TPP)
MtH37Rv-0853c  0.44  pdc  6.54E-03  0.19  PROBABLE PYRUVATE OR INDOLE-3-PYRUVATE DECARBOXYLASE PDC
MtH37Rv-0109  0.62  PE_PGRS1  2.94E-02  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-0754  0.35  PE_PGRS11  2.42E-02  0.36  PE-PGRS FAMILY PROTEIN
MtH37Rv-0834c  0.68  PE_PGRS14  2.36E-02  0.14  PE-PGRS FAMILY PROTEIN
MtH37Rv-0978c  0.30  PE_PGRS17  4.34E-03  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-1396c  0.23  PE_PGRS25  5.38E-03  0.31  PE-PGRS FAMILY PROTEIN
MtH37Rv-1450c  0.29  PE_PGRS27  6.96E-03  0.50  PE-PGRS FAMILY PROTEIN
MtH37Rv-1452c  0.36  PE_PGRS28  4.21E-02  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-1468c  0.66  PE_PGRS29  4.04E-02  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-1651c  0.53  PE_PGRS30  2.56E-02  0.18  PE-PGRS FAMILY PROTEIN
MtH37Rv-1818c  0.51  PE_PGRS33  7.98E-03  0.27  PE-PGRS FAMILY PROTEIN
MtH37Rv-2098c  0.63  PE_PGRS36  3.14E-02  0.25  conserved hypothetical protein, PE_PGRS family
MbAF212297-0285c  0.55  PE_PGRS3a  1.27E-02  0.23  PE-PGRS FAMILY PROTEIN [FIRST PART]
MtH37Rv-2487c  0.33  PE_PGRS42  2.68E-03  0.24  PE-PGRS FAMILY PROTEIN
MtH37Rv-2634c  0.35  PE_PGRS46  8.93E-03  0.21  PE-PGRS FAMILY PROTEIN
MtH37Rv-2853  0.40  PE_PGRS48  1.92E-02  0.15  PE-PGRS FAMILY PROTEIN
MbAF212297-3541  0.46  PE_PGRS55  4.48E-02  0.25  PE-PGRS FAMILY PROTEIN
MtH37Rv-3590c  0.58  PE_PGRS58  2.75E-02  0.16  PE-PGRS FAMILY PROTEIN
MtH37Rv-3653  0.32  PE_PGRS61  2.51E-02  0.30  PE-PGRS FAMILY PROTEIN
MbAF212297-0767  0.48  PE_PGRS9  5.79E-03  0.20  PE-PGRS FAMILY PROTEIN
MtH37Rv-0746  0.50  PE_PGRS9  6.52E-03  0.13  PE-PGRS FAMILY PROTEIN
MtH37Rv-0151c  0.53  PE1  2.88E-02  0.20  PE FAMILY PROTEIN
MtH37Rv-3022A  0.56  PE29  2.22E-02  0.23  PE FAMILY PROTEIN
MtH37Rv-3097c  0.34  PE30  3.19E-02  0.49  PE-PGRS FAMILY PROTEIN, PROBABLY TRIACYLGLYCEROL LIPASE (ESTERASE/LIPASE) (TRIGLYCERIDE LIPASE) (TRIBUTYRASE)
MtH37Rv-3622c  0.54  PE32  4.35E-02  0.12  PE FAMILY PROTEIN
MtH37Rv-3746c  0.44  PE34  1.79E-02  0.41  PROBABLE PE FAMILY PROTEIN
MtH37Rv-0160c  0.45  PE4  1.53E-02  0.14  PE FAMILY PROTEIN
MtH37Rv-2782c  0.64  pepR  2.39E-02  0.14  PROBABLE ZINC PROTEASE PEPR
MtH37Rv-0946c  0.65  pgi  1.54E-02  0.15  PROBABLE GLUCOSE-6-PHOSPHATE ISOMERASE PGI (GPI) (PHOSPHOGLUCOSE ISOMERASE) (PHOSPHOHEXOSE ISOMERASE) (PHI)
MtH37Rv-3068c  0.58  pgmA  2.89E-02  0.27  PROBABLE PHOSPHOGLUCOMUTASE PGMA (GLUCOSE PHOSPHOMUTASE) (PGM)
MtH37Rv-0840c  0.39  pip  1.07E-02  0.12  PROBABLE PROLINE IMINOPEPTIDASE PIP (PROLYL AMINOPEPTIDASE) (PAP)

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-3080c | 0.44 | pknK | 3.98E-02 | 0.21 | PROBABLE SERINE/THREONINE-PROTEIN KINASE TRANSCRIPTIONAL REGULATORY PROTEIN PKNK (PROTEIN KINASE K) (PSTK K) |
| MtH37Rv-3308 | 0.61 | pmmB | 2.53E-02 | 0.14 | PROBABLE PHOSPHOMANNOMUTASE PMMB (PHOSPHOMANNOSE MUTASE) |
| MtH37Rv-0155 | 0.51 | pntAa | 2.42E-02 | 0.25 | PROBABLE NAD(P) TRANSHYDROGENASE (SUBUNIT ALPHA) PNTAA [FIRST PART; CATALYTIC PART] (PYRIDINE NUCLEOTIDE TRANSHYDROGENASE SUBUNIT ALPHA) (NICOTINAMIDE NUCLEOTIDE TRANSHYDROGENASE SUBUNIT ALPHA) |
| MtH37Rv-0453 | 0.56 | PPE11 | 1.57E-02 | 0.18 | PPE FAMILY PROTEIN |
| MtH37Rv-1135c | 0.63 | PPE16 | 7.20E-03 | 0.21 | PPE FAMILY PROTEIN |
| MtH37Rv-1790 | 0.57 | PPE27 | 1.70E-02 | 0.15 | PPE FAMILY PROTEIN |
| MtH37Rv-1917c | 0.35 | PPE34 | 2.56E-02 | 0.20 | PPE FAMILY PROTEIN |
| MbAF212297-1951c | 0.43 | PPE34 | 4.56E-02 | 0.41 | PPE FAMILY PROTEIN |
| MtH37Rv-2768c | 0.45 | PPE43 | 6.47E-03 | 0.22 | PPE FAMILY PROTEIN |
| MtH37Rv-2892c | 0.53 | PPE45 | 3.03E-02 | 0.16 | PPE FAMILY PROTEIN |
| MtH37Rv-3022c | 0.46 | PPE48 | 1.27E-02 | 0.18 | PPE FAMILY PROTEIN |
| MtH37Rv-3125c | 0.71 | PPE49 | 2.29E-02 | 0.16 | PPE FAMILY PROTEIN |
| MtH37Rv-3539 | 0.60 | PPE63 | 3.32E-02 | 0.23 | PPE FAMILY PROTEIN |
| MtH37Rv-3621c | 0.51 | PPE65 | 3.18E-02 | 0.25 | PPE FAMILY PROTEIN |
| MtH37Rv-2934 | 0.73 | ppsD | 3.91E-02 | 0.22 | PHENOLPTHIOCEROL SYNTHESIS TYPE-I POLYKETIDE SYNTHASE PPSD |
| MtH37Rv-1402 | 0.42 | priA | 1.76E-03 | 0.15 | PUTATIVE PRIMOSOMAL PROTEIN N' PRIA (Replication factor Y) |
| MtH37Rv-3758c | 0.80 | proV | 4.34E-02 | 0.21 | POSSIBLE OSMOPROTECTANT (GLYCINE BETAINE/CARNITINE/CHOLINE/L-PROLINE) TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER PROV |
| MtH37Rv-3756c | 0.71 | proZ | 3.81E-02 | 0.24 | POSSIBLE OSMOPROTECTANT (GLYCINE BETAINE/CARNITINE/CHOLINE/L-PROLINE) TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER PROZ |
| MtH37Rv-0782 | 0.46 | ptrBb | 2.91E-02 | 0.45 | PROBABLE PROTEASE II PTRBB [SECOND PART] (OLIGOPEPTIDASE B) |
| MtH37Rv-0772 | 0.79 | purD | 3.98E-02 | 0.22 | PROBABLE PHOSPHORIBOSYLAMINE--GLYCINE LIGASE PURD (GARS) (GLYCINAMIDE RIBONUCLEOTIDE SYNTHETASE) (PHOSPHORIBOSYLGLYCINAMIDE SYNTHETASE) (5'-PHOSPHORIBOSYLGLYCINAMIDE SYNTHETASE) |
| MtH37Rv-0803 | 0.68 | purL | 2.91E-02 | 0.19 | PHOSPHORIBOSYLFORMYLGLYCINAMIDINE SYNTHASE II PURL (FGAM SYNTHASE II) |
| MtH37Rv-3585 | 0.55 | radA | 4.36E-02 | 0.32 | DNA REPAIR PROTEIN RADA (DNA REPAIR PROTEIN SMS) |
| MtH37Rv-2436 | 0.47 | rbsK | 3.14E-02 | 0.18 | RIBOKINASE RBSK |
| MtH37Rv-2973c | 0.48 | recG | 3.19E-03 | 0.26 | PROBABLE ATP-DEPENDENT DNA HELICASE RECG |
| MtH37Rv-2736c | 0.56 | recX | 1.95E-02 | 0.31 | REGULATORY PROTEIN RECX |
| MtH37Rv-1302 | 0.60 | rfe | 4.33E-02 | 0.28 | PUTATIVE UNDECAPAPRENYL-PHOSPHATE ALPHA-N-ACETYLGLUCOSAMINYLTRANSFERASE RFE |
| MtH37Rv-2786c | 0.51 | ribF | 1.22E-02 | 0.37 | PROBABLE BIFUNCTIONAL FAD SYNTHETASE/RIBOFLAVIN BIOSYNTHESIS PROTEIN RIBF: RIBOFLAVIN KINASE (FLAVOKINASE) + FMN ADENYLYLTRANSFERASE (FAD PYROPHOSPHORYLASE) (FAD SYNTHETASE)(FAD DIPHOSPHORYLASE) (FLAVIN ADENINE DINUCLEOTUDE SYNTHETASE) |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-1409 | 0.56 | ribG | 4.21E-02 | 0.33 | PROBABLE BIFUNCTIONAL riboflavin biosynthesis protein RIBG : Diaminohydroxyphosphoribosylaminopyrimidine deaminase (Riboflavin-specific deaminase) + 5-amino-6-(5-phosphoribosylamino) uracil reductase (HTP reductase) |
| MtH37Rv-2907c | 0.54 | rimM | 1.88E-02 | 0.21 | PROBABLE 16S RRNA PROCESSING PROTEIN RIMM |
| MtH37Rv-0105c | 0.52 | rpmB1 | 1.84E-02 | 0.12 | PROBABLE 50S RIBOSOMAL PROTEIN L28-1 RPMB1 |
| MtH37Rv-0021c | 0.45 | Rv0021c | 4.37E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0094c | 0.48 | Rv0094c | 1.30E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0115 | 0.47 | Rv0115 | 3.10E-03 | 0.20 | POSSIBLE SUGAR KINASE |
| MtH37Rv-0138 | 0.60 | Rv0138 | 3.29E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0139 | 0.51 | Rv0139 | 1.30E-02 | 0.37 | POSSIBLE OXIDOREDUCTASE |
| MtH37Rv-0143c | 0.66 | Rv0143c | 2.78E-02 | 0.32 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-0163 | 0.58 | Rv0163 | 5.93E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0194 | 0.59 | Rv0194 | 3.37E-02 | 0.36 | PROBABLE DRUGS-TRANSPORT TRANSMEMBRANE ATP-BINDING PROTEIN ABC TRANSPORTER |
| MtH37Rv-0269c | 0.61 | Rv0269c | 3.04E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0320 | 0.48 | Rv0320 | 5.38E-03 | 0.18 | POSSIBLE CONSERVED EXPORTED PROTEIN |
| MtH37Rv-0323c | 0.65 | Rv0323c | 1.57E-02 | 0.12 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0358 | 0.57 | Rv0358 | 4.77E-02 | 0.35 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0372c | 0.59 | Rv0372c | 3.32E-02 | 0.12 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0378 | 0.53 | Rv0378 | 2.41E-03 | 0.22 | CONSERVED HYPOTHETICAL GLYCINE RICH PROTEIN |
| MtH37Rv-0396 | 0.54 | Rv0396 | 2.29E-02 | 0.20 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0397 | 0.39 | Rv0397 | 1.81E-03 | 0.27 | CONSERVED 13E12 REPEAT FAMILY PROTEIN |
| MtH37Rv-0401 | 0.56 | Rv0401 | 2.29E-02 | 0.28 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-0435c | 0.43 | Rv0435c | 2.98E-02 | 0.22 | PUTATIVE CONSERVED ATPASE |
| MtH37Rv-0443 | 0.62 | Rv0443 | 3.96E-02 | 0.26 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0513 | 0.61 | Rv0513 | 9.76E-03 | 0.12 | POSSIBLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-0525 | 0.61 | Rv0525 | 4.11E-03 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0528 | 0.77 | Rv0528 | 1.75E-02 | 0.18 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-0552 | 0.64 | Rv0552 | 1.89E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0584 | 0.56 | Rv0584 | 2.88E-02 | 0.23 | POSSIBLE CONSERVED EXPORTED PROTEIN |
| MtH37Rv-0605 | 0.66 | Rv0605 | 2.88E-02 | 0.20 | POSSIBLE RESOLVASE |
| MtH37Rv-0607 | 0.63 | Rv0607 | 1.32E-02 | 0.26 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0609A | 0.65 | Rv0609A | 8.57E-03 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0614 | 0.47 | Rv0614 | 5.66E-03 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0633c | 0.79 | Rv0633c | 4.18E-02 | 0.17 | POSSIBLE EXPORTED PROTEIN |
| MtH37Rv-0650 | 0.43 | Rv0650 | 3.97E-02 | 0.22 | POSSIBLE SUGAR KINASE |
| MtH37Rv-0658c | 0.69 | Rv0658c | 4.74E-02 | 0.12 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-0698 | 0.51 | Rv0698 | 7.09E-03 | 0.12 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0725c | 0.59 | Rv0725c | 2.47E-02 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0736 | 0.43 | Rv0736 | 2.06E-02 | 0.30 | PROBABLE CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-0737 | 0.46 | Rv0737 | 2.49E-02 | 0.23 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN |
| MtH37Rv-0738 | 0.55 | Rv0738 | 1.67E-02 | 0.22 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0739 | 0.64 | Rv0739 | 4.78E-02 | 0.26 | CONSERVED HYPOTHETICAL PROTEIN |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-0762c | 0.69 | Rv0762c | 9.05E-03 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0775 | 0.47 | Rv0775 | 6.54E-03 | 0.22 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0785 | 0.52 | Rv0785 | 1.42E-02 | 0.11 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0786c | 0.67 | Rv0786c | 3.69E-02 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0789c | 0.64 | Rv0789c | 4.91E-02 | 0.14 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0790c | 0.61 | Rv0790c | 3.84E-02 | 0.24 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0792c | 0.63 | Rv0792c | 1.77E-02 | 0.35 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY GNTR-FAMILY) |
| MtH37Rv-0795 | 0.39 | Rv0795 | 1.81E-03 | 0.14 | PUTATIVE TRANSPOSASE FOR INSERTION SEQUENCE ELEMENT IS6110 (FRAGMENT) |
| MtH37Rv-0797 | 0.34 | Rv0797 | 1.25E-02 | 0.11 | PUTATIVE TRANSPOSASE FOR INSERTION SEQUENCE ELEMENT IS1547 |
| MtH37Rv-0836c | 0.45 | Rv0836c | 6.52E-03 | 0.22 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0837c | 0.39 | Rv0837c | 3.17E-03 | 0.14 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0839 | 0.31 | Rv0839 | 2.53E-02 | 0.45 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0841 | 0.55 | Rv0841 | 5.57E-03 | 0.11 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-0842 | 0.37 | Rv0842 | 7.09E-03 | 0.17 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-0843 | 0.39 | Rv0843 | 2.88E-02 | 0.27 | PROBABLE DEHYDROGENASE |
| MtH37Rv-0845 | 0.36 | Rv0845 | 2.13E-02 | 0.41 | POSSIBLE TWO COMPONENT SENSOR KINASE |
| MtH37Rv-0851c | 0.28 | Rv0851c | 5.48E-03 | 0.20 | PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE |
| MtH37Rv-0858c | 0.51 | Rv0858c | 2.29E-02 | 0.46 | PROBABLE AMINOTRANSFERASE |
| MtH37Rv-0861c | 0.49 | Rv0861c | 1.27E-02 | 0.17 | PROBABLE DNA HELICASE |
| MtH37Rv-0880 | 0.44 | Rv0880 | 1.89E-02 | 0.22 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (POSSIBLY MARR-FAMILY) |
| MtH37Rv-0895 | 0.54 | Rv0895 | 1.64E-02 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0914c | 0.65 | Rv0914c | 3.32E-02 | 0.12 | POSSIBLE LIPID CARRIER PROTEIN OR KETO ACYL-COA THIOLASE |
| MtH37Rv-0922 | 0.59 | Rv0922 | 2.88E-02 | 0.25 | POSSIBLE TRANSPOSASE |
| MtH37Rv-0927c | 0.64 | Rv0927c | 1.27E-02 | 0.14 | PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE |
| MtH37Rv-0939 | 0.58 | Rv0939 | 4.65E-02 | 0.31 | POSSIBLE BIFUNCTIONAL ENZYME: 2-HYDROXYHEPTA-2,4-DIENE-1,7-DIOATE ISOMERASE (HHDD ISOMERASE) + CYCLASE/DEHYDRASE |
| MtH37Rv-0964c | 0.42 | Rv0964c | 2.78E-02 | 0.16 | HYPOTHETICAL PROTEIN |
| MtH37Rv-0976c | 0.47 | Rv0976c | 3.38E-02 | 0.33 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-0992c | 0.50 | Rv0992c | 6.54E-03 | 0.15 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1003 | 0.38 | Rv1003 | 9.13E-03 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1019 | 0.49 | Rv1019 | 4.73E-02 | 0.30 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY TETR-FAMILY) |
| MtH37Rv-1057 | 0.46 | Rv1057 | 3.49E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1084 | 0.38 | Rv1084 | 1.75E-02 | 0.39 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1087A | 0.65 | Rv1087A | 4.29E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1116A | 0.69 | Rv1116A | 1.27E-02 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN (FRAGMENT) |
| MtH37Rv-1120c | 0.62 | Rv1120c | 2.42E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1129c | 0.48 | Rv1129c | 1.81E-03 | 0.19 | PROBABLE TRANSCRIPTIONAL REGULATOR PROTEIN |
| MtH37Rv-1135A | 0.51 | Rv1135A | 2.36E-02 | 0.22 | POSSIBLE ACETYL-COA ACETYLTRANSFERASE (ACETOACETYL-COA THIOLASE) |
| MtH37Rv-1137c | 0.60 | Rv1137c | 3.49E-02 | 0.24 | HYPOTHETICAL PROTEIN |
| MtH37Rv-1139c | 0.67 | Rv1139c | 1.36E-02 | 0.17 | CONSERVED HYPOTHETICAL MEMBRANE PROTEIN |
| MtH37Rv-1147 | 0.65 | Rv1147 | 1.84E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1150 | 0.60 | Rv1150 | 3.19E-03 | 0.11 | POSSIBLE TRANSPOSASE [SECOND PART] |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-1159 | 0.59 | Rv1159 | 1.36E-02 | 0.16 | CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-1190 | 0.51 | Rv1190 | 4.94E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1290A | 0.33 | Rv1290A | 2.40E-02 | 0.28 | HYPOTHETICAL PROTEIN |
| MtH37Rv-1333 | 0.67 | Rv1333 | 3.32E-02 | 0.33 | PROBABLE HYDROLASE |
| MtH37Rv-1370c | 0.47 | Rv1370c | 4.15E-02 | 0.11 | PROBABLE TRANSPOSASE |
| MtH37Rv-1405c | 0.51 | Rv1405c | 2.22E-02 | 0.25 | PUTATIVE METHYLTRANSFERASE |
| MtH37Rv-1429 | 0.60 | Rv1429 | 3.22E-02 | 0.14 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1431 | 0.67 | Rv1431 | 1.00E-02 | 0.19 | CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-1432 | 0.62 | Rv1432 | 1.87E-02 | 0.16 | PROBABLE DEHYDROGENASE |
| MtH37Rv-1491c | 0.60 | Rv1491c | 3.71E-02 | 0.22 | CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-1496 | 0.61 | Rv1496 | 1.23E-02 | 0.24 | Possible transport system kinase |
| MtH37Rv-1499 | 0.68 | Rv1499 | 9.80E-03 | 0.12 | HYPOTHETICAL PROTEIN |
| MtH37Rv-1503c | 0.52 | Rv1503c | 4.31E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1510 | 0.62 | Rv1510 | 2.72E-03 | 0.17 | conserved probable membrane protein |
| MtH37Rv-1517 | 0.67 | Rv1517 | 3.46E-02 | 0.18 | conserved hypothetical transmembrane protein |
| MtH37Rv-1582c | 0.56 | Rv1582c | 4.69E-02 | 0.15 | Probable phiRv1 phage protein |
| MtH37Rv-1588c | 0.46 | Rv1588c | 3.10E-03 | 0.23 | Partial REP13E12 repeat protein |
| MtH37Rv-1619 | 0.46 | Rv1619 | 1.71E-02 | 0.29 | CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-1668c | 0.62 | Rv1668c | 2.47E-02 | 0.30 | PROBABLE FIRST PART OF MACROLIDE-TRANSPORT ATP-BINDING PROTEIN ABC TRANSPORTER |
| MtH37Rv-1671 | 0.61 | Rv1671 | 1.59E-02 | 0.14 | PROBABLE MEMBRANE PROTEIN |
| MtH37Rv-1674c | 0.38 | Rv1674c | 4.27E-02 | 0.14 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN |
| MtH37Rv-1678 | 0.68 | Rv1678 | 2.45E-02 | 0.11 | PROBABLE INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-1711 | 0.43 | Rv1711 | 6.54E-03 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1720c | 0.60 | Rv1720c | 1.23E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1735c | 0.30 | Rv1735c | 1.22E-02 | 0.19 | HYPOTHETICAL MEMBRANE PROTEIN |
| MtH37Rv-1757c | 0.51 | Rv1757c | 2.22E-02 | 0.23 | PUTATIVE TRANSPOSASE |
| MtH37Rv-1760 | 0.46 | Rv1760 | 4.18E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1764 | 0.57 | Rv1764 | 6.54E-03 | 0.22 | PUTATIVE TRANSPOSASE |
| MtH37Rv-1776c | 0.62 | Rv1776c | 2.66E-02 | 0.30 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN |
| MtH37Rv-1778c | 0.57 | Rv1778c | 2.40E-03 | 0.21 | HYPOTHETICAL PROTEIN |
| MtH37Rv-1841c | 0.58 | Rv1841c | 2.67E-02 | 0.21 | CONSERVED HYPOTHETICAL MEMBRANE PROTEIN |
| MtH37Rv-1890c | 0.70 | Rv1890c | 2.59E-02 | 0.14 | HYPOTHETICAL PROTEIN |
| MtH37Rv-1937 | 0.37 | Rv1937 | 1.74E-02 | 0.26 | POSSIBLE OXYGENASE |
| MtH37Rv-1945 | 0.58 | Rv1945 | 9.13E-03 | 0.13 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-1999c | 0.40 | Rv1999c | 2.27E-03 | 0.23 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-2008c | 0.66 | Rv2008c | 3.23E-02 | 0.11 | conserved hypothetical protein |
| MtH37Rv-2011c | 0.52 | Rv2011c | 8.19E-03 | 0.15 | conserved hypothetical protein |
| MtH37Rv-2013 | 0.53 | Rv2013 | 1.88E-02 | 0.21 | Possible transposase |
| MtH37Rv-2044c | 0.48 | Rv2044c | 2.91E-02 | 0.17 | conserved hypothetical protein |
| MtH37Rv-2052c | 0.39 | Rv2052c | 2.47E-02 | 0.39 | conserved hypothetical protein |
| MtH37Rv-2059 | 0.63 | Rv2059 | 1.89E-02 | 0.23 | conserved hypothetical protein |
| MtH37Rv-2170 | 0.45 | Rv2170 | 2.44E-02 | 0.28 | conserved hypothetical protein |
| MtH37Rv-2219A | 0.46 | Rv2219A | 5.80E-03 | 0.20 | PROBABLE CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-2262c | 0.33 | Rv2262c | 3.19E-03 | 0.19 | conserved hypothetical protein |
| MtH37Rv-2265 | 0.43 | Rv2265 | 1.07E-02 | 0.13 | Possible conserved integral membrane protein |
| MtH37Rv-2278 | 0.41 | Rv2278 | 1.81E-03 | 0.18 | Probable transposase |
| MtH37Rv-2279 | 0.47 | Rv2279 | 1.84E-02 | 0.27 | Probable transposase |
| MtH37Rv-2294 | 0.66 | Rv2294 | 1.89E-02 | 0.21 | Probable aminotransferase |
| MtH37Rv-2304c | 0.55 | Rv2304c | 2.83E-02 | 0.16 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2306A | 0.60 | Rv2306A | 8.27E-03 | 0.27 | POSSIBLE CONSERVED MEMBRANE PROTEIN |
| MtH37Rv-2307A | 0.49 | Rv2307A | 6.54E-03 | 0.26 | HYPOTHETICAL GLYCINE RICH PROTEIN |
| MtH37Rv-2309A | 0.60 | Rv2309A | 4.14E-02 | 0.27 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2311 | 0.73 | Rv2311 | 4.18E-02 | 0.29 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2331A | 0.47 | Rv2331A | 1.81E-03 | 0.18 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2433c | 0.59 | Rv2433c | 4.81E-02 | 0.15 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2434c | 0.61 | Rv2434c | 1.60E-02 | 0.10 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-2474c | 0.47 | Rv2474c | 3.75E-02 | 0.31 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2478c | 0.70 | Rv2478c | 4.91E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2480c | 0.63 | Rv2480c | 2.74E-02 | 0.20 | POSSIBLE TRANSPOSASE |
| MtH37Rv-2488c | 0.41 | Rv2488c | 4.25E-02 | 0.31 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (LUXR-FAMILY) |
| MtH37Rv-2514c | 0.39 | Rv2514c | 4.13E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2515c | 0.44 | Rv2515c | 3.72E-02 | 0.32 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2516c | 0.44 | Rv2516c | 3.54E-02 | 0.26 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2531c | 0.53 | Rv2531c | 2.44E-02 | 0.31 | PROBABLE AMINO ACID DECARBOXYLASE |
| MtH37Rv-2542 | 0.53 | Rv2542 | 4.53E-02 | 0.26 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2551c | 0.46 | Rv2551c | 8.89E-03 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2577 | 0.36 | Rv2577 | 1.69E-02 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2635 | 0.47 | Rv2635 | 3.10E-03 | 0.15 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2646 | 0.69 | Rv2646 | 3.84E-02 | 0.22 | PROBABLE INTEGRASE |
| MtH37Rv-2648 | 0.49 | Rv2648 | 1.27E-02 | 0.14 | PROBABLE TRANSPOSASE FOR INSERTION SEQUENCE ELEMENT IS6110 |
| MtH37Rv-2649 | 0.49 | Rv2649 | 9.65E-03 | 0.16 | PROBABLE TRANSPOSASE FOR INSERTION SEQUENCE ELEMENT IS6110 |
| MtH37Rv-2662 | 0.46 | Rv2662 | 4.55E-03 | 0.24 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2666 | 0.67 | Rv2666 | 1.61E-02 | 0.16 | PROBABLE TRANSPOSASE FOR INSERTION SEQUENCE ELEMENT IS1081 (FRAGMENT) |
| MtH37Rv-2729c | 0.49 | Rv2729c | 4.35E-02 | 0.34 | PROBABLE CONSERVED INTEGRAL ALANINE AND VALINE AND LEUCINE RICH PROTEIN |
| MtH37Rv-2730 | 0.49 | Rv2730 | 3.65E-02 | 0.28 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2757c | 0.51 | Rv2757c | 9.65E-03 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2762c | 0.53 | Rv2762c | 3.69E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2766c | 0.66 | Rv2766c | 2.54E-02 | 0.27 | PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE |
| MtH37Rv-2771c | 0.59 | Rv2771c | 3.57E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2776c | 0.67 | Rv2776c | 2.75E-02 | 0.20 | PROBABLE OXIDOREDUCTASE |
| MtH37Rv-2787 | 0.56 | Rv2787 | 1.52E-02 | 0.18 | CONSERVED HYPOTHETICAL ALANINE RICH PROTEIN |
| MtH37Rv-2797c | 0.54 | Rv2797c | 1.27E-02 | 0.27 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2806 | 0.58 | Rv2806 | 4.37E-02 | 0.23 | POSSIBLE MEMBRANE PROTEIN |
| MtH37Rv-2810c | 0.56 | Rv2810c | 2.91E-02 | 0.15 | PROBABLE TRANSPOSASE |
| MtH37Rv-2813 | 0.41 | Rv2813 | 9.42E-03 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2814c | 0.43 | Rv2814c | 1.80E-02 | 0.27 | PROBABLE TRANSPOSASE |
| MtH37Rv-2815c | 0.42 | Rv2815c | 1.31E-02 | 0.18 | PROBABLE TRANSPOSASE |
| MtH37Rv-2827c | 0.50 | Rv2827c | 7.83E-03 | 0.23 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2854 | 0.48 | Rv2854 | 7.98E-03 | 0.27 | HYPOTHETICAL PROTEIN |
| MtH37Rv-2859c | 0.51 | Rv2859c | 1.16E-03 | 0.16 | POSSIBLE AMIDOTRANSFERASE |
| MtH37Rv-2862c | 0.58 | Rv2862c | 4.18E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-2864c | 0.47 | Rv2864c | 2.78E-02 | 0.40 | POSSIBLE PENICILLIN-BINDING LIPOPROTEIN |
| MtH37Rv-2877c | 0.41 | Rv2877c | 2.03E-02 | 0.32 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-2885c | 0.48 | Rv2885c | 7.18E-03 | 0.40 | PROBABLE TRANSPOSASE |
| MtH37Rv-2886c | 0.65 | Rv2886c | 2.53E-02 | 0.23 | PROBABLE RESOLVASE |
| MtH37Rv-2891 | 0.61 | Rv2891 | 3.25E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2897c | 0.25 | Rv2897c | 1.27E-02 | 0.33 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2913c | 0.44 | Rv2913c | 3.23E-02 | 0.33 | POSSIBLE D-AMINO ACID AMINOHYDROLASE (D-AMINO ACID HYDROLASE) |
| MtH37Rv-2917 | 0.54 | Rv2917 | 2.53E-02 | 0.30 | CONSERVED HYPOTHETICAL ALANINE AND ARGININE RICH PROTEIN |
| MtH37Rv-2923c | 0.67 | Rv2923c | 4.25E-02 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-2961 | 0.52 | Rv2961 | 2.25E-02 | 0.11 | PROBABLE TRANSPOSASE |
| MtH37Rv-2983 | 0.61 | Rv2983 | 1.70E-02 | 0.15 | CONSERVED HYPOTHETICAL ALANINE RICH PROTEIN |
| MtH37Rv-2998 | 0.39 | Rv2998 | 1.95E-02 | 0.39 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3000 | 0.58 | Rv3000 | 1.84E-02 | 0.15 | POSSIBLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-3057c | 0.46 | Rv3057c | 4.91E-02 | 0.39 | PROBABLE SHORT CHAIN ALCOHOL DEHYDROGENASE/REDUCTASE |
| MtH37Rv-3060c | 0.63 | Rv3060c | 4.25E-02 | 0.21 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY GNTR-FAMILY) |
| MtH37Rv-3071 | 0.52 | Rv3071 | 1.65E-02 | 0.28 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3072c | 0.60 | Rv3072c | 6.96E-03 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3077 | 0.67 | Rv3077 | 4.35E-02 | 0.40 | POSSIBLE HYDROLASE |
| MtH37Rv-3079c | 0.39 | Rv3079c | 6.39E-03 | 0.15 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3094c | 0.61 | Rv3094c | 3.98E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3095 | 0.51 | Rv3095 | 1.81E-03 | 0.15 | HYPOTHETICAL TRANSCRIPTIONAL REGULATORY PROTEIN |
| MtH37Rv-3098c | 0.72 | Rv3098c | 4.63E-02 | 0.26 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3114 | 0.77 | Rv3114 | 1.92E-02 | 0.13 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3160c | 0.59 | Rv3160c | 4.29E-02 | 0.24 | POSSIBLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY TETR-FAMILY) |
| MtH37Rv-3162c | 0.21 | Rv3162c | 1.88E-02 | 0.19 | POSSIBLE INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-3163c | 0.57 | Rv3163c | 1.92E-02 | 0.18 | POSSIBLE CONSERVED SECRETED PROTEIN |
| MtH37Rv-3165c | 0.59 | Rv3165c | 8.44E-03 | 0.14 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3166c | 0.39 | Rv3166c | 8.57E-03 | 0.43 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3168 | 0.54 | Rv3168 | 2.08E-02 | 0.56 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3173c | 0.54 | Rv3173c | 3.39E-02 | 0.34 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY TETR/ACRR-FAMILY) |
| MtH37Rv-3174 | 0.34 | Rv3174 | 9.43E-03 | 0.26 | PROBABLE SHORT-CHAIN DEHYDROGENASE/REDUCTASE |
| MtH37Rv-3175 | 0.29 | Rv3175 | 6.96E-03 | 0.36 | POSSIBLE AMIDASE |
| MtH37Rv-3180c | 0.57 | Rv3180c | 4.74E-02 | 0.24 | HYPOTHETICAL ALANINE RICH PROTEIN |
| MtH37Rv-3186 | 0.47 | Rv3186 | 1.06E-02 | 0.19 | PROBABLE TRANSPOSASE |
| MtH37Rv-3189 | 0.58 | Rv3189 | 7.20E-03 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3190c | 0.52 | Rv3190c | 1.47E-02 | 0.17 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3192 | 0.51 | Rv3192 | 1.90E-02 | 0.13 | CONSERVED HYPOTHETICAL ALANINE AND PROLINE-RICH PROTEIN |
| MtH37Rv-3253c | 0.54 | Rv3253c | 4.81E-02 | 0.24 | POSSIBLE CATIONIC AMINO ACID TRANSPORT INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-3254 | 0.66 | Rv3254 | 3.12E-02 | 0.37 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3278c | 0.61 | Rv3278c | 2.56E-02 | 0.22 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-3325 | 0.58 | Rv3325 | 3.31E-02 | 0.16 | PROBABLE TRANSPOSASE |
| MtH37Rv-3351c | 0.68 | Rv3351c | 2.45E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-3364c | 0.37 | Rv3364c | 2.89E-02 | 0.28 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3368c | 0.47 | Rv3368c | 1.28E-02 | 0.33 | POSSIBLE OXIDOREDUCTASE |
| MtH37Rv-3371 | 0.43 | Rv3371 | 2.88E-02 | 0.44 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3395c | 0.39 | Rv3395c | 2.78E-02 | 0.33 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3440c | 0.74 | Rv3440c | 3.94E-02 | 0.16 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3448 | 0.41 | Rv3448 | 4.94E-02 | 0.25 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-3466 | 0.65 | Rv3466 | 4.81E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3474 | 0.41 | Rv3474 | 9.32E-03 | 0.25 | POSSIBLE TRANSPOSASE FOR INSERTION ELEMENT IS6110 [FIRST PART] |
| MtH37Rv-3481c | 0.45 | Rv3481c | 1.89E-02 | 0.36 | PROBABLE INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-3549c | 0.51 | Rv3549c | 3.23E-02 | 0.40 | PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE |
| MtH37Rv-3555c | 0.76 | Rv3555c | 3.47E-02 | 0.14 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3577 | 0.43 | Rv3577 | 3.04E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3591c | 0.36 | Rv3591c | 4.36E-02 | 0.49 | POSSIBLE HYDROLASE |
| MtH37Rv-3604c | 0.66 | Rv3604c | 1.27E-02 | 0.16 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN RICH IN ALANINE AND ARGININE AND PROLINE |
| MtH37Rv-3605c | 0.37 | Rv3605c | 1.09E-02 | 0.31 | PROBABLE CONSERVED SECRETED PROTEIN |
| MtH37Rv-3618 | 0.54 | Rv3618 | 3.12E-02 | 0.23 | POSSIBLE MONOOXYGENASE |
| MtH37Rv-3629c | 0.49 | Rv3629c | 3.60E-02 | 0.16 | PROBABLE CONSERVED INTEGRAL MEMBRANE PROTEIN |
| MtH37Rv-3635 | 0.34 | Rv3635 | 1.30E-02 | 0.14 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-3636 | 0.60 | Rv3636 | 2.64E-02 | 0.27 | POSSIBLE TRANSPOSASE |
| MtH37Rv-3640c | 0.38 | Rv3640c | 8.28E-03 | 0.22 | PROBABLE TRANSPOSASE |
| MtH37Rv-3656c | 0.67 | Rv3656c | 4.14E-02 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3659c | 0.33 | Rv3659c | 3.41E-02 | 0.24 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3660c | 0.56 | Rv3660c | 1.63E-02 | 0.16 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3691 | 0.63 | Rv3691 | 4.35E-02 | 0.26 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3700c | 0.54 | Rv3700c | 2.51E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3702c | 0.68 | Rv3702c | 3.73E-02 | 0.19 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3712 | 0.54 | Rv3712 | 3.19E-02 | 0.28 | POSSIBLE LIGASE |
| MtH37Rv-3714c | 0.49 | Rv3714c | 2.59E-02 | 0.57 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3728 | 0.55 | Rv3728 | 4.21E-02 | 0.21 | PROBABLE CONSERVED TWO-DOMAIN MEMBRANE PROTEIN |
| MtH37Rv-3729 | 0.47 | Rv3729 | 1.81E-03 | 0.19 | POSSIBLE TRANSFERASE |
| MtH37Rv-3730c | 0.52 | Rv3730c | 3.81E-02 | 0.21 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3740c | 0.45 | Rv3740c | 3.64E-02 | 0.45 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3770c | 0.60 | Rv3770c | 4.00E-02 | 0.30 | HYPOTHETICAL LEUCINE RICH PROTEIN |
| MtH37Rv-3771c | 0.55 | Rv3771c | 1.36E-02 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3773c | 0.70 | Rv3773c | 2.29E-02 | 0.17 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3776 | 0.51 | Rv3776 | 1.81E-03 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3786c | 0.71 | Rv3786c | 2.56E-02 | 0.22 | HYPOTHETICAL PROTEIN |
| MtH37Rv-3787c | 0.39 | Rv3787c | 2.74E-02 | 0.18 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3796 | 0.61 | Rv3796 | 3.49E-02 | 0.23 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3829c | 0.58 | Rv3829c | 3.60E-02 | 0.26 | PROBABLE DEHYDROGENASE |
| MtH37Rv-3836 | 0.65 | Rv3836 | 2.85E-02 | 0.26 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3848 | 0.56 | Rv3848 | 4.22E-02 | 0.25 | PROBABLE CONSERVED TRANSMEMBRANE PROTEIN |
| MtH37Rv-3886c | 0.53 | Rv3886c | 2.73E-02 | 0.35 | POSSIBLE SECRETED ALANINE AND PROLINE RICH PROTEASE |
| MtH37Rv-3897c | 0.63 | Rv3897c | 4.46E-03 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-3899c | 0.62 | Rv3899c | 7.20E-03 | 0.25 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3912 | 0.49 | Rv3912 | 2.12E-02 | 0.23 | HYPOTHETICAL ALANINE RICH PROTEIN |
| MtH37Rv-0069c | 0.66 | sdaA | 2.51E-02 | 0.21 | PROBABLE L-SERINE DEHYDRATASE SDAA (L-SERINE DEAMINASE) (SDH) (L-SD) |
| MtH37Rv-0728c | 0.42 | serA2 | 4.35E-02 | 0.45 | POSSIBLE D-3-PHOSPHOGLYCERATE DEHYDROGENASE SERA2 (PHOSPHOGLYCERATE DEHYDROGENASE) (PGDH) |
| MtH37Rv-3328c | 0.61 | sigJ | 4.05E-02 | 0.19 | PROBABLE ALTERNATIVE RNA POLYMERASE SIGMA FACTOR (FRAGMENT) SIGJ |
| MtH37Rv-3366 | 0.43 | spoU | 3.27E-02 | 0.46 | PROBABLE tRNA/rRNA METHYLASE SPOU (tRNA/rRNA METHYLTRANSFERASE) |
| MtH37Rv-0602c | 0.35 | tcrA | 2.19E-02 | 0.22 | PROBABLE TWO COMPONENT DNA BINDING TRANSCRIPTIONAL REGULATORY PROTEIN TCRA |
| MtH37Rv-3247c | 0.56 | tmk | 4.00E-02 | 0.28 | PROBABLE THYMIDYLATE KINASE TMK (dTMP KINASE) (THYMIDYLIC ACID KINASE) (TMPK) |
| MtH37Rv-1563c | 0.47 | treY | 7.83E-03 | 0.34 | Maltooligosyltrehalose synthase TreY |
| MtH37Rv-2835c | 0.36 | ugpA | 2.94E-02 | 0.24 | PROBABLE Sn-GLYCEROL-3-PHOSPHATE TRANSPORT INTEGRAL MEMBRANE PROTEIN ABC TRANSPORTER UGPA |
| MtH37Rv-1849 | 0.62 | ureB | 1.12E-02 | 0.18 | Urease beta subunit ureB |
| MtH37Rv-3082c | 0.68 | virS | 4.31E-02 | 0.32 | VIRULENCE-REGULATING TRANSCRIPTIONAL REGULATOR VIRS (ARAC/XYLS FAMILY) |

Appendix VII: Genes downregulated in intracellular M. tuberculosis H37Ra versus broth-grown H37Ra.

Expanded list of genes downregulated in intracellular versus broth-grown M. tuberculosis H37Ra (see Table 5-7). Three populations of RNA from each of intracellular and broth-grown H37Ra were reverse transcribed and hybridised to M. tuberculosis microarrays (Bµg@S, http://www.bugs.sgul.ac.uk/) in duplicate. Arrays were normalised as per section 2.6.3.2 and gene expression was filtered for genes whose expression differed by 1.5-fold. Statistical significance of fold-difference across all populations was analysed using ANOVA. Genes whose expression was found to be statistically significantly (P<0.05) downregulated in intracellular H37Ra versus broth-grown H37Ra are listed. SD = standard deviation.

| Systematic Name | Fold-induction | Common Name | P-value | SD | Product |
|---|---|---|---|---|---|
| MtH37Rv-1475c | 0.58 | acn | 1.94E-02 | 0.23 | PROBABLE ACONITATE HYDRATASE ACN (Citrate hydro-lyase) (Aconitase) |
| MtH37Rv-1876 | 0.35 | bfrA | 1.95E-03 | 0.23 | PROBABLE BACTERIOFERRITIN BFRA |
| MtH37Rv-3841 | 0.23 | bfrB | 1.12E-02 | 1.71 | POSSIBLE BACTERIOFERRITIN BFRB |
| MtH37Rv-1058 | 0.58 | fadD14 | 3.51E-02 | 0.46 | PROBABLE MEDIUM CHAIN FATTY-ACID-COA LIGASE FADD14 (FATTY-ACID-COA SYNTHETASE) (FATTY-ACID-COA SYNTHASE) |
| MtH37Rv-1552 | 0.56 | frdA | 1.84E-02 | 0.19 | PROBABLE FUMARATE REDUCTASE [FLAVOPROTEIN SUBUNIT] FRDA (FUMARATE DEHYDROGENASE) (FUMARIC HYDROGENASE) |
| MtH37Rv-0924c | 0.58 | mntH | 4.96E-02 | 0.23 | DIVALENT CATION-TRANSPORT INTEGRAL MEMBRANE PROTEIN MNTH (BRAMP) (MRAMP) |
| MtH37Rv-2737c | 0.59 | recA | 4.84E-02 | 0.50 | RECA PROTEIN (RECOMBINASE A) [CONTAINS: ENDONUCLEASE PI-MTUI (MTU RECA INTEIN)]. |
| MtH37Rv-0888 | 0.33 | Rv0888 | 1.06E-02 | 0.68 | PROBABLE EXPORTED PROTEIN |
| MtH37Rv-1230c | 0.51 | Rv1230c | 9.25E-03 | 0.49 | POSSIBLE MEMBRANE PROTEIN |
| MtH37Rv-1279 | 0.62 | Rv1279 | 3.65E-02 | 0.30 | PROBABLE DEHYDROGENASE FAD flavoprotein GMC oxidoreductase |
| MtH37Rv-2884 | 0.68 | Rv2884 | 1.94E-02 | 0.26 | PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN |
| MtH37Rv-2959c | 0.33 | Rv2959c | 4.84E-02 | 0.50 | POSSIBLE METHYLTRANSFERASE (METHYLASE) |
| MtH37Rv-3371 | 0.51 | Rv3371 | 4.84E-02 | 0.44 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3706c | 0.21 | Rv3706c | 8.76E-04 | 0.42 | CONSERVED HYPOTHETICAL PROLINE RICH PROTEIN |
| MtH37Rv-3897c | 0.58 | Rv3897c | 4.77E-02 | 0.20 | CONSERVED HYPOTHETICAL PROTEIN |
| MtH37Rv-3846 | 0.27 | sodA | 1.80E-02 | 0.50 | SUPEROXIDE DISMUTASE [FE] SODA |
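The selection criteria stated in the caption to this appendix (a 1.5-fold expression difference and P<0.05) can be illustrated with a minimal sketch. It assumes the fold filter is applied directly to the mean fold-induction, which the thesis does not state; a few listed means (for example 0.68) lie slightly above 1/1.5, so the sketch is not claimed to reproduce the appendix exactly. The names `GeneResult` and `select_downregulated`, and the rows `geneX` and `geneY`, are hypothetical and introduced only for illustration; the other three rows are taken from Appendix VII.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    name: str       # common gene name
    fold: float     # mean fold-induction (intracellular / broth-grown)
    p_value: float  # ANOVA P-value as reported in the appendix


def select_downregulated(rows, min_fold_change=1.5, alpha=0.05):
    """Keep rows with fold-induction <= 1/min_fold_change and P < alpha.

    Assumption: the fold filter is applied to the mean fold-induction;
    the thesis may have applied it at a different stage of the analysis.
    """
    cutoff = 1.0 / min_fold_change
    return [r for r in rows if r.fold <= cutoff and r.p_value < alpha]


rows = [
    GeneResult("bfrA", 0.35, 1.95e-03),   # from Appendix VII
    GeneResult("sodA", 0.27, 1.80e-02),   # from Appendix VII
    GeneResult("acn", 0.58, 1.94e-02),    # from Appendix VII
    GeneResult("geneX", 0.90, 1.00e-02),  # hypothetical: change too small
    GeneResult("geneY", 0.40, 2.00e-01),  # hypothetical: not significant
]

selected = select_downregulated(rows)
print([r.name for r in selected])
```

Raising `min_fold_change` to 2.0 tightens the cutoff to 0.5, which in this toy data set would also exclude acn.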
