<<

Extremozymes of the Hot and Salty Halothermothrix orenii

Author Kori, Lokesh D

Published 2012

Thesis Type Thesis (PhD Doctorate)

School School of Biomolecular and Physical Sciences

DOI https://doi.org/10.25904/1912/2191

Copyright Statement The author owns the copyright in this thesis, unless stated otherwise.

Downloaded from http://hdl.handle.net/10072/366220

Griffith Research Online https://research-repository.griffith.edu.au

Extremozymes of the hot and salty Halothermothrix orenii

LOKESH D. KORI

(M.Sc. Biotechnology)

School of Biomolecular and Physical Sciences

Science, Environment, Engineering and Technology

Griffith University, Australia

Submitted in fulfillment of the requirements of the degree of

Doctor of Philosophy

December 2011 STATEMENT OF ORIGINALITY

STATEMENT OF ORIGINALITY

This work has not previously been submitted for a degree or diploma in any university.

To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made in the thesis itself.

LOKESH DULICHAND KORI

II ACKNOWLEDGEMENTS

ACKNOWLEDGEMENTS

I owe my deepest gratitude to my supervisor Prof. Bharat Patel, for offering me an opportunity for being his postgraduate. His boundless knowledge motivates me for keep going and enjoy the essence of science. Without his guidance, great patience and advice, I could not finish my PhD program successfully.

I take this opportunity to give my heartiest thanks to Assoc. Prof. Andreas Hofmann, (Structural Chemistry, Eskitis Institute for Cell & Molecular Therapies, Griffith University) for his support and encouragement for crystallographic work. I am grateful to him for teaching me about the structures, in silico analysis and their hidden chemistry. I also thank to Dr. Asiah Osman, Dr. Conan Wang and Dr. Saroja Weeratunga for their help and support during biocrystallographic work in Structural Chemistry Lab.

I would like to give my thanks to Dr. Tony Greene for his support and advice throughout my candidature. I am also thankful to Dr. Chris Love for his valuable suggestions and support.

I also thank to my colleagues Dr. Christopher Ogg, Dr. Carolina Diaz, Mr. Mark Hasse, Mr. Saad Farooqui, Mr. Vipul Chaddha and Mr. Mitchell Wright for their support, sharing the lab and good time. I would also like to extend my thanks to the visiting students from Nanyang Technological University Singapore, Ms. Milan Mehta, Ms. Yamonee Zin, Mr. Kevin Neo, Ms. Hui Hui, Ms. Durga Devi, Mr. Leon Oh Yong Jie and Ms. Ginette Gin.

I also thank Head of the School and all administrative staffs in School of Biomolecular and Physical Sciences (BPS) for their help. My thanks also go to the staff of DNA sequencing facility, science store, electronics and mechanical solutions and technical staff for their support.

I extend my vote of thanks to Ms Janet Watts, School of Education and Professional Studies, Griffith University for editing and reading my thesis critically.

III ACKNOWLEDGEMENTS

I gratefully acknowledge the financial support from Griffith University for awarding Griffith University Postgraduate Research Scholarship (GUPRS) for covering living allowance and Griffith University International Postgraduate Research Scholarship (GUIPRS) for covering tuition fees.

Thanks are also due to Dr. Anup Anvikar (Asst. Director, ICMR, New Delhi-India) for his motivational support from the very beginning of my scientific career. I also greatly extend my vote of thanks to Mrs. Inderjeet K Randhawa for developing my interest towards science from my school days.

I am grateful to the support and help by my friends including: Mr. Adnan Naim, Dr. Susrut Arora, Dr. Susheel Singh, Dr. Abhinav Dey, Dr. Abhishek Bhattacharya, Mr. Bharat Neekhra, Mrs. Poonam B Pandey, Mrs. Ruchi Gutch and all those who helped me directly or indirectly. I would like to give my special thanks to my brother Ritesh and grandparents for their love and support.

Finally, I would like to thank especially to my wife Sonam and dedicate this thesis to my parents for their love, support, patience, encouragement and inspiration all the time. It was not possible to complete this work without their precious support.

IV PUBLICATIONS ARISING FROM THIS THESIS

PUBLICATIONS ARISING FROM THIS THESIS

PUBLISHED Chapter 3 (Section 3.1)

Lokesh D. Kori, Andreas Hofmann and Bharat K. C. Patel. Expression, purification and preliminary crystallographic analysis of the recombinant β-glucosidase (BglA) from the halothermophile Halothermothrix orenii. Acta Cryst. (2011); F67, 111-113.

Chapter 3 (Section 3.3)

Lokesh D. Kori, Andreas Hofmann and Bharat K. C. Patel. Expression, purification, crystallization and preliminary X-ray diffraction analysis of thermophilic ribokinase ( kinase ribokinase family) from Halothermothrix orenii. (2012); F68, 240-243.

COMMUNICATED/ IN PREPARATION

Chapter 3 (Section 3.1)

Lokesh D. Kori, Andreas Hofmann, Conan K. Wang and Bharat K. C. Patel. Biochemical and structural properties of β- glucosidase with β- fucosidase and β- galactosidase activities from the halothermophilic anaerobe Halothermothrix orenii.

Chapter 3 (Section 3.2)

Lokesh D. Kori and Bharat K. C. Patel. cloning, over-expression and biochemical characterization of a thermophilic β-galactosidase/ -L- arabinopyranosidase from Halothermothrix orenii.

V ABSTRACT

ABSTRACT

Enzymes secreted by are termed as extremozymes. Extremophiles are organisms that inhabit conditions which are extreme to survive and adapt in. Halothermothrix orenii is a polyextremophilic anaerobic bacterium that grows optimally at 60 C (maximum 70 C) in presence of 10 % NaCl (NaCl growth range between 4-20 %) and optimum pH 6.5-7.0, isolated from the sediments of Tunisian lake (Cayol et. al., 1994). This bacterium carry exclusive features from several points : (a) It is Gram-negative member of phylum and its analysis reveals mixed Gram-negative and Gram-positive properties such as lipid A biosynthesis, a characteristic of Gram-negative pathway and a typical Firmicutes sporulating system. (b) It is a strict anaerobic chemoorganotroph which uses various polysaccharides and produces a series of intracellular and extracellular glycosyl hydrolases and transferases. (c) It is a halothermophile and any extra-cellular enzymes produced by this organism are well adapted to function under thermophilic and saline conditions. (d) Halothermothrix orenii survives under fluctuating environmental conditions – dilution of during rains when temperature is low and increased salt concentration during summer when temperature is at its peak.

In this study various glycosyl hydrolases and transferases from H. orenii selected for investigations are briefly described in chapter 3. were cloned and over-expressed followed by protein purification were further analysed for their biochemical characteristics along-with structural insights.

Section 3.1 of chapter 3 describes the phylogenetic, biochemical and structure findings of β-glucosidase. bglA gene from H. orenii was PCR amplified, cloned and over- expressed in E. coli Rosetta 2 cells as a hexahistidine-tagged protein and purified using Ni affinity chromatography. The recombinant enzyme designated HoBglA was found to have a subunit molecular weight in the region of 53 kDa size. The purified recombinant enzyme was found to possess three activities: a β-glucosidase, that was optimally active at 45 C and pH 8.5, with a Vmax 30.05 nmoles/10 mg/15 min and a Km of 1.38 nM with oNPG as the substrate, a β-fucosidase with optimal activity at 60 C and pH 8.5, with a

Vmax 85.56 nmoles/10 mg/15 min and a Km 1.62 nM with pNPF as the substrate, and a

VI ABSTRACT

β-galactosidase with optimal activity at 65 C and pH 8.5, with a Vmax 23.28 nmoles/10 mg/15 min and Km 0.04 nM with pNPGal as the substrate. Deoxynojirimycin inhibited 59% of the β-glucosidase activity and 67% of galactosidase activity but had no effect on β-fucosidase activity. Beta fucosidase and β-galactosidase activities were strongly inhibited by metal ions in the order Hg>Ni>Cd>Zn>Co, whereas only Ni inhibited β- glucosidase activity. Arabinose, cellotetraose, cellotriose, glucose, galactose, D-fucose, gluconate, D-glucono 1-5 lactone, maltose, maltotriose, mellitriose, sucrose and xylose had no effect on β-glucosidase activity. However arabinose, fructose, maltose, cellobiose, sucrose, glucose, galactose, lactose and maltotriose affected β-fucosidase activity. Sucrose reduced β-galactosidase activity by 62% however rest of the above mentioned did not influence β-galactosidase activity, except D-fucose, gluconate, D-glucono 1-5 lactone which inhibited β-galactosidase activity completely. Remarkably iodine abolished all the three enzyme activities of HoBglA. Recombinant β-glucosidase yielded orthorhombic crystals from the sitting drop method and diffracted X-rays to a resolution limit of 3.5 Å having unit cell dimensions a= 95.51, b= 104.60, c= 105.40 (Kori et. al., 2011a). Structural analysis suggested that the overall structure was a classical (/β)8-TIM barrel fold similar to the β-retaining glycosyl hydrolase of the CAZy family 1. The model contained a slot-like active site cleft and a broad outer opening, related to its function in hydrolyzing β 1- 4 linkages of glucose derivatives. In addition, the two essential glutamate (Glu166 and Glu354) residues found to be responsible for substrates hydrolysis are well conserved both in the active site and in substrate binding region of HoBglA.

The second GH studied was β-galactosidase/ -L-arabinofuranosidase and has been described under section 3.2 of chapter 3. arap gene has been cloned and overexpressed in E. coli BL21(DE3) cells as a hexahistidine tagged protein. The recombinant enzyme designated as HoArap was purified using Ni affinity chromatography with a molecular mass of 35.5kDa. The purified HoArap was found to possess two activities; a β- galactosidase, with optimal activity at 50C and at pH 7.0, with a Km 2.5 nM and Vmax 4.4 nmoles/10 mg/15 min with pNPGal as substrate and -L-arabinopyranosidase at

45C and at pH 7.0, with a Km 8.3 nM and Vmax 0.1 nmoles/10 mg/15 min with pNPAp substrate. Both the activities were strongly inhibited by metals (Cd>Cu>Hg>Zn>Ni>Co), Tween 80, EDTA, 10% ethanol and 50% glycerol. Metals

VII ABSTRACT

such as Mn>Mg>Ca>Li played a crucial role in activation of β-galactosidase activity, whereas Mn>Ba>Fe>Co>Li enhanced -L-arabinopyranosidase activity. Lactose was determined as the specific substrate for HoArap. Attempts were made to crystallize HoAraf using commercial and in-house crystallization factorials, but results are not obtained yet while writing these lines.

In last section 3.3 of chapter 3, preliminary crystallographic studies were done. A ribokinase (rbk) gene was cloned and over-expressed in E. coli BL21 (DE3) cells. The recombinant protein designated as HoRbk was purified using Ni affinity chromatography. Orthorhombic crystals were obtained using the sitting drop method. Diffraction dataset was collected to a resolution of 3.1 Å using Australian Synchrotron

Facility. The crystals belonged to the space group P212121 with unit cell parameters a = 45.6, b = 61.1, c = 220.2 and contained two molecules per asymmetric unit. A molecular replacement solution has been found and attempts are currently under way to build a model of the ribokinase. Efforts were made to investigate the HoRbk biochemical properties, but enzyme assays were not consistent and successful.

The research presented here, has in general, expanded the current knowledge on the repertoire of glycosyl hydrolases and transferases present in the thermohalophile, H. orenii. The bioinformatic analysis of these enzymes provides a useful glimpse into it’s physiology and metabolism which could lead to an understanding of the adaptation and survival mechanisms of this in a temperature and osmotic fluctuating stressed habitat. More specifically, the studies have provided a useful insight into the biochemical properties and/ or structural features of a β-glucosidase and a β- galactosidase/ -L-arabinofuranosidase. The detailed sequence, biochemical and structural information could be useful in the future design of tailor-made enzymes which could have use in biotechnological applications such as bioremediation and in pharmaceutical, detergent industry, dairy and biofuel industries.

VIII TABLE OF CONTENTS

TABLE OF CONTENTS

STATEMENT OF ORIGINALITY…..…………………………………………...... II ACKNOWLEDGEMENTS…………………………………………………………… III PUBLICATIONS………………………………………………………………………. V ABSTRACT…………………………………………………………………………….. VI TABLE OF CONTENTS……………………………………………………………… IX LIST OF FIGURES……………………………………………………………………. XV LIST OF TABLES……………………………………………………………………... XX ABBREVIATIONS…………………………………………………………………….. XXI

CHAPTER 1: LITERATURE REVIEW

1.1 Extremophiles…………………………………………………………… 1 1.2 and ………………………………… 5 1.2.1 Thermophilic and hyperthermophilic and their habitats…... 7 1.2.2 Molecular adaptations to thermophily………………………………….. 8 1.2.3 Halophilic and their habitats………………………….. 8 1.2.4 Molecular adaptation to halophilism………………………………….... 9 1.3 Thermozymes……………………………………………………………. 11 1.4 Applications in Molecular …………………………………….. 13 1.4.1 DNA Polymerases……………………………………………………….. 13 1.4.2 DNA ligase………………………………………………………………. 14 1.4.3 Other enzymes…………………………………………………………… 14 1.5 Applications in starch processing…………….………………………... 14 1.5.1 Alpha amylases…………………………………………………………... 15 1.5.2 Beta amylases……………………………………………………………. 16 1.5.3 Glucoamylases and -glucosidases……………………………………… 17 1.5.4 Pullulanase and amylopullulanase……………………………………… 17 1.5.5 Cyclomaltodextrin glucotransferases…………………………………… 18 1.5.6 Xylose isomerases………………………………………………………. 18 1.6 Other Industrial and Biotechnological applications…………………. 22 1.6.1 Cellulose degradation and ethanol production………………………….. 22

IX TABLE OF CONTENTS

1.6.2 Paper pulp bleaching…………………………………………………….. 22 1.7 Structural features of thermozymes…………………………………... 23 1.8 Additional intermolecular interactions……………………………….. 23 1.8.1 Hydrogen bonds………………………………………………………….. 23 1.8.2 Electrostatic interactions………………………………………………... 24 1.8.3 Hydrophobic bonds…………………………………………………….... 24 1.8.4 Disulphide bonds………………………………………………………… 24 1.8.5 Aromatic interaction……………………………………………………… 25 1.8.6 Metal binding……………………………………………………………. 25 1.8.7 Extrinsic factors…………………………………………………………. 25 1.9 Good conformational structures………………………………………. 26 1.9.1 High rigidity……………………………………………………………… 26 1.9.2 Higher packing efficiency……………………………………………….. 26 1.9.3 Alpha helix stabilization…………………………………………………. 26 1.10 Halothermothrix orenii………………………………………………….. 28 Objectives………………………………………………………………… 32 Thesis Plan……………………………………………………………….. 33

CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1 Bioinformatics.……………………………….…………………………. 36 2.1.1 BioEdit and TREECON.…………………………………………………. 38 2.1.2 Basic Local Alignment Search Tool (BLAST)………………………….. 38 2.1.3 ProtParam Statistics………………………………………………..……. 38 2.1.4 InterProScan protein signature database……………………………….. 38 2.1.5 BRENDA enzyme database …………………………………………….. 39 2.1.6 CaZY database ……….………………………………………………….. 39 2.1.7 dbCAN database ………..………………………………………………. 39 2.1.8 KEGG protein database …………………………………………………. 39 2.1.9 PDB database ……………………………………………………………. 40 2.1.10 PSIPRED protein structure prediction server…………………………… 40 2.1.11 EMBL EBI database...... 40 2.1.11a CSA (Catalytic Site Atlas)……………………………………………… 40 2.1.11b PDBeFold……………………………………………………………….. 40

X TABLE OF CONTENTS

2.1.11c FingerPRINTScan………………………………………………………. 41 2.1.12 XtalPred Server…………………………………………………………. 41 2.1.13 Halothermothrix orenii genome database and gene selection…………. 41 2.2 Gene Cloning…….……………………………………………………… 42 2.2.1 Bacterial Strain and Vector……………………………………………... 42 2.2.2 Oligonucleotide design and synthesis ………………………………….. 43 2.2.3 Culturing of H. orenii and genomic DNA extraction and purification… 44 2.2.4 Plasmid DNA purification…………….…………………………………. 44 2.2.5 DNA gel electrophoresis ……………..…………………………………. 45 2.2.6 Polymerase Chain Reaction…………………………………………….. 45 2.2.6a DNA extraction and purification from agarose gel…………………….. 46 2.2.6b Restriction endonucleases digestion……………………………………. 46 2.2.6c DNA ligation reaction…………………………………………………... 46 2.2.6d Recombinant DNA transformation in E. coli…………………………… 46 2.2.6e Recombinant clone screening …………………………………………... 47 A White/ red selection……………………………………………………... 47 B Colony PCR screening………………………………………………….. 47 C Restriction Enzyme (RE) digestion…………………………………….. 47 D Automated DNA sequencing…………………………………………… 48 E Storage of clones…………………………..…………………………….. 48 2.3 Recombinant protein expression and purification…………………… 49 2.3.1 Protein over-expression………………………………………………….. 49 2.3.2 Recombinant protein purification………………………………………. 49 2.3.3 Protein storage…………………………………………………………... 50 2.3.4 Protein analysis techniques………………………………………………. 50 2.3.4.1 Protein concentration estimation………………………………………... 50 2.3.4.2 SDS-PAGE analysis……………………………………………………… 50 2.4 Biochemical characterization of enzymes…………………………….. 51 2.4.1 oNp/ pNP hydrolysis assays…………………………………………….. 51 2.4.2 Substrate specific assay…………………………………………………. 52 2.4.3 Optimum temperature ………………………………………………….. 52 2.4.4 pH optima………………………………………………………………. 52 2.4.5 …………………………………………………………. 52

XI TABLE OF CONTENTS

2.4.6 Enzyme kinetics study………………………………………………….. 53 2.4.7 Effect of metals, sugars and organic solvents on enzymes……………... 53 2.4.8 Thin Layer Chromatography (TLC)……………...... 53 2.5 Protein Crystallography………………………………………………… 54 2.5.1 X-ray sources……….…………………………………………………… 54 2.5.2 X-ray diffraction……………………………….………………………... 55 2.5.3 X-ray detector……………………………….………………………...... 56 2.5.4 Data processing……………………………….……………………….... 57 2.5.5 Structure determination……………………………….………………… 57 2.5.6 Model building……………………………….………………………..... 58 2.5.7 Structure refinement……………………………….…………………… 59 2.5.8 Structure validation and deposition……………………………….…… 60 2.5.9 Crystallization screens……………………………….…………………. 60 2.5.10 X-ray diffraction and preliminary data analysis……………………….. 61

CHAPTER 3: GLYCOSYL HYDROLASES AND GLYCOSYL TRANSFERASES FROM H. ORENII 3.1 Introduction……………………………………………………………. 63 3.2 Materials and Methods…………………………………………………. 78 3.3 Results…………………………………………………………………… 80

CHAPTER 3: SECTION 3.1 Biochemical And Structural Studies On β-Glucosidase From H. orenii 3.1.1 Introduction…………………………………………………………….. 84 3.1.2 Materials and Methods………………………………………………… 90 3.1.3 Results…………………………………………………………………… 97 3.1.3.1 Cloning, expression and purification…………………………………… 99 3.1.3.2 Phylogenetic analysis……………………………………………………. 101 3.1.3.3 Optimum pH and temperature of HoBglA………………………………. 102 3.1.3.4 Thermostability…………………………………………………………. 105 3.1.3.5 Effect of salts, metal ions, solvents, reagents, sugars and inhibitor on the activity of HoBglA……………………………………………………… 107 3.1.3.6 Determination of kinetic constants of preferred substrates……………. 108

XII TABLE OF CONTENTS

3.1.3.7 Substrate specificity……………………………………………………… 109 3.1.3.8 Circular dichroism measurements……………………………………….. 110 3.1.3.9 Analytical gel filtration (SEC-MALS)………………………………….. 111 3.1.3.10 Differential Scanning Fluorimetry……………………………………… 112 3.1.3.11 Crystallization…………………………………………………………… 113 3.1.3.12 X-ray diffraction and Preliminary data analysis………………………… 116 3.1.3.13 Protein-ligand docking…………………………………………………… 119 3.1.3.14 Overall structure of HoBglA……………………………………………. 121 3.1.3.15 Molecular dynamics (MD) simulation outcome…………………………. 126 3.1.3.16 Comparison with other ………………………………………… 128 3.1.3.17 PDB deposition………………………………………………………….. 128 Discussion……………………………………………………………….. 131

CHAPTER 3: SECTION 3.2 Biochemical characterization of β-galactosidase/-L-arabinopyranosidase from H. orenii 3.2.1 Introduction……………………………………………………………… 137 3.2.2 Material and Methods…………………………………………………... 143 3.2.3 Results……………………………………………………………………. 145 3.2.3.1 Phylogenetic and sequence analysis…………………………………….. 145 3.2.3.2 Gene cloning, protein over-expression and purification…………………. 146 3.2.3.3 Optimum temperature and pH…………….……………………………… 149 3.2.3.4 Thermostability…………………………………………………………. 151 3.2.3.5 Effect of metal ions, solvents and sugars on the activity of HoBglA………………………………………………………………….. 153 3.2.3.6 Determination of kinetic constants ……………………………………. 155 3.2.3.7 Substrate specificity…………………………………………………….. 156 3.2.3.8 Crystallization……..…………………………………………………….. 156 Discussion………………………………………………………………. 157

CHAPTER 3 SECTION 3.3 Preliminary crystallographic investigation of ribokinase from H. orenii 3.3.1 Introduction……………………………………………………………. 162 3.3.2 Materials and Methods…………………………………………………. 164

XIII TABLE OF CONTENTS

3.3.2.1 In silico 3D Structure prediction……………………………………….. 165 3.3.3 Results and Discussion..……………………………………………….. 166 3.3.3.1 Gene cloning, recombinant protein over-expression and purification….. 166 3.3.3.2 Crystallization…………………………………………………………… 168 3.3.3.3 HoRbk sequence and phylogenetic analysis…………...... 171 3.3.3.4 Three dimensional model analysis……………………………………… 174

CHAPTER 4: THESIS DISCUSSION AND FUTURE DIRECTIONS 4.1 Thesis Discussion ………………………..……………………………… 176 4.2 Future Directions……………………………………………………….. 178

REFERENCES………………………………………………………………………… 179

APPENDICES…………………………………………………………………….……. 210 APPENDIX I……………………………………………………………………………. 210 APPENDIX II…………………………………………………………………………… 225 APPENDIX III…………………………………………………………………………... 227 APPENDIX IV………………………………………………………………………….. 228 APPENDIX V…………………………………………………………………………… 232 APPENDIX VI………………………………………………………………………….. 237 APPENDIX VII…………………………………………………………………………. 248 APPENDIX VIII………………………………………………………………………… 249 APPENDIX IX…………………………………………………………….……………. 250 APPENDIX X…………..………………………………………………………………. 251

XIV LIST OF FIGURES

LIST OF FIGURES

CHAPTER 1 Literature Review Figure 1.1 Relation of temperature limits with life……………………………… 2 Figure 1.2 Alkaline hotspring at Yellowstone National Park, USA……………... 2 Figure 1.3 Geothermally heated hydrothermal vents……………………………. 7 Figure 1.4 Schematic presentation of the action of starch degrading enzymes…. 16 Figure 1.5 Overview of Chapter 2 and 3………………………………………… 33 CHAPTER 2 General Materials and Methods Figure 2.1 A flow diagram of procedures used in this chapter…………………... 35 Figure 2.2 Electromagnetic spectrum scale with X-rays region…………………. 55 Figure 2.3 A diagrammatic representation of X-ray diffraction pattern…………. 56 Figure 2.4 Diagram of crystallographic methods followed in this study………... 61 CHAPTER 3 Glycosyl Hydrolases and Transferases from H. orenii Figure 3.1 Fructose metabolism pathways with reference to H. orenii………….. 68 Figure 3.2 Schematic representation of starch degradation……………………… 69 Figure 3.3 Starch and sucrose metabolism pathways with reference to H. orenii. 70 Figure 3.4 Phylogenetic analysis of amy2 from H. orenii……………………….. 71 Figure 3.5 Phylogenetic analysis of glucosylceramidase from H. orenii………... 73 Figure 3.6 Phylogenetic analysis of pfkA from H. orenii……………………….. 76 Figure 3.7 PCR amplified genes on agarose gel…………………………………. 80 Figure 3.8 SDS-PAGE images of various recombinant proteins from H. orenii... 83 CHAPTER 3 SECTION 3.1 Biochemical and structural investigation of β-glucosidase from H. orenii Figure 3.1.1 Molecular mechanism of cellulose hydrolysis …………………….. 86 Figure 3.1.2 Overview of methods used for protein investigation of from H. orenii…………………………………………………………………………... 95 Figure 3.1.3 Overview of protein crystallization and structural investigation….. 96 Figure 3.1.4 PCR amplification of bglA gene…………………………………... 97

XV LIST OF FIGURES

Figure 3.1.5 Cleanup of PCR amplified bglA gene after RE digestion………….. 98 Figure 3.1.6 Colony PCR screening for bglA gene in recombinant plasmid ……. 98 Figure 3.1.7 HoBglA over-expression in BL21 (DE3) and Rosetta 2 cells……... 99 Figure 3.1.8 HoBglA purification using Nickel affinity chromatography………. 99 Figure 3.1.9 Phylogenetic analysis of β-glucosidase from H. orenii……………. 101 Figure 3.1.10 Optimum temperature of HoBglA as β- glucosidase, β- fucosidase and β- galactosidase………………………………………………. 102 Figure 3.1.11a. pH optima of -glucosidase…………………………………….. 103 Figure 3.1.11b. pH optima of -fucosidase. …………………………………….. 103 Figure 3.1.11c. pH optima of - galactosidase activity..………………………... 104 Figure 3.1.12a. Temperature stability of HoBglA (β- glucosidase and β- galactosidase) with oNPG and pNPGal as substrate at …..……………………… 105 Figure 3.1.12b. Temperature stability of HoBglA (β- fucosidase) against pNPF as substrate at different temperatures. …………………………………………… 106 Figure 3.1.13 Effects of various metals on catalytic activity of HoBglA……….. 107 Figure 3.1.14 Kinetics studies of HoBglA….…………………………………… 108 Figure 3.1.15a. TLC of cellobiose hydrolysis...... ……………………………. 109 Figure 3.1.15b. TLC of lactose hydrolysis ……………………………………… 109 Figure 3.1.16 Circular dichroism (CD) measurement graphs…………………… 110 Figure 3.1.17 Size exclusion chromatography result of HoBglA……………….. 111 Figure 3.1.18 Differential Scanning Fluorimetry results graph…………………. 112 Figure 3.1.19 Crystals obtained from H. orenii β-glucosidase grown under different conditions………………………………………………………………. 115 Figure 3.1.20 HoBglA-ligand molecular docking results……………………….. 120 Figure 3.1.21 Cylindrical representation of H. orenii β-glucosidase structure…. 122 Figure 3.1.22 Structural comparison of HoBglA……………………………….. 123 Figure 3.1.23 Loop comparison of HoBglA with 1CBG………………………… 124 Figure 3.1.24 A view of nickel binding residues with electron density……….. 125 Figure 3.1.25 HoBglA MD simulation curves …………………………………. 127

XVI LIST OF FIGURES

CHAPTER 3 SECTION 3.2 Biochemical characterization of β-galactosidase/ -L-arabinopyranosidase from H. orenii Figure 3.2.1 Structure of synthetic substrates hydrolyzed by HoArap…………... 138 Figure 3.2.2 Hydrolysis mechanisms for a) retaining and b) inverting reaction… 140 Figure 3.2.3 Flowchart showing different methods used to study HoArap….…... 143 Figure 3.2.4 Phylogenetic analyses of confirmed GH43 members and HoArap... 145 Figure 3.2.5 PCR amplification of arap gene…………………………………… 146 Figure 3.2.6 Clean-up of PCR amplified arap gene after RE digestion………… 147 Figure 3.2.7 Colony PCR screening for arap gene in recombinant plasmid ……. 147 Figure 3.2.8 HoArap over-expression in BL21 (DE3) …………………...……... 148 Figure 3.2.9 HoArap purification using Nickel affinity chromatography………. 148 Figure 3.2.10 Optimum temperature of - arabinofuranosidase ……………..… 149 Figure 3.2.11a. pH optima of HoArap against pNP-β-D-galactopyranoside…… 150 Figure 3.2.11b. pH optima of HoArap against pNP--L-arabinopyranoside…… 150 Figure 3.2.12a. Temperature stability of HoArap against pNP-β-D galactopyranoside at different temperatures…………………………………… 151 Figure 3.2.12b. Temperature stability of HoArap against pNP--L arabinopyranoside at different temperatures…………………………………… 152 Figure 3.2.13 Effects of various metals and reagents on catalytic activity of HoArap ……………………………..……………………………………………. 153 Figure 3.2.14 Enzyme kinetic studies of HoArap ………………………………. 155 Figure 3.2.15 TLC of lactose hydrolysis ………………………………………... 156 CHAPTER 3 SECTION 3.3 Preliminary crystallographic investigation of ribokinase from H. orenii Figure 3.3.1 Flowchart showing general methods used to investigate ribokinase (sugar kinase) from H. orenii…………………………………………………..... 164 Figure 3.3.2 PCR amplified rbk gene and clean-up of PCR product after RE digestion…………………………………………………………………………. 166 Figure 3.3.3 RE digested recombinant plasmid with insert…………………….. 167 Figure 3.3.4 Purification of HoRbk using Nickel affinity chromatography …… 167 Figure 3.3.5 H. orenii recombinant ribokinase protein crystals grown during initial screen………………………………………………………………………. 170

XVII LIST OF FIGURES

Figure 3.3.6 Dendogram showing the phylogenetic position of ribokinase from H. orenii to its closest relatives………………………………………………….. 172 Figure 3.3.7 Amino acid sequence alignment of ribokinase from H. orenii, based on BLASTp against PDB database……………………………………..... 173 Figure 3.3.8 In silico modeled 3D structure of H. orenii ribokinase……………. 175

XVIII LIST OF TABLES

LIST OF TABLES

CHAPTER 1

Literature Review

Table 1.1 Classification and examples of extremophiles………………………... 4 Table 1.2 Complete genome sequence of hyperthermophiles…………………… 6 Table 1.3 Some important features of H. orenii, thermophiles and …. 10 Table 1.4 Thermophilic and hyperthermophilic enzymes with applications as molecular biology reagents………………………………………………………. 12 Table 1.5 Thermophilic, mesophilic and psychrophilic enzymes with applications in starch processing…………………………………………………. 19 Table 1.6 Properties of the Halothermothrix orenii genome…………………….. 31 CHAPTER 2

General Materials and Methods

Table 2.1 Bioinformatic tools used so study genes and proteins from H. orenii… 37 Table 2.2 Plasmid and bacterial strains used for gene cloning and protein over- expression………………………………………………………………………… 44 Table 2.3 Synthetic substrates used to study enzyme activities…………………. 51 CHAPTER 3

Glycosyl Hydrolases and Transferases from H. orenii

Table 3.1 Glycoside hydrolases and transferases from H. orenii………………… 64 Table 3.2 Substrates and media used for detecting the enzymatic activities……. 78 CHAPTER 3 SECTION 3.1

Biochemical and structural studies on β-glucosidase from H. orenii

Table 3.1.1 Comparative chart of β-glucosidase enzyme from different sources showing β-fucosidase and β-galactosidase activity……………………………… 87 Table 3.1.2 Purification of BglA recombinant protein from H. orenii………….. 100 Table 3.1.3 Optimization conditions to obtain high X-ray diffraction quality crystal from HoBglA protein…………………………………………………….. 114 Table 3.1.4 Diffraction data statistics from HoBglA crystals...... ………………... 117

XIX LIST OF TABLES

Table 3.1.5 HoBglA molecular docking outcome……………………………… 119 Table 3.1.6 Comparative analysis of β-glucosidases 3D structures from different sources…………………………………………………………………………… 129 CHAPTER 3 SECTION 3.2

Biochemical characterization of β-galactosidase/ -L-arabinopyranosidase from H. orenii

Table 3.2.1 Properties of β-galactosidase/-L-arabinopyranosidase from sources 141 Table 3.2.2 HoArap activity against pNP-galactopyranoside in presence and absence of CaCl2 and MnCl2 (1mM) over time………………………………… 154 Table 3.2.3 HoArap activity against pNP-arabinopyranoside in presence and absence of CaCl2 and MnCl2 (1mM) over time………………………………… 154 CHAPTER 3 SECTION 3.3

Preliminary crystallographic investigation of ribokinase from H. orenii

Table 3.3.1 Data collection statistics from HoRbk crystals…………………….. 169 APPENDICES Table A.1 List of primers used to amplify specific genes from H. orenii genomic DNA……………………………………………………………………. 225 Table A.2 Crystallization conditions that yielded HoBglA crystals…………….. 232 Table A.3 Crystallization conditions that yielded HoRbk crystals………………. 237

XX LIST OF ABBREVIATIONS

LIST OF ABBREVIATIONS

aa Amino acid Amp Ampicillin bp Base pairs BASys Bacterial Annotation System BLASTp Basic Local Alignment Search Tool for protein BSA Bovine Serum Albumin Cam Chloramphenicol CD Circular Dichroism DNA Deoxyribonucleic Acid DNase I Deoxyribonuclease I DSF Differential Scanning Fluorimetry DTT Dithiothreitol dH2O Distilled water EDTA Ethylene diamine tetraacetic acid EGTA Ethylene glycol tetraacetic acid EtBr Ethidium Bromide GH Glycosyl Hydrolases GT Glycosyl Transferases HEPES N-(2-hydroxytheyl) piperazine-N’-(2-ethanesulfonic acid) H. orenii Halothermothrix orenii Halothermothrix orenii β-galactosidase/ HoArap -L-arabinopyranosidase HoBglA Halothermothrix orenii β-glucosidase A HoRbk Halothermothrix orenii ribokinase IMAC Immobilized Metal Affinity Chromatography IPTG Isopropyl β-D-1-thiogalactopyranoside IUPAC International Union of Biochemistry and Molecular Biology kDa kilo Dalton

Km Michaelis constant LB Luria Bertani (media) LDS Lauryl Dodecyl Sulphate

XXI LIST OF ABBREVIATIONS

MOPS 3-(N-morpholino) propanesulphonic acid MPD 2-methyl-2,4-pentanediol MR Molecular Replacement NA 1-Naphtahyl acetate NCBI National Center for Biotechnology Information OD Optical Density ORF Open Reading Frame oNPG Ortho Nitrophenyl β-D-glucopyranoside pNPG Para Nitrophenyl β-D-glucopyranoside pNPGal Para Nitrophenyl β-D-galactopyranoside pNPAf Para Nitrophenyl α-L- arabinofuranoside pNPAp Para Nitrophenyl α-L-arabinopyranoside pNPAF Para Nitrophenyl α-D-fucopyranoside pNPBF Para Nitrophenyl β-D-fucopyranoside pNPGal Para Nitrophenyl β-D-galactopyranoside pNPM Para Nitrophenyl β-D-mannopyranoside pNPAX Para Nitrophenyl α-D xylopyranoside pNPBX Para Nitrophenyl β-D-xylopyranoside PAGE Poly Acrylamide Gel Electrophoresis PEG Polyethylene glycol PGA Phenyl-β-D-glucuronic acid monohydrate PCR Polymerase Chain Reaction RMSD Root Mean Square Deviation SDS Sodium Dodecyl Sulphate TLC Thin Layer Chromatography TYEG Tryptone Extract Glucose (media) U Unit (s)

Vmax Maximal velocity v/v Volume per volume w/v Weight per volume

XXII CHAPTER 1: LITERATURE REVIEW

CHAPTER 1

LITERATURE REVIEW

______CHAPTER 1: LITERATURE REVIEW

1.1 Extremophiles

Life exists in several ecological niches under extreme physical and chemical conditions such as temperature (Figure 1.1), pH, pressure, and radiation. In particular, when microbial life is found under extreme conditions then such microbes are known as extremophiles (Table 1.1) and their enzymes are termed as extremozymes. Microorganisms that are adapted to a variety of extreme conditions such as high temperature and high alkalinity are known as polyextremophiles e.g. Halothermothrix orenii, Chromohalobacter Sp. TVSP101. They adapt to tolerate environmental extremes, but these conditions are vital for their growth and survival (Madigan and Oren, 1999; Vidyasagar et. al., 2009). Polyextremophiles use several molecular and structural adaptations to survive and actively grow and propagate in such environments. Extremophilic macromolecules especially proteins, have similar folds and molecular functions to those found in the mesophilic proteins. Only certain features make the extremophilic proteins stable and active in extreme conditions (Sivakumar et. al., 2006). Their study has provided the basics of molecular biology and has attracted researchers to use them as models of primitive life forms and use their biomolecules in both basic and applied research (Madigan and Oren, 1999; Kumar et. al., 2011).

Only a limited number of the microorganisms from widely diverse environments have been exploited and extremophiles are one of them. Extremophilic microbes require intensive research work to understand mechanisms behind the functions performed by enzymes at high temperature adaptations, structure-function relationship in comparison with their mesophilic homologues and their use for mankind and environmental protection (Kumar et. al., 2011).

The most famous example of a thermostable enzyme is Taq polymerase from the organism aquaticus, a remarkable discovery of Thomas Brock in 1966, from the hot springs of Yellowstone National Park (Figure 1.2). This enzyme has optimal activity at 80 C (Kristjansson, 1992). Its use in Polymerase Chain Reaction (PCR) has revolutionized the research in molecular biology and industrial biotechnology.

1 CHAPTER 1: LITERATURE REVIEW

s

-

hermophile

Hyper t

Thermophiles

ophiles

Psychro

Figure 1.1 Relation of temperature limits to life. Highest and lowest temperature range of , mesophiles, thermophiles and hyperthermophiles is shown. (adapted and modified from Rothschild and Mancinelli, 2001).

Figure 1.2 Alkaline hotspring at Yellowstone National Park, USA. Temperature varies from 65 C to >95 C, various thermophiles and hyperthermophiles have been isolated from here including well known (adapted from Rothschild and Mancinelli, 2001).

2 CHAPTER 1: LITERATURE REVIEW

This thesis focuses on enzymes evolved or adapted under extreme condition of high temperature and high salt concentration. These enzymes are secreted by microorganisms well adapted to high temperature and high alkalinity in their natural habitat. Raised temperature and high alkalinity are some of the known extreme conditions in which some microorganisms are able to live without any problem; this has been brought about by their highly developed adaptation strategies during the evolutionary process. Enzymes that function optimally under extreme conditions are termed extremozymes. With the help of extremozymes microorganisms are able to adapt to various extreme environments, including physical, chemical or biological variations such as temperature, pressure, radiation, pH, desiccation, salinity, redox potential, nutritional limits, parasites or excesses of population density (Rothschild and Mancinelli, 2001; Gomes and Steiner, 2004).

3 CHAPTER 1: LITERATURE REVIEW

Table 1.1 Classification and examples of extremophiles

Environmental Type Definition Examples parameter Temperature Hyperthermophiles Growth >80C , 113C Growth 60-80C Synechococcus lividis 15-60C Homo sapiens <15C Psychrobacter, some insects

Radiation radiourans Pressure Barophile Weight loving Unknown Pressure loving For microbe, 130MPa Gravity Hypergravity >1g None known Hypogravity <1g None known Vacuum Tolerates vacuum Tardigrades, insects, microbes, (space devoid of seeds matter) Desiccation Anhydrobiotic Artemia salina; nematodes, microbes, fungi, lichens Salinity Salt loving , (2-5 M NaCl) Dunaliella salina pH pH > 9 Natronobacterium, Bacillus low pH loving firmus OF4, Spirulina spp. (all pH 10.5) Cyanidium caldarium, Ferroplasma sp. (both pH0) Oxygen tension Anaerobe Cannot tolerate O2 janaschii Microaerophile Tolerate some O2 Clostrium Aerobe Requires O2 H. sapiens Chemical extremes Gases C. caldarium (pure CO2) Metals Can tolerate high Ferroplasma acidarmanus concentrations of (Cu, As, Cd, Zn); Ralstonia sp. metals CH34 (Zn, Co, Cd, Hg, Pb) (metalotolerant) (Rothschild and Mancinelli, 2001)

4 CHAPTER 1: LITERATURE REVIEW

1.2 Hyperthermophiles and Thermophiles

Thermophiles are microorganisms with optimal growth temperatures between 60 - 80 C and those which grow above these temperatures are known as hyperthermophiles such as Thermococcus baraphilus (Vannier et. al., 2011). These are isolated from a number of marine and terrestrial geothermally heated habitats including shallow terrestrial hot springs, systems, sediment from volcanic islands and deep sea hydrothermal vents. Thermophilic extremophiles (e.g. Caldicellulosiruptor saccharolyticus, C. hydrothermalis, C. kristjanssonii, C. kronotskyensis, C. owensensis, C. lactoaceticus) have attracted most of the researchers and commercial group’s attention to study and use them for various research and industrial applications (Willquist et. al., 2010; Kumar et. al., 2011; Schuette et. al., 2011).

Complete genome sequences of 10 (Table 1.2) from archaeal and bacterial hyperthermophiles including sulphate reducers, , aerobic hydrogen oxidizing and heterotrophic bacteria and suggests their significant value in fundamental research to understand primitive and present day life form (Atomi, 2005). Hyperthermophiles also carry novel enzymes and metabolic pathways for breakdown of various carbon compounds including polysaccharides. They promise to be a valuable source of enzymes for biotechnology, metabolic engineering related to biomass conversion and biofuel production (Atomi et. al., 2011).

In 1966, Thomas Brock made the remarkable discovery that microorganisms were growing in the boiling hot springs of Yellowstone National Park (Figure 1.1) (Brock, 1969). Since then, thermophiles have been discovered in several geothermal regions all over the world including areas in Iceland, Kamchatka, New Zealand, Australia, Italy, India, Saudi Arabia and other locations. Boiling hot springs are far beyond the comfort zone of humans and other animals, only archaeal and prokaryotic life is able to survive in such environments (adapted from Microbial Life Educational Resources, Montana University, USA, http://serc.carleton.edu/microbelife/extreme/extremeheat/; Bisht and Panda, 2011; Khalil, 2011).

5 CHAPTER 1: LITERATURE REVIEW

All thermophiles require a heated environment, but some need more than one extreme such as high levels of sulphur or calcium carbonate, acidic water or alkaline springs. The ability of extremophiles to thrive in extremely high temperatures is due to the special features carried by amino acids of extremozymes to retain their twisted and folded tertiary structure at high temperatures, where other enzymes would unfold and no longer work. Heat stable enzymes of thermophiles have proven to be very important in the field of biotechnology. For example, two thermophilic species Thermus aquaticus and Thermococccus citoralis are used as sources of the enzyme DNA polymerase, for the PCR. Recently a novel cellulolytic Clostridium thermocellum genome had been sequenced. Enzymes from this bacterium can be significantly useful to basic and applied research (Feinberg et. al., 2011). Thermophilic proteins are widely studied extremophilic proteins and from these studies it appears that thermal stability is not determined by any single factor but the combination of several factors, each with a relatively small effect (Vieille and Zeikus, 2001; Chan et. al., 2011; Goldstein, 2011; Lam et. al., 2011).

Table 1.2 Complete genome sequence of hyperthermophiles Hyperthermophiles Genome size (bp) ORF number Archaea pernix K1 1 669 695 2694 Archaeoglobus fulgidus VC-16 2 178 400 2436 Methanocaldoccus jannacshii DSM 2261 1 739 933 1738 kandleri AV19 1 694 969 1692 Nanoarchaeum equitans Kin4-M 490 885 552 Pyrobaculum aerophilum IM2 2 222 430 2587 abyssi GE5 1 765 118 1788 DSM3838 1 908 253 2208 OT3 1 738 505 2061 Sulfolobus solfataricus P2 2 992 245 2977 Sulfolobus tokodaii 7 2 694 757 2826 Thermococcus kodakaraensis KOD1 2 088 737 2306 Bacteria aeolicus VF5 1 551 335 1512 Thermotoga maritima MSB8 1 860 725 1877 (Atomi et. al., 2005)

6 CHAPTER 1: LITERATURE REVIEW

1.2.1 Thermophilic and Hyperthermophilic prokaryotes and their habitats

The most extreme of all discovered hyperthermophiles is Pyrolobus fumarii. This has the highest optimal growth temperature, grows at 113 C and inhabits submarine hydrothermal vents (Figure 1.3). Cultures of P. fumarii remain viable following a one hour treatment in the autoclave (121 C). The Methanopyrus kandleri which grows at temperature to 110 C was isolated from the wall of the ‘black smoker’ hydrothermal vent chimneys where it reaches 108 cells g-1 of chimney rock (Kelley, 2005; Sleep et. al., 2011).

In a phylogenetic sense, species of hyperthermophiles typically reside on rather short branches that emerge near the root of their respective . The “Korarchaeota”, a newly recognized domain of archaea, also represents an early branching group of hyperthermophiles. These findings suggest that hyperthermophiles are the least evolved of extant organisms and share more features with early life forms than do any other extant prokaryotes (Madigan and Oren, 1999; Atomi, 2005; Andam and Gogarten, 2011).

A B Figure 1.3 Geothermally heated hydrothermal vents. A) Black smoker from Sully Vent in the Main Endeavour Vent Field, NE Pacific and B) White smoker from Northwest Eifuku volcano. (Images adapted from http://www.pmel.noaa.gov, http://en.wikipedia.org/wiki/File:Champagne_vent_white_smokers.jpg).

7 CHAPTER 1: LITERATURE REVIEW

1.2.2 Molecular adaptations to thermophily

For an organism to survive at higher temperatures, its major macromolecular components including proteins, nucleic acids and lipids, must be heat stable. The thermostability of enzymes from various hyperthermophiles can be active up to 140 C. There are no known rules that govern thermostability, structural studies of several thermostable protein points to a small number of non-covalent features, including highly polar cores, reduced surface to volume ratios, decreased glycine contents and a high number of ionic interactions that are extensively correlated with thermostability (Yan et. al., 2011). The hydrophobic core helps to exclude solvent from the internal regions of the protein, making it internally ‘sticky’ and making the protein more resistant to unfolding (Tai et. al., 2011). A small surface to volume ratio is ultimately a function of folding and probably improves stability by conferring a compact form on the protein. In addition to their intrinsic stability factors, protein thermostability can also be facilitated by extrinsic factors such as molecular chaperonins. Chaperonins like the thermosomes to bind heat denatured proteins thereby preventing their aggregation and refolding them into their active form (Vabulas et. al., 2010).

There are several other factors that in combination provide heat stability to DNA in hyperthermophiles including high levels of K+, reverse DNA gyrase and histone or other DNA binding proteins. Positive supercoiling of DNA may be an important factor providing stability to DNA under high temperature. A unique type I topoisomerase called reverse DNA gyrase has been found in all hyperthermophilic archaea and bacteria that catalyses the positive supercoiling of closed circular DNA. Positively supercoiled DNA is more resistant to thermal denaturation than negatively supercoiled DNA (Madigan and Oren, 1999).

1.2.3 Halophilic microorganisms and their habitats

The halophilic bacteria are a group of microorganisms that grow at very high levels of salinity. They require up to 25 % NaCl and their natural habitats are salt lakes and sea water evaporation ponds. Salt lakes and brines are closed and stable water systems, in which intensive insulation and a reduced circulation cause temperatures of up to 70 C.

8 CHAPTER 1: LITERATURE REVIEW

It is therefore likely that organisms which are growing in this biotype are both halophilic and thermophilic. The intracellular salt concentration in the cells of extremely halophilic bacteria varies with the growth phase but is always higher than the lowest concentration. Halophiles are found in all three domains of life: bacteria, archaea and eukarya. Bacteria of genus Chromohalobacter, Halomonas can adapt to life over a wide salt range, whereas others such as archaea of the order are adapted to life in near saturated brines and are unable to grow and even die below 15-20 % NaCl (DasSarma and Arora, 2001; Oren, 2008; Vidyasagar et. al., 2011). The diversity of known halophilic microorganisms has increased tremendously in past years. On the basis of comparative sequencing of 16S rRNA the halobacteria appear to be a diverse group and extensive taxonomic rearrangements have been proposed (Madigan and Oren, 1999). Recently, thermophilic halotolerant Aeribacillus pallidus TD1 from Tao dam hotspring, Thailand have been reported, that grows optimally at 50-60 C in 10 % NaCl (Yasawong et. al., 2011).

1.2.4 Molecular adaptation to halophilism

Fundamentally two strategies enable microorganisms to cope with osmotic stress. The first, used by halobacteria and by a group of anaerobic halophilic bacteria, involves accumulation of inorganic salts in the cytoplasm. K+ rather than Na+ is the dominant intracellular cation while Cl- the dominant anion. In halobacteria the concentration gradient of Na+ in the cytoplasmic membrane is created by action of the Na+/H+ antiporter transporter system. The energy required to expel out Na+ from the cell is supplied by the proton gradient formed during respiratory electron transport or by the light driven proton pump . K+ ions probably enter the cell passively in response to the membrane potential. Two inward chloride transporters have been identified in : the first halorhodopsin is light dependent and the second is probably energized by symport with Na+. Proteins of the halobacteria contain more acidic amino acids and less basic amino acid. The high cation concentration within the cell may be required to shield the negative charges on the protein surface. It is suggested that the acidic residues might be important in preventing aggregation rather than contributing to intrinsic protein stability. The second option for adaptation to life at high salt involves the maintenance of a cytoplasm in low salt concentration. Organic osmotic

9 CHAPTER 1: LITERATURE REVIEW

“compatible” solutes provide osmotic balance while enabling activity of conventional, non-salt adapted enzymes (Madigan and Oren, 1999; Gunde-Cimerman et. al., 2005; Shivanand and Mugeraya 2011). Some important features of thermophiles and halophiles identified in the halothermophile H. orenii are given in Table 1.3.

Table 1.3 Some important features of H. orenii, thermophiles and halophiles Features H. orenii Thermophiles Halophiles Optimum growth 60C 50C 36-55C temperature pH 7.0 6.5-7.5 > 8.5 NaCl 10% - 20% Structural - Lipid bilayer structure Lipid bilayer adaptation structure Cellular - salt-out strategy - molecular -salt in strategy adaptation - sucrose phosphate chaperonins - compatible solute synthase maintains strategy osmotic balance - molecular chaperonins Molecular - low no. of positively - Excess glutamate, - very high adaptation charged amino acid than valine, tyrosine & proportions of halophiles proline residues positively charged - L- isoaspartate (D - Salt bridges, acidic amino acids aspartate) O- hydrogen bonds, methyltransferase repairs packing density, good damaged proteins structural - Salt bridges, hydrogen conformation bonds, packing density, good structural conformation

(Kristjansson et. al., 1992, Cayol et. al., 1994, Mavromatis et. al., 2009)

10 CHAPTER 1: LITERATURE REVIEW

1.3 Thermozymes

Enzymes from hyperthermophiles show their optimal activity at high temperature (sometimes higher than the optimal growth temperature of the organism) against their substrate. These thermostable enzymes show irreversible protein denaturation only at high temperature (Bruins et. al., 2001). Such enzymes are known as thermozymes, as their sources are thermophiles and hyperthermophiles which live at temperatures above 60C. Recently, a novel thermozyme poly-ADP-ribose polymerase has been biochemically characterized with optimal activity at 80 C, half-life at 100 C for 1.5 hr and pH 7.2 (Maio et. al., 2011). Willies and associates in 2010 reported an aldo-keto reductase from T. maritima with optimal activity at 70 C (Willies et. al., 2010). Some of the most important thermozymes are given in Table 1.4.

At present, very few enzymatic processes utilize thermophilic and hyperthermophilic enzymes at an industrial level. Recent characterizations of enzymes and the development of potential tools for protein engineering indicate that thermophilic and hyperthermophilic enzymes will be used in more industrial applications (Atomi, 2005; Waters et. al., 2010; Atomi et. al., 2011). The role of Thermozymes has expand in new frontiers of microbiology, biochemistry and biotechnology, including: (i) molecular determinants for protein thermostability at high temperatures are now being recognized, (ii) the discovery of thermophilicity domains in thermozymes suggests that stability and activity can be encoded by separate molecular determinants, (iii) thermozymes are now becoming protein chemistry tools because they are easy to produce, purify and crystallize, (iv) they are catalysts of choice for developing new industrial processes as they are intrinsically very stable and genetic techniques can be used to alter their optimum temperature and specific activity, (v) the use of thermozymes in diagnostics and reagents may expand beyond DNA polymerase and ligase because of reagent stability and extended shelf life. Thermozymes can be stored at room temperature instead of in a deep freezer, making them more convenient for transport and use (Turner et. al., 2007). Future genetic engineering of thermozymes may yield catalysts with high stability and temperature activity optima designed to meet the exact reaction temperature desired (Zeikus et. al., 1998; Haki and Rakshit, 2003). Some major thermozymes from thermophiles and hyperthermophiles are useful in several

11 CHAPTER 1: LITERATURE REVIEW biotechnological research applications such as thermostable protein nanostructure design, for plant engineering (stress tolerance, temperature resistance), biomass conversion, cell engineering and biofuel production (Atomi et. al., 2011) (Table 1.4).

Table 1.4 Thermophilic and hyperthermophilic enzymes with applications as molecular biology reagents. Enzymes Origin Applications Properties Taq polymerase T. aquaticus PCR technologies Optimal activity at 75C, pH 9.0 Vent DNA Optimal activity at 75C, T. litoralis polymerase proofreading activity Deep vent DNA Optimal activity at 75C, proof P. furiosus polymerase reading activity;t1/2, 23h (95C) Reverse transcriptase activity, C. therm DNA Carboxydothermus Reverse 3’5’ proofreading activity; polymerase hydrogenoformans transcription- PCR optimal activity as 60-70C Thermus DNA polymerase Reverse transcriptase activity thermophilus Ligase chain Active at 45-80C; t1/2 >60 min Pfu DNA ligase P. furiosus reaction and DNA (95C) ligations Thermus Ligase chain Tcs DNA ligase Optimal activity at 45C scodoductus reaction Time reducing and specificity enhancing in Sequence aspecific DNA DNA-DNA DNA binding binding; ATP independent, S. solfataricus hybridizations; protein Scod7 homology dependent DNA locking of annealing at 60C antisense oligonucleotide to target sequence DNA and RNA Optimal activity at 90C, pH Serine protease Thermus strain purifications; 8.0; t1/2, 20 min (90C) (PRETAQ) Rt41A cellular structures (+CaCl2)

12 CHAPTER 1: LITERATURE REVIEW

degradation prior to PCR Protein Optimal activity at 85-95C, pH Protease S P. furiosus fragmentation for 6.0 – 8.0; 80% active after 3 h sequencing (95C) Cleavage of N- Optimal activity at 85- 95 C, Methionine P. furiosus terminal Met in pH 7.0 – 8.0; stable for 1 h aminopeptidase proteins (75C) Cleavage in N Optimal activity at 95- 100C, Pyroglutamate terminal L- P. furiosus pH 6.0- 9.0; 95% active after 2.5 aminopeptidase pyroglutamate in h (75C) proteins Broad specificity(can release C- terminal basic, acidic and aromatic Carboxypeptidase S. solfataricus sequencing residues) stable in solvents at 45C Enzyme labelling Alkaline applications where Optimal activity at 85C, pH T. neapolitana phosphatase high stability 9.9; t1/2, 4h (90C) (+Co2+) required

(Vieille and Zeikus, 2001)

1.4 Applications in Molecular Biology

1.4.1 DNA polymerases

To date, numerous DNA polymerases have been characterized from various thermophiles and hyperthermophiles. However, Taq polymerase from T. aquaticus has played the most critical role in the development of PCR technology. This technology uses two properties of DNA polymerases: processivity and fidelity, for its multiple applications. Taq DNA polymerase synthesizes DNA faster, but it lacks 3’ – 5’ proofreading exonuclease activity. The high processivity of Taq DNA polymerase makes it the enzyme of choice for detection procedures and sequencing.

13 CHAPTER 1: LITERATURE REVIEW

1.4.2 DNA ligase

DNA ligases from thermophiles are now commercially available (e.g. Taq ligase and Ampligase). Thermostable DNA ligase has been reported from hyperthermophilic Aquifex pyrophilus with optimal activity between 45 and 80 C (Lim et. al., 2001). These enzymes are used in ligation of adjacent oligonucleotides that are hybridized to the same DNA of interest. DNA ligase can be used for ligase chain reaction for mutational studies or gene synthesis (Pray, 2008).

1.4.3 Other enzymes

For molecular biology and biochemical applications, a large number of thermophilic and hyperthermophilic proteases are used. Various thermophilic proteins resist proteolytic activity at moderate temperature (20 – 60 C). They unfold or become sensitive to proteolytic attack above 70 C. Some proteases like the Thermus Rt41A serine proteases PRETAQ which get rapidly inactivated by EGTA are useful in RNA and DNA purification methods. Other thermophilic proteases are used in protein N- or C- terminal sequencing. Several thermophilic restriction endonucleases are commercially available, which are isolated from Bacillus, Pyrococcus and Thermus strains and show optimum activity at 50 - 65 C (Hjorleifsdottir et. al., 1996; Morgan et. al., 1998).

1.5 Applications in starch processing

During starch processing, glucose, maltose or oligosaccharide syrups are obtained as products after starch hydrolysis. These products are then used in to obtain a number of different chemicals (e.g. ethanol, lysine and citric acid). Starch bioprocessing involves liquefaction and saccharification at high temperatures. In liquefaction processes, starch is gelatinized in a jet cooker at 105 – 110 C for 5 min in aqueous solution (pH 5.8 - 6.5) and partially hydrolysed at -1-4 bonds with thermostable -amylase at 95 C for 2-3 hr. The pH is adjusted to 4.2 - 5.0 and

14 CHAPTER 1: LITERATURE REVIEW temperature is lowered to 55 - 60 C for saccharification. During this step, liquefied starch is converted into low molecular weight saccharide and finally into glucose and maltose, as shown in Figure 1.4. Glucose syrups are produced using pullulanase and glucoamylase and by using pullulanase and -amylase maltose syrups are produced.

There is a need for thermophilic pullulanase, isoamylase, -amylase and glucoamylase enzyme, which can operate above 100 C, in industrial applications. The enzymes used today for industrial starch processing are of mesophilic origin and are marginally stable at 60 C. There are many benefits to raising the temperature during saccharification processes such as high substrate concentration, low viscosity and reduced pumping costs, decreased risks of bacterial contamination, reaction rate increasing with lowering of operation time, reduction of enzyme purification cost and increased enzyme thermostability which will improve the half-life of the catalyst.

1.5.1 Alpha amylases

Alpha amylases from thermophiles and hyperthermophiles which grow at 80 - 110 C have optimum activity in this temperature range. For starch liquefaction processes enzymes with optimal activity at 100 C are required at pH 4.0 - 5.0, but the enzyme should not require Ca2+ as a supplement for its stability. With the recent developments of enzyme characterization, -amylase with such features will be available in the near future. Initially, some -amylase were believed to be calcium independent (Table 1.4). EDTA has no effect on -amylase activity and stability below 90 C, isolated from P. furiosus. However treatment with EDTA for 30 min at 90 C and extensive dialysis removes active sites of the enzyme linked with Ca2+ thereby partially inactivates the enzyme. The enzyme becomes fully active again when treated with CaCl2 at 90 C. -amylase from P. furiosus is highly stable and remains active at 100 C in the absence of Ca2+, and it can be used in liquefaction even in the absence of Ca2+ (Vieille and Zeikus, 2001).

15 CHAPTER 1: LITERATURE REVIEW

Figure 1.4 Schematic presentation of the action of starch degrading enzymes (Hobel, 2004).

1.5.2 Beta amylases

Thermophilic -amylases can be used to improve the starch saccharification if they show activity at acidic pH with increased reaction rate. When this reaction occurs the chances of unwanted side reactions are low. -amylase from T. thermosulfurigenes is a good option to choose, as it is thermostable and retains 70 % activity at pH 4.0. Another -amylase from T. maritima is also a potential enzyme functions optimally at 95 C at pH 4.3 - 5.5 (Vieille and Zeikus, 2001).

16 CHAPTER 1: LITERATURE REVIEW

1.5.3 Glucoamylases and - glucosidases

Hyperthermophiles use -amylase and amylopullulanase to hydrolyse starch while the smaller unit oligosaccharides are further degraded by -glucosidase. In these microorganisms, glucoamylase is rare and has been reported in very few anaerobes. A potential glucoamylase gene has been identified in M. janaschii genome. More work is required to determine the functionality of the gene. Another glucoamylase from T. thermosaccharolyticum is stable and remains active at 70 - 75 C during the starch saccharification process (Vieille and Zeikus, 2001).

1.5.4 Pullulanase and amylopullulanases

Type I Pullulanase is able hydrolyses -1-6 glucosidic linkage and used in starch saccharification. Previously, type I pullulanase was known from mesophilic organisms but is now reported in thermophiles. Some pullulanase properties are listed in Table 1.4. The only Pullulanase characterized from a hyperthermophile was from Thermotoga maritima (Kriegshauser and Liebl, 2000). They are optimally active at acidic pH and seem to be compatible to temperature and pH requirements with thermophilic glucoamylases and -amylases.

Type II amylopullulanase have dual specificity for  1-4 and  1-6 glucosidic linkages in starch and thus cannot be used for hydrolysing maltose and glucose syrup. But they are suggested as alternatives of -amylases during starch liquefaction for producing fermentable syrups. Some major end products of starch hydrolysis obtained using amylopullulanase, include maltose, maltotriose and maltotetraose. Amylopullulanase have been suggested as catalysts in a one-step liquefaction-saccharification process as they have been purified from the hyperthermophiles viz., P. furiosus, T. litoralis and T. hydrothermalis and are active at 120 C and low pH. They are highly thermostable and are thus more convenient for this process (Table 1.5) (Vieille and Zeikus, 2001).

17 CHAPTER 1: LITERATURE REVIEW

1.5.5 Cyclomaltodextrin glucotransferases

Cyclomaltodextrin glucotransferases (CGTases) convert oligodextrins into cyclodextrins (CD). ,  and  cyclodextrins are cyclic compounds composed of 6, 7 or 8,  1-4 linked glucose units. The internal cavities of cyclodextrins are hydrophobic and encapsulate hydrophobic molecules. This property of CD makes them useful in food, cosmetic and pharmaceutical industries, where they capture undesirable odours, taste, stabilize volatile compounds, increase the water solubility of hydrophobic substances water solubility and protect the substance from unwanted changes. Recently CGTases has been characterized from Thermococcus species (Table 1.5). This enzyme is stable at 100 – 105 C, optimally active in between 90 - 100 C under acidic conditions and also shows high -amylase activity. This enzyme can be used to develop a onestep CD production where it could be used to replace -amylase for starch liquefaction (Vieille and Zeikus, 2001).

1.5.6 Xylose isomerases

Xylose isomerases or glucose isomerases are used in high fructose concentration syrup production and to catalyse the equilibrium isomerisation of glucose into fructose. The isomerisation process operates at 58 – 60 C for 1 - 4 hr in a packed bed reactor and with 42 % fructose in the converted syrups. The glucose to fructose conversion rate is shifted towards fructose at high temperature. Highly thermophilic and thermostable xylose isomerases have been characterized from T. thermophilus, T. aquaticus, T. maritima, T. neapolitana, T. thermosulfurigenes, T. yonseiensis and Opuntia vulgaris (Table 1.5). Inspite of their optimal activity at high temperature (95 - 100 C) and high catalytic rate at 90 C, they work at neutral pH and show only 10-15 % activity between 60-70 C (Kim et. al., 2001; Vieille and Zeikus, 2001; Ravikumar et. al., 2011). In 1991 Meng and associates developed a mutant strain of T. thermosulfurigenes and T. neapolitana isomerase and increased its catalytic efficiency on glucose.

18 CHAPTER 1: LITERATURE REVIEW

Table 1.5 Thermophilic, mesophilic and psychrophilic enzymes with applications in starch processing Enzymes Type Organism Opt. Properties temp. (C) Amylase Thermophilic Geobacillus 70 70C: about 50% of maximal activity & 110C: thermovaleorans 90% activity Pyrococcus furiosus 70 70C: about 30% of maximal activity & at 115C 75% of maximal activity Thermotoga maritima 85 Optimal activity at 85- 90C

Bacillus subtilis 65 Retention of 80% of maximum activity at 75C & rapid deactivation at 70C with t1/2 of 7 min. Mesophilic Alkalimonas amylolytica 45 More than 70% of maximal activity is between 45 & 55C Streptomyces megaspore 45 at 45C: 50 of maximal activity & at 75C 60% of maximal activity 40 40C: 30% of maximal activity & at 80C 40% of maximal activity Bacillus licheniformis 40 40C: 50% of maximal activity & at 100C: 60% of maximal activity Candida Antarctica 37 65% maximal activity at 37 & 75C Bacillus sphaericus 35 35C: 64% of maximal activity but none at 100C Psychrophilic Ascaris suum 15 15C: about 50% of maximal activity & at 60C: 20% of maximal activity Nocardiopsis spp. 0 0C: about 25% of maximal activity & at 50C: 40% of maximal activity Cellulase Thermophilic Xylella fastidiosa 50 at 50C: 45% of maximal activity Chryosporium 50 at 50C:35% of maximal activity & at 75C: lucknowense 60% of maximal activity Clostridium 50 50C:40% of maximal activity & at 80C: 30% thermocellum of maximal activity Melanocarpus 50 50C: 70% of maximal activity & at 80C; 50% albomyces of maximal activity (CelM1, CelM2 & CelM3)

Mesophilic Trichoderma reesei 40 40C: 45% of maximal activity & at 75C:60% of maximal activity Streptomyces rochei 37 37C: 50% of maximal activity of the catalytic domain & at 80C:30% of maximal activity Streptomyces lividans 35 35C: 50% of maximal activity & at 60C:50% of maximal activity

19 CHAPTER 1: LITERATURE REVIEW

Psychrophilic Lysobacter sp. 20 20C: 55% of maximal activity & at 60C: 60% of maximal activity Streptomyces reticule 20 20C: 40% of maximal activity & at 60C:25% of maximal activity, CMCase1 Bacillus 20 20C: 65.9% of maximal activity & at 80C: amyloliquefaciens 72.3% of maximal activity, carboxy methyl cellulose Glucoamylase Thermophilic Sulfolobus solfatarius 60 80C: about 90% of maximal activity

Clostridium 50 >90% activity after 6 h (70C) thermosaccharolyticum Thermoplasma 50 Approx. 20% of maximal activity at 50 & acidophilum 100C

Mesophilic Saccharomyces 40 About 40% of maximal activity at 40 & 70C diastaticus Aspergillus oryzae 35 50% of maximal activity at 35C & at 60C: 90% of maximal activity Aspergillus niger 30 30C: 30% maximal activity immobilized enzyme, 30C: 50% max. activity free enzyme, 60C: 50% of max. activity free enzyme, 60C: 90% of maximal activity immobilized enzyme

Psychrophilic Thermoplasma 20 Maximal activity at 20C acidophium Pullulanase Thermophilic Bacillus flavocaldarius 75 Optimal activity at 75-85C, pH 6.3; t1/2, 10 min (107C)

Pyrococcus woese 80 120C: 40% of maximal activity

Mesophilic Micrococcus sp. 40 Maximal activity between 40 & 45C Geobacillus 25 Maximal activity between 25 & 75C stearothermophillus Psychrophilic - - - Xylose Thermophilic Thermus aquaticus 85 Optimal activity at 85C, pH 7.0; t1/2, 4 days isomerase (70C) Thermoanaerobacterium 80 Optimal activity at 80C, pH 7.5; t1/2, 35 min thermosulfurigenes (85C) Bacillus sp. 50 50C: 55% of maximal activity & 70C: 65% of maximal activity Mesophilic Bacillus sp. 40C 40C: 40% of maximal activity & 80C: 30 % of maximal activity Bifidobacterium 30 30C: 45% of maximal activity & 70C: 10% of adolescentis maxima activity

Psychrophilic - - -

20 CHAPTER 1: LITERATURE REVIEW

- amylase Thermophilic Thermotoga maritima 95 Optimal activity at 95C,t1/2, 30min (90C)

Thermoanaerobacterium 75 Optimal activity at 75C, >80% active after 1h thermosulfurigenes (75C) Clostridium 50 50C: 45% of maximal activity & 85C: 25% of thermosulfurigenes maximal activity Mesophilic Bacillus cereus 45 45C: 0% more active than soybean - amylase Glycine max 45 45C: 60% less active in hydrolyzing corn

Entamoeba invadens 29 starch then the enzyme from B. cereus 75% of maximal activity at 29 & 45C

Psychrophilic Xanthomonas 20 Maximal activity between 20 & 60C dendrihous Hordeum vulgare 20 Maximal activity between 20 & 70 C Galactosidase Thermophilic Thermotoga maritima 75 75 & 95C, >50% of maximal activity Thermomyces 50 50C: 60% of maximal activity & 75C: 50% of lanuginosus maximal activity Mesophilic Lactobacillus fermentum 35 50% of maximal activity at 35 & 50C

Glycine max 35 35C: 40% of maximal activity Phlebia radiate 30 30C: 35% of maximal activity & 70C: 90% of maximal activity

Psychrophilic - - - CGTase Thermophilic Thermoanaerobacter 60 60C: 55% of maximal activity & 100C: 50% of max. activity, soluble enzyme & immobilized enzyme glycosyl substrate dextrin Anaerobranca 55 55C: 75% of maximal activity & 70C: 90% of goltschalkii maximal activity Paenibacillus 55 55C: 60% of maximal activity campinaseusis Mesophilic Bacillus macerans 30 30C: 45% of maximal activity & 70C: 80% of maximal activity Paenibacillus sp. 30 30C: 45% of maximal activity & 90C: 50% of max. activity, native enzyme Psychrophilic - - - ( Vieille and Zeikus, 2001 and http://www.brenda-enzymes.info/index.php4 )

21 CHAPTER 1: LITERATURE REVIEW

1.6 Other Industrial and Biotechnological applications

1.6.1 Cellulose degradation and ethanol production

The most abundant and renewable carbon source on earth is cellulose. Cellulose is used for ethanol production. For this process cellulose requires a pre-treatment of alkali to become an easy target for enzymes. Enzymes make cellulose more suitable for yeast and bacterial attack for ethanol production with the main limitation of this being low cellulase activity. As the cellulose pre-treatment is performed at high temperatures, hyperthermophilic cellulases will be the most suitable catalysts for cellulose hydrolysis. However, cellulase production in hyperthermophiles is rare. A combination of endoglucanase and cellobiohydrolase showed optimal activity at 95 - 105 C, such catalysts combinations are interesting and can be used in cellulose processing (Vieille and Zeikus, 2001; Khalil, 2011; Sizova et. al., 2011; Sveinsdottir et. al., 2011).

1.6.2 Paper pulp bleaching

During paper production wood fibres are broken apart using hot alkali treatment to remove lignin by multistep bleaching with chlorine or chlorine dioxide at high temperatures. Large volumes of waste and pollutants are released by this method. This pollution can be controlled by the pre-treatment of paper pulp with hemicellulase. Since, the pulping and bleaching are performed at high temperatures, industry needs thermophilic hemicellulase active above pH 7.0. A hemicellulose has been characterized from hyperthermophilic Thermotogales, which is active at pH 7.0 and above. Other thermophilic enzymes such as -glucuronidase, β-mannase, -L-arabinofuranosidase, β-glucosidases and galactosidases have been shown to contribute in pulp processing (Bruins et. al., 2001; Vieille and Zeikus, 2001; Haki and Rakshit, 2003; Reddy et. al., 2005; Turner et. al., 2007).

22 CHAPTER 1: LITERATURE REVIEW

1.7 Structural feature of thermozymes

Thermostability is a state that the thermozymes acquired through the combination of small structural adjustments between some amino acids and their canonical forces. To date no new amino acid, covalent modification or structural motifs have been reported that can explain the thermostability and activity of these proteins at higher temperatures. Thermostable enzymes do not show any significant difference in their structural conformation when compared with their mesophilic counterparts (Vieille and Zeikus, 2001). Each protein has its own molecular mechanism of thermostability such as additional intermolecular interactions and good general conformational structures which may not present in other proteins (Li et. al., 2005).

Most mesophilic proteins are hardly stable at high temperature that blocks research and other applications. To overcome this state recent development in protein engineering and thermophilic protein stability investigations are bringing protein stability into high throughput realm (Magliery et. al., 2011).

1.8 Additional intermolecular interactions

1.8.1 Hydrogen bonds

Hydrogen bonds are present in peptides, in between polypeptide chains and in the aqueous medium of protein. Larger hydrogen bonds significantly enhances the thermostability of protein (Botelho and Gomes, 2011). Vogt and co-workers in 1997 described findings about protein thermostability and its correlation with the number of hydrogen bonds and the polar surface fraction, which increases the hydrogen bonding with the surrounding water. Suvd and associates in 2001 compared the structure of Bacillus licheniformis -amylase (BLA) and Bacillus stearothermophillus - amylase (BSTA) and found that BLA have nine more hydrogen bonds than BSTA, making BLA more thermostable.

23 CHAPTER 1: LITERATURE REVIEW

1.8.2 Electrostatic interactions

The amounts of charged residues (Glu, Lys and Arg) are higher in thermophilic proteins compared to their mesophilic homologues. This indicates that increased charged residues increase the salt bridges (charge-charge interactions between opposite charge residues) that enhance the contribution of thermo tolerance in thermophilic proteins (Li et. al., 2005).

1.8.3 Hydrophobic bonds

Hydrophobic bonds are the main contributors in the molecular folding of proteins which brings stability at high temperatures (Botelho and Gomes, 2011). To measure the hydrophobic effects the ratio of buried non-polar surface area to the total non-polar surface area is analyzed. Thermozymes have a larger area of non-polar surface area compared to mesophilic proteins. In tetrameric superoxide dismutase (SOD) from hyperthermophiles Aquifex pyrophiles, the total non-polar surface area is 67 % of the total interfaces, compared with less thermophilic SOD from (Li et. al., 2005). In comparison with mesophilic homologues, thermozymes have extensive hydrophobic interactions and reduced water accessible hydrophobic surface region (Wigley et. al., 1987).

1.8.4 Disulphide bonds

The stability of a protein structure is achieved through internal entropy. This effect decreases the entropy of a protein in its unfolded state. Several mutagenic studies confirmed the stabilizing effect of disulphide bridges (Matsumura et. al., 1989; Li et. al., 2005).

24 CHAPTER 1: LITERATURE REVIEW

1.8.5 Aromatic interactions

Pairing between aromatic rings in less than 7.0 Å is known as aromatic interaction. As described by previously aromatic amino acid (such as tryptophan phenylalanine etc.) interactions are known to be one of the determinants of thermal stability of thermophilic proteins (Serrano et. al., 1991; Kannan and Vishveshwara, 2000).

1.8.6 Metal Binding

Different metals play key roles in the stabilization and activation of enzymes. The presence or absence of metal ions is directly associated with the stabilizing forces. An increase in thermostability is observed when xylose isomerise is treated with different metal ions such as Mg2+, CO2+ and Mn2+. Metals presence also influences the activation energy for irreversible inactivation. Previous workers reported that in presence of Ca2+ cellobiohydrolase from Clostridium thermocellum showed maximum cooperativity for thermal unfolding along-with raised intrinsic stability at 80 C, which exceeded the organisms optimal growth temperature by 20 C (Li et. al., 2005).

1.8.7 Extrinsic factors

Although most of the thermophilic/ hyperthermophilic proteins are very stable intrinsically, but some intracellular thermozymes get their high thermal stability from cells internal environment factors such as high protein concentration, cytoplasmic salt concentration, coenzymes, natural substrates, activators, polyamines or extracellular physical factor such as pressure (Sivakumar, 2006).

25 CHAPTER 1: LITERATURE REVIEW

1.9 Good conformational structures

1.9.1 High rigidity

Mesozymes are notably less rigid than thermozymes. The high rigidity actually preserves the catalytic structure at raised temperatures and protects the protein from unfolding. Enzyme rigidity increases with  helix stabilization, reduced conformational strain and optimized electrostatic interaction (Li et. al., 2005; Botelho and Gomes, 2011).

1.9.2 Higher packing efficiency

Increased molecular density also enhances the thermal stability without affecting the catalytic activity. By eliminating useless cavities, shortening of unwanted loops and increased packing of side chains inside the protein structure brings the compactness. Hyperthermophilic enzymes have higher protein compactness than their mesophilic counterparts. Proteins have 0.75 packing density, that is in the middle of range found for the crystal of most organic molecules. This density may vary in different parts of proteins and affect their thermostability (Scandurra et. al., 1998; Li et. al., 2005). Protein folds also play a significant role in enhancing the thermophilicity rather than the protein family type (Yennamalli et. al., 2011).

1.9.3 Alpha helix stabilization

Alpha helix can be stabilized by negatively charged residues (e.g. Glu) at their N- terminal end and by positively charged residues (e.g. Lys) at their C- terminal end. Beta branched residues destabilize the  helix stabilization; a comparative X-ray structure analysis of 13 different thermophilic proteins with their mesophilic homologues showed that in all the thermophilic proteins beta stranded residues (Val, Ile, Thr) were absent in their  helix stabilization (Li et. al., 2005). Previous investigation concludes that -

26 CHAPTER 1: LITERATURE REVIEW helices stability and intrahelical interactions are necessary for protein thermostability (Wu et. al., 2009).

Protein stability remains one of the most challenging problems because of their complex structures. No universal mechanism of stability can govern their thermostability and only certain structural features can be explored to understand the mechanisms which stabilize protein at high temperature.

27 CHAPTER 1: LITERATURE REVIEW

1.10 Halothermothrix orenii: a model bacterium to study life in hot salty environment

Halothermothrix orenii is a Gram negative thermohalophilic, anaerobic bacterium isolated from sediment of a Tunisian salt lake. The water of this lake evaporates during summer leaving behind high salt conditions and fills back during rains exposing microbial life to fluctuating salinity. It is expected that under such conditions the organisms has ability to adapt and adjust to the variations in the external salt environment (Sivakumar et al., 2006).

16S rRNA studies have placed this organism in the order Haloanaerobiales in the phylum Firmicutes. Genome properties of H. orenii are given in (Table 1.5) (Mavromatis et. al., 2009). Colonies of H .orenii are yellow, flat and circular with 0.5 to 1mm diameter. Growth is chemoorganotrophic and strictly anaerobic, oxidizes carbohydrates including arabinose, cellobiose, fructose, galactose, glucose, melibiose, mannose, starch, ribose and xylose but not lactose, maltose, rhamnose, sucrose, sorbose, cellulose, formate, acetate, butyrate, propionate, fumarate, lactate, malate, succinate, adonitol, dulcitol, glycerol, mannitol, casaminoacids, bio-trypcase or trimethylamine.

The end products of fermentation are acetate, ethanol, H2 and CO2. NaCl and yeast extract are required for growth. The optimum NaCl concentration for growth is about 10%, with upper and lower limits of 20 and 4 % respectively. The pH range for growth is 5.5 to 8.2; optimum pH is 6.5 to 7.0 (Cayol et. al., 1994).

H. orenii genome contains all the necessary genes which encodes for the glycolytic enzymes responsible for monosaccharide degradation. Some of the important enzymes for gluconeogenesis and fermentation to acetate and ethanol are also present. Interestingly, the organism failed to grow on lactate but lactate dehydrogenase gene (H_16140) with high similarity to the lactate dehydrogenase gene of Bacillus is present. There were 194 genes identified in H. orenii which do not have homologs in any other Firmicutes and most of them are of unknown function.

It has been reported earlier that adaptation at high temperature is related with the assembly of heat stable proteins. For stability of protein high numbers of positive and negative charged residues are necessary that may stabilizes the protein structure by the formation of ionic bonds between oppositely charged residues and also reduced

28 CHAPTER 1: LITERATURE REVIEW frequency of thermolabile amino acids glutamine, histidine and threonine is important for protein stability. To analyse these reported features various enzymes from H. orenii has been studied in detail to understand their biochemical characteristics and structural features.

Enzymes that have been investigated from H. orenii were -amylase A (amyA), -amylase B (amyB), sucrose phosphate synthase (SPS), fructokinase (FRK), β- glucosidase (BglA) and ribokinase (Li et. al., 2002; Mijts and Patel, 2002; Tan et. al., 2003; Huynh et. al., 2005; Sivakumar et. al., 2006; Khiang et. al., 2008; Tan et. al., 2008; Chua et. al., 2010; Kori et. al., 2011a; Kori et. al., 2011b), β-glucosidase (Kori et. al., 2011 unpublished), -L-arabinofuranosidase (Kori and Patel, 2011 unpublished) and Oligo 1-6 glucosidase (Chaddha et. al., unpublished data).

To understand adaptation at high salt concentration previous investigations revealed that H. orenii uses salt-out strategy and secretes sucrose phosphate synthase (SPS) to produce sucrose, a well-known compatable solute which helps the organism to live in salty environment. Once this sucrose is no more required as compatable solute organism use it to generate energy. Previously it was believed that sucrose synthesis by SPS catalyse was restricted to plants, and some , but the current finding of SPS in H. orenii may provide useful insight into the reaction mechanism of SPS and sucrose phosphatase (SPP) in plants and their relation with bacterial SPS (Huynh et. al., 2004; Khiang et. al., 2008).

H. orenii use a number of carbohydrates as mentioned above which suggests it carries various glycosyl hydrolases (GH), which have potential applications for commercial use. To explore their vital role in carbohydrate breakdown Alpha amylase A and B has been intensely investigated in the previous years (Mijts and Patel, 2002; Sivakumar et. al., 2006). These findings revealed that the amyA works optimally in 10 % NaCl and have 90 % residual activity at 25 % NaCl at 65 C and in absence of salt the activity ceases down. AmyA also lacks excess of acidic amino acids a biochemical trait thought to be necessary for activity and stability of some halophilic enzymes (Mijts and Patel, 2002). The amyA activity makes it a suitable candidate for starch processing and other commercial applications.

29 CHAPTER 1: LITERATURE REVIEW

Structural analysis revealed that -amylase B from H. orenii carries an additional N terminal domain; this is not a trait of GH family 13. C and N domain together forms a groove that has a putative binding site for raw starch processing. Stability profiling of amyB shows that N domain does not increase the protein stability or hydrolytic activity but it enhances the binding of amyB to insoluble raw starch. Additionally three methionine side chain contacts with sugar binding region, this is not a common interaction. This may provide additional space to rearrange and accommodate the binding of the oligosaccharide chain (Tan et. al., 2008).

30 CHAPTER 1: LITERATURE REVIEW

Table 1.6. Properties of the Halothermothrix orenii genome

DNA, total number of bases 2578146 100.00% DNA coding number of bases 2283859 88.59% DNA G+C number of bases 976684 37 88% DNA scaffolds 1 100.00% Genes total number 2457 100.00% Protein coding genes 2372 97.24% RNA genes 85 2.76% rRNA genes 11 0.45% 5S rRNA 3 0.12% 16S rRNA 4 0.16% 23S rRNA 4 0.16% tRNA genes 56 2.31% other RNA genes 18 0.74% Genes with function prediction 1965 80.57% Genes without function 407 16,69% prediction Pseudo Genes 24 0.98% Fused Genes 105 4.31% Genes in paralogous clusters 311 12.75% Genes in COGs 1868 76.59% Genes in Pfam 1883 77.20% Genes in TIGRfam 907 37.19% Genes coding signal peptides 446 18.29% Genes coding transmembrane 723 29.64% proteins

(Mavromatis et al., 2009)

31 CHAPTER 1: LITERATURE REVIEW

OBJECTIVES

The present study aims to investigate the carbohydrate degrading enzymes of Halothermothrix orenii. This bacterium is of keen interest because it survives in hot, salty and anaerobic environments and produces unique thermophilic enzymes. An increased understanding of these genes, enzymes and metabolic pathways will provide information regarding the adaptations and evolutionary history of H. orenii. Furthermore thermozymes discovered may have commercial applications in biotechnology and industry. The objectives of the proposed work are:

 Search for potential glycosyl hydrolases of that are stable at both high temperatures and in high salt concentrations.

 To investigate the biochemical functional strategies of proteins under elevated temperature and halophilic conditions.

 To explore the structural and function relationship of halothermophilic enzymes.

The expected outcome will toss a ray of light on the relationship between thermophilicity and halophilicity; provide an understanding of evolutionary and adaptive strategies during environmental changes. Analysis of molecular facts of enzyme activity at high temperature and salt concentration could provide imperative elucidation for the development of new molecular strategies for designing novel thermozymes.

32 CHAPTER 1: LITERATURE REVIEW

THESIS PLAN

The purpose to plan out this thesis in the present format (Figure 1.5) was to complete investigation of maximum number of enzymes in the given time period of three years. Maximum numbers of enzymes were selected for studies to overcome possible failure due to unsuccessful gene cloning, protein over-expression and purification and protein crystallization.

CHAPTER 2: METHODS CHAPTER 3: RESULTS SUMMARY

1) Fructokinase, 2) β-fructosidase, 3) -arabinofuranosidase, Bioinformatic analysis, gene selection, PCR & 4) β-glucosidase, 5) ribokinase, 6)-amylase2, 7) glucosyl- ceramidase, 8) 6-phosphofructokinase, 9) sugar kinase gene cloning Genes cloned: fructokinase, β-fructosidase, - arabinofuranosidase, β-glucosidase, ribokinase, -amylase2, glucosylceramidase, 6-phosphofructokinase, sugar kinase

Protein overexpressed: fructokinase, -arabinofuranosidase, Protein over-expression β-glucosidase, ribokinase, -amylase2, glucosylceramidase, 6-phosphofructokinase, sugar kinase

Protein purified: -arabinofuranosidase, β-glucosidase, ribokinase

Biochemical Section characterization 3.1: β-glucosidase 3.2: -arabinofuranosidase

Section Protein crystallographic analysis 3.1: β-glucosidase 3.3: ribokinase

Protein structure solved Section 3.1: β-glucosidase

Figure 1.5 Overview of Chapter 2 and 3. Investigations and outcome of the studies performed on various enzymes from H. orenii. Chapter 3 is divided into 3 subsections comprising of introduction, specific materials and methods, results and discussion.

33 CHAPTER 2: GENERAL MATERIALS AND METHODS

CHAPTER 2

GENERAL MATERIALS AND METHODS

CHAPTER 2: GENERAL MATERIALS AND METHODS

This chapter describes the materials and methods / tools used to study genes and enzymes from H. orenii. For ease of explanation, the methods have been divided into 3 broad areas as depicted in Figure 2.1. Bioinformatics, which is further described in Section 2.1 of this chapter, includes in silico analysis of 14 selected genes and their putative proteins listed in Appendix I and primer design in Appendix II. Gene Cloning, has been described in Section 2.2 of this chapter, and includes the culturing of H. orenii and DNA extraction, gene amplification using primers designed in Section 2.1, cloning of 10 genes, with over-expression of 8 followed by the purification 3 and characterization of 2 of the over-expressed proteins. 2 of the 3 recombinant proteins that were studied biochemically were crystallized and the methods used for the structural analysis are given in Section 2.3.

34 CHAPTER 2: GENERAL MATERIALS AND METHODS

BIOINFORMATICS (2.1) GENE CLONING (2.2)

H. orenii genome sequence Genomic DNA Halothermothrix cart Extraction orenii H168

Glycosyl hydrolases & transferases enzyme gene PCR amplification Vector preparation search

Bioinformatics analysis &

PCR product RE Vector and Insert Primer design digestion ligation

Protein Transformation & Recombinant

crystallization clone screening, cPCR,

rPlasmid RE digestion, DNA (2.3) sequencing Protein crystals X-ray diffraction

rProtein mini over-expression screen, SDS-PAGE analysis Data analysis

Upscaling of culture & protein

Structure modelling purification using IMAC, & refinement protein estimation

Structure analysis & PROTEIN CRYSTALLOGRAPHY Biochemical characterization of PDB deposition recombinant enzymes

Figure 2.1 A flow diagram of procedures used in this chapter are depicted. In silico Bioinformatics methods (blue), Gene cloning, over-expression and biochemical characterization techniques (orange) and protein crystallization and structural studies (purple) are detailed in Sections 2.1, 2.1 and 2.3 respectively of this chapter.

35 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1 BIOINFORMATICS

As represented in Figure 2.1 this section describes the various bioinformatics tools/ softwares used for the analysis of 14 genes from H. orenii and their putative protein sequences listed in Appendix I. The aim of using these tools is to search rare codons, design primers, and to determine Tm, primer dimer formation, theoretical pI, amino acid composition, hydrophobic / hydrophilic regions and functional features of putative proteins and to confirm their assigned putative annotations with the respective database. These details were used in Section 2.2 for PCR amplification and gene cloning, to design enzyme assays for activity screening and biochemical characterization. Some of these tools have been used in Section 2.3 for secondary and tertiary structure prediction, structure superimposition, to determine RMSD values, to deposit atomic coordinates of solved structure and for other structural properties. The tools that were used are listed in Table 2.1 and details are described following the table.

36 CHAPTER 2: GENERAL MATERIALS AND METHODS

Table 2.1 Bioinformatic tools used to study genes and their putative proteins from H. orenii

S. No Software/ Database/ Source/URL Tools used 1 BioEdit* http://www.mbio.ncsu.edu/bioedit/bioedit.html 2 TREECON* http://bioinformatics.psb.ugent.be/software/details/Treecon 3 ReadSeq http://www.ebi.ac.uk/cgi-bin/readseq.cgi 4 ORF finder CBI, http://www.ncbi.nlm.nih.gov/projects/gorf/ 5 NEBcutter V2.0 http://tools.neb.com/NEBcutter2/ 6 BLASTp NCBI, http://blast.ncbi.nlm.nih.gov/Blast.cgi 7 ProtParam Expasy web server, http://ca.expasy.org/tools/protparam.html 8 InterProScan Protein signature database, http://www.ebi.ac.uk/Tools/pfa/iprscan/ 9 BRENDA BRaunschweig ENzyme Database, http://www.brenda-enzymes.org/ 10 CaZY Carbohydrate-Active enZYmes, http://www.cazy.org/ 11 dbCAN DataBase for Carbohydrate-active enzyme ANnotation, http://csbl.bmb.uga.edu/dbCAN/ 12 KEGG Kyoto Encyclopedia of Genes and , http://www.genome.jp/kegg/ 13 PDB Protein data bank, http://www.pdb.org/pdb/home/home.do 14 PsiPred Protein structure prediction server, http://bioinf.cs.ucl.ac.uk/psipred/ 15 Catalytic site atlas EBI EMBL, http://www.ebi.ac.uk/thornton-srv/databases/CSA/ 16 PDBeFold EBI EMBL, http://www.ebi.ac.uk/msd-srv/ssm/ 17 FingerPRINTScan EBI EMBL, http://www.ebi.ac.uk/Tools/printsscan/ 18 XtalPred Crystal prediction server, http://ffas.burnham.org/XtalPred-cgi/xtal.pl * Software downloaded

37 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1.1 BioEdit and TREECON

BioEdit and TREECON software were downloaded and installed on a laptop from the respective URL’s listed in Table 2.1. For multiple alignment of DNA and amino acid sequences (Appendix I) ClustalW (Thompson et al., 1994) was used with default parameters, the aligned sequences were saved in Phylip format and imported into TREECON (Van de Peer and De Wachter, 1994) to construct rooted neighbour-joining phylograms using maximum parsimony bootstrap of 100 replicates to determine phylogenetic relationships.

2.1.2 Basic Local Alignment Search Tool (BLAST)

Amino acid sequence homology searches were performed using NCBI GenBank (Benson et al., 2000) against protein database bank (PDB), reference proteins, swiss prot protein sequences (swiss prot) and non-redundant (nr) databases using BLASTp algorithm (Altschul et al., 1990) with default parameters. BLASTp outcome with E-value cut-off 0.0 or best 10 matches were considered for sequence comparisons and further analysis.

2.1.3 ProtParam Statistics

Amino acid composition, estimated pI and enzyme coefficient statistics for proteins were calculated using online ProtParam software.

2.1.4 InterProScan protein signature database

InterProScan database is a collection of domains, functional sites and protein families. It performs simultaneous integrated searches of protein sequences against PRINTS, Pfam, ProDom, PROSITE, Swiss-Prot and TrEMBL databases (Hunter et al., 2009). This database was used to investigate biochemical functions of the non-categorized proteins (Appendix I).

38 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1.5 BRENDA (BRaunschweig ENzyme DAtabase) enzyme database

BRENDA is a database of enzyme nomenclature based on the recommendations of IUBMB (International Union of Biochemistry and Molecular Biology). It was used as a source of information for molecular and biochemical characteristics of enzymes, along with the primary scientific literature regarding a particular enzyme.

2.1.6 CaZY (Carbohydrate active enZYme) database

CaZY is a database of glycosyl hydrolases (GH), glycosyl transferases (GT), polysaccharide lyases (PL) and carbohydrate esterases (CE) families and Clan. This classification is based on enzyme structure similarity and functional domains. In present study this database was used to determine the GH family and clan of enzymes from H. orenii.

2.1.7 dbCAN (DataBase for automated Carbohydrate-active enzyme ANnotation) dbCAN is a web server and DataBase for automated Carbohydrate-active enzyme ANnotation and generate data based on the information from CAZy and CAT database. It provides automated and comprehensive CAZyme annotation, signature domain and its location, subfamily classification based on sequence similarities, sequence alignments, hidden markov models (HMMs) and phylogenies of the signature domain regions (Yin et. al., 2011 unpublished).

2.1.8 KEGG (Kyoto Encyclopedia of Genes and Genomes) protein database

KEGG is an online database, maintained by the Japanese Human Genome Programme. It accumulates information on genomes, metabolism pathways and their biological chemical reactions. Putative metabolic pathways of GH and GT enzymes that were selected during this study were analysed using KEGG database.

39 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1.9 PDB (Protein Data Bank) database

This is an online depository of 3D structure of macromolecules i.e. proteins and nucleic acids and is a key resource in the field of structural biology. PDB database was used to access reference 3D protein structures of different proteins and for deposition of atomic coordinates of solved protein structure (Chapter 3, section 3.1).

2.1.10 PSIPRED

PSIPRED is an online server that predicts protein structure from their amino acid sequence by analysing theoretical chemistry and bioinformatic analysis of the respective protein sequence (Jones, 1999). PSIPRED was used to predict secondary and tertiary structure of proteins of interest.

2.1.11 EMBL EBI (European Molecular Biology Laboratory European Bioinformatics Institute) database

2.1.11a. CSA (Catalytic Site Atlas)

CSA is a database that carries detailed information about enzyme active sites and catalytic residues of enzymes in 3D structure. Catalytic active site information for different proteins from H. orenii was extracted from this database.

2.1.11b. PDBeFold (Protein Data Bank in Europe)

This is an online service also known as SSM (Secondary Structure Matching) and is collaborative facility for comparing 3D protein structures. It offers pairwise and multiple comparison with 3D alignment of protein structures, from PDB and SCOP (Structural Comparison Of Proteins) library (Krissinel and Henrick, 2004). PDBeFold was used to compare and align tertiary structure of H. orenii proteins with reference protein to determine RMSD values.

40 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.1.11c. FingerPRINTScan

FingerPRINTS is a database that encodes for protein folds and functional regions of proteins (Attwood et al., 2000). FingerPRINTScan was used to determine conserved region of proteins of interest from a collection of protein fingerprints.

2.1.12 XtalPred (Xtal Prediction) Server

This is a web server that predicts protein Crystallizability by comparing numerous features of the protein with distributions of these features in target database and combining the results into an overall probability of protein crystallization. XtalPred provides: (1) a detailed comparison of the protein's features to the corresponding distribution from target database, (2) a summary of protein features and predictions that indicate problems that are likely to be encountered during protein crystallization, (3) prediction of ligands and (4) a list of close homologs from microbial genomes that crystallize more likely. XtalPred was used to predict possibility of protein crystallization.

2.1.12 Halothermothrix orenii genome database and gene selection

Halothermothrix orenii H168 genome annotated at BASys (http://basys.ca/basys/cgi/submit.pl) and from NCBI genome database (NC_011899, DNA-circular; length-2,578,146 nt; Replicon type: chromosome; Created: 2009/01/15; http://www.ncbi.nlm.nih.gov/genome/?term=H.%20orenii) were setup as local database and used for nucleotide and protein searches by creating a local nucleic acid and protein database file. Nucleotide and protein sequences were formatted as per requirements using ReadSeq (Table 2.1) and used for homology searches by local BLAST with default parameters. Results from BLAST searches were retrieved and used for ClustalW multiple alignments (section 2.1.1).

Genes encoding for Glycosyl Hydrolases (GH) and Transferases (GT) were searched in H. orenii genome database and analysed for open reading frame (ORF) region with

41 CHAPTER 2: GENERAL MATERIALS AND METHODS absence of restriction enzyme sites using NEBcutter V2.0 (Table 2.1). For gene cloning and protein over-expression T7 expression vector (Appendix II) with compatibility to restriction site present on genes was selected and used.

2.2 GENE CLONING

Genes with ORF region that lack NdeI and XhoI restriction sites were selected for cloning, considering their compatibility to insert in multiple cloning site of pET22 (+) (T7 expression vector) (Appendix III) by ligation. General methods used for gene cloning are schematically represented in Figure 2.1. Detailed materials and methods used during gene cloning, protein over-expression, purification, enzyme characterization and protein crystallography (Kori et. al., 2011a, b) are given in following sections.

2.2.1 Bacterial Strains and vectors

Plasmid vector and bacterial strains used in this study for recombinant protein expression are listed with specific details in Table 2.2. For overexpression of recombinant proteins, E. coli was used as a surrogate. The anaerobic bacterium H. orenii was provided by J.L Cayol (Institut de Recherche pour le Developpement [IRD], Universite de Province, France) and was used as a source of genomic DNA to amplify genes of interest.

42 CHAPTER 2: GENERAL MATERIALS AND METHODS

Table 2.2 Plasmid and bacterial strains used for gene cloning and protein over- expression Strain Size/ Genotype Source/ Reference pET22b(+) 5.4 kb, AmpR, lacZ Invitrogen/ Kori et. al., 2011a, b

Halothermothrix orenii - Cayol et. al., 1994 - E. coli DH5 F- deoR endA1 recA1 relA1 gyrA96 hsdR17(rk , Bioline + (Alpha-Select Silver mk ) supE44 thi-1 phoA Δ(lacZYA argF)U169 (Australia)/ Efficiency) Φ80lacZΔM15 λ- Kori et. al., 2011a, b - - BL21(DE3) E. coli F- ompT hsdSB(rB mB ) gal dcm (DE3) Bioline strain (Australia) / Kori et. al., 2011a, b TM - - - Rosetta 2 (DE3) [F ompT hsdSB(rB mB ) gal dcm] Merck E. coli (Australia)/ Kori et. al., 2011 a, b

2.2.2 Oligonucleotide design and synthesis

Oligonucleotides for selected genes (Table 3.1, Chapter 3) were designed as forward and reverse primers (Appendix II) by incorporating restriction enzyme sites specific to pET22b (+) vector (Appendix III) in both primers; hexa- histidine nucleotides, cleavage site and stop codon in reverse primer. Primers melting temperature (Tm) and dimer formation were detected from online tool at URL http://www.basic.northwestern.edu/biotools/oligocalc.html. Oligonucleotides for PCR and automated DNA sequencing (Appendix IV) were obtained from Geneworks (Australia) on 40 nmole scale and supplied in lyophilized form under standard conditions.

43 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.2.3 Culturing of H. orenii and genomic DNA extraction and purification

H. orenii was grown anaerobically overnight in 100 ml TYEG medium (Appendix IV) at 50 C. Cells were harvested by centrifuging at 8 000 g for 15 min and the pellet was resuspended in 3.95 ml of cell lysis buffer A (Appendix IV) and the suspension incubated on ice for 5 min. Following addition of 0.6 ml 0.5M EDTA pH 7.5 and 0.25 ml 20 % (w/v) SDS and the suspension was incubated at 4 C for further 5 min. 100 l proteinase K was added (20 mg. ml-1) and incubated at 55 C overnight (Patel et. al., 1985a; Patel et. al., 1985b). To remove contaminating protein and enzymes from the genomic DNA, an equal volume of phenol/chloroform/isoamyl alcohol (25:24:1, v/v/v) was added and solution was mixed by inversion. Samples were then centrifuged at 18 000 g for 10 min in a bench-top centrifuge (Sigma). Three different phases were visible: lower, middle (organic phase having a protein precipitate) and upper aqueous phase containing DNA. The upper phase was transferred into sterile 1.5 ml microfuge tube with care to avoid protein contamination (Marmur, 1961; Ogg and Patel, 2009a). To precipitate the DNA, 0.1 volume of 0.3 M sodium acetate and 2 volumes of chilled 100 % ethanol were added and mixed gently and subsequently centrifuged at 18 000 g for 10 min to pellet out the DNA. The supernatant was decanted and resulting DNA pellet was washed with 250 l of chilled 70 % ethanol and centrifuged for 5 min at 18 000 g. 70 % ethanol was carefully decanted and the DNA pellet was air dried and subsequently dissolved in desired volume of 10 mM TE buffer pH 8.0 and the remaining DNA pellet was stored at -20 C (Kori et. al., unpublished data).

2.2.4 Plasmid DNA purification

Large and small scale plasmid DNA was purified using QIAGEN Midi (Tip-100) and QIAprep Spin Miniprep kit under the conditions and procedures described by the manufacturer.

44 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.2.5 DNA gel electrophoresis

Agarose gel was prepared from molecular grade agarose powder (Bioline). Unless indicated otherwise, for routine analysis 1 % agarose gel was prepared by dissolving agarose powder in 1X TAE buffer (Appendix IV) and dissolved in microwave oven for 1 min. The solution was cooled up to 50 C and ethidium bromide was added to a final concentration of 0.1 g.ml-1 and casted in plastic tray using appropriate size comb and left for 15 minutes to solidify the gel. DNA samples were mixed with 6x gel loading dye and loaded into wells and electrophoresed at 5 volts cm-1 for 30 min (Sambrook et. al., 1989). Simultaneously, HindIII digested DNA marker (Fermentas) was loaded to estimate DNA size and concentration. DNA bands were visualized under long wavelength UV light and documented using Geliance 600 imaging system (Perkin Elmer).

2.2.6 Polymerase Chain Reaction

PCR reactions were performed in 0.6 ml PCR tubes (Quality Scientific Plastics, Australia) containing 1 l of 2.5 mM dNTP’s (Scientifix, Australia), 1 l each of 50 M forward and reverse primer, 2 l of 50 mM MgCl2, 10 l of 5X Mango Taq reaction buffer, 0.5 l of 5 U/l Mango Taq DNA polymerase (Bioline), 1 l of genomic DNA of H. orenii and 33.5 l of sterilized distilled water. A negative control was run every time consisting of all ingredients except template DNA. PCR reactions were performed using FTS1- Thermal Sequencer Corbett Research (Australia) with 2 min denaturation at 94 C and followed by 30 cycles that includes 94 C for 1 min, 52 C annealing for 1 min and 72 C extension for 1 min 30 sec and final extension at 72 C for 10 min (Mijts, 2001; Kori et. al., 2011a, b). PCR products were analyzed by DNA agarose gel (1 %) electrophoresis (section 2.2.3) and the remaining product was stored at -20 C for further use.

45 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.2.6a DNA extraction and purification from agarose gel

DNA fragments were excised by the Lab gadget X-tracta LG-100 (Genework, Australia) and transferred into a pre-weighed sterilized 1.5 ml microfuge tube (Quality Scientific Plastics, Australia) under sterilized conditions. Subsequently, DNA was extracted from agarose gel (1 %) using QIAGEN QIAquick gel extraction kit (Kori et. al., 2011a, b), according to the recommendations of the manufacturer.

2.2.6b Restriction endonucleases digestion

Amplified PCR product and pET22b (+) vector were digested with NdeI and XhoI restriction endonucleases to make them suitable for ligation (Kori et. al., 2011a, b). Both the enzymes were purchased from commercial source New England Biolabs (NEB) and used according to the recommendations of the manufacturer.

2.2.6c DNA ligation

Reaction mixtures of different ratios (I: V; 4:1, 6:1, 8:1, 10:1) of the DNA inserts (I) and vector (V) were prepared in a total volume of 20 µl and were incubated at room temperature overnight. 10 units of NEB T4 DNA ligase enzyme was used for all ligation reactions, as per the manufacturer instructions. Subsequently T4 DNA ligase enzyme was inactivated at 65C for 15 min and ligated products were stored at -20 C for further experiments (Kori et. al., 2011a, b).

2.2.6d Recombinant DNA transformation in Escherichia coli

Chemically competent cells (Table 2.2) were used for transformation of the recombinant DNA. Transformations were performed following the procedure from the supplier for optimal results. In 1.5 ml of sterilized microcentrifuge tubes, 5 l of the ligated product was added to 100 l of thawed competent cells followed by 25 min of incubation on ice.

46 CHAPTER 2: GENERAL MATERIALS AND METHODS

Heat shock at 42 C was given to the cells for exact 45 sec, after which the cells were incubated on ice for another 2 min. Subsequently, 900 l of SOC media (Appendix IV) was added to the cells as a growth supplement to enhance their catabolic rate due to presence of glucose. Transformed cells were incubated for another 1 hr at 37 C on orbital shaker at 220 rpm. 100 l of the cell suspension was plated on MacConkey agar plates with 100 g.ml-1 of ampicillin (Karlovsky, 1993; Pendola and Palis, 1993; Kori et. al., 2011a, b). The plates were incubated overnight at 37 C.

2.2.6e Recombinant clone screening

A. White/red selection:

Recombinant plasmid screening is basic and essential step in gene cloning. In this study, all the recombinants were screened by growing transformed cells on MacConkey agar (Appendix IV). Lactose fermentation by cells without inserts produce red/ pink colonies, whereas lactose non-fermenters produce white/ colourless colonies due to presence of insert. This method does not require addition of IPTG and X-gal (Karlovsky, 1993; Pendola and Palis, 1993; Kori et. al., 2011a, b).

B. Colony PCR screening:

White colonies were selected as positive clones and confirmed by colony PCR screening using T7 promoter and T7 terminator primers (Appendix II) by following the PCR program (section 2.2.6) (Kori et. al., 2011a, b).

C. Restriction Enzyme (RE) digestion:

Recombinant plasmids isolated from the culture grown overnight at 37 C on orbital shaker at 220 rpm, by QIAGEN QIAprep Spin Miniprep kit under the conditions and procedures as described by the manufacturer. Later isolated plasmids were digested by

47 CHAPTER 2: GENERAL MATERIALS AND METHODS

NdeI and XhoI restriction enzymes for 2 hr at 37 C. RE digested products were analyzed on agarose gel (section 2.2.5). Presence of insert on agarose gel confirmed the positive clone (Kori et. al., 2011a, b).

D. Automated DNA sequencing

Reaction mixture was prepared using 4 l of ABI BigDye, 2 l of 5X BigDye buffer, 1 l of 3.2 M sequencing primers, appropriate concentration of DNA and sterilized water to make the final volume of 20 l.

DNA sequencing reactions were performed by using BigDye labeled terminator v3.1. Reactions consisted of 20 ng PCR product or 300 – 500 ng plasmid DNA, 1 µl of

3.2 M T7 primers (Appendix II), 1 l 5X MgCl2- free BigDye reaction buffer and 4  l of BigDye Terminator v3.1 (Applied Biosystems, USA) and ddH2O to a final volume of 20 l. Thermal cycles were performed using Idaho Technology Rapid Cycler (USA), with a repeat of 25 cycles for 1 min at 96 C, 10 sec at 96 C, 50 C for 5 sec and 60 C for 4 min. The resulting sequence reaction products were transferred to sterile 1.5 ml microcentrifuge tubes precipitated with 5 l of sterilized 125 mM EDTA and washed with 60 l of absolute ethanol followed by 15 min incubation at room temperature and centrifugation at 18 000 g for 20 min. Supernatant was discarded carefully and resulting pellet was washed with 190 l of 70 % ethanol and centrifuged at top speed for 10 min, supernatant was discarded and pellet was allowed to dry under sterilized conditions and later analyzed on an ABI 377 DNA Sequencer (Applied Biosystems, USA) by Griffith University DNA Sequencing Facility (Brisbane, Australia) (Mijts, 2001; Kori et. al., unpublished data).

E. Storage of clones

Recombinant clones were grown up to OD650 0.4 - 0.6 and mixed with 1:1 (v/v) ratio of culture and 80 % glycerol (autoclaved) in 1.8 ml cryovials (Nunc) and stored at -80C.

48 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.3 RECOMBINANT PROTEIN EXPRESSION AND PURIFICATION

2.3.1 Protein over-expression

Colonies were plated on the LB agar (Appendix IV) supplemented with ampicillin (100 µg.ml-1) and/ or chloramphenicol (35 µg.ml-1) and incubated overnight at 37 C. Following day, loop full colonies were scraped and inoculated in LB broth containing ampicillin and/ or chloramphenicol as mentioned above. Cells were grown at 37 C with

220 rpm on orbital shaker to the OD650 0.4 – 0.6 and induced with 1 mM IPTG. Cells were grown to next 5 hr and samples were collected after every 0, 1, 3 and 5 hr (Kori et. al., 2011a, b).

2.3.2 Recombinant protein purification

Cells were harvested by centrifugation at 8 000 g for 10 min at 4 C and pellet was resuspended in cell lysis buffer B (Appendix IV). The mixture was incubated at 37 C for 30 min and sonicated on ice bath (three times of 5 sec burst with 2 min rest on ice after each sonication period). Resulting cell lysate was centrifuged at 18 000 g for 30 min at 4 C to pellet down cell debris. Collected supernatant was filtered through 0.22 m PVDF membrane Millex, Millipore (USA). Follow-on filtrate was loaded on a

NiSO4 (200 mM) charged Fractogel resin column (Merck) and incubated for 1 hr at 4 C with gentle rotation on a tube rotator. Column was subsequently washed with imidazole gradient containing wash buffers; purified recombinant protein was eluted in buffer containing varying concentration of imidazole (100-350 mM) in 20 mM Tris-Cl, 500 mM NaCl, pH 7.9. Recombinant proteins were concentrated using a 30 kDa U-tube concentrator (Novagen) and centrifuged at 10,000 g for 8 min at 25 C and the retenate in the U-tube filter was desalted with buffer (20 mM HEPES, 100 mM NaCl, pH 7.0) at least five times. Purity of recombinant proteins were analyzed on precasted NuPAGE 4-12 % Bis-Tris Gel, Invitrogen (USA) (section 2.3.4.2) (Kori et. al., 2011a, b).

49 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.3.3 Protein storage

Purified recombinant proteins were stored in concentrated form (~10 mg.ml-l) in 20 mM HEPES, pH 7.0 and 50 mM NaCl at 4 C for six months without any loss in the enzyme activity.

2.3.4 Protein analysis techniques

2.3.4.1 Protein concentration estimation

Recombinant protein concentration was estimated by Bio-Rad protein assay (Bradford, 1976). Bio-Rad dye reagent concentrate was diluted in distilled water with 1: 5 ratios. In 200 µl of the diluted dye reagent, 10 µl of purified protein was added and incubated for 5 min at room temperature for colour development and absorbance at 595 nm was recorded. Bovine Serum Albumin (BSA) protein standard curve was prepared with the same method and used to calculate and estimate the concentration of unknown protein.

2.3.4.2 SDS-PAGE analysis

To analyse protein over-expression and to confirm the molecular weight of the purified proteins, sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) was performed using XCell SureLockTM Mini-Cell (Invitrogen corp., USA) in 1X MOPS buffer (Appendix IV) according to the manufacturer’s instructions. Protein samples were prepared with 4X LDS Sample Buffer (Appendix IV) (Kori et. al., 2011a, b).

50 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.4 BIOCHEMICAL CHARACTERIZATION

2.4.1 oNP/ pNP hydrolysis assays

All assays reactions were prepared in 96-well plate (Sarstedt, Newton, USA) and conducted in triplicates. All assays were repeated twice. Unless indicated otherwise, the standard assay mixture contained (per well) 2 µl recombinant enzyme (10 mg.ml-1), 2 µl of each substrate (Table 2.3) added from a 20 mM stock to give a final concentration of 2 mM and an appropriate buffer made to a final volume of 120 µl. The assay was initiated by holding the assay mixture lacking the enzyme at appropriate incubation temperature for 15 min followed by addition of recombinant enzyme and incubation continued for further 15 min. The reaction was stopped and the yellow color released from substrate hydrolysis stabilized by adding 20 µl of 1 M Na2CO3 was read at 405 nm in a microplate reader (BIOLOG Microstation, Molecular Devices) and converted to nanomoles of oNP or pNP using standard curves prepared under the same conditions (Nam et. al., 2008; Kori et. al., unpublished data). One unit enzyme is defined as the amount of enzyme required to release 1nmol of oNP/pNP per minute at optimum temperature under the above assay conditions.

Table 2.3 Synthetic substrates used to study enzyme activities Synthetic substrate Source pNP-α-L-arabinofuranoside (pNPAf), Gold Biotech pNP-α-L-arabinopyranoside (pNPAp), (USA) pNP-α-D-fucopyranoside (pNPAF), pNP-β-D-fucopyranoside (pNPBF), pNP-β-D-galactopyranoside (pNPGal), oNP-β-D-glucopyranoside (oNPG), pNP-α-D-glucopyranoside (pNPG), pNP-β-D-mannopyranoside (pNPM), pNP-α-D xylopyranoside (pNPAX), pNP-β-D-xylopyranoside (pNPBX), Phenyl-β-D-glucuronic acid monohydrate (PGA), 1-Naphtahyl acetate (NA)

51 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.4.2 Substrate specific assay

Disaccharides, trisaccharides and polysaccharides were used to determine specific substrate for recombinant enzyme. Enzyme activity was investigated on 10 mM substrate in suitable buffer (100 mM) with optimized pH, in a total volume of 300 µl. Reaction mixture was incubated at the optimized temperature for different time interval and hydrolysis results were analysed using TLC (section 2.4.3) (Han, 1998; Tan et. al., 2008).

2.4.3 Optimum temperature

Various temperatures were selected to determine optimum temperature of the recombinant enzymes, by following standard (section 2.4.1) with some modifications (Mijts and Patel, 2002; Tan et. al., 2008).

2.4.4 pH Optima

Based on higher temperature stability wide range of buffers was used to determine the pH optima of recombinant enzymes. All the buffers were filter sterilized by 0.45 µM Millex-HA filters (Millipore, USA). Buffers with 0.1 M concentration were used such as Bis-Tris, HEPES, Glycine hydroxide, MOPS and Sodium phosphate, with some modifications (Mijts and Patel, 2002; Tan et. al., 2008).

2.4.5 Thermostability

Enzymes were incubated at temperatures between 40-80 C and aliquots withdrawn at various time intervals and assayed for residual activity using oNP/ pNP enzyme assay (section 2.4.1). Residual activities were compared against a control which was held on ice, with some modifications (Tan et. al., 2008).

52 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.4.6 Enzyme kinetics study

Km and Vmax values for the hydrolysis of different concentrations (0.1 to 15mM) of the substrates (oNP / pNPF) was conducted in appropriate buffer (100mM) and pH, at optimal temperature. The kinetic values were determined using Lineweaver-Burk plot (Patchett et. al., 1987).

2.4.7 Effect of metals, sugars and organic solvents

Effect of inhibitors and activators (metal ions, sugars and organic solvents), on enzyme activity was determined by preincubating it with 1 mM of inhibitors and activators (from 10mM stock solutions). Followed by enzyme assay (section 2.4.1) against substrate (oNP/ pNP) in appropriate buffer adjusted to optimum pH and incubation at the optimum temperature for each enzyme (Patchett et. al., 1987).

2.4.8 Thin Layer Chromatography (TLC)

Sample on TLC plate (25 Alufolein, silica gel 60, Merck) were run with acetonitrile: ethyl acetate:2-propanol:water (85:20:20:15, v/v/v/v) at room temperature. Solvent was allowed to run on the top of TLC plate and the plate was removed and dried thoroughly at 120 C. The hydrolyzed products were visualized by dipping the plate in 10 % ethanol and 10 % sulphuric acid (1:1) solution and dried at 120 C (Han, 1998).

53 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.5 PROTEIN CRYSTALLOGRAPHY

X-ray biocrystallography is the most powerful technique use to investigate and obtain macromolecular structure. For the discovery of X-rays Prof. Roentgen was awarded Nobel Prize for Physics in 1901. Electromagnetic radiations with 10-7 – 10-11 m (1000- 0.1Å) wavelengths are known as X-rays (Figure 2.2). This discovery resulted as a major reason for the extraordinary outcomes in medicine and research. The use of X-rays for crystallography and its applications solving structure was 1st performed by Prof. Bragg, for which he was awarded with Nobel Prize in 1915.

Protein crystallography is a predominant technique for determining the protein structure in 3 dimensions. It is based on the analysis of the X- ray diffraction pattern of all the electrons from atoms present in crystal at the time of X- ray diffraction. Crystals are composed of basic unit cells, these unit cells are repeatedly arranged in a geometric fashion which gives the symmetry to the crystal. Crystals of the same protein may differ physically but structurally they have the same pattern of their unit cell arrangement.

To obtain a high quality diffraction data, a single crystal is required which can diffract the X-rays higher than 3.5 Å. Quality of the single crystal depends on various conditions, protein purity is a major factor (> 95% purity is advantageous), growth temperature, pH, factorials and different additives. Another important thing is the protection of the crystal while exposing it to the X-rays shooting, to stand the crystal in heat generated by the X -rays crystals needs cryoprotection. Various solutions and chemicals are used as cryoprotectant e.g. glycerol, mineral oil, Paratone-N etc. with or without addition of the factorial in which the crystal grown (Kori et. al., unpublished data).

2.5.1 X-ray sources

For protein crystal diffraction, X-rays can be generated in the laboratory using an X-ray generator that uses an electron beam generated by cathode into an anode. The metal used dictates the resulting wavelength of X-rays. At synchrotron uses an alternative method to obtain X-rays, bending the electron beam with a powerful magnet. A hugely

54 CHAPTER 2: GENERAL MATERIALS AND METHODS built synchrotron system works on more powerful particle and storage rings, which are extremely large and expensive facilities with the ring diameter from 10 to few hundred m. and can create an electron beam a thousand times stronger than a rotating anode generator.

Figure 2.2 Electromagnetic spectrum scale with X-rays region (adapted from Wikipedia).

2.5.2 X-ray diffraction

X-ray diffraction is based on a phenomenon involving interference and coherent scattering, as explained by Bragg’s law. This law states that “X-ray diffraction can be viewed as a similar process of reflection by the atomic planes in the crystal”. Crystal planes scatter the incident X-rays as identified by the Miller indices HKL with an angle of reflection. Constructive interference occurs due to the path length difference among the rays diffracted from the parallel crystal planes is an integral number of wavelengths. Where the crystal planes are parted by distance d, the path length distance is 2d.sin. Thus it is important for constructive interference to have: n = 2d.sin. On the basis of

55 CHAPTER 2: GENERAL MATERIALS AND METHODS

Bragg’s law individual atoms in crystal structure can be seen by using radiation wavelength similar to the interatomic distances i.e 0.15 nm or 1.5 Å (Bragg and Bragg, 1913).

2.5.3 X-ray detector

Crystals diffracted by X-rays produce a three dimensional diffraction pattern that is reciprocal to the actual crystal lattice (Figure 2.3). To determine crystal structure, the intensities of each of the diffracted reflections are collected and measured. This can be achieved by rotating the crystal from 0-180 giving all the corresponding reciprocal lattice points. Based on this principle X-ray diffraction instruments have two parts: A) A mechanical assembly to rotate the crystal, known as a Diffractometer. B) A detector to measure the intensity and positions of the diffracted reflections, for example a CCD camera, which is more sensitive and faster than X-ray film, less time consuming and makes data processing faster.

Figure 2.3 A diagrammatic representation of an X-ray diffraction pattern, known as Ewald sphere (adapted from Ilari and Savino, 2008).

56 CHAPTER 2: GENERAL MATERIALS AND METHODS

2.5.4 Data processing

In order to obtain a good dataset a single crystal must be used for X-ray diffraction and detector distance and exposure time should be chosen carefully. The dataset obtained is normally processed with automated programs. MOSFLM (Leslie, 2006) was used in this study to process the dataset in three stages:

1. Autoindexing of images: From oscillation images, the program determines the lattice type, crystal unit cell and crystal orientation parameters (Ilari and Savino, 2008).

2. Indexing of all images: All the diffraction measurements on the spots predicted by autoindexing are compared by the program. Later, the program assigns the hkl indices and calculates the diffraction intensities for each spot from the collected images (Ilari and Savino, 2008).

3. Scaling: For scaling SCALA (Kabsch, 2010) was used, this combines the data collected from all the images and calculates the structure factor amplitudes for each reflection based on hkl indices (Ilari and Savino, 2008).

2.5.5 Structure determination

Structure is determined using electron density distribution based on the atomic position of the unit cell, obtained from the diffraction data. The electron density function has the following expression:

-2i(hx, ky, lz) (x, y, z) = (1/V) hkl Fhkl e

where Fhkl are the structure factors, V is the unit cell volume and h, k, l are the Miller indices. F is a complex number and represented as a vector with a module and a phase. During the F amplitude calculation from scattering measurements Phase value information becomes immeasurable. To overcome this problem different experimental

57 CHAPTER 2: GENERAL MATERIALS AND METHODS techniques can be used that allow the building of a three dimensional protein structure. Some of these techniques are (1) Multiple Isomorphous Replacement (MIR), (2) Multiple Anomalous Diffraction (MAD) and (3) Molecular Replacement (MR). In this thesis the MR method was performed using a native dataset (Ilari and Savino, 2008).

(1) Multiple Isomorphous Replacement (MIR): This method was first applied by Max Perutz and John Kendrew. Perutz and Kendrew soaked protein crystals in heavy- atoms solution, making isomorphous heavy-atom derivatives to measure the changes in intensity by inferring the heavy-atoms position (Perutz, 1956; Kendrew et. al., 1958).

(2) Multiple Anomalous Diffraction (MAD): This technique is based on dataset collection by multiwavelength anomalous diffraction. Usually three wavelengths are used to maximize the absorption and dispersive effects (Sivakumar, 2006).

(3) Molecular Replacement (MR): Rossman and Blow in 1962 first described this method. This technique involves searching for a structural model homologous to the protein of interest, followed by molecular replacement. The typical Patterson function of the search model is correctly oriented according to the new crystal unit cell parameters by rotation functions and then translated in the new unit cell to achieve the best fit that is supported by a residual factor and a convincing correlation factor (Sivakumar, 2006; Ilari and Savino, 2008).

It is well known that resemblance of two protein structures have high level of sequence identity. When the template structure has at least 40 % sequence identity with the protein of interest, the structures are expected to be highly similar and the Molecular Replacement will have a high rate of success (Ilari and Savino, 2008).

2.5.6 Model building

Model of sample protein can be generated by fitting the template structure into the electron density map of sample protein followed by a number of refinement steps. By following template model atomic details, geometrical data of sample protein structures

58 CHAPTER 2: GENERAL MATERIALS AND METHODS can be applied for structure refinement to obtain better phases and the best atomic model (Ilari and Savino, 2008).

2.5.7 Structure refinement

The last part of structure build up after phase determination (by molecular replacement method) is structure refinement. The model builds up before refinement can never be perfect but can be improved up to a larger extent in confidence with the diffraction data by a continuous series of refinement. Theoretically refinement is the optimization of the function of a set of observations by changing the parameters of a model (Sivakumar, 2006; Ilari and Savino, 2008).

Refinement processes with the determination of electron density map that subsequently interpreted with the polypeptide chain as backbone. Once the backbone fits in the electron density map the structure refinement procedure starts. It basically deals with the adjustment between the calculated and observed structure factors. During the process only high resolution dataset should be used to maximize the accuracy of the structure, using low resolution data results in high B factor. Best possible structure can be achieved by the completeness of dataset, signal-to-noise ratio [I/(I)] and redundancy of data within the highest resolution shell. Structure should not be overrefined by building a model into an inadequate electron density; this can be avoided by visual observation of electron density map and high B-factor values (Sivakumar, 2006; Ilari and Savino, 2008).

Structure refinement can take more time than expected, but after numerous rounds of refinement with backbone and side-chain fitting in the electron density map the model generated will have low R factor values (‘residual’ or ‘reliability’ (R) factor). The factors that govern the model correctness during refinement process is the crystallographic R factor (Rcryst) that is the sum of the absolute differences between observed and calculated structure factor amplitudes:

Rcryst =  Fobs - Fcalc /  Fobs

59 CHAPTER 2: GENERAL MATERIALS AND METHODS

But following Rcryst as a guide during refinement can be risky because it may lead to an inappropriate model. To avoid this, Rfree parameter is recommended by experts, which is similar to the Rcryst except that it is calculated from a part of the collected data that has been chosen to exclude randomly from refinement and map calculation. An appropriate model should have decreased values for both R factors (Rcryst and Rfree) between a 10 - 20 % range during the final refinement (Sivakumar, 2006; Ilari and Savino, 2008).

2.5.8 Structure Validation and Deposition

Model building and refinement is completely based on the crystal diffraction dataset which may have issues, thus quality assessment methods are required to assess the correctness of such models. Before deposition and publication it is very important for the crystallographer to remove any of the errors present in the model. Errors can be removed by any of these methods: (i) by using detailed information about well refined homologous structures from the databases, (ii) by using a quality checker by recommended experts and (iii) by using various local quality checks (Sivakumar, 2006).

Model validation generally considers two aspects of models to identify the errors: first it considers both model and crystallographic data and second it looks for the coordinates and B factor. Ramachandran plot analysis is the most important part of model validation. Finally, the highly refined model is deposited in the Protein Data Bank (http://deposit.rcsb.org/adit/).

2.5.9 Crystallization screen

The purified recombinant protein was concentrated up to 10 mg.ml-1 (section 2.3.4.1) and used for crystallization screens. Two crystallization screen methods were followed to obtain protein crystals. These methods are (1) Sitting drop method and (2) Hanging drop method (Figure 2.4). Initial crystallization screens were setup from in-house factorials using the Sitting drop method in 96 well plates (MRC, Molecular Devices, UK) with a 1:1 ratio of recombinant protein and reservoir solution. Conditions that yielded crystals were replicated to optimize the crystal growth and quality for high

60 CHAPTER 2: GENERAL MATERIALS AND METHODS resolution diffraction during X-ray diffraction using the Hanging drop method. Plates were maintained at 20C and were observed under an inverted microscope (Nikon SMZ1000, Japan) each day on daily basis for the 1st two weeks and once a week in the following time (Kori et. al., 2011 a, b).

After initial X-ray diffraction analysis of crystals obtained from different conditions optimization screens were set up with a gradient of pH and PEG, lower temperature and different metals and sugars were used as additives.

A B

Figure 2.4 Diagram of crystallographic methods followed in this study. (A) Sitting & (B) Hanging drop method. Reservoir solution (blue) contains buffer & precipitant. Protein solution & reservoir mixture (red) are usually mixed in a 1:1 ratio (adapted from http://www.bio.davidson.edu/Courses/Molbio/MolStudents/spring2003/Kogoy/protein.html)

2.5.10 X-ray diffraction and Preliminary data analysis

Crystals from sitting drop were fished out and frozen immediately in liq. N2 using 1:1 (v:v) ratio of mineral oil (Oxoid, Australia) and Paratone-N (Hampton, US) as cryo- protectant (Hope, 1988; Kori et. al., 2011a,b). Crystals were diffracted with X-rays (in- house diffractometer consisting of a Rigaku MicroMax-007 HF generator, VariMax ++ optics and an R-Axis IV detector) for 120 sec. under continuous flow of liq. N2 (-121C) to prevent crystal loss due to heat generated by X-rays. In-house X-ray diffracted crystals (3 to 3.5 Å) were further sent to the Australian Synchroton facility AS-MX1 and ADSC Quantum detector at Melbourne to collect high resolution dataset.

61 CHAPTER 2: GENERAL MATERIALS AND METHODS

A complete dataset was collected for crystals that diffracted X-rays up to 3.5 Å with >95 % completeness (Kori et. al., 2011 a, b). The collected dataset was processed with MOSFLM (Leslie, 1992) and SCALA from the CCP4 program suite (Collaborative Computational Project, Number 4, 1994). Self- rotation functions were calculated using GLRF (Tong and Rossmann, 1990). Space groups were determined using POINTLESS, an inbuilt programme of the CCP4 package (Evans, 2005). Molecular replacement method using PHENIX (Adams et. al., 2010) and library of search model from Protein Data Bank (PDB) protein structure was solved. For manual model building and visual inspection, program O (Jones et. al., 1991) and Coot (Emsley and Cowtan, 2004) were used on Linux platform.

62 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

CHAPTER 3

GLYCOSYL HYDROLASES AND TRANSFERASES FROM HALOTHERMOTHRIX ORENII

CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

3.1 Introduction

Extremophilic microorganisms have special features which enable them to survive and grow in environments which are extreme from an anthropocentric point of view. These organisms use specialized proteins and enzymes for their metabolic processes and other specific biological functions under extreme environmental conditions (1), as discussed in Chapter 1. These specialized enzymes are of the utmost importance for investigating the evolution, adaptation and biotechnological significance to humans and environment.

This chapter provides a summary of extensive investigations on the carbohydrate hydrolyzing enzymes utilized by the H. orenii for the breakdown of the di- and polysaccharides and their manipulation to obtain carbon and energy and perform other metabolic processes. In addition, Bioinformatical databases and instrumental methods in molecular and biochemical characterization along with structural studies are described. The studies reviewed here form an essential basis for the group of experiments in the coming chapters in relation with the functional and structural analysis of GH.

Glycoside Hydrolases (GH) and Glycosyl Transferases (GT) selection for investigation

This chapter includes preliminary investigations on GH and GT, subsections of this chapter are about the detailed studies performed on enzymes those successfully cloned and purified. Overview of Chapter 3 is outlined in Figure 3.1.

H. orenii complete genome sequence cart (NCBI database: NC_011899 and in-house database gene cards) were explored for genes which encode for mature GH and GT enzymes with open reading frame (Appendix I). Ample attention was given to the absence of the restriction sites within the selected genes to avoid complication while gene cloning. Selected proteins are described briefly in the ascending order of their gene number in Table 3.1.

63 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Table 3.1 Glycoside hydrolases and Glycoside Transferases from H. orenii

S.no BASYS Gene no. EC no.** Protein name** Gene size/ CAZy Family/ Enzyme assays and Results

& name/ protein clans

NCBI Ref. no./ size/Mw

KEGG no. oNP & pNP Coupled  pI pI

assay A assay Cloned Expression

Purification Reference chapter

1 0126 / rbsK 2.7.1.4 Fructokinase 945 bp/ pNPADG (+) YP_002510008.1 314 aa/ -- pNPBDF (+) D- N K00847 34.8 kDa pNPADAp (+) fructose B 4.83 + + 3 P PGA (+) ( - ) pNPBDM (+) 2 0154 / bfrA 3.2.1.26 - fructosidase 633 bp/ YP_002509984.1 (- fructofuranosidase) 210 aa/ ------5.94 + - - - K01193 24.1 kDa 3 0369 / - 3.2.1.55 -N- 948 bp/ GH-43 pNPALAp(+), YP_002509798.1 arabinofuranosidase 315 aa/ GH-F Fold 5-fold pNPBDGa (+) -- 5.27 + + P 3.2 K01209 35.5 kDa β-propeller 4 0645 / ydjH 2.7.1.45 2-dehydro-3- 984 bp/ (pTHrSKA) deoxygluconokinase 327 aa/ ------5.27 - - - - YP_002509566.1 2.7.1.15 Ribokinase 36 kDa K00874 5 0973 / bglAH 3.2.1.21 -glucosidase A 1356 bp/ GH-1 oNPBDG(+),

YP_002509272.1 451 aa/ GH-A Fold (β/)8 pNPBDF(+) -- 5.18 + + P 3.1 KO5350 52 kDa pNPBDM(+) pNPADG(+)

64 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

pNPALAf(+) pNPALF(+) 6 1006 / MAN2C1 3.2.1.24 -mannosidase 2766 bp/ GH-38

YP_002509245.1 921 aa/ Fold: (β/)7 -- -- 5.78 - - - - K01191 107 kDa 7 1593 / - 2.7.1.4 Sugar kinase ribokinase 918 bp/ pNPADG(+), YP_002508722 family 305 aa/ 34 -- pNPBDF(+), -- 4.77 + + P 3.3 K00847 kDa pNPALAp(+) 8 1594 / amyB 3.2.1.1 -amylase 2 1281bp/ GH-13; pNPADG(+),

YP_002508721 426 aa/ GH-H Fold (β/)8 pNPBDF(+), K01176 49.9 kDa pNPALAp(+) Starch N 5.73 + + 3 PGA(+) azure (-) P pNPBDM(+) pNPBDGa (+) 9 1715 / AGL 2.4.1.25 4--gluconotransferase 1986 bp/ GH-13 --

YP_002508612 3.2.1.33 Amylo--1-6- 661 aa/ GH-H Fold (β/)8 -- 8.88 - - - - K01196 glucosidase 77.5 kDa 10 2075 / frvX 3.2.1.4 Endoglucanase 879 bp / GH- -- - 292 aa / 5,6,7,8,9,12,44,45,4 -- 6.16 - - - - K01179 32.5 kDa 8,51,61,74 11 2216 / GBA 3.2.1.45 Glucosylceramidase 1422 bp/ GH-30 pNPALF(+),

YP_002508172 473 aa/ GH-A Fold: (β/)8 pNPBDF(+), N -- 4.95 + + 3 K01201 53.9 kDa pNPBDGa (+) P

12 2338 / pfkA 2.7.1.11 6-phosphofructokinase 999 bp / pNPALF(+), D- N 6.27 + + 3 YP_002508077.1 332 aa/ -- pNPBDF(+), fructose P

65 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

K00850 35.7 kDa pNPBDGa (+) 6- phosphat e (-) 13 2393 / bglA(H) 3.2.1.73 Licheninase 1512 bp/ GH-16 -- YP_002508023 3.2.1.39 (-glucanase precursor) 503 aa/ GH-B Fold: β-jelly -- 5.65 + - - - K01199 57.4 kDa roll 14 2507 / yjeF 2.7.1.? Sugar kinase, 1542 bp / pNPALF(+), YP_002507924 Not clear carbohydrate kinase 513 aa/ 54 -- pNPBDF(+), N -- 5.49 + + - - kDa pNPBDGa (+) P

 http://www.cazy.org/fam/acc_GH.html;  www.au.expasy.org/cgi-bin/protparam; ** Suggested from the best BLAST match using BLASTp against reference proteins, protein data bank proteins, swissprot protein sequences (http://blast.ncbi.nlm.nih.gov/Blast.cgi);  Clone no. 0126, 0645 and 1593 all are sugar ribokinase (Fructokinase, 2-Keto-D-Gluconate kinase, may be Tagatose-6-Phosphate kinase);  P – purified; NP – Not purified; pNPALAf- 4-nitrophenyl-α-L- arabinofuranoside; pNPALAp- 4-nitrophenyl-α-L-arabinopyranoside; pNPADF- 4-nitrophenyl-α-D-fucopyranoside; pNPBDF- 4-nitrophenyl-β-D- fucopyranoside; pNPBDGa-4-nitrophenyl-β-D-galactopyranoside; oNPBDG- 2-nitrophenyl-β-D-glucopyranoside; pNPADG- 4-nitrophenyl-α-D- glucopyranoside; pNPBDM- 4-nitrophenyl-β-D-mannopyranoside; pNPADX- 4-nitrophenyl-α-D xylopyranoside; pNPBDX- 4-nitrophenyl-β-D- xylopyranoside; PGA- Phenyl-β-D-glucuronic acid monohydrate; NAc- 1-Naphtahyl acetate ‘__’ matches the CAZy (Carbohydrate Active Enzymes) family;  NCBI reference no. for gene BASYS02075 is YP_002508080.1 as a Zinc dependent alcohol dehydrogenase, but in the Local protein BLAST using Bioedit software it comes out as endoglucanase enzyme whereas, doing NCBI protein BLAST with reference to Protein Database results as aminopeptidase as best possible match; ( + ) : Positive result; ( - ) : Negative result

66 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Fructokinase

The fructokinase (EC.2.7.1.4; rbsK) was first isolated and characterized from Pisum sativum (pea) by Frankart and Pontis in 1976, after two decades of its initial report from pea (Medina and Sols 1956). Since then plant and bacterial fructokinase have been isolated and characterized from various plants. Fructokinase is an active member of a specific phosphoenolpyruvate phosphotransferase independent (PTS) system, in which nucleotide- dependent sugar kinases are used for monosaccharide phosphorylation (Nocek et al., 2011).

Fructokinase phosphorylates the freely available monosaccharide such as fructose and glucose as the initial step of metabolic pathways. It is ubiquitous and highly specific for transferring the phosphate group from donor ATP to the recipient D- fructose to produce D-fructose-6-phosphate (F-6-P). This F-6-P is further utilized by other specific metabolic pathways (Figure 3.2).

Based on a homology search fructokinase (rbsK gene) from H. orenii genome belongs to the ribokinase/ pfkB superfamily. Members from this family utilize a wide range of carbohydrates as their respective substrates for phosphorylation with highly conserved structures. Generally, fructokinase encoding genes are present on the bacterial chromosome (Zembrzuski et. al., 1992; Fennington et. al., 1996) but there are some reports which mentioned their isolation from the sucrose utilization gene cluster (Aulkemeyer et. al., 1991; Bockmann et. al., 1992; Luesink et. al., 1999; Reid et. al., 1999; Bogs et. al., 2000).

Bioinformatic investigations have shown that Fructokinase is a member of bacterial Repressors, uncharacterized Open reading frames and the sugar Kinases (ROK) sequence family. There are more than 5,000 members of the ROK family broadly distributed in the environment and all the domains of life. These members show a high functional diversity within the ROK family (Titgemeyer et. al., 1994).

To date, few crystal structures of fructokinase have been studied from various organisms. Some reasons for limited structure may be difficulties during protein crystallization, inherent properties of the protein or lack of resources.

67 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Among the solved structures, recently one thermophilic fructokinase was reported from H. orenii, which brings light on the structure and mechanism of phosphorylation. The structure consists of a Rossman-like fold and a β-sheet as a lid for dimerization; the same lid regulates the access of the substrate to the binding region (Chua et al., 2010).

Figure 3.1 Fructose metabolism pathways with reference to H. orenii (KEGG database (http://www.genome.jp/kegg/).

68 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Alpha amylase 2

Starch is the most abundant polysaccharide after cellulose produced by plants and is an extremely important food source for microbes and raw substrate for industrial processes, where it is utilized for the starch liquefaction (Mijts, 2001). The function of -amylase is to lower the viscosity of the gelatinized starch and hydrolyse it to maltodextrin, so as to obtain a degree of polymerization of 10-15. Bacterial - amylase (1-4, -glucan 4 glucanohydrolase, EC. 3.2.1.1) is an endoglucanase that preferentially hydrolyses internal -1-4- glucosidic bonds. The principal products from starch degradation are maltotriose and maltose from amylose, and maltose and glucose from amylopectin (Figure 3.3). -amylase is an important member of starch and sucrose metabolism pathway (Figure 3.4) and classified into either of two GH families, GH13 and GH 57 (CAZy database) (Tan et. al., 2008).

Figure 3.2 Schematic representation of starch degradation.

Detailed investigation on biochemical properties and tertiary structure of  amylase A and B from H. orenii has been reported in detail by previous workers, as mentioned in chapter one section 1.10 (Mijts and Patel, 2002; Sivakumar, 2006; Sivakumar et. al., 2006; Tan et. al., 2008).

69 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Figure 3.3 Starch and sucrose metabolism pathways with reference to H. orenii (KEGG database (http://www.genome.jp/kegg/).

70 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Figure 3.4 Phylogenetic interpretation of amy2 from H. orenii with its close relatives. NCBI reference numbers are given in parenthesis. Bootstrap values >95 are shown. The scale bar indicates amino acid changes per 100 amino acids. H. orenii (YP_002508721), Petrotoga mobilis SJ95 (YP_001568729.1, unpublished), Acholeplasma laidlawii PG-8A (YP_001620650.1, Akopian et. al., 2011), Lactobacillus farciminis KCTC 3681 (ZP_08576748.1, Nam et. al., 2011), Turicibacter sanguinis PC909 (ZP_06623132.1, Cuiv et. al., 2011), DSM 4359 (YP_002534279.1, unpublished), Thermotoga maritima MSB8 (NP_229450.1, Haft et. al., 1999), Thermotoga naphthophila RKU-10 (YP_003346656.1, unpublished), Alistipes sp. HGB5 (ZP_08513420.1, unpublished), Halogeometricum borinquense DSM 11551 (YP_004037641.1, unpublished). Dendrogram was prepared using TREECON software.

71 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Glucosylceramidase

Glucosylceramidase (EC. 3.2.1.45, GCase) is a hydrolytic enzyme that cleaves the β- glucosidic linkage between ceramide and glucose in glucocerebroside (glucoceramide). Most of the fully investigated GCase has been reported from mammals including Homosapiens, Mus musculus, Rattus norvegicus and Bos taurus. From bacterial Paenibacillus sp. is the only source from where GCase has been reported and studied in detail. GCase from Paenibacillus sp. does not belong to classical GH family 30; instead it showed significant similarity with GH family 3 β- glucosidases from Clostridium thermocellum, Ruminococcus albus and Aspergillus aculeateus (Sumida et. al., 2002). GCase identified from H. orenii belongs to classical GH family 30. Homologs based on amino acid sequence search of GCase from H. orenii are used to interpret phylogenetic relation with other bacterial GCases are shown in Figure 3.5.

In humans glucosylceramidase is precursor for all the glycosphingolipids (GSL) and significantly important for axonal morphology and neuronal functions. Genetic deficiency of GCase in humans causes Gaucher disease. In this disease GCase does not hydrolyse the glucosylceramide due to its accumulation in lysosomes (Sumida et. al., 2002; Friedman et. al., 2004). Interestingly, Gaucher disease (a lysosomal storage disorder) is clinically linked with synucleinopathies Parkinson disease an adult neurodegenerative disorder and dementia with Lewy bodies but their mechanistic relation is not known. More targeted research is required on GCase to lysozomes to design specific therapeutic approach for Parkinson disease and other related disorders (Mazzulli et. al., 2011; Sardi et. al., 2011).

In 2006 Brumshtein and associates reported and described the structure of GCase with two molecules. The structure showed some conformational variances due the changes in loops present in catalytic region that consists of four putative glycosylation sites (Asn19, Asn59, Asn146 and Asn270). Deoxynojirimycin is a known inhibitor of GCase, which is also a strong inhibitor of β-glucosidase (Offman et. al., 2010; Kori et. al., unpublished). Complete structural analysis will provide the molecular tool to improve functional understanding of GCase for use in enzyme- replacement therapy against Gaucher disease (Brumshtein et. al., 2006).

72 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Presence of GCase in H. orenii raises some question about its presence in this thermophilic bacterium. Detailed investigation on biochemical characteristics and structural features are required to unveil GCase properties and use them for therapeutics and understanding relationship between the mammalian and bacterial GCase.

Figure 3.5 A dendrogram showing the phylogenetic position of Glucosylceramidase from H. orenii and its closest relatives. NCBI reference numbers are given in parenthesis. Bootstrap values > 45 are shown. The scale bar indicates amino acid changes per 100 amino acids. H. orenii (YP_002508172), Bacillus coahuilensis (ZP_03227552.1, unpublished), Paenibacillus sp. (ZP_04854107.1, unpublished), Haloplasma contractile (ZP_08555401.1, unpublished), Bacillus cellulosilyticus (YP_004096478.1and YP_004093702.1, unpublished), Bacillus sp. (ZP_07709809.1, unpublished), Methylobacter tundripaludum (ZP_08782073.1, unpublished), Clostridium sp. (ZP_05132242.1, unpublished), Methylomicrobium album (ZP_08485570.1, unpublished), Paenibacillus curdlanolyticus (ZP_07388361.1, unpublished), Thermoanaerobacter tengcongensis (NP_623885.1, Bao et. al., 2002). Figure was prepared using TREECON software.

73 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Phosphofructokinase

Phosphofructokinase (EC. 2.7.1.11, pfkA) is an essential enzyme of glycolysis in the Embden-Meyerhof -Warburg pathway which is nearly ubiquitous in all life forms (Ding et. al., 2001). pfkA catalyses the formation of fructose 1-6 bis-phosphate and ADP from fructose 6-phosphate and ATP are formed during the irreversible glycolysis. This reaction was first discovered by Dische (1935) and by Ostern and co-workers in 1936. First phosphofructokinase was obtained from yeast by Negelein (1936). The PFK actively participate in fructose metabolic pathway (Figure 3.2).

Fructose 6-phosphate + ATP = fructose 1-6 bis-phosphate + ADP

Phosphofructokinase varies in different organism and exhibit great complexity during their kinetic study. Generally two type of phosphofructokinase has been reported one that follow Michaelis-Menten kinetics and one which don’t follow. The one that does not follow Michaelis-Menten interestingly occurs in animals, plants, and in some microorganisms; while the second type found in those organisms that have specialized metabolism (Hofmann, 1976).

The pyrophosphate dependent 6-phosphofructokinase from the thermotolerant methanotroph Methylococcus capsulatus Bath have been characterized recently (Reshetnikov et. al., 2008). This PPi-PFK performs the reversible reaction and can function in both gluconeogenesis and glycolysis (Merterns, 1991). Phylogenetic study of PPi-PFK and ATP-PFK revealed that they share common ancestry with a very complex evolutionary history due to possible gene duplication, horizontal gene transfer and phosphoryl donor changes (Reshetnikov et. al., 2008). ATP-dependent 6-phosphofructokinase from hyperthermophilic archaea Archaeoglobus fulgidus strain 7324, , Desulfurococcus amylolyticus and bacterium Thermotoga maritima has optimal activity at 85 C, 90 C and 93 C with thermostability up to 95 C and 100 C. Divalent cations such as Mg2+, Mn2+, Ni2+, Co2+ and Fe2+ were required for maximal activity. ATP-PFK from T. maritima and A. pernix showed high sequence similarity to members of PFK-A and PFK-B sugar kinase family and contained the conserved allosteric effector-binding sites for ADP

74 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII and PEP (Hansen and Schonheit, 2000; Hansen and Schonheit, 2001; Hansen et. al., 2002; Hansen and Schonheit, 2004).

Phylogenetic analysis of pfkA from H. orenii and PFK from other organisms showed that it clustered in cluster 2 in Figure 3.7. pfkA from H. orenii showed close relationship with Acetohalobium arabaticum, Clostridium butyricum and Halanaerobium hydrogeniformans all are from phylum Fermicutes.

Structure of PFK from B. stearothermophillus, P. horikoshii OT3, T. maritima were solved in recent years (Riley-Lovingshimer et. al., 2002; Wong et. al., 2004 unpublished; Singer et. al., 2008 unpublished; Joint Center for Structural Genomic, 2005 unpublished). The PFK structure from B. stearothermophilus was a tetramer with subunit consists of two domains, each of which had a central β-sheet inserted between -helices (mostly seen in /β class of proteins). The topology of the β-sheet strand connections in both domains of PFK is different than any other known protein structure. The first domain has seven strands of β-sheet; the central five were parallel and the other two antiparallel to each other. The second domain had four parallel strands. The two sheets points towards a deep cleft that forms the active site. Each subunit forms close contacts with only two of the other subunits in the tetramer. The substrate binding residues are His246, Arg159, Arg240, Arg249, Met166, Glu219, Asp124 (Site A), Arg168 (Site B) and Arg151, Arg21, Arg25, His212 and Thr155 (Site C) (Evans and Hudson, 1979; Banaszak et. al., 2011).

75 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

1

2

Figure 3.6 Phylogenetic interpretation of pfkA from H. orenii with selected organisms. NCBI reference numbers are given in parenthesis. Halothermothrix orenii H 168 (YP_002508077.1, Mavromatis et. al., 2009), Acetohalobium arabaticum DSM 5501 (YP_003828429.1, Sikorski et. al., 2010), Halanaerobium hydrogeniformans (YP_003995589.1, Brown et. al., 2011), Thermincola sp. JR (YP_003641068.1, unpublished), Geobacillus kaustophilus HTA426 (YP_148593.1, Takami et. al., 2004), Thermosediminibacter oceani DSM 16646 (YP_003825255.1, Pitluck et. al., 2010), Geobacillus sp. Y4.1MC1 (YP_003988254.1, unpublished), Anoxybacillus flavithermus WK1 (YP_002314868.1, Saw et. al., 2008), Bacillus halodurans C-125 (NP_244030.1, Nakasone et. al., 2000), Thermosinus carboxydivorans Nor1 (ZP_01665182.1, unpublished), Desmospora sp. 8437 (ZP_08466187.1, unpublished), Clostridium butyricum 5521 (ZP_02949422.1, unpublished).

76 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Beta-Glucosidases

Beta-glucosidase (EC.3.2.1.21) biochemical and structural investigations are fully described in Chapter 3 section 3.1.

Beta-Galactosidase/ Alpha-L-arabinopyranosidase

Detail study of -arabinofuranosidase (no EC. number) is fully described in Chapter 3 section 3.2.

Ribokinase (sugar kinase)

Ribokinase (EC. 2.7.1.4) studies are fully described in Chapter 3 section 3.3.

77 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

3.2 MATERIALS AND METHODS

Ortho/ nitrophenyl glycoside hydrolysis assay

The most commonly used chromogen in kinetic experiments is Nitrophenol. Many different mono-/ oligo- saccharides are linked with the oNP (ortho Nitrophenyl) or pNP (para Nitrophenyl) derivatives (Sigma; Carbosynth). These best artificial substrates for the preliminary screening of the hydrolytic enzyme activity of glycosidases (e.g. β- glucosidase, endoglucanase etc.). The substrates can be used for both continuous and discontinuous assays at the elevated temperatures at which the thermostable GH shows their activity. In this study, all the enzymatic activity was determined by discontinuous assay method. The discontinuous method was performed in assays with fixed substrate concentration such as pH optima, Temperature optima and qualitative assays during protein purification and thermostability experiments.

Table 3.2 Substrates and medium used for detecting the enzymatic activity of the overexpressed recombinant proteins (as mentioned in Table 3.1) Medium Artificial chromogenic Natural sugar compounds substrates Oligo/ Disaccharides Polysaccharides - Agar plates pNPALAf; Cellobiose, lactose, Arabinoxylan, - Cell lysate pNPALAp; linear arabinan, arabinogalactan, pNPADF; pNPBDF; trehalose, xylobiose, xylan oNPBDG; pNPADG; xylotriose, maltose, pNPBDM; melibiose, isomaltose pNPADX; pNPBDX; PGA; NAc

78 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Disaccharides

It is important to specify the physiological substrates (di/ oligo- saccharides) of GH which make the difference while determining the enzymatic activity of glycosyl hydrolases, rather than just artificial phenol glycosides. Cellobiose and lactose were determined as specific substrates of β- glucosidase and - arabinofuranosidase. The assays were performed in a total 300 µl volume, containing appropriate concentration of substrate (in mM) with 1 µg/ µl of recombinant enzyme. Reactions were performed at the optimum temperature of the respective enzyme for different time interval. Released monosaccharides were analysed using TLC technique, which is described in detail in the Chapter 2 (section 2.2.18.3).

Congo red assay

For detecting hydrolytic activity against β 1-4 linkage of polysaccharide carboxymethyl-cellulose (CMC; β1-4) Congo red dye assay was used. In the positive results a zone of clearance appears which indicates that the dye is attached with the undigested polysaccharide, whereas the hydrolyzed area shows a clear halo zone. The assay was carried out on the agar plates containing CMC (1 g.l-1) in 100 mM phosphate buffer (pH 7.0), by spreading 10 µl of cell lysate on the plate, followed by 2 hr incubation at 50 C ( under moist condition to prevent the drying of cell lysate). Activity was analyzed by staining the plate with 1 g.l-1 of Congo red in deionized water for 60 min. Subsequently, plate was washed twice with 2 M NaCl. In this assay endoglucanase activity was not observed.

79 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

3.3 RESULTS

Nine out of the 14 genes were amplified (Figure 3.7) and included 6 GHs (β- fructosidase, -arabinofuranosidase, β-glucosidase, -amylase 2, endoglucanase and glucosylceramidase) and 3GTs (fructokinase, ribokinase and 6-PO4 fructokinase) enzyme classes. These amplified genes were cloned in the T7 expression system (pET22b+) using DNA ligase, through optimized insert: vector ratio. Recombinants were screened and confirmed the classical red/white colony selection strategy on the MacConkey Amp+ agar, colony PCR, insert DNA sequencing and RE digestion.

bp M 1 2 3 4 5 6 7 8 9 10 11 12

2322 2027

564

Figure 3.7 PCR results: Lane M: DNA/HindIII ladder; 1, gene 0126 (fructokinase); 2, gene 0369 (β-galactopyranosidase/ -L-arabinopyranosidase); 3, gene 0645 (fructokinase); 4, gene 0973 (β-glucosidase); 5, gene 1593 (ribokinase); 6, gene 2338 (6-PO4 fructokinase); 7, gene 2507 (sugar kinase); 8, gene 2216 (glucosylceramidase); 9, gene 2393 (licheninase); 10, gene 1715 (amylo-- glucosidase); 11, gene 2075 (endoglucanase); 12, gene 1594 (- amylase2).

80 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Protein overexpression screening

Confirmed positive clones were overexpressed in mesophilic expression host BL21 (DE3) for enzyme activity screening. Cells were grown in LB amp+ and/ or camp+ broth up to the initial log phase to induce the gene expression system by appropriate concentration of isopropyl-β-D-thiogalactopyranoside (IPTG). Protein overexpression was optimized by (i) different concentrations of IPTG (ii) temperature and (iii) aeration rate and (iv) different vector system (pET15b+).

Out of 9 positive clones, 8 genes were successfully overexpressed under optimized conditions (Figure 3.8). In the present study no inclusion bodies were observed, all the over expressed proteins were completely soluble and active against their respective substrates (Table 3.1).

Protein Purification

Expression of the thermophilic proteins in Escherichia coli make their purification easy by employing their superior thermostability, heat incubation of cell lysate followed by the centrifugation usually removes the 70 – 80% of the host cell proteins. Further the subsequent purification of recombinant protein was performed by using Immobilized metal affinity chromatography (IMAC) technique. Recombinant proteins were expressed in 100 mL culture of E. coli BL21 (DE3) cells in LB Amp+ medium (6 hr). Cells were lysed after harvesting them, in cell lysate buffer pH 7.0 and heat incubated at 45 C for 30 min, after which the precipitated proteins were removed by centrifugation (10,000 g, 10 min). The cleared supernatant was filtered through 0.22 µm PVDF low protein binding filter, to remove any remained cell particles. Nickel affinity chromatography was used for purification, using Fractogel resin (Merck). The column bound recombinant protein was eluted with increasing gradient of imidazole elution buffers. From a large number of expressed proteins, only - arabinofuranosidase, β- glucosidase and ribokinase were able to purify >95 %. The purified recombinant proteins were stored at 4 C for further investigations.

81 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Activity Assay

Glycosyl hydrolase activity detected by standard assay conditions and other available methods, depending on the level of purification (cell colony on solid medium or cell lysate) and substrate specificity are given in Table 3.1.

82 CHAPTER 3: GLYCOSYL HYDROLASES AND TRANSFERASES FROM H. ORENII

Figure 3.8 SDS-PAGE images of over expressed recombinant protein of H. orenii. Lane M in all figures is SDS-PAGE ruler; A) Lane 1- negative control, 2- 3 overexpression of 0126 (fructokinase) after 1 and 6 hr of IPTG induction; B) 1- negative control, 2- 0369 (β-galactosidase/ -L- arabino pyranosidase) expression after 6 hr of IPTG induction; C) 1- negative control, 2- 0973 (β- glucosidase) expression after 6 hr of IPTG induction; D) 1- negative control, 2-3 1593 (Ribokinase) expression after 1 and 6 hr of IPTG induction; E) 1- negative control, 2- 1594 (- amylase2) expression after 6 hr of IPTG induction; F) 1- negative control, 2- 2216 (glucosylceramidase) expression after 6 hr of IPTG induction; G) 1- negative control, 2- 2338 (6-Phosphofructokinase) expression after 6 hr of

IPTG induction; H) 1- negative control, 2- 2507 (sugar kinase) expression after 6 hr of IPTG induction.

83 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

CHAPTER 3

Section 3.1

BIOCHEMICAL AND STRUCTURAL INVESTIGATION

OF β-GLUCOSIDASE FROM H. ORENII

CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.1 INTRODUCTION

Cellulose is the most copious carbohydrate present on earth, with the estimated annual synthesis of 4 x 1010 tons (Coughlan, 1989) from plant sources. Microorganisms use cellulose as a carbon source and so do biotechnologists who have keen interest in using cellulose as a renewable source for biofuels and other significant chemicals.

In the native form cellulose molecules are bound together by hydrogen bonds to make large and strong crystalline fibres. This physical strength makes it hard to convert the cellulose to glucose by enzymatic means. The structural complexity and inflexibility of cellulosic substrates have given rise to a remarkable diversity of hydrolytic enzymes. All organisms that can hydrolyse cellulose actually use a complex of enzyme systems. These systems are composed of a variety of enzymes that work synergistically with different specificities and types of action. These enzymes are classified as a part of the Glycosyl hydrolases (GH) family in the CAZy database. Glycosyl hydrolases enzymes play a major role in hydrolysis and energy generation from cellulose.

The β-glucosidase is a ubiquitous enzyme secreted by all living organisms of the three domains of life, from microscopic bacteria to large, highly evolved mammals, with various utilities as per the need of the organism. In bacteria and fungi, β- glucosidases are used for hydrolysis of cellobiose, oligosaccharide and disaccharide, where the activity reduces as the number of glucose units increases. Insects use it as cyanide releasing machinery, as a part of their defence strategy. Plants use β- glucosidase as a precursor of phytohormone hydrolysis, in pigment metabolism, seed development and biomass conversion. In humans deficiency of β-glucosidase leads to Gaucher’s disease, that results in the enlargement of the spleen, liver and lymph nodes (Bhatia et al., 2002). Table 3.1.1 shows a comparison among β-glucosidases from different sources.

The cellulases have been ordered into three main groups; (i) exo-glucanases or cellobiohydrolases (EC. 3.2.1.74) (ii) endo-glucanases (EC. 3.2.1.4) and (iii) β- glucosidases (EC. 3.2.1.21). The endoglucanase catalyze the random cleavage of internal glucosidic bonds of cellulose chain, while the exoglucanases or

84 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii cellobiohydrolases attack the chain ends, releasing the cellobiose. β-glucosidase is strongly active on cellobiose releasing glucose units from cellobiose (Figure 3.1.1).

β-glucosidases are the main constituents of Glycosyl hydrolases category. According to the CAZy classification β-glucosidase belongs to GH family 1 and 3. It is part of the cellulose hydrolysing cascade that catalyzes the selective cleavage of glucosidic bonds. In some organisms, β-glucosidases work synergistically with cellobiohydrolases and endoglucanases.

β-glucosidases constitute an assembly of well characterized, biologically important enzymes that catalyze the transfer of glycosyl group between oxygen nucleophiles. Under biological conditions, such a transfer reaction results in hydrolysis of glucosidic bonds which links the glucose units in aryl-, amino-, or alkyl-β-D- glucosides, cyanogenic glucosides, oligosaccharides and disaccharides.

In this chapter, β-glucosidase from H. orenii (designated HoBglA) was investigated. The gene was identified and cloned to analyse the biochemical properties of β-glucosidase. This enzyme showed interesting characteristics about its enzymatic activity. HoBglA is efficient for hydrolysing three different substrates (β-D-glucosides, β-D-fucosides and β-D-galactosides). Due to the presence of three hydrolytic activities, attention was given to understand the molecular mechanism by which HoBglA utilizes various substrates. Biochemical characterization suggests that HoBglA may modify its structural fold and catalytic ability in the presence of various metals, inhibitors, solvents and sugars. To reveal the mechanism at the atomic level, crystallographic investigations were performed. Investigation was focused to obtain ligand bound protein crystals of high X-ray diffraction quality. To achieve this goal several screens were designed with various pH and PEG conc., lower temp., salt and metals conc. and crystal dehydration. Results from metal inhibition study on HoBglA motivated us to grow and investigate more crystals with metals. This strategy worked and gave encouraging results and helped us in solving the complete structure with Ni2+ ion bound with catalytic residues of the HoBglA.

85 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Figure 3.1.1 Molecular mechanism of cellulose hydrolysis by action of endoglucanase, exoglucanase (cellobiohydrolase) and β-glucosidase.

86 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Table 3.1.1 Comparative chart of β-glucosidase enzyme from different sources showing β-fucosidase and β-galactosidase activity

EC No. Source Size Opt. Temp. 3D Inhibitors Reference Remarks ()/ pH structure (kDa)

3.2.1.21 H. orenii 53 45/8.5; Yes Ni+, Deoxynojirimycin, Hg2+, Present study -β- glucosidase, β-fucosidase and β- 60/8.5; Tris galactosidase activity present 65/7.0

3.2.1.21/ Bifidobacterium 48 45/ 5.5 No Cu+, Ag+, Hg+, 1-amino-1- Nunoura -β-fucosidase activity present 3.2.1.38 breve deoxy-D-glucose et. al., 1996

3.2.1.38 Sheep liver 95 37/ 4.5-5.5 No HgCl2, KCN, Tris, maltose Chinchetru -β- glucosidase, β-fucosidase and β- and lactone et. al., 1983; galactosidase activity present; best

1989 Km values for glucosidase activity; -kinetic study indicates for 2 active site i.e. fuco- gluco and galacto site

3.2.1.38 Turbo cortunus - 37/ 5.5 No CuSO4, HgCl2, KCN Yamada -β- glucosidase, β-fucosidase and β- liver et. al., 1973 galactosidase activity present, -one more β- galactosidase active at pH 3.0 3.2.1.38 Human plasma - 37/ 5, 7.4 No pNP-β-D- fucose, pNP-β-D- Tandt, 1987 -β-fucosidase and β- galactosidase enzyme galactose and D-galactonic activity present acid-y-lactone

87 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.2.1.38 Bovine liver - 37/ 4.5-6.5 No D-fucose, glucose and Rodriguez -β-fucosidase, β- glucosidase and β- galactose et. al., 1982 galactosidase activity present, glucoside was best substrate

3.2.1.38 Helicella - 37/ 5.0 No δ- D- gluconolactone and δ- Calvo -β-fucosidase, β- glucosidase and β- ericetorum (Snail) D- galactonolactone et. al., 1983 galactosidase activity present; -kinetic study indicates for 2 active site i.e. fuco- gluco and galacto site

3.2.1.21/ Thai rosewood 330 30/5.0 No HgCl2, p-chloromercuri- Srisomsap -β- galactosidase activity not reported 3.2.1.38 seeds benzoate et. al., 1996

3.2.1.38 Aspergillus 57 40/6.0 No D-fucose, D- fucono-- Zeng -β-glucosidase and β- galactosidase phoenicis lactone, Ag+, Hg2+ et. al., 1992 activity absent

3.2.1.38 Achatina balteata 300, 37/ 5.4 No Iodine, Cu>Zn>Ni>Co>Mn Colas, 1978; -β-glucosidase and β- galactosidase (snail) 110 1980; 1981 activity present

3.2.1.38 Lactuca sativa 37 40/ 5.5 No D- fucose Giordani and -β-glucosidase and β- galactosidase Noat, 1988 activity absent

3.2.1.38 Asclepias 50 50/ 5.5 No D- fucose, Lafon, 1993 -β-fucosidase, β- glucosidase, β- curassavica FeHg>La>Cu>Zn>Ba galactosidase, β- glucosaminidase and - galactosidase activity present

3.2.1.38 Patella vulgate - 25/ 8.5 - - Levy -β-glucosidase and β- galactosidase et. al., 1963 activity absent

88 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.2.1.38 Littorina littorea - 37/ 5.5 No D- fucose, D- glucose and D- Melgar -β- glucosidase, β-fucosidase and β- L. galactose as competitive et. al., 1985 galactosidase activity present; inhibitors -β- fucosidase and β- glucosidase activities are mainly catalyzed by same fuco- gluco site with different galacto site. 3.2.1.21 Sulfolobus 60 75/ 6.5 No - D’Auria -β- glucosidase, β-fucosidase and β- solfataricus et. al., 1996 galactosidase activity present

3.2.1.21 Plumeria obtuse 61 37/ 5.5 No D- glucono 1-5 lactone, Boonclarm -β-fucosidase activity was higher than Cu>Hg>Fe et. al., 2006 β- glucosidase, natural substrate plant flower plumieride coumarate glucoside

3.2.1.21 Thermus 49 88/ 7.0 - - Dion et. al., -β- glucosidase, β-fucosidase and β- thermophilus 1999 galactosidase activity present, with transfucosylation activity

3.2.1.21 Rhynchosciara 65 30/ 6.2 No Sucrose, maltose, mellibiose, Ferreira and -β- glucosidase, β-fucosidase and β- Americana insect pNP - glucoside Terra, 1983 galactosidase activity present larva

EC.3.2.1.21: β-glucosidase, EC.3.2.1.38: β-fucosidase

89 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.2 MATERIALS AND METHODS

All the general methods used for molecular biology, biochemical studies and protein crystallography investigation are described in chapter 2. This section covers only the specific methods used to study recombinant β-glucosidase from H. orenii are given.

3.1.2.1 Circular Dichroism (CD) measurements

A recombinant protein with 10 mg.ml-1 concentration (determined by Bradford assay method) was used for CD measurements. All the measurements were taken in a cuvette with 0.1 cm path length. Data were collected at 65 C in a wavelength range of 190-260 nm with 1 nm bandwidth. The solvent spectrum was subtracted from the sample spectrum. Thermal melting was carried out at a rate of 5 C per min at 222 nm using protein sample in different organic solvent in 20 mM NaCl, 10 mM Na2HPO4, pH 7.5. CD measurements were made with Jasco J-715 Spectropolarimeter (PS-150J, Japan).

3.1.2.2 Analytical gel filtration (SEC-MALS) analysis

HoBglA (10 mg.ml-1 concentration, stored in 50 mM NaCl, 20 mM HEPES pH 7.2) was used for analytical gel filtration chromatography. Sephadex-12 column (Amersham Pharmacia) equilibrated with 100 mM NaCl, 20 mM HEPES pH 7.5 was used on a Duo-Flow FPLC system (Bio-Rad). Protein sample was eluted at flow rate of 0.5 ml.min-1.

3.1.2.3 Protein-ligand docking

In silico docking of cellobiose, lactose, fucoside and glucose into HoBglA was performed with online Docking Server http://www.dockingserver.com/web. For the docking experiments coordinate files for target ligands and protein were prepared by excluding all heteroatoms. Docking run took 120 min and was completed with the default parameters.

90 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.2.4 Differential scanning fluorimetry (DSF)

To determine temperature stability of HoBglA in presence of ligands DSF experiment was performed. All the reactions were performed in triplicates; reaction mixtures were prepared into the wells of a Light Cycler 480 Multi-well 96, clear plate (Roche). 1µl recombinant HoBglA (10 mg.ml-1) was mixed with 1µl of SYPRO Orange dye (100x) diluted with Milli-Q water from 5000x stock, 2mM substrates (oNPG, pNPF and pNPGal, glucose, lactose and cellobiose) and metals were added in a total volume of 20µl buffer (20mM HEPES, 100 mM NaCl, pH 7.5). The plate was sealed with a Light Cycler 480 sealing foil and was heated from 20-95 C at a rate of 1 C/min in a Light Cycler 480 Real-time PCR system (Roche).

3.1.2.5 Molecular Dynamic Simulation

The initial topologies for cellubiose, lactose and fucoside were obtained with PRODRG (Schuettelkopf and van Aalten, 2004). The molecular dynamics (MD) simulations were carried out with GROMACS 3.3.2 using the GMX force field and the SPC water model. The subsequent protocol followed the tutorial by Kerrigan (Kerrigan, 2003). One beta- glucosidase molecule obtained from the crystal structure (PDB code 3TA9) was solvated in a box of water. To neutralise the system, sodium ions were added. Using the steepest descent algorithm, the energy was minimised prior to the MD simulations in order to remove close contacts between atoms which occur during system preparation. Periodic boundary conditions were applied in all three directions. In order to maintain chemical bond distances between hydrogen and non-hydrogen atoms, the Linear Constraint Solver (LINCS) algorithm (Hess et al., 1997) was enabled. Simulations were run at constant temperature (300 K/ 27 C) and pressure (1 bar), using the Berendsen thermostat and barostat. The protein and solvent/ions were thermalized separately, and the coupling constants for thermostat and barostat were 0.1 and 0.5 ps, respectively. Electrostatic interactions were modelled using the Particle-Mesh-Ewald method (Darden et. al., 1993). The time step of the simulations was 2 fs, and the total simulated time in the production run was 20 ps. For analysis and visualisation, GROMACS tools (van der Spoel et. al., 2005) and O (Jones et. al., 1991) were employed.

91 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Scripts used to generate .mdp file during MD simulation run

em.mdp

; ; hofmanna ; 27.04.06 ; Input file ; title = title cpp = /lib/cpp define = -DFLEXIBLE constraints = none integrator = steep nsteps = 400 ; ; Energy minimizing stuff ; emtol = 2000 emstep = 0.01 ; nstcomm = 1 ns_type = grid rlist = 1 rcoulomb = 1.0 rvdw = 1.0 Tcoupl = no Pcoupl = no gen_vel = no

92 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii md.mdp

; ; hofmanna ; 27.04.06 ; Input file ; title = title cpp = /lib/cpp constraints = all-bonds integrator = md dt = 0.002 ; ps ! tinit = 0 ; ps nsteps = 10000000 ; total 20000 ps = 20 ns = 0.02q ms. nstcomm = 1 nstxout = 2500 ; collect data every 5 ps nstvout = 2500 nstfout = 0 nstlog = 100 nstenergy = 100 nstlist = 10 ns_type = grid rlist = 1.0 coulombtype = PME rcoulomb = 1.0 rvdw = 1.4 ; Berendsen temperature coupling is on in two groups Tcoupl = berendsen tc-grps = Protein SOL NA CBI tau_t = 0.1 0.1 0.1 0.1 ref_t = 300 300 300 300 ; Energy monitoring energygrps = Protein SOL ; Isotropic pressure coupling is now on Pcoupl = berendsen Pcoupltype = isotropic tau_p = 0.5 compressibility = 4.5e-5 ref_p = 1.0 ; Generate velocites is off at 300 K. gen_vel = no gen_temp = 300.0 gen_seed = 173529

93 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii pr.mdp

; ; hofmanna ; 27.04.06 ; Input file ; title = title cpp = /lib/cpp define = -DPOSRES constraints = all-bonds integrator = md dt = 0.002 ; ps ! nsteps = 10000 ; total 20.0 ps. nstcomm = 1 nstxout = 250 ; collect data every 0.5 ps nstvout = 1000 ; output velocities every 2.0 ps nstfout = 0 nstlog = 10 nstenergy = 10 nstlist = 10 ns_type = grid rlist = 0.9 coulombtype = PME rcoulomb = 0.9 rvdw = 1.4 ; Berendsen temperature coupling is on in two groups Tcoupl = berendsen tc-grps = Protein SOL NA CBI tau_t = 0.1 0.1 0.1 0.1 ref_t = 300 300 300 300 ; Energy monitoring energygrps = Protein SOL ; Pressure coupling is not on Pcoupl = no tau_p = 0.5 compressibility = 4.5e-5 ref_p = 1.0 ; Generate velocites is on at 300 K. gen_vel = yes gen_temp = 300.0 gen_seed = 173529

94 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Halothermothrix orenii

Bioinformatics study Molecular Biology Biochemical analysis Structural analysis

GH and GT Genomic DNA extraction Optimum Temperature, pH Protein Crystallization enzyme gene search PCR and pET22b (+) Thermal-stability of enzymes X- Ray diffraction of crystals ORF analysis and Vector preparation Primer design Effect of metal ions, Data analysis and molecular PCR product & pET22 b inhibitors and sugars model building Amino acid analysis (+) vector RE digestion - BLAST - Clustal W Enzyme kinetics study Structure refinement and - Protparam Vector and Insert ligation analysis - Protein signature Substrate Specificity Transformation & clone BRENDA database screening by PCR, rPlasmid search RE digestion, DNA sequencing

CaZY database search rProtein mini over-expression screen, SDS-PAGE analysis

KEGG database search Upscaling of culture & protein purification by IMAC, protein PDB database search estimation

Figure 3.1.2 Overview of methods used for investigation of β-glucosidase from H. orenii.

95 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

PCR/ Cloning Protein Crystallography Flowchart

Protein expression/ purification

Crystallization experiments: sitting drop/ hanging drop method

Crystals?

X-ray diffraction analysis

Crystals diffract < 3 Å?

X-ray intensity data collection

Phase determination

Closely related known 3D structure

Molecular replacement

Electron density map calculation

Protein model building/ Model refinement

Structure analysis/ Experimental design Figure 3.1.3 Overview of protein crystallization and structural investigation.

96 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3 RESULTS

3.1.3.1 Cloning, expression and purification

-Glucosidase enzyme encoding gene (NCBI no. YP_002509272.1; encodes 451 amino acids) was identified in Halothermothrix orenii genome cart. A recombinant plasmid, carrying a 1.35kb (Figure 3.1.4) NdeI and XhoI digested fragment of bglA gene purified

(Figure 3.1.5), cloned (pET22b.bglA.His6) (Figure 3.1.6) and successfully over- expressed in Rosetta2 (DE3) cells (Figure 3.1.7). Ni-NTA affinity chromatography was used to purify the HoBglA (Figure 3.1.8) as described above in chapter 2 (section 2.2.15). Each step of purification procedure is summarized in Table 3.1.2. SDS – PAGE analysis under reducing and denaturing conditions showed a band of purified protein (>95% purity) with a molecular mass of 53 kDa (Figure 3.1.8), which correlated with the theoretical molecular mass.

M 1

2322 bp

2027 bp 1356 bp

564 bp

Figure 3.1.4 PCR amplification of bglA gene. Lane M,  HindIII DNA standard size marker; Lane 1, PCR amplified bglA gene.

97 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

M 1 2

2322 bp 2027 bp 1356 bp

564 bp

Figure 3.1.5 PCR product purification. Lane M,  HindIII DNA standard size marker; lane 1, PCR amplified bglA gene physically excised from 1% agarose gel for purification after RE digestion; lane 2, Gel purified bglA gene product.

M 1 2 3

2322 bp

2027 bp 1356 bp

564 bp

Figure 3.1.6 Recombinant plasmid analysis. Lane M,  HindIII DNA standard size marker; lane 1, Colony PCR amplified bglA gene; lane 2, recombinant plasmid with insert (pET22b+ and bglA); lane 3, recombinant plasmid after RE digestion.

98 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

kDa M 1 2 3 4 5 6 7 8

170 130

100

70

55 53 kDa

40

35

25

15 10

Figure 3.1.7 HoBglA over-expression. Lane M, PageRuler™ Prestained Protein Ladder (SM0671); lane 1, Negative control of BL21 (DE3) cells; lane 2- 4, HoBglA expressed in BL21 (DE3) cells after 1 hr., 3 hr. and 6 hr.; lane 5, Negative control of Rosetta 2 cells; lane 6-8, HoBglA expressed in Rosetta 2 cells after 1 hr., 3 hr. and 6 hr.

kDa 1 M 2 3 4 5 6 7 8

170 130

100 70 53 kDa 53 kDa 55

40

35

25 15 10

Figure 3.1.8 HoBglA purification. Lane 1, Cell lysate containing HoBglA; lane 2, PageRuler™ Prestained Protein Ladder (SM0671); lane 2, flowthrough from column; lane 3 – 6, Column wash with 5, 15, 50 and 75 mM imidazole containing wash buffer; lane 8 – 9, Protein elute with 100 and 350 mM imidazole containing elution buffers.

99 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Table 3.1.2 Purification of BglA recombinant protein from Halothermothrix orenii. Protein concentration was determined using Bradford assays. Activity was determined using standard para nitro-phenyl assays under standard assay conditions. One unit of enzyme activity was defined as the amount of the enzyme that liberated 1nmol min-1 of oNP/pNP.

Step β-D-glucosidase β-D-fucosidase β-D- galactosidase Total Total Specific Total Specific Total Specific Yield Purification protein activity activity activity activity activity activity (%) (fold) (mg) (U) (U/mg) (U) (U/mg) (U) (U/mg) Crude 23 100 1.0 5111 26 153 6.6 - - extract Ni- 10 38 7.48 1944 194.4 383 38.3 166.6 16.6 NTA

100 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.2 Phylogenetic analysis

Figure 3.1.9 A dendrogram showing the phylogenetic position of β-glucosidase from H. orenii and its closest relatives. NCBI reference numbers are given in parenthesis. H. orenii (YP_002509272.1, Mavromatis et. al., 2009), β-galactosidase Dictyoglomus turgidum DSM 6724 (YP_002353685.1, unpublished), β-glucosidase A Dictyoglomus thermophilum H-6-12 (YP_002251501.1, unpublished), β-galactosidase Thermoanaerobacter ethanolicus CCSD1 (ZP_05492104.1, unpublished), β- glucosidase Thermoanaerobacter pseudethanolicus ATCC 33223 (YP_001665894.1, unpublished), β-galactosidase Thermoanaerobacter ethanolicus JW 200 (ZP_08211795.1, unpublished), β-galactosidase Alicyclobacillus acidocaldarius subsp. acidocaldarius DSM 446 (Mavromatis et. al., 2010), β-galactosidase Thermoanaerobacter mathranii subsp. mathranii str. A3 (YP_003676178.1, unpublished), β-glycosidase Alicyclobacillus acidocaldarius (AAZ81839.1, Di Laura et. al., 2006), β-galactosidase Thermoanaerobacter wiegelii Rt8.B1 (YP_004819179.1, unpublished), β-glucosidase Thermoanaerobacter brockii (CAA91220.1, Breves et. al., 1997), β-galactosidase Acetivibrio cellulolyticus CD2 (ZP_07326145.1, unpublished). Figure was prepared using neighbourhood joining method by TREECON software.

101 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.3 Optimum temperature and pH of HoBglA

The optimal temperature for β-glucosidase, β-fucosidase and β-galactosidase activities was 45 C, 60 C and 65 C respectively with β-glucosidase, β-fucosidase and β-galactosidase showing 60%, 95% and 80% activities respectively at 80 oC (Figure 3.1.10).

There was a variation in the pH optimum and the pH range for the three enzyme activities of HoBglA (Figure 3.1.11a, b, c). The pH optimum and pH range for β- glucosidase activity was broad and was pH 8.0 to 8.5 and from pH 6.0 to 9.0 respectively. Both β-galactosidase and β-fucosidase were most active over a broad pH range 7 to 8.5.

100 oNPG 80 pNPF pNPGAl

60

40

HoBglAactivity (%)

(unit per mg protein) 20

0 35 40 45 50 55 60 65 70 75 80 Temp (oC)

Figure 3.1.10 Optimum temperature of HoBglA as β- glucosidase ( 45C), β- fucosidase ( 60C) and β- galactosidase ( 65C). The error bar represents the means  SD (n=3).

102 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

100 NaPO4

80 Glycine-NaOH

60

40

(unit per mg protein) 20 -glucosidase activity (%)  0 6.0 6.5 7.0 7.5 8.0 8.5 9.0 pH

Figure 3.1.11a. pH optima of HoBglA against oNPG (β-glucosidase activity)

( Na-PO4,  Glycine-NaOH). The error bar represents the means  SD (n=3).

100 NaPO4 80 Glycine-NaOH

60

40

(unit per mg protein) 20 -fucosidase activity (%)  0 6 7 8 9 10 pH

Figure 3.1.11b. pH optima of HoBglA against pNPF (β-fucosidase activity)

( Na-PO4,  Glycine-NaOH). The error bar represents the means  SD (n=3).

103 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

100 NaPO4 80 Glycine-NaOH

60

40

(unit per mg protein) 20 -galactosidase activity (%)

 0 6.0 6.5 7.0 7.5 8.0 8.5 9.0 pH

Figure 3.1.11c. pH optima of HoBglA against pNPGal (β-galactosidase activity)

( Na-PO4,  Glycine-NaOH). The error bar represents the means  SD (n=3).

104 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.4 Thermostability

The thermostability of HoBglA was determined at various temperatures. β-glucosidase and β-fucosidase had a half-life of more than 20 min and 5 min at 60 and 65 C respectively, whereas β-galactosidase activity was the most stable with a half-life of 2.5 min at 70 C (Figure 3.1.12a, b). All the above activities were without EDTA and substrate.

100 60a

80 65a 60b 60 65b 70b 40

(unit per mg protein) 20

HoBglA residual activity (%) 0 0 5 10 15 20

Time (min)

Figure 3.1.12a. Temperature stability of HoBglA (β-glucosidase;  60C,  65C and β-galactosidase;  60C,  65C,  70C) against oNPG and pNPGal respectively. The error bar represents the means  SD (n=3).

105 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

100 45 80 50 55 60 60 65 40 70

(unit per mg protein) 20

-fucosidase activity (%) 

0 0 15 30 45 60 75 90 105 120 Time (min)

Figure 3.1.12b. Temperature stability of HoBglA (β- fucosidase) against pNPF as substrate at different temperatures ( 45C,  50C,  55C,  60C,  65C,  70C). The error bar represents the means  SD (n=3).

106 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.5 Effect of salts, metal ions, solvents, reagents, sugars and inhibitor on the activity of HoBglA

The effects of various metal ions and sugars were investigated on HoBglA activity. NaCl did not give any significant effect on all the enzymatic activity of HoBglA. Hg, Ni, Cd, Co had a strong inhibitory effect on catalysis potential of β-fucosidase and β- galactosidase, ZnSO4 reduced activity by >50% of both (Figure 3.1.13). Absolute ethanol and methanol (20% v/v) and Tris (100mM) had an inhibitory effect on β- glucosidase; ethanol left with 7%, methanol with 63% and Tris with 25% residual activity (data not shown). The reducing agent, dithiothreitol (DTT), did not affect β- glucosidase activity. Deoxynojirimycin, a reported inhibitor for β-glucosidase reduced the enzymatic activity by 59% (data not shown). Known inhibitors such as D-fucose, gluconate and D-glucono 1-5 lactone have no effect on β-glucosidase and β-fucosidase activity, but gluconate has completely abolished the β-galactosidase activity. HoBglA stopped working in presence of 0.5% iodine.

120

100

80

60

40

(Unit per mg protein)mg per (Unit 20 HoBglA residual activity % activity residualHoBglA

0

Metals (1mM)

Figure 3.1.13 Effects of various metals on catalytic activity of β- glucosidase (45C,), β- fucosidase (60C,) and β- galactosidase (65C,) in MOPS buffer pH 8.5. The error bar represents the means  SD (n=3).

107 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.6 Determination of kinetic constants of preferred substrates

The reaction kinetics of the purified enzyme for oNPG, pNPF and pNPGal was determined from Lineweaver-Burk plots using Prism 4 software. The Km for pNPGal (0.14mM) was lower than that for oNPG (1.38mM) and pNPF (1.62mM), which pNPF kinetics oNPG kinetics indicates less affinity towards oNPG and pNPF (Figure 3.1.14).

90 30 VMAXVmax 30.0530.05 VMAXVmax 85.56 Km 1.3 80 KMKm 1.6201.6 KM 1.383 70 20 60 0.55 0.25 0.50 50 0.45 0.20 0.40 0.35 40 0.15

0.30 1/v 1/v 0.25

10 0.20 30 0.10 v, nmol/min v, v, nmol/min 0.15 0.10 0.05 VMAX 23.28 0.05 20 KM 0.04898 0.00 0.00 0.0 2.5 5.0 7.5 10.0 12.5 10 0.0 2.5 5.0 7.5 10.0 12.5 1/[S] 1/[S] 0 0 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 [S], mM [S], mM A B

Vmax 23.28 25 Km 0.04

20 0.0500

15 0.0475

0.0450 1/v 10

0.0425 v, nmol/min

5 0.0400 0.0 0.5 1.0 1.5 2.0 2.5 1/[S] 0 0.0 2.5 5.0 7.5 10.0 12.5 [S], mM C

Figure 3.1.14 Kinetics studies of HoBglA: A, B and C are Lineweaver-Burk plot of oNP/pNP release as a result of HoBglA activity on the (A) oNPG/ (B) pNPF/ (C) pNPGal concentrations, where Km values suggest that pNPGal is preferred substrate of the enzyme.

108 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.7 Substrate specificity

HoBglA was active towards naturally occurring disaccharides i.e cellobiose, and lactose (Figure 3.1.15a, b), along oNPG, pNPF and pNPGal. The enzyme did not show any activity against trehalose, cellulose, xylobiose, xylotriose and xylan.

Glucose Cellobiose

1 2 3 4 5 6 7 8

Figure 3.1.15a. TLC analysis of cellobiose hydrolysis over a time period– lanes 1 and 2 are cellobiose and glucose controls respectively, lanes 3 to 8 are the hydrolyzed end products from cellobiose after 0hr (lane 3), 0.25hr (lane 4), 0.5hr (lane 5), 1hr (lane 6), 2hr (lane 7) and 16hr (lane 8).

Glucose Galactose Lactose

1 2 3 4 5 6B 7 8 9

Figure 3.1.15b. TLC analysis of lactose hydrolysis over a time period - lanes 1, 2 and 3 are lactose, galactose and glucose controls respectively, lanes 4 to 9 are the hydrolyzed end products from lactose after 0hr (lane 4), 0.25hr (lane 5), 0.5hr (lane 6), 1hr (lane 7), 2hr (lane 8) and 16hr (lane 9).

109 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.8 Circular Dichroism measurements

Stability of HoBglA in organic solvents (20% v/v ethanol and 20% v/v methanol) was checked by circular dichroism (CD) measurements. There were no significant changes in protein folding were seen in the CD results (Figure 3.1.16), the protein structure was stable at 45 C in presence of the solvents.

A

B Figure 3.1.16 Circular dichroism (CD) measurement graphs. (A) CD graphs showing the stability of native HoBglA and with ethanol and methanol at room temperature. (B) In this graph CD measurements of native HoBglA and with ethanol and methanol were taken at 45 C.

110 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.9 Analytical gel filtration (SEC-MALS)

HoBglA was present in monomeric form in aqueous solution as determined by the size exclusion chromatography- multi angle light scattering (SEC-MALS) (Figure 3.2.17). It has been reported by previous workers that oligomeric proteins are more thermostable than their monomeric form, thermophilic and halophilic proteins from the oligomers so often (Jekow et. al., 1999; Richard et. al., 2000; Ishibashi et. al., 2002; Sivakumar, 2006).

Figure 3.1.17 Size exclusion chromatography result of HoBglA. Single peak shows the monomeric form of the HoBglA.

111 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.10 Differential Scanning Fluorimetry

Among the tested substrates and metals, synthetic substrates pNPG, pNPF and oNPG induced the HoBglA stabilization (avg. Temp. 70.4C) shown in Figure 3.1.18. Natural substrates cellobiose, lactose and glucose also stabilized the enzyme (avg. Temp. 69.8C). HoBglA showed higher stability in presence of Zn (avg. Temp. 76.1C), followed by Al>Mn>Ca>Li>Ni>Cd>Hg.

Figure 3.1.18 Coloured lines showing the detected fluorescence released by SYPRO Orange dye due to denaturing of HoBglA (bound with different metals and sugars) at different temperature.

112 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.11 Crystallization

Numerous crystals with different morphologies (Figure 3.1.19) were obtained from initial sitting drop screen with various conditions (Appendix V). Optimization screens (Table 3.1.3) were set to obtain crystals with high diffraction quality different sugars and metals were used as additives. Under these optimized conditions crystals were obtained in high numbers but not diffracted the X-rays to high resolution. Obtaining high-resolution quality crystals was the challenging part of this investigation.

113 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Table 3.1.3 Optimization screens to obtain high X-ray diffraction quality crystal from HoBglA protein

Basic crystal S. No Additives X-ray diffraction conditions 1 0.2M Na formate, 20% 3.5 Å diffraction - PEG 3400, 0.1M Bis- (100% completeness) TRIS propane, pH= 7.5 1mM Cellobiose No diffraction (P78 factorial) 1mM Lactose No diffraction 1mM Glucose No diffraction 1mM MnSo4 No diffraction 1mM Ba Acetate No diffraction 1mM LiOH No diffraction 1mM HgSO4 No diffraction 2mM pNPF No diffraction 2mM pNPF + 1mM ZnCl2 No diffraction 1mM ZnCl2 No diffraction 2mM pNPF + 1mM MnSO4 No diffraction 1mM MnSO4 No diffraction 2mM pNPF + 1mM CaCl2 No diffraction 1mM CaCl2 No diffraction 2mM pNPF + 1mM HgSO4 No diffraction 1mM HgSO4 No diffraction 2mM pNPF + 1mM EDTA No diffraction 1mM EDTA No diffraction 2mM pNPF + 1mM NiSO4 No diffraction 1mM NiSO4 No diffraction 2mM pNPF + 1mM CoCl2 No diffraction 1mM CoCl2 No diffraction 2mM pNPF + 1mM LiOH No diffraction 1mM LiOH No diffraction 2mM pNPF + 1mM CdCl2 No diffraction 1mM CdCl2 No diffraction 2mM pNPF + 1mM FeCl3 No diffraction 1mM FeCl3 No diffraction 2 0.2M MgAc2, 20% PEG - 2.5 Å diffraction (96% 8000, 0.1M Na completeness at 3.0 Å) cacodylate, pH=6.5 (CSI-18 factorial) 3 0.2M MgAc2, 25% - Crystal dissolved MME PEG 2000, 0.1M completely Na cacodylate, pH=6.5 4 0.2M MgAc2, 30% - Crystal dissolved MME PEG 2000, 0.1M completely Na cacodylate, pH=6.5 5 0.2M MgAc2, 25% - Crystal dissolved MME PEG 5000, 0.1M completely Na cacodylate, pH=6.5 6 0.2M MgAc2, 30% - Crystal shape changed MME PEG 5000, 0.1M and looks more fragile, Na cacodylate, pH=6.5 No diffraction

114 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

0.5 µm

Figure 3.1.19 Crystals obtained from H. orenii beta-glucosidase grown under different conditions. A) crystals in PACT-78, original condition; B) PACT-78 with 1mM lactose; C) PACT-78 with 1mM glucose; D) PACT-78 with 1mM Ba(OAc)2; E) PACT-78 with 1mM HgSO4; F) PACT-78 with 1mM MnSO4; G) PACT-78 with 1mM EDTA;H) PACT-78 with 1mM pNPF and EDTA; I) PACT-78 with 1mM LiOH; J) PACT-78 with 1mM pNPF and LiOH; K) PACT-78 with 1mM pNPF and MnSO4; L) PACT-78 with 1mM CdCl2; M) PACT- 78 with 1mM pNPF and CdCl2; N) PACT-78 with 1mM CoCl2; O) Crystals in CSI-18, original condition. Arrows indicate towards crystals.

115 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.12 X-ray diffraction and preliminary data analysis

In-house X-ray diffracted crystals (3 to 3.5 Å) were further sent to the Australian Synchroton facility AS-MX1 and ADSC Quantum detector at Melbourne to collect high resolution dataset.

Crystal obtained from PACT-78 condition diffracted X-rays up to 3.5 Å with 100% completeness. But the best diffraction quality dataset was obtained from CSI-18 condition (20% PEG 8000, 0.2M Mg Acetate, 0.1M Na Cacodylate, pH 6.5) and designated as BP9 dataset. Collected dataset from crystal obtained from CSI-18 condition was processed with XDS (Kabsch, 2010) and SCALA from CCP4 program suite (Collaborative Computational Project, Number 4, 1994). Self- rotation functions were calculated using GLRF (Tong and Rossmann, 1990). The results showed the crystal belongs to the orthorhombic P212121 space group (Table 3.1.4).

By molecular replacement method using PHENIX (Adams et. al., 2010) and library of search model from Protein Data Bank (PDB) HoBglA structure was solved. The library consisted of 29 known structures of β-glucosidases; each model was present as a wild- type and a polyalanine model, yielding a total of 58 individual models. For two molecules of HoBglA in asymmetric unit, the best solution was obtained with a log- likelihood score of 810 for the wild-type β- glucosidase B model from uncultured bacterium (52% sequence identity; PDB code 3FJ0; K.Y. Hwang and K.H. Nam, unpublished work). The second best solution was from Streptomyces sp. with a score of 777 (47% sequence identity; PDB code 1GNX; A Guasch et. al., unpublished work). Three dimensional protein model was built manually using visual inspection program O and Coot (Jones et. al., 1991; Emsley and Cowtan, 2004) were used.

116 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Table 3.1.4 Diffraction data statistics Data set BP9

Total Molecule Molecule B A

Data collection

X-ray source AS-MX1

Detector ADSC Quantum

Wavelength (Å) 0.95375

Space group P212121

Cell dimensions (Å) 88.5, 99.6, 109.4

Max. resolution (Å) 3.0

Wilson B-factor (Å2) 52.0

No of unique reflections 19581 (2747)

Multiplicity 7.9 (6.0)

Completeness 0.984 (0.960)

a Rsym 0.140 (0.262)

Refinement

No of reflections in working / test set 18517 / 998

Solvent

No of water molecules 127

Average B-factor (Å2) 11.8

Ligands

Metal ion Ni2+ Ni2+

117 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

B-factor (Å2) 47.9 59.2

Protein

No of non-H protein atoms 7187 3621 3566

Visible residues 2-318, 4-233, 235- 320-450 305, 308- 318, 320- 447

Side chains modelled as Ala K110, I4, E8, D231, E233, E233, K237, K237, Q269, K304, K304, K316, N311, K343, K316, K362, K343, K368, K362, E425, E425, E441, E441, K445 K445, Q447

Average B-factor (Å2) 23.5 23.3 23.7

rmsd B-factor for bonded atoms 7.26 7.32 7.20

rmsd bond lengths (Å) 0.008

rmsd bond angles (°) 1.23

Ramachandran plot (%) 0.835 / 0.844 / 0.154 / 0.145 / most favoured / additionally allowed 0.008 / 0.008 / / generously allowed / disallowed 0.003 0.003 (W409) (W409)

R-factorb 0.191 (0.295)

c Rfree 0.266 (0.397)

118 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.13 Protein-ligand docking

Calculated binding free energies for cellobiose, lactose, pNPGal, oNPF, D-gluconate and D-fucose and ligand residue interaction frequencies are given in Table 3.1.5. HoBglA residues Asn222, Tyr 296, Glu354, Glu 408 and Trp 409 are interacting to cellobiose through polar bonds. Similarly, Glu166, Asn222, Trp327 and Glu354 interact to lactose through polar bonding (Figure 3.1.20A) and oNPF interacts with residues Glu166, Tyr296, Trp401 and Glu408 (Figure 3.1.20C). Comparative analysis of the interacting residues indicates that Glu166 acts as acid/base and Glu354 acts as nucleophile while targeting cellobiose (Figure 3.1.20B), D-fucose, D-gluconate (Figure 3.1.20D) and lactose. But with oNPF, pNPGal, Glu166 and Glu408 make a direct interaction with the substrates atoms.

Table 3.1.5 HoBglA molecular docking outcome Substrate Lactose Cellobiose D-fucose D- oNPF pNPGal Gluconate Free energy of binding -6.49 -7.52 -5.80 -6.35 -6.70 -7.76 (kcal/ mol) Frequency (%) 91 66 97 74 37 31 Glu166, Glu166, Gln20, Gln20, Gln20, Gln20, Trp168, Val169, His121, His121, His121, His121, Val169, Thr224, Trp122, Trp122, Trp122, Trp122, Asn222, Tyr296, Asn165, Asn165, Asn165, Asn165, Interacting Tyr296, Trp327, Glu166, Glu166, Glu166, Glu166, residues Trp327, Glu354, Tyr296, Tyr296, Val169, Tyr296, Glu354, Glu408, Glu354, Trp327, Tyr296, Trp327, Trp409 Trp409 Trp401, Glu354, Trp327, Trp401, Trp409, Trp401 Glu408 Glu408, Phe417 Phe417 oNPF- ortho Nitro Phenyl Fucopyranoside, pNPGal- para Nitro Phenyl Galactopyranoside

119 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

A B

C D

Figure 3.1.20 Protein-ligand molecular docking results. Interaction of HoBglA catalytic residues with A) lactose (magenta sticks) catalytic sites (yellow sticks), B) cellobiose (red sticks) catalytic sites (yellow sticks), C) oNPF (green sticks) catalytic sites (blue sticks) and D) D-gluconate (cyan sticks) catalytic sites (magenta sticks). Bond distances are shown with yellow dotted lines.

120 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.14 Overall structure of HoBglA

GH family 1 HoBglA amino acid sequence is composed of 451 residues. Interestingly, HoBglA crystallized with two molecules in an asymmetric unit related by a twofold rotation non-crystallographic symmetry approximately parallel to the c-axis, at the nominal resolution of 3.0 Å. However, HoBglA exists as a monomeric form in solution as determined by SEC-MAL analysis (Figure 3.1.16). Overall view of polypeptide chains were reasonably well ordered with a couple of breaks in the electron density map. PROCHECK (Laskowski et. al., 1993) was used for calculating the geometry and quality of the structure. The Ramachandran plot for ϕ, ѱ torsional angle values for most favored region were 83.5% and 15.4% in additionally allowed region for chain A. The overall fold and the general domain forms a (/β) 8 TIM barrel (Figure 3.1.21). Each monomer has a mixed /β fold and a characteristic active catalytic site (Glu166, Asn294 and Glu354). The core catalytic site consists of an eight β-sheet with parallel  and β- strands. The catalytic sites of HoBglA form a 15-20 Å deep slot-like cleft and are located on connecting loops at the C-terminal end of the β-sheets of the TIM barrel.

On the basis of sequence homology, 3D structure of five GH family 1 enzymes and HoBglA were superimposed to analyze their structural similarity are shown in Figure 3.1.22. Comparison between HoBglA (3TA9) and 1CBG loops showed significant differences (Figure 3.2.23). Presence of Ni ion in substrates binding region surrounded with catalytic residues is shown in Figure 3.2.24.

121 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Figure 3.1.21 Cylindrical representation of H. orenii β-glucosidase structure. Catalytic residues Glu 166, Asn 294 and Glu 354 are represented with red sticks. Classical (/β)8 TIM barrel is shown inside the white circle, red and blue ends represents the C and N terminals. This figure was prepared using the program CCP4MG (McNicholas et. al., 2011).

122 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Figure 3.1.22 Structural comparison of HoBglA. (A) Overlay view of HoBglA and its structural homologs. HoBglA (cyan) and the structures of the five-glycoside hydrolase BglA (PDB code 1CBG, green), BglA (PDB code 1E4I, magenta), myrosinase (PDB code 1MYR, yellow), BglA (PDB code 1UG6, pink), BglA (PDB code 3FJ0, tan), BglA (PDB code 3CMJ, blue) and BglA (PDB code 3AHX, grey) of GH family 1 members superimposed on the catalytic site of HoBglA. (B) The HoBglA catalytic sites are red and 3FJ0 in green. This figure was prepared using the program CCP4MG (McNicholas et. al., 2011).

123 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Loop A

Loop D

Loop B

Loop C

Figure 3.1.23 Loop comparison of HoBglA with 1CBG. Loop A (202-215 residues, yellow), loop B (336- 345 residues, red) loop C (355-361 residues, magenta) are present in 1CBG (green) and absent in HoBglA (cyan). Loop D HoBglA (360-367 residues, blue) make a wide opening for substrates at the surface whereas in 1CBG (403 -410 residues, orange) loop D make a close surface. This figure was prepared using the program PyMOL (DeLano, 2002).

124 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Figure 3.1.24 A view of nickel binding residues with electron density. The

2Fo – Fc electron density map drawn at the 1.5  contour levels. Side chain of the amino acids that make coordination bonds with nickel ions are shown as sticks and the nickel and water molecules are shown as white and pink asterisks, respectively.

125 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.15 Molecular dynamics (MD) simulation outcome

The RMSD for the simulated conformations of HoBglA backbone and ligands are plotted in Figure 3.2.25. The RMSD between the ligand bound structures showed that the models, in particular individual backbone and ligand are very similar but overall comparison with initial conformation varies in the loop area D (Figure 3.1.23). There are severe conformational changes when comparing the protein structure in all three simulation cases before and after the MD run. The conformational changes seem to appear mainly in three loop structures all located on the side of the protein that also opens up to the (putative) active site.

MD simulation run for HoBglA and lactose bound structure showed following interactions, Trp122 – O2A (3 Å), His180 – O6’ (2.9 Å), Trp401 – O6A (3.2 Å), Trp409 – O2A (3.1 Å), Trp409 – O3A (2.9 Å), Phe417 – O3A (3 Å), Phe417 – O4A (3 Å). With this substrate also, the putative conserved site residues are not any interaction to the substrate. Tyr296 – O3’ (3.3 Å), Tyr296 – O4 (3.1 Å), Trp323 –O2’ (2.7 Å), Trp323 – O3’ (2.7 Å), Phe417 – O3 (2.5 Å), Phe417 – O4 (2.7 Å), only these residues are taking part in substrate binding, whereas the putative catalytic site residues are not making any bonds with the substrate.

Interactions between HoBglA residues and cellobiose atoms are, Gln20 – O3’ (3.2 Å), Trp120 – O6B (2.6 Å), Glu166 – N6 (3.1 Å), Glu354 – O2’ (3.1 Å). Glu166 and Glu354 are the conserved catalytic site residues which are responsible for the hydrolysis of the substrate.

Only two residues of HoBglA showed interaction with oNP fucopyranoside, Gln20 and Trp120 are the substrate binding residues. In this case Asn294 did not making any contact with the substrate, which is a conserved putative catalytic residue. This indicates that the catalytic mechanism is different than other reported β- glucosidases.

126 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

A

B

C

Figure 3.1.25 RMSD of HoBglA backbone and ligand atoms along last 20 ps for each MD simulation. The curves are indicated as follows: Stability of HoBglA backbone and ligand A) lactose, B) cellobiose and C) ortho nitrophenyl fucopyranoside. All the structure (with ligands) showed stability in their conformation along last 20 ps.

127 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

3.1.3.16 Comparison with other proteins

A search for structurally similar proteins was performed using DALI (Holm and Sander, 1993). Structures showing overall similarity belonged only to the GH family 1 (Table 3.1.6). The structurally most common feature of these proteins is the catalytic site region, determined using CSA database (Porter et. al., 2004)). The highest structural similarity was observed between HoBglA and PDB code: 3AHX with RMSD 0.96 Å.

3.1.3.17 PDB deposition

The atomic coordinates and structure factors have been deposited in the Protein Data Bank, with the accession number 3TA9.

128 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Table 3.1.6 Comparative analysis of β-glucosidases 3D structures from different sources Family: Glycosyl hydrolase 1 Enzyme Enzyme Source PDB No. Optimum RMSD* Enzyme complexed ligands for References name code amino pH Temp (Å) 3D structure acid (C)

β- H. oreniib - 451 8.5 45 - **lactose, cellobiose and synthetic Present study glucosidase substrates (oNPG & pNPF) T. 1UG6 426 - - 1.08 Complex with glycerol Unpublished PDB entry thermophilus Hb8b B. polymyxab 1E4I 447 - - 1.05 Complex with 2-deoxy-2-fluoro-alpha- Unpublished PDB entry D-glucopyranose ; 2,4-dinitrophenyl 2- deoxy-2-fluoro-β-d-glucopyranoside P. polymyxab 1UYQ 447 - - 1.04 Complex with 2-deoxy-2-fluoro-alpha- Unpublished PDB entry D-glucopyranose ; 2,4-dinitrophenyl 2- deoxy-2-fluoro-β-d-glucopyranoside T. maritimab 1UZ1 439 5.8 25 1.08 Complex with Isofagomine lactam Vincent et. al., 2004 1W3J 443 5.8 25 1.08 Complex with Tetrahyrooxazine Gloster et. al., 2005 2CBU 440 5.8 25 1.07 Complex with Castanospermine Gloster et. al., 2006 2CES 440 5.8 25 1.08 Complex with Glucoimidazole Gloster et. al., 2006 2J7H 438 5.8 25 1.10 Complex with Azafagomine Gloster et. al., 2007 2J78 443 5.8 25 1.06 Complex with Galacto- Gloster et. al., 2007 hydroximolactam 2J79 437 5.8 25 1.07 Complex with Galacto- Gloster et. al., 2007 hydroximolactam 2VRJ 438 5.8 25 1.08 Complex with N-octyl-5-deoxy-6-oxa- Aguillar et. al., 2008 N-(thio) carbamoylcalystegine

129 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

2WBG 443 - - 1.04 Complex with 3-imino-2-oxa-(+)- Unpublished PDB entry castanospermine 2WC4 428 - - 1.05 Complex with 3-imino-2-thia-(+)- Unpublished PDB entry castanospermine C. 3AHX 444 6.0 45 0.96 Complex with para-nitro-phenyl β-D- Jeng et. al., 2011 Cellulovoransb glucoside Soil 3CMJ 441 6.5 55 0.99 Complex with tartaric acid Park et. al., 2005 metagenome Uncultured bacteriumb 3FIZ 441 - - 1.01 Complex with β-D-glucose Unpublished PDB entry Uncultured bacteriumb 3FJ0 439 - - 1.04 Complex with β-D-glucose Unpublished PDB entry T. repensp 1CBG 447 - - 1.35 Complex with linamarin and glucose Barret et. al., 1995 (white clover) T. reeseif 3AHY 465 6.0 40 1.41 Complex with Tris Kuo and Lee, 2008 N. kosunensisi 3AHZ 472 5.5 45 1.46 Complex with Tris Kuo and Lee, 2008 Myrosinase S. albai 1MYR 499 - - 1.35 Complex with 2-deoxy-2- Burmeistar et. al., 1997 fluoroglucosinolate b (bacteria), p (plant), f (), i (insect) *RMSD values determined by SSM pairwise 3D alignment services available under PDBe in EMBL-EBI database, http://www.ebi.ac.uk/msd-srv/ssm/cgi-bin/ssmserver ** Crystals obtained with ligand, but did not diffract the X-rays.

130 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

DISCUSSION

Beta glucosidase completes the breakdown of cellulose in a variety of organisms. It plays a key role in the hydrolysis of cellobiose with the release of glucose as monosaccharide confirming the cellulose hydrolysis. According to the CAZy classification HoBglA belongs to the GH family 1 in spite of the presence of Asp residues as a nucleophile, a trait of GH family 3.

Beta-glucosidase gene from H. orenii was sub cloned into E. coli expression vector and purified by means of N- terminal hexahistidine tag affinity chromatography. High level of recombinant protein expression was observed after 1 mM IPTG induction. HoBglA is the first halothermophilic enzyme that actively works as β-D-glucosidase, β-D- fucosidase and β-D-galactosidase. The similar β-D-glucosidase having β-D- galactosidase, β-D-fucosidase activity from almond, B. breve, bovine liver, sheep liver, Turbo cortunus liver, human plasma enzyme, snail, Littorina littorea, Asclepia curassavica, Sulfolobus solfataricus, Plumeria obtuse plant flower, Thermus thermophilus and Rhynchosciara Americana insect larva were reported in previous years (Table 3.6) (Walker and Axelroad, 1978; Rodriguez et. al., 1982; Nunoura et. al, 1996). The molecular mass of the purified HoBglA was estimated to be 53 kDa by SDS-PAGE analysis that correlates with the theoretical values and is close to those of many other β-glucosidases characterized from other bacterial sources (Bhatia et. al., 2002; Park et. al., 2005). The specific activity of purified HoBglA 194.4 U mg-1 is relatively higher when compared to those of most mesophilic and thermophilic bacterial β-glucosidases that typically range between 59 and 149 U mg-1 (Sano et. al., 1975; Patchett et. al., 1987; Takase et. al., 1988; Nunoura et. al, 1996; Harada et. al., 2005).

HoBglA have high amino acid homology towards β-glucosidase and β-galactosidase from different sources that are able to hydrolyze unusual substrates such as gentiobiose, laminaribiose, laminarine, sophorose, trehalose and xylobiose to variable extents (Harada et. al., 2005; Park et. al., 2005). These activities are not present in HoBglA, except cellobiose and lactose.

Comparisons of the temperature and stability with other thermophilic and mesophilic - glucosidases showed HoBglA has higher temperature optimum and thermostability than

131 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii those reported for mesophiles but other thermophilic -glucosidases are more stable and have higher temperature optima. HoBglA’s β-glucosidase has a 45 C optimum temperature with a half-life of 10 min at 65 C and more than 2 hr stability at 60 C and β- fucosidase has a 60 C temperature optimum with a half-life of 1.5 hr at 60 C. β- galactosidase has a 65 C optimum temperature with a half-life of 12 min at 65 C. Remarkably; EDTA does not inhibit the β-glucosidase activity, indicating that divalent cations are not required for enzyme activation which is supported by the metal studies. This phenomenon resembles the BglA activity from Thermotoga neopolitana; however EDTA strongly inhibits the β-fucosidase activity of HoBglA, whereas DTT did not showed any effect (Paavilainen et. al., 1993; Voorhorst et. al., 1995; Nunoura et. al, 1996; Nam et. al., 2008). Hg and Ni was found to be a known effective inhibitor for β- glucosidase, β-fucosidase and β-galactosidase activities (Patchett et. al., 1987; Plant et. al., 1988; Painbeni et. al., 1992; Odoux et. al., 2003; Li et. al., 2005; Yang et. al., 2008;

Qi et. al., 2009) and KI2 (0.5%) was a strong inhibitor and completely abolished the three activities of HoBglA (Colas, 1981; Nunoura et. al, 1996). As known competitive substrates D-fucose, D-gluconate and D-glucono 1-5 lactone, no effect was observed on the β-glucosidase and β-fucosidase activity, but all the three compounds have strongly stopped the β-galactosidase activity. This indicates that the HoBglA have same active site for the three activities without displaying specificity for C-4 and C-6 in the substrate. The same phenomenon was reported earlier with the almond emulsion, which showed three enzymatic activities in one enzyme that lacks C-4 and C-6 specificity of the substrate (Conchie et. al., 1967).

The kinetics of the hydrolysis of the preferred substrates showed that the enzyme was more specific for β-D-galactoside, and -D-glucoside than for -D-fucoside, which is similar to the BglA activity of T. neopolitana, genus Thermus sp., B. polymyxa, B. subtilis natto (Takase et. al., 1988; Camacho et. al., 1996; Park et. al., 2005; Kuo et. al.,

2008). The Km value of HoBglA β-fucosidase is less than the Km of -fucosidase from fungus Aspergillus phoenicis, which suggests HoBglA is more efficient in hydrolyzing fucoside (Zeng et. al., 1992). The catalytic efficiency (Kcat/Km) values for pNPGal, pNPF and oNPG were 303.12 M-1s-1, 27.50 M-1s-1 and 11.31 M-1s-1 respectively.

132 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

In the present work structural features of HoBglA were also analyzed. Comparing the primary sequence and tertiary structure of HoBglA with the structures deposited in the PDB database by PDBeFold (also known as SSM), it is no surprise to find the structure most similar to HoBglA is β-glucosidase from a bacterial source Clostridium cellulovorans, which shows 43% identity and an RMSD of 0.96 Å over 444 C atoms.

Previous workers reported that ZnSO4 caused severe inhibition of β-glucosidase from Clostridium cellulovorans and suspected that Zn ions may coordinate with two His residues (His121 and His180, also present in HoBglA) in the active site which may hinder the substrate binding and inhibit the catalytic reaction by the active site. The strong interaction of the Zn ions with HoBglA was confirmed by DSF results. Inhibition was also observed when Tris was used as a reaction buffer for activity assay; the same inhibition result was seen in the earlier report (Jeng et. al., 2011). Beta-glucosidase from Neotermes koshunensis complexed with pNPG showed the direct interaction of the Tris molecule with the residues on the glycone binding site. To clarify this phenomenon, attempts were made to obtain crystals of HoBglA with substrate and metal ions, but crystals do not diffract the X-rays.

Earlier reports said that thermophilic proteins contain lower levels of asparagine and glutamine, which are more susceptible to deamidation at elevated temperature and lower levels of cysteine, methionine and tryptophan, which are more susceptible to oxidation at elevated temperature. For extra stability of the loops, higher levels of isoleucine and alanine are required that play active roles in allowing the tighter packing of hydrophobic cores and higher levels of prolines (Holm and Sander, 1993). HoBglA has lower contents of alanine and equal content of asparagine, isoleucine and proline than the mean of the mesophiles (Kumar et. al., 2000). Only levels of tryptophan are higher than the mean of the mesophilic compositions. Whether these extra tryptophan residues are involved in enhancing the thermostability of HoBglA is unclear. There are many hydrophobic clusters present in the protein core where tryptophan residues play a key role, but several tryptophan residues are involved in structural features such as the substrate-binding tunnel, which relate to the particular enzymatic function of HoBglA, rather than its structural stability.

133 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

In the catalytic site of HoBglA structure, the electron density was identified on the basis of anomalous density calculation which indicates the presence of the metal ion. To investigate that region attempts were made to fit the magnesium, arsenic and nickel ions. The distorted geometry and high B-factor values of Mg and As discard their presence in the catalytic region. However, Ni ion gave low B-factor values (47.9 and 59.2) for each chain; this finding supports the inhibition of enzyme activity by Nickel (Figure 3.1.23). Ni ion makes a direct bond with Glu354, which is an essential residue to perform the catalysis by the HoBglA and other surrounding substrate binding region residues.

The modes of catalysis followed by GH family 1 enzymes involve a retaining mechanism or double displacement strategy to hydrolyze the substrate. This requires a proton donor and a nucleophile which are separated, on average by 5.0 Å; the general acid catalyzed attack of a nucleophilic carboxylate side-chain on the C1 of the sugar releases the reducing end leaving group and leads to the formation of a covalent intermediate, which is hydrolyzed by a water molecule. Covalent modification with a mechanism based inhibitor has identified a totally conserved glutamic acid residue (Glu183, Glu397 PDB code 1CBG; Glu206, Glu387 PDB code 1GOW) as the attacking nucleophile (Sinnott, 1990; Trimbur et. al., 1992; Barrett et. al., 1995; Aguilar et. al., 1997). On this basis it may be concluded that the conserved Glu166 and Glu354 of HoBglA may act as an acid/base catalyst and nucleophile to catalyze the reaction. The same phenomenon was observed in docking results of lactose, cellobiose and gluconate where Glu166 and Glu354 directly interacts with the substrate atoms to release the monosaccharide unit as a result of hydrolysis. Previous workers reported that Gln20, His121, Tyr296, Glu405 and Trp406 are determinant residues for the recognition of substrate, specifically Gln20 and Glu405 (Glu408 HoBglA) which forms the bidentate hydrogen bonds during β-glucosidase and β-galactosidase activity. This statement supports the β-glucosidase and β-galactosidase activities of HoBglA (Aparicio et. al. 1998).

No reports were found on the β-fucosidase (EC.3.2.1.38) 3D structure. We analyzed the HoBglA oNPF and D-fucose docked models and found the following residues Gln20, His121, Trp122, Asn165, Glu166, Val169, His180, Tyr296, Trp327, Glu354, Trp401, Glu408, Trp409 and Phe417 are interacting with substrate. This indicates that the

134 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii putative active residue Glu166 and Glu354 may play an active role in the hydrolysis of fucosides. These results support the opinion that HoBglA hydrolyses a variety of substrates through a single catalytic domain. GH1 members have four extended loops for active site entrance (Khan et. al., 2011) termed loop A-D. They are responsible for overall shape of the aglycone binding area and alteration in any of these loops modifies the overall shape of substrate entrance. Comparison of these loops in 3D structure of 1CBG and HoBglA structure showed differenced in all the loops (Figure 3.1.22). Loop (202-215 residues, yellow), Loop B (336- 345 residues, red) loop C (355-361 residues, magenta) are absent in HoBglA that make the surface more compact than 1CBG. Loop D of 1CBG (403 -410 residues, orange) and HoBglA (360-367 residues, blue) do not superimpose and makes a wide opening for easy entry of substrate that is similar to the long loop D in Bgl1A from T. neopolitana (Khan et. al., 2011). Interestingly, active residue Glu354 (HoBglA), Glu397 (1CBG) are present on loop D. These observations suggests that loop D in HoBglA plays a major role to provide access to a variety of substrates followed by the catalysis by Glu354 residue.

The MD simulation runs needs more study to be interpreted in more detail, no reports were found about the conformation changes in TIM barrel structures to compare with the present results or to prove the significance of these conformational changes. One option is to run the protein without any small molecule ligand and analyse for conformational changes. If a set of particular conformations is frequently populated (restricted to e.g. loops swinging around hinge points), one could argue for a sort of breathing motion of the protein. It will be hard to establish that the simulation has run for sufficient time, though. Alternatively, if no such movements are observed, the different conformations in case of the ligand-bound simulations may have been caused by the force fields of those simulations and thus be artificial. If the moving areas are inherently flexible, they may show higher B-factors than the rest of the protein. The different conformations may have been caused by relief of crystal packing forces. These points need more analysis. The starting conformation was taken out of the crystal structure and put into water. Superposition shows that the three ligands stay in the binding cavity, but do not occupy the same space.

135 CHAPTER 3: Section 3.1 β-Glucosidase from H. orenii

Previously, reported mutagenic experiments showed a decrease in activity up to 60 fold by mutating the highly conserved Glu387 (PDB code 1GOW) and Glu358 (PDB code 1CBG) residues, which indicated their active role in catalysis (Barrett et. al., 1995; Aguilar et. al., 1997). Glu166 and Glu354 (HoBglA) occurs in the motif Thr-Phe(His)- Asn-Glu-Pro (TF/HNEP) and Ile(Val)-Thr-Glu-Asn-Gly (I(V)TENG), which are highly conserved in all family 1 glycosyl hydrolases (Walker and Axelroad, 1978). Analysis of the solvated pocket at the C-terminus of the (/β)8 TIM barrel exposes several consistent features with it being a glycosyl hydrolase active centre.

In future, detailed analysis of the HoBglA- cellobiose, lactose, D-glucose, oNPG, pNPF, pNPGal complex structure is required to verify the substrate binding sites and catalytic residues of the enzyme. The present study on HoBglA inspires the future analyses of the catalytic mechanisms GH family 1 members that contain a TIM-like barrel structure.

136 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

CHAPTER 3

Section 3.2

BIOCHEMICAL CHARACTERIZATION OF β-

GALACTOSIDASE/ -L-ARABINOPYRANOSIDASE FROM

HALOTHERMOTHRIX ORENII

CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.1 INTRODUCTION

Beta-D-galactosidase is widely distributed glycosyl hydrolase in all three domain of life. The enzyme is significantly used in a numerous biological processes such as lactose digestion, plant growth and fruit ripening and as reported gene in cell and molecular biology research (Li et. al., 2001; Laurno et. al., 2008; Connell and Walsh, 2010). In recent years, β-galactosidase (EC. 3.2.1.23) from Aspergillus niger, Alicyclobacilli acidocaldorius, Bifidobacterium breve, B. longum H-1 and Clostridium cellulovorans had been reported to have dual functions, that is, they can hydrolyze both β-D-galactoside/-L-arabinopyranoside (Kosugi et. al., 2002; Shin et. al., 2003; Laurno et. al., 2008; Connell and Walsh, 2010; Lee et. al., 2011).

In particular, -L-arabinopyranosidase (no EC number) purified from B. breve specifically hydrolyses the non-reducing ends of -L-arabinopyranosidic linkage of gingenoside Rb2 (Shin et. al., 2003). Very less is known about -L- arabinopyranosidase, both in terms of functional and structural features in combination of β-galactosidase.

Sequence homology HoArap showed that belongs to GH43 family of clan GH-F and according to CaZY database it continues to hydrolyze the substrates by following one step displacement inverting mechanism. Members from this family utilize Aspartic acid as a catalytic nucleophile/ base and Glutamic acid as catalytic proton donor. Several workers have reported three dimensional structures of GH43 members that revealed a 5- fold β-propeller structure of -N-arabinofuranosidase with a central tunnel. GH family 43 members follow two mechanisms for the substrate hydrolysis, a) retaining reaction and b) inversion reaction. In both mechanisms, two carboxylic acids (i.e Glutamic acid and Aspartic acid) highly conserved in all GH families, play crucial role in oxacarbenium ion-like-transition state reaction (Piston et. al., 1998; Rye and Wither, 2000; Shallom et. al., 2002).

In this chapter, β-galactosidase/-L-arabinopyranosidase from halothermophile H. orenii (designated as HoArap) was investigated. It hydrolyses lactose by cleaving β-1-4 glucosidic bond between glucose and galactose. To date, few reports are available

137 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii on the hydrolysis of pNPAp and pNPGal (Figure 3.2.1). Reports suggest that β- galactosidase/-L-arabinopyranosidase is actually a separate type of glycosyl hydrolase (Shin et. al., 2003; Lee et. al., 2011). Biochemical properties did not support HoArap as -L-arabinofuranosidase (EC. 3.2.1.55) and strengthens our interest for structural studies and to reveal the mechanism of substrate hydrolysis.

a b

Figure 3.2.1 Structure of synthetic substrates hydrolysed by HoArap a) pNP-β-D- galactopyranoside and b) pNP--L-arabinopyranoside.

Glycosyl Hydrolase family members follow two mechanisms (Figure 3.2.2) for the substrate hydrolysis, a) retaining reaction and b) inversion reaction. In both mechanisms, two carboxylic acids (i.e Glutamic acid and Aspartic acid) play crucial role in oxacarbenium ion-like-transition state reaction (Piston et. al., 1998; Rye and Wither, 2000; Shallom et. al., 2002). They are highly conserved in all glycosyl hydrolase families (Rye and Wither, 2000). Members of GH3, 51 and 54 families hydrolyse the glycosidic bond by retaining reaction which is a two-step double displacement reaction as illustrated in Figure 3.2.2a. In the first step, glycosylation takes place, where the acid-base acts as a general acid by protonating the glycosidic oxygen and stabilizes the leaving group. The nucleophilic residue attacks the anomeric carbon of the scissile bond forming a covalent glycosyl enzyme intermediate with the opposite anomeric configuration of the substrate. In the second step deglycosylation takes place, this time the acid-base acts as a general base and activates a water molecule that attacks the anomeric centre of the glycosyl enzyme intermediate from the same pathway of the original bond, which releases the sugar with an overall retention of the anomeric

138 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii configuration. GH43 follows one step inverting reaction as shown in Figure 3.2.2b, in which one carboxylate acts as a general base catalyst by deprotonating the nucleophilic water molecule that attacks the bond, while the second carboxylic acid acts as a general acid catalyst by protonating the leaving aglycone (Rye and Wither, 2000; Shallom et. al., 2002).

According to CaZY database β-galactosidase/-L-arabinopyranosidase (EC. 3.2.1.55) from H. orenii belongs to the GH43 family and clan GH-F and it continues to hydrolyze the substrates by following the one step displacement inverting mechanism. Members from this family utilize Aspartic acid as a catalytic nucleophile/ base and Glutamic acid as catalytic proton donor. Three dimensional structure investigations revealed that GH43 members have 5-fold β-propeller structure with a central tunnel (Nurizzo et. al., 2007)

In the present chapter, β-galactosidase/-L-arabinopyranosidase from halothermophile H. orenii (HoArap) was investigated (Table 3.2.1). Gene identification and DNA cloning was performed to study the biochemical characteristics of this enzyme. Interestingly, it hydrolyses lactose by cleaving β-1-4 glucosidic bond between glucose and galactose. To date, few reports are available on the hydrolysis of pNPAp and pNPGal by the same enzyme. Recent reports suggest that β-galactosidase with -L- arabinopyranosidase activity should have a different classification than the previously reported β-galactosidase (Kosugi et. al., 2002; Shin et. al., 2003; Lee et. al., 2011).

139 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

Figure 3.2.2 Hydrolysis mechanisms. a) Retaining and b) inverting reactions (Rye and Wither,

2000).

140 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

Table 3.2.1 Properties of β-galactosidase/-L-arabinopyranosidase from different sources

β-galactosidase/-L-arabinopyranosidase

HoArap Apy-H1 Apy-K110 BgA

Source Halothermothrix Bifidobacterium Bifidobacterium Clostridium

orenii longum H-1 breve K-110 cellulovorans

Accession YP_002509798.1 HM803112 - AY128945.1 number

Upstream YP_002509799.1: - - - and glycoside hydrolase downstream family 2 (β- genes galactosidase,

EC.3.2.1.23);

YP_002509797.1:

alpha/beta hydrolase

fold protein CE 1 family

Mw (kDa) 35.5 81.5 77 78

Amino acids 315 757 - 659

EC. No. 3.2.1.55 (putative) - - -

Temp (C) 45-50 48 40 30-40

Half-life 12 min at 60C - - complete

(pNPAp); 3min at 60C activity

(pNPGal) loss after 20 min

at 50C pH 6.5-7.0 6.8 5.5-6.0 6.0

Glycosides pNPGal, pNPAp pNPAp, pNPGal pNPAp and pNPAp,

141 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii specificity and Ginsenoside Ginsenoside pNPGal

Rb2 Rb2 and pNPFp

Natural lactose not mentioned not mentioned arabinogalactan substrate

Km 8.3 (pNPAp at 0.24 (pNPAp) 0.16 (pNPAp) 1.51 (pNPAp)

45 pH 7.5); 0.26 (pNPGal); 0.086 (Rb2); at 6.06 (pNPGal);

2.5 (pNPGal at at 37C pH 7.0 37C pH 7.0 at 37C pH 6.0

50C pH 7.0)

Metals effect Inhibitors-Cd, Cu, Hg, Inhibitors-Cu Inhibitors-Cu, Inhibitors- Cu,

Zn, Ni, Co Activators-Co, Fe Hg, Zn

Activators-Mn, Mg, Ba, Ni, Mg Activators-Ca,

Fe, Li Co, Mg, Ni, Li

GH family 43 3 - 42

Clan GH-F - - GH-A

Reference Present study Lee et. al., 2011 Shin et. al., Kosugi et. al.,

2003 2002

142 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.2 Material and Methods

Overall work strategy used to investigate β-galactosidase/-L-arabinopyranosidase (HoArap) from Halothermothrix orenii is presented as a flowchart (Figure 3.2.3).

Halothermothrix orenii

Bioinformatics study Molecular Biology Biochemical analysis

GH and GT Genomic DNA extraction Optimum Temperature, pH enzyme gene search PCR and pET22b (+) Thermal-stability of enzymes Vector preparation ORF analysis and Primer design Effect of metal ions, PCR product & pET22 b inhibitors and sugars Amino acid analysis (+) vector RE digestion - BLAST - Clustal W Enzyme kinetics study - Protparam Vector and Insert ligation - Protein signature Transformation & clone BRENDA database screening by PCR, rPlasmid search RE digestion, DNA sequencing

CaZY database search rProtein mini over-expression screen, SDS-PAGE analysis

KEGG database search Upscaling of culture & protein purification by IMAC, protein PDB database search estimation

Figure 3.2.3 Flowchart showing different methods used to study β-galactosidase/-L- arabinopyranosidase from H. orenii.

143 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.2.1 Phylogenetic and amino acid sequence analyses

Protein sequences of β-galactosidase/-L-arabinopyranosidase were retrieved from the reference protein sequence databases of National Center for Biotechnology Information (NCBI) and were aligned with CLUSTALW software program (Chenna et. al., 2003). Rooted neighbor-joining phylograms were generated from the aligned sequences with the TREECON software (Van de Peer and De Wachter, 1994) using maximum parsimony method and a bootstrap trial of 100 replicates.

144 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3 RESULTS

3.2.3.1 Phylogenetic analysis

Figure 3.2.4 Phylogenetic analyses of confirmed GH43 members and HoArap (β- galactosidase/ -L-arabinopyranosidase) protein sequences. Amino acid sequences of GH43 members and HoArap were retrieved from the reference sequence protein database of NCBI and aligned using ClustalW from BioEdit software (Chenna et. al., 2003). NCBI accession numbers for each sequence are given in parenthesis. H. orenii (YP_002509798.1, Mavromatis et. al., 2009), Anaerolinea thermophila UNI-1, unpublished), Mahella australiensis 50-1 BON (YP_004462226.1, unpublished), Verrucomicrobiae bacterium DG1235 (ZP_05057647.1, unpublished), Clostridium sp. 7_2_43FAA (ZP_05130145.1, unpublished), Butyrivibrio proteoclasticus B316 (YP_003830002.1, Kelly et. al., 2010), Bacteroides sp. 2_1_33B (ZP_06077618.1, unpublished), Thermobaculum terrenum ATCC BAA-798 (Kiss et. al., 2010), Lachnospiraceae bacterium 3_1_57FAA_CT1 (ZP_08608466.1, unpublished), Parabacteroides distasonis ATCC 8503 (YP_001303121.1, Xu et. al., 2007), Alistipes

145 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii indistinctus YIT 12060 (ZP_09021947.1, unpublished). Amino acid sequence identity is given in percentage with lowest E-value cut off.

3.2.3.2 Gene cloning, protein over-expression and purification

Beta-galactosidase/ -L-arabinopyranosidase encoding gene (NCBI no. YP_002509798.1; encodes 315 amino acids) was PCR amplified from H. orenii genomic DNA (Figure 3.2.5) and purified (Figure 3.2.6). Recombinant plasmid pET22b.arap.His6), carrying a 948bp arap gene insert was constructed successfully (Figure 3.2.7) and over-expressed in BL21 (DE3) cells (Figure 3.2.8). Ni-NTA affinity chromatography was used to purify the HoArap as described in Chapter 2 (section 2.2.15). Step wise purification procedure is summarized in Table 3.2.1. SDS – PAGE analysis under reduced and denaturing conditions showed a band of purified recombinant protein (>95% purity) with a molecular mass of 35.5kDa (Figure 3.2.9) which is correlated with the calculated molecular mass.

M 1

2322 bp 2027 bp 948 bp 564 bp

Figure 3.2.5 PCR amplification of araf gene. Lane M,  HindIII DNA standard size marker; lane 1, PCR amplified araF gene.

146 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

M 1 2

2322 bp 2027 bp 948 bp 564 bp

Figure 3.2.6 DNA purification. Lane M,  HindIII DNA standard size marker; lane 1, PCR amplified bglA gene physically excised from 1% agarose gel for purification after RE digestion; lane 2: Gel purified arap gene product.

M 1

pET22b+ 2322 bp 2027 bp 948 bp 564 bp

Figure 3.2.7 Recombinant plasmid analysis. Lane M:  HindIII DNA standard size marker; lane 1: RE digested recombinant plasmid with insert (pET22b+ and arap).

147 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

kDa M 1 2 3 4 5

170 130

95

72

55

43 36 kDa

34

26

17 11

Figure 3.2.8 HoArap over-expression analysis. Lane M. PageRuler™ Prestained Protein Ladder (SM0671); lane 1. Negative control of BL21 (DE3) cells; lane 2- 5. HoArap expressed in BL21 (DE3) cells after 1 hr., 2 hr., 4 hr. and 6 hr.

kDa M 1 2 3 4 5 6

170 130

95

72

55

43 36 kDa 34

26

17 11

Figure 3.2.9 HoArap purification. Lane M, PageRuler™ Prestained Protein Ladder (SM0671); Lane 1, HoArap flowthrough from column; lane 2 – 5, Column wash with 8, 50, 350 and 380 mM imidazole containing wash buffer; lane 6, desalted protein.

148 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3.3 Optimum temperature and pH

The optimal temperature determined for HoArap against pNP--L-arabinopyranoside and pNP-β-D-galactopyranoside was 45 and 50C (Figure 3.2.10), respectively. The pH optima and pH range for β-galactosidase activity was broad and was pH 6.5 and from 5.5 to 10.5 respectively (Figure 3.2.11a), and the recorded pH optima and pH range for -L-arabinopyranosidase activity was 7.0 and from 5.5 to 10.0 (Figure 3.2.11b).

Data 1

100 pNPAp

80 pNPGal

60

40

HoAraf activity (%)

(unit per mg protein) 20 (unit per mg protein) mg per (unit (%) activity HoArap 0 30 35 40 45 50 55 60 65 Temp (C)

Figure 3.2.10 Optimum temperature of HoArap with pNPAp ( 45C) and pNPGal ( 50C) in Bis-Tris buffer pH 6.5. The error bar represents the means  SD (n=3).

149 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

Data 1 100 Bis-Tris Tris-HCl

80

Glycine-NaOH 60

40

HoAraf activity (%)

(unit per mg protein) 20 (unit per mg protein) mg per (unit (%) activity HoArap 0 5.5 6.5 7.5 8.5 9.5 10.5 pH

Figure 3.2.11a. pH optima of HoArap with pNPGal (β-galactosidase activity;  Bis- Tris,  Tris-HCl,  Glycine-NaOH ). The error bar represents the means  SD (n=3).

Data 1 100 Bis-Tris

Tris-HCl 80 Glycine-NaOH 60

40

HoAraf activity (%)

(unit per mg protein) 20 (unit per mg protein) mg per (unit (%) activity HoArap 0 5.5 6.5 7.5 8.5 9.5 10.5 pH

Figure 3.2.11b. pH optima of HoArap with pNPAp (-L-arabinopyranosidase activity;  Bis-Tris,  Tris-HCl,  Glycine-NaOH ). The error bar represents the means  SD (n=3).

150 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3.4 Thermostability

The thermostability of HoArap was determined at various temperatures against pNPGal and pNPAp. HoArap showed a half-life of 3 min and 12 min at 60 C with pNPGal and pNPAp as substrate (Figure 3.2.12a,b).

Data 1

100 45

80 50

55

60 60

40

(unit per mg protein) 20

HoArap activity (%) activity HoArap protein) mg per (unit HoAraf residual activity (%) 0 0 20 40 60 80 100 120

Time (min)

Figure 3.2.12a. Temperature stability of HoArap with pNP-β-D galactopyranoside at different temperatures ( 45C, 50C,  55C,  60C). The error bar represents the means  SD (n=3).

151 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

Data 1

100 45

80 50

55 60 60

40

(unit per mg protein) 20

HoAraf residual activity (%)

HoArap activity (%) activity HoArap protein) mg per (unit 0 0 20 40 60 80 100 120 Time (min)

Figure 3.2.12b. Temperature stability of HoArap with pNP--L arabinopyranoside at different temperatures ( 45C, 50C,  55C,  60C). The error bar represents the means  SD (n=3).

152 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3.5 Effect of metal ions, solvents and sugars on the activity of HoArap

The effect of various metal ions, salt, sugars, solvents, and reagents was determined on HoArap activity. NaCl (1mM) activated the activity up to 43% and 73% for pNPGal and pNPAp, respectively. Following metals had strongly inhibited the HoArap activity against both the substrates Cd>Cu>Hg>Zn>Ni>Co. Metals which activated the activity of HoArap against pNPGal are Mn>Mg>Ca>Li and for pNPAp are Mn>Ba>Fe>Co>Li (Figure 3.2.13). SDS, Tween 80, EDTA and urea also inhibited the HoArap activity against both substrates. 10% ethanol and 50% glycerol completely abolished the enzyme activity. Sugars such as glucose, fructose, galactose, maltose, lactose, sucrose and maltose had not shown any significant effect on the activity of HoArap enzyme.

In presence of 1mM CaCl2 and MnCl2, HoArap with pNPAp showed a significant change in thermostability and increase in activity up to 17.5 fold after 20 min and from 26-100 folds at 45, 50 and 55 C (Table 3.2.2), whereas with pNPGal as substrate no changes were observed (Table 3.2.3).

500

400

300

200

(Unit per mg protein)mg per (Unit 100 HoArap residual activity (%) activity residual HoArap

0

Metals (1mM)

Figure 3.2.13 Effects of various metals and reagents on catalytic activity of HoArap against pNP-β-D-galactopyranoside (50C,) and pNP--L-arabinopyranoside (45C,) in Bis-Tris buffer pH 6.5. The error bar represents the means  SD (n=3).

153 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

Table 3.2.2 HoArap activity against pNP-Arabinopyranoside in presence and absence of CaCl2 and MnCl2 (1mM) over time.

Residual activity (%)* at Temp. (C)

Time (min) CaCl2 (1mM) MnCl2 (1mM)

45 50 55 60 45 50 55 60 0 100 100 100 100 100 100 100 100 5 98.7 59 25 0.1 200 157 126 57

20 117.5 52 11 0.01 193 133 100 57

* Residual activity is the percentage of metal/ non-metal treated HoArap activity. These data were calculated for triplicate experiments and indicated as mean values. Effect of metals was compared against control (100%) without metal (0 min).

Table 3.2.3 HoArap activity against pNP-Galactopyranoside in presence and absence of

CaCl2 and MnCl2 (1mM) over time.

Residual activity (%)* at Temp. (C)

Time (min) CaCl (1mM) MnCl (1mM) 2 2 45 50 55 60 45 50 55 60 0 100 100 100 100 100 100 100 100 5 91.7 94.7 50 25 87 79 71 62 20 81.7 65 50 25 85 69 57 53

* Residual activity is the percentage of metal/ non-metal treated HoArap activity. These data were calculated for triplicate experiments and indicated as mean values. Effect of metals was compared against control (100%) without metal (0 min).

154 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3.6 Determination of kinetic constants

The enzyme reaction kinetics of the purified HoArap enzyme for pNPAp and pNPGal was determined from Lineweaver-Burk plots using Prism 4 software. The Km for pNPGal 2.5mM was lower than 8.3mM of pNPAp (Figure 3.2.14a, b). These results suggest that pNPGal was preferred by the enzyme for its catalytic activity.

4.5 VMAX 4.437 0.09 4.0 VMAX 0.1236 KM 2.514 0.08 KM 8.393 3.5 0.07 3.0 0.06 300 2.5 7 6 0.05 2.0 5 0.04 200

4 1/v 1.5 3 0.03 1/V

2 v, nmol/min 100 v, (nmol/min) 1.0 1 0.02 0 0.0 2.5 5.0 7.5 10.0 12.5 0 0.5 0.01 0.0 2.5 5.0 7.5 10.0 12.5 1/[S] 0.0 0.00 1/[S] 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 [S], mM [S], mM a b

Figure 3.2.14 Enzyme kinetic studies of HoArap. Lineweaver-Burk plots (a and b) of pNP release as a result of hydrolysis of pNPGal and pNPAp by HoArap enzyme. The low km value indicated that pNPGal substrate was preferred by the enzyme.

155 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

3.2.3.7 Substrate specificity

The only natural substrate hydrolyzed by the HoArap enzyme was lactose. The hydrolysed glucose and galactose unit were observed as a trail (Figure 3.2.15) and does not completely separated on the TLC plate after many optimization trials. No activity was observed on TLC plate against linear arabinan, arabinoxylan (rye), arabinogalactan (larch) samples.

Glucose

Galactose

Lactose

1 2 3 4 5 6 7

Figure 3.2.15 TLC analysis of lactose hydrolysis over a time period. Lanes 1, 2 and 3 are lactose, galactose and glucose controls respectively, lanes 4 to 7 are the hydrolyzed end products (not completely separated) from lactose after 0 hr (lane 4), 12 hr (lane 5), 24 hr (lane 6), 48 hr (lane 7).

3.2.3.8 Crystallization

Crystal screens were setup using commercial conditions (Jena Bioscience, Germany) and in-house factorials. Hanging drop and sitting drop methods were applied to obtain crystals, but none of the screen gave any crystal to carry further structural studies.

156 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

DISCUSSION

Our in silico analysis using the online web-based DataBase for automated Carbohydrate-active enzyme ANnotation (dbcan) confirmed that the 948bp gene sequence, which had been annotated to code for a putative -L-arabinofuranosidase (araf; locus YP_002509798.1; EC 3.2.1.55) in the H. orenii genome sequence held in the GenBank data base, was a member of GH43. Currently GH43 includes β-xylosidase (EC 3.2.1.37), β-1,3-xylosidase (EC 3.2.1.-), arabinanase (EC 3.2.1.99), xylanase (EC 3.2.1.8), galactan 1,3-β-galactosidase (EC 3.2.1.145) as well as α-L-arabinofuranosidase (EC 3.2.1.55). However, the biochemical properties of the recombinant enzyme, which we have designated, HoArap, revealed that it is a β-galactosidase/-L- arabinopyranosidase and not an -L-arabinofuranosidase as only pNP-α-L- arabinopyranoside (pNPAp) and pNP-β-D-galactopyranoside (pNPGal) but not any of the other eight p- or o-NP-linked glycosides including pNP-α-L-arabinofuranoside (pNPAf), pNP-α-D-fucopyranoside (pNPAF), pNP-β-D-fucopyranoside (pNPBF), oNP- β-D-glucopyranoside (oNPG), pNP-α-D-glucopyranoside (pNPG), pNP-β-D- mannopyranoside (pNPM), pNP-α-D xylopyranoside (pNPAX) and pNP-β-D- xylopyranoside (pNPBX) could act as substrates.

HoArap hydrolyses the β-1-4 glycosidic bond (β-1,4-D-galactosidic linkage) between glucose and galactose present in lactose as well as pNPGal suggesting that HoArap could be classed as a β-galactosidases. β-galactosidases are widely distributed in nature and are found in all the three domains of life and hence would not be surprising. On the basis of amino acid homology, β-galactosidases can be divided into four distinctive Glycosyl Hydrolases (GH) families, namely, GH-1, GH-2, GH-35, and GH-42. Phylogenetic analysis has shown that each of the four families is distantly related to each other and each appears to be derived through separate gene lineages (Sheridan et. al., 2000). The β-galactosidase from Escherichia coli (EC-β-Gal), a member of GH-2, is one of the most studied and commonly used β-galactosidases whose three-dimensional structure has been determined and the structural basis for its reaction mechanism has been reported (Juers et. al., 2001). Structurally all crystallised β-galactosidases belong to clan F, a superfamily of 8-fold β/α-barrels with similar amino acid residues at their active sites. The nucleophile is a glutamate, which is located close to the carboxy-

157 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii terminus of β-strand seven with the sequence asparagine-glutamate close to the carboxy-terminus of β-strand four. This glutamate has been identified as the acid/base and is essential for catalysis. This superfamily is also called the 4/7 superfamily. However, HoArap is very distantly related to the β-galactosidases of GH-1, GH-2, GH- 35 or GH-42.

Biochemically, HoArap hydrolyses lactose by cleaving β-1-4 glucosidic bond between glucose and galactose. Thin layer chromatography results analysis showed that lactose was the natural substrate of HoArap. The enzyme did not showed any traces of activity towards linear arabinan, arabinoxylan (rye) and arabinogalactan (larch), which are natural substrates reported by previous workers for -L-arabinofuranosidase. To release terminal L-arabinosyl residues suggesting that the enzyme is in fact a type of β- galactosidases (Shin et. al., 2003). In nature, β-D-galactosidases have been identified to enhance plant growth and fruit ripening and are also used as tools in cell and molecular biology research (Li et. al., 2001; Laurno et. al., 2008; Connell and Walsh, 2010). The application of β-galactosidases, and in particular, thermostable β-galactosidases, in the hydrolysis of lactose in dairy products, such as milk and cheese whey, has received much interest, because of their potential usefulness in the industrial processing of lactose-containing products. However, pure saccharides such as glucose, lactose etc. are rarely found in the environment and hence are unavailable to microbes as natural substrates for their growth. So the question arises as to the role of microbial enzymes, such as HoArap, in nature. It has been well-established that most saccharides are tethered to aglycone moieties via glycosidic linkages and are locked away in plant biomass. D-galactose and L-arabinose are common components of several disaccharides and glycosides such as lactose, arabinogalactan, arabinoxylan, ginsenoside Rb2/Rc, para-NO2-phenyl β-D-galactopyranoside and para-NO2-phenyl -L-arabinopyranoside (Shin et. al., 2003; Connell and Walsh, 2010). Arabinosidases are one of the accessory enzymes that can cleave -L-arabinofuranosidic linkages to release L-arabinose and can act synergistically with other enzymes to complete the hydrolysis of hemicellulose and pectin. They stands as a promising biotechnological tool as an alternative to some of the present chemical methods used in pulp and paper industry (Greve et. al., 1998; Gobbetti et. al., 1999), oligosaccharides synthesis (Remond et. al., 2002, 2004) and biofuel production units (Saha and Bothast, 1998; Saha, 2003).

158 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii

The -N-arabinofuranosidase from H. orenii reported here has a 5-fold β-propeller structure and the catalytic nucleophile / base: proton donor are Aspartic:Glutamate and hence should be considered to be a member of family GH43 family. In silico model analysis of HoArap with -L-arabinase 43A from Cellvibrio japonicas (PDB code 1GYD) revealed that conserved putative catalytic center of HoArap have Asp121, Glu190 and 1GYD have Phe114, Asp158 and Glu221 and may follow single displacement mechanism. In 1GYD Phe114 significantly take part in substrate binding and its mutation resulted in reduced catalytic activity (Nurrizo et. al., 2002), whereas in HoArap phenylalanine residue is absent.

There are currently sustained efforts to search for new enzymes and new enzyme properties which could have the potential to act synergistically to improve biomass degradation. Examples include -N-arabinofuranosidase and endoxylasnase in which -N-Arafs from rumen metagenome raised the released reducing sugar from birchwood xylan to 73% and converted the intermediate xylo-oligosaccharide hydrolysate into xylose (Hasimoto and Nakata 2003; Zhou et. al., 2011).

To date only four enzymes with the ability to hydrolyze both β-D-galactoside and -L- arabinopyranoside bonds have been reported. Three of these from bacteria have been partially characterized, and include Alicyclobacilli acidocaldorius, Bifidobacterium breve, B. longum H-1 and Clostridium cellulovorans whereas the fourth which has not been studied is from the fungus Aspergillus niger, (Kosugi et. al., 2002; Shin et. al., 2003; Laurno et. al., 2008; Connell and Walsh, 2010; Lee et. al., 2011). However, HoArap is clearly different from the 3 published enzymes as summarised in Table 3.2.1.

BLASTp analysis identified a further nine sequences which are annotated as glycoside hydrolase/ -L-arabinofuranosidase but are related to HoArap. Amino acid sequence homology based on lowest E-value cut off and phylogenetic tree (Figure 3.2.4) analysis shows that HoArap have 55% amino acid identity with glycosidase of Anaerolinea thermophila (Narita-Yamada et. al., 2011; unpublished data), 52% with glycoside hydrolase family 43 of Mahella australiensis (Lucas et. al., 2011; unpublished data). Interestingly, H. orenii, M. australiensis, Clostridium sp., Lachanospiraceae and B.

159 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii proteoclasticus carrying glycoside hydrolase from GH family 43 belong to same phylum-Firmicutes (Figure 3.2.4).

HoArap have been cloned and over-expressed in BL21 (DE3) cells and was purified 2+ using Ni affinity chromatography technique with NiSO4 as column charging solution. HoArap was active against pNPAp and pNPGal, only two reports are available from barley and Bifidobacterium breve K-110 against these substrates. The specific activity of purified HoArap with pNPGal as substrate was 24U mg-1, which was higher than the other reported β-galactosidase/ -L-arabinopyranosidase that range from 6.61 to 8.71U mg-1 (Lee et. al., 2003; Shin et. al., 2003). The optimum temperature and pH of HoArap against pNPAp and pNPGal was 45 and 50C and was 7.0 and 7.5, whereas β- galactosidase with -L-arabinopyranosidase activity from A. acidocaldarius and A. niger had higher optimum temperature (65C) and low pH optima (pH 2-6) ((Kosugi et. al., 2002; Lauro et. al., 2008 and Connell and Walsh, 2010).

HoArap thermostability and half-life was determined against pNPAp and pNPGal. HoArap with pNPGal as substrate showed a half-life of 90 min at 45C which shows similarity with the -L-arabinofuranosidase stability at 45C (Wagschal et.al., 2007), followed by half-life of 90 min at 50C, 14 min at 55C and 3 min at 60C, which is higher than Apy-H1, Apy-K110 and BgA (Table 3) (Kosugi et. al., 2002; Shin et. al., 2003; Lee et. al., 2011).

HoArap catalytic activity was strongly affected by metals. Activity against pNPAp and pNPGal was completely abolished by the following metals Cd>Cu>Hg>Zn>Ni>Co (Li et. al., 2001; Shin et. al., 2003; Canakci et. al., 2007; Lee et. al., 2011). Manganese (1mM) activated the enzyme and increased the activity by 200 fold against pNPGal and more than 450 fold against pNPAp, whereas no effect was observed on Apy-H1, Apy- K110 and BgA (Table 3) (Kosugi et. al., 2002; Shin et. al., 2003; Lee et. al., 2011). Similarly, Fe and Ba (1mM) raised the activity by 200 fold against pNPAp and Magnesium (1mM) raised the activity against pNPGal by 200 fold. Addition of chelating agent EDTA (1mM) inhibited the enzymatic activity, which suggests that the metal ions are required by the HoArap to carry out the hydrolysis reaction. Reducing agent DTT (1mM) improved the catalytic activity against pNPGal by 23%; it seems the

160 CHAPTER 3: Section 3.2 -L-Arabinofuranosidase from H. orenii enzyme needs disulphide bonds for catalysis but not against pNPAp (Canakci et. al.,

2007). In presence of 1mM CaCl2 HoArap activity against pNPAp was enhanced by

17.5 fold at 45C, whereas 1mM MnCl2 enhanced the HoArap activity by 100 and 57 fold at 45 and 50C after 5 min and improved the activity by 93 and 33 fold at 45 and 50C after 20 min as compared against the control without metal treatment (Table

3.2.2), whereas HoArap activity against pNPGal was not enhanced by CaCl2 and MnCl2 (Table 3.2.3).

Enzyme kinetics investigation results suggest that pNPGal was the preferred substrate with Km 2.5 value over pNPAp having Km 8.39, which is supported by A. acidocaldarius

β-galactosidase Km value 4.9 over pNPAp Km value 5.2 (Laurno et. al., 2008).

Thin layer chromatography results analysis showed that lactose was the natural substrate of HoArap, previous studies on β-galactosidase/-L-arabinopyranosidase activities reports lactose as their natural substrate (Li et. al., 2001; Lauro et. al., 2008 and Connell and Walsh, 2010). The enzyme did not show any traces of activity towards linear arabinan, arabinoxylan (rye) and arabinogalactan (larch).

These findings suggests that biochemical and substrates specificities of HoArap enzyme differ from the previously reported -L-arabinofuranosidases and are similar to recently reported bifunctional -L-arabinopyranosidase/ β-D-galactopyranosidase from B. longum H1 (Schwarz et. al., 1995; Shin et. al., 2003; Morana et. al., 2007; Lee et. al., 2011). Further investigations on structural insights of HoArap are necessary to unveil the catalytic mechanism and improve our understanding about the catalytic properties of -N-arabinofuranosidase as -L-arabinopyranosidase/ β-D-galactosidases.

161 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

CHAPTER 3

Section 3.3

PRELIMINARY CRYSTALLOGRAPHIC INVESTIGATION

OF RIBOKINASE FROM H. ORENII

CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.1 INTRODUCTION

The monosaccharide D-ribose is the most abundant sugar in the environment and is metabolised by almost every organism. D-ribose is mainly used in the synthesis of nucleotides and amino acids (histidine, tryptophan) and an important source of energy. Cellular uptake of D-ribose is followed by phosphorylation to yield ribose-5- phosphate, which is catalysed by ribokinase in the presence of ATP and magnesium. The product is then available for the synthesis of nucleotides, tryptophan and histidine, or for entry into the pentose phosphate pathway (PPP). Sugar produced by nucleotide breakdown is also recycled by ribokinase (Sigrell et. al., 1998).

Ribokinase (EC. 2.7.1.15) has been studied from both prokaryotes and and sequence comparison has shown that they belong to a single family (Wu et. al., 1991; Bork et. al., 1993). Ribokinase is a member of a large family of sugar kinases known as PfkB family (also referred as RK family). PfkB family comprises of fructokinase, 1-phosphofructokinase, 6-phosphofructokinase, hexokinase, ketohexokinase, adenosinekinase, ribokinase and 2-dehydro-3-deoxygluconokinase (Sigrell et. al., 1997). Ribokinases are highly specific for phosphate substrates and sugars; they efficiently phosphorylate D-ribose and 2-deoxy-D-ribose compared to other simple sugars (Park and Gupta, 2008). The role of this enzyme for ribose utilization was identified for the first time in E. coli in early 1970’s, when a ribokinase lacking strain was not able to grow on ribose. Addition of ribose in growth media significantly induced the expression of ribokinase in E. coli (Anderson and Cooper, 1970; Hope et. al., 1986).

PfkB family members share a number of unique primary and tertiary structural elements (Park and Gupta, 2008). Members from this family can be identified by their highly conserved sequence motif present on both C and N terminal regions. Motif on C terminal region have a conserved aspartic acid residue that performs as a catalytic base, this motif is also involved in binding of ATP and anion hole formation. The N terminal region contains two glycine residues that form a hinge between the lid and β domain. Overall sequence identity among the PfkB family members is less than 30% but they have highly similar 3D structures, except hexokinase and galactokinase; however both show distinct pattern along with

162 CHAPTER 3: Section 3.3 Ribokinase from H. orenii different numbers of -helices and β strands (Sigrell et. al., 1999; Holden et. al., 2004; Kawai et. al., 2005).

Sugar phosphorylation is not completely understood both in prokaryotes and eukaryotes. Although the first 3D structure of ribokinase was solved from E. coli in 1998. The E. coli ribokinase was identified as a homodimer, each monomer consisting of two domains. The core domain of ribokinase is formed by a /β fold and harbours the catalytic site. (Sigrell et. al., 1997; Sigrell et. al., 1998). As such, ribokinase is a worthwhile target to explore the mechanism of sugar phosphorylation. In 2010, Chua and associates reported a protruding four-stranded β-sheet with distinct topology that forms a lid covering the active site while the enzymatic reaction takes place. Previously determined structures of E. coli ribokinase showed that the D-ribose substrate is enclosed in the substrate binding site under the protruding β-sheet. An induced fit mechanism has been suggested to enable recognition and binding of the donor trinucleotide ATP as a co-substrate (Sigrell et. al., 1998; Sigrell et. al., 1999).

The catalytic mechanism proposed by Sigrell and co-workers in 1999 describes about the binding of ribose to catalytic region that induces the enzyme affinity for second substrate ATP and the residue Asp255 acts as a catalytic base which abstracts a proton from the O5’ hydroxyl group of ribose. The negatively charged O5’ atom makes a nucleophilic attack on the -PO4 of ATP, resulting in a pentavalent transition state. Anion hole stabilizes the transition state by positioning the -PO4 of ATP. Subsequent movements of the lid domain to induce the breakdown of the transition state and allowing the enzyme to regain its open conformation. The products, R-5-P and ADP are released and the enzyme is ready to allow access for another ribose unit. This process suggests an ordered Bi Bi reaction mechanism which is proposed for other carbohydrate kinases (Sigrell et. al., 1998; Sigrell et. al., 1999).

163 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.2 MATERIALS AND METHODS

The materials and methods used for genomic DNA extraction, gene amplification, gene cloning, protein over-expression and purification are described in Chapter 2. Only specific methods used for ribokinase study are given in this section. For quick reference, an overview of different methods used for ribokinase investigation is presented in Figure 3.3.1.

Halothermothrix orenii

Bioinformatics study Molecular Biology Structural analysis

Sugar kinase gene Genomic DNA extraction Protein Crystallization search

PCR and pET22b (+) X- Ray diffraction of crystals ORF analysis and Vector preparation Primer design Data analysis and molecular PCR product & pET22 b model building Amino acid analysis (+) vector RE digestion - BLAST Structure refinement and - Clustal W Vector and Insert ligation - Protparam analysis - Protein signature Transformation & clone BRENDA database screening by PCR, rPlasmid search RE digestion, DNA sequencing

CaZY database search rProtein mini over-expression screen, SDS-PAGE analysis

KEGG database search Upscaling of culture & protein purification by IMAC, protein PDB database search estimation

Figure 3.3.1 Flowchart showing general methods used to investigate ribokinase from H. orenii.

164 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.2.1 In silico 3D Structure prediction

Amino acid sequence data of the enzyme ribokinase was submitted to the Geno3D server (which is a part of Pole BioInformatique Lyonnais, France) http://geno3d- pbil.ibcp.fr/cgi-bin/geno3d_automat.pl?page=/GENO3D/geno3d_home.html against NPS@3D sequences at 95% identity (from PDB, automated search by the server). E. coli ribokinase (PDB code: 3IN1, unpublished data) was used as template to model a 3D structure of HoRbk enzyme. A run on the Geno3D server took 15 min to complete with the default parameters.

165 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.3 RESULTS AND DISCUSSION

3.3.3.1 Gene cloning, recombinant protein over-expression and purification

The ribokinase gene (Genome accession number NC_011899, Gene ID 7313464 from nucleotide positions 1055983 to 1056900, 918 bp; locus YP_002508722, 305 amino acid) was amplified by PCR from genomic DNA of H. orenii using an forward and reverse primer (Appendix III). PCR and cloning procedures were carried out as described in chapter 2 (section 2.2). The full-length recombinant protein encompasses 311 amino acid and has a theoretical molecular mass of 34 kDa and a calculated pI of 4.8. Recombinant ribokinase from H. orenii (designated HoRbk) with a C-terminal hexa-his-tag was over-expressed in E. coli using a pET22b (+) (T7 RNA polymerase based expression system). Ni-NTA affinity chromatography (chapter 2, section 2.2.15) was used for HoRbk purification.

M 1 M 1 2

2322 bp

2027 bp 2322 bp 918 bp 2027 bp 564 bp 918 bp 564 bp

A B

Figure 3.3.2 (A) Lane M,  HindIII DNA standard size marker; Lane 1, PCR amplified rbk gene (918bp). (B) Lane M,  HindIII DNA standard size marker; lane 1, PCR amplified rbk gene physically excised from 1% agarose gel for purification after RE digestion; lane 2, Gel purified rbk gene product.

166 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

M 1

pET22b+ 2322 bp 2027 bp 918 bp 564 bp

Figure 3.3.3 Lane M,  HindIII DNA standard size marker; lane 1, RE digested recombinant plasmid with insert (pET22b+ and rbk).

kDa M 1 2 3 4 5 6 7 8 9

170 130

95 72

55

43 34 26 17 11

Figure 3.3.4 Purification of HoRbk using Ni2+ affinity chromatography and analysis of the elution fractions on a 4-12 % Bis-Tris NuPAGE stained with Coomassie blue. Lane M, Molecular-weight markers in kDa; lane 1, cell lysate; lane 2, column flow through; lanes 3 – 8, imidazole wash 8, 15 , 50, 75, 100 and 350 mM respectively; lane 9, U-tube concentrated and desalted HoRbk from 350mM eluted fraction

167 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.3.2 Crystallization

Out of 1056 tested factorial conditions, crystals were obtained from 181 different conditions (Appendix V) (Figure 3.3.5). Crystals obtained from condition CSII-22 (20% (w/v) PEG 4000, 0.02 M TRIS, pH 9.0) after about six days showed X-ray diffraction. The rectangular plate-like crystals had dimensions of 0.5 mm x 0.1 mm (Figure 3.3.5F) and diffracted up to 3.1 Å resolution.

The orthorhombic crystals possessed unit cell parameters of a = 45.6 Å, b = 61.0 Å, c = 220.2 Å. The data collection statistics is given in Table 3.3.1. Serial extinctions were clearly visible in all three directions, resulting in the final assignment of

P212121 as the space group. The Matthews coefficient (Matthews, 1968) indicates that there are two molecules in the asymmetric unit.

Using PSIPRED (Bryson et al., 1994), mannose-binding protein from Rattus rattus (PDB accession code 1rdk) was identified as a protein of known structure with highest homology to HoRbk. Molecular replacement calculations with various models based on 1rdk were conducted. The best solution was obtained with a poly- Ala model of 1rdk that had the protruding four-stranded -sheet truncated. This solution suggested the existence of two HoRbk monomers in the asymmetric unit which are not related by a rotational symmetry axis, and is thus in agreement with the self-rotation function calculated for = 180° where no non-crystallographic peaks have been observed.

Efforts to improve crystallisation conditions in order to obtain better quality crystals of the apo and ligand-bound enzyme are under way. Attempts to build a model of HoRbk using the available molecular replacement solution are in progress.

168 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

Table 3.3.1 Data collection statistics X-ray source AS MX2

Detector ADSC Quantum

Wavelength (Å) 0.95375

Temperature 100 K

Sample-to-detector distance 250 (mm)

Rotation per image (°) 1

Total rotation (°) 180

Exposure time (s) 2

No of crystals 1

Space group P212121

Unit cell parameters: a, b, c (Å) 45.6, 61.1, 220.2

Resolution range (Å) 59 - 3.1 (3.27 - 3.10)

Wilson B-factor (Å2) 57.2

Mosaicity (°) 0.52

Total no of measurements 57751 (7084)

No of unique reflections 11229 (1440)

I / I 5.3 (3.1)

Redundancy 5.1 (4.9)

Completeness 0.948 (0.851)

Rsym 0.108 (0.238) Values in parentheses are for highest-resolution shell.

169 CHAPTER 3: Section 3.3 Ribokinase from H. orenii 0.5 µm

A B C D

E F G H

I J K L

M N O P

Q R S T Figure 3.3.5 H. orenii recombinant ribokinase protein crystals grown during initial screen. Different screen conditions are A) CSII22, B) E24, C) P6, D) P52, E) P57, F) P58, G)

P60, H) P61, I) P69, J) P89, K) P93, L) N73, M) N83, N) F103, O) F104, P) F154, Q) F164, R) F157,

S) F162. Arrows indicates towards crystals.

170 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.3.3 HoRbk amino acid sequence and phylogenetic analysis

Amino acid sequence analysis of H. orenii ribokinase indicated that the protein shared maximum sequence similarities with members of group B (Figure 3.3.7) related to the family Thermotogaceae, class , phylum Thermotogae, domain Bacteria showed 100% similarity and family Nakamurallaceae, class , phylum Actinobacteria, domain Bacteria. Also shares 96-97 % similarity with family Spirochaetaceae, class Spirochetes, phylum Spirochetes, domain Bacteria; family Geobacillus, class , phylum Firmicutes, domain Bacteria. Interestingly, member of family Pyrococcus, class , phylum Themococci of domain Archaea share 97 % similarity with HoRbk

171 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

A

B

Figure 3.3.6 A dendrogram showing the phylogenetic position of ribokinase from H. orenii to its closest relatives. Genbank accession numbers are given in parentheses. Halothermothrix orenii H 168 (YP_002508722.1), Thermotogales bacterium (ZP_07577899.1), Nakamurella multipartita (YP_003200510.1), Lutiella nitroferrum (ZP_03699990.1), Catenulispora acidiphila (YP_003112873.1), Salinispora tropica (YP_001160982.1), Streptosporangium roseum (YP_003339034.1), Streptomyces pristinaespiralis (ZP_06908813.1), Actinomyces sp. (ZP_08026112.1), Nocardiopsis dassonvillei (YP_003681418.1), Streptomyces scabiei (YP_003487718.1), Streptomyces avermitilis (NP_822686.1), Streptomyces griseoflavus (ZP_07310272.1), Spirochaeta caldaria (YP_004696748.1), (NP_126109.1). Figure was prepared using neighbourhood joining method by TREECON software.

172 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

Figure 3.3.7 Amino acid sequence alignment of ribokinase from H. orenii, based on BLASTp against PDB database. Details are given in parentheses. Genbank accession number: 2C49 ( jannaschii Nucleoside Kinase- an Archaeal member of the ribokinase family), 2ABQ (Fructose-1-Phosphate Kinase from Bacillus halodurans), 1V19 (Keto-3-deoxygluconate kinase from Thermus thermophilus), 1BX4 (Human adenosine kinase), 3HJ6 (Halothermothrix orenii fructokinase), 3H49 (Putative ribokinase from E. coli). Figure was prepared using ESPript.cgi Version 3.06.

173 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

3.3.3.4 Three dimensional model analyses

A total of 10 models were generated from Geno3D tool for ribokinase from H. orenii. Model 1 was selected for analysis (Figure 3.3.8), the average RMSD for all the model is 1.23 Å. The 3D model of ribokinase is similar to the structure of E. coli ribokinase, RMSD =2.6 Å (PDB code 1RKD; Sigrell et. al., 1998), fructokinase from H. orenii, RMSD = 2.5 Å (PDB code 3HJ6; Chua et. al., 2010). The large domain contains seven  helices and five β sheets (Figure 3.3.8A). As per previously described mechanism (Sigrell et. al., 1998), the β domain is responsible for providing specific binding to the substrate, D-ribose and ATP. The smaller lid domain is composed of two loops (Figure 3.3.8B) that acts as a lid to cover active site.

174 CHAPTER 3: Section 3.3 Ribokinase from H. orenii

A

B

Figure 3.3.8 In silico modeled 3D structure of H. orenii ribokinase. Two domains are shown in cartoon diagram (blue end N- terminus and red end C- terminus). A) β domain, that binds with the substrate and induces the closing of lid domain , B) lid domain, that binds to the phosphate substrate. The image was constructed using PyMol v1.3. (DeLano, 2002).

175 CHAPTER 4: THESIS DISCUSSION AND FUTURE DIRECTIONS

CHAPTER 4

THESIS DISCUSSION AND FUTURE DIRECTIONS

CHAPTER 4: THESIS DISCUSSION AND FUTURE DIRECTIONS

4.1 Thesis Discussion

In this thesis investigations on glycosyl hydrolases and transferases from H. orenii are reported. H. orenii is a unique thermophilic anaerobe and is able to survive at high salt concentration and high temperature. It uses a variety of carbon sources to generate energy and chemicals for metabolic activities. For successful utilization of sugars it secretes numerous extracellular hydrolases and transferases. Extremophilic nature of H. orenii makes it a suitable candidate to study macromolecules secreted by it and explore their potential for commercial applications. Investigations of biochemical properties and an insight to the structural elucidation of GH and GT constitute the major achievements of this study.

In chapter 3 enzymes selected for this project have been described briefly. The selected GH and GT are mentioned in Table 3.1 with overall initial results from bioinformatic analysis to gene cloning, protein over-expression, purification and enzymatic activity. Enzymes such as fructokinase, glucosylceramidase, phosphofructokinase, -amylase 2 were only cloned and over-expressed are described in short. Whereas enzymes those get purified were investigated in detail and are described in sections of chapter 3 (section 3.1 β-glucosidase, section 3.2 -arabinofuranosidase and section 3.3 ribokinase).

In section 3.1 properties of β-glucosidase are explored in detail using various biophysical and biochemical analytical methods. Results pertaining to these analyses were compared with the structural outcomes. One of the significant outcomes was the presence of Ni ions in the substrate binding site (Figure 3.1.23), Ni was found as an inhibitor for β-glucosidase activity. The recombinant HoBglA exhibits three hydrolytic activities against three substrates without any specification towards C4 or C6 hydroxyl group and enzyme kinetics results indicate that glucoside is the most preferred substrate of HoBglA followed by galactoside and fucoside. Structure analysis and molecular docking results suggests that Glu166 and Glu354 play an important role in substrate hydrolysis by following retaining reaction mechanism (Figure 3.2.2). Molecular dynamics (MD) simulation analysis of HoBglA was performed using cellobiose, lactose and oNPF as ligands for 20 picoseconds; however the results showed severe conformational changes when the protein structures were compared in all three simulation cases before and after MD run. Superposition showed that the three ligands

176 CHAPTER 4: THESIS DISCUSSION AND FUTURE DIRECTIONS stay in the binding cavity, but do not occupy the same space. It would be difficult to analyse the MD simulations for catalytically active residues or mechanisms, due severe conformational changes and inherent flexibility of moving areas.

β-Galactosidase/ -L-arabinopyranosidase was studied in detail and is described in section 3.2. HoArap showed hydrolytic activity against lactose, galactopyranoside and arabinopyranoside. The results of the present study are similar to the activities of β- galactosidase/ -L-arabinopyranosidases reported from other sources (Table 3.2.1). Some of the workers reported that arabinopyranoside hydrolyzing enzymes resembles a type of β-galactosidase; results obtained from BLASTp in the present study did not support this fact with respect to HoArap. To further investigate on this, crystal screens were setup using commercial and in-house factorials unfortunately no crystals could be obtained during the period of this project.

Preliminary crystallographic analysis of Ribokinase (sugar kinase) from H. orenii is given in section 3.3. Orthorhombic crystal of HoRbk diffracted X-rays to 3.1 Å and have P212121 space group. Results showed that two HoRbk monomers were present in the asymmetric unit. Similarly HoBglA monomers were present in the asymmetric unit; this may be a trait of orthorhombic crystals or possibly two units of H. orenii enzymes working together during substrate hydrolysis or modification reaction. Molecular replacement calculation shows HoRbk structure shares homology with mannose binding protein. In silico model analysis of HoRbk indicated highest similarity with the ribokinase from E. coli with RMSD of 2.6 Å (PDB code 1RKD). Attempts were made to obtain crystals for high resolution X-ray diffraction with sugars and ligands, but results were not encouraging. Attempts for biochemical characterization of HoRbk were not successful.

These finding suggests that H. orenii is competent enough to use its hydrolase as multi- enzyme against various sugars available within their natural habitat. These enzymes can have significant implications in biotechnological applications, such as HoBglA which can hydrolyse three different sugars, can be significantly useful in pharmaceuticals and dairy industry. This study draws attention towards the adaptive strategies regarding how extremophiles survive under harsh and extreme conditions.

177 CHAPTER 4: THESIS DISCUSSION AND FUTURE DIRECTIONS

4.2 Future Directions

This research has expanded the present knowledge regarding the biochemistry and structural features of thermophilic (halophilic) proteins from H. orenii. Proteins mentioned in chapter 3 need more research in terms of biochemical characterization and structure studies. These enzymes may hold interesting properties similar to the HoBglA and HoArap having three and two activities, respectively. HoBglA and HoArap require more research to enhance their activities using modern enzyme/ protein engineering techniques such as site-directed mutagenesis. To reveal the mechanism for hydrolytic activities of HoBglA, HoArap and HoRbk high X-rays diffraction resolution crystals with ligands are required. Biochemical investigations of HoRbk are a necessity to provide an affirmative status regarding its structural parameters and transferases activity.

178 REFERENCES

REFERENCES

REFERENCES

Adams, P. D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis. I. W., Echols, N., Headd, J.J., Hung, L., Kapral, G.J., Grosse-Kunstleve, R.W. (2010). "PHENIX: a comprehensive Python-based system for macromolecular structure solution." Acta Crysta. D66: 213-221.

Aguilar, C., Sanderson, I., Moracci, M., Ciaramella, M., Nucci, R., Rossi, M and Pearl, LH. (1997). "Crystal structure of the β-glycosidase from the hyperthermophilic archeon Sulfolobus solfataricus: resilience as a key factor in thermostability." J. Mol. Bio. 271: 789-802.

Aguillar, M., Gloster, TM., Moreno, MIG., Mellet, CO., Davies, GJ., Llebaria, A., Casas, J., Gabas, ME., and Fernandez, JMG. (2008). "Molecular basis for beta- glucosidase inhibition by ring- modified Calystegine analogues." Chem. Bio. Chem. 9: 2612-2618.

Altschul, S., Gish, W., Miller, Webb., Myers, EW and Lipman, DJ. (1990). "Basic local alignment search tool." J. Mol. Biol. 215: 403-410.

Altschul, S., Madden TL., Schäffer, A A. , Zhang, J., Zhang, Z., Miller, W and Lipman, DJ. (1997). "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25: 3389-3402.

Andam, C. and Gogarten, JP. (2011). "Biased gene transfer in microbial evolution." Nature Rev. Microbiol. 12: 543-555.

Andersson, A. and Cooper, RA. (1970). "Bioechemical and genetic studies on ribose catabolism in Escherichia coli K-12." J. Gen. Microbiol. 62(3): 335-339.

Andersson, C. (2004). Structure-function studies of enzymes from ribose metabolism. Department of Cell and Molecular Biology. Uppsala, Uppsala University. Doctor of Philosophy.

Apweiler, R., Attwood, TK., Bairoch, A., Bateman, A., et. al. (2000). "InterPro an integrated documentation resource for protein families, domains and functional sites." Bioinform. 16(12): 1145-1150.

179 REFERENCES

Atomi, H., Sato, T and Kanai, T. (2005). "Recent progress towards the application of hyperthermophiles and their enzymes." Curr. Op. Chem. Biol. 9: 166-173.

Atomi, H., Sato, T and Kanai, T. (2011). "Application of hyperthermophiles and their enzymes." Curr. Op. Biotechnol. 22: 618-626.

Attwood, T., Croning, MDR., Flower, DR., Lewis, AP., Mabey, JE., Scordis, P., Selley, JN and Wright, W. (2000). "PRINTS-S: the database formerly known as PRINTS." Nucleic Acids Res. 28(1): 225-227.

Aulkemeyer, P., Ebner, R., Heilenmann, G., Jahreis, K., Schmid, K., Wrieden, S and Lengeler, JW. (1991). "Molecular analysis of two fructokinases involved in sucrose metabolism of enteric bacteria." Mol. Microbiol. 5(12): 2913-2922.

Bachmann, S. and McCarthy, AJ. (1991). "Purification and cooperative activity of enzymes constituting the xylan-degrading system of Thermomonospora fusca." Appl. Microbiol. Biotechnol. 57: 2121–2130.

Bajaras, E., Luethy, MH and Randall, DD (1997). "Molecular cloning and analysis of fructokianse expression in tomato (Lycopersicon esculentum Mill.)." Plant Sci. 125: 13- 20.

Banaszak, K., Mechin, I., Obmolova, G., Oldham, M., Chang, SH., Ruiz, T., Radermacher, M., Kopperschlager, G and Rypniewski, W. (2011). "The crystal structure of eukaryotic phosphofructokinase from baker's yeast and rabbit skeletal muscle." J. Mol. Biol. 407: 284-297.

Barrett, T., Suresh, CG., Tolley, SP., Dodson, EJ and Hughes, MA. (1995). "The crystal structure of a cyanogenic β-glucosidase from white clover, a family 1 glycosyl hydrolase." Structure 3(9): 951-960.

Benson, D., Mizrachi, IK., Lipman, DJ., Ostell, J., Rapp, BA and Wheeler, DL. (2000). "GenBank." Nucleic Acids Res. 28(1): 15-18.

180 REFERENCES

Beylot, M., Mckie, VA., Voragen, AGJ, Doeswijk-Voragen, CHL and Gilbert, HJ. (2001). "The Pseudomonas cellulosa glycoside hydrolase family 51 arabinofuranosidase exhibits wide substrate specificity." Biochem. J. 358: 607-614.

Bhatia, Y., Mishra, S and Bisaria, VS. (2002). "Microbial β-Glucosidases: Cloning, properties and applications.” Crit. Rev. Biotechnol. 22(4): 375-407.

Birgisson, H., Fridjonsson, O., Bahrani-Mougeot, FK., Hreggvidsson, GO., Kristjansson, JK and Mattiasson, B. (2004). "A new thermostable -L- arabinofuranosidase from novel thermophilic bacterium." Biotech. Lett. 26: 1347-1351.

Bisht, S. and Panda, AK. (2011). "Biochemical characterization and 16S rRNA sequencing of few lipase-producing thermophilic bacteria from taptapani hot water spring, Orissa, India." Biotechnol. Res. Int.: doi: 10.4061/2011/452710.

Bloom, J., Labthavikul, ST., Otey, CR and Arnold, FH. (2006). "Protein stability promotes evolvability." PNAS 103(15): 5869-5874.

Boonclarm, D., Sornwatana, T., Arthan, D., Kongsaeree, P and Svasti, J. (2006). "Beta-Glucosidase catalyzing specific hydrolysis of an Iridoid β-glucoside from Plumeria obtusa." Acta Biochim. Biophy. 38(8): 563-570.

Bockmann, J., Heuel, H and Lengeler, JW. (1992). "Characterization of a chromosomally encoded, non-PTS metabolic pathway for sucrose utilization in Escherichia coli EC3132." Mol. Gen. Genet. 235: 22-32.

Bogs, J. and Geider, K. (2000). "Molecular analysis of sucrose metabolism of Erwinia amylovora and influence on bacterial virulence." J. Bacteriol. 182(19): 5351-5358.

Bork, P., Sander, C and Valencia, A. (1993). " of similar enzymatic function on different protein folds: the hexokinase, ribokinase and galactokinase families of sugar kinases." Protein Sci. 2: 31-40.

Botelho, H. and Gomes, CM. (2011). "Structural reorganization renders enhanced metalloprotein stability." Chem. Commun. 47: 11149-11151.

181 REFERENCES

Bradford, M. M. (1976). "A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding." Anal. Biochem. 72: 248-254.

Bragg, WL and Bragg, WH. (1913). "The structure of crystals as indicated by their diffraction of X-ray." Proc. Roy. Soc. London 89: 248-277.

Breves, R., Bronnenmeier,K., Wild,N., Lottspeich,F., Staudenbauer,WL. and Hofemeister,J. (1997). "Genes encoding two different beta-glucosidases of Thermoanaerobacter brockii are clustered in a common operon." Appl. Env. Microbiol. 63(10): 3902-3910.

Brock, TD and Freeze, H. (1969). “Thermus aquaticus gen. nov., a nonsporulating extreme thermophile. J. Bacteriol. 98:289-297.

Bruins, M., Janssen, AEM and Boom, RM. (2001). "Thermozymes and their applications." Appl. Biochem. Biotech. 90: 155-186.

Brumshtein, B., Wormald, MR., Silman, I., Futerman, AH and Sussman, JL. (2006). "Structural comparison of differently glycosylated forms of acid-beta- glucosidase, the defective enzyme in Gaucher disease." Acta Cryst. D62: 1458-1465.

Bryson K., McGuffin L.J., Marsden R.L., Ward J.J., Sodhi J.S. & Jones D.T. (2005). “Protein structure prediction servers at University College London.” Nucl. Acids Res. 33, W36-W38.

Burmeister, W., Cottaz, S., Driguez, H., Iori, R., Palmieri, S and Henrissat, B. (1997). "The crystal structures of Sinapis alba myrosinase and a covalent glycosyl- enzyme intermediate provide insights into the substrate recognition and active-site machinery of an S-glycosidase." Structure 5: 663-675.

Caescu, C., Vidal, O., Krzewinski, F., Artenie, V and Bouquelet, S. (2004). "Bifidobacterium longum requires a fructokinase (Frk; ATP:D-fructose-6- phosphotransferase, EC. 2.7.1.4) for fructose catabolism." J. Bacteriol. 186(19): 6515- 6525.

182 REFERENCES

Camacho, C., Salgada, J., Lequerica, JL., Madarro, A., Ballestar, E., Franco, L and Polaina, J. (1996). "Amino acid substitutions enhancing thermostability of Bacillus polymyxa b-glucosidase A." Biochem. J. 314: 833-838.

Canakci, S., Belduz, AO., Saha, AC., Yasar, A., Ayaz, FA and Yayli, N. (2007). "Purification and characterization of a highly thermostable -L-arabinofuranosidase from Geobacillus caldoxylolyticus TK4." Appl. Microbiol. Biotechnol. 75: 813-820.

Cantarel, B. L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V. and Henrissat, B. (2009). "The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics." Nucleic Acids Res. 37: D233-238.

Cayol, I., Olliver, B., Patel, BKC, Prensier, G., Guezennec, J and Garcia, JL. (1994). "Isolation and characterization of Halothermothrix orenii gen. nov., sp. nov., a halophilic, thermophilic, fermentative, strictly anaerobic bacterium." Int. J. Syst. Bacteriol. 44(3): 534-540.

Calvo, P., Santamaria, MG., Melgar, MJ and Cabezas, JA. (1983). "Kinetic evidence for two active sites in β-D-fucosidase of Hellicell ericetorum." Int. J. Biochem. 15(5): 685-693.

Cayol, JL., Ducerf, S., Patel, BKC., Garcia, JL., Thomas, P and Ollivier, B. (2000). "Thermohalobacter berrensis gen. nov., sp. nov., a thermophilic, strictly halophilic bacterium from a solar saltren." Int. J. of Syst. Evol. Microbiol. 50: 559-564.

Chan, C., Yu, T and Wong, K. (2011). "Stabilizing salt-bridge enhznces protein thermostability by reducing the heat capacity change of unfolding." PLoS ONE 6(6): e21624.

Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, TJ., Higgins, DG and Thompson, JD. (2003). "Multiple sequence alignment with the Clustal series of programs." Nucleic Acids Res. 31: 3497-3500.

183 REFERENCES

Chinchetru, M., Cabezas, JA., and Calvo, P. (1983). "Characterization and kinetics of β-D-Gluco/Fuco/Galactosidase from sheep liver." Comp. Biochem. Physiol. 75B (4): 719-728.

Chinchetru, M., Cabezas, JA and Calvo, P. (1989). "Purification and characterization of a broad specificity β-glucosidase from sheep liver." Int. J. Biochem. 21(5): 469-476.

Chua, T., Bujnicki, JM., Chye, TT., Huynh, F., Patel, BK and Sivaraman, J. (2008). "The structure of sucrose phosphate synthase from Halothermothrix orenii reveals its mechanism of action and binding mode[W],[OA]." The Plant Cell 20: 1059- 1072.

Chua, T., Seetharaman, J., Kasprzak, JM., Ng, C., Patel, BKC., Love, C., Bujnicki, J.M. and Sivaraman, J. (2010). "Crystal structure of a fructokinase homolog from Halothermothrix orenii." J. Struct. Biol. 171: 397-401.

Chuenchor, W., Pengthaisong, S., Robinson, RC., Yuvaniyama, J., Svasti, J and Cairns, JRK. (2011). "The structural basis of oligosaccharide binding by rice BGlu1 beta-glucosidase." J. Struct. Biol. 173: 169-179.

Colas, B. (1978). "Some physiochemical and structural properties of two β-fucosidases from Achatina balteata." Biochim. Biophy. Acta. 527: 150-158.

Colas, B. (1980). "Kinetic studies on β-fucosidases from Achatina balteata." Biochim. Biophy. Acta. 613: 448-458.

Colas, B. (1981). "Inactivation of β-fucosidase by iodine, N-acetylimidazole and tetranitromethane, evidence for the existence of essential tyrosine residues." Biochim. Biophy. Acta. 657: 535-538.

Collaborative Computational Project, Number 4 (1994). "The CCP4 suite: Programs for protein crystallography." Acta Cryst. D50: 760-763.

Coughlan, M. (1985). "The properties of fungal and bacterial cellulases with comment on their production and application." Biotechnol. Genet. Eng. Rev. 3: 39-109.

184 REFERENCES

Cuiv, P., Klaassens, ES., Durkin, AS., Harkins, DM., Foster, L., McCorrison, J., Torralba, M., Nelson, KE and Morrison, M. (2011). "Draft genome sequence of Turicibacter sanguinis PC909, isolated from human feces." J. Bacteriol. 193(5): 1288- 1289.

D’Auria, S., Morana, A., Febbraio, F., Vaccaro, C., Rosa, MD and Nucci, R. (1996). "Functional and structural properties of the homogenous β-glycosidase from the extreme thermoacidophilic archaeon Sulfolobus solfataricus expressed in Saccharomyces cerevisiae." Prot. Exp. Purif. 7: 299-308.

Dai, N., Schaffer, A., Petreikov, M and Granot, D. (1997). "Potato (Solanum tuberosum L.) fructokinase expressed in yeast exhibits inhibition by fructose of both in vitro enzyme activity and rate of cell proliferation." Plant Sci. 128: 191-197.

Darden, T., York, D and Pedersen, L. (1993). “Particle mesh ewald: an N log(N) method for ewald sums in large systems”. J. Chem. Phys. 98:10089-10092.

DasSarma, S. and Arora, P. (2002). "Halophiles." eLS.

DeLano, W. (2002). “The PyMol molecular graphics system”. DeLano Scientific, USA.

Di Lauro, B., Rossi,M and Moracci,M. (2006). "Characterization of a beta- glycosidase from the thermoacidophilic bacterium Alicyclobacillus acidocaldarius." Extremophiles 10(4): 301-310.

Ding, Y., Ronimus, RS and Morgan, HW. (2001). "Thermotoga maritima phosphofructokinases: expression and characterization of two unique enzymes." J. Bacteriol. 183(2): 791-794.

Dion, M., Fourage, L., Hallet, JN and Colas. (1999). "Cloning and expression of a β- glycosidase gene from Thermus thermophilus, sequence and biochemical characterization of the encoded enzyme." Glycoconj. J. 16: 27-37.

Dische, Z. (1935). "Die bedeutung der phosphorsaureester fur den ablauf und die steuerung der Blutglykolyse." Biochem. Z. 280: 248-264.

185 REFERENCES

Emanuelsson, O., Brunak, S., Heijne, G von., and Nielsen, H. (2007). "Locating protiens in the cell using TargetP, SignalP and related tools." Nature Proto. 2: 953-971.

Emsley, P., and Cowtan, K. (2004). "Coot: model-building tools for molecular graphics." Acta Cryst. D60: 2126-2132.

Evans, P. and Hudson, PJ. (1979). "Structure and control of phosphofructokinase from Bacillus stearothermophilus." Nature 279: 500-504.

Evans, PR. (2005). “Scaling and assessment of data quality”. Acta Cryst. D62, 72-82.

Feinberg, L., Foden, J., Barrett, T et. al., (2011). "Complete genome sequence of the cellulolytic thermophile Clostridium thermocellum DSM1313." J. Bacteriol. 193(11): 2906-2097.

Fennington, G. J. and Huges, TA. (1996). "The fructokinase from Rhizobium leguminosarum biovar trifolii belongs to group I fructokinase enzymes and is encoded seperately from other carbohydrate metabolism enzymes." Microbiol. 142: 321-330.

Ferreira, C. and Terra, WR. (1983). "Physical and kinetic properties of a plasma- membrane-bound β-D-glucosidase (cellobiase) from midgut cells of an insect (Rhynchosciara americana larva)." Biochem. J. 213: 43-51.

Friedman, B., Vaddi, K., Preston, C., Mahon, E., Cataldo, JR and McPherson, JM. (1999). "A comparison of the pharmacological properties of carbohydrate remodelled recombinant and placental-derived β-glucocerebrosidase: implications for clinical efficacy in treatment of gaucher disease." Blood 93(9): 2807-2816.

Giordani, R. and Lafon, L. (1993). "A β-D-fucosidase from Asclepias curassavica latex." Phytochem. 33(6): 1327-1331.

Giordani, R. and Noat, G. (1988). "Isolation, molecular properties and kinetic studies of a strict β-fucosidase from Lactuca sativa latex." Eur. J. Biochem. 175: 619-625.

186 REFERENCES

Gloster, T., MacDonald, JM., Tarling, CA., Stick, RV., Withers, SG and Davies, GJ. (2004). "Structural, thermodynamic, and kinetic analyses of tetrahydrooxazine- derived inhibitors bound to beta-glucosidases." J. Biol. Chem. 279(47): 49236-49242.

Gloster, T., Roberts, S., Perugino, G., Rossi, M., Moracci, M., Panday, N., Terinek, M., Vasella, A and Davies, GJ. (2006). "Structural and thermodynamic analysis of glucoimidazole-derived glycosidase inhibitors." Biochem. J. 45: 11879-11884.

Gloster, T., Meloncelli, P., Stick, RV., Zechel, D., Vasella, A and Davies, GJ. (2007). "Glycosidase inhibition: an assessment of the binding of 18 putative transition- state mimics." J. Am. Chem. Soc. 129: 2345-2354.

Gobbetti, M., De Angelis, M., Arnaut, P., Tossut, P., Corsetti, A and Lavermicocca, P. (1999). "Added pentosans in breadmaking: of derived pentoses by sourdough lactic acid bacteria." Food Microbiol. 16: 409–418.

Goldstein, R. (2011). "The evolution and evolutionary consequences of marginal thermostability in proteins." Proteins 79: 1396-1407.

Gouet, P., Courcelle, E., Stuart, DI and Metoz, F. (1999). "ESPript: analysis of multiple sequence alignments in PostScript." Bioinformatics 15: 305-308.

Greve, L., Labavitch, JM and Hungate, RE. (1998). "α-L-Arabinofuranosidase from Ruminococcus albus 8: purification and possible roles in hydrolysis of alfalfa ." Appl. Environ. Microbiol. 47: 1135–1140.

Gunde-Cimerman, N., Oren, A and Plemenitas, A., Ed. (2005). Adaptation to life at high salt concentrations in arachaea, bacteria and eukarya. Cellular origin, life in extreme habitats and astrobiology. Springer.

Haki, G and Rakshit, SK. (2003). "Developments in industrially important thermostable enzymes: a review." Bioresource Tech. 89: 17-34.

187 REFERENCES

Hakulinen, N., Paavilainen, S., Korpela, T and Rouvinen, J. (2000). "The crystal structure of beta-glucosidase from Bacillus circulans sp. alkalophilus: ability to form long polymeric assemblies." J. Struct. Biol. 129: 69-79.

Han, N., and Robyt, JF. (1998). "Separation and detection of sugars and alditols on thin layer chromatograms." Carbohyd. Res. 313: 135-137.

Hansen, T. and Schonheit, P. (2000). "Purification and properties of the first- identified, archaeal, ATP-dependent 6-phosphofructokinase, an extremely thermophilic non-allosteric enzyme, from the hyperthermophile Desulfurococcus amylolyticus." Arch. Microbiol. 173: 103-109.

Hansen, T. and Schonheit, P. (2001). "Sequence, expression and characterization of the first archael ATP-dependent 6-phosphofructokinase, a non-allosteric enzyme related to the phosphofructokinase-B sugar kinase family, from the hyperthermophilic crenarchaeote Aeropyrum pernix." Arch. Microbiol. 177: 62-69.

Hansen, T., Musfeldt, M and Schonheit, P. (2002). "ATP-dependent 6- phosphofructokinase from the hyperthermophilic bacterium Thermotoga maritima: characterization of an extremely thermophilic, allosterically regulated enzyme." Arch. Microbiol. 177: 401-409.

Hansen, T. and Schonheit, P. (2004). "ADP-dependent 6-phosphofructokinase, an extremely thermophilic, non-allosteric enzyme from the hyperthermophilic, sulfate- reducing archaeon Archaeoglobus fulgidus strain 7324." Extremophiles 8: 29-35.

Harada, K., Tanaka, K., Fukuda, Y., Hashimoto, W., and Murata, K. (2005). "Degradation of rice bran hemicellulose by Paenibacillus sp. strain HC1: gene cloning, characterization and function of β-D-glucosidase as an enzyme involved in degradation." Arch. Microbiol. 184: 215-224.

Hashimoto, T. and Nakata, Y. (2003). "Synergistic degradation of arabinoxylan with α-l-arabinofuranosidase, xylanase and β-xylosidase from koji mold, Aspergillus oryzae, in high salt condition." J. Biosci. Bioeng. 95: 164-169.

188 REFERENCES

Henrissat, B. and Davies, GJ. (2000). "Glycoside hydrolases and glycosyltransferases families, modules and implications for genomics." Plant Physiol. 124: 1515–1519.

Her, S., Lee, HS., Choi, SJ., Choi, SW., Yoon, SS and Oh, DH. (1999). "Cloning and sequencing of β-1,4-endoglucanase gene (celA) from Pseudomonas sp. YD-15." Lett. Appl. Microbiol. 29: 389-395.

Hess, B., Bekker, H., Berendsen, HJC and Fraaije JGEM. (1997). “LINCS: a linear constraint solver for molecular simulations”. J. Comp. Chem. 18:1463-1472.

Hirasawa, K., Uchimura, K., Kashiwa, M., Grant, William D., Ito, S., Kobayashi, T and Horikoshi, K. (2006). "Salt-activated endoglucanase of a strain of alkaliphilic Bacillus agaradhaerens." Antonie van Leeuwenhoek 89: 211-219.

Hirsh, S., Nosworthy, NJ., Kondyurin, A., Remedios, CG., McKenzie, DR and Bilek, MMM. (2011). "Linker-free covalent thermophilic beta-glucosidase functionalized polymeric surfaces." J. Mater. Chem.: doi:10.1039/c1jm13376d.

Hjorleifsdottir, S., Petursdottir, SK., Korpela, J., Torsti, A., Mattila, P and Kristjansson, JK. (1996). "Screening for restriction endonucleases in aerobic, thermophilic eubacteria." Biotechnol. Tech. 10(1): 13-18.

Hobel, C. (2004). Access to biodiversity and new genes from thermophiles by special enrichment methods. Department of Biology, University of Iceland.

Hofmann, E. (1976). "The significance of phosphofructokinase to the regulation of carbohydrate metabolism." Rev. Physiol. Biochem. Pharmacol. 75: 1-68.

Hofmann., A and Wlodawer, A. (2002). "PCSB—a program collection for structural biology and biophysical chemistry." Bioinformatics 18: 209-210.

Holden, H., Thoden, JB, Timson, DJ and Reece, RJ. (2004). "Galactokinase: structure, function and role in type II galactosemia." Cell. Mol. Life Sci. 61: 2471-2484.

Holm, L and Sander, C. (1993). "Protein structure comparison by alignment of distance matrices." J. Mol. Bio. 233: 123-138.

189 REFERENCES

Hope, J., Bell, AW., Hermodson, MA and Groake, JM. (1986). "Ribokinase from Escherichia coli K-12: nucleotide sequence and overexpression of the rbsK gene and purification of ribokinase. ." J. Biol. Chem. 261: 7663-7668.

Hope, H. (1988). "Cryocrystallography of biological macromolecules: a generally applicable method." Acta Cryst. B44: 22-26.

Hunter, S., Apweiler, R., Attwood, TK, et. al. (2009). "InterPro: the integrative protein signature database." Nucleic Acids Res. 37(D): 211-215.

Husebye, H., Arzt, S., Burmeister, WP., Hartel, FV., Brandt, A., Rossiter, JT., and Bones, AM. (2005). "Crystal structure at 1.1 Angstroms resolution of an insect myrosinase from Brevicoryne brassicae shows its close relationship to beta- glucosidases." Insect Biochem. Mol. Biol. 35: 1311-1320.

Huynh, F., Tan, TC., Swaminathan, K and Patel, BKC. (2005). "Expression, purification and preliminary cryatallographic analysis of sucrose phosphate synthase (SPS) from Halothermothrix orenii." Acta Crysta. F F61: 116-117.

Hyunh, F. (2004). Structure function relationship of proteins from the halothermophile Halothermothrix orenii. School of Biomolecular and Physical Sciences, Griffith University. BSc (Honours).

Hyunh, F., Tan, TC., Swaminathan, K and Patel, BKC. (2005). "Expression, purification and preliminary crystallographic analysis of sucrose phosphate synthase (SPS) from Halothermothrix orenii." Acta Cryst. F61: 116-117.

Ilari, A and Savino, C. (2008). “Protein structure determination by X-ray crystallography”. Bioinfromatics, Volume I, Data, sequence analysis and evolution. J. Keith, Humana Press.

Ishibashi, M., Arakawa, T., Philo, JS., Sakashita, K., Yonezawa, Y., Tokunaga, H and Tokunaga, M. (2002). "Secondary and quaternary structural transition of the halophilic archaeon nucleoside diphosphate kinase under high and low salt conditions." FEMS Microbiol. Lett. 216: 235-241.

190 REFERENCES

Isorna, P., Polaina, J., Garcia, LL., Canada, FJ., Ganzalez, B and Aparicio, JS. (2007). "Crystal structures of Paenibacillus polymyxa beta-glucosidase B complexes reveal the molecular basis of substrate specificity and give new insights into the catalytic machinery of family I glycosidases." J. Mol. Bio. 371: 1204-1218.

Jekow, P., Behlke, J., Tichelaar, W., Lurz, R., Regalla, M., Hinrichs, W and Tavares, P. (1999). "Effect of the ionic environment on the molecular structure of bacteriophage SPP1 portal protien." Eur. J. Biochem. 264: 724-735.

Jeng, W., Wang, NC., Lin, MH., Lin, CT., Liaw, YC., Chang, WJ., Liu, CL., Liang, PH and Wang, AH. (2011). "Structural and functional analysis of three β- glucosidases from bacterium Clostridium cellulovorans, fungus Trichoderma reesei and termite Neotermes koshunensis." Struct. Biol. 173: 46-56.

Jiang, H., Dian, W., Liu, F and Wu, P. (2003). "Isolation and characterization of two fructokinase cDNA clones from rice." Phytochem. 62: 47-52.

Jones, T.A., Zou, JY., Cowan, S and Kjeldgaard, M. (1991). "Improved methods for building protein models in electron density maps and the location of errors in these models." Acta Cryst. A47: 110-119.

Jones, D. (1999). "Protein secondary structure prediction based on position specific matrices." J. Mol. Biol. 292: 195-202.

Juers, DH., Heightman, TD., Vasella, A., McCarter, JD., Mackenzie, L., Withers, SG. and Matthews, BW. (2001). A structural view of the action of Escherichia coli (lacZ) β-galactosidase. Biochem. 40: 14781–14794.

Kabsch, W. (1993). "Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants." J. Appl. Cryst. 26: 795-800.

Kabsch, W. (2010). "Integration, scaling, space-group assignment and post- refinement." Acta Cryst. D 66: 133-144.

191 REFERENCES

Karlvosky, P. (1993). "Screening for recombinant plasmids using MacConkey agar." Trends in Gen. 9: 375-376.

Kannan, N. and Vishveshwara, S. (2000). "Aromatic clusters: a determinant of thermal stability of thermophilic proteins." Protein Eng. 13: 753-761.

Kawai, S., Mukai, T., Mori, S., Mikami, B and Murata, K. (2005). "Hypothesis: structures, evolution and ancestor of glucose kinases in the hexokinase family." J. Biosci. Bioeng. 99: 320-330.

Kelley, D. (2005). "From the mantle to microbes: the lost city hydrothermal field." Oceanography 18(3): 32-45.

Kerrigan, JE. (2003). http://www2.umdnj.edu/~kerrigje.

Kendrew, J., Bodo, G., Dintzis, HM., Parrish, RG., Wyckoff, H and Phillips, DC. (1958). "A three-dimentional model of the myoglobin molecule obtained by x-ray analysis." Nature 181: 662-666.

Khiang, C. T., Bujnicki, J. M., Huynh. F., Patel. BKC & Sivaraman. J. (2008). "The structure of sucrose phosphate synthase from Halothermothrix orenii reveals its mechanism of action and biding mode." The Plant Cell 20: 1-14.

Kim, B., Singh., SP and Hayashi, K. (2006). "Characteristics of chimeric enzymes constructed between Thermotoga maritima and Agrobacterium tumefaciens β- glucosidases: role of C- terminal domain in catalytic activity." Enym. Microbiol. Technol. 38: 952-959.

King, K., Phan, P., Rellos, P and Scopes, RK. (1996). "Overexpression, purification and generation of a thermostable variant of Zymomonas mobilis fructokinase." Prot. Express. Purificat. 7: 373-376.

Kori, LD., Hofmann, A and Patel, BKC. (2011a). "Expression, purification and preliminary crystallographic analysis of the recombinant beta-glucosidase (BglA) from the halothermophile Halothermothrix orenii." Acta Cryst. F67: 111-113.

192 REFERENCES

Kori, LD., Hofmann, A and Patel, BKC. (2011b). “Expression, purification, crystallization and preliminary X-ray diffraction analysis of a ribokinase from the thermohalophile, Halothermothrix orenii." Acta Cryst. F68: 240-243.

Kosugi, A., Murashima, K and Doi, RH. (2002). "Characterization of two noncellulosomal subunits, ArfA and BgaA, from Clostridium cellulovorans that cooperate with the cellulosome in plant cell wall degradation.” J. Bacteriol. 184: 6859- 6865.

Kriegshauser, G and Liebl, W. (2000). "Pullulanase from the hyperthermphilic bacterium Thermotoga maritima: purification by β-cyclodextrin affinity chromatography." J. Chrom. B 737: 245-251.

Krissinel, E. and Henrick., K. (2004). "Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions." Acta Cryst. D 60: 2256- 2268.

Kumar, S., Tsai, CJ and Nussinov, Ruth. (2000)."Factors enhancing protein thermostability". Prot. Engineer. 13(3): 179-191.

Kumar, L., Awasthi, G and Singh, B. (2011). "Extremophiles: A novel source of industrially important enzymes." Biotechnol. 10(2): 121-135.

Kuo, L., and Lee., KT. (2008). "Cloning, expression and characterization of two β- glucosidases from Isoflavone glycoside-hydrolyzing Bacillus subtilis natto." J. Agric. Food Chem. 59: 119-125.

Lam, S., Yeung, RCY., Yu, T., Sze, K and Wong, K. (2011). "A rigidifying salt- bridge favors the activity of thermophilic enzyme at high temperatures at the expense of low temperature activity." PLoS Biol. 9(3): e1001027.

Laskowski, R., MacArthur, MW., Moss, DS and Thornton, JM. (1993). "PROCHECK: a program to check the stereochemical quality of protien structures." J. App. Crysta. 26: 283-291.

193 REFERENCES

Laurno, B., Strazzulli, A., Perugino, G., Cara, FL., Bedini, E., Corsaro, MM ., Rossi, M and Moracci, M. (2008). "Isolation and characterization of a new family 42 β-galactosidase from the thermoacidophilic bacterium Alicyclobacillus acidocaldarius: identification of active site residues." Biochem. Biophy. Acta 1784: 292-301.

Lazarev, V., Levitskii, SA., Basovskii, YI., et. al. (2011). "Complete genome and proteome of Acholeplasma laidlawii." J. Bacteriol. 193(18): 4943-4953.

Lee, J., Hyun, YJ and Kim, DH. (2011). "Cloning and characterization of -L- arabinofuranosidase and bifunctional -L-arabinopyranosidase/β-D-galactopyranosidase from Bifidobacterium longum H-1." J. App. Microbiol. 111: 1097-1107.

Leslie, A.G.W., (1992). Joint CCP4 + ESF-EAMCB Newsletter on Pro. Cryst. No. 26.

Levvy, G. and McAllan, A. (1963). "Beta-D-fucosidase in the Limpet, Patella vulgata." Biochem. J. 87: 206-209.

Li, N., Patel, BK., Mijts, BN and Swaminathan, K. (2002). "Crystallization of an - amylase, AmyA, from the thermophilic halophile Halothermothrix orenii." Acta Cryst. D58(12): 2125-2126.

Li, X., Pei, J., Wu, G and Shao, W. (2005). "Expression, purification and characterization of a recombinant beta-glucosidase from Volvariella volvacea." Biotechnol. Lett. 27: 1369-1373.

Li, WF., Zhou, XX and Lu, P. (2005). "Structural features of thermozymes." Biotechnol. Adv. 23: 271-281.

Lim, J., Choi, J., Han, S., Kim, S., Hwang, H., Jin, D., Ahn, B and Han, Y. (2001). "Molecular cloning and characterization of thermostable DNA ligase from Aquifex pyrophilus, a hyperthermophilic bacterium." Extremophiles 5: 161-168.

Lucas, R., Robles, A., Garcia, TM., Cienfuegos, GA., and Galvez A. (2001). "Production, purification and properties of an endoglucanase produced by the

194 REFERENCES

Hyphomycete Chalara (Syn. Thielaviopsis) paradoxa CH32. ." J. Agri. Food Chem. 49: 79-85.

Luesink, E., Marugg, JD., Kuipers, OP and Vos, WM. (1999). "Characterization of the divergent sacBK and sacAR operons, involved in sucrose utilization by Lactococcus lactis." J. Bacteriol. 181(6): 1924-1926.

Madigan, MT., and Oren, A. (1999). "Thermophilic and halophilic extremophiles." Curr. Opin. Microbiol. 2: 265-269.

Magliery, T., Lavinder, JJ and Sullivan, BJ. (2011). "Protein stability by number: high throughput and statistical approaches to one of protein sciences's most difficult problems." Curr. Op. Chem. Biol. 15: 443-451.

Maio, A., Porzio, E., Romano, I., Nicolaus, B and Mennella, MRF. (2011). Purification of the poly-ADP-ribose polymerase like thermozyme from the archaeon Sulfolobus solfataricus. Meth. Mol. Biol. A. Tulin, Springer. 780.

Marmur, J. (1961). "A procedure for the isolation of deoxyribonucleic acid from micro-organisms." J. Mol. Biol. 3: 208-218.

Matsumura, M., Signor, G and Matthews, BM. (1989). "Substantial increase of protein stability by multiple disulphide bonds." Nature 342: 291-293.

Matthews, B. (1968). "The solvent content of protein crystals." J. Mol. Bio. 33: 491- 497.

Mavromatis, K., Ivanova, N., Anderson, I., Lykidis, A., Hooper, SD., Sun, H., Kunin, V., Lapidus, A., Hugenholtz, P., Patel, B and Kyrpides, NC. (2009). "Genome analysis of the anaerobic thermohalophilic bacterium Halothermothrix orenii." PLoS ONE 4(1): 1-8. e4192.

Mavromatis, K., Sikorski,J., Lapidus,A., et. al. (2010). "Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IA)." Stand Genomic Sci. 2(1): 9-18.

195 REFERENCES

Mazzulli, J., XU, Y., Knight, AL., McLean, PJ and Caldwell, GA. (2011). "Gaucher disease glucocerebrosidase and alpha-synuclein form a bidirectional pathogenic loop in synucleinopathies." Cell 146: 37-52.

McNicholas, S., Potterton, E., Wilson, KS and Noble, MEM. (2011). “Presenting your structures: the CCP4mg molecular graphics software”. Acta Cryst. D67: 386-394.

Melgar, M., Cabeza, JA and Calvo, P. (1985). "Kinetic studies on β-D-fucosidase of Littorina littorea L." Comp. Biochem. Physiol. 80B(1): 149-156.

Medina, A. and Sols, A. (1956). "A specific frucokinase in peas." Biochim. Biophys. Acta 19: 378-379.

Mijts, B. (2001). Genes and Enzymes of Halothermothrix orenii. School of Biomolecular and Physical Sciences. Brisbane, Griffith University. PhD.

Mijts, B and Patel, BKC. (2001). "Random sequence analysis of genomic DNA of an anaerobic, thermophilic, halophilic bacterium, Halothermothrix orenii." Extremophiles 5: 61-69.

Mijts, B., and Patel., BKC. (2002). "Cloning, sequencing and expression of an α- amylase gene, amyA, from the thermophilic halophile Halothermothrix orenii and purification and biochemical characterization of the recombinant enzyme." Microbiol. 148: 2343-2349.

Morana, A., Paris, O., Maurelli, L., Rossi, M and Cannio, R. (2007). "Gene cloning and expression in Escherichia coli of a bifunctional β-D-xylosidase/ -L-arabinosidase from Sulfolobus solfataricus involved in xylan degradation." Extremophiles 11: 123- 132.

Nam, K., Kim, SJ., Kim, MY., Kim, JH., Yeo, YS., Lee, CM., Jun, HK and Hwang, KY. (2008). "Crystal structure of engineered β-glucosidase from a soil metagenome." Proteins 73(3): 788-793.

196 REFERENCES

Nam, S., Choi, SH., Kang, A., Kim, DW., Kim, RN., Kim, A., Kim, DS and Park, HS. (2011). "Genome sequence of Lactobacillus farciminis KCTC 3681." J. Bacteriol. 193(7): 1790-1791.

Nazir, A., Soni, Rohit., Saini, H S., Manhas, R K and Chadha, BS. (2009). "Regulation of expression of multiple beta-glucosidases of Aspergillus Terreus and their purification and characterization." Bioresources 4(1): 155-171.

Negelein, E. (1936). "Methode zur gewinnung des A-proteins der Garungsfermente." Biochem. Z. 287: 329-333.

Nelson, K., Clayton, RA., Gill, SR., et. al. (1999). "Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima." Nature 399(6734): 323-329.

Nielsen, H., Engelbrecht, J., Brunak, S and Heijne, G von. (1997). "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites." Prot. Engineer. 10(1): 1-6.

Nijikken, Y., Tsukada, T., Igarashi, K., Samejima, M., Wakagi, T., Shoun and Fushinobu, S. (2007). "Crystal structure of intracellular family 1 beta glucosidase BGL1A from basidiomycete Phanerochaete chrysosporium." FEBS Lett. 581: 1514- 1520.

Nocek, B., Stein, AJ., Jedrzejczak, R., Cuff, ME., Li, H., Volkart, L and Joachimiak, A. (2011). "Structural studies of ROK fructokinase YdhR from Bacillus subtilis: insights into substrate binding and fructose specificity." J. Mol. Biol. 406: 325- 342.

Nunoura, N., Ohdan, K., Yano, T., Yamamota, K and Kumagai, H. (1996). "Purification and characterization of β-D-glucosidase (β-D-fucosidase) from Bifidobacterium breve clb acclimated to cellobiose." Biosci. Biotech. Biochem. 60(2): 188-193.

197 REFERENCES

Nurizzo, D., Turkenburg, JP., Charnock, SJ., Roberts, SM., Dodson, EJ., Mckie, VA., Taylor, EJ., Gilbert, HJ and Davies, GJ. (2002). "Cellvibrio japonicus -L- arabinanase 43A has a novel five-blade β-propeller fold." Nat. Struct. Biol. 9(9): 665- 668.

Odoux, E., Chauwin, A and Brillouet, JM. (2003). "Purification and characterization of vanilla bean (Vanilla planifolia Andrews) β-D-glucosidase." J. Agric. Food Chem. 51: 3168-3173.

Offman, M., Krol, M., Silman, I., Sussman, JL and Futerman, AH. (2010). "Molecular basis of reduced glucosylceramidase activity in the most common Gaucher disease mutant, N370S." J. Biol. Chem. 285(53): 42105-42114.

Ogg., C. and Patel, BKC. (2009a). "Thermotalea metallivorans gen. nov., sp. nov., a thermophilic, anaerobic bacterium from the Great Artesian Basin of Australia aquifer." Int. J. Syst. Evolution. Microbiol. 59: 964-971.

Oren, A. (2008). "Microbial life at high salt concentrations: phylogenetic and metabolic diversity." Saline Sys. 4(2).

Ostern, P., Guthke, IA and Terszakowec, J. (1936). "Uber die bildung des Hexose- monophosphorsaure-esters und dessen umwandlung in fructose-diphosphorsaure-ester im muskel. Hoppe -Seylers." Z. Physiol. Chem. 243: 9-37.

Paavilainen, S., Hellman, J and Korpela, T. (1993). "Purification,, characterization, gene cloning and sequencing of a new β-glucosidase from Bacillus circulans subsp. alkalophilus." Appl. Env. Microbiol. 59(3): 927-932.

Painbeni, E., Valles, S., Polaina, J and Flors, A. (1992). "Purification and characterization of a Bacillus polymyxa β-glucosidase expressed in Escherichia coli." J. Bacteriol. 174(9): 3087-3091.

Park, T., Choi, KW., Park, CS., Lee, CB., Kang HY., Shon, KJ., Park, JS and Cha, J. (2005). "Substrate specificity and transglycosylation catalyzed by a thermostable β-

198 REFERENCES glucosidase from marine hyperthermophile Thermotoga neapolitana." Appl. Microbiol. Biotechnol. 69: 411-422.

Park, J. and Gupta, RS. (2008). "Adenosine kinase and ribokinase- the RK family of proteins." Cell. Mol. Life Sci. 65: 2875-2896.

Parry, NJ., Beever, DE., Owen, E and Beeumen, JV. (2001). "Biochemical characterization and mechanism of action of a thermostable b-glucosidase purified from Thermoascus aurantiacus." Biochem. J. 353: 117-127.

Patchett, M., Daniel, RM., and Morgan, HW. (1987). "Purification and properties of a stable β-glucosidase from an extremely thermophilic anaerobic bacterium.” Biochem. J. 243: 779-787.

Patel, B., Morgan, W and Daniel, RM. (1985a). "A simple and efficient method for preparing and dispensing anaerobic media." Biotechnol. Lett. 7: 277-278.

Patel, B., Morgan, W and Daniel, RM. (1985b). "Fervidobacterium nodosum gen. nov. and spec. nov., a new chemoorganotrophic, caldoactive, anaerobic bacterium." Arch. Microbiol. 141: 63-69.

Pego, J. and Smeekens, SCM. (2000). "Plant fructokinases: a sweet family get- togather." Trends Plant Sci. 5(12): 531-536.

Pendola, F. and Palis, J. (1993). "MacConkey-II lactose agar for screening recombinant plasmids." Trends in Biotech. 9(1).

Perutz, M. (1956). "Isomorphous replacement and phase determination in non- centrosymmetric spcae groups." Acta Cryst. 9: 867-873.

Piston, S., Voragen, AG and Beldman G. (1998). "Stereochemical course of hydrolysis catalyzed by arabinofuranohydrolases." FEBS Lett. 398: 7-11.

Plant, A., Oliver, JE., Patchett, ML., Daniel, RM and Morgan, HW. (1988). "Stability and substrate specificity of a beta-glucosidase from the thermophilic bacterium Tp8 cloned in Escherichia coli." Arc. Biochem. Biophy. 262(1): 181-188.

199 REFERENCES

Porter, C., Bartlett, GJ and Thornton, JM. (2004). "The catalytic site atlas: a resource of catalytic sites and residues identified in enzymes using structural data." Nucleic Acids Res. 32(D): 129-133.

Pray, L. (2008). "Recombinant DNA technology and transgenic animals." Nature Edu. 1(1).

Qi, B., Wang, L and Liu, X. (2009). "Purification and characterization of beta- glucosidase from newly isolated Aspergillus sp. MY-0204." Af. J. Biotechnol. 8(10): 2367-2374.

Qu, Q., Lee, SJ and Boos, W. (2004). "Molecular and biochemical characterization of a fructose-6-phosphate-forming and ATP-dependent fructokinase of the hyperthermophilic archaeon Thermococcus litoralis." Extremophiles 8: 301-308.

Rainey, F. and Oren, A. (2006). Extremophile microorganisms and the methods to handle them. Extremophile microorgan. 35.

Reid, S., Rafudeen, MS and Leat, NG. (1999). "The genes controlling sucrose utilization in Clostridium beijerinckii NCIMB 8052 constitute an operon." Microbiol. 145: 1461-1472.

Remond, C., Ferchichi, M., Aubry, N., Plantier-Royon, R., Portella, C and O’Donohue MJ. (2002). "Enzymatic synthesis of alkyl arabinofuranosides using a thermostable α-l-arabinofuranosidase." Tetrahedron Lett. 43: 9653–9655.

Remond, C., Plantier-Royon, R., Aubry, N., Maes, E., Bliardc, C and O’Donohue MJ. (2004). "Synthesis of pentose-containing disaccharides using a thermostable α-L- arabinofuranosidase." Carbohydr. Res. 339: 2019–2025.

Reshetnikov, A., Rozova, ON., Khmelenina, VN., Mustakhimov, II., Beschastny, AP., Murrell, JC and Trotsenko, YA. (2008). "Characterization of the pyrophosphate- dependent 6-phosphofructokinase from Methylococcus capsulatus Bath." FEMS Microbiol. Lett. 288: 202-210.

200 REFERENCES

Richard, S., Madern, D., Garcin, E and Zaccai, G. (2000). "Halophilic adaptation: novel solvent protein interactions observed in the 2.9 A and 2.6 A resolution structures of the wild type and a mutant of malate dehydrogenase from marismortui." Biochem. 39: 992-1000.

Riley-Lovingshimer, M., Ronning, DR., Sacchettini, JC and Reinhart, GD. (2002). "Reversible ligand-induced dissociation of a tryptophan-shift mutant of phosphofructokinase from Bacillus stearothermophilus." Biochem. 41: 12967-12974.

Rodriguez, J., Cabezas, JA and Calvo, P. (1982). "Beta-fucosidase, β-glucosidase and β-galactosidase activities associated in bovine liver." Int. J. Biochem. 14: 695-698.

Rossmann, M and Blow, DM. (1962). "The detection of sub-units within the crystallographic asymmetric unit." Acta Cryst. 15: 24-31.

Rothschild, LJ and Mancinelli, RL. (2001). "Life in extreme environments." Nature 409 (22): 1092-1100.

Rye, C. and Withers, SG. (2000). "Glycosidase mechanisms." Curr. Opin. Chem. Biol. 4: 573-580.

Saha, B. (2003). "Hemicellulose bioconversion." J. Ind. Microbiol. Biotechnol. 30: 279–291.

Saha, B. and Bothast, RJ. (1998). "Purification and characterization of a novel thermostable α-L-arabinofuranosidase from a color-variant strain of a Aureobasidium pullulans." Appl. Environ. Microbiol. 64: 216-220.

Sambrook, J., Fritsch, EF and Maniatis, T., Ed. (1989). Molecular cloning: A laboratory manual. Cold Spring Harbour Laboratory Press.

Sano, K., Amemura, A., and Harada, T. (1975). "Purification and properties of a β-1- 6-glucosidase from Flavobacterium." Biochim. Biophysi. Acta. 377: 410-420.

201 REFERENCES

Sardi, S., Clarke., Kinnecom, C., et. al., (2011). "CNA expression of glucocerebrosidase corrects alpha-synuclein pathology and memory in a mouse model of Gaucher-related synucleinopathy." PNAS 108(29): 1201-12106.

Scandurra, R., Consalvi, V., Chiaraluce, R., Politi, L and Engel, PC. (1998). "Protein thermostability in extremophiles." Biochimie 80: 933-941.

Schuette, S., Ozdemir, I., Mistry, D., Lucas, S., et. al. (2011). "Complete genome sequences for the anaerobic, extremely thermophilic plant biomass degrading bacteria, Caldicellulosiruptor hydrothermalis, Caldicellulosiruptor kristjanssonii, Caldicellulosiruptor kronotskyensis, Caldicellulosiruptor owensensis and Caldicellulosiruptor lactoaceticus." J. Bacteriol. 193(6): 1483-1484.

Schuettelkopf, AW and van Aalten, DMF. (2004). “PRODRG - a tool for high- throughput crystallography of protein-ligand complexes”. Acta Crysta. D60:1355-1363.

Schwarz, W., Bronnenmeier, K., Krause, B., Lottspeich, F and Staudenbauer, WL. (1995). "Debranching of arabinoxylan: properties of the thermoactive recombinant -L- arabinofuranosidase from Clostridium stercorarium (ArfB)." Appl. Microbiol. Biotechnol. 43: 856-860.

Serrano, L., Bycroft, M and Fersht, AR. (1991). "Aromatic-aromatic interactions and protein stability: investigation by double-mutant cycles." J. Mol. Biol. 218: 465-475.

Shallom, D., Belakhov, V., Solomon, D., Gilead-Gropper, S., Baasov, T., Shoham, G and Shohama, Y. (2002). "The identification of the acid-base catalyst of α- arabinofuranosidase from Geobacillus stearothermophilus T-6, a family 51 glycoside hydrolase." FEBS Lett. 514: 163-167.

Sharma, B. (2011). "Kinetic characterization of phosphofructokinase purified from Setaria cervi: a bovine filarial parasite." Enzyme Res. 2011: doi:10.4061/2011/939472.

Sheridan, PP and Brenchley, JE. (2000). Characterization of a salt-tolerant family 42 β-galactosidase from a psychrophilic antarctic Planococcus isolate. Appl. Environ. Microb., 66: 2438–2444.

202 REFERENCES

Shin, H., Park, SY., Sung, JH and Kim, DH. (2003). "Purification and characterization of -L-arabinofuranosidase from Bifidobacterium breve K-110, a human intestinal anaerobic bacterium metabolizing ginsenoside Rb2 and Rc." App. Environ. Microbiol. 69(12): 7116-7123.

Sigrell, J., Cameron, AD., Jones, TA and Mowbray, SL. (1997). "Purification, characterization and crystallization of Escherichia coli ribokinase." Pro. Sci. 6: 2474- 2476.

Sigrell, J., Cameron, AD., Jones, TA and Mowbray, SL. (1998). "Structure of Escherichia coli ribokinase in complex with ribose and dinucleotide determined to 1.8 A resolution: insights into new family of kinase structures." Structure 6: 183-193.

Sigrell, J., Cameron, AD and Mowbray, SL. (1999). "Induced fit on sugar binding activates ribokinase." J. Mol. Biol. 290: 1009-1018.

Sinnott, M. (1990). "Catalytic mechanisms of enzymic glycosyl transfer." Chem. Rev. 90: 1171-1202.

Sivakumar, N. (2006). Structural basis of protein stability at poly-extreme: crystal structure of AmyA at 1.6 Å resolution. Institute of Molecular and Cell Biology, National University of Singapore. PhD.

Sivakumar, N., Li, Nan., T, Julian W., Patel, BKC and Swaminathan, Kunchithapadam. (2006). "Crystal structure of AmyA lacks acidic surface and provide insights into protein stability at poly-extreme condition." FEBS Lett. 580: 2646-2652.

Smant, G., Stokkermans, JPWG., et. al., (1998). "Endogenous cellulases in animals: Isolation of β-1-4-endoglucanase genes from two species of plant-parasitic cyst nematodes." PNAS 95: 4906-4911.

Srisomsap, C., Svasti, J., Surarit, R., Champattanchai, V., Sawangareetrakul, P., Boonpuan, K., Subhasitanont, P and Chokchaichamnankit, D. (1996). "Isolation and characterization of an enzyme with β-glucosidase and β-fucosidase activities from Dalbergia cochinchinensis Pierre." J. Biochem. 119: 585-590.

203 REFERENCES

Stetter, KO. (1996). "Hyperthermophilic procaryotes." FEMS Microbiol. Rev. 18: 149-158.

Sterpone, F., Bertonati, C., Briganti, G and Melchionna, S. (2010). "Water around thermophilic proteins: the role of charged and apolar atoms." J. Phys. Condens. Matter. doi:10.1088/0953-8984/22/28/28113.

Sulzenbacher, G., Bignon, C., Nishimura, T., Tarling, CA., Withers, SG., Henrissat, B., and Bourne, Y. (2004). "Crystal structure of Thermotoga maritima -L- Fucosidase; insights into the catalytic mechanism and the molecular basis for fucosidosis." J. Biol. Chem. 279(13): 13119-13128.

Sumida, T., Sueyoshi, N and Ito, M. (2002). "Molecular cloning and characterization of a novel glucocerebrosidase of Paenibacillus sp. TS12." J. Biochem. 132(2): 237-243.

Sunna, A., Moracci, M., Rossi, M and Antranikian, G. (1997). "Glycosyl hydrolases from hyperthermophiles." Extremophiles 1: 2-13.

Sun, Y., Cheng, JJ., Himmel, ME., Skory, CD., Adney, WS., Thomas, SR., Tisserat, B., Nishimura, Y and Yamamoto, YT. (2007). "Expression and characterization of Acidothermus cellulolyticus E1 endoglucanse in transgenic duckweed Lemna minor 8627." Biores. Techn. 98: 2866-2872.

Suvd, D., Fujimoto, Z., Takase, K., Matsumura, M and Mizuno, H. (2001). "Crystal structure of Bacillus stearothermophilus  amylase: possible factors determining the thermostability." J. Biochem. 129: 461-468.

Sveinsdottir, M., Sigurbjornsdottir, MA and Orlygsson, J. (2011). Ethanol and hydrogen production with thermophilic bacteria from sugars and complex biomass. Progress in biomass and bioenergy production. S. Shaukat, InTech.

Tai, H., Irie, K., Mikami, S and Yamamoto, Y. (2011). "Enhancement of the thermostability of thermophilus cytochrome C552 through introduction of an extra methylene group into its hydrophobi protein interior." Biochem. 50: 3161-3169.

204 REFERENCES

Takase, M., and Horikoshi, K. (1988). "A thermostable β-glucosidase isolated from a bacterial species of the genus Thermus." Appl. Microbiol. Biotechnol. 29: 55-60.

Tan, T. C., Yein, Y.Y., Patel, B.K., Mijts, B.N. and Swaminathan, K (2003). "Crystallization of a novel -amylase, AmyB, from the thermophilic halophile, Halothermothrix orenii." Acta Cryst. D59: 2257-2258.

Tan, TC., Mijts, BN., Swaminathan, K., Patel, BKC and Divne, C. (2008). "Crystal structure of the polyextremophilic -amylase AmyB from Halothermothrix orenii: details of a productive enzyme-substrate complex and an N domain with a role in binding raw starch." J. Mol. Biol. 378: 850-868.

Tandt, W. (1987). "Human leukocyte and plasma β-D-fucosidase: identity with β-D- galactosidase." Biochem. Medi. Meta. Biol. 38: 331-337.

Thompson, J., Higgins, DG and Gibson, TJ. (1994). "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice." Nucleic Acids Res. 22(22): 4673-4680.

Titgemeyer, F., Reizer, J., Reizer, A and Saier, MH Jr. (1994). "Evolutionary relationships between sugar kinases and transcriptional repressors in bacteria." Microbiol. 140: 2349-2354.

Tong, L and Rossman, MG. (1990). "The locked rotation function." Acta Cryst. A46: 783-792.

Trimbur, D., Warren, RAJ and Withers, SG. (1992). "Region-directed mutagenesis of residues surrounding the active site nucleophile in beta-glucosidase from Agrobacterium faecalis." J. Biol. Chem. 267(15): 10248-10251.

Turner, J., Harrison, DD and Copeland, L. (1977). "Fructokinase (fraction IV) of Pea seeds." Plant Physiol. 60: 666-669.

205 REFERENCES

Turner, P., Mamo, G and Karlsson, EN. (2007). "Potential and utilization of thermophiles and thermostable enzymes in biorefining." Microb. Cell Factories 6(9).

Turner, P., Pramhed, A., Kanders, E., Hedstrom, M., Karlsson, EN and Logan, DT. (2007). "Expression, purification, crystallization and preliminary X-ray diffraction analysis of Thermotoga neapolitana β-glucosidase B." Acta Cryst. F63: 802-806.

Vabulas, R., Raychaudhuri, S., Hartl, MH., et. al. (2010). "Protien folding in the cytoplasm and the heat shock response." Cold Spring Harb. Perspect. Biol. 2: a004390. van der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, AE and Berendsen HJC. (2005). “GROMACS: Fast, flexible and free”. J. Comp. Chem. 26:1701-1718.

Van de Peer, Y. and De Wachter, R. (1994). "TREECON for windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment." Comput. Appl. Biosci. 10: 569-570.

Vannier, P., Marteinsson, VT., Fridjonsson, OH., Oger, P and Jebbar, M. (2011). "Complete genome sequence of the hyperthermophilic, piezophilic, heterophic and carboxydotrophic archaeon Thermococcus barophilus MP." J. Bacteriol. 193(6): 1481- 1482.

Vidyasagar, M., Prakash, S., Mahajan, V., Shouche, YS and Sreeramulu, K. (2009). "Purification and characterization of an extreme halothermophilic protease from a halothermophilic bacterium Chromohalobacter sp. TVSP101." Brazil. J. Microbiol. 40: 12-19.

Vielle, C and Zeikus, Gregory J. (2001). "Hyperthermophilic Enzymes: Sources, uses and molecular mechanisms for thermostability." Microbiol. Mol. Biol. Rev. 65(1): 1-43.

Vincent, F., Gloster, TM., MacDonald, J., Morland, C., Stick, RV., Dias, FMV., Prates, JAM., Fontes, CMGA., Glibert, HJ., and Davies, GJ. (2004). "Common inhibition of both beta-glucosidases and beta-mannosidases by Isofagomine lactam reflects different conformational itineraries for pyranoside hydrolysis." Chem. Bio. Chem. 5: 1596-1599.

206 REFERENCES

Vogt, G., Woell, S and Argos, P. (1997). "Protein thermal stability, hydrogen bonds and ion pairs." J. Mol. Biol. 269: 631-643.

Voorhorst, W., Eggen, RIL., Luesink, EJ., and Vos, WM De. (1995). "Characterization of the celB gene coding for β-glucosidase from the hyperthermophilic archaeon Pyrococcus furiosus and its expression and site directed mutation in Escherichia coli." J. Bacteriol. 177(24): 7105-7111.

Wagschal, K., Espiet, DF., Lee, CC., Accinelli, REK., Robertson, GH and Wong, DWS. (2007). "Genetic and biochemical characterization of an -L-arabinofuranosidase isolated from a compost starter mixture." Enzy. Micro. Technol. 40: 747-753.

Walker, D and Axelroad, B. (1978). "Evidence for a single catalytic site on the β-D- glucosidase, β-D-galactosidase of almond emulsin." Arc. Biochem. Biophy. 187(1): 102- 107.

Wang, Q., Trimbur, D., Graham, R., Warren, RAJ., and Withers, SG. (1995). "Identification of the acid/base catalyst in Agrobacterium faecalis β- glucosidase by kinetic analysis of mutants." Biochem. 34: 14554-14562.

Waters, D., Murray, PG., Ryan, LA., Arendt, EK and Tuohy, MG. (2010). "Talaromyces emersonii thermostable enzyme systems and their applications in wheat baking systems." J. Agric. Food Chem. 58: 7415-7422.

Wigley, D., Clarke, AR., Dunn, CR., Barstow, D., Atkinson, T., Chia, W., Muirhead, H and Holbrook, J. (1987). "The engineering of a more thermally stable lactate dehydrogenase by reduction of the area of a water accessible hydrophobic surface." Biochim. Biophys. Acta 916: 145-148.

Willies, S., Isupov, M and Littlechild, J. (2010). "Thermophilic enzymes and their applications in biocatalysis: a robust aldo-keto reductase." Environ. Technol. 31(10): 1159-1167.

207 REFERENCES

Willquist, K., Zeidan, AA and Niel, EW. (2010). "Physiological characteristics of the extreme thermophile Caldicellulosiruptor saccharolyticus: an efficient hydrogen cell factory." Microb. Cell Factories 9: 89.

Wolin, E., Wolin, MJ and Wolfe, RS. (1963). "Formation of methane by bacterial extracts." J. Biol. Chem. 238: 2882-2886.

Wu, L., Lee, J., Huang, H., Liu, B and Horng, J. (2009). "An expert system to predict protein thermostability using decision tree." Expert Sys. App. 36: 9007-9014.

Wu, L., Reizer, A., Reizer, J., Cai, B., Tomich, JM and Saier, MH. Jr. (1991). "Nucleotide sequence of the Rhodobacter capsulatus fruK gene, which encoded fructose-1-phosphate kinase: evidence for a kinase superfamily including both phophofructokinases of Escherichia coli." J. Bacteriol. 173: 3117-3127.

Yamada, M., Ikeda, K and Egami, F. (1973). "Further studies on glycosidases from the liver of Turbo cornutus with special reference to β-galactosidase, -L-arabinosidase and β-D-fucosidase activities." J. Biochem. 73: 69-76.

Yang, S., Jiang, Z., Yan, Q and Zhu, H. (2008). "Characterization of a thermostable extracellular β-glucosidase with activities of exoglucanase and transglycosylation from Paecilomyces thermophila." J. Agric. Food Chem. 56: 602-608.

Yan, F., Feng, G., Wei, LY., Hua, WG., Hao, P., He, MY., Hua, YJ and Fu, GG. (2011). "Crystal structure of histidine-containing phosphocarrier protein from Thermoanaerobacter tengcongensis MB4 and the implications for thermostability." Sci. China Life. Sci. 54: 513-519.

Yasawong, M., Areekit, S., Pakpitchareon, A., Santiwatankul, S and Chansiri, K. (2011). "Characterization of thermpophilic halotolerant Aeribacillus pallidus TD1 from Tao dam hot spring, Thailand." Int. J. Mol. Sci. 12: 5294-5303.

Yennamalli, R., Rader, AJ., Wolt, JD and Sen, TZ. (2011). "Thermostability in endoglucanases is fold specific." BMC Struc. Biol. 11(10): 1-15.

208 REFERENCES

Yoshioka, H and Hayashida., S. (1980). "Production and purification of thermostable beta glucosidase from Mucor miehei YH-10." Agric. Biol. Chem. 44(12): 2817-2724.

Zechel, D., Borastan, AB., Gloster, T., Borastan, CM., Macdonald, JM., Tilbrook, DM., Stick, RV and Davies, GJ. (2003). "Iminosugar glycosidase inhibitors: structural and thermodynamic dissection of the binding of isofagomine and 1-deoxynojirimycin to β-glucosidases." J.Am. Chem. Soc. 125: 14313-14323.

Zeikus, J., Hegge, PW and Anderson, MA. (1979). "Thermoanaerobium brockii gen. nov. and sp. nov., a new chemoorganotrophic, caldoactive, anaerobic bacterium." Arch. Microbiol. 122: 41-48.

Zeikus, J. G., Vieille, C., and Savchenko, A. (1998). "Thermozymes: biotechnology and structure- function relationships." Extremophiles 2: 179-183.

Zembrzuski, B., Chilco, P., Liu, XL., Liu, J., Conway, T and Scopes, R. (1992). "Cloning, sequencing and expresssion of Zymomonas mobilis fructokinase gene and structural comparison of the enzyme with other hexose kinases." J. Bacteriol. 174(11): 3455-3460.

Zeng, Y., Li, YT., Gu, Y and Zang, S. (1992). "Purification and characterization of a strictly specific β-D-fucosidase from Aspergillus phoenicis." Arc. Biochem. Biophy. 298(1): 226-230.

Zhou, J., Bao, L., Chang, L., Zhou, Y and Lu, H. (2011). "Biochemical and kinetic characterization of GH43 β-D-xylosidase/ -L-arabinofuranosidase and GH30 -L- arabinofuranosidase/ β-D-xylosidase from rumen metagenome." J. Ind. Microbiol. Biotechnol. DOI: 10.1007/s10295-011-1009-5

209 APPENDICES

APPENDICES

APPENDICES

APPENDIX I

Nucleotide and amino acid sequence of the genes under investigation

BASYS00126: rbsK, Fructokinase, EC. 2.7.1.4

>nameless_945 bp

ATGCCTGAAGTAATAACCATTGGTGAAGTCCTTGTTGAAGTAATGGCCAGGGAGATTAACCAGA AGTTTTCTGAGCCCGGGGAGTTTGTCGGGCCATTTCCCAGCGGAGCTCCGGCCATATTTATTGA TCAGGTCGCCAAAATGGGAGTTAGCTGTGGGATAATATCAAAAGTAGGTAATGATGATTTTGGA TACCTCAACCAGAAAAGGCTTGAAAAGGATGGGGTTGATACCAGTTATCTCGCTGTTTCAGACG AACATACTACCGGTGTTGCTTTTGTTACCTATATGGATAATGGTGACCGGGAATTTATTTTTCA CTTAAAAGAAGCCGCCCCTGGCTATATAAGTGAAGATGATATTGATGAAGGATACTTTGAAGAT TGTAAATACCTTCATTTGATGGGCTGTTCCCTATTTAATAAGTCTATACGAAAGGCCATAGGTA AAGCCCTGGACATAGCATCTGAAAAAGGGGTAAGGATATCCTTTGACCCCAACGTCAGAAAGGA GCTCCTTCAGGATAAGGAGATTAAGGACCTGTTTGATAGAATCCTGGATCAGTGTTATATATTT TTACCCGGGGAAGAGGAGCTTAAGTTTGTAACCGGGTATGACAGAGAAGAAGAGGCCATTCAGT ATTTAAGGAATAAAGGTGTTGAAATAATTGCTGTCAAAAAGGGTAGAAGGGGCAGTAAGGTTTA TACTGATGAAGTTACGTATACATTAAAACCCTTTAAGGTAGAGGAGGTTGATCCTACCGGGGCA GGGGATACATATGACGGGGCCTTGATAGCAAGCCTGGTTAACGGGAAGGGGTTTGAAAAGGCCT TACGGTATGCCAGTGCTGCCGGGGCTCTTGCTGTAACTACAAAAGGTCCCATGGAGGGGACCAG GACTGTTCCGGAATTAGATGAATTTATTAAATCACAACAGAAGATGTAA

>Translated_314_residues

MPEVITIGEVLVEVMAREINQKFSEPGEFVGPFPSGAPAIFIDQVAKMGVSCGIISKVGNDDFG YLNQKRLEKDGVDTSYLAVSDEHTTGVAFVTYMDNGDREFIFHLKEAAPGYISEDDIDEGYFED CKYLHLMGCSLFNKSIRKAIGKALDIASEKGVRISFDPNVRKELLQDKEIKDLFDRILDQCYIF LPGEEELKFVTGYDREEEAIQYLRNKGVEIIAVKKGRRGSKVYTDEVTYTLKPFKVEEVDPTGA GDTYDGALIASLVNGKGFEKALRYASAAGALAVTTKGPMEGTRTVPELDEFIKSQQKM

210 APPENDICES

BASYS00154: bfrA, β-fructosidase, EC. 3.2. 1.26

>nameless_633 bp

ATGTACTATTTAAGAAGCTTTAAAGACTCGGTGGGAGATGTAGATTTGATAGTACACAATGATA GACTCCACCTTTTCCATCTGGTATCACCGGGGAACAGTGCGGTAAGGCACCTTGTTTCTGATGA TGGGATAATCTGGGATAGCCTCCCTACGGCCATTACTATCGGTGATACAGGGAGCTATGATGAT GATCGTATCTGGACCATGGGTGTTACCAGACATAAGGGGAAGTTTTATATGTTCTATACTGCCT GTTCCACCAGGGAGGCTGGTCGGGTTCAGAGGACAGCCATGGCTGTGTCTCCTGACCTGATAAA CTGGGAAAAATACGATGGGAATCCTGTTTTATCCGCCGACAGCCGTTGGTATGAAAATGAACTG GGAGACAAAATAATGGTCTCCTGGCGGGACCCCAAGATATATTACGAAAACGGTAAGTACTATA TGGTTATTTCCAGCCGGAGTAATTCAGGTCCTTTCCTCAGGAGAGGTGTGGTAGCCCTGGCTGT TTCTGATAATTTAATTGACTGGGAGGTAAAAAAACCACTATTTGCCCCGGGGCAGTTTTATGAC CTGGAATGTCCCCAGTTATTTAAAATAGTGATTATTATTACATATTTGCCTCAATAA

>Translated_210_residues

MYYLRSFKDSVGDVDLIVHNDRLHLFHLVSPGNSAVRHLVSDDGIIWDSLPTAITIGDTGSYDD DRIWTMGVTRHKGKFYMFYTACSTREAGRVQRTAMAVSPDLINWEKYDGNPVLSADSRWYENEL GDKIMVSWRDPKIYYENGKYYMVISSRSNSGPFLRRGVVALAVSDNLIDWEVKKPLFAPGQFYD LECPQLFKIVIIITYLPQ

BASYS00369: -N-arabinofuranosidase II precursor, EC. 3.2.1.55

>nameless_948 bp

ATGAAAGATACCTATACAAATCCAGTTGGTGGGATAACCGGTATTGGTGATCCCTATGTCTTAA AACATGAATCACGGTATTATCTTTACGCTACTTCAGCAATTAACCGGGGCTTTAAGGTCTGGGA ATCACCAAACCTGGTTGACTGGGAATTAAAGGGGTTGGCCCTGGATTCTTATTATGAAAAAAAT GGCTGGGGAACAGAAGATTTCTGGGCCCCTGAGGTTATTTTTTATAATAATAAATTTTATATGA CATATAGTGCCAGGGATAATGATGGTCATTTAAAGATAGCTCTTGCTAGTAGCAAAAGTCCTCT GGGCCCTTTTAAAAATATTAAAGCTCCTCTTTTTGACCGTGGCTTATCCTTCATAGACGCCCAT ATATTTATAGATCAAGATGGCACCCCGTACATTTATTACGTCAAGGATTGTTCTGAAAATATTA TCAATGGTATTCATATAAGTCAAATTTATGTACAGGAAATGAGCCAGGACCTTCTCGAATTAAA AGGTGACCCGGTTTTAGCAATTCAACCAAGTCAGGACTGGGAGGGGATAAATGATGCCTGGCAG

211 APPENDICES

TGGAATGAAGGCCCCTTTGTCATTAAACATGAAGGTAAGTACTATATGATGTATTCAGCTAATT GTTACGCATCCCCGGATTATTCGATTGGTTATGCTGTGGCAGAAACCCCCTTGGGTCCCTGGAT AAAATATAGTGGTAACCCCATATTATCGAAAAGGATGGACAAAGGCATTTCAGGGCCAGGCCAT AACAGCGTAACCGTATCACCAGACGGCAGTGAATTATTTGTTGTATATCACACCCATACCTATC CCGATTCACCGGGTGGGGATAGGACGGTTAACATTGACAGATTATATTTTGAAGATGGAATATT AAAAGTAAAAGGTCCAACAAGGTCACCACAACCAGGACCGCGGAGTAATTAA

>Translated_315_residues

MKDTYTNPVGGITGIGDPYVLKHESRYYLYATSAINRGFKVWESPNLVDWELKGLALDSYYEKN GWGTEDFWAPEVIFYNNKFYMTYSARDNDGHLKIALASSKSPLGPFKNIKAPLFDRGLSFIDAH IFIDQDGTPYIYYVKDCSENIINGIHISQIYVQEMSQDLLELKGDPVLAIQPSQDWEGINDAWQ WNEGPFVIKHEGKYYMMYSANCYASPDYSIGYAVAETPLGPWIKYSGNPILSKRMDKGISGPGH NSVTVSPDGSELFVVYHTHTYPDSPGGDRTVNIDRLYFEDGILKVKGPTRSPQPGPRSN

BASYS00645: 2-dehydro-3-deoxygluconokinase , EC-2.7.1.45/ Ribokinase EC-2.7.1.15

>nameless_ 984 bp

GTGAAGGGAGAAGGTGTTATTGTGGGTATTCTAAATAAAAACTTTTCCCTGTCAAAGGGAGACC TGGATGTTGTAAGTCTCGGTGAAATACTGGTAGACATGATTTCTACAGAAGAGGTTAATTCTTT ATCACAGAGCAGGGAATATACCAGGCATTTCGGGGGATCACCGGCGAATATAGCCGTTAATTTA AGTCGCCTCGGGAAAAAGGTGGCTCTCATAAGTCGGCTCGGGGCAGATGCTTTTGGTAACTACC TGCTTGATGTTCTGAAAGGAGAACAGATAATTACAGATGGTATTCAGCAGGATAAGGAGAGAAG GACCACCATTGTATATGTCAGTAAGAGCACCCGGACCCCTGACTGGCTTCCTTACCGGGAGGCT GATATGTATTTACAGGAAGATGACATTATTTTTGAACTGATAAAAAGGAGCAAGGTTTTTCATC TATCGACATTTATTTTATCAAGGAAACCGGCCCGTGATACGGCCATAAAAGCCTTTAACTATGC CCGCGAGCAGGGGAAGATTGTCTGCTTCGATCCCTGTTACCGCAAGGTATTGTGGCCTGAAGGT GATGATGGAGCAGGGGTTGTTGAGGAAATAATATCGAGGGCTGATTTTGTGAAACCTTCTCTCG ATGATGCCAGGCATCTGTTTGGCCCTGATAGTCCTGAAAACTATGTGAAACGTTACCTGGAGCT TGGGGTTAAGGCTGTTATCCTGACCCTGGGAGAAGAAGGGGTCATAGCATCAGATGGTGAGGAG ATTATTAGAATACCGGCTTTTTCAGAAGACGCGGTGGATGTAACCGGAGCCGGTGATGCCTTCT GGAGTGGTTTTATCTGCGGACTTTTAGATGGATATACTGTAAAGAGGTCTATTAAACTGGGTAA TGGAGTGGCCGCTTTCAAGATCAGGGGGGTAGGGGCACTGTCACCGGTTCCCTCTAAAGAAGAC ATAATCAAAGAATATAACATTTAA

212 APPENDICES

>Translated_327_residues

MKGEGVIVGILNKNFSLSKGDLDVVSLGEILVDMISTEEVNSLSQSREYTRHFGGSPANIAVNL SRLGKKVALISRLGADAFGNYLLDVLKGEQIITDGIQQDKERRTTIVYVSKSTRTPDWLPYREA DMYLQEDDIIFELIKRSKVFHLSTFILSRKPARDTAIKAFNYAREQGKIVCFDPCYRKVLWPEG DDGAGVVEEIISRADFVKPSLDDARHLFGPDSPENYVKRYLELGVKAVILTLGEEGVIASDGEE IIRIPAFSEDAVDVTGAGDAFWSGFICGLLDGYTVKRSIKLGNGVAAFKIRGVGALSPVPSKED IIKEYNI

BASYS00973: bglA, Beta-glucosidase A, EC. 3.2.1.21

>nameless_1356 bp

ATGGCAAAAATAATATTTCCAGAAGATTTTATCTGGGGAGCAGCTACGTCATCCTACCAGATAG AAGGAGCTTTTAATGAGGATGGAAAGGGAGAATCCATCTGGGACAGATTTAGCCATACCCCGGG AAAAATTGAAAACGGTGACACCGGTGATATAGCCTGTGATCATTATCATCTGTACCGGGAAGAT ATTGAATTAATGAAAGAGATAGGGATTAGGTCATACCGTTTTTCTACTTCCTGGCCCCGGATTC TTCCGGAAGGTAAAGGCAGGGTAAACCAGAAAGGTCTTGATTTTTATAAAAGACTGGTTGACAA TCTTCTTAAGGCCAATATCAGACCCATGATAACCCTATACCACTGGGATTTACCCCAGGCATTA CAGGATAAAGGTGGCTGGACCAACAGGGATACAGCCAAATATTTCGCTGAATATGCCAGGCTTA TGTTTGAAGAGTTTAACGGCCTGGTGGACCTCTGGGTTACCCATAATGAACCCTGGGTAGTTGC CTTTGAGGGTCATGCTTTTGGTAACCATGCCCCCGGTACTAAAGATTTTAAAACGGCCCTTCAG GTCGCCCATCACCTGTTATTATCCCATGGAATGGCTGTTGATATCTTCAGGGAGGAAGACCTGC CCGGGGAGATTGGTATTACTCTCAACTTAACCCCTGCTTACCCGGCCGGTGACAGTGAGAAGGA TGTTAAGGCAGCTTCTTTACTTGATGACTATATTAATGCATGGTTTTTATCTCCAGTGTTCAAG GGCAGTTACCCGGAGGAATTACACCACATCTATGAACAGAATTTAGGGGCCTTTACAACCCAAC CGGGTGATATGGATATAATAAGCAGGGATATTGACTTCCTGGGCATTAATTACTACTCCAGGAT GGTGGTCAGGCATAAACCGGGAGATAATTTGTTTAATGCTGAAGTTGTAAAAATGGAGGATAGG CCATCTACAGAGATGGGCTGGGAGATTTATCCCCAGGGACTTTATGATATTTTAGTGAGGGTTA ATAAAGAATATACCGATAAGCCCCTTTACATAACAGAAAACGGGGCAGCTTTTGATGACAAATT AACAGAGGAAGGTAAGATCCATGATGAGAAGAGGATTAACTACCTGGGGGATCATTTTAAGCAG GCATATAAAGCCCTTAAAGATGGAGTTCCCCTCAGAGGTTATTATGTGTGGTCATTGATGGATA ATTTTGAATGGGCCTATGGCTATAGCAAGCGCTTTGGTCTCATTTATGTTGATTATGAAAATGG TAACAGACGCTTTTTAAAAGATAGTGCCCTATGGTATCGGGAGGTCATCGAAAAAGGCCAGGTT GAAGCTAACTAA

213 APPENDICES

>Translated_451_residues

MAKIIFPEDFIWGAATSSYQIEGAFNEDGKGESIWDRFSHTPGKIENGDTGDIACDHYHLYRED IELMKEIGIRSYRFSTSWPRILPEGKGRVNQKGLDFYKRLVDNLLKANIRPMITLYHWDLPQAL QDKGGWTNRDTAKYFAEYARLMFEEFNGLVDLWVTHNEPWVVAFEGHAFGNHAPGTKDFKTALQ VAHHLLLSHGMAVDIFREEDLPGEIGITLNLTPAYPAGDSEKDVKAASLLDDYINAWFLSPVFK GSYPEELHHIYEQNLGAFTTQPGDMDIISRDIDFLGINYYSRMVVRHKPGDNLFNAEVVKMEDR PSTEMGWEIYPQGLYDILVRVNKEYTDKPLYITENGAAFDDKLTEEGKIHDEKRINYLGDHFKQ AYKALKDGVPLRGYYVWSLMDNFEWAYGYSKRFGLIYVDYENGNRRFLKDSALWYREVIEKGQV EAN

BASYS01006: MAN2C1, -mannosidase 2C1, EC. 3.2.1.24

>nameless_ 2766 bp

GTGGAGAAAGCTGACCCGGATAGGATACATAAGATTAAGCTAGAAGCATATACTCGTTCAAGAC CAGATGATGACAGAAATACTGATACCATTAACATTAAAGGCTGTTTACAATATTTTCGAACACC TGAACTAGTTTTAATAGATGAAAAAATGTTATCATTGTATTATGATTTGAAGGCTATTTATGAT GCTGCTTATGCAGAATCTATGGATGAAAGTATAAAAAATTATCTTCAGTATCACTTACAACAGT TAATTAAAGAATTTCCTCTTTACGGCTGTAGCAGAGAAGAAATGGTTATGGCGATACCTAAAAT AAAAGAATATATAAACAATAAGATTTATAAAGGACATGAACATTTTGGGAAAAACGGCAAGATT GCCCTGGTTGCCCATTCACACCTTGATGTTGCTTACCATTGGACGGCAAAACAGGCTATTCAAA AAAATGCCAGGACAACATTAATTCAATTGAGATTAATGGAAGAATATCCTGATTTTAAGTATGC TCACAGTCAAGCCTGGACTTATGAAAAACTTGAAAAATATTATCCAGAACTTTTTGCTCAGGTG AAAGAACGAATAAAAAGTGGTCAATGGGAAATTGTAGGTGGTATGTATATTGAACCTGATTGTA ATTTAATTAGTGCTGAAAGTTTTGTTCGACAAATAGTCTATGGAAAATATTATTTTAAACAAAA ATTTGGAATTGATGTTGATAACTGTTGGTTACCTGACGTTTTTGGTAATAGTCCAATTATGCCA CAGATTTTAAAGTCTGGTGGACTTGAATATTTCGTTACCCATAAACTGTCTGTCTGGAATGATA CCAATAAATTTCCTCATAATGTCTTCCTGTGGAAAGGTTTAGATGGTACAACGGTTAATGCCTG TATACCACCAATCCATTTTGTTACCTGGATGGATACTGAAGAGACAATAAATAACTGGAACCAA TTTCAGGATAAAAATGTGTGTGATGAAACCTTACAATTATATGGTTATGGTGATGGTGGTAGTG GTGTTACAGATGAGATGCTACAGTTATATGAACGTCAACAGAAGTTGCCTGGTATTCCAGAACA AAGGTTGACTACAGCTAAAGAATATTTACATCGTATATTTAAAGACACGAAAGATTTTGCAGTA TGGGATGGAGATTTATATCTCGAAATGCACCGGGGGACTTATACAAGTAAGGCAAAATTAAAGA AATACAACCGACAGGGAGAATTTCTTGCCCAGGAAGTTGAAACATTATGTACAGCTTGTGACAT TTATACTGGTGATTTTAAGGTGCAGAACAAACTTAAAAATATATGGAAAAAACTGTTAGTTAAC

214 APPENDICES

CAGTTTCATGATATTTTACCTGGTAGTCATACACAACCGGTTTATAAAGAAGCTATTGAGAATT ATGAAGAAATGTTTACAGCTTTCAATAATTTAAAGGAAAAGGCCCTGAAAAAAATAACCAGTCC CGGGGATGAAATAGATTATGTTGCCTTTAATGCTTTCAGCGATGTTAGAAATAAAGTAGCCTAT ATTGATGCGAATAACTGGGATATTAAATATAATGCCCTGAAAGATGATGAGGGGAATATTTATC CAGTTCAGAAGCAAATAAAAGCTGATGGTAGTATTCAGTATGCAGTAAAAATTCCTGACATCCC TGGGTTTTCATTAAAACAATTTAAAGCAACAAGTACTGACAAAATATCTTCATCAATGAAAGTA TCTCAGACAGAAATGGAAAATGACTACTATATCTTGAAATTAGAATCCTCAGGAAAGATAGTAA GCCTTTATGATAAAGTAAGAGAAAAATATGTTAATACTAGGAATGAAGTATTAAATAAATGGCA GATGTTTGAAGATAAACCCGGCGCCCTAAACGCATGGGATATTGTTGAAACTTATAAAAATCAG GAGATTAATCTTCCTGATTGGGAAAATATAACAGTTATAGAAGATGGTCCGGTAAGTATTGCGT TAAGAATGGAGAGGAAATTTAGTAATAGCAAAGCGGTGCAGGTCATCAGAATGTTTGATAATAA ACCTCAGATTGATTTTGATACCTGGGTCGACTGGCAAGAAGAAGAAAAGTTATTAAAAGTAGCA TTCCCTGTAAATGTAAGATCCAGGACTTATTCCACTGATACATCAGCAGGTGGATTTGAGAGAA TGAATCATAAAAATACTGGCTGGGAACAGGGGCAATTTGAAGTTCCCTGTCATAAATGGGTAGA CATATCTGAAGGTTTGTTTGGAGTCTCACTAATGAATGATTGTAAGTATGGATGTGATGTAGAA GATAATGTTATGAGGTTAACCCTTTTAAAAGCCCCCATTTATCCTGATAGAACTAGTGATAGAG AGGAACACACTTTCACATATTCTATTTTTACTCATGATGGTAATAGACAGACAGGTGGTCTTGA AGAGGCTGCCTATGATCTTAATTATCCTTTAATCCTGGAGCAGGAAAGACGGTTAAAGGTTAAA AAGCCTGTTTTAACAATTAATGTCAAGAGTTTAAAATGCCAGGCATTTAAATTAGCAGAAGATG GTTCGTCAGATATAATTTTAAGATTAGCTGAGGTTTATGGTTCTCATGGAAAAGCAAAAGTACA TTTTAATTTTAATATTGAAGGCGTAAGTGTATGTAATATTCTAGAAGAAGAGCAAACATCATTG CAGGTTAATGATAACACAGTTACAATGGATTTTTCTCCTTACCAAATTATATCTTTAAGAGTTA AAAGGAAGTTGTAA

>Translated_921_residues

MEKADPDRIHKIKLEAYTRSRPDDDRNTDTINIKGCLQYFRTPELVLIDEKMLSLYYDLKAIYD AAYAESMDESIKNYLQYHLQQLIKEFPLYGCSREEMVMAIPKIKEYINNKIYKGHEHFGKNGKI ALVAHSHLDVAYHWTAKQAIQKNARTTLIQLRLMEEYPDFKYAHSQAWTYEKLEKYYPELFAQV KERIKSGQWEIVGGMYIEPDCNLISAESFVRQIVYGKYYFKQKFGIDVDNCWLPDVFGNSPIMP QILKSGGLEYFVTHKLSVWNDTNKFPHNVFLWKGLDGTTVNACIPPIHFVTWMDTEETINNWNQ FQDKNVCDETLQLYGYGDGGSGVTDEMLQLYERQQKLPGIPEQRLTTAKEYLHRIFKDTKDFAV WDGDLYLEMHRGTYTSKAKLKKYNRQGEFLAQEVETLCTACDIYTGDFKVQNKLKNIWKKLLVN QFHDILPGSHTQPVYKEAIENYEEMFTAFNNLKEKALKKITSPGDEIDYVAFNAFSDVRNKVAY IDANNWDIKYNALKDDEGNIYPVQKQIKADGSIQYAVKIPDIPGFSLKQFKATSTDKISSSMKV SQTEMENDYYILKLESSGKIVSLYDKVREKYVNTRNEVLNKWQMFEDKPGALNAWDIVETYKNQ EINLPDWENITVIEDGPVSIALRMERKFSNSKAVQVIRMFDNKPQIDFDTWVDWQEEEKLLKVA

215 APPENDICES

FPVNVRSRTYSTDTSAGGFERMNHKNTGWEQGQFEVPCHKWVDISEGLFGVSLMNDCKYGCDVE DNVMRLTLLKAPIYPDRTSDREEHTFTYSIFTHDGNRQTGGLEEAAYDLNYPLILEQERRLKVK KPVLTINVKSLKCQAFKLAEDGSSDIILRLAEVYGSHGKAKVHFNFNIEGVSVCNILEEEQTSL QVNDNTVTMDFSPYQIISLRVKRKL

BASYS01593: Sugar Kinase Ribokinase Family, EC. 2.7.1.4

>nameless_918 bp

ATGGGAAAAGATAATTATGATGTAGTTGTTATTGGGGCAGTAGGTGTGGATACAAATGTCTATT TGCCCGGGCAGGATATAGATTTTGATGTGGAAGCCAATTTTACAACAAATATAGATTATCCGGG GTTGGCAGGTAGCTATACCAGTCGGGGTTTTGCCAGACTGGTTGATAAGGTAGCTGTTATTGCC TATATTGGTGAGGATCATAATGGTTGGTATGTCAAAAATGAACTGGAAGGGGATGGCATTGATT GTACTTTCTTTTTAGACCCCGTGGGGACTAAAAGGAGTATAAACTTTATGTATAAGGATGGGAG AAGGAAAAATTTCTATGATGGCAAGGCGTCAATGGAGGTAAAACCCGATATTGACATTTGCAAA TCTATCCTTAAAAAAACAAATCTGGCCCACTTTAACATTGTAAACTGGACCCGTTATTTATTAC CAGTGGCCAGGGAACTGGGAGTTACCATATCCTGTGATATTCAGGATATAACAGATTTAGATGA TGAATACAGGCAGGATTTTATAAATTATGCAGATGTCTTATTTTTCTCAGCTGTTAACTTTAAA GATCCTGTACCTTTAATTAAAGAATTTATTAAAAGAAAACCTGATAGGATTGTCATCTGTGGTA TGGGTGACAAAGGATGTGCCCTGGGTACAGAACAGGGAATAAAGTTTTATAAAGCCCTGGATAT GTTAGAACCTGTTGTGGACACTAATGGGGCCGGTGATGGATTGGCGGTTGGGTTTTTATCAAGT TATTTTATAGATGGGTTTTCACTTGAGGATTCTATATTAAGAGGTCAGATTGTAGCCAGATATA CATGTACTAAAAAGGCAACTTCGTCAGAATTAATAACAGGAGATAAATTAAATAAATACTTTGA AGAAATGAAAGAGTCCGGATAA

>Translated_305_residues

MGKDNYDVVVIGAVGVDTNVYLPGQDIDFDVEANFTTNIDYPGLAGSYTSRGFARLVDKVAVIA YIGEDHNGWYVKNELEGDGIDCTFFLDPVGTKRSINFMYKDGRRKNFYDGKASMEVKPDIDICK SILKKTNLAHFNIVNWTRYLLPVARELGVTISCDIQDITDLDDEYRQDFINYADVLFFSAVNFK DPVPLIKEFIKRKPDRIVICGMGDKGCALGTEQGIKFYKALDMLEPVVDTNGAGDGLAVGFLSS YFIDGFSLEDSILRGQIVARYTCTKKATSSELITGDKLNKYFEEMKESG

216 APPENDICES

BASYS01594: amyB, -amylase 2, EC. 3.2.1.1

>nameless_1281 bp

GTGACTATTAAGACATCCGACTGGCTAAAATCAGCTATTATTTATGAGGTGTTTCCACGAAACC ATACACAAGAAGGAAATATTCAGGGGATAACCAGGGACCTGGAAAGAATAAGGGAACTGGGGGT TGATATCGTCTGGTTAATGCCTGTATATCCTGTCGGCAGGAAAGGAAGAAAAGGCAAAGAGGGA AGTCCCTATGCTATAAGGGATTACAGGTCTATTGATCCGGCTCTGGGAACATCTGAGGATTTTA AAAAATTGGTAGATAAGGCCCATCGACTCAAATTAAAAGTTATAATTGATGTTGTATTCAATCA TACAGCTATTGATTCGGTACTGGTTAAAAAGCATCCAGAGTGGTTTTATAAAACACCAGAGGGG GAGATATCAAGAAAAATTGATGACTGGTCAGATATCGTTGACCTTGATTTTTCCAGTTCTAAAT TAAAAGAGTATTTAATTGATACCCTCCAGTACTGGGTAGACCTGGGGGTTGATGGGTTCAGGTG TGATGTAGCAGCATTGGTTCCATTAAGCTTCTGGAAAGAAGCCAGAGAACAAATTAAAACCGAC AGGGAAATTATATGGCTAGCTGAAAGTGTTGAAAAATCTTTTGTTAAATTTTTGAGGGAGTCAG GATATGTATGCCATTCAGATCCCGAGCTCCATGAGGTTTTTGACCTGACTTATGATTATGATGG TTTTGAATATCTTAAATCTTACTTTAAAGGGCAGGGACAAATACATGATTATATAAATCACCTT TATATTCAGGAAACTTTGTACCCGGAAAATGCCATCAAGATGCGTTTTCTGGAAAATCATGATA ATGTAAGGATTGCTTCTATTATTAAAGATAAAACCCGTTTAAAAAACTGGACTGCTTTTTATCT TCTTTTACCTGGTGCTTCTTTAATATATTCTGGTCAGGAGTACGGGATCGATAATACACCTGAT TTATTTAACCAGGATCCTATTGACTGGGAAACTGGTGATCCTGATTTTTATTCATATATGAAAA GACTGGTCAGTATTGCGAGAAAAATAAAGTCTAACTGTTACCGGTTCAGTATTAAAGAAATCGT TAAGGGGATTATTGAAATAAAATGGCAGGGTCGTGAGGACTGTTATCTGTCTCTCCTGAATCTT GAAGACAGGTATGGGTATCTTGACTGTAATAGTGTTTATTCCGGGTATGATATGTTAAACGACC GGGTATATGAATCAGGGGAAAAAATGAAAATAGAGAAAAATCCCCTTATTTTAAAGCTAAAATA G

>Translated_426_residues

MTIKTSDWLKSAIIYEVFPRNHTQEGNIQGITRDLERIRELGVDIVWLMPVYPVGRKGRKGKEG SPYAIRDYRSIDPALGTSEDFKKLVDKAHRLKLKVIIDVVFNHTAIDSVLVKKHPEWFYKTPEG EISRKIDDWSDIVDLDFSSSKLKEYLIDTLQYWVDLGVDGFRCDVAALVPLSFWKEAREQIKTD REIIWLAESVEKSFVKFLRESGYVCHSDPELHEVFDLTYDYDGFEYLKSYFKGQGQIHDYINHL YIQETLYPENAIKMRFLENHDNVRIASIIKDKTRLKNWTAFYLLLPGASLIYSGQEYGIDNTPD LFNQDPIDWETGDPDFYSYMKRLVSIARKIKSNCYRFSIKEIVKGIIEIKWQGREDCYLSLLNL EDRYGYLDCNSVYSGYDMLNDRVYESGEKMKIEKNPLILKLK

217 APPENDICES

BASYS01715: 4--gluconotransferase, EC-2.4.1.25/ Amylo--1-6-glucosidase EC- 3.2.1.33

>nameless_1986 bp

ATGGAGGAGTTTTTACCTGAAAGCTGTCGGGAGCAGGCCTGTATGTTCAGAAACGAATGGCTTG TTACCAATGGACTTGGTTCTTTTGCCAGTTCGACTGAATCTGGAGCCAATACCAGACGTTATCA TGGACTATTGATAGCTTCGTTTAATCCACCGGTAGAAAGAAGATTAATGGTAAGCAAAGTCGAT GATACTCTCGTAATAGATAATACCAGGGTTAACCTGGCATCAAATTTAAAAAATAAAGAACTTG AGGACTCCGGTTATAAATTTTTATTATCTTTTAGATATAAACCATATCCTGTCTATGTTTATTC CTATAAAAGCTTTCTTATTAAAAAATCAATATTTATGGTTCCCGGGAAAAATATTACGGTAGTT CATTATCATATTCTTGATGGAGATTTTCCGGCCCTGCTGGAACTAACTCCAGTGTTAAACTCAA GGGACTTTCACGGGGAAACTATATTAAGTCGACCTGAACAGGACCAGAAGACCAGGGGCAAATC TTATAAAATTAATAGTTATGATAAAAATTTTAAAAGTATATCCTTCGATTTAAATAATTACCCC CTTTATATGTTATGGGATAAAGGTTATTTTAAAATTAAAGAATCATTTTATAAAAATATGTATT ATCCCTTTGAAGCTTACCGGGGTCTGGGGGATAGGGAAGATCATTATCTACCTGGAAAATTAAT TGTAGATTTAGAAGAAACCTGTCAATTTACAATTGTATTTTCAACAGAGAGGGTAGAGAAAATT AACTATAAGAAGTGGAAAAACCAGTATATGGATAAACAACAGTCTTTATTACAGAATTACCCCA AAAAAAGAAATACGTTTATAGATCAGTTAATTAAAAGTGCCGATCATTTTATAGCTTACCGGAG GTCAACCGGCAAGAAAACTATACTTGCTGGCTTTCCCTGGTTTACAGACTGGGGCCGGGATACA ATGATTTCCCTTCCAGGATTAACCCTGGCTACCGGTAGATATAAAGAAGCCCGGGAAATACTGA CTACCTTTGCATTAAATATTAAAAGAGGCCTGTTACCCAATGTCTTTGATGATTATTCTGGAGA AGCAGCCCAATACAATACAGTAGATGCTGCCCTATGGTTTTTTTATGCTATTTATAAATATTAT AAATATACCCGGGATAAGGATTTTATAAAGAAAATAATTCCGGGGCTTAAAGAAATAATCGAAT ACCATATTCAAGGCACTGATTATGATATTAAAGTGGATAACGAAGATGGCTTGCTCTGTGCCGG GAATCCATCCACTCAGCTAACCTGGATGGATGTTAAAGTAGAGGACTGGGTTGTAACCCCACGC CATGGCAAAGCTGTGGAAATAAATGCCCTGTGGTATAATGCCTTAAAGATATATAATTATCTAA GTAGTATTCTTAACAGGGAACCTGAATTTATGGATATTATAGATAAAATTGAAGTAAATTATGA GAAAAAATTCTGGTATAACAAAGGCAAATATCTTTATGATGTTATTGATGGTAACAGAAAAGAT AAGAGTCTAAGACCAAATCAAATATTTTCAGTTAGTTTACCTTTTTCACTTTTGCCAGATAATA AAGAAAAGATGGTGGTTAAAAAGGTTATGGAAGAGCTCTATGTACCCCTGGGGTTAAGGAGTTT GAATCCATCTCATTCCGAATATAAAGGTAAGTATTCAGGCAAGCGCCTGGAGCGTGATGGGGCT TACCATCAGGGGACTGTCTGGGGTTGGTTGATAGGACCCTTTTTTGAAGCATACCTTAAGGTAA ACAAATTTTCACCACAGGCTAAAAAAAATGTCCAGGAAATGATGAAACCCTTTTATGATCATAT AAAAGAATATGGACTGGGCAGTATTTCTGAAATTTTTGATGGAAATTACCCTCATATACCCAGG GGTTGTTTTGCCCAGGCCTGGTCTGTTGCCGAGATTTTAAGAATTTATACAGAATATCTAATAT AG

218 APPENDICES

>1715Translated_661_residues

MEEFLPESCREQACMFRNEWLVTNGLGSFASSTESGANTRRYHGLLIASFNPPVERRLMVSKVD DTLVIDNTRVNLASNLKNKELEDSGYKFLLSFRYKPYPVYVYSYKSFLIKKSIFMVPGKNITVV HYHILDGDFPALLELTPVLNSRDFHGETILSRPEQDQKTRGKSYKINSYDKNFKSISFDLNNYP LYMLWDKGYFKIKESFYKNMYYPFEAYRGLGDREDHYLPGKLIVDLEETCQFTIVFSTERVEKI NYKKWKNQYMDKQQSLLQNYPKKRNTFIDQLIKSADHFIAYRRSTGKKTILAGFPWFTDWGRDT MISLPGLTLATGRYKEAREILTTFALNIKRGLLPNVFDDYSGEAAQYNTVDAALWFFYAIYKYY KYTRDKDFIKKIIPGLKEIIEYHIQGTDYDIKVDNEDGLLCAGNPSTQLTWMDVKVEDWVVTPR HGKAVEINALWYNALKIYNYLSSILNREPEFMDIIDKIEVNYEKKFWYNKGKYLYDVIDGNRKD KSLRPNQIFSVSLPFSLLPDNKEKMVVKKVMEELYVPLGLRSLNPSHSEYKGKYSGKRLERDGA YHQGTVWGWLIGPFFEAYLKVNKFSPQAKKNVQEMMKPFYDHIKEYGLGSISEIFDGNYPHIPR GCFAQAWSVAEILRIYTEYLI

BASYS02075: frvX, Endoglucanase, EC. 3.2.1.4

>nameless_ 879 bp

TTGCAAAAGGAAATCAATCCTCAAAAAATTAAAGAATCATTAAAGGAAAATTTGGAGGTCTTAT GTCAGTCCCACGGAGTAGCCGGGTTTGAAAATGATGTAAGAAACGTAGTAAAAGAACAAATCAG GGATTATGTCGATGACATCAAAGTAGATGCGCTGGGTAATTTAATTGCAACCAGAAGAGGAAAC AGTAAATATTCTTTAATGATAGCAGCTCACCTGGATGAAATAGGTCTTTTGATTAAGCGGATTG ATAAAAACGGTTTTCTATGGTTTGAACCGGTCGGTGGAGTTTTTCCTCAGGTGCTTTTTTCAAG GCCTGTTGTTATTAAAACCGACAATGGTTATGTTAAAGGAATAATAAACCACTTAAAACCCGGT CGTCCTGAAATGATTAAAGATTTACCTGAAATTCAGGACTTTTTCATCGATGTAGGAGCTAGAA GTAAAGAAGAAGTCAAAGAAATGGGAATAGAAGTAGGTAATCCTGTCTCCATTCAATATGACTT TATGAGCCTGGGAAAAAATAAAATTGCCGGGAAAGCCCTTGATGACAGATTATTTGTTTTTATG CTAATTGAAGTCTTAAAATTATTAAAAGACGATAAAGATATACCTGATGTTCATGCAGTTTTTA CCACTCAGGAAGAAGTTGGTCTCAGGGGTGCTAAAACTTCAGCATATAGTCTAAATCCTGATGA GTGTCTCGCCCTTGACATATCAATTGCCAATGACATTCCCGGTACACCAGACCGGAAAATAATT ACCGAACTTGATAAAGGACCAGTTATTAAGGTTATGGATCTGGTTCCCAGGGCTATGTTAGGCT TAATTGCATCACAAAAGATAGTCAATAAACTGAAGAAAGTTGCCTGA

>Translated_292_residues

MQKEINPQKIKESLKENLEVLCQSHGVAGFENDVRNVVKEQIRDYVDDIKVDALGNLIATRRGN

219 APPENDICES

SKYSLMIAAHLDEIGLLIKRIDKNGFLWFEPVGGVFPQVLFSRPVVIKTDNGYVKGIINHLKPG RPEMIKDLPEIQDFFIDVGARSKEEVKEMGIEVGNPVSIQYDFMSLGKNKIAGKALDDRLFVFM LIEVLKLLKDDKDIPDVHAVFTTQEEVGLRGAKTSAYSLNPDECLALDISIANDIPGTPDRKII TELDKGPVIKVMDLVPRAMLGLIASQKIVNKLKKVA

BASYS02216: GBA, Glucosylceramidase precursor, EC. 3.2.1.45

>nameless_1422 bp

ATGGGGATGATGATTATGACCCCTTATTCGAATATGGTACTGGCCTTAAAATGGACCTTGAGTA AAACAGAAAGGAGAGTTTTCATGAATTCTATTAGTGTAATTTTAACCGCCAGAGACACCGGAGA TAGATTAAGTTTAAAAGGTGAAAAAGTATTTAAATCGGGAATAGGGAGACAGGATATAGACCTG GAATTATATCCTGATACAAGATATCAGAAAATAATCGGTTTTGGTGGGGCATTTACTGAGGCCG CTGCATATACACTGTCTAAAATAAGTTCTGATAAGAGACTTAAAATTATCGAAAGCTATTTTGA TAGGGATAAAGGTCTCGGGTATAATATGGGGCGTGTTCATATCAATAGCTGCGATTTTGCCCTG GAGAACTATACTTATGTAGAAGATGGAGATAGAGAGTTAAAGACATTTGATATTTCCCGGGAAC GGCAATGGGTGATACCTTTGATCAGGGATGCTATAAAGGCCAGGGGTGGTGAAATAAAATTACT GGCCTCACCCTGGAGCCCACCCGCCTGGATGAAGAGCAATGAAAATATGAATTATGGCGGTAAA TTGCTGCCTGAATATAGAGATGTCTGGGCTAAATATTATACTAAATATATTAAAGCCTTTCAGG AAGAAGGATTAAATATCTGGGGAATTACTGTTCAGAATGAACCTGCAGCAGTTCAGACCTGGGA TTCCTGTACATATACTGCTGAAGAAGAGCGTGATTTTGTTAAAAACCACCTCGGCCCGGTTATG CATGAAGAAGGCCTTGGTGACATTAATATCCTTATCTGGGATCATAATAGAGATATTATTGTTG ACAGAGTAAAACCCATTCTGGATGACCTTGAAGCTGCTAAATATGTATGGGGGACCGCCTTTCA CTGGTATGTGAGTGAAGACTTTGATAATGTGGGCCAGGTACATGAAATGTATCCTGACAAGCAT TTGCTTTTTACTGAAGGTTGTCAGGAGGGTGGCTGTCAAATTGGCGAATGGTTTACGGGTGAGA GATATGGGCGTAATATCATCGGTGATTTAAATAACTGGACTGAAGGGTATCTGGACTGGAACAT GGTATTGAATGAGGAAGGTGGTCCAAACCATGTGGGCAATTACTGTGATGCCCCGGTAATTGTG GATACAAATACAGAAGAGATATATTATAATAGTTCATATTATTATATTGGCCATTTCAGTAAAT ATATCAGGCCTGGTGCTGTCCGGATTGGTGTATCCTGTACTAATGATAATTTAAAGGCAACATC TTTCCTTAATAGTGATGGTAGTATTATACTAATTGTTATGAATGAGACAGATAATCCCACAGAT TTTGCAGTATCTCTTGATAATAAGGTAGCTGACCTTACATTGCCAGCCCATGCTATTGCAACTT ATATCATTACTTAA

>Translated_473_residues

MGMMIMTPYSNMVLALKWTLSKTERRVFMNSISVILTARDTGDRLSLKGEKVFKSGIGRQDIDL

220 APPENDICES

ELYPDTRYQKIIGFGGAFTEAAAYTLSKISSDKRLKIIESYFDRDKGLGYNMGRVHINSCDFAL ENYTYVEDGDRELKTFDISRERQWVIPLIRDAIKARGGEIKLLASPWSPPAWMKSNENMNYGGK LLPEYRDVWAKYYTKYIKAFQEEGLNIWGITVQNEPAAVQTWDSCTYTAEEERDFVKNHLGPVM HEEGLGDINILIWDHNRDIIVDRVKPILDDLEAAKYVWGTAFHWYVSEDFDNVGQVHEMYPDKH LLFTEGCQEGGCQIGEWFTGERYGRNIIGDLNNWTEGYLDWNMVLNEEGGPNHVGNYCDAPVIV DTNTEEIYYNSSYYYIGHFSKYIRPGAVRIGVSCTNDNLKATSFLNSDGSIILIVMNETDNPTD FAVSLDNKVADLTLPAHAIATYIIT

BASYS02338: 6-phosphofructokinase, EC- 2.7.1.11

>nameless_999 bp

ATGAAGAAAATAGGTGTATTAACAAGCGGTGGAGATGCCCCGGGAATGAATGCTGCCATCAGGT CTGTGGTTAGAACTGGTATATATCACGGACTTGATGTTGTTGGGATAAGAAGGGGTTATGCCGG TTTAATCGAAGGTGATTTTTATTTAATGGACAGGGGTTCTGTGGCCGATATTATTCAACGTGGG GGAACTATTCTCCTTTCAGCCCGTTGTGAAGAATTTAAAACAGAAGAAGGCCGGCAAAAGGCCC TTAAAAAAATGAAAGAAGAAGGTATTGAAGGCCTGGTTGTTATTGGTGGTGACGGCTCCTTCAG GGGAGCAGAAAAAGTAAGCAAAGAGCTGGGTATTCCTGCAATTGGAATTCCGGGTACTATAGAT AATGATCTGGCCTGTACTGATTTTACTATTGGATTTGATACTGCCATGAATACTATTATCGATA CTATCAATAAAATCAGGGATACGGCTACTTCCCATGAAAGGACCTTTGTTGTAGAAACAATGGG ACGCAGATCCGGGTTTTTAACCCTTATGGCCGGTCTAGCCGGTGGTGCCGAATCCATTTTAATA CCTGAAATTGACTTTGATCTGGATGATGTTTCCCAGAAATTAAAAAAAGGTTATGAACGTGGTA AACTGCATAGTATTATTCTGGTAGCTGAAGGTGTTGGTGATGACTTCAGTACCAACCGTGATAT AAACCACAGTAAGGCCTTTGTTATTGGCAAGCAGATAGCTGAAAAAACGGGACTCGAAACAAGG GTCATTATCCTTGGTCATATTCAGAGGGGTGGCTCTCCTACAGCTATGGACCGGATTCTGGCAA GTCGAATGGGGTCTAAAGCGGTAGAATTGCTCCTGGCTAGTGAATCTAATAAAATGGTAGGATA TGTTAATAATAAACTTGTAACCTGTGATATTAGCGAAGCCCTCAGTAAGGAAAAACAGGTCGAA AAAGATGTATATAAGCTAGCCAATATTCTTTCTATGTAA

>Translated_332_residues

MKKIGVLTSGGDAPGMNAAIRSVVRTGIYHGLDVVGIRRGYAGLIEGDFYLMDRGSVADIIQRG GTILLSARCEEFKTEEGRQKALKKMKEEGIEGLVVIGGDGSFRGAEKVSKELGIPAIGIPGTID NDLACTDFTIGFDTAMNTIIDTINKIRDTATSHERTFVVETMGRRSGFLTLMAGLAGGAESILI PEIDFDLDDVSQKLKKGYERGKLHSIILVAEGVGDDFSTNRDINHSKAFVIGKQIAEKTGLETR

221 APPENDICES

VIILGHIQRGGSPTAMDRILASRMGSKAVELLLASESNKMVGYVNNKLVTCDISEALSKEKQVE KDVYKLANILSM

BASYS02393: Licheninase (β-glucanase precursor), EC-3.2.1.73

>nameless_1512 bp

ATGTTAAGAAATAAAATTGTATCAGTGTTTTTATTATTAATTATTAGTACAGCTGGCGTTATGG CAACAGATGATAATGTAGGGCAAGAAAAACATCAAGTGCCCTATGATCAATTGACATTAGTCTG GAGTGATGAATTTAATTATACAGGATTCCCCGATCCTGATAAATGGAGTTATGATGTAGGGGGG CATGGCTGGGGCAATGGAGAGAAACAGTACTACACTGAAAAAAGAAAAGAAAATGTCTGGGTAG AAGATGGAAGATTAATTATTATGGCCAGGAAAGAGGATTATAAGGGTAATAAATATACATCAGC TAGATTAGTTACCCGCAATAAAGGAGACTGGTTATATGGTAGAATAGAAGTGAGGGCCAAACTC CCCCGGGGAAGAGGAACCTGGCCTGCTATCTGGATGCTTCCTACTGGCTGGATATATGGTAACT GGCCCAGCAGCGGGGAGATAGATATAATGGAACATGTGGGTTATGATATGGGGGTAGTTCATGG TTCAGTACATACCAGGAGTTATAACCATCGACTGGGGACCCATAAAACTAATACCATCAGGGTT GAAGAGGTTGATAAAAAATTTCATGTTTATAGTATTGAATGGTATCCTGGTAAAATACTTTTCT TTGTTGATAATATCCATTATTTTACCTTTAGGAATGAGGGAACCGGCTGGCGGGAATGGCCCTT TGATCAGCCCTTCCATTTGATATTAAATATTGCAGTAGGTGGTGCCTGGGGTGGTCAAAAAGGA ATTGATGACAGTATCTGGCCCCAGAGTATGGAGGTTGATTATGTAAGGGGTTATGATTTAAATA TTGAAGAAAGGGATAAGATCCCTCCCGGAAAACCTGGTAATTTAAAGGCAGATTCTACTGCCAT TTATGTAGATTTAAGCTGGGACCATTCTGTTGACAATTTTGGGGTAAAAGAATATGAAATTTAT TTAAATGATAGATTAATAGGAAAAACCAGTAATACCAAATTTAAGGTTAAAAAACTGGACCCTG AAACAAAATATAATTTTAAGGTCAGGGCAGTAGATTTTGCAGGAAATGTATCTGAGTATAGTTC TGTTCAGGTTAAAACTACGTCCCTGCCTTTAAAAATAATTCCGGGAAAAGTGGAAGCAGAGGAA TACCTTATAATGGAGGATGGAATTGCCACTGAAACTACAACTGATGCTGGTGGTGGTGTTAATC TTTGTTATATAGGTGATGGTGACTGGATGGAGTATGTTCTGGAGGTACAGGAAACAGGAAATTA TCAAGTGGATTTTAGAGTGGCCGGGGTAATCAAGGGCAAAGTAGAGCTCAGAGATGAAAAGGGG AACACCCTGGCCAGTGTCGAAGTTCCTGCTACTGGTGACTGGCAAAAATGGGTTACTGTTTCCA GTGATAAATTTACCCTGGCAGCTGGAGTATACCATATTAAGGTTTATGCTACAATTGGTGGCTT TAACCTTAACTGGTTTGAGATTAAGAAAGTTGACCAGTAG

>Translated_503_residues

MLRNKIVSVFLLLIISTAGVMATDDNVGQEKHQVPYDQLTLVWSDEFNYTGFPDPDKWSYDVGG HGWGNGEKQYYTEKRKENVWVEDGRLIIMARKEDYKGNKYTSARLVTRNKGDWLYGRIEVRAKL

222 APPENDICES

PRGRGTWPAIWMLPTGWIYGNWPSSGEIDIMEHVGYDMGVVHGSVHTRSYNHRLGTHKTNTIRV EEVDKKFHVYSIEWYPGKILFFVDNIHYFTFRNEGTGWREWPFDQPFHLILNIAVGGAWGGQKG IDDSIWPQSMEVDYVRGYDLNIEERDKIPPGKPGNLKADSTAIYVDLSWDHSVDNFGVKEYEIY LNDRLIGKTSNTKFKVKKLDPETKYNFKVRAVDFAGNVSEYSSVQVKTTSLPLKIIPGKVEAEE YLIMEDGIATETTTDAGGGVNLCYIGDGDWMEYVLEVQETGNYQVDFRVAGVIKGKVELRDEKG NTLASVEVPATGDWQKWVTVSSDKFTLAAGVYHIKVYATIGGFNLNWFEIKKVDQ

BASYS02507: yjeF, Sugar Kinase, EC.2.7.1.?

>nameless_1542 bp

GTGGTTGTACTTACACCCGGGCAGATGAGGAAAGCAGATCAGCTGGCCATCAAAGCAGGAATCC CTGACCTGGTCTTAATGGAGGTAGCCGGGAGGGAAACTGCTGAAAGAGTCCGCCACTTATTACC GGAACCAGGTCATGTAGTAATTTTTGCCGGTAAAGGTAATAATGGTGGTGATGGTCTTGTAGCA GCCCGTTATTTAGATATGTGGGGTTATGATGTAAATCTGGTTTTATTAAGACCCGGAGATGAAT ACCGGGGCAGTTCTGCCGCTAATTTTAAGGCCTGTAAAGCCAGGAGTATTAATATTAGCATATT TTCAGATGTTAAGACCAGGTTGAATAAAGAACTTGAAAGGGCGGATCTTGTAATAGATGCTATG CTGGGGACAGGTCTTACCGGTGAAGTCAGGGGGGAATTTGCCAGGGCTATAGAGATGATAAATG ATCAATCCTGCCCTGTTGTGTCAGTTGATATCCCCTCCGGGGTTGAAGGTGAGACCGGAAGGGT TCTGGGGATGGCAGTAAAGGCTGATTTAACGGTAACCATGGCCTGTCCCAAGGTGGGACTCCTG GTTTACCCGGGAAAGAAATATGTTGGTAACTTACATGTAGTGGATATCGGAATTCCCGGTGAAA TAATTATGAAGGTCAAGCCTGATCATTTTGTCTTAACTGAGAATGAAGCCCGGCTGTTACTTCC TGAAAGAGAAGTTGACGGACATAAAGGTAGTTTTGGAAGGGTTCTCATGGTTGGTGGGTCCCGG GGGATGAGTGGTGCCCCCTCCCTGGGAGGACTGGCTGCACTCAGGGTAGGGGCTGGTCTGGTTG AAGTCGCGGTTCCTGAAAGCATTCAACCTCTGGTAGCTATTACTGCTTCTGAGTTAATTACCTC CGGGTTACCTGAAAAAGATGGAAAACTCAATACCAATTCTGACTCCATTATAACAAAGAAGGCA GAAAGAGCAGATGTCCTCGCCTTAGGCCCGGGACTGGGACGGGGGCGTGGTATTAGGACTGTGG TGGCCAGTGTGATTAAAAAGGCTAAAAAACCTGTGGTTCTTGATGCAGATGCTTTGAATGAAAT TGAGTCTCCTGATATTCTTAAAGAGTCTGAAGTCCCGTTAATACTAACACCACATCCCGGTGAG ATGGCCCGCCTGCTGGATACTGATATAAAAGGGGTCCAGTCTAACAGAATCAAAATTGCCAGGG AGTTTGCCACAACCTATAGTGTTTACCTCATTCTAAAGGGAGCAGATACGGTTATTGCTTTACC TGATGGAACTATATATATAAATGATACTGGTAATGAAGGAATGGCAACGGCCGGTAGTGGTGAT GTCTTAACAGGAATGGTCTCGGGGTTAATAGCCCAGGGGCTGGATATACAGGAGGCTGTTGTTC TGGCACCATACCTCCATGGTCTCTCCGGTGATAAAGCCAGAGATAATGTAACTTCCTATTCCCT TACTGCAGGGACAATAATTGAGTTTATAGCCGGAGCTATAATGGATTTGACAACTTCGGTGATA AATTAG

223 APPENDICES

>Translated_513_residues

MVVLTPGQMRKADQLAIKAGIPDLVLMEVAGRETAERVRHLLPEPGHVVIFAGKGNNGGDGLVA ARYLDMWGYDVNLVLLRPGDEYRGSSAANFKACKARSINISIFSDVKTRLNKELERADLVIDAM LGTGLTGEVRGEFARAIEMINDQSCPVVSVDIPSGVEGETGRVLGMAVKADLTVTMACPKVGLL VYPGKKYVGNLHVVDIGIPGEIIMKVKPDHFVLTENEARLLLPEREVDGHKGSFGRVLMVGGSR GMSGAPSLGGLAALRVGAGLVEVAVPESIQPLVAITASELITSGLPEKDGKLNTNSDSIITKKA ERADVLALGPGLGRGRGIRTVVASVIKKAKKPVVLDADALNEIESPDILKESEVPLILTPHPGE MARLLDTDIKGVQSNRIKIAREFATTYSVYLILKGADTVIALPDGTIYINDTGNEGMATAGSGD VLTGMVSGLIAQGLDIQEAVVLAPYLHGLSGDKARDNVTSYSLTAGTIIEFIAGAIMDLTTSVI N

*Underlined are NcoI restriction sites

224 APPENDICES

APPENDIX II

Table A.1 List of primers used to amplify specific genes from H. orenii genomic DNA

Restriction Gene no. Primer sequence site BASYS F_primer: 5’- TTCCATATGATGCCTGAAGTAATAAC-3’ 0126 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTCATCTTCTGTTGTG-3’ BASYS F_primer: 5’- TTCCATATGATGTACTATTTAAGAAG-3’ 0154 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTTTGAGGCAAATATG-3’ BASYS F_primer: 5’- TTCCATATGATGAAAGATACCTATAC-3’ 0369 R_primer: 5’-TACTCGAGTCAATGATGGTGATGGTGATGCTTATTACTCCGCGGTC-3’

BASYS F_primer: 5’- TTCCATATGGTGAAGGGAGAAGGTG-3’ 0645 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTAATGTTATATTCTTTG-3’ BASYS F_primer: 5’- TTCCATATGATGGCAAAAATAATATTTC-3’

0973 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTGTTAGCTTCAACC-3’ CTCGAG - BASYS F_primer: 5’- TTCCATATGGTGGAGAAAGCTGACC-3’ I 1006 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTCAACTTCCTTTTAAC-3’ Xho BASYS F_primer: 5’- TTCCATATGATGGGAAAAAGATAATTATG-3’ 1593 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTTCCGGACTCTTTC-3’

BASYS F_primer: 5’- TTCCATATGGTGACTATTAAGACATC-3’ CATATG, - 1594 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTTTTTAGCTTTAAAATAAG-3’ I

BASYS F_primer: 5’- TTCCATATGATGGAGGAGTTTTTAC-3’ Nde 1715 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTTATTAGATATTCTG-3’ BASYS F_primer: 5’- TTCCATATGTTGCAAAAGGAAATC-3’ 2075 R_primer: 5’-TACTCGAGTCAATGATGGTGATGGTGATGCTTGGCAACTTTCTTC-3’ BASYS F_primer: 5’- TTCCATATGATGGGGATGATGATTATG-3’ 2216 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTAGTAATGATATAAG-3’ BASYS F_primer: 5’-TTCCATATGATGAAGAAAATAGGTG-3’

225 APPENDICES

2338 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTCATAGAAAGAATATTGGC-3’ BASYS F_primer: 5’- TTCCATATGATGTTAAGAAATAAAATTG-3’ 2393 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTCTGGTCAACTTTC-3’ BASYS F_primer: 5’- TTCCATATGGTGGTTGTACTTACAC-3’ 2507 R_primer: 5’- TACTCGAGTCAATGATGGTGATGGTGATGCTTATTTATCACCGAAGTTG-3’ F_primer: 5’- TAATACGACTCACTATAGG-3’ T primers 7 R_primer: 5’- GCTAGTTATTGCTCAGCG-3’

- Restriction sites NdeI (CATATG) and XhoI (CTCGAG) are underlined, hexa-histidine fusion peptide italicized, bold and underlined nucleotide is the cleavage site. - Primer Tm and dimer detection tool: http://www.basic.northwestern.edu/biotools/oligoCalc.html

226 APPENDICES

APPENDIX III

227 APPENDICES

APPENDIX IV

Media and Solutions

Modified TYEG medium (per liter) (Patel et.al., 1985a)

-1 TYEG medium contained (l dH2O): 10 g Tryptone, 3 g Yeast Extract, 5 g Glucose, 0.9 g NH4Cl, 0.2 g MgCl2, 0.75 g KH2PO4, 9 ml Zeikus trace element solution, 5 ml

Wolin’s vitamin solution, 5 l FeSO4, 1 ml resazurin (0.1 %), 100 g NaCl. The pH of the medium was adjusted to pH 7.0. To remove oxygen from media, it was autoclaved in 1liter Scott bottle with half the volume for 5 minutes with loosely screwed cap. Cap was replaced with a dispensing apparatus and the medium was cooled in sealed bottle under a continuous stream of oxygen free N2 gas. Na2S was added to a final concentration of 0.02% and the medium dispensed into serum bottles (100ml) that had been purged with Oxygen free N2 gas. Bottles were sealed with the rubber septum and sterilized by autoclaving at 121C for 15 minutes.

Modified TYEG medium was stored in the dark at room temperature until use and inoculated by injection through the rubber septum (5-100% inoculum) using sterile needle and syringes. Media containing oxygen traces (indicated by pink colour) were discarded.

Zeikus trace element solution (per liter) (Ziekus et. al., 1979)

-1 Zeikus trace element solution contained (l dH2O): 12.8 g Nitriloacetic acid, 0.17 g

CoCl2.6H2O, 0.1 g CaCl2.2H2O, 0.1 g FeSO4.7H2O, 0.1 g MnCl2.4H2O, 0.1g NaCl, 0.1 g ZnCl2, 0.026 g NiSO4.6H2O, 0.02 g CuCl2.2H2O, 0.0017 g Na2SeO3, 0.01 g H3BO4,

0.01 g Na2MoO4.2H2O. The pH of the solution was adjusted to pH 6.5 was sterilized for 30 min at 121 C and 1-1.5 Kg cm-2 pressure. The solution was stored at 4 C.

228 APPENDICES

Wolin’s vitamin solution (per liter) (Wolin et. al., 1963)

-1 Wolin’s vitamin solution contained (l dH2O): 10mg pyroxidine-HCl, 5 mg thiamine- HCl, 5 mg riboflavin, 5 mg nicotinic acid, 5 mg calcium pantothanate, 5 mg p- aminobenzoic acid, 5 mg thioctic acid, 2 mg biotin, 2 mg folic acid, 0.1 mg cyanoalamin. Wolins vitamin solution was filter sterilized (Millex GS 0.22 µm filter unit, Millipore, Ireland) and reduced by bubbling under N2: CO2 gas mixture and stored at 4 C.

Super Optimal broth Catabolite repression (SOC) medium

SOC medium was used to aid in the recovery of the transformed DH5 competent cells (-select silver efficiency, Bioline). Medium was prepared by adding 2.0 g tryptone, 0.5 g yeast extract, 1 ml of 1M NaCl and 0.25 ml of 1M KCl to 100 ml dH2O (total volume). The pH of the medium was adjusted to pH 7.0 with 1M NaOH and the medium was sterilized for 30 min at 121 C and 1-1.5 Kg cm-2 pressure. 1 ml of filter- 2+ sterilized 2M Mg solution (1M MgCl2.6H2O / 1M MgSO4.6H2O) and 1 ml of 2M glucose was added.

Luria Bertani (LB) Medium

Luria Bertani medium supplemented with ampicillin (100 µg.ml-1) and/ or chloramphenicol (35 µg.ml-1) was used to grow clones in broth. LB medium was purchased from Oxoid (England) and prepared according to manufacturer’s instruction. The pH of the medium was adjusted to pH 7.0 with 1M NaOH and the medium was sterilized for 30 min at 121 C and 1-1.5 Kg cm-2 pressure. Prior to use ampicillin and/ or chloramphenicol was added to a final concentration of 100 µg.ml-1 for ampicillin and/ or 35 µg.ml-1 for chloramphenicol.

229 APPENDICES

MacConkey Agar Medium

MacConkey agar plates supplemented with ampicillin (100 µg.ml-1) and/ or chloramphenicol (35 µg.ml-1) were used to screen and select recombinant clones from plates having white coloured colonies (both insert and vector are present) and red coloured colonies (only vector is present) from transformed cells. MacConkey agar was purchased from Oxoid (England) and prepared according to manufacturer’s instruction. The pH of the medium was adjusted to pH 7.0 with 1M NaOH and the medium was sterilized for 30 min at 121 C and 1-1.5 Kg cm-2 pressure. After cooling the medium to approximately 50 C, ampicillin and/ or chloramphenicol was added to a final concentration of 100 µg.ml-1 for ampicillin and/ or 35 µg.ml-1 for chloramphenicol and medium was poured on the plates. MacConkey agar plates were stored at 4 C for up to one month. Medium was labelled as MacConkey Agar +Amp and/ or + Cam.

Buffers

Cell Lysis Buffer A

Cell lysis buffer A contained (10 ml dH2O): 10mM Tris-HCl, 1.5 M NaCl, 100 l lysozyme (50 mg.ml-1), pH 8.0.

Cell Lysis Buffer B

Cell lysis buffer B contained (10 ml dH2O): 100 mM HEPES pH 7.5, 500mM NaCl, lysozyme (2 mg ml-l), 4 U ml-l DNase I.

MOPS Buffer (20X)

MOPS buffer 20X contained (500 ml dH2O): 50 mM MOPS (3[N-Morpholino] Propane sulphonic acid), 50 mM Tris-base, 0.1 % SDS, 1 mM EDTA, pH 7.7.

230 APPENDICES

Lithium Dodecyl Sulphate (LDS) Sample Buffer (4X)

LDS Sample Buffer 4X contained (10 ml dH2O): [40 % (v/v) glycerol, 4 % (w/v) lithium dodecyl sulphate, 4 % (w/v) Ficoll-400, 0.8 M triethanolamine - Cl, pH 7.6, 0.025 % (w/v) phenol red, 0.025 % (w/v) coomassie G250 and 2 mM EDTA disodium].

Tris Acetic acid EDTA (TAE) Buffer (50X)

-1 TAE buffer contained (l dH2O): 242 g Tris base, 57.1 mL glacial acetic acid and 100mL of 500 mM EDTA, pH 8.0.

Buffers used for protein purification using Ni2+ affinity chromatography

Buffer name Buffer ingredients 1X Fractogel equilibrium 500 mM NaCl, 20 mM Tris-HCl, pH 7.9 buffer 1X Fractogel rinse solution 500 mM NaCl

100 mM NiSO4, 100 mM CoSO4, or 100 mM CuSO4 in 1X Fractogel charge buffer deionized water 1X Fractogel bind buffer 500 mM NaCl, 20 mM Tris-HCl, 5 mM imidazole, pH 7.9 1X Fractogel wash buffer 500 mM NaCl, 20 mM Tris-HCl, 20 mM imidazole, pH 7.9 1X Fractogel elute buffer 500 mM NaCl, 20 mM Tris-HCl, 1 M imidazole, pH 7.9 1X Fractogel strip buffer 500 mM NaCl, 100 mM Tris-HCl, 20 mM Tris-HCl, pH 7.9 Adapted from user protocol TB461 Rev. C 0406 Page 3 of 5 (Novagen).

231 APPENDICES

APPENDIX V

Table A.2 Crystallization conditions that yielded HoBglA crystals (In-house crystallization factorials from Structural Chemistry Lab, Eskitis Centre for Cell and Molecular Therapies, Griffith University)

No. Factorial ingredients Crystals Comments

CSI1 0.02M CaCl2,30%MPD,0.1M NaAc,pH=4.6 Crystals ND

CSI7 1.4M NaAc, 0.1M Na cacodylate, pH=6.5 Crystals ND

CSI18 0.2M MgAc2, 20% PEG 8000, 0.1M Na cacodylate, pH=6.5 Crystals 3.0 Å

CSI22 0.2M NaAc, 30% PEG 4000, 0.1M TRIS/ HCl, pH=8.5 Crystals ND

CSI42 0.05M K phosphate, 20% PEG 8000 Crystals ND

CSII26 0.2M (NH4)2SO4, 30% PEG MME 5000, 0.1M MES, pH=6.5 Crystals ND

F29 10% PEG 4000, 20mM MPOS, pH= 7 Crystals ND

F43 0.1M MgCl2, 30% PEG 4000, 20mM MOPS, pH= 7 Crystals ND

F45 0.1m MgCl2, 10% PEG 4000, 20mM KH2PO4(H3PO4), pH= 8 Crystals ND

F46 0.1M MgCl2, 20% PEG 4000, 20mM KH2PO4(H3PO4), pH= 8 Crystals ND

F67 0.1M MgCl2, 30% PEG 2000, 20mM MOPS, pH= 7 Crystals ND

F69 0.1m MgCl2, 10% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Crystals ND

F70 0.1M MgCl2, 20% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Crystals ND

232 APPENDICES

F118 5mM MgSO4, 10% PEG 4000, 50mM TRIS, pH =8 Crystals ND

F155 30% PEG4000, 20mM TRIS, pH =9 Crystals ND

F158 0.1M MgCl2, 20% PEG 4000, 20mM TRIS, pH= 9 Crystals ND

H97 10% PEG 8000, 0.1M MES pH= 6 Crystals ND

H98 12.5% PEG 8000, 0.1M MES pH= 6 Crystals ND

H99 15% PEG 8000, 0.1M MES pH= 6 Crystals ND

H103 5mM CaCl2, 10% PEG 8000, 0.1M MES pH =6 Crystals ND

H104 5mM CaCl2, 12.5% PEG 8000, 0.1M MES pH =6 Crystals ND

H105 5mM CaCl2, 15% PEG 8000, 0.1M MES pH =6 Crystals ND

H106 5mM CaCl2, 17.5% PEG 8000, 0.1M MES pH =6 Crystals ND

N14 0.2M MgCl2, 30% PEG 8000, 0.10M MES/ NaOH, pH= 6.1 Microcrystals ND

N22 0.2M NaAc, 30% PEG 4000, 0.10M TRIS/HCl, pH= 8.4 Small crystals ND

N27 0.2M Na/K tartrate, 30% PEG 8000, 0.1M MES/ NaOH, pH= 6.5 Crystals ND

N30 0.1M (NH4)2SO4, 20% PEG 4000, 0.1M HEPES/ NaOH, pH= 7.3 Microcrystals ND

N43 0.2M Li2SO4, 30% PEG 4000, 0.1M TRIS/ Hcl, pH= 8.6 Microcrystals ND

N44 0.2M (NH4)2SO4, 30% PEG 6000, 0.1M HEPES/ NaOH, pH= 7.2 Microcrystals ND

N76 0.2M (NH4)H2PO4, 25% PEG 8000, 0.1M TRIS/ Hcl, pH= 6.4 Microcrystals ND

233 APPENDICES

N78 0.2M MgAc2, 20% PEG 8000, 0.1M MES, pH= 6.4 Crystals ND

N96 2M (NH4)2SO4, 2% PEG 400, 0.1M HEPES / NaOH, pH= 7.6 Crystals ND

P8 25% PEG 1500, 0.1M malonic acid, imidazole, boric acid(MIB), pH= 5 Crystals ND

P9 25% PEG 1500, 0.1M malonic acid, imidazole, boric acid(MIB), pH= 6 Crystals ND

P10 25% PEG 1500, 0.1M malonic acid, imidazole, boric acid(MIB), pH=7 Crystals ND

P14 25% PEG 1500, 0.1M Na propionate, Na cacodylate, Bis-TRIS propane(PCB), pH=5.4 Crystals ND

P15 25% PEG 1500, 0.1M Na propionate, Na cacodylate, Bis-TRIS propane(PCB), pH=5.8 Crystals ND

P16 25% PEG 1500, 0.1M Na propionate, Na cacodylate, Bis-TRIS propane(PCB), pH=6.8 Crystals ND

P31 0.2M NaCl, 20% PEG 6000, 0.1M MES, pH= 6 Small crystals ND

P32 0.2M (NH4)Cl, 20% PEG 6000, 0.1M MES, pH= 6 Small crystals ND

P33 0.2M LiCl, 20% PEG 6000, 0.1M MES, pH= 6 Small crystals ND

P34 0.2M MgCl2, 20% PEG 6000, 0.1M MES, pH= 6 Crystals ND

P35 0.2M CaCl2, 20% PEG 6000, 0.1M MES, pH= 6 Needles ND

P37 0.2M NaCl, 20% PEG 6000, 0.1M HEPES, pH= 7 Crystals ND

P38 0.2M (NH4)Cl, 20% PEG 6000, 0.1M HEPES, pH= 7 Crystals ND

P40 0.2M MgCl2, 20% PEG 6000, 0.1M HEPES, pH= 7 Crystals ND

P41 0.2M CaCl2, 20% PEG 6000, 0.1M HEPES, pH= 7 Crystal plates ND

234 APPENDICES

P44 0.2M (NH4)CL, 20% PEG 6000, 0.1M TRIS, pH= 8 Crystals ND

P46 0.2M MGCl2, 20% PEG 6000, 0.1M TRIS, pH= 8 Crystals ND

P50 0.2M NaBr, 20% PEG 3500 Crystals ND

P51 0.2M KI, 20% PEG 3500 Crystals ND

P52 0.2M KSCN, 20% PEG 3500 Crystals ND

P53 0.2M NaNO3, 20% PEG 3500 Crystals ND

P54 0.2M Na formate, 20% PEG 3500 Crystals ND

P55 0.2M NaAc, 20% PEG 3500 Crystals ND

P58 0.2M Na/K phosphate, 20% PEG 3500 Crystals ND

P60 0.2M Na malonate, 20% PEG 3500 Crystals ND

P62 0.2M NaBr, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P63 0.2M KI, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P64 0.2M KSCN, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P66 0.2M Na formate, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P67 0.2M NaAc, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P69 0.2M K/Na tartrate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P73 0.2M NaF, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

235 APPENDICES

P74 0.2M NaBr, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P75 0.2M KI, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P76 0.2M KSCN, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P77 0.2M NaNO3, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P78 0.2M Na formate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals 3.5 Å

P79 0.2M NaAc, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P84 0.2M Na malonate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P85 0.2M NaF, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Needles ND

P86 0.2M NaBr, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Needles ND

P87 0.2M KI, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

P88 0.2M KSCN, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND ND = No Diffraction

236 APPENDICES

APPENDIX VI Table A.3 Crystallization conditions that yielded HoRbk crystals (In-house crystallization factorials from Structural Chemistry Lab, Eskitis Centre for Cell and Molecular Therapies, Griffith University)

No. Conditions Crystal Comments E24 1.4M (NH4)2SO4, 0.1M citrate acid / NaOH, pH= 6.5 Crystals ND E91 0.5M (NH4)2SO4, 0.1M NaAc, pH =4.5 Crystals ND

P5 25% PEG 1500, 0.1M succinic acid, NaH2PO4, glycine(SPG), pH=8 Needle Crystals ND

P6 25% PEG 1500, 0.1M succinic acid, NaH2PO4, glycine(SPG), pH=10 Crystals ND

P12 25% PEG 1500, 0.1M malonic acid, imidazole, boric acid(MIB), pH=10 Needle ND 25% PEG 1500, 0.1M Na propionate, Na cacodylate, ND P18 Needle Bis-TRIS propane(PCB), pH= 9.5 P24 25% PEG 1500, 0.1M malic acid, MES, TRIS (MMT), pH= 9 Needle ND

P38 0.2M (NH4)Cl, 20% PEG 6000, 0.1M HEPES, pH= 7 Microcrystals ND

P39 0.2M LiCl, 20% PEG 6000, 0.1M HEPES, pH= 7 Microcrystals ND

P40 0.2M MgCl2, 20% PEG 6000, 0.1M HEPES, pH= 7 Microcrystals ND

P41 0.2M CaCl2, 20% PEG 6000, 0.1M HEPES, pH= 7 Needles ND

P43 0.2M NaCl, 20% PEG 6000, 0.1M TRIS, pH= 8 Microcrystals ND

P44 0.2M (NH4)CL, 20% PEG 6000, 0.1M TRIS, pH= 8 Microcrystals ND

237 APPENDICES

P45 0.2M LiCl, 20% PEG 6000, 0.1M TRIS, pH= 8 Microcrystals ND

P46 0.2M MGCl2, 20% PEG 6000, 0.1M TRIS, pH= 8 Microcrystals ND

P47 0.2M CaCl2, 20% PEG 6000, 0.1M TRIS, pH= 8 Microcrystals ND

P50 0.2M NaBr, 20% PEG 3500 Microcrystals ND

P51 0.2M KI, 20% PEG 3500 Microcrystals ND

P52 0.2M KSCN, 20% PEG 3500 Microcrystals ND

P54 0.2M Na formate, 20% PEG 3500 Microcrystals ND

P57 0.2M K/Na tartrate, 20% PEG 3500 Crystals ND

P58 0.2M Na/K phosphate, 20% PEG 3500 Crystals ND

P60 0.2M Na malonate, 20% PEG 3500 Crystals ND

P61 0.2M NaF, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P62 0.2M NaBr, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P63 0.2M KI, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P64 0.2M KSCN, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P65 0.2M NaNO3, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P66 0.2M Na formate, 20% PEG 3500, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

P67 0.2M NaAc, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 6.5 Microcrystals ND

238 APPENDICES

P68 0.2M Na2SO4, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P69 0.2M K/Na tartrate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P72 0.2M Na malonate, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 6.5 Crystals ND

P73 0.2M NaF, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Crystals, needles ND

P74 0.2M NaBr, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals, needles ND

P75 0.2M KI, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Microcrystals ND

P76 0.2M KSCN, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals, needles ND

P77 0.2M NaNO3, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Microcrystals ND

P78 0.2M Na formate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Microcrystals ND

P79 0.2M NaAc, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Microcrystals ND

P80 0.2M Na2SO4, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P81 0.2M K/Na tartrate, 20% PEG 3350, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P84 0.2M Na malonate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 7.5 Crystals ND

P85 0.2M NaF, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

P86 0.2M NaBr, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals, needle ND

P87 0.2M KI, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals, needle ND

P88 0.2M KSCN, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

239 APPENDICES

P89 0.2M NaNO3, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

P90 0.2M Na formate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

P91 0.2M NaAc, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

P93 0.2M K/Na tartrate, 20% PEG 3400, 0.1M Bis-TRIS propane, pH= 8.5 Crystals ND

F9 10% PEG 400, 20mM KH2PO4(H3PO4), pH= 8 Crystals ND

F17 0.1M MgCl2, 10% PEG 400, 20mM MOPS, pH= 7 Crystals ND

F18 0.1M MgCl2, 20% PEG 400, 20mM MOPS, pH= 7 Needles ND

F21 0.1m MgCl2, 10% PEG 400, 20mM KH2PO4(H3PO4), pH= 8 Needles, Crystal ND

F22 0.1M MgCl2, 20% PEG 400, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

F23 0.1M MgCl2, 30% PEG 400, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

F58 20% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

F59 30% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

F60 40% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

F65 0.1M MgCl2, 10% PEG 2000, 20mM MOPS, pH= 7 Needles ND

F66 0.1M MgCl2, 20% PEG 2000, 20mM MOPS, pH= 7 Microcrystals ND

F68 0.1M MgCl2, 40% PEG 2000, 20mM MOPS, pH= 7 Microcrystals ND

F69 0.1m MgCl2, 10% PEG 2000, 20mM KH2PO4(H3PO4), pH= 8 Needles ND

240 APPENDICES

F74 5mM MgSO4, 2.50% PEG 400, 50mM NaAc, pH= 5 Needles ND

F75 5mM MgSO4, 5% PEG 400, 50mM NaAc, pH= 5 Microcrystals ND

F80 5mM MgSO4, 2.50% PEG 400, 50mM MES, pH= 6 Crystal rod ND

F82 5mM MgSO4, 7.50% PEG 400, 50mM MES, pH= 6 Microcrystals ND

F83 5mM MgSO4, 10% PEG 400, 50mM MES, pH= 6 Microcrystals ND

F86 5mM MgSO4, 2.50% PEG 400, 50mM HEPES, pH =7 Crystal Needles ND

F87 5mM MgSO4, 5% PEG 400, 50mM HEPES, pH =7 Crystal Needles ND

F88 5mM MgSO4, 7.50% PEG 400, 50mM HEPES, pH =7 Needles ND

F89 5mM MgSO4, 10% PEG 400, 50mM HEPES, pH =7 Crystal Needles ND

F90 5mM MgSO4, 15% PEG 400, 50mM HEPES, pH =7 Needles ND

F103 5mM MgSO4, 2.50% PEG 4000, 50mM MES, pH= 6 Crystals ND

F104 5mM MgSO4, 5% PEG 4000, 50mM MES, pH= 6 Crystals ND

F105 5mM MgSO4, 7.50% PEG 4000, 50mM MES, pH= 6 Microcrystals ND

F106 5mM MgSO4, 10% PEG 4000, 50mM MES, pH= 6 Microcrystals ND

F107 5mM MgSO4, 12.50% PEG 4000, 50mM MES, pH= 6 Microcrystals ND

F110 5mM MgSO4, 5% PEG 4000, 50mM HEPES, pH =7 Crystals ND

F111 5mM MgSO4, 7.50% PEG 4000, 50mM HEPES, pH =7 Microcrystals ND

241 APPENDICES

F112 5mM MgSO4, 10% PEG 4000, 50mM HEPES, pH =7 Microcrystals ND

F113 5mM MgSO4, 12.50% PEG 4000, 50mM HEPES, pH =7 Microcrystals ND

F114 5mM MgSO4, 15% PEG 4000, 50mM HEPES, pH =7 Microcrystals ND

F117 5mM MgSO4, 7.50% PEG 4000, 50mM TRIS, pH =8 Crystals ND

F118 5mM MgSO4, 10% PEG 4000, 50mM TRIS, pH =8 Microcrystals ND

F150 0.1M MgCl2, 20% PEG400, 20mM TRIS, pH =9 Crystals ND

F154 20% PEG4000, 20mM TRIS, pH =9 Crystals ND

F155 30% PEG4000, 20mM TRIS, pH =9 Needles ND

F157 0.1M MgCl2, 10% PEG 4000, 20mM TRIS, pH= 9 Crystals ND

F162 20% PEG1500, 20mM TRIS, pH= 9 Crystals ND

F163 30% PEG1500, 20mM TRIS, pH= 9 Crystals ND

F164 40% PEG1500, 20mM TRIS, pH= 9 Needles ND

F179 5mM MgSO4, 12.5% PEG 4000, 50mM TRIS, pH =9 Crystals ND

F180 5mM MgSO4, 15% PEG 4000, 50mM TRIS, pH =9 Needles ND

F181 2.50% PEGMME550, 0.1M NaAc, pH =5 Needle Crystals ND

F182 5.00% PEGMME550, 0.1M NaAc, pH =5 Crystals ND

F183 7.50% PEGMME550, 0.1M NaAc, pH =5 Crystals ND

242 APPENDICES

F191 12.50% PEGMME550, 0.1M MES pH =6 Microcrystals ND

N19 0.1M (NH4)2SO4, 30% PEG 8000, 0.10M Na citrate, pH= 6.1 Crystals ND

N20 30% MPD, 0.10M MES/ NaOH, pH= 6.2 Crystals ND

N27 0.2M Na/K tartrate, 30% PEG 8000, 0.1M MES/ NaOH, pH= 6.5 Skin, Crystals ND

N30 0.1M (NH4)2SO4, 20% PEG 4000, 0.1M HEPES/ NaOH, pH= 7.3 Microcrystals ND

N34 0.2M (NH4)Ac, 30% EtOH, 0.1M TRIS/HCl, pH= 8.2 Crystals ND

N43 0.2M Li2SO4, 30% PEG 4000, 0.1M TRIS/ Hcl, pH= 8.6 Needle, Crystals ND

N44 0.2M (NH4)2SO4, 30% PEG 6000, 0.1M HEPES/ NaOH, pH= 7.2 Microcrystals ND

N57 0.2M (NH4)2SO4, 30% PEG 8000, 0.1M MES, pH= 6.3 Crystals ND

N61 30% MPD, 0.1M MES, pH= 6.1 Crystals ND

N75 20% iso-propanol, 20% PEG 4000, 0.1M Na citrate, pH= 6.7 Microcrystals ND

N76 0.2M (NH4)H2PO4, 25% PEG 8000, 0.1M TRIS/ Hcl, pH= 6.4 Needle, Crystals ND

N83 8% PEG 8000, 0.1M TRIS/HCl, pH= 8.5 Crystals ND

N85 0.05M K2HPO4, 20% PEG 8000, pH= 9.4 Needle, Crystals ND

N89 0.2m CaAc2, 18% PEG 8000, 0.1M MES, pH= 6.5 Microcrystals ND

NX4 0.2M Kcl, 0.01M MgSO4, 10% PEG 400, 0.05M MES, pH= 5.6 Crystals ND

NX5 0.2M Kcl, 0.01M MgCl2, 5% PEG 8000, 0.05M MES, pH= 5.6 Crystals ND

243 APPENDICES

NX10 0.005M MgSO4, 5% PEG 4000, 0.05M MES, pH= 6 Needles ND

NX23 0.2M Kcl, 0.01M MgCl2, 10% PEG 4000, 0.05M cacodylic acid, pH= 6.5 Needles ND 0.2M (NH4)Ac, 0.01M CaCl2, 10% PEG 4000, 0.05M cacodylic acid, ND NX24 Needles pH= 6.5 NX26 0.2M Kcl, 0.1M MgAc2, 10% PEG 8000, 0.05M cacodylic acid, pH= 6.5 Crystals ND

OA9 7% PEG 6000,0.2M cacodylic acid / NaOH, pH= 6.1 Crystals ND

OA10 7% PEG 6000, 0.2M MES/ NaOH, pH= 6.1 Crystals ND

OA11 7% PEG MME, 5000 0.2M cacodylic acid / NaOH, pH= 6.1 Crystals Bunch ND

OA12 7% PEG MME, 5000 0.2M MES/ NaOH, pH= 6.1 Crystals Bunch ND

OA13 7% PEG 6000,0.2M PIPES/ NaOH, pH= 6.7 Crystals ND

OA15 7% PEG MME, 5000 0.2M PIPES/ NaOH, pH= 6.7 Crystals Bunch ND Needles, ND SM94 60% MPD, 0.1M TRIS/ Hcl, pH= 7 Microcrystals Needles, ND SM95 60% MPD, 0.1M TRIS/ Hcl, pH= 8 Microcrystals Needles , ND SM96 60% MPD, 0.1M TRIS/ Hcl, pH= 9 Microcrystals H1 5% PEG 6000, 0.1M MES pH =6 Crystals ND

H2 5mM CaCl2,5% PEG 6000, 0.1M MES pH =6 Needles , ND

244 APPENDICES

Microcrystals Needles , ND H3 10mM CaCl2,5% PEG 6000, 0.1M MES pH =6 Microcrystals H4 15mM CaCl2,5% PEG 6000, 0.1M MES pH =6 Needle ND

H5 20mM CaCl2,5% PEG 6000, 0.1M MES pH =6 Needle ND

H6 25mM CaCl2,5% PEG 6000, 0.1M MES pH =6 Needle ND

H25 5% PEG 1500, 0.1M MES pH = 6 Crystals ND

H26 5mM CaCl2, 5% PEG 1500, 0.1M MES pH =6 Crystals ND Needles , ND H27 10mM CaCl2, 5% PEG 1500, 0.1M MES pH =6 Microcrystals Crystals, ND H28 15mM CaCl2, 5% PEG 1500, 0.1M MES pH =6 Microcrystals H29 20mM CaCl2, 5% PEG 1500, 0.1M MES pH =6 Crystals ND

H30 25mM CaCl2, 5% PEG 1500, 0.1M MES pH =6 Crystals ND

H32 5mM CaCl2, 10% PEG 1500, 0.1M MES pH =6 Microcrystals ND

H33 10mM CaCl2, 10% PEG 1500, 0.1M MES pH =6 Microcrystals ND

H35 20mM CaCl2, 10% PEG 1500, 0.1M MES pH =6 Microcrystals ND

H36 25mM CaCl2, 10% PEG 1500, 0.1M MES pH =6 Microcrystals ND

245 APPENDICES

H49 5% PEG 400, 0.1M MES, pH= 6 Crystals ND

H50 5mM Cl2, 5% PEG 400, 0.1M MES pH =6 Crystals ND

H51 10mM CaCl2, 5% PEG 400, 0.1M MES pH =6 Crystals ND

H52 15mM CaCl2, 5% PEG 400, 0.1M MES pH =6 Crystals ND

H53 20mM CaCl2, 5% PEG 400, 0.1M MES pH =6 Crystals ND

H54 25mM CaCl2, 5% PEG 400, 0.1M MES pH =6 Crystals ND

H55 10% PEG 400, 0.1M MES pH= 6 Microcrystals ND

H56 5mM CaCl2, 10% PEG 400, 0.1M MES pH =6 Microcrystals ND

H57 10mM CaCl2, 10% PEG 400, 0.1M MES pH =6 Microcrystals ND

H58 15mM CaCl2, 10% PEG 400, 0.1M MES pH =6 Microcrystals ND

H59 20mM CaCl2, 10% PEG 400, 0.1M MES pH =6 Microcrystals ND

H60 25mM CaCl2, 10% PEG 400, 0.1M MES pH =6 Microcrystals ND Needles , ND H64 15mM CaCl2, 15% PEG 400, 0.1M MES pH =6 Microcrystals H65 20mM CaCl2, 15% PEG 400, 0.1M MES pH =6 Needles ND

H66 25mM CaCl2, 15% PEG 400, 0.1M MES pH =6 Needles ND

H79 5mM CaCl2, 10% PEG 4000, 0.1M MES pH =6 Microcrystals ND

246 APPENDICES

H80 5mM CaCl2, 12.5% PEG 4000, 0.1M MES pH =6 Microcrystals ND

H91 5mM CaCl2, 10% PEG 6000, 0.1M MES pH =6 Microcrystals ND

H93 5mM CaCl2, 15% PEG 6000, 0.1M MES pH =6 Microcrystals ND

E102 0.8M (NH4)2SO4, 0.1M NaAc, pH =5 Crystal rod ND

E103 1.1M (NH4)2SO4, 0.1M NaAc, pH =5 Crystal square ND

E164 1.4M (NH4)2SO4, 0.1M HEPES , pH =5.5 Crystal square ND

CSI36 8% PEG 8000, 0.1M TRIS/HCl, pH=8.5 Crystals ND

CSI46 0.2M CaAc2, 18% PEG 8000, 0.1M Na cacodylate, pH=6.5 Crystals ND

CSII4 35% dioxane Crystals ND

CSII10 0.2M NaCl, 30% MPD, 0.1M NaAc, pH=4.6 Crystals, skin ND

CSII15 0.5M (NH4)2SO4, 1.0M LiSO4, 0.1M Na citrate, pH= 5.6 Crystals ND CSII22 12% PEG 20000, 0.1M MES, pH= 6.5 Crystals 3.2Å CSII26 0.2M (NH4)2SO4, 30% PEG MME 5000, 0.1M MES, pH=6.5 Crystals, skin ND

CSII30 5% MPD, 10% PEG 6000, 0.1M HEPES/NaOH, pH= 7.5 Crystals plate ND

CSII38 205 PEG 10000, 0.1M HEPES/ NaOH, pH= 7.5 Crystals ND

CSII43 0.2M (NH4) phosphate, 50% MPD, 0.1M TRIS/HCl, pH=8.5 Crystals ND

ND = No Diffraction

247 APPENDICES

APPENDIX VII

Publication 1:

Expression, purification and preliminary crystallographic analysis of the recombinant β- glucosidase (BglA) from the halothermophile Halothermothrix orenii.

Lokesh D. Kori, Andreas Hofmann and Bharat K. C. Patel.

Acta Cryst. F (2011); F67, 111-113.

248 APPENDICES

APPENDIX VIII

Publication 2:

Expression, purification, crystallization and preliminary X-ray diffraction analysis of thermophilic ribokinase (sugar kinase family) from Halothermothrix orenii.

Lokesh D. Kori, Andreas Hofmann and Bharat K. C. Patel.

Acta Cryst. F (2012); F68, 240-243

249 APPENDICES

APPENDIX IX

Publication 3:

Biochemical and structural properties of β- glucosidase with β- fucosidase and β- galactosidase activities from the halothermophilic anaerobe Halothermothrix orenii.

Lokesh D. Kori, Andreas Hofmann and Bharat K. C. Patel.

(In preparation)

250 APPENDICES

APPENDIX X

Publication 4:

Gene cloning, over-expression and biochemical characterization of a thermophilic β-galactosidase/ -L-arabinopyranosidase from Halothermothrix orenii.

Lokesh D. Kori and Bharat K. C. Patel.

(In preparation)

251