Structural and functional investigations of selected hydrolases at molecular level

Thesis Submitted to AcSIR For the Award of the Degree of DOCTOR OF PHILOSOPHY in BIOLOGICAL SCIENCES

by Ekta Shukla Registration Number: 10BB12A26067

Under the guidance of Dr. Dhanasekaran Shanmugam (Research Supervisor) Dr. Sushama Gaikwad (Research Co-Supervisor)

DIVISION OF BIOCHEMICAL SCIENCES CSIR-NATIONAL CHEMICAL LABORATORY PUNE – 411008, INDIA

August 2017

Abstract of the thesis

Chapter 1: Introduction

The importance of, and recent developments in, protein folding/unfolding research are discussed. The significance of studying the structure-function relationship of proteins through various experimental and theoretical approaches is also presented. Being the largest and most diverse class of enzymes, hydrolases offer an opportunity to explore the conformational/topological diversity which forms the basis of their differential biological activity. An introduction to hydrolases in terms of their diversity, classification, structure and function is given. The available information and literature survey on selected hydrolases are summarized as the basis of the studies undertaken in the thesis.

Chapter 2: Structure-function studies of serine protease from Conidiobolus brefeldianus MTCC 5185 (Cprot)

The serine protease from Conidiobolus brefeldianus MTCC 5185 (Cprot) is a monomeric 28 kDa protein showing optimum proteolytic activity at pH 9.0 and 50 °C. Cprot was considered an interesting candidate for structural and functional studies owing to its commercial applications and unique properties. The results of the chapter are divided into three sections. The first section includes analysis of the structural elements of Cprot using different biophysical techniques. The microenvironment of the three tryptophans in Cprot was studied using steady-state fluorescence and solute quenching studies with neutral (acrylamide) and ionic (I− and Cs+) quenchers. Acrylamide was found to be the most efficient quencher for native Cprot, quenching the fluorescence intensity with a Stern-Volmer constant (Ksv) of 3.9 M−1. The resistance offered by Cprot towards different proteases and its auto-chewing property were studied by biochemical activity assays. In the second section of the chapter, the conformational and functional dynamics of Cprot in the presence of various stress conditions are discussed. The enzyme was found to be active over a wide pH range, except under extremely acidic conditions. The thermal denaturation of the enzyme was irreversible and was observed above 55 °C, after which both the structure and function of the enzyme were lost. Interestingly, the protease was stable in organic solvents up to 50 % (v/v) concentration of alcohols and dimethyl sulfoxide. Alcohols showed an α-helix inducing effect on Cprot, and its stability in the presence of fluorinated alcohols (5-10 %) was also observed. Conformational changes of Cprot during guanidine hydrochloride induced denaturation indicated multi-step unfolding involving several intermediates. The melting profiles observed for native Cprot and for the enzyme treated under various stress conditions correlated well with the corresponding structural and functional transitions. In the last section of the chapter, we intended to construct a 3D model of Cprot, since the three-dimensional structure of an enzyme often provides significant information for understanding its active site, secondary structural elements and overall tertiary conformation. Cprot is a β-sheet rich protein, as indicated by the far-UV CD spectrum and the homology model. The attempt made to identify the sequence of Cprot using C. coronatus proteases revealed its similarity to a trypsin-like protease belonging to the PA clan of proteases, with His-64, Asp-113 and Ser-208 as the putative catalytic triad. The FTIR spectrum of Cprot also resembled that of trypsin.
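The Stern-Volmer constant quoted for acrylamide is conventionally obtained from the relation F0/F = 1 + Ksv[Q], where F0 and F are the fluorescence intensities without and with quencher at concentration [Q]. The following is a minimal sketch of that fit, not the analysis used in the thesis: the function name and the data points are hypothetical, the intensities are synthetic values generated from the reported Ksv of 3.9 M−1, and purely linear (dynamic) quenching is assumed.

```python
def stern_volmer_ksv(quencher_molar, intensities, f0):
    """Estimate Ksv (M^-1) as the least-squares slope of (F0/F - 1)
    against [Q], with the line forced through the origin."""
    ys = [f0 / f - 1.0 for f in intensities]
    numerator = sum(q * y for q, y in zip(quencher_molar, ys))
    denominator = sum(q * q for q in quencher_molar)
    return numerator / denominator


# Synthetic illustration only (assumed values, not measured data):
# intensities are generated from Ksv = 3.9 M^-1 and then re-fitted.
f0 = 100.0
acrylamide_molar = [0.05, 0.10, 0.20, 0.30]
intensities = [f0 / (1.0 + 3.9 * q) for q in acrylamide_molar]

ksv = stern_volmer_ksv(acrylamide_molar, intensities, f0)
print(f"Ksv = {ksv:.2f} M^-1")
```

With real data, curvature in the plot of F0/F against [Q] would indicate that this simple linear model is not sufficient.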

Chapter 3: Structure-function studies of trehalase from Drosophila melanogaster (DmTre) and Chironomus ramosus (CrTre)

Trehalase is a physiologically important glycosidase, known for its crucial role in insect glycometabolism and stress recovery. The present study describes the molecular cloning of a cDNA segment encoding the catalytically active trehalase, and its characterization, from two dipteran insects, Drosophila melanogaster (DmTre) and Chironomus ramosus (CrTre). The results of the chapter are divided into three sections. The first section covers the molecular cloning of the cDNA segments encoding the insect trehalases (DmTre and CrTre). The sequences of DmTre and CrTre have been submitted to GenBank with accession numbers KU049688 and KX857662, respectively. The deduced amino acid sequences were characterized in silico by homology search, multiple sequence alignment and phylogenetic tree construction, revealing their identity to other trehalases belonging to glycoside hydrolase family 37. The instability index and aliphatic index suggested the in vivo stability and thermostability of the protein, respectively.
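The aliphatic index mentioned above is commonly defined as AI = X(Ala) + a·X(Val) + b·(X(Ile) + X(Leu)), where X is the mole percentage of each residue and a = 2.9, b = 3.9. A minimal sketch of that calculation follows. The function name and the toy sequences are hypothetical, and whether the thesis used exactly this definition or a particular tool is an assumption.

```python
def aliphatic_index(sequence, a=2.9, b=3.9):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu.

    a and b are the relative side-chain volume weights of Val and of
    Ile/Leu with respect to Ala in the commonly used definition.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + a * mole_percent("V")
            + b * (mole_percent("I") + mole_percent("L")))


# Toy sequences for illustration only (not DmTre or CrTre).
print(aliphatic_index("AVIL"))   # 25 + 2.9*25 + 3.9*50
print(aliphatic_index("GGGG"))   # no aliphatic residues
```

A higher value reflects a larger relative volume occupied by aliphatic side chains, which is usually read as an indicator of thermostability.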

The second section of the chapter consists of homology modeling and molecular docking studies, which provide insights into the tertiary structure and active site of the enzyme, identifying glutamate and aspartate as the putative catalytic residues. In silico docking of trehalose in the active centre pocket of two trehalases, i.e. E. coli trehalase (prokaryotic) and DmTre (eukaryotic), revealed substantial differences in the binding interactions and affinity of the substrate (trehalose) for the enzyme (trehalase). An intrinsically disordered region was also predicted in DmTre and CrTre, suggesting that a small segment spanning residues 190 to 215 and 245 to 264, respectively, is prone to disorder. Further, the conserved regions and catalytically important residues were found to be formed mainly of loops, which tend to be evolutionarily conserved. In the third section of the chapter, heterologous expression of DmTre in Escherichia coli using two different vectors, pET28a and pCOLD TF, is described, and the two are compared in terms of soluble expression, purification and activity of the recombinant enzyme. The recombinant enzyme was also characterized biophysically using far-UV CD and DSF, which indicated that trehalase is an α-helix rich protein (also evident from the homology model). A novel PPII fold, which is occasionally formed by unordered structures, was observed in the far-UV CD scan of DmTre inclusion bodies. The functional stability of DmTre was then tested against various physicochemical stressors such as pH, temperature, denaturants, detergents, organic solvents and a proteolytic environment. The enzyme was found to be active over a wide pH range and at temperatures up to 60 °C, with the optimum pH and temperature being 6.0 and 55 °C, respectively.

Chapter 4: Summary and conclusion

Understanding the relationships between protein structure and function at the molecular level remains a primary focus in structural biology. To understand the structure-function paradigm, useful structural information comes from the primary amino acid sequences and the associated tertiary structures. This chapter discusses the major highlights of the thesis and summarizes the characteristics of the two hydrolases under study. The in vitro and in silico studies described in this thesis contribute to our knowledge of the interplay between the stability, structure and function of the enzymes at the molecular level, which can serve as a structural toolbox to improve their efficiency in future.

Chapter 4

Summary and conclusion

Ekta Shukla AcSIR Ph.D. Thesis (2017)

4.1 Introduction

Understanding the relationships between protein structure and function at the molecular level remains a primary focus in structural biology, with important consequences in diverse areas such as drug design and therapeutics, and in various industries such as textile, agro-based, food and feed. To understand the structure-function paradigm, useful structural information comes from the primary amino acid sequences and the associated tertiary structures. Several recent developments in the fields of molecular biology, genetics, biochemistry, protein engineering and bioinformatics have accelerated research in the "protein universe" (1).

Protein structure-function relationships can be investigated by asking how nature has engineered protein structures to perform a variety of functions. In short, there are three elementary assumptions (2):
• Different structures come from different arrangements of amino acids.
• If the amino acids change, so does the shape, which affects the function of the protein.
• Physical and chemical parameters of a protein are important in maintaining correct structure and proper function.

Thus, a full understanding of a molecular system comes from careful examination of the sequence-structure-function triad (Fig. 1). Furthermore, proteins display diverse sequence-structure-function similarity relationships. Usually, proteins with high sequence identity and high structural similarity tend to possess functional similarity and evolutionary relationships; however, examples of proteins deviating from this general relationship of sequence/structure/function homology are well recognized (Table 1). For example, high sequence identity but low structural similarity can occur due to conformational plasticity, mutations, solvent effects and ligand binding (3, 4, 5).

Therefore, the major challenges in structural biology are:
• Below 30 % protein sequence identity, detection of a homologous relationship is not guaranteed by sequence alone.
• Structure is much more conserved than sequence.
• The structure-function relationship is even more complex than the relationship between sequence and structure, and is not yet well understood.
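The 30 % threshold above refers to pairwise sequence identity computed over an alignment. As a minimal sketch, the calculation can be written as follows. The function name, the toy alignment and the choice of denominator (columns without gaps) are assumptions made for illustration; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy alignment for illustration only: 4 identical residues in 5 ungapped columns.
identity = percent_identity("ACDE-G", "ACDQFG")
print(f"{identity:.1f} % identity")
if identity < 30.0:
    print("Homology cannot be inferred from sequence alone")
```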

Chapter 4: Summary and conclusion Page 127

Table 1. Different proteins have different sequence-structure-function relationships (see references 5, 6)

          | Case 1                      | Case 2                           | Case 3           | Case 4
Sequence  | Low identity                | Low identity                     | Low identity     | High similarity
Structure | Similar                     | Similar                          | Different        | Same
Function  | Same                        | Different                        | Same             | Different
Example   | (Ntn)2 hydrolase superfamily | Archaeal IMP cyclohydrolase PurO | Serine proteases | Steroid-delta-isomerase and scytalone dehydratase

Figure 1. Sequence-structure-function triad of proteins

In short, determination of function from sequence and structure is complicated by the fact that proteins of similar structure may not have the same function even when evolutionarily related (7). Protein structure-function research therefore pursues several aims. These include:

• Studying those proteins whose structure is not amenable to resolution by crystallographic and NMR methods, but which catalyze key metabolic processes and are important targets in industry and pharmacy.

• Elucidating the detailed molecular structure of proteins by biophysical techniques, such as CD and fluorescence spectroscopy, to supplement the basic understanding of the relationships between molecular shape and biological function.

• Combining computational and experimental studies to investigate the mechanisms of folding and misfolding of proteins and to develop a molecular-level understanding of how conformational transitions can lead to protein misfolding disorders.

• Understanding the changes induced in protein structure through conformational transitions caused by various physicochemical chaotropes and kosmotropes.

• Studying the structural details of protein-ligand interactions and understanding the networks of such interactions that exist naturally, thereby improving their efficiency by protein engineering.

4.2 Present study investigates two unexplored hydrolases

Keeping the above-mentioned objectives in mind, the present studies characterized two hydrolases, a serine hydrolase and a glycoside hydrolase, to achieve a molecular-level understanding of their functional and structural aspects. As a consequence of growing knowledge of the physiological and commercial importance of various hydrolases of microbial and other origins, these enzymes have been studied for their stability and unfolding transitions under denaturing conditions. However, there are very few reports in which different hydrolases have been studied systematically for their structural transitions and the simultaneous effect on their function (8, 9, 10). Understanding the structure-function relationship of these proteins under different stress conditions is important, since such studies may provide insights into the molecular basis of the stability of the protein in different environments.

Although the enzymes under study differ in origin and subclass, as well as in structure, topology and function, they share some common characteristics. For instance, both show moderate functional stability at higher temperatures, under denaturing conditions and over a wide pH range. Moreover, the differences in their sequence and structural elements highlight their different roles/functions in nature. We performed a combination of computational and experimental studies in order to better understand the functional and structural characteristics of the proteins of interest.

4.3 Highlights of the thesis

Serine protease from Conidiobolus brefeldianus MTCC 5185 (Cprot):
• Biophysical characterization suggested Cprot to be a β-sheet rich protein with melting temperature (Tm): 63 °C
• 3 Trp present in partially exposed and positively charged environment
• Structurally stable up to 60 °C and over a wide pH range (5.0-11.0)
• Irreversible thermal and chemical denaturation
• Moderately stable in presence of chaotropic agent (up to 2 M GdnHCl)
• Cprot exhibits functional and structural stability towards organic solvents (50 %), where stability decreases with increase in solvent polarity
• Can be stored at lower temperature in concentrated form
• Resistant to proteolysis (up to 48 h)
• Sequence identified as trypsin-like protease belonging to PA clan of serine proteases
• Homology modeling revealed structural topology and active site
• Catalytic triad identified as His-64, Asp-113 and Ser-208

Trehalase:
• Cloning of 2 insect trehalases (D. melanogaster: DmTre and C. ramosus: CrTre)
• Sequence and phylogenetic analysis: GH family 37; conserved regions mainly formed by loops
• Homology model predicted the essential catalytic residues and their geometry
• Molecular docking offered knowledge about the enzyme-substrate interactions
• DmTre was overexpressed using two different vectors, which were compared to identify the better choice
• pH-based purification approach for insect trehalase is reported for the first time
• Encoded protein was 52 kDa; catalytically active
• Biochemical properties: optimum pH: 6; optimum temperature: 55 °C; Km: 4.5 mM; Vmax: 8.3 U/mL; activation energy: 23.94 kJ/mol
• Moderate thermal stability; denaturant-labile; inhibited by imidazole
• Fairly resistant to detergents and proteolysis
• Biophysical characterization suggested it to be an α-helix rich protein with Tm: 51 °C
• Detection of a novel structural element (PPII fold) in the inclusion bodies of DmTre
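The kinetic constants listed for DmTre can be read through the Michaelis-Menten equation, v = Vmax[S] / (Km + [S]). The sketch below assumes that DmTre follows simple Michaelis-Menten kinetics and uses the reported Km (4.5 mM) and Vmax (8.3 U/mL) as default parameters. The function name and the substrate concentrations are hypothetical and chosen only for illustration.

```python
def michaelis_menten_rate(substrate_mM, vmax=8.3, km_mM=4.5):
    """Initial rate (same units as vmax) at a given substrate concentration.

    Defaults are the Vmax (U/mL) and Km (mM) reported for DmTre; simple
    Michaelis-Menten behaviour is an assumption of this sketch.
    """
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Illustrative trehalose concentrations in mM (assumed values).
for trehalose_mM in (1.0, 4.5, 20.0, 100.0):
    rate = michaelis_menten_rate(trehalose_mM)
    print(f"[S] = {trehalose_mM:6.1f} mM -> v = {rate:.2f} U/mL")
```

At [S] = Km the rate is half of Vmax, which is the defining property of Km in this model.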

4.4 Comparative account of characteristics of Cprot and DmTre

Characterization (by biochemical, biophysical and in silico methods) | Serine hydrolase (EC 3.4.21) | Glycoside hydrolase (EC 3.2.1)

General properties:
Abbreviation | Cprot | DmTre
Enzymatic nature | Trypsin-like serine protease | Hydrolyses trehalose sugar
Enzyme family | PA clan of proteases | Glycoside hydrolase fam. 37
Source | Fungi (Conidiobolus brefeldianus MTCC 5185) | Insect (Drosophila melanogaster)
Significance | Commercial value in silk and leather industries | Physiologically important in insect glycometabolism
Purification | Extracellularly produced by fungi, purified using column chromatography | Low yield from insects, so cloned and expressed in bacterial system (E. coli)

In silico characterization:
Secondary structure dominance | β-sheet rich protein | α-helix rich protein
Topology | Two β-barrels present | (α/α)6 toroidal core present
Active site | Catalytic triad: His-Asp-Ser | Active centre: Glu, Asp, 3 Arg
Catalysis | Occurs at interface of barrels | Inverting glycosidase mechanism
No. of tryptophans | Three | Fourteen

Biochemical and biophysical characterization:
Molecular weight | 28 kDa | 52 kDa
Optimum pH | 9.0 | 6.0
Opt. temperature | 50 °C | 55 °C
Melting temperature | 63 °C | 51 °C
Secondary structure composition | 8.2 % α-helix, 31.1 % β-sheet, 23.8 % turns, 36.9 % unordered | 45 % α-helix, 5 % β-sheet, 21 % turns and 29 % unordered

Activity / Stability in:
pH | 5.0 - 11.0 | 5.0 - 9.0
Temperature | Up to 60 °C (24 h) | Up to 60 °C (5 h)
GdnHCl | Up to 2 M | Up to 1 M
Proteolysis | Resistant up to 24 h | Up to 3 h
Organic solvents | DMSO > prop > eth > meth | meth > eth > prop > DMSO

4.5 Conclusion

These studies provide a deeper understanding of the protein folding, structural integrity and functional stability of these enzymes at the molecular level, which can serve as a structural toolbox to improve their efficiency in future. The in vitro and in silico studies described in this thesis contribute to our knowledge of the stability, activation and functionality of two different hydrolases under stress conditions. Questions regarding the structural details and molecular interactions of these enzymes still remain to be resolved. Future efforts in crystallizing these proteins would give a complete picture of their structure-function relationships.
