Probing Interaction Motifs for Ligand Binding Prediction from Three
Total Page:16
File Type:pdf, Size:1020Kb
PROBING INTERACTION MOTIFS FOR LIGAND BINDING PREDICTION FROM THREE PERSPECTIVES: ASSESSING PROTEIN SIMILARITY, LIGAND SIMILARITY AND COMPONENTS OF PROTEIN-LIGAND INTERACTIONS By Nan Liu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Chemistry – Doctor of Philosophy 2015 ABSTRACT PROBING INTERACTION MOTIFS FOR LIGAND BINDING PREDICTION FROM THREE PERSPECTIVES: ASSESSING PROTEIN SIMILARITY, LIGAND SIMILARITY AND COMPONENTS OF PROTEIN-LIGAND INTERACTIONS By Nan Liu The interactions between small molecules and diverse enzyme, membrane receptor and channel proteins are associated with important biological processes and diseases. This makes the study of binding motifs between proteins and ligands appealing to scientists. We use multiple computational techniques to unveil the protein-ligand interaction motifs from three perspectives. Firstly, from the perspective of proteins, by comparing the structure differences and common features of different binding sites for the same ligand, 3-dimensional motifs that represent the favorable interactions of the same ligands can be extracted. The goal is for such a motif to represent the shared features for binding a certain ligand in unrelated proteins, while discriminating from other ligands. The 3-dimensional motifs for cholesterol and cholate binding to non-homologous protein sites have been extracted, using SimSite3D alignment and analysis of the conserved interactions between these sites. The 3-dimensional protein motif for cholesterol binding can give about 80% accuracy of true positive sites with a low false positive rate. Furthermore, an online server CholMine was established so that the users can use this approach to predict cholesterol and cholate binding sites in proteins of interest. These motifs can help annotation of protein functions, drug discovery and the design of mutations. Secondly, from the perspective of ligands, interaction motifs can be represented as molecular features important for biological activities of ligands. Searching and summary of shared motifs from pretested series of ligand candidates can provide rational guidance to further drug improvement and screening. Here, we report a series of potential sea lamprey olfactory receptor 1 antagonists discovered from databases we designed of molecules that are similar to the native ligand, 3kPZS. Compounds with overall electrostatic and shape similarity to 3kPZS were assessed by using ROCS software, and their initial important feature matches to 3kPZS were analyzed, to prioritize compounds for biological testing. Then, the molecular features important to biological activities were summarized using SALI analysis and functional group matchprint analysis. By combining theses approaches, 12 compounds were discovered that suppress the detection of 3kPZS by at least 45%, and the most active compounds have entered field testing. Thirdly, dissecting the components of protein-ligand binding energies is also important to define the key determinants of ligand interaction with a protein site. Through analyzing the correlation coefficient of interaction energies between a series of alpha-phenylalanine substitutes and PaPAM and biological activities of these compounds, the dominant factor that determine the activities of the compounds was revealed, which was steric effect between the binding site and these compounds. From the analysis, mutations at the residues of the binding site were suggested to change or improve the catalytic efficiency of the enzyme. Given these three approaches, we envision a more integrated approach in the future that combines the analysis of shared protein-ligand interactions, shared interaction features from active ligands and shared features of protein binding sites to identify even more selective and tight-binding ligands. Copyright by NAN LIU 2015 This dissertation is dedicated to my beloved parents, younger brother and my husband. v ACKNOWLEDGEMENTS Here I sincerely acknowledge all the people who offer me their kindness and helps during my graduate study. Without their support and guidance, I can’t finish my graduate study and dissertation. First of all, with most gratitude, I thank to my advisor, Leslie Kuhn. She gives me a lot of support and technical guidance during my graduate study. She cultivates me with good research habits, such as writing weekly reports in scientific formats, always backing up the research data and so on. Furthermore, she encourages and inspires me to solve problems independently and innovatively. She is always approachable, patient and resourceful to me. She makes my stay at Michigan State University delightful, memorable and productive. Secondly, I wound like to appreciate the helps from my committee members. I am grateful to Dr. Robert Cukier, Dr. Shelagh Ferguson-Miller and Dr. Kevin Walker. They give me many valuable feedbacks and suggestions on my research, based on which I can modify and improve my research skills. In addition, I learn a lot from the collaborations with Dr. Shelagh Ferguson-Miller on cholesterol prediction and the CholMine server building. Without Dr. Kevin Walker’s help, the PaPAM project cannot come to a cheerful end. Thirdly, I owe my thanks to all of my collaborators. I need to thank to Dr. Fei Li and Dr. Jian Liu for their suggestions on CholMine software features to support experimental follow-up, and their feedbacks on this manuscript. I am grateful to Dr. Nishanka Dilini Ratnayake on the vi collaboration of the PaPAM project. I also appreciate Dr. Mar Huertas, Anne Scott (Graduate Research Assistant) and Dr. Weiming Li for their contributions to sea lamprey olfactory receptor 1 antagonist discovery project. Without their help and encouragement, I cannot produce the useful and instructive results for these projects. Furthermore, I would like to appreciate my group members. John Johnson, the technique specialist, helped me to solve a lot of problems, such as implementation of CholMine server, various software environment setting problems and so on. Dr. Jeffrey Van Voorst, the developer of SimSite3D, gave me a lot of helps on how to use SimSite3D properly. Dr. Leann Buhrow helped me a lot when I started to work in the lab. Santosh Gunturu, a former undergraduate researcher in our lab, collaborated with me at the beginning of the CholMine and sea lamprey olfactory receptor 1 antagonist discovery projects. I also need to thank to Sebastian Raschka, Joe Bemister and Alex Wolf for their helps and suggestions on my research and presentations. Lastly, I want to thank to my husband. He gave me unlimited love and support in my life and graduate study. Only under his support, understanding, and love, I can finish my graduate study and research. He gives me strength to overcome various problems in my life. I want to thanks to my parents. Without their unconditional love and support, I can’t overcome so many problems in my life. I owe my appreciations to all my families members. vii TABLE OF CONTENTS LIST OF TABLES ......................................................................................................................... xi LIST OF FIGURES ..................................................................................................................... xiii KEY TO ABBREVIATIONS ...................................................................................................... xix Chapter 1 Introduction .................................................................................................................... 1 1.1 Introduction ........................................................................................................................... 2 1.1.1 Computer technologies used in biochemistry ................................................................. 2 1.2 Representations of protein and small molecule structures .................................................... 3 1.2.1 Protein structure representations and applications ......................................................... 4 1.2.2 Molecular structure representations and applications .................................................... 6 1.3 Predicting ligand binding only given protein information .................................................... 9 1.4 Combination of multiple techniques in drug discovery ........................................................ 9 1.5 Objectives of this dissertation ............................................................................................. 11 REFERENCES ......................................................................................................................... 12 Chapter 2 Decoding protein structural motifs for ligand binding prediction ................................ 16 2.1 Introduction ......................................................................................................................... 17 2.1.1 Conserved lipid binding sites in membrane proteins .................................................... 17 2.1.2 Cholesterol, cholate and related sequence-based binding motifs ................................. 18 2.1.3 Determinants of lipid-membrane protein binding ........................................................ 22 2.1.4 Previous prediction of lipid binding sites ..................................................................... 23 2.2 Methods............................................................................................................................... 25 2.2.1 SimSite3D and site maps