Probing Orthologue and Isoform Specific Inhibition of using In Silico Strategies : Perspectives for Improved Drug Design

Amanda Kristine Sharp

Thesis submitted to the faculty of the Virginia Polytechnic Institute and State University in partial f ulfillment of the requirements for the degree of

Master of Science in Life Science In Department of Biochemistry

Anne M. Brown , Chair Kylie D. Allen Justin A. Lemkul

May 5, 2020 Blacksburg, VA

Keywords: kinases, molecular dynamics simulations, molecular docking, structure - function, drug discovery

Copyright 2020, Amanda Kristine Sharp

Probing Orthologue and Isoform Specific Inhibition of Kinases using In Silico Strategies: Perspectives for Improved Drug Design

Ama nda Kristine Sharp

ACADEMIC ABSTRACT

Kinases are involved in a multit ude of signaling pathways, such as cellular growth, proliferation, and apoptosis , and have been discovered to be important in numerous diseases including cancer, Alzheimer’s disease , cardiovascular health, rheumatoid arthritis, and fibrosis. Due to the inv olvement in a wide variety of disease types, kinases have been studied for exploita tion and use as targets for therapeutics . There are many limitations w ith developing target t herapeutics due to the high similarity of kinase composition , making the utilization of new techniques to determine kinase exploitability for therapeutic design with high specificity essential for the advancement of novel drug strategies . In si lico approaches have become increasing ly prevalent for provid ing useful insight into protein structure - function relationships , offering new information to researchers about drug discovery strategies. This work utilizes streamlined computational techniques on an atomistic level to aid in the identification of ortholo gue and isoform exploitability, identifying new features to be utilized for future inhibitor design. By e xploring two separate kinas es and kinase targeting domains, we found that orthologue s and isoforms contain distinct features, likely responsible for th eir biological roles, which can be utilized and exploited for selective drug development. In t his work, we identified new exploitable features between kinase orthologues for treatment in Human Af rican

Trypanosomiasis and structural morphology differences between two kinase isoforms that can potentially be exploited for cancer therapeutic desig n.

Probi ng Orthologue and Isoform Specific Inhibition of Kinases using In Silico Strategies: Perspectives for Improved Drug Design

Amanda Kristine Sharp

GENERAL AUDIENCE ABSTRACT

Numerous diseases such as cancer, Alzheimer’s disease, cardiovascular disease , r heumatoid arthritis, and fibrosis have been attributed to different and survival pathw ays. Many of these pathways are controlled by a class of called kinases .

Kinases are involved in almost every metabolic pathway in human cells and can act as molecular switches to turn on and off disease progression. Due to the involvement of these kinases’ in a wide variety of disease types, kinases have been continually studied for the development of new drugs . Develop ing eff ective drugs for kinases r equire s an extensive understanding of the structural characteristics due to the high structur al si milarity across all kinases . In silic o , or computational, techniques are useful strategies for drug development practices , offer ing new information into protein structure - function relationships, which in turn can be utilized in drug discovery advancements. Utilizing computation al methods to explore structural features can help identify specific protein structur al features, thus providi ng new strategies for protein specific inhibitor design. In t his work , we identified new exploitable features between kinase orthologues for treatment in Human African Trypanosomiasis and structural morphology differences between two kinase isoforms that c an potentially be exploited for cancer therapeutic design.

iv

ACKNOWLEDGEMENTS

First and foremost, I want to show deep grati tude to my advisor, Dr. Anne Brown,

for all the support, counsel, career and research development, and guidance she has

provided me throughout my undergraduate studies and through my master’s program. I

would al so like to thank my committee members, Dr. Ju stin Lemkul and Dr. Kylie Allen ,

for providing me with a support system and offering me essential feedback on my research.

I also offer sincere gratitude to Dr. Glenda Gillaspy for providing opportunities to fur ther

develop my skills and supporting each of my academic ventures . Thank you to Dr. Zachary

Mackey and Dr. Webster Santos for allowing me to collaborate on these projects with them.

Thank you to Dr. David Bevan and Dr. S. Nikki Lewis for acting as second mentors through my graduate studies. I would like to thank the Bevan and Brown Lab for constantly helping my research. Importantly, I would like to thank Brittany Worrell, Carter Gottschalk, Jeamin

Jung, and R. Jason Mosier for the work they have put into aiding these projects. I would also like to e xtend gratitude to the Advanced Research Computing at Virginia Tech for providing access to their computing resources on the Cascades supercomputer. Lastly, I would like to thank my family for their support duri ng my education.

iv

Table of Contents

CHAPTER 1: INTRODUCTION TO KINASES AS DRUG TARGETS AND CURRENT RESEARCH IN DRUG DESIGN ...... 1 1.1 Kinase Functions ...... 1 1.2 Kinases in Disease and as Drug Targets ...... 4 1.3 Protein and Autophosphorylating Kinases ...... 7 1 .4 Small Molecule Phosphorylating Kinases in Phosphorylation ...... 9

CHAPTER 2: EXAMINING SPECIFICITY FOR ERK8 OF TRYPANOSOMA BRUCEI THROUGH IN SILICO TECHNIQUES ...... 13 2.1 Abstract ...... 14 2.2 Introduction ...... 15 2.3 Material and Methods ...... 18 2.3.1 Homology Modeling of TbERK8 and HsERK8 ...... 18 2.3.2 Physicochemical Characterization of the ATP - ...... 19 2.3.3 Molecular docking ...... 20 2.3.4 Pharmaceutical ADME Qualities and Identification of Similar FDA Approved Therapeutics ...... 22 2.4 Result s and Discussion ...... 23 2.4.1 Sequence alignment and homology modeling of Hs and TbERK8 orthologues ...... 24 2.4.2 Physicochemical Characterizat ion of ERK8 Orthologues ...... 26 2.4.3 Molecular Docking and Interaction Analysis ...... 30 2.4.4 Inhibitor Characteristics and Additional Compoun d s to Probe Binding Site Specificity ...... 35 2.5 Conclusion ...... 41 2.6 Main Figures ...... 43 2.7 Supplemental Information ...... 49

CHAPTER 3: ASSESSING THE INFLUENCES OF ATP AND ON THE PROTEIN DYNAMICS OF ISOFORMS USING MOLECULAR DYNAMICS SIMULATIONS ...... 92 3.1 Abstract ...... 93 3.2 Introduction ...... 94 3.3 Materials and Methods ...... 96 3.3.1 Mo lecular Dynamics Simulations of SphKs ...... 96 3.3.2 MD Analysis ...... 98 3.4 Results and Discussion ...... 99

v

3.4.1 Influence of ATP and Sphingosine on SphK Isoform Structural Morphology ...... 100 3.4.2 SphK Isoform Interaction Analysis with ATP and Sph ingosine Molecules .. 110 3.5 Conclusion ...... 115 3.6 Acknowledgements ...... 116 3.7 Main Figures and Tables ...... 117 3.8 Supplemental Information ...... 122

CHAPTER 4: DISCUSSION ...... 134

REFERENCES ...... 136

vi

Attributions

All section chapters in this thesis w ere written by Amanda Sharp . The primary research advisor in these projects is Dr. Anne M. Brown and provided content insi ght and corrections for improvement in all chapters . The detailed author attributions for this document are listed below by chapter. I nitials are used to specify which author contributed each part. Author s for this document are as follows:

Amanda K. Sharp (AKS) Anne M. Brown (AM B) David R. Bevan (DRB) Zachary B. Mackey (ZBM) Brittney L. Worrell (BLW)

Chapter 1

Attribution: AKS wrote this section with input on corrections from AMB .

Chapter 2

Attribution: AKS wrote this section with the support of AMB , DR B, and Z B M providing content input and corrections to the manuscript. AKS performed the analysis in this chapter , with support of BLW .

Chapter 3

Attribution: AKS wrote this chapter using input from AMB and DRB to make corrections and content suggesti ons. AKS performed the MD simulations and analysis. All authors collectively decided on project plans and direction.

Chapter 4

Attribution: AKS wrote this chapt er entirely with the input of AMB for edits and corrections for improvement.

vii

CHAPTER 1: I NTRODUCTION TO KINASES AS DRUG TARGETS AND CURRENT RESEARCH IN DRUG DESIGN

1.1 Kinase Functions

Phosphorylated biomolecules play a n important role in cellular signaling and are involved in nearly all pathway s in eukaryotic organisms. 1, 2 Phosphorylation is defined a s a high ener gy exchange involving the transfer of a phosphate group . This Figure 1. Phosphorylation is reversable. Basic covalent transfer of a phosphate group is schematic of protein phosphorylation via kinases. typically facilitated by two class es : kinases and phosphatases . Kinases

phosphorylate while phosphatases dephosphorylate biomolecules, working synergistically together to sustain si gnaling homeostasis ( Figure 1 ) . 1, 2 Regarding protein structure and function , phosphorylation is one of the most common post - translational modifications

(PTM) and is involved in a wide variety of signaling cascades, which in turn can alter a protein’s enzymatic activity by influencing folding and binding – giving this mechanism a high level of control over biological processes. 3 - 5

The event of p hosphorylation , as catalytically facilitated by kinases in many cases, is implicated in many different proliferative and apo ptotic pathways. Cellular p roliferation is generally dictated by transcriptional regulation of DNA replication . 6 Diverse signaling cascades are involved i n proliferation pathways , including G1 cyclins controlled by mitogenic signals , 6 G1 cyclin - dependent kinase inhibitors , 6 and even

1

. Restriction of Cell Div. tra nsitioning from G1 phase to S phase is the basis of apopt ot ic pathways, leadi ng the cell to an imminent death. 7 Apoptosis plays an important role in regular cell function as it controls cellular growth and development, as well as keep ing a homeostatic level of cellular growth. 8 Proliferation and apoptosis are mechanistically synonymous in many cases , with inhibition of one essential piece of a prolif erative pathway leading to apoptosis. Collectively, tight control of cellular growth and development is dependent on the function of the components within those signaling pathways – many of which are kinase enz ymes .

Kinases are a class of eukar yotic enzymes involved in a diverse number of cellular functions. Humans have over 500 kinases 1 essential in cellular functions such as metabolic regulation, cell proliferation , and autophagy , and kinases are broken into nine broad groups with over 134 different families. 1 Many kinases are also involved in disease progression due to their involvement in proliferation and apoptotic pathways. 9 Kinases phosphorylate other proteins, small molecules, and even themselves to regulate many cellular processes causing downstream regulatory effects. 10 E very kinase performs effectively the same chemistry – cleaving the γ - phosphate from ATP , 11 or another phosphate group of other high - energy molecules such as GTP , 12 with a h ydroxyl group of another molecular structure , 12, 13 such as a of , threonine, or tyrosine residues on protein structures.

Kinases act as molecular switches 14 and can adopt specific structural morphologies to regulate their active and inactive states. 15

2

Given similar chemical

properties and structural similarities,

kinases contain several sp ecific

characteristics that are con s erved

( Figure 2 ) . 16 Catalytically active

k inases are typically classified by their

ability to phosphorylate other

molecules and contain three largely

conserved m otifs : VAIK, HRD, and

DFG. 1, 17 The VAIK motif contains a

lysine tha t is required to interact with Figure 2. Kinases contain highly conserved domains required for phosphorylation to occu r. the γ - phosphate group on ATP and facilitate charge stability during phosphate cleavage . 11 In some cases, this lysine is replaced with an arginine, which can perform the same function. 11 The HRD motif contains a catalytic aspartate , while the DFG motif contains the essential cation - binding aspartate . 17

While there will be some cases of variations in these moti fs, the majority are highly conserved in these domains and the two aspartates along with either a lysine or an arginine are essential for performing the chemistry for phosphate cleavage . 16 K inases also embody specific regions, containing two distinct lo b es, N - terminal and C - terminal, being linked by a hinge and linker region, a glycine - rich loop (GxGxxG motif) , 18 and an alpha - C helix. 19

The hinge and linker region sit between the two lo b es and offers hydrogen bonds to the adenine moiety of ATP , enhancing small molecule bi nding . 19 The glycine - rich loop has been determined to be essential in kinase activity, providing a region rich in hydrogen

3

bonds to interact with ATP. 18 Lastly, the alpha - C helix is important in overall structural

stabilization and responsible for activation of the protein, with a lysine located on the N -

terminal lobe forming a salt bridge with a glutamate residue located on the a lpha - C helix. 19

T hese features are essential to allow for the phosphorylation activity and to be classified as

a k inase , 1, 17 how ever, distinct kinases contain other structural characteristics control the

functionality and specificity .

There are many different types of kinases ranging from autophosphorylating protein - kinases to small molecule phosphorylating kinases . In this work, w e discuss the

importance of an autophosphorylating kinase, the Extracellular Signal Regulating Kinase

8 (ERK8), and lipid phosphorylating kinase s , Sphingosine Kinases (SKs). Both t hese

kinase enzymes have been studied in the lens of enhanc ing drug discover y and have been

implicated in a variety of diseases.

1. 2 Kinases in Disease and as Drug T argets

Figure 3. Kinases play a role in a variety of different d isease progression pathways. Schematic of a portion of diseases being investigated currently in kinase drug discovery.

4

Since kinases play an important role in prolif eration and apoptotic pathways,

malfunction of these proteins can also lea d to disease ( Figure 3 ) . 10, 20 The malfunction of these proteins, produced by genetic alterations, 9 creates dysregulation and abnormalities in kinase signaling pathways, leading to evasion of apoptotic pat hways, generating various types of cancers. 21 According to the American Cancer Society, it is expected that over

600,000 Americans will die of some form of cancer in 2020. 22

Due to the roles that kinases have in cell survival , proliferation, and programmed

cell death, 10 these enzymes have been extensively explored for drug discovery. 20 The first

kinase inhibitor approved by the FDA was im atinib for treating Philadelphia

positive chronic myeloid leukemia in 2001, 23, 24 sparking the drive for additional kinase

inhibitor drug design. 25 Some diseases currently being treated with FDA - approved kinase

inhibitors include leukemia (imatinib, 26 nilotinib, 27 and das atinib 28 ), non - small cell lung

cancer ( gefitini b , 29 crizotinib, 30 and er l otinib 31 ), hepatocellular and advanced rental cell

carcinoma (sor a fenib , 32 pazopanib, 33 and su n i tinib 34 ), and metastatic breast cancer

(lapatinib 35 ). 36 M any anti - cancer chemotherapeutic techniques on the market target

kinases 36 and are still being heavily explored fo r new therapeutic approaches.

Kinase inhibition is not only used for cancer therapy, but other diseases such as

rheumatoid arthritis (tofacitinib , 37 an d baricitinib, 3 8 and upadacitinib 39 ), fibrosis

(ruxolitinib, 40 fedratinib, 41 and nintedanib 42 ), glaucoma (netarsudil 43 ), and ulcerative colitis

(tofacitinib 44 ). 25 The explorati on of using kinases in drug discovery outside of cancer is

continuously growing, including for neuropathy and heart disease. 36 While there are not

currently any approve d kinase inhibitors for bacterial and parasitic diseases, they are

gaining traction for drug development research. 45 - 48

5

There are seven different types of kinase inhib itors based on binding patterns. Type

I kinase inhibitors bind in or near the adenine binding domain , at the glycine loop and the

hinge region, and interact with the active structure. Type I inhibitors are considered ATP -

competitive as they are frequently designed to imitate ATP interactions. 25, 49 Type II

in hibitors interact with the inactive structure at the DFG region, relying on the abundance

of heterogeneity of inactive kinase configurations. 50 From type I and type II inhibitors

emerged the Type I½ class, which binds similar to type I and type II, interacting with the

adenine binding domain in the inactive structure. 51, 52 Type III and IV are both non -

competitive to ATP , 25, 49, 51 where t ype III inhibitors bind close to but outside of the adenine

binding domain while type IV bind entirely outside of the active site. 25 T ype III and type

IV inhibitors are considered to have the highest affinity for their tar get out of the types of

kinase inhibitors since each are bind ing to a region that is more specific to the dist inctive

kinase structure. 49 Type V are another class of covalent inhibitors that bind in two distinct regions with the enzyme, combining features of inhibitor types described previously by linking them covalently , otherwise known as “bisubstrate” or “bivalent” inhibitors. 25, 53

Type VI, also known as covalent inhibitor s, bind outside of the two lobes and form an

irr eversible covalent bond, permanently blocking ATP binding. 25, 49 When it comes to

pharmaceutical development, the most common inhibitor type are in the type I category

accounting for over 40% of FDA approved Kinase drugs. 36

In this work, we discuss targeting the autophosphorylating kinase, ERK8, and the

group of lipid phosphorylating kinases, SphK. Targeting autophosphorylating kinases has

a vast a mount of limitations due to the high similarity in the ATP binding regions of

kinases. Development of ATP - competitive inhibitors with high specificity is essential in

6

the drug discovery for these types of kinases. 54 While a difficult task, it is not uncommon

to develop a therapeutic that is ATP - competitive, as previously discussed. Current ly, most

kinase therapeutics on the market are A TP - competitive or type I inhibitors , many of which target the class of t yrosine k inases, which are also autophosphorylating kinases . 54 This

indicates promise in that the development of an ATP - competitive or type I inhibitor could

be developed with enough specificity for therapeutic usages.

I n contrast, development of drugs that are allost eric inhibitors, such as those that

are being explored for SphK for cancer research, are advantageous due to their binding

region being outside of the ATP - binding domain. These regions tend to have more

disti nctive characteristics, allowing for higher spec ificity for drug development. While

seemingly less difficult, currently there are only two FDA approved type III k inase

inhibitors.

This work focuses on utilizing computational strategies to study k inase

dru ggability. Molecular docking can provide useful understanding in protein -

interactions to guide drug development into inhibitor design for isoform or orthologue

specificity. Molecular dynamics (MD) simulations utilizes protein dynamics to understand biological questions, such as better understandi ng ligand binding on structural morphology.

Together, this work provides deeper in sight into orthologue and isoform specific binding

patterns using computational strategies, offering useful information into n ovel therapeutic

design by utilizing computation al approaches .

1. 3 Protein Phosphorylation and Autophosphorylating Kinases

Post - translational modifications (PTMs) are important in protein modulation and

cellular functions such as localization, activation, and enzymatic activity. 55 Protein

7

phosphorylation is the most common PTM and allows for enzymatic control and signaling

modulation . 56, 57 Protein phosphorylation is completely reversible and acts as a molecula r

switch, activating or deactivating enzymatic activity of a protein , 56 which is controlled by

protein k inase s and phosphatases. 56 Many k inases also require phosphorylation for protein

activation, which in many cases can be controlled through autophosphorylation.

Autophosphorylating kinases (AKs) have the ability to phosphorylate and control

their own enzymatic activity . AKs co ntain an activation loop that undergoes

conformational changes once phosphorylated, auto - activating the structure by causing

conformational changes, leading to subsequent downstream effects. 58 The phosphorylation

of this activation loop can happen mechanistically in a multitude of ways – either through

cis - or trans - autophosphorylation. 58 Cis - AKs , of self AKs, phosp horylate their own

activation loop , being both the enzyme and the substrate. Trans - autophosphorylation occurs between two or more copies of a kinase protein , where one acts as an activ ating protein (enzyme) and phosphorylates the subsequent inactive protei n (substrate). 58 The regulation of these mechanisms is s till poorly understood and require s further exploration

at the structural and biochemical level in order to understand more in the compl exity of

pathways .

An example of an AK is the Extracellular Signal Regulated Kinase 8 ( ERK8 ) , a mitogen activated Kinas e and a cis - autophosphorylator . In humans, ERK8 is involved in the mitogen activated Kinases (MAPK) signaling system, a massive system o f pathways involved in signal transduction that has been highly characterized. This MAPK system involves a multitude of parallel signaling cascades and include a variety of kinases, totaling at least 60 different kinases, 59 such as RAS, RAF, MEK, and ERK proteins. E RK8 is

8

frequently referred to as M APK15 and m ost recent ly identified MAPK protein . 60, 61 ERK8 is involved in a variety of cellular functions, ranging from cell proliferation, autophagy, and genomic stabilization, and exists in three is oforms due to splice variants. 59 Control of

ERK8 a ctivation is a result of aut o phosphorylation and dephosphorylation by protein tyrosine phosphatase 1B (PTP1B) , making it an atypic a l MAPK protein as it is the only

MAPK protein to effectively autoactivate . 62 ERK8 has been shown to play a critical role in autophagy and bind s to proliferating cell nuclear antigen (PCNA), 63, 64 thus having a role in the transmission of genetic information and stability.

Due to the role ERK8 has in autophagy and genetic stabilization, it has been explored in a wide variety of therapeutic design. While ERK8 has been studied in humans for diseases such as cancer, it is also being explored as a drug target for treating parasitic diseases such as Human African Trypanosomiasis (HAT) caused by T rypanosoma brucei .

While ERK8 seems to be a promising targe t for drug discovery, on e difficulty with ATP

competitive drugs is the high similarity in the ATP binding domain . This work aims to

investigate the binding differences in the human ERK8 and T . brucei ERK8 to identified

residue differences to be exploited f or HA T drug discovery purposes.

1. 4 Small Molecule Phosphorylating Kinases in Lipid Phosphorylation

Many k inases play an important role in regulation of signaling molecules. Outside

of protein phos phorylation, k inases can also phosphorylate and l ipids. The

most recognized phosphorylating kinases are those involved in glycolysis,

such as 65 and . 66 are highly complex and important

in many functions throughout biological systems . Phospholipids can be the produc t of

Kinase phosphorylation , such as phosphatidylinositol (4,5) - bisphosphate and sphing o sine -

9

1 - phosphate (S1P) , two lipids important in cell signaling. is invol ved in many pathways, including neurological signaling and can act as secondary signaling molecules by binding to GPCRs. 67

S1P is a bioactive phospho lipid involved in the synthesis and salvage pathway s a nd act s as signaling switch to direct cells to either survival an d proliferation or programmed cell death. 68 S1P has be en connected to various downstream signaling networks , including PI3K - AKT pathway involved signal t ransductio n and ERK pathways involved in cell proliferation and acts “inside - out” by autocrine or paracrine signaling several by interacting with different m embrane receptors . 68 Elevated levels of S1P ha ve been associated to numerous types of diseases, ranging from cardiovascula r disease to multiple cancer types , and researchers are investigating their sole producer, the

Sphingosine Kinases ( SphKs ), as the target s to reduce cellular levels of S1P. 68

The class of SphKs consist of two members, SphK1 and SphK2 , and differ by subcellular localization . 68 SphK1 has been shown to be localized in the while

SphK 2 is localized to the mitochondria, endoplasmic reticulum, and nucleus. 68 Studies have show n each is o form may perform oppo site roles in cell fate , even though both perform the same phosphotransfer reaction producing S1P from sphingosine . 69 Sph K1 has been shown to promote cell proliferation and migration while SphK2 appears to hold anti - proliferation patterns by inhibiting DNA replication . R esearchers are developing isoform - selective inhibitors due to the differ ing cellular functions for various diseases and the importance of each in drug development is still a topic being explored . 69, 70

SphK1 is present in nearly all cell typ es and has shown to drive cell growth and proliferation , and has an indirect role in activating the PI3K - AKT pathway . 68 Du e to its

10

ro le in cell proliferation pathways, SphK1 has been well - studied and the target for pharmaceutical development. 70 C irculatory fluids naturally contain high concentrations of

S1P and m ost production occurs in erythrocytes . 71, 72 Red blood cells have a high expression rate of SphK1 and overexpres sion in erythrocytes are involved in various blood diseases, such as sickle cell anemia 73 and myeloid leukemia. SphK1 selec tive inhibitors have been aggressively explored for drug discovery , but thus far, none have been FDA - approved.

While SphK2 also phosphorylates sphingosine into S1P, it elicits different res ponses tha n SphK1 due to subcellular localization . In the nucleus, S1P inhibits histone deacetylase 1 and 2 (HDAC 1/2) , enhancing expression of p21 and c - fos , 70, 74 increasing cell cycle arrest and expression of the proto - oncogene, FOS . In the endoplasmic reticulum, S1P is recycled back into ceramide through the salvage pathway , promoting apoptosis . In the mitochondria, S1P interacts with BAK, signaling a release of cytochrome c across the mitochondrial membrane , also facilitating apoptosis. SphK2 is the target of the drug , also commonly referred to as FTY720, a pro drug metabolite that gets transformed by SphK2 into phospho - FTY720. 70 Phospho - FTY720 interacts with the group of G - protein coupled rec eptors, the S1P receptors (S1PRs), acting as an agonist to downregulate S1PR activity. 75 Downstream, the reduction of S1PR activity decreases the cellular levels of T helper protein 17, downregulating inflammatory and autoimmune conditions such as multiple scleros is. 76

Both SphK1 and SphK2 are being he avily explore d for therapeutic development and isoform - specific inhibitors are e ssential for disease treatment efficacy. SphK1 have shown to be the most promising of the two for cancer treatment development, while SphK2

11

still needs further exploration in i ts involvement in autoimmune and inflammatory

conditions. This work aims to buil d a foundation of the biological understanding of SphK

isoform protein dynamics using computational procedures which can in turn help the advancement of isoform selective inhib itors .

12

CHAPTER 2: EXAMINING SPECIFICITY FOR ERK8 OF TRYPANOSOMA BRUCEI THROUG H IN SILICO TECHNIQUES

Amanda K. Sharp, 1 Anne. M. Brown, 1,2,3 Brittney L. Worrell, 1 David R. Bevan, 1,3 and Zachary B. Mackey 1,3

Department of Biochemistry, 1 University Libra ries, 2 and Virginia Tech Center for Drug Discovery, 3 Virginia Tech

Attribution: AKS wrote this section with the support of AMB, DBR and Z B M providing content input and corrections to the manuscript. AKS performed the analysis in this chapter , with sup port of BLW .

13

2.1 Abstract

Human African Trypanosomiasis (HAT), commonl y referred to as sleeping

sickness, is a neglected, vector - borne disease caused by two parasite subspecies:

Trypanosomiasis brucei gamiense and T. brucei rhodesiense . The sub - species, T. brucei rhodesiense , causes the acute form of HAT and can lead to death within a matter of months if left untreated. Current therapeutics cause severe side effects, even death, which demonstrates need for both insight into potential drug target mechanism s and novel therapeut ic design. Extracellular - signal regulated kinase 8 (ERK8) is a member of the mitogen - activated Kinase (MAPK) family and has been discovered as an essential kinase for survival and proliferation of T. brucei and is only expressed in sel ect human tissue cell s.

Exploiting potential binding cavity differences in the T. brucei and human orthologue s of

ERK8 is a route to new therapeutic design. In silico techniques were utilized to probe t he

ATP binding pocket of T. brucei ERK8 (TbERK8) agai nst human ERK8 (HsERK 8) and identify ATP - competitive inhibitor features as based on key differences in amino acid residue s and chemical feature s between orthologue s . Molecular docking of experimentally determined, orthologue - specific inhibitors revealed di fferent modes of bind ing and a shift in position of residues , most notable at Leu19, Met95, and Ser141, within the active site between kinases. Moreover, t he TbERK8 ATP binding cavity was found to be a smaller and more hydrophobic pocket as compared to HsE RK8. Collectively, th ese results demonstrate d that TbERK8 contains exploitable residue and binding pocket features that can be targeted for selectivity over HsERK8, leading to the potential for design of a new class of inhibitors for HAT . This information provides a buildable scaffold in targeting

14

orthologue - specific kinases to advance potential therapeutic drug design for the infectious

organism, T. brucei .

2.2 I ntroduction

Human African Trypanosomiasis (HAT), also referred to as sleeping sickness, is a

ne glected tropical disease endemic to Sub - Saharan Africa, with approximately 65 million people susceptible to infection . 77 HAT is a caused by either Trypanosoma brucei : T. b. gambiense, or T. b. rhodensiense , which is transmitted by the bite of infected tsetse flies from the genus Glossina . 78 Currently, four drugs are approved for tr ea ting HAT that are specific for disease stage : pentamidine and suramin (first - stage) , and melarsoprol and eflornithine (second - stage). 79, 80 T hese therapeutics are difficu lt to administer, have poorly defined molecular targets or mechanisms of action, limited efficacy between sub - species, and contribute to severe adverse side effects such as neurological disorders, cardiac failure, coma, and even death. 77, 79, 81 Consequently, a critical need exists for developing n ovel

HAT therapeutics that can inhibit select , experimentally validated molecular targets that

are essential for T. brucei survival.

K inases are attractive targets for dr ug design due to their role in regulating

numerous essential parasitic processes that are often linked to cell cycle control 82 and their

high potential for druggability. 20 A mostly uncharact erized Kinase , extracellular - signal

regulated kinase 8 (ERK8), has been explored in humans as a potential therapeutic

candid ate for diseases such as cancers or rheumatoid arthritis. 83 ERK8 is the most recently

identified member of the MAP K family 60, 61 and ha s presented itself as a potential target for therapeutic design given its role in response to extracellular signaling processes such

as mitogenic signals, pro - inflammatory cytokines, and cellular stress in eukaryotes. 84 No

15

upstream activating kinase has been identified for ERK8 fa mily members proteins, but

rather ERK8 contains an ATP - binding catalytic domain that is activated by

autophosphorylation at two resi dues in the conserved TXY motif within the activation loop

or ERK8. 83, 85 - 87 While targeting the kinase domain can be difficult, recent studies have explored the potential of the ATP - binding site for pharmacologi cal inhibitors, indicating

that there are strateg ies for drugging the domain with high target specificity. 54, 88 - 90 ERK8

is a potential drug target given its implication to association with several diseases , 83, 91 and

has been determined to be useful for utilization in HAT therapeutic design given its

essential role in T. brucei survival.

A prev ious RNAi study investigated the T. brucei kinome and found two kinases

essential in cellular proliferation and parasite survival: ERK8 and cdc2 - related - kinase 12

(CRK12). 92 ERK8 is a more attractive drug target for HAT as CRK12 is functionally

analogous to its human orthologue and does not appear to have cell - cycle defects upon

down - regulation. 93, 94 T. brucei ERK8 (TbERK8), a member of the MAPK

(MAPK6), 95 is an orthologue to the human ERK8 (HsERK8) 60 , and has minimal functional

characterization compared to oth er ERKs. 61, 92 Given the successful utilization of kinases

in drug discovery and the necessity of ERK8 in trypanosome survival, TbERK8 is a

potential , exploitable target for the design of orthologue - specific (e.g. T. brucei - specific)

inhibitors for HAT.

Exploring the inhibitor efficacy and selectivity differences between the two

orthologue s is a key piece to identifying routes for orthologue specifi c drug design for

HAT . 96 TbERK8 was screened against a kinase inhibitor library, and a compound

containing methyl - pyrazol and fluorophenyl moieties, AZ960, was identi fied to selectively

16

inhibit TbERK8 over HsERK8. 60 A HsERK8 inhibitor, Ro318220, a compound containing pyrrole - dione and methyl - sulfany l moieties, has also been identified and shows 100 - 300x

60, 86 potency for inhibiting HsERK8 (IC 50 = 5 - 10 nM) over TbERK8 (IC 50 = 1.4 µM). This experimentally observed, selective inhibition of HsERK8 or TbERK8 based on unique inhibitors indicates that there are structural and/or biochemical determinant s that influence how each of these inhibitors can interact with and inhibit the ATP - binding pockets of

TbERK8 and HsERK8.

In silico techniques are an essential tool for drug discovery as they allow for theoretical analysis into protein - ligand binding, whi c h can support and enrich in vivo and in vitro experiments and accelerate the drug design process. 97, 98 Detailed, atomistic characterization of the ATP binding pockets of these ERK8s can predict specific ch e mical characteristics of inhibitors to further exploit in orthologue - selective inhibitor design. To date, neither HsERK8 nor TbERK8 has an experimentally resolved structure, indicating the need for in silico approaches to provide a more detailed understan d ing of experimentally observed inhibitor selectivity. Previously, homology modeling of the kinase domain of HsERK8 was performed and found useful for predictive drug discovery strategies . 86 Here , we applied homology modeling, molecular docking, and free energy calculations of ligand - protein complexes to compare HsERK8 a nd TbERK8 inhibitor interactions and design. We also explore the structural differences between models of

HsERK8 and TbERK8 and show variances caused by residue deletions in TbERK8 in binding pocket size and chemical characteristics. Our work further demo n strates that there are distinct, structural differences between the two orthologue s that allows for exploitation of the TbERK8 catalytic site for orthologue - specific drug design .

17

2.3 Material and Methods

2.3.1 Homology Modeling of TbERK8 and HsERK8

Seque nces of TbERK8 (Tb927.10.5140) 95 and HsERK8 ( Q8TD08 ) 61 were retrieved from GeneDB 99 a nd Ge nBank 100 , respectively. For the generation of homology models, sequences were clipped to contain only the kinase domain, residues 6 - 341 for

TbERK8 and residues 12 - 345 for HsERK8 as determined by Valenciano et al., 60 and

Strambi et al., 86 . Molecular Operating Environment (MOE) 101 was used to generate models using the crystal structure of MAPK from Cryptosporidium parvum (PDB ID: 3OZ6)( DOI:

10.2210/pdb3OZ6/pdb ) as a template. MAPK from C. parvum shares 51% and 46% identity to TbERK8 and HsERK8 ( Table S1 ), respectively, and is a member of the same protein family as ERK8. Models of TbERK8 and HsERK8 were energy minimized using

MOE 101 and the AmberEHT force field 102, 103 and then validated for model quality using a variety of metrics to assess side chain positioning (Verify3D), 104 local environment favorability (ANOLEA) 105 , similarity to known experimen tally solved structures (ProSA),

106 a nd backbone and torsional energies ( SWISS - MODEL 107 - 110 , Rampage 111 ) ( Figure S1 -

S3 ) . Additionally, the x - ray diffraction - resolved structure of MAPK Fus3 from

Saccharomyces cerevisiae (PDB ID: 2B9F) 112 was energy minimized and validated the same way as the homology models, as Fus3 from S. cerevisiae (denoted as Fus3 in this work) shares 38% and 39% percent identity ( Figure S4, Table S1 ) with TbERK8 and

HsERK8, respectively, and w a s co - crystallized with ADP. The homology models of

TbERK8, HsERK8, and the energy minimized structure of Fus3 showed favorable validation metrics based on above analysis types. Model validation quality metrics

18

indicated that models generated were acceptab l e for utilization in subsequent docking and

binding cavity analysis experiments.

2.3.2 Physicochemical Characterization of the ATP - Binding Site

To analyze the physicochemical features of the ATP - binding site ( Figure 1 ), the

TbERK8 and HsERK8 energy minim i zed homology models were evaluated through

various servers to determine their interaction abilities, pocket volume, and hydrophobicity.

MetaPocket 2.0 113 was used to predict the locations of binding cavities of proteins and

provide coordinates of t he space - filling region of these cavities. These coordinates of the

binding site were then imported into Chimera 114 for volume calculation. The binding cavity

spheres were evaluated, and regions that were outside of the ATP - binding cavity were eliminated ( Figure 2A - B, S7 ). The overall surface of the binding pocket was evaluated for hydrophobicity, hydrogen bonding abilities, and charged regions using Schr ö dinger -

Maestro 115 binding site surface feature ( Figure 2C - D ). Pharmacophore hypothesis testing in Schr ö dinger - Maestro (v. 2018 - 1) software 115 was used to predict the receptor features required for small molecule binding , predicting features of hydrogen bonding, charge - charge interactio ns, hydrophobicity, and aromaticity . Each model was analyzed using the receptor - based method, u tilizing the parameter coordinates of (38.00, 6.00, 37.00) and

(34.00, 6.00, 40.00) for HsERK8 and (30.00, 3.00, 42.00), (38.00, 6.00, 31.00), and (35.00,

6.00, 42.00) for TbERK8 and investigated the different types of essential interaction features within the binding pocket and ideal characteristics of an orthologue selective small molecule ( Figure 3 ). Protein electrostatic surfaces were created using PyMOL. 116

Electrostatic surfaces of inhibitors were created using UCSF Chimera 114 by displaying the protein surface using the Coulombic Surface Coloring feature within the software ( Figure

19

S8 ). The surfaces of each ligand used in the molecular docking procedures were also evaluated for thei r electrostatic characteristics ( Table S6 ) usi ng Chimera 114 . Rotatable bond analysis of sm all molecules ( Table S6 ) was analyzed using AutoDock Tools (v.

1.5.6). 117

2.3.3 Molecular docking

AutoDock Tools (v. 1.5.6) 117 was used to prepare receptor (kinase) a nd ligand files for docking experiments. AutoDock Vina 118 was used to dock all ligands and inhibitors used in this work , including ADP and ATP, into the energy minimized Fus3 structure and

ERK8 homology models. All protein structures were overlaid prior to docking to utilize the same grid box size and cent er position given the similarity in the ADP/ATP binding cavity among the three homologues. The grid box was centered on the ADP/ATP binding site based on location and proximity of key, interact ing residues with ADP from the Fus3 crystal structure. To negat e any differences in size between the ADP/ATP binding cavities of th e three proteins, a larger box was utilized to ensure the entire cavity volume in all three proteins was searched in the docking process . The grid box was centered on the coordinates

(35.3 , 2.6, 29.8) and was sized 22 Å x 22 Å x 22 Å with 1.000 Å grid spac ing relative to the ADP/ATP active site. To evaluate the accuracy of the redocking, the root - square - mean - deviation (RMSD) was calculated using an in - house script. R edocking of ADP into Fu s3 resulted in a lowest energy ADP pose with a n RMSD of 0.710 Å rela tive to the ADP position in the crystal structure, further validating the box size and position for all future docking experiments ( Figure S 9 ) . Structures of AZ960 (5 - fluoro - 2 - [[(1S) - 1 - (4 - fluorophenyl)ethyl]amino] - 6 - [(5 - methyl - 1H - pyrazol - 3 - yl)amino]pyridin e - 3 - carbonitrile) and Ro318220 ( 3 - [3 - [2,5 - Dihydro - 4 - (1 - methyl - 1 H - indol - 3 - yl) - 2,5 - dioxo - 1 H - pyrrol - 3 -

20

yl] - 1 H - indol - 1 - yl]propyl ) were downloaded from PubChem 119 and also docked into the

TbERK8 and HsERK8. Each docking procedu re resulted in nine poses. The lowest energy

poses that docked in a similar orientation as ATP w ere utilized for further analysis such as

distance measurements and residue interaction fingerprints. Free energy (measured in

kcal/mol from Schr ö dinger - Maestro ) 115 of the protein, ligand, and protein - ligand complex structures were calculated using the Molecular Mechanics/Gene ra lized Born Surface Area

(MM - GBSA) method . For protein - ligand complex free energy calculations, all poses from docking results were utilized. Ligand efficiencies were calculated using all AutoDock

Vina 118 output energies (kcal/mol) and dividing by the number of heavy atoms in the molecule. Di stance measurements determined through fingerprinting were divided into three different interaction types depending on the heavy atom type involved and distance:

electrostatic interactions were 3.0 – 5.0 Å and involved polar atoms with partial charges,

hydrophobic interactions were 3.0 – 5.0 Å and involved non - polar carbon atoms, and

hydrogen bond interaction s were 2.8 – 4.0 Å betwe en hydrogen bond donor and acceptor

atoms. All poses were visually assessed in PyMOL 116 and UCSF Chimera 114 to determine

the hit rate in position and occupancy of the binding cavity ( Fi gure S10 ) .

To further analyze the docking results, Schr ö dinger - Maestro (v. 2018 - 1) 115 was

utilized for residue inter actions between the protein and docked ligands and inhibitors. The

protein structur es were preprocessed using the protein preparation wizard . 115 Using the

Discovery Informatics and QSAR feature, a fin gerprint of the interactions between the

protein and ligand was produced, examining electrostatic, hydrophobic, and backbone

interactions. 115

21

To evaluate any conserved residues in the three homologues essential for ligand

binding in the kinase domain, a sequence alignment was p erformed using Schr ö dinger -

Maestro (v. 2018 - 1) 115 using the multiple sequence alignment feature ( Figure S1 ). These co nserved residues were compared to previous mutagenesis experiments through a litera ture search, and the interaction fingerprints were then overlaid to see the three - dimensional location of the residues ( Figure S6 ). Once the key residues were defined, the s equences were renumbered for referencing purposes for all results and discussion ( T able

S 2 ).

2.3.4 Pharmaceutical ADME Qualities and Identification of Similar FDA Approved Therapeutics

Compounds with known potential as therapeutics, that contained drug - l ike characteristics pertaining to absorption, distribution, metabolism, and excretion (ADME) characteristics, and with chemical characteristics similar (60%+ similarity) to the known

ERK8 inhibitors were determined using the QikProp feature in Schrodinger - Maestro’s suite. 115 QikPr o p identified six addi tional ligands to screen : famotidine ( N' -

(Aminosulfonyl) - 3 - ([2 - (diami nomethyleneamino) - 4 -

thiazolyl]methylthio)propanamidine ), fludrocortisone ( (8S,9R,10S,11S,13S,14S,17R) - 9 -

fluoro - 11,17 - dihydroxy - 17 - (2 - hydroxyacetyl) - 10,13 - dimethyl - 1,2,6,7,8,11,12,14,15,16 -

decahydrocyclopenta[a] 22 henanthrene - 3 - one ),fluprednisolone

( (6S,8S,9S ,10R,11S,13S,14S,17R) - 6 - fluoro - 11,17 - dihydroxy - 17 - (2 - hydroxyacetyl) -

10,13 - dimethyl - 7,8,9,11,12,14,15,16 - octahydro - 6H - cyclopenta[a] 22 henanthrene - 3 - one ),

idarubicin ( (7S,9S) - 9 - acetyl - 7 - [(2R, 4S,5S,6S) - 4 - amino - 5 - hydroxy - 6 - methyloxan - 2 -

yl]oxy - 6,9,11 - trihydroxy - 8 ,10 - dihydro - 7H - tetracene - 5,12 - dione ), prednisolone

( (8S,9S,10R,11S,13S,14S,17R) - 11,17 - dihydroxy - 17 - (2 - hydroxyacetyl) - 10,13 - dimethyl -

22

7,8,9,11,12,14,15,16 - octahydro - 6H - cyclopenta[a] 23 henanth rene - 3 - one ), and sildenafil ( 5 -

[2 - ethoxy - 5 - (4 - methylpiperazin - 1 - yl)su lfonylphenyl] - 1 - methyl - 3 - propyl - 6~ { H } - pyrazolo[4,3 - d]pyrimidin - 7 - one ). Each drug that was identified using this method was docked and evaluated in the same manner as AZ960 and Ro318220 as described in the molecular docking section above.

2.4 Results and D iscussion

In silico techniques that are coupled with in vivo and in vitro experimental data can provide a deeper understanding into physicochemical characteristics in protein binding cavities, allowing for new insights into drug discovery tailorable for orthologue or isoform specificity. Designing new therapeutics is a constant need for diseases, and parasitic diseases such as HAT require a thorough investigation into orthologue , e.g. Trypanosoma , specific characteristics. TbERK8 has been established as a promising therapeutic target for

HAT, 92 and orthologue specific inhibition has been demonstrated through experimental research, 60 though the rationale behind the mechanism of inhibitory differences between the Tb and Hs orthologue s remains unknown. Exploring the difference in specificity of the two ERK8 orthologue - specific chemical inhibitors, Ro318220 and AZ960, and determining the protein - inhibitor interactions can aid in binding pocket characterization of both structures for future, selective pharmacological design for T. brucei . Here, we explore the inhibitory differenc e between TbERK8 and HsERK8 using known inhibitor data 60 and in silico approaches to discern a protein - structure based rationale to ort hologue specificity and routes for exploitation in future inhibitor design. This work contributes to our understanding on the role of essential prolif eration pathways for T. brucei, which in turn

23

provides rationale for the creation of next - gen compounds th at will specifically target T. brucei for HAT treatment.

2.4.1 Sequence alignment and homology modeling of Hs and TbERK8 orthologue s

TbERK8 has been experimentally determined to be a promisi ng target for HAT therapeutics, but, to date, there is no resolve d structure and minimal insight into mode of action and inhibition. 9 2 A human orthologue , HsERK8, has been identified and must be investigated in tandem to determine exploitable differences in the orthologue s. HsERK8 is involved in cell growth and proliferation pathways, 61 making it necessary to explo re structural differences between the two structures to conclude if TbERK8 is a suitable target for HAT drug design. Homology models of TbERK8 and HsERK8 were generated to structurally characterize the ATP - binding pocket of the kinase domain of TbERK8 and

HsERK8. A homology model of HsERK8 was previously created and analyzed for investigating ATP - competitive inhibitors for drug discovery by Strambi, et al., 86 showing that these computational techniques to analyze structural features is a viable approach.

Structural characteristics and differe nces between the two ortholog ue s must be identified to determine druggability of TbERK8 for HAT.

Molecular Operating Environment (MOE) 101 was used to determine prepare homology models of the ERK8 orthologues. The similarity of TbERK8 and HsERK8 to the template structure was 69% and 64%, respectively ( Table S1 ). To validate these str uctures, multiple web - based t ools were utilized, comparing to the original template structure as reference . 104, 106 - 108, 111 Each structure showed favorable psi/phi backbone angles, negative energy scores, and scored similar to x - ray and NMR r esolved structures

( Figure S1 - S3 ). 104 - 111 Based on all validation techniques collectively and since the two

24

ERK8 orthologue s demonstrated similarities in structural validation as compared to the

reference structure , the models were considered acceptable for the purpose of our investigation.

Sequence alignments were perf ormed to determine conserved residues within the kinase domain to distinguish variations and confirm similarities between the two structures using S chrödinger - Maestro 115 Multiple Sequence Viewer ( Figure S4 - S5 ), allowing for

perceptive into exploitable differences. The MAPK template from C. parvum (PDB ID:

3OZ6) (DOI: 10.2210/pdb3OZ6/pdb) , was included in this analysis, which was crystallized as an apo structure with no ligands, inhibitors, or cofactors bound , as was Fus3 (PDB ID:

2B9F) , 120 a n additional MAPK co - crystallized with ADP . The Fus3 structure also has similarity in the ADP/ATP binding pocket and the kinase domain ( Figure S4 ) with

TbERK8 and HsERK8, which aided in validation of our docking protocol. Fus3 also was previously utilized in a human - specific ERK8 study. 86

Previous research has provided insight into the kinase domain of both T bERK8 and

HsERK8, indicating residues 6 - 341 and 12 - 345, respectively, as constituting the kinase domains. 60, 86 TbERK8 and HsERK8 were determined to h ave 72% similarity and 51% identity within the kinase domain ( Figure S5, Table S1 ), while the alignment revealed some unique features, specifically wit h T. brucei having a deletion of residues Asp69 and

Lys105 relative to HsERK8 , which both predominately e ngage in charge - charge and hydrogen bonding interactions. The absence of these residues potentially produces a modification in the electrostatic poten tial, hydrophobicity, and ligand - binding abilities within the ATP binding site considering both residues have electrostatic binding abilities

( Figure S5 ). To further evaluate the effects of this change determined through the sequence

25

alignment results, str uctural analysis of the binding cavity of each orthologue was

necessary.

Overlays were used to structura lly orient the ATP - binding region relative to Fus3,

as it was crystallized with ADP, to identify structural variations for further investigation.

A shi ft in residues is displayed spatially within the binding region, most notably at residue

Asp155, which is positioned near the Mg 2+ ion ( Figure 1, Figure S6 ). The shift of Asp155

is likely caused by the two residue deletions for TbERK8, as observed in the s equence

alignment ( Figure S5 ). Asp155 has been experimentally determined to be essential in the

phosphorylation of HsERK8, leading to catalytic arrest when mutated. 83 The shift in

residue position as a result of the two deletions in TbERK8 compared to HsERK8 can also

observed in residues TbLeu19, TbMet95, and TbSer141 ( Figure S6 ). W hile these residues are present in both orthologue s, they are oriented in spatially different positions, leaving the question of how each residue influences lig and structural localization and how this may influence the physicochemical features of the bindi ng pocket. Importantly, both kinases are still active and bind ATP , likely due to the conservation in the phosphate region , but displays variations in the adeni ne binding region as a result of the residue deletions observed. Previous literature has shown t hat TbERK8 and HsERK8 have different inhibitory features; 60 this shift in residues Leu19, Met95, Ser141, and Asp155, has the potential to cause such distinctions in inhibitor binding abilities while still having both utilize ATP for autophosphorylation .

2.4.2 Physicochemical Characterization of ERK8 Orthologue s

Experimental research has previously determined important physicochemical d ifferences between TbERK8 and HsERK8 that need further classification. 60 Ro318220, a

26

potent HsERK8 inhibitor, 83 and A Z960, a TbERK8 selective inhibitor, 60 demonstrate

inhibitory differences that support binding pocket variations between the two ERK8

orthologue s. To characterize the effects of the residue deletions on the physicochemical features within the active site, each binding site was analyzed through pocket volume,

pharmacophore mapping, binding site surface mapping, and electrostatic potential of the

ATP - binding site.

The predicted binding pocket of each structure was explored using MetaPocket

2.0, 113 and volume was calculated using UCSF Chi mera 114 ( Figure 2A - B, S7 ) to provide insight into the types of small molecules that can bind based on size and structure. The volume s of TbERK8 and HsERK8 were calculated to be 518.8 Å 3 and 1186 Å 3 , respectively ( Figure 2A - B, S7 ). TbERK8 has shown more consistent inhibitory abilities with smaller, f lexible molecules, such as AZ960 (292.5 Å 3 ), 60 which is likely due to the smaller binding cavity . Ro318220 is larger than AZ960, at a volume of 404.0 Å 3 versus

292.5 Å 3 , and has been experimentally determined to be over 100 - times more potent at inhibiting HsERK8 over TbERK8. 60 TbERK8 has a thinner, L - shaped ( Figure 2A ) binding cavity while the pocket of the active site of HsERK8 is overall more spherical ( Figure 2B ).

AutoDock Tools was used for analysis of the rotatab le bonds of each inhibitor . 117

Rotameric analysis of Ro318220 revealed six rotatable bonds ( Tab le S6 ) mostly in the sulfanyl tail, with rotameric hindrances in the pyrrole - dione moiety, potentially preventing flexibility that appears to be necessary for binding to the TbERK8 binding cav ity given the smaller size and distinct shape ( Figure 2A - B ). AZ 960 displays the same rotameric flexibility as Ro318220 (six rotatable bonds; Table S6 ) but contains regions that are much less bulky ( Table S6 ), thereby confirming the need for smaller, flexi ble inhibitors in the

27

TbERK8 ATP binding cavity. Based on these observations, smaller, more flexible

molecules, like AZ960, are expected to bind more favorably in the active site of TbERK8

given the distinct shape and size of the binding cavity, which is f urther confirmed using

free energy calculations (discussed below ). Larger, less flexible molecules, like Ro318220,

are predicted to bind with higher affinity in the binding cavity of HsERK8 since it has the

volume and shape to support it. These observation s thus support that small er , flexible molecules are required for TbERK8 over HsERK8 specificity.

To determine distinct binding features in each orthologue , mapping of the active

site surfaces was performed using the Schrö dinger - Maestro 115 binding site surface tool,

generating a surface map of structural characteristics ( Figure 2C - D ). This tool allows for

visualization of the surface of the active site based on hydrophobic, glycine, p olar

uncharged, positively charged, and negatively charged regions. HsERK8 displayed an

active site with more distinct electrostatic features, exhibiting more negatively charged and hydrogen - b onding features ( Figure 2C ), compared to TbERK8, which displays greater

positively charged and hydrophobic features ( Figure 2D ). Both structures displayed

features of hydrogen bonding, aromaticity, and ionic binding abilities but organized

differently within the binding pocket ( Figure 3 ). Based on the pharmacophore mod els,

which displays features required of a ligand for binding, HsERK8 displayed hydrogen bond

donor and hydrophobic features with few aromatic features in the upper right quadr ant of

the adenine binding region. HsERK8 contained t hree negative ionic feature s within the

triphosphate binding region. The TbERK8 pharmacophore model showed hydrogen bond

donor and aromatic features throughout the adenine binding region, with three nega tive

ionic features at the top left side of the triphosphate binding region ( Fig ure 3 ). The binding

28

feature positioning differences between TbERK8 and HsER8 are hypothesized to result

from the shift in residues that would change the structural chemistry re quired of a small

molecule, supporting the specificity variances between AZ960 a nd Ro318220.

To further confirm the electrostatic differences between the binding pockets of the

two orthologue s and to complement the pharmacophore models, the PyMOL Protein

Contact Potential/Vacuum Electrostatic feature (version 2.0), 116 a calculation based on a quasi - Coulombic - shape function, 121 was used to evaluate the electrostatic potential of each protein surface ( Figure S8 ). The binding cavity of TbERK8 shows a neutral upper adenine binding region (white) with an overall positive charge (blue) closer to the activation loop in the ATP binding pocket ( Figure S8A ). The activation loop for HsERK8 is similarly charged compared to TbERK8. The adenine binding region of HsKER8 contrasts compared to TbERK8 ( Figure S8B ), displaying a strong negative charged (red) region embedded deep wit hin the active site as compared to a more neutral (white) region in

TbERK8. These structural data presented from the electrostatic surface show that charged regions in the uppe r binding region could be influencing inhibitor positioning in HsERK8, while aro matic and hydrophobic features may play an important role for TbERK8.

These physicochemical features of TbERK8 and HsERK8 can be used to rationalize

and exploit the inhibitory differences between AZ960 and Ro318220. HsERK8 displays a

region of negative el ectrostatic potential in the posterior of the adenine binding region in

the active site, and a larger overall binding pocket volume. Despite the range of

physicochemical differences, both orthologue s share three negative ionic feature

requirements in the t riphosphate binding region, allowing the facilitation of known kinase

activi ty ( Figure 3 ). It is demonstrated below with molecular docking experiments that the

29

binding cavity of HsERK8 can accommodate a bulkier ligand with more positively charged

or hydrog en bonding features with a negatively charged tail with higher affinity, sim ilar to the characteristics of Ro318220. TbERK8 displays a hydrophobic and relatively small binding pocket with some positively charged regions and hydrogen bonding features, which could support a smaller, flexible ligand with aromatic features, matching c haracteristics of AZ960. Molecular docking can help identify structural determinants responsible for the inhibitory differences to support these observations and provide enhanced b iochemical rationale for inhibitor orthologue preference .

2.4.3 Molecular Do cking and Interaction Analysis

To provide insight into key residues responsible for the inhibitory differences between TbERK8 and HsERK8, inhibitor binding conformations were explo red using molecular docking and evaluating the protein - ligand interactions. Molecular docking is a structure - based method that provides information on into how small molecules interact with protein structures, 122 which is useful when investigating proteins as potential targets for drug discovery. This process allows for examination of the binding pocket for designing a scaffold for TbERK8 specificity.

Fus3 (PDB: 2B9F) 120 was used as a standard for protocol development and validat ion of the methodology, given its use in other modeling experiments 8 6 and the co - crystallization with ADP within the binding pocket. To validate the molecular docking protocol, redocking of ADP was performed using AutoDockTools 117 and AutoDock

Vina. 118 Redocking is generally considered a success by obtaining a lowest ene rgy pose with a RMSD less than 2.00 Å. 123 The redocking of ADP into Fus3 resulted an RMSD of

30

0.710 Å , thereby supporting our protocol for these protein structures ( Figure S9, Table

S3 ).

Following docking protocol valida tion , docking of ATP into each structure was

pe rformed to identify any key residues essential in binding . Each dock ed pose was chosen

based on lowest energy scores and domains positioned appropriately for

autophosphorylation , with the phosphate groups coordinated near the Mg 2+ ( Figure S 10 -

S 11 , Table S4 ). To assess the ability of AutoDock Vina to dock ATP relative to the known

crystal structure position of ADP, only the ADP moiety of ATP was considered in the

RMSD calculations . The RMSD values of the ADP moiety of ATP , relative to the ADP

position in the crystal structure of Fus3 were 2.028 Å , 1.811 Å , and 1.920 Å for Fus3,

TbERK8, and HsERK8, respectively ( Figure S 11 ). Using Schr ö dinger - Maestro’s 115 interaction f ingerprint tool, interaction types were evaluated within the protein - ATP complexes ( Table S9 ). Fingerprinting exhibited interactions in both orthologue s with residues Leu19, Val25, Lys42, and Asp155 ( Table S9 ). Site - directed mutagenesis experiment s have also revealed Lys42 and Asp155 as essential residues for catalytic activity of HsERK 8. 61, 83, 124, 125 In combination with the sequence alignment, structural overlays, mu tagenesis of residues within the binding cavity from previous literature, 61, 63,

83 , 124 - 129 and the fingerprinting results, an initial set of key r esidues interacting with ATP were identified ( Table S2 ). Notably, Asp155 displayed interactions in all structures, despite a shift in its spatial localization between the orthologue s ( Figure S1 0 - S11 ).

However, this result is expected given the biological r elevance of aspartate residues coordinating Mg 2+ in kinases and pr omoting catalytic events. 11, 130

31

To further evaluate the structural difference between HsERK8 and TbERK8 and to

probe inhibitor specificity, Ro318220 and AZ960 was docked into both TbERK8 and

HsERK8. Ro318220 is a pyrrole - dione and methyl - sulfanyl moiety containing compound

that has been experimentally determined to be HsERK8 selective. 60, 86 Ro318220 has a

volume of 404.0 Å 3 , contains both electropositive and electronegative regions ( Table S6 ), and d isplays features of hydrogen bond donating, hydrophobicity, and hydrogen bond accepting ( Figure S12 ) . Ro318220 docking results support the specificity for HsERK8 over TbERK8 ( Figure S13 - S14, Table S7 ), showing more electrostatic interactions with

HsERK8. O verall HsERK8 docked poses displayed electrostatic interactions with the pyrrole - dione moiety at residues HsArg56, HsArg55 and HsLys139 and with the methyl - sulfanyl moiety at residues HsLys42, HsGlu60, and HsAsp155, all interacting at a distance less than 5.00 Å. Additionally, Ro318220 interacts hydrophobically with the key residue

HsVal27. These specific interactions with the defined key residues (HsVal27, HsLys42,

HsArg56, and HsAsp155) with additional electrostatic interactions (HsLys139, HsSer141,

HsAsn 142) show that Ro318220 has the chemical features and size required to bind competitively against ATP in the binding cavity of HsERK8.

When Ro318220 was docked into TbERK8, many poses , such as the pose used for interaction analysis, resulted in the pyrr ole - dione moiety region being partially solvent exposed ( Figure S 1 5 B ). The electrostatic interactions between TbERK8 and Ro318220 inclu de TbArg56 in the pyrrole - dione moiety with TbLys42 and TbArg182 interactions display ing distances greater than 5.0 Å in distance, indicat ing weak electrostatic interaction s . Ro318220 poses in TbERK8 also show a hydrophilic interaction with Asn142 at the p yrrol - dione moiety greater than 5.0 Å away. Ro318220 showed no interactions with

32

the essential hydrophobic residues of Tb ERK8, TbLeu19 and TbVal27, likely due to the

bulkiness of the molecule and inability to fully access the adenine binding region of the

active site given the smaller pocket size. These observations support experimentally

determined selectivity trends for Ro 318220 and the two ERK8 orthologue s, noting that this

inhibitor is selective towards HsERK8, and that Ro318220 does not contain the

phy sicochemical composition and smaller size required for TbERK8 binding. 60 Free

energy calculations confirmed these observations for bin ding preferences and are discussed

collectively later.

AZ960 is a compound containing fluorophenyl and methyl - pyrazol mo ieties that

has been experimentally determined to be a strong inhibitor for TbERK8 and a weak

inhibitor for HsERK8. 60 AZ960 has a cal culated volume of 292.5 Å 3 , exhibits features of hydrogen bond donating, aromaticity, and hydrophobicity ( Figure S12 ), and contains both electronegative and electropositive clustered regions ( Table S6 ). Comparing the docking results of AZ960 in TbERK8 and HsERK8, AZ960 had a common binding position throughout all poses with TbERK8 while there were less consistent patterns in t he docking into HsERK8 ( Figure S14 A - B, Table S8 ). Most poses of AZ960 in TbERK8 showed the fluorophenyl region of the molecule posit ioned near the adenine binding region ( Figure

S14 C ). This positioning was not observed in HsERK8, with the methyl - pyrazol moiety of the molecule positioned near the adenine binding region ( Figure S14 D ). Interestingly,

TbERK8 and HsERK8 had equal numbers of electrostatic and hydrophobic interactions with AZ960 ( Figure S15, Table S9 ), which could explain the weak inhibitory ab ilities with HsERK8. HsERK8 displayed more backbone and hydrophilic interactions with

AZ960 as compared to TbERK8. TbERK8 interacted electrostatically with the methyl -

33

pyrazol moiety of AZ960 at residues TbLys42, TbAsp53, and TbArg56 but at distances

greate r than 5.0 Å. HsERK8 interacted electrostatically in the methyl - pyrazol moiety with

HsLys42, HsGlu60, and HsAsp155 and hydrophobicall y with HsGln21. Only the HsGlu60

and HsGln21 interacted at distances less than 5.0 Å. Due to these weak interaction with

pr edicted key residues, other factors may contribute to the differences of AZ960 binding.

When evaluating additional interactions to e xplain the AZ960 inhibitory difference

between the ERK8 orthologue s, TbPhe156 displayed aromatic and hydrophobic

interactio ns with the fluoropyridine - carbonitrile moiety of AZ960, at a 3.6 Å distance, in

the triphosphate binding region, potentially represe nting T - shape π - π like interaction

( Figure S16 ). TbPhe156 neighbors the ERK8 key residue, Asp155, showing that this interac tion could drive ATP competition in TbERK8. HsERK8 also displayed a potential

T - shape π - π like interaction at HsPhe97, but at 4.3 Å a nd in the upper and back - end of the adenine binding region because of additional binding pocket space and volume of HsERK8

( Figure S16 ). The positioning of HsPhe156 reveals that the shift in residues from the residue insertions in HsERK8 may have influence d a feature re - organization, allowing for these aromatic components to be more accessible for small - molecule binding ( Figur e S16

D ). This aromatic interaction required for AZ960 binding could be a key detail of the specificity differences between the two o rthologue s and potentially be an exploitable

feature for HAT pharmaceutical design.

Collectively, docking results of the k nown inhibitors highlight structure differences

observed in overlays, alignments, and ADP/ATP docking, but also reveal fundamental

in teraction differences between the ERK8 orthologue s and their specific inhibitors.

Ro318220 interactions are driven by elect rostatic features and require a larger binding

34

cavity for high specificity, which agrees with above results. Aromatic and hydrophobic

characteristics influence interactions with AZ960, potentially in a T - shape π - π like manner.

TbERK8 has aromatic interacti ons in the triphosphate binding region at TbPhe156 with

AZ960, a potential essential interaction required for TbERK8 specificity. Fut ure TbERK8 selective inhibitors should contain an aromatic feature to interact with TbPhe156.

2.4.4 Inhibitor Characterist ics and Additional Compounds to Probe Binding Site Specificity

Ro318220 and AZ960 were screened using Schrödinger - Maestro’s 115 QikProp feature for additional small molecules with physicochemical similarities and to identify any additional features o f structures that selectively mimic AZ960 and Ro318220 in the

ATP binding cavity for orthologue - specific inhibitor design. QikProp identifies compounds that are chemically similar to the small molecule of interest that are purchasable and FDA approved. Ro 318220 was identified to be similar to idarubicin

(63.77%), famotidine (62.99%), and sildenafil (60.08%). Idarubicin and si ldenafil are similar to Ro312880 in volume, 402.8 Å 3 and 423.1 Å 3 , respectively, and display concentrated regions of positive charge ( Table S6 ). Famotidine is relatively small in comparison to the other identified molecules similar to Ro318220, with a volume of 251.0

Å 3 , and displays a concentrated region with a net negative charge. When docked into

HsERK8, both idarubicin and sildenaf il presen ted interactions similar to Ro318220, notably electrostatic interactions with the key residues HsLys42 and HsAsp155 while also displaying hydrophobic interactions ( Table S17 ). Interestingly, famotidine docking results did not show similar binding interacti ons with HsERK8 as compared to Ro318220, having only one electrostatic interaction with HsAsp98, with most of the interactions being from hydrophobicity despite being a Ro318220 chemical analog. This is likely because the

35

central features of famot idine are hydrophobic ( Table S6 ). All of the Ro318220 analogs

displayed similar interact s , some with even more interactions , with TbERK8’s binding

cavity compared to HsERK8 , despite their larger molecular volume. When docked into

TbERK8, idarubicin had an equal num ber of electrostatic interactions in the two

orthologue s and interacted with TbERK8’s key residue TbAsp155. Sildenafil had two more electrostatic interactions with TbERK8 compared to HsERK8, again interacting with the key residue, TbAsp155. Famoti dine resu lted in only two electrostatic interactions with

TbERK8, one more than HsERK8, none of which were key residues. It appears that idarubicin and sildenafil encompass characteristics driving molecular interactions deeper into the binding cavity of Tb ERK8 that are absent in Ro318220. From this, we can observe that larger molecules with the preferred physicochemical characteristics of negative electrostatic features, which can drive interactions with TbLys75, and flexibility can still inhibit TbERK8.

AZ 960 was i dentified to be similar to prednisolone (64.68%), fludrocortisone

(63.59%), and fluprednisolone (63.44%). The AZ960 analogs had similar structural characteristics, with mostly neutrally charged surfaces ( Table S6 ), and volume ranging from 325 - 350 Å 3 . While most properties, such as hydrogen bonding and hydrophobicity were seen in these analogues, interestingly, none had aromatic features. When all three

AZ960 analogues were docked into both ERK8 orthologue s, the AZ960 analogs displayed

similar inter actions, with mostly backbone and hydrophobic interactions driving the

binding ( Table S17 ). When docked into HsERK8, all three AZ960 analogs interacted with

HsERK8’s key residues, HsLys42 and HsAsp155. Fludrocortisone and prednisolone

interacted similarly with TbER K8, with electrostatic interactions with the key residue,

36

TbAsp155. Fluprednisolone did not interact with any TbERK8 key residues, indicating a

very weak ability to act competitively with ATP. Interestingly, none of the AZ960 analogs

interacted wi th the im portant TbERK8 residue, TbPhe156, likely due to the lack of aromaticity of these molecules, further emphasizing the importance of an aromatic moiety in developing a TbERK8 specific inhibitor.

The inhibitor AZD5438 was also docked for further evalu ation of inhibitory abilities. Previous studies have identified AZD5438 as a strong inhibitor of both ERK8 orthologue s and has a n approximate volume of 357.8 Å 3 . 60 In HsERK8, AZD5438 interacted electrostatically with HsLys42 and HsAsp98 and had hydrophilic interactions with HsGln21 ( Table S17 ). AZD5438 interacted with three of HsERK8’s key residues:

HsLeu19, HsVal27, and HsLys42. In TbER K8, AZD54 38 interacted electrostatically with

TbLys42, TbAsp53, TbGlu60, TbLys75, and TbAsp155 and via hydrophilic interactions

with TbThr57 ( Table S17 ). AZD5438 interacted with four of TbERK8’s key residues:

TbLeu19, TbVal27, TbLys42, and TbAsp155. AZD543 8 interac ted with a feature determined to be essential in TbERK8 binding, TbPhe156, which was found to be essential for interacting with AZ960. The features of AZD5438 allow for interaction with key features in triphosphate binding regions for both ERK8 or thologue s , while allowing it to drive into the adenine binding region of TbERK8, which is otherwise less accessible. This supports that AZD5438 strongly inhibits both HsERK8 and TbERK8. 60 The results from docking AZD5438 into TbERK8 and HsERK8 indicate that a strong, TbERK8 inhibitor requires interactions with ERK8 key residues and with TbPhe156.

An additional residue appeared to be im portant for TbERK8 binding with both

Ro318220 and AZ960 chemical analogs, especially with allowing larger molecules to

37

access the binding cavity: TbLys75. Both Ro318220 and AZ960 analogs interacted electrostatically with TbLys75. Further investigation of t he sequence alignment of the two structures show that Lys75 is a feature only present in TbERK8, whereas HsERK8 contains a serine in this position. When visualized using PyMOL, 116 due to the shift in residues caused by the deletions of residues Asp69 and Lys105 in TbERK8 ( Figure S5 ), this position is occupied by HsIle74, producing a more hydrophobic region within the adenine binding region of H sERK8, which is also observed in the surface maps ( Figure 2 C, D ).

TbLys75 also occupies more of the binding pocket relative to HsIle74, reducing the overall pocket volume of TbERK8 and increasing the net positive charge in the adenine binding region of Tb ERK8’s active site. This observation supports the patterns seen in the docking results, exhibiting fewer TbERK8 interactions with larger molecules such as Ro318220, unless containing a negatively charged moiety in a flexible domain, such as sildenafil, d ri ving the molecule into the binding cavity of TbERK8. The electrostatic characteristics of these inhibitor analogs contain the physicochemical features required for this crucial interaction with TbLys75 of TbERK8, indicating it is a key component for des ig ning an orthologue specific inhibitor. These results support that TbERK8 selective inhibitors should contain chemical constituents that can interact with TbLys75 for strong inhibitory abilities.

To further support binding patterns and favorability, liga nd efficiencies were evaluated by dividing the docking output energy scores by the number of heavy atoms in each molecule ( Figure S24 ). As expected, given known experimental data, th e highest ligand efficiency for TbERK8 was AZ960, while the lowest was Ro3 18 220 ( Figure S24

A ). Unexpectedly, AZ960 also had the highest ligand efficiency for HsERK8 and

38

Ro318220 scored low compared to all other ligands ( Figure S24 B ) in HsERK8 . When th ese data is organized by size of the molecule (largest to smallest), the volu me of the

molecule appears to also influence ligand efficiency, most apparent in TbERK8 ( Figure

S24 C ) and related to the smaller binding volume of TbERK8. While heavy atom number is taken into account in the ligand efficiency calculation, volume of both t he binding pockets are substantially different which can influence the free energy from AutoDock

Vina. I n HsERK8, there are a few outliers, also likely due to the size of the binding cavity, allowing for more space to accommodate larger molecules, and more a reas for the small molecules to bind. It is possible that AZ960 is binding well in HsERK8, which experimentally is known to inhibit HsERK8 to some extent, but not in the region required to fully inhibit kinase activity and outcompete ATP. Collectively , s in ce the ligand efficiencies are calculated based on energy outputs from docking and heavy atom count , these results show that the ligand efficiencies are not a complete indicator for comparing inhibitors of varying size or multiple receptors with varying po cket volumes but can be helpful when comparing molecules of similar size within one orthologue .

To explore these new features identified through structural analogs to the known inhibitors, overlays of the structures were analyzed using a multiple ligand p harmacophore hypothesis feature in Schrödinger - Maestro. 115 Each analog was identified to contain a hydrogen bond ac ce pting feature driving the interactions with TbLys75 ( Table S17 ).

Incorporating this feature into the scaffold of AZ960 could improve selectivity for

TbERK8 over HsERK8. Additional compounds were constructed using the features from

Ro318220 and AZ960 ana lo gs essential in TbERK8 binding, modifying the methyl - pyrazol and fluorophenyl moiety of AZ960 and to further use our data and rationale to

39

design a new scaffold based on the hypothesis supported by the in silico characterization

of the binding cavity and m olecular docking. The dihydroxycyclopentyl - ethanone moiety

in the structures of fludrocortisone, fluprednisolone, and prednisolone was altered to a

cyclopentenyl - hydroxyenthanone moiety by removing the extra group to create compound 1 ( Table S18 ) . This alteration to the AZ960 scaffold drives poses to interact

with TbLys75 i n molecular docking studies ( Table S21 ). Two constructs appeared to bind

more effectively in TbERK8 over HsERK8 when evaluating the protein - small molecule

interactions: compound 5 and 6 ( Table S21 ). Compound 5 contains an oxydiazole ring in

place of the methyl - pyrazol moiety and compound 6 contains a carboxyl and a trifluoro

moiety. Compound 5 interacted electrostatically with four charged residues in TbERK8 ,

TbLys42, TbLys75, TbGl u96, and TbAsp155, and hydrophobically with the aromatic

residue TbPhe156. Compound 5 only interacted electrostatically with two charged residues

in HsERK8, HsAsp98 and HsAsp155. Compound 6 interacted electrostatically with five

cha rged residues in TbERK8, TbLys42, TbGlu60, TbGlu96, TbAsp98, and TbAsp155, but

not TbLys75, and interacted hydrophobically with TbPhe156. Compound 6 interacted

electrostatically with three charged residues in HsERK8, HsLys42, HsAsp98, and

HsAsp155. Compoun ds 5 and 6 had interacti ons similar to those observed with AZ960 and

AZD5438 with HsERK8 , interacting with HsAsp98 similar to AZD5438 and comparable

hydrophobic and backbone residues in AZ960. Interactions of compounds 5 and 6 in

TbERK8 resembled those of AZD5438 rather than AZ96 0. From the lack of clear binding

patterns to known inhibitors, further analysis is necessary to determine selectivity.

Each new c ompound was further analyzed using ligand efficiency calculations and

energy properties and compared to the inhibitors Ro3182 20, AZ960, and AZD5438 ( Table

40

S22 ). Ligand efficiency calculations displayed only compound 6 to bind measurably greater in TbERK8 ( Table S22 ) compared to HsERK8, but AZ960 displayed efficiencies greater in HsERK8 though experimental work supports AZ960 as stronger at inhibiting

TbERK8. 60 More metrics of analysis, specifically free energy calculations using Molecu lar

Mechanics/Generalize d Born Surface Area ( MM - GBSA ) , are essential when evaluating the inhibitory efficacy from molecular docking results given the complexity and differences in the binding pockets due to the rearrangements from the substitutions and del etions. MM - GBSA energy p roperties were calculated using Schrödinger - Maestro 115

( Table S22 ). Both AZ960 and AZD5438 di splayed measurably great er , negative binding energies for TbERK8 ( - 107.97 ± 11.24 kcal/mol and - 109.44 ± 11.76 kcal/mol , respectively) compared to HsERK8 ( - 98.60 ± 10.83 kcal/mol and - 96.00 ± 14.95 kcal/mol , respectively), which aligns with experimental da ta. 60 Ro318220, however, did not display expected free energy values, ultimately showing low binding energies to both TbERK8 and

HsERK 8, which may be influenc ed by protein dynamics not accounted for in this work and due to the large size of the molecule. Both compounds 5 and 6 had measurably greater energies with TbERK8 over HsERK8, though not significant. Following the patterns of the k nown ERK8 inhibitors, AZ 960 and AZD5438, compounds 5 and 6 are hypothesized to selectively inhibit TbERK8 over HsERK8, providing a buildable scaffold for new - gen compound design and rationale.

2.5 Conclusion

This work identifies unique structural feature for TbERK8 drug design specificity that can ultimately be used for treating HAT. Here, we utilize c omputational techniques that compare the ATP - binding pockets of TbERK8 and HsERK8 and rationalize and

41

predict the structural determinants that can influence the experimentally obse rved inhibitor

specificity. Molecular docking results identified key residues and features necessary for

ADP/ATP and inhibitor interactions in TbERK8 that differ in HsERK8, ultimately be ing

exploitable for TbERK8 specificity over HsERK8. These features inc lude a positively

charged component, TbLys75, and an aromatic residue, TbPhe156, that are shown as

requisite to increase the inhibitor specificity in TbERK8 as based on their interaction s with experimentally determined potent TbERK8 inhibitors ( Figure 6 ). This work also identifies

new chemical moieties for the development of next gen inhibitors that can be utilized for

greater selectivity towards the active site of TbERK8 over HsERK8 . By testing chemical

analogs of known TbERK8 or HsERK8 selective inhibitor s, and using those chemical

features coupled with pharmacophore models, we were able to examine the impact of new

chemical moieties , an oxydiazole ring and a trifluoro group, on an AZ960 derivative , that

was confirmed with free energy (MM - GBSA) calculation s to indicate TbERK8 specificity

( Table S22 ) . Further drug discovery and targeting of TbERK8 should focus on developing

TbERK8 selective inhibitors using the basic scaffold containing a central aromatic feature

to interact with TbPhe156 with a flexible do main containing electrostatic features to

interact with TbLys75 in the adenine binding region, and a strong, electronegative domain

to interact with the triphosphate binding region. Thes e future inhibitors should be small

and flexible to allow for movement within the small binding area of active site of TbERK8.

Collectively, our work demonstrates exploitable physicochemical differences in TbERK8

and HsERK8 that can drive novel therapeutic design strategies for treating HAT.

42

2.6 Main Figures

Figure 1. Structural features of ERK8 orthologue s. (A ) HsERK8 with binding pocket surface shown as mesh and ATP represented in volumetric dots and colored by element. Key residues are shown in red sticks and colored by atom type. (B) Key residues of the ERK8 binding pocket shown as sticks, labeled and colore d based on orthologue (Fus3 (green), HsERK8 (navy), and TbERK8 (violet)). (C) Spatial orientation diagram of binding region of ATP in HsERK8 structure.

43

Figure 2. ERK8 orthologue binding cavity volume and physicoc hemical features. (A, B) Predicted binding pocket volume surface representation. (A) TbERK8 shown in violet cartoon. (B) HsERK8 shown in navy cartoon. Binding pockets were determined using MetaPocket 2.0. T he pocket volume of TbERK8 and HsERK8 was calculat ed to be 518.8 Å 3 and 1186 Å 3 , respectively , using UCSF Chimera. (C , D) Surface map of binding cavity. TbERK8 (C) and HsERK8 shown in gray cartoon. Colors of surface represent specific residue properties: hydrophobic (green), hydrogen bonding features (tea l), negatively charged (red), and positivel y charged (blue).

44

Figure 3. Pharmacophore model of TbERK8 and HsERK8 . (A) TbERK8 shown as violet cartoon and (B) HsERK8 shown as navy cartoon. Spheres represent the pharmacophore model features: red circles - n egative ionic, blue sphere with arrows - hydrogen bonding donor, pink sphere with arrows - hydrogen b o nding acceptor, green sphere - hydrophobic, and orange rings - aromatic features.

45

Figure 4. Known orthologue - specific inhibitors docked into TbERK8 an d HsERK8 . (A) TbERK8 and ATP, (B) HsERK8 and ATP, (C) TbERK8 and AZ960, (D) HsERK8 and AZ960, (E) TbERK8 and Ro318220, and (F) HsERK8 and Ro318220. In all panels, TbERK8 shown as violet cartoon and HsERK8 shown as navy cartoon with key residues shown in st ick colored by atom type. Mg 2+ is shown as a green sphere. ATP is represented in stick colored by atom type with carbon s colored maroon. AZ960 is represented in stick colored by atom type with carbon in teal. RO318220 shown in stick colored by atom type wi th carbon in light blue. Distances measured to closest key residue are in Å 3 shown in gray dashes.

46

Figure 5. Volume representation of known orthologue - specific inhibitor poses for TbERK8 and HsERK8 . (A) TbERK8 and ATP, (B) HsERK8 and ATP, (C) TbERK8 and AZ960, (D) HsERK8 and AZ960, (E) TbERK8 and Ro318220, and (F) HsERK8 and Ro318220. TbERK8 shown as violet cartoon and HsERK8 shown as navy cartoon with key residues shown in stick colored by atom typ e. Mg 2+ in green sphere. ATP is represented in volumetric dots colored by atom type with carbon s colored maroon. AZ960 shown as volumetric dots colored by atom type with carbon s colored teal. RO318220 shown as volumetric dots colored by atom type with carb on s colored light blue.

47

Figure 6. Schematic of prop osed targeting strategies TbERK8 over HsERK8 for HAT drug development .

48

2.7 Supplemental Information

Table S1. Summary table of sequence identity and similarity between Fus3 (PDB ID : 2B9F), TbERK8, HsERK8, and the MAPK t emplate (PDB ID : 3OZ6) sequences . Sequence identity and similarity p ercentages were calculated using Schr ö dinger - Maestro after performing pairwise sequence a lignment. Reference MAPK t emplate Fus3 (PDB: 2B9F) TbERK8 HsERK8 Sequence (PDB ID : 3OZ6) Percent Percent Percent Percent Percent Percent Percent Percent

Similarity Id entity Similarity Identity Similarity Identity Similarity Identity Fus3 58 38 57 39 54 35

TbERK8 58 38 72 51 69 51

HsERK8 57 39 72 51 64 46 MAPK 54 35 69 51 64 46 t emplate

49

A B

C

D

E

Figure S 1. Energy minimized structure validation of the Fus3 crystal structure (PDB ID: 2B9F). (A) Ramachandran plot showed 95.5% of residues within the favored region, 4.2% within the allowed region, and 0.3% in the outlier region. (B) ANOLEA indicates favorable free energy scores (green). No major unfavorable residue regions are observed. (C) ProSA indicates a Z - score ( - 9.11, black dot) in the acceptable spectrum compared to resolved protein structures. (D) QMEAN analysis indicates mostly favorable Z - scores in so lvation, SSE agreement, and all - atom interactions. Negative Z - scores are observed for torsion, ACC agreement, and CB inter actions. (E) Verify3D results show most residues above the threshold for favorable side chain placement (dotted yellow line). Residue ranges below the threshold are not within the active site.

50

A B

C

D

E

Figure S 2 . Energy minimized s tructure validation of the TbERK8 homology model ( sequence accession: Tb927.10.5140 ). ( A ) R amachandran plot showed 88% of residues within the favored region, 8.1 % within the allowed region, and 3.9% in the outlier region . ( B ) ANOLEA indicates favorable free energy scores (green). No major unfavorable residue regions are observed. ( C ) ProSA indicates a Z - score ( - 2.7, black dot) in the acceptable spectrum compared t o resolved protein structures. (D) QMEAN analysis indicates negative Z - scores in all analysis areas. (E) Verify3D results show most residues above the threshold for favorable side chain placement (dotted yellow line). Residue ranges bel ow the threshold ar e not within the active site.

51

A B

C

D

E

Figure S 3 . Energy minimized s tructure validation for HsERK8 homology model ( sequence accession: Q8TD08 ). ( A ) R amachandran plot showed 89.3% of residues within the favored region, 9.5% within the allowed region, and 1.2% in the outlier region . ( B ) ANOLEA indicates favorable free energy scores (green). No major unfavorable residue regions are observed. ( C ) ProSA indicates a Z - score ( - 7.87, black dot) in the acceptable spectrum compared to resolv ed protein structures. ( D) QMEAN analysis indicates negative Z - scores in all analysis areas. (E) Verify3D results show most residues above the threshold for favorable side chain placement (dotted yellow line). Residue ranges below the threshold are not wit hin the active site.

52

Figure S 4 . Multiple sequence alignment (MSA) of the kinase domain of ERK8 in Fus3 (PDB: 2B9F), TbERK8, HsERK8, and MAPK t emplate (PDB: 3OZ6). A lignment performed using Schr ö dinger - Maestro ’s Multiple Sequence Viewer. Sequences are displayed as a single let ter amino acid, with conserved residues colored based on side - chain property and alignment quality. The Fus3 sequence was used as a reference, with percent similarity (S) and identity (I), indicated in the right column.

Figure S 5 . Pairwise sequence a lignment of the kinase domain of ERK8 in TbERK8 and HsERK8 . A lignment performed using Schr ö dinger - Maestro ’s Multiple Sequence Viewer. Sequences are displayed as a single letter amino acid, with conserved residues colored based on side - chain property and al ignment quality. Percent similarity (S) and identity (I), indicated in the right column.

53

Figure S 6. ATP binding site overlays. (A) Fus3 and TbERK8, (B) Fus3 and HsERK8, (C) HsERK8 and Tb ERK 8, and (D) Fus3, TbERK8, and HsERK8. All structures are sho wn as cartoon and colored as Fus3 (green), TbERK8 (violet), and HsERK8 (navy). Residues in the ATP bindin g cavity are labeled and shown as stick and colored based on kinase orthologue . Mg 2+ is shown as spheres and colored based on kinase orthologue .

54

Tab le S 2 . Key residue name and sequence position of Fus3, TbERK8, and HsERK8 based on active site overlay an d literature. Residues are renumbered in order to have consistency in discussion of amino acid position based on sequence and renumbered residues are b ased on the sequence position in HsERK8 Structure Residue name and number Fus3 (PDB ID: 2B9F) Leu19 Val27 Lys42 Arg55 Met96 Ser141 Asp155 TbERK8 Leu19 Val27 Lys42 Arg56 Glu95 Asn140 Asp153 HsERK8 Gln2 1 Val32 Lys4 2 Arg 55 Met100 Ser146 Asp1 55 Renumbe red Residues Fus3 (PDB ID: 2B9F) Leu19 Val27 Lys42 Arg5 6 Met95 Ser14 1 Asp155 TbERK8 Leu19 Val27 Lys42 Arg5 6 Glu96 Asn14 2 Asp155 HsERK8 Gln 21 Val27 Lys42 Arg5 6 Met95 Ser14 1 Asp155

55

Figure S 7 . ERK8 orthologue s and ATP b inding site volume representation overlay with ATP. Volume pocket of ATP binding cavity was created using MetaPocket 2.0 and calculated with UCSF Chimera. (A, C, E) Binding pocket is shown as spheres to indicate volume and shape of the ATP bind ing cavity for each structure. (B, D, F) Overlays of the volume representation with ATP was also performed to highlight the orientation of ATP in the predicted pocket. All structures are shown as cartoon and colored by F us3 ( green ) , TbERK8 (v iolet ), and Hs ERK8 (navy) . ATP is shown as sticks and colored by element.

56

Figure S 8 . Electrostatic potential surface of (A) TbERK8 and (B) HsERK8. Electrostatic potential surface calculations were performed using the PyMOL vacuum electrostatics protein contact potential feature and are colored by a gradient of red (negatively charged) to blue (positively charged ). ATP binding cavity shown in white dashed box.

Figure S 9 . ADP redocking results compared to Fus3 x - ray crystal structure. Fus3 structure is shown in c artoon and colored as x - ray crystal structure (green) and the energy minimized structure used for redocking (purple). The lowest energy docked pose of ADP docking r esults (pose 1) is overlaid with the crystal structure position of ADP. ADP is shown as stic ks and colored by structure (x - ray crystal structure in green, redocked pose in purple). The docked pose had an RMSD of 0.710 Å compared to the x - ray crystal struct ure pose.

57

Table S 3 . Docking free energies of poses for ADP redocked into Fus3 energy minimi zed structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning , correct positionin g of chemical groups, and similarity to ADP in crystal structure (represented by an asterisk ) . ADP – Fus3 Energy Minimized Structure Free Energy Pose # (kcal mol - 1 ) 1 - 9.5 * 2 - 9.4 3 - 8.8 4 - 8.8 5 - 8.6 6 - 8.5 7 - 8.5 8 - 8.5 9 - 8.4

58

Figure S1 0. Results of all poses of ATP docking in ERK8 orthologue s. (A) Fus3 docking results with all poses generated displayed in stick and colored by energy scores in red, orange, yellow, green, blue, violet ( ROYGBV ) . Fus3 is shown as green cartoon with Mg 2+ as a green sphere. (B) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores in ROYGBV. TbERK8 shown in violet cartoon with Mg 2+ in green sphere. (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores in ROYGBV. HsERK8 shown in n avy cartoon with Mg 2+ in green sphere.

59

Figure S 11 . Lowest energy docked poses of ATP in ERK8 orthologue s . ( A ) Fus3 with best pose (1) of ATP docking results , ( B ) TbERK8 with best pose (1) of ATP , ( C ) HsERK8 with best pose (3) of ATP , and ( D ) all three structure s overla id . Fus3, TbERK8, and HsERK8 are shown in cartoon colored green, violet, and navy, respectively, with Mg 2+ in green sphere . ATP structures are represented in stick and colored by ass ociated pro tein and by atom type. R MSD of ATP sans γ phosphate pose compared to ADP in Fus3 crystal structure is 2.028 Å , 1.811 Å , and 1.920 Å , respectively.

60

Table S4. Docking free energies of poses for ATP docked into Fus3, TbERK8, and HsERK8 energ y minimized structure. Ener gies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning , correct po sitioning of chemical group s, and similarity to ADP in crystal structure (represented by an asterisk ) . TbERK8 HsERK8 Fus3 Pose # affinity (kcal affinity (kcal (kcal mol - 1 ) mol - 1 ) mol - 1 ) 1 - 9.6 * - 7.5 * - 7.7 2 - 8.9 - 7.3 - 7.6 3 - 8.9 - 7.2 - 7.5 * 4 - 8.8 - 7.2 - 7.5 5 - 8.8 - 7.0 - 7.4 6 - 8.5 - 7.0 - 7.3 7 - 8.5 - 6.9 - 7.2 8 - 8.4 - 6.7 - 7.2 9 - 8.3 - 6.6 - 6.9

61

Table S 5 . 2D structure of known TbERK8 and HsERK8 inhibitors with name, experimental K i and IC 50 value, and computationally calculated ligand volume and area. ATP is included for native ligand reference.

R o 318 22 0 AZ960 Volume: 404.0 Å 3 Volume: 292.5 Å 3 Area: 377.3 Å 3 Area: 304.9 Å 3

ATP Volume: 348.3 Å 3 Area: 321.4 Å 3

62

Table S 6 . Small Molecules used for docking into ERK8 orthologue s. Volume Rotatable Electrostatic Electrostatic Inhibitor Stru cture (Å 3 ) Bonds Surface Surface 180 

Ro318220 404.0 6

AZ960 292.5 6

Famotidine 251.0 6

Fludrocortisone 340.7 5

Fluprednisolone 331.9 5

Idarubicin 402.8 5

Prednisolone 327.9 5

Sildenafil 423.1 7

AZD5438 357.8 6

63

Figure S12. Ligand pharmacophore model of orthologue - specific inhibitors. (A) Ro318220 (teal) and (B) AZ960 (light blue) displayed in stick and colored by atom type. Spheres represent the pharmacophore features: red circles - negative ionic, blue sphere with arrows - hydrogen bonding donor, pink sphere with arrows - hydrogen b o nding acceptor, green sphere - hydrophobic, and orange rings - aromatic features.

64

Figure S1 3 . Docking result s of Ro318220 and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. (B) TbERK8 docking results with best pose with Mg 2+ as a green sphere (3) displayed in stick a nd colored in yellow and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses ge nerated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsE RK8 docking selected for interaction analysis displayed in stick and colored in red and atom type with Mg 2+ as a green sphere.

65

Table S7. Docking free energies of poses for Ro318220 docked into the TbERK8 and HsERK8 energy minimized structu re. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 7.1 - 7.8 * 2 - 6.6 - 7.7 3 - 6.5 * - 7.6 4 - 6.1 - 7.3 5 - 6.0 - 7.2 6 - 5.6 - 7.0 7 - 5.3 - 6.8 8 - 5.3 - 6.7 9 - 5.2 - 6.6

66

Figure S14. Docking results of AZ960 and ERK8 orthologue s. (A) Tb ERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. TbERK8 shown in violet cartoon with Mg 2+ in green sphere. (B) TbERK8 docking results with best pose (1) displayed in stick and colored i n red with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick an d colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best po se (1) displayed in stick and colored in red and by atom type with Mg 2+ as a green sphere .

67

Table S8. Docking free energies of poses for AZ960 docked into TbERK8 and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDo ck Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by low est energy score and frequency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 7.9 * - 8.8 * 2 - 7.8 - 8.8 3 - 7.8 - 8.7 4 - 7.7 - 8.4 5 - 7.6 - 8.4 6 - 7.5 - 8.3 7 - 7.5 - 8.3 8 - 7.5 - 8.1 9 - 7.5 - 8.0

68

Table S9. Protein - ligand interactions of known inhibitors with TbERK8 and HsERK8 from molecular docking results. Interactions were identified using Schrödinger - Maestro fingerprinting feature. Ligand Protein Electrostatic Polar Neutral Hydrophobic Backbone Y24, V27, A40, I74, K42, R56, E60, HsERK8 Q21 F92, F9, L144, M95 D155 C154 ATP L19, V27, A40, I44, G20, Q21, G22, K42, K75, TbE RK8 - Y45, L99, L143, G25, I26, K43, D155 V145, F156 N142 Difference - 1 electrostatic - l hydrophilic +1 hydrophobic +6 backbone K42, R5 6 , E6 0 , Y24, V27, I74, F9 2 , G 25 , S141, HsERK8 S141, N142 K139, D155 C154 D155 Ro318220 K42, R5 6 , V27, M138, L143, Q21, G22, G25, TbERK8 N142 D155, R182 F156, L158 K43, N142 Difference - 1 electrostatic - 1 hydrophilic +2 backbone K42, E60, Y24, V27, A40, I74, E93, M95, T97, HsERK8 Q21 D155 F9 2 , L144 D98 AZ960 V27, I44, F48, TbERK8 K42, D53 , R56 - G22, K43, A54 F 156 , L158 Difference - 1 hydrophilic - 1 backbone

69

Figure S1 5 . ERK8 orthologue - specific inhibitor interactions within ATP active sites. (A) HsERK8 and Ro318220. (B) TbERK8 and Ro318220. (C) HsERK8 and AZ960. (D) TbERK8 and AZ960. HsERK8 is represented in navy cartoon with interacting or key residues shown in stick an d colored by atom type with Mg 2+ displayed as green sphere. TbERK8 is represented in violet cartoon with interacting or key residues shown in stick and colored by atom type with Mg 2+ displayed as green sphere. Ro318220 is shown in light blue stick and colo red by atom type. AZ960 is shown in teal stick and colored by atom type. Protein - inhibitor interactions measured in angstroms. Image is sho wn zoomed into the ATP active site. Inserts in each image is 90 degree turn from original position.

70

Fig ure S16. AZ960 aromatic interactions with ERK8 orthologue s. (A) Full structure overlay of HsERK8 and TbERK8. (B) TbPhe156 interaction with AZ960. (C) HsPhe97 aromatic interaction with AZ960. (D) Overlay of Asp155 and Phe156 in HsERK8 and TbERK8, displaying a shift in residues. HsERK8 shown in navy cartoon with interacting residues shown in stick. TbERK8 shown in violet cartoon with interactin g residues shown in stick. AZ960 displayed as stick shown in gray. Mg 2+ shown as green sphere.

71

Figure S1 7 . Docking results of famotidine and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy s cores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best pose (1) displayed i n stick and colored in red and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in s tick and colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best pose (1) displayed in stick and colored in red and atom type with Mg 2+ as a green sphere.

72

Table S10. Docking free energies of poses for famotidine docked into TbERK8, and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined b y lowest energy score and frequency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 6.0 * - 6.1 * 2 - 5.8 - 6.1 3 - 5.8 - 6.1 4 - 5.8 - 6.1 5 - 5.6 - 5.9 6 - 5.6 - 5 .9 7 - 5.5 - 5.9 8 - 5.5 - 5.9 9 - 5.4 - 5.8

73

Figure S1 8 . Docking results of fludrocortisone and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best pose (1) displayed in stick and colored in red and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scor es colo red as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best pose (2) displayed in stick and colored in orange and atom type with Mg 2+ as a green sphere.

74

Table S11. Docking free energies of poses for fludrocortison e docked into TbERK8, and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energ y score and f requency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 7.9 * - 6.6 2 - 7.7 - 6.5 * 3 - 7.6 - 6.5 4 - 7.5 - 6.3 5 - 7.5 - 6.2 6 - 7.4 - 6.0 7 - 7.3 - 5.9 8 - 7.2 - 5.9 9 - 7.1 - 5.9

75

Figure S1 9 . Docking results of fluprednisolone and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbER K8 docking results with best pose (1) displayed in stick and colored in red and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 s hown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best pose (1) displayed in stick and colored in red and atom type with Mg 2+ as a green sphere.

76

Table S12. Docking free energies of poses for fluprednisolone dock ed into TbERK8, and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and f requency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 7.7 * - 6.7 * 2 - 7.6 - 6.6 3 - 7.4 - 6.6 4 - 7.3 - 6.5 5 - 7.3 - 6.5 6 - 7.2 - 6.4 7 - 7.1 - 6.4 8 - 6.9 - 6.3 9 - 6.7 - 6.3

77

Figure S 2 0 . Docking results of idarubicin and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best pose (1) d isplayed in stick and colored in red and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best pose (1) displayed in stick and colored in red and atom type with Mg 2+ as a green sphere.

78

Table S13. Docking free energies of poses f or idarubicin docked into TbERK8, and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of add itional poses with similar positioning (represented by an aste risk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 8.3 * - 8.6 * 2 - 8.1 - 8.5 3 - 8.1 - 8.2 4 - 7.6 - 8.2 5 - 7.6 - 7.8 6 - 7.6 - 7.8 7 - 7.3 - 7.7 8 - 7.3 - 7.6 9 - 7.0 - 7.5

79

Figure S 21 . Docking results of prednisolone and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best pose (2) displayed in stick and colored in orange an d by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 s hown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results wit h best pose (2) displayed in stick and colored in orange and atom type with Mg 2+ as a green sphere.

80

Table S14. Docking free energies of poses for prednisolone dock ed into TbERK8, and HsERK8 energy minimized structure. Energies are shown in k cal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kca l affinity (kcal mol - 1 ) mol - 1 ) 1 - 7.5 - 6.4 2 - 7.4 * - 6.3 * 3 - 7.3 - 6.3 4 - 7.1 - 6.3 5 - 7.1 - 6.2 6 - 6.5 - 6.1 7 - 6.5 - 6.0 8 - 6.4 - 6.0 9 - 6.4 - 6.0

81

Figure S 22 . Docking results of sildenafil and ERK8 orthologue s. (A) TbERK8 docking resul ts with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best p ose (1) displayed in stick and colored in red and by atom type with Mg 2+ as a green sphe re . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docking results with best pose (1) displayed in stick and c olored in red and atom type with Mg 2+ as a green sphere.

82

Table S15. Docking free energies of poses for sildenafil docked into TbERK8, and HsERK8 energy m inimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning (represented b y an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 7. 6 * - 8.6 * 2 - 7.6 - 8.4 3 - 7.5 - 8.3 4 - 7.5 - 8.3 5 - 7.4 - 8.2 6 - 7.4 - 8.2 7 - 7.4 - 7.9 8 - 7.3 - 7.9 9 - 7.3 - 7.9

83

Figure S 23 . Docking results of AZD5438 and ERK8 orthologue s. (A) TbERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10 with Mg 2+ as a green sphere . (B) TbERK8 docking results with best pose (1) displayed in stick and col ored in red and by atom type with Mg 2+ as a green sphere . (C) HsERK8 docking results with all poses generated displayed in stick and colored by energy scores colored as in Figure S10. HsERK8 shown in navy cartoon with Mg 2+ in green sphere. (D) HsERK8 docki ng results with best pose (2) displayed in stick and colored in orange and atom type with Mg 2+ as a green sphere.

84

Table S16. Docking free energies of poses for AZD5438 docked into TbERK8, and HsERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning (represented by an asterisk ) . TbERK8 HsERK8 Pose # affinity (kcal affinity (kcal mol - 1 ) mol - 1 ) 1 - 8.1* - 8. 0 2 - 7. 5 - 8. 0* 3 - 7. 3 - 7.9 4 - 7. 3 - 7 .8 5 - 7. 0 - 7.6 6 - 7. 0 - 7.4 7 - 7. 0 - 7. 4 8 - 7. 0 - 7. 3 9 - 7. 0 - 7. 3

85

Table S 17 . Protein - small molecule interaction s of identified FDA - approved drugs with similar physicochemical characteristics to Ro318220 and AZ960 with TbERK8 and HsERK8 from molecular docking results . Interactions were identified using Schr ö dinger - Maestro fingerprin ting feature. Ligand Protein Electrostatic Polar Neutral Hydrophobic Backbone Y24, V27, A40, HsERK8 D98 Q21, S141 M95 L144, C154 N72, M95, Famotodine V27, Y94, L99, TbE RK8 K75, E96 D98, L144, V145, V153 N146 Difference +1 - 2 +4 K42, K139, HsERK8 S141, N142 Y24, V27 D155 M95, D98, Fludrocortisone K75, E96, H100, N142, TbERK8 S147 L99, V145, V153 D155 L143, L144, N146, K152 Difference - 1 +1 +8 K42, K43, HsERK8 N142 Y24, I44, F45 G25 D155 Y94, D98, Fluprednisolone N142, L143, TbERK8 K75, E96 L99, V145, V153 L144, N146, K152, A154 Difference - 1 - 1 +7 Y24, V27, I74, K42, D98, M95, T97, HsERK8 Q21 F92, F94, L144, D155 S142 C154 M95, D98, Idarubicin K75, E96, H100, N142, TbERK8 S147 L99, V145, V153 D155 L143, L144, N146, K152 Difference - 4 +5 K42, K139, HsERK8 S141, N142 Y24, V27 D155 Prednisolone K75, E96, Y94, L99, V145, TbERK8 N145 M97, L147 D155 V153 Difference - 1 +2 +2 V27, A40, I44, K42, E60, HsERK8 L64, I74, F92, G25, S142 D155 L144, C154 L76, M97, Sildenafil K75, E93, E96, A40, Y94, L 99, T100, H103, TbERK8 D97, D155 V145, V153 N145, L147, N149 Difference +2 - 2 +5 Y24, V27, I74, G20, E93, HsERK8 K42, D98 Q21 F9 2, L144, C154 D96, T97 L19, V27, I44, AZD5438 K42, D53, E60, TbERK8 T57 Y45, L99, F156, G25, K43 K75, D155 L158 Difference +3 +1 - 2

86

Figure S24. Ligand efficiencies of known inhibitors and FDA - approved molecules. Ligand efficiencies are calculat ed in kcal/mol/heavy atom count. (A) TbERK8 ligand efficiencies organized from highest efficiency to lowest. (B) HsERK8 ligand efficiencies organized from highest efficiency to lowest. (C) Comparison of TbERK8 and HsERK8 ligand efficiencies organized by si ze ( Å 3 ) of each ligand. Ligand efficiency calculate d using all poses output energies from AutoDock Vina and heavy atom numbers with standard deviation.

87

Table S18. New compound constructs built from AZ960 scaffold, number of rotatable bonds, heavy atoms, a nd volumes. Rotatable Hea vy Volume Compound Structure Bonds Atoms ( Å 3 )

1 7 29 319.9

2 5 28 303.7

3 5 27 292.7

4 4 25 273.5

5 5 25 280.0

6 7 31 338.4

7 7 31 386.7

88

Table S1 9 . Docking free energies of new small molecule con structs docked into Hs ERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest energy score and frequency of additional poses with similar positioning (represented b y an asterisk ) . Compound Number Pose 1 2 3 4 5 6 7 1 - 9.1 * - 9.7 * - 9.5 * - 9.2 - 8.7 * - 9.0 * - 9.9 * 2 - 9.0 - 9.4 - 9.3 - 9.1 * - 8.5 - 8.8 - 9.3 3 - 8.8 - 9.4 - 9.2 - 9.0 - 8.5 - 8.8 - 9.0 4 - 8.8 - 9.4 - 9.1 - 8.9 - 8.3 - 8.8 - 8.9 5 - 8.8 - 9 .3 - 9.1 - 8.9 - 8.3 - 8.7 - 8.9 6 - 8.8 - 8.9 - 9.0 - 8.9 - 8.2 - 8.4 - 8.8 7 - 8.7 - 8.9 - 8.9 - 8.6 - 8.1 - 8.2 - 8.4 8 - 8.6 - 8.9 - 8.8 - 8.4 - 8.1 - 8.1 - 8.4 9 - 8.6 - 8.9 - 8.6 - 8.4 - 8.0 - 8.0 - 8.1 A ve . - 8.80 - 9.20 - 9.06 - 8.82 - 8.30 - 8.53 - 8.86

Table S 20 . Docking free e nergies of new small molecule constructs docked into TbERK8 energy minimized structure. Energies are shown in kcal/mol. AutoDock Vina was used for docking, which generates up to nine poses. P ose used in the analysis determined by lowest ene rgy score and fr equency of additional poses with similar positioning (represented by an asterisk ) . Compound Number Pose 1 2 3 4 5 6 7 1 - 8.2 * - 8.3 * - 8.0 * - 8.2 * - 8.1 - 9.4 - 8.6 2 - 8.0 - 8.2 - 7.9 - 8.0 - 7.9 * - 9.0 * - 8.5 * 3 - 7.9 - 8.1 - 7.8 - 8.0 - 7.8 - 8.9 - 8.5 4 - 7.9 - 8.0 - 7 .6 - 7.9 - 7.6 - 8.8 - 8.3 5 - 7.7 - 7.9 - 7.6 - 7.7 - 7.6 - 8.5 - 8.1 6 - 7.6 - 7.9 - 7.5 - 7.7 - 7.6 - 8.4 - 8.0 7 - 7.6 - 7.8 - 7.5 - 7.7 - 7.5 - 8.4 - 8.0 8 - 7.6 - 7.8 - 7.3 - 7.5 - 7.4 - 8.3 - 8.0 9 - 7.5 - 7.7 - 7.2 - 7.5 - 7.4 - 8.3 - 8.0 Ave. - 7.78 - 7.97 - 7.60 - 7.80 - 7.66 - 8.67 - 8.22

89

Table S 21 . Protein - small molecule interaction s of new chemical compounds with TbERK8 and HsERK8 from molecular docking results . Interactions were identified using Schr ö dinger - Maestro fingerprinting feature. Compound Protein Electrostatic Polar Neutral Hydrophobic Backbone K42, E60, D98, Y24, V27, A40, E93, M95, D96, HsERK8 Q21 D155 I74, F92, L144 T97 L19, V27, I44, 1 K42, D53 , K75, Q21, G22 , G25, TbERK8 - Y45, A47, V145, D155 K43 F156 Difference - 1 +1 Y2 4, V27, A40, K42, E60, D98, E93, M95, D96, HsERK8 Q21 I74, F92, L144, D155 T97 C154 2 V27, I44, Y45, K42, D53 , K75, Q21, G22 , G25, TbERK8 - A47, L99, V145, D155 K43 F156 Difference - 1 K42, E60, D98, Y24, V27, A40, E93, M95, D96, HsERK8 Q21 D155 I74, F92, L144 T97 3 K42, E 60 , K75, V27, I44, V145, G 2 0 , G22 , G25, TbERK8 - D155 F156, L158 K43 Difference - 1 - 1 Y24, V27, A40, M95, D96, T97 , HsERK8 K42, D98, D155 Q21 F94, L144 S141 L19, V27, L99, 4 G 2 0 , Q21, E96, TbERK8 K75, E96, D98 - V102, V145, T97 F156 Difference - 1 +1 Y24, V27, A40, HsERK8 D98, D155 Q21 I74, F92, F94, E93, D96, T97 L144 5 K42, K75, E96, L19, V27, L99, G 2 0 , G22 , G25, TbERK8 - D155 V145, F156 I26, K43, N142 Difference +2 - 1 - 1 +3 E93, M95, D96 , Y24, V27, A40, HsERK8 K42, D98, D155 Q21 T97 , F94, L144, C154 S141 6 L19, V27, I44, K42, E60, E96, Q21, G22 , G25, TbERK8 T57 L99, V145, F156, D98, D155 N142 L158 Difference +2 +1 - 1 Y24, V27, A40, G25, M95, D96, HsERK8 K42, D98, D155 Q21 F92, F94, L144 T97 , S141 V27, I44 , Y45, 7 K42, D53 , K75, G20, Q21, G22 , TbERK8 T57 A47, L99, V145, D155 G25, K43, F48 F156 Difference +1 +1 +1

90

HsERK8 TbERK8 Binding Binding Binding Binding Compound Energy Efficiency Energy Efficiency (kcal/mol) (kcal/mol / heavy (kcal/mol) (kcal/mol / heavy atom) atom) Ro318220 - 3.83 ± 15.24 - 0.21 ± 0.01 - 10.70 ± 8.03 - 0.18 ± 0.02 AZ960 - 98.60 ± 10.83 - 0.30 ± 0.01 - 107.97 ± 11.24 - 0.26 ± 0.01 AZD5438 - 96.00 ± 14.95 - 0.26 ± 0.01 - 109.44 ± 11.76 - 0.27 ± 0.01 1 - 48.30 ± 22.38 - 0.30 ± 0.01 - 44.03 ± 19.61 - 0.29 ± 0.01 2 - 78.59 ± 1 8.42 - 0.33 ± 0.01 - 68.79 ± 27.81 - 0.29 ± 0.01 3 - 98.95 ± 10.89 - 0.34 ± 0.01 - 95.47 ± 27.68 - 0.28 ± 0.01 4 - 79.86 ± 13.35 - 0.35 ± 0.01 - 68.41 ± 23.27 - 0.31 ± 0.01 5 - 34.59 ± 13.44 - 0.33 ± 0.01 - 47.83 ± 11.71 - 0.31 ± 0.01 6 - 21.37 ± 17.93 - 0.28 ± 0.01 - 3 4.82 ± 5.47 - 0.28 ± 0.01 7 - 73.56 ± 19.08 - 0.29 ± 0.02 - 51.55 ± 41.98 - 0.27 ± 0.01

Table S22 . Ligand binding energies and efficiencies of known inhibitors and constructed chemical compounds. Ligand efficiency calculated using all poses output energies f rom AutoDock Vina and heavy atom numbers with standard deviation . Binding energies evaluated us ing all pose outputs from AutoDock Vina and Schr ö dinger - Maestro Molecular Mechanics/Generalized Born Surface Area (MM - GBSA) feature to calculate energy of comple xes with standard deviation.

91

CHAPTER 3: ASSESSING THE INFLUENCES OF ATP AND SPHINGOSIN E ON THE PROTEIN DYNAMICS OF SPHINGOSINE KINASE ISOFORMS USING MOLECULAR DYNAMICS SIMULATIONS

Amanda K. Sharp , 1 David R. Bevan , 1, 2 and Anne. M. Brown , 1,2, 3

De partment of Biochemistry , 1 V irginia Tech Center for Drug Discovery, 2 and University Libraries , 3 Virginia Tech

Attribution: AKS wrote this chapter using input from AMB and DRB to make corrections and co ntent suggestions. AKS performed the MD simulations and analysis . All autho r s collectively decided on project plans and direction.

92

3 .1 Abstract

Elevated cellular levels of the lipid signaling molecule, sphingosine - 1 - phosphate

(S1P), are associated with a variety of diseases, including various cancers an d fibrosis . S1P is the product of the class of enzymes, sphingosine kinases ( SphK). SphKs exist in two isoforms, SphK1 and SphK2, and given their direct role in controlling cellular levels of

S1P, are attractive targets in drug discovery. Exploring and dis tinguishing the SphK isoform binding sites to identify divergent structural m orphologies is essential in developing isoform - selective inhibitors and elucidating more on the biological implications of native substrates, sphingosine and ATP, binding to SphKs . Atomistic molecular dynamics (MD) simulations were utilized in this work to determine the influence of sphingosine and ATP on SphKs in order to understand structural morphologies and determine binding pocket exploitability of these kinase isoforms . Our w ork offers insight into the ove r all structural stability of the SphK isoforms , providing evidence that the

SphK2 homology model can be utilized as predictive computational tool for inhibitor design. We also identified an arginine - rich domain essential for ATP binding and a glutamate that could play the role of a catalytic aspartate . In sphingosine binding, SphK isoforms displayed similar protein - small molecule interaction patterns but provided insight into the importance of Il e 174 in SphK1 and Val304 in Sph K2 in binding differences. This work offers atomistic insight into the biolog ical functio n of SphK isoform binding patterns as well as classifying structural features for exploitation in drug discovery .

93

3.2 Introduction

Sphingosine - 1 - phosphate (S1P) is a regulatory lipid signaling molecule involved in

cell growth, migration, physiology, and proliferation pathways. 76, 131 Elevated cellular concentrations of S1P ha ve been linked with diseases such as fibrosis, 76, 132 Alzheimer’s disease, 133 sickle cell anemia, 13 4 multiple sclerosis, 76, 135, 136 inflammation , 76, 137

cardiovascular disease, 76, 138 and various types of cancers. 139 - 142 S1P is formed by the ph osphorylation of sphingosine by the class of enzymes, sphingosine kinases (SphKs).

SphK s exist in two isoforms, (SphK1) and sphingosi ne kinase 2

(SphK2) ( Figure 1 ) , and are involved in the ceramide synthase and salvage pathways to

cont rol cellular concentrations of ceramide, sphingosine, and S1P. 76 Ceramide, a precursor

to many sphingolipid molecules, is an essential pro - apoptotic sphingolipid that is produced either via de n ovo synthesis or sphingolipid recycling. 76, 143 Regulation of the ceramide/S1P

ratio is essential in cell h omeostasis and an imbalance can lead to disease. 144, 145

S1P stimulates a family of G - protein coupled receptors, the S1P receptors

(S1PRs), 146 and can be shuttled across the by the spinster 2 homologue

(Spns2) transporter. 76 Spns2 and many of the S1PRs associated with S1P are involved in various diseases, such as multiple sclerosis 76, 135 and fibrosis, 76, 132 and are involved in inflammation and autoimmune conditions, 76, 137 thus making the inhibition of S1P produc t ion an attractive approach for drug discovery. Control over the cellular concentrations of ceramide, sphingosine, and S1P are tightly regulated and are considered to be controlled “inside - out,” or act both intracellularly and extracellularly via autocrine or paracrine action. 147, 148 The r egulatory mechanism is influenced by the cellular levels of the , rather than the proteins, and can indirectly control the expression of

94

SphKs. 149 Several studies have demonstrated that SphKs are independent of S1PRs intracellular activation, indicating that SphKs alone can influence cellular survival. 148, 150,

151

SphKs are the only known enzymes that convert sphingosine into S1P, rendering

them an attractive target for reducing sub cellular concentrations of S1P. 70 SphK 1 and

SphK2 are isoforms that differ in cellular localization and show 53% sequence similarity

and 39% sequence identity. 152 SphK1 is the smaller iso form, containing 384 amino acids

and is localized to the cytoplasm, 153 with known translocation into the plasma

membrane. 147, 154 SphK1 is found to be involved in cell ular proliferation and, due to its role

in cell migration, may aid in metastasis. 139 - 142 SphK2 is comprised of 618 amino acids and

contains a nuclear localization signal. This nuclear localization signal, predicted to be

located at positions 122 - 130, 155 primarily localizes SphK2 to the cell nucleus, endoplasmic

reticulum, and mitochondria. 147 Many studies have ob served that SphK2 may hold a n

opposite role compared to SphK1, 1 50 dependent on the subcell ular localization, 147, 150 and

can potentially induce apoptotic pathways by increasing cellular levels of ceramide. 150 For

example, Sph K2 localization in the nucleus holds anti - proliferation characteristics,

preventing new synthesis of DNA. 74, 147 Due to the contrasting roles of t hese two isoforms,

it is essential to have a better understanding of their structural differences to aid in

understanding the necessary and spe cific chemistry performed by these kinases and when

developing isoform - specific inhibitors for therapeutic design given the role of both

isoforms in cell proliferation and disease.

In silico techniques can provide useful and atomistic insight for protein s tructure -

function relationships and aid in the drug discovery process. Computational techniques

95

have been used in previous studies to help aid in inhibitor development of SphKs, but to - date, there are no computational studies that probe the influence of bi ological substrates

(sphingosine and ATP) on protein structur al dynamics. These insights can give us more underst anding on biological roles by understanding the isoform structural differences and better directs our drug design efforts. Previous work has identified key residues in the sphingosine binding cavity essential for inhibition of SphKs through molecular docki ng and structural analysis. 152 This study seeks to investigate the biologic al features of SphK isoforms, exploring the apo (no sphingosine or ATP bound), ATP - bound, and ATP - sphingosine complex systems using MD simulations to develop a dynamic basis for SphK isoform protein structure - function relationship. This work show s the ove rall structural integrity of the SphK1 crystal structure and the SphK2 homolog y model, insight into ATP - binding patterns and the importance of an arginine - rich region, and then tells us more about the biolog ical influences o n the SphK isoforms protein dyna mics, ultimately helping drug discovery for these unique targets.

3.3 Material s and Methods

3.3.1 Molecular Dynamics Simulations of SphKs

Molecular dynamics (MD) simulations and system construction were performed using the GROMACS v2019.3 software suite. 156 SphK1 structures used for the SphK1 MD simulation systems, downloaded from the Open Science Framework

( https://o sf.io/82n73/ ), 152 are based on the PDB ID: 3VZB, 157 which was modified by

W orrel l, et al. by transforming ADP from PDB ID: 3VZD 157 to contain A TP by adding a phosphate to ADP . 152 Homology model structure files of SphK2 containing

ATP+sphingosine were downloaded from the work done in Worrell, et. al . from the Open

96

Science Framework ( https://osf.io/82n73/ ) 152 Systems containing SphK1 and SphK2 apo,

ATP, and ATP+sphingosine structures were utilized in this work. The CHARMM36 M arch

2019 release (CHARMM36m) force field was used. 158 CHARMM general force field

(GGenFF) webserver was used for ligand parameterization. 159 Systems w e re solvated using the water model TIP3P 160 - 162 modified for CHARMM and 0.150 M NaCl was randomly added to the solvent to neutralize the system and reflect a standard, biological ionic concentration. 163, 164 Energy minimizat ion w as implemented using the steepest descent minimization algorithm with a maximum force of 1000 kJ mol - 1 nm - 1 . Four replicates of each system (apo, ATP - only, and ATP+sphingosine) for each isoform were created at the onset of equilibration using differen t ran dom velocities in the first equilibration phrase.

Equilibration was completed in two phase s , first at 300 K for 100 ps using a canonical

(NVT) ensemble with an integration time step of 2 fs using the v - rescale modified

Berendsen coupling method. 165 Secondly, an isothermal - isobaric (NPT) ensemble was applied to maintain pressure (1 bar) and temperature (300 K) for 100 ps using the

Ber endse n 165 pressure coupling method. Restraints were placed on the protein and substrate for both phases of equilibration. Production si mulat ions removed all protein and substrate restraints and were run using the Parrinello - Rahman 166, 167 barostat pressure coupling and V - rescale temperature coupling method 168 at 300 K and 1 bar. P arallel Linear

Constraint Solver ( P - LINCS) was used as the fourth order algorith m par allel constraints to pr ovide constrain t s on bond s involving hydrogens. 169 Neighbors lists were calculated using the Verlet 170 cutoff - scheme using neighboring grid cells with a cutoff of 1.2 nm for nonbonded interactions. Electrostatic forces were calculated using the smooth particle mesh Ewald 171, 172 method, which included using a Fourier grid space of 0.16 nm and cubic

97

interpolation. Periodic boundary conditions in the x, y, and z dimension s. All systems for

both SphK1 and SphK2 isoforms contained four r eplicates and were simulated for a total

of 1000 ns (1 µs) for each replicate of every system and isoform. Analysis was performed

over the last 250 ns of each simulation based on root mean - sq uare deviation (RMSD)

convergence representing a total of 12 µs o f aggregated, sampled data in this work.

3.3.2 MD Analysis

Analysis of all trajectories was performed with GROMACS 156 and Schrödinger -

Maestro. Clustering of the simulation structures to generate dominant morphologies that

were then used in subsequent analysis was performed by clustering each replicate of each

system over the last 250 ns, with a backbone RMSD cutoff of 0.2 nm. This procedure for

clustering was established by Daura, et al . 173 and produced dominant morphologi es that

represented the converged simulation ( Figure S2 - S3, Table S1 ). A further analysis was

performed to determine the deviation between each replicate’s dominant morphology to

determine a representative structure for additional structural analysis by ta king the RMSD of each structure using PyMOL 116 ( Table S2 - S4 ). All four repli cates were analyzed and combined by taking the calculated averages and standard deviations. Analysis was performed across each isoform systems and then across isoforms to determine similarities and differences between structures and unders tand the influenc e on native substrate binding. Schrödinger - Maestro Suite (v. 2020 - 1) 115 was used to evaluate Molecul ar

Mechanics/Gene ralized Born Surface Area (MM - GBSA) bi nding energies and protein - substrate interactions through F ingerprinting with Canvas . 174 Meta Pocket 2.0 webserver 113

was used to identify bindi ng pockets and pocket volume was calculated usin g Meta Pocket

output i n UCSF Chi mera. 114 Visu alization and rendering of images were perfo r med using

98

PyMOL. 116 Simulation input files and s cripts can be found at our Open Science Framework

page ( https://osf.io/82n73/ ).

3.4 Results and Discussion

Assessing and probing the role of protein dynamics on the structure - fu nction relationship of sphin gosine kinases (SphKs) is important given the unique and specific chemistry that these kinase isoforms perform. Biologically, these kinases and their substrate/products are involved in a variety of signaling cascades that greatl y influence cellular functio n. 76, 131 The biological influences the native molecules, sphingosine and

ATP, and the subsequent impact the protein morphology of SphKs can shed light on binding mechanics and kinase activity control. In turn, this base k nowledge of biological dynam ic influence by substrate presence can also suggest aspects that can be utilized in drug discovery and inhibitor design. Given limitations of molecular docking, sampling only energetically favorable binding positions of flexible substrates and ligands on a static protein structure, m olecular dynamic s (MD) simulations can provide further insight into amino acids essential in substrate and ligand binding, influence of native substrates on protein structure, and connect biological aspects of the protein struc ture - function relationship of SphKs to drug design.

Previous studies have investigated protein dynamics of SphKs but have focused on integrating computational and experimental work to aid in inhibitor design , and none have comp utationally explored the inf luence of biological substrates on protein dynamics of

SphKs . 175 - 185 This work branches from th e work previously performed using MD simulations and probes the biological applications of each SphK isoform structural morphology in three systems: apo (protein only) , ATP - only bound, and an ATP -

99

sphingosine bound system . These systems highlight the multip le states that SphKs can

function biologically, as well as allow for comparisons between substrates bound and

between i soforms. MD simulations can also offer an understanding into the structur al

morphology changes influenced by protein and small molecule i nteractions. In

investigating the dynamic changes in protein structures, we explore the structural dynamics

that occur as a result of native substrate presence and identify dynamic movement that

might implicate the influence of structural differences on th e biological roles of these

kinases. We identify new residues predicted to be essential in ATP - sphingosine binding

and connect structural and binding differences between the SphK isoforms, highlighting

their isoform specific differences. Collectively, this work provides a deeper understanding

in the structural differences of these SphK isoforms and determine new exploitabl e features

for isoform - selective inhibitor design for drug discovery.

3.4.1 Influence of ATP and Sphingosine on SphK Isoform Structural M orphology

Investigation into the overall structural morphology changes influenced by the

native substrates, ATP+sphing osine, on the SphK isoforms can aid in understanding the

implication of structural dynamic changes on potential biological functions. MD

simulations in this work were built using the all - atom force field built for proteins,

CHARMM36m, due to the recent ad vancements in protein folding and validation against

NMR data , as well as the ability to incorporate inhibitors and other small molecules in

future works. 158, 186 To determine system stability and simulation convergence, root mean - square deviation (RMSD) was calculated over the entire simulation ( Figure S 1 ) and represents convergence of each system by 75 0 ns for all systems. Other systems reached convergence, which is defined as having not statistically variant RMSD from previous time

100

points, at earlier time points and maintained convergence until the end of the simulation.

Therefore, to equally sample ov e r each replicate, simulation system, and per isoform, the

last 250 ns of simulation time was aggregated across replicates for all analysis. Dominant morphologies for each replicate of each system were generated using RMSD clustering on

the protein backbon e , with a cut - off of 0.2 nm ( Figure S 2 - S 3 ). Subsequently, to compare

within and across isoforms, a dominant morphology from each system and isoform was

calculated using RMSD of the protein structures and small molecules across the last 250

ns ( Table S 1 - S4 ). RMSD showed stable structures for SphK1 through most of th e

simulation, but the last 250 ns were used to best compare against SphK2 systems which

have a larger, more dynamic loop region. The structure with the median RMSD was chosen

as a representative fo r cross system analysis. Dominant morphology percentages fo r the

most representative structures of SphK1 apo, ATP bound, and ATP+sphingosine bound

systems were 89.0%, 99.9%, and 79.1%, respectively ( Figure 2 ). The most representative structures of SphK2 ap o, ATP bound, and ATP+sphingosine bound systems dominant mo rphology percentages were 28.3%, 35.8%, and 8.90%, respectively ( Figure 2 ). The low

cluster percentage in the SphK2 ATP+sphingosine bound system is due to the size of the

loop region but displays t he most similarity between replicates. The dominant morphol ogy percentages, coupled with the RMSD analysis indicate the systems are stable and representative of the SphK structures and evaluation of the last 250 ns are suitable for structural analysis, dis cussed below.

Residue fluctuation as a result of the prese nce of either ATP and/or sphingosine binding on SphK isoforms was evaluated by using root mean - square fluctuation (RMSF)

( Figure S 4 - S5 ). Averages across replicates were taken to better understand the fluctuation

101

and standard deviation ( Figure S4 - S5 ) per system a nd across isoforms. For SphK1 apo,

regions with high fluctuation are observed in residue regions 175 – 200, 210 – 240, and 260 –

280. Investigating the protein structure, the amino acids in regions 175 – 2 00 are in the α7 -

α8 helix loop region, a region involved in sphingosine binding as well an important region in ubiquitination leading to proteasomal protein degradation in SphK1 via ubiquitination at Lys183. 187 Regions 210 – 240 are in the domain containing an elongated random coil region, also referred to as th e regulatory loop or the “R - loop.” 188 Re sidues 260 – 280 is an a rea rich in random coils between the N - terminal and C - termi nal lobes within the active site, containing both the ATP and sphingosine binding regions. SphK1 ATP systems displayed fluctuation with similar intensity to SphK1 apo systems in residue region 210 –

240, and the elongated random coil region, while the residu e region 175 – 200, the α7 - α8 helix loop, and region 260 – 280, the random coil in the active site, are measurably lower and show less fluctuation, with an average of 0.2 nm and 0.3 nm lower RMSF value s, respectively ( Figure S4 - S5 ) in comparison to SphK1 apo stru ctures. The lack of fluctuation in regions identified from SphK1 ATP systems, such as in the α7 - α8 helix loop and the random coil in the active site, indicates that these regions may be anchored by ATP binding, reducing structural flexibility ( Figure S 4 - S5 ). When sphingosine is introduced to the

SphK1 ATP + sphingosine system, there is minimal difference in the fluctuation in the α7 -

α8 helix loop and elongated random coil region compared to the SphK1 A TP systems, while the random coil (residues 260 - 280) in the active site fluctuates more similarly to the

SphK1 apo system. In SphK2 systems, RMSF show ed high fluctuation in nearly all regions between residues 350 - 475. This entire region is encompassed by t he elongated random coil region of SphK2, hypothesized to be important in membrane binding and contains a nuclear

102

export sequence. 189, 19 0 All SphK2 systems ha d fluctuation between residues 300 - 340, which is the α7 - α8 helix loop, and be tween 510 - 530, the random coil in the active site. This aligns with the fluctuations observed in the SphK1 systems indicating the importance of these do mains in substrate binding. In the SphK2 systems, the only region with differences compared to the SphK1 systems is a fluctuation peak between residues 575 - 585. This fluctuation is larger in SphK2 apo and ATP+sphingosine bound systems, with approximately 0 .1 nm increase in residue fluctuation than the ATP bound system. In SphK2 apo, this peak is extended to residues 600, containing domains closer to the sphingosine binding site of the protein. Re sidues 585 - 600 have a larger fluctuation by 0.1 nm for SphK2

A TP bound systems and 0.2 nm for SphK2 ATP+sphingosine bound systems. Given the increase in residue fluct uation in this region (residues 585 - 600), it is hypothesized ATP and sphingosine binding are the influencing these fluctuation differences. Residue p osi tions 575 - 585 is occupied between two β sheets, which are solvent exposed and outside of the binding cav ity and are likely have movement due to solvent accessibility. Given the observed fluctuations in the α7 - α8 helix loop and the random coil within the ac tive site in both SphK isoforms and the fluctuation occurring solely in SphK2 region 585 - 600 this inform ation shows these areas could be playing an important role on structural morphology changes upon small molecule binding.

To understand the structural changes potentially influenced by ATP and/or sphingosine binding, secondary structure of each kinase was analyzed over the last 250 ns

( Table 1 , Table S5 ). The SphK1 apo structure consist of 28 ± 1 % α helices, 29 ± 1 % β strands, and 43 ± 1 % rand om coils. The SphK2 apo structure consist of 24 ± 1 % α helices,

20 ± 2 % β stands, and 56 ± 3 % random coils ( Table 1 ). The larger percentage of random

103

coils is likely influenced by the elongated random loop re gion in SphK2, where in SphK1

it is 40 amino aci ds in length while in SphK2 it is nearly 150 amino acids in length. The

SphK1 ATP bound structure consist of 29 ± 1 % α helices, 29 ± 0 % β strands, and 43 ± 1 %

random coils. The SphK2 ATP bound structure consi st of 25 ± 1 % α helices, 21 ± 1 % β

strands, and 54 ± 1 % random coils ( Table 1 ). The SphK1 ATP+sphingosine bound

structure consist of 39 ± 1 % α helices, 29 ± 0 % β strands, and 42 ± 1 % random coils. The

SphK2 ATP+sphingosine bound structure consist of 25 ± 0 % α helices, 21 ± 1 % β strands,

and 55 ± 1 % random coils ( Table 1 ). While there appears to be a vast amount of disorder

with the random coils in the SphK2 systems, the overall structure is generally stable as

demonstrated through this secondary structure a nalysis, RMSD, and RMSD clustering. It

is hypothesized that the elongated random coil region, from residues 210 – 240 in SphK1

and 350 - 475 in SphK2, have a purpose in SphK2, such as membrane interactions or

forming complexes with other proteins, 188, 190 which require further exploration. Due to

the minimal secondary structure deviation from starting structures for each isoform and

each system, this indicates that the cry stallized structure and the homology model

structures maintains overall structural stability and were useful for previous molecular

docking experiments. 152

Radius of gyration and eccentricity were calculated to evaluate the overall structural

compactness and shape of the protein structures. Radius of gyration (R g ) was obtained a t

the onset of simu lations and also as an average over the last 250 ns of simulation time from

all system replicates ( Table S 6 ). SphK1 apo system was calculated to have a starting R g of

2.0 nm and ending R g of 2.1 ± 0.01 nm. SphK2 apo system was calculated to have a starting

R g of 2.4 nm and ending R g of 2.4 ± 0.0 3 nm. SphK1 ATP bound system was calculated to

104

have a starting R g of 2.0 nm and ending R g of 2.1 ± 0.01 nm. SphK2 ATP bound system

was calculated to have a starting R g of 2.4 nm and ending R g of 2. 4 ± 0.0 2 nm. SphK1

ATP+sphingosine bound system was calculated to have a starting R g of 2.0 nm and ending

R g of 2.1 ± 0.01 nm. SphK2 ATP+sphingosine bound system was calculated to have a

starting R g of 2.4 nm and ending R g of 2.4 ± 0.0 3 nm. The increase in R g in all the systems

from start to end of the simulation is likely due to protein relaxing in the solvent

environment and further readjustment of loop regions. Initially we hypothesized that, upon

substrate binding, SphK isoforms wou ld increase in compac tness, but interestingly no

compaction occurred around sphingosine and ATP binding domains. Comparing the

SphK1 and SphK2 systems, SphK1 systems are more compact compared to the SphK2

systems ( Table S 6 ). This is likely the influence of the elongated random coil regions in

SphK2 and eccentricity can help identify if compactness is influenced by the overall shape

of the structure. Eccentricity was calculated using the following equations:

2 = ( + ) 5 2 = ( + ) Inertia 5

2 = ( + ) 5

Eccentricity = 1 −

Comparing the eccentricity values across the SphK1 systems, the eccentricity stays

generally consistent ( Table S7 ). Comparing the SphK2 systems, SphK 2 apo system has

the smallest eccentricity at 0.74 ± 0.2 , indicating the most spherical system, while SphK2

ATP bound system has the largest eccentricity at 0.82 ± 0.1 , indicating the most

105

oblong/ellip tical system. SphK2 with ATP+sphingosine bound is betwe en the apo and ATP bound systems, at 0.78 ± 0.1 . Collect ively with R g values, it is observed SphK1 may not undergo any conformational changes with respect with substrate binding, while SphK2 may prompt slight conformational changes to accommodate substrate binding. To further support these influences on the str uctural morphology of SphK isoforms, Solvent -

Accessible Surface Area (SASA) can be explored.

SASA indicates protein surface regions and residues that are accessible for interacting with solvent molec ules, which can further investigate structural morpholog y changes in protein structures. Specifically, the rot a meric changes of the residues in the binding cavity of ATP+sphingosine can be assessed to determine if they stabilize or promote substrate bindin g. SphK1 apo system had a total SASA of 172 ± 3 nm 2 with 87 ±

2 nm 2 contributing from hydrophobic and 85 ± 2 nm 2 from hydrophilic surface areas ( Table

2 ). SphK1 ATP bound system had a total SASA of 165 ± 3 nm 2 with 83 ± 2 nm 2 contributing from hydrophobic and 83 ± 2 nm 2 from hydrophilic surface areas ( Table 2 ).

Interestingly, with ATP bound to SphK1, it appears that there is a smaller total protein

SASA. Considering the SphK1 R g , eccentricity, and secondary structure indicates that there are likely no major conformationa l changes occurring due to substrate binding in any system, it is possible that ATP binding influences rot a meric position of residues, but not necessarily major backbone movement. SphK1 ATP+sphingosine bound systems had a total SASA of 173 ± 3 nm 2 with 88 ± 2 nm 2 contributing from hydrophobic and 85 ± 2 nm 2 from hydrophilic surface areas. The SphK1 apo system and SphK1 ATP+sphingosine systems are similar in SASA characteristics. Similar patterns in SASA were observed when comparing SphK2 syste ms. SphK2 apo system had a total SASA of 238 ± 6 nm 2 with 131

106

± 4 nm 2 contributing from hydrophobic and 108 ± 3 nm 2 from hydrophilic surface areas.

SphK2 ATP bound system had a total SASA of 236 ± 5 nm 2 with 131 ± 33 nm 2 contributing from hydrophobic and 1 05 ± 3 nm 2 fro m hydrophilic surface areas. SphK2

ATP+sphingosine bound system had a total SASA of 240 ± 5 nm 2 with 134 ± 4 nm 2 contributing from hydrophobic and 106 ± 2 nm 2 from hydrophilic surface areas. With all the SphK systems having low differences SA SA values, this further confirming there are

no large conformational changes influenced by native substrate binding but could present

interesting information on the residue level .

SASA per residue was analyzed to obtain a deeper understanding on the ATP an d

sphingosine binding on the structural morphology of SphK isoforms ( Figure S 6 ). In SphK1

systems, SASA differences occur noticeably in Asp81, an identified essential residue in

enzymatic activity, 157 where it has a decrease in 0.21 ± 0.14 nm 2 in SASA in ATP bound

system and decrease of 0.26 ± 0.13 nm 2 in the ATP+sphingosine bound system. In the

SphK2 systems, Asp211, which is the residue position equival ent of Asp81 in SphK1, also

saw a difference in SASA, with a decrease in 0.06 ± 0.18 nm 2 in ATP systems and decrease

0.70 ± 0.14 nm 2 in ATP+sphingosine systems. These SASA changes in an essential catalytic residue supports that with SphK1 ATP bound and ATP+sp hingosine bound systems and the SphK2 ATP+sphingosine bound system s , the active site is becoming less solvent - accessible. The area rich in random co ils between the N - terminal and C - terminal lobes , residues 260 – 280 in SphK1 and residues 510 - 530 in SphK2, also contains notable changes in SASA. In SphK1 c ompared to apo systems, ATP bound systems have a decrease in 2.66 ± 0.71 nm 2 SASA and ATP+sphingosi ne bound systems have an increase in 0.23 ±

0.78 nm 2 SASA. In SphK2 compared to apo systems, ATP bound syste ms have a n increase

107

in 1.30 ± 0.71 nm 2 SASA and ATP+sphingosine bound systems have an increase in 0.12 ±

0.7 0 nm 2 SASA. This indicates the random coils in this region to be most influenced by

ATP binding , being attracted to ATP in SphK1 and repelled in Sph K2 . Lastly, the largest

SASA region changes are within the α7 helix in SphK1 residues 186 - 200 and SphK2 residues 300 - 315. In SphK1 compared to the apo system, ATP bound systems had a decrease in 0.77 ± 0.66 nm 2 SASA and ATP+sphingosine bound system had an in crease in

0.38 ± 0.66 nm 2 SASA within the α7 helix region. For the SphK2 systems compared to apo, ATP bound systems had a decrease in 1.7 5 ± 0.54 nm 2 SASA and ATP+sphingosine

bound systems had a decrease in 0.60 ± 0.52 nm 2 SASA values within the α7 helix. The α7

helix may be more important in sphingosine binding in SphK2 than in SphK1. Th e s e data support that SASA is overall decreased within the active site of both SphK isoforms upon

ATP binding ( Figure S6 ). Measurements of the overall pocket volume of the kinase

domain can support this claim and further highlight the influence of residues in both the

ATP+sphingosine binding cavity that are influenced by either native substrate presence. At

the biological level, this is relevant and can provide a deeper unde rstanding in the kinetic

differences between the SphK isoforms. 136

Binding pocket volume was used to determine if the SASA per residue area value

changes were due to the binding cavity availab ility and to determine how the binding

pockets of both ATP+sphingosine changed in relation to the crystallized and homology

model structures. The crystal structure and homology model of SphK1 and SphK2 were

found to have binding pocket volumes of 538 Å 3 an d 443 Å 3 , respectively, as based on previous work. 152 The major takeaway fr om this work was that SphK1 had a larger binding c avity, as based on the difference of two residues in the sphingosine binding cavity.

108

Metapocket 2.0 webserver 113 was used to identify binding pocket clusters and results were

visualized in Chimera t o calculate total volume of the clustered regions in the sphingosine

binding domain. The SphK1 apo, ATP bound, and ATP+sphingosine bound systems had

volumes of 1137 ± 166 Å 3 , 1039 ± 128 Å 3 , and 1079 ± 146 Å 3 , respectively ( Table S8 ).

This indicated the bin ding of ATP+sphingosine both induced a reduction i n the binding

cavity of SphK1. Considering the α7 - α8 helix loop fluctuation changes in the systems, the

smaller sphingosine binding pocket could be influence by the anchoring of this region

closer to the sp hingosine binding region. SphK2 apo, ATP bound, an d ATP+sphingosine

bound systems had volumes of 896 ± 184 Å 3 , 1042 ± 398 Å 3 , and 1189 ± 261 Å 3 ,

respectively ( Table S8 ). Previous work indicated that the SphK2 sphingosine binding cavity was smaller as compa red to SphK1; however, in both SphK1 and SphK2 sys tems in this work, there is enlargement of both binding pockets, indicating that the structural morphologies from crystallography and modeling did adapt to the presence of a substrate or inhibitor. 152, 157 From the volume calculation, the SphK2 binding pocket for sphingosine bindin g expands when compared to the SphK2 apo system, o pposite of what is seen in

SphK1. There are distinct characteristic differences within the binding cavity of SphK1 and SphK2, potentially causing pocket volume changes in the presence of the biological subs trates. This could be influenced by the residue ch anges within the sphingosine binding cavity, such as Ile174 and Phe288 in SphK1 vs Val304 and Cys533 in SphK2, identified in Worrell, et al., 152 providing a deeper understanding into the distinguishing roles of SphK isoforms and offer perspicacity for future isoform - specific drug design.

T he structural analysis performed through evaluating secondary structure, RMSD,

RMSF, R g , eccentricity, SASA, and pocket volume shows the stability of the structural

109

morphology in th e kinase domain as based on previous crystal and homology model

s tructures. Importantly, this shows that the SphK2 homology model is stable and a provides

a useful tool for drug discovery aspects until a crystal structure can be solved. This work

has also shown the SphK isoforms display distinguishing structural changes in the α7 - α8

helix loop and random coils between the N - terminal and C - terminal lobes occurring within the active site , influenced by native substrate binding, providing potential insight int o differentiating biological roles. Understanding the effects of native s ubstrate binding in more detail on the SphK isoforms will provide further insight into substrate binding mechanisms.

3.4.2 SphK Isoform Interaction Analysis with ATP and S phingosine Molecules

MD simulations with the biological substrates for the SphK isof orms can reveal amino acids essential in substrate binding and features distinct that can be exploited for isoform - selective inhibitor design. Evaluation of the protein dynamics on th e binding cavity of each isoform due to the influence of ATP+sphingosine can help identify isoform - specific features. Analysis of the total amount of hydrogen bonds and other interactions between the protein and small molecules coupled with protein - small m olecule analysis of each substrate can identify regions within the bindin g cavity essential in the divergent

SphK isoform biological roles, implications in SphK isoforms in diseases, and can be exploited for drug discovery.

Hydrogen bond analysis was perf ormed on the SphK ATP bound and SphK

ATP+sphingosine bound systems ( Figur e S7 ) . SphK1 ATP bound system contained an average of 6 ± 1 hydrogen bonds across replicates with ATP. SphK2 ATP bound system also contained an average of 6 ± 1 hydrogen bonds across replicates with ATP. SphK1

110

ATP+sphingosine bound system contained an aver age of 6 ± 1 hydrogen bonds across replicates with ATP. SphK2 ATP+sphingosine bound system contained an average of 8 ±

2 hydrogen bonds across replicates with ATP. This could indicate more hydrogen bonding features surfacing as influenced by the presence o f sphingosine in SphK2 to facilitate the phosphotransfer reaction between ATP to sphingosine. SphK1 ATP +sphingosine bound system contained an average of 1 ± 0 hydrogen bonds across replicates with sphingosine.

SphK2 ATP+sphingosine bound system also contai ned an average of 1 ± 0 hydrogen bonds across replicates with sphingosine. This is expected since sphin gosine contains a lipid base with a polar head and requires hydrophobic interactions to bind to the active site. The essential interactions between the S phK isoforms and sphingosine are likely contained in the hydrophobic channel within the binding cavity. To further investigate sphingosine binding in the MD simulations, more detailed protein - small molecule interactions must be considered.

Protein - small m olecule interactions were explored using the Schrödinger - Maestro

Suite (v. 2020 - 1) fingerprinting tool. This software allows for identification of electrostatic, hydrophobic, polar neutral , and backbone interactions that could be influencing sphingosine bi nding. Amino acids essential for ATP binding in kinases are a lysine or arginine to facilitate the γ - ph osphate transfer , a cation - binding aspartate, and a catalytic aspartate. 1, 17 SphK isoforms are unlike other kinases and do not contain the VAIK, DFG, or HRD kinase mot ifs. Site - directed mutagenesis experiments have determined Asp81 in

SphK1 as an essential residue for c atalytic activity, 157 but that is the only potential insight into sphingosine - binding essential residues. In the energy minimized, or starting structure,

S phK1 ATP - bound system, ATP interacted with the charged residues Arg24, Glu86,

111

Arg185, Arg191, and Glu343. In the SphK1 ATP - bound system cluster structure s, ATP interacted with the charged residues Arg56, Arg57, Asp81, Asp178, Arg185, Arg191, and

Glu343 ( Fi gure 3 , Table S 9 ). In the energy minimized SphK2 ATP - bound system, ATP interacted with the charged residues Glu216, Arg315, Arg321, and Glu588. In the SphK2

ATP - bo und system cluster structures, ATP interacted with the charged residues Arg157,

Asp211, Arg315, Arg321, and Glu588 ( Figure 3 , Table S 9 ). In ATP only systems, this shows the importance of having negatively - charges region, consisting of arginine rich domain, for ATP b inding in both SphK1 and SphK2 structures. With typical kinases consisting of interactions with catalyt ic aspartates, 1, 1 7 it is possible that the interactions with glutamates could be replacing or supplementing aspartate interactions to support catalytic activity. In th e energy minimized SphK1 ATP+sphingosine bound system, ATP interacted with the charged residues Arg24, Glu86, Arg185, Arg191, and Glu313. In the

SphK1 ATP+sphingosine bound system cluster structures, ATP interacted with the charged residues Arg24, Arg57, G lu86, Arg185, Arg191, and Glu343 ( Figure 3 , Table S 9 ). In the energy minimized SphK2 ATP+sphingosine bound system starting structures, ATP interacted with the charged residues Glu216, Arg315, Arg321, and Glu588. In the SphK2

ATP+sphingosine bound system cluster stru ctures, ATP interacted with the charged residues Arg157, Arg315, Arg321, and Glu588 ( Figure 3 , Table S 9 ). Both Sp hK isoform interactions with ATP are the overall same between the ATP bound system and the

ATP+sphingosine bound system, except for a loss of the interac tion with the essential catalytic aspartate, Asp81 in SphK1 and Asp211 in SphK2, in the ATP+sphingosine bound systems. This suggests that Asp81 in SphK1 and Asp211 in SphK2 is important for recruiting ATP into the active site but may play a different role in the phosphotransfer

112

reaction. SphK1 Glu343 and SphK2 Glu588 show promising importance in ATP binding .

Investigating the binding cavity and comparing the apo, ATP bound, and ATP+sphingosine

bound systems, SphK1 Glu343 and SphK2 Glu588 move towards the γ - phosphate of ATP, most distinctly in SphK2 systems, signifying potential importance in γ - phosphate clea vage

( Figure 3 ). From these interactions, its potential that rather than catalytic and ion - orienting aspartates, SphK isoforms may utilize glutamate residues for catalytic activity as well and a heavy presence of arginine residues for ATP binding.

Interes tingly, Arg185 in SphK1 and Arg315 in SphK2 are located on the α7 - α8 helix loop. Evaluating the movement of the Arg185 in SphK1 and Arg315 in SphK2 by comparing the apo, ATP bound, and sphingosine bound systems, Arg185/Arg315 moves considerably towards ATP in the ATP bound system and the ATP+sphingosine bound system ( Figure 3 ). It is possible that the ATP - Arg interactions anchors the α7 - α8 helix loop, reducing the overall fluctuation of that domain ( Figure S4 ). This interaction also may be influencing the t otal SASA in the structures ( Table 2 ). The decrease of SASA in the

ATP+sphingosine bound systems in both isoforms supports the native substrate binding influence the α7 - α8 helix loop to be anchored in place by the ATP - Arg interactions, red ucing structural fluctuation . This information could provide useful insight into protein breakdown through proteasomal degradation via ubiquitination of Lys183 in SphK1, displaying that low cellular levels of ATP could lead to more loop variability, prompt ing pathways for degradation.

Key residues for sphingosine binding, identified in Worrell, et al. 152 for SphK1 are

Asp81, Ser168, Ile174, Asp178, Phe288, Phe303, His311, and Tyr321 and for SphK2 are

Asp211, Ser298, Val304, Asp308, Cys533, Phe548, His556, and Tyr566. 152 After energy

113

minimization through MD setup, SphK1 ATP+sphingosine system had interactions with

the key residues Ser168, Ile174, Asp178 , and Phe303 ( Fig ure 4 , Table S 10 ) . SphK2

ATP+sphingosine system interacted with key residues Ser298, Val304, Asp308, and

His556. In the SphK1 ATP+sphingosine bound system dominant morphologies, sphingosine interacted with the key residues Ile174, Asp178, Phe288, and Phe303 in two or more reps. Additionally, sphingosine interacted with Phe192 and Met306 in SphK1. In the SphK2 ATP+sphingosine bound system dominant morphologies, sphingosine interacted with the key residues Phe303, Asp308, Cys533, and Phe548 ( Figure 4 , Table

S 10 ) . Sphingosine also interacted with Phe322 and Met551 in SphK2. Comparing the

SphK isoforms interactions with sphingosine, both interact with similar residues: one negatively charged residue (Asp178 in SphK1 and Asp308 in SphK2), three aromatic residues (Phe1 92, Phe288, and P he303 in SphK1 and Phe303, Phe322, and Phe548 in

SphK2), and a methionine (Met306 in SphK1 and Met551 in SphK2). Outside of identified key residues, new residues were identified through the protein - substrate interaction analysis by identif ying residues tha t had interactions in three or more replicates. In SphK1,

Phe192, Leu268, and Met306 have interactions with sphingosine. In SphK2, Phe322,

Leu513, and Met551 have interactions with sphingosine. These residues are shared in both the SphK is oforms, indicatin g them as new potential key residues in sphingosine binding.

While these residues do not indicate exploitable features, the exceeding similarities between SphK1, a crystallized protein structure, and SphK2, a computationally solved structu re , it provide s i nsight that the SphK2 homology model it is to be expected similar to the biological structure and a useful model in future drug discovery applications .

114

Previous work has identified the Ile174 in SphK1 and Val304 in SphK2 as being

essentia l in dictation is oform specificity for drugs. 152 While the majority of the interactions are similar between the SphK isoforms, the importance of SphK1 Ile174 and SphK2

Val304 are evident in this work. In SphK1 ATP+sphingosine systems, Ile174 interacts with

sphingosine in two of the four replicate s ( Figu re 4 ). In SphK2 ATP+sphingosine systems,

Val 304 did not interact with any of the replicates. This displays a larger importance in

Ile174 in the SphK1 hydrophobic channel than Val304 in SphK2. Further evaluation of the influence on Ile174 on SphK1 and Val30 4 on SphK2 sphingosine binding should be exp lored.

3.5 Conclusion

MD simulations were used to provide insight into binding pocket flexibility and

establishing how residue interactions change in different biological systems . Taken

together, this work seeks to develop a better understanding of the b inding mechanics for

bio logical substrates of SphKs, ATP and sphingosine, to help aid in development of more

efficacious isoform - specific inhibitors and their rationale for design. Here, we show that

SphK isoforms do not have the typical kinase ATP binding motifs, but instead con tain an

arginine - rich, ATP binding region with three potential leads for catalytic activity in the

active site, such as Asp81, Asp178, and Glu343 in SphK1 and Asp211, Asp308, and

Glu588 in SphK2. From our MD simulations, we discover ed the α7 - α8 helix loop

fluctuation is influenced by the ATP presence, between the apo (no sphingosine or ATP bound), ATP - bound, and ATP - sphingosine complex systems supporting that ATP binding changes the positioning of Arg185 in SphK1 and Arg315 in SphK2, anchoring the α7 - α8 helix loop to stabilize substrate bi n ding . In sphingosine binding, the most important

115

components for binding include Asp178, Phe192 , Phe288, Phe303, and Met306 in SphK1

and Asp308, Phe303, Phe322, Phe548, and Met551 in SphK2. Binding s imilarities in the

SphK1 crystal structure and the SphK2 homology model supports the SphK2 structure as

a valid predictor of future drug discovery pract ices. Additionally, further emphasis on key

residues identified for exploitation between SphK isoforms a re Ile174 in SphK1 and

Val304 in SphK2. Future experiments should explore the influences Ile174 and Val304 have on known inhibitors. This work provides the biological foundation for understanding

ATP and sphingosine binding characteristics and their influe nce on the structure of SphK1 and SphK2 , which in turn can be used for future drug discovery research.

3.6 Acknowledgements

We acknowledge the contr ibutions of Carter Gottschalk, Jeamin Jung, R. J ason Mosier to for help with data visualization. We also acknowledge the support and resources provided by Virginia Tech Advanced Research Computing.

116

3.7 Main Figures and Tables

Figure 1. Structural feat ures of SphK isoforms in the energy minimized starting structures. A. SphK1 and B. SphK2. Both structure s represented in cartoon with ATP shown in stick. Blue star represents the ATP binding domain and the yellow start represents the sphingosine binding do main.

117

Figure 2 . SphK1 and SphK2 dominant morphologies over the last 250 ns of each simulation as base d on backbone RMSD clustering. RMSD between replicates of each system was performed and the replicate with the lowest RMSD was chosen as the representative morphology of the simulation system. For the replicate chosen as representative of the system, the c enter structure of the largest cluster, representing the largest number of frames sampled, is shown. Percent size of the cluster is shown in the bottom right of each replicate. Sphk1 and SphK2 are shown as cartoon, color maroon and teal respectively, with ATP and sphingosine shown as sticks and colored by element. Mg2+ is shown as a green sphere.

118

Table 1. Average Percent Secondary Structure throughout SphK system simulations. Structure α Helix β Sheet Coil SphK1 apo 28 ± 1 29 ± 1 43 ± 1 SphK1 ATP 29 ± 1 29 ± 0 42 ± 1 SphK1 ATP + sphingosine 29 ± 1 29 ± 1 42 ± 1 SphK2 apo 24 ± 1 20 ± 2 56 ± 3 SphK2 ATP 25 ± 1 21 ± 1 54 ± 1 SphK2 ATP + sphingosine 25 ± 1 21 ± 1 55 ± 1 Percent second ary structure for SphK systems averaged over the final 250 ns of four replicates for a cumulative sampling time of 1000 ns with corresponding standard deviations .

Table 2. Solvent Accessible Surface Area (SASA) of SphK structures from system simulations. System Hydrophobic Hydrophilic Total SASA ( n m 2 ) ( nm 2 ) ( nm 2 ) SphK1 apo 87 ± 2 85 ± 2 172 ± 3 SphK1 ATP 83 ± 2 83 ± 2 165 ± 3 SphK1 ATP + Sphingosine 88 ± 2 85 ± 2 173 ± 3 SphK2 apo 131 ± 4 108 ± 3 238 ± 6 SphK2 ATP 131 ± 3 105 ± 3 236 ± 5 SphK2 ATP + Sphingosine 134 ± 4 106 ± 2 240 ± 5 Average over final 250 ns of four replicate trajectories for a cumulative sampling time of 1000 ns, with corresponding standard deviations.

Table 3. Total number of hydrogen bonds between SphK isoforms and ligands fr om the SphK system MD simulations Ligand System Number of Hydrogen Bonds SphK1 ATP 6 ± 1 SphK1 ATP + Sphingosine 6 ± 1 ATP SphK2 ATP 6 ± 1 SphK2 ATP + Sphingosine 8 ± 2 SphK1 ATP + Sphingosine 1 ± 0 Sphingosine SphK2 ATP + Sphingosine 1 ± 0

119

Figure 3 . ATP binding domain o verlaid structures of most representative structures of SphK isoform apo, ATP bound, and ATP+sphingosine bound systems showing residue positionings. A. SphK1 with interacting residues shown in stick. Apo colored in darkest re d, ATP bound system colored in medium red, and the ATP+sphingosine system colored in the lightest shade of red. ATP colored purple and by atom type with magnesium displayed as sphere colored in green scaled down to 0.6 nm. Sphingosine shown in stick colore d in green. B. SphK2 with key residues shown in stick. Apo colored in darkest teal, ATP bound system colored in medium teal, and the AT P+sphingosine system colored in the lightest shade of teal. ATP colored purple and by atom type. Sphingosine shown in sti ck colored in yellow with magnesium displayed as sphere colored in green scaled down to 0.6 nm. C. SphK1 systems colored and presented as in A, without ATP . D. SphK2 systems colored and presented as in B, without ATP .

120

Figure 4. Overlays of sphingosine binding of dominant morphologies of SphK isoform apo, ATP bound, and ATP+sphingosine bound systems showing key residue positionings. A. SphK1 with key residues shown in stick. Apo colored in darkest red, ATP bound system colored in medium red, and the ATP+ sphingosine system colored in the lightest shade of red. Sphingosine shown in stick colored in green. ATP colored by atom type with mag nesium displayed as sphere colored in green scaled down to 0.6 nm. B. SphK2 with key residues shown in stick. Apo colored in darkest teal, ATP bound system colored in medium teal, and the ATP+sphingosine system colored in the lightest shade of teal. Sphing osine shown in stick colored in yellow. ATP colored by atom type. C. SphK1 systems colored and presented as in A, without sphingosine. D. SphK2 systems colored and presented as in B, without sphingosine.

121

3.8 Supplemental Information

Figure S 1 . Root mean - square deviation ( RMS D) of protein backbone atoms for each system and SphK isoform. RMSD is shown for each replicate ove r the 1 µ s simulation.

122

Figure S2. SphK1 dominant morphologies over the last 250 ns of each simulation as based on backbone RMSD clustering. The center structure of the largest cluster, representing the largest number of frames sampled, is shown for each replicate and each isofor m of SphK1. Percent size of the cluster is shown in the bottom right of each replicate. SphK1, is shown as cartoon and is colored by replicate number (1 - 4). ATP and sphingosine are shown as sticks and colored by element. Mg2+ is shown as a green sphere.

123

Figure S3. SphK2 dominant morphologies over the last 250 ns of each simulation as based on backbone RMSD clustering. The center structure of the largest cluster, representing the largest number of frames sampled, is shown f or each replicate and each isof orm of SphK2. Percent size of the cluster is shown in the bottom right of each replicate. SphK2, is shown as cartoon and is colored by replicate number (1 - 4). ATP and sphingosine are shown as sticks and colored by element. Mg 2+ is shown as a green sphere.

124

Table S1. Percent size of dominant morphologies for the largest cluster of SphK as based on backbone RMSD clustering with a cut - off of 0.2 nm. Percent Dominant Morphology System Rep1 Rep2 Rep3 Rep4 SphK1 Apo 89.0 100 9 1.0 95.5 SphK1 ATP 99.9 100 98.7 99. 9 SphK1 ATP + Sphingosine 96.5 79. 1 62.2 100 SphK2 Apo 28.3 3.7 25.0 13.6 SphK2 ATP 35.8 5. 5 15.3 35. 8 SphK2 ATP + Sphingosine 41. 8 8. 9 19.6 11.3

Table S2. RMSD of dominant morphologies per replica te of each system as compared across replicates. The replicate with the lowest RMSD compared to the replicates in that system was chosen as the representative structure for that system. Average Dom inant Morphology RMSD System Rep1 Rep2 Rep3 Rep4 SphK1 Apo 2.14 ± 0.30 2.29 ± 0.29 2.19 ± 0.19 2.52 ± 0.12 SphK1 ATP 1.38 ± 0.14 1.34 ± 0.12 1.51 ± 0.13 1.54 ± 0.10 SphK1 ATP + Sphingosine 1.61 ± 0.03 1.68 ± 0.07 1.71 ± 0.09 1.69 ± 0.05 SphK2 Apo 3.02 ± 0.52 3.52 ± 0.23 3.73 ± 0.34 3.22 ± 0.74 SphK2 ATP 3. 48 ± 1.15 3.17 ± 0.66 3.18 ± 1.13 4.62 ± 0.44 SphK2 ATP + Sphingosine 2.75 ± 0.08 2.44 ± 0.31 2.48 ± 0.18 2.40 ± 0.22

Table S3. RMSD of ATP per replicate of each system as compared across replicates. The replicate with the lowest RMSD compared to the r eplicates in that system was chosen as the representative structure for that system. Average Dominant Morphology RMSD System Rep1 Rep2 Rep3 Rep4 SphK1 ATP 4.46 ± 0.82 5.76 ± 1.17 6.60 ± 0.86 5.84 ± 1.49 SphK1 ATP + Sphingosine 3.75 ± 1.43 3.91 ± 1.54 5 .90 ± 1.53 5.98 ± 1.47 SphK2 ATP 8.96 ± 2.46 12.50 ± 2.23 13.62 ± 1.81 10.51 ± 3.64 SphK2 ATP + Sphingosine 6.16 ± 0.52 4.90 ± 1.40 5.24 ± 1.65 3.18 ± 0.95

Table S4. RMSD of per replicate of each system as compared across replicates. The replicate with the lowest RMSD compared to the replicates in that system was chosen as the representative structure for that system. Average Dominant Morphology RMSD System Rep1 Rep2 Rep3 Rep4 SphK1 ATP + Sphingosine 7.85 ± 0.55 5.68 ± 1.82 5.96 ± 1.63 5.30 ± 1.30 SphK2 ATP + Sphingosine 4.17 ± 0.38 3.11 ± 0.44 3.29 ± 0.78 3.18 ± 0.95

125

Figure S4. Root mean - square fluctuation (RMSF) of each residue position for each replicate and system of SphKs. RMSF is shown for each replicate over the 1 µs simulati on.

126

Figure S5. Averaged RMSF per residue per system and isoform of SphKs with associated standard deviation shown as shaded background. RMSF is displayed is over the 1 µs simulation.

127

Table S5. Average secondary structure composition of SphKs (shown as %). a Starting Structure Last 250 ns Structure α β Sheet α Helix β Coil Coil Helix Sheet SphK1 apo 30 ± 1 31 ± 1 38 ± 1 28 ± 1 29 ± 1 43 ± 1 SphK1 ATP 30 ± 1 31 ± 0 39 ± 1 29 ± 1 29 ± 0 42 ± 1 SphK1 ATP + sphingosine 30 ± 1 31 ± 0 38 ± 1 29 ± 1 29 ± 1 42 ± 1 SphK2 apo 28 ± 1 24 ± 1 49 ± 1 24 ± 1 20 ± 2 56 ± 3 SphK2 ATP 28 ± 0 23 ± 1 49 ± 1 25 ± 1 21 ± 1 54 ± 1 SphK2 ATP + sphingosine 29 ± 0 23 ± 1 49 ± 1 25 ± 1 21 ± 1 55 ± 1 a Secondary structure percentages represent averages over the f inal 250 ns of simulation time for four replicates of each system (cumulative sampling time of 1000 ns) with corresponding standard deviations

Table S 6 . Average r adius of g yration for pre and post simulation systems. a System R g Start R g End (nm) (nm) SphK1 apo 2.0 2.1 ± 0.01 SphK1 ATP 2.0 2.1 ± 0.01 SphK1 ATP + Sphingosine 2.0 2.1 ± 0.01 SphK2 apo 2.4 2.4 ± 0.03 SphK2 ATP 2.4 2.4 ± 0.02 SphK2 ATP + Sphingosine 2.4 2.4 ± 0.03 a R adius of gyration for SphK systems averaged over the final 250 ns o f four replicates for a cumulative sampling time of 1000 ns with corresponding standard deviations .

Table S 7 . Average moments of i nertia and e ccentricity for SphKs. a System I 1 I 2 I 3 e (10 4 amu × nm 2 ) (10 4 amu × nm 2 ) (10 4 amu × nm 2 ) SphK1 apo 8.3 ± 0 .1 11.9 ± 0.2 14.2 ± 0.2 0.82 ± 0. 1 SphK1 ATP 8.1 ± 0.1 11.6 ± 0.2 14.0 ± 0.2 0.82 ± 0. 1 SphK1 ATP + Sphingosine 8.2 ± 0.1 12.0 ± 0.2 14.2 ± 0.2 0.81 ± 0. 2 SphK2 apo 14.4 ± 0.5 19.5 ± 0.6 22.3 ± 0.7 0.74 ± 0.2 SphK2 ATP 13.9 ± 0.4 20.2 ± 0.5 24.1 ± 0.7 0.82 ± 0.1 SphK2 ATP + Sphingosine 14.7 ± 0.4 20.8 ± 0.7 23.8 ± 0.8 0.78 ± 0.1 a Moment of Inertia for SphK systems averaged over the final 250 ns of four replicates for a cumulative sampling time of 1000 ns with corresponding standard deviations .

128

Figure S 6 . Solvent Accessibility Surface Area ( SASA ) of SphK systems average d across replicates by residue number associated standard deviation shown as shaded background. SASA analysis was performed over the last 250 ns of simulation.

129

Table S8 . SphK isoform Binding cavity analysis influenced by biological substrate binding. a System Pocket Volume (Å 3 ) SphK1 apo 1137 ± 166 SphK1 ATP 1039 ± 128 SphK1 ATP + Sphingosine 1079 ± 146 SphK2 apo 896 ± 184 SphK2 ATP 1042 ± 398 SphK2 ATP + Sphingosin e 1189 ± 261 a Pocket volume was calculated using Metapocket 2.0 webserver and analyzed for volume in UCSF Chimera.

130

Figure S 7 . Number of h ydrogen b onds between SphK isoforms and ligands over time and per replicate.

131

Table S9. Analysis of Protein - ATP i nteractions in cluster structures of SphK systems System Replicate Electrostatic Polar Neutral Hydrophobic Backbone R56 , R57 , D178 , 1 S181 G342 R185 , R191 , E343

d R57 , D81 , E86 , E55 , G80 , n

u

o 2 D178 , R185 , L83 G82 , G113 , b R191 , E343 , G342 P

T R56 , R57 , E182 , A

1 3 R185 , R191 , S112 L344 K

h E343 , p

S R24 , R57 , D81 , 4 R185 , R186 , N22 , S79 , S112 G25 , G26 , E55 R191 , E343 R157 , D211 ,

1 N188 , H215 G212

d R315 , R321 , E588 n u o

b R157 , E312 , 2 Q589 I309 , V582 P R315 , E588 T A

R157 , E312 , R315 S242 , N244 , Q589 L159 , C240 G156 , G241 ,

2 3

K R321 , E5 88 G243 , G587 h

p R157 , D211 , S 4 E216 , R315 , Q187 , N188 , H215 G212 , G587 R321 , E588

d R24 , R57 , E86 , n 1 N22 , T54 L83 E55 , G82 u R185 , R191 , E343 o b

e

R24 , R57 , R185 , n i

1 2 N22 , S112 E55 , G342 s R191 , E343 K o h g p n i

S R24 , R57 , E86 h

p 3 N22 E55 , G82 , L83

s R185 , R191 , E343 + P

T R56 , R57 , E182 4 A R185 , R191 , E343

d R157 , R186 , G212 G241

n 1 Q187 S242

u R315 , R321 , E588 G587 o b

e R157 , R315 ,

n 2 H215 S242 G212 i 2

s R321 , E588 K o h g G155 , G156 , p n

i R157 , D211 , N152 , Q187 , N188 , S h 3 L213 H189 , G210 , p E216 , R315 , E588 S209 s G212 + P

T R157 , D211 , 4 V590 G212 A R315 , E588

132

Table S10. Analysis of Protein - sphingosine interactions in cluster structures of SphK systems System Replicate Electrostatic Polar Neutral Hydro phobic Backbone F173 , V177 , F192 ,

1 E307 T193 , E307 L259 , L268 , M272 G269 , L302 d

n F303 , M306 u o b I174 , F192 , L261 ,

e 2 D178 H311 , G342

n i

1 L268 , F288 , L302 s K o h g F173 , V177 , L259 , p n i S h L261 , L263 , L268 , p 3 D178 I174 s A274 , F288 , L302 , +

P F303 , M306 , L31 9 T

A I174 , F192 , L268 , 4 D178 H311 D341 G342 F288 , M306 F322 , L513 , A519 ,

1 D308 R321 C533 , L547 , M551 , d

n F557 , L564 u o

b F303 , F322 , L504 ,

e

n 2 D308 L513 , C533 , F548 , A519 , G587 i 2 s

K M551 o h g p n F303 , V307 , F322 , i S h

p L513 , L517 , V535 , s 3 D308 R315 T326 L547 , G587 + L544 , F548 , M551 , P

T L564 A V307 , F322 , L513 , L544 , F548 , 4 D308 H556 , T584 L517 , L547 , M551 G587

133

CHAPTER 4 : DISCUSSION

This work provides useful insight into computational advances useful in orthologue

and isoform specific i nhibitor design. In the Extracellular Signal Regulated Kinase 8

(ERK8), we were able to identify new residues in TbERK8, TbLys75 and TbPhe154, that

can be utilized for orthologue selective inhibitor design. Additionally, TbERK8 shows

promising inhibition w ith small, flexible molecules containing an aromatic central feature,

and positively charged flanking side chains. This work utilizing molecular dockin g

protocols allows new insight into orthologue specific features and created a starting

scaffold for next - gen HAT therapeutic inhibitor design.

In the SphK isoform exploration, we used molecular dynamic s simulations to

explore the protein structure - functi on relationship influence by the biological substrates,

ATP and sphingosine. Here, we were able to determ ine that SphK1 and SphK2 contain an

α7 - α8 helix loop that fluct uates in apo systems and is anchored in the presence of ATP, showing potential insight i n this domain being essential for sphingosine binding . We identified an arginine - rich region within the A TP binding site, important in triphosphate binding. Additionally, we identified glutamate residues that could be an alternative for the catalytic aspartate, essential in facilitat ing phosphotransfer reaction s. In sphingosine binding, many similarities occu rred with the SphK isoforms, showing the importance of the polar head and the h ydrophobic channel of the sphingosine binding cavity, and that isoform differences may be influenced by the residues Ile174 in SphK1 and Val304 in

SphK2, residues in spatially t he same position. Importantly, this work supports the usefulness the the SphK2 homology model, showing promising similarities to SphK1

134

binding, indicating it as a useful tool when utilizing computational techniques in SphK isoform drug discovery practices.

Collectively, both ERK8 orthologue and SphK isoform investigation using compu tational strategies provided new insight into protein structure - function relationships.

Exploring the physicochemical characteristics of protein structures can help identify fea tures specific to isoform or orthologue structures, providing a deeper understa nding on how structural mutations or variations effect the protein interface, influencing small molecule binding. Molecular docking protocols provided useful information into pr otein specific features, identifying novel chemical orthologue - specific feature s caused by residue deletions and a shift in residues, which can be exploited in drug discovery of Human

African Trypanosomiasis. Molecular Dynamics (MD) simulations provided an atomistic approach for investigating isoform - specific structural features. MD analysis identified structural morphology differences between the two SphK isoforms, providing further insight into the biological roles each structure performs. Together , this work demonstrates a streamlined approach for computation advancements useful in drug discovery aspects, providing strategies to accelerate therapeutic discovery.

135

R EFERENCES

[1] Manning, G., Whyte, D. B., Martinez, R., Hun ter, T., and Sudarsanam, S. (2002) The complement of the , Science 298 , 1912 - 1934. [2] Krebs, E. G., and Beavo, J. A. (1979) Phosphorylation - dephosphorylation of enzymes, Annu. Rev. Biochem. 48 , 923 - 959. [3] Jiang, Z. G., and McK night, C. J. (2006) A phosphorylation - induced conformation change in dematin headpiece, Structure 14 , 379 - 387. [4] Volkman, B. F., Lipson, D., Wemmer, D. E., and Kern, D. (2001) Two - state allosteric behavior in a single - domain signaling protein, Science 29 1 , 2429 - 2433. [5] Lubman, O. Y., and Waksman, G. (2002) Dissection of the energetic coupling across the Src SH2 domain - tyrosyl phosphopeptide interface, J. Mol. Biol. 316 , 291 - 304. [6] Duronio, R. J., and Xiong, Y. (2013) Signaling pathways that control ce ll proliferation, Cold Spring Harb. Perspect. Biol. 5 , a008904. [7] Owa, T., Yoshino, H., Yoshimatsu, K., and N agasu, T. (2001) Cell cycle regulation in the G1 phase: a promising target for the development of new chemotherapeutic anticancer agents, Curr. M ed. Chem. 8 , 1487 - 1503. [8] Lawen, A. (2003) Apoptosis — an introduction, Bioessays 25 , 888 - 896. [9] Hanahan, D., and Weinberg, R. A. (2000) The hallmarks of cancer, Cell 100 , 57 - 70. [10] Cohen, P. (1999) The development and therapeutic potential of protein kinase inhibitors, Curr. Opin. Chem. Biol. 3 , 459 - 465. [11] Endicott, J. A., Noble, M. E., and Johnson, L. N. ( 2012) The structural basis for control of eukaryotic protein kinases, Annu. Rev. Biochem. 81 , 587 - 613. [12] Hanks, S. K., and Hunter, T. (1995) P rotein kinases 6. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classifica tion, FASEB J. 9 , 576 - 596. [13] Lindberg, R. A., Quinn, A. M., and Hunter, T. (1992) Dual - specificity protein kinases: will any hydroxyl do?, Tre nds Biochem. Sci. 17 , 114 - 119. [14] Taylor, S. S., Keshwani, M. M., Steichen, J. M., and Kornev, A. P. (2012) E volution of the eukaryotic protein kinases as dynamic molecular switches, Philos. Trans. R. Soc. Lond., B, Biol. Sci. 367 , 2517 - 2528. [15] Huse, M., and Kuriyan, J. (2002) The conformational plasticity of protein kinases, Cell 109 , 275 - 282. [16] Hanks, S. K., Quinn, A. M., and Hunter, T. (1988) The protein kinase family: conserved features and deduced phylogeny of the catalytic domains, Science 241 , 42 - 52. [17] Hammarén, H. M., Virtanen, A. T., and Silvennoinen, O. (2016) Nucleotide - binding mechanisms in ps eudokinases, Biosci. Rep. 36 . [18] Hemmer, W., McGlone, M., Tsigelny, I., and Taylor, S. S. (1997) Role of the glycine triad in the ATP - binding s ite of cAMP - dependent protein kinase, J. Biol. Chem. 272 , 16946 - 16954. [19] Kornev, A. P., Haste, N. M., Taylor, S. S., and Ten Eyck, L. F. ( 2006) Surface comparison of active and inactive protein kinases identifies a conserved activation mechanism, PNAS 1 03 , 17783 - 17788. [20] Cohen, P. (2002) Protein kinases — the major drug targets of the twenty - first century?, Nat. Rev. Drug Discov. 1 , 309.

136

[2 1] Harsha, H., and Pandey, A. (2010) Phosphoproteomics in cancer, Mol. Oncol. 4 , 482 - 495. [22] Siegel, R. L., Mille r, K. D., and Jemal, A. (2020) Cancer statistics, 2020, CA - CANCER J CLIN 70 , 7 - 30. [23] van Oosterom, A. T., Judson, I., Verweij, J., Strooba nts, S., di Paola, E. D., Dimitrijevic, S., Martens, M., Webb, A., Sciot, R., and Van Glabbeke, M. (2001) Safety an d efficacy of imatinib (STI571) in metastatic gastrointestinal stromal tumours: a phase I study, The Lancet 358 , 1421 - 1423. [24] Fabian, M. A ., Biggs, W. H., Treiber, D. K., Atteridge, C. E., Azimioara, M. D., Benedetti, M. G., Carter, T. A., Ciceri, P., E deen, P. T., and Floyd, M. (2005) A small molecule – kinase interaction map for clinical kinase inhibitors, Nat. Biotechnol. 23 , 329 - 336. [25] Roskoski, R. (2019) Properties of FDA - approved small molecule protein kinase inhibitors, Pharmacol. Res. [26] Peng, B., Lloyd, P., and Schran, H. (2005) Clinical pharmacokinetics of imatinib, Clin. Pharmacokinet. 44 , 879 - 894. [27] Deininger, M. W. (2008) Ni lotinib, Clin. Cancer. Res. 14 , 4027 - 4031. [28] Kantarjian, H., Jabbour, E., Grimley, J., and Kirkpatrick, P. (2006 ) Dasatinib, Nature Publishing Group. [29] Maemondo, M., Inoue, A., Kobayashi, K., Sugawara, S., Oizumi, S., Isobe, H., Gemma, A., Harada, M., Yoshizawa, H., and Kinoshita, I. (2010) Gefitinib or chemotherapy for non – small - cell lung cancer with mutated EGF R, New Engl. J. Med. 362 , 2380 - 2388. [ 30] Solomon, B. J., Mok, T., Kim, D. - W., Wu, Y. - L., Nakagawa, K., Mekhail, T., Felip, E., Cappuzzo, F., Paolini, J., and Usari, T. (2014) First - line crizotinib versus chemotherapy in ALK - positive lung cancer, N Engl j Med 371 , 2167 - 2177. [31] Shepherd, F. A., Rodrigues Pereira, J., Ciuleanu, T., Tan, E. H., Hirsh, V., Thongprasert, S., Campos, D., Maoleekoonpiroj, S., Smylie, M., and Martins, R. (2005) Erlotinib in previously treated non – small - cell lung cancer, New Engl . J. Med. 353 , 123 - 132. [32] Llovet, J . M., Ricci, S., Mazzaferro, V., Hilgard, P., Gane, E., Blanc, J. - F., De Oliveira, A. C., Santoro, A., Raoul, J. - L., and Forner, A. (2008) Sorafenib in advanced hepatocellular carcinoma, New Engl. J. Med. 359 , 378 - 390. [33] Sternberg, C. N., Davis, I. D., Mardiak, J., Szczylik, C., Wagstaff, J., Salman, P., Kavina, A., Chen, M., Pandite, L., and Hawkins, R. E. (2010) Pazopanib in locally advanced or metastatic renal cell carcinoma: results of a randomized phase III tria l , J. Clin. Oncol. [34] Motzer, R. J., Hutson, T. E., Tomczak, P., Michaelson, M. D., Bukowski, R. M., Rixe, O., Oudard, S., Negrier, S., Szczylik, C., and Kim, S. T. (2007) Sunitinib versus interferon alfa in metastatic renal - cell carcinoma, New Engl. J. M ed. 356 , 115 - 124. [35] Geyer, C. E., Forster, J., Lindquist, D., Chan, S., Romieu, C. G., Pienkowski, T., Jagiello - Gruszfeld, A., Crown, J., Chan, A., and Kaufman, B. (2006) Lapatinib plus capecitabine for HER2 - positive advanced breast cancer, New Engl. J . Med. 355 , 2733 - 2743.

137

[36] Carles, F., Bourg, S., Meyer, C., and Bonnet, P. (2018) PKIDB: A curated, annotated and updated database of protein kinase inhibitors in clinical trials, Molecules 23 , 908. [37] Caporali, R., and Zavaglia, D. (2019) Real - world e x perience with tofacitinib for the treatment of rheumatoid arthritis, Clin Exp Rheumatol 37 , 485 - 495. [38] Zamora, N. V., Tayar, J. H., Lopez ‐Olivo, M. A., Christensen, R., and Suarez‐ Almazor, M. E. (2019) Baricitinib for rheumatoid arthritis, Cochrane Dat a base Syst. Rev. . [39] Serhal, L., and Edwards, C. J. (2019) Upadacitinib for the treatment of rheumatoid arthritis, Expert Rev. Clin. Immunol. 15 , 13 - 25. [40] Sin, J. H., and Zangardi, M. L. (2019) Ruxolitinib for secondary hemophagocytic lymphohistiocyto s is: first case report, Hematol Oncol Stem Cell Ther. 12 , 166 - 170. [41] Blair, H. A. (2019) Fedratinib: first approval, Drugs 79 , 1719 - 1725. [42] Flaherty, K. R., Wells, A. U., Cottin, V., Devaraj, A., Walsh, S. L., Inoue, Y., Richeldi, L., Kolb, M., Tetzl a ff, K., and Stowasser, S. (2019) Nintedanib in progressive fibrosing interstitial lung diseases, New Engl. J. Med. 381 , 1718 - 1727. [43] Asrani, S., Robin, A. L., Serle, J. B., Lewis, R. A., Usner, D. W., Kopczynski, C. C., Heah, T., Ackerman, S. L., Alper n , L. M., and Bashford, K. (2019) Netarsudil/Latanoprost Fixed - Dose Combination for Elevated Intraocular Pressure: Three - Month Data from a Randomized Phase 3 Trial, Am. J. Ophthalmol. 207 , 248 - 257. [44] Hanauer, S., Panaccione, R., Danese, S., Cheifetz, A. , Reinisch, W., Higgins, P. D., Woodworth, D. A., Zhang, H., Friedman, G. S., and Lawendy, N. (2019) Tofacitinib induction therapy reduces symptoms within 3 days for patients with ulcerative colitis, Clin. Gastroenterol. Hepatol. 17 , 139 - 147. [45] O'Connel l , E. M., Bennuru, S., Steel, C., Dolan, M. A., and Nutman, T. B. (2015) Targeting Filarial Abl - like Kinases: Orally Available, Food and Drug Administration – Approved Inhibitors Are Microfilaricidal and Macrofilaricidal, J. Infect. 212 , 684 - 6 93. [46] Dissous, C., Ahier, A., and Khayath, N. (2007) Protein tyrosine kinases as new potential targets against human schistosomiasis, Bioessays 29 , 1281 - 1288. [47] Knobloch, J., Beckmann, S., Burmeister, C., Quack, T., and Grevelding, C. G. (2007) Tyro s ine kinase and cooperative TGFβ signaling in the reproductive organs of Schistosoma mansoni, Exp. Parasitol. 117 , 318 - 336. [48] Li, X., Haeberlein, S., Zhao, L., Mughal, M. N., Zhu, T., Liu, L., Fang, R., Zhou, Y., Zhao, J., and Grevelding, C. G. (2019) Th e ABL kinase inhibitor imatinib causes phenotypic changes and lethality in adult Schistosoma japonicum, Parasitol. Res. 118 , 881 - 890. [49] Zhang, J., Yang, P. L., and Gray, N. S. (2009) Targeting cancer with small molecule kinase inhibitors, Nat. Rev. Canc er 9 , 28 - 39. [50] Ranjitkar, P., Perera, B. G. K., Swaney, D. L., Hari, S. B., Larson, E. T., Krishnamurty, R., Merritt, E. A., Vill é n, J., and Maly, D. J. (2012) Affinity - based probes based on type II kinase inhibitors, J. Am. Chem. Soc. 134 , 19017 - 19025.

138

[51] Roskoski Jr, R. (2016) Classification of small molecule protein kinase inhibitors based upon the structures of their drug - enzyme complexes, Pharmacol. Res. 103 , 26 - 48. [52] Zuccotto, F., Ardini, E., Casale, E., and Angiolini, M. (2010) Through the “gatekeeper door”: exploiting the active kinase conformation, J. Med. Chem. 53 , 2681 - 2694. [53] Lamba, V., and Ghosh, I. (20 12) New directions in targeting protein kinases: focusing upon true allosteric and bivalent inhibitors, Curr. Pharm. Des. 18 , 2936 - 2945. [54] Wu, P., Nielsen, T. E., and Clausen, M. H. (2016) Small - molecule kinase inhibitors: an analysis of FDA - approved dr ugs, Drug Discov. Today 21 , 5 - 10. [55] Karve, T. M., and Cheema, A. K. (2011) Small changes huge impact: the role of protein posttranslational modifications in cellular homeostasis and disease, J. Amino Acids 2011 . [56] Ardito, F., Giuliani, M., Perrone, D ., Troiano, G., and Lo Muzio, L. (2017) The crucial role of protein phosphorylation in cell signaling and its use as targeted therapy, Int. J. Mol. Med. 40 , 271 - 280. [57] Li, X., Wilmanns, M., Thornton, J., and Köhn, M. (2013) Elucidating human phosphatase - substrate networks, Sci. Signal. 6 , rs10 - rs10. [58] Beenstock, J., Mooshayef, N., and Engelberg, D. (2016) How do protein kinases take a selfie (autophosphorylate)?, Trends Biochem. Sci. 41 , 938 - 953. [59] Lau, A. T., and Xu, Y. M. (2019) Regulation of hum an mitogen ‐activated protein kinase 15 (extracellular signal ‐regulated kinase 7/8) and its functions: A recent update, J. Cell. Physiol. 234 , 75 - 88. [60] Valenciano, A. L., Ramsey, A. C., Santos, W. L., and Mackey, Z. B. (2016) Discovery and antiparasitic activity of AZ960 as a Trypanosoma brucei ERK8 inhibitor, Bioorg. Med. Chem. 24 , 4647 - 4651. [61] Abe, M. K., Saelzler, M. P., Espinosa, R., Kahle, K. T., Hershenson, M. B., Le Beau, M. M., and Rosner, M. R. (2002) ERK8, a new member of the mitogen - activate d protein kinase family, J. Biol. Chem. 277 , 16733 - 16743. [62] Abe, M. K., Kahle, K. T., Saelzler, M. P., Orth, K., Dixon, J. E., and Rosner, M. R. (2001) ERK7 is an autoactivated member of the MAPK family, J. Biol. Chem. 276 , 21272 - 21279. [63] Groehler, A . L., and Lannigan, D. A. (2010) A chromatin - bound kinase, ERK8, protects genomic integrity by inhibiting HDM2 - mediated degradation of the DNA clamp PCNA, J. Cell Biol. 190 , 575 - 586. [64] Nicolae, C. M., and Moldovan, G. - L. (2016) ERKing Trypanosoma: PCNA phosphorylation as novel target, Cell Cycle 15 , 3167. [65] Jang, J. - C., León, P., Zhou, L., and Sheen, J. (1997) Hexokinase as a sugar sensor in higher plants, Plant Cell 9 , 5 - 19. [66] Mansour, T. E. (1972) Phosphofructokinase, In Curr. Top. Cell. Regul. , pp 1 - 46, Elsevier. [67] Prokazova, N., Samovilova, N., Go lovanova, N., Gracheva, E., Korotaeva, A., and Andreeva, E. (2007) Lipid second messengers and cell signaling in vascular wall, Biochemistry (Moscow) 72 , 797 - 808.

139

[68] Pitson, S. M. (2011) Regulatio n of sphingosine kinase and sphingolipid signaling, Trends Biochem. Sci. 36 , 97 - 107. [69] Sibley, C. D., Morris, E. A., Kharel, Y., Brown, A. M., Huang, T., Bevan, D. R., Lynch, K. R., and Santos, W. L. (2020) Discovery of a Small Side Cavity in Sphingosin e Kinase 2 that Enhances Inhibitor Potency and Selectivity , J. Med. Chem. 63 , 1178 - 1198. [70] Santos, W. L., and Lynch, K. R. (2015) Drugging sphingosine kinases, ACS Chem. Biol. 10 , 225 - 233. [71] Pappu, R., Schwab, S. R., Cornelissen, I., Pereira, J. P., Regard, J. B., Xu, Y., Camerer, E., Zheng, Y. - W., Huang, Y ., and Cyster, J. G. (2007) Promotion of lymphocyte egress into blood and lymph by distinct sources of sphingosine - 1 - phosphate, Science 316 , 295 - 298. [72] Mendoza, A., Bréart, B., Ramos - Perez, W. D. , Pitt, L. A., Gobert, M., Sunkara, M., Lafaille, J. J., M orris, A. J., and Schwab, S. R. (2012) The transporter Spns2 is required for secretion of lymph but not plasma sphingosine - 1 - phosphate, Cell Rep. 2 , 1104 - 1110. [73] Darrow, M. C., Zhang, Y., Cinquin , B. P., Smith, E. A., Boudreau, R., Rochat, R. H., Schmid , M. F., Xia, Y., Larabell, C. A., and Chiu, W. (2016) Visualizing red blood cell sickling and the effects of inhibition of sphingosine kinase 1 using soft X - ray tomography, J. Cell Sci. 129 , 3511 - 3 517. [74] Hait, N. C., Allegood, J., Maceyka, M., Strub, G . M., Harikumar, K. B., Singh, S. K., Luo, C., Marmorstein, R., Kordula, T., and Milstien, S. (2009) Regulation of histone acetylation in the nucleus by sphingosine - 1 - phosphate, Science 325 , 1254 - 12 57. [75] Chun, J., and Hartung, H. - P. (2010) Mechanism of action of oral fingolimod (FTY720) in multiple sclerosis, Clin. Neuropharmacol. 33 , 91. [76] Pyne, N. J., and Pyne, S. (2017) Sphingosine 1 - phosphate receptor 1 signaling in mammalian cells, Molecul es 22 , 344. [77] Dichiara, M., Marrazzo, A., Prezzavento, O., Collina, S., Rescifina, A., and Amata, E. (2017) Repurposing of Human Kinase Inhibitors in Neglected Protozoan Diseases, ChemMedChem 12 , 1235 - 1253. [78] Brun, R., Blum, J., Chappuis, F., and Bur ri, C. (2010 ) Human african trypanosomiasis, The Lancet 375 , 148 - 159. [79] Büscher, P., Cecchi, G., Jamonneau, V., and Priotto, G. (2017) Human African trypanosomiasis, The Lancet 390 , 2397 - 2409. [80] Nok, A. J. (2003) Arsenicals (melarsoprol), pentamidine and suramin in the treatment of human African trypanosomiasis, Parasitol. Res. 90 , 71 - 79. [81] Babokhov, P., Sanyaolu, A. O., Oyibo, W. A., Fagbenro - Beyioku, A. F., and Iriemenam, N. C. (2013) A current analysis of chemotherapy strategies for the treatmen t of human A frican trypanosomiasis, Pathog. Glob. Health 107 , 242 - 252. [82] Naula, C., Parsons, M., and Mottram, J. C. (2005) Protein kinases as drug targets in trypanosomes and Leishmania, Biochim. Biophys. Acta 1754 , 151 - 159. [83] Klevernic, I. V., Staff ord, M. J., Morrice, N., Peggie, M., Morton, S., and Cohen, P. (2006) Characterization of the reversible phosphorylation and activation of ERK8, Biochem. J. 394 , 365 - 373.

140

[84] Waskiewicz, A. J., and Cooper, J. A. (1995) Mitogen and stress response pathways : MAP kinase cascades and phosphatase regulation in mammals and yeast, Curr. Opin. Cell Biol. 7 , 798 - 805. [85] Saelzler, M. P., Spackman, C. C., Liu, Y., Martinez, L. C., Harris, J. P., and Abe, M. K. (2006) ERK8 down - regulates transactivation of the gluco corticoid receptor through Hic - 5, J. Biol. Chem. 281 , 16821 - 16832. [86] Strambi, A., Mori, M., Rossi, M., Colecchia, D., Manetti, F., Carlomagno, F., Botta, M., and Chiariello, M. (2013) Structure prediction and validation of the ERK8 kinase domain, PloS o ne 8 , e52011. [87] Coulombe, P., and Meloche, S. (2007) Atypical mitogen - activated protein kinases: struct ure, regulation and functions, Biochim. Biophys. Acta 1773 , 1376 - 1387. [88] Huang, D., Zhou, T., Lafleur, K., Nevado, C., and Caflisch, A. (2009) Kina se selectivity potential for inhibitors targeting the ATP binding site: a network analysis, Bioinformatics 26 , 198 - 204. [89] Strebhardt, K., and Ullrich, A. (2008) Paul Ehrlich's magic bullet concept: 100 years of progress, Nat. Rev. Cancer 8 , 473 - 480. [90 ] Laufer, S. A., Domeyer, D. M., Scior, T. R., Albrecht, W., and Hauser, D. R. (2005) Synthesis and biolog ical testing of purine derivatives as potential ATP - competitive kinase inhibitors, J. Med. Chem. 48 , 710 - 722. [91] Xu, Y. - M., Zhu, F., Cho, Y. - Y., Car per, A., Peng, C., Zheng, D., Yao, K., Lau, A. T., Zykova, T. A., and Kim, H. - G. (2010) Extracellular signal - regulated kinase 8 – mediated c - Jun phosphorylation increases tumorigenesis of human colon cancer, Cancer Res . 70 , 3218 - 3227. [92] Mackey, Z. B., Kou pparis, K., Nishino, M., and McKerrow, J. H. (2011) High ‐ throughput analysis of an RNAi library identifies novel kinase targets in Trypanosoma brucei, Chem. Biol. Drug Des. 78 , 454 - 463. [93] Paculová, H., and Kohoute k, J. (2017) The emerging roles of CDK12 in tumorigenesis, Cell Div. 12 , 7. [94] Merritt, C., and Stuart, K. (2013) Identification of essential and non - essential protein kinases by a fusion PCR method for efficient production of transgenic Trypanosoma bruc ei, Mol. Biochem. Parasitol. 190 , 44 - 49. [95] Wei, Y., and Li, Z. (2014) Distinct roles of a mitogen - activated protein kinase in cytokinesis between different life cycle forms of Trypanosoma brucei, Eukaryot. Cell 13 , 110 - 118. [96] Mervin, L. H., Bulusu, K . C., Kalash, L., Afzal, A. M., Svensson, F., Firth, M. A., Barrett, I., Engkvist, O., and Bender, A. (2018) Orthologue chemical space and its influence on target prediction, Bioinformatics 34 , 72 - 79. [97] Ekins, S., Mestres, J., and Testa, B. (2007) In si lico pharmacology for drug discovery: methods for virtual ligand screening and profiling, Br. J. Pharmacol. 152 , 9 - 20. [98] Ekins, S., Mestres, J., and Testa, B. (2007) In silico pharmacology for drug discovery: applications to targets and beyond, Br. J. P harmacol. 152 , 21 - 37. [99] Logan - Klumpler, F. J., De Silva, N., Boehme, U., Rogers, M. B., Velarde, G., McQuillan, J. A., Carver, T., Aslett, M., Olsen, C., and Subramanian, S. (2012) GeneDB — an annotation database for pathogens, Nucleic Acids Res. 40 , D98 - D108.

141

[100] Benson, D. A., Cavanaugh, M., Clark, K., Karsch - Mizrachi, I., Lipman, D. J., Ostell, J., and Sayers, E. W. (2012) GenBank, Nucleic Acids Res. 41 , D36 - D42. [101] Inc., C. C. G. (2016) Molecular operating environment (MOE), Chemical Computing Gro up Inc 1010 Sherbooke St. West, Suite# 910, Montreal …. [102] Ponder, J. W., and Case, D. A. (2003) Force fields for protein simulations, In Adv Protein Chem. , pp 27 - 85, Elsevier. [103] Li, M. - J., Wu, G. - Z., Kaas, Q., Jiang, T., and Yu, R. - L. (2017) Develo pment of efficient docking strategies and structure - activity relationship study of the c - Met type II inhibitors, J. Mol. Graphics Model. 75 , 241 - 249. [104] Bowie, J. U., Luthy, R., and Eisenberg, D. (1991) A method to identify protein sequences that fold i nto a known three - dimensional structure, Science 253 , 164 - 170. [105] Melo, F., and Feytmans, E. (1998) Assessing protein structures with a non - local atomic interaction energy, J. Mol. Biol. 277 , 1141 - 1152. [106] Wiederstein, M., and Sippl, M. J. (2007) Pro SA - web: interactive web service for the recognition of errors in three - dimensional structures of proteins, Nucleic Acids Res. 35 , W407 - W410. [107] Biasini, M., Bienert, S., Waterhouse, A., Arnold, K., Stude r, G., Schmidt, T., Kiefer, F., Cassarino, T. G., Bertoni, M., and Bordoli, L. (2014) SWISS - MODEL: modelling protein tertiary and quaternary structure using evolutionary information, Nucleic Acids Res. 42 , W252 - W258. [108] Kiefer, F., Arnold, K., Künzli, M ., Bordoli, L., and Schwede, T. (2009) The SWISS - M ODEL Repository and associated resources, Nucleic Acids Res. 37 , D387 - D392. [109] Benkert, P., Künzli, M., and Schwede, T. (2009) QMEAN server for protein model quality estimation, Nucleic Acids Res. 37 , W5 10 - W514. [110] Benkert, P., Tosatto, S. C., and Sc homburg, D. (2008) QMEAN: A comprehensive scoring function for model quality assessment, Proteins. 71 , 261 - 277. [111] Lovell, S. C., Davis, I. W., Arendall III, W. B., De Bakker, P. I., Word, J. M., Prisant , M. G., Richardson, J. S., and Richardson, D. C. (2003) Structure validation by Cα geometry: ϕ, ψ and Cβ deviation, Proteins. 50 , 437 - 450. [112] Bhattacharyya, R. P., Reményi, A., Good, M. C., Bashor, C. J., Falick, A. M., and Lim, W. A. (2006) The Ste5 s caffold allosterically modulates signaling output of the yeast mating pathway, Science 311 , 822 - 826. [113] Huang, B. (2009) MetaPocket: a meta approach to improve protein ligand binding site prediction, OMICS 13 , 325 - 330. [114] Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M ., Meng, E. C., and Ferrin, T. E. (2004) UCSF Chime ra — a visualization system for exploratory research and analysis, J. Comput. Chem. 25 , 1605 - 1612. [115] Release, S. (2017) 1: Maestro, Schrödinger, LLC, New York, NY 2017 . [116] Schrodinger, L. (2017) The P yMol molecular graphics system, version 2.0. Schröd inger, LLC, New York, NY. [117] Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., and Olson, A. J. (2009) AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility, J. Comput. Chem. 30 , 2785 - 279 1.

142

[118] Trott, O., and Olson, A. J. (2010) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading, J. Comput. Chem. 31 , 455 - 461. [119] Kim, S., Thiessen, P. A., Bolton, E. E., Che n, J., Fu, G., Gindulyte, A., Han, L., He, J., He, S., and Shoemaker, B. A. (2015) PubChem substance and compound databases, Nucleic Acids Res. 44 , D1202 - D1213. [120] Reményi, A., Good, M. C., Bhattacharyya , R. P., and Lim, W. A. (2005) The role of docking interactions in mediating signaling input, output, and discrimination in the yeast MAPK network, Mol. Cell 20 , 951 - 962. [121] Schrödinger, L. (2011) Maestro, New York, NY . [122] Pinzi, L., and Rastelli, G. (2019) Molecular Docking: Shifting Paradigms in Drug Discovery, Int. J. Mol. Sci. 20 , 4331. [123] Schneider, N., Hindle, S., Lange, G., Klein, R., Albrecht, J., Briem, H., Beyer, K., Claußen, H., Gastreich, M., and Lemmen, C. (2012) Substantial improvemen ts in large - scale redocking and screening using the novel HYDE scoring function, J. Comput. - Aided Mol. Des. 26 , 701 - 723. [124] Kazatskaya, A., Kuhns, S., Lambacher, N. J., Kennedy, J. E., Brear, A. G., McManus, G. J., Sengu pta, P., and Blacque, O. E. (2017 ) Primary Cilium Formation and Ciliary Protein Trafficking Is Regulated by the Atypical MAP Kinase MAPK15 in Caenorhabditis elegans and Human Cells, Genetics , genetics. 300383.302017. [125] Bermingham, D. P., Hardaway, J. A ., Refai, O., Marks, C. R., Snide r, S. L., Sturgeon, S. M., Spencer, W. C., Colbran, R. J., Miller, D. M., and Blakely, R. D. (2017) The atypical MAP kinase SWIP - 13/ERK8 regulates dopamine transporters through a Rho - dependent mechanism, J. Neurosci. , 1582 - 1517. [126] Klevernic, I. V., Mar tin, N. M., and Cohen, P. (2009) Regulation of the activity and expression of ERK8 by DNA damage, FEBS Lett. 583 , 680 - 684. [127] Colecchia, D., Strambi, A., Sanzone, S., Iavarone, C., Rossi, M., Dall’Armi, C., Piccioni, F., Verrotti di Pianella, A., and Ch iariello, M. (2012) MAPK15/ERK8 stimulates autophagy by interacting with LC3 and GABARAP proteins, Autophagy 8 , 1724 - 1740. [128] Chia, J., Tham, K. M., Gill, D. J., Bard - Chapeau, E. A., and Bard, F. A. (2014) ERK8 is a nega tive regulator of O - GalNAc glycos ylation and cell migration, Elife 3 , e01828. [129] Rossi, M., Colecchia, D., Iavarone, C., Strambi, A., Piccioni, F., di Pianella, A. V., and Chiariello, M. (2011) Extracellular signal - regulated kinase 8 (ERK8) controls est rogen - related receptor α (ERRα) c ellular localization and inhibits its transcriptional activity, J. Biol. Chem. 286 , 8507 - 8522. [130] Kanev, G. K., de Graaf, C., de Esch, I. J., Leurs, R., Würdinger, T., Westerman, B. A., and Kooistra, A. J. (2019) The lan dscape of atypical and eukaryotic protein kinases, Trends Pharmacol. Sci. [131] Pyne, S., and Pyne, N. J. (2011) Translational aspects of sphingosine 1 - phosphate biology, Trends Mol. Med. 17 , 463 - 472. [132] Ding, B. - S., Liu, C. H., Sun, Y., Chen, Y., Swend eman, S. L., Jung, B., Chavez, D. , Cao, Z., Christoffersen, C., and Nielsen, L. B. (2016) HDL activation of

143

endothelial sphingosine - 1 - phosphate receptor - 1 (S1P1) promotes regeneration and suppresses fibrosis in the liver, JCI insight 1 . [133] Czubowicz, K. , Jęśko, H., Wencel, P., Lukiw, W . J., and Strosznajder, R. P. (2019) The role of ceramide and sphingosine - 1 - phosphate in Alzheimer’s disease and other neurodegenerative disorders, Mol. Neurobiol. 56 , 5436 - 5455. [134] Zhang, Y., Berka, V., Song, A., Sun, K ., Wang, W., Zhang, W., Ning, C., Li, C., Zhang, Q., and Bogdanov, M. (2014) Elevated sphingosine - 1 - phosphate promotes sickling and sickle cell disease progression, J. Clin. Invest. 124 , 2750 - 2761. [135] Brinkmann, V., Cyster, J. G., and Hla, T. (2004) FTY 720: sphingosine 1 ‐phosphate receptor ‐1 in the control of lymphocyte egress and endothelial barrier function, Am. J. Transplantation 4 , 1019 - 1025. [136] Billich, A., Bornancin, F., Dévay, P., Mechtcheriakova, D., Urtz, N., and Baumruker, T. (2003) Phosphor ylation of the immunomodulatory drug FTY720 by sphingosine kinases, J. Biol. Chem. 278 , 47408 - 47415. [137] Donoviel, M. S., Hait, N. C., Ramachandr an, S., Maceyka, M., Takabe, K., Milstien, S., Oravecz, T., and Spiegel, S. (2015) Spinster 2, a sphingosine - 1 - phosphate transporter, plays a critical role in inflammatory and autoimmune diseases, FASEB J. 29 , 5018 - 5028. [138] Zhang, F., Xia, Y., Yan, W., Zhang, H., Zhou, F., Zhao, S., Wang, W., Zhu, D., Xin, C., and Lee, Y. (2016) Sphingosine 1 - phosphate signali ng contributes to cardiac inflammation, dysfunction, and remodeling following myocardial infarction, Am. J. Physiol. Heart. Circ. Physiol. 310 , H25 0 - H261. [139] Johnson, K. R., Johnson, K. Y., Crellin, H. G., Ogretmen, B., Boylan, A. M., Harley, R. A., and Obeid, L. M. (2005) Immunohistochemical distribution of sphingosine kinase 1 in normal and tumor lung tissue, J. Histochem. Cytochem. 53 , 1159 - 116 6. [140] Li, J., Guan, H. - Y., Gong, L. - Y., Song, L. - B., Zhang, N., Wu, J., Yuan, J., Zheng, Y. - J., Huang, Z. - S., and Li, M. (2008) Clinical significance of sphingosine kinase - 1 expression in human astrocytomas progression and overall patient survival, Clin . Cancer. Res. 14 , 6996 - 7003. [141] Shida, D., Takabe, K., Kapitonov, D., Milstien, S., and Spiegel, S. (2008 ) Targeting SphK1 as a new strategy against cancer, Curr. Drug Targets 9 , 662 - 673. [142] Pyne, S., Adams, D. R., and Pyne, N. J. (2016) Sphingosine 1 - phosphate and sphingosine k inases in health and disease: recent advances, Prog. Lipid Res. 62 , 93 - 106. [14 3] Futerman, A. H., and Riezman, H. (2005) The ins and outs of sphingolipid synthesis, Trends Cell Biol. 15 , 312 - 318. [144] Gomez - Munoz, A., Presa, N., Gomez - Larrauri, A., River a, I. - G., Trueba, M., and Ordonez, M. (2016) Control of inflammatory responses by ceramide, sphingosine 1 - phosphate and ceramide 1 - phosphate, Prog. Lipid Res. 61 , 51 - 62. [145] Rex, K., Jeffries, S., Brown, M. L., Carlson, T., Coxon, A., Fajardo, F., Frank, B., Gustin, D., Kamb, A., and Kassner, P. D. (2013) Sphingosine kinase activit y is not required for tumor cell viability, PLoS One 8 . [146] Strub, G. M., Maceyka, M., Hait, N. C., Milstien, S., and Spiegel, S. (2010) Extracellular and intracellular action s of sphingosine - 1 - phosphate, In Sphingolipids as signaling and regulatory mole cules , pp 141 - 155, Springer.

144

[147] Cannavo, A., Liccardo, D., Komici, K., Corbi, G., de Lucia, C., Femminella, G. D., Elia, A., Bencivenga, L., Ferrara, N., and Koch, W. J. (201 7) Sphingosine kinases and sphingosine 1 - phosphate receptors: signaling and act ions in the cardiovascular system, Front. Pharmacol. 8 , 556. [148] Olivera, A., Rosenfeldt, H. M., Bektas, M., Wang, F., Ishii, I., Chun, J., Milstien, S., and Spiegel, S. (2003 ) Sphingosine kinase type 1 induces G12/13 - mediated stress fiber formation, yet promotes growth and survival independent of - coupled receptors, J. Biol. Chem. 278 , 46452 - 46460. [149] Huang, K., Huang, J., Chen, C., Hao, J., Wang, S., Huang, J., Li u, P., and Huang, H. (2014) AP - 1 regulates sphingosine kinase 1 expression in a positive feedback manner in glomerular mesangial cells exposed to high glucose, Cell. Signal. 26 , 629 - 638. [150] Maceyka, M., Sankala, H., Hait, N. C., Le Stunff, H., Liu, H., Toman, R., Collier, C., Zhang, M., Satin, L. S., and Merrill, A. H. (2005) SphK 1 and SphK2, sphingosine kinase isoenzymes with opposing functions in sphingolipid metabolism, J. Biol. Chem. 280 , 37118 - 37129. [151] Liu, H., Toman, R. E., Goparaju, S. K., Maceyka, M., Nava, V. E., Sankala, H., Payne, S. G., Bektas, M., Ishii, I., and Ch un, J. (2003) Sphingosine kinase type 2 is a putative BH3 - only protein that induces apoptosis, J. Biol. Chem. 278 , 40330 - 40336. [152] Worrell, B. L., Brown, A. M., Santos, W. L., and Bevan, D. R. (2019) In Silico Characterization of Structural Distinctions between Isoforms of Human and Mouse Sphingosine Kinases for Accelerating Drug Discovery, J. Chem. Inf. Model. 59 , 2339 - 2351. [1 53] Olivera, A., Kohama, T., Edsall, L., Nava, V., Cuvillier, O., Poulton, S., and Spiegel, S. (1999) Sphingosine kinase express ion increases intracellular sphingosine - 1 - phosphate and promotes cell growth and survival, J. Cell Biol. 147 , 545 - 558. [154] Pit son, S. M., Moretti, P. A., Zebol, J. R., Lynn, H. E., Xia, P., Vadas, M. A., and Wattenberg, B. W. (2003) Activation of sphingo sine kinase 1 by ERK1/2 ‐ mediated phosphorylation, The EMBO journal 22 , 5491 - 5500. [155] Igarashi, N., Okada, T., Hayashi, S., Fu jita, T., Jahangeer, S., and Nakamura, S. - i. (2003) Sphingosine kinase 2 is a nuclear protein and inhibits DNA synthesis, J. Bio l. Chem. 278 , 46832 - 46839. [156] Páll, S., Abraham, M. J., Kutzner, C., Hess, B., and Lindahl, E. (2014) Tackling exascale softw are challenges in molecular dynamics simulations with GROMACS, In EASC , pp 3 - 27, Springer. [157] Wang, Z., Min, X., Xiao, S. - H., Johnstone, S., Romanow, W., Meininger, D., Xu, H., Liu, J., Dai, J., and An, S. (2013) Molecular basis of sphingosine kinase 1 substrate recognition and catalysis, Structure 21 , 798 - 809. [158] Huang, J., Rauscher, S., Nawrocki, G., Ran, T., Feig, M., de G root, B. L., Grubmüller, H., and MacKerell, A. D. (2017) CHARMM36m: an improved force field for folded and intrinsically disorde red proteins, Nat. Methods 14 , 71 - 73. [159] Vanommeslaeghe, K., Hatcher, E., Acharya, C., Kundu, S., Zhong, S., Shim, J., Darian , E., Guvench, O., Lopes, P., and Vorobyov, I. (2010) CHARMM general

145

force field: A force field for drug ‐like molecules compatib le with the CHARMM all ‐atom additive biological force fields, J. Comput. Chem. 31 , 671 - 690. [160] Jorgensen, W. L., Chandrasekha r, J., Madura, J. D., Impey, R. W., and Klein, M. L. (1983) Comparison of simple potential functions for simulating liquid water , J. Chem. Phys. 79 , 926 - 935. [161] Neria, E., Fischer, S., and Karplus, M. (1996) Simulation of activation free energies in mol ecular systems, J. Chem. Phys. 105 , 1902 - 1921. [162] Durell, S. R., Brooks, B. R., and Ben - Naim, A. (1994) Solvent - induced force s between two hydrophilic groups, Am. J. Phys. Chem. 98 , 2198 - 2202. [163] Luo, Y., and Roux, B. (2010) Simulation of osmotic pre ssure in concentrated aqueous salt solutions, Am. J. Phys. Chem. letters 1 , 183 - 189. [164] Beglov, D., and Roux, B. (1994) Finit e representation of an infinite bulk system: solvent boundary potential for computer simulations, J. Chem. Phys. 100 , 9050 - 9063. [165] Berendsen, H. J., Postma, J. v., van Gunsteren, W. F., DiNola, A., and Haak, J. R. (1984) Molecular dynamics with coupling to an external bath, J. Chem. Phys. 81 , 3684 - 3690. [166] Nosé, S., and Klein, M. (1983) Constant pressure molecular dynamics f o r molecular systems, Mol. Phys. 50 , 1055 - 1076. [167] Parrinello, M., and Rahman, A. (1981) Polymorphic transitions in single crystals: A new molecular dynamics method, J. Appl. Phys. 52 , 7182 - 7190. [168] Bussi, G., Donadio, D., and Parrinello, M. (2007) C anonical sampling through velocity rescaling, J. Chem. Phys. 126 , 014101. [169] Hess, B. (2008) P - LINCS: A parallel linear constraint solver for molecular simulation, J. Chem. Theory Comput. 4 , 116 - 122. [170] Verlet, L. (1967) Computer" experiments" on cl assical fluids. I. Thermodynamical properties of Lennard - Jones molecules, P hysical review 159 , 98. [171] Essmann, U., Perera, L., Berkowitz, M. L., Darden, T., Lee, H., and Pedersen, L. G. (1995) A smooth particle mesh Ewald method, J. Chem. Phys. 103 , 857 7 - 8593. [172] Darden, T., York, D., and Pedersen, L. (1993) Particle mesh E wald: An N ⋅ log (N) method for Ewald sums in large systems, J. Chem. Phys. 98 , 10089 - 10092. [173] Daura, X., Gademann, K., Jaun, B., Seebach, D., Van Gunsteren, W. F., and Mark, A. E. (1999) Peptide folding: when simulation meets experiment, Angew. Chem. I nt. Ed. 38 , 236 - 240. [174] Duan, J., Dixon, S. L., Lowrie, J. F., and Sherman, W. (2010) Analysis and comparison of 2D fingerprints: insights into database screening performance usi ng eight fingerprint methods, J. Mol. Graphics Model. 29 , 157 - 170. [175] Gu pta, P., Mohammad, T., Khan, P., Alajmi, M. F., Hussain, A., Rehman, M. T., and Hassan, M. I. (2019) Evaluation of ellagic acid as an inhibitor of sphingosine kinase 1: A targeted a pproach towards anticancer therapy, Biomed. Pharmacother 118 , 109245. [176] Bayraktar, O., Ozkirimli, E., and Ulgen, K. (2017) Sphingosine kinase 1 (SK1) allosteric inhibitors that target the dimerization site, Comput. Biol. Chem. 69 , 64 - 76. [177] Vettoraz zi, M., Angelina, E., Lima, S., Gonec, T., Otevrel, J., Marvanova, P., Padr tova, T., Mokry, P., Bobal, P., and Acosta, L. M. (2017) An integrative study

146

to identify novel scaffolds for sphingosine kinase 1 inhibitors, Eur. J. Med. Chem. 139 , 461 - 481. [178] Vettorazzi, M., Menéndez, C., Gutiérrez, L., Andujar, S., Appignanesi, G., and Enriz, R. D. (2018) Theoretical models to predict the inhibitory effect of ligands of sphingosine kinase 1 using QTAIM calculations and hydrogen bond dynamic propensity an alysi s, J. Comput. - Aided Mol. Des. 32 , 781 - 791. [179] Gupta, P., Mohammad, T., Dahiya, R., Roy, S., Noman, O. M. A., Alajmi, M. F., Hussain, A., and Hassan, M. I. (2019) Evaluation of binding and inhibition mechanism of dietary phytochemicals with sphingos ine kinase 1: Towards targeted anticancer therapy, Sci. Rep. 9 , 1 - 15. [180] Vettorazzi, M., Vila, L., Lima, S., Acosta, L., Yépes, F., Palma, A., Cobo, J., Tengler, J., Malik, I., and Alvarez, S. (2019) Synthesis and biological evaluation of sphingosine ki nase 2 inhibitors with anti ‐inflammatory activity, Archiv der Pharmazie 352 , 1800298. [181] Zhang, J., Zhang, M., Yu, J., Shang, Y., Jiang, K., Jia, Y., Wang, J., and Yang, K. (2020) Investigating the binding mechanism of sphingosine kinase 1/2 inhibitors: Insights into subtype selectivity by homology modeling, molecular dynamics simulation and free energy calculation studies, J. Mol. Struct. 1208 , 127900. [182] Gupta, P., Khan, F. I., Ambreen, D., Lai, D., Alajmi, M. F., Hussain, A., Islam, A., Ahmad, F., and Hassan, M. I. (2020) Investigation of guanidinium chloride - induced unfolding pathway of sphingosine kinase 1, Int. J. Biol. Macromol. . [183] Khan, F. I., Lai, D., Anwer, R., Azim, I., and Khan, M. K. A. (2020) Identifying novel sphingosine kinase 1 inh ibitors as therapeutics against breast cancer, J. Enzyme Inhib. Med. Chem. 35 , 172 - 186. [184] Fang, L., Wang, X., Xi, M., Liu, T., and Yin, D. (2016) Assessing the ligand selectivity of sphingosine kinases using molecular dynamics and MM - PBSA binding free energy calculations, Molecular BioSystems 12 , 1174 - 1182. [185] Vettorazzi, M., Insuasty, D., Lima, S., Gutiérrez, L., Nogueras, M., Marchal, A., Abonia, R., Andújar, S., Spiegel, S., and Cobo, J. (2020) Design of new quinolin - 2 - one - pyrimidine hybrids as sp hingosine kinases inhibitors, Bioorg. Chem. 94 , 103414. [186] Huang, J., and MacKerell Jr, A. D. (2013) CHARMM36 all ‐atom additive protein force field: Validation based on comparison to NMR data, J. Comput. Chem. 34 , 2135 - 2145. [187] Powell, J. A., Pitman, M. R., Zebol, J. R., Moretti, P. A., Neubauer, H. A., Davies, L. T., Lewis, A. C., Dagley, L. F., Webb, A. I., and Costabile, M. (2019) Kelch - like protein 5 - mediated ubiquitination of lysine 183 promotes proteasomal degradation of sphingosine kinase 1, Bi ochem. J. 476 , 3211 - 3226. [188] Adams, D. R., Pyne, S., and Pyne, N. J. (2016) Sphingosine kinases: emerging structure – function insights, Trends Biochem. Sci. 41 , 395 - 409. [189] Ding, G., Sonoda, H., Yu, H., Kajimoto, T., Goparaju, S. K., Jahangeer, S., Ok ada, T., and Nakamura, S. - i. (2007) Protein kinase D - mediated phosphorylation and nuclear export of sphingosine kinase 2, J. Biol. Chem. 282 , 27493 - 27502. [190] Pyne, N. J., Adams, D. R., and Pyne, S. (2017) Sphingosine kinase 2 in autoimmune/inflammatory disease and the development of sphingosine kinase 2 inhibitors, Trends Pharmacol. Sci. 38 , 581 - 591.

147