Pka Measurements for the SAMPL6 Prediction Challenge for A
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/368787; this version posted July 13, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. PrEPRINT AHEAD OF SUBMISSION — July 13, 2018 P MEASUREMENTS FOR THE SAMPL6 1 K A PREDICTION CHALLENGE FOR A SET OF KINASE 2 inhibitor-likE FRAGMENTS 3 1,2 3 1,4 5 4 Mehtap Işık , DorOTHY LeVORSE , Ariën S. RustenburG , IKENNA E. Ndukwe , Heather 6 5 5 5 6 5 WANG , Xiao WANG , Mikhail Reibarkh , Gary E. Martin , AleXEY A. MakarOV , David L. 7 3* 1* 6 MobleY , Timothy Rhodes , John D. ChoderA Computational AND Systems Biology PrOGRam, Sloan Kettering Institute, Memorial Sloan 7 1 Kettering Cancer Center, NeW York, NY 10065, United States; Tri-INSTITUTIONAL PhD PrOGRAM IN 8 2 Chemical Biology, WEILL Cornell GrADUATE School OF Medical Sciences, Cornell University, NeW York, 9 NY 10065, United States; Pharmaceutical Sciences, MRL, MerCK & Co., Inc., 126 East Lincoln 10 3 Avenue, Rahway, NeW JerseY 07065, United States; GrADUATE PrOGRAM IN Physiology, Biophysics, 11 4 AND Systems Biology, WEILL Cornell Medical College, NeW York, NY 10065, United States; PrOCESS 12 5 AND Analytical ResearCH AND Development, MerCK & Co., Inc., Rahway, NJ 07065, United States; 13 Analytical ResearCH & Development, MRL, MerCK & Co., Inc., MRL, 126 East Lincoln Avenue, 14 6 Rahway, NeW JerseY 07065, United States; Department OF Pharmaceutical Sciences AND 15 7 Department OF Chemistry, University OF California, Irvine, Irvine, California 92697, United States 16 17 *For CORRespondence: 18 [email protected] (TR); [email protected] (JDC) 19 20 AbstrACT Determining THE NET CHARGE AND PROTONATION STATES POPULATED BY A SMALL MOLECULE IN AN envi- 21 RONMENT OF INTEREST—SUCH AS solvent, A PROTEIN BINDING site, OR A LIPID BILAYER—OR THE COST OF ALTERING THOSE 22 PROTONATION STATES UPON TRANSFER TO ANOTHER ENVIRONMENT IS A PREREQUISITE FOR PREDICTING ITS PHYSICOCHEMICAL 23 AND PHARMACEUTICAL PRoperties, AS WELL AS INTERACTIONS WITH BIOLOGICAL MACROMOLECULES USING COMPUTATIONAL 24 models. INCORRECTLY MODELING THE DOMINANT PROTONATION state, SHIFTS IN DOMINANT PROTONATION state, OR THE 25 POPULATION OF SIGNIfiCANT MIXTURES OF PROTONATION STATES CAN LEAD TO LARGE MODELING ERRORS THAT DEGRADE THE 26 ACCURACY OF PHYSICAL MODELING AND HINDER THE ABILITY TO USE PHYSICAL MODELING APPROACHES FOR MOLECULAR 27 design. For SMALL molecules, THE ACID DISSOCIATION CONSTANT (pK A) IS THE PRIMARY QUANTITY NEEDED TO deter- 28 MINE THE IONIC STATES POPULATED BY A MOLECULE IN AN AQUEOUS SOLUTION AT A GIVEN pH. As A PART OF SAMPL6 29 COMMUNITY challenge, WE ORGANIZED A BLIND PK A PREDICTION COMPONENT TO ASSESS THE ACCURACY WITH WHICH 30 CONTEMPORARY PK A PREDICTION METHODS CAN PREDICT THIS QUANTITY, WITH THE ULTIMATE AIM OF ASSESSING THE 31 EXPECTED IMPACT ON MODELING ERRORS THIS WOULD induce. While A MULTITUDE OF APPROACHES FOR PREDICTING PK A 32 VALUES CURRENTLY Exist, PREDICTING THE PK AS OF drug-likE MOLECULES CAN BE DIffiCULT DUE TO CHALLENGING PROPERTIES 33 SUCH AS MULTIPLE TITRATABLE sites, HETERocycles, AND tautomerization. For THIS challenge, WE FOCUSED ON SET 34 OF 24 SMALL MOLECULES SELECTED TO RESEMBLE SELECTIVE KINASE INHIBITORS—AN IMPORTANT CLASS OF THERAPEUTICS 35 REPLETE WITH TITRATABLE moieties. Using A Sirius T3 INSTRUMENT THAT PERFORMS AUTOMATED acid-base TITRations, 36 WE USED UV absorbance-based PK A MEASUREMENTS TO CONSTRUCT A high-quality EXPERIMENTAL REFERENCE DATASET 37 OF MACROSCOPIC PK AS FOR THE EVALUATION OF COMPUTATIONAL PK A PREDICTION METHODOLOGIES THAT WAS UTILIZED IN 38 THE SAMPL6 PK A challenge. For SEVERAL COMPOUNDS IN WHICH THE MICROSCOPIC PROTONATION STATES ASSOCIATED 39 WITH MACROSCOPIC PK AS WERE ambiguous, WE PERFORMED FOLLOw-up NMR EXPERIMENTS TO DISAMBIGUATE THE 40 MICROSTATES INVOLVED IN THE TRansition. This DATASET PROVIDES A USEFUL STANDARD BENCHMARK DATASET FOR THE 41 EVALUATION OF PK A PREDICTION METHODOLOGIES ON KINASE inhibitor-likE compounds. 1 OF 27 bioRxiv preprint doi: https://doi.org/10.1101/368787; this version posted July 13, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. PrEPRINT AHEAD OF SUBMISSION — July 13, 2018 42 43 KeYWORDS ⋅ ⋅ ⋅ ⋅ 44 ACID DISSOCIATION CONSTANTS SPECTROPHOTOMETRIC PK A MEASUREMENT BLIND PREDICTION CHALLENGE SAMPL ⋅ ⋅ ⋅ 45 MACROSCOPIC PK A MICROSCOPIC PK A MACROSCOPIC PROTONATION STATE MICROSCOPIC PROTONATION STATE 46 AbbrEVIATIONS 47 SAMPL Statistical Assessment OF THE Modeling OF PrOTEINS AND Ligands log 48 PK A – 10 ACID DISSOCIATION EQUILIBRIUM CONSTANT log 49 PSK A – 10 APPARENT ACID DISSOCIATION EQUILIBRIUM CONSTANT IN COSOLVENT 50 DMSO Dimethyl SULFOXIDE 51 ISA Ionic-strENGTH ADJUSTED 52 SEM StandarD ERROR OF THE MEAN 53 TFA TARGET FACTOR ANALYSIS 54 LC-MS Liquid CHROMATOGRAPHY - MASS SPECTROMETRY 55 NMR Nuclear MAGNETIC RESONANCE SPECTROSCOPY 56 HMBC HeterONUCLEAR Multiple-Bond CorrELATION 57 TFA-D DEUTERo-triflUOROACETIC ACID INTRODUCTION 58 59 SAMPL (Statistical Assessment OF THE Modeling OF PrOTEINS AND Ligands) IS A RECURRING SERIES OF BLIND PREDICTION 60 CHALLENGES FOR THE COMPUTATIONAL CHEMISTRY COMMUNITY [1, 2]. ThrOUGH THESE challenges, SAMPL AIMS TO 61 EVALUATE AND ADVANCE COMPUTATIONAL TOOLS FOR RATIONAL DRUG design. By FOCUSING THE COMMUNITY ON SPECIfiC 62 PHENOMENA RELEVANT TO DRUG DISCOVERY POORLY PREDICTED BY CURRENT models, ISOLATING THAT PHENOMENON 63 FROM OTHER CONFOUNDING FACTORS IN well-designed TEST systems, EVALUATING TOOLS PROSPECTIVELY, ENABLING DATA 64 SHARING TO LEARN FROM FAILURes, AND RELEASING THE RESULTING high-quality DATASETS INTO THE COMMUNITY AS 65 BENCHMARK sets, SAMPL HAS DRIVEN PROGRESS IN A NUMBER OF AREAS OVER SEVEN PREVIOUS ROUNDS OF CHALLENGE 66 CYCLES [3–7, 7–15]. 67 As A STEPPING STONE TO ENABLING THE ACCURATE PREDICTION OF PRotein-ligand BINDING Affinities, SAMPL 68 HAS FOCUSED ON EVALUATING HOW WELL PHYSICAL AND EMPIRICAL MODELING METHODOLOGIES CAN PREDICT VARIOUS 69 PHYSICOCHEMICAL PROPERTIES RELEVANT TO BINDING AND DRUG DISCOVERY, SUCH AS HYDRATION FREE ENERGIES (which 70 MODEL ASPECTS OF DESOLVATION IN isolation), DISTRIBUTION COEffiCIENTS (which MODEL TRANSFER FROM RELATIVELY 71 HOMOGENEOUS AQUEOUS TO NONPOLAR ENVIRonments), AND host-guest BINDING AffiNITIES (which MODEL high- 72 AffiNITY ASSOCIATION WITHOUT THE COMPLICATION OF SLOW PROTEIN dynamics). These PHYSICOCHEMICAL PROPERTY 73 PREDICTION CHALLENGES—IN ADDITION TO ASSESSING THE PREDICTIVE ACCURACY OF QUANTITIES THAT ARE USEFUL IN VARIOUS 74 STAGES OF DRUG DISCOVERY IN THEIR OWN RIGHT—HAVE BEEN HELPFUL IN PINPOINTING DEfiCIENCIES IN COMPUTATIONAL 75 MODELS THAT CAN LEAD TO SUBSTANTIAL ERRORS IN AffiNITY PRedictions. Neglect OF PROTONATION STATE EffECTS CAN LEAD TO LARGE MODELING ERRORS 76 77 As PART OF THE SAMPL5 CHALLENGE series, A NEW CYCLOHExane-water DISTRIBUTION CONSTANT (log D) PREDICTION 78 CHALLENGE WAS INTRoduced, WHERE PARTICIPANTS PREDICTED THE TRANSFER FREE ENERGY OF SMALL drug-likE MOLECULES 79 BETWEEN AN AQUEOUS BUffER PHASE AT pH 7.4 AND A NONAQUEOUS CYCLOHEXANE PHASE [16, 17]. While octanol- 80 WATER DISTRIBUTION COEffiCIENT MEASUREMENTS ARE MORE common, CYCLOHEXANE WAS SELECTED FOR THE SIMPLICITY 81 OF ITS LIQUID PHASE AND RELATIVE DRYNESS COMPARED TO WET OCTANOL phases. While THE EXPECTATION WAS THAT 82 THIS CHALLENGE WOULD BE RELATIVELY STRAIGHTFORWARD GIVEN THE LACK OF COMPLEXITY OF CYCLOHEXANE phases, 83 ANALYSIS OF PARTICIPANT PERFORMANCE REVEALED THAT MULTIPLE FACTORS CONTRIBUTED TO SIGNIfiCANT PREDICTION 84 FAILURes: POOR CONFORMATIONAL SAMPLING OF flEXIBLE SOLUTE molecules, MISPREDICTION OF RELEVANT PROTONATION 85 AND TAUTOMERIC STATES (or FAILURE TO ACCOMMODATE SHIFTS IN THEIR populations), AND FORCE fiELD INACCURACIES 86 RESULTING IN BIAS TOWARDS THE CYCLOHEXANE phase. While THESE fiNDINGS JUSTIfiED THE BENEfiT OF FUTURE ITERATIONS 87 OF BLIND DISTRIBUTION OR PARTITION COEffiCIENT challenges, THE MOST SURPRISING OBSERVATION FROM THIS INITIAL LOG D 88 CHALLENGE WAS THAT PARTICIPANTS ALMOST UNIFORMLY NEGLECTED TO ACCURATELY MODEL PROTONATION STATE Effects, 2 OF 27 bioRxiv preprint doi: https://doi.org/10.1101/368787; this version posted July 13, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. PrEPRINT AHEAD OF SUBMISSION — July 13, 2018 89 AND THAT NEGLECT OF THESE EffECTS LED TO SURPRISINGLY LARGE ERRORS IN TRANSFER FREE ENERGIES [16–18]. CarEFUL 90 QUANTUM CHEMICAL ASSESSMENTS OF THE MAGNITUDE OF THESE PROTONATION STATE EffECTS FOUND THAT THEIR NEGLECT 91 COULD INTRODUCE ERRORS UP TO 6–8 Kcal/mol FOR SOME COMPOUNDS [18]. This EffECT STEMS FROM THE NEED TO 92 ACCOUNT FOR THE FREE ENERGY DIffERENCE BETWEEN THE MAJOR IONIZATION STATE IN CYCLOHEXANE (most LIKELY NEUTRAL 93 state) AND IN WATER PHASE (which COULD BE NEUTRAL OR CHARged). 94 TO ISOLATE THESE SURPRISINGLY LARGE PROTONATION STATE MODELING ERRORS FROM DIffiCULTIES RELATED TO lipophilic- 95 ITY (log P AND LOG D) PREDICTION methods, WE DECIDED TO ORGANIZE A SET OF STAGED PHYSICOCHEMICAL PROPERTY 96 CHALLENGES USING A CONSISTENT SET OF MOLECULES THAT RESEMBLE SMALL MOLECULE KINASE INHIBITORS—AN IMPORTANT 97 DRUG CLASS REPLETE