bioRxiv preprint doi: https://doi.org/10.1101/2020.10.24.353318; this version posted October 25, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. October 24, 2020 1 Fitting quantum machine learning 2 potentials to experimental free energy 3 data: Predicting tautomer ratios in 4 solution 5 Marcus Wieder* (0000-0003-2631-8415)1*, Josh Fass* (0000-0003-3719-266X)1,2, John D. Chodera 6 (0000-0003-0542-119X)1 7 *contributed equally to this work; 1Computational and Systems Biology Program, Sloan Kettering Institute, Memorial 8 Sloan Kettering Cancer Center, New York, NY 10065, USA; 2Tri-Institutional PhD Program in Computational Biology and 9 Medicine, Weill Cornell Graduate School of Medical Sciences, New York, NY 10065, USA 10 *For correspondence: 11
[email protected] (MW) 12 13 Abstract The computation of tautomer rations of druglike molecules is enormously important in computer- 14 aided drug discovery, as over a quarter of all approved drugs can populate multiple tautomeric species in so- 15 lution. Unfortunately, accurate calculations of aqueous tautomer ratios—the degree to which these species 16 must be penalized in order to correctly account for tautomers in modeling binding for computer-aided drug 17 discovery—is surprisingly difficult. While quantum chemical approaches to computing aqueous tautomer 18 ratios using continuum solvent models and rigid-rotor harmonic-oscillator thermochemistry are currently 19 state of the art, these methods are still surprisingly inaccurate despite their enormous computational ex- 20 pense.