COVID-19: predicting inhibition of the main protease and therapeutic intracellular accumulation and plasma and lung concentrations of repurposed inhibitors

Clifford Fong

To cite this version:

Clifford Fong. COVID-19: predicting inhibition of the main protease and therapeutic intracellular accumulation and plasma and lung concentrations of repurposed inhibitors. [Research Report] Eigenenergy. 2020. hal-02917312

HAL Id: hal-02917312 https://hal.archives-ouvertes.fr/hal-02917312 Submitted on 20 Aug 2020

COVID-19: predicting inhibition of the main protease and therapeutic intracellular accumulation and plasma and lung concentrations of repurposed inhibitors

Clifford W. Fong

Eigenenergy, Adelaide, South Australia, Australia. Email: [email protected]

Keywords: COVID-19 or SARS-CoV-2; SARS-CoV; MERS; 3C-like protease, 3CLpro or Mpro; inhibition; IC50; EC50; EC90; host cell membrane transport; AUC; Cmax; linear free energy relationships; HOMO-LUMO; quantum mechanics

Abbreviations: SAR, structure activity relationships; ΔGdesolv,CDS, free energy of water desolvation; ΔGlipo,CDS, lipophilicity free energy; CDS, cavity dispersion solvent structure of the first solvation shell; DM, dipole moment; Vol, molecular volume; HOMO, highest occupied molecular orbital; LUMO, lowest unoccupied molecular orbital; HOMO-LUMO, energy gap between the HOMO and the LUMO; LFER, linear free energy relationships; AUC, area under the curve; Cmax, highest concentration of drug in blood plasma

Summary

A linear free energy relationship (LFER) has been shown to describe the structure activity relationship for inhibition of the main protease, Mpro, of SARS-CoV-2, the virus that causes COVID-19. This LFER can be used to rank the predicted inhibitory efficacy of a series of repurposed drugs against the main protease of SARS-CoV-2, as well as those of SARS-CoV and MERS. The same LFER also applies to the intracellular accumulation of Mpro inhibitors from the plasma and to their inhibitory efficacy, Cmax/EC90, in the targeted lung tissue. The LFER is a linear combination of the desolvation energy, lipophilicity, dipole moment, molecular volume and HOMO-LUMO energy gap, with different combinations of these fundamental molecular specifiers applying to different structural series of inhibitors.

It is also shown that protonation of basic drugs has a major influence on bioavailability in the target lung tissue (pH 6.7) compared with that in the plasma (pH 7.4), and that the major difference between the neutral and charged species is due to differences in the desolvation energy of the inhibitors. Neutral species can passively penetrate the infected cell membrane, whereas endocytosis (which requires some degree of desolvation as the drug is engulfed by the lipophilic membrane) may be required to transport larger drugs across the cell membrane.

There is evidence in the literature that molecular docking methods that derive binding energies to predict likely inhibitors of Mpro of SARS-CoV-2 do not always correlate well with structure activity inhibitory studies of Mpro. This study shows that the binding energy of a series of inhibitors is well correlated with the HOMO-LUMO energy gap and the molecular volume of the inhibitors.

Introduction

There has been much activity seeking repurposed drugs that may be therapeutically active against COVID-19 (SARS-CoV-2). Such activity is an adjunct to, and in support of, the main search for an effective vaccine to control the virus. Repurposed drugs offer the advantage of having already been assessed for unwanted side effects in humans, but their efficacy against SARS-CoV-2 needs to be assessed. The search for repurposed drugs has largely centred on using deep learning and other artificial intelligence algorithms to screen very large numbers of existing drugs with molecular docking techniques. Other approaches have used quantitative structure activity relationships or linear free energy relationships to predict potential efficacy against the SARS-CoV-2 main protease, Mpro, a critical component of the coronavirus replication mechanism.

We have recently documented an LFER structure activity model that predicts the inhibitory efficacy of a wide range of repurposed, previously approved drugs against the SARS-CoV and MERS coronaviruses. [1-3] In this study we extend this method to the SARS-CoV-2 main protease, Mpro, again evaluating repurposed drugs.

Vatansever et al [4] conducted a docking evaluation of 55 previously approved anti-viral and antimicrobial drugs with the Mpro of SARS-CoV-2 (6LU7 crystal structure) and chose 29 drugs that showed a binding energy lower than -8.3 kcal/mol for IC50 studies. It was noted that docking results did not necessarily correlate with IC50 studies, similar to observations made by Bobrowski. [5] The most effective inhibitors (with IC50 values below 100 µM) were pimozide, ebastine, bepridil, rupintrivir, sertaconazole, rimonabant and oxiconazole, with the first three being the most effective. These drugs were studied as bases with the expectation that they would exert a dual function of raising the endosomal pH to slow viral entry by impairing viral fusion and assembly, as well as inhibiting Mpro in infected cells.

However, a substantial shortcoming of many searches for effective inhibitors of the main protease Mpro (or 3C-like protease, 3CLpro) is the lack of due consideration of how such anti-virals can be targeted to the relevant organs via adequate plasma concentrations, and then how such drugs can enter the virus infected cells, inhibit the Mpro and hence stop virus replication processes.

The in vivo intracellular accumulation of anti-viral protease inhibitors is dominated by the amount of inhibitor bound to plasma protein (for example nelfinavir, saquinavir, amprenavir and lopinavir, ca 90-99%; indinavir, 60%) compared to the amount of free inhibitor available to traverse cell membranes in the target tissue. The in vivo intracellular accumulation of a series of anti-virals (expressed as the ratio of the intracellular area under the curve, AUC, to the total plasma AUC throughout the dosage interval) has been found to be: nelfinavir > saquinavir > amprenavir > nelfinavir metabolite M8 > lopinavir > ritonavir > indinavir. These drugs are mainly weak bases at physiological pH, and their neutral forms are more likely to passively traverse cell membranes than their ionized or protonated counterparts, although the molecular properties that contributed to ease of intracellular transport could not be determined for these drugs. [6] It was also noted that the potential for sequestration of basic drugs in acidic compartments such as lymphocytes will influence viral replication processes as well (by slowing viral entry into cells). This is similar to the proposed inhibition of endosomal acidification by chloroquine analogs as a potential therapeutic strategy for viral infections. [7]

Petersen [8] has shown that acidic endosomes and TPC-mediated Ca++ release from the endo-lysosomal system are important factors in both SARS-CoV-2 entry and NAADP-mediated Ca++ signaling. Endosomal acidification and loss of Ca++ are interlinked. The uptake of H+ into endosomes occurs simultaneously with release of Ca++, and the two processes are interdependent.

A recent study by Ashad et al [9] applied human pharmacokinetic models to in vitro anti-SARS-CoV-2 activity data from all available publications up to 13th April 2020 to recalculate an EC90 value for each drug. The maximum achievable plasma concentrations (Cmax) reported for each drug after administration of the approved dose to humans were then expressed as a ratio to the EC90 values (Cmax/EC90 ratio). Only 14 of the 56 analyzed drugs achieved a Cmax/EC90 ratio above 1, meaning that plasma Cmax concentrations exceeded those necessary to inhibit 90% of SARS-CoV-2 replication. For all drugs reported, the unbound lung to plasma tissue partition coefficient (KpUlung) was also simulated and used along with reported Cmax and fraction unbound in plasma in Vero E6 cells to derive a lung Cmax/EC50 as a wider indicator of potential human efficacy. Using the more rigorous Cmax/EC90 ratio, eltrombopag, , remdesivir, nelfinavir, niclosamide, nitazoxanide and tipranavir were predicted to be the most effective drugs tested, with anidulafungin, lopinavir, chloroquine and ritonavir having lesser efficacy.
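As a minimal sketch of the screening criterion described above, the ratio can be computed from a plasma Cmax and an EC90 expressed in the same concentration units; a ratio above 1 means that the peak plasma concentration exceeds the concentration needed for 90% inhibition. The function name, drug labels and numerical values below are hypothetical and are not taken from ref [9].

```python
def cmax_ec90_ratio(cmax_uM: float, ec90_uM: float) -> float:
    """Return Cmax/EC90; both inputs must be in the same units (here µM)."""
    if ec90_uM <= 0:
        raise ValueError("EC90 must be positive")
    return cmax_uM / ec90_uM


# Hypothetical (Cmax, EC90) pairs in µM, for illustration only.
candidates = {"drug_A": (12.0, 3.0), "drug_B": (0.8, 4.0)}

ratios = {name: cmax_ec90_ratio(c, e) for name, (c, e) in candidates.items()}

# Drugs whose peak plasma concentration exceeds their EC90.
exceeds_ec90 = [name for name, r in ratios.items() if r > 1.0]

print(ratios)
print(exceeds_ec90)
```

The same function applies to a lung Cmax/EC50 ratio if a lung concentration and an EC50 are supplied instead.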

The aims of this study are:

(a) To use a previously documented LFER method to evaluate the inhibitory efficacy of a series of repurposed drugs against the Mpro of SARS-CoV-2, and to compare the results with those previously found for SARS-CoV and MERS.

(b) To determine whether the same LFER method used for inhibition of the Mpro of SARS-CoV-2, SARS-CoV and MERS can be applied to pharmacological properties such as the likely intracellular accumulation of these inhibitors from the plasma and their inhibitory efficacy in the targeted lung tissue.

Results

(a) Inhibition of Mpro of SARS-CoV-2 and comparisons with SARS-CoV and MERS

The evaluation method used in this study and others [1-3] was to calculate the molecular specifiers for the various inhibitors of the 3C-like protease: (1) the free energy of water desolvation (ΔGdesolv,CDS), (2) the lipophilicity free energy (ΔGlipo,CDS) in n-octane, (3) the dipole moment in water, (4) the molecular volume in water, and (5) the HOMO energy, (6) the LUMO energy or (7) the HOMO-LUMO energy gap in water. The values of these independent variables can be scaled to similar magnitudes so that the coefficients in the multiple linear regression equations can be directly compared to gauge the relative sensitivity of inhibition to each molecular variable. Stepwise multiple regression is then applied to identify which of the seven molecular properties have the largest and most significant effect on the inhibition. Equations 1-7 show the most statistically significant relationships found after testing against all independent variables in a stepwise fashion.

We have previously shown that equation 1 accurately describes the inhibition of the HKU4-CoV Mpro. HKU4-CoV belongs to the same 2c lineage as MERS-CoV, and its Mpro shares high sequence identity (81%) with the MERS-CoV enzyme.
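The scaling and regression steps described above can be sketched as follows. This is a minimal ordinary least-squares fit in plain Python, not the statistical package used for the reported equations; the function names and the data are assumptions made for illustration, and the "activity" values are synthetic, generated from known coefficients so that the fit can be checked.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y = b0 + sum(b_j * x_j); return ([b0, b1, ...], R squared)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic descriptors: HOMO-LUMO gap (eV) and molecular volume (cm3/mol).
gap_eV = [4.0, 5.0, 3.5, 4.5, 5.5]
volume = [200.0, 350.0, 250.0, 400.0, 300.0]

# Scale the volume so that both descriptors have similar magnitudes.
volume_scaled = [v / 100.0 for v in volume]

# Synthetic response generated from known coefficients (2, 3, -1).
activity = [2.0 + 3.0 * g - 1.0 * v for g, v in zip(gap_eV, volume_scaled)]

coeffs, r_squared = fit_ols(list(zip(gap_eV, volume_scaled)), activity)
print(coeffs, r_squared)
```

A stepwise procedure would repeat such a fit while adding or removing descriptors and comparing the significance of each model.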

Eq 1

Inhibition of the Mpro protease of HKU4-CoV for 40 compounds:

pIC50 = 0.05ΔGdesolv,CDS - 0.11ΔGlipo,CDS - 0.08Dipole Moment - 0.23(HOMO-LUMO) + 6.63

Where R² = 0.382, SEE = 0.39, SE(ΔGdesolvCDS) = 0.04, SE(ΔGlipoCDS) = 0.04, SE(Dipole Moment) = 0.025, SE(HOMO-LUMO) = 0.08, F=5.42, Significance=0.0017, and where ΔGdesolv,CDS is the free energy of water desolvation, ΔGlipo,CDS is the lipophilicity free energy, Dipole Moment is the dipole moment in water, and HOMO-LUMO is the energy gap in water.
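Eq 1 is expressed as pIC50, whereas the later equations use IC50 in µM. The sketch below converts between the two, assuming the conventional definition pIC50 = -log10(IC50 in mol/L), which the text does not state explicitly; the function names are illustrative.

```python
import math


def pic50_to_ic50_uM(pic50: float) -> float:
    """Convert pIC50 (assumed -log10 of IC50 in mol/L) to IC50 in µM."""
    return 10.0 ** (6.0 - pic50)


def ic50_uM_to_pic50(ic50_uM: float) -> float:
    """Convert IC50 in µM to pIC50 under the same assumption."""
    if ic50_uM <= 0:
        raise ValueError("IC50 must be positive")
    return 6.0 - math.log10(ic50_uM)


print(pic50_to_ic50_uM(6.0))    # 1 µM corresponds to pIC50 = 6
print(ic50_uM_to_pic50(100.0))  # 100 µM corresponds to pIC50 = 4
```

Under this definition a larger pIC50 means a more potent inhibitor, so coefficient signs in Eq 1 have the opposite sense to those in equations written for IC50.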

The important finding was that pIC50 is dominantly related to the HOMO-LUMO energy gap of the inhibitors. The HOMO-LUMO gap is an inherent descriptor of the innate reactivity of the inhibitor, and is related to how the inhibitor binds to the protease: in particular, to how the HOMO of the protease (HOMOprot) interacts with the LUMO of the inhibitor (LUMOinhib), and how the HOMO of the inhibitor (HOMOinhib) interacts with the LUMO of the protease (LUMOprot). These molecular interactions fundamentally define the inhibitor-protease binding interaction.

We previously showed that eq 2 describes the inhibition of the Mpro protease of SARS-CoV by 35 aromatic disulphide drugs:

Eq 2

IC50 = -0.74ΔGdesolv,CDS - 0.30ΔGlipo,CDS - 0.21Dipole Moment + 0.25(HOMO-LUMO) - 2.31

Where R² = 0.725, SEE = 0.69, SE(ΔGdesolvCDS) = 0.11, SE(ΔGlipoCDS) = 0.075, SE(Dipole Moment) = 0.07, SE(HOMO-LUMO) = 0.38, F=19.75, Significance=0.000000

Similarly we also showed that a different series of 25 inhibitors of the SARS-CoV Mpro protease yielded eq 3(a) and 3(b) (although the experimental IC50 values were of lower accuracy than those analyzed in eqs 1 and 2):

Eq 3(a)

IC50 = -25.2ΔGdesolv,CDS + 19.7ΔGlipo,CDS + 5.3Dipole Moment + 24.1HOMO + 192.1

Where R² = 0.347, SEE = 102.8, SE(ΔGdesolvCDS) = 10.6, SE(ΔGlipoCDS) = 15.4, SE(Dipole Moment) = 7.8, SE(HOMO) = 41.8, F=2.79, Significance=0.053

Eq 3(b)

IC50 = -29.2ΔGdesolv,CDS + 15.2ΔGlipo,CDS + 7.2Dipole Moment + 27.7LUMO + 4.3

Where R² = 0.350, SEE = 102.5, SE(ΔGdesolvCDS) = 12.3, SE(ΔGlipoCDS) = 15.5, SE(Dipole Moment) = 7.2, SE(LUMO) = 42.3, F=2.82, Significance=0.051

Using the same methodology, stepwise analysis of 23 inhibitors (see Table 1 and Figure 1) of the Mpro of SARS-CoV-2 yields eq 4(a), 4(b) and 4(c):

Eq 4(a)

IC50 = 11.65ΔGdesolv,CDS - 5.75ΔGlipo,CDS – 4.66Dipole Moment + 56.68(HOMO-LUMO) +118.83

Where R² = 0.28, SEE = 193.56, SE(ΔGdesolvCDS) = 15.29, SE(ΔGlipoCDS) = 19.50, SE(Dipole Moment) = 8.25, SE(HOMO-LUMO) = 38.24, F=1.33, Significance=0.300

Eq 4(b)

Eliminating the ΔGlipo,CDS and Dipole Moment variables from eq 4(a), as they show the weakest correlations, gives:

IC50 = 13.00ΔGdesolv,CDS + 66.55(HOMO-LUMO) + 60.40

Where R² = 0.21, SEE = 186.35, SE(ΔGdesolvCDS) = 10.30, SE(HOMO-LUMO) = 34.39, F=2.57, Significance=0.101

Eq 4(c)

Finally, it is clear that the dominant correlation is between inhibitory activity and the HOMO-LUMO energy gap:

IC50 = 64.71(HOMO-LUMO) + 60.40

Where R² = 0.14, SEE = 189.96, SE(HOMO-LUMO) = 34.84, F=3.49, Significance=0.077

It is noted that bepridil, oxiconazole, nelfinavir, trihexyphenidyl, clemastine and metixene were obtained as ionic salts [4] but were treated as the neutral and ionic species in equal proportions, as the inhibitors were initially dissolved in DMSO (pH ca 10.7) and then added to the protease in pH 7.8 buffer with 20% DMSO at 37 °C. [4] Duloxetine was a clear outlier in all analyses.
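The reported R² and F values can be cross-checked against each other. The sketch below assumes an ordinary least-squares fit with an intercept, and assumes that the number of compounds quoted for each equation is the regression sample size; under those assumptions F = (R²/k) / ((1 - R²)/(n - k - 1)) for n observations and k predictors. Small differences from the reported F values are expected because R² is quoted to only two or three figures.

```python
def f_statistic(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Overall regression F from R squared, assuming OLS with an intercept."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    if df_resid <= 0:
        raise ValueError("not enough observations for this many predictors")
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# (R squared, assumed n, number of predictors, reported F) from the text.
reported = {
    "Eq 1": (0.382, 40, 4, 5.42),
    "Eq 2": (0.725, 35, 4, 19.75),
    "Eq 4(c)": (0.14, 23, 1, 3.49),
}

recomputed = {
    label: f_statistic(r2, n, k) for label, (r2, n, k, _) in reported.items()
}

for label, (_, _, _, f_rep) in reported.items():
    print(f"{label}: recomputed F = {recomputed[label]:.2f}, reported F = {f_rep}")
```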

Since Vatansever et al [4] noted that their docking results did not necessarily correlate with their IC50 results, the docking binding energies were analyzed, giving eqs 5(a) and 5(b), where the molecular volume in water has been scaled by a factor of 100 to allow a direct comparison of the magnitudes of the coefficients. The results indicate that the HOMO-LUMO gap is the dominant molecular specifier that determines docking binding energy, with the molecular volume coefficient being about a quarter of its magnitude. The direct correlation between the docking binding energy and IC50 is poor (significance F 0.22).

Eq 5(a)

Docking Binding Energy = 0.45(HOMO-LUMO) - 0.12Molecular Volume - 11.07

Where R² = 0.21, SEE = 0.81, SE(HOMO-LUMO) = 0.24, SE(Molec Vol) = 0.19, F=2.53, Significance=0.105
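As an illustration, Eq 5(a) can be evaluated for entries of Table 1. The text describes a 100-fold scaling of the molecular volume; the sketch below assumes that this means the volume in cm3/mol is divided by 100 before it enters the equation, which yields energies of the same magnitude as the docking values. The function name is illustrative, and residuals of the order of the reported SEE (0.81 kcal/mol) or larger are to be expected given R² = 0.21.

```python
def docking_energy_eq5a(gap_eV: float, volume_cm3_mol: float) -> float:
    """Evaluate Eq 5(a) in kcal/mol.

    Assumption: the molecular volume enters as volume / 100.
    """
    return 0.45 * gap_eV - 0.12 * (volume_cm3_mol / 100.0) - 11.07


# (HOMO-LUMO gap in eV, volume in cm3/mol, docking energy in kcal/mol), Table 1.
table1_rows = {
    "Pimozide": (5.33, 357.0, -10.01),
    "Ebastine": (4.22, 370.0, -10.62),
}

predicted = {
    name: docking_energy_eq5a(gap, vol) for name, (gap, vol, _) in table1_rows.items()
}

for name, (_, _, observed) in table1_rows.items():
    residual = observed - predicted[name]
    print(f"{name}: predicted {predicted[name]:.2f}, docked {observed}, residual {residual:.2f}")
```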

Eq 5(b)

Docking Binding Energy = 0.12(HOMO-LUMO) - 0.86Molecular Volume - 0.22ΔGlipo,CDS - 9.95

Where R² = 0.19, SEE = 0.85, SE(HOMO-LUMO) = 0.16, SE(Molec Vol) = 0.48, SE(ΔGlipoCDS) = 0.15, F=1.36, Significance=0.287

Eq 5(a) is clearly the stronger correlation, but there may be evidence of a contribution from hydrophobic bonding in the binding energies.

(b) Pharmacological properties: intracellular accumulation of inhibitors of the Mpro of SARS-CoV-2 from the plasma and their inhibitory efficacy in the targeted lung tissue

Ford [6] has previously determined the order of in vivo intracellular accumulation of a series of protease inhibitors to be: nelfinavir > saquinavir > amprenavir > nelfinavir metabolite M8 > lopinavir > ritonavir > indinavir. LFER analysis of the AUC ratios (Table 2) gives eq 6(a) or 6(b), where the molecular volume has been scaled by a factor of 100 to allow a comparison of the relative magnitudes of the coefficients for the molecular specifiers:

Eq 6(a)

AUC ratio = -0.31ΔGdesolv,CDS - 1.35Molec Volume - 1.39(HOMO-LUMO) + 12.27

Where R² = 0.801, SEE = 1.06, SE(ΔGdesolvCDS) = 0.18, SE(MolecVol) = 0.50, SE(HOMO-LUMO) = 0.72, F=4.04, Significance=0.140

Eq 6(b)

AUC ratio = -1.06Molec Volume - 1.39(HOMO-LUMO) + 14.51

Where R² = 0.607, SEE = 1.29, SE(MolecVol) = 0.57, SE(HOMO-LUMO) = 0.88, F=3.09, Significance=0.154

It is noted that eq 6(a) or 6(b) can only be indicative, since the number of experimental AUC ratio data points is too small to be statistically robust. The results suggest that in vivo intracellular accumulation of the protease inhibitors is largely determined by the molecular volume and the HOMO-LUMO gap in about equal proportions.
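One hedged way to see how well each coefficient in Eq 6(a) is determined is to divide it by its reported standard error. This is a generic regression diagnostic rather than a step described in the text; with only seven inhibitors and three predictors there are few residual degrees of freedom, so the ratios are indicative only. The dictionary keys are illustrative labels.

```python
# (coefficient, standard error) pairs as reported for Eq 6(a).
eq6a_terms = {
    "dG_desolv_CDS": (-0.31, 0.18),
    "molecular_volume": (-1.35, 0.50),
    "homo_lumo_gap": (-1.39, 0.72),
}

# Ratio of each coefficient to its standard error (a t-like statistic).
t_ratios = {name: coef / se for name, (coef, se) in eq6a_terms.items()}

# Term with the largest absolute ratio, i.e. the best-determined coefficient.
best_determined = max(t_ratios, key=lambda name: abs(t_ratios[name]))

for name, t in t_ratios.items():
    print(f"{name}: coefficient/SE = {t:.2f}")
print("best determined:", best_determined)
```

The volume and HOMO-LUMO coefficients are of similar size (-1.35 and -1.39), which matches the statement that the two terms contribute in about equal proportions, although the volume term is the more precisely determined of the two.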

Analysis of Ashad’s data [9] for lung Cmax/EC50 and Cmax/EC90 derived from human pharmacokinetic models for in vitro anti-SARS-CoV-2 studies yields eqs 7(a) and 7(b):

Eq 7(a)

Cmax/EC50 = -0.32ΔGdesolv,CDS - 0.32ΔGlipo,CDS + 0.83Dipole Moment – 1.05Molec Volume – 1.38(HOMO-LUMO) + 3.68

Where R² = 0.745, SEE = 2.34, SE(ΔGdesolvCDS) = 0.31, SE(ΔGlipoCDS) = 0.39, SE(Dipole Moment) = 0.26, SE(Molec Vol) = 0.65, SE(HOMO-LUMO) = 2.08, F=4.08, Significance=0.047

Eq 7(b) using the more accurate and significant experimental Cmax/EC90 values:

Cmax/EC90 = -0.27ΔGdesolv,CDS - 0.32ΔGlipo,CDS + 0.55Dipole Moment – 0.88Molec Volume – 0.98(HOMO-LUMO) + 1.23

Where R² = 0.926, SEE = 0.75, SE(ΔGdesolvCDS) = 0.11, SE(ΔGlipoCDS) = 0.15, SE(Dipole Moment) = 0.11, SE(Molec Vol) = 0.34, SE(HOMO-LUMO) = 0.74, F=12.44, Significance=0.0075

Despite the limited number (13) of experimental Cmax/EC50 and Cmax/EC90 values, which is insufficient for robust statistical significance, the close similarity of eqs 7(a) and 7(b), which are derived from independent experimental quantities, indicates that all five independent variables (or molecular specifiers) are important contributors to the Cmax/EC50 and Cmax/EC90 values. Also, stepwise elimination of any of the independent variables does not significantly change how the remaining variables contribute to the Cmax/EC50 and Cmax/EC90 correlations. It is noted that the molecular volume is scaled by a factor of 50 to allow comparison of the coefficients of the independent variables. It appears that molecular volume and the HOMO-LUMO gap are approximately equal major molecular determinants of the Cmax/EC50 and Cmax/EC90.
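Eq 7(b) can be evaluated directly from the descriptors listed in Table 3. The sketch below uses the neutral-species descriptors and assumes that the 50-fold scaling of the molecular volume means the volume in cm3/mol is divided by 50; both choices are assumptions made for illustration, and individual residuals can exceed the reported SEE of 0.75.

```python
def cmax_ec90_eq7b(dg_desolv, dg_lipo, dipole, volume_cm3_mol, gap_eV):
    """Evaluate Eq 7(b).

    Assumption: the molecular volume enters as volume / 50.
    """
    scaled_volume = volume_cm3_mol / 50.0
    return (
        -0.27 * dg_desolv
        - 0.32 * dg_lipo
        + 0.55 * dipole
        - 0.88 * scaled_volume
        - 0.98 * gap_eV
        + 1.23
    )


# Table 3 rows: (dG_desolv, dG_lipo, dipole, volume, gap, tabulated Cmax:EC90).
table3_rows = {
    "Nelfinavir": (-11.77, -13.78, 11.8, 390.0, 4.77, 3.755),
    "Favipiravir": (-4.6, -1.03, 8.96, 82.0, 4.01, 2.469),
}

predicted = {
    name: cmax_ec90_eq7b(*row[:5]) for name, row in table3_rows.items()
}

for name, row in table3_rows.items():
    print(f"{name}: Eq 7(b) gives {predicted[name]:.2f}, Table 3 lists {row[5]}")
```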

Discussion

(a) Inhibition of Mpro of SARS-CoV-2 and comparisons with SARS-CoV and MERS

Examination of the LFER eqs 1-4, which apply to the inhibition of Mpro by a wide and diverse range of inhibitors, indicates that an equation of the same general form applies to SARS-CoV-2, SARS-CoV, MERS, HKU4-CoV and other members of the Coronaviridae family, which include those derived from bats, civets, birds and cats. [3]

General form (1) for inhibition of coronavirus proteases:

IC50 = ΔGdesolv,CDS + ΔGlipo,CDS + Dipole Moment + (HOMO-LUMO) (or LUMO or HOMO)

The dominant molecular specifier in eqs 1-4 is the HOMO-LUMO energy gap. The HOMO-LUMO gap is an inherent descriptor of the innate reactivity of the inhibitor, and is related to how the inhibitor binds to the protease: in particular, to how the HOMO of the protease (HOMOprot) interacts with the LUMO of the inhibitor (LUMOinhib), and how the HOMO of the inhibitor (HOMOinhib) interacts with the LUMO of the protease (LUMOprot). These molecular interactions fundamentally define the inhibitor-protease binding interaction.

We have previously used eqs 1, 2 and 3, which describe the inhibition of the MERS HKU4-CoV Mpro and the SARS-CoV Mpro, to rank the likely inhibition by a wide range of currently available repurposed anti-virals. [3] We have extended this study by using eqs 4(a), (b) and (c) for the inhibition of the Mpro of SARS-CoV-2. It is known that there is a 96% similarity between the Mpro of SARS-CoV and that of SARS-CoV-2, and comparison of the X-ray crystal structures of the bat HKU4-CoV Mpro with the SARS-CoV-2 Mpro reveals a 65% sequence similarity. The bat HKU4 Mpro shares high sequence identity (81%) with the MERS-CoV Mpro as well. Protease structures from the coronavirus strains causing human respiratory infections such as SARS, MERS and SARS-CoV-2, as well as those from bats, are highly conserved. The phylogenetic analysis validated the bat origin of SARS-CoV-2. [10] It is apparent that these LFERs can apply to the Coronaviridae family, and may be useful predictors of inhibitory efficacy against future coronaviruses that may emerge.

These results are shown in Table 2. These data can only rank the likely efficacy of the various repurposed inhibitors relative to each other, since it is clear that these LFER equations are specific to a particular structural class of inhibitors active against the various proteases, so the ranking derived from a particular LFER equation depends on the structural class of the inhibitors. We have previously shown that inhibitors that are predominantly charged at physiological pH levels will not easily passively permeate, or be actively transported by endocytosis, across the cell membrane of host cells, since the desolvation energy is greatly increased for charged inhibitors, as shown in Table 2. It should also be noted that the calculated inhibitory capabilities of the protonated and di-protonated anti-virals are less meaningful than those for the neutral species in Table 2, because equations 1, 2, 3 and 4 were derived from various series of neutral anti-virals.

Molecular docking is currently the mainstay of predictive computational methods to evaluate new and repurposed anti-virals for coronaviruses, and there have been some reports [4,5] that inhibitory structure activity relationships (SARs) do not always agree with docking results for the Mpro. We have tested this observation using the data of Vatansever [4], and find that eq 5 shows that the docking binding energy is a function of the HOMO-LUMO gap and the molecular volume. Such LFERs may be a more cost effective means of screening new drugs than the more intensive docking method.

(b) Intracellular accumulation of Mpro inhibitors from the plasma and their inhibitory efficacy in the targeted lung tissue

We have previously shown that, in vivo, the inhibitory properties of anti-virals used to treat coronaviruses will depend on how well the drugs can enter the infected host cells. [2,3] For diffusion-dominated transport of drugs across the host cell membrane, small neutral anti-virals like favipiravir would have better membrane transport than drugs that are charged at physiological pH levels. However, for larger anti-virals, endocytosis is the likely transport mechanism. Endocytosis requires active cellular energy input, since substantial energy is required to allow the lipophilic cell membrane to engulf the drug and eventually deposit the drug into the cytoplasm. Endocytosis also requires some degree of desolvation of the drug before membrane engulfment, since the membrane is lipophilic, so desolvation, lipophilicity and molecular size of the drug are major determinants of endocytosis.

We have previously found [11-15] that an equation of the following general form can describe the passive and some active transport of drugs across the cell membrane or blood brain barrier:

General form (2) for intracellular transport of drugs:

Drug Accumulation = ΔGdesolv,CDS + ΔGlipo,CDS + Dipole Moment + Molecular Volume

However endocytosis is a major transport mechanism for larger drugs (> 0.5-1 kDa). [16,17] Once inside the cell, the intracellular fate of the endosomal contents is a major determinant of successful drug delivery. It was also noted that the potential for sequestration of basic drugs in acidic compartments such as lymphocytes will influence viral replication processes as well (by slowing viral entry into cells). Consequently the proposed inhibition of endosomal acidification by neutral chloroquine analogs, for example, is a potential therapeutic strategy for viral infections. [6,7]

It is noted that the AUC ratios and the Cmax/EC50 and Cmax/EC90 data used to derive eqs 6 and 7 are for drugs which are predominantly neutral at physiological pH for the AUC data, and ionized at the low interstitial lung pH of 6.7 for the Cmax/EC50 and Cmax/EC90 data. Ashad's method [9] used the physicochemical properties of the drug (pKa, log P, acid/base/neutral) and in vitro drug binding information (fraction unbound in plasma, blood to plasma ratio), in combination with tissue specific data (lipid content, volume of intra- and extracellular water), to derive unbound lung to plasma tissue partition coefficients (KpUlung). However, Ashad noted that it was not possible to determine protein binding-adjusted EC90 values. For highly protein bound drugs, the antiviral activity in plasma may be lower than that reported in vitro because protein concentrations used in culture media are lower than those in plasma.

Eqs 6 and 7 indicate that the general form (2), for both pharmacokinetic distribution/inhibition and intracellular accumulation, applies consistently well to both data sets, and is consistent with the general form (1) for Mpro inhibition. Clearly the pKa of the inhibitors and the degree of protonation in the plasma and at the target tissue are dominating factors in predicting therapeutic efficacy for proposed anti-coronavirus drugs. For example, chloroquine and hydroxychloroquine, which are diprotic weak bases, are highly dependent on the pH gradient to drive lysosomal uptake as a mechanism of lung accumulation. Intracellular uptake of chloroquine decreases one hundred-fold for every pH unit of external acidification. [18] In addition, high drug to plasma protein binding lowers the bioavailability of drugs in the target tissues.
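The dependence of protonation on pKa and pH can be illustrated with the Henderson-Hasselbalch relationship. The sketch below is an idealised equilibrium calculation and is not the pharmacokinetic model of ref [9]; the function names are illustrative. The nelfinavir pKa of 8.18 is the value listed in Table 3, whereas the second pKa used for the chloroquine-like dibasic example (8.4) is an assumption introduced only to show the trend. The neutral fraction is used here as a rough proxy for the species available for passive uptake.

```python
def neutral_fraction_monobasic(pka: float, ph: float) -> float:
    """Fraction of a monoprotic base that is unprotonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def neutral_fraction_dibasic(pka_first: float, pka_second: float, ph: float) -> float:
    """Fraction of a diprotic base carrying no charge.

    pka_first is the conjugate-acid pKa of the first (more basic) site,
    pka_second that of the second site.
    """
    mono = 10.0 ** (pka_first - ph)
    di = 10.0 ** (pka_first + pka_second - 2.0 * ph)
    return 1.0 / (1.0 + mono + di)


# Monobasic example: nelfinavir pKa from Table 3, plasma versus lung pH.
nelfinavir_plasma = neutral_fraction_monobasic(8.18, 7.4)
nelfinavir_lung = neutral_fraction_monobasic(8.18, 6.7)

# Dibasic example: 10.32 is the Table 3 pKa; 8.4 is an assumed second pKa.
ratio_per_ph_unit = (
    neutral_fraction_dibasic(10.32, 8.4, 7.4)
    / neutral_fraction_dibasic(10.32, 8.4, 6.4)
)

print(f"nelfinavir neutral fraction: pH 7.4 {nelfinavir_plasma:.3f}, pH 6.7 {nelfinavir_lung:.3f}")
print(f"dibasic neutral fraction ratio per pH unit: {ratio_per_ph_unit:.0f}")
```

When both sites are largely protonated, the neutral fraction of a dibasic drug falls by close to two orders of magnitude per pH unit, which is in line with the hundred-fold decrease in chloroquine uptake quoted above.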

Despite the approximations in the experimentally derived Cmax/EC50 and Cmax/EC90 values, the statistical correlations in eqs 7(a) and (b) are indicators of the molecular properties that can be useful in the design or selection of proposed inhibitors, as well as of their bioavailability in the target tissue. The protonation of various inhibitors can be accurately evaluated, and their water desolvation, lipophilicity and dipole moment can be compared with those of their neutral species. Molecular docking / molecular dynamics studies can differentiate between neutral and protonated species, showing for example a difference of ca 4.1 kcal/mol for streptomycin (-7.9 kcal/mol as a neutral species or -3.8 kcal/mol as a diprotonated species at physiological pH) binding to SARS-CoV-2 Mpro. [19] The neutral and diprotonated species of streptomycin have ΔGdesolv,CDS values of -6.75 and -10.80 kcal/mol, and ΔGlipo,CDS values of -9.67 and -8.89 kcal/mol, respectively. For streptomycin, the energy difference between the neutral and diprotonated forms amounts to -4.05 kcal/mol for desolvation, or -3.3 kcal/mol for desolvation plus hydrophobic bonding, both of which are close to the -4.1 kcal/mol difference in the binding energies to the Mpro. These data indicate that desolvation (as measured by ΔGdesolv,CDS) of streptomycin plays the major role in binding to the Mpro of SARS-CoV-2, and may be a better indicator of the desolvation contribution to the binding energy than the MM-PBSA solvation energy. [19]
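The streptomycin arithmetic quoted above can be reproduced in a few lines. The values are those given in the text; the variable names are illustrative. Note that the desolvation and lipophilicity differences are taken as diprotonated minus neutral, whereas the -4.1 kcal/mol binding difference corresponds to neutral minus diprotonated.

```python
# Values in kcal/mol as quoted in the text for streptomycin.
neutral = {"dG_desolv": -6.75, "dG_lipo": -9.67, "binding": -7.9}
diprotonated = {"dG_desolv": -10.80, "dG_lipo": -8.89, "binding": -3.8}

# Diprotonated minus neutral for the solvation terms.
delta_desolv = diprotonated["dG_desolv"] - neutral["dG_desolv"]
delta_lipo = diprotonated["dG_lipo"] - neutral["dG_lipo"]
delta_desolv_plus_lipo = delta_desolv + delta_lipo

# Neutral minus diprotonated for the docking binding energy.
delta_binding = neutral["binding"] - diprotonated["binding"]

print(f"desolvation difference: {delta_desolv:.2f}")
print(f"desolvation + lipophilicity difference: {delta_desolv_plus_lipo:.2f}")
print(f"binding energy difference: {delta_binding:.2f}")
```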

Table 3 shows the molecular specifier properties of the neutral and protonated drugs used in deriving eqs 7(a) and (b). Comparison of the neutral and diprotonated species of chloroquine and hydroxychloroquine shows differences of -4.58 and -4.48 kcal/mol in ΔGdesolv,CDS, and -0.39 and -0.38 kcal/mol in ΔGlipo,CDS, respectively. These data suggest that desolvation plays the major role in the inhibitory behaviour of the neutral and diprotonated species of chloroquine and hydroxychloroquine, similar to the binding of the neutral and diprotonated forms of streptomycin.

These results for neutral versus charged species show significant differences, which would be expected to influence protease binding and the tissue properties that govern intracellular transport in the lung, where the pH is 6.7 compared with the plasma pH of 7.4. Passive transport of the charged species is highly retarded compared with that of the neutral species, so, since chloroquine and hydroxychloroquine are ca 60-70% bound to plasma proteins, the bioavailability of these drugs in the lung tissue as the neutral species is very low. Endocytosis, which requires some degree of desolvation as the drug is engulfed by the lipophilic membrane, may be required to transport larger drugs across the cell membrane.

Conclusions

A linear free energy relationship (LFER) has been shown to describe the structure activity relationship for inhibition of the main protease, Mpro, of SARS-CoV-2, the virus that causes COVID-19. This LFER can be used to rank the predicted inhibitory efficacy of a series of repurposed drugs against the main protease of SARS-CoV-2, as well as those of SARS-CoV and MERS. The same LFER also applies to the intracellular accumulation of Mpro inhibitors from the plasma and to their inhibitory efficacy in the targeted lung tissue. The LFER is a linear combination of the desolvation energy, lipophilicity, dipole moment, molecular volume and HOMO-LUMO energy gap, with different combinations of these fundamental molecular specifiers applying to different structural series of inhibitors.

It is also shown that protonation of basic drugs has a major influence on bioavailability in the target lung tissue (pH 6.7) compared with that in the plasma (pH 7.4), and that the major difference between the neutral and charged species is due to differences in the desolvation energy of the inhibitors. Neutral species can passively penetrate the infected cell membrane, whereas endocytosis (which requires some degree of desolvation as the drug is engulfed by the lipophilic membrane) may be required to transport larger drugs across the cell membrane.

There is evidence in the literature that molecular docking methods that derive binding energies to predict likely inhibitors of Mpro of SARS-CoV-2 do not always correlate well with structure activity inhibitory studies of Mpro. This study shows that the binding energy of a series of inhibitors is well correlated with the HOMO-LUMO energy gap and the molecular volume of the inhibitors.

Figure 1. Protease inhibitors of COVID-19

Table 1. Inhibition and docking binding energies of Mpro of SARS-CoV-2

| Inhibitor | IC50 (µM) | Docking Binding Energy (kcal/mol) | ΔGdesolv,CDS (kcal/mol) | ΔGlipo,CDS (kcal/mol) | Dipole Mom (D) | Volume (cm3/mol) | HOMO-LUMO (eV) |
|---|---|---|---|---|---|---|---|
| Pimozide | 42.2 | -10.01 | -7.72 | -12.06 | 4.26 | 357 | 5.33 |
| Ebastine | 57 | -10.62 | -8.27 | -13.9 | 5.17 | 370 | 4.22 |
| Bepridil | 72 | -8.31 | -6.69 | -10.44 | 3.62 | 326 | 5.14 |
| Bepridil Ion | 72 | -8.31 | -9.5 | -10.54 | 15.53 | 283 | 5.26 |
| Sertaconazole | 76 | -8.77 | -6.02 | -11.3 | 2.76 | 315 | 0.68 |
| Rimonabant | 85 | -11.23 | -6.54 | -11.17 | 6.57 | 343 | 4.90 |
| Oxiconazole Ion | 99 | -9.18 | -6.57 | -10.85 | 13.31 | 267 | 3.50 |
| Oxiconazole | 99 | -9.18 | -5.82 | -10.67 | 9.38 | 297 | 4.84 |
| Itraconazole | 111 | -8.44 | -11.02 | -20.68 | 8.75 | 563 | 3.80 |
| Tipranavir | 180 | -10.74 | -12.86 | -11.47 | 7.91 | 391 | 4.30 |
| Nelfinavir | 234 | -9.67 | -11.77 | -13.78 | 11.8 | 390 | 4.77 |
| Nelfinavir Ion | 234 | -9.67 | -12.25 | -13.77 | 12.71 | 463 | 4.77 |
| Zopiclone | 349 | -10.1 | -0.35 | -7.79 | 8.27 | 260 | 3.66 |
| Trihexyphenidyl | 370 | -8.72 | -3.14 | -8.99 | 2.77 | 292 | 5.71 |
| Trihexyphenidyl Ion | 370 | -8.72 | -6.59 | -9.12 | 15.22 | 237 | 6.46 |
| Saquinavir | 411 | -10.37 | -11.96 | -14.84 | 10.08 | 501 | 3.89 |
| Isavuconazole | 438 | -8.77 | -8.7 | -9.05 | 8.08 | 296 | 4.44 |
| Lopinavir | 486 | -8.91 | -13.93 | -16.77 | 5.82 | 491 | 5.76 |
| Clemastine | 497 | -8.36 | -4.7 | -9.37 | 5.60 | 237 | 5.34 |
| Clemastine Ion | 497 | -8.36 | -8.39 | -9.56 | 27.20 | 293 | 6.00 |
| Metixene | 635 | -9.01 | -2.7 | -8.8 | 3.10 | 244 | 5.16 |
| Metixene Ion | 635 | -9.01 | -6.71 | -9 | 14.64 | 178 | 5.19 |
| Rupintrivir | 68 | | -16.39 | -11.88 | 10.94 | 482 | 4.75 |
| Duloxetine | 3047 | -8.79 | -4.55 | -7.68 | 1.65 | 185 | 4.63 |
| Duloxetine Ion | 3047 | -8.79 | -7.8 | -7.97 | 20.40 | 203 | 4.65 |

Table 2. Calculated inhibitory constants from Equations 1, 2, 3(a), 4(c) for a range of repurposed anti-virals: eq 1 applies to the MERS bat HKU4 Mpro, eq 2 and 3(a) apply to the SARS-CoV Mpro, and eq 4(c) applies to the SARS-CoV-2 Mpro

| Compound | ΔGdesolv,CDS (kcal/mol) | ΔGlipo,CDS (kcal/mol) | Dipole moment (D) | Volume (cm³/mol) | LUMO (eV) | HOMO (eV) | HOMO-LUMO (eV) | pIC50, Eq 1 (µM) | IC50, Eq 2 (µM) | IC50, Eq 3(a) (µM) | IC50, Eq 4(c) (µM) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Chloroquine Neut | -3.4 | -7.71 | 7.66 | 217 | -1.28 | -5.35 | 4.07 | 5.02 | 2.47 | 6.83 | 226.08 |
| Chloroquine Ion | -6.19 | -7.81 | 30.85 | 232 | -1.31 | -5.60 | 4.29 | 3.03 | -0.10 | 252.66 | 239.81 |
| Chloroquine Di-Ion | -7.98 | -8.1 | 12.01 | 244 | -2.22 | -6.54 | 4.32 | 4.41 | 4.67 | 140.30 | 241.80 |
| Hydroxychloroquine Neut | -3.61 | -7.93 | 7.97 | 261 | -1.26 | -5.45 | 4.18 | 4.97 | 2.64 | 12.24 | 233.05 |
| Hydroxychloroquine Ion | -6.3 | -8.02 | 28.61 | 298 | -1.29 | -5.60 | 4.30 | 3.20 | 0.51 | 236.97 | 240.91 |
| Hydroxychloroquine Di-Ion | -8.09 | -8.31 | 10.29 | 279 | -2.20 | -6.55 | 4.34 | 4.54 | 5.18 | 128.50 | 243.42 |
| Favipiravir | -4.6 | -1.03 | 8.96 | 82 | -2.45 | -6.46 | 4.01 | 4.81 | 0.75 | 120.69 | 221.89 |
| Lopinavir | -13.93 | -16.77 | 5.82 | 491 | -0.08 | -5.84 | 5.76 | 4.37 | 14.12 | 196.36 | 335.30 |
| Remdesivir | -13.44 | -10.79 | 12.76 | 381 | -1.19 | -5.97 | 4.78 | 4.01 | 10.12 | 292.61 | 272.02 |
| Remdesivir TriPhosphate | -15.83 | -6.99 | 18.29 | 267 | -1.11 | -5.97 | 4.85 | 3.41 | 9.97 | 462.01 | 276.58 |
| GS441524 | -5.39 | -4.83 | 7.65 | 166 | -1.17 | -5.96 | 4.79 | 4.73 | 3.39 | 111.65 | 272.56 |
| Saquinavir | -11.96 | -14.84 | 10.08 | 501 | -2.02 | -5.92 | 3.89 | 4.54 | 9.99 | 145.83 | 214.34 |
| Ivermectin B1A | -18.95 | -15.67 | 16.56 | 547 | -0.73 | -5.79 | 5.06 | 3.44 | 15.17 | 419.38 | 289.89 |
| Ritonavir | -14.09 | -14.74 | 8.81 | 644 | -0.94 | -6.16 | 5.22 | 4.23 | 12.58 | 229.96 | 300.45 |
| Atazanavir | -18.24 | -12.86 | 6.16 | 544 | -1.26 | -5.76 | 4.50 | 4.39 | 15.71 | 351.99 | 253.39 |
| Nelfinavir | -11.77 | -13.78 | 11.8 | 390 | -0.93 | -5.70 | 4.77 | 4.21 | 9.92 | 198.50 | 271.12 |
| | -17.12 | -17.67 | 2.11 | 666 | -1.73 | -5.32 | 3.59 | 5.02 | 16.57 | 204.47 | 194.72 |
| | -14.99 | -18.09 | 15.9 | 521 | -1.45 | -5.30 | 3.85 | 3.99 | 12.34 | 242.52 | 211.50 |
| Nitazoxanide | -9.08 | -6.79 | 13.18 | 172 | -2.98 | -6.83 | 3.85 | 4.36 | 4.49 | 180.25 | 211.82 |
| Ruxolitinib | -4.35 | -7.47 | 8.55 | 226 | -1.26 | -5.81 | 4.55 | 4.80 | 2.99 | 45.21 | 256.95 |
| Baricitinib | -3.54 | -7.06 | 6.86 | 259 | -1.31 | -5.84 | 4.53 | 4.97 | 2.57 | 14.32 | 255.73 |
| Carfilzomib | -14.54 | -15.94 | 5.38 | 580 | -1.48 | -5.94 | 4.46 | 4.67 | 13.60 | 185.52 | 251.15 |
| Nafamostat Neut | -6.86 | -9.19 | 15.39 | 296 | -1.51 | -5.37 | 3.86 | 4.32 | 3.82 | 134.72 | 212.42 |
| Nafamostat Di-Ion | -9.51 | -9.38 | 19.85 | 228 | -2.13 | -6.19 | 4.06 | 3.80 | 4.64 | 224.57 | 225.23 |
| | -4.63 | -5.52 | 9.85 | 174 | -1.27 | -7.23 | 5.95 | 4.34 | 2.45 | 91.93 | 347.59 |
| Darunavir | -10.78 | -10.68 | 8.72 | 331 | -0.74 | -5.75 | 5.01 | 4.40 | 9.12 | 199.62 | 286.39 |
| Sofosbuvir | -14.37 | -9.51 | 7.61 | 397 | -1.32 | -6.70 | 5.38 | 4.22 | 11.51 | 298.61 | 310.28 |
| Galidesivir Neut | -2.55 | -5.49 | 12.09 | 170 | -0.47 | -5.33 | 4.84 | 4.52 | 0.85 | 69.46 | 275.79 |
| Galidesivir Ion | -3.53 | -5.62 | 14.51 | 179 | -0.62 | -5.86 | 5.24 | 4.19 | 1.06 | 109.40 | 301.32 |
| Dolutegravir | -8.9 | -8.43 | 17.07 | 306 | -1.75 | -6.21 | 4.46 | 3.95 | 4.74 | 211.43 | 250.73 |
| Efavirenz | -8.24 | -5.37 | 10.3 | 199 | -2.85 | -4.68 | 1.82 | 5.08 | 4.15 | 159.97 | 80.45 |
| | -13.53 | -13.61 | 12.7 | 521 | -1.82 | -5.94 | 4.12 | 4.20 | 10.51 | 234.73 | 228.99 |
| Arbidol | -6.99 | -10.45 | 10.84 | 263 | -1.21 | -5.34 | 4.13 | 4.62 | 5.36 | 94.98 | 229.87 |
| Arbidol Ion | -9.4 | -10.56 | 17.9 | 282 | -1.38 | -5.87 | 4.50 | 3.87 | 5.72 | 209.84 | 253.46 |
| Imatinib | -2.4 | -13.91 | 6 | 404 | -1.87 | -5.33 | 3.47 | 5.41 | 3.19 | -144.50 | 186.85 |
| Imatinib Ion | -5.47 | -14.05 | 48.94 | 400 | -1.88 | -5.35 | 3.47 | 1.91 | -3.19 | 251.37 | 187.29 |
| | -13.13 | -9.97 | 10.72 | 374 | -2.13 | -6.32 | 4.19 | 4.32 | 9.50 | 255.68 | 233.42 |
| | -19.65 | -12.53 | 8.25 | 464 | -3.42 | -5.13 | 1.71 | 4.79 | 14.95 | 354.60 | 73.26 |
| 13B α-ketoamide³ | -13.37 | -13.19 | 10.15 | 392 | -2.41 | -5.66 | 3.25 | 4.60 | 10.43 | 202.15 | 173.05 |

Footnotes: Inhibitors colour coded in red are dominantly protonated at the physiological pH. Inhibitors colour coded in green are predicted to have high inhibitory capacity. ³ Ref 3.
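The HOMO-LUMO column of Table 2 can be cross-checked against the separate LUMO and HOMO columns, since the gap is the LUMO energy minus the HOMO energy. The following is a minimal sketch of that check on a few rows copied from the table; the function names and the 0.02 eV rounding tolerance are illustrative assumptions, not part of the original calculations.

```python
# Rows copied from Table 2: (compound, LUMO eV, HOMO eV, tabulated HOMO-LUMO gap eV).
ROWS = [
    ("Chloroquine Neut", -1.28, -5.35, 4.07),
    ("Chloroquine Ion", -1.31, -5.60, 4.29),
    ("Favipiravir", -2.45, -6.46, 4.01),
    ("Lopinavir", -0.08, -5.84, 5.76),
    ("Efavirenz", -2.85, -4.68, 1.82),
]


def homo_lumo_gap(lumo_ev, homo_ev):
    """Energy gap in eV, defined as E(LUMO) - E(HOMO)."""
    return lumo_ev - homo_ev


def max_gap_discrepancy(rows):
    """Largest absolute difference between computed and tabulated gaps."""
    return max(abs(homo_lumo_gap(lumo, homo) - gap) for _, lumo, homo, gap in rows)


if __name__ == "__main__":
    for name, lumo, homo, gap in ROWS:
        computed = homo_lumo_gap(lumo, homo)
        print(f"{name:18s} computed {computed:.2f} eV, tabulated {gap:.2f} eV")
    # Assumed tolerance: the tabulated energies are rounded to two decimals.
    print("within 0.02 eV:", max_gap_discrepancy(ROWS) <= 0.02)
```

Small differences of about 0.01 eV (for example for Efavirenz) are consistent with the orbital energies having been rounded after the gap was computed.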

Table 3. Intracellular accumulation of Mpro inhibitors from the plasma and their inhibitory efficacy in the targeted lung tissue

| Compound | Cmax:EC90 | Cmax:EC50 | ΔGdesolv,CDS (kcal/mol) | ΔGlipo,CDS (kcal/mol) | Dipole moment (D) | Volume (cm³/mol) | HOMO-LUMO (eV) | pKa (base) | Charge in plasma |
|---|---|---|---|---|---|---|---|---|---|
| Anidulafungin | 1.192 | 1.323 | -23.73 | -24.91 | 3.81 | 720 | 4.22 | -3.5 | 0 |
| Chloroquine | 1.261 | 2.318 | -3.4 | -7.71 | 7.66 | 217 | 4.07 | 10.32 | 2 |
| Eltrombopag | 2.029 | 3.416 | -10.17 | -10.56 | 4.62 | 252 | 3.20 | -0.12 | -2 |
| Favipiravir | 2.469 | 6.326 | -4.6 | -1.03 | 8.96 | 82 | 4.01 | -3.70 | 0 |
| Hydroxychloroquine | 0.101 | 3.598 | -3.61 | -7.93 | 7.97 | 261 | 4.18 | 9.76 | 2 |
| Mefloquine | 1.284 | 1.35 | -6.4 | -4.33 | 9.43 | 259 | 3.95 | 9.46 | 1 |
| Merimepodib | 0.638 | 1.629 | -10.01 | -8.77 | 7.01 | 289 | 4.40 | 0.57 | 0 |
| Nelfinavir | 3.755 | 5.849 | -11.77 | -13.78 | 11.8 | 390 | 4.77 | 8.18 | 1 |
| Niclosamide | 4.936 | 8.286 | -8.38 | -7.65 | 10.1 | 227 | 4.05 | -4.40 | -1 |
| Nitazoxanide | 6.315 | 13.823 | -8.96 | -6.76 | 14.15 | 193 | 3.59 | -4.20 | 0 |
| Remdesivir | 3.755 | 5.603 | -13.44 | -10.79 | 12.76 | 381 | 4.78 | 0.65 | 0 |
| Indomethacin | | 5.366 | -9.4 | -8.85 | 4.49 | 235 | 4.08 | -2.90 | -1 |
| Ritonavir | | 1.8 | -14.09 | -14.74 | 8.81 | 644 | 5.22 | 2.84 | 0 |
| Chloroquine Di-Ion | | | -7.98 | -8.1 | 12.00 | 244 | 4.32 | | |
| Hydroxychloroquine Di-Ion | | | -8.09 | -8.31 | 10.29 | 279 | 4.34 | | |
| Nelfinavir Ion | | | -12.25 | -13.77 | 12.80 | 463 | 4.77 | | |
| Mefloquine Ion | | | -8.88 | -4.53 | 28.78 | 215 | 4.62 | | |
| Niclosamide Ion | | | -7.84 | -7.71 | 8.85 | 189 | 4.14 | | |
| Eltrombopag Di-Ion | | | -9.18 | -10.6 | 49.66 | 263 | 2.45 | | |
| Indomethacin Ion | | | -8.99 | -8.91 | 19.64 | 228 | 3.77 | | |

| Compound | AUC ratio | ΔGdesolv,CDS (kcal/mol) | ΔGlipo,CDS (kcal/mol) | Dipole moment (D) | Volume (cm³/mol) | HOMO-LUMO (eV) | pKa | Charge in plasma |
|---|---|---|---|---|---|---|---|---|
| Nelfinavir | 5.3 | -11.77 | -13.78 | 11.8 | 390 | 4.77 | 6 or 8.81 | 1 or 0 |
| Saquinavir | 3.64 | -11.96 | -14.84 | 10.08 | 501 | 3.89 | 7 or 5.5 | 1 or 0 |
| Amprenavir | 3.2 | -10.22 | -9.62 | 5.79 | 365 | 5.35 | 2.39 | 0 |
| M8 (metabolite of Nelfinavir) | 2.3 | -11.98 | -13.91 | 12.66 | 440 | 4.77 | 8.81 | 1 or 0 |
| Lopinavir | 1.55 | -13.93 | -16.77 | 5.82 | 491 | 5.76 | -1.5 | 0 |
| Ritonavir | 1.25 | -14.09 | -14.74 | 8.81 | 644 | 5.22 | 2.84 | 0 |
| Indinavir | 0.29 | -6.59 | -14.49 | 12.84 | 500 | 5.24 | 6.2 or 7.27 | 1 or 0 |

Footnotes: Cmax:EC50 and Cmax:EC90 from ref 9; AUC ratios from ref 6; pKa values from ref 6 and ACD.
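As a reading aid for Table 3, the sketch below flags which inhibitors have a tabulated Cmax:EC90 or Cmax:EC50 at or below 1, that is, a peak plasma concentration that does not exceed the corresponding effective concentration. It assumes the tabulated values are plain (non-logarithmic) ratios, includes only the rows for which both ratios are listed, and uses a threshold of 1 and helper names that are illustrative rather than taken from the original analysis.

```python
# (Cmax:EC90, Cmax:EC50) copied from Table 3; rows lacking either ratio are omitted.
RATIOS = {
    "Anidulafungin": (1.192, 1.323),
    "Chloroquine": (1.261, 2.318),
    "Eltrombopag": (2.029, 3.416),
    "Favipiravir": (2.469, 6.326),
    "Hydroxychloroquine": (0.101, 3.598),
    "Mefloquine": (1.284, 1.35),
    "Merimepodib": (0.638, 1.629),
    "Nelfinavir": (3.755, 5.849),
    "Niclosamide": (4.936, 8.286),
    "Nitazoxanide": (6.315, 13.823),
    "Remdesivir": (3.755, 5.603),
}

EC90, EC50 = 0, 1  # positions of the two ratios inside each tuple


def not_exceeding(ratios, index, threshold=1.0):
    """Names whose selected ratio is at or below the threshold, sorted alphabetically."""
    return sorted(name for name, pair in ratios.items() if pair[index] <= threshold)


if __name__ == "__main__":
    print("Cmax does not exceed EC90:", not_exceeding(RATIOS, EC90))
    print("Cmax does not exceed EC50:", not_exceeding(RATIOS, EC50))
```

On these rows, only hydroxychloroquine and merimepodib fall at or below 1 for Cmax:EC90, while every listed inhibitor exceeds 1 for Cmax:EC50.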

References

[1] CW Fong, Inhibition of COVID-2019 3C-like protease: structure activity relationship using quantum mechanics, hal archives 2020, hal-02529030v1
[2] CW Fong, Screening potential repurposed COVID-2019 3C-like protease inhibitors, hal archives 2020, hal-02663287v1
[3] CW Fong, Screening potential anti-virals for the main protease of the Coronaviridae family including SARS-CoV-2, SARS-CoV, MERS, hal archives 2020,
[4] EC Vatansever, K Yang, KC Kratch, A Drelich, et al, Targeting the SARS-CoV-2 Main Protease to Repurpose Drugs for COVID-19, bioRxiv Preprint 2020 Jul 27, doi: 10.1101/2020.05.23.112235
[5] T Bobrowski, V Alves, CC Melo-Filho, et al, Computational Models Identify Several FDA Approved or Experimental Drugs as Putative Agents Against SARS-CoV-2, ChemRxiv 2020 Preprint, https://doi.org/10.26434/chemrxiv.12153594.v1
[6] J Ford, SH Khoo, DJ Back, The intracellular pharmacology of antiretroviral protease inhibitors, J Antimicrobial Chemotherapy, 2004, 54, 982–990
[7] AA Al-Bari, Targeting endosomal acidification by chloroquine analogs as a promising strategy for the treatment of emerging viral diseases, Pharma Res Per, 2017, 5(1), e00293, doi: 10.1002/prp2.293
[8] OH Petersen, OV Gerasimenko, J Gerasimenko, Endocytic uptake of SARS-CoV-2: the critical roles of pH, Ca++, and NAADP, Function, 2020, 1(1), zqaa003, doi: 10.1093/function/zqaa003
[9] U Arshad, H Pertinez, H Box, L Tatham, et al, Prioritisation of potential anti-SARS-CoV-2 drug repurposing opportunities based on ability to achieve adequate plasma and target site concentrations derived from their established human pharmacokinetics, medRxiv preprint, 2020, doi: https://doi.org/10.1101/2020.04.16.20068379
[10] RS Joshi, SS Jagdale, SB Bansode, et al, Discovery of potential multi-target-directed ligands by targeting host-specific SARS-CoV-2 structurally conserved main protease, J Biomol Struct Dynam, 2020, doi: 10.1080/07391102.2020.1760137
[11] CW Fong, Permeability of the Blood–Brain Barrier: Molecular Mechanism of Transport of Drugs and Physiologically Important Compounds, J Membr Biol. 2015, 248, 651-69.
[12] CW Fong, The extravascular penetration of tirapazamine into tumours: a predictive model of the transport and efficacy of hypoxia specific cytotoxic analogues and the potential use of cucurbiturils to facilitate delivery, Int J Comput Biol Drug Design. 2017, 10, 343-373
[13] CW Fong, Statins in therapy: Understanding their hydrophilicity, lipophilicity, binding to 3-hydroxy-3-methylglutaryl-CoA reductase, ability to cross the blood brain barrier and metabolic stability based on electrostatic molecular orbital studies. Eur J Med Chem. 2014, 85, 661-674
[14] CW Fong, Predicting PARP inhibitory activity – A novel quantum mechanical based model. HAL Archives. 2016, https://hal.archives-ouvertes.fr/hal-01367894v1.
[15] CW Fong, A novel predictive model for the anti-bacterial, anti-malarial and hERG cardiac QT prolongation properties of fluoroquinolones, HAL Archives. 2016, https://hal.archives-ouvertes.fr/hal-01363812v1
[16] LM Bareford, PW Swaan, Endocytic mechanisms for targeted drug delivery, Adv Drug Deliv Rev. 2007, 59, 748–758.
[17] OO Glebov, Understanding SARS-CoV-2 endocytosis for COVID-19 drug repurposing, FEBS J. 2020,
[18] TG Geary, AD Divo, JB Jensen, et al, Kinetic modelling of the response of Plasmodium falciparum to chloroquine and its experimental testing in vitro. Implications for mechanism of action of and resistance to the drug, Biochem Pharmacol, 1990, 40, 685-91.
[19] J Wang, Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) Through Computational Drug Repurposing Study, J. Chem. Inf. Model. 2020, 60, 3277–3286