International Journal of Molecular Sciences

Article Virtual Screening of C. Sativa Constituents for the Identification of Selective Ligands for Cannabinoid Receptor 2

Mikołaj Mizera 1 , Dorota Latek 2 and Judyta Cielecka-Piontek 1,*

1 Department of Pharmacognosy, Poznan University of Medical Sciences, 60-781 Pozna´n,Poland; [email protected] 2 Faculty of Chemistry, University of Warsaw, 02-093 Warsaw, Poland; [email protected] * Correspondence: [email protected]; Tel.: +48-854-67-10

 Received: 28 April 2020; Accepted: 24 July 2020; Published: 26 July 2020 

Abstract: The selective targeting of the cannabinoid receptor 2 (CB2) is crucial for the development of peripheral system-acting cannabinoid analgesics. This work aimed at computer-assisted identification of prospective CB2-selective compounds among the constituents of Cannabis Sativa. The molecular structures and corresponding binding affinities to CB1 and CB2 receptors were collected from ChEMBL. The molecular structures of Cannabis Sativa constituents were collected from a phytochemical database. The collected records were curated and applied for the development of quantitative structure-activity relationship (QSAR) models with a machine learning approach. The validated models predicted the affinities of Cannabis Sativa constituents. Four structures of CB2 were acquired from the Protein Data Bank (PDB) and the discriminatory ability of CB2-selective ligands and two sets of decoys were tested. We succeeded in developing the QSAR model by achieving Q2 5-CV > 0.62. The QSAR models helped to identify three prospective CB2-selective molecules that are dissimilar to already tested compounds. In a complementary structure-based virtual screening study that used available PDB structures of CB2, the agonist-bound, Cryogenic Electron Microscopy structure of CB2 showed the best statistical performance in discriminating between CB2-active and non-active ligands. The same structure also performed best in discriminating between CB2-selective ligands from non-selective ligands.

Keywords: QSAR; endocannabinoid system; Cannabis Sativa

1. Introduction The medical effect of Cannabis sp. bioactive ingredients is the subject of extensive research [1–3]. It has been discovered that Cannabis sp. contains over 500 various compounds, with the cannabinoid group itself having about 110 molecules [4]. Phytocannabinoids present in Cannabis sp. have slight differences in their chemical structures (types: , , cannabitriol, cannabicycol, , , cannabielsoin, , ∆9-, ∆8-tetrahydrocannabinate) and are not selective. The most well-known phytocannabinoids are THC (∆8-tetrahydrocannabinol) and CBD (cannabidiol). Research to date reports that both THC and CBD have an affinity for various types of endocannabinoid system receptors [5–7]. For example, cannabidiol is a non-competitive CB1 antagonist, CB2 inverse agonist, GPR55 and GPR18 antagonist, Peroxisome proliferator-activated receptor (PPAR-γ) agonist, α1, α3 glycine agonist, TRPM8 antagonist, and an agonist and antagonist of receptors of various types of serotonin 1A receptors (5-HT1A) [8–11]. The pharmacological effects of the interaction of Active pharmaceutical ingredients (APIs) with CB1 and CB2 receptors are the most investigated. The differences in the density of CB1 and CB2 receptor distribution also determine the activities within the central nervous system [12]. The pharmacological

Int. J. Mol. Sci. 2020, 21, 5308; doi:10.3390/ijms21155308 www.mdpi.com/journal/ijms Int. J. Mol. Sci. 2020, 21, 5308 2 of 14 effect of cannabinoids results from modification of the signaling pathway of two G protein-coupled receptors (GPCRs): cannabinoid receptors 1 and 2 (CB1 and CB2). Activation of the CB1 receptor, which is expressed mainly in the central nervous system, is responsible for their psychotropic action, while CB2 receptors are found mainly in the immune system [13]. This distinct distribution of CB2 receptors in human tissues suggests that selective cannabinoids are promising APIs with anti-inflammatory, analgesic, and anti-neuroinflammatory actions [14]. The discovery of new selective cannabinoids can be streamlined with the application of in silico methods such as structure-based virtual screening (VS) [15–17] or ligand-based (quantitative structure-activity relationship (QSAR)) VS [18]. Machine learning-based QSAR models can be successfully applied in pharmaceutical research related to drug discovery [19], drug formulation [20], and pharmaceutical analysis [21,22]. The applicability of QSAR modeling to predict the selectivity of CB receptors has also been reported in the literature. The CB2-selective activity of 29 benzimidazole and benzothiophene derivatives was investigated with comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) 3D-QSAR models [23] and showed an external predictive performance of R2 > 0.9. In the recent 4D-QSAR study involving molecular dynamics (MD), the modeling was performed on 29 structurally similar CB2 receptor inverse agonists [24]. The 4D-QSAR approach based on partial least squares (PLS) and multiple linear regression (MLR) resulted in Q2 = 0.719 and Q2 = 0.761 for the PLS and MLR models, respectively. The selectivity of 29 arylpyrazole derivatives was investigated with the application of 3D-QSAR/CoMFA analyses [25]. The QSAR model helped to identify the causes of CB1 selectivity by producing counter maps of affinity for both CB receptor subtypes. Nevertheless, the narrow applicability of these models due to similar scaffolds used as training examples may be a significant limitation when they are used in the prediction of novel chemotypes. In contrast, Floresta et al. conducted a 3D-QSAR study on a diverse dataset containing 312 molecules with reported experimental CB1 affinities and 187 molecules with reported CB2 affinities [26]. These models showed a predictive performance of Q2 = 0.62 and Q2 = 0.72 for the CB1 and CB2 QSAR models, respectively. The diversity of molecular structures in the training set also allowed Floresta et al. to perform VS for novel chemical scaffolds. The idea of the QSAR application for the identification of bioactive compounds in plant extract was also used by Labib et al. [27]. This VS study aimed to predict the activities of CB1 and opioid receptors among compounds isolated from Pinus roxburghii bark extract. The model of Labib et al. explained the synergistic anti-inflammatory action of Pinus roxburghii and provided information on the activity of several bioactive molecules identified in the extract. Our study aimed to predict CB2-selectivity for molecules identified in Cannabis Sativa by using a validated QSAR model. This goal was achieved by the execution of several steps: the collection of diverse ligands’ structures with experimental data describing CB1 and CB2 affinity; data curation; conducting QSAR study for CB1 and CB2; and the prediction of the CB1 and CB2 affinity of phytochemicals in the Cannabis Sativa. So far, the lack of crystal structures in cannabinoid receptors has been a major obstacle in searching for new, selective cannabinoids. However, there have been attempts to predict binding modes of well-known actives of CB1 and CB2 using homology models of these receptors and molecular dynamics [17,28,29]. Recent advances in structural studies of cannabinoid receptors 6KPC [30], 6KPF [30], 6PT0 [31], and 5ZTY [32] provided new data that supplemented drug discovery studies [30]. In this study, we tested the applicability of available experimental structures of CB2, solved in both active and inactive conformations, in structure-based VS to search for novel or more CB2-selective ligands.

2. Results

2.1. Study Design Data on the experimentally evaluated affinity of diverse compounds to CB1 and CB2 receptors and data on structures identified in Cannabis Sativa with unknown CB1/CB2 affinity were collected Int. J. Mol. Sci. 2020, 21, 5308 3 of 14 from publicly accessible online resources. Next, the data with known affinity values were curated and subsequently used for the development and validation of CB1 and CB2 QSAR models. The validated models were used for the prediction of the CB1 and CB2 affinity of Cannabis Sativa ingredients and the prediction of prospective CB2-selective Cannabis Sativa ingredients. The dataset collected for the QSAR study was used to evaluate the statistical characteristics of the discriminatory ability of CB1 and CB2Int. crystal J. Mol. Sci. structures 2020, 21, x FOR in PEER the REVIEW docking study. 3 of 14

2.2.validated Data Curation models were used for the prediction of the CB1 and CB2 affinity of Cannabis Sativa ingredients and the prediction of prospective CB2-selective Cannabis Sativa ingredients. The dataset collectedDatasets for the of ligandsQSAR study with was experimental used to evaluate affi nitiesthe statistical for CB1 characteristics and CB2 were of the joined discriminatory with corresponding recordsability ofof CB1 assay and metadataCB2 crystal fromstructures ChEMBL. in the docking The resultingstudy. dataset was composed of six columns containing: the structure of the molecule in simplified molecular-input line-entry system (SMILES) 2.2. Data Curation format, a standard type of reported value (i.e., Ki, IC50, etc.), reported value, the mathematical relation of the reportedDatasets of value ligands to with the experimental experimentally affinities measured for CB1 and value, CB2 were the ID joined number with corresponding of the source document withrecords reported of assay study, metadata and thefrom confidence ChEMBL. The score. resulting The initialdataset datasetwas composed described of six above columns was subjected containing: the structure of the molecule in simplified molecular-input line-entry system (SMILES) to theformat, removal a standard of potentially type of reported unreliable value data(i.e., Ki, (Figure IC50, 1etc.),). The reported initial value, dataset the includedmathematical 14,126 records forrelation CB1 and of the 13,506 reported records value for to CB2the experimenta (Figure1.1).lly Recordsmeasured withoutvalue, the reported ID number SMILEs, of the source which included metaldocument complexes with reported and polymers study, and were the confidence excluded score. from The the initial dataset dataset (Figure described1.2). above For was the remaining records,subjected the to respective the removal molecular of potentially structures unreliablewere data (Figure standardized 1). The init andial dataset 2D coordinates included 14,126 were generated. Werecords assumed for CB1 that and both 13,506 the assays,records whichfor CB2were (Figure carried 1.1). Records out on without a target reported protein SMILEs, or a homologous which protein included metal complexes and polymers were excluded from the dataset (Figure 1.2). For the were reliable. Records meeting these criteria were annotated in ChEMBL with a confidence score remaining records, the respective molecular structures were standardized and 2D coordinates were of 9generated. and 8, respectivelyWe assumed (Figurethat both1.3). the Onlyassays, activities which were measured carried out in largeon a target and consistentprotein or a assays were pickedhomologous from the protein dataset were (Figurereliable. 1Records.4).That meeting is, an these assay criteria was were considered annotated as in suitableChEMBL with for selection a if it containedconfidence a score consistent of 9 and group 8, respectively of 10 or (Figure more 1. molecules,3). Only activities all with measured the ainffi largenity and measured. consistent These large assaysassays were were identified picked from based the dataset on the (Figure document 1.4).That ID reported is, an assay in ChEMBL.was considered In the as nextsuitable step, forrecords with a reportedselection if a ffiit nitycontained in any a consistent standard group value of type10 or relatedmore molecules, to Ki were all with kept the (Figure affinity1 .5)measured. and subsequently These large assays were identified based on the document ID reported in ChEMBL. In the next step, converted to pKi. Duplicates analysis (Figure1.6) was conducted by comparing records InChIKeys. records with a reported affinity in any standard value type related to Ki were kept (Figure 1.5) and Duplicatessubsequently were converted merged toif pKi. the Duplicates standard analysis deviation (Figure of duplicate1.6) was conducted measurements by comparing was records lower than 10% of theInChIKeys. entire range Duplicates of measurements. were merged if As the a standa result,rd only deviation one recordof duplicate was measurements kept, with the was pKi lower value averaged. Mordredthan 10% descriptors of the entire and range Morgan of measurements. fingerprints As werea result, computed only one record for each was compound kept, with the in pKi the dataset and thenvalue concatenated averaged. Mordred to create descriptors a feature and vector. Morgan Records fingerprints with were duplicate computed feature for each vectors compound were removed in the dataset and then concatenated to create a feature vector. Records with duplicate feature vectors (Figure1.7). These included, e.g., compounds di ffering only by chiral hydrogen atoms and those that were removed (Figure 1.7). These included, e.g., compounds differing only by chiral hydrogen atoms couldand notthose be that di couldfferentiated not be differentiated by the used by descriptorsthe used descriptors and fingerprints. and fingerprints.

Figure 1. Data records that passed through the subsequent data curation steps: (1) data acquisition, Figure 1. Data records that passed through the subsequent data curation steps: (1) data acquisition, (2) removing records with no SMILES included, (3) removing records with Confidence Score < 8, (4) (2)keeping removing only recordsrecords from with large no SMILESassays, (5 included,) keeping only (3) removingrecords with records included with Ki values, Confidence (6) Score <8, (4)duplicate keeping merging, only records (7) removing from large molecules assays, with (5) the keeping same feature only records vector. with included Ki values, (6) duplicate merging, (7) removing molecules with the same feature vector. 2.3. Model Description

Int. J. Mol. Sci. 2020, 21, 5308 4 of 14

Int.2.3. J. Mol. Model Sci. Description2020, 21, x FOR PEER REVIEW 4 of 14 The independent CB1 and CB2 QSAR models were created according to the architecture presented The independent CB1 and CB2 QSAR models were created according to the architecture in Figure2. The input feature vectors for the machine learning algorithm were molecular descriptors presented in Figure 2. The input feature vectors for the machine learning algorithm were molecular and fingerprints of compounds from the training set (Figure2.1). The feature vectors along with descriptors and fingerprints of compounds from the training set (Figure 2.1). The feature vectors experimental CB1 and CB2 pKi values were used as a training set for the CB1 and CB2 QSAR models along with experimental CB1 and CB2 pKi values were used as a training set for the CB1 and CB2 (Figure2.2). QSAR models (Figure 2.2).

FigureFigure 2. 2. MachineMachine learning-based learning-based quantitative quantitative struct structure-activityure-activity relation relationshipship (QSAR) (QSAR) model. model.

TheThe value value predicted predicted in in each each leaf leaf of of decision decision trees trees in in a agradient gradient boosting boosting (GB) (GB) ensemble ensemble was was used used forfor the the embedding embedding of of a adescriptor descriptor space space (Figure (Figure 2.2.3).3). The The embedded embedded representation representation of of the the descriptor descriptor spacespace was was used used to to create create a aseparate separate k-nearest k-nearest neighbor neighbor (kNN) (kNN) model model for for each each receptor receptor (Figure (Figure 2.4).2.4). TheThe kNN kNN algorithm algorithm was was modified modified to to predict predict an an averag averagee pKi pKi value value of of a acompound compound only only if if all all nearest nearest neighborsneighbors of of the the query query molecular molecular structure structure were were within within a given a given distance. distance. This Thismaximal maximal distance distance was usedwas usedas an as applicability an applicability domain domain threshold threshold related related to the to theconfidence confidence of ofprediction prediction (Figure (Figure 2.5).2.5). DifferentDifferent thresholds thresholds werewere testedtested to to assess assess their their influence influence on theon statisticalthe statistical characteristics characteristics of kNN of models. kNN models.We selected We selected 20 thresholds 20 thresholds as percentiles as percentiles 5 to 100 with 5 to a100 step with equal a step to 5% equal of the to distribution5% of the distribution of maximal ofEuclidean maximal distancesEuclidean within distances kNN within clusters kNN obtained clusters for obtained the training for the set. training set.

2.4.2.4. Model Model Validation Validation

InIn Figure Figure 33,, we we present present the the dependence dependence of cross-validated of cross-validated Q 2 on Q the2 on applicability the applicability domain threshold.domain threshold.The predictive The performancepredictive performance showed consistent showed behavior, consistent declining behavior, with andeclining increased with threshold. an increased The best threshold.statistical The characteristics best statistical Q2 > characteristics0.8 for the CB1 Q and2 > 0.8 CB2 for models the CB1 was and observed CB2 models for the was lowest observed thresholds for the(< 0.01).lowest Interestingly, thresholds kNN (<0.01). models Interestingly, with the highest kNN threshold, models whichwith correspondsthe highest tothreshold, the most inclusivewhich correspondsapplicability to domain, the most still inclusive achieved applicability the statistical domain characteristics, still achieved Q2 >the0.6, statistical as suggested characteristics in studies Q of2 >QSAR 0.6, as bestsuggested practices in studies [33]. of QSAR best practices [33]. An autocorrelation plot (Figure4) shows good agreement between out-of-sample predictions and experimental values for the majority of compounds, and importantly, for the CB2 selectivity ranges of pKi (pKi < 6 for CB1 and pKi > 6.5 for CB2). The majority of data points presented in Figure4 are distributed between 6 and 8. Because machine learning techniques used in this study are interpolating techniques, we expected our model to overestimate pKi predictions at lower values and underestimate predictions at large ones.

Figure 3. Q2—applicability domain threshold dependence for CB1 and CB2 k-nearest neighbor (kNN) models. Each point on the curves represents a Q2 calculated for external predictions for a cumulative fraction of molecular structures in the training dataset within a step of 5%.

Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 4 of 14

The independent CB1 and CB2 QSAR models were created according to the architecture presented in Figure 2. The input feature vectors for the machine learning algorithm were molecular descriptors and fingerprints of compounds from the training set (Figure 2.1). The feature vectors along with experimental CB1 and CB2 pKi values were used as a training set for the CB1 and CB2 QSAR models (Figure 2.2).

Figure 2. Machine learning-based quantitative structure-activity relationship (QSAR) model.

The value predicted in each leaf of decision trees in a gradient boosting (GB) ensemble was used for the embedding of a descriptor space (Figure 2.3). The embedded representation of the descriptor space was used to create a separate k-nearest neighbor (kNN) model for each receptor (Figure 2.4). The kNN algorithm was modified to predict an average pKi value of a compound only if all nearest neighbors of the query molecular structure were within a given distance. This maximal distance was used as an applicability domain threshold related to the confidence of prediction (Figure 2.5). Different thresholds were tested to assess their influence on the statistical characteristics of kNN models. We selected 20 thresholds as percentiles 5 to 100 with a step equal to 5% of the distribution of maximal Euclidean distances within kNN clusters obtained for the training set.

2.4. Model Validation

In Figure 3, we present the dependence of cross-validated Q2 on the applicability domain threshold. The predictive performance showed consistent behavior, declining with an increased threshold. The best statistical characteristics Q2 > 0.8 for the CB1 and CB2 models was observed for the lowest thresholds (<0.01). Interestingly, kNN models with the highest threshold, which correspondsInt. J. Mol. Sci. 2020 to the, 21 ,most 5308 inclusive applicability domain, still achieved the statistical characteristics5 of Q 142 > 0.6, as suggested in studies of QSAR best practices [33].

Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 5 of 14 Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 5 of 14 An autocorrelation plot (Figure 4) shows good agreement between out-of-sample predictions and Anexperimental autocorrelation values plot for (Figurethe majority 4) shows of comp goodounds, agreement and importantly,between out-of-sample for the CB2 predictions selectivity andranges experimental of pKi (pKi values< 6 for CB1for the and majority pKi > 6.5 of for comp CB2).ounds, The majority and importantly, of data points for presentedthe CB2 selectivity in Figure ranges4 are distributed of pKi (pKi between< 6 for CB1 6 and pKi8. Because > 6.5 for machin CB2). Thee learning majority techniques of data points used presented in this study in Figure are 2 4interpolating areFigure distributed 3. Qtechniques,Q2—applicability—applicability between we 6 expecteddomainand domain 8. threshold thresholdBecause our model dependence dependencemachin to overestimatee learning for for CB1 CB1 and andpKitechniques CB2 CB2predictions k-nearest k-nearest used atneighbor neighbor in lower this (kNN)values (kNN)study andare 2 interpolatingunderestimatemodels. EachEach techniques, predictions point point on on weth the ate expectedcurves curveslarge ones.represents represents our model a Q 2 tocalculatedcalculated overestimate for for external external pKi predictions predictions predictions forat for lower a a cumulative cumulative values and underestimatefractionfraction of molecular predictions structures at large in ones. the training dataset within a step of 5%.5%.

Figure 4. Autocorrelation of experimental and predicted pKi values for the CB1 model (blue) and the FigureCB2Figure model 4. 4. AutocorrelationAutocorrelation (orange). of of experimental experimental and and predicted predicted pKi pKi values values for for the the CB1 CB1 model model (blue) (blue) and and the the CB2CB2 model model (orange). 2.5. Virtual Screening of Cannabis Sativa Phytochemicals 2.5.2.5. Virtual Virtual Screening Screening of of Cannabis Cannabis Sativa Sativa Phytochemicals Phytochemicals Sixty-eight Cannabis Sativa phytochemicals with a molecular weight between 250D and 500D Cannabis Sativa wereSixty-eightSixty-eight subjected toCannabis VS using Sativa our validatedphytochemicalsphytochemicals CB1 and with with CB2 a a QSARmolecular molecular models. weight weight On average,between between the250D 250D selectivity and and 500D of werecompounds subjected from to VSCannabis using our Sativa validated was less CB1 than and CB2 for theQSAR ChEMBL-derived models. On On average, average, training the the set.selectivity Still, we of of compoundsobservedcompounds a relatively fromfromCannabis Cannabis high Sativaautocorrelation Sativawas was less less than of thanthe for thepredicted for ChEMBL-derived the ChEMBL-derived CB1 and CB2 training activity training set. of Still, Cannabis set. we observedStill, Sativa we observedphytochemicalsa relatively a highrelatively autocorrelation(see Figure high autocorrelation 5) in of comparison the predicted of totheCB1 the predicted andChEMBL CB2 CB1 activity training and of CB2set.Cannabis Theactivity average Sativa of phytochemicalsCannabis absolute Sativa dpKi phytochemicals(seebetween Figure CB25) inand comparison(see CB1 Figure was 0.69 5) to in the and comparison ChEMBL 1.26 for molecules training to the ChEMBL set. in TheCannabis average training Sativa absolute set. and The molecules daveragepKi between inabsolute the CB2training dpKi and betweenset,CB1 respectively. was CB2 0.69 and and CB1 1.26 was for molecules0.69 and 1.26 in Cannabisfor molecules Sativa inand Cannabis molecules Sativa in and the molecules training set, in respectively.the training set, respectively.

Figure 5. The correlationcorrelation betweenbetween aaffinityffinity against against CB1 CB1 and and CB2 CB2 for for compounds compounds in Cannabisin Cannabis Sativa sativaand Figureandcompounds compounds 5. The reported correlation reported in ChEMBL. betweenin ChEMBL. affinity against CB1 and CB2 for compounds in Cannabis sativa and compounds reported in ChEMBL. Despite relatively low average predicted selectivity of compounds in Cannabis Sativa, structures C1–C3Despite showing relatively dpKi >low 1 andaverage a Tanimoto predicted coefficient selectivity (TC) of compounds < 0.5 were inidentified Cannabis (Table Sativa, 1). structures Low TC C1–C3values indicatedshowing dpKia significant > 1 and st aructural Tanimoto difference coefficient in the (TC) iden < 0.5tified were molecular identified structures (Table 1).in CannabisLow TC valuesSativa comparedindicated ato significant the respective structural most similar difference mo lecularin the idenstructurestified molecular in the training structures set. in Cannabis Sativa compared to the respective most similar molecular structures in the training set.

Int. J. Mol. Sci. 2020, 21, 5308 6 of 14

Despite relatively low average predicted selectivity of compounds in Cannabis Sativa, structures C1–C3 showing dpKi > 1 and a Tanimoto coefficient (TC) < 0.5 were identified (Table1). Low TC Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 6 of 14 valuesInt. J.Int. Mol. indicated J. Mol. Sci. 2020Sci. 2020, 21 a, ,significantx 21 FOR, x FOR PEER PEER REVIEW structural REVIEW difference in the identified molecular structures6 of 614 in of Cannabis14 Int.Int. J.J.Int. Mol.Mol. J. Mol.Sci.Sci. 2020 Sci. ,,2020 21,, xx, 21FORFOR, x FORPEERPEER PEER REVIEWREVIEW REVIEW 6 of 146 of 14 Sativa compared to the respective most similar molecular structures in the training set. TableTable 1. Compounds 1. Compounds from from Cannabis Cannabis Sativa Sativa that that were were predicted predicted as CB2-selective as CB2-selective and and are arealso also dissimilar dissimilar TableTableTable 1. Compounds 1. 1. Compounds Compounds from from from Cannabis Cannabis Cannabis Sativa Sativa Sativa thatthat that werewerethat were were predictedpredicted predicted predicted asas CB2-selectiveCB2-selective as as CB2-selective CB2-selective andand areand areand alsoalso are are also dissimilardissimilaralso dissimilar dissimilar to theto theChEMBL ChEMBL training training set. set. Tabletoto thethetoto 1. ChEMBL ChEMBL thetheCompounds ChEMBLChEMBL trainingtraining trainingtraining from set.set. set.Cannabisset. Sativa that were predicted as CB2-selective and are also dissimilar PredictedPredicted TaniTani to the ChEMBLStructureStructure training set. PredictedPredictedPredicted SimilarSimilar Molecule Molecule for forChEMBL ChEMBL TaniTaniTani StructureStructureStructure ValueValue SimilarSimilarSimilar Molecule MoleculeMolecule for ChEMBL forfor ChEMBLChEMBL motomoto ValueValueValue motomotomoto Structure Predicted Value Similar Molecule for ChEMBL CoefTanimotoCoef CB1CB1 CB2CB2 CB1CB1 CB2CB2 Coef CoefCoef CB1 CB1CB1CB2 CB2CB2 StructureStructure CB1 CB1CB1 CB2 CB2CB2 ficieCoeficiefficient CB1pKipKi pKiCB2pKi StructureStructureStructure pKiCB1pKi pKiCB2pKi ficie ficieficieficie pKi pKipKi pKi pKipKi Structure pKi pKipKi pKi pKipKi nt nt pKi pKi pKi pKi nt ntnt

5.46 6.87 4.64 N/A 0.28 C1 5.465.46 6.876.87 4.644.64 N/AN/A 0.280.28 C1C1 5.46 5.46 5.46 6.87 6.87 6.87 4.644.64 4.64 N/A N/N/A A0.28 0.28 0.28 C1C1C1 (4-hydroxy-6-methoxyspiro[1,2-(4-hydroxy-6-methoxyspiro[1,2- (4-hydroxy-6-methoxyspiro[1,2-(4-hydroxy-6-methoxyspiro[1,2-(4-hydroxy-6-methoxyspiro(4-hydroxy-6-methoxyspiro[1,2-(4-hydroxy-6-methoxyspiro[1,2- CHEMBL1201151CHEMBL1201151 dihydroindene-3,4’-cyclohexane]-1’-yl)dihydroindene-3,4’-cyclohexane]-1’-yl) CHEMBL1201151CHEMBL1201151CHEMBL1201151CHEMBL1201151 dihydroindene-3,4’-cyclohexane]-1’-yl)[1,2-dihydroindene-3,4’-cyclohexane]dihydroindene-3,4’-cyclohexane]-1’-yl)dihydroindene-3,4’-cyclohexane]-1’-yl) acetateacetate -1’-yl)acetate acetateacetateacetate

5.575.57 6.596.59 6.956.95 8.038.03 0.180.18 C2 C2 5.575.575.57 6.596.596.59 6.956.956.95 8.038.038.03 0.180.18 0.18 C2C2 C2C2 5.57 6.59 6.95 8.03 0.18 2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,8-2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,8- 2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,8-2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,8-2-[(2R,4aR,8R,8aR)-8-hydroxy-4a,8- dimethyl-1,2,3,4,5,6,7,8a-8-dimethyl-1,2,3,4,5,6,7,dimethyl-1,2,3,4,5,6,7,8a- CHEMBL256753CHEMBL256753CHEMBL256753 dimethyl-1,2,3,4,5,6,7,8a-dimethyl-1,2,3,4,5,6,7,8a-dimethyl-1,2,3,4,5,6,7,8a- CHEMBL256753CHEMBL256753CHEMBL256753 octahydronaphthalen-2-yl]prop-2-enoicoctahydronaphthalen-2-yl]prop-2-enoic8a-octahydronaphthalen-2-yl] octahydronaphthalen-2-yl]prop-2-enoicoctahydronaphthalen-2-yl]prop-2-enoicoctahydronaphthalen-2-yl]prop-2-enoic prop-2-enoicacidacid acid acid acidacid

6.416.41 7.29 7.297.29 7.337.337.33 6.04 6.046.04 0.410.41 0.41 6.41 6.416.41 7.29 7.297.29 7.33 7.337.33 6.04 6.046.04 0.41 0.410.41 C3 C3 C3 C3C3 6-carboxy-2-methyl-2-(4-methylpent-3-6-carboxy-2-methyl-2-(4-methylpent-6-carboxy-2-methyl-2-(4-methylpent-3- 6-carboxy-2-methyl-2-(4-methylpent-3-6-carboxy-2-methyl-2-(4-methylpent-3-6-carboxy-2-methyl-2-(4-methylpent-3- 3-enyl)-7-pentylchromen-5-olateenyl)-7-pentylchromen-5-olateenyl)-7-pentylchromen-5-olate enyl)-7-pentylchromen-5-olateenyl)-7-pentylchromen-5-olateenyl)-7-pentylchromen-5-olate 2.6.2.6. CB22.6. CB2 CB2 Structure-Based Structure-Based Structure-Based Virtual Virtual Virtual Screening Screening Screening Results Results 2.6. 2.6.CB22.6. CB2 CB2Structure-Based Structure-BasedStructure-Based Virtual VirtualVirtual Screening ScreeningScreening Results ResultsResults WeWe performed performed VS VSusing using four four PDB PDB structures structures of theof the CB2 CB2 receptor receptor (PDB (PDB id: id:6KPF, 6KPF, 6KPC, 6KPC, 6PT0, 6PT0, WeWe performedWeperformedWe performedperformed VS VS using VSVS using usingusing four four four fourPDB PDB PDBPDB structures structuresstructures of the ofof of CB2thethe the CB2 CB2receptor CB2 receptorreceptor receptor (PDB (PDB(PDB id: (PDB 6KPF, id:id: 6KPF,6KPF, id: 6KPC, 6KPF, 6KPC,6KPC, 6PT0, 6KPC, 6PT0,6PT0, 6PT0, andand 5ZTY) 5ZTY) and and two two compounds’ compounds’ libraries. libraries. The The libr libraryary for for the the first first compound compound was was derived derived from from the the andand5ZTY) andand5ZTY) 5ZTY)5ZTY) and and and andtwo two two twocompounds’ compounds’ compounds’compounds’ libraries. libraries. libraries. The The Thelibr The ary librlibr libraryaryforary the forfor firstthethe for first firstcompound the compoundcompound first compound was was wasderived derivedderived wasfrom fromfrom derivedthe thethe from curated,curated, ChEMBL ChEMBL training training datase dataset preparedt prepared for for our our QSAR QSAR models models and and it includedit included CB2-selective CB2-selective curated,curated,curated, ChEMBL ChEMBLChEMBL training trainingtraining datase datasedatasett preparedpreparedtt preparedprepared forfor ourforourfor ourourQSARQSAR QSARQSAR modelsmodels modelsmodels and andandit included itit includedincluded CB2-selective CB2-selectiveCB2-selective theligands curated,ligands expanded expanded ChEMBL with with training CB2-non-selective CB2-non-selective dataset prepared ligands. ligands. forFor For our this this QSAR compound compound models library, library, and it6KPF included6KPF and and 5ZTY CB2-selective 5ZTY ligandsligandsligandsligands expandedexpanded expandedexpanded withwith withwith CB2-non-selectiveCB2-non-selective CB2-non-selectiveCB2-non-selective ligands.ligands. ligands.ligands. ForFor ForForthisthis thisthiscompoundcompound compoundcompound library,library, library,library, 6KPF6KPF 6KPF6KPF andand and and5ZTY5ZTY 5ZTY5ZTY ligandsstructuresstructures expanded achieved achieved with the the highest CB2-non-selective highest area area under under the the ligands.receiver receiver operating Foroperating this characteristic compound characteristic curve library, curve (ROC (ROC 6KPF AUC) AUC) and 5ZTY structuresstructuresstructures achieved achievedachieved the thehighestthe highesthighest area area areaunder underunder the theretheceiver rereceiverceiver operating operatingoperating characteristic characteristiccharacteristic curve curvecurve (ROC (ROC(ROC AUC) AUC)AUC) structuresvaluevalue in thein achieved the discrimination discrimination the highest between between area CB2-selective CB2-selective under the ligands receiver ligands versus versus operating CB2-non-selective CB2-non-selective characteristic ones ones curve(see (see Table (ROCTable AUC) valuevaluevalue in the inin discriminationthethe discriminationdiscrimination between betweenbetween CB2-selective CB2-selectiveCB2-selective ligandsligands ligandsligands versusversus versusversus CB2-non-selectiveCB2-non-selective CB2-non-selectiveCB2-non-selective onesones onesones (see(see (see (seeTableTable TableTable value2). 2).The in The theantagonist-bound, antagonist-bound, discrimination 5ZTY between 5ZTY structure structure CB2-selective slightly slightly outperformed outperformed ligands versus 6KPF 6KPF CB2-non-selective(agonist (agonist-bound),-bound), as observedas ones observed (see Table 2). 2). The2).2). The Theantagonist-bound, antagonist-bound,antagonist-bound, 5ZTY 5ZTY5ZTY structure structurestructure slightly slightlyslightly outperformedoutperformed outperformedoutperformed 6KPF6KPF 6KPF6KPF (agonist(agonist (agonist(agonist-bound),-bound),-bound),-bound), asas observedobserved asas observedobserved in ina detaileda detailed analysis analysis of ofROC ROC curves curves obtained obtained in inour our enrichment enrichment study study (Figure (Figure 6). 6).The The second second Theinin a antagonist-bound,ain in detaileddetailed aa detaileddetailed analysisanalysis analysisanalysis ofof 5ZTY ROCROC ofof ROCROC structure curvescurves curvescurves obtainedobtained slightly obtainedobtained inin outperformed ourour inin ourourenrichmentenrichment enrichmentenrichment 6KPF studystudy studystudy (agonist-bound), (Figure(Figure (Figure(Figure 6).6). TheThe6).6). The Thesecondsecond as observedsecondsecond in a compound’scompound’s library library used used in ourin our structure-based structure-based VS VSincluded included the the same same CB2-selective CB2-selective ligands ligands derived derived detailedcompound’scompound’scompound’s analysis library librarylibrary of used ROC usedused in curves our inin ourstructure-basedour obtained structure-basedstructure-based inour VS includedincluded VSenrichmentVS includedincluded thethe same samethethe study samesame CB2-selectiveCB2-selective (Figure CB2-selectiveCB2-selective6). Theligandsligands ligandsligands second derivedderived derivedderived compound’s fromfrom the the ChEMBL ChEMBL training training dataset, dataset, named named here here with with the the term term “actives”, “actives”, and and general general diverse diverse decoys decoys libraryfromfromfromfrom thethe used ChEMBLtheChEMBLthe ChEMBL inChEMBL our trainingtraining structure-based trainingtraining dataset,dataset, dataset,dataset, namednamed named VSnamed herehere included herehere withwith withwith thethe the termthetermthe same termterm “actives”,“actives”, “actives”, CB2-selective“actives”, andand and andgeneralgeneral generalgeneral ligands diversediverse diversediverse derived decoysdecoys decoysdecoys from the thatthat were were generated generated with with DUD-E DUD-E [34], [34], named named here here as “non-actives”.as “non-actives”. In VSIn VSagainst against this this second second library, library, ChEMBLthatthatthat thatwerewere wereweretraining generatedgenerated generatedgenerated dataset, withwith withwith DUD-EDUD-E named DUD-EDUD-E [34],[34], here [34],[34], namednamed namednamed with herehere the herehere asas term “non-actives”.“non-actives”. asas “non-actives”.“non-actives”. “actives”, InIn andVSVS InIn against against VSVS general againstagainst thisthis diverse thissecondthissecond secondsecond library, decoyslibrary, library,library, that were 6KPF6KPF and and 5ZTY 5ZTY structures structures performed performed similarly, similarly, discriminating discriminating CB2-actives CB2-actives from from non-actives non-actives at theat the 6KPF6KPF6KPF and and and5ZTY 5ZTY5ZTY structures structuresstructures performed performedperformed similarly, similarly,similarly, discriminating discriminatingdiscriminating CB2-actives CB2-activesCB2-actives from fromfrom non-actives non-activesnon-actives at the atat thethe generatedlevellevel of 0.8of with 0.8and andDUD-E 0.79, 0.79, respectively respectively [34], named (see (see hereTable Table as 2, “non-actives”. ROC2, ROC AUC AUC values). values). In 6KPF VS 6KPF against slightly slightly this outperformed outperformed second library, 5ZTY 5ZTY 6KPF and levellevellevellevel ofof 0.80.8 ofof andand 0.80.8 and and0.79,0.79, 0.79,0.79, respectivelyrespectively respectivelyrespectively (see(see (see (seeTableTable TableTable 2,2, ROCROC 2,2, ROCROC AUCAUC AUCAUC values).values). values).values). 6KPF6KPF 6KPF6KPF slightlyslightly slightlyslightly outperformedoutperformed outperformedoutperformed 5ZTY5ZTY 5ZTY5ZTY 5ZTYaccordingaccording structures to theto the performedROC ROC AUC AUC characteristics similarly, characteristics discriminating and and 6KPC 6KPC performed performed CB2-actives the the worst, worst, from however, however, non-actives the the values values at thewere were level of 0.8 accordingaccordingaccording to the toto ROC thethe ROCROC AUC AUCAUC characteristics characteristicscharacteristics and and and6KPC 6KPC6KPC performed performedperformed the worst, thethe worst,worst, however, however,however, the values thethe valuesvalues were werewere andvery 0.79,very close close respectively with with no noexplicit explicit (see difference Table difference2, ROC between between AUC active values).active and and inactive 6KPF inactive slightlyCB2 CB2 structures. structures. outperformed Our Our results results 5ZTY showed showed according to veryveryvery close closeclose with withwith no explicit nono explicitexplicit difference differencedifference between betweenbetween activetive acac tive andtiveand and andinactiveinactive inactiveinactive CB2CB2 CB2 CB2structures.structures. structures.structures. OurOur Our Ourresultsresults resultsresults showedshowed showedshowed thethat ROCthat even even AUC slight slight characteristics differences differences between between and the 6KPC the crystal crystal performed struct structuresures theof theof worst, the same same however,receptor receptor have thehave an values animpact impact wereon onthe verythe close thatthatthat thateveneven eveneven slightslight slightslight differencesdifferences differencesdifferences betweenbetween betweenbetween thethe crystal crystalthethe crystalcrystal structstruct structstructuresuresures of the ofof same thethe samesame receptor receptorreceptor have havehave an impact anan impactimpact on the onon thethe VS VSresults. results. A similarA similar conclusion conclusion was was derived derived from from a recent a recent study study on onglucagon glucagon receptors, receptors, where where an an withVS results.VS noVS results. explicitresults. A similar AA di similarsimilarfference conclusion conclusionconclusion between was was wasderived active derivedderived andfrom fromfrom inactive a recent aa recentrecent CB2study studystudy structures. on glucagon onon glucagonglucagon Our receptors, resultsreceptors,receptors, where showed wherewhere an an thatan even ensembleensemble of ofreceptor receptor conforma conformationstions generated generated in inshort short MD MD si mulationssimulations outperformed outperformed crystal crystal slightensembleensembleensemble differences of receptorofof receptor betweenreceptor conforma conformaconforma the crystaltionstionstionstions generatedgenerated structures generatedgenerated inin of shortshortinin the shortshort same MDMD MD MDsisi receptormulations sisimulationsmulations have outperformed anoutperformedoutperformed impact oncrystal thecrystalcrystal VS results. structuresstructures in VSin VS[35,36]. [35,36]. structuresstructuresstructures in VS inin [35,36]. VSVS [35,36].[35,36]. Astructures similar conclusionin VS [35,36]. was derived from a recent study on glucagon receptors, where an ensemble of receptor conformations generated in short MD simulations outperformed crystal structures in

VS [35 ,36].

Int. J. Mol. Sci. 2020, 21, 5308 7 of 14

Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 7 of 14 Table 2. Results of the enrichment study for CB2-receptor structures. Table 2. Results of the enrichment study for CB2-receptor structures. CB2-Non-Selective Decoys General Decoys Metric Metric 6KPF CB2-Non-selective 6KPC 6PT0 Decoys 5ZTY 6KPF 6KPC General 6PT0 Decoys 5ZTY 6KPF 6KPC 6PT0 5ZTY 6KPF 6KPC 6PT0 5ZTY EF 2% 0 0 0 1.7 3.3 0 0 3.3 EFEF 5%2% 0.670 00 00 1.31.7 4.73.3 1.30 1.30 3.3 3.3 EF 10%5% 1.7 0.67 0 0 0.33 0 1 1.3 4.0 4.7 2 1.3 1.7 1.3 3 3.3 ROC EF 10% 0.61.7 0.530 0.530.33 0.61 0.84.0 0.732 0.771.7 0.79 3 ROCAUC AUC 0.6 0.53 0.53 0.6 0.8 0.73 0.77 0.79

A

B

Figure 6. 6. TheThe receiver receiver operating operating characteristic characteristic (ROC) (ROC) curves curves obtained obtained in the in enrichment the enrichment study studyusing usinglibraries libraries including including (A) CB2-selective (A) CB2-selective and CB2-non-selective and CB2-non-selective ligands and ligands (B) CB2-actives and (B) CB2-actives and CB2-non- and CB2-non-activeactive ligands. ligands.

3. Discussion The main goal of this study was to predict CB2-selective compounds among the phytochemicals that constitute Cannabis Sativa. We We achieved this goal throughthrough thethe followingfollowing steps:steps: collecting a large set of experimental data from ChEMBL for the trainingtraining set, datadata curation,curation, modelmodel developmentdevelopment and validation, andand predicting predicting the the selectivity selectivity of phytochemicalsof phytochemicals from fromCannabis Cannabis Sativa Sativa. In parallel,. In parallel, we tested we thetested applicability the applicability of recently of releasedrecently PDBreleased structures PDB ofstructures CB2 in structure-based of CB2 in structure-based VS by conducting VS anby enrichmentconducting an study. enrichment The dataset study. collected The dataset by our collected group by was our curated, group was which curated, resulted which in the resulted removal in ofthe more removal than of 90% more of than unreliable 90% of and unreliable inconsistent and inco data.nsistent Despite data. the Despite removal the of removal this large of fraction this large of data,fraction the of final data, dataset the final included dataset 1958 included records 1958 for CB1records and for 2616 CB1 records and 2616 for CB2.records Notably, for CB2. such Notably, a large datasetsuch a large is su datasetfficient foris sufficient employing for theemploying machine the learning machine approach. learning Muchapproach. smaller Much datasets smaller have datasets been successfullyhave been successfully used to develop used QSAR to develop models QSAR for CB1 mo/CB2dels studies for CB1/CB2 [23,24]. studies Nevertheless, [23,24]. focusing Nevertheless, on the datasetfocusing diversity on the indataset developing diversity our in QSAR developin modelg also our created QSAR amodel challenge also for created predicting a challenge the activity for ofpredicting outlier compounds.the activity of Weoutlier solved compounds. this problem We solved by determining this problem an by applicability determining domain an applicability for each compounddomain for individually.each compound Embedding individually. the descriptor Embeddin spaceg the of descriptor base GB models space madeof base it GB possible models to assessmade theit possible prediction to assess confidence the prediction based on confidence Euclidean base distancesd on Euclidean of k-nearest distances neighbors. of k-nearest The GB neighbors. algorithm The GB algorithm was also successfully applied by Ancuceanu et al. [37] for cytotoxicity prediction.

Int. J. Mol. Sci. 2020, 21, 5308 8 of 14 was also successfully applied by Ancuceanu et al. [37] for cytotoxicity prediction. The performance of GB in the modeling of various QSARs was compared to other ensemble methods in a study by Kwon et al. [38]. Our approach was validated with the five-fold cross-validation protocol with resulting statistical characteristics of Q2 > 0.6 for the entire dataset and Q2 > 0.8 for the most confident predictions. Regarding small subsets with the highest confidence predictions, our model outperformed much more complex 3D and 4D QSAR models created for equally small sets of similar compounds [23,24]. We used our validated QSAR models for the prediction of the CB1 and CB2 affinities of phytochemicals from Cannabis Sativa. As a result, we identified three compounds C1–C3 (see Table1) dissimilar to the training dataset with a TC < 0.3 compared to the respective most similar structure, and predicted pKi difference CB2 vs. CB1 (dpKi) 1. The common name of C1 is cannabispirol, and it is a compound isolated ≥ from Japanese domestic Cannabis Sativa [39]. C1 has been experimentally evaluated for antimicrobial activity [40] and was also tested in a study on targeting multidrug resistant mouse lymphoma cells [41]. Ilicic acid (C2) showed activity in G2/M cell cycle arrest of tumor cells [42]. Cannabichromente (C3) is a precursor for biosynthesis of various cannabinoids [43]. As for structure-based VS using PDB structures of CB2, we observed that these structures were able to discriminate not only CB2-actives from non-active ligands (DUD-E decoys) but also CB2-selective from non-selective ligands (the ChEMBL-derived dataset). The best results were obtained for 5ZTY and 6KPF structures, regardless of the activation state. In this study, we identified potential selective cannabinoids among the constituents of Cannabis sativa. Our QSAR model identified three compounds of potential interest with significant dissimilarity to compounds already evaluated experimentally and reported in ChEMBL. The statistical characteristics of the developed QSAR models suggest a high probability of successful experimental validation, which we hope will attract attention from experimental groups interested in searching for CB2-selective ligands of natural origin.

4. Materials and Methods

4.1. Data Collection Chemical structures from the ChEMBL database [44] with experimentally measured affinities for the CB1 and CB2 receptors were used as training data. The datasets were composed of 14,126 records for CB1 and 13,506 records for CB2. Compounds constituting the phytochemical profile of Cannabis Sativa were acquired from the Collective Molecular Activities of Useful Plants Database (CMAUP) [45]. The ChEMBL data was exported and downloaded in comma-separated values (CSV) format, while the CMAUP data was downloaded as an Structure Data Format (SDF) file.

4.2. Data Curation The method of data curation followed a modified approach of Fourches et al. [46,47]. We modified the curation method by the addition of extra steps that were relevant for curating data acquired from ChEMBL accompanied by additional metadata specific to this database. Namely, we grouped experimental values provided their source studies were the same, and then removed groups with less than 10 records. We also assessed data reliability according to the source study confidence score as provided by ChEMBL. We then removed records with a confidence score lower than 8. A final dataset consisted of records with InChIKeys as molecular structure identifiers with associated standardized 2D structures of compounds and pKi values. RDKit was used for standardization and InChIKeys calculation [48]. The curated CB2 and CB1 datasets are available on the website [49].

4.3. Descriptor Calculation The Mordred python library [50] was used to calculate 1613 2D standard descriptors and 213 3D descriptors. 3D descriptors were used to include information about chirality in ligand structures. Int. J. Mol. Sci. 2020, 21, 5308 9 of 14

Descriptors with variance less than 0.05 or containing invalid values were removed. The descriptors were concatenated with 1024-bit Morgan fingerprints with a radius equal to 3.

4.4. Machine Learning A gradient boosting (GB) algorithm implemented in the Light Gradient Boosting Machines library [51] was used to create base models. The GB algorithm and its parameters were selected in an inner 5-fold cross-validation protocol based on grid search. Other algorithms including linear regression, partial least squares, and support vector machines from the scikit-learn library [52] were also tested. GB models for CB1 and CB2 were decision tree ensembles. The output of each tree in a boosted ensemble was used to create embedding of a descriptor space. A kNN algorithm with k = 3 was used to determine the applicability domain of the model based on the embedding. The applicability domain was defined by a threshold, which is a maximum allowed distance between a query molecular structure and nearest neighbors. Molecular structures with the Euclidean distance to nearest neighbors greater than a given threshold were considered as out of the applicability domain. Python code for the execution of trained models is provided in the Supplementary Materials.

4.5. Model Validation Models were validated using a 5-fold cross-validation protocol. Out-of-sample predictions were used to calculate Q2 according to the following equation:

 2 P ˆ n Yi Yi Q2 = 1 − (1) P  2 − Y Y n i − i

Here, n is the number of samples in the dataset, Yi is an experimental pKi of the ith sample, 2 Yˆ i is a predicted value of pKi of the ith sample, and Yi is a mean pKi value. We evaluated Q for 20 different applicability domain thresholds and considered Q2 as a confidence measure for a model using a given threshold.

4.6. Prediction of CB2-Selectivity of Cannabis Sativa Ingredients A curated dataset of Cannabis Sativa constituents reported in the CMAUP database [45] was tested against our QSAR CB1 and CB2 models. In such a way, we searched for selective molecules with dpKi 1, where dpKi was defined as dpKi = pKi pKi . For each predicted selective compound, ≥ CB2 − CB1 a Tanimoto coefficient (TC) was calculated to assess the similarity of predicted compounds to ones already tested and used to exclude hits that were not novel with regard to training data. TC was computed using 1024-bit Morgan fingerprints with a radius equal to 3.

4.7. Structure-Based Virtual Screening CB2 receptor structures (PDB id: 6KPC [30], 6KPF [30], 6PT0 [31], and 5ZTY [32]) were derived from the Protein Data Bank (PDB) [53]. This set included agonist-bound (6KPC, 6KPF, 6PT0) and antagonist-bound (5ZTY) structures corresponding to different activation states of the CB2 receptor (active and inactive, respectively). The CB2 structures used in this study were very similar to each other (average pairwise RMSD = 6.41, with standard deviation = 3.40), yet differing in TMH6 conformation (5ZTY and 6KPC vs. others) bending while interacting with a G protein complex. In all these structures ligands were located in the orthosteric binding site of CB2, although allosteric modulation has also been observed for these receptors [54]. The four PDB structures have nearly identical binding sites except for a few residues: W258 (W6.48), S285, and to a lesser extent: F87 (F2.57), F91 (F2.61), H95 (H2.65), F183 (EC2) (see Figure7). W258 and F117 (F3.36) residues form a well-known Trp-Phe toggle switch changing its position on the receptor activation [17,30]. Interestingly, the position of Int. J. Mol. Sci. 2020, 21, 5308 10 of 14

W258 is slightly different not only in comparing the active vs. inactive conformations but also in three active conformations, e.g., moved away from the ligand (6PT0 vs. two others). Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 10 of 14

Figure 7. Comparison of of CB2 structures available in in the Protein Data Bank (PDB). Here, an inactive CB2 conformation (5ZTY) (5ZTY) is is shown shown in in green green and and is is the best performing in in virtual virtual screening screening (VS), (VS), active conformation (6KPF) (6KPF) is is shown shown in in yellow, yellow, an andd other other CB2 CB2 active active structures structures are are shown shown in in grey. grey. Residues are labeled according to the 5ZTY numbering.

Receptor structures were processed with the the Pr Proteinotein Preparation Preparation tool tool 2017-4, 2017-4, Schrodinger Schrodinger LLC, LLC, New York, NY, USA [[55].55]. Ligands in PDB structures were used to determine the position of docking grids spanning over the whole CB2 active site. The The prepared prepared CB2 CB2 structures structures were used for screening against a CB2 actives actives library library that that included included CB2-selective CB2-selective and and CB2-non-selective CB2-non-selective compounds to assess the ability of these CB2 structures to discriminate between selective selective and and CB2-non-selective CB2-non-selective ligands. ligands. This compound librarylibrary waswas extracted extracted from from our our ChEMBL ChEMBL training training set set (see (see Materials Materials and and Methods) Methods) and andincluded included compounds compounds with pKi withCB1 pKi<= CB15.5 and<= 5.5 pKi andCB2 >pKi7 forCB2 CB2-selective,> 7 for CB2-selective, while others while were others considered were consideredCB2-non-selective. CB2-non-selective. The second roundThe second of VS wasround performed of VS was to test performed to what extentto test experimental to what extent CB2 experimentalstructures were CB2 able structures to discriminate were able betweento discriminate CB2-actives between and CB2-actives non-actives. and Here, non-actives. we used Here, the wecompounds used the librarycompounds that was library generated that was using genera DUD-Eted [using34] with DUD-E ChEMBL-derived [34] with ChEMBL-derived CB2-selective ligands CB2- selective(see above) ligands used (see here above) as CB2 used actives. here Here,as CB2 we actives. discard Here, CB2-non-selective we discard CB2-non-selective ligands and we ligands did not andinclude we themdid not in theinclude compound them in library. the compound VS was performedlibrary. VS with was Glide performed 2017-4, with Schrodinger Glide 2017-4, LLC, SchrodingerNew York, NY, LLC, USA New [ 56York,] that NY, followed USA [56] the that ligand followed preparation the ligand with preparation Ligprep [57 with]. We Ligprep computed [57]. Weenrichment computed factors enrichment (EF) and factors areas (EF) under and the areas curve under (AUC) the forcurve receiver-operator (AUC) for receiver-operator curves (ROC) curves using (ROC)Maestro using 2017-4, Maestro Schrodinger 2017-4, LLC,Schrodinger New York, LLC, NY, New USA. York, NY, USA .

5. Conclusions Conclusions We would like to emphasize that our study aime aimedd to perform an in silico study to identify prospective CB2-selective compounds from C. Sativa that could be be confirmed confirmed by by experimental experimental studies studies inin the future. This This study study shows shows the the first first robust robust CB1/CB2 CB1/CB2 QSAR models that are reproducible and applicable to a large variety of chemical scaffo scaffolds.lds. Our developed model model was trained on a large and diverse setset ofof compounds compounds that that surpassed surpassed recent recent studies studies [26]. We[26]. based We based our method our method on machine on machine learning, learning,which utilized which data utilized from thousandsdata from ofthousands experimental of experimental studies on CB1 st/udiesCB2 activities. on CB1/CB2 Our studyactivities. was beenOur studyconducted was withoutbeen conducted making common without mistakes making suchcommon as, e.g., mistakes not conducting such as, data e.g., curation not conducting or its improper data curation or its improper execution, lack of a validation protocol, applying inappropriate statistical characteristics for estimation of predictive performance, and finally, lack of applicability domain determination [58–60]. We also showed that the applicability domain estimation based on a gradient boosted model latent space allows the accurate prediction of the model confidence. What is more, we validated the estimation of the model confidence interval, which allows users to make conclusions

Int. J. Mol. Sci. 2020, 21, 5308 11 of 14 execution, lack of a validation protocol, applying inappropriate statistical characteristics for estimation of predictive performance, and finally, lack of applicability domain determination [58–60]. We also showed that the applicability domain estimation based on a gradient boosted model latent space allows the accurate prediction of the model confidence. What is more, we validated the estimation of the model confidence interval, which allows users to make conclusions based on the general statistical characteristics of a model and also on the validated confidence of a single prediction. No prior QSAR study on cannabinoids have showed these results. The phytochemistry of C. Sativa has been described in many experimental and theoretical studies (see ElSohly et al. [2]). The analgesic effect of C. Sativa extracts has been known for many years [61]. Among the many experimental studies regarding cannabinoids, a few studies describing the three compounds selected by us deserve attention [39–43]. What is more, in the last year, many valuable and novel studies on C. Sativa applications in medicine have been published [62–66], thus, we hope that the exploration of the potential use of C. Sativa in pharmacotherapy will continue both experimentally and by using theoretical models.

Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/21/15/ 5308/s1. Author Contributions: Conceptualization, M.M., D.L. and J.C.-P.; Data curation, M.M.; Formal analysis, D.L. and J.C.-P.; Investigation, M.M. and D.L.; Methodology, M.M. and D.L.; Project administration, J.C.-P.; Resources, D.L. and J.C.-P.; Software, M.M. and D.L.; Supervision, D.L. and J.C.-P.; Validation, M.M.; Visualization, M.M. and D.L.; Writing—original draft, M.M., D.L. and J.C.-P.; Writing—review & editing, M.M., D.L. and J.C.-P. All authors have read and agreed to the published version of the manuscript. Funding: M.M. acknowledges the doctoral scholarship support from the National Science Center DEC-2018/28/T/NZ7/00472. This research was supported in part by the PL-Grid Infrastructure. Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Cascio, M.G.; Pertwee, R.G.; Marini, P. The pharmacology and therapeutic potential of plant cannabinoids. In Cannabis sativa L.—Botany and Biotechnology; Chandra, S., Lata, H., ElSohly, M.A., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 207–225. ISBN 9783319545646. 2. ElSohly, M.A.; Radwan, M.M.; Gul, W.; Chandra, S.; Galal, A. Phytochemistry of Cannabis sativa L. Prog. Chem. Org. Nat. Prod. 2017, 103, 1–36. [PubMed] 3. Andre, C.M.; Hausman, J.-F.; Guerriero, G. Cannabis sativa: The plant of the thousand and one molecules. Front. Plant Sci. 2016, 7, 19. [CrossRef][PubMed] 4. Aizpurua-Olaizola, O.; Soydaner, U.; Öztürk, E.; Schibano, D.; Simsir, Y.; Navarro, P.; Etxebarria, N.; Usobiaga, A. Evolution of the cannabinoid and terpene content during the growth of Cannabis sativa plants from different chemotypes. J. Nat. Prod. 2016, 79, 324–331. [CrossRef][PubMed] 5. Galaj, E.; Bi, G.-H.; Yang, H.-J.; Xi, Z.-X. Cannabidiol attenuates the rewarding effects of cocaine in rats by CB2, 5-HT1A and TRPV1 receptor mechanisms. Neuropharmacology 2020, 167, 107740. [CrossRef] 6. Aso, E.; Andrés-Benito, P.; Grau-Escolano, J.; Caltana, L.; Brusco, A.; Sanz, P.; Ferrer, I. Cannabidiol-enriched extract reduced the cognitive impairment but not the epileptic seizures in a Lafora disease animal model. In Cannabis and Cannabinoid Res; Mary Ann Liebert, Inc.: New Rochelle, NY, USA, 2019. 7. Volkow, N.D.; Hampson, A.J.; Baler, R.D. Don’t worry, be happy: Endocannabinoids and cannabis at the intersection of stress and reward. Annu. Rev. Pharmacol. Toxicol. 2017, 57, 285–308. [CrossRef] 8. Rong, C.; Lee, Y.; Carmona, N.E.; Cha, D.S.; Ragguett, R.-M.; Rosenblat, J.D.; Mansur, R.B.; Ho, R.C.; McIntyre, R.S. Cannabidiol in medical marijuana: Research vistas and potential opportunities. Pharmacol. Res. 2017, 121, 213–218. [CrossRef] 9. Morales, P.; Hurst, D.P.; Reggio, P.H. Molecular Targets of the Phytocannabinoids: A Complex Picture. Prog. Chem. Org. Nat. Prod. 2017, 103, 103–131. Int. J. Mol. Sci. 2020, 21, 5308 12 of 14

10. Devinsky, O.; Cilio, M.R.; Cross, H.; Fernandez-Ruiz, J.; French, J.; Hill, C.; Katz, R.; Di Marzo, V.; Jutras-Aswad, D.; Notcutt, W.G.; et al. Cannabidiol: Pharmacology and potential therapeutic role in epilepsy and other neuropsychiatric disorders. Epilepsia 2014, 55, 791–802. [CrossRef] 11. Russo, E.B.; Burnett, A.; Hall, B.; Parker, K.K. Agonistic properties of cannabidiol at 5-HT1a receptors. Neurochem. Res. 2005, 30, 1037–1043. [CrossRef] 12. Svíženská, I.; Dubovy, P.; Šulcová, A. Cannabinoid receptors 1 and 2 (CB1 and CB2), their distribution, ligands and functional involvement in nervous system structures—A short review. Pharmacol. Biochem. Behav. 2008, 90, 501–511. [CrossRef] 13. Marzo, V.D.; Di Marzo, V.; Bifulco, M.; De Petrocellis, L. The endocannabinoid system and its therapeutic exploitation. Nat. Rev. Drug Discov. 2004, 3, 771–784. [CrossRef][PubMed] 14. Pacher, P.; Bátkai, S.; Kunos, G. The endocannabinoid system as an emerging target of pharmacotherapy. Pharmacol. Rev. 2006, 58, 389–462. [CrossRef][PubMed] 15. Lyu, J.; Wang, S.; Balius, T.E.; Singh, I.; Levit, A.; Moroz, Y.S.; O’Meara, M.J.; Che, T.; Algaa, E.; Tolmachova, K.; et al. Ultra-large library docking for discovering new chemotypes. Nature 2019, 566, 224–229. [CrossRef][PubMed] 16. Jakowiecki, J.; Filipek, S. Hydrophobic ligand entry and exit pathways of the CB1 cannabinoid receptor. J. Chem. Inf. Model. 2016, 56, 2457–2466. [CrossRef][PubMed] 17. Latek, D.; Kolinski, M.; Ghoshdastider, U.; Debinski, A.; Bombolewski, R.; Plazinska, A.; Jozwiak, K.; Filipek, S. Modeling of ligand binding to G protein coupled receptors: Cannabinoid CB 1, CB 2 and adrenergic β 2 AR. J. Mol. Model. 2011, 17, 2353–2366. [CrossRef][PubMed] 18. Chohan, T.A.; Chen, J.-J.; Qian, H.-Y.; Pan, Y.-L.; Chen, J.-Z. Molecular modeling studies to characterize N-phenylpyrimidin-2-amine selectivity for CDK2 and CDK4 through 3D-QSAR and molecular dynamics simulations. Mol. Biosyst. 2016, 12, 1250–1268. [CrossRef] 19. Stokes, J.M.; Yang, K.; Swanson, K.; Jin, W.; Cubillos-Ruiz, A.; Donghia, N.M.; MacNair, C.R.; French, S.; Carfrae, L.A.; Bloom-Ackermann, Z.; et al. A Deep Learning Approach to Antibiotic Discovery. Cell 2020, 181, 475–483. [CrossRef] 20. Kasabe, A.J.; Kulkarni, A.S.; Gaikwad, V.L.QSPR Modeling of biopharmaceutical properties of hydroxypropyl methylcellulose (cellulose ethers) tablets based on its degree of polymerization. AAPS PharmSciTech 2019, 20, 308. [CrossRef] 21. Mizera, M.; Krause, A.; Zalewski, P.; Skibi´nski,R.; Cielecka-Piontek, J. Quantitative structure-retention relationship model for the determination of naratriptan hydrochloride and its impurities based on artificial neural networks coupled with genetic algorithm. Talanta 2017, 164, 164–174. [CrossRef] 22. Mizera, M.; Talaczy´nska,A.; Zalewski, P.; Skibi´nski,R.; Cielecka-Piontek, J. Prediction of HPLC retention times of tebipenem pivoxyl and its degradation products in solid state by applying adaptive artificial neural network with recursive features elimination. Talanta 2015, 137, 174–181. [CrossRef] 23. Romero-Parra, J.; Chung, H.; Tapia, R.A.; Faúndez, M.; Morales-Verdejo, C.; Lorca, M.; Lagos, C.F.; Di Marzo, V.; David Pessoa-Mahana, C.; Mella, J. Combined CoMFA and CoMSIA 3D-QSAR study of benzimidazole and benzothiophene derivatives with selective affinity for the CB2 cannabinoid receptor. Eur. J. Pharm. Sci. 2017, 101, 1–10. [CrossRef][PubMed] 24. Zhang, H.; Lv, Q.; Xu, W.; Lai, X.; Liu, Y.; Tu, G. 4D-QSAR studies of CB 2 cannabinoid receptor inverse agonists: A comparison to 3D-QSAR. Med. Chem. Res. 2019, 28, 498–504. [CrossRef] 25. Chen, J.-Z.; Han, X.-W.; Liu, Q.; Makriyannis, A.; Wang, J.; Xie, X.-Q. 3D-QSAR studies of arylpyrazole antagonists of cannabinoid receptor subtypes CB1 and CB2. A combined NMR and CoMFA approach. J. Med. Chem. 2006, 49, 625–636. [CrossRef][PubMed] 26. Floresta, G.; Apirakkan, O.; Rescifina, A.; Abbate, V. Discovery of high-affinity cannabinoid receptors ligands through a 3D-QSAR ushered by scaffold-hopping analysis. Molecules 2018, 23, 2183. [CrossRef] 27. Labib, R.M.; Srivedavyasasri, R.; Youssef, F.S.; Ross, S.A. Secondary metabolites isolated from Pinus roxburghii and interpretation of their cannabinoid and opioid binding properties by virtual screening and in vitro studies. Saudi Pharm. J. 2018, 26, 437–444. [CrossRef] 28. Hua, T.; Vemuri, K.; Nikas, S.P.; Laprairie, R.B.; Wu, Y.; Qu, L.; Pu, M.; Korde, A.; Jiang, S.; Ho, J.-H.; et al. Crystal structures of agonist-bound human cannabinoid receptor CB1. Nature 2017, 547, 468–471. [CrossRef] Int. J. Mol. Sci. 2020, 21, 5308 13 of 14

29. Ji, B.; Liu, S.; He, X.; Man, V.H.; Xie, X.-Q.; Wang, J. Prediction of the binding affinities and selectivity for CB1 and CB2 ligands using homology modeling, molecular docking, molecular dynamics simulations, and MM-PBSA binding free energy calculations. ACS Chem. Neurosci. 2020, 11, 1139–1158. [CrossRef] 30. Hua, T.; Li, X.; Wu, L.; Iliopoulos-Tsoutsouvas, C.; Wang, Y.; Wu, M.; Shen, L.; Johnston, C.A.; Nikas, S.P.; Song, F.; et al. Activation and signaling mechanism revealed by cannabinoid receptor-Gi complex structures. Cell 2020, 180, 655–665.e18. [CrossRef] 31. Xing, C.; Zhuang, Y.; Xu, T.-H.; Feng, Z.; Zhou, X.E.; Chen, M.; Wang, L.; Meng, X.; Xue, Y.; Wang, J.; et al. Cryo-EM structure of the human cannabinoid receptor CB2-Gi signaling complex. Cell 2020, 180, 645–654.e13. [CrossRef] 32. Li, X.; Hua, T.; Vemuri, K.; Ho, J.-H.; Wu, Y.; Wu, L.; Popov, P.; Benchama, O.; Zvonok, N.; Locke, K.; et al. Crystal structure of the human cannabinoid receptor CB2. Cell 2019, 176, 459–467.e13. [CrossRef] 33. Tropsha, A. Best practices for QSAR model development, validation, and exploitation. Mol. Inform. 2010, 29, 476–488. [CrossRef][PubMed] 34. Mysinger, M.M.; Carchia, M.; Irwin, J.J.; Shoichet, B.K. Directory of useful decoys, enhanced (DUD-E): Better ligands and decoys for better benchmarking. J. Med. Chem. 2012, 55, 6582–6594. [CrossRef][PubMed] 35. Latek, D.; Rutkowska, E.; Niewieczerzal, S.; Cielecka-Piontek, J. Drug-induced diabetes type 2: In silico study involving class B GPCRs. PLoS ONE 2019, 14, e0208892. [CrossRef][PubMed] 36. Pasznik, P.; Rutkowska, E.; Niewieczerzal, S.; Cielecka-Piontek, J.; Latek, D. Potential off-target effects of beta-blockers on gut hormone receptors: In silico study including GUT-DOCK—A web service for small-molecule docking. PLoS ONE 2019, 14, e0210705. [CrossRef] 37. Ancuceanu, R.; Dinu, M.; Neaga, I.; Laszlo, F.G.; Boda, D. Development of QSAR machine learning-based models to forecast the effect of substances on malignant melanoma cells. Oncol. Lett. 2019, 17, 4188–4196. [CrossRef] 38. Kwon, S.; Bae, H.; Jo, J.; Yoon, S. Comprehensive ensemble in QSAR prediction for drug discovery. BMC Bioinform. 2019, 20, 521. [CrossRef] 39. Shoyama, Y.; Nishioka, I. Cannabis. XIII. Two new spiro-compounds, cannabispirol and acetyl cannabispirol. Chem. Pharm. Bull. 1978, 26, 3641–3646. [CrossRef] 40. Molnar, J.; Csiszar, K.; Nishioka, I.; Shoyama, Y. The effects of cannabispiro compounds and tetrahydrocannabidiolic acid on the plasmid transfer and maintenance in Escherichia coli. Acta Microbiol. Hung. 1986, 33, 221–231. 41. Molnár, J.; Szabó, D.; Pusztai, R.; Mucsi, I.; Berek, L.; Ocsovszki, I.; Kawata, E.; Shoyama, Y. Membrane associated antitumor effects of crocine-, ginsenoside- and cannabinoid derivates. Anticancer Res. 2000, 20, 861–867. 42. León, L.G.; Donadel, O.J.; Tonn, C.E.; Padrón, J.M. Tessaric acid derivatives induce G2/M cell cycle arrest in human solid tumor cell lines. Bioorg. Med. Chem. 2009, 17, 6251–6256. [CrossRef] 43. Crombie, L.; Crombie, W.M.L. Cannabinoid acids and esters: Miniaturized synthesis and chromatographic study. Phytochemistry 1977, 16, 1413–1420. [CrossRef] 44. Gaulton, A.; Bellis, L.J.; Bento, A.P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107. [CrossRef][PubMed] 45. Zeng, X.; Zhang, P.; Wang, Y.; Qin, C.; Chen, S.; He, W.; Tao, L.; Tan, Y.; Gao, D.; Wang, B.; et al. CMAUP: A database of collective molecular activities of useful plants. Nucleic Acids Res. 2019, 47, D1118–D1127. [CrossRef] 46. Fourches, D.; Muratov, E.; Tropsha, A. Trust, but verify: On the importance of chemical structure curation in cheminformatics and QSAR modeling research. J. Chem. Inf. Model. 2010, 50, 1189–1204. [CrossRef] [PubMed] 47. Fourches, D.; Muratov, E.; Tropsha, A. Trust, but verify II: A Practical guide to chemogenomics data curation. J. Chem. Inf. Model. 2016, 56, 1243–1252. [CrossRef][PubMed] 48. Landrum, G. Rdkit documentation. Release 2013, 1, 1–79. 49. Resources. Available online: http://db-gpcr.chem.uw.edu.pl/ (accessed on 28 April 2020). 50. Moriwaki, H.; Tian, Y.-S.; Kawashita, N.; Takagi, T. Mordred: A molecular descriptor calculator. J. Cheminform. 2018, 10, 4. [CrossRef] Int. J. Mol. Sci. 2020, 21, 5308 14 of 14

51. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates Inc.: New York, NY, USA, 2017; pp. 3146–3154. 52. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.;et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. 53. Berman, H.M. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. [CrossRef] 54. Shao, Z.; Yan, W.; Chapman, K.; Ramesh, K.; Ferrell, A.J.; Yin, J.; Wang, X.; Xu, Q.; Rosenbaum, D.M. Structure of an allosteric modulator bound to the CB1 cannabinoid receptor. Nat. Chem. Biol. 2019, 15, 1199–1205. [CrossRef] 55. Release, S. Others 4: Schrödinger Suite 2017-4 Protein Preparation Wizard; Epik, Schrödinger, LLC: New York, NY, USA, 2017. 56. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra precision glide: Docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes. J. Med. Chem. 2006, 49, 6177–6196. [CrossRef][PubMed] 57. Release, S. 2: LigPrep; Schrödinger, LLC: New York, NY, USA, 2017. 58. Paulke, A.; Proschak, E.; Sommer, K.; Achenbach, J.; Wunder, C.; Toennes, S.W. Synthetic cannabinoids: In silico prediction of the cannabinoid receptor 1 affinity by a quantitative structure-activity relationship model. Toxicol. Lett. 2016, 245, 1–6. [CrossRef][PubMed] 59. Kumar, P.; Kumar, A. CORAL: QSAR models of CB1 cannabinoid receptor inhibitors based on local and global SMILES attributes with the index of ideality of correlation and the correlation contradiction index. Chemom. Intell. Lab. Syst. 2020, 200, 103982. [CrossRef] 60. Ghasemi, F.; Mehridehnavi, A.; Fassihi, A.; Pérez-Sánchez, H. Deep neural network in QSAR studies using deep belief network. Appl. Soft. Comput. 2018, 62, 251–258. [CrossRef] 61. Grayson, M. Cannabis. Nature 2015, 525, S1. [CrossRef][PubMed] 62. Mangoato, I.M.; Mahadevappa, C.P.; Matsabisa, M.G. Cannabis sativa L. Extracts can reverse drug resistance in colorectal carcinoma cells in vitro. Synergy 2019, 9, 100056. [CrossRef] 63. Martinenghi, L.D.; Jønsson, R.; Lund, T.; Jenssen, H. Isolation, purification, and antimicrobial characterization of cannabidiolic acid and cannabidiol from Cannabis sativa L. Biomolecules 2020, 10, 900. [CrossRef][PubMed] 64. Stonehouse, G.C.; McCarron, B.J.; Guignardi, Z.S.; El Mehdawi, A.F.; Lima, L.W.; Fakra, S.C.; Pilon-Smits, E.A.H. Selenium metabolism in hemp (Cannabis sativa L.)—Potential for Phytoremediation and biofortification. Environ. Sci. Technol. 2020, 54, 4221–4230. [CrossRef] 65. Frassinetti, S.; Gabriele, M.; Moccia, E.; Longo, V.; Di Gioia, D. Antimicrobial and antibiofilm activity of Cannabis sativa L. seeds extract against Staphylococcus aureus and growth effects on probiotic Lactobacillus spp. LWT 2020, 124, 109149. [CrossRef] 66. Sangiovanni, E.; Fumagalli, M.; Pacchetti, B.; Piazza, S.; Magnavacca, A.; Khalilpour, S.; Melzi, G.; Martinelli, G.; Dell’Agli, M. Cannabis sativa L. extract and cannabidiol inhibit in vitro mediators of skin inflammation and wound injury. Phytother. Res. 2019, 33, 2083–2093. [CrossRef]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).