Identification of a novel inhibitor of SARS-CoV-2 3CL-PRO through virtual screening and molecular dynamics simulation

Asim Kumar Bepari and Hasan Mahmud Reza
Department of Pharmaceutical Sciences, North South University, Dhaka, Bangladesh

ABSTRACT

Background: The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has ravaged lives across the globe since December 2019, and new cases are still on the rise. People's ongoing suffering drives scientists to develop safe and effective remedies for this deadly viral disease. While repurposing existing FDA-approved drugs remains the front-line strategy, exploring drug candidates among synthetic and natural compounds is also a viable alternative. This study employed a comprehensive computational approach to screen inhibitors of SARS-CoV-2 3CL-PRO (also known as the main protease), a prime molecular target for treating coronavirus diseases.

Methods: We performed 100 ns GROMACS molecular dynamics simulations of three high-resolution X-ray crystallographic structures of 3CL-PRO. We extracted frames at 10 ns intervals to mimic the conformational diversity of the target protein in biological environments. We then used AutoDock Vina molecular docking to virtually screen the Sigma-Aldrich MyriaScreen Diversity Library II, a collection of 10,000 druglike small molecules with diverse chemotypes. Subsequently, we computed physicochemical properties, pharmacokinetic parameters, and toxicity profiles in silico. Finally, we analyzed hydrogen bonding and other protein-ligand interactions for the short-listed compounds.

Results: Over the 100 ns molecular dynamics simulations, the 3CL-PRO crystal structures 6LZE, 6M0K, and 6YB7 showed overall integrity, with mean Cα root-mean-square deviation (RMSD) of 1.96 (±0.35) Å, 1.98 (±0.21) Å, and 1.94 (±0.25) Å, respectively. Average root-mean-square fluctuation (RMSF) values were 1.21 ± 0.79 (6LZE), 1.12 ± 0.72 (6M0K), and 1.11 ± 0.60 (6YB7). After two phases of AutoDock Vina virtual screening of the MyriaScreen Diversity Library II, we prepared a list of the top 20 ligands. We selected four promising leads considering predicted oral bioavailability, druglikeness, and toxicity profiles. These compounds also demonstrated favorable protein-ligand interactions. We then ran 50 ns molecular dynamics simulations for the four selected molecules and the reference ligand 11a in the crystallographic structure 6LZE. Analysis of RMSF, RMSD, and hydrogen bonding along the simulation trajectories indicated that S51765 would form a more stable protein-ligand complex with 3CL-PRO than the other molecules. Insights into short-range Coulombic and Lennard-Jones potentials also revealed favorable binding of S51765 with 3CL-PRO.

Conclusion: We identified a potential lead for antiviral drug discovery against the SARS-CoV-2 main protease. Our results will aid global efforts to find safe and effective remedies for COVID-19.

Submitted 9 November 2020
Accepted 22 March 2021
Published 13 April 2021
Corresponding author: Asim Kumar Bepari, [email protected]
Academic editor: Pedro Silva
Additional Information and Declarations can be found on page 23
DOI 10.7717/peerj.11261
Copyright 2021 Bepari and Reza
Distributed under Creative Commons CC-BY 4.0

How to cite this article: Bepari AK, Reza HM. 2021. Identification of a novel inhibitor of SARS-CoV-2 3CL-PRO through virtual screening and molecular dynamics simulation. PeerJ 9:e11261 DOI 10.7717/peerj.11261
Subjects: Computational Biology, Drugs and Devices, Infectious Diseases, Pharmacology
Keywords: COVID-19, Main protease, Mpro, docking, Coronavirus, in silico, SARS-CoV-2, 3CL-PRO, Vina, Gromacs

INTRODUCTION

The "severe acute respiratory syndrome coronavirus 2" (SARS-CoV-2), responsible for the coronavirus disease-2019 (COVID-19), originated in Wuhan, China in late 2019 as a pneumonia outbreak causing acute respiratory distress syndrome and related complications (Huang et al., 2020; Zhou et al., 2020; Wu et al., 2020; Gorbalenya et al., 2020). Considering the severity of symptoms among affected people and the rapid spread of the disease, the World Health Organization (WHO) declared COVID-19 a pandemic on 11 March 2020. This catastrophe has created an unprecedented healthcare crisis compounded by multifaceted economic, social, and cultural impacts (Sultana & Mahmud Reza, 2020; McKibbin & Fernando, 2020; Hartley & Perencevich, 2020; Headey et al., 2020; Forster et al., 2020). Despite extensive measures taken at scales from the individual to the global, the world has few weapons to fight this massive disaster. While remdesivir, the only FDA-approved drug to treat COVID-19, is indicated for patients 12 years of age and older requiring hospitalization, the search for safer and more effective antiviral agents continues.

SARS-CoV-2 is closely related to other coronaviruses, including SARS-CoV and MERS-CoV, and carries a single-stranded RNA genome of ∼30 kb, which encodes at least 14 open reading frames (ORFs) (Zhou et al., 2020; Wu et al., 2020; Kim et al., 2020; Gordon et al., 2020). ORF1a and ORF1ab produce the polypeptides pp1a and pp1ab, respectively, which upon proteolytic cleavage yield nonstructural proteins (nsps) that form the replicase–transcriptase complex (Kim et al., 2020; Gordon et al., 2020; Jiang et al., 2020).
The activity of 3CL-PRO (also known as 3C-like proteinase, main protease, and Mpro) is crucial in the auto-proteolysis of viral polypeptides and is a prime target in the discovery of antiviral agents for COVID-19 (Ziebuhr, Snijder & Gorbalenya, 2000; Anand et al., 2003; Zhang et al., 2020; Jin et al., 2020). Many high-resolution X-ray crystallographic structures of SARS-CoV-2 3CL-PRO, in both bound and unbound states, are available in the Protein Data Bank (PDB) (www.wwpdb.org). These three-dimensional structures can significantly help in designing, discovering, and developing potential inhibitors for future therapeutic applications.

Computational methods offer many quick and efficient routes through drug discovery and development (Kapetanovic, 2008; Macalino et al., 2015; Yu & MacKerell, 2017; Cui et al., 2020). It is noteworthy that proteins are dynamic in a biological environment, in contrast to the static X-ray crystallographic structures. Virtual screening of approved drugs or of large databases such as ZINC15 usually involves only a few target structures and is therefore more likely to miss potential ligands. In this study, we employed a comprehensive in silico approach to identify leads for the treatment of COVID-19 through inhibition of the viral main protease. We generated multiple target structures through molecular dynamics simulations of 3CL-PRO crystal structures and performed target-based virtual screening of the MyriaScreen Diversity Library II. Top compounds were then scrutinized for physicochemical properties, pharmacokinetic profiles, and toxicity risks. Subsequently, we performed protein-ligand interaction analyses for the best picks. Results from this comprehensive computational analysis may assist in finding an effective therapeutic intervention for COVID-19.

Bepari and Reza (2021), PeerJ, DOI 10.7717/peerj.11261
MATERIALS AND METHODS

Protein structure

We retrieved X-ray crystallographic protein structures with PDB IDs 6LZE (Dai et al., 2020), 6M0K (Dai et al., 2020), and 6YB7 from the Protein Data Bank (www.rcsb.org). A multiple structure alignment was done using the mTm-align webserver (Dong et al., 2018a).

Ligand libraries

MyriaScreen Diversity Library II is a powerful resource for lead discovery (Screening Compounds, 2020). Upon request to Sigma-Aldrich, we received an SDF file of this library, which contains 10,000 high-purity screening compounds. Sigma-Aldrich constructed this popular library from over 300,000 compounds on the basis of diversity and drug-likeness. All structures were edited using Open Babel (O'Boyle et al., 2011) and Discovery Studio Visualizer (Discovery Studio Visualizer, v20.1.0.192, 2019; BIOVIA, Dassault Systèmes, San Diego, CA, USA).

Virtual screening

All non-amino acid residues were removed from each protein structure using UCSF Chimera alpha version 1.14 (2019) (Pettersen et al., 2004). The Dock Prep tool of Chimera was then used with all default parameters to prepare the protein for docking, and the structure was saved as a pdb file. In AutoDockTools version 1.5.6 (Morris et al., 2009), the pdb file was edited by adding polar hydrogens, merging non-polar hydrogens, and adding Kollman charges. The final macromolecule was saved in the pdbqt format. We used the Parallelized Open Babel and AutoDock suite Pipeline (POAP) to automate the AutoDock Vina virtual screening process (Samdani & Vetrivel, 2018). The Ligand Preparation Module of POAP prepared the ligands by adding hydrogens, generating 3D coordinates, and minimizing energy. Ligand files were saved in the pdbqt format. We then used the Virtual Screening Module of POAP to screen the ligands using AutoDock Vina (Trott & Olson, 2010). The inhibitor 11a complexed with 6LZE was used as a guide to make the grid box.
For the grid box, the spacing was set to the default of 1 Å, the center xyz coordinates were 10.700, 0.784, 23.667, and the dimensions were 26 × 26 × 26. Exhaustiveness was set to 8. Ligands were ranked by binding energy (kcal/mol); a more negative value indicates stronger protein-ligand binding.

We performed rigid docking for the best four ligands and the reference inhibitor 11a using AutoDock4.2 (Morris et al., 2009), with the same ligand and protein files prepared for the Vina virtual screening. For the grid parameter file (.gpf), atom types were selected from the ligand files, the grid was centered on the ligand, the grid dimensions were 60 × 60 × 60, and the spacing was 0.375 Å. The Lamarckian Genetic Algorithm (LGA) was used for the simulation, and the maximum number of energy evaluations was 2,500,000. The best docked poses were selected based on the binding scores, and the corresponding complexes were generated. Subsequently, we used those complexes for protein-ligand interaction analyses in Discovery Studio Visualizer.
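
The docking settings described above can be illustrated with a short Python sketch. It is not the POAP pipeline and not the code used in this study. It only shows how the reported grid-box values map onto an AutoDock Vina configuration file, how the best score can be read from the `REMARK VINA RESULT:` lines that Vina writes to its output pdbqt files, and how ligands can then be ranked. The file names and function names are hypothetical. We also assume that the 26 × 26 × 26 box is expressed in Å (consistent with the 1 Å spacing) and that the 60 × 60 × 60 AutoDock4 grid is a number of grid intervals per axis.

```python
from typing import Dict, List, Tuple

# Grid-box values quoted in the text; units of angstrom are an assumption.
CENTER = (10.700, 0.784, 23.667)  # box center (x, y, z)
SIZE = (26, 26, 26)               # box edge lengths
EXHAUSTIVENESS = 8


def vina_config(receptor: str, ligand: str, out: str) -> str:
    """Return the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
    ]
    for axis, center, size in zip("xyz", CENTER, SIZE):
        lines.append(f"center_{axis} = {center}")
        lines.append(f"size_{axis} = {size}")
    lines.append(f"exhaustiveness = {EXHAUSTIVENESS}")
    return "\n".join(lines) + "\n"


def best_vina_score(pdbqt_text: str) -> float:
    """Return the most negative score (kcal/mol) found in a Vina output."""
    scores = [
        float(line.split()[3])  # fourth token is the predicted affinity
        for line in pdbqt_text.splitlines()
        if line.startswith("REMARK VINA RESULT:")
    ]
    if not scores:
        raise ValueError("no 'REMARK VINA RESULT:' line found")
    return min(scores)


def rank_ligands(outputs: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank ligands from most to least negative best score."""
    scored = [(name, best_vina_score(text)) for name, text in outputs.items()]
    return sorted(scored, key=lambda pair: pair[1])


def autodock4_box_edge(intervals: int, spacing: float) -> float:
    """Edge length spanned by an AutoDock4 grid along one axis."""
    return intervals * spacing


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(vina_config("receptor.pdbqt", "ligand.pdbqt", "ligand_out.pdbqt"))
    print(autodock4_box_edge(60, 0.375))
```

Under these assumptions, the AutoDock4 grid spans 60 × 0.375 Å = 22.5 Å per axis, slightly smaller than the 26 Å Vina box. Ranking on the most negative score per ligand follows the convention stated above that a more negative binding energy indicates stronger binding.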