User’s manual: Ruby script for analysis of FMO output (RbAnalysisFMO)

(Windows / Unix version)

Editor: Takaki Tokiwa Last update: 2019/02/18


Contents
0. First of all
  0.1 Copyright notice
  0.2 Citation
1. Introduction
2. Preparation
  2.1 Required programs
  2.2 Installation
    2.2.1 Installation of PV and Paics
    2.2.2 Installation of Ruby compiler
    2.2.3 Installation of Nokogiri library in Ruby using gem command
    2.2.4 Installation of gnuplot program
    2.2.5 Installation of gs (Ghostscript)
3. Analysis
  3.1 What is needed
  3.2 Option list
  3.3 Analysis from the Paics out file
    3.3.1 “Selected-pairs” mode in “one-dimensional table”
      3.3.1.1 Command line mode without options (standard streams on command-line or terminal)
      3.3.1.2 In-line mode (Using options on command-line or terminal)
    3.3.2 “All-pairs” mode in “two-dimensional table”
      3.3.2.1 Command line mode without options (standard streams on command-line or terminal)
      3.3.2.2 In-line mode (Using options on command-line or terminal)
4. References
Appendix
  A – Preparation to install PV and Paics
  B – Installation of gcc/g++ compiler on Ubuntu
  C – Installation of gfortran compiler on Ubuntu
  D – Installation of Intel C and Intel Fortran compiler
  E – Installation of MPI (Message Passing Interface)
  F – Installation of other software (freeglut, glut, LaTeX, and tcl/tk)
  G – Installation of PV
  H – Installation of Paics


0. First of all

0.1 Copyright notice

The Ruby source code (RbAnalysisFMO) is copyrighted, but you can freely use and copy it as long as you don't change or remove any of the copyright notices. This Ruby program (RbAnalysisFMO) is Copyright (C) 2016 by Takaki Tokiwa.

All Rights Reserved

Permission to use, copy, modify, distribute, and distribute modified versions of this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both the copyright notice and this permission notice appear in supporting documentation, and that the name(s) of the author(s) not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.

THE AUTHOR(S) DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

0.2 Citation

Please cite the following paper when using this analysis program (AnalysisFMO toolkit)!
- Authors: Takaki Tokiwa, Shogo Nakano, Yuta Yamamoto, Takeshi Ishikawa, Sohei Ito, Vladimir Sladek, Kaori Fukuzawa, Yuji Mochizuki, Hiroaki Tokiwa, Fuminori Misaizu, and Yasuteru Shigeta
- Title: Development of an Analysis Toolkit, AnalysisFMO, to Visualize Interaction Energies Generated by Fragment Molecular Orbital Calculations
- Journal name: Journal of Chemical Information and Modeling
- Vol., Issue, pp. year: Vol. 59, Issue 1, pp. 25-30
- DOI: 10.1021/acs.jcim.8b00649

1. Introduction

This manual introduces the Ruby script RbAnalysisFMO, part of the AnalysisFMO toolkit [1], which analyzes the output files (*.out, *.log, and so on) of the fragment molecular orbital (FMO) calculation packages GAMESS (the general atomic and molecular electronic structure system) [2], Paics (the parallelized ab initio calculation system) [3], and ABINIT-MP (the ABINT ab initio multi-processor) [4,5]. The output file of an FMO calculation contains various data in ASCII (text) format: run-time parameters, atomic coordinates, and inter-fragment interaction energies (IFIEs) and/or pair interaction energies (PIEs). To analyze and visualize these energies, the IFIEs or PIEs must first be extracted from the FMO output file, and RbAnalysisFMO performs this extraction and analysis. The GAMESS program package calculates PIEs [2] and the PAICS program package calculates IFIEs [3]. The ABINIT-MP program package calculates IFIEs [4], while the ABINIT-MP Open series calculates both IFIEs and PIEs [5].

This Ruby script has the following two analysis modes:

・“Selected-pairs” mode in “one-dimensional table”: In this mode, the Ruby script analyzes the IFIEs or PIEs in the FMO output file as fragments-target interactions (the target can be a fragment, ligand, peptide, H2O and so on). In the PAICS case, this mode analyzes the result of a PAICS calculation run with the “frag_calc_pair” option (selection of the fragment pairs, such as fragment number 1-100 (ALL) and fragment number 101 (target)). This option is described on P. 14 of Chapter 2 “INPUT” at http://www.paics.net/pdf/manual.pdf/. In the GAMESS case, this mode analyzes the result of a GAMESS calculation run with the “MOLFRG” option (this option is described on P. 276 of Section 2 – “Input Description” at http://fisica.ufpr.br/bettega/input_GAMESS.pdf/ or under “$FMO group (optional, activates FMO option)” at http://myweb.liu.edu/~nmatsuna/gamess/input/FMO.html/). As of September 2018, however, ABINIT-MP and the ABINIT-MP Open series (Ver. 1 Rev. 10) cannot run this kind of calculation. This mode saves the results as a one-dimensional table in CSV (Comma-Separated Values format, *.csv) and TXT (ASCII (text) format, *.txt) files. The IFIE data are plotted as a bar graph using the gnuplot program (http://www.gnuplot.info/ (accessed September 26th, 2018)). gnuplot saves the graph as a PS file, which is then converted to a PDF file using the ps2pdf program included in Ghostscript (https://www.ghostscript.com/ (accessed September 26th, 2018)).

・“All-pairs” mode in “two-dimensional table”: In this mode, the Ruby script analyzes the IFIEs or PIEs in the FMO output file as fragments-fragments interactions (the fragments may include a ligand, peptide, H2O and so on). In the PAICS case, this mode analyzes the result of a PAICS calculation run without the “frag_calc_pair” option (that is, the PAICS results for all fragment pairs). In the GAMESS case, this mode analyzes the result of a GAMESS calculation run without the “MOLFRG” option. This mode saves the results as a two-dimensional table in CSV (*.csv) and TXT (*.txt) files and can plot a 2D interaction map from the IFIE or PIE data using gnuplot.
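The difference between the two table layouts can be illustrated with a small Ruby sketch. It is not taken from RbAnalysisFMO: the function names, the CSV column headers, and the energy values are all hypothetical and only show how the same pair energies could be arranged as a one-dimensional table (fragments against one target) or a two-dimensional table (fragments against fragments).

```ruby
# Hypothetical pair energies (kcal/mol); the values are invented for illustration.
# Keys are [fragment_i, fragment_j] with i < j.
PAIR_ENERGIES = {
  [1, 2] => -3.2,
  [1, 3] => -0.4,
  [2, 3] => -12.7
}.freeze

# "Selected-pairs" layout: one row per fragment that interacts with the target.
def one_dimensional_table(pairs, target)
  rows = pairs.select { |(i, j), _| i == target || j == target }
              .map { |(i, j), e| [i == target ? j : i, e] }
              .sort_by(&:first)
  (["fragment,energy"] + rows.map { |frag, e| "#{frag},#{e}" }).join("\n")
end

# "All-pairs" layout: symmetric fragment-by-fragment matrix.
# The diagonal and any missing pair are written as 0.0 (an assumption of this sketch).
def two_dimensional_table(pairs, n_fragments)
  header = ([""] + (1..n_fragments).to_a).join(",")
  body = (1..n_fragments).map do |i|
    cells = (1..n_fragments).map do |j|
      i == j ? 0.0 : pairs.fetch([i, j].sort, 0.0)
    end
    ([i] + cells).join(",")
  end
  ([header] + body).join("\n")
end

puts one_dimensional_table(PAIR_ENERGIES, 3)
puts two_dimensional_table(PAIR_ENERGIES, 3)
```

In this sketch fragment 3 plays the role of the target; the real script reads the energies from the FMO output file instead of a hard-coded hash.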

In addition, RbAnalysisFMO requires several Ruby libraries: Logger and Nokogiri (sparklemotion/nokogiri. GitHub. https://github.com/sparklemotion/nokogiri/blob/master/LICENSE.md (accessed September 26th, 2018)), a Rubygem providing HTML and XML parsers.

The rest of this manual is divided into two parts. Section 2 covers the basic preparation for the Ruby analysis, i.e., the installation of the required programs on your system. Section 3 introduces typical analyses of IFIEs or PIEs using this Ruby script.


2. Preparation

2.1 Required programs

The following FMO programs and input-preparation programs are required for this manual:

● PV (PaicsView) [6]: PaicsView is a supporting program for making input files of PAICS, which is a quantum chemical calculation program developed by Takeshi Ishikawa. However, only Japanese manuals and an order form are available (as of 2011/01/11). Available at http://www.paics.net/paics_view_e.html (for Windows platforms (32 bit, binary))

● PyMOL plug-in (AnalysisFMO toolkit) [1]: The PyMOL plug-ins (the AnalysisFMO toolkit, including PyGAMESS, PyPAICS, and PyABINIT-MP) are written in Python [7] and visualize IFIEs/PIEs on the structure of the target protein in PyMOL, a popular molecular GUI visualization program [8]. They are programmed and developed by Shogo Nakano. For a detailed description of the PyMOL plug-in, the reader is referred to the PyMOL plug-in user’s guide (manual) located at http://dfns.u-shizuoka-ken.ac.jp/labs/proeng/custom20.html (for Windows and Unix platforms).

● Paics (Parallelized ab initio Calculation System based on FMO) [3]: Paics is a quantum chemical calculation program developed and administered by Takeshi Ishikawa. It adopts the fragment molecular orbital (FMO) method, with which large molecules, including biomolecular systems, can be treated with several quantum chemical approaches. Available at http://www.paics.net/index_e.html (for UNIX platforms). Manual (English) at http://www.paics.net/pdf/manual.pdf

● GAMESS (General Atomic and Molecular Electronic Structure System) [2]: The General Atomic and Molecular Electronic Structure System (GAMESS) is a general ab initio quantum chemistry package at http://www.msg.ameslab.gov/gamess/.

● FU (FMO utility) [9]: FU is an open source GUI for molecular simulations for GAMESS. (Ref.) https://staff.aist.go.jp/d.g.fedorov/fmo/fu.html

● Facio: Facio is a free GUI for computational chemistry software (TINKER, MSMS, Firefly, Gamess, MOPAC and Gaussian) at http://zzzfelis.sakura.ne.jp/.

● ABINIT-MP (ABINT ab initio Multi-Processor) [4,5]: ABINIT-MP is a user-friendly FMO program (by which 4-body fragments can be computed), especially for in-house /Intel servers under the standard MPI environment. Additionally, the associated graphical user-interface system, BioStation Viewer (on Windows) helps the preparation of input data including the tedious fragmentation setting and also assists intuitive understanding of the target system through the inter-fragment interaction energies (IFIEs). Available at http://ma.cms-initiative.jp/en/application-list/abinit-mp/abinit-mp http://www.cenav.org/abinit-mp-open_ver-1-rev-10/ Download at http://www.ciss.iis.u-tokyo.ac.jp/riss/dl/download/index.php#download_2

● BioStation Viewer [10]: ABINIT-MP's pre/post processing program, BioStation Viewer is a visualization tool for the molecular interaction analysis with Java and Java3D on Windows OS. http://www.ciss.iis.u-tokyo.ac.jp/riss/english/project/molecule/img/BioStation_EJ-20101012.pdf

The following analysis programs are required for this manual:

● Ruby compiler for Ruby scripts: Ruby is a dynamic, open source programming language [11], created in Japan, with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write. Ruby is available for all platforms. (Ref.) https://www.ruby-lang.org/en/

● Plotting program using gnuplot: For RbPaics_LigandInteractions, we will use “gnuplot” to plot and view the data analyzed from the PAICS output file (*.out). gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed. (Ref.) http://www.gnuplot.info/
* You do not need to install “gnuplot” if you do not plot data with this Ruby script.


● Program for converting PS files to PDF files, using ps2pdf in gs (Ghostscript): gnuplot directly outputs a PS (PostScript) file of the plots made from the PAICS output file. This PS file is then converted to a PDF (Portable Document Format) file using ps2pdf on the terminal or command line. The ps2pdf script is a work-alike for nearly all the functionality (but not the user interface) of Adobe's Acrobat(TM) Distiller(TM) product: it converts PostScript files to Portable Document Format (PDF) files. (Ref.) http://ghostscript.com/doc/current/Ps2pdf.htm
* You do not need to install gs if you do not install gnuplot and do not plot data with gnuplot.

● Ruby library: “Nokogiri”: Nokogiri is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support. (Ref.) http://www.nokogiri.org/
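The gnuplot-to-PDF flow described above (gnuplot writes a PS file, ps2pdf converts it to PDF) can be sketched in Ruby as follows. This is only an illustration, not the code used by RbAnalysisFMO: the function names, the file names, and the assumed data layout (a text file whose first column is a fragment label and whose second column is an energy) are hypothetical. The gnuplot commands themselves are standard ones.

```ruby
# Build the text of a gnuplot script that draws a bar graph and writes it
# as a PostScript file. Assumes a two-column data file: label, energy.
def gnuplot_bar_script(data_file, ps_file)
  [
    "set terminal postscript color",
    %(set output "#{ps_file}"),
    "set style data histograms",
    "set style fill solid",
    %(set ylabel "Interaction energy"),
    %(plot "#{data_file}" using 2:xtic(1) notitle)
  ].join("\n") + "\n"
end

# Build the ps2pdf command line (usage: ps2pdf input.ps [output.pdf]).
def ps2pdf_command(ps_file, pdf_file)
  ["ps2pdf", ps_file, pdf_file]
end

# Hypothetical file names, printed only; nothing is executed here.
puts gnuplot_bar_script("ifie.txt", "ifie.ps")
puts ps2pdf_command("ifie.ps", "ifie.pdf").join(" ")
```

To actually produce the files, the script text could be written to a file and passed to gnuplot, and the command array passed to `system`; both steps require gnuplot and Ghostscript to be installed as described in section 2.2.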


2.2 Installation

This subsection explains how to install the compilers, packages, and application software introduced in section 2.1.

2.2.1 Installation of PV and Paics

The installations of PV and Paics are described in appendices G and H, respectively.

2.2.2 Installation of Ruby compiler

- (Reference page): http://www2.rikkyo.ac.jp/~tokiwa/eng_home/menu_eng/bioLib/library/ruby/ruby.html

● Windows:
➢ Install ruby using package (*.exe):
* As of January 21, 2016, Ruby 2.2.2 has been released. Ruby 2.0.0 or later can be installed together with irb and tk for ruby (ruby/tk).

a). Download the ruby installer package
(Homepage): http://rubyinstaller.org/
(Download): http://rubyinstaller.org/downloads/
Download the ruby installer package (*.exe) for the version that you want to use from the download site above.

b). Install Ruby
Run the ruby installer (*.exe) and follow the installation flow.
* During installation, select the checkboxes “Add Ruby executables to your PATH” and “Associate .rb and .rbw files with this Ruby installation”. → You then do not need to set the ruby path yourself.


c). Validation of Ruby
Check that ruby works on the command prompt or the cygwin terminal:
  $ ruby -v
  ruby 2.0.0p0 (2013-02-24) [x64-mingw32]
  $ gem -v
  2.0.0
  $ irb
  DL is deprecated, please use Fiddle
  irb(main):001:0> exit
  $
* “$” is the terminal prompt for a general user (not root).
* Because .rb and .rbw files are associated with Ruby, you can run a ruby program (*.rb or *.rbw) directly by double-clicking it.
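The version check can also be done from inside Ruby. The following is a minimal sketch; the threshold 2.0.0 is taken from the note above and is an assumption here, not a documented requirement of RbAnalysisFMO, and the method name is hypothetical.

```ruby
require "rubygems" # provides Gem::Version (already loaded in a normal Ruby setup)

# Assumed minimum version; adjust it to whatever your environment requires.
MINIMUM_RUBY = "2.0.0"

# Compare version strings component by component rather than as plain text,
# so that "2.10.0" is correctly treated as newer than "2.2.0".
def ruby_new_enough?(current = RUBY_VERSION, minimum = MINIMUM_RUBY)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

puts "Ruby #{RUBY_VERSION}: #{ruby_new_enough? ? 'OK' : 'older than ' + MINIMUM_RUBY}"
```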

● Unix (for Ubuntu):
➢ Install ruby with the apt-get command:
a). Change your shell to “bash” so that wildcard matching can be used on your system
  $ bash

b). Search for the ruby package and ruby version
  $ apt-cache search ruby*

c). Install ruby
  $ sudo apt-get install ruby
or
  $ sudo apt-get install ruby[version]

➢ Install ruby from the source file:
a). Download the ruby source file from the following web site
(Download page): http://www.ruby-lang.org/en/downloads/

b). Decompress the tar file
  $ tar xvzf ruby-*.*.*-p***.tar.gz

c). Install
  $ cd ruby-*.*.*-p***/
  $ ./configure --prefix=/***/***/***/ruby-*.*.*
  $ make
  $ make install

d). Validation of Ruby
  $ cd /***/***/***/ruby-*.*.*
  $ ls
  bin/ include/ lib/ share/
  $ cd bin/
  $ ls
  erb* gem* irb* rake* rdoc* ri* ruby* testrb*
  $ ./ruby -v

e). Add the full path of ruby to the PATH setting in the shell startup file (.cshrc or .bashrc, …) of your system.

2.2.3 Installation of Nokogiri library in Ruby using gem command

* You do not need to install “Nokogiri” with gem if you do not output XML data with this Ruby script.

● Windows:
In the command prompt, type “gem install nokogiri”.

● Unix (for Ubuntu)
➢ Install nokogiri using the gem command
  $ sudo gem install nokogiri
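Whether the installation succeeded can be checked from Ruby itself. The sketch below simply tries to load the library and reports the result; the method name is hypothetical, and it is not part of RbAnalysisFMO.

```ruby
# Try to load Nokogiri; a LoadError means the gem is not installed
# for the Ruby that is running this script.
def nokogiri_available?
  require "nokogiri"
  true
rescue LoadError
  false
end

if nokogiri_available?
  puts "Nokogiri #{Nokogiri::VERSION} is installed"
else
  puts "Nokogiri not found; run 'gem install nokogiri' if XML output is needed"
end
```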

2.2.4 Installation of gnuplot program

* You do not need to install “gnuplot” if you do not plot data with this Ruby script.

● Windows:
➢ Install gnuplot using package (*.exe):
a). Download the gnuplot installer file from the following web site
(Homepage): http://gnuplot.sourceforge.net/
(Download): http://www.tatsuromatsuoka.com/gnuplot/Eng/winbin/

b). Decompress the downloaded file and double-click the decompressed gnuplot icon following the install flow


Select “windows” as the gnuplot default terminal, and select the checkbox for “Add application directory to your PATH environment variable”.

The installer then reports that the gnuplot installation completed successfully.

c). Validation of gnuplot
If you selected the checkbox for “Add application directory to your PATH environment variable”, you can run the gnuplot program from the command prompt.

* If you selected “windows” as the gnuplot default terminal, gnuplot starts on the command prompt screen.

➢ Install gnuplot with cygwin:
a). Install cygwin
If you install cygwin with the full set of packages, the gnuplot program is included.

b). Install the xming package
(Download): http://sourceforge.net/projects/xming/

c). Double-click the xming icon and start the xming program

d). Start the cygwin program

e). Start the X window from the cygwin terminal
  $ startxwin.exe
or
  $ startxwin

f). Execute gnuplot on the xterm
  $ gnuplot

● Unix (for Ubuntu)
➢ Install gnuplot with the apt-get command
  $ sudo apt-get install gnuplot

2.2.5 Installation of gs (Ghostscript)

* You do not need to install gs if you do not install gnuplot and do not plot data with gnuplot.

● Windows:
➢ Install gs using package (*.exe):
a). Download the gs installer from the following web site
(Homepage): http://pages.cs.wisc.edu/~ghost/
(Download): http://www.ghostscript.com/download/

b). Double-click the downloaded gs icon following the install flow


c). Set the environment variable for Windows (in Windows7)
c-1). Open the Start Menu and right-click on Computer. Select “Properties”.
c-2). Select “Advanced system settings”.
c-3). In the “Advanced” tab, select “Environment Variables…”.
c-4). Select “Edit…” on “Path”.
  : [Path];[Path];…[Path];%PATH%
  Make sure that you separate the values with “;” and add “%PATH%” at the end of the Path.
c-5). Add the “bin” and “lib” directories of gs to the Path environment variable:
  “C:\Program Files\gs\gs9.06\bin;” and “C:\Program Files\gs\gs9.06\lib;”
  * You must specify your own installation location of gs!
c-6). Select OK. You should now see the new environment variable that you created.

d). Validation of ps2pdf on the command prompt
  $ ps2pdf
or
  $ ps2pdf14
  Usage: ps2pdf input.ps [output.pdf]
     or: ps2pdf [options...] input.[e]ps output.pdf

● Unix (for Ubuntu):
➢ Install gs with LaTeX packages using the apt-get command:
Install LaTeX packages
  $ sudo apt-get install texlive-latex-extra
  $ sudo apt-get install latexmk dvipng
  $ sudo apt-get install xpdf gs-cjk-resource
  $ sudo apt-get install vfdata-morisawa5 dvi2ps-fontdesc-morisawa5
  $ sudo apt-get install cmap-adobe-japan1 cmap-adobe-japan2 cmap-adobe-cns1 cmap-adobe-gb1
  $ sudo jisftconfig add
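After the optional tools are installed, it can be convenient to confirm from Ruby that gnuplot and ps2pdf are reachable through PATH. The sketch below is an assumption-laden helper, not part of RbAnalysisFMO: `find_executable` is a hypothetical name, and the lookup simply walks the directories in PATH (also trying the extensions listed in PATHEXT on Windows).

```ruby
# Return the full path of the first executable called `name` found in PATH,
# or nil when none is found.
def find_executable(name, path_env = ENV.fetch("PATH", ""))
  # On Windows, PATHEXT lists extensions such as .EXE and .BAT; elsewhere use the bare name.
  extensions = ENV["PATHEXT"] ? ENV["PATHEXT"].split(";") + [""] : [""]
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, name + ext)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

%w[gnuplot ps2pdf].each do |tool|
  location = find_executable(tool)
  puts "#{tool}: #{location || 'not found on PATH'}"
end
```

If a tool is reported as not found, revisit the PATH settings described in sections 2.2.4 and 2.2.5.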


3. Analysis

3.1 What is needed

To run the Ruby analysis on calculated FMO data, at least the following items are required:

● Protein Data Bank (PDB) file (*.pdb): A PDB file stores atomic coordinates and provides information about the 3D structures of proteins, nucleic acids, and complex assemblies. Use the same PDB file that you used to make the input file (*.inp) of the PAICS calculation with PV or PyPaics.

● FMO out file (*.out, *.log, or *.txt, and so on…): The out file contains the results of the FMO calculation and stores the input information, fragment information, and FMO-1 results (RHF energy, MP2 correlation energy (cmp2), etc.) calculated by the FMO program.
- Paics manual: http://www.paics.net/pdf/manual.pdf at P. 33, Chapter 3
- GAMESS-FMO manual: https://staff.aist.go.jp/d.g.fedorov/fmo/GAMESS-FMO_J.pdf (Japanese)
  http://myweb.liu.edu/~nmatsuna/gamess/input/FMO.html
- ABINIT-MP manual: http://ma.cms-initiative.jp/en/application-list/abinit-mp?set_language=en


● Ruby analysis script (RbAnalysisFMO.rb):

・“Selected-pairs” mode in “one-dimensional table”: In this mode, the Ruby script analyzes the IFIEs or PIEs in the FMO output file as fragments-target interactions (the target can be a fragment, ligand, peptide, H2O and so on). In the PAICS case, this mode analyzes the result of a PAICS calculation run with the “frag_calc_pair” option (selection of the fragment pairs, such as fragment number 1-100 (ALL) and fragment number 101 (target)). This option is described on P. 14 of Chapter 2 “INPUT” at http://www.paics.net/pdf/manual.pdf/. In the GAMESS case, this mode analyzes the result of a GAMESS calculation run with the “MOLFRG” option (this option is described on P. 276 of Section 2 – “Input Description” at http://fisica.ufpr.br/bettega/input_GAMESS.pdf/ or under “$FMO group (optional, activates FMO option)” at http://myweb.liu.edu/~nmatsuna/gamess/input/FMO.html/). As of September 2018, however, ABINIT-MP and the ABINIT-MP Open series (Ver. 1 Rev. 10) cannot run this kind of calculation. This mode saves the results as a one-dimensional table in CSV (Comma-Separated Values format, *.csv) and TXT (ASCII (text) format, *.txt) files. The IFIE data are plotted as a bar graph using the gnuplot program (http://www.gnuplot.info/ (accessed September 26th, 2018)). gnuplot saves the graph as a PS file, which is then converted to a PDF file using the ps2pdf program included in Ghostscript (https://www.ghostscript.com/ (accessed September 26th, 2018)).


・“All-pairs” mode in “two-dimensional table”: In this mode, the Ruby script analyzes the IFIEs or PIEs in the FMO output file as fragments-fragments interactions (the fragments may include a ligand, peptide, H2O and so on). In the PAICS case, this mode analyzes the result of a PAICS calculation run without the “frag_calc_pair” option (that is, the PAICS results for all fragment pairs). In the GAMESS case, this mode analyzes the result of a GAMESS calculation run without the “MOLFRG” option. This mode saves the results as a two-dimensional table in CSV (*.csv) and TXT (*.txt) files and can plot a 2D interaction map from the IFIE or PIE data using gnuplot.
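The PDB file supplies the residue names and residue sequence numbers that label the fragments. As a rough illustration of how such labels can be read, the sketch below extracts them from the fixed columns of ATOM/HETATM records (residue name in columns 18-20, chain identifier in column 22, residue sequence number in columns 23-26 of the PDB format). The method names and the sample records are hypothetical, and this is not the parser used by RbAnalysisFMO.

```ruby
# Collect [chain, residue sequence number, residue name] once per residue,
# in file order, from the fixed-width ATOM/HETATM records of a PDB file.
def residues_from_pdb(lines)
  residues = []
  lines.each do |line|
    next unless line.start_with?("ATOM", "HETATM")
    name  = line[17, 3].to_s.strip # columns 18-20: residue name
    chain = line[21, 1].to_s.strip # column 22: chain identifier
    seq   = line[22, 4].to_i       # columns 23-26: residue sequence number
    entry = [chain, seq, name]
    residues << entry unless residues.last == entry
  end
  residues
end

# Build a column-aligned sample record (coordinates omitted; invented data).
def sample_atom(serial, atom, res, chain, seq)
  format("ATOM  %5d %-4s %3s %s%4d", serial, atom, res, chain, seq)
end

sample = [
  sample_atom(1, "N", "GLY", "A", 300),
  sample_atom(2, "CA", "GLY", "A", 300),
  sample_atom(3, "N", "ALA", "A", 301)
]
p residues_from_pdb(sample)
```

With a real file, the lines could be supplied by `File.readlines` on the same PDB file that was used to prepare the FMO input.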


3.2 Option list

This section explains the options available in RbAnalysisFMO. The same explanation is displayed on the terminal when the program is run with the -h (--Help) option.

Usage:
  $ ruby RbAnalysisFMO.rb [options]
or
  $ ruby RbAnalysisFMO.rb

Examples:
- Option mode
  $ ruby RbAnalysisFMO.rb -opt1 -opt2 ...

- Terminal mode (without options)
  $ ruby RbAnalysisFMO.rb
  * If you run this program without options, you enter character strings on your terminal instead of options.
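The two ways of running the script can be pictured with the following sketch. It is not the actual implementation: the prompt texts, their order, and the names `PROMPTS` and `collect_settings` are hypothetical, and the only behavior taken from this manual is that a run without options asks for the settings on the terminal.

```ruby
require "stringio"

# Hypothetical prompts; the real script's wording and order may differ.
PROMPTS = {
  fmo: "FMO program (1: Paics, 2: GAMESS, 3: ABINIT-MP): ",
  out: "FMO output file: ",
  pdb: "PDB file: "
}.freeze

# Option mode when arguments are given; otherwise ask on the terminal.
def collect_settings(argv, input = $stdin, output = $stdout)
  return { mode: :option, args: argv.dup } unless argv.empty?

  answers = PROMPTS.each_with_object({}) do |(key, prompt), acc|
    output.print prompt
    acc[key] = input.gets.to_s.strip
  end
  { mode: :terminal, answers: answers }
end

# Demonstration with canned input so that the example does not wait for typing.
demo = collect_settings([], StringIO.new("1\nresult.out\nprotein.pdb\n"), StringIO.new)
p demo
```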

Specific options:
: -f, --fmo VALUE
  [(int) VALUE]:
  1: Paics
  2: GAMESS
  3: ABINIT-MP or ABINIT-MP Open Series

: -r, --out VALUE
  (ex.) Input the out file name of the FMO method (*.out, *.log, *.txt, and so on…) and its (full) path
  -r /AAA/BBB/CCC/[FMO Out File Name].out

: -b, --pdb VALUE
  (ex.) Input the PDB file name (*.pdb) and its (full) path
  -b /AAA/BBB/CCC/[PDB File Name].pdb

: -o, --output VALUE
  (ex.) Input the output file name and the (full) path to write it
  * The format of the output file is "TEXT format" only; you can specify any extension yourself.
  -o /AAA/BBB/CCC/[Output file name].output

: -a, --aaletter VALUE
  [(int) VALUE]:
  1: Amino acid 1 letter code (ex. Glycine name is "G")
  2: Amino acid 3 letter code (ex. Glycine name is "GLY")


: -i, --interaction VALUE
  [(int) VALUE]:
  1: “Selected-pairs” mode in "one-dimensional" table
     To analyze IFIEs as [fragments]-[user's selected target] (fragment, ligand, peptide, H2O and so on) interaction (selection of the fragment pairs, such as fragment number 1-100 (ALL) and fragment number 101 (user's selected target))
  2: “All-pairs” mode in "two-dimensional" table
     To analyze IFIEs as [fragments]-[fragments] (which may include ligand, peptide, H2O and so on) interaction
  * Notice:
  << Paics >>
  - Paics output with the "frag_calc_pair" option → [1]: Selected-pairs mode
  - Paics output without the "frag_calc_pair" option → [2]: All-pairs mode (that is, results of PAICS for all fragment calculation)
    : Chapter 2. INPUT, P. 14 in PAICS manual (URL: http://www.paics.net/pdf/manual.pdf)
  << GAMESS >>
  - GAMESS-FMO output without "MOLFRG(i)" option → [1]: Selected-pairs mode
  - GAMESS-FMO output with the "MOLFRG(i)" option → [2]: All-pairs mode
    : http://myweb.liu.edu/~nmatsuna/gamess/input/FMO.html
  << ABINIT-MP >>
  As of September 5th, 2018, ABINIT-MP or ABINIT-MP Open can calculate the FMO output of "All-pairs mode only"!
  * However, this program can extract the data of a "one-dimensional" table from FMO output of the "All-pairs" mode, as it can for GAMESS and PAICS.


: -j, --fragment one, two
  a). Specify the fragment number range (between ith-fragment and jth-fragment) to extract data
      * (ith-frag < jth-frag) and (ith-frag, jth-frag ≠ 0)
      -j [ith-frag], [jth-frag]
      (ex). Extract data from residue sequence number 300 to residue sequence number 400
      -j 300, 400
  b). Extract all data (from residue sequence number 1 to max residue sequence number)
      -j all
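The options listed above could be handled with Ruby's standard OptionParser library roughly as follows. This is a sketch under stated assumptions, not the source of RbAnalysisFMO: the method names and the returned hash are hypothetical, and this sketch expects the fragment range as a single shell argument (for example `-j 300,400`, or quoted as `-j "300, 400"`), which may differ from how the real script reads it. The range checks mirror the conditions given for -j (ith-frag < jth-frag, neither equal to 0).

```ruby
require "optparse"

# Turn "all" or "i, j" into :all or [i, j], enforcing the documented conditions.
def parse_fragment_range(value)
  return :all if value.strip.downcase == "all"

  first, last = value.split(",").map { |part| Integer(part.strip) }
  raise ArgumentError, "expected two fragment numbers" if first.nil? || last.nil?
  raise ArgumentError, "fragment numbers must not be 0" if first.zero? || last.zero?
  raise ArgumentError, "ith-frag must be smaller than jth-frag" unless first < last
  [first, last]
end

# Parse the documented flags into a hash (keys are names chosen for this sketch).
def parse_options(argv)
  options = {}
  OptionParser.new do |opts|
    opts.on("-f", "--fmo VALUE", Integer) { |v| options[:fmo] = v }
    opts.on("-r", "--out VALUE") { |v| options[:out] = v }
    opts.on("-b", "--pdb VALUE") { |v| options[:pdb] = v }
    opts.on("-o", "--output VALUE") { |v| options[:output] = v }
    opts.on("-a", "--aaletter VALUE", Integer) { |v| options[:aaletter] = v }
    opts.on("-i", "--interaction VALUE", Integer) { |v| options[:interaction] = v }
    opts.on("-j", "--fragment VALUE") { |v| options[:fragment] = parse_fragment_range(v) }
  end.parse(argv)
  options
end

# Hypothetical file names, for illustration only.
p parse_options(%w[-f 1 -r result.out -b protein.pdb -i 1 -j 300,400])
```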