Biophysics and Physicobiology Vol. 16, pp. 485–489 (2019) Supplementary Materials

Supplementary Text S1.

Introduction

File format support using simulation packages/molecular visualization systems

In the field of computational molecular science, several file formats hold the chemical information of proteins, such as PDB, PDBx/mmCIF, and PDBML[1]. In computational studies, researchers use these containers for pre-processing and post-processing, for example, preparing an initial structure for a molecular dynamics (MD) simulation and visualizing the final structure of that simulation. The PDB file format is supported by various MD packages, including Amber[2], NAMD[3], and CHARMM[4], and by molecular graphics systems, including VMD[5] and UCSF Chimera[6], even though, according to wwPDB, support for updating its data ended on 21 November 2012. PDBx/mmCIF, the current standard file format, is also supported by the tools mentioned above. In contrast, PDBML, an XML conversion of PDBx/mmCIF, is generally not supported by such tools, except for tools supplied by PDBj such as jV[7] and Molmil[8]. The way the information of a single atom is described differs among PDB, PDBx/mmCIF, and PDBML.

Advantage of PDBML

In the field of information technologies, XML is widely used as a file format in which the attributes of hierarchically structured data can be extended. XML file I/O functions are necessary to keep up with advances in file formats for biomolecules, such as the addition of new attributes required by next-generation research. For example, the PDBML format by wwPDB is an XML file format that holds information identical to that of the PDBx/mmCIF file format and is highly suitable for future extension.

©2016 THE BIOPHYSICAL SOCIETY OF JAPAN


Methods

Programming strategies

Model-View-Controller model

The Model-View-Controller (MVC) model[9] is an architectural pattern used in applications with a user interface (UI). In the MVC model, an application is divided into three components: Model, View, and Controller (Figure S1). Model manages the data, logic, and rules of the application. View presents information that users can see, such as diagrams or tables. Controller accepts input from users and converts it to commands for the Model and the View.
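The division of roles described above can be illustrated with a minimal Scala sketch. It is only an illustration of the pattern, not code taken from STCSB; the names AtomCountModel, TextView, and Controller, as well as the "add"/"show" commands, are hypothetical.

```scala
// Minimal MVC sketch. All names are hypothetical and not part of STCSB.

// Model: owns the data and the rules for changing it.
final class AtomCountModel {
  private var counts: Map[String, Int] = Map.empty
  def add(element: String): Unit =
    counts = counts.updated(element, counts.getOrElse(element, 0) + 1)
  def snapshot: Map[String, Int] = counts
}

// View: turns model data into something a user can read (here, a text table).
object TextView {
  def render(data: Map[String, Int]): String =
    data.toSeq.sortBy(_._1).map { case (el, n) => s"$el\t$n" }.mkString("\n")
}

// Controller: converts user input into commands for the Model and the View.
final class Controller(model: AtomCountModel) {
  def handle(input: String): String = input.trim.split("\\s+").toList match {
    case "add" :: element :: Nil => model.add(element); TextView.render(model.snapshot)
    case "show" :: Nil           => TextView.render(model.snapshot)
    case _                       => s"unknown command: ${input.trim}"
  }
}

object MvcDemo {
  def main(args: Array[String]): Unit = {
    val controller = new Controller(new AtomCountModel)
    controller.handle("add C")
    controller.handle("add O")
    println(controller.handle("add C")) // prints the table of element counts
  }
}
```

Because the Controller only sees the public methods of the Model and the View, either of them can be replaced (for example, by a graphical View) without touching the input handling.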

Facade pattern

The facade pattern[10] is one of the software design patterns by the Gang of Four (GoF). This pattern uses a wrapper class through which the client component, the UI, calls a sub-system. The wrapper class is delegated the task of driving complex sub-systems in order to implement complex logic. In other words, wrapper components mask the complex implementation of a sub-system and provide a simple interface for client components, as in Figure S2. The pattern divides a software application into three layers: a UI layer; a wrapper layer, which implements complex logic; and a sub-system layer, which is called by the wrapper layer. An expert user, that is, one experienced enough in software development to build custom-made software, can enhance logic by modifying the wrapper layer and speed up an application by editing the sub-system layer, whereas non-expert users can easily call new logic by editing the UI layer.
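The three layers can be sketched in Scala as follows. This is a simplified illustration under stated assumptions, not the STCSB implementation: RecordFilter, ResidueCounter, StructureFacade, and FacadeDemo are hypothetical names, and the demo records are shortened to the first 20 columns of a PDB ATOM record.

```scala
import java.util.Locale

// Sub-system layer: small, focused components (all names are hypothetical).
object RecordFilter {
  // Keeps only the ATOM/HETATM records of a PDB-like text.
  def atomRecords(text: String): Seq[String] =
    text.split("\n").toSeq.filter(l => l.startsWith("ATOM") || l.startsWith("HETATM"))
}

object ResidueCounter {
  // Counts records per residue name, read from columns 18-20 of each record.
  def count(records: Seq[String]): Map[String, Int] =
    records.filter(_.length >= 20)
      .groupBy(_.substring(17, 20).trim)
      .map { case (residue, rs) => residue -> rs.size }
}

// Wrapper (facade) layer: combines the sub-systems behind one simple method.
object StructureFacade {
  def residueSummary(pdbText: String): String =
    ResidueCounter.count(RecordFilter.atomRecords(pdbText))
      .toSeq.sortBy(_._1)
      .map { case (residue, n) => s"$residue: $n" }
      .mkString(", ")
}

// UI layer: talks only to the facade.
object FacadeDemo {
  // Demo helper that builds a shortened, column-aligned record.
  def record(serial: Int, atom: String, residue: String): String =
    "ATOM  %5d %-4s %3s".formatLocal(Locale.ROOT, serial, atom, residue)

  def main(args: Array[String]): Unit = {
    val text = Seq(record(1, "N", "ALA"), record(2, "CA", "ALA"),
                   "REMARK not an atom", record(3, "N", "GLY")).mkString("\n")
    println(StructureFacade.residueSummary(text))
  }
}
```

In this arrangement, new logic is added in StructureFacade, performance work is done in the sub-system objects, and the UI layer keeps calling a single method.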

2 Onishi et al.: An tool for computational science on biomolecules 3

Programming language and libraries

Scala

Scala[11], a scalable language, is a highly scalable, object-oriented programming language with some features of a functional programming language. It was developed by Martin Odersky, a professor at École Polytechnique Fédérale de Lausanne in Lausanne, Switzerland. Scala is adopted in Twitter, LinkedIn, and GitBucket. Scala has the following features: it is a multi-paradigm language combining object-oriented and functional programming; it runs on a Java Virtual Machine, which means that various Java class libraries are available and that it is platform-independent; powerful regular-expression functions; type inference; availability for interpreter usage as in Python; the Option type (Some/None), which eliminates NullPointerException hell; and XML-syntax support. The Scala compiler automatically translates procedural programming expressions in the source code into functional programming expressions. For example, the for-loop, which many developers are familiar with, is internally converted to a call to the foreach function. This feature allows non-expert users to use powerful features of the functional programming language without knowledge of category-theory concepts such as monads. Scala supports XML syntax (Figure S3). Scala accepts XML in code and has an XML class, which corresponds to the XML parser in Java, for XML file I/O. Since PDBML files are generated automatically by computers, they should be valid XML as long as they have not been modified by hand. In addition, either the Author or the Label attributes can be empty in some cases. We chose the XML parser in Scala as the PDBML handler, despite the existence of XML parsers in Fortran, Java, etc., because of Scala’s XML-syntax support and its XML file I/O functions. Scala gives easy access to the hierarchical structure of XML tags, which allows users to easily edit XML values and attributes, and even the tag hierarchy itself. Scala can call shell commands almost as is (Figure S4). Pipe handling is also available in Scala.
This ability helps users to build Model components that use shell commands. Scala differs from Java in the following points: (1) powerful regular-expression functions; (2) inner classes and closures may update mutable variables of the enclosing scope, which Java restricts. Unlike in Java and other languages, regular expressions in Scala are free from “escape hell”, where many programmers struggle to escape various special characters, e.g. the backslash \, depending on the environment, including the OS and the programming language they use (Figure S5). Scala has a triple-quoted String syntax, which allows programmers to write regular expressions as they are (Figure S6). Furthermore, Scala can easily extract data matching a pattern (Figure S7). Regarding point (2), Scala allows inner classes and closures to update mutable variables (var) of the enclosing scope, whereas Java requires local variables captured by inner classes and lambdas to be effectively final. This feature leads to easy implementation of new functions in less code, which speeds up GUI implementation. Scala is statically and strongly typed (the compiler validates types while compiling), unlike scripting languages such as Python. A type-safe programming language is highly maintainable, especially in the case of server-side applications. Unlike in Java, the parameters of a Scala method are immutable, which helps parallelization: immutable data can never be modified by mistake.
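Several of the language features described in this section can be seen together in a short sketch that uses only the Scala standard library. The object name ScalaFeatureDemo and the simplified, whitespace-based ATOM pattern are assumptions made for illustration; real PDB records are column-based.

```scala
object ScalaFeatureDemo {
  // Triple-quoted string: the regular expression is written without doubling
  // backslashes. Groups: serial number, atom name, residue name (simplified).
  private val AtomRecord = """ATOM\s+(\d+)\s+(\S+)\s+(\S+).*""".r

  // Pattern matching extracts the captured groups directly,
  // and Option (Some/None) takes the place of a null return value.
  def atomName(line: String): Option[String] = line match {
    case AtomRecord(_, name, _) => Some(name)
    case _                      => None
  }

  // A for-loop over a collection is rewritten by the compiler into a foreach
  // call, so the two methods below are equivalent. Both closures also update
  // the local variable `total` of the enclosing method.
  def sumWithFor(xs: Seq[Int]): Int = {
    var total = 0
    for (x <- xs) total += x
    total
  }

  def sumWithForeach(xs: Seq[Int]): Int = {
    var total = 0
    xs.foreach(x => total += x)
    total
  }

  def main(args: Array[String]): Unit = {
    println(atomName("ATOM      1  N   ALA A   1"))
    println(atomName("REMARK no atom here"))
    println(sumWithFor(Seq(1, 2, 3)) == sumWithForeach(Seq(1, 2, 3)))
  }
}
```

Callers of atomName have to handle the None case explicitly, for example with getOrElse or a match expression, which is what keeps a missing value from surfacing later as a NullPointerException.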

Standard Widget Toolkit

Standard Widget Toolkit (SWT)[12] is the Java GUI toolkit provided by the Eclipse Foundation and used in Eclipse, the foundation's integrated development environment (IDE). Unlike Swing in Java, SWT uses the native window widgets of each OS, for example to open directories and files, resulting in a fast response. SWT itself provides only simple functions; developers who need more complex implementations can use JFace[13], an extension of SWT.

JLine

JLine[14] is a Java library for handling console input. JLine provides interfaces like those of Linux shells such as bash and tcsh. We used JLine for Scala[15], a fork of the original JLine, to build a character user interface (CUI).

Lightweight Java Game Library

The Lightweight Java Game Library (LWJGL) [16] is a Java-OpenGL wrapper which supports stereoscopic 3D graphics by Quad-buffered stereo, in Java/Scala. Minecraft, a famous video game, uses this library. By using LWJGL, stereoscopic 3D graphics are available with a 3D display and video card supporting Quad-buffered stereo. LWJGL enables researchers to develop molecular visualization systems capable of providing stereoscopic 3D graphics, which are necessary to understand the structure of biomolecules.


User’s guide

Setup

Java

STCSB needs Java 8 or higher. Please install Java before using STCSB.

Check Java version

If you have already installed Java, please check the Java version with the following command.

java -version

If you get any of the following on standard output, you can use STCSB. XXX is the build number.

# java8
java version "1.8.0_XXX"
# java10
java version "1.10.0_XXX"
# java12
java version "12.0.1_XXX"

Setting Environment variables

Set Environment variable, ST_HOME

Linux/Unix
# Assuming /home/user/STCSB is a root directory
# bash
export ST_HOME=/home/user/STCSB
# csh, tcsh
setenv ST_HOME /home/user/STCSB

macOS
# Assuming /home/user/STCSB is a root directory
export ST_HOME=/home/user/STCSB

Windows
rem Assuming c:\home\user\STCSB is a root directory
set ST_HOME=c:\home\user\STCSB
rem ------
rem If the path to a root directory contains white-spaces,
rem enclose the path in double-quotes
set ST_HOME="C:\Program Files\STCSB"

Setting Environment variable, PATH

Linux/Unix
# bash
export PATH=$PATH:$ST_HOME/bin
# csh, tcsh
setenv PATH ${PATH}:${ST_HOME}/bin

macOS
export PATH=$PATH:$ST_HOME/bin

Windows
set PATH=%PATH%;%ST_HOME%\bin

How to use modules we have already prepared

CUI

startup CUI

# Linux/Unix/macOS
$ exeCUI
# Windows
$ exeCUI.bat

use sub-commands (platform-independent)

# ls
user@domain dir> ls
directory1 files
# cd
user@domain dir> cd directory1
#
user@domain directory1> ls
pdb1rvb.pdb
# validate pdb file
user@domain directory1> validatePDB pdb1rvb.pdb

GUI

startup GUI

# Linux/Unix/macOS
$ exeGUI
# Windows
$ exeGUI.bat

Then, the GUI window appears.

PDBMLViewer

# Linux/Unix/macOS
$ exeBioPolymerViewer sample/1rvb/1rvb.xml
# Windows
$ exeBioPolymerViewer.bat sample\1rvb\1rvb.xml

# PDB files are also supported
# Linux/Unix/macOS
$ exeBioPolymerViewer sample/1rvb/pdb1rvb.pdb
# Windows
$ exeBioPolymerViewer.bat sample\1rvb\pdb1rvb.pdb


For troubleshooting

If you have any problems when executing the modules above, please check the system information they print, as follows.

[System info]
1. OS
   Name: Windows 10
   Architecture: amd64
   Version: 10.0
   Data model: 64
2. Java
   Version: 1.8.0_191
   Vendor: Oracle Corporation
   JVM Type: Java HotSpot(TM) 64-Bit Server VM

Use-case: generating volume file (.plt) from 3D-RISM output file

Go to sample/3drism directory

Please go to the 3drism directory in the sample directory.

$ cd sample/3drism

Then, you can see the guv file, which holds the solute–solvent distribution function calculated by 3D-RISM.

$ ls
alanineDipeptide_T.pdb alanineDipeptide_water.guv

Startup CUI

# Linux/Unix/macOS
$ exeCUI
# Windows
$ exeCUI.bat

Execute guvplt sub-command

Please execute the guvplt sub-command to get the plt file. You can use command/argument completion.

user@domain 3drism> guvplt alanineDipeptide_water.guv


Then, you will get .plt files for the distribution functions of the oxygen and hydrogen of water.

user@domain 3drism> ls
alanineDipeptide_T.pdb alanineDipeptide_water.guv
alanineDipeptide_water-0.plt alanineDipeptide_water-0.txt
alanineDipeptide_water-1.plt alanineDipeptide_water-1.txt

You can remove the transient files alanineDipeptide_water-*.txt. In the case of this sample input file,

• alanineDipeptide_water-0.plt: distribution function of oxygen of water.
• alanineDipeptide_water-1.plt: distribution function of hydrogen of water.

You can open these plt files with molecular graphics systems such as VMD, UCSF Chimera, and PyMOL.

Analysis of trajectory calculated by sander/pmemd in Amber

$ cd sample/amber
$ ls
5residues.mdcrd.zip 5residues.prmtop 5residues.rst trajectoryAnalysis.input

trajectoryAnalysis.input contents:

prmtop 5residues.prmtop # file name for parameter and topology
mdcrd 5residues.mdcrd # file name for trajectory
prefix traj-analysis-test # file prefix for output files
distance 1@C-1@CB # calculate distance between C of residue 1 and CB of residue 1
distance 1@C-14@O
angle 1@N-1@CA-1@C # calculate angle among N, CA, and C of residue 1
angle 15@O-19@O-34@O
torsion 1@O-1@C-2@N-2@H # calculate torsion angle among O, C of residue 1 and N, H of residue 2
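The three quantities requested in this input file are standard geometric measures: the distance is the norm of the difference of two position vectors, the angle is the arccosine of the normalized dot product of the two bond vectors at the central atom, and the torsion is the signed angle between the two planes spanned by consecutive bond vectors. The following Scala sketch computes them from Cartesian coordinates. It is a generic illustration, not the code used by exeTrajectoryAnalyzer; the object name Geometry and its method names are hypothetical.

```scala
object Geometry {
  type Vec = (Double, Double, Double)

  private def sub(a: Vec, b: Vec): Vec = (a._1 - b._1, a._2 - b._2, a._3 - b._3)
  private def dot(a: Vec, b: Vec): Double = a._1 * b._1 + a._2 * b._2 + a._3 * b._3
  private def cross(a: Vec, b: Vec): Vec =
    (a._2 * b._3 - a._3 * b._2, a._3 * b._1 - a._1 * b._3, a._1 * b._2 - a._2 * b._1)
  private def norm(a: Vec): Double = math.sqrt(dot(a, a))

  // Distance between atoms a and b.
  def distance(a: Vec, b: Vec): Double = norm(sub(a, b))

  // Angle a-b-c in degrees, with b at the vertex.
  def angle(a: Vec, b: Vec, c: Vec): Double = {
    val u = sub(a, b)
    val v = sub(c, b)
    math.toDegrees(math.acos(dot(u, v) / (norm(u) * norm(v))))
  }

  // Torsion a-b-c-d in degrees, computed with atan2 so that the sign is kept.
  def torsion(a: Vec, b: Vec, c: Vec, d: Vec): Double = {
    val b1 = sub(b, a)
    val b2 = sub(c, b)
    val b3 = sub(d, c)
    val n1 = cross(b1, b2) // normal of the plane a-b-c
    val n2 = cross(b2, b3) // normal of the plane b-c-d
    val y = dot(cross(n1, n2), b2) / norm(b2)
    val x = dot(n1, n2)
    math.toDegrees(math.atan2(y, x))
  }

  def main(args: Array[String]): Unit = {
    println(distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    println(angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
    println(torsion((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)))
  }
}
```

Using atan2 rather than acos for the torsion keeps values on both sides of ±180 degrees distinguishable, which matches the sign change visible between steps 999 and 1000 of the sample torsion output.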


unzip archive to extract trajectory

$ unzip 5residues.mdcrd.zip
$ ls
5residues.mdcrd 5residues.mdcrd.zip 5residues.prmtop 5residues.rst trajectoryAnalysis.input

execute sample

$ exeTrajectoryAnalyzer trajectoryAnalysis.input
$ ls
5residues.mdcrd 5residues.mdcrd.zip 5residues.prmtop 5residues.rst
traj-analysis-test.angle traj-analysis-test.distance traj-analysis-test.torsion
trajectoryAnalysis.input

The contents of the output files are as follows:

traj-analysis-test.angle

#step  #1@N-1@CA-1@C  #15@O-19@O-34@O
1      108.56223      51.17098
2      108.53746      51.16125
3      108.49004      51.16756
......
998    108.42865      61.51006
999    108.92988      61.42949
1000   109.42517      61.37292

traj-analysis-test.distance

#step  #1@C-1@CB  #1@C-14@O
1      2.48884    14.84318
2      2.48998    14.84256
3      2.49153    14.84370
......
998    2.51095    15.81327
999    2.51268    15.81395
1000   2.51431    15.81391

traj-analysis-test.torsion

#step  #1@O-1@C-2@N-2@H
1      178.19011
2      178.21407
3      178.22132
......
998    178.96984
999    179.70296
1000   -179.46793

You can plot the values by using gnuplot, Excel, etc.
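Besides plotting, output files in this column layout can be post-processed with a few lines of Scala. The sketch below parses whitespace-separated numeric rows and computes the mean of one column. It assumes the layout shown above (a header line starting with "#" followed by numeric rows); the object name ColumnStats is hypothetical, and the elision marker "......" is skipped only so that the excerpt printed in this text can be fed in directly.

```scala
import scala.io.Source

object ColumnStats {
  // Parses whitespace-separated numeric rows, skipping blank lines,
  // '#' header lines, and "......" elision markers.
  def parse(lines: Seq[String]): Seq[Array[Double]] =
    lines.map(_.trim)
      .filter(l => l.nonEmpty && !l.startsWith("#") && !l.startsWith("."))
      .map(_.split("\\s+").map(_.toDouble))

  // Mean of one column (0 = step, 1 = first measured quantity, ...).
  def mean(rows: Seq[Array[Double]], column: Int): Double =
    rows.map(_(column)).sum / rows.size

  def main(args: Array[String]): Unit = {
    // With a file name as argument, read that file; otherwise use an excerpt.
    val lines: Seq[String] =
      if (args.nonEmpty) {
        val src = Source.fromFile(args(0))
        try src.getLines().toList finally src.close()
      } else Seq(
        "#step #1@C-1@CB #1@C-14@O",
        "1 2.48884 14.84318",
        "2 2.48998 14.84256",
        "3 2.49153 14.84370")
    val rows = parse(lines)
    println(s"rows: ${rows.size}, mean of column 1: ${mean(rows, 1)}")
  }
}
```

Note that a plain mean is not meaningful for the torsion file when values wrap around ±180 degrees, as they do at the end of the sample output; such data would need circular statistics instead.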


Developer’s guide

Overview

Like libgdx (a commonly used framework for Android apps), STCSB (a framework for standalone applications for computational science on biomolecules on Linux/MS-Windows/macOS) provides several interfaces. STCSB consists of the following:

• Standard Widget Toolkit (SWT): widget toolkit which can use native widgets on each platform

• Lightweight Java Game Library (LWJGL): modern/old OpenGL functionality including Quad-buffered stereo

• scala-xml: powerful XML I/O library in Scala

API

Please refer to https://irisa-lab.github.io/STCSB/ for the API.

Setup development environment

Build tool

We recommend the Simple Build Tool (SBT), an open-source build tool for Java/Scala similar to Java’s Maven and Ant. We especially recommend version 1.2.3, since we developed STCSB using SBT 1.2.3. If you use SBT, you do not have to install Scala yourself; SBT will handle it. Please refer to the official SBT documentation for how to install SBT.

Setting up a project

We will explain how to set up a project for SBT using STCSB.

Clone the repository

First, please clone the repository:

git clone https://github.com/irisa-lab/STCSB.git

You can see that the directory structure of STCSB is as follows:

STCSB
├── bin
├── conf
├── jar
├── README.md
└── sample

Add files

Second, please add files for the SBT project. Please add:

• directories src/main/scala and project

• Scala source code in src/main/scala

• plugin setting file for SBT as project/plugins.sbt

// https://github.com/sbt/sbt-assembly/tree/v0.14.7
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7")

• project setting file for SBT as build.sbt

import sbtassembly.AssemblyPlugin.autoImport._

// Project name, kept in a plain value so that it can be reused in the jar file names below
val appName = "stcsb-test"

name := appName
organization := "test"
// Specify which version of Scala is used in the project. We recommend 2.12.6
scalaVersion := "2.12.6"

// Root directory
lazy val rootDir = baseDirectory in ThisProject

// Target OS and architecture, taken from the environment (default: 64-bit Linux)
val os = sys.env.getOrElse("ST_TARGET", "Linux")
val arch = sys.env.getOrElse("ST_ARCH", "64")

// STCSB jar for the target platform
val stcsbJar: String = os match {
  case "Linux"   => s"stcsb-assembly-linux${arch}.jar"
  case "Windows" => s"stcsb-assembly-win${arch}.jar"
  case "MacOSX"  => "stcsb-assembly-macosx--64.jar"
  case _         => s"stcsb-assembly-linux${arch}.jar"
}

// Assumes that the STCSB jars are located in the jar directory under the root directory
unmanagedJars in Compile += Attributed.blank(rootDir.value / "jar" / stcsbJar)

//------
// assembly
//------
// Configure the main class in the packaging
mainClass in assembly := Some("test.MyApp")

// Name of the assembled jar for the target platform
val assemblyJar: String = os match {
  case "Linux"   => s"$appName-assembly-linux${arch}.jar"
  case "Windows" => s"$appName-assembly-win${arch}.jar"
  case "MacOSX"  => s"$appName-assembly-macosx.jar"
  case _         => s"$appName-assembly-linux${arch}.jar"
}

// Write the assembled jar to the jar directory under the root directory
assemblyOutputPath in assembly := rootDir.value / "jar" / assemblyJar

So, finally, the directory structure is as follows:

STCSB
├── bin
├── build.sbt
├── conf
├── jar
├── project
│   └── plugins.sbt
├── README.md
├── sample
└── src
    └── main
        └── scala


Examples

Read PDB file

Project settings

We assume you have followed the guide in "Setup development environment" for the SBT settings.

Add Scala source code

You can get atom coordinates from a PDB file with simple code, via classes we have already prepared. Please copy the following contents to src/main/scala/test/MyApp.scala

package test

import irisalab.tinker.visual.storage.AtomInfo
import irisalab.tinker.visual.reader.PDBReader

object MyApp {
  def main(args: Array[String]): Unit = {
    if (args.length == 0) {
      println("No pdb file specified!! Exit!!")
      System.exit(1)
    }

    // Pass pdb file path by first command line argument
    val pdbFilePath: String = args(0)

    // Read pdb file by PDBReader we have already prepared
    val pr: PDBReader = new PDBReader()
    val atoms: Array[AtomInfo] = pr.startReading(pdbFilePath)

    // Print atoms to standard output
    for (atom <- atoms) {
      println(atom)
    }
  }
}
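For readers who want to see what reading a PDB file involves at the lowest level, the following sketch parses ATOM/HETATM records by their fixed columns (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, and x, y, z in 31-38, 39-46, and 47-54). It is independent of STCSB and does not show how PDBReader is implemented; SimpleAtom, SimplePdbParser, and formatAtom are hypothetical names introduced only for this illustration.

```scala
import java.util.Locale
import scala.util.Try

// Hypothetical record type for this sketch (not STCSB's AtomInfo).
final case class SimpleAtom(serial: Int, name: String, residue: String,
                            x: Double, y: Double, z: Double)

object SimplePdbParser {
  // Parses one ATOM/HETATM record by fixed columns; returns None for other
  // records, for lines that are too short, and for unparsable numbers.
  def parseLine(line: String): Option[SimpleAtom] =
    if ((line.startsWith("ATOM") || line.startsWith("HETATM")) && line.length >= 54)
      Try(SimpleAtom(
        line.substring(6, 11).trim.toInt,
        line.substring(12, 16).trim,
        line.substring(17, 20).trim,
        line.substring(30, 38).trim.toDouble,
        line.substring(38, 46).trim.toDouble,
        line.substring(46, 54).trim.toDouble)).toOption
    else None

  def parse(text: String): Seq[SimpleAtom] =
    text.split("\n").toSeq.flatMap(line => parseLine(line).toList)

  // Demo helper: builds a column-aligned ATOM record (chain A) up to column 54.
  def formatAtom(serial: Int, name: String, residue: String, resSeq: Int,
                 x: Double, y: Double, z: Double): String =
    "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f"
      .formatLocal(Locale.ROOT, serial, name, residue, resSeq, x, y, z)

  def main(args: Array[String]): Unit = {
    val text = "REMARK demo\n" + formatAtom(1, "N", "ALA", 1, 11.104, 6.134, -6.504)
    parse(text).foreach(println)
  }
}
```

Returning Option from parseLine lets the caller drop non-atom records with flatMap instead of checking for null, in line with the Some/None usage described in the Scala section.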


Compile

1. Move to the root directory of STCSB

2. Set the environment variables ST_TARGET and ST_ARCH

# ST_TARGET = Windows/Linux/MacOSX
# ST_ARCH = 32/64

# bash
export ST_TARGET=Windows
export ST_ARCH=64
# tcsh
setenv ST_TARGET Windows
setenv ST_ARCH 64

# Windows
set ST_TARGET=Windows
set ST_ARCH=64

3. Execute the following command

sbt assembly

Finally, you will get stcsb-test-assembly-win64.jar in STCSB/jar if you specified ST_TARGET=Windows and ST_ARCH=64.

Execute compiled program

# 1. Use java command
# Windows 64bit
java -cp jar\stcsb-test-assembly-win64.jar test.MyApp sample\1rvb\pdb1rvb.pdb
# Linux 64bit
java -cp jar/stcsb-test-assembly-linux64.jar test.MyApp sample/1rvb/pdb1rvb.pdb

Then, you will get the atom coordinate information on standard output.

Acknowledgment

This work was supported by Grants-in-Aid (16041234) from MEXT, Japan.


Supplementary References

[1] Westbrook, J., Ito, N., Nakamura, H., Henrick, K., & Berman, H. M. PDBML: the representation of archival macromolecular structure data in XML. Bioinformatics 21, 988–992 (2005).
[2] Case, D. A., Betz, R., Cerutti, D., Cheatham, T., III, Darden, T., et al. AMBER 16 (University of California, San Francisco, 2016).
[3] Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., et al. Scalable molecular dynamics with NAMD. Journal of Computational Chemistry 26, 1781–1802 (2005).
[4] Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., & Karplus, M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry 4, 187–217 (1983).
[5] Humphrey, W., Dalke, A., & Schulten, K. VMD: Visual Molecular Dynamics. Journal of Molecular Graphics 14, 33–38 (1996).
[6] Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., et al. UCSF Chimera: A visualization system for exploratory research and analysis. Journal of Computational Chemistry 25, 1605–1612 (2004).
[7] Kinoshita, K., & Nakamura, H. eF-site and PDBjViewer: database and viewer for protein functional sites. Bioinformatics 20, 1329–1330 (2004).
[8] Bekker, G.-J., Nakamura, H., & Kinjo, A. Molmil: A molecular viewer for the PDB and beyond. Journal of Cheminformatics 8, (2016).
[9] Wikipedia contributors. Model–view–controller. Wikipedia, The Free Encyclopedia [Online; accessed 22-January-2015]. 2015. https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller.
[10] Wikipedia contributors. Facade pattern. Wikipedia, The Free Encyclopedia [Online; accessed 17-June-2019]. 2019. https://en.wikipedia.org/w/index.php?title=Facade_pattern&oldid=893946674.
[11] Odersky, M., Altherr, P., Cremet, V., Dragos, I., Dubochet, G., Emir, B., et al. An Overview of the Scala Programming Language tech. rep. (2004).
[12] Eclipse Foundation, SWT Official HP https://www.eclipse.org/swt/.
[13] Eclipse Foundation, JFace (Eclipse Wiki) https://wiki.eclipse.org/JFace.
[14] jline. jline2. GitHub [Online; accessed 17-June-2019]. 2019. https://github.com/jline/jline2.
[15] scala-jline. scala-jline. GitHub [Online; accessed 17-June-2019]. 2019. https://github.com/scala/scala-jline.
[16] LWJGL community, LWJGL Official web page http://www.lwjgl.org/.


Supplementary Figure S1. MVC model in the software design pattern by the Gang of Four (GoF).

Supplementary Figure S2. Facade pattern in the software design pattern by the Gang of Four (GoF).

Supplementary Figure S3. Example of XML support in Scala, as syntax.

Supplementary Figure S4. Example of a shell command calling in Scala.


Supplementary Figure S5. Example of a regular expression in Java.

Supplementary Figure S6. Example of a regular expression in Scala.

Supplementary Figure S7. Example of data extraction using a regular expression in Scala.
