Open Babel Documentation Release 2.3.1
Geoffrey R Hutchison, Chris Morley, Craig James, Chris Swain, Hans De Winter, Tim Vandermeersch, Noel M O’Boyle (Ed.)
December 05, 2011

Contents

1 Introduction
  1.1 Goals of the Open Babel project
  1.2 Frequently Asked Questions
  1.3 Thanks
2 Install Open Babel
  2.1 Install a binary package
  2.2 Compiling Open Babel
3 obabel and babel - Convert, Filter and Manipulate Chemical Data
  3.1 Synopsis
  3.2 Options
  3.3 Examples
  3.4 Differences between babel and obabel
  3.5 Format Options
  3.6 Append property values to the title
  3.7 Filtering molecules from a multimolecule file
  3.8 Substructure and similarity searching
  3.9 Sorting molecules
  3.10 Remove duplicate molecules
  3.11 Aliases for chemical groups
4 The Open Babel GUI
  4.1 Basic operation
  4.2 Options
  4.3 Multiple input files
  4.4 Wildcards in filenames
  4.5 Local input
  4.6 Output file
  4.7 Graphical display
  4.8 Using a restricted set of formats
  4.9 Other features
  4.10 Example files
5 Molecular fingerprints and similarity searching
  5.1 Fingerprint format
  5.2 Spectrophores™
6 obabel vs Chemistry Toolkit Rosetta
  6.1 Heavy atom counts from an SD file
  6.2 Convert a SMILES string to canonical SMILES
  6.3 Report how many SD file records are within a certain molecular weight range
  6.4 Convert SMILES file to SD file
  6.5 Report the similarity between two structures
  6.6 Find the 10 nearest neighbors in a data set
  6.7 Depict a compound as an image
  6.8 Highlight a substructure in the depiction
  6.9 Align the depiction using a fixed substructure
  6.10 Perform a substructure search on an SDF file and report the number of false positives
  6.11 Calculate TPSA
  6.12 Working with SD tag data
  6.13 Unattempted tasks
7 Write software using the Open Babel library
  7.1 The Open Babel API
  7.2 C++
  7.3 Python
  7.4 Java
  7.5 Perl
  7.6 CSharp and OBDotNet
  7.7 Ruby
8 Cheminformatics 101
  8.1 Cheminformatics Basics
  8.2 Representing Molecules
  8.3 Substructure Searching with Indexes
  8.4 Molecular Similarity
  8.5 Chemical Registration Systems
9 Radicals and SMILES extensions
  9.1 The need for radicals and implicit hydrogen to coexist
  9.2 How Open Babel does it
  9.3 In radicals either the hydrogen or the spin multiplicity can be implicit
  9.4 SMILES extensions for radicals
10 Contributing to Open Babel
  10.1 Overview
  10.2 Developing Open Babel
  10.3 Documentation
  10.4 Testing the Code
  10.5 Software Archaeology
11 Adding plugins
  11.1 How to add a new file format
  11.2 Adding new operations and options
12 Supported File Formats and Options
  12.1 Common cheminformatics formats
  12.2 Utility formats
  12.3 Other cheminformatics formats
  12.4 Computational chemistry formats
  12.5 Crystallography formats
  12.6 Reaction formats
  12.7 Image formats
  12.8 2D drawing formats
  12.9 3D viewer formats
  12.10 Kinetics and Thermodynamics formats
  12.11 Molecular dynamics and docking formats
  12.12 Volume data formats
  12.13 Miscellaneous formats
  12.14 Biological data formats
  12.15 Obscure formats
Bibliography

The latest version of this documentation is available in several formats from http://openbabel.org/docs/dev/.

Chapter 1 Introduction

Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.

1.1 Goals of the Open Babel project

Open Babel is a project to facilitate the interconversion of chemical data from one format to another, including file formats of various types. This is important for the following reasons:

• Multiple programs are often required in realistic workflows. These may include databases, modeling or computational programs, visualization programs, etc.
• Many programs have individual data formats, and/or support only a small subset of other file types.
• Chemical representations often vary considerably:
  – Some programs are 2D. Some are 3D. Some use fractional k-space coordinates.
  – Some programs use bonds and atoms of discrete types. Others use only atoms and electrons.
  – Some programs use symmetric representations. Others do not.
  – Some programs specify all atoms. Others use “residues” or omit hydrogen atoms.
• Individual implementations of even standardized file formats are often buggy or incomplete, or do not fully match the published standards.

As a free and open source project, Open Babel improves by helping others. It gains from its users, contributors, developers, related projects, and the general chemical community. We must continually strive to support these constituencies. We gratefully accept contributions in many forms, from bug reports, complaints, and critiques, which help us improve what we do poorly, to feature suggestions, code contributions, and other efforts, which direct our future development.

• For end users, we seek to provide a range of utilities, from simple (or complex) file interconversion to indexing, databasing, and transforming chemical and molecular data.
• For developers, we seek to provide an easy-to-use free and open source chemical library. This assists a variety of chemical software, from molecular viewers, visualization tools, and editors to databases, property prediction tools, and in-house development.

To this end, we hope that our tools reflect several key points:

• Open Babel should read and understand as much chemical information, in as many files, as possible. This means that we should always strive to support as many concepts as possible in a given file format, and that support for additional file formats benefits the community as a whole.
• Every release should be “as good as we can make it”.
• Improving our code and our community to bring in additional contributions in many forms helps developers and end users alike. Making development easy for new contributors will result in better tools for users as well.

1.2