Chemical Structure Computer Modelling

Journal of Chemical Technology and Metallurgy, 55, 4, 2020, 714-718

Radoslava Topalska, Fatima Sapundzhi

South-West University “Neofit Rilski”, 66 Ivan Michailov str., 2700, Blagoevgrad, Bulgaria
E-mail: [email protected]

Received 11 January 2019
Accepted 30 July 2019

ABSTRACT

The root-mean-square deviation of atomic positions (RMSD) is one of the most commonly used measures in bioinformatics. It gives the average distance between the atoms of superimposed proteins. The present study describes a program that calculates RMSD between two structures. The software detects the surfaces of two molecular structures, a convex and a concave one, and the area of their interaction by calculating RMSD between them. The program uses fragments of files in Protein Data Bank (PDB) format. A Python implementation of the RMSD computation based on the Kabsch algorithm is proposed.

Keywords: computer modelling, RMSD, Python, PDB, ligand-receptor interactions, bioinformatics.

INTRODUCTION

Protein structure prediction is generally assessed by comparing the predicted structure with the experimentally determined one, obtained by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. The degree of similarity is often expressed as a root-mean-square deviation (RMSD), which represents the distance between the corresponding atoms in each molecule. It is useful as a measure of the accuracy of a model when a crystal structure of the protein is available for comparison. RMSD calculations can also be applied to non-protein molecules such as small organic molecules [1].

The Kabsch algorithm is a popular method for calculating the optimal rotation matrix that minimizes RMSD between two paired sets of points. It is widely used in bioinformatics for comparing protein structures, in cheminformatics for comparing molecular structures, etc. [2, 3]. However, as the size of the protein increases, the RMSD value that still counts as a good fit also increases. Whereas an RMSD of 10 Å would be considered a poor fit for a small protein, it might be considered excellent for a longer protein with several hundred amino acids.

Most of the imaging work in bioinformatics involves data from the Protein Data Bank (PDB) or the Molecular Modeling Database (MMDB) [4]. The search for a structure is typically based on the protein name or ID [5].

The objective of this research is to present a program that (i) detects two surfaces, a convex and a concave one, of two structures and the area of their interaction, and (ii) calculates RMSD.

EXPERIMENTAL

Python

Python is an object-oriented, open-source programming language. It is commonly used for both standalone programs and scripting applications in a wide variety of domains. Python is designed to optimize developer productivity, software quality and program portability. Python programs run on most commonly used platforms, including Windows, Linux, Java and .NET [6].

RMSD

The root-mean-square deviation (RMSD) is a measure of the differences between the values predicted by a model and the values actually observed in the object being modeled or estimated (Eq. 1):

$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\delta_i^{2}} \qquad (1)$$

where $\delta_i$ is the distance between atom $i$ and either a reference structure or the mean position of the $n$ equivalent atoms.

Normally a rigid superposition which minimizes RMSD is performed, and this minimum is returned. Given two sets of $n$ points with coordinates $(x_{1,i}, y_{1,i}, z_{1,i})$ and $(x_{2,i}, y_{2,i}, z_{2,i})$, RMSD is defined in accordance with Eq. 2:

$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(x_{1,i}-x_{2,i})^{2}+(y_{1,i}-y_{2,i})^{2}+(z_{1,i}-z_{2,i})^{2}\right]} \qquad (2)$$

Protein atomic coordinates are generally expressed in Å (where 1 Å = $10^{-10}$ m = 0.1 nm). Since RMSD has units of length, RMSD values are also expressed in Å [7, 8].

UCSF Chimera

UCSF Chimera 1.12 software is used to generate high-quality images. It is a program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, etc. The program can be downloaded free of charge for academic, government, non-profit, and personal use [9].

RESULTS AND DISCUSSION

The developed program is based on the Kabsch algorithm and is implemented in the Python programming language. The tool superimposes two molecular structures in xyz format and calculates RMSD between them. RMSD is calculated between two sets of atomic coordinates: in this case, one for the crystallographic structure taken from the PDB file (Table 1) and another for the atomic coordinates of the ligand (Table 2).

Table 1. A fragment of the 4DKL.xyz file (<mol1.xyz>) data used.

C   -18.687   18.589   -1.665
C      9367     7640     8730
N   -19.331   17.879   -2.770
N      9001     7490     8520
C   -19.216   18.216   -4.051
C      8657     7044     8299
N   -18.473   19.257   -4.403
N      8544     6605     8184
N   -19.844   17.511   -4.981
N      8198     6785     7937
N   -12.650   17.723   -2.621
N     10424     9873    11513
C   -11.638   17.687   -3.667
C     10117     9785    11668
C   -11.784   16.370   -4.426
C      9158     9210    10910
O   -12.134   15.349   -3.833
O      8539     8832    10227
C   -10.221   17.815   -3.068
C     10776    10747    12732
O    -9.239   17.516   -4.067
O     11173    11464    13570
C   -10.051   16.866   -1.891
C     10474    10804    12436
N   -11.538   16.389   -5.747
N      9212     9274    11159
C   -11.721   15.184   -6.569
C      8973     9337    11061
C   -10.806   14.020   -6.179
C      8911     9825    11405
O   -11.131   12.867   -6.479
O      9022    10154    11539
C   -11.397   15.672   -7.988
C      8792     9003    10988
C   -10.598   16.916   -7.799
C      8685     8681    10987
C   -11.142   17.551   -6.560
C      9004     8731    10982
N    -9.688   14.312   -5.520
N      8882    10011    11688
…         …        …        …

The calculation of RMSD for two sets of xyz coordinates of n particles is given by Eq. 1 [1, 2]. Applied directly, this procedure does not take into account that the two molecules could be identical and merely displaced in space. Solving this problem requires placing the molecules at a common center and rotating one onto the other. The centroid of each molecule is found first, and both molecules are translated to the origin of the coordinate system. The Kabsch algorithm is then used to align the molecules by rotation. The procedure described is in fact a method for calculating the optimal rotation matrix that minimizes RMSD between two paired sets of points. The program computes the centroid of each coordinate matrix as an [x y z] vector and translates the two matrices so that their centroids coincide with the origin of the coordinate system.

Table 2. A fragment of the MET-enkephalin.xyz file (<mol2.xyz>). MET-enkephalin is a pentapeptide with morphine-like activity (PubChem CID: 443363).

N   -1.347    0.242   -1.290
C    0.058   -0.100   -0.952
C    0.541   -1.375   -1.688
C   -0.407   -2.515   -1.332
N   -0.004   -3.808   -1.498
C   -0.871   -4.926   -1.094
C   -1.964   -5.309   -2.111
N   -2.283   -4.329   -3.025
C   -3.293   -4.539   -4.043
C   -4.655   -4.025   -3.586
O   -5.148   -3.004   -4.056
O   -2.553   -6.388   -2.088
O   -1.586   -2.242   -1.105
C    2.011   -1.666   -1.347
C    3.029   -0.652   -1.834
C    4.240   -0.531   -1.134
C    5.234    0.348   -1.569
C    5.030    1.110   -2.710
O    6.024    1.956   -3.096
C    3.850    0.996   -3.433
C    2.860    0.105   -3.005
H   -1.623    1.085   -0.786
H   -1.967   -0.532   -1.051
H   -1.418    0.414   -2.293
H    0.154   -0.234    0.105

The Kabsch algorithm [2, 3] solves the constrained orthogonal Procrustes problem. The Procrustes problem concerns the comparison of two (or more) shapes, which in general are optimally superimposed by translating, rotating and scaling. In the constrained form used here, only rotations of the matrices are allowed.

Proteins that have been widely studied both theoretically and experimentally are used as a test set. The operation of the algorithm is illustrated on an excerpt of the PDB file of the human μ-opioid receptor (MOR). Fig. 1 shows a part of the structure of MOR [4].

The atomic coordinates are used to construct a reference matrix, which, together with another matrix of coordinates constructed in the same way, provides the input data for the algorithm (Fig. 2) [11].

The program is written in the Python programming language and uses two files of atomic coordinates, <mol1.xyz> and <mol2.xyz>.

The program uses a function that reads the atomic coordinates from the PDB file and returns the non-hydrogen coordinates as arrays [5 - 9]. The .pdb format contains a lot of information about the molecules investigated, such as the name of each amino acid, its coordinates, the hydrogen bonds linking the residues, etc.

The function loops through each line of the PDB file and appends the atomic coordinates to three lists named x, y and z. The x, y and z coordinates are separated by two blank characters and are stored as floating-point values (Table 1 and Table 2). An advantage of .pdb files is that their structure can be visualized again in Chimera.
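The procedure described above (reading non-hydrogen coordinates, moving both structures to a common centroid, rotating with the Kabsch algorithm and evaluating Eq. 2) can be illustrated with a minimal sketch. This is not the authors' program: it assumes NumPy is available, the function names are hypothetical, the rows are taken to be in "element x y z" form, and the second structure is simply a rotated and shifted copy of the first rows of Table 2, constructed for illustration only.

```python
import numpy as np


def read_heavy_atoms(lines):
    """Parse 'element x y z' rows; return non-hydrogen coordinates as an (n, 3) array."""
    coords = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or fields[0] == "H":
            continue  # skip malformed rows and hydrogen atoms
        coords.append([float(value) for value in fields[1:]])
    return np.array(coords)


def center(coords):
    """Translate coordinates so that their centroid sits at the origin."""
    return coords - coords.mean(axis=0)


def kabsch_rotation(p, q):
    """Return the rotation r minimizing the RMSD between p @ r and q (both centered)."""
    covariance = p.T @ q
    u, _, vt = np.linalg.svd(covariance)
    if np.linalg.det(u @ vt) < 0:
        u[:, -1] = -u[:, -1]  # flip one axis so the result is a rotation, not a reflection
    return u @ vt


def rmsd(p, q):
    """Root-mean-square deviation between two paired (n, 3) coordinate sets (Eq. 2)."""
    return float(np.sqrt(((p - q) ** 2).sum() / len(p)))


def aligned_rmsd(p, q):
    """Center both sets, rotate p onto q with Kabsch, and return the resulting RMSD."""
    p_centered, q_centered = center(p), center(q)
    rotation = kabsch_rotation(p_centered, q_centered)
    return rmsd(p_centered @ rotation, q_centered)


# First rows of Table 2, plus one hydrogen row that the reader should skip.
rows = [
    "N -1.347 0.242 -1.290",
    "C 0.058 -0.100 -0.952",
    "C 0.541 -1.375 -1.688",
    "C -0.407 -2.515 -1.332",
    "N -0.004 -3.808 -1.498",
    "H -1.623 1.085 -0.786",
]
mol = read_heavy_atoms(rows)

# Hypothetical second structure: the same atoms rotated 90 degrees about z and shifted.
quarter_turn = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = mol @ quarter_turn + np.array([5.0, -3.0, 2.0])

print("RMSD before alignment:", rmsd(mol, moved))
print("RMSD after alignment: ", aligned_rmsd(mol, moved))
```

Because the second structure here is an exact rigid-body copy of the first, the RMSD after alignment should fall to numerical zero, whereas the value computed without superposition reflects only the displacement. For two genuinely different molecules the atoms would also have to be paired in the same order before either function is called.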