Alignment and Integration of Spatial Transcriptomics Data

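Ron Zeira¹, Max Land¹, and Benjamin J. Raphael¹
¹Department of Computer Science, Princeton University, Princeton NJ 08544, USA

S1 Supplementary methods

S1.1 Proof of Theorem 1

Theorem 1. Let $\bar{X} = \sum_q \lambda_q X^{(q)} \Pi^{(q)T} \operatorname{diag}(\tfrac{1}{g})$ and $X = WH$. We have

$$S(W, H) = \sum_i g_i \, c(x_{\cdot i}, \bar{x}_{\cdot i}) + \tau$$

where $c(u, v) = \|u - v\|^2$ or $c(u, v) = KL(v \| u) = \sum_l \left( v_l \log \frac{v_l}{u_l} - v_l + u_l \right)$, and $\tau$ is a constant that does not depend on $W, H$.

Proof. We first prove the theorem for the Euclidean distance $c(u, v) = \|u - v\|^2$. We write the objective function explicitly and simplify it using $\sum_j \pi^{(q)}_{ij} = g_i$ and $\sum_q \lambda_q = 1$.

$$
\begin{aligned}
S(W, H) &= \sum_q \lambda_q \sum_i \sum_j \left\| x_{\cdot i} - x^{(q)}_{\cdot j} \right\|^2 \pi^{(q)}_{ij} \\
&= \sum_q \lambda_q \sum_i \sum_j \left( x_{\cdot i}^T x_{\cdot i} \, \pi^{(q)}_{ij} - 2 x_{\cdot i}^T x^{(q)}_{\cdot j} \, \pi^{(q)}_{ij} \right) + \beta \\
&= \sum_i x_{\cdot i}^T x_{\cdot i} \sum_q \lambda_q \sum_j \pi^{(q)}_{ij} - 2 \sum_i x_{\cdot i}^T \sum_q \lambda_q \sum_j x^{(q)}_{\cdot j} \, \pi^{(q)}_{ij} + \beta \\
&= \sum_i g_i \, x_{\cdot i}^T x_{\cdot i} - 2 \operatorname{Tr}\Big( X^T \sum_q \lambda_q X^{(q)} \Pi^{(q)T} \Big) + \beta \\
&= \operatorname{Tr}\big( X^T X \operatorname{diag}(g) \big) - 2 \operatorname{Tr}\Big( X^T \sum_q \lambda_q X^{(q)} \Pi^{(q)T} \operatorname{diag}(\tfrac{1}{g}) \operatorname{diag}(g) \Big) + \beta \\
&= \operatorname{Tr}\big( (X^T X - 2 X^T \bar{X}) \operatorname{diag}(g) \big) + \beta \\
&= \sum_i g_i \left\| x_{\cdot i} - \bar{x}_{\cdot i} \right\|^2 + \beta'
\end{aligned}
$$

where $\beta$ and $\beta'$ are constants that do not depend on $W, H$.

Next, we prove the theorem for the KL divergence $c(u, v) = KL(v \| u) = \sum_l \left( v_l \log \frac{v_l}{u_l} - v_l + u_l \right)$. Again, we write the objective function explicitly and simplify it:

$$
\begin{aligned}
S(W, H) &= \sum_q \lambda_q \sum_i \sum_j \Big[ \sum_l x_{li} - x^{(q)}_{lj} - x^{(q)}_{lj} \log(x_{li}) + x^{(q)}_{lj} \log(x^{(q)}_{lj}) \Big] \pi^{(q)}_{ij} \\
&= \sum_q \lambda_q \sum_i \sum_j \Big[ \sum_l x_{li} - x^{(q)}_{lj} \log(x_{li}) \Big] \pi^{(q)}_{ij} + \gamma \\
&= \sum_i \sum_l \Big[ x_{li} \sum_q \lambda_q \sum_j \pi^{(q)}_{ij} - \log(x_{li}) \sum_q \lambda_q \sum_j x^{(q)}_{lj} \, \pi^{(q)}_{ij} \Big] + \gamma \\
&= \sum_i g_i \sum_l \Big[ x_{li} - \bar{x}_{li} \log(x_{li}) \Big] + \gamma \\
&= \sum_i g_i \, KL(\bar{x}_{\cdot i} \| x_{\cdot i}) + \gamma'
\end{aligned}
$$

where $\gamma$ and $\gamma'$ are constants that do not depend on $W, H$.

S1.2 Finding optimal rotation for spatial coordinates

In this section, we seek a rotation and translation of the spatial coordinates of one layer that minimizes the distances to the spatial coordinates of the other layer, given a mapping.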
The problem of finding a rotation and translation that minimizes the distances between matched sets of points is a well-known problem in several research fields [6, 3]. In 2D the problem is often called Procrustes analysis, a more general linear algebra problem is called the orthogonal Procrustes problem, and the vector-weighted version is called Wahba's problem [6]. In chemistry/biology the solution to the 3D problem is called the Kabsch algorithm [3]. The 2D solution is based on finding the rotation angle, while the general case (which also works in 2D) looks for a rotation matrix, and thus also supports reflection. Our problem is a variation of this problem, since we have a probabilistic alignment between the spots given by the mapping $\Pi$.

Problem S1. Given ST layers with spatial coordinates $Z \in \mathbb{R}^{2 \times n}$ and $W \in \mathbb{R}^{2 \times n'}$ and a mapping $\Pi \in \Gamma(g, g')$, find a vector $t \in \mathbb{R}^2$ and a rotation matrix $R \in \mathbb{R}^{2 \times 2}$ that minimize:

$$Q(t, R) = \sum_{i,j} \pi_{ij} \left\| z_{\cdot i} - R w_{\cdot j} - t \right\|^2. \tag{S1}$$

We first show that we can assume that no translation is needed ($t = 0$) by centering the spatial coordinates $Z$ and $W$. Assuming $R$ is fixed, we can find the optimal translation by taking the derivative of $Q$ w.r.t. $t$ and setting it to zero:

$$
\begin{aligned}
\frac{\partial Q}{\partial t} &= -2 \sum_{i,j} \pi_{ij} \left( z_{\cdot i} - R w_{\cdot j} - t \right) \\
&= -2 \sum_i z_{\cdot i} \sum_j \pi_{ij} + 2 R \sum_j w_{\cdot j} \sum_i \pi_{ij} + 2 t \sum_{i,j} \pi_{ij} \\
&= -2 \sum_i z_{\cdot i} \, g_i + 2 R \sum_j w_{\cdot j} \, g'_j + 2 t = 0
\end{aligned}
$$

We have $\hat{t} = Zg - RWg'$. By replacing the spatial coordinates $z_{\cdot i}$ with $z_{\cdot i} - Zg$ and the spatial coordinates $w_{\cdot j}$ with $w_{\cdot j} - Wg'$ we get $Q = \sum_{i,j} \pi_{ij} \left\| z_{\cdot i} - R w_{\cdot j} \right\|^2$. Therefore, centering both sets of spatial coordinates removes the need to find a translation, and we are only left with finding the optimal rotation. We rewrite the objective $Q$ in matrix notation, using $R^T R = I$:

$$
\begin{aligned}
Q &= \sum_{i,j} \pi_{ij} \left\| z_{\cdot i} - R w_{\cdot j} \right\|^2 \\
&= \sum_{i,j} \pi_{ij} \left( z_{\cdot i}^T z_{\cdot i} + w_{\cdot j}^T R^T R w_{\cdot j} - z_{\cdot i}^T R w_{\cdot j} - w_{\cdot j}^T R^T z_{\cdot i} \right) \\
&= -2 \sum_{i,j} \pi_{ij} \, z_{\cdot i}^T R w_{\cdot j} + \alpha \\
&= -2 \operatorname{Tr}\big( Z^T R W \Pi^T \big) + \alpha \\
&= -2 \operatorname{Tr}\big( R W \Pi^T Z^T \big) + \alpha
\end{aligned}
$$

where $\alpha$ is a constant independent of $R$.
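The centering argument and the trace form of $Q$ above can be checked numerically. The following is a minimal sketch, not the authors' implementation: the layer sizes, the random coupling `Pi`, and every function and variable name are illustrative assumptions. The marginals `g` and `g_prime` are read off the coupling so that it lies in $\Gamma(g, g')$ by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 4                        # number of spots per layer (toy sizes, assumed)
Z = rng.normal(size=(2, n))        # layer 1 coordinates, one column per spot
W = rng.normal(size=(2, m))        # layer 2 coordinates, one column per spot

# Toy coupling: random non-negative matrix normalized to total mass 1.
Pi = rng.random((n, m))
Pi /= Pi.sum()
g = Pi.sum(axis=1)                 # row marginals
g_prime = Pi.sum(axis=0)           # column marginals


def objective(Z, W, Pi, R, t):
    """Q(t, R) = sum_ij pi_ij * ||z_i - R w_j - t||^2."""
    diff = Z[:, :, None] - (R @ W)[:, None, :] - t[:, None, None]
    return float(np.sum(Pi * np.sum(diff ** 2, axis=0)))


theta = 0.7                        # an arbitrary fixed rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Optimal translation for this R, and the weighted-centered coordinates.
t_hat = Z @ g - R @ (W @ g_prime)
Zc = Z - (Z @ g)[:, None]
Wc = W - (W @ g_prime)[:, None]

q_with_t = objective(Z, W, Pi, R, t_hat)
q_centered = objective(Zc, Wc, Pi, R, np.zeros(2))

# Trace form on centered data: Q = -2 Tr(R W Pi^T Z^T) + alpha.
alpha = float(g @ np.sum(Zc ** 2, axis=0) + g_prime @ np.sum(Wc ** 2, axis=0))
q_trace = -2.0 * np.trace(R @ Wc @ Pi.T @ Zc.T) + alpha
```

Under these assumptions `q_with_t`, `q_centered`, and `q_trace` agree up to floating-point error, and perturbing `t_hat` increases the objective.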
We find the optimal rotation $R$ that minimizes $Q$ using SVD, similar to the solution to Wahba's problem [4]. Let $U \Sigma V^T$ be the SVD of $W \Pi^T Z^T$. We have

$$Q = -2 \operatorname{Tr}\big( R U \Sigma V^T \big) + \alpha = -2 \operatorname{Tr}\big( \Sigma V^T R U \big) + \alpha$$

Notice that $\Sigma$ is a diagonal matrix with non-negative entries and $V^T R U$ is an orthonormal matrix. Therefore, the objective $Q$ is minimized when the trace of $\Sigma V^T R U$ is maximal, which is attained when $V^T R U = I$. We have $R = V U^T$. We note that $R$ may also perform a reflection in addition to a rotation.

An alternative derivation for the 2D case is similar to Procrustes analysis. We write the rotation matrix as a function of the rotation angle $\theta$:

$$R(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}$$

Taking the derivative of $Q$ with respect to $\theta$ and setting it to zero gives:

$$
\begin{aligned}
\frac{\partial Q}{\partial \theta} &= -2 \operatorname{Tr}\Big( \frac{\partial R(\theta)}{\partial \theta} W \Pi^T Z^T \Big) \\
&= -2 \operatorname{Tr}\Big( \begin{pmatrix} -\sin(\theta) & -\cos(\theta) \\ \cos(\theta) & -\sin(\theta) \end{pmatrix} W \Pi^T Z^T \Big) = 0
\end{aligned}
$$

Dividing by $\cos(\theta)$ and extracting $\theta$ we have:

$$\hat{\theta} = \arctan\left( \frac{\operatorname{Tr}\Big( \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} W \Pi^T Z^T \Big)}{\operatorname{Tr}\big( W \Pi^T Z^T \big)} \right)$$

S2 Supplementary results

S2.1 Comparison of PASTE to Scanorama on ST alignment simulation

We compared our results to Scanorama [1], a single-cell RNA-seq integration method. Scanorama integrates gene expression information by resolving noise and batch effects between two or more datasets. Scanorama is not designed to align cells from RNA-seq, though it does rely on inferring nearest neighbors between cells in the given datasets. To directly compare Scanorama with PASTE, we calculated an alignment between spots of the different layers by finding a mapping that minimizes the Wasserstein optimal transport distance, where the transportation cost between the spots is taken as the Euclidean distance between the spots in the integrated gene expression datasets from Scanorama. We see that PASTE outperforms alignment based on gene expression corrected by Scanorama (Figure S3). In fact, Scanorama performs slightly worse than our pairwise alignment on the original gene expression data alone.

S2.2 Spatial entropy definition

The spatial entropy is computed as follows.
Given a graph with vertex labels (e.g. cluster labels), the spatial entropy is the Shannon entropy of the distribution of the unordered pairs of cluster labels on the edges of the graph. Specifically, let $G = (V, E)$ be a graph where $V$ is the set of spots and where there is an edge $(i, k) \in E$ between every pair $(i, k)$ of spots adjacent on the array. Let $K = \{1, 2, \ldots, k\}$ be a set of $k$ cluster labels and let $\ell : V \to K$ be the spot cluster assignment. We define a categorical variable $C = \{\{a, b\} : (a, b) \in K \times K\}$ which describes every distinct unordered pair of cluster labels. The spatial entropy is calculated as

$$H(G) = H(C \mid E) = -\sum_{c \in C} P(c \mid E) \log\big( P(c \mid E) \big), \qquad P(c \mid E) = \frac{c}{|E|},$$

where the numerator counts the edges whose endpoint labels form the pair $c$. A low value of spatial entropy indicates that many adjacent spots have the same cluster label, while a large spatial entropy indicates that many adjacent spots have different cluster labels.

S2.3 Supplementary plots

Figure S1: Spatial organization of breast cancer ST layers from [5]. Panels: Layer 1, Layer 2, Layer 3, Layer 4.

Figure S2: PASTE results for pairwise alignment of a simulated ST layer with layer 1 of breast cancer dataset [5] with varying levels of α. Coverage variability factor for the simulated ST layer was set at η = 10. Plot axes: "% of Spots Correctly Aligned" (vertical) against "level" (horizontal); legend entries: Mixed, Gene Exp Only, Spatial Only.

Figure S3: PASTE results on pairwise alignment of simulated ST layers based on four layers of breast cancer dataset [5]. Each value is averaged over 10 simulations.

Figure S4: Spatial organization of spots used in center layer alignment simulation of layer 2 from the breast cancer dataset [5]. (a) Original spatial organization of spots in layer 2 of breast cancer dataset. (b)-(d) Simulated spatial structures obtained by rotating (a) by $\pi/6$, $\pi/3$, and $2\pi/3$, respectively.
Figure S5: PASTE results on center layer integration of simulated ST layers based on four layers of breast cancer dataset [5].
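Two quantities defined in the supplementary text lend themselves to short sketches: the closed-form rotation $R = V U^T$ of Section S1.2 and the spatial entropy of Section S2.2. The code below is an illustration under stated assumptions rather than the PASTE implementation: the function names `optimal_rotation` and `spatial_entropy` are hypothetical, coordinates are assumed to be centered already, the entropy uses the natural logarithm (the text does not fix a base), and the graph is passed as a plain edge list.

```python
from collections import Counter
from math import log

import numpy as np


def optimal_rotation(Z, W, Pi):
    """Return R = V U^T, where U S V^T is the SVD of W Pi^T Z^T.

    Z (2 x n) and W (2 x n') are assumed centered; Pi is the n x n' mapping.
    The result is orthogonal and may include a reflection.
    """
    M = W @ Pi.T @ Z.T
    U, _, Vt = np.linalg.svd(M)    # numpy returns V^T as the third factor
    return Vt.T @ U.T


def spatial_entropy(edges, labels):
    """Shannon entropy of unordered label pairs over the edges of the spot graph."""
    pair_counts = Counter(tuple(sorted((labels[i], labels[k]))) for i, k in edges)
    total = sum(pair_counts.values())
    return -sum((c / total) * log(c / total) for c in pair_counts.values())


# Usage: recover a known rotation from a one-to-one mapping (toy data).
rng = np.random.default_rng(1)
n = 6
W = rng.normal(size=(2, n))
W -= W.mean(axis=1, keepdims=True)           # center with uniform weights
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
Z = R_true @ W                               # layer 1 is a rotated copy of layer 2
Pi = np.eye(n) / n                           # spot i maps to spot i
R_est = optimal_rotation(Z, W, Pi)

# Usage: entropy on a 4-spot path graph with two label assignments.
edges = [(0, 1), (1, 2), (2, 3)]
h_uniform = spatial_entropy(edges, [0, 0, 0, 0])   # a single label pair
h_mixed = spatial_entropy(edges, [0, 0, 1, 1])     # three equally frequent pairs
```

In this toy setting `R_est` matches `R_true` up to floating-point error; `h_uniform` is zero because every edge carries the same label pair, and `h_mixed` equals $\log 3$ because the three edges carry three distinct pairs.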