Procrustes Documentation Release 0.0.1-Alpha
Total Page:16
File Type:pdf, Size:1020Kb
Procrustes Documentation Release 0.0.1-alpha The QC-Devs Community Apr 23, 2021 USER DOCUMENTATION 1 Description of Procrustes Methods3 2 Indices and tables 41 Bibliography 43 Python Module Index 45 Index 47 i ii Procrustes Documentation, Release 0.0.1-alpha Procrustes is a free, open-source, and cross-platform Python library for (generalized) Procrustes problems with the goal of finding the optimal transformation(s) that makes two matrices as close as possible to each other. Please use the following citation in any publication using Procrustes library: “Procrustes: A Python Library to Find Transformations that Maximize the Similarity Between Matrices”, F. Meng, M. Richer, A. Tehrani, J. La, T. D. Kim, P. W. Ayers, F. Heidar-Zadeh, JOURNAL 2021; ISSUE PAGE NUMBER. The Procrustes source code is hosted on GitHub and is released under the GNU General Public License v3.0. We welcome any contributions to the Procrustes library in accord with our Code of Conduct; please see our Contributing Guidelines. Please report any issues you encounter while using Procrustes library on GitHub Issues. For further information and inquiries please contact us at [email protected]. USER DOCUMENTATION 1 Procrustes Documentation, Release 0.0.1-alpha 2 USER DOCUMENTATION CHAPTER ONE DESCRIPTION OF PROCRUSTES METHODS Procrustes problems arise when one wishes to find one or two transformations, T 2 Rn×n and S 2 Rm×m, that make matrix A 2 Rm×n (input matrix) resemble matrix B 2 Rm×n (target or reference matrix) as closely as possible: min kSAT − Bk2 |{z} F S;T where, the k · kF denotes the Frobenius norm defined as, v u m n q uX X 2 y kAkF = t jaijj = Tr(A A) i=1 j=1 Here aij and Tr(A) denote the ij-th element and trace of matrix A, respectively. When S is an identity matrix, this is called a one-sided Procrustes problem, and when it is equal to T, this becomes two-sided Procrustes problem with one transformation, otherwise, it is called two-sided Procrustes problem. Different Procrustes problems use different choices for the transformation matrices S and T which are commonly taken to be orthogonal/unitary matri- ces, rotation matrices, symmetric matrices, or permutation matrices. The table below summarizes various Procrustes methods supported: Procrustes Type S T Constraints Generic [1] I T None Orthogonal [2][3][4] I Q Q−1 = Qy ( R−1 = Ry Rotational [4][5][6] I R jRj = 1 Symmetric [7][8][9] I X X = Xy ( [P]ij 2 f0; 1g Permutation [10] I P Pn Pn i=1[P]ij = j=1[P]ij = 1 ( −1 y y Q1 = Q1 Two-sided Orthogonal [11] Q1 Q2 −1 y Q2 = Q2 Two-sided Orthogonal with One Transformation [12] Qy Q Q−1 = Qy 8 >[P1]ij 2 f0; 1g > y <[P2]ij 2 f0; 1g Two-sided Permutation [13] P P2 1 Pn [P ] = Pn [P ] = 1 > i=1 1 ij j=1 1 ij :>Pn Pn i=1[P2]ij = j=1[P2]ij = 1 ( y [P]ij 2 f0; 1g Two-sided Permutation with One Transformation [2][14] P P Pn Pn i=1[P]ij = j=1[P]ij = 1 In addition to these Procrustes methods, summarized in the table above, the generalized Procrustes analysis (GPA) [15][1][16][17][18] and softassign algorithm [19][20][21] are also implemented in our package. The GPA algorithm 3 Procrustes Documentation, Release 0.0.1-alpha seeks the optimal transformation matrices T to superpose the given objects (usually more than 2) with minimum distance, j X 2 min kAiTi − AjTjk (1.1) i<j where Ai and Aj are the configurations and Ti and Tj denotes the transformation matrices for Ai and Aj respec- tively. When only two objects are given, the problem shrinks to generic Procrustes. The softassign algorithm was first proposed to deal with quadratic assignment problem [19] inspired by statistical physics algorithms and has subsequently been developed theoretically [20][21] and extended to many other applica- tions [22][20][23][24][25]. Because the two-sided permutation Procrustes problem is a special quadratic assignment problem it can be used here. The objective function is to minimize Eqap(M; 휇, 휈),[20][26], which is defined as follows, 1 X E (M; 휇, 휈) = − C M M qap 2 ai;bj ai bj aibj ! ! X X X X + 휇a Mai − 1 + 휈i Mai − 1 a i i a 훾 X 1 X − M2 + M log M 2 ai 훽 ai ai ai ai Procrustes problems arise when aligning molecules and other objects, when evaluating optimal basis transformations, when determining optimal mappings between sets, and in many other contexts. This package includes options to translate, scale, and zero-pad matrices, so that matrices with different centers/scaling/sizes can be considered. 1.1 Installation 1.1.1 Downloading Code The latest code can be obtained through theochem (https://github.com/theochem/procrustes) in Github, git clone [email protected]:theochem/procrustes.git 1.1.2 Dependencies The following dependencies will be necessary for Procrustes to build properly, • Python >= 3.6: https://www.python.org/ • SciPy >= 1.5.0: https://www.scipy.org/ • NumPy >= 1.18.5: https://www.numpy.org/ • Pip >= 19.0: https://pip.pypa.io/ • PyTest >= 5.3.4: https://docs.pytest.org/ • PyTest-Cov >= 2.8.0: https://pypi.org/project/pytest-cov/ • Sphinx >= 2.3.0, if one wishes to build the documentation locally: https://www.sphinx-doc.org/ 4 Chapter 1. Description of Procrustes Methods Procrustes Documentation, Release 0.0.1-alpha 1.1.3 Installation The stable release of the package can be easily installed through the pip and conda package management systems, which install the dependencies automatically, if not available. To use pip, simply run the following command: pip install qc-procrustes To use conda, one can either install the package through Anaconda Navigator or run the following command in a desired conda environment: conda install -c theochem procrustes Alternatively, the Procrustes source code can be download from GitHub (either the stable version or the development version) and then installed from source. For example, one can download the latest source code using git by: # download source code git clone [email protected]:theochem/procrustes.git cd procrustes From the parent directory, the dependencies can either be installed using pip by: # install dependencies using pip pip install -r requirements.txt or, through conda by: # create and activate myenv environment # Procruste works with Python 3.6, 3.7, and 3.8 conda create -n myenv python=3.6 conda activate myenv # install dependencies using conda conda install --yes --file requirements.txt Finally, the Procrustes package can be installed (from source) by: # install Procrustes from source pip install . 1.1.4 Testing To make sure that the package is installed properly, the Procrustes tests should be executed using pytest from the parent directory: # testing without coverage report pytest -v . In addition, to generate a coverage report alongside testing, one can use: # testing with coverage report pytest --cov-config=.coveragerc --cov=procrustes procrustes/test 1.1. Installation 5 Procrustes Documentation, Release 0.0.1-alpha 1.2 Quick Start The code block below gives an example of the orthogonal Procrustes problem for random matrices A and B. Here, matrix B is constructed by shifting an orthogonal transformation of matrix A, so the matrices can be perfectly matched. As is the case with all Procrustes flavours, the user can specify whether the matrices should be translated (so that both are centered at origin) and/or scaled (so that both are normalized to unity with respect to the Frobenius norm). In addition, the other optional arguments (not appearing in the code-block below) specify whether the zero columns (on the right-hand side) and rows (at the bottom) should be removed prior to transformation. [1]: import numpy as np from scipy.stats import ortho_group from procrustes import orthogonal # random input 10x7 matrix A a= np.random.rand(10,7) # random orthogonal 7x7 matrix T t= ortho_group.rvs(7) # target matrix B (which is a shifted AxT) b= np.dot(a, t)+ np.random.rand(1,7) # orthogonal Procrustes analysis with translation result= orthogonal(a, b, scale= True, translate=True) # display Procrustes results # error (expected to be zero) print(result.error) 9.945692192702779e-31 [2]: # transformation matrix (same as T) print(result.t) [[ 0.76704368 0.24234201 -0.2443444 -0.03500412 0.13473421 -0.4848402 -0.19687949] [ 0.00853565 -0.68551647 -0.08350729 -0.31427979 -0.29329304 -0.47950741 0.32909102] [ 0.10680518 0.11156386 -0.16906181 -0.5547083 -0.60941328 0.33956084 -0.39137815] [ 0.3117252 -0.28167213 -0.19484105 0.71257785 -0.45491058 0.26552062 0.01769573] [-0.27528981 0.51527592 -0.55432293 0.05562049 -0.26779419 -0.1780102 0.49491144] [ 0.04849826 0.33465191 0.72922801 0.115903 -0.48165441 -0.31040565 0.11002901] [ 0.47418838 0.04528433 0.1665233 -0.26078206 0.11744363 0.47027711 0.66513446]] [3]: print(result.new_b) print(result.new_a) [[ 0.10524259 -0.20171952 -0.00197784 0.09527217 -0.00576083 0.03226466 0.16905405] [ 0.09910817 -0.16527621 -0.05833857 -0.07205314 -0.21556919 0.05246684 -0.10984354] [-0.18236865 0.11982583 -0.13907749 -0.04269053 0.19000499 0.01580882 (continues on next page) 6 Chapter 1. Description of Procrustes Methods Procrustes Documentation, Release 0.0.1-alpha (continued from previous page) 0.03687883] [ 0.00058638 0.13064853 0.14907956 -0.07415179 -0.01526494 -0.14933601 -0.00886503] [ 0.05250876 -0.10823365 0.02606576 -0.04653071 0.16780312 0.06690096 0.09107437] [ 0.08533708 0.27191838 0.00873697 0.01521687 -0.19000507 0.16683213 -0.05890518] [ 0.0517116 -0.18424346 -0.16585383 0.13987241 -0.10713652 -0.14291237 -0.19611888] [-0.2919628 0.0243955 0.13466663 0.00734221 -0.11547638 -0.02873159 -0.04597026] [ 0.02859943 0.12270226 0.07612172 0.02668408 0.01175157 -0.11417017 -0.00790431]