Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform
Dong Xu1,2, Yang Zhang1,2*

1 Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America, 2 Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, Lawrence, Kansas, United States of America

Abstract

Macromolecular surfaces are fundamental representations of the three-dimensional geometric shape of macromolecules. Accurate calculation of protein surfaces is critically important in protein structural and functional studies, including ligand-protein docking and virtual screening. In contrast to analytical or parametric representations of macromolecular surfaces, triangulated mesh surfaces have proved easy to describe, visualize and manipulate by computer programs. Here, we develop a new algorithm, EDTSurf, for generating the three major macromolecular surfaces (van der Waals surface, solvent-accessible surface and molecular surface) using the technique of fast Euclidean Distance Transform (EDT). The triangulated surfaces are constructed directly from volumetric solids by a Vertex-Connected Marching Cube algorithm that forms triangles from grid points. Compared to the analytical result, the relative error of the surface calculations by EDTSurf is ~2–4% depending on the grid resolution, which is 1.5–4 times lower than that of the methods in the literature; and yet, the algorithm is faster and uses less computer memory than the comparative methods. The improvements in both accuracy and speed of the macromolecular surface determination should make EDTSurf a useful tool for the detailed study of protein docking and structure predictions. Both source code and the executable program of EDTSurf are freely available at http://zhang.bioinformatics.ku.edu/EDTSurf.

Citation: Xu D, Zhang Y (2009) Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform. PLoS ONE 4(12): e8140.
doi:10.1371/journal.pone.0008140

Editor: Markus J. Buehler, Massachusetts Institute of Technology, United States of America

Received August 19, 2009; Accepted November 9, 2009; Published December 2, 2009

Copyright: © 2009 Xu, Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The project is supported by the Alfred P. Sloan Foundation, NSF Career Award 0746198 and the National Institute of General Medical Sciences Grants GM083107 and GM084222. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: [email protected]

Introduction

Three main types of macromolecular surfaces are used in molecular biology studies: the van der Waals surface (VWS), the solvent-accessible surface (SAS) and the molecular surface (MS) [1]. Because shape and surface decide how macromolecules interact with others, accurate determination of the macromolecular surfaces is essential for elucidating their biological roles in physiological processes. Consequently, calculations of the macromolecular surfaces from given 3D structures have found extensive use in modern molecular biology studies, including protein folding and structure prediction [2], protein-ligand docking [3,4], DNA-protein interactions [5], and new drug screening [6].

A variety of methods have been proposed to compute the three macromolecular surfaces. These methods can be generally categorized into two classes: analytical computation and explicit representation. For analytical computing, Connolly first presented an algorithm for calculating the smooth solvent-excluded surface of a molecule [7] (which he called the "alternative solvent-accessible surface"), where the spheres, tori and arcs were defined using analytical expressions according to the atomic coordinates, van der Waals radii and the probe radius [8]. The author also developed Connolly's Molecular Surface Package (MSP), a suite of programs for computing and manipulating molecular surfaces and volumes [9]. MSMS (Michel Sanner's Molecular Surface) was later developed to compute both the solvent-accessible and the molecular surface relying on the reduced surface [10]. There are also a number of other methods which were developed to analytically calculate the exact surface area and volume [11–17]. Among them, Liang et al. presented a method for computing molecular area and volume based on the alpha shape theory [14], which was earlier proposed by Edelsbrunner and Mücke [11]. An alpha-shape of a set of weighted points is a subset of the regular Delaunay triangulation of these weighted points. The reduced surface [10] is equivalent to an alpha-shape with an alpha value equal to zero when the radii of atoms are further inflated by the solvent radius.

Although analytical methods have the advantage of giving accurate values of surface area and volume, they are not convenient in other applications when explicit surfaces of local atoms are required for further processing. For example, local surfaces of proteins and ligands are often used for shape comparison in the docking problem. The explicit surface generation method is a grid-based approximation which uses a space-filling model where each atom is modeled as a volumetric item [18,19]. Molecules are placed onto grids, whose width can be altered to achieve different resolutions. LSMS (Level Set method for Molecular Surface generation) used a level-set method and achieved a very fast speed [20]. Zhang et al. constructed a smooth volumetric electron density map from atomic data by using a weighted Gaussian isotropic kernel function and a two-level clustering technique [21]. The authors selected a smooth implicit solvation surface approximation to the Lee-Richards molecular surface.

After the space-filling procedure, an important step is surface representation and construction. In general, macromolecular surfaces can be represented by parametric equations or triangular patches. Parametric representations of protein molecular surfaces are a compact way to describe a surface, and are useful for the evaluation of surface properties such as the normal vector, principal curvatures, and principal curvature directions [22]. Simplified triangular representations of molecular surfaces are useful for easy manipulation, efficient rendering and the display of large-scale surface features. A triangulated surface is composed of a set of vertices and a group of triangular patches connecting these vertices. Connolly created the triangles by subdividing the curved faces of an analytical molecular surface [23]. Molecular areas and volumes may be calculated from it and packing defects in proteins may be identified. MSMS computed the triangulated molecular surfaces by sewing pre-triangulated template spheres and concave faces together.

A commonly used method to construct a triangulated isosurface from a 3D grid is the Marching Cube algorithm [24], which was also used in LSMS. Marching Cubes (MC) creates triangle models of constant density surfaces from 3D image data. The LSMS algorithm only considers the inside/outside attributes of each vertex and uses Marching Cubes to connect the middle point of each edge. Xiang et al. proposed an improved version of the Marching Cube method for molecular surface triangulation [25]. This new algorithm involves fewer and simpler basic building blocks and avoids the artificial gaps of the original one. Obviously, quantities like surface area and volume computed by grid-based algorithms may not be as accurate as those calculated by the analytical methods. However, these algorithms can generate triangular

The solvent-accessible surface (SAS) (see the red part of Figure 1B) is defined as the surface traced out by the center of a probe sphere as it is rolled over the van der Waals surface. The probe sphere represents a solvent water molecule and is shown as the black circle in Figure 1B.

The molecular surface (MS) is a continuous sheet consisting of two parts: the contact surface and the reentrant surface [26]. The contact surface (see the green part of Figure 1C) is the part of the van der Waals surface that is accessible to a probe sphere. The reentrant surface (see the pink part of Figure 1C) is the inward-facing surface of the probe when it touches two or more atoms. The molecular surface is also called the solvent-excluded surface (SES), which is the boundary of the union of all possible probes which do not overlap with the molecule [10], or the Connolly surface. The solvent-accessible surface is displaced outward from the molecular surface by a distance equal to the probe radius [8].

Euclidean Distance Transform

Distance Transform (DT) is the transformation that converts a digital binary image into a gray-scale image in which the value of each pixel in the object is the minimum distance from the background to that pixel under a predefined distance function. Three distance functions between two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are often used in practice: the city-block distance, the chessboard distance and the Euclidean distance, i.e.

$$d_{\text{city-block}} = |x_1 - x_2| + |y_1 - y_2| + |z_1 - z_2|$$

$$d_{\text{chessboard}} = \max\left(|x_1 - x_2|,\ |y_1 - y_2|,\ |z_1 - z_2|\right)$$