Application of Machine Learning to Toolmarks: Statistically Based Methods for Impression Pattern Comparisons
The author(s) shown below used Federal funds provided by the U.S. Department of Justice and prepared the following final report:

Document Title: Application of Machine Learning to Toolmarks: Statistically Based Methods for Impression Pattern Comparisons

Authors: Nicholas D. K. Petraco [a,b], Ph.D.; Helen Chan [a], B.A.; Peter R. De Forest [a], D.Crim.; Peter Diaczuk [a,b], M.S.; Carol Gambino [a,c], M.S.; James Hamby [d], Ph.D.; Frani L. Kammerman [a], M.S.; Brooke W. Kammrath [a,b], M.A., M.S.; Thomas A. Kubic [a,b], M.S., J.D., Ph.D.; Loretta Kuo [a], M.S.; Patrick McLaughlin [a,e]; Gerard Petillo [f], B.A.; Nicholas Petraco [a,e], M.S.; Elizabeth W. Phelps [a], M.S.; Peter A. Pizzola [g], Ph.D.; Dale K. Purcell [a,b], M.S.; Peter Shenkin [a], Ph.D.

[a] John Jay College of Criminal Justice, City University of New York
[b] The Graduate Center, City University of New York
[c] Borough of Manhattan Community College, City University of New York
[d] International Forensic Science Laboratory & Training Centre
[e] New York City Police Department
[f] Independent Firearms Examiner
[g] New York City Office of the Chief Medical Examiner

Document No.: 239048
Date Received: July 2012
Award Number: 2009-DN-BX-K041
Final Report, December 2011

This report has not been published by the U.S. Department of Justice. To provide better customer service, NCJRS has made this Federally-funded grant final report available electronically in addition to traditional paper copies. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice.

This document is a research report submitted to the U.S. Department of Justice. This report has not been published by the Department. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice.

Abstract

Over the last decade, forensic firearms and toolmark examiners have encountered harsh criticism that there is no accepted methodology to generate numerical “proof” that independently corroborates their morphological conclusions. This project strives to answer that criticism and focuses on:

a. The collection of 3D quantitative surface topographies of toolmarks by confocal microscopy;
b. Identification of relevant modern multivariate machine learning methods for tool-toolmark associations and estimations of identification error rates; and
c. Dissemination of toolmark surface data and software generated for the project to aid further research.

A database was assembled which consists of 3D striation and impression patterns on Glock fired cartridge cases and striation patterns from screwdrivers and chisels. The database is now available to registered users. Statistical studies were carried out on a large portion of the primer shears (cartridge cases) and screwdriver striation patterns collected thus far. Principal component analysis, canonical variate analysis, and support vector machine methodology were used to objectively associate these toolmarks with the tools that created them. Estimated toolmark identification error rates were on the order of 1% using these algorithmic methods. Conformal prediction theory was used to assign confidence levels to each toolmark identification and is suggested as a useful measure in gauging the quality of a toolmark “match” for a multivariate classification system. The findings of this objective and quantitative scientific research reinforce the general conclusions codified in the AFTE theory of identification.

Table of Contents

Executive Summary
Final Technical Report
I. Introduction
   1. Statement of the problem
   2. Review of relevant literature
      2.1 Introduction to toolmarks and toolmark examination
      2.2 Individualization of toolmarks
      2.3 Materials for experimentation
      2.4 Two schools of thought
      2.5 Methods and techniques of toolmark examination
      2.6 Reliability of toolmark examination
      2.7 Court decisions
      2.8 Statistics and toolmarks
   3. Rationale for the research
II. Materials and Methods
   1. Materials
   2. Methods for toolmark impression data collection
      1.1 Generating reproducible toolmark impressions
      1.2 Confocal microscope
   3. Machine learning methods for toolmark comparison
      3.1 General striated toolmark surface preprocessing and feature vector construction
      3.2 The data matrix and principal component analysis
      3.3 Canonical variate analysis
      3.4 Support vector machines
   4. Methods for error rate estimation
      4.1 Resubstitution methods
      4.2 Conformal prediction theory
III. Results
   1. Toolmark impression data collection and database
      1.1 Cartridge case striation and impression pattern collection
      1.2 Striated toolmark pattern collection
      1.3 Database and web interface
      1.4 Surface visualization and measurement software
      1.5 Profile simulator software
      1.6 R software and statistical analysis scripts
   2. Statistical Analyses
      2.1 Glock 19 Cartridge Casings
      2.2 Screwdriver striation patterns
IV. Conclusions
   1. Discussion of findings
   2. Implications for policy and practice
   3. Implications for further research
V. References
VI. Dissemination of research findings

Executive Summary

1. Introduction

Forensic science has come under increased scrutiny in recent years. In February 2009, the National Academy of Sciences (NAS) released its report on the forensic sciences in the United States. The report, entitled “Strengthening Forensic Science in the United States: A Path Forward,” states that “much forensic evidence—including, for example, bite marks and firearm and toolmark identifications—is introduced in criminal trials without any meaningful scientific validation, determination of error rates, or reliability testing to explain the limits of the discipline” (p. 3-18). The NAS report further contends that “sufficient studies have not been done to understand the reliability and repeatability of the methods” (p. 5-21) and, as a result, “additional studies should be performed to make the process of individualization more precise and repeatable” (p. 5-21). This experiment sought to develop a statistical foundation for assessing the likelihood that one tool is the source of a given toolmark to the exclusion of all other tools.

Impression evidence has borne the brunt of the attack, and while some of the criticism is justified, much of it is naive and based on misunderstandings. Impression evidence is a broad category of important, commonly encountered, and valuable physical evidence. It includes fingerprints, toolmarks, footwear impressions, tire tracks, and those impressions associated with firearms identification (i.e., microstriae in land impressions on bullets, breech face impressions, firing pin impressions, and other marks on cartridge cases). Although impression evidence of various types has been used successfully for decades, its examination has lacked a well-articulated scientific basis.
This research seeks to place the analysis of impression evidence, specifically impressions made by tools and firearms, on a sound scientific foundation by laying down, testing, and fully publishing methodological statistical foundations for toolmark impression pattern recognition and comparison.

2. Scope of the project

This study focuses on striation patterns left by tools and on the patterns imparted to cartridge casings by firearms.