Forensic Statistics and the Assessment of Probative Value
CSAFE Presentations and Proceedings
Center for Statistics and Applications in Forensic Evidence
12-11-2018

Hal Stern, University of California, Irvine

Recommended Citation
Stern, Hal, "Forensic Statistics and the Assessment of Probative Value" (2018). CSAFE Presentations and Proceedings. 20. https://lib.dr.iastate.edu/csafe_conf/20

Disciplines: Forensic Science and Technology
Comments: Posted with permission of CSAFE.
This presentation is available at Iowa State University Digital Repository: https://lib.dr.iastate.edu/csafe_conf/20

Forensic Statistics and the Assessment of Probative Value
OSAC Meeting
Phoenix, AZ
December 11, 2018
Hal Stern
Department of Statistics
University of California, Irvine

Interesting times in forensic science

Evaluation of forensic evidence
• Forensic examinations cover a range of questions
  – timing of events
  – cause/effect
  – source conclusions
• Focus here on source conclusions
  – topics addressed (e.g., need to assess uncertainty, logic of the likelihood ratio) are relevant beyond source conclusions
• The task of interest for purposes of this presentation: assess two items of evidence, one from a known source and one from an unknown source, to determine if the two samples come from the same source
  – Bullet casing from test fire of suspect’s gun
  – Bullet casing from the crime scene

The Daubert standard
• Daubert standard (Daubert v. Merrell Dow Pharmaceuticals, 1993) governs admission of scientific expert testimony in federal courts
  – judge as gatekeeper
  – conclusions should be the product of applying a scientific methodology
  – relevant factors for judge to consider
    • Has the technique been tested in actual field conditions (and not just in a laboratory)?
    • Has the technique been subject to peer review and publication?
    • What is the known or potential rate of error?
    • Do standards exist for the control of the technique's operation?
    • Has the technique been generally accepted within the relevant scientific community?
  – applies to all expert evidence (Kumho Tire Co. v. Carmichael, 1999)
• Frye standard (Frye v. United States, 1923)
  – general acceptance in relevant scientific community standard
  – applicable to novel scientific evidence

FRE Rule 702 (post-Daubert)
A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if:
a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;
b) the testimony is based on sufficient facts or data;
c) the testimony is the product of reliable principles and methods; and
d) the expert has reliably applied the principles and methods to the facts of the case.

Logic of forensic examinations
• Examine two samples to identify similarities and differences
• Assess similarities and differences to see if they are expected (or likely) under the same source hypothesis
• Assess similarities and differences to see if they are expected (or likely) under the different source hypothesis

Evaluation and interpretation of forensic evidence
• Approaches
  – Expert assessment based on experience, training, use of accepted methods. Typically summarized by a categorical conclusion (e.g., identification / exclusion / inconclusive)
  – Two-stage procedure (see, e.g., Parker and Holford in the 1960s)
    • similarity (binary decision based on distance/score)
    • identification (likelihood of coincidental match)
  – Likelihood ratio (sometimes known as the Bayes factor)

Satisfying Daubert / FRE 702
• Application of any of these approaches should be supported by evidence regarding how well (how reliably) they perform
• Examples:
  – Studies of reliability and validity of measurements (e.g., chemical composition of glass)
  – Peer-reviewed studies of techniques/models
  – Studies of reliability and validity of examiner conclusions
• Important to also recall that the approach needs to be “reliably applied … to the facts of the case.” (e.g., N.C. vs. McPhaul, 2017)

Forensic Evidence as Expert Opinion
• Status quo in pattern disciplines (fingerprints, shoe prints, firearms, toolmarks, questioned documents, etc.)
• Examiner analyzes evidence based on
  – Experience
  – Training
  – Use of accepted methods in the field
• Assessment of the evidence reflects examiner’s expert opinion
• Conclusions typically reported as categorical conclusions
  – Identification, Exclusion, Inconclusive
  – Multi-category scales (some support, strong support, very strong support, …)

Forensic Evidence as Expert Opinion
• Occasionally conclusions are expressed as statements about the hypotheses rather than the evidence, e.g., “based on the evidence, author of the known samples …”
  – Wrote the questioned sample
  – Highly probable wrote the questioned sample
  – Probably wrote the questioned sample
  – Indications may have written the questioned sample
  – with similar statements on the negative side
• This is logically problematic
  – It is a statement about the likelihood of a hypothesis (“same source”) after viewing the evidence
  – But, as we will see later, this conclusion must also reflect in part the examiner’s a priori (pre-evidence) opinion about the hypothesis

Forensic Evidence as Expert Opinion
• What does it take to establish that testimony is
  – “based on sufficient facts or data”
  – “the product of reliable principles and methods”
• Note that the use of the word “reliable” in the legal sense (trustworthy) differs from its technical use in statistics
• In measurement / assessment, statisticians focus on a number of related concepts in thinking about “reliability”:
  – Would the same analyst draw the same conclusion in a new examination of the evidence (repeatability)
  – Would different analysts draw the same conclusion given the same evidence (reproducibility)
  – Repeatability and reproducibility are both components of reliability
  – Do analysts get the right answer in studies where the ground truth is available (accuracy / validity)

Reliability of Measurements: An Example from Handwriting
• 5 forensic document examiners (FDE) rated 123 signatures in terms of difficulty to simulate on a 5-point scale (easy - fairly easy - medium - difficult - very difficult)
• Assessing reproducibility (similarity of assessments by two different examiners)
  – Correlation of ratings of each pair of FDEs (.62 to .75)
  – Statistical model (intraclass correlation coefficient) (.65)
• Assessing repeatability (similarity of assessments by same examiner at two different times) … a very small study with only 7 signatures
  – Correlation of ratings (range from .40 to .88)
  – Statistical model estimates .68

Forensic Evidence as Expert Opinion
• PCAST report called for assessment of
  – Foundational validity of a forensic science discipline
  – Validity as applied in a particular case
• Foundational validity
  – A method can in principle be reliable (in the legal sense)
  – PCAST advocated for multiple “black box” studies
• Validity as applied
  – Proficiency testing (this person can do the task)
  – Case report establishing it has been applied appropriately in this case
• PCAST report has been controversial

Forensic Evidence as Expert Opinion
• Example of a (PCAST-style) “black box” study
  – Having examples with known “ground truth” allows estimation of error rates
  – Ulery et al. (2011) “black box” study of fingerprint decisions
    • false positive rate was 0.1%
    • false negative rate was 7.5%
  – There are limitations in this and any study (similarity to case work, case environment?)
  – Same group carried out a series of “white box” studies in fingerprints to assess
    • Reliability of different steps in the examination process (e.g., marking of minutiae)

Forensic Evidence as Expert Opinion
• Reliability and validity are likely to depend on characteristics of the evidence, e.g.,
  – quality of latent print
  – complexity of a signature
• Studies should address this and would allow statements like “for evidence of this type …”

Forensic Evidence as Expert Opinion
Example:

Forensic Evidence as Expert Opinion
• A few final remarks on forensic evidence as expert opinion
  – Information on reliability and accuracy for forensic analyses is extremely helpful and will likely be increasingly requested
  – As per FRE 702, there is also a need to address application of the method or technique in the current case (e.g., N.C. vs. McPhaul, 2017)
  – There will always be unique situations without relevant empirical studies (e.g., did this typewriter produce this note)
    • Not necessarily a problem as long as lack of relevant empirical evidence is acknowledged

The Two-Stage Approach
• Stage 1 - Similarity
  – Statistical test or procedure to determine if the two samples “are indistinguishable”, “can’t be distinguished”, “match”, etc.
• Stage 2 - Identification
  – Assessment of the probability that two samples from different sources would be found indistinguishable
• Used in assessment of trace evidence (like glass)
• Conceptually