Assessing the Strength of Non-Contemporaneous Forensic Speech Evidence
Total Page:16
File Type:pdf, Size:1020Kb
Assessing the strength of non-contemporaneous forensic speech evidence Richard William Rhodes Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy The University of York Department of Language and Linguistic Science Submitted December 2012 2 Abstract The aim of this thesis is to assess the impact of long term non-contemporaneity on the strength of forensic speech evidence. Speakers experience age-related changes to the voice over long delays and this time also presents the opportunity for social factors to vary. These changes are shown to impact on speech parameters used in forensic analyses. Using longitudinal data from the Up documentary series, this thesis analyses the effects of aging on forensically useful acoustic parameters in eight speakers at five seven-year intervals between ages 21 and 49. The investigation reveals significant age-related changes in real-time across adulthood. Frequencies of the first three formants in monophthongs /i: ɪ e a ɑ: ʌ ɒ ʊ & u:/ and diphthongs /eɪ & aɪ/ show comprehensive reduction. For monophthongs, F1 exhibits mean change of 8.5%, greater than F2 and F3 at 3.7% and 2.2% respectively. Vowel quality also impacts on magnitude of change in each formant. Estimations based on this data suggest that vocal tract extension and restricted articulator movement are probable drivers for acoustic change, operating on different timelines. Counter-examples to this aging pattern can generally be explained by social factors, as a result of mobility or in accordance with mainstream changes in a variety. Strength of evidence estimates for these non-contemporaneous data are calculated using a numerical likelihood ratio (LR) approach. Age-related changes result in weaker and fewer correct LRs with greater length delays. Cubic coefficients of diphthong formants are investigated in line with a formant dynamic approach. These LR tests show promising results and resilience to aging, especially in F1; tentatively suggesting that, for these speakers, some speaker-specific behaviour pervades in spite of physiological changes. This analysis raises several questions with regards to applying an overtly numerical LR approach where there is apparent mismatch between forensic samples. The effect of aging on an ASR system (BATVOX) is also tested for six male subjects. The system measures Mel-frequency cepstral coefficient (MFCC) parameters that reflect the physical properties of the vocal tract. Predicted degradation of the system’s performance with increasing age is apparent. The reduction in performance is significant, varies between speakers, and is striking in longer delays for all speakers. 3 The degradation in strength of evidence for acoustic data from monophthongs and formant dynamic coefficients, as well as that for the ASR system, demonstrates that aging presents a real problem for forensic analysis in non-contemporaneous cases. Furthermore, aging also presents issues for speech databases for the purpose of assessing strength of evidence, where further research into distributions of parameters in different age groups is warranted. 4 Contents Abstract ............................................................................................................................................... 3 List of Figures ...................................................................................................................................... 8 List of Tables ..................................................................................................................................... 11 List of Abbreviations ......................................................................................................................... 13 Acknowledgements ........................................................................................................................... 14 Declaration ........................................................................................................................................ 14 1 Introduction ............................................................................................................................. 15 1.1 Forensic Speech Science ................................................................................................... 15 1.1.1 Forensic speaker comparison ................................................................................... 15 1.1.2 The speaker, the voice and speech evidence ........................................................... 16 1.1.3 The forensic condition .............................................................................................. 17 1.1.4 Desirable criteria for FSC parameters ....................................................................... 20 1.1.5 Summary of features used in FSC ............................................................................. 21 1.1.6 Applications ............................................................................................................... 23 2 Research Review ...................................................................................................................... 25 2.1 Longitudinal Research ....................................................................................................... 25 2.1.1 Sociolinguistic research ............................................................................................. 25 2.1.2 Social factors and individual change ......................................................................... 34 2.1.3 Longitudinal forensic research .................................................................................. 50 2.1.4 Physiological research ............................................................................................... 53 2.1.5 Summary ................................................................................................................... 71 2.1.6 Questions .................................................................................................................. 72 2.2 Formant Dynamics ............................................................................................................ 72 2.2.1 Theoretical model ..................................................................................................... 73 2.2.2 Research .................................................................................................................... 76 2.2.3 Questions .................................................................................................................. 81 2.3 Expressing the strength of forensic speech evidence ....................................................... 82 2.3.1 Context ...................................................................................................................... 82 2.3.2 The likelihood ratio approach ................................................................................... 87 2.3.3 LR-based conclusions in Research ............................................................................. 94 2.3.4 Benefits ..................................................................................................................... 95 2.3.5 Issues ......................................................................................................................... 98 2.3.6 Questions ................................................................................................................ 112 2.4 Research Questions ........................................................................................................ 113 3 Method ................................................................................................................................... 115 3.1 Subjects ........................................................................................................................... 115 3.2 Preparation ..................................................................................................................... 116 3.3 Summary of materials ..................................................................................................... 117 3.3.1 Samples ................................................................................................................... 117 3.3.2 Technical quality of recordings ............................................................................... 117 3.3.3 Criticisms of RT studies and this dataset ................................................................ 118 3.3.4 Establishing intra-speaker variability: short-term non-contemporaneity .............. 119 3.3.5 7 Up Video ............................................................................................................... 119 3.4 Parameter extraction ...................................................................................................... 119 3.4.1 Equipment ............................................................................................................... 119 3.4.2 Features analysed ................................................................................................... 120 5 3.5 Statistical analyses .......................................................................................................... 130 3.5.1 Monophthong formants ......................................................................................... 130 3.5.2 Diphthong dynamics ............................................................................................... 131 3.5.3 Likelihood