Statistical Pattern Recognition for Audio Forensics -- Empirical

Statistical Pattern Recognition for Audio Forensics { Empirical Investigations on the Application Scenarios Audio Steganalysis and Microphone Forensics Dissertation zur Erlangung des akademischen Grades Doktoringenieur (Dr.-Ing.) angenommen durch die FakultätfürInformatik der Otto-von-Guericke-UniversitätMagdeburg von: Dipl.-Inform. Christian Krätzer geboren am 12. Dezember 1978 in Halberstadt Gutachter: Prof. Dr. Jana Dittmann, Otto-von-Guericke UniversitätMagdeburg, Magdeburg, Germany Prof. Dr. Stefan Katzenbeisser, Technische UniversitätDarmstadt, Darmstadt, Germany Prof. Dr. Scott A. Craver, Binghamton University, Binghamton, NY, United States of America Magdeburg, Germany 23. Mai 2013 Abstract Media forensics is a quickly growing research field struggling to gain the same acceptance as traditional forensic investigation methods. Media forensics tasks such as proving image manipulations on digital images or audio manipulations on digital sound files are as relevant today as they were for their analogue counterparts some decades ago. The difference is that tools like Photoshop now allow a large number of people to manipulate digital media objects with a processing speed far beyond anything imaginable in times when image, audio or video manipulation of analogue media was a hardware- and labour-intensive task requiring skill and experience. Currently, there exists a large number of research prototypes but only a small number of accepted tools that are capable of answering specific questions regularly arising in court cases, such as the origin (source authenticity) of an image or recording, or the integrity of a media file. The reason for this discrepancy between the number of research prototypes and the number of solutions accepted in court has to be sought in the very nature of most judicial systems: Judges have to distinguish between valid bases for evidence and expert testimonies on the one hand, and `junk science' on the other hand. In most cases, this distinction reflects a long and cumbersome process that relies on support from the respective scientific communities. Statistical pattern recognition (SPR) is a powerful methodology that has received a lot of attention in academia and industry since the 1930s. In many application fields, it makes possible efficient solutions for classification-based problems, such as the decision whether an e-mail is spam or not. Many current media forensics methods already implement SPR to solve very specific classification or decision problems. This thesis introduces an SPR-based solution concept for audio signal analyses that leave the narrow tracks of such specific forensic methods. Instead, a general-purpose audio forensics concept is discussed, providing the basis for the verification of different authenticity and integrity aspects of audio signals. From the wide range of potential application scenarios that can be used to illustrate the significance of such an approach, audio steganalysis (the detection of hidden communication channels in audio data) and microphone forensics are chosen here to represent one research field focussed on integrity and one focussed on source authenticity. The introduction of such a currently unavailable generalised approach does not solve the issue of making all current SPR-based audio forensics analyses acceptable in court. This goal would be far too ambitious for a PhD project. But the author considers it to be an important step in helping the discourse between judges on the one hand and researchers working on media forensics approaches on the other hand. A positive outcome of this discourse will improve the chances of forensic science passing hurdles such as the Daubert criteria used by judges to keep `junk science' out of their courtrooms. The three research challenges for this thesis are: • Is there a generalised SPR approach (i.e. an adaptable solution concept) for media forensics that can be used for addressing multiple audio forensics investigation goals? • How can the usefulness of application scenario specific instantiations of a generalised audio forensics approach be measured? • Can instantiations of the generalised audio forensics approach be used to adequately implement the application scenarios of audio steganalysis and microphone forensics? In order to address these challenges, detailed methodology, concept and design considerations are made, leading to an approach that is highly flexible (i.e. considers components like classification algorithms to be exchangeable off-the-shelf products), deterministic, verifiable (integrating feature selection as a means of plausibility verification for the classification process) and easily adaptable to different application scenarios. Furthermore, the possibility of assessing the usefulness of implementations of application scenario specific instantiations of the introduced concept is discussed against the background of the Daubert criteria. iii The two instantiations of the introduced generalised approach selected as examples illustrate the in- stantiation process and the corresponding performance assessment. In the substantial empirical investigations performed in audio steganalysis, detection performances similar to those presented in the state-of-the-art in specialised audio steganalysis solutions are achieved. For the microphone forensics scenario, the approach introduced here is assumed to outperform the current state-of-the-art, as it shows similar detection performances while overcoming context limitations from which most of the specific approaches suffer. Despite the progress made in both application scenarios, which illustrates the significance of the introduced approach, it has to be admitted that the results discussed here still fall short of the ultimate (yet for a PhD thesis unrealistic) goal of providing audio forensic solutions acceptable for court proceedings. iv Deutschsprachige Version des Abstract Medienforensik ist ein derzeit schnell wachsendes Forschungsfeld, welches anstrebt, eine ähnlicheAkzep- tanz wie etablierte forensische Disziplinen zu erlangen. Aufgaben der Medienforensik, wie beispielsweise der Nachweis von Manipulationen an digitalen Bild- oder Tondaten, sind heute im selben Umfang relevant, wie es ähnlicheTechniken fürFotomaterial oder analoge Tonträgerzuvor waren. Der Unterschied zu diesen liegt in der Verfügbarkeit von Werkzeugen wie zum Beispiel Photoshop, die den Durchsatz bei Manipulationen erhöhen,während gleichzeitig das benötigteKnow-how sinkt. Derzeit gibt es eine Vielzahl von prototypischen Lösungenaus der Forschung, welche in der Lage wären,spezielle Fragestellungen der Medienforensik, wie zum Beispiel den Nachweis der Herkunft (Quellenauthentizität)oder der Integrität,zu führen. Allerdings gibt es nur wenige solche Lösungen, die auch vor Gericht Bestand haben. Der Grund fürdiese Diskrepanz liegt in der Natur der meis- ten Rechtssysteme: Richtern obliegt es, zwischen geeigneten Grundlagen fürdie Beweisführungund pseudowissenschaftlichen Ramschwissenschaften zu unterscheiden. Zumeist geht der Zulassung einer wissenschaftlichen Methode im Gericht ein langwieriger Zulassungsprozess voran, bei dem die Richter auf die Zuarbeit der entsprechenden Forschungsbereiche angewiesen sind. Statistische Mustererkennung (SPR, engl.: statistical pattern recognition) ist eine sehr mächtigeMe- thodik, die seit den 1930er Jahren in Forschung und Industrie eine starke Akzeptanz erfährt.In vielen Bereichen ermöglichtsie effiziente Lösungen fürklassifikationsbasierte Probleme, wie zum Beispiel die Frage, ob eine E-Mail Spam ist oder nicht. Viele aktuelle Ansätzeder Medienforensik nutzen diese Methodik fürdie Lösungspezieller Klassifikations- oder Entscheidungsprobleme. In dieser Dissertation wird ein SPR-basiertes Lösungskonzept vorgestellt, welches die ausgetretenen Pfade hochspezialisierter forensischer Methoden verlässtund ein Universalwerkzeug fürunterschiedliche forensische Analysen der Authentizitätund Integritätvon Audiodaten darstellt. Aus dem umfangreichen Pool an möglichen Einsatzszenarien fürein solches Universalwerkzeug werden hier zwei Beispiele zur Illustration der Prax- isrelevanz des vorgestellten Ansatzes vorgestellt. Dabei handelt es sich um die Audiosteganalyse (die Erkennung von verdeckten Kommunikationskanälenin Audiodaten) und die Mikrofonerkennung. Durch diese Auswahl wird jeweils eine Lösungfürdie Verifikation der Integritätvon Audiodaten und eine Lösungfürdie Verifikation der Quellenauthentizitätbetrachtet. Die Einführungeines solchen, derzeitig fehlenden, generalisierten Ansatzes wird nicht automatisch alle derzeitig auf statistischer Mustererkennung basierenden Verfahren der Medienforensik fürAudiodaten vor Gericht verwertbar machen. Dies wäreein viel zu ambitioniertes Unterfangen füreine einzelne Dis- sertation. Allerdings betrachtet der Autor die hier durchgeführtenArbeiten als einen wichtigen Schritt im Diskurs zwischen Richtern einerseits und Forschern im Bereich der Medienforensik andererseits. Ein positiver Ausgang dieses Diskurses sollte die Chancen dafürerhöhen,dass Forschungsergebnisse erfol- greich die Hürdennehmen, welche von Richtern genutzt werden, um Pseudowissenschaften aus den Gerichtssälenfernzuhalten (zum Beispiel die Daubert-Kriterien der US-Rechtsprechung). Die drei wissenschaftlichen Herausforderungen, die in dieser Arbeit aufgegriffen werden, sind: • Gibt es einen generalisierbaren SPR-Ansatz (d. h. ein anpassbares Lösungskonzept) im Bereich der Medienforensik, welcher verwendet werden kann, um

Statistical Pattern Recognition for Audio Forensics -- Empirical

Steganography- a Data Hiding Technique

Steganography & Steganalysis

Steganography: Forensic, Security, and Legal Issues

Steganalysis Techniques and Comparison of Available Softwares

Crime-Scene-Technician-Curriculum-Guide

Government Response to the Consultation on New Statutory Powers for the Forensic Science Regulator

Forensic Audio Analysis 612-637 Forensic Video Analysis 638-651 Imaging 652-687 Digital Evidence 688-743

Forensic Science 1 Forensic Science

A Short Note on Forensic Science

A Forensics Software Toolkit for DNA Steganalysis

Forensic Analysis of Video Steganography Tools Thomas Sloan and Julio Hernandez-Castro School of Computing, University of Kent, Canterbury, Kent, United Kingdom

Cybersecurity Professional Penetration Tester