Detecting Deception in Speech
Frank Enos

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY
2009

© 2009 Frank Enos
All Rights Reserved

ABSTRACT

This dissertation describes work on the detection of deception in speech using the techniques of spoken language processing. The accurate detection of deception in human interactions has long been of interest across a broad array of contexts and has been studied in a number of fields, including psychology, communication, and law enforcement. The detection of deception is well-known to be a challenging problem: people are notoriously bad lie detectors, and no verified approach yet exists that can reliably and consistently catch liars. To date, the speech signal itself has been largely neglected by researchers as a source of cues to deception. Prior to the work presented here, no comprehensive attempt has been made by speech scientists to apply state-of-the-art speech processing techniques to the study of deception. This work uses a set of features new to the deception domain in classification experiments, statistical analyses, and speaker- and group-dependent modeling approaches, all designed to identify and employ potential cues to deception in speech.

This dissertation shows that speech processing techniques are relevant to the deception domain by demonstrating significant statistical effects for deception on a number of features, both in corpus-wide and subject-dependent analyses. Results also show that deceptive speech can be automatically classified with some success: accuracy is better than chance and considerably better than human hearers performing an analogous task. The work also examines speaker and group differences with respect to deceptive speech, and we report a number of findings in this regard.
We provide a context for our work via a perception study in which human hearers attempted to identify deception in our corpus. Through this perception study we identify a number of previously unreported effects that relate the personality of the hearer to deception detection ability. An additional product of this work is the CSC Corpus, a new corpus of deceptive speech.

Contents

I Preliminaries
  1 Introduction
    1.1 Goal
    1.2 Scope
    1.3 Approach
  2 Previous Research on Deception
    2.1 Theory
    2.2 Empirical Studies of Deceptive Speech
    2.3 Detection Technologies
    2.4 Previous Work on Individual Differences in Deception
      2.4.1 Individual differences and non-verbal cues
      2.4.2 Individual differences and speech cues
    2.5 Perception Studies
    2.6 Conclusions
  3 Columbia-SRI-Colorado (CSC) Corpus
    3.1 Rationale for collecting
    3.2 The corpus
      3.2.1 Method
      3.2.2 Example dialogs
      3.2.3 Recording and labeling
    3.3 Feature extraction
      3.3.1 Acoustic and prosodic features
      3.3.2 Lexical features
      3.3.3 Subject-dependent features
    3.4 Discussion

II General Analysis and Classification
  4 Statistical Analysis
    4.1 Statistical Methods
    4.2 Binary Lexical Features
    4.3 Numerical Features
      4.3.1 Results and discussion
    4.4 Conclusions
  5 Analysis and Classification on the Local Level
    5.1 Preliminary Analyses
    5.2 Preliminary Local Lie Classification With Ripper
    5.3 Local Lie Classification Using Combined Classifiers
      5.3.1 Data
      5.3.2 Prosodic-lexical SVM system
      5.3.3 Acoustic GMM system
      5.3.4 Combiner SVM system
        5.3.4.1 Results
      5.3.5 Prosodic System from Recognized Words
    5.4 In-depth Machine Learning Experiments
      5.4.1 Performance Metrics
      5.4.2 The Base feature set
      5.4.3 The Base + Subject-dependent feature set
      5.4.4 The All feature set
      5.4.5 The Best 39 feature set
      5.4.6 Discussion
    5.5 Conclusions
  6 Classification of Global Lies
    6.1 Global Lies Via Critical Segments
    6.2 Methods and Materials
      6.2.1 Selection of critical segments
      6.2.2 Coping with skewed class distributions
    6.3 Results and Discussion
      6.3.1 Relevant features
      6.3.2 Other observations
    6.4 Conclusions and Future Work

III Speaker and Group Dependent Analyses
  7 Motivations
    7.1 Previous Work
    7.2 Exploratory Analyses
      7.2.1 Methods
      7.2.2 Observations
  8 Speaker-Dependent Statistical Analyses
    8.1 Statistical Methods
    8.2 Results on Binary Features
      8.2.1 Discussion
    8.3 Results on Numeric Features
      8.3.1 Discussion of feature classes
      8.3.2 Towards a visualization of speaking styles
    8.4 Conclusions
  9 Group and Subject Dependent Modeling
    9.1 Subjects Grouped by Gender
    9.2 Subjects Grouped by Graph-derived Clusters
    9.3 Another Approach to Speaker Similarity
      9.3.1 Discussion
    9.4 Speaker-dependent Models
    9.5 Conclusions and Future Work

IV Human Deception Detection
  10 Human Deception Detection and the CSC Corpus
    10.1 Previous Research
    10.2 Procedure
    10.3 Results on Deception Detection
      10.3.1 Additional findings
    10.4 Predicting Detectability of the Speaker
      10.4.1 Materials and methods
      10.4.2 Three-class prediction
      10.4.3 Two-class prediction
      10.4.4 Discussion
    10.5 The Personality of the Hearer: Effects on Performance
      10.5.1 Materials and methods
      10.5.2 Results
    10.6 Conclusions

V Conclusions
  11 Conclusions
    11.1 Summary of Findings
      11.1.1 Statistical Analyses
        11.1.1.1 Binary lexical features
        11.1.1.2 Numerical features
      11.1.2 Classification of local lies
      11.1.3 Classification of global lies
      11.1.4 Speaker dependent analyses
        11.1.4.1 Binary features
        11.1.4.2 Numeric Features
      11.1.5 Group dependent classification
      11.1.6 Human performance at classifying the CSC Corpus
    11.2 Contributions
    11.3 Implications for Practitioners
    11.4 Future Work

VI Bibliography
  Bibliography

VII Appendices
  A Protocol
    A.1 Subject Introduction
      A.1.1 Biographical Questions
    A.2 Tasks
    A.3 Interview Instructions
      A.3.1 Scores
      A.3.2 Interview Process
      A.3.3 Interview Preparation
    A.4 Debriefing
    A.5 Biographical Questionnaire
  B Pre-test Questions
    B.1 Interactive Tasks
      B.1.1 Easy
      B.1.2 Difficult
    B.2 Musical
      B.2.1 Easy
      B.2.2 Difficult
    B.3 Survival / first aid (easy and difficult)
    B.4 Food and Wine Knowledge
      B.4.1 Easy
      B.4.2 Difficult
    B.5 Geography of New York City
      B.5.1 Easy
      B.5.2 Difficult
    B.6 Civics
      B.6.1 Easy
      B.6.2 Difficult
  C Features
    C.1 Notes

List of Figures

  3.1 Photograph of the Interview Setting
  4.1 Boxplots of significant numerical features: 1
  4.2 Boxplots of significant numerical features: 2
  4.3 Boxplots of significant numerical features: 3
  5.1 Local Lie Performance: Base Features
  5.2 Local Lie Performance: Base + Subject-dependent Features
  5.3 Local Lie Performance: All Features
  5.4 Local Lie Performance: 39 Features from Base + Subject-dependent
  5.5 Local Lie Performance: Best Learners for Feature Sets
  7.1 Counts of Significant Logistic Regression