Masaryk University Faculty of Arts Department of Psychology

Monika Kupcová

Ocular-Motor Methods for Detecting Deception: Effect of Countermeasures.

Supervisor: prof. PhDr. Tomáš Urbánek, Ph.D. Consultant: John C. Kircher, Ph.D.

2017

I hereby declare that I wrote this thesis on my own. I have acknowledged all sources used and have cited these in the reference section.

In Brno, 30.11.2017 ……………………………… Monika Kupcová

ACKNOWLEDGEMENTS

I would first like to thank Professor John C. Kircher, who has been a wonderful mentor. I am beyond grateful for his immense knowledge and for his guidance through the writing of this thesis: encouraging my ideas, helping me with the software, guiding me through the data analysis, and much more. I have asked him countless questions and I cannot thank him enough for every answer I was given.

I would like to thank Professor Tomáš Urbánek for his support of my ideas from the very beginning. The door to his office was always open whenever I ran into a trouble spot or had a question.

I must express deep gratitude to my parents for providing me with unfailing support and continuous encouragement throughout my years of study and through the process of writing this thesis. This would not have been possible without them. Thank you.

Last, but not least, I would like to thank all the participants who took part in the experiment and all the undergraduate students who served as the secretary confederates in the experiment.

Table of Contents

INTRODUCTION
I. THEORETICAL PART
Deception detection
Lay people vs. professionals
Application of deception detection
History of deception detection
Addressing the Ground Truth Issue
Approaches to detect deception
1.5.1 Physiological lie detection: The Polygraph
1.5.2 Neuroimaging-based lie detection
1.5.3 Verbal lie detection tools: Statement validity analysis, Reality monitoring and Scientific content analysis
1.5.4 Non-verbal lie detection
The Ocular-Motor Deception Test (ODT)
Vision
2.1.1 Anatomy and physiology of the eye
Physiological and psychological bases of pupil dilation
2.2.1 Light reflex
2.2.2 Fatigue
2.2.3 Pain
2.2.4 Startle response and emotional arousal
2.2.5 Cognitive load
Deception and cognitive load
Pupil dilation during deception: arousal vs. cognitive load
Reading behaviors during deception
Countermeasures
Current research on the ocular-motor methods for detecting deception
II. EMPIRICAL PART
Present study
METHODS
Overview of the Design
Participants
Procedures
Materials
2.4.1 The Ocular-Motor Deception Test
2.4.2 The post-test questionnaire
2.4.3 The information documents
Apparatus
Data Collection and Analysis
RESULTS
Pupil measures
Countermeasures
Additional results
DISCUSSION
Limitations of the study
Implications and future directions
Conclusion
References

INTRODUCTION

Several government agencies and private companies routinely conduct credibility assessments to test job applicants and screen current employees. In many countries, criminal investigators also have the option of subjecting suspects to a polygraph examination. In the Czech Republic, for instance, one center for the physiological detection of deception has operated within police investigations since 1984: Kriminalistický ústav Praha. Apart from the polygraph, voice analysis is used to assess credibility, and together the two are called the “Physio-detection examination” (Gillernová & Boukalová, 2006).

The polygraph is the most widely used method of credibility assessment (Honts, 2014). However, the National Research Council (NRC), in a review of the scientific evidence on the polygraph, raised several criticisms of polygraph examination, especially its use for pre-employment screening. The NRC highlighted the need for “an expanded research effort directed at methods for detecting and deterring major security threats, including efforts to improve techniques for security screening . . .” (National Research Council, 2003, p. 8).

More recently, a cognitive approach to detecting deception has become relevant. The notion is that lying requires more cognitive effort than telling the truth (e.g., Vrij et al., 2008; Vrij, Granhag, & Porter, 2010). Vrij (2004) suggested that the right question to ask when we try to reveal deception in practice might not be “Is the person lying?”. As discussed throughout this paper, lying is cognitively more demanding than truth-telling. Instead, therefore, we need to consider asking “How hard is the person thinking?”

Cook and her colleagues (2012) introduced a new method to detect deception, the Ocular-Motor Deception Test (ODT). The ODT evaluates pupillary responses and reading behavior that occur while a person reads and responds to statements about their possible involvement in a mock crime. This method has the potential to replace polygraph examinations in security screenings, as it is fast, non-intrusive and relies on cognitive indicators of deception rather than solely on physiological indicators of sympathetic activation.

The issue in any kind of credibility assessment is that there is usually a lot at stake for the person being interviewed. Naturally, people attempt to develop techniques and strategies that would lead the examiner to conclude they are innocent. Deliberate techniques used by guilty people to defeat a deception test are called ‘countermeasures’ (Gudjonsson, 1983). Honts (2014) also lists two other countermeasures that are not as deliberate: Spontaneous countermeasures (SC) and Information countermeasures (IC). SC occur without previous preparation or planning; they are strategies or techniques the person develops during the testing. IC comprise any information the person has obtained about the credibility assessment and ways to defeat it. Research on polygraph countermeasures generally finds neither SC nor IC effective in helping guilty subjects appear truthful or at least produce an inconclusive result.

It is of great value for credibility assessment research to reveal countermeasures, their use, and their effectiveness. To date, there is no research demonstrating the effectiveness of any countermeasures against the ODT. Apart from adapting the ODT to the Czech environment and Czech examinees, the goal of the proposed research is to reveal how the use of certain countermeasures interacts with deception and pupil dilation.


I. THEORETICAL PART

Deception detection

Throughout this paper, deception is considered to be an attempt to convince someone else of something the liar believes is not true. It involves the unexpressed intention of the liar to mislead. Although some scholars consider deceiving and lying to be terms describing two different behaviors (e.g., Bok, 1978), in this paper the terms are used interchangeably.

Deception is a larger part of our lives than we believe, or than we would like to admit. DePaulo, Kashy, Kirkendol, Wyer, and Epstein (1996) attempted to answer the questions “How often do people lie? What do they lie about? To whom do they tell their lies?”. They defined lies as a behavior that serves everyday social interaction functions, such as self-presentation and emotion regulation. They conducted two studies, each with approximately 70 participants. In the first study the participants were college students (age M = 18.69 years, SD = 0.91 years) and in the second study the participants were recruited from a community college (age M = 34.19 years, SD = 12.49 years). All participants were asked to keep a diary for 7 days, recording all their social interactions and the lies they told other people. All participants lied every day, in the first study on average twice a day and in the second study on average once a day. About 80% of the lies people told were, at least partially, about themselves. Most of the lies, however, were considered not serious, and the participants did not expect to be caught lying; on the contrary, they expected to be believed.

Most of the lies people tell are not serious or harmful to the receiver. We usually tell other people “social lies” such as “I really like your new dress” because we simply do not want to hurt them. If we were honest instead, conversations and social interactions would sometimes become very awkward. For example, when we receive a gift that is not exactly what we wanted or expected, it would make the giver very sad or upset if we were honest and said we disliked the gift. Exchanging compliments benefits our relationships with other people, and people generally like to be liked by others and to have good relationships (Aron, Dutton, Aron, & Iverson, 1989).

Sometimes, though, the lies people tell are more serious than just a mock compliment. In a criminal investigation, for instance, deception is a question of freedom for the culprit and a question of justice for the victim. It is important to know whether a person applying for a job has the competence he or she claims to have, or whether a person applying for a job at a police department is really not taking any drugs. Being able to distinguish between a lie and the truth would greatly benefit individuals as well as society.


Lay people vs. professionals

How good are people at detecting lies when they cannot rely on any technology or measurement? A meta-analysis of 206 studies in which people attempted to judge credibility in real time with no special aids or training revealed that people are correct in 54% of cases on average (Bond & DePaulo, 2006). Considering that a simple guess would be correct in 50% of cases, these results are not particularly impressive. Generally, people were more successful in classifying truths (61%) than lies (47%). Interestingly, Bond and DePaulo (2006) found that people are more accurate in judging audible than visible lies, and that people appear deceptive when motivated to be believed.

A different meta-analysis focused on the relation between judges’ accuracy in deception detection and their confidence in the decision. That relation was not found to be significantly different from zero. In conclusion, people not only detect deception just above the chance level, but they also have no objective insight into the accuracy of their judgments (DePaulo, Charlton, Cooper, Lindsay, & Muhlenbruck, 1997).

Another meta-analysis of 247 studies conducted by Bond and DePaulo (2008) focused on individual differences in judging deception. The study revealed that inter-individual differences in the ability to detect deception are trivial. Generally, people differentiate truths from lies at close to the chance level, and this ability does not differ across groups of individuals. It is commonly believed that people who attempt to detect deception on a daily basis (e.g., law enforcement officials, legal professionals) would be better trained to spot lies and therefore more successful in discriminating truths from lies (Garrido, Masip, & Herrero, 2004). Unfortunately, the results of the previously cited meta-analysis show otherwise; the authors conclude that the ‘lie experts’: “On the average achieve less than 55% lie–truth discrimination accuracy. In any case, experts’ apparent superiority in lie–truth discrimination disappears when means are statistically adjusted.” (Bond & DePaulo, 2006, p. 229).

In conclusion, research shows that people are mediocre at detecting lies. In the following chapters, a fundamental issue of deception detection will be discussed: there is no direct indicator of lying. DePaulo et al. (2003) found in their meta-analysis that only a few behaviors covary with deception, and even those behaviors are only weakly associated with lying. Therefore, people may fail to detect deception accurately because they rely on cues that are in fact not indicators of deception. Typically, people rely on the widespread belief that liars move restlessly, make errors in speech, make specific gestures or avert their gaze.

Rather than deception, though, these are indicators of stress and discomfort, which are not specific to deception. In other words, perceivers misinterpret signs of pressure or stress as indicators of lying (Hartwig, Granhag, & Luke, 2014).

Application of deception detection

For a variety of important and usually serious reasons, we often need to assess the credibility of others. A credibility assessment may be conducted by law enforcement, immigration authorities, government agencies or private companies. It may be used for screening job applicants as well as current employees, and in intelligence and security clearance assessments. Classifying an individual as truthful or deceptive may sometimes be a question of national security and may have consequences for individuals and society (Cook et al., 2012).

Currently, deception is most often assessed by polygraph, which measures the physiological responses of an individual while he or she is asked relevant questions; heightened emotional arousal is interpreted as deception. Though the polygraph is widely used in the USA and Israel, for instance, it is not so commonly used and accepted in Europe. Physiological detection of deception is not used at all in criminal investigations in several European countries (Ireland, Estonia, Greece, Portugal, Spain, Italy, Austria, Luxembourg and France) (Kalvodová & Hrušáková, 2015). In the Czech Republic, according to the law (§89 odst. 2 TrŘ), anything that would help clarify the events related to a crime may be used as evidence in court, with the exception of torture and illegal interception. In 1993, however, the Czech Supreme Court issued a decision (R 8/1993) that results obtained from a lie detector cannot be used as evidence in criminal investigations.

History of deception detection

Throughout history, deceptive behavior has been described in terms of the physiological responses of the liar. More specifically, it was the fear response that was tracked. For instance, the Greek physician and anatomist Erasistratus (300-250 BC) attempted to detect lies by measuring the liar’s pulse (Trovillo, 1939).

In the Common Era, deception detection was long attempted through ordeal (subjecting the accused to severe pain), survival of which was accepted as divine proof of innocence. In that case, the accuser did not search for clues of deception in the physiological responses or the face of the accused. As ordeal arose from superstition and faith, it was common for people to request an ordeal themselves to prove their innocence. For the tribes of Rajmahal in the north of Bengal, such proof of innocence was to touch a red-hot iron with the tongue nine times. If burnt before the ninth attempt, the accused was killed as guilty. Perhaps guilt would dry the mouth so that the tongue burned easily, although fear would have the same effect. The accused were also forced to carry the hot iron, which raises doubts as to whether the ordeal was in any way based on observations of the physiology of deception. Another, somewhat similar ordeal is the rice-chewing ordeal. The idea was borrowed by the Spanish Inquisition from Indian traditions. Supposedly, the guilty, conscious of their crime and scared of being exposed, are not able to swallow a single grain of rice. As with the hot iron ordeal, the idea behind it was probably that guilt reduces the production of saliva (Trovillo, 1939).

Significant progress has been made since the nineteenth century. It is important to realize that deception detection has never been the first act in the growth of an idea; it came through the application of methods developed primarily to assess something else. For example, in the nineteenth century the first suitable apparatus for measuring blood pressure was invented. The foundations were laid by Mosso, an Italian physiologist. He did not study deception, but he conducted broad studies on (among other things) the influence of fear on blood pressure, respiration and the circulation of blood in the brain. Mosso also made observations on pallor, blushing, respiration, trembling and facial expressions produced during a fearful situation. Since fear was regarded as the key element of deception, his work should not go unnoticed. First, he used a "plethysmograph" (an instrument for measuring blood volume and pulse changes from within the body) to collect data, and later, in 1895, he described a new method of assessing blood pressure from the outside, giving credit for the device to someone else (Trovillo, 1939).
An outstanding contributor to the criminology of that era was certainly Lombroso, mentor of Mosso. Although some of his contributions are not accepted from today’s point of view, he conducted a great number of experiments drawing on the findings of his predecessor. Several times he helped the police to identify a criminal suspect through the use of blood pressure and pulse. A brief review of his contributions can be found in the book by his daughter, Gina L. Ferrero (1911), notably a record of an investigation her father helped to solve. In 1902, a six-year-old girl was found dead and a coachman named Tosetti was arrested as a suspect. Lombroso concluded that he was not a born criminal (due to his physical and psychological characteristics) but went further: “To obtain stronger proof, my father adopted the plethysmograph and found a slight diminution of the pulse when Tosetti was set to do a sum; when, however, skulls and portraits of children covered with wounds were placed before him, the line registered showed no sudden variation, not even at the sight of the little victim's photograph. The results of the foregoing examination proved conclusively that Tosetti was innocent of a crime.” (Ferrero, 1911, p. 265).

During the first half of the twentieth century, a variety of techniques were developed to measure respiration, blood pressure and pulse characteristics. In 1914, Benussi, in his paper Die Atmungssymptome der Lüge (as cited in Trovillo, 1939, p. 870), reported his findings on detecting deception by reading respiratory curves. Notably, Marston (1917) wrote a comprehensive work on systolic blood pressure symptoms of deception. In 1921 he published an article called Possibilities in the Deception Tests, in which he names the four ways to detect deception known at the time: (i) measuring electrical body currents with a galvanometer, which he states is of very little value in detecting deception as it registers every experienced emotion; (ii) association reaction time to words crucial to the crime; (iii) Benussi’s breathing test, measuring the length of inhalation and exhalation before and after making a statement; and (iv) measuring systolic blood pressure.

Inspired by Marston’s article, John Augustus Larson, a young medical student at the University of California, Berkeley, employed by the Berkeley Police Department, used the blood pressure test to investigate thefts in girls’ dormitories at the University of California. Later in 1921, he published an article called Modification of the Marston Deception Test describing how he helped to solve the crime. Larson assembled a polygraphic apparatus measuring respiratory changes, blood pressure, and heart rate, and employed it in the investigation. There were four suspects in the theft, but twelve people were tested overall. The records were uniform with one exception, a girl with marked effects both in the respiratory and in the blood-pressure curve. This record was, however, not completed, as she abandoned the examination; she confessed later on.

Current approaches to detecting deception, along with the modern form of the polygraph, will be introduced in chapter 1.5.

Addressing the Ground Truth Issue

The issue of deception detection research in laboratory versus field studies was discussed, for instance, by Vrij and Verschuere (2013). In laboratory studies, participants are always clearly instructed whether to lie or be truthful; they know that lies will be forgiven, and the consequences are not significant. In field studies, real-life cases are investigated, and a lot is at stake for the examinees. The issue is that in real life, unlike in laboratory studies, we often do not know for certain whether the examinee is lying or not: we do not know the ground truth. This is related to the obstacle that lies have no directly observable indicator; neither does the truth. In field studies, especially with a polygraph, a confession is usually used as the ground truth, which has been criticized because the confession usually results from the credibility assessment situation itself. Patrick and Iacono (1991) warn that using confessions as the ground truth may lead to overestimation of polygraph accuracy.

It is difficult to conduct field studies on deception detection, given that they usually concern sensitive and personal topics. Two field studies testing the accuracy of the ODT were reported in Hacker, Kuhlman, Kircher, Cook, and Woltz (2014). The first study was conducted at an office of the US government. It was designed to reveal employees’ rule violations: (i) bringing phones into a sensitive facility where they were prohibited and (ii) unreported unofficial foreign travel. As in previously conducted laboratory studies on the ODT, subjects were seated in front of a monitor with an eye-tracking device attached and responded to neutral items and two sets of relevant items, covering the cell phones and the unreported travel. All 94 federal employees were informed that if they passed the test, they would receive one hour of paid time off in addition to normal pay during their participation. After they were administered the ODT, they were asked to report the truth regarding their violations in a post-test questionnaire. Hence, the ground truth was the participant’s self-identification as innocent (n = 43) or guilty (n = 51). This questionnaire was, however, confidential and was emailed only to the researchers; the employer was not informed about any violations. A linear discriminant analysis correctly classified 36 of the 43 innocent participants (83.7%) and 37 of the 51 guilty participants (72.5%). Overall, the discriminant function analysis correctly classified 77.7% of the participants.

The second field study reported in Hacker et al. (2014) was conducted at the Latin American Polygraph Institute in Bogota, Colombia, on 104 job applicants. The ODT examined personal history of drug use and falsification of academic background information. The ground truth on academic background was obtained by background checks and suggested that all the participants were truthful. Drug tests and admissions during interviews indicated that 30 examinees were guilty of drug use within 30 days of testing. In this field study, though, the response error rate was extremely high (24% compared to 3-11% in other studies) and none of the oculomotor measures discriminated between truthful and deceptive participants. The authors discussed a number of reasons for the unexpected results: first, a high attrition rate (of the initial 341 participants only 104 remained, due to unusable eye-tracking data or unobtainable ground truth), and, more importantly, issues with the reading ability of the participants. Hacker et al. conclude that adequate reading ability is a prerequisite for the ODT to discriminate between deceptive and truthful subjects.
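
The classification rates reported for the first field study follow directly from the counts given above. The short Python sketch below recomputes them as a check on the arithmetic; the function and variable names are illustrative and are not taken from Hacker et al. (2014).

```python
def percent_correct(correct, total):
    """Return the share of correctly classified cases as a percentage."""
    return 100.0 * correct / total

# Counts reported for the first field study (Hacker et al., 2014)
innocent_correct, innocent_total = 36, 43
guilty_correct, guilty_total = 37, 51

innocent_rate = percent_correct(innocent_correct, innocent_total)
guilty_rate = percent_correct(guilty_correct, guilty_total)
overall_rate = percent_correct(innocent_correct + guilty_correct,
                               innocent_total + guilty_total)

print(f"Innocent: {innocent_rate:.1f}%")  # 83.7%
print(f"Guilty:   {guilty_rate:.1f}%")    # 72.5%
print(f"Overall:  {overall_rate:.1f}%")   # 77.7%
```

The overall rate is the number of correct classifications in both groups divided by all 94 participants, so it is weighted by group size rather than being the simple mean of the two group rates.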

Approaches to detect deception

Deception as such has no external indicator that we could directly observe. Therefore, we can only rely on indirect indicators of lying. Four general approaches to detecting deception were identified by Granhag, Vrij and Verschuere (2015): we can (i) measure someone’s physiological responses, (ii) observe behavior, (iii) analyze someone’s speech or (iv) measure brain activity. All of these measures are associated with deception; in other words, each is an indirect indicator of deception in its own way. Unfortunately, scientists have not yet revealed a measurable signal produced by all liars every time they deceive. In their influential study, Zuckerman and his associates (1981) admit that there is no behavior or set of behaviors that always occurs when people lie and never occurs when they are being truthful. Instead, they suggested four cues that should be used to detect deception: arousal, feelings while lying, cognitive aspects of deception, and attempted control of verbal and non-verbal behaviors.

Some of the current approaches to detecting deception will be briefly introduced in the following text. For a more thorough overview of recent methods, see Granhag et al. (2015) or Raskin, Honts and Kircher (2014). In this paper, approaches to deception detection will be classified as proposed by Vrij and Ganis (2014):

➢ Physiological lie detection
➢ Neuroimaging-based lie detection
➢ Verbal lie detection
➢ Non-verbal lie detection

These four approaches will be described briefly. The polygraph, however, will be described in greater detail than the other methods, because the ODT was developed partly in opposition to it, or as a kind of remedy for its imperfections. Naturally, the Ocular-Motor Deception Test itself will be covered in the greatest detail in chapter 2, as it is crucial for this proposed research.


1.5.1 Physiological lie detection: The Polygraph

Of all the approaches, the polygraph has the longest tradition; it was first described about 100 years ago (Marston, 1917; Larson, 1921) and is still used today in a modified form. The name ‘polygraph’ is nowadays often used as a synonym for ‘lie detector’. The word itself derives from the Greek words poly, which means ‘many’, and graph, which means ‘write’. It refers to the way the recording device originally looked: a briefcase-sized instrument registering physiological responses by means of a number of pens writing on a long roll of paper. Today, the polygraph is reduced to a small amplifier/digitizer combined with a laptop to record the data (Granhag et al., 2015). According to the National Research Council, the polygraph is widely used in the USA, but also in other countries, notably Canada, Israel and Japan, for three main purposes:

➢ Preemployment screening in law enforcement and preemployment or preclearance screening in agencies involved in national security
➢ Screening current employees, especially in security-sensitive occupations
➢ Investigation of specific events, for instance criminal cases

The polygraph measures physiological responses that covary with certain psychological processes such as stress; as noted previously, deception can only be measured indirectly. In the case of the polygraph, the indirect index of deception is emotional reactions measured through physiological responses. These reactions are generally measured by three kinds of sensors attached to the subject: (i) two elastic straps positioned around the subject’s thorax and abdomen, measuring respiration, (ii) an inflatable cuff (sphygmomanometer) attached to the upper arm, measuring blood pressure, and (iii) two electrodes positioned on the fingers or the palm of the hand, measuring galvanic skin response (electrodermal activity) (Granhag et al., 2015). According to Kircher and Raskin (2002), skin conductance is the most accurate of the traditional physiological indicators of deception in both laboratory and field studies.

There are many types of polygraph tests; four of the most popular ones are described in detail by Vrij (2000). The oldest polygraph procedure is the Relevant/Irrelevant Technique (RIT), published by Larson (1932). In the RIT, all subjects are asked questions relevant to the crime, such as ‘Did you steal the money from the desk?’, and all examinees, both guilty and innocent, will answer ‘no’ unless they intend to confess to the crime. The irrelevant questions have nothing in common with the crime, and the examiner is certain that the subjects tell the truth when answering them: ‘Is it Monday today?’. Physiological responses to relevant and irrelevant questions are then compared; larger responses to crime-relevant questions than to irrelevant questions indicate that the examinee is deceiving. Research shows, however, that this method is flawed in its very premise: the crime-relevant question itself is more likely to provoke arousal than a question such as ‘Is it Monday today?’, and therefore the RIT is no longer recommended for use (Honts, 1991; Raskin, 1986).

Another type of polygraph test is the Comparison Question Test (CQT). Again, there are relevant questions specific to the crime. The comparison questions are designed to induce the subject to lie, and they are meant to embarrass both innocent and guilty subjects. Comparison questions usually cover a longer period of time, for example: ‘Have you ever tried to hurt someone to get revenge?’. The subject is told that admitting such things would lead to the conclusion that he or she has the kind of personality capable of committing the crime; this is, obviously, a misleading statement, but examinees are made to believe it. Therefore, examinees usually deny earlier wrongdoings on the comparison questions. The rationale is that innocent suspects will respond more strongly to the comparison questions, while the relevant questions are expected to elicit more arousal in guilty suspects (Iacono & Patrick, 1997). The CQT is mostly criticized because the examinees are misled and, even so, alternative explanations for the emotional arousal cannot be ruled out. Also, the CQT lacks standardization of the comparison questions, as they need to be adjusted to the investigated crime (Ben-Shakhar, 2002; Vrij, 2000).

Standardized comparison questions are used in the Directed Lie Test (DLT), where the comparison questions can be applied to all cases. A typical question would be: ‘During the first 27 years of your life, did you ever tell even one lie?’ In the DLT, the participants are instructed to answer ‘no’ to the directed lie questions. The rationale is similar to that of the CQT: innocent suspects are expected to respond more strongly to the directed lie questions than to the relevant ones (Raskin & Honts, 2002).

The last test mentioned in this paper is the Guilty Knowledge Test (GKT), sometimes called the Concealed Information Test (Ben-Shakhar & Elaad, 2003). It investigates whether an examinee possesses knowledge about a certain fact related to the crime. Suppose someone was killed with a knife, but only the police and the culprit are aware of what the murder weapon was. The examinee is presented with various weapons and asked whether he recognizes each as the murder weapon, while his physiological responses are recorded.


Presumably, the so-called guilty knowledge will produce a strong physiological response to the recognized murder weapon.
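
The logic of the GKT can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the response amplitudes are invented numbers, and the decision rule (the critical item must draw the single largest response) is a simplification chosen for illustration, not a scoring procedure taken from the cited literature.

```python
def critical_item_stands_out(responses, critical_item):
    """Return True if the critical item drew the largest response.

    responses: dict mapping each presented item to a response amplitude
    critical_item: the item only the culprit should recognize
    """
    largest = max(responses, key=responses.get)
    return largest == critical_item

# Invented amplitudes (arbitrary units), for illustration only
examinee_a = {"gun": 0.31, "rope": 0.28, "knife": 0.74, "poison": 0.35}
examinee_b = {"gun": 0.40, "rope": 0.37, "knife": 0.33, "poison": 0.42}

print(critical_item_stands_out(examinee_a, "knife"))  # True
print(critical_item_stands_out(examinee_b, "knife"))  # False
```

With four alternatives, an examinee without guilty knowledge would respond most strongly to the critical item about one time in four by chance alone, so a single question of this kind offers limited protection against a false positive.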

1.5.2 Neuroimaging-based lie detection

Neuroimaging-based lie detection has been studied for the past ten years. It presumes that a specific neural process occurs in the brain during deception and that such a process can be measured more directly than behavioral or peripheral psychophysiological processes (Ganis, 2015). Brain processes are most commonly measured by fMRI, a technique that uses quickly oscillating magnetic fields to detect moment-to-moment oxygen use and blood flow in currently active sections of the brain. The fMRI scan is taken while an individual engages in a task that simulates deception, so that we can determine whether there are brain processes specific to truthful and deceptive behavior. This allows us to observe the brain areas involved in the investigated activity with reasonable spatial and temporal resolution (Watson & Breedlove, 2012).

A mock crime study using fMRI to detect deception was conducted by Kozel and his colleagues (2005), who attempted to measure the brain correlates of lying in individuals. Participants (n = 31) took part in a mock crime in which they stole either a ring or a watch and were subsequently instructed to deny stealing either of those objects. The contrast between honest and deceptive responding was most obvious in three clusters of brain regions: the right anterior cingulate, the right inferior frontal and orbitofrontal cortex, and the right middle frontal cortex. The classification accuracy in this study reached 90%. Additional experiments were conducted, the most elaborate one by Kozel, Johnson, Grenesko et al. (2009a), but this time the accuracy reached only 69%, whereas Kozel, Johnson, Laken et al. (2009b) reported an accuracy of 86%.

Ganis (2015) concludes that the available evidence on neuroimaging-based lie detection does not provide sufficient information regarding classification accuracy. fMRI is not even remotely suitable for field use: it is extremely expensive and shows mixed results that depend on the study paradigm used. Further research is required to determine how well this promising technology will work in different settings and populations.


1.5.3 Verbal lie detection tools: Statement validity analysis, Reality monitoring and Scientific content analysis

Verbal lie detection tools are covered thoroughly by Vrij (2015). There are many verbal methods to detect deception; the three most frequently used in practice are introduced in this chapter: Statement validity analysis (SVA), Reality monitoring (RM) and Scientific content analysis (SCAN).

1. Statement validity analysis (SVA)

To date, the SVA is the most frequently researched verbal lie detection tool. It is also widely used in practice by criminal courts in some European countries. The method was created to assess the credibility of child witnesses’ testimonies in trials for sexual offences. Since there is very often no evidence other than the child’s testimony, and the accused adult typically provides contradictory testimony, it is crucial to examine the accuracy of such accusations (Vrij, 2015). The SVA consists of four consecutive parts (Vrij, 2008):

(i) A case-file analysis (information regarding the child, the event, previous statements of the child and other people who are involved).

(ii) A semi-structured interview (an attempt to obtain complete information regarding the event by interviewing the child, with the interviewer aiming to stay objective and not to suggest any statement).

(iii) A Criteria-Based Content Analysis (CBCA) that systematically assesses the quality of the transcribed interviews. It assumes that the content of a statement based on a genuine memory differs fundamentally from the content of a statement based on fantasy or invention (the Undeutsch theory). The CBCA evaluates 19 criteria that are likely to be present in truthful memories (e.g. logical structure, quantity of detail, reproduction of speech, unusual details).

(iv) An evaluation of the CBCA outcome via a set of questions (Validity Checklist), which investigates whether there might be other reasons why some of the CBCA criteria occurred (e.g. mental capability of the child).

To date, there is no reliable research available on the accuracy of the SVA, although it is widely used in courts. One of the obstacles to assessing SVA accuracy is the ground truth issue (discussed in chapter 1.4), as in the deception detection field in general. According to Vrij (2008), the accuracy of the CBCA alone is approximately 70%.


2. Reality monitoring (RM)

Reality monitoring is the process of determining whether an event actually happened or was created by the person’s imagination (Johnson & Raye, 1981). The RM is based on the idea that a real memory differs in quality from a memory based on imagination. Reality monitoring has a solid theoretical background and is therefore popular among scholars, although it has never been used in practice. The RM technique uses 7 criteria (e.g. sensory information, temporal detail, affect, cognitive operations) to assess the likelihood that a memory is based on reality (Vrij, 2015). According to Vrij (2008), the RM accuracy is approximately 70%.

3. Scientific content analysis (SCAN)

In contrast to Reality monitoring, Scientific content analysis is a very popular tool for detecting deception in practice, but only a few studies on its accuracy are available. In the SCAN procedure, the examinee is given a blank sheet of paper and asked to write in as much detail as possible about their activities during the critical period, so that even a reader with no previous knowledge of the events would fully comprehend what happened. The SCAN examiner then analyzes the text using criteria, some of which are thought to be more frequent in truthful statements than in deceptive ones; however, these criteria lack scientific justification, and the list of criteria used in SCAN is not fixed. Examples of the criteria are denial of allegations, spontaneous corrections and emotions (Vrij, 2015).

1.5.4 Non-verbal lie detection

There are four main theoretical frameworks for assessing non-verbal cues to deception (Bond, Levine, & Hartwig, 2015): leakage theory, four-factor theory, self-presentational theory and interpersonal deception theory. The leakage theory is introduced merely from a historical point of view. The remaining three theories, however, are influential in deception detection research. All three share the idea that cues to deception sometimes appear in the behavior or appearance of liars compared to truth tellers, and that three factors contribute to these cues: emotions, cognitive load and the use of strategies to appear truthful.

1. Leakage theory

This classic theory was introduced by Ekman and Friesen (1969) in their article ‘Nonverbal leakage and clues to deception’. Ekman drew partly on Freud, who was, however, more concerned with verbal leakage such as slips of the tongue and dreams.

Ekman and Friesen proposed that there are two types of deception: deceiving others and deceiving ourselves. According to their theory, deception leaks in the form of deception clues, and they describe various parts of the body where deception may leak out. The face, for instance, has the greatest nonverbal sending capacity of all. At the same time, deception and the clues to detect it adjust to the feedback the deceiver gets about his or her body parts. Because we get a lot of feedback on our facial expressions, deception, according to Ekman and Friesen, does not leak out of facial expressions as much as we might expect. We do not, on the other hand, get much feedback on the signals of our legs and hands, and these may therefore be a potential source of deception leakage. Bond, Levine and Hartwig (2015) conclude that Ekman ultimately shifted his focus to micro expressions, but his theories remain controversial.

2. Four-factor theory (Multifactor model)

Zuckerman, DePaulo and Rosenthal (1981) introduced three factors that may influence cues to deception: emotional reactions (fear, guilt or delight), cognitive load (lying is cognitively more demanding than truth telling) and attempted behavioral control (deliberate attempts to appear honest). The fourth factor, ‘arousal’, appears to overlap with the emotional reactions. Zuckerman and his colleagues hypothesize that there are no verbal or non-verbal behavioral clues to detect deception as such. Instead, deception is associated with the four psychological factors, which may be manifested in behavior.

3. Self-presentational theory

DePaulo (1992; see also DePaulo et al., 2003) offers a different point of view on deception. According to her theory, non-verbal behavior during deception is a form of self-presentation; people try to manage their non-verbal behaviors to achieve their self-presentational goals. According to DePaulo, both liars and truth tellers engage in strategies to appear credible. Bond, Levine and Hartwig (2015) pointed out that the self-presentational theory provides little guidance for improving deception detection in practice.

4. Interpersonal deception theory (IDT)

The interpersonal deception theory was first introduced by Burgoon and Buller (1996). It postulates that deception is a two-way process in which the liar and the receiver influence each other: deception is a dynamic, interactive process. The receiver provides feedback on the liar’s attempts to appear truthful, and the liar may then adjust his or her behavior to reduce suspicion. The deceiving person must manage multiple communication tasks simultaneously to appear truthful. The deceptive message includes not only the obvious content but also strategic actions that serve the intention to appear truthful, as well as unintentional behavior that may serve as a clue to deception.


A meta-analysis conducted by DePaulo et al. (2003) is the most comprehensive review addressing the consistency and strength of non-verbal cues indicating deception. They investigated 158 different behaviors or impressions, from 1338 estimates, that were used as cues to detect deception in 120 studies. DePaulo and her colleagues arranged 88 cues into four sections according to the number of independent estimates and the size of the combined effects. Cues with larger effect sizes based on both smaller (k ≤ 5) and larger (k > 5) numbers of estimates are listed in Table 1. The section with larger effect sizes based on a larger number of estimates (k > 5) suggests that deception was linked to less vocal and verbal immediacy and more discrepancy and ambivalence in behavior; liars also provided fewer details in their speech and appeared uncertain and tense. The section with larger effect sizes based on a smaller number of estimates (k ≤ 5) suggests that liars were less cooperative than truth tellers, were less likely to admit that they did not remember something, and showed larger pupil dilations. According to DePaulo et al. (2003), cues to deception were more noticeable when people were highly motivated to deceive, especially when the motivation was identity relevant rather than of material or monetary value. Cues to deception were also more pronounced when the deceit concerned transgressions.

Table 1: Non-verbal cues to deception with larger effect sizes based on larger and smaller numbers of estimates (DePaulo et al., 2003, p. 95).

Large effect size (|d| > 0.20), p > .05

Larger no. of estimates (k > 5)                    d       k
Verbal and vocal immediacy (impressions)         -0.55     7
Discrepant, ambivalent                            0.34     7
Details                                          -0.30    24
Verbal and vocal uncertainty (impressions)        0.30    10
Nervous, tense (overall)                          0.27    16
Vocal tension                                     0.26    10
Logical structure                                -0.25     6
Plausibility                                     -0.23     9
Frequency, pitch                                  0.21    12
Negative statements and complaints                0.21     9
Verbal and vocal involvement                     -0.21     7

Smaller no. of estimates (k ≤ 5)                   d       k
Cooperative (overall)                            -0.66     3
Admitted lack of memory                          -0.42     5
Pupil dilation                                    0.39     4
Talking time                                     -0.35     4
Related external associations                     0.35     3
Verbal immediacy (all categories)                -0.31     3
Spontaneous corrections                          -0.29     5
Chin raise                                        0.25     4
Word and phrase repetitions                       0.21     4
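To make the structure of Table 1 concrete, the following minimal Python sketch transcribes the d and k values from the table, splits the cues by number of estimates and ranks them by the magnitude of the effect. The names CUES, split_by_estimates and rank_by_magnitude are illustrative assumptions and do not come from DePaulo et al. (2003); the sketch only reorganizes the tabled values and performs no meta-analytic computation.

```python
# (cue, d, k) triples transcribed from Table 1.
# A negative d means the cue was observed less in liars than in truth tellers.
CUES = [
    ("Verbal and vocal immediacy (impressions)", -0.55, 7),
    ("Discrepant, ambivalent", 0.34, 7),
    ("Details", -0.30, 24),
    ("Verbal and vocal uncertainty (impressions)", 0.30, 10),
    ("Nervous, tense (overall)", 0.27, 16),
    ("Vocal tension", 0.26, 10),
    ("Logical structure", -0.25, 6),
    ("Plausibility", -0.23, 9),
    ("Frequency, pitch", 0.21, 12),
    ("Negative statements and complaints", 0.21, 9),
    ("Verbal and vocal involvement", -0.21, 7),
    ("Cooperative (overall)", -0.66, 3),
    ("Admitted lack of memory", -0.42, 5),
    ("Pupil dilation", 0.39, 4),
    ("Talking time", -0.35, 4),
    ("Related external associations", 0.35, 3),
    ("Verbal immediacy (all categories)", -0.31, 3),
    ("Spontaneous corrections", -0.29, 5),
    ("Chin raise", 0.25, 4),
    ("Word and phrase repetitions", 0.21, 4),
]


def split_by_estimates(cues, k_threshold=5):
    """Split cues into those with k > threshold and those with k <= threshold."""
    larger = [cue for cue in cues if cue[2] > k_threshold]
    smaller = [cue for cue in cues if cue[2] <= k_threshold]
    return larger, smaller


def rank_by_magnitude(cues):
    """Order cues from the largest to the smallest absolute effect size."""
    return sorted(cues, key=lambda cue: abs(cue[1]), reverse=True)


larger_k, smaller_k = split_by_estimates(CUES)
strongest_cue = rank_by_magnitude(CUES)[0]
```

Ranking by absolute value rather than by signed d reflects that both positive and negative effects are informative; the sign only indicates whether liars showed more or less of the cue.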

The ocular-motor deception test is one of the non-verbal methods to detect deception and will be introduced in detail in chapter 2.

The Ocular-Motor Deception Test (ODT)

A research team at the University of Utah has been developing the Oculomotor Deception Test since 2003 (“Credibility Assessment Technologies”, 2014). The team comprises professors John C. Kircher and David C. Raskin, internationally known scientists in the field of the physiology of deception detection; Doug Hacker, an educational psychologist with expertise in the psychology of reading; and Anne Cook and Dan Woltz, cognitive scientists. Professor Kircher and Professor Raskin have conducted extensive research on the physiology of deception detection (Barland & Raskin, 1975; Podlesny & Raskin, 1978; Horowitz, Kircher, Honts, & Raskin, 1997; Kircher & Raskin, 2002), and they recognized the need for an alternative deception detection method that could complement the polygraph. Generally, physiological measurement (e.g. the polygraph) requires site preparation and the attachment of multiple transducers to the subject, which may in some cases be uncomfortable (Podlesny & Kircher, 1999). A method that collects only oculomotor data, on the other hand, can be used remotely, safely and unobtrusively. The ocular-motor deception test may be used either for screening alone or to decide whether to conduct a polygraph test (Patnaik, 2016). Essentially, the ODT classifies people as truthful or deceptive based on their pupillary responses and reading behavior while they read statements on a computer monitor concerning their possible involvement in illicit activities and respond by pressing a button labeled True or False. In contrast to the polygraph, the ODT is automated and can be completed in approximately 40 minutes (Cook et al., 2012; Kircher & Raskin, 2016). The ODT uses the question format of the Relevant Comparison Test (RCT), a polygraph technique originally developed for use at ports of entry to screen travelers for possible trafficking of drugs and/or carrying explosives (Kircher & Raskin, 2002; Kircher & Raskin, 2016).
The RCT compares the subject’s reactions to two relevant issues (crimes) with high face validity, intermixed with neutral questions. The test assumes that the subject can be guilty of only one crime; the difference in reactions to the two crimes is therefore diagnostic, and more so than the difference between crime-related and neutral items. Reacting more strongly to one of the relevant issues indicates deception on the items concerning that issue.

To fully comprehend the ODT, it is necessary to explain some crucial terms, concepts, and theories that underlie its rationale. It is important to understand how human vision works and what the physiological and psychological bases of pupil dilation are. We also need to discuss cognitive load theory in the context of deception and, finally, reading behaviors during deception. Research on ocular-motor methods to detect deception published to date will be introduced in chapter 4.
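The comparison logic of the RCT described above can be illustrated with a minimal Python sketch. It is not the scoring procedure of the ODT or of the RCT: the function names, the reaction scores (in arbitrary units) and the decision threshold are hypothetical and serve only to show why the difference between the two relevant issues is the diagnostic quantity.

```python
from statistics import mean


def relevant_issue_difference(reactions_issue_1, reactions_issue_2):
    """Mean reaction to issue 1 minus mean reaction to issue 2."""
    return mean(reactions_issue_1) - mean(reactions_issue_2)


def classify(reactions_issue_1, reactions_issue_2, threshold=0.1):
    """Toy decision rule; the threshold is a hypothetical value, not an
    empirically derived cut-off."""
    difference = relevant_issue_difference(reactions_issue_1, reactions_issue_2)
    if difference > threshold:
        return "stronger reactions to issue 1"
    if difference < -threshold:
        return "stronger reactions to issue 2"
    return "no differential reaction"


# Hypothetical reaction scores for one examinee.
example = classify([0.5, 0.6], [0.2, 0.1])
```

In this toy rule, neutral items play no role in the decision; they are left out only to keep the sketch short, not because they are unimportant in the actual test.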

Vision

All human beings are typically equipped with five senses. One, however, is dominant: vision. This is partly because vision is remarkably adaptable. For instance, after half an hour or so in the dark, the retinal rod photoreceptors become capable of responding to just a single photon. When the eyes are exposed to light, on the other hand, the cone photoreceptors allow us to see details and colors (Dowling & Dowling, 2016).

2.1.1 Anatomy and physiology of the eye

The eyes are positioned separately but generally work as a pair. Each is covered by an eyelid, a layer of tissue above and below the eye. The eyelashes on the edge of the eyelid protect the eye from substances such as dust. The eyelid moves about 20-30 times per minute to moisten the surface of the eye with a fluid produced by the lacrimal glands. This moistening is necessary to keep the cornea from drying out. If a foreign object reaches the surface of the eye, extra fluid is produced to wash it away. The cross-section of the eye is illustrated in Figure 1 (Tucker & Foulston, 2015).


Figure 1: Cross section of the eye (Tucker & Foulston, 2015, p. 278)

The outer fibrous layer is called the sclera. At the front of the eye it is covered by the conjunctiva, a thin mucous membrane capable of mucous secretion. The conjunctiva also lines the inside of the eyelids and further prevents the eye from drying out. Light rays enter the eye through the cornea and the pupil, the opening in the iris, and are focused on the retina at the back of the eye. The iris, the colored circle in the center of the eye, controls the amount of light that reaches the retina by constricting and dilating the pupil. The pupil therefore dilates in dim light and constricts in bright light. Behind the pupil is the lens, which bends the light reflected from an object; the resulting image of the object is then projected onto the retina. Essentially, the lens is thicker when viewing closer objects and thinner when viewing more distant ones. Finally, the retina lies on the back wall of the eye and contains the photoreceptors (rods and cones). In the center of the retina there is a small spot called the fovea, the place of sharpest vision. The optic nerve leaves the eye through the blind spot, which contains no photoreceptors (Tucker & Foulston, 2015).

Physiological and psychological bases of pupil dilation

Pupil size is determined by the relative activation of two iris muscles: the sphincter (constrictor) and the dilator. Both muscles are controlled by the autonomic nervous system (Cacioppo, Tassinary, & Berntson, 2007). Whereas the somatic body system is made up of the central nervous system (CNS) and the peripheral fibers that deliver messages from the sense organs and striated muscles all over the body, the vegetative (visceral) body system is made up of the autonomic nervous system (ANS) and its centers in the spinal cord and brain stem. The ANS innervates internal organs, glands, smooth muscles and the heart. We are mostly unaware of these ANS activities, as they are not under voluntary control (Sternbach, 1966). Traditionally, the ANS is further divided into the sympathetic nervous system (SNS) and the parasympathetic nervous system (PNS). These systems usually have opposing autonomic functions that are either facilitative or inhibitory (Sternbach, 1966). Although there is strong reciprocity between the two systems, parasympathetic activity is generally predominant in light reflexes and accommodation, while sympathetic activity is more noticeable in pupillary responses in behavioral and stress contexts (Cacioppo, Tassinary, & Berntson, 2007). The ‘push-pull’ balance between the constrictor and dilator muscles is very dynamic and determines pupil diameter. When sympathetic activity increases and the parasympathetic system is inhibited, the pupil dilates. Constriction of the pupil results from increased parasympathetic activity or reduced sympathetic activity (Steinhauer, Siegle, Condray, & Pless, 2004).

There is, however, a variety of reasons for pupillary changes. First, research shows (Fotiou et al., 2007) that age influences baseline pupil size: young people generally tend to have a larger pupil diameter than older people. Second, the main reasons for an immediate pupillary change are the light reflex, the startle response and emotional arousal, fatigue, pain and cognitive load. These will be briefly described, with emphasis on cognitive load, which is the foundation of the ODT rationale.

2.2.1 Light reflex

As mentioned earlier, the main function of the iris is to control the amount of light entering the eye. In clinical practice, the light reflex is used to observe the amount of pupil contraction in response to a hand-held bright light alternated between the two eyes for 1 to 3 seconds. If any asymmetry in the response is detected, damage to the retina, optic nerve, chiasm or optic tract is usually assumed and becomes a subject of further investigation (Kardon, 1995). Sharma and colleagues (2016) published a study suggesting that morphological ocular parameters such as iris thickness may influence the pupillary light reflex. They proposed that there is an independent association between iris thickness and the amplitude of pupillary changes. Interestingly, research by Fotiou and colleagues (2007) suggests that the latency of the light reflex remains unaltered with age.


2.2.2 Fatigue

Fatigue influences pupillary reactions; greater fatigue is associated with a smaller pupil diameter. Geacintov and Peavler (1974) conducted a study testing differences in the pupil diameter of telephone operators (providing assistance with either a paper telephone book or an automated microfilm reader) before and after their work day. Eleven telephone operators were each measured for three minutes, with frames of their pupils taken twice a second. The mean difference in pupil size (.32 mm) between the morning and evening measures was statistically significant (t = 2.78, p < .01). This decline indicated that baseline pupil diameter decreases with fatigue. Kahneman and Peavler (1969) measured the pupillary responses of ten subjects during eight verbal learning trials and found that pupil size is a valuable measure of mental effort, or processing load. Mean pupil diameter at the beginning of the study blocks was 4.11 mm, 3.99 mm, and 3.92 mm for trials 1, 2, and 8, respectively. A similar decrease of pupil diameter over time was found and discussed elsewhere (Kahneman & Beatty, 1967). Despite the changing baseline, there were no consistent changes in the magnitude of pupillary responses under cognitive load.

2.2.3 Pain

Ellermeier and Westphal’s (1995) work on the relation between pupil size and pain demonstrated that an increase in pupil size is associated with pain. Furthermore, the dilation is not only immediate but lasts for the duration of the pain. Using electrical stimulation, Chapman, Oka, Bradshaw, Jacobson and Donaldson (1999) demonstrated that the pupil dilates immediately at the onset of pain and that its diameter increases with increasing pain. In their experiment, participants received painful electrical fingertip stimulation at four intensities, ranging from mild to barely tolerable pain. Throughout the trials of painful electrical stimulation, pupil dilation showed no habituation to the pain. The pupil dilation began at 0.33 s and peaked at 1.25 s after the stimulus. They hypothesized that “the pupil dilation reflects central processing of a threatening event. This processing is probably not specific to pain as a sensory modality, but it may be specific to potentially threatening stimuli.” (Chapman et al., 1999, p. 50).

2.2.4 Startle response and emotional arousal

The link between pupillary changes and emotions was drawn by Hess and Polt (1960). They suggested that pupillary changes are a function of sympathetic activation and as such can be used as a measure of the greater or lesser interest value and pleasure value of a visual stimulus. To test this hypothesis, they developed a technique for recording pupil size while the subject was shown various visual stimuli, and ran a pilot study on six subjects. The participant’s eye was filmed on 16 mm film while he or she was presented with visual material. The recorded film was then projected, and pupil size was measured on the projected image. Hess and Polt report that they managed to replicate the study with a larger number of participants in subsequent studies. They suggested that further research might investigate how other visual or auditory stimuli influence pupillary changes.

Before we continue investigating the relationship between pupillary changes and emotions, let us look at a phenomenon called the startle response. When a person is exposed to an intense or threatening stimulus with rapid onset, such as a loud noise or an unexpected movement in the environment, the body usually responds with a startle reflex. Davis (1984) describes it as a diffuse skeletomuscular response common to many mammals. The startle reflex is a series of physiological reactions with protective components such as an eyeblink, forward and downward head movements, hunching of the shoulders (raising them and drawing them forward) and contraction of the abdomen. During a startle response the pupil quickly dilates. The dilation persists even when the startled person is exposed to a bright light, which would normally cause the pupil to constrict (Andreassi, 2000). Cook, Hawk, Davis and Stevenson (1991) conducted a study on startle reflex modulation to test the generalizability of the startle response across a variety of emotional states. Participants were exposed to pictures with pleasant, unpleasant and neutral content to test responses to stimuli in different emotional contexts, especially whether stimulus valence influences pupil size. The results suggested that startle latency is specifically responsive to arousal but not valence. These findings were supported by later research by Bradley, Miccoli, Escrig, and Lang (2008), which aimed to assess the effects of hedonic valence and emotional arousal on pupillary responses. At the same time, autonomic activity (skin conductance and heart rate) was measured to explore whether pupillary changes are primarily influenced by SNS or PNS activation. The data clearly indicated that pupillary increases were larger when emotionally arousing pictures, whether pleasant or unpleasant, were viewed than when neutral stimuli were viewed. Hence, the study by Bradley et al. (2008) does not support the view that hedonic valence influences pupil size. As for autonomic activity, changes in pupil diameter were accompanied by changes in skin conductance, suggesting that the pupillary response during emotional arousal is associated with increased sympathetic activity.


A study concerned with habituation to emotional stimuli, their duration and their relationship to pupillary responses was conducted by Snowden et al. (2016). In the study, participants were presented with fearful and neutral images. The data showed that the pupil dilates more when an affective image is presented, regardless of the exposure time (100-3000 ms). The dilation was also not diminished by repeated presentation of the stimulus. The theory that emotional arousal is linked to changes in pupil diameter has been supported by data from several other replication studies (e.g. Bradley & Lang, 2015; van Steenbergen, Band, & Hommel, 2011).

2.2.5 Cognitive load

The work of Hess and Polt (1960), who drew a direct link between emotions and the pupillary response, has already been mentioned. It was also Hess and Polt (1964) who showed a connection between mental effort and pupillary responses. In a pilot study with five subjects, they used simple arithmetic problems as the material for mental activity. In that experiment, a mirror reflected the eye to a camera lens taking two frames per second. While looking at the number ‘5’ on a screen, the subject heard a mathematical task (e.g. multiply 7 x 8; multiply 8 x 13) and was asked to solve it orally. Again, the recorded film was then projected, and pupil size was measured with a millimeter ruler on the projected image. Hess and Polt reported that the pupil diameter of each subject increased gradually, reached its maximum right before the answer was verbalized, then dropped immediately and slowly returned to the control size. In all subjects the authors found a greater increase in pupil diameter in tasks of higher difficulty, demonstrating the link between pupil dilation and problem difficulty. Two decades earlier, however, Berrien and Huntington (1943) had already investigated “what kinds of pupillary responses accompany the tension and emotional excitement incident to deception and second, whether these responses show sufficient consistency in any respect to permit them to be used as valid indications of attempted deceit. Furthermore, since considerable validating evidence is already at hand both from laboratory and field studies pertaining to the use of blood pressure changes as indicators of deception, a comparison of these changes with pupillary responses was proposed” (p. 443). In their study, they identified a pattern in a large number of deceiving participants: a slow dilation lasting about one to five seconds, followed by a rapid constriction.
Based on data from 32 participants, they concluded that “(1) A slow dilation followed by a very rapid constriction is probably indicative of the emotion usually accompanying deceit. (2) A sudden change in the stability of the pupil is found more frequently among those attempting deceit than those not attempting deceit when critical questions are first introduced.” (p. 449). Even though Berrien and Huntington (1943) did not link pupillary changes to increased cognitive effort during lying, theirs is one of the first documented studies linking pupil dilation to deception.

Research by Hess and Polt (1964) linked processing load (in the form of mental multiplication) to pupil dilation and has inspired a number of studies addressing the relationship between pupil size and cognitive effort. Ahern and Beatty (1979) attempted to determine the relationship between intelligence (defined for the purposes of that experiment by scores on the Scholastic Aptitude Test, SAT) and cognitive demands during mental activity; they hypothesized that “the amount of ‘general mental energy’ that is demanded in thinking can be measured with recently devised pupillometric methods. Thus, the dynamics of cognitive processing may now be compared among individuals who differ in psychometrically measured intelligence.” (p. 1290). Thirty-nine participants solved mental multiplication problems of varying difficulty while their pupil size was recorded. The data showed that both SAT score and task difficulty affected pupil diameter; task-evoked activation was estimated by the mean pupillary dilation during the 4 seconds preceding the onset of the answer. Pupils dilated most during the more difficult tasks, and participants with higher SAT scores showed less task-induced pupil dilation than participants with lower scores. The authors conclude that the study provided evidence that more intelligent individuals have more efficient cognitive structures for information processing, which emerge during mental activity and are reflected in pupil dilation.
Kahneman and Beatty (1966) studied pupil diameter and memory load, concluding that during a short-term memory task, pupil diameter reflects the material being actively processed at any time; the pupil dilates while the task is processed and constricts during the report. The authors also describe a practice effect throughout the experiment, which they consider evidence that the pupillary response is an indicator of processing load: “the adoption by subject of a consistent performance set will tend to reduce both the subjective difficulty of the task and the pupillary response to it. The appearance of such practice effects in the pupillary response appears to provide additional evidence for the validity of this response as an indicator of processing load” (p. 1585). A relation between pupil dilation and short-term memory for digits and words was also found by Kahneman, Onuska, and Wolman (1968). Kahneman and Peavler (1969) measured the pupillary responses of ten subjects during verbal learning tasks and found that “pupil size is a valuable measure of mental effort, or processing load” (p. 316). All the listed findings demonstrate that cognitive load and pupil dilation are related.

A study close in topic to this thesis is that by Just and Carpenter (1993), who investigated the intensity of processing during sentence comprehension by measuring pupillary responses during reading. The study was inspired by previous research suggesting that the pupillary response may be sensitive to resource demands in language processing. To test this hypothesis, Just and Carpenter conducted two experiments comparing the processing of simple and more complex sentences. Object-relative sentences such as “The reporter that the senator attacked admitted the error” (p. 314) were presented to participants. People usually fail to paraphrase these sentences because they do not correctly match the agent with the verb. Subject-relative sentences such as “The reporter that attacked the senator admitted the error.” (p. 315) were also used; these are much easier to process. The results showed that sentence difficulty affects pupil diameter: the pupil dilated more during reading of the more difficult object-relative sentences (0.249 mm) than of the subject-relative sentences (0.203 mm), F(1, 32) = 5.11, MSe = 20.0, p < .05. In addition, Just and Carpenter (1993) found that readers gazed longer while processing the more demanding sentences. Interestingly, two possibilities, depending on the reader’s goal, appeared when readers encountered difficult text: the reader either (i) slows down and sacrifices reading speed to comprehend the text properly or (ii) continues reading fast at the cost of failing to comprehend the text.
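The task-evoked measure reported by Ahern and Beatty (1979), the mean pupillary dilation during the 4 seconds preceding the onset of the answer, can be sketched in Python as follows. The function name, the sampling rate and the subtraction of a pre-task baseline are assumptions made for illustration and are not taken from the original study.

```python
def task_evoked_dilation(samples_mm, sample_rate_hz, answer_onset_s,
                         baseline_mm, window_s=4.0):
    """Mean pupil diameter in the window before answer onset, minus baseline.

    samples_mm     -- pupil diameters in mm, sampled at a constant rate
    sample_rate_hz -- samples per second (assumed value, e.g. 2)
    answer_onset_s -- time of answer onset in seconds from the first sample
    baseline_mm    -- pre-task pupil diameter (the subtraction is an assumption)
    window_s       -- window length in seconds (4 s, as reported in the text)
    """
    end = int(round(answer_onset_s * sample_rate_hz))
    # Clip the window at the start of the recording if the answer came early.
    start = max(0, end - int(round(window_s * sample_rate_hz)))
    window = samples_mm[start:end]
    if not window:
        raise ValueError("no samples in the window before answer onset")
    return sum(window) / len(window) - baseline_mm


# Hypothetical recording at 2 Hz: 2 s at baseline, 4 s dilated, 2 s recovery.
recording = [4.0] * 4 + [4.2] * 8 + [4.0] * 4
dilation = task_evoked_dilation(recording, sample_rate_hz=2,
                                answer_onset_s=6.0, baseline_mm=4.0)
```

With these hypothetical values the window covers exactly the eight dilated samples, so the result is about 0.2 mm above baseline.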

Deception and cognitive load

Zuckerman and colleagues (1981), as mentioned earlier, stated that one source of cues to deception is the cognitive aspect of lying. They conceptualized lying as a more cognitively demanding task than being truthful, because liars need to formulate their answers or statements in a way that is consistent both internally and with what others already know. The greater the challenge of lying, the longer the predicted response latencies and the larger the predicted pupil diameter. DePaulo et al. (2003), however, are quite skeptical of the assumption that deception is more cognitively demanding than telling the truth. From their self-presentational perspective, lying is routinely practiced in everyday life and is much less self-conscious there than in laboratory studies. They therefore claim that it is usually not very cognitively demanding, as people practice it every day.


Other research (e.g. Vrij et al., 2008; Vrij, Granhag, & Porter, 2010) claims that deception is cognitively more demanding, but not in all circumstances. These authors give six reasons (or conditions) that explain why and, more importantly, when lying is more cognitively demanding than truth-telling. At least two of these conditions need to be fulfilled for lying to be considered more cognitively demanding than truth-telling. Firstly, telling a lie is cognitively demanding in itself: one needs to make up a story that is consistent with what one has said previously and with what other people already know. This fabrication must be monitored at all times, and the suspect needs to avoid slips of the tongue. Secondly, liars usually do not take their credibility for granted, unlike truth tellers, who typically believe that their credibility is obvious. The truth tellers’ confidence can be explained by the illusion of transparency (Gilovich, Savitsky, & Medvec, 1998), the belief that one’s inner feelings are apparent to others. Another explanation is the belief in a just world, a confidence that people get what they deserve (Lerner, 1980). Liars, in contrast, attempt to control their demeanor in order to appear honest, and this controlling and monitoring is cognitively demanding. Thirdly, because liars do not take their credibility for granted, they observe the interviewer’s reactions more carefully, which requires additional cognitive effort. Fourthly, liars may invest considerable cognitive effort in concentrating on the role they are playing. Fifthly, liars have to suppress the truth while they are lying. Lastly, activating a lie is deliberate and intentional, and therefore cognitively more demanding (Vrij, Granhag, & Porter, 2010). As for when lying is likely to be more cognitively demanding than telling the truth, one example is when the examinee is highly motivated to be believed.
Another circumstance under which deception requires more cognitive effort is when the interviewee can recall the truth clearly and has a clear image of it. At least some of these conditions need to be fulfilled for deception to be more cognitively demanding than being truthful (Vrij et al., 2008). Another report on how deception is more cognitively demanding than truth-telling was published by Vrij, Fisher, Mann, and Leal (2006).

Last, but not least, it is important to consider how aware liars themselves are of their behavior and cognitive effort during deceit. Vrij, Semin, and Bull (1996) conducted a study that emphasized the difference between actual indicators of deception (nonverbal behavior objectively associated with deception) and perceived indicators of deception (behavior that observers think is associated with deception, regardless of whether it really occurs during deception). One of the actual indicators of deception is that liars decrease the number of hand, foot, and leg movements and move very deliberately, but the perceived indicator is the opposite: people expect liars to show nervous behavior full of random movements and self-manipulation. The cognitive load framework explains the stillness of liars by their neglect of body language, while the attempted control framework explains it by overcontrol of movements during deception. Subjects in the study were interviewed twice, once while being truthful and once while deceiving. Some of the subjects received information about actual indicators of deception and some did not. The results showed that deceivers did not have a realistic estimate of how their own deceptive behavior appeared. The results implied that deceivers expect their behavior to reveal more cues to deception than it actually does, and that deceivers are unaware of which behaviors are indicative of deception. At the end, the participants were asked about the cognitive load during deception: “they had tried to control their behavior during deception and they indicated that deception was a cognitively more complex task than telling the truth. Moreover, results showed that the decrease in subtle movements was in fact associated with the experience of attempted control and cognitive load” (Vrij, Semin, & Bull, 1996, p. 558).

Pupil dilation during deception: arousal vs. cognitive load

The question is: to what extent is pupil dilation during deception influenced by emotional arousal (the ‘fight or flight’ sympathetic reaction), and to what extent is it caused by cognitive load? Research is available both on the effect of emotional arousal on pupil size and on the influence of cognitive load on pupil dilation. The ODT uses pupil enlargement to distinguish between truthful and deceptive participants. It is unclear, though, to what extent this pupil dilation is due to cognitive load and to what extent to emotional arousal.

As cited earlier in this text, Kahneman and Beatty (1966) reported that during short-term memory tasks, pupil diameter reflects the amount of material being actively processed at any time: it dilates as the task is processed. More importantly, the authors also described a practice effect over the course of the experiment. Repetition, they claim, reduces not only the subjective difficulty of the task but, as a result, also the pupillary response to the task. Hess and Polt (1964) reached a complementary conclusion, namely that pupil dilation increases with task difficulty. Therefore, considering only cognitive load, pupil dilation may to some extent decrease during the ODT as the participant becomes familiar with the items and invests less cognitive effort in responding to them.

On the other hand, research by Snowden et al. (2016) showed that the pupillary response to emotional (fearful) stimuli is independent of the presentation time of the images (from 100 to 3,000 ms) and is not diminished by repeated presentations of the images. Their research design was, however, different from the one used in ODT studies, as the ODT does not present fearful images to participants.

Kuhlman et al. (2011) presented a poster attempting to answer this question by testing for habituation of pupil dilation during the ODT, which is assumed to be mediated by decreasing emotional responses to the items’ content. They conducted a study with a design similar to that of the present study: a mock crime study with participants assigned to either a guilty or an innocent condition and subsequently tested with the ODT (48 items repeated in five trials). For each repetition and each item, the peak amplitude of the evoked pupil response was regressed onto the item’s ordinal position in the sequence, and a regression line was then created for each repetition. The regression slope was greater than zero for the first three repetitions of the ODT but did not differ from zero in the last two repetitions. The regression plots for the first and fifth repetitions are presented in Figure 2. Kuhlman and his colleagues (2011) concluded that although habituation of responses was present, the diagnostic validity of pupil responses for detecting deception did not decrease. The authors ascribed the habituation to a decrease in emotional arousal over time and concluded that the remaining ability of the ODT to distinguish between guilty and innocent subjects is due to the cognitive effort invested in lying.

[Figure 2, two panels: 1st repetition and 5th repetition. Horizontal axis: time (0 to 60). Vertical axis: peak amplitude of the pupil response (0.210 to 0.310).]

Figure 2: Regression plots for the first and the fifth repetition of the ODT items (Kuhlman et al., 2011).

Given the contradicting evidence, it remains unclear to what extent pupil dilation during deception is influenced by emotional arousal and to what extent by cognitive effort. Possibly, the explanation for the regression slope approaching zero over trials is twofold: as participants become familiar with the items, lying requires less mental effort, and at the same time, due to habituation, their emotional arousal decreases.
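The habituation analysis described above, in which peak pupil amplitude is regressed onto the ordinal position of each item, can be illustrated with a minimal sketch. The function name and the data are hypothetical and are not taken from Kuhlman et al. (2011); the direction of the built-in trend is arbitrary and serves only to contrast a non-zero slope with a zero slope.

```python
import numpy as np

def habituation_slope(peak_amplitudes):
    """Least-squares slope and intercept of peak pupil amplitude
    regressed onto ordinal item position (1, 2, ..., n)."""
    y = np.asarray(peak_amplitudes, dtype=float)
    x = np.arange(1, len(y) + 1)
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

# Hypothetical peak amplitudes (mm) for 48 items; illustrative values only.
positions = np.arange(1, 49)
first_rep = 0.30 - 0.001 * positions   # series with a built-in linear trend
fifth_rep = np.full(48, 0.25)          # flat series, no trend

slope_first, _ = habituation_slope(first_rep)
slope_fifth, _ = habituation_slope(fifth_rep)
print(f"first repetition slope: {slope_first:.4f}")
print(f"fifth repetition slope: {slope_fifth:.4f}")
```

In a real analysis the slope would be accompanied by a significance test against zero for each repetition; the sketch only recovers the slope itself.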


Reading behaviors during deception

There are many ways to characterize eye movements; for the purposes of this thesis, only some of them are listed. Whenever we read (or look at a scene), our eyes continually make rapid movements called saccades. During a saccade, the eyes move with a velocity (a function of how far the eye moves) of up to 500° per second. To illustrate, a 2° saccade (30 ms) is typical for reading, whereas a 5° saccade (40-50 ms) is typical for scene perception (Rayner, 1998). Between saccades, our eyes remain relatively still during fixations lasting approximately 200-300 ms (Rayner, 1998). About 90% of reading time is accounted for by fixations and the remaining 10% by saccades (Andreassi, 2000). In fact, the eyes are never perfectly still. There is a constant tremor of the eyes, most likely caused by imperfect nervous system control of the oculomotor system. This constant tremor is called nystagmus. Most experiments treat nystagmus as noise in the data (Rayner, 1998).

Fixation duration is the amount of time the eyes focus on a particular point in a text. Fixation frequency is the number of fixations in a region of text. The first-pass duration is the initial reading time of a region of text and consists of all forward fixations. The second-pass duration is the amount of time spent re-reading a region of text (Hacker et al., 2014). When people experience difficulty in reading, their fixation frequency and duration increase, and they spend more time reading and re-reading (Rayner, 1998). Research shows, consistent with the cognitive load theory, that liars in general have longer response times and make more fixations in the text. However, on statements relevant to the crime they committed, guilty participants make fewer fixations and respond more quickly. For innocent participants, the number of fixations is quite consistent across statement types (e.g. Cook et al., 2012; Patnaik, 2013).
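The reading measures defined above can be made concrete with a small sketch. It assumes a simplified operationalization that is not taken from the cited studies: fixations arrive in chronological order and are already assigned to numbered text regions, first-pass duration is the time spent in a region before the gaze first leaves it, and every later fixation in that region counts as second-pass. The names `Fixation` and `reading_measures` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    region: int         # index of the text region (e.g. a statement) being fixated
    duration_ms: float  # fixation duration in milliseconds

def reading_measures(fixations, region):
    """Fixation frequency, first-pass and second-pass duration for one region."""
    count = 0
    first_pass = 0.0
    second_pass = 0.0
    entered = False  # the region has been fixated at least once
    left = False     # the gaze has left the region after first entering it
    for fix in fixations:
        if fix.region == region:
            count += 1
            if left:
                second_pass += fix.duration_ms   # re-reading
            else:
                entered = True
                first_pass += fix.duration_ms    # initial reading
        elif entered:
            left = True
    return {"fixations": count, "first_pass_ms": first_pass, "second_pass_ms": second_pass}

# Hypothetical scanpath: the reader fixates region 0 twice, moves to region 1,
# returns to region 0 once, and then goes back to region 1.
sample = [Fixation(0, 200), Fixation(0, 250), Fixation(1, 220),
          Fixation(0, 180), Fixation(1, 300)]
print(reading_measures(sample, 0))
print(reading_measures(sample, 1))
```

The sketch does not check the direction of saccades, so it does not separate forward from regressive fixations within a region; a fuller implementation would need word-level coordinates for that.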

Countermeasures

As discussed previously in the text, deception is very difficult to detect because it has no direct manifestation that is present whenever people lie and absent whenever they are truthful. Methods developed to detect deception only detect behavior, expressions, or other indirect indicators that have been found to accompany deceit. The pursuit of truth becomes even more difficult when a lot is at stake for the interviewee. It is not surprising that people, especially guilty ones, have attempted to develop techniques and strategies to appear truthful. Deliberate techniques used by guilty people to defeat a deception test are called ‘countermeasures’ (Gudjonsson, 1983).

Honts (2014) reviews the existing scientific literature on efforts to defeat technology-based credibility assessment procedures. The polygraph is considered the prototypical procedure, since it has been widely used and there is a broad literature on attempts to defeat it. Polygraph testing is generally applied in serious circumstances, such as criminal investigations and security screening, and may seriously influence guilty examinees’ lives. Guilty examinees may therefore try to influence their physiological responses so that they appear innocent. Sometimes, innocent examinees use countermeasures as well, in an attempt to avoid a false-positive result. This strategy is quite risky, though, since evidence shows that the use of countermeasures by innocent subjects may in fact increase their chance of appearing deceptive (Dawson, 1980; Honts, Amato, & Gordon, 2001).

A great number of studies have examined countermeasures used in polygraph testing (e.g. Ben-Shakhar & Dolev, 1996; Gudjonsson, 1983; Honts, Raskin, & Kircher, 1994). It is crucial to realize that even a test considered valid may be defeated by means of countermeasures.

Honts (1987) in his review lists two main types of deliberate polygraph countermeasures:

1. General state countermeasures (GS).

These are actions taken throughout the testing in an attempt to alter the examinee’s physiological and/or psychological state for the entire examination, without focusing on a particular question or section of the test. The most commonly mentioned GS countermeasure is drug use: presumably, inhibitors of the autonomic nervous system reduce the examinee’s physiological responses to test items. Other possibilities mentioned are relaxation techniques, exercising to exhaustion, hypnosis, mental effort to relax, and rationalizing or dissociating from the testing situation and/or test items. Honts (2014) concludes, however, that none of these GS countermeasures is effective against CIT or CQT polygraph credibility assessments.

2. Specific point countermeasures (SP).

These methods are used to alter the subject’s physiological and/or psychological state during a certain period of the examination. SP countermeasures can be attempts to inhibit physiological responses to relevant questions, but more often they are efforts to increase physiological responses to comparison questions. They can be employed physically, mentally, or in combination. Granhag and Strömwall (2004) conclude that there is no scientific evidence that specific point countermeasures are effective in increasing false negative or inconclusive outcomes. According to Honts (2014), however, training in specific point countermeasures may pose a threat to polygraph validity.

Apart from the deliberate countermeasures, Honts (2014) also describes:

3. Spontaneous countermeasures (SC).

These are attempts to influence the examination outcome so that the subject appears truthful, made without being planned in advance. Honts, Amato, and Gordon (2001) collected data on spontaneous countermeasures used to defeat the CQT in a mock-crime experiment with 192 participants and found that 82.3% of guilty and 42.7% of innocent subjects employed one or more SC. The SC were coded in four categories: None, Altered Breathing, Mental, and Physical. Mental countermeasures were techniques such as rationalization (e.g., “I did not steal anything” or “It is just an experiment and I was told to do so”) and dissociation (e.g., the examinee imagines not being tested at the moment). Any physical act other than altered breathing was considered a Physical countermeasure. For guilty subjects, the use of SC had no significant effect on polygraph examination outcomes, but for innocent subjects the scores shifted so that they appeared more deceptive.

The study was replicated by Otter-Henderson, Honts, and Amato (2002) using an RIT instead of the CQT. The results were very similar to those of Honts et al. (2001): 77.5% of guilty subjects used one or more spontaneous countermeasures, but these had no effect on their credibility assessment and did not help them appear truthful. In addition, 30% of innocent subjects used SC to help them appear truthful, but this had no effect on their results.

Honts (2014) concludes that spontaneous countermeasures are widely used by guilty subjects but have no effect on outcomes, not even on inconclusive rates. SC are sometimes used by innocent subjects as well, although at a lower rate, and their use by innocent subjects may lead to the conclusion that they are deceptive. Honts concludes that there is no scientific evidence supporting the claim that spontaneous countermeasures help defeat the polygraph.


4. Information countermeasures (IC).

In 2017, the amount of information of any kind available on the Internet is practically unlimited. In September 2017, an Internet search for the exact phrase ‘lie detector’ returned 5,630,000 hits and a search for ‘polygraph countermeasures’ returned 619,000. It is, therefore, reasonable to assume that people may do their own research before undergoing a credibility assessment. The use of information obtained in this way is considered an information countermeasure.

A study with a design quite relevant to the present research was conducted by Rovner (1986). A group of 76 six male participants was divided in half: the first half committed a mock crime (theft of a ring from a secretary’s office) and the second half was only informed about a theft. All the participants were tested on a polygraph using the Control Question Test (CQT). Participants were divided into three treatment groups. The Standard group (STD) received no special treatment. The Informed group (INFO) obtained specific, detailed information about the CQT: “The information was prepared in a loose-leaf notebook and explained the theory underlying the CQT, types of physiological responses which the examiner would use to make his decisions (including photographic examples of those responses), and a variety of physical and mental countermeasures which might be utilized to produce physiological responses at a given time so as to produce a truthful outcome. In addition, this information was recorded on a cassette tape, and the tape was played while the subject read the booklet. The tape played for 10.75 minutes, and for the remainder of the 40-min. period the subject was free to ask questions of the experimenter, have the tape replayed, or examine the information booklet” (Rovner, 1986, p. 4). The third group, the Information and Practice group (INFO+PRAC), was given two practice polygraph tests after receiving the information. Rovner (1986) reported that the accuracy of the credibility decisions was 95% for both the STD and INFO groups and 71% for the INFO+PRAC group.
Guilty and innocent participants were easily distinguished in the STD and INFO groups, whereas in the INFO+PRAC group the difference was less obvious. In this experiment, information alone had no effect on participants’ ability to defeat the test.

Twyman, Schuetzler, Proudfoot, and Elkins (2013) conducted an experiment examining countermeasures in an automated deception detection screening context. Among other cues to deception, they used the pupillary response to target items. They hypothesized that mental and physical countermeasures could be used to defeat the credibility assessment. Since the pupil dilates during cognitive effort, mental tasks performed during non-target items could mask the pupillary response to target items. Pupils also dilate in response to pain, and the dilation lasts for the whole duration of the pain. Therefore, causing pain, for example by biting the tongue during non-target items, should increase pupil dilation and reduce the difference between pupil dilation to target and non-target items. Contrary to their predictions, “pupil dilation was the strongest effect among those investigated and appeared to be the most resilient to countermeasures. The pupil dilation resulting from the orienting response was strong, and there was no decrease in this effect when mental distraction or pain was used” (pp. 12-13).

Honts (2014) made an important remark on credibility assessment countermeasures. He evaluates the state of countermeasures research as ‘not great’, at least in the USA, where polygraph examinations are widely used. According to the National Research Council (2003), the US government presumably has a policy under which all countermeasures research is classified. Classified research is, however, of very little value to the scientific community.

To date, there is no study investigating possible countermeasures against the ODT. Authors conducting research on ocular methods for detecting deception have emphasized that further research on the effect of countermeasures is necessary. Webb et al. (2009a) offered two possible explanations for the unexpected result that guilty participants make fewer fixations and spend less time reading the crime-relevant items. The first possibility is that participants are capable of consciously deciding to read faster; if this is true, it should be possible to train participants to adopt a more general strategy to defeat the test. The other possibility is that the overcorrection by guilty participants results from a cognitive distortion known as the salience effect: a participant aims to appear truthful, and as a result his or her deceptive answers become even more obvious compared to the truthful ones. If the latter is true, training to defeat the ODT may have only a small effect on reading behaviors. In a study on the effectiveness of pupil diameter in a comparison question test (CQT), Webb et al. (2009b) also suggested that further research is needed to examine whether pupil diameter is resistant to countermeasures. Possibly, the classification of countermeasures in physiological credibility assessment may to a certain degree apply to other methods of deception detection.

Either way, along with adapting the ODT to the Czech language and testing it on the Czech population, the aim of this study is to investigate the possible use of countermeasures during the ODT and the influence of countermeasures on the ability of participants to defeat the Ocular-Motor Deception Test. Together with John C. Kircher, author of the ODT, a ‘Beat the ODT’ document was created for examinees in the ‘Informed’ condition. It contains information on the rationale underlying the ODT and suggestions on how to defeat it. At the beginning, participants read a brief introduction to eye-tracking and how it works. The testing procedure is described, and it is then explained to the examinee how the decision on whether they are guilty or innocent is drawn from the results: “The computer will compare your reactions to the statements about the two crimes… if you react differently to statements about the two crimes, then you will fail the test.” Subsequently, the idea of higher cognitive load during deception and its influence on the pupil is introduced. At the end, participants are offered suggestions on how they might try to defeat the ODT: “If you took the $20, you might try to think hard about a few of the statements about the exam. If you committed a crime, you should be careful to not take too long to respond or make too many mistakes, since those behaviors are clear indicators that you are trying to defeat the test.”

Current research on the ocular-motor methods for detecting deception

Previous chapters described research on oculomotor data, cognitive load, and deception. Even though the use of pupil diameter to reveal deception was documented much earlier (e.g. Berrien & Huntington, 1943; Bradley & Janisse, 1979; Bradley & Janisse, 1981; Heilveil, 1976), comprehensive and valid methods for assessing credibility using ocular metrics have emerged only in the past decade. All the previously listed findings form the theoretical background for the ocular-motor deception test, which was first introduced by Cook et al. (2012). Table 2 presents results from some of the ODT studies that used a mock-crime experiment. The results indicate that the standard protocol in mock crime experiments yielded about 86% classification accuracy in the original standardization sample and approximately 83% correct classification when tested on independent samples (cross-validation). On cross-validation, accuracy was slightly higher for innocent (84.1%) than for guilty participants (82.1%) (Kircher & Raskin, 2016).


Table 2: Percent of correct decisions under standard conditions in mock crime experiments (Kircher & Raskin, 2016, p. 166).

Experiment              N    nG   nI   Guilty  Innocent  Mean  ValidationG  ValidationI  Mean
Osher (2005)            40   20   20   85.0    85.0      85.0  85.0         70.0         77.5
Webb (2008)             112  56   56   82.1    89.2      85.7  89.3         80.4         84.9
Patnaik (2013)          48   24   24   83.3    95.8      89.6  83.3         83.3         83.3
Patnaik (2015)          80   40   40   82.5    90.0      86.3  80.0         90.0         85.0
Patnaik et al. (2016)   145  82   63   84.1    87.3      85.7  81.9         87.5         84.7
Study A                 112  51   61   80.4    88.5      84.5
Study B                 101  52   49   75.0    85.7      80.4
Standard protocol       638  325  313  82.8    89.0      85.9  82.1         84.1         83.1

Annotation: Study A and Study B are unpublished studies conducted in the Middle East by researchers from the University of Utah, Department of Education, in 2016.

Three years earlier, Webb et al. (2009a) published a study with 40 participants: one half committed one of two mock crimes (theft of $20 from a secretary’s purse or downloading credit card information from a student’s computer) and the other half was innocent. The authors proposed that deception would be associated with greater pupil diameter, increased fixation frequency, and increased reading time, the latter two based on research conducted by Baker, Stern, and Goldstein (1992a) (as cited in Baker, Stern, and Goldstein, 1992b)¹. These authors measured the eye movements of ten subjects who were responding to autobiographical questions on a computer monitor. Participants were instructed to lie on questions that they had previously answered truthfully. The authors of the original study concluded that deceptive answers produced longer response times from question onset to vocal response in 6 of 10 subjects. The response time was divided into “reading” and “thinking” time. The reading time failed to discriminate between truth and deception, but the fixation time within the thinking period was significantly longer in 9 of the 10 subjects. It was suggested that lying may be manifested in saccadic data during the inter-trial period, from the point when the participant indicates readiness for the next trial up to the beginning of the next trial. This hypothesis was not supported by the data in the subsequent study by Baker, Stern, and Goldstein (1992b).

Webb et al. (2009a) used an ODT with 48 items (16 neutral, 16 on the credit card, and 16 on the cash) and presented participants with eight statements in eight rows on a computer monitor. The participants responded by selecting True or False and were allowed to change their answers. The mean classification accuracy was 78.3% (95% for innocent subjects, 80% for card subjects, and 60% for cash subjects); however, the jackknife analysis yielded 58.3%. The authors suggested that a possible cause of the low accuracy was the presentation of the items in a traditional questionnaire format; the lack of pressure in answering may have introduced error into the measurement. The results supported the prediction that pupil diameter would be greatest while participants read the statements to which they responded deceptively: deception was associated with greater increases in pupil diameter. Contrary to predictions, participants guilty of either crime spent less time and made fewer fixations on the items related to their crime.

¹ Baker, Stern, and Goldstein published two final reports to the U.S. Government in 1992 (referenced here as 1992a and 1992b). The first report, The gaze control system and the detection of deception (1992a), is not accessible in the original; it is only described in the subsequent report, Saccadic eye movements in deception (1992b). The report (1992b) was also published under the auspices of the U.S. Department of Defense Polygraph Institute (accessible here: http://www.dtic.mil/dtic/tr/fulltext/u2/a304658.pdf). Later in that report, the earlier suggestions that saccadic data between trials may discriminate between truthful and deceptive participants are called ‘informal observations’ and are dismissed.
Webb, Honts, Kircher, Bernhardt, and Cook (2009b) tested the effectiveness of pupil diameter in a polygraph comparison question test (CQT; discussed in chapter 1.5.1). In a sample of 24 males, the study explored whether pupil diameter is diagnostic of deception in a CQT and whether it could replace one of the predictor variables (thoracic and abdominal respiration, skin conductance, relative blood pressure, and vasomotor activity). According to the results, pupil diameter was as highly correlated with deception (r = .61) as skin conductance (r = .59), the most discriminating physiological indicator of deception (Kircher & Raskin, 2002). As expected, innocent participants had larger pupil diameters on the comparison questions than on the relevant questions, as they were deceptive only on the comparison questions. Guilty participants, on the other hand, did not show different pupil responses to relevant and comparison questions; the authors speculated that this may be because guilty participants gave deceptive answers to both question types. Nevertheless, the difference between pupil responses to relevant and comparison questions was diagnostic of group membership (guilty vs. innocent).

The ocular-motor test for deception was first introduced by Cook et al. (2012), who conducted two mock crime experiments (again with one innocent group, one group guilty of stealing $20, and one group guilty of downloading credit card information). The first experiment tested the hypothesis that guilty participants would show greater increases in pupil diameter, more fixations, and longer first-pass (initial reading) and second-pass (subsequent rereading) reading times when responding deceptively to statements than when responding truthfully. The second experiment attempted to replicate the first and to test the reliability of its results. Cook and colleagues used an ODT consisting of 48 items repeated three times in separate trial blocks. Data from the first experiment revealed that pupils dilate markedly before a deceptive answer is given and that fewer fixations and shorter second-pass reading times are associated with deception. Interestingly, innocent participants showed greater pupil responses to crime-related items (especially cash items) than to neutral items. Similar findings were reported in a study by Horowitz et al. (1997). Cook et al. (2012) offer two possible explanations: participants may have viewed the cash theft as more outrageous than downloading the credit card information, or the syntax of the cash items may have been more complex. No repetition effect on pupil responses was detected over the three trials. Overall, the ODT yielded 85% correct classifications, and after jackknife analysis the mean accuracy dropped to 80%.

The second experiment in the Cook et al. (2012) study manipulated statement complexity (simple statements or a mixture of simple and complex statements), repetition (raising the number of repetitions from three to five), and motivation to deceive (a bonus of $1 or $30 for beating the lie detector). An important change compared to the first experiment was that the design was simplified and only one crime (cash theft) was used, because the general pattern of responses to committed and non-committed crimes was consistent across the cash and card groups. Participants are, however, led to believe that some participants had stolen an exam from a professor’s office. This design has been retained in all subsequent ODT experiments. The results on pupil diameter, fixations, and reading behaviors replicated those of the first experiment.
Guilty participants with higher monetary motivation ($30) did not differ from guilty participants with lower motivation ($1) but conversely, highly motivated innocent participants made fewer fixations than less motivated innocent participants. As for the syntax, the test with simple syntax differentiated better between the guilty and innocent group. And finally, no repetition effect was found even with five repetitions of the ODT. Overall, the classification accuracy of the ODT in the second experiment was 86%. The effect of direct versus indirect interrogation on the accuracy of the ocular-motor methods for detecting deception was investigated by Pooja Patnaik (2013) in her master’s thesis research. One hundred nine people participated in a mock-crime experiment where ODT statements were manipulated: items on $20 theft and exam theft were worded both, directly (e.g. “I did not steal the exam”) and indirectly (e.g. “I correctly reported that I did not take the exam”). The accuracy of classification for the direct items was 83.3% and


60.4% for the indirect items; the test was more diagnostic when participants were asked directly about their involvement in one of the crimes than when they were asked about their truthfulness on a questionnaire.

The effect of practice feedback and blocking on the accuracy of ocular-motor methods for detecting deception was studied by Pooja Patnaik (2015) in her dissertation. The design was complex, with three between-group factors: guilt (guilty or innocent), feedback (practice items with or without performance feedback), and presentation format (distributed or blocked, i.e., presenting 4 items of the same type that were treated as a single unit in the analysis). A total of 160 participants were randomly assigned to these conditions (20 participants per group). The three within-subject factors were statement type (neutral, cash, credit card), interevent interval (50 ms, 1500 ms, and 3000 ms), and repetition (2 repetitions of the items at each of the three interevent intervals). Overall, classification accuracy was 86.3% for the distributed format and 83.3% for the blocked format, which is not a significant difference. Practice performance feedback did not affect response times but did reduce error rates. For guilty participants, feedback was linked to larger pupil reactions to test items and greater differences between pupil responses to cash and credit card items. Changes in the length of the interevent interval had no effect on the diagnostic validity of any ocular-motor measure.

The generalizability of the ODT to native speakers in Mexico was tested by Patnaik and colleagues (2016). A mock-crime experiment with 147 participants was conducted at a large university in Mexico. Classification accuracy of the ODT was 80% for both guilty and innocent participants. The study also compared the results with a U.S. sample and failed to detect any differences between Mexican and U.S. participants on any behavioral or oculomotor measure of deception.


II. EMPIRICAL PART

Present study

In the present study, the ocular-motor deception test will be used as a tool to assess the credibility of participants in a mock-crime experiment. The experiment follows the design of previous studies on the ODT (e.g. Cook et al., 2012; Patnaik et al., 2016; Webb et al., 2009a). To estimate the accuracy of the test, a number of measures are typically used: response time, error rate, reading patterns and pupil dilation. However, pupillary reactions are the strongest predictor of those listed, and pupil dilation will be the only dependent variable analyzed for the purposes of this preliminary study. The classification accuracy will not be estimated. Apart from adapting the ODT to Czech conditions and testing pupillary responses to the test items, this study investigates countermeasures used to defeat the ODT and the interaction between the information document, deception and pupil dilation.

The first hypothesis relates to previously cited research on deception and pupil diameter. Research shows that deceptive participants show greater increases in pupil diameter than truthful participants when responding to items related to the crime they committed (e.g. Cook et al., 2012; Patnaik et al., 2016). Therefore, it is predicted that participants in the guilty condition will produce larger pupil diameter during the ODT when responding to items on the cash theft than innocent participants. The remaining two hypotheses relate to research on countermeasures, since it is crucial to examine countermeasures employed by subjects who do not wish to be detected as well as to explore the use of countermeasures by innocent subjects. Available research is mostly focused on countermeasures against credibility assessment by polygraph. Twyman, Schuetzler, Proudfoot and Elkins (2013) examined the effect of countermeasures on pupil diameter, but they applied very specific countermeasures (arithmetic practice or pain during non-target items). In contrast, the present study will test informational countermeasures with direct suggestions for defeating the test, combined with a post-test questionnaire investigating both spontaneous and information countermeasures used by the participants. The questionnaire findings will then be compared with the ODT results.


In the present experiment, three hypotheses will be tested:

H1: Guilty participants would show greater increases in pupil diameter while responding to the cash related items than innocent participants.

Lying is cognitively more demanding than telling the truth (e.g. Vrij et al., 2008; Vrij, Granhag, & Porter, 2010) and during cognitively demanding tasks pupils dilate (e.g. Hess and Polt, 1960; Kahneman and Beatty, 1966). This hypothesis is based on previous research findings (e.g. Webb et al., 2009a; Cook et al., 2012).

H2: As compared to guilty uninformed participants, guilty participants who receive information would show less difference between pupil reactions to questions about the two crimes.

The “Beat the ODT” document suggests some strategies that guilty participants may use to enhance their credibility. Based on the ODT rationale, significantly stronger reactions to one of the two crimes (in this case, the cash crime) indicate guilt of that crime. This hypothesis predicts that the use of some of these strategies would lead to a smaller difference between pupillary reactions to questions about the two crimes.

H3: Spontaneous countermeasures used by the guilty participants would not enhance participants’ credibility as compared to information countermeasures.

As concluded by Honts (2014), spontaneous countermeasures are widely used by polygraph examinees, although there is no scientific evidence it would help defeat the polygraph. In this thesis, spontaneous countermeasures are not expected to enhance participants’ credibility either.

The present study is a pilot of the ocular-motor deception test on a Czech sample. A hypothesis concerning cross-cultural comparison was not formulated, however, because significant differences between the two cultural contexts (Czech Republic and the USA) are not expected.


METHODS

Overview of the Design

The study is a pilot experiment with a 2 x 2 x 3 mixed factorial design used to test the hypotheses. A similar design was used in previous studies testing the ODT (Cook et al., 2012; Patnaik et al., 2016; Webb et al., 2009a). The two between-subject variables were guilt with two levels (guilty or innocent) and information with two levels (being informed about the rationale underlying the ODT or not being informed). The within-subject variable was statement type (neutral, cash, exam). For the analysis of pupil diameter, time with 40 levels (10 Hz samples x 4 seconds) was also included as a within-subject factor. Participants were randomly assigned to a condition (guilty informed/uninformed or innocent informed/uninformed). Participants assigned to a guilty condition stole 200 CZK from a secretary’s office. The list of conditions was created by a research assistant, since the experimental condition was to be concealed from the experimenter until the experiment was over. Because the list of participants was not known in advance, participants were given a condition in the order in which they called for an appointment. The independent variables were participants’ guilt (guilty/innocent), information (informed/uninformed), gender (male/female), language (Czech/Slovak) and lenses (glasses/no glasses). Guilt and information combinations are illustrated in Table 3. Ten participants were assigned to each of the four conditions, for a total of 40 participants.

Table 3: The 2x2 between subject variables: guilt and information condition

                 INFORMATION
GUILT            Informed              Uninformed
Guilty           Guilty informed       Guilty uninformed
Innocent         Innocent informed     Innocent uninformed
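The condition list described in the design overview can be illustrated with a short sketch. It assumes a simple shuffled list with ten slots per cell; the thesis does not state how the research assistant actually generated the list, and the names `CONDITIONS` and `make_condition_list` are hypothetical.

```python
import random
from collections import Counter

# The four between-subject cells from Table 3 (guilt x information).
CONDITIONS = [
    ("guilty", "informed"),
    ("guilty", "uninformed"),
    ("innocent", "informed"),
    ("innocent", "uninformed"),
]

def make_condition_list(n_per_cell=10, seed=None):
    """Return a shuffled list with n_per_cell slots for each condition.

    Assumption: a plain shuffle of a balanced list; the original
    randomization procedure is not documented.
    """
    rng = random.Random(seed)
    slots = [cond for cond in CONDITIONS for _ in range(n_per_cell)]
    rng.shuffle(slots)
    return slots

# Participants would take the next free slot in the order they call in.
schedule = make_condition_list(seed=2017)
print(Counter(schedule))
```

A balanced list of this kind guarantees ten participants per cell while keeping the order of conditions unpredictable to the experimenter.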

The dependent variable was the pupil diameter (PD), quantified by two measures:
• Area under the pupil response curve (AUC) reflects the overall pupil response evoked by recognition of the stimulus. It is obtained from the pupil response curve as the area between the baseline pupil size (BS) and the pupil size (PS) between two points in time: from response onset to the point at which the response returned to the initial level or to the end of the 4-second sampling interval, whichever occurred first. Response onset was defined as the low point in the response curve from which peak amplitude was measured. The 60 Hz PD data samples from the beginning of a block of 48 test items to the end of that block of items were standardized within participants.
• PD level at Response Onset is the size of the pupil around the time of the answer. PD level was the mean of the standard scores from 1 second before the moment the participant pressed a key to respond to the statement until 1 second after the response.

Typically, participants completed a set of 48 test items in about 4 minutes (240 seconds). With 4 minutes of PD data, standard scores would be computed using the mean and standard deviation of the 240 x 60 Hz = 14,400 data samples.

The PD data samples were standardized (converted to z-scores) within participants.
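The two pupil measures can be sketched in a few lines. This is only an illustration under stated assumptions: the actual computation was done in the CPSlab software, the use of the population standard deviation and of a rectangle-rule area are assumptions, and the function names `standardize`, `pd_level` and `pd_auc` are hypothetical.

```python
from statistics import mean, pstdev

def standardize(samples):
    """Convert one participant's PD samples for a block to z-scores.

    Assumption: population SD; the software may use the sample SD.
    """
    m = mean(samples)
    sd = pstdev(samples)
    return [(x - m) / sd for x in samples]

def pd_level(z, response_idx, rate_hz=60, half_window_s=1.0):
    """Mean z-score from 1 s before to 1 s after the key press."""
    half = int(rate_hz * half_window_s)
    lo = max(0, response_idx - half)
    hi = min(len(z), response_idx + half + 1)
    return mean(z[lo:hi])

def pd_auc(curve, rate_hz=10):
    """Area between the response curve and its level at response onset.

    Onset is taken as the low point preceding the peak. The area is
    accumulated until the curve returns to the onset level or the
    sampling window ends, whichever comes first.
    """
    peak_idx = max(range(len(curve)), key=curve.__getitem__)
    onset_idx = min(range(peak_idx + 1), key=curve.__getitem__)
    baseline = curve[onset_idx]
    area = 0.0
    for x in curve[onset_idx + 1:]:
        if x <= baseline:
            break
        area += (x - baseline) / rate_hz  # rectangle rule, in units x seconds
    return area

# A 4-minute block sampled at 60 Hz contains this many samples.
print(240 * 60)
```

The sketch does not handle a constant signal (standard deviation of zero) or missing samples such as blinks, which a real pipeline would need to treat.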

Participants

An advertisement in the Masaryk University Information System, together with fliers posted around campus and in other public buildings such as the sports center and dormitories, was used to recruit participants from the Masaryk University community. The advertisement stated that participants were needed for a psychology research study and that the reward for participation was 200 to 400 CZK for a maximum of two hours of their time. Fliers and advertisements provided contact information. For the pilot study, 40 participants were needed. Participants called the provided phone number or sent an email to obtain further information regarding the research. They then scheduled an appointment and were told where to go. Data from all 40 participants were collected within one month. Approximately four participants were usually scheduled per day, although about one fourth of the scheduled participants neither showed up nor cancelled the appointment. A total of 48 participants were recruited for the study, but data from eight participants were either not obtained or not used for the analysis:
• Five participants decided to withdraw from the study: two (one male and one female) directly after receiving the guilty instructions; the other three (females) came to the secretary’s office but realized they could not commit the theft and informed the experimenter that they wished to withdraw from the study.


• One participant (female) in the guilty condition underwent the deception test although she had not managed to steal the money from the secretary’s office; after the testing she admitted that she had not stolen anything. Apparently, she did not follow the instructions properly: instead of going to the confederate secretary’s office, she went to a real secretary of the department (fortunately, the participant did not manage to get into the office). Her data were not included in the analysis as she did not fulfill the instructions for her condition.
• One participant (female) underwent the deception test but admitted during debriefing that she had not lied in the test. Since the clear instruction was to attempt to appear innocent and she confessed to the crime repeatedly in the test, her data were not included in the analysis.
• One participant (male) in the guilty condition came to the secretary’s office and admitted that he was there to steal the money as part of the experiment; the instructions state explicitly that the participant must not tell the secretary or the experimenter anything about the experiment and is supposed to act as instructed. The experiment was over for this participant and data were not collected.

Ultimately, data from 40 Caucasian college students, or graduates (generally young people below 30) were used to conduct the analysis. Participants were randomly assigned to a guilty/informed (n=10), guilty/uninformed (n=10), innocent/informed (n=10) and innocent/uninformed condition (n=10). There were 16 males (40%) and 24 females (60%) in the sample. As for the guilt condition, 6 males and 14 females were innocent; the guilty condition was balanced with 10 guilty males and 10 guilty females. One of the sample characteristics was also the first language of the participants: there were 15 participants whose first language was Slovak and 25 participants speaking the Czech language. The guilty condition was balanced with 10 Czech and 10 Slovak participants whereas there were 5 Slovak and 15 Czech participants in the innocent condition.

Procedures

The experimental procedures were approved by the Institutional Review Board of the Masaryk University (Etická komise pro výzkum Masarykovy univerzity). To remain unbiased, the experimenter was unaware of the experimental condition to which the participant was assigned until the end of the experiment. All participants were suspected of having committed the mock crimes and were offered a monetary bonus to convince the experimenter of their innocence. The instructions were prepared by one of eight research assistants who also served as confederates in the mock-crime experiment. Before the study began, twelve subjects were tested as a pilot to check the comprehensibility of the items, questionnaire and information sheets. Software functionality was tested as well. The pilot sessions suggested that one of the ODT items was difficult to understand due to a double negative; this item was therefore changed. During the pilot sessions, the environment of the experiment and the procedures were optimized. Data obtained from the pilot sessions were not used for the pupil diameter analysis.

Participants called in response to the advertisement to schedule a meeting. At the given hour they came to a room near the lab in one of the Masaryk University buildings to receive instructions. In the instruction room, participants signed the informed consent and read or listened to the instructions. All participants were informed that one group of participants downloads a test from a teacher’s computer, a second group steals 200 CZK from a secretary’s purse and a third group is innocent of both crimes. In fact, there were only two groups of participants in this experiment: those who stole 200 CZK from the secretary’s purse and those who were innocent. This fact was hidden from the participants, and in all materials the test theft was mentioned first to make sure participants believed there were three groups. Prior to their arrival at the lab, all participants received the following instruction:

You will receive the bonus only if the examiner finds you innocent, so you must convince the examiner that you are innocent. If you indicate that you took the money, if the examiner decides when the test is over that you are guilty of either crime, or if the examiner cannot decide whether you are guilty or innocent, you will not receive the bonus. The test is based on the idea that a person who committed a crime will have a difficult time answering quickly and honestly to questions about the crime. You could make the examiner suspicious and fail the test if it takes you a long time to answer the questions, or if you make mistakes. To appear innocent, you should respond as quickly and as accurately as you possibly can.

Participants assigned to the innocent condition were instructed to wait for 20 minutes anywhere but the instruction room and then go to the lab to see the examiner. This ensured that guilty and innocent participants would arrive at the lab at approximately the same time after the scheduled appointment, so that the experimenter remained unaware of their assigned condition. Participants whose condition also included information about the ODT were instructed to study a document about the rationale underlying the ODT (with suggestions that might help to defeat it) for 5 minutes.


Subsequently, a chin rest was adjusted so that the participant could sit comfortably during the testing, and after calibration the testing could begin. The ODT took about 30 minutes. After finishing the test, the participants were asked to fill in the post-test questionnaire (which would not be used in the decision whether they were guilty or innocent) and to read the debriefing document. Then the participants were told the experiment was over and were debriefed. They were informed whether they had passed the test, in which case they were paid a bonus of 200 CZK, or failed, in which case they were paid their base salary of 200 CZK.

Participants assigned to the guilty condition were instructed to go to a secretary’s office upstairs. They were informed that they had no more than 20 minutes to commit the crime. They were given a map of the building with the way to the secretary’s office marked. The office was in the same building but on a different floor. On arriving at the secretary’s office, they were to ask her “Does doctor Blazek work here?”, and the secretary (a female confederate) responded that no such person worked at the department. The participant then left immediately and waited nearby for the secretary to leave the office unattended. The secretary left soon after their dialogue because of an urgent phone call, and the participant then entered her office and took 200 CZK from the wallet in a purse on the table. The participant then came to the lab and the procedure was the same as for the innocent participants: they filled in the post-test questionnaire, were debriefed and were paid their base salary, plus a bonus if the computer indicated they were innocent.

Materials

2.4.1 The Ocular-Motor Deception Test

In the Czech language, each noun has a gender. The natural gender is logical, for example, for people and animals. For the rest of the nouns, though, there is no logical explanation of their gender. The noun gender affects not only the declension of each noun, but also the forms of adjectives, pronouns and verbs in each sentence. Therefore, two versions of the ODT (male and female) were used to personalize the ODT. The Ocular-motor Deception Test started shortly after calibration. First, instructions and practice items were presented to the participant in black font on a pale grey background. Participants received the instruction: “You should respond as quickly and accurately as possible to appear innocent.” Participants answered 11 practice items. Then the eye-tracker started collecting data and 48 test items were presented five times in different orders. Sixteen items pertained to the theft of the 200 Czech crowns (e.g., The cash from the secretary's desk is hidden on my person.), 16 addressed the theft of the exam (e.g., I copied the exam on the professor's computer to a USB drive.), and 18 were neutral items (e.g., The sky is blue on sunny days.). Statements were presented one at a time on a single line and the characters S/N (Správně/Nesprávně, corresponding to T/F, True/False) appeared for 200 ms to the far right of the statement to remind participants of their answer choices. Then, the response was replaced by the next statement 200 ms later. Participants responded by pressing a button labelled S or N. The correct (i.e., nonincriminating) answer was true for 8 of 16 items in a category and false for the remaining 8 items in each category (9 true and 9 false for the neutral items).
Each statement type required an equal number of true and false responses, and each group of true and false statements was subdivided into equal numbers of statements with negation (e.g., “I did not take the 200CZK from the secretary’s purse.”) and without negation (e.g., “I took the 200CZK from the secretary’s purse.”). The number of characters in the neutral, cash and exam (USB) items was balanced. The mean length in characters (with spaces) of the statements in the neutral, cash, and exam conditions for the male and female versions is shown in Table 4.

Table 4: Mean length of item types (neutral, cash, exam) for male and female versions of the ODT in characters with spaces.

                  Male (M)            Female (F)
Statement type    Mean      SD        Mean      SD
Neutral           43.44     11.56     43.61     11.73
Cash              45.73      6.28     46.02      6.35
Exam              46.45      9.12     46.82      9.42

Between repetitions of the 48 test items, participants were asked to go through 32 simple arithmetic practice items (e.g., 8+3=2.). Half of these items had the correct answer N and for the other half it was S. It was designed to provide a break and clear working memory of ODT test items and answers. During the arithmetic practice no data from the eye-tracker was collected.

2.4.2 The post-test questionnaire

The post-test questionnaire was created by Dr Kircher to reveal which countermeasures participants used to defeat the ODT and in which part of the ODT they used them. For each strategy, the participant indicated the time period during the test when it was applied and how effective they thought it was. The exact instruction was the following:

Please describe specific strategies you used during the test in order to pass the test (regardless of whether you were truthful or not). With each strategy that you tried, choose the term that best describes when you implemented the strategy and indicate how effective you think it was.


At the end of the questionnaire, participants responded to eleven statements on a Likert scale (1 = Strongly Agree, 5 = Strongly Disagree). Some of these statements refer to the participant’s motivation to participate in this study (“I volunteered for this study because I was curious about the technology.” / “I volunteered for this study because I wanted the pay.”). Three items examined the participant’s sincere effort to pass the test (“I did not try very hard to pass the test.” “I made a concerted effort to appear truthful on questions about the exam and $20.” “I tried my best to stay focused on answering quickly and accurately.”). One of these items was reverse coded. Anxiety was also measured by the post-test questionnaire, using a summed scale of three items coded on a 5-point scale (“When the test began, I felt nervous about whether I would appear truthful.” “I knew this was just research, so I felt no anxiety regarding my answers.” “I was relaxed during this test, because I knew it had no consequences.”). Two of the items were reverse coded before computing the summed Anxiety scale.
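Reverse coding and the summed scale can be illustrated as follows. This is a minimal sketch: the thesis does not say which two anxiety items were reversed, so the reversed positions are passed in as a parameter, and the function names and example responses are hypothetical.

```python
def reverse_code(score, scale_min=1, scale_max=5):
    """Mirror a Likert response around the scale midpoint (1<->5, 2<->4)."""
    return scale_max + scale_min - score

def summed_scale(responses, reversed_items):
    """Sum item responses after reverse coding the listed item positions.

    responses: list of 1-5 ratings, one per item.
    reversed_items: set of zero-based positions to reverse (an assumption
    here, since the reversed items are not identified in the text).
    """
    return sum(
        reverse_code(score) if idx in reversed_items else score
        for idx, score in enumerate(responses)
    )

# Hypothetical responses to the three anxiety items.
anxiety = summed_scale([2, 4, 5], reversed_items={1, 2})
print(anxiety)
```

With three items on a 5-point scale, the summed score ranges from 3 to 15.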

2.4.3 The information documents

Participants that were assigned to the ‘informed’ condition got five minutes to study a “Beat the ODT” document that was created in cooperation with Dr Kircher. The document has a male and female version just like the ODT. It contains information about what eye- tracking is, basic information about pupil dilation, cognitive effort during lying and suggestions on how to defeat the ODT, such as: The computer will compare your reactions to the statements about the two crimes. If your reactions to statements about one crime are similar to your reactions to statements about the other crime, the computer will classify you as truthful on the test. If you take about the same amount of time to answer statements about one crime as the other, the computer will classify you as truthful. If you make about the same number of mistakes when you answer statements about one crime as the other, the computer will classify you as truthful. On the other hand, if you react differently to statements about the two crimes, then you will fail the test.

If you took the exam or the $20, we want to see if you can figure out a way to beat the test. To beat the test, you need to respond quickly, accurately, and consistently to statements about the two crimes. For example, you might imagine that you did not take the exam or the $20; that is, you are completely innocent of both crimes. Alternatively, you might imagine that you committed both crimes, in which case your reactions to statements about both crimes should be similar. Or, you might try to mix things up. If you took the $20, you might try to think hard about a few of the statements about the exam. If you committed a crime, you should be careful


to not take too long to respond or make too many mistakes, since those behaviors are clear indicators that you are trying to defeat the test.

Apparatus

A SensoMotoric Instruments (SMI) RED250 mobile eye tracker attached to a desktop PC monitor recorded eye movements and pupil diameter at a frame rate of 60 Hz. A chin rest was used to keep the participant’s head still. Stimuli were presented to the participant on a 19-inch flat-screen LCD monitor with a 5:4 aspect ratio. The monitor was positioned approximately 65 cm from the participant’s eyes. The lab setting is presented in Figure 3.

Figure 3: Laboratory setting. On the left, the chin rest and the square monitor used to present the ODT items, with the SMI RED250 mobile attached. On the right, the examiner's laptop connected to the monitor and the SMI RED250 mobile.

Data Collection and Analysis

The statistical analysis was conducted in IBM SPSS Statistics version 24. The data were originally collected and edited in the EYElab and CPSlab software (mediated by iView). Repeated measures analyses of variance (RMANOVAs) were conducted on the dependent variables.


RESULTS

The primary goal of this thesis was to determine the effect of deception on ocular-motor measures and to explore the interaction between guilt condition, question type (neutral, cash, USB) and information (informed/uninformed). Repeated measures analysis of variance (RMANOVA) was used to analyze the dependent variables: pupil measures (PD area under the curve and PD level). Deception detection was tested by comparing means of the relevant issues (cash and USB items) for guilty versus innocent individuals. The level of significance for all statistical tests was set at .05. Significance levels for tests involving within-subject factors were conservatively assessed with Huynh-Feldt adjusted degrees of freedom.

To test whether guilty participants would show greater increases in pupil diameter while responding to the cash-related items than innocent participants, participants’ mean pupil diameter waveforms over the 4 s following each item onset were analyzed as repeated measures. To test the interaction between guilt condition, pupil reactions and the information condition, RMANOVA was conducted for both the PD area under the curve (AUC) and PD level.

The second hypothesis (As compared to guilty control participants, guilty participants who receive information would show less difference between pupil reactions to questions about the two crimes) assumed that, having read the “Beat the ODT” document, participants would use the suggested cognitive strategies and hence show smaller pupillary reactions to both relevant item types. However, according to the post-test questionnaires, participants generally did not use the cognitive strategies much, or at least did not report using them.

The third hypothesis predicted that spontaneous countermeasures used by the guilty participants would not enhance participants’ credibility as compared to information countermeasures. However, results of the post-test questionnaires indicated that, with a few exceptions, participants did not use cognitive strategies (suggested in the “Beat the ODT” document) that would effect changes in pupil diameter. Therefore, only guilty participants that managed to come up with effective strategies and appeared innocent on the test will be identified and their strategies explored. Successful individuals will be identified from the subset of deceptive subjects that reacted more strongly to exam items than to cash items. In all likelihood, those people would have passed the test (even if response time, error rate and reading patterns were included in the overall classification accuracy), since PD level is the best predictor of deception among the ODT predictors.
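The selection rule for "successful" guilty participants can be written down directly. This is a minimal sketch with made-up participant codes and PD level values; it only illustrates the comparison of exam and cash reactions and is not the classification model used by the ODT software.

```python
def stronger_exam_than_cash(pd_level_by_participant):
    """Return IDs of participants whose PD level for exam items
    exceeds their PD level for cash items."""
    return sorted(
        pid
        for pid, levels in pd_level_by_participant.items()
        if levels["exam"] > levels["cash"]
    )

# Hypothetical guilty participants (values are illustrative only).
guilty = {
    "G01": {"cash": 230.0, "exam": 120.0},
    "G02": {"cash": 90.0, "exam": 210.0},
    "G03": {"cash": 150.0, "exam": 150.0},
}
print(stronger_exam_than_cash(guilty))
```

Ties are not counted as stronger exam reactions in this sketch; how such cases should be treated is a design choice the text leaves open.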

Pupil measures

H1: Guilty participants would show greater increases in pupil diameter while responding to the cash related items than innocent participants.

Figure 4 shows the mean change of pupil diameter from baseline. The PD waveform signals over the 4 seconds from statement onset are not standardized; rather, the deviation from the initial value was used. The waveform was output at 600 samples per minute (10 Hz). The mean change in pupil diameter is tracked for 4 seconds at 10 Hz; therefore, there are 40 levels of the repeated measure in Figure 4. A positive difference indicates a PD increase relative to the initial value and a negative difference indicates a PD decrease relative to the initial value. Approximately one second after question onset, the pupil diameter began to distinguish among statement types (neutral, cash, exam). Neutral items were associated with the weakest pupil responses for both innocent and guilty participants. Innocent participants showed a slightly stronger pupil response to exam items than to cash items, while guilty participants showed a somewhat stronger reaction to cash items than to exam items. There is a noticeable drop in the curve for neutral items between seconds 2 and 3, indicating pupil constriction.

Figure 4: Mean change in pupil diameter for 4 seconds following item onset by item content and guilt condition.
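The 40-level waveform can be illustrated with a short sketch. It assumes that the 10 Hz series is obtained by keeping every sixth 60 Hz sample and that the first retained sample serves as the initial value; the text does not say how the software resampled the signal, and the function name is hypothetical.

```python
def change_from_onset(samples_60hz, in_rate=60, out_rate=10, seconds=4):
    """Return the change from the initial value at out_rate Hz.

    Assumption: plain decimation (every in_rate/out_rate-th sample);
    averaging within each 100 ms bin would be another option.
    """
    step = in_rate // out_rate
    n_levels = out_rate * seconds
    picked = samples_60hz[::step][:n_levels]
    initial = picked[0]
    return [x - initial for x in picked]

# A 4-second response sampled at 60 Hz (240 samples) yields 40 levels.
waveform = change_from_onset([3.0 + 0.001 * i for i in range(240)])
print(len(waveform))
```

Positive values in the returned list correspond to dilation relative to item onset and negative values to constriction.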


H1: Results, as illustrated in Figure 4, show that guilty participants had greater increases in pupil diameter while responding to the cash-related items than innocent participants. The first hypothesis was supported by the data. Repeated measures ANOVA revealed a weak effect of deception on the mean change in pupil diameter for 4 seconds following item onset, F(1.90,72)=4.44, η²=.11, p<.05; information did not moderate the effect of deception on the mean change in pupil diameter in response to the three statement types, F(1.90,72)=.35, p>.05.

H2: As compared to guilty control participants, guilty participants who receive information would show less difference between pupil reactions to questions about the two crimes.

To illustrate the difference in pupillary reactions to relevant items (cash and exam) in guilty participants who received and did not receive the “Beat the ODT” document, the mean change of pupil diameter from baseline was produced for two separate groups: guilty-informed and guilty-uninformed. As we can see in Figure 5, both informed and uninformed guilty participants reacted more strongly to cash items than to exam items, while reactions of guilty-informed participants were generally slightly stronger.

Figure 5: Mean change in pupil diameter for 4 seconds following item onset by item content and guilty-information condition.

Pupil diameter (PD) level and the PD area under the curve (AUC) are expressed in z-scores times 1000. A repeated measures ANOVA assessed the interaction between pupil diameter, guilt and item type. The magnitude of the pupil response as measured by PD AUC is shown in Figure 6a for each group and item type; Table 5 presents all mean values and standard deviations of PD level and PD AUC by guilt and information condition. Pairwise comparisons showed some significant differences in PD AUC between items for both innocent participants, Wilks’s λ=.46, F(2)=13.21, p<.05, and guilty participants, Wilks’s λ=.35, F(2)=17.25, p<.05, but neither innocent nor guilty participants differed significantly in PD AUC between cash and exam items. The PD level at the time the participant answered the item is shown in Figure 6b for each group and item type. Pairwise comparisons showed that innocent participants differed significantly in mean PD level across the statement types, Wilks’s λ=.38, F(1.44)=9.84, p<.05, η²=.34. Innocent participants had a significantly greater pupillary response to exam items (M=181.81, SD=145.80) than to cash items (M=80.04, SD=150.49). Guilty participants also had significant pairwise differences in PD level, Wilks’s λ=.39, F(1.82)=19.50, p<.05, but not between cash and exam items.

Figure 6: Mean area under the curve and level at response for pupil diameter in mm x 1000 by item content and guilt condition.

As for the PD level, the guilt by statement type interaction (the effect of deception on the pupil response) was significant but weak, F(1.81, 65.22)=5.01, η²=.12, p≤.01. Information, on the other hand, did not moderate the effect of deception on the pupil reaction (PD level) to the three statement types, F(1.81, 65.22)=.01, η²=.00, p>.05. For PD AUC, there was no significant interaction between guilt and statement type, F(2, 72)=1.46, p>.05. As with the PD level, information did not moderate the effect of deception on PD AUC in response to the three statement types, F(2, 72)=.44, p>.05. Mean PD level and PD AUC by guilt and information condition are listed in Table 5.
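The effect sizes reported with these F tests can be cross-checked from the F values and degrees of freedom. The sketch below assumes that the reported effect sizes are partial eta squared, which is not stated explicitly in the text; the function name `partial_eta_squared` is introduced here for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for PD level.
reported = {
    "guilt x statement type": (5.01, 1.81, 65.22),
    "information x guilt x statement type": (0.01, 1.81, 65.22),
}
for label, (f_value, df_effect, df_error) in reported.items():
    print(f"{label}: {partial_eta_squared(f_value, df_effect, df_error):.2f}")
```

Under that assumption, the computed values agree to two decimals with the reported .12 and .00.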


Table 5: Mean values and standard deviations for PD level and PD AUC.

Item     Guilt  Info   PD level Mean  PD level SD  PD AUC Mean  PD AUC SD  N
Neutral  0      0      38.32          237.35       114656.52    11646.94   10
Neutral  0      1      -9.93          139.49       114170.84    21279.22   10
Neutral  0      Total  14.20          191.08       114413.68    16697.44   20
Neutral  1      0      -38.51         219.80       107688.75    22048.20   10
Neutral  1      1      -143.58        174.16       104395.53    30110.06   10
Neutral  1      Total  -91.05         200.39       106042.14    25740.48   20
Neutral  Total  0      -0.10          226.10       111172.64    17530.00   20
Neutral  Total  1      -76.76         168.18       109283.19    25866.64   20
Neutral  Total  Total  -38.43         200.48       110227.91    21830.93   40
Cash     0      0      102.07         123.47       145086.24    30865.42   10
Cash     0      1      58.00          177.44       130105.56    25096.13   10
Cash     0      Total  80.04          150.49       137595.90    28436.92   20
Cash     1      0      229.64         184.06       145592.90    36682.43   10
Cash     1      1      133.46         182.75       142271.93    40678.15   10
Cash     1      Total  181.55         185.21       143932.42    37737.29   20
Cash     Total  0      165.86         165.99       145339.57    32995.82   20
Cash     Total  1      95.73          179.54       136188.75    33482.79   20
Cash     Total  Total  130.79         174.32       140764.16    33136.85   40
Exam     0      0      193.38         151.29       140972.82    22403.69   10
Exam     0      1      170.23         147.27       134083.69    27180.01   10
Exam     0      Total  181.81         145.80       137528.26    24498.56   20
Exam     1      0      221.84         209.40       133277.56    35118.58   10
Exam     1      1      157.09         150.98       136975.99    40712.49   10
Exam     1      Total  189.47         180.75       135126.78    37053.16   20
Exam     Total  0      207.61         178.40       137125.19    28940.27   20
Exam     Total  1      163.66         145.32       135529.84    33723.46   20
Exam     Total  Total  185.64         162.13       136327.52    31028.05   40
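The Total rows of Table 5 can be related to the subgroup rows by combining group sizes, means and standard deviations. The following sketch assumes that the tabled SDs are sample standard deviations (n - 1 in the denominator); the helper `combine_groups` is a hypothetical name used only for this illustration.

```python
import math


def combine_groups(groups):
    """Combine (n, mean, sd) triples into the n, mean and sd of the merged sample."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Sum of squares within each group plus sum of squares between group means.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    sd_total = math.sqrt((ss_within + ss_between) / (n_total - 1))
    return n_total, grand_mean, sd_total


# PD level for neutral items, Guilt = 0, the two information subgroups of Table 5.
n, mean, sd = combine_groups([(10, 38.32, 237.35), (10, -9.93, 139.49)])
print(n, mean, sd)
```

The combined values come out at roughly 14.2 for the mean and 191.1 for the SD, which is close to the corresponding Total row (14.20, 191.08); the small difference is expected because the inputs are rounded.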

There was no main effect of guilt across all item types on either PD AUC, F(1,36)=.03, p>.05, or PD level, F(1,36)=.00, p>.05. Nor was there a main effect of information across all item types on PD AUC, F(1,36)=.28, p>.05, or PD level, F(1,36)=2.32, p>.05.

H2: Information did not moderate the effect of deception on the mean change in pupil diameter, F(1.90, 72)=.35, p>.05, on PD level, F(1.81, 65.22)=.01, η²=.00, p>.05, or on PD AUC, F(2, 72)=.44, p>.05, in response to the three statement types. As shown in Figure 5, the difference in pupil dilation in response to the two relevant items was not smaller for guilty-informed participants than for guilty-uninformed participants. The hypothesis was not supported by the data.

Repeated measures ANOVAs were also used to assess the interaction effect between pupil diameter, statement type and the language of the participant (Czech or Slovak). Language significantly moderated the effect of deception on PD level, F(1.93, 72)=4.5, η²=.11, p<.05, although the effect was weak. Slovak guilty participants responded more strongly to both relevant items than Czech guilty participants. Mean PD level at response by statement type and guilt condition for Czech and Slovak language is shown in Figure 7.

Figure 7: Mean PD level (mm x 1000) at response by statement type and guilt condition for Czech and Slovak language.

Language of the participant significantly moderated the effect of deception on PD AUC as well, F(2, 72)=5.02, η²=.12, p<.05, although the effect was weak. Mean PD AUC by statement type and guilt condition for Czech and Slovak language is shown in Figure 8.


Figure 8: Mean area under the curve in relative units x 1000 by statement type and guilt condition for Czech and Slovak language.

Countermeasures

H3: Spontaneous countermeasures used by the guilty participants would not enhance participants’ credibility as compared to information countermeasures.

A qualitative content analysis was conducted to classify the strategies that participants used in their attempts to defeat the ODT. Each participant had the option to list 1-3 strategies. Overall, 68 statements were evaluated. Ten statements were disqualified because they were too vague or because it was not clear what was meant by the strategy. All strategies were classified by content into 7 categories, as listed in Table 6. Ultimately, only three participants used the cognitive strategies suggested by the “Beat the ODT” document (imagining being innocent or having committed the other crime), and one of those three participants was not even in the information condition.

Table 6: Categories of strategies used by participants to defeat the deception test.

Category of strategies            Strategies                                        N    %      G    I
1. Controlling the speed of       Uniform/regular speed of answers                  27   39.7   17   14
   answers                        Answer as fast as possible
2. Skim through the items         Searching for key words                           8    11.8   1    7
                                  Remember the items/answers
3. Thorough reading of the items  Read the text properly                            5    7.4    2    3
                                  Think about each item
4. Belief in innocence            Imagine/believe they are innocent                 9    13.2   5    4
                                  Imagine telling the truth
                                  Not to think of the committed crime
5. Controlling emotions           Reading without emotions                          6    8.8    4    2
                                  Having 'poker face'
                                  Stay calm
                                  Breathe regularly and deeply
                                  If distracted, briefly close eyes to relax
6. Confusing the test             Making deliberate mistakes                        10   14.7   9    1
                                  Deliberately taking longer to respond
                                  Uniform eye movements
                                  Random eye movements
                                  Sometimes respond randomly
                                  Looking at answers that prove innocence
7. Other                          Act innocent or as if committed the other crime   3    4.4    3    0
                                  Think about something else
                                  Dissociate from the crime
Total                                                                               68   100

Annotation: N…frequency of the strategy; G/I…frequency of use by guilty/innocent participants.

The most commonly used category of strategies was controlling the speed of answers (nearly 40% of strategies). Typically, participants tried to answer as quickly as possible and/or attempted to answer all statements at the same speed. Another category was skimming through the text rather than reading it properly (11.8%). In this category, participants searched for mentions of money, crime, USB, guilt, innocence or negative verbs. They attempted to memorize the items and answers so that they would not have to read them in the next trial. The opposite category was thorough reading of the items (7.4%), where participants read each item carefully and thought about the answers. Another category of strategies was belief in one’s innocence (13.2%): participants denied the crime, did not think about it, imagined they had not committed it, concentrated on their innocence, imagined they were telling the truth or tried to convince themselves they were innocent. Some of the strategies aimed to control emotions or the manifestation of emotions; these were labeled as controlling emotions (8.8%) and included emotionless reading, having a ‘poker face’, closing the eyes and relaxing to ease the tension in case of distraction, reacting calmly to all items, breathing deeply and regularly, and staying calm even when answering items relevant to the participant’s guilt.
A number of participants used strategies that were labeled as confusing the test (14.7%). This category contained making mistakes on purpose, either after a mistake on a relevant item or without a previous mistake; deliberately thinking longer on the next item after thinking longer on one item; looking only at answers proving one’s innocence; and sometimes spending longer reading items about the crime that was not committed. One participant responded randomly in about 10% of the cases. The last category, labeled as other, holds only three strategies. These strategies most closely resemble cognitive strategies: pretending to be innocent or to have committed the USB crime, thinking about something other than the crime, and dissociating from the crime.

As for the relation of the guilt and information conditions to the countermeasures used, the categories were quite balanced across the informed and uninformed conditions. Notably, the second category (skimming through the text instead of reading the items thoroughly) was mostly used by innocent participants. Deliberately confusing the test, on the other hand, was mostly applied by guilty participants, as was the last category of strategies (other), which was used by no innocent subject.

Successful individuals were identified from the subset of deceptive subjects who showed a bigger PD level increase when responding to exam items than to cash items. Twelve guilty subjects were identified, and seven of them showed a difference between the two relevant items bigger than 0.88 mm. These individuals would be considered successful in defeating the ODT. On the other hand, only one innocent subject yielded a greater PD level during the response to cash items than to exam items. The difference was, however, less than 0.05 mm and would not directly lead to a false-positive conclusion.

Table 7: Categories of strategies used by guilty participants who showed bigger increases in PD level during exam items (R2) compared to cash items (R1).

Subject   PD |R3-R2| in mm   Info   Number of strategies
1         .227               0      2
2         .170               0      3
3         .156               0      2
4         .152               1      4
5         .112               1      0
6         .132               1      2
7         .88                0      2

Total by category (1 to 6): 4, 1, 1, 2, 2, 5

As Table 7 shows, most guilty participants who showed greater increases in pupil diameter during exam items than during cash items (with a difference in pupil diameter between cash and exam items of at least 0.88 mm) have two categories of strategies in common: controlling the speed of their answers and deliberately confusing the test. Since the first category was the most commonly used in the whole sample, it is not surprising that it appeared repeatedly in these subjects as well. Within this category, subjects tried either to answer as quickly as possible or to keep a regular speed of answers for all items.

The most commonly used category of strategies among these seven subjects was deliberately confusing the test. In the whole sample of guilty participants, this strategy appeared nine times, and five of the participants who used it managed to show a greater increase in pupil diameter during exam items than during cash items. These five participants stated that after making a mistake they deliberately made mistakes on other items, and that they sometimes deliberately took longer to respond to irrelevant questions to hide hesitation on cash items. Furthermore, two informed subjects used imagining their innocence as a countermeasure. Two other subjects tried to control their emotions, for example by having a “poker face” and trying to stay calm. Four other guilty subjects in the whole sample also deliberately tried to confuse the test; however, these subjects used different strategies, such as uniform eye movements, random eye movements, sometimes responding randomly, and looking at answers that prove innocence. These strategies differ from those used by the successful guilty subjects.

Only three of the seven subjects obtained the “Beat the ODT” document prior to the lie detection test, and one of those three did not list any countermeasures. Nevertheless, the uninformed subjects developed strategies similar to those suggested in the document. At this point it might be useful to reconsider the classification and terminology used for countermeasures: since these participants did not obtain the document, their countermeasures were spontaneous. However, some of these spontaneous countermeasures are the same as the information countermeasures provided in the document.

H3: It is not possible to reject the null hypothesis, mainly because the spontaneous countermeasures of some participants were similar to the countermeasures suggested by the “Beat the ODT” document. Insisting on a distinction based on whether the participant developed the strategy during the test or was advised to use it beforehand does not seem to make sense. In the end, it is the same strategy no matter where it came from.

Additional results

As mentioned previously, participants also filled in a post-test questionnaire, in which they indicated the number of strategies used during the test. Table 8 shows that only four participants did not state any strategy (or stated that their strategy was to answer the questions truthfully; since these participants were innocent, this does not count as a strategy). About two thirds of the participants used two or more strategies to defeat the test. There was no statistically significant difference in the mean number of strategies between guilty (M=2.05, SD=1.00) and innocent participants (M=1.85, SD=1.18), t(38)=-.58, p>.05. However, the difference in the mean number of strategies listed by males (M=1.56, SD=1.21) and females (M=2.21, SD=.93) almost reached statistical significance, t(38)=1.91, p=.06.

Participants rated each strategy in the post-test questionnaire according to how effective they thought it was, on a scale from 1 (not effective at all) to 5 (very effective). A significant difference in perceived strategy effectiveness was found between innocent (M=3.89, SD=.73) and guilty participants (M=3.31, SD=.72), t(36)=2.46, p<.05. No significant difference in rated strategy effectiveness was found between uninformed (M=3.51, SD=.76) and informed participants (M=3.69, SD=.80), t(36)=-.74, p>.05.

Table 8: Self-reported number of strategies used by participants to defeat the test.

No. of strategies   Frequency   Percent
0                   4           10
1                   10          25
2                   12          30
3                   12          30
4                   2           5
Total               40          100.0

A significant difference was also found in anxiety levels between guilty (M=3.45, SD=.70) and innocent participants (M=2.72, SD=.98), t(37)=-2.68, p=.01. In other measures, such as relative anxiety, motivation by salary or technology, or effort, no statistically significant difference was found between guilty and innocent participants.
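The independent-samples t statistics in this section can be recomputed from the reported means and standard deviations. The sketch below does this for the comparison of the number of strategies listed by guilty and innocent participants, assuming 20 participants per group (consistent with the 38 degrees of freedom) and a pooled-variance t-test; the function name `pooled_t` is introduced here only for illustration.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Number of strategies: guilty (M=2.05, SD=1.00) vs. innocent (M=1.85, SD=1.18).
t, df = pooled_t(2.05, 1.00, 20, 1.85, 1.18, 20)
print(t, df)
```

With these inputs the statistic is about 0.58 in absolute value on 38 degrees of freedom, far from conventional significance thresholds; its sign depends only on which group is entered first.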


DISCUSSION

The present pilot mock-crime study was conducted at a Czech university with forty Czech- and Slovak-speaking participants. The first goal of this study was to determine whether pupil diameter increases more in guilty participants while they read statements about the crime they committed than it does in innocent participants while they read the cash-related statements. The second goal was to determine whether receiving the information (the “Beat the ODT” document) would help guilty participants show less difference between pupil reactions to questions about the two crimes and therefore appear innocent. The information some participants received may have served as an efficient source of countermeasures to defeat the test. Finally, this thesis tested the hypothesis that spontaneous countermeasures would not enhance participants’ credibility on the ODT.

As can be seen from the mean change in pupil diameter for 4 seconds following item onset (presented previously in Figure 4), guilty participants reacted more strongly to cash items (theft of 200 CZK) than to exam items (downloading a test from a professor’s office). Guilty participants also reacted more strongly to cash-related items than innocent participants did. Consistent with previous research (e.g. Bradley & Janisse, 1981; Cook et al., 2012; Heilveil, 1976), deception was associated with increases in pupil size. Cook and her colleagues (2012) speculated that when participants deliberately suppress reading behaviors and/or deliberately read faster to avoid detection (or use any other countermeasure), the extra effort invested in implementing these strategies on statements answered deceptively may raise cognitive effort and contribute to the observed pupil dilation.

The present study did not find a great difference in PD level or PD AUC between cash and exam items in guilty subjects. Based on previous research (e.g. Cook et al., 2012; Patnaik, 2016), a bigger difference was expected for guilty subjects, as the ODT is based on the Relevant Comparison Test rationale: the difference between crime-related items should be more diagnostic than the difference between crime-related and neutral items. One possible cause of this finding is the use of the SMI RED 250 eye tracker, which may not adequately measure changes in pupil size. In unpublished research conducted at the University of Utah by Dr. Kircher and his colleagues, when subjects performed mental arithmetic (easy, moderately difficult, and difficult multiplication problems), there appeared to be ceiling effects: the SMI RED 250, unlike the SMI REDm, did not discriminate pupil size between moderately difficult and difficult multiplication problems.


Innocent participants generally showed stronger reactions to crime-relevant items than to neutral items and, at the same time, slightly stronger pupillary reactions to exam items than to cash items. The finding that innocent participants respond more strongly to crime-relevant statements than to neutral stimuli is consistent with previous findings (e.g. Webb et al., 2009a; Cook et al., 2012). An explanation could be that threatening stimuli (the possibility of being accused of a crime) evoke a stronger response of the autonomic system even if the participant is innocent (Horowitz et al., 1997). At the same time, innocent participants reacted more strongly to exam statements than to cash statements. Cook et al. (2012) proposed two possible causes of a similar result: either participants find the exam crime more despicable, or the exam statements were semantically more complex and required more cognitive effort to comprehend.

Generally, the presence or absence of the information document did not moderate the effect of deception on the pupillary response for either PD level or area under the curve. It was predicted that guilty-informed participants would show less difference in reactions to the two relevant items than guilty-uninformed participants because of the cognitive strategies suggested in the “Beat the ODT” document. Even though the information condition did not interact with pupil reactions to the three statement types, the mean pupil change of guilty-informed and guilty-uninformed participants is not the same: guilty-informed participants reacted more strongly to both relevant statement types than guilty-uninformed participants.

The present study also explored countermeasures. Half of the participants had a chance to read the “Beat the ODT” document before undergoing the test.
The document contained information on the rationale underlying the ODT and suggestions for defeating it, such as responding at the same speed to both relevant item types or not making more mistakes on one item type than on the other. Most importantly, participants were advised to use cognitive countermeasures, such as imagining that they are innocent or that they have committed the other crime or both crimes. Unfortunately, only three participants used a cognitive strategy suggested by the information document (imagining being innocent or having committed the other crime), and one of them was not even in the information condition. One participant attempted to balance the number of mistakes, but not between relevant items: after a mistake on a relevant item, another mistake was made on a neutral item. About 40% of all strategies that participants listed were labeled as “controlling the speed of answers”, where participants tried to answer as quickly as possible and/or to answer all statements at the same speed.


Seven successful individuals were identified from the subset of deceptive subjects who showed a bigger PD level increase when responding to exam items than to cash items. The most commonly used categories of strategies were controlling the speed of answers and deliberately confusing the test. The latter seems to be a promising strategy, especially deliberately making mistakes after giving a wrong answer and sometimes deliberately taking longer to respond to irrelevant questions to hide hesitation on cash items. These strategies were used only by guilty subjects who ultimately managed to show stronger pupil reactions to exam items than to cash items.

Additionally, an interaction effect between language, deception and pupil size was revealed. Slovak guilty participants responded more strongly to both relevant items than Czech guilty participants. As Just and Carpenter (1993) noted, the pupillary response may be sensitive to resource demands in language processing. The stronger pupil reactions of the Slovak participants may relate to the higher demands of processing the Czech language. In the context of our study, we also need to consider the wording of some of the ODT items in Czech and their ease of comprehension for the participants.

As cited earlier, DePaulo et al. (2003) conducted a meta-analysis of nonverbal cues to deception. Even though the ODT had not yet been established at that time, pupil dilation was included in the analysis. Pupil dilation had one of the highest effect sizes (0.39) among cues with fewer than 5 estimates. Generally, however, the non-verbal deception cues did not reach very high effect sizes (up to -0.66). DePaulo et al. (2003) also mentioned the issue of laboratory lies: most studies are conducted on college students telling truths and lies in a laboratory experiment.
Miller and Stiff (1993) criticized laboratory studies on deception on the grounds that people are not motivated enough to get away with their lies. Another criticism is that participants do not deliberately choose to lie; they do so because they are forced to. In the present study, this criticism is irrelevant at least to a certain point: guilty participants were motivated to lie by the possibility of earning bonus money if they appeared truthful on the lie-detection test. However, they were still aware that it was only a laboratory experiment and that there were no real consequences of being revealed as deceptive.

The rationale behind the ODT is that deceit requires more cognitive effort than being truthful and that pupils dilate during cognitive effort. Six conditions under which lying is more cognitively demanding than truth-telling were listed previously (Vrij et al., 2008; Vrij, Granhag, & Porter, 2010). As mentioned earlier, at least two of them should be fulfilled for deceit to be more cognitively demanding. In the experiment presented in this thesis, the liars indeed attempted to control their demeanor in order to appear honest; this amount of controlling and monitoring is cognitively demanding. Liars were definitely suppressing the truth while they were lying; also, activating a lie is deliberate and intentional and therefore cognitively more demanding. Therefore, four of the six conditions under which lying is more cognitively demanding than telling the truth were fulfilled.

Limitations of the study

The first limitation of the present study is the sample size. The sample consisted of forty participants, which leaves only ten subjects in each condition. It was also a self-selected sample: the participants, who volunteered in order to earn the monetary reward, were Caucasian university students, so generalization to the population may be limited. As the study was considered a pilot, the results will serve mostly to develop and improve further research plans on the ODT.

The present study is exploratory to a certain extent; countermeasures are documented for polygraph techniques, which are based on sympathetic reactions. It is not clear, however, to what extent the ODT relies on sympathetic reactions. It is therefore also unclear to what extent knowledge of polygraph countermeasures can be applied to the ODT and whether solely cognitive countermeasures would be efficient. In the present study, a sincere effort to explore countermeasures was made. The formulation of the last hypothesis turned out to be unfortunate, since the procedure did not lead the participants to use the cognitive countermeasures. It is likely that with a modified document and a bigger sample, a significant interaction effect of information, statement type and pupil dilation would be found.

Additionally, the post-test questionnaire on countermeasures was based on self-report. Participants may not have listed all the strategies they used, or the strategies they listed may not have been used during the whole test. They may have used a strategy only once or twice but recalled it while filling in the questionnaire.

As for the experimental procedure, the experimenter was in a few cases aware of the experimental condition due to procedural mistakes or misunderstandings on the part of the confederates. The experimenter made sincere efforts to stay unbiased during the initial contact with subjects. Also, eight different confederates acted as the secretary.
All the confederates were personally instructed and had a “manual” on how to prepare the instructions and what to tell the participants. Nevertheless, there may have been variations in the way they communicated with the participants due to individual differences.


One of the questions that has not yet been sufficiently answered is fundamental to the rationale underlying the ODT: to what extent is pupil dilation during deception influenced by emotional arousal (the ‘fight or flight’ sympathetic reaction), and to what extent is it caused by cognitive load? The ODT is based on the idea that deception is more cognitively demanding than telling the truth, and previous research (e.g. Hess and Polt, 1964; Kahneman and Beatty, 1966; Kahneman, Onuska and Wolman, 1968; Kahneman and Peavler, 1969) has shown that the task-evoked pupillary response reflects increased cognitive effort. Still, we have no scientific evidence showing the precise extent to which the pupil dilates due to cognitive load or emotional arousal.

Unlike in a real-life testing situation, there was not much at stake for the examinees; other than not receiving the monetary bonus, there were no consequences of failing the test, as discussed by Webb et al. (2009b). In polygraph laboratory studies, higher incentives to pass the test were found to be associated with higher classification accuracy (Kircher, Raskin, Honts, & Horowitz, 1988). A sincere effort was made to simulate field settings; participants were instructed to create an alibi for their presence in the secretary’s office and not to leave fingerprints. Still, we cannot assume that a field study would yield the same results; this was a mock-crime experiment, and participants were aware of this fact during the whole study. Despite this, guilty participants in the present experiment reported significantly higher anxiety levels in the post-test questionnaire than innocent participants. The anxiety level is possibly related to the wrongdoing the guilty participants committed, even though it was only a mock crime; the situation felt genuine enough.
Another limitation of laboratory deception detection research raised by Miller and Stiff (1993) is that throughout the experiment there is very limited social interaction between the examiner and the examinees. The participant very often tells lies or truths with very little or no feedback from the examiner. In real life, though, we get some feedback from other people most of the time and adjust our statements or behavior to that feedback. This criticism mostly concerns ecological validity; nevertheless, it is important to keep in mind. The lower ecological validity of the experimental situation might also lead to underestimated effect sizes of the cues to deception evaluated in the meta-analysis conducted by DePaulo and her colleagues (2003), as in real-life situations the cues would be more pronounced. In a laboratory environment, people are more aware of their truthful self-presentation.

The study is concerned with interaction effects of countermeasures on ocular-motor responses rather than with how countermeasures affect ODT accuracy. The reason is that accuracy is generally less sensitive to experimental manipulation. In the present study, it was found that information does not moderate the effect of deception on reactions to the three types of statements. According to the content analysis, though, participants generally did not use the cognitive countermeasures suggested in the “Beat the ODT” document (such as thinking harder during the statements on the crime one did not commit or doing multiplication in one’s head), which would influence pupil size. Instead, they picked countermeasures from the document such as influencing the speed of their answers. Most of the countermeasures used were either spontaneous or related to speed; neither kind interacted with deception and pupil dilation. Perhaps participants chose those countermeasures because they seemed easier to apply than cognitive countermeasures. Thus, participants in the informed condition should have been more directly instructed to use the countermeasures that are expected to influence their pupil size.

Implications and future directions

The ocular-motor deception test is a relatively new method for deception detection that might supplement or replace the polygraph in testing job applicants and screening current employees of several government agencies. The present study was the first attempt to adapt the ODT to the Czech environment. Its findings suggest that even though the Czech and Slovak languages are very similar, there is a significant interaction with language. In future research, only Czech participants should be tested with Czech ODT items. At the same time, this study presents preliminary results for only one of the ODT measures, pupil dilation. To evaluate the overall ODT classification accuracy, further analysis of reading patterns, error rates and response times will be conducted.

Further research on countermeasures is certainly desirable and may build on the findings of this study. First of all, it would be beneficial to modify the “Beat the ODT” document so that it emphasizes strategies that would influence pupil dilation. It should also be considered whether the terms “spontaneous” and “information” countermeasures are suitable for the ODT. Some uninformed participants used a countermeasure that would be labeled as spontaneous but was similar to a countermeasure suggested in the ODT document. Insisting on a distinction based on whether the participant developed the strategy during the test or was advised to use it beforehand does not seem to make sense, as in the end it is the same strategy no matter where it came from. Honts (2014) suggested that specific countermeasures may work after practice: further research might not only identify efficient countermeasures but also investigate the effect of practice.

Conclusion

In the present study, the ocular-motor deception test (ODT) was used in a laboratory mock-crime experiment, and pupillary responses to the test items were investigated, as pupil dilation is the strongest predictor of deception used by the ODT. The study found that guilty participants show greater mean increases in pupil diameter during deception; however, greater differences were expected between responses to the two relevant crimes, since the difference between crime-related items should be more diagnostic than the difference between crime-related and neutral items. This finding may be due to the eye-tracking device that was used to measure changes in pupil size, and the sample size may play a role as well. The present study did not reveal an effect of the “Beat the ODT” document on participants’ pupil dilation during deception.

Countermeasures that participants used during the deception test were investigated. Generally, most participants attempted to influence the speed of their answers, but other kinds of countermeasures were employed, too. Seven successful individuals were identified from the subset of deceptive subjects who showed a greater increase in pupil diameter when responding to exam items than to cash items, and their countermeasures were assessed. A variety of countermeasures were identified, such as controlling the speed of answers or imagining their innocence. Five of the seven individuals attempted to confuse the test: they deliberately made mistakes after giving a wrong answer and sometimes deliberately took longer to respond to irrelevant questions to hide hesitation on cash items. These strategies seem to show a promising direction for research on countermeasures. Further research will be necessary to adapt the ODT to the Czech environment and to explore the countermeasures.


References

Ahern, S., & Beatty, J. (1979). Pupillary responses during information processing vary with Scholastic Aptitude Test scores. Science (New York, N.Y.), 205(4412), 1289-1292.
Andreassi, J. (2000). Psychophysiology. 4th ed. Mahwah, N.J.: Erlbaum Associates.
Aron, A., Dutton, D., Aron, E. and Iverson, A. (1989). Experiences of Falling in Love. Journal of Social and Personal Relationships, 6(3), 243-257.
Baker, L., Stern, J. A., & Goldstein, R. (1992a). The gaze control system and the detection of deception: Final report to the U.S. Government (Contract #90-F131400). St. Louis, MO: Washington University, Department of Psychology.
Baker, L., Goldstein, R., & Stern, J. A. (1992b). Saccadic eye movements in deception: Final report to the U.S. Government (Contract #91-P-0003). St. Louis, MO: Washington University, Department of Psychology.
Barland, G. H., & Raskin, D. C. (1975). An evaluation of field techniques in detection of deception. Psychophysiology, 12(3), 321-330. doi:10.1111/j.1469-8986.1975.tb01299.x.
Ben-Shakhar, G., & Dolev, K. (1996). Psychophysiological detection through the Guilty Knowledge Technique: Effects of mental countermeasures. Journal of Applied Psychology, 81, 273–281.
Ben-Shakhar, G. (2002). A critical review of the control questions test (CQT). In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 103–126). London: Academic Press.
Ben-Shakhar, G., & Elaad, E. (2003). The validity of psychophysiological detection of information with the guilty knowledge test: A meta-analytic review. Journal of Applied Psychology, 88, 131–151.
Berrien, F. K., & Huntington, G. H. (1943). An exploratory study of pupillary responses during deception. Journal of Experimental Psychology, 32(5), 443-449. doi:10.1037/h0063488.
Bok, S. (1978). Lying: Moral choice in public and private life. New York: Pantheon.
Bond, C. F., & DePaulo, B. M. (2006). Accuracy of deception judgments. Personality and Social Psychology Review: An Official Journal of The Society for Personality and Social Psychology, Inc, 10(3), 214-234.
Bond, C. F., & DePaulo, B. M. (2008). Individual differences in judging deception: accuracy and bias. Psychological Bulletin, 134(4), 477-492. doi:10.1037/0033-2909.134.4.477


Bond, C. F., Levine, T. R., & Hartwig, M. (2015). New Findings in Non-Verbal Lie Detection. In Granhag, P., Vrij, A., & Verschuere, B. (Eds.), Detecting deception: current challenges and cognitive approaches. (pp. 37-59). Hoboken: Wiley Blackwell.
Bradley, M. T., & Janisse, M. P. (1979). Pupil size and lie detection: The effect of certainty on deception. Psychology: A Quarterly Journal of Human Behavior, 16, 33-39.
Bradley, M. T., & Janisse, M. P. (1981). Accuracy demonstrations, threat, and the detection of deception: Cardiovascular, electrodermal, and pupillary measures. Psychophysiology, 18, 307-315.
Bradley, M. M., Miccoli, L., Escrig, M. A., & Lang, P. J. (2008). The pupil as a measure of emotional arousal and autonomic activation. Psychophysiology, 45(4), 602–607. http://doi.org/10.1111/j.1469-8986.2008.00654.x.
Bradley, M. M., & Lang, P. J. (2015). Memory, emotion, and pupil diameter: Repetition of natural scenes. Psychophysiology, 52(9), 1186–1193. doi: 10.1111/psyp.12442.
Burgoon, J., & Buller, D. (1996). Interpersonal Deception Theory. Communication Theory, 6(3), 311-328.
Cacioppo, J., Tassinary, L., & Berntson, G. (2007). Handbook of psychophysiology. New York, NY: Cambridge University Press.
Chapman, C. R., Oka, S., Bradshaw, D., Jacobson, R., & Donaldson, G. (1999). Phasic pupil dilation response to noxious stimulation in normal volunteers: Relationship to brain evoked potentials and pain report. Psychophysiology, 36(1), 44-52.
Cook, E. W., Hawk, L. W., Davis, T. L., & Stevenson, V. E. (1991). Affective individual differences and startle reflex modulation. Journal of Abnormal Psychology, 100(1), 5-13. doi:10.1037/0021-843X.100.1.5
Cook, A., Hacker, D., Webb, A., Osher, D., Kristjansson, S., Woltz, D. and Kircher, J. (2012). Lyin' eyes: Ocular-motor measures of reading reveal deception. Journal of Experimental Psychology: Applied, 18(3), 301-313.
Credibility Assessment Technologies Becomes Converus Inc. (2014, March 21). Wireless News. Retrieved from http://go.galegroup.com.ezproxy.lib.utah.edu/ps/i.do?p=ITOF&sw=w&u=marriottlibrary&v=2.1&it=r&id=GALE%7CA362203503&asid=7ea58f441e6bcf389103d87502312e7d
Davis, M. (1984). The mammalian startle response. In R. C. Eaton (Ed.), Neural mechanisms of startle behaviour. (pp. 287-351). New York: Plenum Press.


Dawson, M. (1980). Physiological Detection of Deception: Measurement of Responses to Questions and Answers During Countermeasure Maneuvers. Psychophysiology, 17(1), 8-17.
DePaulo, B. M. (1992). Nonverbal behavior and self-presentation. Psychological Bulletin, 111(2), 203-243. doi:10.1037/0033-2909.111.2.203.
DePaulo, B. M., Kashy, D. A., Kirkendol, S. E., Wyer, M. M., & Epstein, J. A. (1996). Lying in everyday life. Journal of Personality and Social Psychology, 70, 979–995.
DePaulo, B. M., Charlton, K., Cooper, H., Lindsay, J. J., & Muhlenbruck, L. (1997). The accuracy-confidence correlation in the detection of deception. Personality and Social Psychology Review: An Official Journal Of The Society For Personality and Social Psychology, Inc, 1(4), 346-357.
DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., & Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129(1), 74-118. doi:10.1037/0033-2909.129.1.74.
Dowling, J. E., & Dowling, J. L. (2016). Vision: How It Works and What Can Go Wrong. Cambridge, MA: The MIT Press.
Ekman, P., & Friesen, W. V. (1969). Nonverbal leakage and clues to deception. Psychiatry, 32(1), 88-106.
Ellermeier, W., & Westphal, W. (1995). Gender differences in pain ratings and pupil reactions to painful pressure stimuli. Pain, 61(3), 435-439.
Ferrero, G. (1911). The criminal man, according to the classification of Cesare Lombroso. New York: Putnam.
Fotiou, D. F., Brozou, C. G., Tsiptsios, D. J., Fotiou, A., Kabitsi, A., Nakou, M., Giantselidis, C., & Goula, A. (2007). Effect of age on pupillary light reflex: evaluation of pupil mobility for clinical practice and research. Electromyography and Clinical Neurophysiology, 47(1), 11-22.
Ganis, G. (2015). Deception Detection Using Neuroimaging. In Granhag, P., Vrij, A., & Verschuere, B. (Eds.), Detecting deception: current challenges and cognitive approaches. (pp. 105-123). Hoboken: Wiley Blackwell.
Garrido, E., Masip, J., & Herrero, C. (2004). Police officers' credibility judgments: Accuracy and estimated ability. International Journal of Psychology, 39(4), 254-275. doi:10.1080/00207590344000411.
Geacintov, T., & Peavler, W. S. (1974). Pupillography in industrial fatigue assessment. Journal of Applied Psychology, 59(2), 213-216.


Gillernová, I., & Boukalová, H. (2006). Vybrané kapitoly z kriminalistické psychologie. Praha: Karolinum.
Gilovich, T., Savitsky, K., & Medvec, V. H. (1998). The illusion of transparency: Biased assessments of others' ability to read one's emotional states. Journal of Personality and Social Psychology, 75, 332-346.
Granhag, P. and Strömwall, L. (2004). The detection of deception in forensic contexts. Leiden: Cambridge University Press.
Granhag, P., Vrij, A., & Verschuere, B. (2015). Detecting deception: current challenges and cognitive approaches. Hoboken: Wiley Blackwell. Retrieved from https://ebookcentral.proquest.com.
Gudjonsson, G. H. (1983). Lie detection: Techniques and countermeasures. In S.M.A. Lloyd-Bostock & B.R. Clifford (Eds.), Evaluating witness evidence (pp. 137–153). Chichester: Wiley.
Hacker, D.J., Kuhlman, B., Kircher, J. C., Cook, A. E., & Woltz, D. J. (2014). Detecting deception using ocular metrics during reading. In D.C. Raskin, C.R. Honts, & J.C. Kircher (Eds.), Credibility assessment: Scientific research and applications. (pp. 159–216). Amsterdam: Academic Press.
Hartwig, M., Granhag, P. A., & Luke, T. (2014). Strategic use of evidence during investigative interviews: The state of the science. In D.C. Raskin, C.R. Honts, & J.C. Kircher (Eds.), Credibility assessment: Scientific research and applications. (pp. 1-38). Amsterdam: Academic Press.
Heilveil, I. (1976). Deception and pupil size. Journal of Clinical Psychology, 32(3), 675-676.
Hess, E. H., & Polt, J. M. (1960). Pupil size related to interest value of visual stimuli. Science, 132, 349–350. doi: 10.1126/science.132.3423.349
Hess, E. H., & Polt, J. M. (1964). Pupil Size in Relation to Mental Activity during Simple Problem-Solving. Science, 143(3611), 1190-1192.
Honts, C. R. (1987). Interpreting research on polygraph countermeasures. Journal of Police Science and Administration, 15, 204-209.
Honts, C. R. (1991). The emperor’s new clothes: The application of the polygraph tests in the American workplace. Forensic Reports, 4, 91–116.
Honts, C. R., Raskin, D. C., & Kircher, J. C. (1994). Mental and physical countermeasures reduce the accuracy of polygraph tests. Journal of Applied Psychology, 79, 252–259.


Honts, C. R., Amato, S., & Gordon, A. K. (2001). Effects of spontaneous countermeasures used against the comparison question test. Polygraph, 20(1), 1-9.
Honts, C. R. (2014). Detecting deception using ocular metrics during reading. In D.C. Raskin, C.R. Honts, & J.C. Kircher (Eds.), Credibility assessment: Scientific research and applications. (pp. 133–155). Amsterdam: Academic Press.
Horowitz, S. W., Kircher, J. C., Honts, C. R., & Raskin, D. C. (1997). The role of comparison questions in physiological detection of deception. Psychophysiology, 34(1), 108-115. doi:10.1111/j.1469-8986.1997.tb02421.
Iacono, W. G., & Patrick, C. J. (1997). Polygraphy and integrity testing. In R. Rogers (Ed.), Clinical assessment of malingering and deception (pp. 252–281). New York: The Guilford Press.
Johnson, M. K., & Raye, C. L. (1981). Reality monitoring. Psychological Review, 88(1), 67-85.
Just, M. A., & Carpenter, P. A. (1993). The intensity dimension of thought: pupillometric indices of sentence processing. Canadian Journal of Experimental Psychology: Revue Canadienne De Psychologie Experimentale, 47(2), 310-339.
Kahneman, D., & Beatty, J. (1966). Pupil Diameter and Load on Memory. Science, 154(3756), 1583-1585.
Kahneman, D., & Beatty, J. (1967). Pupillary responses in a pitch-discrimination task. Perception & Psychophysics, 2(3), 101-105. doi:10.3758/BF03210302.
Kahneman, D., Onuska, L., & Wolman, R. (1968). Effects of grouping on the pupillary response in a short-term memory task. Quarterly Journal of Experimental Psychology, 20, 309-311.
Kahneman, D., & Peavler, W. S. (1969). Incentive effects and pupillary changes in association learning. Journal of Experimental Psychology, 79(2), 312-318. doi:10.1037/h0026912.
Kalvodová, V., & Hrušáková, M. (2015). Dokazování v trestním řízení - právní, kriminologické a kriminalistické aspekty. Brno: Masarykova univerzita. Řada teoretická Edice Scientia svazek č. 359. ISBN 978-80-210-8072-0.
Kardon, R. (1995). Pupillary light reflex. Current Opinion in Ophthalmology, 6(6), 20-26.
Kircher, J. C., Raskin, D. C., Honts, C. R., & Horowitz, S. W. (1988). Generalizability of mock crime laboratory studies of the control question polygraph technique. Psychophysiology, 25(4), 462-463.


Kircher, J. C., & Raskin, D. C. (2002). Computer methods for the psychophysiological detection of deception. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 287-326). San Diego, CA, US: Academic Press.
Kircher, J. C., & Raskin, D. C. (2016). Laboratory and Field Research on the Ocular-motor Deception Test. European Polygraph, 10(4).
Kozel, F. A., Johnson, K. A., Mu, Q., Grenesko, E. L., Laken, S. J., & George, M. S. (2005). Detecting deception using functional magnetic resonance imaging. Biological Psychiatry, 58(8), 605-613. doi:10.1016/j.biopsych.2005.07.040.
Kozel, F. A., Johnson, K. A., Grenesko, E. L., Laken, S. J., Kose, S., Lu, X., & ... George, M. S. (2009a). Functional MRI detection of deception after committing a mock sabotage crime. Journal of Forensic Sciences, 54(1), 220-231. doi:10.1111/j.1556-4029.2008.00927.x.
Kozel, F. A., Johnson, K. A., Laken, S. J., Grenesko, E. L., Smith, J. A., Walker, J., & George, M. S. (2009b). Can simultaneously acquired electrodermal activity improve accuracy of fMRI detection of deception? Social Neuroscience, 4(6), 510-517. doi:10.1080/17470910801907168.
Kuhlman, B. B., Webb, A. K., Patnaik, P., Cook, A. E., Woltz, D. J., Hacker, D. J., & Kircher, J. C. (2011, September). Evoked Pupil Responses Habituate During an Oculomotor Test for Deception. Poster presented at the Society for Psychophysiological Research convention, Boston, MA.
Larson, J. A. (1921). Modification of the Marston Deception Test. Journal of the American Institute of Criminal Law and Criminology, 12(3), 390-399.
Larson, J. A. (1932). Lying and its detection. Chicago: University Press.
Lerner, M. J. (1980). The belief in a just world. New York: Plenum Press.
Marston, W. (1917). Systolic blood pressure symptoms of deception. Journal of Experimental Psychology, 2(2), 117-163.
Marston, W. (1921). Psychological Possibilities in the Deception Tests. Journal of the American Institute of Criminal Law and Criminology, 11(4), 551-570.
Miller, G., & Stiff, J. (1993). Deceptive communication. Newbury Park: Sage.
National Research Council. (2003). The polygraph and lie detection. Washington, DC: The National Academies Press.
Osher, D. (2005). Multimethod assessment of deception: Oculomotor movement, pupil size, and response time measures. Unpublished dissertation, University of Utah, Department of Educational Psychology.


Otter-Henderson, K. D., Honts, C. R., & Amato, S. (2002). Spontaneous countermeasures during polygraph examinations: An apparent exercise in futility. Polygraph, 31(1), 9-13.
Patnaik, P. (2013). Ocular-motor methods for detecting deception: Direct versus indirect interrogation. Unpublished master’s thesis, University of Utah, Department of Educational Psychology.
Patnaik, P. (2015). Oculomotor methods for detecting deception: Effects of practice feedback and blocking. Unpublished dissertation, University of Utah, Department of Educational Psychology.
Patnaik, P., Woltz, D.J., Hacker, D.J., Cook, A.E., Ramm, M.L., Webb, A.K., & Kircher, J.C. (2016). Generalizability of an ocular-motor test for deception to a Mexican population. International Journal of Applied Psychology, 6(1), 1–9.
Patrick, C. J., & Iacono, W. G. (1991). Validity of the control question polygraph test: The problem of sampling bias. Journal of Applied Psychology, 76(2), 229-238.
Podlesny, J. A., & Raskin, D. C. (1978). Effectiveness of techniques and physiological measures in the detection of deception. Psychophysiology, 15(4), 344-359.
Podlesny, J. A., & Kircher, J. C. (1999). The Finapres (volume clamp) recording method in psychophysiological detection of deception examinations. Communications, 1(3), 1–17.
Raskin, D. C. (1986). The polygraph in 1986: Scientific, professional, and legal issues surrounding acceptance of polygraph evidence. Utah Law Review, 29, 29–74.
Raskin, D. C., & Honts, C. R. (2002). The comparison question test. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 1–48). London: Academic Press.
Raskin, D., Honts, C., & Kircher, J. (Eds.). (2014). Credibility assessment. Amsterdam: Academic Press.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372-422.
Rayner, K., Chace, K. H., Slattery, T. J., & Ashby, J. (2006). Eye movements as reflections of comprehension processes in reading. Scientific Studies of Reading, 10, 241-255.
Rovner, L. I. (1986). The accuracy of physiological detection of deception for subjects with prior knowledge. Polygraph, 15(1), 1-39.
Sharma, S., Baskaran, M., Rukmini, A. V., Nongpiur, M. E., Htoon, H., Cheng, C., & Milea, D. (2016). Factors influencing the pupillary light reflex in healthy individuals. Graefe's Archive for Clinical and Experimental Ophthalmology = Albrecht Von Graefes Archiv Fur Klinische Und Experimentelle Ophthalmologie, 254(7), 1353-1359. doi:10.1007/s00417-016-3311-4.
Snowden, R. J., O’Farrell, K. R., Burley, D., Erichsen, J. T., Newton, N. V., & Gray, N. S. (2016). The pupil’s response to affective pictures: Role of image duration, habituation, and viewing mode. Psychophysiology, 53(8), 1217–1223. http://doi.org/10.1111/psyp.12668.
Steinhauer, S. R., Siegle, G. J., Condray, R., & Pless, M. (2004). Sympathetic and parasympathetic innervation of pupillary dilation during sustained processing. International Journal of Psychophysiology, 52(1), 77–86. doi: 10.1016/j.ijpsycho.2003.12.005.
Sternbach, R. (1966). Principles of psychophysiology. New York: Academic Press.
Trovillo, P. (1939). A History of Lie Detection. Journal of Criminal Law and Criminology (1931-1951), 29(6), 848-881.
Tucker, L., & Foulston, J. (2015). An introductory guide to anatomy & physiology. London: EMS Publishing.
Twyman, N., Schuetzler, R., Proudfoot, J. G., & Elkins, A. C. (2013). A Systems Approach to Countermeasures in Credibility Assessment Interviews. Information Systems and Quantitative Analysis Faculty Proceedings & Presentations. Paper 15. Retrieved from https://digitalcommons.unomaha.edu/cgi/viewcontent.cgi?article=1016&context=isqafacproc.
van Steenbergen, H., Band, G. P. H., & Hommel, B. (2011). Threat but not arousal narrows attention: Evidence from pupil dilation and saccade control. Frontiers in Psychology, 2:281. doi: 10.3389/fpsyg.2011.00281.
Vrij, A., Semin, G., & Bull, R. (1996). Insight into Behavior Displayed During Deception. Human Communication Research, 22(4), 544-562.
Vrij, A. (2000). Detecting lies and deceit: The psychology of lying and the implications for professional practice. Chichester: Wiley.
Vrij, A. (2004). Why professionals fail to catch liars and how they can improve. Legal and Criminological Psychology, 9, 159–183.
Vrij, A., Fisher, R., Mann, S., & Leal, S. (2006). Detecting deception by manipulating cognitive load. Trends in Cognitive Sciences, 10(4), 141-142.
Vrij, A. (2008). Detecting lies and deceit: Pitfalls and opportunities (2nd ed.). Chichester, UK: John Wiley and Sons.


Vrij, A., Mann, S., Fisher, R., Leal, S., Milne, R., & Bull, R. (2008). Increasing cognitive load to facilitate lie detection: The benefit of recalling an event in reverse order. Law and Human Behavior, 32(3), 253-265.
Vrij, A., Granhag, P., & Porter, S. (2010). Pitfalls and Opportunities in Nonverbal and Verbal Lie Detection. Psychological Science in the Public Interest, 11(3), 89-121. Retrieved from http://www.jstor.org/stable/41038740.
Vrij, A., & Verschuere, B. (2013). Lie Detection in a Forensic Context. In D. S. Dunn (Ed.), Oxford Bibliographies. New York: Oxford University Press. DOI: 10.1093/OBO/9780199828340-0122.
Vrij, A., & Ganis, G. (2014). Theories in deception and lie detection. In D.C. Raskin, C.R. Honts, & J.C. Kircher (Eds.), Credibility assessment: Scientific research and applications. (pp. 159–216). Amsterdam: Academic Press.
Vrij, A. (2015). Verbal Lie Detection Tools: Statement Validity Analysis, Reality Monitoring and Scientific Content Analysis. In Granhag, P., Vrij, A., & Verschuere, B. (Eds.), Detecting deception: current challenges and cognitive approaches. (pp. 3-37). Hoboken: Wiley Blackwell.
Watson, N. V., & Breedlove, S. M. (2012). The Mind's Machine: Foundations of Brain and Behavior. Sunderland, Mass: Sinauer Associates.
Webb, A.K. (2008). Effects of Motivation, and Item Difficulty on Oculomotor and Behavioral Measures of Deception. Unpublished dissertation, University of Utah, Department of Educational Psychology.
Webb, A. K., Hacker, D. J., Osher, D., Cook, A. E., Woltz, D. J., Kristjansson, S., & Kircher, J. C. (2009a). Eye movements and pupil size reveal deception in computer administered questionnaires. In D. D. Schmorrow, I. V. Estabrooke, & M. Grootjen (Eds.), Foundations of augmented cognition: Neuroergonomics and operational neuroscience (pp. 553–562). Berlin, Germany: Springer-Verlag.
Webb, A. K., Honts, C. R., Kircher, J. C., Bernhardt, P.C., & Cook, A. E. (2009b). Effectiveness of pupil diameter in a probable-lie comparison question test for deception. Legal and Criminological Psychology, 14(2), 279–292.
Zuckerman, M., DePaulo, B. M., & Rosenthal, R. (1981). Verbal and nonverbal communication of deception. Advances in experimental social psychology, 14, 1-59.


List of Figures

Figure 1: Cross section of the eye (Tucker & Foulston, 2015, p. 278).
Figure 2: Regression plots for the first and the fifth repetition of the ODT items (Kuhlman et al., 2011).
Figure 3: Laboratory setting. On the left, chinrest and square-sized monitor to present the ODT items with the RED 250 mobile attached. On the right, examiner's laptop connected to the monitor and SMI 250 mobile.
Figure 4: Mean change in pupil diameter for 4 seconds following item onset by item content and guilt condition.
Figure 5: Mean change in pupil diameter for 4 seconds following item onset by item content and guilty-information condition.
Figure 6: Mean area under the curve and level at response for pupil diameter in mm x 1000 by item content and guilt condition.
Figure 7: Mean PD level (mm x 1000) at response by statement type and guilt condition for Czech and Slovak language.
Figure 8: Mean area under the curve in relative units x 1000 by statement type and guilt condition for Czech and Slovak language.

List of Tables

Table 1: Non-verbal cues to deception with larger effect sizes based on larger and smaller numbers of estimates (DePaulo et al., 2003, p. 95).
Table 2: Percent of correct decisions under standard conditions in mock crime experiments (Kircher & Raskin, 2016, p. 166).
Table 3: The 2x2 between-subjects variables: guilt and information condition.
Table 4: Mean length of item types (neutral, cash, exam) for male and female versions of the ODT in characters with spaces.
Table 5: Mean values and standard deviations for PD level and PD AUC.
Table 6: Categories of strategies used by participants to defeat the deception test.
Table 7: Categories of strategies used by guilty participants who showed bigger increases in PD level during exam items (R2) compared to cash items (R1).
Table 8: Self-reported number of strategies used by participants to defeat the test.
