MASARYK UNIVERSITY Faculty of Social Studies Department of Psychology

Mgr. Aleš Neusar

Memory for unique events: Predictors of dating accuracy

Dissertation

Supervisor: prof. PhDr. Vladimír Smékal, CSc. Masaryk University, Brno,

Co-supervisor: Wander van der Vaart, PhD. University for Humanistics, Utrecht, The

Brno 2012

Acknowledgements

For Věruška, Matylda and Frida
Time spent with you is my most precious possession...

(...) the deeper your education, the more it will change the “you” that you are or want to be. That’s why it’s so important to choose carefully what you study and with whom. (Booth, Colomb, & Williams, 2008, p. 13)

I was lucky with the choice of my topic, but as is known, any topic is in the end interesting when one puts so much into it... What I find most important is that I was even luckier in that I had countless opportunities to meet so many nice and inspiring people in the course of writing the dissertation. Here are at least some of them:
• Vladimír Smékal (my supervisor)
• Wander van der Vaart (my co-supervisor)
• Stanislav Ježek (my “unofficial” co-supervisor)
• Eva Literáková (my co-worker)
• Dana Šimková (my “guardian angel”)
• Jan Vančura (my colleague from whom I learned a lot about non-scientific stuff)
• Daniel Hastík (the one who told me to finish this dissertation...)

I would like to thank the reviewers for their patience and reviews:
• Jaap Murre & Petr Kulišťák

I would also like to thank: my wife’s family who sheltered us when I was not “mentally available”, Hans Tenwolde, Miroslav Charvát, Ivo Plšek, Jessica Merrill, Martin Dolejš, Irena Smetáčková, Gabriela Jiskrová, Albert Kšiňan, Josef Kundrát, Jana Hoferková, Douglas Bernstein, Peter Reddy, Ivo Čermák, Hana Hoblová, Martin Glogar, Jana Glogarová, Petr Macek, Zbyněk Vybíral, Jan Mareš, Martin Fišr, Tomáš Flašar, Monika Zamazalová, Lars Kaczmirek, Natalja Menold, Annie Trapp, Zdeněk Neusar, all the respondents and their proxies, and many others...

Announcement to all those who suffered over the years: “I promise I will never write a dissertation again.” Aleš Neusar / January 19, 2012


Glossary of potentially unclear terms used in the dissertation

This glossary includes only terms that could be unclear or are used in various ways by different authors. I will use these terms with their definitions throughout the dissertation and use similar terms only when quoting directly (with an explanation when necessary).

Aided recall: Recall with the help of various cues or procedures that should improve the accuracy of recalled events.

Calendar instrument: An instrument based on the common calendar that incorporates various recall aids into one instrument. The aids usually consist of a calendar grid, landmark events, and various domains (Glasner & Van der Vaart, 2009). A calendar instrument may also be called a timeline, event history calendar, or life history calendar (see chapter 3.4 for more alternative names for these instruments).

Dating error: By dating error I refer to the dating error in absolute value. This error does not carry a sign showing the direction of the dating error, only its magnitude. Some authors refer to absolute dating error (e.g., Gibbons & Thompson, 2001).

Landmark or temporal landmark: Landmarks are events that are well remembered and that help to organize and aid the recall of other events from the same reference period (Shum, 1998). In the context of this dissertation I will use the word landmark, or temporal landmark if I want to emphasize that a landmark has to be time-tagged to serve its role in making the date estimates of other events more accurate. It can also happen that some landmarks are not time-tagged, which is why it may be useful to differentiate between landmark and temporal landmark.

Month temporal schema: Temporal schema helping to estimate the week.

Multi–year temporal schema: Temporal schema helping to estimate the year.

Personal event: Personal events are usually defined as events that happened to the person himself or herself (self–events); sometimes events that happened to the whole family are added as well (also self–events). I will use the term personal event in an even broader sense by adding events that happened to proxies but about which respondents were well informed (other–events). The reason for including family events among self–events is that the two can hardly be distinguished when the respondent took part in them. For example, a vacation of the whole family in Croatia is both a family event and a self–event. However, when the family vacation took place without the respondent, it is categorized as an “other–event”.

Public event: This term refers to public events that are known by the respondents. The level of knowledge may vary, but respondents have to at least know that the event happened. Events can be nationally known and heavily covered by the media, regional events, or public events known only by some specific group (e.g., concerts at a university campus).

Recall period: The time interval between the interview date and the date of the target event. For example, if the interview takes place in January 2011 and the target events are from 2005 to 2008, then the recall period of the most recent event from December 2008 is 2 years and 1 month.

Reference period: The period in which the researcher is interested and for which the data are collected. This period is fixed for all events (chosen for some reason by the researcher) and may serve as a boundary when made explicit to the respondents. The reference period in Studies I and II is, for example, 4 years long (2005–2008). There is often some elapsed time (a break) between the reference period and the interview date.

Signed dating error or signed error: This type of dating error shows the magnitude of the dating error together with its sign. A minus sign means that the event was telescoped backward (= moved towards the past). A plus sign means that the event was telescoped forward (= moved towards the present). Some authors use net dating error (e.g., Van der Vaart & Glasner, 2007a). When clear from the context I sometimes shorten the term to signed error.

Temporal schema: Temporal schemata refer to general knowledge about time patterns and thus constrain the likely time of the event (Larsen, Thompson, & Hansen, 1995). For example, when somebody visits a gym every Tuesday, the day of the week error of this event will probably be zero (week schema).

Unique event: The word “unique” has different meanings throughout the dissertation.

1) It can mean that the event occurred not more than once during the recall period. This dissertation is focused on these types of events. However, this does not imply that similar events could not happen during the same recall period. An event is considered unique when people have enough details to distinguish between two or more similar events. For example, I consider “tennis play” to be a unique event when the person played tennis only once during the recall period, and unique as well when she played many times but the event is described as “Tennis play with Martin”, an event that happened only once during the period. If she played tennis with Martin more often, another detail is needed, et cetera. The criterion for an event to be unique is thus the knowledge of enough details. These details can be provided by other people as well, but the person who estimates the date of that event must acknowledge that he or she knows which event is discussed. This knowledge does not imply that the person knows the date as well. It can also mean that the dates of other similar events interfere with the target event. This is why some events that happen more often may be more difficult to date even when people remember many details about the event.

2) The word “unique” can also be used in assessing the phenomenological characteristics of events. “Uniqueness” of events can mean that, considering the context of events that happened before or after, a person finds an event to be unique and standing out among other memories; something like that does not happen to him or her often.

Which meaning of the word unique I use will be clear from the context and otherwise be explicitly stated. When not used in a particular context (e.g., phenomenology of memories), I use the former meaning: an event is distinguishable from other events. Other authors may use the word with other meanings. If so, it is explained.

Week temporal schema: Temporal schema helping to estimate the day of the week.

Year temporal schema: Temporal schema helping to estimate the month.
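The definitions of dating error, signed dating error, and recall period given in this glossary can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, errors are counted in days, and the recall period is counted in whole calendar months, which matches the January 2011 / December 2008 example in the glossary.

```python
from datetime import date


def signed_dating_error(estimated: date, actual: date) -> int:
    """Signed dating error in days (the unit is an assumption of this sketch).

    Negative: the event was telescoped backward (moved towards the past).
    Positive: the event was telescoped forward (moved towards the present).
    """
    return (estimated - actual).days


def dating_error(estimated: date, actual: date) -> int:
    """Dating error in absolute value: magnitude only, no direction."""
    return abs(signed_dating_error(estimated, actual))


def recall_period_months(interview: date, event: date) -> int:
    """Recall period in whole calendar months (the day of the month is ignored)."""
    return (interview.year - event.year) * 12 + (interview.month - event.month)


# Glossary example: interview in January 2011, event in December 2008.
months = recall_period_months(date(2011, 1, 1), date(2008, 12, 1))
years, rest = divmod(months, 12)  # 25 months = 2 years and 1 month

# Hypothetical event on 11 May 2006 that a respondent dates to 1 May 2006.
signed = signed_dating_error(date(2006, 5, 1), date(2006, 5, 11))  # -10 (backward)
absolute = dating_error(date(2006, 5, 1), date(2006, 5, 11))       # 10
```

Keeping the signed and the absolute error as separate functions mirrors the glossary: the signed value shows the direction of telescoping, while the absolute value only shows how large the error is.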

Abbreviations used in the dissertation

• CAL = interview with calendar instrument (in Studies II & III)
• DOW = day of the week
• η² = eta squared (effect size measure for ANOVA or Kruskal-Wallis test; range from 0 to 1)
• Non-CAL = interview without calendar instrument (in Studies II & III)
• Study I, II, III = When I mention the word “study” in italics I always refer to the empirical studies from this dissertation.
• V = Cramer’s V (effect size measure of the chi-square test; range from 0 to 1)


Contents

Glossary of potentially unclear terms used in the dissertation
Abbreviations used in the dissertation
1 Introduction
  1.1 When is the accuracy of date estimates relevant
  1.2 General aim and the dissertation outline
2 Unique events dating
  2.1 Long-term memory for unique events
    An event and its representation in long-term memory
    for unique events
  2.2 Temporal representation of events
  2.3 How people store and retrieve temporal information
    of temporal information
    How people reconstruct the temporal aspect of unique events
    Temporal schemata
  2.4 Patterns of dating error
    Error in days
    Error in days of the week
    Error in weeks
    Error in months
    Error in years
3 Predictors of dating accuracy
  3.1 Respondent characteristics
    Age
    Gender
    Educational level
    Memory for dates
    Personality
    Interests
    Current state of mind
  3.2 Event characteristics
    Event recency
    Typicality, regularity and frequency
    Self–events and other–events
    Event theme
    Media coverage (applies to public events only)
    Temporal schemata
    Phenomenology of memories
    Confidence of date estimates
    Association with other events
  3.3 Data collection and dating accuracy
    The response process
    Choosing appropriate time units
    Temporal boundaries
    Facilitation of respondent’s recall
    Motivation facilitation
    Recall facilitation (aided recall procedures)
  3.4 Calendar instruments
    Incorporation of a calendar instrument into data collection
    Effect of a calendar instrument on data quality (in particular on dating accuracy)
    Plain calendar study (Gibbons & Thompson, 2001)
    Purchases of pairs of glasses study (Van der Vaart & Glasner, 2007a)
    Educational careers (Van der Vaart, 1996, 2004)
    Panel study of income dynamics I. (Belli, et al., 2004; Belli, et al., 2001)
    Panel study of income dynamics II. (Belli, et al., 2007)
    Potential costs and problems connected to calendar instruments
  3.5 Summary of the predictors of dating difficulty
4 Overall design and research aims of empirical studies
  4.1 Introduction
  4.2 Aims and research questions
  4.3 Design
    Experimental conditions in Studies II and III
    Calendar instrument design issues
  4.4 Samples
  4.5 Measures
    Respondent characteristics
    Event characteristics – respondent independent or partially independent
    Event characteristics – respondent dependent or proxy dependent
    Outcome measures (dependent variables)
  4.6 Procedures
    Instruction in the CAL condition (Study II)
    Instruction in the CAL condition (Study III)
    Instruction in Studies II and III (Non-CAL condition)
    Training of the interviewers
  4.7 Issues in data analyses
5 Dating accuracy predictors of remote public events (Study I)
  5.1 Introduction
    Hypotheses
  5.2 Methods
    Participants
    Selection criteria
    Response rate
    Sample
    Selection of public events
    Classification of the events according to the media coverage in time
    Classification of public events – multi–year and year temporal schema
    Classification of the events according to the theme
    Indicators of temporal landmarks from 2006 and 2007
    Event dating procedure
  5.3 Results
    Descriptives
    Event characteristics of public events
    Event recency
    Temporal schemata
    Media coverage in time
    Association with a personal event
    Respondent characteristics
    Self-evaluation of memory
    Interest in public events
    Gender
    Age and education
    Could any public events be temporal landmarks?
    Evaluation of the possibility of using public events in Study II as recall aids:
    Evaluation of the use of public events in the subsequent Study II
  5.4 Discussion
    Discussion of the major findings
    Implications for Study II choices
    Limitations of the study and the implications for future research
6 Dating accuracy predictors of remote personal events (Study II)
  6.1 Study aims and hypotheses
    Selection of personal events
    Selection of public events
    Events excluded from the analyses
    Variables excluded from the analyses
    Hypotheses
  6.2 Method
    Participants
    Measures
    Classification of events according to a temporal schema
    Design and procedure
  6.3 Results
    Descriptives of the collected data
    General patterns of the dating errors of personal events
    Impact of event recency on dating accuracy
    Comparison of dating accuracy in CAL and Non-CAL condition
    Explanation how dating error in various units will be presented
    Event characteristics (respondent independent or partially independent)
    Temporal schemata
    Self–events versus other–events
    Event characteristics (respondent dependent or proxy dependent)
    Knowing the date versus reconstructing the date
    Importance, vividness/details, uniqueness, sharing
    Confidence in date estimates
    Predictors of dating accuracy – respondent characteristics
    Self-evaluation of memory
    Gender
    Difficulty of the task
    Number of landmarks
    Summary of the effect sizes of event and respondent characteristics
    Public events
    Relationship between event recency and dating accuracy of public events
    Confidence in date estimates of public events
    Association of public events with personal events
    Comparison of dating accuracy in CAL and Non-CAL condition
  6.4 Discussion
7 Dating accuracy predictors of recent personal events (Study III)
  7.1 Study aims and hypotheses
    Selection of personal events
    Selection of public events
    Events excluded from the analyses
    Variables excluded from the analyses
    Hypotheses
  7.2 Method
    Participants
    Measures
    Classification of events according to a temporal schema
    Design and procedure
  7.3 Results
    Descriptives of the collected data
    General patterns of the dating error of personal events
    Dating error in days
    Dating error in weeks
    Dating error in DOW
    Impact of event recency on dating accuracy in days and DOW
    Comparison of dating accuracy in CAL and Non-CAL condition
    Further explorations
    Explanation how dating error in various units will be presented
    Event characteristics (respondent independent or partially independent)
    Temporal schemata
    Event frequency
    Self–events versus other–events
    Event characteristics (respondent dependent or proxy dependent)
    Knowing the date versus reconstructing the date
    Importance and sharing
    Confidence in date estimates
    Respondent characteristics
    Self-evaluation of memory
    Gender
    Difficulty of the task
    Number of landmarks
    Summary of the effect sizes of event and respondent characteristics
    Public events
    Relationship between event recency and dating accuracy of public events
    Confidence in date estimates of public events
    Comparison of dating accuracy in CAL and Non-CAL condition
  7.4 Discussion
8 General discussion and conclusions
  8.1 Introduction
  8.2 Event and respondent predictors – recent and remote events
    Event characteristics
    Respondent characteristics
  8.3 Impact of the calendar instrument on dating accuracy
  8.4 Limitations of the empirical studies
  8.5 Implications for future research
Appendices
  Appendix 1: Calendar instrument in Study II (CAL)
  Appendix 2: Calendar instrument in Study III (CAL)
  Appendix 3: Plain calendar in Study III (Non-CAL)
  Appendix 4: Description of 35 public events used in Study I (and II)
  Appendix 5: Frequency of answers outside the 2005–2008 boundaries
  Appendix 6: Description of public events used in Study III
Bibliography


1 Introduction

1.1 When is the accuracy of date estimates relevant

Recalling a personal event seems an incomplete experience without at least some accompanying sense of when the event occurred. On the rare occasion when we are completely unable to place an autobiographical event in time, the experience is vaguely disquieting (...) (Friedman, 2004, p. 591)

When I ask people how important it is for them to date their life events correctly, in most cases they answer something like “not too much. It’s more important to know what happened and with whom”. This opinion is supported by the fact that the date alone is in most cases a useless retrieval cue to memories in comparison with information about what happened, where it happened, and who was involved (Wagenaar, 1986). When I ask the same question about well-known public events, the answer is often even stronger, with some people suggesting that questions like that should not be asked at all. Indeed, people do not like “when” questions and do not consider them relevant, especially when the interviewer asks about more remote events in absolute time format (Janssen, Chessa, & Murre, 2006). The reason is obvious. First, the date is usually not that relevant, and second, dating is a demanding task in which people often fail. Nevertheless, many “real-world” or “research” situations require answering these questions, and the ability to date accurately may be crucial. Consider for example the conversation my friends (two males) had recently during our joint vacation in the Slovakian mountains:

“I was at the Považský Inovec only twice. Last time it was in 2004.”
“No, in 2004 I had just finished my master’s studies and you were not here.”
“Wait a sec. Yes, it must have been a year before then.”
“No, you weren’t here then either, because we went in only two cars that were both full.”
“No, no... I was certainly here in 2004. You must be mistaken.”¹

The conversation continued for another couple of minutes and fortunately reached an agreement, with the great help of the female partners, on the sequence of events and on the exact dating as well. This story shows how relevant the temporal aspect of events is in ordinary life situations. Without this knowledge it is difficult to create a personal narrative (or a collective narrative in this case) because people tend to create narratives in chronological order (Friedman, 2004; Scholl-Schneider, Schneider, & Spurný, 2010). As the research evidence shows, people sometimes do not know even the dates of landmark events such as weddings, birthdays of their children, or other very salient events, but they are still often surprisingly confident that their temporal estimates are correct (Shum, 1998; Wagenaar, 1986).

¹ I do not remember the exact words but the meaning remained very close to the real conversation.

Correct dating of at least some personal events is important even nowadays, in the time of smartphones and other electronic gadgets or social networks. Often only our memory can “say” whether the stored information is true or false. In one small pilot study connected with this dissertation I found that more than 50% of the activities stored in several people’s Facebook accounts, where these people had clicked “I will go there”, actually did not happen, and sometimes even the dates of the activities that did happen were not correct.

Other real-world areas where the temporal aspect is relevant are health or mental health issues. For a diagnosis it may be important to know when a symptom started and what the consequences were. I remember how ashamed I felt when I went to the emergency room with my daughter and the doctor asked me about the sequence of symptoms and the last time she had taken antibiotics. I did my best, but later on, when speaking to my wife, I realized that I had been completely wrong. Hopefully this time it did not have any serious consequences. The literature shows that I am not alone and that it is quite common to misplace medically related events (Means, Mingay, Nigam, & Zarrow, 1988; Schwarz, 2007).

In many professions it is common to gather temporal data from informants, and this data collection is often retrospective. These professionals (e.g., journalists, sociologists, psychologists, demographers, oral historians) often do not have any objective source of dates² and have to rely on date estimates reported by informants. For these professionals it is important to know how valid these reports are and what they can do to improve the data quality.

² This is not the case for the majority of public events but may apply to less known or older events, where it is more difficult to find an objective source of dates.

Knowing the date or at least the sequence of some public events is part of general knowledge. Its importance is revealed in everyday communication, where people often use the relative dating format and place events temporally by reference to persons, offices, or other important events (Greenway, 1999). Examples from everyday language are numerous: “after the Velvet revolution” (after November 17, 1989), “before the NY terrorist attacks” (before September 11, 2001), “when we joined the EU” (May 1, 2004), “during the Czech Presidency of the European Union” (during January–June 2009). If people do not know these dates they can look them up, but it is more convenient to know them. This knowledge of the dates of public events, and especially of their sequence, helps to create the national narrative of one’s country as well (Boyer & Wertsch, 2009; White, 1997).

The date may be the key characteristic that helps to distinguish similar events. This applies especially to cyclic personal and, above all, public events, e.g., National Votes 2006, National Votes 2010, and Christmas 1999.

The forensic field is probably the area where exact dating is most important. For example, police investigators and other professionals in the forensic field often ask witnesses what they did at some time and about the sequence of events. Precise temporal estimates may be crucial in this field because they can help to establish somebody’s guilt. This is why many memory studies (e.g., the Watergate scandal or the Demjanjuk trial, see Neisser, 1981; Wagenaar, 1988) as well as data collection methods come from the forensic field (e.g., Fisher & Geiselman, 2010; Roberts & Horney, 2010; Yoshihama, Clum, Crampton, & Gillespie, 2002).

1.2 General aim and the dissertation outline

“The overarching goal of any scientific data collection is to acquire valid information” (Belli & Callegaro, 2009, p. 31). In the context of my dissertation this means collecting the temporal data of unique events properly but at the same time having insight into how much confidence we³ can have in these data, which are often found to be biased. My first aim is thus an exploration of ways to improve the accuracy of date estimates. More concretely, in Studies II and III I will evaluate the impact of a calendar instrument on the dating accuracy of unique personal and public events. A calendar instrument is a data collection technique based on the common calendar that incorporates various recall aids into one instrument and should thus help to increase dating accuracy. However, there is little evidence about the conditions under which the calendar increases dating accuracy and about whether there are people or types of events for which the calendar instrument is not suitable (Belli, Stafford, & Alwin, 2009; Glasner & Van der Vaart, 2009).

³ We = professionals from diverse fields. I will usually use the term “researcher” instead of “professional”.

Unique personal events that are of major interest in Studies II and III should generally be dated relatively well, and it is thus questionable whether these events will be dated more accurately with a calendar instrument or whether the calendar instrument will only make the interview longer and more costly without any substantial benefits (and possibly with decreased dating accuracy of some events).

The second issue, deciding how much to trust the collected data, is trickier. Without an objective source of dates (e.g., from documents) the researcher can never be 100% sure that the date estimate is correct, even for very salient events. In such a situation a researcher has the choice to simply trust that the respondent was correct or to try to predict under what conditions people are more accurate. Knowledge of various sources of dating accuracy (or error) can help in these predictions and alert the researcher when the expected difficulty is too high and it would be better not to ask about the date estimates (or not to trust them too much).

Study I focuses only on well-known public events and Studies II and III mostly on personal events. I will explore three sources (predictors) of dating accuracy among personal and public events: sources related to individual differences of respondents (respondent characteristics); sources related to the target events that will be dated (event characteristics); and sources related to the data collection (this source is used only in Studies II and III). I will focus only on unique events, defined here as events that are distinguishable from other events: at least some characteristics of the event are specific to that event alone, and these characteristics are known to respondents.

Chapter 2: I focus here on memory for the temporal aspect of unique events. I first explore various types of events and aggregated concepts related to them. Events are composed not only of episodic memories but also of semantic information (e.g., temporal schemata) that fills in the missing information. The temporal aspect can be represented in relative time format (e.g., before or after something) and in absolute time format, which will be generally preferred. Next I move to how people store and reconstruct the temporal aspect of events. It is rare that people remember the exact date, and thus reconstruction with the help of various plays a major role in date estimation. The last topic dealt with in Chapter 2 is the patterns of dating error in various time units.

Chapter 3: The three sources (predictors) of dating accuracy are explored here. Respondent characteristics are, for example, demographics, the quality of people’s memory for dates, interests, or the current state of mind (e.g., tiredness). Next I move to event characteristics. Some of them can be assessed by other people or extracted from documents (e.g., event recency) while others require the respondent’s evaluation (e.g., phenomenology of memories such as importance or vividness). The last source is the data collection, which is divided into two sections. I first deal with data collection in general and then focus on calendar instruments, which will be evaluated in Studies II and III.

Chapter 4: This chapter presents the overall design and research aims of the three empirical studies. The reader will see the interconnections among the studies and the major choices that were made.

Chapter 5: Study I is focused on the dating accuracy predictors of well-known public events from 2005–2008. It is an online correlational study. An ancillary aim was to help with the choices made in Study II (e.g., whether public landmarks should be used or not).

Chapter 6: Study II explores the dating accuracy predictors of personal events from 2005–2008 and of several public events from the same period as well. It is a correlational study, but the way of data collection (with or without the help of a calendar instrument) is experimentally manipulated.

Chapter 7: Study III is similar to Study II. The difference is that this study concerns very recent events (recall period 0.5–3 months) and that some predictors are left out and others (e.g., emotionality of events) are added.

Chapter 8: This chapter contains the general discussion and conclusions of the empirical studies.


2 Unique events dating

2.1 Long-term memory for unique events

When people are faced with a date estimate task, two conditions have to be fulfilled: the target event cannot be forgotten and it has to be distinguishable from other events.

Remembering that the target event happened is important because nobody can estimate the date of a completely forgotten event; that would be a pure guess. In some research, however, even guessing the date of unknown events can make sense. For example, when researchers study temporal schemata connected to public events, they may find that even people who swear that they have never heard about the event guess the date better than pure chance. This would suggest that some known temporal schema (e.g., season or month) is connected to the event.

When several similar events happened during the reference period, people have to be able to distinguish the target event from the other events, which requires having enough specific details or, in other words, cues that will help people in the search for the target memory or information (Conway & Loveday, 2010).

The information about unique events is stored in long-term memory, which is divided into two interdependent systems: semantic and episodic memory (Tulving & Craik, 2000; Wheeler, Stuss, & Tulving, 1997). Semantic memory refers to the knowledge of facts and concepts. This includes, for example, the knowledge of exact dates, knowing the cyclic nature of parliamentary elections, knowing your own birthday, or general knowledge about the typical sequence of a trip. Episodic memory refers to the recall of specific episodes or events. The difference between the two systems is that people re-experience some aspects of episodic memories (Tulving calls that “mental time travel”) while semantic memory involves only “pure” knowing without re-experiencing (Baddeley, 2009; Tulving & Craik, 2000). Events in real life are usually represented in both memory systems because many semantic facts involve at least some form of episodic experience and vice versa (Williams, Conway, & Cohen, 2008). The nature of the relationship between the two memory systems is very complex and not yet fully understood. We know that semantic knowledge is converted from episodic knowledge (Murre, 2010) but semantic knowledge has an impact on episodic knowledge as well (Conway, 2005). The lack of definitive knowledge about both systems, however, does not decrease the usefulness of both concepts because they capture very well the way people think and feel about memories (Williams, et al., 2008). The important message is that temporal information can be stored as a fact or can be reconstructed from the recalled episodes.

An event and its representation in long-term memory

According to Glasner (2011) there is no agreed definition of what constitutes an event, and the term describes a variety of concepts. The smallest conceptual unit that can be called an “event” is the single episodic memory (e.g., falling off a bicycle). Conway (2005), similarly to Tulving, defines episodes as very specific experience-near knowledge that contains “summary records of sensory-perceptual–conceptual-affective processing that characterized or predominated in a particular experience” (p. 612). According to Conway’s self-memory system theory, episodic memories are formed at junctures in short-term goal processing. For example, I am now writing this text about Conway’s theory, and when I stop and leave the room to get a new cup of tea, that would be the juncture where episodic memories are formed. The first episode is the typing, or more precisely describing Conway’s theory, and the second episode is bringing the cup. One could also argue that typing could be divided into smaller units like typing each letter, typing the word “Conway”, typing the whole sentence, et cetera. But according to Conway people hardly chunk episodic memories into such small clusters because they are not meaningful. The example with bringing a new cup of tea is less obvious. I could go into the kitchen to bring the cup and go back to work. But I could also go into the kitchen, fill the kettle with water, and wait until it boils while listening to the radio news (this is actually what I did). This could be considered either as several episodes or as one longer episode of making tea because my short-term goal of writing did not change. Listening to the radio could be part of this episode. This is why Conway writes that junctures can occur at many different levels. Most of these episodic memories are quickly forgotten.
When people speak about past events, and especially when more remote events are taken into account (e.g., from more than one day ago to much longer), an event usually means a more aggregated concept. Conway (2005) uses the term general event to describe events that are shorter and more event-specific in nature than life-time periods but that usually aggregate and integrate many episodic memories. A general event can be, for example, writing a monograph. However, when it takes years, the term life-time period would be more appropriate. But when writing in binges, the last month’s binge before sending the manuscript to the publisher could be considered a general event. General events that cover longer periods are sometimes called extended events (Barsalou, 1988). Beyond some point, life-time periods could be considered very long extended events, but the term life-time period is better kept for really long periods, such as “living in Olomouc”, “writing the monograph”, or “studying at university”, that cover more specific general events. Several life-time periods can overlap each other or run in parallel. I can speak about my university years and at the same time about the years with my girlfriend, which can cover similar periods. Other kinds of general events are repeated events (e.g., playing tennis every Tuesday). In an experiment, Barsalou (1988) asked students what they did last summer. The most frequently mentioned events were extended events (trip to Europe; diet) and repeated events (tennis play). Only 21% of the events were specific episodes. Extended events were defined as events that last longer than a day and may be continuous but also discontinuous (interrupted by some other events).

Conway (2005) proposed a model that includes all the mentioned types of events (see figure 2.1). At the top of the hierarchy is a person’s life story. Conway defines the life story as “general factual and evaluative knowledge about the individual” (p. 608) that can have more than one self-image, and every self-image may access memories slightly differently. This is why it is so easy to speak about the same event so differently when talking to different people, where the person “uses” different self-images. As Conway, Meares, and Standart (2004) highlight, the self (and its goals) not only has an impact on the choice of episodes but also alters the memories or even fabricates them in order to keep the memories self-coherent.
The life story consists of several major themes (e.g., the work theme or the relationships theme). Every theme may cover several life-time periods (e.g., university years, or being with a girlfriend that changed into marriage and family) that can overlap or be accessed through more than one theme. Lower in the hierarchy are the general events (a tennis tournament or rebuilding the living room) and at the bottom the specific episodes (e.g., a tennis match with X; buying the sofa).


[Figure 2.1: hierarchical diagram. Levels from top to bottom: life story; life story themes (work & study theme, relationship theme); life-time periods (university years, working at Masaryk University, being with girlfriend, spending time with ..., family); general events (tennis tournament, camping in Norway, rebuilding the living room); episodes (e.g., buying sofa).]

Figure 2.1 Model of autobiographical memory. Adapted from Conway (2005).

Most personal events from Studies II and III in this dissertation could be described as general events, because they cover several episodes. Remote events in Study II are mostly extended events (e.g., a trip to Croatia), while among the recent events in Study III a small percentage of episodic memories appears as well.

Semantic memory for unique events

The memory system is faced with several mutually contradictory demands. One is to represent reality as this is experienced, but in cognitively efficient ways, and another is to retain knowledge in such way as to support a coherent and effective self. (Conway, 2005, p. 596)4

In the previous section I dealt with episodic memories and their aggregation into concepts like general events or life-time periods. All these types of events consist not only of episodic information but also of semantic knowledge. The reason is obvious and is clearly stated in Conway’s quotation above. Human memory needs to represent memories in some efficient way, because otherwise people would be overloaded with memories, and there is also a need for a coherent self-image, which is probably why most episodes are forgotten or only limited information remains stored (Burt, 2008; Conway, 2005; C. B. Harris, Sutton, & Barnier, 2010). The retrieval of events is almost always a reconstructive process where pieces of episodes, general events and semantic knowledge are put together (Anderson, 2009; Conway & Loveday, 2010; Conway & Pleydell-Pearce, 2000).

4 In original: “The memory system is, therefore, faced ...”

The semantic knowledge is especially important because it fills in missing information (Eysenck, 2009). I will focus on those forms of semantic knowledge that could have an impact on dating accuracy5.

Semantic knowledge helps to date unique events in many ways. People know the schema that the year is divided into months or seasons and that some activities typically happen during a particular time of the year (e.g., skiing, if not on a glacier). People also know that a week has 7 days, that workdays are typically Monday to Friday, and that major trips are more probable during weekends. Life-time periods also have typical schemata (Janssen, et al., 2006). Czech children start attending primary school at the age of 6 or 7 and the earliest age for entering university is 19 or 20. This knowledge is usually called temporal schemata (Larsen, et al., 1995). Temporal schemata are not only shared in a culture but may be idiosyncratic as well. For example, a person who works in shifts may easily reconstruct when he or she worked if the shifts have a regular pattern (see the next section for more details about temporal schemata).

Semantic knowledge may involve knowledge of the typical sequence of events, that is, scripts (Schank & Abelson, 1977). A typical script is, for example, a vacation. People first take leave from work, then leave for vacation, spend some time at their destination, and then depart for home. This may be followed by a party with pictures from the vacation and so forth. Longer extended events or life-time periods may also be at least partially scripted. For example, rebuilding a house may have a sequence of looking for a house, finding a mortgage, receiving the mortgage, repairing the roof, replacing the electrical wiring, etc. People may be wrong in dating these events but the sequence will typically be well remembered (Friedman, 2004; Larsen, et al., 1995). Episodic memories also have typical scripts. For example, a party script at university could be getting to know people, drinking, dancing, etc. Even though some sequences may be specific to only a particular vacation, house rebuilding, or party, the sequence usually remains fairly stable (Eysenck, 2009).

5 Rubin (2006) incorporates schematic knowledge into his basic-systems model of episodic memory. This model is unique because it tries to interconnect the isolated knowledge about different systems (narratives, schemata, episodes, neural substrate, etc.) into one framework for understanding episodic memory.

Problems may occur when the schematic knowledge is not true for the events with which it is associated (see e.g., Neuschatz, Lampinen, Preston, Hawkins, & Toglia, 2002; Wiley, 2005). For example, a person who plays tennis regularly on Tuesdays may estimate that an important match happened on Tuesday and forget that it actually happened on Wednesday (an atypical day for tennis). Or it may be forgotten that the first vacation to Croatia was in July because all other vacations were in September.

In summary, when semantic knowledge is available and correct, events will be dated more accurately than events with similar characteristics without this knowledge. However, when semantic knowledge is wrong and people are not aware of it, it can lead to significant dating errors.

2.2 Temporal representation of events

Personal memory has difficulty with intervals of time and the sequence of happenings, and even today individuals often resort to the ancient dating mechanisms of relative chronology, centred on persons and offices. (Greenway, 1999, p. 127)

The relative chronology format has been dominant throughout the history of humankind (Greenway, 1999). Dating events by reference to persons, offices or other important events has a natural quality and is most of the time more relevant than knowing the absolute date in calendar units, especially when more remote or less relevant events have to be dated (Greenway, 1999; Janssen, et al., 2006). The relative chronology format is an example of a relative time format. When professionals are interested in dating accuracy, the exact date of the reference point (or period) has to be known (e.g., that the Velvet Revolution was in November 1989). For example, in Loftus and Marburger’s study (1983) respondents related events to the disastrous eruption of Mount St. Helens. Another way of using a relative time format is to ask participants to describe the amount of time between an event and the present, e.g., two months ago (Janssen, et al., 2006). The absolute time format uses exact time units (e.g., September 2011) that do not need any further knowledge to be interpreted (Janssen, et al., 2006), which is why I will use this time format in all empirical studies.


Bradburn (2000) describes various time units people can use to estimate the temporal aspect of events. Commonly shared calendar units are year, month, week, day of the week, day, hour, etc. In most cases these are the units of the Gregorian calendar, which is the de facto international standard6 and is used almost everywhere in the world for civil purposes. When people do not know the date in calendar units they may still be able to estimate the date in approximate calendar-like units (e.g., seasons, holidays), in socially defined time periods (e.g., “during the spring semester” or “during Prague spring”), or by an idiosyncratic reference point (“two days after my birthday”, “when I was at the university”). This is supported by the research of Janssen et al. (2006), who found that in most cases people prefer the relative time format. The preference is also moderated by the recency of events (people prefer the absolute time format for recent events) and the type of event (people prefer absolute dating for personal events but not for public ones).

Different time formats are also connected to different questions (Gaskell, Wright, & O'Muircheartaigh, 2000). When a relative time format is used, researchers may ask for example: “How long ago did this event happen?”, “Has the event occurred since...?”, or “Did the event happen before... or after...?” When researchers ask in the absolute time format, the question may be: “When did the event happen?”, “In which year and month did this event happen?”, or “On which day did it happen?”

The choice of format is important not only because people prefer the absolute or relative format in different situations. The two formats also lead to slightly different dating errors (see the next section for more details).

2.3 How people store and retrieve temporal information

Storage of temporal information

There are many different theories on how temporal information is stored in human memory. Each theory is able to explain only a part of the whole phenomenon, and many of them are useful metaphors rather than well-supported theories. Because of this I will not focus on a single theory but on several categories of theoretical approaches. The most widely used theoretical classification was developed by Friedman (1993, 2004). He classifies the theories into three major types: distance, location, and order theories.

6 The Gregorian calendar is also known as the Western calendar or the Christian calendar. The date and time are standardized by ISO 8601. http://dotat.at/tmp/ISO_8601-2004_E.pdf

Distance theories. The idea behind these theories is that people are aware of the temporal distance (elapsed time) between now and the past event. This knowledge may stem from several sources. People may, for example, simply “feel” how long ago the event happened. This feeling supports the hypothesis that temporal information is stored in a spatial format (e.g., Murdock, 1974). Other theories stem from the fact that memories are subject to decay, as identified in Ebbinghaus’ classic curve (e.g., Hinrichs, 1970). The strength of the memory is thus the source of the date estimate. A similar theory highlights the number of details recalled about the event: the more details, the more recent the event seems (e.g., N. R. Brown, Rips, & Shevell, 1985).

Location theories. Friedman (2004) mentions two types of these theories. The first are the time-tagging theories (e.g., Hasher & Zacks, 1979), which assume that temporal information is stored as one aspect of the event in the same way as other aspects (e.g., who was involved, where it happened). The other type is the reconstructive theory, which presumes that it is not temporal information but rather contextual information that is stored with the memories automatically. This contextual information about the life-time period, people involved, and activities helps to reconstruct the date with the help of general knowledge (the semantic knowledge mentioned in chapter 2.1) (e.g., Hintzman, Block, & Summers, 1973).

Order theories. These theories assume that an order code of related memories is automatically stored in memory (e.g., Tzeng & Cotton, 1980). This is supported by the fact that the date of one event is often determined from a meaningfully related event (Friedman, 2004).

Friedman (2004) also emphasizes that temporal information in some time units is partially (or totally) independent of time scales in other units, which is why estimates in fine-grained units may be more accurate than estimates in coarse-grained units. For example, in one of his studies he found a dating error of 1.94 months, but in hours the error was only 1.04 hours (Friedman, 1987). Similar effects were found for other time units such as month and year, or exact date and day of the week (DOW) (Larsen, et al., 1995). For more information about time units see chapter 2.4.

Although many critical arguments against some of the theories mentioned here can be found7, all of them provide at least useful hypotheses that help to understand the complex nature of date estimates better. For example, time-tagging theories may be generally wrong, but it is still true that some dates are directly retrieved without any need to retrieve the episodes (Skowronski, Betz, Thompson, Walker, & Shannon, 1994).

7 See e.g., arguments against trace decay theories in Brown and Lewandowsky (2010).

These theories are important not only because they explain how people store temporal information but also because the different reconstruction strategies from which these theories stem are associated with different dating errors (see the next section for more details).

How people reconstruct the temporal aspect of unique events

The previous section showed that there are many theories trying to explain how people store temporal information. These theories were mostly derived from what people reported about how they think they arrived at the date estimate. Even though people may have only limited access to this introspective information (Ericsson, 2003), the following strategies for arriving at date estimates are fairly general, and most people use all of them or their combinations in everyday life situations (Friedman, 1993; Janssen, et al., 2006; Thompson, Skowronski, & Betz, 1993).

Knowledge of the exact date of an event. If people know the exact date, then this date is part of their semantic knowledge in long-term memory and they do not have to recall the event in order to reconstruct the date (Burt, 2008). According to Friedman (1993) there are not many so-called “time-tagged” memories, and the majority of them concern very salient and important events, such as the birth of one’s own child. The bulk of public events are not so salient and important, which is why time-tagged public events are rare. This is not true for flashbulb memories of landmark events like the eruption of Mt. St. Helens (Loftus & Marburger, 1983), the Challenger disaster, the 9/11 terrorist attack, the fall of the Berlin Wall (Luminet & Curci, 2009) or the Velvet Revolution for Czech citizens (Trnka, 2011).

When people know the exact date (e.g., September 11th, 2001), all time units that are covered in this dissertation can be easily traced. When people do not know the DOW they can find it in a calendar or, when no calendar is available, simply count from the last date for which they are sure about the DOW. The same is true for the week.

When the exact date is not known, it does not mean that other time units are not time-tagged. The reason is that different time units are relatively independent of each other. For example, the DOW is often time-tagged because people remember the temporal schemata of their activities over the week, yet people can still have problems with the exact day of the month. The month or season is also relatively often time-tagged, yet people can still have problems in estimating the year (Larsen, et al., 1995). More details about various time units follow below.

Even when people do not remember dates or other time units directly, they can still use several strategies that help them to estimate the date better than sheer chance would predict. Janssen, Chessa, and Murre (2006) distinguish between primary (direct) temporal information8, which tends to be forgotten or partially forgotten relatively soon, and indirect temporal information, which includes contextual information about the event (e.g., sequence, temporal schemata). The strategies mentioned below all apply to this indirect temporal information.

An association with a one-time public or private temporal landmark. In this case the target event is somehow associated with the temporal landmark9. The association is often temporal (the sequence of events is known) but may also be logical or causal (Glasner, Van der Vaart, & Belli, forthcoming). Because temporal landmarks are usually correctly located in time (Gaskell, et al., 2000; Loftus & Marburger, 1983), the correctness of a date estimate depends on the ability to estimate the temporal distance between the landmark and the target event (e.g., the same day or a month before). An association may be purely coincidental (I remember the exact date when I volunteered in a hospice because it happened on the same date as 9/11) but usually events from the same thematic domain work better (Glasner, et al., forthcoming). How precise the date estimate of the target event will be depends on its distance from the temporal landmark. Even when people do not clearly remember the sequence of events they can still deduce it from typical scripts, because some sequences of events are more probable than others (e.g., people usually go on honeymoon after the wedding and not the other way around) (Friedman, 2004; Schank & Abelson, 1977).

8 Thompson, Skowronski, and Betz (1993) use the term “partial temporal information”.

9 By temporal landmark I mean a time-tagged event that can help with the time estimate of related events. However, when no other time-tagged events are available (which is common), then even less accurate “landmarks” may be helpful (helping with estimating the season or at least the life-time period). From this perspective, any event for which a person believes they know at least some partial temporal information can serve as an “approximate landmark event”. The same applies to the next strategy.

Association with cyclic public or private temporal landmarks. Some events follow a regular pattern. For example, birthdays happen every year at the same time. Other examples are the beginning of the semester at university, national holidays, Christmas, the World Championship in ice hockey, et cetera. All these cyclic events help to date the month or even the exact date of the associated target events (Larsen, et al., 1995). But since most cyclic events take place every year, they usually do not help with year estimates, because people may not be able to distinguish between two similar cyclic events, especially in long recall periods.

There are also some temporal landmarks that recur at two-year, four-year or other intervals. For example, the Venice art biennale, an important landmark for art lovers, happens every two years, and the Olympic Games or national elections in the Czech Republic happen at four-year intervals. Even if people do not know the date of these events, they can calculate the right year when they know the regularity. As mentioned for the previous strategy, people often know the sequence, and if not they can deduce it from scripts.

When the exact date of a personal or public landmark (both one-time and more frequent) is not provided, there is a danger that the date of the landmark event will be biased (and sometimes it is), which makes it more difficult to estimate the dates of the target events associated with it (Van der Vaart & Glasner, 2011).

An event contains a temporal cue (or schema). Some public or personal events may have a temporal hint in their name or description, for example the Winter Olympic Games, summer vacation, Prague Spring, Sixty-eighth10. Another possibility is to estimate the date by finding the event’s temporal schema, if one is available (Larsen, et al., 1995). An open-air event in the Czech Republic will almost certainly happen during summertime because it is too cold to organize such an event in other seasons.
Temporal schemata will be thoroughly covered in a separate section below.

An event is part of a life-time period. Evidence from autobiographical memory studies (e.g., Williams, et al., 2008) shows that people’s life events are organized according to their life-time periods. Even if people do not know the exact month or year, they often know that an event happened “when I was living abroad in Budapest”. It is very rare that people do not know the life-time period of the event. Even some public events may be part of a life-time period. For example, the Czech Republic joining the EU is part of my “writing the thesis” life-time period, because I was working on my [...] in the [...] right after the Czech Republic joined. The accuracy of the date estimate depends on the length of the life-time period. If it is too long it does not help much, unless the event happened at the beginning or end of this period. Beginnings and endings of life-time periods are often landmark events which are dated quite precisely (Shum, 1998).

10 “Prague spring” was a period of political liberalization in Czechoslovakia during the era of its domination by the Soviet Union after World War II. It ended on August 21st, 1968, when the Soviet Union and its Warsaw Pact allies invaded the country to halt the reforms.

Estimating and guessing. When people estimate the date they use not only the above-mentioned strategies but also various heuristics that influence the date estimate. One comes from the trace decay theory, which says that the memory trace decays with the passage of time, so more recent memories are more vivid (Hinrichs, 1970). This is generally true, so when respondents feel that the event happened long ago they may adjust the date estimate accordingly. People may also use a heuristic that “counts” the details about the event (Friedman, 1993). The more they know about the event, the more recent it seems to be. People use these “distance” heuristics quite often, because they generally work quite well (Friedman, 1993). A limitation is that both heuristics may be influenced by some event characteristics (e.g., importance or knowledge). More important events are perceived as more recent than less important events, because people have more knowledge about them and because their memories are more vivid (N. R. Brown, et al., 1985).
But as Lee and Brown (2004) note, even when people do not know much about the event they may provide plausible date estimates from the bits and pieces they know and with the help of available heuristics. For these authors, guessing is thus not “an unexplained and unbiased error” (p. 748) but often a systematic estimation bias that is worth thorough exploration, which may provide valuable insights into why people often estimate the date better than chance even when they do not know much.

Although all strategies, including the “feeling” of knowing the exact date, may lead to erroneous as well as correct dates, the probability is not the same for different strategies. Thompson, Skowronski, and Betz (1993) found in their diary study of undergraduate students (events 1 to 14 weeks old) that accuracy decreased when less exact strategies were used and that when students thought they knew the exact date their dating accuracy was higher (see table 2.1)11.

Table 2.1 Relationship between dating strategy and dating error (from Thompson, et al., 1993, p. 354).

Dating strategy                              Frequency of use   Dating error in days
Exact date recalled                                       802                   1.30
Related to reference event                               1011                   4.08
General reference period recalled                        1312                   8.93
Estimated number of intervening events                    182                  10.74
Clarity of recall                                         210                  11.15
Generated prototypic temporal information                 572                  11.84
Guess                                                     461                  15.27

Note: The dating strategy was the dominant strategy used. In many situations people use multiple strategies to reconstruct the date (Thompson, Skowronski, & Lee, 1988). General reference period recalled = the general time period was known (e.g., summer). Generated prototypic temporal information = prototypic date (bowling always on Wednesdays). Estimated number of intervening events = number of intervening events since the event being dated.

Temporal schemata

Temporal schemata refer to general knowledge about time patterns and thus constrain the likely time at which an event happened (Larsen, et al., 1995). For example, when somebody visits a gym every Tuesday, the DOW error for this event will probably be zero (week schema). I will deal with the temporal schemata that are relevant for the time units used in this dissertation (from the “exact date schema” to the “multi-year schema”).

Exact date schema. There are events that can happen only on one day or during some specified period of the year. Examples are Easter, one’s own birthday, or kissing under the cherry blossom (May 1st)12.

11 The relative inaccuracy of the prototypic temporal information is rather surprising. This knowledge, which I call temporal schemata or, more generally, schematic information, leads to relatively accurate estimates. It also depends heavily on the units in which the error is represented. When people bowl on Wednesdays they will hardly ever misplace the Wednesday, though the exact date may not be correct.

12 The tradition of kissing under the cherry tree in the Czech Republic goes back to Karel Hynek Mácha (a romantic poet from the 19th century who wrote about a tragic love between two young people on the first of May), but there are also some older traditions that are connected with love during this time of the year (e.g., the tradition of the Maypole).

Week schema. Some events can happen on any day of the week, and when the exact date is not remembered it could be expected that the DOW will be estimated no better than chance. This is, however, usually not true, because people remember the DOW quite well even for events whose exact date they do not remember (Gibbons & Thompson, 2001). This applies especially to recent personal events, where people can relate these “non-schematic” events to schematic events (e.g., it happened on the “gym day”). Public events that were unpredictable may be especially difficult, and for these events the DOW estimate may be close to chance (when the exact date is not remembered). People with an irregular life may have problems with estimating the DOW (unless they have a great memory for DOW), especially when their proxies have the same irregularity in their lives. Huttenlocher, et al. (1992) found that workdays (Monday to Friday) function as one unit while Saturday and Sunday are rather independent. This implies that events on Mondays and Fridays are not equally likely to be displaced to other weekdays and to weekend days, because people hardly ever mistake Monday for Sunday. Even when people do not know the exact DOW, they often still know whether the event could have happened on a weekday, at the weekend, early in the working week, or at the end of the working week (Larsen & Thompson, 1995).

Month schema. A month is subdivided into weeks. People may know that they do some activities every second week of the month or that there is a regular alarm siren test on the first Wednesday of the month at 12 noon (as in every Czech town). Not many events have a month schema that helps to estimate the week, which is why estimating the exact week is relatively difficult. Belli, Shay, and Stafford (2001) used 1/3 of a month as the time unit instead of a week, because people tend to remember more easily whether the event happened at the beginning of the month, in the middle, or at the end. Larsen, et al. (1995) also point out that 52 weeks are too fine-grained to represent the year and are problematic for representing the month as well, because weeks may overlap months. This is of course not a problem when people are provided with a calendar, and when the exact day of the month and the month are provided, a week estimate is redundant. When people estimate the date of recent events, however, the week may be very helpful because people may date in a relative time format by saying “two weeks ago”. The relative time format may also be useful for confidence statements like “I am sure that it happened within the last two weeks” (see chapter 3.2 on the confidence of date estimates).

Year schema. A year is subdivided into 12 months and 4 seasons and many other more or less idiosyncratic schemata such as “spring semester”, “time of the blossoming elder”, or “before Christmas time”. The year schema helps to estimate the month but is usually too coarse-grained to help with estimating the exact date or DOW (Larsen, et al., 1995). People often remember some contextual information that helps them to estimate the month or at least the season (snow = winter). Many events are of a cyclic nature, e.g., festivals, vacation times, holidays, Easter, Labor Day, etc., and there are also many regional cyclic events such as feasts or markets (Vokolek, 2011).

When an event could happen at any time of the year and happened indoors, it may be more difficult to estimate the month because the context does not offer many retrieval cues.

Multi-year schema. Some events happen every 2 years, every 4 years, every 10 years, and so on. These schemata apply especially to public events because many of them follow a cyclic pattern (e.g., Olympic Games or parliamentary elections) (Larsen, et al., 1995). Events that have this schema may have a more accurate year estimate than events without it. However, when the recall period is very long, the cyclic pattern may not help and similar events can get mixed up: people may know the month of the Winter Olympic Games but not be sure whether Nagano was before or after Torino.

Larsen et al. (1995) also deal with the stability and updating of temporal schemata. According to the authors, some schemata usually remain the same over people’s lifetime (e.g., seasons, or the day schema of having lunch at noon) while others may slowly update or change in importance. For example, the academic year may be very important but lose its meaningfulness when people grow up (and do not have children for whom this schema is important).
Another example is the DOW schema, which is very stable for grown-ups with regular work and free-time activities but may change when one life period ends and a new one starts (new job, child, retirement).

This section shows that temporal schemata are the major source of dating accuracy for both personal and public events. I would argue that these schemata are particularly important for date estimates of public events, because such events are generally more difficult to date and people have fewer resources from which to estimate the date.


2.4 Patterns of dating error

The error in dating accuracy can be presented as signed or in absolute value. As mentioned in chapter 1.3, I will use the term dating error for the error in absolute value and signed dating error, or simply signed error, for the error whose sign shows whether the date was moved backward or forward from the true date. When people estimate the dates of several events, the mean signed error is usually close to zero. The reason is that some events are dated as more remote and some as more recent, and the errors cancel one another out (N. R. Brown, et al., 1985; Rubin & Baddeley, 1989). When events are estimated as more remote, researchers speak about backward telescoping (or use a minus error sign), and when events are estimated as more recent, about forward telescoping (or use a plus error sign). The word telescoping is used because it is reminiscent of the shrinking of distance when people look at objects through a telescope the right way round, or the expansion of distance the other way round (Rubin & Baddeley, 1989). The most obvious reason why some events are telescoped is that when people guess the date, an event is inevitably telescoped backward or forward (unless guessed correctly). Thus the telescoping bias may not have any “deeper” cause. However, there are also several systematic reasons why telescoping may appear (Tourangeau, Rips, & Rasinski, 2000):
• It can be caused by the boundary effect;
• Older events are more prone to forward telescoping, while more recent events are more prone to backward telescoping;
• More important, salient events about which people have more details (or knowledge) tend to be telescoped forward (the opposite is true for non-significant events).

The dating errors of several events in absolute value do not cancel one another out, which is why dating error is more appropriate for most analyses (e.g., comparing means). Another way of presenting the dating error is as the percentage of correct or incorrect estimates. This is especially helpful in statistical analyses because the distribution of dating error in absolute value for personal events is hardly ever normal (see e.g., Gibbons & Thompson, 2001; Janssen, et al., 2006; Larsen & Thompson, 1995).
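The difference between signed error and dating error can be illustrated with a short sketch. The dates below are hypothetical, and the function names (signed_error_days, summarize) are illustrative assumptions rather than part of any cited study; the sign convention follows the text (minus for backward telescoping, plus for forward telescoping).

```python
from datetime import date

def signed_error_days(estimated: date, true: date) -> int:
    """Signed dating error in days.

    Negative: the event was dated as more remote (backward telescoping).
    Positive: the event was dated as more recent (forward telescoping).
    """
    return (estimated - true).days

def summarize(pairs):
    """Summarize a list of (estimated, true) date pairs."""
    signed = [signed_error_days(est, true) for est, true in pairs]
    n = len(signed)
    return {
        "mean_signed_error": sum(signed) / n,
        "mean_dating_error": sum(abs(e) for e in signed) / n,  # absolute value
        "share_correct": sum(e == 0 for e in signed) / n,
    }

# Hypothetical estimates: (estimated date, true date)
pairs = [
    (date(2010, 3, 1), date(2010, 3, 8)),    # 7 days too early
    (date(2010, 6, 15), date(2010, 6, 8)),   # 7 days too late
    (date(2010, 9, 2), date(2010, 9, 2)),    # correct
    (date(2011, 1, 10), date(2011, 1, 13)),  # 3 days too early
    (date(2011, 4, 23), date(2011, 4, 20)),  # 3 days too late
]

result = summarize(pairs)
print(result)
```

In this constructed sample the backward and forward errors cancel, so the mean signed error is zero although only one of the five events is dated correctly; the mean dating error in absolute value keeps the size of the errors visible.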


The dating error in various time units usually shows some typical patterns. I will therefore describe the typical patterns of all time units used in the dissertation. These are days, DOW, weeks, months and years.

Error in days

The day is the time unit used to represent the dating error of recent events. The error (in absolute value) of recent events typically shows the following 7-day pattern: many events are dated correctly, fewer events are dated with a one-day error, and far fewer with errors of 2 to 5 days, but then again more events are dated with errors of 6 days and even more with 7 days. Then the whole cycle repeats, though the percentage of errors tends to get smaller and smaller over the cycles (Larsen, et al., 1995). The reason why the almost linear decreasing trend of dating accuracy is interrupted by “bumps” at errors of multiples of 7±1 days is that people often remember the DOW very well and the DOW is forgotten more slowly than the correct week (Gibbons & Thompson, 2001). This applies especially to personal events or to public events with a DOW temporal schema. Public events without this schema do not show such a strong pattern of 7±1 day errors (Larsen & Thompson, 1995)13. When signed error is used, the pattern is similar on both sides of the distribution (Larsen & Thompson, 1995). To sum up, even though the error in days shows a linear decreasing trend (especially when correctly dated events are excluded14), this trend is interrupted by many bumps that are close to multiples of 7 days (±1 day), and the size of these bumps also decreases with time.
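The weekly “bumps” can be made concrete with a minimal sketch that flags errors lying within one day of a non-zero multiple of 7. The error values are hypothetical and the helper name is_weekly_bump is an illustrative assumption; the sketch only shows how such errors could be tallied, not how the cited studies analyzed their data.

```python
from collections import Counter

def is_weekly_bump(error_days: int) -> bool:
    """True if the absolute error is within one day of a non-zero multiple of 7
    (6, 7, 8, 13, 14, 15, ...)."""
    d = abs(error_days)
    return d >= 6 and d % 7 in (6, 0, 1)

# Hypothetical signed errors in days for fifteen recent events
errors = [0, 0, 0, 1, -1, 2, -3, 6, 7, -7, 8, 10, -14, 15, 21]

# Frequency of each absolute error, i.e. the distribution described in the text
distribution = Counter(abs(e) for e in errors)

# Among incorrectly dated events, count those that fall on a weekly bump
nonzero = [e for e in errors if e != 0]
bump_count = sum(is_weekly_bump(e) for e in nonzero)

print(sorted(distribution.items()))
print(f"{bump_count} of {len(nonzero)} incorrect estimates lie on a weekly bump")
```

An error of 7 or 14 days means the DOW was remembered correctly but the week was not, which is the situation the bumps are taken to reflect.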

Error in days of the week

The error in days of the week (that is, whether the event fell on, e.g., a Monday or another day of the week) can be no more than 3 days if measured in number of days, because days of the week are cyclic. The DOW error shows a pattern similar to the ‘error in days’ for the first 3 days of the week (many DOW correct, far fewer with a 1-day error, etc.). However, people generally know the correct DOW more often than the exact date in days, due to knowledge of week temporal schemata together with knowledge of the exact date from which the DOW can be extracted (Gibbons & Thompson, 2001; Larsen & Thompson, 1995). When public events or less important unexpected personal events are dated, the percentage of correct answers may be much smaller or even close to chance level (Huttenlocher, et al., 1992; Larsen & Thompson, 1995; Larsen, et al., 1995). When people think that the correct DOW is Sunday they will hardly ever mistake it for Monday, even though it is only a one-day DOW error. The reason is that weekend days have a different temporal schema than weekdays (Huttenlocher, et al., 1992), and thus Sunday is mistaken for Saturday more often than for other days (Larsen & Thompson, 1995). The same applies to weekdays. Also, the first two weekdays will probably not be mistaken for the last two (Thursday and Friday). This applies to events that have some temporal schema. Events that can happen on any day and about which people do not have any contextual information can have any DOW.

13 The similarity of the error pattern of public and personal events depends on how relevant the public events are. When researchers choose, for example, local events that are personally relevant (or have a DOW schema), the pattern of the dating error will be more similar to that of personal events. This was done, for example, by Kurbat, Shevell, and Rips (1998), who used a booklet containing the major events at the University of Chicago. Students had to circle whether they participated in the event or not.
14 When these events are not excluded, a power function (or an exponential) may be a better approximation because it is able to take into account the more frequent categories of dating error (zero error, 1 day error, 2 days error) at the beginning of the distribution.

Researchers sometimes ask only about the exact date and extract the DOW from this information. When no calendar is used (a pocket calendar with exact dates and DOW information is enough), this approach leads to many artificial DOW errors: people often do not know the exact day of the month, or are not sure, for example, whether it was the 29th or the 30th, which leads to a DOW error even when they are sure it was a Sunday. This is also why some researchers ask separately about the DOW and the exact date (Gibbons & Thompson, 2001). This approach leads to fewer DOW errors.
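Both the cyclic nature of the DOW error and the artificial error that arises when the DOW is derived from a reported date can be sketched in a few lines. The function name dow_error and the example dates are illustrative assumptions; days are coded as in Python's date.weekday() (0 = Monday, 6 = Sunday).

```python
from datetime import date

def dow_error(estimated_dow: int, true_dow: int) -> int:
    """Cyclic distance between two days of the week (0 = Monday ... 6 = Sunday).

    Because the week is cyclic, the error can never exceed 3 days.
    """
    diff = abs(estimated_dow - true_dow) % 7
    return min(diff, 7 - diff)

# The largest possible DOW error over all pairs of days
max_error = max(dow_error(a, b) for a in range(7) for b in range(7))

# Hypothetical respondent: the event took place on Sunday, 29 May 2011.
true_date = date(2011, 5, 29)
# The respondent is sure it was a Sunday but reports "the 30th".
reported_date = date(2011, 5, 30)

# DOW derived from the reported date: an artificial one-day DOW error
derived_error = dow_error(reported_date.weekday(), true_date.weekday())
# DOW asked separately ("Sunday" = 6): no DOW error
direct_error = dow_error(6, true_date.weekday())

print(max_error, derived_error, direct_error)
```

The cyclic distance treats Sunday and Monday as one day apart, which is why, as noted above, the size of the DOW error alone does not capture the weekend/weekday schema that makes this particular confusion rare.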

Error in weeks

The error in weeks has a pattern similar to that of the multiples of 7 days in the error in days, though the decline in accuracy is usually not so steep. Most events have the correct week, fewer have 1-week errors, even fewer have 2-week errors, and so on. The trend of slowly decreasing accuracy resembles an exponential curve (Betz & Skowronski, 1997; Larsen, et al., 1995).

Error in months

For more remote events (approximately 2 years or more) month units are usually appropriate to represent the dating error (e.g., Freedman, Thornton, Camburn, Alwin, & Young-DeMarco, 1988). People usually remember the month quite well, which is why the distribution of errors in month units shows “bumps” at multiples of 12 months or close to these multiples. This suggests that it is relatively easy to make an error of one or more full years (Larsen, et al., 1995). This pattern applies only to events that have a year temporal schema. Unexpected events do not show this pattern or show it much less (Larsen, et al., 1995). The pattern of this error thus resembles the pattern of the error in days. Public events also show this bump pattern, but when a year temporal schema is not available the size of these bumps will be smaller, because when the date is guessed the distribution of this error will be closer to chance level, though usually not completely (Lee & Brown, 2004)15.

Error in years

Error in years does not seem to show any typical pattern and gradually increases with time. When really remote events are dated and a relative time format is used (“How long ago... ?”), rounding may play a role, so that, for example, multiples of 5 or 10 years will be more frequent (Huttenlocher, Hedges, & Bradburn, 1990; Tourangeau, et al., 2000). This, however, is not an inevitable pattern, and when an absolute time format is used most of these errors will probably disappear (Janssen, et al., 2006).

15 Lee and Brown (2004) found that even without knowledge of temporal schemata or the correct date people are often better than chance, which implies that some implicit knowledge (heuristics) helps them in date estimates.

3 Predictors of dating accuracy

Casino operators have learned that all they have to do is keep the odds in their favor and have a large enough sample size of events so that their edges have ample opportunity to work. (Douglas, 2000, p. 106)

Marriage is a salient event and it is no surprise that people estimate the date correctly irrespective of the recall period, gender, age, or interview type. But even landmark events are sometimes incorrectly dated. Therefore in a single case the chance of accurately estimating the exact date in chosen units is always fifty percent16. Casino operators as well as memory researchers have to accept this fact. Luckily, when more people have to date landmark events (say ten) and the chosen temporal units are appropriate, researchers can be pretty sure that most estimates will be correct, because it is a known fact that landmarks are usually time-tagged (Shum, 1998). If all ten estimates were incorrect, researchers would be very surprised, because the probability of this happening is very low (at a fifty percent chance per estimate, p = .5^10 ≈ .00098). In such a case the “rules” of remembering were violated or something really extraordinary happened, e.g., ten asocial people who were married and did not care about the marriage were chosen. Therefore, all presented predictors only increase the odds that the date estimate will be correct, but do not guarantee certainty in a single case or for a single individual who could be very different from an “average” person.

I will only focus on predictors for unique personal and public events and (partially) omit the predictors that are more relevant for other types of events, like repeated events (e.g., frequency, interference). Unique events are by definition easily differentiated from other events, because the description is the only one of its kind. However, if the description is poor or the unique event was partially forgotten, then interference from similar events may have an impact.

The reason for searching for various predictors is twofold. First, knowing predictors may increase researchers’ trust (or distrust) in temporal data when they do not have an objective source to validate the respondent’s date estimate. Second, it may help to predict the task difficulty, which is here operationalized “as the degree to which it is difficult for a respondent to produce an accurate answer to a question” (Van der Vaart, 1996, p. 2). When, for example, researchers ask about job changes, it may be a difficult task for people who changed jobs frequently and an easy task for those who did not. When researchers expect high task difficulty they can help respondents with various recall aids such as the calendar instruments (Belli, et al., 2009) that will be covered in chapter 3.4.

Task difficulty, or more precisely dating difficulty, has many sources (see figure 3.1), which I tried to categorize into three groups17:
• Respondent characteristics influence the overall dating accuracy and most of them can be gathered both from the respondents and their proxies. I divide them into stable characteristics and the current state of mind. Stable characteristics include commonly examined demographics such as gender or age; quality of memory for dates and ability to reconstruct the date; and a heterogeneous group of respondent predictors related to the type of information that tends to be remembered by these respondents (e.g., due to their interests or lifestyle). The current state of mind moderates respondents’ ability to estimate the date. I will focus on the three most influential states: tiredness, mood and motivation.
• Event characteristics have to be measured separately for every event. These include characteristics which are independent of respondents’ evaluation (e.g., recency) and characteristics that have to be assessed by respondents, i.e., respondent-dependent characteristics (e.g., vividness of an event). Some event characteristics can be assessed by proxies as well (e.g., expected week accuracy) but their interpretation may be different.
• Too often the way of data collection is not appropriate, because the type or mode of interview (or question type) is not suitable for collecting temporal information. This is especially true when the task is difficult (Belli, Smith, Andreski, & Agrawal, 2007; Van der Vaart & Glasner, 2007a). Important factors are also the interviewers’ facilitation of the respondent’s recall, the chosen time units, contextual factors and the reason why people reconstruct the date estimates.

16 The accuracy of course depends on the researchers’ decision about how big a deviation from the chosen units is still accepted as “accurate”.

17 I will mostly use the word predictor instead of source, because it highlights the function in predicting dating accuracy.

Figure 3.1. Conceptual model of the sources of dating difficulty

The predictors are often related to each other. For example, women tend to be better at dating personal events (Skowronski, et al., 1994), but it may not be gender but the type of event or the family role of women that plays the more crucial role18. I will explore the sources (predictors) of dating difficulty independently and show the major interactions within the subchapters. The summary of all predictors and their impact on dating accuracy is at the end, in chapter 3.5. Some predictors will be examined very briefly (e.g., educational level) while others will be covered quite extensively (e.g., gender). This imbalance reflects the complexity of some predictors as well as the amount of available literature19.

18 Some of the predictors could be part of more than one category, depending on the perspective the researcher takes. For example, importance could be a respondent characteristic, because it has to be rated by respondents themselves, but it can also be an event characteristic because it is different for every event.
19 The reasons for the differences in coverage are: relevance of the predictor (less space is devoted to less relevant predictors), complex nature (some predictors are more complex and need more clarification), and lack of literature (there is not enough evidence about some predictors and a lot of evidence about others).

When not specified otherwise, the findings can be generalized to both public and personal events that are either recent (approximately 0.5–6 months or up to 1.5 years old) or remote (approximately 1.5–7 years old)20.

3.1 Respondent characteristics

Researchers usually find that some respondents are very precise in dating while others are not (e.g., see Thompson, 1982). This variability can be explained by the stable characteristics of respondents such as gender, personality or abilities. The impact of stable characteristics is moderated by the current state of mind (e.g., tiredness, motivation). To my knowledge, there is no review of respondent characteristics because most of the literature focuses on event characteristics (Bradburn, 1996; Williams, et al., 2008) or the means of data collection (Glasner & Van der Vaart, 2009). However, several characteristics such as gender (Skowronski, Betz, Thompson, & Shannon, 1991), age (Howes & Katz, 1992) or ability to reconstruct the date (Friedman, 1993; Janssen, et al., 2006) received greater attention. I will therefore focus on some other promising characteristics (predictors) as well. These are education, personality, and interests.

Some of the predictors describe the mechanism why certain people estimate the date more correctly (e.g., memory for dates), while others only point out that there is a difference among people (e.g., gender, age) and the mechanism can be attributed to other predictors or is not known yet.

Knowing the demographics is worthwhile because if they can differentiate people according to dating accuracy, researchers can easily choose the group which is more credible. Apart from age, gender, work, and educational level, which will be examined in the following text, other demographics are worth mentioning as well. For example, race, socio-economic status and location certainly have an impact on dating accuracy, but the lack of evidence does not allow making generalizations. Most of the empirical studies reviewed in my dissertation were conducted in European countries, Israel, USA or and the generalizations to other territories should be made with caution, because of the different calendars used, different lifestyles or the relevance of exact time measures21. For example, Asian cultures are more interdependently oriented and value social rules, group harmony and solidarity more than Western cultures (Williams, et al., 2008). Eastern mothers, in comparison to Western mothers, also use a low-elaborative reminiscing style with their children, which has an impact on the memories they remember and on their reminiscing style in adulthood (Wang & Brockmeier, 2002). Another limitation is that respondents in most studies are quite well educated (low-educated people in particular are missing) Caucasians22, and it can be expected that people with low socio-economic status are underrepresented as well. If research participants come from an atypical population it will be mentioned. All my studies presented in chapters 5 to 7 have similar limitations with regard to race and education and the results can thus be easily compared.

Ability to retrieve or reconstruct the date is an important source of dating accuracy. People may have a good or poor memory for dates. However, the ability to date is not only about the storage and retrieval of exact dates or approximate temporal information (seasons) from memory. In most cases the key is the ability to reconstruct the date from the resources respondents have. The reconstruction is done by putting various pieces of information together with the help of heuristics (Baron, 2008; Friedman, 2004). These two abilities are interrelated. For example, the more dates people remember, the more easily they reconstruct dates they do not remember exactly, because they have more temporal information from which the date can be reconstructed.

Another type of respondent characteristic affects what sort of information respondents tend to remember. I will focus on personality and interests. The mechanism is usually that people remember mostly the information which they find to be relevant (Wagenaar, 1986; Williams, et al., 2008).

The current state of mind, or in other words actual affect, is the last type of respondent characteristic I will explore. These predictors moderate the ability to retrieve or reconstruct the dates. For instance, a person who is tired and demotivated will date less accurately than a rested and motivated person with the same dating ability, especially when the task difficulty is high (Kunda, 1990; Martin & Jones, 1984). Apart from obvious pathological states like depression or anxiety, the most relevant and easily assessed actual states are motivation, tiredness and mood.

20 The reason for choosing these two recall periods is that studies I and II deal with “remote” events (2–5 years old) and study III with recent events (0.5–3 months old). The delimiting line of 1.5 years is arbitrary. See chapter 2.3 for more details.
21 For example, Axinn, Barber, and Ghimire (1997) had to use major public landmarks in their neighborhood history calendar in Nepal because many of their participants did not use calendars in their everyday life and used major events and their sequences instead.
22 This does not apply to more recent studies from the USA or Canada, where many undergraduates are of different races as well.

Age

According to Rubin, Wetzler, and Nebes (1986), encoding (how much people can learn) and retrieval (how much and how easily they retrieve the wanted information) are affected by the age of respondents, but there is a lack of evidence that retention (the amount of remembered memories) is affected by age as well. Simply put, once older people store something, and their memory is not pathological, the retention will be similar to that of people of any age, even though it may be somewhat harder to retrieve memories.

When older respondents are asked to recall memories from their life in free recall or cued recall, their life-span retrieval curve shows four distinctive periods (Janssen, Chessa, & Murre, 2005; Rubin, et al., 1986). Figure 3.2 presents the idealistic representation of the life-span retrieval curve for respondents who are 40 years old23. 1) At the beginning there are nearly no memories from before three years of age (infantile amnesia); 2) Then the number of recalled memories quickly grows and reaches its maximum between the ages of 10 and 35 and especially between 15 and 25 (reminiscence bump); 3) The bump is followed by a decrease in recalled memories. This period may be long when participants are older (over 40) or very short (or absent) when they are younger. I call this period “ordinary years”, because for many people nearly all of the most salient events already happened in the previous bump period (e.g., high school, first love, marriage, first children); 4) The “recency period” describes the phenomenon that people recall more memories the closer they are to the present. Conway, et al. (2005) found that the reminiscence bump was similar for five different cultures although the content of memories was different (e.g., Chinese people recalled more group memories).

23 Recalled memories are plotted in terms of the age of encoding. The shape of the curve is inspired by Conway, Wang, Hanyu, and Haque (2005) and Janssen, et al. (2005). I adjusted it to 40-year-old respondents, who are the oldest respondents in studies II & III.

Figure 3.2. Idealistic representation of the life-span retrieval curve for 40-year-old respondents.

Janssen, et al. (2005) studied the impact of age, gender and education level on this ideal retrieval curve in Holland and in the United States and found that for the age cohort of 31–40 years the reminiscence bump was present but rather weak in comparison to older cohorts (Figure 3.2 exaggerates the bump for this group and is more appropriate for people close to 40 or older). The reason is that the reminiscence bump does not end but is immediately followed by the recency period. The longer the period of “ordinary years”, the higher the chance that a reminiscence bump will be found. It is surprising that the results do not show an influence of education level, because one would expect more educated people to show a shift towards more recent years, as some major life events happen a bit later (e.g., first children). Women showed the reminiscence bump slightly earlier than men, which could be attributed to earlier maturing.

Rubin, Wetzler, and Nebes (1986) argue that the reminiscence bump can be explained by a) the novelty of events during this period (many events are first-time experiences), b) a lack of proactive interference because of the novelty of events, c) the sampling of memories (people may sample events from a certain period more often), d) better encoding at a certain age (adolescence and early adulthood), e) memorable events that happen more often during some years. The last explanation is supported by life scripts, which describe that in a typical life major life events happen in a certain order and at a certain time (love >>> marriage >>> children) (Rubin, Berntsen, & Hutson, 2009).

Recent studies found that the reminiscence bump appears only for positive events and that negative events most often show decreasing retention over the years (Berntsen & Rubin, 2002; Leist, Ferring, & Filipp, 2010). Glück and Bluck (2007) show evidence that not all positive events lead to a reminiscence bump, but only positive events over which participants perceived control. They explain it by the notion that during this life period people exercise control over their lives and make many consequential life choices.

The life-span retrieval curve offers some valuable insights into date estimates, especially when distinguishing between three major types of life events: normative age-graded events, history-graded events, and non-normative events (Baltes, Reese, & Lipsitt, 1980).

Normative age-graded events occur in a certain life phase because of biological or sociocultural reasons and usually have a typical sequence (primary school >>> high school >>> first work >>> marriage). Many of the memories from the reminiscence bump are of this type. Non-normative events do not depend on the historical era or age (e.g., moving to another house). History-graded events can produce history bumps which are cohort related. For example, Czech people who are now24 40 will have a “Velvet revolution history bump” at the age of 18 and people who are 50 will have this bump at the age of 28.

Fradera and Ward (2006) found that older participants (Mage = 74) were not better at dating major public events from their life than younger participants (Mage = 20), who knew them only as semantic facts and did not experience them. The older group knew many more facts (contextual knowledge) about the events but this did not increase the dating accuracy. On the other hand, when participants in the young group had high knowledge of an event, this was related to increased dating accuracy. The results show that knowledge itself is not sufficient for accurate date estimates and other factors may play a role.
For example, the older group could be worse because most public events were of a repeated nature for them, and the younger group could have fresher semantic knowledge of the dates from school. Another study by Howes and Katz (1992) found that the older group (Mage = 68) had worse memory for public events than the middle-aged group (Mage = 48). It would also be interesting to compare the younger group with the middle-aged group, who lived through the public events in their reminiscence bump period and for whom the events are still relatively recent.

For people interested in public events, history-graded events may play an important role in dating not only public events but possibly some associated personal events as well (see chapter work, lifestyle and interests).

More normative age-graded events happen during the “reminiscence bump” period, and these events are often first-time experiences and landmark events. A sample of personally relevant events from this period will thus probably be dated more accurately than a sample of events from more recent years, with the exception of very recent years. Another reason is that age-graded events have a connection to a temporal schema, e.g., the person knows at what age she attended school or began to study at university. Even though older people show clearer reminiscence bumps than younger adults (Janssen, et al., 2005), it can be hypothesized that younger people will be more accurate in dating events from the reminiscence bump because these events are not only more memorable (which is similar for people of any age), but also more recent, with less retroactive interference.

In most European countries, including the Czech Republic, the age at which some important choices are made (such as marriage or first children) increased considerably in the last 20 years. For example, the average age of women having their first child increased from 24.9 (2000) to 27.3 years in 2008. The average age of having the first child seems to be higher in more developed countries (UNECE, 2010)25. I would expect that the reminiscence bump will be somewhat wider and moved toward older age and that the period of “ordinary years” will be moved forward, too, especially for more educated people. But it may happen that this shift will appear just for some events (marriage, first child) and not for all other salient events (e.g., school years, first love, trips, first salary), which would be in line with the above-mentioned finding of Janssen, et al. (2005).

Non-normative events that come from the reminiscence bump period should also be more accurately dated if they are associated with age-graded events (or life-time periods). Another reason is that many non-normative events also happen for the first time during the reminiscence bump, and these events are often landmarks that are remembered well (Shum, 1998).

24 End of the year 2011.
However, when there is no association with other events and the event is not a landmark in itself, then these events should be dated worse than age-graded events.

Events that happen after the reminiscence bump period (and do not come from the recency period) may of course be accurately dated too, but it should generally be more difficult because fewer age-graded events and first-time events happen in this period. But if these events have other characteristics of well-remembered events, like importance, occurring on a memorable date or at transitions (Bradburn, 2000; Thomsen & Berntsen, 2005; Wagenaar, 1986), then any events from any time period may be correctly dated. Other non-landmark and non-normative events and events lacking temporal hints (such as a temporal schema or association with other time-tagged events) should degrade according to the normal retention curve (see below).

25 It was for example 30.0 years in ; 28.1 in Norway. 26.6 in ; and 25.5 in (UNECE, 2010).

Gender

It should surprise no one that women are better at remembering dates than men... That difference has caused many a tense moment in marriages. (Bradburn, 2000, p. 58)

“My wife knows best”26 is a stereotype that can be applied to autobiographi- cal memory in general (Bloise & Johnson, 2007; Pohl, Bender, & Lachmann, 2005) as well as to event dating. In most of the studies (but not all) women tend to recall autobiographical events slightly better than men and their dat- ing accuracy is better as well (Auriat, 1993; Skowronski, et al., 1994). Women’s autobiographical memory is described as qualitatively differ- ent from men’s. For example Bloise and Johnson (2007) mention in their in- troductory section that women’s memories tend to be richer, more emotional, more complex, longer, more detailed, embedded in richer context with more references to other people and events. Despite of this fact Fivush (1998) warns that the gender differences in autobiographical memory are still quite speculative and may be at least par- tially explained by “the ways in which mothers and fathers reminisce with their daughters and sons across the preschool years” (p. 81). She also found that the gender of the child influences the way of conversations more than the gender of the parent. According to Davis (1999) daughters learn to elaborate the memories more deeply and also take more attention to the emotional as- pects of events. Sons on the other hand learn to communicate more pro- grammatically. Because of the changes in the socialization one would expect that these gender differences should become less obvious over the years, at least in the western countries. Davis (1999) mentions that the lack of emotion- al and relevant content explains why in the laboratory or clinic studies re- searchers usually do not find big (if any) gender differences. Fivush (1998) emphasizes that the reason for a better autobiographical memory of women

may be that personal memories are more important for their identity. Skowronski and Thompson (1990) also point out that women more often have the family role of the temporal keepers, keeping dates of birthdays, anniversaries or weddings.

26 Title of the article by Nadia Auriat (1993).

Bloise and Johnson (2007) investigated whether the difference in recall is due to the emotionality of events. Participants had to read two scripts depicting a conversation of a married couple. One script was about house remodeling and relationship issues and the other about an upcoming vacation and relationship issues. There were three experimental conditions. In the emotional focus condition the participants had to focus on emotional issues (relationship), in the neutral condition on trip and house issues, and in the undirected condition there was no instruction on what to focus on. After a 15-minute retention period women recalled more emotional information than men in all conditions (η² = .10). Women also recalled more neutral information than men, but only in the neutral focus condition (η² = .42). In the undirected and emotional focus conditions no gender differences in neutral information were found, which implies that women’s focus is not naturally on neutral aspects of events, but if they have to attend to them they still benefit from better memory. The researchers also wanted to find out whether women’s ability is related to their higher emotional and interpersonal ability. The Emotional and Interpersonal Sensitivity Measure27 (EISM) showed that in general women score higher on this measure than men. However, the correlation between emotional recall and EISM in the undirected condition was higher for men (r = .67) than for women (r = .45). The researchers found that emotional sensitivity mediated the gender difference in the recall of emotional information, and when both were entered into a multiple regression, gender was reduced to a non-significant predictor.
This suggests that when men are more emotionally sensitive they can be as good at recalling emotional events as women, and that the individual level of emotional sensitivity is a stronger predictor than gender alone.

Skowronski, Betz, Thompson, and Shannon (1991) recommend caution when interpreting the gender differences, because there is a large spread in dating accuracy among people (both women and men may have extraordinary or poor memory for dates) and at least four explanations of this gender difference can be found: a) women are better at encoding exact dates; b) women have a more developed temporal schema than men; c) women more

likely encode partial temporal information that later helps them to date better (e.g., the season); and d) women recall the details of events better.

27 Twelve statements rated on a 5-point scale from 1 (Not at all like me) to 5 (Exactly like me). Most items are from the Social Skills Inventory (see Riggio, 1986).

In the diary study, Skowronski et al. (1991) did not find gender differences in how well the event was remembered. Every event memory was rated on a 7-point scale (1 = not at all; 3 = fairly well; 7 = perfectly). They conclude that gender differences in dating accuracy thus cannot be due to differences in memory. This conclusion seems to me a bit hasty, because the memory ratings may be at least partially influenced by differences in question comprehension (and answering) or in self-evaluation between women and men. For men, a “fairly well” remembered event may actually be remembered less well than a “fairly well” remembered event for women. In this study women’s dating error was on average 1.2 days smaller than men’s (M = 8.43 versus M = 9.65). These differences are mostly due to the fact that women were particularly accurate when the events were self-related and recent and that men were particularly inaccurate in dating older other-related events. Women provided the exact date on average more often than men (M = 0.31 versus M = 0.24), especially for recent self-related events. For self-related events the difference between women and men was M = 0.35 versus M = 0.25, whereas for other-related events the difference was negligible (M = 0.28 for women versus M = 0.25 for men). This implies that the gender differences are partially due to women’s more exact time-tags. When the authors analyzed the multiples of 7 (the dates where the day of the week was correct but the exact date was not), they did not find any gender difference in dating accuracy. This implies that “non-exact temporal tags” do not help women more than men.
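The three measures discussed above (mean dating error in days, proportion of exactly dated events, and errors that are multiples of 7) can be illustrated with a minimal sketch. It assumes that each diary event has a true date and a reported date; the function names and the sample data are hypothetical and are not taken from Skowronski et al. (1991).

```python
from datetime import date


def dating_error_days(reported: date, actual: date) -> int:
    """Signed error in days; positive means the event was dated too late."""
    return (reported - actual).days


def summarize(pairs):
    """Summarize dating accuracy for an iterable of (reported, actual) dates."""
    errors = [dating_error_days(r, a) for r, a in pairs]
    n = len(errors)
    exact = sum(1 for e in errors if e == 0)
    # A non-zero error that is a multiple of 7 means the day of the week
    # was correct but the exact date was not.
    weekday_only = sum(1 for e in errors if e != 0 and e % 7 == 0)
    return {
        "mean_abs_error": sum(abs(e) for e in errors) / n,
        "prop_exact": exact / n,
        "prop_weekday_only": weekday_only / n,
    }


# Hypothetical diary events: (reported date, true date)
sample = [
    (date(2011, 3, 10), date(2011, 3, 10)),  # exact date
    (date(2011, 3, 17), date(2011, 3, 10)),  # one week too late
    (date(2011, 3, 8), date(2011, 3, 10)),   # two days too early
    (date(2011, 4, 1), date(2011, 3, 11)),   # three weeks too late
]

summary = summarize(sample)
```

Computing such a summary separately for women and men, and for self-related and other-related events, would reproduce the structure of the comparison, though not the original analysis itself.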
The authors also investigated whether event content (theme) could cause the gender difference, because women might have written down more “easy to date” events such as birthdays. All events were divided into 66 categories and women were better in 42 of them. The frequencies of events in the categories were similar for both genders. When two categories were used (events often stored with the exact date versus events not often stored with the exact date), women were better in both, and much better in the “easy” category. Even though it seems that women are generally better at dating, there may be some areas where men are better. The study indicates that men were better for events involving home improvements or a car or other vehicle. It should be pointed out that this study may be biased by the

sample of undergraduate students, for whom gender differences in event content may be smaller due to similar university experience. There may also be a gender difference in the ability to take advantage of recall aids such as calendar instruments (Reimer & Matthes, 2007). This will be explored in chapter 3.4.

Studies of gender differences in dating public events do not yield consistent findings: some found gender differences and some did not (cf. Botwinick & Storandt, 1980; Howes & Katz, 1992). It seems that public event dating depends much more on the type of events used. If, for example, political events are used, there is a higher chance that men will be more accurate because they are more often interested in politics.

Educational level

Educational level may have an impact on the events people experience and on the quality of memory as well. Janssen, et al. (2005) did not find any differences in the reminiscence bump with respect to educational level. This means that irrespective of the age at which people left school, the pattern of the reminiscence bump was similar. It also implies that people with different educational levels still experience most salient events at quite similar ages, even though the events can differ in theme (e.g., more university-related events for educated people and more work events for less educated people from the same cohort). Personal memories should thus be quite similar in many respects.

Studies examining the quality of memory are mostly clinical and focus on elderly people. These studies show that highly educated people suffer less and later in life from memory disorders and also complain less about their memory (e.g., Jonker, Geerlings, & Schmand, 2000; Schmand et al., 1997). According to Van der Vaart (1996), more highly educated people may have a greater cognitive potential, which can lead to better memory. The author also concludes that a relationship has hardly ever been found in healthy individuals.

As mentioned in the introductory chapter, people with very low education are missing from most memory studies. It can be argued that these people will probably have worse memories because some of them may have a mild intellectual disability, which is often connected with problems in estimating the temporal aspect of events (Sigelman et al., 1983). But for the majority of people who have at least finished high school without a graduation exam (sometimes referred to as “vocational school”; bricklayers, carpenters, waiters) or some higher education, and who do not have an intellectual disability, it can be expected that educational level does not play a significant role. On the other hand, public events should be dated more accurately by people with higher education because educated people have better general knowledge (Cohen, 2008a). This is supported by the study of Howes and Katz (1992), who found positive relationships between educational and intellectual measures and recall of public events for both older and middle-aged groups of respondents.

Educational level may also have an impact on dropout from a study. For example, Kristo, Janssen, and Murre (2009) found that respondents who dropped out of their study had a lower education than those who finished it.

Memory for dates

There are probably some differences among people in innate memory abilities; in this sense, there may be such a thing as a good memory and a bad memory. However, the important point is that even if there are such innate differences in memory, except for a few extreme cases these innate differences are not nearly as important in the ability to remember as are differences in learned memory skills. (Higbee, 1996, p. 6)

People with a good memory for dates should obviously be better at dating than those with a poor one. The nature of good or expert memory is, however, more complex and has at least three sources: talent, i.e., an innate good memory for dates (storage and retrieval) and the ability to reconstruct dates; acquired knowledge, i.e., learned dates or other contextual information leading to date estimates; and acquired mechanisms, i.e., acquired learning strategies (e.g., chunking) (Ericsson, 2009b; Scott, 2007). The roles of acquired knowledge and mechanisms are especially well supported because even people with no talent can improve their memory for dates a lot. This is also reflected in the vast literature focusing on memory improvement (see e.g., Fry, 2011; Scott, 2007; Turkington, 2003) or expert knowledge in general (Ericsson, 2009a; Sternberg & Grigorenko, 2003).

The date reconstruction strategies are thoroughly explored in chapter 2.3. Asking respondents to talk aloud when dating, or probing after they answer temporal questions, may bring some insight into the strategies respondents use (Willis, 2005), but these strategies are often partially or totally unconscious (Cohen, 2008b; Talarico & Mace, 2010) and people report that the date “just appeared” (see e.g., AP, 2008).


Assessing talent for dating is difficult because researchers need some set of events against which to evaluate it. For example, if researchers use a set of 35 well-known public events from 2005 to 2008 (as I did in Study I), the relevance of sport events will differ between people who are interested in sports and those who are not. Thus talent can be assessed only in some areas. This is supported by the fact that even memory experts have areas where they do not perform better than average people, because that area is of no interest to them (Wilding & Valentine, 2006). The same applies to personal events. “First child tooth” is often more relevant for the parent who stays at home with the child than for the partner who is at work28. Such single tests of memory for dates thus measure acquired knowledge and mechanisms rather than talent. However, if the respondent scores high at dating events from diverse areas, researchers can be more confident that talent plays a role. This assessment may be time-consuming, and it seems reasonable to look for easier ways to do it.

An easy solution to the problems with memory tests could be a self-assessment of memory for dates. Cohen (2008b) points out that people have first-hand experience with failure and success in their everyday life. If a wife usually knows the date of most family events and her husband does not, then he can conclude that his memory for dates is poor. Such self-ratings are quite reliable, but their validity is questionable. Cohen (2008b, pp. 12–13) lists the major issues which have been identified (shortened):

• Self-assessment may reflect a person's self-image rather than his or her performance, and be distorted by modesty, pride, anxiety or depression.
• The Paradox may operate, so that people who make the most errors are least likely to report them because they forget they have occurred.
• Individual variation in the opportunity for error may also distort results.
For example, some individuals may assess their memory for faces as excellent, but have few demands made on it because they seldom meet many new people. Differences in the opportunity for error may explain why elderly people paradoxically report fewer memory lapses overall than the young.

28 Using personal events in a set of events for testing the memory for dates has an additional problem because researchers have to obtain the true date of all events, which may be difficult.

• Using memory aids like diaries, address books, shopping lists, or knotted handkerchiefs may protect an individual from memory failures, so that few actually occur even though memory is poor.
• Questions that ask “How often” or “How good” are ambiguous unless they specify a reference point. Providing an objective scale (e.g., instructing the participants that “often” should be taken to mean about once a week) or specifying comparisons (e.g., “How good are you as compared with an average person of your own age?”) helps to increase precision.
• Low correlations between psychometric tests and self-assessments may be due simply to the fact that they are measuring different things. If so, it becomes necessary to ask which type of memory ability (the kind assessed by subjective reports or the kind measured by formal tests) is most important or relevant.
• The knowledge that people are asked about in questionnaires may be implicit, so that they do not have explicit awareness of this knowledge.

Despite these objections, Cohen argues that if these self-report data are interpreted with caution they can be valuable and valid. This is supported by studies where the self-rating was significantly (though usually weakly) correlated with a wide range of memory tasks (e.g., Wilding, Valentine, Marshall, & Cook, 1999). When applying this to dating accuracy, however, researchers should be careful: when people rate their memory without knowing the exact type of questions to be asked, they may think about other types of events than those the researchers will ask about. For example, when speaking about important events, the researcher may have in mind events other than those the respondent considers important. Or when speaking about well-known public events, the respondent may think about events from a different domain than the researcher has in mind. Thus, the clearer the type of events that will be asked about, the stronger the relationship between memory self-assessment and dating accuracy should be. This of course assumes that people have experience of failure or success with the studied types of events or at least a good ability to predict their memory performance.


Personality

According to Caprara (2001), the “personality system underlies the distinctive patterns of affect, cognition, and behavior that foster coherence in individual conduct and experience over time and across settings” (p. 11254). The author also highlights that most investigators share the interactionist view that personality is “an open system that develops and functions through continuous and reciprocal interactions with the environment” (p. 11254). Because personality in this broad sense covers, for example, people’s goals, interests and patterns of cognition as well as abilities such as memory or intelligence, nearly all respondent characteristics could be covered under this predictor. In comparison, the so-called personality inventories do not use such a broad definition and focus on basic trait dimensions such as extroversion or neuroticism (abilities are measured by separate tests). This approach seems to me more suitable because personality traits nicely supplement the other predictors mentioned in this chapter and do not overlap with them as much as personality in the broad sense does.

When speaking about personality in this dissertation I will use this narrower approach to personality as a set of traits. Commonly used inventories of normal personality are, for example, the NEO Personality Inventory (NEO-PI-R), Eysenck’s Personality Questionnaire (EPQ-R), or the Sixteen Personality Factor Questionnaire (16PF) (Caprara, 2001).

To my knowledge there is no direct empirical evidence about the relationship between dating accuracy and the above-mentioned personality measures, but hypotheses can be formulated quite easily.
For example, some of the NEO-PI-R facets should correlate with dating accuracy, especially if the score is extreme: anxiety (N1), because anxious people may have problems with retrieval; fantasy (O1), because people who are more receptive to imagination and fantasy may be prone to memory distortion; or depression (N3), because these people may have more problems with memory distortion due to rumination.

The relationship between general cognitive ability (e.g., intelligence) and psychoticism (P) or neuroticism (N) is well documented (see the review by Ackerman & Heggestad, 1997). These correlations are often highly significant but weak (e.g., from .09 to .19 in Austin et al., 2002). The authors also found a negative association between general ability and Eysenck’s Lie scale. This was interpreted to mean that more intelligent people are less prone to socially desirable responding. This relationship is, however, more complex because in

real-world settings neuroticism and lie scales tend to be correlated (see e.g., Jackson & Francis, 1998).

Denomme and Adi (2009) looked for a relationship between false events and the NEO-PI-R. The authors found that respondents’ confusion of false events with true events was related to fantasy (O1; r = .55), compliance (A4; r = .52), and trust (A1; r = .45). All other significant correlations were smaller, and because of the small sample (55 undergraduate students) the confidence intervals were wide (with lower bounds close to zero, implying that the true correlations may be much lower). Openness (O) was the only significant domain (r = .34).

These bits and pieces of evidence imply that studying the relationship between personality and dating accuracy could provide fruitful insights, especially for the less obvious personality traits such as fantasy, straightforwardness or trust. Another interesting area could be examining whether people with different personality traits tend to experience different types of events. However, because many major events are normative age-graded events (Baltes, Reese, & Lipsitt, 1980) within the studied society, this difference will probably be found in less salient and non-normative events only.

The importance of one’s own past and how much people recollect, talk about memories, et cetera could also be an interesting personality trait to study, because some people tend to reminisce more than others (Karniol & Ross, 1996). It seems reasonable to expect that people who think more about the past will remember dates better as well. But because the “when” aspect is often the least important one, it can happen that such ruminations about the past would not help at all or could even cause some bias in memories.
For example, present goals tend to influence memories (Conway, 2005; Karniol & Ross, 1996), and more frequently recalled memories may seem less distant in time than they actually are because they are more easily accessible (N. R. Brown, et al., 1985).

Interests

Interests affect the type of activities people experience and may be highly committed to. Interests may be related to work, lifestyle or free-time activities. For example, Hoferková (2011) found that a person who was a sport commentator knew many exact dates from this field29. This can be explained by the relevance of sport events for the work, by personal interest in sport events and by the effect of repetition (seeing the dates and talking about them often). As will be mentioned in the following chapter covering event characteristics, relevance and repetition are both strong predictors of dating accuracy (Sedlmeier & Betsch, 2002; Wagenaar, 1986).

29 I had the same experience with one politician on political events.

Sport commentators are not the only ones who have to remember the dates of public events. Public event dates are also relevant for journalists, politicians, historians or other professionals for whom knowledge of the temporal aspect of events is important. Personal events, on the other hand, should not be influenced by profession so much. Even though the ability to date many events in one life area (e.g., sport, political or personal events) will in most cases not generalize to other areas (Ericsson, 2009a), knowing many exact dates increases the chance that some of them will serve as temporal landmarks for other events. However, if the work-related or personal interest is not strong, or the temporal aspect of an event is irrelevant, it may happen that people with that interest will not date events from the same area better than other people.

Lifestyle is a concept often used in sociology and concerns the collective pattern of how people live and behave (Stevenson, 2006). Lifestyle is useful in emphasizing the distinctions among groups of people. I am not aware of any study on lifestyle with respect to dating accuracy, but it can be argued that people with similar lifestyles may have similar experiences, value similar events or have a similar approach to valuing or disvaluing the temporal aspect of events30. Lifestyle is related to other respondent characteristics such as age, gender, education, location, and socioeconomic status (Giddens, 2009).
Work and free-time activities are also important because, if they happen on a regular basis, they can take the role of temporal schemata (covered in the next chapter). If there is a week-long training at a respondent’s work every February, she knows that all temporally associated events must have happened close to February. The same is true for other temporal schemata such as a week schema (e.g., a regular meeting every Monday). Even though the regularity of events helps to estimate the date, it can also make the estimate more difficult when a different temporal scale is used. For example, knowing that the training takes place every February helps to estimate the month of this or related events but makes the estimate of the year more difficult because of interference from all the other February trainings.

30 I know several business-world people for whom the temporal aspect is totally unimportant; they live in one continuous flow of events and do not look back more than necessary (this can be a personality difference as well).

People with irregular work or free-time activities cannot profit from the aid of temporal schemata. Quite often, major work or free-time activities are also temporal landmarks, especially when something changes (e.g., a first victory in tennis, a new job). Grafova and Stafford (2009) found that after 13 years people gave only about 10% inconsistent answers about smoking behavior. This is surprisingly accurate, but the reason may be quite obvious. People often quit smoking on some memorable date (New Year, summer holidays), quitting is often planned ahead, and people tend to remember transitions (being a smoker versus a non-smoker) well (Shum, 1998).

Current state of mind

Whenever something is difficult, people “choose”, consciously or unconsciously, how deeply they will process information to reach the answer (Baron, 2008; Kunda, 1990). The quality of the processing is influenced not only by the quality of memory or retrieval strategies but also by the current state of mind, which moderates the effort involved in the processing.

Leaving aside pathological states that heavily influence cognitive performance, such as depression or anxiety disorder (see e.g., Kopelman, 2004; Preiss, 2008), the most influential states are probably level of arousal, mood and motivation.

Motivation will be covered more thoroughly in chapter 3.3, where I explore the possibilities of facilitating respondents’ motivation. However, the motivation to put effort into a dating task may be seen either as a stable characteristic, meaning that people habitually put a certain amount of effort into a task (Humphreys & Revelle, 1984), or as a current state of mind influenced by the actual situation. It is well known that when people try harder, dating accuracy is higher, especially for events whose date cannot be easily reconstructed (Bradburn, 1996; Wagenaar, 1986). Apart from this conscious effort (motivation) there is also an “implicit motivation” that can moderate it (Ferguson, Hassin, & Bargh, 2008)31. Trying hard may, however, be associated with a level of arousal that is too high for successful recall. On the other hand, arousal that is too low (e.g., when people are tired) may also influence recall negatively (Eysenck & Keane, 2000).

Mood not only influences the effort in task processing (Baron, 2008; Preiss, 2008) but can also have an impact on the episodic memories and contextual information that will be retrieved. Anderson (2009) speaks about mood-congruent memory and mood-dependent memory or, more generally, about context-dependent memory. Mood-congruent memory is the bias whereby people in a certain mood (depressive, cheerful) tend to recall mood-congruent memories or aspects of memories more easily (neutral aspects are not affected). Mood-dependent memory describes the known fact that when people store some information in some context (e.g., when they were sad), they will more likely recall it in the same mood. Other contextual phenomena, such as ideas, thoughts, concepts etc. that occupied people’s minds at the time of encoding, may also help in memory retrieval. This is why some aided recall techniques, e.g., the cognitive interview32, facilitate respondents’ recall accuracy by asking them to travel mentally in time and to try to re-experience the feelings they had at that time (Fisher & Geiselman, 2010).

31 By implicit motivation the authors mean goals that “operate in implicit or nonconscious ways” (Ferguson, et al., 2008, p. 150).

3.2 Event characteristics

Most event characteristics cannot be separated from the subject, because they need a respondent’s subjective evaluation (e.g., importance, vividness). I call these characteristics respondent dependent event characteristics. Other characteristics, like the true date of an event, do not need this evaluation since any objective source is sufficient (respondent independent characteristics). It may also be worthwhile to have some of the subjective characteristics, such as importance, assessed by proxies, because this can bring interesting information. For example, proxies may be better at evaluating the “importance” of events when respondents are ashamed to say that a new car is very important to them (self-assessment problems; see Cohen, 2008b, above).

Some event characteristics may be classified as objective, because there is only one correct answer to them (though we may not have the objective source). These are, for example, event recency (the exact date when an event happened), regularity (an event happens every Tuesday), or frequency (an event happened three times in the reference period). Other event characteristics are subjective, because their evaluation depends on each respondent and is prone to change. Among these are, for example, all phenomenological characteristics (e.g., vividness).

32 This is a different technique from the “cognitive interview” used in survey research. In survey research, cognitive interview usually means talk-aloud procedures or probing after people answer a question in order to find out how they arrived at the answer (Willis, 2005).

A third group consists of event characteristics that are not completely objective and sometimes very subjective. I will call this category partially objective. A good example is the event theme. When a lot of information is known about the event, it can be relatively easy to categorize the event according to its theme. Although “a trip to High Tatras” can be “objectively” categorized as a “trip”, a respondent can subjectively categorize it as a landmark event with a relationship theme, because he became engaged to his girlfriend, and the trip itself may be completely irrelevant.

Because subjective or partially subjective event characteristics can be measured differently and at various times, comparison across studies is difficult. For example, the above-mentioned trip to the mountains could be annoying for a couple at first, but several months later this information could be suppressed when they looked at the “beautiful pictures from the trip and how happy they are to be engaged”.

Many event characteristics have been studied thoroughly in relation to recall accuracy in general, especially how much detail (true or false) respondents remember. Even though this is not specifically dating accuracy, the evidence shows that detailed (better remembered) memories tend to be dated more accurately than less detailed memories (Bradburn, 2000; Friedman, 2004).

Many of the presented characteristics are related to the date reconstruction strategies mentioned in chapter 2.3 and explain why these events are dated more accurately. For example, when people say that they know the date and the event is very vivid, researchers can be relatively sure that the date estimate will be accurate (Thompson, et al., 1993). Other event characteristics do not have such a clear relationship to the reconstruction strategies, but the research evidence shows that they are generally well remembered (e.g., important events; events with high memorability).

I will first focus on the objective event characteristics, then move to the partially objective ones, and end with the subjective event characteristics.

Event recency

As the event gets older, people remember fewer details and dating accuracy decreases as well (Bradburn, 2000). In the case of a single memory, however, there may be other factors playing a role (e.g., importance or temporal schemata), which is why even some remote events resist the influence of time.


This effect of event recency33 on forgetting (the retention function) was described by Ebbinghaus (1885/1964) and has since been verified and clarified many times (see the meta-analysis by Rubin & Wenzel, 1996). For most autobiographical events a power function seems to explain most of the variance in retention: approximately 85% for short recall periods and 88% for long recall periods. A logarithmic function explains only 65% of the variance (Rubin, 1982). Rubin and Wenzel (1996) also point out that because the range of the dependent variable is usually quite large in autobiographical studies, the variance explained by the power function may be lower34. This function is similar for people of any age from 12 to 70 for the most recent 10 to 20 years (Rubin, 2000). Bradburn (2010) notes that forgetting curves differ a lot for different types of events. Some events may have a very steep curve while others seem to be forgotten very slowly or not at all.

Even though dating accuracy in any time units is positively related to the recall of events and the amount of error gradually increases, the relationship is not so straightforward (Bradburn, 1996; Friedman, 2004; Van der Vaart, 1996). There are three reasons why this relationship is difficult to predict:

• The curves of dating errors in different time units often have special properties (e.g., bumps at multiples of 7 days or at 12 months, as was shown in chapter 2.4) that disrupt the linear trend of increasing dating error with the passage of time.
• Other event characteristics such as temporal schemata, saliency or event importance often have a much stronger effect on dating accuracy than event recency does (see e.g., Bradburn, 2000; Larsen, et al., 1995; Shum, 1998; Skowronski, et al., 1994; Wagenaar, 1986). This is especially true when the recall period is not too long (e.g., 0.5 up to 6 months for recent events when the exact date is estimated; or 2 to 5 years for remote events when month and year are estimated).
When the recall period is very long (e.g., 0.5 month up to 2 years) the impact of event recency on dat- ing error in days will of course be much larger. The question is also if

33 Some authors prefer the term „recency of events“ (e.g., Van der Vaart & Glasner, 2007a) and others „age of events“ (e.g., Burt, Kemp, & Conway, 2001). I will mostly use the term “recency” when speaking about events and “age” when speaking about age of respon- dents. 34 This issue is quite complicated because explained variance is influenced a lot by the studies chosen. It may be even 97% but also much less. Even some other functions may explain a lot of variance such as logarithmic, hyperbolic, and exponential. For more detail see Rubin & Wenzel (1996) and Rubin (1982). 56

for 2 year-old events the exact date is an appropriate time unit (see a chapter 3.4 for discussion on appropriate time units). • The means of data collection may have impact on the relationship of event recency and dating error as well. An effect may have the wrong choice of time units, use of calendars and other recall aids, and bounda- ries of the recall period (see the chapters devoted to these topics for fur- ther details).

Because of the above-mentioned arguments, results from the literature should be generalized with caution. Rubin & Baddeley (1989), for example, found a nearly linear decrease of dating accuracy for 76 colloquia talks (which were divided into 11 time periods), where the recall period was 2.5 years long. The increase of the dating error was about 0.4 days per day, which is approximately twice as big as in the roommate study of Thompson (1982) or other diary studies (Linton, 1975; Rubin, 1982). The probable reason is that events in the Rubin and Baddeley study were not so personally relevant and, even though they were distinct, the act of attending colloquia was repeated, which caused interference among the colloquia.

This trend of decreasing dating accuracy with time is, however, often violated close to the boundaries of the recall period³⁵, which is why the functions (especially the linear function) do not fit the beginnings of the distribution well. The same applies to remote events when the remote boundary is known: in this case remote events close to this boundary may have more accurate date estimates than less remote events that are not so close to the boundary (see chapter 3.3). This is supported by the study of Skowronski, et al. (1991), who found that dating error increased quite linearly from about 2 days to 15 days over the 75-day-long recall period, with a mild increase in accuracy at the end (most remote events). Respondents kept diaries during this period (if they knew that an event was very old, they knew it could not be older than the time when they started keeping the diaries).

Janssen et al. (2006) found that when well-known public events (from very recent up to tens of years old) were dated in absolute time format, the best-fitting function of errors was the logarithmic function (explained variance was only slightly more than 20%), while it was a linear function when relative time format was used (explained variance nearly 70%)³⁶. These findings show that when really remote events are dated the question format may play a role (see chapter 3.4 for more details).

35 People usually know from which period all the events come because they have to keep the diary during this period (see more in chapter 3.3).
36 The dating accuracy was higher in absolute time format.
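The comparison of candidate retention functions discussed in this section (power versus logarithmic) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the retention proportions are made up (generated from a power function) and do not come from Rubin (1982) or Rubin & Wenzel (1996), and the helper names (linear_fit, fit_power, fit_logarithmic, r_squared) are hypothetical. The power function y = a·t^(−b) is fitted by ordinary least squares on log-transformed data, the logarithmic function y = a − b·ln(t) by least squares on log-transformed time only, and the explained variance of both is computed on the original scale.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Proportion of variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return 1 - ss_residual / ss_total


def fit_power(ts, ys):
    """Fit y = a * t**(-b) via a straight line in log-log space."""
    slope, intercept = linear_fit([math.log(t) for t in ts],
                                  [math.log(y) for y in ys])
    a, b = math.exp(intercept), -slope
    return a, b, [a * t ** (-b) for t in ts]


def fit_logarithmic(ts, ys):
    """Fit y = a - b * ln(t) via a straight line in ln(t)."""
    slope, intercept = linear_fit([math.log(t) for t in ts], ys)
    return intercept, -slope, [intercept + slope * math.log(t) for t in ts]


# Hypothetical data: retention interval (e.g., in weeks) and proportion recalled,
# generated from a power function purely for illustration.
delays = [1, 2, 4, 8, 16, 32, 64]
recalled = [0.9 * t ** -0.3 for t in delays]

a_pow, b_pow, pred_pow = fit_power(delays, recalled)
a_log, b_log, pred_log = fit_logarithmic(delays, recalled)

r2_power = r_squared(recalled, pred_power := pred_pow)
r2_log = r_squared(recalled, pred_log)

print(f"power:       a={a_pow:.3f}, b={b_pow:.3f}, R^2={r2_power:.3f}")
print(f"logarithmic: a={a_log:.3f}, b={b_log:.3f}, R^2={r2_log:.3f}")
```

With real data the ranking of the functions depends on the studies and recall periods chosen, as footnote 34 stresses, so the sketch shows only how such a comparison might be computed, not which function should win.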

Typicality, regularity and frequency

Both laboratory and diary studies have shown that events that are consistent with a person’s life (typical events) or inconsistent with it (atypical events) are recalled and dated more accurately than events that are neither typical nor atypical (neutral events) (Hastie, 1980; Skowronski, et al., 1994). For example, when a student usually receives B’s or C’s, an exam on which she received a B or C will be rated as a typical event, while receiving an A or D will be rated as atypical. Another finding is that atypical events tend to be recalled and dated even better (in days) than typical events (Skowronski, et al., 1991). Event typicality was also found to be related to the accuracy of event duration estimation, suggesting that when an event is typical (e.g., a summer vacation is two weeks long) its duration will be estimated more accurately (Burt, 1993).

The issue of (a)typicality is nevertheless more complicated (Bradburn, 2000). A typical event can also be a frequent event and possibly a regular one as well. Regular events tend to be dated very well, but not in all time units. For example, events that happen regularly every Tuesday will have a very accurate DOW estimate, while the week or month may be very difficult to estimate because of interference from similar events (Larsen, et al., 1995). Some events may be frequent but not regular, which may cause big difficulties in date estimates because people may not be able to distinguish among the similar events. Low-frequency events, and especially first-time events, are usually dated very accurately if they are important too (Bradburn, 2000). All these arguments are also the reason why the whole dissertation is focused on unique events where people remember the specific information connected to the event. Frequency estimates and processing are thoroughly covered by Sedlmeier and Betsch (2002).
In summary, whenever event typicality, regularity, or frequency are the important characteristics, the researcher has to think about whether these characteristics have a positive or negative effect on dating accuracy.


Self–events and other–events

Events in which people participated (self–events) are generally better recalled and dated than events they encountered only secondhand (Betz & Skowronski, 1997; Skowronski, et al., 1994). This difference between self–events and other–events is often substantial and highlights the commonsense knowledge that what happens to us is more important than what happens to others. Thompson, Skowronski, & Lee (1988) found in their study with a 10-week recall period that respondents reported 9% time-tagged self–events in comparison to only 4% time-tagged other–events. Another diary study conducted by Skowronski, et al. (1991), where the recall period was one academic quarter (participants had to keep a diary of personal events as well as proxy events for 10 weeks), showed more DOW tags³⁷ for self–events; the dating error was on average 8.45 days for self–events and 9.62 days for other–events. The regression analysis showed that other variables (e.g., typicality, pleasantness, recency) had similar effects on both self–events and other–events. According to Skowronski, et al. (1994) this suggests that proxy estimates are not biased differently than self–event estimates, only the dating accuracy is worse. People also tend to remember positive self–events better than negative self–events (Skowronski, et al., 1991) and some studies found that negative other–events were dated better than positive other–events (Betz & Skowronski, 1997). This advantage of self–events is sometimes referred to as the self-reference effect. A meta-analysis conducted by Symons and Johnson (1997) found this effect in most memory tasks (though not in some neutral tasks such as memorizing noun categories).

Event theme

The self–events vs. other–events distinction shows that events in which people participate are generally dated more accurately than events about which they only heard. Self–events and other–events can also be seen as one aspect of an event theme, specifically “who was involved”. Event theme is one of the basic characteristics of human memory, because thematic order is supposed to be one of the ways in which episodic memory is organized (Burt, Kemp, & Conway, 2003; Conway, 2005). Some events are generally more important than others (they usually also have a scripted sequence or are age-graded), so events such as a graduation exam, one’s own wedding or one’s own children, or buying a first house will generally be dated accurately (many of these events come from the reminiscence bump period; see chapter 3.1). Skowronski et al. (1991) also found some evidence that events from some thematic areas are better dated by women (most areas) while other areas, such as home improvements or activities involving a car or vehicle, were better dated by men. Unfortunately, their evidence is rather weak and I did not find other studies focused on which thematic categories are better dated by which gender. This lack of evidence is not a big problem because it can be relatively easily inferred that, for people with certain interests, events from the relevant thematic area will be dated more accurately (see chapter 3.1, section about interests).

37 No significant gender difference was found (women had slightly more “tags”).

Media coverage (applies to public events only)

Public events receive media attention that can affect how they are recalled. Many studies have explored the effect of media type on the accuracy of recall (Katz, Adonio, & Parness, 1977; Snoeijer, de Vreese, & Semetko, 2002). However, because research on the dating of public events mostly deals with real events that happened from several weeks to many years ago, media coverage usually cannot be specified for every respondent, because respondents have probably forgotten from which sources they learned about the event. This is especially true nowadays because many people use multiple sources (internet, TV, newspapers, magazines, books, radio) and thus the information about the event will probably be stored as a semantic fact rather than as an episode. Wicks (1995) highlights that even though people forget most of the discrete news stories (episodes) they acquire the “common knowledge” from the news. This common knowledge may later help in date estimates if it contains some temporal cue. For example, the New York terrorist attacks can hardly be displaced in time because the media very often refer to this event with a time-tag: 9/11, September 11 attacks, 911³⁸. This time-tag is usually incomplete because the year 2001 is often omitted, which may mean that some people will remember the exact day and month but not the year³⁹. An event description may also involve approximate temporal information such as the season (e.g., Winter Olympic Games).

The timing of media coverage relative to an event may also affect dating accuracy. Brown, Rips, and Shevell (1985) hypothesized that culminating events (the media coverage started before the event and ended after the event) should be dated more accurately than initiating events (in which case the media coverage starts after the announcement of the event). The reason is that people may wrongly place the date in the middle of the media coverage interval instead of at the real onset of an event. The middle of the coverage period is thus often a more correct date estimate for culminating events than for initiating events. This hypothesis was not supported by the authors’ data, but it could be argued that their reference period was too long to find such a subtle difference⁴⁰. A shorter reference period, as in my Study I, may provide more appropriate conditions for tracing such a phenomenon.

38 911 is also the emergency number in many countries (e.g., USA, Canada). This information can serve as a recall aid.

Temporal schemata

Temporal schemata refer to general knowledge about time patterns and thus constrain the likely time at which an event could have happened (Larsen, et al., 1995). The DOW of events that have a week schema will generally be dated more accurately than that of events without this schema; the month will be estimated more accurately when events have a year schema, and so forth (temporal schemata are covered in more detail in chapter 2.3). Because public events are generally more difficult to date than personal events it can be argued that the availability of temporal schemata plays the major role in their accuracy (N. R. Brown, 1990; Larsen, et al., 1995). Researchers are aware of this fact, which is why events with temporal schemata are sometimes excluded from the selection of events that are dated (e.g., in the study of Janssen, et al., 2006).

When a public event does not have a temporal schema and the date is not known for other reasons (e.g., a repeatedly mentioned date such as 9/11 in the previous section), then especially more remote events should be very difficult to date. Lack of temporal schema knowledge can thus result in date estimates that are close to a pure guess.

Personal events also have temporal schemata that help to increase their dating accuracy (Larsen, et al., 1995) but their accuracy does not depend on them as much. The reason is that people usually have more date reconstruction possibilities when dating personal events because they can use the rich contextual information that is remembered with personal events (Bradburn, 2000; Conway & Loveday, 2010; Williams, et al., 2008).

39 The American date format may cause another problem: 9/11 means the 9th of November in Europe but September 11th in the USA (which is the true date).
40 Their events (N = 72) were from 1976 to 1981. The media coverage of most events usually lasts no more than several months, which could make it difficult to find such a subtle difference in a 5-year-long reference period.

Phenomenology of memories

Memories produce intense phenomenological experiences. Our memories have the power to move us to laughter or tears. Memories of hard-won achievements or humiliating rejections may arouse intense feelings of pride or shame. In fact, our most personally meaningful memories are defined by their phenomenology. (Sutin & Robins, 2007, p. 390)

Seeing that memories produce so many and such intense phenomenological experiences, it would be useful to know whether at least some of them are related to recall accuracy. There is no agreement about what the most relevant phenomenological dimensions of memories are. One of the most recent attempts to identify these dimensions was made by Sutin and Robins (2007), who selected vividness, coherence⁴¹, accessibility, time perspective, sensory detail, emotional intensity, visual perspective, sharing, distancing, and valence. I will add some more relevant phenomenological characteristics not mentioned by these authors, such as importance, typicality, or memorability.

Phenomenological characteristics can be measured in different ways (e.g., self-report, observation) and at various times (right after an event ends or at the time of the interview), which is why it is difficult to compare results across studies. As Williams, et al. (2008) highlight, “it appears that the characteristics of an event at the time of encoding only affect memorability if the same qualities are still present at the time of recall” (p. 40). This is why even an event that was very important at the time of encoding may be forgotten years later because it became insignificant.

41 I will not deal with coherence, which is defined as “the extent to which the memory retrieved involves a logical story in a specific time and place rather than fragments of the original experience or a merging of many similar experiences” (Sutin & Robins, 2007, p. 393).

It is not only commonsense experience that suggests that people remember important events better than trivial events (Ley, 1972; Ritchie, Skowronski, Walker, & Wood, 2006; Williams, et al., 2008). Still, some studies (e.g., Linton, 1982) did not find a relationship between importance and the accuracy of recall. The probable reason is that Linton analyzed the relationship between initial ratings of importance at the time when an event happened and subsequent recall. As mentioned above by Williams et al., initial ratings may change (and often do) over time⁴². Larsen and Conway (1997) used a similar procedure in a diary study of two respondents and also did not find any relationship. Even though this could be expected in line with the argument of Williams, et al., it is still surprising that no relationship at all was found. This provides evidence that the importance of many events changes a lot over time.

Event importance is related to many other phenomenological characteristics, e.g., vividness, amount of details, ease of recall, emotionality, or coherence of the event story (see e.g., Ritchie, et al., 2006; Rubin & Kozin, 1984; Rubin, Schrauf, & Greenberg, 2003).

Even though important events are generally better remembered and dated, this does not apply to all of them. Event importance alone does not imply that the fact (date) is also interesting for the person (Wade & Adams, 1990). For example, a wedding is generally a highly important event but the fact that it happened on July 20, 1999 may not be interesting at all for people who do not celebrate anniversaries.

Life-time periods are often marked by landmark events that tend to be important (Shum, 1998). Because of this it could be hypothesized that more important events happen at the boundaries of life-time periods or other shorter boundaries.
But Kurbat, Shevell, and Rips (1998) found that important events in their study did not show this tendency and the distribution of events was independent of the boundaries⁴³.

42 This is supported by Linton, who realized that her concurrent ratings of event importance did not correspond closely to her previous ratings.
43 Longer recall periods (tens of years) would probably find that important events are often those that are landmarks demarcating the life-time periods.

Sharing memories is one of the most fundamental functions of memory (Neisser, 1988). People share events because they are important or interesting, and they more often share events that are positive (Sutin & Robins, 2007; Williams, et al., 2008). The relationship of sharing with dating accuracy is not so obvious, because people may share aspects other than the date. However, when people share events more often, the probability that these events will be dated more accurately should generally be higher.

Vividness and the details remembered have also been found to be positively related to the accuracy of date estimates (Bradburn, 1996; Friedman, 2004). Both characteristics are also crucial in the trace decay theories of event dating mentioned in chapter 2.3. These theories relate the amount of detail or vividness to the accuracy of dating; this strategy is used especially for more remote events (Janssen, et al., 2006). For example, Larsen and Thompson (1995) found that the correct DOW was more frequent for high–vividness events (55%) than for low–vividness events (44%). Burt, Kemp, and Conway (2008) found that vividness ratings were positively related to the correct ordering of sequences of memories. Vivid memories also tend to be personally important (Rubin & Kozin, 1984) and more accessible (Sutin & Robins, 2007). Because vividness and sensory details⁴⁴ are highly correlated I will use the joint category Vividness/Details in my dissertation.

Similar to the details remembered is the amount of knowledge about the event. This phenomenological characteristic is used especially when public events are dated. Brown, Rips and Shevell (1985) as well as Burt and Kemp (1991) found that higher-knowledge events were dated more accurately than lower-knowledge events. However, Fradera and Ward (2006) found that this does not apply in all situations (see chapter 3.1, section about age, for more details). In their study event knowledge was related to better event dating only for the younger group of respondents who did not live through the events.
Older respondents did not give better estimates when they knew more details about the event. The reason may be that for older respondents many events are not the first-time occurrences of similar events and additional knowledge may not be related to the temporal aspect of the events. Lee and Brown (2004) point out that lack of knowledge does not have to result in a completely erroneous date estimate, because people may use other information such as temporal schemata or boundaries.

44 Sutin and Robins (2007) use the concept “sensory details” for all details except those that people can see. I will use it more generally as any specific details remembered about the event.

Sometimes the date just pops up from memory, while most of the time a reconstruction must take place (Friedman, 1993). Betz and Skowronski (1997) found in their 10-week diary study that respondents reported knowing the exact date for 19% of self-events (the average dating error was 1.93 days; percentage of correct answers 79%) and 12% of other-events (the average dating error was 2.79 days; percentage of correct answers 79%). When the exact date is recalled (known), the dating error is usually small in comparison to cases where people report not knowing the date and reconstructing it somehow. For example, Thompson, Skowronski, & Betz (1993) found in their diary study (events 1–14 weeks old) that the dating error of events where the exact date was recalled was on average only 1.30 days, while it was much bigger when the date was said to be unknown (4 days up to 15 days).

Whether a date needs to be reconstructed or is known can be considered a phenomenological characteristic because respondents evaluate whether “mental calculations were made or not.” The distinction can also be assessed relatively independently of the respondent when it is evaluated by the researcher. A researcher can code the reports as known when the answer was quick without any visible effort and as reconstructed when people have to think about the answer⁴⁵. This is closely related to measures of reaction time, which are also positively correlated with the accuracy of recall (Howes & Katz, 1992; Kemp, Burt, & Malinen, 2009; Robinson, Johnson, & Herndon, 1997). Even though reaction time seems to be a valuable predictor of dating accuracy, it is important to point out that very short reaction times are especially good predictors of accuracy⁴⁶ while longer reaction times can be interpreted differently:
• A respondent makes a lot of effort to come to the best estimate even in cases where he or she has already reached sufficient accuracy (e.g., thinking for a long time about whether it was July 17th or July 18th).
• A respondent has a vague memory of an event and tries to find at least some contextual information so as not to just guess.
• A respondent has a slow pace of talking and thinking and so, even when he or she is confident (knows the answer), takes a long time to reach the answer.

Because of these problems with interpreting reaction times it can be valuable to use the think-aloud procedure while the date is being estimated, or to probe how the respondent reached the date estimate after the final answer was given (more about these procedures in Willis, 2005).

45 I use a combined approach in Studies II & III. I used reaction time and the think-aloud procedure while taking into account the individual pace of answering. I also asked about the retrieval strategy when I was not sure about how to classify the event.
46 Even this is not true when people are not motivated to provide accurate date estimates and “fire” the answers quickly without thinking.

A related phenomenological concept is the accessibility of the date, in other words the ease of recall or reconstruction. Ease of recall is positively related to the details remembered (Ritchie, et al., 2006) and also to the retrieval strategies that were shown to be positively related to dating accuracy (Thompson, et al., 1993) (see chapter 2.3 for more details about dating strategies). Although the accessibility heuristic leads to better estimates than pure guessing, it generally leads to less accurate date estimates than knowing the date, having a temporal association with landmark events, etc. (N. R. Brown, et al., 1985; Wright, Gaskell, & O'Muircheartaigh, 1997).

Many studies also use the troublesome concept of saliency. The word “salient” can have at least two meanings that are relevant for memory accuracy⁴⁷: being “most noticeable” or “important”. These two related meanings together with different conceptualizations make this concept problematic (Van der Vaart, 1996). For example, according to Mathiowetz and Duncan (1988) “a salient event appears to evoke emotion at the time of occurrence, mark a transition point, have economic or social costs and benefits, or have continuing consequences after the event” (p. 225). This definition is very close to the definition of landmark events (cf. Shum, 1998) and it is thus no surprise that these events are dated accurately. On the other hand, Wagenaar (1986) classified events on a scale from “common place daily events” (low salience) through “a class occurring with a lower frequency, like once a week, once in 3 years” to the most salient “once in a lifetime” events. His definition thus uses not only the “most noticeable” criterion but also frequency, and probably some other characteristics, such as importance of the events, play a role in this classification as well.
Again it is no surprise that events like this are dated more accurately, though it also depends on when the saliency was rated (a higher correlation is expected when the ratings were made at the time of the interview). In this sense the term uniqueness could also be used. This word means “being the only one of its kind; unlike anything else”⁴⁸. Salient events must have some unique properties, e.g., not a “common trip” but a “life-time trip”.

The last term closely related to this group of characteristics that I will deal with is the memorability of events. Memorability is usually defined at the time when events are recorded (in diary studies) (Thompson, 1982). An extremely memorable event may be defined as one that a respondent expects to remember after a long time (1 year in Thompson’s study), and a “not very memorable event” as one that is expected to be forgotten soon (within 2 weeks in Thompson’s study). Thompson found that memorable events (the oldest were 14 weeks old) had the lowest median error of only 3.4 days, while medium memorable events had an error of 4.5 days and the least memorable events an error of 6.5 days. This finding was supported by other studies as well (e.g., Skowronski, et al., 1991).

47 According to Oxford dictionary: Retrieved from http://oxforddictionaries.com/definition/salient?q=salient
48 According to Oxford dictionary: Retrieved from http://oxforddictionaries.com/definition/unique?q=uniqueness#unique__7

The relationship of event pleasantness or emotional involvement with memory accuracy has been addressed by many studies with rather ambiguous outcomes. Wagenaar (1986) found in the diary study of his own memory that events with more emotional involvement are more resistant to decay (after 1, 2, 3, 4 or 5 years). More pleasant events were better recalled than neutral or unpleasant events after one year. While both pleasant and neutral events were gradually forgotten, negative events remained quite stable over a three-year retention period and then dropped. Even though negative events were on average less well remembered, those which stayed (approximately 40% of events) were stable even after three years (the saliency of positive and negative events was similar). A reanalysis of 120 positive and negative events showed that when the main character in the event description was somebody else, positive events were better recalled; when the main character was Wagenaar himself, he recalled the negative events better (Wagenaar, 1994). This is interesting, but one has to be careful because it is an N = 1 study. Groeger (1997, p. 230, in Eysenck & Keane, 2000, p. 243), for example, speculated that this finding may reflect Wagenaar’s personality: “What Wagenaar does not address is the possibility that he may actually have a tendency to be rather self-critical or self-effacing.”

Skowronski, et al. (1991) found that positive events are generally better remembered than negative events but the extremity of the emotion was also important. More extreme events (both pleasant and unpleasant) were better dated than neutral events. The authors also found that positive self–events and negative other–events are dated more accurately than negative self–events and positive other–events (see above).

Confidence of date estimates

Confidence in date estimates could be considered one of the phenomenological characteristics; for example, Sutin and Robins (2007) included the vividness of date estimates in various time units in their Memory Experiences Questionnaire (MEQ). I will deal with this event characteristic separately because it is the only characteristic that is not directly connected to the event (and its representation in memory) but to the date estimate.

Many studies have found that confidence is a relatively good predictor of accuracy in various tasks (including dating accuracy) but also that people are generally overconfident in their ratings (Baron, 2008; Wagenaar, 1986). This overconfidence is especially high when people think they are 100% sure or close to “full” confidence in the accuracy of their date estimates. This applies especially to temporal landmark events or other events where people simply know the date, but there is evidence that such events are not always dated correctly (Shum, 1998; Thompson, et al., 1993; Van der Vaart & Glasner, 2007b). On the other hand, when confidence is very low the actual accuracy is usually higher than the stated confidence (similar to the regression to the mean phenomenon). Rubin, et al. (2003) were aware of this phenomenon, which is why they included in their “Autobiographical Memory Questionnaire” the item “Would you be confident enough in your memory of the event to testify in a court of law” (p. 901). This is a very strict criterion that can be used to test the belief in the estimates.
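The relation between stated confidence and actual accuracy described here can be made concrete with a small calibration table. The following Python sketch is only an illustration: the records are invented, the function name calibration_table is hypothetical, and treating “overconfidence” as stated confidence minus the proportion of correct date estimates is an assumption of this example rather than a measure taken from the cited studies.

```python
from collections import defaultdict


def calibration_table(records):
    """Group date estimates by stated confidence and compare with accuracy.

    records: iterable of (confidence, correct) pairs, where confidence is a
    proportion between 0 and 1 and correct is True when the date was right.
    """
    groups = defaultdict(list)
    for confidence, correct in records:
        groups[confidence].append(1 if correct else 0)

    table = {}
    for confidence in sorted(groups):
        hits = groups[confidence]
        accuracy = sum(hits) / len(hits)
        table[confidence] = {
            "n": len(hits),
            "accuracy": accuracy,
            # Positive values: confidence exceeds accuracy (overconfidence).
            # Negative values: accuracy exceeds confidence (underconfidence).
            "overconfidence": confidence - accuracy,
        }
    return table


# Hypothetical date estimates: (stated confidence, was the date correct?)
records = (
    [(1.0, True)] * 4 + [(1.0, False)] * 1
    + [(0.5, True)] * 2 + [(0.5, False)] * 2
    + [(0.25, True)] * 2 + [(0.25, False)] * 2
)

table = calibration_table(records)
for confidence, row in table.items():
    print(f"confidence {confidence:.2f}: n={row['n']}, "
          f"accuracy={row['accuracy']:.2f}, "
          f"overconfidence={row['overconfidence']:+.2f}")
```

In these invented data the fully confident estimates are overconfident and the least confident ones are underconfident, which mirrors the pattern described in the text; real diary data would of course need far more estimates per confidence level.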

Association with other events

Both public and personal events can be temporally associated with other events. When the date of the other events is known or approximately known, this association (often in the form of a known sequence of events) will help to estimate the date of related events (Glasner, et al., forthcoming). This is one of the main mechanisms for reconstructing the date of many personal events (Friedman, 2004) but is less frequent for most public events because they are generally less associated with other events in autobiographical memory and are often stored as semantic facts.

When the association is available, both types of events should be dated more accurately in comparison to similar events without the association, as was shown in the study by Larsen and Thompson (1995). This effect should be more pronounced in less salient public events because they are generally dated less accurately (Kemp, 1999) and the association can thus have a relatively bigger effect (Larsen & Thompson, 1995).

Neisser (1981) found that when people are faced with similar events they tend to recall very well the aspects that were the same for all events but often forget the unique features or associate the unique features with the wrong episode. This may apply, for example, to cyclic public events, where the association with personal events (e.g., I was watching the Olympic Games while I was still living in my old house) may be correct for the Olympic Games in general, but the Games in question were held not in Torino but in Salt Lake City. This is why Neisser calls these memories “repisodic” rather than episodic.

3.3 Data collection and dating accuracy

The subjective impression that many of us have when we attempt to provide accurate dates for autobiographical events is that it is a very difficult task. This personal subjective experience is corroborated by our observations of respondents in event-dating studies. These respondents often spontaneously express a great amount of dismay (sometimes quite vehemently) when they are asked to provide exact dates for their autobiographical events. (Skowronski, et al., 1994, p. 217)

As the quotation shows, providing date estimates is often demanding and difficult, especially when researchers or other professionals ask about events that are not as memorable as people’s own birthday or wedding date (which is helpfully written on the wedding ring). The first task of the researcher is thus to check that the task difficulty of the target event(s) is not too high. Event and respondent characteristics mentioned in previous chapters should help in this assessment. Dating accuracy is, however, influenced by other factors as well, which can be loosely grouped under the heading of data collection. The most important is the reason why the person estimates the date. In real-life situations people look for dates because they need them for their own purposes or because they are asked. The reason for looking for the date is closely connected to the expected dating accuracy: when people (or researchers) need very precise dating they will try much harder to reconstruct the date than in situations where precise dating is not necessary. Deeper and more demanding reconstruction strategies may lead to increased dating accuracy, as mentioned in chapters 3.1 and 2.3. When researchers or other professionals need precise dating but this is not as relevant for the respondent, they should do everything possible to increase the respondent’s motivation and to minimize the cognitive burden of the task. (This is of course true even when the respondent is motivated and tries hard to estimate the dates.)


I will first deal with the general issue of the response process and its components, then move to more concrete issues of facilitating the respondent’s motivation and task characteristics, and end with aided recall procedures that should facilitate the recall.

The response process

To answer the “when” question a respondent needs to comprehend the question; retrieve the relevant material; judge its completeness, make inferences, and integrate the material; and finally respond in the chosen time units (Tourangeau, et al., 2000). According to these authors this is an idealized process. In the real world people often skip some phases, come back to earlier phases, or do not process the information thoroughly but give the first answer that comes to mind and satisfies the researchers (see figure 3.3).

Figure 3.3 Schematic response process (adapted from Tourangeau, et al., 2000). The small arrows demonstrate the “shortcuts” when people skip some phases and that people may also go again to the previous processes.

Even though the first part (comprehension) may seem obvious for “when questions” it may not be. The meaning of the question “When did you start working at your current job? Please be as accurate as you can” is obvious, and when time units are provided the interviewer may be fairly sure that the respondent will interpret the question as it was meant by the researcher. What is not so obvious is how a respondent comprehends the expected accuracy. Even if it is mentioned explicitly (when the interviewer says that he or she wants the most accurate answers possible) the respondent will still infer the expected accuracy from the context of the interview and the task difficulty (Grice, 1975; Tourangeau, et al., 2000). Grice (1975) calls this (often implicit) information a conversational implicature. For example, the respondent may feel entitled to provide less precise answers (and try less hard) in an online interview than in a face-to-face interview. The reason may be that a respondent in an online mode infers that it is up to him or her how hard to try and will thus work with less effort. This may be intensified when many date estimates are required and by the fact that the “personal investment” of the researcher is smaller in an anonymous online mode, so why should the respondent “bother” being hardworking? The conversational maxims of quantity and quality may thus be different for online and face-to-face interviews (Grice, 1975). In face-to-face interviews the interviewer also has the chance to encourage the respondent to try harder even in situations when the task difficulty is high and more estimates are needed. The telephone interview should be a better and more conversational mode than the online one, but even in this mode lower data quality is often found in comparison to face-to-face interviews (see e.g., Aquilino & Lo Sciuto, 1990).

There are two ways to ask for dates. The first is the when question mentioned above: “When did you start working at your current job?” The second is asking in a relative time format: “How long ago did you start working at your current job?”49 In both cases the same time units can be used. In everyday conversation people often use the relative time format (“a month ago” or “a week ago”) instead of “in July” or “on July 21st”. The relative time format is also preferred when more recent and less relevant events (e.g., public events) are dated (see Janssen, et al., 2006).

The important issue about the question type is that the two types of questions evoke slightly different estimation strategies that may end up in different error patterns (Gaskell, et al., 2000). For example, when using the relative time format people may use rounding more often: it happened a month ago, two months ago, etc. The absolute time format does not encourage rounding as much as the relative time format does. Huttenlocher, et al. (1990) found that in reports of elapsed time (relative time format) people often use prototypical values such as 7 days ago (1 week), 14 days ago (2 weeks), et cetera. This is supported by the study of Janssen, et al. (2006), who found that the dating error was smaller in the absolute time format (for remote events, > 1000 days, and recent events, 100–1000 days; no difference was found for very recent events). Both time formats suffer from the same problems but it seems that the relative time format (though people often like it) leads to higher dating error (Janssen, et al., 2006). Because of the rounding problem I will use the absolute time format and when questions in the empirical Studies I to III.

49 The relative time format can also use a reference point other than the present (e.g., how long after/before some other event or period; see chapter 2.2).
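The rounding problem with the relative time format can be made concrete with a minimal sketch. It assumes a hypothetical respondent who replaces the true elapsed time with the nearest prototypical value; the list of prototypes, the interview date, and the event date are all invented for illustration and are not taken from the cited studies.

```python
from datetime import date, timedelta

# Prototypical elapsed-time values in days (1 week, 2 weeks, about 1 month, ...).
# Which values respondents actually favour is an assumption made for illustration.
PROTOTYPES = [7, 14, 21, 30, 60, 90, 180, 365]

def round_to_prototype(days_ago: int) -> int:
    """Return the prototypical value closest to the true elapsed time."""
    return min(PROTOTYPES, key=lambda p: abs(p - days_ago))

def relative_to_absolute(days_ago: int, interview_day: date) -> date:
    """Convert a 'how long ago' answer into a calendar date."""
    return interview_day - timedelta(days=days_ago)

interview_day = date(2011, 12, 15)  # hypothetical interview date
true_date = date(2011, 11, 5)       # hypothetical event date

true_days_ago = (interview_day - true_date).days         # 40 days
reported_days_ago = round_to_prototype(true_days_ago)    # rounded to "a month ago"
reported_date = relative_to_absolute(reported_days_ago, interview_day)

# Positive signed error: the event is dated as more recent than it really was.
signed_error_days = (reported_date - true_date).days
print(reported_days_ago, reported_date, signed_error_days)
```

In this example an event that happened 40 days ago is reported as "a month ago", which shifts the implied calendar date by 10 days. An absolute answer would not be forced onto such a coarse grid.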

Retrieval of the event as well as temporal information connected to that event may be the most difficult part of the whole process, especially when the events are less salient. The painfulness of remembering was nicely captured by Wagenaar (1986) who tested his own memory:

Unexpectedly, the recall appeared to be somewhat torturous. Most of the events were quite trivial by the time of recall, and I needed much motivation to search my memory for trivia. It was hardly possible to recall more than five events on a day, which explains why the recall period of the main experiment itself lasted a full year (p. 231-232).

During the retrieval phase four scenarios may happen:
• Respondent knows the right date straight away: it is already in his or her “head”.
• Respondent can reconstruct the date by using various strategies.
• Respondent does not know the date and is not able to reconstruct the date on the chosen scale because he or she forgot the temporal information completely. When the answering scale is very rough-grained, people are usually able to at least say that the target event happened when they were children or in adulthood (Larsen, et al., 1995).
• Respondent thinks he or she knows the date, but is mistaken.

The fifth possible scenario would be that the person forgets that the event happened at all. This scenario will be skipped in this dissertation because people cannot estimate the date of events they do not know happened.

The first scenario is not problematic. Whatever researchers do, the answer remains the same. As mentioned in chapter 2.3, time-tagged memories are rare50. But because even confident people can make errors (see chapter 3.2 and the fourth scenario here) caution is recommended even in these cases.

In the second scenario, when a respondent is able to reconstruct the date, dating accuracy may be influenced by the way the data are collected. Researchers should thus do the maximum to facilitate the respondent’s motivation and recall.

The third scenario is the opposite of the first one. Whatever researchers do, respondents are not able to date the events and their answer is nothing but a pure guess where some logic (e.g., temporal schemata) may be applied. The question is of course whether the temporal information, or at least some contextual information that could lead to an improved date estimate, was completely forgotten or respondents just need some more effective retrieval cues (Della Sala, 2010). This question cannot be answered because there is always a chance (though it may be small) that at least some facts about the events may be recalled in the future. As Berntsen (1998) highlights, people store many episodes “which may often be inaccessible to voluntary retrieval, but highly accessible for involuntary retrieval” (p. 136).

The fourth scenario, where a respondent thinks he or she knows the answer but actually does not, is probably the most problematic. This is especially true when respondents feel confident about the accuracy of their reports. As Wagenaar (1986) points out, this overconfidence may mislead even jurors who have to judge the reliability of witness testimony. The problem is that in these cases interviewers and respondents may not pay enough attention to the recall aids or to possible inconsistencies with other date estimates, because the confidence overrides the doubts. The literature shows that even people who are confident make recall errors as well as dating errors and that this is not rare (see e.g., Baron, 2008; Kebbell, 2009). It thus seems advisable to probe respondents with questions like: “Are you really sure?”; “Couldn’t it be a different year?”; “Look at the calendar. Is it consistent with your other answers?”

In the judgment phase people evaluate the information they were able to retrieve, assess its completeness, draw inferences with the help of various heuristics, and fill in the missing information.

In the retrieval and judgment phases the interviewer’s facilitation of the respondent plays a major role, because both phases may be very demanding and rounding, averaging, as well as other “shortcuts” can increase the dating error.

The last phase (response or response selection) is usually not problematic when appropriate time units are chosen (see next section) and when respondents try to be accurate.

50 With the exception of DOW estimates of recent events.

Choosing appropriate time units

The choice of the appropriate time units depends on the study topic and research questions while taking into consideration the possibility that these units can be retrospectively reconstructed. This is clearly articulated by Freedman, Thornton, Camburn, Alwin, and Young-DeMarco (1988):

The investigator must choose a time unit that is small enough to ascertain with adequate precision the sequence and temporal interrelation of events. To record events that occur fairly frequently or quite close together, it is necessary to divide time rather finely. At the same time, one must consider the respondents' ability to make fine time distinctions and the feasibility of fitting the desired time unit over the required time span of the study onto a calendar of manageable size (p. 44).

For remote events where the month is not crucial information, longer time units such as the year may seem more suitable than the month. For example, Yoshihama, Clum, Crampton, and Gillespie (2002) chose one-year intervals as the time unit in their domestic violence study (the recall period was very long, up to 40 years) with the reasoning that “the relatively long recall period to be covered in this study would make it difficult for the respondent to accurately recall a smaller time unit in which a particular event occurred” (p. 302). Even though it may not be necessary to know the month of occurrence and the year could be sufficient, the argument that smaller units are difficult does not seem to be well grounded. People usually remember the month very well even after years, because they remember some contextual information from which the month can be estimated (see chapter 2.3). The same logic could be applied to day of the week estimates, because people remember at least some temporal schemata for a very long time (Larsen & Thompson, 1995). Fine-grained time units such as month or DOW may also facilitate respondents’ recall because they have to look for contextual details that could otherwise be skipped, and this activity was found to improve the accuracy of recall (Fisher & Geiselman, 2010). On the other hand, omitting the month may also be beneficial because when more events are dated it may help to reduce the cognitive burden of the task.

For analytical purposes it is best to choose the smallest time units possible because they can later be aggregated into larger units when needed (Belli, et al., 2009). There is however a limit to this approach. For example, when exact dates are asked for without any calendar help (e.g., January 30, 2011), the day of the week can be derived from the date, but there is a high chance that when the exact date is not time-tagged the derived day of the week will be no better than pure chance. On such occasions a calendar (a common pocket calendar; see e.g., Gibbons & Thompson, 2001) should help to solve this problem, because people can look up the DOW (of time-tagged events) or adjust the exact date according to the day of the week they remember.
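The point about aggregating small units into larger ones, and its limit for the day of the week, can be sketched as follows. The helper `aggregate` is a hypothetical name introduced only for this illustration; the example date is the one used in the text.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def aggregate(d: date) -> dict:
    """Derive coarser time units from an exact date (hypothetical helper)."""
    return {
        "year": d.year,
        "month": d.month,
        "day_of_week": DAY_NAMES[d.weekday()],  # weekday(): Monday == 0
    }

reported = date(2011, 1, 30)   # the exact date a respondent reports
off_by_one = date(2011, 1, 31)  # the same report, wrong by a single day

print(aggregate(reported))
print(aggregate(off_by_one))
```

Year and month survive a one-day error unchanged, but the derived day of the week does not. A DOW computed from a reconstructed (not time-tagged) exact date therefore inherits every small error in that date, which is why asking for the DOW directly, with a calendar at hand, may be the safer design.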


Commonly used time units are hours, parts of the day, exact day of the month (or day units generally), DOW, weeks, months and years. Three parts of a month (beginning, middle, and end) also proved to be an appropriate time unit when the recall period is not more remote than approximately a year or two (see e.g., Belli, et al., 2001; Kessler & Wethington, 1991). Belli et al. (2001) argue that because weeks cross the month boundaries and are generally difficult to remember, thirds of a month are clearer and easier to reconstruct. See chapter 2.4 for more details about different time units and the error patterns connected to them.
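The argument about weeks crossing month boundaries can be illustrated with a short sketch. The cut-offs used for the thirds (days 1–10, 11–20, 21 to the end of the month) are an assumption made here for illustration; the cited studies may define the thirds differently.

```python
from datetime import date

def third_of_month(d: date) -> str:
    """Map a date to a third of its month (cut-offs are an assumption)."""
    if d.day <= 10:
        return "beginning"
    if d.day <= 20:
        return "middle"
    return "end"

def iso_week(d: date) -> int:
    """ISO week number of a date."""
    return d.isocalendar()[1]

a, b = date(2007, 1, 31), date(2007, 2, 1)

# The two days fall into the same week but into different months,
# so a week cannot be placed unambiguously on a month-based calendar.
print(iso_week(a) == iso_week(b), a.month == b.month)

# A third of a month never crosses a month boundary.
print(third_of_month(a), third_of_month(b))
```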

Temporal boundaries

An event, e.g., visiting the volcano Etna, may be part of an extended event (holidays in Sicily), which limits its occurrence to the temporal boundaries of the extended event. If respondents know these limits the dating error will be within that range only. Apart from exceptions, most autobiographical events as well as extended events are associated with some lifetime period too (Conway & Pleydell-Pearce, 2000; Williams, et al., 2008). Often the beginnings or endings of these periods serve as temporal landmarks (Shum, 1998) and events close to these landmarks tend to be dated accurately (if the temporal boundaries are correctly dated). These boundaries are related to events and I deal with them here because they have a similar effect on dating accuracy as the temporal boundaries of the recall period.

In many memory studies respondents were aware of the temporal boundaries of the recall period because they kept their diary for some period. For instance, in the diary study by Skowronski, et al. (1991) respondents knew that events could come from the last academic quarter only. Similarly, Wagenaar (1986) knew that he had recorded events for six years only and that the oldest events could not be older than this boundary. Sometimes researchers even mention the boundaries for various reasons. For example, participants in Kemp’s public events dating study (1988) were told that the oldest event could be from 1980.

The boundary effect models explain the phenomenon that without a boundary the estimated dates are less biased. The reason is that all events have the chance to be telescoped forward as well as backward. With a boundary, the signed errors as well as the dating errors in absolute value will generally be smaller (because the boundary limits the maximum error), but more biased close to the boundary: events close to the more remote boundary may be dated more accurately than slightly more recent events. The reason is that the more remote events cannot be telescoped so much backward but only forward (Huttenlocher, Hedges, & Prohaska, 1988; Rubin & Baddeley, 1989). The boundary model also partially explains the typical pattern of telescoping. Older events are more prone to forward telescoping and recent events to backward telescoping, while events in the middle of the recall period should have similar chances for both types of bias.

Lee & Brown (2004) found that three different boundaries (1997, 1994, 1991) had an impact on the dating accuracy of events from 1997 to 2001. When people knew the true boundary (1997) they were more correct than when they knew the more remote boundaries. The proportion of guessed dates (when people were not aware that the event happened) predating the middle of the true range (middle of the 1997–2001 reference period)51 was 67% for the 1997 boundary, 75% for 1994 and 81% for 1991. This shows that people were aware of the boundary and put their responses closer to the middle of the period between the boundary and the present. When guessed events were removed, the researchers surprisingly found no impact of the boundary but only an impact of recency. Guessed events were backward telescoped and the amount of bias was contingent on the boundary. It is interesting that boundary effects worked in the expected direction only for guessed events: when boundaries receded in time, forward telescoping decreased and backward telescoping increased. However, removing the guessed events almost entirely removed both types of telescoping.

Kemp’s associative model (1999) explains telescoping even without boundaries. The theory has three assumptions: events are often associated with other events, and similar or related events can cue other events; when the recall period increases it is harder to recall time information as well as contextual information; and when time information is not available people will reconstruct it from other related events. The theory predicts that if events are older, the time information as well as associative contextual processes are scarcer and the accuracy lower. Older events should be more telescoped forwards because there is a higher probability that the associative search will lead them towards more recent events (because these can be more easily recalled).

51 Pre–dating = being more remote than the middle of the 1997–2001 interval which was at the end of the 1998 year.
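The logic of the boundary effect discussed in this section can be illustrated with a small simulation. This is only a sketch of the general idea, not the model of Huttenlocher et al. (1988) or Rubin and Baddeley (1989): it assumes unbiased, normally distributed memory noise and a respondent who never places an event outside the known limits. All parameter values are invented.

```python
import random

def simulate(period_days=1500, noise_sd=300, n=20000, seed=1):
    """Mean signed dating error for events in three parts of a bounded period.

    Dates are expressed as days since the remote boundary (0) up to the
    interview day (period_days). All parameter values are illustrative.
    """
    rng = random.Random(seed)
    errors = {"remote third": [], "middle third": [], "recent third": []}
    for _ in range(n):
        true_day = rng.uniform(0, period_days)
        # Unbiased memory noise around the true date ...
        estimate = true_day + rng.gauss(0, noise_sd)
        # ... but the respondent keeps the answer within the known limits.
        estimate = min(max(estimate, 0), period_days)
        if true_day < period_days / 3:
            key = "remote third"
        elif true_day < 2 * period_days / 3:
            key = "middle third"
        else:
            key = "recent third"
        # Positive error: dated too recently (forward telescoping).
        errors[key].append(estimate - true_day)
    return {k: sum(v) / len(v) for k, v in errors.items()}

mean_signed_error = simulate()
for part, err in mean_signed_error.items():
    print(f"{part}: {err:+.1f} days")
```

Although the underlying noise is symmetric, truncation at the limits alone produces forward telescoping for the oldest events and backward telescoping for the most recent ones, while events in the middle of the period show almost no bias.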

Facilitation of respondent’s recall

There are two roads to facilitating a respondent’s recall. Researchers can try to affect the respondent’s motivation and hope that it will improve the recall because respondents will “try harder” and use more thorough reconstruction strategies. The other road focuses directly on the facilitation of the recall and helps respondents by providing various recall aids such as calendars, landmarks, et cetera. I will first deal with the general motivational issues and later with the aided recall procedures.

Motivation facilitation

When people are motivated to be accurate, they expend more cognitive effort on issue–related reasoning, attend to relevant information more carefully, and process it more deeply, often using more complex rules (Kunda, 1990, p. 481).

The motivation of respondents to put enough effort into the dating task can be facilitated by making the task easier with the help of various recall aids (this will be described in the next section), or indirectly, for example by showing support and interest to the respondent or by the “motivated” behavior of the interviewer.

As Gillham (2000) points out, the interviewer is a “research instrument”. Researchers can not only talk about the importance of the research or the importance of the results for respondents, they can do much more by simply showing real feeling for and interest in respondents: “If you are not really interested in the interview topic, and therefore in the responses of the person you are interviewing, then it won’t work; in a myriad of small ways you will show that you are just going through the motions” (p. 30).

Interviewers should support respondents especially when the task is difficult. This can be done by words, voice, or gestures. Even though this may interrupt the standardized procedure, it seems generally advisable to offer genuine “non-standardized” support when feasible (Belli & Callegaro, 2009; Gillham, 2000). This support has to be non-suggestive and it is thus advisable to provide training in giving feedback and support for less experienced interviewers (Dijkstra, Smit, & Ongena, 2009). Belli, Lee, Stafford, and Chou (2004) found no detrimental effect of a more conversational and “memory friendly” type of interview. When there is a great danger that the interviewer will bias the date estimates, the supportive phrases may be scripted and interviewers can be told which parts of the feedback are flexible and which are not.

People are more likely to arrive at the conclusions that they want to arrive at (Kunda, 1990) and researchers thus have to do their best to ensure that respondents share the aim of the researchers.

Recall facilitation (aided recall procedures)

Intentionally retrieving memories is an effortful cognitive process that takes seconds and sometimes even tens of seconds (Conway & Loveday, 2010). Retrieving or, more accurately, reconstructing the date is often an even more effortful process, which is why some people are reluctant to put much effort into it. Even though there is no miraculous technique leading to accurate date estimates, researchers can do quite a lot to increase the chances that people will provide the best estimates of which they are capable.

As mentioned in the previous section, good motivation on both sides (interviewer and respondent) may increase the recall because respondents will use better recall strategies and try harder to reconstruct the date. However, good motivation may not be enough when people face a really difficult task, which providing date estimates often is. I did not find any comprehensive reviews of aided recall procedures that help to decrease the task difficulty. However, a lot can be borrowed from literature focused on recall in general, from dating accuracy studies, or from literature describing calendar instruments.

The best way to facilitate recall (and also check its validity) is to find some evidence (documents, photos with dates, e-mails, et cetera). Because of this, it is advisable to conduct the interviews at the respondents’ homes or to give respondents some time to check their responses. This is however not possible on many occasions and researchers thus have to use other strategies.

The order of asking temporal questions should preferably go from recent events to more remote events. The reason is that recent event dates are usually known quite well and people can reconstruct the sequence leading to remote events more easily (Loftus & Fathi, 1985). But when events that have a causal relationship are recalled (or a story as a sequence of events), ordering questions from remote events to recent ones is preferable (Schwarz & Oyserman, 2001).
Decomposition of the task is also sometimes used, especially when the task can be decomposed into meaningful units (Tourangeau, et al., 2000). When, for example, researchers ask for the day of the week and the exact date, it may be easier to first ask about the day of the week and reconstruct the exact date afterward. However, the decomposition should not disturb the natural strategy of respondents to date the target event. It may be generally advisable to start with the easier task (which the day of the week usually is), but it should be the respondent who chooses whether to start with the DOW or the exact date.

Guiding respondents about how to effectively reconstruct the dates by using all available temporal information, contextual information, and heuristics may also help. By asking respondents to talk aloud or by probing after answering (e.g., how did you arrive at the date?) researchers can learn a lot about the reconstruction strategies of the respondent (Willis, 2005). Respondents may not use some available information or may apply the wrong heuristics. For example, Niedźwieńska (2004) found that students who attended an intensive course on autobiographical memory were more accurate (η² ranging from .01 to .02) in their recollections of the 9/11 terrorist attacks (and less confident about the veridicality of their reports) than a group of students who did not attend this course. Researchers can use knowledge of how people arrive at the dates (see chapter 2.3).

More time for a task may also improve dating accuracy. Sudman and Bradburn (1982) mention that the mere length of the question can promote fuller recall (when longer). The reason is obvious. When respondents feel that they have enough time without pressure they can try to look up all relevant information and process the estimates more deeply.
A short talk about memories from the reference period before the interview can make subsequent recall quicker because the memory will be in the so-called retrieval mode, in which people access the episodes more quickly (Tulving, 2002). Van der Vaart and Glasner (2011) also mention that “warming up” procedures may stimulate subsequent recall.

The reinstatement of context plays a major role in date estimates where the task difficulty is high (Friedman, 1993; Janssen, et al., 2006) and is the main component of the cognitive interview, which was found to increase the accuracy of retrieval (Fisher & Geiselman, 2010). Not only facts are worth retrieving. Emotions and other states of mind may also be useful because it was found that some memories are context-dependent and the reinstatement of the context helps in retrieving these memories. Marian and Neisser (2000), for example, found that even the language of the word cues can cause different memories to be retrieved. Among Russian-English bilinguals, word cues in English led to 35% of Russian memories while Russian word cues led to 64% of Russian memories.

Multiple retrieval may also increase the accuracy of recall because people tend to think about the events even after the interview and may find evidence supporting their reports or making them more accurate (Fisher & Geiselman, 2010).

Collaborative recall generally increases the accuracy. The reason is obvious. Different people may remember different aspects of the memories (some even the exact date), which leads to more accurate recall and date estimates (C. B. Harris, Paterson, & Kemp, 2008; Karns, Irvin, Suranic, & Rivardo, 2009). The price of this type of recall is, however, that errors may happen as well, and people recalling events together were found to be more confident than individuals even when they were wrong (C. B. Harris, et al., 2008). Clark and Stephenson (1990) summarize this problem:

Social remembering has advantages and disadvantages, which must be weighed against each other. If maximum accuracy with the minimum of evaluative comment, regardless of the occasional (but consistent) error is required, then collaboration has distinct advantages. However, if the minimization of error is more important than the completeness of accounts, and evaluative comments can be tolerated, then individual recall has distinct advantages. (p. 92).

An elegant solution to the problems arising from collaborative recall could be doing the interview first with two or more people separately and then again together. This should lead to the best outcomes, because the group usually remembers less than the sum of the individuals but more than each individual separately (C. B. Harris, et al., 2008).

When some idiosyncratic dates are known from documents or from previous data collections in longitudinal research, bounded recall may be used. Bounded interviews use events whose dates are known, and people thus have at least some time points where the date is exact or approximately exact (Tourangeau, et al., 2000). When people receive feedback about the known dates this is sometimes called dependent interviewing (or pre-loading when computerized) (Hoogendoorn, 2004). In longitudinal research it is generally considered that more concurrent data collection is more accurate than subsequent collection (often years after) and can thus serve as the “gold accuracy standard” (Belli & Callegaro, 2009; Belli, et al., 2001).

Public events, especially temporal landmarks, are sometimes used by researchers as a bounding recall aid as well (Gaskell, et al., 2000; Hoppin, Tolbert, Flagg, Blair, & Zahm, 1998). The reason is that temporal landmarks such as 9/11 or other salient public events often structure the memory and when used as cues can increase the dating accuracy of related events (Shum, 1998; see also chapter 2.3 for more details about how landmarks help date reconstruction).

There are two possibilities for how to use public temporal landmarks as recall aids. Researchers can provide the landmark event (or events) with its date and hope that the target event in which they are interested will be temporally connected to this event. The landmark event will then provide a temporal anchor and thus the date estimate may be more accurate. Or, in the case of very salient events where everyone knows the exact date, researchers can simply expose people to these public landmarks, as did Loftus and Marburger (1983), who found that mere exposure to these public landmarks (eruption of Mount St. Helens and New Year’s Day) may increase the accuracy of recall of other events.

There are, however, several limitations to the use of public landmarks as recall aids. First, they have to be dated precisely if the true date is not provided (this applies to public events). Second, they have to be relevant for respondents because this increases the chance that they will be temporally connected to the other events of interest (Van der Vaart & Glasner, 2007b, 2011). Third, some cyclic temporal landmarks such as New Year’s Day or Christmas may not work so well52, because people inevitably have them in mind. On the other hand, less obvious cyclic (often seasonal) temporal landmarks (see chapter 2.3 – year temporal schema) such as Easter (the date is not fixed), International Women’s Day (the date is fixed but people may forget about it), etc. may be good recall aids. And fourth, usually not many (if any) public landmark events happen in a given year. In most years nothing as salient as the “Velvet revolution” in the Czech Republic or 9/11 in the USA happens.

Because of these problems with public landmarks some authors prefer the use of self-generated personal landmarks (this is also often part of the calendar instruments).
According to Van der Vaart and Glasner (2007b), the most suitable landmarks are “important, domain related, personal events, that are generated by respondents themselves” (p. 33). Personal events as landmarks carry the danger, however, that their date may be biased and thus all date estimates associated with this landmark may be systematically biased as well (Shum, 1998; Van der Vaart & Glasner, 2011).

52 But as shown above in Loftus and Marburger’s study, New Year’s Day surprisingly “worked”.

It is also important to note that landmarks (both public and personal) may be more suitable in aiding only some temporal units while not having an impact (or only a small one) on other time units (Van der Vaart & Glasner, 2011). For example, the cyclic year temporal schemata (such as the World Championship in Ice Hockey) may aid the recall of the month or even weeks or exact days while not having any impact on year estimates. Multi-year cyclic events may on the other hand have an impact on the year estimate as well, especially when the recall period is not so long that it covers the multi-year events too many times (see more details in chapter 2.3 – temporal schemata).

Recall aids may not be equally helpful for all people. There is, for example, some evidence that women may be better able to take advantage of recall aids such as calendar instruments (Reimer & Matthes, 2007). According to Radovan Šikl (2011)53 the reason may be that women are in real life more used to using various recall aids, emoticons, notepads, and thus the calendar instrument may be more familiar to them. This is however a pure hypothesis, though supported by some indirect empirical evidence (see e.g., Skowronski, et al., 1994; women being "time keeper at families" more frequently and use mnemonics better; Tabatabaei & Hejazi, 2011) and lay experience that women are better at remembering autobiographical events and men thus do not have to practice their memory so much and can simply ask their wives (Auriat, 1993).

The talk aloud procedure can also increase recall because people may try harder when the researcher hears them. This will nevertheless not be the case for all people because some people may prefer to be silent when processing some task deeply and may find the talk aloud procedure bothersome. Probing for the answer could solve this problem but both procedures may increase the cognitive burden, especially when more events have to be dated (Willis, 2005).

3.4 Calendar instruments

Calendar instruments incorporate many of the recall aids mentioned in chapter 3.3 into one technique. There are many variations of these instruments that differ in terminology and, to a greater or lesser extent, in design features54. Still, the most important features are usually similar (see figure 3.4). These include (Glasner & Van der Vaart, 2009):
• A graphical display of a timeline in temporal units (usually calendar units or other units shared by the respondents studied).
• One or more thematic axes (usually rows) that represent the domains in which the researcher is interested.
• Incorporation of recall aids such as bounding or landmarks (public or personal).

[Figure 3.4: calendar grid with one column per month (JANUARY 2007, FEBRUARY 2007, MARCH 2007) and one row per domain. The cell entries are listed by row; their column placement is only partly recoverable from the extracted figure.]
Public landmarks: Bulgaria & Romania joined EU (1st); Songwriter Karel Svoboda committed suicide (28th); Hurricane Kyrill killed many people and damaged cities in Europe as well in CZ (18-19th); winter school holidays, St. Valentine’s Day (12th)
Personal landmarks: Big party, showing pictures from Barcelona; Visit of friends in Rychleby mountains; Honeymoon in Barcelona; I got the job!
Work: unemployed (January); unemployed, looking for a new job, several competitive examinations (February); unemployed, got the job at the university from May (March)
Relationship: with Jane (recently married – 12/06) ......
Health: Chilled in Rychleby, one week ill; ×; ×

Figure 3.4 An example of the calendar instrument design (rows and columns truncated)

53 R.Š. works at the Institute of Psychology at the Academy of Sciences of the Czech Republic in Brno. Personal communication; December 15, 2011.
54 Other names that are used for calendar instruments are, for example: event history calendar (Belli, 1998); life events calendar (Roberts & Horney, 2010); life history calendar; time–line or timeline (Van der Vaart, 2004; Van der Vaart & Glasner, 2007a); illustrated life history (Balán, Browning, Jelin, & Litzler, 1969). See two recent reviews for other names and settings in which these names are used (Belli & Callegaro, 2009; Glasner & Van der Vaart, 2009).

The basis of the calendar instrument, which is usually prepared in advance of the interview, is a sheet of paper (or a document on a computer screen) divided into rows and columns like a common wall calendar. Columns separate the chosen calendar units. In figure 3.4 these are months, but years, weeks, days of the week, hours, and even thirds of a month (in a two-year reference period; see Belli, et al., 2001) or non-calendar units (e.g., in Axinn, et al., 1997) may be used. The temporal units have to be appropriate for the researchers’ aims55.

55 See chapter 3.3 for more details about the choice of units.

Calendar rows typically contain landmark events and the domains of researchers’ interest. The calendar may have rows for both public landmarks (these are especially helpful when a longer recall period is used, because more public landmarks happen in such periods) and personal landmarks. Public landmarks are prefilled by the researcher and personal landmarks are generated by the respondent. Other rows contain the domains of researchers’ interest (e.g., work, relationships, health, risk behavior, consumer behavior, and school grade for younger respondents) (Belli, et al., 2009; Glasner & Van der Vaart, 2009). Researchers may also be interested in reports from domains that are not incorporated into the calendar instrument, or for which the calendar is not suitable as a data collection instrument. In these cases additional question lists may be used and the calendar may serve as a recall aid only.

The typical procedure is that the respondent is first informed about the calendar features and how to use the calendar (if the calendar is visible), then generates the personal landmarks, fills in the domain grids, and answers additional questions if there are any. In the example in figure 3.4 the researcher is interested in relationship changes (that is why only one event is there: life with Jane), work changes (the respondent was first unemployed but then found a job), and health problems (the respondent recalled one spell of illness).

The calendar should enhance autobiographical memory through parallel, sequential, and top-down cues (Belli, 1998). Parallel cues describe the aiding potential of one domain on another. For example, the fictional respondent from figure 3.4 remembers that he was on honeymoon in Barcelona (landmark event). He knows that this was close to the beginning of the year. The public landmark supports this because it reminds him of celebrating St. Valentine’s Day not in the Czech Republic but in Barcelona. The honeymoon landmark is connected to the job search and to several competitive examinations he took part in just before leaving for Barcelona (more about landmarks in chapter 2.3). The job search is related to the health domain because he remembers being in a recovery phase during the job interviews.

Events from the same theme (here domains) provide sequential cues to each other. The respondent reconstructed the fact that he took part in job interviews in February and that he found the new job quite quickly, so the chronology of events implies that it must have been March 2007 or maybe early April when he received the message about the new job. He does not remember whether he spoke about the new job at the Barcelona party, so this parallel cue does not help. The same is true for the public events, which are not connected to the job. Filling in the work domain for January is also easy because he must have been unemployed.

The last type of cue used in calendar instruments is top-down cueing. This refers to the hierarchy of autobiographical memory. Researchers may first ask about more general events in the domain and later about specific details: for example, first asking whether he was with somebody during the reference period and later asking about the specific relationship details of interest.

An interview using the calendar instrument, in comparison to the traditional question-list interview, may provide better data quality in these respects (Balán, et al., 1969; Belli & Callegaro, 2009):
• Time or date estimates may be more accurate. This is especially true when the task difficulty is high (Van der Vaart & Glasner, 2007a).
• The overreporting of behavioral frequencies may decrease.
• Duration estimates may be more accurate.
• The completeness of the data may improve. The “gaps” in reports are more easily found.
• The inconsistencies in reports are more easily found.
• It may be more difficult to satisfice (e.g., by providing a quick and often poor-quality answer) in calendar interviewing, because the answer is often reconstructed in front of the interviewer (Yu et al., 2004).
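The parallel and sequential cues described above can be illustrated with a small data-structure sketch. This is only an illustration and not part of any calendar software cited here: the class CalendarGrid, its method names, and the month placement of the entries are assumptions made for the example (the entries loosely follow figure 3.4).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CalendarGrid:
    """Toy calendar instrument (hypothetical): rows are domains, columns are temporal units."""

    units: list[str]
    rows: dict[str, dict[str, str]] = field(default_factory=dict)

    def add(self, domain: str, unit: str, entry: str) -> None:
        """Write an entry into the cell given by domain (row) and unit (column)."""
        if unit not in self.units:
            raise ValueError(f"unknown temporal unit: {unit}")
        self.rows.setdefault(domain, {})[unit] = entry

    def parallel_cues(self, domain: str, unit: str) -> dict[str, str]:
        """Entries from OTHER domains in the same temporal unit."""
        return {
            d: cells[unit]
            for d, cells in self.rows.items()
            if d != domain and unit in cells
        }

    def sequential_cues(self, domain: str, unit: str) -> dict[str, str]:
        """Entries from the SAME domain in the directly neighbouring units."""
        i = self.units.index(unit)
        neighbours = self.units[max(i - 1, 0):i] + self.units[i + 1:i + 2]
        cells = self.rows.get(domain, {})
        return {u: cells[u] for u in neighbours if u in cells}


# Illustrative entries; the month placement is assumed for the example.
grid = CalendarGrid(units=["2007-01", "2007-02", "2007-03"])
grid.add("Work", "2007-01", "unemployed")
grid.add("Work", "2007-02", "unemployed; looking for a new job")
grid.add("Personal landmarks", "2007-02", "Honeymoon in Barcelona")

# A landmark in the same month cues the work domain (parallel cue).
print(grid.parallel_cues("Work", "2007-02"))
# The February work entry cues the still empty March cell (sequential cue).
print(grid.sequential_cues("Work", "2007-03"))
```

Top-down cueing is not modelled here because it concerns the order of the interviewer's questions rather than the layout of the grid.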

Overreporting is an important issue in studies focused on frequency reports of smoking, consumption, etc. (e.g., Belli, et al., 2007; Sobell, Sobell, Leo, & Cancilla, 1988). Unique public or personal events are usually well known and their frequency is thus not a problem. However, when several similar events happen over the years (e.g., several summer vacations in Croatia), the calendar instrument may help to distinguish these events from each other. All the other calendar instrument effects mentioned are important for unique event dating, and I will explore them in more detail in the next sub-section, where I review the most relevant studies and their effects.

The reason why I wrote “may provide better data quality” is that the efficiency of the calendar instrument under different conditions has not yet been thoroughly explored (see next section), and it also depends on the type of events researchers are interested in. If, for example, the reference period is too remote and the events are not relevant, even the most suitable calendar instrument may not have any effect. The same is true for other very difficult or very easy recall tasks: once people easily know the answer or are not able to produce it at all, the way of data collection does not have any impact on data quality (see e.g., Van der Vaart & Glasner, 2007a).

Incorporation of a calendar instrument into data collection

There are many ways in which calendar instruments may be incorporated into data collection. The main differences are summarized in table 3.1.

Table 3.1 Possibilities for incorporating the calendar into data collection

Calendar medium: paper and pencil | electronic | both
Administration: self-administered | interviewer administered | mixed
Mode of interviewing: face-to-face | telephone | online | mail | group
Calendar visibility: respondent only | interviewer only | both
Function of a calendar: visual aid only | collect. instr. & visual aid | mixed
Flexibility of an interview: fully structured | semi-structured | unstr. int. | mixed int.

Most calendar instruments are paper and pencil (Belli & Callegaro, 2009). This medium has the advantage that it is not limited by the size of the LCD screen, as the electronic version is, and people can hold the calendar in their hands. The only limitation is the size of the table and the transparency of the arrangement; a size bigger than A2 format is thus not advisable. The paper medium is also more pleasant to work with when the calendar interview takes a long time, because paper does not tire the eyes as much as a screen does and it is possible to point at event sequences with a finger, write notes with a pencil, etc.

Computerized calendar instruments have become increasingly popular. The reason is that the data do not need to be transcribed and the software can provide consistency checks and offer appropriate feedback in predefined situations (Belli & Callegaro, 2009; Callegaro et al., 2005; Reimer & Matthes, 2007). Complicated computerized calendars with various validity checks are more suitable for large-scale studies because the cost of programming may be relatively high and working with these calendars requires appropriate training as well (Belli, et al., 2007; Glasner, 2011).

Combining electronic and paper-and-pencil calendar instruments may be a good solution in larger-scale studies. The researcher uses the electronic version of the calendar instrument for checkups and feedback and enters the reports into it while the respondent uses the paper version. Another solution is to use just the paper calendar and enter the reports straight into a computer database (I used this approach in Studies II and III). The researcher loses the advantage of the checkups and automated feedback, but this is usually not an issue in smaller studies.

When the topic is sensitive (e.g., alcohol consumption or the incidence of partner violence), self-administration may be a good solution (e.g., Martyn, 2009), but in most cases the interview is interviewer administered, and sometimes mixed. Interviewer administration has the advantage that the notes written into the calendar will be readable for the interviewer (in the paper-and-pencil case), and it is usually a must for electronic calendars because they are usually too difficult for untrained respondents to use. Another advantage is that the interviewer may better assist the respondent. Recently Glasner (2011) developed a self-administered electronic online calendar that was easy for respondents to use. Her findings suggest that there is potential in using electronic online calendar instruments (e.g., the visual feedback was helpful), but, for example, requesting landmarks resulted in higher break-off rates56.

56 Glasner did not know the true dates of events, which is why her analyses bring only limited insight into the problem of using online calendar instruments.

Data are usually collected during a face-to-face interview or via telephone. An online interview mode will probably receive more attention in the future as well, but as mentioned by Glasner (2011) this will need further exploration. A telephone interview has the disadvantage that the respondent cannot use the visual aid of the instrument and thus has to rely on the interviewer’s ability to find the best cues and check the consistency of the report.

The calendar may be visible to both interviewer and respondent, which is possibly preferable because both of them can make checkups and use the calendar as a recall aid. However, very often the calendar is seen only by the interviewer (this is especially the case in telephone interviews). In some studies the calendar was visible to the respondent only (Van der Vaart & Glasner, 2007a).

The calendar instrument may function as a visual aid only (e.g., Van der Vaart & Glasner, 2007a) or as a data collection instrument and visual aid at the same time (Belli, et al., 2007). When the calendar is used as a visual aid only, the respondent may fill in the personal landmarks and then answer the interviewer’s questions while checking the calendar to aid recall. Collecting the data with a calendar instrument may be easy, but it can also be very difficult when many diverse questions are asked. In these cases the interviewer may collect only some data with the calendar instrument and the rest separately, using the filled-in calendar as a recall aid only.

Calendar interviews may be flexible (or unstructured), semi-structured or fully structured (Robson, 2002). Most of the interviews are semi-structured (e.g., Belli, et al., 2001). Yet structured interviews (e.g., Van der Vaart & Glasner, 2007a), qualitative flexible interviews (e.g., D. A. Harris & Parisi, 2007) or mixed interviews (e.g., Yount & Gittelsohn, 2008) can be found as well. Temporal data collection should preferably be at least partially flexible because all the probing and recall aids should fit the individual respondent (see the previous chapter 3.3). This is why the semi-structured interview is most often used (Belli, et al., 2009). More structured interviews, on the other hand, allow more standardization of the procedure and decrease the potential for interviewer bias.

Even so, less structured interviews seem to be more suitable because they can use all the recall aids more flexibly and profit maximally from the better rapport that comes with a more conversational interview (Belli & Callegaro, 2009). The more conversational nature may also reduce satisficing, that is, choosing the first acceptable answer that comes to the respondent’s mind or using other “shortcuts” that often lead to inaccurate reports (Krosnick, 1991). In all cases proper interviewer training is important (Dijkstra, et al., 2009) because inappropriate use of probes and other recall aids may lead to interviewer-induced bias (Van der Vaart, 2004).

Effect of a calendar instrument on data quality (in particular on dating accuracy)

The most notable feature concerning the quality of retrospective reports in this study is that they vary more as a function of which variable is being collected rather than whether an EHC or CQ interviewing method is being implemented57. (Belli, et al., 2007, p. 617).

Many studies have evaluated the effect of a calendar instrument on data quality, and they usually found that calendar instruments had either a neutral or a mild to substantial positive effect. I will not describe these studies in much detail because extensive and recent reviews are available (see e.g., Belli & Callegaro, 2009; Glasner, 2011; Glasner & Van der Vaart, 2009) and most of the studies are only partially devoted to dating accuracy or have severe limitations58. In the following section I will explore in detail several studies that are most closely related to my own Studies II and III. For each study I summarize in brackets the main design features and how the calendar was incorporated.

Plain calendar study (Gibbons & Thompson, 2001)

[Experimental design: interview with common calendar versus interview without a calendar. Events from 1 week to 8 or 14 weeks old; interview right after; calendar as visual aid only; fully structured interview]

Even though this study does not compare a calendar instrument interview with another type of interview, it brings valuable insight into the role of the calendar alone. The authors asked about the dates of personal events, using a common calendar as a recall aid in one condition and no calendar in the other. No landmarks or other features of calendar instruments were used.

57 EHC = event history calendar (calendar instrument). CQ = conventional questionnaire.
58 Many studies do not have any external standard to which the accuracy could be compared (e.g., Engel, Keifer, & Zahm, 2001; Glasner, 2011; Goldman, Moreno, & Westoff, 1989). Other studies have this external source but do not compare the calendar instrument interview with other types of interviews, which limits generalizations about the superiority of a calendar instrument interview (e.g., Hoppin, et al., 1998; Rosenberg et al., 1983).

Table 3.4 Average proportions of exact date estimates, DOW estimates and effect size in Gibbons and Thompson study (2001)

Proportions | Cal (14 weeks, Exp. 1) | non Cal (14 weeks, Exp. 1) | η2 (14 weeks, Exp. 1) | Exp. 1 (8 weeks, non Cal) | Exp. 2 (8 weeks, non Cal)
Date correct | .337 | .239 | .135 | .307 | .272
DOW correct | .466 | .197 | .629 | .226 | .146

Note: non Cal = condition without calendar. Cal = condition with calendar. DOW = day of the week. Proportion = average proportion of correct date estimates or DOW estimates out of all answers (maximum is 1). In DOW the exactly dated events were excluded. η2 = eta squared.
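The comparison of table 3.4 with the chance level for day-of-week estimates rests on simple arithmetic, sketched here in Python. The proportions are copied from table 3.4; the variable and function names are mine, and the sum treats both proportions as shares of all answers, as stated in the table note.

```python
# Chance level for guessing the day of the week: one of seven weekdays.
CHANCE_DOW = 1 / 7

# Proportions copied from table 3.4 (Gibbons & Thompson, 2001).
exp1_cal_14_weeks = {"date_correct": 0.337, "dow_correct": 0.466}
exp2_non_cal_8_weeks = {"date_correct": 0.272, "dow_correct": 0.146}


def total_dow_correct(proportions: dict) -> float:
    """Exactly dated events have the right weekday too, so both shares are summed."""
    return proportions["date_correct"] + proportions["dow_correct"]


print(round(CHANCE_DOW, 3))                               # 0.143
print(round(total_dow_correct(exp1_cal_14_weeks), 3))     # 0.803
print(round(total_dow_correct(exp2_non_cal_8_weeks), 3))  # 0.418
```

Read this way, the DOW-only proportion in experiment 2 (.146) sits almost exactly at the chance level, while the summed proportion (.418) lies well above it.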

As can be seen in table 3.4, the calendar had the strongest effect on DOW estimates (η2 = .63). This is especially true in experiment 1, where a 14-week reference period was used. Errors greater than 42 days were excluded from the analyses (less than 2% of the data). The chance level for a correct DOW estimate is p = .143 (see footnote 59). The overall proportion of correct DOW would be higher because exactly dated events were excluded from this analysis. (The proportion would be a little more than the sum of .337 and .466, which means that more than 80% of the answers had the correct DOW.) Participants were undergraduate students, which is probably why the percentage is so high. Undergraduates usually have very schematic weeks: lectures, free time, vacations, visiting parents, etc.

59 1 / 7 = .143

When the reference period was shortened to 8 weeks only, the proportion of correct DOW in the non-calendar condition in experiment 1 (when older events were excluded) remained above the chance level, and both correct date and correct DOW were slightly more frequent than for the 14-week period. This implies that for more recent events both DOW and exact dates are easier to estimate. This effect was only partially replicated in experiment 2, where the proportion of correct DOW (when exactly dated events were excluded) was almost exactly at the chance level (p = .146). But if the proportions of exact datings and DOW datings in experiment 2 are summed, the result again greatly exceeds the chance level (p = .418).

To sum up, the mere presence of the calendar helps to make exact date estimates more correct and especially increases the accuracy of DOW estimates. I did not find any study making the same comparison in situations where the error is measured in month units, but it could be argued that the effect would be very small or nonexistent. The reason is that the month of the year as well as the season are remembered relatively well and people thus do not need a calendar to help them choose the right month (Larsen, et al., 1995). The Czech calendar has many cyclical highlights over the year that serve as temporal schemata (Vokolek, 2011). On the other hand, when the reference period is long and covers very remote events, a timeline or calendar with years could help people to visualize the length of the reference period. But without the possibility of writing the events into the calendar this help will probably be quite small (if any).

Purchases of pairs of glasses study (Van der Vaart & Glasner, 2007a)

[Experimental design: calendar instrument interview versus question list interview. Events from 1997–2004. Interview took place in 2004; Calendar instrument: Paper and pencil; self–administered; telephone; calendar visible to respondent only; visual aid only; fully structured interview.]

In the experimental condition the authors used a simple timeline embedded in a consumer survey. It had a seven-year grid and several domains (age, city & street, domestic situation, jobs & courses, personal landmarks). In the control group no calendar instrument (or plain calendar) was used. Respondents in the calendar condition filled in the calendar before the interview. During the telephone interview they were asked to write purchases (of glasses and lenses) into the calendar instrument; next, questions were posed about (the dates of) these purchases. Respondents in the control group simply answered the questions. The mean signed dating error in month units, with the mean dating error in brackets, was -0.88 (7.06) in the calendar condition and -0.98 (16.14) in the control group. Only the difference in dating error was significant (p < .001). The effect of the calendar instrument was greater when the task difficulty was higher (= less salient and less recent events).
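The difference between the two error measures reported for this study can be sketched as follows. The dates are hypothetical, the function name is illustrative, and the sign convention (reported minus true, so that a negative value means the event was dated too early) is an assumption rather than the coding documented by Van der Vaart and Glasner (2007a).

```python
from datetime import date
from statistics import mean


def months_between(true_date: date, reported_date: date) -> int:
    """Signed difference in calendar months (day of month is ignored).

    Assumed convention: negative = reported earlier than the true date.
    """
    return (reported_date.year - true_date.year) * 12 + (
        reported_date.month - true_date.month
    )


# Hypothetical (true, reported) purchase dates, not data from the study.
pairs = [
    (date(2001, 3, 1), date(2001, 1, 1)),   # dated 2 months too early
    (date(1999, 6, 1), date(2000, 6, 1)),   # dated 12 months too late
    (date(2003, 9, 1), date(2002, 11, 1)),  # dated 10 months too early
]

signed_errors = [months_between(t, r) for t, r in pairs]
mean_signed_error = mean(signed_errors)                     # direction of the bias
mean_dating_error = mean(abs(e) for e in signed_errors)     # size of the error

print(signed_errors)       # [-2, 12, -10]
print(mean_signed_error)   # 0
print(mean_dating_error)   # 8
```

In this toy example the early and late errors cancel out in the signed measure while the unsigned dating error stays large, which shows how the signed errors of two conditions can be similar even though their dating errors differ considerably.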


Educational careers (Van der Vaart, 1996, 2004)

[Experimental design: calendar instrument interview versus question list interview. Recall period 4 to 8 years. Calendar instrument: Paper and pencil; self–administered; face-to-face interview; calendar visible to respondent and interviewer; visual aid only; both interviews fully structured.]

In this study Van der Vaart compared calendar instrument interviews with the classic question list. The dates collected in the first wave of a longitudinal social survey were treated as the “true dates”. The calendar instrument significantly enhanced data quality, especially when the task was difficult (= more remote events, less salient events, high frequency of similar events). The CAL condition led to more correct estimates of the starting year of the courses and to a more correct sequence of the courses. With regard to the error measures, the CAL condition positively affected only dating error (not signed error or heaping).

Panel study of income dynamics I. (Belli, et al., 2004; Belli, et al., 2001)

[Experimental design: flexible calendar instrument interview versus structured question list interview. Interview in 1998 about events from 1996 and 1997. Calendar instrument: electronic; interviewer administered; telephone interview; calendar visible to interviewer only; data collection and visual aid]

Reports were compared to data that were collected in a previous wave of the longitudinal research. Compared with the question list condition, the CAL condition led to increased accuracy of the reports about moves, weeks unemployed, days missing from work (due to own illness or the illness of others), and income. But regarding the mean absolute error, almost no differences were significant (apart from self-ill and other ill).

Panel study of income dynamics II. (Belli, et al., 2007)

[Experimental design: flexible calendar instrument interview versus structured question list interview. Reference period up to 30 years. Calendar instrument: paper-and-pencil; interviewer administered; telephone interview; calendar visible to interviewer only; data collection and visual aid]

Reports were validated against the annual reports from the panel study. Compared with the question list condition, the CAL condition led to increased accuracy of the reports about cohabitation, work history, unemployment history, smoking, and number of working hours. The CAL condition also led to decreased data quality for some topics, but this is probably an artifact rather than a real effect of the calendar instrument (the difference was in the number of reported marriages, which was probably caused by the different wording of the question rather than by the calendar instrument itself).

Potential costs and problems connected to calendar instruments

Glasner (2011) explores several issues related to the costs or potential problems connected to calendar instruments. These are increased interview time and preparation time, the cost of developing the calendar instrument, problems with data entry, the necessity of interviewer training, decreased data quality, and non-response.

Most studies found some increase in interview time when the calendar instrument was used (usually in comparison to a question-list interview). The increase ranged from about 10% (Belli, et al., 2007; Van der Vaart, 1996) to an interview approximately twice as long (Engel, et al., 2001). However, some studies found no increase in interview time, for example the study of Belli et al. (2001), which used a telephone interview where the calendar was visible only to the interviewer. Interview time is nevertheless not the only time cost.

Calendar instruments also need to be prepared in advance, and more complicated electronic versions must be programmed, which may be time-consuming and expensive (Belli, et al., 2007; Reimer & Matthes, 2007). On the other hand, simple paper-and-pencil calendars may be prepared easily without much effort or additional cost. This is also the reason why I will use a very simple paper-and-pencil calendar instrument that can be quickly drawn at no cost.

More complicated calendar instruments need longer interviewer training, but simple calendar instruments may also need slightly more training than a common interview because interviewers need to learn how to provide sequential, parallel or top-down cues and how to probe without being suggestive (Belli, et al., 2004). Dijkstra, et al. (2009) recommend that good training should include learning the manual (with all the probes), role plays, a practice interview with a real respondent, short refresher trainings after some time, and monitoring during the fieldwork.


Even though calendar instruments usually do not decrease data quality, on rare occasions an interview without a calendar instrument may bring better results. Belli and Callegaro (2009) mention, for example, that autobiographical facts may be better elicited with classical questions than with a calendar interview, which may be too complicated for gathering these facts and may sometimes even be misleading. This is an interesting finding; how frequently this happens, however, needs further exploration60.

The most serious problem is probably the increase in non-response related to the use of calendar instruments in some modes of interview. It seems that especially a self-administered calendar in anonymous mode may substantially increase non-response (Yoshihama, Gillespie, Hammock, Belli, & Tolman, 2005). Glasner (2011) found that especially the inclusion of self-generated landmarks increased non-response in her online calendar instrument study. In one of my unpublished studies I found that even though the calendar used in an anonymous online interview (only as a recall aid) did not increase non-response, respondents did not fill it in properly (they did not spend enough time on completing it), which is why the calendar could not have any effect on data quality (Neusar & Ježek, 2009)61.

3.5 Summary of the predictors of dating difficulty

In the previous section various predictors of dating accuracy were introduced. Because there were so many of them, this chapter summarizes them in one table and depicts their impact on task difficulty. “Easy to date” are the characteristics of the respondent, events or data collection that generally make the dating task easier. These characteristics should generally be associated with relatively high dating accuracy. This, of course, does not mean that the target event must be dated accurately, because no model can definitely predict dating accuracy, especially when only one or a few events are dated. “Neutral” are the characteristics that sometimes have an impact on dating accuracy but for which many studies found no relationship or only a very small one. “Difficult to date” are the characteristics that are generally connected with higher task difficulty, so worse dating accuracy may be expected. This summary (see table 3.5) simplifies the interconnections among the predictors and the complexity of the problem. For more details see the particular sections above.

60 This agrees with common experience. It happens fairly often that we know something, and when we start exploring how we know it, this influences our confidence and sometimes we even change our answer, often with the result that a correct answer is wrongly changed. Of course, correction to the right answer is common as well.
61 I spoke about this unpublished work at the 4th ESRA conference in Lausanne (July 12-22, 2011). The title of the presentation was “The effect of event history calendar on dating accuracy in an online survey”. Abstract retrievable from: http://surveymethodology.eu/media/files/ESRA_Conference_2011_Programme_Book_1.pdf

Table 3.5 Summary of dating difficulty predictors and their relationship with dating difficulty

Easy to date | Neutral | Difficult to date

Respondent characteristics
- | Age of respondents | -
Reminiscence bump years | - | Ordinary years
Good memory for dates | - | Poor memory for dates
“Accurate” dating strategies | - | “Approximate” dating strategies
Woman (personal events) | - | Men (personal events)
- | Gender (public events) | -
- | Personality | -
Interest in the area of target events | - | No interest in the area of target events
Alert, motivated, good mood | - | Tired, demotivated, depressive

Event characteristics
Respondent independent characteristics
Recent events | - | Old events
Highly typical or atypical events | - | Medium typicality
Low-frequency events | - | High-frequency events
Self-events | - | Other-events
Respondent dependent characteristics
Important theme | - | Unimportant theme
Available temporal schemata | - | Unavailable temporal schemata
Important event | - | Unimportant event
Sharing the memory | - | Not sharing the memory
Vivid memories | - | Vague memories
Detailed memories | - | Details forgotten
High-knowledge events | - | Low-knowledge events
Date known | - | Date reconstructed
Easy to recall | - | Difficult to recall
High-saliency events | - | Low-saliency events
Unique events | - | Ordinary events
Memorable events | - | Not memorable events
Pleasant events | - | Negative events
High emotional involvement | - | Low emotional involvement
Event characteristics
Pleasant events (especially self-events), Negative events (especially other-events) | - | Neutral events, negative events (especially self-events)
High emotional involvement | - | Low emotional involvement
Self-related events | - | Other-related events
High confidence in date estimates | - | Low confidence in date estimates

Data collection
Facilitation of respondents’ motivation and recall | - | No facilitation of respondents’ motivation and recall (especially when task difficulty is high)
Enough time for date reconstruction | - | Time pressure
Collaborative recall | - | Single recall
Plain calendar (recent events) | Plain calendar (remote events) | No calendar (recent events)
Calendar instrument | Calendar instrument (too easy or too difficult task) | No calendar instrument (especially when task difficulty is high)

Note. Temporal units are skipped because it is not possible to reduce the problem of the relationship between temporal units and task difficulty to this table.


4 Overall design and research aims of empirical studies

This chapter summarizes the overall design and research aims of the empirical studies and shows how the studies are interconnected and how they follow up on each other. The empirical studies were conducted in the chronological order in which they appear in the dissertation, with an overlap between Study II and III62. There is a discussion chapter at the end of each study, and a general discussion and conclusions for all studies in chapter 8. The bibliography is included just once at the end of the dissertation. For more details about each study see the particular studies (chapters 5–7).

4.1 Introduction

Researchers and other professionals often ask about the temporal aspect of events that people experience. On many occasions, however, the date cannot be objectively verified, and researchers [63] thus have to trust that their respondents' date estimates are sufficiently accurate. When people estimate the dates of several events (or several people estimate the date of a similar event), events with certain characteristics are on average dated more accurately and respondents with certain characteristics are on average more accurate; all this is moderated by the way of data collection, which may be more or less appropriate. As mentioned in chapters 3.1 to 3.4, few studies have examined several accuracy predictors at the same time, which makes it difficult to form an overall picture of which characteristics are the better predictors of dating accuracy. I tried to fill this gap in the theoretical review, which is thus longer than usual. Studies I to III will try to fill this gap empirically.

Study I deals with well-known public events only. This was done on purpose: many event characteristics were known in advance, so I could learn a lot about the predictors of dating accuracy for public events. Another reason is that the pattern of dating error is known to be more or less similar for public and personal events (Kemp, 1999), so I could transfer some of the gained knowledge into the design of Studies II and III, which focus mostly on personal events.

[62] Study III was finished before Study II, which started earlier but continued for a longer time.
[63] I will write about researchers, but the same applies to professionals from other fields.

The previous chapters have shown that there are many event and respondent characteristics and ways of data collection that are or could be related to dating accuracy. No single study can explore all of them, so I had to choose which ones to include in my studies. My criterion was that a predictor had to be easy to measure, ideally by a single question such as: “Do events like this happen regularly?” “What is your highest received degree?” “Are you sure that the day of the week estimate is correct?” The reason for this criterion is that predictors that can be easily assessed can also be incorporated into interviews in applied settings without much effort. This seems important to me because in real-life situations almost no professional will use the complicated phenomenological questionnaires mentioned in chapter 3.2, although such questionnaires may generate valuable information. Most professionals will probably ask “Are you really sure about this date?” or something similar. My aim is thus to find out which of these questions about various predictors are most worth asking. The predictors may not have the same impact on remote and recent events, which is why my studies focus on both remote and recent event estimates.

As mentioned in chapter 3.4, the calendar instrument seems to be a promising technique for improving the quality of date estimates. However, there is a lack of evidence about the conditions under which these instruments work best and when it is better not to use them (Belli & Callegaro, 2009). Only a few studies have compared a calendar instrument interview with another type of interview (see chapter 3.4 for more details). The available studies (e.g., Belli, et al., 2001; Belli, et al., 2007; Van der Vaart, 2004; Van der Vaart & Glasner, 2007a) compared the calendar instrument interview to a classic question list, where researchers simply asked “when” questions and waited for the answer without offering any help. A significant improvement in the calendar condition is thus not a great surprise: in one condition researchers did many things that usually help people date more precisely (e.g., use of a calendar, landmarks, more time for the task, use of domains, and sometimes a more flexible interview as well), and in the other condition they did nothing. On the other hand, in some studies (e.g., Belli, et al., 2001) the calendar was not visible to the respondent, which could have limited the potential effect of the calendar.


Results from these studies are certainly relevant, as the majority of large-scale studies employing heavily structured interviews with question lists demonstrate that too much standardization without recall aids may lead to decreased data quality. However, these studies bring only limited insight into smaller and less standardized (or qualitative) studies, where researchers often try to help the respondent as much as possible. Another issue is that the above-mentioned studies inquired about events that were relevant for researchers and for which researchers had an objective or at least approximate source for the date (e.g., from databases or previous data collection). Focusing on personally more relevant events may bring further insight into the impact of calendar instruments on events that are idiosyncratic (i.e., different for every respondent).

I will thus compare a semi-structured calendar instrument interview with a semi-structured interview without the calendar instrument. Both interviews are as similar as possible and differ mainly in the availability of the calendar instrument (see more details below). Studies II and III will focus on personal events, many of them highly relevant for the respondents. Study II focuses on remote events from the same period as Study I, and Study III on recent events. As mentioned in the review part, personally relevant events are usually dated quite precisely (Larsen, et al., 1995; Wagenaar, 1986), and it is questionable whether calendar instrument interviews can bring a significant increase in the dating accuracy of these events.

The problem with calendar instruments is that they are sometimes very complicated, and electronic versions also need to be programmed (see chapter 3.4).
Using such instruments is often not possible in smaller-scale studies (because of the cost or lack of flexibility) or for professionals (journalists, historians, doctors) who only want to receive valid answers to “when” questions and typically do not have much time for conducting lengthy interviews. In these situations less complicated calendar instruments may be more appropriate, especially in paper-and-pencil form, because drawing the calendar on a sheet of paper does not take long. This is why I use a very simple calendar instrument in both Studies II and III.

Public events (especially temporal landmarks) are sometimes used by researchers to aid recall during data collection (Gaskell, et al., 2000). There are two ways to use temporal landmarks as recall aids. Researchers can provide the landmark event (or events) with its date and hope that the target event in which they are interested will be temporally connected to it. The landmark event will then provide a temporal anchor, and the date estimate will thus be more accurate. Or, in the case of very salient events whose exact date everyone knows, researchers can simply expose people to these public landmarks, as did Loftus and Marburger (1983), who found that mere exposure to public landmarks (the eruption of Mount St. Helens and New Year’s Day) may reduce the dating error of other related events.

There are several limitations to the use of public landmarks as recall aids. Firstly, they have to be dated precisely by respondents if the true date is not provided. Secondly, they have to be relevant for respondents, because relevance increases the chance that they will be temporally connected to the other events of interest (Van der Vaart & Glasner, 2007b). Thirdly, even though public landmarks such as New Year’s Day or Christmas may work, as was shown in the Loftus and Marburger study cited above, people generally use these seasonal landmarks spontaneously anyway. And fourthly, public landmarks such as the “Velvet Revolution” in the Czech Republic rarely, if ever, happen in ordinary years.

4.2 Aims and research questions

The main aim of this dissertation is to find predictors of the dating accuracy of unique personal and public events and thus to gain more insight into the mechanisms that affect dating accuracy. Studies I to III explore event and respondent characteristics, and Studies II and III additionally compare two ways of data collection (with and without a calendar instrument). The following research questions were asked [64]:
• Which event and respondent characteristics are the strongest predictors of dating accuracy among remote public (Study I) or remote personal events (Study II)?
• Which event and respondent characteristics are the strongest predictors of dating accuracy among recent personal events (Study III)?
• Do interviews with a calendar instrument lead to greater dating accuracy of remote personal or public events in comparison to interviews without a calendar instrument (Study II)?
• Do interviews with a calendar instrument lead to greater dating accuracy of recent personal or public events in comparison to interviews without a calendar instrument (Study III)?

[64] Studies II and III also partially explore several predictors for public events (e.g., confidence ratings of accuracy in date estimates). See the particular studies for more details.

Study I has two ancillary aims as well:
• First, to find out whether any of the public events from 2006 and 2007 could serve as temporal landmarks, and more specifically whether it would be beneficial to use them as recall aids in Study II.
• Second, to provide a subset of public events for the evaluation of the calendar instrument in Study II. I plan to choose events of varied difficulty, because a calendar instrument should work best for events that are neither too easy nor too difficult to date.

4.3 Design

All three correlational studies focus on date estimates of unique events and seek the strongest predictors of dating accuracy. Study I deals only with remote, well-known public events, while Studies II and III focus on personal events and partially on public events as well (both remote and recent). The dependent variable is dating error in calendar units: months and years in Studies I and II, and days, days of the week (DOW), and weeks in Study III (see table 4.1). Three groups of predictor (independent) variables are examined. Predictors measured separately for every event are called event characteristics. These include characteristics that are independent of a respondent’s evaluation (e.g., recency) and characteristics that have to be assessed by respondents (e.g., phenomenological characteristics of an event such as importance). Predictors related to respondents’ overall dating accuracy are called respondent characteristics (e.g., gender, self-evaluation of memory for dates). Studies II and III examine more event and respondent characteristics than Study I and add an experimental manipulation of data collection (interview incorporating a calendar instrument versus interview without a calendar instrument). All independent variables are summarized in table 4.2.


Table 4.1 Summary information about the studies

S    Recall period            Int. mode     Independent variables                Dependent variables
I    2005–2008                online        Event & respondent characteristics   Error in months
                                                                                 Error in years
II   2005–2008                face-to-face  Event & respondent characteristics   Error in months
                                            Data collection (manipulated)        Error in years
III  March 2011 to June 2011  face-to-face  Event & respondent characteristics   Error in days
                                            Data collection (manipulated)        Error in weeks
                                                                                 Error in days of the week
Note: S = study; Int. mode = mode of interview.
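The dependent variables in table 4.1 are all differences between an estimated and a true date, expressed in different calendar units. The following Python fragment is a minimal sketch of how such errors could be computed. The function names, the sign convention (a positive value means the estimate is later than the true date), the Monday-based definition of a calendar week, and the example dates are illustrative assumptions, not a description of the procedure actually used in the studies.

```python
from datetime import date, timedelta

# Assumed sign convention: positive error = estimate is later than the true date.

def error_in_years(estimated: date, true: date) -> int:
    """Signed difference between calendar years."""
    return estimated.year - true.year

def error_in_months(estimated: date, true: date) -> int:
    """Signed difference between calendar months (the day of the month is ignored)."""
    return (estimated.year - true.year) * 12 + (estimated.month - true.month)

def error_in_days(estimated: date, true: date) -> int:
    """Signed difference in days."""
    return (estimated - true).days

def _week_start(d: date) -> date:
    """Monday of the calendar week containing d (assumed week definition)."""
    return d - timedelta(days=d.weekday())

def error_in_weeks(estimated: date, true: date) -> int:
    """Signed difference between the calendar weeks of the two dates."""
    return (_week_start(estimated) - _week_start(true)).days // 7

def dow_correct(estimated: date, true: date) -> bool:
    """True when the estimated day of the week (DOW) matches the true one."""
    return estimated.weekday() == true.weekday()

# Hypothetical example: true date Thursday 28 April 2011, estimate Monday 2 May 2011.
true_date = date(2011, 4, 28)
estimate = date(2011, 5, 2)
print(error_in_days(estimate, true_date))   # 4
print(error_in_weeks(estimate, true_date))  # 1
print(dow_correct(estimate, true_date))     # False
```

Under these assumptions a four-day error that crosses a week boundary counts as a one-week error, which illustrates why errors in days, weeks, and DOW are reported as separate dependent variables.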

In Study I respondents had to estimate the dates of 35 predefined events (plus several freely recalled events). In Study II respondents estimated the dates of approximately [65] 28 personal events and 8 public events (the public events were chosen from Study I and were the same for all respondents). Personal events with their true dates were collected by the respondents' proxies, who had to check each date against documents such as e-mails, pictures, diaries, or bills. In Study III respondents estimated the dates of approximately 22 personal events and 4 public events (the public events were the same for all respondents). Events were collected through an online proxy diary. In Studies II and III proxies were not allowed to tell their partners that their task would be to estimate the dates of personal events (they could tell them that the study would be about everyday memory).

The advantage of Studies II and III is that they do not use the respondent's own diary but the diary or documents of the proxy. This avoids the problem, common in diary studies, that the act of keeping the diary may influence dating accuracy.

Experimental conditions in Studies II and III

Study II compares a face-to-face semi-structured interview incorporating a calendar instrument (CAL) with the same interview without a calendar instrument (Non-CAL). Respondents in the control group simply answer the questions from a question list printed on an A4 sheet and write their answers into this list (the interviewer simultaneously records the answers in an Excel sheet on a notebook computer). Study III is similar in design, with one exception: respondents in the Non-CAL condition may use a plain pocket calendar (see Appendix 3). The reason for including this calendar is that in real-world situations people hardly ever estimate the date of recent events without such a pocket calendar (or a calendar in a mobile phone or notebook). Another reason is that people often know that an event happened, for example, three weeks ago on a Monday, yet find it difficult to work out the corresponding date. This is supported by the study of Gibbons and Thomsen (2001) (see more details in chapter 3.4). My aim is not to measure “numeracy” but the accuracy of date estimates, which could be (and evidence shows that it is) influenced by a poor ability to calculate dates. Like the calendar instrument, this small calendar highlights three public holidays that fell within the reference period. Calculating dates is not an issue for remote events, which is why the control group in Study II does not use any additional plain calendar. Even though the calendar in Study II includes major public holidays, the interviewers’ task was to work with these cyclic events with respondents in the Non-CAL condition as well (see the thorough description of the calendar instruments below). Respondents were not aware that there were two experimental conditions.

In both studies random assignment to conditions was used, with the constraint that half of the women and half of the men were assigned to each condition. The order in which the target events were presented was also randomized (in both cases random numbers generated in Excel were used). The roles (who would be the proxy and help with collecting the personal events of his or her partner, and who would be the respondent) were also randomly assigned.

[65] The number of personal events varies slightly.
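The assignment procedure described above (random assignment with women and men split evenly between conditions, plus a randomized order of target events) can be sketched in Python as follows. The studies used random numbers in Excel; Python's `random` module stands in for that here, and the function names, the data layout, and the rule that the extra person in an odd-sized group goes to Non-CAL are assumptions made only for illustration.

```python
import random

def assign_conditions(respondents, seed=None):
    """Randomly assign respondents to CAL / Non-CAL, balanced within gender.

    respondents: list of (respondent_id, gender) tuples (hypothetical layout).
    Returns a dict mapping respondent_id to "CAL" or "Non-CAL".
    """
    rng = random.Random(seed)
    by_gender = {}
    for respondent_id, gender in respondents:
        by_gender.setdefault(gender, []).append(respondent_id)

    assignment = {}
    for ids in by_gender.values():
        rng.shuffle(ids)                 # random order within each gender
        half = len(ids) // 2             # assumption: odd remainder goes to Non-CAL
        for respondent_id in ids[:half]:
            assignment[respondent_id] = "CAL"
        for respondent_id in ids[half:]:
            assignment[respondent_id] = "Non-CAL"
    return assignment

def randomize_event_order(events, seed=None):
    """Return the target events in a random presentation order (input is not modified)."""
    rng = random.Random(seed)
    shuffled = list(events)
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical sample: four women (ids 0-3) and four men (ids 4-7).
sample = [(i, "F") for i in range(4)] + [(i, "M") for i in range(4, 8)]
conditions = assign_conditions(sample, seed=1)
order = randomize_event_order(["event A", "event B", "event C"], seed=2)
```

Shuffling within each gender and then splitting the shuffled list keeps the assignment random while guaranteeing the balance constraint, which simple per-person coin flips would not.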

Calendar instrument design issues

The calendar instruments used in both studies are very simple (see Appendices 1 and 2). The main feature of both paper-and-pencil instruments is a calendar grid covering the recall period, with a blank space in each cell where personal landmarks as well as the date estimates of the target events can be recorded. The calendars do not contain any domains because these would make the calendar very big and less clear. It could be argued that the most important events should appear as landmarks anyway (so changes in the important domains will not be lost) and that domains do little to aid recall when nothing changes (such as living with the same partner during the whole recall period).

All target events in the calendar instrument condition were printed on adhesive labels (6 cm × 1 cm), and respondents were asked to stick the labels into the appropriate cells of the calendar instrument. When respondents realized they were wrong, the label was moved to another place. Self-generated landmarks were written on colored adhesive labels (5 cm × 1.5 cm). The calendar measured 55 cm × 36.5 cm in Study II and 47.5 cm × 40 cm in Study III. The smallest units are months in Study II and days in Study III. Both calendars included the main public holidays that all people in the Czech Republic are familiar with (Labor Day, Christmas, etc.). The only public date included in the calendar that is not generally known is the date of Easter in Study III, because the date of Easter varies from year to year. Even though I do not expect these events to have any impact on dating accuracy, I included them because they appear in all plain calendars (wall or pocket calendars) in the Czech Republic.

The calendar in Study II covers a reference period from January 2005 to December 2008. All events were from this reference period. Most target events were from 2006 and 2007.
Events from 2005 and 2008 (two events from each year) were excluded from most of the analyses because they could be heavily influenced by the boundary effect. (This is similar to Study I, where the boundary years are also excluded from some analyses.)

The calendar in Study III covers a reference period from March 14, 2011 to June 5, 2011 (this applies to both calendars: the plain pocket calendar and the calendar instrument). All personal events came from the 6-week reference period from April 4, 2011 to May 22, 2011. Because of this I do not expect any boundary effect in Study III (or only a very small one for the most remote events), and I analyze all events that were collected.
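The exclusion of boundary-year events described above for Study II amounts to a simple filter on the true date of each event. A minimal Python sketch, assuming a hypothetical list of event records with a `true_date` field (the record layout, labels, and dates are invented for illustration):

```python
from datetime import date

def exclude_boundary_years(events, first_year=2006, last_year=2007):
    """Keep only events whose true date lies in the inner years of the reference period."""
    return [e for e in events if first_year <= e["true_date"].year <= last_year]

# Hypothetical event records; labels and dates are placeholders.
events = [
    {"label": "event A", "true_date": date(2005, 11, 3)},
    {"label": "event B", "true_date": date(2006, 6, 17)},
    {"label": "event C", "true_date": date(2007, 2, 9)},
    {"label": "event D", "true_date": date(2008, 8, 30)},
]
inner = exclude_boundary_years(events)
print([e["label"] for e in inner])  # ['event B', 'event C']
```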

4.4 Samples

The inclusion criteria in Study I were an age between 18 and 70 years and living in the Czech Republic for most of the time between 2005 and 2008. Such a wide age range was chosen so that I could analyze the effect of age. In Studies II and III the age range was reduced to 23 to 40 years, because during these years people experience most of the “big first-time experiences” such as graduating from university, a wedding, or the birth of children (the reminiscence bump period [66]). The other inclusion criterion in these two studies was that respondents had to have lived as a couple since the beginning of 2006 or earlier (Study II) or for at least one year (Study III). The reason for the relationship-length criterion was that the personal events were collected by the respondents' proxies. In Study II proxies collected documents with events and their true dates, and in Study III they kept a diary for a period of 6 weeks. All couples were heterosexual, because respondents’ gender was used in the analyses as a predictor of dating accuracy and proxies’ gender was used in some additional analyses as well. The aim was to obtain a sample of couples with different educational, socio-economic and demographic characteristics, even though I did not expect these respondent characteristics to have a substantial impact on the dating accuracy of personal events. All three samples (especially those in Studies II and III) have a much higher percentage of respondents with a university degree. Some respondents took part in both Studies II and III, but I do not expect this to have any effect on dating accuracy.

[66] Even though the “reminiscence bump” usually ends earlier (see chapter 3.1), there is a trend in the Czech Republic (as well as in many other European countries) to postpone many important life events (e.g., marriage). Because of this, even many people over 40 experience events similar to those experienced by people 10 to 15 years younger.

4.5 Measures

Many measures, but not all [67], are similar in all the studies, which is why I describe most of them here and provide only a short summary in the actual studies (chapters 5 to 7). Table 4.2 summarizes all independent variables measured in Studies I to III.

Study I gathers the data with an anonymous online questionnaire. The downside of this way of data collection may be the lower motivation people usually have when filling out online questionnaires; when task difficulty is high, this may lead to a lower response rate or lower quality of the answers (see chapters 3.3 and 3.4). The procedure of the data collection is described in more detail in Study I (chapter 5).

In Studies II and III the data are collected in semi-structured face-to-face interviews; half of the respondents use the calendar instrument (CAL) while the other half do not (Non-CAL). This way of data collection should provide more accurate date estimates, because people are more motivated in face-to-face interviews and the interviewer can use recall aids as well (see chapters 3.3 and 3.4).

[67] For example, vividness may be an interesting predictor of the dating accuracy of remote events, but for very recent events (Study III) it will probably not be a very suitable predictor, because all events will still be vivid (only unique, more memorable events were chosen).

Table 4.2 Summary of independent variables measured in empirical studies

Independent variables measured in empirical studies    Studies I II III

Respondent characteristics
Age (S-I: 18–68; S-II & III: 23–41) • • •
Gender • • •
Education • • •
Self-evaluation of memory (R) • • •
Overall difficulty of estimating the date (R) • •
Tiredness (Int., R) • •
Motivation (Int., R) • •
Number of landmarks (CAL only) • •
Public holidays help (R) • •
Regularity of life, work and free time activities (R) • •
Interest in a type of public events (R) •

Event characteristics – respondent independent or partially independent
Event recency • • •
Temporal schemata • • •
Regularity (S-III: R) • •
Frequency (R) •
Theme (S-I: Int.; S-II & III: R, Int.) • • •
Self–events or other–events (R, Int.) • •
Media coverage in time (culminating / initiating events) •

Event characteristics – respondent dependent or proxy dependent
Date reconstructed / Date known (Int., R) • •
Vividness/details (R) •
Uniqueness (R) •
Importance (R, P) • •
Emotions connected with the event (valence and type; P) •
Sharing an event with other people (R) • •
Sharing an event with the respondent (P) •
Confidence in date estimates (R) • •
Expected accuracy of date estimates (P) • •
Association of a public event with a personal event (R) • • •

Data collection
Semi-structured interview with or without calendar instrument • •

Note. S = study; Roman numerals mark studies where each variable was measured. R = rated by respondent; P = rated by proxy; Int. = rated by interviewer.


The following section describes how each variable was measured. Even when a variable has the same name in all studies, there may be minor differences in how it was measured across the studies.

Respondent characteristics

I will sometimes use abbreviations such as “S-I”, which means Study I.

Age: Exact date of birth was collected in S-II & III. Age in years was measured in S-I.

Gender: Women and men.

Education: Education was measured as the highest degree received. Scale: 1 “university degree”; 2 “higher education (but not university education)”; 3 “high school finished with a graduation exam”; 4 “high school without a graduation exam”; 5 “lower education”.

Self-evaluation of memory: Respondents were first asked how well they remembered public or personal events compared to other people they knew:
• Important public events (content of memories) (S-I);
• Important events from their life (content of memories; last several years) (S-I, II);
• Events from their life from the last several months (content of memories) (S-III).

After answering, respondents were asked how well they were able to locate these events in time. This was an open question because I wanted respondents to reflect on their subjective feelings about their abilities. “Content of memories” refers to aspects such as what happened and who was involved (not the temporal aspect). Study I used a 5-point rating scale: 1 “very well”; 2 “rather well”; 3 “average”; 4 “rather poor”; 5 “poor”. Because people did not use all of the alternatives very often, I shortened the scale to a 3-point scale in Studies II and III. Another reason was that people in cognitive interviews [68] were generally better able to answer on a 3-point scale than on a 5-point scale. The answering scale in these two studies was: 1 “rather better than others”; 2 “similarly to others”; 3 “rather worse than others”.

[68] I conducted many cognitive interviews on all measures rated by respondents or proxies to make sure that most people understand the scales in the intended way.

In S-II & III an open question about memory for content and dates was asked first.

Overall difficulty of estimating the dates: When respondents finished estimating the dates, they were asked about the overall difficulty of the task. In Study II only an overall rating was requested (separately for public and personal events), and in Study III separate questions about the DOW and the week were asked. The rating scale was: 1 “very hard”; 2 “rather hard”; 3 “rather easy”; 4 “very easy”. The overall difficulty of estimating the dates of personal events was also rated independently by the interviewer (before the respondent answered the same question).

Tiredness: Respondents' tiredness was rated by the interviewer on a 3-point scale: 1 “looks tired”; 2 “looks slightly tired”; 3 “does not look tired”. The observation was made during the dating task. When the interviewer was not sure, he or she asked the respondent.

Motivation: Respondents' motivation was rated by the interviewer on a 3-point scale: 1 “very motivated”; 2 “motivated”; 3 “slightly motivated”. The observation was made during the dating task.

Number of landmarks: The number of reported landmarks. The total number of landmarks was limited to 15, but once 7 landmarks had been generated, the interviewers did not ask for more. This applies to half of the respondents (those in the CAL condition).

Public holidays help: Respondents were asked whether any of the public holidays in the calendar or calendar instrument helped them with their date estimates.

Interest in a type of public events: Respondents in Study I wrote down which types of public events they were interested in (e.g., political events, sport events).

Regularity of life, work and free time activities: Respondents in Study III rated whether their life and activities are generally regular on a 4-point scale: 1 “very regular”; 2 “rather regular”; 3 “rather irregular”; 4 “irregular”. Yes-or-no questions about the regularity of work and free time activities were asked as well.

Event characteristics – respondent independent or partially independent

Recency: Age of the event, calculated as the difference between the interview date and the date of the event, in days (S-III) or in months (S-I & II).

Frequency: How often similar events happened was measured on a 3-point scale: 1 “once in a reference period”; 2 “2-3 times in reference period”; 3 “more than 3 times”. Respondents were not asked to name the other events, but many of them did so spontaneously.

Regularity: An event was rated as regular when it happened on some regular basis (e.g., once a week, once a month). The regularity pattern was recorded as well and served as the basis for the temporal schemata classification (e.g., events that happen every Tuesday have a week temporal schema). The regularity of public events in Study II also served as the basis for the temporal schemata classification.

Self–events or other–events: Self–events are events in which the respondent took part. Other–events are events about which the respondent knew but in which he or she did not take part.

Theme: Public events in Study I are classified according to their theme, e.g., political events or events about celebrities (more details in Study I, chapter 5). Personal events were also classified according to their theme, but this is not the focus of my dissertation [69].

Media coverage in time (culminating / initiating events): According to Brown et al. (1985), culminating events are those where the media coverage started before the event and ended after the event occurred (e.g., cyclic events such as the Olympic Games). Initiating events are events where the media coverage started after the announcement of the event (e.g., surprising events). More details about this variable are in chapter 5.
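The recency measure defined at the start of this subsection is a plain date difference. As a minimal sketch in Python, assuming that recency in months is counted in whole calendar months (the text does not specify the rounding rule) and using invented interview and event dates:

```python
from datetime import date

def recency_in_days(interview: date, event: date) -> int:
    """Age of the event in days at the time of the interview."""
    return (interview - event).days

def recency_in_months(interview: date, event: date) -> int:
    """Age of the event in calendar months (assumption: the day of the month is ignored)."""
    return (interview.year - event.year) * 12 + (interview.month - event.month)

# Hypothetical recent event (days are the natural unit, as in Study III).
print(recency_in_days(date(2011, 6, 10), date(2011, 4, 4)))     # 67

# Hypothetical remote event (months are the natural unit, as in Studies I and II).
print(recency_in_months(date(2010, 2, 1), date(2006, 9, 20)))   # 41
```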

Event characteristics – respondent dependent or proxy dependent [70]

Date reconstructed / date known: When a respondent knew the answer straight away, the date was coded as “known”. When he or she had to think about the answer, it was coded as “reconstructed”. Date known is thus similar to reaction time but not identical, because interviewers also took into account how quickly individual respondents verbalize their answers. When it was not clear, the interviewer asked whether the date was known straight away or whether some mental calculations had to be done and the estimate was thus reconstructed. The distinction between reconstructed and known dates is especially important because when the date is known, I do not expect any difference between the experimental conditions (a calendar instrument cannot help much when people answer before looking at the calendar).

Sharing with other people: Respondents in Study III rated whether they had spoken about the target event: 1 “many times”; 2 “several times (1-3 times)”; 3 “never”. Sharing straight after the event ended did not count; only later sharing did (e.g., from the next day or week until the present). In Study II the instruction restricted sharing to approximately the last 12 months. This measure was added to Study II nearly at the end of data collection, and the data are thus available only for a small subset of respondents.

Importance: Respondents had to answer whether an event is (or, in S-II, was) personally important for them on a 3-point scale: 1 “yes, a lot”; 2 “average/usual importance”; 3 “not really”. The exact wording of the instruction was: “This event was personally important for me and it had an impact on my life” (S-II) and “This event is personally important for me” (S-III). The reason for the difference in the instructions is that many remote events (S-II) may seem unimportant from today’s point of view, which is why the word “was” is used. Respondents were also instructed that the event could still be important nowadays, but that this was not a prerequisite: events should also be rated as important if they were once very important but lost their importance over time. In Study III the instruction was shortened, because when recent events from a short reference period are dated, one may expect that not many really important events happened. This is why the part of the instruction “it had an impact on my life” was omitted. The instruction also uses “is” instead of “was” to highlight that a recent event has to be important at the time of the interview.

Expected importance for the respondent was rated by proxies on the same scales and with the same instructions. The reason for including this measure is that there may be interesting differences in evaluating what is important for respondents (see chapter 3.2, importance).

Uniqueness: The instruction was: “Considering the context of the events that happened before or after, I find this event unique, standing out among other memories, something like this does not happen often”. The rating scale was: 1 “yes, a lot”; 2 “average/usual uniqueness”; 3 “not really”. This characteristic was not used in Study III because not many “unique” events happened to people during such a short reference period.

[69] I cooperate on this analysis with Eva Literáková, who is writing her master’s thesis on the relationship between the theme of personal events and dating accuracy.
[70] Sharing the event, event importance and clarity/details had relatively good reliability (assessed after approximately one week): kappa in the pilot study ranged from 0.6 to 0.9. Uniqueness seems to be a more problematic concept, with kappa ranging from 0.4 to 0.7 (n = 4). However, people hardly ever shift by more than one level (e.g., from “very important” to “not important”). Before this test of reliability, 12 pilot interviews (cognitive interviewing) were conducted about how people understand these phenomenological characteristics, resulting in many small changes in the question wording.
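Footnote 70 reports kappa coefficients for the reliability of the phenomenological ratings. The text does not state which variant of kappa was used, so the following Python sketch simply assumes unweighted Cohen's kappa computed between two rating occasions of the same events; the function name and the ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of identical ratings.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal frequencies of each category.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    if expected == 1.0:          # both occasions used a single identical category
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical ratings of six events on the 3-point importance scale, one week apart.
week_1 = [1, 1, 2, 2, 3, 3]
week_2 = [1, 1, 2, 2, 3, 2]
print(round(cohens_kappa(week_1, week_2), 2))  # 0.75
```

Because the rating scales are ordinal, a weighted kappa that penalizes one-level shifts less than two-level shifts would be a natural alternative; the unweighted form is used here only because it is the simplest to show.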


Vividness/Details. These two phenomenological characteristics were combined because respondents in the pilot study (n = 12) were not able to distinguish properly between the clarity of a memory and its details, which resulted in two strongly correlated variables (see chapter 3.2, phenomenology, for more details). The instruction was: “This memory is bright, I remember various details”. The rating scale was: 1 “yes, a lot”; 2 “average/usual”; 3 “not really”. This characteristic was not used in Study III because only the most salient recent events were chosen and most of them were still remembered vividly and in detail. As a result, this variable did not vary much among the events.

Emotions connected with the event. In Study III proxies rated the concurrent valence and type of the emotion that they observed their partners experiencing. Valence was rated on a 3-point scale: 1 “very strong”; 2 “average”; 3 “weak”; it was left unrated when the proxies were not sure. Type was positive, negative or ambivalent, and was likewise left unrated when they were not sure.

DOW confidence: Respondents evaluated whether or not they were “sure about the day of the week”. Expected DOW accuracy was rated by proxies as well.

Week confidence: Respondents stated how many weeks earlier or later than their estimate the event could have happened, so that the true week certainly fell within this range (= confidence period). The number of weeks could differ going backward and going forward. Expected week accuracy was rated by proxies as well. They used a simplified scale and the instruction was “It is probable that the week will be estimated with this accuracy”: 0 “correct week”; 1 “one week later or earlier”; 2 “2 or more weeks later or earlier”. Proxies could write additional notes about why they thought the estimate would be correct or not.
Under oath confidence: When respondents were sure about the DOW and week (= both correct), the interviewer asked: “Would you be willing to testify under oath about the accuracy of the date?” and explained to respondents that it was of course a hypothetical question but that we wanted to differentiate between the times when the respondent was extremely confident and when not.

Month and year confidence: Month and year confidence were rated separately but on the same scale (though respondents did not use more than 3 when reporting the year): 0 “correct month/year”; 1 “the month/year stated or one month/year later or earlier”; 2 “the month/year stated or 2 months/years later or earlier”; 3 “the month/year stated or 3 months/years later or earlier”; and 4 “the month/year stated or possibly even 4 months/years later or earlier”. Proxies used the same scale and could write additional notes about why they thought the estimate would be correct or not.

Temporal schemata: In all studies every event is classified as either having or not having a temporal schema. In Studies I and II year and multi-year temporal schemata play a major role, while in Study III exact date, week and month are most relevant (see details in chapter 2.4 and in the individual studies, chapters 5–7). The classification of personal events is more approximate, because for personal events it is more difficult to distinguish what is a temporal schema and what is not (see details in chapters 6 and 7).

Association of public events with personal events: Respondents rated (yes or no) whether they had some personal association with the target public events.

Outcome measures (dependent variables)

Because my main interest is in predicting dating accuracy in general, I am less interested in the sign of the dating bias than in the amount of error. I will therefore mostly use the dating error in absolute value (referred to as “dating error”), though some analyses will use the signed error as well, when appropriate. A plus sign of the signed error means that the event was telescoped forward (moved towards the present) and a minus sign means that the event was telescoped backward (moved towards the past).

Errors in various time units are described in chapter 2.4. It is important to note, however, that when the error in months is measured for the month estimate only, the maximum error is 6 months (because of the cyclic nature of the year).

Dependent variables can be represented as an average error in the chosen units (whether mean or median), as shown in table 4.3, or as the proportion of correct (or approximately correct) date estimates in the chosen units. Both types will be used in all studies.
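The error definitions above can be illustrated with a minimal Python sketch. The function names are hypothetical and the dates in the usage note are invented; the sketch only assumes that the signed error is the estimated date minus the true date, so that a positive value corresponds to forward telescoping, and that the month-only error is measured on the 12-month cycle.

```python
from datetime import date


def signed_error_days(estimated: date, true: date) -> int:
    """Signed dating error in days.

    Positive = telescoped forward (estimate is closer to the present),
    negative = telescoped backward (estimate is further in the past).
    """
    return (estimated - true).days


def absolute_error_days(estimated: date, true: date) -> int:
    """Dating error in absolute value (the sign of the bias is ignored)."""
    return abs(signed_error_days(estimated, true))


def cyclic_month_error(estimated_month: int, true_month: int) -> int:
    """Month-only error on the 12-month cycle (months given as 1..12).

    Because the year is cyclic, December and January are one month apart,
    so the result can never exceed 6.
    """
    diff = abs(estimated_month - true_month) % 12
    return min(diff, 12 - diff)
```

For instance, an event that truly happened on 3 March 2007 but was dated 10 March 2007 has a signed error of +7 days, and a December estimate for a January event has a month-only error of 1 rather than 11.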


Table 4.3 Possible ways of representing the dating error

Error in absolute value | Signed error | Studies (I, II, III)
error in days | signed error in days | •
error in DOW | signed error in DOW | •
error in weeks | signed error in weeks | •
error in months* | signed error in months* | • •
error in years | signed error in years | • •
Note. *Can be computed as month and year together or for month only.

4.6 Procedures

The Study I procedure is described in chapter 5. The procedures of Studies II and III are very similar, which is why I describe them here in detail and only briefly in the actual studies. Both studies followed the order shown in table 4.3. Most interviews took place at the respondents’ homes, at the university or in quiet cafes. Data were recorded in an Excel sheet by the interviewer.

The laptop was also used to record the audio of the interview. All respondents were informed about the voice recording and told that only the interviewers plus two researchers (myself and E.L.) would have access to the whole dataset with all personal information⁷¹. Respondents were informed that all presentations about them would be in anonymous summary form only and that significant details that could lead to disclosure of their identity would be changed.

If they did not do so spontaneously, respondents were asked to think aloud during the dating task, provided it did not disturb them. This brings additional insight that can be used in future research or in case studies about how people arrive at date estimates.

A full debriefing about the aims of the study was given to the respondents who took part in only one of the two studies. The debriefing also contained information about how difficult it is to provide date estimates (even for very important events) and that the researchers did not expect respondents to know the exact date very often (or at all). Respondents who took part in both studies were not told about the experimental conditions until the end of the second interview. Table 4.3 illustrates the procedure of data collection and the order of the question domains in Studies II and III.

71 The main dataset contains only the respondents’ IDs; a separate dataset contains the IDs together with contact information. This was important because sometimes a mistake was found or additional information was needed, and full anonymization of the dataset would not have allowed us to contact the respondents again. Respondents were also told that the research results would be sent to them when finished.

Table 4.3 Procedure of data collection in Studies II and III

Order  Procedure
1  Question if the exact study focus was disclosed by proxies*
2  Expectation about the research
3  Birthday; education; length of the relationship
4  Children: names and their birthdays
5  Self–evaluation of a memory
6  Dating task instruction
7  Landmark recall and recording (only CAL group)
8  Event dating; assessment if the date was reconstructed/known; event frequency & event regularity (S-III)
9  Evaluation of the difficulty of dating task
10  Public holidays help
11  Phenomenology of memories (e.g. importance etc.)
12  Life regularity, job description, regularity of free–time activities and work
13  Feedback from respondent
14  Showing of correct answers; debriefing
15  Researcher evaluates the respondent motivation, tiredness (especially in the dating task)
Note. *Proxies were asked not to fully disclose what exactly the research is about (respondents could know that the research would be about everyday memory).

Instruction in the CAL condition (Study II)

The following instruction was given by the interviewer to the respondent in the CAL condition in Study II. Interviewers were free to use their own words, but all main points had to be covered and the critical details had to be said as written (e.g., that landmarks should be accurate). * = instruction for the interviewer only.

Now we will move to the actual interview. It will cover events from 2005 to 2008. To make your thinking and orientation easier, you will have a “calendar” at your disposal *shown. As you can see, the calendar focuses mainly on the years 2006 and 2007. On the left side, there are also several months from 2005 and the “older events” box (events from January to September 2005). On the right side, the latest events are from 2008, ending with a “recent events” box (events from April to December 2008). *show the way the calendar is made and explain again. All holidays and days off are marked in the calendar, including several important days (e.g., St. Nicholas Day, International Children's Day). Before the actual interview begins, can you try to think about personally relevant events of those years, particularly events for which you know at least approximately when they happened (in particular when you are sure about the date)? Please write a short name of the event legibly, preferably in capital letters, on the adhesive label and paste it into the calendar at the spot where the event could have happened. *show the adhesive label. If the event took place over more than one month, you can paste the label between two months or mark the beginning or some important detail for which you are most sure that the date is correct. Long-lasting events can also be divided into episodes (e.g., beginning, end). If you change your mind, you can change the position of the labels. Please start numbering the events on the adhesive labels from one. For the “recent” and “older” events that do not have the month listed *show, write the number of the month in which the event took place on the label. The events do not have to be pasted in the order in which they happened.

Instruction in the CAL condition (Study III)

The following instruction was given by the interviewer to the respondent in the CAL condition in Study III. Interviewers were free to use their own words, but all main points had to be covered and the critical details had to be said as written. * = instruction for the interviewer only.

Now we will move to the actual interview. It will cover events from the last several months. To make your thinking and orientation easier, you will have two calendars at your disposal *shown. The first is a small pocket calendar covering the reference period, which is March 14, 2011 to June 5, 2011. The bigger calendar *shown covers the same period. Both calendars emphasize Easter and 2 national holidays *shown. We will work with the bigger calendar, but if you find it easier or clearer you may look at the smaller calendar as well. Now please try to think about events from this period that you remember well and for which you know the date when they happened. Please write a short name of the event legibly, preferably in capital letters, on the adhesive label and paste it into the calendar at the spot where the event could have happened. *show the adhesive label. If the event took place over more than one day, you can paste the label between two days or mark the beginning or some important detail for which you are most sure that the date is correct. Long-lasting events can also be divided into episodes (e.g., beginning, end). If you change your mind, you can change the position of the event labels. Please start numbering the events on the adhesive labels from one. The events do not have to be pasted in the order in which they happened.

Instruction in Studies II and III (Non-CAL condition)

In these conditions respondents were informed about the reference period from which the events would come (S-II: January 2005 to December 2008; S-III: March 14, 2011 to June 5, 2011). In Study II respondents were told that most events would come from the years 2006 and 2007 but that some events would come from the boundary years 2005 and 2008 as well. In Study III respondents received the pocket calendar as well, and interviewers explained to them that 3 public holidays were highlighted in the calendar.

Training of the interviewers

Even though using the calendar did not require any extensive training apart from explaining the calendar procedure, the use of appropriate probes as well as the procedure of the whole interview and the data entry needed some training. The training included thorough reading of the manual (in advance), discussion of the manual, practice in entering the data into the Excel sheet, role plays (myself as the interviewer first and later myself or other interviewers as the respondent), practice interviews (at least two) that were recorded and analyzed, and monitoring during the fieldwork⁷². As mentioned above, explaining the calendar and entering the events or landmarks did not take much time (approx. five minutes), but the whole interview procedure and data entry needed at least 2–3 meetings (2–3 hours long). Monitoring continued throughout the fieldwork.

Study II had three interviewers and Study III five. I was one of the interviewers and Eva Literáková⁷³ another. All other interviewers were students who had finished their bachelor’s degree in psychology or were close to finishing it.

The following probes were used during the interview, but interviewers were allowed to use their own probes as well if they found them suitable (with the restriction that manipulative probes were not to be used). Probes were used when no direct reply was obtained after some time or when the respondent appeared to be struggling. Examples of probes in Studies II or III:
• Was it on a weekday or at the weekend?
• Are you really sure that this landmark is correctly dated? (CAL only)
• I know that dating is difficult, but please try to provide as accurate a date estimate as possible, even though you may have to guess.
• Is this event connected with some other events we have already spoken about?
• Do you think it happened long ago, or is it a more recent event?
• Is the event part of some lifetime period or extended event?
• Do you remember what the weather was like? Was there snow outside? Was it warm? Etc.
• Do events like this happen more often in a particular season (month, DOW)?
• Could this event be an exception to the regularity?

72 This is very similar to the training recommended by Dijkstra et al. (2009).
73 Eva Literáková collaborated extensively on Studies II and III. She not only collected some of the data but also collaborated on some design issues, manual preparation, data preparation and categorization of events. As mentioned above, she is also working on her master’s thesis, in which she examines the relationship between event theme and dating accuracy.

4.7 Issues in data analyses

Data were analyzed in IBM SPSS Statistics Version 19⁷⁴. The dependent variable in Study I is mostly dating error in month units. This variable is not normally distributed, but the skewness and kurtosis are not too high (both lower than 1), the sample of respondents and events is large enough and the residuals of the analyses are close to a normal distribution. Because of this, parametric analyses were used. The statistical method used for most primary outcomes in Study I is nested ANOVA or a similar approach with linear mixed models. These methods are suitable because they can control for within-subject and between-subject variability (Bickel, 2007).

The dependent variables in Studies II and III are far from normal (because of the “bumps” at multiples of 7 days or multiples of 12 months) and no transformation can make them normal enough to allow parametric analyses. Because of this, the analyses use non-parametric procedures (e.g., chi-square test, Kruskal-Wallis test, Mann-Whitney U test). These procedures do not answer the questions as simply as parametric ones (e.g., mean error or standard deviations cannot be used), but this approach is more appropriate because parametric tests would yield strongly biased parameter estimates. More sophisticated (nested) hierarchical analyses (such as generalized linear mixed models) were also explored, but these analyses did not lead to a better understanding of the data and thus will not be reported in this thesis.
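The rank-based logic behind such non-parametric comparisons can be illustrated with a minimal Python sketch. It is not the SPSS procedure used for the analyses: it only computes the Mann-Whitney U statistic by direct pair counting (no p-value), the function name is hypothetical, and the error values are invented to mimic the “bumps” at multiples of 7 days. They are not data from the studies.

```python
from statistics import median


def mann_whitney_u(x, y):
    """U statistic for sample x against sample y.

    Counts the (a, b) pairs with a > b; tied pairs count as 0.5.
    U(x, y) + U(y, x) always equals len(x) * len(y).
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Invented absolute dating errors in days (illustration only, NOT thesis data).
cal_errors = [0, 0, 7, 7, 14]
non_cal_errors = [0, 7, 14, 21, 28]

u_cal = mann_whitney_u(cal_errors, non_cal_errors)
u_non_cal = mann_whitney_u(non_cal_errors, cal_errors)

# Medians are reported instead of means because of the non-normal distribution.
print("median CAL:", median(cal_errors))
print("median non-CAL:", median(non_cal_errors))
print("U (CAL):", u_cal, "U (non-CAL):", u_non_cal)
```

A smaller U for one group means that its errors tend to rank lower than those of the other group; whether the difference is statistically significant would still have to be assessed with a proper test in a statistical package.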

74 I consulted the statistical issues mostly with statisticians at the Faculty of Social Studies, the Institute of Biostatistics and Analyses, and the Department of Mathematics and Statistics (all Masaryk University). I would like to thank the many people with whom I spoke over the last years.

5 Dating accuracy predictors of remote public events (Study I)

5.1 Introduction

Though the dates of public events are often known (especially for more recent events), insight into the recall of public events is crucial:
• when professionals inquire into some special population or regional events for which no documents with dates exist;
• when people use dates of public events in order to date other events;
• or when researchers are interested in the dating accuracy of public events in its own right (see e.g., N. R. Brown, et al., 1985; Kemp, 1988).

Even though the dating error is usually higher for public events than for personal ones, the pattern of dating bias is similar for both types of events (Kemp, 1999). That is also why both types of events have frequently been used interchangeably in memory accuracy studies (see e.g., Howes & Katz, 1992; Huttenlocher, et al., 1988; Rubin & Baddeley, 1989). Studying the dating accuracy of personal events has the disadvantage that the true dates and other event characteristics are not known, and it is often difficult and time consuming (if possible at all) to verify the date estimates and other characteristics. Because of this my first empirical study focuses on public events only⁷⁵. The main research aim of this study is:
• to gain insight into dating accuracy and to find out which event and respondent characteristics are the strongest predictors of dating accuracy among remote public events (most events are from 2006 and 2007).

The ancillary aims are:

75 This study is a completely rewritten and expanded version of an article published in Czech in the peer-reviewed journal Mediální studia (Neusar, Hoferková, & Ježek, 2011). Contributions to the article: Jana Hoferková took part in the selection of events (with A.N.), participant recruitment (with A.N.) and data preparation (with A.N.), and found the true dates for freely recalled public events. Jana Hoferková also completed her bachelor thesis on dating public events (she used a smaller part of the same sample). None of the analyses presented here are the same as in her work. Stanislav Ježek advised on the statistical analyses and proofread the article. The completely new parts of this dissertation study are the introduction, some analyses, almost all of the discussion, most of the temporal landmarks section and the topics covering the choices to be made in Study II.

• First, to find out whether any of the selected public events could serve as temporal landmarks and, more specifically, whether it would be beneficial to use them as recall aids in Study II.
• Second, public events could form a subset of events for the evaluation of a calendar instrument in Study II, because this study uses the same reference period (2005–2008). I plan to choose events of varied difficulty, because a calendar instrument should work best for events that are neither too easy nor too difficult to date.

Hypotheses

The main research question of Study I is which event and respondent characteristics are the strongest predictors of dating accuracy among remote public events. In order to answer the research question I will address the following hypotheses and explorations. In some cases I do not state any hypotheses because the design of my study or the lack of variance in the data (which was already known during the data collection) decreases the chances of finding the expected relationship.

Hypotheses concerning event characteristics

Before I move to the hypotheses (shortened as “H”) I will explore some variables for which I do not state any hypotheses.
• Event recency. I do not expect to find a strong relationship (or any relationship) between event recency and dating accuracy, because my design decreases the chances of finding one. Most target events come from a two-year reference period (2006–2007) and interviewing started in August 2010, so the most recent events were approximately 2.5 years old and the most remote approximately 4.5 years old, which is not a large enough difference to produce a substantive effect of event recency.
• Media coverage in time. Brown, Rips, and Shevell (1985) hypothesized that culminating events (the media coverage started before the event and ended after it) should be dated more accurately than initiating events (in which case the media coverage starts after the announcement of the event). They did not find any effect, but because the reference period in my study is not so wide it is worth testing whether the effect can be found (see section 3.2 for more details).


• H-1: Events that have a temporal schema are dated more accurately than events without such a schema.
• H-2: Public events that are associated with personal events are dated more accurately than events without this association.

Hypotheses concerning respondent characteristics

Before I move to the hypotheses I will explore some variables for which I do not state any hypotheses.
• Gender may play some role in estimating the date of some events, but I do not expect any substantive gender differences because the public events include various topics (e.g., sport, politics, celebrities, foreign news). On the other hand, when only a subset of events is selected, the interaction of gender and topic may bring significant differences (though I do not expect strong differences because the events were not selected with the intention of finding topics more “suitable” for women or men).
• Age. The target events were not selected to be more “suitable” for some age group. Because of this I do not expect a substantive effect of age on dating accuracy. Another reason is that the studies which found some effect of age usually compared more extreme groups (very young to very old) than the respondents in my study. However, an effect of age may be found for individual questions.
• Education could certainly play a role because, as mentioned in chapter 3.1, more educated people may have more knowledge about the events (e.g., knowledge of temporal schemata). However, due to the lack of variance and the low number of people with lower levels of education, the relationship will probably not be found.
• H-3: Respondents’ self-evaluation of memory is positively related to dating accuracy. The better the respondents rate their memory, the more accurate they are. However, I expect only a small effect, because respondents rated their memory before knowing which events they would estimate.
• H-4: Interest in some type of public events (e.g., events with a sport theme) is positively related to the dating accuracy of these events.


5.2 Methods

Participants

Selection criteria

Participants had to meet several inclusion criteria to take part in this study. Age was restricted to 18–70 years. People over 70 were excluded because of the higher prevalence of mild cognitive impairment or more serious memory disorders such as dementia (Baddeley, Kopelman, & Wilson, 2004; EuroCoDe, 2009). Another reason is that computer literacy is often insufficient in the over-70 group. The lower bound of 18 years was chosen because that is the age at which public events become more relevant (e.g., voting rights, driving license, topics of discussion at high school). However, the major focus was on people aged 23 and older, because these cohorts experienced all the public events as adults (with all the rights mentioned). Participants had to be Czech citizens who had been in the country for most of the period covered by the research.

Response rate

In total, 693 people clicked on the hyperlink to the questionnaire and 425 filled in at least one question. Of these, 265 respondents answered the majority of the introductory questions regarding demographic characteristics, quality of memory, interests and free recall of important public events from 2006 and 2007, and 250 respondents dated at least one of the 35 public events which are the subject matter of this chapter. The response rate, computed as the number of respondents who dated at least one public event divided by the number of people who clicked on the hyperlink, is 36%⁷⁶. The data collection started in mid-August 2010 and was completed within three and a half months.
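The response rate can be checked with a short Python sketch. The counts are those reported above; the dictionary labels and the helper function are hypothetical names introduced only for illustration.

```python
# Counts reported in the text for each stage of the online questionnaire.
funnel = {
    "clicked on the hyperlink": 693,
    "filled in at least one question": 425,
    "answered most introductory questions": 265,
    "dated at least one public event": 250,
}


def share_of_clicks(count, clicks=funnel["clicked on the hyperlink"]):
    """Proportion of hyperlink clicks that reached a given stage."""
    return count / clicks


for stage, count in funnel.items():
    print(f"{stage}: {count} ({share_of_clicks(count):.0%})")
```

The last stage gives 250 / 693, which rounds to the reported 36%. Because repeated clicks by the same person inflate the denominator, this figure is best read as a lower bound.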

Sample

The participants’ ages range from 18 to 68 years (M = 39.8; SD = 12.9). Almost half of the respondents had a bachelor’s or higher degree (48%), 8% had higher education (after high school graduation but not at university), 41% had high school with the state exam and 3% high school without the state exam. According to the 2008 ÚIV & OECD report (2010), 14% of Czech inhabitants in the 25–64 age group and 18% in the 25–34 age group have finished tertiary education. Thus highly educated participants are overrepresented. Women were overrepresented, too: only 34% of the sample were men (n = 90).

76 The percentage is probably higher because some participants may have first clicked on the questionnaire hyperlink and filled the questionnaire in later (which is counted as two clicks).

Selection of public events

The questionnaire consists of 35 public events (see Appendix 4), e.g., “Kaplický’s project won the architectural competition for the National Library.” The reference period was 4 years long, starting in 2005 and finishing in 2008. I am particularly interested in the events from the middle of this period (years 2006 and 2007) because they will not be biased by the boundary effect. In this study respondents were told that all events would be from the reference period of 2005 to 2008. This means that even events that seem very old could not be dated earlier than January 2005. The boundary in this sense affects mostly the year estimate, because the month estimate is in most cases independent of the year estimate (Larsen, et al., 1995).

The focus on events from 2006 and 2007 was chosen because older events could be too difficult to date and more recent events too easy, and because in wider reference periods event recency becomes too strong a predictor, heavily moderating the impact of all other factors (Rubin & Wenzel, 1996). In addition, it gave the opportunity to compare the results with Study II, which deals with events from the same period.

The pool of events from which the final selection was made was compiled primarily by analyzing yearbooks from 2006 and 2007. Using this approach, inspired by the agenda-setting function of the media (McCombs & Shaw, 1972), the most frequently mentioned events were chosen. These events are assumed to be the memorable ones from the studied years. However, other event characteristics played a role in the selection process as well. Events were removed if they had an unclear beginning, were unfinished, were very specific (important for only a few people), occurred with higher frequency (e.g., bombardment in Israel or Palestine), concerned semantic knowledge (e.g., taking Pluto off the list of planets), were not concrete, were unimportant or did not result in any change (just confirmation of the current situation). The chosen events should be widely known and, for the majority of the population, have at least some of the following characteristics that are typical of well-remembered events: long-term impact, emotional, important, surprising, new, discussed in the media, taking place near temporal landmarks, clearly dated and mentioned in the most common media (see chapter 3). Several of the most watched TV shows were also selected (even though not mentioned in the yearbooks) because they fulfill the above-mentioned criteria. The added events from 2005 and 2008 also met the selection criteria and were often mentioned in the yearbooks.

Another source of events was an unpublished pilot study (N = 52) in which I asked the respondents to freely recall events from the years 2006 and 2007 (Neusar, 2010). This pilot study showed that the events selected by the above-mentioned criteria are often the same as the events mentioned in free recall, which supports the validity of the selection process. When mentioning the events I will give the event number in brackets, e.g., Bulgaria & Romania joined EU [33] (see appendix 1).

Table 5.1 Summary of independent and dependent variables

Independent variables
Respondent characteristics: Age (range: 18–68); Gender; Education; Self–evaluation of memory; Interest in public events
Event characteristics – respondent independent: Recency; Temporal schemata; Media coverage in time (Culminating / Initiating); Event theme (sport, politics, celebrities, foreign)
Event characteristics – respondent dependent: Event associated with a personal event

Dependent variables
Month error; Year error; Dating error (M+Y); Signed dating error; % of correct answers; Number of events dated

Classification of the events according to the media coverage in time

An event was classified as “initiating” when it fulfilled at least one of these criteria: a) it was a sudden, unexpected event (e.g., suicide of a famous composer [24]) or b) the event was expected but only a very short time in advance, i.e., several days to two months (e.g., hurricane Kyrill [7]). An event was classified as “culminating” when it was expected a long time in advance (e.g., Olympic Games in Torino [29]) and at the same time the media paid attention to the event before it happened (for at least 2 months). If the advance notice was shorter, it could not influence the dating as much and the event was classified as “initiating”. I eliminated events that were culminating but for which it could be expected that respondents paid no attention to them or did not know about them at all before they actually happened. That is why this categorization contains only clearly initiating and clearly culminating events. Both classifications can be found in Appendix 4.
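The classification rule can be summarized in a minimal Python sketch. The function and parameter names are hypothetical, and the lead times in the usage note are illustrative values rather than coded data. Since the text gives “several days to two months” for initiating events and “at least 2 months” for culminating ones, the sketch assumes that a lead time of exactly two months counts as culminating.

```python
from typing import Optional


def classify_media_coverage(
    expected: bool,
    lead_months: float,
    audience_aware: bool = True,
) -> Optional[str]:
    """Classify an event by its media coverage in time.

    expected       -- False for sudden, unexpected events
    lead_months    -- how long before the event the media covered it
    audience_aware -- False if respondents probably paid no attention to the
                      event (or did not know about it) before it happened
    Returns "initiating", "culminating", or None for eliminated events.
    """
    # Assumption: exactly 2 months of advance coverage counts as culminating.
    if not expected or lead_months < 2:
        return "initiating"
    if not audience_aware:
        return None  # culminating in principle, but eliminated as unclear
    return "culminating"
```

With illustrative values, a sudden event such as the suicide of a composer is classified via `classify_media_coverage(False, 0)`, a hurricane announced a few days ahead via `classify_media_coverage(True, 0.1)`, and Olympic Games known years in advance via `classify_media_coverage(True, 24)`.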

Classification of public events – multi–year and year temporal schema

Many events typically occur within some period, in other words they have a temporal schema (Larsen, et al., 1995). First, some events can be expected to have temporal schemata that are shared by the majority of the Czech population. This type of schema will be called “shared temporal schema”. To this category belong events which happen in a certain part of the year, e.g., in summer (approximate shared temporal schema), or in a concrete month (exact shared temporal schema). Many events have both an approximate and an exact shared schema: if people do not know that the winter Olympic Games [29] take place in February (exact schema), they will probably date them in the winter months after New Year (approximate schema).

Second, other events will have a “unique temporal schema”. For these events some temporal schema or clue for the month is available, but this schema is very specific, or unknown to most people, or too inaccurate. For example, the forming of a new government must come after elections, but it is not clear exactly when. If the government is formed soon after the elections, the event may have an approximate shared temporal schema; if it takes longer (as in event number 1), it will have a rough unique temporal schema. Other events can take place at any time (e.g., derailment of a train) and have no temporal schema at all (see table 5.2). Two researchers⁷⁷ evaluated the categorization and agreed; a third researcher was called in to deal with unclear cases.

Some events could have a misleading temporal schema. For example, ice hockey is a winter sport but the World Championship [25] is held in May. These schemata are not included in this classification (but are mentioned in Appendix 4 with all other information about multi–year and year temporal schemata). The “no schema” category is included in the “unique schema” category, because I do not expect either category to have any effect (or only a very small one) on dating accuracy.

77 A.N., J.H. and S.J.

Table 5.2 Categorization of public events according to their approximate, exact, or no year temporal schema

Shared Unique No Event Exact Approximate Exact Approx. or rough No 29 Olympic Games in Torino February Winter

22 Sarkozy – presidential victory May

1 Topolánek – 1st government After “elections”

30 Massacre at Virginia Tech ×

Classification of the events according to the theme

The selected events can be classified according to the interests in public news mentioned by the respondents. The reason for classifying events is that I can analyze whether people with a certain interest date events from the same area more accurately than people without this interest. Most events could be classified as “domestic news”. However, such a category is too broad for making comparisons because few events are not part of it. Four relatively distinct categories can be found (2 researchers agreed on the categories)78. Some events may be part of more than one group, e.g., Czech footballers – prostitutes [10] is both a sport event and a celebrities event. But because it was a non-significant sport event, it was classified only as a celebrities event. The final four categories are: sport events (e.g., Olympic Games in Torino [29]), political events (e.g., Parliamentary elections [32]), foreign news (e.g., Barack Obama won the presidential elections [11]), and celebrities news (e.g., suicide of Karel Svoboda, a well-known Czech composer [24]). For more details about each event and its content see Appendix 4.

Indicators of temporal landmarks from 2006 and 2007

I will analyze 4 indicators that are expected to contribute to the “landmark quality” of public events. These are (see chapter 2): freely recalled events, percentage of correctly dated events, dating error, and the association of public events with personal events. All these indicators are only indirect: they show that some events are dated more correctly, are more easily retrieved, or are more connected to other events. To directly verify an event’s function as a temporal landmark a different design would have been needed. Such a design should explore every event’s ability to aid in the recall of personal events. This approach would, however, be too burdensome in an online questionnaire, and for my purpose the indirect indicators should offer enough evidence for the choices to be made. Researchers sometimes provide the correct date of the landmark (when the date is not obvious) or do not provide the date (when they study the “natural” landmarks or the date is obvious). Researchers can look for general landmarks for the whole population or look at specific landmarks for some specific sub–population. I will focus on finding the general landmarks that could serve as recall aids for all participants.

78 A.N. and S.J.

Event dating procedure

Respondents first self–evaluated their memory for personal and public events and their interests in public events, and also filled in several demographics (sex, age, education). This was followed by a free recall task of public events from 2006 and 2007. In this part respondents had to think on their own (without any aids) of two public events from 2007 and two from 2006. Respondents were instructed that the research is primarily focused on how people estimate (guess) the date, not on knowledge of dates. Events had to be described sufficiently and a month estimate had to be provided as well. Well-known public events were preferred (if the respondent was aware of more events). People who could think of more than four events could use free space to write as many more events as they wanted. The main part consisted of 35 descriptions of predefined public events which had to be dated by year and month (see figure 5.1; the event description is translated from Czech; the graphical layout is identical to that of the questionnaire). When respondents did not know the event they were instructed to fill in “N” in the bounded space for the year. In all other cases they were instructed to estimate (guess) the full date (year and month).


Figure 5.1 Illustration of the questionnaire layout. Question description and bounded space for year and month estimate.

At the top of every page of the questionnaire was an instruction stating that the events are from 2005–2008 only and that I expect respondents to guess the date even in situations in which they do not have even a vague idea of when the event happened. At the end of each page there was information on how many questions they had answered and how many were still ahead. After the last event was dated they were asked if any of the public events were associated with their personal events. This was followed by a feedback question on the questionnaire and a question about whether they had checked any of the dates before they answered (which they should not have done). After clicking on the “continue” button they were provided with a link to download a document with the correct dates of the events and a debriefing (mentioned above).

5.3 Results

Descriptives

Although it was repeatedly stressed on every new page of the questionnaire that all predefined public events are from the years 2005 to 2008, 143 respondents dated at least one of the events outside this limit (see Appendix 5). I decided to eliminate the year estimate for these reports because knowledge of the limit makes dating more precise and it is not possible to have one sample consisting of people with and without this knowledge. I also considered eliminating month estimates in such cases, but analysis showed no difference between the month estimates of events whose year estimate was excluded and those whose year estimate was not excluded. This indicates that estimating the month is practically independent of estimating the year79.

When analyzing month estimates only, I used data from all respondents, but for analyses of year only or of year and month together I used only the data within the limit 2005 to 2008. When percentages are shown, they are always percentages of valid answers. The dating accuracy of all events and the percentage of correct answers are shown in table 5.3 (events are sorted as they appeared in the questionnaire) and in table 5.4 (events are sorted from most accurate to least accurate according to the date error in various units).

The average dating error (year and month together) ranged from 2.8 months for the Olympic Games in Peking [12] to 15.1 months for the final of the TV show SuperStar 3 [19]. The maximum possible month error is 6 months80. The average month error is smallest for the music festival police raid [8] (M = 0.6; SD = 0.7), which is not surprising because most people probably easily deduced that the open air festival had to take place in summer. On the other hand, the highest average month error is for the charge of the so-called “heparin murderer” [2] (M = 3.9; SD = 1.5). This event has no year temporal schema or other cues, which resulted in a very high month dating error. The smallest year error is for Obama’s victory [11] (M = 0.16; SD = 0.46), while the highest is for the SuperStar 3 show (M = 1.42; SD = 0.69).

Backward telescoping was most pronounced for Obama’s victory, which in almost 76% of the cases was dated too far in the past. The limit of the year 2008 played an important role here. Forward telescoping was highest for the scandal of Czech footballers with prostitutes in FIFA qualification [10] (84%). This is probably caused by the fact that FIFA happened in the year after the qualification and the event was monitored by the media for a long time. The most accurately dated events are the Olympic Games in Torino [29] (47%) and Peking [12] (46%), which are events with a specific temporal schema.

79 8% of year estimates were outside the limits (n = 533 out of 6630 estimates). The frequency ranged from 0 to 41 times for single events. Half of the respondents estimated one or two events outside the boundaries. Only 8 respondents had more than 7 estimates outside the limits. Independence of month from year is obvious for cyclic events such as the Olympic Games. Some wrong year estimates could be explained by the interference of similar events, e.g., the Czech Republic won the ice hockey WCH both in 2005 and 2010 (12 respondents reported 2009 and 17 reported 2010). The only exception was Obama’s victory in the presidential elections [11]. It is evident that some respondents did not date Obama’s victory (November 2008) but his entry into office (January 2009). Because I am not able to distinguish whether a 2009 date is a wrong estimate or the date of taking office, I eliminated these answers from the analyses (n = 29).

80 Because the year is cyclic, the error is computed as the ‘shortest way’ to the accurate month, which cannot be more than 6 months.
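The error measures described here can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure actually used for the analyses: the function names are hypothetical, and the signed error is assumed to be the estimate minus the true date, so that negative values correspond to backward telescoping and positive values to forward telescoping.

```python
def month_error(estimated_month, true_month):
    """Cyclic month error: shortest distance around the 12-month year (0 to 6)."""
    diff = abs(estimated_month - true_month) % 12
    return min(diff, 12 - diff)


def signed_date_error(est_year, est_month, true_year, true_month):
    """Signed error in months (assumption: estimate minus true date)."""
    return (est_year - true_year) * 12 + (est_month - true_month)


def date_error(est_year, est_month, true_year, true_month):
    """Absolute error of year and month together, expressed in month units."""
    return abs(signed_date_error(est_year, est_month, true_year, true_month))


# A December estimate for a February event is 2 months off when going
# the short way around the year (through January).
print(month_error(12, 2))

# An estimate of March 2007 for a February 2006 event is 13 months too late.
print(signed_date_error(2007, 3, 2006, 2))
```

Because the month error ignores the year, it can be computed even for answers whose year estimate was excluded.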

The results section is divided into four subsections: a) relationship of event characteristics and dating accuracy; b) relationship of respondent characteristics and dating accuracy; c) exploration of whether any events could serve as recall aids (temporal landmarks) in Study II; d) the question of whether public events could form a suitable sub–set of events for the evaluation of the calendar instrument in Study II.


Table 5.3 Dating accuracy of all 35 predefined events

No | Event | Y | n | Date error M | Date error SD | Month e. M | Month e. SD | Year e. M | Year e. SD | % − | % 0 | % + | % 0–3 | % 0–14
1 | Topolánek – 1st government | 6 | 169 | 10.8 | 9.5 | 2.20 | 1.87 | 0.96 | 0.39 | 39.1 | 9.5 | 51.5 | 36.7 | 62.7
2 | Heparin murderer | 6 | 134 | 11.0 | 6.3 | 3.89 | 1.46 | 1.11 | 0.74 | 33.6 | 0.0 | 66.4 | 8.2 | 64.9
3 | Paroubek – wedding | 7 | 129 | 7.0 | 4.1 | 3.31 | 1.98 | 0.70 | 0.48 | 26.4 | 7.0 | 66.7 | 17.8 | 97.7
4 | Violence against children | 7 | 131 | 11.3 | 7.5 | 2.01 | 1.73 | 0.90 | 0.67 | 51.1 | 4.6 | 44.3 | 22.1 | 72.5
5 | Kubice’s political crime report | 6 | 115 | 7.6 | 8.8 | 1.80 | 1.81 | 0.59 | 0.71 | 29.6 | 27.8 | 42.6 | 51.3 | 78.3
6 | Same-sex couples reg. partnership | 6 | 106 | 11.7 | 8.1 | 2.94 | 1.66 | 1.00 | 0.72 | 41.5 | 0.9 | 57.5 | 24.5 | 64.2
7 | Hurricane Kyrill | 7 | 146 | 10.7 | 6.7 | 3.78 | 1.93 | 0.78 | 0.69 | 32.2 | 6.2 | 61.6 | 15.8 | 65.1
8 | Music festival police raid | 5 | 155 | 11.3 | 12.8 | 0.63 | 0.74 | 0.90 | 1.05 | 5.8 | 27.7 | 66.5 | 48.4 | 73.5
9 | Municipal elections | 6 | 158 | 4.8 | 7.4 | 1.56 | 2.11 | 0.37 | 0.71 | 31.6 | 44.3 | 24.1 | 63.3 | 84.8
10 | Czech footballers – prostitutes | 7 | 112 | 11.5 | 6.1 | 3.48 | 1.82 | 0.73 | 0.53 | 14.3 | 1.8 | 83.9 | 16.1 | 58.0
11 | Obama – presidential victory | 8 | 140 | 5.5 | 6.6 | 2.72 | 2.11 | 0.16 | 0.46 | 75.7 | 20.7 | 3.6 | 49.3 | 92.1
12 | Olympic Games in Peking | 8 | 170 | 2.8 | 6.3 | 0.96 | 1.21 | 0.21 | 0.59 | 52.4 | 45.9 | 1.8 | 86.5 | 94.7
13 | Star Dance II | 7 | 104 | 7.7 | 4.8 | 1.43 | 1.60 | 0.74 | 0.47 | 21.2 | 10.6 | 68.3 | 31.7 | 100
14 | Train derail | 8 | 147 | 7.9 | 9.9 | 1.44 | 1.35 | 0.55 | 0.83 | 63.9 | 23.8 | 12.2 | 57.1 | 81.6
15 | Interchange of newborns | 7 | 132 | 9.8 | 8.5 | 3.77 | 1.88 | 0.66 | 0.64 | 61.4 | 3.8 | 34.8 | 14.4 | 78.0
16 | Topolánek – 2nd government | 7 | 129 | 8.2 | 7.4 | 2.93 | 1.45 | 0.75 | 0.50 | 43.4 | 7.0 | 49.6 | 41.1 | 75.2
17 | FIFA World Cup – Czechs failed | 6 | 114 | 9.4 | 10.8 | 0.81 | 1.03 | 0.86 | 0.93 | 10.5 | 32.5 | 57.0 | 56.1 | 66.7
18 | Driving points system | 6 | 143 | 10.8 | 7.0 | 3.54 | 2.35 | 0.84 | 0.69 | 46.2 | 3.5 | 50.3 | 18.9 | 66.4
19 | SuperStar 3 | 6 | 106 | 15.1 | 7.5 | 1.53 | 1.89 | 1.42 | 0.69 | 11.3 | 5.7 | 83.0 | 11.3 | 49.1
20 | Robbery of the century in the CR | 7 | 109 | 11.8 | 9.4 | 3.67 | 1.60 | 0.84 | 0.65 | 61.5 | 0.9 | 37.6 | 14.7 | 70.6
21 | Kuchařová – Miss World | 6 | 119 | 10.9 | 7.5 | 2.48 | 1.89 | 0.96 | 0.76 | 36.1 | 5.0 | 58.8 | 20.2 | 72.3
22 | Sarkozy – presidential victory | 7 | 115 | 11.0 | 8.4 | 1.88 | 1.75 | 0.82 | 0.73 | 62.6 | 7.8 | 29.6 | 24.3 | 73.9
23 | Saddam Hussein execution | 6 | 107 | 11.0 | 7.2 | 2.77 | 1.58 | 0.84 | 0.72 | 62.6 | 6.5 | 30.8 | 24.3 | 65.4
24 | Suicide of K. Svoboda | 7 | 138 | 9.7 | 7.5 | 2.30 | 1.84 | 0.65 | 0.59 | 26.1 | 14.5 | 59.4 | 28.3 | 72.5
25 | Czechs won ice hockey WCH | 5 | 112 | 13.8 | 14.2 | 1.28 | 1.52 | 1.23 | 1.27 | 17.0 | 25.9 | 57.1 | 42 | 65.2
26 | Healthcare charges | 8 | 133 | 6.0 | 7.8 | 1.41 | 1.96 | 0.50 | 0.74 | 37.6 | 39.8 | 22.6 | 51.9 | 93.2
27 | Star Dance I | 6 | 97 | 13.7 | 8.2 | 1.54 | 1.70 | 1.30 | 0.73 | 13.4 | 6.2 | 80.4 | 17.5 | 57.7
28 | WCH athletics Osaka | 7 | 97 | 9.8 | 7.3 | 1.05 | 1.16 | 0.82 | 0.63 | 50.5 | 13.4 | 36.1 | 29.9 | 86.6
29 | Olympic Games in Torino | 6 | 143 | 6.7 | 9.6 | 0.71 | 1.27 | 0.51 | 0.75 | 17.5 | 46.9 | 35.7 | 63.6 | 82.5
30 | Massacre at Virginia Tech | 7 | 86 | 12.8 | 8.1 | 2.58 | 2.11 | 0.98 | 0.74 | 55.8 | 3.5 | 40.7 | 20.9 | 55.8
31 | National Library project by Kaplický | 7 | 125 | 10.2 | 7.1 | 2.67 | 1.98 | 0.64 | 0.61 | 24.0 | 6.4 | 69.6 | 26.4 | 70.4
32 | Parliament elections | 6 | 123 | 5.6 | 8.6 | 1.22 | 1.38 | 0.43 | 0.72 | 36.6 | 30.1 | 33.3 | 68.3 | 84.6
33 | Bulgaria & Romania joined EU | 7 | 96 | 10.5 | 7.4 | 2.51 | 2.00 | 0.74 | 0.60 | 19.8 | 16.7 | 63.5 | 26.0 | 64.6
34 | Pope John Paul II died | 5 | 133 | 15.0 | 13.0 | 1.99 | 1.98 | 1.13 | 1.05 | 7.5 | 15.0 | 77.4 | 31.6 | 57.1
35 | Czech Republic joined Schengen | 7 | 137 | 14.0 | 12.2 | 2.44 | 1.95 | 0.88 | 0.76 | 69.3 | 9.5 | 21.2 | 24.1 | 61.3

Note: No = number of an event in Appendix 4; Y = correct year (last digit, 2005–2008); Date error = absolute dating error of month and year together expressed in month units; Month e. = net error of month estimate (maximum is 6 months); Year e. = error of year estimate in absolute value in years; % − = percentage of backward telescoped date estimates; % + = percentage of forward telescoped date estimates; % 0 = percentage of correct date estimates; % 0–3 = percentage of date estimates that have a maximal dating error of 3 months; % 0–14 = percentage of date estimates that have a maximal dating error of 14 months.
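The percentage columns defined in the note can be derived from the signed dating errors of a single event. The following sketch assumes that signed errors are given in months as estimate minus true date; the function name and the example values are hypothetical and are not taken from the study data.

```python
def summarize_estimates(signed_errors):
    """Percentages of backward, correct and forward estimates and of
    estimates within 3 and within 14 months of the true date."""
    n = len(signed_errors)

    def pct(count):
        return 100.0 * count / n

    return {
        "backward": pct(sum(1 for e in signed_errors if e < 0)),
        "correct": pct(sum(1 for e in signed_errors if e == 0)),
        "forward": pct(sum(1 for e in signed_errors if e > 0)),
        "within_3": pct(sum(1 for e in signed_errors if abs(e) <= 3)),
        "within_14": pct(sum(1 for e in signed_errors if abs(e) <= 14)),
    }


# Hypothetical signed errors (in months) of eight respondents for one event
example = [-5, 0, 0, 2, 16, -1, 3, 20]
print(summarize_estimates(example))
```

The first three percentages sum to 100, whereas the two "within" columns overlap with them because they count correct estimates as well.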


Table 5.4 Sorted dating accuracy of all 35 predefined events: ranking according to the date error, month error, year error, and percentage of correct date estimates

Date error No | Event | Y | M | SD | Month e. No | Y | M | SD | Year e. No | Y | M | SD | % correct No | Y | % 0 | Rank
12 | Olympic Games in Peking | 8 | 2.80 | 6.30 | 8 | 5 | 0.63 | 0.74 | 11 | 8 | 0.16 | 0.46 | 29 | 6 | 46.85 | 1
9 | Municipal elections | 6 | 4.75 | 7.44 | 29 | 6 | 0.71 | 1.27 | 12 | 8 | 0.21 | 0.59 | 12 | 8 | 45.88 | 2
11 | Obama – presidential victory | 8 | 5.46 | 6.57 | 17 | 6 | 0.81 | 1.03 | 9 | 6 | 0.37 | 0.71 | 9 | 6 | 44.30 | 3
32 | Parliament elections | 6 | 5.64 | 8.60 | 12 | 8 | 0.96 | 1.21 | 32 | 6 | 0.43 | 0.72 | 26 | 8 | 39.85 | 4
26 | Healthcare charges | 8 | 6.05 | 7.83 | 28 | 7 | 1.05 | 1.16 | 26 | 8 | 0.50 | 0.74 | 17 | 6 | 32.46 | 5
29 | Olympic Games in Torino | 6 | 6.66 | 9.63 | 32 | 6 | 1.22 | 1.38 | 29 | 6 | 0.51 | 0.75 | 32 | 6 | 30.08 | 6
3 | Paroubek – wedding | 7 | 6.96 | 4.06 | 25 | 5 | 1.28 | 1.52 | 14 | 8 | 0.55 | 0.82 | 5 | 6 | 27.83 | 7
5 | Kubice’s political crime report | 6 | 7.63 | 8.77 | 26 | 8 | 1.41 | 1.96 | 5 | 6 | 0.59 | 0.71 | 8 | 5 | 27.74 | 8
13 | Star Dance II | 7 | 7.70 | 4.76 | 13 | 7 | 1.43 | 1.60 | 31 | 7 | 0.64 | 0.61 | 25 | 5 | 25.89 | 9
14 | Train derail | 8 | 7.85 | 9.88 | 14 | 8 | 1.44 | 1.35 | 24 | 7 | 0.65 | 0.59 | 14 | 8 | 23.81 | 10
16 | Topolánek – 2nd government | 7 | 8.25 | 7.38 | 19 | 6 | 1.53 | 1.89 | 15 | 7 | 0.66 | 0.64 | 11 | 8 | 20.71 | 11
17 | FIFA World Cup – Czechs failed | 6 | 9.45 | 10.76 | 27 | 6 | 1.54 | 1.70 | 3 | 7 | 0.70 | 0.48 | 33 | 7 | 16.67 | 12
24 | Suicide of K. Svoboda | 7 | 9.75 | 7.47 | 9 | 6 | 1.56 | 2.11 | 10 | 7 | 0.73 | 0.53 | 34 | 5 | 15.04 | 13
28 | WCH athletics Osaka | 7 | 9.76 | 7.31 | 5 | 6 | 1.80 | 1.81 | 13 | 7 | 0.74 | 0.47 | 24 | 7 | 14.49 | 14
15 | Interchange of newborns | 7 | 9.85 | 8.49 | 22 | 7 | 1.88 | 1.75 | 33 | 7 | 0.74 | 0.60 | 28 | 7 | 13.40 | 15
31 | National Library project by Kapl. | 7 | 10.18 | 7.12 | 34 | 5 | 1.99 | 1.98 | 16 | 7 | 0.75 | 0.50 | 13 | 7 | 10.58 | 16
33 | Bulgaria & Romania joined EU | 7 | 10.52 | 7.38 | 4 | 7 | 2.01 | 1.73 | 7 | 7 | 0.78 | 0.69 | 35 | 7 | 9.49 | 17
7 | Hurricane Kyrill | 7 | 10.68 | 6.69 | 1 | 6 | 2.20 | 1.87 | 28 | 7 | 0.82 | 0.63 | 1 | 6 | 9.47 | 18
1 | Topolánek – 1st government | 6 | 10.76 | 9.46 | 24 | 7 | 2.30 | 1.84 | 22 | 7 | 0.82 | 0.73 | 22 | 7 | 7.83 | 19
18 | Driving points system | 6 | 10.77 | 7.00 | 35 | 7 | 2.44 | 1.95 | 18 | 6 | 0.84 | 0.69 | 16 | 7 | 6.98 | 20
21 | Kuchařová – Miss World | 6 | 10.86 | 7.53 | 21 | 6 | 2.48 | 1.89 | 20 | 7 | 0.84 | 0.65 | 3 | 7 | 6.98 | 21
2 | Heparin murderer | 6 | 10.96 | 6.27 | 33 | 7 | 2.51 | 2.00 | 23 | 6 | 0.84 | 0.72 | 23 | 6 | 6.54 | 22
23 | Saddam Hussein execution | 6 | 10.97 | 7.19 | 30 | 7 | 2.58 | 2.11 | 17 | 6 | 0.86 | 0.93 | 31 | 7 | 6.40 | 23
22 | Sarkozy – presidential victory | 7 | 10.99 | 8.45 | 31 | 7 | 2.67 | 1.98 | 35 | 7 | 0.88 | 0.76 | 27 | 6 | 6.19 | 24
4 | Violence against children | 7 | 11.27 | 7.54 | 11 | 8 | 2.72 | 2.11 | 4 | 7 | 0.90 | 0.67 | 7 | 7 | 6.16 | 25
8 | Music festival police raid | 5 | 11.32 | 12.79 | 23 | 6 | 2.77 | 1.58 | 8 | 5 | 0.90 | 1.05 | 19 | 6 | 5.66 | 26
10 | Czech footballers – prostitutes | 7 | 11.53 | 6.12 | 16 | 7 | 2.93 | 1.45 | 21 | 6 | 0.96 | 0.76 | 21 | 6 | 5.04 | 27
6 | Same-sex couples reg. partner. | 6 | 11.72 | 8.05 | 6 | 6 | 2.94 | 1.66 | 1 | 6 | 0.96 | 0.39 | 4 | 7 | 4.58 | 28
20 | Robbery of the century in the CR | 7 | 11.75 | 9.37 | 3 | 7 | 3.31 | 1.98 | 30 | 7 | 0.98 | 0.74 | 15 | 7 | 3.79 | 29
30 | Massacre at Virginia Tech | 7 | 12.83 | 8.08 | 10 | 7 | 3.48 | 1.82 | 6 | 6 | 1.00 | 0.72 | 18 | 6 | 3.50 | 30
27 | Star Dance I | 6 | 13.72 | 8.21 | 18 | 6 | 3.54 | 2.35 | 2 | 6 | 1.11 | 0.74 | 30 | 7 | 3.49 | 31
25 | Czechs won ice hockey WCH | 5 | 13.81 | 14.19 | 20 | 7 | 3.67 | 1.60 | 34 | 5 | 1.13 | 1.05 | 10 | 7 | 1.79 | 32
35 | CR joined Schengen | 7 | 13.99 | 12.19 | 15 | 7 | 3.77 | 1.88 | 25 | 5 | 1.23 | 1.27 | 6 | 6 | 0.94 | 33
34 | Pope John Paul II died | 5 | 14.99 | 13.02 | 7 | 7 | 3.78 | 1.93 | 27 | 6 | 1.30 | 0.73 | 20 | 7 | 0.92 | 34
19 | SuperStar 3 | 6 | 15.11 | 7.51 | 2 | 6 | 3.89 | 1.46 | 19 | 6 | 1.42 | 0.69 | 2 | 6 | 0.00 | 35

Note: Event description is shown only for date error. In all other cases only the event number is shown. No = number of an event in Appendix 4; Y = correct year; Date error = absolute date error of month and year together expressed in month units; Month e. = error of month estimate (maximum is 6 months); Year e. = error of year estimate in absolute value in years; % 0 = percentage of correct date estimates; Rank = events are sorted four times. In all cases ranking is from the most accurate (rank 1) to the least accurate (rank 35).


Event characteristics of public events

Event characteristics are measured separately for every event. I will focus here on event recency, temporal schemata, media coverage in time and on the association of public events with personal events.

Event recency

All four events from 2008 are among the most accurately dated events. This is probably caused by the recency effect, the 2008 year boundary and temporal schemata. The accuracy rank for these events is 1st, 3rd, 5th, and 10th (see table 5.4). The worst dated event from 2008 is the derailment of a train in Studénka [14] (M = 7.85; SD = 9.88), which does not have a temporal schema for year or month. This shows that even recent events may be difficult to date when no temporal schema is available. All three events from 2005 are dated very inaccurately in comparison to most of the other events (26th, 32nd, and 34th rank). Although they are the oldest events, knowledge of the 2005 year boundary could help some respondents to increase the accuracy of their date estimates because these events could not be telescoped backward very far. Two events from 2005 (the police raid at the summer festival [8] and the World Championship in ice hockey [25]) are cyclic. In such cases it is difficult to specify the year because of the impact of repetition (proactive and retroactive interference). Another reason may be that, because they are the oldest events, the distance from interview time was large and participants had less contextual information available than for more recent events. The relationship between the recency of an event and the dating error in months (year and month) is positive for all 35 events (r = .42, p < .05, 95% CI = (.10, .79)), so the older the event, the less accurate the date estimate. An even stronger relationship is found when I eliminate events with multi–year temporal schema (n = 8): r = .63, p < .001, 95% CI = (.33, .81). Without events from the boundary years the correlation disappears (n = 28), showing that the events from the boundary years are influential and that a two-year reference period is not long enough to show the effect of event recency on events from this sample.
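A confidence interval for a correlation coefficient is commonly obtained with the Fisher z transformation. The sketch below is a minimal illustration under stated assumptions, not necessarily the procedure used for the intervals reported above: the function name is hypothetical, and the number of events behind r = .63 is assumed to be 27 (35 events minus the 8 eliminated ones).

```python
import math


def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for a Pearson correlation
    using the Fisher z transformation."""
    z = math.atanh(r)                # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper              # back-transformed to the r scale


# Assumed inputs: r = .63 with n = 27 events
lower, upper = pearson_ci(0.63, 27)
print(round(lower, 2), round(upper, 2))
```

With so few events the interval is wide, which is in line with the caution that the boundary years strongly influence the result.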


Temporal schemata

H-1: Events that have temporal schema are dated more accurately than events without this schema.

Differences in dating accuracy of events with shared versus unique year temporal schema are large and significant, t(32.72) = 8.68, p < .001, Cohen’s d = 2.67. Events with unique year schema (n = 22) have an average dating error of 2.8 months (SD = 1.2); events with shared year temporal schema (n = 13) have an average dating error of only 1.2 months (SD = 0.4). Events with multi-year temporal schema (four-year) from 2006 and 2007 (n = 7) are dated more accurately (M = 0.64, SD = 0.23) than events without this schema (n = 21; M = 0.87; SD = 0.20). The difference between means is 0.22 year and is significant, t(26) = 2.41, p = .023, Cohen’s d = 1.1. When events from the boundary years 2005 and 2008 are included, the difference is similar.
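An effect size and a pooled-variance t statistic can be recomputed from summary statistics alone. The sketch below uses the rounded means, standard deviations and group sizes reported for the multi-year schema comparison; the function names are hypothetical, and because the inputs are rounded the results only approximate the reported values (the t statistic comes out slightly higher than the reported 2.41).

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference based on the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student t statistic (equal variances assumed), df = n1 + n2 - 2."""
    return cohens_d(m1, sd1, n1, m2, sd2, n2) / math.sqrt(1.0 / n1 + 1.0 / n2)


# Events without multi-year schema (n = 21) versus events with it (n = 7)
d = cohens_d(0.87, 0.20, 21, 0.64, 0.23, 7)
t = pooled_t(0.87, 0.20, 21, 0.64, 0.23, 7)
print(round(d, 2), round(t, 2))
```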

Media coverage in time

Regarding media coverage in time, culminating events are dated on average 1.4 months more accurately than initiating events (F(1, 3368) = 8.25; p < .010) when statistically controlling for whether or not the event has a temporal schema81. Accurate date estimates were more frequent for culminating events (25% compared to 10% of answers). Culminating events were on average affected by backward telescoping just as much as initiating events. Initiating events were more affected by forward telescoping than culminating events (51% compared to 36%). These results show that the higher dating accuracy of culminating events results from a higher number of accurate date estimates and a smaller tendency toward forward telescoping.

81 The statistical model controlled for the categorical variable “event has temporal schema” versus “event has no temporal schema”. The “event has temporal schema” category included 7 events [5, 9, 12, 17, 28, 29, 32] that have a multi–year temporal schema as well as a year temporal schema.

Association with a personal event

H-2: Public events that are associated with personal events are dated more accurately than events without this association.

People who reported having some association of a public event with a personal event dated the public event more accurately than people without this association. According to the Mann–Whitney U test the effect was significant in 12 cases82. The biggest difference occurred for the death of John Paul II [34], where participants without a personal association had a dating error of 16.5 months, compared to only 5.1 months for participants with an association. In the case of the derailment of a train in Studénka [14] the mean dating error was 8.4 months for participants without and 1.4 months for participants with a personal association. The problem was that participants did not mention associations with personal events very often, so many differences were not significant, and some differences between groups might be significant only by accident because of the low frequency of associations. For example, the death of John Paul II was associated with a personal event 18 times and the derailment of a train in Studénka 12 times. Other events were associated with a personal event even less often (usually no more than 10 times).
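The Mann–Whitney U statistic behind such comparisons is based on the ranks of the pooled observations. The following sketch computes only the U statistics for two groups, using average ranks for ties; the function names and the dating errors are hypothetical and do not come from the study data. A p-value would additionally require exact tables or a normal approximation, as provided for instance by `scipy.stats.mannwhitneyu`.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); a small U_a means group_a tends to have lower values."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    return u_a, n_a * n_b - u_a


# Hypothetical dating errors in months for one event
with_association = [1, 2, 0, 3]
without_association = [8, 12, 5, 16, 9]
print(mann_whitney_u(with_association, without_association))
```

With only a handful of respondents reporting an association, U can take few distinct values, which illustrates why results for rarely associated events should be read with caution.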

Respondent characteristics

In this section I will focus on respondent characteristics that could influence the overall dating accuracy. The explored characteristics are the self-evaluation of memory, age, gender, education and interests in public events.

Self-evaluation of memory

H-3: Respondents’ self-evaluation of memory is positively related to dating accuracy.

The majority of participants rated their memory for personal events as being better than for public events, which is not surprising, because personal events are typically more relevant to people’s lives than public events. For both types of events participants (n = 262–265) evaluated their own memory quality in two domains: memory for the content of events (especially “what” happened) and ability to date events. Table 5.4 shows that 46% of respondents think that they can date personal events very well or rather well, in comparison to 22% for public events.

82 (p < .05 for events 6, 7, 9, 12, 14, 17, 18, 25, 28, 29, 31, 34). Hurricane Kyrill [7] was the only event where respondents with an association (n = 7) dated significantly worse than respondents without this association (n = 149). The explanation could be that this is an artifact of the small sample and that these seven people happened by chance to have a big dating error (range is 12 to 21 months).

Table 5.4 Self–assessment of memory for personal and public events among participants

Memory self–assessment | very well | rather well | average | rather poor | poor | n
Memory for personal events | 12% | 43% | 34% | 10% | 2% | 264
Memory for major public events | 2% | 23% | 50% | 23% | 3% | 265
Dating personal events | 6% | 41% | 37% | 15% | 2% | 264
Dating public events | 1% | 20% | 46% | 29% | 4% | 262

The mean dating error of all events may be biased by the fact that some respondents dated difficult events and others did not. That is why I statistically controlled not only for participants’ ID but also for the difficulty of questions (operationalized as the average dating error for all participants together). The resulting mean absolute dating error shown in Figure 5.1 is therefore adjusted and can be interpreted as the relationship between self–evaluation of memory and dating accuracy adjusted for question difficulty.
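The idea of adjusting for question difficulty can be illustrated with simple mean-centering. This is a deliberately simplified stand-in and not the statistical model used for the analysis: each error is expressed as a deviation from the average error of its event, and the grand mean is added back. The function name and the data are hypothetical.

```python
from collections import defaultdict


def adjusted_mean_errors(errors):
    """errors maps (respondent_id, event_id) to an absolute dating error in months.
    Returns each respondent's mean error adjusted for event difficulty."""
    by_event = defaultdict(list)
    for (_, event), err in errors.items():
        by_event[event].append(err)
    event_mean = {event: sum(v) / len(v) for event, v in by_event.items()}
    grand_mean = sum(errors.values()) / len(errors)

    deviations = defaultdict(list)
    for (respondent, event), err in errors.items():
        deviations[respondent].append(err - event_mean[event])
    return {r: grand_mean + sum(d) / len(d) for r, d in deviations.items()}


# Hypothetical data: event e2 is much harder to date than event e1
errors = {
    ("r1", "e1"): 2, ("r2", "e1"): 4,
    ("r2", "e2"): 12, ("r3", "e2"): 8,
}
print(adjusted_mean_errors(errors))
```

In this example respondents r2 and r3 have the same raw mean error of 8 months, but r3 answered only the difficult event and therefore receives a lower adjusted error.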

[Figure: adjusted mean dating error in months by answer to the question “How well are you able to date major public events (when you compare with other people you know)?”. Answer categories: Very well, Rather well, Average, Rather poorly, Poorly. Data labels shown in the plot: 10.8, 10.3, 9.5, 9.2, 8.6.]

Figure 5.1 The relationship between self–assessed ability to date public events and adjusted mean dating accuracy (controlled for participants’ ID and difficulty of questions).


Only 2 respondents rated their memory for public events as “very well”. Those respondents gave dates for only 17 and 7 events, respectively (the accuracy of their dating is shown in the graph and is marked with a small square). This is the reason why I present only comparisons among the alternatives rather well (n = 46), average (n = 105), rather poorly (n = 70) and poorly (n = 10). Post–hoc tests (Sidak) show that the differences between positive self-evaluation (“rather well”) and worse self–evaluation (the other alternatives) are statistically significant at the .05 level at least.

Interest in public events

H-4: Interest in some type of public events is positively related to dating accuracy of these events.

I expected that respondents would date events more accurately if the events belong to their areas of interest. On the other hand, the interest may be focused on different aspects of events (e.g., what, where, who, how) or may be declared only formally. One of the questions asked participants to judge their interest in public events. I will present only interests related to events in the questionnaire. The number in brackets denotes how many participants stated this interest. I could identify four thematic types of interest in public events which relate to the thematic types of public events in the questionnaire: politics (94), sport (88), celebrities (36) and foreign news (75). Because the number of events is not very high, I have also included the events from 2005 and 2008 in this analysis. I tested the relation between dating error in months (year and month), the topic of the questions and the respondent’s interest in a particular area by one–way ANOVA. The effect of participants’ interest in sport remained even in the most complex model where intervening factors were controlled for83. So I can say that participants interested in sport dated all events 0.7 month more accurately, regardless of gender or question topic (F(1, 1341) = 4.31; p = .04; partial η² = .003). It is a small but surprisingly robust effect. The relation between field of interest and question topic is significant (F(1, 4137) = 7.10; p < .01; partial η² = .002) and in accordance with my expectation. Participants interested in sport dated sport events 2.3 months better than participants not declaring such interest (for non–sport events this difference was only 0.4 month, again in favor of respondents interested in sport). This effect differs between men and women. “Sportsmen” dated sport events 2.8 months better than non–sport events, whereas “sportswomen” dated sport events only 1.1 months better than non–sport events. However, this effect is not statistically significant (F(1, 4137) = 1.98; p = .16; partial η² < .01). If the interest in sport is controlled for, men and women did not differ in accuracy of dating sport events (relation of sex and area of question, F(1, 4137) = .84; p = .36; partial η² < .001).

83 Apart from the main effects of question topic and respondents’ field of interest, the interaction between question focus and respondents’ interest was included in the model as well. This tests the hypothesis that interest in some public event theme increases the dating accuracy in this area. Inter–subject variability was controlled by nesting ID into interest; the dissimilarity of events was controlled as inter–question variability. Both respondent ID and event ID are random factors. Gender is also included as a main effect as well as part of the three–way interaction gender × interest × question focus, that is, gender as the moderator of the interaction. A unique model was used for every area.

[Figure: average dating error in months (vertical axis) by whether the question focused on sport (horizontal axis: No, Yes). Legend entries recovered: Men - no sport interest; Men - sport interest; Women - no sport interest.]

Figure 5.2 Average dating error in months (year and month) as a function of question focus, respondent interests, and gender.

The relationship between declared interest in politics and accuracy of dating is not as strong as for sport. The difference in absolute accuracy between people interested in politics and the rest of the participants is 1.0 month for politics questions, in favor of people interested in politics, and 0.3 month for non–politics questions, in favor of people not interested in politics (F(1, 4139) = 4.81; p = .03; partial η² = .001). When sex is also included, this effect decreased substantially (F(1, 4137) = 3.00; p = .08; partial η² = .001)84. This is caused especially by the different accuracy of men’s and women’s dating (F(1, 1134) = 4.98; p = .03; partial η² = .004) and by the slightly different relative frequency of interest in politics for men and women.

In a similar way, the positive effect of interest in celebrities on accuracy of dating was not verified. The difference in dating error between people interested in celebrities and the rest of the participants is 1.1 months for celebrities’ events, in favor of people interested in celebrities, and 0.5 month for non–celebrities’ events, again in favor of people interested in this area. That is why this interaction effect is insignificant, F(1, 4137) = 0.36, p = .55, partial η² < .001.

Declared interest in news from abroad was not related to accuracy of date estimates of foreign news. The difference is less than 0.1 month, in favor of people not interested in foreign news (F(1, 4137) = 0.31; p = .58; partial η² < .001).

84 Men are interested in politics more often than women. When all these factors are controlled, the relationship between interest in politics and accuracy of dating becomes small and insignificant.
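The partial η² values reported in this section can be recovered from the F statistics and their degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to several of the reported tests; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported in the text
reported = [
    ("interest in sport, all events", 4.31, 1, 1341),
    ("sport interest x question topic", 7.10, 1, 4137),
    ("politics interest x question topic", 4.81, 1, 4139),
    ("men versus women", 4.98, 1, 1134),
]
for label, f_value, df1, df2 in reported:
    print(label, round(partial_eta_squared(f_value, df1, df2), 3))
```

All of these values are well below .01, which shows how small the effects are even where they reach statistical significance in a sample of several thousand date estimates.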

Gender

Gender was already included in some analyses above. The results show that men were on average approximately 0.9 month more accurate than women85. Although the events were not chosen with the intention of differentiating between genders, I also explored the effect of gender for all questions separately86. Men were significantly more accurate than women (p < .05) in dating 5 events (the difference in dating is shown in brackets): Topolánek – 1st government [1] (5.8), Kubice’s political crime report [5] (4.8), Music festival police raid [8] (4.5), Topolánek – 2nd government [16] (2.7) and FIFA World Cup – Czechs failed [17] (4.8). Women dated significantly better only the suicide of Karel Svoboda [24] (3.7).

Age and education

One–way ANOVA found that the age of respondents as a random factor is insignificant for all public events, both for signed dating error and for dating error. The reason may be that all age groups use similar temporal schemata, which explain a lot of variance. No relationship was found for education, either.

85 It is a pure effect of gender: respondents’ ID was controlled for and the difficulty of questions was estimated as well. All results are situated near the 5% significance boundary, considering the big variability and the number of factors included in the model.

86 This procedure may be inappropriate from the statistical point of view (there is a high chance that at least some of the differences will be significant by chance) but gives at least some impression of which events were easy for women or men.
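The caution expressed in footnote 86 can be quantified with a short calculation. The sketch assumes that all 35 events were tested separately at the .05 level and, as a simplification, that the tests are independent; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha = 0.05
n_tests = 35  # assumption: one gender comparison per predefined event

print(round(familywise_error_rate(alpha, n_tests), 3))  # chance of at least one false positive
print(alpha * n_tests)                                  # expected number of false positives
```

Under these assumptions roughly one or two of the six significant gender differences could be expected by chance alone, so the single-event results are best read as descriptive.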

Could any public events be temporal landmarks?

I have used four indirect indicators of temporal landmarks: freely recalled events, percentage of correct date estimates, magnitude of dating error, and the association of public events with personal events. The tables present only a subset of events that most strongly comply with the chosen public landmark indicators. Additional information can be found in tables 5.3 and 5.4.

Before dating the 35 events, participants were asked to freely recall two public events from 2006 and two events from 2007 and to provide a month estimate for each. Approximately half of the respondents mentioned at least one event (this could be the same as one of the pre-specified events that respondents had to date later on). Floods were mentioned most frequently, but they could not be analyzed because a detailed description was missing and there have been too many floods in the Czech Republic in recent years. The most frequent freely recalled event for which the true date could be specified was the Parliamentary elections [32] (n = 66), where only 4 respondents answered with a wrong year and almost all errors were within one month. Other freely recalled events can be found in Table 5.5. The presented frequency is higher than the sum of correctly and wrongly dated events because the month was missing for a few events, so I could not calculate the dating error for year and month together.

If freely recalled events are correctly dated, they can be labeled as temporal landmarks because they fulfill two important criteria: availability in memory and a known correct date. However, whether they can help to aid the recall of other events cannot be answered87.

87 I would also like to emphasize that the dating error of freely recalled events was on average smaller than the mean dating error in table 5.3. However, the frequencies of freely recalled events are relatively low and substantial generalizations cannot be made.

Table 5.5 Indicators of temporal landmarks among events from 2006 and 2007 (free recall and association of public events with personal events)

Free recall

| No | Event | f | C | E |
|---|---|---|---|---|
| 32 | Parliamentary elections | 66 | 41 | 19 |
| 9 | Municipal elections* | 25 | 13 | 8 |
| 29 | Ol. Games in Torino | 18 | 12 | 2 |
| 17 | FIFA World Cup | 18 | 12 | 3 |
| 35 | CZ in Schengen | 8 | 7 | 1 |
| × | Bird flu | 7 | 2 | 2 |
| 24 | Suicide of K. Svoboda | 6 | 4 | 2 |
| × | Economic crisis USA | 5 | 1 | 3 |

Association

| No | Event | f |
|---|---|---|
| 32 | National Library | 14 |
| 29 | Ol. Games in Torino | 9 |
| 25 | Ice hockey WCH – 1st | 9 |
| 18 | Driving points system | 8 |
| 7 | WCH athletics | 4 |
| 28 | FIFA World Cup | 4 |
| 17 | Municipal elections | 4 |
| 9 | Same-sex coup. reg. | 3 |

Note. No = event number as it appears in appendix 1. f = frequency. C = number of correct answers. E = number of erroneous answers. 0–3 = percentage of date estimates that have a maximal dating error of 3 months. × = event not among the 35 predefined events. * Includes elections to the Senate (2 times) and elections of county authorities (1 time). ** Both August and September are counted as the correct month because the WCH in athletics took place within a two-month period.

Association of public events with personal events. A personal event can make the dating of a public event easier; e.g., a respondent may know that the Olympic Games took place in February and then deduce the year from her graduation year. Table 5.5 lists only those events for which the association with a personal event significantly improved dating accuracy. The frequencies of the associations were small, which is why many of the differences between respondents with and without an association were not significant. Only one significant difference in favor of the non-association group was found (Hurricane Kyrill [7]). This may, however, be an artifact of the small frequency of the association.

Percentage of correct date estimates is presented for the correct date (month and year together), the correct year, and the correct month. As can be seen in table 5.6, the event most often dated exactly (year plus month) was the Olympic Games in Torino [29] (47% of respondents). When a dating error of at most 3 months is allowed, the most accurately dated event is the Parliamentary elections [32] (68%). Percentage of correct year indicates which events can help to date the year. Both elections were dated correctly at the year level by more than 70% of respondents. Percentage of correct month indicates which events can help to date the month. The month was most often dated correctly for the Olympic Games in Torino [29], for the obvious reason that it is a winter event with a temporal schema. All other correctly dated events have a temporal schema too.

Table 5.6 Indicators of temporal landmarks among events from 2006 and 2007 (% of correct date estimates in months—month and year together, years and month only)

% correct (month and year together)

| No | Event | 0 | 0–3 |
|---|---|---|---|
| 29 | Ol. Games in Torino | 46.9 | 63.6 |
| 9 | Municipal elections | 44.3 | 63.3 |
| 17 | FIFA World Cup | 32.5 | 56.1 |
| 32 | Parliamentary elections | 30.1 | 68.3 |
| 5 | Kubice’s crime rep. | 27.8 | 51.3 |
| 33 | Bulg. & Rom. EU | 16.7 | 26.0 |
| 24 | Suicide of K. Svoboda | 14.5 | 28.3 |

% correct year

| No | Event | % |
|---|---|---|
| 9 | Municipal elections | 77.2 |
| 32 | Parliamentary elections | 71.3 |
| 29 | Ol. Games in Torino | 64.0 |
| 5 | Kubice’s crime rep. | 54.0 |
| 17 | FIFA World Cup | 51.1 |
| 32 | National Library | 43.1 |
| 15 | Interchange of newborns | 42.7 |

% correct month

| No | Event | % |
|---|---|---|
| 29 | Ol. Games in Torino | 63.5 |
| 9 | Municipal elections | 51.4 |
| 17 | FIFA World Cup | 46.8 |
| 19 | SuperStar 3 | 46.8 |
| 7 | WCH athletics** | 42.2 |
| 13 | Star Dance II | 41.1 |
| 29 | Star Dance I | 40.6 |

Note. No = number of an event in appendix 1. Dating error = error of month and year in month units. 0–3 = percentage of date estimates that have a maximal dating error of 3 months.

The magnitude of the mean dating error in absolute value shows which events have the smallest dating error (month and year together). This mean error is of course heavily influenced by outliers. However, if both the mean and the standard deviation are small, this offers strong evidence that the event is generally easy to date and can potentially serve as a temporal landmark too (see table 5.7). This indicator shows that the municipal elections and the parliamentary elections are the most accurately dated events. As can be seen, some events may have a large year error (and dating error) and a small month error, or vice versa. For example, Municipal elections [9] are dated accurately overall (dating error and year error are small), but the month has a relatively large error (M = 1.56; SD = 2.11; 8th smallest month error). The dating error also indicates that although some events are dated correctly by many respondents (e.g., FIFA World Cup Germany [17], 33%; 3rd place in the table of percentage correct), other respondents may find dating them more difficult, which results in a higher overall dating error (8th place in the table of dating error).


Table 5.7 Indicators of temporal landmarks among events from 2006 and 2007 (dating error in month and year together, year error, and month error)

Dating error (month and year together)

| No | Event | M | SD |
|---|---|---|---|
| 9 | Municipal elections | 4.8 | 7.4 |
| 32 | Parliament elections | 5.6 | 8.6 |
| 29 | Ol. Games in Torino | 6.7 | 9.6 |
| 3 | Paroubek – wedding | 7.0 | 4.1 |
| 5 | Kubice’s crime rep. | 7.6 | 8.8 |
| 13 | Star Dance II | 7.7 | 4.8 |
| 16 | Topol. – 2nd gov. | 8.3 | 7.4 |
| 17 | FIFA World Cup | 9.5 | 10.8 |

Year error

| No | Event | M | SD |
|---|---|---|---|
| 9 | Municipal elections | 0.37 | 0.71 |
| 32 | Parliament elections | 0.43 | 0.72 |
| 29 | Ol. Games in Torino | 0.51 | 0.75 |
| 5 | Kubice’s crime rep. | 0.59 | 0.71 |
| 32 | National Library | 0.64 | 0.61 |
| 24 | Suicide of K. Svoboda | 0.65 | 0.59 |
| 15 | Interchange of newborns | 0.66 | 0.64 |
| 3 | Paroubek – wedding | 0.70 | 0.48 |

Month error

| No | Event | M | SD |
|---|---|---|---|
| 29 | Ol. Games in Torino | 0.71 | 1.27 |
| 17 | FIFA World Cup | 0.81 | 1.03 |
| 28 | WCH athletics | 1.05 | 1.16 |
| 32 | Parliament elections | 1.22 | 1.38 |
| 13 | Star Dance II | 1.43 | 1.60 |
| 19 | SuperStar 3 | 1.53 | 1.89 |
| 29 | Star Dance I | 1.54 | 1.70 |
| 9 | Municipal elections | 1.56 | 2.11 |

Note. No = number of an event in appendix 1. Dating error = error of month and year in month units.
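The error measures reported in tables 5.6 and 5.7 can be illustrated with a minimal Python sketch. It assumes that dating error is the absolute distance between the estimated and the true date in months, that year error is the absolute difference in years, and that month error is the plain absolute difference between month numbers; the last definition in particular is an assumption, since the text does not spell it out. The `DateEstimate` class, the function names, and the four example estimates are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DateEstimate:
    """One respondent's date estimate for one event (hypothetical structure)."""
    true_year: int
    true_month: int
    est_year: int
    est_month: int


def signed_error(e: DateEstimate) -> int:
    """Signed error in months; positive means the event was dated later than it happened."""
    return (e.est_year - e.true_year) * 12 + (e.est_month - e.true_month)


def dating_error(e: DateEstimate) -> int:
    """Absolute error of month and year together, in month units."""
    return abs(signed_error(e))


def year_error(e: DateEstimate) -> int:
    """Absolute error of the year estimate alone."""
    return abs(e.est_year - e.true_year)


def month_error(e: DateEstimate) -> int:
    """Assumption: absolute difference of month numbers, ignoring the year."""
    return abs(e.est_month - e.true_month)


def percent_within(estimates: list[DateEstimate], tolerance: int) -> float:
    """Percentage of estimates whose dating error is at most `tolerance` months."""
    hits = sum(dating_error(e) <= tolerance for e in estimates)
    return 100.0 * hits / len(estimates)


# Invented estimates for an event that happened in February 2006.
estimates = [
    DateEstimate(2006, 2, 2006, 2),   # exact
    DateEstimate(2006, 2, 2006, 3),   # one month too late
    DateEstimate(2006, 2, 2007, 2),   # right month, wrong year
    DateEstimate(2006, 2, 2005, 12),  # two months too early
]

print(mean(dating_error(e) for e in estimates))  # mean dating error: 3.75
print(percent_within(estimates, 0))              # "0" column: 25.0
print(percent_within(estimates, 3))              # "0–3" column: 75.0
```

The third estimate shows how an event can have a large dating error and year error while its month error is zero. The fourth estimate shows the weakness of the assumed month-error definition: the estimate is only two months off, yet the month numbers differ by ten.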

Whether any of the public events could serve as temporal landmarks to aid the recall of personal events depends on the task and on the sample of people concerned.

The reason for using public temporal landmarks may be to aid the recall of the month, the year, or both. If the month estimate is important, then all well-known public events could be useful. These are often annual events (or events happening every four years) with a small month dating error and a high percentage of correct answers (as can be seen in tables 5.6 and 5.7, these are often the same events). The only problem is to choose the right events for the respondents.

If the year estimate is important, the situation is similar. All events with a small year error and a high percentage of correct answers may be useful. These are often events with a multi-year temporal schema.

When both year and month estimates are important, then all of the indicators mentioned so far are relevant, plus the dating error indicator and the percentage of correct answers.

When researchers want to choose landmarks suitable for some specific subpopulation, it would be advisable to choose events from the areas of their interest because, as was shown in previous sections, interest in some type of public events increases the dating accuracy of these events.


Evaluation of the possibility of using public events in Study II as recall aids:

The remaining question is whether some of these events could be suitable as landmarks to aid recall in Study II. Study II will experimentally compare two types of interviews, with and without a calendar instrument, and their impact on the dating accuracy of personal events. Incorporating public landmarks into the interview is one possible way to aid recall.

Because the dates of public events can be written into the calendar instrument by the researcher, the events used need not be limited to those whose dates respondents know. For this reason, the free recall indicator and the association indicator could be the most suitable indicators, because such events are already in the memory network and may thus be connected to some other target events. The problem is, however, that both indicators showed that people do not remember public events very often, and the frequencies of the associations are very low as well. These indicators may be deflated by the online mode of data collection, but even if that is true, this still does not provide a convincing argument for using these events as recall aids. A further argument against their use is that dating accuracy seems to be highly influenced by the availability of temporal schemata. These schemata are of a semantic nature, and people often know only the fact (e.g., when the Olympic Games happen) without remembering any episodes.

Because of this I have decided not to use any of the public events as recall aids in Study II.

Evaluation of the use of public events in the subsequent Study II

Forming a suitable subset of events for the evaluation of the calendar instrument in Study II is less problematic. Events vary a lot in dating difficulty, some having temporal schemata and some not, so I have many possibilities to choose from. I have decided to choose only eight public events because more events could be too burdensome for the respondents (they will have to estimate the dates of 28 personal events as well). I will choose two events that are easy to date, two events of medium difficulty, two events that are difficult to date, one event from 2005, and one event from 2008 (see Study II for further thoughts about the choice of specific questions).


5.4 Discussion

The main aim of this study was to find out how accurately people estimate the dates of well-known public events presented in the media and whether there are any effective predictors of accurate dating. The auxiliary aims, partially connected to Study II, focused on indicators of temporal landmarks among public events and on whether public events could form a suitable subset of events to be used in Study II. I will first focus on the main aim.

Discussion of the major findings

An impact of event recency on dating accuracy was found only when events from the boundary years 2005 and 2008 were included, and it was stronger for events without a temporal schema. The reason is that including these events increased the length of the reference period to four years, and it is to be expected that five-year-old events are more difficult to date than two-year-old events (especially when other characteristics are similar) (Bradburn, 2010; N. R. Brown, et al., 1985). When events from the boundary years were eliminated, the effect of recency disappeared. Therefore I can conclude that for events approximately 3 to 4 years old, no evidence was found that recency is a significant predictor of dating accuracy. This does not imply that there is no relationship at all (which would be counterintuitive and against research evidence; see chapter 3.2) but that when such a short recall period is used, other event characteristics play a more substantive role (e.g., temporal schemata or relevancy).

The availability of a temporal schema is the strongest predictor of dating accuracy for both month and year, explaining a lot of variance in the data. According to Larsen et al. (1995), public events with temporal schemata may be dated better than personal events with temporal schemata. The reason is that the temporal schemata of public events are more stable than personal schemata, which change over the life span.

Dating of culminating events was more accurate, and the effect of forward telescoping more subtle, than for initiating events. Surprisingly, no backward telescoping effect was found for culminating events. This could be explained by the fact that participants might not have paid much attention to the media coverage before the event happened.


Participants who had a public event associated with a personal event dated this event more accurately. However, since the number of public events that were associated with personal events was small, the difference in accuracy of dating was often non-significant88.

Participants who rated their memory for public events as “rather good” dated 1.7 months more accurately than people who rated their memory as “rather poor”. Since the memory self-assessment preceded the actual dating task of target events, the relationship between self-rated memory quality and dating accuracy is probably underestimated.

Interest in a particular public area generally increased the dating accuracy of events connected with this interest. The greatest differences in accuracy of dating were found between participants interested in sport and the rest of the sample. I suppose that the reason why “sportsmen” dated sport events so well is that sport events, compared to events from other areas, are more predictable and usually of a cyclic nature. This is supported by the finding that “easier” (= predictable and cyclic) questions in politics (e.g., Municipal elections, Parliamentary elections, Topolánek – 2nd government) are dated more accurately by people interested in politics, although on average the positive effect of interest in politics was small and disappeared in more complicated models. The fact that people interested in sport dated all events 0.7 months better is also worth noticing. Maybe expert knowledge of (interest in) some area can have a positive effect on dating events in other areas because such participants have more clues that can be associated with events from other areas.

Women dated events 0.9 months less accurately than men, which is probably caused by the choice of events, which may be more relevant to men (sport events, politics). Differences in the dating of some events show interesting gender differences worth further study89. When women date more relevant and personal events, they are usually more accurate than men (see e.g., Skowronski, et al., 1991; Skowronski & Thompson, 1990).

The age of participants is not a good predictor of dating accuracy, and it seems that the set of events was equally difficult for people of different ages.

88 Only one event was dated significantly worse by people with an association (Hurricane Kyrill [7]). It is probably an artifact.

89 For example, some political events with a temporal schema were dated more accurately by men, whereas political events without a temporal schema were dated equally accurately by men and women.

The differences may be more pronounced when there are more participants of different ages in the sample90.

Education also did not turn out to be a good predictor of dating accuracy, contrary to some studies (Howes & Katz, 1992). The reason is, however, probably not the lack of a relationship but the overrepresentation of well-educated people in the sample and the underrepresentation of people with very low education.

Implications for Study II choices

I also dealt with the question of which events from 2006 and 2007 could be temporal landmarks. All of the explored indicators of public landmarks were only indirect indicators, which is why they cannot provide a final answer as to whether an event will actually work as a recall aid or not. On the other hand, even this indirect evidence provided enough arguments to conclude that no landmark was found that could convincingly help people estimate the month or year of the target events in Study II. One of the main reasons is that most people do not have any associations with public events and were also unable to recall many public events in the free recall task. Being interconnected with other events is one of the most important features of temporal landmarks (Shum, 1998), and this feature is thus missing. This may of course be partially caused by the online mode of the interview, in which respondents may not feel obliged to answer all questions.

This does not imply that all events would be useless as recall aids, because the length of the delay between the time of the interview and the event to be estimated is also important. When, for example, researchers ask about one-year-old events, then one-year-old public landmarks may work very well. However, if the delay is more than two years, interference from similar events (e.g., the second and third series of the same TV show) may cause problems with estimating the year and, if the month is not constant, also with the month estimate.

When the delay between the interview and the target events is very long (say 6 years or more), only the most salient events that do not repeat too often will probably help with the year estimate of personal events. It is questionable whether any of these events could increase the dating accuracy of personal events in such a remote reference period. For example, the Olympic Games occur every four years, as do elections, and it can be expected that the association of personal events with public events will fade over time and become mixed up with similar events that happened later or before.

Unique one-time events like the death of John Paul II [34] could be better recall aids for longer periods, but as can be seen, the frequency of associations is low even for this event and its average dating accuracy is one of the worst.

It is questionable whether any of the public events could aid recall in general. Nevertheless, there may be events that could work reasonably well for some subgroups of people. For example, the death of John Paul II [34] could work well for believers, sport events for people interested in sport, and political events for those interested in politics. This is supported by the fact that when an event is associated with personal events, the dating accuracy is generally higher, and that when events are freely recalled, they are often dated very accurately.

Choosing a subset of public events for the evaluation of the calendar instrument is less problematic (see chapter 6 for more details).

90 Younger respondents in my study: 18–23 years, n = 17; older respondents: 60+, n = 12.

Limitations of the study and the implications for future research

The study has a broad design, which enables us to see the issues of dating in a wider context, but on the other hand the generalization of some results is limited. The small range of event recency decreases the possible correlation between the age of an event and the accuracy of dating. Likewise, the relationship between the area of events and the interests of participants is limited because the events were not chosen with the intention of representing just the four analyzed areas of interest.

Many participants did not follow the instructions precisely: in particular, the “I do not know that the event happened” answer was used too often and improperly, and the 2005 and 2008 year boundary was often exceeded. Even though I expected these problems in advance and placed a short summary of the instructions on every page, some participants probably did not pay much attention to them, as is quite common in an online questionnaire mode (Dillman, Smyth, & Christian, 2009)91. The “I do not know...” answers mostly pertained to the year, which was skipped, and not to the month. This could bias the year error (as well as the total dating error) towards more accurate year estimates, because I expect that people who skipped questions skipped the ones that were difficult for them. Nevertheless, I do not expect a strong bias because dating most public events was difficult even for the most accurate participants.

More than half of the participants put down at least one year estimate outside the boundary period. These reports had to be eliminated (year only). This decreases the number of date estimates in the analyses and possibly also biases the year estimates towards more accurately dated events. Again, I do not expect a strong bias (see above). It is also possible that people who dated beyond the limit were aware of the limit in some cases and forgot about it in other cases, or intentionally did not respect the limit because they were sure of their answer. This could happen, for example, with the event in which the Czechs won the ice hockey WCH, because the Czech team won in both 2005 and 2010.

Even though participants did not pay full attention to the 2005 and 2008 boundary instruction, I assume that the accuracy of dating would be lower without this limitation (Rubin & Baddeley, 1989). This is also supported by an unpublished pilot study (Neusar, 2011), in which in January 2011 I asked a different sample (N = 30, all university graduates or university students) two questions from the questionnaire: hurricane Kyrill [7] and “penalty points system came into force” [18]. Without any boundary years provided, participants dated much worse and with greater standard deviations. The more recent time of the interview cannot explain such a great difference because temporal schemata, which explain a lot of dating accuracy, are quite stable over time (Larsen, et al., 1995).

The frequency of interest in public events, the frequency of associations with personal events, and the dating accuracy may all be underestimated in the current study because it can be expected that respondents try less hard in an anonymous mode of data collection than in a face-to-face interview (Dillman, et al., 2009; Grice, 1975). This could make all the effects smaller than they would be if respondents paid full attention to the task. This hypothesis will be explored in Study II.

91 Both problems could be partially solved by replacing the free-text fields in which respondents wrote down their answers with radio buttons offering predefined alternatives on which respondents click. Unfortunately, the pilot study of the questionnaire did not show any problems, and the page layout without radio buttons seemed to be suitable and transparent. I also did not want to include the alternative “I do not know the event” as a radio button, because it is known that respondents choose it too often when such an alternative is easily available (Tourangeau, et al., 2000).

Future research in this field should focus on individual differences because interpersonal variance is very high. Another way to improve research in this field is to use a more sophisticated way of choosing events and their characteristics (e.g., choosing events from some areas on purpose) or to use different modes of data collection.


6 Dating accuracy predictors of remote personal events (Study II)

6.1 Study aims and hypotheses

Study I focused on the predictors of dating accuracy regarding remote unique public events from the reference period 2005–2008. The present Study II focuses on events from the same reference period, but its main interest is in personal events. The design is almost identical to the design of Study III, with the difference that Study III focuses on recent personal events.

The main aim of this study is to find predictors of dating accuracy of remote unique personal events. Three types of predictors of dating accuracy will be examined:
• Respondent characteristics: they influence the overall dating accuracy (e.g., self-assessment of memory);
• Event characteristics: these are measured separately for every event and influence each target event (e.g., vividness/details, recency);
• Data collection: respondents in the experimental group are interviewed with a calendar instrument (CAL) and respondents in the control group without a calendar instrument (Non-CAL).

The following research questions were asked:
• Which event and respondent characteristics are the strongest predictors of dating accuracy among remote personal events?
• Do interviews with a calendar instrument (CAL) lead to higher dating accuracy of remote personal events in comparison to interviews without a calendar instrument (Non-CAL)?

I have also added eight public events into the sample of events. The major aim of including these events was to explore whether the calendar instrument increases the dating accuracy of public events.
• Do interviews with a calendar instrument (CAL) lead to higher dating accuracy of remote public events in comparison to interviews without a calendar instrument (Non-CAL)?


Selection of personal events

Proxies’ task was to secretly collect 24 events from the life of their partner92 during the period 2006–2007, plus an additional 2 events from 2005 and 2 events from 2008. Proxies were not allowed to tell their partners that they were collecting any events. Proxies had several instructions as to which events to collect:
• Collected events (or their descriptions) had to be unique enough for their partner (the respondent) to be able to distinguish them from other similar events. The event description had to be as short as possible but include all necessary details. Any additional details could be written into a note that was visible only to the interviewer and was occasionally used when the respondent did not have enough details to distinguish the target event from other events.
• The event description had to highlight the major theme. For example, “trip to Jeseníky mountains” may be unique enough, but when the event is about a “marriage proposal” that happened during the trip, it should be noted that it was not an ordinary trip.
• Approximately half of the events should be from 2006 and half from 2007.
• Only memorable events should be collected, that is, only events which the proxies think their partner has not forgotten completely.
• The true date of all collected events had to be checked against documents (e.g., e-mails, text messages, diaries, old calendars, bills).
• If an extended event covered a period of months, the description had to specify which part of the extended event was to be dated (e.g., beginning of the trip to Croatia). That means that all events have only one “true date”.
• Events that are too easy to date should be omitted (e.g., own birthdays, New Year’s celebration).
• Events that are too sensitive should be omitted, or the sensitive aspect should be described in an inoffensive way.
• Not too many events that happened during the same month and were strongly associated should be collected.
• The event description should not contain any temporal hints (e.g., our “summer” vacation when we were swimming a lot in the Caribbean Sea), even when proxies think the hint is obvious and their partner cannot make a mistake in the month estimate.
• Events should be selected from as wide a range of dates within the boundary as possible.

92 Important family issues that concern other family members could be included as well, but the majority of events were supposed to be self-events in which the respondent took part.

Collected events were then checked by the interviewers to see whether the descriptions complied with all of the above-mentioned criteria.

There were two controls to check whether the recorded date was correct:
• All proxies were asked whether they had checked all the dates. If some dates had not been checked, they were asked to do so. If that was not possible, they were asked to collect another event instead (some proxies were unable to collect more events, which resulted in small differences in the total number of events to be dated).
• If, after the interview was finished, the respondent did not agree with the date of some target event selected from the proxy diary, he or she could provide evidence that the date was mistaken (diary, e-mail, text message, etc.). Proxies could provide evidence as well. When the evidence did not sufficiently prove that the date had been correctly recorded, the event was excluded from the analysis.

Respondents also skipped events whose description was not unique enough to distinguish them from similar events.

Examples of the events and their descriptions:
• Our first “official” meeting with your parents in Olomouc. Note: introducing the new girlfriend.
• Vacation in Bludov spas.
• Car accident with our Mégane.
• Going by boat down the Morava River with our daughter and stopping at the pub “Doga”.
• First “expedition” to find the wedding dress.

Selection of public events

Public events were chosen from Study I. I have chosen eight events (see table 6.1). Two events were from the boundary years, one from 2005 and one from 2008. Six events were chosen from the years 2006 and 2007. The aim was to choose events with three levels of dating difficulty93. Because of this I have chosen two events of low overall difficulty (dating error; month and year together) (events number 32, 29), two of medium difficulty (events number 7, 18), and two of high difficulty (events number 6, 35). The event from 2005 (34) is one of the most difficult to date and the event from 2008 (26) is one of the easiest. Three events have low month difficulty (events number 32, 29, 26), three events have medium month difficulty (events number 6, 35, 34), and two events have high month difficulty (events number 7, 18).

93 Dating difficulty was operationalized as the dating error in chosen units from Study I.

Table 6.1 Selected public events from Study I

| No | Event | Year | N | Dating error M | Dating error SD | Month error M | Month error SD |
|---|---|---|---|---|---|---|---|
| 32 | Parliamentary elections | 2006 | 123 | 5.6 | 8.6 | 1.2 | 1.4 |
| 29 | Olympic Games in Torino | 2006 | 143 | 6.7 | 9.6 | 0.7 | 1.3 |
| 7 | Hurricane Kyrill | 2007 | 146 | 10.7 | 6.7 | 3.8 | 1.9 |
| 18 | Driving points system | 2006 | 143 | 10.8 | 7.0 | 3.5 | 2.3 |
| 6 | Same-sex couples registered partnership | 2006 | 106 | 11.7 | 8.1 | 2.9 | 1.7 |
| 35 | Czech Republic joined Schengen | 2007 | 137 | 14.0 | 12.2 | 2.4 | 1.9 |
| 34 | Pope John Paul II died | 2005 | 133 | 15.0 | 13.0 | 2.0 | 2.0 |
| 26 | Healthcare charges | 2008 | 133 | 6.0 | 7.8 | 1.4 | 2.0 |

Note. For more details about the dating accuracy see chapter 5. The description of these events is in Appendix 1.
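The grouping of the eight events by month difficulty can be reproduced from the mean month errors in Table 6.1. The sketch below is only an illustration: the text does not state which cutoffs were used, so the rank-based split into three, three, and two events (the `sizes` argument) is an assumption chosen to match the group sizes named in the text, and the function name `group_by_rank` is hypothetical.

```python
# (event number, label, mean month error) copied from Table 6.1
events = [
    (32, "Parliamentary elections", 1.2),
    (29, "Olympic Games in Torino", 0.7),
    (7, "Hurricane Kyrill", 3.8),
    (18, "Driving points system", 3.5),
    (6, "Same-sex couples registered partnership", 2.9),
    (35, "Czech Republic joined Schengen", 2.4),
    (34, "Pope John Paul II died", 2.0),
    (26, "Healthcare charges", 1.4),
]


def group_by_rank(rows, sizes=(3, 3, 2)):
    """Rank events by mean month error and cut the ranking into difficulty groups.

    Assumption: groups are formed by rank with the given sizes, not by fixed
    error thresholds.
    """
    ranked = sorted(rows, key=lambda row: row[2])  # smallest error first
    groups = {}
    start = 0
    for label, size in zip(("low", "medium", "high"), sizes):
        groups[label] = sorted(row[0] for row in ranked[start:start + size])
        start += size
    return groups


month_groups = group_by_rank(events)
print(month_groups)
# {'low': [26, 29, 32], 'medium': [6, 34, 35], 'high': [7, 18]}
```

Under this assumption the sketch returns the same three groups that the text lists for month difficulty. The overall (dating error) difficulty levels were assigned relative to all events of Study I, so they cannot be recovered by ranking these eight events alone.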

Events excluded from the analyses

• Events that were not distinguishable from other similar events (the description of an event was not unique enough).
• Events for which the true date could be biased. When the original “true” date was biased but the correct true date was found, this new true date replaced the original one and the event was not excluded.

Variables excluded from the analyses

• Event theme: Event theme will be analyzed by my colleague94 in her master’s thesis, which is due in summer 2012.
• Education: Even though respondents with higher education may be more accurate, my sample does not offer data for such a comparison because most of the respondents (88%) received at least a university bachelor’s degree and the frequencies of other levels of education are low.

94 Eva Literáková.

Hypotheses

In this section I state both hypotheses (abbreviated as “H”) and more explorative questions. In some specific cases I do not state a hypothesis because the design of my study influences the impact of some variables and thus the typical relationship may not be found. Another reason is that some of the measured variables lack variance, which was already known during or even before the data collection.

Hypotheses and explorations concerning event characteristics of personal events: respondent independent or partially independent

Before I move to the hypotheses I will briefly look at the more explorative variables.
• Event recency. I do not expect to find a strong (or any) relationship between event recency and dating accuracy. The reason is that most target events come from a two-year reference period (2006–2007) and the interviewing started in January 2011 and ended in November 2011. Thus the most recent events were approximately 3 years old and the most remote approximately 5 years old (or slightly older for respondents who were interviewed later on), which is probably not a big enough difference to find a substantive effect of event recency when the possibilities of controlling the event characteristics are limited.
• H-1: Events that have temporal schema are dated more accurately than events without this schema. This holds only for temporal schemata with appropriate time units: a year schema increases the accuracy of month estimates, as does a season schema (e.g., winter).
• H-2: Self–events are dated more accurately than other–events.

Hypotheses and explorations concerning event characteristics of personal events: respondent dependent or proxy dependent

• Event sharing should be positively related to dating accuracy, but this variable was added later on and thus the sample of events will probably be too small, or any differences could be just an artifact. However, it may still be worth exploring whether at least some tendency can be found.


• H-3: Events with a known date are dated more accurately than events with a reconstructed date.
• H-4: Event importance is positively related to dating accuracy. Note: especially “highly important” events should be dated more accurately than the rest of the events. The expected importance rated by proxies will also be positively related to dating accuracy but the relationship will probably be weaker.
• H-5: Event uniqueness is positively related to dating accuracy. Note: especially “highly unique” events should be dated more accurately than the rest of the events.
• H-6: Confidence in month estimates is positively related to dating accuracy in months (month only). Expected accuracy of month estimate rated by proxies will also be positively related to dating accuracy in months (month only) but the relationship will be weaker.
• H-7: Confidence in year estimates is positively related to dating accuracy in years (year only). Expected accuracy of year estimate rated by proxies will also be positively related to dating accuracy in years (year only) but the relationship will be weaker.

Hypotheses and explorations concerning respondent characteristics

• Age should not be related to dating accuracy because I have restricted the age. The reason for this restriction is that I wanted all respondents to experience approximately similar events (e.g., a vacation in Croatia may be a very different experience for older people who have been to Croatia 20 times than for younger people who have been there only once).
• Motivated respondents should be on average more accurate than less motivated respondents (as was mentioned in chapter 3.1). But in a small scale study that uses a face–to–face interview the rapport with a respondent is usually good, which limits the chances of finding differences in dating accuracy caused by different levels of motivation.
• What was said about motivation applies to the tiredness of respondents as well. It should also have an impact on dating accuracy, but because the interviewers were instructed not to conduct interviews very late at night or at times not suitable for respondents, the negative effect of tiredness will probably not be found.
• H-8: Women are on average more accurate than men.


• H-9: Respondents’ self-evaluation of memory is positively related to dating accuracy. The better the respondents rate their memory the more accurate they are. However, I expect only a small effect, because respondents rated their memory before knowing which events they would estimate.
• H-10: Respondents’ evaluation of the overall difficulty of estimating the date is negatively related to dating accuracy (rated after providing the date estimates, for month and year together). Note: This relationship should be stronger than the relationship of self-evaluation of the memory before the dating task.
• H-11: Number of self-generated landmarks is positively related to dating accuracy.

Explorations concerning the impact of calendar instrument on dating accuracy of personal events and public events

• Even though it could be expected that the calendar instrument will increase the dating accuracy of personal events, this is not that obvious. It can be expected that a calendar instrument should not decrease dating accuracy. One reason why the calendar instrument may have limited (or no) impact on dating accuracy is that a calendar instrument cannot influence the dating accuracy when the date was simply known. Another reason is that some landmark events in the calendar may be biased, due to which the reconstruction of associated events and their dates may be biased as well. The calendar instrument may also “work” differently for women than for men. I will thoroughly explore the relationship between dating accuracy and applying the calendar instrument to find some indications of whether the calendar has an effect on dating accuracy or not.
• The impact of the calendar instrument on public events is even less obvious, because when public events are not associated with personal events the network of personal events cannot help much. It could be expected that public events that are not associated with personal events will not be dated more accurately in the CAL condition. On the other hand, events that are associated with personal events could be dated more accurately in the CAL condition because of the help of other events mentioned in the calendar that may provide sequential or parallel cues.

Hypotheses concerning the public events


• H-12: Confidence in month estimates is positively related to dating accuracy.
• H-13: Confidence in year estimates is positively related to dating accuracy.
• H-14: Public events that are associated with personal events are dated more accurately than events without this association (applies only when temporal schemata of public events are controlled for).

6.2 Method

Participants

The participants in this study were respondents (those who estimated the dates) and their partners, who served as proxies. Many couples95 were contacted with flyers at public places and by e-mail through snowball sampling. The aim was to reach a sample of couples with different educational, socio-economic and demographic characteristics, even though I do not expect these participant characteristics to have a substantial impact on the dependent variable, which is the dating error. Partners had to have lived together at least since 2005 (because the reference period is 2005–2008). The role of a proxy was to assist in collecting (and sometimes also selecting, when they collected more events) personal events about their partners (or family). Respondents had to be between 23 and 40 years old. The reason is that during these reminiscence bump years people experience many important life events; the age range also allows a future comparison of results with Study III, which used the same range of respondents’ age. All respondents are Czech citizens who were present in the Czech Republic for most of the reference period. Every couple was compensated with a small present worth approximately five euros. Both partners had to agree to participate in the study. Proxies were informed about the study’s focus on dating personal events but were instructed to keep this information secret. Respondents were informed that the study would focus on everyday memory without any further specification. They were also informed that no sensitive or offensive information would be gathered from them or their proxies and that the results would be shown to them only (or to both partners if they wished).

I finally reached a sample of 40 couples (respondents were 18 women and 22 men). This was the maximum number of respondents I could find within the time constraint that the interviews should not take longer than one year (because the effect of event recency could then be much more pronounced and difficult to control in the analyses). A bigger sample was not reached because many of the contacted couples found the task too difficult and time consuming, especially for the proxies (though most of the proxies who participated in the research did not say that). Another reason is that some respondents (the roles of respondent and proxy were randomly assigned) were afraid of being part of a memory study, even though I tried to persuade them that it would not be a difficult task and that they would gain interesting information about themselves. The intention to have equal numbers of male and female respondents was not completely achieved (especially in the CAL condition). The reason is that several male proxies did not finish the task of collecting the personal events of their partners and these couples were thus withdrawn from the study. One respondent was older than 40 (46) but was not excluded because the lifestyle of the couple was similar to that of many other couples in the sample (the proxy was younger) and they were also part of Study III.

95 Only heterosexual couples were recruited. This is because gender is an important predictor of dating accuracy and also because the gender of the proxy could have some effect on the events that are recorded in the diary. Over 120 couples were contacted but many of the couples did not have time (especially the proxies) or did not want to be part of the memory research.

Measures

Most of the measures are thoroughly described in chapter 4 and are not explained here again (with the exception of temporal schemata; see the next section). Table 6.1 summarizes all measured independent variables.

Table 6.1 Summary of measured independent variables

Independent variables measured in empirical studies

Respondent characteristics
• Age (23–40; 46)
• Gender
• Education
• Self-evaluation of memory
• Overall difficulty of estimating the date (R.)
• Tiredness (Int., R.)
• Motivation (Int., R.)
• Number of landmarks (only CAL condition)
• Public holidays help
• Regularity of life, work and free time activities

Event characteristics – respondent independent or partially independent
• Event recency
• Temporal schemata
• Theme
• Self–events or other–events

Event characteristics – respondent dependent or proxy dependent
• Date reconstructed / Date known (Int., R)
• Importance (R, P)
• Vividness/details (R)
• Uniqueness (R)
• Sharing an event with other people (R)
• Confidence in date estimates (R)
• Expected accuracy of date estimates (P)
• Association of a public event with a personal event (R)

Data collection
• Semi-structured interview with (CAL) or without calendar instrument (Non-CAL)

Classification of events according to a temporal schema

An event classification was created based on the events’ description and content. For some events the description gives enough information on whether there is a temporal schema or not. For example “swimming in the Baltic Sea” has a season schema, as it must have happened in summer (unless the respondent is an extremely tough fellow). To confirm the availability of the schema when it was not obvious, recordings of the interviews were used. The availability of the schema does not imply that the schema has to be correct. Sometimes respondents used schematic information which was wrong.

Year schema. A year schema helps in estimating the exact month. It is similar to a season schema, which is not as exact but indicates the season in which such events typically happen. Events with a multi-year schema were not frequent and thus I do not analyze them.

Design and procedure

The study has a correlational design, seeking the predictors of dating accuracy. The method of data collection, also a predictor in this study, was experimentally manipulated. Respondents were randomly assigned to the CAL or Non-CAL condition (with the limitation that women were equally divided over the CAL and Non-CAL conditions). Randomization was done in an Excel sheet with the help of random numbers. Respondents were not aware that there were two experimental conditions. Respondents in the Non-CAL condition estimated the date without any calendar or calendar instrument and respondents in the CAL condition used the calendar instrument (see chapter 4 for more details). The procedure was similar to that in Study III and is described in detail in chapter 4 (procedure).
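The constrained random assignment described in this section can be illustrated with a minimal sketch. It is not the original Excel procedure: the function name, the seed, the respondent identifiers and the rule of alternating conditions within each gender are all assumptions made for illustration, and the sketch balances both genders although the text only states the constraint for women.

```python
import random

def assign_conditions(respondents, seed=2011):
    """Randomly assign respondents to CAL / Non-CAL, balancing within gender.

    `respondents` is a list of (respondent_id, gender) tuples. The seed and
    the alternating split are illustrative assumptions, not the original
    Excel procedure.
    """
    rng = random.Random(seed)
    assignment = {}
    for gender in sorted({g for _, g in respondents}):
        ids = [rid for rid, g in respondents if g == gender]
        rng.shuffle(ids)  # the random order plays the role of the random numbers
        for position, rid in enumerate(ids):
            # Alternate conditions so each gender is split as evenly as possible.
            assignment[rid] = "CAL" if position % 2 == 0 else "Non-CAL"
    return assignment

# Hypothetical respondents: 4 women and 4 men.
sample = [(f"W{i}", "woman") for i in range(4)] + [(f"M{i}", "man") for i in range(4)]
result = assign_conditions(sample)
```

Shuffling within gender before splitting keeps the assignment random for each person while preventing an unbalanced split by chance.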

6.3 Results

Descriptives of the collected data

Interviews were conducted between January 2011 and November 2011. The intention was to spread all interviews evenly over about four months, but many proxies were not able to collect the events soon enough, and because some couples dropped out of the study I had to contact other couples (another wave of recruitment). The total number of conducted interviews was 40 (see table 6.2). No difference in the mean interview date was found between women and men, between the conditions, or when both gender and condition were included in the same model, F(3) = 0.74, p = .54.

Table 6.2 Number of respondents in experimental and control condition

| Condition | Non-CAL | CAL | Total |
|---|---|---|---|
| Men | 10 | 12 | 22 |
| Women | 11 | 7 | 18 |
| Total | 21 | 19 | 40 |

The number of men and women is unbalanced, as the total number of men is slightly higher than the total number of women. This was caused by the fact that female proxies were more willing to cooperate in this diary study than male proxies. The gender proportion is especially problematic in the CAL condition, where only seven women remained (versus 12 men) (see table 6.2). Women dated significantly fewer events than men, χ²(1, N = 853) = 22.91, p < .001. This is caused by the fact that there are four fewer women than men (18 versus 22). For most analyses this is not a problem because of the large size of the events sample, but it limits the gender generalizations from the comparison between the two experimental conditions.

Events in the Non-CAL condition were similarly recent as events in the CAL condition, t(851) = 0.77, p = .44 (see table 6.3 and especially 6.4). Small significant differences were found in the mean recency of events for men and women in different conditions. The biggest difference was 2.2 months (men in the Non-CAL condition having older events than women in the Non-CAL condition). However, such a small difference in event recency cannot cause any substantive differences in dating accuracy. The order in which target events were presented in the interview (all events were randomly ordered) is not related to dating accuracy, Kendall’s tau-b (τ = -.002, p = .94, n = 853).

Table 6.3 Number of personal events that were dated in CAL and Non-CAL conditions (all events)

| Condition | Non-CAL | CAL | Total |
|---|---|---|---|
| Men | 252 | 315 | 567 |
| Women | 272 | 179 | 451 |
| Total | 524 | 494 | 1018 |

Table 6.4 Number of personal events that were dated in CAL and Non-CAL conditions (events from 2006 and 2007 only)

| Condition | Non-CAL | CAL | Total |
|---|---|---|---|
| Men | 207 | 267 | 474 |
| Women | 228 | 151 | 379 |
| Total | 435 | 418 | 853 |

General patterns of the dating errors of personal events

In this section I will explore the general pattern of dating error in month units (month and year together), month units (only month) and year units (only year). This is done for the whole sample of personal events (including events from 2005 and 2008). The reason is that the general patterns and also the best fit functions (in all three figures) are almost identical (only the dating error in months in the first figure is smaller, because the two biggest errors, 35 months and 36 months, are connected to events from the boundary years). The similarity between the general pattern of all events and events from 2006 and 2007 is caused by the fact that there were only two events from 2005 and two from 2008.

Figure 6.1 shows the overall dating error in month units. The figure clearly shows the “bump” of 12 ± 1 month error, in other words the error where respondents knew the month but estimated the year with a one-year error (181 cases had a dating error of 11 to 13 months; 17.8%). There is also a small “bump” close to the 24 month error (18 cases have a dating error of 23 to 25 months).

[Figure 6.1: relative frequency of dating errors (y-axis, 0–40) plotted against dating error in months (x-axis, 0–36); fitted curve $y = -5.863\ln(x) + 18.443$, $R^2 = .55$]

Figure 6.1. The best fit function for all personal events is a logarithmic function (N = 1018)

The year of most events is correctly estimated (81.6%, n = 831). One year error occurred in 16.8% of cases (n = 171) and two years error in 1.5% of cases (n = 15). Three years error appeared among the personal events only once and is excluded from figure 6.2.


[Figure 6.2: relative frequency of dating errors (y-axis, 0–90) plotted against dating error in years (x-axis, 0–2)]

Figure 6.2 The best fit function could be a linear, exponential or power function. However, because the single error of 3 years was trimmed, fitting a function to only three points may be very biased (n = 1017). Both the exponential and the linear function explain more than 88%.

Month error shows the decreasing trend of dating accuracy that is best fitted by the power or exponential functions (see figure 6.3). 47.1% of events have a correct month estimate and 76.9% (n = 783) have no error or an error of one month. This clearly shows that people store some temporal information (e.g., temporal schemata – see below) that helps them in estimating the month.

[Figure 6.3: relative frequency of month dating errors (y-axis, 0–50) plotted against month dating error, month only (x-axis, 0–6); fitted curve $y = 39.053e^{-0.551x}$, $R^2 = .94$]

Figure 6.3 The best fit functions are the exponential function (shown in the figure) or the power function ($y = 62.722x^{-1.746}$; $R^2 = .95$)
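How an exponential curve of the form y = a·e^(bx) can be fitted is sketched below. The text does not say which fitting procedure was used, so least squares on ln(y) is an assumption here, and the input points are synthetic values generated from the reported curve, not the observed frequencies.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y)."""
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx                        # slope of ln(y) on x
    a = math.exp(mean_ly - b * mean_x)   # back-transformed intercept
    return a, b

# Synthetic points generated from the curve reported in figure 6.3
# (not the observed frequencies, which are not reproduced here).
xs = list(range(7))  # month errors 0..6
ys = [39.053 * math.exp(-0.551 * x) for x in xs]
a, b = fit_exponential(xs, ys)
```

Because the synthetic points lie exactly on the curve, the fit recovers the reported coefficients; with real frequencies the residuals would determine R².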

In the following sections I will mostly focus on dating error in month units (month and year together; “dating error”) and on dating error in month units (month only; “month error”). The year error will be presented in the tables as a 12 month error. In the majority of analyses only data from 2006 and 2007 will be used because of the potential bias of the boundary effect.
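The three error units can be made concrete with a small sketch. It works at month granularity and is only an illustration: the function name is hypothetical, and the wrap-around rule for the month-only error (so that December versus January counts as one month) is an assumption inferred from the 0–6 range of the month error in figure 6.3.

```python
def dating_errors(true_year, true_month, est_year, est_month):
    """Return (overall error in months, month-only error, year-only error)."""
    # Month and year together: distance on a continuous month timeline.
    overall = abs((est_year - true_year) * 12 + (est_month - true_month))
    # Month only: ignore the year; wrap around so December vs January is 1.
    # The wrap-around (maximum of 6) is an assumption based on the 0-6 axis
    # of figure 6.3.
    diff = abs(est_month - true_month)
    month_only = min(diff, 12 - diff)
    # Year only: difference between calendar years.
    year_only = abs(est_year - true_year)
    return overall, month_only, year_only

# True date March 2006, estimate April 2007: a "12 +/- 1 month" error.
example = dating_errors(2006, 3, 2007, 4)  # (13, 1, 1)
```

An estimate like this one falls into the 12 ± 1 month “bump”: the month is nearly right while the year is off by one.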

Impact of event recency on dating accuracy

Before moving to other predictors the impact of event recency has to be explored because it could be a covariate in some of the subsequent analyses. No relationship was found between event recency and dating error in days of events from 2006 and 2007, Kendall’s tau-b (τ = .004, p = .87, n = 853). When all personal events were included the outcome was also non-significant (τ = .02, p = .39, n = 1018).

When measured separately for events where the date was known or reconstructed, and events from 2005 and 2008 were included, a weak relationship between event recency and dating error among “known” events was found (τ = .13, p = .04, n = 164; when only events from 2006 and 2007 were included the relationship became non-significant) and no relationship for “reconstructed” events remained (τ = .03, p = .45, n = 853).

Dating error of other–events or self–events is not significantly related to event recency either. Only for reconstructed other–events where the date is known, a significant positive relationship between recency and accuracy is found (τ = .47, p = .02, n = 17). But this sample is very small (n = 17) and the relationship changes to non-significant when 2005 and 2008 events are excluded.

When all events are classified into 4 groups according to the four years in which they happened, the Kruskal-Wallis test of variance indicates a significant effect of event year96 on dating accuracy (in months), H(3, N = 1018) = 8.84, p = .03, partial η² = .009, mean ranks are 478, 519, 521, 428.

In summary, event recency does not seem to be related to dating accuracy when only events from 2006 and 2007 are selected. When events from boundary years 2005 and 2008 are included as well a weak correlation between event recency and dating error (month and year) was found, but only for events where the date was known.

96 Difference between four groups: 2005, 2006, 2007 and 2008.

Comparison of dating accuracy in CAL and Non-CAL condition

As mentioned in the hypotheses section, I did not state any hypothesis about the impact of the calendar instrument on dating accuracy because there is a lack of evidence and arguments that would allow me to make such a statement.

The mean rank of dating error of events in the Non-CAL condition is 435 while it is 418 in the CAL condition. This increase in dating accuracy is not significant, z(853) = -0.96, p = .34, r = .03. When analyzed separately for women and men, no difference between the accuracy in the two conditions was found for men, while a significant difference was found for women; women: z(379) = -3.17, p = .002, r = .16 (mean ranks 204 versus 169); men: z(474) = -0.49, p = .62, r = .02 (mean ranks 234 versus 240). When the analysis was restricted to events where the date was reconstructed, the difference remained significant for women, z(304) = -2.19, p = .03, r = .13 (mean ranks 160 versus 138), and no difference was found for men. But because there were only seven women in the CAL condition the generalization of this result is very limited. Women with better memory for dates could by chance have been assigned to the CAL condition more often than to the Non-CAL condition.

No relationship between the frequency of month error and the two conditions was found, χ²(6, N = 853) = 4.65, p = .708, V = .07. No relationship was found either when gender or date reconstructed / known were taken into account.

Further analyses found that 32.1% (n = 34) of self-generated landmarks whose date could be validated (because the landmark appeared among the target events) were actually biased by at least one month. The dating error was mostly 1 month (n = 16)97. The biased landmarks could partially explain the lack of a difference between the conditions.

Respondents in the CAL condition reported that the public holidays (such as Christmas, Labour Day etc.) pre-recorded in the calendar instrument did not help them at all as recall aids. They would either use them spontaneously anyway (such as Christmas or Easter) or not use them at all. Respondents were instructed to talk aloud when estimating the date. Hardly any of these “talk-alouds” indicated the use of the pre-recorded public holidays.

97 The other errors were 4 × 2 month error; 2 × 3 month error; 3 × 11 month error; 7 × 12 month error, and 2 × 13 month error.

Explanation how dating error in various units will be presented

Because the dating error in days is not normally distributed I will mostly use the Mann-Whitney U test or the Kruskal-Wallis test for comparisons between two or more groups. Both tests use rank error. I will sometimes also refer to the mean rank error: for each respondent separately the rank error of all date estimates was computed and then the overall mean of the ranked errors was computed. For the month error (or the categorized error of month and year together) I will use the chi-square test, which tests differences among the frequencies.

Table 6.5 shows how dating error (month and year together) will be presented. Frequencies per column are cumulative. Thus the ≤ 12 month error row is the cumulative frequency from “no error” up to and including a 12 month error. The chi-square test is conducted separately for each row.

Example table 6.5 Dating errors in months (year and month) for self–events and other–events

| Dating error | Self-events | Other-events | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 267 (36.4) | 31 (26.1) | 298 (34.9) | 4.80* | .08 |
| Error ≤ 1 month | 417 (56.8) | 59 (49.6) | 476 (55.8) | 2.20× | .05 |
| Error ≤ 12 months | 666 (90.7) | 106 (89.1) | 772 (90.5) | 0.33×× | .02 |
| Total (row %) | 734 (86.0) | 119 (14.0) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is computed for each row (df = 1). *p < .05; ×p = .14; ××p = .57. Frequencies per column are cumulative.
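The row-wise test in this type of table can be illustrated with a minimal sketch. For each row, the two groups are cross-classified by whether the error is within the row’s threshold or not, which gives a 2 × 2 table with df = 1. The function name is hypothetical, and the use of the Pearson statistic without continuity correction is an assumption; it approximately reproduces the value reported for the “no error” row.

```python
def chi_square_2x2(hits_a, n_a, hits_b, n_b):
    """Pearson chi-square (df = 1, no continuity correction) for a 2 x 2 table
    built from two groups and a within-threshold / above-threshold outcome."""
    a, b = hits_a, n_a - hits_a   # group A: within threshold / above it
    c, d = hits_b, n_b - hits_b   # group B: within threshold / above it
    n = n_a + n_b
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# "No error" row: 267 of 734 self-events and 31 of 119 other-events.
chi2_no_error = chi_square_2x2(267, 734, 31, 119)
```

Because the rows are cumulative, each row answers a separate question (exactly right, within one month, within twelve months) rather than partitioning the events.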

Table 6.6 shows the second type of table, which is used for month error (month only). Frequencies per column are not cumulative and the chi-square test is the overall test of all month errors. Errors of three and more months are trimmed from the tables but the chi-square test applies to all month errors. The reason for trimming these errors is their very small frequency (this is visible in figure 6.3).


Example table 6.6 Month dating errors (month only) for self–events and other–events

| Dating error | Self-events | Other-events | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 351 (47.8) | 38 (31.9) | 389 (45.6) | 19.07** | .15 |
| 1 month | 220 (30.0) | 42 (35.3) | 262 (30.7) | | |
| 2 months | 68 (9.3) | 10 (8.4) | 78 (9.1) | | |
| Total (row %) | 734 (86.0) | 119 (14.0) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is an overall test (df = 6); **p < .01. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Event characteristics (respondent independent or partially independent)

Temporal schemata

H-1: Events that have temporal schema are dated more accurately than events without this schema.

When events have a year or season temporal schema their month estimate (month only) is very often correct (see table 6.9). As expected, a season schema leads to fewer correct month estimates than a year schema (58% versus 82%). 99% of events with a year schema have an error of at most one month (including correct estimates); for events with only a season schema this share is 9 percentage points lower. Multi–year temporal schemata were hardly ever found among the target events and thus no analysis is provided.

Table 6.9 Availability of temporal schemata and frequency of dating errors (n = 853)

| D. error | Year schema (months)1: Yes | Year schema (months)1: No | Season schema (season)2: Yes | Season schema (season)2: No |
|---|---|---|---|---|
| No error | 106 (81.5) | 283 (39.1) | 205 (57.7) | 184 (31.7) |
| 1 month | 23 (17.7) | 239 (33.1) | 114 (31.9) | 148 (29.8) |
| 2 month | 0 (0) | 78 (10.8) | 25 (7.0) | 53 (10.7) |
| 3 month | 0 (0) | 44 (6.1) | 4 (1.1) | 40 (8.9) |
| Total | 130 (15.2) | 723 (98.1) | 357 (41.9) | 496 (69.4) |

Note. Units of dating error are months (month only). 1 = χ²(6, N = 853) = 86.29, p < .001, V = .32; 2 = χ²(6, N = 853) = 72.83, p < .001, V = .29.
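The V values reported with the overall chi-square tests are consistent with the usual definition of Cramér’s V, as the following sketch shows. The table layout of 7 error categories (0 to 6 months) by 2 groups is an assumption inferred from the reported df = 6, and the function name is hypothetical.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Assumed layout: 7 error categories (0-6 months) x 2 groups (schema yes/no),
# which matches the reported df = 6.
v_year = cramers_v(86.29, 853, 7, 2)
v_season = cramers_v(72.83, 853, 7, 2)
```

With only two groups the denominator reduces to N, so V here is simply the square root of χ²/N.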


Self–events versus other–events

H-2: Self–events are dated more accurately than other–events.

The Mann-Whitney U test revealed no significant difference between the mean ranks of self–events (M = 421, n = 734) and other–events (M = 458, n = 119), z(853) = -1.54, p = .13. When events from boundary years were included the difference became marginally significant: the mean rank of self–events was 502 (n = 857) and that of other–events 551 (n = 143), z(1018) = -1.90, p = .057.

Tables 6.10 and 6.11 show the percentage of errors for both types of events. Self–events are on average dated more accurately than other–events. For the dating error in months (month and year) the difference is significant only for the frequencies of no error. However, a generalization should be made with caution because the percentage of self–events (86%) is much higher than the percentage of other–events (14%). The reason for this difference is that personal events were stressed in the proxy instruction.

Table 6.10 Dating errors in months (year and month) for self–events and other–events

| Dating error | Self-events | Other-events | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 267 (36.4) | 31 (26.1) | 298 (34.9) | 4.80* | .08 |
| Error ≤ 1 month | 417 (56.8) | 59 (49.6) | 476 (55.8) | 2.20× | .05 |
| Error ≤ 12 months | 666 (90.7) | 106 (89.1) | 772 (90.5) | 0.33×× | .02 |
| Total (row %) | 734 (86.0) | 119 (14.0) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is computed for each row (df = 1). *p < .05; ×p = .14; ××p = .57. Frequencies per column are cumulative.


Table 6.11 Month dating errors (month only) for self–events and other–events

| Dating error | Self-events | Other-events | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 351 (47.8) | 38 (31.9) | 389 (45.6) | 19.07** | .15 |
| 1 month | 220 (30.0) | 42 (35.3) | 262 (30.7) | | |
| 2 months | 68 (9.3) | 10 (8.4) | 78 (9.1) | | |
| Total (row %) | 734 (86.0) | 119 (14.0) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is an overall test (df = 6); **p < .01. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Event characteristics (respondent dependent or proxy dependent)

Knowing the date versus reconstructing the date

H-3: Events with a known date are dated more accurately than events with a reconstructed date.

I have already made some analyses of “date known” and “date reconstructed” in previous sections when examining the effect of the calendar instrument. The Mann-Whitney U test revealed that the mean rank error of events with a reconstructed date is much higher than that of events where the date was known: the mean rank error was 452 for the “reconstructed” group (n = 719) and 298 for the “known” group (n = 134), z(853) = -8.32, p < .001, r = .28.

As shown in table 6.12 the percentage of “no errors” is very high when the date was known (66.4%) and much lower when the date was reconstructed (29.1%). The month error in table 6.13 shows a similar relationship, only the month is generally known more often than the correct date (month and year together).
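The effect sizes r reported alongside the Mann-Whitney z values are consistent with the common conversion r = |z| / √N. The text does not state the formula, so the sketch below is an assumption that happens to reproduce the reported values; the function name is hypothetical.

```python
import math

def effect_size_r(z, n):
    """Effect size r = |z| / sqrt(n) for a Mann-Whitney U test."""
    return abs(z) / math.sqrt(n)

# Known versus reconstructed dates (all 853 events from 2006 and 2007).
r_known_vs_reconstructed = effect_size_r(-8.32, 853)
# CAL versus Non-CAL condition for women (379 events).
r_women_cal = effect_size_r(-3.17, 379)
```

Expressing the difference as r makes comparisons across subsamples of different size easier than comparing z values directly.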


Table 6.12 Dating errors in months (year and month) of events with reconstructed date or known date

| Dating error | Known | Reconstructed | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 89 (66.4) | 209 (29.1) | 298 (34.9) | 69.32*** | .29 |
| Error ≤ 1 month | 118 (88.1) | 358 (49.8) | 476 (55.8) | 67.07*** | .28 |
| Error ≤ 12 months | 131 (97.8) | 641 (89.2) | 772 (90.5) | 9.74** | .11 |
| Total (row %) | 134 (15.7) | 719 (84.3) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is computed for each row (df = 1). ***p < .001; **p < .01. Frequencies per column are cumulative.

Table 6.13 Month dating errors (month only) of events with reconstructed date or known date

| Dating error | Known | Reconstructed | Total (%) | χ² | V |
|---|---|---|---|---|---|
| No error | 95 (70.9) | 294 (40.9) | 389 (45.6) | 51.98*** | .25 |
| 1 month | 34 (25.4) | 228 (31.7) | 262 (30.7) | | |
| 2 months | 5 (3.7) | 73 (10.2) | 78 (9.1) | | |
| Total (row %) | 134 (15.7) | 719 (84.3) | 853 (100) | | |

Note. Cells show date estimates as count (column %). χ² is an overall test (df = 6); ***p < .001. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Importance, vividness/details, uniqueness, sharing

As shown in table 6.14, importance, vividness/details, uniqueness and sharing are moderately or weakly correlated. This is not a surprise because, as was shown in chapter 3.2, many important events are shared, vivid and unique at the same time.


Table 6.14 Correlations between phenomenological characteristics

| Phenomenological characteristics | Importance | Importance – proxy | Vividness/Details | Uniqueness | Sharing |
|---|---|---|---|---|---|
| Importance | - | .45 | .36 | .56 | .40 |
| Importance – proxy | .45 | - | .27 | .39 | .27 |
| Vividness/Details | .36 | .27 | - | .51 | .39 |
| Uniqueness | .56 | .39 | .51 | - | .42 |
| Sharing | .40 | .27 | .39 | .42 | - |

Note. All Kendall’s tau-b correlations are significant at the 1% level or better. n = 853 for all comparisons except those with sharing; the sharing variable was included later on (n = 267).

H-4: Event importance is positively related to dating accuracy.
Events rated as highly important were dated on average 1.6 days better than events rated as of “average, usual” importance and 3.2 days better than events rated as “not really” important. The Kruskal-Wallis test indicated a significant effect of event importance on dating accuracy (in months), H(2, N = 853) = 39.90, p < .001, partial η² = .047; the mean ranks are 353, 431, and 486. The frequencies of errors show that the biggest differences are between the two extreme alternatives (see table 6.15). Most events (38.8%) were rated as having “average, usual” importance, followed by the “not really” alternative (32.6%) and the “yes, a lot” alternative (28.6%). As shown in table 6.16, month error (month only) was also less frequent for highly important events than for less important events.
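The text does not say how the effect size accompanying the Kruskal-Wallis test was obtained. One common conversion, η² = H / (N − 1), reproduces the value reported for importance and the value reported for sharing; the sketch below is offered under that assumption only, and it does not reproduce every η² reported in this chapter.

```python
def eta_squared_from_h(h: float, n: int) -> float:
    """Eta squared from a Kruskal-Wallis H statistic, assuming eta^2 = H / (N - 1)."""
    return h / (n - 1)


eta_importance = eta_squared_from_h(39.90, 853)
eta_sharing = eta_squared_from_h(2.58, 267)
print(round(eta_importance, 3), round(eta_sharing, 2))
```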

Table 6.15 Relationship between event importance and frequency of dating errors in months (year and month)

Event is important: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 117 (48.0) | 106 (32.0) | 75 (27.0) | 298 (34.9) | 27.16*** | .18 |
| Error ≤ 1 month | 169 (69.3) | 187 (56.5) | 120 (43.2) | 476 (55.8) | 35.99*** | .21 |
| Error ≤ 12 months | 234 (95.9) | 301 (90.9) | 237 (85.3) | 772 (90.5) | 17.27*** | .14 |
| Total (row %) | 244 (28.6) | 331 (38.8) | 278 (32.6) | 853 (100) | | |

Note. χ² is computed for each row (df = 2). ***p < .001. Frequencies per column are cumulative.


Table 6.16 Relationship between event importance and frequency of month dating errors (month only)

Event is important: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 138 (56.6) | 141 (42.6) | 110 (39.6) | 389 (45.6) | 30.14** | .19 |
| 1 month | 68 (27.9) | 109 (32.9) | 85 (30.6) | 262 (30.7) | | |
| 2 months | 16 (6.6) | 29 (8.8) | 33 (11.9) | 78 (9.1) | | |
| Total (row %) | 244 (28.6) | 331 (38.8) | 278 (32.6) | 853 (100) | | |

Note. χ² is an overall test (df = 12); **p < .01. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Table 6.17 describes the relationship between expected event importance, as rated by proxies, and different amounts of dating error. Overall, 34.9% of all events were dated correctly. Proxies rate more events as “not really important” for their partners (40.1%) than their partners (the respondents) do themselves (32.6%, table 6.15)98. The relationship with dating errors (see tables 6.17 and 6.18) is similar to that found when importance was rated by respondents, though slightly weaker.

Table 6.17 Relationship between expected event importance rated by proxy and frequency of dating errors in months (month and year together)

Expected importance: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 99 (44.2) | 106 (36.9) | 93 (27.2) | 298 (34.9) | 17.96*** | .15 |
| Error ≤ 1 month | 151 (67.4) | 167 (58.2) | 158 (46.2) | 476 (55.8) | 25.69*** | .17 |
| Error ≤ 12 months | 212 (94.6) | 263 (91.6) | 297 (86.8) | 772 (90.5) | 10.23** | .11 |
| Total (row %) | 224 (26.3) | 287 (33.6) | 342 (40.1) | 853 (100) | | |

Note. χ² is computed for each row (df = 2). ***p < .001; **p < .01. Frequencies per column are cumulative.

98 Kappa coefficient between the ratings of respondents and proxies was .29. The frequency of totally opposite ratings was however small (e.g., “yes, a lot” versus “not really”).

Table 6.18 Relationship between event importance (rated by proxy) and frequency of month dating errors (month only)

Expected importance: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 121 (54.0) | 134 (46.7) | 134 (39.2) | 389 (45.6) | 21.97* | .12 |
| 1 month | 66 (29.5) | 87 (30.3) | 109 (31.9) | 262 (30.7) | | |
| 2 months | 13 (5.8) | 29 (10.1) | 36 (10.5) | 78 (9.1) | | |
| Total (row %) | 224 (26.3) | 287 (33.6) | 342 (40.1) | 853 (100) | | |

Note. χ² is an overall test (df = 12); *p < .05. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

The Kruskal-Wallis test did not indicate a significant effect of event sharing on dating accuracy in months (month and year), H(2, N = 267) = 2.58, p = .28, partial η² = .01; the mean ranks are 146, 124, and 137. Table 6.19 shows that the relationship between event sharing and dating error in months is not as clear as for importance (see above). Only the “no sharing” category has a lower frequency of accurate estimates than the other two categories. The overall test of month error (month only) in table 6.20 also shows no significant relationship. However, when only the frequencies of no month error versus any month error are compared, the relationship is significant, χ²(2, N = 267) = 10.82, p = .004, V = .20.

Table 6.19 Relationship between event sharing and frequency of dating errors in months (month and year)

Sharing the event: count (column %)

| Dating error | Many times | Several times (1-3) | No | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 12 (37.5) | 33 (40.7) | 44 (28.6) | 89 (33.3) | 3.82× | .15 |
| Error ≤ 1 month | 15 (46.9) | 50 (61.7) | 83 (53.9) | 148 (55.4) | 2.40× | .01 |
| Error ≤ 12 months | 30 (93.8) | 75 (92.6) | 140 (90.9) | 245 (91.8) | 0.39× | .04 |
| Total (row %) | 32 (12.0) | 81 (30.3) | 154 (57.7) | 267 (100) | | |

Note. χ² is computed for each row (df = 2). × = not significant. Frequencies per column are cumulative.


Table 6.20 Relationship between event sharing and frequency of month errors (month only)

Sharing the event: count (column %)

| Dating error | Many times | Several times (1-3) | No | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 22 (68.8) | 42 (51.9) | 60 (39.0) | 124 (46.4) | 17.79× | .18 |
| 1 month | 7 (21.9) | 22 (27.2) | 50 (32.5) | 79 (29.6) | | |
| 2 months | 1 (3.1) | 9 (11.1) | 16 (10.4) | 26 (9.7) | | |
| Total (row %) | 32 (12.0) | 81 (30.3) | 154 (57.7) | 267 (100) | | |

Note. χ² is an overall test (df = 12); × = not significant. Errors of three and more months are trimmed but the chi-square test applies to all month errors.
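The comparison of no month error versus any month error across the three sharing categories can be recomputed from the first row and the column totals of table 6.20. The sketch below assumes a Pearson chi-square and the usual definition of Cramer’s V; the helper names are illustrative.

```python
import math


def chi_square(table: list[list[int]]) -> tuple[float, int]:
    """Pearson chi-square and total N for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


def cramers_v(statistic: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramer's V = sqrt(chi2 / (N * (min(rows, cols) - 1)))."""
    return math.sqrt(statistic / (n * (min(n_rows, n_cols) - 1)))


# Columns: shared many times, several times (1-3), not shared
no_month_error = [22, 42, 60]
column_totals = [32, 81, 154]
any_month_error = [t - k for t, k in zip(column_totals, no_month_error)]

chi2_sharing, n_sharing = chi_square([no_month_error, any_month_error])
v_sharing = cramers_v(chi2_sharing, n_sharing, 2, 3)
print(round(chi2_sharing, 2), round(v_sharing, 2))
```

Under these assumptions the collapsed table gives χ² of about 10.82 with df = 2, and V of about .20.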

H-5: Event uniqueness is positively related to dating accuracy.
Table 6.21 describes the relationship between event uniqueness and different amounts of dating error in months. Overall, 34.9% of all events were dated correctly. The relationship with dating errors (see tables 6.21 and 6.22) is slightly weaker than for importance rated by respondents but similarly robust. Only the “up to 12 months” error does not differ significantly between the three levels of event uniqueness.

Table 6.21 Relationship between event uniqueness and frequency of dating errors in months (year and month)

Event is unique: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 156 (43.7) | 86 (29.3) | 56 (27.7) | 298 (34.9) | 20.86*** | .16 |
| Error ≤ 1 month | 230 (64.4) | 158 (53.7) | 88 (43.6) | 476 (55.8) | 23.54*** | .17 |
| Error ≤ 12 months | 332 (93.0) | 263 (89.5) | 177 (87.6) | 772 (90.5) | 4.91× | .08 |
| Total (row %) | 357 (41.9) | 294 (34.5) | 202 (23.7) | 853 (100) | | |

Note. χ² is computed for each row (df = 2). ***p < .001; ×p = .09. Frequencies per column are cumulative.


Table 6.22 Relationship between event uniqueness and frequency of month dating errors (month only)

Event is unique: count (column %)

| Dating error | Yes, a lot | Average, usual | Not really | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- |
| No error | 192 (53.8) | 116 (39.5) | 81 (40.1) | 389 (45.6) | 50.52*** | .17 |
| 1 month | 103 (28.9) | 108 (36.7) | 51 (25.2) | 262 (30.7) | | |
| 2 months | 25 (7.0) | 23 (7.8) | 30 (14.9) | 78 (9.1) | | |
| Total (row %) | 357 (41.9) | 294 (34.5) | 202 (23.7) | 853 (100) | | |

Note. χ² is an overall test (df = 12); ***p < .001. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Confidence in date estimates

H-6: Confidence in month estimates is positively related to dating accuracy in months (month only).
Confidence was measured separately for the month estimate and the year estimate. As shown in table 6.23, when respondents were fully confident in their month estimate (rating 0 = no month error expected; 1 = up to one month error expected), they were right in 74.6% of the cases. This effect is relatively strong in comparison to other predictors (V = .27). When a chi-square test is calculated for two categories only (no error versus any month error), the effect is much stronger, χ²(4, N = 853) = 167.62, p < .001, V = .44. The expected accuracy of the month estimate as rated by proxies (see table 6.24) shows that proxies also know a lot about the events, because their ratings predict accuracy only slightly worse than those of respondents.

Table 6.23 Relationship between confidence in month estimate and frequency of month errors (month only)

Confidence in correct month: count (column %)

| Dating error | 0 | 1 | 2 | 3 | 4 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 203 (74.6) | 127 (43.3) | 35 (23.8) | 10 (16.1) | 14 (17.7) | 389 (45.6) | 253.43*** | .27 |
| 1 month | 54 (19.9) | 108 (36.9) | 53 (36.1) | 24 (38.7) | 23 (29.1) | 262 (30.7) | | |
| 2 months | 6 (2.2) | 34 (11.6) | 22 (15.0) | 10 (16.1) | 6 (7.6) | 78 (9.1) | | |
| Total (row %) | 272 (31.9) | 293 (34.3) | 147 (17.2) | 62 (7.3) | 79 (9.3) | 853 (100) | | |

Note. χ² is an overall test (df = 24); ***p < .001. Errors of three and more months are trimmed but the chi-square test applies to all month errors.
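The column percentages in the “No error” row of table 6.23 can be read as a simple calibration check: the share of correct month estimates at each confidence rating. A minimal sketch using the tabled counts (the variable and function names are illustrative):

```python
# Counts from table 6.23, keyed by confidence rating
# (0 = no month error expected ... 4 = lowest confidence)
no_error_counts = {0: 203, 1: 127, 2: 35, 3: 10, 4: 14}
events_per_rating = {0: 272, 1: 293, 2: 147, 3: 62, 4: 79}


def hit_rates(correct: dict[int, int], totals: dict[int, int]) -> dict[int, float]:
    """Percentage of correctly dated months at each confidence rating."""
    return {rating: 100 * correct[rating] / totals[rating] for rating in totals}


rates = hit_rates(no_error_counts, events_per_rating)
for rating, rate in rates.items():
    print(rating, round(rate, 1))
```

The hit rate drops steeply from rating 0 to rating 2 and then levels off, which is the pattern the effect size summarizes.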


Table 6.24 Relationship between expected accuracy of month estimate (rated by proxy) and frequency of month errors (month only)

Expected accuracy of month: count (column %)

| Dating error | 0 | 1 | 2 | 3 | 4 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 157 (66.8) | 110 (44.2) | 74 (38.3) | 23 (27.4) | 25 (27.2) | 389 (45.6) | 116.47*** | .19 |
| 1 month | 59 (25.1) | 83 (33.3) | 60 (31.1) | 35 (41.7) | 25 (27.2) | 262 (30.7) | | |
| 2 months | 11 (4.7) | 23 (9.2) | 23 (11.9) | 6 (7.1) | 15 (16.3) | 78 (9.1) | | |
| Total (row %) | 235 (27.5) | 249 (29.2) | 193 (22.6) | 84 (9.8) | 92 (10.8) | 853 (100) | | |

Note. χ² is an overall test (df = 24); ***p < .001. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

H-7: Confidence in year estimates is positively related to dating accuracy in years (year only).
When respondents were confident that their year estimate was correct (rating 0 = no year error expected; 1 = up to one year error expected), they were right in 88.4% of the cases (see table 6.25). This effect is also relatively strong in comparison to other predictors (V = .23). As the table shows, hardly any errors of two years appeared when respondents were confident, though the frequency of one-year errors was relatively high (11.2%). The expected year accuracy rated by proxies (see table 6.26) is almost identical, suggesting that proxies may be similarly good informants about high (or low) dating difficulty as the respondents themselves.

Table 6.25 Relationship between confidence in year estimate and frequency of year errors (year only)

Confidence in correct year: count (column %)

| Dating error | 0 | 1 | 2 | 3 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 467 (88.4) | 212 (69.5) | 9 (52.9) | 0 (0) | 688 (80.7) | 43.36*** | .23 |
| 1 year | 59 (11.2) | 86 (28.2) | 5 (29.4) | 2 (66.7) | 152 (17.8) | | |
| 2 years | 2 (0.4) | 7 (2.3) | 3 (17.6) | 1 (33.3) | 13 (1.5) | | |
| Total (row %) | 528 (61.9) | 305 (35.8) | 17 (2.0) | 3 (0.4) | 853 (100) | | |

Note. χ² is an overall test (df = 6); ***p < .001. Error of 2 years and expected accuracy 3 not included in the model (too many cells had count lower than 5). The effect size remained the same.


Table 6.26 Relationship between expected accuracy of year estimate and frequency of year errors (year only)

Expected accuracy of year: count (column %)

| Dating error | 0 | 1 | 2 | 3 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 402 (87.8) | 212 (72.6) | 69 (71.1) | 5 (83.3) | 688 (80.7) | 33.09*** | .20 |
| 1 year | 50 (10.9) | 75 (25.7) | 26 (26.8) | 1 (16.7) | 152 (17.8) | | |
| 2 years | 6 (1.3) | 5 (1.7) | 2 (2.1) | 0 (0) | 13 (1.5) | | |
| Total (row %) | 458 (53.7) | 292 (34.2) | 97 (11.4) | 6 (0.7) | 853 (100) | | |

Note. χ² is an overall test (df = 6); ***p < .001. Error of 2 years and expected accuracy 3 not included in the model (too many cells had count lower than 5). The effect size remained the same.

Respondents were confident in both the month and the year estimate at the same time in 156 cases (see table 6.27), that is, 18.3% of the total number of events (n = 853). When events from the boundary years were included, the number of events for which people “swore under oath” increased to 197 (out of n = 1018 events), but their share remained similar (19.4%). When respondents swore under oath, 70.5% of the date estimates were correct. On the other hand, when respondents did not want to “swear under oath” they had a good reason: only 49.1% of those date estimates were correct. “Under oath confidence” thus seems to be a good predictor that helps to distinguish “very” confident date estimates from merely “confident” ones.

Table 6.27 Relationship between “under oath confidence” and dating accuracy (month and year)

Under oath confidence: count (column %)

| Dating error | Yes | No | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- |
| No error | 110 (70.5) | 27 (49.1) | 137 (64.9) | 8.20** | .20 |
| ≥ 1 month error | 46 (29.5) | 28 (50.9) | 74 (35.1) | | |
| Total (row %) | 156 (73.9) | 55 (26.1) | 211 (100) | | |

Note. χ² (df = 1). **p < .01.


Predictors of dating accuracy – respondent characteristics

As expected, no relationship was found between dating accuracy and the different levels of education, motivation, and tiredness. This reflects the lack of variability and the design of my study rather than evidence that no such relationship exists.

Self-evaluation of memory

H-9: Respondents’ self-evaluation of memory is positively related to dating accuracy.
The Kruskal-Wallis test indicated that the relationship between self-evaluation of memory and dating accuracy is not significant, H(2, N = 40) = 2.981, p = .23, partial η² = .007. The mean rank error of those who rated their memory as “rather better than others” was 15 (n = 7), which is lower than in the other two groups: those who rated it as “similar to others” had a mean rank error of 19 (n = 15) and those who rated it as “rather worse than others” had a mean rank error of 24 (n = 18).

Gender

H-8: Women are on average more accurate than men.
The Mann-Whitney U analysis revealed a relationship between mean rank error and gender. The mean rank error of women’s events (M = 387, n = 379) is much lower than the mean rank error of men’s events (M = 458, n = 474), z(853) = -4.27, p < .001, r = .15. Women are more accurate in their date estimates than men (see table 6.28). Analysis of month error also revealed significant gender differences (see table 6.29). Women estimate the month correctly more often than men and also make the smallest error of 1 month more often than men.

Table 6.28 Relationship between gender and frequency of dating errors in months (year and month)

Date estimate: count (column %)

| Dating error | Men | Women | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- |
| No error | 146 (30.8) | 152 (40.1) | 298 (34.9) | 8.02** | .10 |
| Error ≤ 1 month | 230 (48.5) | 246 (64.9) | 476 (55.8) | 22.92*** | .16 |
| Error ≤ 12 months | 421 (88.8) | 351 (92.6) | 772 (90.5) | 3.32× | .06 |
| Total (row %) | 474 (55.6) | 379 (44.4) | 853 (100) | | |

Note. χ² is computed for each row (df = 1). ***p < .001; **p < .01; ×p = .06. Frequencies per column are cumulative.

Table 6.29 Month dating errors (month only) for men and women

Date estimate: count (column %)

| Dating error | Men | Women | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- |
| No error | 207 (43.7) | 182 (48.0) | 389 (45.6) | 13.99* | .13 |
| 1 month | 137 (28.9) | 125 (33.0) | 262 (30.7) | | |
| 2 months | 57 (12.0) | 21 (5.5) | 78 (9.1) | | |
| Total (row %) | 474 (55.6) | 379 (44.4) | 853 (100) | | |

Note. χ² is an overall test (df = 6); *p < .05. Errors of three and more months are trimmed but the chi-square test applies to all month errors.

Difficulty of the task

H-10: Respondents’ evaluation of the overall difficulty of estimating the date is negatively related to dating accuracy.
After estimating the dates of all events, respondents rated the difficulty of the dating task (month and year together) on a 4-point scale (see table 6.30). This rating was related to dating accuracy (month and year together). The mean rank error was 28 when respondents rated the difficulty of estimating the date as “very hard” and only 15 when the difficulty was described as “rather easy” (see the table for all ranks). The Kruskal-Wallis test indicated that this relationship is significant, H(2, N = 40) = 7.54, p = .02, partial η² = .16.

Table 6.30 Frequency of the overall difficulty ratings after estimating the date of all target events

| Difficulty | Count (%) | Mean rank error |
| --- | --- | --- |
| very hard | 10 (25.0) | 27.8 |
| rather hard | 17 (42.5) | 19.3 |
| rather easy | 12 (30.0) | 14.5 |
| N | 40 (100) | |
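The Kruskal-Wallis statistic for overall difficulty can be approximated from table 6.30 alone, because H depends only on the group sizes and mean ranks when no tie correction is applied. The sketch below makes that assumption; note that the three groups in the table sum to 39 respondents, so the computation uses N = 39.

```python
def kruskal_wallis_h(group_sizes: list[int], mean_ranks: list[float]) -> float:
    """Kruskal-Wallis H from group sizes and mean ranks (no tie correction)."""
    n = sum(group_sizes)
    grand_mean_rank = (n + 1) / 2
    between = sum(size * (rank - grand_mean_rank) ** 2
                  for size, rank in zip(group_sizes, mean_ranks))
    return 12 / (n * (n + 1)) * between


# very hard, rather hard, rather easy (table 6.30)
h_difficulty = kruskal_wallis_h([10, 17, 12], [27.8, 19.3, 14.5])
print(round(h_difficulty, 2))
```

The result agrees with the reported H = 7.54 to two decimals, which suggests that the reported test was based on the 39 respondents shown in the three rows.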

Number of landmarks

H-11: Number of self-generated landmarks is positively related to dating accuracy.


Only a weak and non-significant relationship was found between the number of self-generated landmarks and mean rank dating error (τ = -.14, p = .44, n = 40) in the CAL condition. This relationship would probably be stronger and significant if respondents had been allowed to generate as many landmarks as they could (ceiling effect). Some respondents would have been able to generate many more landmarks, but they were told that the number they had generated was enough when they reached 15. Respondents were prompted to report more landmarks when they had generated fewer than 799.

Summary of the effect sizes of event and respondent characteristics

Table 6.31 provides the summary of the effect sizes. Cramer’s V is used for event characteristics and other effect size measures for respondent characteristics. The maximum value is 1 and the minimum value is 0. Zero means no relationship.

99 This facilitation, however, was only mild because I did not want to put too much burden on the respondents. When they said that they could not think of any additional landmarks (and the total number of landmarks was lower than 7) I probed them only once.

Table 6.31 Summary table of the effect sizes of event characteristics (only personal events)

| Predictor | Dating error | Month error |
| --- | --- | --- |
| Event characteristics (Cramer’s V) | | |
| Year schema | - | .32 |
| Season schema | - | .29 |
| Self–event / Other–event | .08 | .15 |
| Date known / Reconstructed | .29 | .25 |
| Importance | .18 | .19 |
| Expected importance (proxy) | .15 | .12 |
| Sharing the event | .15 NS | .18 NS |
| Uniqueness | .16 | .17 |
| Confidence in month | - | .27 |
| Expected month accuracy (proxy) | - | .19 |
| Confidence in year (only year) | .23 | - |
| Expected accuracy of year (proxy) | .20 | - |
| Under oath confidence | .20 | - |
| Respondent characteristics | | |
| Self-evaluation of memory (partial η²) | .007 NS | - |
| Gender (r) | .15 | - |
| Overall difficulty (partial η²) | .16 | - |
| Number of landmarks (tau b) | -.14 NS | - |

Note. Dating error: effect size of no error versus any error. Month error: effect size for all month errors. NS = not significant. Sharing has a smaller sample (n = 267). The sample of events for all other predictors is n = 853.

Public events

Relationship between event recency and dating accuracy of public events

Only 8 public events were included in the sample of target events (6 of them from 2006 and 2007). Their “age” (recency) is significantly related to dating accuracy, but the relationship is very weak, τ = .12, p = .002, n = 315. Because of the small sample of respondents, no correlation is significant when computed separately for each event.


Confidence in date estimates of public events

H-12: Confidence in month estimates is positively related to dating accuracy.
Confidence was measured separately for the month estimate and the year estimate. As shown in table 6.32100, when respondents were fully confident in their month estimate (rating 0 = no month error expected; 1 = up to one month error expected), they were right in 39.5% of the cases (17 of 43 events). This effect is relatively strong in comparison to other predictors (V = .24; but see the warning in the note that the minimum expected count per cell was not reached). When a chi-square test is calculated for two categories only (no error versus any month error), the effect is a little stronger, χ²(4, N = 315) = 35.33, p < .001, V = .34. Compared to the confidence ratings of personal events, the most striking difference is the very high percentage of very low confidence (57% of public events versus only 9% of personal events have confidence “4”, meaning that the error may be four or more months).

Table 6.32 Relationship between confidence in month estimate and frequency of month errors (month only)

Confidence in correct month: count (column %)

| Dating error | 0 | 1 | 2 | 3 | 4 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 17 (39.5) | 19 (52.8) | 12 (32.4) | 5 (23.8) | 23 (12.9) | 76 (24.1) | 70.86*** | .24 |
| 1 month | 8 (18.6) | 10 (27.8) | 9 (24.3) | 2 (9.5) | 38 (21.3) | 67 (21.3) | | |
| 2 months | 0 (0) | 2 (5.6) | 4 (10.8) | 2 (9.5) | 19 (10.7) | 27 (8.6) | | |
| Total (row %) | 43 (13.7) | 36 (11.4) | 37 (11.7) | 21 (6.7) | 178 (56.5) | 315 (100) | | |

Note. χ² is an overall test (df = 24); ***p < .001. Errors of three and more months are trimmed but the chi-square test applies to all month errors. Warning: 50% of cells did not reach the minimum expected count of 5. Because of this I also computed an overall test with the confidence ratings collapsed into two groups (0–1 versus 2–4), χ²(4, N = 315) = 38.42, p < .001, V = .35.
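The warning about expected counts below 5 can be illustrated with the counts visible in table 6.32. Because errors of three and more months are trimmed from the table, the sketch below collapses them into one remainder row derived from the column totals. This is an assumption made for illustration, so the resulting share of small cells differs from the 50% reported for the full table.

```python
def expected_counts(table: list[list[int]]) -> list[list[float]]:
    """Expected cell counts under independence for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_of_small_cells(table: list[list[int]], minimum: float = 5.0) -> float:
    """Share of cells whose expected count is below `minimum`."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < minimum for e in cells) / len(cells)


# Rows: no error, 1 month, 2 months; columns: confidence ratings 0 to 4
shown = [
    [17, 19, 12, 5, 23],
    [8, 10, 9, 2, 38],
    [0, 2, 4, 2, 19],
]
column_totals = [43, 36, 37, 21, 178]
# Errors of three or more months, collapsed into one row (illustrative assumption)
remainder = [total - sum(col) for total, col in zip(column_totals, zip(*shown))]

share = share_of_small_cells(shown + [remainder])
print(share)
```

Even in this collapsed version a quarter of the cells fall below the usual minimum, mostly in the sparse “2 months” row, which is why merging confidence ratings is a reasonable remedy.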

H-13: Confidence in year estimates is positively related to dating accuracy.
When respondents were confident that their year estimate was correct, they were right in 69% of the cases (see table 6.33). As the table shows, only nine errors of two years appeared when respondents were confident in the accuracy of their year estimate, and the frequency of one-year errors was the same. Compared to personal events, respondents chose the less confident ratings more often for public events, showing that their confidence in the year estimates of public events is generally weaker.

100 All analyses were made including the two events from 2005 and 2008.

Table 6.33 Relationship between confidence in year estimate and frequency of year errors (year only)

Confidence in correct year: count (column %)

| Dating error | 0 | 1 | 2 | 3 | Total (%) | χ² | V |
| --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 42 (68.9) | 75 (55.1) | 48 (45.3) | 3 (25.0) | 168 (53.3) | 20.20* | .15 |
| 1 year | 9 (14.8) | 43 (31.6) | 34 (32.1) | 8 (66.7) | 94 (29.8) | | |
| 2 years | 9 (14.8) | 16 (11.8) | 22 (20.8) | 1 (8.3) | 48 (15.2) | | |
| Total (row %) | 61 (19.4) | 136 (43.2) | 106 (33.7) | 12 (3.8) | 315 (100) | | |

Note. χ² is an overall test (df = 9); *p < .05.

Association of public events with personal events

H-14: Public events that are associated with personal events are dated more accurately than events without this association.
Table 6.34 shows the frequencies of dating errors for all eight public events separately. Respondents mentioned an association of a public event with their personal events in only 44 cases. The most frequent was the association with the death of John Paul II (n = 15). The frequencies are, however, so low that no comparisons can be made. For example, 5 out of 7 people who had some personal association with event 26 (Healthcare charges) estimated the date correctly, which is a much higher percentage than among all other respondents, but it may be an artifact as well.


Table 6.34 Frequency of the associations of public events with personal events and frequencies of dating errors

Public event number

| Association | 6 | 7 | 18 | 26 | 29 | 32 | 34 | 35 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No error | 0 | 0 | 0 | 5 | 2 | 0 | 4 | 0 | 11 |
| 1 month error | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 3 |
| 12 month error | 3 | 3 | 4 | 1 | 0 | 1 | 5 | 1 | 18 |
| Total | 3 | 3 | 5 | 7 | 4 | 4 | 15 | 3 | 44 |

Note. 6 = Same-sex couples registered partnership (2006), 7 = Hurricane Kyrill (2007), 18 = Driving points system (2006), 26 = Healthcare charges (2008), 29 = Olympic Games in Torino (2006), 32 = Parliamentary elections (2006), 34 = Pope John Paul II died (2005), 35 = Czech Republic joined Schengen (2007). No errors, 1 month errors and 12 month errors are shown only for associated public events.

Comparison of dating accuracy in CAL and Non-CAL condition

No difference was found between the mean rank error (month and year together) in the CAL (n = 151) and Non-CAL (n = 164) conditions, z(315) = -1.29, p = .13. Analysis of month error also did not find any relationship, χ²(6, N = 315) = 7.35, p = .29, V = .15. Other explorations (made separately for the pairs of events with different dating difficulty) also did not reveal any differences between the conditions.
The conclusion is that, for the chosen events, a calendar instrument interview does not have any effect on dating accuracy in comparison to an interview without a calendar instrument.

6.4 Discussion

This study had two major aims: to identify the event and respondent predictors of dating accuracy and to evaluate whether a calendar instrument interview leads to higher dating accuracy of recent personal or public events. I will start with the second aim. The overall discussion of all studies is in chapter 8.

Does the calendar instrument interview (CAL) lead to more accurate date estimates in comparison to the Non-CAL interview?


Calendar instruments are generally recognized as techniques that increase data quality (or are at least neutral), and hardly any drawbacks of these instruments have been raised101, though it is also clear that more research is needed into the conditions under which calendar instruments bring the best outcomes (Belli & Callegaro, 2009; Glasner & Van der Vaart, 2009). The available studies (mentioned in section 3.4) compared heavily structured question lists with more flexible calendar interviews. Some studies compared structured interviews with and without a calendar instrument. To my knowledge, no study has compared a calendar instrument interview with a more “memory recall friendly” type of interview. By more “friendly” I mean that the interviewers tried to facilitate respondents’ recall as much as possible with various probes similar to those in the CAL condition (e.g., season probes, parallel probes between events in a question list, and landmark questions asking whether the target event is connected to some landmark).

I did not find any effect of the CAL condition on dating accuracy when compared to the results in the Non-CAL condition. Analyses were made separately for dating error (month and year together) and for month error (month only), while also controlling for the effect of other variables such as gender or date reconstructed / known.

Some landmarks were biased (34 of the 106 landmarks whose true date could be validated), and the data show that a biased landmark biases the events that are associated with it. It can be argued that in the CAL condition landmarks play a major role in estimating the dates and, because they are easily visible, they can cause more bias (when biased) than in the Non-CAL condition.
In the latter condition people were also probed to use landmarks (mentally or using the events from the question list as recall aids), but because these were not visible they probably did not lead to as many errors as in the CAL condition.

Another explanation is that personal events, which usually bring along vast contextual information, did not leave the calendar instrument any chance to improve accuracy (it only made the task easier because everything was visible102). People know the month of many events even without the calendar, and the year error shows that estimating the year was also not too difficult (most of the events are estimated with up to 1 year error). No difference between the two conditions was found for the public events either, showing that the task difficulty was probably similar in both conditions.

101 Apart, for example, from longer interview time, preparation time, or the possible danger that the dates of some personal landmarks can be biased (see section 3.4).
102 Most respondents in the CAL condition provided feedback that the calendar instrument was of great help to them and made the task easier.

Predictors of dating accuracy – event characteristics

Most of the studied predictor variables showed at least a weak relationship with dating accuracy. As shown in the summary table 6.31, the strongest predictors of dating accuracy are the availability of temporal schemata, knowing the date straight away, and high confidence in the temporal estimate. This is not surprising because all three predictors are directly connected to the date estimates: they are closely related to the date estimating strategies, which were found to be related to dating error (Thompson, et al., 1993). The remaining event characteristics also predict dating accuracy, but not as strongly. This is probably because they are connected to dating accuracy only indirectly (e.g., importance, sharing) or cannot have much effect because of the short recall period (e.g., event recency). It has to be highlighted that it is easier to predict biased date estimates than correct ones, because even when all “strong” predictors point to a correct date estimate, the chances of an incorrect estimate are still rather high.

Most of the less direct predictors (e.g., importance, frequency, self–other events distinction, sharing) provide some hints as to which events could be dated more accurately, but as the effect sizes show (usually from .02 to .18) the effect is relatively small, though in most cases robust.

Predictors of dating accuracy – respondent characteristics

Respondent characteristics were on average only weakly related to dating accuracy, and some of the expected relationships were not significant at all (e.g., self-evaluation of memory before the dating task). Respondents’ evaluation of the overall difficulty of estimating the date was related to overall dating accuracy, showing that a subjectively “harder” task was related to less accurate date estimates. Gender was also found to be an important predictor of dating accuracy, supporting the finding of most studies of personal events that women are on average more accurate than men (e.g., Skowronski, et al., 1994). Education was omitted from the analyses due to the lack of variance. Rather surprisingly, the number of landmarks was not significantly related to dating error (τ = -.14, p = .44). This could partially be caused by the instruction, which limited the maximum number of freely recalled landmarks to 15 and did not prompt respondents further once they had reached at least 7 landmarks.

Predictors of dating accuracy – public events

Confidence in the date estimates of public events also showed a relatively strong relationship to dating accuracy, similar to that for personal events. Public events for which a personal association was available were dated more accurately, but because of the very low frequencies no generalizations can be made, and thus the hypothesis about the impact of association on dating accuracy (also explored in Study I) was not supported.


7 Dating accuracy predictors of recent personal events (Study III)

7.1 Study aims and hypotheses

Study II focused on the predictors of dating accuracy for remote unique personal events (2005–2008). The present Study III has the same aim but focuses on more recent events (March – June 2011). There are at least two reasons for conducting this study:
• Predictors of dating accuracy (respondent characteristics, event characteristics) may have a different impact on recent events than on remote events, and it is probable that the difference will not be caused merely by the different time units used for measuring dating accuracy.
• Study II was based on a relatively small sample of respondents, which makes the inferences about the impact of the calendar instrument rather weak. The present study has a larger sample, and thus the probability of finding an impact of a calendar instrument on dating accuracy (if there is any) is higher.

The main aim of this study is to find predictors of dating accuracy of recent unique personal events. Three types of predictors of dating accuracy will be examined:
• Respondent characteristics: they influence the overall dating accuracy (e.g., gender);
• Event characteristics: these are measured separately for every event and influence each target event (e.g., importance, recency);
• Data collection: experimental group interviewed with a calendar instrument (CAL) and control group interviewed without a calendar instrument (Non-CAL).

The following research questions were asked:
• Which event and respondent characteristics are the strongest predictors of dating accuracy among recent personal events?
• Do interviews with a calendar instrument (CAL) lead to higher dating accuracy of recent personal events in comparison to interviews without a calendar instrument (Non-CAL)?


I have also added 4 public events to the sample of events. The major aim of including these events was to explore whether the calendar instrument increases the dating accuracy of public events.
• Do interviews with a calendar instrument (CAL) lead to higher dating accuracy of recent public events in comparison to interviews without a calendar instrument (Non-CAL)?

Selection of personal events

Proxies kept the online diaries for a period of 6 weeks (from Monday, April 11, 2011 to Sunday, May 29, 2011). The proxies’ task was to secretly record several events per week from the life of their partners (or events relevant to the whole family) that they thought their partner would probably not forget completely after a month or two. Proxies were not allowed to tell their partners that they were keeping a diary. Some days could be skipped if nothing remarkable happened or when they did not learn of any noticeable event (e.g., when their partner did not speak about anything or when they were away for the whole day). The description of an event included the exact date, the day of the week (both for the purpose of checking for inconsistencies), a short description using up to three words, a longer description with all details necessary to distinguish this event from other similar events, any notes about the event, and an assessment of the event’s pleasantness and emotional involvement from their partner’s point of view (only in cases where this information was available). Proxies were instructed to avoid sensitive or offensive details and, if necessary, not to mention problematic events at all.

Several days before the interview proxies were asked to select 22 events from their online diaries. The selection criteria were the following:
• To avoid events where the exact date is obvious (e.g., the respondent’s birthday);
• To avoid events which were too sensitive or to delete sensitive details from event descriptions;
• Not to select events which happened on the same day and were obviously associated with each other (e.g., “travel to a cottage” and “having dinner after the arrival at the cottage”);
• To delete all temporal hints which were in the descriptions (e.g., “Tuesday tennis training with a new racket”);
• To select events from as wide a range of dates within the boundary as possible.

Selected events were then checked by the interviewers to see whether the descriptions complied with all the above-mentioned criteria.

Another task for the proxies was to fill in some additional information:
• Rate the expected accuracy of the respondent’s DOW estimate;
• Rate the expected accuracy of the respondent’s week estimate;
• Rate the importance of the event for the respondent;
• Rate whether they spoke about the event with the respondent.

There were four controls to check whether the date recorded in the online diary was correct.
• The online system (Survey Monkey) recorded the date when people recorded their events. This date could be compared with the date mentioned in the event’s description (proxies were asked to record the events as soon as possible, but sometimes they did not have online access, which resulted in a delay of one or more days).
• When I did not trust the date I could ask the proxy to confirm that it was correct (the ID was different for every proxy).
• The DOW and the exact date were recorded separately. I could thus compare the DOW extracted from the exact date with the DOW that was recorded separately. Several mistakes were found (e.g., proxies were describing Easter and the DOW was Monday, but the exact date was not April 25, 2011 but April 26, 2011). In cases where I was not sure about the accuracy I contacted the proxies.
• If the respondent did not agree with some of the target events’ dates that were selected from the proxy diary (after the interview was finished), he or she could provide evidence that the date was mistaken (diary, e-mail, text message, etc.). Proxies could provide evidence as well. When the evidence did not prove sufficiently that the date was correctly recorded, the event was excluded from the analysis. Respondents also skipped events if the description was not unique enough and therefore similar events could not be distinguished.
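The third control (comparing the separately recorded DOW with the DOW implied by the exact date) can be illustrated with a short sketch. It is not the procedure actually used in the study; the function name `dow_matches` and the English day labels are assumptions introduced for this example.

```python
from datetime import date

# Day names indexed by date.weekday(), where Monday is 0 (labels are assumed).
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def dow_matches(recorded_date: date, recorded_dow: str) -> bool:
    """Return True when the separately recorded DOW agrees with the exact date."""
    return DAY_NAMES[recorded_date.weekday()] == recorded_dow

# Easter Monday 2011 fell on April 25, so a diary entry dated April 26 with
# DOW "Monday" is inconsistent and would be checked with the proxy.
print(dow_matches(date(2011, 4, 25), "Monday"))
print(dow_matches(date(2011, 4, 26), "Monday"))
```

Entries for which the check fails are only flagged; deciding which of the two fields is wrong still requires contacting the proxy, as described in the list.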

Examples of the events and their descriptions:

• “We were at the technical museum with our boys. It was superb.”
• “Supper at David’s followed by Vodka party”
• “Collection of the thesis from bookbinder”
• “Departure for vacation to ” (part of extended event)
• “First time we caught a fish this year”
• “Helping with the restoration of the church”


• “You removed the doors from the wardrobe because we could not close them for some time already. You put a drapery over it.”

Selection of public events

When exploring the media for well-known public events from the reference period (14th March to 5th June) I realized that only a few events met the inclusion criteria. These were: 1) an event had to be known to all respondents, or at least to the vast majority of them, and the available resources (internet, newspapers) had to agree on just one true date for when the event happened; 2) an event had to be mentioned by all major types of media (TV, internet, radio, newspapers, and journals), because not all respondents had, for example, a TV. The selection was done by two researchers independently, with the agreement that only 4 public events could be chosen¹⁰³. This is fewer than intended, but the lack of events could not be anticipated in advance. The chosen events are (a detailed description of the events is in Appendix 6):
1) Royal wedding: Prince William married Kate Middleton at Westminster Abbey in London.
2) CZ won bronze at WCH: Czech hockey players defeated the Russians at the World Ice Hockey Championship and took bronze.
3) Annie found dead: Annie, a child missing for several months, was found dead in Prague–Trója. DNA tests later confirmed her identity.
4) Osama’s death: The US commando found and killed Osama bin Laden in Pakistan.

Events excluded from the analyses

• Events that were not distinguishable from other similar events (the description of an event was not unique enough).
• Events of which the true date could be biased. When the original “true” date was biased but the correct true date was found, this new true date replaced the original one and the event was not excluded.

103 Two researchers (A.N. and E.L.) looked for appropriate public events separately. After the choice was made I asked ten people with characteristics similar to those of the respondents to verify whether the choice was good. All four events were known to them.

Variables excluded from the analyses

• Event theme. Event theme will be analyzed by my colleague in her master’s thesis, which is due in summer 2012¹⁰⁴.
• Education. Even though respondents with higher education may be more accurate, my sample does not offer data for such a comparison because most of the respondents (83%) received at least a bachelor’s degree at a university and the frequencies of other levels of education are low.
• Emotions ratings. Emotions were rated concurrently by the proxies. This variable shows large variability among proxies and events and is thus suitable for a more qualitative analysis.

Hypotheses

I will mention here not only hypotheses (shortened as “H”) but also more explorative questions. The reason for not stating hypotheses in some specific cases is that the design of this study influences the impact of some variables, and thus the typical relationship may not or cannot be found. Another reason is that some of the measured variables lack variance, which was already known during or even before the data collection.

Hypotheses and explorations concerning event characteristics of personal events: respondent independent or partially independent

• H-1: Event recency is negatively related to dating accuracy. (More remote events will be dated less accurately than less remote events.)
• H-2: Events that have a temporal schema are dated more accurately than events without this schema. This is true only when temporal schemata with appropriate time units are concerned. A week schema increases the accuracy of DOW. A month schema increases the accuracy of week estimates. An exact date schema increases the accuracy of the exact date estimate. Note: Event regularity was used as one of the indicators of temporal schema (see below).
• Event frequency is generally related to dating accuracy. Especially low-frequency events should be dated more accurately. However, because this part of the research concerns only unique events, even similar events that happened more than once in the reference period will be distinguishable. Also, the frequency of repeated events is very low. Because of this I do not expect event frequency to be related to dating accuracy, but some trend could be found.
• H-3: Self–events are dated more accurately than other–events.

104 Eva Literáková.

Hypotheses concerning event characteristics of personal events: respondent dependent or proxy dependent

• H-4: Events with a known date are dated more accurately than events with a reconstructed date.
• H-5: Event importance is positively related to dating accuracy. Note: especially “highly important” events should be dated more accurately than the rest of the events. The expected importance rated by proxies will also be positively related to dating accuracy, but the relationship will probably be weaker.
• H-6: Event sharing is positively related to dating accuracy. The sharing of an event with the respondent rated by proxies will also be positively related to dating accuracy, but the relationship will probably be weaker.
• H-7: Confidence in DOW estimates is positively related to dating accuracy. Expected accuracy of the DOW estimate rated by proxies will also be positively related to dating accuracy, but the relationship will be weaker.
• H-8: Confidence in week estimates is positively related to dating accuracy. Expected accuracy of the week estimate rated by proxies will also be positively related to dating accuracy, but the relationship will be weaker.

Hypotheses and explorations concerning respondent characteristics of personal events

• Age should not be related to dating accuracy because I have restricted the age range.
• H-9: Women are on average more accurate than men. This applies to the number of exact date estimates and exact estimates of the DOW as well.
• H-10: Respondents’ self-evaluation of memory is positively related to dating accuracy. The better the respondents rate their memory, the more accurate they are. However, I expect only a small effect, because respondents rated their memory before knowing which events they would estimate.
• H-11: Respondents’ evaluation of the overall difficulty of estimating the date is negatively related to dating accuracy (rated after providing the date estimates, separately for week and DOW). Note: This relationship should be stronger than the relationship of self-evaluation of memory before the dating task.
• Motivated respondents should on average be more accurate than less motivated respondents. But in a face–to–face interview the rapport with a respondent is usually good, and thus a strong impact of motivation cannot be expected.
• Tired respondents should on average be less accurate than less tired respondents. But because interviewers were instructed not to conduct interviews very late at night or at times not suitable for respondents, the negative effect of tiredness will probably not be found.
• H-12: Number of self-generated landmarks is positively related to dating accuracy.

Explorations concerning the impact of calendar instrument on dating accuracy of personal events and public events

• Even though it could be expected that the calendar instrument will increase the dating accuracy of personal events, this is not that obvious. It can be expected that a calendar instrument should not decrease dating accuracy, but an increase in dating accuracy cannot be expected for certain, especially as compared to the Non-CAL condition in which a plain calendar is used. One reason is that a calendar instrument cannot have much impact on dating accuracy when the date was simply known. Another reason is that some landmark events in the calendar may be biased, so that the reconstruction of associated events and their dates may be biased as well. The calendar instrument may also “work” differently for women than for men. I will thoroughly explore the relationship between dating accuracy and applying the calendar instrument to find indications of whether the calendar has some effect on dating accuracy or not.
• The impact of the calendar instrument on public events is even less obvious, because when public events are not associated with personal events the network of personal events cannot help much. It could be expected that public events that are not associated with personal events will not be dated more accurately in the CAL condition. On the other hand, events that are associated with personal events could be dated more accurately in the CAL condition because of the help of other events mentioned in the calendar that may provide sequential or parallel cues.

Hypotheses concerning the public events

• H-13: Confidence in DOW estimates is positively related to dating accuracy.
• H-14: Confidence in week estimates is positively related to dating accuracy.
• H-15: Public events that are associated with personal events are dated more accurately than events without this association (applies only when temporal schemata of public events are controlled for).

7.2 Method

Participants

The participants in this study were respondents (those who estimated the dates) and their partners, called proxies. 78 couples¹⁰⁵ were recruited with flyers at public places and by e-mails through snowball sampling. The aim was to reach a sample of couples with different educational, socio-economic and demographic characteristics, even though I do not expect these participant characteristics to have a substantial impact on the dependent variable, which is the dating error. Partners had to have lived together for at least half a year before the research interview was administered. The role of a proxy was to assist in recording and selecting the personal events about their partners (or family). Respondents had to be between 23 and 40 years old. One reason is that during these reminiscence bump years people experience many important events; another is to allow comparison of the results with Study II, which used the same range of respondents’ age. All respondents are Czech citizens who were present in the Czech Republic without a break longer than two weeks from March 14th to June 5th, 2011. Proxies had to live with their partners during the same time period and to be in regular telephone (or other) contact when their partners or they themselves were away from home for more than two days. Every couple was compensated with a small present worth approximately 5 Euro. Both partners had to agree to participate in the study.

105 Only heterosexual couples were recruited. This is because gender is an important predictor of dating accuracy and also because the gender of the proxy could have some effect on the events that are recorded in the diary.

Proxies were informed that the study focuses on dating personal events but were instructed to keep this information secret. Respondents were informed that the study would focus on everyday memory, without any further specification. They were also informed that no sensitive or offensive information would be gathered from them or their proxies and that results would be shown to them only (or to both partners if they wished).

The intended sample size of 80 couples or more was almost reached, as the final sample consists of 78 respondents (35 women and 43 men). The intention to have equal numbers of male and female respondents was not completely achieved. The reason is that several male proxies did not finish the task of keeping the diary with personal events of their partners, and these couples were thus withdrawn from the study. One respondent was older than forty (46) but was not excluded because the lifestyle of the couple was similar to that of many other couples in the sample (the proxy was younger) and they were also part of Study II.

Measures

Most of the measures are thoroughly described in chapter 4 and I do not explain them here again (with the exception of temporal schemata; see the next section). Table 7.1 summarizes all measured independent variables.

Table 7.1 Summary of measured independent variables

Independent variables measured in empirical studies

Respondent characteristics
  Age (23–40; 46)
  Gender
  Education
  Self-evaluation of memory
  Overall difficulty of estimating the date (R.)
  Tiredness (Int., R.)
  Motivation (Int., R.)
  Number of landmarks (only CAL condition)
  Public holidays help
  Regularity of life, work and free time activities

Event characteristics – respondent independent or partially independent
  Event recency
  Temporal schemata
  Regularity
  Frequency
  Theme
  Self–events or other–events

Event characteristics – respondent dependent or proxy dependent
  Date reconstructed / Date known (Int., R)
  Importance (R, P)
  Sharing an event with other people (R)
  Sharing an event with a respondent (P)
  Confidence in date estimates (R)
  Expected accuracy of date estimates (P)
  Pleasantness (R)
  Emotional strength (R)
  Association of a public event with a personal event (R)

Data collection
  Semi-structured interview with (CAL) or without calendar instrument (Non-CAL)

Classification of events according to a temporal schema

An event classification was created based on the events’ description and content. For some events the description gives enough information on whether there is a temporal schema or not. For example, “downhill sledging” has a season schema, as it must have happened in winter (if not on a glacier). To confirm the availability of the schemata when not obvious, recordings of the interviews were used. The availability of a schema does not imply that the schema is correct. Sometimes respondents used schematic information that was wrong.

Month schema. Events have a month schema when the schematic information leads to choosing some specific week within a month (e.g., something which happens every first week of the month). When an event also has a year schema, then together with the month schema the exact week will probably be known.

Week schema. Some events may happen only on Wednesdays, e.g. when a woman on maternity leave works only one day a week. Some other events may happen at weekends, e.g. visits and trips, and people may not be sure whether it was Saturday or Sunday. All events were classified according to having these schemata or not.

Exact date schema. Some events had an exact date schema. According to the respondents, such an event can happen only on a certain exact date. For example, there is a tradition in the Czech Republic that on so-called “Green Thursday”¹⁰⁶, which falls before Easter, people drink green beer (beer with some herbs that give it a green color). Thus events that mention drinking beer on Green Thursday have an exact date schema.

Design and procedure

The study has a correlational design, seeking the predictors of dating accuracy. The method of data collection, which is also a predictor in this study, was experimentally manipulated. Respondents were randomly assigned to the CAL or Non-CAL condition (with the limitation that half of the women had to be in each condition). Randomization was done in an Excel sheet with the help of random numbers. Respondents were not aware that there were two experimental conditions. Respondents in the Non-CAL condition used the small pocket calendar as a recall aid, and respondents in the CAL condition used both the small pocket calendar and the calendar instrument (see chapter 4 for more details). The procedure was similar to that in Study II and is described in detail in chapter 4 (procedure).
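The constrained random assignment described in this section can be sketched in Python as follows. The study itself used an Excel sheet with random numbers, so this is only an illustrative equivalent: the function `assign_conditions`, the input format and the rule that the odd person out goes to CAL are all assumptions made for the example.

```python
import random

def assign_conditions(respondents, seed=None):
    """Randomly split respondents into CAL / Non-CAL, balanced within each gender.

    `respondents` is a list of (respondent_id, gender) tuples (hypothetical format).
    """
    rng = random.Random(seed)
    assignment = {}
    for gender in sorted({g for _, g in respondents}):
        ids = [rid for rid, g in respondents if g == gender]
        rng.shuffle(ids)                  # random order stands in for random numbers
        half = len(ids) // 2
        for rid in ids[:half]:
            assignment[rid] = "Non-CAL"
        for rid in ids[half:]:            # assumed rule: the odd person out goes to CAL
            assignment[rid] = "CAL"
    return assignment

# Illustrative sample with the gender counts of this study (35 women, 43 men).
sample = [(i, "woman") for i in range(35)] + [(i + 35, "man") for i in range(43)]
groups = assign_conditions(sample, seed=1)
```

Balancing within gender (rather than over the whole sample) keeps the gender proportion nearly identical in both conditions even when the number of men and women differs.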

7.3 Results

Descriptives of the collected data

Interviews were conducted between June 6th and June 29th, 2011. The intention was to spread all interviews evenly over two weeks. The interview dates were randomly assigned, but some respondents (or interviewers) were not available on these days; this is the reason for the uneven frequencies (see figure 7.1). Eight interviews were conducted between June 20th and 29th because the suggested interview dates were canceled and a suitable date within the limits was not found. One interview was not conducted because of family issues. The total number of conducted interviews was thus 78. No difference was found in the mean interview date between women and men, between the conditions, or when both sex and condition were included in the same model, F(3) = 17.49, p = .56¹⁰⁷.

106 “Maundy Thursday” in English.
107 Non-parametric analyses lead to the same conclusion, but the ANOVA was also sufficiently robust in this case.

Figure 7.1 Frequency plot with interview date (n = 78)

The number of men and women is unbalanced, as the total number of men is slightly higher than the total number of women. This was caused by the fact that women proxies were more willing to cooperate in this diary study than men proxies. However, the gender proportion in both experimental conditions is nearly identical (see table 7.2).

Table 7.2 Number of respondents in experimental and control condition

Condition   Non-CAL   CAL   Total
Men         21        22    43
Women       17        18    35
Total       38        40    78

I did not find a significant difference in the frequency of personal events to be dated in the CAL condition (n = 806) and the Non-CAL condition (n = 755), χ² (1, N = 1561) = 1.66, p = .20. Nor did I find any difference between the conditions when gender was included, χ² (1, N = 1561) = 0.48, p = .49 (see the frequencies in table 7.3). Women, however, dated significantly fewer events than men, χ² (1, N = 1561) = 19.62, p < .001. This is caused by the fact that there are eight fewer women than men (35 versus 43). For most analyses this is not a problem because of the large size of the events sample. The mean number of valid events to be dated per person was 20.0 (SD = 2.7) and was almost identical for women and men in both conditions.

Events in the Non-CAL condition were on average more recent than events in the CAL condition, t(1559) = -2.53, p = .011, partial η² = .004. The difference is, however, very small (only 1.6 days) and was found only for women. Such a small difference cannot have a significant impact on dating accuracy. The order in which the target events were presented in the interview (all events were randomly ordered) is not related to dating accuracy, Kendall’s tau-b (τ = .003, p = .89, n = 1561).

Table 7.3 Number of personal events that were dated in the two conditions

Condition   Non-CAL   CAL   Total
Men         413       455   868
Women       342       351   693
Total       755       806   1561
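The frequency comparisons based on table 7.3 can be checked with a small sketch. It assumes that the reported χ² values with one degree of freedom are goodness-of-fit tests of two counts against an equal split; the text does not state this explicitly, and the function name `chi_square_equal_split` is introduced here only for illustration.

```python
import math

def chi_square_equal_split(count_a: int, count_b: int):
    """Chi-square test (df = 1) of two observed counts against an equal split."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2_cond, p_cond = chi_square_equal_split(806, 755)   # CAL vs Non-CAL events
chi2_sex, p_sex = chi_square_equal_split(868, 693)     # events dated by men vs women
print(round(chi2_cond, 2), round(p_cond, 2))
print(round(chi2_sex, 2), p_sex < 0.001)
```

Under this assumption the results come out close to the values reported above; small discrepancies in the last digit can arise from rounding.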

General patterns of the dating error of personal events

In this section I will explore the general patterns of the dating errors in days, DOW and week units.

Dating error in days

The dating error in days shows the typical decreasing trend of accuracy, with “bumps” at multiples of 7 days (see figure 7.2) or close to these multiples (±1 day). The signed error shows a similar pattern on both sides of the distribution: 41.3% of the events are correctly dated, while 30.0% are telescoped to one side of the true date and 28.6% to the other.


[Figure 7.2: Relative frequency of dating error (no error trimmed). x-axis: dating error in days (0–42); y-axis: relative frequency of dating errors. Fitted function: y = −4.364 ln(x) + 14.916, R² = .40]

Figure 7.2. Correct date estimates are trimmed (41.3%, n = 645). The best fit function of all events is a logarithmic function (N = 1561).

Dating error in weeks

The week error shows a decreasing trend without the patterns seen in the dating error in days. The pattern of the week error is best (almost completely) described by the exponential function shown in figure 7.3. The signed week error shows a similar trend on both sides of the distribution. When the correct date was, e.g., Sunday April 17th, 2011 and the date estimate was Monday April 18th, 2011, this is considered a week error (even though there is only a one-day difference in days). This is because Monday is considered the first day of the week (1) and Sunday the last day of the week (7). Even though Sunday is typically considered the first day, this is not suitable for the purposes of my dissertation because Sunday is a holiday and the work week usually starts on Monday and ends on Friday for most people. As figure 7.3 shows, the week was estimated correctly in 53% of cases and with a one-week error in almost 22% of cases.
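The week error with Monday as the first day of the week can be made concrete with a short sketch. The function names and the sign convention (estimate minus true date) are assumptions introduced for illustration; the example reproduces the Sunday/Monday case described above.

```python
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Monday of the week containing d (date.weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())

def signed_day_error(true_date: date, estimate: date) -> int:
    """Estimated date minus true date, in days (assumed sign convention)."""
    return (estimate - true_date).days

def week_error(true_date: date, estimate: date) -> int:
    """Absolute number of Monday-to-Sunday weeks between estimate and true date."""
    return abs((week_start(estimate) - week_start(true_date)).days) // 7

true_date = date(2011, 4, 17)   # Sunday
estimate = date(2011, 4, 18)    # Monday of the following week
print(signed_day_error(true_date, estimate), week_error(true_date, estimate))
```

Because the week boundaries are fixed, a one-day error across the Sunday/Monday boundary counts as a one-week error, whereas a six-day error within the same week counts as no week error.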


[Figure 7.3: Relative frequency of dating error in weeks. x-axis: dating error in weeks (0–7); y-axis: relative frequency of dating errors. Fitted function: y = 100.76e^(−0.699x), R² = .99]

Figure 7.3. The best fit function is an exponential function.

Dating error in DOW

The DOW error shows this decreasing trend as well (see figure 7.4). 65% of the DOW estimates are correct (n = 1015). The signed DOW error shows a similar trend on both sides of the distribution. The best fit function is a power function. The probability of a DOW error is not the same for all DOW. For example, Sunday was most often misdated as Saturday, Monday as Tuesday, Tuesday as Wednesday, Wednesday as Tuesday, Thursday as Wednesday, Friday as Thursday and Saturday as Sunday. The range of errors was largest for Wednesday, Tuesday and Monday, which implies that it is easier to replace these days with other days. In contrast, Saturday or Sunday are usually replaced by the other weekend day and, rarely, by Friday. The other days of the week are almost completely ignored.
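Figure 7.4 reports DOW errors from 0 to 3, which is consistent with measuring the error as a circular distance between days of the week (so that Sunday and Monday are one day apart). This is an inference from the figure rather than a definition given in the text; the sketch below, with the hypothetical function `dow_error`, shows the computation under that assumption.

```python
def dow_error(true_dow: int, estimated_dow: int) -> int:
    """Circular distance between two days coded 1 (Monday) to 7 (Sunday)."""
    diff = abs(true_dow - estimated_dow) % 7
    return min(diff, 7 - diff)

# Sunday (7) misdated as Saturday (6) and Sunday (7) misdated as Monday (1)
# are both one-day DOW errors under this assumed definition.
print(dow_error(7, 6), dow_error(7, 1))
```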


[Figure 7.4: Relative frequency of DOW dating error. x-axis: dating error in days of the week (0–3); y-axis: relative frequency of dating errors. Fitted function: y = 65.602x^(−1.768), R² = .99]

Figure 7.4. The best fit function is a power function.

Impact of event recency on dating accuracy in days and DOW

H-1: Event recency is negatively related to dating accuracy.

Before moving to other predictors, the impact of event recency has to be explored because it can be a covariate in some of the subsequent analyses. Only a weak relationship was found between event recency and dating error in days (Kendall’s tau-b τ = .10, p < .001, n = 1561), which means that older events have larger dating errors in days (are less accurate).

This relationship applies only to self–events where the date was reconstructed (τ = .11, p < .001, n = 1081). Other–events or events with a known date were not related to event recency at all. When correct date estimates were excluded, the relationship remained the same. When temporal schemata are taken into account (and only events without month and season temporal schemata are chosen), then even the dating error in days of other–events is positively related to event recency (τ = .12, p < .001, n = 137; self–events τ = .13, p < .001, n = 1181). The rank correlation is slightly higher for women than for men (τ = .09 versus τ = .12; the difference is not significant, z = -0.65, p = .51) and the impact of self–events or other–events is similar for women and men. The relationship is similar but even smaller for the DOW error (Kendall’s tau-b τ = .06, p = .003, n = 1561; also applies to self–events only).

To sum up, event recency has a robust but weak effect on dating accuracy in days or DOW and does not have the same impact on events with different characteristics. Because of this, event recency will be included in some of the following analyses.

Comparison of dating accuracy in CAL and Non-CAL condition

As mentioned in the hypotheses section, I did not state any hypothesis about the impact of the calendar instrument on dating accuracy because there is a lack of evidence and arguments that would allow me to make such a statement.

The mean rank of the dating error of events is 789 in the Non-CAL condition and 772 in the CAL condition. This increase in dating accuracy is not significant, z(1561) = -0.76, p = .45, r = .02. When analyzed separately for women and men, no difference between the accuracy in the two conditions was found either, women: z(693) = -1.64, p = .10, r = .06; men: z(868) = -0.50, p = .62, r = .02. When the analysis was made only for events where the date was reconstructed, again no relationship was found. The analysis of the mean rank error of respondents in both conditions yielded the same results, whether computed for all respondents or for women and men separately (n = 78)¹⁰⁸.

Figures 7.4 and 7.5 show that the only visible difference is that women are on average more accurate (when the mean ranked error is used). Figure 7.5 shows only events where the date was reconstructed. Here the difference between women and men is more visible. Again no overall difference between the two conditions is visible.
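The effect sizes r reported with the z statistics above are consistent with the common conversion r = |z| / √N. The text does not state which formula was used, so the sketch below is an assumption; the function name `effect_size_r` is introduced only for this illustration.

```python
import math

def effect_size_r(z: float, n: int) -> float:
    """Effect size r from a z statistic, assuming r = |z| / sqrt(N)."""
    return abs(z) / math.sqrt(n)

# z statistics and numbers of events as reported in the text.
for label, z, n in [("all", -0.76, 1561), ("women", -1.64, 693), ("men", -0.50, 868)]:
    print(label, round(effect_size_r(z, n), 2))
```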

108 I have also tried linear mixed models where the dependent variable was ranked error (or absolute error in days), with ID as a random factor and sex and condition as fixed factors (with or without inclusion of event recency). Again no relationship was found, either when all events were analyzed or only those where the date was reconstructed. These analyses were only ancillary because, as mentioned in chapter 4, the dependent variable is not normally distributed, and ranked error on the other hand decreases the chance of finding the relationship (because the variance is smaller).

Figure 7.4. Procedure of computing the mean rank error: First the rank error of every personal event was computed. Then the mean rank error was computed from this rank error separately for every respondent. The “points” are respondents in the experimental conditions and their mean rank dating errors for all events that they estimated (only personal).

Figure 7.5. Procedure of computing the mean rank error: First the rank error of every personal event whose date was reconstructed was computed. Then the mean rank error was computed from this rank error separately for every respondent. The “points” are respondents in the experimental conditions and their mean rank dating errors for all events that they estimated (only personal).
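The two-step procedure described in the figure captions (rank every event’s error, then average the ranks per respondent) can be sketched as follows. Ranking on the absolute error in days and averaging ranks for ties are assumptions of this illustration, as are the function names and the toy data.

```python
from collections import defaultdict

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mean_rank_error(respondent_ids, abs_errors):
    """Step 1: rank all events together. Step 2: average ranks per respondent."""
    ranks = average_ranks(abs_errors)
    per_respondent = defaultdict(list)
    for rid, rank in zip(respondent_ids, ranks):
        per_respondent[rid].append(rank)
    return {rid: sum(rs) / len(rs) for rid, rs in per_respondent.items()}

# Made-up example: two respondents with two events each.
print(mean_rank_error(["a", "a", "b", "b"], [0, 0, 7, 1]))
```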


Figure 7.6 shows the cumulative percentage of dating errors in days (up to and including 21 days; only events where the date was reconstructed are included). The reason for trimming the rest is that the frequencies are very low, and the smaller errors (and the comparison across the groups) are more clearly visible when the graph is trimmed.

[Figure 7.6: Cumulative percentage of dating errors in CAL and Non-CAL condition (reconstructed date). x-axis: dating error in days (0–21); y-axis: cumulative percentage of events. Series: Non-CAL Men, CAL Men, Non-CAL Women, CAL Women]

Figure 7.6. Errors of 22 or more days are not in the figure (n = 116; 9.5%; total n = 1212 events).
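A cumulative percentage curve of the kind plotted in figure 7.6 can be computed as in the sketch below. The percentages are taken relative to all events in a group, including those with errors above the cut-off that are not plotted; this base, the function name and the toy data are assumptions of the illustration.

```python
def cumulative_percentage(abs_errors, max_error=21):
    """Cumulative percentage of events with error <= e, for e = 0..max_error."""
    total = len(abs_errors)
    running = 0
    curve = []
    for e in range(max_error + 1):
        running += sum(1 for err in abs_errors if err == e)
        curve.append(100.0 * running / total)
    return curve

# Made-up errors in days for one group; the 30-day error lies beyond the cut-off.
curve = cumulative_percentage([0, 0, 1, 7, 30], max_error=7)
print(curve[0], curve[1], curve[7])
```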

There are two interesting “spots” in this distribution. The first is that women in the CAL condition made more 7-day errors than in the Non-CAL condition (the percentage of 7-day errors is 18.3%, cumulative percentage 73.4%, compared with only 12.6% in the Non-CAL condition, cumulative percentage 64.4%).

Another interesting “spot” is that men in the CAL condition seem to have fewer errors of 1 to 6 days than in the Non-CAL condition. This implies that the calendar could “work” better for men when they are relatively accurate but not later on when the dating error increases. For example, the cumulative percentage of errors up to 4 days is 38.9% in the CAL condition and 46.2% in the Non-CAL condition. Even though this may be an interesting finding, caution is needed because, for example, the frequency of occurrence of the 4-day error is 7 in the Non-CAL condition and 6 in the CAL condition¹⁰⁹.

109 Frequency of occurrence of 1 day to 6 days error is 78 in Non-CAL condition and 64 in CAL condition; the difference between the frequencies was not significant.

No difference between the two conditions was found in the frequency of the DOW errors among events that have a reconstructed date either, χ²(3, N = 1212) = 0.99, p = .80, V = .03. Separate analyses for women and men showed no difference as well110. Analyses of the week error likewise did not reveal any significant difference111 between the two conditions, χ²(7, N = 1212) = 9.81, p = .20, V = .09.

Further explorations

When exploring the data I came across a possible explanation of this lack of difference between the two conditions. For some of the self-generated landmarks (n = 64) in the CAL condition the true date is known because the event appeared among the target events from the proxy diaries. Of these, 54 landmarks (84.4%) were correctly dated but 10 (15.6%) were biased112. It can be expected that some of the other landmarks, for which the true date is not known (n = 233), are biased as well, and thus the chance is relatively high that target events associated with these biased landmarks will also be biased. Thus, even if the calendar improves the dating accuracy of many events, this may be counterbalanced by biased estimates that stem from biased landmarks (or other events that served as the cue). Even though respondents in the Non-CAL condition also spontaneously used landmark events, it can be expected that they used them less than respondents in the CAL condition (because the landmarks were not so easily visible and the connections between all events were more difficult to find).

Respondents in both the CAL and the Non-CAL condition reported that the three pre-recorded public holidays (Easter, Labour Day, Liberation Day) visible in both calendars helped them as recall aids. Easter, which was mentioned by most of the respondents, was an especially useful recall aid. Respondents in the CAL condition generally did not use the second calendar (the plain calendar) because they found the calendar instrument transparent enough.

To sum up, no relationship between dating accuracy and the use (or non-use) of the calendar instrument was found.

110 Many more analyses were made but no difference was found (e.g., self-other events, sex, date reconstructed/known).
111 Many more analyses were made but no difference was found (e.g., self-other events, sex, date reconstructed/known).
112 The dating error of landmark events was: 3 × 1 day; 1 × 2 days; 2 × 6 days; 3 × 7 days; 1 × 21 days.

Explanation how dating error in various units will be presented

Because the dating error in days is not normally distributed I will mostly use the Mann-Whitney U test or the Kruskal-Wallis test for the comparisons between two or more groups. Both tests use the rank error. I will sometimes also refer to the mean rank error. To obtain it, the rank error of every date estimate is computed first, and the mean of these rank errors is then computed separately for each respondent. For the DOW error I will use the chi-square test, which measures the differences among the frequencies.

I will use two types of tables throughout the result section that are informative but can be confusing without further explanation. Table 7.4 is the first type of table. It shows the error in days. The first row gives the frequency (and the percentage in brackets) of “no errors”, the second row the frequency of “up to 1 day error” and the third row the frequency of “up to 7 days error”. I could also present “no error”, 1 day error and 7 days error, but I find it more informative when the reader can see the cumulative frequency of the “up to 1 day” or “up to 7 days” error straight away (thus 71.7% of self–events have at most 7 days error and 51.1% have at most one day error). I find these three types of errors the most interesting points of the cumulative distribution of errors (see figure 7.2). When the frequency of “no error” is subtracted from the frequency of “up to 7 days error” we get the number of non-zero errors that are not bigger than 7 days (= errors of one day up to one week). I do not provide separate tables for the week errors. The chi-square test and the effect size (Cramer’s V) are computed separately for each row.
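The two-step computation of the mean rank error can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the dissertation: it assumes that errors are ranked over the pooled sample of events, that tied errors share the average of their ranks, and that records arrive as (respondent, error in days) pairs. The respondent labels and errors are hypothetical.

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def mean_rank_error(records):
    """records: iterable of (respondent_id, error_in_days) pairs.

    Step 1 ranks the absolute error of every event in the pooled sample.
    Step 2 averages these ranks separately for each respondent.
    """
    records = list(records)
    ranks = average_ranks([abs(err) for _, err in records])
    per_respondent = defaultdict(list)
    for (respondent, _), rank in zip(records, ranks):
        per_respondent[respondent].append(rank)
    return {r: sum(v) / len(v) for r, v in per_respondent.items()}


# Hypothetical respondents and errors (not data from the study).
data = [("A", 0), ("A", 7), ("B", 0), ("B", 1), ("B", 14)]
result = mean_rank_error(data)
print(result)
```

Each respondent ends up with a single value, which corresponds to one “point” per respondent in a plot of mean rank errors.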

Example table 7.4
An example: dating errors in days of self-events and other-events

                 Date estimate (count, column %)
Dating error     Self-event     Other-events   Total (%)      χ²         V
No error         611 (43.3)     34 (22.8)      645 (41.3)     23.25***   .12
Error ≤ 1 day    721 (51.1)     40 (26.8)      761 (48.8)     31.64***   .14
Error ≤ 7 days   1013 (71.7)    77 (51.7)      1090 (69.8)    25.75***   .13
Total (row %)    1412 (90.5)    149 (9.5)      1561 (100)
Note. χ² is computed for each row (df = 1). ***p < .001. Frequencies per column are cumulative.


The DOW error (see table 7.5) is not presented in a cumulative way and each row thus represents only one amount of error (e.g., 113 self-events were estimated with 2 days of the week error). The chi-square test and the effect size are computed over all cells of the table.

Example table 7.5
DOW error for self–events and other–events

                 Date estimate (count, column %)
Dating error     Self-event     Other-events   Total (%)      χ²
No error         936 (66.3)     79 (53.0)      1015 (65.0)    11.90**
1 DOW e.         284 (20.1)     38 (25.5)      322 (20.6)
2 DOW e.         113 (8.0)      17 (11.4)      130 (8.3)
3 DOW e.         79 (5.6)       15 (10.1)      94 (6.0)
Total (row %)    1412 (90.5)    149 (9.5)      1561 (100)
Note. χ² is an overall test (df = 3), V = .09. **p < .01
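The two ways of testing used in tables 7.4 and 7.5 can be illustrated with a short Python sketch. It assumes a plain Pearson chi-square without continuity correction and the usual formula for Cramer’s V; whether the original analysis used exactly these settings is an assumption. For a row-wise test the cumulative category is set against all remaining events, for the overall test all cells enter at once. The function name is hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square (no continuity correction) for an r x c count table.

    Returns (chi2, degrees_of_freedom, cramers_v).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    rows, cols = len(table), len(table[0])
    df = (rows - 1) * (cols - 1)
    v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))
    return chi2, df, v


# Table 7.4, row "No error": this category versus all remaining events.
self_total, other_total = 1412, 149
no_error = [[611, self_total - 611], [34, other_total - 34]]
chi2_row, df_row, v_row = chi_square(no_error)

# Table 7.5: overall test over all DOW error categories (self, other).
dow = [[936, 79], [284, 38], [113, 17], [79, 15]]
chi2_dow, df_dow, v_dow = chi_square(dow)

print(round(chi2_row, 2), df_row, round(v_row, 2))
print(round(chi2_dow, 2), df_dow, round(v_dow, 2))
```

With the counts from the two example tables, the sketch gives values close to the reported χ² = 23.25 (V = .12) and χ² = 11.90 (V = .09).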

Event characteristics (respondent independent or partially independent)

Temporal schemata

H-2: Events that have temporal schema are dated more accurately than events without this schema.
Table 7.6 shows that events with an exact date schema are correctly dated in 93.3% of the cases (exact date), events with a week schema are correct in 95.4% of the cases (the day of the week, DOW), and events with a month schema are correct in 73.0% of the cases (the week). As can be seen, only 30 events had an exact date schema and 28 of them were correctly dated. The two remaining events, which were erroneously dated, had a 7 days error and a 14 days error (indicating that at least the DOW was known). The most common schema is the week schema (n = 478). As shown in the table, when an event has a week schema the DOW is correctly estimated for almost all of the events. The month schema was not so common, but when available it also helped a lot in estimating the week (though much less than the other two types of schemata helped in their respective units).
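The remark that a 7 days or 14 days error leaves the DOW intact can be made concrete with a small Python sketch. It assumes that the DOW error is the circular distance between the estimated and the true day within a 7-day week, which would explain why the DOW error in the tables never exceeds 3 units; this definition is an assumption inferred from the tables, and the function name is hypothetical.

```python
def dow_error(error_in_days):
    """Day-of-week error as circular distance on a 7-day week (range 0..3).

    Assumed definition: an error of 4 days counts as 3 DOW units because the
    estimated weekday is 3 days away from the true weekday in the other direction.
    """
    remainder = abs(error_in_days) % 7
    return min(remainder, 7 - remainder)


for days in (0, 1, 3, 4, 6, 7, 14):
    print(days, dow_error(days))
```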


Table 7.6
Availability of temporal schemata and frequency of dating errors

             Exact date sch. (days)1     Week schema (DOW)2          Month schema (weeks)3
D. error     Yes          No             Yes           No            Yes           No
No error     28 (93.3)    617 (40.3)     456 (95.4)    559 (51.6)    111 (73.0)    715 (50.7)
1 unit       0 (0)        116 (7.6)      17 (3.6)      305 (28.2)    25 (16.4)     312 (22.1)
2 units      0 (0)        48 (3.1)       2 (0.4)       128 (11.8)    13 (8.6)      202 (14.3)
3 units      0 (0)        24 (1.6)       3 (0.6)       91 (8.4)      2 (1.3)       85 (6.0)
Total        30 (1.9)     1531 (98.1)    478 (30.6)    1083 (69.4)   152 (9.7)     1409 (90.3)
Note. 1 = units of dating error are days (no error, 1 day error, 2 days error and 3 days error), χ²(1, N = 1561) = 34.13, p < .001, V = .15; 2 = units of dating error are days of the week (DOW), χ²(3, N = 1561) = 280.15, p < .001, V = .42; 3 = units of dating error are weeks, χ²(7, N = 1561) = 27.38, p < .001, V = .13.

Event frequency

As mentioned in the hypotheses section I did not state any hypotheses about the relationship between event frequency and dating accuracy. In this particular study it is probable just by design that no relationship will be found (or that only a weak trend will be found) because the frequencies are not high and the events are uniquely described. The Kruskal-Wallis test of variance indicated no significant effect of event frequency on dating accuracy in days, H(2, N = 1561) = 4.44, p = .11, partial η² = .003; the mean ranks are 771 (n = 1309), 817 (n = 171) and 859 (n = 81). As the mean ranks show, there may be some trend, but the number of events with higher frequency is probably too low to yield a significant difference. Tables 7.7 and 7.8 show that there could be some tendency for more frequent events to be more difficult to date, but the only significant relationship was found for the error of up to one day (50% correct when the event frequency was “once” in comparison to 43% or 36% when the event was more frequent).
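A Kruskal-Wallis comparison of this kind can be sketched in Python as follows. The sketch is a minimal illustration, not the software used in the study. It applies the standard tie correction, and it converts H into an effect size as H / (N − 1); that this is the formula behind the reported “partial η²” is an assumption, although it agrees with the reported pair H = 4.44 and η² = .003 for N = 1561. The function name and the example groups are hypothetical.

```python
def kruskal_wallis(groups):
    """Kruskal-Wallis H with tie correction.

    Returns (H, degrees_of_freedom, eta_squared), where eta_squared is
    computed as H / (N - 1) (assumed effect size formula).
    """
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}      # average rank of each distinct value
    tie_term = 0      # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        t = j - i + 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        tie_term += t ** 3 - t
        i = j + 1
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    h /= correction
    return h, len(groups) - 1, h / (n - 1)


# Hypothetical dating errors in days for three frequency categories.
h, df, eta_sq = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h, df, eta_sq)

# Converting the reported H into the effect size under the assumed formula.
print(round(4.44 / (1561 - 1), 3))
```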


Table 7.7
Relationship between event frequency and the dating errors in days

                 Event frequency (count, %)
D. error         Once           2-3 times      More often     Total (%)      χ²       V
No error         553 (42.2)     63 (36.8)      29 (35.8)      645 (41.3)     2.89×    .04
Error ≤ 1 day    656 (50.1)     74 (43.3)      31 (38.3)      761 (48.8)     6.59*    .07
Error ≤ 7 days   924 (70.6)     115 (67.3)     51 (63.0)      1090 (69.8)    2.71×    .15
Total            1309 (83.9)    171 (11.0)     81 (5.2)       1561 (100)
Note. χ² is computed for each row (df = 2). ×p = not significant; *p < .05. Frequencies per column are cumulative.

Table 7.8
Relationship between event frequency and DOW errors

                 Event frequency (count, %)
Dating error     Once           2-3 times      More often     Total (%)      χ²        V
No error         862 (70.2)     102 (62.9)     51 (61.0)      1015 (65.0)    18.11×    .11
1 DOW e.         270 (18.4)     39 (23.1)      13 (19.4)      322 (20.6)
2 DOW e.         106 (7.0)      18 (7.7)       6 (7.4)        130 (8.3)
3 DOW e.         71 (5.4)       12 (7.0)       11 (13.6)      94 (6.0)
Total            1309 (83.9)    171 (11.0)     81 (5.2)       1561 (100)
Note. χ² is an overall test (df = 6); ×p = .06

Self–events versus other–events

H-3: Self–events are dated more accurately than other–events.
The Mann-Whitney U analysis revealed a significant difference between self–events and other–events. The mean rank of self–events was 759 (n = 1412), while the mean rank of other–events was 981 (n = 149), z(1561) = -3.94, p < .001. Tables 7.9 and 7.10 show the percentage of errors (in days and in days of the week) for both types of events and figure 7.7 shows the cumulative percentage of errors for both types of events. It is clear that self–events are dated more accurately than other–events; the pattern of the dating error is similar for both types of events, e.g., both show the “bump” of the 7 days error (visible as the “jump” from the 6 days error to the 7 days error).

The DOW is also more often correctly estimated for self–events (66.3%), but the difference in “no error” between self–events and other–events is smaller in DOW units than in days. This is because the DOW is generally easier to estimate, which results in a higher percentage of correct DOW for both types of events.

However, generalizations should be made with caution because the percentage of self–events (90.5%) is much higher than the percentage of other–events (9.5%). The reason for this difference is that personal events were stressed in the diary instruction.

Table 7.9
Dating errors in days for self–events and other–events

                 Date estimate (count, column %)
Dating error     Self-event     Other-events   Total (%)      χ²         V
No error         611 (43.3)     34 (22.8)      645 (41.3)     23.25***   .12
Error ≤ 1 day    721 (51.1)     40 (26.8)      761 (48.8)     31.64***   .14
Error ≤ 7 days   1013 (71.7)    77 (51.7)      1090 (69.8)    25.75***   .13
Total (row %)    1412 (90.5)    149 (9.5)      1561 (100)
Note. χ² is computed for each row (df = 1). ***p < .001. Frequencies per column are cumulative.

Table 7.10
DOW error for self–events and other–events

                 Date estimate (count, %)
Dating error     Self-event     Other-events   Total (%)      χ²
No error         936 (66.3)     79 (53.0)      1015 (65.0)    11.90**
1 DOW e.         284 (20.1)     38 (25.5)      322 (20.6)
2 DOW e.         113 (8.0)      17 (11.4)      130 (8.3)
3 DOW e.         79 (5.6)       15 (10.1)      94 (6.0)
Total (%)        1412 (90.5)    149 (9.5)      1561 (100)
Note. χ² is an overall test (df = 3), V = .09. **p < .01


[Graph: Self and other events and dating error in days. Lines: Self, Other. x-axis: Dating error (in days), 0 to 21. y-axis: Cumulative percentage of dating errors.]

Figure 7.7. Errors of 22 or more days were trimmed because of the very low frequencies.

Event characteristics (respondent dependent or proxy dependent)

Knowing the date versus reconstructing the date

H-4: Events with a known date are dated more accurately than events with a reconstructed date.
I have made some analyses of the date known/date reconstructed distinction in previous sections, which resulted in the finding that recency has an impact only on self–events where the date was reconstructed113. The Mann-Whitney U analysis revealed that the mean rank error of events with a reconstructed date is much higher than that of events where the date was known. The mean rank error of the “reconstructed” group of events was 873 (n = 1212), while that of the “known” group was 458 (n = 349), z(1561) = -15.75, p < .001, r = .40.

As shown in table 7.11, the percentage of “no errors” is very high when the date was known (79.9%) and much lower when the date was reconstructed (30.2%). The DOW error in table 7.12 shows a similar relationship.
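The effect size r that accompanies the Mann-Whitney test can be obtained from the z statistic, as the following minimal Python sketch shows. It assumes the common conversion r = |z| / √N; that this conversion was the one used here is an assumption, although it agrees with the reported values. The function name is hypothetical.

```python
import math


def effect_size_r(z, n):
    """Effect size r for a Mann-Whitney z statistic: |z| / sqrt(N)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(z) / math.sqrt(n)


# Known versus reconstructed date: z = -15.75, N = 1561 events.
r_known = effect_size_r(-15.75, 1561)
# Gender comparison reported later in the chapter: z = -3.83, N = 1561.
r_gender = effect_size_r(-3.83, 1561)
print(round(r_known, 2), round(r_gender, 2))
```

Both results round to the values given in the text (r = .40 and r = .10).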

113 Another reference is in the section devoted to the comparison of dating accuracy in the CAL and Non-CAL condition; there no difference between events with a reconstructed date and events with a known date was found.


Table 7.11
Relationship between knowing the date or reconstructing the date and frequency of dating errors in days

                 Date estimate (count, column %)
Dating error     Known          Reconstructed  Total (%)      χ²          V
No error         279 (79.9)     366 (30.2)     645 (41.3)     276.55***   .42
Error ≤ 1 day    299 (85.7)     462 (38.1)     761 (48.8)     245.27***   .40
Error ≤ 7 days   324 (92.8)     766 (63.2)     1090 (69.8)    112.88***   .27
Total (row %)    349 (22.4)     1212 (77.6)    1561 (100)
Note. χ² is computed for each row (df = 1). ***p < .001. Frequencies per column are cumulative.

Table 7.12
Relationship between knowing the date or reconstructing the date and frequency of DOW errors (n = 1561)

                 Date estimate (count, %)
Dating error     Known          Reconstructed  Total (%)      χ²
No error         311 (89.1)     704 (58.1)     1015 (65.0)    117.91***
1 DOW e.         31 (8.9)       291 (24.0)     322 (20.6)
2 DOW e.         4 (1.1)        126 (10.4)     130 (8.3)
3 DOW e.         3 (0.9)        91 (7.5)       94 (6.0)
Total (%)        349 (22.4)     1212 (77.6)    1561 (100)
Note. χ² is an overall test (df = 3), V = .28. ***p < .001

Importance and sharing

As shown in table 7.13, importance, sharing and emotional intensity or valence are moderately or weakly correlated. This is not a surprise because many important events are also shared114.

114 When positive (n = 863; listwise) and negative (n = 190; listwise) emotions are also taken into account, the phenomenological characteristics listed in table 7.13 are slightly more strongly related to each other for negative events than for positive events (no significant difference). However, no generalizations can be made, all the more so because the sample of negative events is very small (due to the instruction not to mention “problematic” events) and the negative events are of a varied nature, which calls for more qualitative analysis.

Table 7.13
Correlations between phenomenological characteristics

Phenomen. characteristics   Importance   Import - p   Sharing   Sharing - p
Importance                  -            .32          .40       .23
Importance – proxy          .32          -            .26       .43
Sharing                     .40          .26          -         .32
Sharing – proxy             .23          .43          .32       -
Note. All Kendall’s tau b correlations are significant at least at the 1% significance level. Listwise n = 1403.
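Kendall’s tau b, the coefficient used for these ordinal ratings, can be computed as in the following minimal Python sketch. It uses the textbook pairwise definition with the tie adjustment in the denominator; the function name and the example ratings are hypothetical, and the quadratic loop is meant for illustration rather than for large samples.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau b for two equally long sequences of ordinal scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: enters neither count
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + only_x_tied)
        * (concordant + discordant + only_y_tied)
    )
    if denominator == 0:
        raise ValueError("tau b is undefined when a variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical 3-point ratings of importance and sharing for six events.
importance = [1, 1, 2, 2, 3, 3]
sharing = [1, 2, 1, 2, 2, 3]
tau = kendall_tau_b(importance, sharing)
print(round(tau, 2))
```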

H-5: Event importance is positively related to dating accuracy.
Events rated as “highly important” were dated on average 2.1 days more accurately than events rated as of “average, usual” importance and 3.8 days more accurately than events rated as “not really” important. The Kruskal-Wallis test of variance indicated a significant effect of event importance on dating accuracy (in days), H(2, N = 1561) = 59.93, p < .001, partial η² = .04; the mean ranks are 672, 810 and 891.

As is shown in figure 7.8, the relationship gets weaker as the dating error increases. The pattern of dating error is similar for all three levels of importance and shows the “bumps” at errors that are multiples of 7 days.

[Graph: Relationship between event importance and percentage of dating errors. Lines: Highly, Average usual, Not really. x-axis: Dating error (in days), 0 to 28. y-axis: Cumulative percentage of dating errors.]

Figure 7.8. Note. Errors bigger than 28 days were trimmed (n = 55). Total n = 1561

Table 7.14 describes the relationship between event importance and different amounts of dating error in more detail and shows the frequencies of the dating error as well. As can be seen, 41.3% of all cases were correctly dated regardless of the event importance. Most events (42.3%) were rated as having “average, usual” importance, followed by “yes, a lot” important (34.8%) and “not really” important (22.8%). When people rate an event as “yes, a lot” important, the dating error is zero in 52.8% of the cases (and in 47.2% there is an error of at least one day). On the other hand, when people rate an event as “not really” important, the event date is accurately estimated in only 30.9% of the cases. The DOW is generally more often correctly estimated, as shown in table 7.15, which is why the difference in the percentage of correct DOW estimates among the three levels of importance is not as big as for the dating error in days.

To sum up, when events are more important they are more often correctly dated. The relationship for expected importance rated by proxies, shown in tables 7.16 and 7.17, closely resembles the relationship for event importance rated by respondents. Only the effect is slightly smaller.

Table 7.14
Relationship between event importance and frequency of dating errors in days

                 Event is important (count, %)
D. error         Yes, a lot     Aver., usual   Not really     Total (%)      χ²         V
No error         287 (52.8)     248 (37.5)     110 (30.9)     645 (41.3)     49.23***   .18
Error ≤ 1 day    333 (61.2)     299 (45.2)     129 (36.2)     761 (48.8)     59.41**    .20
Error ≤ 7 days   427 (78.5)     446 (67.5)     217 (61.0)     1090 (69.8)    34.43***   .15
Total            544 (34.8)     661 (42.3)     356 (22.8)     1561 (100)
Note. χ² is computed for each row (df = 2). ***p < .001; **p < .01. Frequencies per column are cumulative.

Table 7.15
Relationship between event importance and frequency of DOW errors

                 Event is important (count, %)
Dating error     Yes, a lot     Average, usual   Not really     Total (%)      χ²        V
No error         382 (70.2)     416 (62.9)       217 (61.0)     1015 (65.0)    18.11**   .11
1 DOW e.         100 (18.4)     153 (23.1)       69 (19.4)      322 (20.6)
2 DOW e.         38 (7.0)       51 (7.7)         41 (11.5)      130 (8.3)
3 DOW e.         24 (4.4)       41 (6.2)         29 (8.1)       94 (6.0)
Total            544 (34.8)     661 (42.3)       356 (22.8)     1561 (100)
Note. χ² is an overall test (df = 6); **p < .01

Table 7.16
Relationship between events expected importance (rated by proxy) and frequency of dating errors in days

                 Expected importance (count, %)
D. error         Yes, a lot     Aver., usual   Not really     Total (%)      χ²        V
No error         197 (45.1)     271 (40.1)     114 (33.2)     564 (39.9)     10.82**   .09
Error ≤ 1 day    212 (61.2)     314 (45.2)     137 (36.2)     663 (48.8)     13.45**   .10
Error ≤ 7 days   291 (78.5)     457 (67.5)     217 (61.0)     965 (69.8)     8.70*     .01
Total            544 (34.8)     661 (42.3)     356 (22.8)     1561 (100)
Note. χ² is computed for each row (df = 2). **p < .01; *p < .05. Frequencies per column are cumulative.

Table 7.17
Relationship between events expected importance (rated by proxy) and frequency of DOW errors

                 Expected importance (count, %)
Dating error     Yes, a lot     Average, usual   Not really     Total (%)      χ²        V
No error         263 (66.2)     443 (65.6)       205 (59.8)     911 (64.4)     18.48**   .08
1 DOW e.         89 (22.4)      142 (21.0)       67 (19.5)      298 (21.1)
2 DOW e.         32 (7.0)       37 (7.7)         39 (11.5)      118 (8.3)
3 DOW e.         13 (3.3)       43 (6.4)         32 (9.3)       88 (6.2)
Total            397 (28.1)     675 (47.7)       343 (24.2)     1415 (100)
Note. χ² is an overall test (df = 6); **p < .01

H-6: Event sharing is positively related to dating accuracy.
The Kruskal-Wallis test of variance indicated a significant effect of event sharing on dating accuracy in days, H(2, N = 1561) = 32.00, p < .001, partial η² = .02; the mean ranks are 669, 780 and 855. Thus the more people share the events, the more accurate the date estimates of these events are. When proxies rated the sharing with the respondent, this was also significantly related to dating accuracy, H(2, N = 1411)115 = 9.53, p = .009, partial η² = .007; the mean ranks are 681, 684 and 752. This relationship is slightly weaker than when event sharing is rated by the respondent (see table 7.18); the respondent can of course share the event with other people as well, which may be the cause of the difference. Table 7.20 shows that the relationship of event sharing with the DOW error is not as clear as with the dating error in days. Only the category “no sharing” has a lower frequency of accurate DOW estimates in comparison to the other two categories. This is supported by the adjusted standardized residuals, which indicate that only the two extreme categories significantly contribute to the model (= are greater than the critical value of 1.96 for p < .05).

115 Some proxies did not rate this variable.

Table 7.18
Relationship between event sharing and frequency of dating errors in days

                 Sharing the event (count, %)
Dating error     Many times     Several times (1-3)   No             Total (%)      χ²         V
No error         166 (55.1)     332 (41.0)            147 (32.7)     645 (41.3)     37.67***   .16
Error ≤ 1 day    179 (59.5)     401 (49.5)            181 (42.2)     751 (53.2)     27.12***   .13
Error ≤ 7 days   235 (78.1)     562 (69.4)            293 (65.1)     1090 (69.8)    14.54**    .10
Total            222 (15.7)     732 (51.9)            457 (32.4)     1411 (100)
Note. χ² is computed for each row (df = 2). ***p < .001; **p < .01. Frequencies per column are cumulative.

Table 7.19
Relationship between event sharing rated by proxy and frequency of dating errors in days

                 Sharing the event with proxy (count, %)
Dating error     Many times     Several times (1-3)   No             Total (%)      χ²       V
No error         91 (41.0)      312 (42.6)            158 (28.2)     561 (39.8)     7.78*    .02
Error ≤ 1 day    112 (50.5)     357 (48.8)            191 (41.8)     751 (53.2)     6.93*    .03
Error ≤ 7 days   157 (70.7)     510 (69.7)            293 (64.1)     960 (68.0)     4.87×    .09
Total            222 (15.7)     732 (51.9)            457 (32.4)     1411 (100)
Note. χ² is computed for each row (df = 2). ***p < .001; **p < .01; ×p = .088. Frequencies per column are cumulative.


Table 7.20
Relationship between event sharing and frequency of DOW errors (n = 1561)

                 Sharing the event (count, %)
Dating error     Many times     Several times (1-3)   No             Total (%)      χ²         V
No error         226 (75.1)     529 (65.3)            260 (57.8)     1015 (65.0)    24.42***   .09
1 DOW e.         47 (15.6)      169 (20.9)            106 (23.6)     322 (20.6)
2 DOW e.         15 (5.0)       69 (8.5)              46 (10.2)      130 (8.3)
3 DOW e.         13 (4.3)       43 (5.3)              38 (8.4)       94 (6.0)
Total            301 (19.3)     810 (51.9)            450 (28.8)     1561 (100)
Note. χ² is an overall test (df = 6); ***p < .001

Table 7.21
Relationship between event sharing rated by proxy and frequency of DOW errors (n = 1561)

                 Sharing the event with proxy (count, %)
Dating error     Many times     Several times (1-3)   No             Total (%)      χ²       V
No error         141 (63.5)     493 (67.3)            273 (57.8)     907 (64.3)     9.24×    .06
1 DOW e.         52 (23.4)      142 (19.4)            104 (23.6)     298 (21.1)
2 DOW e.         15 (6.8)       58 (7.9)              45 (10.2)      118 (8.4)
3 DOW e.         14 (6.3)       39 (5.3)              35 (8.4)       88 (6.2)
Total            222 (15.7)     732 (51.9)            457 (32.4)     1411 (100)
Note. χ² is an overall test (df = 6); ×p = .16

Confidence in date estimates

H-7: Confidence in DOW estimates is positively related to dating accuracy.
Confidence was measured separately for the DOW estimate and the week estimate. As is shown in table 7.22, when respondents are confident in their DOW estimates they are correct in 83.2% of the cases, but only in 37.5% of the cases when they are not confident. This effect is one of the strongest among all predictors (V = .48). The effect is significant for proxies’ ratings as well (see table 7.23), but it is weaker (V = .34).


Table 7.22
Relationship between confidence in DOW estimate and frequency of DOW errors

                 Confidence in correct DOW (count, %)
Dating error     Yes            No             Total (%)      χ²          V
No error         782 (83.2)     233 (37.5)     1015 (65.0)    361.01***   .48
1 DOW e.         117 (12.4)     205 (33.0)     322 (20.6)
2 DOW e.         25 (2.7)       105 (16.9)     130 (8.3)
3 DOW e.         16 (1.7)       78 (12.6)      94 (6.0)
Total            940 (60.2)     621 (39.8)     1561 (100)
Note. χ² is an overall test (df = 3); ***p < .001

Table 7.23
Relationship between expected accuracy of DOW estimate (rated by proxy) and frequency of DOW errors

                 Confidence in correct DOW (count, %)
Dating error     Yes            No             Total (%)      χ²          V
No error         267 (45.6)     659 (77.9)     926 (64.7)     160.67***   .34
1 DOW e.         180 (30.7)     118 (13.9)     298 (20.8)
2 DOW e.         78 (13.3)      41 (4.8)       119 (8.3)
3 DOW e.         61 (10.4)      28 (3.3)       89 (6.2)
Total            586 (40.9)     846 (59.1)     1432 (100)
Note. χ² is an overall test (df = 3); ***p < .001

H-8: Confidence in week estimates is positively related to dating accuracy.
Respondents rated their confidence in week estimates by two numbers: the number of weeks backwards and the number of weeks forwards. These two numbers together make a “confidence interval” of weeks in which they think the event almost surely happened. Kendall’s tau b correlation between the confidence interval in weeks and the week error is τ = .39, p < .001, n = 1561, showing that when respondents are less confident (the confidence interval is wider) the week error is bigger. When respondents are confident in the week estimate they are correct in 78% of the cases (n = 556) and in 23% of the cases (n = 78) they have only a 1 week error. This relationship between full confidence and other levels of confidence is highly significant and relatively strong, χ²(7, N = 1561) = 334.39, p < .001, V = .46. The analyses for proxies show a similar relationship, though a weaker one (V = .28).

Respondents were in 588 cases confident both in the week estimate and in the DOW estimate (37.7% out of the total number of events, n = 1561). 67 respondents swore under oath at least once and 67 respondents did not swear under oath at least once116. When respondents indicated that they were confident in both the DOW estimate and the week estimate, they were asked whether they would “swear under oath” that the date is correct. In 395 cases respondents swore under oath and in 84.8% of these cases they were right, but in 60 cases (15.2%) they were one or more days wrong (see table 7.24 and figure 7.9). On the other hand, when respondents did not want to “swear under oath” they had a good reason for that because the date estimate was correct in only 49.2% of the cases. It thus seems that “under oath confidence” is a good predictor that helps to distinguish the “very” confident date estimates from the “confident” date estimates (the Cramer’s V is also one of the highest, V = .37).

Table 7.24
Relationship between “under oath confidence” and dating accuracy

                 Under oath confidence (count, %)
Dating error     Yes            No             Total (%)      χ²         V
No error         335 (84.8)     96 (49.2)      450 (63.9)     80.48***   .37
≥1 day error     60 (15.2)      194 (62.8)     254 (36.1)
Total            395 (67.2)     193 (32.8)     588 (100)
Note. χ² (df = 1). ***p < .001

[Graph: Dating errors made "under oath" (no error excluded). x-axis: Dating error in days (1, 2, 4, 7, 8, 13, 14, 20, 21, 28, 30, 42). y-axis: Frequency of dating errors (0 to 20).]

Figure 7.9. Frequency of dating errors (in days). Correct date estimates are excluded.

116 These are not the same respondents (though most of them are because the total number of respondents is 78). It happened by chance that the two frequencies are the same.

Respondent characteristics

As expected, no relationship was found between dating accuracy and the different levels of education, motivation and tiredness. This is due more to the lack of variability and the design of my study than it is evidence that no such relationship exists.

Self-evaluation of memory

H-10: Respondents’ self-evaluation of memory is positively related to dating accuracy.
Respondents who rate their memory for dates as “rather better than others” (n = 6) are on average 0.7 days more accurate than respondents who rate their memory as “similar to others” and 1.9 days more accurate than those who rate their memory as “rather worse than others” (in exact days). The Kruskal-Wallis test of variance indicated that this relationship is not significant, H(2, N = 75)117 = 2.51, p = .29; the mean ranks are 33, 35 and 43. The relationship is not significant even when the categories “rather better than others” and “similar to others” are merged or when the first category is excluded. The reason is that the variance is generally very big regardless of the self-evaluation of memory.

Gender

H-9: Women are on average more accurate than men.
The Mann-Whitney U analysis revealed a significant difference between the mean ranks of men and women. The mean rank error of events estimated by men was M = 818 (n = 868), while for women it was M = 693 (n = 693), z(1561) = -3.83, p < .001, r = .10. Women were on average 1.6 days more accurate than men. Tables 7.25 and 7.26 show the frequencies of dating errors and DOW errors.

117 Three respondents did not answer this question.

Table 7.25
Relationship between gender and frequency of dating errors in days

                 Date estimate (count, column %)
Dating error     Men            Women          Total (%)      χ²         V
No error         327 (37.7)     318 (45.9)     645 (41.3)     10.72**    .08
Error ≤ 1 day    391 (45.0)     370 (53.4)     761 (48.8)     10.74**    .08
Error ≤ 7 days   567 (65.3)     523 (75.5)     1090 (69.8)    18.83***   .11
Total (row %)    868 (55.6)     693 (44.4)     1561 (100)
Note. χ² is computed for each row (df = 1). **p < .01; ***p < .001. Frequencies per column are cumulative.

Table 7.26
Relationship between gender and frequency of DOW errors (n = 1561)

                 Date estimate (count, %)
Dating error     Men            Women          Total (%)      χ²
No error         532 (61.3)     483 (69.7)     1015 (65.0)    12.12**
1 DOW e.         196 (22.6)     126 (18.2)     322 (20.6)
2 DOW e.         81 (9.3)       49 (7.1)       130 (8.3)
3 DOW e.         59 (6.8)       35 (5.1)       94 (6.0)
Total (%)        868 (55.6)     693 (44.4)     1561 (100)
Note. χ² is an overall test (df = 3), V = .09. **p < .01

Difficulty of the task

H-11: Respondents’ evaluation of the overall difficulty of estimating the date is negatively related to dating accuracy.
When respondents rated the difficulty of the task on a 4-point scale after estimating the date of all events (see table 7.27), this rating was highly related to dating accuracy. The mean rank of the week error was 49 when respondents rated the difficulty of estimating the week as “very hard” and only 23 when the difficulty was described as “rather easy” (see table for all ranks). The Kruskal-Wallis test of variance indicated that this relationship is significant and that the more difficult the week estimate, the worse the week accuracy is, H(3, N = 78) = 12.12, p = .007, partial η² = .15.

For the DOW error no relationship was found. The reason is probably that most of the respondents chose the categories “rather hard” and “rather easy” and the other two levels were hardly used. The mean DOW error of the “rather hard” category is 0.67 days and the mean DOW error of the “rather easy” category is 0.47 days, showing that there could be some relationship between the DOW error and the overall difficulty rating118.

Table 7.27
Frequency of the overall difficulty ratings after estimating the date of all target events for DOW and week (count and %)

Difficulty     DOW (%)       Week (%)
very hard      2 (2.6)       17 (21.8)
rather hard    33 (42.3)     47 (60.3)
rather easy    36 (46.2)     13 (16.7)
very easy      7 (9.0)       1 (1.3)
N              78            78

Number of landmarks

H-12: Number of self-generated landmarks is positively related to dating accuracy.
A moderate relationship was found between the number of self-generated landmarks and dating error (τ = -.39, p = .001, n = 40) in the CAL condition. This relationship would probably be even stronger if respondents had had the chance to generate as many landmarks as they could (ceiling effect). Some respondents would have been able to generate many more landmarks, but they were told that the number of landmarks they had generated was enough (when they reached 15). Respondents were encouraged to report more landmarks when they had not reached at least 7 landmarks119.

Summary of the effect sizes of event and respondent characteristics

Table 7.28 provides the summary of the effect sizes. Cramer’s V is used for event characteristics and other effect size measures for respondent characteristics. The maximum value is 1 and the minimum value is 0. Zero means no relationship.

118 It must be emphasized that the mean is biased by the extreme values and that, when parametric tests are applied, the assumption of normality would be violated as well as the assumption of DOW being a continuous measure. But as mentioned, when the non-parametric analysis was made, no relationship was found.

119 This facilitation, however, was only mild because I did not want to put too much burden on the respondents. When they said that they could not think of any additional landmarks (and the total number of landmarks was lower than 7) I probed them only once.

Table 7.28 Summary table of the effect sizes of event characteristics (only personal events)

Predictors                                  Effect size
                                            Dating error   DOW error
Event characteristics (Cramér’s V)
  Exact date schema                         .15            -
  Week schema                               -              .42
  Month schema (effect on week)             .13            -
  Frequency                                 .04NS          .11NS
  Self–event / Other–event                  .12            .09
  Date known / Reconstructed                .42            .28
  Importance                                .18            .11
  Expected importance (proxy)               .09            .08
  Sharing the event                         .16            .09
  Sharing the event (proxy)                 .02            .06
  Uniqueness                                .16            .17
  Confidence in DOW                         -              .48
  Confidence in week¹                       .46
  Expected accuracy of DOW (proxy)          -              .34
  Expected accuracy of week (proxy)         .28            -
  Under oath confidence                     .37            -
Respondent characteristics
  Self-evaluation of memory (partial η²)    .007NS         -
  Gender (V)                                .08            .09
  Overall difficulty (partial η²)           .16            -
  Number of landmarks (tau b)               -.14           -
Note. Dating error: effect size of no error versus any error. DOW error: effect size for all DOW errors. NS = not significant. Sample of events n = 1561. ¹ = compares the full confidence with all other confidence levels.

Public events

Relationship between event recency and dating accuracy of public events

Only 4 public events were included in the sample of target events. Their “age” (recency) differed considerably, which could have an impact on the dating accuracy of these events. Table 7.29 shows the age of these events and the rank correlation between age and dating error in days. The only significant correlation was found for “Annie found dead”. The reason is quite obvious: this event was the most remote of all target events (both public and personal). An even bigger correlation would probably be found if the calendar boundary was not restricted to March 14, 2011. The correlations for women and men were similar. The only difference is in the first event, where the correlation for women did not reach significance (p = .089), although its size is similar to that for men.

Table 7.29 Rank correlation of event recency of public events and dating error in days.

Public events   Annie      Wedding    Osama     Hockey
τ               .22**      -.05×      -.08×     .05×
n               75         77         78        78
True date       March 16   April 29   May 2     May 15
Note. × = not significant; **p < .01

Confidence in date estimates of public events

H-13: Confidence in DOW estimates is positively related to dating accuracy. As shown in table 7.30, when respondents are confident in their DOW estimate they are correct in 73.4% of cases, but only in 29.5% of cases when they are not confident. This effect is one of the strongest of all predictors (V = .39).

Table 7.30 Relationship between confidence in DOW estimate and frequency of DOW errors (n = 308)

                Confidence in correct DOW (count, %)
Dating error    Yes          No            Total (%)     χ²         V
No error        47 (73.4)    72 (29.5)     119 (38.6)    46.76***   .39
1 DOW e.        14 (21.9)    71 (29.1)     85 (27.6)
2 DOW e.        3 (4.7)      68 (27.9)     71 (23.1)
3 DOW e.        0 (0)        33 (13.5)     33 (10.7)
Total           64 (20.8)    244 (79.2)    308 (100)
Note. χ² is an overall test (df = 3); ***p < .001
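The χ² and V values in table 7.30 can be reproduced from the cell counts. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction and the usual definition V = sqrt(χ² / (N · min(r − 1, c − 1))). The function names are hypothetical, and this is not the software output used in the study.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (N * min(r - 1, c - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(pearson_chi2(table) / (n * k))


# Counts from table 7.30: rows = 0, 1, 2, 3 days of DOW error;
# columns = confident in the DOW estimate (yes, no).
dow_by_confidence = [
    [47, 72],
    [14, 71],
    [3, 68],
    [0, 33],
]

chi2 = pearson_chi2(dow_by_confidence)
v = cramers_v(dow_by_confidence)
print(f"chi2 = {chi2:.2f}, V = {v:.2f}")
```

With these counts the sketch gives χ² ≈ 46.76 and V ≈ .39, matching the values in the table.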

H-14: Confidence in week estimates is positively related to dating accuracy. Respondents rated their confidence in week estimates by two numbers: the number of weeks backwards and the number of weeks forwards. Together these two numbers make a “confidence interval” of weeks in which respondents think the event almost surely happened. Kendall’s tau-b correlation between the width of the confidence interval in weeks and week error is τ = .32, p < .001, n = 1561, showing that when respondents are less confident (the confidence interval is wider) the week error is bigger. The size of the correlation is almost identical to that found for personal events.
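To make the measure concrete, the sketch below builds the interval width from the two ratings and correlates it with week error using Kendall's tau-b. It is an illustration only: the five data points are invented, the variable and function names are hypothetical, and the simple pairwise algorithm stands in for whatever statistical package was actually used.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b (with tie correction), computed over all pairs."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: drops out of the formula
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denominator


# Invented ratings for five events (NOT data from the study).
weeks_back = [1, 0, 2, 3, 1]
weeks_forward = [1, 1, 2, 4, 0]
week_error = [0, 0, 1, 3, 1]

# Width of the "confidence interval" = weeks backwards + weeks forwards.
interval_width = [b + f for b, f in zip(weeks_back, weeks_forward)]

tau = kendall_tau_b(interval_width, week_error)
print(f"tau-b = {tau:.2f}")
```

A positive value means that wider intervals go together with larger week errors, which is the direction of the relationship reported above.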

H-15: Public events that are associated with personal events are dated more accurately than events without this association. Table 7.31 shows the frequencies of dating errors for all four public events separately. For example, 3 of the 72 respondents who did not report any personal association with “Annie” estimated the date correctly. On the other hand, none of the 3 respondents who had some personal association with “Annie” estimated the date correctly. All remaining public events were estimated correctly more often when the association was available; the strongest effect is for Ice hockey (V = .35, or .36 for up to 1 day error). Table 7.32 gives the mean ranks of the errors with and without an association. This table also presents the frequency and size of all errors made when an association was available.

Table 7.31 Impact of the association of public event with personal event on dating accuracy in days

                                  Association with public event
Event              Dating error   No     Yes    χ²         V
Annie (n = 75)     No error       3      0      0.13×      .04
                   ≤ 1 day e.     6      0      0.27×      .06
                   Total N        72     3
Wedding (n = 77)   No error       6      12     9.32**     .35
                   ≤ 1 day e.     9      12     5.39*      .27
                   Total N        49     28
Osama (n = 78)     No error       2      2      7.24**     .31
                   ≤ 1 day e.     5      3      7.20**     .30
                   Total N        70     8
Hockey (n = 78)    No error       5      17     9.32**     .35
                   ≤ 1 day e.     13     19     10.03**    .36
                   Total N        48     30
Note: χ² is computed for each event (df = 1). **p < .01, *p < .05, × = not significant.

Table 7.32 Mean ranks of errors and frequency of errors when the public event was or was not associated with a personal event

                    Association (mean rank)
Event               No      Yes     z       p        f error   Amount of errors×
Annie (n = 75)      38.2    32.3    -0.46   .67      3         6, 7, 42
Wedding (n = 77)    44.6    29.2    -2.94   .003     16        3, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 13, 14, 23
Osama (n = 78)      40.7    29.2    -1.36   .174     6         1, 5, 10, 12, 14, 23
Hockey (n = 78)     47.6    26.5    -4.06   < .001   13        1, 1, 2, 2, 5, 7, 7, 8, 8, 8, 9, 10, 50
Note: Mean ranks are presented. f error = frequency of errors for events with an association (correct estimates excluded). z = z from the Mann-Whitney U test. p = significance of the Mann-Whitney U test. × = amount of error for date estimates with an association only.

Comparison of dating accuracy in CAL and Non-CAL condition

The mean rank of dating error in days is 154.14 (n = 150) in the Non-CAL condition and 154.84 (n = 158) in the CAL condition. The finding that there is no difference between the two conditions holds when the analysis is computed separately for women and men or separately for each public event. None of the analyses were even close to significance or showed substantive differences between the conditions (most had p ≥ .5). Analysis of DOW error also did not find any relationship, χ²(3, N = 308) = 5.43, p = .14, V = .13. The conclusion is that for the chosen events a calendar instrument interview does not have any effect on dating accuracy in comparison to an interview without a calendar instrument.

7.4 Discussion

This study had two major aims: exploration of event and respondent predictors of dating accuracy, and evaluation of whether a calendar instrument interview leads to higher dating accuracy of recent personal or public events. I will start with the second aim. The overall discussion of all studies is in chapter 8.

Does the calendar instrument interview (CAL) lead to more accurate date estimates in comparison to the Non-CAL interview?

Calendar instruments are generally recognized as techniques that lead to increased (or at least unchanged) data quality, and hardly any disadvantages of these instruments have been raised (see the beginning of the discussion section in Study II; everything mentioned there applies to this study as well).

Respondents in my Non-CAL condition could use a plain calendar because in real-life situations almost nobody would estimate the exact date of a recent event without a calendar in which the DOW and the exact days of the month are visible.

I did not find any effect of the CAL condition on dating accuracy when compared to results in the Non-CAL condition. Analyses were made separately for DOW and overall dating error in days, and other influential variables were controlled as well (e.g., gender, date reconstructed / known). This suggests that the calendar instrument’s potential to aid recall and increase dating accuracy is limited when personal events are taken into account, which is a rather surprising finding (cf. Belli, et al., 2001; Belli, et al., 2007; Glasner & Van der Vaart, 2009).

It can be expected that if the interviewers had not provided a plain calendar in the Non-CAL condition, significant differences between the conditions would almost certainly have been found, as was the case in the Gibbons and Thompson study (2001). But as I mentioned in chapter 4, my aim was not to measure the quality of “numeracy” but the accuracy of date estimates, and the latter could be influenced by a poor ability to calculate dates when no calendar is visible (e.g., calculating which date was 6 weeks ago on Tuesday).

Apart from the fact that the (plain) calendar alone could be the most important feature of the calendar instrument (a feature that no longer distinguished the conditions once the CAL condition was compared to a Non-CAL condition with a plain calendar), there may be some other explanations of the lack of differences between the conditions.
Some landmarks were biased, and the data show that a biased landmark biases the events that are associated with it. It can be argued that in the CAL condition landmarks play a major role in estimating the dates and, because they are easily visible, they can cause more bias (when biased) than in the Non-CAL condition. In the latter condition people were also probed to use landmarks (mentally or using the events from the question list as recall aids), but because the landmarks were not visible they probably did not lead to as many errors as in the CAL condition.

Another explanation is that the impact of personal events, which usually bring along vast contextual information, did not leave the calendar instrument any chance to improve anything (apart from making the task easier because everything was visible, which respondents liked). People know the DOW of many events even without the calendar, and the exact date could be found in a calendar in both conditions. For the public events, no difference between the two conditions was found either.

Predictors of dating accuracy – event characteristics

Most of the studied predictor variables showed at least a weak relationship with dating accuracy. As shown in table 7.28, the strongest predictors of dating accuracy are the availability of temporal schemata, knowing the date straight away and high confidence in the temporal estimate. This is not a surprising finding because all three predictors are most directly connected to the date estimates or, more specifically, to date estimation strategies. As mentioned in chapter 2, these strategies are related to dating accuracy (Thompson, et al., 1993). The remaining event characteristics also predict dating accuracy, but not so strongly. This is probably because they are connected to dating accuracy only indirectly (e.g., importance, sharing) or cannot show much effect because of the short recall period (e.g., event recency). It has to be highlighted that it is easier to predict biased date estimates than correct date estimates, because even when all “strong” predictors point to a correct date estimate the chances of an incorrect estimate are still rather high.

Most of the less direct predictors (e.g., importance, frequency, self–other events distinction, sharing) provide some hints as to which events could possibly be dated more accurately, but as the effect sizes show (usually from .02 to .18) the effect is relatively small, though in most cases robust.

Predictors of dating accuracy – respondent characteristics

Respondent characteristics were on average only weakly related to dating accuracy, and some of the expected relationships were not significant at all (e.g., self-evaluation of memory before the dating task). Respondents’ evaluation of the overall difficulty of estimating the date was related to overall dating accuracy, showing that a “harder” task was related to less accurate date estimates. Gender was also found to be an important predictor of dating accuracy, supporting the finding of most studies of personal events that women are on average more accurate than men (e.g., Skowronski, et al., 1994). Education was omitted from the analyses due to the lack of variance.

Predictors of dating accuracy – public events

Confidence in the date estimates of public events also showed a relatively strong relationship to dating accuracy, as it did for personal events. Three of the events were more often associated with personal events. Public events for which an association was available were dated more accurately (effect size ranging from .27 to .36). This supports the findings from Study I, where I studied the associations as well.

8 General discussion and conclusions

8.1 Introduction

The main aim of this dissertation was to find predictors of dating accuracy of unique personal and public events and thus to gain more insight into the mechanisms that affect dating accuracy or are related to it. Studies I to III explored event and respondent characteristics, and Studies II and III also compared two ways of data collection (with and without a calendar instrument). The design of all studies was correlational, with the exception that the data collection in Studies II and III was experimentally manipulated.

I had two reasons for conducting these studies. The first was that professionals from diverse fields who collect temporal data often do not have any objective source by which the date can be verified. Knowledge of predictors of dating accuracy could help them at least to estimate the difficulty of the dating task and could warn them when it is highly probable that people will not provide accurate date estimates. A further reason to focus on dating accuracy predictors was a relative lack of literature that examines several predictors at the same time and puts them into context (this was also done in the theoretical review of this dissertation). The literature also offers many contradictory views on which characteristics should be connected to dating accuracy. For example, event recency should obviously be related to dating accuracy, but on many occasions (and also in my research) hardly any relationship is found. My aim is thus to show more predictors in one study and to compare how strongly they are related to dating error.

Many memory studies focused on dating accuracy of public events instead of personal events. The reason is quite obvious: it is difficult to obtain the true dates of personal events. But even though the general pattern of dating error may be relatively similar for both public and personal events (Kemp, 1999), the amount of dating error is often very different, and so is the relationship with various predictors. Another problem with using public events is that they are hardly ever relevant for people, because they are not connected to their personal narrative (Friedman, 2004). I have heard many times “how stupid” it is to ask about the dates of public events, but far less often that the date of a personal event is useless information (especially when relevant events were dated).

Because of this some researchers have used personal events from diary studies (e.g., Skowronski, et al., 1991; Thompson, 1982; Wagenaar, 1986). This approach certainly has many advantages, but there is a danger that the mere act of keeping the diary may have an impact on dating accuracy. My approach was thus different, and to my knowledge not many studies have used anything similar (the roommate study of Thompson mentioned above resembles my approach: his students kept a diary about themselves and also about their roommates, but it was limited to weekday events because the roommates were apart during weekends). I did not recruit individuals but couples. One member of the couple was randomly assigned to be an assistant (proxy) and the other (respondent) took part in the research and had to estimate the dates. The proxies’ role was to collect the personal events of their partner (from documents in Study II and in a diary in Study III). This approach successfully overcomes both potential problems mentioned above. Respondents did not know that any events were being collected, and the results can thus be seen as unbiased.

The second reason for conducting the studies in my dissertation (especially Studies II and III) was to offer these professionals an easy-to-use instrument that could lead to increased dating accuracy in comparison to the common ways of collecting temporal information. According to the literature (e.g., Belli, et al., 2009; Glasner & Van der Vaart, 2009), calendar instruments (e.g., event history calendar, life history calendar, timelines) should be the most promising techniques because they incorporate many recall aids into one data collection instrument.

There was, however, a lack of evidence about the conditions under which these instruments work best and when it is better not to use them (Belli & Callegaro, 2009). Even though it may seem obvious that a calendar instrument cannot have a negative effect on dating accuracy, because it provides visual help in the form of a calendar grid, utilizes personal or public landmarks, and uses various cues (parallel, sequential or top-down), it can be argued that on some occasions it can have a negative effect as well (e.g., when a temporal landmark is biased it can bias all the related events). So far, to my knowledge, only four studies have compared the effect of a calendar instrument interview with another type of interview (see chapter 3.4 for more details). The available studies (e.g., Belli, et al., 2001; Belli, et al., 2007; Van der Vaart, 2004; Van der Vaart & Glasner, 2007a) usually found a positive or neutral effect of the calendar instrument, often at the cost of slightly increased preparation time or interview time.

However, none of these studies compared two flexible interviews that differed only in the presence or absence of the calendar instrument. It can be argued that when the interview is more flexible and the interviewer tries to help the respondent as much as possible, the difference between the CAL condition and the Non-CAL condition should not be so big (if any), especially when personally relevant events are dated. My aim thus was to experimentally evaluate whether the calendar instrument increases the dating accuracy of personal events in comparison to a similar type of interview without the calendar instrument. I have asked four major research questions:
• Which event and respondent characteristics are the strongest predictors of dating accuracy among remote personal or public events (Study I, II)?
• Which event and respondent characteristics are the strongest predictors of dating accuracy among recent personal (and partially public) events (Study III)?
• Does the calendar instrument interview (CAL) lead to more accurate date estimates of remote events in comparison to the Non-CAL interview (Study II)?
• Does the calendar instrument interview (CAL) lead to more accurate date estimates of recent events in comparison to the Non-CAL interview (Study III)?

The following sections will try to answer these questions.

8.2 Event and respondent predictors – recent and remote events

Which event and respondent characteristics are the strongest predictors of dating accuracy among remote personal or public events (Study I, II)?

Which event and respondent characteristics are the strongest predictors of dating accuracy among recent personal or public events (Study III)?

Because the impact of accuracy predictors was very similar on remote and recent events I will deal with them together, highlighting the differences when found.

Event characteristics

The studies provided evidence that the strongest event characteristics are temporal schemata (both for public and personal events), confidence ratings of expected accuracy (both for public and personal events) and the dating strategy: events for which people knew the date straight away were dated more accurately than events for which the date had to be reconstructed (applies to Studies II and III only). This is not a surprising finding because all three predictors are most directly connected to the date estimates; they are closely related to the date estimation strategies, which were found to be strongly related to dating error (Thompson, et al., 1993).

One of the strongest predictors of dating accuracy was the “under oath confidence” measured in Studies II and III. This measure was adopted from Rubin, et al. (2003) and proved to be very helpful because it identified the events for which people thought they were right and led to better predictions of dating accuracy.

An interesting finding is that proxies (in Studies II and III) were able to predict the dating accuracy of their partners very well (only slightly less well than the respondents themselves). This suggests that proxies may be relatively good informants when we cannot ask the respondent.

The remaining event characteristics (such as importance, uniqueness, sharing the event) generally also predicted dating accuracy, but the effect sizes were lower, though in most cases robust (there are summary tables with effect sizes in Studies II and III). It thus seems that a predictor that is closely connected to the date estimation strategy is stronger than less direct predictors.

Study I also showed that when respondents declared an interest in some type of public events they also estimated the date of these public events better (though the relationship did not reach significance in all cases). This relationship is however weak, probably because people are generally interested in aspects of public events other than temporal ones. This evidence supports the view that when researchers provide temporal landmarks, these should be related to the interests and other characteristics of their respondents (Glasner, et al., forthcoming).

Study I also found a weak relationship between media coverage in time (culminating versus initiating events) and dating accuracy: culminating events were dated more accurately than initiating events. This provides some evidence that the “media replay” hypothesis suggested (but not supported) by Brown, Rips, and Shevell (N. R. Brown, et al., 1985) could have an impact on dating accuracy.

Study I did not measure respondent-dependent event characteristics apart from the impact of the association of a public event with a personal event. This association mostly led to increased dating accuracy, though the generalization is limited due to the low frequencies. Studies II and III found the associations more frequently and support the conclusion that when a public event is associated with personal events it is usually dated more accurately than when the same event is not associated with personal events. The reason is quite obvious. Personal events are generally more accurately estimated and provide rich contextual information from which the date can be reconstructed, especially in cases when no temporal schema is available (Conway, 2005; Janssen, et al., 2006; Kemp, 1999).

Respondent characteristics

Inspection of the respondent characteristics (e.g., age, sex, self-evaluation of memory) did not reveal any really strong predictor of dating accuracy for public or personal events. It seems that event characteristics have a much stronger effect on dating accuracy than the “overall” respondent characteristics.

Gender was found to be one of the strongest predictors of dating accuracy of personal events (but not of public events). This is in line with the research of others (e.g., Skowronski, et al., 1994).

Self-evaluation of memory was a significant predictor of dating accuracy in Studies I and III. Study II surprisingly did not find any relationship, but this could be caused by the relatively small sample. On the other hand, the overall difficulty of the dating task (which was measured after the dating task) proved to be a stronger predictor of dating accuracy than the self-evaluation before the dating task. This is not surprising, because once respondents knew the events they could adjust their evaluations. Both predictors show that even for dating accuracy people have some insight and “feel” how accurate they generally are (Cohen, 2008b).

This does not imply that there are no respondent differences; the contrary is true. Many respondents were very accurate and many very inaccurate. It only implies that when the set of events is drawn from many areas, respondent characteristics do not predict dating accuracy well and event characteristics seem to be stronger predictors.

8.3 Impact of the calendar instrument on dating accuracy

Does the calendar instrument interview (CAL) lead to more accurate date estimates of remote events in comparison to the Non-CAL interview (Study II)?

Does the calendar instrument interview (CAL) lead to more accurate date estimates of recent events in comparison to the Non-CAL interview (Study III)?

As mentioned in the introductory section, calendar instruments are generally recognized as techniques that lead to increased (or at least unchanged) data quality, and hardly any disadvantages of these instruments have been raised. Even though I had doubts whether the calendar instrument “works” for personal events when the interviewer in the control group does everything possible to aid recall (as in the CAL condition), the outcome that no difference between the conditions was found for either public or personal events is rather surprising. A calendar is a good visual aid and should at least help respondents to orient themselves better. This was supported by the reports of respondents, who were generally very happy that they could use the calendar and found it useful both for orientation and for improving their date estimates.

One of the reasons for not finding any difference could be the small sample. This was especially a problem in Study II, where the CAL condition included only seven women. However, Study III is not so limited and also did not find any differences between the conditions.

When I explored the reasons why no difference was found, I found that quite a few freely recalled temporal landmarks in both studies were biased. This could be established only for landmarks whose true date was known (the landmark was the same as a target event whose true date was known from the proxy), but it provides strong evidence that using personal landmarks may have both benefits and disadvantages, as was recently highlighted by Glasner, et al. (forthcoming).

8.4 Limitations of the empirical studies

There are several limitations of the empirical studies. The more specific and less important limitations are explored in the discussion sections of the actual studies in chapters 5 to 7. Here, I will focus on the major limitations only: boundary effect, education, bias of the true dates of personal events, and sample size for making comparisons between CAL and Non-CAL conditions.

The boundary was known in all studies (2005–2008 in Studies I and II and March 14, 2011–May 5, 2011 in Study III). The boundary effect could thus bias the date estimates: the estimates of events close to the boundaries cannot be telescoped (moved in time) so much because they hit the boundary limit. In Studies I and II the effect of the boundary could be relatively high, because in many cases people estimated the date to lie outside the boundaries (both ways). The same phenomenon was observed in Study II, where respondents sometimes wanted to provide a date estimate that was outside the 2005–2008 reference period. Study III used a calendar (in both conditions) that covered 12 weeks, but the personal events came from a period of only 6 weeks. Thus the boundary could not have as much effect as in the other two studies. This is supported by the interviewers’ observation that respondents hardly ever mentioned that the calendar was not long enough.

Most respondents, in particular in Studies II and III, were highly educated. Even though education should not generally have a strong effect on dating accuracy (of personal events), it can be expected that people with a low level of education may be worse at estimating dates. This especially applies to dating public events, where lower cognitive abilities (which are often connected with lower education) together with lower knowledge of public events may play an important role (Howes & Katz, 1992).

Using personal events in Studies II and III carries the risk that some true dates of these events were wrong, for example because the proxies recorded the wrong date by mistake or found a document with the wrong date. Nevertheless, I do not expect a systematic bias of the date estimates because many checks were made and all problematic events (not many) were excluded.

The comparison of interviews with a calendar instrument (CAL) and interviews without a calendar instrument (Non-CAL) was limited especially in Study II, where the actual sample was smaller than intended. This was because for some proxies (especially males) collecting enough events with true dates was too difficult a task. Unfortunately, as a result only seven women remained in the CAL condition (compared to 12 men). Because gender is an important predictor of dating accuracy whose effects could potentially be moderated by the experimental condition, the gender imbalance weakens generalizations about the impact of the calendar instrument on women and men. Study III has an almost balanced proportion of men and women, and the generalizations about the interaction of gender and experimental condition are thus stronger (though the number of respondents could also be higher). The response rate in Study I was low, which makes generalizations problematic, especially for people with lower education.

8.5 Implications for future research

The empirical studies brought some insight into areas that may be worth further exploration. These are:
• Using proxies to collect events from their partner’s life proved to be a very good solution to the problems raised by diary studies (e.g., that keeping the diary may change which events are remembered, and that people are aware of the period in which they kept the diary). The only potential problem found was that some proxies were unable to collect enough events when more remote events were collected. But I expect that in the future this may not be such a big problem because more and more people use digital cameras (the date is stored with the picture), e-mails (many people do not delete old e-mails), social networks etc. All these new sources could provide enough events that could be used in research.
• Study III showed that even a plain calendar may work well and not lead to poorer date estimates than the more sophisticated calendar instrument. Using a plain calendar does not cost anything, and future research could establish whether it is suitable even for more remote events, such as the events that were 3–5 years old in Study II.
• Event content was not analyzed in my empirical studies but may be a strong predictor of dating accuracy. For example, some types of events (domains) may be easier to date for men and others for women. Because of this I will further explore the impact of event content on dating accuracy120.
• Study II indicated that it is difficult to find public landmarks that can be expected to “work” as recall aids generally. Because of this it may be appropriate to explore public landmarks that “work” for a more specific population (e.g., people interested in sport or politics).
• Studies II and III indicate that the apparently increasing general popularity of calendar instruments (see the reviews, Belli, et al., 2009; Glasner & Van der Vaart, 2009) also brings many potential methodological issues concerning the validity of data and the design of calendar instruments. Even though “common sense” tells us that a calendar instrument should increase dating accuracy or at least have a neutral effect, on some occasions (or for some people) it may even have a detrimental effect. I have highlighted the hypothesis that especially the use of personal landmarks, which are sometimes biased, may have a negative effect on dating accuracy. The problem of biased temporal landmarks could possibly be partially solved by using stricter instructions (e.g., record only those events where you are 100% sure that they happened on the chosen date). But as shown in the section “under oath confidence”, even when respondents were “totally confident” they sometimes made errors in date estimates.
• My studies, as well as most studies mentioned in the review part of this dissertation, focused on date estimates of well-educated respondents. Comparing how people with low and high education date personal events could thus show whether the date estimates of these groups are similar or not.

120 Eva Literáková is writing her thesis on this topic and we plan to publish the results when available.


Appendices


Appendix 1: Calendar instrument in Study II (CAL)

Note. The calendar is trimmed. listopad = November; prosinec = December; LEDEN = January; ÚNOR = February; BŘEZEN = March; ČERVENEC = July; ... The black rectangles were target events to be dated. The red rectangle is a self-generated personal landmark. (In the calendar, target events are printed on white stickers and landmarks are recorded on coloured stickers.)


[Figure: the full calendar instrument (rotated in the original). It runs from October 2005 to March 2008, with the months of 2006 and 2007 numbered 1–12, an axis labelled from “older events” to “recent events”, and the public holidays and other annual events listed below marked in the relevant months.]

Note. The text was horizontal in the calendar instrument (as visible in the trimmed calendar above).


Public holidays and other events mentioned in the calendar (Translations of public holidays and some descriptions are from: http://en.wikipedia.org/wiki/Public_holidays_in_the_Czech_Republic):

| Date | English Name | Czech Name | Description |
|---|---|---|---|
| 1 January | New Year's Day | Nový rok | People usually celebrate the New Year from the 31st December to the morning of 1st January. |
| 14 February | Saint Valentine's Day | Svatý Valentýn | This day is becoming more and more popular, especially among younger people. For people over 40 this is usually not an issue (St Valentine’s Day was “imported” into the Czech Republic after the revolution). |
| 8 March | International Women's Day | Mezinárodní den žen | International Women's Day. Women often get a flower or small present from men. |
| March, April | Easter Monday | Velikonoční pondělí | Easter is celebrated for two days (Sunday and Monday) in the Czech Republic but longer for religious people. For children Monday is especially interesting because they get Easter eggs and sweets, and males whip females lightly with willow twigs to symbolically chase away illness and bring health and youth to them. Many traditions are still observed and practiced, especially in villages, and different regions may have their own Easter traditions and customs. |
| 1 May | Labour Day | Svátek práce | Labour Day |
| 8 May | Liberation Day | Den vítězství | 1945, the end of the European part of World War II. |
| 1 June | International Children's Day | Mezinárodní den dětí | There are often many attractions for children on this day. Especially relevant for families with small children. |
| 5 July | Saints Cyril and Methodius Day | Den slovanských věrozvěstů Cyrila a Metoděje | In 863, Church teachers St. Cyril (Constantine) and Metoděj (Methodius) came from the Balkans to Great Moravia to propagate Christian faith and literacy. |
| 6 July | Jan Hus Day | Den upálení Jana Husa | The religious reformer Jan Hus was burned at the stake in 1415. |
| 28 September | St. Wenceslas Day (Czech Statehood Day) | Den české státnosti | In 935, St. Wenceslas, Duke of Bohemia, now patron of the Czech State, was murdered by his brother. |
| 28 October | Independent Czechoslovak State Day | Den vzniku samostatného československého státu | Creation of Czechoslovakia in 1918. |
| 17 November | Struggle for Freedom and Democracy Day | Den boje za svobodu a demokracii | Commemorating the student demonstration against Nazi occupation in 1939, and nowadays especially the demonstration in 1989 that started the Velvet Revolution. |
| 5-6 December | St Nicholas Day | Svatý Mikuláš | On the day before St Nicholas Day (6th), children are visited by St Nicholas (dressed as a bishop), an angel and a devil. They quiz children on whether they have behaved themselves over the last twelve months and then reward or punish them for good or bad behavior respectively (when children say some poem or sing a song they are always rewarded). Parents also give their children some small presents which they find in the morning. |
| 24 (25, 26) December | Christmas | Vánoce | Christmas is celebrated during the evening of the 24th. |


Appendix 2: Calendar instrument in Study III (CAL)

March 14 (MO) 15 (TU) 16 (WE) 17 (TH) 18 (FR) 19 (SA) 20 (SU)

March 21 (MO) 22 (TU) 23 (WE) 24 (TH) 25 (FR) 26 (SA) 27 (SU)

March 28 (MO) 29 (TU) 30 (WE) 31 (TH) 1 (FR) 2 (SA) 3 (SU)

April 4 (MO) 5 (TU) 6 (WE) 7 (TH) 8 (FR) 9 (SA) 10 (SU)

April 11 (MO) 12 (TU) 13 (WE) 14 (TH) 15 (FR) 16 (SA) 17 (SU)

April 18 (MO) 19 (TU) 20 (WE) 21 (TH) 22 (FR) 23 (SA) 24 (SU)

April 25 (MO) 26 (TU) 27 (WE) 28 (TH) 29 (FR) 30 (SA) 1 (SU)

May 2 (MO) 3 (TU) 4 (WE) 5 (TH) 6 (FR) 7 (SA) 8 (SU)

May 9 (MO) 10 (TU) 11 (WE) 12 (TH) 13 (FR) 14 (SA) 15 (SU)

May 16 (MO) 17 (TU) 18 (WE) 19 (TH) 20 (FR) 21 (SA) 22 (SU)

May 23 (MO) 24 (TU) 25 (WE) 26 (TH) 27 (FR) 28 (SA) 29 (SU)

May 30 (MO) 31 (TU) 1 (WE) 2 (TH) 3 (FR) 4 (SA) 5 (SU)

Note. The calendar instrument is represented schematically. The actual instrument had bigger cells (similar to the Study I calendar – see Appendix 1).


Appendix 3: Plain calendar in Study III (Non-CAL)

Note. březen = March; duben = April; květen = May. Po = Monday. Ne = Sunday. April 25th = Easter. May 1st = Labour Day. May 8th = Liberation Day. (more details in Appendix 1).


Appendix 4: Description of 35 public events used in Study I (and II)

The description follows this pattern: Number of event / Short name of event / Whole description of event used in the questionnaire. (True date of event; Event frequency of appearance in the yearbooks (applies to the majority of events from years 2006 and 2007 only); Type of year temporal schema: I = shared schema, II = unique schema; Type of multi-annual temporal schema: 0 = no schema, 1 = schema). Note: other important details about an event. Temporal schema: notes about the type of temporal schema. Coverage in time = type of media coverage in time: C = a culminating event; I = an initiating event; 0 = the event is discarded from the second, stricter categorization, in which only events of a clearly culminating/initiating type without major doubts remained (see chapter 5, section “Classification of the events according to the temporal media coverage” for more details). For example, “C, C” means that the event is labeled as culminating in both categorizations. “Pilot study” means that the event was mentioned in an unpublished online free recall study in which respondents with similar demographics (N = 52) were asked to provide public events from years 2006 and 2007 (Neusar, 2010).

1 /Topolánek – 1st government/ Prime minister Miroslav Topolánek’s first government failed to gain confidence. This happened for the first time in the history of the Czech Republic. (3rd November 2006; 2; II; 1). Note: The event was surprising, followed by emotional political speeches, described in detail and alive in the media for a long time. Pilot study. Temporal schema: even though Parliamentary elections happen every four years in June (exception 2010 – May), setting up the government took many months this time. The respondents obviously used a temporal schema of “after Parliamentary elections” and chose May to September most often (different months were exceptional). That is why the event is classified as a type II year temporal schema. Coverage in time: C, C

2 /Heparin murderer/ The 'heparin murderer' Petr Zelenka. An employee of the hospital in Havlíčkův Brod was charged with murdering 7 people and with 10 attempted murders by administering high doses of heparin to unconscious patients. (4th December 2006; 4; II; 0). Note: A shocking case of a hospital employee who administered high doses of heparin to patients. The event attracted media interest right from its beginning, when Zelenka was accused of killing a few people, until the sentencing judgment. The event was exceptional and very emotive. A religious service was held in memory of those who were killed, and their families were offered support as well. More and more facts appeared over time and the event is still mentioned in the media. Coverage in time: I, I

3 /Paroubek – wedding/ Ex-prime minister Jiří Paroubek married Petra Kováčová. (17th November 2007; II; 0). Note: Jiří Paroubek is a former prime minister and ex-leader of one of the country’s largest parties (Social democratic party). He is a source of much controversy (loved and hated by many people), and the tabloid papers give him a lot of attention. The marriage followed Paroubek’s divorce, which received much media coverage. Temporal schema: the event happened on 17th November, which is otherwise an important date in the history of the Czech Republic (date of the “Velvet Revolution” in 1989). However, I expect that most people will not be aware of this information. This is supported by the fact that only 10% of respondents estimated the month correctly. 26% of the respondents responded with a maximum ten month error, which indicates that at least some of them knew the wedding was in autumn. Some respondents probably used the summer wedding schema as well, because 42% estimated May to July. Coverage in time: C, C

4 /Violence against children/ The case of shocking violence against children in the town of Kuřim. (10th May 2007; II; 0). Note: The case began with a revelation of violence against a small boy, but even more shocking facts were revealed later: more adults and children were involved. Media interest has remained remarkably high to the present. Pilot study. Coverage in time: I, I

5 /Kubice’s political crime report/ Publication of the so-called Kubice’s report (“Kubiceho zpráva”). The document dealt with the connection of organized crime and government. (29th May 2006; II; 1). Note: This document leaked to the public just a few days before Parliamentary elections. It was an important and surprising event which probably affected the outcome of the elections. Moreover, the elections were an important event themselves. Temporal schema: even though Parliamentary elections happen every four years in June (exception 2010 – May), the association with “Kubice’s report” will probably be clear only to some respondents. This is why this event is classified as a type II schema. The results show that 34% of respondents estimated the month correctly and 55% had at most one month error. This supports the view that the event could be classified as having a shared schema as well. In comparison, 74% of respondents dated Parliamentary elections with at most one month error, which shows that some respondents did not know the schema. Coverage in time: I, I

6 /Same-sex couples granted registered partnership/ Registered partnership of same-sex couples came into force. (1st July 2006; 2; II; 0). Note: After a long discussion the same-sex couples registered partnership act came into force. It was a novelty in the Czech Republic and therefore a debated act. The media were interested in the topic, and professionals, gay-rights advocates, Catholics, and some lay people discussed it with strong emotions. However, the majority of society was indifferent to these discussions. Temporal schema: some laws come into force in July but the usual month for bigger changes is January. Nevertheless, there is no general rule; it can happen in any month of the year. The results show that there is a higher percentage of answers in January (12%), April (14%), May (12%), June (11%) and October (16%). Apart from the “January” temporal schema, respondents could also have used the “wedding” temporal schema. October cannot be explained by any of these temporal schemas. Coverage in time: C, 0 – I suppose that the event became relevant for most of the respondents after the first couples started to register.


7 /Hurricane Kyrill/ The hurricane Kyrill raged in Europe. 47 casualties, 4 of which were from the Czech Republic. Damaged forests, drop-outs, government declared a state of emergency. (18th – 19th January 2007; 2; II; 0). Note: A stressful, rare event which caused long-term difficulties to some people and the Czech economy. Coverage in time: I, I

8 /Music festival police raid/ The Czechtek open-air music festival police raid. The music festival was terminated by a raid of about a thousand armored policemen with water cannons and tear gas. (29th July 2005; I; 0). Note: Czechtek was a controversial freetekno music festival that was held illegally (and sometimes legally) every year in the summer. Dozens of people were hurt during the exceptionally brutal police raid. There were many protests against the way police dealt with the participants. This question is still debated today and it was used in the last election campaign. The police claimed the concert was not legal but the organizers claimed the opposite. Temporal schema: because it was an open-air festival it had to happen during summer. It happened regularly every year. The results show that respondents had a “summer” schema in mind because 9% estimated June, 48% July and 36% August. Coverage in time: I, I – whether and where the festival would take place was always known only at the last minute, which is why the event is classified as initiating.

9 /Municipal elections/ Municipal elections in the Czech Republic (20th – 21st October 2006; 2; I; 1). Note: Municipal elections are a regular event where people elect their local representatives. These elections have a direct effect on their life. Temporal schema: elections happen every four years, always in autumn. The last two elections were in October, and those in 2002 and 1998 were in November. Coverage in time: C, C

10 /Czech footballers – prostitutes/ Czech football players lost against Germany in a UEFA European Football Championship qualification. Later they were caught at a party with prostitutes by tabloid journalists. (24th March 2007; 2; II; 0). Note: This event was connected with an important sporting event and was interesting material for tabloids. Temporal schema: The qualification for EURO 2008 was held from August 2006 to November 2007, so there should not be any month schema available. It also happened in a rather “atypical” month for a summer sport, which led to a more frequent choice of spring and summer months. Euro is held every four years. It seems that some respondents were aware of the 2008 year, because it was chosen significantly more often than any other year (46%). Coverage in time: I, I – the affair with prostitutes was the only event of this kind and the media still mentions it (at the time of writing this article). But the match was just one among many others.

11 /Obama – presidential victory/ Barack Obama won presidential elections in the USA (4th November 2008; II; 0). Note: There was a long campaign before the elections. Obama is the first African-American president of the USA. He was inaugurated as president on January 20, 2009. Temporal schema: Presidential elections always take place in November (every four years) but I do not expect this schema is known to Czech citizens. The accuracy of dating (22% correct) is probably caused by the recent nature of the event. Coverage in time: C, C

12 /Olympic Games in Peking/ XXIX. Olympic Games in Peking. Shooter Kateřina Emmons gained gold and silver medals, shooter David Kostelecký – gold, javelin thrower Barbora Špotáková – gold, canoeists Ondřej Štěpánek and Jaroslav Wolf – silver, rower Ondřej Synek – silver. (8th – 24th August 2008; I; 1). Note: Olympic Games are a regular event, which is very popular in the Czech Republic. Czech sportsmen were successful. Temporal schema: Olympic Games happen every four years in summer, most of the time in July or August (but there are exceptions, e.g., Sydney 2000 or South Korea 1988 were held in September/October). Coverage in time: C, C

13 /Star Dance II/ Final episode of a popular TV show Star Dance II. (22nd December 2007; - ; I; 0). Note: The final episode of this popular show (in UK “Strictly Come Dancing”) was seen by 2.15 million people (Czech Republic has more than 10 million inhabitants). The event occurred just before Christmas (the 3rd series as well). The dancers are well-known Czech celebrities. Temporal schema: most of these big TV shows air their final episodes in December before Christmas. Even uninterested people will probably come across this regularity. 41% of respondents dated this event correctly and 74% of them chose October to December (new shows often start in autumn). Coverage in time: C, 0 – the event was culminating. Nevertheless, because the show is repeated frequently, previous or following series may have an influence.

14 /Train derail/ Studénka tragedy. A EuroCity train crashed into a fallen bridge; 7 casualties, 65 injuries. (8th August 2008; - ; II; 0). Note: The Studénka tragedy was one of the most tragic events on Czech railways. This event was surprising and emotional for the broad public and the media dedicated a lot of coverage to the whole event. Coverage in time: I, I

15 /Interchange of newborns/ Interchange of newborns in Třebíč hospital. (4th October 2007; 3; II; 0). Note: Parents found out (10 months after the birth) that their children had been interchanged in a hospital after the birth. Media focused on this unusual event extensively for a long time. Parents were subject to many interviews and psychologists discussed this as well. Coverage in time: I, I

16 /Prime Minister Topolánek’s 2nd government/ Miroslav Topolánek’s second government gained confidence. Turncoats Miloš Melčák and Michal Pohanka voted for the government. (19th January 2007; II; 0). Note: The way in which the government won a confidence vote was broadly debated. Pilot study. Temporal schema: even though Parliamentary elections happen every four years in June, setting up the government took many months this time and respondents had to be aware of this information. Because the elections were occurring for the second time (the first time the government did not win a confidence vote), the estimate of the month the event occurred in will probably be close to chance. Respondents obviously used the year schema for elections, as 40% estimated the year 2006 (compared to 28% for 2007). Coverage in time: C, C

17 /FIFA World Cup – Czechs failed/ FIFA World Cup in Germany. Czech football players did not advance further than the basic group (22nd June 2006; 2; I; 1). Note: FIFA World Cup is a popular sport event. Czech fans were disappointed. Pilot study. Temporal schema: the event takes place every four years, usually in June/July. The recent exceptions are World Cups in Korea/Japan (2002) and Mexico (1986) which took place in May/June. Many people are interested in sport and will know the month. Those who do not know can still use the summer temporal schema connected to summer sport. 51% of correct year answers indicate knowledge of the year temporal schema. Coverage in time: C, C

18 /Penalty points system/ Penalty points system came into force (traffic offences). (1st July 2006; 3; II; 0). Note: The penalty points system is a system where a driver gets penalty points as well as a fine. When he or she reaches twelve points, his or her driving license is taken away for some time. This change directly affected many people. The event occurred at the beginning of the summer holidays, which is a landmark itself, but it may confuse some people because many law acts come into force in January. Pilot study. Temporal schema: some laws come into force in July but the most frequent month is probably January. Nevertheless, there is no general rule; it can happen in any month of the year. The results show that there is a significantly higher percentage of answers in January (40%) than in any other month. This suggests the use of a January temporal schema for laws coming into force. There is also a smaller peak in July (15%), suggesting that more people knew the right answer or had a July temporal schema in mind. Coverage in time: C, C

19 /SuperStar 3/ Final episode of the popular TV show SuperStar 3. (17th December 2006; - ; I; 0). Note: The final episode of this popular show (in UK Pop Idol) was seen by 2.3 million people (Czech Republic has more than 10 million inhabitants). The event occurred just before Christmas (like the finales of other series). Temporal schema: most of these big shows have their finales in December before Christmas. Even people who are not interested will probably come across this regularity. This is supported by the results: 47% estimated December and 71% of those questioned chose October – December (new shows often start in autumn). Coverage in time: C, 0 – the event was culminating. Nevertheless, because the show is repeated frequently, previous or following series may have an influence.

20 /Robbery of the century in the CR/ František Procházka stole 564 million CZK from the security agency. It was the highest amount of cash ever stolen in the Czech Republic. (1st December 2007; 2; II; 0). Note: This event is frequently labeled as ‘the greatest robbery of the century’. The event still receives media coverage. Even though there are some hints as to where the thief could be, he is still at large (information from October 2011). Coverage in time: I, I


21 /Kuchařová – Miss World/ Tatiana Kuchařová became Miss World. She was the first Czech woman awarded a prize in this competition. (30th September 2006; 3; II; 0). Note: The event was exceptional and tabloids were interested in it. Pilot study. Temporal schema: Most Miss World events take place in November or December, but there are also exceptions to this rule (2006 – September; 2010 – October). Results do not show any month to be preferred (only August and September were slightly more preferred, which is close to the correct answer). Coverage in time: C, 0 – even though the event is culminating, people probably paid attention to it after Kuchařová won. There is also a possibility of interference with the Czech Miss Competition 2006 that happened in April.

22 /Sarkozy – presidential victory/ Nicolas Sarkozy was elected French president. The former president was Jacques Chirac. (6th May 2007; 4; II; 0). Note: Czech people are generally interested in France because it is a historical ally and Czech people like French movies, music, books and wine. French was also taught a lot at schools but is nowadays “steamrolled” by English. Pilot study. Temporal schema: The last four French presidents were elected in May, but I do not expect this to be a known temporal schema in the Czech Republic. The results show a higher frequency of May and June (44% together) among answers, which may indicate temporal knowledge of some respondents. It could also be that respondents used the temporal schema of Czech presidential elections, which happen at a similar time. Coverage in time: I, I – the event is culminating but the first round takes place in April, thus the event is classified as initiating.

23 /Execution of Saddam Hussein/ Execution of Saddam Hussein. (30th December 2006; 2; II; 0). Note: Executions of public figures are rare. Video of the execution leaked to the internet. The event occurred the day before New Year’s Eve. Pilot study. Coverage in time: I, 0 – this event is disputed. A three-year long lawsuit preceded the delivery of the judgment. The judgment, however, was made just one month before the execution. That is why the event is classified as initiating (to be sure, just in the first category).

24 /Suicide of K. Svoboda/ Well-known composer Karel Svoboda committed suicide. (28th January 2007; 2; II; 0). Note: The well-known composer of popular songs in the Czech Republic and Germany (film songs for ZDF) committed suicide. The media described the event as related to an emotional story involving his young son. The possibility of murder was also discussed. The tabloid press mentioned it for a long time. Pilot study. Coverage in time: I, I

25 /Czechs won ice hockey WCH/ Czech national hockey team gained the title of the world champion. The Czech Republic defeated Canada 3:0. (15th May 2005; - ; I; 0). Note: Hockey is a very popular sport in the Czech Republic and this event is exceptional for many Czech citizens. Temporal schema: there are two temporal schemata which do not correspond with each other. Ice hockey is a winter sport which should take place in winter, but the WCH is a regular event which always takes place in May or April/May (the final game is always in May). Results show that the most frequent answer was May (46%), but January to April was chosen fairly often as well. The year estimate is difficult because Czechs are often successful and the WCH takes place every year. Coverage in time: C, 0 – the WCH takes place every year, so the previous or following year can interfere.

26 /Healthcare charges/ Health service act on payment for services came into force. (1st January 2008; I; 0). Note: This change affected many people in the long term. The topic was hotly debated both before and after the change. Temporal schema: The change occurred on New Year’s Day, which is a very typical date for such changes. This is supported by the results: 59% of respondents were correct, which indicates knowledge of the schema or exact knowledge of the date (the event was recent). Coverage in time: C, C

27 /Star Dance I/ Final of popular TV show Star Dance. (23rd December 2006; I; 0). Note: This TV show (in UK Strictly Come Dancing) was very popular. The final episode was broadcast just before Christmas Eve. Temporal schema: most of these big shows have their finale in December before Christmas. Even people who are not interested will probably come across this regularity. This is supported by the results: 41% of respondents estimated December and 68% chose October – December (new shows often start in autumn). Coverage in time: C, 0 – the event was culminating; however, because the show is repeated frequently, previous or following series may have an influence.

28 /WCH athletics Osaka/ World Athletics Championship in Osaka. Decathlon athlete Roman Šebrle and javelin thrower Barbora Špotáková won gold medals. (25th August – 2nd September 2007; 2; I; 1). Note: The World Athletics Championship is a regularly held sporting event, which is usually very well covered by media. Both of the sportsmen are popular. Temporal schema: WCH athletics take place every four years, usually in August. Results also show that respondents used a summer schema because 88% of the estimates are from June to September. Even though the event has a multi-year temporal schema, respondents probably do not know this (just 30% answered correctly compared to 34% for 2008). Coverage in time: C, C

29 /Olympic Games in Torino/ XX. Olympic Games in Torino. Kateřina Neumannová won a gold medal, Lukáš Bauer silver, and hockey players bronze medals. (10th – 26th February 2006; 5; I; 1). Note: It is a regular sporting event, which is very well covered by media. Pilot study. Temporal schema: Winter Olympic Games happen every four years in February. Coverage in time: C, C

30 /Massacre at Virginia Tech/ Massacre at University of Virginia. Student from South Korea shot 32 people, injured dozens and committed suicide (16th April 2007; 3; II; 0). Note: It was the biggest school shooting in the history of the United States of America. The event was shocking and media coverage was very emotional; psychologists and other professionals analyzed the causes. Coverage in time: I, I


31 /National Library project by Kaplický/ The project by Kaplický won the competition for new National Library (2nd March 2007; 5; II; 0). Note: The architectural project that his company Future Systems proposed (an octopus-shaped building) was hotly debated in public and is still a concern. The library has not been built yet. Kaplický suddenly died on 14. 1. 2009, just a few hours after his daughter was born, which provoked discussion about the project again. Coverage in time: I, I – even though the event was culminating, most respondents probably registered the event after Kaplický won. That is why the event is classified as initiating.

32 /Parliamentary elections/ Parliamentary elections in the Czech Republic. There was no winner; both left and right wing parties gained 100 representatives. (2nd – 3rd June 2006; 7; I; 1). Note: Parliamentary elections were preceded by a massive campaign across the media. For the first time in history there was no winner and both sides had an equal number of representatives. Many of the speeches following the event were emotional, as was the debate among the general public. Pilot study. Temporal schema: elections occur at four-year intervals in June. The 2010 elections were an exception to this rule and happened at the end of May. Results show that 69% of respondents estimated either May or June. 71% chose the correct year, which implies that the multi-year schema is known. Coverage in time: C, C

33 /Bulgaria & Romania joined the EU/ Bulgaria and Romania joined the EU (1st January 2007; 2; II; 0). Note: It was an important change for the whole European Union. The event happened on New Year’s Day (a remarkable date). It was approximately 3 years after the Czech Republic joined the EU. Temporal schema: EU accession happens in January or May (from 1973). The Czech Republic joined the EU in May. The results show that respondents could have used a temporal schema of January, as most answers are January (30%). A temporal schema of May or “early summer” was probably also in the minds of the respondents because of the higher frequency of April (12%), May (10%) and June (12%). Coverage in time: C, 0 – even though the event is culminating, most respondents probably paid attention to it after the joining. That is why the event was removed from the second category.

34 /Pope John Paul II died/ Pope John Paul II died. Benedict XVI (Joseph Ratzinger) became a new pope. (2nd April 2005; II; 0). Note: The Pope was an important public figure not only for Catholics in the Czech Republic (even though it is a highly secular country). It was heavily covered by all media sources. The length of his pontificate was almost 27 years. Coverage in time: I, I

35 /Czech Republic joined Schengen/ Czech Republic joined Schengen Area (resulting in a cessation of border checks). (21st December 2007; 3; II; 0). Note: The consequences of this event were important for many people, because it meant no border control with Czech neighbors. The event happened before Christmas. Temporal schema: countries usually join the Schengen Area in groups in December or March (the exception is a country which joined in October 1997). But I do not expect that many people will be aware of this schema. Results show that respondents are not aware of this schema at all because most of them chose January (41%), which is a typical month for such changes. Coverage in time: C, 0 – the event is culminating. However, probably just a few respondents paid attention to it more than 2 months before. That is why the event was removed from the second category.


Appendix 5: Frequency of answers outside the 2005–2008 boundaries

No  Event                                 Y   1   2   3   4   9  10   f  Out
 1  Topolánek – 1st government            6   ×   ×   ×   ×  14   ×  14  ×
 2  Heparin murderer                      6   ×   ×   1   3   8   ×  12  ×
 3  Paroubek – wedding                    7   ×   1   ×   ×  27   ×  28  ×
 4  Violence against children             7   ×   ×   ×   1   7   1   9  ×
 5  Kubice’s political crime report       6   ×   ×   ×   3   7   ×  10  ×
 6  Same-sex couples reg. partnership     6   1   ×   1   4  13   2  31  ×
 7  Hurricane Kyrill                      7   ×   ×   1   1  14   2  18  ×
 8  Music festival police raid            5   ×   2   2   7   8   ×  19  ×
 9  Municipal elections                   6   ×   ×   ×   ×   8  13  21  ×
10  Czech footballers – prostitutes       7   ×   ×   ×   ×  32   3  35  ×
11  Obama – presidential victory          8   ×   ×   ×   ×  41   ×  41  ×
12  Olympic Games in Beijing              8   ×   ×   ×   ×  22   ×  22  ×
13  Star Dance II                         7   ×   ×   ×   ×  33   2  35  ×
14  Train derail                          8   ×   ×   ×   ×  30   1  31  ×
15  Interchange of newborns               7   ×   ×   ×   ×   6   1   7  ×
16  Topolánek – 2nd government            7   ×   ×   ×   ×   6   ×   6  ×
17  FIFA World Cup – Czechs failed        6   ×   ×   ×   ×  11   5  16  ×
18  Driving points system                 6   ×   1   ×   1   3   ×   5  ×
19  SuperStar 3                           6   ×   ×   ×   ×  24   ×  24  ×
20  Robbery of the century in the CR      7   ×   ×   ×   ×   8   ×   8  ×
21  Kuchařová – Miss World                6   ×   ×   ×   2   4   ×   6  ×
22  Sarkozy – presidential victory        7   ×   ×   2   1   1   ×   4  ×
23  Saddam Hussein execution              6   1   1   2   4   6   ×  15  1992
24  Suicide of K. Svoboda                 7   ×   ×   ×   ×  13   ×  13  ×
25  Czechs won ice hockey WCH             5   ×   ×   1   ×  12  17  30  ×
26  Healthcare charges                    8   ×   ×   ×   ×   9   ×   9  ×
27  Star Dance I                          6   ×   ×   ×   ×  14   ×  14  ×
28  WCH athletics Osaka                   7   ×   ×   ×   ×  13   ×  13  ×
29  Olympic Games in Torino               6   ×   ×   ×   ×   4   2   6  ×
30  Massacre at Virginia Tech             7   ×   ×   ×   1   5   ×   6  ×
31  National Library project by Kaplický  7   ×   ×   ×   ×   6   ×   6  ×
32  Parliament elections                  6   ×   ×   ×   ×   3   1   4  ×
33  Bulgaria & Romania joined EU          7   ×   ×   ×   3  11   ×  14  ×
34  Pope John Paul II died                5   ×   ×   ×   1   1   ×   2  ×
35  Czech Republic joined Schengen        7   ×   ×   2   4   2   ×   9  1997

Note. No = number of an event in appendix 1. Y = correct year. 1, 2 ... 10 = frequency of reported years out of boundary. f = sum of all reports out of boundary. Out = outliers (year).
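The definition of f in the Note can be checked mechanically. What follows is a minimal Python sketch, not part of the original analysis: it re-enters two rows of the table by hand and sums their out-of-boundary frequencies. The function name `out_of_boundary_total` and the dictionary layout are illustrative assumptions. Reading "×" as zero reports, reading the column labels 1 to 10 as the years 2001 to 2010, and counting each outlier year as one additional report are inferences from the table (rows 23 and 35 are consistent with the last one), not statements made in the Note.

```python
# Illustrative sketch: recompute f (sum of out-of-boundary reports) for table rows.
# Assumptions: "×" means zero reports; each listed outlier year counts as one report.

def out_of_boundary_total(counts, outliers=()):
    """Sum the per-year out-of-boundary frequencies plus one report per outlier year."""
    return sum(counts.values()) + len(outliers)

# Keys are the year columns 1, 2, 3, 4, 9, 10 (presumably 2001-2004, 2009, 2010).
music_festival_raid = {1: 0, 2: 2, 3: 2, 4: 7, 9: 8, 10: 0}  # event 8, table f = 19
saddam_execution = {1: 1, 2: 1, 3: 2, 4: 4, 9: 6, 10: 0}     # event 23, table f = 15

print(out_of_boundary_total(music_festival_raid))                # 19
print(out_of_boundary_total(saddam_execution, outliers=[1992]))  # 15
```

The same function can be applied to any other row to compare the recomputed sum with the printed value of f.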


Appendix 6: Description of public events used in Study III

I used multiple sources to verify the information about the events. The “major source” given for each event is only the source that was used predominantly.

1) Royal wedding: Prince William married Kate Middleton at Westminster Abbey in London. April 29, 2011. Note: Even though most foreign news is usually not of much interest to Czech citizens, this wedding was heavily covered by the media (TV, internet, newspapers, journals, blogs) and was often discussed among the public. The TV program covering the event ranked as the most watched show in the history of the public news station ČT24. The peak of the ceremony in Westminster Abbey was followed by 52% of the people watching TV in the Czech Republic between 12:00 and 13:53; 1.4 million adults followed the program for at least 3 consecutive minutes during the six-hour special TV coverage (the population of the Czech Republic is approximately 10 million). (Major source: http://www.regionycr.cz/view.php?cisloclanku=2011050026-kralovska-svatba-nejsledovanejsim-poradem-v-dejinach-ct24&rstema=202&rsstat=5&rskraj=10&rsregion=50)

2) CZ won bronze at WCH: The Czech hockey players defeated the Russians at the World Ice Hockey Championship and took the bronze medal. May 15, 2011. Note: Ice hockey is a very popular sport in the Czech Republic. The reason for this is that, despite its relatively small size, the Czech Republic has won the world championship several times. This time the Czech Republic defeated the Russian team, which is especially important for many people who experienced the communist regime. It was the second most watched TV program (1.6 million viewers) in the history of the public sports news station ČT4. (Slightly higher viewer ratings were recorded during the basic group match between the Czech Republic and Finland.) (Major source: http://isport.blesk.cz/clanek/hokej-reprezentace-ms-2011/105430/boj-o-bronz-prinesl-dalsi-rekord-ve-sledovanosti.html)

3) Annie found dead: Annie, a child who had been missing for several months, was found dead in Prague–Troja. DNA tests later confirmed her identity. March 16, 2011. Note: Nine-year-old Anna never made it home from school on October 13, 2010, and all efforts to find her proved futile. Although hundreds of police officers searched for her for hours in the vicinity of where she disappeared and in the nearby riverbed, the only traces they found were her abandoned schoolbag and her bottle of water, lying close to where she was last seen. An intensive round-the-clock police operation involved hundreds of police officers, sniffer dogs, a helicopter with thermal imaging, etc. The media coverage of the event was extensive throughout (family members appeared on TV many times). Anna was found dead after several months. The accused said that he had met Anna but did not confess to a crime (the police found his DNA on her bag and had several other pieces of indirect evidence). He was charged with murder and rape. Subsequently he was found dead in prison after committing suicide. This happened a few days after Anna was found. Conclusive proof against him was not found. It was speculated that the media could have caused the death of an innocent person and that they inflicted harm on people who were close to the accused (by disclosing their names, place of residence, etc.). (Major source: http://rozanek.posterous.com/jak-media-rozhodla-o-vine-a-trestu)

4) Osama’s death: US commandos found and killed Osama bin Laden in Pakistan. May 2, 2011. Note: Osama bin Laden was shot dead at a compound near Islamabad, in a ground operation based on US intelligence, the first lead for which had emerged the previous August. Bin Laden is believed to have ordered the attacks on New York and Washington on September 11, 2001, and a number of others. He was ranked at the top of the US "most wanted" list. DNA tests later confirmed that Bin Laden was dead. Bin Laden was buried at sea after a Muslim funeral on board a U.S. aircraft carrier. The news was officially announced by the American president Barack Obama. The event received substantial coverage in all the Czech media. (Major source: http://www.bbc.co.uk/news/world-us-canada-13256676)


Bibliography

Ackerman, P. L., & Heggestad, E. D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121(2), 219-245. doi: 10.1037/0033-2909.121.2.219
Anderson, M. C. (2009). Retrieval. In A. D. Baddeley, M. W. Eysenck & M. C. Anderson (Eds.), Memory (pp. 163-189). Hove: Psychology Press.
AP (Producer). (2008). Scientists study man’s amazing memory. Retrieved from http://www.msnbc.msn.com/id/23296808/ns/health-mental_health/t/scientists-study-mans-amazing-memory/#.TqJrw7L1eto (Associated Press)
Aquilino, W. S., & Lo Sciuto, L. A. (1990). Effects of Interview Mode on Self-reported Drug Use. Public Opinion Quarterly, 54(3), 362-395.
Auriat, N. (1993). 'My wife knows best': A comparison of event dating accuracy between the wife, the husband, the couple, and the population register. Public Opinion Quarterly, 57(2), 165-190. doi: 10.1086/269364
Austin, E. J., Deary, I. J., Whiteman, M. C., Fowkes, F. G. R., Pedersen, N. L., Rabbitt, P., . . . McInnes, L. (2002). Relationships between ability and personality: does intelligence contribute positively to personal and social adjustment? Personality & Individual Differences, 32(8), 1391.
Axinn, W. G., Barber, J. S., & Ghimire, D. J. (1997). The Neighborhood History Calendar: A Data Collection Method Designed for Dynamic Multilevel Modeling. Sociological Methodology, 27, 355-392.
Baddeley, A. D. (2009). What is memory. In A. D. Baddeley, M. W. Eysenck & M. C. Anderson (Eds.), Memory. Hove: Psychology Press.
Baddeley, A. D., Kopelman, M. D., & Wilson, B. A. (Eds.). (2004). The essential handbook of memory disorders for clinicians. Chichester: John Wiley & Sons.
Balán, J., Browning, H. L., Jelin, E., & Litzler, L. (1969). A computerized approach to the processing and analysis of life histories obtained in sample surveys. Behavioral Science, 14(2), 105-120.
Baltes, P. B., Reese, H. W., & Lipsitt, L. P. (1980). Life-span Developmental Psychology. Annual Review of Psychology, 31(1), 65-110.
Baron, J. (2008). Thinking and deciding (4 ed.). Cambridge: Cambridge University Press.
Barsalou, L. W. (1988). The content and organization of autobiographical memories. In U. Neisser & E. Winograd (Eds.), Remembering reconsidered: Ecological and traditional approaches to the study of memory (pp. 193-243). New York, NY US: Cambridge University Press.
Belli, R. F. (1998). The Structure of Autobiographical Memory and the Event History Calendar: Potential Improvements in the Quality of Retrospective Reports in Surveys. Memory, 6(4), 383-406. doi: 10.1080/096582198388229
Belli, R. F., & Callegaro, M. (2009). The Emergence of Calendar Interviewing. A Theoretical and Empirical Rationale. In R. F. Belli, F. P. Stafford & D. F. Alwin (Eds.), Calendar and time diary methods in life course research (pp. 31-54). Thousand Oaks: Sage Publications, Inc.
Belli, R. F., Lee, E. H., Stafford, F. P., & Chou, C.-H. (2004). Calendar and Question-List Survey Methods: Association Between Interviewer Behaviors and Data Quality. Journal of Official Statistics, 20(2), 185-218.
Belli, R. F., Shay, W. L., & Stafford, F. P. (2001). Event history calendars and question list surveys: A direct comparison of interviewing methods. Public Opinion Quarterly, 65(1), 45-74. doi: 10.1086/320037
Belli, R. F., Smith, L. M., Andreski, P. M., & Agrawal, S. (2007). Methodological Comparison Between Cati Event History Calendar and Standardized Conventional Questionnaire Instruments. Public Opinion Quarterly, 71(4), 603-622.
Belli, R. F., Stafford, F. P., & Alwin, D. F. (2009). Calendar and time diary methods in life course research: Sage Publications, Inc.
Berntsen, D. (1998). Voluntary and involuntary access to autobiographical memory. Memory, 6(2), 113-141. doi: 10.1080/741942071


Berntsen, D., & Rubin, D. C. (2002). Emotionally charged autobiographical memories across the life span: The recall of happy, sad, traumatic and involuntary memories. Psychology and Aging, 17(4), 636-652. doi: 10.1037/0882-7974.17.4.636
Betz, A. L., & Skowronski, J. J. (1997). Self-events and other-events: temporal dating and event memory. Memory & Cognition, 25(5), 701-714.
Bickel, R. (2007). Multilevel analysis for applied research: it's just regression! New York: The Guilford Press.
Bloise, S. M., & Johnson, M. K. (2007). Memory for emotional and neutral information: Gender and individual differences in emotional sensitivity. Memory, 15(2), 192-204. doi: 10.1080/09658210701204456
Booth, W. C., Colomb, G. G., & William, J. M. (2008). The Craft Of Research (3 ed.). Chicago & London: The University of Chicago Press.
Botwinick, J., & Storandt, M. (1980). Recall and Recognition of Old Information in Relation to Age and Sex. Journal of Gerontology, 35(1), 70-76.
Boyer, P., & Wertsch, J. V. (2009). Memory in mind and culture. New York, NY US: Cambridge University Press.
Bradburn, N. M. (1996). Event dating. In S. Sudman, N. M. Bradburn & N. Schwarz (Eds.), Thinking About Answers: The Application of Cognitive Processes to Survey Methodology (pp. 185-196). San Francisco: Jossey-Bass.
Bradburn, N. M. (2000). Temporal representation and event dating. In A. A. Stone, J. S. Turkkan, C. A. Bachrach, J. B. Jobe, H. S. Kurtzman & V. S. Cain (Eds.), The science of self-report: Implications for research and practice (pp. 49-61). Mahwah: Lawrence Erlbaum Associates, Inc.
Bradburn, N. M. (2010). Recall Period in Consumer Expenditure Surveys Program. Paper presented at the Bureau of Labor Statistics Consumer Expenditure Survey Methods Workshop.
Brown, G. D. A., & Lewandowsky, S. (2010). Forgetting in memory models. Arguments against trace decay and consolidation failure. In S. Della Sala (Ed.), Forgetting (pp. 50-75). Hove: Psychology Press.
Brown, N. R. (1990). Organization of public events in long-term memory. Journal of Experimental Psychology: General, 119(3), 297-314. doi: 10.1037/0096-3445.119.3.297
Brown, N. R., Rips, L. J., & Shevell, S. K. (1985). The subjective dates of natural events in very-long-term memory. Cognitive Psychology, 17(2), 139-177. doi: 10.1016/0010-0285(85)90006-4
Burt, C. D. (1993). The effect of actual event duration and event memory on the reconstruction of duration information. Applied Cognitive Psychology, 7(1), 63-73. doi: 10.1002/acp.2350070107
Burt, C. D. (2008). Time, Language, and Autobiographical Memory. Language Learning, 58(1), 123-141.
Burt, C. D., & Kemp, S. (1991). Retrospective duration estimation of public events. Memory & Cognition, 19(3), 252-262.
Burt, C. D., Kemp, S., & Conway, M. (2001). What happens if you retest autobiographical memory 10 years on? Memory & Cognition, 29(1), 127-136.
Burt, C. D., Kemp, S., & Conway, M. (2008). Ordering the components of autobiographical events. Acta Psychologica, 127(1), 36-45. doi: 10.1016/j.actpsy.2006.12.007
Burt, C. D., Kemp, S., & Conway, M. A. (2003). Themes, Events, and Episodes in Autobiographical Memory. Memory & Cognition, 31(2), 317-325.
Callegaro, M., Yu, M., Cheng, F.-W., Hjermstad, E., Liao, D., & Belli, R. (2005, 2004/05/13/2004 Annual Meeting, Phoenix, AZ). Comparison of Computerized Event History Calendar and Question-list Interviewing Methods: A Two-year Hospitalization History Study. Paper presented at the 59th Annual conference of the American Association for Public Opinion Research, Alexandria.
Caprara, G. V. (2001). Personality and Adaptive Behavior. In N. J. Smelser & P. B. Baltes (Eds.), International Encyclopedia Of The Social & Behavioral Sciences (pp. 0080430767). Oxford: Elsevier Science Ltd.
Clark, N. K., & Stephenson, G. M. (1990). Social remembering: Quantitative aspects of individual and collaborative remembering by police. British Journal of Psychology, 81(1), 73.

Cohen, G. (2008a). Memory for knowledge: General knowledge and expert knowledge. In G. Cohen & M. Conway (Eds.), Memory in the real world (3 ed., pp. 207-227): Psychology Press.
Cohen, G. (2008b). The study of everyday memory. In G. Cohen & M. Conway (Eds.), Memory in the real world (3 ed., pp. 1-20): Psychology Press.
Conway, M. A. (2005). Memory and the self. Journal of Memory and Language, 53, 594-628. doi: 10.1016/j.jml.2005.08.005
Conway, M. A., & Loveday, C. (2010). Accessing autobiographical memories. In J. H. Mace (Ed.), The act of remembering: Toward an understanding of how we recall the past (pp. 56-70). Oxford: Wiley-Blackwell.
Conway, M. A., Meares, K., & Standart, S. (2004). Images and goals. Memory, 12(4), 525-531.
Conway, M. A., & Pleydell-Pearce, C. W. (2000). The Construction of Autobiographical Memories in the Self-Memory System. Psychological Review, 107(2), 261-288. doi: 10.1037//0033-295X.107.2.261
Conway, M. A., Wang, Q., Hanyu, K., & Haque, S. (2005). A Cross-Cultural Investigation of Autobiographical Memory: On the Universality and Cultural Variation of the Reminiscence Bump. Journal of Cross-Cultural Psychology, 36(6), 739-749. doi: 10.1177/0022022105280512
Davis, P. J. (1999). Gender differences in autobiographical memory for childhood emotional experiences. Journal of Personality and Social Psychology, 76(3), 498-510. doi: 10.1037/0022-3514.76.3.498
Della Sala, S. (Ed.). (2010). Forgetting. Hove: Psychology Press.
Denomne, R., & Adie, M. (2009). -prone personality: a study on the big five personality traits associated with susceptibility to false memory (Presentation). SNHU Academic Archive http://academicarchive.snhu.edu/bitstream/handle/10474/789/snhu_00110.pdf?sequence=1
Dijkstra, W., Smit, J. H., & Ongena, Y. P. (2009). An evaluation study of the event history calendar. In R. F. Belli, F. P. Stafford & D. F. Alwin (Eds.), Calendar and time diary methods in life course research (pp. 257-275). Thousand Oaks: Sage.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored design method (3 ed.). Hoboken: Wiley.
Douglas, M. J. (2000). Trading in the zone: master the market with confidence, discipline and a winning attitude. Paramus: Prentice Hall Press.
Ebbinghaus, H. (1885/1964). Memory: A Contribution to Experimental Psychology. New York: Dover.
Engel, L. S., Keifer, M. C., & Zahm, S. H. (2001). Comparison of a traditional questionnaire with an icon/calendar-based questionnaire to assess occupational history. American Journal Of Industrial Medicine, 40(5), 502-511. doi: 10.1002/ajim.1118
Ericsson, K. A. (2003). Valid and Non-Reactive Verbalization of Thoughts During Performance of Tasks. Towards a Solution to the Central Problems of Introspection as a Source of Scientific Data. In A. Jack & A. Roepstorff (Eds.), Trusting the Subject? (Vol. 1, pp. 1-18). Exeter: Imprint Academic.
Ericsson, K. A. (2009a). Development of Professional Expertise: Toward measurement of expert performance and design of optimal learning environments. Cambridge: Cambridge University Press.
Ericsson, K. A. (2009b). Enhancing the development of professional performance: Implications from the study of deliberate practice. In K. A. Ericsson (Ed.), Development of Professional Expertise: Toward measurement of expert performance and design of optimal learning environments (pp. 405-431). Cambridge: Cambridge University Press.
EuroCoDe. (2009). Prevalence of dementia in Eastern Europe. http://www.alzheimer-europe.org/Research/European-Collaboration-on-Dementia/Prevalence-of-dementia2/Prevalence-of-dementia-in-Eastern-Europe
Eysenck, M. W. (2009). Semantic memory and stored knowledge. In A. D. Baddeley, M. W. Eysenck & M. C. Anderson (Eds.), Memory. Hove: Psychology Press.
Eysenck, M. W., & Keane, M. T. (2000). Cognitive psychology: A student's handbook (4 ed.). Hove: Psychology Press.

Ferguson, M. J., Hassin, R., & Bargh, J. A. (2008). Implicit motivation: Past, present, and future. In J. Y. Shah & W. L. Gardner (Eds.), Handbook of motivation science (pp. 150-166). New York, NY US: Guilford Press.
Fisher, R. P., & Geiselman, R. E. (2010). The Cognitive Interview method of conducting police interviews: Eliciting extensive information and promoting Therapeutic Jurisprudence. International Journal of Law and Psychiatry, 33(5-6), 321-328. doi: 10.1016/j.ijlp.2010.09.004
Fivush, R. (1998). Gendered narratives: Elaboration, structure, and emotion in parent-child reminiscing across the preschool years. In C. P. Thompson, D. J. Herrmann, D. Bruce, J. D. Read, D. G. Payne & M. P. Toglia (Eds.), Autobiographical memory: Theoretical and applied perspectives (pp. 79-103). Mahwah, NJ US: Lawrence Erlbaum Associates Publishers.
Fradera, A., & Ward, J. (2006). Placing events in time: The role of autobiographical recollection. Memory, 14(7), 834-845. doi: 10.1080/09658210600747241
Freedman, D., Thornton, A., Camburn, D., Alwin, D., & Young-DeMarco, L. (1988). The Life History Calendar: A Technique for Collecting Retrospective Data. In C. C. Clogg (Ed.), Sociological Methodology 1988 (Vol. 18, pp. 37-68). San Francisco: Jossey-Bass.
Friedman, W. J. (1987). A follow-up to 'Scale effects in memory for the time of events': The earthquake study. Memory & Cognition, 15(6), 518-520.
Friedman, W. J. (1993). Memory for the time of past events. Psychological Bulletin, 113(1), 44-66. doi: 10.1037/0033-2909.113.1.44
Friedman, W. J. (2004). Time in Autobiographical Memory. Social Cognition, 22(5), 591-605.
Fry, R. (2011). Improve your memory (6 ed.). Boston: Course Technology PTR.
Gaskell, G. D., Wright, D. B., & O'Muircheartaigh, C. A. (2000). Telescoping of landmark events: Implications for survey research. Public Opinion Quarterly, 64(1), 77-89.
Gibbons, J. A., & Thompson, C. P. (2001). Using a calendar in event dating. Applied Cognitive Psychology, 15, 33-44.
Giddens, A. (2009). Sociology (Revised and updated with Philip W. Sutton) (6 ed.). Cambridge: Polity Press.
Gillham, B. (2000). The research interview. London: Continuum.
Glasner, T. (2011). Reconstructing event histories in standardized survey research: Cognitive mechanism and aided recall techniques. Doctoral dissertation, Vrije Universiteit, Amsterdam.
Glasner, T., & Van der Vaart, W. (2009). Applications of calendar instruments in social surveys: a review. Quality & Quantity, 43(3), 333-349. doi: 10.1007/s11135-007-9129-8
Glasner, T., Van der Vaart, W., & Belli, R. F. (forthcoming). The use of landmark events as memory aids: Implications for international surveys. Bulletin de Méthodologie Sociologique.
Glück, J., & Bluck, S. (2007). Looking back across the life span: A life story account of the reminiscence bump. Memory & Cognition, 35(8), 1928-1939.
Goldman, N., Moreno, L., & Westoff, C. F. (1989). Collection of Survey Data on Contraception: An Evaluation of an Experiment in Peru. Studies in Family Planning, 20(3), 147-157.
Grafova, I., & Stafford, F. P. (2009). The wage effects of personal smoking history. Industrial & Labor Relations Review, 62(3), 381-393.
Greenway, D. E. (1999). Dates in history: chronology and memory. Historical Research, 72(178), 127-139.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics: 3. Speech acts (pp. 41-58). New York: Academic Press.
Groeger, J. A. (1997). Memory and remembering: Everyday memory in context. Harlow: Longman.
Harris, C. B., Paterson, H. M., & Kemp, R. I. (2008). Collaborative recall and : What happens when we remember together? Memory, 16(3), 213-230.
Harris, C. B., Sutton, J., & Barnier, A. J. (2010). Autobiographical forgetting, social forgetting, and situated forgetting. In S. Della Sala (Ed.), Forgetting (pp. 253-284). Hove: Psychology Press.
Harris, D. A., & Parisi, D. M. (2007). Adapting life history calendars for qualitative research on welfare transitions. Field Methods, 19(1), 40-58. doi: 10.1177/1525822X06292707
Hasher, L., & Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108(3), 356-388. doi: 10.1037/0096-3445.108.3.356


Hastie, R. (1980). Memory for behavioral information that confirms or contradicts a personality impression. In R. Hastie, T. M. Ostrom, E. B. Ebbesen, R. S. Wyer, D. L. Hamilton & D. E. Carlston (Eds.), Person memory: The cognitive basis of social perception (pp. 155-177). Hillsdale: Erlbaum.
Higbee, K. L. (1996). Your memory: How it works and how to improve it (3 ed.). New York: Marlowe & Company.
Hinrichs, J. V. (1970). A two-process memory-strength theory for judgment of recency. Psychological Review, 77(3), 223-233. doi: 10.1037/h0029101
Hintzman, D. L., Block, R. A., & Summers, J. J. (1973). Contextual associations and memory for serial position. Journal of Experimental Psychology, 97(2), 220-229.
Hoferková, J. (2011). Role veřejných mezníků v dlouhodobé paměti. [The role of public landmarks in long-term memory]. Bachelor's thesis, Masarykova Univerzita, Brno. Retrieved from http://is.muni.cz/th/273150/fss_b/Hoferkova_bakalarska_prace.pdf
Hoogendoorn, A. W. (2004). A Questionnaire Design for Dependent Interviewing that Addresses the Problem of Cognitive Satisficing. Journal of Official Statistics, 20(2), 219-232.
Hoppin, J. A., Tolbert, P. E., Flagg, E. W., Blair, A., & Zahm, S. H. (1998). Use of a life events calendar approach to elicit occupational history from farmers. American Journal Of Industrial Medicine, 34(5), 470.
Howes, J. L., & Katz, A. N. (1992). Remote memory: Recalling autobiographical and public events from across the lifespan. Canadian Journal of Psychology/Revue canadienne de psychologie, 46(1), 92-116. doi: 10.1037/h0084311
Humphreys, M. S., & Revelle, W. (1984). Personality, motivation, and performance: A theory of the relationship between individual differences and information processing. Psychological Review, 91(2), 153-184. doi: 10.1037/0033-295x.91.2.153
Huttenlocher, J., Hedges, L., & Prohaska, V. (1988). Hierarchical organization in ordered domains: Estimating the dates of events. Psychological Review, 95(4), 471-484. doi: 10.1037/0033-295x.95.4.471
Huttenlocher, J., Hedges, L. V., & Bradburn, N. M. (1990). Reports of elapsed time: Bounding and rounding processes in estimation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(2), 196-213. doi: 10.1037/0278-7393.16.2.196
Huttenlocher, J., Hedges, L. V., & Prohaska, V. (1992). Memory for day of the week: A 5 + 2 day cycle. Journal of Experimental Psychology: General, 121(3), 313-325. doi: 10.1037/0096-3445.121.3.313
Jackson, C. J., & Francis, L. J. (1998). Interpreting the correlation between neuroticism and lie scale scores. Personality and Individual Differences, 26(1), 59-63.
Janssen, S. M. J., Chessa, A. G., & Murre, J. M. J. (2005). The reminiscence bump in autobiographical memory: Effects of age, gender, education, and culture. Memory, 13, 658-668.
Janssen, S. M. J., Chessa, A. G., & Murre, J. M. J. (2006). Memory for time: how people date events. Memory & Cognition, 34(1), 138-147.
Jonker, C., Geerlings, M. I., & Schmand, B. (2000). Are memory complaints predictive for dementia? A review of clinical and population-based studies. International Journal of Geriatric Psychiatry, 15(11), 983-991. doi: 10.1002/1099-1166(200011)15:11<983::aid-gps238>3.0.co;2-5
Karniol, R., & Ross, M. (1996). The motivational impact of temporal focus: thinking about the future and the past. Annual Review of Psychology, 47(1), 593-620.
Karns, T. E., Irvin, S. J., Suranic, S. L., & Rivardo, M. G. (2009). Collaborative Recall Reduces the Effect of a Misleading Post Event Narrative. North American Journal of Psychology, 11(1), 17-28.
Katz, E., Adonio, H., & Parness, P. (1977). Remembering the news: What the picture adds to recall. Journalism Quarterly, 54(2), 231-239.
Kebbell, M. R. (2009). Witness confidence and accuracy: is a positive relationship maintained for recall under interview conditions? Journal of Investigative Psychology & Offender Profiling, 6(1), 11-23. doi: 10.1002/jip.89
Kemp, S. (1988). Dating Recent and Historical Events. Applied Cognitive Psychology, 2(3), 181-188.
Kemp, S. (1999). An associative theory of estimating past dates and past prices. Psychonomic Bulletin & Review, 6(1), 41-56.


Kemp, S., Burt, C. D., & Malinen, S. (2009). Investigating the structure of autobiographical memory using reaction times. Memory, 17(5), 511-517.
Kessler, R. C., & Wethington, E. (1991). The reliability of life event reports in a community survey. Psychological Medicine, 21(3), 723-738.
Kopelman, M. D. (2004). . In A. D. Baddeley, M. D. Kopelman & B. A. Wilson (Eds.), The essential handbook of memory disorders for clinicians (pp. 69-89). Chichester: John Wiley & Sons.
Kristo, G., Janssen, S. M. J., & Murre, J. M. J. (2009). Retention of autobiographical memories: An Internet-based diary study. Memory, 17(8), 14.
Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5(3), 213-236.
Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108(3), 480-498. doi: 10.1037/0033-2909.108.3.480
Kurbat, M. A., Shevell, S. K., & Rips, L. J. (1998). A year's memories: The calendar effect in autobiographical recall. Memory & Cognition, 26(3), 532-552.
Larsen, S. F., & Conway, M. A. (1997). Reconstructing dates of true and false autobiographical memories. European Journal of Cognitive Psychology, 9(3), 259-272. doi: 10.1080/713752560
Larsen, S. F., & Thompson, C. P. (1995). in the dating of personal and public news events. Memory & Cognition, 23, 780-790.
Larsen, S. F., Thompson, C. P., & Hansen, T. (1995). Time in autobiographical memory. In D. C. Rubin (Ed.), Remembering our past. Cambridge: Cambridge University Press.
Lee, P. J., & Brown, N. R. (2004). The role of guessing and boundaries on date estimation biases. Psychonomic Bulletin & Review, 11(4), 748-754.
Leist, A. K., Ferring, D., & Filipp, S.-H. (2010). Remembering positive and negative life events: Associations with future time perspective and functions of autobiographical memory. GeroPsych: The Journal of Gerontopsychology and Geriatric Psychiatry, 23(3), 137-147. doi: 10.1024/1662-9647/a000017
Ley, P. (1972). Primacy, rated importance, and the recall of medical statements. Journal of Health and Social Behavior, 13(3), 311-317.
Linton, M. (1975). Memory for real-world events. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations in cognition (pp. 376-404). San Francisco: Freeman.
Linton, M. (1982). Transformations of memory in everyday life. In U. Neisser (Ed.), Memory observed: Remembering in natural contexts (pp. 77-91). San Francisco: Freeman.
Loftus, E. F., & Fathi, D. C. (1985). Retrieving multiple autobiographical memories. Social Cognition, 3(3), 280-295.
Loftus, E. F., & Marburger, W. (1983). Since the eruption of Mt. St. Helens, has anyone beaten you up? Improving the accuracy of retrospective reports with landmark events. Memory & Cognition, 11(2), 114-120.
Luminet, O., & Curci, A. (2009). Flashbulb memories: New issues and new perspectives. New York, NY US: Psychology Press.
Marian, V., & Neisser, U. (2000). Language-dependent recall of autobiographical memories. Journal of Experimental Psychology: General, 129(3), 361-368. doi: 10.1037/0096-3445.129.3.361
Martin, M., & Jones, G. V. (1984). Cognitive failures in everyday life. In J. E. Harris & P. E. Morris (Eds.), Everyday memory: actions and absent-mindedness. London: Academic Press.
Martyn, K. K. (2009). Adolescent health research and clinical assessment using self-administered event history calendar. In R. F. Belli, F. P. Stafford & D. F. Alwin (Eds.), Calendar and time diary methods in life course research (pp. 257-275). Thousand Oaks: Sage.
Mathiowetz, N. A., & Duncan, G. J. (1988). Out of Work, Out of Mind: Response Errors in Retrospective Reports of Unemployment. Journal of Business & Economic Statistics, 6(2), 221-229.
McCombs, M. E., & Shaw, D. L. (1972). The agenda-setting function of mass media. Public Opinion Quarterly, 36(2), 176-187.
Means, B., Mingay, D. J., Nigam, A., & Zarrow, M. (1988). A cognitive approach to enhancing health survey reports of medical visits. In M. M. Gruneberg, P. E. Morris & R. N. Sykes (Eds.), Practical aspects of memory: Current research and issues, Vol. 1: Memory in everyday life (pp. 537-542). Oxford England: John Wiley & Sons.

Murdock, B. B. (1974). Human memory: Theory and data. Potomac: Lawrence Erlbaum.
Murre, J. M. J. (2010). Connectionist models of forgetting. In S. Della Sala (Ed.), Forgetting (pp. 77-99). Hove: Psychology Press.
Neisser, U. (1981). John Dean's memory: A case study. Cognition, 9(1), 1-22.
Neisser, U. (1988). Five kinds of self-knowledge. Philosophical Psychology, 1(1), 35-59. doi: 10.1080/09515088808572924
Neusar, A. (2010). Volné vybavení veřejných událostí z let 2006 a 2007 [Free recall of public events from 2006 and 2007]. Masarykova Univerzita. Brno.
Neusar, A. (2011). Přesnost datace dvou veřejných událostí – hurikán Kyrill a zavedení bodového systému [Dating accuracy of two public events: hurricane Kyrill and penalty points system]. Masarykova Univerzita. Brno.
Neusar, A., Hoferková, J., & Ježek, S. (2011). Přesnost datace mediálně známých veřejných událostí [Dating accuracy of well-known public events]. Mediální studia, 5(2).
Neusar, A., & Ježek, S. (2009). Kalendárium životních událostí jako metoda podpory vybavování v retrospektivním dotazování [Event history calendar – a method for facilitating the retrieval in retrospective reports]. Faculty of Social Studies (IRCYF). Masarykova Univerzita. Brno.
Neuschatz, J. S., Lampinen, J. M., Preston, E. L., Hawkins, E. R., & Toglia, M. P. (2002). The Effect of Memory Schemata on Memory and the Phenomenological Experience of Naturalistic Situations. Applied Cognitive Psychology, 16, 687-708.
Niedźwieńska, A. (2004). Metamemory knowledge and the accuracy of flashbulb memories. Memory, 12(5), 603-613. doi: 10.1080/09658210344000134
Pohl, R. F., Bender, M., & Lachmann, G. (2005). Autobiographical memory and social skills of men and women. Applied Cognitive Psychology, 19(6), 745-759.
Preiss, M. (2008). Deprese a výkon [Depression and performance]. Praha: Psychiatrické centrum.
Reimer, M., & Matthes, B. (2007). Collecting Event Histories with TrueTales: Techniques to Improve Autobiographical Recall Problems in Standardized Interviews. Quality & Quantity, 41(5), 711-735. doi: 10.1007/s11135-006-9021-y
Riggio, R. E. (1986). Assessment of basic social skills. Journal of Personality and Social Psychology, 51(3), 649-660. doi: 10.1037/0022-3514.51.3.649
Ritchie, T. D., Skowronski, J. J., Walker, R. W., & Wood, S. E. (2006). Comparing two perceived characteristics of autobiographical memory: Memory detail and accessibility. Memory, 14(4), 471-485. doi: 10.1080/09658210500478434
Roberts, J., & Horney, J. (2010). The Life Event Calendar Method in Criminological Research. Handbook of Quantitative Criminology (3), 289-312. doi: 10.1007/978-0-387-77650-7_15
Robinson, M. D., Johnson, J. T., & Herndon, F. (1997). Reaction time and assessments of cognitive effort as predictors of accuracy and confidence. Journal of Applied Psychology, 82(3), 416-425. doi: 10.1037/0021-9010.82.3.416
Robson, C. (2002). Real World Research: A Resource for Social Scientists and Practitioner-Researchers (2nd ed.). Oxford: Blackwell Publication.
Rosenberg, M. J., Layde, P. M., Ory, H. W., Strauss, L. T., Rooks, J. B., & Rubin, G. L. (1983). Agreement between Women's Histories of Oral Contraceptive Use and Physician Records. International Journal of Epidemiology, 12(1), 84.
Rubin, D. C. (1982). On the retention function for autobiographical memory. Journal of Verbal Learning & Verbal Behavior, 21(1), 21-38. doi: 10.1016/s0022-5371(82)90423-6
Rubin, D. C. (2000). Autobiographical . In D. Park & N. Schwarz (Eds.), Cognitive aging: A primer (pp. 292). New York: Psychology Press.
Rubin, D. C. (2006). The basic-systems model of episodic memory. Perspectives on Psychological Science, 1(4), 227-311.
Rubin, D. C., & Baddeley, A. D. (1989). Telescoping is not time compression: A model of the dating of autobiographical events. Memory & Cognition, 17(6), 653-661.
Rubin, D. C., Berntsen, D., & Hutson, M. (2009). The normative and the personal life: Individual differences in life scripts and life story events among USA and Danish undergraduates. Memory, 17(1), 54-68. doi: 10.1080/09658210802541442
Rubin, D. C., & Kozin, M. (1984). Vivid memories. Cognition, 16(1), 81-95. doi: 10.1016/0010-0277(84)90037-4
Rubin, D. C., Schrauf, R. W., & Greenberg, D. L. (2003). Belief and Recollection of Autobiographical Memories. Memory & Cognition, 31, 887-901.

Rubin, D. C., & Wenzel, A. E. (1996). One hundred years of forgetting: A quantitative description of retention. Psychological Review, 103(4), 734.
Rubin, D. C., Wetzler, S. E., & Nebes, R. D. (1986). Autobiographical memory across the adult lifespan. In D. C. Rubin (Ed.), Autobiographical memory (pp. 312). Cambridge: Cambridge University Press.
Scott, G. G. (2007). 30 days to a more powerful memory. New York: American Management Association.
Sedlmeier, P., & Betsch, T. (2002). ETC. Frequency processing and cognition. New York: Oxford University Press.
Shum, M. S. (1998). The role of temporal landmarks in autobiographical memory processes. Psychological Bulletin, 124(3), 423-442. doi: 10.1037/0033-2909.124.3.423
Schank, R. C., & Abelson, R. P. (1977). Scripts, plans, goals and understanding: An inquiry into human knowledge structures. Hillsdale: Lawrence Erlbaum Associates.
Schmand, B., Smit, J., Lindeboom, J., Smits, C., Hooijer, C., Jonker, C., & Deelman, B. (1997). Low education is a genuine risk factor for accelerated memory decline and dementia. Journal of Clinical Epidemiology, 50(9), 1025-1033.
Scholl-Schneider, S., Schneider, M., & Spurný, M. (Eds.). (2010). Sudetské příběhy / Sudetengeschichten. Praha: Antikomplex, Lehrstuhl für Bayerische und Schwäbische Landesgeschichte, Universität Augsburg.
Schwarz, N. (2007). Retrospective and concurrent self-reports: The rationale for real-time data capture. In A. Stone, S. S. Shiffman, A. Atienza & L. Nebeling (Eds.), The science of real-time data capture: Self-reports in health research (pp. 11-26). New York: Oxford University Press.
Schwarz, N., & Oyserman, D. (2001). Asking Questions About Behavior: Cognition, Communication, and Questionnaire Construction. American Journal of Evaluation, 22(2), 127.
Sigelman, C. K., Schoenrock, C. J., Budd, E. C., Winer, J. L., Spanhel, C. L., Martin, P. W., . . . Bensberg, G. J. (1983). Communication with mentally retarded persons: Asking questions and getting answers. Lubbock: Texas Tech University.
Skowronski, J. J., Betz, A. L., Thompson, C. P., & Shannon, L. (1991). Social memory in everyday life: Recall of self-events and other-events. Journal of Personality and Social Psychology, 60(6), 831-843. doi: 10.1037/0022-3514.60.6.831
Skowronski, J. J., Betz, A. L., Thompson, C. P., Walker, R. W., & Shannon, L. (1994). The Impact of Differing Memory Domains on Event-Dating Processes in Self and Proxy Reports. In N. Schwarz & S. Sudman (Eds.), Autobiographical memory and the validity of retrospective reports (pp. 217-231). New York: Springer-Verlag.
Skowronski, J. J., & Thompson, C. P. (1990). Reconstructing the Dates of Personal Events: Gender Differences in Accuracy. Applied Cognitive Psychology, 4(5), 371-381.
Snoeijer, R., de Vreese, C. H., & Semetko, H. A. (2002). Research note: The effects of live television reporting on recall and appreciation of political news. European Journal of Communication, 17(1), 85-101.
Sobell, L. C., Sobell, M. B., Leo, G. I., & Cancilla, A. (1988). Reliability of a timeline method: Assessing normal drinkers' reports of recent drinking and a comparative evaluation across several populations. British Journal of Addiction, 83(4), 393-402.
Sternberg, R. J., & Grigorenko, E. L. (2003). The psychology of abilities, competencies, and expertise. Cambridge: Cambridge Univ Press.
Stevenson, N. (2006). Lifestyle. In B. S. Turner (Ed.), The Cambridge dictionary of sociology (pp. 339). Cambridge: Cambridge University Press.
Sudman, S., & Bradburn, N. M. (1982). Asking questions: A practical guide to questionnaire design. San Francisco: Jossey-Bass.
Sutin, A. R., & Robins, R. W. (2007). Phenomenology of autobiographical memories: The memory experiences questionnaire. Memory, 15(4), 390-411.
Symons, C. S., & Johnson, B. T. (1997). The self-reference effect in memory: A meta-analysis. Psychological Bulletin, 121(3), 371-394. doi: 10.1037/0033-2909.121.3.371
Tabatabaei, O., & Hejazi, N. H. (2011). Gender Differences in Vocabulary Instruction Using Keyword Method (Linguistic Mnemonics). Canadian Social Science, 7(5), 198-204. doi: 10.3968/J.css.1923669720110705.465

Talarico, J. M., & Mace, J. H. (2010). Involuntary and voluntary memory sequencing phenomena. An interesting puzzle for the study of autobiographical memory organization and retrieval. In J. H. Mace (Ed.), The act of remembering: Toward an understanding of how we recall the past (pp. 71-82). Oxford: Wiley-Blackwell.
Thompson, C. P. (1982). Memory for unique personal events: The roommate study. Memory & Cognition, 10(4), 324-332.
Thompson, C. P., Skowronski, J. J., & Betz, A. L. (1993). The use of partial temporal information in dating personal events. Memory & Cognition, 21(3), 352-360.
Thompson, C. P., Skowronski, J. J., & Lee, D. J. (1988). Reconstructing the date of a personal event. In M. M. Gruneberg, P. E. Morris & R. N. Sykes (Eds.), Practical aspects of memory: Current research and issues. Vol. 1. Memory in everyday life (pp. 241-246). Chichester: Wiley.
Thomsen, D. K., & Berntsen, D. (2005). The end point effect in autobiographical memory: More than a calendar is needed. Memory, 13(8), 846-861. doi: 10.1080/09658210444000449
Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The psychology of survey response. Cambridge: Cambridge Univ Press.
Trnka, S. (2011). When the world went color: Emotions, senses and spaces in contemporary accounts of the Czechoslovak Velvet Revolution. Emotion, Space and Society. doi: 10.1016/j.emospa.2011.05.002
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53(1), 1-25. doi: 10.1146/annurev.psych.53.100901.135114
Tulving, E., & Craik, F. I. M. (2000). Episodic Memory and Autonoetic Awareness. In E. Tulving & F. I. M. Craik (Eds.), The Oxford handbook of memory (1st ed.). New York: Oxford University Press.
Turkington, C. (2003). Memory: A self-teaching guide (Vol. 156). Hoboken: John Wiley & Sons Inc.
Tzeng, O. J., & Cotton, B. (1980). A study-phase retrieval model of temporal coding. Journal of Experimental Psychology: Human Learning and Memory, 6(6), 705.
ÚIV, & OECD. (2010). České školství v mezinárodním srovnání; Vybrané ukazatele publikace OECD Education at a Glance 2010 (compiled by M. Kleňhová, P. Štastnová, & P. Cibulková) [Czech education in an international comparison; Chosen indicators from OECD Education at a Glance 2010]. Praha: Ústav pro informace ve vzdělávání.
UNECE. (2010). Mean age of women at birth of first child. Retrieved from United Nations Economic Commission for Europe: http://w3.unece.org/pxweb/dialog/varval.asp?ma=04_GEFHAge1stChild_r&path=../database/STAT/30-GE/02-Families_households/&lang=1&ti=Mean+age+of+women+at+birth+of+first+child
Van der Vaart, W. (1996). Inquiring into the past: Data quality of responses to retrospective questions. Doctoral dissertation, Vrije Universiteit, Amsterdam.
Van der Vaart, W. (2004). The time-line as a device to enhance recall in standardized research interviews: A split ballot study. Journal of Official Statistics, 20(2), 301-317.
Van der Vaart, W., & Glasner, T. (2007a). Applying a Timeline as a Recall Aid in a Telephone Survey: A Record Check Study. Applied Cognitive Psychology, 21(2), 227-238. doi: 10.1002/acp.1338
Van der Vaart, W., & Glasner, T. (2007b). The use of landmark events in EHC-interviews to enhance recall accuracy. Paper presented at The Use of Event History Calendar Methods in Panel Surveys, Washington D.C.
Van der Vaart, W., & Glasner, T. (2011). Personal Landmarks as Recall Aids in Survey Interviews. Field Methods, 23(1), 37-56. doi: 10.1177/1525822x10384367
Vokolek, V. (2011). Český rok [Czech year]. Praha: Plus.
Wade, S. E., & Adams, R. B. (1990). Effects of importance and interest on recall of biographical text. Journal of Literacy Research, 22(4), 331-353.
Wagenaar, W. A. (1986). My memory: A study of autobiographical memory over six years. Cognitive Psychology, 18(2), 225-252. doi: 10.1016/0010-0285(86)90013-7
Wagenaar, W. A. (1988). Identifying Ivan: A case study in legal psychology. New York: Harvard University Press.
Wagenaar, W. A. (1994). Is memory self-serving? In U. Neisser & R. Fivush (Eds.), The remembering self: Construction and accuracy in the self-narrative (pp. 191-204). Cambridge: Cambridge University Press.

Wang, Q., & Brockmeier, J. (2002). Autobiographical remembering as cultural practice: Understanding the interplay between memory, self and culture (Vol. 8, pp. 45-64).
Wheeler, M. A., Stuss, D. T., & Tulving, E. (1997). Toward a theory of episodic memory: The frontal lobes and autonoetic consciousness. Psychological Bulletin, 121(3), 331-354. doi: 10.1037/0033-2909.121.3.331
White, G. M. (1997). Introduction: Public History and National Narrative. Museum Anthropology, 21(1), 3-7. doi: 10.1525/mua.1997.21.1.3
Wicks, R. H. (1995). Remembering the News: Effects of Medium and Message Discrepancy on News Recall over Time. Journalism and Mass Communication Quarterly, 72(3), 666-681.
Wilding, J. M., & Valentine, E. R. (2006). . In K. A. Ericsson (Ed.), The Cambridge handbook of expertise and expert performance. Cambridge: Cambridge Univ Press.
Wilding, J. M., Valentine, E. R., Marshall, P., & Cook, S. (1999). Memory, IQ and Examination Performance. Educational Psychology, 19(2), 117.
Wiley, J. (2005). A fair and balanced look at the news: What affects memory for controversial arguments? Journal of Memory and Language, 53(1), 95-109. doi: 10.1016/j.jml.2005.02.001
Williams, H. L., Conway, M. A., & Cohen, G. (2008). Autobiographical memory. In G. Cohen & M. A. Conway (Eds.), Memory in the real world (3rd ed., pp. 21-90). Psychology Press.
Willis, G. (2005). Cognitive interviewing: A tool for improving questionnaire design. Sage Publications, Inc.
Wright, D. B., Gaskell, G. D., & O'Muircheartaigh, C. A. (1997). Temporal estimation of major news events: Re-examining the accessibility principle. Applied Cognitive Psychology, 11(1), 35-46. doi: 10.1002/(sici)1099-0720(199702)11:1<35::aid-acp420>3.0.co;2-r
Yoshihama, M., Clum, K., Crampton, A., & Gillespie, B. (2002). Measuring the lifetime experience of domestic violence: Application of the Life History Calendar method. Violence and Victims, 17(3), 297-317. doi: 10.1891/vivi.17.3.297.33663
Yoshihama, M., Gillespie, B., Hammock, A. C., Belli, R. F., & Tolman, R. M. (2005). Does the Life History Calendar method facilitate the recall of intimate partner violence? Comparison of two methods of data collection. Social Work Research, 29(3), 151-163.
Yount, K. M., & Gittelsohn, J. (2008). Comparing reports of health-seeking behavior from the integrated illness history and a standard child morbidity survey. Journal of Mixed Methods Research, 2(1), 23-62.
Yu, M., Callegaro, M., Cheng, F.-W., Hjermstad, E., Liao, D., & Belli, R. (2004, May 13). Comparison of Computerized Event History Calendar and Question-list Interviewing Methods: A Two-year Hospitalization History Study. Paper presented at the 59th annual conference of the American Association for Public Opinion Research, Phoenix, AZ.
