Formalizing TV Crime Series: Application and Evaluation of the Doxastic Preference Framework

Kevin M. van Andel 5686199

Bachelor thesis Credits: 9 EC

Bachelor Opleiding Kunstmatige Intelligentie University of Amsterdam Faculty of Science Science Park 904 1098 XH Amsterdam

Supervisor Prof. dr. Benedikt Löwe Institute for Logic, Language and Computation (ILLC) Faculty of Science University of Amsterdam Science Park 904 1098 XH Amsterdam

June 25th, 2010

Abstract

We aim at capturing the human notion of story understanding in a formal system. We apply a formal system, the Doxastic Preference Framework (DPF), to seven narratives from four episodes of CSI: Crime Scene Investigation™. From an analysis of the formalizations we reach conclusions similar to those of previous work: all the formalizations of this commercial TV crime series consist of a small number of specific structures called building blocks, and we identify a number of deficiencies and missing features in DPF. To illustrate the difficulty of proposing an experimental setup for human story understanding, we present a small experiment to verify whether two stories that differ in their formal structure are also perceived as different by human subjects. Ideally, we want to know how well DPF captures the notion of human story understanding. For this we refer to an existing proposal for an empirical evaluation on the basis of a comparison of two formal frameworks. Finally, we discuss future work on this subject.

Keywords — formal frameworks, formalization, isomorphism, logic, story understanding

Contents

1 Introduction
1.1 Motivation
1.2 Related work
1.3 Overview

2 Formalization of narratives
2.1 Methodology

3 The formal frameworks
3.1 The Plot Units Framework
3.2 The Doxastic Preference Framework

4 Formalization of narratives from CSI episodes in DPF
4.1 CSI: Crime Scene Investigation™
4.2 Formalizations and analysis
4.2.1 Trick roll
4.2.2 The killed house guest
4.2.3 Winning a fortune
4.2.4 Faked kidnapping
4.2.5 Hit and run
4.2.6 Pledging gone wrong
4.2.7 The severed leg
4.3 Identifying deficiencies in DPF

5 Evaluating formal frameworks
5.1 An experiment on human story understanding
5.1.1 Experimental setup
5.1.2 Results
5.2 Methods to compare formal frameworks

6 Conclusion and discussion
6.1 Discussion
6.1.1 Improving DPF
6.2 Future work

A Timeline of narratives from CSI episodes
A.1 Episode 1: Pilot
A.1.1 Trick roll
A.1.2 The killed house guest
A.2 Episode 2: Cool Change
A.2.1 Winning a fortune
A.3 Episode 3: Crate ’n Burial
A.3.1 Faked kidnapping
A.3.2 Hit and run
A.4 Episode 4: Pledging Mr. Johnson
A.4.1 Pledging gone wrong
A.4.2 The severed leg

1 Introduction

1.1 Motivation

Humans are generally good at understanding stories, yet so far we have not been able to capture the human notion of story understanding in a formal representation. Our interest is to capture this notion and to represent it formally. If we knew how to tackle this problem, we might be able to identify empirically the narrative structures that suit particular genres or that appeal to a specific audience. From a more global perspective, we are interested in the question whether two stories are identical in structure with respect to a fixed formal framework, and whether this formal notion of identity corresponds to the informal human notion of identity of stories.

While capturing the human notion of story understanding is the ultimate goal, it is beyond the scope of this thesis. We do not have a formal framework that captures this notion, let alone an algorithm that automatically produces formalizations in such a framework. We discuss a variety of other applications and topics that are beyond the scope of this thesis in Section 6.2.

What we do discuss in this thesis is the application and evaluation of a formal framework called the Doxastic Preference Framework (DPF) (Löwe and Pacuit (2008), Löwe et al. (2009), Löwe (2010)), together with a number of suggestions on how to improve it. Before turning to this, we give an overview of the work related to story understanding.

1.2 Related work

The available literature on story understanding is vast. We focus on literature devoted to the formalization of texts. Part of the research in the field of story understanding aims at computer algorithms that automatically analyze stories given in a natural language or, more specifically, produce formal representations of such narratives. Correira (1980) discussed the task of parsing narrative texts. He emphasized the difficulty of this task, from parsing single sentences to paragraphs or longer texts, where knowledge that goes beyond the narrative is required in order to understand it. Kazantseva and Szpakowicz (2010) presented an approach by which summaries can be created automatically from short literary stories.

Story grammars, which fall in the same category as the formal frameworks discussed in this thesis, base their formalizations on rewrite rules. Black and Wilensky (1979) proposed an evaluation of story grammars and argued that they are not useful for story understanding. Rumelhart (1980) countered their arguments by pointing out their misunderstandings of story grammars and their failure to stress the true problems that underlie story grammars. Following this paper, Frisch and Perlis (1981) suggested a proper method to analyze and evaluate story grammars.

Proposing a computer algorithm for narrative formalization is beyond the scope of this thesis; a formal framework has to be defined first that adequately captures the relevant structures of a narrative, i.e. those that correspond to how humans understand stories. On a similar note, we are also interested in finding the relevant structures of a narrative for a specific genre or audience. In a paper on story understanding, van Dijk (1980) discusses the role of psychology in narrative theory and particularly important results in that field of research.
Experiments on story understanding were done by Bower (1976), focusing specifically on how humans represent stories in memory.

Automated story synthesis engines could use the relevant structures to synthesize stories interactively in computer games. For example, Young (2007) proposes a basic approach to modeling narratives in interactive virtual worlds. Intelligent agents in computer games could base their reasoning on their beliefs about the preferences of other agents,

including human agents. Agents may change their beliefs about other agents, or even their preferences, as the story progresses. An example of a story synthesis engine is Mexica, proposed by Pérez y Pérez and Sharples (2001). However, Mexica requires human guidance to produce stories. Its model is based on the engagement-reflection style of writing: according to Pérez y Pérez and Sharples (2001), creative writing is a cycle of engagement and reflection.

Narratives are composed of two main features: story and discourse. As in the model of Young (2007), these features are distinct parts of a narrative. With respect to the central topic of this thesis, we have to distinguish story and discourse during the formalization process as well. We focus in particular on the storyline in order to create formalizations in DPF.

Lehnert (1981) proposed a formal framework for the conceptualization of narratives based on how humans summarize them, i.e. how humans represent such summaries internally in memory. We refer to this framework as the Plot Units Framework (PUF), and it is part of the discussion in this thesis.

A more recent development in formal framework research comes from Löwe and Pacuit (2008), who proposed DPF. They pursued the goal of “formalizing and understanding reasoning processes in multi-agent situations with imperfect information”. They focused in particular on the actual behavior of agents. Because DPF is the central topic of this thesis, we leave further discussion to the remainder of this thesis.

Humans are able to analyze narratives and make judgements about the beliefs and preferences of characters in a narrative quite easily (Löwe, Pacuit and Saraf, 2009). More of the motivation from the previous subsection can be found in that paper as well.
We partly reused and revised the formalizations of narratives from CSI: Crime Scene Investigation™ given in Section 4 of Löwe, Pacuit and Saraf (2009), together with the building blocks that follow from these formalizations. The formalizations are presented in Section 4 of this thesis.

Further literature on story understanding that is worth mentioning is the book In-Depth Understanding: A Computer Model of Integrated Processing for Narrative Comprehension by Dyer (1983), who presented an initial attempt to “specify and model those knowledge constructs, inference strategies, and memory search processes which are a prerequisite for an in-depth understanding” of stories. Another book that is worth reading is Inside Computer Understanding: Five Programs Plus Miniatures by Riesbeck and Schank (1982), who focus on the topic of how humans understand natural language.

1.3 Overview

In this thesis two formal frameworks are discussed, and some of the arguments about them are based on the notion of isomorphism between the formalizations they produce. The first formal framework, on which we do not specifically focus in this thesis, was proposed by Lehnert (1981); we refer to it as the Plot Units Framework (PUF). The second formal framework was proposed by Löwe and Pacuit (2008) and has been applied descriptively and empirically to narratives from CSI episodes by Löwe, Pacuit and Saraf (2009). We refer to this framework as the Doxastic Preference Framework (DPF). DPF approaches the problem of identifying the structure of a narrative on the basis of notions from doxastic logic, a modal logic concerned with reasoning about beliefs. The framework is applied to storylines driven by what characters in the story believe and prefer. Section 3 covers the details of this formal framework.

In this thesis we concern ourselves with narratives from the forensic crime series CSI: Crime Scene Investigation™, with DPF as the main formal framework of reference. Löwe et al. (2009) employed this formal framework to produce formalizations of narratives from several CSI episodes. They identified common structures, i.e. the building blocks,

and found that a small number of building blocks is enough to formally represent all the narratives. For methodological reasons, which we discuss in Section 2.1, these formalizations were verified and revised.

As was pointed out in Section 3 of Löwe, Pacuit and Saraf (2009), there are a number of deficiencies in DPF, i.e. relevant structure that is not represented in DPF formalizations, or artifacts of the formalization that do not correspond to the narrative. In order to improve on DPF, one can evaluate it against human intuition or against other formal frameworks such as PUF. DPF is a sparse formal framework: it identifies two stories as isomorphic more easily than more elaborate formal frameworks do. This is not necessarily a weakness, but it raises the difficulty of finding a balance between these two extremes. Problems in DPF that are due to missing features, which are present in the framework it is evaluated against, could be solved by adding these features to DPF, so that more complex and more comprehensive storylines become adequately formalizable. In Section 6.2 we discuss future work with respect to improvements that may be added to DPF.

From now on we use the terms narrative and story interchangeably. In the remainder of the thesis we elaborate on the subject of formalizing narratives in DPF, and we evaluate this framework by analyzing the formalizations of narratives from CSI episodes. Specifically, we begin in Section 2 with a general notion of what defines a narrative formalization framework and a discussion of its methodology. Following this, Section 3 covers the theory of two frameworks, PUF and DPF. The formalizations of seven narratives from four CSI episodes, including elaborated and revised versions of the formalizations of the same narratives from Löwe, Pacuit and Saraf (2009), are presented in Section 4.
Then in Section 5 we present a small experiment on human story understanding to illustrate the difficulty of proposing such experiments, followed by a reference to a proposed method for evaluating formal frameworks against other formal frameworks, based on the level of granularity and the notion of isomorphism between the formalizations of narratives. In Section 6 we give a conclusion and discuss possible improvements to DPF and future work. Finally, Appendix A contains the timeline summaries of narratives from CSI episodes in terms of relevant actions, events and discourse elements.

2 Formalization of narratives

Narratives given in a natural language do not occur as structures, but in a non-formal form. An important prerequisite for applying formal methods is to transform the narrative into a formal structure. This is called the formalization process. In this section we give an overview of the meaning of the formalization process and discuss the general methodology.

The formalization process requires a formal framework¹. We define a formal framework F as a class of structures S for a given syntax with a notion of isomorphism, i.e. a structure-preserving mapping. Section 3 discusses the theory of two such formal frameworks.

With the notion of isomorphism, denoted by the symbol ≃, we can determine whether two narratives are structurally the same in a formal framework F. We say that the formalization of a narrative N is isomorphic to the formalization of a narrative M in a formal framework F iff for all S ∈ F(N) there exists an S′ ∈ F(M) such that S ≃ S′ and for all S ∈ F(M) there exists an S′ ∈ F(N) such that S ≃ S′.
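The two-sided condition above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `formalizations_isomorphic`, the toy structures (tuples of node labels) and the stand-in isomorphism test `same_shape` are hypothetical and are not part of the framework; a real framework would supply its own notion of ≃.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def formalizations_isomorphic(
    f_n: Iterable[S],
    f_m: Iterable[S],
    iso: Callable[[S, S], bool],
) -> bool:
    """Check the two-sided matching condition between F(N) and F(M).

    Every structure in F(N) needs an isomorphic partner in F(M),
    and every structure in F(M) needs an isomorphic partner in F(N).
    """
    f_n, f_m = list(f_n), list(f_m)
    n_covered = all(any(iso(s, s2) for s2 in f_m) for s in f_n)
    m_covered = all(any(iso(s, s2) for s2 in f_n) for s in f_m)
    return n_covered and m_covered


def same_shape(a: tuple, b: tuple) -> bool:
    # Hypothetical stand-in for the framework's isomorphism: two toy
    # structures count as isomorphic when they have the same length.
    return len(a) == len(b)


# One narrative with a single model, another with two models of the same shape.
print(formalizations_isomorphic([("v0", "t0")], [("a", "b"), ("c", "d")], same_shape))
```

Note that a narrative with several formalizations is only isomorphic to another narrative if each of its formalizations finds a partner, which is why both directions are checked.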

A formal framework is not only defined by its syntax. Its semantics are found in the formalization process, which is defined as the mapping from a narrative N given in a natural language, which may contain extralingual information in the form of video material, to at least one mathematical model² (see Section 4.2.4). This mathematical model, the product of the formalization process, is called a formalization or F(N). Equation (1) shows the definition of this mapping.

$$F : N \longmapsto F(N) \tag{1}$$

¹ In the following we follow Löwe (2010).

To illustrate how the formalization process would take place for an arbitrary formal framework, we first present a short fictitious story:

John and Mary are a happily married couple. Mary trusts John and believes that he would never cheat on her. John has always been confident that Mary would not leave him if she found out. However, one day Mary finds out that John is having an affair. In feelings of loss and grief, she files for divorce. When John returns home from his girlfriend, Mary confronts John and tells him she has filed for divorce.

When formalizing this narrative we use a very simple version of the syntax defined by DPF, in which we consider only events and actions that are taken by the agents themselves. The agents are John (J) and Mary (M). We denote an event with the symbol E. Non-terminal nodes are denoted as v_i and terminal nodes as t_i. The resulting formalization of the example story is shown in Figure 1 below.

[Game tree: event node v0 (E) branches to t0 and v1; action node v1 (M) branches to t1 and t2.]

Figure 1: The formalization of John cheating on Mary

The story starts in the non-terminal node v0, where an event takes place. If Mary does not find out about the affair, the story ends in terminal node t0. Mary does find out that John is having an affair, so the story reaches non-terminal node v1. It is now Mary’s turn to choose between two actions. Mary could decide not to divorce John, in which case the story would reach terminal node t1. However, she decides to divorce, so the story reaches terminal node t2.
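As a minimal sketch of how such a tree can be written down and traversed, the Python fragment below encodes the nodes of Figure 1 as described in the text. The dictionary layout and the helper `follow` are illustrative assumptions, not notation from DPF.

```python
# Successors of each non-terminal node in Figure 1; terminal nodes have none.
TREE = {
    "v0": ["t0", "v1"],
    "v1": ["t1", "t2"],
}

# Who moves at each non-terminal node: "E" marks an event, "M" is Mary.
MOVER = {"v0": "E", "v1": "M"}


def follow(choices: dict, root: str = "v0") -> list:
    """Return the path from the root, given the successor taken at each node."""
    path = [root]
    while path[-1] in TREE:
        current = path[-1]
        nxt = choices[current]
        if nxt not in TREE[current]:
            raise ValueError(f"{nxt} is not a successor of {current}")
        path.append(nxt)
    return path


# The story as told: Mary finds out (v0 -> v1) and files for divorce (v1 -> t2).
actual_story = follow({"v0": "v1", "v1": "t2"})
print(actual_story)

# The alternative in which Mary never finds out ends immediately in t0.
print(follow({"v0": "t0"}))
```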

2.1 Methodology

The result of the formalization process may include multiple formalizations if the narrative contains imperfect or incomplete information. We present an example of this in Figure 9 and in Figure 10. The formalization process should not be called entirely formal: the narrative is given in natural language and is thus informal in nature, so rendering this informal input into a formal structure necessarily requires human input. The following sequence of steps describes how the formalization process can be performed:

1. Create a timeline of actions, events and discourse elements
2. Define the set of agents that play a role in the storyline
3. Annotate the actions, events and/or states that are relevant to the storyline
4. Create the formalization using the syntax of the formal framework
5. Identify structures from the formalization

We discuss each step in turn. The first step is needed because a narrative is given in a natural language, and it may contain information that goes beyond the narrative itself (such as video material). A narrative may also contain elements of story and discourse that do not contribute to a particular narrative. The narrative has to be analyzed thoroughly in order to capture all the elements, including but not limited to elements that contribute to the story.

The second step involves the inclusion of only the agents that actually contribute to the story. In the third step, we determine which actions and events are relevant in the story. For instance, in the case of video material, we could fix a level of granularity by deciding that only elements are included that extend beyond a time period fixed in advance. A disadvantage of this approach is that, if we fix a time period of two seconds, important facts that are conveyed by facial expressions lasting less than two seconds would be lost.

The fourth step is to create the formalization using a particular formal framework, for example DPF or PUF. The theory of both these formal frameworks is covered in Section 3. Finally, the formalization may contain specific structures that can be identified. Formalizations in DPF are presented in Section 4.

² The reason that more than one model is possible is discussed in Section 2.1 and Section 4.3.
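The duration-based granularity rule mentioned for the third step can be sketched as a simple filter. The `Element` record, the example timeline and the function name `filter_by_duration` are hypothetical and serve only to make the stated disadvantage concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A timeline element with a label and its on-screen duration in seconds."""
    label: str
    duration_s: float


def filter_by_duration(elements, min_duration_s):
    """Keep only the elements that extend beyond the fixed time period."""
    return [e for e in elements if e.duration_s > min_duration_s]


# Hypothetical timeline: a long scene and a short but meaningful expression.
timeline = [
    Element("suspect is questioned", 45.0),
    Element("brief facial expression", 1.5),
]

kept = filter_by_duration(timeline, min_duration_s=2.0)
print([e.label for e in kept])
```

With a threshold of two seconds the brief facial expression is dropped, even though it may carry information that is relevant to the story.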

3 The formal frameworks

In the previous section we explained that formalizing narratives is an informal task. This section covers the theory of two formal frameworks for narrative formalization and provides a few examples. We start with the Plot Units Framework (PUF), followed by the Doxastic Preference Framework (DPF).

3.1 The Plot Units Framework

Lehnert (1981) proposed a formal framework for the formalization of narratives based on how humans summarize narratives, i.e. how humans represent such summaries internally in memory. We call this framework the Plot Units Framework (PUF). The PUF transforms narratives given in a natural language into a chronological configuration, or matrix, of plot unit structures as in Definition 1, which is a subset of the class of PUF structures. Plot units in turn are configurations of affect states. The PUF represents narratives by PUF structures, which are defined in Definition 2. The class of PUF structures is denoted by $F_{\mathrm{PUF}}$. First we present several definitions and axioms that form the basis of this formal framework, and we finish with a few examples. A formalization according to the PUF is an informal assignment of a PUF structure to a narrative, in symbols:

Definition 1. $F_{\mathrm{PUF}} : N \longmapsto F_{\mathrm{PUF}}(N) \subseteq F_{\mathrm{PUF}}$, where $F_{\mathrm{PUF}}$ is the class of PUF structures defined in Definition 2.

Definition 2. A plot unit structure $Q = \langle A, T, <, S, L_c, L_m, L_a, L_t, L_e \rangle$ is called a PUF structure if $A$ is a finite set of agents, $T$ is a finite set of discrete time nodes, $S$ is a function $S : A \times T \longrightarrow \{\emptyset, M, +, -\}$ (we call it a state), $<$ is a linear order on $T$, $L_m$ defines the pairwise causal link Motivation satisfying Axiom 1, $L_a$ defines the pairwise causal link Actualization satisfying Axiom 2, $L_t$ defines the pairwise causal link Termination satisfying Axiom 3, $L_e$ defines the pairwise causal link Equivalence satisfying Axiom 3, and $L_c$ defines the set of cross-character causal links satisfying Axiom 4.

All possible formalizations in PUF consist of the set of all possible PUF structures as given in Definition 3.

Definition 3. $F_{\mathrm{PUF}} := \{Q \,;\, Q \text{ is a PUF structure}\}$

The state function takes as its input an agent α ∈ A and a discrete time node t ∈ T. Its output is one of the three affect states defined for PUF, which are presented in Table 1, or ∅. Note that ∅ is not actually an affect state; it merely denotes that none of the three affect states is present at that discrete time node in the plot unit matrix. When an agent has the affect state mental state, the agent has a desire or wish, which does not embody any emotions. Positive events and negative events do embody emotions and their interpretation is straightforward: they are events that provide pleasure and pain, respectively.

Symbol   Affect state
M        mental state
+        positive event
−        negative event

Table 1: The PUF affect states

Axiom 1 states that for all agents and for all pairs of discrete time nodes we have the pairwise causal link motivation (m) iff the link points forward in time given the two discrete time nodes and the affect state of the agent in the later time node is a mental state.

Axiom 1. For all $\alpha \in A$ and for all $x, y \in T$, we have $L_m^{\alpha}(x, y)$ iff $x > y$ and $S(\alpha, x) \in \{M, -, +\}$ and $S(\alpha, y) \equiv M$.

Axiom 2 states that for all agents and for all pairs of discrete time nodes we have the pairwise causal link actualization (a) iff the link points forward in time given the two discrete time nodes and the affect state of the agent in the earlier time node is a mental state and that the affect state in the later time node is an event.

Axiom 2. For all $\alpha \in A$ and for all $x, y \in T$, we have $L_a^{\alpha}(x, y)$ iff $x > y$ and $S(\alpha, x) \equiv M$ and $S(\alpha, y) \in \{+, -\}$.

Axiom 3 states that for all agents and for all pairs of discrete time nodes we have the pairwise causal link termination (t) or equivalence (e) iff the link points backwards in time given the two discrete time nodes and the affect state of the agent in both time nodes are either a mental state or an event.

Axiom 3. For all $\alpha \in A$, for all $x, y \in T$ and for all $z \in \{t, e\}$, we have $L_z^{\alpha}(x, y)$ iff $x < y$ and (($S(\alpha, x) \equiv M$ and $S(\alpha, y) \equiv M$) or ($S(\alpha, x) \in \{+, -\}$ and $S(\alpha, y) \in \{+, -\}$)).

Axiom 4 states that for all pairs of agents and for all pairs of discrete time nodes we have a cross-character causal link (c) iff the two time nodes are distinct, so that the link may go forward or backward in time, the affect state of the first agent in the first time node is any affect state, and the affect state of the second agent in the second time node is any affect state. The set of cross-character causal links thus consists of all possible pairs of affect states across agents, which gives a total of nine.

Axiom 4. For all $\alpha \in A$, for all $\beta \in A$ and for all pairs $x, y \in T$, we have $L_c^{\alpha,\beta}(x, y)$ iff $x \neq y$ and $S(\alpha, x) \in \{M, +, -\}$ and $S(\beta, y) \in \{M, +, -\}$.
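The affect-state conditions of Axioms 1 to 4 can be enumerated mechanically. The sketch below is an illustration only: it checks the affect-state part of each axiom and deliberately leaves out the temporal condition on x and y; the function name `link_allowed` and the ASCII symbol "-" for the negative event are assumptions made for this example.

```python
from itertools import product

AFFECT_STATES = ("M", "+", "-")  # mental state, positive event, negative event
EVENTS = ("+", "-")


def link_allowed(kind: str, src: str, dst: str) -> bool:
    """Affect-state condition for a within-character link of the given kind."""
    if kind == "m":  # motivation: any affect state, then a mental state
        return src in AFFECT_STATES and dst == "M"
    if kind == "a":  # actualization: a mental state, then an event
        return src == "M" and dst in EVENTS
    if kind in ("t", "e"):  # termination / equivalence: both mental or both events
        return (src == "M" and dst == "M") or (src in EVENTS and dst in EVENTS)
    raise ValueError(f"unknown link kind: {kind}")


# All admissible (link, first state, second state) triples within one character.
primitive_pairs = [
    (kind, src, dst)
    for kind in ("m", "a", "t", "e")
    for src, dst in product(AFFECT_STATES, repeat=2)
    if link_allowed(kind, src, dst)
]

# Cross-character links admit every pair of affect states (Axiom 4).
cross_character_pairs = list(product(AFFECT_STATES, repeat=2))

print(len(primitive_pairs), len(cross_character_pairs))
```

Counting the triples gives 3 for motivation, 2 for actualization and 5 each for termination and equivalence, i.e. 15 in total, next to the nine cross-character pairs.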

From the constraints set by the definitions and axioms of PUF, Lehnert determined that 15 pairs of affect states connected by a pairwise causal link are possible. These form the primitive plot units, from which very complex plot units may emerge. Lehnert presents many examples of such complex plot units in her paper. We start by giving a simple example of the primitive plot unit success in Figure 2. Of course, many interpretations of this plot unit are possible. An example interpretation could be: “John wants a beer (M). He goes to the kitchen, grabs a can of beer from the refrigerator and opens it (a). John takes a sip (+)”. Note that the chronological order goes downward in the plot unit matrix.

[Plot unit matrix with a single column, Agent: M, linked by a to +.]

Figure 2: Example formalization of the plot unit success in PUF

Now we present a more complex example, the formalization of John cheating on Mary in PUF, which is shown in Figure 3. From Mary’s point of view, she is happy at first, but when she finds out about the affair she experiences feelings of loss and grief. This is the primitive plot unit loss, in which a negative event terminates (t) a positive event. This negative event is a problem for Mary, which motivates (m) her to file for divorce. She actualizes (a) this by filing for divorce successfully, which is why the story ends in a positive event for Mary. Note that this does not terminate the earlier negative event of finding out about the affair. From John’s point of view, he is happy at first, and when he returns home he experiences the negative event of hearing about the divorce, which is also the primitive plot unit loss.

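[Plot unit matrix with the columns John and Mary. Mary: +, then − (linked by t), then M (linked by m), then + (linked by a). John: +, then − (linked by t).]

Figure 3: The formalization of John cheating on Mary in PUF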

3.2 The Doxastic Preference Framework

In this section we explain the theory behind the Doxastic Preference Framework (DPF); more elaborate definitions and explanations can be found in Löwe and Pacuit (2008) and Löwe et al. (2009). DPF bases all of its formalizations on (iterated) beliefs of agents about the preferences of (other) agents. This framework transforms narratives given in a natural language (unlike PUF not always in chronological order, as we discuss in Section 4) into a tree of action nodes, in which agents decide to take an action, and event nodes, in which events happen without the agents making any decisions (see Definition 4). The formalization is an informal assignment of DPF structures to narratives, in symbols:

Definition 4. $F_{\mathrm{DPF}} : N \longmapsto F_{\mathrm{DPF}}(N) \subseteq F_{\mathrm{DPF}}$, where $F_{\mathrm{DPF}}$ is the class of DPF structures defined in Definition 5.

Definition 5. A structure $Q = \langle I, T, \mu, S \rangle$ is called a DPF structure if $I$ is a finite set of agents denoted with bold capital letters, $T$ is a finite game tree in which any two nodes are connected by exactly one path, $\mu$ is a moving function such that $\mu : T \setminus tn(T) \longrightarrow I$, and $S$ is called a state such that $S : T \times I^{\leq dp(T)} \longrightarrow \mathcal{P}^I$.

Table 2 lists and explains the functions and variables on which the DPF structure depends:

Symbol          Meaning
$P$             a symbol denoting an agent
$E$             a reserved symbol denoting an event
$\vec{P}$       a finite sequence of agents (Definition 7)
$\vec{P}P'$     adding an agent $P'$ to $\vec{P}$ (Definition 8)
$root_T$        the root of $T$
$tn(T)$         the set of terminal nodes of $T$
$succ_T(t)$     the set of immediate successor nodes of $t$ in $T$
$dp(T)$         the depth of $T$, i.e. the number of nodes on the longest path in $T$
$\succ$         linear orders on $tn(T)$, i.e. preferences
$\mathcal{P}$   a finite set of preferences

Table 2: Functions and variables on which the DPF structure depends
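The tree functions of Table 2 can be illustrated with a small Python class. This is a sketch under assumptions: the class name `GameTree` and its method names are hypothetical, and the depth is read as the number of nodes on the longest path from the root. The example tree is the one described for the John and Mary story.

```python
class GameTree:
    """A finite tree given by its root and a successor map."""

    def __init__(self, root, successors):
        self.root = root
        self._succ = successors  # maps a node to the list of its successors

    def succ(self, t):
        """succ_T(t): the immediate successor nodes of t."""
        return list(self._succ.get(t, []))

    def terminal_nodes(self):
        """tn(T): all nodes without successors, in left-to-right order."""
        found, stack = [], [self.root]
        while stack:
            t = stack.pop()
            children = self.succ(t)
            if children:
                stack.extend(reversed(children))
            else:
                found.append(t)
        return found

    def depth(self, t=None):
        """dp(T): the number of nodes on the longest path starting at t (default: root)."""
        t = self.root if t is None else t
        return 1 + max((self.depth(c) for c in self.succ(t)), default=0)


tree = GameTree("v0", {"v0": ["t0", "v1"], "v1": ["t1", "t2"]})
print(tree.terminal_nodes(), tree.depth())
```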

All possible formalizations in DPF consist of the set of all possible DPF structures as given in Definition 6.

Definition 6. $F_{\mathrm{DPF}} := \{Q \,;\, Q \text{ is a DPF structure}\}$

A finite sequence of agents is written as in Definition 7. We can add a new agent $P'$ to the sequence $\vec{P}$ as in Definition 8.

Definition 7. $\vec{P} = \langle P_0, \ldots, P_n \rangle$

Definition 8. $\vec{P}P' = \langle P_0, \ldots, P_n, P' \rangle$

The moving function in Definition 9 determines which agent may decide to take an action at a non-terminal node in T. We have action nodes for µ(t) = P, where P is an agent, and event nodes for µ(t) = E, where E is a reserved symbol that denotes an event. Note that agents choose actions in action nodes based on their preferences and beliefs. We can distinguish two interpretations of event nodes: (1) agents choose actions that are not based on their beliefs or preferences at all, or (2) agents do not act at all, i.e. an event takes place.

Definition 9. $\mu : T \setminus tn(T) \longrightarrow I$

We define preferences of agents as linear orders (denoted by $\succ$) on the terminal nodes tn(T), and $\mathcal{P}$ is the set of these preferences. Definition 10 defines descriptions, which map agents to their respective preferences. The function in Definition 11 is called a state; it maps a node t in T and a sequence of agents from $I^{\leq dp(t)}$ to a description.³ For example, the description S(t, ∅) denotes the true preferences of all agents in node t. That is, the symbol ∅ causes this description to denote the zeroth-order beliefs of agents, which are their actual preferences. We also assume introspectivity of agents. First- and higher-order descriptions are denoted as $S(t, Q\vec{P})$, which stands for the belief of agent Q about $S(t, \vec{P})$.

Definition 10. $\succ : I \longrightarrow \mathcal{P}$

Definition 11. $S : T \times I^{\leq dp(T)} \longrightarrow \mathcal{P}^I$

³ We say $I^{\leq dp(t)}$ instead of $I$ because new agents may be introduced in nodes at a greater depth than t.

We do not always have full states, which contain the beliefs and preferences of all agents. To fix the description of a single agent Q, which we call a partial state, the notation S(t, ∅)(Q) is used. Non-terminal nodes in T are denoted as v_i and terminal nodes as t_i. A partial state lists nodes from most preferred to least preferred:

$$S(v_i, \vec{P})(Q) = (t_{i_0}, t_{i_1}, \ldots, t_{i_n}) \tag{2}$$

The partial state below means that agent Q prefers terminal node t_j over every terminal node that follows from v_k.

$$S(v_i, \vec{P})(Q) = (t_j, v_k) \tag{3}$$

The partial state below means that agent Q prefers every terminal node that follows from v_j over every terminal node that follows from v_k.

$$S(v_i, \vec{P})(Q) = (v_j, v_k) \tag{4}$$

Formalizations in DPF are sequences of building blocks. The most basic building blocks and a few more complex building blocks are illustrated in Section 2.4 of Löwe, Pacuit and Saraf (2009). Here we only illustrate the basic building block of an agent deciding on an action (see Figure 4). From the partial state it can be seen that agent Q chooses the action that leads to terminal node t0.

[Game tree: action node v0 (Q) branches to t0 and t1.]

S(v0, ∅)(Q) = (t0, t1)

Figure 4: The basic DPF building block action
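How a partial state determines the choice in the basic building block can be sketched as follows. This is a deliberately naive, one-step reading and not the full DPF semantics: it ignores the later moves and beliefs of other agents. The names `SUCC`, `terminals_below` and `chosen_successor` are hypothetical; entries of a partial state that are non-terminal nodes are expanded to the terminal nodes below them, in the spirit of Equations (3) and (4).

```python
# Successor map of the basic building block in Figure 4.
SUCC = {"v0": ["t0", "t1"]}


def terminals_below(node):
    """All terminal nodes reachable from node (a terminal node yields itself)."""
    children = SUCC.get(node, [])
    if not children:
        return [node]
    return [t for child in children for t in terminals_below(child)]


def chosen_successor(node, preference):
    """Pick the successor that can reach the most preferred reachable node.

    `preference` is a partial state: a tuple of nodes, most preferred first.
    """
    for entry in preference:
        wanted = set(terminals_below(entry))
        for child in SUCC.get(node, []):
            if wanted & set(terminals_below(child)):
                return child
    raise ValueError(f"no successor of {node} reaches a preferred node")


# S(v0, ∅)(Q) = (t0, t1): agent Q prefers t0 over t1 and therefore moves to t0.
print(chosen_successor("v0", ("t0", "t1")))
```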

Now we give an example of the formalization of the story John cheating on Mary in DPF, which is presented in Figure 5. We start in the non-terminal node v0, where an event takes place. If Mary does not find out about the affair, the story ends in terminal node t0. Mary does find out that John is having an affair, which is an unexpected event for Mary (UnEv(M)). The story thus reaches non-terminal node v1. It is Mary’s turn to choose between two actions. Mary could decide not to divorce John, in which case the story would reach terminal node t1. However, she decides to divorce, so the story reaches terminal node t2. This is an unexpected action for John taken by Mary (UnAc(J, M)).

[Game tree: event node v0 (E) branches to t0 and v1; action node v1 (M) branches to t1 and t2.]

S(v0, ∅)(J) = (v1, t0); S(v0, J)(M) = (t0, v1); S(v0, M)(J) = (t2, t3); S(v1, ∅)(E) = (v2, t1); S(v0, E)(M) = (t1, v2); S(v2, ∅)(M) = (t3, t2)

Figure 5: The formalization of John cheating on Mary in DPF

4 Formalization of narratives from CSI episodes in DPF

4.1 CSI: Crime Scene Investigation™

In this section we present and discuss the formalizations in DPF of seven narratives from four episodes of the TV crime series CSI: Crime Scene Investigation™. A number of these formalizations are compared to the formalizations from Löwe et al. (2009). CSI is an interesting series to formalize in DPF, because agents choose actions based on their own preferences and their beliefs about the preferences of other agents. The CSI team, however, has to follow a strict rule to put their preferences aside and to choose their actions based only on the evidence. Nodes at which they do so become event nodes, because DPF does not incorporate beliefs about facts. This is also mentioned in Löwe et al. (2009).

Recall that the formalization process is informal. In Section 2.1 we described a method by which the formalization process could take place. Most important is that the narratives are analyzed carefully, because they are presented in natural language and contain information that goes beyond the presentation of the narratives themselves (i.e. video material).

Story and discourse have to be distinguished as well. These two main features of a narrative are often interleaved. Most discourse elements are excluded from the formalization. However, on some occasions discourse elements contain information that should also be contained in the formalization. As in Löwe, Pacuit and Saraf (2009), we focus on the actual behavior of the agents and not on what would have been their rational behavior. We argue why certain choices were made during formalization. We also identify features that are missing in the formalizations. Improvements that can be added to DPF are discussed in Section 6.1.

A formalization of a narrative in DPF is a sequence of event nodes and action nodes. From the formalizations we can identify commonly occurring structures, which are called the building blocks.
Recall that agents are denoted with a bold capital letter, and that events are denoted with the reserved capital letter E. The symbol of each agent in the CSI team is reused for every narrative in which an agent may occur.

Jim Brass (B); Warrick Brown (W); Gil Grissom (G); Sara Sidle (S); Nick Stokes (N); Catherine Willows (C)

4.2 Formalizations and analysis

Below we present and analyze the eight formalizations. Appendix A contains the timeline of events, actions and discourse elements that contributed to the story. During analysis we identified the structures, which we call the building blocks. The descriptions beneath the graphs are exactly the states that we described in Section 3.2. They are based on the building blocks that have been identified.⁴

4.2.1 Trick roll

Kristy Hopkins, a prostitute, seduces Mr. Laferty in a casino and they go upstairs to his room. Hopkins puts a substance on her nipples to make Mr. Laferty pass out and to steal his belongings. Nick sees a discoloration on Mr. Laferty’s lips and connects the cases when he receives information about prostitutes having discoloration on their nipples, and verifies it when he meets Hopkins. Nick gives Hopkins the choice of handing over the substance she was using and returning Mr. Laferty’s belongings, without further consequences. Hopkins chooses to do this and Mr. Laferty is given back his belongings.

⁴ Multiple occurrences of the same description below a graph mean that this description belongs to multiple building blocks.

[Graph: nodes v0 to v5, labeled H, L, H, E, N, H in that order; terminal nodes t0 to t6]

S(v0, ∅)(H) = (t3, t0); S(v1, ∅)(L) = (t2, t1); S(v1, L)(H) = (t2, v3); S(v1, ∅)(H) = (t3, t2); S(v2, ∅)(H) = (t3, t2); S(v2, H)(E) = (t3, v4); S(v3, ∅)(E) = (v4, t3); S(v4, ∅)(N) = (t6, t4); S(v4, N)(H) = (t6, t5); S(v5, ∅)(H) = (t6, t5)
Figure 6: The formalization of Trick roll (episode 1; Kristy Hopkins (H), Mr. Laferty (L))
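State descriptions such as those under Figure 6 can be read into a small data structure, which may help when building a larger dataset of formalizations. The following is a minimal sketch, not part of DPF itself: the function name `parse_states` and the tuple layout are assumptions made for illustration, and the code only parses the notation without interpreting the states.

```python
import re
from collections import defaultdict

# One state description looks like "S(v0, ∅)(H) = (t3, t0)".
# This sketch only parses the notation; it assigns no meaning to the fields.
STATE_RE = re.compile(r"S\((\w+),\s*([^)]*)\)\((\w+)\)\s*=\s*\(([^)]*)\)")


def parse_states(text):
    """Return a list of (node, sequence, agent, ordering) tuples."""
    states = []
    for node, seq, agent, ordering in STATE_RE.findall(text):
        seq = "" if seq.strip() == "∅" else seq.strip()  # "" stands for the empty sequence
        nodes = tuple(part.strip() for part in ordering.split(","))
        states.append((node, seq, agent, nodes))
    return states


# The state descriptions of Figure 6, copied from the text.
FIGURE_6 = (
    "S(v0, ∅)(H) = (t3, t0); S(v1, ∅)(L) = (t2, t1); S(v1, L)(H) = (t2, v3); "
    "S(v1, ∅)(H) = (t3, t2); S(v2, ∅)(H) = (t3, t2); S(v2, H)(E) = (t3, v4); "
    "S(v3, ∅)(E) = (v4, t3); S(v4, ∅)(N) = (t6, t4); S(v4, N)(H) = (t6, t5); "
    "S(v5, ∅)(H) = (t6, t5)"
)

# Group the parsed descriptions by the node they refer to.
by_node = defaultdict(list)
for node, seq, agent, ordering in parse_states(FIGURE_6):
    by_node[node].append((seq, agent, ordering))

for node in sorted(by_node):
    print(node, by_node[node])
```

Grouping by node makes it easy to see which nodes carry more than one description, for instance because they belong to several building blocks.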

The formalization of the narrative Trick roll (Figure 6) consists of the following sequence of building blocks: Ac(H), UnAc(L, H), UnEv(H) and ExAc(N, H). The formalization has been made more elaborate than its previous version in Section 4 of (Löwe, Pacuit and Saraf, 2009). To the left we have the action by Hopkins of choosing to seduce Mr. Laferty. On the right we have the event of Nick Stokes connecting the cases, which results in Kristy Hopkins’s action of choosing to return Mr. Laferty’s belongings and then being allowed to walk free. Because Nick Stokes connected the cases he was able to force Hopkins to give Laferty his belongings back, but he offered not to have her arrested if she complied. Hopkins did not expect this, and preferred not to get arrested, so she handed over the substance and Mr. Laferty’s belongings. Finally, Mr. Laferty gets his belongings back. We could have modeled this in the formalization as an unexpected event, but it was left out, because it does not further contribute to the story.

4.2.2 The killed house guest

The husband has long been tired of his wife’s drunk friend Jimmy, who was allowed to stay at their place on her behalf. The husband plots to kill the friend by kicking him out and, knowing that Jimmy will not accept this, has his wife open the door. He fatally shoots Jimmy and puts on one of Jimmy’s shoes to kick open the door, but injures his pinky toenail in the process. The husband lies about his injured pinky toe, but the CSI team eventually finds a chipped pinky toenail in the shoe. With the warrant that Warrick managed to get from Judge Cohen behind Brass’s back, the CSI team matches the chipped toenail with a toenail clipping that the husband cut himself with a nail clipper. The husband is arrested afterwards.

[Graph: nodes v0 to v9, labeled H, I, H, E, W, B, W, J, H, E in that order; terminal nodes t0 to t10]

S(v0, ∅)(H) = (t3, t0); S(v0, H)(I) = (v2, t1); S(v1, ∅)(I) = (v2, t1); S(v2, ∅)(H) = (t3, t2); S(v2, H)(E) = (t3, v4); S(v3, ∅)(E) = (v4, t3); S(v4, ∅)(W) = (t5, t4); S(v4, W)(B) = (t5, v6); S(v5, ∅)(B) = (v6, t5); S(v6, ∅)(W) = (v8, t6); S(v6, W)(J) = (v8, t7); S(v7, ∅)(J) = (v8, t7); S(v8, ∅)(H) = (t9, t8); S(v8, H)(E) = (t9, t10); S(v9, ∅)(E) = (t10, t9)
Figure 7: The formalization of The killed house guest (episode 1; Husband (H), Jimmy (I), Judge Cohen (J))

This formalization consists of a sequence of five building blocks: ExAc(H, I), UnEv(H), UnAc(W, B), ExAc(W, J) and UnEv(H). The event of the husband requesting his wife to open the door was deliberately left out of this formalization.

4.2.3 Winning a fortune

Jamie asks her boyfriend Ted for money to play at a slot machine. Unwilling to give it to her, he plays himself with his own money, and he wins the 40 million dollar jackpot. He dumps Jamie, whereupon she decides to murder Ted and cover it up. The CSI unit manages to find out how it happened and Jamie is arrested.

[Graph: nodes v0 to v5, labeled J, T, E, T, J, E in that order; terminal nodes t0 to t6]

S(v0, ∅)(J) = (t1, t0); S(v0, J)(T) = (t1, v2); S(v1, ∅)(T) = (v2, t1); S(v1, ∅)(T) = (t2, t1); S(v1, T)(E) = (t2, v3); S(v2, ∅)(E) = (v3, t2); S(v3, ∅)(T) = (t4, t3); S(v3, T)(J) = (t4, v5); S(v4, ∅)(J) = (t5, t4); S(v4, ∅)(J) = (t5, t4); S(v4, J)(E) = (t5, t6); S(v5, ∅)(E) = (t6, t5)
Figure 8: The formalization of Winning a fortune (episode 2; Jamie Smith (J), Ted Sallenger (T))

Except for minor differences in the descriptions, this formalization is equal to the one found in (Löwe, Pacuit and Saraf, 2009), since this narrative was straightforward to formalize. It is a sequence of four building blocks: UnAc(J, T), UnEv(T), UnAc(T, J) and UnEv(J). Note that the building blocks in this formalization are interleaved, instead of only being stacked.

4.2.4 Faked kidnapping

Laura cheats on her husband Jack with his personal trainer Chip. Chip (or Laura) proposes a faked kidnapping and Laura (or Chip) agrees to do it. However, when they are carrying out the kidnapping, Laura is hit by Chip and buried alive in a crate somewhere in the desert. Brass asks Jack not to pay the ransom, but it does not seem to work. The CSI unit does manage to find Laura before she runs out of air. Shortly after Jack has paid the ransom, Chip tries to snatch it, but is arrested by Brass. In the hospital Laura lies about the events. While Chip is interrogated by Brass, he is recorded but denies everything, after which he is allowed to leave. The CSI unit finds out what happened and Laura is arrested after she is confronted with the truth. Chip has been arrested again in the meantime.

[Graph: nodes v0 to v11, labeled L, R, L, R, J, E, R, B, L, R, B, E in that order; terminal nodes t0 to t12]

S(v0, ∅)(L) = (v1, t0); S(v1, ∅)(R) = (t7, t0); S(v1, R)(L) = (t3, t2); S(v2, ∅)(L) = (t3, v6, t5); S(v1, RL)(R) = (t3, v4); S(v2, L)(R) = (t3, v4); S(v3, ∅)(R) = (t7, t3); S(v4, ∅)(J) = (v5, t4); S(v4, J)(E) = (v5, t5); S(v5, ∅)(E) = (v5, t4); S(v6, ∅)(R) = (t7, t6); S(v6, R)(B) = (t7, v8); S(v7, ∅)(B) = (v8, t7); S(v9, ∅)(R) = (t10, t9); S(v9, R)(B) = (t10, v11); S(v10, ∅)(B) = (v11, t10); S(v8, ∅)(L) = (t11, t8); S(v8, L)(E) = (t11, t12); S(v11, ∅)(E) = (t12, t11)
Figure 9: The first possible formalization of Faked kidnapping (episode 3; Chip Rundle (R), Laura Garris (L), Jack Garris (J))

[Graph: nodes v0 to v10, labeled L, L, R, J, E, R, B, L, R, B, E in that order; terminal nodes t0 to t11]

S(v0, ∅)(L) = (v1, t0); S(v1, ∅)(L) = (t2, v5, t4); S(v1, L)(R) = (t2, v3); S(v2, ∅)(R) = (t6, t2); S(v3, ∅)(J) = (v4, t3); S(v3, J)(E) = (v4, t4); S(v4, ∅)(E) = (v4, t3); S(v5, ∅)(R) = (t6, t5); S(v5, R)(B) = (t6, v7); S(v6, ∅)(B) = (v7, t6); S(v8, ∅)(R) = (t9, t8); S(v8, R)(B) = (t9, v10); S(v9, ∅)(B) = (v10, t9); S(v7, ∅)(L) = (t10, t7); S(v7, L)(E) = (t10, t11); S(v10, ∅)(E) = (t11, t10)
Figure 10: The second possible formalization of Faked kidnapping (episode 3; Chip Rundle (R), Laura Garris (L), Jack Garris (J))

As the two formalizations above show, a narrative may produce more than one formalization. The ambiguity in this narrative lies in whether Chip or Laura proposed the kidnapping, which is why the second building blocks of the two possible formalizations differ. This formalization has also been made more elaborate than the one found in (Löwe, Pacuit and Saraf, 2009). It consists of the following sequence of building blocks: Ac(L), (Betr(R, L) or UnAc(L, R)), ExEv(J), UnAc(R, B), UnAc(R, B) and UnEv(L). The first building block, where Laura takes the action of choosing to have an affair with Chip, was added because it is directly relevant for the storyline that follows. We discuss the second building block in Section 4.3.

The choice for building block UnAc(R, B) deserves its own discussion. This building block could have been modeled as an event, because it is natural for B to tail the ransom if nobody else is volunteering. What we actually want to discuss about this building block is that in the formalization, the order of actions is reversed. The actual order of events is that first Brass chooses to tail the ransom, and thereafter R attempts to take it. R does not know whether he is being watched by the police or not, so he is not able to reason whether he will get arrested if he takes the money. Here we have an example of imperfect information. Because DPF assumes perfect information games, we had to convert it to a perfect information representation. The second occurrence of this building block follows the same reasoning.

4.2.5 Hit and run

James Moore is driving his grandfather’s car and hits a young girl on her scooter while he is not paying attention. Not knowing what to do, he drives away to his grandfather Charles Moore and asks him to come along to the police station. But Charles does not want his grandson to go to jail and so they do not go to the police. The CSI unit traces the plate number from marks on the girl’s leg and Charles is arrested, lying that he hit the girl. His grandson James is traced from the insurance on the car. The CSI unit checks the car and comes to the conclusion that James drove that car. Charles still denies it and does not want James to tell the story. After the CSI unit finds a chipped tooth from James in the steering wheel, they can no longer hide the truth and James is arrested.

[Graph: nodes v0 to v6, labeled J, E, J, C, E, C, E in that order; terminal nodes t0 to t7]

S(v0, ∅)(J) = (t1, t0); S(v0, J)(E) = (t1, v2); S(v1, ∅)(E) = (v2, t1); S(v2, ∅)(J) = (t3, t2); S(v3, ∅)(C) = (t4, t3); S(v3, C)(E) = (t4, v5); S(v4, ∅)(E) = (v5, t4); S(v5, ∅)(C) = (t6, t5); S(v5, C)(E) = (t6, t7); S(v6, ∅)(E) = (t7, t6)
Figure 11: The formalization of Hit and run (episode 3; Charles Moore (C), James Moore (J))

This formalization consists of the following sequence of building blocks: UnEv(J), Ac(J), UnEv(C), UnEv(C). Compared to the formalization in (Löwe, Pacuit and Saraf, 2009), Ac(J) was added, because J’s decision to leave the scene after the crash, instead of calling the emergency services, was important to the storyline and the resulting formalization. We identified three unexpected events. However, one could argue that the decision made by J had the implication that C chose the action of convincing J not to report himself. For this we can introduce the building block Reconsideration from advice, which can be described as an instance of changing preferences:

[Graph: nodes v0 to v2, labeled J, C, J in that order; terminal nodes t0 to t3]

S(v0, ∅)(J) = (t2, t0); S(v0, J)(C) = (t2, t1); S(v1, C)(J) = (t3, t2); S(v1, ∅)(C) = (t3, t1); S(v0, JC)(J) = (t2, t3); S(v2, ∅)(J) = (t3, t2)
Figure 12: The building block Reconsideration from advice (Charles Moore (C), James Moore (J))

4.2.6 Pledging gone wrong

James, a pledge going through sorority row, asked Jill, Kyle Travis’s girlfriend, to sign his penis for the initiation, which she agreed to do. Kyle then murdered James by hanging a piece of raw liver on a noose and pulling it while the liver was in James’s throat. A few moments later, Matt comes in, but is ordered by Kyle to keep his mouth shut. Together they cover up the murder by hanging James in his own room. The CSI unit finds out that James choked to death on the raw liver, after which they find the noose. Matt finally admits everything, after being confronted with the facts, and Kyle is charged with murder.

[Graph: nodes v0 to v7, labeled J, W, K, M, E, K, E, M in that order; terminal nodes t0 to t8]

S(v0, ∅)(J) = (t2, t1, t0); S(v0, J)(W) = (t2, t1); S(v1, ∅)(W) = (v2, t1); S(v0, JW)(K) = (t2, v3); S(v0, J)(K) = (t2, v3); S(v2, ∅)(K) = (t4, t2, t3); S(v2, K)(M) = (t4, t3); S(v2, K)(E) = (t4, v5); S(v2, KM)(E) = (t4, t5); S(v3, ∅)(M) = (t4, t3); S(v4, ∅)(E) = (v5, t4); S(v5, ∅)(K) = (t6, t5); S(v5, K)(E) = (t6, v7); S(v6, ∅)(E) = (v7, t6); S(v7, ∅)(M) = (t8, t7)
Figure 13: The formalization of Pledging gone wrong (episode 4; James Johnson (J), Jill Wentworth (W), Kyle Travis (K), Matt Daniels (M))

This formalization is equal to the one found in (Löwe, Pacuit and Saraf, 2009), with some minor corrections in the descriptions. It contains the following sequence of building blocks: UnCT(J, W, K), CoGW(K, M), UnEv(K) and Ac(M). Two of these building blocks have not been introduced in the previous formalizations. The building block UnCT(J, W, K) is Unsuccessful Collaboration with a Third: J’s goal of being accepted to the fraternity was supposed to be fulfilled by the successfully completed subgoal of getting signed by W, but it ends with K killing J. The other new building block is CoGW(K, M) (Collaboration gone wrong). This building block is the result of M helping K with the cover-up, but the cover-up itself failed (the CSI team found out) and therefore M had to confess during the last interrogation.

4.2.7 The severed leg

Two fishermen find a severed woman’s leg while fishing on a lake, and shortly thereafter the body is found by the police. It appears to be the body of Wendy Barger. Her husband Winston is notified and questioned. The CSI unit traces the evidence from Wendy’s body to Phil Swelco, with whom she had an affair. Winston follows the CSI unit and becomes aware of Swelco being involved in the investigation. He wonders how and why Swelco is involved in the case, whether he was the one Wendy might have had an affair with, and whether she was killed by him. Catherine confirms the affair when Winston notices Swelco at the police station. Winston then concludes for himself that Swelco killed his wife. In response Winston kills Swelco, before Grissom and Catherine can tell him that it was all an accident.

[Graph: nodes v0 to v3, labeled W, E, C, W in that order; terminal nodes t0 to t4]

S(v0, ∅)(W) = (t1, t0); S(v0, W)(E) = (t1, v2); S(v1, ∅)(E) = (v2, t1); S(v2, ∅)(C) = (t3, t2); S(v2, C)(W) = (t3, t4); S(v3, ∅)(W) = (t4, t3)
Figure 14: The formalization of The severed leg (episode 4; Winston Barger (W))

This formalization has been made more elaborate than its previous version, consisting of the following sequence of building blocks: UnEv(W) and UnAc(C, W). The first building block was added, because it models W’s surprise at seeing the CSI unit at Swelco’s house. We do not have enough information from this narrative to conclude whether W acted on his own assumptions or whether C influenced W’s preferences by confirming the affair. W told C that he knew his wife Wendy had an affair, but not with whom. C told W she was sorry for him, which W took as confirmation not only of the affair but, incorrectly, also of his suspicion that Swelco had murdered his wife. His curiosity during the investigation is a good indicator that Winston was after the murderer by himself, before Catherine even told him about the affair. This underlying uncertainty illustrates the difficulty of information processing during formalization.
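The building-block sequences listed in Sections 4.2.1 to 4.2.7 can be tallied to see how few distinct block types the seven narratives use. The sketch below is only an illustration: the dictionary layout and the function name `block_type_counts` are assumptions, and for Faked kidnapping it arbitrarily takes the Betrayal variant of the second building block.

```python
from collections import Counter

# Building-block sequences as listed in Section 4.2, written as (type, agents...).
# Assumption of this sketch: "Faked kidnapping" uses the Betr(R, L) variant.
FORMALIZATIONS = {
    "Trick roll": [("Ac", "H"), ("UnAc", "L", "H"), ("UnEv", "H"), ("ExAc", "N", "H")],
    "The killed house guest": [("ExAc", "H", "I"), ("UnEv", "H"), ("UnAc", "W", "B"),
                               ("ExAc", "W", "J"), ("UnEv", "H")],
    "Winning a fortune": [("UnAc", "J", "T"), ("UnEv", "T"), ("UnAc", "T", "J"), ("UnEv", "J")],
    "Faked kidnapping": [("Ac", "L"), ("Betr", "R", "L"), ("ExEv", "J"),
                         ("UnAc", "R", "B"), ("UnAc", "R", "B"), ("UnEv", "L")],
    "Hit and run": [("UnEv", "J"), ("Ac", "J"), ("UnEv", "C"), ("UnEv", "C")],
    "Pledging gone wrong": [("UnCT", "J", "W", "K"), ("CoGW", "K", "M"), ("UnEv", "K"), ("Ac", "M")],
    "The severed leg": [("UnEv", "W"), ("UnAc", "C", "W")],
}


def block_type_counts(formalizations):
    """Count how often each building-block type occurs over all narratives."""
    return Counter(block[0] for blocks in formalizations.values() for block in blocks)


counts = block_type_counts(FORMALIZATIONS)
for block_type, n in counts.most_common():
    print(block_type, n)
```

Under these assumptions the tally covers 29 building blocks of 8 distinct types, with unexpected events and unexpected actions occurring most often.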

4.3 Identifying deficiencies in DPF

DPF is a sparse formal framework, as mentioned in (Löwe, Pacuit and Saraf, 2009). Instead of focusing on its sparseness, we focus on deficiencies that arise from the analysis of its formalizations. A large dataset from which deficiencies may be identified is not at hand, so we can only identify deficiencies in DPF by analyzing the previously presented formalizations. For each of the deficiencies that were identified we discuss possible improvements to DPF in Section 6.1.

The following example is from episode 3. The narrative is ambiguous about whether Chip or Laura proposed the kidnapping, and the second building block depends on this. One could choose either of them as having proposed the kidnapping, so we can formalize the narrative in two ways. The second building block could either be betrayal (Figure 15) or unexpected action (Figure 16).⁵ Because the building blocks differ, the bijections do not match in the two formalizations, which shows that they are not isomorphic. This is rather problematic, because we have to argue whether the formalizations are actually all that different from our own perspective. We experiment with this example in Section 5.1.

[Graph: nodes v0 to v2, labeled R, L, R in that order; terminal nodes t0 to t3]

S(v0, ∅)(R) = (t3, t0); S(v0, L)(R) = (t2, t1); S(v1, ∅)(L) = (t2, t1); S(v0, RL)(R) = (t2, t3); S(v0, L)(R) = (t2, t3); S(v2, ∅)(R) = (t3, t2)
Figure 15: The building block Betrayal (Chip Rundle (R), Laura Garris (L))

⁵ These building blocks are Figures 5 and 3, respectively, from Löwe et al. (2009).

[Graph: nodes v0 and v1, labeled L, R in that order; terminal nodes t0 to t2]

S(v0, ∅)(L) = (t1, t2); S(v0, L)(R) = (t1, t2); S(v1, ∅)(R) = (t2, t1)
Figure 16: The building block Unexpected action (Chip Rundle (R), Laura Garris (L))
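To make the comparison of the two Faked kidnapping formalizations concrete, the sketch below checks whether two building-block sequences agree up to a consistent one-to-one renaming of agents. This is a deliberate simplification and not the isomorphism notion of DPF, which is defined on the formalizations themselves; the function name `same_up_to_renaming` and the tuple encoding are assumptions made for illustration.

```python
def same_up_to_renaming(seq_a, seq_b):
    """True if both sequences have the same block types in the same order
    and the agents correspond under one consistent bijection."""
    if len(seq_a) != len(seq_b):
        return False
    forward, backward = {}, {}
    for (type_a, *agents_a), (type_b, *agents_b) in zip(seq_a, seq_b):
        if type_a != type_b or len(agents_a) != len(agents_b):
            return False
        for a, b in zip(agents_a, agents_b):
            # Each agent must map to exactly one agent, in both directions.
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                return False
    return True


# The two variants of Faked kidnapping from Section 4.2.4.
betrayal_variant = [("Ac", "L"), ("Betr", "R", "L"), ("ExEv", "J"),
                    ("UnAc", "R", "B"), ("UnAc", "R", "B"), ("UnEv", "L")]
unexpected_variant = [("Ac", "L"), ("UnAc", "L", "R"), ("ExEv", "J"),
                      ("UnAc", "R", "B"), ("UnAc", "R", "B"), ("UnEv", "L")]

# Winning a fortune, and the same sequence with hypothetical agent names A and B.
winning = [("UnAc", "J", "T"), ("UnEv", "T"), ("UnAc", "T", "J"), ("UnEv", "J")]
renamed = [("UnAc", "A", "B"), ("UnEv", "B"), ("UnAc", "B", "A"), ("UnEv", "A")]

print(same_up_to_renaming(betrayal_variant, unexpected_variant))  # False
print(same_up_to_renaming(winning, renamed))                      # True
```

Even this coarse check separates the two variants, because their second building blocks have different types, while a pure renaming of agents leaves a sequence unchanged.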

5 Evaluating formal frameworks

In the previous section we presented and analyzed seven CSI narratives in DPF. From this we identified a number of deficiencies in DPF. Before we can propose computer algorithms that automatically summarize stories and determine whether two stories are essentially the same in their structure, we need to find the formal framework that captures the human notion of story understanding. For this we have to capture this notion and solve the problem of the general methodology as explained in Section 2.1. That is, the formalization process requires human intervention, by which the method is inherently informal. We cannot claim that the frameworks presented in Section 3 capture this notion. One way of solving this problem is to empirically evaluate formal frameworks against other formal frameworks. In this section we discuss how this could be done.

The reason that we are evaluating DPF against another framework is that we are interested in knowing which framework captures the notion of isomorphism better with respect to the human understanding of stories. If we can empirically verify that two narratives that are isomorphic in DPF, but not isomorphic in PUF (or another formal framework), are seen by humans as essentially the same, this supports the use of DPF in applications such as story formalization and story synthesis. We do not yet present algorithms for summarizing stories, not only because this is beyond the scope of this thesis, but also because we have not yet verified with an experiment that a certain formal framework captures the human notion of story understanding.

In this section we first present a small experiment on human story understanding, and thereafter we propose how to evaluate formal frameworks, or more specifically how two formal frameworks could be compared.

5.1 An experiment on human story understanding

Inspired by Lam (2008), we performed a small scale experiment on 20 subjects, all graduate or undergraduate students. The goal of this experiment was to show that creating an experimental setup is a non-trivial, psychological problem when we want to argue whether a formal framework captures the notion of human story understanding well.

5.1.1 Experimental setup

For the experiment we first presented two short stories (see below) to the subjects. Their formalizations in DPF are not isomorphic. That is, the formalization of Narrative 1 is the building block Unexpected Action (see Figure 16), and the formalization of Narrative 2 is the building block Betrayal (see Figure 15).

Narrative 1. Laura is having an affair with Chip, behind the back of her rich husband Jack. Laura proposes to fake a kidnapping to Chip. Chip agrees to do it, but while they carry it out, Chip buries Laura alive in a crate in the Nevada desert.

Narrative 2. Laura is having an affair with Chip, behind the back of her rich husband Jack. Chip proposes to fake a kidnapping to Laura. Laura agrees to do it, but while they carry it out, Chip buries Laura alive in a crate in the Nevada desert.

The subjects were then presented with the question below.⁶ The symbols S1 and S2 refer to Narrative 1 and Narrative 2, respectively.

Are the stories S1 and S2 essentially the same?

5.1.2 Results

The experiment yielded the following results:

6 of the 20 subjects answered yes
14 of the 20 subjects answered no
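The counts above can be turned into shares and, as one possible check, compared against an even split between the two answers. The sketch below is an illustration only: the exact two-sided binomial test is an assumption of this sketch and was not part of the experiment, and the function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb

# Answers reported in Section 5.1.2.
yes, no = 6, 14
n = yes + no

share_yes = yes / n  # 0.3
share_no = no / n    # 0.7


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials under p = 0.5.

    The distribution is symmetric for p = 0.5, so the smaller tail is doubled.
    """
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_value = binomial_two_sided_p(yes, n)
print(f"yes: {share_yes:.0%}, no: {share_no:.0%}, two-sided p = {p_value:.3f}")
```

With only 20 subjects such a check has little power, which is in line with the caveats about this experimental setup.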

The goal of this experiment was to show that creating an experimental setup is a non-trivial problem. First of all, the circumstances of the experiment were not controlled, since it was performed through e-mail. As a consequence, we do not know whether the subjects first read the question or the stories, which is another major issue with this experimental setup. The experiment is also very small. However, it illustrates that humans do not all understand stories in the same way. Ideally, we would like to compare formal frameworks to evaluate how well they represent human story understanding. A proposal for this goal is elaborated in the following subsection.

5.2 Methods to compare formal frameworks

A comparison of two formal frameworks is something that has not been done before. It is also not a trivial problem to deal with, because there has to be some basis of comparison while two formal frameworks may differ greatly in both their syntax and their semantics. The goal of comparing two formal frameworks experimentally is to find out which formal framework better captures the notion of human story understanding. Section 3 of (Löwe, 2010) discusses a method by which this comparison could be performed. The method is based on the isomorphism between formalizations of narratives in two different formal frameworks.

6 Conclusion and discussion

In this thesis we discussed three research directions on the subject of formalizing narratives, and particularly focused on the first. The first research direction involved the formalization of seven narratives from four CSI: Crime Scene Investigation™ episodes in the Doxastic Preference Framework. From the formalizations we could identify the relevant structures as the building blocks; most formalizations consist of only a small number of different building blocks. These formalizations are a matter of debate, with respect to which actions and events should be incorporated in the formalization (the distinction between story and discourse matters here as well), and when sequences of actions and events could be seen as one action or event respectively. Another matter of debate is the descriptions chosen with the formalizations, but the differences between the descriptions in this thesis and in (Löwe, Pacuit and Saraf, 2009) are mostly minor, except for the elaborated formalizations.

The second research direction involved the evaluation of DPF. A comparison of DPF to other formal frameworks has to be based on isomorphism between narratives. We argued that, for a formal framework to capture the notion of human story understanding, two stories that are structurally the same in that framework should also be perceived as essentially the same by human beings. With respect to this requirement, a small scale experiment was performed on 20 undergraduate and graduate students, to verify whether two non-isomorphic narratives were also generally perceived as different stories. Several difficulties concerning experiments on human story understanding were raised.

The third and final research direction involved the improvement of DPF, on the basis of the deficiencies that were identified while pursuing the first direction. The proposals included the addition of joint actions, classes of formalizations, and annotating nodes with time.

This thesis raises the question whether the design of formal frameworks is an adequate approach. The cycle of applying, evaluating and improving the formal framework may lead to infinite regress. A formal framework is either too sparse or too elaborate. We could argue to move away from the problem of defining formal frameworks until other fields of study are able to capture the human notion of story understanding. On this basis we could then define formal frameworks that capture this notion as we intended in the first place.

⁶ The question was presented in Dutch: “Zijn de verhalen S1 en S2 essentieel hetzelfde?” The two narratives were presented in English.

6.1 Discussion

Part of the research that has been elaborated in this thesis deals with processes that are subjective. We want to design objective guidelines for choosing the right formal framework. By comparing two formal frameworks by means of an experiment, we could show that a formal framework captures the notion of human story understanding if narratives whose formalizations are essentially the same (isomorphic) are also perceived as essentially the same by human beings. A number of improvements of DPF are proposed in this section on the basis of several deficiencies that were identified in Section 4.3, i.e. during the formalization process of CSI narratives. Instead of working around the problem during formalization, we want to introduce several improvements that should become part of the Doxastic Preference Framework.

6.1.1 Improving DPF

In the narrative presented in Section 4.2.4, we are uncertain whether Chip or Laura intended to stage the kidnapping. To counter this problem we could define the act of proposing the faked kidnapping as a joint action in which Chip and Laura both propose and plan the kidnapping. This however causes a change to DPF such that it not only has to be re-evaluated for similar and even very different narratives, but may also introduce other deficiencies. Therefore we leave this matter as future work. Another option, as presented in Section 4.2.4, is to create a class of formalizations for similar narratives with existing ambiguity in the formalization process. The class of formalizations then includes both possible formalizations.

A formalization consists of a sequence of building blocks. Due to interleaving scenes in CSI episodes, we cannot tell for sure which building block should come first and whether this should influence isomorphism between related stories. We have also seen during the formalization of the narrative Faked kidnapping that imperfect information causes the formalization to reverse time steps. By annotating the nodes with time steps, the chronological order of building blocks can be retrieved after formalization.
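The proposal to annotate nodes with time steps can be sketched as follows. This is a hypothetical illustration rather than an extension defined for DPF: the `Node` class, its `time` field and the concrete time values are assumptions, chosen to mirror the reversed order of Brass and Chip in the building block UnAc(R, B) of the first Faked kidnapping formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str   # node identifier in the formalization, e.g. "v6"
    label: str  # acting agent, or "E" for an event
    time: int   # hypothetical chronological time step taken from the timeline


def chronological(nodes):
    """Return the nodes sorted by annotated time step.

    sorted() is stable, so nodes with equal time keep their formalization order.
    """
    return sorted(nodes, key=lambda node: node.time)


# Formalization order of UnAc(R, B): R comes before B, although Brass starts
# tailing the ransom (time 1) before Chip tries to take it (time 2).
formalization_order = [Node("v6", "R", time=2), Node("v7", "B", time=1)]

for node in chronological(formalization_order):
    print(node.time, node.name, node.label)
```

Keeping the annotation separate from the tree structure leaves the perfect information representation untouched while still allowing the original order of actions to be recovered.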

6.2 Future work

The main problem is to define a formal framework that captures in its syntax the notion of human story understanding. Solving this problem includes a variety of goals that have to be pursued. With respect to DPF, future work includes understanding the reasoning processes of agents in a multi-agent situation with imperfect information, with respect to the beliefs and preferences in DPF, given that DPF assumes perfect information games. For this goal we would like to know what the optimal behavior of an agent is in a situation with imperfect information.

The dataset of formalizations in DPF should be expanded in order to identify possible new deficiencies, by which DPF can then be improved. Each time DPF is improved, new deficiencies may be identified by a re-evaluation of its formalizations. At this time we cannot foretell whether this will lead to infinite regress.

Formal frameworks can be evaluated at the level of how well they capture human story understanding. This can be done by an empirical evaluation. For example, the experiment in Section 5.1 may be a start to pursue this goal.

Once we have designed a formal framework that captures the notion of human story understanding, we can use insights from research fields such as Computer Vision, Machine Learning and Natural Language Processing to pursue the goal of automatic formalization of narratives given in a natural language that possibly contain extralingual information. Future work also includes the implementation of an algorithm that automatically synthesizes interactive stories using (formalizations in) DPF as its basis. This is aimed at computer games, where the story may then contain imperfect information for both the artificial agent(s) and the human player. Finally, related to the possibility of expanding the dataset, one could create a dataset of the structures (building blocks) that are relevant for a certain genre or audience, from which new stories can be synthesized creatively.

Acknowledgements

First of all I would like to thank my supervisor Benedikt Löwe, both for his helpful comments on several drafts of this thesis, and for his explanations and suggestions that helped me complete this thesis. His invitation to attend a number of meetings of a master’s project improved my insight into the subject of this thesis. Finally I would also like to thank the students who provided me with the results of the experiment.

References

Black, J.B. and Wilensky, R., ‘An evaluation of story grammars,’ Cognitive Science, vol. 3 (3), 1979, pp. 213–229.
Bower, G.H., ‘Experiments on story understanding and recall,’ The Quarterly Journal of Experimental Psychology, vol. 28 (4), 1976, pp. 511–534.
Correira, A., ‘Computing story trees,’ Comput. Linguist., vol. 6 (3-4), 1980, pp. 135–149.
van Dijk, T.A., ‘Story comprehension: An introduction,’ Poetics, vol. 9, 1980, pp. 1–21.
Dyer, M.G., In-Depth Understanding: A Computer Model of Integrated Processing for Narrative Comprehension (MIT Press, Cambridge, MA, USA, 1983).
Frisch, A.M. and Perlis, D., ‘A re-evaluation of story grammars,’ Cognitive Science, vol. 5 (1), 1981, pp. 79–86.
Kazantseva, A. and Szpakowicz, S., ‘Summarizing short stories,’ Comput. Linguist., vol. 36 (1), 2010, pp. 71–109.
Lam, S., Affective Analogical Learning and Reasoning, Master’s thesis, School of Informatics, University of Edinburgh, 2008.
Lehnert, W.G., ‘Plot units and narrative summarization,’ Cognitive Science, vol. 4, 1981, pp. 293–331.
Löwe, B., ‘Comparing formal frameworks of a narrative structure,’ Computational Models of Narrative: Papers from the AAAI Fall Symposium FS-10, AAAI, 2010.
Löwe, B. and Pacuit, E., ‘An abstract approach to reasoning about games with mistaken and changing beliefs,’ Australasian Journal of Logic, vol. 6, 2008, pp. 162–181.
Löwe, B., Pacuit, E. and Saraf, S., ‘Identifying the structure of a narrative via an agent-based logic of preferences and beliefs: Formalizations of episodes from CSI: Crime Scene Investigation™,’ in M. Duvigneau and D. Moldt (eds.), MOCA’09, Fifth International Workshop on Modelling of Objects, Components, and Agents (2009).
Pérez y Pérez, R. and Sharples, M., ‘Mexica: a computer model of a cognitive account of creative writing,’ Journal of Experimental and Theoretical Artificial Intelligence, vol. 13 (2), 2001, pp. 119–139.
Riesbeck, C.K. and Schank, R.C., Inside Computer Understanding: Five Programs Plus Miniatures (L. Erlbaum Associates Inc., Hillsdale, NJ, USA, 1982).
Rumelhart, D.E., ‘On evaluating story grammars,’ Cognitive Science, vol. 4 (3), 1980, pp. 313–316.
Young, R.M., ‘Story and discourse: A bipartite model of narrative generation in virtual worlds,’ Interaction Studies, vol. 8, 2007, pp. 177–208.

Appendix

A Timeline of narratives from CSI episodes

Seven narratives from four episodes of CSI were analyzed over several viewings. Each episode contains at least one narrative, and the narratives are all interleaved. The analysis consisted of carefully (re-)watching fragments of the DVD video material, from very short to longer ones, and writing down all actions, events and discourse elements that may have contributed to the storyline. This appendix presents the data that became the basis of the formalizations in Section 4. Note that the data may not represent each narrative exactly as it was presented, because the formalization process is informal.

A.1 Episode 1: Pilot

A.1.1 Trick roll

Mr. Laferty sits in the lobby of a casino
Kristy Hopkins sees Mr. Laferty
Kristy Hopkins goes to Mr. Laferty
Kristy Hopkins seduces Mr. Laferty
Mr. Laferty goes along
They go to Mr. Laferty’s hotel room
Mr. Laferty waits on the bed
Kristy Hopkins puts on scopolamine
Kristy Hopkins goes to sit on Mr. Laferty
Kristy Hopkins reveals her breasts
Mr. Laferty sucks on Kristy Hopkins’ breast
Mr. Laferty passes out
Kristy Hopkins steals Mr. Laferty’s belongings
------ CSI getting involved
Mr. Laferty tells agent Nick Stokes what happened
Nick notices a discoloration around Mr. Laferty’s mouth
Nick takes a mouth swab of the discoloration
At the lab the swab cannot be matched to a particular substance
Nick arrives at a car accident
Nick examines Kristy Hopkins but sees no discoloration
Nick has help services send Kristy Hopkins to the hospital
Nick meets Dr. Leever in the hospital
Dr. Leever reveals discoloration was present on six prostitutes’ nipples
Nick goes to see one of the prostitutes, who turns out to be Kristy Hopkins
Nick requests Kristy Hopkins to show her breasts for discoloration
Nick notices the discoloration
Nick demands that Kristy Hopkins reveal the substance and give the belongings back
Kristy Hopkins gives the substance to Nick
The lab confirms the substance is scopolamine
Nick returns Mr. Laferty’s belongings to him

A.1.2 The killed house guest

Husband throws Jimmy out of the house
Jimmy bangs on the front door
Husband tells his wife to open the front door
Husband’s wife asks what he is going to do
Husband tells wife to do as he says
Husband’s wife opens the door
Jimmy enters
Husband shoots Jimmy down
Husband takes off one of Jimmy’s shoes
Husband puts on Jimmy’s shoe and chips pinky toenail
Husband goes outside
Husband locks the door
Husband kicks it open
Husband puts the shoe back on Jimmy
Husband has his wife call 911
------ CSI getting involved
Agents Catherine Willows and Warrick Brown arrive at the scene
Catherine asks if Husband’s wife is OK
Husband starts telling his side of the story
Warrick tells Catherine that Husband lies
Warrick takes shoeprint of Jimmy
Warrick compares the shoeprint against the shoeprint on the door
Husband asks why he is doing that
Catherine notices differently tied laces
Catherine asks Husband if he moved or altered Jimmy’s body
Husband denies
Warrick asks Husband for signed statement
Husband agrees to sign one
Catherine asks Husband why his pinky toe is injured
Husband says he tripped over a rattle
At the CSI lab Warrick hypothesizes struggle from Jimmy’s hair fibers
Warrick asks Husband if a struggle ensued before he shot Jimmy down
Husband replies that he must have tied the laces wrong
Husband says he was nervous to tell
Warrick appears confused
Warrick talks to Grissom about it
Grissom tells Warrick to follow the evidence
Warrick examines the shoe
Warrick finds chipped toenail
Warrick is convinced that Husband lied about tripping over rattle
Warrick asks Brass for warrant for the toenail
Brass refuses to have a warrant prepared
Warrick goes to Judge Cohen’s house
Warrick asks for a warrant
Judge Cohen agrees to give a warrant on a condition
Brass finds out
Brass takes Warrick off the case
Brass commands Warrick to shadow Holly on her case
Grissom arrives at Husband’s house with a warrant
Husband wonders why, because he thought a signed statement was enough
Grissom examines Husband’s toilet
Grissom finds clipped toenails
At the lab Grissom compares clipped toenails to chipped pinky toenail
One clipped toenail matches the chipped pinky toenail
Husband is arrested

A.2 Episode 2: Cool Change

A.2.1 Winning a fortune

Jamie asks Ted for 20 dollars
Ted refuses and decides to play himself
Ted wins the jackpot of 40 million dollars
Jamie and Ted are brought to the presidential suite
Ted asks Jamie to leave
Room service brings drinks
Jamie attacks Ted and cuts him up with a broken bottle
Ted leaves the room for first aid
Jamie waits behind the door with a heavy object for Ted to come back
Ted comes back to the suite
Jamie kills Ted with a blow to the back of his head
Jamie covers up the blood at the scene
Jamie drags Ted to the roof and pushes him off
------ CSI getting involved
Jamie is taken into custody by the police
Grissom deduces a murder because Ted still wore his eyeglasses when he fell down
Coroner Jenna shows a piece of black glass she found in Ted’s wound
Grissom and Nick find the champagne bottle that matches the black glass
Grissom and Nick interrogate Jamie
Jamie admits that she cut Ted and that he left thereafter
Grissom asks if Ted came back to the room
Jamie denies this
Nick confirms it and Jamie leaves
Nick wonders from where Ted fell down if it was not the balcony
Grissom proposes the roof
Grissom sets out operation Norman: three ways of Ted falling down from the roof
Grissom deduces that Ted was pushed
Grissom and Nick check the security camera
Red is identified and arrested based on roof dust on his shoes
During interrogation Red is asked if he was on the roof
Red admits that he was but not that he pushed him off
Red admits he talked to Ted but says they parted ways at the floor of the presidential suite
Red says he almost committed suicide on the roof
Grissom and Nick do not find roof dust on Ted’s shoes
Red is no longer a suspect
Grissom finds fibers on Ted’s watch
Nick proposes Ted was dragged and both believe he was
Grissom asks Nick to have Jamie arrested
Coroner Jenna tells Grissom that Ted was hit fatally with a heavy object on his head before he was pushed
Nick compares the fibers on the carpet with the fibers from Ted’s watch
Grissom finds the murder weapon
Nick calls Grissom to tell him the fibers match
Grissom tests the security locks
They appear to fail to register the entry
Jamie is charged with murder

A.3 Episode 3: Crate ’n Burial

A.3.1 Faked kidnapping

Chip or Laura proposes a faked kidnapping
Laura or Chip, respectively, agrees to do it
They plan the faked kidnapping
Chip prepares a crate in the desert unbeknownst to Laura
Chip arrives at Laura’s home
Chip grabs Laura and carries her around the house
Laura leaves marks of struggle
Chip asks Laura to put halothane on a piece of cloth to
Laura does it and drops it just outside the door
They ride outside town to the desert
On the side of the road, Chip calls in a message demanding ransom for kidnapping Laura
Laura tells Chip to hurry up, because a car is approaching
Chip stops at a point in the desert where he prepared the crate
Chip hits Laura and buries her alive
------ CSI getting involved
Grissom listens to the tape message a number of times and deduces it is from the desert near power lines
At Jack’s house Brass requests Jack not to pay the ransom
Jack still intends to pay the ransom
Sara and Grissom examine the house
They find mud on the floor and a halothane-stained cloth
At the lab they find cyanide and gold flecks in the mud
Grissom deduces from the audio and the mud that Laura is near one of three gold mines near power lines
Brass decides to tail Jack
Jack puts the ransom in a bin
Laura is rescued
Chip attempts to get the ransom
Chip is arrested
At the hospital Laura lies about what happened before she woke up buried alive
Grissom requests a blood sample from Laura and he gets it
Sara finds fingerprints of Chip on the crate
Brass decides to record Chip’s voice before interrogating him
Chip denies any allegation and walks away
The recorded audio matches the tape message
Grissom and Sara conclude Laura is an accomplice because of car seat sheepskin on Laura’s sleeve
Chip is arrested
The lab confirms no halothane in Laura’s blood
Audio tape confirms "Chip, hurry up" at the highway phone
Laura is arrested

A.3.2 Hit and run

James is calling Charles while driving
He puts his cellphone back after the call and takes his eyes off the road
Too late he notices a girl on her scooter and fatally hits her
James does not know what to do and drives home
Charles tells James not to report himself to the police
James does what his grandfather says
------ CSI getting involved
The coroner shows license plate numbers in bruises on the girl’s leg
The car is tracked down
Catherine and Warrick go to Charles Moore’s home
They let Moore know his car has been part of a traffic accident that evening
Moore says he told the police his car was stolen
They show the warrant
Moore opens his garage and the damaged car is revealed
Moore lies that he did it and asks if the girl is OK
They tell him the girl died at the scene
Moore is clearly shocked by the news and is arrested
At HQ Catherine thinks Charles gave up too easily
Catherine and Warrick check the car
Warrick notices that the seat is pushed too far forward for Charles’s height
Catherine asks Warrick to turn on the car
Mostef music plays and they deduce that a younger person drove this car
Catherine and Warrick interrogate Charles Moore
He denies and deflects the question of whether someone else drove his car
Catherine lets him know that his grandson James is an approved driver
James Moore enters the interrogation room
Warrick asks James if he likes Mostef
James does not answer, looks at his grandfather and sits down
Catherine asks James if he hit the girl with the car
James tells his grandfather that he wants to tell the truth
Charles demands that he will tell (his version of) the truth
Charles lies about the actual events, taking James into account
Warrick asks James if he wants to add anything to that story
James does not and agrees that is how it went
Catherine knows the truth but wants to "accept" Charles’s story
Warrick tells her she should not do that and she agrees
Catherine finds a chipped piece of tooth in the steering wheel
A new interrogation takes place
Warrick wants to see Charles Moore’s teeth
Charles shows his denture
Warrick and Catherine reveal the chipped tooth piece
Charles and James give in and James now wants to tell the truth
They tell the truth
James is arrested

A.4 Episode 4: Pledging Mr. Johnson

A.4.1 Pledging gone wrong

James Johnson attempts to join the fraternity but is lacking points
Kyle gives all initiates an assignment to get their body parts signed
Kyle catches James signing himself
Kyle exposes it to the fraternity and they humiliate James with a beer shower
James asks Kyle for a second chance
James gets his penis signed by Jill, Kyle’s girlfriend
James proves the valid signing
Kyle is angry and decides to have James swallow a piece of raw liver through a noose
James chokes when the piece of raw liver gets stuck
Matt enters and Kyle casually taps Matt’s back shoulder, and tells him his father will take care of it
They hang James in his room
------ CSI getting involved
The coroner says teeth marks in the tongue are missing, which is common for people hanging themselves, and that he found an ink signature on James’s penis
At the dorms Nick asks Kyle and Matt if they haze initiates
Matt denies that they haze initiates and says he knows the law
Matt tells them there was an initiation for getting their body parts signed
Sara is angry that they responded to the false signing with a beer shower
Sara does not buy their story
At the lab Sara tells Nick that James died from choking on a piece of raw liver, and that minuscule fibers were found on the liver
Nick and Sara interrogate Kyle about the fibers they found
Kyle denies knowing anything about it
Nick tells Kyle to tell them about the liver
Kyle gives in and tells them that both he and Matt made James swallow a piece of raw liver, saw him choking and performed the Heimlich maneuver
Sara asks Kyle why James then ended up at the ceiling
Kyle says he and Matt hanged him to make James’s death look like a suicide
Nick and Sara do not believe the Heimlich story and want to verify it
The coroner says James has no broken ribs or abdominal bruises that would occur due to the Heimlich maneuver
The coroner again shows them the signature, which came from a Jill W.
Nick and Sara question Jill W., who admits the signing and that she is Kyle’s girlfriend
Nick and Sara search James’s room for fibers and they find the noose
Nick and Sara interrogate Matt
Nick confronts Matt with the possibility of Kyle being angry about the signing
Matt rejects it
Nick and Sara confront Matt about murder charges for both of them if he does not tell the truth and he is confronted about the noose
Matt tells the truth
Kyle is charged with murder

A.4.2 The severed leg

Wendy Barger and Phil Swelco have calamari at a restaurant
They go to Swelco’s home
They make love and Wendy leaves
Wendy runs out of gas on the boat, pulls the cord and dislocates her shoulder
Wendy falls and fatally hits her head on the boat and falls in the water
Wendy’s leg is cut off by two fishermen’s boat
------ CSI getting involved
Wendy’s body is found
Coroner Jenna deduces from the leg that the cut happened post mortem
Grissom and Catherine take fingerprints of Wendy’s hand
Coroner Jenna says that Wendy’s body has been in the water for two days
Grissom and Catherine wonder why Wendy was not reported missing
Wendy is identified and her husband Winston Barger is called for confirmation
Coroner Jenna tells Grissom and Catherine there was vaginal penetration, a dislocated shoulder and a fractured skull
Winston Barger is interrogated
Winston answers Wendy did not have any enemies
Winston answers he saw Wendy on a Tuesday morning for the last time to get perspective
Catherine asks Winston when he last had intercourse with his wife
Grissom asks Winston for a DNA sample (and gets it shortly after)
The DNA does not match
The husband clearly tells them he wants to know everything that happened
Grissom and Catherine think that she might have had an affair
Brass tells Catherine Wendy’s friends did not know about an affair
Coroner Jenna reports her findings that Wendy was wounded near her right temple, did not drown and ate calamari three hours before she died
Grissom knows which restaurant it is and he goes there with Catherine
The waitress tells them the name of the man who was there with Wendy
Grissom and Catherine go to Swelco and tell him the news
Swelco is clearly shocked by the news
He denies that anyone knew about their affair, and he also denies that he became angry on learning that Wendy did not want to divorce
He tells about their dinner and intercourse when they reached his home
Grissom and Catherine find Winston Barger near Swelco’s home and tell him not to interfere with the investigation
Grissom tests where the boat could have stranded and Catherine searches for the boat
The boat is found
Grissom and Catherine find both skin pieces and blood on the boat
Grissom wants Swelco arrested
Brass arrests Swelco
Brass interrogates Swelco, who says that he cut himself and got an injection at the pharmacy
Winston shows up at CSI HQ and sees Swelco standing there before Grissom goes to question him
Winston asks why he is here
Catherine takes him away and indirectly confirms to him the affair and that he is a suspect
Grissom lectures Catherine that she should not have told him about the affair
Grissom confirms that pulling the cord was Wendy’s initial cause of death
Meanwhile Winston kills Swelco in revenge
Winston is arrested
