The Event Calculus in Probabilistic Programming with Annotated Disjunctions

Kevin McAreavey Kim Bauters Weiru Liu Queen’s University Belfast University of Bristol University of Bristol Belfast, Northern Ireland Bristol, England Bristol, England [email protected] [email protected] [email protected] Jun Hong University of the West of England Bristol, England [email protected]

ABSTRACT is being recognised [36, 16]. Yet while SIEMs and activity We propose a new probabilistic extension to the event cal- recognition systems generally rely on rule-based components culus using the probabilistic (PLP) lan- to reason about events, they are rarely founded on sound log- guage ProbLog, and a language construct called the anno- ical semantics. This is especially true of approaches capable tated disjunction. This is the first extension of the event cal- of handling uncertainty (see [28] for a survey). culus capable of handling numerous sources of uncertainty Reasoning about change (i.e. reasoning about events and (e.g. from primitive event observations and from compos- their effects) is an important topic in AI. Some of the most ite event definitions). It is also the first extension capa- influential logical approaches include the situation calcu- lus [18], Allen’s interval algebra [1], the event calculus [13] ble of handling multiple sources of event observations (e.g. 2 in multi-sensor environments). We describe characteristics and action languages [9]. Conceptually, the event calculus of this new extension (e.g. rationality of conclusions), and is particularly suited for event reasoning orientated applica- prove some important properties (e.g. validity in ProbLog). tions such as SIEM and activity recognition due to its use Our extension is directly implementable in ProbLog, and we of a linear timeline of actual events as well as its ability to successfully apply it to the problem of activity recognition handle concurrent, incomplete and partially ordered events. under uncertainty in an event detection data set obtained Perhaps for this reason, the event calculus has continued to from vision analytics of bus surveillance video. attract attention since its original proposal. However, the issue of uncertainty has been largely overlooked in the event calculus literature [19]. Keywords A notable exception is the work of Skarlatidis et al. In [30], The event calculus, event reasoning, probabilistic logic pro- they modelled a variant of the event calculus with discrete gramming, ProbLog, annotated disjunction time using Markov logic networks (MLNs). This work han- dles uncertainty over the definitions of composite events 1. INTRODUCTION but does not support uncertainty over event observations. Strong restrictions to the event calculus were also required The past decade has seen an increasing demand for so- to reduce the size of the ground MLN, resulting in the loss called security information and event management (SIEM) of canonical features, e.g. domain-independent capturing of platforms, such as IBM’s QRadar.1 SIEMs analyse secu- events. In [29], they modelled a discrete-time variant of the rity alerts generated by network hardware and applications, event calculus in the probabilistic logic programming (PLP) for the purpose of threat detection and prevention, decision language ProbLog [7]. This work (which is disjoint and support, and forensic analysis. Similar themes are reflected incompatible with the previous one) handles some uncer- in vision-based activity recognition, e.g. for public trans- tainty over event observations but does not support uncer- port security [16]. In either case, the aim is to support au- tainty over the definition of composite events. Uncertainty tonomous decision making and/or human decision support over event observations also only relates to the occurrence by exploiting large quantities of rich temporal data. Often of events (not to the time they occur, nor their properties). this equates to inferring (composite) events of interest w.r.t. In this paper, we propose a new extension to a continuous- (primitive) event observations. Increasingly, the impact of time variant of the event calculus, using advances in PLP uncertainty (e.g. in event observations, domain knowledge) not considered by existing work. Our work provides the 1http://www.ibm.com/software/products/qradar-siem first formulation of the event calculus capable of handling numerous sources of uncertainty. As far as we are aware, Appears in: Proceedings of the 16th International Confer- it is also the first formulation capable of reasoning about ence on Autonomous Agents and Multiagent Systems (AA- events from multiple sources (e.g. in multi-sensor environ- MAS 2017), S. Das, E. Durfee, K. Larson, M. Winikoff (eds.), May 8–12, 2017, São Paulo, Brazil. 2 Copyright © 2017, International Foundation for Autonomous Agents and Although many of these approaches are used for automated Multiagent Systems (www.ifaamas.org). All rights reserved. planning, it is not the subject of this paper.

105 ments). Thus, it is particularly suited for event reasoning Atom Description in SIEM and activity recognition systems. Our main con- initially(F) Fluent F holds at timepoint 0 tributions are: (i) we propose a new probabilistic definition happens(E, T) Event type E occurs at time- for event observations, supporting uncertainty over their oc- point T currence, the time at which they occur, and their associated initiates(E, F, T) At timepoint T, event type properties; (ii) we propose new probabilistic definitions for E initiates a period during effect clauses, initial beliefs, and composite events; (iii) we which fluent F holds propose a method to handle multiple sources of event ob- terminates(E, F, T) At timepoint T, event type servations; (iv) we prove some important properties of our E terminates a period during overall extension; and (v) we demonstrate an application of which fluent F holds our extension in vision-based activity recognition. The remainder of this paper is organised as follows: in Section 2 we recall some preliminaries; in Section 3 we pro- Table 1: Domain-dependent atoms in the SEC. pose our new extension; in Section 4 we examine properties of our extension; in Section 5 we summarise results from a case study on an event detection data set; in Section 6 we R. Thus, the least Herbrand model defines the semantics discuss related work; and in Section 7 we conclude. of definite clause LPs. Conversely, normal clause LPs are not guaranteed to have a least Herbrand model. For these 2. PRELIMINARIES programs, various alternative semantics have been proposed. As we will see later, ProbLog uses the well-founded seman- In this section, we recall some preliminaries on logic pro- tics [33]. In this case, the (unique) two-valued well-founded gramming (LP) and introduce an LP formulation of the model of an LP R is denoted WFM(R). An LP R is said to event calculus. We also recall some preliminaries on the entail an atom p iff p is a member of WFM(R). Importantly, PLP language ProbLog which we will use to formalise our if R is a definite clause LP, then WFM(R) is identical to the probabilistic extension of the event calculus in Section 3. least Herbrand model of R. However, WFM(R) is only guar- 2.1 Logic programming anteed to exist for some classes of LP, such as those which are locally stratified [33]. The language of logic programming (LP) is built from symbols for variables, constants, functors and predicates. By convention, variables start with uppercase letters and all 2.2 The event calculus others start with lowercase letters. If V is a variable and c The event calculus provides a mechanism to reason about is a constant, then V and c are terms. If f is a functor and temporal information in classical logic using the common- r1, . . . , rn are terms s.t. n ≥ 1, then f(r1, . . . , rn) is a term. sense law of inertia, which says that properties (called flu- If p is a predicate and r1, . . . , rn are terms s.t. n ≥ 0, then ents) persist over time unless the occurrence of an event p(r1, . . . , rn) is an atom. If p is an atom, then p and not p are triggers a change. As formalised in LP, the event calculus is (resp. positive and negative) literals. A (normal) clause is a kind of many-sorted LP, with sorts for event types, fluents, a formula h :− b1,..., bn s.t. h is an atom (called the head), and timepoints. A set of domain-independent axioms specify b1,..., bn are literals (called the body), and all variables which fluents hold at any given timepoint w.r.t. a timeline are implicitly universally quantified. Informally, if each bi is (narrative) of event type occurrences and a set of domain- true, then h must be true. If each bi is an atom, then a clause dependent effect clauses. The , i.e. the ap- is also called a definite clause. A clause h :− true is called parent need to explicitly model the non-effects of events, a fact and is simply written as h. An LP is a finite set of is addressed in the event calculus through the use of non- clauses. An expression is a term, atom, clause, or LP. Given monotonic reasoning. First-order logic formulations rely on an expression e and a substitution θ = {V1/r1,..., Vn/rn}, circumscription [17], while LP formulations rely on some se- then eθ denotes a new expression with each variable Vi in e mantics for negation by default (e.g. completion [6] or stable replaced with the term ri. Expressions are said to be ground models [14]). The well-founded semantics is an example of if they contain no variables. the latter. Herbrand models define the semantics of LP. The Her- There is no definitive variant of the event calculus (see [27, brand base of an LP is the set of ground atoms w.r.t. the 19] for surveys). Nonetheless, the simplified event calculus set of all possible ground terms (the Herbrand universe). (SEC) from [27] arguably represents the intersection of most A Herbrand interpretation is a mapping from the Herbrand variants. The SEC is essentially a simplification of a revised base to the set of truth values {true, false}. For convenience, variant [12] of the original event calculus [13]. Throughout we will also refer a Herbrand interpretation ω by the set of this paper, we use E, F and T to denote variables for event atoms {p ∈ B | ω(p) = true} where B is the Herbrand base. types, fluents and timepoints, respectively. We also assume A Herbrand interpretation is a Herbrand model of a definite the existence of a partial order  over timepoints s.t. for clause h :− b1,..., bn if for every substitution θ s.t. biθ is each timepoint T we have that 0  T. The SEC is then a (set-theoretic) member of the interpretation, hθ is also a defined by the atoms in Table 1 and the following axioms: member of the interpretation. A Herbrand interpretation is a Herbrand model of a definite clause LP if it is a Herbrand holdsAt(F, T2) :− model of every clause in the LP. happens(E, T1), Definite clause LPs are guaranteed to have a unique mini- initiates(E, F, T ), (SEC1) mal (by set inclusion) Herbrand model, called the least Her- 1 brand model. A definite clause LP R is said to entail an T1 ≺ T2, atom p iff p is a member of the least Herbrand model of not stoppedIn(T1, F, T2).

106 stoppedIn(T1, F, T2) :− terpretations Ω is defined for each ω ∈ Ω as: happens(E, T), (SEC2a)  P(C), if ω = WFM(C ∪ R), P(ω) = terminates(E, F, T), 0, otherwise. T1 ≺ T, T ≺ T2. If there exists a total choice C s.t. WFM(C ∪ R) does not clipped(T1, F, T2) :− exist, F ∪ R is not a valid ProbLog program. Otherwise, for 0 happens(E, T), any total choices C and C , it is guaranteed that WFM(C ∪ (SEC2b) R) 6= WFM(C0 ∪ R). Thus, there exists a unique model for terminates(E, F, T), each total choice, but this model may have a of T1  T, T ≺ T2. 0 (i.e. if the total choice itself has a probability of 0). The definition of a ProbLog program may appear restric- : holdsAt(F, T) − tive at first. However, for ease of modelling, ProbLog pro- initially(F), (SEC3) vides some additional syntactic sugar. One example is the not clipped(0, F, T). annotated disjunction (AD) [7], which is a language con- struct of the form:

In its original definition, the SEC was comprised of SEC1, p1 :: h1; ... ; pn :: hn :− b1,..., bm SEC2a (called SEC2) and SEC3, with the second literal in n where n ≥ 1 and P p ≤ 1. Intuitively, an AD says SEC3 replaced with not stoppedIn(0, F, T). However, such i=1 i that if each b is true, then exactly one h is true and this a definition means that if an event occurs at timepoint 0 i i hi is true with probability pi. Conversely, every hi is false and terminates a fluent f that holds initially, then that ef- n with probability 1 − P p . In a sense, the ground atoms fect will be ignored since not stoppedIn(0, f, t) will hold for i=1 i h ,..., h represent the possible values of a multi-valued any timepoint t  0. To avoid this issue, we introduce 1 n random variable. In ProbLog, ADs are said to be syntactic SEC2b and modify the second literal in SEC3, as was done sugar because they are encoded as a set of probabilistic facts in later extensions [27, 19]. Notable features supported by and a set of (deterministic) normal clauses [35, 7]. the SEC include: (i) continuous time; (ii) concurrent events; 1 1 1 (iii) partially ordered events; (iv) conditional and context- Example 1. Consider the AD 3 :: red; 3 :: green; 3 :: blue. dependent effects; and (v) inferred (i.e. conditional, com- This is encoded as the following ProbLog program [7]: posite or hierarchical) events. Other features which can be 1 :: choice(g). 1 :: choice(r). 1 :: choice(b). supported via extensions to the SEC include: events with 3 2 duration; cumulative and canceling effects; release of flu- green :− choice(g). red :− not choice(g), choice(r). ents from the commonsense law of inertia; and continuous blue :− not choice(g), not choice(r), choice(b). change. Due to space considerations, we will focus on the SEC but our conclusions extend to more elaborate variants. In this encoding, independence between probabilistic facts is still assumed, but dependencies are captured by normal clauses after a suitable adaption of probability values. Con- 2.3 ProbLog sider the following clause: There are numerous PLP languages, each with subtle dif- conjunction :− green, red, blue. ferences in syntax, semantics and implementation [20, 22, 25, 35, 34]. For continuity with related work [29], we will Then P(conjunction) = 0. Conversely, suppose the original define our framework in the ProbLog language [7]. Nonethe- ProbLog program only contained the independent proba- 1 1 1 less, it is likely that our framework will map to at least some bilistic facts 3 :: red, 3 :: green and 3 :: blue, then we would 1 other PLP languages, and we will suggest some possibilities have P(conjunction) = 27 . in Section 7 when we mention future work. Different implementations of ProbLog use different meth- A ProbLog program F ∪R consists of a set of ground prob- ods to evaluate the probability distribution over Herbrand abilistic facts F and an LP R (including non-probabilistic interpretations. This paper is agnostic to the particular facts). Probabilistic facts are denoted p :: a s.t. a is an atom method used and instead focuses on features of the language (called a probabilistic atom) and p is a probability value. that are useful to adapt the event calculus into PLP. How- As with most PLPs, a ProbLog program specifies a proba- ever, as only ProbLog2 [8] supports ADs, our discussion will bility distribution over Herbrand interpretations via Sato’s assume the use of Problog2 in practice. distribution semantics [24]. Intuitively, a ground probabilis- tic fact p :: a corresponds to an independent atomic choice about including a in a set of atoms C s.t. C ∪ R forms a 3. A PROBLOG EXTENSION OF THE SEC new LP. In particular, p :: a says that a should be included In this section, we propose a ProbLog extension of the in C with probability p and not included with probability simplified event calculus (SEC) and describe its character- 1−p. Thus, a set of probabilistic atoms C represents a total istics. The main aspects that we need to consider are: (i) choice over the inclusion of ground atoms from F in C ∪ R. event narratives; (ii) event effects; (iii) initial beliefs; (iv) Due to the independence assumption, the probability of a inference rules; and (v) multiple sources of events. total choice C is equal to the product of the of atomic choices specified by C. 3.1 Event narrative A Herbrand interpretation ω is a model of a ProbLog pro- In the SEC, the event narrative is a set of ground atoms gram F ∪R iff there exists a total choice C s.t. WFM(C∪R) = of the form happens(e, t). This was extended in [29] as a ω. A probability distribution P over the set of Herbrand in- set of ground probabilistic facts of the form p :: happens(e, t)

107 so as to model uncertainty over the occurrence of events. e 0.8 act However, given that probabilistic facts are independent, this happens object approach is not capable of modelling uncertainty over the subject time at which an event occurs. We propose the following: 0 drop 0.7 0.3 0.9 0.1 Definition 1. Let e be an event type and t1,..., tn be time- points. A probabilistic occurrence of e is a ground annotated alice bob glass plate disjunction (AD):

p1 :: happens(e, t1); ... ; pn :: happens(e, tn). Figure 1: Semantic network for Example 2.

This says that event type e occurs at timepoint ti (and not at timepoint tj s.t. tj 6= ti) with probability pi and Example 2. Consider an event where one of two people (Al- does not occur (at any timepoint t1,..., tn) with probability ice with probability 0.7, Bob with probability 0.3) is thought Pn 1 − i=1 pi. Clearly, this form of uncertainty cannot be to have dropped either a glass (with probability 0.9) or a expressed purely by probabilistic happens(E, T) facts since plate (with probability 0.1). The event is believed to have they cannot enforce mutual exclusion between occurrences. occurred with probability 0.8. This could be modelled by Another form of uncertainty in the event narrative re- the following probabilistic event: lates to event type definitions. In particular, the term e in the atom happens(e, t) is generally assumed to provide suf- 0.8 :: happens(e, 0). ficient information to reason about the effects of an event. 0.7 :: subject(e, alice); 0.3 :: subject(e, bob). However, as was noted in the original proposal of the event act(e, drop). calculus [13], event type definitions may be incomplete. The solution proposed in their work is to refer to an event type 0.9 :: object(e, glass); 0.1 :: object(e, plate). by an identifier and to describe it by a set of properties (at- In ProbLog, this induces the following probability distribu- tributes). Effectively, these attributes construct a semantic tion over total choices (we omit e and predicate symbols): network where edges are specified by ground atoms of the form h(e, r) s.t. r is the value of attribute h associated with P({0, alice, glass}) = 0.504, P({alice, glass}) = 0.126, event type e. Given that event type definitions describe the P({0, alice, plate}) = 0.056, P({alice, plate}) = 0.014, actual meaning of events, they must be capable of modelling P({0, bob, glass}) = 0.216, P({bob, glass}) = 0.054, uncertainty. We propose the following: P({0, bob, plate}) = 0.024, P({bob, plate}) = 0.006, Definition 2. Let h be a binary predicate symbol, e an event with P(C) = 0 for any other total choice C. Thus we can type, and r1,..., rn be terms. A probabilistic h attribute for e is a ground AD: infer from this probabilistic event e.g. that the glass was dropped at timepoint 0 with probability 0.72. p1 :: h(e, r1); ... ; pn :: h(e, rn). 3.2 Effect clauses Clearly Definition 2 generalises Definition 1, saying that In the SEC, events are not of direct interest. Rather, their the value of attribute h for event type e is ri with probability Pn interest lies in their effect on fluents w.r.t. domain-dependent pi and is unknown with probability 1− pi. Thus, pi can i=1 effect clauses with initiates(E, F, T) and terminates(E, F, T) be thought of as a degree of belief that the value of h is ri, Pn as the head. The event model outlined previously has in- where 1− i=1 pi is a degree of ignorance over the value of h. Once again, probabilistic h(E, R) facts cannot express this teresting consequences for such effect clauses and the flu- notion since they cannot enforce mutual exclusion over the ents that they initiate or terminate. Specifically, if p is the possible values of h. Notice that we do not allow ADs of the probability of models specified by a probabilistic event (e.g. 0.72 in Example 2) and satisfying an effect clause, then this form p1 :: happens(e1, t1); ... ; pn :: happens(en, tn) in Defini- will propagate to the specified fluent, causing it to be ini- tion 1. The reason being that e1,..., en can be equivalently expressed by a single event type e and a set of probabilistic tiated or terminated with probability p. Thus, each fluent attributes for e. Moreover, relying on a single event type will hold with some probability. This behaviour was recog- simplifies other aspects of this extension, e.g. when merging nised in [29], where events had the effect of reinforcing pre- event narratives. viously held beliefs (i.e. increasing or decreasing previously In our extension, a set of probabilistic event type occur- held probabilities). However, our approach is necessarily rences and a set of probabilistic event type attributes is different due to the use of ADs. Specifically, given some called a probabilistic event narrative. Given an event type domain of a probability distribution f1,..., fn, then for any e, we refer to the set of probabilistic attributes for e and a ground initiates(e, fi, t) clause there must also be ground single probabilistic occurrence of e as a probabilistic event. terminates(e, f1, t),..., terminates(e, fn, t) clauses. In effect, Thus, for any probabilistic event, we are capable of express- this approach maintains a valid probability distribution over ing uncertainty about whether the event actually occurred, f1,..., fn by terminating the previously held probabilities as in [29], but also about the time at which it occurred, each time that new probabilities are initialised. This is akin and about its attributes. In ProbLog, a probabilistic event to database updates as described in [12] and is imposed as narrative actually specifies a probability distribution over a syntactic restriction. the set of total choices, corresponding to a set of (classical) Example 3. Consider Example 4 from [15], which has six event narratives. If any such narrative contains more than belief update events describing the gender and/or orienta- one value specified by a probabilistic event type attribute, tion of a subject over six timepoints. For example, the first then that narrative will have a probability of 0. event says that subject s1 is standing, and that their gender

108 n holdsAt(gender(s1, V1), T) holdsAt(orientation(s1, V2), T) P probability 1− i=1 pi, event type E does not initiate (resp. 1 1 terminate) this period. As with deterministic effect clauses, 0.8 0.8 the event probabilities are propagated to conclusions. In this 0.6 V1/male 0.6 V2/standing case, however, conclusions themselves are also uncertain. V1/female V2/sitting 0.4 0.4 Example 4 (Continuing Example 2). Suppose that, when Probability 0.2 0.2 dropped, the glass may break with probability 0.9 and the 0 0 1 2 3 4 5 6 1 2 3 4 5 6 plate may break with probability 0.8. This can be repre- Timepoint T Timepoint T sented by the following probabilistic effect clauses: 0.9 :: initiates(E, broken(glass), T) :− Figure 2: Probabilities of atoms in Example 3. act(E, drop), object(E, glass). 0.8 :: initiates(E, broken(plate), T) :− is male with probability 0.8 or female with probability 0.2. act(E, drop), object(E, plate). These can be modelled by the following probabilistic events: terminates(E, broken(O), T) :− happens(e1, 0). subject(e1, s1). orientation(e1, standing). act(E, drop), object(E, O). 0.8 :: gender(e1, male); 0.2 :: gender(e1, female). Thus, for any timepoint t 0, we can infer that the glass is broken with probability 0.648, and the plate is broken with happens(e2, 1). subject(e2, s1). probability 0.064. 0.9 :: orientation(e2, standing); 0.1 :: orientation(e2, sitting). 3.3 Initial beliefs happens(e3, 2). subject(e3, s1). gender(e3, male). Aside from effect clauses, the SEC also allows us to specify happens(e4, 3). subject(e4, s1). which fluents hold prior to the occurrence of events by means of a set of ground atoms of the form initially(f), as indicated 0.85 :: gender(e4, male); 0.15 :: gender(e4, female). in Table 1. We can propose a simple probabilistic extension: happens(e5, 4). subject(e5, s1). Definition 4. Let f1,..., fn be fluents. Then an initial prob- 0.3 :: orientation(e5, standing); 0.7 :: orientation(e5, sitting). abilistic belief is a ground AD:

happens(e6, 5). subject(e6, s1). p1 :: initially(f1); ... ; pn :: initially(fn). 0.35 :: gender(e6, male); 0.65 :: gender(e6, female). This says that fluent fi holds with probability pi and that Pn Suppose we have the following effect clauses: fluents f1,..., fn do not hold with probability 1− i=1 pi. In practice, initial probabilistic beliefs define the initial context initiates(E, gender(S, G), T) :− for events (e.g. for determining conditional effects) and per- subject(E, S), gender(E, G). sist over time until an event occurs with effects over the do- main f ,..., f . From this point, probabilities over f ,..., f terminates(E, gender(S, G1), T) :− 1 n 1 n will be determined by events and effect clauses. Note that subject(E, S), gender(E, G2). initial probabilistic beliefs do not impose any requirement with equivalent clauses defined for orientation(S, G).3 Then on complete knowledge. For example, omitting initial beliefs the resulting holdsAt probabilities are as outlined in Fig- that are unknown allows the SEC axioms to derive (or not derive) conclusions as usual, while incomplete initial proba- ure 2, matching the desired behaviour in [15]. Pn bilistic information is expressed when pi < 1. This example demonstrates that our probabilistic event i=1 model allows us to derive rational probabilistic conclusions Example 5 (Continuing Example 4). Suppose instead that via standard (deterministic) effect clauses. However, it is the probability of the glass breaking is dependent on the commonly accepted in the literature (e.g. in planning under person’s orientation when they drop the glass (probability uncertainty) that the effects of events may not be determin- 0.9 if standing and 0.3 if sitting). This can be represented istic. Thus, there is a clear need to support the definition of by the following probabilistic effect clauses: such effects in the SEC. We propose: 0.9 :: initiates(E, broken(glass), T) :− subject(E, S), act(E, drop), object(E, glass), Definition 3. Let E be an event type, F1,..., Fn be fluents and T1,..., Tn be timepoints. A probabilistic effect clause holdsAt(orientation(S, standing), T). for E is an AD: 0.3 :: initiates(E, broken(glass), T) :−

p1 :: h(E, F1, T1); ... ; pn :: h(E, Fn, Tn) :− b1,..., bm subject(E, S), act(E, drop), object(E, glass), where h ∈ {initiates, terminates}. holdsAt(orientation(S, sitting), T). Suppose our prior belief is that Alice is standing with prob- This says that, if each b is true, then with probability p i i ability 0.6 and sitting with probability 0.4, and that Bob is event type E initiates (resp. terminates) a period at time- standing (resp. sitting) with probability 0.5. This can be point T during which fluent F holds. Conversely, with i i modelled by the following initial probabilistic beliefs: 3While some properties (e.g. gender) are not fluents in prin- ciple, what we are describing is beliefs about those properties 0.6 :: initially(orientation(alice, standing)); and this satisfies the intuition of a fluent. 0.4 :: initially(orientation(alice, sitting)).

109 0.5 :: initially(orientation(bob, standing)); subject(e, X)θ P1 P2 P3 ⊕(P1, P2, P3) 0.5 :: initially(orientation(bob, sitting)). θ = {X/alice} 0.7 0.8 0.4 0.67 Thus, for any timepoint t 0, we can infer that the glass is θ = {X/bob} 0.3 0 0.1 0.17 broken with probability 0.462. θ = null 0 0.2 0.5 0.16

3.4 Inference rules Table 2: Linear opinion pool for Example 7. Composite events are closely related to other concepts, such as conditional events. In [16], they refer to any event which is derived through reasoning as an inferred event, i.e. the authors did not provide a formal mechanism to detect conditional, composite, and hierarchical events are all spe- such an event. Example 6, on the other hand, demonstrates cial cases. We will use this term also. In the SEC, inferred that our framework is capable of implementing this feature. events can be defined by means of inference rules. As with 3.5 Merging event narratives effect clauses, such rules may be uncertain. We propose: In real-world applications, event reasoning systems usu- Definition 5. Let E be an event type, T1,..., Tx time- ally operate in multi-sensor environments, e.g. SIEMs aggre- points, and h1(E, V1),..., hy(E, Vy) event type attributes for gate events obtained separately from log files and network E. A probabilistic inference rule for E is an AD: traffic. Clearly, different sources may provide conflicting ob- servations of the same event. As implied previously, a prob- p1 :: infer(happens(E, T1), h1(E, V1),..., hy(E, Vy)); ... ; abilistic event type attribute p1 :: h(e, r1); ... ; pn :: h(e, rn) px :: infer(happens(E, Tx), h1(E, V1),..., hy(E, Vy)) :− expresses a probability distribution P over the set of val- b1,..., bz. ues Ω = {r1,..., rn, unknown} for h s.t. P(ri) = pi and Pn P(unknown) = 1 − i=1 pi. In this sense, the problem of Ground terms in the head of a probabilistic inference rule merging event narratives reduces to the problem of merging can be interpreted as the definition of a new probabilistic (aggregating) subjective probabilities. A standard approach event (called a probabilistic inferred event) of the form: to this problem is the linear opinion pool [31], which is a lin- ear weighted average defined for each ω ∈ Ω as follows: p1 :: happens(E, T1); ... ; px :: happens(E, Tx). n h1(E, V1).... hy(E, Vy). X ⊕(P1,..., Pn)(ω) = wiPi(ω) A probabilistic inference rule is able to derive a new event i=1 with uncertainty over its occurrence and over the time at which it occurred, but not over its attributes. However, to where wi is a weight associated with source si s.t. 0 ≤ wi ≤ 1 Pn ensure that our extension satisfies the well-founded seman- and i=1 wi = 1. When there is no confusion, we will use tics (as will be explained in Section 4), we do not support P to denote a probabilistic h attribute for e. hierarchical events, i.e. inferred events derived from other Definition 6. Let P1,..., Pn be probabilistic h attributes for inferred events. In this sense, inferred events are an out- e obtained from sources s1, . . . , sn. Then the merged proba- put of our extension. That said, it is possible to support a bilistic h attribute for e is defined as ⊕(P1,..., Pn). kind of hierarchical event by inserting inferred events into the event narrative as new (primitive) event observations, Example 7. Consider the probabilistic subject attributes for although this requires a revision strategy to maintain the e provided by sources s1, s2 and s3 as outlined in Table 2 s.t. set of inferred events. w1 = 0.5, w2 = 0.3 and w3 = 0.2. We obtain the following Example 6 (Continuing Example 3). Suppose we want to merged probabilistic subject attribute for e: recognise an event in which a subject sits down. Given that we have temporal beliefs about the subject’s orientations, 0.67 :: subject(e, alice); 0.17 :: subject(e, bob). we can infer such an event via changes from the standing Suppose there exists a unique event type for each real- to sitting orientations. Assume this recognition is accurate world event, the event narrative from each source is defined with probability 0.9. This can be modelled by the following over the same set of event types, and each event narrative probabilistic inference rule: has the same probabilisitic h attributes for each event type 0.9 :: infer(happens(e(S, sit), T), (e.g. assume P(unknown) = 1 for missing attributes). We can thus obtain a merged event narrative by merging each subject(e(S, sit), S), act(e(S, sit), sit)) :− probabilistic h attribute w.r.t. each event type e. Note that happens(E, T), subject(E, S), orientation(E, sitting), the choice of the linear opinion pool is merely to illustrate holdsAt(orientation(S, standing), T). the principle that combining event narratives reduces to the problem of combining subjective probabilities, i.e. any other The resulting (inferred) event narrative is as follows: valid merging operator could be used instead.

0.09 :: happens(e(s1, sit), 1). 0.567 :: happens(e(s1, sit), 4). 4. PROPERTIES subject(e(s1, sit), s1). act(e(s1, sit), sit). We now examine properties of our new extension. Firstly, Thus, subject s1 sits down at timepoint 1 with probability we define an SEC ProbLog program as a ProbLog program 0.09, and at timepoint 4 with probability 0.567. N ∪ E ∪ I ∪ C ∪ {SEC1,..., SEC3} where N is a probabilis- In [15], a similar scenario was suggested where sudden tic event narrative, E is a set of probabilistic effect clauses, changes to the gender assessment could be used as an indica- I is a set of initial probabilistic beliefs, and C is a set of tion of attempts to obfuscate the gender classifier. However, probabilistic inference rules. The clauses with p as the head

110 are called the definition of p and are denoted simply by p it- and clipped(e1, f1, t1) can be in some strata j < k. More- self. We must clarify some assumptions: A1 SEC1–3 are the over, the ground definitions of atoms initiates(e1, f1, t1) and exhaustive definitions of holdsAt(F, T), stoppedIn(T1, F, T2) terminates(e1, f1, t1) can be in some strata i < j. Let the and clipped(T1, F, T2); A2 with the exceptions of SEC1 and body of some clause in the ground definitions of either atom SEC3, stoppedIn(T1, F, T2) and clipped(T1, F, T2) do not oc- initiates(e1, f1, t1) or terminates(e1, f1, t1) contain a positive/ cur as positive/negative literals in the body of any clause; negative literal of the form holdsAt(f2, t2), initiates(e2, f2, t2) and A3 infer(H1,..., Hn) does not occur as a positive/neg- or terminates(e2, f2, t2). Then the ground definition of atom ative literal in the body of any clause. Without loss of gen- holdsAt(f2, t2) can be in some strata h < i. If there is erality, we assume all ADs are subsequently translated to no looping through ground definitions of initiates(e1, f1, t1), probabilistic facts and normal clauses. terminates(e1, f1, t1) and holdsAt(f1, t1), then there exists some total stratification in this manner. Therefore, C ∪ R Proposition 1. Let F ∪ R be an SEC ProbLog program in is a locally stratified LP. which bodies of clauses in the definitions of initiates(E, F, T) and terminates(E, F, T) do not contain literals of the form Corollary 2. Let F ∪ R be an SEC ProbLog program satis- holdsAt(F, T), not holdsAt(F, T), not initiates(E, F, T) or fying Proposition 2. Then F ∪R is a valid ProbLog program. not terminates(E, F, T). Then, for each total choice C, we This proves that the semantics of our extension to the have that C ∪ R is a stratified LP. SEC are indeed well-defined under natural conditions, i.e. we only require that an SEC ProbLog program does not Proof. An LP R is stratified if there exists a partition R = contain (infinite) looping through effect clauses. R0 ∪ · · · ∪ Rn, s.t. Ri ∩ Rj = ∅ for any i 6= j, satisfying: (i) for each atom p, the definition of p is a subset of some Ri; (ii) for each 1 ≤ i ≤ n, if a literal p is in the body of a clause 5. CASE STUDY in Ri, then the definition of p is a subset of some Rj s.t. The aim of [11] was to augment vision analytics with com- j ≤ i; and (iii) for each 1 ≤ i ≤ n, if a literal not p is in the posite event detection for the purpose of high-level activ- body of a clause in Ri, then the definition of p is a subset of ity recognition. To evaluate this work, 8 video sequences some Rj s.t. j < i. Let {p1 :: c1(H), . . . , px :: cx(H)} ⊆ F be were captured in a controlled environment on a bus via two new probabilistic facts after translation of ADs. Given as- cameras, recording both the interior of the bus (covering 20 sumptions A1–3 and the fact that probabilistic event type seats) and the door exterior. Actors were used to simulate attributes have empty bodies by Definition 2, then the fol- passenger behaviour. Each sequence thus captures the ac- lowing is a valid stratification of C ∪ R: tivity of a number of passengers as they board the bus, move around the interior, sit down, stand up, and exit the bus. (C ∪ R)0 ={, c1(H),..., cx(H)}, The vision analytics include classifiers for gender and ori- (C ∪ R)1 ={initially(F), happens(E, T), entation (sitting/standing), and a spatial location tracker. The uncertainty associated with detected events was mod- h1(E, V),..., hy(E, V)}, elled using Dempster-Shafer theory [26], similar to the event (C ∪ R) ={initiates(E, F, T), terminates(E, F, T), 2 model in [16]. To evaluate our own framework, the authors stoppedIn(T1, F, T2), clipped(T1, F, T2)}, provided us with the set of detected events for Sequence 1,

(C ∪ R)3 ={holdsAt(F, T)}, as described in Appendix C of [11]. This video sequence in- (C ∪ R) ={infer(H ,..., H )}. volves two passengers who board the bus, sit down at a se- 4 1 z ries of different seats, and then exit the bus. Although these Therefore, C ∪ R is a stratified LP. events are referred to as composite events in their work, they represent primitive event observations in our case study. Corollary 1. Let F ∪ R be an SEC ProbLog program satis- The original data set and a complete SEC ProbLog imple- fying Proposition 1. Then F ∪R is a valid ProbLog program. mentation are available online,4 along with full details about how the data set was translated. It suffices to say that trans- The restrictions imposed in Corollary 1 are somewhat lim- lating the data set results in a probabilistic event narrative iting in practice. For example, literals holdsAt(F, T) and containing 62 probabilistic events, the first of which occurs not holdsAt(F, T) are prohibited, even though they are re- at timepoint 53 and the last of which occurs at timepoint quired to define context-dependent effects. Fortunately, th- 2440. Each probabilistic event is comprised of a determin- ese restrictions are stronger than is actually necessary, since istic event type occurrence and a (ground) subset of the we only need to ensure that an SEC ProbLog program is following probabilistic event type attributes: locally stratified. subject(E, S). act(E, board). p :: act(E, exit).

Proposition 2. Let F ∪ R be an SEC ProbLog program p1 :: gender(E, male); p2 :: gender(E, female). which does not contain looping through ground definitions of p :: location(E, door); p :: location(E, gangway); initiates(e, f, t), terminates(e, f, t) and holdsAt(f, t). Then, 1 2 for each total choice C, we have that C ∪ R is a locally p3 :: location(E, seat(1)); ... ; p22 :: location(E, seat(20)). stratified LP. p1 :: orientation(E, sitting); p2 :: orientation(E, standing). Thus, the subject attribute and boarding event are both Proof. Given Proposition 1, it remains to be shown that (C∪ deterministic, while all other attributes are probabilistic. R) ∪(C ∪R) is locally stratified. An LP R is said to be lo- 2 3 In order to reason about the probabilistic event narra- cally stratified if there exists a grounding Rθ s.t. Rθ is strat- tive, we completed the SEC ProbLog program by adding 12 ified. Let the ground definition of atom holdsAt(f1, t1) be in 4 strata k. Then the ground definitions of stoppedIn(e1, f1, t1) https://github.com/kevinmcareavey/secproblog

111 hr a lobe oewr ntebodae fPP on PLPs of example, For area broad frameworks. [19], the similar in in applications work uncertainty reasoning some supporting event been also for has calculus there event [29]. in the done well- of not are was 2, approach this and and our 1 ProbLog) of in Propositions valid semantics In the (i.e. defined model that 5). prove to Section also from expressive we set sufficiently data not real ap- is the broader [29] has our (e.g. thus of comparison, plicability and In case expressivity special ours. enhances by the framework subsumed to effectively limited is is proach as where [29] stated 1 in be Definition model can the frameworks two follows: the between relationship WORK RELATED 6. spatio-temporal complex about uncertainty. reason under formally information can programs to ProbLog used SEC serve which be they with set), ease data the limited illustrate the re- to to these (due trivial While fairly are 1395. are passengers sults and both 1170 that timepoints between suggests between overlap passen- set onboard no each data with the is associated although there events ger, that inferred of Note sequence passenger the seats. two while different the seats, 4 From For different to 6 3. passenger here. to that Figure them up conclude in might repeat we summarised not figure are this do results we shown rules, so already inference those 2, re- to corresponding Figure equivalent The are in fluents sequence. these passen- the for each of sults of duration orientation information the and for probabilistic ger location, temporal gender, derive the to about us allow thus probability). for same equivalently the defined with directly was (and taken other events was the stand one and rules, 6, inference Example the they from For when bus. terminated the are exit orientation and form the location of senger’s fluents terminate and to location(S, boarding assumed Similarly, are orientation. events standing exiting in door the fluents at the initiate on to and boarding have assumed particular, events In are exiting fluents. events and orientation and boarding location that the effect of the fluents terminate scribe equiv- and 2 initiate form another to clauses, with the used effect 2, clauses Example 12 effect from the alent directly Of taken inference SEC1–3. are probabilistic 4 axioms with 2 along and rules, clauses effect (deterministic) iue3: Figure

lhuhw aearaymnindterltv merits relative the mentioned already have we Although The [29]. of that course of is ours to work closest The clauses effect these narrative, event probabilistic the Given Probability 0.2 0.4 0.6 0.8 orientation(S, 0 1

258 location(S, happens(e(s

L) 273 275 Timepoin rbblte fifrne nScin5. Section in inferences of Probabilities 378 and 456 460 1 orientation(S, 528 ..psegr or h bus the board passengers i.e. standing), , t A) n 539 T h eann ffc lue de- clauses effect remaining 6 The L). 611 , 1 = 800 T) 920 1205 and 0.2 0.4 0.6 0.8 p 0 1 ..blesaotapas- a about beliefs i.e. O), happ 1

≤ 1429 Timepoin 1705 ens(e(s

ssc,terap- their such, As 1. 1782 1855 s

1919 2 2 , t

2061 A) location(S, T isdw tup at down sits

2125 , s T) 1 2129 isdw at down sits A/stand A/sit door) 112 eeto aastdsusdi eto 5. Section in event discussed The the set providing for data 688127. Hong detection Xin No. thank to agreement like DE- would grant authors the under through project, programme PACESVELOP innovation EPSRC and Horizon Union’s research the European 2020 from the and funding (EP/J012149/1) received project has work This Acknowledgments environ- real-time study to for way. plan adapted similar also a be in we can ments work, framework future runtime our For the how and [2]. [4], calculus calculus event event eventevent reactive cached (classical) the the the [5], in calculus e.g. adapting environments, on real-time for work calculus has some there However, been fashion. already offline an in framework. operates our framework to suited if will particularly and is we possible, language work, is PLP future generalisation any For a such [10]. whether dis- DCs investigate the from and [34], clause CP-logic multi-ary Modelling tributional from CP-event Statistical the the in [25], [22], (PRISM) Programming (ICL) from switch Logic random Choice alternative probabilistic Independent the con- from [20], Stochas- language (SLPs) from Programs clause similar Logic probabilistic tic support the example, PLPs is For other it Programs structs. However, that Logic [7] namely [35]. PLP, (LPADs) known sup- other Disjunctions our exam- is one Annotated in For construct least with role at language pivotal by this PLPs. a ported and other plays definition, some disjunction framework least annotated the at ple, to generalise ProbLog can reasoning SEC their then way. formalise 16]), general to a [36, means in in viable capabilities a case offer the temporal programs is to [16] approaches (as and ad-hoc reasoning [36] on as rely frameworks such formulated existing suitable programs, trivially many ProbLog be SEC can find uncertain literature as can some the we example, from For models 5, event literature. Section the in in study applications addition case In reason the to sources. to able multiple first from the observations also event un- is about It uncertain about and definitions. reason event effects, to composite uncertain able observations, first event the certain is extension our calculus, CONCLUSION 7. SEC time. the of of may treatment extension they specialised their our reasoning, to event for due to benefits directly computational these relate offer Although not respectively. [21], do and [10] works [32] (DCs) Probabilistic in Clauses proposed Causal Distributional were and languages [34] PLP dy- (CP-logic) the example, Logic For of dynamics. general variants more and namic the time approach on PLP modelling such in of con- work no problem language been aware, also PLP has are was similar There we a LP) struct. as or to disjunctions far related annotated As uses closely [3]. variant are in probabilistic (which proposed a languages Similarly, prob- action for [23]. of used in were semantics planning set abilistic answer with PLPs hybrid aeinms ucin [26]. functions mass Bayesian 5 suigteeet’ms ucin r etitdto restricted are functions mass events’ the Assuming ti ot oigta h rbo mlmnaino our of implementation ProbLog the that noting worth is It framework our that likely is it previously, mentioned As event the on literature state-of-the-art the to Compared 5 fthese If . REFERENCES problems from the standpoint of Artificial Intelligence. [1] J. F. Allen. Towards a general theory of action and In B. Meltzer and D. Michie, editors, Machine time. Artificial Intelligence, 23(2):123–154, 1984. Intelligence, volume 4, pages 463–502. Edinburgh University Press, 1969. [2] A. Artikis, M. Sergot, and G. Paliouras. An event calculus for event recognition. IEEE Transactions on [19] E. T. Mueller. Commonsense Reasoning: An Event Knowledge and Data Engineering, 27(4):895–908, Calculus Based Approach. Morgan Kaufmann, second 2015. edition, 2015. [3] C. Baral, N. Tran, and L.-C. Tuan. Reasoning about [20] S. Muggleton. Stochastic logic programs. Advances in actions in a probabilistic setting. In Proc. of 18th Nat. Inductive Logic Programming, 32:254–264, 1996. Conf. on Artificial Intelligence, pages 507–512, 2002. [21] D. Nitti, T. D. Laet, and L. De Raedt. A particle filter [4] F. Chesani, P. Mello, M. Montali, and P. Torroni. A for hybrid relational domains. In Proc. of 2013 logic-based, reactive calculus of events. Fundamenta IEEE/RSJ Int. Conf. on Intelligent Robots and Informaticae, 105(1-2):135–161, 2010. Systems, pages 2764–2771, 2013. [5] L. Chittaro and A. Montanari. Efficient temporal [22] D. Poole. The independent choice logic for modelling reasoning in the cached event calculus. Computational multiple agents under uncertainty. Artificial Intelligence, 12(3):359–382, 1996. Intelligence, 94(1):7–56, 1997. [6] K. L. Clark. Negation as failure. In H. Gallaire and [23] E. Saad. Probabilistic planning in hybrid probabilistic J. Minker, editors, Logic and Data Bases, pages logic programs. In Proc. of 1st Int. Conf. on Scalable 293–322. Springer, 1978. Uncertainty Management, pages 1–15, 2007. [7] L. De Raedt and A. Kimmig. Probabilistic (logic) [24] T. Sato. A statistical learning method for logic programming concepts. Machine Learning, programs with distributional semantics. In Proc. of 100(1):5–47, 2015. 12th Int. Conf. on Logic Programming, pages 715–729, 1995. [8] D. Fierens, G. Van den Broeck, J. Renkens, D. Shterionov, B. Gutmann, I. Thon, G. Janssens, and [25] T. Sato and Y. Kameya. Parameter learning of logic L. De Raedt. Inference and learning in probabilistic programs for symbolic-statistical modeling. Journal of logic programs using weighted Boolean formulas. Artificial Intelligence Research, 15:391–454, 2001. Theory and Practice of Logic Programming, [26] G. Shafer. A Mathematical Theory of Evidence. 15(3):358–401, 2015. Princeton University Press, 1976. [9] M. Gelfond and V. Lifschitz. Representing action and [27] M. Shanahan. Solving the Frame Problem: A change by logic programs. The Journal of Logic Mathematical Investigation of the Common Sense Law Programming, 17(2):301–321, 1993. of Inertia. The MIT Press, 1997. [10] B. Gutmann, I. Thon, A. Kimmig, M. Bruynooghe, [28] A. Skarlatidis. Event Recognition Under Uncertainty and L. De Raedt. The magic of logical inference in and Incomplete Data. PhD thesis, University of probabilistic programming. Theory and Practice of Piraeus, 2014. Logic Programming, 11(4-5):663–680, 2011. [29] A. Skarlatidis, A. Artikis, J. Filippou, and [11] X. Hong, Y. Huang, W. Ma, S. Varadarajan, P. Miller, G. Paliouras. A probabilistic logic programming event W. Liu, M. Jose Santofimia Romero, J. Martinez del calculus. Theory and Practice of Logic Programming, Rincon, and H. Zhou. Evidential event inference in 15(2):213–245, 2015. transport video surveillance. Computer Vision and [30] A. Skarlatidis, G. Paliouras, A. Artikis, and G. A. Image Understanding, 144:276–297, 2016. Vouros. Probabilistic event calculus for event [12] R. Kowalski. Database updates in the event calculus. recognition. ACM Transactions on Computational The Journal of Logic Programming, 12(1-2):121–146, Logic, 16(2):11:1–11:37, 2015. 1992. [31] M. Stone. The opinion pool. The Annals of [13] R. Kowalski and M. Sergot. A logic-based calculus of Mathematical Statistics, 32(4):1339–1342, 1961. events. New Generation Computing, 4(1):67–95, 1986. [32] I. Thon, N. Landwehr, and L. De Raedt. Stochastic [14] V. Lifschitz. Twelve definitions of a stable model. In relational processes: Efficient inference and Proc. of 24th Int. Conf. on Logic Programming, pages applications. Machine Learning, 82(2):239–272, 2011. 37–51, 2008. [33] A. Van Gelder, K. A. Ross, and J. S. Schlipf. The [15] J. Ma, W. Liu, and P. Miller. Handling sequential well-founded semantics for general logic programs. observations in intelligent surveillance. In Proc. of 5th Journal of the ACM, 38(3):619–649, 1991. Int. Conf. on Scalable Uncertainty Management, pages [34] J. Vennekens. Algebraic and logical study of 547–560, 2011. constructive processes in knowledge representation. [16] J. Ma, W. Liu, P. Miller, and W. Yan. Event PhD thesis, K.U. Leuven, 2007. composition with imperfect information for bus [35] J. Vennekens, S. Verbaeten, and M. Bruynooghe. surveillance. In Proc. of 6th IEEE Int. Conf. on Logic programs with annotated disjunctions. In Proc. Advanced Video and Signal Based Surveillance, pages of 20th Int. Conf. on Logic Programming, pages 382–387, 2009. 431–445, 2004. [17] J. McCarthy. Circumscription—a form of [36] S. Wasserkrug and O. Etzion. A model for reasoning non-monotonic reasoning. Artificial Intelligence, with uncertain rules in event composition systems. In 13(1-2):27–39, 1980. Proc. of 21st Conf. on Uncertainty in Artificial [18] J. McCarthy and P. J. Hayes. Some philosophical Intelligence, pages 599–608, 2005.

113