<<

Capturing High-Level Conditions, using a Publish/Subscribe , in Sensor Systems

Salman Taherian and Jean Bacon

University of Cambridge Laboratory, JJ Thomson Avenue, Cambridge, CB3 0FD, UK {st344,jmb}@cl.cam.ac.uk

Keywords: states, publish/subscribe, sensor systems granularity of their publications (e.g. rate of events), are hidden from the subscribers and throughout the Abstract system lifetime. Related efforts have led to the develop- ment of Composite Event (CE) frameworks, that are fur- In this paper we discuss State Maintenance Components ther discussed in the next section. We, however, have (SMCs), which are used to capture high-level condi- adopted the notion of state to capture lasting conditions in tions in a State-based Publish/Subscribe (SPS) middleware. the system. Our efforts have led to the development of the SMCs support temporal and spatial condition interrelation- SPS framework[17], which combines the expressiveness of ships, while using events received from a Publish/Subscribe states with the scalability of Pub/Sub systems. (Pub/Sub) communication system. The SMC data structure The SPS middleware builds on stateless topic-based is designed so as to enable SMC decompositions and distri- Pub/Sub protocols, which are simple in design and butions for effective load-balancing, localized processing, lightweight for on resource-constrained de- and information sharing. Our evaluation, using real sensor vices. More resourceful devices house State Maintenance data in the context of a smart transportation system, demon- Components (SMCs), that capture high-level conditions. strates the expressiveness and the scalability of SMCs in the These components maintain limited data structures that aid SPS middleware. context-based data processing (for increased efficiency), and memory-based condition detection (for increased ex- 1 Introduction presssiveness). In this paper, we focus on SMCs, and dis- With the advance of sensor technologies, and increased re- cuss their expressiveness and operation for capturing high- alisation of the benefits of studying environmental data, level conditions in sensor systems. new sensor systems are emerging, that exhibit sheer scale, In section 2, we present an overview of related work. Sec- heterogeneous device types, and multi-application oper- tion 3 outlines our middleware architecture; further details ational settings. Smart environments are an example of which can be found in [17]. Capturing expressive con- of these systems, where environments are equipped with ditions, using SMCs, is detailed in section 4. Section 5 wired/wireless sensor devices of various types and plat- evaluates the expressiveness and performance of the SPS forms, and serve diverse and dynamic applications. middleware, in the context of a smart transportation system These systems demand middleware solutions that abstract case-study. Finally, concluding remarks and future direc- the low-level data, and the infrastructural details, and cap- tions are outlined in section 6. ture high-level knowledge or conditions for their applica- tions and users. Most important, however, is the dynamic 2 Related Work that is realised in these systems. This is not only related to node failures, but also to network characteris- SPS is not the first framework to use the notion of state tics (e.g. node mobility), the multi-application setting (dis- for sensor systems. Others [6, 8, 1] have offered the ex- tributed and dynamic users, with changing interests), and pressiveness of states to sensor network applications. But the system’s evolution (node joins and leaves). they are mainly based on the principles of Finite State Ma- Provisioning the highlighted information processing needs, chines (FSMs), and describe the internal state of a program in the view of the dynamic network topology, is a great in sensor networks. They are predominantly “state-oriented challenge. In an effort to address this, we have decided programming models”, in which one or more user appli- to leverage from existing Pub/Sub systems[4], which have cations can be modelled and programmed over sensor de- been extensively researched in the context of large-scale vices. Other works use the notion of state to reflect knowl- distributed systems. The aim is to build a suitable infor- edge about the real-world. Examples include [18] where mation processing service that cooperates with the Pub/Sub lasting conditions are captured over correlated events, and communication paradigm. [14] where high-level information is deduced from primi- Capturing high-level conditions, using a Pub/Sub system, tive state events. In [14], primitive state events are drawn to is non-trivial. Publishers’ details, such as the quantity and a centralized server, where expressive state predicates are adv./ subscribe/ notify publish SMC spec. (events) Table 1: SMC Structure SMC Index Annotation notify (events) N SMC Name Q Publish/ InfoS n SMC Query Expressions (QEs) Subscribe manager P SMC Entrance Predicate An x ingress Event Attributes Computations subscribe P SMC Exit Predicate adv./ InfoS Ax publish SMC spec. egress Event Attributes Computations s SMC Status- request/ e SMC’s last Capture (Event Publication) SMC manager register (QEs) SMCs reply/notify and a consuming one is called a subscriber. (KPs) Figure 1 shows the internals of the SPS middleware: the Publish/Subscribe, the Information Space (InfoS) manager, Figure 1: SPS architecture and the SMC manager components. There is an ex- evaluated. Our work uses a similar notion of state, but of- plicit separation of tasks: the Pub/Sub component han- fers SMCs that are more expressive in describing condi- dles network-wide messaging, the InfoS component serves tions, and decomposable for distributed processing. as an event repository, and the SMC manager captures CE frameworks [13, 12, 2, 7] extract high-level informa- high-level conditions. This separation helps to isolate the tion through patterns of event occurrences. These patterns costs and predict run-time resource usage. Information are enveloped as individual composite events, which may flow, in SPS, is by means of event notifications, compris- subsequently serve as events to other composite events. ing a of attribute/value pairs. Some attributes (topic, These frameworks focus on temporal relationships between time, location, and status1) are generic, while others are events. Event parameterization (constraints over event at- topic-related. High-level information deduction, in SPS, is tribute values) hinders the sharing of CEs, and is often through SMCs, that have a structure outlined in table 1. Ev- performed pre- or post-CE detection. Selection and dis- ery SMC is an instantiated object that envelopes the condi- card of constituent events are governed by per topic event tion definition in (N, Q, P n, An, P x, Ax), and maintains a consumption policies. In contrast, our framework supports status-bit s and SMC event e that reflects the current status event parameterization as part of its condition detection and last capture of the condition. The status-bit s and the process, and selects and discards events according to their last capture e can be examined through the valid and last contained knowledge (attribute values). SPS defines a uni- keywords within the SMC predicates. fied model for supporting per condition temporal and spa- SMCs are housed at the SMC manager, and operate as pub- tial policies that govern the event selection, condition detec- lishers to the Pub/Sub component. Their names indicate the tions, and condition interrelationships. Furthermore, unlike event topics under which their events are published. QEs some CE frameworks, SPS detectors (SMCs) can capture highlight the relevant data that is obtained from the InfoS multiple concurrent conditions without the need to have to examine the conditions. The entrance predicate captures pre-existing replicated detectors. the start of a condition, for which an ingress event (with With a -oriented view on sensor networks[5], DB- related attributes) is generated, and the exit predicate cap- related frameworks [10, 19, 9, 15] have been developed, tures the ending of a condition, denoted by an egress event. that support application-level Structured Query Language Context-based data processing is performed by oscillating (SQL) queries over resource-constrained sensor devices. between the two predicates, according to the current status SQL queries are parsed into procedural Relational of the condition, s. The next section details how SMCs are Expressions in Database Management Systems (DBMSs). evaluated and conditions captured in SPS. Although we use Relational Algebra to define our condi- tion detection model, our framework is state-based (main- 4 Capturing Conditions tains states, provides context-based data processing, and fa- cilitates memory-based condition detection) and is concep- In SPS, the notion of state is used to capture momentary or tually different from the work on DBMSs and data-stream lasting conditions. We say that a condition has occurred if processing, focusing on an open distributed environment. there exists some knowledge that contributes or hints about We describe how temporal and spatial condition interrela- its occurrence. The detection process can be divided into tionships can be captured using Relational Algebra, and in three stages: knowledge selection, knowledge examination, this process, the efforts of the DB research community can and knowledge encapsulation. Throughout this process the be used for accurate optimization and implementation. following data structures are observed, that refine the InfoS data to some knowledge that suggests the condition’s oc- 3 State-based Publish/Subscribe (SPS) currence, and finally contribute to an SMC event. Knowledge Point (KP) is an event-like attributed tuple, SPS[17] is a State-based Publish/Subscribe framework that that reflects a piece of knowledge in the InfoS. supports its clients, comprising sensors, actuators, applica- Satisfying Knowledge (SK) is a set of KP-combination tions, controllers, etc., through a uniform Pub/Sub inter- face. An information producing client is called a publisher, 1this attribute may have the atomic, ingress or egress value. tuples, each of which represents a combination of one ATTRIBUTES RELATIVE DOMAINS or more KPs, whose values satisfy an SMC predicate. topic super−topic names SMC topic name sub−topic names time a past time present InfoS time Contributing Knowledge (CK) is a set of KP- location SMC−host location distance from host combination tuples, that satisfy an SMC predicate and status ingress atomic egress relate to a unique condition. − 0 + We explain the detection process with reference to an ex- (local) ample. A high-level explosion event is defined as below. Explosion Event. An explosion event is detected when Figure 2: Relative domains sound and light events are detected, and high temperature Topic The event topics from the absolute domain are readings are realised from the environment. The time of the mapped onto the domain of real based on sound and light events should be close (within 0.5s), and the corresponding SMC’s name and super- and sub- high temperature readings should be detected continuously topic names (e.g. explosion event topic is a sub-topic for the past 30s. Multiple sound events, realised at different of light, sound, and temperature event topics). times, could relate to different explosion events. Time The time values are mapped according to the lo- cal InfoS’s processing timeline, that monotonically in- 4.1 Knowledge Selection creases as knowledge (events) become stable[17]. Location The location values are mapped according to QEs extract distinct pieces of knowledge from the InfoS, their Euclidean distances from the position of the cor- to be examined individually or against other knowledge for responding SMC’s host. condition occurrences or terminations. Every QE resolves Status The status values are mapped statically. into a set of KPs, when examined against the InfoS. Figure 2 illustrates this mapping over an axis of real num- If we consider the InfoS as an MDX[11] cube, with di- bers. Let AF = {topic, time, location, status} represent mensions matching the fixed event attributes, then each QE the set of fixed event attributes, and AT represent the set of could be thought of as a selection box, that encloses some topic-related event attributes. We now define the QE param- related knowledge in the cube, represented later by some eters, using relational algebra, about a candidate attribute KPs. The QEs are composed of attributed parameters that a ∈ AF . Note that relative values are expressed like abso- govern the selection of knowledge along every dimension lute values with an r superscript. of the cube. These parameters hold a one-to-one relation- I(v) S({x|x∈AF ,x6=a}γmin(|a−v|),{y|y∈AT }InfoS), ship with the fixed event attributes. Table 2 shows the means that the knowledge closest to or at the v of permissible values for each attribute. DP is the set value is selected (e.g. for the time attribute, I(0r) of all event topics, DT , DL are the set of time and loca- selects the most recent knowledge from the InfoS). tion values, and DS = {atomic, ingress, egress}. The A(fn, < range >) S({x|x∈AF ,x6=a}γfn({y|y∈AT }),aY ), < range > parameters are described using a pair of values where Y = σrl≤a≤rh InfoS. This parameter selects (rl, rh) that denote the lower and higher bound values. The all knowledge, that falls within a given range, and ap- default range is (−∞, +∞). QE parameters follow one- plies an aggregation , fn, independently of another with a dot (‘.’) in the representation. For max r r the other fixed attributes. Maximum ( ), minimum example, the “I(T emperature).L(−30 , 0 ).N.I(atomic)” (min), sum (sum), and average (avg) are supported. is a QE, that holds the I(T emperature) as its topic {O, L, N, S}(< range >) σ InfoS, r r rl≤InfoS.a≤rh parameter, the L(−30 , 0 ) as its time parameter, the means that all knowledge instances, falling within the N(−∞, +∞) as its location parameter, and the I(atomic) specified range, are selected from the InfoS. as its status parameter. If we label the resultant relation, from resolving a QE pa- If we consider the InfoS as a Database (DB) relation, then rameter, corresponding to an attribute a ∈ AF , as Ra(X) the parameters could be described using relational algebra. where X is the input relation, then the overall KPs for a Prior to this, however, we must define the domain (set of QE is K = Rtopic (Rstatus (Rlocation (Rtime(InfoS)))). permissible values and their meanings) for each attribute The earlier QE example retrieves the past 30s temperature in the InfoS. For every fixed attribute, we define two do- readings, published by all the sensors across the space. mains: an absolute domain D and a relative domain Dr. The absolute domain reflects the set of static values that are 4.2 Knowledge Examination storable in the tuples of InfoS. The relative domain, how- ever, offers an alternative and dynamic view to the same set SMCs are examined whenever their related knowledge is of static values in D. Queries that use relative values (e.g. changed or updated. This change is not restricted to the re- r σtime==0r InfoS, where 0 is a relative value) may return ception of a new event, but may occur when the processing different relations, when examined by different queriers, at timeline is advanced, or the SMC’s host changes position. different times, or at different locations. The change triggers the InfoS to provide new KPs, which The relative domain Dr can be best explained by mapping are then examined for condition occurrence or termination. each attribute domain D onto a unique domain of real num- The predicates define what knowledge contributes or hints bers. Note that these domains are used as part of describing about a condition’s occurrence. If we label an SMC’s QEs that have corresponding SMCs, who also have corre- QEs as A, B, , . . ., and their corresponding KPs as sponding hosts in the network. KA, KB, KC, . . ., then the SK S can be found as follows. Table 2: Query Expression attributes and parameters Attributes Parameters Default r topic single(I)(v ∈ DP ∪ DP ), multiple : {one(O), all(L), any(N), separate(S)}(< range >) – r r time I(v ∈ DT ∪ DT ), aggregate(A)(fn, < range >), multiple : {O, L, N, S}(< range >) I(0 ) r location I(v ∈ DL ∪ DL), A(fn, < range >), multiple : {O, L, N, S}(< range >) N(−∞, +∞) r status I(v ∈ DS ∪ DS), multiple : {O, L, N, S}(< range >) I(ingress)

S = σ n x ((ρ KA) × p∈{P ,P } {A.topic,A.time,A.location,···} Table 3: Explosion SMC (ρ K ) × · · ·), where p is se- {B.topic,B.time,B.location,···} B SMC lected according to the current status of the condition. N “Explosion” If KA and KB represent the light and sound events, and the A := I(Light).N.N.I(atomic); SMC predicate is |A.time − B.time| ≤ 0.5, then S would Q B := I(Sound).S.N.I(atomic); C := I(T emp).L(−30r, 0r).N.I(atomic); contain all light and sound event-combinations whose oc- P n |A.time − B.time| ≤ 0.5 && C.value ≥ 40 currence times are within 0.5s. This can be used as part of P x true a search for detecting explosions. from the InfoS. Similarly, distinct explosion events (re- The cartesian product of the relations (K , K , K , . . .), A B C lated to different sound events) may be captured using the and the examination of all KP-combinations may be com- multiple : separate parameter for the time attribute of a putationally expensive. In section 4.4, we discuss the de- QE like B that extracts the sound events from the InfoS. composition of QEs, which allows this computation to be The Explosion SMC structure is shown in table 3. distributed across many networked devices. In order to determine all CKs (from an SK S), the SK S is divided into CKs according to the multiple : separate 4.3 Knowledge Encapsulation parameter, {U|U = σ(a=i∈(πaKx))S}. The multiple : one Every tuple in S highlights a KP-combination that satisfies and multiple : all parameter assertions are then examined the predicate. The user specifies how these tuples should over each set member, and unsatisfied CKs are eliminated relate to one-another, and how they may be grouped to in- from the {U}. Subsequently if |{U}| > 1, then concurrent dicate one or more condition occurrences or terminations. conditions are detected, and temporary SMCs are spawned Knowledge encapsulation is a process whereby the SK S is to monitor each distinct condition individually. Every tem- transformed into zero or more CKs. The of result- porary spawned SMC is assigned a unique CK u ∈ {U}, ing CKs determines the number of detected conditions. and lasts until its assigned condition is terminated. The cardinality of S relates to the cardinality of the in- SMCs transform CKs into SMC events. The most recent put KPs (KA, KB, KC, . . .), such that if |S| > 1 then knowledge in the CK and the SMC’s attributes computa- An/Ax ∃x ∈ {A, B, . . .} where |Kx| > 1. For |Kx| > 1 to hold, tions, , are used to determine the topic-related at- the corresponding QE, x, must hold at least one multiple tribute values. The generic attribute values are determined parameter among its attributed values. We define sub- from the SMC’s name, the most recent KP in CK, the lo- parameters for multiple, that define what relationship the cation of the SMC’s host, and the satisfied predicate. The resultant KP-combinations in S should hold for the condi- condition capturing process is completed when the SMC tion to have occurred. Assuming that the multiple param- event is published and the SMC status-bit is toggled. eter relates to an attribute a ∈ A, then the sub-parameters and their associated conditions may be defined as follows. 4.4 Distributed Detection |π S| = 1. Asserts that only one unique multiple:one (O) a SPS supports the decomposition of SMC to distribute the a value, from K , should appear in S. x processing load across many network nodes, and poten- multiple:all (L) |πaS| = |πaKx|. Asserts that all a val- tially reduce processing and communication costs. De- K S ues, from x, should appear in . composition of complex SMCs into simpler SMCs length- multiple:any (N) |πaS| ≥ 1 ≡ |S| ≥ 1, is a dummy pa- ens the information processing chain, thus increasing the rameter that asserts any one satisfied a value, from Kx, chance of information sharing among multiple independent may be taken as a representative in a CK U. subscribers. Furthermore, SMC decompositions allow for multiple:separate (S) |πaS| ≥ 1 ≡ |S| ≥ 1, is similar to more effective distribution of the SMCs, where localized multiple : any but implies that every unique a value (zero-communication cost) processing may be achieved. in S can signal a unique condition. A maximum num- ber of |π S| conditions can be captured per evaluation. a 4.4.1 Predicate Decompositions A CK U is a of the SK S, such that all conditions (related to the multiple sub-parameters in x) are satisfied An SMC predicate is a boolean expression, which may be within U. These sub-parameters can be used to define tem- decomposed using boolean algebra. These decompositions, poral and spatial condition interrelationships in SPS. For however, are only effective if disjoint operands are pro- example, continuous high temperature readings can be as- duced (i.e. an SMC is decomposed into many SMCs, that serted using the multiple : all parameter for the time at- hold mutually exclusive groups of QEs). Every SMC then tribute of a QE like C that acquires temperature readings either examines a distinct part of the overall condition or Table 4: Decomposed Explosion SMC Table 5: Decomposed SMCs (a) Temperature High SMC (a) Temperature High1 SMC SMC SMC N “Temperature High” N “Temperature High1” Q A := I(T emp).L(−30r, 0r).N.I(atomic); Q A := I(T emp).L(−30r, −15r).N.I(atomic); P n A.value ≥ 40 P n A.value ≥ 40 P x true P x true (b) Explosion SMC (b) Temperature High2 SMC SMC SMC N “Explosion” N “Temperature High2” r r A := I(Light).N.N.I(atomic); Q A := I(T emp).L(−15 , 0 ).N.I(atomic); Q P n A.value ≥ 40 B := I(Sound).S.N.I(atomic); x C := I(T emperature High).I(0r).N.I(atomic); P true P n |A.time − B.time| ≤ 0.5 && C.valid x P true (c) Temperature High SMC SMC joins the partial results to examine the overall condition. N “Temperature High” A := I(T emperature High1).I(0r).N.I(atomic); The Explosion SMC may be decomposed as in table 4. Q r B := I(T emperature High2).I(0 ).N.I(atomic); The resultant SMCs may capture meaningful conditions, P n A.valid && B.valid that can be shared with other high-level conditions. P x true

4.4.2 QE Decompositions Table 6: Filtering SMCs Predicate decompositions can lead to the separation of QEs, but each individual QE may also be decomposed. This de- (a) Temp 0.5Hz SMC composition distributes the SK search-space over a number SMC N “Temp 0.5Hz” of SMCs, such that every SMC searches a disjoint table of Q A := I(T emp).I(0r).N.I(atomic); P n last.time ≤ −2r the KP-combinations (cartesian product of KPs). x This decomposition only relates to the QEs that hold a P true multiple parameter. The range of the parameter can be (b) Temp 10%Change SMC decomposed into an arbitrary number of disjoint ranges, SMC N “Temp 10%Change” each of which is examined by an independent SMC r r r Q A := I(T emp).I(0 ).N.I(atomic); (e.g. the L(−30 , 0 ) parameter may be decomposed into n r r r r P |A.value − last.value| ≥ 0.10 ∗ last.value L(−30 , −15 ) and L(−15 , 0 )). P x true The decomposed SMCs search only parts of the knowledge- space, hence their findings (SKs) are limited and incom- The performance and scalability of the SPS framework is plete. Events published by these SMCs need to be joined discussed within the context of the application scenario. to examine sub-parameter conditions, and yield accurate results. This join operation depends on the chosen sub- parameter, and in its simplest case (for the multiple : 5.1 Expressiveness separate) is not necessary. Other sub-parameters require 5.1.1 Controlling the rate of events appropriate join operations that assert their conditions over the realised SKs. Although Pub/Sub subscribers have no control over the rate multiple:any requires a logical OR join operation to yield of event publication, the need for this control is evident. only a single result (SMC event) from the SKs. Consider a temperature sensor that publishes temperature multiple:all requires a logical AND operation to ensure readings (events) every second. While this granularity is all attribute values are satisfied across all the SKs. suited for some applications (e.g. fire breakout monitoring), multiple:one requires a logical XOR operation to ensure it may be too fine-grained for others (e.g. daily temperature the uniqueness of the satisfied attribute value in SKs. logging). Subscriber-asserted control over the rate of event These join operations can be expressed using SMCs. Table publication is useful, particularly when communication is a 5 gives an example of this decomposition for the Tempera- scarce resource. ture High SMC (shown in table 4(a)). In SPS, a subscriber may define an SMC that filters events according to a custom specification. Table 6 shows two 5 Evaluation SMCs that limit the rate of events, that are delivered to the subscribers, either by time or by value. We first examine the expressiveness of SMCs in the SPS Temp 0.5Hz SMC publishes the most recent Temp event ev- framework; in particular, we discuss two drawbacks that ery 2s, maintaining a fixed 0.5Hz event publication rate. are commonly associated with the Pub/Sub systems. In the Temp 10%Change SMC passes an event when the value at- second part, we highlight an application scenario, for which tribute has changed by more than 10% (relative to the pre- we process real sensor data to capture high-level conditions. viously passed event’s value). Table 7: SensorInUse SMC JourneyPlannerApp SMC subscribe/ notify N “SensorInUse” (events) Q A := N.I(0r).I(0r).N; B := L.I(0r).I(0r).N; SMC spec. P n A.name > T emperature P x !(B.name > T emperature) SPS TC TCN

SPS IL_High Car_Slow SensorInUse

adv./ X adv./ adv./ publish publish publish InductiveLoop subscribe/ Car User adv./ notify publish SMC spec. (events) Figure 4: Application Overview Sensor on /off Controller

Figure 3: Controlling Sensors Table 8: TrafficCongestionNear SMC SMC 5.1.2 On-demand event publication N “TrafficCongestionNear” A := I(T rafficCongestion).N(0r, 0r).S.N; Q r r One of Pub/Sub’s downsides, for sensor networks, is its B := I(User).I(0 ).I(0 ).I(atomic); P n |A.location − B.location| ≤ 2 push-based communication model. This requires event x P true publishers (e.g. sensors) to be actively sensing and publish- ing events that may be of no interest to event subscribers high-level information deduction in our experiment. and subsequently discarded at the local event broker. Some • Inductive Loop sensors, with periodic reports on road- related work has modified Pub/Sub implementations to al- segment occupancies (continuous InductiveLoop (IL) low event publishers to be shut down when no correspond- event notifications at 1Hz). ing subscribers exist. • Speed Measurements2, reporting on the speed of the SPS’s expressiveness rectifies this problem, without the passing cars on the road (discrete, but potentially high- need to modify the Pub/Sub implementation. Essentially, volume, Car (C) event notifications). a feedback loop is created where the sensor’s hardware- • Location sensors, indicating the current location of the controller becomes a subscriber to a condition that detects simulated users (continuous User events at 1Hz). interests about its sensor’s data. As such, the controller can We decided to help mobile users to plan their journeys, by activate the sensor when interests arise, and de-activate it providing real-time traffic information about their nearby when interests perish, see figure 3. traffic congestions. More specifically, we simulated 500 We can detect if there exists any SMC, such as X in figure mobile users, in a two-dimensional grid-like road network, 3, that uses the local sensor (e.g. temperature) data. We that had subscribed to their nearby traffic congestions. The assume that such an SMC publishes high-level (e.g. fire) challenges in this application were two-fold. events about our local environment. We can know about Firstly to capture “traffic congestion” conditions, which we all SMCs that have advertised events about our local envi- defined as the mutual occurrence of “high road occupancy” ronment through the A QE in the SensorInUse SMC (table and “slow vehicle speeds”. These were deduced from the 7). This SMC examines for any SMC, such as X, that pub- inductive loop sensor data, and the speed camera readings. lishes high-level events which incorporate knowledge about Every congestion lasted until the speed of the moving vehi- the local sensor data. The entrance predicate is satisfied cles exceeds a given threshold value (15MpH ). when an SMC like X exists, and the exit predicate is sat- Secondly, traffic congestion reports needed to be filtered, isfied when no such SMC exists any more. The controller according to individual user locations. Since our input toggles the switch according to the received ingress/egress knowledge was confined to the published event notifi- SensorInUse events. cations, we introduced User event notifications (detailed above) that were published periodically to provide loca- 5.2 Application Scenario: Journey Planning tion information about each individual user. Traffic conges- tion reports were filtered according to the most recent User The proposed framework has been implemented on Jist/ event knowledge, which meant users did not need to update Swans[3]. An application scenario was implemented and their subscriptions as they moved around the network. tested (using real-data from SCOOT[16]) to observe the The TrafficCongestionNear event topic, whose SMC is de- scalability and performance of SPS. tailed in table 8, reports on the nearby traffic congestions, Consider a smart transportation system, composed of dif- that are situated within a 2 road-junction distance of the ferent sensor devices, such as inductive loop sensors, speed user’s present location. Users subscribed to this SMC lo- cameras, ANPRs, and traffic light signals. In this system, cally, meaning that SMCs were hosted locally and filtered users can benefit from high-level information to plan re- lated activities. Let us define the following data sources for 2inferred from a secondary stream of raw SCOOT data Table 9: Decomposed TrafficCongestion SMC Table 11: Simulation Results (a) TrafficCongestion SMC Sim1 Sim2 Annotations 1/7/06 6/4/06 SMC Decomposed SMC types 4 4 N “TrafficCongestion” A := I(IL High).I(0r).S.I(ingress); IL High, Car Slow, TC and TCN Q r Decomposed and Distributed SMCs 876 876 B := I(Car Slow).I(0 ).S.N; n Max SMC allocation per node 1 1 P A.valid&&B.valid&&A.location == B.location x Max SMCs observed per node 4 6 P (!B.valid) && (B.location == last.location) TC SMCs (b) InductiveLoop High (IL High) SMC Max Events stored per InfoS 37 30 Car events SMC Max Events stored per InfoS for the 14 21 N “IL High” r r TC SMC evaluation Q A := I(InductiveLoop).A(avg, (−30 , 0 )).S. IL High, Car Slow, and TC events I(atomic); Max received Events for TC SMC 236 701 P n A.value > 2.5 x Max Predicate Evaluations per re- 10 14 P A.value < 2.5 && A.location == last.location ceived Event Notification TrafficCongestion predicate evaluations (c) Car Slow SMC Event Publications by Clients 40650736 40814682 SMC IL, Car, and User Events N “Car Slow” r r Total Event Notifications (ENs) 40656552 40825670 Q A := I(Car).A(avg, (−60 , 0 )).S.I(atomic); published Events (including SMC ENs) P n A.value < 7 x Events Disseminated (ED) 1784 3884 P A.value > 15 && A.location == last.location Events Disseminated within the Network Events Delivered to Subscribers 4032 7104 TCN Events to JPA Clients Number of high-level Event Notifications Car Slow 774 1048 TrafficCongestion events for the local subscriber. IL High 842 2540 TrafficCongestion 168 296 The manually decomposed version of the TrafficConges- TrafficCongestionNear 4032 7104 tion SMC, which captures traffic congestion conditions and publishes TrafficCongestion events is outlined in table 9. 5.3 Performance See also figure 4 for a high-level view of the system. Our performance evaluation aimed at examining the effec- We examined sixty roads, each equipped with a single in- tiveness of SMCs, their decompositions and distributions in ductive loop and speed measuring sensor. In to the context of a real application scenario. We were particu- these, 380 network nodes were introduced to ensure wire- larly interested in the following parameters, each of which less network connectivity in the simulation environment. was also related to the expressiveness, induced processing, All nodes were assumed to have resources to house SMCs. communication, and storage costs of SPS. Twenty hours of real data, relating to two distinct days, • How many distinct SMC types were defined to support st th 1 July, 2006, 1AM–9PM for Sim1, and 6 April, 2006, the outlined application scenario? 1AM–9PM for Sim2 were processed. Table 10 outlines the • How much load-balancing was achieved? simulation parameters, and table 11 outlines the results. • How much localization was achieved? • How much information sharing was attained? The application scenario was fully described using four Table 10: Simulation Parameters SMCs (shown in figure 4), three of which were localized (either fully or partially) at the publisher nodes. The In- Statistics Sim1 Sim2 ductiveLoop High and Car Slow SMCs were fully local- Annotations 1/7/06 6/4/06 Number of SMCs and Clients ized, meaning that they were automatically decomposed Client – User 500 500 (along the QE location attributes) and distributed over the Client – InductiveLoop (IL) 60 60 60 InductiveLoop and 60 Car event publishers. The Traf- Client – Car 60 60 SMC – IL High 60 60 ficCongestionNear SMC was partially localized - located SMC – Car Slow 60 60 on the nodes that hosted User event publishers (i.e. the 500 SMC – TrafficCongestion (TC) 256 256 mobile user nodes), but also received non-local events. SMC – TrafficCongestionNear (TCN) 500 500 Client – JourneyPlannerApp (JPA) 500 500 The above localizations meant that only Induc- Client Subscriptions 500 500 tiveLoop High, Car Slow, and TrafficCongestion events journey planner applications were disseminated across the network - other events were Resolved Subscriptions 2132 2132 processed or consumed locally. The disseminated events (including InfoS Subscriptions) Max Subscriptions per node 2 2 totalled to 1784 (Sim1) and 3884 (Sim2). InfoS subscriptions for TC and TCN SMCs With regard to load-balancing, although only four SMCs Number of primitive Event Notifications were defined, the number of SMCs, following autonomous User 3.6e+7 3.6e+7 Car 330736 494682 decompositions and distributions, totalled 876. This in- InductiveLoop 4.32e+6 4.32e+6 cluded 620 SMCs that were localized at the publisher nodes (discussed above). With localization, the processing is also limited to the examination of locally generated data. The original TrafficCongestion SMC was computationally works. In Proc. of 24th International Conference on Dis- expensive to evaluate. The cartesian product of the KPs, tributed Systems (ICDCS), Japan, March 2004. representing the most recent “high road occupancies” and [2] Asaf Adi and Opher Etzion. Amit - the situation manager. “slow vehicle speeds”, produced large data sets for exam- The VLDB Journal, 13(2):177–203, 2004. ination. This process was repeated per receiving Induc- [3] Rimon Barr, Zygmunt J. Haas, and Robbert van Renesse. tiveLoop High or Car Slow event at the TrafficCongestion Scalable wireless ad hoc network simulation. Handbook on Theoretical and Algorithmic Aspect of Sensor, Ad hoc Wire- SMC. The decomposition of the TrafficCongestion SMC, less, and Peer-to-Peer Networks, pages 297–311, 2005. however, resulted in 256 SMCs that focused on smaller ( 1 ) 16 [4] Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, regions of the environment. This decomposition reduced and Anne-Marie Kermarrec. The many faces of pub- the maximum number of events that were received at any lish/subscribe. ACM Comput. Surv., 35(2):114–131, 2003. one SMC from 1616 (Sim1) and 3588 (Sim2) events (relat- [5] Ramesh Govindan, Joseph M. Hellerstein, Wei Hong, Sam ing to a centralized SMC) to 236 and 701 events. Madden, Michael Franklin, and Scott Shenker. The sen- The TrafficCongestion SMC decomposition and distribu- sor network as a database. Technical Report 02-771, tion not only reduced the number of receiving event no- USC/Information Sciences Institute, September 2002. tifications, which in turn reduced the frequency of SMC [6] Oliver Kasten and Kay Romer¨ . Beyond event handlers: Pro- evaluations, but also reduced the size of the knowledge that gramming wireless sensors with attributed state machines. In The Fourth International Conference on Information Pro- was maintained at the local InfoS components. The maxi- cessing in Sensor Networks (IPSN), pages 45–52, Los Ange- mum observed processing complexity was 10n (Sim1) and les, USA, April 2005. 14n (Sim2), meaning that the maximum size of the KP- [7] Shuoqi Li, Sang Hyuk Son, and John A. Stankovic. Event combinations relation (produced by the cartesian product detection services using data service middleware in dis- of the KPs) were 10 and 14 tuples, respectively. This com- tributed sensor networks. In Feng Zhao and Leonidas J. Guibas, editors, IPSN, volume 2634 of Lecture Notes in pares to 74n (Sim1) and 90n (Sim2), that would have been , pages 502–517. Springer, 2003. realised if the TrafficCongestion SMC was not decomposed. [8] Jie Liu, Maurice Chu, Juan Liu, James Reich, and Feng From the 876 distributed SMCs, 376 SMCs collaboratively Zhao. State-centric programming for sensor-actuator net- deduced the traffic congestion information. The 168 (Sim1) work systems. IEEE Pervasive Computing, 02(4):50–62, and 296 (Sim2) TrafficCongestion events, published as re- Oct-Dec 2003. sult of this collaboration, were disseminated to the 500 [9] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TrafficCongestionNear SMCs that were localized at their TAG: A tiny aggregation service for ad-hoc sensor networks. ACM SIGOPS Operating Systems Review, 36(SI):131–146, mobile user nodes. This sharing meant that the process- 2002. ing, storage, and communication costs, that were associated [10] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. with these 376 SMCs, were divided among the 500 users The design of an acquisitional query for sensor when examining per-user costs in the system. Thus, as the networks. In Proc. SIGMOD, 2003. number of users increases, the per-user cost decreases to- [11] Multidimensional Expressions (MDX) Reference. wards the cost of disseminating TrafficCongestion events to http://msdn2.microsoft.com/en-gb/library/ms145506.aspx. a user, and filtering these events for nearby traffic conges- [12] D. Moreto and Markus Endler. Evaluating composite events tion reports. This demonstrates the scalability of the SPS, using shared trees. IEE Proceedings - Software, 148(1):1– that is achieved as a result of information sharing. 10, 2001. [13] Peter R. Pietzuch, Brian Shand, and Jean Bacon. Composite 6 Conclusions event detection as a generic middleware extension. IEEE Network, 18(1):44–55, 2004. In this paper we focused on State Maintenance Components [14] Kay Romer¨ and Friedemann Mattern. Event-based systems and how they may be used to capture expressive conditions for detecting real-world states with sensor networks: A crit- ical analysis. In DEST Workshop on Signal Processing in in an SPS supported sensor system. We demonstrated how Sensor Networks at ISSNIP, pages 389–395, Melbourne, temporal and spatial condition interrelationships can be de- Australia, December 2004. fined, and provided a fine-grained explosion detection ex- [15] N. Sadagopan, B. Krishnamachari, and A.Helmy. The AC- ample to accompany our discussions. Our evaluation, using QUIRE mechanism for efficient querying in sensor net- real sensor data, demonstrated the effectiveness of SMC de- works. In Proc. 1st Intl. Workshop on Sensor Network Proto- composition and distribution. This decomposition allowed cols and Applications (SNPA), Anchorage, AK, May 2003. for load-balancing and distributed data processing, as well [16] SCOOT. http://www.scoot-utc.com. as effective information sharing when interests overlapped. [17] Salman Taherian and Jean Bacon. SPS: A middleware for multi-user sensor systems. In Proc. of the 5th workshop on Middleware for pervasive and ad-hoc computing, pages 19– Acknowledgements 24, Newport Beach, USA, November 2007. ACM Press. This work was funded by Microsoft Research Cambridge. [18] Salman Taherian and Jean Bacon. State-filters for enhanced filtering in sensor-based publish/subscribe systems. In Proc. of the 8th International Conference on Mobile Data Man- References agement (MDM’07), pages 346–350, Germany, May 2007. [1] T. Abdelzaher, B. Blum, D. Evans, J. George, S. George, [19] Y. Yao and J. Gehrke. The COUGAR approach to in- L. Gu, T. He, C. Huang, P. Nagaraddi, S. Son, P. Sorokin, network query processing in sensor networks. ACM SIG- J. Stankovic, and A. Wood. Envirotrack: Towards an en- MOD Record, 31(3):9–18, 2002. vironmental computing paradigm for distributed sensor net-