DOCTORAL PROGRAM (PHD) IN COMPUTER SCIENCE, XIX CYCLE

SCIENTIFIC DISCIPLINARY SECTOR INF/01 COMPUTER SCIENCE

Techniques, Algorithms, and Architectures for Adaptation in Mobile Computing

PhD Dissertation by: Daniele Riboni

Advisor: Prof. Claudio Bettini

PhD Program Coordinator: Prof. Vincenzo Piuri

Academic Year 2005/06

To my parents

Contents

1 Introduction

2 Context-Awareness
  2.1 Introduction
  2.2 Classification of context parameters
    2.2.1 A taxonomy of context data
    2.2.2 Complexity of reasoning
  2.3 Current profiling approaches
    2.3.1 Profile representation of devices
    2.3.2 User profiling
    2.3.3 Profiling provisioning environments
  2.4 Profile-based delivery platforms
    2.4.1 Requirements
    2.4.2 CC/PP-based architectures
    2.4.3 Commercial application servers
    2.4.4 Alternative middleware proposals

3 The CARE Middleware
  3.1 Architecture
    3.1.1 Overview
    3.1.2 Profile Management and Aggregation
    3.1.3 Policies for Supporting Adaptation
    3.1.4 Ontological reasoning
    3.1.5 Supporting continuous services
  3.2 Software architecture
  3.3 Evaluation with respect to the addressed requirements

4 Conflict Resolution for Profile Aggregation and Policy Evaluation
  4.1 Representation of context data and policies
  4.2 Conflicts and resolution strategies
  4.3 Merging distributed context data
  4.4 Policy formal semantics and conflict resolution
    4.4.1 Cycle detection and resolution
    4.4.2 Policy conflict resolution
    4.4.3 Evaluation algorithm and complexity analysis
  4.5 Experimental results
  4.6 Bibliographic notes

5 Distributed Context Monitoring for the Adaptation of Continuous Services
  5.1 Adaptation of continuous services
  5.2 Trigger-based mechanism
  5.3 Minimizing unnecessary updates
    5.3.1 Baseline algorithm
    5.3.2 Optimization based on profile resolution directives
    5.3.3 Optimization based on rule priority
  5.4 Bibliographic notes

6 Loosely Coupling Ontological Reasoning with Adaptation Policies
  6.1 Context modeling
    6.1.1 Shallow profile data
    6.1.2 Ontology-based profile data
  6.2 Basic notions on Description Logics
  6.3 Ontological reasoning
    6.3.1 Off-line ontological reasoning
    6.3.2 On-demand ontological reasoning
  6.4 Experimental evaluation
    6.4.1 Experiment A: Ontological reasoning with increasing ABox size (instances obtained from the aggregated profile)
    6.4.2 Experiment B: Ontological reasoning with increasing ABox size (instances known a priori)
    6.4.3 Experiment C: Ontological reasoning with increasing ABox size (instances not involved in the reasoning task)
    6.4.4 Experiment D: Ontological reasoning with increasing TBox size
    6.4.5 Experiment E: Ontological reasoning with increasing TBox and ABox size
    6.4.6 Experiment F: Ontological reasoning with a realistic ontology and increasing number of derived activities
    6.4.7 Experiment G: Ontological reasoning with a realistic ontology and increasing ABox size
  6.5 Bibliographic notes

7 Prototype Applications
  7.1 A context-aware architecture for management and retrieval of extended points of interest
    7.1.1 An overview of the POIsmart system
    7.1.2 Architecture
    7.1.3 Classification and search
    7.1.4 The current system prototype
  7.2 An adaptive video streaming service
    7.2.1 The adaptive streaming system
    7.2.2 The interaction between CARE and the streamer
    7.2.3 Experimental setup
    7.2.4 Observed results
    7.2.5 Comparison with a commercial solution
  7.3 Integration with an adaptive transcoding proxy
    7.3.1 Architecture overview
    7.3.2 Service activation policies
    7.3.3 The GeoAware prototype service

8 Summary and Outlook
  8.1 Technical contribution
  8.2 Open problems
    8.2.1 Ontologies and ontological reasoning
    8.2.2 Privacy issues
    8.2.3 Optimization techniques and caching
    8.2.4 Scalability issues

Chapter 1

Introduction

In the last few years, the proliferation of mobile devices and wireless networks has radically changed the way people use computing devices. Handheld computers such as smart phones and personal digital assistants (PDAs) are becoming more and more important as a means for communicating, working, accessing information, and using intelligent services. A hot topic in the area of mobile computing is how to efficiently and effectively adapt services on the basis of context. Context is essentially any information that is useful for adapting an application to the user's needs and expectations. In mobile computing environments, adaptation is a fundamental feature, since the context of mobile users continuously changes depending on spatio-temporal conditions, activity, and the surrounding environment, to cite just a few parameters.

Adaptation and personalization of services for commercial applications have traditionally taken into account only a very restricted set of context data, such as data describing device characteristics, network status, and possibly user interests and simple preferences. These data are generally stored in central context managers, using proprietary representation formalisms. On the contrary, in the research community various proposals

have been made in the last few years for performing adaptation on the basis of a much wider set of context data, including dynamic user preferences, privacy policies, and complex context data such as those describing the socio-cultural context of the user. However, reasoning with complex context data poses serious performance issues.

Since mobile services may be accessed by a huge number of users at a time, efficiency and scalability are mandatory. As a consequence, various efficient contextual reasoning procedures have been proposed for specific applications like telecommunication services (e.g., [59]) and e-commerce (e.g., [43]). The adaptation of these classes of services is performed taking into account raw context data such as those describing the network, device capabilities, and categories of interest. This class of data can be naturally modeled by means of attribute/value pairs, adopting standard representation formalisms.

On the other hand, mobility calls for the use of a wider set of context data – provided by different sources – including complex data such as the user's current activity and her surrounding environment. A rather large consensus has been reached in the research community on the use of expressive languages to represent and reason with these data, and various frameworks have been recently proposed (e.g., [23, 47]) for applications requiring sophisticated adaptation. Since the formalism of choice is typically OWL-DL [58] or some of its variations, the reasoning tasks are known to have high complexity.

The goal of this thesis is the investigation of the above mentioned research issues. One of the main results of the thesis is the definition and development of a framework – called the Context Aggregation and REasoning (CARE) framework – for efficiently and effectively supporting context-aware adaptation of services in a mobile computing environment. The

main research areas involved in this work are those of knowledge representation, distributed systems, wireless networks, and software engineering. The CARE hybrid reasoning mechanism is based on a loose interaction between ontological reasoning and efficient reasoning in a restricted logic programming language. We have defined an efficient logic programming language for reasoning with raw context data, while we adopt OWL-DL as the language for representing and reasoning with complex context data. The CARE framework includes sophisticated mechanisms for resolving conflicts between context data provided by different sources, and between policies declared by different entities. Moreover, an optimized trigger-based mechanism has been adopted for supporting services that persist in time (called continuous services), like multimedia streaming. We have performed thorough theoretical and experimental analyses of the main novel algorithms adopted by CARE. The framework has been implemented, and used for the adaptation of various prototype services addressed to mobile users.

The dissertation is structured as follows. Chapter 2 provides a classification of context data, discusses how they can be acquired from different sources, and presents context-aware delivery platforms that have been proposed in the literature. Chapter 3 provides an introduction to the CARE middleware, outlining the main requirements that were considered, and summarizing the adopted solutions. Chapter 4 describes in detail the logic programming language we have defined, and the conflict resolution techniques. Chapter 5 describes the optimized trigger-based mechanism for supporting continuous services. Chapter 6 describes the hybrid reasoning approach of CARE. Chapter 7 illustrates various prototype services that have been developed for experimenting with CARE. Finally, Chapter 8 concludes the dissertation, summarizing the technical contributions of the thesis and discussing open problems.

Chapter 2

Context-Awareness

Context-awareness is emerging as one of the essential features for the next generation of Internet mobile services. It is a desirable feature for many application areas, including natural language understanding, electronic commerce, tele-medicine, and e-learning, to cite just a few. It is, however, particularly relevant for mobile and pervasive computing, since mobile devices naturally enable a much wider set of contexts characterized, among other things, by a spatio-temporal dimension, a wide range of device features and networking capabilities, and very different environmental situations.

In this chapter we provide a classification of context data, and we discuss how they can be acquired from different sources and formally represented. Finally, we describe different context-aware delivery platforms that have been recently proposed both by academic research groups and by industrial companies.

2.1 Introduction

We consider the context defining a mobile service request as described by a set of parameter values possibly belonging to different profiles. A profile is intended as a structured set of parameters describing an entity. The most common examples of profiles are user profiles and device profiles. The former usually contain data about user preferences, interests, and demographics, as well as behavioral models. The latter usually contain technical data describing device capabilities such as installed memory, screen resolution, computing power, available user interfaces, and installed software, as well as device status parameters like the battery level or the available memory.

Modeling the context of a mobile service request also includes considering profile parameters of the provisioning environment. These include the availability, type, and status of the network connection between the user and the service provider, the spatio-temporal condition of the user (e.g., time, location, speed, direction), and the user's environment (e.g., close-by resources, temperature, weather conditions), as well as the policies the service provider may enforce for the service request. The different kinds of profile data are discussed in detail in Section 2.3, providing for each one a survey of the existing approaches to represent, manage, and use these data.

However, the real challenge we are considering in this chapter is the acquisition from different sources of all the profile parameters defining the context of a service request, and their aggregation into a consistent uniform description. The distribution of profile sources imposes two main requirements: a) a common formalism and a shared vocabulary to be used by the different profile sources to represent the data, and b) a mechanism to deal with possibly conflicting parameter values provided by different sources. It is indeed possible that different sources have different values for the same

profile parameters; for example, the context provider may maintain its own user profiles, storing among other values the user's interests as deduced from previous service requests. On the other side, the user himself may provide a personal profile including his own interests, as explicitly defined by him or automatically derived by a software agent. Conflicts may exist even when considering more technical parameters like, for example, positioning data: different data may be provided by the network operator using triangulation and cell ID, and by the user using GPS or other client-side methods.

Other relevant aspects that need to be taken into account are related to the relationships between profile parameters and to the dynamics of profile data. The value of a parameter may well depend on the value of other parameters. For example, the preference of a user for receiving high quality multimedia content on his device may depend on the cost of the connectivity he is using at the time of the request and/or on the status of his device (e.g., battery level). Other user preferences may depend on the user's location or on the current user activity. The conditional setting of parameter values can be modeled by the introduction of simple user policies that should be evaluated at the time of the service request. Analogously, the service provider and possibly other profile sources may have their own sets of policies. A comprehensive solution for distributed profiling should also take care of possible conflicts both within a set of local policies and among policies from different sources.

Profile data can also be dynamic, in the sense that parameter values may change quickly, possibly during service provisioning. Typical examples are tracking applications, whose service is actually based on the update of positioning data. An example of a more advanced service is adaptive multimedia streaming: streaming may be initiated with a very high bitrate based on profile data acquired at the time of the request, but it should

progressively degrade if the profile parameters change during the streaming session, suggesting the use of a lower bitrate. Taking this aspect into account requires a mechanism to detect changes in relevant profile parameters at different remote sources, as well as a definition of how much a value should change to require the recomputation of the aggregated global profile.

2.2 Classification of context parameters

In this section we provide an analysis and classification of context data, on the basis of their semantics and of the complexity of the reasoning techniques adopted for their derivation.

2.2.1 A taxonomy of context data

Different researchers have proposed taxonomies for classifying context data on the basis of the type of data they represent (see, e.g., [97, 80]). Here, we provide a classification of context data that follows the taxonomy proposed by Dey and colleagues in [41]. They identify four primary context types that characterize the situation of an entity, namely: location, identity, time, and activity.

Location

Currently, a number of mobile computing applications provide services targeted to the user's location. Navigation systems, emergency services, mobile tourist guides, and proximity marketing are only a few examples of location-aware applications. To support such applications, many different location systems and technologies have been developed, with the aim of providing a user or a device with her/its physical location and with information regarding people and items located in the surroundings [56]. It is worth noting that

different techniques may represent the location of objects and people using different representation schemes. Generally, outdoor systems provide a physical position (e.g., coordinates), while indoor systems provide a symbolic position (e.g., in room R32, in the living room, etc.). The physical position of an object can be naturally expressed through a 2- or 3-dimensional coordinate system (latitude, longitude, and optionally altitude) in a given spatial reference model. For instance, NMEA 0183 is a standard defined by the National Marine Electronics Association and adopted by GPS device manufacturers for communicating physical location information; it is based on asynchronous sentences that provide the latitude and longitude (expressed as degrees, minutes, and seconds triplets), and other data (e.g., velocity). Unfortunately, representing symbolic positions is much more difficult. One possible solution consists in defining an ontology of symbolic locations, as done for example in [78]. This ontology defines classes such as country, city, street, building, floor, and room, using the OWL [77] language. An ontology-based representation of symbolic locations also allows some simple forms of reasoning. For instance, through the definition of the transitive property is-located-in, if RoomA is located in FloorB and FloorB is located in BuildingC, it is possible to infer that RoomA is located in BuildingC. Mapping between physical and symbolic locations is generally executed by an external spatial-aware application.

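As a simple illustration of this kind of inference (a toy sketch, not the actual mechanism of [78]), the following Python fragment computes the transitive closure of a set of is-located-in facts; a DL reasoner over the OWL ontology derives the same conclusions from the transitivity axiom:

# Toy sketch of transitive "is-located-in" reasoning; place names are
# illustrative. An OWL reasoner infers the same from property transitivity.
def located_in_closure(facts):
    """facts: set of (place, container) pairs; returns the transitive closure."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

facts = {("RoomA", "FloorB"), ("FloorB", "BuildingC")}
print(("RoomA", "BuildingC") in located_in_closure(facts))  # True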
Identity

An important category of context data is the one containing information about the user's personal data and identity. These data are used by service providers for customizing applications on the basis of various features like gender, title, and age of users.

For this class of information, standard taxonomies already exist. As an example, the Liberty Alliance ID-Personal Profile [73] represents user personal data fulfilling the requirements of most Internet services. These data are classified in various categories. As an example, the category AnalyzedName includes data for the first name, second name, and title. The category AddressCard includes data for the user's address, email, and other contacts. Categories can also include other subcategories. For instance, the LegalIdentity category contains – in addition to data regarding the fiscal identification number, marital status, and date of birth – the subcategory AnalyzedName. ID-Personal Profile also includes personal data describing the identity of employers, in order to support workgroup applications. For instance, the EmploymentIdentity category contains data that represent the user's job title and her organization. Other context data modeled by ID-Personal Profile are addressed to medical emergency situations. As an example, the EmergencyContact data contain the next of kin or other person to contact if the user has a medical emergency.

Time

Reasoning with time is fundamental for dealing with complex context data such as those describing user activities, tasks, and the socio-cultural environment. Research on time has a long history, and huge efforts have been made to define appropriate formalisms for representing temporal information and reasoning with it. Here we just mention some recent proposals that make use of ontologies, which could be nicely integrated into frameworks for context-awareness.

OWL-Time [62] is an ontology of temporal concepts, whose main goal is to support time reasoning for the Semantic Web (e.g., describing the temporal characteristics of Web resources, and temporal properties of Web services). This ontology is based on the model of temporal representation called Interval Temporal Logic, and provides a vocabulary for expressing facts about relations among instants and intervals, durations, and datetime information.

A further ontology for representing time was proposed in [88]. In this ontology, both time points and time intervals are treated as primitive elements on a time line, and the class hierarchy, relations, axioms, and instances are defined on the basis of those primitives. The ontology allows the use of granularities, and reasoning procedures can be performed in order to switch from finer to coarser time granules.

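As a toy illustration of switching between granularities (an assumption made for presentation purposes, not the mechanism of [88], which also handles non-aligned and gapped granularities), time instants can be mapped to coarser granules as follows:

# Hypothetical sketch: mapping instants from a fine granularity to a
# coarser one (hours, days). Real granularity systems are far richer.
from datetime import datetime

def to_granule(instant: datetime, granularity: str) -> str:
    if granularity == "day":
        return instant.strftime("%Y-%m-%d")
    if granularity == "hour":
        return instant.strftime("%Y-%m-%d %H:00")
    raise ValueError("unsupported granularity: " + granularity)

t = datetime(2006, 5, 12, 17, 42)
print(to_granule(t, "day"))   # 2006-05-12
print(to_granule(t, "hour"))  # 2006-05-12 17:00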
Activity

Taking into account the activities currently performed by users is of primary importance for effectively performing adaptation. Some classes of activity (e.g., physical activities) can be derived by analyzing data retrieved from body-worn sensors and environmental sensors. As an example, movement activities can be derived by analyzing data gathered from accelerometers and heart-beat sensors. More complex activities are generally modeled by means of ontologies. As an example, the CONON [47] OWL-DL [58] ontology defines various activities such as showering, cooking, sleeping, having dinner, and watching TV. A similar proposal has been presented in [22].

Including other classes of data

We must point out that none of the proposed classifications addresses all of the aspects characterizing context. For the sake of this dissertation, we integrate into the above described classification a taxonomy that is sufficient for including all of the context data considered in the following chapters. The taxonomy is summarized in Table 2.1.

Primary dimension   Secondary dimensions            Sources
User                Identity                        U
                    Physiological                   S
                    Emotional                       S
                    Interests                       U, O
                    Preferences                     U, O
                    Social                          U, S, O
                    Organizational                  U, O
                    Activity                        U, S, O
Environment         Weather                         S
                    Temperature                     S
                    Humidity                        S
                    Light                           S
                    Noise                           S
                    Air pollution                   S
                    Surrounding persons / devices   S, O
Location            Physical                        S
                    Symbolic                        S, O
Time                Absolute                        D, S
                    Granularity-based               D, S, O
Device              Capabilities                    D
                    Status                          D
                    Connectivity                    D, O

Table 2.1: A taxonomy of context data. Possible sources of context data are users (U), sensors (S), devices (D), and other (O) sources like, e.g., personal agents and logs.

2.2.2 Complexity of reasoning

Context data can also be classified on the basis of the formalisms that must be adopted for representing them, and of the reasoning tasks that must be executed for deriving their values.

We point out that a strict classification of context data on the basis of these features is impossible, since the same data can generally be derived – with different levels of accuracy – by adopting different inference methods, or can be explicitly provided by the user. As an example, consider the context data representing the current user activity. Various methods exist that derive these data by exploiting complex reasoning techniques such as Bayesian networks, Hidden Markov Models, or description logics. However, these data can also be provided explicitly by the user, who declares her current activity herself.

Raw context data

We denominate raw context data those data that can be directly acquired from sensors, or that are explicitly provided by the user or by other entities. Raw data can be provided by:

• environmental sensors (e.g., temperature, light intensity, noise level, and GPS coordinates);

• computing devices (e.g., battery level, display size, available memory, operating system);

• body-worn sensors (e.g., blood pressure, heart and respiration rate).

Explicit data can be provided by:

• users (e.g., name, interests, credit card number, and other personal data);

• network operators (e.g., available bandwidth, latency, bearer);

• service providers (e.g., user interests derived by analyzing the user's behavior).

Composed context data

We denominate composed context data those data that cannot be directly gathered from sensors, but that can be derived by the composition of raw context data. The reasoning task required for deriving such data is extremely simple, since it consists just in the evaluation of a series of conditions, possibly combined through operators such as and, or, and not. As an example, the situation stuffy weather can be defined as a series of conditions like:

(temperature > 28°C and humidity > 70%) or (temperature > 32°C).

Examples of composed context data for the healthcare domain include those describing the emotional status of people. For instance, a status of anxiety can be described as a combination of conditions on raw context data such as heart rate and skin conductance.
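The evaluation of composed context data can thus be implemented as a simple boolean predicate over raw values; the following minimal sketch uses the illustrative thresholds given above (which are not validated meteorological values):

# Sketch: deriving the composed datum "stuffy weather" from raw context data.
def stuffy_weather(temperature_c: float, humidity_pct: float) -> bool:
    return (temperature_c > 28 and humidity_pct > 70) or temperature_c > 32

print(stuffy_weather(29, 75))  # True
print(stuffy_weather(29, 50))  # False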

Complex context data

We denominate complex context data those data that can be inferred by means of complex reasoning tasks on the basis of raw data, composed data, and other complex context data. These data typically include information regarding the socio-cultural environment of users, complex user preferences regarding the adaptation of services, and physical activities. Reasoning techniques for deriving complex context data include logic programming, probabilistic reasoning, and machine learning.

2.3 Current profiling approaches

This section separately considers profile data describing device features, profile data describing users, and profile data describing the provisioning environment. For each of these profile categories we illustrate the main approaches to profiling, considering both existing commercial systems and consolidated research work. The last subsection briefly illustrates existing delivery platforms based on profiling.

2.3.1 Profile representation of devices

A precise definition of the characteristics and capabilities of the device used for accessing an Internet service is essential for performing adaptation. In particular, profile data include information regarding both software (e.g., the browser name and version, Java support, etc.) and hardware (e.g., CPU and network interfaces). While software capabilities remain constant during service provision, data regarding certain hardware parameters can change (e.g., the remaining battery lifetime, or the available memory).

HTTP headers

In recent years, the diffusion of mobile devices with limited capabilities has spurred the definition of markup languages (e.g., WML, cHTML) targeted to different classes of terminals. The simplest technique (and, actually, still the one most adopted by service providers) for choosing the most appropriate markup language and for adapting web contents to the requesting device consists in identifying the device by means of the HTTP request headers. It is worth noting that this technique is applicable only to HTTP-based services. The Hypertext Transfer Protocol – HTTP/1.1 specification defines the syntax and semantics of all standard HTTP/1.1 header fields. Unfortunately, the information conveyed by HTTP/1.1 headers that can be useful for representing device capabilities is quite limited, including only the user agent (i.e., the browser) and the media types (MIME types), charsets, and encodings accepted by the user agent (e.g., the supported markup languages). Hence, this information only allows the service provider to determine how to mark up the content. The User-Agent Display Attributes Headers Internet-Draft has been widely adopted by browser developers to extend the set of information provided by HTTP headers with data regarding display characteristics, such as screen size and resolution, and color capabilities. Moreover, some browsers include in the HTTP request other undocumented header fields representing information such as the operating system, CPU, and voice capabilities. As an example, Table 2.2 shows part of the HTTP request headers provided by the Internet Explorer browser of a Windows XP, a PocketPC, and a SmartPhone device.

              Windows XP               PocketPC                  SmartPhone
User-Agent    Mozilla/4.0              Mozilla/4.0               Mozilla/4.0
              (compatible; MSIE 6.0)   (compatible; MSIE 4.01)   (compatible; MSIE 4.01)
Accept        application/msword, ...  image/gif, ...            image/gif, ...
UA-CPU        i486                     ARM                       OMAP 710
UA-OS         -                        WinCE (Pocket PC)         WinCE (SmartPhone)
UA-pixels     -                        240x320                   176x220

Table 2.2: An excerpt of the HTTP request headers provided by the Internet Explorer browser on different devices
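A minimal sketch of how a service provider can exploit such headers for markup selection follows (header names as in Table 2.2; the decision thresholds and the fallback logic are illustrative assumptions, not a documented algorithm):

# Sketch: choosing a markup language from HTTP request headers.
def choose_markup(headers: dict) -> str:
    if "text/vnd.wap.wml" in headers.get("Accept", ""):
        return "WML"
    pixels = headers.get("UA-pixels")
    if pixels:  # e.g., "240x320"
        width = int(pixels.lower().split("x")[0])
        if width < 320:
            return "XHTML Basic"
    return "XHTML"

print(choose_markup({"Accept": "image/gif", "UA-pixels": "240x320"}))  # XHTML Basic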

Obviously, the header approach has a number of shortcomings. First of all, the provided information is limited to the static characteristics of the device, while a mobile computing scenario calls for knowledge of the current status of the device, such as the available memory and the battery charge status. Moreover, since part of the headers are not well documented, a sort of reverse engineering is necessary in order to understand their meaning. As a matter of fact, different browsers provide different HTTP headers for the same device (e.g., Internet Explorer and Opera provide different header fields for the same SmartPhone, as shown in Table 2.2).

CC/PP and UAProf

In order to overcome the limitations of the HTTP headers approach, the W3C defined Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies [66]. CC/PP uses the XML serialization of Resource Description Framework (RDF) graphs to create profiles that describe the capabilities of the device and, possibly, the preferences of the user. CC/PP profiles are structured as sets of components that contain various attributes with associated values. Components and attributes are defined in CC/PP vocabularies, i.e., RDF Schemas that formally define their semantics and allowed values. Data type support in CC/PP is quite limited; in fact, attribute values can be either simple (string, integer, or rational number) or complex (a set or a sequence of values, represented as rdf:Bag and rdf:Seq respectively).

Currently, CC/PP is mostly used for representing device capabilities and network characteristics. UAProf [83] is the most renowned CC/PP-compliant vocabulary. It has been proposed by the Open Mobile Alliance for representing hardware, software, and network capabilities of mobile devices. Some components defined within UAProf have been extended with new attributes by Intel [18]. In particular, UAProf defines seven components:

• HardwarePlatform provides a detailed description of the hardware capabilities of the terminal, including input/output capabilities, CPU, memory, battery status, and available expansion slots.

• SoftwarePlatform describes the device operating system, its Java support, the supported video and audio encoders, and the user's preferred language.

• BrowserUA describes the browser features in detail, providing not only the browser name and version, but also information regarding its support for applets, JavaScript, VoiceXML, text-to-speech and speech recognition capabilities, as well as the user's preference regarding frames.

[The RDF/XML markup of this figure was lost in extraction. The recoverable attribute documentation reads:

Description: Number of soft keys available on the device. Type: Number. Resolution: Locked. Examples: "3", "2"

Description: The size of the device's screen in units of pixels, composed of the screen width and the screen height. Type: Dimension. Resolution: Locked. Examples: "640x480"]

Figure 2.1: An excerpt of the UAProf definition of the HardwarePlatform component

• NetworkCharacteristics provides information about the network capabilities and environment, such as the supported Bluetooth version, the support of security protocols, and the current bearer signal strength and bitrate.

• WAPCharacteristics contains a set of attributes regarding the device's WAP capabilities, including the supported WAP, WML, and WMLScript versions, and the WML deck size.

• PushCharacteristics and MMSCharacteristics provide information regarding the device's WAP push capabilities and MMS support, respectively.

A small excerpt of the UAProf definition of the HardwarePlatform component can be seen in Figure 2.1. Currently, many hardware vendors make the UAProf profiles of their devices publicly available on their web sites. Several examples can be found at http://w3development.de/rdf/uaprof_repository/. At the time of writing, the lists of UAProf descriptions provided by important vendors such as Sony Ericsson and BlackBerry are kept up-to-date with the new models, suggesting that this technology is considered interesting by hardware vendors. Figure 2.2 shows an excerpt of the UAProf profile of a mobile phone, as published on its supplier's web site.

[The RDF/XML markup of this figure was lost in extraction. The recoverable attribute values describe a 128x128 pixel screen, model 6820, the supported character sets ISO-8859-1, US-ASCII, UTF-8, and ISO-10646-UCS-2, an 18x5 character display, and a Qwerty keyboard.]

Figure 2.2: An excerpt of the UAProf profile of a mobile phone

The CC/PP and UAProf specifications also define the protocol for communicating profiles to the service provider. These protocols are based on profile defaults and profile diffs, as described in [82] and [83]. The client should send HTTP requests containing a reference to the device default profile, and attribute/value pairs that describe the variations from the default profile (e.g., the insertion of a new memory card, or volatile information such as the currently available memory). Possibly, the CC/PP profile can be updated with further information by firewalls and proxies encountered by the HTTP request.
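The resolution rules defined by UAProf (Locked, Override, and Append) determine how conflicting values are combined when a profile is resolved from its default and diffs. The following sketch illustrates the idea; the data structures and the attribute-to-rule mapping are simplified assumptions, not the actual RDF processing:

# Sketch: resolving a UAProf profile from a default profile and an ordered
# list of profile diffs. Attribute names and values are illustrative.
RESOLUTION = {"ScreenSize": "Locked", "Memory": "Override", "CcppAccept": "Append"}

def resolve(default: dict, diffs: list) -> dict:
    profile = dict(default)
    for diff in diffs:
        for attr, value in diff.items():
            rule = RESOLUTION.get(attr, "Override")
            if rule == "Locked" and attr in profile:
                continue  # a locked attribute keeps its first value
            if rule == "Append" and attr in profile:
                profile[attr] = profile[attr] + [value]
            else:
                profile[attr] = value  # Override: the last diff wins
    return profile

default = {"ScreenSize": "128x128", "Memory": "16MB", "CcppAccept": ["text/html"]}
print(resolve(default, [{"Memory": "12MB"}, {"CcppAccept": "image/gif"}]))
# {'ScreenSize': '128x128', 'Memory': '12MB', 'CcppAccept': ['text/html', 'image/gif']}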

2.3.2 User profiling

Within the user modeling literature, a user model is intended as the system representation of the characteristics of a user, including, for example, knowledge and beliefs, skills and expertise, interests and preferences. Our notion of a user profile, as a structured set of parameters, can be considered in all respects a user model, since it represents many relevant user characteristics. However, it is only the whole context profile (composed of different integrated profiles coming from different sources) which contains, among other things, the complete user model.

Research prototypes and commercial systems exploiting user profiles can be categorized taking into account different relevant aspects. With respect to our needs, we consider as primary dimensions: (i) the adopted method for modeling users; and (ii) the richness and generality of the user data modeled. We also consider, as secondary dimensions: (iii) the kind of user data acquisition (e.g., explicit or derived data collection); and (iv) the type of user adaptation (e.g., content or presentation adaptation). In the following we report on a few academic research efforts and commercial systems, selected because they provide seminal solutions or are well-established pillars in one or more of the above dimensions.

First of all, early research relied on a simple user model expressed in the form of records of command usage or data access; the user adaptation was directly connected to the frequency of such usage. There was no attempt to infer or represent any other information about the user. The user model was embedded into the application, and it was not possible to distinguish specific user modeling components from the other application modules.

GUMS [34] is the forerunner of later 'user modeling shell systems', i.e., systems providing user modeling services at runtime that can be configured at development time. In general, user modeling shell systems (for more details see [67]) have quite sophisticated approaches for modeling users and include rich categories of user data (i.e., dimensions (i) and (ii)). They support both explicit user data acquisition and the derivation of implicit user characteristics; handling of contradictions (also named truth maintenance) is included too. For instance, GUMS allowed the definition of simple stereotype hierarchies; for each stereotype, a set of Prolog facts describes stereotype members and a set of rules defines the system's reasoning. The final application can also communicate new facts about the user at run-time. BGP-MS [68], instead, provides two integrated formalisms for representing users' beliefs and goals. Assumptions about the user and stereotypical assumptions about user groups can be represented in first-order predicate logic. A subset of these assumptions is stored in a terminological logic. Inferences are defined in a first-order modal logic.

The SETA prototype [6], a toolkit for the construction of adaptive Web stores, includes state-of-the-art solutions along many dimensions. Regarding dimension (iv), it integrates the personalized suggestion of items (content adaptation) with the adaptation of the layout based on user preferences and expertise. This is made possible mainly by the richness of the user model (dimension (ii)) and by a specific representation technique (dimension (i)). The SETA user model contains four main types of data:

1. explicitly provided personal data, such as age, gender, job, and education level;

2. domain-independent user features, e.g., the user's receptivity, expertise, and interests;

3. domain-dependent preferences regarding product properties, used by SETA to select the items most suited to the user;

4. information relative to the classification of the user into the stereotypical customer classes.

User data acquisition (iii) is both explicit and dynamically computed, taking into account the user's behavior during the current session. With respect to dimensions (i) and (iii), SETA integrates KR-based user modeling techniques with machine learning mechanisms. In particular, while personal data are simple attribute/value pairs, the other profile attributes have a more sophisticated representation. For instance, each user-feature attribute is associated with a probability distribution over its possible values (e.g., expertise about phones: low = 0.1; medium = 0.2; high = 0.7).

Recommender systems, like GroupLens [92], should also be mentioned, since they are heavily based on user profiling. In these systems, the affinity between users is evaluated considering explicit ratings of items provided by users, implicit ratings derived from navigational behavior, and transaction history data. A comprehensive comparison of products for e-commerce and CRM (Customer Relationship Management) can be found in [35].

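A user-feature attribute of this kind can be represented as a distribution over its possible values; the following minimal sketch (our illustration, not SETA's actual data structures) uses the example values given above:

# Sketch: a user feature represented as a probability distribution.
expertise_about_phones = {"low": 0.1, "medium": 0.2, "high": 0.7}

def most_likely(feature: dict) -> str:
    assert abs(sum(feature.values()) - 1.0) < 1e-9  # sanity check: a distribution
    return max(feature, key=feature.get)

print(most_likely(expertise_about_phones))  # high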
2.3.3 Profiling provisioning environments

As outlined before, the set of context data useful for performing better adaptation goes beyond device capabilities and user information. In this section we present some profiling methods for gathering information regarding the network status, the position of the user and of the people and objects in her surroundings, and the user's environment.

Bandwidth estimation techniques

An estimate of the data rate that can be transmitted over the network link connecting the service provider to the user is important in order to determine the adaptation parameters of a wide spectrum of Internet services. As a consequence, a number of techniques have been proposed in the last years to estimate the available bandwidth. A survey regarding metrics, techniques, and tools can be found in [85].

Application-level approaches try to estimate the quantities of interest (especially available bandwidth) at the communication endpoints by observing packet dispersion [69], either in probing traffic or in existing transmissions. As an example, some techniques estimate the end-to-end available bandwidth by means of streams of probing packets that the source (server) sends to the receiver (client). Similar application-level approaches require strict cooperation between the sender and the receiver, since the receiver has to give explicit feedback. Other techniques require the receiver to perform the estimation itself, in order to improve the system's scalability. Application-level approaches have a number of known weaknesses. In particular, even in wired networks, the main weaknesses reside in the estimation accuracy itself and in convergence times. The application of these techniques in a mobile computing scenario poses new issues, mainly due to the required cooperation of clients having low power and limited network capabilities. As a matter of fact, client-side cooperation entails power consumption and loss of bandwidth (which in many mobile network technologies is a very valuable – and costly – resource). Moreover, due to their particular characteristics, obtaining a reliable end-to-end measurement in some mobile networks (e.g., WiFi networks) is questionable.

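To make the packet-dispersion idea concrete: in the classic packet-pair technique, two back-to-back probe packets of size L are sent along the path, and the capacity C of the narrowest link is estimated from their dispersion Δ (the spacing between the two packets) measured at the receiver, as C = L/Δ. Available-bandwidth estimators refine this idea with trains of probes and statistical filtering, which is precisely where the accuracy and convergence problems mentioned above arise.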
To overcome these weaknesses, various network-level approaches have been devised. Network-level techniques exploit explicit network feedback to monitor the available resources, as described for instance in [65]. The main disadvantage of these techniques is that, in order to operate, the network nodes need to provide specific support for each given architecture and technology; clearly, this limits the scalability and ease of deployment of such techniques. Moreover, these techniques are unsuitable in end-to-end networks like UMTS and GPRS.

For a more in-depth discussion of bandwidth monitoring issues in the context of mobile service adaptation we refer the interested reader to [75].

32 Location

As outlined in Section 2.2.1, outdoor systems generally provide a physical position (e.g., coordinates), while indoor systems provide a symbolic position. Localization techniques differ in many aspects, such as the accuracy of the provided position, the physical medium exploited for determining location, and power and infrastructure requirements. For ease of presentation, we divide these technologies into outdoor and indoor positioning systems.

Outdoor positioning systems Probably the most renowned outdoor positioning technology is the Global Positioning System (GPS). GPS is a worldwide positioning infrastructure formed by 24 satellites, together with ground stations in charge of maintaining the precise position of the satellites. Satellites transmit signals encoded with timing information obtained from an atomic clock. The signals are used by GPS receivers to calculate their position by means of trilateration. Basically, in order to determine its position, a GPS receiver uses an estimate of its distance from 4 or more satellites, obtained by analyzing the travel time of the radio signals. Given the particular nature of these signals, the GPS technology is generally unavailable indoors. From the user's perspective, GPS receivers are small, relatively economical devices integrated into vehicles and mobile devices, or easily connectable to mobile devices through a wireless link (usually a Bluetooth connection). The communication between GPS receivers and mobile devices is based on the NMEA standard. GPS accuracy can vary depending on a number of factors, including the particular receiver, electronic interference, atmospheric effects, and the presence of tall buildings or other surfaces that reflect signals before they reach the receiver. Currently, low-cost GPS receivers have an accuracy of 10 meters or less. Even if GPS accuracy is not a problem for a number of location-based services, various modifications of the basic GPS technology have been devised for improving accuracy (e.g., DGPS and AGPS [9]).

The Differential GPS (DGPS) technique is based on the correction of timing information by local reference stations. Reference stations are in charge of measuring timing errors, and of broadcasting correction information by means of radio beacons. Receivers able to detect beacons and apply corrections can then compensate for the timing error in their trilateration calculation, improving accuracy to a couple of meters in ideal conditions.

Assisted GPS (AGPS) adds to the basic GPS infrastructure stationary servers (called assistance servers) that are in charge of performing part of the decoding and computation, and of sending information to the receiver through a wireless network. Thanks to assistance data, an AGPS receiver is able to immediately locate the visible satellites, strongly reducing the so-called time-to-first-fix. Assistance is also provided to the AGPS receiver by sending decoded data for each satellite, so that the receiver does not have to decode the entire signal. AGPS also increases the sensitivity of the receiver, which is able to obtain and demodulate signals too weak to be treated by unassisted GPS receivers. Moreover, by offloading part of the computation to assistance servers, receivers can lower their power consumption.

One method that offers lower accuracy with respect to GPS, but is available both outdoors and indoors, is called Cell ID. Cell ID exploits the GSM base station the user is connected to for approximating the user's position. As a consequence, accuracy depends on the cell size, and varies from hundreds of meters in densely populated urban areas to tens of kilometers in rural areas. The main advantage of this technique – currently used by many operators – is that it can be employed with no modifications to the network infrastructure, and it does not require new functionalities to be added to mobile devices. However, the localization accuracy provided by this technique is inadequate for a number of location-based services. Hence, various improvements have been proposed for increasing accuracy [2]. The Enhanced Cell ID method takes into account the delay of the radio signal sent from the mobile phone to the cell tower to derive the approximate distance of the device. Other methods are based on trilateration. In Enhanced Observed Time Difference (E-OTD), the user's location is calculated by a software module on the mobile phone that analyzes the delay of synchronization messages transmitted by base stations. This method can provide good accuracy, but requires the installation of additional hardware and software modules on base stations and mobile phones. A similar solution is provided by the Time Difference of Arrival (TDOA) technique, with the difference that processing is executed by the network infrastructure.

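For concreteness, the following sketch shows the least-squares trilateration computation that such techniques rely on (a standard linearization; the anchor coordinates and distance estimates are made up):

# Sketch: 2D trilateration from known anchor positions and estimated
# distances, linearized by subtracting the first distance equation.
import numpy as np

def trilaterate(anchors, dists):
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(dists, dtype=float)
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1) - np.sum(anchors[0] ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position

print(trilaterate([(0, 0), (10, 0), (0, 10)], [7.07, 7.07, 7.07]))  # ~[5. 5.]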
Indoor positioning systems One of the first indoor positioning infrastructures, called Active Badge [104], was developed between 1989 and 1992 at Olivetti Research Labs. The Active Badge proximity system is based on infrared transmitters carried by people, and on receivers located in buildings that are in charge of determining symbolic locations (typically, the room people are in). More recently, similar techniques have been proposed which adopt ultrasound instead of infrared beacons, and determine the location of users and objects by means of triangulation, thus obtaining greater accuracy. In the Active Bat system [51], the user's location is calculated by a centralized module that collects and analyzes data retrieved from sensors. On the contrary, in the Cricket system [87] emitters are spread in the environment, and user-side devices are in charge of receiving beacons and performing triangulation, thus protecting the user's privacy.

RFID (RF Identification) systems are composed of a set of readers that can read data through electromagnetic transmission from RFID tags. RFID tags can be either active or passive. Active tags have radio capabilities, and have ranges of up to hundreds of meters. On the contrary, passive tags only reflect signals received from readers; thus, their communication range is smaller. However, passive tags are considerably less expensive than active ones. The advantage of RFID systems is that they are easily deployable and tags are relatively inexpensive.

Other approaches try to exploit general-purpose wireless networks for implementing location systems. Various techniques (e.g., RADAR [8]) propose solutions based on WiFi networking for tracking users inside buildings. The user's position is determined by analyzing signal strengths at multiple overlapping base stations that cover a certain area. Even though these techniques have the advantage of being implementable on top of a widespread wireless network infrastructure, the accuracy they provide is not fully satisfactory. As an example, the RADAR system is able to determine the location of users to within 3 meters of their true position with 0.5 probability. Similar considerations hold for positioning systems deployed on top of Bluetooth network infrastructures.

Various commercial systems (generally called location servers; e.g., Microsoft MapPoint Location Server1, Geodan Movida Location Server2, SiRFLoc Server3) offer the opportunity to integrate location information collected by different means (e.g., GPS, Cell ID). These systems give the application provider the possibility to access location information in a uniform way, independently of the specific technique used to derive it.

Environment conditions

Various interesting works exist which focus on gathering information about the user's surrounding environment by means of sensors.

1 http://www.microsoft.com/mappoint/products/locationserver/default.mspx
2 http://www.geodan.nl/uk/product/mobileproducts/Movida_location_server.htm
3 http://www.sirf.com/products-sirflocserver.html

The AmbieSense4 project is based on the use of context tags. Context tags are small electronic devices that can be spread all over the environment (for example within shops, hotels, furniture, and even clothes). They automatically send contextual information about the surrounding environment to mobile users. Interestingly, context tags can be attached to users too. In this case, they provide context data regarding the user they are attached to. These user context data include socio-cultural data such as the user's interests, his status, and other spatio-temporal aspects. This information is of paramount importance, since each user belongs to the socio-cultural environment of the other users that interact with him.

On the contrary, other projects (e.g., TEA [40]) are focused on the integration of simple and cheap sensors that measure raw data such as presence, temperature, sound, and light level, in order to derive more complex, implicit context conditions (e.g., the action performed by the user). An application of similar techniques is the analysis of human eye-blinking and other factors in order to determine the fatigue state of car drivers [15].

2.4 Profile-based delivery platforms

A number of delivery platforms that take into account users' profile data have been developed by both academic and industrial groups. In this section we describe a set of requirements we have identified, and briefly present proposed research and commercial solutions.

2.4.1 Requirements

In the following, we present a set of requirements, which were identified considering the data required for implementing highly adaptive services, the

4 http://www.ambiesense.com/

infrastructure that is available now and that will become available in the near future, as well as the issues of data privacy and accessibility. The main requirements we identified are:

1. Interoperable context representation: A representation formalism is needed for the specification of a very broad set of profile data, integrating device capabilities with the spatio-temporal context, the device and network status, and user preferences, as well as semantically rich context data. Since these data must be exchanged among various entities, the use of a standard language, a shared vocabulary, and an unambiguous semantics is highly advisable.

2. Support for context dynamics: It must be possible for multiple entities (e.g., users, providers, agents) to define how changes in some context data reflect on other context data; for this reason, a representation formalism is needed for the specification of policies, which can dynamically determine the value of some profile data based on other values. Moreover, changes in context data must be asynchronously communicated to the interested entities. Therefore, the architecture should provide a configurable mechanism for "intra-session" adaptation based on the real-time update of certain profile data (e.g., location).

3. Support for distributed context data: Context data are naturally provided by different sources, in some cases delivering conflicting information. The architecture should support the distributed storage and management of profiles and policies, with information stored and managed close to its source.

4. Profile and policy aggregation: The architecture should provide a mechanism to aggregate profile data and policies from different sources, supporting a flexible and fine-grained conflict resolution mechanism (a toy illustration is given after this list);

5. Privacy protection: The architecture should rely on an advanced system for privacy protection, which allows the user to precisely control the partial sharing of his profile data;

6. Efficiency: The time needed for adaptation should not significantly affect the user-perceived response time.
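As a toy illustration of requirement 4 (this is not CARE's actual mechanism, which is the subject of Chapter 4), partial profiles coming from several sources can be merged by assigning a priority to each source and resolving conflicting attribute values in priority order:

# Toy sketch of multi-source profile aggregation; source names,
# priorities, and attributes are illustrative assumptions.
PRIORITY = {"user_device": 3, "network_operator": 2, "service_provider": 1}

def aggregate(partial_profiles: dict) -> dict:
    """partial_profiles maps a source name to {attribute: value}."""
    merged = {}
    # Visit sources from lowest to highest priority, so higher-priority
    # sources overwrite conflicting values from lower-priority ones.
    for source in sorted(partial_profiles, key=PRIORITY.get):
        merged.update(partial_profiles[source])
    return merged

print(aggregate({
    "network_operator": {"location": "cell-4521", "bandwidth_kbps": 128},
    "user_device": {"location": "45.4642,9.1900"},  # client-side GPS fix wins
    "service_provider": {"interests": ["travel"]},
}))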

2.4.2 CC/PP-based architectures

Even if CC/PP and UAProf provide a satisfactory solution to the issue of representing both static and dynamic properties of mobile devices, the adoption of these technologies is currently not widespread. The key requirement for implementing the CC/PP approach is to enable browsers to recognize the current device and to communicate – through HTTP headers – its UAProf profile to the service provider (either building the UAProf profile from scratch, or pointing to the profile stored on the vendor's web site). Moreover, in order to keep the parameters regarding the current status of the device up to date, the profile should be updated client-side by a proper monitoring application. DELI and the Intel CC/PP SDK provide two experimental platforms supporting the CC/PP technology.

DELI [20] is an open-source Java library developed by HP Labs that allows the resolution of HTTP requests containing references to the CC/PP profile of the client device. DELI adopts the profile integration approach of UAProf, which consists in associating a resolution rule with every attribute. Whenever a conflict arises (i.e., when the default profile and the profile diffs provide different values for the same attribute), the resolution rule determines the value to be assigned to the attribute by considering the order of evaluation of the partial profiles. DELI is fully integrated with Cocoon, the well-known XML-based application server.

The Intel CC/PP SDK [18] proposes an architecture composed of client- and server-side modules for the management of UAProf profiles. The client-side modules run on Microsoft Pocket PC 2002 devices. The CC/PP profile of the device is kept up to date by a monitoring module that is in charge of retrieving static as well as dynamic information about the device status and capabilities. The communication of the CC/PP profile to server applications is obtained by means of the CC/PP client proxy, which intercepts HTTP requests (e.g., those originated by the micro-browser of the device) and inserts profile information into the HTTP headers. Server-side, the main component of this architecture is the CC/PP Content Customization Module, a module of the Apache web server that is in charge of retrieving partial profiles by analyzing the HTTP request headers, and of combining them in order to obtain the merged profile. This profile is then used by the application logic for adapting the content and its presentation. The CC/PP SDK framework provides three different mechanisms for personalization. Content Selection consists in building different representations of the same content, choosing the most appropriate one on the basis of profile data. Stylesheet Conversion adapts XML-based content using different XSL stylesheets for the various classes of profiles. Finally, Script Processing uses a scripting language to dynamically build an interface suited to the profile.

2.4.3 Commercial application servers

Today, most commercial application servers provide personalization and content adaptation solutions that take into account at least the characteristics of the user's device. As an example, the personalization scheme of the IBM WebSphere Portal is based on the creation of web pages and services using XDIME, a proprietary XML-based device-independent markup language. Depending on the specific device, XDIME contents are transformed by proper predefined XSLT stylesheets into the most appropriate format (e.g., WML, XHTML Basic, etc.), evaluating policies that take into account the capabilities of the particular device that issued the request. The framework also includes a repository of profiles describing the capabilities of a broad range of terminals. Similar solutions are provided by other well-known application servers like BEA WebLogic and OracleAS Wireless.

2.4.4 Alternative middleware proposals

In the last few years, many research groups and companies have been work- ing to define and implement middlewares for supporting service adaptation and personalization in a multi-device and mobile environment. In the follow- ing, we will report on the efforts we consider closest to our work, and evaluate them on the basis of the set of requirements presented in Section 2.4.1. The Houdini middleware [59], developed at Bell Labs, has the main goal of efficiently enabling context-aware mobile applications while preserving the user’s privacy. In Houdini, sharing of context data is controlled by policies declared by the user. The key component of the architecture is a mod- ule that evaluates the requests of profile data issued by service providers against the privacy policies declared by the user. The Houdini policy lan- guage, together with the conflict resolution strategy of its inference engine, is similar to ours. However, a policy mechanism which is primarily focused on adaptation should handle the resolution of conflicts determined by policies declared by different entities (e.g., the user and the service provider). The presence of multi-entity policies can also also determine the issue of cycles in the logic program, which cannot be detected in advance with the service request; hence, a mechanism for cycle detection and resolution is also ad- visable. From an architectural point of view, in Houdini profile information is divided into two categories: “static” data (e.g., personal data, buddies

list, address book, calendar) and "dynamic" data (e.g., preferences, location, device status). These classes of data, provided by various sources, are handled by two distinct modules which are responsible for their integration on a per-request basis.

CARMEN [10] is a middleware for supporting context-aware mobile computing. In CARMEN, service access is mediated by context-aware mobile proxies. These intermediate proxies execute directives obtained from Ponder [29] policies, which manage migration, binding, and access control. The Ponder language turns out to be a good choice for the class of policies used in this middleware; however, in order to support rule-based reasoning with context data, conflict resolution mechanisms more sophisticated than the ones provided by Ponder are needed. Moreover, like other languages adopting the ECA (event-condition-action) paradigm, Ponder does not naturally support rule chaining, since the domain of actions is generally disjoint from that of conditions. On the contrary, we believe that rule chaining is a necessary feature for the kind of applications addressed by our framework, since rules can be declared for inferring higher-level context data starting from simpler ones. Moreover, rule chaining is essential for enabling the composition of policies declared by multiple entities.

A further interesting architecture for supporting context-aware systems in mobile environments is CoBrA [22]. Context-awareness in CoBrA is based on a formal model of context – represented by an OWL ontology [23] – that is shared by all the system components. The Context Broker module is in charge of gathering context data from sensors spread through the environment. The Broker inference engine performs ontology-based reasoning in order to derive new context data from raw data, and to detect and resolve inconsistencies in profile data. The privacy enforcing mechanism of CoBrA is based on ontologies as well; in particular, privacy policies are represented

through an extension of the Rei [63] policy language. It is worth noting that, since the main goal of this architecture is to support knowledge sharing and interoperability in ambient intelligence scenarios, the efficiency of reasoning is not the main focus of that work. On the contrary, a middleware intended to support the provision of Internet services possibly accessed by a large number of users at a time should provide techniques for performing ontological reasoning in advance of service provision, in order to preserve system scalability.

SOCAM [47] is a middleware for supporting context-aware services in intelligent environments similar to CoBrA, since its context model is based on ontologies too. In particular, the adopted ontology [103] is composed of a general-purpose upper ontology and of application-specific lower ontologies. Context reasoning is performed in order to check the consistency of context data and to derive higher-level context data, and is based on both description logic and user-defined logic rules. The experiments reported by the authors show that their reasoning approach is computationally expensive, and infeasible for time-critical applications that make use of large amounts of context data. Rules in SOCAM are declared by a single entity (which can be the final user or a service provider), and inconsistencies among rules are not taken care of. Thus, a scenario in which service constraints and user preferences interact is not considered in the architecture.

An architecture for the user-side adaptation of applications in a mobile environment is described in [32]. This architecture contains a single profile manager, which is in charge of discovering context services (i.e., services that provide profile data). Context data are kept up-to-date on the profile manager by means of asynchronous notifications of changes sent by context services. An adaptation control module on the user device is in charge of evaluating adaptation policies against profile data, and of modifying

the behavior of local applications accordingly. Policies are declared by users by defining priorities among applications as well as among resources of their devices. Consequently, the behavior of applications is adapted to obtain the optimal level of service with respect to the user's requirements. It is worth noting that this architecture does not support server-side adaptation. Riché and Brebner [94] propose a different approach to profile management, implementing a distributed and replicated storage system on user devices. This approach is useful to preserve the privacy of data. However, the intermittent connectivity of mobile devices, along with their limited CPU, storage and power resources, makes it difficult to guarantee the availability of profiles, even if sophisticated techniques are adopted. Henricksen et al. [55] propose a hybrid approach to context modelling, combining their modelling scheme, based on the ORM graphical notation and predicate logic, with ontologies; in particular, they represent their context model using OWL-DL. However, they do not perform ontology-based reasoning in order to derive new context data, since the expressivity of their model is comparable to that of OWL-DL. Ontology-based reasoning is performed only for consistency checking and semantic interoperability.

Chapter 3

The CARE Middleware

Taking into account the issues outlined in Chapter 2, we have proposed the CARE (Context Aggregation and REasoning) middleware for context-awareness in mobile environments [3]. CARE has been defined based on the requirements presented in Section 2.4.1.

3.1 Architecture

In this section we describe the logical architecture of our framework. We present its main components as well as the issues related to context and policy representation and management.

3.1.1 Overview

Clearly, the specification and implementation of a full-fledged architecture satisfying all the requirements illustrated above is a long-term goal. The contribution illustrated in this dissertation is a first step in this direction. Without loss of generality, we present an architecture where three main entities are involved in the task of building an aggregated profile, namely: the user with her devices (called user in the rest of the dissertation), the

network operator with its infrastructure (called operator), and the service provider with its own infrastructure. Clearly, the architecture, including its conflict resolution mechanisms, has been designed to handle an arbitrary number of entities. A Profile Manager devoted to managing context data and policies is associated with each entity; the three managers will be called UPM, OPM, and SPPM, respectively.

[Figure 3.1: Architecture overview and data flow upon a user request]

Figure 3.1 provides an overview of the proposed architecture. We illustrate the system behavior by describing the main steps involved in a service request. At first (step 1), a user issues a request to a service provider through her device and the connectivity offered by a network operator. The HTTP header of the request includes the URIs which are used to contact the UPM and the OPM. Then (step 2), the service provider queries the Context Provider module to retrieve the context data needed to perform adaptation. In step 3, the same module queries the profile managers to retrieve distributed context data and the user's policies. Context data are aggregated by the MERGE module into a single profile which is given, together with policies, to the IE (Inference Engine) for policy evaluation. Ontological reasoning is performed on-demand (i.e., at the time of the service request) only

if the integrated profile lacks values for ontology-based context data that are necessary for providing the service. In this case, the Context Provider populates the ontology with the integrated profile, performs ontological reasoning, and adds the new context information to the integrated profile. In step 4, the integrated profile is returned to the service provider. Finally, context data are used by the application logic to properly adapt the service before its provision (step 5).

3.1.2 Profile Management and Aggregation

In the following we explain the mechanism of context management, and address the issue of how to aggregate possibly conflicting data in a single profile.

Profile Managers

As outlined above, each Profile Manager is responsible for managing the context attributes provided by the entity it pertains to. In addition, the UPM and the SPPM manage user and service provider policies, respectively. In particular:

• The UPM stores information related to the user and her devices. These data include, among other things, personal information, interests, context information, and device capabilities. The UPM also manages policies defined by the user, which describe the content and the presentation she wants to receive under particular conditions;

• The OPM is responsible for managing attributes describing the current network context (e.g., location, connection profile, and network status);

• Finally, the SPPM is responsible for managing service provider proprietary data, including information about users derived from previous

service experiences.

The architecture may also be easily extended by introducing other profile managers (e.g., profile managers owning context services) and by extending the per-user UPM model to a peer-to-peer network of UPM modules.

Context data representation

In order to aggregate context information, data retrieved from the different profile managers must be represented using a well-defined schema, providing a means to understand the semantics of the data. For this reason, we chose to represent context data using the Composite Capabilities/Preference Profiles (CC/PP) structure and vocabularies [66]. CC/PP uses the Resource Description Framework (RDF) to create profiles describing device capabilities and user preferences. In CC/PP, profiles are described using a 2-level hierarchy, in which components contain one or more attribute-value pairs. Attribute values can be either simple (string, integer or rational number) or complex (set or sequence of values, represented as rdf:Bag and rdf:Seq respectively). It must be observed that different vocabularies can identify different attributes or components using the same name. In order to avoid ambiguities, each attribute must be identified by means of its name, its vocabulary, its component, and its component vocabulary. CC/PP components and attributes are declared in RDFS vocabularies. In addition to well-known CC/PP-compliant vocabularies for device capabilities like UAProf [83] and its extensions, our framework assumes the existence of vocabularies describing information like the user's interests, content and presentation preferences, and the user's context in general. Clearly, there are several issues regarding the general acceptance

of a vocabulary, the privacy of certain server-side attributes, and the uniqueness of attribute names. Here, we simply assume there exists a sufficiently rich set of context attributes that is accessible by all entities in the framework. We also simplify the syntax used to refer to attributes, without going into RDF and namespace details: throughout this dissertation, the references to the corresponding components and vocabularies are possibly omitted.

Profile aggregation and conflict resolution

Once the context provider has obtained context data from the other profile managers, this information is passed to the IE, which is in charge of context integration (step 4 in Figure 3.1). Conflicts can arise when different values are given for the same attribute. For example, the UPM could assign to the Coordinates attribute a certain value x (obtained through the GPS of the user's device), while the OPM could provide for the same attribute a different value y, obtained through triangulation. In our architecture, resolution of this kind of conflict is performed by the Merge submodule of the IE. In order to resolve this type of conflict, the service provider has to specify resolution rules at the attribute level in the form of priorities among entities. Priorities are defined by profile resolution directives which associate with every attribute an ordered list of profile managers, using the setPriority statement. This means that, for instance, a service provider willing to obtain the most accurate value for the user's location can give preference to the value supplied by the UPM, while keeping the value provided by the OPM just in case the value from the UPM is missing. Continuing the above example, the directive giving higher priority to the user for the Coordinates attribute is: setPriority Coordinates = (UPM, OPM)

Profile resolution also depends on the type of the attribute. With respect to attributes of type Bag, the values to be assigned are the ones retrieved from all entities present in the list. If some duplication occurs, only the first occurrence of the value is taken into account (i.e., we apply the union operation among sets). Finally, if the type of the attribute is Seq, the values to be assigned to the attribute are the ones provided by the entities present in the list, ordered according to the occurrence of the entity in the list; again, if some duplication occurs, only the first occurrence of the value is taken into account. The mechanisms of context aggregation and conflict resolution are presented in detail in Chapter 4.

3.1.3 Policies for Supporting Adaptation

As anticipated in the introduction, policies can be declared by both the service provider and the user. In particular, service providers can declare policies in order to dynamically personalize and adapt their services considering explicit context data. For example, a service provider can choose the appropriate resolution for an image to be sent to the user, depending both on user preferences and on the currently available bandwidth. Similarly, users can declare policies in order to dynamically change their preferences regarding content and presentation depending on some parameters. For instance, a user may prefer to receive high-resolution media when working on her palm device, while choosing low-resolution media when using a WAP phone. Both service provider and user policies determine new context data by analyzing context attribute values retrieved from the aggregated profile.

Policy Representation

Each policy rule can be interpreted as a set of conditions on context data that determine a new value for a context attribute when satisfied. A policy

in our language is composed of a set of rules of the form:

If C1 And ... And Cn Then Set Ak = Vk,

where Ak is an attribute corresponding to profile data, Vk is either a value or a variable, and each Ci is either a condition of the form Ai = Vi or not Ai, the latter meaning that neither an explicit nor a derived value for Ai exists. For example, the informal user policy "When I am in the main conference room using my palm device, any communication should occur in textual form" can be rendered by the following policy rule:

"If Location = 'MConfRoom' And Device = 'Pda' Then Set PrefMedia = 'Text'"

The precise semantics of logic programs expressed in our language is presented in Section 4.1.

Conflicts and resolution strategies

Since policies can dynamically change the value of an attribute that may have an explicit value in a profile, or that may be changed by some other policy, they introduce nontrivial conflicts. These conflicts can be determined by policies and/or by explicit attribute values given by the same entity or by different entities. We have defined conflict resolution strategies specific to the different conflict situations. A complete description of the possible conflicts and of the solutions implemented in our architecture is presented in Chapter 4; here we just mention the basic technique. We implement conflict resolution strategies by transforming the logic program defined by the policy rules. The transformations basically consist of the assignment of a proper weight to each rule and of the introduction of negation as failure. In the resulting program, each rule with a generic head predicate A and weight w is evaluated only after the evaluation of the rule with the same head predicate and weight w + 1. When a rule with weight w fires, rules with the same head

predicate having a lower weight are discarded. The weight assignment algorithm ensures that the evaluation of the program satisfies the conflict resolution strategies, and a direct evaluation algorithm can be devised that is linear in the number of rules. The mechanisms of policy evaluation and conflict resolution are presented in detail in Chapter 4.

3.1.4 Ontological reasoning

In our framework we need to model both simple context data, such as device capabilities or the current network bearer, and socio-cultural context information describing, for instance, the user's current activity, the set of persons and objects a user can interact with, and the user's interests. While the first category, which we call shallow context data, can be naturally modeled by means of attribute/value pairs, the second one calls for more sophisticated representation formalisms, such as ontologies; we call it ontology-based context data. Similarly to other research works (e.g., [23] and [47]), we have adopted OWL [58] as the language for representing ontology-based context data. This choice is motivated by the fact that the description logic languages underlying the Lite and DL sublanguages of OWL guarantee completeness and decidability, while providing high expressiveness. Moreover, a number of tools already exist for processing OWL ontologies and, OWL being a W3C Recommendation, the available utilities should further increase in number. We extended the original CARE architecture by introducing ontology reasoners with the main goal of deriving new ontology-based context data from the data explicitly available. For a framework in which efficiency is a fundamental requirement, the introduction of ontological reasoning has been particularly challenging. The experimental results we obtained imposed the choice of an off-line form of ontological reasoning, i.e.,

anticipating, whenever possible, ontological reasoning before a user requests a service. The mechanism of ontological reasoning in CARE is presented in Chapter 6.

3.1.5 Supporting continuous services

The dynamic nature of some context attribute values calls for a mechanism that keeps the context data used by the service provider up-to-date during a session. Consider the case of user context data: while some attributes do not change during a session (e.g., the user's personal data), other information may change depending on the device status (e.g., available memory), on user interaction with the device or application (e.g., turning a feature on or off), and on user behavior (e.g., a change of location). Data owned by the network operator are possibly even more unstable. Different mechanisms can be adopted to fulfill this requirement, the usual approaches being based either on polling techniques or on asynchronous notifications fired by triggers. Polling, especially when involving properties of the user device, poses problems of cost and bandwidth consumption.

Our choice was to include in the CARE middleware a trigger mechanism to obtain asynchronous feedback on specific events (e.g., the available bandwidth dropping below a certain threshold, or the user location changing by more than 100 meters). When a trigger fires, the corresponding profile manager sends the new values of the modified attributes to the Context Provider module, which should then re-evaluate policies. Various optimizations have been devised with the aim of minimizing network traffic, and of avoiding the re-computation of those values that are completely independent of the changed attributes. The mechanism for intra-session updates is presented in detail in Chapter 5.
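As an illustration, the following is a minimal Java sketch of how a threshold trigger of this kind could be represented and evaluated by a trigger monitor; the class and field names are ours, and the notification logic is a simplifying assumption, not the actual CARE trigger language.

import java.util.Objects;

// A sketch of a threshold trigger on a numeric context attribute.
// Names and notification logic are illustrative assumptions.
final class ThresholdTrigger {
    final String attribute;    // monitored attribute, e.g., "Bandwidth"
    final double threshold;    // firing threshold
    final boolean fireBelow;   // fire when the value drops below (vs. rises above)
    private Boolean lastState; // threshold state at the previous reading

    ThresholdTrigger(String attribute, double threshold, boolean fireBelow) {
        this.attribute = Objects.requireNonNull(attribute);
        this.threshold = threshold;
        this.fireBelow = fireBelow;
    }

    // Returns true iff the reading crosses the threshold, i.e., the profile
    // manager should asynchronously notify the Context Provider.
    boolean evaluate(double value) {
        boolean state = fireBelow ? value < threshold : value > threshold;
        boolean crossed = state && (lastState == null || !lastState);
        lastState = state;
        return crossed;
    }
}

For instance, new ThresholdTrigger("Bandwidth", 128.0, true) would cause a notification the first time the monitored bandwidth drops below 128 kbit/s, and again only after it has risen back above the threshold in the meantime.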

[Figure 3.2: The software architecture]

3.2 Software architecture

The software architecture used to implement our middleware is shown in Figure 3.2. The current implementation improves the one presented in [3] by adopting more efficient, socket-based communication technologies. Furthermore, both the context management modules of the profile managers and the merge module of the context provider have been rewritten in order to improve performance. We have chosen Java as the preferred programming language, switching to more efficient solutions only when imposed by efficiency requirements. The profile mediator proxy (pmp) is a server-side Java proxy that is in charge of intercepting the HTTP requests from the user's device, and of communicating the user's profile (retrieved from the context provider)

to the application logic, by inserting context data into the HTTP request headers. In this way, user context data is immediately available to the application logic, which is relieved from the burden of requesting the remote profile from the context provider and of parsing CC/PP data. The pmp is also in charge of storing the monitoring specifications of the application logic. When the pmp receives a notification of changes in context data, it communicates them to the application logic by means of an HTTP HEAD message. Given the current implementation of the pmp, the application logic can be developed using any technology capable of parsing HTTP requests, including JSP, PHP, Cocoon, Java servlets, ASP.NET, and many others. The application logic can also interact with provisioning servers based on protocols other than HTTP. As an example, in the case of the adaptive streaming server presented in Section 7.2, context data is communicated to the streamer by a PHP script through a socket-based protocol.
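As a rough illustration of this interaction, the following servlet sketch reads a context attribute from the request headers and adapts its response accordingly; the header name X-CARE-PrefMedia is a hypothetical flattening of an aggregated profile attribute, since the dissertation does not fix the header syntax.

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A sketch of application logic consuming context data injected by the pmp.
// The header name below is a hypothetical example, not the actual pmp syntax.
public class AdaptiveServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String prefMedia = req.getHeader("X-CARE-PrefMedia"); // e.g., "Text"
        if ("Text".equals(prefMedia)) {
            resp.setContentType("text/plain");
            resp.getWriter().println("textual rendition of the content");
        } else {
            resp.setContentType("text/html");
            resp.getWriter().println("<html><body>rich rendition</body></html>");
        }
    }
}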

CC/PP profiles are managed by means of proper Java objects, and communicated by the profile managers to the context provider by means of the socket-based binary serialization of Java objects. The evaluation of the logic program is performed by an efficient, ad-hoc inference engine [14] developed in C.

Context data, policies and triggers are stored by the profile managers into ad-hoc repositories that make use of the MySQL DBMS. Each time a profile manager receives an update of context data, the trigger monitor evaluates the received triggers, possibly notifying changes to the context provider. The upm has some additional modules for communicating triggers to a server application executed by the user device. The communication of triggers is based on a socket protocol, since the execution of a SOAP server by some resource-constrained devices could be unfeasible.

The trigger monitor module on the user's device is in charge of monitoring the status of the device (e.g., the battery level and available memory) against the received triggers. The local proxy is the application that adds custom fields to the HTTP request headers, thus providing the context provider with the user's identification and with the URIs of her upm and opm. At the time of writing, the modules executed on the user device are developed in C# for the .NET (Compact) Framework. A command-line proxy is also available for Linux clients.
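The header injection performed by the local proxy can be sketched as follows; the sketch is written in Java for uniformity with the rest of this chapter (the actual client modules are in C#), and the X-CARE-* header names are hypothetical, as the dissertation does not specify them.

import java.net.URI;
import java.net.http.HttpRequest;

// A sketch of the custom fields added by the local proxy to outgoing requests.
// The header names are hypothetical assumptions.
final class LocalProxyHeaders {
    static HttpRequest decorate(URI target, String userId,
                                String upmUri, String opmUri) {
        return HttpRequest.newBuilder(target)
                .header("X-CARE-User", userId) // user identification
                .header("X-CARE-UPM", upmUri)  // URI of the user's UPM
                .header("X-CARE-OPM", opmUri)  // URI of the operator's OPM
                .GET()
                .build();
    }
}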

3.3 Evaluation with respect to the addressed requirements

We conclude this chapter by briefly evaluating our middleware with respect to the requirements for context-awareness frameworks that we have identified in Section 2.4.1:

1. Interoperable context representation: Our choice has been to adopt standard languages for representing context data. The most relevant specifications in this sense are the CC/PP language for representing device capabilities and user preferences, and the OWL language for representing data by means of ontologies. This choice allows us to represent both raw context data, by means of CC/PP profiles, and complex context data, by means of expressive OWL ontologies. More information about the fulfillment of this requirement can be found in Chapters 4 and 6.

2. Support for context dynamics: The CARE middleware adopts logic programming techniques for representing and reasoning with dynamic context data, i.e., data whose value depends on the value of other context data. Moreover, the framework includes a trigger-based

mechanism for continuously updating context data during service sessions. More information about the fulfillment of this requirement can be found in Chapters 4 and 5.

3. Support for distributed context data: As shown in Section 3.1.2, the middleware includes distributed profile managers for managing context data provided by single entities (e.g., the user, the network operator, the service provider).

4. The architecture should provide a mechanism to aggregate profile data and policies from different sources: Our proposed framework includes conflict resolution mechanisms for aggregating context data retrieved from distributed sources. Moreover, the framework includes algorithms for automatically detecting and resolving cycles determined by joining policies declared by different entities, and for resolving conflicts encountered during policy evaluation. More information about the fulfillment of this requirement can be found in Chapter 4.

5. The architecture should rely on an advanced system for privacy protection: Privacy protection is an extremely challenging issue in mobile and ubiquitous systems; however, this aspect has not been specifically addressed in this work. Nevertheless, as discussed in Section 8.2.2, the CARE middleware can be easily extended with privacy policies and anonymization techniques.

6. Efficiency: We have defined a restricted logic programming language for representing the policies that are used for reasoning with context data. An efficient inference algorithm for this language has been devised, which has linear complexity with respect to the size of the rule set (see Chapter 4). Moreover, the mechanism for supporting continuous updates of context data is based on optimized algorithms that can reduce the

number of unnecessary updates of context data, i.e., the number of updates that do not impact the aggregated profile (see Chapter 5). Techniques for reducing the inefficiency of ontological reasoning in our framework have also been devised (see Chapter 6).

Chapter 4

Conflict Resolution for Profile Aggregation and Policy Evaluation

Mobility emphasizes the need for modeling the dynamics of some context data. For example, changes in the available bandwidth (due, e.g., to a switch to a different mobile device and infrastructure) should correspond to a change in the value of the data specifying the desired bitrate for video streaming services. The same need may also arise with context data related to location, and even to user preferences. A natural approach to modeling these dynamics is to augment context data with policies, that is, rules that set or change certain context data based on the current values of other context data.

While distributed context data management and policy specification enables the enhanced form of adaptation that we envision for mobile computing, a technical problem can be easily identified. Different entities (e.g., user, network operator, device manufacturer, service provider) may provide

partial and possibly conflicting context data. For example, the location reported by the user's GPS module and by the network operator's infrastructure may be inconsistent. Similarly, conflicts can arise when policies given by different entities – or even by the same entity – determine conflicting values for the same context data. In this chapter we illustrate in detail how the above mentioned issues are addressed by the CARE framework. In particular, the main contributions of our solution can be summarized as follows: (i) we analyze sources of conflicts in context data and policies provided by different sources, and propose resolution strategies; (ii) by encoding policies, explicit value assignments, and priorities into logic programs we provide both a clear semantics for the intended model of a set of aggregated policies and an evaluation procedure; (iii) we show experimental results obtained on a prototype system. This chapter is structured as follows: Section 4.1 illustrates the formalisms for representing context data and policies; Section 4.2 presents a categorization of conflict types and their corresponding resolution strategies; Section 4.3 describes the mechanism of distributed context data aggregation; Section 4.4 presents the mechanism of policy evaluation and conflict resolution; Section 4.5 shows experimental results; and Section 4.6 presents some related work.

4.1 Representation of context data and policies

In this dissertation, we call profile the partial view of context data owned by a single entity. In order to aggregate distributed profiles, context data retrieved from the different entities must be represented using a well-defined schema, providing a means to understand the semantics of the data. For this reason, we chose to represent context data using the Composite Capabilities/Preference Profiles (CC/PP) structure and vocabularies [66]. CC/PP

uses the Resource Description Framework (RDF) to create profiles describing device capabilities and user preferences. In CC/PP, profiles are described using a 2-level hierarchy in which components contain one or more attribute-value pairs. CC/PP components and attributes are declared in RDFS vocabularies. Values can be either simple (string, integer or rational number) or complex (set or sequence of values, represented as rdf:Bag and rdf:Seq, respectively). It must be observed that different vocabularies can identify different attributes or components using the same name. In order to avoid ambiguities, each attribute must be identified by means of its name, its vocabulary, its component, and its component vocabulary. Thus, in our framework attributes are identified using the notation

Vocabulary1.Component/Vocabulary2.Attribute

where Vocabulary1 refers to the vocabulary the component belongs to; Component is the ID of the component containing the attribute; Vocabulary2 refers to the vocabulary the attribute belongs to; and Attribute is the ID of the attribute (e.g., UAProf.HardwarePlatform/UAProf.ScreenSize identifies the screen size attribute of the UAProf hardware component). In order to improve readability, throughout this dissertation the attribute syntax is simplified by omitting the vocabulary and possibly the component the attribute belongs to. Currently, CC/PP is mainly used for describing device capabilities and network conditions. Well-known CC/PP-compliant vocabularies are UAProf [83] and its extensions. These vocabularies provide an exhaustive description of device capabilities and network state; however, they do not take into account various data that are necessary to obtain a wide-ranging adaptation and personalization of services. Vocabularies describing information like the user's interests, content and presentation preferences, session variables, and the user's context are also needed. We have been working in this direction, mostly

considering confined domains, with the goal of experimenting with our framework on prototype applications. Since a detailed discussion on vocabularies and their sharing policies is out of the scope of this dissertation, from now on we assume there exists a sufficiently rich set of profile attributes that is accessible by all entities in the framework.

As anticipated, policies in CARE can be declared by both the service provider and the user. In particular, service providers can declare policies in order to dynamically personalize and adapt their services considering explicit context data. For example, a map service provider can choose the appropriate points of interest and the resolution of the map depending on the user's interests and her device capabilities. Similarly, users can declare policies in order to dynamically change their preferences regarding content and presentation depending on some parameters. For instance, a hypothetical user of the map service may prefer to receive image content when browsing from her desktop, while choosing audio instructions when driving her car. Both service provider and user policies determine new context data by analyzing other context data.

As usual, the choice of a representation language is a compromise between simplicity, expressiveness, and efficiency. The policy language must also support the definition of a mechanism for handling conflicts that could arise when user and service provider policies determine different values for the same context data.

Our choice for a policy language has privileged low complexity, well-defined semantics and well-known reasoning techniques. Indeed, our policies are specified as a set of first-order definite clauses [81] with negation-as-failure and no function symbols, forming a general logic program.

Each policy rule is composed of a set of conditions on context data (interpreted as a conjunction) that determine a new value for a context

data when satisfied. A policy in our language is composed of a set of rules of the form:

If C1 And ... And Cn Then Ak(Vk),

where Ak is a predicate corresponding to a CC/PP attribute, Vk is either a value or a variable, and each Ci is a subgoal of the form Aj(Vl) or not Aj(Vl). Note that the semantics of not in our rules is negation as failure.

For example, the informal user policy:

"When I am in the main conference room using my palm device, any communication should occur in textual form" can be rendered by the following policy rule:

"If Location(MConfRoom) And Device(PDA) Then PreferredMedia(Text)"

The language also includes various built-in comparative predicates, i.e., <, <=, >, >=, <>, ==, with their standard semantics in the domain of reals.

Due to the special purpose of our logic programs, where an atom like P(a) represents the fact that a is the value of the context data P, we need to ensure that at most one ground atom for each predicate is present in the program model. This is due to the implicit assumption that each context data can have a single value (even if it could be a composite value). For this reason, we have extended the syntax of general logic programs in order to declare priorities between conflicting rules (i.e., rules having the same head predicate). In particular, in our language rules are labeled, and expressions of the form R1 ≻ R2 state that rule R1 has higher priority than rule R2. The relation ≻ on the sets of rules having the same head predicate

is a strict partial order. Priorities are declared by the same entity that declares the policy; if they are not given, a default ordering is used. Since priorities are introduced in the language only for managing conflicts between rules, we restrict priorities to be assigned to rules having the same head predicate (i.e., rules setting a value for the same context data). The formal semantics of a set of policy rules and priorities is given by the unique model of the logic program in which it can be encoded (details in Section 4.2). In order to guarantee the uniqueness of the program model, we ensure that the logic program corresponding to the policy rules defined by each entity is stratified [5]. To this aim, we perform a simple test at the insertion of each new policy rule, discarding those rules that generate a cycle. Note that this condition does not prevent rule chaining. Web-based interfaces are currently used by service providers and users to insert and modify their own policies. User policies may also be taken or adapted from a library of predefined policy rules, as well as partially learned from user behavior.
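As an illustration of rule chaining combined with a built-in comparative predicate, consider the following pair of hypothetical rules (the attribute names NetworkClass and Bandwidth are ours):

"If Bandwidth(X) And X < 128 Then NetworkClass(Slow)"

"If NetworkClass(Slow) And Device(PDA) Then PreferredMedia(Text)"

Since the head predicate of the first rule occurs in the body of the second, the two rules chain without forming a cycle, and the stratification test performed at insertion time accepts both.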

4.2 Conflicts and resolution strategies

As explained in Section 4.1, conflicts in our framework are due to the implicit assumption that each context data can have a single value (even if it could be a composite value). We distinguish two types of conflicts: a) conflicts due to policies and/or explicit context data given by the same entity, and b) conflicts due to policies and/or explicit context data given by different entities. A simple example of a conflict of type a) is the use of policies to override default context data when specific events occur and/or specific conditions are verified. In this case, a policy given by an entity, deriving a value for a context data, intuitively has priority over an explicit value for that data given by

the same entity. Conflicts of type a) also include the case of two policies, given by the same entity, that specify different rules to derive the value for the same data. In this case, if the conditions of the two rules are not mutually exclusive, we may derive two different values. There is no intuitive way to solve such a conflict, and it is not reasonable to simply disallow each entity from specifying more than one policy to derive the value of a data item. Hence, we assume that the specification of these possibly conflicting policies includes an indication of priority; if this is not given, a default ordering is used. An example of a conflict of type a) is given below:

Example 1 Consider the provider of an instant messaging service declaring its policies regarding the notification of incoming messages to the users of the system. The service provider could declare two policies regarding the Notification data, specifying to send audio notifications when the user is using a PDA, and text notifications when the user is involved in an important meeting. If the user is both using her PDA and involved in an important meeting, both rules would fire, providing conflicting values. In this case, the service provider is forced to specify which rule has the highest priority. In this example, it is reasonable for the service provider to give higher priority to the second rule: when the user is involved in an important meeting, the first rule cannot fire because the second rule has higher priority.

Conflicts of type b) can occur, for instance, when a provider is not able or does not want to agree with a user policy or explicit preference, and declares a policy rule to override the values explicitly given or derived by the user. Conflicts due to rules and explicit values given by different entities can be taken care of by considering a priority rule between entities for that particular data: if an entity has priority over the other ones for the value of a

specific data, policies given by that entity to derive values for that data have priority over policies given by other entities. An example of such a conflict and its resolution is given below:

Example 2 Consider the user of a video streaming service. The user could declare a policy requiring high-resolution media as a default when using her PDA; the service provider may want to supply low-resolution media when the available bandwidth drops below a certain threshold. If both conditions hold, the evaluation of policies would generate a conflict. If the service provider has the highest priority for the data, the rule of the service provider prevails over the user’s one.

A categorization of possible conflicts is useful for determining the system behavior. We summarize the desired behavior of the system, in the presence of possible conflicts, considering each case as follows:

1. Conflict between explicit values provided by two different entities when no policy is given for the same data. In this case, the priority over entities for that data determines which value prevails. This kind of conflict is totally handled by the Merge module of the Context Provider.

2. Conflict between an explicit data value and a policy given by the same entity that could derive a different value. A simple example of a conflict of this type is the use of policies to override default context data when specific events occur and/or specific conditions are verified. In this case, a policy given by an entity, deriving a value for a context data, intuitively has higher priority over an explicit value for that data given by the same entity. Thus, the value derived from the policy must prevail.

3. Conflict between an explicit context data and a policy given by a different entity that could derive a different value. Conflicts of this type

can occur, for instance, when a provider is not able or does not want to agree with a user's explicit preference, and declares a policy rule to override the values explicitly given by the user. This kind of conflict can be taken care of by considering priority rules between entities. Considering the priority over entities for that data, if the entity giving the explicit value has priority over the other, then the policy can be ignored. Otherwise, the policy should be evaluated, and if a value is derived it prevails over the explicit one.

4. Conflict between two policies given by two different entities on a specific context data. Similarly to conflict (3), the priority over entities for that data states the priority in firing the corresponding rule. If a rule fires, no other conflicting rule from different entities should fire.

5. Conflict between two policies given by the same entity on a specific context data. There is no intuitive way to solve such a conflict. Hence, we assume that the entity gives a priority over these rules, using the syntax provided by the policy language, and if this is not given, a default ordering will be used. The priority over rules for that data is used to decide which one to evaluate first. If a rule fires, no other conflicting rule from the same entity should fire.

4.3 Merging distributed context data

Even if no policies are given, conflicts can arise when different values are given by different profile managers for the same data. For example, the UPM could assign to the Coordinates data a certain value x (obtained through the user’s device GPS), while the OPM could provide for the same data the value y, obtained through triangulation. This kind of conflict resolution is performed in our architecture by the Merge module of the Context Provider.

We have defined a language allowing service providers to specify resolution rules at the level of the context data. This means that, for instance, a service provider willing to obtain the most accurate value for the user's location can give preference to the value supplied by the UPM, while keeping the value provided by the OPM just in case the value from the UPM is missing. Priorities are defined by profile resolution directives which associate with every data item an ordered list of profile managers.

Example 3 Consider the following profile resolution directives:

1. setPriority */* = (SPPM, UPM, OPM)

2. setPriority NetSpecs/* = (OPM, UPM, SPPM)

3. setPriority UserLocation/Coordinates = (UPM, OPM)

In (1), a service provider gives the highest priority to its own context data, and lower priority to data given by the other entities. Clearly, if no value is present in the service provider profile, the value is taken from the other profiles following the priority directive. Directives (2) and (3) give the highest priority to the operator for network-related data and to the user for the single Coordinates attribute, respectively. The absence of the SPPM in directive (3) states that values for that data provided by the SPPM should never be used.

The semantics of priorities actually depends on the type of the data. When the CC/PP attribute corresponding to the context data is simple, the value to be assigned to the attribute is the one retrieved from the first entity in the list that supplies it. When the attribute is of type rdf:Bag, the values to be assigned are the ones retrieved from all entities present in the list. If some duplication occurs, only the first occurrence of the value is taken into account (i.e., we apply union). Finally, if the type of the attribute is rdf:Seq, the values assigned to the attribute are the ones provided by the

entities present in the list, ordered according to the occurrence of the entity in the list. All duplicates are removed, keeping only the first occurrence.

It is worth noting that the entities that can provide context data must be explicitly declared. In particular, if an entity is not present in the list associated with an attribute, all values supplied by that entity for the attribute are discarded. This feature can be useful when a service provider does not trust the values specified by an entity for one or more attributes, or simply is not interested in considering a value for a certain attribute.
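The attribute-level semantics just described can be summarized by the following minimal Java sketch; the types and method signature are ours, not the actual Merge module API.

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// A sketch of attribute-level merging under a profile resolution directive.
enum AttrType { SIMPLE, BAG, SEQ }

final class MergeSketch {
    // priority: ordered list of profile managers, e.g. ["UPM", "OPM"];
    //           entities absent from the list are ignored altogether.
    // values:   the values each profile manager supplied for the attribute.
    static List<String> merge(AttrType type, List<String> priority,
                              Map<String, List<String>> values) {
        if (type == AttrType.SIMPLE) {
            for (String pm : priority) { // first supplier in the list wins
                List<String> v = values.get(pm);
                if (v != null && !v.isEmpty()) return List.of(v.get(0));
            }
            return List.of();
        }
        // Bag: union keeping first occurrences; Seq: concatenation in priority
        // order with duplicates removed. A LinkedHashSet, which preserves
        // insertion order, covers both cases.
        LinkedHashSet<String> out = new LinkedHashSet<>();
        for (String pm : priority) {
            List<String> v = values.get(pm);
            if (v != null) out.addAll(v);
        }
        return new ArrayList<>(out);
    }
}

For instance, under directive (3) of Example 3, merging the Coordinates attribute with priority list ["UPM", "OPM"] returns the UPM value when present, falls back to the OPM value otherwise, and discards any value supplied by the SPPM.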

4.4 Policy formal semantics and conflict resolution

Although the logic program corresponding to the set of policy rules defined by each entity is stratified (see Section 4.1), it is still possible that this property is lost when the policies of different entities are joined. In order to preserve stratification (and consequently the uniqueness of the logic program model), we detect and resolve cycles possibly occurring in the joined logic program. This mechanism is presented in detail in Section 4.4.1.

Section 4.4.2 presents the conflict resolution transformations that are applied to the logic program obtained after cycle resolution. Even if stratification cannot be preserved by applying those transformations, we prove that the resulting logic program maintains a weaker form of stratification that is sufficient to guarantee the uniqueness of the logic program model.

Finally, Section 4.4.3 illustrates the evaluation algorithm, and provides a complexity analysis.

4.4.1 Cycle detection and resolution

In order to ensure that a logic program is stratified, it is sufficient to show that its rule dependency graph (RDG) is acyclic (this property can easily be derived from the definition of stratification [5]). Given a logic program P, RDG(P) is a directed graph whose nodes are the rules forming P. The graph contains an edge from R to R′ iff the head predicate of rule R′ belongs to the set of body predicates of R. In our case, although acyclicity is guaranteed within each entity's ruleset, it is still possible that a cycle is created when user and service provider policies are joined. We check for the presence of such a cycle as a preprocessing step before rule evaluation. When a cycle is recognized, we apply a proper cycle resolution strategy.

Example 4 Consider the following joined ruleset P:

u.r1 (user): A ← B.
sp.r2 (service provider): B ← C, E.
sp.r3 (service provider): D ← A, B, G.
u.r4 (user): E ← F.
u.r5 (user): C ← D.
u.r6 (user): G ← B.
sp.r7 (service provider): F ← E.

Its rule dependency graph is shown in Figure 4.1(a). RDG(P) contains four cycles, corresponding to the paths C1 = (u.r1, sp.r2, u.r5, sp.r3, u.r1), C2 = (sp.r2, u.r5, sp.r3, sp.r2), C3 = (sp.r3, u.r6, sp.r2, u.r5, sp.r3), and C4 = (u.r4, sp.r7).
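Detecting whether a joined RDG contains such cycles is a standard graph problem. The following is a minimal Java sketch of RDG construction and DFS-based cycle detection, assuming a simplified Rule type of our own; it is not the actual CARE implementation, and recovering the concrete cycle paths needed by the resolution step would additionally require recording the DFS stack at each back edge.

import java.util.*;

// Simplified rule representation: a label, a head predicate, body predicates.
final class Rule {
    final String label, head;
    final Set<String> body;
    Rule(String label, String head, String... body) {
        this.label = label;
        this.head = head;
        this.body = new HashSet<>(Arrays.asList(body));
    }
}

final class Rdg {
    // Edge from r to r2 iff the head predicate of r2 occurs in the body of r.
    static Map<Rule, List<Rule>> build(List<Rule> program) {
        Map<Rule, List<Rule>> adj = new HashMap<>();
        for (Rule r : program) {
            List<Rule> out = new ArrayList<>();
            for (Rule r2 : program)
                if (r.body.contains(r2.head)) out.add(r2);
            adj.put(r, out);
        }
        return adj;
    }

    // Standard three-color DFS: returns true iff the RDG contains a cycle.
    static boolean hasCycle(Map<Rule, List<Rule>> adj) {
        Map<Rule, Integer> color = new HashMap<>(); // 0 white, 1 grey, 2 black
        for (Rule r : adj.keySet())
            if (color.getOrDefault(r, 0) == 0 && dfs(r, adj, color)) return true;
        return false;
    }

    private static boolean dfs(Rule r, Map<Rule, List<Rule>> adj,
                               Map<Rule, Integer> color) {
        color.put(r, 1); // grey: on the current DFS path
        for (Rule next : adj.get(r)) {
            int c = color.getOrDefault(next, 0);
            if (c == 1) return true; // back edge: cycle detected
            if (c == 0 && dfs(next, adj, color)) return true;
        }
        color.put(r, 2); // black: fully explored
        return false;
    }
}

Applied to the ruleset of Example 4 (e.g., new Rule("u.r1", "A", "B")), hasCycle returns true, as expected.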

Resolution strategy Since cycle detection is performed at the time of the service request, an automatic mechanism for resolving cycles is needed.

[Figure 4.1: DFS-based algorithm. (a) Ruleset dependency graph; bold nodes represent rules candidate for deletion. (b) Cycle graph. (c) Application of the heuristic algorithm to the cycle graph; hatched nodes represent rules to be pruned. (d) The resulting acyclic rule dependency graph.]

A reasonable strategy to resolve cycles would be to remove, from the set of rules involved in the cycle, the node corresponding to the policy rule with the lowest priority. Unfortunately, it is easily seen that each cycle is composed of rules having different head predicates. Thus, since the scope of priorities in our language is limited to each single predicate, an ordering of priorities relative to different predicates cannot be applied.

However, the rules in a cycle can be categorized on the basis of the relative priority of the entity that declared them. As an example, consider the cycle between rules r4 and r7 in Example 4. These rules were declared by the user and by the service provider, respectively. Suppose that the user has the highest priority – according to the profile resolution directives – for the head predicates of both rules. Our resolution strategy consists in discarding one of the rules declared by an entity that does not have the highest priority for the corresponding head predicate (rule r7 in this example). This choice is motivated by the fact that the evaluation of these rules is never guaranteed, since they can be overridden by rules declared by higher-priority entities.

It is still possible that this strategy is not applicable, because every rule composing a cycle was declared by an entity with the highest relative priority. In this case, we exploit an interesting property of cycles in our framework: since each single entity's ruleset is acyclic, every cycle occurring in the joined ruleset contains at least one node corresponding to a service provider rule and at least one node corresponding to a user rule. Thus, cycles for which the above mentioned strategy cannot be applied are resolved by deleting one of the user rules involved in the cycle. This choice is motivated by the fact that users have no real control over the actual evaluation of their policy rules. As a matter of fact, user policies can be overridden by policies or values given by the service provider, depending on the profile resolution

directives it applied. On the contrary, service providers must be guaranteed that their policies regarding a given service are thoroughly applied, provided that they kept the highest priority for the corresponding context data. When a user rule is deleted, the application logic can choose the most appropriate action to perform, depending on the type of service. We believe that, for the majority of Internet services, the requested service can be provided without informing the user. In the case of particular applications (e.g., services involving privacy issues), the user may be informed about the incompatibility of her preference with the general policies of the provider, and invited to define a new strategy consistent with the service provider rules.

Minimizing the number of rules to prune. When multiple cycles are detected in the joined ruleset, a single rule may be involved in more than one cycle. In this case, an obvious optimization consists in pruning a minimum cardinality set of rules that resolves all cycles – provided that the deletion of the rules in this set is consistent with our overall strategy. The problem of finding a minimum cardinality set of nodes whose deletion cuts every cycle in a general graph is called the feedback vertex set (FVS) problem (see [33] for a survey). The FVS problem is known to be NP-complete [64], even if an exact solution is achievable in polynomial time for particular classes of graphs (e.g., reducible flow graphs [98]). Unfortunately, the RDG of joined rulesets in our framework does not fall into one of these classes. Hence, since cycle resolution must be performed at the time of the user request, we chose to adopt a low-complexity heuristic algorithm. Our strategy consists of the following steps:

• Step 1: Given the logic program P obtained by joining the user

and service provider policies, we construct its rule dependency graph RDG(P). In order to detect every cycle Ci occurring in RDG(P), we apply the well-known depth-first search (DFS) algorithm [27] for directed graphs.

• Step 2: Every cycle Ci is transformed into C′i by discarding those rules in the cycle that are not candidates for deletion on the basis of our cycle resolution strategy. We recall that our resolution strategy ensures that ∀i, C′i ≠ ∅ (i.e., each cycle contains at least one rule candidate for deletion); hence, this transformation preserves all cycles. The cycles C′i are used to build the cycle graph G, which is composed of the cycles in RDG(P) contracted by removing those nodes that correspond to rules that must be preserved according to the resolution strategy. More formally, the graph G is obtained by applying the following transformation to RDG(P):

Transformation 1 At first, RDG(P) is transformed by removing every node r that does not appear in any cycle, together with its incoming and outgoing edges. Then, for each node r′ with r′ ∈ {Ci} − {C′i}, if a couple of edges (v → r′, r′ → v′) appears in the resulting graph, then a new edge v → v′ is added to RDG(P). Finally, the node r′ and its incoming and outgoing edges are removed.

Note that v and v′ may coincide; in this case the transformation determines the addition of a self-loop.

• Step 3: Solution of the FVS problem. In recent years, various approximation and heuristic algorithms for the FVS problem have been developed. Since in practical scenarios we expect the joined logic program to contain a small number of cycles (if any), our choice for such

an algorithm has privileged low complexity. In particular, in Step 3 we adopt the heuristic algorithm for the unweighted case proposed in [71]. This algorithm has time complexity O(|E| log |V|), and is the basis of many other heuristic algorithms for the FVS problem.

Example 5 Continuing Example 4, Figure 4.1(b) shows the cycle graph G obtained after the application of Transformation 1 to the RDG in Figure 4.1(a). The acyclic graph obtained after the application of the heuristic algorithm is shown in Figure 4.1(c). The removed nodes correspond to the rules to prune from P in order to resolve all cycles in RDG(P). The RDG of the resulting acyclic ruleset is shown in Figure 4.1(d).

4.4.2 Policy conflict resolution

Since policies can dynamically change the value of an attribute that may have an explicit value in a profile, or that may be changed by some other policy, they introduce nontrivial conflicts. These conflicts can be determined by policies and/or by explicit context data given by the same entity or by different entities. We now show how the conflict resolution strategies we have devised can be implemented. In the simple case when no policies are given for a certain attribute, conflicts are easily solved by the Merge module as explained above, and the resulting attribute value is directly passed to the service provider's application logic. However, when policies are present, the resolution strategies must be integrated into the evaluation of the logical rules. A set of policy rules can be encoded in a logic program where each rule has the following form:

A(X) ← A1(X1), ..., Ak(Xk), not Ak+1(Xk+1), ..., not An(Xn)    (4.1)

Algorithm 1 DFS-based cycle detection and resolution. Notation: i) Rsp, Ru, P are respectively the service provider ruleset, the user ruleset, and the joined ruleset; ii) G = (V, E) is the RDG of P; iii) (EA,1, EA,2, EA,3) is the priority over entities for the attribute A; iv) Paths is the set of cycle paths detected in G; v) Pathi = {Ei.rj, ..., Em.rn} is a cycle path in Paths; vi) Ci is the set of rules candidate for deletion in order to resolve Pathi.

Procedure CycleResolution
    Paths := CyclesDetection(G)
    for all Pathi ∈ Paths do
        Ci := ∅
        for all ej.rk ∈ Pathi do
            H := headPredicate(ej.rk)
            if ej ≠ eH,1 then
                Ci := Ci ∪ {ej.rk}
            end if
        end for
        if Ci = ∅ then
            Ci := {ej.rk | ej.rk ∈ Pathi ∧ j = u}
        end if
    end for
    while {Ci} ≠ ∅ do
        calculateOccurrences({Ci})
        ej.rk := oneOf(mostFrequentRule({Ci}))
        P := P \ {ej.rk}
        {Ci} := {Ci} \ {Pathl | ej.rk ∈ Pathl}
    end while
    return P

where A, A1, ..., An are predicate symbols corresponding to context data, and X, X1, ..., Xn are either variable or constant symbols and denote context data values. Note that our language allows positive and negative premises, with negative ones denoting the absence of a value for a specific attribute, but constrains the head of a rule to be positive. Moreover, safety imposes that if X is a variable appearing in the rule head, the same variable must

appear in the rule body. In addition to the logic rules that encode policy rules, we have to encode in the logic program the implicit and explicit priorities that will be necessary to solve conflicts. For this purpose, a second argument, that we call weight, is added to each predicate, and the logic program encoding the policy rules is transformed into a program P' by modifying each rule of the form (4.1) into:

A(X, w) ← A1(X1, W1), ..., Ak(Xk, Wk), not Ak+1(Xk+1, Wk+1), ..., not An(Xn, Wn)   (4.2)

where W1, ..., Wn are variables with values in the non-negative integers, and w is a non-negative integer determined by Algorithm 2. Note that rule labels as well as priorities over rules are used only in this pre-processing phase, and are therefore removed from the logic program. The weight of a rule is defined as the weight assigned to the predicate in its head. Intuitively, rules on context data for which a prevailing fact exists (see conflict type 3 in Section 4.2) are not assigned any weight and are discarded, all facts are given weight 0, and the other rules are assigned increasing weights according to the priorities over entities and the priorities specified by each entity. Algorithm 2 ensures that (i) no pair of rules exists having the same head predicate symbol and the same weight; (ii) rules having the same head predicate but higher weight have higher priority, according to our conflict resolution strategy, over those with lower weights. From a logic programming point of view we can also observe that if the predicate dependency graph of the starting set of rules is acyclic, then the above transformation preserves acyclicity. Indeed, it is easily seen that the addition of a second argument to each head predicate does not determine the introduction of new edges in the predicate dependency graph of the logic program.

Algorithm 2 Setting the Weight parameter. Let (E3, E2, E1) be the priority over entities for the attribute A (E3 having the highest priority); Er be the entity among E1, E2, E3 providing the value obtained by the Merge module for A; R_{Ej,A} be the set of rules declared by Ej for A; and R_{Ej,A,k} be the kth rule ∈ R_{Ej,A} in increasing order of priority, according to Ej.

\∗ Facts always have weight 0 ∗\
Weight(Fact_A) := 0
w := 0
\∗ Repeat ∀ Ej, r ≤ j ≤ 3 ∗\
for j = r to 3 do
  Kj := |R_{Ej,A}|
  \∗ Repeat for each rule declared by Ej on A ∗\
  for k = 1 to Kj do
    w := w + 1
    Weight(R_{Ej,A,k}) := w
  end for
end for
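A minimal Python sketch of this weight assignment follows; the list-based encoding of entity priorities and the function name are our own illustrative assumptions.

    def assign_weights(entities_low_to_high, rules_by_entity, merge_winner):
        """entities_low_to_high: entities ordered by increasing priority for A,
        e.g. ["OPM", "UPM", "SPPM"]; rules_by_entity: entity -> its rules for A,
        in increasing order of the priority assigned by that entity;
        merge_winner: the entity Er whose explicit value won the merge.
        Returns rule -> weight; facts implicitly keep weight 0."""
        weights = {}
        w = 0
        # Entities with lower priority than the merge winner are skipped:
        # their rules are discarded, since a prevailing fact exists.
        start = entities_low_to_high.index(merge_winner)
        for entity in entities_low_to_high[start:]:
            for rule in rules_by_entity.get(entity, []):
                w += 1
                weights[rule] = w
        return weights

Running this on the data of Example 6 below reproduces the weights discussed there: the two user policies receive weights 1 and 2, and the service provider policy receives weight 3.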

In order to give a standard formal semantics to our policies and to enforce the above evaluation strategy, we still need to encode in the logic program the fact that we do not allow two different values for the same attribute in the output. This means that the logic program should have a unique model, and this model should contain at most one atom for each predicate. For this purpose, program P' is once more modified as follows.

Transformation 2 Each rule (4.2) is modified by adding the subgoals:

not A(Y, Z), Z > w   (4.3)

where Y is a variable with the same domain as X, X1,...,Xn, leading to:

A(X, w) ← A1(X1, W1), ..., Ak(Xk, Wk), not Ak+1(Xk+1, Wk+1), ..., not An(Xn, Wn), not A(Y, Z), Z > w.   (4.4)

We call P'' the resulting program.

Example 6 Consider conflicts between an explicit attribute value provided by the operator, two policies given by the same entity (e.g., the user), and a policy given by the service provider, possibly deriving different values for the same attribute A1. In this example, the user declared two policies over the same attribute, and she gave highest priority to the policy p2-user. Suppose that the priority over entities for the A1 attribute is (SPPM, UPM, OPM). The Inference Engine preprocessor receives in input from the SPPM the following logic program:

(op) A1(a) ←
(user) A3(b) ←
(p1-user) A1(X) ← A2(X)
(p2-user) A1(X) ← A3(X)
(p1-sp) A1(X) ← A4(X)

p2-user ≻ p1-user (the user's priority declaration over her rules)

The fact (op) represents the value provided by the OPM for A1, the fact (user) represents the value provided by the UPM for A3, the first policy (p1-user) and the second policy (p2-user) are declared by the user, and the last policy (p1-sp) is declared by the service provider. Applying Algorithm 2, the lowest weight (0) is assigned to the facts. The UPM has higher priority than the OPM, and so the preprocessor, following the priorities defined by the user over her rules, gives weight 1 to the head of the user policy with lowest priority (p1-user) and weight 2 to the head of the policy (p2-user). Finally, the highest weight (3) is assigned to the head of the policy (p1-sp), as it was declared by the entity with highest priority (the service provider in this case). Note that, if the OPM had higher priority than the UPM and SPPM, no rule would have been assigned any weight, and hence all rules would have been discarded. Hence, the above logic program is modified as follows:

(op) A1(a, 0) ← not A1(Y, Z), Z > 0.
(user) A3(b, 0) ← not A3(Y, Z), Z > 0.
(p1-user) A1(X, 1) ← A2(X, W), not A1(Y, Z), Z > 1.
(p2-user) A1(X, 2) ← A3(X, W), not A1(Y, Z), Z > 2.
(p1-sp) A1(X, 3) ← A4(X, W), not A1(Y, Z), Z > 3.

In this case, the value of A1 is determined as b by the firing of rule (p2-user).

The addition of the subgoal (4.3) determines the introduction of negative loops (i.e., negative edges connecting one node to itself) in the predicate dependency graph of P''. Hence, since a logic program is stratified iff its predicate dependency graph contains no cycles involving negative edges [5], we can conclude that Transformation 2 does not preserve stratification. However, as proved by Theorem 1 and Corollary 1, the transformed programs maintain a weaker form of stratification (known as local stratification [99]) that guarantees the uniqueness of the program model.

Theorem 1 Given a stratified program P' with weights assigned by Algorithm 2, the logic program P'' obtained by Transformation 2 is acyclic [4].

Proof. We show that the atom dependency graph (ADG) of the ground program G(P'') can be obtained from the atom dependency graph of G(P') by inserting arcs that do not introduce cycles. Since the program P' is stratified, its predicate dependency graph is acyclic. For this reason, the atom dependency graph D(G(P')) is acyclic too. By Transformation 2, D(G(P'')) can be obtained by adding to D(G(P')) the arcs from the ground instances of A(Y, w) to the ground instances of A(X, v), v > w, for each rule in P''. Suppose A(y, w) and A(x, v) are ground instances of A(Y, w) and A(X, v), respectively. Then, an arc from A(y, w) to A(x, v) is added to D(G(P'')). This arc could introduce a cycle in the graph only if there is a path from A(x, v) back to A(y, w). By construction of P' and P'', any arc starting from A(x, v) either goes to a node with a different predicate or to a node A(x1, v1) with v1 > v. The second case is actually similar to the first one, since any arc starting from A(x1, v1) either goes to a node with a different predicate or to a node A(x2, v2) with v2 > v1, and we can repeat this consideration until the atom with the highest weight is reached, which by construction can only have outgoing arcs towards nodes with a different predicate. Hence, in order to show that no cycle is created, it is sufficient to show that no path starting from A(x, v) with an arc towards a different predicate Aj leads to an atom A(t, u), u ≤ v. Consider an arbitrary atom Aj(z, w') from which a path to A(t, u) exists in D(G(P'')): there can be no path from A(x, v) to Aj(z, w'). Indeed, if such a path existed, the predicate dependency graph of P' would contain a cycle, contradicting our hypothesis. Since the above holds for all the transformed rules, cycles cannot arise in D(G(P'')) if there are none in D(G(P')), and therefore ADG(G(P'')) is acyclic.

The acyclicity property ensures that a topological order ord can be applied to ADG(G(P'')) [27]. In this case, ord constitutes a level mapping from the elements of the Herbrand base of G(P'') to the natural numbers. Since we allow only constant values and bound variables to appear in rule atoms, ADG(G(P'')) is finite. Therefore, N = Max{ord(ADG(G(P'')))} exists, and we can define the function ord'(a) = N − ord(a), which is also a level mapping. By construction of the graph, for each ground clause

Ai(xj, wk) ← [not] Ai1(xj1, wk1), ..., [not] Ain(xjn, wkn),

it holds that ord(Ai(xj, wk)) < ord([not] Aim(xjm, wkm)) for each 1 ≤ m ≤ n, and therefore

ord'(Ai(xj, wk)) > ord'([not] Aim(xjm, wkm)).

This demonstrates that P'' is acyclic with respect to the level mapping ord'.

Corollary 1 Since acyclic logic programs are a subclass of locally stratified programs [4], program P'' is locally stratified. Hence, it has a unique model [99].

4.4.3 Evaluation algorithm and complexity analysis

Although the above-mentioned formal properties guarantee the uniqueness of the intended model, and hence provide a clear semantics to our prioritized rulesets, they do not guarantee in general an efficient evaluation procedure. However, in our case a direct evaluation algorithm can be devised that is linear in the number of rules, since each rule has to be evaluated only once. The intuitive evaluation strategy is to proceed, for each attribute A, starting from the rule having A() in its head with the highest weight, and continuing with the rules on A() in decreasing weight order until one of them fires. If none of them fires, the value of A is the one specified by the fact on A, or none if such a fact does not exist. The algorithm is shown in Algorithm 3.

Theorem 2 The complexity of Algorithm 3 is linear in the number of rules.

Proof. Function Evaluate(r) is called only if r belongs to New (see lines 6, 13, and 18). When executed, function Evaluate(r) removes rule r from the set New. Thus, the cardinality of New decreases each time Evaluate is called. Since at initialization time the cardinality of New is equal to the number of rules, Evaluate(r) is executed at most once for every r ∈ P . 

Algorithm 3 Logic program evaluation. i) P is the initial set of rules in the logic program; ii) New = {r ∈ P | r has not yet been evaluated}; iii) M = {derived atoms}; iv) body(r) = {literals in the body of r}, r ∈ P; v) head(r) = head of r, r ∈ P; vi) w_Ai := Max{w | Ai(Xj, w) ∈ head(rk), rk ∈ P}; vii) r_{Ai,w} := rk such that head(rk) = Ai(Xj, w); viii) Procedure RuleEval(r) returns the rule head atom if all literals in the body evaluate to True; it returns NULL otherwise.

3:  Procedure Main()
4:    New := P; M := ∅
5:    for all r_{Ai,w_Ai} ∈ P do
6:      if r_{Ai,w_Ai} ∈ New then Evaluate(r_{Ai,w_Ai})
7:    end for
8:    return M
9:
10: Procedure Evaluate(r_{Ai,w})
11:   New := New \ {r_{Ai,w}}
12:   for all a = [not] Aj(X, W) where a ∈ body(r_{Ai,w}) do
13:     if r_{Aj,w_Aj} ∈ New then Evaluate(r_{Aj,w_Aj})
14:   end for
15:   atom := RuleEval(r_{Ai,w})
16:   if atom ≠ NULL then M := M ∪ {atom}
17:   else
18:     if r_{Ai,w−1} ∈ New then Evaluate(r_{Ai,w−1}) end if

Since, for most Internet services, adaptation will probably be performed considering a small subset of CC/PP attributes, we chose to adopt a form of goal-driven reasoning, implementing our Inference Engine using a backward-chaining approach. Hence, the cost of the evaluation will generally be less than linear in the size of the policies, since a number of irrelevant rules will be ignored.
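The following Python fragment sketches this weighted, goal-driven evaluation. It simplifies the language by giving each rule a constant head value and by ignoring variable bindings; all names are our own illustrative choices.

    def evaluate(attr, rules, facts, memo):
        """rules: attr -> list of (positive, negative, value) tuples sorted by
        decreasing weight; facts: attr -> explicit value, if any; memo caches
        results so that each rule is evaluated at most once."""
        if attr in memo:
            return memo[attr]
        memo[attr] = None  # guard value; rulesets are guaranteed acyclic
        for positive, negative, value in rules.get(attr, []):
            # A rule fires if every positive subgoal has a value and every
            # negative subgoal (absence of a value) holds.
            if all(evaluate(a, rules, facts, memo) is not None for a in positive) \
               and all(evaluate(a, rules, facts, memo) is None for a in negative):
                memo[attr] = value
                return value
        # No rule fired: fall back to the explicit fact, if present.
        memo[attr] = facts.get(attr)
        return memo[attr]

Starting the evaluation from a requested attribute touches only the rules reachable from it, which is why the backward-chaining strategy typically inspects fewer rules than a forward pass over the whole ruleset.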

4.5 Experimental results

In order to estimate the feasibility of the evaluation of policies based on logic programming for sophisticated services, we performed an experiment using the ad-hoc Inference Engine we developed and artificial rulesets of various cardinalities. Each rule in the rulesets had three random subgoals, one of which was negative. For each attribute, each ruleset contained three conflicting rules. Rulesets were properly built in order to avoid recursion, and to allow a random rule to fire for each set of conflicting rules over an attribute. Figure 4.2 shows the experimental results of executing the rulesets on a two-processor Xeon 2.4 GHz workstation. Evaluation times are averages of ten runs, each using a different random ruleset. Results show that a ruleset of 45 rules is evaluated in around 1 millisecond, while a ruleset of 180 rules is evaluated in around 5 milliseconds. It should be noted that evaluation times exhibit a linear increase with the number of rules in the ruleset.

Figure 4.2: Rulesets evaluation time.

In order to compare our Inference Engine with a widely adopted solver, we performed the same experiments using the DLV system [70] (the rule syntax has been slightly modified in order to be accepted by the DLV parser). As expected, evaluation times are considerably higher with DLV, since the class of logic programs it considers (i.e., Disjunctive Datalog programs) is more complex than ours. In conclusion, ruleset evaluation time seems to be acceptable for the class of services we consider; therefore, we expect that response time will be dominated by the network latency.

4.6 Bibliographic Notes

With respect to the issue of integrating multi-source context data, our approach is similar to the one adopted by DELI [20] and the Intel CC/PP SDK [18]. These frameworks adopt the profile aggregation approach of UAProf [83], which consists in associating a resolution rule with every context attribute. Whenever a conflict arises, the resolution rule determines the value to be assigned to the attribute by considering the order of evaluation of partial profiles. This corresponds to assigning priorities to partial profiles, as opposed to assigning priorities to single context data as we do. Furthermore, since resolution rules are defined in the CC/PP vocabulary, service providers cannot have control over the aggregation mechanism. Our approach to context data aggregation (see Section 4.3) overcomes the above-mentioned limits, providing, in our opinion, a more flexible and powerful aggregation mechanism. Moreover, policies are not considered in these frameworks. A number of adaptive systems take into account the users' preferences regarding the adaptation of Web resources by means of rule-based systems (e.g., [52]). A mechanism of rule evaluation against user context data similar to ours

is adopted in the Houdini framework [59]. The focus of Houdini is on providing user context information to service providers while preserving the privacy of data. While the policy rule languages have similar expressiveness, requiring acyclicity but allowing rule chaining, in Houdini rules are declared by the user only and evaluated by a proper user-trusted module. Being primarily focused on adaptation, our policy mechanism is different: policies are declared by multiple entities in order to determine customization parameters, and our focus is on conflict resolution strategies.

The SweetDeal [45] project investigates rule-based business processes for e-commerce and is based on courteous logic programs [43], which are closely related to the ones in which we encode our policy rules. However, due to the complex application domain addressed in that project, their rule language is more expressive, and evaluation cannot be achieved in linear time as in our case.

In the architecture presented in [54] and [53], preferences are evaluated against context data, and assign a score to each adaptation parameter; a score can be either a numerical value, or a special score that represents prohibition, obligation, veto, or an error state. For those parameters whose value can be represented by a number, conflicts are resolved by calculating the mean of the numerical scores determined by the single preferences. In the other cases, an ad-hoc conflict resolution technique is adopted, which considers the semantics of the special scores (e.g., when both a prohibition and an obligation are derived, the resulting score represents an error state). That approach is different from ours, since our system does not try to conciliate different preferences; as a matter of fact, our conflict resolution mechanism consists in evaluating preferences, expressed as policy rules, in decreasing order of priority, until one of them fires.

Considering work on policy conflict resolution, we should mention the PDL language and the monitor concept introduced in [25]. PDL, like many other policy languages, is based on the event-condition-action paradigm; however, its semantics is given in terms of nonmonotonic logic programs, as in our approach. An interesting extension of PDL that allows the specification of preferences regarding the application of monitors is proposed in [12]. However, with respect to PDL and its extensions, we allow chaining of rules, since we believe this is essential in our context to enable composition of policies specified by different entities.

A possible approach for implementing a form of prioritized conflict resolution is to adopt PLP [30], a Prolog compiler for logic programs with preferences. Programs are compiled by PLP into regular logic programs, that is, the class of logic programs in which we encode our policies. However, the need for a Prolog compilation phase poses major problems in terms of response time, especially considering that policies can dynamically change, so that the compilation should be performed at run-time.

An interesting class of engines is that of production rule systems (e.g., engines based on the RETE algorithm [36] such as OPS-5, CLIPS, and Jess). These engines encode a built-in conflict resolution strategy that in certain systems can be modified. For instance, in Jess rules can be prioritized, and default conflict resolution strategies can be overridden. However, the use of priorities is discouraged, since it can have a negative effect on performance. Furthermore, production rules adopt the forward-chaining approach, which does not seem to be optimal in our case, as explained in Section 4.4.3.

Datalog engines like DLV and Mandarax would be suitable for evaluating our rulesets after the preprocessing phase (Transformation 2 in Section 4.4.2). Nevertheless, the experimental results shown in Section 4.5 confirm the intuition that even an optimized Datalog engine is slower than an ad-hoc implementation, since the restrictions we impose on our rulesets can be profitably exploited for improving the evaluation time.

Chapter 5

Distributed Context Monitoring for the Adaptation of Continuous Services

Consistent with the approach taken by many adaptation systems, CARE was initially designed to compute the current context at the time of a user request for a specific service. This model is adequate for the large class of services that provide a single response from the server to each request from the user; examples are adaptive web browsing and location-based resource identification. In this chapter, we consider the particular class of continuous services. These services persist in time, and are typically characterized by multiple transmissions of data by the service provider as a result of a single request by the user. Examples are multimedia streaming, navigation services, and publish/subscribe services.

This chapter is structured as follows: Section 5.1 introduces continuous services and presents some research issues; Section 5.2 provides an overview of the trigger-based mechanism; Section 5.3 presents optimized algorithms for keeping context information up-to-date during the service provision; and Section 5.4 discusses related work.

5.1 Adaptation of continuous services

Context-awareness is much more challenging for continuous services, since changes in context should be taken into account during service provisioning. As an example, consider an adaptive streaming service. Typically, the parameters used to determine the most appropriate media quality include a number of context parameters, such as an estimate of the available bandwidth, and the battery level on the user's device. Note that this information may be owned by different entities, e.g., the network operator and the user's device, respectively. A straightforward solution is to constantly monitor these parameters, possibly by polling servers in the network operator's infrastructure as well as the user's device for parameter value updates. Moreover, the application logic should internally re-evaluate the rules that determine the streaming bit rate (e.g., "if the user's device is low on memory, decrease the bitrate"). This approach has a number of shortcomings, including: (i) client-side and network resource consumption, (ii) high response times due to the polling strategy, (iii) complexity of the application logic, and (iv) poor scalability, since for every user the service provider must continuously request context data and re-evaluate its rules. The alternative approach we follow is to provide the application logic with asynchronous notifications of relevant context changes, on the basis of its specific requirements. However, when context data must be aggregated from distributed sources which may possibly deliver conflicting values, as

well as provide different dependency rules among context parameters, the management of asynchronous notifications is far from trivial. The straightforward strategy of monitoring all context data, independently of the conflict resolution policies and of the rules affecting the adaptation parameters, would be highly inefficient and poorly scalable. Indeed, computational resources would be wasted by communicating with certain context data sources when not needed, by monitoring unnecessary context data, and by re-computing dependency rules not affecting the adaptation parameters. The main contributions presented in this chapter are:

• The modules of CARE that are devoted to supporting continuous services through asynchronous context change notifications;

• Algorithms to identify context sources to be monitored and specific context parameter thresholds for these sources, with the goal of minimizing the exchange of data through the network and the time to re-evaluate the rules that lead to the aggregated context description.

The modules of CARE that are devoted to keeping context data up-to-date during the service provision have been extensively tested, proving the efficiency of our solution when coupled with an adaptive streaming server. The experiments also include a comparison with a state-of-the-art commercial solution for adaptive streaming. The experimental results are presented in Section 7.2.

5.2 Trigger-Based Mechanism

In this section we describe the main features of the trigger mechanism for supporting continuous services. This mechanism allows profile managers to asynchronously notify the service provider upon relevant changes in profile data, on the basis of triggers. Triggers in this case are essentially conditions over changes in profile data (e.g., the available bandwidth dropping below a certain threshold, or a change of the user's activity) which determine the delivery of a notification when met. In particular, when a trigger fires, the corresponding profile manager sends the new values of the modified attributes to the context provider module, which should then re-evaluate policies.

Figure 5.1: Trigger mechanism

Figure 5.1 shows an overview of the mechanism. To ensure that only useful update information is sent to the service provider, a deep knowledge of the service characteristics and requirements is needed. Hence, the context parameters and associated threshold values that are relevant for the adaptation (named monitoring specifications) are set by the service provider application logic, and communicated to the context provider. Actual triggers are generated by the context provider, according to the algorithms presented in the remainder of this section, and communicated to the proper profile managers. Since most of the events monitored by triggers sent to the upm are generated by the user device, the upm communicates triggers

to a light server module resident on the user's device. Note that, in order to keep the information owned by the upm up-to-date, each user device must be equipped with an application monitoring the state of the device against the received triggers (named monitor in Figure 5.1), and with an application that updates the upm when a trigger fires. Each time the upm receives an update for a profile attribute value that makes a trigger fire, it forwards the update to the context provider. Finally, the context provider re-computes the aggregated profile, and any change satisfying a monitoring specification is communicated to the application logic. In order to show the system behavior, consider the following example.

Example 7 Consider the case of a streaming service, which determines the most suitable media quality on the basis of network conditions and available memory on the user's device. The MediaQuality is determined by the evaluation of the following policy rules:

R1: "If AvBandwidth ≥ 128kbps And Bearer = 'UMTS' Then Set NetSpeed = 'high'"
R2: "If NetSpeed = 'high' And AvMem ≥ 4MB Then Set MediaQuality = 'high'"
R3: "If NetSpeed = 'high' And AvMem < 4MB Then Set MediaQuality = 'medium'"
R4: "If NetSpeed != 'high' Then Set MediaQuality = 'low'"

Rules R2, R3 and R4 determine the most suitable media quality considering network conditions (NetSpeed) and available memory on the device (AvMem). In turn, the value of the NetSpeed attribute is determined by rule R1 on the basis of the current available bandwidth (AvBandwidth) and Bearer. Suppose that a user connects to this service via a UMTS connection, and

that at first the available bandwidth is higher than 128kbps, and the user device has more than 4MB of available memory. Thus, the context provider, evaluating the service provider policies, determines a high MediaQuality (since rules R1 and R2 fire). Consequently, the service provider starts the video provision with a high bitrate. At the same time, the application logic sets a monitoring specification regarding MediaQuality. Analyzing policies, profile resolution directives, and context data, the context provider sets triggers to the opm and to the upm/device, asking for a notification in case the available bandwidth and the available memory, respectively, drop below certain thresholds. Suppose that, during the video provision, the user device runs out of memory. Then, the upm/device sends a notification (together with the new value for the available memory) to the context provider, which merges profiles and re-evaluates policies. This time, policy evaluation determines a lower MediaQuality (since rule R3 fires). Thus, the context provider notifies the application logic, which immediately lowers the video bitrate.
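As an illustration of the device-side part of this loop, the following Python sketch checks the received triggers against the current device state; the trigger thresholds come from Example 7, while the function and callback names are our own illustrative assumptions.

    def monitor_step(profile, triggers, notify_upm):
        """profile: current attribute values on the device; triggers: list of
        (attribute, condition) pairs received from the context provider;
        notify_upm(attribute, value) forwards a fired trigger to the upm,
        which in turn updates the context provider."""
        for attribute, fires in triggers:
            value = profile.get(attribute)
            if value is not None and fires(value):
                notify_upm(attribute, value)

    # Triggers derived from rules R1-R3 of Example 7: notify when the
    # monitored parameters cross the thresholds used by the policies.
    triggers = [("AvMem", lambda mb: mb < 4),
                ("AvBandwidth", lambda kbps: kbps < 128)]

Upon such a notification, the upm forwards the new value to the context provider, which merges profiles, re-evaluates the policies, and informs the application logic only when the monitored MediaQuality actually changes.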

Monitoring specifications In order to keep the re-evaluation of rules to a minimum, it is important to let the application logic precisely specify the changes in context data it needs to be aware of in order to adapt the service. These adaptation needs, called monitoring specifications, are expressed as conditions over changes in profile attributes. As an example, consider the provider of the continuous streaming service shown in Example 7. The application logic only needs to be aware of changes to be applied to the quality of the media. Hence, its only monitoring specification will be:

MediaQuality(X), X ≠ $old_value_MediaQuality.

where $old_value_MediaQuality is a variable to be replaced with the value for the MediaQuality attribute, as retrieved from the aggregated profile. Monitoring specifications are expressed through an extension of the language used to define rule preconditions in our logic programming language [14]. This extension involves the introduction of the additional special predicate difference, which has the obvious semantics with respect to various domains, including spatial, temporal, and arithmetic domains. For instance, the monitoring specification:

Coordinates(X), difference(X, $old_value_Coordinates) > 200 meters

will instruct the context provider to notify changes of the user position greater than 200 meters.
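A possible realization of the difference predicate for the spatial domain is sketched below in Python; the 200-meter threshold is the one from the specification above, while the equirectangular distance approximation is our own simplifying assumption.

    import math

    def difference_exceeds(old, new, threshold_m):
        """old, new: (lat, lon) pairs in degrees. Returns True when the two
        positions differ by more than threshold_m meters, using an
        equirectangular approximation of the distance on the Earth."""
        lat1, lon1, lat2, lon2 = map(math.radians, (*old, *new))
        x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
        y = lat2 - lat1
        return 6371000.0 * math.hypot(x, y) > threshold_m

    # Example: a move of roughly 222 m northwards exceeds the 200 m threshold.
    assert difference_exceeds((45.0, 9.0), (45.002, 9.0), 200)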

5.3 Minimizing unnecessary updates

In general, allowing the application logic to specify the changes in context data it is interested in does not guarantee that unnecessary updates are not sent to the context provider. We define an update to the value of a profile attribute as unnecessary if it does not affect the aggregated profile. In the context of mobile service provisioning, the cost of unnecessary updates is high, in terms of client-side bandwidth consumption (since updates can be sent by the user's device) and of server-side computation, and can compromise the scalability of the architecture. In order to avoid inefficiencies, monitoring specifications are communicated to the context provider, which is in charge of deriving the actual triggers and performing the optimizations that will be described in Sections 5.3.2 and 5.3.3.

5.3.1 Baseline algorithm

Algorithm 4 Baseline algorithm for trigger derivation

Input: Let C be the set of monitoring specifications; c_Ai be a monitoring specification regarding the attribute Ai; R_Ai be the set of rules r_Ai having Ai in their head; r̄_Ai be the rule that determined the value for Ai in the aggregated profile (if such a rule exists); b(r_Ai) be the set of preconditions pc of r_Ai.
Output: A set of directives regarding the communication of triggers to profile managers.

1:  for all c_Ai ∈ C do
2:    for all r_Ai ∈ R_Ai do
3:      for all pc ∈ b(r_Ai) | pc ∉ C do
4:        if r_Ai ≠ r̄_Ai then
5:          C := C ∪ pc
6:        else
7:          C := C ∪ ¬pc
8:        end if
9:      end for
10:   end for
11:   trigger t := "if c_Ai then notify update(Ai)"
12:   communicate(t, ProfileManagers)
13: end for

The baseline algorithm for trigger derivation is shown in Algorithm 4, and consists of the following steps: a) set a trigger regarding the attribute Ai for each monitoring specification c_Ai regarding Ai, b) communicate the trigger to every profile manager, and c) repeat this procedure considering each precondition of the rules having Ai in their head as a monitoring specification. The completeness of Algorithm 4 is shown by the following proposition:

Theorem 3 Given a monitoring specification c_Ai regarding attribute Ai, and a set of policy rules P, the baseline Algorithm 4 calculates a set of triggers t that is sufficient to detect any change in the value of Ai that satisfies the monitoring specification c_Ai.

Proof. The value of an attribute Ai in the aggregated profile can change in three cases: (i) the value of Ai is changed by a profile manager; (ii) the preconditions of the rule r̄_Ai ∈ P that set the value of Ai are no longer satisfied; (iii) the preconditions of other rules r_Ai ∈ P possibly setting a value for Ai are satisfied. Case (i) is addressed by Algorithm 4 in lines 11 and 12. With regard to cases (ii) and (iii), for each monitoring specification c_Ai regarding an attribute Ai whose value was set by rule r̄_Ai, the algorithm creates new monitoring specifications pc for checking that the preconditions of r̄_Ai are still valid (line 7), and for monitoring the preconditions of the other rules r_Ai that can possibly set a value for Ai (line 5). The algorithm is recursively repeated for the newly generated monitoring specifications pc.

Example 8 Consider rule R2 in Example 7. The value of the MediaQuality attribute depends on the values of other attributes, namely NetSpeed and AvMem. Hence, those attributes must also be kept up-to-date in order to satisfy a monitoring specification regarding MediaQuality. For this reason, the context provider sets new monitoring specifications regarding those attributes. This mechanism is recursively repeated. For example, since NetSpeed depends on AvBandwidth and Bearer (rule R1), the context provider generates new monitoring specifications for those attributes.
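A compact Python rendering of this recursive derivation follows; the dictionary-based encoding of rules and the positive/negated flag are our own illustrative simplifications of Algorithm 4.

    def derive_specs(attr, rules_by_head, winner, specs):
        """rules_by_head: attr -> list of (rule_id, precondition_attributes);
        winner: attr -> id of the rule that set the attribute's value in the
        aggregated profile, if any; specs: attr -> 'pos' | 'neg', the set of
        monitoring specifications, extended in place."""
        for rule_id, preconditions in rules_by_head.get(attr, []):
            for pc in preconditions:
                if pc in specs:
                    continue
                # Monitor the winning rule's preconditions for invalidation
                # (negated specification) and the other rules' preconditions
                # for satisfaction (positive specification).
                specs[pc] = "neg" if rule_id == winner.get(attr) else "pos"
                derive_specs(pc, rules_by_head, winner, specs)

For Example 8, starting from a specification on MediaQuality this recursion adds NetSpeed and AvMem, and from NetSpeed it adds AvBandwidth and Bearer.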

The use of the baseline algorithm would lead to a number of unnecessary updates, as will be explained shortly in Example 10. We devised two optimizations, one exploiting profile resolution directives (see Section 3.1.2), and the other exploiting priorities over rules. These optimizations, presented in the remainder of this section, avoid a large number of unnecessary updates while preserving useful ones.

5.3.2 Optimization based on profile resolution directives

A number of unnecessary updates are those that do not affect the profile obtained after the merge operation, as shown by the following lemma.

Lemma 1 Given an aggregated profile p, a set of policy rules R, and a set of profile resolution directives PRD, any changes to profile attributes that do not affect the result of the merge operation do not affect p.

Proof. The aggregated profile p is obtained through the evaluation of the logic program P against the profile obtained after the merge operation.

We recall that P is obtained from R by adding a fact Ai(x) for each context datum Ai = x obtained after the merge operation. P, in turn, is transformed before evaluation considering the profile resolution directives PRD. Since we assume that neither the policies R nor the profile resolution directives PRD can change during service provision, the logic program obtained after these transformations does not change as long as the profile obtained after the merge operation remains the same. Since in Section 4.4.2 we proved the program model uniqueness of policies expressed in our language, we have that different evaluations of the same logic program determine the same aggregated profile p.

Our first optimization considers the profile resolution directives used by the merge operation.

Example 9 Consider the following profile resolution directives, set by the provider of the streaming service cited in Example 7:

PRD1: setPriority AvBandwidth = (OPM, SPPM, UPM) PRD2: setPriority MediaQuality = (SPPM, UPM)

In PRD1, the service provider gives highest priority to the network operator for the AvBandwidth attribute, followed by the service provider and by the user. The absence of a profile manager in a directive (e.g., the absence of the opm in PRD2) states that values for that attribute provided by that profile manager should never be used.

The semantics of merge ensures that the value provided by an entity ei for the attribute aj can be overwritten only by values provided by ei or provided by entities which have higher priority for the aj attribute.

Example 10 Consider the profile resolution directive on the attribute AvBandwidth given in Example 9 (PRD1). Suppose that the opm (the entity with the highest priority) does not provide a value for AvBandwidth, but the sppm and the upm do. The value provided by the sppm is the one that will be chosen by the merge module, since the sppm has higher priority for that attribute. In this case, possible updates sent by entities with lower priority than the sppm (namely, the upm) would not modify the profile obtained after the merge operation, since they would be discarded by the merge algorithm. As a consequence, the context provider does not communicate a trigger regarding AvBandwidth to the upm.

Note that, if the application logic defines a monitoring specification regarding an attribute whose value is null (i.e., an attribute for which no profile manager provided a value), the corresponding trigger is communicated to every entity that appears in the profile resolution directive. The algorithm is shown in Algorithm 5.

Theorem 4 The optimization applied by Algorithm 5 preserves the completeness of the trigger generation mechanism.

Proof. The demonstration easily follows from Lemma 1, since Algorithm 5 determines the communication of triggers to, and only to, those entities whose updates can modify the result of the merge operation (lines 3 to 5, and lines 7 to 9).

Algorithm 5 Communicating triggers to the proper profile managers.
Input: Let (E3, E2, E1) be the priority over entities for the attribute A (E3 having the highest priority); Er be the entity among E = {E1, E2, E3} providing the value obtained by the Merge module for A; p be the aggregated profile, A(X) ∈ p; let the application logic set a monitoring specification c involving A.
Output: A set of directives regarding the communication of triggers to profile managers.
1:  trigger t := "if c then notify update(A)"
2:  if A(X), X = null then
3:    for all Ei ∈ (E3, E2, E1) do
4:      communicate(t, Ei)
5:    end for
6:  else
7:    for j = r to 3 do
8:      communicate(t, Ej)
9:    end for
10: end if

5.3.3 Optimization based on rule priority

The second optimization exploits the fact that an attribute value set by the rule r̄_Ai can be overwritten only by r̄_Ai or by a rule having higher priority than r̄_Ai with respect to the head predicate Ai. As a consequence, values set by rules having lower priority than r̄_Ai are discarded, and do not modify the aggregated profile. For this reason, the preconditions of rules r_Ai having lower priority with respect to r̄_Ai need not be monitored.

Lemma 2 Given an aggregated profile p, a set of policy rules P, and an attribute Ai whose value was set by rule r̄_Ai ∈ P, any rule r_Ai ∈ P having lower priority than r̄_Ai with respect to the head predicate Ai does not affect p, as long as the preconditions of r̄_Ai hold.

Proof. Rules in P having the same attribute Ai in their head are evaluated in decreasing order of priority, and when a rule fires, rules having lower priority are discarded. As a consequence, the evaluation of r̄_Ai precedes the

Algorithm 6 (DIMS) Derivation of implicit monitoring specifications.

Input: Let C be the set of monitoring specifications; c_Ai be a monitoring specification regarding the attribute Ai; R_Ai be the set of rules r_Ai having Ai in their head; p(r_Ai) be the priority of the rule r_Ai; r̄_Ai be the rule that determined the value for Ai in the aggregated profile (if such a rule exists); b(r_Ai) be the set of preconditions pc of r_Ai.
Output: A set of directives regarding the communication of triggers to profile managers.

1:  for all c_Ai ∈ C do
2:    for all r_Ai | p(r_Ai) ≥ p(r̄_Ai) do
3:      for all pc ∈ b(r_Ai) | pc ∉ C do
4:        if r_Ai ≠ r̄_Ai then
5:          C := C ∪ pc
6:        else
7:          C := C ∪ ¬pc
8:        end if
9:      end for
10:   end for
11:   Apply Algorithm 5 to c_Ai
12: end for

evaluation of r_Ai. If the preconditions of r̄_Ai hold, rule r̄_Ai fires, and rule r_Ai is discarded. Thus, r_Ai cannot affect p.

Algorithm 6 (DIMS) is the optimized version of the baseline Algorithm 4.

Generally speaking, for each monitoring specification c_Ai, an implicit monitoring specification is created for each precondition of the rule r̄_Ai that determined the value for Ai, and for the preconditions of the other rules having Ai in their head and having higher priority than r̄_Ai. Rules with lower priority do not generate new monitoring specifications. For each monitoring specification, the context provider creates a trigger and communicates it to the proper profile managers according to Algorithm 5.

Example 11 Consider rules R2, R3 and R4 in Example 7. Suppose that p(R2) > p(R3) > p(R4), where p(R) is the priority of rule R, and that R2 does not fire, while R3 does. In this case, the preconditions of R2 (the only rule with higher priority than R3 in this example) must be monitored, since they can possibly determine the firing of this rule. The preconditions of R4 must not be monitored since, even if they are satisfied, R4 cannot fire as long as the preconditions of R3 are satisfied. The preconditions of R3 must be monitored in order to ensure that the value derived by the rule is still valid. In case the preconditions of R3 no longer hold, rules with lower priority (R4 in this example) can fire, and their preconditions are added to the set of monitoring specifications.
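The following Python sketch adds the priority-based pruning of Algorithm 6 to the baseline derivation sketched after Example 8; again, the data structures are our own illustrative assumptions.

    def dims(attr, rules_by_head, winner, priority, specs, route):
        """Like derive_specs, but rules with lower priority than the winning
        rule are skipped; priority: rule_id -> priority value; route(attr)
        stands for applying Algorithm 5 to the specification on attr."""
        fired = winner.get(attr)
        for rule_id, preconditions in rules_by_head.get(attr, []):
            if fired is not None and priority[rule_id] < priority[fired]:
                continue  # a lower-priority rule cannot affect the profile
            for pc in preconditions:
                if pc in specs:
                    continue
                specs[pc] = "neg" if rule_id == fired else "pos"
                dims(pc, rules_by_head, winner, priority, specs, route)
        route(attr)

On Example 11, this prunes the preconditions of R4, monitors those of R3 in negated form, and monitors those of R2 in positive form.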

Theorem 5 The optimization applied by Algorithm 6 (DIMS) preserves the completeness of the trigger generation mechanism.

Proof. The demonstration follows from Lemma 2. As a matter of fact, for each monitoring specification c_Ai regarding attribute Ai whose value was set by rule r̄_Ai, the DIMS algorithm generates new monitoring specifications considering the preconditions of r̄_Ai and the preconditions of every other rule r_Ai such that p(r_Ai) > p(r̄_Ai) (line 2).

An important complexity result regarding the DIMS algorithm is shown by the following theorem.

Theorem 6 The time complexity of the DIMS algorithm is O(N), where N is the total number of rules in the joined logic program.

Proof. We call P the logic program obtained by joining user and service provider policy rules. According to the DIMS algorithm, triggers are generated by traversing part of the rule dependency graph of P. We recall that, given a logic program P, RDG(P) is a directed graph whose nodes are the rules forming P. The graph contains an edge from R to R' iff the head predicate of rule R' belongs to the set of body predicates of R. Since we guarantee that the rule dependency graph of the logic program P is acyclic (see Section 4.4.2), the DIMS algorithm terminates in at most K steps, where K is the number of rules in P, K ≤ N.

5.4 Bibliographic notes

Many proposed architectures for context-awareness do not explicitly support adaptation for continuous services. On the other hand, some existing architectures supporting this feature are bound to specific platforms. As an example, an extension of Java for developing context-aware applications by means of code mobility is proposed in [1]. Moreover, various commercial products (e.g., RealNetworks Helix Server [91], TomTom Traffic [100]) adopt some form of context-aware adaptation, requiring external entities (typically, applications running on the user device) to cooperate by providing asynchronous updates of context data (e.g., available bandwidth and current location). These approaches are quite common in practice, but they do not provide a general solution to the addressed problem, since they are bound to specific applications. Other frameworks try to optimize and adapt the behavior of applications running on the user device on the basis of context data, reacting to context changes (typically, availability of resources on the device, user location, and available bandwidth). One such architecture is described in [32]. In that architecture, context data are aggregated by modules running on the user device, and kept up-to-date by a trigger mechanism. Users can define policies by specifying priorities among applications as well as among resources of their devices. These policies are evaluated by a proper module, and determine the applications' behavior. Being limited to user-side adaptation, similar proposals cannot be applied to complex Internet services.

A proposal for a flexible mechanism for intra-session adaptation presenting many similarities with our approach can be found in [89]. That architecture includes a module devoted to applying adaptation policies on the basis of context changes. Adaptation policies are represented as ECA (event, condition, action) rules. Actions correspond to directives that modify the service behavior on the basis of conditions on context. Our middleware essentially extends this approach by considering a scenario in which context data and adaptation policies are provided by different entities. Hence, we provide formal techniques for aggregating distributed context data and for solving conflicts between policy rules. As a further contribution, we address the problem of minimizing the number of context change notifications, and the subsequent evaluation of adaptation policies by the architecture. These optimizations are intended to improve the scalability of the framework, especially when considering context data that may continuously change, such as the user location and the available bandwidth. A more recent proposal for a flexible architecture supporting intra-session adaptation is presented in [86]. This proposal includes sophisticated techniques for resolving inconsistencies between conflicting context data; however, mechanisms for minimizing the exchange of data and the re-evaluation of rules are not specifically taken into account. The work on stream data management also has a close connection with the specific problem we are tackling. Indeed, each source of context data can be seen as providing a stream of data for each context parameter it handles. One of the research issues considered in that area is the design of filter bound assignment protocols with the objective of reducing communication cost (see, e.g., [24]). Since filters are the analogue of the triggers used in our approach to issue asynchronous notifications, we are investigating the applicability of some of the ideas in that field to our problem.

Chapter 6

Loosely Coupling Ontological Reasoning with Adaptation Policies

Context-awareness in mobile and pervasive computing requires the acquisition, representation and processing of information that goes beyond the device features, network status, and user location, to include semantically rich data, like user interests and the user's current activity. On the other hand, when services have to be provided on-the-fly to many mobile users, the efficiency of reasoning with these data becomes a relevant issue. Experimental evidence has led us to consider a tight integration of ontological reasoning with rule-based reasoning at the time of request currently impractical.

The necessity of including ontological reasoning in our framework originates from the fact that (as others have independently observed in [60]) we found CC/PP to have serious limitations in representing non-shallow context data. The limitations are particularly evident when trying to represent context information regarding the socio-cultural environment of the user. A switch to ontology languages for the representation of context seems to be advocated by many researchers in the field (e.g., [23, 47]). However, this switch introduces the problem of the classical trade-off between expressiveness and efficiency. Indeed, it is well known that reasoning with the logics underlying ontologies is in most cases intractable [7].

Our solution is based on the integrated representation of shallow context data (e.g., device capabilities and network parameters) and context data belonging to more complex domains (e.g., user activity and interests) by means of CC/PP profiles that contain references to ontological classes and relations. In order to preserve efficiency, ontological reasoning with non-shallow data is mainly performed in advance with respect to the service provision. Ontological reasoning at the time of the service request is performed on-demand only in particular cases, i.e., when the integrated profile lacks some context information that is crucial for providing the service. Indeed, pragmatically analyzing various case studies, we believe that there are only a few cases in which non-shallow data cannot be calculated in advance. Moreover, in order to keep complexity acceptable for mobile computing services, our approach is to perform ontological and rule-based reasoning separately, and to use the ontology query engines provided by well-known tools like Racer [50] as an intermediate layer.

This chapter is structured as follows: Section 6.1 illustrates our context modeling approach; Section 6.2 provides some basic notions about description logics; Section 6.3 describes how ontological reasoning is performed in the CARE framework; Section 6.4 shows experimental results with an ontology we have developed for modeling the activities of mobile users; and Section 6.5 presents some related work.

6.1 Context Modeling

As outlined before, the context information we intend to model includes, in principle, any information that can be used to characterize the situation [31] of a mobile user requesting a service. It includes spatio-temporal information (e.g., user's location, time), environmental conditions (e.g., lighting, noise level), data about the technological infrastructure (e.g., device features, network connections, available bandwidth), user preferences, as well as socio-cultural information (e.g., user's current activity and interests). We recall that in our framework we use the term profile to indicate a subset of context information collected and managed by a certain entity. We divide profile data into two classes: shallow profile data and ontology-based profile data.

6.1.1 Shallow Profile Data

We denominate shallow profile data those attributes that can be modeled in a natural way by using attribute/value pairs, provided that the semantics of the attributes and of their allowed values is clear. This class contains data about environmental conditions and the technological infrastructure, but only a few attributes regarding the user and socio-cultural information can be modeled in this way. As explained in Section 4.1, we represent this type of data by means of CC/PP profiles. Since existing CC/PP vocabularies mostly cover only hardware, software, and network capabilities of mobile devices, we have extended them to include a much richer set of context data.

6.1.2 Ontology-based profile data

The CC/PP language has many shortcomings when it comes to modeling non-shallow profile data like, e.g., user activities (see, e.g., [60]).

Figure 6.1: An excerpt of a profile (a CC/PP fragment with a UAProf memory attribute valued 37.45 and a currentLocationType attribute valued "outdoor")

Indeed, CC/PP vocabularies define both the semantics of each attribute and the list of its possible values in natural language by using the resource, leading to possibly different interpretations. Moreover, the 2-level structure (components and attributes) imposed by CC/PP greatly affects its expressive power. For representing non-shallow profile data a natural choice is using ontologies; in fact, they have a higher expressive power than CC/PP and, in most cases, offer reasoning services. In our framework adopting ontologies has two main purposes. First of all, public/shared ontologies support knowledge sharing among the various involved entities. For instance, if the user's interests/expertise are described via a shared ontology, the service provider can correctly interpret them without risking misunderstandings.

Naturally, in these cases, the ontology hierarchy can be traversed to look for more specialized/general terms. Secondly, ontologies are used for consistency checking of contextual data instances and for reasoning, e.g., automatically deriving, based on other context data, that the user is indeed busy in an "InternalMeeting". In this second case, ontologies can be private to a specific profile manager. We currently use OWL-DL [58] as the ontology language, both because we want to take advantage of the reasoning services it supports and because it is becoming a de-facto standard in various application domains. However, mainly for interoperability purposes, differently from other approaches [23, 103], we decided to keep storing all of our profile data in CC/PP profiles, but linking those attributes modeling non-shallow context data to the ontology concepts that formally define their semantics. In order to adhere to the CC/PP specification, the mapping between a CC/PP attribute and an ontology concept is defined in the vocabulary which defines the attribute, using the resource. Figure 6.1 shows an excerpt of a profile containing both kinds of attributes. The attribute belonging to the first component refers to the UAProf [83] CC/PP vocabulary and represents the available storage memory of the user's device. On the contrary, the second component refers to an ontology modeling user activities (see Section 6.3.1). The semantics of the attribute currentLocationType and of its value ("outdoor") is provided by the fragment of the ontology defining properties and relationships of the concepts related to "places".

6.2 Basic notions on Description Logics

Knowledge Representation is a field of Artificial Intelligence that aims at defining representation formalisms that are adequate for representing objects belonging to a particular domain, providing both a formal semantics of data and computationally feasible reasoning procedures. Description logic [7] is a family of knowledge representation languages, evolved from earlier network-based representation structures, that allows the definition of knowledge bases, and reasoning procedures for executing inferences over them. Currently, description logic is the preferred class of languages for modelling formal ontologies [105].

By means of description logic languages, it is possible to model a particular domain by defining concepts, relations between them (roles), and individuals. Complex descriptions of concepts and roles can be built by composing elementary descriptions by means of the operators provided by the particular description logic language.

The formal semantics of a description logic is given in terms of an interpretation I, which is composed of a non-empty set ∆^I (the domain of the interpretation), and of an interpretation function ·^I. The interpretation function assigns every atomic concept A to a subset of ∆^I, and every atomic role R to a binary relation R^I ⊆ ∆^I × ∆^I.

Knowledge bases in description logics are composed of a pair ⟨T, A⟩. The TBox T constitutes the terminological part of the knowledge base. It is composed of a set of axioms having the form C ⊑ D or R ⊑ S (inclusions) and C ≡ D or R ≡ S (equalities), where C and D are concepts, and R and S are roles. An axiom C ⊑ D is satisfied by an interpretation I when C^I ⊆ D^I. An interpretation I satisfies a TBox T when I satisfies all the axioms of T. The TBox describes intensional knowledge about the considered domain (i.e., the definition of concepts and roles).

On the other hand, the ABox A constitutes the assertional part of the knowledge base. It is composed of a set of axioms of the form x : C and ⟨x, y⟩ : R, where x and y are individuals, C is a concept, and R is a role.

Axioms x : C and ⟨x, y⟩ : R are satisfied by an interpretation I when x^I ∈ C^I and ⟨x^I, y^I⟩ ∈ R^I, respectively. An interpretation I satisfies an ABox A when I satisfies all the axioms of A. The ABox describes extensional knowledge about the individuals belonging to the considered domain. An interpretation I that satisfies both the TBox T and the ABox A is called a model of ⟨T, A⟩. Description logic allows the execution of various reasoning tasks, including:

• Satisfiability: a concept C is satisfiable with respect to a TBox T iff there exists some model I of T such that C^I is not empty;

• Subsumption: a concept C is subsumed by a concept D with respect to a TBox T iff C^I ⊆ D^I for every model I of T;

• Equivalence: concepts C and D are equivalent with respect to a TBox T iff C^I = D^I for every model I of T;

• Disjointness: concepts C and D are disjoint with respect to a TBox T iff C^I ∩ D^I = ∅ for every model I of T;

• Consistency of an ABox A with respect to a TBox T: an ABox A is consistent with respect to a TBox T iff an interpretation I exists which is a model of ⟨T, A⟩;

• Classification: computing the hierarchy of the atomic concepts in T;

• Instance retrieval: retrieving all the instances in A that belong to a given concept C;

• Realization: computing the most specific atomic concepts in T that are instantiated by a given individual.

6.3 Ontological Reasoning

Ontological reasoning is supported by ontology languages, like OWL-DL, that can be mapped to certain classes of description logics [7]. Reasoning services are based on subsumption computation for these logics, and usually include consistency checking and classification, as well as checking for instances of specific concepts based on their properties. While the use of ontologies for the Semantic Web is relatively new, research on subsumption computation is not, and subsumption is well known to be intractable even for relatively simple logics [7]. Despite the progress made by reasoner implementations, the delay that the reasoning services inevitably introduce in service provisioning is one of the motivations for performing ontological reasoning off-line at each profile manager. In selected cases only, ontological reasoning can be performed on-demand by the Context Provider. In particular, on-demand ontological reasoning is performed when the profile lacks some crucial context information that can possibly be obtained after populating a shared ontology with the integrated profile.

6.3.1 Off-line ontological reasoning

User and service provider profile managers use shared and private ontologies to represent non-shallow context data as explained in Section 6.1.2. The goal of ontological reasoning is to provide the additional services of consistency checking and of implicit context data derivation.

An ontology for modeling the socio-cultural context of mobile users

Figure 6.2 shows part of the OWL-DL ontology we defined for modeling the socio-cultural environment of mobile users. The ontology is intended to be (locally) maintained by an entity trusted by the user, in our case the UPM; in fact, it is populated with the user's sensitive data.

Figure 6.2: An excerpt of an ontology modeling the socio-cultural context of mobile users

The ontology consists of nearly 150 classes and relations that describe features such as the user's activities (actions, movements, . . . ), interests, contacts, calendar items, and places. As an example, the UnimiInternalMeeting and UnimiEmployee classes are defined as follows (see also Figure 6.2):

UnimiInternalMeeting ≡ Activity ⊓ (≥2 Actor) ⊓ ∀Actor.UnimiEmployee ⊓ ∃Location.UnimiLocation

UnimiEmployee ≡ Employee ⊓ ∃Employer.{unimi}

In order to model some more general concepts, such as time and place, we adopted well-known ontologies (e.g., DAML-Time). Example 12 shows an application of our ontology for determining the current activity of a user, and Section 6.4 briefly reports preliminary performance results about off-line reasoning with this ontology.

Consistency checking and derivation of implicit context data

Consistency is crucial in the definition of an ontology as well as in its population. Indeed, when the ontology is populated with instances obtained from the local repositories of the profile manager, consistency checking is performed in order to capture possible inconsistencies (e.g., the same instance belonging to disjoint classes, or a person localized in different rooms at the same time). More importantly, ontological reasoning is performed by the UPM and by the SPPM in order to derive new context data. The ability of ontological reasoning to derive new context data introduces the second argument in favor of this technique: privacy. Indeed, when ontological reasoning is performed by the UPM, the profile manager may release to the service provider some high-level context information without releasing the details that have been used to derive that information. The following example illustrates this aspect:

Example 12 Consider the case of Alina, the user of an adaptive messaging service. The service properly filters and redirects messages by considering various context data, including the user's current activity. User activities are modeled by the ontology described in Section 6.3.1. Alina is currently in her office with Will, a colleague of hers. Her calendar has an entry about a scheduled meeting between them, and she keeps Will's contacts in her cell phone. Since the UPM is a trusted entity, it has read access to all these data. Since Alina and Will are both employed by the same organization and are currently together in a place belonging to that organization, the ontology reasoner derives that their current activity (and in particular, Alina's current activity) is "InternalWorkMeeting". Hence, upon the Context Provider's request, this information is given to the service provider, which accordingly applies an adaptation policy, for example, redirecting non-priority calls to Alina's answering machine.

Note that the service provider only learns Alina's current activity, and does not learn other sensitive information such as her current location, her contacts, and the people she is with.

Activation rules for ontological reasoning execution

The execution of off-line ontological reasoning is controlled by each single profile manager (UPM and SPPM) by means of ontological reasoning activation rules. In fact, each profile manager knows exactly which contextual data are modeled by its own local ontologies, and therefore under which conditions (e.g., a change in a specific context attribute) ontological reasoning can possibly produce new implicit contextual data. For instance, the UPM can decide to execute ontological reasoning any time the user adds a new appointment to her calendar, in an attempt to better specify the user's activity. These activation rules are essentially conditions over changes in profile data that fire the execution of ontological reasoning when met. For instance, the following rule determines the execution of ontological reasoning when the user's location changes by more than 100 meters:

If changes(currentLocation,100m) then execOntReas

Of course, both the UPM and the SPPM are provided with a monitor module, which is in charge of monitoring changes in context data that can fire activation rules. Whenever a rule fires, the profile manager executes ontological reasoning and properly updates the profile data.
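As a rough illustration, the following Java sketch shows how such a monitor module might represent and evaluate activation rules; all names (ActivationRuleMonitor, ProfileEvent, execOntReas) are hypothetical placeholders of our own, not the actual CARE implementation.

import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of an activation-rule monitor; names are
// illustrative and do not reproduce CARE's actual classes.
final class ActivationRuleMonitor {

    // A change in a monitored profile attribute, e.g. currentLocation.
    record ProfileEvent(String attribute, Object oldValue, Object newValue) {}

    // An activation rule: a named condition over profile changes.
    record ActivationRule(String name, Predicate<ProfileEvent> condition) {}

    private final List<ActivationRule> rules;
    private final Runnable execOntReas; // starts the ontology reasoner

    ActivationRuleMonitor(List<ActivationRule> rules, Runnable execOntReas) {
        this.rules = rules;
        this.execOntReas = execOntReas;
    }

    // Called by the profile manager whenever profile data change.
    void onProfileChange(ProfileEvent event) {
        for (ActivationRule rule : rules) {
            if (rule.condition().test(event)) {
                execOntReas.run(); // run reasoning, then update profile data
                return;            // one execution per change suffices
            }
        }
    }
}

For instance, a rule equivalent to the one shown above would test that event.attribute() equals currentLocation and that the distance between the old and the new value exceeds 100 meters.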

6.3.2 On-demand ontological reasoning

In particular cases, contextual data can be derived through ontological reasoning only by populating the ontology with information provided by different entities. In this case, reasoning must be performed on-demand at the time of the service request. For the sake of simplicity, we describe the mechanism of on-demand ontological reasoning by means of the following example.

Example 13 Consider the case of John, the user of a location-based recommendation service. The service provides mobile users with a list of events ordered according to their proximity to the user and to the user's specific interests. Suppose that John submits a query regarding music to the service. For this reason, the attribute MusicPreferences is crucial for provisioning the service. Suppose the integrated profile does not contain entries for the MusicPreferences attribute, but only a list of John's preferred artists. So, the Context Provider populates a shared ontology modeling music genres and artists with the integrated profile. Then, the Context Provider performs ontological reasoning, inferring from John's preferred artists that his favorite music genre is R&B. Finally, the Context Provider adds this information to the integrated profile, and the application logic orders the events accordingly.

In order to specify which attribute values are necessary for provisioning a specific service, the service provider can mark part of the profile data as crucial, using the interface for inserting rules and required context data. When, after the evaluation of rules by the IE, an attribute marked as crucial has no associated values, the Context Provider performs on-demand ontological reasoning, populating a shared ontology with data obtained from the integrated profile. These data may include contextual information that can assist ontological inferences and that is not available before the user's request. After having populated the ontology, the Context Provider performs ontological reasoning in order to derive values for crucial attributes, and adds their values (which can possibly be null) to the integrated profile. Finally, the integrated profile is sent to the service provider application logic. If some crucial attributes still have no values, the application logic can choose to deny the service, or to ask the user to explicitly specify a value for them.
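The following Java sketch illustrates this control flow on the Context Provider side; the IntegratedProfile and SharedOntology types are placeholders of our own invention, not CARE's real interfaces.

import java.util.List;

// Hypothetical sketch of on-demand ontological reasoning for crucial
// attributes; all types below are illustrative placeholders.
final class OnDemandReasoning {

    interface IntegratedProfile {
        boolean hasValue(String attribute);
        void add(String attribute, Object value);  // value may be null
        Object asOntologyInstances();              // profile data as ABox facts
    }

    interface SharedOntology {
        void populate(Object instances);           // fill the ABox
        Object deriveValue(String attribute);      // DL reasoning; may return null
    }

    // Fills crucial attributes left without values by rule evaluation.
    static void fillCrucialAttributes(List<String> crucial,
                                      IntegratedProfile profile,
                                      SharedOntology ontology) {
        boolean populated = false;
        for (String attr : crucial) {
            if (profile.hasValue(attr)) continue;  // nothing to derive
            if (!populated) {                      // populate the ABox once
                ontology.populate(profile.asOntologyInstances());
                populated = true;
            }
            profile.add(attr, ontology.deriveValue(attr));
        }
        // The integrated profile is then sent to the application logic,
        // which may deny the service or ask the user for missing values.
    }
}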

6.4 Experimental Evaluation

We performed some experiments on executing on-line and off-line reasoning with the OWL-DL ontology, presented in Section 6.3.1, that we defined for modeling the socio-cultural environment of mobile users. We recall that the ontology contains nearly 150 classes and relations. The reasoning task we performed corresponds to the realization of an instance CurrentActivity belonging to the Activity class. This experimental setup simulates the case in which ontological reasoning is used to derive the specific activity that is currently performed by the user. As explained before, in our approach ontological reasoning is executed mostly off-line. To keep computational times acceptable for interactive services, we perform on-line ontological reasoning only on small subsets of the whole ontology, populating the ABox with only those instances that are necessary for deriving new context data. For instance, in part of the experiments presented in this section we use the subset of our ontology (composed of 12 classes) that is sufficient to define a particular activity we are interested in. Ontological reasoning is performed by the Racer [50] ontology reasoner, on a two-processor Xeon 2.4 GHz workstation with 1.5 GB of RAM, running a Linux operating system. Results are calculated as the average of ten runs. Standard deviation is shown in the plots. Experimental results with a realistic ontology (having more than 500 classes and more than 2000 instances) show that the execution time of reasoning tasks like instance realization is in the order of seconds. As a consequence, ontological reasoning at the time of the service request is infeasible for most Web applications and services, and should be executed asynchronously with respect to the service requests. On the other hand, on-line ontological reasoning is feasible when executed on simple ontologies populated by a small number of instances.
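For concreteness, the realization task used throughout the experiments can be expressed in Java against a thin reasoner client; the DlReasonerClient interface below is a simplification of our own for illustration and does not reproduce Racer's actual API.

import java.util.List;
import java.util.Set;

// Hypothetical thin client for a DL reasoner such as Racer; the
// interface is a simplification, not the reasoner's real API.
interface DlReasonerClient {
    void assertInstance(String individual, String concept);
    void assertRelation(String subject, String role, String object);
    Set<String> realize(String individual); // most specific atomic concepts
}

final class CurrentActivityRealization {
    // Mirrors the experimental task: assert the request-time facts,
    // then ask for the realization of currentActivity.
    static Set<String> classify(DlReasonerClient reasoner,
                                String building, List<String> actors) {
        reasoner.assertInstance("currentActivity", "Activity");
        reasoner.assertRelation("currentActivity", "Location", building);
        for (String actor : actors) {
            reasoner.assertRelation("currentActivity", "Actor", actor);
        }
        // May return, e.g., {"UnimiInternalMeeting"} for suitable fillers.
        return reasoner.realize("currentActivity");
    }
}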

6.4.1 Experiment A: Ontological reasoning with increasing ABox size (instances obtained from the aggregated profile)

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with a growing number of instances added to the ABox at the time of the service request. Since these instances are gathered from the aggregated profile, they are not known a priori. Hence, realization must be performed to assign these new instances to the classes they belong to.

TBox The TBox consists of the 12 classes that are used to define the UnimiInternalMeeting concept.

ABox Before the service request, the ABox contains:

• a variable number k of UnimiEmployees; k corresponds to the values of the x axis of plots in Figures 6.3 and 6.4;

• 1 UnimiBuilding.

At the time of the service request, the ABox is filled with the following (rendered in DL notation after the list):

• the CurrentActivity instance;

• 1 relation that links CurrentActivity to the building in which it is performed;

• k relations that link CurrentActivity to its actors.
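In DL notation, and with illustrative individual names of our own (b for the building, emp1, . . . , empk for the employees), the request-time additions are:

currentActivity : Activity
⟨currentActivity, b⟩ : Location
⟨currentActivity, empi⟩ : Actor    (i = 1, . . . , k)

Realization must then assign currentActivity to its most specific atomic concepts (in this setup, UnimiInternalMeeting when the actor and location fillers satisfy its definition).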

Figure 6.3: Results of Experiment A: Ontological reasoning with increasing ABox size

Results Experimental results are shown in Figures 6.3 and 6.4. Execution times grow exponentially with the number k of instances added to the ABox at the time of the service request (see Figure 6.3). As shown in Figure 6.4, execution times with this subset of our ontology are acceptable (i.e., less than 100 ms) only when the number of instances added at the time of the service request is small (fewer than 30).

Figure 6.4: Results of Experiment A with a small number of instances

6.4.2 Experiment B: Ontological reasoning with increasing ABox size (instances known a priori)

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with a growing number of instances added to the ABox before the service request. Since these instances are in the ABox before the service request, they can be realized in advance. Hence, in this case the only reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of the 12 classes that are used to define the UnimiInternalMeeting concept.

ABox Before the service request, the ABox contains:

• 5 UnimiEmployees;

• 1 UnimiBuilding;

• k instances for each one of the remaining 10 classes; k corresponds to the value of the x axis of plots in Figures 6.5 and 6.6.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• 1 relation that links CurrentActivity to the building in which it is performed;

• 5 relations that link CurrentActivity to its actors.

Figure 6.5: Results of Experiment B: Ontological reasoning with increasing ABox size

Figure 6.6: Results of Experiment B with a small number of instances

Results Experimental results are shown in Figures 6.5 and 6.6. Execution times grow linearly with the number of instances added to the ABox before the service request (see Figure 6.5). The results obtained when adding a small number of instances exhibit an anomalous behaviour (see Figure 6.6), which was confirmed when executing the experiments on a different machine; these results are probably due to the internal behaviour of the Racer reasoner. Even though the execution time of ontological reasoning increases with the cardinality of the ABox population, performance is not too badly affected: for instance, when the ABox is filled with 150 instances, the execution time merely doubles.

6.4.3 Experiment C: Ontological reasoning with increasing ABox size (instances not involved in the reasoning task)

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with a growing number of instances that are not involved in the realization of the CurrentActivity instance. These instances are added to the ABox before the service request; hence, they can be realized in advance. In this experiment, the only reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of:

• the 12 classes that are used to define the UnimiInternalMeeting concept;

• 100 classes that are not involved in the definition of Activity and its subclasses.

ABox Before the service request, the ABox contains:

• 5 UnimiEmployees;

• 1 UnimiBuilding;

• k instances belonging to the classes that are not involved in the definition of Activity and its subclasses; k corresponds to the values of the x axis of the plot in Figure 6.7.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• 1 relation that links CurrentActivity to the building in which it is performed;

• 5 relations that link CurrentActivity to its actors.

Figure 6.7: Results of Experiment C: Ontological reasoning with increasing ABox size

Results Experimental results are shown in Figure 6.7. Execution times grow linearly with the number of instances belonging to the classes that are not involved in the on-line ontological reasoning. These results essentially confirm the ones presented in Experiment B. The fact that the instances are not involved in the ontological reasoning slightly improves the execution times. It should also be noted that the TBox in this experiment (composed of 112 classes) is considerably larger than the one used in Experiment B (composed of 12 classes).

6.4.4 Experiment D: Ontological reasoning with increasing TBox size

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with growing size of the TBox. In particular, in this experiment the TBox is filled with a growing number of classes that are not involved

in the ontological reasoning task (i.e., classes that are not involved in the definition of subclasses of Activity). These classes are added to the TBox before the service request; hence, classification can be performed before the service request, and does not affect the execution time of on-line ontological reasoning. In this experiment, the only reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of:

• the 12 classes that are used to define the UnimiInternalMeeting concept;

• k classes that are not involved in the definition of Activity and its subclasses; k corresponds to the values of the x axis of the plot in Figure 6.8.

ABox Before the service request, the ABox contains:

• 5 UnimiEmployees;

• 1 UnimiBuilding.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• 1 relation that links CurrentActivity to the building in which it is performed;

• 5 relations that link CurrentActivity to its actors.

Figure 6.8: Results of Experiment D: Ontological reasoning with increasing TBox size

Results Experimental results are shown in Figure 6.8. Execution times remain constant, independently of the number of classes that are not involved in the definition of Activity and its subclasses. These results show that the total size of the ontology TBox does not affect the execution times of ontological reasoning, provided that the introduced classes are not involved in the reasoning tasks.

6.4.5 Experiment E: Ontological reasoning with increasing TBox and ABox size

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with growing size of the TBox and of the ABox. In particular, in this experiment the TBox is filled with a growing number of classes that are involved in the ontological reasoning task. A growing number of instances are also added to the other classes that are involved in the reasoning task. In

this experimental setup, the added classes are subclasses of Activity. These classes are added to the TBox before the service request; hence, classification can be performed before the service request, and does not affect the execution time of on-line ontological reasoning. Instances belonging to the various classes are also added to the ABox before the service request; then, their realization can be performed off-line. In this experiment, the reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of:

• the 12 classes that are used to define the UnimiInternalMeeting concept;

• k subclasses of Activity; k corresponds to the values of the x axis of plots in Figures 6.9 and 6.10.

ABox Before the service request, the ABox contains:

• 5 UnimiEmployees for each subclass of Activity;

• 1 UnimiBuilding for each subclass of Activity.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• k relations that link CurrentActivity to the buildings in which it is performed;

• 5k relations that link CurrentActivity to its actors.

Figure 6.9: Results of Experiment E: Ontological reasoning with increasing TBox and ABox size

Results Experimental results are shown in Figures 6.9 and 6.10. The execution time grows exponentially with the number of classes involved in the reasoning task (see Figure 6.9). However, execution times in this case are acceptable (i.e., less than 100 ms) when the TBox is filled with fewer than 30 subclasses of Activity (see Figure 6.10). If the ontology contains a larger number of activities, ontological reasoning at the time of the service request is infeasible for most Web applications, and should be executed asynchronously with respect to the service request.

Figure 6.10: Results of Experiment E with a small number of classes

6.4.6 Experiment F: Ontological reasoning with a realistic ontology and increasing number of derived activities

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with a growing number of activities derived when realizing the CurrentActivity instance (i.e., the current activity of a user). In this experiment we tried to reproduce a realistic ontology, which describes more than 100 activities and 400 further concepts (with a maximum depth of 5 in the subclass hierarchy). The ABox contains more than 2000 instances. The relations that link CurrentActivity to other instances (actors and buildings) are chosen so as to determine a growing number of activities (from 1 to 100) as the result of the realization of CurrentActivity. The reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of:

• the 12 classes that are used to define the UnimiInternalMeeting concept;

• 100 further subclasses of Activity (similar to the previous ones, but with different cardinality restrictions);

• 400 classes that are not involved in the definition of Activity and its subclasses.

ABox Before the service request, the ABox contains:

• 2000 instances belonging to classes not involved in the reasoning task;

• 125 Persons;

• 25 Buildings.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• 25 relations that link CurrentActivity to the buildings in which it is performed;

• 125 relations that link CurrentActivity to its actors.

Results Experimental results are shown in Figure 6.11. Execution times grow linearly with the number of derived activities. However, execution times with this experimental setup are infeasible for most Web applications when reasoning is performed at the time of the service request.

Figure 6.11: Results of Experiment F: Ontological reasoning with a realistic ontology and increasing number of derived activities

6.4.7 Experiment G: Ontological reasoning with a realistic ontology and increasing ABox size

This experiment aimed at evaluating the feasibility of on-line ontological reasoning with a realistic ontology and an increasing number of instances populating the ABox. The terminological part of the ontology is the same used in Experiment F. The ABox contains 2000 instances belonging to classes that are not involved in reasoning (i.e., classes that are not subclasses of Activity), and a growing number of instances that belong to classes involved in the ontological reasoning. Since these instances are added to the ABox before the service request, their realization can be performed off-line. The only reasoning task to be performed at the time of the service request is the realization of the CurrentActivity instance.

TBox The TBox consists of:

• the 12 classes that are used to define the UnimiInternalMeeting concept;

• 100 further subclasses of Activity (similar to the previous ones, but with different cardinality restrictions);

• 400 classes that are not involved in the definition of Activity and its subclasses.

ABox Before the service request, the ABox contains:

• 2000 instances belonging to classes not involved in the reasoning task;

• 5 Persons;

• 1 Building;

• k instances belonging to classes involved in the reasoning task; k corresponds to the values of the x axis of the plot in Figure 6.12.

At the time of the service request, the ABox is filled with:

• the CurrentActivity instance;

• 1 relation that links CurrentActivity to the building in which it is performed;

• 5 relations that link CurrentActivity to its actors.

Results Experimental results are shown in Figure 6.12. Execution times grow linearly with the number of instances that belong to classes involved in the reasoning task. When the TBox contains a large number of classes (more than 500 in this setup) and the ABox is filled with a large number of instances (more than 2000), ontological reasoning execution time is in the order of seconds, even if only a few instances belong to classes involved in the reasoning task.

Figure 6.12: Results of Experiment G: Ontological reasoning with a realistic ontology and increasing ABox size

6.5 Bibliographic notes

The adoption of ontologies for context-awareness purposes is not new, and it has been increasing in the last few years. For instance, in CoBrA [22] the context is modeled via a shared OWL ontology. The same group is working on defining SOUPA [23], a standard ontology for the specific domain of ubiquitous and pervasive applications. The main purpose, in this case, is to support knowledge sharing and interoperability in ambient intelligence scenarios, where efficiency is not the main focus. The context modeling of SOCAM is based on ontologies too [47]. A centralized module collects context information, and a reasoning engine evaluates first-order logic rules for inferring new context information. Similarly to SOUPA, Wang et al. [103] define the CONON ontology, which is composed of a general-purpose upper ontology and application-specific lower ontologies. Reasoning is performed in real time and is based on both description logic and user-defined logic rules. However, the authors admit that this approach is unsuitable for time-critical applications.

The possibility of overcoming the restrictions of rule-based and description logic (DL) reasoning through a form of combination of these classes of languages has been widely investigated. Early proposals date back to some of the so-called "second generation DL systems" (e.g., CLASSIC [16] and LOOM [74]). This research field has recently gained new momentum from the semantic web community. One of the main research issues in that area is to provide very expressive DL languages with the possibility of performing powerful reasoning tasks not only on terminological knowledge (classes and relations) but also on assertional knowledge (instances). To this end, SWRL [57] extends OWL-DL and OWL-Lite with Horn clauses. The extended languages overcome most of the expressive restrictions of the primitive ones, but reasoning tasks in the new languages are undecidable, and the development of optimized tools for reasoning with SWRL subsets is currently at an early stage. Other recent proposals try to combine DL and logic programming while keeping reasoning decidable, imposing constraints on the form of the rules [79] and/or adopting quite simple DL languages [44]. However, decidability does not guarantee in general that reasoning is computationally feasible. Thus, in our opinion, similar approaches are unsuited for modeling context for the provision of adaptive real-time services, especially if (in particular cases) reasoning must be performed at the time of the service request. For this reason, in our framework ontological and rule-based reasoning are performed separately. The main characteristic of this approach is that the evaluation of rules does not affect the assertional part of the ontology (ABox); i.e., the information flow is one-way, from the ABox to the logic program knowledge base. This feature clearly limits the expressive power of the language, compared with logics in which the information flow is bi-directional (e.g., [57, 79]). The main benefit of this approach is that, when on-demand ontological reasoning is not performed, complexity remains the same as that of the adopted policy language (i.e., linear in the number of rules); in the other case, complexity is the same as that of the description logic reasoning tasks.

Chapter 7

Prototype Applications

Various prototype applications have been developed for experimenting with the CARE framework. These applications exploit CARE for obtaining context data that are used for adaptation and personalization. In this chapter we present three prototype applications: Section 7.1 presents the POIsmart system for the management, sharing, and retrieval of an extended form of points of interest; Section 7.2 illustrates an adaptive streaming server coupled with CARE, and presents experimental results; Section 7.3 presents an architecture for adaptive transcoding services that exploits CARE for obtaining context data.

7.1 A context-aware architecture for management and retrieval of extended points of interest

There is a rapidly growing number of users of GPS-enabled mobile devices, and this or a similar technology for accurate localization will eventually be integrated in most mobile phones. As a consequence, a number of services providing location-based information are now available to mobile users. However, these services generally do not take into account data other than the current location of the user. The aim of the prototype that we illustrate in this section is to take advantage of the CARE framework to provide a context-aware service for resource localization that considers not only location but a wider set of context data, including personal interests, device features, and user preferences. In particular, this prototype architecture is devoted to the management and sharing of an extended notion of points of interest.

7.1.1 An overview of the POIsmart system

The notion of point of interest used by navigation software to trace and highlight resources possibly interesting to the user is analogous to that of a Web page bookmark, except that coordinates are used to identify the resource instead of a URL. It is also likely that some resources have both a Web site and a physical location. Consider, for example, a university department, a restaurant, or a museum. We call POIsmart the object used to describe the geographical as well as the virtual location of a resource. We have developed a distributed architecture for managing, sharing, and searching this particular type of object. The main features we envision for our POIsmart management system are: (1) a POIsmart server enabling each user to access his POIsmarts

from different devices (desktop, PDA, cellular phone, . . . ); (2) a facility to share POIsmarts with other users; (3) an adaptive categorization system, suggesting appropriate folders upon user bookmarking of specific POIsmarts; (4) a search facility to quickly find private and/or shared POIsmarts based on context data and free-text search terms. The architecture presented in this dissertation is integrated with the CARE middleware for obtaining context data used to personalize and adapt the service (e.g., ranking POIsmarts according to the user's location and interests). A number of shareware utilities have been developed for Web bookmarks, mainly addressing features (1) and (3), and prototypes have been developed to address (2) and (4) (see e.g. [72]), but none of them integrates these features nicely, and, to our knowledge, an extension to manage physical locations has never been considered. On the other hand, the idea of bookmarking physical locations is not new [19], and in recent years it has been exploited for different purposes in the fields of ubiquitous and pervasive computing. For example, [84] and [17] present architectures for managing virtual notes that allow the user to attach comments, reminders, and multimedia resources to objects identified by a physical location. A key difference with our work is that these systems are mainly intended for personal use, while in our proposal the sharing of POIsmarts between (communities of) users is one of the main goals. The introduction of sharing poses additional issues involving the management and search of resources.

7.1.2 Architecture

The system architecture we propose, sketched in Figure 7.1, is composed of two levels:

• POIsmart Servers (PS) for the management of the device-independent representation of POIsmarts;

• Client Systems (CS) as device-dependent interfaces for the client-side management and search of POIsmarts.

Figure 7.1: The architecture for POIsmart management

POIsmart Servers

The main component of the architecture is the POIsmart Server (PS), which manages the representation of POIsmarts for one or more users.

Figure 7.2: The POIsmart Server

There are three types of requests received by each PS from a client system: (a) requests for specific folders identified by their name in the POIsmart hierarchy; (b) queries based on search terms and context data; (c) requests to add, delete, or modify POIsmarts and/or folders. The PS answers requests of type (a) and (b) by providing folders in the format appropriate for the specific browser and device. Requests of type (b) are answered using a multi-feature query engine and machine learning techniques (see Section 7.1.3). Similar techniques are used to handle requests to insert new POIsmarts, by suggesting a list of candidate folders where to store them. In order to support POIsmart sharing, PSs are organized as nodes in a peer-to-peer network. Requests of type (b) are forwarded from the local PS to all of its neighboring nodes. Each of these nodes evaluates the request and in turn forwards it to its neighboring nodes. Typically, a PS can reside on a department server, at an ISP, or even on a personal server. The peer-to-peer PS network organization also allows a client system to connect to an arbitrary PS in the network and to transfer and operate on its POIsmarts through it. Figure 7.2 shows the data flow upon a user's request. In Step 1, the user submits a query based on zero or more keywords. The POIsmart Server queries the Context Provider for the user's profile information and other context data (Step 2). This information, as well as the original query, is used by the Query creation module to build a multi-feature query (Step 3), which also specifies the maximum number n of POIsmarts to be returned; n is chosen based on the device capabilities and the available connectivity. Next (Step 4), the query is forwarded to the known peers, and executed locally by the Multi-feature query execution module. In Step 5, each known peer returns the XML representations of its top n POIsmarts, each one with an associated confidence value. The returned POIsmarts, together with the ones retrieved locally, are provided to the Ranking module. Finally, this module selects the top n POIsmarts, which are sent to the user's device (Steps 6 and 7). When a peer receives a query from another peer in the network, it performs the same operations described above, except that the multi-feature query is already available and that it forwards the query only to a subset of the peers. Moreover, the Ranking module does not return its result to the client but to the peer that made the request.
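The final ranking step can be sketched in Java as follows; the ScoredPoismart representation is an assumption of ours, not the prototype's actual class.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of the Ranking module's merge step.
final class RankingModule {

    record ScoredPoismart(String xml, double confidence) {}

    // Merges local and peer results and keeps the top n by confidence.
    static List<ScoredPoismart> topN(List<ScoredPoismart> local,
                                     List<List<ScoredPoismart>> fromPeers,
                                     int n) {
        List<ScoredPoismart> all = new ArrayList<>(local);
        fromPeers.forEach(all::addAll);
        all.sort(Comparator.comparingDouble(ScoredPoismart::confidence).reversed());
        return all.subList(0, Math.min(n, all.size()));
    }
}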

Client Systems

Client Systems (CSs) provide an interface for the user to select POIsmarts, add new POIsmarts (including automatic categorization by contacting a reference PS), and reorganize POIsmarts. They also provide a way to search private POIsmarts and other users' shared folders. For efficiency and privacy reasons, each CS is configured to connect to a reference PS that stores its POIsmart hierarchy. CSs are device-dependent, since the graphical interface and several features need to adapt differently to desktop computers, PDAs, cellular phones, and other devices. Different features are implemented depending on the device capabilities. For example, the CSs on devices that have enough memory and that are not always online work using a local copy of the hierarchy and synchronize with the PS upon user request; on other devices only part of the POIsmart hierarchy is transferred to the client, and any change to this transferred part is immediately propagated to the PS.

The XML Schema for POIsmarts

In order to simplify the exchange of POIsmarts between different software architectures (e.g., from a Java PS to a .Net smart client), POIsmarts are

represented in XML.

Figure 7.3: A fragment of the XML Schema defining POIsmarts

The POIsmart XML Schema is derived in part from the Document Type Definition proposed for the XML Bookmark Exchange Language (XBEL) [38], an interchange format for hierarchical bookmark data supported, among others, by the Galeon browser and the Konqueror file manager. Our schema defines a superset of the information used by bookmark managers integrated in popular browsers and of the information used by managers of points of interest. Each POIsmart in the schema represents a resource by describing the real-world and/or virtual features of the resource (e.g., its physical location, its description, and URIs of Web content related to it). Figure 7.3 provides a diagram of part of the Schema. These are the main characteristics of the schema:

• Each poismart entry contains zero or more multimedia notes and poismart instances.

• Each note contains a reference to a multimedia Web resource related to the POIsmart (e.g., a video describing the resource, or a vocal comment on the resource recorded by a user).

• The notion of POIsmart instance has been introduced to model the common case of a resource, like a department store chain, that can have multiple branches, each one with a different location and possibly a different Web site. Each POIsmart instance can contain a URI and/or location information: while the former identifies a Web site related to the instance, the latter provides its physical location (which can be either a GPS coordinate or a cell ID). It is worth noting that a single POIsmart instance can contain multiple URIs: this is useful when the site offers different entry points for different technologies (XHTML, WML, etc.).

• The schema defines appropriate tags to support the automatic categorization of POIsmarts and folders. The metadata tag, in addition to text keywords, contains an element index that is used for storing the feature vector of the Web pages referenced by the POIsmart. This element is used both for the suggestions on the most appropriate folders for a new POIsmart and for the search of desired POIsmarts or folders.

• Finally, POIsmarts and folders have an owner attribute and a boolean shared attribute; the latter states whether they are accessible only by their owner or not. (A hypothetical instance is sketched after this list.)
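To illustrate the schema, a hypothetical POIsmart instance could look as follows; the element and attribute names follow the description above, but the exact tags are guesses of ours rather than the normative schema:

<!-- Hypothetical instance; tag names are illustrative, not normative. -->
<poismart owner="john" shared="true">
  <metadata>
    <keywords>restaurant italian milano</keywords>
    <index><!-- feature vector of the referenced Web pages --></index>
  </metadata>
  <note uri="http://example.org/comment.mp3"/>           <!-- vocal comment -->
  <instance>
    <uri>http://example.org/trattoria/index.xhtml</uri>  <!-- XHTML entry point -->
    <uri>http://example.org/trattoria/index.wml</uri>    <!-- WML entry point -->
    <location type="gps" lat="45.4642" lon="9.1900"/>    <!-- physical branch -->
  </instance>
</poismart>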

Peer-to-peer architecture

The choice of structuring POIsmart Servers as nodes in a peer-to-peer network is mainly motivated by the need to provide ubiquitous access and search capabilities to the users of the system. As explained in Section 7.1.2, users' queries are forwarded across the network for finding interesting items in the set of POIsmarts shared by other people. Moreover, there are several reasons, e.g., the user's temporary location, why a CS may occasionally connect to a PS which is not its reference PS. Additionally, peer-to-peer

seems to be the best choice for addressing the problem of scalability in decentralized communities.

Our peer-to-peer infrastructure essentially follows the model of the well-known Gnutella protocol, but attempts to overcome its main weaknesses by improving message routing. The mechanism for discovering other peers on the network works as follows: when a new peer (PS) needs to join the peer-to-peer network of PSs, it needs the address of a known PS that is used as a Host Cache Service. The new PS sends a request to this host, obtaining a (possibly partial) list of peers participating in the network. Each peer in this list can also be contacted to retrieve more peer addresses. Moreover, new peers are added to the set of known ones in the search phase. For searching POIsmarts or folders in the network, starting from a specific peer p1, we forward a Query message to every peer known to p1. When a peer pi forwards a query to pj, it provides it with the set of its known peers. When pj receives the set of peers known by pi, it adds them to its own set of known peers. Thus, pj forwards the query only to peers known by itself which are unknown to pi. Note that the set of peers known by pj is then a superset of the set of peers known by pi. Every message has a TTL field that states the maximum number of times the message can be forwarded. Moreover, when a peer receives the same message for the second time, it does not forward it to any other peer. However, since no assumption can be made with regard to the storing of POIsmarts in specific sets of peers, search of POIsmarts is essentially performed by flooding the whole network of peers with query messages (until the TTL expires). Obviously, this approach is far from optimal, since every peer should be contacted in order to correctly answer a query.
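The forwarding rule just described can be sketched in Java as follows; Peer and Query are placeholder types of our own, and we assume knownPeers() exposes the peer's mutable set of known peers.

import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of TTL-bounded query forwarding with
// known-peer exchange; types are placeholders, not the prototype's.
final class QueryForwarding {

    record Query(String id, int ttl) {}

    interface Peer {
        Set<Peer> knownPeers();                       // mutable set (assumed)
        void deliver(Query query, Set<Peer> knownBySender);
    }

    // Invoked when `self` receives `query` from a peer that already
    // knows the peers in `knownBySender`.
    static void forward(Peer self, Query query, Set<Peer> knownBySender,
                        Set<String> seenMessageIds) {
        if (query.ttl() <= 0) return;                 // TTL expired
        if (!seenMessageIds.add(query.id())) return;  // seen before: drop
        self.knownPeers().addAll(knownBySender);      // learn new peers
        Query next = new Query(query.id(), query.ttl() - 1);
        for (Peer p : self.knownPeers()) {
            if (!knownBySender.contains(p)) {         // skip redundant sends
                p.deliver(next, new HashSet<>(self.knownPeers()));
            }
        }
    }
}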

As a possible alternative, we are investigating the possibility of taking into account the location of both POIsmarts and peers, in order to assign POIsmarts to "close" peers. This feature would make it possible to answer location-based queries by contacting only those peers that cover the area of the query. However, implementing this feature while preserving load balancing is particularly challenging, since the distributions of peers and POIsmarts generally differ (i.e., a particular area can have a huge number of points of interest but a small number of peers, and vice versa); this issue is left to future work.

7.1.3 Classification and Search

Assigning new POIsmarts to folders in a large and complex hierarchy may be a tedious task, especially when using mobile devices. This motivates the introduction in our architecture of an automatic categorization system, which suggests the most appropriate folders for each new POIsmart.

New POIsmart categorization

Each time a user wants to save a new POIsmart, the PS suggests an ordered list of folders chosen from the current folder hierarchy. The suggestion system exploits both the virtual and the physical features of the POIsmart. When the POIsmart contains one or more references to Web sites, the content of those Web sites is used to categorize the POIsmart using machine learning techniques. The system maintains a local dictionary of the words that have occurred so far in Web pages referenced by the POIsmarts of the user. The dictionary is used to represent a Web page as a page feature vector using the standard vector space model of information retrieval [76]: roughly speaking, each vector coordinate measures the contribution of the corresponding dictionary word to the semantics of the page. The categorization system also keeps, for each folder i in the hierarchy, a folder feature vector wi. Given a new

POIsmart to add, the system computes the page feature vectors v1,...,vn of

the n Web pages referenced by the POIsmart, and then suggests to the user the name of one or more folders in the hierarchy. A folder i is suggested for labeling a Web page vj (and thus the POIsmart which contains a reference to it) whenever the confidence for that folder, computed as the cosine of the angle between wi and vj, exceeds a certain threshold (a sketch of this computation follows the paragraph). Then the user can choose whether to assign the POIsmart to one or more folders in the set suggested by the system, or to assign it to another folder. Note that, unlike other Web page categorization systems, in order to provide real-time responses our system does not use any information taken from pages linked to the target pages. In case the new POIsmart does not contain references to Web sites, the categorization technique exploits the physical location of the resource and the user's context information. The POIsmart location is used by a GIS to retrieve a set of points of interest close to it. The number of points of interest in the set generally depends on the precision of the localization technology. For instance, the area identified by a cell ID can be wide, and may contain a considerable number of points of interest. The GPS system provides very accurate location information, but typically it does not work indoors, such as in shopping centers or museums; thus, the location area, derived from the last available GPS data, must be widened in order to ensure that the right point of interest is included. The folders containing the points of interest in the set become candidates for including the new POIsmart, and they are ranked taking into account the user's context.
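A self-contained Java sketch of this thresholded cosine comparison, with illustrative names of our own:

import java.util.ArrayList;
import java.util.List;

// Illustrative folder-suggestion step: suggest folder i when the cosine
// between its feature vector w_i and the page vector v exceeds a threshold.
final class FolderSuggester {

    static double cosine(double[] w, double[] v) {
        double dot = 0, nw = 0, nv = 0;
        for (int i = 0; i < w.length; i++) {
            dot += w[i] * v[i];
            nw  += w[i] * w[i];
            nv  += v[i] * v[i];
        }
        return (nw == 0 || nv == 0) ? 0 : dot / (Math.sqrt(nw) * Math.sqrt(nv));
    }

    // Returns the indices of folders whose confidence exceeds the threshold.
    static List<Integer> suggest(double[][] folderVectors, double[] v,
                                 double threshold) {
        List<Integer> suggested = new ArrayList<>();
        for (int i = 0; i < folderVectors.length; i++) {
            if (cosine(folderVectors[i], v) > threshold) suggested.add(i);
        }
        return suggested;
    }
}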

Example 14 Consider the case of John, a user who had just had dinner in a restaurant. Being very satisfied, he decides to add a POIsmart for the restaurant with a short vocal message, in order to remember the experience and to share it with his friends. Since the GPS signal was absent in the restaurant, the available coordinates are the ones corresponding to the parking lot in front of the restaurant. The PS, in order to suggest a folder where to store the new POIsmart, queries a GIS, sending the coordinates and asking for points of interest in a range of 50 meters. The GIS returns 4 points of interest, corresponding to the parking lot, the restaurant, a flower shop, and a night club. The PS retrieves context data from the Context Provider, among which is the current time. Since the current time is 1 p.m., the system ranks the categories "night clubs" and "specialty shops" very low (since most are closed at that time), and it suggests "Restaurants" as the most probable category for the POIsmart, followed by "Parking lots".

POIsmarts retrieval

A Client System can query its reference PS using free-text search terms. The query is transformed by a proper module on the PS into a multi-feature query by considering context information, such as the user's current location, device capabilities, known interests, and possibly his current activity. To answer such a multi-feature query, the PS adopts a ranking algorithm similar to the one proposed in [48]. For each considered feature, a single degree of match is computed. For instance, the degree of match of the terms in the search expression against the set of available POIsmarts is obtained by computing a query feature vector q based on the search terms, and then calculating the cosine of the angle between q and the feature vector w of the POIsmart. The single degrees of match are then combined to obtain a comprehensive confidence value.

Experimental results

Our experimental results on automatic classification are currently limited to POIsmarts having URI information, since we built feature vectors based only on the content of the corresponding Web pages. Previous experimental results in Web page categorization [93] show that term-weighting schemes (taking into account the semistructured nature of HTML) and dimensionality-reduction techniques can both improve the classification performance. These techniques have a low computational cost and are therefore suitable for our real-time application. In particular, term-weighting techniques exploit the structural information present in HTML documents by considering not only the number of occurrences of terms in documents, but also the HTML element the terms belong to. The experimental results show that varying the weight of a term depending on the associated HTML element, e.g., assigning greater importance to terms that belong to the META and TITLE elements, may lead to an improvement in accuracy. This weighting method, called structure-oriented weighting technique (SWT) in [93], is defined by the function

SWT_w(t_i, d_j) = Σ_{e_k} w(e_k) · TF(t_i, e_k, d_j)

where e_k is an HTML element, w(e_k) denotes the weight we assign to the element e_k, and TF(t_i, e_k, d_j) denotes the number of times the term t_i occurs in the element e_k of the HTML document d_j.

Figure 7.4: Preliminary results for the categorization system. Curves display error rates related to finding the correct folder within the first three (lower curve), two (middle curve), or within the first folder (upper curve) in the ranking generated by the categorizer.
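A direct Java transcription of the SWT function might look as follows; the per-element term frequencies are assumed to have been extracted by an HTML parser, and any concrete weights (e.g., higher values for META and TITLE) are examples of ours, not the values used in [93].

import java.util.Map;

// Illustrative computation of SWT_w(t_i, d_j): tfByElement maps each
// HTML element name to TF(t_i, e_k, d_j), the term frequency of t_i
// inside that element of document d_j.
final class Swt {
    static double swt(Map<String, Integer> tfByElement,
                      Map<String, Double> elementWeights) {
        double sum = 0;
        for (Map.Entry<String, Integer> e : tfByElement.entrySet()) {
            // Unlisted elements default to weight 1.0 (our assumption).
            sum += elementWeights.getOrDefault(e.getKey(), 1.0) * e.getValue();
        }
        return sum;
    }
}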

We performed various experiments on a corpus of 8000 Web pages belonging to 10 Yahoo! categories. Page feature vectors are built by applying the structure-oriented weighting technique together with a strong feature selection. To compute the folder feature vectors wi we use the perceptron, an extremely fast and incremental algorithm. The perceptron adjusts the vectors wi after each insertion into the hierarchy of a new POIsmart containing references to Web pages, thus improving its performance as the hierarchy grows.

For each new Web page taken from the corpus, we ranked the 10 folders according to the cosine of the angle between the page feature vector and each folder feature vector. An error occurs when the right category is not in the first k places of the ranking, where we set k = 1, 2, 3. Figure 7.4 shows the corresponding error rates against the number of pages. Though these results are very preliminary and not yet satisfying in terms of performance, note that the error rates have a nice shape; i.e., they drop quickly at the beginning of the learning process. To decrease the error rate further, we plan to use variants of the perceptron tailored to ranking problems, such as the one proposed in [28].

7.1.4 The current system prototype

A POIsmart management system prototype is being developed at the University of Milan, with the goal of testing the various components of the proposed architecture and eventually obtaining a fully functional POIsmart management system. The prototype currently implements part of the features explained in the previous sections.

Communication protocol Our architecture adopts Web services for client/server and server/server communication. The choice of Web services is mainly motivated by the need for device independence: users should be able to access their POIsmart management system from many different devices. Many of these devices need ad hoc client software for the communication with the reference PS. We currently consider Web services to be the best and simplest solution for the interaction of applications running on different environments, compared to proprietary technologies like DCOM or RMI.

POIsmart Servers Java-based POIsmart servers can currently manage multiple users' XML POIsmart representations and can interact with client systems, allowing insertion and deletion of POIsmarts and folders. The peer-to-peer algorithm is also implemented and working as described in Section 7.1.2. We are still working on the integration with the middleware providing context information, and hence multi-feature queries are not yet supported. POIsmart servers currently support basic searches on private as well as on shared POIsmart hierarchies. The search facility we implemented is based on keyword matching and location; hence, the list returned by the server includes only those POIsmarts for which the keywords are found in the

Figure 7.5: Client systems

title or metadata keyword tag, ordered by their proximity to the user. The search facility allows the user to specify whether the search should span only the personal hierarchy or should extend to other users' shared POIsmarts, both on the same PS and on different ones in the peer-to-peer network.

Client systems One of the main goals of our project is to allow users to access their POIsmarts from different devices. Currently, we have developed Client Systems for the most commonly used devices, i.e., personal computers, cellular phones, and PDAs (see Figure 7.5). The client for personal computers is a multi-platform Java standalone application that provides a user-friendly interface for the interaction between the end user and the system. When the user connects to his PS, he obtains his whole POIsmart hierarchy. The user can manipulate it by adding, modifying, or deleting POIsmarts and POIsmart folders. When the user wants to save his changes, he makes a request to his reference PS, sending an appropriate XML representation of the changes made in the current session. The user can also search for POIsmarts or POIsmart folders shared by other users. The clients for the PDA and cellular phone platforms have been developed using the Java 2 Micro Edition platform. This choice guarantees the reuse of certain Java components, and provides adequate robustness and portability. We have a version for cellular phones supporting MIDP 1.0 and one for PDAs implementing the J2ME Personal Profile specification. The communication between the client and PSs is realized by using kSOAP, an implementation of a reduced set of the SOAP protocol specifications, suitable for the Java 2 Micro Edition. In addition to Java smart clients, we developed a .Net version of the client for Microsoft Smartphone 2003 devices. This solution seems to be more appropriate for mobile phones equipped with the .Net Compact Framework. The clients for mobile devices essentially have the same functionality as the personal computer version. The key differences are due to the limited amount of memory available on mobile devices. For example, in the MIDP 1.0 version no POIsmart is stored on the device and all operations are carried out server-side, while in the version for PDAs the POIsmart hierarchy can be cached locally and synchronized with the server with an appropriate policy.

The capability of the client systems to automatically launch a browser when selecting a URL, as well as the capability of a browser to store a URL in the POIsmart hierarchy, are key factors for the usability of the proposed system. For this reason, we are working on obtaining full integration with the most widely used browsers. We have managed to include the former capability in all versions of the clients except the MIDP 1.0 one. Regarding the latter capability, the integration has been done between Internet Explorer and the client for personal computers: a specific toolbar button has been added to the browser, which users can use to create a new POIsmart referencing the Web page currently displayed. The integration

is more complex for microbrowsers, which are usually less customizable; we are evaluating several solutions, one of which consists in associating a hardware button with the creation of a new POIsmart.

7.2 An adaptive video streaming service

As already discussed in previous work [75], multimedia streaming adaptation can benefit from an asynchronous messaging middleware, in order to adapt the streaming bitrate on the basis of context data such as network conditions and users' preferences. In order to demonstrate the effectiveness of our solution, we have implemented a streamer prototype that exploits the CARE middleware for obtaining context data. Moreover, the streaming bitrate is continuously adapted on the basis of asynchronous notifications of changes in context data, sent by means of the trigger mechanism presented in Chapter 5.

7.2.1 The adaptive streaming system

We selected the VideoLan Client (VLC) [102] as a starting point to develop a customized client system, because it is an open platform and multiple operating systems are supported. This client is intended to run on Linux systems, Windows workstations, and WindowsCE PDAs, in order to reach the largest possible population of users. Two new components have been implemented for VLC: a network adapter and a demuxer. The network adapter contacts the streaming service provider by performing an HTTP request, which is modified by the local proxy: the proxy adds to the HTTP headers information that identifies the user and the remote profile managers. Then, the client waits for the video feed to arrive on a specific port. The demuxer is in charge of collecting the frame segments coming from the network component, putting them together, and feeding the result to the correct decoder. The user HTTP request from VLC is received by the pmp module, which, as explained in Section 3.2, asks the context provider for the aggregated profile data. The attribute/value pairs returned by the context provider are included in the HTTP request header, and the request is forwarded to

the streamer application logic. The streamer has been implemented from the ground up on a Linux system. Implementation on this platform grants us easy deployment and the possibility to easily fine-tune the application. Our streamer is able to concurrently read data from a variable number of different video files while keeping frames synchronized. Switching between video files is performed only upon request by the underlying middleware. Upon receiving a streaming request, the streamer opens all the video files associated with the requested content. Based on the context parameter values, the application logic selects an appropriate encoding, and the streamer starts sending UDP packets containing frames belonging to the selected encoding. Network streaming is performed thanks to an ad hoc file format. Our custom format is a streamer-friendly version of the AVI encapsulation format. Metadata are taken away from the original AVI file, while frames are isolated and divided into fixed-size chunks that are used as UDP payloads. Packets are then tagged with timestamps and other relevant information (e.g., frame size). As can be noted, this streaming-friendly encapsulation format is essentially codec-independent.
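The chunking step can be sketched in Java as follows; the header layout (timestamp, total frame size, chunk offset) mirrors the description above, but the field sizes and their order are assumptions of ours, not the prototype's actual wire format.

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative packetizer: splits one video frame into fixed-size
// chunks, each tagged with a timestamp and the total frame size.
final class FramePacketizer {

    static final int CHUNK_SIZE = 1400; // assumed UDP payload budget

    static List<byte[]> packetize(byte[] frame, long timestampMillis) {
        List<byte[]> packets = new ArrayList<>();
        for (int offset = 0; offset < frame.length; offset += CHUNK_SIZE) {
            int len = Math.min(CHUNK_SIZE, frame.length - offset);
            ByteBuffer buf = ByteBuffer.allocate(8 + 4 + 4 + len);
            buf.putLong(timestampMillis); // presentation timestamp
            buf.putInt(frame.length);     // total frame size
            buf.putInt(offset);           // chunk offset within the frame
            buf.put(frame, offset, len);  // the chunk itself
            packets.add(buf.array());
        }
        return packets;
    }
}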

7.2.2 The interaction between CARE and the streamer

We now illustrate how changes in context are detected and notified by the CARE middleware to the streamer application logic. When the pmp module receives the user request, it recognizes that it is directed to a continuous service, and retrieves the monitoring specifications related to the requested feed, which in the case of our streamer prototype consist only of the MediaQuality parameter. Using this specification, the context provider computes the set of required triggers, according to the algorithms reported in Section 5.3 and illustrated by Example 11. Triggers are then set on the opm and on the upm/device, in order to monitor available bandwidth and battery level, respectively.

Figure 7.6: Experimental setup

Upon the firing of one of the triggers, the new value is forwarded to the context provider, which recomputes the value of the MediaQuality parameter. If the new value differs from the previous one, it is forwarded to the pmp, which issues a special HTTP request to the streamer application logic. The application logic selects a different encoding based on the new value. The feeder process is notified and forced to change the file from which the video frames are taken. The transmission restarts from the beginning of the last frame being transmitted, and the client discards any UDP packets belonging to a partially transmitted frame.

7.2.3 Experimental Setup

The experiments performed with the current prototype are based on a full implementation and demonstrate the viability of our solution. During our experiments we set up a testbed and emulated network congestion in order to observe the perceived quality of streamed media under different conditions. At first, we ran a set of experiments without performing adaptation, and then another set while using CARE to adapt the streamed media quality on the basis of context. The streaming behavior is evaluated by measuring the number of frames per second received by the decoder. In the ideal case this number should always be 25 (since we are using PAL). When the number of frames per second drops below the expected value, we start to experience mosaic effects, stop-and-go video, and random artifacts.

The experimental setup is shown in Figure 7.6. In order to simulate changes in the available bandwidth, client/server communications are mediated by a link emulator machine (b). This machine runs a live Unix distribution. The Dummynet [95] link emulator software is used to limit the available bandwidth in both directions, in order to simulate network congestion. This machine also hosts the opm module, which is in charge of notifying the context provider of changes in the available bandwidth according to our trigger mechanism. The client system (a) hosts the VLC client, as well as the local proxy used to identify the user and her profile managers. A trigger monitor is in charge of monitoring the device resources (e.g., available memory) according to the received triggers. The server system (c) is dedicated to hosting the streaming server as well as the remaining components of the CARE middleware.

7.2.4 Observed results

The video stream sent to the VLC client is encoded using XviD [106] – a free MPEG-4 codec – with three different sets of parameters for average bandwidth and quantization matrix size. In order to ensure a clear visual feedback during the experiment, we kept the difference between encoding parameters wide. The first encoding (high quality) has an average bandwidth of 512 KBps and a quantization of 4 bits, while the other two (medium and low quality) have 128 KBps with 16 bits and 64 KBps with 64 bits, respectively. Figure 7.7 shows the appearance of a detail of the same frame using the different encodings.

Figure 7.7: A detail of the same frame at three different levels of encoding (high, medium, and low resolution, respectively)

Due to the VBR encoding and some buffering effects in the kernel protocol stack, the aforementioned bandwidths cope nicely with a medium transmission bandwidth of 10, 5, and 2 Mbps, respectively. These three bandwidth levels are meaningful for the experiment, since they are three possible values for link quality after handshake in WiFi networks. The experiment has a duration of 150 seconds, divided into five 30-second segments that we will call slots. The bandwidth allowed by the link emulator is changed at every slot. During the first slot the network link is unconstrained (i.e., available bandwidth is 10 Mbps, due to our setup); in the second slot the allowed bandwidth is limited to 5 Mbps, and in the third it is limited to 2 Mbps. Then, we scale up again, first to 5 Mbps, and then to unconstrained in the last slot.

Figure 7.8: Experimental results with no adaptation: (a) high-resolution video, (b) medium-resolution video, (c) low-resolution video

Figure 7.9: Results with our adaptive streamer, and variable bandwidth

At first, we ran the experiment without performing adaptation. The resulting frames per second (FPS) perceived by the player for the three video encodings are reported in Figure 7.8. As we can see in Figure 7.8(a), using the high bitrate video, the streamer is able to provide the required 25 frames per second during the first slot. Then, when the available bandwidth drops to 5 Mbps, the player receives around 20 frames per second. In the third timeslot congestion is too strong, and frames per second drop to nearly zero. Strangely enough, even if the link is less congested during the fourth slot, video quality does not improve. This behavior has been encountered in a significant number of experiments; we suppose that it is due to VLC internal buffering and decoder issues. Only in the last timeslot – when bandwidth is unconstrained – can we observe a normal playout again. Streaming the medium bitrate video (Figure 7.8(b)), we can observe that network congestion becomes a problem only in the third slot. From the point of view of the user experience, the drop to 20 FPS is definitely perceivable, and the resulting quality would normally not be acceptable except for a short period of time. Figure 7.8(c) shows that the streaming of the lowest bitrate video is not affected by the network congestion. However, its quality is not comparable with the other two video encodings.

The improvement obtained by means of adaptation can be seen in Figure 7.9. The adaptation of media quality to changes of network bandwidth is almost immediate. The asynchronous notifications sent by the network operator profile manager (OPM) were sent with an artificial delay of three seconds in order to be more realistic. This delay is intended to model an upper bound for the network latency in the notification of context updates, as well as possible high loads at the profile managers. Without this delay, due to the buffering effects of the streaming system, it is almost impossible to perceive the data loss when switching between the video encodings. From the point of view of the visual experience of the final user, our experiment results have been very positive. After a small data loss at the beginning of the second and third slots, the playout proceeds smoothly at the adapted bitrate. A video capture of the experiment can be found at the project Web site [21].

7.2.5 Comparison with a commercial solution

In order to compare our solution with a commercial product supporting continuous media adaptation, we have performed experiments with the Helix server from RealNetworks [91].

Figure 7.10: Results with the Helix streaming server, and variable bandwidth

Since the software is provided as closed source, we have not been able to collect FPS statistics like those we reported for our streamer. Hence, we report here some observations based on a comparison of the user experience, and some technical observations based on the available knowledge about Helix. A minimal background on the RealMedia streaming system, independently of adaptation, is necessary. Since the system is codec-dependent, every clip needs to be converted in order to be streamed. On the contrary, our streamer supports multiple codecs: any kind of MPEG-like frame-oriented video content inside an AVI encapsulation can be sent over the network. Moreover, while RealMedia encoding is proprietary, MPEG is an open standard; many implementations exist, both open and closed source. Regarding the video quality of the produced stream, the RealMedia format is generally better suited for low bandwidth than MPEG.

When encoding for very low bandwidth, MPEG usually exposes a mosaic-like effect if not a stop-and-go behavior, while RealMedia adopts a fuzzier degradation, thus rendering an image which is more pleasant for the final user. On the other hand, in our experiments, when encoding is performed at a very high bitrate the rendering quality seems to be better with MPEG clips.

The comparison regarding adaptation is obtained by using Helix to stream the same video content, encoded at the three different bitrates used in the first experiment. For this purpose, a multiple RealMedia stream has been produced and streamed while operating on the link bandwidth in the same way as in the first experiment. The experiment has been repeated multiple times, obtaining similar results. Experimental results are shown in Figure 7.10.

We have observed that during the first slot the playout proceeds smoothly. At the beginning of the second slot, the link becomes congested, and after 6 seconds the information panel shows a quality drop, even if it is still not perceivable by the user because of buffering effects. Data loss is visible on the screen as artifacts at the end of the second slot, from second 50 to second 58. At the beginning of the third slot, the link becomes more congested. This time, a quality drop occurs after 20 seconds; artifacts can be seen 10 seconds later, and last for several seconds. Interestingly, even if the available bandwidth increases at the fourth and fifth slots, the stream quality does not improve, and the lowest bitrate video continues to be streamed until the end of the experiment. To our knowledge, technical specifications motivating this behavior are not publicly available, but this may also be a specific design choice.

From the point of view of the user experience, the adaptation performed by Helix is very effective.

With respect to what we observed with our streamer under the same experimental setup, the artifacts appearing when the encodings are switched appear a bit later, last longer, and, due to the specific encoding, have less impact on the global perception of the video stream. Considering the last part of the experiment (slots 4 and 5), in the case of our streamer the user experiences a significant quality increase, while in the other case the user keeps watching the low bitrate video. Note that, because of its particular encoding, the RealMedia video is more enjoyable than an MPEG one at the same bitrate. Overall, the experiment showed that our prototype adaptive streamer coupled with the CARE middleware, despite its limited functionalities, offers a user experience close to that of a leading commercial solution. The adaptation performed by Helix, and in general by the available solutions for continuous media adaptation, focuses on the available bandwidth as the only parameter to be monitored. We believe that streaming adaptation would be improved by considering multiple context parameters, including those that describe the status of resources on the client device (e.g., available memory and battery level). Moreover, complex services providing streaming contents should also perform adaptation on the basis of the user's activity and surrounding environment. Using CARE instead of an ad-hoc probing technique integrated with the streamer has a number of advantages:

• since probing techniques are usually network-dependent, and in our solution they are decoupled from the streaming client and server, the streaming system becomes network-independent;

• multiple context parameters can be taken into account to select the appropriate bitrate;

• the same parameter (e.g., the network bandwidth) can be acquired from multiple sources, possibly with different acquisition techniques,

and the conflict resolution mechanism is used to dynamically identify the most reliable value;

• both users and service providers can specify policy rules to influence streaming adaptation on the basis of the current context, according to their preferences and business rules, respectively (see the illustrative rules below).
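For instance, rules of the same form as those shown later in Example 15 could drive the bitrate selection. The following two rules are purely illustrative: the attribute names and values are not taken from the actual prototype.

R1: If Bearer = GPRS Then Set MediaQuality='low'

R2: If BatteryLevel = Low Then Set MediaQuality='low'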

7.3 Integration with an adaptive transcoding proxy

In the Pervasive and Ubiquitous Computing era, the trend is to provide Web content and multimedia applications by taking into account four important requirements: anytime-anywhere access to any data through any device and by using any access network. As a consequence, it is increasingly important to provide tailored content efficiently, in order to address the mismatch between the rich multimedia content that is widely available and the limited client capabilities and available bandwidth. Furthermore, users feel the need for personalized content that matches their personal preferences, considering not only device capabilities and network status, but a wider notion of context, which includes – among other things – their current location, activity, and surrounding environment. One of the current research topics in distributed systems is how to extend the traditional client/server computational paradigm in order to allow the provision of intelligent and advanced services. To this aim, new actors are introduced into the WWW scene, in the form of intermediaries that act on the HTTP data flow exchanged between client and server. Several examples of intermediary adaptation systems exist in the literature, such as iMobile by AT&T [90], whose main goal is to provide personalized mobile services; the TACC infrastructure [37], whose main goal is to support the dynamic deployment of transformation modules into proxy-based components; and RabbIT2 and Privoxy3, whose main goal is to provide functionalities such as text and image transcoding, and the removal of cookies, GIF animations, advertisements, and Java applets. Many of these systems provide adaptation functionalities without taking into account the user's preferences and context information.

2 http://rabbit-proxy.sourceforge.net
3 http://www.privoxy.org

In order to support the provisioning of this type of service, together with a research group of the University of Salerno, we have defined and experimented with an architecture that builds on top of two existing frameworks: the Scalable Intermediary Software Infrastructure (SISI) [26] and the CARE middleware. Our proposal is supported by a running implementation, and by a prototype service aimed at providing location-based support to mobile users.

7.3.1 Architecture overview

In this section we present the architecture for context-aware service provisioning. Figure 7.11 shows the system architecture and the data flow upon a user request. In order to provide the system with the user's identification and with the information required to retrieve distributed profiles and policies, the request is intercepted by a local proxy that runs on the client device and adds these data into custom HTTP headers. Then (Step 1), the request is forwarded to the node running the CARE middleware for context-awareness. Here the profile mediator proxy retrieves the aggregated context data from the context provider module. These data – which include the list of SISI services to be applied, as well as their parameters – are inserted into the HTTP headers, and the request is sent to the node running the SISI adaptation system (Step 2). SISI retrieves the requested resource from the external Web server (Steps 3 and 4), and applies the services on the basis of the user's context data and preferences. Finally, the adapted resource is sent back to the user's client (Steps 5 and 6) through the CARE node.
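As a purely illustrative sketch of the header injection performed by the local proxy, the following Java fragment builds a request carrying the user identification and profile manager references. The header names X-CARE-User and X-CARE-Profile-Managers are invented for this example; the actual headers used by CARE may differ.

import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical header injection at the local proxy.
public final class CareHeaderInjector {

    public static HttpRequest withCareHeaders(URI target, String userId,
                                              String profileManagerUris) {
        return HttpRequest.newBuilder(target)
                .header("X-CARE-User", userId)                          // user identification
                .header("X-CARE-Profile-Managers", profileManagerUris)  // where to fetch profiles
                .GET()
                .build();
    }
}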

In the following, we briefly describe the SISI adaptation system, defined and developed at the University of Salerno.

Figure 7.11: The adaptation architecture.

The SISI adaptation system

The main goal of SISI [26, 42] is to facilitate the deployment of efficient and programmable adaptation services. This framework, built on top of the Apache Web server and mod_perl, provides a modular architecture that allows an easy definition of new functionalities implemented as building blocks in Perl. These building blocks, packaged into Plugins, produce transformations on the information stream as it flows through them. Moreover, they can be combined in order to provide complex functionalities. Thus, multiple Plugins can be composed into SISI edge services, and their composition is based on preferences specified by end users.
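Conceptually, plugin composition resembles the following sketch, shown in Java for uniformity with the other examples in this chapter, whereas SISI building blocks are actually written in Perl:

import java.util.List;
import java.util.function.UnaryOperator;

// Each plugin transforms the content stream; plugins are chained
// according to the preferences specified by the end user.
public class PluginChain {

    private final List<UnaryOperator<String>> plugins;

    public PluginChain(List<UnaryOperator<String>> plugins) {
        this.plugins = plugins;
    }

    public String apply(String content) {
        for (UnaryOperator<String> plugin : plugins) {
            content = plugin.apply(content);
        }
        return content;
    }
}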

Many adaptation services have already been implemented, among which we can mention:

LinkRelationship.

The HTML LINK element can be used to improve the Web site accessibility and, at the same time, to ensure a better support for search engines (e.g., semantic information). This service adds a toolbar containing the LINK attributes on top of each HTML page. The main goal is to make HTML pages more accessible when accessed through screen readers, and also to improve their navigability on devices with limited capabilities.

BlockList and AnnoyanceFilter. The main goal of these services is to get rid of particularly annoying content during Web navigation. In particular, the BlockList service provides a simple way to block a Web browser from viewing sites that are not on a list of approved sites. This service searches for all the links embedded in a Web page and substitutes the unapproved ones with plain text. The AnnoyanceFilter service provides functionalities for removing advertisements, banners, pop-ups in JavaScript and HTML, and JavaScript code, and for disabling unsolicited pop-up windows.
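As a minimal illustration of the BlockList idea (again in Java rather than Perl, and with a deliberately simplified regular expression; a real implementation would use a proper HTML parser):

import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Replaces links to unapproved hosts with their plain anchor text.
public class BlockListFilter {

    private static final Pattern LINK = Pattern.compile(
            "<a\\s+href=\"https?://([^/\"]+)[^\"]*\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private final Set<String> approvedHosts;

    public BlockListFilter(Set<String> approvedHosts) {
        this.approvedHosts = approvedHosts;
    }

    public String filter(String html) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = approvedHosts.contains(m.group(1))
                    ? m.group(0)   // approved host: keep the whole link
                    : m.group(2);  // unapproved host: keep only the anchor text
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}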

7.3.2 Service activation policies

By default, each SISI service is deactivated. The activation of services, as well as the service parameters, is determined by policy rules declared by both the user and the service provider.

Example 15 Consider the case of the FilterImg service for image transcoding. The evaluation of the following policy rules determines the activation state and the parameters of the service:

R1: If DeviceType = CellPhone Then Set FilterImg:Activate=’on’

R2: If DeviceType = CellPhone And Bearer != UMTS Then Set FilterImg:removeImage='on'

R3: If ColorCapable = no Then Set FilterImg:black&white=’on’

Figure 7.12: User interface to declare policies.

Rule R1 is used to activate the service for devices with low capabilities (cell phones). Rule R2 instructs SISI to remove images when the cell phone is not connected through a UMTS bearer. Rule R3 is used to determine the transformation of images to black and white for devices that cannot display colors. It should be noted that the user's preferences regarding the activation of services, as well as their parameters, are represented through a proper CC/PP vocabulary.

Of course, policies expressed by means of the formal language shown in Example 15 are not intuitive for the final users of the system. For this reason, users are provided with Web interfaces to manage their preferences regarding the activation and parameters of SISI services. Figure 7.12 shows the interface to express preferences regarding the FilterImg service.

7.3.3 The GeoAware prototype service

In order to show the integration between CARE and SISI, we have implemented a prototype location-based service, named GeoAware. This service is addressed to mobile users equipped with a mobile device and a GPS receiver. The main goal of this service is to provide a map with information about both the current location of the user and the locations (expressed as physical addresses) appearing on the Web page she is currently viewing. The main steps performed by the GeoAware service are described below. At first, CARE retrieves the GPS coordinates of the user from her profile manager, and communicates them to SISI as explained in Section 7.3.1. The GeoAware service parses the requested Web page and, by applying regular expressions, matches all standard U.S. addresses [101]. When an address is recognized, GeoAware adds a hyperlink to the Web page, highlighted by a particular icon (see Figure 7.13-A). When the user selects the hyperlink icon, GeoAware invokes the geocoder.us service4 to obtain the coordinates of the corresponding address. Geocoder.us is a public service providing free geocoding of addresses in the United States, and relies on Geo::Coder::US, an open-source Perl module available from CPAN. After having obtained this information, GeoAware builds a query string to be issued to the U.S. Census Bureau TIGER Map Server5, which provides public-domain, customized U.S. maps. Figure 7.13-B shows the map obtained from the address in Figure 7.13-A. The current position of the user is represented by a blue star labeled you are here, while the position corresponding to the address is represented by a red star labeled with the address.

4 http://geocoder.us/
5 http://tiger.census.gov/cgi-bin/mapbrowse-tbl

Figure 7.13: The GeoAware service.

Exploiting the other adaptation services, the map is properly tailored to device capabilities and available bandwidth. The interest in geolocalized services is witnessed by the recent service provided by Google with its AutoLink button in the Google toolbar. Even if that service addresses the same needs as ours, it is currently available only to users with desktop browsers. Moreover, the AutoLink functionalities are somewhat limited, since they do not take advantage of a context-awareness framework.
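Returning to the address-recognition step, the following simplified Java sketch conveys the idea; the pattern shown is purely illustrative and covers far fewer address forms than the actual postal addressing standards [101]:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy recognizer for addresses of the form
// "number street-name suffix, city, STATE zip".
public class AddressMatcher {

    private static final Pattern US_ADDRESS = Pattern.compile(
            "\\b\\d{1,5}\\s+[A-Z][A-Za-z .]+\\s+" +
            "(St|Ave|Blvd|Rd|Dr|Ln|Ct)\\.?,\\s*" +          // street suffix
            "[A-Z][A-Za-z .]+,\\s*[A-Z]{2}\\s+\\d{5}\\b");  // city, state, ZIP

    public static void main(String[] args) {
        String page = "Visit us at 1600 Pennsylvania Ave, Washington, DC 20500 today.";
        Matcher m = US_ADDRESS.matcher(page);
        while (m.find()) {
            System.out.println("Recognized address: " + m.group());
        }
    }
}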

Chapter 8

Summary and Outlook

In this chapter we summarize the main contributions of this thesis, and address open issues.

8.1 Technical contribution

The goal of this thesis was the definition and development of a framework (called CARE) for efficiently and effectively supporting context-aware adaptation of Internet services in mobile computing. The proposed framework adopts a hybrid reasoning mechanism based on a loose interaction between ontological reasoning and efficient reasoning in a restricted logic programming language. The logic programming language we defined for reasoning with raw context data is particularly efficient, since the evaluation of policies written in this language has linear complexity. Moreover, the framework includes sophisticated mechanisms for resolving conflicts between context data provided by different sources, and between policies declared by different entities. Conflict resolution mechanisms ensure that the model of the logic program is unique.

Furthermore, an optimized trigger-based mechanism has been adopted for supporting services that persist in time (called continuous services), like multimedia streaming. The adopted optimizations aim at reducing computation time and network communications. We have adopted OWL-DL as the language for representing and reasoning with complex context data. In order to have a uniform context representation language, we have defined new CC/PP vocabularies providing a mapping between OWL-DL concepts and CC/PP attributes. We have carefully performed both theoretical and experimental analyses of the novel algorithms that are adopted by CARE. The framework has been developed, and used for the adaptation of various prototype services addressed to mobile users.

8.2 Open problems

Several issues, both practical and theoretical, still have to be investigated in depth to make advanced distributed profiling an effective and viable solution for service adaptation. We identify the following as the main open issues:

• Automatic recognition and representation of complex profile attributes;

• Preservation of privacy for user personal data (including location data) according to privacy policies defined by the user;

• Optimization and caching techniques to provide real-time integrated context data;

• Scalability with respect to the number of users and profile managers.

8.2.1 Ontologies and ontological reasoning

As outlined earlier in this dissertation, there is a need for profile representation formalisms that are far more expressive than CC/PP.

Our favorite example is the representation of user activities, but many other complex profile attributes call for a more expressive formalism. Ontologies are a natural candidate, since they are emerging for a variety of services including knowledge sharing and semantics disambiguation, which are very relevant even in the context we are considering. However, we have two main concerns with the use of current ontology languages for the representation of complex profile attributes and their automatic recognition: a) expressiveness, and b) efficiency of reasoning.

Regarding a), we encountered several difficulties when trying to use the well-known ontology language OWL [77] (a W3C Recommendation language) to represent user activities. In particular, OWL is very weak in reasoning with properties, lacking a constructor for property composition. The typical example is that in OWL it is not possible to define uncle as the composition of parent and brother. It also lacks the possibility to express feature agreement; for example, it is not possible to force the value of the property has-employer in a class of persons to be the same (without specifying which one) in order to represent colleagues. This is essentially due to the fact that the underlying description logic does not support role-value-maps, not even in limited forms that preserve decidability [7]. Another relevant issue regarding expressiveness is the lack of support for the representation of rules and the integration of rule-based reasoning with ontological (subsumption-based) reasoning.
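To make the limitation concrete, the desired statement is a complex role inclusion of the form (role names are purely illustrative):

parent ∘ brother ⊑ uncle

An axiom of this shape is not expressible in OWL-DL. Rule extensions such as SWRL (discussed below) recover it as a Horn-like rule, e.g., parent(?x, ?y) ∧ brother(?y, ?z) → uncle(?x, ?z).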

The introduction of first-order rules into OWL would greatly augment the expressiveness of the language, and a number of projects are currently addressing this issue (e.g., [57, 39]). For instance, the Semantic Web Rule Language (SWRL) is based on the combination of OWL-DL with the Unary/Binary Datalog RuleML sublanguages of the Rule Markup Language (RuleML).

In SWRL, the set of OWL axioms is augmented with Horn-like rules, where unary and binary predicates correspond to OWL classes and properties, respectively. A prototype implementation has been developed. Unfortunately, the introduction of rules into OWL or even OWL-DL can easily lead to the undecidability of the basic reasoning tasks, making languages such as SWRL unsuited for real-time services which make use of large and complex ontologies. Efficiency of reasoning with ontologies is indeed the second general concern mentioned above (b). Very good progress has been made in recent years in the development of modal logic theorem provers that can be used to compute subsumption and related reasoning tasks in ontologies. However, the underlying reasoning problems are inherently difficult, and the classification task becomes unfeasible in real time even for small to medium size ontologies and not very expressive languages like OWL Lite (see for example [49]).

8.2.2 Privacy issues

Another major issue involved in advanced distributed profiling is the privacy of user data. There are two main approaches to privacy preservation: a) the enforcement of privacy policies, and b) the use of anonymization techniques. The first approach essentially considers each request of access to personal information, and decides whether to grant or deny access based on a specific policy. Policies usually consider the entity making the request, the information requested, and the modality of the access; in the case of mobility, spatio-temporal conditions may also be considered. In our architecture, the distribution of user profile data may be restricted by such a set of privacy policy rules enforced at each profile manager. For example, a set of UPM policies could allow the user client interface and user trusted agents to update personal data and policies,

allow the user GPS module to update location data, and allow a set of service provider profile managers to read the profile attribute values in certain CC/PP components. Note that profile managers are considered trusted agents by their corresponding entities. Several formalisms that have been proposed for access control (see, e.g., [61]) can be easily adapted to the context we are considering. Extensions to these basic models have been proposed in [11] to include temporal constraints that specify periodic time windows where access is denied/accepted, as well as qualitative temporal relationships among accesses (using operators like aslongas, whenevernot, until, . . . ). The presence of spatio-temporal constraints in policies to preserve mobile user privacy has been recently identified as a requirement in [107]. They propose an access control system for moving objects and customer profiles where each access rule is composed of a triple s, o, +/− and a spatio-temporal constraint stc. The triple specifies the subject (s), which may be a specific service provider, the object (o), which may be a specific user profile, and a flag (+/−) specifying whether it is a positive or negative access rule. The implicit access mode is read. The constraint stc defines the spatio-temporal context of access rule application, and it is composed of a location and a time interval. For simplicity in the definition of rules, the rule components can be defined at different levels of granularity. Although this is still a preliminary study, it is an example of adaptation of the general idea of database access control to the release of mobile users' profile data. It could probably be applied to the CARE middleware without much effort.
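As a purely hypothetical instance of such a rule, ⟨s = PizzaFinderSP, o = Alice's profile, +⟩ with stc = ⟨downtown Milan, weekdays 12:00–14:00⟩ would allow that service provider to read Alice's profile only while she is downtown at lunchtime.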

A general concern about the access control approach is the boolean result returned by the evaluation of rules at each access request. The access is either granted or denied; this means the whole profile is either released or not released and, in some cases, the denial may lead to loss of service.

In [13] an extension to the classical approach, introducing conditional granting, is proposed. In such a model, the access can be granted provided that the requester satisfies some conditions at the time of access and agrees to fulfill certain obligations in the future, for example notifying the owner when the information is used for certain purposes. A totally different approach is based on anonymization [96, 46]. In this case, instead of denying access to the information, the information is properly manipulated, so that it preserves a form of anonymity. The main idea is that any sensitive information released to an untrusted entity should not be connectable by this entity to the specific individual it refers to; i.e., it should be impossible for this entity to distinguish among the k individuals to which the released information potentially refers. This can be achieved by an appropriate middleware using pseudonyms instead of real user names, hiding real network addresses, and appropriately obfuscating information that may reveal the identity of the user. Obfuscation techniques are usually based on generalization of values using, e.g., granularity hierarchies, or they can be based on truncation of values. As trivial examples, consider the truncation of zip codes to 3 or 4 digits, the truncation of SSNs or phone numbers, as well as the obfuscation of location data by releasing, e.g., the name of the closest city instead of the precise GPS coordinates. We believe that it would be very interesting and challenging to further investigate, in the context of advanced profiling for mobile users, both the access control and the anonymization approaches, and their possible integration in a single privacy protection solution.

8.2.3 Optimization techniques and caching

As reported in Chapter 4, the algorithm for policy evaluation is particularly efficient, having linear complexity. In contrast, the algorithm for cycle detection and resolution can be optimized in order to reduce the computation time when a cycle is encountered.

Further optimizations consist in developing new modules using programming languages more efficient than Java for critical tasks like merging distributed profiles. Moreover, a crucial issue for optimizing our middleware consists in devising caching techniques and algorithm refinements to reduce the time delay due to network latency when retrieving context data from remote sources. We have already implemented an optimization that consists in storing partial profiles on the context provider. Once a context update is received from a context source, the corresponding profile is updated, and policies can be re-evaluated without retrieving other context data from the distributed context sources. New optimizations that can be adopted consist in caching partial profiles, and requesting at service-request time just those context data whose values have changed since the last communication.
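A minimal sketch of this caching scheme follows. ProfileManager and Profile are hypothetical interfaces introduced for this example, and the version-based change detection is one possible realization of the idea, not the implemented mechanism.

import java.util.HashMap;
import java.util.Map;

// The context provider keeps the last partial profile and version seen from
// each profile manager, and asks a manager only for the attributes that
// changed since the cached version.
public class ProfileCache {

    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, Profile> profiles = new HashMap<>();

    public Profile get(ProfileManager manager) {
        String id = manager.id();
        long current = manager.currentVersion();  // cheap version probe
        Long cached = versions.get(id);
        if (cached == null) {
            profiles.put(id, manager.fetchAll()); // first contact: full fetch
        } else if (current != cached) {
            profiles.get(id).merge(manager.changesSince(cached)); // delta fetch
        }
        versions.put(id, current);
        return profiles.get(id);
    }
}

interface ProfileManager {
    String id();
    long currentVersion();
    Profile fetchAll();
    Profile changesSince(long version);
}

interface Profile {
    void merge(Profile delta);
}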

8.2.4 Scalability issues

Scalability issues are also relevant, since the middleware should be able to scale to a large number of users and profile managers. Even though in this dissertation we have assumed just three entities, each with a corresponding profile manager, the architecture can handle an arbitrary number of profile managers. However, evaluating the scalability of our middleware requires extensive realistic experiments. We are investigating the possibility of using synthetic data from profile generators and movement trace generators, as well as simple testbeds with real users.

Acknowledgments

First of all, I would like to thank my supervisor Claudio Bettini: for everything I have learnt from him, for his invaluable ideas and suggestions, and for the support he gave me during these years. He did much more than what is expected from a supervisor: Thank you! I would also like to thank the other people who collaborated on this work, giving fundamental contributions: Alessandra Agostini, Dario Maggiorini, Linda Pareschi, and in particular my M.Sc. thesis supervisor Nicolò Cesa-Bianchi, who introduced me to the very exciting world of research. While doing this work I had the opportunity of working with other valuable people from both academia and industry: Delfina Malandrino, Raffaella Grieco, Francesca Mazzoni, Cristiano Sala, Michele Ruberl. Interacting with them was one of the most profitable experiences of these three years of work. Special thanks go to Vincenzo Pupillo – for his scrupulous kernel recompilations, and nice discussions about mountains – and to Sergio Mascetti, for his Java support, tea preparation, and student-scaring activities. Last but not least, I am really grateful to all the students that, with their care and passion, implemented part of the software architecture: in chronological order, Davide Vitali, Emanuele Giuliani, Marco Ronchi, Carlo Cestana, Enrico Mancini, Marco Millefanti, Massimo Zaroli, Edoardo Tosca, Marco Fornoni, Federico Luraschi, and Paola Lorusso.

Bibliography

[1] A. Acharya, M. Ranganathan, and J. H. Saltz. Sumatra: A Language for Resource-Aware Mobile Programs. In Proceedings of Mobile Object Systems - Towards the Programmable Internet, Second International Workshop, MOS'96, volume 1222 of Lecture Notes in Computer Science, pages 111–130. Springer, 1997.

[2] I. K. Adusei, K. Kyamakya, and K. Jobmann. Mobile Positioning Technologies in Cellular Networks: an Evaluation of their Performance Metrics. In MILCOM 2002, pages 1239–1244, 2002.

[3] A. Agostini, C. Bettini, N. Cesa-Bianchi, D. Maggiorini, D. Riboni, M. Ruberl, C. Sala, and D. Vitali. Towards Highly Adaptive Services for Mobile Computing. In Proceedings of IFIP TC8 Working Conference on Mobile Information Systems (MOBIS), pages 121–134. Springer, 2004.

[4] K. R. Apt and M. Bezem. Acyclic Programs. New Generation Computing, 9(3/4):335–365, 1991.

[5] K. R. Apt, H. A. Blair, and A. Walker. Towards a Theory of Declarative Knowledge. In Foundations of Deductive Databases and Logic Programming, pages 89–148. Morgan Kaufmann, 1988.

[6] L. Ardissono and A. Goy. Tailoring the interaction with users in web stores. User Modeling and User-Adapted Interaction, 10:251–303, 2000.

[7] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.

[8] P. Bahl and V. N. Padmanabhan. RADAR: An In-Building RF-Based User Location and Tracking System. In INFOCOM, pages 775–784, 2000.

[9] R. Bajaj, S. Ranaweera, and D. P. Agrawal. GPS: Location-Tracking Technology. IEEE Computer, 35(4):92–94, 2002.

[10] P. Bellavista, A. Corradi, R. Montanari, and C. Stefanelli. Context-aware Middleware for Resource Management in the Wireless Internet. IEEE Transactions on Software Engineering, Special Issue on Wireless Internet, 29(12):1086–1099, 2003.

[11] E. Bertino, C. Bettini, E. Ferrari, and P. Samarati. An Access Control Model Supporting Periodicity Constraints and Temporal Reasoning. ACM Transactions on Database Systems, 23(3):231–285, 1998.

[12] E. Bertino, A. Mileo, and A. Provetti. Policy Monitoring with User-Preferences in PDL. In Proceedings of Workshop on Nonmonotonic Reasoning, Action, and Change (NRAC'03), pages 37–44, 2003.

[13] C. Bettini, S. Jajodia, X. Wang, and D. Wijesekera. Provisions and Obligations in Policy Rule Management. Journal of Network and Systems Management, 11(3), 2003.

[14] C. Bettini and D. Riboni. Profile Aggregation and Policy Evaluation for Adaptive Internet Services. In Proceedings of The First Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (Mobiquitous), pages 290–298. IEEE, 2004.

[15] R. Bittner, P. Smrcka, P. Vysoký, K. Hána, L. Pousek, and P. Schreib. Detecting of Fatigue States of a Car Driver. In ISMDA, pages 260–273, 2000.

[16] A. Borgida, R. J. Brachman, D. L. McGuinness, and L. A. Resnick. CLASSIC: A Structural Data Model for Objects. In Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, pages 58–67, 1989.

[17] G. Borriello, W. Brunette, M. Hall, C. Hartung, and C. Tangney. Reminding About Tagged Objects Using Passive RFIDs. In Ubicomp, pages 36–53. Springer, 2004.

[18] M. Bowman, R. D. Chandler, and D. V. Keskar. Delivering Customized Content to Mobile Device Using CC/PP and the Intel CC/PP SDK. Technical report, Intel Corporation, 2002.

[19] P. J. Brown. The Electronic Post-it Note: a Metaphor for Mobile Computing Applications. In IEE Colloquium on Mobile Computing and its Applications, 1995.

[20] M. Butler, F. Giannetti, R. Gimson, and T. Wiley. Device Independence and the Web. IEEE Internet Computing, 6(5):81–86, September-October 2002.

[21] CARE middleware architecture Web site. http://webmind.dico.unimi.it/care/.

[22] H. Chen, T. Finin, and A. Joshi. Semantic Web in the Context Broker Architecture. In Proceedings of the Second IEEE International Conference on Pervasive Computing and Communications (PerCom 2004), pages 277–286. IEEE Computer Society, 2004.

[23] H. Chen, F. Perich, T. W. Finin, and A. Joshi. SOUPA: Standard Ontology for Ubiquitous and Pervasive Applications. In 1st Annual International Conference on Mobile and Ubiquitous Systems (MobiQuitous 2004), pages 258–267, 2004.

[24] R. Cheng, B. Kao, S. Prabhakar, A. Kwan, and Y.-C. Tu. Adaptive Stream Filters for Entity-based Queries with Non-Value Tolerance. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB 2005), pages 37–48. ACM, 2005.

[25] J. Chomicki, J. Lobo, and S. A. Naqvi. Conflict Resolution Using Logic Programming. IEEE Transactions on Knowledge and Data Engineering, 15(1):244–249, 2003.

[26] M. Colajanni, R. Grieco, D. Malandrino, F. Mazzoni, and V. Scarano. A Scalable Framework for the Support of Advanced Edge Services. In Proc. of the 2005 Int. Conf. on High Performance Computing and Communications (HPCC’05), pages 1033–1042, Sorrento (NA), Italy, September 2005.

[27] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. McGraw-Hill, 1990.

[28] K. Crammer and Y. Singer. A New Family of Online Algorithms for Category Ranking. In SIGIR 2002 Proceedings, 2002.

[29] N. Damianou, N. Dulay, E. Lupu, and M. Sloman. The Ponder Policy Specification Language. In Policies for Distributed Systems and Networks, International Workshop (POLICY 2001), pages 18–38, 2001.

[30] J. Delgrande, T. Schaub, and H. Tompits. A Framework for Compiling Preferences in Logic Programs. Theory and Practice of Logic Programming, 3(2):129–187, 2003.

[31] A. K. Dey. Understanding and Using Context. Personal and Ubiquitous Computing. Special issue on Situated Interaction and Ubiquitous Computing, 5(1), 2001.

[32] C. Efstratiou, K. Cheverst, N. Davies, and A. Friday. An Architecture for the Effective Support of Adaptive Context-Aware Applications. In Proceedings of the Second International Conference on Mobile Data Management (MDM 2001), pages 15–26, 2001.

[33] P. Festa, P. Pardalos, and M. Resende. Feedback Set Problems. In D.-Z. Du and P. M. Pardalos, editors, Handbook of Combinatorial Optimization, Supplement Volume A, pages 209–259. Kluwer Academic Publishers, 2000.

[34] T. W. Finin and D. Drager. A general user modeling system. In Proc. of the 6th Canadian Conference on Artificial Intelligence, Montreal, Canada, pages 24–29, 1986.

[35] J. Fink and A. Kobsa. A review and analysis of commercial user modeling servers for personalization on the world wide web. User Modeling and User-Adapted Interaction, 10:209–249, 2000.

[36] C. L. Forgy. RETE: A Fast Algorithm for the Many Pattern/Many Object Pattern Matching Problem. Artificial Intelligence, 19(1):17–37, 1982.

[37] A. Fox, Y. Chawathe, and E. A. Brewer. Adapting to Network and Client variation using active proxies: Lessons and perspectives. IEEE Personal Communications, 5(4):10–19, 1998.

[38] F. L. Drake, Jr. The XML Bookmark Exchange Language. Technical report, Corporation for National Research Initiatives (CNRI), USA.

[39] F. L. Gandon, M. Sheshagiri, and N. M. Sadeh. ROWL: Rule Language in OWL and Translation Engine for JESS. Technical report, Carnegie Mellon University, February 2004.

[40] H.-W. Gellersen, A. Schmidt, and M. Beigl. Multi-sensor context-awareness in mobile devices and smart artifacts. MONET, 7(5):341–351, 2002.

[41] G. D. Abowd, A. K. Dey, P. J. Brown, N. Davies, M. Smith, and P. Steggles. Towards a Better Understanding of Context and Context-Awareness. In Proceedings of Handheld and Ubiquitous Computing, First International Symposium, volume 1707 of Lecture Notes in Computer Science, pages 304–307. Springer, 1999.

[42] R. Grieco, D. Malandrino, F. Mazzoni, and V. Scarano. Mobile Web Services via Programmable Proxies. In Proc. of the IFIP TC8 Working Conference on Mobile Information Systems - 2005 (MOBIS), pages 139–146, Leeds (UK), December 2005.

[43] B. Grosof. Prioritized Conflict Handling for Logic Programs. In Proceedings of the International Logic Programming Symposium (ILPS), pages 197–211, 1997.

[44] B. Grosof, I. Horrocks, R. Volz, and S. Decker. Description Logic Programs: Combining Logic Programs with Description Logics. In Proceedings of the 12th International Conference on the World Wide Web (WWW-2003), 2003.

[45] B. N. Grosof and T. C. Poon. SweetDeal: Representing Agent Contracts with Exceptions using XML Rules, Ontologies, and Process Descriptions. In Proceedings of the Twelfth International World Wide Web Conference (WWW 2003), pages 340–349, 2003.

[46] M. Gruteser and D. Grunwald. Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking. In 2nd International Conference on Mobile Systems, Applications, and Services (MobiSys), pages 42–47, 2003.

[47] T. Gu, X. H. Wang, H. K. Pung, and D. Q. Zhang. An ontology-based context model in intelligent environments. In Proceedings of Communication Networks and Distributed Systems Modeling and Simulation Conference, San Diego, California, USA, January 2004.

[48] U. Güntzer, W.-T. Balke, and W. Kießling. Optimizing Multi-Feature Queries for Image Databases. In VLDB 2000, Proceedings of 26th International Conference on Very Large Data Bases, pages 419–428, 2000.

[49] Y. Guo, Z. Pan, and J. Heflin. An Evaluation of Knowledge Base Systems for Large OWL Datasets. In International Semantic Web Conference, pages 274–288, 2004.

[50] V. Haarslev and R. Möller. RACER System Description. In Proceedings of Automated Reasoning, First International Joint Conference (IJCAR 2001), volume 2083 of Lecture Notes in Computer Science, pages 701–706. Springer, 2001.

[51] A. Harter, A. Hopper, P. Steggles, A. Ward, and P. Webster. The anatomy of a context-aware application. In Annual ACM/IEEE International Conference on Mobile Computing and Networking (Mobicom 1999), pages 59–68, 1999.

[52] K. Harumoto, T. Nakano, S. Fukumura, S. Shimojo, and S. Nishio. Effective Web Browsing through Content Delivery Adaptation. ACM Transactions on Internet Technology, 5(4):571–600, 2005.

[53] K. Henricksen and J. Indulska. Developing Context-aware Pervasive Computing Applications: Models and Approach. Pervasive and Mobile Computing, 2(1):37–64, February 2006.

[54] K. Henricksen, J. Indulska, and A. Rakotonirainy. Using Context and Preferences to Implement Self-adapting Pervasive Computing Applications. Software - Practice and Experience, 36(11-12):1307–1330, 2006.

[55] K. Henricksen, S. Livingstone, and J. Indulska. Towards a hybrid approach to context modelling, reasoning and interoperation. In Proceedings of the First International Workshop on Advanced Context Modelling, Reasoning And Management, UbiComp 2004, 2004.

[56] J. Hightower and G. Borriello. Location Systems for Ubiquitous Computing. IEEE Computer, 34(8):57–66, 2001.

[57] I. Horrocks, P. F. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. W3C Member Submission, W3C, May 2004.

[58] I. Horrocks, P. F. Patel-Schneider, and F. van Harmelen. From SHIQ and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 1(1):7–26, 2003.

[59] R. Hull, B. Kumar, D. Lieuwen, P. Patel-Schneider, A. Sahuguet, S. Varadarajan, and A. Vyas. Enabling Context-Aware and Privacy-Conscious User Data Sharing. In Proceedings of the 2004 IEEE International Conference on Mobile Data Management, pages 187–198. IEEE, 2004.

[60] J. Indulska, R. Robinson, A. Rakotonirainy, and K. Henricksen. Experiences in using CC/PP in context-aware systems. In Proceedings of the 4th International Conference on Mobile Data Management, pages 247–261. Lecture Notes in Computer Science, Springer Verlag, LNCS 2574, 2003.

[61] S. Jajodia, P. Samarati, M. L. Sapino, and V. S. Subrahmanian. Flexible Support for Multiple Access Control Policies. ACM Transactions on Database Systems, 26(2):214–260, 2001.

[62] J. R. Hobbs and F. Pan. Time Ontology in OWL. W3C Editor's Draft, W3C, September 2006. http://www.w3.org/2001/sw/BestPractices/OEP/Time-Ontology.

[63] L. Kagal, T. W. Finin, and A. Joshi. A Policy Language for a Pervasive Computing Environment. In 4th IEEE International Workshop on Policies for Distributed Systems and Networks (POLICY 2003), pages 63–75. IEEE Computer Society, 2003.

[64] R. M. Karp. Reducibility Among Combinatorial Problems. In R. E. Miller and J. W. Thatcher, editors, Complexity of Computer Computations, pages 85–103. Plenum Press, New York, 1972.

[65] M. Kazantzidis, I. Slain, T. Chen, Y. Romanenko, and M. Gerla. End-to-end versus Explicit Feedback Measurement in 802.11 Networks. In The Seventh IEEE Symposium on Computers and Communications (ISCC02), pages 429–434, 2002.

[66] G. Klyne, F. Reynolds, C. Woodrow, H. Ohto, J. Hjelm, M. H. Butler, and L. Tran. Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies 1.0. W3C Recommendation, W3C, January 2004. http://www.w3.org/TR/2004/REC-CCPP-struct-vocab-20040115/.

[67] A. Kobsa. Generic user modeling systems. User Modeling and User-Adapted Interaction, 11:49–63, 2001.

[68] A. Kobsa and W. Pohl. The BGP-MS user modeling system. User Modeling and User-Adapted Interaction, 4(2):59–106, 1995.

[69] K. Lai and M. Baker. Measuring Bandwidth. In Proceedings of IEEE INFOCOM ’99, pages 235–245, 1999.

[70] N. Leone, G. Pfeifer, W. Faber, T. Eiter, G. Gottlob, S. Perri, and F. Scarcello. The DLV System for Knowledge Representation and Reasoning. Technical Report cs.AI/0211004, arXiv.org, November 2002.

[71] H. Levy and D. W. Low. A Contraction Algorithm for Finding Small Cycle Cutsets. Journal of Algorithms, 9(4):470–493, 1988.

[72] W. Li, Q. Vu, D. Agrawal, Y. Hara, and H. Takano. Powerbookmarks: a System for Personalizable Web Information Organization, Sharing, and Management. Computer Networks, 31(11):1375–1389, 1999.

[73] Liberty Alliance. Liberty ID-SIS Personal Profile Service. Technical Report, March 2006. http://www.projectliberty.org/resources/whitepapers.php.

[74] R. MacGregor and R. Bates. The LOOM Knowledge Representation Language. Technical Report ISI/RS-87-188, Information Science Institute, University of Southern California, Marina del Rey (CA), USA, 1987.

[75] D. Maggiorini and D. Riboni. Continuous Media Adaptation for Mobile Computing Using Coarse-Grained Asynchronous Notifications. In 2005 International Symposium on Applications and the Internet (SAINT 2005), Proceedings of the Workshops, pages 162–165. IEEE Computer Society, 2005.

[76] C. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999.

[77] D. L. McGuinness and F. van Harmelen. OWL Web Ontology Language. W3C Recommendation, W3C, February 2004. http://www.w3.org/TR/owl-features/.

[78] I. Millard, D. D. Roure, and N. Shadbolt. The Use of Ontologies in Contextually Aware Environments. In First International Workshop on Advanced Context Modelling, Reasoning And Management, pages 42–47, 2004.

[79] B. Motik, U. Sattler, and R. Studer. Query Answering for OWL-DL with Rules. In The Semantic Web – ISWC 2004: Third International Semantic Web Conference, pages 549–563, 2004.

[80] N. S. Ryan, J. Pascoe, and D. R. Morse. Enhanced Reality Fieldwork: the Context-aware Archaeological Assistant. In V. Gaffney, M. van Leusen, and S. Exxon, editors, Computer Applications in Archaeology 1997, British Archaeological Reports. Tempus Reparatum, 1998.

[81] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence, 2003.

[82] H. Ohto and J. Hjelm. CC/PP Exchange Protocol. W3C Note, W3C, June 1999. http://www.w3.org/1999/06/NOTE-CCPPexchange-19990624.

[83] Open Mobile Alliance. User Agent Profile Specification. Technical Report WAP-248-UAProf20011020-a, Wireless Application Protocol Forum, October 2001. http://www.openmobilealliance.org/.

[84] J. Pascoe. The Stick-e Note Architecture: Extending the Interface Beyond the User. In IUI ’97: Proceedings of the 2nd international conference on Intelligent user interfaces, pages 261–264. ACM Press, 1997.

[85] R. S. Prasad, M. Murray, C. Dovrolis, and K. Claffy. Bandwidth Estimation: Metrics, Measurement Techniques, and Tools. IEEE Network, 17(6):27–35, 2003.

[86] D. Preuveneers and Y. Berbers. Adaptive Context Management Using a Component-Based Approach. In Proceedings of DAIS 2005, Distributed Applications and Interoperable Systems, 5th IFIP WG 6.1 International Conference, volume 3543 of Lecture Notes in Computer Science, pages 14–26. Springer, 2005.

[87] N. B. Priyantha, A. Chakraborty, and H. Balakrishnan. The Cricket location-support system. In MOBICOM, pages 32–43, 2000.

[88] Q. Zhou and R. Fikes. A Reusable Time Ontology. 2002.

[89] A. Rakotonirainy, J. Indulska, S. W. Loke, and A. B. Zaslavsky. Middleware for Reactive Components: An Integrated Use of Context, Roles, and Event Based Coordination. In Middleware 2001, IFIP/ACM International Conference on Distributed Systems Platforms, volume 2218 of Lecture Notes in Computer Science, pages 77–98. Springer, 2001.

[90] C. Rao, Y. Chen, D.-F. Chang, and M.-F. Chen. iMobile: A proxy-based platform for mobile services. In Proceedings of the First ACM Workshop on Wireless Mobile Internet (WMI 2001). ACM Press, 2001.

[91] RealNetworks. http://www.realnetworks.com/.

[92] P. Resnick, N. Iacovou, M. Sushak, P. Bergstrom, and J. Riedl. GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of the Conference on Computer Supported Cooperative Work, pages 175–186, 1994.

[93] D. Riboni. Feature Selection for Web Page Classification. In EURASIA-ICT 2002 Proceedings of the Workshop, pages 473–477, 2002.

[94] S. Riché and G. Brebner. Storing and Accessing User Context. In Proceedings of the 4th International Conference on Mobile Data Management (MDM 2003), pages 1–12, 2003.

[95] L. Rizzo. Dummynet: a Simple Approach to the Evaluation of Network Protocols. ACM Computer Communication Review, 27(1):31–41, 1997.

[96] P. Samarati. Protecting Respondents' Identities in Microdata Release. IEEE Transactions on Knowledge and Data Engineering, 13(6):1010–1027, 2001.

[97] B. Schilit, N. Adams, and R. Want. Context-aware computing applications. In Proc. of the Workshop on Mobile Computing Systems and Applications, pages 85–90. IEEE, New York, 1994.

[98] A. Shamir. A Linear Time Algorithm for Finding Minimum Cutsets in Reducible Graphs. SIAM Journal On Computing, 8(4):645–655, 1979.

[99] T. C. Przymusinski. On the Declarative Semantics of Deductive Databases and Logic Programs. In Foundations of Deductive Databases and Logic Programming, pages 193–216. Morgan Kaufmann, 1988.

[100] TomTom PLUS services. http://www.tomtom.com/plus/.

[101] US Postal Service. Postal Addressing Standards. Technical Report Publication 28, November 2000.

[102] VLC media player. http://www.videolan.org/vlc/.

[103] X. H. Wang, T. Gu, D. Q. Zhang, and H. K. Pung. Ontology Based Context Modeling and Reasoning using OWL. In Proceedings of Second IEEE Annual Conference on Pervasive Computing and Communications Workshops, pages 18–22. IEEE Computer Society, 2004.

[104] R. Want, A. Hopper, V. Falcao, and J. Gibbons. The Active Badge Location System. ACM Transactions on Information Systems, 10(1):91–102, 1992.

[105] C. A. Welty and N. Guarino. Supporting Ontological Analysis of Taxonomic Relationships. Data and Knowledge Engineering, 39(1):51–74, 2001.

[106] XviD Codec. http://www.xvid.org/.

[107] M. Youssef, V. Atluri, and N. R. Adam. Preserving Mobile Customer Privacy: An Access Control System for Moving Objects and Customer Profiles. In 6th International Conference on Mobile Data Management (MDM). IEEE Press, 2005.