International Journal of Pure and Applied Mathematics, Volume 118, No. 24, 2018. ISSN: 1314-3395 (on-line version). url: http://www.acadpubl.eu/hub/ Special Issue

Category Infused Probabilistic Policy Learning Based Mission - Aware Services for Collaborative Communication

Ankush Rai 1,R. Jagadeesh Kannan 2 School of Computing Science & Engineering, VIT Chennai, India [email protected], [email protected] 2

May 23, 2018

Abstract

During a catastrophic incident caused by a natural disaster, a rescue team that combines players such as firefighters, robots and autonomous aerial vehicles needs an effective cooperation mechanism among its members to save the affected people and avoid major loss. To tackle this challenge of adaptive control and communication between team members, the paper proposes a Mission-Aware Middleware architecture capable of providing a dynamic infrastructure for uninterrupted communication between the players. The middleware monitors the mission context for any deviation, and a reconfiguration action is executed if there is an abnormality. Semantic modelling enables the middleware to comprehend the different mission contexts, and Semantic Web Rule Language instructions are used to understand the degree or severity of the context. The provision of generic resolutions for automatic self-reconfiguration is motivated via rule-based reconfiguration


policies utilizing ontology. Finally, the research work illustrates different mission contexts and middleware solutions, followed by the implementation.

Key Words: Semantic model, Reconfiguration, Adaptive Communication, Ontology, Event-based communication.

1 Introduction

Natural disasters create situations in which victims need to be located and saved. To protect endangered people, a rescue mission is launched immediately in the intervention area, where players such as Autonomous Aerial Vehicles (AAVs), robots and firefighters work together to speed up the recovery process. In this mission, the players actively communicate, collaborate and coordinate the mission tasks with the help of smart devices. These coordinated activities for rescuing victims of natural disasters are termed a Crisis Management System (CMS) [1]. In [2], the authors proposed a framework to support multimedia services, in which each service is offered dynamically with various QoS levels. A posteriori, the metrics for resource utilization are calculated and the system reconfigures itself accordingly. Normally, for an information system, the reasons for adaptation can be corrective, evolutional or perspective [3]. The work in [4] developed a middleware that provides constant performance by mediating resources to the applications that need them. The authors in [5] introduced a context-aware middleware for dynamic environments that uses ontology to describe the semantics of various concepts. Their model incorporates context-triggered actions for real-time ubiquitous applications. In [6], it was proposed that an adaptive middleware can ensure that context processing functions can be added dynamically. Here, context processing patterns are used to customize attributes and schemes using a Domain Specific Language. An architecture composed of a multi-level context mechanism was proposed in [7] and is used for a context-specific dissemination protocol. A mobile middleware that uses the concept of reflection to improve the construction of adaptive applications was described in [8]. In [9], the authors introduced a holistic approach in which the issue is how context management should be integrated with an adaptive


middleware. In [10], a middleware supports ubiquitous context-aware services that ease integration, while autonomy is addressed in the situation modelling components. This research concentrates on event-based communication for context retrieval and processing. A new event in the environment forces the middleware to adapt itself to the new context changes. This reaction by the middleware generally falls under structural and behavioural changes. Structural adaptation refers to reconfiguring the functionality of the communication services, whereas behavioural adaptation refers to adjusting the quality of the communication according to the battery level of the smart devices.
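The distinction between the two adaptation types can be illustrated with a minimal sketch. The event names, the battery thresholds and the returned labels are assumptions made for illustration only and are not taken from the middleware described here.

```python
from dataclasses import dataclass


@dataclass
class ContextEvent:
    kind: str           # hypothetical event names, e.g. "battery_level", "link_lost"
    value: float = 0.0  # battery level in [0, 1] for "battery_level" events


def choose_adaptation(event: ContextEvent) -> str:
    """Map a context event to an adaptation decision (illustrative thresholds)."""
    if event.kind == "battery_level":
        # Behavioural adaptation: same services, different communication quality.
        if event.value < 0.2:
            return "behavioural: text only"
        if event.value < 0.5:
            return "behavioural: reduced quality"
        return "behavioural: full quality"
    if event.kind in ("link_lost", "player_joined"):
        # Structural adaptation: rewire the communication services themselves.
        return "structural: reconfigure communication services"
    return "no adaptation"
```

For example, `choose_adaptation(ContextEvent("battery_level", 0.1))` selects a behavioural change, while a `"link_lost"` event selects a structural one.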

2 Theory Based Rule Generation

Here, ontology is used to represent the architecture models for context reconfigurations. Concepts and relations are derived from the General Collaborative Ontology (GCO); these are instantiated, and rules are processed over the ontological instances, such as GCO:CommunicationFlow, thus creating the corresponding event-based communication, which comprises channel administrators, event regulators and event producers. The resulting ontologically connected instances form a collaboration-level graph. Later, it is converted into the GraphML language by means of an XSLT transformation. For refinement, the graph grammar constructs a valid configuration that contains terminal nodes only (i.e., nodes belonging to the Event Based Communication, EBC). The research work discusses the following context:

Context 1: Consider the situation in Fig. 1, in which an investigator, the AAV, intends to drop water on a burning area where another investigator, fireman2, is carrying out a rescue. The initial situation at the mission and communication levels is represented by the ontology in Figure 3(a) and by the EBC in Figure 3(b). In this particular case, the AAV has to be notified to stop dropping water until fireman2 finishes his task. Another investigator (fireman1), aware of the location of fireman2, establishes a data flow to its coordinator, and the coordinator subsequently reports to the supervisor. As the supervisor communicates with the AAV's coordinator, a warning message is sent to the AAV to stop the action.
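The relay chain of Context 1 can be sketched as a small event-based communication layer. This is an illustrative model only: the `ChannelManager` class, the channel names and the message fields are hypothetical and do not reproduce the GCO or EBC implementation.

```python
from collections import defaultdict


class ChannelManager:
    """Routes events from producers to the consumers subscribed to a channel."""

    def __init__(self):
        self._consumers = defaultdict(list)

    def subscribe(self, channel, consumer):
        self._consumers[channel].append(consumer)

    def publish(self, channel, event):
        # Iterate over a copy so handlers may subscribe while events are delivered.
        for consumer in list(self._consumers[channel]):
            consumer(event)


def relay(manager, name, out_channel):
    """Build a consumer that records its own name and republishes the event."""
    def handler(event):
        manager.publish(out_channel, {**event, "path": event["path"] + [name]})
    return handler


def run_context_1():
    manager = ChannelManager()
    received = []
    manager.subscribe("to_coordinator1", relay(manager, "coordinator1", "to_supervisor"))
    manager.subscribe("to_supervisor", relay(manager, "supervisor", "to_aav_coordinator"))
    manager.subscribe("to_aav_coordinator", relay(manager, "aav_coordinator", "to_aav"))
    manager.subscribe("to_aav", received.append)
    # fireman1 is the event producer: it knows where fireman2 is.
    manager.publish("to_coordinator1", {"msg": "stop dropping water", "path": ["fireman1"]})
    return received
```

Calling `run_context_1()` delivers one warning to the AAV whose `path` lists every intermediate player, which makes the length of the relay chain explicit.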


Fig. 3. (a) EBC deployment: Initial ontology and (b) EBC deployment: Graph representation

However, there is no connection between the AAV and fireman2, and the AAV has to obtain the supervisor's decision through its coordinator, which takes time. The other solution is to establish a connection between the AAV and fireman1 through a new cooperation flow obtained by running the SWRL rule in the decision model. This is represented by the reconfiguration rule in Table 1. The configuration and decision rules are modelled on category theory as a universal modelling language. Here, we have certain observable aspects of the subject, which are then formalized together with the observable relationships between them. The configuration rule and the decision rule in Table 2 depend on the Boolean values of the variables and on the connectors linking them. The whole process of deriving the rule from the given player conditions can be based on the following category modelling.

Let $p$ be the set of feature points of a given dataset, $\{p_1, p_2, \cdots, p_n\}$, where the model features are embedded based on their locality, and where a function is defined to gauge the closing distance between a category set $S = \{p_1(u,v,z), p_2(u,v,z), \cdots, p_i(u,v,z), p_j(u,v,z)\}$ and a set of other points in the $(u,v,z)$ coordinate system. Thus, the energy, or global distance, of a category set is a weighted area in which each element of the category set is weighted according to its distance to its closest point in the data set $\Sigma$:

$$\bar{F}(s) = \left( \int_{x \in s} d^{p}(x)\, ds \right)^{1/p}, \quad 1 \le p \le 4 \qquad (1)$$

where $d(x)$ is the distance from $x \in \mathbb{R}^3$ to its closest point

in $S$. We can use a variational formulation of an evolution equation (like that of Table 1 and Table 2) to construct the minimal forces in a category set that withstand the deformation forces in the model of the given dataset, such as twisting and stretching of feature points, relying on a good initial enclosing approximation of the category set. Therefore, at every iteration the evolution equation follows the gradient of the energy function in order to minimize it. At each step, every point $x$ of the category set $S(t)$ at time $t$ evolves towards the interior of the category set, along the normal direction to $S(t)$ at $x$, with a displacement from the convergent condition that is proportional to:

$$\frac{d(x)\,K}{T} - \Delta d(x) \cdot \vec{n} \qquad (2)$$

where $\vec{n}$ is the inner normal at $x$ and $K$ denotes the mean curvature of the category set at $x$. The tension of the category set is represented by the first term, $d(x)K/T$, which is non-linear, so the evolution process requires a certain number of steps before reaching its equilibrium. A consistent condition of this evolution equation is a Boolean connector in the setting of a finite state model which has $K = 0$ everywhere except at the input data points, where $d = 0$. The better the initial approximation of the category set in successive iterations, the less the tedious non-linear effect of the evolution model is encountered, and its deformation or restoration property is carried over to the subsequent iteration. The scalability and degrees of freedom of the Boolean connector can be modelled with settings of transitional parameters, by modelling its deformation produced by experimentally extracted feature sets. At equilibrium, it achieves a stable Boolean topology which belongs to the cued feature data set, in the form of semantics ruled by the functors defined previously. This is comparable to making every point of the category set evolve along the normal direction to $S(t)$ with a displacement vector from the convergent model set in correspondence with the term $\Delta d(x) \cdot \vec{n}$. Each point of the resulting category set likewise fulfils the steady-state equation $\Delta d(x) \cdot \vec{n} = 0$. The evolving pseudo-category set $S_{ev}$ is initialized with the convex hull of $\Sigma$ and is oriented inwards. Then, this oriented pseudo-category set $S_{ev}$


evolves, subject to topological operations which ensure the connectivity restoration between half-facets whenever a half-facet is deformed. Therefore, taking into account the effects of both internal model deformation noises and the valence component in the translational states of the finite state machine, represented as a graph $G = (V, E)$ with vertices $V$ and edges $E$, the deformable model is given by $T_D(x) = \sum_{i=1}^{n} \phi_i(x) v_i$, where $T_D$ is the complete topology and $\phi_i$ is the continuous piecewise basis function of the vertices (1 if $i = j$ and 0 otherwise) for the $i$th vertex, with $x \in \mathbb{R}^3$. We have an external force vector $F_V^{ext} = [f_1^{ext}, f_2^{ext}, \cdots, f_n^{ext}]$, which tends to minimize the energy function for both external and internal model deformation noises and is given by:

$$S_t V = F_V^{ext} \qquad (3)$$

and

$$S_t f_i^{int} = \sum_{(i,j) \in E} a_{ij} (v_i - v_j) \qquad (4)$$

where the stiffness matrix is represented by $S_t$, whose size is $n \times n$, and $V$ is the vector of control vertices on the category set. The $i$th row and $j$th column element of $S_t$, $a_{ij}$, can be written precisely as:

$$A_{ij} = \begin{cases} \int_{T_D} (\Delta\phi_i \cdot \Delta\phi_j)\, dT_D, & i = j \text{ or } (i,j) \in E \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

Using finite differences in time, the deformation of the model can be represented iteratively as:

$$\frac{V^{(t)} - V^{(t-1)}}{\theta} + A V^{(t)} = F^{ext}_{V^{(t-1)}} \qquad (6)$$

where $V^{(t)}$ is the vector of the model's vertices at the $t$th iteration and $\theta$ is the time step size. Combining the effects of both the external and internal models into one system gives the combined model at time $t$:

$$V^{(t)} = \left( F^{ext}_{V^{(t-1)}} + f^{ext}_{V^{(t-1)}} \right) \theta + V^{(t-1)} \qquad (7)$$

The above equation is then used in combination with labels and actors to derive the connection and decision rule.
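The time stepping can be illustrated numerically with a minimal sketch. It assumes an explicit variant of the finite-difference scheme in which the stiffness term is evaluated at the previous iterate, so that the internal force is taken as $-A V^{(t-1)}$. That reading, the 2x2 stiffness matrix and the constant external force are assumptions chosen only to make the example checkable.

```python
def explicit_step(V, A, F_ext, theta):
    """One explicit update V <- V + theta * (F_ext - A V)."""
    n = len(V)
    AV = [sum(A[i][j] * V[j] for j in range(n)) for i in range(n)]
    return [V[i] + theta * (F_ext[i] - AV[i]) for i in range(n)]


def relax(V, A, F_ext, theta, iterations):
    """Repeat the explicit step a fixed number of times."""
    for _ in range(iterations):
        V = explicit_step(V, A, F_ext, theta)
    return V


# Two coupled control vertices (scalar positions) with a toy stiffness matrix.
A = [[2.0, -1.0], [-1.0, 2.0]]
F_ext = [1.0, 0.0]
V_final = relax([0.0, 0.0], A, F_ext, theta=0.1, iterations=500)
```

At equilibrium the update stops changing $V$, which happens when $A V = F^{ext}$; for this toy system that is $V = (2/3, 1/3)$. The explicit step is only stable for a sufficiently small $\theta$ (below $2/\lambda_{max}$ of $A$), which is one reason an implicit treatment of the stiffness term may be preferred in practice.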


Table 1: Configuration Rule Used in the Design

Now, as shown above, fireman1 can warn the AAV about the location of fireman2, and the AAV will not drop water. This adaptive reconfiguration is depicted in the ontology in Figure 4(a) and in the EBC in Figure 4(b). With this new cooperation flow, the corresponding event producer and the cooperating event consumer are connected through reconfiguration (represented by the dotted line in Figure 4).
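The effect of adding the cooperation flow can be sketched as follows. The rule below is a hypothetical stand-in, not the SWRL rule of Table 1: it models flows as (source, destination) pairs and adds a direct flow whenever the shortest path between the informer and the actor exceeds an assumed hop limit.

```python
from collections import deque


def hops(flows, source, target):
    """Number of flows on the shortest directed path, or None if unreachable."""
    queue, seen = deque([(source, 0)]), {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for src, dst in flows:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append((dst, dist + 1))
    return None


def reconfigure(flows, informer, actor, max_hops=1):
    """Hypothetical rule: add a direct cooperation flow when the path is too long."""
    distance = hops(flows, informer, actor)
    if distance is None or distance > max_hops:
        return flows | {(informer, actor)}
    return flows


initial = {
    ("fireman1", "coordinator1"),
    ("coordinator1", "supervisor"),
    ("supervisor", "aav_coordinator"),
    ("aav_coordinator", "aav"),
}
adapted = reconfigure(initial, "fireman1", "aav")
```

In this toy graph the warning needs four flows to reach the AAV before reconfiguration and a single flow afterwards; applying the rule a second time leaves the configuration unchanged.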

Fig. 4. (a) Illustration of EBC deployment during reconfiguration process, (b) EBC Deployment: Graph after Reconfiguration Event.

3 LEARNING MODEL FOR PROPOSED MIDDLEWARE SERVICE

We can model the correlation between corresponding users (CoUs), i.e., key users who dictate the plan of operation, and collaborative users (CUs), i.e., users who support the CoUs, in a wireless communication environment. The baseband signal received at the $n$th CU during the spectrum sensing interval, denoted by $y_n(t)$, can be composed as:

$$y_n(t) = \begin{cases} x_i(t), & nc \\ G_i(t)\,C(t) + x_i(t), & c \end{cases} \qquad (8)$$

where $x_i(t)$ represents the additive white Gaussian noise, $G_i(t)$ is the channel gain, which models the multipath fading channel, and $C(t)$ represents the CoU's signal. Also, $nc$ and $c$ are the hypotheses of a Non-Corresponding User signal and of the signal transmitted from the CoU, respectively.
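The two hypotheses can be simulated with a short sketch. The constant gain, the constant CoU signal and the noise level are illustrative assumptions; a real multipath fading gain would vary over time.

```python
import random


def received_sample(rng, present, signal, gain, noise_std):
    """One baseband sample under hypothesis c (present=True) or nc (present=False)."""
    noise = rng.gauss(0.0, noise_std)  # additive white Gaussian noise x_i(t)
    if not present:
        return noise                   # hypothesis nc: noise only
    return gain * signal + noise       # hypothesis c: G_i(t) C(t) + x_i(t)


def mean_energy(samples):
    """Average squared amplitude over the sensing interval."""
    return sum(s * s for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
with_signal = [received_sample(rng, True, 1.0, 0.8, 0.1) for _ in range(1000)]
noise_only = [received_sample(rng, False, 1.0, 0.8, 0.1) for _ in range(1000)]
```

Comparing `mean_energy(with_signal)` with `mean_energy(noise_only)` shows the separation between the two hypotheses that a sensing CU can exploit.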

Table 2: Decision Rule Used in the Design

Fig. 5. Illustration of EBC deployment during power diminution Event

To sense the communication spectrum, a correlation coefficient is required that models the two sensed signals as the ratio of their covariance over the time interval to the product of their standard deviations. However, these signals are subject to multipath fading between the two cross-sensing channels $G_a(t)$ and $G_b(t)$. Thus, the cross-correlation coefficient is given as:

$$\delta_{ab} \approx \frac{\mathrm{cov}(G_a(t), G_b(t))}{\sigma_{y_a(t)}\,\sigma_{y_b(t)}} \qquad (9)$$

This enables us to derive the channel gain at each CU from the deterministic contribution of the scattered-signal loss due to diffusion and the Line of Sight (LOS) channel gain as:

$$G_i(t)^{*} = G_{i,DIFF}(t) + \delta_{ab}(t) \cdot G_{i,LOS}(t) \qquad (10)$$

Thus, the correlation coefficient between CoUs and CUs can be expressed as:

$$c_{a,b}(t) = \begin{cases} 0, & nc \\ \dfrac{G_i(t)\,C(t) + x_i(t)}{\sigma_{y_a(t)}\,\sigma_{y_b(t)}}, & c \end{cases} \qquad (11)$$

In a situation like this, it is easier to use this coefficient to determine the connectivity strength and, based on that, to accommodate a dynamic wireless network topology [18], so that mission commands are dispatched to the CUs when the CoU cannot issue them directly owing to problems such as poor connection strength or multipath fading of signals. Therefore, we use the probabilistic policy based learning technique to reconfigure the action-state pairs (see Figure 5) of the ontological decision making in the proposed middleware service.

Definition 1. An MDP is a 4-tuple $(S, A, T_p, R_{MDP})$, where $S$ is the set of states; $A$ is the set of actions, with $A(i)$, also written $U(i)$ below, the set of actions available at state $i$; $T_p(i,j)$ is the probability of transition from state $i$ to state $j$ when performing action $a \in U(i)$ in state $i$; and $R_{MDP}(s,a)$ is the reward received when performing action $a$ in state $s$.

We take $R_{MDP}$ to be non-negative and bounded by $R_{Max}$, i.e., $\forall s: 0 \le R_{MDP} \le R_{Max}$. For simplicity, we assume that the reward is deterministic, although all of our results apply when $R_{MDP}$ is stochastic. A policy for an MDP assigns, at each time $t$ and for each state $s$, a probability of performing action $a \in U(s)$, given the history of action-state pairs during the ontological communication of our middleware service:


$$H_{t-1} = \{ s_1, a_1, r_1, \cdots, s_{t-1}, a_{t-1}, r_{t-1} \} \qquad (12)$$

A policy $P$ depends principally on the current state and not on its history. A deterministic policy $P$ thus assigns a unique action to each state. While following a policy $P$, we execute action $a_t$ at state $s_t$ at time $t$ and observe a reward $r_t$ (distributed according to $R_{MDP}(s, a)$) and the next state $s_{t+1}$ (distributed according to $P^{MDP}_{s_t, s_{t+1}}(a_t)$). We combine the sequence of rewards into a single value called the return, and the goal is to maximize it. The computation focuses on the discounted return, which has a parameter $\gamma \in (0, 1)$; the discounted return of policy $P$ is:

$$V^{P}_{MDP} = \sum_{t=0}^{\infty} \gamma^{t} r_t \qquad (13)$$

where $r_t$ is the reward observed at time $t$. Since all the rewards are bounded by $R_{Max}$, the discounted return is bounded by $R_{Max}/(1-\gamma)$. For a sequence of state-action pairs, let the covering time, denoted by $C_0$, be an upper bound on the number of state-action pairs, starting from any pair, until all state-action pairs appear in the sequence. The probabilistic policy based learning algorithm estimates the state-action value function (for the discounted return) as follows:

$$Q_{c+1}(s, a) = Q_c(s, a) + \alpha_t(s, a) \left( R_{MDP}(s, a) + \gamma \max_{b \in U(s')} Q_c(s', b) - Q_c(s, a) \right) \qquad (14)$$

where $s'$ is the state reached from state $s$ when performing action $a$ at time $t$. Probabilistic policy based learning is an asynchronous process, as it updates a single entry at every step. The algorithmic process is as follows:
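A minimal tabular sketch of one update step is given here for illustration. It uses the standard temporal-difference form of the state-action update, which is an assumption about the intended rule rather than the authors' own listing; the state names, the action set and the reward value are hypothetical.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha, gamma):
    """Update one (state, action) entry in place and return its new value."""
    best_next = max(Q[(next_state, b)] for b in actions(next_state))
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return Q[(state, action)]


def actions(state):
    # Hypothetical action set U(s): issue a command directly or relay it via a CU.
    return ["direct", "relay"]


Q = defaultdict(float)  # unseen entries start at 0
first = q_update(Q, "poor_link", "relay", 1.0, "good_link", actions, alpha=0.5, gamma=0.9)
second = q_update(Q, "poor_link", "relay", 1.0, "good_link", actions, alpha=0.5, gamma=0.9)
```

Only the visited entry changes at each call, which is what makes the process asynchronous: with these toy values the entry moves from 0 to 0.5 and then to 0.75 as the same transition is observed twice.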


4 RESULTS & CONCLUSION

The distributed adaptability offered by the presented work is the primary advantage of this architecture compared to previous research. The Jess Rule Engine is used for implementing the Semantic Web Rule Language (SWRL) rules, whereas the Pellet engine is used for inferences. Protégé is used to define the ontology, while the graphical part is implemented through the Java Universal Network/Graph framework using GMTE. For implementation, FACUS (Framework for Adaptive Collaborative Ubiquitous Systems) has been used. In this framework, a communicating entity is represented by a node and is executed on a physical machine. A communication link (flow) between the entities is represented by the properties hasSource and hasDestination. Data flow is handled by an external component such as the Tool concept. The initial stage of this situation at the application, collaboration and messaging levels is shown in Figure 6. Consider that fireman1 loses the connection with its coordinator while searching for a victim. Once the connection is lost, the coordinator becomes aware of the situation and, thanks to semantic policies, the coordinator and the other investigator shift to ad hoc mode. The local decision of the lost investigator also changes to ad hoc mode automatically, so that communication is re-established. Figure 6 shows the adaptive collaboration and middleware graph in the implementation.
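The connection-loss scenario can be sketched as a simple policy function. This is an illustrative stand-in for the semantic policies and does not use the FACUS API; the `Player` class, the mode names and the team layout are assumptions.

```python
class Player:
    def __init__(self, name, coordinator=None):
        self.name = name
        self.coordinator = coordinator
        self.mode = "infrastructure"  # hypothetical default mode name


def on_connection_lost(lost, team):
    """Switch the lost investigator, its coordinator and its team peers to ad hoc."""
    affected = [p for p in team
                if p is lost
                or p is lost.coordinator
                or p.coordinator is lost.coordinator]
    for player in affected:
        player.mode = "adhoc"
    return [p.name for p in affected]


coordinator1 = Player("coordinator1")
aav_coordinator = Player("aav_coordinator")
fireman1 = Player("fireman1", coordinator1)
fireman2 = Player("fireman2", coordinator1)
aav = Player("aav", aav_coordinator)
team = [coordinator1, fireman1, fireman2, aav_coordinator, aav]
switched = on_connection_lost(fireman1, team)
```

Here only the players that share fireman1's coordinator change mode, while the AAV and its coordinator keep their original configuration.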

Fig. 6 (a). Instance of initial collaboration formed by semantically connecting the ip addresses of actors in the given fire fighting operation. (b). Initial middleware graph generated in the application by the proposed algorithm to coordinate fire fighters. (c). Instance of collaboration opportunity sensed by the proposed algorithm for adaptive collaboration between the players in fire-fighting operation. (d). Middleware graph generated after adaptation based on the sensed collaboration opportunity by the proposed algorithm.

The results in Figure 7 show that the static system rejects more sessions than the system based on probabilistic policy based learning. These results suggest that probabilistic policy based learning is able to perform efficient admission control by directly considering the status of the end-to-end goals. Figures 8 and 9 also suggest that probabilistic policy based learning is too conservative and the static system is too imprudent in admitting sessions. Blocking and dropping decrease with larger network sizes, owing to the availability of more resources to accommodate the fixed load. Furthermore, probabilistic policy based learning achieves the smallest percentage of service disruption for all metrics.

Fig 7: Plot representing session blockage percentage with the number of nodes in the simulated network.

Note: The service disruption of the proposed algorithm ranges from 22% to 34% as the number of nodes rises from 25 to 200, which is a significant improvement over the current literature. Only the percentage of service disruption due to delay increases with larger networks, owing to the possibility of using longer paths. This figure shows the ability of probabilistic policy based learning to adapt parameters accurately and to react when the threshold for any Quality of Service (QoS) parameter is violated. Moreover, it achieves the highest probability of all QoS metrics being supported simultaneously, which illustrates its capability to consider multiple end-to-end goals and constraints. The complexity of the presented framework can be expressed with $c$, and the state for the three-dimensional state-action pair set with $k$. The complexities of the framework and of a deep learning model are $O(ck)$ and $O(n^2)$ respectively, as summarized in Table 3. Here, the depth of the state space depends on the maximum availability of state-action pair sets, which in turn grows sub-linearly with the $n$ iteration cycles. The method is reduced from its exponential clock time when the duplicated state-action pairs are limited.


References

[1] Mecella, M.; Angelaccio, M.; Krek, A.; Catarci, T.; Buttarazzi, B.; Dustdar, S., "WORKPAD: an Adaptive Peer-to-Peer Software Infrastructure for Supporting Collaborative Work of Human Operators in Emergency/Disaster Scenarios", International Symposium on Collaborative Technologies and Systems, May 2006.

[2] Paulo Roberto Massa Cereda, João José Neto, "A middleware architecture for adaptive devices", Procedia, Volume 109, 2017, Pages 1158-1163.

[3] Jesús M. T. Portocarrero, Flavia C. Delicato, Paulo F. Pires, Bruno Costa, Wei Li, Weisheng Si, Albert Y. Zomaya, "RAMSES: A new reference architecture for self-adaptive middleware in Wireless Sensor Networks", Ad Hoc Networks, Volume 55, February 2017, Pages 3-27.

[4] Yuvraj Sahni, Jiannong Cao, Xuefeng Liu, "MidSHM: A Middleware for WSN-based SHM Application using Service-Oriented Architecture", Future Generation Computer Systems, Volume 80, March 2018, Pages 263-274.

[5] Georgios Lilis, Maher Kayal, "A secure and distributed message oriented middleware for smart building applications", Automation in Construction, Volume 86, February 2018, Pages 163-175.

[6] Jianjiang Li, Qian Ge, Jie Wu, Yue Li, Xiaolei Yang, Zhanning Ma, "Research and implementation of a distributed transaction processing middleware", Future Generation Computer Systems, Volume 74, September 2017, Pages 232-240.

[7] Balakrishnan, D. et al., "Adaptive Context Dissemination in Heterogeneous Environments", IEEE Transactions on Mobile Computing, Volume 13, Issue 6, October 2013.

[8] Soobin Jeon, Inbum Jung, "Experimental evaluation of improved IoT middleware for flexible performance and efficient connectivity", Ad Hoc Networks, Volume 70, 1 March 2018, Pages 61-72.


[9] G. Lau, M. Al-Sabah, M. Jaseemuddin, H. Razavi, M. Bhuiyan, "Context-aware RAON middleware for opportunistic network", Pervasive and Mobile Computing, Volume 41, October 2017, Pages 28-45.

[10] Ali Khalili, Massimo Narizzano, Lorenzo Natale, Armando Tacchella, "Learning middleware models for verification of distributed control programs", Robotics and Autonomous Systems, Volume 92, June 2017, Pages 139-151.
