Reinforcement Learning Framework for Energy Efficient Wireless Sensor Networks: A Survey



TIJER || ISSN 2349-9249 || Technix International Journal for Engineering Research

1st Ankit B Patel (PG Student), 2nd Prof. Abhay Upadhyay (Assistant Professor), 3rd Dr. Hitesh Shah (Professor)
1 Electronics and Communication Engineering (CSE), L.D. College of Engineering, Ahmedabad, India
2 L.D. College of Engineering, Ahmedabad, India
3 G.H. Patel College of Engineering & Technology, Vidhyanagar, India
[email protected], [email protected], [email protected]

Abstract – Wireless sensor networks are widely used in commercial and industrial areas. A wireless sensor network (WSN) is a highly distributed network of small, lightweight sensing nodes deployed in large numbers at multiple locations. These sensor nodes perform sensing, processing and communication. Routing is a major challenge in WSNs because of their dynamic nature. Practical implementation of a wireless sensor network faces various challenges, such as energy efficiency, responsiveness, heterogeneity, systematic design, privacy and security. Among these, energy efficiency has the greatest impact on the network. Sensor nodes are deployed randomly in space, and it is usually not possible to replace the batteries of the nodes after deployment because doing so would be very costly or practically impossible. Power consumption must therefore be minimized, which results in increased network lifetime. Many researchers have presented algorithms with different routing policies and developed new routing protocols. Further research can be done by developing new routing protocols, modifying existing algorithms, developing new algorithms for new routing policies, or implementing new topologies in wireless sensor networks.
The network lifetime can be increased by choosing the optimal path for routing data packets through the network. This paper reviews several useful papers on reinforcement learning frameworks in the context of energy efficiency.

Index Terms – Energy efficiency, responsiveness, heterogeneity, sensor nodes, routing

I. INTRODUCTION

Wireless sensor networks were originally conceived for military applications such as battlefield monitoring, but today these networks serve many purposes in both industry and consumer applications. Environmental monitoring, industrial process control and machine health monitoring are just a few examples of areas where wireless sensor networks are being put to use. The growth of applications and the constant evolution of cheaper and more capable hardware make wireless sensor networks an interesting domain with many potential research opportunities.

A wireless sensor network is typically composed of multiple autonomous wireless sensor nodes. These nodes gather data about their environment and work together to forward the data to centralized locations called base stations or sinks. The sensor nodes are equipped with various sensors that allow them to monitor their environment; the type and number of sensors available on a node depend on the application. Common examples are sensors for measuring temperature, humidity, movement and visible light. The data that each node gathers from its sensors is passed along through the wireless network to special nodes that are responsible for collecting it. These nodes usually do not perform any measurements themselves and are called base stations or sink nodes. The sink nodes forward the collected data to a different system that they are connected to; this can be a directly connected computer or a device on a second network interface of the sink. This usage scenario is often referred to as data collection.

Figure 1.1: Overview of a wireless sensor network topology. Regular nodes are shown in white; the sink node is shown in black and forwards its gathered data to the connected computer.

Besides a set of sensors, each node features a microcontroller, a wireless antenna and a battery power source.
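The data-collection pattern described above can be illustrated with a small sketch. The topology, node identifiers and readings here are made up for illustration and are not taken from the paper: each node forwards its reading hop by hop along a routing tree until it reaches the sink.

```python
# Illustrative sketch of WSN data collection: each node forwards its
# reading along a routing tree until it reaches the sink (node 0).
# The topology and readings below are invented for this example.

parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}  # child -> next hop toward sink

def route_to_sink(node):
    """Return the multi-hop path from a node to the sink (node 0)."""
    path = [node]
    while node != 0:
        node = parent[node]
        path.append(node)
    return path

readings = {3: 21.5, 5: 19.8}  # temperature samples at two leaf nodes
for node, value in readings.items():
    print(f"node {node}: reading {value} travels via {route_to_sink(node)}")
```

Every packet thus traverses a small number of intermediate nodes before reaching the sink, which is why the choice of routing path directly affects per-node energy consumption.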
The microcontroller integrates a low-power processor core, program memory, RAM and an input/output interface for the wireless antenna. Because of this structure, sensor nodes seemingly resemble modern-day computers, but it is important to keep in mind that their capabilities are of a much smaller magnitude in terms of performance: processors run at clock frequencies of only a few megahertz, and the available program and RAM memory is typically less than 100 kilobytes. This limited hardware, however, allows sensor nodes to achieve acceptable lifetimes on regular batteries.

II. REINFORCEMENT LEARNING

Reinforcement learning takes place as a result of interaction between an agent and the world. The idea behind learning is that the percepts received by an agent should be used not only for acting, but also for improving the agent's ability to behave optimally in the future to achieve its goal.

Use of reinforcement learning in energy efficient networks: with a reinforcement learning technique we have only knowledge about the environment, i.e. the existing algorithms, available protocols and the topology information used to implement the wireless sensor network. The agent may receive some evaluation of its action (a reinforcement), but it is not told which action is the correct one to achieve the goal, i.e. to design an energy efficient network. The network lifetime can be increased by choosing the optimal path for routing data packets through the network.
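The learning loop sketched above (act, receive a reinforcement signal, improve future behavior) is commonly formalized as tabular Q-learning. The following is a minimal generic sketch, not taken from any of the surveyed papers; the parameter values are illustrative assumptions.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch: the agent learns action values from
# reward feedback alone, without being told which action is correct.
# alpha, gamma and epsilon values are illustrative assumptions.
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration
Q = defaultdict(float)                   # Q[(state, action)] -> estimated value

def choose_action(state, actions):
    """Epsilon-greedy: mostly exploit the best-known action, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """Move Q toward the observed reward plus discounted best future value."""
    best_next = max(Q[(next_state, a)] for a in next_actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

In a routing setting, a state could be the current node, an action the choice of next hop, and the reward a function of delivery success or energy cost; the surveyed papers instantiate this loop in different ways.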

III. LITERATURE SURVEY

A Survey of Machine Learning in Wireless Sensor Networks [1]

This paper surveys machine learning methods that have been applied in WSNs to solve networking and application problems. Fundamental limits of learning algorithms are addressed and future machine learning research directions are highlighted. Research in the WSN area has focused on two separate aspects of such networks: networking issues, such as capacity, delay and routing strategies; and application issues. The authors surveyed the machine learning methods used in WSNs from both aspects. The use of MEMS technology is addressed, and the paper describes why machine learning is useful when using MEMS technology. Machine learning in WSN applications is surveyed from two perspectives, namely network-associated issues and application-associated issues. For network-associated issues, the different machine learning algorithms applied in WSNs to improve network performance are discussed; for application-associated issues, the machine learning methods used for information processing in WSNs are summarized. The authors presented different machine learning algorithms applied in WSNs to solve specific network-associated problems, such as energy-aware communication, node localization and resource allocation. They also surveyed the application of machine learning to WSN information processing, including object or event identification and pattern recognition, and discussed the future trend of machine learning in WSNs.

Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications [2]

This paper presents an extensive literature review, covering the period 2002-2013, of machine learning methods used to address common issues in wireless sensor networks (WSNs), and evaluates the advantages and disadvantages of each proposed algorithm against the corresponding problem.
The authors also provided a comparative guide to aid WSN designers in developing suitable machine learning solutions for their specific application challenges. Machine learning (ML) was introduced in the late 1950s as a technique for artificial intelligence (AI). The authors observed that the promise of machine learning lies in exploiting historical data to improve the performance of sensor networks on given tasks without the need for re-programming. A wide variety of important up-to-date machine learning algorithms are presented and compared in terms of their strengths and weaknesses. The paper provides a comprehensive overview that groups recent techniques roughly into supervised, unsupervised and reinforcement learning methods. Machine learning algorithms are discussed based on their target WSN challenges, so as to encourage the adoption of existing machine learning solutions in WSN applications, with simple examples given in the context of WSNs. The paper briefly reviews existing learning efforts that address functional issues in WSNs such as routing, localization, clustering, data aggregation, query processing and medium access control. The authors also investigated machine learning solutions for fulfilling non-functional requirements, i.e. those which determine the quality or enhance the performance of functional behaviors; examples of such requirements include security, quality of service (QoS) and data integrity. They discussed major difficulties and open research problems for machine learning in WSNs, and concluded that numerous issues are still open and need further research, such as developing lightweight and distributed message passing techniques, online learning algorithms, hierarchical clustering patterns, and adopting machine learning for the resource management problem of wireless sensor networks.
Reinforcement Learning for Context Awareness and Intelligence in Wireless Networks: Review, New Features and Open Issues [3]

In this paper the authors advocated the use of reinforcement learning (RL) to achieve context awareness and intelligence. They presented an overview of classical RL and three extensions (events, rules, and agent interaction and coordination) applied to wireless networks. They discussed how several wireless network schemes have been approached using RL to provide network performance enhancement, along with the open issues associated with this approach. Throughout the paper, discussions are presented in a tutorial manner and related to existing work, in order to establish a foundation for further research in this field: specifically, improving the RL approach in the context of wireless networking, improving existing schemes through the use of the extensions, and designing and implementing RL in new schemes. In wireless networks, context awareness enables each host to be aware of its operating environment, while intelligence enables each host to make the right decision at the right time to achieve optimum performance. The authors provided background and motivation for applying RL in wireless networks through the introduction of policy-based and intelligence-based approaches, respectively. They also discussed traditional RL, including state, action and reward, and open issues in the context of wireless networking. The paper briefly explains features that are not used in the traditional RL approach and are not yet widely applied in the wireless networking literature, yet have great potential for performance enhancement: events, rules, and agent interaction and coordination.
The paper provides tutorial-based discussions of how problems in wireless networks, including routing, resource management and dynamic channel selection (DCS), are solved using RL, and covers the implementation of RL for cognitive radio networks. Open issues in applying the RL approach in wireless networks are discussed. RL has been successfully applied to routing, resource management, dynamic channel selection and other network functions, demonstrating significant performance enhancement; for dynamic channel selection, performance enhancement has been shown both in simulation and in a real implementation on a cognitive radio network platform. A great deal of future work remains in the use of RL, including the design of events and rules, multi-agent approaches, and the open issues raised in the paper.

Decentralized Reinforcement Learning for Energy-Efficient Scheduling in Wireless Sensor Networks [4]

The authors presented a self-organizing reinforcement learning (RL) approach for scheduling the wake-up cycles of nodes in a wireless sensor network. The approach is fully decentralized and allows sensor nodes to schedule their active periods based only on their interactions with neighboring nodes. They implemented the RL approach in the OMNeT++ sensor network simulator and illustrated how sensor nodes arranged in line, mesh and grid topologies autonomously uncover schedules that favor the successful delivery of messages along a routing tree while avoiding interference.
The limited resources of sensor nodes make the design of a WSN application challenging. In the active state all the components of a node (CPU, sensors, radio) are powered, allowing the node to collect, process and communicate information. In the sleep state all these components are switched off, allowing the node to run on an almost negligible amount of energy; however, nodes in sleep mode cannot communicate with others, since their radio transmitter is switched off. The fraction of time in which a node is in the active mode is referred to as the duty cycle. The authors demonstrated how the performance of a WSN can be further improved if nodes not only synchronize, but also desynchronize with one another. They showed that coordinating the duty cycles of sensor nodes can be done successfully using multi-agent systems and reinforcement learning (RL) frameworks, by rewarding successful interactions (e.g., transmission of a message) and penalizing those with a negative outcome (e.g., message loss or overhearing). They illustrated the benefits of the proposed RL approach by implementing it in the OMNeT++ simulator and studying three different WSN topologies, namely line, mesh and grid. They showed that nodes form coalitions which enable quicker delivery of data packets to the base station, allowing shorter active periods and therefore lower energy consumption. The paper presents the energy challenges related to communication and routing protocols in WSNs, and a decentralized RL approach for energy-efficient wake-up scheduling that drives nodes to coordinate their wake-up cycles based only on local interactions. The authors compared the performance of the self-organizing RL approach to a standard, fully synchronized network across the three topologies and showed that agents are able to independently adapt their duty cycles to the routing tree of the network.
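The reward scheme described above (reward successful transmissions, penalize losses and overhearing) can be sketched for a single node's choice of active slots. The slot count, reward values and update rule below are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
# Illustrative sketch of RL-based wake-up slot selection for one node.
# Each frame has a fixed number of time slots; the node keeps a value per
# slot and reinforces slots where interactions with neighbors succeed.
# Reward values and the update rule are assumptions for illustration.
NUM_SLOTS = 8
alpha = 0.1
slot_value = [0.0] * NUM_SLOTS

def reinforce(slot, outcome):
    """Reward a successful message exchange; penalize loss or overhearing."""
    reward = {"delivered": 1.0, "lost": -1.0, "overheard": -0.5}[outcome]
    slot_value[slot] += alpha * (reward - slot_value[slot])

def wake_slots(duty_cycle=0.25):
    """Stay awake only in the best-valued fraction of slots (the duty cycle)."""
    k = max(1, int(NUM_SLOTS * duty_cycle))
    return sorted(range(NUM_SLOTS), key=lambda s: slot_value[s], reverse=True)[:k]
```

Over time a node concentrates its active period on slots where its routing-tree neighbors are also awake, which is the kind of local, reward-driven coordination the paper describes.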
For high data rates this adaptive behavior improves both the throughput and the lifetime of the system, compared to a fully synchronized approach where all nodes wake up at the same time. They demonstrated how initially randomized wake-up schedules successfully converge to being both synchronized and desynchronized without any form of explicit coordination. Their current focus is on minimizing the number of active time slots within a frame and on relaxing the constraint that the active time of a node forms one contiguous period. As future work, they will investigate the convergence properties of the proposed algorithm and analyze its behavior on real-world test beds.

Reinforcement Learning for Energy Efficient Routing in Wireless Sensor Networks [5]

In this paper the authors studied the potential of using energy-aware metrics in reinforcement learning based routing algorithms for wireless sensor networks. The paper contributes an enhanced version of an existing energy-aware algorithm, together with a study that tests the influence of combining energy-aware metrics with load balancing metrics from delay-based Q-routing. They showed that their enhanced algorithm can significantly improve the lifetime of a network without requiring any extra information or communication, by propagating energy information beyond direct neighbors throughout the network. Their study also showed that topologies composed of heterogeneous nodes can have a significant impact on an algorithm's performance. Furthermore, they showed that load balancing in routing algorithms can help to improve the network lifetime while only requiring energy information about a node's direct neighbors. The paper focuses on generally applicable routing algorithms that can be used in multiple usage scenarios. All of the routing algorithms considered use Q-learning techniques to create their routing policies.
Routing algorithms based on Q-learning techniques are often referred to as Q-routing algorithms and were first introduced by Boyan and Littman in 1994, who designed their initial Q-routing algorithm to balance the network load in packet-switched networks by estimating the routing delay at different nodes. The authors looked at an energy-aware Q-routing algorithm for WSNs that was recently proposed by Forster, and extended it so that it propagates energy information faster through the network, showing that this modified version can further extend the network lifetime. They explained how the original algorithm provides parent energy feedback, and provided enhancements for lowest-energy-on-path feedback. They also tested a variation of the Q-routing algorithm that incorporates both the load balancing properties of the original Q-routing algorithm and the energy-aware properties introduced by Forster, which allowed them to compare the influence of load balancing on energy efficient routing across different topologies. They showed that their extended algorithm can help to improve the lifetime when using a shortest path routing algorithm, or when using a load balancing algorithm under low traffic load. Their extended version of Forster's algorithm helps to further extend the network lifetime, and they noticed that the energy cost function always improves the network lifetime compared to algorithms not using one. They performed simulations of multiple routing algorithms on different topologies in different scenarios. Their experiments showed that using energy-aware metrics can always extend the lifetime of the network, and that by modifying the Q-routing algorithm to propagate energy information beyond direct neighbors they were able to significantly improve the network lifetime in heterogeneous networks.
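The delay-based Q-routing update referenced above (after Boyan and Littman) lets each node estimate the delivery time to a destination via each neighbor, using the neighbor's own best estimate as feedback. The network structure and delay values below are illustrative assumptions; only the update rule itself follows the published scheme.

```python
# Minimal sketch of the delay-based Q-routing update (after Boyan & Littman).
# Q[x][(d, y)] estimates node x's delivery time to destination d via neighbor y.
# The topology, learning rate and delays here are illustrative assumptions.
alpha = 0.5
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
Q = {x: {(d, y): 0.0 for d in neighbors for y in neighbors[x]} for x in neighbors}

def best_estimate(y, d):
    """Neighbor y's best current estimate of its own delivery time to d."""
    return min(Q[y][(d, z)] for z in neighbors[y])

def q_routing_update(x, d, y, queue_delay, tx_delay):
    """Move x's estimate for routing to d via y toward the observed delays
    plus y's best remaining-delivery estimate."""
    target = queue_delay + tx_delay + best_estimate(y, d)
    Q[x][(d, y)] += alpha * (target - Q[x][(d, y)])

# One packet sent from A toward D via B updates A's estimate for that choice.
q_routing_update("A", "D", "B", queue_delay=0.0, tx_delay=1.0)
```

The energy-aware variants discussed in paper [5] replace or augment the delay terms in this target with energy information (e.g. the lowest remaining energy on the path), while keeping the same update structure.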
Their experiments also showed that there is currently no single routing algorithm which delivers the best network lifetime in all these different scenarios. This means there is still a research opportunity for online learning algorithms that are able to adjust their routing decisions based on limited local information.

IV. ENERGY EFFICIENCY IN WSNS

The use of wireless sensor networks is increasing day by day, and at the same time they face the problem of energy constraints in the form of limited battery lifetime. As each node depends on energy for its activities, this has become a major issue in wireless sensor networks; the failure of one node can interrupt the entire system or application. Every sensing node can be in active (receiving and transmitting), idle or sleep mode. In active mode nodes consume energy when receiving or transmitting data. In idle mode the nodes consume almost the same amount of energy as in active mode, while in sleep mode the nodes shut down the radio to save energy. The following steps can be taken to save the energy spent on communication in wireless sensor networks:
- Schedule the state of the nodes (i.e. transmitting, receiving, idle or sleep).
- Change the transmission range between the sensing nodes.
- Use efficient routing and data collecting methods.
- Avoid the handling of unwanted data, as in the case of overhearing.

V. CONCLUSION

WSNs face many challenges, mainly caused by communication failures, storage and computational constraints, and limited power supply. A great deal of research has been carried out in the area of wireless sensor networks, and the possibility of more research still exists. Various challenges, such as energy efficiency, responsiveness, robustness, scalability, self-configuration and adaptation, heterogeneity, systematic design, privacy and security, remain open in the area of wireless sensor networks. The energy efficiency constraint affects the overall network and reduces the network lifetime. Many researchers have developed routing protocols and routing algorithms for the routing path of data packets to deal with the energy efficiency constraint of the network. The selection of the optimal path from source to destination results in increased network lifetime.

VI. REFERENCES

1. Energy-efficient wireless sensor network design and implementation for condition-based maintenance. ACM Transactions on Sensor Networks (TOSN), Vol. 3, No. 1, March 2007.
2. J. B. Predd, S. R. Kulkarni, and H. V. Poor, "Distributed learning in wireless sensor networks - application issues and the problem of distributed inference," IEEE Signal Processing Magazine, 2006.
3. D. Culler, D. Estrin, and M. Srivastava, "Overview of sensor networks," IEEE Computer, pp. 41-49, 2004.
4. T. O. Ayodele, "Introduction to machine learning," in New Advances in Machine Learning. InTech, 2010.
5. B. Krishnamachari, D. Estrin, and S. Wicker, "The impact of data aggregation in wireless sensor networks," in 22nd International Conference on Distributed Computing Systems Workshops, 2002, pp. 575-578.
6. J. N. Al-Karaki and A. E. Kamal, "Routing techniques in wireless sensor networks: a survey," IEEE Wireless Communications, Vol. 11, No. 6, pp. 6-28, 2004.
7. J. A. Boyan and M. L. Littman, "Packet routing in dynamically changing networks: a reinforcement learning approach," in Advances in Neural Information Processing Systems 6, pp. 671-678. Morgan Kaufmann, 1994.
8. A. Forster and A. L. Murphy, "Balancing energy expenditure in wireless sensor networks through reinforcement learning: a study," in Proceedings of the 1st International Workshop on Energy in Wireless Sensor Networks (WEWSN), 7 pp. Citeseer, 2008.
9. M. Enzinger, "Energy-efficient communication in wireless sensor networks," Seminar SN SS2012, Network Architectures and Services, August 2012.
