
Simulating Federated Learning for Smartphone based Indoor Localisation

Litian Li
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands
[email protected]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
28th Twente Student Conference on IT Febr. 2nd, 2018, Enschede, The Netherlands.
Copyright 2018, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science.

ABSTRACT
Satellite navigation such as the Global Positioning System (GPS) cannot locate devices accurately and quickly indoors due to signal congestion and path complexity caused by the building structure. Indoor positioning based on Wi-Fi fingerprints is a general solution. As the demand for indoor positioning grows and people's awareness of privacy protection increases, it is essential to protect privacy in the crowdsourced method of collecting user location information. In contrast to traditional centralized and distributed machine learning, in federated learning user data is only used for training locally and never leaves the local device. Only model parameters and gradients are transmitted between the server and the client. Federated learning has thus become a machine learning method for protecting user privacy. The author will select a suitable open-source indoor positioning data set based on Wi-Fi fingerprints and choose a suitable framework by evaluating the existing mainstream federated learning frameworks. Federated and non-federated learning will then be conducted with the selected data and framework, to observe whether federated learning can achieve good training results while keeping its privacy-protecting characteristics, and to compare it with the performance of traditional machine learning.

Keywords
Federated learning, federated learning framework, federated learning simulation, federated learning simulator, indoor localisation, Wi-Fi fingerprint, indoor positioning

1. INTRODUCTION
Since the development of the Global Positioning System (GPS), it has been possible to achieve high-precision, timely, and efficient positioning in various outdoor fields. GPS is widely used in many application scenarios, from daily entertainment to aviation, navigation, and military applications. However, GPS cannot achieve the same results in indoor positioning as in outdoor positioning. The reason is that precise GPS positioning relies on sufficiently strong satellite signals, but the GPS signal is greatly weakened after penetrating a building. Moreover, the indoor building structure is generally complex, which in turn causes distance deviations due to signal reflection paths.

With the rapid development of the Internet of Things (IoT) and related applications in the last decades, the demand for high-precision indoor positioning is increasing. A variety of technologies have been used for indoor localisation. This research mainly focuses on Received Signal Strength Indication (RSSI) Wi-Fi fingerprints in indoor localisation.

Indoor localisation based on Wi-Fi fingerprints includes two stages, training and testing. The on-site investigation in the training stage is difficult because collecting training data is time-consuming and labor-intensive. Meanwhile, because RSSI is easily affected by the indoor environment, the database often needs to be rebuilt over time, so using mobile devices to collect data dynamically has become a solution that avoids on-site data collection[11].

Due to the rapid development of computer processing capabilities and the explosive growth of available data, using crowdsourcing to collect training data for machine learning in indoor Wi-Fi fingerprint-based localisation has naturally become a convenient and effective solution. However, the crowdsourcing method of collecting user data carries the hidden danger of privacy leakage, since in traditional machine learning the sensitive data related to the user's location needs to be transmitted from the user device to the central server for model training.

As awareness of personal data protection has increased, the crowdsourcing method of uploading sensitive user data has raised users' concerns. In this context, the concept of federated learning came into being[8].

Traditional centralized learning requires terminal users to upload the data on their smartphones to a central server in the cloud. The central server uses the collected data for model training and then sends the training results to each terminal. The problem is that users are not necessarily willing to upload their data to the cloud, especially when the data involve their privacy.

Meanwhile, although traditional distributed learning can train on data locally on the terminal without letting the data leave it, thereby ensuring data privacy, it sometimes requires a certain degree of data exchange between different terminals, named shuffle[14], to improve learning effectiveness and efficiency. Shuffling makes the data independent and identically distributed (IID), which is more conducive to efficient algorithm design and implementation. Moreover, the worker nodes in traditional distributed machine learning are all connected by high-speed wired links, each worker node has ideal computing power, and the computing power of the nodes is almost the same. In actual smartphone applications of IoT, however, the network status of the worker nodes is unstable, their computing power is varied and limited, and the data distribution is very unbalanced, that is, not independent and identically distributed (Non-IID). This is also one of the problems that federated learning aims to solve.

This research will select a suitable open-source data set and an open-source federated learning framework for indoor positioning based on Wi-Fi fingerprints, in order to simulate a machine learning scenario in which the data held by the federated learning clients is Non-IID. It will observe whether federated learning can perform well in Wi-Fi fingerprint-based indoor positioning under the premise of protecting privacy, and compare its learning performance with non-federated learning.

The structure of this paper is as follows. Section 2 states the problem that this research is going to solve. Section 3 introduces the current research background of indoor positioning based on Wi-Fi fingerprints and of privacy protection in indoor positioning, as well as the background of current mainstream federated learning simulation tools. Section 4 describes the approaches used at the different stages of this research. Section 5 describes and discusses the results of the federated and non-federated learning implementations. Section 6 summarizes the experimental results and the direction of future research.

2. PROBLEM STATEMENT
Although there has been a lot of research on deep learning for indoor localisation based on Wi-Fi fingerprints, there is still a lack of work on Wi-Fi fingerprint-based indoor localisation using federated learning. Wi-Fi fingerprint data collection is a very time-consuming and labor-intensive stage, and the database often needs to be updated due to changes in site factors. Therefore, low

accuracy of GPS indoors and outdoors is very different. It performs well in outdoor positioning applications, but its performance in indoor positioning applications is not satisfactory.

In order to solve the accuracy problem of indoor positioning, a variety of solutions based on different technology classifications have been proposed[16].

Among these technologies, Received Signal Strength (RSS)-based localisation is divided into two categories. One uses a path loss model of the RSS to obtain position information; however, this method often suffers from low localisation accuracy due to the complexity of the indoor path. The other is based on Wi-Fi fingerprints. This method collects the RSSI from multiple wireless access points as the fingerprint feature to perform indoor localisation[1].

In order to protect the privacy of data uploaded by users under the crowdsourcing method, several different approaches have been proposed[4][9][5][7]. These are different methods of encrypting the transmitted sensitive data.

In addition to the above privacy protection methods, federated learning addresses users' concerns about privacy leakage from another perspective.

After the concept of federated learning emerged, open-source frameworks led by different companies and institutions appeared to support simulated model training for federated learning. Currently, the mainstream open-source frameworks that support federated learning simulation are TensorFlow Federated (TFF), PySyft, and FATE.

These frameworks provide a variety of federated learning algorithms and traditional machine learning algorithm components. In terms of privacy protection, they also provide different privacy algorithms to keep the data confidential during model parameter transmission. This al-
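
The federated setting described in the introduction (data stays on each client, only model parameters travel to the server, and client data is Non-IID) can be illustrated with a minimal sketch. This is not the implementation used in this research, nor the API of TFF, PySyft, or FATE. The linear model, the synthetic features standing in for RSSI fingerprints, the client sizes, and the names `local_update`, `federated_average`, and `simulate` are all assumptions chosen for illustration. The aggregation shown is a plain sample-weighted parameter average in the style of federated averaging.

```python
import random

def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def local_update(weights, data, lr=0.05, epochs=5):
    """Client step: a few gradient-descent passes on private (x, y) pairs."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)  # mean squared error gradient
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w  # only the parameters leave the client, never the data

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]

def simulate(rounds=200, seed=0):
    rng = random.Random(seed)
    true_w = [0.5, -1.0, 2.0]  # hypothetical ground-truth model
    clients = []
    # Non-IID split (assumed): each client has a different feature
    # distribution and a different number of samples.
    for mean, n in [(-1.0, 20), (0.0, 50), (1.5, 80)]:
        data = []
        for _ in range(n):
            x = [rng.gauss(mean, 1.0) for _ in true_w]
            y = predict(true_w, x) + rng.gauss(0.0, 0.01)
            data.append((x, y))
        clients.append(data)
    global_w = [0.0] * len(true_w)
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w, true_w

if __name__ == "__main__":
    learned, target = simulate()
    print("learned:", [round(v, 3) for v in learned])
    print("target: ", target)
```

In this toy setup all clients share the same underlying model, so the averaged parameters approach it even though each client sees a different slice of the feature space. With real Non-IID fingerprint data the local optima can differ, which is one reason a comparison against non-federated training is informative.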
