Unsupervised Feature Learning in a Federated Setting for Human Activity Detection
Eindhoven University of Technology

MASTER

Unsupervised feature learning in a federated setting for human activity detection

van Berlo, B.R.D.

Award date: 2019

Disclaimer
This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.

Department of Mathematics and Computer Science
System Architecture and Networking Research Group

Unsupervised Feature Learning in a Federated Setting for Human Activity Detection

Master Thesis

Bram van Berlo, BSc
Student number: 1376802

Thesis Submitted in Partial Fulfillment of the Requirements for the Master of Science Degree in Embedded Systems offered by the EIT Digital Master School

Supervisors:                 Affiliations:
dr. Tanir Ozcelebi           TU/e SAN
dr. Georgios Exarchakos      TU/e ECO
dr. Vlado Menkovski          TU/e IS
Aaqib Saeed, MSc             TU/e SAN
Ewout Brandsma, MSc          Philips Research

final version

Eindhoven, August 23, 2019

Abstract

In the current situation, Federated Machine Learning (FML) can only be implemented effectively in use cases where implicitly labeled data are generated.
An explanation for this limited effectiveness is that data in an FML context are either implicitly labeled or unlabeled, and asking clients to explicitly label unlabeled data is undesirable. In response, this research project introduces an FML software architecture that can effectively train a wide variety of deep neural network types with the large quantities of unlabeled data that are typically present at distributed clients in a federated setting. These pre-trained neural networks can be used to extract discriminative features from labeled data. The extracted features allow use case-specific (e.g. down-stream) deep learning networks to effectively train a model with low volumes of labeled data. The architecture thereby opens up FML to a wider variety of use cases. Furthermore, the representation performance of deep neural networks trained with the software architecture is assessed with labeled data and use case-specific Human Activity Detection (HAD) deep learning tasks. The assessment consists of quantitative experiments, run with simulation functionality, that report the classification metrics accuracy, precision, recall, F1 score, and Cohen's kappa. The effect of distributing computation in a federated setting on the total execution time for training deep neural networks is analyzed as well, with the help of three theoretical models. The models depict the total execution time that can be expected in three different model training situations. The representation performance results indicate that down-stream classifiers pre-trained with self-supervised learning reach performance similar to supervised learning from scratch in a federated setting. When pre-training is executed with unlabeled data coming from more users and more activity classes than the labeled data, self-supervision performs better.
Overall, self-supervision outperforms unsupervised learning with an autoencoder network by a margin of 0.03 to 0.06 on all reported metric values. Therefore, self-supervised learning is more effective than unsupervised learning. The total execution time results indicate that, as the number of participating clients increases, FML eventually outperforms both letting a server train personalized models for every client and letting clients train a model on their own. This makes FML more scalable. However, for large deep learning models, the communication of model parameters and model parameter updates between server and clients eventually becomes an execution time bottleneck. Lastly, unsupervised feature learning algorithms can be integrated effectively with the global model server aspect of FML algorithms in an application layered software architecture.

Keywords: Federated learning, Unsupervised feature learning, Self-supervised learning, Human activity recognition, Deep learning

Acknowledgements

First of all, I would like to thank dr. Tanir Ozcelebi, dr. Georgios Exarchakos, dr. Vlado Menkovski, Aaqib Saeed, and Ewout Brandsma for their feedback, insightful discussions, and guidance throughout this graduation project. The feedback and guidance were provided with help from SCOTT project (https://scottproject.eu) resources. Secondly, most of the graduation project experiments would not have been finished in time without the assistance of a dear friend of mine, Leonardo Araneda Freccero, who provided the hardware resources that were used to execute the experiments. Lastly, various icons used throughout this thesis were created by Alexander Kahlkopf as part of iconmonstr (https://iconmonstr.com). I would like to thank Alexander for granting the use of his icons.
Bram van Berlo

Contents

Contents
List of Figures
List of Tables
List of Algorithms
List of Acronyms

1 Introduction
  1.1 Challenges and objectives
  1.2 Research questions and scope
  1.3 Research methodologies
  1.4 Document structure

2 Literature background
  2.1 Federated machine learning
    2.1.1 Characteristics
    2.1.2 Architecture requirements
    2.1.3 Existing federated systems and algorithms
    2.1.4 Asynchronous dispatch/aggregation interaction
    2.1.5 Automatic service discovery
  2.2 Human activity detection
    2.2.1 Temporal convolutional neural networks
  2.3 Unsupervised feature learning
    2.3.1 Algorithms and neural network architectures
  2.4 Total execution time performance modelling
    2.4.1 Total execution time models
  2.5 Thesis contribution

3 System design
  3.1 Global model server
  3.2 Resource directory
  3.3 Federated client
  3.4 System interfaces
  3.5 Simulation functionality
  3.6 Pitfalls

4 Unsupervised feature learning algorithm effectiveness
  4.1 Datasets
  4.2 Assessment strategy
  4.3 Comparison against supervised learning
  4.4 Weight transferability assessment
  4.5 Total execution time discussion

5 Conclusions
  5.1 Future work

Bibliography

Appendix
  A Specifications
  B Profiling results

List of Figures

2.1 Abstract FML communication round loop example.
2.2 Unified Modeling Language (UML) sequence diagram depicting a control flow example for steps 1-3 from Figure 2.1 in case asynchronous I/O is used.
2.3 UML activity/sequence diagram depicting a control flow for steps 1-3 from Figure 2.1 in case multi-threading is used.
2.4 UML sequence diagram depicting a scenario example in which a FML server requests client presence over a multicast address.
2.5 UML sequence diagram depicting a scenario example in which a FML server requests client services and clients attach their services to a resource directory.
2.6 Simplified representation of the encoder-decoder TCN for modelling a sequence of temporal data.
3.1 UML deployment diagram depicting a high level overview of the FML software architecture design.
3.2 FML global model server application logic layers in the scenario that a FML communication round is executed.
3.3 UML class diagram depicting attributes, functions and interface connections between system components depicted in Figure 3.1.
4.1 Supervised deep learning architecture used during effectiveness experiments.
4.2 Unsupervised Autoencoder deep learning architecture used during effectiveness experiments.
4.3 Self-Supervised deep learning architecture