Visual Analysis of Large, Time-Dependent, Multi-Dimensional Smart Sensor Tracking Data
Total Page:16
File Type:pdf, Size:1020Kb
_________________________________________________________________________Swansea University E-Theses Visual Analysis of Large, Time-Dependent, Multi-Dimensional Smart Sensor Tracking Data Walker, James How to cite: _________________________________________________________________________ Walker, James (2017) Visual Analysis of Large, Time-Dependent, Multi-Dimensional Smart Sensor Tracking Data. Doctoral thesis, Swansea University. http://cronfa.swan.ac.uk/Record/cronfa36342 Use policy: _________________________________________________________________________ This item is brought to you by Swansea University. Any person downloading material is agreeing to abide by the terms of the repository licence: copies of full text items may be used or reproduced in any format or medium, without prior permission for personal research or study, educational or non-commercial purposes only. The copyright for any work remains with the original author unless otherwise specified. The full-text must not be sold in any format or medium without the formal permission of the copyright holder. Permission for multiple reproductions should be obtained from the original author. Authors are personally responsible for adhering to copyright and publisher restrictions when uploading content to the repository. Please link to the metadata record in the Swansea University repository, Cronfa (link given in the citation reference above.) http://www.swansea.ac.uk/library/researchsupport/ris-support/ Visual Analysis of Large, Time-Dependent, Multi-Dimensional Smart Sensor Tracking Data James S. Walker Submitted to Swansea University in fulfilment of the requirements for the Degree of Doctor of Philosophy. Department of Computer Science Swansea University 2017 Redacted Version of Original Uploaded October 2017 A selection of third party content is redacted or is partially redacted from this thesis. Pages: Figs. 2.10 & 2.11 page 31. Figs. 2.12 & 2.13 page 32. Declaration This work has not previously been accepted in substance for any degree and is not being currently submitted for any degree. Signed: .................................................... (candidate) Date: .................................................... Statement 1 This thesis is being submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy. Signed: .................................................... (candidate) Date: .................................................... Statement 2 This thesis is the result of my own independent work/investigation, except where otherwise stated. Other sources are specifically acknowledged by clear cross referencing to author, work, and pages using the bibliography/references. I understand that failure to do this amounts to plagiarism and will be considered grounds for failure of this thesis and the degree examination as a whole. Signed: .................................................... (candidate) Date: .................................................... Statement 3 I hereby give consent for my thesis to be available for photocopying and for inter-library loan, and for the title and summary to be made available to outside organisations. Signed: .................................................... (candidate) Date: .................................................... ABSTRACT Technological advancements over the past decade have increased our ability to collect data to previously unimaginable volumes [Kei02]. Understanding temporal patterns is key to gaining knowledge and insight. However, our capacity to store data now far exceeds the rate at which we are able to understand it [KKEM10]. This phenomenon has led to a growing need for advanced solutions to make sense and use of an ever-increasing data space. Abstract temporal data provides additional challenges in its, representation, size, and scalability, high dimensionality, and unique structure. One instance of such temporal data is acquired from smart sensor tags attached to freely- roaming animals recording multiple parameters at infra-second rates which are becoming commonplace, and are transforming biologists understanding of the way wild animals behave. The excitement at the potential inherent in sophisticated tracking devices has, however, been limited by a lack of available software to advance research in the field. This thesis introduces methodologies to deal with the problem of the analysis of the large, multi-dimensional, time-dependent data acquired. Interpretation of such data is complex and currently limits the ability of biologists to realise the value of their recorded information. We present several contributions to the field of time-series visualisation, that is, the visualisation of ordered collections of real value data attributes at successive points in time sampled at uniform time intervals. Traditionally, time-series graphs have been used for temporal data. However, screen resolution is small in comparison to the large information space commonplace today. In such cases, we can only render a proportion of the data. It is widely accepted that the effective interpretation of large temporal data sets requires advanced methods and interaction techniques. In this thesis, we address these issues to enhance the exploration, analysis, and presentation of time-series data for movement ecologists in their smart sensor data analysis. The content of this thesis is split into two parts. In the first volume, we provide an overview of the relevant literature and state-of-the-art methodologies. In the second part, we introduce a research development of techniques which address particular challenges unsolved in the literature, emphasising on their application to solving challenging domain level tasks faced by movement ecologists using smart tag data. Firstly, we comparatively evaluate existing methods for the visual inspection of time-series data, giving a graphical overview of each and classifying their ability to explore and interact with data. Analysis often involves identifying segments of a time-series where specific phenomena occur and comparing between time segments for interesting patterns which can be used to form, prove, or refute a hypothesis. After analysis, findings are communicated to a wider audience. Navigating and communicating through a large data space is an important task which is not fully supported by existing techniques. We propose new visualisations and other extensions to the existing approaches. We undertake and report an empirical study and a field study of our approach on smart sensor data. The reality of researchers faced with perhaps 10 channels of data recorded at sub- second rates spanning several days, is it is time consuming and error-prone for the domain expert to manually decode behaviour. Primarily this is dependent on the manual inspection of multiple time-series graphs. Machine learning algorithms have been considered, but have been difficult to introduce because of the large numbers of training sets required and their low discriminating precision in practice. We introduce TimeClassifier, a visual analytic system for the classification of time-series data to assist in the labelling and understanding of smart sensor data. We deploy our system with biologists and report real-world case studies of its use. Next, we encapsulate TimeClassifier into an all-encompassing software suite, Frame- work4, which operates on smart sensor data to determine the four key elements considered pivotal for movement analysis from such tags. The software transforms smart sensor data into (i) dead-reckoned movements, (ii) template-matched behaviours, (iii) dynamic body acceleration derived energetics and (iv) position-linked environmental data before outputting it all into a single file. Biologists are given a software suite which enables them to link iii behavioural actions and environmental conditions across time and space through a user- friendly software protocol. Its use enhances the ability of biologists to derive meaningful data rapidly from complex data. We make this software publicly available via the world-wide web and report over 60 users world-wide, across 12 countries. Traditional analytical methods for smart sensor data concentrate on either behaviour or power, even though, because behaviour is manifest by movement which requires energy, both are inextricably linked. We introduce a proof-of-concept visualisation, the g-sphere, which capitalises on a suite of different acceleration-derived metrics (coupled with behaviour derived using framework4) and allow multiple dimensions to be visualised simultaneously. This approach, which highlights patterns that are not readily accessible using traditional time-series visualisation, can be used to inform our understanding and identification of processes in areas as disparate as sports practice, disease and wild animal ecology. Finally, we conclude our findings and outline fruitful areas for future work. iv ACKNOWLEDGEMENTS Firstly, I would like to give special thanks to my academic supervisors, Prof. Mark W. Jones, and Dr. Robert S. Laramee (Bob) who’s expertise, support, and patience made this thesis possible. Bob provided my first experience into the world of academia, supervising my BSc dissertation. During this time I was deliberating pursuing a career in industry. Bob persuaded me to stay in the university and undertake an MRes in collaboration with local business, which proved to be the perfect compromise. Without his encouragement I would not have considered a post-graduate degree. However, I thoroughly