Instrument Automation and Measurement Data Curation Platform for Enhancing Research Reproducibility and Knowledge Discovery

INSTRUMENT AUTOMATION AND MEASUREMENT DATA CURATION PLATFORM FOR ENHANCING RESEARCH REPRODUCIBILITY AND KNOWLEDGE DISCOVERY By Yousef Gtat A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering – Master of Science 2019 ABSTRACT INSTRUMENT AUTOMATION AND MEASUREMENT DATA CURATION PLATFORM FOR ENHANCING RESEARCH REPRODUCIBILITY AND KNOWLEDGE DISCOVERY By Yousef Gtat Many applications demand the continued development of sensing systems that employ smart sensors, instrumentation circuits, and signal processing techniques to extract relevant information from real-world environments. In the engineering efforts to develop new sensors, tasks such as instrument automation, measurement process curation, real-time data acquisition, data analysis, and long-term tracking of inter-related datasets generate a significant volume and variety of information that is challenging to organize, record, and analyze. Sensor development and characterization experiments can be laborious, prone to human error, difficult to repeat precisely, and can produce data that are challenging to interpret. Such issues highlight a need for a structured, automated approach to curate measurement processes and data acquisition. This thesis presents the first software platform for i) digitally designing measurement recipes, ii) remotely scheduling and monitoring experiment execution, iii) automatic data acquisition, iv) analyzing and storing results datasets, and v) linking the datasets with their prospective meta-datasets for deeper analysis and inspection. The proposed platform is flexible and capable of managing a large set of diverse instruments, measurement recipes and sensor datasets. By employing several design abstractions, it allows users to remotely design, schedule, monitor and execute measurement-based experiments while archiving results along with their information-rich metadata therefore preserving the provenance of the datasets. The platform enable precise timing control of instruments and stimulus signals along with long-term tracking of datasets eliminating manual errors and human omissions thus enhancing research reproducibility and promoting knowledge discovery methodologies. ACKNOWLEDGMENTS I am most grateful to my parents and siblings for their tremendous support throughout my life; I owe it all to them! I also extend my gratitude to my relatives, the Huwio family, for lending me their support throughout college and graduate school. My special thanks to all of the developer who made the presented platform proposed in this thesis possible. Specifically, I am thankful to the software developers: Ian Bacus, Luke Wiseman, Ben Buscarino, and Trevor Sabo for lending me their expertise and helping with the code base. Finally, there are no proper words to convey my deep gratitude and respect for my thesis and research advisor, Professor Andrew J. Mason. My sincere thanks must also be extended to my colleagues at the AMSaC group and the members of my thesis committee: Prof. Zhang, Prof. Biswas, and Prof. Tan. This work is supported by the National Institutes of Health (NIH) grant R01ES022302 and grant R01AI113257. iii TABLE OF CONTENTS LIST OF TABLES ....................................................................................................................... vi LIST OF FIGURES .................................................................................................................... vii CHAPTER 1 .................................................................................................................................. 1 MOTIVATION ............................................................................................................................. 1 1.1 Significance ..................................................................................................................... 1 1.1.1 Research Reproducibility ............................................................................................ 1 1.1.2 Knowledge Discovery ................................................................................................. 2 1.1.3 Institutional Memory Loss .......................................................................................... 4 1.2 Requirements and Challenges ....................................................................................... 5 1.4 Approach and Goals ...................................................................................................... 7 1.5 Summary ......................................................................................................................... 8 CHAPTER 2 .................................................................................................................................. 9 BACKGROUND ........................................................................................................................... 9 2.1 Experimental Research and Scientific Method ........................................................... 9 2.2 Research Objects (ROs) ............................................................................................... 11 2.3 Cloud Computing and Services ................................................................................... 12 2.4 Backend Architectures: Monolithic and Microservices ........................................... 14 2.5 Databases: Relational and Non-relational ................................................................. 16 2.6 Review of Existing Solutions ....................................................................................... 19 2.6.1. LabVIEW..................................................................................................................... 20 2.6.2. Scientific Workflow Tools .......................................................................................... 22 2.6.3. Electronic Lab Notebooks (ELN) ................................................................................ 26 2.6.4. Previous Work ............................................................................................................. 27 2.7 Summary ....................................................................................................................... 28 CHAPTER 3 ................................................................................................................................ 29 DESIGN AND IMPLEMENTATION ...................................................................................... 29 3.1 Requirements and Terminology ................................................................................. 29 3.2 Instrument Control and Device Drivers..................................................................... 30 3.3 Web-based Graphical User Interface ......................................................................... 36 3.4 Data Management, Models, and Provenance ............................................................ 39 3.5 Microservices and Messaging Mechanisms ............................................................... 41 3.6 Collaboration, Security, and Networking .................................................................. 43 3.7 Platform Versatility and Adaptability to Existing Tools .......................................... 44 3.8 Platform Dataflow ........................................................................................................ 45 3.9 Software Development Environment ......................................................................... 46 3.10 Summary ....................................................................................................................... 47 CHAPTER 4 ................................................................................................................................ 48 RESULTS and DISCUSSION ................................................................................................... 48 4.1 User Experience Design ............................................................................................... 48 iv 4.2 Instrument Automation and Measurement Data Curation ..................................... 50 4.2.1 Experiment Setup ...................................................................................................... 50 4.2.2 Automated Measurement Design using the Designer Tool ...................................... 51 4.2.3 Measurement Execution and Monitoring using the Executer Tool .......................... 59 4.2.4 Data Visualization and Analysis using the Analyzer Tool ....................................... 65 4.3 Platform Deployment in the Cloud ............................................................................. 69 4.4 Table of the Platform Features vs Existing Tools ..................................................... 71 4.5 Summary ....................................................................................................................... 73 CHAPTER 5 ................................................................................................................................ 75 CONCLUSION ........................................................................................................................... 75 5.1 Summary ....................................................................................................................... 75 5.2 Contributions ................................................................................................................ 76 5.3 Current Status

Load more