Dimensional Analysis of Robot Software Without Developer Annotations John-Paul W
Total Page:16
File Type:pdf, Size:1020Kb
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Computer Science and Engineering: Theses, Computer Science and Engineering, Department of Dissertations, and Student Research Summer 7-26-2019 Dimensional Analysis of Robot Software without Developer Annotations John-Paul W. Ore University of Nebraska-Lincoln, [email protected] Follow this and additional works at: https://digitalcommons.unl.edu/computerscidiss Part of the Artificial Intelligence and Robotics Commons, and the Robotics Commons Ore, John-Paul W., "Dimensional Analysis of Robot Software without Developer Annotations" (2019). Computer Science and Engineering: Theses, Dissertations, and Student Research. 175. https://digitalcommons.unl.edu/computerscidiss/175 This Article is brought to you for free and open access by the Computer Science and Engineering, Department of at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Computer Science and Engineering: Theses, Dissertations, and Student Research by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. DIMENSIONAL ANALYSIS OF ROBOT SOFTWARE WITHOUT DEVELOPER ANNOTATIONS by John-Paul William Calvin Ore A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy Major: Computer Science Under the Supervision of Professors Sebastian Elbaum and Carrick Detweiler Lincoln, Nebraska August, 2019 DIMENSIONAL ANALYSIS OF ROBOT SOFTWARE WITHOUT DEVELOPER ANNOTATIONS John-Paul William Calvin Ore, Ph. D. University of Nebraska, 2019 Advisers: Sebastian Elbaum and Carrick Detweiler Robot software risks the hazard of dimensional inconsistencies. These incon- sistencies occur when a program incorrectly manipulates values representing real-world quantities. Incorrect manipulation has real-world consequences that range in severity from benign to catastrophic. Previous approaches detect dimen- sional inconsistencies in programs but require extra developer effort and technical complications. The extra effort involves developers creating type annotations for every variable representing a real-world quantity that has physical units, and the technical complications include toolchain burdens like specialized compilers or type libraries. To overcome the limitations of previous approaches, this thesis presents novel methods to detect dimensional inconsistencies without developer annotations. We start by empirically assessing the difficulty developers have in making type annotations. In a human study of 83 subjects, we find that developers are only 51% accurate and require more than 2 minutes per annotation. We further find that type suggestions have a significant impact on annotation accuracy. We find that when showing developers annotation suggestions, three suggestions are better than a single suggestion because they are as helpful when correct and less harmful when incorrect. Since developers struggle to make type annotations accurately, we present a novel method to infer physical unit types without developer annotations. This is novel because it is the first method to detect dimensional inconsistencies in ROS C++ without developer annotations, and this is important because robot software and ROS are increasingly used in real-world applications. Our method leverages a property of robotic middleware architecture that reuses standardized data structures, and we implement our method in an open-source tool, Phriky. We evaluate our method empirically on a corpus of 5.9 M lines of code and find that it detects real inconsistencies with an 87% TP rate. However, our method only assigns physical unit types to 25% of variables, leaving much of the annotation space unaddressed. To overcome these limitations, we extend our method to utilize uncertain evidence in identifiers using probabilistic reasoning. We implement our new probabilistic method in a tool Phys and find that it assigns units to 75% of variables while retaining a TP rate of 82%. We present the first open dataset of dimensional inconsistencies in open-source robotics code, to our knowledge. Lastly, we identify extensions to our work and next steps for software tool developers to build more powerful robot software development tools. iv COPYRIGHT © 2019, John-Paul William Calvin Ore v ACKNOWLEDGMENTS Profound thanks to my advisors, Carrick and Sebastian, for their enduring patience and encouragement. Thank you, Matthew B. Dwyer, for sharing your deep insights and love of wisdom. Thank you to my family Charles, Constance†, Heidi, Janna, Jon, Todd, Zoie, Kira, Fiona, and Ursula, for pointing to the guiding stars, cheerleading, cajoling, gentle reminders like ’Eyes on the Prize,’ and your relentless love. Words fail to express the depth of my gratitude to my brilliant wife, Dr. Aimee Allard. I’m looking forward to our journey together, hand-in-hand, on days both dark and bright, venturing in good courage on paths as yet unknown. Deep thanks as well to Aimee’s family, Gregory and Carol Allard and my new brother-in-law Erik Allard for their kind support. Thank you to my dear colleagues and friends for their assistance and feedback, including: Adam Plaucha, Adam Taylor, Ajay Shankar, Anne Rutledge, Ashraful Islam, Brittany Duncan, Carl Hildebrandt, Chandima Fernando, David Anthony, David Current, Evan Beachly, Gwendolyn Krieser, Hengle Jiang, Jim Higgins, Jinfu Leng, Jonathan Saddler, Justin Bradley, Lola Masson, Mitchell Gerrard, Nic Warmenhoven, Nishant Sharma, Pedro Albuquerque, Rubi Quinones,˜ Sayali Kate, Siya Kunde, Urja Acharya, William Wimsatt, and Xiangyu Zhang. “For wisdom is better than rubies; and all the things that may be desired are not to be compared to it.” Proverbs 8:11 vi GRANT INFORMATION This work was partially supported by USDA-NIAF #2013-67021-20947, USDA- NIFA #2017-67021-25924, AFOSR #FA9550-10-1-0406, NSF #CCF-1718040, NSF #CCF-1526652 NSF #IIS-1116221, and NSF #IIS-1638099. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of these agencies. vii Table of Contents List of Figures xiii List of Tables xvi 1 Introduction 1 1.1 Contributions................................ 5 1.2 Outline of Dissertation........................... 7 2 Background 9 2.1 Physical Units and the SI System..................... 9 2.2 Dimensional Analysis & Inconsistencies................. 10 2.2.1 Assignment of Multiple Units................... 12 2.2.2 Comparison of Inconsistent Units................ 13 2.2.3 Addition of Inconsistent Units.................. 14 2.3 Dimensional Analysis through Type Checking............. 15 2.4 Robotic Message-oriented Middleware: The Robot Operating System (ROS)..................................... 18 3 Related work 21 3.1 Unit Types in Robot Software....................... 21 3.2 Dimensional Analysis in Programs with Type Checking....... 22 viii 3.2.1 Type Checking and Empirical Studies of their Effectiveness. 22 3.2.2 Dimensional Analysis in Programs................ 23 3.2.3 Checking With Type Annotations................. 24 3.2.4 Checking Without Type Annotations............... 29 3.3 Helping Developers Annotate Code with Tools............. 31 3.3.1 Type Qualifiers........................... 31 3.3.2 Type Annotation Burden in Java and Javascript........ 32 4 Study of Developers: Better Understanding the Type Annotation Burden 34 4.1 Introduction................................. 34 4.2 Methodology: Accuracy, Duration, and Suggestions.......... 37 4.2.1 Research Questions......................... 37 4.2.2 Experimental Setup......................... 39 4.2.3 Subject Sample Population.................... 46 4.2.4 Test Instrument Details....................... 48 4.2.5 Utilized Tools............................ 50 4.2.6 Study Phases............................ 51 4.3 Results.................................... 54 4.3.1 Accuracy............................... 54 4.3.2 Timing................................ 55 4.3.3 Impact of Suggestions on Accuracy............... 57 4.3.4 Impact of Suggestions on Timing................. 62 4.3.5 Qualitative Results: Clues for Choosing a Type......... 63 4.4 Threats.................................... 67 4.4.1 External Threats........................... 67 4.4.2 Internal Threats........................... 69 ix 4.4.3 Conclusion Threats......................... 72 4.5 Discussion: Code Attributes’ Impact on Annotation Accuracy.... 72 4.5.1 Code Attributes........................... 72 4.5.2 Results of Code Attributes’ Impact................ 74 5 Method to Infer Types without Developer Annotations 77 5.1 Challenges.................................. 77 5.2 Approach Overview............................ 78 5.2.1 One-time Mapping from Class Attributes in Shared Program Libraries to Units........................... 78 5.2.2 External Mapping Cost....................... 80 5.2.3 Algorithm for Lightweight Detection of Unit Inconsistencies. 81 5.3 Implementation: Phriky .......................... 89 5.3.1 Termination and Complexity of Phriky ............. 91 5.4 Research Questions............................. 91 5.5 Results.................................... 92 5.5.1 Analysis of Robotic Software Corpus.............. 92 5.5.2 RQ6 Results: Phriky Effectiveness................ 93 5.5.3 RQ7 Results: Developer Survey................. 95 5.5.4 Scale and Speed........................... 98 5.6 Threats and Limitations.......................... 98 5.6.1 Self-labeling.............................