The Pennsylvania State University

The Graduate School
The Huck Institutes of the Life Sciences

FORMAL METHODS FOR GENOMIC DATA INTEGRATION

A Thesis in Integrative Biosciences
by Nigam Shah

© 2005 Nigam Shah

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
August 2005

The thesis of Nigam Shah was reviewed and approved* by the following:

Nina V. Fedoroff
Willaman Professor of Life Sciences and Evan Pugh Professor
Acting Co-Director, Integrative Biosciences Graduate Program, Huck Institutes of the Life Sciences
Thesis Advisor
Chair of Committee

Mark D. Shriver
Associate Professor of Anthropology and Genetics

Wojciech Makalowski
Associate Professor of Biology

Francesca Chiaromonte
Associate Professor of Statistics and Health Evaluation Sciences

Gustavo A. Stolovitzky
Manager, Functional Genomics & Systems Biology
IBM T.J. Watson Research Center
Special Member

*Signatures are on file in the Graduate School

ABSTRACT

The rapid growth of life sciences research and the associated literature over the past decade, the rapid expansion of biological databases, and the invention of high-throughput techniques that permit collection of data on many genes and proteins simultaneously have created an acute need for new computational tools to support the biologist in collecting, evaluating and integrating large amounts of information of many disparate kinds. This thesis presents methods for the representation, manipulation and conceptual integration of diverse biological data with prior biological knowledge to facilitate both the interpretation of data and the evaluation of hypotheses. We have developed a tool (called CLENCH) that assists in the interpretation of gene lists resulting from microarray data analysis by integrating and visualizing Gene Ontology (GO) annotations and transcription factor binding site information with gene expression data.
During the development of CLENCH, it became evident that developing a unified framework for representing prior knowledge and information can increase our ability to integrate new data with existing knowledge. In subsequent work, we developed the HyBrow (Hypothesis Browser) system as a prototype tool for designing hypotheses and evaluating them for consistency with existing knowledge. HyBrow consists of a conceptual framework with the ability to represent diverse biological information types, an ontology for describing biological processes at different levels of detail, a database to query information in the ontology, and programs to design, evaluate and revise hypotheses. We demonstrate the HyBrow prototype using the galactose gene network in Saccharomyces cerevisiae as a test system.

Along with the increase in available information, knowledgebases, which provide structured descriptions of biological processes, are proliferating rapidly. In order to support computer-aided information integration tools like HyBrow, a knowledgebase should be trustworthy and it should structure information in a sufficiently expressive manner to represent biological systems at multiple scales. We extend and adapt the conceptual framework underlying HyBrow and use it to verify the trustworthiness and usefulness of the Reactome knowledgebase.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
Chapter 1: Introduction
Chapter 2: Managing and interpreting large scale gene expression data
    Managing high volume microarray data
    Using the Gene Ontology for interpreting microarray expression datasets
    Signaling pathways as an organizing framework for expression data
    Summary
Chapter 3: Towards a unified formal representation for genomics data
    Challenges for developing a unified formal representation
    Description of relevant related efforts
Chapter 4: A novel conceptual framework
    Extensions to the conceptual framework
    Comparison with other conceptual frameworks
Chapter 5: Prototype implementation of HyBrow
    Hypothesis ontology
    Inference rules and constraints
    Database and information gathering
    User interfaces
    The hypothesis evaluation process
    Test runs with sample hypotheses
Chapter 6: Lessons learned from the prototype
    Revision of the hypothesis ontology
    Bottleneck for structuring data and role of knowledgebases
Chapter 7: Comparison with related efforts
    The Riboweb project
    Modeling biological processes as workflows
    Pathway logic
    Summary
    Role of Knowledgebases
Chapter 8: Proofreading the Reactome knowledgebase
    Background
    Methods
    Results
    Summary
Chapter 9: Summary and Future directions
    Future directions
References
Appendix A – Formal specification of the hypothesis grammar
Appendix B – Using the GUI

LIST OF FIGURES

Figure 1: Flow-chart showing the microarray data preprocessing pipeline.
Figure 2: The types of plots that can be made by ProcessGprfile.pl.
Figure 3: Visualizing the expression, annotation and TF binding site data.
Figure 4: Directed acyclic graph showing the relationships among GO categories.
Figure 5: A sample row from the CLENCH result table.
Figure 6: Components of a formal representation.
Figure 7: Examples of different types of ontology specifications.
Figure 8: An overview of the ontology.
Figure 9: Outline of the binds to promoter rule.
Figure 10: Screen shots of the visual and widget interfaces.
Figure 11: The hypothesis evaluation process.
Figure 12: Screen shot of the result page.
Figure 13: Properties of agents in the revised ontology.

LIST OF TABLES

Table 1: A comparison of the properties of different conceptual frameworks
Table 2: Numbers of Well-Formed Pathways
Table 3: Property comparison for the latest releases of Reactome

ACKNOWLEDGEMENTS

First of all I would like to acknowledge my advisor, Nina Fedoroff, for her mentoring and support throughout my graduate studies. She has had the most profound influence on the way I think (and write!) about science and my approach to research in general. I feel privileged to have studied under a scientist of her stature.

I would also like to acknowledge Stephen Racunas, my colleague and a very dear friend, for making my graduate studies at Penn State a memorable and enriching experience. I feel honored to know and work with someone like him.

I am also very grateful to Dilip Desai, a close family friend, who along with my parents (Haresh and Chaula Shah) has played a very major role in shaping my personality and outlook towards life. Finally and most importantly, I am grateful to my wife Prachi for always being with me and for her unconditional love during the ups and downs of graduate life.

Chapter 1: Introduction

With the advent of high-throughput technologies, molecular biology is undergoing a revolution in terms of the amount and types of data available to the scientist. On the one hand there is an abundance of individual data types such as gene and protein sequences, gene expression data, protein structures, protein interactions and annotations. On the other hand there is a shortage of tools and methods that can handle this deluge of information and allow a biologist to draw meaningful inferences. A significant amount of time and energy is spent in merely locating and retrieving information rather than thinking about what that information means.
In this situation it becomes extremely difficult to integrate current knowledge about the relationships within biological systems and to formulate hypotheses about a large number of genes and proteins [1]. It becomes difficult to determine whether the hypotheses are consistent internally or with data, to refine inconsistent hypotheses and to understand the implications of complicated hypotheses [2]. This situation calls for tools that automate repetitive tasks and that apply formal methods to query and interpret the information at hand [3].

My thesis work is focused on developing methods for integrating large data sets with prior biological knowledge to facilitate their interpretation. My initial efforts were focused on interpreting results from microarray expression data using the Gene Ontology and known biological pathways. During this work, which is described in the next chapter, it became evident that explicitly structuring prior knowledge and formally representing current information facilitates the integration of new data with prior knowledge by increasing our ability to fit the new data into the big picture. Subsequently, in collaboration with Stephen Racunas, an engineering doctoral student, we developed a prototype system for integrating biological data and existing knowledge in an environment that supports the formulation and evaluation of alternative hypotheses about biological systems. This work is described starting from Chapter 3.

Chapter 2: Managing and interpreting large scale gene expression data

Microarray technology is a high-throughput method of measuring the expression level of thousands of genes in parallel. It is also the most widely used method among the several high-throughput technologies for collecting data on the levels of various biological entities such as mRNA and proteins in cells.
My efforts to manage genomic data were focused on preprocessing, analyzing and interpreting microarray gene expression data. I developed programs for rapid preprocessing of raw microarray data and interpreting gene-groups that result from analyzing those data. While developing these methods for interpreting microarray data,