A Life Cycle Approach to the Development and Validation of an Ontology of the U.S

The Texas Medical Center Library DigitalCommons@TMC UT SBMI Dissertations (Open Access) School of Biomedical Informatics 2017 A Life Cycle Approach to the Development and Validation of an Ontology of the U.S. Common Rule (45 C.F.R. § 46) Francis Manion University of Texas Health Science Center at Houston, [email protected] Follow this and additional works at: https://digitalcommons.library.tmc.edu/uthshis_dissertations Part of the Bioinformatics Commons, and the Medicine and Health Sciences Commons Recommended Citation Manion, Francis, "A Life Cycle Approach to the Development and Validation of an Ontology of the U.S. Common Rule (45 C.F.R. § 46)" (2017). UT SBMI Dissertations (Open Access). 37. https://digitalcommons.library.tmc.edu/uthshis_dissertations/37 This is brought to you for free and open access by the School of Biomedical Informatics at DigitalCommons@TMC. It has been accepted for inclusion in UT SBMI Dissertations (Open Access) by an authorized administrator of DigitalCommons@TMC. For more information, please contact [email protected]. A Life Cycle Approach to the Development and Validation of an Ontology of the U.S. Common Rule (45 C.F.R. § 46) A Dissertation Presented to the Faculty of The University of Texas Health Science Center at Houston School of Biomedical Informatics in Partial Fulfilment of the Requirements for the Degree of Doctor of Philosophy By Frank J. Manion, M.S.E. University of Texas Health Science Center at Houston 2017 Dissertation Committee: Cui Tao, PhD1, Advisor Hua Xu, PhD1 William A. Weems, PhD2 Marcelline R. Harris, PhD3 Charles P. Friedman, PhD4 1The School of Biomedical Informatics 2University of Texas McGovern Medical School 3University of Michigan School of Nursing, Department of Systems, Populations, and Leadership 4University of Michigan Medical School, Department of Learning Health Sciences Copyright by Frank J. Manion 2017 Dedication I dedicate this dissertation to my wife Carol R. Manion whose constant encouragement has been essential to both this work, and to my life. May you continue to be the music in my life! I also dedicate this work to all those brave individuals who advance the human condition by taking part in human subjects research. ii Acknowledgements In the eloquent acknowledgment section of his recent Ph.D. dissertation from the School, Jose Franck DiazVasquez, to paraphrase, said ‘it all begins with people, it grows from interactions with people, and it is people who provide the opportunities.’ He is so right. There are many people who I must acknowledge for standing alongside me in this walk and providing constant support, encouragement, scholarship, and perhaps most importantly, their precious time. First and foremost is my wife, Carol, who has continually encouraged me for many years while I worked on first my Master’s degree and now this dissertation. She has been rewarded by many lonely days and nights while I studied or worked on projects. I could not have accomplished this without her love and support and I can never repay her. Next, I want to thank my advisor Dr. Cui Tao and all the members of my committee. Dr. Tao has taken me from knowing little about the field to this point and has been a constant source of wisdom, encouragement and a true help in this work. My colleague and friend Dr. Marcy Harris from the University of Michigan has been enormously generous with her time without which this work would have never been finished. Dr. Bill Weems, who was a friend and colleague long before he was a member of my dissertation committee, has spent countless hours over the years discussing and tutoring me in certain technical aspects of privacy, authentication and authorization. Dr. Hua Xu iii and his lab have assisted me in the field of NLP and been a constant positive influence. Dr. Charles Friedman at the University of Michigan provided a bit of an ‘academic home’ for me in Ann Arbor, and constantly has looked out for my best interests. I thank you all for the important roles you have played. Many others from the Tao and Xu labs have assisted with this work. In particular Muhammad ‘Tuan’ Amith, Hsing-Yi ‘Judy’ Song, Madhuri Sankaranarayanapillai, and Yaoyun Zhang. Given my role as a distance student, I’m sure there are others who have given time and effort that I’m either unaware of or simply have forgotten. I hope they will forgive me and know that I am grateful to them all. Many people from the University of Michigan and other academic medical centers have played a supporting role. Those who contributed their knowledge of the domain as subject area experts were Drs. Thomas Carey, Blake Roessler, Richard Redman, Victoria Blanc, Raymond Hutchinson, Michael Geisser, Jodyn Platt, Raymond De Vries, Terry VandenBosch, and Allan Loup, J.D. Members of the OBO consortium also provided advice with biomedical ontologies and the OBO Foundry. These include Drs. Yongqun ‘Oliver’ He from the University of Michigan, Drs. Christian Stockert and Jie Zheng from the University of Pennsylvania, and Dr. Mathias Brochhausen from the University of Arkansas. Dr. Jihad Obeid from the Medical University of South Carolina provided help with informed consent, and Ms. Helena Ellis, managing director at Biobanking Without Borders, LLC provided additional advice on biobanking. Marisa Conte, M.L.I.S. from the University of Michigan Taubman Health Sciences Library assisted with the literature searches. iv Finally, to all those who provided encouragement and support along the way. These include Dr. Jack London from Thomas Jefferson University, and innumerable colleagues at the University of Michigan, most particularly Ms. Jin Park. M.P.H. and Mr. Wei Shi M.S., M.B.A. I am grateful to you all. This work was supported in part by grant HG009454 from the National Human Genome Research Institute. This work used the Protégé editor, which is supported by grant GM10331601 from the National Institute of General Medical Sciences of the United States National Institutes of Health. v Abstract Requirements for the protection of human research subjects stem from directly from federal regulation by the Department of Health and Human Services in Title 45 of the Code of Federal Regulations (C.F.R.) part 46. 15 other federal agencies include subpart A of part 46 verbatim in their own body of regulation. Hence 45 C.F.R. part 46 subpart A has come to be called colloquially the ‘Common Rule.’ Overall motivation for this study began as a desire to facilitate the ethical sharing of biospecimen samples from large biospecimen collections by using ontologies. Previous work demonstrated that in general the informed consent process and subsequent decision making about data and specimen release still relies heavily on paper-based informed consent forms and processes. Consequently, well-validated computable models are needed to provide an enhanced foundation for data sharing. This dissertation describes the development and validation of a Common Rule Ontology (CRO), expressed in the OWL-2 Web Ontology Language, and is intended to provide a computable semantic knowledge model for assessing and representing components of the information artifacts of required as part of regulated research under 45 C.F.R. § 46. I examine if the alignment of this ontology with the Basic Formal Ontology and other ontologies from the Open Biomedical Ontology (OBO) Foundry provide a good fit for the regulatory aspects of the Common Rule Ontology. vi The dissertation also examines and proposes a new method for ongoing evaluation of ontology such as CRO across the ontology development lifecycle and suggest methods to achieve high quality, validated ontologies. While the CRO is not in itself intended to be a complete solution to the data and specimen sharing problems outlined above, it is intended to produce a well-validated computationally grounded framework upon which others can build. This model can be used in future work to build decision support systems to assist Institutional Review Boards (IRBs), regulatory personnel, honest brokers, tissue bank managers, and other individuals in the decision-making process involving biorepository specimen and data sharing. vii Vita 1976............................B.S., Music Education, Baldwin-Wallace University 1985............................M.S.E., Computer and Information Science, University of Pennsylvania 1976 – 1983................Project Leader, CHI Research Incorporated, Audubon, NJ 1983 – 1984................Member, Engineering Staff, RCA Adv. Technology Labs, Camden, NJ 1984 – 1986................Systems Programmer, Fox Chase Cancer Center, Philadelphia, PA 1986 – 1990................Senior Systems Programmer, Fox Chase Cancer Center, Philadelphia, PA 1990 – 1994................Manager, Systems and Applications, Fox Chase Cancer Center, Philadelphia, PA 1990 – 1994................Project Leader, Cooperative Human Linkage Center Fox Chase Cancer Center and University of Iowa 1994 – 2002................Director, Research Computing Services & Bioinformatics Facility, Fox Chase Cancer Center, Philadelphia, PA 2002 – 2009................Director, High-Performance Computing Facility, Fox Chase Cancer Center, Philadelphia, PA 2002 – 2009................Member, Bioinformatics Program, Fox Chase Cancer Center, Philadelphia, PA viii 2002 – 2009................Sr. Director & Chief Technology Officer/Deputy CIO, Information Science & Technology, Fox Chase Cancer Center, Philadelphia, PA 2009 – 2017................Chief Informatics Officer & Director, Cancer Center Informatics Facility, University of Michigan Comprehensive Cancer Center, Ann Arbor, MI Publications Fenton, S.H., Manion, F., Hsieh, K.L., and Harris, M., Informed Consent: Does Anyone Really Understand What is Contained in the Medical Record?, Applied Clinical Informatics, 6(3):466-477, 2015. PMID:26448792. PMCID: PMC4586336. doi: 10.4338/ACI-2014-09-SOA-0081. Schumacher, F.R., Schmit, S.L., Jiao, S., Edlund, C.K., Wang, H., Zhang, B., Hsu, L., Huang, S.C., Fischer, C.P., Harju, J.F., Idos, G.E., Lejbkowicz, F., Manion, F.J. et. al. (2015). Genome-wide association study of colorectal cancer identifies six new susceptibility loci. Nature Communications, 6:7138. PMID: 26151821. http://doi.org/10.1038/ncomms8138.

A Life Cycle Approach to the Development and Validation of an Ontology of the U.S

UNIVERSITY of CALGARY Concept-Learning Supported

Deep Semantics in the Geosciences: Semantic Building Blocks for a Complete Geoscience Infrastructure

Functions in Basic Formal Ontology

A Definition of Ontological Category

The Zebrafish Anatomy and Stage Ontologies: Representing the Anatomy and Development of Danio Rerio

Introduction to Ontologies Part I

GMS 6805: Introduction to Applied Ontology Location

Ontology, Ontologies, and Science

Human Disease Ontology 2018 Update: Classification, Content And

Ech-0205 V1.0 Linked Open Data

Ontology-Based Approach for Structural Design Considering Low Embodied Energy and Carbon

Introduction to Applied Ontology and Ontological Analysis