Sample Identification and Tracking in Biobanks
A thesis submitted to the University of Dublin, Trinity College for the degree of Doctor of Philosophy

Atieh Zarabzadeh
Department of Computer Science
Trinity College Dublin
October 2009

Declaration

I, the undersigned, declare that this work has not been previously submitted as an exercise for a degree at this or any other University, and that, unless otherwise stated, it is entirely my own work.

______________________
Atieh Zarabzadeh
October 2009

Permission to lend or copy

I, the undersigned, agree that the Trinity College Library may lend or copy this thesis upon request.

______________________
Atieh Zarabzadeh
October 2009

Dedication

To my parents...

Acknowledgment

I would like to express my appreciation to all those without whose help I would not have been able to complete this thesis.

First and foremost I offer my sincerest gratitude to my supervisor, Prof. Jane Grimson, for her invaluable support, supervision and encouragement. I would like to express my profound thanks to Prof. William Watson, University College Dublin, for his patient support and valuable suggestions, which enabled me to observe the world from a different window. Completing this thesis without Jane and Bill's advice would not have been possible.

I am grateful to the Irish Prostate Cancer Research Consortium members for their support in the first year, when I was learning about the biobanks, and for their kind cooperation during the evaluation phase. I am especially grateful to Ms. Marie Burke, Dr. Linda Sullivan, Dr. Ruth Foley and Dr. Antoinette Perry.

I really appreciate the kindness of Dr. Darach Golden and Dr. Geoff Bradley from the Trinity Centre for High Performance Computing during all these years, and especially during the last few months, for setting up a test server and allowing me to work in their office.
I also wish to express my thanks to the Irish Research Council for Science, Engineering and Technology and Hewlett-Packard for providing funding for this project, without which life would have been difficult.

In my daily work I have been blessed with a friendly and cheerful group in the Center for Health Informatics (CHI) laboratory. Their support and encouragement made long working hours tolerable. I wish to express my appreciation to former and current staff and students in the School of Computer Science and Statistics.

I am thankful to all my friends for their support, encouragement and advice. I am deeply indebted to Farzad Hayati for his support, encouragement and help in the implementation phase of the project and beyond. I am grateful to Karen Tierney and Sinead Kelleher for editing and proofreading my thesis, and to Ronan McDonnell and Dr. Bridget Kane for their advice throughout all these years. I am also indebted to Maureen D'Arcy for her kindness during the times I was under pressure. I would also like to express my thanks to Dr. Mohammad Jamshidi.

I would like to express my love and gratitude to my beloved family for their understanding and endless love, not only throughout the duration of my studies but throughout my entire life. My parents, Nasrin and Mohsen, and my sister, Talieh, are the best I could ever have.

Abstract

Biobanks or bio-repositories facilitate the storage and maintenance of biological samples and data to support discovery of biomarkers, therapeutic targets, and the underlying causes of diseases. Such discoveries require high quality samples that are supported by their associated data. While samples and their data are of crucial importance, they are expensive to collect and maintain. Samples and data are subject to various procedures, from data collection from the sample donors, to clinical laboratory analyses, to processing in the research laboratory (omic procedures), and ultimately to knowledge discovery.
Samples are transferred between different locations and moved in and out of freezers at several different points in this workflow, increasing the potential for error in sample identification and incorrect linkage with the corresponding data. Implementing a robust and reliable Sample Identification and Tracking System (SITS) that can track samples through all the various phases in different locations, while at the same time protecting the confidentiality of the sample donors, represents a major challenge.

In this thesis a novel method has been developed based on Radio Frequency Identification (RFID) technology to support reliable tracking of samples from collection to knowledge discovery. In this system RFID tags are attached to the tubes containing the samples and send and receive data to and from the users through a web based interface. These RFID tags are passive, which gives them a longer life span than active RFID tags; they are also rewritable and survive extreme temperatures. They have a small physical size and store dynamic data to support biobank activities. In addition to sample tracking, this system improves quality control of samples by recording details of collection, procedure and storage data, such as the Standard Operating Procedures (SOPs) under which the sample has been collected and processed.

The SITS prototype has been evaluated by the Irish Prostate Cancer Research Consortium (PCRC) biobank, a federated biobank for prostate cancer.

Publications Related to This PhD

1. Zarabzadeh, A., Hayati, F., Watson, R. W. G., Bradley, G., Grimson, J. (2009) An Overview of Sample Identification and Tracking System for Biobanks: Abstract for the 14th Annual Conference and Scientific Symposium of the Healthcare Informatics Society of Ireland (HISI). Dublin, Ireland.
2. Zarabzadeh, A., Hayati, F., Watson, R. W. G., Bradley, G., Grimson, J. (2009) A Sample Identification and Tracking System for Biobanks: Proceedings of the 22nd IEEE International Symposium on Computer-Based Medical Systems (CBMS 2009). New Mexico, USA.
3. Zarabzadeh, A., Watson, R. W. G., Bradley, G., Grimson, J. (2008) Participants Confidentiality in Prostate Cancer Research Consortium Bio-repository: Abstract for the 13th Annual Conference and Scientific Symposium of the Healthcare Informatics Society of Ireland (HISI). Dublin, Ireland.
4. Zarabzadeh, A., R. W. G. Watson, et al. (2009). Ensuring participant privacy in networked biobanks. Principles and Practice in Biobank Governance. J. Kaye and M. Stranger, Ashgate.
5. Zarabzadeh, A., Watson, R. W. G., Bradley, G., Grimson, J. (2008) Ensuring Participant Confidentiality in Bio-repositories: Abstract for Conference on Governing Biobanks – What are the challenges? Oxford, UK.
6. Zarabzadeh, A., Watson, R. W. G., Grimson, J. (2008) The use of Radio Frequency Identification to Track Samples in Bio-repositories: Proceedings of the IEEE 1st International Conference on Information Technology (IT2008). Gdansk, Poland. pp.351-354.
7. Zarabzadeh, A., Watson, R. W. G., Bradley, G., Grimson, J. (2007) Sample Tracking and Identification in Biobanks: Abstract for the 12th Annual Conference and Scientific Symposium of the Healthcare Informatics Society of Ireland (HISI). Dublin, Ireland.

Table of Contents

List of Figures
List of Tables
Abbreviations
Chapter 1. Introduction
1.1 Motivation
1.2 Research Question
1.3 Research Goals and Objectives
1.4 Research Contribution
1.5 Thesis Overview
Chapter 2. Sample Identification and Tracking Systems
2.1 Introduction
2.2 Biobanking
2.3 Sample and Data Journey
2.4 Biobank Information Management Systems
2.5 The Irish Prostate Cancer Research Consortium (PCRC) Biobank
2.6 Currently Available SITS
2.7 SITS Requirements
2.8 Conclusion
Chapter 3. Confidentiality
3.1 Introduction
3.2 Confidentiality in Biobanks
3.3 Best