Analytics for Accelerating Biomedical Innovation Kien Wei Siah

Analytics for Accelerating Biomedical Innovation by Kien Wei Siah B.Eng., National University of Singapore (2015) S.M., Massachusetts Institute of Technology (2017) Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2021 ○c Massachusetts Institute of Technology 2021. All rights reserved. Author.............................................................. Department of Electrical Engineering and Computer Science December 17, 2020 Certified by. Andrew W. Lo Charles E. and Susan T. Harris Professor, Sloan School of Management Thesis Supervisor Accepted by . Leslie A. Kolodziejski Professor of Electrical Engineering and Computer Science Chair, Department Committee on Graduate Students 2 Analytics for Accelerating Biomedical Innovation by Kien Wei Siah Submitted to the Department of Electrical Engineering and Computer Science on December 17, 2020, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Science Abstract Despite the many breakthroughs in biomedical research and the increasing demand for new drugs to treat unmet medical needs, the productivity of research and development in the pharmaceutical industry has been steadily declining for the past two decades and is at its lowest level today. Traditional sources of financing in biopharma are no longer compatible nor aligned with the new realities of biomedical innovation, a process which has become more challenging, complex, expensive, time-consuming, and risky in the past twenty years. This has led to an outflow of capital from the biopharma industry, creating an ever-widening gap in funding between early-stage basic biomedical research and late-stage clinical development, where many promising academic discoveries fail not because of bad science but due to financial reasons. In this thesis, we explore the use of data analytics to facilitate biomedical innovation with a particular emphasis on the mismatch between the risk characteristics of biomedical projects and the risk preferences of biopharma investors. We begin with a brief introduction of the challenges faced by the biopharma industry in PartI. In PartII, we focus on analytics in the context of clinical trials. First, we develop analytics for precision medicine in non-small cell lung cancer, an emerging area of innovation in disease treatment with the advent of human genome sequencing. Next, we train and validate predictive models for estimating the probability of success of drug development programs. By providing greater risk transparency, our models can help facilitate more accurate matching of investor risk preferences with the risks of biomedical investment opportunities, thus increasing the efficiency of capital allo- cation. Finally, we turn our attention to the ongoing COVID-19 (coronavirus disease 2019) pandemic. We propose a systematic framework for quantitatively assessing the potential costs and benefits of different vaccine efficacy trial designs for COVID-19 vaccine development, including traditional and adaptive randomized clinical trials, and human challenge trials (HCTs). Our results contribute to the current ethical debate about HCTs by identifying situations where HCTs can provide greater social value versus non-challenge development pathways, and are thus justifiable. In Part III, we explore new business models to address the dearth of funding for translational medicine in the valley of death. In view of the increasingly critical role 3 that academic institutions play in the biotechnology industry, we develop a systematic framework for tracking the financial and research impact of university technology licensing in the life sciences using the Massachusetts Institute of Technology as a case study. Next, we investigate the use of a recently proposed megafund structure for financing early-stage biomedical research. We extend the existing model toaccount for technical correlation between assets in the underlying portfolio, thus allowing us to evaluate the tail risks of the megafund more accurately. We show that financial engineering techniques can be used to structure the megafund into derivatives with risk-reward characteristics that are attractive to a broad range of investors. This allows the fund to tap into a substantially larger pool of capital than the traditional sources of biopharma funding. In the last part of the thesis, we further extend the megafund framework to include adaptive clinical trial designs, and demonstrate the economic viability of using the megafund vehicle to finance and accelerate drug development for glioblastoma, a disease with very few treatment options, low historical probabilities of success, and huge unmet need. Thesis Supervisor: Andrew W. Lo Title: Charles E. and Susan T. Harris Professor, Sloan School of Management 4 Acknowledgments I am grateful to my thesis supervisor, Professor Andrew W. Lo, for his guidance and support over the past five years. I am deeply inspired by his wisdom, vision, and passion for research with real-world impact. He has been and always will be a role model in my career. I would also like to thank my thesis committee, Professor Martha L. Gray and Dr. Sean Khozin, and RQE committee, Professor Peter Szolovits and Professor John V. Guttag, who have provided invaluable feedback on my work and encouraged me to see the big picture. I am thankful to everyone in the MIT Laboratory for Financial Engineering: Jayna Cummings, Crystal Myler, Mavanee Nealon, and Kate Lyons for everything they have done to support our research; Chi Heem Wong, Samuel Huang, Qingyang Xu, Shomesh Chaudhuri, Zied Ben Chaouch, and Manish Singh for the many useful dis- cussions and joint work on several projects. They have made my PhD journey an amazing and enjoyable experience. I was fortunate to have the opportunity to work with many collaborators and co- authors during my time at MIT: David Aron, Donald Berry, Scott Berry, Christine Blazynski, Meredith Buxton, John Frishkopf, Olga Futer, Mark Gordon, Jerry Gupta, Peter Hale, Michael Hay, Leah Isakov, Nicholas Kelley, Sean Khozin, Grace Lindsay, Jeff Lura, Lita Nelsen, Lesley Millar-Nicholson, Kirk Tanner, and Richard Thakor.I have learned so much about drug development and the biopharma industry from each and everyone. I’m also grateful to the MIT Laboratory for Financial Engineering and the Rockefeller Foundation for funding support. The views and opinions expressed in this thesis are solely my own, and do not necessarily represent the views and opinions of any institution or agency, or any of the individuals acknowledged above. Finally, being far from home has been a great challenge. I would like to thank my friends and family for their unwavering love and support back in Singapore, without which none of this would be possible. 5 6 Contents I Introduction 19 1 Challenges of Biomedical Innovation 21 1.1 Introduction................................ 21 1.2 Thesis Contributions........................... 23 1.2.1 Clinical Trial Analytics...................... 23 1.2.2 New Business Models....................... 25 II Clinical Trial Analytics 29 2 Predictive Models for Patient Outcomes in Lung Cancer 31 2.1 Introduction................................ 32 2.2 Data.................................... 33 2.2.1 Study Population......................... 33 2.2.2 Tumor Response Data...................... 35 2.2.3 Longitudinal Tumor Size Data.................. 43 2.2.4 Survival Data........................... 44 2.3 Methods.................................. 48 2.3.1 Stochastic Model for Tumor Growth.............. 48 2.3.2 Machine Learning Models for Objective Response....... 51 2.3.3 Statistical Models for Survival.................. 52 2.4 Results................................... 53 2.4.1 Tumor Response......................... 53 7 2.4.2 Survival.............................. 57 2.5 Discussion................................. 64 3 Predictive Models for Drug Development Programs 71 3.1 Introduction................................ 72 3.2 Data.................................... 75 3.2.1 Summary Statistics........................ 75 3.2.2 Missing Data........................... 80 3.3 Methods.................................. 81 3.3.1 Statistical Imputation...................... 83 3.3.2 Machine Learning Models.................... 88 3.4 Results................................... 90 3.4.1 Imputation Versus Listwise Deletion.............. 90 3.4.2 Predicting Drug Approvals.................... 96 3.4.3 Predictions Over Time...................... 99 3.5 Discussion................................. 104 4 Cost/Benefit Analysis of Vaccine Trial Designs for COVID-19 115 4.1 Introduction................................ 116 4.2 Simulation Framework.......................... 118 4.3 Vaccine Efficacy Trial Designs...................... 119 4.3.1 Traditional Randomized Clinical Trial............. 119 4.3.2 Optimized Randomized Clinical Trial.............. 121 4.3.3 Adaptive Randomized Clinical Trial............... 122 4.3.4 Human Challenge Trial...................... 122 4.4 Efficacy Analysis............................. 124 4.4.1 Fixed-Duration Clinical Trial.................. 124 4.4.2 Superiority-by-Margin Testing.................. 126 4.4.3 Adaptive Clinical Trial...................... 127 4.5 Epidemiological Model.......................... 131 4.6 Cost/Benefit

Load more