Monitoring Trace Levels of Ctdna Using Integration of Variant Reads
Total Page:16
File Type:pdf, Size:1020Kb
Monitoring trace levels of ctDNA using 1 integration of variant reads 2 3 Jonathan C. M. Wan 4 Supervisor: Dr Nitzan Rosenfeld 5 6 Cancer Research UK Cambridge Institute 7 8 University of Cambridge This dissertation is submitted for the degree of 9 Doctor of Philosophy 10 Trinity College May 2019 11 Abstract 12 In patients with early-stage cancer, ctDNA detection rates can be low due to the presence of 13 few or no copies of any individual mutation in each sample. Sensitivity may be increased 14 by collecting larger plasma volumes, but this is not feasible in practice. Although cancers 15 typically have thousands of mutations in their genome, previous analyses measured only 16 individual or up to 32 tumour-specific mutations in plasma. 17 Here, we demonstrate that sensitivity can be greatly enhanced for any given input DNA 18 mass by analysing a large number of mutations via sequencing. We sequenced in plasma 2 4 19 10 -10 mutated loci per patient, using custom capture panels, whole exome (WES) or 20 whole genome sequencing (WGS). We developed a method for INtegration of VAriant Reads 21 (INVAR) that aggregates reads carrying tumour mutations across multiple mutant loci, and 22 assigns confidence to error-suppressed reads based on mutation context, fragment length and 23 tumour representation. This workflow combines a number of concepts in a novel approach in 24 order to quantify ctDNA with maximal sensitivity. 25 We applied INVARto plasma sequencing data from 45 patients with stage II-IV melanoma 26 and 26 healthy individuals. ctDNA was detected to 1 mutant molecule per million, and tu- 3 27 mour volumes of ~1cm . We show that this algorithm is applicable across targeted and 28 untargeted sequencing methods. In patients with stage II-III melanoma who relapsed after 29 resection, ctDNA was detected within 6 months post-surgery and prior to relapse in 50% of 30 cases, compared to 16% in a similar cohort analysed with digital PCR. In addition, INVAR 31 may enhance detection of ctDNA in samples with limited input or sequencing coverage by iv 32 aggregating signal across a large number of mutations. Using low-depth WGS (0.6x), ctDNA 33 was detected to 1 mutant per 10,000 molecules. Given that 60 genome copies of cfDNA may 34 be obtained from 1 drop of blood (50-75µL), we suggest that INVAR may enable cancer 35 monitoring from limited samples volumes. 36 37 As tumour sequencing becomes more widespread allowing identification of a large 38 number of mutations per patient, this method has potential to quantify ctDNA with enhanced 39 sensitivity, and to enable routine cancer monitoring using low-depth sequencing, potentially 40 from low-volume blood samples that might be self-collected. 41 Lay Abstract 42 Cancer cells release their contents into the bloodstream when they die. This includes mutated 43 DNA fragments from the cell nucleus, which can be called circulating tumour DNA (ctDNA). 44 Therefore, cancer can be detected from blood tests by identification of ctDNA through 45 methods such as DNA sequencing. The fraction of DNA fragments that are mutated in a 46 patient’s blood corresponds to the extent of cancer in their body. Thus, in patients with low 47 disease burden, for example post-surgery or in response to treatment, there are few mutant 48 DNA molecules in the bloodstream. This results in ctDNA molecules being missed when 49 blood is sampled and tested for specific mutations. 2 4 50 Cancers can have up to 10 to 10 mutations in their genome, each in different regions of 51 the genome. Each mutation gives rise to a ctDNA fragment, representing an opportunity to 52 detect cancer. DNA sequencing-based approaches have so far targeted up to 40 mutations 53 per patient in order to improve sensitivity for ctDNA compared to methods targeting single 54 mutations. In order to target multiple mutations efficiently, These mutations may be identified 55 per patient if tumour sequencing is performed prior to ctDNA analysis, which is possible in 56 the disease monitoring setting, and may allow improved detection or monitoring of cancer. 57 In this study, we studied patients with early or advanced melanoma skin cancer and 58 identified up to 5,073 tumour mutations per patient, then sequenced patients’ blood samples. 59 To achieve high sensitivity detection, we developed a method called ‘INtegration of VAri- 60 ant Reads’ (INVAR) that aggregates signal across thousands of mutations, and considers 61 biological and technical features of ctDNA sequencing. This method detects low levels of vi 62 ctDNA, down to a single mutant molecule per million circulating DNA molecules. ctDNA 63 was detected in 100% of advanced melanoma patients, and disease was monitored to the 64 limit of CT imaging. Within 6 months post-surgery, this method detected ctDNA in 50% 65 of stage II-III melanoma patients who later relapsed within 5 years. Furthermore, sensitive 66 detection of ctDNA can be achieved from samples where little starting material, or limited 67 (low-depth) sequencing data, is available. We show that this method can monitor ctDNA to 1 68 mutant molecule per 10,000 using limited input and low-depth WGS. These data could be 69 generated from ~1 genome copy, indicating that cancer monitoring from droplet volumes of 70 blood with high sensitivity may be possible. 71 72 As tumour and plasma sequencing becomes increasingly routine, INVAR could enhance 73 detection of residual disease and could enable cancer monitoring using low-depth sequencing, 74 or from limited sample volumes such as blood droplets, which might enable self-collection 75 at home. 76 Dedicated to my family. 77 Thank you for always encouraging me to pursue my interests. 78 Declaration 79 I hereby declare that except where specific reference is made to the work of others, the 80 contents of this dissertation are original and have not been submitted in whole or in part 81 for consideration for any other degree or qualification in this, or any other university. This 82 dissertation is my own work and contains nothing which is the outcome of work done in 83 collaboration with others, except as specified in the text and Acknowledgements. This 84 dissertation contains fewer than 65,000 words including appendices, bibliography, footnotes, 85 tables and equations and has fewer than 150 figures. 86 Jonathan C. M. Wan 87 May 2019 88 Acknowledgements 89 During this PhD, I have developed greatly both as a scientist, and personally. I am grateful to 90 have been around such excellent scientists in the Rosenfeld lab and at Cancer Research UK, 91 from whom I have learned very much. It has certainly been inspiring to work alongside such 92 motivated and hard-working individuals, who share my aim of improving cancer care for 93 patients. 94 First, I would like to thank Nitzan Rosenfeld for his excellent support, supervision, and 95 ideas. I feel very privileged to have been able to work with one of the leaders of the ctDNA 96 field; I have been impressed how Nitzan combines a clinician-like eye for translational 97 research questions with the problem-solving approach of a biotechnologist. He offered me 98 excellent support at strategic times, while giving me the academic freedom to creatively 99 problem-solve. I have been consistently impressed by his vision for our work, and am sure 100 the Rosenfeld lab will continue to be world-leading. 101 During my first year, I was co-supervised by Charlie Massie. During the early stages 102 of my PhD, I needed a higher level of support, and Charlie was generous with his time to 103 support and guide me through this period. I recall thanking him with a Christmas card of 104 that year, writing that he had provided me with the best start to a PhD I could have hoped for. 105 Almost three years on, I reiterate these thanks. 106 In the lab, I have worked closely with a multidisciplinary team of biologists, computa- 107 tional biologists, mathematicians, and clinicians. I learned my wet lab skills from Suzanne 108 Murphy, Christopher Smith, Florent Mouliere, Katrin Heider, Wendy Cooper, Irena Hude- xii 109 cova, Davina Gale and Charlie Massie, all of whom took the time to teach me during my 110 PhD. Through their teaching and support, I have developed a more systematic approach to 111 problem-solving in science. I thank James Morris, Dineika Chandrananda and Francesco 112 Marass for mentoring me through my development of my bioinformatics skills. In addition, I 113 would like to thank the following people for their discussions and help over the years, either 114 clinically or technically: Dee Lynskey, Mareike Thompson, Gahee Park, Keval Patel, Dana 115 Tsui, Jenny Chan, Andrew Gill, Andrea Marshall, Roy Rabbie and Hugh O’Brien. Eyal 116 Fisher joined us in the lab only very briefly, but I would like to thank him for his openness in 117 discussion of mathematics with a non-expert, which I greatly enjoyed. 118 As part of the MelResist study, I have had the good fortune to collaborate with excellent 119 clinicians and associated staff. I would like to thank Pippa Corrie and Christine Parkinson 120 for their valuable discussions, and in particular for their emphasis on the clinical relevance of 121 our work. I would also like to thank Emily Barker, Cathy Thorbinson and Gemma Young for 122 running the MelResist so well, enabling us to try innovative approaches. 123 I have worked most closely with Katrin Heider in the lab. We have worked on the INVAR 124 algorithm together, and I am very proud of what we have achieved together.