BMJ.2016.036140 entitled "The Genomics England 100,000 Project" Response to Editor and Reviewer Comments

Editors' comments

1) While editors felt that your paper covered an interesting and relevant topic, we did not feel it was a good fit for the Analysis section of the journal in its current form. Typically Analysis articles are 1800-2000 word scholarly debate articles that present a clear argument. The paper as it currently stands is too detailed and too descriptive compared with the papers that we typically publish as Analysis, with no argument being put forward. -Thank you. We have shortened the manuscript from almost 4000 words to 2471. Furthermore, as also suggested by Reviewer 1, we have removed a lot of the descriptive/background material and added more data and emphasis relating to current challenges in NHS genomics and how the 100,000 Genomes Project is addressing these.

2) Editors also found the paper unclear as to when one of the key objectives of the 100,000 Genomes Project -- "to bring benefit to NHS patients" -- is supposed to be met and whether any of this goal has been met thus far. We would usually expect Analysis articles that cover specific projects and initiatives to include some information about outcomes, barriers, challenges etc. -Thank you. We have added examples of how a result can change management of a family and additional metrics around reports returned and diagnostic rate. We have also added a full section to the end entitled “Challenges, hurdles and future directions” as well as box e “100,000 Genomes Project: examples of early steps in catalysing complex change”, in which additional complex hurdles are addressed.

3) We wondered if a shorter article (around 1800 words) that focuses on this aspect, and does more to make the case for what the project is doing, as well as addressing possible unforeseen consequences, could be of interest if you are willing to revise? -Thank you for the invitation, which we have indeed taken up. We have shortened down to 2471 words, removing much of the background, putting in much more data and adding additional material around progress, outputs and challenges. I hope this meets with your expectations. As you mentioned in the original invitation for the piece, there was potential for a little flexibility [“The word count for Analysis articles is under 2000 words, but we can be flexible about this. Our main constraint is for the weekly print journal, where we only really have space for around 1700 words. We can go longer online, but it gets difficult to cut text for print when articles are very long, so I would suggest aiming for 3000 words. Or we can aim for online only publication”] If word count is absolutely critical, I can attempt to shave off another couple of hundred words.

Reviewer(s)' Comments to Author

Reviewer: 1

Comments: The paper is a thorough and comprehensive overview of the 100,000 Genomes Programme – a research and healthcare transformation programme currently underway in England. It outlines the history of genomics going back to the inception of the Human Genome Project, the aims of the 100,000 Genomes Programme and key people involved, why whole genome sequencing (as opposed to exome sequencing) has been chosen to deliver the aims of the project, and the role of Genomic Medicine Centres and the GeCIPs. As someone with an interest in the Programme, I find the article highly informative and comprehensive. It was also interesting to read about sequencing of pathogens, with which I am much less familiar.

However, I think the paper has been written for those who are already familiar with Genomic Medicine; in the UK most of these people will already be involved in the Programme, so it is not these readers that must be engaged. If the paper is really aimed at the general readership of the BMJ, including readers overseas, it would definitely benefit from a slightly different emphasis. I think that the paper could be considerably shortened and should focus on the benefits and medical potential for the different groups of patients from the very beginning of the article; the transformative aspects of healthcare should be spelled out with real examples and emphasis on how the NHS and its workforce must change in order to absorb this huge positive change. The majority of BMJ readers will have little knowledge of genomic medicine and are probably less interested in history and local political drivers and personnel involved. -Thank you. We have rewritten and shortened the paper, added data and taken these suggestions on board. We have changed the ‘Background’ to focus upon aspects of NHS genomics service delivery that require transformation, and the final section is now called “Challenges, hurdles and future directions” and focuses on how this programme is beginning to catalyse this change. Specifically, a box has been added entitled “box b: Application of genomics for healthcare improvement” and another box entitled “box e: 100,000 Genomes Project: examples of early steps in catalysing complex change”.

Some terms could also be better explained: What is the GENE consortium actually going to do? Examples for the reader would be very helpful. -Thank you. This section has been removed as the GENE consortium has now disbanded.

The Box on Big Data is very impressive in terms of numbers but actual examples of what Big Data might be able to achieve for patients would capture the imagination of the readers. -Thank you. I have removed the box and incorporated some of these numbers into the text, which I hope gives the context you allude to: “Piloting of patient recruitment, sample collection, sequencing and data analysis was initiated in 2014 for both the rare disease (4957 participants) and cancer programmes (1650 participants). The first patients from the NHS Genomic Medicine Centres were recruited in February 2015, with an average weekly recruitment of ~650 participants and cumulative recruitment of 46,698 participants by October 2017 (fig e). Sequencing at the 100,000 Genomes Project Sequencing centre in Hinxton commenced in March 2016; by October 2017, a cumulative total of 36,083 WGS had been generated. Constitutional samples are being sequenced to produce a minimum of 85 GB of data per sample (>300 million high quality, non-duplicated sequencing reads per sample, ensuring at least 15 sequencing read coverage for over 90% of the 3.2 billion bases in each patient genome, figs f and g).”

The list of GeCIPs needs more explanation for the BMJ reader. -I have sought to better explain this through the text and the legend: “Following a call for expressions of interest, >1700 senior academics from the UK representing >300 institutions, >600 NHS clinicians and >200 international collaborators responded and self-organised into 41 domains spanning rare disease, tumour types and cross-cutting themes such as ethics, health economics and advanced analytical approaches (fig c).” Legend: “Fig c: The Genomics England Clinical Interpretation Partnership: Researchers have grouped themselves into 41 “domains” and will work within these groups to analyse the genomic and clinical data to make additional diagnoses in patients and advance overall genomic understanding.”

The sentence ‘early investigation with a whole genome thus can obviate the protracted and expensive diagnostic odyssey which historically characterised investigation of these disorders,’ will not be understood by most readers and needs explanation. The benefits of making early diagnoses in rare diseases should be illustrated with examples. -I have sought to address this in “Box b: Application of genomics for healthcare improvement” • Diagnosis of rare and/or inherited diseases: Whole genome or exome sequencing for a child with rare disease within the first weeks or months of life enables provision of a precise molecular genetic diagnosis. This offers opportunity for early administration of the interventions and therapies most likely to be effective, improved estimation of prognosis, pre-emption of complications and, if timely, facilitates reproductive decision-making for subsequent pregnancies. Historically diagnosis in rare disease took, on average, seven years. A ‘diagnostic odyssey’ was typical, involving investigation of multiple organ systems by different medical specialists and, even once referred to a geneticist, prolonged, serial testing of individual genes.

Similarly the part on cancer is very well-written and aims are clearly articulated but again would benefit from specific examples of how genomic analysis can change patient management. The statement, ‘Whole genome sequencing of the tumour can predict therapeutic efficacy and prognosis, thus enabling administration of more effective treatments and avoidance of administration of drugs that may be ineffective’ needs to be illustrated with examples. -I have rewritten these sections to be (I hope) clearer about opportunities. Text box b: “Precision oncology and targeted cancer treatments: growth and replication of cancer cells can be driven by mutated oncogenes (‘oncogene addiction’). Small molecules or monoclonal antibodies switching off the over-active protein can yield dramatic response (targeted drugs). However, the response is often time-limited as the tumour typically evolves a ‘resistance mutation’.” Main text: “The results returned include (i) well-characterised mutations marking eligibility for NICE-approved targeted drugs such as BRAF-inhibitors in melanoma and EGFR inhibitors in lung cancer (ii) gene mutations, fusions and copy number changes which may enable access to clinical trials of experimental molecules (iii) analyses of signatures and mutational burden which are emerging as clinical biomarkers by which to predict drug response.”

Really interesting issues such as insurance are not explained at all! What are the implications for participants? These are questions which patients may bring to their GPs and specialists. -Detailed and judicious exploration of issues around insurance is likely beyond the scope of a short article. However, I have sought to at least recognise that these are complex and highly relevant issues in box f: 1) Complexity around consent. As per any introduction to the NHS of paradigm-shifting technology, this programme is a hybrid of clinical care and research, causing tensions around consent. Research consent addresses sharing data with researchers from industry as well as life-long data storage and linkage, both areas of potential public concern. Impact on insurance and longer-term diminution in mental capacity need to be covered when enrolling participants. Children and teenagers are alerted that they will require re-consenting on turning 18. We are also piloting return of secondary findings (genetic variants identified which are not related to the condition under investigation but which are informative to the risk of unrelated but serious medical conditions)1. The time required to cover and consent for all these complex aspects, along with collecting all clinical/phenotype data, has rendered infeasible recruitment within a routine outpatient setting.

What does ‘Following a 'bake-off’ between multiple sequencing providers launched in 2013, Illumina Inc was selected to partner with the programme to provide sequencing services, working alongside our Sequencing Advisory Group’, actually mean? -I have re-worded. “Following a competitive tender in 2013 between multiple sequencing providers, Illumina was selected to partner with the programme to provide sequencing services”

There are some minor errors which need to be corrected: Were the first patients really recruited in February 2015 in Newcastle and their results returned in March 2015 as the text states?? This suggests that results were returned within a month. From looking at the Box though these two dates pertain to different groups of patients. The first patients were recruited to the pilot in October 2013. -The pilot recruitment indeed began in 2013 and GMC recruitment in Feb 2015. The first results from the pilot were returned in March 2015. I have removed this box but sought to make this clearer in the text.

I believe it was Rosalind Franklin and not Rosalin. -Thank you. The errant ‘d’ has been re-patriated from the surname back to the first name.

In Summary: Shorten the paper as suggested above. Start with a very brief Introduction followed by benefits to patients, and specific vision of how genomic medicine will transform healthcare and, more importantly, what changes need to occur in the NHS. -Thank you. I have almost halved the paper in length. I have radically restructured it, and the areas of change required in the NHS are now highlighted in the ‘Background’ section, along with the introduction of box b, which explicitly highlights healthcare opportunities.

Analysis articles are supposed to ‘stimulate discussion, raise debate, and air controversies’. I am not sure that this remit is completely fulfilled; there is no real evaluation of the programme other than a comprehensive description. An idea of the investment that has been made, in financial terms, is important for the reader as it illustrates the real importance and value of the programme to NHSE in terms of healthcare transformation. -I have added “Box d: Major funding support for the 100,000 Genomes Project” to highlight investment and sought in the “Challenges, hurdles and future directions” section and “Box e: 100,000 Genomes Project: examples of early steps in catalysing complex change” to include some of the thorny areas that have been challenging and messy. Furthermore, I have now highlighted forward activity in the final paragraph “Genomics England and NHSE are co-leading transition working groups to evolve evidence-based frameworks to direct post-2018 NHS commissioning of genomic testing (including WGS)….”

Reviewer: 2

Comments: This is a description of the overall aims and design of an extremely valuable large-scale genomics project which will generate whole genome sequence data on 100,000 individuals recruited through the National Health Service (NHS) in England. This project will have enormous benefit to the scientific community for many years to come, including discovery of new genetic-phenotypic relationships as well as methods for implementation of this knowledge to improve the health and well-being of participants, their family members, and other patients in general.

The manuscript is very well-written, and a project of this magnitude and importance deserves the publication of such a detailed description of its goals, priorities and design. However, the manuscript would be significantly improved and be of much greater value if it also contained some preliminary data on a couple of key areas. -Thank you. We have introduced several data metrics and a number of new figures into the updated manuscript, thus substantially improving the amount of data we have included.

Participant attitudes, consent rates, and engagement: On page 12, these issues are alluded to briefly, with references (63-65) to 3 Genomics England white papers or publications. It is not clear whether these represent peer-reviewed publications easily accessible to the public. -The white papers are referenced as web resources to ensure that they can be accessed by readers.

It might be useful to reproduce key summary information in a table or figure for this manuscript to highlight key findings. Of particular interest to compare to other similar biobank/genomics projects would be items such as consent rates, clinical/EHR data descriptors such as longevity of retrospective clinical information, # clinical encounters, # laboratory values, etc. -We have now included “number of HPO terms (positive and total) per participant” (fig h) and number of additional linked data items from external datasets from HES, imaging, ONS and PROM datasets (fig i). For the HES data, we have broken it down into data terms by year and by source (outpatient, admissions, A&E, critical care) (fig i). Further detailed analyses of these datasets are beyond the scope of this summary paper, but we agree that some high-level inclusion of these data indeed enhances the paper.

Early sequencing data quality/experience: On page 8, the authors describe a goal of >1,000 WGS per week and a start date of March 2016. It would be useful to know where the sequencing throughput stands after 9-10 months start-up, and some high-level description of quality of sequence, success rates of obtaining high-quality results, etc. On page 14, Box f states that 10,000 WGS were completed in April, 2016--is this really supposed to be April, 2017? -We have now included figures on number of sequences performed (fig f), coverage (fig g), as well as additional metrics included in the text: “The first patients from the NHS Genomic Medicine Centres were recruited in February 2015, with an average weekly recruitment of ~650 participants and cumulative recruitment of 46,698 participants by October 2017 (fig e). Sequencing at the 100,000 Genomes Project Sequencing centre in Hinxton commenced in March 2016; by October 2017, a cumulative total of 36,083 WGS had been generated. Constitutional samples are being sequenced to produce a minimum of 85 GB of data per sample (>300 million high quality, non-duplicated sequencing reads per sample, ensuring at least 15 sequencing read coverage for over 90% of the 3.2 billion bases in each patient genome, figs f and g).”

Returning patient results: On page 14, it is stated that the first results returned were in March, 2015 and that 1,000 will be returned by end of 2016. Can some high-level summary data be presented here that will not interfere with separate, more detailed publication of these results? -We have now included figures on rare disease recruitment by disease category (fig e), and have detailed the number of results returned and the diagnostic rate in the text: “By October 2017, results had been returned to 4426 NHS participants with a preliminary diagnostic rate of 22%.”

I am sympathetic to the authors' desire to publish detailed data on these several areas in separate manuscripts, but really think that inclusion of at least high-level summaries will enhance the interest and value of this manuscript. -Thank you for your detailed review: the suggested amendments have, I believe, enhanced the paper substantially. We have completely rewritten, restructured and shortened the paper in response to your most helpful comments. We hope that the shorter and more data-rich manuscript reflects improvement in the direction you were proposing.