Cancer Genome-Sequencing Study Design
REVIEWS
APPLICATIONS OF NEXT-GENERATION SEQUENCING

Jill C. Mwenifumbo and Marco A. Marra

Abstract | Discoveries from cancer genome sequencing have the potential to translate into advances in cancer prevention, diagnostics, prognostics, treatment and basic biology. Given the diversity of downstream applications, cancer genome-sequencing studies need to be designed to best fulfil specific aims. Knowledge of second-generation cancer genome-sequencing study design also facilitates assessment of the validity and importance of the rapidly growing number of published studies. In this Review, we focus on the practical application of second-generation sequencing technology (also known as next-generation sequencing) to cancer genomics and discuss how aspects of study design and methodological considerations, such as the size and composition of the discovery cohort, can be tailored to serve specific research aims.

Genome Sciences Centre, BC Cancer Research Centre, 675 West 10th Avenue, Vancouver, British Columbia V5Z 1L3, Canada.
Correspondence to M.A.M.
e-mail: [email protected]
doi:10.1038/nrg3445

NATURE REVIEWS | GENETICS VOLUME 14 | MAY 2013 | 321
© 2013 Macmillan Publishers Limited. All rights reserved

Cancer pathogenesis is rooted in inherited genetic variation and acquired somatic mutation; accordingly, genomics is integral to cancer research (for a review, see REF. 1). In 2008, the first cancer genome was sequenced using second-generation technology (also known as next-generation sequencing) [2]. Four years later, approximately 800 genomes from at least 25 different cancer types have been sequenced. Consider that only 20 years ago, sequencing one human genome took an international collaboration more than 10 years and cost US$3.8 billion [3–5]. Today, accurate and rapid genome sequencing costs only a few thousand dollars. With this advancement in technology comes considerable capacity to increase basic cancer biology knowledge and the opportunity to advance cancer prevention, diagnostics, prognostics and treatment.

Despite the diversity of questions to be addressed using cancer genome-sequencing studies (that is, studies that have used second-generation technology to sequence at least one cancer genome), only a limited number of specific aims have been investigated to date; many remain to be explored. For instance, cancer prevention is an important area that could greatly benefit from well-designed second-generation cancer genome-sequencing studies. Family-based and case–control study designs will be integral to uncovering inherited polymorphisms that predispose individuals to cancer. Knowing cancer predisposition can benefit patients, health practitioners and the health-care system if it results in a change of lifestyle behaviours or medical intervention that reduces cancer risk, morbidity or mortality. The aim of this article is not to review results of cancer genome-sequencing studies but to focus on their archetypal specific aims, methodological requisites and study designs. Throughout this Review, the benefits and limitations of approaches, technologies and interpretation will also be discussed.

Driver mutations
Somatic mutations that have a role in creating, controlling and/or directing some aspect of the cancer phenotype.

Kataegis
From the Greek meaning ‘thunderstorm’, this refers to clusters of somatic single-nucleotide variants that often colocalize with somatic structural variants.

Chromothripsis
From the Greek meaning ‘chromosome shattering’, this refers to a single event of genome shattering and reassembly that results in complex somatic structural variations characterized by oscillating copy number and tens to hundreds of rearrangements that localize to one or a few chromosomes.

Specific aims

Thus far, most cancer genome-sequencing studies have had one or more of four specific aims: discovering driver mutations; identifying somatic mutational signatures; characterizing clonal evolution; and advancing personalized medicine (FIG. 1; Supplementary information S1 (table)). First, determining which somatic mutations are likely to contribute to the cancer phenotype is the most common aim of cancer genome-sequencing studies. Discovering driver mutations leads to improved understanding of basic cancer biology and consequently treatment discovery and development. Take the gene enhancer of zeste 2 (EZH2), for example: second-generation sequencing resulted in the discovery of somatic mutations in EZH2 in lymphoma at a clinically significant frequency [6], spurring functional characterization [7] and leading to a promising treatment [8]. Second, identifying somatic mutational signatures has also led to gains in understanding basic cancer biology. For the first time, researchers can uncover global signatures of the mutation processes and DNA repair mechanisms that contribute to the catalogue of somatic mutations in a cancer type. Indeed, the signatures of two newly discovered mutational phenomena, kataegis [9] and chromothripsis [10], have been discovered thanks to cancer genome-sequencing studies. Third, characterizing clonal evolution is an important concept, particularly when considering cancer treatment, and this characterization can be achieved at the nucleotide level using sequencing. For example, major subclones may share a somatic mutation (or mutations) that confers intrinsic drug resistance, whereas minor subclones and de novo mutations may evolve to confer acquired resistance (as has been documented for the emergence of mutated v-Ki-Ras2 Kirsten rat sarcoma oncogene (KRAS) and acquired resistance to epidermal growth factor receptor (EGFR)-targeted therapy [11]).

[Figure 1: flow diagram. Box labels: Single-patient study; Discovery cohort; Ultra-deep sequencing; Multi-sample sequencing; Multi-region sequencing; Personalized medicine; Mutational signatures; Clonal evolution; Driver mutations; Recurrent mutation; SV detection; Pathway analysis; SNV, indel, SV detection; Clinically actionable mutation; <30-fold; ≥30-fold; Validation; Genome; Exome; Transcriptome; Multi-omics; Generalizability; Clinical importance; Interaction omics; Validation or extension cohort; No validation or extension cohort. Example publications cited in the boxes: Ley (2008) [2]; Campbell (2008) [42]; Shah (2012) [60]; Ding (2012) [100]; Tao (2011) [55]; Nik-Zainal (2012) [9]; Shah (2009) [16]; Banerji (2012) [72]; Fujimoto (2012) [30]; Jones (2010) [17]; Ellis (2012) [27]; Stephens (2011) [10]; Berger (2012) [97]; Chapman (2011) [65]; Cheung (2012) [89]; Pleasance (2009) [19]; Wang (2011) [64]; Muzny (2012) [43]; Wu (2012) [63]; Morin (2011) [62]; Molenaar (2012) [90]; Jones (2012) [76]; Sung (2012) [94].]

Figure 1 | Cancer genome second-generation-sequencing study designs. This flow diagram is intended to highlight the diversity of second-generation cancer genome-sequencing study designs. Study design choices are in yellow, green and blue boxes. Choosing a path along these boxes, which are connected by arrows, represents a possible cancer genome-sequencing study design. The yellow boxes highlight that single-patient studies are well-suited for personalized medicine, whereas the blue boxes highlight that discovery cohorts are well-suited for discovering driver mutations. Dark grey boxes represent choices for analyses or methods specific to the box that they are connected to. Specifically, clonal evolution can be examined through ultra-deep, multi-sample or multi-region sequencing. Discovery cohorts can be multi-omics studies that combine genome, exome or transcriptome sequencing. Genome sequencing can be either <30-fold or ≥30-fold redundant coverage for focused or comprehensive somatic mutation detection, respectively. Validation and extension cohorts can confirm the findings from discovery cohorts or can explore generalizability or clinical importance. Secondary aims are in light grey boxes. Peer-reviewed publications that may serve as models for a particular study design feature or study aim are noted in boxes. SNV, single-nucleotide variant; SV, structural variant.

Finally, advancing personalized medicine is a clear application of using second-generation technology to sequence cancer genomes. The goal of personalized medicine is to reduce toxicity and to improve efficacy through selecting the correct treatment for the correct patient at the correct dose and time. Medulloblastoma could be a model cancer type to further the personalized medicine paradigm as it is a heterogeneous cancer type with respect to both overall survival and molecular signatures; moreover, aggressive treatment results in improved mortality at the cost of substantial morbidity [12]. Identifying the patients who would be best served by an aggressive treatment regime has great potential to improve the quality of life of medulloblastoma survivors.

The specific aims discussed here are not exhaustive; many more remain to be explored with second-

public repository of simple genetic variations [31]; the transition to transversion ratio (~2.1 for the whole genome); and concordance with matched SNP genotyping arrays. SNP arrays can be used to estimate a false-negative rate; however, the assumption is that array calls are the gold standard.

SNVs are not the only kind of somatic mutation, but they are the most abundant. The somatic status of structural variants, such as copy number variants (CNVs), copy-neutral regions of loss of heterozygosity