SCIENCE CHINA Life Sciences
· REVIEW ·
January 2010 Vol.53 No.1: 44–57
doi: 10.1007/s11427-010-0023-6
Celebrating the 60th Anniversary of Scientia Sinica (SCIENCE CHINA)

The next-generation sequencing technology: A technology review and future perspective

ZHOU XiaoGuang1†*, REN LuFeng1†, LI YunTao2, ZHANG Meng1, YU YuDe2 & YU Jun1*

1 Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China;
2 Institute of Semiconductor, Chinese Academy of Sciences, Beijing 100083, China

† Contributed equally to this work
* Corresponding author

Received December 8, 2009; accepted December 16, 2009

As one of the most powerful tools in biomedical research, DNA sequencing has not only improved its productivity at an exponential rate but has also evolved into a new layout of technological territories toward engineering and physical disciplines over the past three decades. In this technical review, we look into the technical characteristics of the next-gen sequencers and provide prospective insights into their future development and applications. We envisage that some of the emerging platforms are capable of supporting the $1000 genome and $100 genome goals if given a few years for technical maturation. We also suggest that scientists from China should play an active role in this campaign, which will have a profound impact on both scientific research and societal healthcare systems.

Keywords: genomics, DNA sequencing, next generation sequencing technologies, sequencer

Citation: Zhou X G, Ren L F, Li Y T, et al. The next-generation sequencing technology: A technology review and future perspective. Sci China Life Sci, 2010, 53: 44–57, doi: 10.1007/s11427-010-0023-6

© Science China Press and Springer-Verlag Berlin Heidelberg 2010 life.scichina.com www.springerlink.com

1 Introduction

DNA sequencing technology has played an essential role in the advancement of molecular biology ever since its invention [1]. From the early manual sequencing developed by Frederick Sanger, through first-generation automated sequencers driven by Sanger chemistry, to the present next-gen sequencing platforms, we have witnessed tremendous changes in this field [2]. Some even liken this change in genomic sequencing to the evolution of semiconductor technology [3]. The comparison is not totally unfounded: the speed of sequencing has improved exponentially over the last few decades, similar to what the semiconductor industry has experienced under Moore's law [4]. This rapid transformation is captured in Figure 1. It has fundamentally changed the way we can examine the blueprint of all life and has helped to propel the creation and development of other branches of genomic studies, such as comparative genomics and bioinformatics, as well as closely related fields such as systems biology and synthetic biology.

In a way, technological advancement in DNA sequencing has transformed the study of the fundamental elements of life, from individual, localized genes or gene fragments to whole genomes, which in turn demands more competent sequencing technology. The synergetic relationship between sequencing and its applications ensures that the trend will continue in the foreseeable future and even accelerate, owing to the promise of and drive for personalized medicine in disease diagnosis and treatment. Here, we provide a review of the evolution of sequencing technology, a summary of generational advancements with their merits and drawbacks, and a prediction of the possible direction of the field. For ease of discussion, we categorize the progress of sequencing technology into three generations with sub-generations, as illustrated in Table 1. The designation of each stage of advancement is somewhat arbitrary, but it nevertheless captures the key delineation of technological advances of each period.

Figure 1 Sequencing technology timeline

Table 1 Roadmap of sequencing technology a)
Generation: 1st-G, 2nd-G, 3rd-G
Version: 1.1, 1.2, 2.1, 2.2, 2.3, 3.1, 3.2
Platform: ABI/GenoME Sanger ABI MS SBS Illumina Complete Genomics SBL ABI/Polonator G.007 SBP Roche FD Helicos SM-SBS Pacific Biosciences/ ? FE VisiGen SM-SBL SM-SBP Pore PoC Nano Nife PoC Graphene PoC
a) SBS: sequence-by-synthesis; SBL: sequence-by-ligation; SBP: sequence-by-pyrosequencing; SM: single molecule; FD: fixed DNA; FE: fixed enzyme; PoC: proof-of-concept; ?: expected technology

2 A review of the technology and its recent developments

2.1 1st generation technology – fluorescently labeled Sanger method

Before the appearance of the first automated DNA sequencing platform, the widely accepted DNA sequencing method of choice had been Sanger's chain-termination method, developed in the mid-1970s, for which Sanger was awarded the Nobel Prize in Chemistry in 1980 [5]. His invention opened a realm of possibility for researchers, for the first time, to study in depth the genetic code of life. The original method was primarily a manual endeavor and hard to automate. For one, it used radioactive isotope labeling of the primer for DNA ladder imaging, which made the sequencing process far from user-friendly. In addition, the requirement of four separate chain-termination reactions with dideoxynucleotides (ddNTPs), followed by slab-gel based separation of the chain-terminated products on four individual electrophoretic lanes, is both time- and reagent-consuming. All of these factors severely limited the overall throughput of sequencing; hence the desire to develop a non-radioactive 1st generation sequencing technology.

2.1.1 G1.1

The initial version of the 1st generation sequencer first appeared in the mid-1980s and was developed in Leroy Hood's laboratory at Caltech [6]. It was made possible through modifications to Sanger's method. The key improvement was the use of fluorescent dyes to replace radioactive labeling: the four dideoxynucleotide terminators are tagged with differently colored fluorophores. Furthermore, the tag is attached to the terminator molecule (ddNTP) instead of the primer, as was the case in Sanger's original method. The color-coded scheme made it possible to perform all four chain-termination reactions in one tube. Polyacrylamide gel analysis of the ladder fragments can then be performed through a computerized fluorescence detection system. This greatly enhanced the overall sequencing speed and reduced the manual intervention required of the operator during a sequencing run.

In the following year, ABI introduced its first semi-automated DNA sequencing platform, the ABI 370 sequencer, based on the technology from Leroy Hood's lab [7]. The subsequent two decades saw rapid change and improvement in its performance, but the underpinning working principle did not change until very recently.

2.1.2 G1.2

Toward the end of the last century, the second version of the 1st-generation technology appeared, bringing further enhancement in the speed and quality of DNA sequencing. This was mainly achieved through improvements in two areas. First, slab-gel based separation was replaced by capillary electrophoresis; second, the number of samples that can be analyzed concurrently was increased through higher parallelism. The use of capillaries instead of slab gels eliminated sample loading, reduced reagent consumption and sped up analysis. Further, the compact form of the capillary device makes it easier to parallelize multiple sequencing runs, resulting in higher instrument throughput: 96 samples on the ABI 3730 platform and 384 samples on the Amersham MegaBACE could be processed in one run. This generation of sequencers played a pivotal role in DNA sequence production at the later stage of the Human Genome Project and helped to accelerate the project's completion. They remain in continuous use to this day because of their key advantages in raw data accuracy and sequence read length.

Through decades of gradual improvement, 1st-generation sequencers can achieve read lengths up to 1000 bp, with raw accuracy as high as 99.999%, at a cost as little as $0.50/kilobase and throughput close to 600000

tions of sequencing platform. The reliability, raw accuracy, and scalability of the tried-and-true method will continue to play an important role, especially in sequencing PCR products and clone-ends of plasmids and bacterial artificial chromosomes, as well as in genotyping for STR markers.

2.2 2nd Generation technologies – cyclic array sequencing by Synthesis

The so-called next generation sequencing methods encompass a myriad of approaches based on different technologies. Although they use quite diverse techniques and biochemistry in each step, from template library preparation and fragment amplification to sequencing, they all adopt a massive matrix configuration popularized by microarray analysis: the DNA samples on the array are analyzed simultaneously, in parallel. Furthermore, sequencing is carried out by observing and recording optical events through a microscopic apparatus during iterative sequencing cycles, that is, a serial extension of the primed template by either DNA polymerase [8] or ligase [9].

Several key characteristics follow from this general description. First, massive parallelism can be achieved through an ordered or disordered array configuration that offers a high degree of information density. Theoretically, this is only limited by the diffraction limit of light (i.e., half of the wavelength used for detection of