Next Generation Sequencing and Its Application in Livestock


Journal of Entomology and Zoology Studies 2018; 6(3): 1601-1609
E-ISSN: 2320-7078, P-ISSN: 2349-6800
© 2018 JEZS
Received: 18-03-2018, Accepted: 23-04-2018

Vandana Yadav, BL Saini, Narendra Pratap Singh, Neeti Lakhani and Rahul Sharma

Vandana Yadav: Ph.D. Scholar, Animal Genetics, ICAR-NDRI, ICAR-National Dairy Research Institute, Karnal, Haryana, India
BL Saini: Ph.D. Scholar, Animal Genetics, ICAR-IVRI, ICAR-National Dairy Research Institute, Karnal, Haryana, India
Narendra Pratap Singh: M.V.Sc. Scholar, Animal Genetics, ICAR-NDRI, ICAR-National Dairy Research Institute, Karnal, Haryana, India
Neeti Lakhani: Ph.D. Scholar, Nutrition Division, ICAR-NDRI, ICAR-National Dairy Research Institute, Karnal, Haryana, India
Rahul Sharma: VEO, DAH, Katni, Madhya Pradesh, India

Abstract
The ability to determine nucleic acid sequences is one of the most important platforms for the detailed study of biological systems. Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. Compared with basic Sanger sequencing, next generation sequencing platforms provide a high-throughput and much cheaper alternative. All NGS platforms share a similar base methodology that includes template preparation, sequencing and imaging, and data analysis. Different NGS platforms use different technologies to identify each base. Commercially available technologies include 454 FLX Roche (pyrosequencing), Illumina Solexa (CRT), ABI SOLiD (sequencing by ligation) and Ion semiconductor sequencing (Ion Torrent), all of which require a pre-amplification step before sequencing. Single-molecule approaches are the developing technology for sequencing and may bring the cost down to about $1000 per genome. Single-molecule technologies include the Helicos sequencer, SMRT and Nanopore technology, which do not require amplification prior to sequencing. Data analysis can be time-consuming and may require special knowledge of bioinformatics to gather accurate information from sequence data.
Resequencing of the human genome is being performed to identify genes and regulatory elements involved in pathological processes. NGS has also provided a wealth of knowledge for comparative biology studies through whole genome sequencing of a wide variety of organisms. Comparing all DNA from control versus infected (diseased) individuals reveals the "extra" DNA, which most likely derives from the infectious agent. This approach has been used successfully to study, for example, colony collapse disorder killing honeybees, and also to identify the cause of diseases that killed thousands of humans in the past. Thus NGS can be used for genomics, transcriptomics, epigenetic and metagenomic studies.

Keywords: Sanger sequencing, nucleotide bases, template, adapter

1. Introduction
DNA sequencing is the process of determining the precise order of nucleotide bases (adenine, guanine, cytosine, and thymine) within a DNA molecule. Advances in DNA sequencing technology have proved to be a boon for biological and medical research and discovery. Knowledge of DNA sequences is required not only for basic biological research but also in numerous applied fields such as medical diagnosis, biotechnology, forensic biology, virology and biological systematics. The speed of modern DNA sequencing technology has been instrumental in sequencing the complete genomes of many life forms, including the human genome and those of many animal, plant and microbial species.

2. History of sequencing
Ray Wu at Cornell University (1970) established the first method for determining DNA sequences, based on a location-specific primer extension strategy.
The cohesive ends of lambda phage DNA were sequenced using DNA polymerase catalysis and specific nucleotide labeling, both of which form the basis of current sequencing technologies [1]. This primer-extension strategy was then used to develop more rapid DNA sequencing methods. Frederick Sanger (1977) at the MRC Centre, Cambridge, UK then developed the method for "DNA sequencing with chain-terminating inhibitors". In the same year, Walter Gilbert and Allan Maxam at Harvard developed another method, known as "DNA sequencing by chemical degradation". In 1987, Applied Biosystems introduced the first fully automated sequencer (ABI370). After that, various high-throughput sequencing platforms (Pyrosequencing-2005, Illumina-2006, Ion Torrent-2011, Pac-Bio-2012, Illumina HiSeq X-2013, Nanopore-2014) were launched, marking the era of sequencing. Advances in sequencing technologies have now made it possible to sequence a complete genome within a short time and in a cost-effective manner.

Correspondence: Vandana Yadav, Ph.D. Scholar, Animal Genetics, ICAR-NDRI, ICAR-National Dairy Research Institute, Karnal, Haryana, India

3. Next-generation sequencing methods
Generation refers to the chemistry and technology used by the sequencing process. First generation generally refers to Sanger sequencing. "Next-generation" is generally used to refer to any of the high-throughput methods developed after Sanger. Third generation refers to single-molecule methods (Helicos and Pac-Bio SMRT) and fourth generation includes Nanopore sequencing (Fig: 1).

Fig 1: Different NGS platforms

3.1 Sanger method of sequencing
It was the most common approach used for DNA sequencing, invented in 1977 by Frederick Sanger, who received the Nobel Prize in 1980.
Fig 2: Sanger sequencing by chain termination

This method is also termed the chain termination or dideoxy method (Fig: 2). It relies on the use of dideoxyribonucleoside triphosphates, derivatives of the normal deoxyribonucleoside triphosphates that lack the 3′ hydroxyl group [2]. It is based on the synthesis of DNA in a controlled way to generate fragments terminating at specific points. Purified DNA is synthesized in vitro in a mixture that contains single-stranded molecules of the DNA to be sequenced, the enzyme DNA polymerase, a short primer DNA to enable the polymerase to start DNA synthesis, and the four deoxyribonucleoside triphosphates (dATP, dCTP, dGTP, dTTP) [3]. If a dideoxyribonucleotide analog of one of these nucleotides is also present in the mixture, it can be incorporated into a growing DNA chain. Because this chain now lacks a 3′ OH group, the addition of the next nucleotide is blocked, and the DNA chain terminates at that point.
To determine the complete sequence of a DNA fragment, the double-stranded DNA is first separated into its single strands and one of the strands is used as a template for sequencing. Four different chain-terminating dideoxyribonucleoside triphosphates (ddATP, ddCTP, ddGTP, ddTTP) are used in four separate DNA synthesis reactions on copies of the same single-stranded DNA template. Each reaction produces a set of DNA copies that terminate at different points in the sequence. The products of these four reactions are separated by electrophoresis in four parallel lanes of a thin slab polyacrylamide gel. The newly synthesized fragments are detected by a label (either radioactive or fluorescent) that has been incorporated either into the primer or into one of the deoxyribonucleoside triphosphates used to extend the DNA chain [4]. In each lane, the bands represent fragments that have been terminated at a given nucleotide but at different positions in the DNA. By reading off the bands in order, starting at the bottom of the gel and working across all lanes, the DNA sequence of the newly synthesized strand can be determined. Thus, a long and labor-intensive process was completed, and the sequencing data for the DNA of interest were in hand and ready for assembly, translation to amino acid sequence, or other types of analysis.

4. Advances in DNA sequencing technologies
The landmark publications of the late 1970s by Sanger's and Gilbert's groups, and notably the development of the chain termination method by Sanger and colleagues, established the groundwork for decades of sequence-driven research that followed. The chain-termination method published in 1977, also commonly referred to as Sanger or dideoxy sequencing, has remained the most commonly used DNA sequencing technique to date and was [...] dramatic increase in cost-effective sequence throughput, albeit at the expense of read lengths [5]. The next-generation technologies commercially available today include the 454 GS20 pyrosequencing-based instrument (Roche Applied Science), the Solexa 1G analyzer (Illumina, Inc.), the SOLiD instrument from Applied Biosystems, the Heliscope from Helicos, Inc. and Nanopore from Oxford [6].

5. Workflow of next-generation sequencing
The workflow to produce next-generation sequence-ready libraries is straightforward and includes sample preparation, library construction, clonal amplification, sequencing and data analysis (Fig: 3).

a. Sample preparation
Source of sample (DNA, RNA).
Qualify and quantify samples.

b. Library construction
Prepare platform-specific library.
Fragmentation of target DNA and adapter ligation.

c. Clonal amplification
Bridge PCR amplification: Illumina sequencing.
Emulsion PCR: SOLiD, Ion Torrent and pyrosequencing.
Third and fourth generation sequencing technologies don't require amplification.

d. Sequencing
Perform sequencing run on the NGS platform.
Sequencing by ligation: SOLiD sequencing.
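
The library construction step of the workflow (fragmentation of target DNA and adapter ligation) can be illustrated with a small toy sketch in Python. It is only an illustration under stated assumptions, not a description of any platform's chemistry: the adapter sequences, the fragment size range, the random target sequence and the helper names fragment and ligate_adapters are all hypothetical and chosen for readability.

```python
import random

# Hypothetical placeholder adapters (assumption): real platforms use their own sequences.
ADAPTER_LEFT = "AAAAACCCCC"
ADAPTER_RIGHT = "GGGGGTTTTT"


def fragment(dna, min_len, max_len, rng):
    """Cut dna into consecutive pieces with lengths drawn from [min_len, max_len].

    The last piece may be shorter than min_len because it takes whatever is left.
    """
    pieces = []
    pos = 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)  # inclusive on both ends
        pieces.append(dna[pos:pos + size])
        pos += size
    return pieces


def ligate_adapters(pieces):
    """Flank every fragment with the two placeholder adapters."""
    return [ADAPTER_LEFT + piece + ADAPTER_RIGHT for piece in pieces]


# Toy target: 200 random bases generated with a fixed seed so the run is repeatable.
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(200))

fragments = fragment(target, 30, 50, rng)
library = ligate_adapters(fragments)

print(len(fragments), "fragments in the toy library")
for molecule in library:
    print(len(molecule), molecule)
```

Every molecule in the resulting list carries the same known sequence at both ends. This mirrors the reason adapters are ligated in the first place: they give fragments of unknown sequence common, known flanks.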