Next-Generation Sequencing Data for Use in Risk Assessment
Bruce Alexander Merrick
Abstract
Next-generation sequencing (NGS) represents several powerful platforms that have revolutionized RNA and DNA analysis. The parallel sequencing of millions of DNA molecules can provide mechanistic insights into toxicology and open new avenues for biomarker discovery with growing relevance for risk assessment. NGS technologies have improved over the last decade, with increased sensitivity and accuracy fostering new biomarker assays from tissue, blood, and other biofluids. NGS technologies can identify transcriptional changes and genomic targets with base pair precision in response to chemical exposure. Furthermore, there are several exciting movements within the toxicology community that incorporate NGS platforms into new strategies for more rapid toxicological characterizations. These include the Tox21 in vitro high-throughput transcriptomic screening program, development of organotypic spheroids, alternative animal models, mining archival tissues, liquid biopsy, and epigenomics. This review will describe NGS-based technologies, demonstrate how they can be used as tools for target discovery in tissue and blood, and suggest how they might be applied for risk assessment.

Addresses
Molecular and Genomic Toxicology Group, Biomolecular Screening Branch, Division National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA

Corresponding author: Merrick, B. Alex

Current Opinion in Toxicology 2019, 18:18–26
This review comes from a themed issue on Genomic Toxicology
Available online 8 March 2019
For a complete overview see the Issue and the Editorial
https://doi.org/10.1016/j.cotox.2019.02.010
2468-2020/Published by Elsevier B.V.

Keywords
Next generation sequencing, RNA-seq, DNA-seq, Risk assessment, Biomarkers, High throughput transcriptomics, Liquid biopsy, Transcriptomics.

Introduction
The Sanger sequencing method was developed in the late 1970s to analyze DNA sequence using 32P-labeled nucleotides separated by polyacrylamide gels for autoradiograms [1]. Radionuclides were replaced by fluorescently labeled nucleotides and capillary electrophoresis, which gave rise to automated sequencing instruments based on Sanger chemistries, such as the Applied Biosystems, Inc. model 370A sequencers and others [2]. Read length per sample was 500–800 nucleotides, and sample throughput was limited.

Sanger-based DNA sequencing instruments are considered first-generation platforms. Instruments that perform multiple sequencing reactions simultaneously in a 'massively parallel' fashion have been dubbed 'NextGeneration' or 'NextGen' sequencing [3]. How NGS came about compared with other genomic platforms is interlinked with microarray technology (Figure 1). Both platforms can provide whole genomic approaches to research problems. Microarrays are a fluorescent probe hybridization-based technology with origins in the mid-1990s and are now a mature genomic platform with a well-established data analysis pipeline. Downsides of microarrays are that a priori genomic knowledge is needed to generate probes, that the probes are species-specific, and that the dynamic range for differential expression is limited.

Development of the second wave of sequencing technologies, termed NextGen sequencing (NGS) technologies, has overlapped with microarray platforms (Figure 1). NGS began in the new millennium, as exemplified by the massively parallel signature sequencing (MPSS) system that came from university research. Improvements in sequencing chemistries, detection, and automation over the next decade promoted a rapid development of NGS platforms. From 2010 to 2015, many NGS instruments became commercially available that could produce millions of reads from 100 to 1000 bases in length. A read is a short piece of sequence (e.g. 100 nucleotides) that can be aligned to a transcript, and it also serves as a quantitative measure of a transcript when summed up with other aligning reads. These second-generation sequencers include the Roche '454 FLX,' Life Technologies 'Ion Torrent,' Applied Biosystems, Inc. 'SOLiD', and the Illumina family of sequencers including the HiSeq 2000 series, MiSeq, X-Ten, and NovaSeq [4]. Further advances in NGS technology, such as single-molecule real-time sequencing, have led to longer read sequencers such as the Pacific Biosciences 'PacBio RS II' instrument that produces reads greater than 10,000 bases [3].

Figure 1. Timeline for development of microarray and next-generation sequencing (NGS) technology platforms. Microarray developments are above the timeline and NGS activities are below. For microarray development, Brown's laboratory at Stanford was one of the first to develop a multigene expression measurement system using fluorescent detection. The term toxicogenomics (Tgmx) was first defined by Nuwaysir et al. in 1999 [67]. Commercialized platforms such as Affymetrix, Agilent, and NimbleGen matured through 2010. For NGS, massively parallel signature sequencing (MPSS) was developed in 2000 by the Brenner lab at Lynx Therapeutics. 454 Life Sciences developed a massively parallel pyrosequencing method in 2006, followed by a commercial instrument put out by Roche. The Solexa short-read platform was acquired by Illumina in 2008 and has undergone continued development and improvement.
Commercialization of NGS platforms continues with various speeds of analysis, read lengths, and sequencing capacities. More recent developments include BRB-seq or 'Bulk RNA Barcoding and Sequencing' and TempO-Seq by BioSpyder as a library of bar-coded probes that hybridize to representative gene transcripts as a targeted NGS approach to transcript expression. Advantages and disadvantages are summarized and discussed further in the text.

A more recent sequencing technology has been advanced by Oxford Nanopore Technologies. Nanopore instruments read bases directly from single DNA or RNA molecules through a biological nanopore channel, a nanoscale biological tube that sequences by sensing changes in ionic current as the nucleic acid molecule passes through [5]. The sequencing devices can provide rapid analysis (hours), and some units are portable (the size of a USB flash drive) and can be readily used in teaching laboratories, medical offices, and field work. Read lengths can reach tens to hundreds of kb. A primary advantage of long read length is to reduce the ambiguity of highly homologous genes, splice variants, and repetitive regions in the genome, where alignment is inherently more difficult using short reads. The high sequence resolution of NGS instruments has come at the expense of relatively low sample throughput. This issue has been addressed by creating libraries of targeted probe sets that analyze the complete transcriptome (e.g. TempO-Seq [6] and bulk RNA barcoding and sequencing [BRB-seq], reviewed later), rapidly and at relatively low cost. A brief depiction of NGS applications is shown in Figure 2.

Figure 2. Many applications for measuring RNA and DNA in toxicogenomics are supported by NGS platforms. Whole genome or transcriptome analysis or targeted portions of each can be measured by NGS. seq, sequencing; WG, whole genome; ATAC, assay for transposase-accessible chromatin; ChIP, chromatin immunoprecipitation; Ribo, ribosome; miRNA, microRNA. RNA analysis platforms are indicated by blue arrows, and DNA analysis platforms are shown by red arrows. Further description of these applications is provided in the text.

NGS and risk assessment
Traditional risk assessment often involves identifying hazard(s) in a dose–response manner after chemical or test article exposure in animal models, or from human data if available [7]. Test articles can include chemicals and many other agents, including pharmaceuticals, drugs, natural products, particles such as asbestos, nanoparticles, physical factors such as radiation, metals, and many others. The type of toxicity or hazard can be widely defined as macroscopic or microscopic lesions and pathologies, altered pharmacologic, immunologic, functional, and behavioral reactions, changes in biochemistry and physiology, or any measurable response that is considered adverse or outside of normal health. Study of the underlying molecular changes contributing to toxicity has been greatly facilitated by omics technologies, particularly transcriptomics, while standardization of data analysis and interpretation continues to be refined [8]. There are many in vitro assays and screens (e.g. anticholinesterase activity or bacterial mutagenesis) that support the mode of action in the risk assessment process, but new initiatives such as Tox21 aim to develop new assays incorporating NGS platforms for a larger role in risk determination [9].

The dynamic nature of gene expression (transcriptomics) in response to a chemical or test article exposure makes it well suited as part of the hazard identification and dose-setting process for risk assessment [10,11]. There are approximately 15,000 coding genes and probably an equal number of noncoding genes expressed at any one time in a specific cell type. Splice variants also add more complexity to response. Of those the transcriptome