Efficient Approximation of DNA Hybridisation Using Deep Learning

Total Page:16

File Type:pdf, Size:1020Kb

Efficient Approximation of DNA Hybridisation Using Deep Learning Efficient approximation of DNA hybridisation using deep learning David Buterez? Department of Computer Science and Technology University of Cambridge, Cambridge, UK Abstract. Deoxyribonucleic acid (DNA) has shown great promise in enabling computational applications, most notably in the fields of DNA data storage and DNA computing. The former exploits the natural prop- erties of DNA, such as high storage density and longevity, for the archival of digital information, while the latter aims to use the interactivity of DNA to encode computations. Recently, the two paradigms were jointly used to formulate the near-data processing concept for DNA databases, where the computations are performed directly on the stored data. The fundamental, low-level operation that DNA naturally pos- sesses is that of hybridisation, also called annealing, of complementary sequences. Information is encoded as DNA strands, which will naturally bind in solution, thus enabling search and pattern-matching capa- bilities. Being able to control and predict the process of hybridisation is crucial for the ambitious future of the so-called Hybrid Molecular-Electronic Computing. Current tools are, however, limited in terms of throughput and applicability to large-scale problems. In this work, we present the first comprehensive study of machine learning methods applied to the task of predicting DNA hybridisation. For this purpose, we introduce a synthetic hybridisation dataset of over 2.5 million data points, enabling the use of a wide range of machine learning algorithms, including the latest in deep learning. Depending on the hardware, the proposed models provide a reduction in inference time ranging from one to over two orders of magnitude compared to the state-of-the-art, while retaining high fidelity. We then discuss the integration of our methods in modern, scalable workflows. The implementation is available at: https://github.com/davidbuterez/dna-hyb-deep-learning Keywords: DNA hybridisation · Annealing · DNA storage · DNA computing · Deep learning Introduction and motivation In this introductory section we place our work in context, by discussing the innovations and chal- lenges brought by digital data processing in DNA. We emphasise the importance of DNA hybridisa- tion, the mechanism enabling these technologies, and how this study delivers scalable, future-ready workflows. DNA storage The use of DNA to facilitate computation is an active area of research, dating back to 1994 when arXiv:2102.10131v1 [cs.LG] 19 Feb 2021 Leonard Adleman solved a seven-node instance of the Hamiltonian path problem, using the tool- box of (DNA) molecular biology [1]. The algorithm uses DNA sequences to represent nodes and edges in the graph and relies on hybridisation to detect paths which visit all nodes once. The resulting DNA sequences (algorithm output) can be read using polymerase chain reaction (PCR). This pioneering work opened the gate to many interesting questions: can molecular machines be used to solve intractable problems? Is DNA suitable for long-term storage of digital information? More recently, are such methods scalable in the era of Big Data? The focus has gradually changed from solving difficult computational problems to exploiting other ? Work conceptualised and partially executed while at Imperial College London. 2 D. Buterez desirable properties of DNA, leading to the development of DNA digital data storage. The field has seen rapid advances in recent years [22], [7], [29], [39], [10], [27], [5], [11], [30] and industry-ready solutions are currently in development. It is now generally accepted that the amount of digital data is doubling at least every two years. Predictions from Seagate [35] estimate that this quan- tity, called the Global Datasphere, will grow from 33 ZB (Zettabytes) in 2018 to 175 ZB by 2025. Furthermore, the dominant storage mediums are “traditional”, with 59% of the storage capacity expected to come from hard disk drives and 26% from flash technologies. Synthetic DNA has been argued to be an attractive storage medium, for at least three prominent reasons [9]: (a) Density and lifetime comparison with (b) Reading and writing DNA against mainstream storage media [9] evolution of transistors [8] Fig. 1: DNA storage compared to traditional systems. 3 1. Density – The theoretical maximum information density of DNA is 1018 B/mm . Compara- tively, this is a 7 orders of magnitude increase over tape storage (Figure 1a). 2. Durability – Depending on storage conditions, DNA can be preserved for at least a few hundred years. Fossil studies reveal that DNA has a half-life of 521 years [3]. In appropriate conditions, researchers estimate that digital information can be recovered from DNA stored at the Global Seed Vault (at -18°C) after over 1 million years [22]. 3. Future-proofing – Next-generation sequencing and technologies such as Oxford Nanopore have made reading and writing DNA more accessible than ever. There has been exponential progress in manipulating DNA, surpassing even Moore’s Law (Figure 1b). Furthermore, since DNA is the fundamental building block of life, it will be relevant for as long as humans exist. Near-data processing Clearly, DNA storage and computing have the potential to bring breakthroughs in the way we process and store data. The objective of this work is to develop methods that make near-data processing applications viable at a large scale. The near-data processing philosophy is to bring the computing substrate closer to the storage substrate, concretely by encoding meaningful interactions using DNA hybridisation. Database query operations on DNA In the OligoArchive project [4], the authors present a processing technique capable of performing SQL operations such as selection, projection and join directly on the DNA molecules. More specif- ically, arbitrary tables can be encoded in DNA by designing a strand that represents the attribute value prepended by hashed values of the table, attribute name and primary key, supplemented by error correcting codes. The selection database operation can then be implemented by PCR, as it is sensitive enough to allow single sequences of DNA to be selected in a background of billions of Efficient approximation of DNA hybridisation using deep learning 3 irrelevant molecules. This enables searching by attribute value. More interestingly, the join operator makes direct use of hybridisation. Intuitively, if the attributes of two records match, then a duplex product will form. This implements the equi-join operation. The encoding of the attributes should be orthogonal for different values, to avoid unintended operations. However, the case of an imperfect match is still interesting for fuzzy matching and similarity search. The scale of the experiments in [4] is small: a PostgreSQL database of 12KB was successfully encoded and read back. Future projects will clearly benefit from a larger, high-quality sequence design tool. DNA encodings for similarity search The objective of similarity search is to retrieve documents from a database that are similar in terms of content to a search query. The state-of-the-art solution is to encode the items into feature vectors that can be compared with metrics like Euclidean distance (similar documents are close in the feature space). With rich data (e.g. images) the feature vectors tend to have hundreds of dimensions, hindering even the fastest algorithms, in the so-called “curse of dimensionality”. The near-data processing paradigm can be applied to the similarity search problem: the database stores information in DNA molecules and the similarity search is performed by means of DNA hybridisation. The work of Stewart et al. [38] proposes a novel encoding for information. Each database element is associated with a unique ID and a feature vector. Fig. 2: Strand designs in Stewart et al. [38] The database itself does not store the data, but the ID can be used to retrieve the information from another location. The ID and feature sequences are placed on the same DNA strand. In Figure2, FP, IP and RP are sequencing primers (forward, internal and reverse, respectively) and coloured in blue. d(T ) denotes the ID sequence, and f(T ) is the feature sequence, in orange colour. Similarly, the features of the query strand are encoded in the f(Q) region. This is designed to match (im- perfectly) with f(T ), so its reverse complement, denoted by f(Q)∗, is used. To help the (unstable) hybridisation, a reverse complement of the first six bases of the reverse primer RP is added to the query strand: RP [: 6]∗. With this design, similar targets and queries will bind together. They can then be amplified by PCR (using FP and RP) and sequenced. The other novel contribution of [38] is training a neural network that learns to associate a query with a target if and only if the query and the target represent similar images. Note that the network is trained to find appropriate encodings for pairs of images based on their semantic features, i.e. it does not predict hybridisation probabilities. Rather, it uses an approximation for thermodynamic analysis that is differentiable (necessary for backpropagation) in the form of a modified sigmoid function. Unfortunately, the estimation is far from perfect. The authors remark that an important feature direction is a more accurate approximation for thermodynamic yield. To summarise, we identify two applications for such thermodynamic prediction techniques: • Orthogonal datasets – Traditionally, orthogonal DNA sequences have been used to build DNA barcodes, unique sequences that are used for identification. More recently, orthogonal 4 D. Buterez sequences are required to enable random-access in DNA databases through PCR. More specif- ically, each piece of information stored in DNA molecules requires a short, unique identifica- tion sequence. Uniqueness is crucial: one of the goals is to avoid cross-talk. Naturally, larger databases require a higher number of orthogonal sequences. • Similar datasets – The concept of near-data processing introduced the need for similar sets of sequences. The idea is to store closely-related entities (by some distance metric) in similar DNA molecules, which are part of the database.
Recommended publications
  • DNA: the Future Storage Device
    Special Issue - 2017 International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 ICADEMS - 2017 Conference Proceedings DNA: The Future Storage Device Sakshi Jha1, Mr. Mahesh Kumar Malkani2 1Ganga Institute of Technology and Management, 2Department of Computer Science Engineering and CFIS, MDU, Rohtak GITAM, MDU, Rohtak Data Storage has been a great concern since the early ages. than current magnetic tape or hard drive due to its huge With the advent of time, data has been expanding and new density .Digital data universe has been growing storage devices have been satisfying the demand. Demand for exponentially. Trends in data storage are setting new data storage is growing on a faster pace .In order to meet this milestones. From floppy disk to tape drives and tape drives growing need for storage device, a shift from electronic media to cloud storage have been the new developments in the to molecular DNA has been made. DNA molecule is universal and fundamental data storage unit in biology. 1gram of DNA field of storage devices. Most of the world’s data is stored molecule can store 1 Zettabyte of data. A new encoding on magnetic tapes and optical disk drives. Despite this scheme that offers controllable redundancy, trading off improvement, storing Zetta bytes of data would still take reliability for density has also been developed. Finally, we millions of years and space. With the increasing technology highlight trends in biotechnology that indicate the impending and tech-literacy, there is a need for revolution in the arena practicality of DNA storage for much larger datasets. of archiving.
    [Show full text]
  • Downloaded” to a Computer Than to Answer Questions About Emotions, Which Will Organize Their World
    Between an Animal and a Machine MODERNITY IN QUESTION STUDIES IN PHILOSOPHY AND HISTORY OF IDEAS Edited by Małgorzata Kowalska VOLUME 10 Paweł Majewski Between an Animal and a Machine Stanisław Lem’s Technological Utopia Translation from Polish by Olga Kaczmarek Bibliographic Information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the internet at http://dnb.d-nb.de. Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. The Publication is founded by Ministry of Science and Higher Education of the Republic of Poland as a part of the National Programme for the Development of the Humanities. This publication reflects the views only of the authors, and the Ministry cannot be held responsible for any use which may be made of the information contained therein. ISSN 2193-3421 E-ISBN 978-3-653-06830-6 (E-PDF) E-ISBN 978-3-631-71024-1 (EPUB) E-ISBN 978-3-631-71025-8 (MOBI) DOI 10.3726/978-3-653-06830-6 Open Access: This work is licensed under a Creative Commons Attribution Non Commercial No Derivatives 4.0 unported license. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/ © Paweł Majewski, 2018 . Peter Lang – Berlin · Bern · Bruxelles · New York · Oxford · Warszawa · Wien This publication has been peer reviewed. www.peterlang.com Contents Introduction ........................................................................................................ 9 Lemology Pure and Applied ............................................................................. 9 Part One Dialogues – Cybernetics as an Anthropology ........................................
    [Show full text]
  • Diagnostic Strategies for COVID-19 and Other Coronaviruses Medical Virology: from Pathogenesis to Disease Control
    Medical Virology: From Pathogenesis to Disease Control Series Editor: Shailendra K. Saxena Pranjal Chandra Sharmili Roy Editors Diagnostic Strategies for COVID-19 and other Coronaviruses Medical Virology: From Pathogenesis to Disease Control Series Editor Shailendra K. Saxena, Centre for Advanced Research, King George’s Medical University, Lucknow, India This book series reviews the recent advancement in the field of medical virology including molecular epidemiology, diagnostics and therapeutic strategies for various viral infections. The individual books in this series provide a comprehensive over- view of infectious diseases that are caused by emerging and re-emerging viruses including their mode of infections, immunopathology, diagnosis, treatment, epide- miology, and etiology. It also discusses the clinical recommendations in the man- agement of infectious diseases focusing on the current practices, recent advances in diagnostic approaches and therapeutic strategies. The books also discuss progress and challenges in the development of viral vaccines and discuss the application of viruses in the translational research and human healthcare. More information about this series at http://www.springer.com/series/16573 Pranjal Chandra • Sharmili Roy Editors Diagnostic Strategies for COVID-19 and other Coronaviruses Editors Pranjal Chandra Sharmili Roy School of Biochemical Engineering Division of Oncology Indian Institute of Technology Stanford Medicine Varanasi, India California, CA, USA ISSN 2662-981X ISSN 2662-9828 (electronic) Medical Virology: From Pathogenesis to Disease Control ISBN 978-981-15-6005-7 ISBN 978-981-15-6006-4 (eBook) https://doi.org/10.1007/978-981-15-6006-4 # The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
    [Show full text]
  • Coding of Still Pictures
    ISO/IEC JTC 1/SC29/WG1N91016 91st Meeting, Online, 19-23 April 2021 ISO/IEC JTC 1/SC 29/WG 1 (ITU-T SG16) Coding of Still Pictures JBIG JPEG Joint Bi-level Image Joint Photographic Experts Group Experts Group TITLE: DNA-based Media Storage: State-of-the-Art, Challenges, Use Cases and Requirements version 4.0 SOURCE: Contributors: Marc Antonini, Luis Cruz, Eduardo da Silva, Melpomeni Dimopoulou, Touradj Ebrahimi (editor), Siegfried Foessel, Gloria Menegaz, Fernando Pereira (editor), António Pinheiro, Mohamad Raad PROJECT: JPEG DNA Exploration STATUS: Final REQUESTED ACTION: For information and feedback DISTRIBUTION: Public Contact: ISO/IEC JTC 1/SC 29/WG 1 Convener – Prof. Touradj Ebrahimi EPFL/STI/IEL/GR-EB, Station 11, CH-1015 Lausanne, Switzerland Tel: +41 21 693 2606, Fax: +41 21 693 7600, E-mail: [email protected] - 1 - ISO/IEC JTC 1/SC29/WG1N91016 91st Meeting, Online, 19-23 April 2021 DNA-based Media Storage State-of-the-Art, Challenges, Use Cases and Requirements Version 4.0 Executive Summary DNA is a macromolecule which is essential for any form of life and is made of simple units that line up in a particular order within this large molecule. The order of these units usually carries genetic information for a specific life organism, similar to how the order of letters in a text carries information. In practice, this means that DNA molecules may be artificially created with specific unit orders, notably to store some relevant sequence of information. In digital media information, notably images, the relevant representation symbols, e.g. quantized DCT coefficients, are expressed in bits (binary units) but they could be expressed in any other units, for example the DNA units which follow a quaternary (4-ary) representation basis.
    [Show full text]
  • Nucleic Acid/Inorganic Particle Hybrid Systems from Design to Application
    Research Collection Doctoral Thesis Nucleic acid/inorganic particle hybrid systems from design to application Author(s): Puddu, Michela Publication Date: 2015 Permanent Link: https://doi.org/10.3929/ethz-a-010609177 Rights / License: In Copyright - Non-Commercial Use Permitted This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use. ETH Library DISS. ETH No. 23177 NUCLEIC ACID/INORGANIC PARTICLE HYBRID SYSTEMS: FROM DESIGN TO APPLICATION A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich) presented by MICHELA PUDDU MSc Materials Science born on 02.03.1987 citizen of Italy accepted on the recommendation of Prof. Dr. Wendelin J. Stark, examiner Prof. Dr. János Vörös, co-examiner Dr. Robert N. Grass, co-examiner 2015 2 3 To my parents “Considerate la vostra semenza: fatti non foste a viver come bruti, ma per seguir virtute e canoscenza." Dante Alighieri, Divina Commedia, Inferno, canto XXVI, vv. 18-20. 4 5 Acknowledgments The PhD has been a wonderful and exciting experience, and I want to take a moment to thank the many people who took part in this journey. Foremost, I would like to express my gratitude to Prof. Wendelin Stark for giving me the opportunity to do research and develop a mature scientific approach in a highly stimulating and innovative environment. I am indebted to him for providing me with entrepreneurial spirit and vision that influenced my thinking and career choice. I also thank him for the precious lesson that a balance between research interests and personal pursuits does not keep away success, but it actually brings one step closer to it.
    [Show full text]
  • 160 Удк 004.383.8 Миколюк Ю. – Ст. Гр. Сп-21 Тернопільський
    ІX Всеукраїнська студентська науково - технічна конференція "ПРИРОДНИЧІ ТА ГУМАНІТАРНІ НАУКИ. АКТУАЛЬНІ ПИТАННЯ" УДК 004.383.8 Миколюк Ю. – ст. гр. СП-21 Тернопільський національний технічний університет імені Івана Пулюя БІОКОМП’ЮТЕРИ Науковий керівник:Боднар О.І. Mykolyuk Y. Ternopil Ivan Pul’uj National Technical University BIOCOMPUTERS Supervisor: Bodnar O Ключові слова: ДНК, нанобіотехнології, інтегровані схеми, бімолекулярні системи Key words: DNA, nanobiotechnologies, integrated circuits, boimolecular systems Biocomputers use systems of biologically derived molecules – such as DNA and proteins – to perform computational calculations involving storing, retrieving, and processing data. The development of biocomputers has been made possible by the expanding new science of nanobiotechnology. The term nanobiotechnology can be defined in multiple ways; in a more general sense, nanobiotechnology can be defined as any type of technology that uses both nano-scale materials (i.e. materials having characteristic dimensions of 1- 100nanometers) and biologically based materials. A more restrictive definition views nanobiotechnology more specifically as the design and engineering of proteins that can then be assembled into larger, functional structures. The implementation of nanobiotechnology, as defined in this narrower sense, provides scientists with the ability to engineer biomolecular systems specifically so that they interact in a fashion that can ultimately result in the computational functionality of a computer. DNA computing is a branch of computing which uses DNA, biochemistry, and molecular biology hardware, instead of the traditional silicon-based computer technologies. Research and development in this area concerns theory, experiments, and applications of DNA computing. The term "molectronics" has sometimes been used, but this term had already been used for an earlier technology, a then-unsuccessful rival of the first integrated circuits; this term has also been used more generally, for molecular-scale electronic technology.
    [Show full text]
  • DNA-Based Cryptography: Motivation, Progress, Challenges, and Future 1 A.E
    JOURNAL OF SOFTWARE ENGINEERING & INTELLIGENT SYSTEMS ISSN 2518-8739 30th April 2018, Volume 3, Issue 1, JSEIS, CAOMEI Copyright © 2016-2018 www.jseis.org DNA-based cryptography: motivation, progress, challenges, and future 1 A.E. El-Moursy, 2 Mohammed Elmogy, 3 Ahmad Atwan 1,2,3Information Technology Dept., Faculty of Computers and Information, Mansoura University, Egypt Email: [email protected], [email protected], [email protected] ABSTRACT Cryptography is about constructing protocols by which different security means are being added to our precious information to block adversaries. Properties of DNA are appointed for different sciences and cryptographic purposes. Biological complexity and computing difficulties provide twofold security safeguards and make it difficult to penetrate. Thus, a development in cryptography is needed not to negate the tradition but to make it applicable to new technologies. In this paper, we review the most significant research, which is achieved in the DNA cryptography area. We analysed and discussed its achievements, limitations, and suggestions. In addition, some suggested modifications can be made to bypass some detected inadequacies of these mechanisms to increase their robustness. Biological characteristics and current cryptography mechanisms limitations were discussed as motivations for heading DNA-based cryptography direction. Keywords: DNA; cryptography; encryption; DNA computing; bio-inspired cryptography; 1. INTRODUCTION Technological development seizes our valuable information including financial transactions are transmitted back and forth in public communication channels, posing a considerably high challenge in confronting with unintended intruders. One suggestion is cryptography that is about constructing protocols built strong mathematically and theoretically by which different security means are being added to such precious information.
    [Show full text]
  • Singapore Fintech Festival Conference 2019 Day 2 Key Highlights
    Singapore FinTech Festival Conference 2019 Day 2 key highlights 12 November 2019 01 Singapore FinTech Festival Conference 2019 | Day 2 key highlights Contents Plenary Stage The US Tech Giants Agenda: Conversation with Intel, Microsoft & Nasdaq Macro Vulnerabilities vs Technology Opportunities Sustaining Financial Services - Why Tech Is Key From carving in stone to writing in DNA: History and Future of Data Storage by Kees Immink, President, Turing Machines Inc. Defining the Future of Digital Currency Sustainability, Finance and Tech Stage - Powered by Deloitte: Sustainable Finance Eyeing humanity’s impact on Earth through satellite imagery Fireside Chat: SDGs: The blueprint for a sustainable future Sustainable finance: The trillion-dollar opportunity New deal for finance: Natural capital for resilient portfolios FinTechs for good: Combating climate change with technology Sustainability is good for business This document is developed by the Monetary Authority of Singapore (“MAS”) in collaboration with Deloitte Southeast Asia Ltd (“Deloitte”). It directly reports on and summarises the topics presented and discussed at the FinTech Conference as part of the Singapore FinTech Festival 2019. The contents within this document by no means reflect views and opinions from Deloitte or MAS. Please contact Deloitte, MAS or other appropriate organisations and agencies should you wish to obtain expert opinions on what has been reported in this document. © 2019 Deloitte Southeast Asia Ltd 2 Singapore FinTech Festival Conference 2019 | Day 2 key highlights Future of Finance Stage- Powered by Prudential: Payments Payments are Dead; Long Live Payments. Opening by Joop Wijn, Chief Strategy & Risk Officer, Adyen Panel Discussion Breaking the Rules; and Rewriting them Fast Forward Payments.
    [Show full text]
  • OPCODE News.Cdr
    www.apsit.edu.in volume #1 issue #1 2018 inline with trends Parshvanath Charitable Trust’s A. P. SHAH INSTITUTE OF TECHNOLOGY, THANE (Affiliated to Mumbai University, Approved by AICTE, DTE and Govt of Maharashtra) DEPARTMENT OF COMPUTER ENGINEERING PERFORMANCE INNOVATION EVENTS ACHIEVEMENTS In its uncompromising committment to support innovation and research by building instituitional infrastructure, the department is creating a legacy for the future. Opcode aims to create a bond of awareness and inspiration amongst students to participate in all opportunities available. Innovations ` From the Principal's Desk Computing software and systems are becoming increasingly integral to our lives, revolutionizing virtually everything from the way we live, work and communicate. Computer Engineering is one of the most ourishing disciplines in recent times and is now also a key enabler for discovery and innovation in most other elds, making it an incredibly relevant course of study. With increased demand among recruiters, this domain has become the preferred line of study among students as well. I whole heartedly welcome and congratulate you for choosing Computer Engineering Department in A.P. Shah Institute of Technology for continuing your educational journey. The department was launched keeping in view the dynamic nature of the IT industry and the ever increasing demand for quality and well- trained IT professionals. Right from its inception in 2014, the department has been offering excellent infrastructural facilities with a variety of computing platforms to motivate students to meet the burgeoning demands of IT industry. The teaching processes are executed by highly qualied and experienced teachers who make good use of smart classroom facilities and Moodle software, thereby ensuring deep understanding of concepts through demonstrations and hands-on practice sessions executed under their keen supervision.
    [Show full text]
  • DNA Computers
    Volume IV, Issue XII, December 2015 IJLTEMAS ISSN 2278 – 2540 DNA Computers Ani Mehrotra Nirma University, India Abstract – The paper discusses the concept of DNA computing, its Although DNA computing is in its infancy, and its advantages over silicon-based computing and the future it holds implications are only beginning to be explored, they have for the mankind. It showcases the limitations of present already been harnessed to perform complex mathematical technology and the promises DNA computing hold for the future. problems.[9] The paper looks on the future we shall be living in and how this new technology can sweep away all the notions of how we II. WHAT IT IS? perceive and live in our world. Keywords – DNA, nucleotides, parallel computing, DNA bases, In its simplest sense, a computer is just a machine capable Hamiltonian Path Problem, Transcriptor, nucleic acids. of performing computations. It doesn't have to be electronic. Tom Ran, of the Weizmann Institute of Science in Israel, I. INTRODUCTION works with computers made out of strands of DNA."Working this way, we can get three trillion computers, working in omputer chip manufacturers are furiously racing to make parallel, in a space the size of a water droplet," he says. The 0s C the next microprocessor that will topple speed records. and 1s of conventional computers are replaced with the four Sooner or later, though, this competition is bound to hit a DNA bases: A, C, G and T. Operations can be translated into wall. Microprocessors made of silicon will eventually reach strands of DNA using these bases, and the way the DNA their limits of speed and miniaturization.
    [Show full text]
  • Deoxyribonucleic Acid (DNA): the Magician in the Field of the Digital Data Storage
    Research and Reviews on Healthcare: Open Access Journal DOI: ISSN: 2637-6679 10.32474/RRHOAJ.2019.03.000163Review Article Deoxyribonucleic Acid (DNA): The Magician in the Field of the Digital Data Storage Lichun Sun1,2*, Jun He2, Jing Luo3 and David H Coy1 1Department of Medicine, Tulane University Health Sciences Center, New Orleans, USA 2Sino US Innovative Bio-Medical Center and Hunan Beautide Pharmaceuticals, China 3Department of Health Information and Technology, EXCELth Primary Health Care, USA *Corresponding author: Lichun Sun, Department of Medicine, School of Medicine, Tulane University Health Sciences Center, Sino US Innovative Bio Medical Center and Hunan Beautide Pharmaceuticals, Hunan, China Received: May 20, 2019 Published: May 29, 2019 Introduction world. These digital data are stored on the silicon-based chips with It is an explosive era of the big digital data with an exponential the binary numeric system that uses only two digital numbers, or 0 growth rate. The data produced in the couple years between 2015 and 2017 are more than all the preceded data produced in the entire human history. Nowadays, over 40 exabytes (EBs) of new and 1[5]. As reported, the microchip-grade silicon is rare in nature data per day are produced all over the world. There were totally 33 and will be run out in 2040 [6]. A new media or a new technology zettabytes (ZBs, or 33000 EBs) of digital data worldwide in 2018. is needed to fix the coming data storage crisis. Due to its unique magic digital data storage medium to eventually solve data storage It is estimated that there will be almost 175 zettabytes (ZBs) of advantages [7], deoxyribonucleic acid (DNA) is expected to be the problem that have long plagued scientists worldwide.
    [Show full text]
  • DNA Data Storage
    Journal for Research | Volume 03| Issue 11 | January 2018 ISSN: 2395-7549 DNA Data Storage S. V. Ramanamurthy P. L. Kiranmayi Professor, SM-IEEE Head of Department Department of Computer Science & Engineering Department of Computer Science & Engineering Pragati Engineering College, Surampalem, India Pragati Engineering College, Surampalem, India S. D. Sarayu Md. Iftekharuddin PG Student Head of Department Department of Computer Science & Engineering Department of Computer Science & Engineering Pragati Engineering College, Surampalem, India Pragati Engineering College, Surampalem, India Abstract Human Beings have always tried to simplify the way of storing data maintaining both security and speed of access. This decade (2011-2020) is focusing on improving data storage devices. New technologies like SSDs (Solid State Drives), technical upgrades in SATA or IDE HDDs (Hard Disk Drives), etc with Terra Bytes of storage capabilities have come to light in recent past. However, DNA Data Storage technology is the next generation of storage technique, which has a lots of storage capability. DNA Data Stor-age will reinvent the way of storing data. This paper discusses about this storage mechanism and emphasizes on the on-going re-search in this field. Keywords: Data Storage, DNA, SDDs, HDDs, Genes _______________________________________________________________________________________________________ I. INTRODUCTION Todays ‘Digital universe’ forecasts that total data in 2017 is 16 zettabytes (1021). Companies are building Data Centers in thousands of acres of land areas. Facebook recently built a data center dedicated to one Exabyte of data storage. Similarly, many organizations and universities started working on data storage solutions for better storage duration, less cost, and less space occupying. DNA Storage Technology was first demonstrated in the 1980s and it became the largest project by 2010.
    [Show full text]