Special course in Computer Science: Molecular Computing

Lecture 1: Introduction

Vladimir Rogojin, Department of CS, Abo Akademi

http://combio.abo.fi/teaching/special-course-in-computer-science-molecular-computing/

Fall 2015

Space that is computing

➢ Universe as information and information processing

➢ World as a computer or being computed

➢ Konrad Zuse, 1969 – the Universe is being computed by cellular automata or other discrete computing machinery

➢ Digital physics – theories based on the premise that “the Universe is, at heart, describable by information, and is therefore computable”
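As a minimal illustration of the kind of discrete machinery mentioned above, the following sketch steps a one-dimensional, two-state cellular automaton. The choice of rule 110 (an elementary automaton known to be Turing universal), the ring-shaped boundary and the function names `step` and `run` are assumptions made for this example; they are not taken from Zuse's proposal.

```python
def step(cells, rule=110):
    """Apply one synchronous update of an elementary cellular automaton.

    cells: list of 0/1 values arranged on a ring (periodic boundary, an assumption).
    rule:  Wolfram rule number (0-255). Bit k of the rule is the new state for
           the neighbourhood whose (left, centre, right) bits spell k in binary.
    """
    n = len(cells)
    new = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        k = (left << 2) | (centre << 1) | right
        new.append((rule >> k) & 1)
    return new


def run(cells, steps, rule=110):
    """Return the list of configurations, starting with the initial one."""
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    start = [0] * 15 + [1]  # a single live cell at the right edge
    for row in run(start, 8):
        print("".join("#" if c else "." for c in row))
```

Every cell is updated from purely local information, yet the global pattern that emerges can be arbitrarily complex, which is the intuition behind "the Universe as a cellular automaton".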

3D Virtual Creature Evolution

➢ Artificial evolution simulation by Lee Graham

➢ In an artificially simulated environment

➢ Artificial organisms are generated

➢ Purpose: visualize and study body shapes and strategies that achieve a given fitness function

3D Virtual Creature Evolution

➢ Evolution simulation: artificial organisms evolve to achieve the highest fitness

➢ Fitness criteria: body size, maximum height, average height, contact with ground, catching flying spheres, etc.

➢ Artificial environment: landscape, gravity, water pools, etc.

➢ Artificial organisms: consist of blocks, joints and motors; can reproduce sexually or asexually; can mutate

➢ 220 artificial species have been reported so far

Synthetic bacteria

➢ Can one

➢ design an artificial genome on a computer,

➢ synthesize the respective DNA sequences in a wet lab, and

➢ grow living organisms based purely on this artificial genome?

➢ THE ANSWER: YES, WE CAN!

Mycoplasma laboratorium

➢ Minimal Genome Project – find minimal set of genes able to sustain life

➢ J. Craig Venter Institute (JCVI) – non-profit genomics research institute founded by J. Craig Venter

➢ Experiments with M. genitalium – reduced to 382 genes

➢ Artificial genome with 382 genes – Mycoplasma laboratorium

➢ Plan – generate a synthetic genome of M. laboratorium and inject it into a suitable cell, in order to use that cell's translation and replication machinery and its environment

Synthia

➢ Synthesized the M. mycoides genome of 1,078,809 bp from scratch, starting from a computer record

➢ Transplanted the synthetic genome into a DNA-free M. capricolum cell

➢ The new genome took over, the new organism multiplied

➢ Craig Venter: “the first species... to have its parents be a computer”

➢ Technologies: DNA sequencing, long DNA synthesis, genome transplantation

Self-assembled nanostructures

➢ Programmable matter - “any bulk substance which can be programmed to change its physical properties”

➢ Principle: coupling computation to its material properties

➢ Goal: creating nano-scale stable structures, like:

➢ Crystal lattices, nanotubes, arbitrary shapes

➢ Functional: molecular machines, and DNA

➢ Implementations:

➢ DNA walkers – transport nanoparticles and direct chemical synthesis

➢ Molecular wires – molecular-scale electronics

➢ Smart drugs – targeted drug delivery

Nanobots

➢ Molecule-size robots

➢ Nanomedicine:

➢ Targeted drug delivery (detect and kill cancer cells)

➢ Surgery,

➢ Monitoring of diabetes,

➢ Biomedical instrumentation,

➢ Etc.

➢ Design issues:

➢ Sensing, power, communication, navigation, manipulation, locomotion, onboard computation

Molecular machines

➢ Molecular car:

➢ Molecule-sized “4-wheel” devices capable of moving (rolling) on a surface:

➢ Engine-less car:

➢ Non-controllable, undirected movement on hot metallic surfaces

➢ Electric-driven car:

➢ Electrically powered wheels drive the car in the desired direction

➢ Motor nanocar:

➢ Nanocar with a synthetic molecular motor

Molecular-scale electronics

➢ Branch of nanotechnology

➢ Single molecules as electronic components:

➢ Wires

➢ Rectifiers

➢ Contrary to conventional electronics:

➢ Bottom-up approach rather than top-down approach

➢ An integrated circuit is self-assembled from properly designed molecules

➢ Problems with traditional bulk approach: precision limitations

Transcriptors

➢ Analogy:

➢ Transistor: semiconducting material-based logic device

➢ Transcriptor: DNA/RNA/enzyme-based logic device

➢ A computer needs to:

➢ Store information

➢ Transmit information

➢ Perform logic operations

➢ In biochemistry, all three functions have finally been implemented

➢ The invention of the biological counterpart of a transistor, the transcriptor, was announced in March 2013 at Stanford University

BIL gates

➢ Transcriptor:

➢ Device composed of a complex of biological materials: DNA/RNA/proteins

➢ Three-terminal device with a logic control system

➢ On the physical level the device controls the flow of RNA polymerase across a strand of DNA

➢ Traditional AND, OR, NOR, NAND, XOR, XNOR gates are replicated by transcriptors and called “Boolean Integrase Logic (BIL) gates”

➢ Like transistors, transcriptors can amplify a signal

➢ A group of transcriptors can form a Turing-complete computational device

Biocomputers

➢ Not meant to replace conventional silicon-based computing devices

➢ Meant to be used where electronic systems cannot be implemented and applied:

➢ Reprogramming living cells

➢ Nanobots

➢ Smart drugs

➢ Etc.

➢ Potential applications:

➢ Fully functional computers at nano-scale that can sense and manipulate the environment.

➢ Disease warning, diagnostics, control of insulin production/consumption, control of cell reproduction, detection and suppression of cancer cells

Biology-based paradigms

➢ Neural nets:

➢ Image/speech/text/pattern recognition

➢ Evolutionary computation, genetic algorithms, swarm intelligence:

➢ Optimization problems

➢ Cellular automata

➢ Modelling physical and biological processes:

➢ Such as communication, growth, reproduction, competition, evolution, etc.

➢ Artificial immune systems

➢ Computer security, data analysis, bioinformatics, robotics, etc.

➢ Membrane computing

Natural Computing

[Diagram: Natural Computing at the intersection of Computer Science, Mathematics and Nature]

Computations in Nature

Molecular computing:

• DNA computing

Cellular computing:

• Gene assembly in ciliates

Quantum computing:

• Superposition

• Entanglement

Computations in Nature

Molecular computing:

• Massive parallelism

• Nano-scale

Cellular computing:

• Massive parallelism

• Nano-scale

• Replication

• Filtering

• No supervision

Quantum computing:

• Exponential speed-up

• Information teleportation

Natural computing

➢ In general three directions:

1) Nature-inspired paradigms and problem-solving techniques

2) Math and computer-based analysis and simulation of natural phenomena

3) Employing natural components (bio-components and systems) to compute

Bioinformatics

➢ Major activity:

➢ Develop software tools to generate useful biological knowledge

➢ Computer science, mathematics and engineering to process bio-data

➢ Databases and information systems: store and organize bio-data

Bioinformatics and Computational Systems Biology

➢ Two closely related areas with a vague border:

➢ Bioinformatics:

➢ analyzing bio-data to generate bio-knowledge

➢ Computational Systems Biology:

➢ computational modelling of bio-systems and bio-processes to generate bio-knowledge

Synthetic biology

➢ Engineering synthetic biological components and systems

➢ Started from genetic engineering techniques based on recombinant DNA technology

➢ Nowadays we can synthesize some bacterial chromosomes:

➢ M. mycoides genome of 1,078,809 bp; a fully functional cell was grown from the synthetic genome

➢ Other efforts:

➢ Cell reprogramming (for instance, to make cells produce combustible fuel, novel cancer therapy approaches, etc.)

➢ Designing multi-cellular systems, for instance cell-to-cell communication modules to coordinate living bacterial populations

Membrane computing

➢ Formalizes the membrane structure of cells and the transport of biochemicals between membranes

➢ Terms: strings, multisets, graphs

➢ Membrane system – formal computational device based on multiset rewriting and communication

➢ Basic ingredients:

➢ Membranes – formalize cellular membranes. Membranes determine regions that:

➢ may include other membranes (hierarchical structure) or

➢ can be connected via communication channels (networks)

➢ Multisets of objects – formalize biochemical compounds.

➢ Each membrane has an associated multiset of objects (the membrane's content)

➢ Multiset rewriting/communication rules – formalize biochemical reactions and biochemical cross-membrane transport:

➢ The rules dictate how the membranes' contents evolve and how the membranes communicate

Membrane computing

[Figure: an example membrane system with four membranes (labelled 1 to 4), their multisets of objects and their rules, shown in two successive configurations]

Membrane computing

➢ Computation:

➢ At each step rules are chosen non-deterministically and applied in a maximally parallel manner (i.e., whatever can evolve, evolves)

➢ The system halts when no rule can be applied. Result: either the sequence/multiset of objects expelled into the environment, or the multiset of objects collected in the “output” membrane
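The computation scheme described above (multisets of objects per membrane, rules applied non-deterministically and in a maximally parallel way, halting when no rule applies) can be sketched as follows. This is only a toy simulator under stated assumptions: the rule format, the target names `here`/`out`/`in<k>`, the function names and the two-membrane example system are all hypothetical, and the example is not the system from the slide figure. Rules are assumed to have a non-empty left-hand side.

```python
import random
from collections import Counter

# Hypothetical example: skin membrane 1 contains membrane 2.
PARENT = {1: None, 2: 1}
CONTENTS = {1: "", 2: "aaa"}
# A rule is (lhs, rhs): it consumes the objects in lhs and produces (object, target) pairs.
# Targets: "here", "out" (parent, or the environment for the skin), "in<k>" (child k).
RULES = {
    2: [("a", [("b", "out")])],              # a -> (b, out)
    1: [("b", [("c", "out"), ("c", "out")])],  # b -> (c, out)(c, out)
}


def applicable(lhs, content):
    return all(content[obj] >= k for obj, k in Counter(lhs).items())


def step(contents, rules, parent, rng):
    """One maximally parallel step; products become available only in the next step."""
    remaining = {m: Counter(c) for m, c in contents.items()}
    produced = {m: Counter() for m in contents}
    expelled, changed = Counter(), False
    for m in contents:
        while True:  # keep applying rules until nothing more can evolve in region m
            options = [r for r in rules.get(m, []) if applicable(r[0], remaining[m])]
            if not options:
                break
            lhs, rhs = rng.choice(options)  # non-deterministic choice
            remaining[m] -= Counter(lhs)
            for obj, target in rhs:
                if target == "here":
                    produced[m][obj] += 1
                elif target == "out" and parent[m] is None:
                    expelled[obj] += 1
                elif target == "out":
                    produced[parent[m]][obj] += 1
                else:  # "in<k>": send into child membrane k (not validated here)
                    produced[int(target[2:])][obj] += 1
            changed = True
    return {m: remaining[m] + produced[m] for m in contents}, expelled, changed


def run(contents, rules, parent, max_steps=100, seed=0):
    """Iterate until the system halts; return final contents and the environment."""
    rng = random.Random(seed)
    contents = {m: Counter(c) for m, c in contents.items()}
    environment = Counter()
    for _ in range(max_steps):
        contents, expelled, changed = step(contents, rules, parent, rng)
        environment += expelled
        if not changed:
            break
    return contents, environment


if __name__ == "__main__":
    print(run(CONTENTS, RULES, PARENT))
```

In this example the three copies of `a` in membrane 2 are rewritten in a single step, which is what "maximal parallelism" means; the result of the computation is the multiset collected in the environment.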

➢ Applications:

➢ Machine learning,

➢ Modeling of biological systems,

➢ Computer graphics, public-key cryptography, approximation and sorting,

➢ Analysis of computationally hard problems

DNA computing: DNA

● Deoxyribonucleic acid:

– Nucleic acid

– Genetic information

– Two strands of polymers

● Polymer:

– Sequence of nucleotides

● Nucleotide:

– has one of four bases:

● adenine (A)

● cytosine (C)

● guanine (G)

● thymine (T)

➢ Complementarity, base-pairing:

➢ A-T

➢ C-G

Adleman’s Experiment, DNA Computing

• Solves an instance of the Hamiltonian Path Problem (HPP) by DNA manipulation

• HPP problem:

• given: directed graph

• find: a path coming through all the vertices exactly once

• DNA computing operations:

• cloning (PCR)

• splicing

• lengthening/shortening

• separating/fusing

• cutting

• sequencing

• other!

[Figure: complementary DNA strands (TATGACT paired with ATACTGA) undergoing the operations above, and an example directed graph on vertices 1 to 5]

Adleman’s Experiment, DNA Computing

1. Encoding:

• Vertex → single-stranded short DNA

• Edge → single-stranded short DNA complementary to the DNA of the adjacent vertices

2. Cloning

3. Generating all paths (hybridization)

4. Filtering out non-Hamiltonian paths

5. Result:

• Remaining molecules

DNA computing
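The encode / hybridize / filter steps of Adleman's experiment can be mimicked in software. The sketch below is only an in-silico analogy under several assumptions: the four 8-base vertex strands and the example graph are hypothetical, strand orientation (reverse complementarity) is ignored, the cloning step is omitted, and walks are enumerated exhaustively, whereas in the test tube they form at random and in parallel.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical 8-base vertex strands, chosen so that all half-strands differ.
VERTEX_DNA = {1: "ACGTTGCA", 2: "GGATCCTA", 3: "TTAGCGAC", 4: "CATGAAGT"}
HALF = 4
# Hypothetical directed graph on the four vertices.
EDGES = [(1, 2), (2, 3), (3, 4), (1, 3), (2, 4), (4, 2)]


def complement(strand):
    """Watson-Crick complement, base by base (A-T, C-G)."""
    return "".join(COMPLEMENT[base] for base in strand)


def junction(u, v):
    """Second half of u's strand followed by the first half of v's strand."""
    return VERTEX_DNA[u][HALF:] + VERTEX_DNA[v][:HALF]


def encode_edges(edges):
    """Step 1: an edge u->v is the strand complementary to the u/v junction."""
    return {complement(junction(u, v)) for u, v in edges}


def hybridize(edge_pool, n_vertices):
    """Step 3: build every walk of n_vertices vertices that edge strands can hold together."""
    def linked(u, v):
        return complement(junction(u, v)) in edge_pool

    walks = [[v] for v in VERTEX_DNA]
    for _ in range(n_vertices - 1):
        walks = [w + [v] for w in walks for v in VERTEX_DNA if linked(w[-1], v)]
    return walks


def hamiltonian_paths(edges):
    """Step 4: keep only the walks that visit every vertex exactly once."""
    n = len(VERTEX_DNA)
    return [w for w in hybridize(encode_edges(edges), n) if len(set(w)) == n]


if __name__ == "__main__":
    for path in hamiltonian_paths(EDGES):
        print(" -> ".join(map(str, path)))
```

The filtering step is where the problem is actually solved: hybridization produces many walks (including ones that revisit vertices, such as 1, 2, 4, 2), and only those containing every vertex survive.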

➢ Initiated by Leonard Adleman of the University of Southern California in 1994

➢ Proof-of-concept: solved a 7-vertex HPP instance

➢ Turing universal computational devices could be built

➢ Milestones:

➢ 1994: Adleman's experiment

➢ 1997: theoretical implementation of Boolean circuits

➢ 2002: programmable molecular computing machine

➢ 2004: attempts to build DNA computer to diagnose cellular cancer activity and to release anti-cancer drug

➢ 2000's: increased interest in DNA nanotechnology

➢ 2013, January: stored a JPEG photograph, Shakespearean sonnets and an audio file of Martin Luther King, Jr.'s speech “I Have a Dream” in DNA

➢ 2013, March: implemented the transcriptor – a biological transistor

Cellular computing

Advantages:

• Massive parallelism

• Replication

• Filtering

• No supervision

Gene assembly in ciliates during sexual reproduction

Ciliates

- Have existed for more than a billion years

- 1000s of species

- Most complex 1-cell organisms known

- Some >4 mm large

[Images: Stylonychia; Coleps (Micrographia Gallery)]

Macronuclei

- Unicellular eukaryotes

- They have 2 types of nuclei

Macronuclei: one DNA molecule, one gene; the gene is not fragmented.

MAC: GENE

Micronuclei

Micronuclei:

- DNA organized on chromosomes

- One DNA molecule, many genes

- Genes are fragmented

- Fragments (MDSs, macronuclear destined sequences) are shuffled

- Some fragments are inverted

- MDSs are separated by IESs (internal eliminated sequences)

MIC: MDS3 IES MDS1 IES MDS4 IES MDS2

Gene Assembly

MIC: MDS3 MDS1 MDS4 MDS2

Gene assembly:

- Remove IESs

- Unscramble MDSs

- Ligate MDSs

MAC: GENE

Pointers

• Pointers:

• short nucleotide sequences

• on MDS edges

p MDS q

• An MDS “points” to the next MDS by means of a common pointer

p MDS1 q   q MDS2 r

Assembled gene

• First MDS of a gene begins with marker b

• Last MDS of a gene ends with marker e

• Assembled gene:

• MDSs spliced on common pointers, in orthodox order

b MDS1 p2 MDS2 p3 ... pn MDSn e

Cellular computing
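The pointer-guided gene assembly described on the preceding slides (remove IESs, unscramble MDSs, ligate them on common pointers) can be sketched symbolically. The code below is a simplified sorting procedure, not a model of the molecular operations themselves. The `MDS` class, the `assemble` function, the pointer names and the choice of MDS4 as the inverted fragment are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDS:
    index: int
    left: str               # pointer (or marker) read first in the micronucleus
    right: str              # pointer (or marker) read last in the micronucleus
    inverted: bool = False  # True if the fragment appears reversed in the micronucleus


# Micronuclear order from the slides: MDS3 IES MDS1 IES MDS4 IES MDS2.
# Which fragment is inverted is a hypothetical choice (MDS4 here).
MIC = [
    MDS(3, "p3", "p4"), "IES",
    MDS(1, "b", "p2"), "IES",
    MDS(4, "e", "p4", inverted=True), "IES",
    MDS(2, "p2", "p3"),
]


def assemble(mic):
    """Return the assembled gene as a list of symbols: b MDS1 p2 MDS2 ... e."""
    # 1. Remove IESs.
    mdss = [seg for seg in mic if isinstance(seg, MDS)]
    # 2. Unscramble: restore the orientation of inverted MDSs, then index every
    #    MDS by its incoming pointer so that the next fragment can be looked up.
    oriented = [MDS(m.index, m.right, m.left) if m.inverted else m for m in mdss]
    by_incoming = {m.left: m for m in oriented}
    # 3. Ligate: start at marker b and follow the common pointers until marker e.
    gene = ["b"]
    current = by_incoming["b"]
    while True:
        gene += [f"MDS{current.index}", current.right]
        if current.right == "e":
            return gene
        current = by_incoming[current.right]


if __name__ == "__main__":
    print(" ".join(assemble(MIC)))
```

Each shared pointer appears only once in the result, mirroring the fact that two MDSs are spliced on their common pointer.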

➢ Studying or implementing computations in living cells

➢ Examples:

➢ Gene assembly in ciliates – permutations and inversions of DNA fragments. Turing universal.

➢ In vivo programmable and autonomous finite-state automaton with E. coli

➢ In vivo cellular logic gates and genetic circuits that alter the cell's existing biochemical processes

Molecular computing

➢ Emergent interdisciplinary field concerned with programming molecules

➢ To perform a desired computation, or

➢ Fabricate a desired object, or

➢ Control functioning of specific molecular system

➢ Central principle:

➢ The data can be encoded within bio-molecules

➢ Tools of molecular science could be used to manipulate and process the data

➢ Program:

➢ A collection of molecules that, when placed in a suitable substrate, will perform a specific function (which can be interpreted as program execution)

Molecular computing – an overview course

➢ During this course we will learn about:

➢ Computations with DNA and other bio-molecules in vitro

➢ DNA Computing and self-assembly of nano-structures

➢ Computation with bio-molecules in vivo

➢ Cellular Computing

➢ Biologically inspired model of distributed parallel computations:

➢ Membrane computing – abstracting from the multi-compartment membrane structure of living systems

➢ We will overview:

➢ Basics of DNA structure and manipulation

➢ Formal models for DNA Computing and DNA-based computational devices

➢ Biology and computational models of gene assembly in ciliates

➢ Theory of membrane computing

Course info

➢ No strict prerequisites for students. All background will be provided on demand

➢ 7 weeks, twice per week

➢ First lecture – today, 31st of August, 2015

➢ Last lecture – Thursday, 15th of October, 2015

➢ Mondays – 15:15 – 16:45 Cobol, ICT

➢ Thursdays – 10:15 – 11:45 Algol, ICT

➢ Exams:

➢ TBD

➢ Lecturer:

➢ Vladimir Rogojin, B5050, ICT [email protected]