concepts

Cells as computation

Cellular abstractions

Aviv Regev and Ehud Shapiro

Computer science could provide the abstraction needed for consolidating knowledge of biomolecular systems.

Although we are successfully consolidating our knowledge of the ‘sequence’ and ‘structure’ branches of molecular cell biology in an accessible manner, the mountains of knowledge about the function, activity and interaction of molecular systems in cells remain fragmented. Sequence and structure research use computers and computerized databases to share, compare, criticize and correct scientific knowledge, to reach a consensus quickly and effectively. Why can’t the study of biomolecular systems make a similar computational leap? Both sequence and structure research have adopted good abstractions: ‘DNA-as-string’ (a mathematical string is a finite sequence of symbols) and ‘protein-as-three-dimensional-labelled-graph’, respectively. Biomolecular systems research has yet to find a similarly successful one.

The hallmark of scientific understanding is the reduction of a natural phenomenon to simpler units. Equally important understanding comes from finding the appropriate abstraction with which to distill an aspect of knowledge. An abstraction (a mapping from a real-world domain to a mathematical domain) highlights some essential properties while ignoring other, complicating, ones. For example, classical genetic analysis uses the ‘gene-as-hereditary-unit’ abstraction, ignoring the biochemical properties of genes as DNA sequences. A good scientific abstraction has four properties: it is relevant, capturing an essential property of the phenomenon; computable, bringing to bear computational knowledge about the mathematical representation; understandable, offering a conceptual framework for thinking about the scientific domain; and extensible, allowing the capture of additional real properties in the same mathematical framework.

For example, the DNA-as-string abstraction is relevant in capturing the primary sequence of nucleotides without including higher- and lower-order biochemical properties; it allows the application of a battery of string algorithms, including probabilistic analysis using hidden Markov models, as well as enabling the practical development of databases and common repositories; it is understandable, in that a string over the alphabet A, T, C, G is a universal format for discussing and conveying genetic information; and extensible, enabling, for example, the addition of a fifth symbol denoting methylated cytosine.

We believe that computer science can provide the much-needed abstraction for biomolecular systems. Advanced computer science concepts are being used to investigate the ‘molecule-as-computation’ abstraction, in which a system of interacting molecular entities is described and modelled by a system of interacting computational entities. Abstract computer languages, such as Petri nets, Statecharts and the Pi-calculus, were developed for the specification and study of systems of interacting computations, yet are now being used to represent biomolecular systems, including regulatory, metabolic and signalling pathways, as well as multicellular processes such as immune responses. These languages enable simulation of the behaviour of biomolecular systems, as well as development of knowledge bases supporting qualitative and quantitative reasoning on these systems’ properties.

Processes, the basic interacting computational entities of these languages, have an internal state and interaction capabilities. Process behaviour is governed by reaction rules specifying the response to an input message based on its content and the state of the process. The response can include state change, a change in interaction capabilities, and/or sending messages. Complex entities are described hierarchically: for example, if a and b are abstractions of two molecular domains of a single molecule, then (a parallel b) is an abstraction of the corresponding two-domain molecule. Similarly, if a and b are abstractions of the two possible behaviours of a molecule in one of two conformational states, depending on the ligand it binds, then (a choice b) is an abstraction of the molecule, with the choice between a and b determined by its interaction with a ligand process.

Using this abstraction opens up new possibilities for understanding molecular systems. For example, computer science distinguishes between two levels to describe a system’s behaviour: implementation (how the system is built, say the wires in a circuit) and specification (what the system does, say an ‘AND’ logic gate). Once biological behaviour is abstracted as computational behaviour, implementation can be related to a real biological system, for example the detailed molecular machinery of a circadian clock, and the corresponding specification to its biological function, such as a ‘black-box’ abstract oscillator. Ascribing a biological function to a biomolecular system is thus no longer an informal process but an objective measure of the semantic equivalence between low-level and high-level computational descriptions. Equivalence between related implementations in different organisms can also be a measure of the behavioural similarity of entire systems, complementary to sequence and structure similarity.

Computer and biomolecular systems both start from a small set of elementary components from which, layer by layer, more complex entities are constructed with ever-more sophisticated functions. Computers are networked to perform larger and larger computations; cells form multicellular organisms. All existing computers have an essentially similar core design and basic functions, but address a wide range of tasks. Similarly, all cells have a similar core design, yet can survive in radically different environments or fulfil widely differing functions. Of course, biomolecular systems exist independently of our awareness or understanding of them, whereas computer systems exist because we understand, design and build them. Nevertheless, the abstractions, tools and methods used to specify and study computer systems should illuminate our accumulated knowledge about biomolecular systems.

Aviv Regev and Ehud Shapiro are in the Department of Computer Science and Applied Mathematics, and the Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel.

Figure: Abstract work: a computer model of information transfer through a network of coloured points.

FURTHER READING
www.wisdom.weizmann.ac.il/~aviv
Milner, R. Communicating and Mobile Systems: The Pi-calculus. (Cambridge Univ. Press, 2000).
Fontana, W. & Buss, L. W. in Boundaries and Barriers (eds Casti, J. & Karlqvist, A.) 56–116 (Addison-Wesley, New York, 1996).
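The hierarchical composition described in the article, (a parallel b) for a two-domain molecule and (a choice b) for a molecule whose behaviour depends on the ligand it binds, can be made concrete with a small sketch. The Python below is an illustration only, not the authors' formalism or a Pi-calculus implementation: the class names `Atom`, `Parallel` and `Choice`, the functions `show` and `react`, and the example domain and ligand names are all hypothetical. As a deliberate simplification, a message is offered to every component of a parallel composition, whereas in the Pi-calculus a message is consumed by a single receiver.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """A basic process, e.g. the abstraction of one molecular domain."""
    name: str


@dataclass(frozen=True)
class Parallel:
    """(a parallel b): both components exist side by side."""
    left: "Process"
    right: "Process"


@dataclass(frozen=True)
class Choice:
    """(a choice b): one alternative is selected by an incoming message.

    Each branch is a (trigger message, resulting process) pair.
    """
    branches: Tuple[Tuple[str, "Process"], ...]


Process = Union[Atom, Parallel, Choice]


def show(p: Process) -> str:
    """Render a process term in the article's (a parallel b) notation."""
    if isinstance(p, Atom):
        return p.name
    if isinstance(p, Parallel):
        return f"({show(p.left)} parallel {show(p.right)})"
    return "(" + " choice ".join(show(b) for _, b in p.branches) + ")"


def react(p: Process, message: str) -> Process:
    """Return the process after it is offered `message`.

    A Choice commits to the branch whose trigger matches; anything that
    does not recognise the message is returned unchanged.
    Simplification: every component of a Parallel sees the message.
    """
    if isinstance(p, Choice):
        for trigger, branch in p.branches:
            if trigger == message:
                return branch
        return p
    if isinstance(p, Parallel):
        return Parallel(react(p.left, message), react(p.right, message))
    return p


# Hypothetical molecule: an anchoring domain in parallel with a receptor
# domain whose conformation depends on which ligand it binds.
receptor = Choice((("ligand_x", Atom("active")), ("ligand_y", Atom("inactive"))))
molecule = Parallel(Atom("anchor_domain"), receptor)

print(show(molecule))
print(show(react(molecule, "ligand_x")))
```

Representing terms as immutable values means that `react` returns a new term rather than mutating the old one, so the state before and after an interaction can be compared directly.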

NATURE | VOL 419 | 26 SEPTEMBER 2002 | www.nature.com/nature © 2002 Nature Publishing Group 343