The Perfect C. Elegans Project: an Initial Report
Total Page:16
File Type:pdf, Size:1020Kb
The Perfect C. elegans Project: An Initial Report Hiroaki Kitano Shugo Hamahashi Sony Computer Science Laboratory Dept. of Electrical Engineering 3-14-13 Higashi-Gotanda, Shinagawa-ku Keio University Tokyo 141, Japan Yokohama, Japan [email protected] [email protected] Jun Kitazawa Sean Luke Dept. of Media and Governance Department of Computer Science Keio University University of Maryland Fujisawa, Japan College Park, MD 20742 USA [email protected] Abstract standing of the components and their isolated functions do not lead to an understanding of the dynamics behind the global The soil nematode Caenorhabditis Elegans (C. elegans) is the development and behavior of this organism. Despite the fact most investigated of all multi-cellular organisms. Since the it is the simplest known differentiated organism, C. elegans proposal to use it as a model organism, a series of research dynamics are still too complex for us to fully grasp at this time. projects have been undertaken, investigating a wide range of This situation typi®es the problems that lie ahead in molec- on various aspects of this organism. As a result, the com- ular biology. The human genome projects and other genome plete cell lineage, neural circuitry, and various genes and their sequence projects will soon identify complete DNA sequences functions have been identi®ed. The complete C. elegans DNA and clone all genes, giving the ultimate reductionist view of sequencing and gene expression mapping for each cell at dif- ourselves and other organisms; however, this does not directly ferent times during embryogenesis will be identi®ed in a few lead to an understanding of the full interactions that make up years. Given the abundance of collected data, we believe that living organisms. Since many phenomena occurring in living the time is ripe to introduce synthetic models of C. elegans organisms are complex and non-linear, it is almost impossi- to further enhance our understanding of the underlying prin- ble to fully understand their characteristics without the aid of ciples of its development and behavior. For this reason, we simulation and modelling. Thus, we claim that an in-depth un- have started the Perfect C. elegans Project, which aims to ul- derstanding may be only be achievable through reconstructing timately produce a complete synthetic model of C. elegans the phenomena with computer simulation. cellular structure and function. This paper describes the goal, In a ®rst attempt at this approach, we have developed the approach, and the initial results of the project. detailed simulation models of C. elegans and Drosophila melanogaster (another widely studied organism). Given the abundance of data and the expected progress of analytical 1 Introduction methods for these organisms, they're the natural target for this approach. When Sydney Brenner proposed the investigating C. elegans to the Medical Research Council, he chose it because it was the simplest of all differentiated organisms [Brenner, 63]. The 2 Caenorhabditis elegans decision proved wise, generating a number of fruitful re- sults, including the complete identi®cation of cell lineage C. elegans is a small worm found in soil and ubiquitously of the worm [Sulston and Horvitz, 77, Kimble and Hirsh, 79, observed throughout the world. It accounts for the largest Sulston et al., 83] and its full neural circuit topology includ- biomass on earth. It has a life span of about 3 days and feeds on ing all synapses and gap junctions [White et al., 86]. Using bacteria. C. elegans has no female sex (only hermaphrodites a whole mount in situ hybridization, the complete C. elegans and males), so most every worm in the population is genetic DNA sequencing and expression pattern mapping are now clone. This, plus the simple nature of the organism, means within our reach [Sulston et al., 92, Tabara et al., 96]. These that lineages, positions, and interactions of every single cell investigations clarify what C. elegans is composed of and how can be mapped out. The adult male C. elegans has exactly each of its individual components work. However, an under- 1031 somatic nuclei, and the adult hermaphrodite has 959 1 somatic nuclei. The lineage of these cells has been fully the simulation or the hypothesis is incorrect. The accuracy of identi®ed by the extensive work of various research groups simulations needs to be carefully examined and veri®ed be- (one in particular: [Sulston et al., 83]). fore the model can serve as a testbed for hypotheses. It should The nervous system of C. elegans is relatively simple. Its be noted, however, that even with a limited accuracy model, hermaphrodites have only 302 neurons and 56 glial and as- hypotheses can be tested if they focus on speci®c aspects of sociated support cells. This accounts for 37% of all so- genetic interactions and involve logical interrelationships of matic cells. In the adult male, the number of neurons is interactions rather than details concerning the subtle balance 381, and there are 92 glial and support cells, which is 46% of participating components. of all somatic cells. White differentiated these neurons into 118 classes and reported that there are about 5,000 chemical By the same token, a well designed model can be used for synapses, 2,000 neuromuscular junctions, and 600 gap junc- predicting the workings of possible molecular mechanisms tions [White et al., 86]. which are involved in speci®c phenomena. The ideal ap- A haploid C. elegans genome is composed of 8 107 nu- × proach is for the model to provide a set of possible molecular cleotide pairs. The C. elegans genome project is being carried mechanisms for speci®c phenomena whose true mechanisms out by the Sanger Center and Washington University, which are unknown. Experimental biologists can then design exper- has already sequenced more than 25% of its genomes. Cur- iments to identify which of these mechanisms actually exists. rent progress suggests that all genes, estimated to number about 13,000, will be identi®ed within a few years. Implementing the model can be extremely useful for biol- The mechanism of fate determination involving maternal ogists studying C. elegans. It can be a comprehensive visual genes has been intensively investigated. However, fate de- database of the organism, and assist in cell identi®cation. Cur- termination in later cells is largely unknown because down- rently, the most widely used computer assisted system for C. stream genes have not been identi®ed. In order to investigate elegans is the Angler system, also called the 4D system, devel- the genetic interactions for fate determination, a project to oped by the Sanger Center. Angler is essentially a collection identify the genes that are expressed in a speci®c cell lin- of tagged images taken by Nomarski confocal microscopy. eage has been initiated at the National Institute for Genetics However, the Angler system does not provide the capability [Tabara et al., 96]. This project uses in situ hybridization on to rotate images or to animate the embryogenesis process. A whole mount embryos to identify the expression of genes at good computer model can compliment the Angler system by speci®c cells at speci®c times during embryogenesis. providing computer graphic images and simulations linked Many mutants have been isolated which can be used for with a set of optical images. genetic analysis. For example, the ced family affects pro- grammed cell death, and mutations in the lin family of genes cause cell lineage abnormality. There is a large list of genes While modeling C. elegans on a computer is useful in its and their phenotypical disorders, allowing for a wide range of own right for augmenting biological knowledge, the visualiza- manipulations to help in the investigation of cellular develop- tion of embryogenesis is one of the most persuasive tasks of ment. Other manipulations are possible during embryogenesis the project. Visualization helps researchers to conceptualize through laser ablation or direct micro-manipulations. both global and local issues in embryogenesis, and to identify cells during experiments. This requires tools to assist biolog- ical observations using three-dimensional computer graphics 3 Project Goals of the complete C. elegans development process (possibly cou- pled with image understanding techniques for automatic data Given these biological accomplishments and on-going efforts, acquisition and semi-automatic cell identi®cation), a simula- the Perfect C. elegans Project aims to create a detailed simula- tor for a complete neural circuit, and an integrated database tion model of C. elegans to promote our understanding of the on C. elegans. organism and of life in general. By implementing a detailed model, we can verify how much To begin meeting these goals, we have so far developed we really do understand about C. elegans. Due to complexities three modelling and visualization systems for C. elegans cellu- of genetic and cell to cell interactions, and the large number of lar data. The ®rst, a embryogenesis visualization system, uses participating genes and their products, it is extremely dif®cult a sophisticated dynamics model to display the embryogenesis for us to fully understand and verify whether a certain hypoth- process cell-by-cell. The second, a neural simulation sys- esis can be supported by biological ®ndings. A possible use tem, models the known neural circuitry involved in C. elegans of the simulation model is to bring the genetic interactions of thermotaxis. The third, a portable Java-based 3D visualizer, genes into focus to see if hypothetical genetic interactions can displays simple cell position, neural connections, and other in- consistently recreate phenotypical changes that match known formation using new positional data recently made available, biological observations. If the results of simulation differ without attempting the sophisticated cellular modelling and from actual observations, there are two possibilities: either 3D rendering of the ®rst system.