www.asiabiotech.com Special Feature —

Korean Systems Biology Project

Do Han Kim and Sang Yup Lee

Background

Recently, unprecedentedly large amounts of genomic, transcriptomic and proteomic data have been generated, thanks to the rapid development of high-throughput experimental technologies. The Korean Systems Biology Project, funded by the Korean Ministry of Science and Technology, supports two projects: Project and the Calcium Signaling Systems Project. Through these two projects, researchers plan to integrate all these X-omic data efficiently and to make discoveries that current methods cannot. Intensive studies on the utilization and integration of diverse biological databases are under way all over the world. None, however, has developed a module that completely combines several factors using the systems biological approach. The Calcium Signaling Systems Project seeks to combine these factors in the study of the cardiac calcium signaling system in the mouse, specifically the role Ca2+ plays in the heart.

Systems biology groups in South Korea, including the Korea Advanced Institute of Science and Technology (KAIST), the Gwangju Institute of Science and Technology (GIST) and several other research institutes, have investigated systematic modeling and the construction of bioinformatic databases for various biological systems (Table 1). A well-known limitation of non-integrated databases is that most groups cannot effectively extract new information from them. This offers South Korea a rare opportunity to lead the world in cross-disciplinary research combining biology, chemistry, computer science, physics and chemical engineering through its Systems Biology Project. Basic data on the biological systems of microorganisms have been accumulated over the years by using various technologies to manipulate their biological pathways, as seen in the essential second-messenger role of the intracellular concentration of Ca2+ ions in processes.
Although complete sequences of several organisms have been released, the methods

APBN • Vol. 10 • No. 3 • 2006 137 www.asiabiotech.com Special Feature — Systems Biology

for analyzing the overall and the prediction of complex and diverse organisms remain primitive. Methods for understanding and combining data from genome, , and fluxome are required to construct an integrated metabolic network. Such methods have not yet been studied extensively, but they are being attempted in the Systems Biology Project. Meanwhile, as the data obtained from simple repetitive experiments decrease, methods for quantitative systems biology are becoming more important. A crucial point in current biological studies is to extract valuable information from scattered databases that are updated regularly. Even if the necessary information can be found, it is difficult to understand how it relates to information presented in another database. Unless an integrated framework is developed, researchers are forced to model specific biological systems on a case-by-case basis.

In the case of the Calcium Signaling Systems Project, a defined set of genes associated with mouse cardiac excitation-contraction coupling (ECC) was compiled first, followed by the expression correlation networks, protein expression levels and interaction networks. The effects of various genetic perturbations and altered functions have been studied using a variety of molecular methods in order to construct in silico models for myocardial Ca2+ homeostasis and hypertrophy.

A successful approach to modeling biological systems through an integrated system framework will help to optimize experimentation and reduce cost. Researchers can then perform in silico experiments based on the systems biological model to get a better idea of which experiments to perform in vitro. Such a framework can also help to visualize and extend the map of the system.

Objective and Necessity of the Project

It is inevitable that biotechnology will be the center of the world economy during the next era. Over the past decade, the market for biochemical and biotechnological products, including pharmaceuticals and medical equipment, has grown exponentially. It is currently one of the largest and fastest-growing industries, with an expected market size of over six hundred billion dollars in the future. The vast amount of funds invested in biological and biotechnological research has mainly been focused on discovering the cellular and organic behavior of cells for industrial, medical and pharmaceutical applications. Advances in high-throughput techniques allow huge amounts of biological data from X-omics technology to be accumulated. The systems biology concept seeks to obtain and analyze comprehensive data on a given physiology at the systemic level. In particular, we can extract systemic expression profiles of the entire genome under controlled environments


through comparatively simple experiments using the microarray technique and DNA chips. Similarly, proteomics provides quantitative data on total cellular proteins simultaneously by two-dimensional polyacrylamide gel electrophoresis. In metabolomics, quantitative profiles of most of the intracellular metabolites can be obtained by gas chromatography/mass spectrometry and liquid chromatography/capillary electrophoresis. In addition, improvements in experimental processes such as DNA sequencing have accelerated the expansion of the genomic sequence databases of various organisms, which can be accessed through the Internet on various servers. These advances allow for the construction of a genome-scale network model of biochemical reactions for use in metabolic flux analysis.

These methodological improvements have provided a wealth of data for efficiently understanding the biochemical and biophysical phenomena of the cell. However, because biological behavior is complex, many qualitative analyses of the cell give only part of the information desired, and sorting through the vast amount of experimental data is overwhelming. Therefore, mathematical and computational work capable of simultaneously handling huge amounts of qualitative and quantitative data from several high-throughput techniques is necessary. To accomplish this, in silico cell modeling has been carried out based on the extensive biological information available, including data generated from X-omics techniques as well as data from various biological databases. It is also important to include the integrated platform of and the databases for the construction of the in silico cell. When complete, the in silico cell will provide new clues to further our understanding of biological systems and promises a new era in biology and biotechnology. In this paper, the current status of the Korean Systems Biology Project and the results obtained from it are reviewed.
This systems biological approach should prove useful in designing and developing cells to achieve the many goals of biotechnology.

Strategies and Applications of the Project

1. High-throughput X-omic analysis

In the last decade, the fields of biology and biotechnology have grown rapidly. This rapid development is largely driven by high-throughput experimental techniques, which allow biological datasets to be generated at a rapid rate. Genome sequencing, along with other advances such as the profiling of the transcriptome, proteome, and fluxome of many organisms, is changing the traditional paradigm of biological and biotechnological research (Fig. 1). These technological advances have naturally led to the birth of Systems Biology, which aims to elucidate biological mechanisms and phenomena at a system and genome-scale level.

Fig. 1. High-throughput x-omics methodologies. Genomic data obtained initially from the DNA sequencer are highly related to the transcriptomic data generated from the DNA microarray or DNA chip. The high-throughput technologies in proteomics using 2-D gel electrophoresis and mass spectrometry accelerate the construction of the network of protein relationships. The metabolomic and fluxomic data can be highly helpful in determining the quantity of the metabolites and the internal flux profiles of the cell (Lee et al., 2005).


Genome analysis: A genome contains the inherited genetic information that enables cells to respond to changing environmental factors. The genetic information of the cell is therefore indispensable in manipulating natural cellular physiology to engineer a new cell for the purposes of biotechnology. Recently, many high-throughput experimental tools have been developed to learn more about the genome, and they are being applied in the field of systems biology. Knowledge of the genome is also required to fully understand the transcriptome and proteome of the cell. Additionally, knowledge of the cell’s genome is helpful in reconstructing in silico models of metabolic pathways. In the post-genome era, comparative genomics is being carried out on many finished microbial genome projects. In comparing the of the many microorganisms sequenced, much is learned about these microorganisms, such as their evolutionary tree and metabolic relationships.

Transcriptome analysis: Transcriptome analysis arose from the need for high-throughput analysis of the expression patterns of the whole genome in various tissues and organisms under various controlled environments. Before the development of the DNA microarray, gene expression was monitored by northern blot analysis, a tedious and time-consuming process. With the arrival of high-throughput DNA microarray technology, gene expression data can be generated at a much faster rate. Transcriptome analysis is widely applied in various fields of biotechnology, from food and science to medical research such as cancer. With high-throughput DNA microarray analysis, we are able to understand the global cellular physiology and metabolism of living organisms under various environmental conditions. As a result, transcriptome profiling of microorganisms is becoming popular as the search continues for ways to improve cellular function to meet the goals of biotechnology.
With transcriptome analysis, new connections between regulatory circuits and metabolic pathways can be identified, leading to the design of metabolic and cellular engineering strategies for optimizing biological processes that produce amino acids and recombinant proteins.

Proteome analysis: Proteomics is the study of the functional and structural information of proteins, drawn from protein-protein interactions and the protein complement of the genome. Whereas transcriptomics is a bottom-up approach that surveys information from DNA to RNA, proteomics is a top-down approach that surveys the profile of protein expression. Studies targeting protein-protein interactions mainly seek to understand the roles of proteins in causing diseases. Proteomics also plays an important role in finding useful target proteins. For these reasons, proteomics attracts a great deal of interest in the world of biotechnology. However, there are some problems associated with proteomics. Since proteomics is a quantitative study of proteins, the in situ activity of a protein cannot be determined directly. The main drawback of proteomic studies is therefore that the correlation between the amount of protein and the physiological mechanism cannot be directly identified and only


ambiguous results can be obtained. On the other hand, proteomics has the advantage that it can show which proteins participate in the actual bioprocess and can easily confirm target proteins. Proteome profiling is an important tool for systems biology because cellular behavior is more directly influenced by proteins than by mRNAs. Proteome analysis makes it possible to monitor the presence of large numbers of proteins within a cell or tissue and to observe quantitatively how protein levels change under different circumstances. It has many applications in biotechnology, including the discovery of drug targets, the development of diagnostic machines, the monitoring of intracellular metabolism and the elucidation of regulatory networks from proteins that undergo coordinated changes of expression. Proteome profiling of microorganisms generates invaluable information that can be used to develop metabolic and cellular engineering strategies to enhance the yield and productivity of native and foreign bio-products and to modify cellular properties to improve mid-stream and down-stream processes.

Fluxome (or Metabolome) analysis: Metabolic flux analysis is based on the mass balances of the metabolites in the metabolic pathways, expressed as stoichiometric equations, and it is useful for determining the properties and performance of metabolic networks. The aim of metabolic flux analysis is the detailed quantification of all metabolic fluxes in the central metabolism of a microorganism. The result is a flux map that shows the distribution of anabolic and catabolic fluxes in the metabolic network. Based on such a flux map, or on a comparison of different flux maps, possible targets for genetic modification may be identified, the results of genetic manipulation can be predicted, or conclusions about cellular energy metabolism can be drawn.
Therefore, the measurement of metabolism-wide fluxes, or the fluxome, allows us to observe the functional output of compositional changes in the transcriptome, proteome and metabolome, and addresses the missing link between contemporary functional analysis and the cellular phenotype.

Combined X-omics strategies: None of the strategies above is sufficient on its own for understanding cellular physiology and regulatory mechanisms and for finding gene targets, since the biological network is too complex and regulation occurring at the protein level (post-translational modification, protein activity, etc.) cannot be captured. It is also important to know how much RNA is translated into proteins. To visualize the entire system, the different strategies must be combined; only then can the complex system of the cell be unraveled and its secrets revealed. However, several of the strategies still require improvement. For example, in proteome profiling, not all the protein markers on the 2-D gel are known at this point. This makes it difficult to match the proteome profiles completely with cellular metabolism and physiology. Also, the quantity of a protein is not always proportional to its activity, which in turn is not always proportional to the metabolic flux


performed by the protein. With an effective integration of the information, a better understanding of the principles that control the metabolic, signaling, and regulatory networks of organisms can be obtained. In the future, this type of integrated analysis will become a powerful tool for understanding cellular physiology and metabolism at the systems level and for designing strategies for the metabolic and cellular engineering of organisms.

2. In silico modeling and static/dynamic simulation

Several approaches have been developed for quantitative in silico modeling and simulation of metabolic systems [12]. These can be classified according to their time-dependency into two main types: the kinetic-based dynamic model and the constraints-based stationary model. Both models are important for the simulation of metabolic systems, each with its own strengths and weaknesses. Kinetic-based model analysis in the dynamic approach is imperative for fully characterizing metabolic reaction systems. However, it entails a large number of kinetic parameters that are difficult to obtain. Moreover, the values of many of the known parameters are uncertain because the reaction mechanisms and parameters in these models are derived from measurements made in vitro. The constraints-based stationary model, on the other hand, offers an alternative to the kinetic model as a more realistic approach to the large-scale applications under consideration. Assuming a pseudo-steady or stationary state, we can simplify the kinetic model into a static representation. Unlike the dynamic approach, the stationary model considers only the network’s connectivity and capacity as time-invariant properties of the metabolic system. The stationary approach includes stoichiometric analysis, structural or topological pathway analysis and constraints-based flux analysis. Of these, constraints-based flux analysis, also known as metabolic flux analysis (MFA), is the most widely adopted method since it requires the least amount of information to quantify and analyze the metabolic system.

An important result from this project was the development of strategies for constructing genome-scale metabolic networks and for filling in the missing links in such networks.
Therefore, the construction of the metabolic network from the complete genome sequence becomes an essential prerequisite for this study and for the application of metabolic engineering strategies. However, several limitations in validating the constructed network prevent these strategies from being widely used. In this project, the genome of the newly sequenced microorganism Mannheimia succiniciproducens MBEL55E was fully annotated and its genome-scale metabolism was systematically analyzed [3]. An in silico metabolic network model, comprising 373 reactions and 352 metabolites, was constructed from the genome information using the biological and biotechnological approaches available. From this in silico model, new metabolic characteristics and new ways to improve


the production of important metabolites were revealed. As a result, new methods for the production of important metabolites were proposed from the genome sequence by combining virtual experiments using computer simulations with experiments in vitro. Additionally, we have reconstructed a large genome-scale metabolic network of Escherichia coli comprising 979 reactions using the information in public metabolic databases such as EcoCyc, KEGG, and BioSilico [6,7]. When the reconstructed network was completed, new pathways and metabolic reactions were found. However, many missing links are still present in metabolic networks constructed from the reference databases. These orphan pathways and missing links can be checked and updated by utilizing the metabolic databases and various bioinformatics techniques.
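The stationary (pseudo-steady-state) idea described in this section can be illustrated with a very small mass-balance calculation. The sketch below is not the project's model: the five-reaction network, the flux names and the "measured" values are hypothetical and chosen only so that the balance S v = 0 has a unique solution once two fluxes are fixed.

```python
from fractions import Fraction

# Hypothetical toy network (not taken from the article):
#   v1: -> A (uptake)   v2: A -> B   v3: A -> C
#   v4: B -> (export)   v5: C -> (export)
# Rows are metabolites A, B, C; columns are fluxes v1..v5.
S = [
    [1, -1, -1, 0, 0],   # balance on A
    [0, 1, 0, -1, 0],    # balance on B
    [0, 0, 1, 0, -1],    # balance on C
]


def solve_fluxes(S, measured):
    """Solve S v = 0 for the unmeasured fluxes.

    `measured` maps a 0-based flux index to its measured value. The
    remaining system must be exactly determined (one balance per unknown).
    """
    n = len(S[0])
    unknown = [j for j in range(n) if j not in measured]
    if len(unknown) != len(S):
        raise ValueError("need exactly as many balances as unknown fluxes")
    # Move measured terms to the right-hand side: S_u v_u = -S_m v_m
    A = [[Fraction(row[j]) for j in unknown] for row in S]
    b = [-sum(Fraction(row[j]) * Fraction(measured[j]) for j in measured)
         for row in S]
    m = len(unknown)
    # Gauss-Jordan elimination with a simple pivot search
    for col in range(m):
        pivot = next((r for r in range(col, m) if A[r][col] != 0), None)
        if pivot is None:
            raise ValueError("balances are not independent")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(m):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
                b[r] -= f * b[col]
    solved = {j: b[i] / A[i][i] for i, j in enumerate(unknown)}
    solved.update({j: Fraction(v) for j, v in measured.items()})
    return [solved[j] for j in range(n)]


# Assumed measurements: uptake v1 = 10 and export v4 = 6 (arbitrary units).
fluxes = solve_fluxes(S, {0: 10, 3: 6})
print([float(v) for v in fluxes])  # [10.0, 6.0, 4.0, 6.0, 4.0]
```

Genome-scale networks such as those described here are usually underdetermined, so in practice the balances are combined with capacity constraints and an optimization objective rather than solved directly as above.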

3. Systems biology model and data integration

The objective of this strategy is to develop a foundation system for the modeling and analysis of biological systems by incorporating biological databases with the construction and application of biological knowledge. Current biological research is limited by the difficulty of finding the needed information in scattered databases. Difficulties in finding relationships among the data obtained also restrict the flexibility of the research. Therefore, the development of an integrated systematic framework incorporating the information mentioned above can help to optimize experiments and reduce costs. Information on the core models in systems biology has been collected, and model libraries accumulating various compatible network models such as SBML and CellML [5] have been developed. tools (GeneRUBY and MONET) were created using the Bayesian network model and the biological data clustering model. In developing the integrated querying system, the relational schemas of the various public databases were analyzed and UML descriptions of each database were generated [13]. Based on these descriptions, the integrated querying systems for biological information (BIRDIE and UNICORN) were developed using the mediator-wrapper structure. These systems can be used for predicting genetic interactions and improving the efficiency of research.
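The mediator-wrapper structure mentioned above can be sketched in a few lines. This is only an illustration of the general pattern, not the design of BIRDIE or UNICORN: the class names, the two in-memory "databases" and their field names are all hypothetical.

```python
class Wrapper:
    """Translates queries in a common schema into one source's own schema."""

    def __init__(self, name, records, field_map):
        self.name = name
        self.records = records      # list of dicts in the source's schema
        self.field_map = field_map  # common field name -> source field name

    def query(self, **criteria):
        # A source that does not know a requested field returns nothing.
        if any(key not in self.field_map for key in criteria):
            return []
        rows = []
        for rec in self.records:
            if all(rec.get(self.field_map[k]) == v
                   for k, v in criteria.items()):
                # Translate the hit back into the common schema.
                row = {common: rec.get(native)
                       for common, native in self.field_map.items()}
                row["source"] = self.name
                rows.append(row)
        return rows


class Mediator:
    """Sends one query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, **criteria):
        merged = []
        for wrapper in self.wrappers:
            merged.extend(wrapper.query(**criteria))
        return merged


# Two hypothetical sources that describe the same gene with different schemas.
db_a = Wrapper(
    "db_a",
    [{"gene_symbol": "geneX", "org": "mouse"},
     {"gene_symbol": "geneY", "org": "mouse"}],
    {"gene": "gene_symbol", "organism": "org"},
)
db_b = Wrapper(
    "db_b",
    [{"name": "geneX", "species": "mouse"}],
    {"gene": "name", "organism": "species"},
)
mediator = Mediator([db_a, db_b])

for hit in mediator.query(gene="geneX"):
    print(hit)
```

The point of the pattern is that the caller only sees the common field names; each wrapper hides the schema of its own source, which is why per-database schema descriptions are needed before such a system can be built.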

4. Dynamic modeling and parameter estimation

The objective of this strategy is to develop modeling strategies for the dynamic simulation and analysis of metabolic networks to help the implementation of a virtual cell. A biological metabolic network generally contains much unknown information. Even when the characteristics of the metabolic network are known in detail, difficulties remain in predicting the dynamic parameters for the simulation. To develop an exact model for the simulation of metabolic networks, increasing the accuracy of the constructed model becomes important.


A strict metabolic network model for the dynamic simulation of the metabolic network of Lactococcus lactis was constructed for this project, focusing mainly on lactate metabolism. The constructed metabolic network was analyzed by combining in vitro experiments using the sampling equipment with in silico network analysis using the GEPASI simulation tool [10]. The parameters of the dynamic model of Lactococcus lactis were estimated from the experimental data using the GEPASI simulation tool. The results of this project can be used to complement and improve research on the simulation of microbial metabolism and to improve the production of bioproducts.
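The general idea of fitting kinetic parameters to time-course data can be shown with a deliberately small example. The sketch below does not use GEPASI and is not the Lactococcus lactis model: it assumes a single Michaelis-Menten step, explicit Euler integration, a coarse grid search, and synthetic "observations" generated from known parameter values.

```python
def simulate(vmax, km, s0, dt, steps):
    """Integrate dS/dt = -vmax * S / (km + S) with the explicit Euler method."""
    s = s0
    trajectory = [s]
    for _ in range(steps):
        s += dt * (-vmax * s / (km + s))
        trajectory.append(s)
    return trajectory


def sum_squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def estimate(observed, s0, dt, vmax_grid, km_grid):
    """Return the (vmax, km) pair on the grid that best reproduces the data."""
    steps = len(observed) - 1
    best = min(
        (sum_squared_error(simulate(v, k, s0, dt, steps), observed), v, k)
        for v in vmax_grid
        for k in km_grid
    )
    return best[1], best[2]


# Synthetic substrate time course (assumed true values: vmax = 2.0, km = 0.5).
observed = simulate(2.0, 0.5, 10.0, 0.01, 500)

print(estimate(observed, 10.0, 0.01,
               vmax_grid=[1.0, 1.5, 2.0, 2.5],
               km_grid=[0.25, 0.5, 1.0]))  # (2.0, 0.5)
```

With real measurements the data are noisy and the model has many parameters, so dedicated simulation tools replace the grid search with numerical optimizers; the structure (simulate, compare with data, adjust parameters) stays the same.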

5. Calcium Signaling Systems in the Mouse Heart

This involves bioinformatic and genomic studies to identify and characterize all genes related to Ca2+ signaling processes in the mouse heart. In order to build up a protein system structure for Ca2+ signaling in the mouse heart, both known and unknown genes should be considered. By UniGene-clustering and searching a gene expression database, approximately 1,000 genes highly expressed in the mouse heart have been identified. Gene ontology annotation has allowed us to classify the identified mouse heart genes by their biological processes, molecular functions and cellular locations. To discover novel genes expressed in mouse heart tissue, a database with tissue-specific has been analyzed. The UniGene database, combined with other computational bioinformatics databases, provides a great deal of information for predicting the tissue specificity of gene expression, genomic nature, and the structure and function of novel gene products. Out of 13 mouse heart libraries deposited in the UniGene database at the NCBI, we focused on the adult mouse cardiac muscle library (Library 8901), consisting of 827 UniGene entries. Our classification of these gene entries revealed that 671 entries are known genes (previously named or assigned potential functions) and the other 156 entries are potentially novel genes (unnamed genes with unknown or unassigned function). Our study is therefore directed at these novel gene candidates. Our initial investigation is to determine whether these candidates are genuine novel genes with significant and evident expression in cardiac muscle, employing various expression analysis methods.

Building Up a Genetic Network for Calcium Signaling by Gene Expression Profiling

Assessment of the genetic network for Ca2+ signaling is fundamentally important to an understanding of the etiology of cardiac diseases associated with abnormal Ca2+ signaling.
Our study aims to formulate the genetic network that regulates the expression of Ca2+ signaling genes by an in silico promoter analysis and a large-scale gene expression analysis using DNA chips. This study observes how Ca2+ signaling genes are transcriptionally controlled when they are genetically manipulated or pharmacologically treated.


To elucidate functional correlations between the identified target genes and to compensate for the high rate of false positives found in various clustering methods, an integrative expression profile analysis was adopted. As the initial approach, approximately 1,000 genes highly expressed in the mouse heart were identified by UniGene-clustering and searching of the GEO gene expression database. A variety of functional modules in the mouse heart genes, especially those related to the Ca2+ signaling toolkit (CSTK) genes [2], have been identified from various cDNA-microarray data by a composite analysis of bi-clustering and a 2nd order analysis. In this way, thousands of unique or redundant functional clusters were found, and they are under active characterization. All the resulting clusters can be accessed at “HCNet,” a database of heart and calcium functional network (http://sbrg2.gist.ac.kr/hcnet). The identification of transcription factor binding sites in the promoters of CSTK genes using a promoter analysis program called “TFExplorer” is also being planned. In a preliminary analysis, we could identify regulatory relationships between the Ca2+ signaling target genes and transcription factors by applying TFExplorer to the promoter analysis of CSTK genes that are co-regulated with potential cognate transcription factors. The results of promoter analysis are integrated with DNA chip data to build a reliable Ca2+ genetic network. This approach is expected to give much more accurate results than constructing the genetic network from gene expression data alone. The network will be verified experimentally by chromatin immunoprecipitation assays. The genetic network for cellular Ca2+ handling will allow a systematic analysis of the genes that execute and regulate Ca2+ signaling and, eventually, the identification of therapeutic strategies.
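One simple way to picture the integration of promoter analysis with expression data is to keep a co-expression edge only when the two genes also share a predicted transcription factor binding site. The sketch below illustrates that filtering idea only; it is not the HCNet or TFExplorer procedure, and the gene names, transcription factor names, expression values and the 0.8 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def supported_edges(expression, tf_sites, threshold=0.8):
    """Keep gene pairs that are strongly correlated AND share a predicted TF site."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        shared = tf_sites.get(g1, set()) & tf_sites.get(g2, set())
        if abs(r) >= threshold and shared:
            edges.append((g1, g2, round(r, 3), sorted(shared)))
    return edges


# Hypothetical expression profiles over four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
# Hypothetical predicted binding sites in each promoter.
tf_sites = {
    "geneA": {"TF1"},
    "geneB": {"TF1", "TF2"},
    "geneC": {"TF2"},
}

for edge in supported_edges(expression, tf_sites):
    print(edge)
```

In this toy data geneA and geneC are perfectly anti-correlated but share no predicted site, so that pair is dropped; this is the sense in which a second, independent line of evidence can reduce false positives from expression data alone.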
Genetic Manipulation of the Ca2+ Signaling Proteins Using Lentiviral Vector-mediated Gene Manipulation.

The development of DNA microarray and proteomics technology provided the grounds for the development of modern functional genomics. To study how a genome works, we need a suitable method for selectively inducing and silencing the expression of each individual gene. Recently, the discovery of RNA interference (RNAi) technology has opened up a completely new realm of research on the functioning of particular mammalian genes. Our research team first defined a subset of genes (22 genes) whose expression might be involved in calcium signaling in cardiac myocytes. To examine the roles of these genes in Ca2+ signaling, we are establishing clonal lines of the mouse cardiac myocyte HL-1 cell in which the Ca2+ signaling-related gene is down-regulated by RNAi technology. To overcome some limitations of siRNA-mediated gene silencing experiments, we have designed a lentiviral vector system capable of expressing siRNA. We have found that the lentivirus system can express integrated siRNA efficiently in a wide variety of cell lines, including the HL-1 cell line, and in primary cells, both in vitro and in vivo.


More importantly, it has been demonstrated that a lentivirus vector system capable of expressing siRNA can efficiently transduce in vivo and even preimplantation mouse embryos. Furthermore, the resulting progeny expressed siRNA and showed reduced expression of specific genes. Therefore, the lentiviral vector system expressing siRNAs that silence the expression of the calcium signaling-related gene is thought to be very useful for identifying the molecular mechanisms of calcium signaling in mouse cardiac myocytes in vitro and in vivo.

Both Qualitative and Quantitative Analysis of Proteins and Metabolites Involved in the Ca2+ Signaling Processes.

Due to recent breakthroughs in ionization techniques, mass spectrometry has become the method of choice for protein analysis. Shotgun proteomics is a combined method of multidimensional LC separation and tandem mass spectrometry (MS/MS). Instead of intact proteins, it analyzes proteolytically digested peptides, based on their tandem mass spectrometric data, to infer the amino acid sequences of individual peptides. SEQUEST, a protein database-searching program compatible with the shotgun proteomics approach, matches a peptide tandem mass spectrum to an amino acid sequence within the protein database. Compared to a more conventional proteomic research tool (2D-PAGE/MALDI TOF MS), the shotgun proteomics approach is more sensitive and faster, and it can be fully automated. The ultimate goal of this approach is not only to discover the existence of proteins, but also to observe the dynamic changes of proteins under diverse biological conditions. Recent advances in biological mass spectrometry have resulted in the development of numerous strategies for the large-scale quantification of protein expression levels within cells. These measurements of protein expression are most commonly accomplished through the differential incorporation of stable isotopes into cellular proteins.
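Relative quantification from stable isotope labeling is often summarized as a heavy/light intensity ratio per peptide, combined into one value per protein. The sketch below shows one common way to do this (the median of log2 ratios); it is an assumption for illustration rather than the project's pipeline, and the protein names and intensities are made up.

```python
from math import log2
from statistics import median


def protein_log_ratios(peptides):
    """Summarize peptide-level heavy/light ratios as a median log2 ratio per protein.

    `peptides` is a list of (protein, light_intensity, heavy_intensity) tuples.
    Peptides with a missing (non-positive) intensity are skipped.
    """
    by_protein = {}
    for protein, light, heavy in peptides:
        if light > 0 and heavy > 0:
            by_protein.setdefault(protein, []).append(log2(heavy / light))
    return {protein: median(values) for protein, values in by_protein.items()}


# Hypothetical peptide measurements (arbitrary intensity units).
peptides = [
    ("P1", 100.0, 200.0),
    ("P1", 50.0, 100.0),
    ("P1", 80.0, 640.0),   # outlier peptide; the median limits its influence
    ("P2", 400.0, 100.0),
    ("P2", 0.0, 120.0),    # light channel missing, so this peptide is skipped
]

print(protein_log_ratios(peptides))  # {'P1': 1.0, 'P2': -2.0}
```

A log2 ratio of 1.0 means the protein is about twofold more abundant in the heavy-labeled sample, and -2.0 means about fourfold less abundant.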
The relative and absolute protein quantification strategies in mass spectrometry will be used for the precise determination of Ca2+ signaling protein profiles. The simplicity and sensitivity of the mass spectrometric methods, coupled with the widespread availability of the shotgun proteomics technique, will make our approach a highly useful research tool for understanding the Ca2+ signaling network in the cell. In addition, capillary electrophoresis (CE), CE-MS, and MALDI-TOF-MS will be used for metabolic profiling to construct a metabolic network structure during the conversion to the hypertrophied myocyte.

Building-up Protein Interaction Network for Ca2+ Signaling in Mouse Heart.

We have constructed an initial model among the primary target proteins based on published literature. We are elaborating the model using various biochemical and biophysical techniques. The primary target proteins have been cloned into various expression vectors such as the pET and pGEX series. Expressible clones were characterized in a small-scale protein expression system at different temperatures, and the proteins were then purified using various chromatographic techniques. The purified proteins serve as a source for analyzing protein-protein interactions for ECC and


discovering new proteins that associate with or are regulated by the primary target proteins. Protein-protein interactions were identified using analytical techniques such as isothermal titration calorimetry (ITC), surface plasmon resonance (SPR) and the GST pull-down assay. The structures of complexes of interacting molecules are determined using 3D structure modeling and possibly X-ray crystallography. 3D structure modeling will identify the best matches between two molecules that bind each other by simulating interacting surfaces and minimizing free energy at the domain level. The protein interaction network in ECC will be built by computer modeling together with experimental analysis. The network model can help in understanding the integrated function of ECC.

Spatiotemporal Dynamics of Ca2+ Signaling.

Advances in molecular genetics and biochemistry have led to the identification of many new signaling molecules and interactions between them, as documented in the elaborate signaling maps that are currently under development. These maps, however, do not take into account the spatial and structural aspects of these signaling pathways, which are very important in real cells. Understanding these pathways and the mechanisms of signal propagation in cells will require the measurement of many signaling reactions with high spatial and temporal resolution. Most cells are small and the concentration of signaling molecules is generally low, so these measurements require both considerable magnification and sensitivity. The most widely used detection methods are therefore based on fluorescence microscopic imaging techniques. Because many signaling reactions involve proteins that undergo reversible conformational changes and/or form complexes with other proteins, the fluorescence resonance energy transfer (FRET) method will be used in the proposed study to identify the interactions between the molecules.
Discovery of Novel Functions in Cardiac Ca2+ Signaling. As one of the important research projects for evaluating Ca2+ signaling systems in the mouse cardiac myocyte, we are currently attempting to characterize the physiological functions of novel proteins that we have recently identified. This is done by evaluating the spatiotemporal properties of Ca2+ release patterns in a cardiac cell line or mouse heart in which the corresponding genes are knocked down or overexpressed. To measure the dynamic properties of global and local Ca2+ releases in intact myocytes, we use a real-time confocal Ca2+ imaging technique combined with quantification tools. In addition, we use other optical, electrical and pharmacological tools to examine basic physiological functions of cardiac cells, such as cell contractility, beating rate, hormonal regulation, and membrane permeability (functions of ion channels or transporters). Finally, the quantitative data obtained from the experiments provide practical information for establishing a fine network of the Ca2+ regulatory proteins and a cellular model under normal and failing heart conditions.
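As an illustration of what quantification of a Ca2+ imaging trace can involve, the Python sketch below normalizes a fluorescence trace to its resting baseline (the widely used ΔF/F0 convention) and reports the peak of the transient. This is an assumed, generic analysis step rather than the specific quantification tools used in the project; the function names and the sample trace are hypothetical.

```python
from statistics import mean


def delta_f_over_f0(trace, baseline_frames):
    """Normalize a fluorescence trace as (F - F0) / F0.

    F0 is the mean of the first `baseline_frames` samples, which are
    assumed to be recorded before the Ca2+ release event.
    """
    if baseline_frames <= 0 or baseline_frames > len(trace):
        raise ValueError("baseline_frames must be between 1 and len(trace)")
    f0 = mean(trace[:baseline_frames])
    if f0 <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return [(f - f0) / f0 for f in trace]


def peak_amplitude(normalized):
    """Return (peak value, frame index of the peak) of a normalized trace."""
    peak = max(normalized)
    return peak, normalized.index(peak)


if __name__ == "__main__":
    # Hypothetical raw intensities (arbitrary units), one value per frame.
    raw = [100, 100, 100, 150, 300, 200, 120, 100]
    normalized = delta_f_over_f0(raw, baseline_frames=3)
    amplitude, frame = peak_amplitude(normalized)
    print(f"peak dF/F0 = {amplitude:.2f} at frame {frame}")
```

Normalizing to baseline makes transients comparable between cells with different dye loading, which matters when comparing knockdown or overexpressing cells with controls.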


Modeling of the Complex Dynamic Systems. For systems biologists, the ultimate goal is to investigate the dynamical properties of biological systems that have multi-active interactions between their molecular components, and to identify new therapeutic strategies to cure related diseases. However, most mathematical modeling has been restricted to a specific part of the whole bio-molecular network, and it is now widely agreed that this is not adequate for that purpose. Biological systems, including the cardiac Ca2+ signaling system, are in general composed of complex hierarchical structures ranging from genes to proteins. Hence, for the Calcium Signaling Systems Project, a hierarchical platform model (Fig. 2) of the Ca2+ signaling system will be constructed by integrating models of each bio-molecular layer. The focus will be particularly on CaMKII (calcium/calmodulin-dependent protein kinase II), which is known to play a crucial role in the cardiac myocyte: in response to Ca2+ signals, CaMKII phosphorylates several important CSTK proteins such as the ryanodine receptor, L-type Ca2+ channel, phospholamban and SERCA (sarco(endo)plasmic reticulum Ca2+-ATPase), and regulates a variety of cell functions such as cell proliferation and development. CaMKII is also known to be associated with several cardiac diseases such as heart failure, hypertrophy, and arrhythmias. Eventually, we could find new therapeutic strategies for such heart diseases by analyzing the role of CaMKII in a cardiac cell using the hierarchical platform model (Fig. 2) we are developing.

Fig. 2. A schematic diagram of the cardiac calcium signaling system. This illustrates the hierarchical platform model we are developing.
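The idea of coupling models of different layers can be sketched with a deliberately simplified toy example in Python. Here a prescribed periodic Ca2+ transient (one layer) drives the active fraction A of CaMKII (a second layer) through dA/dt = k_on * H(Ca) * (1 - A) - k_off * A, where H is a Hill function. This is not the hierarchical platform model described above: the equations, all parameter values and the function names are assumptions chosen only for illustration.

```python
import math


def calcium_transient(t, period=1.0, rest=0.1, amplitude=1.0, decay=0.2):
    """Toy periodic Ca2+ signal (uM): a jump at each beat, then exponential decay."""
    phase = t % period
    return rest + amplitude * math.exp(-phase / decay)


def hill(ca, k_half=0.5, n=4.0):
    """Hill-type activation term, between 0 and 1."""
    return ca ** n / (k_half ** n + ca ** n)


def simulate_camkii(duration=10.0, dt=0.001, k_on=5.0, k_off=1.0, **ca_kwargs):
    """Integrate the active CaMKII fraction with the forward Euler method.

    Returns the list of active fractions, one value per time step.
    """
    steps = int(round(duration / dt))
    active = 0.0
    history = []
    for i in range(steps):
        ca = calcium_transient(i * dt, **ca_kwargs)
        d_active = k_on * hill(ca) * (1.0 - active) - k_off * active
        active += dt * d_active
        history.append(active)
    return history


if __name__ == "__main__":
    weak = simulate_camkii(amplitude=0.2)    # small Ca2+ transients
    strong = simulate_camkii(amplitude=1.0)  # large Ca2+ transients
    print(f"final active fraction, weak signal:   {weak[-1]:.3f}")
    print(f"final active fraction, strong signal: {strong[-1]:.3f}")
```

Even in this toy form, the output of one layer (the Ca2+ signal) becomes the input of another (kinase activation), which is the kind of coupling a hierarchical model has to represent at much greater detail.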

Future Perspectives

Up to now, the Systems Biology Project has mainly focused on individual X-omics fields. However, the effective integration of various X-omics data is becoming more important for accurately understanding the characteristics of organisms from these data. We have developed methodologies for mining knowledge from combinations of the different fields. The knowledge mining methodology aims at integrating a wide range of in vitro data, metabolic flux models, strategies and software for building a virtual cell model. The virtual cell model will be continuously verified and upgraded with feedback from comparing and analyzing accumulated transcriptome, proteome and fluxome data. In addition, the virtual cell model will be generalized through communication with the database and exchanged in the format of a standard XML-based modeling language.

In the case of the Calcium Signaling Systems Project, comprehensive knowledge of the novel genes is to be obtained at both the gene and protein level. This includes various in vitro and in silico analyses. Subsequently, we will focus on determining the relationships between the novel genes and Ca2+ signaling. A proposed approach is to evaluate the expression profiles of the novel genes in cell or animal models with altered calcium signaling. Collectively, the full exploitation of the novel genes with cardiac expression will provide a firm basis for further characterization of the gene network underlying the Ca2+ signaling system in this tissue.


Finally, researchers can use the virtual model of cells and processes to predict physiological behaviors effectively under various genetic and environmental perturbations.

International Collaborations through Electronic International Molecular Biology Laboratory (eIMBL)-Systems Biology Laboratory (SBL)

A multinational network for collaborations in systems biology in the Asia-Pacific Rim (eIMBL-SBL) started in November 2005. The organization has been approved by the APEC summit meeting and organized by the International Molecular Biology Network (IMBN). The center of eIMBL-SBL is currently located at GIST, Gwangju, Korea. The future agendas of eIMBL-SBL include training of young scientists, exchange of knowledge and resources, and collaborations in the rapidly growing field of systems biology through a web/grid-based infrastructure (www.eimbl.org).

Acknowledgements

We’d like to thank Drs. Chung Hee Cho, Soo Hyun Eom, Zee Yong Park, Kwang-Hyun Cho, Sun-Hee Woo, Young-il Yeom, Ki-sun Kwon, Yeon Soo Kim, Young Sook Yoo, Seong-Hwan Rho, Mr. Jin Sik Kim, Seung Bum Sohn, Hyung Seok Choi, Han Min Woo, Hong Seok Yun, and Tae Young Kim for their contributions in preparing this manuscript. Our work described in this paper has been supported by the Korean Systems Biology Research grant from the Korean Ministry of Science and Technology, and the Brain Korea 21 Project. SYL also thanks the additional support received through the LG Chem Chair Professorship and IBM SUR program.


References

1. Berridge et al. (2000) Nat. Rev. Mol. Cell Biol. 1:11-21

2. Barrett et al. (2005) Nucleic Acids Res. 33:D562-D566

3. Hong, S.H., Kim, J.S., Lee, S.Y., In, Y.H., Choi, S.S., Rih, J.K., Kim, C.H., Jeong, H., Hur, C.G. & Kim, J.J. (2004) The genome sequence of the capnophilic rumen bacterium Mannheimia succiniciproducens. Nat. Biotechnol. 22:1275-1281

4. Hou, B.K., Kim, J.S., Jun, J.H., Lee, D.Y., Kim, Y.W., Chae, S., Roh, M., In, Y.H. & Lee, S.Y. (2004) BioSilico: an integrated metabolic database system. Bioinformatics 20:3270-3272

5. Hucka, M. et al. (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19:524-531

6. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. & Hattori, M. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res. 32:D277-D280

7. Keseler, I.M., Collado-Vides, J., Gama-Castro, S., Ingraham, J., Paley, S., Paulsen, I.T., Peralta-Gil, M. & Karp, P.D. (2005) EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res. 33:D334-D337

8. Lee, S.Y., Lee, D.-Y. & Kim, T.Y. (2005) Systems biotechnology for strain improvement. Trends Biotechnol. 23:349-358

9. Lloyd, C.M., Halstead, M.D. & Nielsen, P.F. (2004) CellML: its future, present and past. Prog. Biophys. Mol. Biol. 85:433-450

10. Mendes, P. (1997) Biochemistry by numbers: simulation of biochemical pathways with Gepasi 3. Trends Biochem Sci. 22:361-363

11. Nam, J.W., Han, K.H., Yoon, E.S., Shin, D.I., Jin, J.H., Lee, D.H., Lee, S.Y. & Lee, J. (2004) In silico analysis of lactate producing metabolic network in Lactococcus lactis. Microb. Technol. 35:654-662

12. Wiechert, W. (2002) Modeling and simulation: tools for metabolic engineering. J. Biotechnol. 94:37-63

13. Webb, K. & White, T. (2005) UML as a cell and biochemistry modeling language. Biosystems 80:283-302

Contact Details:
Contact Person: Professor Sang Yup Lee
Address: Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong Yuseong-gu, Daejeon 305-701, Republic of Korea.
Email: [email protected]
