Biostatistics and Modeling Biochemical Pathways

Description of Module

Subject Name: Biochemistry

Paper Name: 13 Biostatistics and Bioinformatics

Module Name/Title: 16 Modeling Biochemical Pathways

Dr. Vijaya Khader Dr. MC Varadaraj

Biostatistics and Bioinformatics Biochemistry Modeling Biochemical Pathways

1. Objectives: In this module, the students will understand

1. Biochemical Pathways and Pathways databases
2. Biochemical Pathway models and models databases
3. Simulating Biochemical Pathways
4. Modeling Biochemical Pathways using COPASI

Concept Map: Brief Description, Biochemical Pathways, Biochemical Pathway Models, Simulating Biochemical Pathway, Modeling Biochemical Pathway, Summary

2. Brief Description

Dear students, biochemistry is largely the study of enzyme-catalysed reactions. The product of one enzyme-catalysed reaction may serve as the substrate of a second enzyme-catalysed reaction, which converts it into a new product. The product of the first reaction may also bind to the first enzyme and drive the first reaction backward to regenerate the first substrate. Further, the product of the second reaction may become the substrate of a third enzyme-catalysed reaction, and so on. In this way, various substrates and products (collectively known as metabolites) are chained together in series to yield what we call metabolic pathways. A metabolic pathway occurring within a cell is therefore a series of enzyme-catalysed chemical reactions that starts with a substrate and proceeds through intermediate metabolites to yield the final end product. The rate of turnover of these metabolites in a metabolic pathway is called flux, or metabolic flux.

In addition, the end product of one pathway may be used immediately to initiate another metabolic pathway. All the metabolic pathways of a cell therefore form an elaborate network of interconnected pathways, collectively known as biochemical processes. The biochemical processes in the cell achieve biochemical phenotypes, such as meeting energy requirements, synthesizing building blocks for growth, undergoing cell division, or even becoming a dormant cell such as a spore.

The metabolite flux of biochemical pathways is vital for the survival of the cell under different conditions. Information about an enzyme-catalysed reaction is collected through on-bench experiments and then used to develop a kinetic model of the enzyme, which is stored in reaction kinetics databases. The in vitro kinetic properties of the constituent enzymes of a biochemical pathway may be gathered from the published literature and from reaction kinetics databases, and then used to develop a model of the pathway. If the in vitro kinetic properties of some constituent enzymes are not available, on-bench experiments may be conducted to measure them for the remaining enzymes. The in vivo behaviour of the pathway can then be understood in terms of the developed model. These models are deposited in databases, from which other users can download them and simulate the pathways using simulation software. Consequently, once a model is available in a database, it may be used to simulate the metabolic pathway and predict the behaviour of a biochemical process without the need to conduct further on-bench experiments.

Therefore, the objectives of this module are to learn about biochemical pathways and the databases that store information about pathways. This will be followed by biochemical pathway models and the databases that store pathway models. A downloaded pathway model will then be simulated using COPASI. Finally, we will learn to use COPASI for modeling biochemical pathways to gain insight into biochemical processes and phenotypes.

2.1. Metabolic Pathways

Glycolysis was the first metabolic pathway to be discovered; the cell uses it to meet its energy requirements. A metabolic pathway represents a series of metabolites transformed from the initial substrate, which we call the source metabolite (glucose-6-phosphate in glycolysis), to the end product of the pathway, called the sink metabolite (pyruvate). Technically, all enzyme-catalysed reactions are reversible. However, some reactions in a pathway are so strongly exergonic that they are effectively irreversible under cellular conditions. Two reactions in glycolysis, one catalysed by phosphofructokinase and the other by pyruvate kinase, are of this kind. These two reactions make glycolysis an irreversible pathway.

In addition to metabolic pathways, there are pathways that involve group transfer reactions among several species, say several proteins. For example, a protein may receive a phosphoryl group, which is subsequently transferred to another protein. This transfer may be repeated to pass the phosphoryl group on to yet another protein. Such a chain of phosphoryl transfers in series is called a group transfer pathway. Metabolic pathways and group transfer pathways are collectively known as biochemical pathways.

The source metabolite (the first substrate) and the sink metabolite (the end product) of the pathway are called boundary metabolites. The metabolites from the substrate of the second reaction to the product of the second-last reaction are called intermediates. The source, the sink and the intermediates are collectively known as metabolites. The first step in describing a biochemical pathway is to identify the start reaction and the end reaction, which define the boundaries of the pathway. In the glycolysis pathway, the reactions catalysed by glucosephosphate isomerase and pyruvate kinase may be used to define these boundaries.
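The terminology above can be made concrete with a short Python sketch. It assumes a strictly linear pathway written as an ordered list of (substrate, product) pairs; the function name `classify_metabolites` and the metabolite names A to D are hypothetical placeholders, not part of any database or tool discussed in this module.

```python
def classify_metabolites(reactions):
    """Split the metabolites of a linear pathway into source, intermediates and sink.

    `reactions` is an ordered list of (substrate, product) pairs in which the
    product of each reaction is the substrate of the next one (an assumption
    of this sketch; branched pathways are not handled).
    """
    source = reactions[0][0]   # first substrate: a boundary metabolite
    sink = reactions[-1][1]    # end product: the other boundary metabolite
    # Products of every reaction except the last one are the intermediates.
    intermediates = [product for _, product in reactions[:-1]]
    return {"source": source, "intermediates": intermediates, "sink": sink}


# Hypothetical three-step pathway: A -> B -> C -> D
toy_pathway = [("A", "B"), ("B", "C"), ("C", "D")]
roles = classify_metabolites(toy_pathway)
print(roles)
```

Here A and D are the boundary metabolites, while B and C are the intermediates.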

The KEGG database (Kyoto Encyclopedia of Genes and Genomes) is a comprehensive resource of known biochemical pathways. Visit KEGG at http://www.kegg.jp/kegg/pathway.html and search for glycolysis by entering “glycolysis” and clicking the Go button.

However, a better way to search the KEGG database is to search for a particular organism. Let us search the database for, say, “Saccharomyces cerevisiae”. To do so, click on the Organism button. A new window will open, as shown next. Start typing the name of the organism and select from the suggestions offered in the list. For the present example, select the third organism and click the “Select” button.

Alternatively, in the Organism text box, simply enter the first letter of the genus followed by the first two letters of the species. For “Saccharomyces cerevisiae”, enter ‘s’ for Saccharomyces and ‘ce’ for cerevisiae, i.e. sce. After selecting the organism, enter the keyword “glycolysis” and click the “Go” button.
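The naming rule just described (first letter of the genus plus the first two letters of the species) can be sketched in Python. The helper name `kegg_style_code` is hypothetical, and the rule is only the rule of thumb given above: not every KEGG organism code follows it, so the result should be checked against the suggestions offered by the KEGG Organism dialog.

```python
def kegg_style_code(binomial_name):
    """Build a three-letter code from a 'Genus species' name.

    Rule of thumb only: first letter of the genus followed by the first two
    letters of the species, in lower case.
    """
    genus, species = binomial_name.split()[:2]
    return (genus[0] + species[:2]).lower()


code = kegg_style_code("Saccharomyces cerevisiae")
print(code)  # sce
```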

This will display the search results in the browser window. Click on the first map, i.e. the thumbnail image.

This will open the pathway map for glycolysis in “Saccharomyces cerevisiae”.

KEGG pathways connect metabolites in reactions using Enzyme Commission (EC) numbers. The EC numbers on the pathway map are displayed in rectangles, such as 5.3.1.9 for glucosephosphate isomerase. The metabolites are displayed as small circles labeled with their names. Hovering the mouse over a metabolite circle displays its structure, as shown for beta-D-Fructose 6-phosphate, the product of enzyme 5.3.1.9.

Different pathways are interconnected to form a network of pathways. When the KEGG pathways database is searched for a particular organism, the enzyme numbers are mapped onto the known genes in the genome of the target organism. Mapped EC numbers are highlighted with a green background in their rectangles. This helps in understanding the biochemical processes in the target organism in terms of the mapped genes of a pathway. To reach the target gene entry, click on a highlighted rectangle, say 5.3.1.9.

This shows the mapping of the glucose-6-phosphate isomerase enzyme to the gene ‘YBR196C’ in the genome of “Saccharomyces cerevisiae” (i.e. budding yeast). In this way, one can reach the gene identifier in the genome and use it to browse the gene in the genome browser, as described in Module three, “Molecular Sequence Databases”.

2.2. Biochemical Pathway Models

Modeling of biochemical pathways allows pathways to be simulated while manipulating variables such as the initial concentrations of metabolites. Such manipulation predicts the behavior of the biochemical pathway without working with real cells. The predictions from a model therefore provide an easy way to test hypotheses before actually manipulating pathways within real cells, and they help in understanding and visualizing how a pathway may be manipulated in a desired way. With a biochemical pathway model, one can visualize the metabolic flux as the levels of metabolites vary during the operation of the pathway. Consequently, one can obtain the levels of metabolites at different times during the operation of the pathway. This is known as the time course behavior of a biochemical pathway. In addition, simulation of biochemical pathways allows the steady state levels of the various metabolites in the pathway to be calculated.

BioModels Database is a repository of computational models of biological processes, including biochemical pathways. At BioModels, models described in the scientific literature are manually curated and enriched with cross-references.
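The idea of a time course described in this section can be illustrated with a minimal Python sketch. It is not one of the BioModels models: it assumes a hypothetical two-step pathway S -> I -> P with simple mass-action kinetics, made-up rate constants `k1` and `k2`, and a basic forward Euler integration, which is far cruder than the solvers used by real simulation software.

```python
def simulate_chain(s0, k1, k2, duration, steps):
    """Forward Euler time course of the toy pathway S -> I -> P.

    v1 = k1 * S and v2 = k2 * I are the fluxes through the two reactions.
    Returns a list of (time, S, I, P) tuples, one per time point.
    """
    dt = duration / steps
    s, i, p = s0, 0.0, 0.0
    course = [(0.0, s, i, p)]
    for step in range(1, steps + 1):
        v1 = k1 * s            # flux through the first reaction
        v2 = k2 * i            # flux through the second reaction
        s += -v1 * dt          # S is only consumed
        i += (v1 - v2) * dt    # I is produced by v1 and consumed by v2
        p += v2 * dt           # P is only produced
        course.append((step * dt, s, i, p))
    return course


# Hypothetical values chosen only for illustration.
course = simulate_chain(s0=1.0, k1=2.0, k2=1.0, duration=2.0, steps=4000)
for t, s, i, p in course[::1000]:
    print(f"t={t:.1f}  S={s:.3f}  I={i:.3f}  P={p:.3f}")
```

Each tuple in `course` is one point of the time course: the source is depleted, the intermediate rises and then falls, and the product accumulates, while the total amount of material stays constant.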

As of 30th June, 2016, more than 40% of the models taken from the published literature (i.e. 612) had been manually curated. The manually curated models are fully compliant with the MIRIAM standard (Minimum Information Required In the Annotation of Models). Checking compliance with MIRIAM includes checking the behavior of the model during simulation. In addition, the dynamic behaviors are verified with computational tools such as the COmplex PAthway SImulator (COPASI). A further 873 models were awaiting manual curation. These models were published in journals of very high international impact. In addition to the curated models, the computationally generated models available include 112,898 metabolic, 27,531 non-metabolic and 2,641 whole genome metabolism models.

Visit BioModels at https://www.ebi.ac.uk/biomodels-main/, enter “Glycolysis” and click the Search button.

The ensuing page shows the search results.

The first category lists 69 manually curated models.

In addition, 223 non-curated models were found.

Further, 2 models under curation were also returned.

The list ended with 1000 models returned from the “Path2Models” repository.

The “Path2Models” repository houses models generated automatically from pathways in the KEGG database. To summarize, this search lists 69 manually curated models for the keyword glycolysis. Let us use the seventh model in the above list.

Click the ‘BioModels ID’ of the entry “BIOMD0000000064” to download this model. This leads to the Download SBML page.

Hover the mouse over the button to expand the list of available formats.

Select the format, i.e. SBML Level2 Version1 (curated), to download the model, and save the file with the suggested file name “BIOMD0000000064.xml”. In this manner, users can download a required model in SBML format for use with a simulation software package.
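Because an SBML file is plain XML, its contents can be inspected before it is loaded into a simulator. The sketch below uses only Python's standard `xml.etree.ElementTree` module on a tiny hand-written SBML-like fragment; the fragment, its identifiers and its concentrations are hypothetical and are not taken from BIOMD0000000064. Matching on local tag names is a simplification chosen here so that the XML namespace does not have to be known in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal SBML-like document used only for illustration.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="G6P" compartment="cytosol" initialConcentration="1.0"/>
      <species id="F6P" compartment="cytosol" initialConcentration="0.2"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="PGI" reversible="true"/>
    </listOfReactions>
  </model>
</sbml>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_sbml(root):
    """Return the ids of all <species> and <reaction> elements under `root`."""
    species, reactions = [], []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "species":
            species.append(element.get("id"))
        elif name == "reaction":
            reactions.append(element.get("id"))
    return species, reactions


root = ET.fromstring(SBML_TEXT)
species, reactions = summarize_sbml(root)
print("species:", species)
print("reactions:", reactions)
```

For the downloaded file, `ET.parse("BIOMD0000000064.xml").getroot()` could be passed to `summarize_sbml` instead, assuming the file sits in the working directory.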

2.3. Simulating Biochemical Pathways

Biochemical pathway models saved in SBML can be simulated using COPASI (COmplex PAthway SImulator) software package.

Run COPASI. The default units for Time, Quantity and Volume will be displayed.

From the File menu, import the downloaded model using the Import SBML command, and select the saved file “BIOMD0000000064.xml”. This will open the model that was downloaded in the last section. It is the model from the publication “Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing Biochemistry”. The opening working space panel displays general information about the units and the publication describing the model. The units for Time, Quantity and Volume have been changed to min, mmol and l, respectively, as imported with the model. Expand the COPASI nodes in the navigation tree panel on the left with a single click on the node or a double click on the node name.

In the navigation tree view on the left, expand the nodes ‘Model’ and then ‘Biochemical’. Select the “Compartments” node. This will display the compartments with their associated properties.

Now select the “Species” node to display the metabolites and associated modifiers. The initial concentrations of the 26 species, i.e. metabolites, are displayed.

Now select the “Reactions” node to display the 17 reactions in the pathway.

Now select the “Parameter Overview” node to display the model parameters in a single window.

Let us say that during the ‘Time Course’ simulation, the user is interested in visualizing the time course levels of Glucose-6-phosphate and Pyruvate. To display these metabolites during the ‘Time Course’ simulation, open the ‘Output Specifications’ node in the navigation tree panel and double click the ‘Plots (0)’ node, followed by a double click on ‘New Plot’.

This will open Plot_1 in the working panel space.

Now click the “New Curve” button. In the left panel of the dialog box, expand the Time node for the “X-axis” and select Model Time to display model time on the x-axis.

Now, in the right panel, under the Species node and the Transient Concentrations sub-node, select Glucose-6-phosphate and then click Pyruvate while holding down the Control key. This selects both species for display on the y-axis of the plot. Click the “OK” button. In the working space panel, select Glucose-6-phosphate, click Pyruvate in the list while holding down the Control key, and increase the width of the curves to be plotted to 5.0.

Now click the “Commit” button at the bottom left of the working panel.

In the navigation tree, expand the “Tasks” node and select “Time Course”. Set the time course parameters, i.e. a duration of 2 min, an interval size of 0.0005 min and 4000 intervals.
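The three time course settings are linked by simple arithmetic: the interval size is the duration divided by the number of intervals. The Python sketch below checks the values quoted above; the function name `time_grid` is hypothetical and only illustrates the relationship, it is not part of COPASI.

```python
def time_grid(duration, intervals):
    """Return the interval size and the list of output time points.

    A time course with N intervals has N + 1 time points, because both the
    start (t = 0) and the end (t = duration) are included.
    """
    interval_size = duration / intervals
    points = [k * interval_size for k in range(intervals + 1)]
    return interval_size, points


interval_size, grid = time_grid(duration=2.0, intervals=4000)
print(interval_size)  # 0.0005
print(len(grid))      # 4001
```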

Now click the “Run” button at the bottom left of the working space panel.

The time course over two minutes shows how the levels of Glucose 6 Phosphate and pyruvate change.

To alter the initial concentrations of species interactively, add sliders. Sliders are graphical user interface elements for changing the initial values of modelling parameters, including the volumes of compartments, reaction parameters, initial concentrations and the model initial time, during the ‘Time Course’ simulation. Select the ‘Time Course’ node. Now click the “Show Sliders” button in the toolbar to open the dialog box, and then click the “New Sliders” button. Expand the Species node and select the initial concentrations to be used for the simulation.

First select Glucose 6 Phosphate and then, with the “Ctrl” key pressed, select pyruvate. This will display the ‘Time Course Sliders’ window with the selected sliders to control the initial values, as shown next.

At this stage, simply slide the Glucose 6 phosphate slider to maximum and the pyruvate slider to minimum; the Time Course display will update automatically.

This displays the time course of the chosen metabolites for the two-minute pathway simulation, and we find that it differs from the previous time course simulation. Now slide the Glucose 6 phosphate slider to minimum and the pyruvate slider to maximum; the Time Course display will again update automatically.

This shows that when pyruvate is initially supplied to the glycolytic enzymes, the pathway runs in reverse to synthesize Glucose 6 phosphate.

In addition to time course simulation, the model may be used to estimate the steady state levels of metabolites in the pathway. In the navigation panel, expand the Tasks node, select the “Steady-State” node and click the Run button.

Now display the steady state levels of metabolites by clicking on the Result node. These levels can be compared with the initial levels: the level of Glucose 6 phosphate has increased and that of pyruvate has decreased, showing that the pathway has run in the reverse direction from initial levels of 0 for Glucose 6 phosphate and 10 mM for pyruvate.
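Why a pathway can run backward from such initial levels can be seen in a much simpler setting. The Python sketch below assumes a single reversible mass-action step between a hypothetical upstream and downstream metabolite with made-up rate constants; it is not the yeast glycolysis model and its numbers are not comparable with the COPASI results.

```python
def steady_state(total, k_forward, k_reverse):
    """Steady state of upstream <-> downstream with mass-action kinetics.

    At steady state k_forward * upstream == k_reverse * downstream, and
    upstream + downstream == total.
    """
    upstream = total * k_reverse / (k_forward + k_reverse)
    return upstream, total - upstream


def net_direction(upstream0, downstream0, k_forward, k_reverse):
    """Direction of the net flux at the given initial levels."""
    net_flux = k_forward * upstream0 - k_reverse * downstream0
    if net_flux > 0:
        return "forward"
    if net_flux < 0:
        return "reverse"
    return "at steady state"


# Hypothetical values: nothing upstream, 10 units downstream, equal rate constants.
up_ss, down_ss = steady_state(total=10.0, k_forward=1.0, k_reverse=1.0)
direction = net_direction(0.0, 10.0, k_forward=1.0, k_reverse=1.0)
print(direction, up_ss, down_ss)
```

Starting with only the downstream metabolite, the net flux is in the reverse direction, so the upstream level rises and the downstream level falls until the two fluxes balance.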

2.4. Modeling Biochemical Pathways

We have seen that the glycolysis pathway may run in the reverse direction to produce glucose-6-phosphate from pyruvate. This happens when some alternative source of energy is made available instead of glucose-6-phosphate. Seven of the nine reactions of the glycolysis pathway are reversible under cellular conditions. The other two, catalysed by phosphofructokinase and pyruvate kinase, are thermodynamically irreversible. Therefore, synthesis of glucose-6-phosphate from pyruvate can use the seven reversible reactions, but requires separate enzymes in place of the irreversible reactions catalysed by phosphofructokinase and pyruvate kinase. Within the cell, these two irreversible reactions are replaced by other irreversible reactions that run in the opposite direction. Fructose 1,6-bisphosphatase (FBPase) replaces the reaction catalysed by phosphofructokinase. Two separate enzymes, pyruvate carboxylase and phosphoenolpyruvate (PEP) carboxykinase, replace the reaction catalysed by pyruvate kinase to convert pyruvate into PEP. With these enzymes, the pathway is reversed to synthesize glucose-6-phosphate instead of merely settling at a steady state level. The reversed pathway is known as gluconeogenesis. The result is a different pathway in each direction: glycolysis for the catabolism and gluconeogenesis for the anabolism of glucose-6-phosphate.
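The bypass relationships described above can be recorded as a small lookup table. The Python sketch below simply restates the paragraph as a dictionary; the variable and function names are hypothetical and the structure is only one convenient way to keep track of which reactions have to be added to a glycolysis model.

```python
# Irreversible glycolytic enzyme -> gluconeogenic enzyme(s) that bypass it,
# exactly as listed in the paragraph above.
GLUCONEOGENIC_BYPASSES = {
    "Phosphofructokinase": ["Fructose 1,6-bisphosphatase"],
    "Pyruvate kinase": [
        "Pyruvate carboxylase",
        "Phosphoenolpyruvate carboxykinase",
    ],
}


def enzymes_to_add(bypasses):
    """Flatten the bypass table into the list of reactions to add to the model."""
    return [enzyme for replacements in bypasses.values() for enzyme in replacements]


added = enzymes_to_add(GLUCONEOGENIC_BYPASSES)
print(len(added), "reactions to add:", added)
```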

However, the sharing of reactions between glycolysis and gluconeogenesis is an exception, in which seven enzymes are shared between the two pathways. Now let us add three enzyme-catalysed reactions, for Fructose 1,6-bisphosphatase, Pyruvate Carboxylase and Phosphoenolpyruvate carboxykinase, to the already available model of glycolysis for “Saccharomyces cerevisiae”. To add these reactions to the model, we need the kinetic parameters of these three enzymes. Therefore, visit the enzyme kinetics database BRENDA at http://www.brenda-enzymes.org/ and enter “Fructose 1,6-bisphosphatase” to search the database.

The search displays the available enzymes.

Click 3.1.3.11 to reach the Fructose 1,6-bisphosphatase enzyme information page. Select “Saccharomyces cerevisiae” and click the Submit button.

Expand the Functional Parameters node and click “Km value”. Expand the node to show the 9 entries and retrieve the kinetic parameters from the literature link 715513. Table 1 in that reference reports the Km and Vmax values.

In COPASI, add the new reaction for Fructose 1,6-bisphosphatase as shown next, or simply copy and paste this reaction into COPASI:

"Fructose-1,6 bisphosphate" + H2O -> "Fructose 6 Phosphate" + phosphate

Then click the “New Rate Law” button.

In the working panel that follows, enter the irreversible Michaelis-Menten rate law by typing the mathematical formula exactly as shown next, or simply copy and paste this formula:

Vmax * "Fructose-1,6 bisphosphate" / (KmFBP + "Fructose-1,6 bisphosphate")

In addition, in the same working panel space, change the description of "Fructose-1,6 bisphosphate" to “substrate” in the description dropdown list and click the “Commit” button at the bottom left of the working panel space. In the navigation tree, select the “Fructose-1,6 bisphosphatase” reaction and select “Rate law for Fructose-1,6 bisphosphatase” in the “Rate Law” dropdown list.

Change the values of Vmax and KmFBP to 0.2 and 0.0087, respectively, by double clicking in the text boxes under the Value column, and click the “Commit” button at the bottom left of the working panel space. These values were retrieved from Table 1 in the literature link. Follow the same steps to add the Pyruvate Carboxylase and Phosphoenolpyruvate Carboxykinase reactions.

Similarly, retrieve the kinetic parameters for the Pyruvate Carboxylase and Phosphoenolpyruvate Carboxykinase reactions and set them as shown.
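The irreversible Michaelis-Menten rate law entered for Fructose 1,6-bisphosphatase can also be evaluated outside COPASI to see how it behaves. The Python sketch below uses the Vmax and KmFBP values quoted above (0.2 and 0.0087); the function name is hypothetical, and the units are assumed to follow those of the imported model, which the text does not state for these two parameters.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Irreversible Michaelis-Menten rate: vmax * S / (km + S)."""
    return vmax * substrate / (km + substrate)


VMAX = 0.2       # value quoted in the text for Vmax
KM_FBP = 0.0087  # value quoted in the text for KmFBP

# At a substrate level equal to Km the rate is half of Vmax.
rate_at_km = michaelis_menten_rate(KM_FBP, VMAX, KM_FBP)
# At a substrate level far above Km the rate approaches, but never reaches, Vmax.
rate_saturated = michaelis_menten_rate(1000 * KM_FBP, VMAX, KM_FBP)

print(rate_at_km, rate_saturated)
```

The small Km relative to typical metabolite levels means the rate saturates quickly, which is one reason the choice of Vmax strongly shapes the simulated time course.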

Set the initial concentrations of the new metabolites as shown.

After these additions, run the “Time Course” task for 2 minutes and compare the new “Time Course” simulation for glycolysis + gluconeogenesis with the previous one for reversed glycolysis.

[Figure: time course plots for Reversed Glycolysis and for Glycolysis + Gluconeogenesis]

3. Summary

In this module, we learnt about biochemical pathways and the online database of integrated biochemical pathways at KEGG. Biochemical pathways are expressed mathematically in the form of models. Known models are stored in the BioModels database and can be downloaded in SBML for simulation using COPASI. We ran a time course simulation of metabolites for glycolysis and calculated the steady state levels of various metabolites using COPASI. The glycolysis model was then extended towards gluconeogenesis by adding three enzymes that bypass the irreversible reactions of phosphofructokinase and pyruvate kinase. The parameter values for simulating enzyme-catalysed reactions may be obtained from the SABIO-RK and BRENDA databases. The result is an integrated model of glycolysis and gluconeogenesis. Further, glucose-6-phosphate synthesized through gluconeogenesis may enter the pentose phosphate pathway for the synthesis of NADPH and biosynthetic precursors. The pentose phosphate pathway may therefore be integrated with the model of glycolysis and gluconeogenesis by adding its enzyme-catalysed reactions, to simulate the biochemical process that produces precursors for growth. In this way, modeling may be extended to integrate more biochemical pathways and simulate complete biochemical processes for growth and cell division. The simulation may be used to study the levels of metabolites without conducting further on-bench experiments.
