Quick viewing(Text Mode)

Modeling Framework for Isotopic Labeling of Heteronuclear Moieties Mark I

Borkum et al. J Cheminform (2017) 9:14 DOI 10.1186/s13321-017-0201-7

RESEARCH ARTICLE Open Access Modeling framework for isotopic labeling of heteronuclear moieties Mark I. Borkum1*, Patrick N. Reardon1, Ronald C. Taylor2 and Nancy G. Isern1

Abstract Background: Isotopic labeling is an analytic technique that is used to track the movement of through reac- tion networks. In general, the applicability of isotopic labeling techniques is limited to the investigation of reaction networks that consider homonuclear moieties, whose are of one tracer element with two isotopes, distin- guished by the presence of one additional . Results: This article presents a reformulation of the modeling framework for isotopic labeling, generalized to arbitrar- ily large, heteronuclear moieties, arbitrary numbers of isotopic tracer elements, and arbitrary numbers of isotopes per element, distinguished by arbitrary numbers of additional . Conclusions: With this work, it is now possible to simulate the isotopic labeling states of metabolites in completely arbitrary biochemical reaction networks. Keywords: Isotopic labeling, Isotopomers, Cumomers, Elementary metabolite units, Metabolic engineering

Introduction isotopes per element, distinguished by arbitrary numbers Isotopic labeling is an analytic technique that is used to of additional neutrons. track the movement of isotopes through reaction net- works. First, specific atoms of the reagent moieties are Background replaced with detectable, “labeled” isotopic variants. The first group to give a partial solution to the problem of Then, after the reactions have been allowed to proceed, the representation of isotopic labeling states of arbitrary the position and relative abundance of labeled isotopic moieties was Malloy et al. [1]. atoms are determined by experiment. Subsequent analy- Representing isotopic labeling states of carbon atoms sis of the measured quantities elucidates the characteris- in backbones of metabolites as vectors of Boolean truth tics of the reaction network, e.g., the rate constants of the values, which they referred to as “isotopomers” (a con- reactions. traction of the term “isotopic isomers”), using 0 and 1 12C 13C In general, the applicability of isotopic labeling tech- to denote, respectively, -unlabeled and -labeled niques is limited to the investigation of reaction networks atoms, they showed that relative abundances of metabo- that consider homonuclear moieties, whose atoms are lites in biological systems could be calculated using non- of one isotopic tracer element with two isotopes, distin- linear functions of relative abundances of isotopomers, guished by the presence of one additional neutron. which they referred to as “isotopomer fractions.” The contribution of this article is a reformulation of the For example, a metabolite of two carbon atoms has 2 modeling framework for isotopic labeling, generalized to 2 = 4 isotopomers, 00, 01, 10 and 11; and hence, the iso- arbitrarily large, heteronuclear moieties, arbitrary num- topic labeling state of the metabolite is given by 4 isoto- bers of isotopic tracer elements, and arbitrary numbers of pomer fractions. Aside from being nonlinear, Malloy et al.’s construction suffers from the fact that an exponential number of iso- *Correspondence: [email protected] 1 Environmental Molecular Laboratory, Pacific Northwest topomer fractions are required in order to determine the National Laboratory, 3335 Innovation Boulevard, Richland, WA 99354, USA isotopic labeling state of any given metabolite. Full list of author information is available at the end of the article

© The Author(s) 2017. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/ publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Borkum et al. J Cheminform (2017) 9:14 Page 2 of 11

Wiechert et al. [2] noted that, by the nature of the that preceded it. Is this issue intrinsic to the problem, problem, isotopic labeling states of biochemical reac- i.e., unavoidable, or simply, difficult to mitigate? Third, tion network substrates are wholly determined by spe- Antoniewicz et al. give neither a rigorous proof of the cific subsets of carbon atoms; and therefore, that isotopic correctness of the EMU method, nor a derivation of its labeling states of the complement of each subset can be construction from prior research. Why does the EMU omitted, incurring no loss of information. method appear to work? Is it the most effective decom- Representing isotopic labeling states of carbon atoms position that can be “done” to a model of a given biologi- in backbones of metabolites as vectors of placeholder cal system? Fourth, and finally, the EMU method has thus variables, which they referred to as “cumomers” (a con- far been demonstrated for homonuclear systems only; traction of the term “cumulative isotopomers”), using 1 specifically, those containing carbon atoms, where unla- and x to denote, respectively, determinate (13C-labeled) beled and labeled variants of isotopic tracer elements are 12C 13 and indeterminate ( -unlabeled or C-labeled) iso- distinguished by the presence of one additional neutron. topic labeling states, with a total ordering given by x < 1 , Can the EMU method be generalized to arbitrary, heter- they showed that Malloy et al.’s construction could be onuclear systems? reformulated as a cascade system of linear functions of At the conclusion of their essay, Antoniewicz et al. relative abundances of cumomers, which they referred promise to return to these cases later; but, thus far, this to as “cumomer fractions,” with the original construction promise has gone unfulfilled. Nor do the works of Srour being recovered via a “suitable variable transformation” et al. [11] fill this gap. Their essays, which are contempo- [2, Eq. 7] in the form of an invertible square . raneous with those of Antoniewicz et al., develop a coun- For example, a metabolite of two carbon atoms has terpart to the EMU method, where system variables are 2 2 = 4 cumomers, xx, x1, 1x and 11; and hence, the combinations of flux variables and cumomer fractions, isotopic labeling state of the metabolite is given by 4 which they refer to as “fluxomers.” cumomer fractions. While the fluxomer method does enable a reformula- Antoniewicz et al. [3] shed new light on this subject. tion of the flux balance equation that, it is claimed, has Manipulating isotopic labeling states of specific sub- greater numerical precision and does not require matrix sets of carbon atoms as aggregations, rather than as inversion for its solution, it is, in actuality, a method for singletons, they showed that, under certain conditions, constructing a specific matrix in a single pass, which, Wiechert et al.’s cascade system could be optimized using with some algebraic manipulation, can be shown to be graph-theoretic methods, e.g., vertex reachability analy- equivalent to the EMU method for certain biochemi- sis, edge smoothing and Dulmage–Mendelsohn decom- cal reaction networks. Specifically, the fluxomer method position, thereby significantly reducing the total number involves the construction of the directed line graph for of system variables. each EMU reaction network (the new graph that repre- Representing isotopic labeling states of carbon atoms sents the adjacencies between the edges of the original in backbones of metabolites as mass distributions, which graph). Furthermore, the software implementation of they referred to as “Elementary Metabolite Units (EMU),” the fluxomer method is, in general, far more complex they showed that every EMU has a unique factorization, than that of the EMU method, limiting its widespread and that the mass distribution of a given EMU is equal adoption. to the vector convolution of the mass distributions of its The most recent work in this area, by Nillson and Jain proper factors. Moreover, they showed that, in a given [12], gives a construction for the representation and mass distribution, the mass fraction corresponding to the manipulation of “multivariate mass isotopomer distri- greatest number of mass shifts is equivalent to a specific butions” of heteronuclear moieties, which they demon- cumomer fraction. strate for moieties of carbon and nitrogen atoms. The Antoniewicz et al.’s work provided the foundation key advantage of their approach is that the probabilities for a whole host of important biological investigations associated with each combination of the contribution of [4–6], and inspired many software implementations [7– each isotopic tracer element are uniquely identifiable. 10]. Even so, the cases not solved by Antoniewicz et al. The disadvantage of their approach, however, is that it merit attention for four reasons. First, the subject is very requires the use of k-dimensional vectors, where k is the closely tied to the concept of isotopic labeling, and can number of isotopic tracer elements, increasing the com- thus serve to bring greater clarity and determinacy to plexity of the software implementation. In the discus- its mathematical formulation. In this respect, treatment sion of their essay, the authors claim that “the cumomer of the subject possesses an immediate interest. Second, framework can be easily extended along the same lines;” the EMU method suffers from the same system variable however, they give neither a construction of multivariate explosion issue as the isotopomer and cumomer methods cumomers nor a method of conversion to multivariate Borkum et al. J Cheminform (2017) 9:14 Page 3 of 11

mass distributions. Moreover, their method is only dem- variables per moiety, required to represent the mass dis- onstrated for isotopic tracer elements with two isotopic tribution, is reduced to the successor of the greatest addi- labeling states per : unlabeled or labeled. tional neutron number. In summary: Malloy et al. represented isotopic labe- Let m be a non-negative integer that denotes the great- ling states of carbon atoms in backbones of metabolites est neutron number (either directly or, assuming the as probability distributions of configurations, obtaining existence of a mapping from proton to least neutron a nonlinear formulation of the flux balance equation. number, indirectly), then the set of determinate isotopic Wiechert et al. omitted specific isotopic labeling states, labeling states of a given isotopic atom of a given moiety, obtaining a cascade system. Antoniewicz et al. manipu- denoted Stm, is the set of integral members of the closed lated aggregations of isotopic labeling states, represented interval [0, m] as mass distributions, yielding an optimization; and Srour def St = {0, 1, ..., m}. (1) et al. gave an algebraically equivalent reformulation. Nill- m son and Jain used multidimensional vectors to represent Let n and m be non-negative integers that denote, mass distributions of heteronuclear moieties, but did not respectively, the number of isotopic atoms and greatest give a construction for isotopomers or cumomers. As neutron number, then the set of isotopomers of a given things stand, however, representation of and conversion moiety, denoted Isotopomern, m, is the set of length-n between isotopic labeling states of metabolites in com- vectors of members of the set Stm pletely arbitrary biological systems is not possible. def a a a i i n Isotopomern, m = {{ 1, 2, ..., n}|∀ : ( ∈ [1, ]) Results ∧(ai ∈ Stm)}, (2) Definitions The indeterminacy which still prevails on a number of and the set of cumomers of a given moiety, denoted fundamental points in the theory of isotopic labeling Cumomern, m, is the set of sequences of members of the compels us to make some prefatory remarks about the set Stm ∪ {⊤} concept of an isotopic labeling state and the scope of its def Z validity. Cumomern, m = {(a1, a2, ..., an, an+1, ...)|∀i : (i ∈ ) For this investigation, unless otherwise stated, a moi- Stm ∪ {⊤} if i ≤ n ∧ ai ∈ , ety is any set of atoms, not necessarily connected via  {⊤} if i > n  (3) chemical bonds, nor of the same metabolite. Further- more, we take as an axiom that the number of nucleons where ⊤ denotes the indeterminate isotopic labeling in an atomic nucleus is quantized, i.e., is a non-negative state, the logical disjunction of every determinate iso- integer. topic labeling state (viz., logical true). First and foremost, what do we understand by the con- Importantly, while an isotopomer is a representation of cept of an isotopic labeling state? the isotopic labeling state of a given moiety in the context Representations of states of moieties of isotopic atoms of itself, a cumomer is a representation of the isotopic are isomorphic to logical conjunctions of assertions of labeling state of a given moiety in the context of a larger, both the proton and neutron numbers of each atom, e.g., virtual moiety of a countably infinite number of atoms, “the first atom has 1 proton and 0 neutrons, the second whose isotopic labeling states are mutually independent, atom has 6 protons and 7 neutrons, etc.” In this way, where the isotopic labeling states of non-local atoms are, the characteristics of each isotopic atom are completely by definition, indeterminate (as displayed by the quanti- specified; and hence, the representation is an unambigu- fication of atom indices over the nonzero, positive mem- Z ous description of the isotopic labeling state of the given bers of ). It is this essential difference that enables the moiety. formulation of the flux balance equation as a cascade Z Z However, if there exists a mapping f : → from system. proton to least neutron number, e.g., f (1) = 0, f (6) = 6, Let n and m be non-negative integers that denote, Z etc., where denotes the set of integers, then representa- respectively, the number of isotopic atoms and the great- tions of states of moieties are isomorphic to logical con- est neutron number, then the set of cumomers of a given n junctions of assertions of both the proton and additional moiety can be exhaustively partitioned into a set of 2 neutron numbers, i.e., mass shifts, of each atom, e.g., “the mutually disjoint subsets, denoted EMUn, m, where the first atom has 1 proton and 0 additional neutrons, the isotopic labeling states of local atoms in each subset of second atom has 6 protons and 1 additional neutron, etc.” cumomers, denoted EMUn, m, N, are determinate for the In this way, the characteristics of each isotopic atom are members of a member N of the power-set of the set of still completely specified; but, the total number of system atoms of the given moiety Borkum et al. J Cheminform (2017) 9:14 Page 4 of 11

def EMUn, m = EMUn, m, N |∀N : (N ∈ P([1, n])) def  Stm if i ∈ N EMUn, m, N = a | a ∈ Cumomern, m ∧ ∀ i : (i ∈ [1, n]) ∧ ai ∈ , (4)     {⊤} if i �∈N    P where denotes the power-set function. states are also undetectable, i.e., the corresponding isoto- pomer and cumomer fractions are also 0%. Representation as Boolean functions = Isotopomer and cumomer fraction vectors for uni- and Representation as mass distributions multi-atomic EMUs are, by definition, discrete probabil- Cumomer fraction vectors for uni-atomic EMUs are, by ity distributions of the isotopic labeling state of the given definition, probability mass functions that characterize moiety. Interpreted as Boolean functions, isotopomer discrete probability distributions of the mass of the given and cumomer fraction vectors are “multiplied” using the atom, i.e., mass distributions. Interpreted as mass distri- Cartesian under multiplication. butions (instead of Boolean functions), cumomer fraction Let n and m be non-negative integers that denote, vectors are “multiplied” using vector convolution (instead respectively, the number of isotopic atoms and the of the Cartesian product under multiplication). greatest neutron number, then the set of isotopomer fractions of a given moiety, corresponding to the set Representation as arbitrary monoids n Isotopomern, m , represented as a vector of (m + 1) Both interpretations induce a commutative monoid entries, each giving the functional value of the corre- structure on the set of EMUs, where the left- and right- sponding minterm, is a completely specified, Boolean identity of the monoidal product is the interpretation of function of n variables, with m + 1 values per variable. the trivial EMU, corresponding to the “empty moiety” The sum of the set of isotopomer fractions of a given of zero atoms, denoted EMUn, m, ∅, where ∅ denotes the moiety is, therefore, =100%. empty set. Let n and m be non-negative integers that denote, What other interpretations are possible for uni-atomic respectively, the number of isotopic atoms and the greatest EMUs? Under what circumstances are they valid? P neutron number, and N ∈ ([1, n]) be a subset of atoms A homomorphism f : A → B is a function between two with determinate isotopic labeling states, then any (mutu- sets A and B, such that ally disjoint) subset of cumomer fractions of a given moi- f (µA(a1, a2, ..., an)) = µB f (a1), f (a2), ..., f (an) ety, corresponding to the set EMUn, m, N, represented as |N|  a vector of (m + 1) entries, each giving the functional (5) value of the corresponding minterm, is a completely speci- holds, for the variadic functions µA and µB, and for all fied, Boolean function of| N| variables, with m + 1 values elements a1, a2, ..., an ∈ A. per variable. The sum of a (mutually disjoint) subset of Let n and m be non-negative integers that denote, cumomer fractions of a given moiety is, therefore, 100 %. respectively, the number of isotopic atoms and the great- = P Let n and m be non-negative integers that denote, est neutron number, and N ∈ ({1, 2, ..., n}) be a sub- respectively, the number of isotopic atoms and the great- set of atoms with determinate isotopic labeling states, Z est neutron number, then the set of cumomer fractions then the function f : { } → S is a homomorphism of a given moiety, corresponding to the set Cumomern, m , n between the domain, the set of (mutually disjoint) sub- represented as a vector of (m + 2) entries, each giving sets of atoms, and the codomain S, where µ{Z} and µS are, the functional value of the corresponding minterm, is a respectively, set union and the monoidal product. completely specified, Boolean function of n variables, Interpretations of uni-atomic EMUs are valid, there- with m + 2 values per variable (the extra value being fore, if and only if: the intermediate isotopic labeling state). The sum of the set of cumomer fractions of a given moiety is, therefore, n •• The codomain S is isomorphic to the set of vectors; =2 × 100%. whose lengths, non-negative integers, are given by a For the above to hold in situations where isotopic well-defined function of| N| and m; and, tracer elements have different numbers of isotopes, dis- •• The codomain S is a commutative monoid. tinguished by different numbers of additional neutrons, it is necessary to assume that surplus isotopic labeling Hence, given n and m, the function f (N) = EMUn, m, N states are undetectable, i.e., that the probability of their is a valid homomorphism, with the codomain being inter- detection is =0%. Given this assumption, isotopom- preted as either the set of mass distributions or the set of ers and cumomers that include surplus isotopic labeling Boolean functions. Moreover, the function f (N) = |N| Borkum et al. J Cheminform (2017) 9:14 Page 5 of 11

is also a valid homomorphism, with the codomain being isotopomers and certain subsets of cumomers can be the set of non-negative integers under addition. defined, witnessed by a family of square matrices, as we Notice that, if we follow Wiechert et al.’s advice and will now show. partition the members of the codomain S by the num- Let m be a non-negative integer that denotes the ber of isotopic atoms, then, by definition, the vector rep- greatest neutron number, and s ∈ Stm be a determi- resentations of the members of each partition have the nate isotopic labeling state, then the set of punctured, same length. Therefore, any valid interpretation can be determinate isotopic labeling states of an isotopic atom used in flux balance equations if the corresponding vec- of a given moiety with respect to s is the set Stm \ {s} . tor representations are transposed (from column to row Removal of the (s + 1)th row of the rectangular matrix IC vectors). 1, m, followed by the repeated Kronecker product, always yields an invertible matrix. In fact, for a given n Conversion of isotopomer and cumomer fractions and m, the product of any two of these invertible matri- Let n and m be non-negative integers that denote, respec- ces is an injective matrix (the utility of which we have not tively, the number of isotopic atoms and the greatest neu- yet established). tron number, then conversion from isotopomer fraction Why is this the case? For every determinate isotopic vectors to cumomer fraction vectors is “done” using the labeling state s, there exists a set of punctured, determi- rectangular matrix nate isotopic labeling states, and a corresponding set of punctured cumomers (the term “punctured” referring to the exclusion of any one member of the set). The set of (6) punctured cumomers is, therefore, the logical conjunc- tion-exclusive disjunction expansion of the set of isoto- I where ⊗ denotes the Kronecker product, and k denotes pomers with respect to s; or, in coding theory parlance, the identity matrix of dimension k × k. the Reed–Muller spectral domain [13, 14] of the set of Exemplar matrices for small values of n and m are given isotopomers with respect to s. The square matrices are, in Table 1. therefore, the Reed–Muller transform matrices. Non-square, rectangular matrices are, of course, non- Note that the set of “cumomers” described by Wiechert invertible; however, isomorphisms between the set of et al. is the set of punctured cumomers with respect to s = 0. Table 1 Matrices for conversion of isotopomer and cumomer Conversion of isotopomer and mass fractions fraction vectors Let n and m be non-negative integers that denote, respec- n = n = n = ICn,m 0 1 2 tively, the number of isotopic atoms and the greatest neu- tron number, then conversion from isotopomer fraction m = 0 1 1 1 vectors to mass fraction vectors is “done” using the rec- m = 1 1 10 1000 0100 tangular matrix  01   1100 11 n   0010   IM def I n 0001 n, m = interspersecols m+1, (m + 1) − 1, 0   0011 i=1    (7) 1010 0101   Ik 1111 where denotes matrix convolution, denotes the   identity matrix of dimension k × k, and interspersecols m = 2 1 100 100000000 A 010 010000000 is a function that takes as input an arbitrary matrix , a      001 001000000 spacing k and an element x, and returns as output a new 111 111000000 A     matrix, where the columns of are interspersed with k 000100000   000010000 columns of the specified element x.   000001000 Exemplar matrices for small values of n and m are given   000111000 000000100 in Table 2.   000000010 Non-square, rectangular matrices are non-invertible.   000000001   While every rectangular matrix has a Moore–Penrose 000000111   pseudoinverse, their use, in this context, is incorrect, as 100100100   010010010 we will now show with a pathological example.   001001001 Let A be a homonuclear moiety of two isotopic atoms, 111111111   considering one isotopic tracer element, with two Borkum et al. J Cheminform (2017) 9:14 Page 6 of 11

Table 2 Matrices for conversion of isotopomer and mass The Cayley table for the setSt m ∪ {⊥, ⊤} under logical fraction vectors conjunction, with rows and columns permuted to facili-

IMn,m n = 0 n = 1 n = 2 tate this exposition, is m = 0 1 1 1 m = 1 1 10 1000  01 0110 0001   m = 2 1 100 100000000 (9)  010 010100000 001 001010100   000001010   000000001   isotopes, distinguished by one mass shift, then conver- showing that the set Stm ∪ {⊥, ⊤} is closed under logical sion of isotopomer and mass fraction vectors is given by conjunction, as are four of its cosets: {⊥}, {⊤}, {⊥, ⊤}, and the linear equation Stm ∪ {⊥}; however, the set Stm, and by extension, the set Stm ∪ {⊤} A(0, 0) , is neither closed nor well-defined under logi- A{1, 2},M+0 A(0, 1) cal conjunction; since, by definition, it does not contain A{ } = IM ·   ;  1, 2 ,M+1  2, 1 A (8a) ⊥ Stm ∪ {⊥, ⊤} A (1, 0) . The set is, therefore, a multi-valued logic, {1, 2},M+2  A     (1, 1)  with m distinguished, mutually contradictory elements that are identified with neither logical true⊤ nor logical and, to aid the example, assuming different values for false ⊥. each of the four isotopomer fractions of A, then Logical conjunction is, therefore, defined only for pairs 0.4 of isotopomers that are pairwise equal; and is equivalent 0.4 1000 0.3 to multiplication of the corresponding isotopomer frac- 0.5 = 0110·   ,     0.2 (8b) 0.1 0001 tions; however, since logical conjunction is not defined  0.1        for every pair of isotopomers, we cannot take the Carte- sian product under logical conjunction of pairs of sets of which is correct, given the specified isotopomer isotopomers. fractions. Logical conjunction is, therefore, defined only for However, conversion between mass and isotopomer pairs of cumomers that are pairwise: equal, determinate fraction vectors, using the Moore–Penrose pseudoin- and indeterminate, or indeterminate and determinate; verse, is incorrect and it is equivalent to multiplication of the correspond- A(0, 0) ing cumomer fractions. Since logical conjunction is A{1, 2},M+0 A + defined for pairs of cumomers that are determinate for  (0, 1)  �= IM · A , A 2, 1  {1, 2},M+1  (8c) (1, 0) A disjoint subsets of atoms, we can take the Cartesian  A  � � {1, 2},M+2  (1, 1)    product under logical conjunction of pairs of sets of cumomers that are determinate for disjoint subsets of since atoms. 0.4 100 Logical disjunction is, of course, defined, for all isoto- 0.4 0.25 0 0.5 0 pomers and cumomers, and it is equivalent to addition of   =   · 0.5 . 0.25 0 0.5 0   (8d) 0.1 the corresponding isotopomer and cumomer fractions.  0.1   001       Decomposition of arbitrary biochemical reactions Logical conjunction and disjunction of isotopic labeling A biochemical reaction is an identity of power-sets of iso- states topic atoms of virtual moieties, corresponding to the sets Let m be a non-negative integer that denotes the great- of all isotopic atoms on the left- and right-hand sides of est neutron number, then logical conjunction is defined the “arrow.” For example, the expression for pairs of members of the set Stm ∪ {⊤} that are either: equal, determinate and indeterminate, or indeterminate v A{a,b} + B{c} −→ C{a} + D{b,c} (10a) and determinate. Borkum et al. J Cheminform (2017) 9:14 Page 7 of 11

denotes a biochemical reaction with two reagents, A and Formulation of flux balance equations B, two products, C and D, and a flux variable v. It is a rep- A biochemical reaction network is a set of biochemical resentation of the identity reactions; and therefore, the decomposition of a bio- network is a set of systems of identities f P a P b c right ∪ right, right \∅ of isotopic labeling states.      = v × f P aleft, bleft ∪ P({cleft}) \∅ , (10b) From this point, formulation of the flux balance equa-     tions is purely mechanical: P where denotes the power-set function, ∅ denotes the empty set, and the function f is a valid homomorphism, 1. Take the union of the set of systems of identities of given the conditions that we have established. isotopic labeling states; Note the empty set is deliberately excluded from both 2. Partition the resulting set by the number of isotopic power-sets. Doing otherwise would introduce a contra- atoms per identity; diction: f (∅) = v × f (∅). 3. Represent each partition as an adjacency matrix; Application of the power-set and set difference func- 4. Optimize the adjacency matrices using any valid tions yields graph-theoretic method (optional, but highly recom- mended); f a b c b c right , right , right , right, right 5. Formulate the flux balance equation for each adja-         = v × f {aleft}, bleft , aleft, bleft , {cleft} ; (10c) cency matrix.       and hence, the decomposition of the original identity is Adjacency matrices are representations of labeled, the system of identities directed graphs. After permutation of the rows and col- umns, vectors of vertices (of the directed graphs) are of f aright = v × f ({aleft})   the form f bright = v × f bleft a f c  = v × f ({c }) 1 right left a =  a2  , (11) f b , c  = v × f b ⋆ f ({c }) (10d) a3 right right left left        where ⋆ denotes the monoidal combinator (for the codo- where a1, a2 and a3 denote, respectively, sub-vectors of main of the function f). source, intermediate and sink vertices, corresponding to Replacement of each moiety by its isotopic labeling substrates, intermediates and products; and, adjacency state, i.e., application of the function f, we obtain the sys- matrices are of the form tem of identities 0 00 C{1} = v × A{1} A =  A2,1 A2,2 0  , (12) A3,1 A3,2 0 D{1} = v × A{2}   (10e) D{2} = v × B{1} where the element Ai,j = denotes the identity aj = × ai D{1,2} = v × A{2} ⋆ B{1} for the coefficient .  Assuming a variational formulation of fluid dynamics Thus, the problem of decomposition of representa- for a bounded domain with a piecewise-smooth bound- tions of “biochemical reactions” is reduced to the decom- ary and a general “mass” conservation law [15], flux bal- position of identities of power-sets of “isotopic atoms,” ance equations are of the form making no reference to the underlying , where A A a “isotopic atoms” are, in practice, members of any counta- f 2,1 2,2 2 (...) = diag rowsum A A · a bly infinite set for which equality can be established, e.g.,  3,1 3,2   3  the set of natural numbers. A2,1 A2,2 a1 − · (13) Note that, unlike the “EMU decomposition” algorithm  A3,1 A3,2   a2  of Antoniewicz et al., our decomposition algorithm is exhaustive, runs in constant time and memory, has a where f is an arbitrary function, diag denotes a function O 2 worst-case computational complexity of n with that constructs a square matrix with the specified lead- respect to the number of “isotopic atoms” n, and does not ing diagonal, and rowsum denotes a function that yields require backtracking. a column vector of the sums of the elements of each row Borkum et al. J Cheminform (2017) 9:14 Page 8 of 11

of the specified matrix. (The steady-state hypothesis such, we make no attempt to distinguish between (math- being f (...) ≈ 0). ematically) conceivable and (experimentally) detectable On the right-hand side of (13), the first matrix is the isotopic labeling states. Instead, we assert that represen- mass lumping matrix for intermediate and sink vertices tations of undetectable isotopic labeling states be identi- using the row sum technique; and, the second matrix is fied by logical false, i.e., that the probability of detection the discrete transport operator for source and intermedi- is =0%. ate vertices. It remains to give examples of the practical usage of However, (13) is nonlinear, referring to the set of inter- representations of detectable isotopic labeling states in mediates in the expressions of both influx and efflux; the context of different experimentations: mass spec- and therefore, is not in the required form. Rearranging trometry (MS) and nuclear magnetic resonance (NMR) the right-hand side of (13) using elementary algebra, we spectroscopy. obtain A A A f 2,1 2,2 2,2 0 Following an MS experiment, a post-processing function (...) = diag rowsum A A − A  3,1 3,2   3,2 0  is applied to the experimentally obtained data in order to a A 2 2,1 a yield a representation of the isotopic labeling state of the · − · 1 (14)  a3   A3,1  detected fragments; typically, a mass distribution. For example, in the case of electrospray mass ionization which is of the required form. (ESI) and high-resolution MS experiments, where frag- On the right-hand side of (14), the first matrix is the mentation is optional, the post-processing function may be consistent mass matrix for intermediate and sink ver- the identity function. By contrast, in the case of gas chro- tices; and, the second matrix is the discrete transport matography mass spectrometry (GC-MS), experimentally operator for source vertices only, i.e., the required form. obtained data is corrected with respect to the representa- Notice that the elements of the leading diagonal of the tion of the isotopic labeling state of the chemical derivati- consistent mass matrix are positive. This is to ensure that zation agent, yielding an “underivatized” representation of the eigenvalues of the consistent mass matrix are non- the isotopic labeling state for each detected fragment. negative [16]. Irrespective of the underlying MS experiment, each Assuming the steady-state hypothesis, for example, we fragment is identified with an EMU. obtain For example, consider the cyano radical; molecular for- mula: ·CN. In a heteronuclear model, where the two iso- A A A 2,1 2,2 2,2 0 topic atoms are, respectively, carbon and nitrogen, if an diag rowsum A A − A  3,1 3,2   3,2 0  isotopically labeled cyano radical is detected using high- a A 2 2,1 a resolution MS, then the result is, by definition, a matrix · ≈ · 1 (15)  a3   A3,1  of the probability of occurrence of each of the four possi- ble combinations of detectable isotopes (viz., a mass frac- which is readily solved by inversion of the consistent tion matrix). mass matrix. Thus, the problem of formulating cascade systems of 14N 15N flux balance equations for “biochemical reaction net- 12 works” is reduced to the construction and optimization C P({C = 0} ∩ {N = 0}) P({C = 0} ∩ {N = 1}) 13 of dependency networks of adjacency matrices, followed C P({C = 1} ∩ {N = 0}) P({C = 1} ∩ {N = 1}) by the extraction of block matrices of predetermined Accordingly, the representation of each event corre- dimensions, making no reference to the underlying sponds to a specific isotopomer of the cyano radical; chemistry.

14 15 Practical usage of representations of detectable isotopic N N labeling states 12 C ·CN(0, 0) ·CN(0, 1) 13 In this manuscript, we have given constructive definitions C ·CN(1, 0) ·CN(1, 1) of representations of isotopic labeling states of moieties. Our mathematical framework, independent of Chemis- which, in turn, are equivalent to isotopomers of EMUs of try, is based on the premise that every conceivable iso- the cyano radical (precisely, the EMU that corresponds to topic labeling state is (mathematically) representable. As the moiety of all atoms of the cyano radical). Borkum et al. J Cheminform (2017) 9:14 Page 9 of 11

14N 15N Since there are two indistinguishable Carbon atoms, the probability of either or both of Cδ being 13C-labeled 12 ·CN ·CN C {1, 2}, (0, 0) {1, 2}, (0, 1) is equal to 13 C ·CN{1, 2}, (1, 0) ·CN{1, 2}, (1, 1)

P Cδ1 = 0 ∩ Cδ2 = 1 Taking the trace of the anti-diagonals of the matrix, ∪ C = 1 ∩ C = 0  which corresponds to taking the logical disjunction of the δ1 δ2 (17d)     subset of (heteronuclear) mass fractions that correspond ∪ Cδ1 = 1 ∩ Cδ2 = 1 . to isotopically labeled moieties with the same number of     additional neutrons, yields a vector; specifically, the mass Therefore, the result is equal to fraction vector that is obtained by low-resolution MS of Leu + Leu + Leu × Leu the same sample. (⊤, ⊤, ⊤, ⊤,0,1) (⊤, ⊤, ⊤, ⊤,1,0) (⊤, ⊤, ⊤, ⊤,1,1) (⊤, 1, ⊤, ⊤, ⊤, ⊤) , Leu(⊤, 1, ⊤, ⊤, ⊤, ⊤)  ·CN {1, 2},M+0 (17e) ·CN  {1, 2},M+1  which, given logical conjunction, is equal to ·CN {1, 2},M+2   Leu(⊤, 1, ⊤, ⊤,0,1) + Leu(⊤, 1, ⊤, ⊤,1,0) + Leu(⊤, 1, ⊤, ⊤,1,1) P({C = 0} ∩ {N = 0}) . Leu(⊤, 1, ⊤, ⊤, ⊤, ⊤) =  P(({C = 0} ∩ {N = 1}) ∪ ({C = 1} ∩ {N = 0}))  P({C = 1} ∩ {N = 1}) (17f)   (16) Since cumomers of a metabolite are equivalent to iso- In this way, both low- and high-resolution MS data can topomers of EMUs of said metabolite, the final result is be leveraged within the same mathematical framework. equal to Leu{2, 5, 6}, (1, 0, 1) + Leu{2, 5, 6}, (1, 1, 0) + Leu{2, 5, 6}, (1, 1, 1) Nuclear magnetic resonance . Leu{2}, (1) After Fourier transformation, the result of a NMR experi- (17g) ment (of arbitrary dimensionality) is a frequency-domain Discussion spectrum. If we assume, without loss of generality, that the integral for the resonance of a given “observed” Beginning with a single assumption (the quantization nucleus of a given metabolite is equal to unity, then the of nucleons in an atomic nucleus), we have formulated proportional integral of a given NMR splitting pattern a mathematical framework for isotopic labeling that is of a moiety of said metabolite is equal to the conditional independent of Chemistry. probability of measuring a given isotopic labeling state of Reclothing the theory of isotopic labeling with famil- the “detected” nucleus whilst fixing the isotopic labeling iar concepts from Chemistry only serves to obscure its state of the “observed” nucleus. Note that, in the context generality and rigor. Furthermore, imposing artificial of a given moiety, the “detected” and “observed” nuclei constraints on the theory of isotopic labeling, motivated need not be directly connected via a single chemical by Chemistry, rather than by Mathematics, is, obviously, bond. incorrect. (As a corollary, imposing artificial constraints Consider Leucine (abbreviated: “Leu”), an on the practice of isotopic labeling, such as limiting one- of 6 Carbon atoms. The second Carbon atom is denoted self to chemically and financially feasible experiments, is, Cα; whereas, the fifth and sixth Carbon atoms, indistin- obviously, correct). guishable by NMR, are both denoted Cδ. Identifying the true name of the mathematical objects The integral of the NMR splitting pattern of the problem has allowed us to leverage the com- paratively vast corpora of a diverse range of disciplines, − Cα Cδ (17a) including: Computer , Mathematics, Information is equal to the conditional probability of the detected Theory and Coding Theory. For example, we have shown atom Cδ being 13C-labeled when the observed atom Cα is that the set of punctured cumomers of a given moiety is also 13C-labeled the Reed–Muller spectral domain of the set of isotopom- ers of said moiety; and hence, that we can use the Reed– ({ = }|{ = }) P Cδ 1 Cα 1 , (17b) Muller transform matrices to convert isotopomer and which, by definition, is equal to the quotient punctured cumomer fraction vectors. The isotopomer method, the most naïve representation P({Cδ = 1} ∩ {Cα = 1}) . (17c) of isotopic labeling state, is, clearly, flawed. Malloy et al. P({Cα = 1}) assume, incorrectly, that the “labeled” and “unlabeled” variants of isotopic tracer elements can be identified with Borkum et al. J Cheminform (2017) 9:14 Page 10 of 11

the Boolean truth values. Instead, as we have shown, •• A bounded domain with a piecewise-smooth bound- determinate isotopic labeling states are distinguished ary; elements of a multi-valued logic, whose undistinguished •• A unit volume; elements are identified with contradiction⊥ and the •• An incompressible fluid; indeterminate isotopic labeling state ⊤. •• Mass lumping approximated by the row-sum tech- The cumomer method fairs slightly better. Weichert nique; and, et al. do not recognize the complete set of cumomers of •• A distance metric that is constant for all pairs, i.e., a given moiety. Instead, what they refer to as “cumom- that all “stuff” in a compartment is equally, infinitely ers” is the set of punctured cumomers with respect close. to s = 0. In contrast to the advice of the Mathemat- ics community, they assume that the indeterminate This suggests that the flux balance equation could be isotopic labeling state is less than every determinate derived from Fluid Dynamics under much more general isotopic labeling state; and hence, the elements of assumptions. In the forwards direction, this technology Weichert et al.’s cumomer fraction vectors are out of transfer could enable the translation of order, making conversion between isotopomer and models into Fluid Dynamics models, and subsequently, cumomer fraction vectors cumbersome in the general the simulation of Systems Biology models using Fluid case. Furthermore, contrary to the advice of the Fluid Dynamics tools. In the backwards direction, establish- Dynamics community, Weichert et al. assume that, in ing a dictionary could enable Fluid Dynamics concepts, the flux balance equation, influx is positive and efflux theorems and results to be utilized by Systems Biology is negative. Hence, in cumomer-based flux balance researchers. For example, what are the “translations” of equations, the consistent mass matrix has non-positive laminar flow and friction? Moreover, can a biological eigenvalues. reaction network be embedded in an arbitrary manifold, We have shown that the EMU method can be “done” thereby allowing the model to account for the spatial because the function that interprets a set of isotopic characteristics of the compartment? atoms (with determinate isotopic labeling states) as a mass distribution is a homomorphism, and because the Conclusions set of mass distributions is a commutative monoid (when The contributions of this article are as follows: sets of isotopic atoms are pairwise disjoint). From this new perspective, the cumomer method can •• We have rigorously defined the concept of an isotopic be “done” for two reasons: because cumomer fraction labeling state of an arbitrary moiety, consisting of an vectors are a valid codomain of the homomorphism and arbitrary number of isotopic atoms, with an arbitrary EMU-based flux balance equations can be transformed number of isotopes per isotopic tracer element, dis- into cumomer-based flux balance equations by elemen- tinguished by an arbitrary number of mass shifts. tary algebraic manipulations: replacement of flux variable •• We have reformulated the isotopomer, cumomer coefficients by coefficient matrices; transposition of row and EMU methods in terms of our new formulation vectors and subsequent construction of a block column of isotopic labeling states; demonstrating both the vector; and, permutation of rows and columns. decomposition of arbitrary biochemical reaction net- Cumomer and mass fraction vectors are not the only works and the construction of the corresponding flux “valid” interpretations. Zero dimensional, i.e., sca- balance equations. lar, interpretations are possible, the simplest being the •• We have given matrices for the conversion of arbi- number of isotopic atoms per moiety. One dimensional, trary isotopomer, cumomer and mass fraction vec- i.e., vector, interpretations include: cumomer fraction tors; recognizing that the set of punctured cumomers vectors, punctured cumomer fraction vectors and mass of a given moiety (with respect to any determinate distributions. Two dimensional, i.e., matrix, interpre- isotopic labeling state) is the Reed–Muller spectral tations, considering both the number of protons and domain of the corresponding set of isotopomers. neutrons of each moiety are also admissible. (In the flux balance equation, coefficient matrices are replaced by With this work, it is now possible to simulate the iso- tensors). topic labeling states of metabolites in completely arbi- We have shown that the “flux balance equation” is the trary biochemical reaction networks. “mass balance equation” of Fluid Dynamics, under the An immediate application of this work is the devel- following assumptions: opment of software systems that are informed of all Borkum et al. J Cheminform (2017) 9:14 Page 11 of 11

mathematical relationships between the mathematical References 1. Malloy CR, Sherry AD, Jeffrey FM (1988) Evaluation of carbon flux and sub- objects. strate selection through alternate pathways involving the citric acid cycle We expect that this work will be incorporated into of the heart by 13C NMR spectroscopy. J Biol Chem 263(15):6964–6971 future implementations of Systems Biology software 2. Wiechert W, Möllney M, Isermann N, Wurzel M, de Graaf AA (1999) Bidi- rectional reaction steps in metabolic networks: III. Explicit solution and systems. analysis of isotopomer labeling systems. Biotechnol Bioeng 66(2):69–85 3. Antoniewicz MR, Kelleher JK, Stephanopoulos G (2007) Elementary Future work metabolite units (EMU): a novel framework for modeling isotopic distri- butions. Metab Eng 9(1):68–86 The following research questions are unanswered by this 4. Metallo CM, Walther JL, Stephanopoulos G (2009) Evaluation of 13C iso- study and are left for future work: topic tracers for metabolic flux analysis in mammalian cells. J Biotechnol 144(3):167–174 5. Walther JL, Metallo CM, Zhang J, Stephanopoulos G (2012) Optimization •• How should the new capability, determination of of 13C isotopic tracers for metabolic flux analysis in mammalian cells. isotopic labeling states of heteronuclear moieties, be Metab Eng 14(2):162–171 leveraged? For example, is it more computationally 6. Martín HG, Kumar VS, Weaver D, Ghosh A, Chubukov V, Mukhopadhyay A, Arkin A, Keasling JD (2015) A method to constrain -scale models efficient and/or numerically precise to fit metabolic with 13C labeling data. PLoS Comput Biol 11(9):1004363 flux variables to experimentally obtained measure- 7. Quek L-E, Wittmann C, Nielsen LK, Krömer JO (2009) OpenFLUX: efficient 13C ments of hetero- rather than homonuclear moieties? modelling software for -based metabolic flux analysis. Microb Fact 8(1):1–15 •• This study assumes that isotopic labeling states of 8. Shupletsov MS, Golubeva LI, Rubina SS, Podvyaznikov DA, Iwatani S, C atoms in a given moiety are mutually independent. Mashko SV (2014) OpenFLUX2: 13 -MFA modeling software package Is it possible to provide a formulation where isotopic adjusted for the comprehensive analysis of single and parallel labeling experiments. Microb Cell Fact 13(1):152 labeling states are conditionally dependent? 9. Young JD (2014) INCA: a computational platform for isotopically non- •• In the context of Radiochemistry, can the new formu- stationary metabolic flux analysis. 30(9):1333–1335 lation be used to determine isotopic labeling states of 10. Kajihata S, Furusawa C, Matsuda F, Shimizu H (2014) OpenMebius: an open source software for isotopically nonstationary 13C-based metab moieties in the presence of and/or olic flux analysis. BioMed Res Int 2014. https://www.hindawi.com/ neutron sources? journals/bmri/2014/627014/ 11. Srour O, Young JD, Eldar YC (2011) Fluxomers: a new approach for 13C Authors’ contributions metabolic flux analysis. BMC Syst Biol 5(1):129 The manuscript was written through contributions of all authors. All authors 12. Nilsson R, Jain M (2016) Simultaneous tracing of carbon and nitrogen read and approved the final manuscript. isotopes in human cells. Mol BioSyst 12(6):1929–1937 13. Reed I (1954) A class of multiple-error-correcting codes and the decod- Author details ing scheme. Trans IRE Prof Group Inf Theory 4(4):38–49 1 Environmental Molecular Sciences Laboratory, Pacific Northwest National 14. Muller DE (1954) Application of Boolean algebra to switching circuit Laboratory, 3335 Innovation Boulevard, Richland, WA 99354, USA. 2 Biological design and to error detection. Trans IRE Prof Group Electron Comput Sciences Division, Pacific Northwest National Laboratory, 3335 Innovation 3(3):6–12 Boulevard, Richland, WA 99354, USA. 15. Kuzmin D, Hämäläinen J (2015) Finite element methods for computa- tional fluid dynamics: a practical guide. SIAM Rev 57(4):642 Acknowledgements 16. Felippa C (2013) Matrix finite element methods in dynamics (course in The authors would like to thank Nathan Baker, Juan Brandi-Lozano, Luke preparation). http://www.colorado.edu/engineering/CAS/courses.d/ Gosink, Costas Maranas, Panos Stinis and Xiu Yang for their comments and MFEMD.d/. Accessed 1 Apr 2016 feedback. The authors would also like to thank the anonymous reviewers for their suggestions and comments.

Competing interests The authors declare that they have no competing interests.

Funding The research was performed using EMSL, a DOE Office of Science User Facility sponsored by the Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory. Funding for this work was provided in part by the William Wiley Postdoctoral Fellowship from EMSL to P.N.R. Additional funding was provided by the Development of an Integrated EMSL MS and NMR Metabolic Flux Analysis Capability In Support of Systems Biology: Test Application for Biofuels Production intramural research project from EMSL to N.G.I.

Received: 22 September 2016 Accepted: 20 February 2017