Principles of systems and Dmitri Belyaev’s co-selection of traits

Hans V. Westerhoff and friends MCISB, MIB, SCEAS, University of Manchester SysBA, Universities of Amsterdam Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity Bioinformatics: From biological data to information

Systems Biology: From that information to understanding Systems Biology: From data to understanding: why is this such an issue?

• Because the mapping from genome to function is extremely nonlinear • E.g.: – -

– - The DNA in all our cells is the same, but: a heart cell is essentially different from a brain cell

– - Self organization, bistability: Belousov, Zhabotinsky, Waddington, Ilya Prigogine, Boris Kholodenko Why systems biology?

Cause 1 ~all functions are X network functions Multiple causality Cause 2 Cause 3 X Multifactorial X disease

Impaired function 2006 Hornberg et al: ‘Cancer: a systems biology disease’. Now: ‘virtually all disease are Systems Biology diseases.’ This causes the ‘missing heritability problem (Baranov; Stepanov)’ Systems Biology=

• The Science that • aims to understand • principles governing • how the biological functions • arise from the interactions = from the networking This leads to precision, personalized, 4P , PPP4M And to precision Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity Henrik Kacser (Student of Waddell) Henrik Kacser

Recessivity of most lack-of-function mutations Lack of dominance: No loss of function in heterozygote Lack of dominance: observation F0 F0’ F1 (knock out) X

Function= 100% 0% 95% Flux J Lack of dominance: single molecule explanation fails F0 F0’ F1

AA 00 A0 X This is almost always incorre Function= 100% 100% 50% Flux J ct Cf. talk by Prof Baranov Lack of dominance: single molecule explanation fails

F0 F0’ F1

AA 00 A0 X

Function= 100% 100% >90%X50% Flux J The systems biology explanation

(Henrik Kacser) Dependence of pathway flux on the concentration of an

Nonlinear: Remember the Polly presentation

Flux versus enzyme activity

 1

0.5 Flux J J Flux 0 0 0.5 1 1.5 2 2.5 Enzyme 1 activity  Wild type level How to measure whether a component is limiting: take the relative slope: the flux control coefficient  extent of dominance 푑퐽 푑푒 푖 J Flux Control Coefficient C I = 푟푒푙푎푡푖푣푒 푠푙표푝푒 = 퐽 푒푖

푠푡푒푎푑푦 푠푡푎푡푒푠 푎푛푑 푎푙푙 표푡ℎ푒푟 푒푗 푐표푛푠푡푎푛푡

Flux versus enzyme activity The relative 1 slope determines 0.5 Flux J Flux the extent of 0 control 0 0.5 1 1.5 2 2.5 Gene dosage ei Example

C is not confined to the single first irreversible step in the pathway

dJ  d ln | J | '%dJ ' C J     J  i  d ln e  de '1%de  i steadystate i i ei + 1.01 Systems Biology=

• The Science • That aims to understand • principles governing • how the biological functions • arise from the interactions Flux Control Summation Principle

퐽 퐽 퐽 퐽 퐶1 + 퐶2 + 퐶3 + ⋯ + 퐶푛 ≡ 1 • J: steady state flux • 1, 2, 3, : number of the enzyme (gene product) • Consequence: %푟푒푑푢푐푡푖표푛 푖푛 푓푢푛푐푡푖표푛 퐶퐽 ≈ =1/n 푖 푓표푟 푎 50% 푟푒푑푢푐푡푖표푛 푖푛 푔푒푛푒 푑표푠푎푔푒 Flux Control Summation Principle and recessivity

Henrik Kacser (Student of 퐽 %푟푒푑푢푐푡푖표푛 푖푛 푓푢푛푐푡푖표푛 Waddell) 퐶 ≈ ≅ 1/10, 푖 푓표푟 푎 50% 푟푒푑푢푐푡푖표푛 푖푛 푔푒푛푒 푑표푠푎푔푒 hence 5 % reduction in function for a pathway of 10 genes Lack of dominance: observation F0 F0’ F1

X

Function= 100% 0% 95% Flux J Lack of dominance (expectation?) F0 F0’ F1

X

Function= 100% 0% 50% Flux J Lack of dominance (network explanation; Kacser) F0 F0’ F1

X

Function= 100% 100% 95% Flux J Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity Principles

The Novosibirk example of co-selection: Selection for domestication also brings drooping ears Belyaev’s observation

By selection

for lack of aggressivity

Function: Aggressivity Upright ears Domestified Drooping ears Singe molecule explanation fails

By selection

for lack of aggressivity

Function: Aggressivity Upright ears Domesticated Upright ears Network explanation

By selection

for lack of aggressivity

Function: Aggressivity Upright ears Network explanation

By selection

for lack of aggressivity

Function: Aggressivity Upright ears Domestified Drooping ears N.K. Popova’ lecture

Domestication

Activation of 5-HT systems

Catalepsy Hypophyse al-gonadal system 5- HT

Decrease of Reduction aggressiven of stress ess response Selection during domestication

Domestication

Activation of 5-HT systems

Catalepsy Hypophyse al-gonadal system 5- HT

Decrease of Reduction aggressiven of stress ess response After selection

Domestication

Activation of 5-HT systems

Catalepsy Hypophyse al-gonadal system 5- HT

Decrease of Reduction aggressiven of stress ess response Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity M4 @ ISBE.NL team

Stefania Astrologo, Ewelina Weglarz-Tomczak, YanFei Zhang

Make Me My Model service: Something 4U? Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity The human: A jungle of 25 000 genes and gene products and various nutrition, life style and ambition factors This must be impossible to deal with. Where to start……? We obtained the consensus genome wide metabolic map (Recon2), i.e. all the human network can make from any nutrition

food1

food2

This is all components in the context of their whole food3

Data concerning all metabolic genes have hereby been integrated into a predictive format. Predicting how every molecule in our body is made by our body Does this help?

Could it lead to cures?

40 Example of map utilization tyrosine metabolism: nature Phenylketonuria (PKU) = IEM

Phenylketone  urine Nutrition Protein Phe ✗✗ ✗OK Tyr ✗ ✗ ✗

dopa

dopamine

Nor-epinephrin Example of map utilization tyrosine metabolism: Nurture and brain

Protein

dopa (Herbeck dopamine presentat ion)

Nor-epinephrin

Epinephrin=adrenaline Now: after hearing Prof Popova: Mapping serotonin (?) Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity How can we understand clonal heterogeneity (such as in these glucose transporter activities)?

Yeast uptake of fluorescently labelled glucose analogue Explanations Phenotypic heterogeneity

• Variations in external conditions (extrinsic noise) • Genetic diversity • Epigenetic diversity: • Intrinsic Noise • Bistability Explanations Phenotypic heterogeneity

• Variations in external conditions (extrinsic noise) • Genetic diversity • Epigenetic diversity: • Intrinsic Noise • Bistability What is noise?

Due to temporarily increased/decreased activities of molecular processes At different moments in different individual cells, Hence noise cell-cell heterogeneity From statistical mechanics: In a flat (bio)chemical network at steady state, noisy molecule numbers should be (approximately) Poisson distributed Probability distribution: characteristics

• 흁 = 푥ഥ푖: the average of x 2 2 2 2 • 휎 = (푥푖 − 휇) = 푥푖 − 휇 : Noise: 0.07 • 휎 = 푠푡푎푛푑푎푟푑 푑푒푣푖푎푡푖표푛 = 휎2: the width of the 0.06 distribution 0.05 휎 • Relative noise = Coefficient of variation: 푐푣 = Τ휇 0.04 =standard deviation in number of 흈ퟐ • Fano factor: 푭 = ൗ흁: deviation from trivial noise 0.03 mRNAs=6.3 (width of distribution)

0.02

0.01 =Average number of mRNAs =40

Probability that number of mRNAs = = number N mRNAs of that Probability 0 0 50 100 150 200 -0.01 N=Number of mRNAs/cell The equilibrium case: Poisson distributed molecule numbers F=1, i.e. 휎2 = 휇

휎 Relative noise ( ) decreases with Absolute noise (휎) increases 휇 with average number of average number of molecules molecules () () 휎 1 cv= = 휎 = 휇 휇 휇 0.14 100

0.12 80 0.1 =10 60

0.08 m=10 )

 m=40 0.06 =40 40 m=100

P(x=n) 0.04 20

=100 P(x=n/ 0.02 0 0 0 0.5 1 1.5 2 0 50 100 150 200 -0.02 -20 n n/ If Poisson, then for ‘normal’ reactions and ‘usual’ molecule numbers: relative noise < 3% ATP protein

E. coli: 1 m3, 5 mM ATP: 3 million molecules; 1/3000000=0.1 % Total protein: 3 million; on average of each type; 1000; 3 % relative noise?

S. cerevisiae: 40 m3, 5 mM ATP: 120 million molecules; <0.01% Total protein: 100 million; on average of each type; 16 000; <1 % relative noise?

Mammalian cell: 2 pL, 5 mM ATP: 6 billion molecules; << 0.01% Total protein: 10 billion; on average of each type; 400 000; <0.2 % relative noise? Therefore: Poisson noise can not explain our (and others’) observations.

But what can? And: Is such noise important? We now study this vis-à-vis oestrogen receptor positive breast cancer

Therefore these cases are treated with tamoxifen that 70% OF BREAST TUMORS ARE binds to the Estrogen OESTROGEN RECEPTOR TESTOSTERONE POSITIVE(ER+) receptor without activating it Inadvertent activation leads to proliferation AROMATASE

AROMATASE INHIBITOR However TAMOXIFEN 40-50% OF ER+ PATIENTS OESTRADIOL DEVELOP RESISTANCE ESTROGEN RECEPTOR Do all cancer cells become resistant, ER ER PROLIFERATION e.g. due to induction of more ER ER INTRATUMOR HETEROGENEITY ER TARGET GENES estrogen receptor? REPRESENTS A MAJOR No,OBSTACLE just a few doTO EFFECTIVE CANCER TREATMENT. But these outgrow the others Noise in MCF-7 (clonal) cells; noisy CD44 promotor

• 4 MCF-7 sister cells, DAPI stain, GFP that reports CD44 promoter activity(possibly resistance related) So Yes: it probably is important and GAPDH FISH mRNA probe.

And as shown by Moshkin this morning it may be important for IVF (2015) PLoS Comput Biol 11(4): e1004236. doi:10.1371/journal.pcbi.1004236

Between 3 and 9 h precisely b mRNAs molecules synthesized: Stateb = burst oscillations size Could such epigenetics lead to transcription bursting and to Non- Poisson distributed mRNA?

ퟐ ퟐ 흈 흈 ൗ풏ഥ T풉풆 푭풂풏풐 풇풂풄풕풐풓 = 푭 ≝ = ൘ 흈ퟐ 풏ഥ ൗ풏ഥ 푷풐풊풔풔풐풏

is a measure of the deviation from Poisson distribution

So, the issue is whether F >>1 due to bursting Modelling backed up by Statistical Mechanics

Probability to have n’ Probability to have n mRNAs at time t mRNAs at time t

∞ 풅푷 풏, 풕 푷ሶ ≝ = ෍ 푊 푛 푛′ ∙ 푃 푛′, 푡 − 푾 풏′ 풏 ∙ 푃 푛, 푡 푑푡 풅풕 푛′=0

Increase in the probability Probability for cell with n’ Probability for cell with n that a cell contains n RNA RNA molecules to become RNA molecules to become molecules a cell with n RNA a cell with n’ RNA molecules molecules For bursting transcription and linear degradation of mRNAs we derived for the noise in the number of mRNAs: variance 휎2 푏푢푟푠푡 푠푖푧푒 + 1 퐹 ≝ = 푠푡푒푎푑푦 푠푡푎푡푒 푁ഥ 2 Mean number of mRNA molecules

Implications: • For burst size of 1: F=1, distribution is Poisson, variance equals the average • F is ONLY a function of burst size, not of burst kinetics • For burst size =100: F=50.5 and distribution far from Poisson, variance 50 times the average, so Yes, bursting can cause high F’s Confirmations by Gillespie modelling Bursting: indeed a distribution much broader than Poisson

Westerhoff: 60 This can even produce a bimodal distribution

kdeg = 0.1 kburst=0.1 Burst size b= 100

F = 50.5 Indeed, the Fano factor is independent of kburst and kdeg , whereas the other noise factors are not Hence: one may infer the burst size from the Fano factor precisely because of the non identifiability MCF-7 cell, DNA stained with DAPI (blue) and mRNA stained with fluorescent ssDNA probe (red). Green and yellow circles enclose single mRNA molecules.

Distributions of the number of transcripts obtained through smFISH experiments, in MCF-7 cell lines. HES1 10

8

r 6

o

t

c

a

F

o i.e. burst size = 9

n

a

F 4 So Yes, the observed 2 diversity in mRNA could be explained by our epigenetic 0 r rs u rs u o u o h o h h transcription clock! 1 0 0 2 Duration of treatment And now for an intriguing discovery (??) When starting from resting cells F increases as if cells begin to ‘exploring’

HES1 10

8 Statistical mechanics:

r 6

o t 푑퐹 푏∙푘 푏−퐹

c 푠

a

F ~ ∙ 푏 − 퐹 ~ >0

o n ത a 푑푡 푛 푡

F 4

2

0 r rs u rs u o u o h o h 1 h 0 0 2 Duration of treatment Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity Systems Biology

• What is it? • Principles – Lack of dominance (Kacser) – Co-selection (Belyaev) • Progress – Make Me My Model – The genome wide metabolic maps – Epigenetics and noise/cell diversity Acknowledgements

Pernette Verschure

and many other teachers, colleagues and students

Thierry Will Stefania Mondeel Beckman Astrologo Before the tea break