Korsching ---- Systems Biology -- Bioinformatics 1 -- BIOLOGY

Bioinformatics 1

Systems Biology: an Introduction & Definitions

29.11.2018 – 10:15 to 11:45

Eberhard Korsching

[email protected]

http://www.bioinformatics.uni-muenster.de/teaching/

Introduction: Overview

My personal background

Interests in Physics / Applied Mathematics / Life Sciences

Studies in Chemistry / Biochemistry

PhD in fields of Biochemistry / Immunohistochemistry / Cell Biology / sequence theoretical methods (HUSAR)

Venia Legendi in Experimental Pathology (Medicine), transition human Life Sciences / Expressome / Phenotypes

Now a stronger focus on theoretical methods in Systems Biology / Computational Biology, but embedded in Biology / Medicine

... this double period should encourage you to discover the wealth of theoretical science ...

What fields are involved ?

[Figure: contributing fields (Biology, Informatics, Biochemistry, Computational Biology, Bioinformatics, Statistics, Probability Theory) arranged around (cellular) Systems Biology; side labels: ideas, wet lab, theory lab]

a theoretical discipline & interdisciplinary ...

Resources

Internet resources: there are many introductions, tutorials and scientific publications available for free

Some books: discover topics, get inspired

The evolution of biology follows physics

[Figure: two timelines. Labels: since 1950, since 1500; experimental biology, experimental physics; theoretical biology, theoretical physics; systems biology; networks; now 2018]

Systems biology

One attempt of a definition: Systems biology tries to understand mechanisms

So mechanisms are composed of nodes (with states), transitions between states, and interaction / exchange in between nodes

Combining the elements

[Figure: gene 1, gene 2, gene 3 with an interaction; adding some change; the human cell will respond. Image source: www.shutterstock.com]
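The idea of nodes with states, transitions between states, and interactions between nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-gene chain (gene1 switches on gene2, gene2 switches on gene3), the Boolean on/off states and the synchronous update are all hypothetical and not taken from the slide.

```python
# Hypothetical three-gene chain with Boolean states (True = expressed).
# The wiring below is an illustrative assumption, not a real pathway.
initial_state = {"gene1": True, "gene2": False, "gene3": False}

# Each node gets a rule that computes its next state from the current states.
rules = {
    "gene1": lambda s: s["gene1"],  # treated as an external input, held constant
    "gene2": lambda s: s["gene1"],  # interaction: gene1 -> gene2
    "gene3": lambda s: s["gene2"],  # interaction: gene2 -> gene3
}

def step(state):
    """One synchronous transition: every node updates from the same old state."""
    return {node: rule(state) for node, rule in rules.items()}

# Follow the system over three transitions.
trajectory = [initial_state]
for _ in range(3):
    trajectory.append(step(trajectory[-1]))

for time_point, state in enumerate(trajectory):
    print(time_point, state)
```

The change introduced at gene1 needs two transitions to reach gene3, which is the kind of system response the figure hints at.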

Systems biology

Another attempt of a definition

“The whole is more than the sum of the parts”

simple example: system with one observable, system with two observables, combinations of observables ....

Lingua franca

pointing or speaking ?

(some) modeling languages

" Scientists who have come from engineering disciplines tend to use MATLAB "

" those from the statistics community are fluent with R "

" and those in the computer science community may be more familiar with PYTHON "

from the book 'Quantitative-Biology'

Systems biology

… deals with the analysis of all (known/relevant) interactions in between the components of a cellular system

… tries to explain and to predict cellular behavior

… emerged with the appearance of the high-throughput technologies (massively parallel measurements of molecular observables)

Systems biology

One of the major visualizations of the 'system' from the perspective of biology is a graph ( see also --> 'graph theory' ) composed of nodes / vertices and edges / arcs

visualizes relations in-between the parts of the system

High-throughput technologies

- microarrays (you already heard the last lesson about that)
- Tissue microarrays
- Genome-, -, ChIP sequencing etc.

Characterized by a highly parallel measurement of e.g.

concentrations / numbers of system compounds

appearing or disappearing of system compounds ...

Components versus Systems

(in silico biology)

--> paradigm shift -->

from the book B.O. Palsson

Interactions (biochemistry)

- Biochemical reactions between molecular factors
- Binding (without chemical reactions) forming a molecular complex enabling something ...

… are forming one biological network which can be analyzed in different views, e.g.

reaction networks (e.g. Petri nets): looking on properties like reaction kinetics

co-expression networks: comparing expression trends over time and between genes

Section summary (I)

- we have learned how this interdisciplinary field systems biology is organized
- we have learned some simple ideas on what might be a 'system'
- we have seen some visualizations and learned some terms important in systems biology
- we combined systems biology with biology and the necessary laboratory methods

Systems behavior

[human tissue] A --> B : systemic change

A - Normal: A normal duct has a myoepithelial cell layer and a single luminal cell layer

B - Epithelial hyperplasia: The lumen is filled with a heterogeneous population of cells of different morphologies

Section summary (II)

Systems biology is a gain for

- putting (many) different experimental observations together in a sense-making way, which means creating a -> system
- answering mechanistic questions: how does the cell work ?
- which of my favorite molecules are playing together ?

Outline of this lecture

Introducing some important parts of systems biology
discussing measurement options
pointing to some limitations

To get some insight: a detailed example on protein co-expression

Spotlight on
microRNA - mRNA networks
Petri nets
co-expression networks

Overview on further fields - systems biology is huge ...

Many different types of networks

Protein-protein interaction networks (PPI, sometimes also peptide-peptide interaction): primary network - e.g. transcription pre-initiation complex, cell structure forming, signaling

Metabolic networks: primary network - e.g. databases : KEGG (Japan), ExPASy (Swiss) Biochemical Pathways

Genetic interaction networks: meta network - e.g. observe pattern of mutations and associate with disease types

Gene / transcriptional regulatory networks: primary network - cellular control on structure and function, e.g. cellular differentiation, morphogenesis

Cell signaling networks: primary network - cell communication plays a role in e.g. development, immunity

Gene / protein expression networks: meta network - observe expression pattern and associate e.g. with disease function

Neural networks: mixture between primary and meta network - e.g. thinking

Ecological networks: meta network - e.g. ecological interactions between species

Systems biology: our main model

ESTIMATES for the complexity of a human cell

About 25 * 10^9 human hemoglobin macromolecules (64 kDa, ku) fit in the volume of a human cell of r = 5 µm

Approximately 1-3 * 10^5 basic types of macromolecules (including variants, subunits) and a lot more protein states

http://www.estrellamountain.edu/faculty/farabee/biobk/BioBookCELL2.html

Some details on graph theory

[Figure: a simplified graph with labeled edge and nodes; example nodes with node degree = 1, node degree = 3, and a 'hub' like node with node degree = 5]
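The node degree from the graph figure above can be computed directly from an edge list. The following is a minimal sketch; the node names and edges are made up for illustration and only chosen so that degrees 1, 3 and 5 appear, as in the figure.

```python
from collections import defaultdict

# Hypothetical undirected edge list; node names are illustrative only.
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("hub", "e"),
    ("a", "b"), ("a", "f"),
]

def build_adjacency(edge_list):
    """Store an undirected graph as a mapping node -> set of neighbors."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

adjacency = build_adjacency(edges)

# Node degree = number of direct neighbors.
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}

# The node with the highest degree is the 'hub' like node.
hub_node = max(degree, key=degree.get)

print(degree)
print("hub:", hub_node)
```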

What is a network ?

Abstract definition: interconnected objects

‘interconnect’ means : sharing something

e.g. interchanging information or physical objects

in biology, e.g. metabolic reactions

to transport energy
to build macromolecules
to degrade macromolecules
to transport information

https://en.wikipedia.org/wiki/Metabolic_pathway

Directed versus undirected graphs

[Figure] http://webmathematics.net/
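The difference between a directed and an undirected graph shows up as soon as the same edge list is stored both ways. This is a small sketch with hypothetical nodes A, B, C: in the undirected version every edge is shared by both end nodes, in the directed version an edge only leads from its source to its target.

```python
# Hypothetical edge list; in the directed reading each pair means source -> target.
edges = [("A", "B"), ("B", "C"), ("C", "A")]

def undirected_neighbors(edge_list):
    """Undirected graph: an edge connects both nodes symmetrically."""
    neighbors = {}
    for u, v in edge_list:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def directed_successors(edge_list):
    """Directed graph: an edge only points from the first node to the second."""
    successors = {}
    for source, target in edge_list:
        successors.setdefault(source, set()).add(target)
        successors.setdefault(target, set())
    return successors

undirected = undirected_neighbors(edges)
directed = directed_successors(edges)

print("undirected:", undirected)
print("directed:  ", directed)
```

A metabolic reaction that converts a substrate into a product is naturally read as a directed edge, while a plain binding event between two proteins is often drawn undirected.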

Network properties

Degree distribution: number of direct neighbors per node

Subgraphs / motifs

Betweenness centrality (BC): number of shortest paths through a node (bridge function)

Assortativity: high degree nodes directly connected to other high degree nodes

Network models (II)

[Figure labels: 'Power law or', 'biological networks']

[Figure: node degree distribution P(k); the node number (sample size) does not change the distribution]

[Figure: average clustering coefficient C(k); C(k) is independent of k (node degree)]
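Two of the properties named above, the degree distribution P(k) and the clustering coefficient, are easy to compute by hand for a small graph. The sketch below uses a hypothetical four-node graph (a triangle A-B-C plus a node D attached to A). Setting the coefficient to 0 for nodes with fewer than two neighbors is a common convention, assumed here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical undirected graph as node -> set of neighbors.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

def degree_distribution(adj):
    """P(k): fraction of nodes that have exactly k direct neighbors."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    total = len(adj)
    return {k: count / total for k, count in sorted(counts.items())}

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed for nodes with fewer than two neighbors
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

print(degree_distribution(adjacency))
for node in adjacency:
    print(node, round(clustering_coefficient(adjacency, node), 3))
```

Averaging the coefficient over all nodes of the same degree k gives the curve C(k) mentioned above.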

Real world node degree distributions

[Figure: Poisson distributed versus power law like distributed node degrees; further labels: biological networks, other networks]

Network types

[Figure label: 'Power law or']
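For reference, the two kinds of degree distribution named in the figure have the following standard textbook forms. The symbols $\langle k \rangle$ (mean node degree) and $\gamma$ (power law exponent) are not on the slide and are introduced here only for illustration.

```latex
% Poisson distributed node degrees (typical for random graphs)
P(k) = \frac{\langle k \rangle^{k} \, e^{-\langle k \rangle}}{k!}

% Power law like distributed node degrees (heavy tail, allows hubs)
P(k) \propto k^{-\gamma}
```

The Poisson form is sharply peaked around $\langle k \rangle$, so nodes with very many neighbors are practically absent, while the power law form decays slowly and therefore permits a few highly connected hubs.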

Network models (I)

might be useful sometimes as a control

Hubs & Routers

[Figure labels: 'Power law or', 'biological networks']

Join graph theory with biology (I)

[Figure labels: promoter, gene, transcription, DNA-directed RNA polymerase, translation, protein, dimerization]

Prokaryotic auto-regulatory feedback loop.

Immune response - even more complex

Join graph theory with biology (II)

[Figure labels: gene A, gene B, binding site, auto-regulation, foreign regulation]

Example of a eukaryotic transcription factor activity.

Motifs (I)

A configuration of nodes with a biological significance

most of the time also synonymous with a 'directed graph'
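Auto-regulation and foreign regulation translate into two very simple patterns in a directed graph: an edge from a gene to itself, and an edge between two different genes. The sketch below assumes a hypothetical edge list in which gene A regulates itself and gene B; the direction is an illustrative choice and not read from the figure.

```python
# Hypothetical regulatory edges, read as regulator -> target.
regulation = [
    ("geneA", "geneA"),  # gene A product binds its own binding site
    ("geneA", "geneB"),  # gene A product acts on gene B
]

def classify(edge):
    """A self-loop is auto-regulation, any other edge is foreign regulation."""
    regulator, target = edge
    return "auto-regulation" if regulator == target else "foreign regulation"

motif_types = {edge: classify(edge) for edge in regulation}

for edge, kind in motif_types.items():
    print(edge, "->", kind)
```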

Join graph theory with biology (III)

Example of a eukaryotic transcription.

Motifs (II)

[Figure labels: Transcription factor, gene target]

PLoS Comput Biol. 2013;9(8):e1003210. doi: 10.1371/journal.pcbi.1003210

Section summary

- we have learned some basics on the cellular model
- we have learned some facts on graphs, motifs and the visualization of models
- we translated biology into graphs and vice versa

What to measure ?

Genome sequences

Gene expression ( mRNA, lncRNA, sRNA, … )

Proteome expression ( / peptides, ... )

Glycome expression ( glycans / polysaccharides, ... )

Alterations of macromolecules: splice variants, methylation patterns, mutations, degradation products

much more ...

Systems biology approaches

more abstract
simulated networks
relevance networks
co-expression networks
Petri nets
reaction networks
more concrete

How to measure, e.g. protein expression ?

[Figure: basal breast cancer, K5/p63 positive, K14 negative. Modern Pathology 2005 Oct; 18(10):1321-8]

One way is to count total cells in a certain sector with a positive feature and maybe add the strength of the staining to the score value

The result for each sector is a single number ranging e.g. from 0 (negative) to 3 (strongly positive), weighted by the number of positive cells
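The scoring idea above can be written down as a small function. The slide does not give the exact formula, so the weighting used here (staining strength multiplied by the fraction of positive cells) is only one plausible reading and is an assumption, as are the function name and the example counts.

```python
def sector_score(positive_cells, total_cells, staining_strength):
    """Score one sector on a 0 to 3 scale.

    staining_strength: 0 (negative) to 3 (strongly positive).
    The product with the positive-cell fraction is an assumed weighting,
    not the formula used in the lecture.
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= staining_strength <= 3:
        raise ValueError("staining_strength must be between 0 and 3")
    fraction_positive = positive_cells / total_cells
    return staining_strength * fraction_positive

# Hypothetical counts for three sectors of one sample.
print(sector_score(positive_cells=0, total_cells=120, staining_strength=0))
print(sector_score(positive_cells=50, total_cells=100, staining_strength=3))
print(sector_score(positive_cells=80, total_cells=80, staining_strength=3))
```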

Again, remember

The basis for calculating networks is qualitative and/or quantitative information on interaction

so, we need measurements ...

It might look different for other sub-types

basal breast cancer, which represents almost 9% of invasive ductal breast cancers

K5/K14 positive, p63 negative

mostly aggressive, hormone receptor negative, grade III tumors

Another cancer subtype

basal breast cancer: p63 positive, K5/K14 negative

Transcription factor p63-positive tumors are rare and showed in this case no overlap to K5/K14-positive tumors

Measurement summary

Every method has its limitations: be aware of that, be critical on the quality of your lab results

Translating measurements into data is a critical step and has impact on the analysis procedure and the results

NOTE: Like laboratory experiments, theoretical experiments (usually termed 'analysis') also need controls to verify the stability of the generated results

Drawing good conclusions is bound to careful considerations

Section summary

- we have learned some basics on protein measurements (immunohistology)
- this might be one basis to reconstruct biological networks

A step by step example

invasive ductal carcinoma [case]

One physiological situation (-no- case control design)

Objective : look for molecular dependencies in this situation

Another common approach to expression measurements

Pick a piece of tissue which owns predominantly a certain cell type of interest ( cells of interest > 70 % )

Dissolve the tissue and separate the molecule class of interest

Take appropriate technologies ( RNAseq, expression microarrays ) to generate signals of expression strength

Drawback in this case: loss of information ( spatial information, morphology ), cell type mixture of unknown composition, ...

How to create these sections ?

sections - approx. 4 µm thick and an area between 1 mm² and 1 cm²

Cryotome, Microtome

How to stain these sections ?

- H&E stain is for showing tissue structure
- Immunostaining is to visualize specific macromolecules by using antibodies directed against these molecules

[Figure with steps 1 to 5] http://www.abcam.com - modified

Section summary

- Selecting the tissue sample
- Preparing the immunohistology
- Making the expression measurement with 3 replicates
- Calculating the score value for each protein

What to measure in this example ?

16 different proteins and their expression

one patient sample of a tissue owning ductal invasive carcinoma

we see a pattern of 16 protein expression values (proteins 1 to 16)

How to come to dependencies ?

At this point we only measured one patient sample; this is not sufficient to see a dependency between different proteins

The term dependency implies a further type of information

In our case this additional information resulted from multiple patient measurements of the same disease situation, (slightly) varying the basic situation

How to measure multiple patients ?

tissue microarray (TMA)

slide with one TMA section

[Figure credit: Tibor Krenacs, Semmelweis University Budapest, modified]

How stable is the measurement ?

If we have one measurement, we own some uncertainty whether the next measurement is exactly equal

do at least 3 replicate measurements (1, 2, 3)

NOTE: there are statistical rules to decide how many replicates are necessary to get a certain precision

Lab Münster – Bürger/Korsching
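The note on replicates can be illustrated with the standard error of the mean and one common rule of thumb for the required number of replicates, n = (z * sd / margin)^2. This is a sketch under stated assumptions: the three replicate values are invented, the rule assumes roughly normally distributed measurement errors, and z = 1.96 corresponds to a 95 % confidence level. The lecture may have a different rule in mind.

```python
from math import ceil, sqrt
from statistics import mean, stdev

# Hypothetical score values from 3 replicate measurements of one sample.
replicates = [2.0, 2.5, 3.0]

def standard_error(values):
    """Uncertainty of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def replicates_needed(sd, margin, z=1.96):
    """Replicates needed so the mean is within +/- margin (normal approximation)."""
    return ceil((z * sd / margin) ** 2)

print("mean:", mean(replicates))
print("standard deviation:", stdev(replicates))
print("standard error:", round(standard_error(replicates), 3))
print("replicates for +/- 0.25:", replicates_needed(stdev(replicates), 0.25))
```

With only three replicates the estimate of the standard deviation is itself uncertain, so the computed number should be read as a rough guide.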

One section - one protein stain

Now the first tissue microarray section will be used

All spots come from the same basic situation

Each spot is a variation of the basic situation and details look like this :

[Figure: one (serial) section with 640 samples, stained for KI-67; magnified view of 6 of 640 samples]

Entering the first analysis step

Applying a proximity measure between each of the proteins (columns)

Purpose: is the profile of the protein across all samples similar or not

[Figure: score value profile for each protein (here only 3)]

Method: Pearson correlation [ normalizing proximity measure ]

Result: one value [ condensing measure ]
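The analysis step above, one Pearson correlation value per pair of protein profiles, can be sketched as follows. The score sheet is invented for illustration (three proteins, six samples instead of 640); only the method, Pearson correlation between protein columns, is taken from the slide.

```python
from itertools import combinations
from math import sqrt

# Hypothetical score sheet: one profile (column) per protein, one entry per sample.
score_sheet = {
    "K5":  [0, 1, 2, 3, 1, 0],
    "K14": [0, 1, 3, 3, 1, 0],
    "ER":  [3, 2, 1, 0, 2, 3],
}

def pearson(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# One condensed value per protein pair.
correlations = {
    (p, q): pearson(score_sheet[p], score_sheet[q])
    for p, q in combinations(score_sheet, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

The normalization makes the value independent of the absolute score level, and each pair of profiles is condensed into a single number between -1 and 1.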

More than one protein – more sections

Every section will be stained by a different protein

[Figure: serial sections stained for K 5, K 8/18, K 19, EGFR, K 14, EMA, ERB-B2, VIM, ...]

Drawback: every section will own different cells

Assumption: on average the same physiological situation

Result: every protein in every patient sample forms its own characteristic expression pattern

Resulting dependencies

splitting the factors into seven partitions

'reference': K5-6, K14, K19, K8-18, K1, K10
'test': EGFR, VIM, KI-67, p53, ERBB2, Cyclin-D1, BCL-2, EMA, PR, ER

generalized regression

optimal ordered correlation values

dependencies: synergistic co-expression, no dependency, antagonistic co-expression

Cancer Inform. 2016 Jun 29;15:143-9 , Histopathology. 2015 Dec;67(6):888-97
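The three dependency classes named above can be read off a correlation value once a cut-off is chosen. The sketch below is an illustration only: the threshold of 0.5 is a hypothetical choice, the slide does not state one, and the cited analysis uses generalized regression rather than a plain cut-off.

```python
def dependency_type(correlation, threshold=0.5):
    """Map a correlation value to a dependency class (threshold is hypothetical)."""
    if correlation >= threshold:
        return "synergistic co-expression"
    if correlation <= -threshold:
        return "antagonistic co-expression"
    return "no dependency"

# Hypothetical correlation values between a 'reference' and a 'test' factor.
examples = {("K5", "EGFR"): 0.8, ("K5", "ER"): -0.7, ("K5", "BCL-2"): 0.1}

for pair, r in examples.items():
    print(pair, r, "->", dependency_type(r))
```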

Score data based on raw signals

[Figure: score sheet with proteins as columns and 640 samples as rows]

Section summary

- The term dependency assumes the idea of a network and implies a further type of information
- The concept : every patient has the same network but in slightly different states
- The superposition of all network variants is a surrogate marker for the dependency

Another network analysis example

Double differential expression analysis

Forming an interaction model from the results

PLoS Comput Biol. 2013;9(8):e1003210. doi: 10.1371/journal.pcbi.1003210

Scheme for an enzymatic reaction

Now further topics on networks and their application

Membrane transport

Tools to work with

Download & Credits: http://www-dssz.informatik.tu-cottbus.de/DSSZ/Software/Software

The scientific biography behind that: Prof. Monika Heiner

Petri net

‘Petri nets’ are used as a formal and graphical language to model systems

A Petri net is represented by a directed, finite, bipartite graph, typically without isolated nodes

Blaetke MA Tutorial Petri Nets in Systems Biology 2011
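The bipartite structure of a Petri net (places that hold tokens, transitions that move them) can be made concrete in a few lines. The following sketch assumes a hypothetical net for a simple enzymatic reaction, E + S -> ES -> E + P; the place and transition names and the token counts are illustrative and not taken from the tutorial cited above.

```python
# Places hold tokens; the marking is the current token count per place.
initial_marking = {"E": 1, "S": 1, "ES": 0, "P": 0}

# Each transition: (tokens consumed per input place, tokens produced per output place).
# Arcs only run place -> transition -> place, which makes the graph bipartite.
transitions = {
    "bind":     ({"E": 1, "S": 1}, {"ES": 1}),
    "catalyze": ({"ES": 1}, {"E": 1, "P": 1}),
}

def is_enabled(marking, name):
    """A transition is enabled if every input place holds enough tokens."""
    inputs, _ = transitions[name]
    return all(marking[place] >= weight for place, weight in inputs.items())

def fire(marking, name):
    """Fire a transition: consume input tokens, produce output tokens."""
    if not is_enabled(marking, name):
        raise ValueError(f"transition {name!r} is not enabled")
    inputs, outputs = transitions[name]
    new_marking = dict(marking)
    for place, weight in inputs.items():
        new_marking[place] -= weight
    for place, weight in outputs.items():
        new_marking[place] += weight
    return new_marking

after_bind = fire(initial_marking, "bind")
final_marking = fire(after_bind, "catalyze")

print(initial_marking)
print(after_bind)
print(final_marking)
```

After both firings the enzyme token is back in place E and the substrate token has become a product token, so "bind" is no longer enabled.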

WGCNA - R package

Co-expression networks, but instead of (as before) over factors (proteins) in many different patients of one condition,

here over factors (genes) in many different conditions of one entity (e.g. a certain cell type treated with different agents)

objective : find gene modules discriminating these conditions

BMC Bioinformatics. 2008 Dec 29;9:559. doi: 10.1186/1471-2105-9-559

WGCNA - example of use

Showing hundreds of genes and modules does not work well to give an overview on affected regulatory mechanisms

Solution:
- focus on strong dependencies
- get more abstract

The 47 prognostic modules are plotted in four circles, each representing one cancer type. Grey lines: conservation correspondence between different cancer types.

GBM - glioblastoma
OV - ovarian adenocarcinoma
BRCA - invasive breast carcinoma
KIRC - kidney carcinoma

Nat Commun. 2014;5:3231. doi: 10.1038/ncomms4231

How does it work ?

[Figure: expression matrix with all genes (gene 1, 2, 3, ..., i-1, i) as rows and different experiments (1, 2, ..., n) as columns; sample traits 1, 2, ..., m]

Pearson correlation of the tested gene profiles (rows)

The outcome is a number of similar genes called a module, with similar expression properties, sharing a specific biological context or task, which might (partly) explain the difference between biological conditions

Overview on further fields

Databases

Chemical reaction constants database: http://kinetics.nist.gov/kinetics/welcome.jsp

Biochemical reaction kinetics database: http://sabio.villa-bosch.de/ and http://www.ebi.ac.uk/biomodels/

microRNA database: http://www.mirbase.org/

Databases (cont.)

Genome database: http://www.internationalgenome.org/

EBI database: http://www.ebi.ac.uk/

NCBI database: https://www.ncbi.nlm.nih.gov/

Pathway database: http://www.genome.jp/

many more smaller databases ... but often not evolving :(

How to create a module ?

here by hierarchical clustering
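Creating modules by hierarchical clustering can be sketched with plain Python. This is a deliberately simplified stand-in, not the WGCNA procedure: it uses 1 minus the Pearson correlation as distance, average linkage, and a fixed cut height, all of which are assumptions made for illustration. The four gene profiles and the cut of 0.5 are invented.

```python
from math import sqrt

# Hypothetical expression profiles: one row per gene, one value per experiment.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y)))

def cluster_distance(c1, c2):
    """Average linkage: mean of (1 - correlation) over all gene pairs."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def find_modules(cut):
    """Merge the two closest clusters until the closest pair is farther than `cut`."""
    clusters = [[gene] for gene in profiles]
    while len(clusters) > 1:
        distance, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance > cut:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

modules = find_modules(cut=0.5)
print(modules)
```

Genes whose profiles rise together end up in one module and genes whose profiles fall together in another, which is the sense in which a module groups genes with similar expression properties.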

Pathway database : KEGG

N-glycans or asparagine-linked glycans are major constituents of glycoproteins in eukaryotes

Color code: Homo sapiens: green; Drugs/disease: blue, pink

Epigenetics

A Stochastic Model of Epigenetic Dynamics in Somatic Cell Reprogramming

[Figure panels A, C; label: Methylation targets]

Transcriptional regulators that account for the activation of a certain cell state are combined into a module.

Four modules: 2 different differentiation modules A and B, the pluripotency module P, and the exogenous reprogramming genes E.

Each module is governed by the activity of the other modules as well as its epigenetic states.

Front Physiol. 2012 Jun 27;3:216. doi: 10.3389/fphys.2012.00216

Systems Biology Markup Language (SBML)

http://sbml.org

Idea: a computer language is necessary to standardize the description of models and to enable comparisons / automate computations

SBML is free and open

SBML is useful for models of metabolism, cell signaling, ....

The R platform is supporting this language: https://www.bioconductor.org/packages/release/bioc/html/SBMLR.html

Synthetic biology

Already existing units from existing cells will be taken and be assembled to a simple but functional new cellular system, e.g. to produce certain proteins

The quest for the minimal bacterial genome. Current Opinion in Biotechnology 2016, 42:216–224

Ways to construct the minimal cell

Evolutionary Systems Biology

The hallmarks of cancer have strong links to the evolution of cellular properties, meaning adaptation, survival and remodeling of cellular control over time.

Hanahan and Weinberg. Cell. 2011 Mar 4;144(5):646-74.

Even more advanced: Trying to model evolutionary processes by analyzing the dynamics of the present cellular system under the constraint of the environment.

Encyclopedia of Evolutionary Biology, Volume 4. doi:10.1016/B978-0-12-800049-6.00184-0

An SBML model

[Figure: SBML listing of one reaction; recoverable labels: Reaction, k1, X0] sbml.org - this reaction
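Because SBML is an XML format, even the Python standard library can read the basic parts of a model. The fragment below is a hand-written sketch, not the model shown on the slide: only the names X0 and k1 appear there, while the product S1, the compartment, the parameter value and all ids are hypothetical, and the kinetic law is omitted. The fragment has not been checked with an SBML validator; real work would normally use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# Hand-written SBML-style fragment (Level 3 Version 1 core namespace).
# Everything except the names X0 and k1 is a hypothetical placeholder.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="X0" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="S1" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="k1" value="0.1" constant="true"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="reaction_1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="X0" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S1" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

root = ET.fromstring(SBML_TEXT.strip())
model = root.find("sbml:model", NS)

# Collect species ids, parameter values and the reactants of each reaction.
species_ids = [s.get("id") for s in model.findall("sbml:listOfSpecies/sbml:species", NS)]
parameters = {
    p.get("id"): float(p.get("value"))
    for p in model.findall("sbml:listOfParameters/sbml:parameter", NS)
}
reactants = {
    r.get("id"): [
        ref.get("species")
        for ref in r.findall("sbml:listOfReactants/sbml:speciesReference", NS)
    ]
    for r in model.findall("sbml:listOfReactions/sbml:reaction", NS)
}

print(species_ids, parameters, reactants)
```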

Take home message

The field of systems biology is still under development

There is a multitude of different application scenarios

Realize that systems biology needs to be tightly integrated into wet lab research, because the majority of ideas for modeling are coming from experimental insight / observations