Korsching ---- Systems Biology -- Bioinformatics 1 -- BIOLOGY
BIOLOGY Bioinformatics 1
Systems Biology, an Introduction
29.11.2018 – 10:15 to 11:45
Eberhard Korsching
http://www.bioinformatics.uni-muenster.de/teaching/

Introduction
Overview & Definitions
My personal background
ideas: Interests in Physics / Applied Mathematics / Life Sciences
wet lab: Studies in Chemistry / Biochemistry
wet lab: PhD in fields of Biochemistry / Immunohistochemistry / Cell Biology / sequence theoretical methods (HUSAR)
wet lab: Venia Legendi in Experimental Pathology (Medicine), transition human Life Sciences / Expressome / Phenotypes
theory lab: Now a stronger focus on theoretical methods in Systems Biology / Computational Biology, but embedded in Biology / Medicine
... this double period should encourage you to discover the wealth of theoretical science ...

What fields are involved ?
Cell Biology, Informatics, Biology, Biochemistry, Computational Biology, Bioinformatics, Genetics, Statistics, Probability Theory
(cellular) Systems Biology: a theoretical discipline & interdisciplinary
Resources
Internet resources : there are many introductions, tutorials and scientific publications available for free
Some books : [Figure: book covers] discover topics, get inspired

The evolution of biology follows physics
[Figure: timeline up to now (2018). Experimental biology (since 1950) and experimental physics (since 1500); theoretical biology and theoretical physics; systems biology, networks]
Systems biology
One attempt of a definition : Systems biology tries to understand mechanisms
So mechanisms are composed of nodes (with states)
and transitions between states, adding some change,
and interaction / exchange in between nodes

Combining the elements
[Figure: interaction between gene 1, gene 2 and gene 3; the human cell will respond. Image: www.shutterstock.com]
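The idea of nodes with states and transitions driven by interactions can be illustrated with a minimal sketch. The three genes mirror the figure labels, but the on/off states and the update rules are hypothetical assumptions chosen only for illustration; they are not taken from the lecture.

```python
# Minimal sketch: nodes with states, and transitions driven by interactions.
# The update rules below are hypothetical illustrations, not measured biology.

# Each node carries a state: True = expressed, False = silent.
state = {"gene1": True, "gene2": False, "gene3": False}

# Interaction rules: the next state of a node depends on the current
# states of the nodes it interacts with.
rules = {
    "gene1": lambda s: s["gene1"],                     # keeps its state
    "gene2": lambda s: s["gene1"],                     # activated by gene1
    "gene3": lambda s: s["gene2"] and not s["gene3"],  # needs gene2, inhibits itself
}


def step(current):
    """Apply all transitions at once (synchronous update)."""
    return {node: bool(rule(current)) for node, rule in rules.items()}


# Follow the system over three transitions.
trajectory = [state]
for _ in range(3):
    trajectory.append(step(trajectory[-1]))

for t, s in enumerate(trajectory):
    print(t, s)
```

The synchronous update is a design choice of this sketch; other update schemes would give other trajectories.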
Systems biology
Another attempt of a definition
“The whole is more than the sum of the parts”
simple example: system with one observable, system with two observables, combinations of observables ....

Lingua franca
pointing or speaking ?
(some) modeling languages
" Scientists who have come from engineering disciplines tend to use MATLAB "
" those from the statistics community are fluent with R "
" and those in the computer science community may be more familiar with PYTHON "
from the book 'Quantitative-Biology'
Systems biology
One of the major visualizations of the 'system' from the perspective of biology is a graph ( see also --> 'graph theory' )
composed of nodes / vertices and edges / arcs
visualizes relations in-between the parts of the system

Systems biology
… deals with the analysis of all (known/relevant) interactions in between the components of a cellular system
… tries to explain and to predict cellular behavior
… emerged with the appearance of high-throughput technologies (massive parallel measurements of molecular observables)
High-throughput technologies
Gene expression microarrays (you already heard about that in the last lesson)
Tissue microarrays
Genome-, transcriptome-, ChIP sequencing etc.
Characterized by a highly parallel measurement of e.g. concentrations / numbers of system compounds
appearing or disappearing of system compounds ...
--> paradigm shift -->

Components versus Systems
(in silico biology)
[Figure] from the book B.O. Palsson
Interactions (biochemistry)
Biochemical reactions between molecular factors
Binding (without chemical reactions) forming a molecular complex enabling something ...
… are forming one biological network which can be analyzed in different views e.g.
reaction networks (e.g. Petri nets) looking on properties like reaction kinetics
co-expression networks comparing expression trends over time and between genes

Section summary (I)
we have learned how this interdisciplinary field systems biology is organized
we have learned some simple ideas on what might be a 'system'
we have seen some visualizations and learned some terms important in systems biology
we combined systems biology with biology and the necessary laboratory methods
Systems behavior
[Figure: human tissue, panels A and B]
A --> B : systemic change
A – Normal: A normal duct has a myoepithelial cell layer and a single luminal cell layer
B - Epithelial hyperplasia: The lumen is filled with a heterogeneous population of cells of different morphologies

Section summary (II)
Systems biology is a gain for
putting (many) different experimental observations together in a sense-making way, which means creating a -> system
answering mechanistic questions:
how does the cell work ?
which of my favorite molecules are playing together ?
Outline of this lecture
Introducing some important parts of systems biology
discussing measurement options
pointing to some limitations
To get some insight: a detailed example on protein co-expression
Spotlight on microRNA - mRNA networks, Petri nets, co-expression networks
Overview on further fields - systems biology is huge ...

Many different types of networks
Protein-protein interaction networks (PPI, sometimes also peptide-peptide interaction): primary network - e.g. transcription pre-initiation complex, cell structure forming, signaling
Metabolic networks: primary network - e.g. databases : KEGG (Japan), ExPASy (Swiss) Biochemical Pathways
Genetic interaction networks: meta network - e.g. observe pattern of mutations and associate with disease types
Gene / transcriptional regulatory networks: primary network - cellular control on structure and function, e.g. cellular differentiation, morphogenesis
Cell signaling networks: primary network - cell communication plays a role in e.g. development, immunity
Gene / protein expression networks: meta network - observe expression pattern and associate e.g. with disease function
Neural networks - mixture between primary and meta network - e.g. thinking
Ecological networks - meta network - e.g. ecological interactions between species
Systems biology: our main model
ESTIMATES for the complexity of a human cell
About 25 × 10^9 human hemoglobin macromolecules (64 kDa, ku) fit in the volume of a human cell of r = 5 µm
Approximately 1-3 × 10^5 basic types of macromolecules (including variants, subunits) and a lot more protein states
http://www.estrellamountain.edu/faculty/farabee/biobk/BioBookCELL2.html

Some details on graph theory
[Figure: a simplified graph with nodes and edges; examples of node degree = 1, node degree = 3 and a 'hub' like node degree = 5]
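The graph terms above (node, edge, node degree, hub) can be made concrete with a small sketch. The node names and the edge list are hypothetical; they are chosen so that the degrees 1, 3 and 5 from the figure appear.

```python
# Undirected graph stored as an adjacency dictionary: node -> set of neighbours.
# Node names and edges are hypothetical illustrations.
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("hub", "e"),
    ("a", "b"), ("a", "f"),
]


def build_graph(edge_list):
    """Register every edge in both directions (undirected graph)."""
    graph = {}
    for u, v in edge_list:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def node_degree(graph):
    """Node degree = number of direct neighbours."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


graph = build_graph(edges)
degree = node_degree(graph)
hub = max(degree, key=degree.get)  # node with the highest degree

print(degree)
print("hub-like node:", hub)
```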
What is a network ?
Abstract definition: interconnected objects
‘interconnect’ means : sharing something
e.g. interchanging information or physical objects
in biology, e.g. metabolic reactions
to transport energy
to build macromolecules
to degrade macromolecules
to transport information
https://en.wikipedia.org/wiki/Metabolic_pathway

Directed versus undirected graphs
[Figure] http://webmathematics.net/
Network properties
Degree distribution: number of direct neighbors per node
Subgraphs / motifs
Betweenness-Centrality BC: number of shortest paths through a node (bridge function)
Assortativity: high degree nodes directly connected to other high degree nodes

Network models (II)
[Figure: power law or biological networks]
P(k), node degree distribution: the node number / sample size does not change the distribution
C(k), average clustering coefficient: C(k) is independent of k (node degree)
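Two of the network properties named above, the node degree distribution P(k) and the clustering coefficient, can be computed with a few lines of code. This is a minimal sketch on a hypothetical four-node graph; the local clustering coefficient used here is the common textbook definition (fraction of neighbour pairs that are connected to each other), which is an assumption about what the slide intends.

```python
from collections import Counter
from itertools import combinations

# Hypothetical undirected graph: node -> set of direct neighbours.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def degree_distribution(g):
    """P(k): fraction of nodes that have exactly k direct neighbours."""
    counts = Counter(len(neighbours) for neighbours in g.values())
    n = len(g)
    return {k: c / n for k, c in sorted(counts.items())}


def clustering_coefficient(g, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    neighbours = g[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no neighbour pair exists
    links = sum(1 for u, v in combinations(neighbours, 2) if v in g[u])
    return 2 * links / (k * (k - 1))


p_k = degree_distribution(graph)
c_a = clustering_coefficient(graph, "a")

print("P(k):", p_k)
print("C(a):", c_a)
```

Averaging the clustering coefficient over all nodes of the same degree k gives C(k).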
Real world node degree distributions
[Figure: Poisson distributed (other networks) versus power law like distributed (biological networks)]

Network types
[Figure: power law or biological networks]
Network models (I)
[Figure] might be useful sometimes as a control

Hubs & Routers
[Figure: power law or biological networks]
Join graph theory with biology (I)
[Figure: promoter, gene, transcription, DNA-directed RNA polymerase, translation, protein, dimerization]
Prokaryotic auto-regulatory feedback loop.

Immune response - even more complex
[Figure]
Join graph theory with biology (II)
[Figure: gene A, gene B, binding site; auto-regulation and foreign regulation]
Example of an eukaryotic transcription factor activity.

Motifs (I)
A configuration of nodes with a biological significance
most of the time also synonymous with a 'directed graph'
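Auto-regulation and foreign regulation translate directly into a directed graph: a self-loop and an edge between two different nodes. The following sketch shows this translation; the edge list with geneA and geneB is a hypothetical reading of the figure labels.

```python
# Directed edges as (regulator, target) pairs.
# The gene names follow the figure labels; the edges are hypothetical.
regulation = [
    ("geneA", "geneA"),  # auto-regulation: the product acts on its own gene
    ("geneA", "geneB"),  # foreign regulation: A regulates B
]


def auto_regulated(edges):
    """Nodes with a self-loop, the simplest regulatory motif."""
    return sorted({src for src, dst in edges if src == dst})


def regulators_of(edges, target):
    """Foreign regulators: nodes other than the target that point at it."""
    return sorted({src for src, dst in edges if dst == target and src != target})


def in_out_degree(edges):
    """In a directed graph, incoming and outgoing edges are counted separately."""
    indeg, outdeg = {}, {}
    for src, dst in edges:
        outdeg[src] = outdeg.get(src, 0) + 1
        indeg[dst] = indeg.get(dst, 0) + 1
    return indeg, outdeg


indeg, outdeg = in_out_degree(regulation)
print("auto-regulated:", auto_regulated(regulation))
print("regulators of geneB:", regulators_of(regulation, "geneB"))
print("in-degree:", indeg, "out-degree:", outdeg)
```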
Join graph theory with biology (III)
[Figure]
Example of an eukaryotic transcription.

Motifs (II)
[Figure: transcription factor, gene target]
PLoS Comput Biol. 2013;9(8):e1003210. doi: 10.1371/journal.pcbi.1003210
Section summary
we have learned some basics on the cellular model
we have learned some facts on graphs, motifs and the visualization of models
we translated biology into graphs and vice versa

What to measure ?
Genome sequences
Gene expression ( mRNA, lncRNA, sRNA, … )
Proteome expression ( proteins / peptides, ... )
Glycome expression ( glycans / polysaccharides, ... )
Alterations of macromolecules: splice variants, methylation patterns, mutations, degradation products
much more ...
Systems biology approaches
From more abstract to more concrete:
simulated networks
relevance networks
co-expression networks
Petri nets
reaction networks

How to measure, e.g. protein expression ?
One way is to count total cells in a certain sector with a positive feature
and maybe add the strength of the staining to the score value
The result for each sector is a single number ranging e.g. from 0 (negative) to 3 (strongly positive), weighted by the number of positive cells
[Figure: basal breast cancer, K5/p63 positive, K14 negative]
Modern Pathology 2005 Oct;18(10):1321-8
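The scoring idea for one sector (staining strength weighted by the number of positive cells) can be written down as a small function. This is only one possible reading: the exact weighting formula is not given in the slides, so the product of intensity and positive-cell fraction used here is an assumption, and the function name is hypothetical.

```python
def sector_score(positive_cells, total_cells, intensity):
    """Score between 0 and 3 for one tissue sector.

    Assumption of this sketch: score = staining intensity (0..3) multiplied
    by the fraction of positive cells in the sector.
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= positive_cells <= total_cells:
        raise ValueError("positive_cells must lie between 0 and total_cells")
    if not 0 <= intensity <= 3:
        raise ValueError("intensity must lie between 0 and 3")
    return intensity * positive_cells / total_cells


# Hypothetical counts for three sectors.
print(sector_score(0, 100, 3))    # no positive cells
print(sector_score(50, 100, 3))   # half of the cells strongly positive
print(sector_score(100, 100, 3))  # all cells strongly positive
```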
Again, remember
The basis for calculating networks is qualitative and/or quantitative information on interaction
so, we need measurements ...

It might look different for other sub-types
[Figure: basal breast cancer, K5/K14 positive, p63 negative]
which represent almost 9% of invasive ductal breast cancers
mostly aggressive hormone receptor negative grade III tumors
Another cancer subtype
[Figure: basal breast cancer, p63 positive and K5/K14 negative]
Transcription factor p63-positive tumors are rare and showed in this case no overlap to K5/K14-positive tumors

Measurement summary
Every method has its limitations: be aware of that, be critical on the quality of your lab results
Translating measurements into data is a critical step and has impact on the analysis procedure and the results
NOTE: Like laboratory experiments, theoretical experiments (usually termed 'analysis') also need controls to verify the stability of the generated results
Drawing good conclusions is bound to careful considerations
Section summary
we have learned some basics on protein measurements (immunohistology)
this might be one basis to reconstruct biological networks

A step by step example
invasive ductal carcinoma [case]
One physiological situation (-no- case control design)
Objective : look for molecular dependencies in this situation
Another common approach to expression measurements
Pick a piece of tissue which owns predominantly a certain cell type of interest ( cells of interest > 70 % )
Dissolve the tissue and separate the molecule class of interest
Take appropriate technologies ( RNAseq, expression microarrays ) to generate signals of expression strength
Drawback in this case: loss of information ( spatial information, morphology ), cell type mixture of unknown composition, ...

How to create these sections ?
sections - approx. 4 µm thick and an area between 1 mm² and 1 cm²
[Figure: Cryotom, Microtom]
How to stain these sections ?
H&E stain is for showing tissue structure
Immunostaining is to visualize specific macromolecules by using antibodies directed against these molecules
[Figure: immunostaining, labels 1 to 5] http://www.abcam.com - modified

Section summary
Selecting the tissue sample
Preparing the immunohistology
Making the expression measurement with 3 replicates
Calculating the score value for each protein
What to measure in this example ?
16 different proteins and their expression
one patient sample of a tissue owning ductal invasive carcinoma
we see a pattern of 16 protein expression values
[Figure: proteins 1 to 16]

How to come to dependencies ?
At this point we only measured one patient sample
this is not sufficient to see a dependency between different proteins
The term dependency implies a further type of information
In our case this additional information resulted from multiple patient measurements of the same disease situation, (slightly) varying the basic situation
How stable is the measurement ?
If we have one measurement, we own some uncertainty that the next measurement is exactly equal
do at least 3 replicate measurements
NOTE: there are statistical rules to decide how many replicates are necessary to get a certain precision
Lab Münster – Bürger/Korsching

How to measure multiple patients ?
[Figure: 1 2 3 tissue microarray (TMA); slide with one TMA section]
Tibor Krenacs, Semmelweis University Budapest, modified
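The remark on replicates and precision can be illustrated with a short calculation. The sketch summarizes replicate scores by mean, standard deviation and standard error, and applies one common rule of thumb for the number of replicates, n ≥ (z · sd / margin)², based on a normal approximation. The lecture does not say which rule is meant, so this choice, the function names and the example values are assumptions.

```python
import math
import statistics


def summarize(replicates):
    """Mean, sample standard deviation and standard error of the mean."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # needs at least two values
    sem = sd / math.sqrt(len(replicates))
    return mean, sd, sem


def replicates_needed(sd, margin, z=1.96):
    """Smallest n with z * sd / sqrt(n) <= margin (normal approximation).

    z = 1.96 corresponds to a confidence level of about 95 %.
    """
    if sd < 0 or margin <= 0:
        raise ValueError("sd must be >= 0 and margin must be > 0")
    return math.ceil((z * sd / margin) ** 2)


# Three hypothetical replicate scores of one protein in one sample.
scores = [2.0, 2.5, 3.0]
mean, sd, sem = summarize(scores)
print(f"mean={mean:.2f} sd={sd:.2f} sem={sem:.2f}")

# Replicates needed so that the mean is known within +/- 0.5 score units.
print(replicates_needed(sd, margin=0.5))
```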
One section - one protein stain
Now the first tissue microarray section will be used
All spots come from the same basic situation
Each spot is a variation of the basic situation
and details look like this : [Figure: magnified view, 6 of 640 samples, KI-67]
one (serial) section, 640 samples

Entering the first analysis step
Applying a proximity measure between each of the proteins (columns)
Purpose: is the profile of the protein across all samples similar or not
[Figure: score value profile for each protein (here only 3)]
Method: Pearson correlation [ normalizing proximity measure ]
Result: one value [ condensing measure ]
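The proximity measure named above, the Pearson correlation, condenses two protein profiles (columns of the score sheet) into one value between -1 and +1:

$$ r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} $$

A minimal sketch in plain Python follows. The score values are hypothetical and only four samples long; the protein names are borrowed from the lecture for readability.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical score sheet: one column per protein, one entry per sample.
scores = {
    "K5":  [0.0, 1.0, 2.0, 3.0],
    "K14": [0.5, 1.0, 2.5, 3.0],
    "ER":  [3.0, 2.0, 1.0, 0.0],
}

# One condensed value for every pair of proteins.
correlations = {
    (p, q): pearson(scores[p], scores[q]) for p, q in combinations(scores, 2)
}
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Values near +1 indicate similar profiles, values near -1 opposite profiles, and values near 0 no linear relation.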
More than one protein – more sections
Every section will be stained by a different protein
[Figure: K 5, K 8/18, K 19, EGFR, K 14, EMA, ERB-B2, VIM ...]
Drawback: every section will own different cells
Assumption: on average the same physiological situation
Result: every protein in every patient sample forms its own characteristic expression pattern

Resulting dependencies
splitting the factors into seven partitions
'reference': K5-6, K14, K19, K8-18, K1, K10
'test': EGFR, VIM, KI-67, p53, ERBB2, Cyclin-D1, BCL-2, EMA, PR, ER
generalized regression
optimal ordered correlation values
dependencies: synergistic co-expression, no dependency, antagonistic co-expression
Cancer Inform. 2016 Jun 29;15:143-9 , Histopathology. 2015 Dec;67(6):888-97
Score data based on raw signals
[Figure: score sheet, proteins (columns) by 640 samples (rows)]

Section summary
The term dependency assumes the idea of a network and implies a further type of information
The concept : every patient has the same network but in slightly different states
The superposition of all network variants is a surrogate marker for the dependency
Another network analysis example
Double differential expression analysis
Forming an interaction model from the results
PLoS Comput Biol. 2013;9(8):e1003210. doi: 10.1371/journal.pcbi.1003210

Scheme for an enzymatic reaction
[Figure]
Now further topics on networks and their application

Membrane transport
[Figure]
Petri net
‘Petri nets’ are used as a formal and graphical language to model systems
A Petri net is represented by a directed, finite, bipartite graph, typically without isolated nodes
Blaetke MA Tutorial Petri Nets in Systems Biology 2011

Tools to work with
Download & Credits http://www-dssz.informatik.tu-cottbus.de/DSSZ/Software/Software
The scientific biography behind that: Prof. Monika Heiner
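The bipartite structure of a Petri net (places that hold tokens, transitions that move them) can be sketched in a few lines. The example models a simple enzymatic scheme E + S -> ES -> E + P. The place names, token counts and the plain "fire whatever is enabled" strategy are assumptions of this sketch; it is not the modeling approach of the tools listed above.

```python
# Places hold tokens (the marking); transitions consume and produce tokens.
# Hypothetical example: enzyme E converts substrate S into product P.
marking = {"E": 1, "S": 3, "ES": 0, "P": 0}

# transition name -> (input places with arc weights, output places with arc weights)
# Arcs only connect places with transitions, which makes the graph bipartite.
transitions = {
    "bind":     ({"E": 1, "S": 1}, {"ES": 1}),
    "catalyze": ({"ES": 1}, {"E": 1, "P": 1}),
}


def enabled(m, name):
    """A transition is enabled if every input place holds enough tokens."""
    inputs, _ = transitions[name]
    return all(m[place] >= weight for place, weight in inputs.items())


def fire(m, name):
    """Return the new marking after firing one enabled transition."""
    if not enabled(m, name):
        raise ValueError(f"transition {name!r} is not enabled")
    inputs, outputs = transitions[name]
    new = dict(m)
    for place, weight in inputs.items():
        new[place] -= weight
    for place, weight in outputs.items():
        new[place] += weight
    return new


def run(m):
    """Fire enabled transitions until none is enabled any more."""
    while True:
        ready = [t for t in transitions if enabled(m, t)]
        if not ready:
            return m
        m = fire(m, ready[0])


final = run(marking)
print(final)
```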
WGCNA - R package
Co-expression networks
but instead - as before - over factors (proteins) in many different patients, but of one condition
here over factors (genes) in many different conditions, but of one entity (e.g. a certain cell type treated with different agents)
objective : find gene modules discriminating these conditions
BMC Bioinformatics. 2008 Dec 29;9:559. doi: 10.1186/1471-2105-9-559

WGCNA - example of use
Showing hundreds of genes and modules does not work well to give an overview on affected regulatory mechanisms
Solution:
- focus on strong dependencies
- get more abstract
The 47 prognostic modules are plotted in four circles, each representing one cancer type.
Grey lines: conservation correspondence between different cancer types.
GBM - glioblastoma
OV - ovarian adenocarcinoma
BRCA - invasive breast carcinoma
KIRC - kidney carcinoma
Nat Commun. 2014;5:3231. doi: 10.1038/ncomms4231
How does it work ?
[Figure: expression matrix with all genes (1, 2, 3, ..., i-1, i) as rows and different experiments (1, 2, ..., n) as columns, plus sample traits (1, 2, ..., m)]
Pearson correlation of the tested gene profiles (rows)
The outcome is a number of similar genes called a module
with similar expression properties
sharing a specific biological context or task
and might (partly) explain the difference between biological conditions

Overview on further fields
Databases
Chemical reaction constants database http://kinetics.nist.gov/kinetics/welcome.jsp
Biochemical reaction kinetics database http://sabio.villa-bosch.de/ and http://www.ebi.ac.uk/biomodels/
microRNA database http://www.mirbase.org/
How to create a module ?
here by hierarchical clustering
[Figure]

Databases (cont.)
Genome database http://www.internationalgenome.org/
EBI database http://www.ebi.ac.uk/
NCBI database https://www.ncbi.nlm.nih.gov/
Pathway database http://www.genome.jp/
many more smaller databases ... but often not evolving :(
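Creating modules by hierarchical clustering of correlated gene profiles can be sketched as follows. This is a simplified stand-in and not the WGCNA procedure itself: it uses the plain distance 1 - r between gene profiles, average linkage, and a fixed distance cut-off. The gene names, the profiles and the cut-off value are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def find_modules(profiles, max_distance):
    """Agglomerative clustering (average linkage) on the distance 1 - r.

    Clusters are merged pairwise, closest first, until the closest pair is
    further apart than max_distance. Each remaining cluster is one module.
    """
    clusters = [[gene] for gene in profiles]

    def distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        d, i, j = min(
            (distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical expression profiles: one row per gene, one value per experiment.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.0, 1.5],
}

modules = find_modules(profiles, max_distance=0.5)
print(modules)
```

Genes whose profiles rise together end up in one module, genes whose profiles fall together in the other.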
Pathway database : KEGG
N-glycans or asparagine-linked glycans are major constituents of glycoproteins in eukaryotes
[Figure: KEGG pathway map. Colour legend: Homo sapiens: green; Drugs/disease: blue, pink]

Epigenetics
Methylation targets
A Stochastic Model of Epigenetic Dynamics in Somatic Cell Reprogramming
Transcriptional regulators that account for the activation of a certain cell state are combined into a module.
Four modules:
2 different differentiation modules A and B,
the pluripotency module P,
and the exogenous reprogramming genes E.
Each module is governed by the activity of the other modules as well as its epigenetic states.
Front Physiol. 2012 Jun 27;3:216. doi: 10.3389/fphys.2012.00216
Systems Biology Markup Language (SBML)
http://sbml.org
Idea: a computer language is necessary to standardize the description of models and to enable comparisons / automate computations
SBML is free and open
SBML is useful for models of metabolism, cell signaling, ....
The R platform is supporting this language https://www.bioconductor.org/packages/release/bioc/html/SBMLR.html

Synthetic biology
Already existing units from existing cells will be taken and be assembled to a simple but functional new cellular system, e.g. to produce certain proteins
The quest for the minimal bacterial genome. Current Opinion in Biotechnology 2016, 42:216–224
Ways to construct the minimal cell
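Because SBML is an XML format, a model description can be read with any XML parser. The sketch below parses a hand-written, heavily simplified SBML-like document with the Python standard library. The toy model is hypothetical, several attributes that a valid SBML file requires are left out, and the document is not validated; for real models a dedicated SBML library (such as the SBMLR package linked above) is the better choice.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Simplified, hand-written document: one reaction converting S into P.
# Hypothetical toy model; not a complete or validated SBML file.
document = f"""
<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy_conversion">
    <listOfSpecies>
      <species id="S" compartment="cell"/>
      <species id="P" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants>
          <speciesReference species="S"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="P"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()

root = ET.fromstring(document)
ns = {"sbml": SBML_NS}

# Species become the nodes of the model.
species = [s.get("id") for s in root.findall(".//sbml:species", ns)]

# Each reaction links reactants to products.
reactions = {}
for reaction in root.findall(".//sbml:reaction", ns):
    reactants = [
        ref.get("species")
        for ref in reaction.findall("sbml:listOfReactants/sbml:speciesReference", ns)
    ]
    products = [
        ref.get("species")
        for ref in reaction.findall("sbml:listOfProducts/sbml:speciesReference", ns)
    ]
    reactions[reaction.get("id")] = (reactants, products)

print("species:", species)
print("reactions:", reactions)
```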
A SBML model
[Figure: SBML model listing with the Reaction element highlighted]

Evolutionary Systems Biology
The hallmarks of cancer have strong links to the evolution of cellular properties, which means adaptation, survival and remodeling of cellular control over time.
Hanahan and Weinberg
analyzing the dynamics of the present
Take home message
The field of systems biology is still under development
There are a multitude of different application scenarios
Realize that systems biology needs to be tightly integrated into wet lab research
because the majority of ideas for modeling are coming from experimental insight / observations